[Corpora-List] High-performance Computing and NLP
Djamé Seddah
djame.seddah at free.fr
Wed Mar 17 01:56:20 UTC 2010
Hi,
From what I learned through heavy use of the facilities offered by www.ichec.ie
(admittedly a long time ago), if you want to do, say, parsing of
very large data sets,
the best approach is to use a distributed-memory cluster and to use a set
of nodes as slaves in a task-farming environment.
Such tools are usually easy to write in C using either MPICH2 or
Open MPI.
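
To make the task-farming idea concrete, here is a minimal sketch in Python. It is only a single-machine stand-in: threads and a shared queue play the role that slave nodes and MPI messages (MPI_Send / MPI_Recv) would play on a real cluster. The names farm and toy_parse are made up for this illustration, and toy_parse is a placeholder for a real parser.

```python
import queue
import threading


def farm(tasks, work, n_workers=4):
    """Task farm: idle workers pull the next task until none are left.

    Results come back in the same order as the input tasks.
    """
    tasks = list(tasks)
    todo = queue.Queue()
    for index, task in enumerate(tasks):
        todo.put((index, task))
    results = [None] * len(tasks)

    def worker():
        while True:
            try:
                index, task = todo.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            results[index] = work(task)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def toy_parse(sentence):
    # Hypothetical placeholder for a real parser: whitespace tokenisation.
    return sentence.split()


if __name__ == "__main__":
    print(farm(["the cat sleeps", "dogs bark"], toy_parse, n_workers=2))
```

The point of the pattern is that tasks are handed out on demand, so a slow sentence does not hold up the other workers. Note that threads in CPython will not speed up CPU-bound parsing; on a cluster each worker would be a separate process on its own node.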
If you need your programs to access a common memory space distributed
among different nodes, you need
a network with an enormous amount of bandwidth; otherwise your
programs will spend their time waiting for data to process.
See the README and the FAQ file of PETSc (http://www.mcs.anl.gov/petsc/petsc-as)
about that.
By the way, there is this IBM toolbox (not specifically related to NLP, but
offering parallel machine learning):
http://www.alphaworks.ibm.com/tech/pml?open&S_TACT=105AGX59&S_CMP=GRsite-lnxw07&ca=dgr-lnxw07awpml
which was mentioned on OSNews:
http://www.osnews.com/story/20631/Parallel_Machine_Learning_Toolbox_for_Linux
I think many of us are waiting for a general "parsing at home"
grid framework to emerge.
I'd also like to know whether NVIDIA's CUDA compiler is used in the NLP
community.
Best,
Djamé
On 15 March 2010, at 16:52, Sean Igo wrote:
> Good day,
>
> My research group is investigating the use of high-performance
> computing facilities in NLP. By this we mostly mean clustered
> environments, in which many (usually identical) computers are
> networked in a single location, and used as a single computing entity
> through libraries like MPI / OpenMP, MapReduce, etc. and/or using UIMA
> or other frameworks in environments like that. Grid methods are less
> of interest to us but I'd also like to hear about them. Pure machine
> learning research that might be applied to NLP would also be welcome.
>
> If you're doing or aware of work like this, please let me know.
>
> Many thanks,
> Sean Igo
> University of Utah
> Center for High Performance Computing / Biomedical Informatics Dept.
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora