[Corpora-List] Word similarity from large text corpus

RadimRehurek RadimRehurek at seznam.cz
Fri May 13 13:39:37 UTC 2011


+1 on Semantic Vectors. Or, if you prefer Python over Java, people have also used gensim for large-scale similarity computations with Random Projections, LSA and LDA:

http://nlp.fi.muni.cz/projekty/gensim/
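
For readers new to the topic, here is a rough sketch of the basic distributional idea behind such tools: words count as similar if they occur in similar contexts. It is plain Python on raw co-occurrence counts. It does not use the gensim or Semantic Vectors APIs, and it skips the dimensionality reduction (Random Projections, LSA, LDA) that makes the approach practical on a large corpus. The toy corpus, the window size and the function names are assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; a real one would be tokenized text read from disk.
toy_corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]

vectors = cooccurrence_vectors(toy_corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
print("cat~dog:", sim_cat_dog, "cat~milk:", sim_cat_milk)
```

On anything beyond a toy corpus the raw vectors get very large and sparse, which is where the projection and factorization methods in the packages above come in.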

Best,
Radim


> ------------ Original message ------------
> From: Dominic Widdows <widdows at google.com>
> Subject: Re: [Corpora-List] Word similarity from large text corpus
> Date: 13.5.2011 13:56:15
> ----------------------------------------
> Dear Pham,
> 
> Semantic vectors covers a lot of options, and people seem to have a
> reasonably productive and pain-free time using it.
> http://code.google.com/p/semanticvectors/
> 
> Best wishes,
> Dominic
> 
> On Fri, May 13, 2011 at 3:15 AM, Marco Baroni <marco.baroni at unitn.it> wrote:
> > Dear Pham,
> >
> > There is also a list of pre-compiled similarities (and tools to extract a
> > similar list from your own frequency table) here:
> >
> > http://clic.cimec.unitn.it/dm/
> >
> > (for the pre-compiled list, look at the "Top 10 nearest neighbours of each
> > word in TypeDM" section.)
> >
> > Regards,
> >
> > Marco
> >
> >
> > --
> > Marco Baroni
> > Center for Mind/Brain Sciences (CIMeC)
> > University of Trento
> > http://clic.cimec.unitn.it/marco
> >
> > _______________________________________________
> > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> > Corpora mailing list
> > Corpora at uib.no
> > http://mailman.uib.no/listinfo/corpora
> >
> 
