[Corpora-List] Word similarity from large text corpus

Minh Pham minhpham0902 at gmail.com
Fri May 13 14:06:54 UTC 2011


Thanks so much for your help.

I will try the tools you suggested.

Best,
Pham

2011/5/13 RadimRehurek <RadimRehurek at seznam.cz>

> +1 on Semantic Vectors, or, if you prefer Python over Java, people have
> also used gensim for large scale Random Projections/LSA/LDA similarity
> stuff:
>
> http://nlp.fi.muni.cz/projekty/gensim/
>
> Best,
> Radim
>
>
> > ------------ Original message ------------
> > From: Dominic Widdows <widdows at google.com>
> > Subject: Re: [Corpora-List] Word similarity from large text corpus
> > Date: 13.5.2011 13:56:15
> > ----------------------------------------
> > Dear Pham,
> >
> > Semantic Vectors covers a lot of options, and people seem to have a
> > reasonably productive and pain-free time using it.
> > http://code.google.com/p/semanticvectors/
> >
> > Best wishes,
> > Dominic
> >
> > On Fri, May 13, 2011 at 3:15 AM, Marco Baroni <marco.baroni at unitn.it> wrote:
> > > Dear Pham,
> > >
> > > There is also a list of pre-compiled similarities (and tools to extract a
> > > similar list from your own frequency table) here:
> > >
> > > http://clic.cimec.unitn.it/dm/
> > >
> > > (for the pre-compiled list, look at the "Top 10 nearest neighbours of each
> > > word in TypeDM" section.)
> > >
> > > Regards,
> > >
> > > Marco
> > >
> > >
> > > --
> > > Marco Baroni
> > > Center for Mind/Brain Sciences (CIMeC)
> > > University of Trento
> > > http://clic.cimec.unitn.it/marco
> > >
> > > _______________________________________________
> > > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> > > Corpora mailing list
> > > Corpora at uib.no
> > > http://mailman.uib.no/listinfo/corpora



-- 
Pham Quang Nhat Minh (Mr)
PhD student
NLP Laboratory - School of Information Science - JAIST
1-1 Asahidai, Nomi, 923-1292 Japan
Email: minhpqn at jaist.ac.jp
Web: http://www.jaist.ac.jp/index-e.html
Phone: (+81) 090-9440-1556