[Corpora-List] Package for LSA, tfidf, etc

Stephan Gouws gouwsmeister at gmail.com
Thu Oct 15 10:09:01 UTC 2009


Hi,

 I'm looking for a software package that I can use to generate the document
similarity matrix for a small corpus of 50 documents, using several of the
standard methods such as tf-idf, Okapi, language models, cosine, LSA, etc.
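
 To make the request concrete, here is a minimal sketch of the kind of output
meant by "document similarity matrix", using tf-idf weights and cosine
similarity in plain Python. The whitespace tokenizer, the smoothed idf formula,
and the function names (tfidf_vectors, cosine, similarity_matrix) are
assumptions made for illustration; this is not a trusted implementation, and
the other methods listed are not covered.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per document."""
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: raw term count times a smoothed idf; many variants exist.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(docs):
    """Return the full n x n matrix of pairwise cosine similarities."""
    vectors = tfidf_vectors(docs)
    return [[cosine(u, v) for v in vectors] for u in vectors]
```

 For 50 documents, similarity_matrix(docs) would give a 50 x 50 list of lists
whose entry [i][j] is the similarity between documents i and j.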

 Research code is fine; I just want a trusted implementation of these
algorithms. Languages, in order of preference, are [Python, C, C++], [Java],
[Perl]; anything beyond that is less preferred but fine nonetheless :)

 I want to correlate these with human ratings in a research setting.
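
 As a rough illustration of that last step, the sketch below computes a
Spearman rank correlation between system similarity scores and human ratings
for the same document pairs. The choice of Spearman rather than another
coefficient is an assumption (none is specified here), as are the function
names; tied values receive their average rank.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(system_scores, human_ratings):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(system_scores), average_ranks(human_ratings))
```

 Here system_scores would hold one similarity value per rated document pair
(for example, entries read off a similarity matrix) and human_ratings the
corresponding human judgements, in the same order.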

 Thank you very much!
 Stephan.
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora