[Corpora-List] Software for ngram-document matrix

Georgios Mikros gmikros at isll.uoa.gr
Mon Jan 31 12:48:54 UTC 2011


Dear all,

I am looking for an open-source tool that takes a corpus of raw texts as
input and produces an ngram-document matrix, with text file names as rows and
ngrams as columns. It would be nice if I could filter the ngrams by their
frequency or with a stop list.
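
For illustration only, the kind of output described above could be sketched
in Python as follows. This is a minimal sketch under stated assumptions, not
an existing tool: the names ngram_document_matrix, read_corpus, min_freq and
stop_words are hypothetical, tokenization is plain lowercase whitespace
splitting, the corpus is assumed to be a directory of UTF-8 .txt files, and
the stop list is applied by dropping any ngram that contains a stop word.

```python
from collections import Counter
from pathlib import Path


def ngrams(tokens, n):
    """Return all word n-grams in a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_document_matrix(texts, n=2, min_freq=1, stop_words=()):
    """Build a count matrix with documents as rows and n-grams as columns.

    texts: dict mapping a document name (e.g. a file name) to its raw text.
    min_freq: keep only n-grams whose total corpus frequency is >= min_freq.
    stop_words: any n-gram containing one of these words is dropped.
    (All parameter names here are illustrative assumptions.)
    """
    stop = {w.lower() for w in stop_words}
    per_doc = {}
    for name, text in texts.items():
        tokens = text.lower().split()  # naive whitespace tokenization
        grams = [g for g in ngrams(tokens, n)
                 if not any(w in stop for w in g.split(" "))]
        per_doc[name] = Counter(grams)

    # Total frequency of each n-gram over the whole corpus.
    total = Counter()
    for counts in per_doc.values():
        total.update(counts)

    columns = sorted(g for g, freq in total.items() if freq >= min_freq)
    rows = sorted(per_doc)
    matrix = [[per_doc[r][g] for g in columns] for r in rows]
    return rows, columns, matrix


def read_corpus(directory):
    """Hypothetical helper: read every .txt file in a directory."""
    return {p.name: p.read_text(encoding="utf-8")
            for p in sorted(Path(directory).glob("*.txt"))}
```

Calling ngram_document_matrix(read_corpus("corpus"), n=2, min_freq=5) would
return the row labels (file names), the column labels (ngrams) and the counts
as a list of lists, which can then be written out as CSV or similar.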

Kind regards

George Mikros

 

-----------------------------------

George K. Mikros

Associate Professor

Department of Italian Language and Literature, School of Philosophy,
University of Athens, Greece

Tel.: +30 210 7277491
