[Corpora-List] A Lemmatizer is Required

Sabine Bartsch bartsch at linglit.tu-darmstadt.de
Fri Feb 29 13:45:43 UTC 2008


Hi there,

I would suggest you have a look at Helmut Schmid's TreeTagger at the 
IMS, University of Stuttgart:

http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/

It's a POS tagger / lemmatizer and works for several languages, depending 
on the parameter files selected. Performance is good, and it runs under 
Linux, Solaris, Mac OS X and Windows.
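
In case it helps with a corpus of that size, here is a rough Python sketch of how the lemmas could be pulled out of the tagger's output. It assumes the usual one-token-per-line output with three tab-separated columns (token, POS tag, lemma) and the "<unknown>" placeholder for words the tagger cannot lemmatize. The function names, the sample lines and the assumption that the "tree-tagger-english" wrapper script is on your PATH and takes a file name are mine, not part of the TreeTagger distribution's documentation, so please check them against your installation.

```python
import subprocess

UNKNOWN = "<unknown>"  # placeholder assumed for words without a known lemma


def parse_tagged(text):
    """Parse token<TAB>tag<TAB>lemma lines into (token, tag, lemma) triples."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank lines and anything that is not a 3-column row
        token, tag, lemma = parts
        if lemma == UNKNOWN:
            lemma = token  # fall back to the surface form
        rows.append((token, tag, lemma))
    return rows


def lemmatize_file(path, script="tree-tagger-english"):
    """Run the tagger on a text file and return the parsed rows.

    Assumption: `script` is a wrapper on PATH that accepts a file name
    and writes the tagged text to standard output.
    """
    result = subprocess.run(
        [script, path], capture_output=True, text=True, check=True
    )
    return parse_tagged(result.stdout)


# Invented sample output, for illustration only.
sample = "The\tDT\tthe\nwords\tNNS\tword\nwere\tVBD\tbe\nblorfed\tVVN\t<unknown>\n"
print([lemma for _, _, lemma in parse_tagged(sample)])
```

Falling back to the token for unknown words keeps the lemmatized corpus the same length as the original, which tends to make later frequency counts easier to compare.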

Best of luck

Sabine



True Friend wrote:
> Hi
> I want an open source (or at least free) lemmatizer which can lemmatize 
> a corpus of 2.1 million English words into their base forms etc. If it 
> is Linux-only software, that is no problem: I have Kubuntu along with Windows XP.
> Regards
> 
> -- 
> محمد شاکر عزیز
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora

-- 
Dr. Sabine Bartsch
Technische Universität Darmstadt
Institut für Sprach- und Literaturwissenschaft - Englische Linguistik
Hochschulstr. 1         64289 Darmstadt
Fon: +49-6151-16 4570   Fax: +49-6151-16 3694
http://www.linglit.tu-darmstadt.de/bartsch



