[Corpora-List] annotation tools - summary

Jörg Tiedemann joerg at stp.ling.uu.se
Tue Jul 2 15:53:31 UTC 2002


The following summarizes the replies to my query of May 23:


Petya Osenova
XML-based tool CLaRK (used for Bulgarian)
www.bultreebank.org


Brett Reynolds
ChaSen - a "morphological analyser" for Japanese
http://chasen.aist-nara.ac.jp/


Thorsten Brants
TnT - POS tagger pre-trained for German and English
http://www.coli.uni-sb.de/~thorsten/tnt


Beata Megyesi
POS tagger using several ML algorithms (HMM, MaxEnt, MBL, TBL)
http://www.speech.kth.se/~bea/research.html


Thank you very much!



I'd like to re-post my query and I hope for additional replies:


Dear list members,

I'm looking for freely available language-specific annotation tools such
as tokenizers, lemmatizers, POS taggers, and chunkers/shallow parsers for
the following languages:

        Spanish
        French
        German
        Swedish
        Finnish
        Polish
        other languages (even English)

I'm looking for tools which are ready to use and preferably run on Linux.
Information on performance and tagset would be appreciated, too.

I will post a summary!
Thank you very much!



Jörg

***********/\/\/\/\/\/\/\/\/\/\/\************************************
**  Joerg Tiedemann                 joerg at stp.ling.uu.se           **
**  Department of Linguistics    http://stp.ling.uu.se/~joerg/     **
**  Uppsala University               tel: (018) 471 7007           **
**  S-751 20 Uppsala/SWEDEN          fax: (018) 471 1416           **
*************************************/\/\/\/\/\/\/\/\/\/\/\**********


