[Corpora-List] Annotation with SRILM

Mon Mar 22 12:09:38 UTC 2010

Dear all,

I have an annotated corpus and a morphological analyzer.
My task is to use SRILM's n-best facility to choose the best of the morphological analyzer's candidate solutions for every new, unanalyzed word.

The Question:

Should I use SRILM to build ONE language model file in which every word is represented by all of its annotated features concatenated with '+', like this: (PREP+DET+#+NOUN+NSUFF+#+i/GEN  DET+#+#+ADJ+NSUFF+#+i/GEN)
Or should I build a separate model file for every feature, or for some group of features?
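
To make the two options concrete, here is a minimal Python sketch of how the training text might be prepared in each case. It assumes that each annotated word is available as a set of named feature slots; the slot names (prefix, stem, case) and the sample values are hypothetical, and the composite format is simplified (it leaves out the '#' separators shown above). The only SRILM-related assumption is that the training text has one sentence per line with whitespace-separated tokens.

```python
from typing import Dict, List

# Hypothetical feature slots; the real inventory depends on the analyzer.
SLOTS = ["prefix", "stem", "case"]

Word = Dict[str, str]


def composite_line(sentence: List[Word], slots: List[str] = SLOTS) -> str:
    """Option 1: one token per word, all feature values joined with '+'."""
    return " ".join("+".join(word[s] for s in slots) for word in sentence)


def per_feature_lines(sentence: List[Word], slots: List[str] = SLOTS) -> Dict[str, str]:
    """Option 2: one training line per feature slot, one model per slot."""
    return {s: " ".join(word[s] for word in sentence) for s in slots}


# Illustrative two-word sentence (values are made up for the example).
sentence = [
    {"prefix": "PREP+DET", "stem": "NOUN+NSUFF", "case": "i/GEN"},
    {"prefix": "DET", "stem": "ADJ+NSUFF", "case": "i/GEN"},
]

print(composite_line(sentence))
for slot, line in per_feature_lines(sentence).items():
    print(slot, "->", line)
```

With option 1 the model sees whole analyses as single tokens, so it can capture how features co-occur, at the cost of a larger and sparser vocabulary. With option 2 each model has a small vocabulary, but the scores of the separate models then have to be combined to rank the analyzer's candidates.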

I need to know the best way to automatically annotate new data according to my manually annotated data.

Thanks and best regards.

Eslam Amgad Abdel Salam
Computational Linguist,
Bibliotheca Alexandrina <http://www.bibalex.org/>, ICT Sector,
ISIS <http://www.bibalex.org/isis/>, ISAUC.
P.O. Box 138, Ashshatby,
Alexandria 21526, ARE.
Tel. : +2034839999, Ext.: 2726
Fax: +2034820405
Cellular: +20101000725
E-mail: Eslam.Amgad at bibalex.org
Web site: http://www.bibalex.org/

"A language is a dialect with an army and a navy. "
Max Weinreich
