Corpora: Russian POS tagger

Alexander Gelbukh gelbukh at cic.ipn.mx
Wed Jul 18 00:08:06 UTC 2001


Hi,

I have developed a Russian morphological analyzer / synthesizer (not a
tagger!), which I can send you for research purposes, without the right to
transfer it to others. I would appreciate it if in the resulting work you cite
my papers that describe the analyzer (I will provide you with the references
and full texts; see also my publications for 1989 to 1993 on my webpage,
listed below).

Unfortunately, because of copyright issues I can only send you a rather old
version, which is a bit incomplete (perfection does not exist :-) and whose
dictionary is limited to some 90,000 lexemes.

As far as I recall, that version does not handle a few frequent words like
"to be" (because of their complicated morphological patterns), though it
does handle more regular words.

Adding a new word to the dictionary is possible but a bit complicated (I
would need to write English documentation for you on how to do this). For
now, it just says "unknown word" for any word it does not have in the
dictionary.

Thank you.
Alexander

=====================================
Prof. Dr. Alexander Gelbukh (Alexandre Guelboukh Kahn),
Professor and researcher, head of NLP Lab,
Centro de Investigacion en Computacion (CIC),
Instituto Politecnico Nacional (IPN).
Address: CIC, IPN, entrada por calle Venus (cerca de Metro Poli),
         Col. Zacatenco, CP 07738, Mexico DF., Mexico
Office: (+52) 5729-6000 ext. 56544, 56518, 56602, home 5597-0709
Fax: +1 (520) 441-1817 (personal), (+52) 5586-2936 (shared)
gelbukh at earthling.net, gelbukh at cic.ipn.mx, www.cic.ipn.mx/~gelbukh
=====================================


> -----Original Message-----
> From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no]On
> Behalf Of Hailing Jiang
> Sent: Tuesday, July 17, 2001 4:24 PM
> To: corpora at hd.uib.no
> Subject: Corpora: Russian POS tagger
>
>
> Dear list members,
>
> Does anyone know a Russian morphological analyzer or
> POS tagger that is publicly available or free for
> research purpose? I searched the net and only found
> this online demo of Brill's tagger trained for Russian:
> (http://www.ling.gu.se/~lager/Home/brilltagger_ui.html)
> There is no information on how to use it.
>
> Any related information is appreciated.
>
> Thanks in advance,
> Hailing Jiang
>
>
>


