Corpora: Distribution of homography and lemmas

Serge Sharoff s_sharoff at yahoo.com
Mon Apr 15 07:31:01 UTC 2002


Dear Nada,

I'm in the process of developing a frequency dictionary for
Russian.  I was quite surprised to find that in Russian (an
inflectional language in which the grammatical properties of
word forms can be detected reliably) about 20% of word forms
have ambiguous lemmas.
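
To make the figure concrete, here is a minimal sketch of how such a
proportion could be computed. It assumes a hypothetical toy table that
maps each word form (transliterated) to the lemmas an analyser might
propose; the table, the function names and the sample tokens are
illustrative only and are not the data or method behind the dictionary.
The message does not say whether the 20% is counted over distinct forms
or over running tokens, so both variants are shown.

```python
# Hypothetical toy analyser output: word form -> candidate lemmas.
# Transliterated Russian; invented for illustration only.
ANALYSES = {
    "stali": {"stat'", "stal'"},  # past plural of 'become' or a form of 'steel'
    "moj": {"moj", "myt'"},       # 'my' or imperative of 'to wash'
    "dom": {"dom"},
    "knigi": {"kniga"},
    "chitaet": {"chitat'"},
}


def ambiguous_share_types(analyses):
    """Fraction of distinct word forms with more than one candidate lemma."""
    if not analyses:
        return 0.0
    ambiguous = sum(1 for lemmas in analyses.values() if len(lemmas) > 1)
    return ambiguous / len(analyses)


def ambiguous_share_tokens(tokens, analyses):
    """Fraction of running tokens whose form has more than one candidate lemma.

    Tokens missing from the table are skipped rather than guessed at.
    """
    known = [t for t in tokens if t in analyses]
    if not known:
        return 0.0
    ambiguous = sum(1 for t in known if len(analyses[t]) > 1)
    return ambiguous / len(known)


if __name__ == "__main__":
    sample = ["dom", "stali", "dom", "knigi"]  # hypothetical running text
    print(ambiguous_share_types(ANALYSES))
    print(ambiguous_share_tokens(sample, ANALYSES))
```

The two measures can differ considerably, because a few frequent
ambiguous forms weigh much more heavily in a token count than in a
count of distinct forms.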

Just my two (euro) cents.

Best,
Serge

Nada S. ILIC wrote:

> Dear list members,
>
> I would like to ask whether anybody knows of any research on:
> 1) the distribution of homography in English or other languages
> 2) the distribution of new lemmas as a function of sample size (for example:
> if a 100-million-word corpus is increased by 50 million words, what is the
> approximate number of new lemmas?)
>
> Thanks in advance,
> Nada Ilic
> Laboratory of Experimental Psychology,
> Faculty of Philosophy, Belgrade, YU

---
Dr. Serge Sharoff
Alexander von Humboldt Fellow,
Fakultät für Linguistik und Literaturwissenschaft,
Universität Bielefeld,
Postfach 10 01 31, D-33501 Bielefeld, Germany,
tel: +49-521-1065275; fax: +49-521-1066447


