[Corpora-List] Quantifying lexical diversity of (corpus-derived) word lists
John F Sowa
sowa at bestweb.net
Fri Apr 12 12:43:19 UTC 2013
On 4/11/2013 7:10 AM, Heiki-Jaan Kaalep wrote:
> when you are working on productivity issues, like want to determine
> whether an affix is productive, or why do new words join a certain
> inflectional class, it is the type and token ratio that you should be
> looking at. It is one of the few meaningful numbers...
That point can be stated in a way that is consistent with Adam's claim:
AK
> what Baayen is trying to do is 'rescue' the type-token ratio, which
> is doomed to vary with text length, replacing it by something similar
> but cleverer that is not text-length dependent.
To measure the productivity of an affix, look at the type/token ratio
relative to the size of the text (or corpus).
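
As a rough illustration of that suggestion, here is a minimal Python sketch that tracks the type/token ratio of words carrying a given suffix as the corpus grows. The function name affix_ttr_curve and its parameters are hypothetical, and plain string matching on the suffix is only a crude stand-in for real morphological analysis (it would count "witness" as a "-ness" word, for example).

```python
def affix_ttr_curve(tokens, suffix, step):
    """Report the affix type/token ratio at successive corpus sizes.

    tokens: list of word tokens in corpus order.
    suffix: affix to look for (assumption: simple string suffix match).
    step:   take a measurement every `step` tokens, plus one at the end.
    Returns a list of (corpus_size, affix_types, affix_tokens, ttr).
    """
    seen_types = set()   # distinct words ending in the suffix
    affix_tokens = 0     # running count of all such words
    curve = []
    for i, tok in enumerate(tokens, start=1):
        word = tok.lower()
        # Require at least one character of stem before the suffix.
        if word.endswith(suffix) and len(word) > len(suffix):
            seen_types.add(word)
            affix_tokens += 1
        if i % step == 0 or i == len(tokens):
            ttr = len(seen_types) / affix_tokens if affix_tokens else 0.0
            curve.append((i, len(seen_types), affix_tokens, ttr))
    return curve


if __name__ == "__main__":
    sample = "the kindness and darkness of kindness met sadness".split()
    for size, types, toks, ttr in affix_ttr_curve(sample, "ness", step=4):
        print(size, types, toks, ttr)
```

Reading the ratio at several corpus sizes, rather than as a single number, is what makes the comparison between affixes meaningful: two affixes should be compared at the same sample size.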
John Sowa