[Corpora-List] Quantifying lexical diversity of (corpus-derived) word lists

John F Sowa sowa at bestweb.net
Fri Apr 12 12:43:19 UTC 2013


On 4/11/2013 7:10 AM, Heiki-Jaan Kaalep wrote:
> when you are working on productivity issues, like want to determine
> whether an affix is productive, or why do new words join a certain
> inflectional class, it is the type and token ratio that you should be
> looking at. It is one of the few meaningful numbers...

That point can be stated in a way that is consistent with Adam's claim:

AK
> what Baayen is trying to do is 'rescue' the type-token ratio, which
> is doomed to vary with text length, replacing it by something similar
> but cleverer that is not text-length dependent.

To measure the productivity of an affix, look at the type/token ratio
relative to the size of the text (or corpus).
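
A minimal sketch of that idea in Python, assuming a pre-tokenized text and a crude suffix match standing in for real morphological analysis. The function name affix_ttr_curve, the step parameter, and the toy token list are illustrative assumptions, not anything from this thread. It reports the type/token ratio for words carrying the affix at several sample sizes, so the ratio can be read against text length rather than as a single number:

```python
def affix_ttr_curve(tokens, suffix, step):
    """Return (sample_size, types, affix_tokens, ttr) at each multiple of step.

    Assumption: a word "has the affix" if it ends with `suffix` and is
    longer than the suffix itself; a real study would use a morphological
    analyzer instead of string matching.
    """
    types = set()          # distinct words seen so far that carry the affix
    affix_tokens = 0       # running count of affixed word occurrences
    curve = []
    for i, tok in enumerate(tokens, start=1):
        word = tok.lower()
        if word.endswith(suffix) and len(word) > len(suffix):
            types.add(word)
            affix_tokens += 1
        # Record a data point at every `step` tokens and at the end of the text.
        if i % step == 0 or i == len(tokens):
            ttr = len(types) / affix_tokens if affix_tokens else 0.0
            curve.append((i, len(types), affix_tokens, ttr))
    return curve


# Toy data, purely illustrative.
tokens = ("kindness darkness kindness sadness "
          "kindness darkness happiness kindness").split()
curve = affix_ttr_curve(tokens, "ness", step=4)
for sample_size, n_types, n_tokens, ttr in curve:
    print(sample_size, n_types, n_tokens, ttr)
```

On the toy data the ratio falls as the sample grows, which is the text-length dependence Adam describes; comparing affixes is only meaningful at matched sample sizes.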

John Sowa
