Corpora: Morphology and Word Length (was: Relative text length)

John A Goldsmith ja-goldsmith at uchicago.edu
Fri Apr 26 22:17:38 UTC 2002


I'm not sure what you mean by "language-independent morphology," but
here's a suggestion that is not wildly wrong, and has a lot to be said
for it -- though it takes the case of concatenative morphology to be the
central phenomenon of morphology (which is not a terrible assumption).
Assume that a morphology of a lexicon is (any) phrase-structure grammar
that generates the words of the lexicon; then the correct morphology of
that lexicon is the one for which the description length (using that
grammar) is the shortest ("description length" in the sense of
Rissanen's Minimum Description Length); this is essentially what I was
trying to work out in my Computational Linguistics article last year.
Roughly, this amounts to choosing the grammar that best trades off its
own brevity against how well it fits the empirical distribution of the
words (well, that's a rough paraphrase).
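
To make the comparison concrete, here is a minimal sketch of a two-part
description-length calculation on a toy lexicon. It is not the model from
the article; the flat per-letter cost, the end-marker cost, the suffix
list, and the names grammar_bits, description_length and split_suffix are
all assumptions made up for illustration.

```python
import math
from collections import Counter

# Assumption: every letter costs the same, log2(26) bits.
BITS_PER_LETTER = math.log2(26)

def grammar_bits(morphemes):
    """Bits to spell out each distinct morpheme, plus one end marker apiece."""
    return sum((len(m) + 1) * BITS_PER_LETTER for m in morphemes)

def description_length(words, analysis):
    """Grammar cost plus the cost of encoding each word as a morpheme sequence.

    `analysis` maps each word to a tuple of morphemes. Each morpheme token
    costs -log2 of its relative frequency among all morpheme tokens.
    """
    parses = [analysis[w] for w in words]
    counts = Counter(m for parse in parses for m in parse)
    total = sum(counts.values())
    data_bits = sum(-math.log2(counts[m] / total)
                    for parse in parses for m in parse)
    return grammar_bits(counts) + data_bits

def split_suffix(word, suffixes=("ing", "ed", "s")):
    """Hypothetical analysis: strip one known suffix, else use a null suffix."""
    for suffix in suffixes:
        if word.endswith(suffix):
            return (word[:-len(suffix)], suffix)
    return (word, "")

words = ["walk", "walks", "walked", "walking",
         "jump", "jumps", "jumped", "jumping"]

unsplit = {w: (w,) for w in words}               # every word is one morpheme
split_analysis = {w: split_suffix(w) for w in words}  # stem + suffix

for name, analysis in [("unsplit", unsplit), ("stem+suffix", split_analysis)]:
    print(f"{name}: {description_length(words, analysis):.1f} bits")
```

On this toy lexicon the stem+suffix analysis comes out shorter, because
the two stems and four suffixes are each spelled out once rather than
once per word; under the criterion above it would be the preferred
morphology of the two.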


John Goldsmith
Department of Linguistics
University of Chicago


-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Tadeusz Piotrowski
Sent: Friday, April 26, 2002 11:44 AM
To: corpora at lists.uib.no
Subject: FW: Corpora: Morphology and Word Length (was: Relative text
length)



Is there really any language-independent morphology? I doubt it, and I
recall that even for one language there are conflicting views on
morphology, i.e. a word has as many morphemes as the theory allows.
Regards
Tadeusz Piotrowski


