Corpora: Morphology and Word Length (was: Relative text length)

Mike Maxwell maxwell at ldc.upenn.edu
Fri Apr 26 20:16:35 UTC 2002


In the context of word length in various languages, Tadeusz Piotrowski
writes:

>Is there really any language-independent morphology?
>I doubt it, and I recall that even for one language
>there are conflicting views on morphology, i.e. a word
>has as many morphemes as the theory allows it.

I'm not sure I completely understand this response, but there are certainly
lots of theories of morphology out there, and few of them are explicit about
what constitutes a 'word'.  This shows up particularly in decisions about
practical orthographies, and also in computational treatments of languages
which don't mark word boundaries (Chinese being a well-known example).  Of
course, the fact that there may be differences or even conflicts among
theories says nothing about whether the notion of 'word' is valid, nor about
whether it can be defined.  It may just mean that some theories (or all the
theories we have now) are simply wrong.

The other way to interpret this response is that it is asking whether we
would know how to mark word boundaries if we were suddenly given the correct
theory of morphology, i.e. a description of what the mind does (assuming
morphology is a distinct discipline, and that the notions of 'word' and
'edge of word' are valid concepts).  It's entirely possible that there would
be situations in languages where the theory would _not_ decide, and that
that's one of the causes of language change (e.g. when a clitic in some
language gets reanalyzed as an affix by speakers of later generations).
It's also possible that individual words are fuzzy objects, as Ken Pike
argued.

Of course I don't claim to know the answer...

     Mike Maxwell
     Linguistic Data Consortium
     maxwell at ldc.upenn.edu


