[Corpora-List] ANC, FROWN, Fuzzy Logic
Mike Maxwell
maxwell at ldc.upenn.edu
Thu Jul 27 22:39:54 UTC 2006
Rob Freeman wrote:
> Mike - I'm not sure what you are saying, other than that linguists have been
> careless about fitting theory to the data.
Let me put it more bluntly. I'm saying that when you say things like
> No single grammar of natural language can ever be complete.
> This is because natural language text is at some level
> Kolmogorov complex.
you're wrong, or arguing against a straw man, or both. You're wrong if
you mean that any particular finite natural language text is K-complex
(that is, incompressible), because it isn't; the fact that you can zip
English files and get smaller files is enough to show that (as others
have pointed out in this discussion). And John Goldsmith's response
described how his work on morphology induction was based on a form of
compression, which also depends on text not being K-complex.
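The zip observation is easy to check for yourself. What follows is only
a rough sketch, with some assumptions of mine: zlib stands in for "zip",
the sample is a few sentences of ordinary English prose, and the
"incompressible" comparison text is pseudo-random bytes of the same
length. The names compression_ratio, ENGLISH and NOISE are made up for
the illustration. Bear in mind that a compressor only gives an upper
bound on Kolmogorov complexity, so shrinking shows a text is not
K-complex, while failing to shrink would prove nothing.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; below 1.0 means it shrank."""
    return len(zlib.compress(data, 9)) / len(data)


# A short stretch of ordinary English prose (illustrative sample only).
ENGLISH = (
    "Most linguists, and particularly generative linguists, do not consider "
    "coverage of a finite corpus to be a goal, at least when it comes to "
    "syntax. Linguists have sometimes been careless about fitting the theory "
    "to the data, but the degree of compression that humans do in the process "
    "of language learning might be less than maximal. If careless means not "
    "doing maximal compression, then human language learners may well be "
    "worse offenders than linguists."
).encode("ascii")

# Pseudo-random bytes of the same length, seeded so the run is repeatable.
# This stands in for a text with no regularities a compressor can exploit.
rng = random.Random(0)
NOISE = bytes(rng.getrandbits(8) for _ in range(len(ENGLISH)))

english_ratio = compression_ratio(ENGLISH)
noise_ratio = compression_ratio(NOISE)

print(f"English text: {len(ENGLISH)} bytes, ratio {english_ratio:.2f}")
print(f"Random bytes: {len(NOISE)} bytes, ratio {noise_ratio:.2f}")
```

The English sample should come out with a ratio below 1.0, and the
random bytes at roughly 1.0 or slightly above, since the compressor
adds a few bytes of overhead when it finds nothing to exploit.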
Besides, most linguists (particularly generative linguists) do not
consider coverage of a finite corpus (if that's what you mean by
"natural language text") to be a goal, at least when it comes to syntax.
So in this case you're arguing against a straw man.
As for linguists being careless about fitting the theory to the data,
that has of course happened, but that wasn't my point. My point was
rather that the compression humans perform on natural language in the
course of language learning might be less than maximal. So if by
"careless" you mean "not doing maximal
compression", then human language learners may well be worse offenders
than linguists.
I am tempted to say more, but I'll stop there.
--
Mike Maxwell
maxwell at ldc.upenn.edu