[Corpora-List] ANC, FROWN, Fuzzy Logic

Mike Maxwell maxwell at ldc.upenn.edu
Thu Jul 27 22:39:54 UTC 2006


Rob Freeman wrote:
> Mike - I'm not sure what you are saying, other than that linguists have been 
> careless about fitting theory to the data.

Let me put it more bluntly.  I'm saying that when you say things like

 > No single grammar of natural language can ever be complete.
 > This is because natural language text is at some level
 > Kolmogorov complex.

you're wrong, or arguing against a straw man, or both.  You're wrong if 
you mean that any particular finite natural language text is K-complex 
(that is, incompressible), because it isn't; the fact that you can zip 
English files and get smaller files is enough to show that (as others 
have pointed out in this discussion).  And John Goldsmith's response 
described how his work on morphology induction was based on a form of 
compression, which likewise depends on text being non-K-complex.
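
The zip argument can be sketched in a few lines of Python.  This is only 
an illustration under stated assumptions: the standard zlib module stands 
in for zip, a short English sample stands in for a corpus, and the helper 
name compression_ratio is made up for the example.  Note that a compressor 
only gives an upper bound on Kolmogorov complexity, so the sketch shows 
that the English sample is compressible, not how complex it "really" is.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, 9)) / len(data)


# Assumption: a short English sample stands in for "natural language text".
english = (
    "You're wrong if you mean that any particular finite natural language "
    "text is K-complex, because it isn't; the fact that you can zip English "
    "files and get smaller files is enough to show that. Besides, most "
    "linguists do not consider coverage of a finite corpus to be a goal, at "
    "least when it comes to syntax. So in this case you're arguing against "
    "a straw man."
).encode("ascii")

# Assumption: bytes from the OS entropy source stand in for an
# incompressible string of the same length.
noise = os.urandom(len(english))

print(f"English text: {compression_ratio(english):.2f}")
print(f"Random bytes: {compression_ratio(noise):.2f}")
```

A ratio below 1.0 for the English sample means zlib found regularities to 
exploit; the random bytes should come out at or slightly above 1.0, since 
zlib adds a few bytes of framing and finds nothing to shorten.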

Besides, most linguists (particularly generative linguists) do not 
consider coverage of a finite corpus (if that's what you mean by 
"natural language text") to be a goal, at least when it comes to syntax.  
So in this case you're arguing against a straw man.

As for linguists being careless about fitting the theory to the data, 
that has of course happened, but I wasn't talking about that.  Rather, I 
was saying that the degree of compression humans achieve in the process 
of language learning might be less than maximal.  So if by "careless" 
you mean "not doing maximal compression", then human language learners 
may well be worse offenders than linguists.

I am tempted to say more, but I'll stop there.
-- 
	Mike Maxwell
	maxwell at ldc.upenn.edu


