[Corpora-List] ANC, FROWN, Fuzzy Logic

Rob Freeman lists at chaoticlanguage.com
Tue Jul 25 00:15:38 UTC 2006


Hi John,

An aside to Daoud: compression is indeed related to the issue in the way you 
suggest.

On Monday 24 July 2006 09:24, John F. Sowa wrote:
> ...Greg Chaitin makes many good points, but
> the connection between those points and this discussion is
> not clear.

The direct relevance of Chaitin's work, for me, is that it says that for many 
systems (the most compact ones, in fact) any set of logical generalizations 
will always leave some uncertainty.

Put otherwise: for many systems, the experimental observations are themselves 
the most compact representation.

In our case this at least presents the possibility that a corpus of usage 
might be the most compact representation for (contradictory) grammatical 
abstractions, and any attempt to find a single logical basis for those 
abstractions will result in uncertainty/incompleteness.

Before Goedel, people assumed there was always a logical basis for everything. 
After Goedel, people need to accept that for some systems (Chaitin and 
Kolmogorov tell us most) the experimental facts are the most compact 
representation, and any attempt at a single complete logical description 
unravels.

> ...Saying that the corpus is not generated by rules might
> be a reasonable claim, but then it is necessary to answer Chris's
> questions:
>
> CB> how should we, as scientists, proceed in trying to derive
>  > objective and generalizable knowledge about language from
>  > corpora?
>  >
>  > once we have decided what to try and explain, what kind of
>  > models we should use?

I think the way to do this was identified long ago. It is functional contrast. 
The American structuralists developed it in the '30s to make meaningful 
predictions which looked like grammatical class. Harris was trying to extend 
it to produce rules over those classes when Chomsky was his student. 
Everything looked good, but this process suddenly fell out of favor when 
Chomsky pointed out that it led to uncertain or incomplete solutions in terms 
of underlying logic. That is exactly what mathematicians like Goedel, Chaitin, 
and Kolmogorov were finding for other systems at about the same time (and 
which Goedel showed must actually happen for _any_ sufficiently powerful 
system).

So Chomsky's disproof is actually our proof (that a corpus is the most 
compact representation in the case of language). It is just that knowledge of 
the mathematical consequences of Goedel's result (that incompleteness is 
normal, necessary even) had not sufficiently diffused at the time Chomsky 
made the observations (nor has it now).

According to readings of Mark's message, this (functional contrast) would also 
be the "process" he is looking for.

-Rob


