[Corpora-List] ANC, FROWN, Fuzzy Logic
Rob Freeman
lists at chaoticlanguage.com
Tue Jul 25 00:15:38 UTC 2006
Hi John,
An aside to Daoud. Compression is indeed related to the issue in the way you
suggest.
On Monday 24 July 2006 09:24, John F. Sowa wrote:
> ...Greg Chaitin makes many good points, but
> the connection between those points and this discussion is
> not clear.
The direct relevance of Chaitin's work for me is that it says there will
always be uncertainty in any set of logical generalizations over many systems
(the most compact ones, in fact).
Put another way, it says that experimental observations are the most compact
representations for many systems.
In our case this at least presents the possibility that a corpus of usage
might be the most compact representation for (contradictory) grammatical
abstractions, and any attempt to find a single logical basis for those
abstractions will result in uncertainty/incompleteness.
Before Goedel, people assumed there was always a logical basis for everything.
After Goedel, people need to accept that for some (Chaitin and Kolmogorov tell
us most) systems the experimental facts are the most compact representation,
and any attempt at a single complete logical description unravels.
> ...Saying that the corpus is not generated by rules might
> be a reasonable claim, but then it is necessary to answer Chris's
> questions:
>
> CB> how should we, as scientists, proceed in trying to derive
> > objective and generalizable knowledge about language from
> > corpora?
> >
> > once we have decided what to try and explain, what kind of
> > models we should use?
I think the way to do this was identified long ago. It is functional contrast.
The American structuralists developed it in the '30s to make meaningful
predictions which looked like grammatical classes. Harris was trying to extend it to
produce rules over those classes when Chomsky was his student. Everything
looked good, but this process suddenly fell out of favor when Chomsky pointed
out it led to uncertain or incomplete solutions in terms of underlying logic.
That is exactly what mathematicians like Goedel, Chaitin, and Kolmogorov were
finding for other systems at about the same time (and which Goedel showed must
actually happen for _any_ sufficiently powerful system).
So Chomsky's disproof is actually our proof (that a corpus is the most
compact representation in the case of language). It is just that knowledge of
the mathematical consequences of Goedel's result (that incompleteness is
normal, necessary even) had not sufficiently diffused at the time Chomsky
made the observations (or now).
According to readings of Mark's message, this (functional contrast) would also
be the "process" he is looking for.
-Rob