[Corpora-List] Chomsky and computational linguistics

John F. Sowa sowa at bestweb.net
Fri Jul 27 02:22:28 UTC 2007


Rob,

The theme I was trying to get across was the summary line,
which I quoted from C. S. Peirce:

    "Do not block the way of inquiry."

 > It's great to see these fundamental issues being discussed
 > again on Corpora. There's lots of scope to go back to the
 > original arguments.
 >
 > Your general theme seems to be a personal criticism that Chomsky
 > did not keep up. In detail you cite mainly a continued rejection
 > of statistics and semantics.

No, I claim that Chomsky had every right to develop his insights
in his own way and to demonstrate their value by showing positive
results.  But there was no need for him to denounce and destroy
other approaches as legitimate areas of inquiry.

There was nothing wrong in suggesting the hypothesis that syntax
is independent of semantics.  It might even be a useful methodology.
But there was no evidence that it was psychologically realistic or
linguistically necessary.  By making it a dogma, Chomsky caused
a lot of harm to the field by blocking the way of inquiry and
fomenting the linguistic wars.

 > Chomsky saw that structural descriptions of language based
 > on corpora failed.

Chomsky saw no such thing.  Have you read the book by C. C. Fries?
He used a corpus that was tiny by today's standards, but Fries was
able to develop a decent structural description of English from it.

And there was a long history of people who derived good grammars
for dead languages from corpora -- including Chomsky's former teacher
Zellig Harris, who derived a grammar of Phoenician.  In fact, Harris
postulated the idea of transformations as a result of his work on
corpora.

 > There is today, still, no (major?) branch of theoretical
 > linguistics which teaches a basis in observable structure.

You seem to be using the term "observable structure" as implying
that induction alone (also known as "data mining") is the only way
to analyze data.  But Peirce noted that there are three fundamental
methods of logic:  deduction, induction, and abduction.

Deduction cannot derive anything that wasn't implicit in the starting
assumptions.  Induction derives new hypotheses by a systematic search
for hidden patterns.  Abduction pulls a hunch, a wild guess, or a
brilliant insight out of thin air -- which must then be tested by
deduction and induction.

 > I think Chomsky's detailed conclusion that the structure of
 > language was not observable because it is innate, was wrong.
 > But I think he was right that a regular structure in language
 > can't be observed.

I agree that in linguistics, as in most sciences, induction alone is
not sufficient to derive deep insights.  It is a useful but weak
method, which must be supplemented by something else.  Harris's
hypothesis of transformations was a good example of an abduction.

Another excellent abduction was the hypothesis that the Coptic
language was a later stage of the language spoken by the pharaohs.
That hypothesis was key to decoding the phonetic markers in the
hieroglyphics that were difficult to decipher from the corpus alone.

Chomsky's suggestion of using a native speaker's intuition is
another example of using insight as a source of abductions.
That was a fine idea.  But those insights are best used as
a *supplement* to a corpus, not as a *replacement* for it.

 > The answer is not to be had by moving to semantics, or
 > revalidating statistics again. These don't get to the core
 > of the problem.

I wasn't recommending either of them as the sole answer.  I was
criticizing Chomsky for ruling out other ideas, and I would not
recommend replacing his dogma by another dogma based on semantics
or statistics.  Language is a very large subject, and no single
methodology is likely to be sufficient to explore all of it.

 > The core of the problem I think is indeed to be found in what
 > Ken Litkowski describes as "some Godelian experiences that are
 > not covered" (July 3). There is nothing mysterious in this.
 > It is widely known that some systems, notably that arch "formal
 > system" maths itself, are formally incomplete in this sense.

Goedel's work has been misquoted and misapplied to everything
that anybody finds difficult to understand.  It's the atheist's
equivalent of the God hypothesis.

 > Should we be surprised that language behaves like maths?

Everything behaves like math, because no matter how outlandish
any system might be, it is possible to find some way to formalize
it.  For example, look at the kinds of virtual realities and
computer graphics that programmers build out of mathematical
models.  But that hypothesis is so general that it doesn't tell
us anything specific.

John



_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


