Corpora: Chomsky and corpus linguistics

Christopher Bader cbader at MIT.EDU
Fri Apr 6 18:05:32 UTC 2001


Thanks to Mike Maxwell for taking up, like Samson, the jawbone of an ass
against the Philistines :)

In a recent lecture, Chomsky himself made the bird/707 argument.
Specifically, he argued that although the human body might serve
as proof that it is possible to design a forklift, it would be
completely absurd to base a forklift on the human body.

I want to make two additional points.

1.  It is simply wrong to contend that Chomsky has contributed
nothing to language technology.  His work in the 1950s and '60s
laid part of the foundation for formal language theory.  See the
treatment of the Chomsky Hierarchy or Chomsky Normal Form in any
textbook on automata and the theory of computation.

2.  In his more recent work, Chomsky distinguishes between
the E-language (e.g. the set of all grammatical sentences)
and the I-language (the human language faculty).  Generative
grammarians study the latter; corpus linguists, the former.
The Chomsky Hierarchy and Chomsky Normal Form are
of course concepts pertaining to the E-language, not to
the I-language, which is why Chomsky no longer works
in this area.

Since generative linguists and computational linguists
have fundamentally different objects of study, it is not
surprising that they sometimes have trouble understanding
each other's work.  I urge people on this list who are interested
in Chomsky's actual views to read Knowledge of Language:
Its Nature, Origin, and Use (1986).  It lays out in well-reasoned,
non-technical prose the arguments for the E-language/I-language
distinction.

Christopher Bader
Dept. of Linguistics and Philosophy
MIT E39-245
Cambridge MA 02139


