Corpora: Chomsky and corpus linguistics

Mills, Carl (MILLSCR) MILLSCR at UCMAIL.UC.EDU
Mon Apr 30 11:59:03 UTC 2001


To follow up on the fascinating exchanges among Ramesh, Mike Maxwell, and
John, Mike wrote:

> Putting this differently, there really are things that are *not* possible
> sentences in English, even though we sometimes know immediately what they
> would mean if they were grammatical.  "Whose did you find book?", "What
> are you afraid that happened?", "Who do you wonder whether will go?" etc.
> And there are things that we know aren't English, unless we twiddle the
> grammar a bit.  My favorite example is the sentence from Catch-22, "They
> disappeared him."  As one of the characters says in the novel says, it's
> not English, but...
>
Well, I am sorry, but I have heard sentences very much like these from
native speakers of English--with no attempts at repair by the speaker and no
expressions of puzzlement from the hearer.  And herein lies a fatal flaw in
generative linguistics.  It is not the reliance on grammaticality judgments
to settle important theoretical questions that damages the credibility of
generative linguistics.  It is the steadfast refusal to consider data from
an N of more than 1 (or at most a few, consisting of like-minded linguists).
As an experimental sociolinguist, I have noted for decades that people vary
greatly in their grammaticality (or acceptability) judgments, a point that
the late Dwight Bolinger made for years.

While John is correct in his assertion that the mathematics of probability
and theories that incorporate it are in no way crucially linked to random
number generators, the view that Mike seems to find unthinkable is, I have
come to believe, the correct one:  until a sentence is uttered, the words in
it can assume any configuration.  Some configurations are more likely than
others.  How do we know this last?  From examining large corpora and
calculating the probabilities of a given string's occurrence.  But anything
is possible; it is just that some things are less likely than others.
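
To make "calculating the probabilities of a given string's occurrence"
concrete, here is a minimal sketch.  It rests on assumptions that are not
part of the argument above: a four-sentence toy corpus standing in for a
large one, a bigram model with add-one smoothing, and hypothetical function
names (train_bigrams, log_prob).  The smoothing is what gives an unattested
word order a small but nonzero probability, so nothing is ruled out; some
strings are only less likely than others.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real study would use a large corpus.
CORPUS = [
    "they found the book",
    "they found him",
    "whose book did you find",
    "did you find the book",
]

def train_bigrams(sentences):
    """Count bigrams and their left-hand contexts, with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    # Everything that can follow a context: the corpus words plus end-of-sentence.
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    return contexts, bigrams, vocab

def log_prob(sentence, contexts, bigrams, vocab):
    """Add-one-smoothed bigram log-probability of a sentence.

    Assumes every word of the sentence is in the training vocabulary.
    """
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    v = len(vocab)
    return sum(
        math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + v))
        for prev, word in zip(tokens, tokens[1:])
    )

contexts, bigrams, vocab = train_bigrams(CORPUS)
attested = log_prob("whose book did you find", contexts, bigrams, vocab)
scrambled = log_prob("whose did you find book", contexts, bigrams, vocab)
print(attested > scrambled)       # the attested order scores higher
print(math.isfinite(scrambled))   # but the scrambled order is not impossible
```

On this toy corpus the attested order comes out more probable than Mike's
"Whose did you find book?", yet the latter still receives a finite
log-probability.  With a real corpus the numbers would differ, and the
choice of model and smoothing method would matter a good deal.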

Syntactic structure, if it exists as more than left-over baggage from a
2,000-year-old grammatical tradition (pace Vic Yngve), is rudimentary.  But
as E.O. Wilson said years ago, Chomskyan linguists, like the
poet-naturalists of the 19th century, start out expecting to find structure,
and they find it.

Chairs.

Carl

Carl Mills
Linguistics Program
Director of Undergraduate Studies
Department of English and Comparative Literature
University of Cincinnati


