Corpora: Chomsky/Harris - one more fun question.

Mills, Carl (MILLSCR) MILLSCR at UCMAIL.UC.EDU
Thu Apr 5 16:51:09 UTC 2001


Near the end of Mike Maxwell's interesting and well-thought-out post, he
says:

> that there are degrees of empiricism, and that the data that generative
> linguists typically use is not non-empirical just because they think of
> it, rather than finding it in some half-baked email msg from someone about
> whom you know nothing, and who may not even be a native speaker of the
> language they're writing in.  Another attack would be to say that one
> should use a variety of data, and that a sentence I (as a generative
> linguist) think of may be more relevant because I can tailor it to my
> needs.  (We don't fault chemists because they mix up their own chemicals,
> rather than studying only reactions that occur in the environment around
> them.)
>
Having spent a quarter-century as a quantitative linguist, working in
sociolinguistics, dialectology, and stylistics, and being rather new to
corpus linguistics, I have watched this thread with interest.  First, I do
not think that MIT linguists (and theoretical linguists in general) are the
enemy.  They do, however, inhabit a parallel universe that only lightly
touches the one inhabited by those of us interested in language use.  Second,
MIT linguists
use words like "empirical" and "data" in ways that are strange to those of
us who study language use--stranger still to those of us who come from other
scientific disciplines (chemistry and electronics technology, here).  So here
is a problem with the passage by Mike that I quoted above.  It is not the
sentences that generative linguists cite as "data" that bother me.  I have
conducted some fairly good (I think) work in experimental linguistics using
made-up sentences.  But the virtual "data" that generative linguists use to
"test" their theories and resolve "empirical" questions turn out to be the
linguists' own intuitions about the grammaticality of their made-up
sentences.  I have spent 25 years marvelling at generative linguists'
grammaticality judgments and at their refusal to change them in the face of
native-speaker denials.  If probabilistic linguistics and MIT linguists need
to come together, it is in the area of grammaticality/acceptability
judgments.

Carl Mills


