Corpora: Evidence and intuition

Klaus Zechner zechner+ at cs.cmu.edu
Thu Nov 1 19:33:03 UTC 2001


Historically, the issue probably arose in the 60s when Chomsky and his
followers promoted "explanation" (through the rational mind of an ideal
native speaker/hearer) over "mere description" (of an empirical linguist
who collects data/corpora from the world out there).

But I think they threw the baby out with the bath water (is that the
correct phrase in Am.English?-} maybe we'll have to look at some
corpora...). For me, in the long run, it's not "either-or" but
"both-and": The linguistic data/corpora from "real" speakers and writers
have to build the empirical basis for linguistic description and
explanation ---after all, these are the only objective data that we have
available to us---, but it is certainly possible and maybe even
necessary to augment this basis by native speakers' intuitions, albeit
very cautiously.

As was indicated, however, these intuitions are often not in perfect
agreement, and without a representative sampling of speakers (and then
the question arises: _which_ speakers...?) it is hard to tell which
?/*/*?/?? sentences are "ok" in a language or not.

Just my 2 cents...

-klaus zechner-
cmu, lti


