[Corpora-List] Chomsky and computational linguistics
Mike Maxwell
maxwell at umiacs.umd.edu
Wed Jul 11 02:39:51 UTC 2007
Philip Resnik wrote:
> ...there's an emerging community trying to use the
> tools of computational linguistics with corpora to enable linguistic
> theoreticians to be more empirical in their approach without
> abandoning their paradigm.
> ...
> Work of this kind offers value to theoretical linguists because the
> standard paradigm of inventing and judging examples can easily miss
> relevant facts, and lead to false generalizations.
If I were doing theoretical syntax, I think this is exactly where I'd
find myself: using corpora to keep me from missing relevant facts, that
is, examples from people whose judgements might differ from my own (just
as my spell checker differs from me about how 'judgement' should be
spelled, but that's another question...), or to make me consider
constructions that I might otherwise have overlooked.
There is a danger in corpus linguistics of conflating dialects, using
texts produced by non-native speakers, etc. And crucial constructions
may simply be so rare that you just can't find them in the available
corpus. (Certain kinds of reduplication in "exotic" languages are one
example of data that is hard to find in a corpus, and that's
not even syntax.) So there's still room, it seems to me, for
introspection (or asking the person in the next office, or elicitation
from an informant). But as you say, the corpora add value, too.
(Other kinds of linguistics, like lexicography, have been about corpus
collection for centuries, of course.)
--
Mike Maxwell
maxwell at umiacs.umd.edu
"Theorists...have merely to lock themselves in a room
with a blackboard and coffee maker to conduct their business."
--Bruce A. Schumm, Deep Down Things