Corpora: Chomsky/Harris - one more fun question.
Mike Maxwell
mike_maxwell at sil.org
Mon Apr 2 14:28:51 UTC 2001
I seem to have missed the first msg of this thread, which quoted the
following--so I can't say who is being quoted:
> It is unfortunate that many people in the corpus
> linguistics community have put themselves in opposition
> to Chomskyan linguists.
But it brought to mind my reaction to reading the Manning and Schütze
textbook, "Foundations of Statistical Natural Language Processing," which
made some claims (perhaps "claims" is too strong a word) about the
theoretical value of statistical approaches. I found the following two
quotes to be especially interesting in juxtaposition. The first is from the
beginning of chapter 8 "Lexical Acquisition" (page 265), and the second is
from the end of that same chapter (page 311):

    While we discuss simply the ability of computers
    to learn lexical information from online texts,
    rather than in any way attempting to model
    human language acquisition, to the extent that
    such methods are successful, they tend to
    undermine the classical Chomskyan arguments
    for an innate language faculty based on the
    perceived poverty of the stimulus.

    ...

    What does the future hold for lexical acquisition?
    One important trend is to look harder for sources
    of prior knowledge that can constrain the process
    of lexical acquisition... One important source of
    prior knowledge should be linguistic theory, which
    has been surprisingly underutilized in Statistical
    NLP.

As a generative linguist myself, I would add that to the extent that
acquisition methods fail in the absence of prior knowledge, particularly
prior linguistic knowledge, they underscore--not undermine--the classical
arguments for an innate language faculty.
Of course, "statistical NLP" =\= "corpus linguistics", but there is some
commonality.
Mike Maxwell
Summer Institute of Linguistics
Mike_Maxwell at sil.org