Corpora: Chomsky/Harris - one more fun question.

Mike Maxwell mike_maxwell at sil.org
Mon Apr 2 14:28:51 UTC 2001


I seem to have missed the first message of this thread, which quoted the
following, so I can't say who is being quoted:

> It is unfortunate that many people in the corpus
> linguistics community have put themselves in opposition
> to Chomskyan linguists.

But it brought to mind my reaction to reading the Manning and Schütze
textbook, "Foundations of Statistical Natural Language Processing," which
made some claims (perhaps "claims" is too strong a word) about the
theoretical value of statistical approaches.  I found the following two
quotes to be especially interesting in juxtaposition.  The first is from the
beginning of chapter 8 "Lexical Acquisition" (page 265), and the second is
from the end of that same chapter (page 311):

        While we discuss simply the ability of computers
        to learn lexical information from online texts,
        rather than in any way attempting to model
        human language acquisition, to the extent that
        such methods are successful, they tend to
        undermine the classical Chomskyan arguments
        for an innate language faculty based on the
        perceived poverty of the stimulus.

...

        What does the future hold for lexical acquisition?
        One important trend is to look harder for sources
        of prior knowledge that can constrain the process
        of lexical acquisition...  One important source of
        prior knowledge should be linguistic theory, which
        has been surprisingly underutilized in Statistical
        NLP.

As a generative linguist myself, I would add that to the extent that
acquisition methods fail in the absence of prior knowledge, particularly
prior linguistic knowledge, they underscore--not undermine--the classical
arguments for an innate language faculty.

Of course, "statistical NLP" =\= "corpus linguistics", but there is some
commonality.

                                         Mike Maxwell
                                         Summer Institute of Linguistics
                                         Mike_Maxwell at sil.org


