Corpora: Chomsky/Harris - one more fun question.

Mike Maxwell mike_maxwell at sil.org
Thu Apr 5 14:17:13 UTC 2001


Pete wrote:
>The point is, the ornithologists are not the
>theorists of flight, the physicists are. Bird
>flight is a natural embodiment of the physics of flight

Rather than annoy the list further with an argument about who studies the
way birds fly, or discussion of what is "the" correct analogy, I will just
say that my analogy is
    bird flight : airplane design ::
        theoretical linguistics : language technology
You may learn something about language that you can use on the computer from
theoretical linguists, just as the Wright brothers are said to have learned
from watching birds fly.  But you shouldn't fault the linguistic
theoreticians just because you can't make use of everything they do, any
more than you should fault astronomers for not helping you build a fusion
reactor, or ethologists for not teaching you how to train your dog, or
neurologists for not finding better ways to do classroom teaching--or
ornithologists for studying the shape of bird feathers.

>Do you ever hear psycholinguists and language
>engineers say 'those guys at MIT have really
>clarified the fundamental relationships between
>sounds, texts and meanings, now all we have to
>do is model them in wetware and software?'

I can't speak for the psycholinguists (although I very much doubt that
psycholinguistics would be where it is were it not for generative
linguistics).  But I will repeat what I said in my earlier posting: when I,
as a "language engineer" (not my term, nor my title then), wrote a
(reasonably) comprehensive grammar of English for a parser, I did refer to
"those guys at MIT", as well as to generative linguists elsewhere.  I also
used a traditional grammar of English (Quirk, Greenbaum, Leech and
Svartvik), although if you look at the revisions in that work over the
years, I think you'll find that they also owe a debt to generative
linguistics.  (I could be wrong about that, since it's always hazardous to
try to guess how someone came to a conclusion.)  My memory is getting fuzzy
now that I'm over the hill, but as I recall, the single best reference was
Joe Emonds's book "A Transformational Approach to English Syntax."  (BTW, we
later did a weighting of the various constructions based on their frequency
of occurrence in various corpora, to help choose the best parse.  To my
mind, this is an ideal symbiosis between theoretical and corpus linguistics:
find what's possible from the theory, and filter by what's likely in the
corpus.)
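
A minimal sketch of that "possible from the theory, likely in the corpus"
filtering, to make the idea concrete.  It assumes the grammar has already
produced several candidate parses, each represented as a tuple of
construction labels, and that per-construction counts have been collected
from a corpus.  The labels, counts, function names and the add-one
smoothing are all hypothetical illustrations, not the weighting scheme used
in the parser described above.

```python
import math
from collections import Counter


def construction_weights(corpus_counts, smoothing=1.0):
    """Return a function mapping a construction label to a log weight.

    Weights are smoothed relative frequencies, so a construction never
    seen in the corpus gets a low weight instead of ruling a parse out.
    """
    total = sum(corpus_counts.values())
    vocab = len(corpus_counts) + 1  # one extra slot for unseen labels

    def weight(construction):
        count = corpus_counts.get(construction, 0)
        return math.log((count + smoothing) / (total + smoothing * vocab))

    return weight


def best_parse(candidate_parses, weight):
    """Pick the candidate whose constructions are most frequent overall."""
    return max(candidate_parses, key=lambda parse: sum(weight(c) for c in parse))


# Hypothetical counts: how often each construction occurred in a corpus.
counts = Counter({
    "transitive-VP": 100,
    "PP-attach-to-VP": 60,
    "PP-attach-to-NP": 40,
})
weight = construction_weights(counts)

# Both parses are licensed by the (hypothetical) grammar; the corpus
# counts decide between them.
candidates = [
    ("transitive-VP", "PP-attach-to-NP"),
    ("transitive-VP", "PP-attach-to-VP"),
]
chosen = best_parse(candidates, weight)
```

Summing log weights treats the constructions in a parse as independent,
which is a simplification; it is only meant to show the division of labour
between the grammar (which proposes) and the corpus (which ranks).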

At one point (back in the '80s) I played at doing English morphology on the
computer.  I needed to know where the stressed syllables were, and I
basically implemented the stress rules in The Sound Pattern of English
(Chomsky and Halle).  And when I later implemented a general phonological/
morphological parser, I again referred to the work of generative
phonologists and morphologists, including some at MIT, as well as to theses
done at MIT (and elsewhere).  (Oops, I hope we don't get off on what the
plural of 'thesis' is!)

I guess I'll make this msg even longer by replying to S. Warren:
>Maybe 50 years ago before the advent of
>computers as we know them today a non-
>empirical approach could be justified but
>surely not now?  Generative linguists please
>reply in your defence!!!!

There are a number of ways to "attack" this (I'd rather be on the attack
today than on the defence :-)).  One is to say that there are degrees of
empiricism, and that the data that generative linguists typically use is not
non-empirical just because they think of it, rather than finding it in some
half-baked email msg from someone about whom you know nothing, and who may
not even be a native speaker of the language they're writing in.  Another
attack would be to say that one should use a variety of data, and that a
sentence I (as a generative linguist) think of may be more relevant because
I can tailor it to my needs.  (We don't fault chemists because they mix up
their own chemicals, rather than studying only reactions that occur in the
environment around them.)  I could go on, but I'll stop there for today,
because I have to go to a class on SQL server...

                                 Mike Maxwell
                                 Summer Institute of Linguistics
                                 Mike_Maxwell at sil.org


