Corpora: Chomsky/Harris - one more fun question.

James L. Fidelholtz jfidel at siu.buap.mx
Thu Apr 5 20:46:28 UTC 2001


Dear CORPORA:
	(By the way, what do you call it if you don't have *any* yet?).

	OK, I'm interested in corpora, and what they can tell us about
language.  I also feel sorry for Mike Maxwell being the only one to
defend generative linguistics (pace Carl Mills, who, along with Mike
and a couple of others, has injected a bit of reason into the
discussion, it seems to me -- by the way, Carl, from the very
beginning, part of the sport of generative grammar was shooting down
people's examples, which we never considered tantamount to
shooting down their theories -- I think even then we talked about
'refining theories' and such).
	I studied at MIT, and almost my first research in linguistics
was on English vowel reduction as a research assistant to Chomsky and
Halle, during which I discovered the 'frequency rule' for
pre-heavy-cluster vowel reduction in English, and although I perhaps did
not manage to convince Chomsky and Halle that this was worth their
taking into account in SPE, I've been a confirmed frequency buff ever
since.
	While I was there, Stan Petrick, Barbara Hall (now Partee) and
various others took part in a project for, I believe, the MITRE
Corporation, in which they designed a question-and-answer system in
English for, if memory serves, searching databases.  I somehow doubt if
this was the very first such system, but it must have been
state-of-the-art for then (early 60s), and certainly made a lot of use
of Chomsky's (and of course their own) work on English syntax.  Also, as
Fritz Newmeyer has pointed out, a very large portion of early theses at
MIT (in the 60s, including mine) were fieldwork theses, often on
indigenous, or at least non-Indo-European, languages.  I believe Petrick
used his MITRE experience as a springboard for his thesis, which I
believe was on such a system.
	The point here is that not all MIT-trained linguists are averse
to data (of different types, even), nor even averse to working with
corpora.  This sort of fake dichotomy must have gotten started from the
(correct) perception that Chomsky has very little personal interest in
the application of his theories in any practical pursuits, which seems
to aggravate a large number of linguists, especially if they, for
whatever reason, are not adherents of generative theories.  My answer to
these people would be: give the guy a break!  He has other interests,
and has done quite well, thank you, in pursuing them and in giving what
nearly all observers admit are the underpinnings of modern linguistics,
pretty much independent of the theory or approach one uses.  Chomsky
certainly has no objection to people using his theories (or even
others) in any number of practical ways.  *He* just isn't interested in
doing so.  He'd probably even be interested if some corpus studies
proved relevant for linguistic theory, but that's up to corpus linguists
to do, after all.  Very few people criticized Michael Jordan for being a
rather mediocre baseball player (although some criticized him for even
trying it! -- and they may have been right).  You guys are all
smart -- you get the point.
	In sum, to get the attention of the 'MIT linguists', corpus
linguistics has to show that it is relevant to the formulation of
theories.  Probably very few of the MIT group would dismiss corpus
evidence out of hand, but they've got other fish to fry than puttering
around in corpora like we do.
	I guess I'll close with a limerick (oxymoron: it's indelicate):

There once was a guy from Byzondum
Used a dried hedgehog skin for a condom.
His girlfriend would shout,
As he pulled the thing out,
"De gustibus non disputandum".

		Jim

-- 
James L. Fidelholtz			e-mail: jfidel at siu.buap.mx
Posgrado en Ciencias del Lenguaje	tel.: +(52-2)229-5500 x5705
Instituto de Ciencias Sociales y Humanidades	fax: +(01-2) 229-5681
Benemérita Universidad Autónoma de Puebla, MÉXICO
