[Corpora-List] RE : Chomsky and computational linguistics

Terry tmorpheme at hotmail.com
Sun Sep 2 04:14:53 UTC 2007


The assertion that "discussions of personal attitudes by Chomsky (such as
whether he recognized he was wrong or not) strike me as uninteresting"
strikes me as being consistent with Chomsky's own method of procedure. (For
the record, though, he has not.) 

Chomsky's work is replete with strong attitudinal judgments about what
linguistic work is ‘vacuous’ or ‘empty’ or has ‘no bearing’ (Aspects of the
Theory of Syntax 40, 204, 54, 20, 41, 53, 126f). This mode of rhetoric is
simply a means for dismissing whatever area of work Chomsky himself has no
interest in exploring. And, as everyone on this list knows, one of his
favourite areas for discouraging exploration was ... corpus linguistics!
Good job no one was listening to his advice too closely.

Terry


-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Santos Diana
Sent: Saturday, September 01, 2007 10:48 PM
To: Rob Freeman; Mike Maxwell; CORPORA at UIB.NO
Subject: Re: [Corpora-List] RE : Chomsky and computational linguistics

Although I have not participated -- mainly because I came back from holidays
when it was already in full swing -- I tremendously enjoyed the discussion, and
agree with Cécile and Rob that this is the (one) right place to discuss such issues.

So please go on! 

I avow that discussions of personal attitudes by Chomsky (such as whether he
recognized he was wrong or not) strike me as uninteresting. Much has
already been written about the (science) political attitudes of Chomsky as a
scholar (see e.g. Geoffrey Sampson's books) and his influence on the
discipline of linguistics. But a question that may still be pertinent to ask --
and discuss -- is whether anything he suggested is relevant to corpus
linguistics.
 
(Incidentally, I hate this designation, our discipline should be called
"empirical linguistics" or at least "linguistics using corpora" or
"corpus-based NLP"...)

To add my own bit of discussion, the "corpus of a language" is not the
reason we do corpus linguistics -- my impression being that a corpus is to
be thought of as a sample, a sample that we can manipulate and observe
externally (and whose findings we can therefore discuss with others and
replicate).

I am not sure, either, that anyone is looking for a complete grammar - or
the most compact description of a corpus (this last one seems to me VERY
suspicious, if a corpus is a sample).

I think that most people doing corpus linguistics see a corpus as a (near)
perfect exploratory testbed, where instantaneous access to a lot of
intuitions and speech practices can be found, as well as a good testbed
(though one to be handled with care) for more developed hypotheses. (For the
latter you might in fact require carefully designed new corpora...)

There is also another branch (flavour) of corpus linguistics (?) where you
just test and train your own systems, of course, and then the goal is to aid
system development. This is the engineering side of corpus linguistics, which
again is not well described by its name. "Corpus-based testing &
development" might be a better name?

In any case, if the corpora-list only had conference announcements and
requests for particular applications for particular languages, it would not
be half as interesting (IMO) as it is now, thanks to Mike Maxwell, Rob
Freeman and others:-)

Diana


________________________________

	From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Rob Freeman
	Sent: 1. september 2007 05:48
	To: Mike Maxwell; CORPORA at UIB.NO
	Subject: Re: [Corpora-List] RE : Chomsky and computational linguistics
	
	
	For me the question of interest is the complexity of descriptions of
	corpora. Is a complete grammar possible, or is the corpus of a language
	the most compact description of itself?
	
	If Chomsky's work is relevant to that, why not talk about it? 
	
	Personally I could do without the "he was right"/"he was wrong"
	stuff too. It buries the interesting issues and is meaningless in the
	abstract.
	
	So let's keep it tight to the science/engineering issue. 
	
	But let's talk about it.
	
	If the completeness of grammatical descriptions of corpora is not an
	appropriate topic for the Corpora list, what is?
	
	-Rob
	
	
	On 8/31/07, Mike Maxwell <maxwell at umiacs.umd.edu> wrote: 

		Cécile Yousfi wrote:
		> I'm a mere user of the BNC and I'm no Chomsky specialist, but I do enjoy
		> reading interesting discussion on the subject. So please go on
		> discussing the matter on the list.
		>
		> It's intellectually stimulating to have a genuine dialogue on a
		> theoretical subject, and to be confronted with different points of view.
		
		As one who posted on this subject a couple months ago (and probably
		posted too much :-), not to mention representing the strident minority
		on this list), I have to say that there's probably a better forum.  This
		list is, after all, about corpora; and while the discussion could have
		been about whether modeling corpora is about science or engineering, it
		tended to be more about whether Chomsky's approach had any validity, or
		whether he should have admitted defeat, or about other issues that (at
		least IMO) have less to do with corpora.
		
		I would welcome suggestions for a more appropriate forum.
		--
		        Mike Maxwell
		        maxwell at umiacs.umd.edu
		        "Theorists...have merely to lock themselves in a room
		        with a blackboard and coffee maker to conduct their business."
		        --Bruce A. Schumm, Deep Down Things
		
		_______________________________________________ 
		Corpora mailing list
		Corpora at uib.no
		http://mailman.uib.no/listinfo/corpora
		





