22.4650, Disc: Re: Remarks by Noam Chomsky in London

Mon Nov 21 18:58:57 UTC 2011

LINGUIST List: Vol-22-4650. Mon Nov 21 2011. ISSN: 1069 - 4875.

Subject: 22.4650, Disc: Re: Remarks by Noam Chomsky in London

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews: Veronika Drake, U of Wisconsin-Madison
Monica Macaulay, U of Wisconsin-Madison
Rajiv Rao, U of Wisconsin-Madison
Joseph Salmons, U of Wisconsin-Madison
Anja Wanner, U of Wisconsin-Madison
       <reviews at linguistlist.org>

Homepage: http://linguistlist.org

The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.

Editor for this issue: Elyssa Winzeler <elyssa at linguistlist.org>
================================================================  


===========================Directory==============================  

1)
Date: 20-Nov-2011
From: Mark Brenchley [schlemihl at gmail.com]
Subject: Re: Remarks by Noam Chomsky in London

-------------------------Message 1 ---------------------------------- 
Date: Mon, 21 Nov 2011 13:52:04
From: Mark Brenchley [schlemihl at gmail.com]
Subject: Re: Remarks by Noam Chomsky in London

 It seems to us that there is one aspect of Noam Chomsky's talk that really stands out (and this includes the papers Geoffrey Pullum mentions; namely, Chomsky 2011 and Berwick et al. [BPYC] 2011): scholars need to stop and reflect upon what they are doing.

(1) Nowhere is this more true than on the question of whether machine learning is relevant (and if so, how relevant) for linguistic theory in general and language acquisition in particular. Whilst it is true that the field of mathematical linguistics has yielded many interesting results (some of which were initiated by Chomsky himself), Chomsky has been rather adamant regarding their limited relevance to the study of language qua biological system. In our opinion, he is right about this.

When someone states that language is mildly context-sensitive, surely they do not mean it in a literal sense (how could they?). Rather, what scholars really mean by statements like these is that the expressive power of language, when the latter is described in terms of strings of symbols that stand for terminals and non-terminals, is mildly context-sensitive (that is, generable by the right collection of rewriting rules); a slightly different matter. Thus, even if an 'efficient, correct' algorithm (to reference Clark, 2011; cited by Pullum in his discussion piece) is shown to successfully acquire multiple context-free grammars, this is not ipso facto a demonstration that is directly relatable to the acquisition of natural language.

As many authors have pointed out before, the expressive power of a (formal) language and its place within the so-called Chomsky Hierarchy constitute a fact about what has come to be known as 'weak generativity' (i.e. string-generation), but what the linguist ought to be studying is the generation and conceptualization of structure (i.e. strong generativity). Consequently, whilst it may be true that Chomsky misunderstood/misheard Clark's question, Clark misses the point that we ought to be interested in strong generativity, and not in the weak equivalence between strings of symbols and the structures they supposedly stand for.
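To make the weak/strong contrast concrete, here is a minimal sketch in Python. The two toy grammars (S -> a S | a and S -> S a | a) and all function names are our own illustrative assumptions; they come from none of the papers under discussion. Both grammars generate exactly the same strings, yet they assign those strings different structures.

```python
# Toy illustration (hypothetical grammars, not taken from the cited papers).
# Grammar 1: S -> a S | a   (right-branching trees)
# Grammar 2: S -> S a | a   (left-branching trees)

def right_branching(tokens):
    """Build the tree that Grammar 1 assigns to a non-empty token list."""
    if len(tokens) == 1:
        return ("S", tokens[0])
    return ("S", tokens[0], right_branching(tokens[1:]))

def left_branching(tokens):
    """Build the tree that Grammar 2 assigns to a non-empty token list."""
    if len(tokens) == 1:
        return ("S", tokens[0])
    return ("S", left_branching(tokens[:-1]), tokens[-1])

def terminal_yield(tree):
    """Read the terminal string off a tree, left to right."""
    leaves = []
    for child in tree[1:]:
        if isinstance(child, tuple):
            leaves.extend(terminal_yield(child))
        else:
            leaves.append(child)
    return leaves

tokens = ["a", "a", "a"]
tree_1 = right_branching(tokens)
tree_2 = left_branching(tokens)

# Weak generativity: the two grammars agree on the string.
same_string = terminal_yield(tree_1) == terminal_yield(tree_2)
# Strong generativity: they disagree on the structure.
same_structure = tree_1 == tree_2
```

A learner evaluated only on the strings it accepts could not be told apart on these two grammars, which is the sense in which results about weak generativity leave the question of structure open.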

We are certain that both Pullum and Clark are aware of this, but some of their publications appear to show the suspension (temporary, we hope) of belief in these facts. In Rogers & Pullum (2011), we find a very careful analysis of the different grammars and languages of the Chomsky Hierarchy, but there is much at fault when these authors seek to identify the 'psychological correlates' that would show, in an experimental setting, what system subjects are employing/have internalized. The supposed connection between these cognitive abilities (e.g. the ability to recognize that every A is immediately followed by a B versus the ability to detect that at least one B was present somewhere) and the expressive power of an underlying grammar tells us very little indeed about mental properties and principles. Plausibly, the psychological correlates they list are the result of hierarchical mechanisms that operate over hierarchical (mental) representations, and the cognitive science literature contains myriad examples of theories that explicitly make use of these two components. Miller et al.'s (1960) TOTE units, or those studies that focus on Control operations (such as Simon 1962, Newell 1980 or Pylyshyn 1984) are some of the clearest examples we can think of. Crucially, these complex systems bear no relation whatsoever to formal grammars or languages. Much like in natural language, the key notion here is structure (incidentally, Miller & Chomsky 1963 already pointed to the analogy between TOTE units and the syntactic trees linguists postulated for sentences, something they did not consider coincidental).
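For readers unfamiliar with the two string patterns mentioned in this paragraph, a minimal sketch in Python follows. The function names and the treatment of edge cases (for instance, a string-final A counting as a violation) are our own assumptions and are not taken from Rogers & Pullum (2011). The sketch only shows what the two patterns are; it says nothing about how subjects actually compute them.

```python
# Hypothetical illustration of the two patterns; not code from any cited paper.

def every_a_immediately_followed_by_b(s):
    """Local pattern: each 'A' must have a 'B' as its right-hand neighbour.

    Only adjacent pairs of symbols need to be inspected. By assumption,
    an 'A' in final position has no follower and therefore fails.
    """
    return all(
        i + 1 < len(s) and s[i + 1] == "B"
        for i, symbol in enumerate(s)
        if symbol == "A"
    )

def contains_some_b(s):
    """Non-local pattern: at least one 'B' occurs anywhere in the string."""
    return "B" in s

examples = ["ABAB", "AAB", "CCC", "A"]
results = {
    s: (every_a_immediately_followed_by_b(s), contains_some_b(s))
    for s in examples
}
```

Both checks operate on flat strings, which is exactly why, on the argument made here, passing or failing them reveals little about the hierarchical representations that may underlie the behaviour.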

In a way, computational linguists are hostage to the fact that strong generativity has so far resisted formalization and that, therefore, their results do not appear to be directly relatable to the careful descriptions and explanations linguists propose; a fortiori, their formulae do not tell us much about the psychological facts of human cognition. In our opinion, then, Chomsky's analysis does not show an 'extremely shallow acquaintance' with computational models, but a principled opposition to them because of what these models assume and attempt to show.

(2) We also take issue with Pullum's comment that the aforementioned papers 'share a steadfast refusal to engage with anything that might make the debate about the poverty of the stimulus (POS) an empirical one.' We think this is both false and not a little unfair. 

It is true, of course, that Chomsky seems to have little interest in what we might call empirical "number crunching" with respect to POS (e.g. quantifying the actual syntactic patterns in the child's environmental input and relating these quantifications to the actual frequencies of equivalent patterns within the child's developing output). However, the fact that he himself has not undertaken such research is entirely orthogonal to the claim that he has not provided empirical grounds for debating the POS. On the contrary, the last fifty-plus years have seen Chomsky build up a substantial body of actual natural language analysis. And it is this analysis which we would argue constitutes a clear empirical contribution to POS arguments. 

In particular, it seems to us that what Chomsky's work does (or, at least, looks to do) is provide an explication which is grounded in the study of natural language syntax, thereby attempting to establish the nature of human syntactic knowledge. As such, it necessarily establishes a framework within which all learning models must operate, defining the particular target structures that these models are to converge on. So, for example, whatever learning model/algorithm is eventually worked out - a task we believe to be both important and non-trivial - it must account for the fact that languages are hierarchical in structure; for it indeed seems to be an empirical fact that human languages have such structure (unlike, say, the linear strings of formal language theory; see BPYC for evidence to this effect). If a proposed general learning model does not produce such structures, it necessarily fails to provide a viable account of language acquisition, and does so precisely because it fails to match the empirically established account of natural language structure.
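The difference between a linear and a structure-dependent rule can be sketched as follows. The example sentence is the familiar textbook one for auxiliary fronting, chosen by us for illustration, and the split into subject and predicate is hand-annotated; the function names are hypothetical. This is a toy under those assumptions, not a model of acquisition.

```python
# Hypothetical toy: the constituent boundaries below are supplied by hand.
subject = ["the", "man", "who", "is", "tall"]   # one noun-phrase constituent
predicate = ["is", "happy"]                     # main-clause auxiliary + rest

def front_first_aux_linear(words, aux="is"):
    """Linear rule: move the first auxiliary found in the word string."""
    i = words.index(aux)
    return [words[i]] + words[:i] + words[i + 1:]

def front_main_aux_structural(subject_np, predicate_words):
    """Structure-dependent rule: move the main-clause auxiliary,
    assumed here to be the first word of the annotated predicate."""
    return [predicate_words[0]] + subject_np + predicate_words[1:]

linear_question = " ".join(front_first_aux_linear(subject + predicate))
structural_question = " ".join(front_main_aux_structural(subject, predicate))
```

The linear rule yields the ill-formed "is the man who tall is happy", whereas the structural rule yields "is the man who is tall happy". The second function can only be stated at all because the input carries constituent structure, which is the kind of target a learning model would have to converge on.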

And, indeed, if you listen to the talk, this seems to be precisely the grounds on which Chomsky criticizes the computational cognitive science research literature raised in the Q+A session. So, when he criticizes the Perfors article in the talk, he does so because the researchers' specific approach simply fails to capture the syntactic knowledge that (some) linguistic theory has not only argued for, but argued for through detailed empirical analyses of natural language. Hence, perforce, their work fails outright to constitute an adequate rebuttal to POS (UCL video, 65:00; see also the relevant section in BPYC). 

A similar point applies to his comments regarding Clark's question (or, rather, what he takes to be Clark's question; not at all, as Pullum points out, the same thing). That is, Chomsky seems to argue against it (past it?) because the approach does not provide a realistic model of human syntactic knowledge. And the approach is not realistic because it doesn't stand up to (what he believes to be) the independent, viable and empirically established account of what this knowledge consists of (UCL video, 69:00; see Chomsky 2011 and BPYC for a brief recapitulation of certain pertinent features of this account). Hence, it couldn't possibly constitute a genuine POS counterargument.

The basic schema of the argument would, therefore, seem to be something like this: (1) As linguists, we are interested in the nature of human linguistic knowledge. (2) Our analyses of actual natural language syntax lead us to believe certain facts to be true of this knowledge (e.g. structure dependent movement), which we account for in a certain way (e.g. Merge). (3) The computational cognitive science literature has so far failed to provide domain-general learning models that adequately capture these facts about human language. (4) Therefore, they do not constitute POS counterarguments. 

Now, whilst this may of course turn out to be a bad argument, perhaps even a terrible one, it is prima facie one that looks to ground itself in empirically-derived content; content that Chomsky has surely been instrumental in contributing to.

Mark Brenchley
David J. Lobina

REFERENCES
Berwick, R. C., Pietroski, P., Yankama, B., & Chomsky, N. (2011) Poverty of the stimulus revisited. Cognitive Science, 35, 1207-1242.

Chomsky, N. (2011) Language and other cognitive systems. What is special about language? Language Learning and Development, 7, 263-278.

Miller, G. A. & Chomsky, N. (1963) Finitary models of language users. Handbook of Mathematical Psychology, vol. 2, John Wiley and Sons, Inc., 419-492.

Miller, G. A., Galanter, E. & Pribram, K. H. (1960) Plans and the Structure of Behavior. Holt, Rinehart and Winston, Inc.

Newell, A. (1980) Physical symbol systems. Cognitive Science, 4, 135-183.

Pylyshyn, Z. (1984) Computation and Cognition. The MIT Press.

Rogers, J. & Pullum, G. K. (2011) Aural pattern recognition experiments and the subregular hierarchy. Journal of Logic, Language and Information, 20, 329-342.

Simon, H. (1962) The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467-482.

Linguistic Field(s): Cognitive Science
                     Computational Linguistics
                     Discipline of Linguistics
                     Language Acquisition
