[Corpora-List] Chomsky and computational linguistics

Rob Freeman lists at chaoticlanguage.com
Mon Jul 23 05:00:29 UTC 2007


Nice review John.

It's great to see these fundamental issues being discussed again on Corpora.
There's lots of scope to go back to the original arguments.

How do you boil it down though?

Your general theme seems to be a personal criticism that Chomsky did not
keep up. In detail you cite mainly a continued rejection of statistics and
semantics.

I'll take another angle. I'll argue it would not have made any difference if
Chomsky had embraced statistics or semantics. Others embraced these in
opposition to him. It hasn't helped much.

I think Chomsky was kind-of right to stick to his guns. He saw something we
are still missing, and it is not resolved simply by embracing semantics or
statistics again.

Chomsky saw that structural descriptions of language based on corpora
failed. He was very successful in arguing this. There is today, still, no (major?)
branch of theoretical linguistics which teaches a basis in observable
structure. Cognitivists and Functionalists both hate Chomsky, but they all
reject descriptions of language based on observable structure. In that sense
they too are schools born of Chomsky.

There has to be something in this.

I think Chomsky's detailed conclusion, that the structure of language was not
observable because it is innate, was wrong. But I think he was right that a
regular structure in language can't be observed.

The answer is not to be had by moving to semantics, or by revalidating
statistics. These don't get to the core of the problem.

The core of the problem I think is indeed to be found in what Ken Litkowski
describes as "some Godelian experiences that are not covered" (July 3).
There is nothing mysterious in this. It is widely known that some systems,
notably that arch "formal system" maths itself, are formally incomplete in
this sense.

Should we be surprised that language behaves like maths?

-Rob

On 7/5/07, John F. Sowa <sowa at bestweb.net> wrote:
>
> Mike,
>
> That's a fair question:
>
> JFS>> It is truly sad when a man who had taught us all a great deal at
> >> one time long ago has walled himself off from any input that might
> >> raise questions he had decided to ignore five decades ago.
>
> MM> Let me take the devil's advocate position here, in hopes of
> > provoking some discussion.  What is the evidence from corpora
> > that would raise these questions, and what are the questions
> > Chomsky is ignoring?
>
> Let's go back 50 years to Chomsky's _Syntactic Structures_, which
> was an excellent book for its time (and the most readable book that
> C. ever wrote).  Following are some of the earlier developments
> that he was arguing for and/or against:
>
>   1. Immediate constituent analysis (i.e., parsing by the equivalent
>      of context-free grammars), which was recommended by Wells (1947)
>      and implemented in some early MT systems.
>
>   2. Transformational grammar by Harris (1951), who had argued that
>      immediate constituent (IC) analysis was insufficient to capture
>      all the generalizations in syntax and that a transformational
>      layer on top of ICs was necessary to relate, for example, the
>      active and passive forms of verbs.
>
>   3. Grammar discovery procedures, which involved the analysis of
>      large (for the time) corpora in order to derive a grammar.
>      The most impressive work of those years was an analysis of
>      a corpus of fifty hours of spoken English by Fries (1952).
>      To minimize preconceptions, Fries avoided traditional labels
>      for parts of speech, and assigned meaningless letters and
>      numbers to the classes of words that could be inserted in the
>      slots of various co-occurrence patterns.
>
>   4. Statistical methods, especially those derived from Shannon (1948;
>      Shannon & Weaver 1949), who introduced information theory and
>      discussed Markov chains and the analysis of text by N-grams of
>      letters or words.
>
>   5. Integration of syntax and semantics, especially in MT systems.
>      One of the pioneering groups in MT and Comp. Ling. was the
>      Cambridge Language Research Unit, founded in the early 1950s
>      by Margaret Masterman.  CLRU was a fertile breeding ground for
>      pioneers in both theoretical and computational linguistics.
>
> In _Syntactic Structures_, Chomsky followed his former teacher,
> Zellig Harris, in adopting transformations as a layer on top of
> IC analysis.  His major innovation was to adopt production rules
> by Post (1943, 1947) as the notation for representing IC rules,
> which C. called phrase-structure rules.  As working hypotheses,
> C. stated the following principles:
>
>   1. A language is a set of sentences defined by a formal grammar
>      stated in a mathematical notation.  This assumption was not
>      derived from Harris, but more likely from Quine and Goodman,
>      who, C. said in the preface, "strongly influenced" him.
>
>   2. "Grammar is best formulated as a self-contained subject
>      independent of semantics" (p. 106).  This assumption is
>      typical of the practice in formal logic, but not in the
>      linguistics of the 1950s.  Roman Jakobson, for example,
>      countered "Syntax without semantics is meaningless."
>
>   3. A rejection of Markov processes as inadequate to generate
>      all and only the grammatical sentences of a language.
>      This assumption follows from the definition in point #1,
>      but it does not imply that Markov processes cannot be
>      useful for recognizing major chunks of a sentence.
>
>   4. A rejection of phrase-structure grammars as inadequate, not
>      because they could not generate all sentences of a language,
>      but because they could not express common generalizations
>      (such as active-passive transformations).
>
>   5. A two-level approach with a phrase-structure component for
>      generating kernel sentences and a transformational component
>      for combining and transforming the kernel sentences.  This
>      approach is effectively equivalent to Harris's, but with
>      different notation and terminology.
>
>   6. "External conditions of adequacy... e.g., the sentences will
>      have to be acceptable to the native speaker" (p. 49).
>
>   7. "Condition of generality... we require that the grammar of a
>      given language be constructed in accordance with a specific
>      theory of structure... independently of any particular
>      language" (p. 50).
>
> As guidelines, these principles led to a great deal of fruitful
> research, but Chomsky fossilized them as dogma that ruled out
> an even larger body of potentially much more fruitful research.
>
> Harris, for example, would certainly object to point #6, since
> he had written a grammar of Phoenician for his PhD dissertation
> despite the lack of any native speakers.  Point #7 also presumes
> that a universally acceptable theory of structure can be decreed
> even before all languages have been studied.  Those two points
> led to the abandonment of many projects, such as the one by Fries,
> that were based on corpora.  (Fortunately, linguists who worked
> in the field with indigenous languages ignored those points.)
>
> Point #1, which implies that syntax alone must determine the
> set of permissible sentences, distorts everything else.  Instead
> of using a two-level syntax, most computational systems do the
> parsing with a context-free grammar and achieve the effect of
> transformations in the mapping to a semantic representation.
> Theoreticians ranging from Montague to the generative semanticists
> did something similar, but many of them suffered badly in the
> so-called "linguistic wars."
>
> Chomsky's rejection of statistics not only affected theoretical
> linguistics, it even spread to AI and comp. ling.  As an example,
> a colleague of mine at IBM, Eva Mueckstein, had written a PhD
> dissertation in grammar theory:  she showed how a context-free
> grammar combined with a finite-state control for selecting which
> CF rules to apply could provide the equivalent of a context-sensitive
> grammar.  She was hired by Fred Jelinek, who gave
> her the task of adding probabilities to the FS control.  After
> getting some promising results, she submitted a paper to IJCAI
> in 1981.  But the paper was rejected with the curt reply,
> "Statistics is not AI."
>
> In the late 1980s, another colleague, who still believed in Chomsky,
> said "But we don't know much about semantics."  That was thirty years
> after the MT work on semantics and seventeen years after Montague.
> Even worse, it was over six centuries after Ockham (1323) had written
> a semantic analysis of the entire Latin language (which would, even
> today, be an excellent introduction to model-theoretic semantics,
> especially for linguists who are terrified by Montague's notation).
>
> At the end of the preface, Chomsky cited the support he received from
> the U.S. Army, Navy, and Air Force.  Those payments were part of the
> money MIT received for work on machine translation.  If Chomsky had
> done the work he was being paid for, he could have learned a lot about
> how language actually works.  Both MT and theoretical linguistics
> might have benefited enormously.
>
> John Sowa
>
> --------------------------------------------------------------------
>
> References:
>
> Chomsky, Noam (1957) _Syntactic Structures_, Mouton, The Hague.
>
> Fries, Charles Carpenter (1952) _The Structure of English_,
> Harcourt, Brace & World, New York.  For a tribute to Fries, see
> http://itre.cis.upenn.edu/~myl/languagelog/archives/003743.html
>
> Harris, Zellig (1951) _Methods in Structural Linguistics_, University
> of Chicago Press, Chicago.  For a tribute to Harris, see
> http://www.dmi.columbia.edu/zellig/
>
> Masterman, Margaret (2006) _Language, Cohesion and Form_, edited
> by Yorick Wilks, Cambridge University Press.  For a review, see
> http://www.jfsowa.com/pubs/mmb_rev.htm
>
> Ockham, William of (1323) Summa Logicae. _Ockham's Theory of Terms_,
> translation of Part I by M. J. Loux, University of Notre Dame Press,
> Notre Dame, IN, 1974. _Ockham's Theory of Propositions_, translation
> of Part II by A. J. Freddoso & H. Schuurman, University of Notre Dame
> Press, Notre Dame, IN, 1980.
>
> Post, Emil L. (1943) "Formal reductions of the general combinatorial
> decision problem," _American Journal of Mathematics_, 65, 197-215.
>
> Post, Emil L. (1947) "Recursive unsolvability of a problem of Thue,"
> _Journal of Symbolic Logic_, 12, 1-11.  Reprinted in M. Davis, ed.,
> _The Undecidable_, Raven Press, Hewlett, NY, 1965, pp. 293-303.
>
> Shannon, Claude E. (1948) "A mathematical theory of communication,"
> _The Bell System Technical Journal_, Vol. 27, pp. 379–423, 623–656.
> http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf
>
> Shannon, Claude E., & Warren Weaver (1949) _The Mathematical Theory
> of Communication_, Univ. of Illinois Press, Urbana.
>
> Wells, Rulon (1947) "Immediate constituents," _Language_, 23, 81-117.
>
>
>