OT--5 crucial strategic decisions that vitiate...

Sun Dec 12 19:01:57 UTC 1999

Sorry I couldn't get to this till now.  My real life called me....
This is a very long msg, because Brian MacWhinney has asked some very
substantial questions.

Brian MacWhinney <macw at CMU.EDU> wrote on Friday, 10 December:

>    However, as a psycholinguist, I have been disappointed by five crucial
> "strategic" decisions in the development of OT that have tended to vitiate
> its potential for constructing a psychologically plausible linguistic theory
> of the type that Joan Bresnan and others have often sought.  In particular,
> 1.  Early on, OT was supposed to be linked to connectionist modeling.
> However, after the first few years, this linkage was largely dropped.   Dedre
> Gentner's interest in analogy as an acquisition or production mechanism has
> pretty much suffered the same fate, I would guess.

It is interesting to read what Prince and Smolensky say about this in
their article "Optimality: From Neural Networks to Universal Grammar"
in the March 14, 1997 issue of Science Magazine.  (available on-line:
http://www.sciencemag.org/cgi/reprint/275/5306/1604):

        The principal empirical questions addressed by optimality theory, as
        by other theories of universal grammar, concern the characterization
        of linguistic forms in and across languages. A quite different
        question is, can we explicate at least some of the properties of
        optimality theory itself on the basis of more fundamental cognitive
        principles? A significant first step toward such an explanation, we
        will argue, derives from the theory of computation in neural networks.

        Linguistic research employing optimality theory does not, of course,
        involve explicit neural network modeling of language. The relation we
        seek to identify between optimality theory and neural computation must
        be of the type that holds between higher level and lower level systems
        of analysis in the physical sciences. ... Like thermodynamics,
        optimality theory is a self-contained higher-level theory; like
        statistical mechanics, we claim, neural computation ought to explain
        fundamental principles of the higher level theory by deriving them as
        large-scale consequences of interactions at a much lower level.

They give some interesting examples where OT has been conceptually
driven by the hypothesized underlying neural implementation, but point
out that it doesn't explain the universal recurrence of linguistic
constraints, so there's still a gap between the two approaches.

I suspect that Paul and Alan have a deeper view of the relation to
connectionist ideas than most generative linguists who have taken up
OT, but *some* linguists have been curious enough about the underlying
cognitive issues to walk part way across the bridge.  I have heard
Helen de Hoop, for example, start out a lecture on OT semantics by
talking about tensor products...  brave woman, intellectually valiant.

> 2.  Early on, OT constraints were supposed to have strength levels.  However,
> later on this feature was eliminated.  In our work on the Competition Model,
> Liz Bates and I learned how important strength levels are for describing and
> predicting psycholinguistic data.  In fact, one has to go beyond strength
> levels for separate constraints and look at what we call conflict validity,
> but none of this could possibly fit in with current OT.

My colleague Edward Flemming (http://www.stanford.edu/~flemming/) makes
a similar argument.  He concludes his most recent paper, "Scalar
Representations in a Unified Model of Phonetics and Phonology" as
follows:

        The proposed model of phonetics and phonology is similar to
        Optimality Theoretic phonology in that outputs are selected so as to
        best satisfy conflicting, violable constraints.  However, the
        constraints considered here (particularly implementations of
        minimization of effort and maximization of distinctiveness) trade-off
        against each other in an additive fashion, implying that these
        interactions are better modeled in a weighted constraint system rather
        than one which exclusively employs strict constraint dominance, as is
        the case for Optimality Theory.

It is an interesting question whether the scalar-valued functions that
play a role in assimilation and coarticulation are just what is needed
for studying the typology and structure of, say, pronominal
inventories or voice systems in syntax.  But even if one doesn't go
that far (into continuous modelling), the idea of optimizing symbolic
structures using a discrete evaluation function defined over
universal, violable constraints (like the markedness constraints of
functional/typological linguistics) is a radical shift that seems
promising, and creates an intellectual bridge where there
wasn't one before.
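
To make the contrast between strict dominance and additive weighting
concrete, here is a minimal sketch.  The constraint names (C1, C2), the
candidates, the violation counts and the weights are all invented for
illustration; none of them come from Flemming's paper or from any
published analysis.

```python
from typing import Dict, List

# A violation profile maps a constraint name to a number of violations.
Violations = Dict[str, int]


def strict_winner(cands: Dict[str, Violations], ranking: List[str]) -> str:
    """Strict dominance: compare violation profiles lexicographically,
    highest-ranked constraint first."""
    return min(cands, key=lambda c: tuple(cands[c].get(k, 0) for k in ranking))


def weighted_winner(cands: Dict[str, Violations], weights: Dict[str, float]) -> str:
    """Weighted evaluation: pick the candidate with the lowest summed cost."""
    return min(cands, key=lambda c: sum(weights[k] * n for k, n in cands[c].items()))


# Hypothetical candidates: "a" violates C1 once, "b" violates C2 three times.
CANDIDATES = {"a": {"C1": 1}, "b": {"C2": 3}}

print(strict_winner(CANDIDATES, ["C1", "C2"]))              # b
print(weighted_winner(CANDIDATES, {"C1": 3.0, "C2": 1.5}))  # a
```

Under strict dominance no number of C2 violations can outweigh a single
C1 violation, so "b" wins.  With these assumed weights, three C2
violations cost more than one C1 violation, so "a" wins.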

> 3.  Early on, learning of a phonology (or grammar) could have involved the
> strengthening and weakening of constraints.  Later on, it required the types
> of triggers used by G-B and P&P theories.

???? I find this a bit hard to understand in view of Tesar and
Smolensky's 1998 article "Learnability in Optimality Theory"
(Linguistic Inquiry 29, 229--68) and their long technical report,
which explicitly argue against trigger-based learning models.  (You
can find a lot of these papers on Paul Smolensky's web page:
http://www.cog.jhu.edu/faculty/smolensky.html.)  I like even more
the new model of gradual learning of constraints whose ranking varies
probabilistically (see Boersma and Hayes, ROA, for references:
http://ruccs.rutgers.edu/roa.html).
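
For readers who have not seen this kind of learner, here is a rough
sketch of the general idea: each constraint has a numerical ranking
value, noise is added at evaluation time, and an error nudges the
values up or down by a small step.  It is a simplified illustration,
not the authors' implementation.  The constraint names (A, B), the
candidates, the starting values, the noise level and the step size are
all assumptions made up for the example.

```python
import random
from typing import Dict

Violations = Dict[str, int]


def noisy_winner(cands: Dict[str, Violations], ranks: Dict[str, float],
                 noise: float, rng: random.Random) -> str:
    """Perturb each ranking value, then evaluate by strict dominance."""
    noisy = {k: r + rng.gauss(0.0, noise) for k, r in ranks.items()}
    order = sorted(noisy, key=noisy.get, reverse=True)
    return min(cands, key=lambda c: tuple(cands[c].get(k, 0) for k in order))


def learn(cands: Dict[str, Violations], target: str, ranks: Dict[str, float],
          trials: int = 1000, plasticity: float = 0.1,
          noise: float = 2.0, seed: int = 0) -> Dict[str, float]:
    """Error-driven learning: adjust ranking values only on a mismatch."""
    rng = random.Random(seed)
    ranks = dict(ranks)  # work on a copy
    for _ in range(trials):
        guess = noisy_winner(cands, ranks, noise, rng)
        if guess == target:
            continue
        for k in ranks:
            t = cands[target].get(k, 0)
            g = cands[guess].get(k, 0)
            if t > g:
                ranks[k] -= plasticity  # demote: penalizes the observed form
            elif g > t:
                ranks[k] += plasticity  # promote: penalizes the wrong guess
    return ranks


# Hypothetical data: "x" violates A, "y" violates B; the learner hears "x".
CANDIDATES = {"x": {"A": 1}, "y": {"B": 1}}
final = learn(CANDIDATES, target="x", ranks={"A": 100.0, "B": 100.0})
print(final)
```

Because the learner only ever hears "x", the value of B drifts above
that of A until the noise rarely reverses their order.  If the data were
variable, the two values would instead stay close enough for both
outputs to surface some of the time.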

> 4.  Even from the beginning, OT never questioned the need to provide a single
> abstract underlying structure for each lexical item.  As far as I can tell,
> this commitment is the one that tends to lead linguistic theories away from
> being able to develop psychological reality.  Trying to preserve this
> approach  in OT models of syntax would be equally problematic.

If you have a generative, derivational model of underlying structure,
that is certainly true.  It is hard to free ourselves from the kind of
thinking we were originally trained in... and it may be that OT
imported a bit too much of the derivational ways of thinking in
generative grammar at the beginning.  But we know much more now about
nonderivational, constraint-based representations than we used to
(especially those of us who have explored alternatives to Chomsky's
grammatical architectures), and some OT work has a very different
model of the nature and role of the input. (I would refer to several
of my recent papers here, if I weren't so modest.)  This is also true in
phonology.  I really like Rene Kager's new (1999) book, _Optimality
Theory_ in the red textbook series of Cambridge University Press.  It
does a beautiful job of explaining some of the functionalist ideas and
motivations for OT phonology, and showing some of the recent
developments that do away with inputs in various ways...    I would
address this answer to Wally Chafe's message of December 11, as well.

> 5.  At no point was OT really committed to an account of online processing.
> If problems 1-4 were not present, I would not consider this a fatal flaw,
> since the notion of constraint ordering has clear interest for typology and
> language change, at the very least.
>

This is a *very* difficult problem: how to do online computations
comparing infinite sets of recursive symbolic structures?  Here, the
mathematically formalized representational theories of syntax (LFG,
HPSG, categorial grammar, etc.) have some advantages, I think, over
the Chomskyan syntactic approaches that get more press.  Some new
theoretical work has been done on this problem in OT-LFG: one, using
conventional kinds of computation, is by Jonas Kuhn
(http://www.ims.uni-stuttgart.de/~jonas/):

  Jonas Kuhn.  1999.  Generation and Parsing in Optimality Theoretic
  Syntax---Issues in the Formalization of OT-LFG (draft version of
  November 1999).  To appear in a CSLI volume _OT-LFG: Optimality Theory
  and LFG_, edited by Peter Sells.

I can also point you to a remarkable paper on Optimality Theoretic LFG
on Neuroidal Nets by Professor Tetsuro Nishino, which uses a dynamic,
parallel computational model to evaluate the OT-LFG candidate space:
http://www.sw.cas.uec.ac.jp/tnlab/member/nishino.html.

These works are specific to LFG, and are the first I know of on
generation and parsing in OT Syntax.  But once a problem has been solved
in LFG, it usually doesn't take too long to port it to HPSG and other
constraint-based frameworks, ... :-)

>    I am just a psycholinguist, so I am happy to have linguists explain to me
> how I have strayed in my judgments.  But I would really like to hear some
> open discussion of these issues.  ...

What irresistible modesty!  And so unwarranted.  You FUNKfolk are a
pleasure to discuss things with.

TTFN-

Joan