
LINGUIST List:  Vol-12-2550. Fri Oct 12 2001. ISSN: 1068-4875.

Subject: 12.2550, Review: Dekkers, et al, Optimality Theory

Moderators: Anthony Aristar, Wayne State U.<aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
            Andrew Carnie, U. of Arizona <carnie at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Editors (linguist at linguistlist.org):
	Karen Milligan, WSU 		Naomi Ogasawara, EMU
	Jody Huellmantel, WSU		James Yuells, WSU
	Michael Appleby, EMU		Marie Klopfenstein, WSU
	Ljuba Veselinova, Stockholm U.	Heather Taylor-Loring, EMU
	Dina Kapetangianni, EMU		Richard Harvey, EMU
	Karolina Owczarzak, EMU		Renee Galvis, WSU

Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
          Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.



Editor for this issue: Terence Langendoen <terry at linguistlist.org>
 ==========================================================================
What follows is another discussion note contributed to our Book Discussion
Forum.  We expect these discussions to be informal and interactive; and
the author of the book discussed is cordially invited to join in.

If you are interested in leading a book discussion, look for books
announced on LINGUIST as "available for discussion."  (This means that
the publisher has sent us a review copy.)  Then contact Simin Karimi at
     simin at linguistlist.org or Terry Langendoen at terry at linguistlist.org.


=================================Directory=================================

1)
Date:  Tue, 9 Oct 2001 14:40:02 -0700 (PDT)
From:  Ash Asudeh <asudeh at csli.Stanford.EDU>
Subject:  Review of Dekkers et al., Optimality Theory

-------------------------------- Message 1 -------------------------------

Date:  Tue, 9 Oct 2001 14:40:02 -0700 (PDT)
From:  Ash Asudeh <asudeh at csli.Stanford.EDU>
Subject:  Review of Dekkers et al., Optimality Theory

Dekkers, Joost, Frank van der Leeuw, and Jeroen van de Weijer, ed. (2000)
Optimality Theory: Phonology, Syntax, and Acquisition. Oxford University
Press, paperback ISBN 0-19-823844-4, $45.00, x+635pp.

Reviewed by Ash Asudeh, Department of Linguistics, Stanford University.


OVERVIEW
This book consists of an introductory chapter by the editors and
Paul Boersma, 16 papers, and three indexes (subject, language,
name). The papers are divided into four areas: prosodic representation
(4 papers), segmental phonology (3 papers), syntax (5 papers), and
acquisition (4 papers).

The introduction is a compact, but useful, overview of
Optimality Theory (OT; Prince and Smolensky, 1993). The editors and
Boersma spend only a few pages discussing phonology, on the
assumption that this is the best known domain for OT analyses. They
discuss syntax somewhat more extensively, using principally Pesetsky
(1998) and Grimshaw (1997) for exposition. The topic of acquisition
takes up about half the introduction. There is an extensive comparison
of OT learning algorithms to Principles and Parameters learning
algorithms (particularly the Trigger Learning Algorithm of Gibson and
Wexler (1994) and Berwick and Niyogi's (1996) refinement of it), based
on Pulleyblank and Turkel's (1996) discussion of Tongue Root Harmony
languages. Throughout the introduction, the editors tie in the
contents of the volume as appropriate.

The volume proper begins with the section on "Prosodic Representation"
and Burzio's paper, 'Cycles, Non-Derived Environment Blocking, and
Correspondence'. Burzio argues for output-output Correspondence
(Burzio, 1994; Benua, 1995), whereby there is a faithfulness relation
between morphologically related output forms. He argues that this
dispenses with the notion of 'Underlying Representation'.

Hayes ('Gradient Well-Formedness in Optimality Theory') considers the
problem of gradient grammaticality (Schütze, 1996; Keller, 2000;
Pullum and Scholz, 2001). He proposes a model in which constraints are
associated with strictness bands. This is clearly related to the
stochastic constraint evaluation and continuous constraint ranking of
the Boersma contribution to the volume, and indeed many of the results
in this article are incorporated in Boersma and Hayes (2001). In that
article, as in this one, gradience is related directly to frequency, a
view challenged by Keller (2000).

Kager ('Stem Stress and Peak Correspondence in Dutch') considers the
question of word-level stress assignment in Dutch. He argues that a
Correspondence Theory account (McCarthy and Prince, 1995, 1999) is
conceptually and empirically superior to a Lexical Phonology
alternative.

McCarthy's paper ('Faithfulness and Prosodic Circumscription') also
concerns Correspondence Theory. McCarthy argues, principally from
reduplication and infixation data, that a theory of prosodic
faithfulness, which is independently required, eliminates the need for
operational prosodic circumscription (McCarthy and Prince, 1990).

Jacobs and Gussenhoven ('Loan Phonology: Perception, Salience, the
Lexicon and Optimality Theory') begin the section on "Segmental
Phonology". They consider loanword phonology, especially in
Cantonese. Their paper is based largely on Silverman (1992) and Yip
(1993). Unlike those authors, they argue that the notion of 'phonetic
salience' is unnecessary and that a pure OT grammar makes all the
necessary distinctions, so long as one assumes Smolensky's (1996)
interpretive parsing and lexicon optimization (Prince and Smolensky,
1993).

The paper by LaCharité and Paradis ('Derivational Residue: Hidden
Rules in Optimality Theory') also concerns loanword phonology. They
argue for a Correspondence Theory of faithfulness rather than a
Containment Theory (Prince and Smolensky, 1993). Based on this
conclusion, they argue that OT has a crucial derivational core, in
GEN, the mapping from inputs to candidates.

Smith ('Dependency Theory Meets OT: A Proposal for a New Approach to
Segmental Structure') presents a model in which Dependency Phonology
characterizes the GEN and candidate set in an OT model.

The "Syntax" section starts with the paper by Ackema and Neeleman
('Absolute Ungrammaticality'). They tackle the problem of ineffability
(an input that should have no output). They propose that such cases
all involve selection of the null parse as the optimal
candidate. Although the null parse fails PARSE constraints, which
require morphological/featural information in the input to be realized
in the output, contentful candidates violate higher-ranked constraints
such as EPP or PROJECT. In order for their proposal to generalize, A&N
must assume that there is a series of evaluations, each containing
exactly one parse constraint (starting with the highest-ranked one and
so on down the hierarchy), where evaluation n+1 takes the output of
evaluation n as its input.
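
To make this serial architecture concrete, here is a minimal Python
sketch of my reading of A&N's proposal; gen, evaluate, and the PARSE
constraints are placeholders of my own, not anything from their paper:

    def serial_evaluation(input_form, gen, evaluate, parse_constraints):
        """One evaluation per PARSE constraint, highest-ranked first;
        evaluation n+1 takes the output of evaluation n as its input.
        `gen` stands in for GEN and `evaluate` for an EVAL that uses
        exactly one PARSE constraint per pass."""
        current = input_form
        for parse_c in parse_constraints:  # highest-ranked PARSE first
            current = evaluate(gen(current), parse_c)
        return current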

Anderson ('Towards an Optimal Account of Second-Position Phenomena')
argues 1) that second-position phenomena do indeed target the second
position (second position is not an epiphenomenon) and 2) that
second-position clitics and verb second (V2) are related phenomena.

Bresnan introduces the possibility of using Lexical Functional Grammar
(LFG; Bresnan, 1982, 2001; Dalrymple, 2001) in OT syntax to
characterize GEN and the candidate set, and to provide the set of
formatives to which the constraints make reference. She shows how GEN
can be characterized monostratally rather than derivationally, such
that there is parallel and possibly imperfect correspondence between
multiple grammatical representations, and she uses the system to recast
Grimshaw's (1997) pioneering OT syntax analysis. OT-LFG is explored
further in other recent papers by Bresnan, and in the recent volume
edited by Sells (2001).

Legendre ('Morphological and Prosodic Alignment of Bulgarian
Clitics'), like Anderson, argues that clitics are not syntactic
elements, but are rather phrasal affixes that are the morphological
realization of functional features. Also like Anderson, she argues
that this characterization explains the fact that clitics orient to
edges. She proposes a characterization of Bulgarian second-position
clitics in which certain alignment constraints refer to syntactic
domains and others to prosodic domains.

Boersma ('Learning a Grammar in Functional Phonology') begins the
final section, "Acquisition". He presents what is essentially a
condensed version of his thesis and book (Boersma, 1998). In his
model, constraint ranking is continuous, with constraints assigned
numerical rankings rather than hierarchical positions. Evaluation is
stochastic: there is a slight probabilistically-conditioned reranking
of constraints at evaluation time. Another major contribution of this
paper is a model in which constraints are not innate, but are rather
grounded functionally and learned from articulatory and acoustic
data. Boersma provides a detailed example: the learning of Wolof
tongue root harmony.
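
To make stochastic evaluation concrete, here is a minimal Python
sketch; the constraint names, ranking values, and violation profiles
are invented for illustration and are not Boersma's:

    import random

    # Hypothetical continuous ranking values (illustrative assumptions).
    ranking = {"*VOICED-CODA": 100.0, "FAITH(voice)": 98.0}

    # Violation profiles for two competing output candidates.
    candidates = {
        "bed": {"*VOICED-CODA": 1, "FAITH(voice)": 0},
        "bet": {"*VOICED-CODA": 0, "FAITH(voice)": 1},
    }

    def evaluate(ranking, candidates, noise=2.0):
        """One stochastic evaluation: perturb each ranking value with
        Gaussian noise, then pick the candidate that best satisfies
        the resulting strict dominance hierarchy."""
        effective = {c: r + random.gauss(0.0, noise)
                     for c, r in ranking.items()}
        hierarchy = sorted(effective, key=effective.get, reverse=True)
        # Lexicographic comparison of violation vectors = strict domination.
        return min(candidates,
                   key=lambda c: [candidates[c][k] for k in hierarchy])

    # Close ranking values plus evaluation noise yield variable winners.
    outputs = [evaluate(ranking, candidates) for _ in range(1000)]
    print(outputs.count("bet") / 1000.0)

Because the two ranking values are close, the noise occasionally
reverses their effective order, so both outputs occur, with frequencies
determined by the distance between the ranking values.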

Ellison ('The Universal Constraint Set: Convention, Not Fact') also
effectively challenges the innateness of constraints. He presents six
arguments for the universality of constraints, as is usually assumed
in OT, and rejects them all. He concludes that the only sense in which
the universality of constraints should be maintained is as a
convention, like the International Phonetic Alphabet, which will make
the work of linguists easier, but has no substantive empirical content
and no psychological reality.

Pulleyblank and Turkel ('Learning Phonology: Genetic Algorithms and
Yoruba Tongue Root Harmony') present an innovative OT model in which
learning uses a Genetic Algorithm (Holland, 1973; Koza, 1992). Thus,
as in the stochastic model that Boersma presents, speakers can have
largely similar grammars, with minor differences. As the authors note,
this begins to give some handle on the problem of variation and the
continuous, rather than ordinal, differences between dialects.
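
The flavor of a genetic-algorithm learner can be conveyed with a small
sketch; this is not Pulleyblank and Turkel's implementation, and the
constraints, data, and fitness function below are all placeholders:

    import random

    CONSTRAINTS = ["C1", "C2", "C3", "C4"]  # placeholder names

    # Toy data: candidate violation profiles (indexed by constraint)
    # paired with the attested winner; all invented for illustration.
    DATA = [
        ({"a": [1, 0, 0, 1], "b": [0, 1, 1, 0]}, "a"),
        ({"c": [0, 0, 1, 0], "d": [1, 1, 0, 0]}, "c"),
    ]

    def winner(order, cands):
        # order: permutation of constraint indices, highest-ranked first
        return min(cands, key=lambda c: [cands[c][i] for i in order])

    def fitness(order):
        # number of data points whose attested winner the ranking selects
        return sum(winner(order, cands) == obs for cands, obs in DATA)

    def mutate(order):
        child = list(order)
        i, j = random.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]  # swap two constraints
        return child

    population = [random.sample(range(4), 4) for _ in range(20)]
    for generation in range(50):
        population.sort(key=fitness, reverse=True)
        survivors = population[:10]  # truncation selection
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(10)]

    best = max(population, key=fitness)
    print([CONSTRAINTS[i] for i in best], fitness(best))

Since mutation makes only small changes to rankings, a converged
population contains largely similar grammars with minor differences,
which is the handle on variation the authors point to.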

Tesar's paper ('On the Role of Optimality and Strict Domination in
Language Learning') further develops the best-known work on OT learning
(Tesar and Smolensky, 1998, 2000). It considers how design principles of
Optimality Theory, particularly strict domination of constraints and
optimization, can be exploited in providing a theory of language learning.
The paper contains a nice overview of optimization-based learning
algorithms, such as Hill-climbing, its specialization Gradient Ascent, and
Expectation-Maximization. It concludes with a characterization of parsing
as optimization, which consists of "production directed parsing", more
commonly called generation, and "interpretive parsing", which consists of
holding the overt form constant across all candidates in the competition
and selecting the candidate that is optimal. The rest of the paper
presents two algorithms, the Error-driven Constraint Demotion Algorithm
(EDCDA) and the Iterative Learning Algorithm (ILA), the latter of
which, unlike the EDCDA, does not rely on correct full structural
descriptions being available to the learner.
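
For readers new to constraint demotion, its core step can be sketched
as follows; this is a simplified rendering with invented constraint
names, and the real algorithms operate over structural descriptions
rather than bare violation counts:

    def demote(strata, winner, loser):
        """One demotion step: find the highest stratum containing a
        winner-preferring constraint (assumed to exist), then demote
        every loser-preferring constraint ranked at or above it to the
        stratum just below. strata: list of sets of constraint names,
        highest-ranked first; winner/loser: violation counts."""
        w_pref = {c for s in strata for c in s if loser[c] > winner[c]}
        l_pref = {c for s in strata for c in s if winner[c] > loser[c]}
        top = min(i for i, s in enumerate(strata) if s & w_pref)
        if top + 1 == len(strata):
            strata.append(set())
        for i in range(top + 1):
            strata[top + 1] |= strata[i] & l_pref
            strata[i] -= l_pref
        return [s for s in strata if s]

    # The observed winner violates *COMPLEX; the learner's current
    # output (the loser) violates FAITH, so *COMPLEX is demoted:
    print(demote([{"PARSE", "*COMPLEX"}, {"FAITH"}],
                 winner={"PARSE": 0, "*COMPLEX": 1, "FAITH": 0},
                 loser={"PARSE": 0, "*COMPLEX": 0, "FAITH": 1}))
    # -> [{'PARSE'}, {'FAITH'}, {'*COMPLEX'}]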


CRITICAL EVALUATION
This volume is quite heterogeneous. Since it is not possible to give
each paper its due individually, I will mainly discuss certain issues
that arise, commenting on particular papers where relevant. But first
let me consider the book as a whole.

The key strength of this volume is its very heterogeneity. The editors
have collected a number of important contributions from leading
scholars working in Optimality Theory in three central fields of
linguistic study. The fact that seven of the sixteen papers are on
phonology is a fair reflection of the field, as OT has made more
inroads in that area than in any other. The editors should be
commended on including sections on syntax and acquisition.

Despite the variety of topics explored here, there are themes tying
many of the papers together. The acquisition section, which I found
the most interesting (even though I do not work in this area), could
more appropriately be called "learnability". These papers all address
formal and algorithmic issues of learning OT grammars (Boersma,
Pulleyblank and Turkel, Tesar) or key foundational issues about the
universality and innateness of aspects of OT architectures (Boersma,
Ellison). Turning to phonology, the first section ("Prosodic
Representation") largely explores aspects of Correspondence Theory
(Burzio, Kager, McCarthy). The second phonology section ("Segmental
Phonology") picks up on this in places (LaCharité and Paradis), but
it is for the most part concerned with issues of loanword phonology
(Jacobs and Gussenhoven, LaCharité and Paradis). As for the syntax
section, there are two key concerns: the architecture of a syntactic
OT grammar (Bresnan, Broekhuis and Dekkers), and the interaction of
prosody and syntax in the distribution of clitics (Anderson,
Legendre). In each of the last three sections, there are important
papers that are thematically isolated. Hayes is to be commended for
taking gradience seriously as a grammatical phenomenon; Smith makes the
simple but important point that OT can be married with already
well-developed phonological theories, rather than merely replacing
them; and Ackema and Neeleman take on the important problem of
ineffability in OT syntax.

The fact that the editors have brought together not only OT papers on
phonology, but also papers on syntax and acquisition is all the more
remarkable given that they started putting the volume together when OT
was pretty new. One indication of this is that there are various
references to Barbosa et al. (1998) or papers therein as "to
appear". The drawback is that the volume thus has a dated feel,
despite its 2000 copyright date (in fact the book was released on
12/28/2000). Most readers with an interest in OT will already be
familiar with many of the papers (from the Rutgers Optimality Archive,
or from circulating drafts). As for readers wishing to learn about OT,
an edited volume of this sort is the wrong place to start (although
the introduction is worth a look). A better place would be the
excellent Kager (1999) textbook, or the CD-ROM put together by John
McCarthy, available from the University of Massachusetts Graduate
Linguistics Student Association,
http://server102.hypermart.net/glsa/index.htm.

Let me now turn to three interesting theoretical issues that I believe
this book raises.


I. COMPETITION AND EVALUATION
Three foundational questions for Optimality Theory are:

1. What is the nature of inputs?

2. What is the nature of candidates?

3. How can we tell whether a constraint applies to a candidate, and
   whether the candidate violates it?

The first two questions have received a fair amount of attention in
the OT literature and are discussed explicitly in various papers here,
particularly in the syntax section (part three). According to what in
the introduction is called the "Semantic Identity Approach", the input
to syntax is a bag of lexemes (cf. the related notion of Numeration in
the Minimalist Program; Chomsky, 1995), and the candidates must be
"truth-functional equivalents" (Broekhuis and Dekkers, p. 409). This
approach is pursued in the papers by Ackema and Neeleman, Broekhuis
and Dekkers, and possibly implicitly by Anderson (his paper leaves
most of the details of his formal analysis unspecified).

On the other approach, the "Structured Inputs Approach", which is
exemplified by Bresnan's paper and is implicit in Legendre's, the
input is a predicate-argument structure (be it an underspecified
Logical Form (LF) or an LFG functional-structure) and the candidates
are some more articulated version of the input (more fully-specified
LFs or LFG constituent-structure/functional-structure pairs).

However, I think that it is the third question above that really needs
to be answered, as evidenced by various conceptual confusions that
arise throughout this book. So far, there has been very little work on
providing a logic and semantics for OT constraints, and most of the
work postdates this volume. A recent paper by Hammond (2000) addresses
the logic of the OT architecture, but stops shy of providing a logic
for the constraints and their evaluation. Ellison (1994), Eisner
(1997a,b) and Kuhn (2001a,b) have done some work on this front in a
more computational setting, but their results need to find their way
into the theoretical literature.

But, why does it matter? Consider the constraint LE(CP) from Pesetsky
(1998), which is used by Broekhuis and Dekkers:

(1) Left Edge CP: CP starts with a lexicalized head from the extended
    projection of the verb.

After much argumentation, based on assumptions of the Minimalist
Program, B&D conclude that the structure of English subject relatives
is:

(2) the man [IP who saw Bill]

It was surprising to find that this structure violates LE(CP), even
though it does not even contain a CP! Of course, this was after
reading the constraint as universally quantifying over CPs ("Every CP
starts with . . . "). I still think this is the preferred reading for
the statement above. But, it is possible to read (1) as existentially
quantifying over CPs ("There is a CP such that it starts with
. . . "). The point is that if the constraints were made explicit in
some logic, as discussed in some of the works cited above, it would be
absolutely clear whether the quantification is universal or
existential.
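
To illustrate, the two readings can be stated in first-order terms;
the predicate LexHeadAtLeftEdge is my own shorthand, chosen only to
make the quantification explicit:

    Universal:    \forall x [CP(x) \rightarrow LexHeadAtLeftEdge(x)]
    Existential:  \exists x [CP(x) \wedge LexHeadAtLeftEdge(x)]

On the universal reading, a structure like (2), which contains no CP,
satisfies LE(CP) vacuously; on the existential reading, the same
structure violates it. The evaluation of (2) thus hinges entirely on
which formalization is intended.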

Well, perhaps I'm demanding too much of a paradigm at such an early
stage of development. OT is a relatively new framework (and it was
really quite new when the papers in this book were written), and
shouldn't we judge linguistic theories based on the predictions they
make, their empirical consequences, rather than fussy details of
formulation? I submit that this would be a grave error: it is
impossible to judge the consequences of a theory if its content is not
established first. More importantly, a formalism should make
theoretical claims explicit and verifiable; it should throw light on
the theory by making generalizations concisely and clearly. If the
formalism creates more questions than it helps to answer, why have it?
Why not just state the generalizations in clear, ordinary language?

This is not any kind of damning criticism of OT. The kind of logic
required seems to be only a first-order predicate calculus, nothing
complicated, and as I noted, this work has been initiated. However,
the OT constraints in this book (and in general) are in need of an
explicit formalization of the kind provided for other constraint-based
theories, such as Head-driven Phrase Structure Grammar (King, 1989,
1994; Richter, 2000) and LFG (Johnson, 1995; Kaplan and Bresnan,
1982).

Similarly, the authors should have made explicit the formatives
(i.e. linguistic primitives) of their versions of OT and the extent of
interaction between different grammatical subsystems (phonology,
morphology, syntax, etc.). Legendre's paper is a step in the right
direction. She proposes that syntactic constraints outrank prosodic
constraints, which outrank morpho-prosodic constraints (prosodic
alignment constraints for morphological features), which in turn
outrank morphological constraints. She proposes the "Constraint Intermixing
Ban", which states that "Constraints belonging to different modules of
the grammar may not intermix" (p. 458). Of course, this is more an
instruction to the OT grammarian than a theoretical construct, and we
would like it to somehow be derived from the nature of the constraints
or the OT architecture, but it is a start.

However, many of the papers are really quite unclear about the status
of the formatives they assume and about the interaction of grammatical
systems. For example, Anderson proposes a constraint
EDGEMOST(clitic,left,S), which requires a clitic to be at the left edge
of a clause (S). But, it is uncertain that the term "clitic" even has
any theoretical content (Sadock, 1995; Zwicky, 1994). If it does not,
how can it be a formative in a formal theory?

Another example comes from Kager's paper. He notes that although the
stress system of Dutch motivates a constraint LEFTMOST ("Align(PrWd,
L, peak, L)"; i.e. the stress peak is on the leftmost syllable of the
prosodic word), certain adjectives have their stress peak on the
rightmost stem, motivating a constraint ADJ-PK ("Align(Adjective, R,
peak, R)"; i.e. the stress peak is on the rightmost syllable of an
adjective). But, note that this means that the inputs and candidates
to a morphophonological process would have to contain syntactic
category information. This has serious consequences for the theory of
grammatical architecture, but its consequences are masked by the
informal nature of OT constraints.

By no means do I mean to single out just the papers I've mentioned
here. Many of the papers in the phonology and syntax sections suffer
from either the problem of uncertain constraint evaluation or
potentially problematic claims about the formatives (the structure of
inputs and candidates). Again, these are not necessarily problems for
OT analyses of linguistic phenomena, but they are definitely areas
that should receive much more serious attention than they do in
this volume.


II. INPUTS ARE NOT OUTPUTS
The second issue has to do with interpretive parsing, as introduced
in Smolensky (1996) (this point owes a lot to discussions with Ida
Toivonen, although any conceptual or factual errors are solely my
own.) This is adopted in the Jacobs and Gussenhoven paper and
exploited in Tesar's Interpretive Parsing Algorithm. Consider an OT
grammar on this view. In generation, or the production direction, we
have the following (x_prod should be read as "x of production";
similarly for x_comp ("x of comprehension")):

(3) Production:
    Input_prod -> GEN -> {x|x is a candidate_prod} -> EVAL -> Output

Now, consider comprehension as envisioned by interpretive
parsing. Tesar (p.601) writes, "The proposal is that language
comprehension, like production, is an optimization process. The hearer
is presented with an overt form, and selects the description of that
overt form that is optimal with respect to his or her current
constraint ranking. The difference is that here the candidate
structural descriptions competing for optimality are candidates whose
overt portions match the observed overt form. The interpretation
assigned to an observed overt form is that structural description
which, out of all descriptions whose overt portion matches the
observed form, best satisfies the ranked universal constraints." In
other words, the proposal is:

(4) Comprehension (interpretive parsing):

Input_prod -> GEN -> {x|x is a candidate_prod such that its overt
    portion matches the observed overt form} -> EVAL -> Output

Thus, according to interpretive parsing, although comprehension is
also an optimization process, it is a different kind of optimization
process, because there is something held constant about the
candidates.

However, we might otherwise expect the comprehension direction to look
like this (Anttila and Fong, 2000; Asudeh, 2001):

(5) Comprehension (no interpretive parsing):

Observed overt form -> GEN -> {x|x is a candidate_comp} -> EVAL -> Output

That is, we might expect comprehension to simply be the reverse of
production. One of the benefits of constraint-based theories, of which
OT is purported to be one, is that grammars are reversible (Strzalkowski,
1993; Copestake et al., 1995, 1999): the same grammar can be used for
comprehension and production. There may have to be production- or
comprehension-specific processes that interface with the grammar, but
the core grammatical system could be the same. This has the benefit
that only one grammar needs to be learned, represented, and used in
computation. Reversibility is an important research area in the
implementation of Lexical-Functional Grammars and Head-driven Phrase
Structure Grammars, and other constraint-based architectures.

By contrast, in OT with interpretive parsing, the grammar is not
reversible: production and comprehension are handled differently. More
importantly, it should be noted that the alternative without
interpretive parsing (in (5)) is also not straightforwardly
reversible. In particular, the formatives for the inputs of
production/comprehension will not be the same as the formatives of the
outputs. Thus the elements that constraints target in the output of
production would be in the input to comprehension, rather than being
in the output, where the constraints need to evaluate
them. Faithfulness constraints, with their inherent asymmetry of
evaluating outputs against inputs, would find the inputs and outputs
reversed. Similarly, the formatives that Markedness constraints target
may be in one set of outputs, but not the other.

Thus, OT-grammars seem to be non-trivially non-reversible, whether
they use interpretive parsing or not. This is in conflict with the
reversibility prized by other constraint-based architectures and is
tantamount to having separate grammars for production and
comprehension, which is conceptually undesirable.
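
The asymmetry can be made concrete by sketching (3)-(5) as functions;
everything here is schematic, and gen, gen_comp, overt_of, and the
violation profiles are placeholders rather than anyone's actual
proposal:

    def eval_optimal(cands, hierarchy, viols):
        # EVAL: lexicographically least violation profile wins under
        # strict domination
        return min(cands, key=lambda c: [viols[c][k] for k in hierarchy])

    def produce(input_prod, gen, hierarchy, viols):
        # (3): production optimizes over the full candidate set
        return eval_optimal(gen(input_prod), hierarchy, viols)

    def interpret(input_prod, overt, gen, hierarchy, viols, overt_of):
        # (4): interpretive parsing holds the overt form constant,
        # optimizing only over candidates whose overt portion matches it
        matching = [c for c in gen(input_prod) if overt_of(c) == overt]
        return eval_optimal(matching, hierarchy, viols)

    def comprehend(overt, gen_comp, hierarchy, viols):
        # (5): comprehension as the reverse of production, running a
        # comprehension-direction GEN on the observed overt form itself
        return eval_optimal(gen_comp(overt), hierarchy, viols)

Note that produce and comprehend require different candidate-generating
functions (gen vs. gen_comp) over different formatives, which is
exactly the non-reversibility at issue.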


III. FORMALISM, FUNCTIONALISM, AND NATIVISM
Ellison's paper considers the question of the universality of
constraints in OT phonology. He considers six arguments (structured
arguments, with premises and conclusions that follow from the
premises) for the view common in the OT literature, which he calls
UNIV-FACT (p. 526):

(6) UNIV-FACT: There is (at least) one hierarchy of constraints
    objectively present in the mind of each
    language-user. Furthermore, the same constraint set is used in
    each hierarchy of each and every user.

Ellison rejects each of the six arguments for UNIV-FACT that he
considers, because in each argument at least one premise does not
hold.

Instead, he argues that the universality of the constraint set should
be treated as a convention (much like the International Phonetic
Alphabet is a convention for phonetic transcription):

(7) Languages should be analyzed (as much as possible) using a
    constraint set common to the community of phonologists.

Ellison's argumentation is quite convincing; his paper certainly
merits a reply if UNIV-FACT is still to be assumed.

In fact UNIV-FACT is not the only universalist assumption in OT. It is
also commonly assumed that the set of linguistic inputs is universal,
due to richness of the base (Prince and Smolensky, 1993). One need
look no further than this volume: "According to the principle of
'richness of the base' (Prince and Smolensky, 1993), the set of
linguistic inputs is universal" (Tesar, p. 616). Let us call this the
Strong Richness of the Base Hypothesis. A Weak Richness of the Base
Hypothesis would merely hold that there can be no constraints on
inputs in any given grammar, but not all inputs need be present in all
languages. However, it is certainly the strong version that is
prevalent in the literature. Lastly, the function that maps inputs to
candidate sets, GEN, and the evaluation function, EVAL, are considered
universal.

Thus, OT holds that inputs are universal, the mapping from inputs to
candidate sets is universal, and the constraint set is universal. A
standard linguistic hypothesis is that universals are innately
specified. This is in effect an inductive inference: if x is universal,
x is innate. That is, innateness, or nativism, is considered the best
explanation of universality (and of course there is a long-standing
tradition in the field of language acquisition seeking to demonstrate
linguistic nativism empirically).

So, given that Ellison's paper makes us reconsider the universality of
the constraint set, it also makes us reconsider whether it is
innate. Boersma's paper takes up this question. He presents a detailed
model of the acquisition of the constraints of segmental phonology
from overt data, without presupposing innateness of the
constraints. That is, the constraints themselves are learned, not just
their ranking. He shows how his Maximized Gradual Learning Algorithm
works in general, and in the particular case of learning Wolof tongue
root harmony.

Boersma's model, which is set forth in much greater detail in his book
(Boersma, 1998), is functional: its constraints are grounded in
perception and articulation. Thus, Boersma's theory is not only
non-innatist, but also functional.

As mentioned above, most OT models are innatist. Are they functional?
In practice, they often are, as argued by Newmeyer (to appear). But it
is only a contingent fact about the OT literature that
constraints are often functionally motivated. For most OT analyses,
one could just as well strip away the functionalist rhetoric and take
all constraints to be purely formal. If the constraints "look"
functional, one could claim that it is just a coincidence. The
majority of OT analyses are innatist and can be construed as being
formal or functional in nature (in reality most OT analyses contain a
mix of purely formal and functionally-motivated constraints).

This allows us to build the following table:

(8)               INNATIST          NON-INNATIST
    FUNCTIONAL    Most OT models    Boersma's OT model
    FORMAL        Most OT models    ???

What about the cell marked with question marks? Can there be an OT
model that is formal and non-innatist?

I believe that there can be, and I will give a tentative sketch of
such a model here. First, though, a few words about innateness. In
their thought-provoking book, 'Rethinking Innateness', Elman et
al. (1996) distinguish between representational nativism and
architectural nativism. The former view postulates innate
knowledge/content: dedicated, cortical microcircuitry whose
developmental schedule and organization is specified in the
genome. The latter view postulates innate systems or capabilities:
specific knowledge or content is not innate, but the kind of
information that a cortical system can handle is constrained, which
indirectly constrains its representational capacity. Architectural
nativism is the weaker stance. With respect to language, it means that
the capacity for language is considered innate, and the overall
language architecture is constrained, but the representational content
of the language faculty must be learned. On this view there would be
no literally innate principles or parameters, although the mature
language faculty could be characterized as such descriptively.
Boersma's model is non-innatist only in this representational sense: it
retains architectural nativism, since his language acquisition model
does not postulate that the entire OT architecture is learned; the
functions GEN and EVAL are still assumed to be given, or innate.

Now suppose that we are trying to give a formal, non-innatist OT
model. Let us help ourselves to the existence of GEN and EVAL. But,
let us not postulate any innate linguistic primitives (phonological
features, syntactic features, etc.) or specific constraints. Let us
suppose all constraints can be classified as FAITHFULNESS or
MARKEDNESS constraints (Prince and Smolensky, 1993). This is
architectural nativism: we are assuming that the language faculty has
the structure of an OT grammar, such that a grammar is a triple of
<GEN, EVAL, CON>, where the set CON can be partitioned into the two
kinds of constraint. Let us further assume a general-purpose learning
mechanism, such as a neural network, or a Bayesian learner, that
learns the formatives of a language from the ambient linguistic
data. Each time a new formative is identified, a FAITH and a MARK
constraint are added to CON. For example, if the learner identifies a
formative which we might call [±voiced], which distinguishes
certain sounds, a constraint FAITH(voiced) and a constraint MARK(X
voiced) will be added to CON (which value of voiced X is set to
depends on how markedness is measured). Constraints are thus
automatically generated as new formatives are
distinguished. Interleaved with the learning of constraints is ranking
of constraints using some OT learning algorithm (any of the three
discussed in the book would do). This would yield a purely formal
grammar: none of the constraints are functionally grounded; they are
generated automatically as new formatives are identified. However, the
grammar is based on the assumption of architectural nativism, and
therefore could count as non-innatist.
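
A schematic rendering of this proposal, with the feature detector and
the corpus as stand-ins for the general-purpose learner and the ambient
data, and with the ranking step deferring to any OT learning algorithm,
might look like this:

    import random

    def detect_formatives(datum, known):
        """Placeholder for a general-purpose learner (neural network,
        Bayesian learner) that posits new distinctive formatives; here
        a datum simply lists the formatives it exhibits."""
        return set(datum) - known

    def rank(con):
        """Placeholder for an OT learning algorithm (e.g. constraint
        demotion or the Gradual Learning Algorithm); a random ranking
        stands in for it here."""
        return random.sample(sorted(con), len(con))

    def learn(corpus):
        formatives, con, ranking = set(), set(), []
        for datum in corpus:
            for f in detect_formatives(datum, formatives):
                formatives.add(f)
                # each newly identified formative automatically spawns
                # one faithfulness and one markedness constraint
                con.add("FAITH(%s)" % f)
                con.add("MARK(%s)" % f)
            ranking = rank(con)  # interleaved ranking step
        return formatives, con, ranking

    print(learn([{"voiced"}, {"voiced", "nasal"}]))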

I want to state explicitly that I am not necessarily endorsing the
view I have sketched. First of all, there isn't much to endorse, as
the proposal is vague and has many hidden assumptions. Secondly, there
are many poverty of the stimulus arguments in the literature for the
stronger, representational nativist stance, although Elman et
al. (1996), Ellison (this volume), Pullum and Scholz (to appear), and
many others have challenged these arguments.

But I think it is to the credit of OT, and in the context of this book
to the credit of Ellison and Boersma, that it is possible to think
about language as a formal system, without simultaneously taking a
strong innatist stance.


FINAL REMARKS
The first sentence of this book reads, "The introduction of Optimality
Theory by Prince and Smolensky (1993) can be considered the single
most important development in generative grammar in the 1990s."
Although this particular book may seem slightly dated, and much of its
content highlights the need for serious foundational thinking about
OT, the claim just quoted is not easily refuted. Optimality Theory has
in a short time swept the field of phonology and made inroads into
other subfields of linguistics.

OT raises many interesting issues and promises to be a fruitful
research program for some time. The papers in this book represent
early attempts (especially in the syntax and acquisition domains) to
apply the theory to various phenomena and to address certain
fundamental issues in OT, many of which have been addressed in more
current research. Nevertheless, this book constitutes an important
historical milestone in the development of Optimality Theory.


ACKNOWLEDGMENTS
I am grateful to Farrell Ackerman, Andrew Koontz-Garboden,
Charles Reiss, Peter Sells, and especially Ida Toivonen
for their comments. All remaining errors are my own.


REFERENCES
Anttila, Arto, and Vivienne Fong (2000). The partitive constraint in
Optimality Theory. Journal of Semantics, 17, 281-314.

Asudeh, Ash (2001). Linking, optionality, and ambiguity in Marathi. In
Sells (2001), (pp. 257-312).

Barbosa, Pilar, Danny Fox, Paul Hagstrom, Martha McGinnis, and David
Pesetsky (eds.) (1998). Is the best good enough? Cambridge, MA: MIT
Press.

Beckman, Jill, Laura Walsh Dickey, and Suzanne Urbanczyk (eds.)
(1995). Papers in Optimality Theory, vol. 18 of University of
Massachusetts Occasional Papers in Linguistics. Amherst, MA: Graduate
Linguistic Student Association.

Benua, Laura (1995). Identity effects in morphological truncation. In
Beckman et al. (1995), (pp. 77-136).

Berwick, Robert, and Partha Niyogi (1996). Learning from
triggers. Linguistic Inquiry, 27, 605-622.

Boersma, Paul (1998). Functional Phonology: Formalizing the
interactions between articulatory and perceptual drives. The Hague:
Holland Academic Graphics.

Boersma, Paul, and Bruce Hayes (2001). Empirical tests of the Gradual
Learning Algorithm. Linguistic Inquiry, 32, 45-86.

Bresnan, Joan (ed.) (1982). The mental representation of grammatical
relations. Cambridge, MA: MIT Press.

Bresnan, Joan (2001). Lexical-Functional Syntax. Oxford:
Blackwell.

Burzio, Luigi (1994). Principles of English stress. Cambridge:
Cambridge University Press.

Chomsky, Noam (1995). The minimalist program. Cambridge, MA: MIT Press.

Copestake, Ann, Dan Flickinger, Robert Malouf, Susanne Riehemann, and
Ivan A. Sag (1995). Translation using Minimal Recursion Semantics. In
Proceedings of the Sixth International Conference on Theoretical and
Methodological Issues in Machine Translation (TMI-95), Leuven,
Belgium.

Copestake, Ann, Dan Flickinger, Ivan A. Sag, and Carl Pollard
(1999). Minimal recursion semantics: An introduction. Ms., Stanford
University and Ohio State University.

Dalrymple, Mary (2001). Lexical Functional Grammar. Academic
Press.

Eisner, Jason (1997a). Efficient generation in primitive Optimality
Theory. In Proceedings of the 35th annual ACL and 8th EACL,
(pp. 313-320).

Eisner, Jason (1997b). What constraints should OT allow? Talk handout,
Linguistic Society of America, Chicago.

Ellison, T. Mark (1994). Phonological derivation in Optimality
Theory. In Proceedings of COLING, (pp. 1007-1013).

Elman, Jeffrey, Elizabeth Bates, Mark Johnson, Annette
Karmiloff-Smith, Domenico Parisi, and Kim Plunkett (eds.)
(1996). Rethinking innateness: A connectionist perspective on
development. Cambridge, MA: MIT Press.

Gibson, Ted, and Kenneth Wexler (1994). Triggers. Linguistic Inquiry,
25, 407-454.

Grimshaw, Jane (1997). Projection, heads, and optimality. Linguistic
Inquiry, 28, 373-422.

Hammond, Michael (2000). The logic of Optimality Theory. ROA 390-0400.

Holland, John (1973). Genetic algorithms and the optimal allocation of
trials. SIAM Journal on Computing, 2, 88-105.

Johnson, Mark (1995). Logic and feature structures. In Mary Dalrymple,
Ronald M. Kaplan, John T. Maxwell, and Annie Zaenen (eds.), Formal
issues in Lexical-Functional Grammar, (pp. 369-380). Stanford, CA:
CSLI Publications.

Kager, René (1999). Optimality Theory. Cambridge: Cambridge
University Press.

Kaplan, Ronald M., and Joan Bresnan (1982). Lexical-Functional
Grammar: A formal system for grammatical representation. In Bresnan
(1982), (pp. 173-281).

Keller, Frank (2000). Gradience in grammar: Experimental and
computational aspects of degrees of grammaticality. Ph.D. thesis,
University of Edinburgh.

King, Paul (1989). A logical formalism for Head-Driven Phrase
Structure Grammar. Ph.D. thesis, University of Manchester.

King, Paul (1994). An expanded logical formalism for Head-Driven
Phrase Structure Grammar. Arbeitspapiere des SFB 340, University of
Tübingen.

Koza, John R. (1992). Genetic programming: On the programming of
computers by means of natural selection. Cambridge, MA: MIT Press.

Kuhn, Jonas (2001a). Formal and computational aspects of
optimality-theoretic syntax. Ph.D. thesis, Universität Stuttgart.

Kuhn, Jonas (2001b). Generation and parsing in Optimality Theoretic
syntax: Issues in the formalization of OT-LFG. In Sells (2001),
(pp. 313-366).

McCarthy, John, and Alan Prince (1990). Foot and word in prosodic
morphology: The Arabic broken plurals. Natural Language and Linguistic
Theory, 8, 209-282.

McCarthy, John, and Alan Prince (1995). Faithfulness and reduplicative
identity. In Beckman et al. (1995), (pp. 249-384).

McCarthy, John, and Alan Prince (1999). Faithfulness and identity in
prosodic morphology. In René Kager, Harry van der Hulst, and Wim
Zonneveld (eds.), The prosody-morphology interface,
(pp. 218-309). Cambridge: Cambridge University Press.

Newmeyer, Frederick J. (to appear). Optimality and functionality: A
critique of functionally-based optimality-theoretic syntax. Natural
Language and Linguistic Theory.

Pesetsky, David (1998). Some optimality principles of sentence
pronunciation. In Barbosa et al. (1998), (pp. 337-383).

Prince, Alan, and Paul Smolensky (1993). Optimality Theory: Constraint
interaction in generative grammar. Technical Report 2, Rutgers
University Center for Cognitive Science (RuCCS), New Brunswick, NJ.

Pulleyblank, Douglas, and William Turkel (1996). Optimality Theory and
learning algorithms: The representation of recurrent featural
asymmetries. In J. Durand and B. Laks (eds.), Current trends in
phonology: Models and methods. Salford, UK: University of Salford
Press.

Pullum, Geoffrey K., and Barbara C. Scholz (2001). On the distinction
between model-theoretic and generative-enumerative syntactic
frameworks. In Philippe de Groote, Glyn Morrill, and Christian Retoré
(eds.), Logical aspects of computational linguistics (lecture notes in
artificial intelligence, 2099), (pp. 17-43). Berlin: Springer Verlag.

Pullum, Geoffrey K., and Barbara C. Scholz (to appear). Empirical
assessment of stimulus poverty arguments. Linguistic Review.

Richter, Frank (2000). A mathematical formalism for linguistic
theories with an application in Head-Driven Phrase Structure
Grammar. Ph.D. thesis, Eberhard-Karls-Universität Tübingen.

Sadock, Jerrold (1995). Multi-hierarchy view of clitics. In Papers
from the 31st Regional Meeting of the Chicago Linguistic Society, Part
2: Parasession on Clitics. Chicago, IL: CLS.

Schütze, Carson T. (1996). The empirical base of linguistics:
Grammaticality judgments and linguistic methodology. Chicago:
University of Chicago Press.

Sells, Peter (ed.) (2001). Formal and empirical issues in
optimality-theoretic syntax. Stanford, CA: CSLI Publications.

Silverman, Daniel (1992). Multiple scansions in loanword phonology:
Evidence from Cantonese. Phonology, 9, 289-328.

Smolensky, Paul (1996). On the comprehension/production dilemma in
child language. Linguistic Inquiry, 27, 720-731.

Strzalkowski, Tomek (ed.) (1993). Reversible grammar in natural
language processing. Boston: Kluwer.

Tesar, Bruce, and Paul Smolensky (1998). Learnability in Optimality
Theory. Linguistic Inquiry, 29, 229-268.

Tesar, Bruce, and Paul Smolensky (2000). Learnability in Optimality
Theory. Cambridge, MA: MIT Press.

Yip, Moira (1993). Cantonese loanword phonology and Optimality
Theory. Journal of East Asian Linguistics, 2, 261-291.

Zwicky, Arnold (1994). What is a clitic? In Joel Nevis, Brian
D. Joseph, Dieter Wanner, and Arnold Zwicky (eds.), Clitics: A
comprehensive bibliography, (pp. xii-xx). Amsterdam: John Benjamins.


BIOGRAPHICAL SKETCH
I am in the fourth year of my Ph.D. studies at Stanford. I received a
Master of Philosophy from the Centre for Cognitive Science, University
of Edinburgh. My research interests are the syntax-semantics interface,
grammatical theory, and psycholinguistics.


---------------------------------------------------------------------------

If you buy this book please tell the publisher or author
that you saw it reviewed on the LINGUIST list.

---------------------------------------------------------------------------
LINGUIST List: Vol-12-2550


