16.1542, Review: Syntax: Hawkins (2004)
LINGUIST List
linguist at linguistlist.org
Sun May 15 02:25:43 UTC 2005
LINGUIST List: Vol-16-1542. Sat May 14 2005. ISSN: 1068 - 4875.
Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews (reviews at linguistlist.org)
Sheila Dooley, U of Arizona
Terry Langendoen, U of Arizona
Homepage: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Naomi Ogasawara <naomi at linguistlist.org>
================================================================
What follows is a review or discussion note contributed to our
Book Discussion Forum. We expect discussions to be informal and
interactive; and the author of the book discussed is cordially
invited to join in. If you are interested in leading a book
discussion, look for books announced on LINGUIST as "available
for review." Then contact Sheila Dooley at collberg at linguistlist.org.
===========================Directory==============================
1)
Date: 14-May-2005
From: Luis Vicente < l.vicente at let.leidenuniv.nl >
Subject: Efficiency and Complexity in Grammars
-------------------------Message 1 ----------------------------------
Date: Sat, 14 May 2005 22:22:48
AUTHOR: John A. Hawkins
TITLE: Efficiency and Complexity in Grammars
PUBLISHER: Oxford University Press
YEAR: 2004
Announced at http://linguistlist.org/issues/16/16-341.html
Luis Vicente, ULCL, Leiden University
OVERVIEW
"Efficiency and Complexity in Grammars" (ECG henceforth) represents one
further step in Hawkins' attempt to incorporate performance factors into
the theory of grammar. The opening sentence of the book represents his
research programme better than anything else: "Has performance had any
significant impact on the basic design features of grammars?" He answers
that this is indeed the case. In his view, most of the properties of
natural languages are a result of the pressure to optimise parsing and
production. ECG is an extended elaboration of this very brief statement.
The elaboration is actually so extended that, to keep the review from
being nearly as long as the book, I will concentrate on some major points
of the argumentation throughout, and simply sketch others. I believe the
theory developed in ECG is best conveyed by focusing on how particular analyses
work, rather than by giving a bird's eye overview of the entire work.
The book can be divided roughly into two parts. In the first (chapters
1-4), Hawkins gives his "big picture" idea of how performance factors
interact with grammar. These chapters are heavy on theory, and
consequently demand one's full and undivided attention to follow the
reasoning, which is otherwise rather neat, unless one makes the mistake
of taking ECG for what it is not. This is a book about how performance
factors can influence the shape of languages. It deals with the issue of
how typological tendencies reflect parsing preferences, not with what the
formal internal underpinnings of the grammar are. If one reads these
chapters bearing in mind which questions Hawkins is interested in
answering, one will find enough exciting ideas to make the heavy reading
more than worthwhile.
SYNOPSIS
In the first two chapters ("Introduction" and "Linguistic forms,
properties, and efficient signaling"), Hawkins explains why performance
factors must be considered an integral part of the theory of grammar. The
main thesis of the book is summarised in the Performance-Grammar
Correspondence Hypothesis (p. 3), which runs as follows:
1) Performance-Grammar Correspondence Hypothesis (PGCH): Grammars have
conventionalised syntactic structures in proportion to the degree of
preference in performance.
Hawkins implements this idea through a number of principles, the most
important being Minimise Forms, Minimise Domains, and Maximise Online
Processing (MiF, MiD, and MaOP respectively). The definitions are the
following:
2) Minimise Forms: The human processor prefers to minimise the formal
complexity of each linguistic form F, and the number of forms with unique
conventionalised property assignments, thereby assigning more properties
to fewer forms. These minimisations apply in proportion to the ease with
which a given property P can be assigned in processing to a given form F.
3) Minimise Domains: The human processor prefers to minimise the
connected sequences of linguistic forms and their conventionally
associated syntactic and semantic properties in which relations of
combination and/or dependency are processed.
4) Maximise Online Processing: The human processor prefers to maximise
the set of properties that are assignable to an item X as X is processed.
Although the wording might be somewhat cumbersome, the leading idea behind
those principles is clear: the forms and patterns preferred by grammar are
those that place the least burden on parsing and production. This can be
achieved in several ways: by using morphologically simple forms for
frequently used elements, by favouring word order patterns that don't
overload the parser, and so on.
Chapters 3 and 4 ("Defining the efficiency principles and their
predictions" and "More on form minimisation") consist of a more detailed
exploration of how the performance principles introduced earlier on can
influence the shape of languages, thus setting the stage for later
chapters. Some of the topics he deals with here include grammaticalisation
(e.g., verbs of wish and desire turning into modals or tense markers, or
the stages of the evolution of definiteness marking) and markedness
hierarchies (e.g., nominative being less marked than accusative, which is
in turn less marked than dative, and so on). Hawkins argues that these
phenomena and more can be easily accommodated under his framework. For
instance, he argues that, in a number of hierarchies, the amount of
phonological and morphological complexity will be equal or greater at
each position further down the hierarchy. He works his way through number
marking, case marking, and degree forms of adjectives, showing that the
more marked a certain case or number specification is, the more
morphophonologically complex it usually is. The obvious question is
whether this is not simply circular reasoning: something is more complex
because it is more marked, and we know it is more marked because of the
extra morphophonological complexity. Although Hawkins doesn't deal with this
question directly, he hints at a way in which markedness hierarchies can
be derived. Introducing ideas that will be further developed in chapter 8,
he claims that some positions in the hierarchy are dependent on others.
For instance, the comparative and superlative forms of adjectives
("bigger", "biggest") are taken to be dependent on the base ("big"). The
base form simply requires the adjective to modify an entity. Comparative
and superlative forms require this, but also the existence of (at least) a
second entity to establish a comparison with. Similarly, nominative case
can be assigned to the sole argument of a clause. However, accusative
usually requires the presence of an argument that has already been marked
as nominative, and datives usually require the presence of nominative and
accusative arguments (ignoring, of course, quirky case marking, on which
I'm willing to give Hawkins the benefit of the
doubt). If these dependency relations can be established in the way argued
in this chapter, then it would be possible to avoid circularity. This is,
however, a notion that I would be happy to see formalised in a more
explicit way.
In the second part (chapters 5-8), Hawkins turns to a more detailed
examination of certain phenomena. I found here considerably more food for
thought, and I can imagine that these chapters will be more interesting
than the first four to anyone who wants to focus on fine language-specific
data, rather than broad typological tendencies. This said, though, I found
chapter 5 somewhat disappointing despite its title ("Adjacency effects
within phrases"). I was anticipating a discussion of phenomena such as the
obligatory verb-object adjacency in English, or the requirement in several
languages that wh-words and foci immediately precede the verb. Instead,
Hawkins discusses the conditions under which separation of associated
phrases causes parsing degradation (or not). In a nutshell, his idea is
that the more dependent on each other two phrases are, the more difficult
it will be to separate them. One pair of English examples he discusses
(amongst many others) is "take X to the library" vs. "take X into
account". His claim is that displacement of the PP is less likely to occur
in the latter than in the former, because "take X into account" is what he
dubs an "opaque collocation". It has a somewhat idiomatic meaning, so the
larger the separation between its parts, the harder it will be to assign a
meaning to it. He discusses several similar cases, though one problem that
I find is that, while phrases dependent on each other tend to be close
together, they cannot appear closer than the rules of grammar allow.
That is, there are several idioms that consistently forbid word orders
that would certainly result in a shorter MaOP domain. For instance:
5) give Mary the sack vs. * give the sack Mary
6) throw John to the lions vs. * throw (to) the lions John
7) drive Peter bananas vs. * drive bananas Peter
See Harley (2002) for discussion of similar examples. As I understand
Hawkins' analysis, the sentences on the right hand side should be
grammatical. They certainly bring the idiomatic parts of the VP together,
thus resulting in the shortest possible MaOP domain. Yet, despite the gain
in processing efficiency, the sentences are totally out. Why should this
be so? It seems that the grammar can yield a number of outputs, which are
more or less difficult to parse. This is where Hawkins' theory finds its
place. The reverse, nonetheless, doesn't seem to be true: parsing
considerations cannot force a structure that is not independently
generable by the grammar. MaOP and other performance constraints can select
the most parser-friendly structure amongst a number of alternatives, but
they cannot generate the best possible structure independently of syntax.
The discussion of the correlation between dependency and proximity
continues in Chapter 6 ("Minimal forms in complements/adjuncts and
proximity"), where most of the space is devoted to a discussion of the
parsing preferences among "which", "that", and zero relative clauses,
and how they can be accommodated under the theory developed so far. The
general idea is that there is a tension in relative clause marking: while
an overt complementiser/relative pronoun unambiguously identifies the
clause as a relative, it also increases the parsing domain. Thus, the
conflict between MaOP and MiF gives rise to various patterns of preference
that Hawkins explores in detail.
The much longer chapter 7 ("Relative clause and wh- movement universals")
is, I think, one of the strongest parts of the book. Of interest in this
chapter is the correlation he tries to establish between wh- movement and
VO/OV order. It has long been noted that VO languages tend to have overt
wh- fronting, whereas OV languages tend to have wh- in situ constructions.
Within OV, those languages that have wh- fronting tend to be the ones that
are "partial OV" (e.g., West Germanic languages, where V2 coexists
with V-final). Hawkins' claim is that these correlations follow
from his general theory: since the wh- word is subcategorised for by the
verb, preference is given to word orders where both of them are close
together, so that the link can be established without overloading the
parser too much. Thus, in VO languages, where the verb tends to stay
closer to the left periphery of the clause, wh- fronting can minimise the
wh- parsing domain. In OV languages, however, wh- fronting will increase
it... except in partial OV languages, where V2 applies in the case of wh-
movement. He excludes rightward wh- movement on the general assumption
that wh- words must precede their gaps in order to allow for efficient
parsing. A [gap > wh-] order is deemed inefficient enough in parsing to
exclude rightward wh- movement nearly across the board. With regard to
this last point, one may wonder how prenominal relatives, where the head
noun follows the gap, can be accommodated. Hawkins' answer is that
in these languages, NPs in general tend to be N-final. Thus, the
inefficiency of [gap > head noun] orders can be compensated by the
creation of an N-final noun phrase, in accordance with the general pattern
of the language.
Another point of interest in this chapter is its approach to certain wh-
movement restrictions. He claims, for instance, that some phenomena like
island violations and *that-trace effects are not the result of violating
any grammatical principles, but of processing factors and complexity
hierarchies. He assumes the following hierarchy for types of embedding:
8) non-finite clause << finite clause << complex NP
where complex NPs represent the type of embedding with the greatest
processing difficulty (possibly due to the extra structure). Making a
parallel with Keenan & Comrie's (1977) Accessibility Hierarchy, Hawkins
claims that individual languages can select a "cut-off" point in this
hierarchy. Extraction is possible out of types of embedded phrases below
this point, but not above it. Thus, there are languages that allow for
extraction only out of non-finite clauses; a subset of these can add
finite clauses on top of it; from these, a subset allows for extraction
out of complex NPs under some circumstances (e.g., Mainland Scandinavian
languages). What is not attested, according to Hawkins, is a language that
allows extraction from a certain type of embedded phrase but not from a
lower one (i.e., a language that allows extraction out of finite clauses,
but bans it out of non-finite clauses).
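To make the implicational prediction explicit, here is a minimal Python
sketch. It is an illustration only and does not appear in ECG: the names
HIERARCHY and is_consistent are invented for the purpose, and the sets
passed to the function are schematic rather than descriptions of real
languages. A set of extraction possibilities is consistent with (8) only
if it corresponds to a single cut-off point.

```python
# Embedding types ordered from easiest to hardest to extract from,
# following the hierarchy in (8). The list name is hypothetical.
HIERARCHY = ["non-finite clause", "finite clause", "complex NP"]


def is_consistent(extractable):
    """Return True if the embedding types that allow extraction form an
    unbroken stretch starting at the bottom of the hierarchy, i.e. if
    they can be described by a single cut-off point."""
    allowed = [kind in extractable for kind in HIERARCHY]
    # The cut-off is the first position that disallows extraction.
    cutoff = allowed.index(False) if False in allowed else len(allowed)
    # No position above the cut-off may allow extraction.
    return not any(allowed[cutoff:])
```

Under this sketch, a set containing non-finite and finite clauses is
consistent, whereas a set containing finite clauses alone is not; the
latter is the pattern Hawkins claims to be unattested.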
The discussion continues in chapter 8 ("Symmetries, asymmetric
dependencies, and earliness effects"), where Hawkins tackles issues like
the preference for subjects to precede objects (including scopal
interpretations), or for topics to be left-dislocated. The idea introduced
in this chapter is that asymmetric dependencies tend to appear in a
specific word order. Consider a pair of elements X and Y, such that Y is
dependent on X for interpretation or for some other reason. Hawkins claims
that, in cases like this, the preferred order is one in which X precedes
Y. If the order were the reverse, one would have to hold in memory the
variable provided by Y until X is encountered, and a suitable value can be
assigned. This would use up the resources of the parser, leading to less
efficient processing. This can be avoided by making X precede Y, so that a
value can be provided for the variable as soon as it is introduced.
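The memory argument can be illustrated with a toy calculation in Python.
This is a sketch under my own simplifying assumptions, not Hawkins'
metric: the function name pending_load is invented, each item counts as
one unit, and the only cost modelled is holding a dependent item while
the item it depends on has not yet been encountered.

```python
def pending_load(order, depends_on):
    """Sum, over every position in `order`, the number of dependent items
    already encountered whose provider has not yet been encountered.

    `depends_on` maps each dependent item to the item it depends on.
    """
    seen = set()
    load = 0
    for item in order:
        seen.add(item)
        # A dependent is pending if it has been seen but its provider has not.
        load += sum(
            1
            for dependent, provider in depends_on.items()
            if dependent in seen and provider not in seen
        )
    return load
```

With depends_on = {"Y": "X"}, the order ["X", "Y"] gives a load of zero,
whereas ["Y", "X"] gives a positive load that grows with every item
intervening between Y and X.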
The general conclusions reached in ECG appear in chapter 9, which also
serves as a general manifesto of sorts for Hawkins' research programme. In
this chapter, he repeats time and again that his work is a reaction to
Chomskyan linguistics. Hawkins accuses generative grammarians of
neglecting the impact of performance factors on the properties of
grammars, focusing exclusively, instead, on grammar-internal theorising.
As has already become clear after 250 pages, Hawkins advocates the
opposite view, where the pressure for efficient parsing underlies many
of the language-specific and cross-linguistic peculiarities of language. I
agree with Hawkins that a complete theory of language will ultimately
have to account for parsing and performance. However, I think the views he
expresses in this book show an overconfidence in the power of performance
as a means to explain grammatical phenomena (see below).
EVALUATION
The basic idea underlying the reasoning in ECG is that there exist a
number of complexity hierarchies in language, and that languages tend to
aim towards the more unmarked values of these hierarchies. Sometimes,
though, I had the impression that this idea is only worked out at an
intuitive level. For instance, most of the discussion about word order
preferences is based on counting words (see also Hawkins 1994). However,
nowhere in the book can we find a definition of "word". I'm not being
picky here. This is a serious objection, since the notion of "word" is
possibly the fuzziest notion in contemporary linguistics, to the extent
that some researchers have claimed that the concept of word is not
relevant in any morphosyntactic sense (e.g., Julien 2002). For most of the
discussion, he seems to assume something like a "dictionary entry" notion
of wordhood. This does the work, but only because Hawkins does not
consider cases where a more explicit definition would be needed. One can
easily think of several such cases. For instance, should clitics be
considered separate words, or should they be counted together with
whatever their host is? Should particle verbs in West Germanic count as
one single word or two when the particle is not stranded? What is the
status of compounds (e.g., "screwdriver" or "overthrow")? How should one
treat the output of incorporation (i.e., much of the discussion in Baker
1988 and Hale & Keyser 1993)? What happens with contractions like "would
have" --> "would've", or "I have" --> "I've"? One particularly interesting
case could be the German negative word "kein" ("no"), which has been
argued to actually be a combination of two words (negation plus an
indefinite determiner) spelled out as one (cf. Penka 2002). Should the
parser have access to this hidden structure or not? Although I haven't
thought about these objections in detail, it seems to me that a more
explicit definition of word would be needed to handle these cases. Maybe
one should also make reference to morphemes, to phonological weight,
prosody, syntactic operations like head movement, or even all of the above.
On a different level, I had the impression through most of the book that
the proposed complexity metric could be more useful as a theory of
language change than as a grammaticality evaluation device in synchronic
terms. That is, a sentence is rarely marked as ungrammatical because it
violates any of the constraints Hawkins introduces (MaOP, MiF, and so on).
It simply ranks as more difficult to process than an equivalent that
doesn't violate such a constraint, but in and of itself it is not
ungrammatical. Of course, after a certain level of complexity is reached,
ungrammaticality results. However, this seems to be just a parsing
failure, rather than "real" ungrammaticality rooted in syntax alone (e.g.,
nobody would want to claim that centre-embedded structures are
ungrammatical after the third or fourth level of embedding; the grammar
has no problem in generating them, but the parser's resources
cannot cope with them). It seems, rather, that language change can be biased
towards structures and patterns that don't overload the parser's
resources. I find this a reasonable conclusion, and I wouldn't have much
trouble adopting it myself. Nonetheless, I am not convinced that this
theory can be used to explain ungrammaticality patterns, as Hawkins tries
to do at some points (e.g., his discussion of *that-trace effects at the
end of chapter 7).
Notwithstanding these comments, one should take ECG for what it really
stands for. Hawkins makes a strong point that performance factors ought
to be incorporated into the general theory of grammar, rather than being
used as a waste basket for certain phenomena one cannot explain in
grammatical terms alone. This I agree with, and I think Hawkins' work (not
only this book, but his earlier publications as well) represents an
important contribution to the understanding of how performance affects
language. What I disagree with is Hawkins' somewhat hidden assumption that
performance can be held as a universal solution for grammar theory. It is
true that an ultimate theory of language must be able to explain the kinds
of phenomena discussed in this book, but I think one should not blur the
competence/performance dichotomy as easily as Hawkins does. As mentioned
above, it seems that processing factors cannot force structures that are not
allowed by grammatical principles. This can be taken as an indicator that
the grammar and parsing are best kept as mainly separate systems, even
though they are subparts of the language faculty, and one can see their
interaction in specific phenomena.
REFERENCES
Baker, Mark (1988), Incorporation: a theory of grammatical function
changing, University of Chicago Press, Chicago
Hale, Ken, and Samuel J Keyser (1993), On argument structure and the lexical
expression of syntactic relations, in Hale & Keyser (eds.), The view from
building 20, 53-109, MIT Press, Cambridge, Massachusetts
Harley, Heidi (2002), Possession and the double object construction,
Linguistic Variation Yearbook 2, 29-68, John Benjamins, Amsterdam
Hawkins, John (1994), A performance theory of order and constituency,
Cambridge University Press, Cambridge
Julien, Marit (2002), Syntactic heads and word formation, Oxford
University Press, Oxford
Keenan, Ed, and Bernard Comrie (1977), Noun phrase accessibility and
Universal Grammar, Linguistic Inquiry 8, 63-99
Penka, Doris (2002), Kein muss kein Rätsel sein, MA thesis, Tübingen
University
ABOUT THE REVIEWER
I am a 3rd year graduate student at Leiden University, specialising in
formal syntax. Topics I've worked on include relativisation, the
syntax-phonology interface, head movement, remnant movement, scrambling, argument
licensing, and the structure of VP.