16.1542, Review: Syntax: Hawkins (2004)

Sun May 15 02:25:43 UTC 2005

LINGUIST List: Vol-16-1542. Sat May 14 2005. ISSN: 1068 - 4875.

Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org) 
        Sheila Dooley, U of Arizona  
        Terry Langendoen, U of Arizona  

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Naomi Ogasawara <naomi at linguistlist.org>
================================================================  

What follows is a review or discussion note contributed to our 
Book Discussion Forum. We expect discussions to be informal and 
interactive; and the author of the book discussed is cordially 
invited to join in. If you are interested in leading a book 
discussion, look for books announced on LINGUIST as "available 
for review." Then contact Sheila Dooley at collberg at linguistlist.org. 

-------------------------Message 1 ---------------------------------- 
Date: Sat, 14 May 2005 22:22:48
From: Luis Vicente < l.vicente at let.leidenuniv.nl >
Subject: Efficiency and Complexity in Grammars 

AUTHOR: John A. Hawkins
TITLE: Efficiency and Complexity in Grammars
PUBLISHER: Oxford University Press
YEAR: 2004
Announced at http://linguistlist.org/issues/16/16-341.html

Luis Vicente, ULCL, Leiden University

OVERVIEW

"Efficiency and Complexity in Grammars" (ECG henceforth) represents one 
further step in Hawkins' attempt to incorporate performance factors into 
the theory of grammar. The opening sentence of the book represents his 
research programme better than anything else: "Has performance had any 
significant impact on the basic design features of grammars?". He answers 
that this is indeed the case. In his view, most of the properties of 
natural languages are a result of the pressure to optimise parsing and 
production. ECG is an extended elaboration of this very brief statement. 
The elaboration is actually so extended that, to keep the review from 
being nearly as long as the book, I will concentrate on some major points 
of the argumentation throughout, and simply sketch others. I believe the 
theory developed in ECG is best understood by focusing on how particular 
analyses work, rather than by giving a bird's eye overview of the entire 
work. 

The book can be divided roughly into two parts. In the first one (chapters 
1-4), Hawkins gives his "big picture" idea of how performance factors 
interact with grammar. These chapters are heavy on theory, and 
consequently demand one's full and undivided attention to follow the 
reasoning, which is otherwise rather neat, unless one makes the mistake 
of taking ECG for what it is not. This is a book about how performance 
factors can influence the shape of languages. It deals with the issue of 
how typological tendencies reflect parsing preferences, not with what the 
formal internal underpinnings of the grammar are. If one reads these 
chapters bearing in mind which questions Hawkins is interested in 
answering, one will find enough exciting ideas to make the heavy reading 
more than worthwhile. 

SYNOPSIS

In the first two chapters ("Introduction" and "Linguistic forms, 
properties, and efficient signaling"), Hawkins argues why performance 
factors must be considered an integral part of the theory of grammar. The 
main thesis of the book is summarised in the Performance-Grammar 
Correspondence Hypothesis (p. 3), which runs as follows:

1) Performance-Grammar Correspondence Hypothesis (PGCH):  Grammars have 
conventionalised syntactic structures in proportion to the degree of 
preference in performance.

Hawkins implements this idea through a number of principles, the most 
important being Minimise Forms, Minimise Domains, and Maximise Online 
Processing -MiF, MiD, and MaOP respectively. The definitions are the 
following:

2) Minimise Forms:  The human processor prefers to minimise the formal 
complexity of each linguistic form F, and the number of forms with unique 
conventionalised property assignments, thereby assigning more properties 
to fewer forms. These minimisations apply in proportion to the ease with 
which a given property P can be assigned in processing to a given form F. 

3) Minimise Domains:  The human processor prefers to minimise the 
connected sequences of linguistic forms and their conventionally 
associated syntactic and semantic properties in which relations of 
combination and/or dependency are processed. 

4) Maximise Online Processing:  The human processor prefers to maximise 
the set of properties that are assignable to an item X as X is processed.

Although the wording might be somewhat cumbersome, the leading idea behind 
those principles is clear: the forms and patterns preferred by grammars 
are those that place the least burden on parsing and production. This can be 
achieved in several ways: by using morphologically simple forms for 
frequently used elements, by favouring word order patterns that don't 
overload the parser, and so on. 

Chapters 3 and 4 ("Defining the efficiency principles and their 
predictions" and "More on form minimisation") consist of a more detailed 
exploration of how the performance principles introduced earlier on can 
influence the shape of languages, thus setting the stage for later 
chapters. Some of the topics he deals with here include grammaticalisation 
(e.g., verbs of wish and desire turning into modals or tense markers, or 
the stages of the evolution of definiteness marking) and markedness 
hierarchies (e.g., nominative being less marked than accusative, which is 
in turn less marked than dative, and so on). Hawkins argues that these 
phenomena and more can be easily accommodated under his framework. For 
instance, he argues that, in a number of hierarchies, the amount of 
phonological and morphological complexity stays equal or increases at 
each step down the hierarchy. He works his way through number marking, 
case marking, and degree forms of adjectives, showing that the more 
marked a 
certain case or number specification is, the more morphophonologically 
complex it usually is. The obvious question is whether this is not simply 
circular reasoning -that is, something is more complex because it is more 
marked, and we know it is more marked because of the extra 
morphophonological complexity. Although Hawkins doesn't deal with this 
question directly, he hints at a way in which markedness hierarchies can 
be derived. Introducing ideas that will be further developed in chapter 8, 
he claims that some positions in the hierarchy are dependent on others. 

For instance, the comparative and superlative forms of adjectives 
("bigger", "biggest") are taken to be dependent on the base ("big"). The 
base form simply requires the adjective to modify an entity. Comparative 
and superlative forms require this, but also the existence of (at least) a 
second entity to establish a comparison with. Similarly, nominative case 
can be assigned to the sole argument of a clause. However, accusative 
usually requires the presence of an argument that has already been marked 
as nominative, and datives usually require the presence of nominative and 
accusative arguments (this, of course, ignores quirky case marking, for 
which I'm willing to give Hawkins the benefit of the doubt). If these 
dependency relations can be established in the way argued in this chapter, 
then it would be possible to avoid circularity. This is, 
however, a notion that I would be happy to see formalised in a more 
explicit way. 
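
The shape of the markedness claim can be stated as a simple monotonicity 
check. The Python below is only an illustrative sketch and is not taken 
from ECG: the function name is hypothetical, and using the number of 
letters as a stand-in for morphophonological complexity is a crude 
assumption made purely for the example.

```python
def non_decreasing_complexity(forms):
    """Return True if each form is at least as long as the one before it.

    `forms` lists one form per hierarchy position, least marked first.
    Letter count is an assumed, very rough proxy for complexity.
    """
    lengths = [len(form) for form in forms]
    return all(a <= b for a, b in zip(lengths, lengths[1:]))


# Degree forms of an adjective, ordered base < comparative < superlative.
print(non_decreasing_complexity(["big", "bigger", "biggest"]))
```

Note that a check of this kind says nothing about circularity: it takes 
the ordering of the hierarchy as given, which is exactly what the 
dependency argument is meant to justify independently.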

In the second part (chapters 5-8), Hawkins turns to a more detailed 
examination of certain phenomena. Here I found considerably more food for 
thought, and I can imagine that these chapters will be more interesting 
than the first four to anyone who wants to focus on fine language-specific 
data, rather than broad typological tendencies. This said, though, I found 
chapter 5 somewhat disappointing despite its title ("Adjacency effects 
within phrases"). I was anticipating a discussion of phenomena such as the 
obligatory verb-object adjacency in English, or the requirement in several 
languages that wh-words and foci immediately precede the verb. Instead, 
Hawkins discusses the conditions under which separation of associated 
phrases causes parsing degradation (or not). In a nutshell, his idea is 
that the more dependent on each other two phrases are, the more difficult 
it will be to separate them. One pair of English examples he discusses 
(amongst many others) is "take X to the library" vs. "take X into 
account". His claim is that displacement of the PP is less likely to occur 
in the latter than in the former, because "take X into account" is what he 
dubs an "opaque collocation". It has a somewhat idiomatic meaning, so the 
larger the separation between its parts, the harder it will be to assign a 
meaning to it. He discusses several similar cases. One problem I find, 
though, is that while phrases dependent on each other tend to be close 
together, they cannot appear closer than the rules of grammar allow. That 
is, there are several idioms that consistently forbid word orders that 
would certainly result in a shorter MaOP domain. For instance:

5) give Mary the sack vs. * give the sack Mary
6) throw John to the lions vs. * throw (to) the lions John
7) drive Peter bananas vs. * drive bananas Peter

See Harley 2002 for discussion of similar examples. The way I understand 
Hawkins' analysis, the sentences on the right hand side should be 
grammatical. They certainly bring the idiomatic parts of the VP together, 
thus resulting in the shortest MaOP domain possible. Yet, despite the gain 
in processing efficiency, the sentences are totally out. Why should this 
be so? It seems like the grammar can yield a number of outputs, which are 
more or less difficult to parse. This is where Hawkins' theory finds its 
place. The reverse, nonetheless, doesn't seem to be true: parsing 
considerations cannot force a structure that is not independently 
generable by grammar. MaOP and other performance constraints can select 
the most parser-friendly structure amongst a number of alternatives, but 
they cannot generate the best possible structure independently of syntax.

The discussion of the correlation between dependency and proximity 
continues in Chapter 6 ("Minimal forms in complements/adjuncts and 
proximity"), where most of the space is devoted to a discussion of the 
parsing preferences among "which", "that", and zero relative clauses, 
and how they can be accommodated under the theory developed so far. The 
general idea is that there is a tension in relative clause marking: while 
an overt complementiser/relative pronoun unambiguously identifies the 
clause as a relative, it also increases the parsing domain. Thus, the 
conflict between MaOP and MiF gives rise to various patterns of preference 
that Hawkins explores in detail.

The much longer chapter 7 ("Relative clause and wh- movement universals") 
is, I think, one of the strongest parts of the book. Of interest in this 
chapter is the correlation he tries to establish between wh- movement and 
VO/OV order. It has long been noted that VO languages tend to have overt 
wh- fronting, whereas OV languages tend to have wh- in situ constructions. 
Within OV, those languages that have wh- fronting tend to be the ones that 
are "partial OV" (e.g., West Germanic languages, where V2 coexists with 
V-final). Hawkins' claim is that these correlations follow 
from his general theory: since the wh- word is subcategorised for by the 
verb, preference is given to word orders where both of them are close 
together, so that the link can be established without overloading the 
parser too much. Thus, in VO languages, where the verb tends to stay 
closer to the left periphery of the clause, wh- fronting can minimise the 
wh- parsing domain. In OV languages, however, wh- fronting will increase 
it... except in partial OV languages, where V2 applies in the case of wh- 
movement. He excludes rightward wh- movement on the general assumption 
that wh- words must precede their gaps to allow for efficient parsing. A 
[gap > wh-] order is deemed inefficient enough in parsing to exclude 
rightward wh- movement nearly across the board. (With regard to this last 
point, one may wonder how prenominal relatives, where the head noun 
follows the gap, can be accommodated. Hawkins' answer is that in these 
languages, NPs in general tend to be N-final. Thus, the inefficiency of 
[gap > head noun] orders can be compensated for by the creation of an 
N-final noun phrase, in accordance with the general pattern of the 
language.)

Another point of interest in this chapter is its approach to certain wh- 
movement restrictions. He claims, for instance, that some phenomena like 
island violations and *that-trace effects are not the result of violating 
any grammatical principles, but of processing factors and complexity 
hierarchies. He assumes the following hierarchy for types of embedding:

8) non-finite clause << finite clause << complex NP

where complex NPs represent the type of embedding with the greatest 
processing difficulty (possibly due to the extra structure). Making a 
parallel with Keenan & Comrie's (1977) Accessibility Hierarchy, Hawkins 
claims that individual languages can select a "cut-off" point in this 
hierarchy. Extraction is possible out of types of embedded phrases below 
this point, but not above it. Thus, there are languages that allow for 
extraction only out of non-finite clauses; a subset of these can add 
finite clauses on top of it; from these, a subset allows for extraction 
out of complex NPs under some circumstances (e.g., Mainland Scandinavian 
languages). What is not attested, according to Hawkins, is a language that 
allows extraction from a certain type of embedded phrase but not from a 
lower one (i.e., a language that allows extraction out of finite clauses, 
but bans it out of non-finite clauses).
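
The cut-off idea can be made concrete with a small sketch. The Python 
below is purely illustrative and not taken from ECG: the function names 
are hypothetical, and treating the cut-off as the highest position that 
still permits extraction is an assumption of the sketch.

```python
# Embedding types from (8), ordered from least to most difficult.
HIERARCHY = ["non-finite clause", "finite clause", "complex NP"]


def extractable(cutoff):
    """Return the embedding types that permit extraction for a cut-off.

    `cutoff` is assumed to be the highest position that still permits
    extraction; every position below it is predicted to permit it too.
    """
    index = HIERARCHY.index(cutoff)
    return HIERARCHY[: index + 1]


def respects_hierarchy(allowed):
    """Check that the permitted types form an unbroken lower stretch.

    Once one position bans extraction, no higher position may allow it.
    """
    flags = [kind in allowed for kind in HIERARCHY]
    return all(lower or not higher for lower, higher in zip(flags, flags[1:]))


print(extractable("finite clause"))
# The unattested pattern: finite clauses allowed, non-finite ones banned.
print(respects_hierarchy({"finite clause"}))
```

Under these assumptions, the three attested language types correspond to 
the three possible cut-offs, and the unattested type is the one for which 
`respects_hierarchy` returns False.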

The discussion continues in chapter 8 ("Symmetries, asymmetric 
dependencies, and earliness effects"), where Hawkins tackles issues like 
the preference for subjects to precede objects (including scopal 
interpretations), or for topics to be left-dislocated. The idea introduced 
in this chapter is that asymmetric dependencies tend to appear in a 
specific word order. Consider a pair of elements X and Y, such that Y is 
dependent on X for interpretation or for some other reason. Hawkins claims 
that, in cases like this, the preferred order is one in which X precedes 
Y. If the order were the reverse, one would have to hold in memory the 
variable provided by Y until X is encountered, and a suitable value can be 
assigned. This would use up the resources of the parser, leading to less 
efficient processing. This can be avoided by making X precede Y, so that a 
value can be provided for the variable as soon as it is introduced.
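
The memory argument can be illustrated with a toy count. The Python below 
is a rough sketch, not Hawkins' own metric: the function name is 
hypothetical, and counting one "step" per item processed while Y waits 
for X is a simplifying assumption.

```python
def pending_steps(order, x, y):
    """Count the items processed while y waits for x to be encountered.

    `y` is assumed to depend on `x`. If x comes first, y can be resolved
    as soon as it is encountered, so the count is 0.
    """
    x_pos = order.index(x)
    y_pos = order.index(y)
    if x_pos < y_pos:
        return 0
    return x_pos - y_pos


# X before Y: nothing has to be held in memory.
print(pending_steps(["X", "a", "b", "Y"], "X", "Y"))
# Y before X: Y stays unresolved until X finally arrives.
print(pending_steps(["Y", "a", "b", "X"], "X", "Y"))
```

In this toy model the [X > Y] order always costs nothing, while the cost 
of [Y > X] grows with the distance between the two items.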

The general conclusions achieved in ECG appear in chapter 9, which also 
serves as a general manifesto of sorts for Hawkins' research programme. In 
this chapter, he repeats time and again that his work is a reaction to 
Chomskyan linguistics. Hawkins accuses generative grammarians of 
neglecting the impact of performance factors on the properties of 
grammars, focusing exclusively, instead, on grammar-internal theorising. 
As has already become clear after 250 pages, Hawkins advocates the 
opposite view, where the pressure for efficient parsing underlies many 
of the language-specific and cross-linguistic peculiarities of language. I 
agree with Hawkins that a complete theory of language will ultimately 
have to account for parsing and performance. However, I think the views he 
expresses in this book show an overconfidence in the power of performance 
as a means to explain grammatical phenomena (see below).

EVALUATION

The basic idea underlying the reasoning in ECG is that there exist a 
number of complexity hierarchies in language, and that languages tend to 
aim towards the more unmarked values of these hierarchies. Sometimes, 
though, I had the impression that this idea is only worked out at an 
intuitive level. For instance, most of the discussion about word order 
preferences is based on counting words (see also Hawkins 1994). However, 
nowhere in the book can we find a definition of "word". I'm not being 
picky here. This is a serious objection, since the notion of "word" is 
possibly the fuzziest notion in contemporary linguistics -to the extent 
that some researchers have claimed that the concept of word is not 
relevant in any morphosyntactic sense (e.g., Julien 2002). For most of the 
discussion, he seems to assume something like a "dictionary entry" notion 
of wordhood. This does the work, but only because Hawkins does not 
consider cases where a more explicit definition would be needed. One can 
easily think of several such cases. For instance, should clitics be 
considered separate words, or should they be counted together with 
whatever their host is? Should particle verbs in West Germanic count as 
one single word or two when the particle is not stranded? What is the 
status of compounds (e.g., "screwdriver" or "overthrow")? How should one 
treat the output of incorporation (cf. much of the discussion in Baker 
1988 and Hale & Keyser 1993)? What happens with contractions like "would 
have" --> "would've", or "I have" --> "I've"? One particularly interesting 
case could be the German negative word "kein" ("no"), which has been 
argued to actually be a combination of two words (negation plus an 
indefinite determiner) spelled out as one (cf. Penka 2002). Should the 
parser have access to this hidden structure or not? Although I haven't 
thought about these objections in detail, it seems to me that a more 
explicit definition of word would be needed to handle these cases. Maybe 
one should also make reference to morphemes, to phonological weight, 
prosody, syntactic operations like head movement, or even all of the above.
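
To see why the definition matters for a metric based on counting words, 
consider the following sketch. It is only an illustration and not part of 
ECG: both function names are hypothetical, and the two ways of segmenting 
a string are assumptions chosen to show that the count shifts with the 
definition of "word".

```python
import re


def count_orthographic(text):
    """Count whitespace-separated strings, so "would've" is one word."""
    return len(text.split())


def count_with_clitics_split(text):
    """Count again, treating an apostrophe-attached clitic as a word."""
    return len(re.findall(r"[A-Za-z]+|'[A-Za-z]+", text))


sentence = "I would've gone"
print(count_orthographic(sentence))
print(count_with_clitics_split(sentence))
```

The same string yields a different length under each assumption, so any 
domain measured in words inherits the choice.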

On a different level, I had the impression through most of the book that 
the proposed complexity metric could be more useful as a theory of 
language change than as a grammaticality evaluation device in synchronic 
terms. That is, a sentence is rarely marked as ungrammatical because it 
violates any of the constraints Hawkins introduces (MaOP, MiF, and so on). 
It simply ranks as more difficult to process than an equivalent that 
doesn't violate such a constraint, but in and of itself it is not 
ungrammatical. Of course, after a certain level of complexity is reached, 
ungrammaticality results. However, this seems to be just a parsing 
failure, rather than "real" ungrammaticality rooted in syntax alone (for 
instance, nobody would want to claim that centre-embedded structures are 
ungrammatical after the third or fourth level of embedding: the grammar 
has no problem in generating them, it's simply that the parser's resources 
cannot cope with them). It seems more like language change can be biased 
towards structures and patterns that don't overload the parser's 
resources. I find this a reasonable conclusion, and I wouldn't have much 
trouble adopting it myself. Nonetheless, I am not convinced that this 
theory can be used to explain ungrammaticality patterns, as Hawkins tries 
to do at some points (e.g., his discussion of *that-trace effects at the 
end of chapter 7). 

Notwithstanding these comments, one should take ECG for what it really 
stands for. Hawkins makes a strong point that performance factors ought 
to be incorporated into the general theory of grammar, rather than being 
used as a waste basket for certain phenomena one cannot explain on 
grammatical terms alone. This I agree with, and I think Hawkins' work (not 
only this book, but his earlier publications as well) represents an 
important contribution to the understanding of how performance affects 
language. What I disagree with is Hawkins' somewhat hidden assumption that 
performance can be held as a universal solution for grammar theory. It is 
true that an ultimate theory of language must be able to explain the kinds 
of phenomena discussed in this book, but I think one should not blur the 
competence/performance dichotomy as easily as Hawkins does. As mentioned above, 
it seems like processing factors cannot force structures that are not 
allowed by grammatical principles. This can be taken as an indicator that 
the grammar and parsing are best kept as mainly separate systems, even 
though they are subparts of the language faculty, and one can see their 
interaction in specific phenomena. 

REFERENCES

Baker, Mark (1988), Incorporation: a theory of grammatical function 
changing, University of Chicago Press, Chicago

Hale, Ken, and Samuel J Keyser (1993), Argument structure and the lexical 
expression of syntactic relations, in Hale & Keyser (eds.), The view from 
building 20, 53-109, MIT Press, Cambridge, Massachusetts

Harley, Heidi (2002), Possession and the double object construction, 
Linguistic Variation Yearbook 2, 29-68, John Benjamins, Amsterdam

Hawkins, John (1994), A performance theory of order and constituency, 
Cambridge University Press, Cambridge

Julien, Marit (2002), Verbal inflection and word formation, Oxford 
University Press, Oxford

Keenan, Ed, and Bernard Comrie (1977), Noun phrase accessibility and 
Universal Grammar, Linguistic Inquiry 8, 63-99

Penka, Doris (2002), Kein muss kein Rätsel sein, MA thesis, Tübingen 
University 

ABOUT THE REVIEWER

I am a 3rd year graduate student at Leiden University, specialising in 
formal syntax. Topics I've worked on include relativisation, syntax-
phonology interface, head movement, remnant movement, scrambling, argument 
licensing, and the structure of VP.
