17.1540, Review: Ling Theories/Methodology: Kepser & Reis (2005)

Fri May 19 01:37:10 UTC 2006

LINGUIST List: Vol-17-1540. Thu May 18 2006. ISSN: 1068 - 4875.

Subject: 17.1540, Review: Ling Theories/Methodology: Kepser & Reis (2005)

Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org) 
        Sheila Dooley, U of Arizona  
        Terry Langendoen, U of Arizona  

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Lindsay Butler <lindsay at linguistlist.org>
================================================================  

This LINGUIST List issue is a review of a book published by one of our
supporting publishers, commissioned by our book review editorial staff. We
welcome discussion of this book review on the list, and particularly invite
the author(s) or editor(s) of this book to join in. To start a discussion of
this book, you can use the Discussion form on the LINGUIST List website. For
the subject of the discussion, specify "Book Review" and the issue number of
this review. If you are interested in reviewing a book for LINGUIST, look for
the most recent posting with the subject "Reviews: AVAILABLE FOR REVIEW", and
follow the instructions at the top of the message. You can also contact the
book review staff directly.

===========================Directory==============================  

1)
Date: 12-May-2006
From: Elke Gehweiler < elkegehw at zedat.fu-berlin.de >
Subject: Linguistic Evidence: Empirical, Theoretical and Computational Perspectives 

-------------------------Message 1 ---------------------------------- 
Date: Thu, 18 May 2006 21:21:22
From: Elke Gehweiler < elkegehw at zedat.fu-berlin.de >
Subject: Linguistic Evidence: Empirical, Theoretical and Computational Perspectives 

Announced at http://linguistlist.org/issues/17/17-97.html 

EDITORS: Kepser, Stephan; Reis, Marga
TITLE: Linguistic Evidence
SUBTITLE: Empirical, Theoretical and Computational Perspectives
SERIES: Studies in Generative Grammar 85
PUBLISHER: Mouton de Gruyter
YEAR: 2005

Elke Gehweiler, Freie Universität Berlin and Berlin-Brandenburgische 
Akademie der Wissenschaften

GENERAL DESCRIPTION

The volume 'Linguistic Evidence', edited by Stephan Kepser and 
Marga Reis, is based on the conference 'Linguistic Evidence. 
Empirical, Theoretical, and Computational Perspectives' that took 
place in Tübingen from January 29 to February 1, 2004. It contains a 
short introduction by the editors and 26 papers.

SUMMARY

The introduction discusses several issues related to linguistic 
evidence. As the central objects of linguistic enquiry -- ''language, 
languages, and the factors/mechanisms systematically (co-) governing 
language acquisition, language processing, language use, and 
language change'' (1) -- cannot be directly accessed, they have to be 
reconstructed from the manifestations of linguistic behaviour. As there 
are many possible data types, e.g. introspection, corpus data, data 
from (psycho-) linguistic experiments, synchronic vs. diachronic data, 
typological data, neurolinguistic data, data from first and second 
language learning, data from language disorders, gaining linguistic 
evidence from the potentially available data is no trivial matter. 
Linguistic evidence is quite a new topic of linguistic discussion. Until 
the mid nineties there were essentially two ways of gathering data: 
generativists relied largely on introspective data, whereas 
non-generative linguists relied on informally gathered corpus data. But this 
has begun to change. The editors attribute this turning point to the 
book by Schütze (1996), who demanded a systematic approach to 
speaker judgements. Since then, many scholars have shown that it is 
necessary to control the many factors that influence speaker 
judgements in order to obtain more reliable data. Furthermore the size 
and availability of corpora has grown since the mid nineties, and with it 
the importance of corpora as a source of evidence. Both 
developments, Kepser/Reis claim, have paved the way for a 
rapprochement between introspective and corpus linguistics and ''[i]t is 
one of the main aims of this volume to overcome the corpus data 
versus introspective data opposition and to argue for a view that 
values and employs different types of linguistic evidence each in their 
own right. Evidence involving different domains of data will shed 
different, but altogether more, light on the issues under investigation, 
be it that the various findings support each other, help with the correct 
interpretation, or by contradicting each other, lead to factors or 
influence so far overlooked. This ties in naturally with the fact ... that 
there are more domains and sources of evidence that should be taken 
into account than just corpus data and introspective data.'' (3).

In the first article 'Gradedness and Consistency in Grammaticality' Aria 
Adli argues for graded grammaticality judgements. Adli criticises the 
fact that in theoretical studies questionable introspective judgements 
are quoted without prior empirical verification. One of the examples 
Adli discusses in detail is the case of the 'que' --> 'qui' rule in French, 
which is much cited in syntactic theorising. It essentially states that ''an 
ECP [Empty Category Principle-EG] violation can be avoided in 
French if 'qui' is used instead of the usual complementizer 'que' in 
sentences where a wh-phrase has been extracted from the subject 
position'' (7), and that there are clear differences in grammaticality 
between such sentences with 'qui' and 'que'. Using data from a 
controlled experiment with a graded concept of grammaticality Adli 
shows that the 'que' --> 'qui' rule is largely a myth and suggests that 
instead psycholinguistic factors are responsible for the differences in 
(un)grammaticality of different sentence types containing these forms.

Katrin Axel's paper 'Null Subjects and Verb Placement in Old High 
German' deals with Old High German (OHG) time and weather 
expressions without the quasi-argument 'iz' ('it') and with constructions 
where a referential subject is not overtly realised. Using three major 
prose texts as her empirical basis, she shows that earlier OHG (8th 
and 9th century) allowed genuine pro-drop and should therefore not 
be classified as a semi pro-drop language. Her data show that null 
subjects are (largely) restricted to root clauses in early OHG, which 
are distinguished from subordinate clauses by the position of the finite 
verb (verb-first/verb-second vs. sentence-final/sentence-late). She 
claims that this main/subordinate asymmetry can be accounted for if 
we assume that null subjects are only licensed in post-finite position, 
i.e. ''it is highly plausible that null subjects are only licensed in 
configuration [sic] in which they are c-commanded by a leftward 
moved finite verb: [V+AGR]k [pro ... tk]]. In OHG, the only way to 
obtain the required configuration for null-subject licensing is verb 
movement to C0'' (34). Axel further suggests that the distribution of 
null subjects is influenced by morphological factors. In OHG there 
were two alternative verb endings in the 1st person plural: a short '-m' 
and a long '-mês'. Pronouns occurring with the short variant are 
virtually always overt, whereas with the long ending they are frequently 
omitted, though only in post-finite position. Axel claims that although the Latinised 
writing tradition may have had a certain impact, the widely-held 
assumption that the omission of referential subject pronouns in earlier 
OHG is a foreign feature cannot be upheld as it fails to explain why 
null subjects were largely banned from pre-finite environments and 
from contexts with 1st person plural endings in '-m'. Modern Standard 
German does not allow referential pro-drop anymore, despite its 
comparatively 'rich' verbal inflection. Referring to Sprouse and Vance 
(1999), Axel argues that the replacement of null subjects by overt 
pronouns need not be related to any grammar-internal changes, but 
rather to differences in parsing success, based on the assumption that 
utterances with null pronouns are more difficult to parse. Axel finally 
argues that the case of the OHG null subjects puts into doubt the 
assumed incompatibility of referential pro-drop and verb second. 
Neither does it confirm the relation between morphological richness 
and null subjects.

The authors of 'Beauty and the Beast: What Running a Broad-
Coverage Precision Grammar over the BNC Taught Us about the 
Grammar - and the Corpus' (Timothy Baldwin, John Beavers, Emily M. 
Bender, Dan Flickinger, Ara Kim, Stephan Oepen) argue for a hybrid 
approach to grammar engineering (referring to Fillmore 1992). After 
reviewing some of the arguments for and against corpus data and 
introspective data they present their methodology for building a 
broad-coverage precision grammar. In a first step they apply the English 
Resource Grammar (ERG) to a sample of the BNC. The grammar was 
able to generate at least one parse for 57% of the sentences. The 
43% that did not receive a parse were diagnosed and classified 
manually. The authors distinguished seven categories of parsing 
failure, which either represent gaps in the grammar (''missing lexical 
entry'', ''missing construction'', ''fragment''), are due to preprocessing 
errors or parser resource limitations, or represent noise 
(''ungrammatical string'', ''extragrammatical string''). They then discuss 
these categories further, and explain why the respective sentences 
could not be parsed. Missing lexical entries for example fall into two 
basic categories: missing lexical types for a given word token (e.g. the 
grammar contains the noun 'table', but not the verb) and missing 
multiword expressions. The authors argue that combining the two 
sources of linguistic evidence - using corpora as primary source of 
data, and enhancing and expanding that data with native speaker 
judgments - can be of much use to grammar developers. The corpus 
provides linguistic variety and authenticity, revealing new syntactic 
constructions, which can then be analyzed with the grammar. Here, 
insisting on a notion of grammaticality helps to recognise and 
categorise the noise in the corpus. According to Baldwin et 
al. ''precision grammar engineering serves both as a means of 
linguistic hypothesis testing and as an effective way to bring new data 
into the arena of syntactic theory'' (64).

In 'Seemingly Indefinite Definites' Greg Carlson and Rachel Shirley 
Sussman use experimental and non-experimental methods to show 
that there is a sub-class of English definite articles which in their 
interpretations are similar to indefinite articles, such as 'the' in ''Mary 
went to the store'', where the identity of the store is not especially 
important, in contrast to 'the' in ''Mary went to the desk''. First, the 
authors show that weak definites have the same distributional 
properties as bare singular count nouns (''He was in bed''). They are 
lexically restricted, i.e. it is a lexical feature of the noun itself that 
determines whether it can function as a bare singular/weak definite, 
they do not allow any modification, a certain degree of semantic 
enrichment is added to them, they only co-occur with lexical items of 
certain classes, and their distributional properties preclude application 
of the usual tests for definiteness/indefiniteness. In the second part of 
the paper Carlson/Sussman show that experimental evidence 
supports the existence of a separate class of weak definites. For their 
experiment they selected six nouns that often function as indefinite 
definites and matched them with comparable regular definite nouns 
(e.g. ''After she finishes her breakfast, Lydia will read the newspaper'' 
vs. ''the book''). Each noun was put into a sentence containing a verb 
that was known to support the indefinite definite reading. For each 
sentence pair a visual context was created which depicted the scene 
just before the action depicted in the sentence is carried out. The 
participants saw this scene on a computer screen, while they heard a 
spoken version of the sentence. They then had to choose the item on 
display that they thought was most likely to be involved in the 
upcoming action. In addition, their eye movements were monitored while 
they were listening to the sentence. Both target choice and eye 
movements supported the existence of two separate classes of 
definites.

Sonia Cyrino and Ruth Lopes ('Animacy as a Driving Cue in Change 
and Acquisition in Brazilian Portuguese') use both diachronic data and 
data from language acquisition to show that a feature that was 
relevant for a change in Brazilian Portuguese is still operative in 
language acquisition. Looking at historical data they first discuss the 
grammatical change in object constructions where the 3rd person 
neuter clitic 'o' is gradually replaced by a null element, leading to a 
change in the grammar. They then go on to examine the present-day 
acquisition of the null category, arguing that this shift became critical 
for language acquisition, cuing a new grammar, and that it was the 
semantic features of the antecedent that were the driving cue and 
played a role in the acquisition of the object pronominal paradigm in 
Brazilian Portuguese. The more general theoretical conclusions they 
draw from this are that firstly, ''we may take cue-based theories 
seriously and try to show how a cue can be operative after a change 
occurred in a language, explaining the change itself'' (102), and 
secondly, that this ''places some questions about acquisition proper 
within the generative framework'' (102).

In 'Aspectual Coercion and On-line Processing: The Case of Iteration' 
Sacha DeVelle discusses the phenomenon of iteration, which is a 
prime example of aspectual coercion. Iteration ''describes the 
encoding of a series of repetitions within a given situation'' (106). The 
iterative interpretation is enhanced by the semantic punctual feature 
of point action verbs ('jump'), which can reflect a single act ('dive') or 
an iterative act ('knock'). Two studies (Piñango, Zurif, and Jackendoff 
(1999), using a cross modal lexical decision (CMLD) interference task; 
Todorova, Straub, Badecker, and Frank (2000) using a reading time 
task) have shown that if a point action verb is combined with the 
durational adverbials 'for' or 'until' (e.g. ''The girl dived in the pool for 
five minutes'') there is an increased processing load, which is 
demonstrated by longer reaction times and emerges at or just after 
the durational adverbial. The authors of both studies argue that this is 
evidence for an enriched compositional operation. DeVelle however 
argues that the processing differences between activity verbs and 
point action verbs may also be due to the sentence stimuli used in the 
two studies. A repetition of Piñango et al.'s (1999) study showed one 
significant difference from the original study: the point 
action/durational adverbial sentence pairs were overall interpreted as 
more difficult to understand and less plausible than their activity 
sentence counterparts. DeVelle claims that this may have influenced 
Piñango et al.'s findings.

Studies on child language acquisition have argued that the acquisition 
of epistemic expressions begins between two-and-a-half and three 
years of age, but that epistemic expressions remain very rare until 4;5 
(year; month) or later. Experiments have however shown that the 
linguistic epistemic system is not fully understood until the age of 8;0 
or later, and that weak epistemic expressions like 'können' 
or 'vielleicht' are still not understood by 6- and 7-year-olds. These 
findings suggest that children understand (weak) epistemic terms 
much later than they begin to use them. In 'Why Do Children Fail to 
Understand Weak Epistemic Terms? An Experimental Study' Serge 
Doitchinov presents the results of two experiments he has conducted 
in order to find out whether children's late understanding of epistemic 
terms is related to the development of their ability to understand 
epistemic uncertainty (inference based hypothesis) or to their ability to 
recognise scalar implicature (implicature based hypothesis). His first 
experiment consisted of three tasks: (i) the 'modal expression task' 
which investigated children's ability to understand weak epistemic 
expressions correctly; (ii) the 'implicature task', to assess the 
children's understanding of scalar implicatures; and (iii) 
the 'interference task' which examined their ability to deal with 
epistemic uncertainty. The second experiment was conducted to 
further assess the children's ability to recognise scalar implicatures. 
The results of the two experiments suggest that the acquisition of 
epistemic terms depends on the development of children's ability to 
understand epistemic uncertainty; this ability seems not yet fully 
mastered by eight years of age. Doitchinov argues that younger 
children's capacity to use weak epistemic terms is limited. They 
probably first use weak epistemic terms only in very familiar situations, 
which does not contradict previous claims. According to Doitchinov, the 
results however also suggest that they have difficulties in inferring 
epistemic possibility, and that they occasionally overgeneralise the 
use of strong epistemic terms in their talk.

Linguistic descriptions of negative polarity items agree that the 
occurrence of polarity items is licensed by semantic and/or pragmatic 
properties. Furthermore it was argued that a negative polarity item is 
only licensed if it occurs in the scope of a negator (cf. e.g. Haegeman 
1995).
(1) 
a. Kein Mann, der einen Bart hatte, war jemals glücklich. 'No man who 
had a beard was ever happy'
b. *Ein Mann, der einen Bart hatte, war jemals glücklich. 'A man who 
had a beard was ever happy'
c. *Ein Mann, der keinen Bart hatte, war jemals glücklich. 'A man who 
had no beard was ever happy'

The paper 'Processing Negative Polarity Items: When Negation 
Comes Through the Backdoor' by Heiner Drenhaus, Stefan Frisch and 
Douglas Saddy presents the results of two psycholinguistic studies 
(speeded acceptability judgment tasks and event-related brain 
potentials (ERPs)). They used structures such as those in (1) to 
examine the specific lexical properties of negative polarity items 
like 'jemals' ('ever') and the licensing conditions that are due to 
hierarchical constituency. Both experiments confirmed that there are 
two licensing conditions for negative polarity items: the 
semantic/pragmatic, and the structural/syntactic condition. Both 
experiments however also showed that violation with inaccessible 
negation (1c) was more often accepted as correct than violation 
without negation (1b), indicating that the negator is (wrongly) used to 
license the polarity item even if it is not in a c-commanding position. 
Drenhaus et al. claim that this might be due to a ''competition between 
semantic/pragmatic information and hierarchical constituency'' (159), 
but that further systematic investigations of polarity constructions are 
needed.

Veronika Ehrich's paper 'Linguistic Constraints on the Acquisition of 
Epistemic Modal Verbs' discusses constraints on the acquisition of 
epistemic modal verbs (MVs) in German. Ehrich first gives a detailed 
description of the relevant semantic and syntactic properties of 
German MVs and reviews some of the main findings of MV-acquisition 
research. She then compares the results of her corpus study to 
different competing (psycho-) linguistic approaches to epistemicity in 
language and language development. Ehrich concludes that syntactic 
progress, semantic diversification and cognitive development are all 
necessary prerequisites for the rise of epistemicity, but none of them 
seems to be sufficient by itself.

In 'The Decathlon Model of Empirical Syntax' Sam Featherston 
describes a new model of grammar, the 'Decathlon Model'. 
Featherston has conducted studies on frequency (based on corpus 
data) and studies on grammaticality (based on native speakers' 
judgments, using a procedure which ''allowed informants to express all 
the differences in ''naturalness'' that they perceive, with no coercion to 
a given scale'' (189)). The grammaticality-judgment study has yielded 
the following results: (i) judged well-formedness is a continuum - a cut-
off point between well-formed and not well-formed cannot be located, 
(ii) each linguistic factor has an effect on well-formedness - more 
violations cause a structure to be evaluated worse, and (iii) there are 
no 'hard' constraints - no violation excludes a structure from the 
grammar. The frequency data show a different picture. Of the 16 
structures tested in the judgments, one occurs once in the corpus (the 
one judged second best), one occurs 14 times (the one judged best); 
the remaining 14 structures do not occur at all. This shows that the 
two data types are in fact not measuring the same factor and that 
relative judgments say nothing about the probability of occurrence of a 
structure. Featherston then introduces the Decathlon Model, which is 
supposed to be both ''an outline architecture of a grammar and at the 
same time an account of the differences between data types'' (196). 
The Decathlon Model's 'Constraint Application' module ''applies 
constraints, assigns violation costs, and outputs form/meaning pairs, 
weighted with violation costs'' (197). These form/meaning pairs are 
then sent to the 'Output Selection' module, which basically contains 
the grammar and which selects the best candidate for output. The 
existence of these two modules explains the different results for the 
different data types: With judgments, what is returned is the output of 
the Constraint Application function, whereas frequency measures 
measure the output of the Output Selection module. Featherston then 
goes on to discuss the advantages of the Decathlon Model over other 
theories of syntax, the notion and the nature of well-formedness, and 
the implications of his findings for the choice of data types in syntax. 
Here he concludes that the data type for syntax must be relative 
judgments: ''Frequency measures give us the same information as 
relative judgments about the best (couple of) structural alternatives in 
each comparison set, but they give us no information about any of the 
others.'' (205) For syntactic theory this means that one has to choose 
what one wants to model, as output selection and the grammar are 
two separate processes.
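
The two-module architecture described above can be illustrated with a 
minimal Python sketch. It is not Featherston's implementation: the 
constraint names, the violation costs, the candidate structures and the 
function names are all hypothetical, and Output Selection is simplified 
here to a strict choice of the lowest-cost candidate (the corpus counts 
reported above suggest that the second-best structure can occasionally 
surface as well).

```python
from dataclasses import dataclass

# Hypothetical constraints and violation costs, chosen only for illustration.
VIOLATION_COSTS = {"C1": 2.0, "C2": 1.0, "C3": 0.5}


@dataclass(frozen=True)
class Candidate:
    form: str
    violations: tuple  # names of the constraints this structure violates


def constraint_application(candidates):
    """Weight every candidate with its cumulative violation cost.

    No candidate is filtered out, mirroring the claim that there are
    no 'hard' constraints.
    """
    return [(c, sum(VIOLATION_COSTS[v] for v in c.violations)) for c in candidates]


def output_selection(weighted):
    """Simplified competition: only the lowest-cost candidate is produced."""
    return min(weighted, key=lambda pair: pair[1])[0]


candidates = [
    Candidate("A", ()),
    Candidate("B", ("C3",)),
    Candidate("C", ("C1", "C2")),
]

weighted = constraint_application(candidates)

# Judgments see the graded output of Constraint Application ...
judgments = {c.form: -cost for c, cost in weighted}
# ... whereas frequency data only see what Output Selection lets through.
produced = output_selection(weighted)
```

In this toy setting every candidate receives a graded score, but only 
candidate A would ever be produced, which is the kind of mismatch 
between judgment data and frequency data that the model is meant to 
explain.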

In her paper 'Examining the Constraints on the Benefactive Alternation 
by Using the World Wide Web as a Corpus' Christiane Fellbaum asks 
whether data gathered from the web can give us new insights into 
speakers' grammars and serve as evidence for linguistic theories. She 
contrasts the constraints for the Benefactive alternation (consisting of 
the PP alternant (''Chris bought a cake for Kim'') and the direct object 
(DO) alternant (''Chris bought Kim a cake'')) that were formulated on 
the basis of introspective data, with the data found on the web. Her 
data show that the previously proposed constraints cannot fully 
account for the data found on the web, although ''most data fall into 
the kinds of patterns that previous researchers have suggested'' 
(237). Fellbaum shows, for example, (i) that the DO alternant can occur with 
verbs of destruction, (ii) that it does not necessarily require 
a ''created/prepared/obtained entity that becomes the Beneficiary's 
possession'' (222) as had been claimed by other scholars, and (iii) 
that there is no ''Latinate Constraint'', i.e. there is ''no restriction on the 
Benefactive alternation that can be formulated in terms of etymology 
or morphophonological properties of the verb'' (225). She further 
shows (iv) that restrictions concerning the Benefactive cannot be 
formulated in terms of aspect, and (v) that the constraints that had 
been formulated concerning the nominal arguments of the Benefactive 
seem to be no ''hard'' constraints. Fellbaum argues that although web 
data do not permit us to formulate any hard constraints, two 
observations can be made: in the DO alternant, the subject has to 
have control over the event, and, unlike in the PP alternant, ''a benefit 
is necessarily bestowed, resulting in a change of state of the affected 
entity, the Beneficiary'' (237). She concludes with the observation that 
constructed data often fail to capture the fuzzy nature of real 
constraints and argues that all those grammatical phenomena that 
could previously only be studied using one's intuition should now be 
re-examined using naturally occurring data, i.e. corpus data.

In 'A Quantitative Corpus Study of German Word Order Variation' Kris 
Heylen attempts to overcome the limitations of ''traditional'' data 
(introspection and ''encountered'' examples) by using a corpus-based 
approach to study the word order variation in the German Mittelfeld. 
Heylen first discusses the problems with traditional data types for 
studying word order variation, arguing that they are unreliable and not 
able to deal with gradient and multifactorial phenomena. He then 
discusses the advantages of corpora over other data types and proposes 
a corpus-based approach, arguing that (i) corpus data is primary data 
in linguistics, (ii) corpora give us easy access to large amounts of 
data, (iii) corpus-data reflects gradient effects through relative 
frequencies, and (iv) multiple factors can be studied directly by looking 
at actual usage data. Heylen then presents the results of a corpus-
based study on word order, where he has examined ''the variation that 
occurs when both a full NP-subject and a pronominally realised object 
are present in the Mittelfeld'' (244). He takes into account seven 
factors that might influence word order, and, using various statistical 
models, examines the correlations between word order and these 
factors (for each factor separately and for multiple factors 
simultaneously). Although his analysis shows that the seven factors 
investigated can explain some of the variation (e.g. the strong effect of 
clause-type: ''the 'marked' order subject-first is especially common in 
subordinate clauses'' (261)), Heylen argues that additional factors 
have to be tested in order to be able to fully account for the variation. 
He concludes by arguing that the results of the study are ''not yet 
explanations'' (261) and that in order to formulate an explanatory 
model for the variation corpus-data alone may not be sufficient as it is 
only ''part of a whole set of data types that are necessary for sound 
empirical language research'' (261).

There are a number of statistical word similarity measures, which are 
based on fundamentally different assumptions. The paper 'Which 
Statistics Reflects Semantics? Rethinking Synonymy and Word 
Similarity' by Derrick Higgins presents yet another model - local 
context-information retrieval (LC-IR), which ''is based on web search 
statistics regarding the frequency with which words appear adjacent to 
one another'' (280). Higgins shows that LC-IR outperforms any other 
purely statistical model and ascribes this to the fact that, as it uses web 
data, there is no problem of data sparsity, and to the fact that it uses 
the parallelism assumption, i.e. it ''predicts that similar words will occur 
in grammatically parallel constructions'' (275). Other models, on the 
other hand, are either based on the idea that similar words occur near 
the same set of other words (the topicality assumption) or that words 
occur near those words which are most similar to them (the proximity 
assumption). Higgins goes on to discuss the implications his approach 
may have for a theory of lexical semantics and acquisition, arguing for 
example that grammatical parallelism is a cue used by language 
learners to identify words as semantically similar or synonymous.

The paper 'Language Production Errors as Evidence for Language 
Production Processes - the Frankfurt Corpora' (Annette Hohenberger, 
Eva-Maria Waleschkowski) compares ''slips'' in German Sign 
Language (DGS) to ''slips'' in spoken German in order to answer the 
question ''which aspects of language production and monitoring are 
modality-dependent and which are not'' (287). Using data from a DGS 
corpus and a corpus of spoken German, as well as experimental data 
from what they call ''the slip experiment'' to supplement the corpus 
data, Hohenberger/Waleschkowski show that ''language processing is 
basically modality independent'' (300), i.e. the fact that there are 
identical types of slips in DGS and spoken German indicates 
that ''producing speech and sign proceeds through the same planning 
stages and involves the same computational vocabulary'' (300). The 
observed differences in slip-types are argued to be related to 
differences in information packaging strategies in DGS and spoken 
German. 

The aim of Mary Aizawa Kato and Carlos Mioto's paper 'A Multi-
Evidence Study of European and Brazilian Portuguese wh-Questions' 
is to compare contemporary European Portuguese (EP) and Brazilian 
Portuguese (BP) wh-questions using equivalent written corpora as 
well as speakers' intuition. They then aim to provide a theoretical 
interpretation of the results, using Lightfoot's Principles and Parameters 
(PP) model of language change (Lightfoot 1999) as their framework. 
Their empirical research showed that there is an intersection of 
licensed patterns in EP and BP, but that there are also differences. 
Compared to what had been found in previous studies, their empirical 
study revealed two facts: (i) ''spoken EP does not exhibit VS [verb-
subject - EG] order in non-cleft questions'' (316) and (ii) ''BP VS order 
in non-cleft questions is not restricted to unaccusative verbs'' (316). 
Kato/Mioto's most important theoretical conclusion is that the VS order 
in EP wh-questions reflects the derivation of thetic sentences in 
general.

Gerard Kempen and Karin Harbusch ('The Relationship between 
Grammaticality Ratings and Corpus Frequencies: A Case Study into 
Word Order Variability in the Midfield of German Clauses') compare 
the results of a graded grammaticality-study on word order in the 
German Mittelfeld (Keller 2000) to data from two corpora. Keller had 
found that none of the constraints (C1) Pronominal < Nominal, (C2) 
Nominative < Non-nominative, and (C3) Dative < Accusative 
are ''absolute'' in that their violation gave rise to extremely low 
grammaticality judgments (C1 and C2 were found to have equal 
strength, whereas C3 was very weak). If such constraints 
were ''psychologically real'', it could be assumed, the differences in 
acceptability would be reflected by different corpus frequencies. 
Kempen/Harbusch however found that this is not the case: ''a 
systematic discrepancy emerged between the frequency counts and 
the grammaticality ratings'' (330). The argument orderings that were 
rated average or low were absent from the corpora, i.e. ''the 
grammaticality judgments tend to be more lenient than the corpus 
data'' (337). The authors claim that this discrepancy exists because 
what was rated in Keller's study was actually the discrepancy between 
the to-be-judged argument ordering and the order(s) licensed by 
the ''strict production-based linearization rule'', a mechanism which 
yields equivalent output, i.e. ''the grammaticality ratings appear 
sensitive to the number and seriousness of violations of the rule'' 
(342). There seems to be a critical value, the ''production threshold'', 
which divides the grammaticality continuum. Structures with 
grammaticality values above this threshold will occur in corpora with 
moderate-to-high frequencies; all other structures will have zero or 
very low frequencies.

In 'The Emergence of Productive Non-Medical '-itis': Corpus Evidence 
and Qualitative Analysis' Anke Lüdeling and Stefan Evert use the 
German suffix '-itis' to show that the problem of (morphological) 
productivity can only be understood when different types of evidence - 
quantitative and qualitative - are combined. Medical '-itis' is 
rule-based, or categorial, and therefore fully productive. It was 
originally used in medical contexts with the meaning 'inflammation 
(of)'; it is bound and combines with neoclassical elements denoting 
body parts (e.g. ''Arthritis'' 'inflammation of the joints'). 
Non-medical '-itis' is similarity-based, and difficult to characterise 
in categorial terms. Its meaning can be described as 'doing too much 
of X'; Lüdeling/Evert argue that it likely developed from medical 
'-itis', the meaning of which was generalised to 'illness'. Their 
qualitative analysis of '-itis' 
has shown that there is evidence for two morphological processes 
with different properties. Lüdeling/Evert now use corpus data to find 
out (i) whether both processes differ with respect to productivity - here 
it could be expected that the productivity for the rule-based process 
should be higher, and (ii) whether (and how) the productivity of each 
process changes over time - here one would expect that ''the 
established medical rule-based use of '-itis' does not change over 
time, but non-medical '-itis', which is similarity-based and therefore 
dependent on the stored examples, can show short-term qualitative 
changes as well as changes in productivity'' (356f). They apply and 
discuss different statistical models to test the synchronic and 
diachronic productivity of both types of '-itis'. The quantitative 
properties of the two processes, however, do not confirm the two initial 
hypotheses, which leads Lüdeling/Evert to suggest that morphological 
theory probably does not need to make a distinction between rule-based 
and similarity-based processes (366).

Wiltrud Mihatsch's paper 'Experimental Data vs. Diachronic 
Typological Data: Two Types of Evidence for Linguistic Relativity' 
explores the interaction of perceptual and typological factors in lexical 
change, comparing diachronic data (from a database containing paths 
of lexical change in the domain of body parts in a sample of over 30 
languages) with experimental data from the psycholinguistic literature. 
Lucy (1992) and Imai/Gentner (1997) had found that ''the number 
marking system may influence the categorisation of entities that are 
ambiguous between a classification according to shape and one 
according to substance with respect to their shape'' (373). Speakers 
of languages with obligatory number marking (e.g. English) tend to 
classify according to shape, speakers of languages without obligatory 
number marking (e.g. Japanese) tend to classify such objects 
according to material. Presupposing that ''lexical change reflects 
fossilized categorization processes'' (375), i.e. that concepts are 
always conceptualised via existing labels for other concepts and some 
of these new concepts get lexicalised, Mihatsch looks at whether the 
concepts of EYEBALL, EYELID, EYEBROW, and EYELASH, the words 
for which tend to be less stable and change over time (in contrast to 
e.g. HAIR, EYE, or SKIN), are conceptualised according to substance 
or according to shape in different languages. EYEBALL is virtually 
always named on the basis of round objects, whereas in the case of 
EYELID, EYEBROW, and EYELASH there are different naming 
strategies. EYEBROW and EYELASH, for example, can be 
conceptualised on the basis of HAIR or WOOL, i.e. in terms of material 
(mostly in languages without obligatory plural marking), but also via 
their elongated, arc-like shape (in languages with obligatory plural 
marking). The results indicate a very strong interaction between noun 
type and conceptualisation, and therefore, according to Mihatsch, 
point towards ''a moderate version of linguistic relativity'' (381).

In 'Reflexives and Pronouns in Picture Noun Phrases: Using Eye 
Movements as a Source of Linguistic Evidence' Jeffrey T. Runner, 
Rachel S. Sussman, and Michael K. Tanenhaus first show that native 
speaker judgments on binding in picture NPs, i.e. noun phrases 
headed by a ''representational'' noun such 
as 'photograph', 'picture', 'film', are not solid. Reflexives in picture NPs 
lacking a possessor may violate Binding Theory (BT) (e.g. ''John 
knows that there is a picture of himself in the morning paper''). These 
reflexives have been called logophors (cf. Reinhart/Reuland 1993), 
i.e. ''reflexive noun phrases which are not ... subject to structural 
Binding Theory, but rather are constrained at least in part by 
discourse variables'' (395). Picture NPs with possessors appear to 
show the complementary distribution predicted by BT, but two studies 
by Keller and Asudeh (2001) have shown that native speakers accepted 
reflexives and pronouns bound to the subject of the sentence equally 
in examples like ''Hanna found Peter's picture of herself/her''. 
The three authors then present the results of an experiment that 
investigated the use of reflexives and pronouns in possessed picture 
NPs. In the experiment participants had to work with a display and 
three dolls, Ken, Harry, and Joe, which each had three pictures, one 
of himself and one of each of the others. The participants were then 
presented with potentially ambiguous instructions like ''Have Joe touch 
Ken's picture of himself''. Thus, participants' target choice provided a 
kind of judgment. ''If a participant chooses a picture indicating a 
particular reading, this means that reading is acceptable or possible.'' 
(398) In addition to target choice, the participants' eye movements 
were monitored to see which potential referents they were 
considering. The authors found that ''pronouns in 
picture NPs with possessors are constrained by Binding Theory and 
that reflexives are not'' (403), and that ''instead these reflexives 
behave like logophors'' (404). Runner et al. furthermore show that 
BT ''cannot be viewed as an early filter that constrains the set of 
potential referents'' (408) as BT-inappropriate referents were 
considered early on in the processing for both reflexives and 
pronouns. They conclude with two more general implications of their 
study: (i) reflexives in picture NPs should all be treated as logophors, 
and (ii) their experiment could serve as an example for other studies 
that aim at complementing introspective data with psycholinguistic 
evidence.

Uli Sauerland, Jan Anderssen, and Kazuko Yatsushiro ('The Plural is 
semantically unmarked') first show that the 'Strong Theory' of the 
plural - the plural implies cardinality greater than one and is marked - 
does not hold, and that there are many cases where ''the plural does 
not mean the same as explicitly adding 'two or more''' (414) (consider 
for example ''You're welcome to bring your children'' vs. ''You're 
welcome to bring your two or more children''). Using evidence from 
adult competence and from adult and child performance, the authors 
instead argue for a 'Weak Theory' of the plural, which ''is 
characterized by the assumption that the plural is not subject to an 
inherent lexical restriction as the singular is'' (429). According to 
Sauerland et al. the plural is rather subject to pragmatic comparison 
with the singular, and can therefore not be used in most examples 
where the singular is possible. Their findings, according to the 
authors, imply (i) that ''semantic and morphological markedness need 
to be distinguished'' (430), and (ii) ''that the interpretation of the plural 
always involves an implicit comparison'' (430).

Tanja Schmid, Markus Bader, and Joseph Bayer present the results of 
an experiment based on a questionnaire that compared German 
infinitival non-coherent constructions, where the infinitival complement 
forms an independent constituent which may be extraposed (e.g. ''... 
dass Maria prahlt, alle Verwandten zu kennen'') and coherent 
constructions, where the infinitival complement does not form an 
independent constituent (e.g. ''*... dass Maria scheint, alle Verwandten 
zu kennen''). Their paper 'Coherence - an Experimental Approach' 
addresses the questions (i) whether experimental evidence confirms 
the validity of the (non-)coherence tests and the verb class 
differences proposed in the literature, and (ii) what the factors are that 
give rise to coherence. Four constructions - topicalisation of the verbal 
complex, 'long' scrambling of a pronoun, 'long-distance' passive, and 
wide scope of negation - were used as tests for coherence; two 
configurations - extraposition of the infinitival complement, and narrow 
scope of negation - were used to test non-coherence. The intraposed 
construction, which is assumed to be structurally ambiguous (''... dass 
Max mir [nur das Lexikon zu kaufen] empfohlen hat'' vs. ''... dass Max 
mir nur das Lexikon [zu kaufen empfohlen hat]''), was tested, too. 
Schmid et al. report the following findings: (i) their coherence tests can 
be considered valid as the different results correlate significantly, (ii) 
the ambiguous intraposed construction patterns with the coherence 
tests, and (iii) there is evidence that verbs within a given class behave 
similarly.

In his paper 'Thinking About What We Are Asking Speakers to Do' 
Carson T. Schütze argues that it is important to evaluate the status 
and quality of the various types of linguistic evidence. Specifically he 
asks whether the data obtained from ''naive'' speakers is reliable, 
i.e. ''whether we are asking them to do things that they can 
understand and are capable of doing, and whether we can be 
confident that they are actually doing what we have asked of them'' 
(457). Schütze examines a number of case studies in detail, finding 
that in particular experiments that ''address our questions of interest ... 
directly'' (477), i.e. experiments where the linguist has a particular 
hypothesis in mind, can yield questionable results. Schütze shows that 
these ''bad'' results can have various causes: the instructions for 
the participants were unclear and inconsistent, researchers did not 
take into account that certain ''scenarios'' evoked by their 
elicitation tests could influence the results, or they failed to see 
that factors other than the ones tested influenced the answers of the 
participants. Schütze argues that these 
shortcomings can be overcome by sticking ''as closely as possible to 
the ways in which language is actually used for everyday purposes, 
rather than contriving artificial unfamiliar tasks'' (477) and that 
experiments that are used to gain direct information about underlying 
linguistic knowledge have to be improved.

The question Augustin Speyer pursues in his paper 'A Prosodic Factor 
for the Decline of Topicalisation in English' is whether there is a 
connection between the loss of the verb-second constraint (V2) and 
the decline of topicalisation - ''the movement of a non-subject 
constituent to the left edge of a sentence'' (487) as in ''Beans, John 
likes'' -, which both occurred at about the same time in the history of 
English (starting between 1150 and 1250). The fact that pronouns 
behave differently from full NPs (the use of pronouns in topicalised 
sentences remains stable after a sharp drop after 1250 whereas the 
use of full NPs gradually declines) suggests, according to 
Speyer, ''that the connection might have something to do with one of the 
properties that pronouns have, but not full noun phrases, or vice 
versa'' (490). Speyer then goes on to discuss the pragmatic and 
prosodic properties of topicalised sentences, and introduces a 
constraint which he thinks might have caused the decline of 
topicalisation, the 'Trochaic Requirement' (TR), which indicates 
that ''some weak element ... between two accents is compulsory'' 
(494). In German topicalised constructions this constraint is naturally 
fulfilled, due to V2 (''Bohnen hasst Maria''), but Present Day English 
speakers have to either (i) insert an empty timing slot (after 'beans' 
in ''Beans, John likes''), ''thus creating a dummy weak element'' (496), 
or (ii) avoid topicalised constructions. Speyer argues that the TR 
constraint also held in the history of English. As V2 word order 
became more and more marked in the Middle English period and was 
therefore used less and less, speakers avoided ''accent clash'' by 
avoiding topicalised constructions, and the rate of topicalisations 
decreased. This is confirmed by the fact that pronouns, which are 
naturally weak elements, do not seem to be affected by the avoidance 
of topicalisation. 

There are three different analyses of coordination. The ''deletion 
analysis'' (cf. e.g. Chomsky 1957) assumes that conjuncts are derived 
via a deletion mechanism, e.g. ''[The man is carrying the ladder] and 
[THE MAN IS CARRYING the bucket]'' (caps indicate deleted material). 
In the ''phrasal analysis'', ''coordinate phrases ... are base-generated 
directly by phrase structure rules'' (507), which either results in multi-
headed constructions (cf. e.g. Jackendoff 1977) or in analyses that 
treat conjunctions as heads (cf. e.g. Kayne 1994). The ''node-sharing 
analysis'' allows for three-dimensional syntactic-structures with single 
nodes being shared by more than one phrase marker (cf. e.g. 
Moltmann 1992). Using data from two comprehension studies in 
agrammatism, and data from reading-time experiments, Ilona Steiner 
('On the Syntax of DP Coordination: Combining Evidence from 
Reading-Time Studies and Agrammatic Comprehension') aims at 
finding out which of the three analyses is most plausible. The results 
of the two comprehension studies in agrammatism allowed her to 
discard the deletion approach; the reading-time data provided 
evidence for the node-sharing analysis and allowed her to distinguish 
between a phrasal analysis and the node-sharing analysis. Taken 
together, both types of evidence indicated that the node-sharing 
analysis is most plausible.

The paper 'Lexical Statistics and Lexical Processing: Semantic 
Density, Information Complexity, Sex, and Irregularity in Dutch' by 
Wieke M. Tabak, Robert Schreuder, and R. Harald Baayen combines 
a survey of the distributional properties of regular and irregular verbs 
in Dutch with an experimental lexical decision study, which 
addressed how well these properties predict lexical processing 
in reading. The authors established various factors with the help of 
which the regularity of a verb can be predicted, e.g. lemma frequency, 
family size, neighbourhood density, argument structures, auxiliaries, 
inflectional entropy, noun-verb frequency ratio, spoken-written 
frequency ratio. To test whether these systematic differences between 
regular and irregular verbs are reflected in on-line processing, the 
authors conducted a lexical decision study the results of which 
challenge many previous hypotheses about regular vs. irregular 
verbs. Tabak et al. for example found that error analysis and 
response latencies pointed to a processing advantage for regulars. In 
both analyses, this advantage was most prominent for past tense 
forms. This finding challenges Pinker's model (1991, 1997), which 
predicts that ''regulars should be more difficult to process than 
irregulars, because regulars would require decomposition into stem 
and affix in addition to lexical lookup, and therefore should elicit longer 
instead of shorter latencies'' (550). The more general picture that, 
according to Tabak et al., emerged from the study is ''that the 
distinction between regular and irregular verbs is not a simple one. 
Regulars and irregulars differ not only with respect to their formal 
properties, but also with respect to their semantic properties and the 
information structure of their inflectional paradigms'' (552). The 
authors conclude that ''the fascinating and enigmatic phenomenon of 
regularity and irregularity in the mental lexicon'' (552) requires further 
investigation.

In his paper 'The Double Competence Hypothesis: Diachronic 
Evidence' Helmut Weiß shows how the ''writing-competence'' that 
underlies the production of historical texts (which are performance 
data) can be modelled by combining two independently developed 
approaches to theoretical and historical linguistics: the double 
competence hypothesis (cf. e.g. Kroch 2001) - which assumes that 
the competence underlying writing (''second order natural languages'' 
(N2)) is different from the competence underlying speaking (''first 
order natural languages'' (N1)) since (i) it is acquired later and 
independently of the latter, and (ii) it is functionally different - and the 
hypothesis that there are several grades in languages' naturalness 
(cf. e.g. Ferguson 1959), which assumes that in a monolingual speech 
community the low variety (often a dialect) is acquired as native 
language and spoken in everyday communication, whereas the high 
variety is learned as second, non-native language, and only used in 
writing and formal communication. In the 14th and 15th centuries, 
when NHG started to evolve, the distance between these two 
competences was still very great, whereas in the 19th and 20th 
centuries, when NHG first became spoken and was acquired as native 
language, the distance began to decrease. Weiß shows that 
the ''mixed language'', which is characteristic of OHG texts, is a 
consequence of a diglossic double competence, and ''that a historical 
syntactic pattern can be analysed in three ways: as the output of (i) 
the N1 competence, (ii) the N2 competence, or (iii) as a hybrid form'' 
(570). He concludes with the claim that in modern historical linguistics 
combining quantitative and theoretical tools is ''the right and only way 
to overcome the weaknesses of diachronic data in general and the 
consequences of double competence'' (571).

EVALUATION

Most papers in the volume 'Linguistic Evidence' address issues 
concerning linguistic evidence in relation to specific linguistic 
problems, using and combining various data types (experimental data 
and corpus data are perhaps the most frequently used data types 
here). The volume shows that the question of how to gain linguistic 
evidence is (or should be) important for all linguists and that linguists 
can only gain when they use more than one data type. Evidence 
involving more than one type of data provides a different and 
definitely more comprehensive perspective on a given linguistic 
phenomenon, whether it confirms one's hypothesis or contradicts it. 
There are only a few papers that explicitly address methodological and 
theoretical questions concerning linguistic evidence (e.g. Featherston, 
Kempen/Harbusch and Schütze), but as linguistic evidence is quite a 
new topic of linguistic discussion it may well be hoped that we will see 
more work on the theory and methodology of linguistic evidence in the 
near future.

REFERENCES

Chomsky, Noam (1957) Syntactic Structures. The Hague: Mouton.

Ferguson, Charles (1959) 'Diglossia'. In: Word 15, 325-340.

Fillmore, Charles J. (1992) '''Corpus Linguistics'' or ''computer-aided 
armchair linguistics'''. In: Jan Svartvik (ed.) Directions in Corpus 
Linguistics: Proceedings of Nobel Symposium 82, Stockholm, 4-8 
August, 1991. Berlin: de Gruyter, 35-60.

Haegeman, Liliane (1995) The Syntax of Negation [=Cambridge 
Studies in Linguistics 75]. Cambridge: Cambridge University Press.

Imai, Mutsumi; Gentner, Deirdre (1997) 'A Cross-Linguistic Study of 
Early Word Meaning: Universal Ontology and Linguistic Influence'. In: 
Cognition 62, 169-200.

Kayne, Richard (1994) The Antisymmetry of Syntax. Cambridge, MA: 
MIT Press.

Keller, Frank (2000) Gradience in grammar: Experimental and 
computational aspects of degrees of grammaticality. Ph.D. thesis. 
University of Edinburgh.

Keller, Frank; Asudeh, Ash (2001) 'Constraints on linguistic 
coreference: Structural vs. pragmatic factors'. In: Moore, 
J.D./Stenning, K. (eds.) Proceedings of the 23rd Annual Conference 
of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum.

Kroch, Anthony S. (2001) 'Syntactic Change'. In: Baltin, Mark/Collins, 
Chris (eds.) The Handbook of Contemporary Syntactic Theory. 
Oxford: Blackwell, 699-729.

Jackendoff, Ray (1977) X' Syntax. Cambridge, MA: MIT Press.

Lightfoot, David (1999) The Development of Language: acquisition, 
change and evolution. Oxford: Blackwell.

Lucy, John A. (1992) Grammatical Categories and Cognition: A Case 
Study of the Linguistic Relativity Hypothesis. [Studies in the social and 
cultural foundations of language 13]. Cambridge: Cambridge 
University Press.

Moltmann, Friederike (1992) Coordination and Comparatives. 
Cambridge, MA: MIT Press.

Piñango, Maria; Zurif, Edgar; Jackendoff, Ray (1999) 'Real-time 
processing implications at the syntax-semantics interface'. In: Journal 
of Psycholinguistic Research 28 (4), 395-414.

Pinker, Steven (1991) 'Rules of language'. In: Science 253, 530-535.

Pinker, Steven (1997) 'Words and rules in the human brain'. In: 
Nature 387, 547-548.

Reinhart, Tanya; Reuland, Eric (1993) 'Reflexivity'. In: Linguistic 
Inquiry 24, 657-720.

Schütze, Carson T. (1996) The Empirical Basis of Linguistics: 
Grammaticality Judgments and Linguistic Methodology. Chicago: 
University of Chicago Press.

Sprouse, Rex; Vance, Barbara (1999) 'An explanation for the decline 
of null pronouns in certain Germanic and Romance languages'. In: 
DeGraff, Michael (ed.). Language Creation and Language Change: 
Creolization, Diachrony and Development. Cambridge, MA: MIT Press, 
257-284.

Todorova, Marina; Straub, Kathleen; Badecker, William; Frank, Robert 
(2000) 'Aspectual coercion and the on-line computation of sentential 
aspect'. In: Proceedings of the twenty-second annual conference of 
the Cognitive Science Society. Philadelphia, PA. 

ABOUT THE REVIEWER

Elke Gehweiler is a research associate in the project Collocations in 
the German Language at the Berlin-Brandenburgische Akademie der 
Wissenschaften, Berlin, Germany, and in a project on 
grammaticalization at the Freie Universität Berlin, where she is 
currently preparing her Ph.D. thesis on the grammaticalization of 
adjectives in English and German.
