LINGUIST List:  Vol-14-1481. Thu May 22 2003. ISSN: 1068-4875.

Subject: 14.1481, Review: Cognitive Science/Psycholing: Briscoe (2002)

Moderators: Anthony Aristar, Wayne State U.<aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Naomi Ogasawara <naomi at linguistlist.org>
 ==========================================================================
What follows is a review or discussion note contributed to our Book
Discussion Forum.  We expect discussions to be informal and
interactive; and the author of the book discussed is cordially invited
to join in.

If you are interested in leading a book discussion, look for books
announced on LINGUIST as "available for review." Then contact
Simin Karimi at simin at linguistlist.org.



=================================Directory=================================

1)
Date:  Thu, 22 May 2003 09:23:08 +0000
From:  Estival, Dominique <Dominique.Estival at dsto.defence.gov.au>
Subject:  Linguistic Evolution through Language Acquisition

-------------------------------- Message 1 -------------------------------

Date:  Thu, 22 May 2003 09:23:08 +0000
From:  Estival, Dominique <Dominique.Estival at dsto.defence.gov.au>
Subject:  Linguistic Evolution through Language Acquisition

Briscoe, Ted, ed. (2002). Linguistic Evolution through Language
Acquisition: Formal and Computational Models. Cambridge University
Press. Hardback ISBN 0-521-66299-0, vii+349pp.

Announced at http://linguistlist.org/issues/13/13-2378.html


Dominique Estival, Defence Science and Technology Organisation, Australia.

I cannot phrase the book's purpose and contents better than the dust
jacket itself: ''the volume proceeds from the basis that we should
address not only the language faculty per se, but also the origins and
subsequent development of languages themselves; languages evolve via
cultural rather than biological transmission on a historical rather
than genetic time scale. The book is distinctive in utilizing
computational simulation and modeling to ensure that the theories
constructed are complete and precise.

Drawing on a wide range of examples, the book covers the why and how
of specific syntactic universals; the nature of syntactic change; the
language-learning mechanisms needed to acquire an existing linguistic
system accurately and to impose further structure on an emerging
system; and the evolution of language(s) in relation to this learning
mechanism.''

The audience for this volume will primarily be advanced graduate
students and researchers in language evolution, language learning, and
computational models of language learning and of language change. It
would not be suitable as a text-book or an introduction to any of
these fields, but could be used as supplementary reading or reference
material for advanced courses in these disciplines.

The book consists of 10 chapters, with the first being an introduction
by the editor, Ted Briscoe, and the last by James Hurford, concluding
with a comparison of several systems, including 2 described in this
volume (Kirby and Batali). All the chapters, except the Introduction,
describe experiments involving autonomous computational agents
learning to communicate through some form of language and propose
hypotheses concerning the emergence and evolution of language based on
the results of those experiments. Following the usual book review
format for LINGUIST, I will first summarize the main points of each
chapter, before giving an overall evaluation of the volume as a
whole. However, my summaries of Ch.1 and Ch.10 also contain more
general observations about the other contributions.

1. Introduction (Ted Briscoe)

This first chapter lays out the main themes and issues for the
volume. TB emphasises the ''centrality of acquisition to insightful
accounts of language emergence, development, variation and change''
(p.19) and advocates an evolutionary perspective on language. The
methodology followed by all the contributors assumes that
speakers/hearers are ''language agents'' and all take languages as
dynamical systems, in fact complex adaptive systems. The following
quotation from Deacon (1997) sets the scene: this is about language
evolution, how this evolution was set in motion and how it is played
out every time a child learns to use language.  ''Languages have had
to adapt to children's spontaneous assumptions about communication,
learning, social interaction, and even symbolic reference, because
children are the only game in town... languages need children more
than children need languages''. (p.14)

One of the main hypotheses explored throughout the volume is that
''linguistic universals need not be genetically-encoded constraints,
but instead may just be a consequence of convergent evolution towards
more learnable grammatical systems.'' (p.14). Another controversial
(cf. Bickerton 1998) hypothesis explored by several contributors is
that of genetic assimilation in the evolution by natural selection of
the human language learning procedure. This is rejected by Worden
(Ch.4) who argues that there is no specific language faculty, and no
natural selection for language evolution. On the other hand, Turkel
(Ch.8) argues that genetic assimilation is needed to explain
convergence on a shared language faculty. In the same vein, Briscoe
(Ch.9) proposes a model which integrates Bayesian learning with a
Principles and Parameters (P&P, Chomsky 1981) account of language
acquisition and in which the language faculty would be refined by
genetic assimilation. A related issue discussed by a number of
contributors concerns the effects which a universally shared and
preadapted language learning procedure would have on the evolution of
language itself.

In the section on methodological issues, TB discusses the limitations
and benefits of using computer simulation and modeling, and argues
that ''methodologically rigorous simulation is a critical and
indispensable tool in the development of evolutionary dynamical models
of language'' (p.16).

2. Learned systems of arbitrary reference: The foundation of human
linguistic uniqueness (Michael Oliphant)

MO makes a distinction between symbolic and non-symbolic, and between
innate and learned, signaling behavior. His main thesis is that
''human language is the only existing system of learned symbolic
communication'' and he explores the computational problems involved in
supporting such a learned system of communication, hypothesizing that
''maybe the problem of transmitting a learned symbolic system is the
primary factor limiting the evolution of the language ability''
(p.49). The model he proposes is a learning simulation with no
feedback and no reinforcement, using Hebbian networks (cumulative
association networks plus lateral inhibition). Like the other models
presented in this volume, it assumes that the agents have access to
signal/meaning pairs, but MO is particularly concerned with the
question of how the agents can observe meaning. The main conclusion is
that it is easier to learn an iconic or indexical system than a
symbolic system and that humans must be able to use context in order
to learn human languages.
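
To give a concrete sense of this style of learner, here is a minimal
sketch in Python (my own illustration, not MO's code: the toy sizes,
the update constants and the identity teacher are all assumptions) of
purely observational Hebbian learning with lateral inhibition:

    import numpy as np

    N = 5                                  # meanings and signals (toy sizes)
    weights = np.zeros((N, N))             # association matrix: meaning x signal

    def observe(meaning, signal):
        # Hebbian update: strengthen the observed pair while laterally
        # inhibiting rival signals for this meaning and rival meanings
        # for this signal. No feedback or reinforcement is involved.
        weights[meaning, :] -= 0.1
        weights[:, signal] -= 0.1
        weights[meaning, signal] += 1.2    # net +1.0 for the observed pair

    def produce(meaning):
        return int(np.argmax(weights[meaning, :]))

    def interpret(signal):
        return int(np.argmax(weights[:, signal]))

    # The learner merely watches a teacher whose system maps meaning i
    # to signal i; access to such signal/meaning pairs is precisely the
    # assumption MO scrutinizes.
    rng = np.random.default_rng(0)
    for _ in range(200):
        m = int(rng.integers(N))
        observe(m, m)
    print([produce(m) for m in range(N)])  # [0, 1, 2, 3, 4]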

3. Bootstrapping grounded word semantics (Luc Steels and Frederic
Kaplan)

S&K experiment with a population of ''visually grounded robotic agents
capable of bootstrapping their own ontology [...] while playing a
language game''. This is called the 'Talking heads experiment', where
the robots try to guess what the other robots are trying to
communicate (the 'guessing game'). Using what they call 'semiotic
dynamics', S&K make a clever use of the distinction between 'meaning'
(sense, an internal state of the agent) and 'referent' (the world
objects) in the representations their agents use: this is an RMF
(referent/meaning/form) graph representation -- instead of the usual
meaning/form pairs of most other models. The question of how words get
their meaning, an issue also raised in Ch.2. by MO, becomes here the
question of how agents acquire word-meaning and meaning-object
relations. The solution proposed is that a conceptualisation module
creates an ontology from the environment while a verbalisation module
creates a lexicon. In the set of experiments reported here, the agents
have ''very minimal forms of intelligence'' but usually converge on a
shared language. An interesting side effect which is explored is how
synonymy and ambiguity arise as emergent properties in the lexicon.
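
The flavor of the guessing game is easy to convey in a short sketch
(mine, not S&K's implementation: it collapses their referent/meaning
distinction into atomic meanings, and the score constants and repair
step are invented for illustration):

    import random

    class Agent:
        def __init__(self):
            self.lexicon = {}                  # (meaning, word) -> score

        def word_for(self, meaning):
            pairs = [(w, s) for (m, w), s in self.lexicon.items() if m == meaning]
            if not pairs:                      # no word yet: invent one
                word = 'w%04d' % random.randrange(10000)
                self.lexicon[(meaning, word)] = 0.5
                return word
            return max(pairs, key=lambda p: p[1])[0]

        def meaning_for(self, word):
            pairs = [(m, s) for (m, w), s in self.lexicon.items() if w == word]
            return max(pairs, key=lambda p: p[1])[0] if pairs else None

        def reward(self, meaning, word):
            # Strengthen the winning pair and dampen its competitors;
            # this is what drives synonymy and ambiguity down over time.
            key = (meaning, word)
            self.lexicon[key] = min(1.0, self.lexicon.get(key, 0.5) + 0.1)
            for k in list(self.lexicon):
                if k != key and (k[0] == meaning or k[1] == word):
                    self.lexicon[k] = max(0.0, self.lexicon[k] - 0.1)

    agents = [Agent() for _ in range(5)]
    meanings = ['red', 'green', 'blue']        # stand-ins for perceived objects
    for _ in range(3000):
        speaker, hearer = random.sample(agents, 2)
        topic = random.choice(meanings)
        word = speaker.word_for(topic)
        if hearer.meaning_for(word) == topic:  # the hearer's guess succeeds
            speaker.reward(topic, word)
            hearer.reward(topic, word)
        else:                                  # repair: the topic is revealed
            hearer.lexicon[(topic, word)] = max(
                hearer.lexicon.get((topic, word), 0.0), 0.5)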

4. Linguistic structure and the evolution of words (Robert Worden)

In this very well-written and pleasant-to-read chapter -- no linguist
could resist such statements as ''The full OED is a museum of (mainly
extinct) word species'' (p.92) -- RW proposes a theory of language
learning formulated in the framework of unification-based grammar, in
which the evolution of word feature structures (memes) is driven by
the selection pressure of use. In this analogy, ''each language is an
ecology and each word is one species in the ecology'' and ''language
change can be regarded as a form of evolution - not of the language
itself, but of the individual words which constitute the language''
(p.75). The only concepts used in this model are those of feature
structure, unification and generalization (the complement of
unification). The slogan here is ''Unify to use, Generalize to learn''
(p.80), and RW claims that generalization learning offers an effective
solution to the problem of ''noisy'' input (p.84).
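
Both operations are simple enough to sketch over feature structures
represented as nested dictionaries (an illustrative reconstruction,
not RW's code; real unification grammars also need variables and
reentrancy, which this toy omits):

    def unify(fs1, fs2):
        """Unify two feature structures; return None on a clash."""
        result = dict(fs1)
        for feat, val in fs2.items():
            if feat not in result:
                result[feat] = val
            elif isinstance(result[feat], dict) and isinstance(val, dict):
                sub = unify(result[feat], val)
                if sub is None:
                    return None
                result[feat] = sub
            elif result[feat] != val:
                return None                    # value clash
        return result

    def generalize(fs1, fs2):
        """Keep only what fs1 and fs2 share -- the complement of
        unification, and the basis of 'Generalize to learn'."""
        result = {}
        for feat in fs1.keys() & fs2.keys():
            v1, v2 = fs1[feat], fs2[feat]
            if isinstance(v1, dict) and isinstance(v2, dict):
                sub = generalize(v1, v2)
                if sub:
                    result[feat] = sub
            elif v1 == v2:
                result[feat] = v1
        return result

    dog = {'cat': 'N', 'num': 'sg', 'sem': {'animate': True}}
    dogs = {'cat': 'N', 'num': 'pl', 'sem': {'animate': True}}
    print(generalize(dog, dogs))     # {'cat': 'N', 'sem': {'animate': True}}
    print(unify(dog, {'num': 'pl'}))   # None: number clash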

From the point of view of the evolution, or emergence, of language, RW
claims that ''for many features of language, there is no need to
suppose that they reflect any innate structure in the brain'' (p.76)
and that this model leads to a simpler theory of the human mind. He
assumes that the feature structures used by the brain for language
have evolved from the feature structures used for primate social
intelligence and that the learning mechanisms and inference mechanisms
are the same for language and for social intelligence (p.89).

From the point of view of language change, the hypothesis is that
''words replicate via unification and generalization, they evolve over
generations; as words evolve, the language changes'' (p.91). RW lists
6 selection factors in the evolution of words: useful meaning,
productivity, economy, minimum ambiguity, ease of learning and social
identification, and proposes that ''a language universal may arise
just from the convergent evolution of words'' (p.92).

As an alternative to Hawkins's (1994) account of the rise of language
universals, which attributes them to the need for ''ease of parsing'',
RW proposes the ''need to minimise ambiguity''. However, I was not
convinced that this is a case of either/or: the preference for short
Constituent Recognition Domains put forward by Hawkins may also lead
to better ambiguity resolution, not just to ease of
parsing. Nevertheless, RW accounts successfully for the mixture of
regularity and irregularity found in languages, and claims to do so
better than P&P, for which the 'core' is regular, and irregularity has
to be at the 'periphery': in his model, irregularity is fully
supported since the theory is fully lexicalized. Three selection
factors -- productivity, minimum ambiguity and ease of learning -- lead to
an increase in language regularity (p.101) and the theory can also
explain the distribution of irregularity (p.103).

I found the section discussing speed limits for evolution a bit dry,
and too speculative in some places. For instance, statements such as
''humans typically produce only about 3 offspring'' or ''the selection
pressure for language proficiency is not more than about 20%'' (p.105)
leave the reader wondering where those numbers come from. However, it
contains interesting speculations about the rate of change for
language and the rate at which the language faculty would evolve. The
conclusion is that languages change too fast to force the language
faculty to adapt.

In summary, RW proposes a theory based on the 3 concepts of feature
structure, unification and generalization which aims to explain how
languages evolve and are learned, i.e. word by word. This model is
also shown to account for diverse syntactic means used to define
semantic roles, for domains of syntactic regularity and irregularity,
and for some language universals. One of the main theses of this paper
is that the structure of language reflects the functional requirements
for language, not language-specific structures in the human brain
(p.103).

5. The negotiation and acquisition of recursive grammars as a result
of competition among exemplars (John Batali)

As the title indicates, this paper is an ''investigation of how
recursive communication systems come to be'' and of how such systems
can emerge. JB argues that conventional communication systems are
learned through negotiation, and that language agents use structure to
construct meaning representations. JB makes the strong claim that
''the ability to create and manipulate mappings between structures of
representations [...] is prior to, and independent of, language''
(p.118)

In the model presented here, there are no rules or principles, but
learners build 'exemplars' and a communication system emerges through
competition between the exemplars. The computational model is based on
'meanings' (internal representations of situations) expressed as
formula sets, and paired with 'signals' (sequences of characters),
expressed as strings, through 'mappings' expressed as phrases which
have numeric cost attached to them. Phrases can be simple tokens or
complex phrases (using binary branching). At the start of negotiation,
the agents have no exemplars at their disposal, but they have the
ability to use embedded phrase structure. They can only communicate
with each other when they have negotiated a shared system.
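
A rough sketch of the data structure may help (my own reading of the
paper: the greedy cheapest-cover strategy and the costs are
illustrative simplifications of JB's competition mechanism):

    from dataclasses import dataclass

    @dataclass
    class Exemplar:
        meaning: frozenset    # a set of formulae, e.g. {'chase(x,y)'}
        signal: str           # the character string paired with the meaning
        cost: float           # competition: cheaper exemplars are preferred

    def express(target, exemplars):
        # Cover the target meaning by concatenating the cheapest
        # exemplars whose formulae are part of it.
        remaining, parts = set(target), []
        for ex in sorted(exemplars, key=lambda e: e.cost):
            if ex.meaning <= remaining:
                parts.append(ex.signal)
                remaining -= ex.meaning
        return ''.join(parts) if not remaining else None

    lexicon = [
        Exemplar(frozenset({'snake(x)'}), 'ko', 0.5),
        Exemplar(frozenset({'chase(x,y)'}), 'du', 0.8),
        Exemplar(frozenset({'snake(x)', 'chase(x,y)'}), 'tiga', 2.0),
    ]
    # The complex phrase 'kodu' out-competes the holistic 'tiga':
    print(express(frozenset({'snake(x)', 'chase(x,y)'}), lexicon))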

An interesting aspect of this paper is the way communicative accuracy
is measured, based on precision and recall.  However, JB also points
out that ''the agents' success at communication comes not from their
possessing identical sets of exemplars, but from regularities in the
structure of phrases constructed from their exemplars'' (p.144).

Another aspect of the paper which will be of interest to linguists is
JB's analysis of the emergent negotiated systems, a ''cartoon version
of linguistic fieldwork'' (p.144), which turns out to reveal some
fascinating aspects of those systems. In particular, partitioning
variables lead to a rudimentary alignment of syntax/semantics
(p.146-147), empty tokens are created by the language agents and used
as argument markers and as end markers for sequences of properties in
a way which seems strikingly similar to subordination and relative
clause syntax (p.151-153), reflexive markers are used to mark the
collapsing of arguments, and inverting argument markers function in a
way similar to passive (p.154). Finally, although JB calls them
''extremely simple'' because there is no embedding of meaning and no
semantic or syntactic categories, these systems also exhibit some
amount of constituent ordering (p.162-163). A study of these
negotiated systems also ''provides a simple account of how regular and
irregular forms can coexist'' (p.12). Furthermore, like MO and S&K, JB
argues that ''learnability depends on the learners having access to
the meaning of the string'' (p.166).

JB concludes that ''attention should be directed at understanding the
influences that learning mechanisms can have on the languages that
emerge from their use'' (p.168) and suggests that ''a significant
fraction of the syntactic phenomena used in human languages can, in
principle, emerge among agents with the cognitive and social
sophistication required to instantiate the model'' (p.170).

6. Learning, bottlenecks and the evolution of recursive syntax (Simon
Kirby)

In his model for the evolution of language, SK addresses the
interaction of two unique aspects of human language, the way it is
learned and its syntactic structure. With respect to learning, SK
claims that human language is different from animal communication in
that ''some of the content of the mapping (from meanings to signals)
is learned by children through observation of others' use of
language'' (p.173). Concerning the structure of language, SK focuses
on the two properties of compositionality, whereby an expression's
meaning is a function of the meanings of parts of that expression and
the way they are put together, and recursion, the property of
languages with finite lexica and rule sets in which some constituent
of an expression can contain a constituent of the same category.  SK's
conclusion is that ''the basic structural properties of language such
as recursion and compositionality will inevitably emerge over time
through the complex dynamical process of cultural transmission,
i.e. without being built in to a highly constraining innate LAD
[Language Acquisition Device]'' (p.174).

Unlike other experiments with larger populations of agents reported in
this volume, the simulation in SK's set of experiments contains only
two agents, an adult speaker and a new learner. At the end of the
first cycle, the grammar is non-compositional and non-syntactically
structured, but successive cycles lead to the emergence of both
internal structure and categories (p.188). SK doesn't point this out,
but the emergent categories are strongly reminiscent of Categorial
Grammar categories. The next step in the experiment is to allow for
infinite languages by including predicates which may take other
predicates as arguments. This leads to a dramatic reduction in size of
the grammar.
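
The transmission loop itself is easy to sketch. The following toy
(mine, not SK's implementation, whose induction algorithm does far
more sophisticated chunking over strings and meanings) shows how a
semantic bottleneck lets component-wise rules out-replicate
holistically memorized strings:

    import itertools
    import random

    MEANINGS = list(itertools.product(range(3), range(3)))  # (agent, action)
    SYLLABLES = ['pa', 'ti', 'ku', 'mo', 'se', 'ra']

    def invent():
        return random.choice(SYLLABLES) + random.choice(SYLLABLES)

    def learn(observations):
        # Toy induction: where all observed strings agree on a prefix for
        # the first meaning component (or a suffix for the second), adopt
        # it as a rule. Rules generalize to unseen meanings, so they pass
        # through the bottleneck more reliably than memorized strings.
        prefixes, suffixes = {}, {}
        for (a, b), s in observations:
            prefixes.setdefault(a, set()).add(s[:2])
            suffixes.setdefault(b, set()).add(s[2:])
        rules_a = {a: v.pop() for a, v in prefixes.items() if len(v) == 1}
        rules_b = {b: v.pop() for b, v in suffixes.items() if len(v) == 1}
        return rules_a, rules_b, dict(observations)

    def speak(grammar, meaning):
        if grammar is None:                    # the very first speaker
            return invent()
        rules_a, rules_b, memory = grammar
        a, b = meaning
        if a in rules_a and b in rules_b:
            return rules_a[a] + rules_b[b]     # compositional production
        if meaning in memory:
            return memory[meaning]             # holistic exemplar
        return invent()

    grammar = None
    for generation in range(100):
        observed = random.sample(MEANINGS, 6)  # the semantic bottleneck
        grammar = learn([(m, speak(grammar, m)) for m in observed])
    print(grammar[0], grammar[1])              # a compact two-part code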

SK makes use of the distinction between I-language, the language
user's knowledge about language, and E-language, the set of observable
utterances. An important concept in this model is that of
'replicators', I-language units or rules that may or may not persist
through time. A more general rule, which can express more meaning, is
a better replicator than an idiosyncratic rule, even though it is not
learned as easily as a more idiosyncratic rule. This explains the
success of I-languages which contain general rules.

Another important concept in this paper is that of the mapping between
spaces, here the space of I-language and the meaning space. SK argues
that it is the structure of the mapping between spaces which is
important, not the syntactic structures of the language
itself. ''Constraints on variation are not built into the learner, but
are instead emergent properties of the social dynamics of learned
communication systems and the structure of the semantic space that the
individuals wish to express'' (p.196-197).

SK claims that his model gives a ''neat explanation of why human
languages use syntactic structure to compositionally derive semantics,
use recursion to express infinite distinctions in a digital way, have
words with major categories such as noun and verb, and use syntactic
rules of realization (such as ordering rules) to encode meaning
distinctions'' (p.197) and the sections describing these should be of
special interest to linguists.

7. Theories of cultural evolution and their application to language
change (Partha Niyogi)

In this chapter, PN looks at language change rather than language
evolution, i.e. not how language could have developed phylogenetically
but how and under what constraints it can change historically. He
addresses the ''problem of characterizing the evolutionary dynamics of
linguistic populations over successive generations'' (p.205) using the
framework of Cavalli-Sforza and Feldman (1981) [CF] for the treatment
of cultural evolution, and maps that approach onto that of Niyogi and
Berwick (1995, 1997) [NB] for language change, with the goal of
showing how the P&P approach to grammatical theory can be made
amenable to the CF framework. The comparison between the two
approaches of CF and NB is interesting in itself and one result is the
observation that the essential difference between their two update
rules arises from different assumptions made in the modeling process:
NB assume that all children receive input from the same distribution
while CF assume that children can be grouped into 4 classes depending
on their parental types. The latter approach is better able to arrive
at alternative evolutionary dynamics.

PN shows that the Triggering Learning Algorithm (TLA, Gibson & Wexler
1994) can be integrated within the NB model and yield the dynamics of
linguistic populations under the CF model. PN's goal is to
''characterize the dimensions along which human languages change over
time and explain why they do so'' (p.205) especially under conditions
of language contact. The second modeling experiment reported in this
paper is grounded in a historical example: the evolution of Old
English into Modern English and the syntactic changes associated with
the different settings of the Head Parameter and the V2 Parameter. The
presentation of the data (from Gibson & Wexler 1994) does not make it
clear that the +V2 parameter is more than just a restriction on
surface word order (i.e., a finite verb must be in 2nd position in
root clauses), but a shorthand for an analysis that allows a
non-subject to raise to Spec, C position and the finite verb to move
to C (Wexler, p.c.).  Nevertheless, the analysis of this particular
historical change is quite interesting and the simulations make
predictions which should be pursued for the study of language
evolution.
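
For readers unfamiliar with the TLA, here is a minimal sketch (my
own: the 'sentences' are degenerate one-parameter triggers, so unlike
the real TLA over grammars this toy has no ambiguity and none of the
local maxima Gibson & Wexler discuss):

    import random

    N = 3                              # binary parameters, e.g. head/spec/V2
    TARGET = (1, 0, 1)                 # the grammar of the adult population

    def utter():
        # A degenerate 'sentence' expressing the surface effect of one
        # randomly chosen parameter of the target grammar.
        i = random.randrange(N)
        return (i, TARGET[i])

    def parses(grammar, sentence):
        i, value = sentence
        return grammar[i] == value

    def tla_step(grammar, sentence):
        # TLA: on a parse failure, flip ONE randomly chosen parameter and
        # keep the flip only if the new grammar parses the input (the
        # greediness and single-value constraints).
        if parses(grammar, sentence):
            return grammar
        i = random.randrange(N)
        flipped = grammar[:i] + (1 - grammar[i],) + grammar[i + 1:]
        return flipped if parses(flipped, sentence) else grammar

    learner = (0, 0, 0)
    for _ in range(100):
        learner = tla_step(learner, utter())
    print(learner)                     # almost always converges to TARGET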

In a final section, PN reports on experiments to partition the
children population according to the distribution of languages they
are exposed to. If the exposure is limited to the languages spoken by
the parents, there are 4 discrete possibilities; if the distribution
models neighborhood, there is a linear map. PN then shows that the
reorganisation into homogeneous neighborhoods leads to interesting
predictions.

8. The learning guided evolution of natural language (William Turkel)

The main thesis of this paper is that ''even if one accepts the claim
that language cannot exist in intermediate forms, it is still possible
that it could have evolved via natural evolution'' (p.235). To argue
this position, WT makes two assumptions: 1) humans have some degree of
plasticity and are capable of learning; 2) successful communication
confers a reproductive advantage, i.e. language is adaptive as long as
it is shared (p.235). While Pinker & Bloom (1990) argue that there is
a continuum of viable communicative systems and that species can
gradually increase in communicative ability over time, the assumption
that there can be no intermediate forms would seem to lead to the
conclusion that language is not a product of selection but rather of
preadaptation or exaptation. However, WT argues for the possibility of
individual learning after evolutionary search, as 'learning guided
evolution'. Although WT stresses that this is not to be confused with
Lamarc kianism, but is due to the 'Baldwin effect' (Baldwin 1896),
whereby ''learning can affect the direction of evolution'' and ''the
learned behavior can become genetically determined'', I must admit
that the difference seemed rather tenuous and could have been better
clarified. However, Pinker (1994), Briscoe (2000) and Deacon (1997)
all argue that the Baldwin effect must have played a role in the
evolution of human language and Hinton & Nowlan (1987) provide an
explanation by showing that ''the adaptations learned during a
lifetime of an organism guide the course of evolution by altering the
shape of the search space'' (p.238). Furthermore, although it has been
argued that arbitrariness is evidence against adaptation, WT claims
that this is a misguided argument and that with complex objects, such
as natural languages, history plays a more important role. In fact, as
long as arbitrariness is shared, it can be adaptive.

WT then presents a set of simulations of a P&P system, with a variant
of Hinton & Nowlan's model. An important aspect of this simulation is
the distinction between fixed vs. plastic parameters. Here all P&P
parameters are plastic, i.e., they can be learned (given starting
values of ''?'' rather than ''0'' or ''1''). The simulations show the
Baldwin effect and demonstrate that learning can accelerate the
evolutionary process. The results also indicate that the amount of
plasticity (i.e., the number of parameters set at ''?'' at the
beginning of the simulation) was inversely proportional to the speed
with which the population converged to a single genotype (i.e. a set
of parameter settings). From a linguistic point of view, convergence
to ''0'' or ''1'' represents the evolution of a principle of grammar,
while convergence to ''?'' represents the evolution of a true
parameter (p.248).
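
The underlying scheme is easy to sketch (this is the plain Hinton &
Nowlan set-up, not WT's P&P variant; the population size, genome
length and fitness constants are illustrative):

    import random

    L = 10                             # 'parameters' in the language faculty
    TARGET = ['1'] * L                 # the single adaptive configuration
    TRIALS = 50                        # learning attempts per lifetime

    def fitness(genotype):
        # Hard-wired positions must match the target; plastic ('?')
        # positions may be set by lifetime learning, and finding the
        # target early earns higher fitness.
        if any(g != '?' and g != t for g, t in zip(genotype, TARGET)):
            return 1.0                 # a wrong hard-wired value: hopeless
        n_plastic = genotype.count('?')
        for trial in range(TRIALS):
            if all(random.random() < 0.5 for _ in range(n_plastic)):
                return 1.0 + 19.0 * (TRIALS - trial) / TRIALS
        return 1.0

    def crossover(p1, p2):
        cut = random.randrange(1, L)
        return p1[:cut] + p2[cut:]

    population = [[random.choice('01?') for _ in range(L)] for _ in range(500)]
    for generation in range(50):
        weights = [fitness(g) for g in population]
        parents = random.choices(population, weights=weights,
                                 k=2 * len(population))
        population = [crossover(parents[2 * i], parents[2 * i + 1])
                      for i in range(len(population))]
    # Over generations, '?' alleles tend to be replaced by the correct
    # hard-wired value: learned behavior becomes innate (the Baldwin
    # effect), though some residual plasticity typically remains.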

9. Grammatical acquisition and linguistic selection (Ted Briscoe)

TB's main hypothesis is that an innate LAD (Language Acquisition
Device) could have coevolved with human protolanguage, and he tests
the ability of the model he proposes against a documented process of
creolization. While the standard model of the LAD for grammatical
acquisition incorporates ''a set of constraints defining a possible
human grammar, and a set of biases (partially) ranking possible
grammars by markedness'', TB's account suggests that ''biases as well
as constraints can evolve through a process of genetic assimilation''
and that ''those constraints and biases in turn influence subsequent
development of language via linguistic selection'' (p.256). In the
model of linguistic selection proposed here, languages are taken to be
dynamical systems which adapt to their ''niche of human language
learners and users'' and language change is primarily located in
parameter setting (reanalysis) during language acquisition.

This innate LAD is composed of 1) a theory of Universal Grammar (UG)
with a finite set of finite-valued parameters defining the space of
possible grammars, 2) a parser for the grammars, and 3) an algorithm
for updating parameter settings. For TB, the theory of UG (1) is
Generalized Categorial Grammar, where the category set and the rule
schemata are defined as a default inheritance network characterizing a
set of (typed) feature structures. I would complain that in this
paper, TB assumes too much knowledge of GCG on the part of the reader,
for instance there is no explanation for a number of terms introduced
in the exposition, e.g., ''gendir'' presumably means ''generic
direction of functors'' and ''Vt'' presumably means ''transitive
verb'', but the reader shouldn't have to guess. (2), the parser, is a
deterministic, bounded-context shift-reduce algorithm. For (3), the
parameter setting algorithm, TB proposes a statistical extension of an
n-local partially-ordered error-driven parameter setting algorithm
utilizing limited memory. Here, I would complain that in spite of the
great amount of technical detail (pp.263-271) provided about the
parsing algorithm, the set-up, and the implementation of the
algorithms, we are not given enough explanation of what those
technical choices mean theoretically.

The simulation experiments reported here are more ambitious in scope
and in the phenomena they attempt to model than most of the other
contributions to this volume, and for that reason interested me even
more. TB first reports on a set of
acquisition experiments on ''feasible and effective grammatical
acquisition'' (section 9.3), with both unset and default learners for
a combination of parameters, which show convergence at the individual
level. This is followed by a set of experiments at the level of the
population of language agents (section 9.4), with simulation of
different rates of reproduction for the language agents. A set of
linguistic selection experiments at the population level (section
9.5), shows that linguistic selection is a viable approach to
accounting for some types of language change; it also shows that the
Bayesian approach to parameter setting accords with the behavior of
learners in situations with either multiple dialects or with contact
with speakers of other languages.

TB then introduces variations amongst the language agents, in two
ways. To investigate the coevolution of the LAD and of language, the
simulation uses a population of learners with different parameter
settings, the simulation of reproduction being as before, and also
including population movements. The results show that ''a minimal LAD,
incorporating a Bayesian learning procedure, could evolve the prior
probabilities and UG configuration which define the starting point for
learning, in order to attenuate the acquisition process by making it
more canalized and robust'' (p.282).

The last, even more linguistically realistic, experiment is the
simulation of creolization. Bickerton (1988) and Roberts (1998) argue
for an abrupt transition from pidgin to creole, but TB wants to
investigate whether and how the element of 'invention' assumed to be
necessary in creolization could arise in a model where the
parameter-setting algorithm is purely selectionist and largely
data-driven. I found this section of the paper the most fascinating
and the analysis of the linguistic data presented should be of
interest to other linguists as well.  The hypothesis being tested in
the simulation is that ''the primary linguistic data that creole
learners are exposed to is so uninformative that they retain their
prior default-valued parameter settings, as a direct consequence of
the Bayesian parameter-setting procedure'' (p.287). However, the
learners only need minimal exposure to the superstratum language to
converge on the creole grammar. Thus, ''creolization could result as a
consequence of a Bayesian parameter setting learner having default
setting for some parameters, acquired via genetic assimilation''
(p.293), under the assumption that ''richer triggers expressing
parameters for more complex categories [are] present in the primary
linguistic data'' (p.294). TB's claim is not only that this account
of creolization requires no invention or special mechanism beyond
those already posited for language change, but also that the timing
in the simulation is remarkably consistent with the time course
documented by Roberts (1998). However, he is careful to point out that
this has only been validated for cases with SVO superstratum
languages.
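
The logic of this hypothesis can be conveyed with a toy Bayesian
parameter (my own illustration: TB's learner operates over GCG
parameters with a rather more elaborate posterior, but the role of
the prior is the same):

    class BayesianParameter:
        # Pseudo-counts implement a prior: the genetically assimilated
        # default setting starts with extra weight that data must overcome.
        def __init__(self, default, prior_strength=5.0):
            self.counts = {0: 1.0, 1: 1.0}
            self.counts[default] += prior_strength

        def observe(self, value, weight=1.0):
            self.counts[value] += weight   # a trigger expressing the value

        def setting(self):
            return max(self.counts, key=self.counts.get)

    # Rich primary linguistic data: many unambiguous triggers for value 1.
    p = BayesianParameter(default=0)
    for _ in range(20):
        p.observe(1)
    assert p.setting() == 1                # the default is overturned

    # Pidgin-like, uninformative data: almost no triggers either way, so
    # the learner retains its default -- this sketch's analogue of creole
    # learners falling back on genetically assimilated settings.
    q = BayesianParameter(default=0)
    q.observe(1)
    assert q.setting() == 0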

In conclusion, TB proposes that a robust and effective account of
parameter setting, broadly consistent with Chomsky's (1981) proposals,
can be developed by integrating GCG, embedded in a default inheritance
network, with a Bayesian learning framework, which will be compatible
with a robust convergence to a target grammar. However, linguistic
selection for more learnable variant constructions during language
acquisition offers ''a promising formal framework to account for
language change where language learners converge to a grammar
different from that of the preceding generation'' (p.295). An extreme
example is creolization, which is potentially challenging for a
selectional and essentially data-driven account, but the model of the
LAD developed here predicts that creolization will occur within the
timeframe identified by researchers. The success of the coevolutionary
scenario, where ''there is reciprocal interaction between natural
selection for more efficient learners and linguistic selection for
more learnable grammars'' (p.295) and whose consequence is
highly-biased language learning, leads TB to the conclusion that
''there is little reason to retain the parameter setting framework''
(p.296), and that a minimal LAD combined with UG is enough. TB argues
that this minimal LAD would have required only minor reconfiguration
of cognitive capacities in the hominid line, and that (since TB
assumes that UG = GCG) ''the categorial logic underlying the semantic
component of GCG was already in place''. Thus, ''much of the
domain-specific nature of language acquisition, particularly
grammatical acquisition, would follow not from the special nature of
the learning procedure per se, as from the specialized nature of the
morphosyntactic rules of realization for the language of thought''
(p.297).

10. Expression/induction models of language evolution: dimensions and
issues (James R. Hurford)

JH compares 5 models of language evolution: 2 by Batali (including the
one given in Ch.5), 2 by Kirby (including the one given in Ch.6) and 1
by himself (Hurford, 2000). This is a very dense paper, which is
nevertheless rewarding for the reader as it illuminates a number of
aspects shared by those models and it explores some of the assumptions
made by the other contributors. The first commonality between the
models discussed here is that they share a large amount of
idealization and simplification and the emergent language systems they
produce are very simple. The specific questions JH then asks are: In
what sense do the evolved systems actually exhibit syntax? To what
extent is this syntax truly emergent? In what ways do the evolved
systems resemble Natural Language?

JH posits an underlying framework for all these models, which he calls
'Expression/Induction' (the E/I acronym is not a coincidence, and is
meant to remind the reader of I-language and E-language). Some aspects
of E/I are common to many views of language, in particular the view
that language is a dynamical system, and the distinction between the
mental grammars of individuals (I-language) and their public behavior
(E-language). The models discussed here share further assumptions:
computational implementation for modeling; use of populations of
agents, with agents alternating between speakers/teachers and
hearers/learners; and assuming both expression/invention capacity and
grammar induction capacity. These models also all start from a
situation with no language, thus they are not primarily models of
historical change, but of language emergence. They do not invoke
biological evolution, but are ''models of cultural evolution of
learned signaling systems [...] not models of the rise of innate
signaling systems'' (p.304). Unlike in the models proposed by Turkel
and by Briscoe (which are not discussed by JH), communicative success
is not a driving force; the ''basic driving force is the learning of
behavior patterns by observation of the behavior of others''
(p.305). Another important shared assumption is that of emergence:
''the essential dynamic of an E/I model itself produces certain kinds
of language structure as a highly likely outcome. The interaction of
assumptions produces non-obvious outcomes, explored by simulation''
(p.306).

An important concept which JH explores throughout this chapter is that
of 'bottlenecks' (see also Ch.6). The set of possible meaning-form
pairs is infinite, or at least very large, while the set of example
utterances used for acquisition is necessarily finite. This leads to
two kinds of bottlenecks: a 'semantic bottleneck', where learners only
observe a fraction of all possible meanings, and a 'production
bottleneck', where speakers produce only a subset of the possible
utterances for a meaning. All the models implement a semantic
bottleneck, which is crucial, since otherwise ''no agent would ever be forced
to generalize beyond its learning experience'' (p.333). All the models
also implement a production bottleneck, without which the model would be
unrealistic. JH steps through a few examples to show how the E/I model
handles the evolution of vocabulary: without a bottleneck (no change,
an unrealistic model), with only a production bottleneck (leading to
the elimination of synonyms over time), and with only a semantic
bottleneck (leading to an increase in synonymy).

Moving to modeling the emergence of syntax, all the models surveyed
have evolved syntactic means of expressing their meanings, and JH
identifies 3 phases in the emergence of syntax which all models go
through.  However, the models differ in the agents' representation of
syntax and they contrast ''in the degree to which learners postulate
autonomous syntactic structure'' (p.315). This echoes the debate on
the rival merits of rules (Chomsky 1971) and analogies (Bolinger
1968). An important point is that in all the experiments, the
population of agents has converged on a set of representations over
which a generalization is possible, even if it wasn't actually
made. JH also raises the empirical psycholinguistic issue of the
degree to which humans store exemplars rather than rules and points
out that the issue of rules vs. chunks also arises in computational
parsing theory. He argues that both approaches (rules and stored
chunks) are ''compatible with some of the most basic facts of language
organization'' and that ''computational evolutionary models have a
long way to go in complexity before they can begin to shed light on
such issues'' (p.319). The models all converge on systems which
exhibit compositionality, and JH argues that this is achieved in more
or less stipulative ways inasmuch as the ''semantic representations
incorporated into the evolutionary models already have a syntax''
(p.321) -- however, Kirby claims that compositionality emerges in his
model without being deliberately coded in. Concerning the invention
and production algorithms, ''invention for E/I models is an
essentially random process, constrained by the in-built assumptions''
(p.325). On the other hand, grammar induction is implemented very
differently in these models: in the early stages of the simulation,
the mode of learning is incremental in all models, while at later
stages they differ as to how much internal rearrangement of an agent's
previously stored information takes place.

The models also differ in their population dynamics, which can be
multi-generational or uni-generational, but they all assume a constant
size population, with agents periodically removed from the population,
which raises the issue of what would happen with more sophisticated
models of population expansion (cf. Ch.9).

In conclusion, JH considers that the factors which facilitate the
emergence of recursive, compositional syntactic systems in E/I models
boil down to: pre-defined syntactic representations; an invention
(and/or production) algorithm to construct new expressions in
conformity with the principle of compositionality; a learning
algorithm to internalize rules generalizing over form-meaning mappings
in a compositional way; a strong semantic bottleneck effect; and a
production bottleneck. JH moves on to speculate that a hybrid model
stripped down to a flat semantic structure, an invention algorithm
with no bias towards compositional structures, a learning algorithm
which may (but need not only) induce compositional general rules,
uni-generational population dynamics with only a single agent, and
the feedback effect of a production bottleneck would still yield an
emergent recursive compositional syntactic system, and concludes that
this is an experiment waiting to be done!

Critical evaluation

This is a very interesting book, quite challenging for the reader
because of the technical level of most of the papers, and it certainly
raises a number of controversial issues. Before addressing these
issues, I have to say that I would have expected the distinction
between language emergence (phylogenetic) and language acquisition
(ontogenetic) to be made clearer throughout the volume by all
contributors. The term 'evolution' is itself ambiguous and in some
cases, it is not always clear how the results of the experiments
should be interpreted: as a model of how language evolved in humans,
or of how humans learn language. In other cases, of course,
'evolution' refers to language change, and RW takes the view that not
only languages, but individual words, should be regarded as evolving
species. However, from both the vantage points of language emergence
and of language acquisition, the main issue concerns the status of the
human language faculty and of the language acquisition device
(LAD). The question is how much of the human language faculty, and of
the LAD, is innate to humans, and to what extent the properties of
human languages arise from the constraints of communication. Indeed,
the very existence of the LAD as a separate cognitive faculty
(cf. Chomsky) as opposed to general purpose learning mechanisms is
under investigation. The latter view is argued for by Worden, while
Briscoe argues for a minimal LAD refined by genetic assimilation.

Two related issues are those of the co-evolution of language and the
language faculty (Batali and Turkel) and of convergent evolution as
opposed to genetic encoding (Briscoe, Worden, Turkel). Both the
hypothesis that human language and the human language faculty evolved
together and the hypothesis that the language faculty emerged through
convergent evolution require some amount of genetic
assimilation. Genetic assimilation allows for the possibility of
intermediate forms of language and of the LAD, in sharp contrast to
the hypothesis of saltation, where the human language faculty (and the
LAD) is unique and fully specific (genetically encoded) to the human
species.

Collectively, the contributors to this volume have demonstrated that,
to a large extent, some of the properties assumed to be constitutive
of human language -- the arbitrary pairing of meanings with forms,
recursion and compositionality, and the syntactic structure of
language -- can be made to emerge under certain assumptions. The
assumptions which are shared by all researchers, and are not likely to
be controversial, are that there is some sort of pairing of meanings
with forms, and that successful communication is rewarded in one way
or another. Further assumptions regard the particular implementations
used for the simulations, and there is room for disagreement as to the
extent to which the implementations actually determine some of the
observed outcomes (Hurford).

Although all the papers in one way or another address the methodology
of modeling and simulation, they all assume that such modeling is
worth undertaking. However, the reader may still be left with the
question of whether we can actually learn something about all these
issues from simulation experiments with computational language
agents. This volume will convince readers already inclined to believe
that such simulations can provide some answers that progress has been
made, and it may convince others that such simulations are worth
performing and that the results are more illuminating and relevant to
linguistics than they might have expected.

Having assumed that computational simulation and modeling is one way
to explore some of the issues related to the evolution of language, it
is interesting to note that although all the contributors use language
agents to simulate humans, they group these agents into populations of
very different sizes and composition. They also make vastly different
assumptions about how learning and teaching across generations is
accomplished, with most authors remaining modest and only considering
simple models with fixed numbers and limited variety of
interaction. They have different models of population growth, contact
and migration (see especially the contributions by Niyogi and
Briscoe). Finally, they also differ markedly with respect to the
nature of the computational representations they use (e.g., trees,
RMF, formula sets). All these choices are the factors which determine
the observed outcomes.

Finally, several important assumptions about the nature and
properties of human languages underlie the work reported here:
recursion, which most authors assume; compositionality, which most
authors try to account for; and, lastly, an assumption about the
nature of the linguistic representations, which are always some form
of meaning-form pairs.

As a collection of papers, the volume is very well put together and it
forms a coherent whole. The authors obviously know each other's work
and cross-reference each other, in some cases as work in progress, so
the reader is left with the impression of a lively growing community
of researchers. However, references to other work outside this
community are also sufficient to allow the reader to explore
alternative approaches. The progression between the papers is quite
logical and the placement of Hurford's contribution, with its
systematic comparison of 5 models, at the end serves as a kind of
overall review of many of the concepts and issues introduced or
assumed by the other contributors.  The book itself is very well
produced and, in general, well edited. The index is rather limited,
but adequate. I have a few minor quibbles about the layout: in
particular, in Ch.4 most figures are too far away from their first
reference in the text (e.g., Fig. 4.3 on p.87 is mentioned on p.78),
and in Ch.10 we find ''3 exemplars along the lines of the 3 shown on
the next page'' (p.318) but no examples are given. There are very few
typos (missing parenthesis in Table 7.2, p.211; Fig. 7.6 on p.227 is
referenced as Fig. 7.4). My strongest criticism in this regard would be
that the paper by the editor might have benefited from more rigorous
editing and trimming, as it is one of the longest and arguably most
densely written of the volume, but I also found it extremely
interesting and stimulating.

Bibliography

Baldwin, J. M. (1896). ''A new factor in evolution''. American
Naturalist, 30, pp.441-451.

Bickerton, D. (1998). ''Catastrophic evolution: the case for a single
step from protolanguage to full human language''. In Approaches to the
Evolution of Language: Social and Cognitive Bases, J. Hurford,
M. Studdert-Kennedy and C. Knight, eds., pp.341-358. Cambridge:
Cambridge University Press.

Bolinger, D. (1968). Aspects of Language. New York: Harcourt, Brace
and World.

Briscoe, E. J. (2000). ''Grammatical acquisition: inductive bias and
coevolution of language and the language acquisition
device''. Language, 76(2), pp.245-296.

Cavalli-Sforza, L. and M. W. Feldman. (1981). Cultural Transmission
and Change: A Quantitative Approach. Princeton, N.J.: Princeton
University Press.

Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht:
Foris.

Chomsky, N. (1971/1965). Paper read at the Northeast Conference on
the Teaching of Foreign Languages, 1965. Reprinted in J. P. B. Allen
and P. van Buren, eds., Chomsky: Selected Readings. Oxford: Oxford
University Press.

Deacon, T. (1997). The Symbolic Species: Coevolution of Language and
Brain. Cambridge, MA: MIT Press.

Gibson, E. and Wexler, K. (1994). ''Triggers''. Linguistic Inquiry,
25(3), pp.407-454.

Hawkins, J. A. (1994). A Performance Theory of Order and
Constituency. Cambridge: Cambridge University Press.

Hinton, G. E. and S. J. Nowlan. (1987). ''How learning can guide
evolution''. Complex Systems, 1, pp.495-502.

Hurford, James R. (2000). ''Social transmission favours linguistic
generalization''. In Approaches to the Evolution of Language: The
Emergence of Phonology and Syntax, C. Knight, M. Studdert-Kennedy and
J. R. Hurford, eds., pp.324-352. Cambridge: Cambridge University Press.

Niyogi, P. and R. C. Berwick. (1995). The Logical Problem of Language
Change. MIT AI Memo no. 1516.

Niyogi, P. and R. C. Berwick. (1997). ''Evolutionary Consequences of
Language Learning''. Linguistics and Philosophy, 20, 697-791.

Pinker, S. (1994). The Language Instinct. New York: William Morrow.

Pinker, S. and P. Bloom. (1990). ''Natural Language and Natural
Selection''. Behavioral and Brain Sciences, 13(4), pp.707-784.

Roberts, S. (1998). ''The role of diffusion in the genesis of Hawaiian
creole''. Language, 74(1), pp.1-39.

ABOUT THE REVIEWER

Dominique Estival is a Senior Research Scientist at DSTO (the
Australian Defence Science and Technology Organisation), working on
various aspects of human interfaces and language technologies. She
received her PhD in linguistics from the U. of Pennsylvania in 1986
with a thesis on diachronic syntax and since then has been actively
involved in Computational Linguistics and Natural Language Processing,
in industrial R&D and in academic environments. She has worked on
various areas of NLP: linguistic engineering, grammar formalisms for
NLP, Machine Translation, reversible grammars, evaluation for NLP, and
spoken dialogue systems.


---------------------------------------------------------------------------

If you buy this book please tell the publisher or author
that you saw it reviewed on the LINGUIST list.

---------------------------------------------------------------------------
LINGUIST List: Vol-14-1481


