Atkinson on phoneme inventories in Science

Sun Apr 17 16:19:43 UTC 2011

Johanna's remarks and observations are really 'to the point'. Let me 
just stress/add one aspect, namely the 'nature' of the implication 
underlying Atkinson's hypothesis: It is a well-known assumption in 
relevance logics that in order to set up a true implication, the 
antecedent and consequent must be related in terms of relevance (e.g. by 
sharing some kind of common predicate). Thus, the phrase "if I am in New 
York, two and three are five" is superficially true in terms of a 
'material implication', but the antecedent and the consequent lack any 
common, evident relevance. This phrase comes close to a formula like "if 
a population is of size X, then the phoneme inventory of their language 
is of size Y". In order to assert a relevant relation, we have to 
describe in detail at least one necessary condition that holds for both 
the antecedent ("if a population is of size X") and the consequent ("the 
phoneme inventory of their language is of size Y"). Otherwise, the 
'implicative' relation between these two propositions would be irrelevant.

A famous example is the statistical observation that the number of 
storks in a village is in relation to the number of children born in 
this village, cf. Hofer et al. 2004. New Evidence for the Theory of the 
Stork. In: /Paediatric and Perinatal Epidemiology/ 18, pp. 88-92. This 
correlation suggested by statistical data is not causal: it rests on a 
third factor, namely the degree of urbanization, which may cause both the 
reduction of nesting sites for storks and the spreading of 'nuclear 
families'. Hence, the statistical data just strengthen the 
folk-scientific hypothesis according to which 'storks bring the babies', 
even though the implication is not marked for a relevant relation.
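
To make the stork case concrete, here is a minimal sketch with simulated 
numbers (all values are invented for illustration, none come from Hofer 
et al.). It assumes that urbanization lowers both stork counts and birth 
counts and that nothing else links the two. The raw correlation is then 
strong, but it should largely disappear once urbanization is controlled 
for.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares line on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical villages: urbanization in [0, 1] drives BOTH variables.
urban = [rng.random() for _ in range(500)]
storks = [100 * (1 - u) + rng.gauss(0, 10) for u in urban]
births = [50 * (1 - u) + rng.gauss(0, 5) for u in urban]

# Raw correlation: storks and births move together.
raw_r = pearson(storks, births)

# Partial correlation: correlate what is left after removing urbanization.
partial_r = pearson(residuals(storks, urban), residuals(births, urban))

print(f"raw r = {raw_r:.2f}, r controlling for urbanization = {partial_r:.2f}")
```

Under these assumptions the 'implication' from storks to babies holds 
only as long as the common factor stays hidden.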

The 'marriage' between two statistical observations must not just be 
'plausible', but based on a comprehensive scenario that relates in a 
relevant way basic properties of the two statements formulating these 
observations. One major point would be to formulate predictions that 
would be applicable to representatives of the antecedent without yet 
knowing about the quality of a possible consequent. For instance, in the 
given case relevance would be suggested if we start from a population 
size X without knowing anything (!) of their language and its phoneme 
inventory. The hypothesis would be that the unknown language should have 
a phoneme inventory of size Y. By then (!) unveiling the given language 
and its phoneme inventory, we can check the assumed consequent against 
its antecedent. If the relation supports the implication and if we can 
repeat this procedure in a statistically significant number of 
instances, we might set up the hypothesis that antecedent and 
consequent must share some kind of relevance, even though 
we still have to model this relevance.
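
The procedure just described amounts to a hold-out test. A minimal sketch 
follows, assuming (my assumption, not Atkinson's method) that the 
hypothesis is stated as a straight line relating inventory size to the 
logarithm of population size. The function names and all numbers are 
hypothetical toy values, not real language data. The line is fitted on 
languages whose inventories are known, and the prediction is then checked 
against languages that were 'veiled' during fitting.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def holdout_errors(train, test):
    """train/test: lists of (population, inventory_size) pairs.

    Returns the mean absolute error of the fitted line on the held-out
    languages, and of a baseline that ignores population altogether.
    """
    xs = [math.log10(pop) for pop, _ in train]
    ys = [inv for _, inv in train]
    slope, intercept = fit_line(xs, ys)
    mean_inv = sum(ys) / len(ys)
    model = sum(abs(slope * math.log10(pop) + intercept - inv)
                for pop, inv in test) / len(test)
    baseline = sum(abs(mean_inv - inv) for _, inv in test) / len(test)
    return model, baseline

# Toy data constructed so that the hypothesis holds exactly.
train = [(10 ** k, 20 + 2 * k) for k in range(2, 7)]
test = [(10 ** k, 20 + 2 * k) for k in (1, 7)]  # 'veiled' during fitting

model_err, baseline_err = holdout_errors(train, test)
print(model_err, baseline_err)
```

Only if the prediction beats the population-blind baseline on such veiled 
cases, repeatedly, would there be grounds to look for the relevance that 
links antecedent and consequent.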

As far as I can see, Atkinson does not apply this basic technique of 
evaluating a superficially 'material implication'. Rather, the author 
starts from two given sets of data (population / phoneme inventory) and 
sets up an ad hoc implication ('verified' with the help of statistical 
considerations). In this case, we need a full description of those 
properties of both the antecedent and the consequent that produce a 
'relevant relation' between them. Personally, I cannot fully understand 
how this should be done. One main point is that you have to define a 
population via 'language', that is, you include the consequent in the 
antecedent, which is highly problematic when construing implications. In 
addition, we need a criterion that allows us to correlate population and 
language at all. The problem is that the notion 'population' (when 
referred to in Atkinson's sense) is marked for a demographic factor that 
again 'counts' biological 'substance' (hard data). On the other hand, 
'language' is not an 'object', but a variable that may or may not be 
correlated with these data. For instance, if one claims that a 
population is defined e.g. by sharing a set of economic strategies, one 
would arrive at totally different hypotheses. In fact, you cannot 
predict which language is spoken by a group of people (not to speak of 
an individual or a "human fossil"). So, you have to replace 'population' 
by 'speech community'. But again, this does not help very much: If you 
start from the individual (who normally initiates language change), its 
'speech community' usually is much smaller than the fictitious overall 
'speech community': Such a 'speech community' normally is a network of 
smaller communities sharing 'human interfaces' that belong to more than 
one smaller 'speech community'. In this sense, the overall number of 
speakers of a speech community defined by one language is not the 
relevant figure: What matters is the number of speakers in the smaller 
communities that are marked for constant and mutual linguistic practice. 
So, in reality, most 'speech communities' are rather 'small' (some 
people say that an individual in a pre-modern society was in constant 
discourse with at most 100 people). Now, if language is structured 
also by such factors as interactional types that control the shaping and 
re-shaping of linguistic signs, then demographic factors become 
relevant only with respect to the degree of interaction. In other words: 
It is the demography of a 'speech community in interaction' that counts 
(if at all), not the overall population marked for using the 'same' language.

Here, I refrain from discussing the hypothesis according to which 
smaller communities allow more lexical ambiguities than 'larger 
communities' (thus 'reducing' the need for larger sets of phonological 
oppositions). Let me just say that given the fact that people always 
live and communicate in smaller social networks this hypothesis loses 
ground. In addition, you can easily turn around this argument: In a 
larger social network, ambiguities should be much more pronounced 
because a speaker has to respect very different 'states of knowledge'. 
Being too explicit would put speakers at risk of isolating themselves 
from their interactional partners. Also, the hypothesis does not explain 
in detail how a preference for ambiguity should affect a phonological 
system at all: 
A phonological system is mainly defined by the set of lexical and 
morphological linguistic signs used in a speech community. A rise in the 
degree of ambiguity (or: inference) would mean that certain concepts 
(signifiés) that are too specific are no longer expressed by their 
corresponding signifiants. But why should this happen especially to 
those sets of linguistic signs that are marked for specific phonological 
values? Vagueness, the allowance of inference, and ambiguity concern 
concepts, not their articulation.

Best wishes,
Wolfgang

On 17.04.2011 08:30, Johanna Nichols wrote:
> This is written up to be more or less self-standing, but please DON'T
> QUOTE it without asking me first -- I'm expanding the sample for this and
> there will be changes.
>
> Johanna
>
>
> Atkinson 2011 finds a significant positive correlation between population
> size and phoneme inventory size (confirming Hay & Bauer 2007) and explains
> it by migration:  phoneme sizes are largest in Africa, and as societies
> spread out of Africa and around the world they went through population and
> cultural bottlenecks and underwent phonological simplification as a
> consequence.  I believe the correlation is artifactual.
>
> As background, Sproat 2011 points out that Atkinson's language sizes range
> from a few tens of speakers to hundreds of millions of speakers.  But
> population sizes in the Paleolithic were small, probably at most a few
> thousand speakers and often fewer.  If Atkinson's explanation is correct,
> the positive correlation between population size and phoneme inventory
> size should also hold among just the smaller population sizes; looking at
> Fig. S1, there does still seem to be a positive correlation, but it looks
> considerably weaker.  I agree with Sproat on all these points.
>
> In Nichols 2009, a cross-linguistic survey of overall grammatical
> complexity, I found a highly significant negative correlation between
> overall complexity and population size:  smaller communities have more
> complex languages.  But that proves to be an artifact of the larger
> population sizes and lower structural complexity in Africa and Eurasia:
> within subglobal areas (Old World, Pacific, New World) there is no
> correlation.  The large language-population sizes in Africa and Eurasia
> have to do with the long history of statehood and empire (which spread big
> state languages at the expense of smaller ones), economic growth, and
> efficient food production (themselves accidents of geography: Diamond
> 1997).  Also, European colonization brought smallpox and economic
> destruction to the Americas and Australia, drastically reducing
> populations, and the population figures we have are post-colonial.  Big
> trade languages, state languages, and other inter-ethnic languages tend to
> be simpler than small ethnic languages (Trudgill 2009, Szmrecsanyi &
> Kortmann 2009, Dahl 2004), and there have been many more of these in
> Africa and Eurasia than in the pre-contact Americas and Pacific.
>
> So, does Atkinson's positive correlation in phonology hold within large
> areas as well as worldwide?  I took the data on phonological complexity
> from my 2009 paper, quickly surveyed a few more languages to fill gaps,
> and did some counts.  Now, my data measures phonological complexity
> (consonant inventory size, vowel inventory size, suprasegmentals, syllable
> complexity), not quite the same thing as what Atkinson measures, but
> certainly getting at the same thing.  My figures for population size may
> be different, as they are often based on grammars and ethnographies rather
> than Ethnologue and I have attempted to track ethnic group size, not
> numbers of speakers (since the proportion of speakers in ethnic groups has
> fallen drastically in recent years).  Atkinson has 500+ languages; I have
> 85, representing that subset of the Autotyp (Bickel & Nichols 2002ff.)
> genealogical sample that I was able to cover quickly.
>
> I found the same positive correlation worldwide (statistically
> significant).  But it does not obtain within large areas.  It is reversed
> in Africa (a negative correlation: larger population correlates with
> simpler phonology) and there is no correlation in Eurasia and the
> Americas; there is a slight correlation in the Pacific, but my sample from
> there is too small to be confident of this.
>
> If there is really a correlation between population size and phoneme
> inventory size (or anything else), it should hold within areas as well as
> worldwide.  I believe the worldwide positive correlation is an artifact of
> (a) larger population size in Eurasia and Africa, (b) areality in greater
> Africa (extending into the Near East and the Caucasus) (large number of
> airflow contrasts in consonant inventories).  (Africa is large but a
> closed area which has received almost no linguistic or genetic
> immigrations during its very long prehistory and the net effect of
> numerous local contact episodes is continent-wide areality manifesting
> itself not only in consonant contrasts but also e.g. in gender systems and
> tone systems.)
>
> Atkinson's explanation is that the smaller phoneme inventories in places
> distant from Africa are founder effects:  as small populations migrated
> greater and greater distances from Africa they passed through bottlenecks
> and isolation and lost phoneme diversity.  If this is the actual
> explanation, one would expect concomitant simplification of morphology and
> the rest of grammar with greater distance from Africa, but in fact we find
> the reverse:  languages in the Americas and the Pacific are on average
> more complex overall, and morphologically, than those in Africa (or Africa
> plus Eurasia) (Nichols 2009).
>
>
> References
>
> Atkinson, Quentin D.  2011. Phonemic diversity supports a serial founder
> effect model of language expansion from Africa. Science 332:346-9.
>
> Bickel, Balthasar and Johanna Nichols. 2002. The Autotyp research program.
> http://www.uni-leipzig.de/~autotyp/
>
> Dahl, Östen. 2004. The Growth and Maintenance of Linguistic Complexity.
> Amsterdam: Benjamins.
>
> Diamond, Jared. 1997. Guns, Germs, and Steel: The Fates of Human
> Societies. New York: Norton.
>
> Hay, Jennifer and Laurie Bauer. 2007. Phoneme inventory size and
> population size. Language 83:2.388-400.
>
> Sproat, Richard.  2011.  Science does it again.
> http://www.cslu.ogi.edu/~sproatr/newindex/atkinson.html   (accessed April
> 14, 2011)
>
> Szmrecsanyi, Benedikt and Bernd Kortmann. 2009. Between simplification and
> complexification: Non-standard varieties of English around the world. In
> Geoffrey Sampson, David Gil and Peter Trudgill, eds., Language Complexity
> as an Evolving Variable, 65-79. Oxford: Oxford University Press.
>
> Trudgill, Peter. 2009. Sociolinguistic typology and complexification. In
> Geoffrey Sampson, David Gil and Peter Trudgill, eds., Language Complexity
> as an Evolving Variable, 98-109. Oxford: Oxford University Press.
>
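
The point in the quoted message, that a worldwide correlation can vanish 
or even reverse within areas, can be illustrated with a minimal sketch. 
The two 'areas' and all values below are invented for illustration and 
are not Nichols's or Atkinson's data. Within each hypothetical area, 
larger populations go with lower complexity scores, yet pooling the areas 
yields a positive correlation because one area is higher on both 
variables.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical (log10 population, phonological complexity) pairs per area.
samples = {
    "area_A": [(5, 12), (6, 11), (7, 10)],  # large populations, high scores
    "area_B": [(2, 6), (3, 5), (4, 4)],     # small populations, low scores
}

# Pooled ('worldwide') correlation across all areas.
pooled = [pair for pairs in samples.values() for pair in pairs]
pooled_r = pearson(*zip(*pooled))

# Correlation computed separately inside each area.
within_r = {area: pearson(*zip(*pairs)) for area, pairs in samples.items()}

print(f"pooled r = {pooled_r:.2f}")
for area, r in within_r.items():
    print(f"{area}: r = {r:.2f}")
```

In this toy setting the pooled figure reflects only the difference 
between the areas, which is the kind of artifact described above.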

-- 

----------------------------------------------------------

Prof. Dr. Wolfgang Schulze

----------------------------------------------------------

Institut für Allgemeine & Typologische Sprachwissenschaft

Dept. II / F 13

Ludwig-Maximilians-Universität München

Ludwigstraße 25

D-80539 München

Tel.: 0049-(0)89-2180-2486 (Secretary)

0049-(0)89-2180-5343 (Office)

Fax:  0049-(0)89-2180-5345

Email: W.Schulze at lrz.uni-muenchen.de / Wolfgang.Schulze at lmu.de

Web: http://www.ats.lmu.de/index.html

Personal homepage: http://www.wolfgangschulze.in-devir.com

----------------------------------------------------------
