[Lingtyp] Greenbergian word order universals: confirmed after all

Tue Nov 7 14:03:02 UTC 2023

Many thanks, Guillaume, for your post and the link to your paper 
(Pellard et al.), which looks very useful.

As you say, the reliability of these studies hinges on the cognate 
coding, which is done manually, by humans with their biases. I wonder 
whether there is a way to measure how far different linguists agree 
(with some kind of kappa statistic for inter-coder agreement), and a 
way to identify or exclude systematic biases (which are part of normal 
human behaviour). Another thing that I worry about is that grammatical markers 
(even demonstratives and interrogatives) are ignored (see the list of 
170 comparison meanings in IE-COR: https://iecor.clld.org/parameters), 
even though we know that these are the most resistant to borrowing. 
Especially in closely related languages, it's very hard to distinguish 
lexical loanwords from inherited words, isn't it? (For example, Dutch 
begrijpen 'understand' is said to have been borrowed from German 
https://wold.clld.org/word/72181920155924122, but without the rich 
attestation of both languages since the Middle Ages, we wouldn't be able 
to tell.)
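
One possible way to quantify agreement between two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following is a minimal Python sketch, not a method taken from any of the studies discussed; the function name `cohen_kappa` and the toy judgements are invented for illustration.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(coder_a)

    # Proportion of items on which the two coders gave the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Agreement expected by chance, from each coder's own label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used one and the same label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgements by two linguists on the same eight word pairs.
linguist_1 = ["cognate", "cognate", "loan", "cognate",
              "unrelated", "loan", "cognate", "unrelated"]
linguist_2 = ["cognate", "loan", "loan", "cognate",
              "unrelated", "cognate", "cognate", "unrelated"]

print(f"kappa = {cohen_kappa(linguist_1, linguist_2):.2f}")
```

A value near 1 would indicate agreement well above chance and a value near 0 agreement no better than chance. With more than two coders, a multi-rater statistic such as Fleiss' kappa would be needed instead.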

So it is my feeling that looking at unrelated languages is much safer in 
typology. And I don't understand why Simon Greenhill said (about the 
proposal to sample only one language from a family):

"But then what does this mean when you take one language from a family 
like Austronesian with ~1300 languages and a one from a family like 
Eastern Trans-Fly with 4 languages. This means that you've sampled 
0.0007% of Austronesian but 1/4 of ETF. This feels wrong."

It doesn't feel wrong to me at all, just as it doesn't feel wrong to 
treat large languages like Russian in the same way as small languages 
like Sorbian. Large languages have many more speakers, but these 
speakers are not independent of each other; in the same way, 
Austronesian speakers are 
not independent of each other, so a genealogically stratified sample 
would have only one Austronesian language (one that is at least 30 
languages away from Papuan languages).
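
The arithmetic behind the two sampling fractions, and the one-language-per-family sampling itself, can be sketched as follows. This is only an illustration: the family sizes are the approximate figures from the quotation, the language labels are placeholders rather than real language names, and `one_per_family` and `sampling_fraction` are hypothetical helpers. The additional distance condition mentioned above is not modelled. Note that 1/1300 is about 0.00077 as a proportion, which is roughly 0.08 when expressed as a percentage.

```python
import random


def sampling_fraction(family_size, n_sampled=1):
    """Share of a family's languages that ends up in the sample."""
    return n_sampled / family_size


def one_per_family(languages_by_family, seed=0):
    """Genealogically stratified sample: one random language per family."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {
        family: rng.choice(sorted(languages))
        for family, languages in languages_by_family.items()
    }


# Placeholder inventories; sizes follow the approximate figures quoted above.
families = {
    "Austronesian": [f"Austronesian_{i:04d}" for i in range(1300)],
    "Eastern Trans-Fly": [f"ETF_{i}" for i in range(4)],
}

for family, languages in families.items():
    share = sampling_fraction(len(languages))
    print(f"{family}: 1 of {len(languages)} languages ({share:.3%})")

print(one_per_family(families))
```

Each family contributes exactly one data point regardless of its size, which is the point of the stratification: the sampled units are treated as independent, while the languages within a family are not.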

Best,

Martin

On 07.11.23 09:33, Guillaume Jacques wrote:
> The consensus trees that are published in the articles on phylogeny are 
> just the tip of the iceberg of the amount of information you can gain 
> from these tree distributions, but for now there is no convenient 
> interface to explore these data, and some knowledge of R or other 
> languages is necessary. This forthcoming chapter presents a 
> (hopefully) readable introduction to phylogenies for historical 
> linguists: "The Family Tree model", by Guillaume Jacques and Thomas 
> Pellard (Academia.edu) 
> <https://www.academia.edu/101656989/The_Family_Tree_model>
>
> In the end, what decides the reliability of these studies is the 
> reliability of cognate coding, which means that historical linguists 
> specializing in meticulous etymologies and sound laws will play a 
> crucial part, and should work collectively to produce better 
> phylogenies, which typologists can then use to study the distribution 
> of structural features through time and space.

-- 
Martin Haspelmath
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
D-04103 Leipzig
https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/