[Lingtyp] Greenbergian word order universals: confirmed after all

Sat Nov 4 06:24:25 UTC 2023

Many thanks, Simon, for offering your author's perspective on this old 
paper!

You are certainly right that "the scientific process" can sometimes be 
slow, and it may take a while for wrong conclusions to be corrected. We 
should of course not be discouraged from trying out novel methods just 
because they may initially yield strange results.

However, I found your 2011 paper very confusing because it was published 
in such a prominent place, seemingly betraying very high confidence in 
the results. It was only recently that I heard from a good source that 
the authors now agree with the earlier criticism that the sample size 
was too small in 2011, and two of them are now coauthors of the new 
forthcoming paper (with MUCH more data) that Annemarie Verkerk reported 
on recently. So that's a good development.

Maybe that new paper will come out in /Nature/, too, and then all will 
be well, as the 2011 paper will fall into oblivion. (But since the new 
paper mostly confirms older Greenbergian perspectives, /Nature/ may not 
find the results novel enough...)

But I'm still confused about the three points you make:

First, your paper *didn't* show that Matthew Dryer's (1992) correlation 
method was "overly simplistic" – on the contrary, while some of the 
claims made in the 1970s were overly simplistic, Dryer showed that the 
Greenbergian correlations hold, but in a more limited way than had been 
thought (he rejected the head-dependent theory in favour of the 
branching-direction theory).

Second, the "need to understand language systems in a diachronic manner" 
*didn't* need to be highlighted, because it was something that linguists 
had routinely done since the 19th century, and the idea was made very 
prominent again by Greenberg himself (e.g. Greenberg 1969; 1978; see 
also Croft's well-known work).

Third, linguists knew that "different routes can be taken in different 
families at different times", so this was *not* a contribution of your 
paper either.

I don't know about "rigorous" reviewing in high-profile journals (they 
certainly have a high rejection rate), but I think it is clear that 
their influence is not always justified by the validity of their 
research results. Publishing a paper in /Nature/ can have a chilling 
effect on the rest of the discipline, in that other approaches or views 
may not be seen as legitimate anymore. (We saw the bad effects of the 
outsize political influence of high-profile journals in recent years, 
when some of them declared the lab-leak theory and certain alternative 
approaches to public health as wrong, quite prematurely, and with 
devastating effects on the public discussion.)

So one of my reasons for raising the problems with the 2011 paper here 
is to point out that while the correlated-evolution method may have its 
advantages, it requires a huge amount of data (and very unrealistic 
worldwide trees, as is particularly clear with Jäger & Wahle's 
ASJP-based tree, but also with the Bouckaert et al. tree: 
https://osf.io/preprints/socarxiv/f8tr6/). For most other purposes in 
typology, the traditional sampling methods (discussed in our subfield 
since 1978) are all we have.

In their target authors' response article in /Linguistic Typology/ 
(2011, 509-534), Levinson et al. start out by noting the problem of 
spatial and genealogical autocorrelation (which used to be called 
"Galton's Problem", after the originator of eugenics), but they make it 
sound as if stratified sampling has a deep fundamental problem and that 
the correlated-evolution method is always better. But they seem to be 
exaggerating the problems with sampling, e.g. when they say:

"Many typologists may assume that the dangers of covert phylogenetic 
dependence are remote. But given the apparent genetic bottlenecks at the 
beginning of the modern human diaspora out of Africa (Amos & Hoffman 
2009), something close to language monogenesis seems a reasonable 
assumption, rendering Galton’s problem insurmountable."

Admittedly, this conceivable problem had not occurred to typologists at 
the time of Greenberg and Dryer, and it became important in the 
discussion only after Maslova (2000) (it is also mentioned by Cysouw 
2011 in LT, and by Jäger & Wahle 2021). This is why I wrote a blogpost a 
few years ago asking how realistic it is to suspect that current 
typological distributions might in part reflect Proto-World 
(https://dlc.hypotheses.org/2376). Now recently, Russell Gray told me 
that he never thought that Proto-World retention was an important 
problem for sampling.

If so, then I'm even happier to recommend continuing with stratified 
sampling (among other methods, of course), as it is a much cheaper 
method – and still fully legitimate, despite the claims made by Levinson 
and colleagues. Greenberg's (1963) claims were based on his knowledge of 
hundreds of languages (and on earlier research by W. Schmidt), not only 
on his very small and very skewed 30-language sample, but the fact that 
most of his universals stood the test of time (despite his very 
imperfect methods) seems to show that what mattered primarily was the 
novel (universalistic and quantitative) perspective that he took, not so 
much the methods.

Best wishes, and thanks again to all for the discussions,

Martin

P.S. The earlier LINGTYP thread can be seen here: 
https://listserv.linguistlist.org/pipermail/lingtyp/

On 04.11.23 02:22, Simon Greenhill wrote:
> Colleagues, Martin, everyone else
>
> Thank you for sharing your perspectives on our 2011 paper. It's nice to see this still be discussed more than a decade later. However, I would like to express my concerns and disagreements with some of the points you've raised.
>
> I'm very proud of the Dunn et al. paper for a number of reasons. I'll name three.
>
> First, the paper showed that the overly simplistic correlation methods that had been used to make sweeping global claims were problematic. We need better tools to tackle these questions, and the tools we applied were one part of a better toolkit.
>
> Second, it highlighted the need to understand language systems in a diachronic manner. We cannot decouple language typology from language history; instead, we need to understand how these are entangled.
>
> Third, it emphasised the way that particular configurations of languages can be arrived at via different routes in different families at different times. This enables a much richer understanding of how these particular generalisations have arisen.
>
> Have Jäger and Wahle disproved any of that? No. Maybe these were not completely novel insights (Maslova’s work has been mentioned which touches on a few of these issues too, for example), but these ideas did appear to crystallise in this paper.
>
> While it's certainly important to revisit and reevaluate research findings to ensure accuracy, it is crucial to approach these discussions with an understanding of the scientific process. Scientific paradigms evolve over time, and different studies may yield varying results due to changes in methodologies, data sources, and sample sizes. This doesn't necessarily imply that the initial research was flawed or that the authors were neglectful. In particular, the tools, the data, and our understanding of how languages change are substantially further advanced than they were a decade ago (or, I know that *my* understanding of these things is more advanced now, at least). And these other papers that you mention -- and many other studies -- have built upon the work we did in 2011.
>
> Furthermore, I would like to caution against drawing overly broad conclusions about the quality of research published in high-prestige journals. The peer-review process in such journals is rigorous, and while they may occasionally feature sensationalist claims, this doesn't diminish the overall value they contribute to the scientific community. For the record, of the handful of papers I've had in these journals *all* have been reviewed by people I would infer to be linguists based on the comments and issues they raised. We did not send these papers to these journals to avoid linguistic reviewers but, frankly, I've had better reviews at these journals than at prominent linguistics journals (and by "better" I mean more rigorous, more thorough, and more critical).
>
> Finally, linguistic typology is an ongoing and evolving field trying to tackle very difficult problems. We need all the tools and approaches we can get to solve these problems across all the levels that languages operate on (from detailed language internal analyses to high-level global analyses). Rather than looking back and gate-keeping what is ‘real’ typology published in ‘real’ linguistics journals, we should shift our focus forward. Typology can be a welcoming and diverse community that embraces a wide range of approaches, analyses, and styles. Let's look outward to foster connections with other fields and disciplines.
>
> After all, why shouldn't linguistic typology work be everywhere in science? It's certainly interesting enough.
>
> Simon
>
> Dr. Simon J. Greenhill
>
> Associate Professor
>
> Te Kura Mātauranga Koiora | School of Biological Sciences
> Te Whare Wānanga o Tāmaki Makaurau | University of Auckland
>
> Abteilung für Sprach- und Kulturevolution | Department of Linguistic and Cultural Evolution
> Max-Planck-Institut für Evolutionäre Anthropologie | Max Planck Institute for Evolutionary Anthropology
>
-- 
Martin Haspelmath
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
D-04103 Leipzig
https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/