[Lingtyp] Reporting cross-linguistic frequencies

Erich Round e.round at surrey.ac.uk
Thu Nov 20 09:09:52 UTC 2025


Dear Omri,

If you’re looking for methods, please see this paper and the accompanying software packages:

Paper:
Macklin-Cordes, Jayden L., and Erich R. Round. 2022. ‘Challenges of Sampling and How Phylogenetic Comparative Methods Help: With a Case Study of the Pama-Nyungan Laminal Contrast’. Linguistic Typology 26 (3): 533–72. https://doi.org/10.1515/lingty-2021-0025.
Software:
Round, Erich R. 2021. glottoTrees: Phylogenetic Trees in Linguistics. V. 0.1. Released. https://github.com/erichround/glottoTrees.
Round, Erich R. 2021. phyloWeights: Calculation of Genealogically-Sensitive Proportions and Averages. V. 0.3. Released. https://github.com/erichround/phyloWeights.

The paper itself explains the logic of phylogenetic averages and their relation to discussions of sampling in typology. The software lets you quantify the prevalence of a feature while taking phylogeny into account. It uses Glottolog’s language subgrouping as a default, which you can alter to reflect your own expert views or hypotheses about subgrouping. The supplementary materials for the paper (34pp) provide an extended tutorial for applying the software to the kind of question you’re asking.
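
To make the general idea concrete, here is a minimal sketch in Python. It is not the method implemented in phyloWeights, which works with full trees; it only shows a flat, one-level simplification in which each family counts once, so that a large family cannot dominate the figure. The function names and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def raw_proportion(languages):
    """Share of languages with the feature, each language weighted equally."""
    return sum(has_feature for _, has_feature in languages) / len(languages)


def family_balanced_proportion(languages):
    """Mean of the within-family proportions, so each family counts once.

    This is a simplified stand-in for a genealogically sensitive proportion:
    it ignores subgrouping inside families and branch lengths.
    """
    by_family = defaultdict(list)
    for family, has_feature in languages:
        by_family[family].append(has_feature)
    per_family = [sum(values) / len(values) for values in by_family.values()]
    return sum(per_family) / len(per_family)


# Hypothetical toy sample: (family label, language has the feature?)
sample = [
    ("A", True), ("A", True), ("A", True), ("A", True),  # one large family
    ("B", False),
    ("C", False),
]

raw = raw_proportion(sample)                   # 4 of 6 languages
balanced = family_balanced_proportion(sample)  # (1 + 0 + 0) / 3 families

print(f"raw proportion:             {raw:.3f}")
print(f"family-balanced proportion: {balanced:.3f}")
```

In this toy sample the feature occurs in four of six languages, but all four belong to one family, so the family-balanced figure is much lower than the raw one. A tree-based method refines the same intuition by weighting at every level of subgrouping rather than only at the family level.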

Very best,
Erich



--

Prof. Erich Round

Director, Surrey Morphology Group,

https://www.smg.surrey.ac.uk/


From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> on behalf of Omri Amiraz via Lingtyp <lingtyp at listserv.linguistlist.org>
Date: Thursday, 20 November 2025 at 09:38
To: lingtyp at listserv.linguistlist.org <lingtyp at listserv.linguistlist.org>
Subject: Re: [Lingtyp] Reporting cross-linguistic frequencies

Dear all,
I agree with Ian that, in addition to genealogical and areal biases, the very question of what counts as a language versus a dialect is partly subjective. This makes actual frequencies even more problematic, since we would obtain different results depending on whether we treat Wu Chinese as one language or as thirty separate languages, as Ian pointed out.
Juergen wrote: "We can empirically assess the extent to which the probability of a random language having a certain property depends on (or is influenced by, or varies with, etc.) it being related to certain other languages, or being spoken (or signed) in a particular area."

I wonder whether it might be useful to have a measure of the genealogical and areal spread of a feature, essentially quantifying how broadly it is distributed across families and regions in the present-day world. Such a measure might be more straightforward to interpret than an adjusted frequency/probability, since it is not clear whether the described population is a hypothetical set of isolated isolates or something else.

Is anyone aware of an existing metric that captures genealogical or areal spread in this way?

Best,
Omri