<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>

</head>

<body dir="ltr">

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

I have a couple of comments about Juergen's actual question (as opposed to the other interesting issues that got raised in this thread).</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

The notion of "typological prototype" in <i>Typology and Universals</i> does not have to do with frequency of occurrence of a phenomenon in a typological sample. It is defined by a set of implicational universals that refer to combinations of values from different grammatical categories. Cross-linguistic frequency plays only an indirect role, namely in the construction of the implicational universals; see for instance Maslova 2003 on the latter topic.</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Regarding choice of languages in typological samples, Alan Bell in his 1978 chapter on language samples discusses "bibliographic bias", which is related to what Juergen is concerned about: a sample is necessarily based on existing language descriptions, so the language types it contains may be biased by the circumstances that led some languages to be described rather than others, such as the biases that Östen apparently referred to. It is much easier now to avoid that bias in constructing areally and genetically stratified samples, thanks to the many very fine grammars of indigenous languages that have been produced in the past thirty years.</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Bill<br>

</div>

<div id="appendonsend"></div>

<hr style="display:inline-block;width:98%" tabindex="-1">

<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Lingtyp &lt;lingtyp-bounces@listserv.linguistlist.org&gt; on behalf of Bohnemeyer, Juergen &lt;jb77@buffalo.edu&gt;<br>

<b>Sent:</b> Saturday, April 25, 2020 11:02 AM<br>

<b>To:</b> John Du Bois &lt;dubois@ucsb.edu&gt;; Randy J. LaPolla &lt;randy.lapolla@gmail.com&gt;<br>

<b>Cc:</b> LINGTYP &lt;lingtyp@listserv.linguistlist.org&gt;<br>

<b>Subject:</b> Re: [Lingtyp] Diversity/dispersion of descriptive/typological knowledge by language</font>

<div> </div>

</div>

<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">

<div class="PlainText">

<br>

Thanks very much, Jack and Randy! I’m well aware of Trudgill’s thought-provoking work. Also worth mentioning in this connection are other recent studies that have looked into the nexus among population size, L2 learners, and complexification/simplification.

 There’s joint work by Gary Lupyan and Rick Dale:<br>

<br>

<a href="https://www.ncbi.nlm.nih.gov/pubmed/20098492">https://www.ncbi.nlm.nih.gov/pubmed/20098492</a><br>

<a href="https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(16)30101-2?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1364661316301012%3Fshowall%3Dtrue">https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(16)30101-2?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1364661316301012%3Fshowall%3Dtrue</a><br>

<br>

Then there is this paper by Daniel Nettle:<br>

<br>

<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3367698/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3367698/</a><br>

<br>

And finally, joint work by Mark Atkinson, Simon Kirby, and Kenny Smith at the Edinburgh Centre for Language Evolution:<br>

<br>

<a href="https://www.ncbi.nlm.nih.gov/pubmed/26057624">https://www.ncbi.nlm.nih.gov/pubmed/26057624</a><br>

<a href="https://www.ncbi.nlm.nih.gov/pubmed/30320460">https://www.ncbi.nlm.nih.gov/pubmed/30320460</a><br>

<br>

However, this is actually not quite the direction I had intended my query to take. What I was wondering about was instead the following:<br>

<br>

Since the days of Joseph Greenberg, typological studies have steadily grown in sample size (though that’s primarily true of secondary-data-based studies; primary-data-based studies (which is where my own work is at home) such as Dahl 1985 and Kay et al. 2009

 are still severely limited in scope by the daunting logistics of expert data collection from speakers of large and diverse samples of languages).<br>

<br>

However, if one were to perform a meta-study across a large sample of typological research, one would as a matter of course find that some languages are represented more frequently in typological research than others.<br>

<br>

What I was looking for was a way of quantifying this differential.<br>

<br>

Now, this frequency distribution is not of any direct concern for matters of quantitative analysis. Typologists have grappled intensely with the problem of genealogical and areal bias and have come up with a host of sampling techniques and methods tapping lineage-specific

 transition probabilities. That’s a problem that hasn’t been solved, and may not be entirely solvable - but at least it’s being addressed. And I’m confident that the tools that are being used minimize the risk of quantitative generalizations being invalidated

 by the existence of typological “dark matter”, in the form of sparsely documented and rarely sampled genera with properties that are highly divergent from those of better studied genera.<br>

<br>

However, what has been bothering me is a different, and definitely smaller concern: typologists also make assumptions about typological prototypicality and typological markedness. Cf. for example Chapters 4 and 6 of Croft (2003) and Dahl (1985, 2016). Where

 such assumptions express certain quantitative patterns in a given sample, their validity depends on the quality of the sample. That is not inherently problematic in my view, but it makes me uneasy that there’s been a lack of discussion (as far as I know) about

 the principles and constraints of inferring typological markedness and prototype patterns in the underlying population - the universe of extant languages - from a given sample.<br>

<br>

Assumptions about what’s common, typical, and unmarked in the languages of the world may also play other, potentially more insidious roles: they might influence for example the construction of etic grids and comparative concepts. Perhaps not least of all, they

 might influence proposal reviews of the merits of documentation projects for funding.<br>

<br>

I know that I personally routinely make assumptions about what a “typical” representative of a particular macro-area looks like - and routinely find myself realizing that I sorely underestimated the internal diversity of the area.<br>

<br>

For reasons such as these, I found myself wondering whether there’s a way to quantify the extent to which typological research has been over-representing languages that are either well-studied per se or that the researcher and maybe other typologists happen

 to have worked on.<br>

<br>

I should like to use this opportunity to (belatedly) thank the colleagues who responded to my query:<br>

<br>

Jack Du Bois<br>

Harald Hammarström<br>

Randy LaPolla<br>

Sebastian Nordhoff<br>

Kilu von Prince<br>

Daniel Ross<br>

<br>

I think an exhaustive summary is not warranted in this case since it’s not clear to me that I was particularly successful in communicating my query ;-) But two bits of information stood out for me as immensely valuable:<br>

<br>

* The LangDoc database (Hammarström & Nordhoff 2011) permits quantification of the state of documentation/description of the world’s languages. Harald Hammarström shared slides with me that present important patterns.<br>

<br>

* Kilu von Prince made me aware of an (apparently unpublished) recent presentation by Östen Dahl in which he identifies a set of 57 “LOL languages” (‘literate, official, lots of users’) and shows how these have been overrepresented both in older typological

 studies and in some WALS chapters/maps.<br>

<br>

Thanks again! — Juergen<br>

<br>

Croft, W. (2003). Typology and universals. Second edition. Cambridge: Cambridge University Press.<br>

Dahl, Ö. (1985). Tense and aspect systems. Oxford: Blackwell.<br>

Dahl, Ö. (2016). Thoughts on language-specific and crosslinguistic entities. Linguistic Typology 20(2): 427-437.<br>

Hammarström, H. &amp; Nordhoff, S. (2011). Langdoc: Bibliographic infrastructure for linguistic typology. Oslo Studies in Language 3(2): 31-43.<br>

<br>

<br>

> On Apr 3, 2020, at 11:08 AM, John Du Bois &lt;dubois@ucsb.edu&gt; wrote:<br>

><br>

> Trudgill's paper is excellent and thought-provoking. Here's the reference:<br>

><br>

> Trudgill, Peter (2015). Sociolinguistic typology and the uniformitarian hypothesis. In De Busser, R. & LaPolla, R. J. (Eds.), Language structure and environment: Social, cultural, and natural factors. Amsterdam: Benjamins. 133-148.<br>

> <a href="https://benjamins.com/catalog/clscc.6">https://benjamins.com/catalog/clscc.6</a><br>

><br>

> Best,<br>

> Jack<br>

><br>

> On Sat, Jan 4, 2020 at 6:15 AM Randy J. LaPolla &lt;randy.lapolla@gmail.com&gt; wrote:<br>

> Hi Juergen,<br>

> Relevant to this is work by Peter Trudgill on Sociolinguistic typology and the uniformitarian hypothesis. See the attached paper of his from a few years ago. I couldn’t find a publication reference for it, but it might be included in his Sociolinguistic Typology

 book.<br>

><br>

> All the best,<br>

> Randy<br>

> -----<br>

> Randy J. LaPolla, PhD FAHA （羅仁地）<br>

> Professor of Linguistics, with courtesy appointment in Chinese, School of Humanities<br>

> Nanyang Technological University<br>

> HSS-03-45, 48 Nanyang Avenue| Singapore 639818<br>

> <a href="http://randylapolla.net/">http://randylapolla.net/</a><br>

> Most recent books:<br>

> The Sino-Tibetan Languages, 2nd Edition (2017)<br>

> <a href="https://www.routledge.com/The-Sino-Tibetan-Languages-2nd-Edition/LaPolla-Thurgood/p/book/9781138783324">

https://www.routledge.com/The-Sino-Tibetan-Languages-2nd-Edition/LaPolla-Thurgood/p/book/9781138783324</a><br>

> Sino-Tibetan Linguistics (2018)<br>

> <a href="https://www.routledge.com/Sino-Tibetan-Linguistics/LaPolla/p/book/9780415577397">

https://www.routledge.com/Sino-Tibetan-Linguistics/LaPolla/p/book/9780415577397</a><br>

><br>

><br>

>> On 4 Jan 2020, at 2:09 AM, Bohnemeyer, Juergen &lt;jb77@buffalo.edu&gt; wrote:<br>

>><br>

>> Dear all — I was wondering whether anybody has attempted to quantify the extent of linguistic diversity in our knowledge of the languages of the world. I believe mathematically speaking the type of information I’m looking for is a frequency distribution.

 The question is to what extent are a handful of languages and language families overrepresented in our knowledge of the languages of the world whereas the vast majority of languages and language families are underrepresented. One can ask this question (i)

 about our descriptive knowledge of any and all languages and (ii) specifically about the typological literature. I’m most interested in (ii), but I’m guessing there’s more likely to be an answer to (i) (though I also realize that the odds of anybody having

 proposed an answer to either question without me having heard of it are not great). Anybody aware of such a study? Even relevant claims without empirical footing would be of interest. — Best — Juergen<br>

>><br>

>> Juergen Bohnemeyer (He/Him)<br>

>> Professor and Director of Graduate Studies<br>

>> Department of Linguistics and Center for Cognitive Science<br>

>> University at Buffalo<br>

>><br>

>> Office: 642 Baldy Hall, UB North Campus * Mailing address: 609 Baldy Hall, Buffalo, NY 14260<br>

>> Phone: (716) 645 0127<br>

>> Fax: (716) 645 3825 * Email: jb77@buffalo.edu * Web: <a href="http://www.acsu.buffalo.edu/~jb77/">

http://www.acsu.buffalo.edu/~jb77/</a><br>

>><br>

>> Office hours Tu/Th 3:30-4:30pm<br>

>><br>

>><br>

>> There’s A Crack In Everything - That’s How The Light Gets In (Leonard Cohen)<br>

>><br>

>> _______________________________________________<br>

>> Lingtyp mailing list<br>

>> Lingtyp@listserv.linguistlist.org<br>

>> <a href="http://listserv.linguistlist.org/mailman/listinfo/lingtyp">http://listserv.linguistlist.org/mailman/listinfo/lingtyp</a><br>

><br>


><br>

><br>

> --<br>

> =======================================<br>

> John W. DuBois<br>

> Professor of Linguistics<br>

> University of California, Santa Barbara<br>

> Santa Barbara, California 93106<br>

> USA<br>

> Email:<br>

> dubois@ucsb.edu<br>

> Zoom room: <a href="https://ucsb.zoom.us/my/dubois">https://ucsb.zoom.us/my/dubois</a><br>

> Web page: <a href="http://www.linguistics.ucsb.edu/faculty/dubois/">http://www.linguistics.ucsb.edu/faculty/dubois/</a><br>

><br>

<br>

<br>

--<br>

Juergen Bohnemeyer (He/Him)<br>

Professor and Director of Graduate Studies<br>

Department of Linguistics and Center for Cognitive Science<br>

University at Buffalo<br>

<br>

Office: 642 Baldy Hall, UB North Campus<br>

Mailing address: 609 Baldy Hall, Buffalo, NY 14260<br>

Phone: (716) 645 0127<br>

Fax: (716) 645 3825<br>

Email: jb77@buffalo.edu<br>

Web: <a href="http://www.acsu.buffalo.edu/~jb77/">http://www.acsu.buffalo.edu/~jb77/</a><br>

<br>

Office hours will be held by Skype, WebEx, or phone until further notice. Email me to schedule a call at any time. I will in addition hold Tu 12:30-1:30 and Th 2:30-3:20 open specifically for remote office hours.<br>

<br>

There’s A Crack In Everything - That’s How The Light Gets In<br>

(Leonard Cohen)<br>

<br>


</div>

</span></font></div>

</body>

</html>