[Lingtyp] Diversity/dispersion of descriptive/typological knowledge by language

Sat May 2 22:30:57 UTC 2020

I have a couple of comments about Juergen's actual question (as opposed to the other interesting issues that got raised in this thread).

The notion of "typological prototype" in Typology and Universals does not have to do with the frequency of occurrence of a phenomenon in a typological sample. It is defined by a set of implicational universals that refer to combinations of values from different grammatical categories. Cross-linguistic frequency plays only an indirect role, namely in the construction of the implicational universals; see for instance Maslova 2003 on the latter topic.

Regarding the choice of languages in typological samples, Alan Bell, in his 1978 chapter on language samples, discusses "bibliographic bias", which is related to what Juergen is concerned about: a sample will be based on existing language descriptions, and the language types represented may be biased by the circumstances that led to some languages being described rather than others, such as the biases that Östen apparently referred to. It is much easier now to avoid that bias in constructing areally and genetically stratified samples, thanks to the many very fine grammars of indigenous languages that have been produced in the past thirty years.

Bill
________________________________
From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> on behalf of Bohnemeyer, Juergen <jb77 at buffalo.edu>
Sent: Saturday, April 25, 2020 11:02 AM
To: John Du Bois <dubois at ucsb.edu>; Randy J. LaPolla <randy.lapolla at gmail.com>
Cc: LINGTYP <lingtyp at listserv.linguistlist.org>
Subject: Re: [Lingtyp] Diversity/dispersion of descriptive/typological knowledge by language

Thanks very much, Jack and Randy! I’m well aware of Trudgill’s thought-provoking work. Also worth mentioning in this connection are other recent studies that have looked into the nexus among population size, L2 learners, and complexification/simplification. There’s joint work by Gary Lupyan and Rick Dale:

https://www.ncbi.nlm.nih.gov/pubmed/20098492
https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(16)30101-2?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1364661316301012%3Fshowall%3Dtrue

Then there is this paper by Daniel Nettle:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3367698/

And finally, joint work by Mark Atkinson, Simon Kirby, and Kenny Smith at the Edinburgh Centre for Language Evolution:

https://www.ncbi.nlm.nih.gov/pubmed/26057624
https://www.ncbi.nlm.nih.gov/pubmed/30320460

However, this is actually not quite the direction I had intended my query to take. What I was wondering about was instead the following:

Since the days of Joseph Greenberg, typological studies have steadily grown in sample size. That is primarily true of studies based on secondary data, though; studies based on primary data (which is where my own work is at home), such as Dahl 1985 and Kay et al. 2009, are still severely limited in scope by the daunting logistics of expert data collection from speakers of large and diverse samples of languages.

However, if one were to perform a meta-study across a large sample of typological research, one would as a matter of course find that some languages are represented more frequently in typological research than others.

What I was looking for was a way of quantifying this differential.
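
To make the kind of quantification meant here concrete, the following is a minimal sketch only, not a method taken from any of the studies cited in this thread. It assumes one has a mapping from studies to the languages they sample (the study names and sample lists below are hypothetical toy data), counts how many studies each language appears in, and summarizes the skew of that distribution with a Gini coefficient and the share of sample slots taken by the most frequently sampled languages. The function names are likewise made up for illustration.

```python
from collections import Counter


def language_counts(samples):
    """Count in how many studies each language appears.

    `samples` maps a study label to the list of languages it sampled.
    A language is counted once per study, even if listed twice.
    """
    counts = Counter()
    for languages in samples.values():
        counts.update(set(languages))
    return counts


def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means every language is sampled equally often; values closer
    to 1.0 mean sampling is concentrated on a few languages.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def top_share(counts, k):
    """Share of all sample slots taken by the k most-sampled languages."""
    total = sum(counts.values())
    top = sum(c for _, c in counts.most_common(k))
    return top / total if total else 0.0


# Hypothetical toy data, for illustration only.
samples = {
    "study_a": ["English", "Yucatec", "Mandarin"],
    "study_b": ["English", "Mandarin"],
    "study_c": ["English", "Swahili"],
}

counts = language_counts(samples)
print(counts.most_common())
print("Gini:", gini(counts.values()))
print("Share of top language:", top_share(counts, 1))
```

Note that this only covers languages that appear in at least one study. To measure over-representation relative to the universe of extant languages, one would add a count of zero for every language that is never sampled before computing the Gini coefficient, which would push the value up considerably.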

Now, this frequency distribution is not of any direct concern for matters of quantitative analysis. Typologists have grappled intensely with the problem of genealogical and areal bias and have come up with a host of sampling techniques and methods tapping lineage-specific transition probabilities. That’s a problem that hasn’t been solved, and may not be entirely solvable - but at least it’s being addressed. And I’m confident that the tools that are being used minimize the risk of quantitative generalizations being invalidated by the existence of typological “dark matter”, in the form of sparsely documented and rarely sampled genera with properties that are highly divergent from those of better studied genera.

However, what has been bothering me is a different, and definitely smaller concern: typologists also make assumptions about typological prototypicality and typological markedness. Cf. for example Chapters 4 and 6 of Croft (2003) and Dahl (1985, 2016). Where such assumptions express certain quantitative patterns in a given sample, their validity depends on the quality of the sample. That is not inherently problematic in my view, but it makes me uneasy that there’s been a lack of discussion (as far as I know) about the principles and constraints of inferring typological markedness and prototype patterns in the underlying population - the universe of extant languages - from a given sample.

Assumptions about what’s common, typical, and unmarked in the languages of the world may also play other, potentially more insidious roles: they might influence for example the construction of etic grids and comparative concepts. Perhaps not least of all, they might influence proposal reviews of the merits of documentation projects for funding.

I know that I personally routinely make assumptions about what a “typical” representative of a particular macro-area looks like - and routinely find myself realizing that I sorely underestimated the internal diversity of the area.

For reasons such as these, I found myself wondering whether there’s a way to quantify the extent to which typological research has been over-representing languages that are either well-studied per se or that the researcher and maybe other typologists happen to have worked on.

I should like to use this opportunity to (belatedly) thank the colleagues who responded to my query:

Jack Du Bois
Harald Hammarström
Randy LaPolla
Sebastian Nordhoff
Kilu von Prince
Daniel Ross

I think an exhaustive summary is not warranted in this case since it’s not clear to me that I was particularly successful in communicating my query ;-) But two bits of information stood out for me as immensely valuable:

* The LangDoc database (Hammarström & Nordhoff 2011) permits quantification of the state of documentation/description of the world’s languages. Harald Hammarström shared slides with me that present important patterns.

* Kilu von Prince made me aware of an (apparently unpublished) recent presentation by Östen Dahl in which he identifies a set of 57 “LOL languages” (‘literate, official, lots of users’) and shows how these have been overrepresented both in older typological studies and in some WALS chapters/maps.

Thanks again! — Juergen

Croft, W. (2003). Typology and universals. Second edition. Cambridge: Cambridge University Press.
Dahl, Ö. (1985). Tense and aspect systems. Oxford: Blackwell.
Dahl, Ö. (2016). Thoughts on language-specific and crosslinguistic entities. Linguistic Typology 20(2): 427-437.
Hammarström, H. & Nordhoff, S. (2011). Langdoc: Bibliographic infrastructure for linguistic typology. Oslo Studies in Language 3(2): 31-43.

> On Apr 3, 2020, at 11:08 AM, John Du Bois <dubois at ucsb.edu> wrote:
>
> Trudgill's paper is excellent and thought-provoking. Here's the reference:
>
> Trudgill, Peter (2015). Sociolinguistic typology and the uniformitarian hypothesis. In De Busser, R. & LaPolla, R. J. (Eds.), Language structure and environment: Social, cultural, and natural factors. Amsterdam: Benjamins. 133-148.
> https://benjamins.com/catalog/clscc.6
>
> Best,
> Jack
>
> On Sat, Jan 4, 2020 at 6:15 AM Randy J. LaPolla <randy.lapolla at gmail.com> wrote:
> Hi Juergen,
> Relevant to this is work by Peter Trudgill on Sociolinguistic typology and the uniformitarian hypothesis. See the attached paper of his from a few years ago. I couldn’t find a publication reference for it, but it might be included in his Sociolinguistic Typology book.
>
> All the best,
> Randy
> -----
> Randy J. LaPolla, PhD FAHA （羅仁地）
> Professor of Linguistics, with courtesy appointment in Chinese, School of Humanities
> Nanyang Technological University
> HSS-03-45, 48 Nanyang Avenue| Singapore 639818
> http://randylapolla.net/
> Most recent books:
> The Sino-Tibetan Languages, 2nd Edition (2017)
> https://www.routledge.com/The-Sino-Tibetan-Languages-2nd-Edition/LaPolla-Thurgood/p/book/9781138783324
> Sino-Tibetan Linguistics (2018)
> https://www.routledge.com/Sino-Tibetan-Linguistics/LaPolla/p/book/9780415577397
>
>
>> On 4 Jan 2020, at 2:09 AM, Bohnemeyer, Juergen <jb77 at buffalo.edu> wrote:
>>
>> Dear all — I was wondering whether anybody has attempted to quantify the extent of linguistic diversity in our knowledge of the languages of the world. I believe mathematically speaking the type of information I’m looking for is a frequency distribution. The question is to what extent are a handful of languages and language families overrepresented in our knowledge of the languages of the world whereas the vast majority of languages and language families are underrepresented. One can ask this question (i) about our descriptive knowledge of any and all languages and (ii) specifically about the typological literature. I’m most interested in (ii), but I’m guessing there’s more likely to be an answer to (i) (though I also realize that the odds of anybody having proposed an answer to either question without me having heard of it are not great). Anybody aware of such a study? Even relevant claims without empirical footing would be of interest. — Best — Juergen
>>
>> Juergen Bohnemeyer (He/Him)
>> Professor and Director of Graduate Studies
>> Department of Linguistics and Center for Cognitive Science
>> University at Buffalo
>>
>> Office: 642 Baldy Hall, UB North Campus * Mailing address: 609 Baldy Hall, Buffalo, NY 14260
>> Phone: (716) 645 0127
>> Fax: (716) 645 3825 * Email: jb77 at buffalo.edu * Web: http://www.acsu.buffalo.edu/~jb77/
>>
>> Office hours Tu/Th 3:30-4:30pm
>>
>>
>> There’s A Crack In Everything - That’s How The Light Gets In (Leonard Cohen)
>>
>> _______________________________________________
>> Lingtyp mailing list
>> Lingtyp at listserv.linguistlist.org
>> http://listserv.linguistlist.org/mailman/listinfo/lingtyp
>
> --
> =======================================
> John W. DuBois
> Professor of Linguistics
> University of California, Santa Barbara
> Santa Barbara, California 93106
> USA
> Email:
> dubois at ucsb.edu
> Zoom room: https://ucsb.zoom.us/my/dubois
> Web page: http://www.linguistics.ucsb.edu/faculty/dubois/
>

--
Juergen Bohnemeyer (He/Him)
Professor and Director of Graduate Studies
Department of Linguistics and Center for Cognitive Science
University at Buffalo

Office: 642 Baldy Hall, UB North Campus
Mailing address: 609 Baldy Hall, Buffalo, NY 14260
Phone: (716) 645 0127
Fax: (716) 645 3825
Email: jb77 at buffalo.edu
Web: http://www.acsu.buffalo.edu/~jb77/

Office hours will be held by Skype, WebEx, or phone until further notice. Email me to schedule a call at any time. I will in addition hold Tu 12:30-1:30 and Th 2:30-3:20 open specifically for remote office hours.

There’s A Crack In Everything - That’s How The Light Gets In
(Leonard Cohen)
