Languages considered in typological research

Martin Haspelmath haspelmath at EVA.MPG.DE
Sun Dec 4 06:44:25 UTC 2011


To do what Wolfgang Schulze is asking for, one would simply need

(i) a representative selection of "the typological literature"
(ii) a database that records every language that is dealt with in each 
of these works

As Joseph Farquharson notes, the World Atlas of Language Structures 
(http://wals.info) comes close to this: It was intended to be 
representative of large-scale typological work, and the WALS database 
makes it easy to compute this ranking. See the list below (the top 113 
languages out of WALS's 2560 languages). (This is somewhat out of date, 
because it's based on the 2005/2008 edition, not the 2011 edition, but 
the trend is clear.)
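
The ranking described above amounts to counting, for each language, the number of distinct features in which it is coded. Here is a minimal sketch in Python, assuming the data are available as (language, feature) pairs. The function name, the feature identifiers and the toy records are all hypothetical and are not taken from the WALS database or its export format.

```python
from collections import defaultdict


def coverage_ranking(records, min_features=0):
    """Rank languages by the number of distinct features that cover them.

    records: iterable of (language, feature) pairs (assumed input shape).
    min_features: keep only languages with strictly more features than this.
    """
    features_by_language = defaultdict(set)
    for language, feature in records:
        # A set ensures that a repeated (language, feature) pair counts once.
        features_by_language[language].add(feature)

    counts = {lang: len(feats) for lang, feats in features_by_language.items()}
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(lang, n) for lang, n in ranked if n > min_features]


# Toy records invented for illustration; these are NOT real WALS data.
toy_records = [
    ("English", "F1"), ("English", "F2"), ("English", "F3"),
    ("Finnish", "F1"), ("Finnish", "F2"),
    ("Amele", "F1"),
    ("English", "F1"),  # duplicate record, counted only once
]

for language, n in coverage_ranking(toy_records, min_features=1):
    print(f"{language:<10}{n}")
```

The same function would work for the more general database sketched under (i) and (ii) if each pair recorded a language and a work in which it is discussed, rather than a language and a feature.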

There are at least three important caveats here:

(i) WALS is not representative of typological work as a whole -- only of 
large-scale typological work, covering more than 150 languages. Much 
typological work discusses grammatical features at a depth that is not 
possible with such large numbers of languages, because the relevant 
information cannot be easily found in reference grammars.

(ii) The WALS authors were explicitly given the instruction to try to 
cover a core sample of 100 or 200 languages, so that the number of 
languages treated in most chapters would be maximized (see 
http://wals.info/languoid/samples/200). So the makeup of the top 200 of the 
WALS Language Coverage List is probably largely due to this instruction.

(iii) As Silvia Kouwenberg notes, pidgins and creoles are not well 
represented in WALS. This is because WALS is an atlas, and it was 
intended first and foremost as a way of showing areal patterns. 
Languages that arose due to long-distance migration over the last few 
centuries (including languages such as Brazilian Portuguese or 
Surinamese Hindustani) would confuse this areal picture, so it was 
(controversially) decided not to encourage their inclusion in WALS.

Thus, we would need a more comprehensive database that does not show 
these idiosyncrasies. Colin Masica is "surprised that this hasn't been 
done", but this is not surprising at all -- it would be quite difficult 
to get funding for such an enterprise.

Greetings,
Martin

On 12/3/11 6:42 PM, Wolfgang Schulze wrote:
> Dear friends,
> just a short (and maybe silly?) question: Is anybody aware of any 
> statistics on the extent to which the individual languages of the 
> world are dealt with in the typological literature? 
> It would be interesting to see where (and why!) there are both 
> lacunae and statistical 'peaks'. The issue could be refined if one 
> includes the classical linguistic domains such as phonology, 
> morphology, syntax, semantics etc. Such a "World Atlas of Linguistics 
> Data" (just to give it a name) would not only help motivate 
> researchers to fill lacunae, but also help us understand what the 
> reasons may be for certain preferences...
> Best wishes,
> Wolfgang

WALS Language Coverage:

Languages in WALS and number of WALS features in which the language is 
considered
(only languages that occur in more than 100 features out of 141)

English                 139
French                  136
Finnish                 135
Russian                 135
Spanish                 135
Turkish                 135
Hungarian               133
Indonesian              133
Japanese                130
Mandarin                130
Amele                   129
German                  129
Greek (Modern)          129
Lezgian                 129
Abkhaz                  128
Evenki                  128
Korean                  128
Persian                 128
Basque                  127
Hausa                   126
Maori                   126
Georgian                125
Kannada                 125
Khalkha                 125
Malagasy                125
Supyire                 125
Hindi                   124
Tagalog                 124
Arabic (Egyptian)       123
Greenlandic (West)      123
Hixkaryana              123
Swahili                 123
Vietnamese              123
Slave                   122
Burushaski              121
Chamorro                121
Chukchi                 121
Fijian                  121
Hebrew (Modern)         121
Lango                   121
Oromo (Harar)           121
Thai                    121
Yaqui                   121
Zulu                    121
Maybrat                 120
Tukang Besi             120
Kanuri                  119
Kayardild               119
Mapudungun              119
Yoruba                  119
Yukaghir (Kolyma)       119
Burmese                 118
Krongo                  118
Mangarrayi              118
Tiwi                    118
Guaraní                 117
Khoekhoe                117
Meithei                 117
Ngiyambaa               117
Ainu                    115
Jakaltek                115
Lakhota                 115
Martuthunira            115
Pirahã                  115
Wari'                   115
Lavukaleve              114
Rapanui                 114
Alamblak                113
Gooniyandi              113
Kutenai                 113
Mixtec (Chalcatongo)    113
Awa Pit                 112
Kobon                   112
Latvian                 112
Maricopa                112
Imonda                  111
Apurinã                 110
Berber (Middle Atlas)   110
Warao                   110
Canela-Krahô            109
Nivkh                   109
Quechua (Imbabura)      108
Rama                    108
Wichí                   108
Yagua                   108
Koromfe                 107
Bagirmi                 106
Hunzib                  106
Ingush                  106
Maung                   106
Epena Pedee             105
Ket                     105
Koasati                 105
Luvale                  105
Sango                   105
Iraqw                   104
Kewa                    104
Sanuma                  104
Shipibo-Konibo          104
Ju|'hoan                103
Kilivila                103
Nunggubuyu              103
Asmat                   102
Ewe                     102
Grebo                   102
Hmong Njua              102
Khasi                   102
Khmer                   102
Kiowa                   102
Ndyuka                  102
Wichita                 102
Arapesh                 101
Oneida                  101
