[Corpora-List] Phonetic corpora typology

Bryar Family bryar at vermontel.net
Fri Mar 5 20:44:44 UTC 2010


Yuri Tambovtsev wrote:
> I have also sent my request on the difference between LANGUAGE and 
> DIALECT from a typological point of view but could read only jokes 
> about the army and fleet.

Yuri: 
RE: Language vs. Dialect
The question is a marvelous one. I'm no expert, but as any of the linguists on this list can tell you, the terms are politically defined, and no objective set of metrics involving isoglosses or other linguistic distinctions is going to be very helpful.
The concepts of language vs. dialect need to be understood as localized social and political constructs, and arbitrary ones at that. Neither is based on anything but socio-political declarations.
For example, look at the ETHNOLOGUE: http://www.ethnologue.com/home.asp
The Ethnologue and the accompanying SIL bibliography http://www.sil.org/ attempt to use objective linguistic metrics and have built an imposing academic citation index to buttress their decisions as to what is a language and what is a dialect.
This and the ISO language list are widely used references, but they are loaded with arbitrary delineations. 
Based in part on the Ethnologue, Papua New Guinea is supposed to have literally hundreds of languages. 
However, a close examination of the Ethnologue reveals that "Gapapaiwa" and "Ghayavi" are held to be separate PNG languages, yet they have a "73%" lexical similarity.
This declaration raises all sorts of questions.
First of all, how is this "similarity" measured with such precision given these languages vary from village to village? Who knows! 
On the other hand, "Galeya" and "Basima" are supposed to be dialects based on a purported 80% lexical similarity. http://www.ethnologue.com/show_country.asp?name=PG
What makes one a language and another a dialect? The most accurate answer is: because the SIL and the ISO say so!
Affiliated linguists have conducted various local field studies in an attempt to build an academic case and objective metrics for the distinctions they have made. 
For example, here's one attempting to frame a definition for "dialects" of the Mende "language" in Papua New Guinea:
Hoel, Hanna Marie, Tarja Ikäheimonen and Michiyo Nozawa. Mende dialect survey. [Manuscript] Available: 2007; Created: 1997-12. 9 p. http://www.sil.org/pacific/png/abstract.asp?id=48479 
Here is another conducted in Ethiopia:
Gutt, Ernst-August. 1980. "Intelligibility and interlingual comprehension among selected Gurage speech varieties." http://www.ethnologue.com/show_work.asp?id=50110 
The latter effort immediately gets into the weeds of delineating language ("Kistane," e.g.) vs. language group ("Gurage") vs. dialect ("Dobi," "Soddo"). Here the researchers conclude,
"The Dobi dialect comprehension of Soddo is 76%, and Soddo speakers’ of Dobi is 90%." 
Thus similar levels of mutual comprehension make you a language in New Guinea and a dialect in Ethiopia! 
Here is one guy's definition. What distinguishes a language from a dialect is the level of success achieved by some self-proclaimed or socio-politically powerful authority in placing artificial boundaries around areas where there is or was once a broad "dialect spectrum."
In other words, Village A can understand most of what Village B next door is saying, despite a vowel substitution or two, a few shifts in vocabulary, some peculiarities of syntax, etc. Village B can understand most of what their neighbors in Village C are saying. Village C can understand most of what residents of Village D are saying... but Village A can't understand the residents of Village D whatsoever!
So who determines if Villages B and C are speaking the standard version of one language, with versions A and D being dialects, or if Villages A and D are speaking two languages and Villages B and C are speaking some sort of creole? How does one determine which is THE LANGUAGE among these vernacular varieties, and which are the dialect deviations?
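The village chain above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the intelligibility scores are invented for the example (they are not field data), and the names SCORES, understands and chained_groups are hypothetical. The point is that "can understand" is not transitive, so the number of "languages" you get depends entirely on the cutoff someone chooses.

```python
# Hypothetical intelligibility scores (0.0 to 1.0) for a four-village chain.
# The numbers are invented for illustration; they are not survey results.
SCORES = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.9,
    ("C", "D"): 0.9,
    ("A", "C"): 0.6,
    ("B", "D"): 0.6,
    ("A", "D"): 0.1,
}
VILLAGES = ["A", "B", "C", "D"]


def understands(x, y, threshold):
    """True if villages x and y clear the chosen intelligibility cutoff."""
    if x == y:
        return True
    score = SCORES.get((x, y), SCORES.get((y, x)))
    return score >= threshold


def chained_groups(threshold):
    """Group villages linked by any chain of pairs above the cutoff."""
    groups = []
    for v in VILLAGES:
        # Every existing group that contains at least one village v understands.
        linked = [g for g in groups if any(understands(v, m, threshold) for m in g)]
        merged = {v}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return sorted(sorted(g) for g in groups)


for cutoff in (0.5, 0.8, 0.95):
    print(cutoff, chained_groups(cutoff))
print("A understands D at 0.8:", understands("A", "D", 0.8))
```

With these made-up numbers, a cutoff of 0.8 chains all four villages into one "language" even though A and D fail the same cutoff directly, while a cutoff of 0.95 yields four separate "languages". Nothing in the data picks the cutoff; somebody has to decide.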
Consider the implications of this paper: 
Stegen, Oliver and Richard Cox. Choosing a standard dialect for Rangi literacy. [manuscript]. http://www.ling.ed.ac.uk/~oliver/0703dial.pdf
Who are Stegen, Oliver and Cox that THEY get to choose what is the standard version of Rangi in Tanzania? Why is a standard version necessary? And on what basis are they making such a determination? And what are the consequences? 
The reality is that, by making a decision and generating a "standard" spelling and syntax for publishing and educational purposes, they and their committee of Rangi-speaking advisors are taking an important first step in CREATING a "language" from a set of related local vernaculars.
In this case, as in ALL OTHERS, the process of determining a standard is an artificial one, based on a variety of socio-political considerations. Do the people making the decision have the ability to impose a standard acceptable to the local population? To what degree does the result draw from high prestige vernaculars? Will the dominant social class, tribe or political power use it and impose it on others?
This mix of prestige and political power serves as the arbiter of standard language vs. dialect nearly everywhere.
Why is Florentine the basis for standard Italian? Some would say it is because Dante wrote the Inferno in the vernacular of Florence rather than Naples, making it the prestige dialect. 
Why is Standard Arabic "standard" given that far more people speak the Egyptian variety? Because it has a higher prestige as the language of the Koran.  
Why is the French spoken in Paris standard and Joual spoken in Montreal a dialect? One answer would be the relative prestige of Paris, the political power of the French King in the old regime, and the authority granted to the Academy as an arbiter. 
Why is High German the standard and other variations dialects? Perhaps it is because Martin Luther used a version of the Saxon chancery language ("Sächsische Kanzleisprache") to construct his Bible translation, and this was adopted by the major Protestant states including the most politically powerful ones.
Why is the English spoken at the British universities considered standard? Because they were the places of high prestige and the centers of literacy where one studied the King James Bible and the Book of Common Prayer, which were written in the vernacular used by the London Court.
In today's world language standards are often legal constructs as well. 
Who decides what is French? The Académie française! If you plan to pass your school exams you better write them to that standard! 
Who decides what is standard "Mandarin" Chinese, given the "language spectrum" quality of the spoken version prior to the 1st Chinese Revolution? It is the National Languages Committee (國語推行委員會).
What makes Bavarian, Schwäbisch, Alemannisch, Mainfränkisch, Hessisch, Palatinian, Rheinfränkisch, Westfälisch, Saxonian, Thuringian, Brandenburgisch, Low Saxon, etc., all dialects as opposed to languages? Their relative levels of historical prestige and their treatment by the German state and the EU.
The distinction between language and dialect is a political decision. Anyone who doubts the political nature of such determinations should look at Europe. Some of the EU's most entertaining political fights center around campaigns by various groups to get official minority language status for a given local vernacular. A great example of this sort of thing is the campaign for the recognition of Scottish as a "language" despite the fact that the average Londoner can understand Scottish far more easily than he can understand a speaker of any of the "English dialects" of the American Deep South.
Why is Dutch a language when it has more in common linguistically with Low German dialects than many Low German dialects have with standard German?
One answer is that there was and is a Dutch state, a Dutch capital, a Dutch school system to enforce standards, and a Bible created in the local vernacular of the Dutch court, the Statenvertaling. In other words, there was an autonomous center of a Germanic vernacular:
*  With a higher prestige than that of its neighbors, 
*  With an educational, literary and legal infrastructure capable of propagating its grammatical and linguistic peculiarities, and  
*  With the political power to force acceptance of them as a standard. 
The bottom line is this: 
If a verbal or written vernacular is recognized as a standard form by a state or other center of high prestige with an ability to force compliance with its peculiar pronunciation and grammatical system, YOU HAVE A LANGUAGE. 
This can be done by state fiat, by common schooling, or by common adoption by broadcasters, etc. 
It’s a political process. Here's another example:
When did Serbo-Croatian become Serbian, Croatian and Bosnian? The answer: after the Civil War, despite the fact that these "LANGUAGES" are nearly 99% mutually intelligible! Certainly "Serbian, Croatian, Bosnian, and Montenegrin" are not languages because of mutual incomprehension but because prestige leaders and government agencies insist they are.
Ivo Pranjković, the author of The Grammar of Croatian Language says that 
"On the level of standardization, Croatian, Serbian, Bosnian and even Montenegrin are different varieties, but of a same language. Thus, on purely linguistic level, or genetic level, on typological level, we're talking about one language and that must be clearly said."
To even describe these as varieties may be an overstatement. For example, a Croat speaker from Zagreb can understand his neighbor in Banja Luka speaking the Serbian LANGUAGE with little difficulty, but might have a tougher time understanding someone speaking the "Burgenland Croatian" DIALECT over in Austria.  
A good reference is Language and identity in the Balkans: Serbo-Croatian and its disintegration, by Robert Greenberg. See a review at http://www.sil.org:8090/silebr/2005/silebr2005-002
China features multiple examples of the political nature of the distinction between language vs. dialect. 
Some of China's official dialects like Hakka and Gan differ more from region to region than they differ from nearby Cantonese or Mandarin. They are more ethnic markers than linguistic delineations. 
On the other hand, "dialects" like Wu (江南話) have a dramatically different tonal architecture, vocabulary, etc. compared to Mandarin. Moreover they feature regional differences http://en.wikipedia.org/wiki/File:Wu_Dialects.png that themselves would be called dialects if not separate languages anywhere else. For example, the Suzhou and the Shanghai versions of Wu are mutually unintelligible. 
To attempt a definition for you:
A language can be defined as a dialect that has 
	A certain level of prestige - often from being a court vernacular
	Possibly, a set of common "canonical" oral or written literary works
	Sponsorship by some political or academic authorities willing to impose standard pronunciations and grammatical forms (through schools, use in the courts and official documents, language standards committees, etc.)
	And a level of socio-political recognition as such.  

So we get back to the wisecrack: 
"A language is a dialect with an army and navy" 
http://en.wikipedia.org/wiki/A_language_is_a_dialect_with_an_army_and_a_navy
Jack Bryar
Grafton, VT 05146
Office: 802-843-6033


-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Angus B. Grieve-Smith
Sent: Friday, March 05, 2010 8:59 AM
To: Corpora list
Subject: Re: [Corpora-List] Phonetic corpora typology

    If you think that was a joke, then I really failed at making myself 
clear.

-- 
				-Angus B. Grieve-Smith
				grvsmth at panix.com


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora



