Yeniseian and Na-Dene

Thu Nov 12 21:14:14 UTC 1998

----------------------------Original message----------------------------
Mark Hubey writes:

>Johanna Nichols wrote:
>>
>> I've worked out the chances of finding words with two similar consonants in
>> the same order, with similar but not necessarily identical meanings, in two
>> languages.  Out of a fixed list of 100 meanings chosen in advance (this is
>> analogous to asking "what is the probability that similar forms will mean
>> 'water' in both languages?", and so on for another 99 glosses), this is how
>> many resemblant sets it takes to exceed the range of chance and show that
>> relatedness is likely:
>>
>> 2-consonant words with the very same meaning:   7
>> 2-consonant words with similar meanings (modeling this as a search that
>> allows up to 5 senses' leeway, e.g. for 'fly' also 'flee', 'wing', or
>> whatever; these must also be specified in advance):  25
>> 1-consonant words (or 2-consonant words with one resemblant consonant and
>> one non-resemblant one) with the same meaning:  27
>> 1-consonant words with 5 senses' leeway each:  over 50
>
>
>I erased the rest not because it is not important but because I want to
>ask about this.
>
>Is it not true that the most important consideration in probability
>theory is knowing the sample space?
>
>In other words, when "matches" due to chance are being calculated,
>should not the fact that the two languages have (or seem to have)
>the same set of phonemes enter into the calculation? That is, should not
>the sample space include the phonemes that the languages could have had
>but do not, along with the phonemes that they do have?
>
>The calculations should involve conditional probabilities. No?
>
>Secondly, I also made some calculations. But mine is not for phonemes
>and does not take into account phonemes for the reason that they cause
>more complications, and do not take into account that the same speech
>space available for humanity is divided up differently and into different
>numbers of chunks (phonemes) in different languages. The fact that, out of
>M possible phonemes, languages seem to have a particular set of N phonemes
>has itself to be accounted for.
>

I have two ways of computing the probability of a generic consonant.

(1) Languages (as sampled in my database) have an average of about 20
consonant phonemes.  Any one of them has, on average, a 0.05 chance (1/20)
of occurring in a randomly chosen position in a randomly chosen form.  This
is the probability of a specific consonant.  Allowing three trials (allowing
a search through two to three distinctive features' space, or about three
phones) yields a probability of 0.143 (1 - 0.95^3 = 0.143).
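
The arithmetic in (1) can be checked with a short sketch.  It assumes that
"three trials" means three independent tries, each succeeding with the
single-consonant probability; that reading is an assumption, though it does
give the 0.143 quoted above.  The function and variable names are
illustrative only.

```python
def prob_any_match(p_single: float, trials: int) -> float:
    """Chance that at least one of `trials` independent tries succeeds,
    when each try succeeds with probability `p_single`."""
    return 1.0 - (1.0 - p_single) ** trials


# About 20 consonant phonemes on average, so a specific consonant
# has a 1/20 = 0.05 chance in a given position.
inventory_size = 20
p_specific = 1.0 / inventory_size

# Assumption: the three-phone search counts as three independent trials.
p_generic = prob_any_match(p_specific, 3)

print(round(p_specific, 3), round(p_generic, 3))  # prints: 0.05 0.143
```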

(2) Whatever the consonant inventories of the languages under comparison,
divide each of them into 7 phonetically coherent spaces.  (Variant:  divide
them into 6 coherent spaces, and count lack of any consonant -- e.g.
initial V rather than C -- as a seventh possibility.  This makes it
possible to accommodate Rotokas, with its 6-consonant system.)  This way
too we get a probability of 0.143 (1/7 = 0.143).
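
As a quick check that the two procedures agree, the sketch below computes
both figures exactly.  It assumes that the three trials in (1) are
independent; the variable names are illustrative only.

```python
from fractions import Fraction

# Procedure (1): 20 consonants, three tries at 1/20 each
# (independence of the tries is an assumption).
p_method_1 = 1 - (1 - Fraction(1, 20)) ** 3

# Procedure (2): seven phonetically coherent spaces, one chance in seven.
p_method_2 = Fraction(1, 7)

print(p_method_1, round(float(p_method_1), 3))  # prints: 1141/8000 0.143
print(p_method_2, round(float(p_method_2), 3))  # prints: 1/7 0.143

# The two estimates differ only in the fourth decimal place.
difference = p_method_2 - p_method_1
print(difference)  # prints: 13/56000
```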

Neither of these procedures guarantees fair coverage of vastly different
frequencies of different consonants, language-specific or family-specific
preferences of different consonants or consonant classes for different
phonotactic positions, and the like.  I hope that some of these differences
get ironed out by putting consonants together in phonetic groupings.
Still, the metric is only approximate.  It enables us to point out that 36
resemblant sets, half of them with only one resemblant consonant, are not
enough to indicate genetic relatedness unless a very small wordlist was
specified in advance.
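
One rough way to see how a count of resemblant sets compares with chance is
a binomial tail.  This is a sketch, not the procedure behind the thresholds
quoted earlier, and it need not reproduce them.  It assumes that each of the
100 meanings is an independent trial, that a generic consonant matches with
probability 1/7, and that two consonants match independently of each other.
All names are illustrative only.

```python
from math import comb


def binomial_tail(n: int, p: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_meanings = 100        # fixed wordlist chosen in advance
p_one = 1 / 7           # one resemblant consonant
p_two = p_one ** 2      # two resemblant consonants (independence assumed)

# Matches expected by chance alone, same meaning required.
expected_one = n_meanings * p_one
expected_two = n_meanings * p_two
print(round(expected_one, 1), round(expected_two, 1))  # prints: 14.3 2.0

# Chance of at least k two-consonant matches arising by accident.
for k in (3, 5, 7):
    print(k, binomial_tail(n_meanings, p_two, k))
```

The longer the wordlist searched, the more chance matches are expected,
which is why the size of the list has to be fixed before the search begins.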

Johanna Nichols

* * * * * * * * * * * * * * * * * * * * *
Johanna Nichols
Professor
Department of Slavic Languages
Mailcode 2979
University of California, Berkeley
Berkeley, CA 94720, USA

Phone:  (1) (510) 642-1097 (direct)
        (1) (510) 642-2979 (messages)
Fax:    (1) (510) 642-6220 (departmental)
* * * * * * * * * * * * * * * * * * * * *