Phonetic Resemblance, Birthday Prob. Regular Sound change and Yakhontov

H. Mark Hubey HubeyH at Mail.Montclair.edu
Sun Feb 7 22:29:43 UTC 1999


----------------------------Original message----------------------------
Larry Trask wrote:
>
> ----------------------------Original message----------------------------
> On Sat, 6 Feb 1999, Alexis Manaster-Ramer wrote:
>
> > This goes to the heart of the whole fight about what role phonetic
> > similarity plays in comparative linguistics (Larry claims none at
> > all. I claim a subordinate but important and indeed probably crucial
> > one). But note that Starostin is not doing anything that Larry could
> > object to on this score.  He is NOT using phonetic similarity at
> > all.

> However, in the Mother Tongue exchange, Sergei was most certainly
> working with mere perceived phonological and semantic resemblances, and
> with nothing else at all.  That is obvious to anyone who reads the
> relevant passage, and that is what got me confused about what Yakhontov
> was saying in the first place.

I would like to point out some quick facts:

1. The so-called Birthday Problem shows why quick guesses are often
wrong in probability problems. Computations show that for 23 persons
selected at random, the odds that at least two have the same birthday
are about 50-50.

2. Suppose we are comparing two languages A and B, and we have a batch
of candidate words. For simplicity let A and B have about 20 consonants
each and let our comparison be a simple one of matching up consonants.
There are 20 x 19 = 380 possible sound changes (ignoring the no-change).
Naturally, we are looking for regular sound correspondence/change (RSC).

        2.i) In the worst possible case, if we find 380 words with sound
        changes, each one could be unique and hence there is no sign of
        regularity. Therefore even in this worst case, if we find 381
        sound changes, then according to the Pigeonhole Principle of
        counting there will be at least one sound change that is repeated
        and hence "regular" in this restricted sense.

        2.ii) However, this worst-case scenario is very unlikely to happen.
        Its probability is near zero. What is more likely to happen is
        something similar to the Birthday Problem. If we find ~25 words
        which seem to correspond, the odds are about 50-50 that at least
        one sound change will be repeated. As the number of matches
        increases to about 100 or so, more and more sound changes will be
        repeated. It is not difficult to compute the distributions of the
        sound changes. I have done some of these. So we can always
        subtract out this baseline due to chance.
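
The figures in points 1 and 2 can be checked with a short calculation.
What follows is only a minimal sketch: it assumes that every one of the
d possible outcomes (365 birthdays, or 380 sound changes) is equally
likely and that each word contributes exactly one sound change. Neither
assumption is stated above, and real sound changes are certainly not
uniformly distributed. The function names are hypothetical.

```python
from math import prod


def prob_repeat(n, d):
    """Probability that at least two of n independent, uniformly random
    draws from d equally likely categories land on the same category."""
    if n > d:
        return 1.0  # Pigeonhole Principle: a repeat is unavoidable
    # P(no repeat) = (d/d) * ((d-1)/d) * ... * ((d-n+1)/d)
    return 1.0 - prod((d - k) / d for k in range(n))


def smallest_n_for_even_odds(d):
    """Smallest number of draws for which a repeat is at least 50% likely."""
    n = 1
    while prob_repeat(n, d) < 0.5:
        n += 1
    return n


if __name__ == "__main__":
    print("birthdays, 23 persons:", round(prob_repeat(23, 365), 3))
    for n in (5, 6, 25, 50, 100, 381):
        print("380 sound changes,", n, "words:", round(prob_repeat(n, 380), 3))
    print("even odds reached at n =", smallest_n_for_even_odds(380))
```

Under these assumptions the even-odds point for 380 possible changes
falls in the low twenties, in line with the "~25 words" estimate above.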



The first conclusion we can draw is that what really counts (especially
in languages like IE and AA, for which plenty of samples exist going
back thousands of years) is really "quantity", because if we find
quantity we will also find, due to the laws of probability, "regular
sound change".

This means that even established families, with say 400 RSC, should be
tested rigorously by subtracting out the baseline RSC that could occur
purely due to chance. It is easy enough to do this via simulation.
Linguists who offer this kind of simulation as evidence against
proto-worlders forget to apply this criterion to their own language
families.
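
One way such a simulation could be set up is sketched below. It is an
illustration under stated assumptions, not a description of any
simulation actually run: each matched word is assigned one of 380
sound changes uniformly at random, and a change counts as "regular" if
it turns up at least min_count times. The function names, the uniform
model, and the min_count threshold are all hypothetical choices.

```python
import random
from collections import Counter


def recurring_changes(n_words, n_changes=380, min_count=2, rng=None):
    """Draw one random sound change per word and count how many distinct
    changes occur at least min_count times (i.e. look 'regular')."""
    if rng is None:
        rng = random.Random()
    counts = Counter(rng.randrange(n_changes) for _ in range(n_words))
    return sum(1 for c in counts.values() if c >= min_count)


def chance_baseline(n_words, n_changes=380, min_count=2, trials=2000, seed=0):
    """Average number of 'regular' changes produced by chance alone,
    estimated over many simulated word lists (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(
        recurring_changes(n_words, n_changes, min_count, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for n in (6, 25, 100, 400):
        print(n, "matched words ->", chance_baseline(n),
              "recurring changes expected by chance")
```

The number of recurring changes observed in real data would then be
compared against this chance baseline for the same number of matched
words; raising min_count gives a stricter notion of regularity.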

Where "regular sound correspondence" (RSC) is really important is when
we find it in small samples. After all, we get RSC with large numbers
even due to chance, because it cannot be avoided. Then again, most
linguists ignore small numbers of correspondences even if the sound
changes are "regular", because they claim that the numbers are too
small, but this is exactly when RSC is significant.

That means that if we found only 5-6 words and saw regular sound change,
then it is really significant, because the laws of chance dictate that
for small numbers of correspondences the odds of repetition (RSC) are
small.
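
As a rough worked check of the 5-6 word case, again assuming d = 380
equally likely sound changes and one change per word (simplifying
assumptions, not claims about real data), the chance of at least one
repetition among n words is bounded by the number of word pairs
divided by d:

```latex
P(\text{at least one repeat among } n \text{ words})
  \;\le\; \binom{n}{2}\frac{1}{d} \;=\; \frac{n(n-1)}{2d}

n = 5:\quad \frac{5 \cdot 4}{2 \cdot 380} = \frac{10}{380} \approx 0.026
\qquad
n = 6:\quad \frac{6 \cdot 5}{2 \cdot 380} = \frac{15}{380} \approx 0.039
```

So under this simple model a repeated change among only 5-6 words
would arise by chance less than about 4% of the time.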

So Yakhontov is apparently trying to do something like this. I say "like
this" because I do not see any clear reasoning that this is being done.

Of course, after this, since the justification for RSC is probabilistic
anyway, there is no reason to attack the use of statistics, which is
based on the laws of probability.


--
Best Regards,
Mark
-==-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
hubeyh at montclair.edu =-=-=-= http://www.csam.montclair.edu/~hubey
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


