Random Noise in Multilateral Comparison.

Wed Sep 1 05:50:41 UTC 1999

ECOLING wrote:

> In a discussion about Random Noise,
> I was making a point that it is much less serious a problem
> in Multilateral Comparison than it is when one is trying to make
> an argument that two particular selected languages
> ARE GENETICALLY RELATED.

Actually, random noise is a far MORE serious problem when doing multilateral
comparison than when doing binary comparison.  Let's say that /t/ represents
20% of the initial consonants in each of four unrelated languages, A, B, C,
and D.  If we compare just pairs of words from A and B, then 40 out of a
thousand pairs should have /t/ as the first consonant in both languages
(0.2 x 0.2 = 0.04).  Now throw C into the mix.  Now A-C will have 40 matches, A-B will
have 40 matches, and B-C will have 40 matches for a total of 120 pairs out
of 1000 words linking A-B-C.  In addition, there will be 8 words with
matches all the way across (A-B-C).  Add D to the problem and you wind up
with 40 words for each possible pairing (A-B, A-C, A-D, B-C, B-D, and C-D),
for a total of 240 pairs, along with another 8 words for each possible
triplet (A-B-C, A-B-D, A-C-D, B-C-D), or 32 more, for a grand total of 272
"cognate sets" (out of 1000 possible ones) that include forms from at least
50% of the languages in each set.  In other words, doubling the number of
languages involved in the comparison gives us more than SIX times as much
random noise.  To the uninitiated, that is pretty impressive.
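
The arithmetic above can be checked with a short Python sketch. It assumes,
as in the example, that /t/ starts 20% of words independently in every
language and that we look at 1000 meanings. The function name and its
parameters are made up for this illustration. It tallies overlapping sets
(every pair, every triplet, and so on), so one word can be counted more
than once.

```python
from math import comb


def expected_chance_sets(n_languages, p=0.2, n_words=1000):
    """Expected number of chance matches per n_words meanings.

    Returns a dict mapping set size k (2..n_languages) to the expected
    number of k-language sets in which every member starts with the same
    consonant. Assumes the consonant has frequency p in each language
    and that the languages are independent (i.e. unrelated).
    """
    counts = {}
    for k in range(2, n_languages + 1):
        # comb(n, k) ways to choose the k languages; p**k chance that all
        # k of them have the consonant for a given meaning.
        counts[k] = round(comb(n_languages, k) * p**k * n_words, 1)
    return counts


for n in (2, 3, 4):
    print(n, "languages:", expected_chance_sets(n))
```

For four languages this gives 240 pairs and 32 triplets, 272 in all, plus
an expected 1.6 matches across all four languages, which the tally above
leaves out.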

John E. McLaughlin, Ph.D.
Assistant Professor
mclasutt at brigham.net

Program Director
Utah State University On-Line Linguistics
http://english.usu.edu/lingnet

English Department
3200 Old Main Hill
Utah State University
Logan, UT  84322-3200

(435) 797-2738 (voice)
(435) 797-3797 (fax)