BOOK REVIEW

by H.M. Hubey, Department of Computer Science, Montclair State University, New Jersey

Kessler, Brett. 2001. The Significance of Word Lists. CSLI Publications, x+277pp, hardback ISBN 1-57586-299-9, paperback ISBN 1-57586-300-6, Dissertations in Linguistics.
Announced at http://linguistlist.org/issues/12/12-790.html#1

DESCRIPTION OF THE BOOK

face="Courier New, Courier, monospace">The major issues addressed in the book are (i) concept of distance: similarity, (ii) comparative method, (iii) statistical tests, specifically
the chi-square test, (iv) data-cleaning so that the chi-square test gives good results. Quick conclusion/summary is in order: (i) the book is excellent,
(ii) contrary to expectation it is not about statistics but rather linguistics, (iii) its significance lies in its use of the methods of probability theory
in a comprehensive way instead of simple and patched methods used previously. In fact, the book's jacket explains the problem most succintly:</font>
</p></pre>
<p><font face="arial, helvetica" size="-1"> The most strident
controversies in historical linguistics debate whether claims for historical
<br>
connections between languages are erroneously based
on chance similarities between word <br>
lists. But even though it is the province of statistical
mathematics to judge whether evidence <br>
is significant or due to chance, neither side in these
debates uses statistics, leaving readers <br>
little room to adjudicate competing claims objectively.
This book fills that gap by presenting a <br>
new statistical methodology that helps linguists decide
whether short word lists have more <br>
recurrent sound correspondences than can be expected
by chance. The author shows that many <br>
of the complicating rules of thumb linguists invoke
to obviate chance resemblances, such as <br>
multilateral comparison or emphasizing grammar over
vocabulary, actually decrease the power <br>
of quantitative tests. But while the statistical methodology
itself is straightforward, the author also <br>
details the extensive linguistic work needed to produce
word lists that do not yield nonsensical results.</font></p>
<p><font face="Courier New, Courier, monospace"><br>
</font><font face="Courier New, Courier, monospace"> The first problem
Kessler tackles is the usual confusion between "resemblance" and "cognate",
"proof" vs "statistics", and "distance" vs "similarity". It is not unusual
even for "alleged" quantitative linguists to get these concepts backwards,
and then use Freudian projection as defense. <br>
</font> </p>
<p><font face="Courier New, Courier, monospace">First, the most important
concept. Suppose we attempt to ascertain the abstract property which this
compound word represents: hotness:coldness. From what we know we can see
that these measure the same thing but the scales are running in opposite
directions. In this case, the property we measure has an unequivocal name,
temperature.<br>
</font> </p>
Suppose we attempt the same with nearness:farness. It is easy to see that the property being measured is distance. However, this word is not abstract enough: we can also measure "time" with the same compound word, or we may use long_ago:recent, or even distantness:recentness. In perceptual space (not physical or temporal space) the common word in use is "similarity". Thus distance:similarity measures a concept called "distance". The reason for this seeming contradiction is that natural languages often have words with two meanings. We might correct it via similarity:dissimilarity, which then measures "distance". It is this dual usage of "distance" that often confuses people, especially in conjunction with the words "similarity" and "dissimilarity". In other words, we have the scale distance1:similarity, and this concept we measure via distance2.
<p><font face="Courier New, Courier, monospace">Having said this, we can say
that comparative methods are attempts to measure how far from <br>
chance the observed data is, nothing more, nothing less. This much is
made crystal clear by<br>
Kessler, who obviously understands historical linguistics methodology
better than some linguists despite being a psychologist. But surely not
being a linguist should not be held against him. It is not unusual for new
methods to be brought into a field by outsiders. It happens in physics,
engineering, computer science, genetics, biology, economics. Why not linguistics?<br>
</font> </p>
<font face="Courier New, Courier, monospace"> Once this is clear, then
it becomes clearer why binary comparison can be put on the same<br>
footing as multi-way comparisons. After all, whatever the data represents,
all we want to <br>
know is "what is the probability that this data occurred purely due to
chance?" We obviously want this number as small as possible if we want
to conclude that the data represent an event that is not due to chance.<br>
<br>
Kessler also makes it clear that if we obtain a very small probability
that the data represents a state of events that is not due to chance, all
we can conclude is exactly that. How it came about depends on other assumptions.
<br>
<br>
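To make this concrete, here is a minimal sketch of estimating such a probability by randomization. This is not Kessler's own procedure or code; the word lists and the matching criterion (agreement of first phonemes) are invented purely for illustration:

    import random

    # Toy word lists, aligned by meaning (same index = same gloss).
    # All forms here are invented purely for illustration.
    list_a = ["kan", "pod", "mus", "sol", "dom", "rek", "tal", "vin"]
    list_b = ["kun", "pad", "mis", "sal", "dum", "lok", "bor", "gaz"]

    def matches(a, b):
        """Count meaning-pairs whose first phonemes (here: letters) agree."""
        return sum(x[0] == y[0] for x, y in zip(a, b))

    observed = matches(list_a, list_b)

    # Null hypothesis: no historical connection, so the pairing of forms
    # is arbitrary. Estimate P(matches >= observed | chance) by repeatedly
    # shuffling one list and re-counting.
    trials = 10_000
    hits = sum(
        matches(list_a, random.sample(list_b, len(list_b))) >= observed
        for _ in range(trials)
    )
    print(f"observed = {observed}, estimated p-value = {hits / trials:.4f}")

The smaller the estimated probability, the farther the observed agreement is from what chance alone would produce; that single number is exactly what we said we want.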
Kessler is very clear on the fact that the Swadesh list is nothing more than a formalization of concepts that historical linguists developed over centuries: a list that would be useful for making tests, one that avoided technological borrowings, onomatopoeic words, and other "unreliable" words. Such lists were created by Swadesh. Any linguist who has anything against the lists is in disagreement not only with Swadesh but also with the basic postulates of historical linguistics.

In summary, what we want is a test or a number that tells us how far from a chance distribution the data are. There is such a test: the chi-square test. But there is a catch; the data must be independent. That means that if the words for "finger" and "foot" come from the same root, they are not independent, and the chi-square test will give incorrect results, since it is based on the independence hypothesis. Kessler gives a small and short example of how the test works. In truth he probably should have given a whole chapter or two on the mathematics of the chi-square test; however, he presumably decided that it can be found in any statistics book. Instead he concentrates on selection procedures for the words. At the end of the book are the Swadesh lists for the languages which Ringe used in his early attempt at the use of statistics; Kessler shows that many of these words are borrowings from other languages. In any case, any test will give incorrect results if the inputs are incorrect. In computer science this is called GIGO: Garbage In, Garbage Out. There is no substitute, not yet, for human intelligence.
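For readers who have not seen the test, here is a minimal sketch of a chi-square test of independence on a contingency table of initial-phoneme correspondences. The table below is invented for illustration; Kessler builds his from real word lists:

    import numpy as np
    from scipy.stats import chi2_contingency

    # Invented contingency table: rows are initial phonemes in language X,
    # columns are initial phonemes in language Y; each cell counts how
    # often those two initials co-occur in same-meaning word pairs.
    table = np.array([
        [12,  2,  1],   # X-initial /k/ against Y-initials /k/, /p/, /t/
        [ 1, 10,  3],   # X-initial /p/
        [ 2,  1,  9],   # X-initial /t/
    ])

    # Chi-square test of independence: a small p-value means the sound
    # correspondences recur more often than chance pairing would predict.
    # Caveat noted above: the word pairs themselves must be independent
    # (no doublets like "finger"/"foot" descending from one root).
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p:.2e}")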
However, there is now at least one way of comparing the closeness of languages to each other using a number. That is about the closest concept to "distance" that historical linguistics has ever reached. It would have been better if he had developed it further by normalizing it. For example, let d(x,y) be the "dissimilarity" between languages x and y, and let s(x,y) be the "similarity" between languages x and y. Then, by normalizing these quantities to the interval [0,1], we can easily see that s(x,y) = 1 - d(x,y). Obviously, d(x,y) should be normalized so that d(x,x) = 0, and so that if z and w are two "most distant" languages, then d(z,w) = 1. That is something that still needs more work.
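A minimal sketch of the kind of normalization this review has in mind (the raw dissimilarity scores are invented; any chi-square-based statistic could stand in for them):

    # Invented raw dissimilarity scores between pairs of languages;
    # larger = more dissimilar. Any real statistic could replace these.
    raw = {("x", "y"): 3.2, ("x", "z"): 7.5, ("y", "z"): 5.1}
    max_raw = max(raw.values())

    def d(a, b):
        """Dissimilarity scaled to [0,1]: d(a,a) = 0, most distant pair = 1."""
        if a == b:
            return 0.0
        return raw[(a, b) if (a, b) in raw else (b, a)] / max_raw

    def s(a, b):
        """Similarity as the complement of dissimilarity: s = 1 - d."""
        return 1.0 - d(a, b)

    print(d("x", "z"), s("x", "z"))   # 1.0 and 0.0 for the most distant pair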
As already mentioned, most of the book is spent on the actual results of the comparisons among the languages. Kessler makes various changes, e.g. Swadesh-100 vs. Swadesh-200, using only the first phoneme vs. using more phonemes, etc. He also discusses thoroughly the problems with using more than a single phoneme, or even a single phoneme. The problem is that we do not know which phoneme should be cognate with which phoneme. In other words, what if one language has lost its initial consonants? Then we would be attempting to match a consonant to a vowel, which is certain to produce bad results. In the vocabulary of data mining this is called data cleaning, or pre-processing, and it is an important part of analysis. Kessler discusses such problems thoroughly and clearly.
The final result is that historical linguistics is on its way to becoming a rigorous science like those that preceded it. Kessler probably could have spent more time (and space) explaining the concepts of hypothesis testing, false positives, false negatives, etc., even if only in an appendix.
Kessler discusses in other chapters how to go about making use of consonants other than the first one in the comparanda. The main problem is the one faced by researchers in speech recognition and genetics: the phonemes have to be "aligned". That is, one of the languages could have lost the initial consonant, or could have gone through a metathesis, etc. Therefore some algorithms are needed to obtain an optimum alignment automatically. These require the existence of a phonetic/phonemic distance, but such distances are much easier to construct than semantic distance (and do already exist in various forms, even if only implicitly). There is much more in the book that should be of interest to historical linguists.
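For readers unfamiliar with alignment, here is a minimal dynamic-programming sketch in the style of the standard edit-distance/Needleman-Wunsch algorithms. The cost function is a toy stand-in for a real phonetic distance, and the words are invented:

    # Toy phonetic distance: identical phonemes cost 0, both-vowels or
    # both-consonants cost 1, vowel-vs-consonant costs 2; a gap costs 1.
    VOWELS = set("aeiou")
    GAP = 1

    def sub_cost(p, q):
        if p == q:
            return 0
        return 1 if (p in VOWELS) == (q in VOWELS) else 2

    def align_cost(w1, w2):
        """Minimum-cost alignment of two words (weighted edit distance)."""
        n, m = len(w1), len(w2)
        # dp[i][j] = cost of aligning w1[:i] with w2[:j]
        dp = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            dp[i][0] = i * GAP
        for j in range(1, m + 1):
            dp[0][j] = j * GAP
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                dp[i][j] = min(
                    dp[i - 1][j - 1] + sub_cost(w1[i - 1], w2[j - 1]),
                    dp[i - 1][j] + GAP,   # gap in w2 (e.g. lost consonant)
                    dp[i][j - 1] + GAP,   # gap in w1
                )
        return dp[n][m]

    print(align_cost("kanta", "anta"))   # 1: one gap for the lost initial

Because a gap is cheaper than a consonant-to-vowel substitution, a lost initial consonant is absorbed as a gap instead of forcing the nonsensical match warned about above.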
In summary, the book is excellent, but it might require some work from those linguists who have math anxiety or any kind of aversion to quantitative techniques. However, beginning statistics courses are now taught at universities at the general-education level, and there is no excuse for anyone not to have at least some grasp of the fundamentals of statistics and probability theory. Time never goes backwards.
<pre class="moz-signature" cols="$mailwrapcol">--
M. Hubey
-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
The only difference between humans and machines is that humans
can be created by unskilled labor. Arthur C. Clarke
/\/\/\/\//\/\/\/\/\/\/ <a class="moz-txt-link-freetext" href="http://www.csam.montclair.edu/~hubey">http://www.csam.montclair.edu/~hubey</a></pre>