"only six" argument

Thu Sep 14 11:01:01 UTC 2000

We have been following the HISTLING discussion initiated by Larry Trask
with interest, because over the last two years we have been working on a
particular case in which we had to settle how much data is needed to
establish relatedness (not genetic, but via borrowing).  We
have been doing what Robert R. Ratcliffe takes as his starting point in his
last e-mail, i.e. "approaching unclassified languages or languages which
haven't been compared before [where] the first question we have to ask is
whether these languages have something in common which cannot be due to
chance or coincidence."  The results will be published in the December
issue of Oceanic Linguistics, but we thought it might interest LIST members
to have a sneak preview, at least of the (rather long) section on
probability, where we discuss relevant issues.  (The section is attached.)

We were trying to find out whether some semantic and phonological matches
in Old Japanese and Old Javanese lexis were too extensive to be due to
chance. In this particular case, rather than looking at single sound
correspondences, we used whole-word comparison of longer words (CVCVC
structure) with recurrent sound correspondences.  While it is not possible
to go into the calculations here, it turned out that in this case only one
match between words of this length could be expected to occur by chance. In
the section on probability Rose discusses the usefulness of the approaches
taken earlier by Nichols and Ringe and goes on to propose that a Bayesian,
rather than frequentist, statistical approach should be the preferred
option.
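
As a rough illustration only (this is not the calculation in the attached
section), the frequentist side of the question can be sketched as a tail
probability: if one match is expected by chance, how likely are k or more
matches?  The sketch assumes chance matches are roughly Poisson-distributed,
which is our simplification here, and the values of k are hypothetical
rather than counts from our data.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam).

    ASSUMPTION: chance matches are modelled as Poisson; this is an
    illustrative simplification, not the method of the attached section.
    """
    if k <= 0:
        return 1.0
    # Sum P(X = 0) .. P(X = k-1), then take the complement.
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


# The expected number of chance matches (one) is taken from the text above.
EXPECTED_CHANCE_MATCHES = 1.0

# Hypothetical observed counts, for illustration only.
for k in (1, 2, 5, 10):
    p = poisson_tail(k, EXPECTED_CHANCE_MATCHES)
    print(f"P(at least {k} chance matches) = {p:.6f}")
```

Note that a figure of this kind gives the probability of the data under
chance, not the probability that chance is the explanation.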

We agree with Ratcliffe that "Numerical criteria and probability theory are
the most reliable means for making judgements of this type". But we are
able to demonstrate a few more things that might interest LIST readers
(and can also offer some real data!).  As mentioned, we also have some
points to make concerning the appropriateness of the frequentist (as
opposed to the Bayesian) paradigm for evaluating questions of this kind
(i.e. assessing the probability of a hypothesis). (Bayesian formulations
are used, for example, in forensics. We don't know to what extent
historical linguists are aware of them, so we offer them in case people
are interested.)
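
For readers who have not met the Bayesian formulation, a minimal sketch of
the odds form of Bayes' theorem follows: posterior odds = likelihood ratio
x prior odds.  The function name and all numbers are hypothetical, chosen
only to show how the same evidence supports different conclusions under
different priors; they are not results from our study.

```python
def posterior_probability(prior_prob, likelihood_ratio):
    """Update the probability of a hypothesis using Bayes' theorem in odds form.

    prior_prob:       P(H) before seeing the evidence, strictly between 0 and 1.
    likelihood_ratio: P(evidence | H) / P(evidence | not H), non-negative.
    """
    if not 0.0 < prior_prob < 1.0:
        raise ValueError("prior_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = likelihood_ratio * prior_odds
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical values: the same likelihood ratio applied to two priors.
HYPOTHETICAL_LR = 9.0
for prior in (0.5, 0.1):
    post = posterior_probability(prior, HYPOTHETICAL_LR)
    print(f"prior {prior:.2f} -> posterior {post:.3f}")
```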

Ann Kumar
Phil Rose

-------------- next part --------------
A non-text attachment was scrubbed...
Name: short_prob.doc
Type: application/mac-binhex40
Size: 71221 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/histling/attachments/20000914/da56d2a7/attachment.hqx>
-------------- next part --------------
 ===========================================================================
Dr Ann Kumar
Vice-President, Australian Academy of the Humanities
Centre for the Study of Asian Societies and Histories
Faculty of Asian Studies
Canberra ACT 0200
Australia
Tel. (02) 6249 3677/4658  fax. (02) 6279-8326