Using Dictionaries: Pros and Cons

Lars Henrik Mathiesen thorinn at diku.dk
Mon Mar 22 19:29:59 UTC 1999


   From: X99Lynx at aol.com
   Date: Thu, 18 Mar 1999 15:47:17 EST

   In a message dated 3/15/99 4:13:07 AM, thorinn at diku.dk wrote
   in a note titled "Re: STATISTICS IN LINGUISTICS:"

   <<The important thing is that this measurement must not depend on the
   researcher's knowledge of the languages --- on the contrary, it should be
   repeatable with consistent results by different people....

   One possible protocol might be to simply hand out dictionaries to a
   few undergraduates that never even heard the names of the languages
   before, and letting them find as many similarities as they can. >>

   [Quote from Bob Whiting about the uselessness of dictionaries].

   What a comparison of the two points-of-view given above might
   suggest is that Statistics and Linguistics may not be precisely on
   the same track.

I agree completely that this experimental procedure --- you elided the
better one --- will probably give random results in nearly all cases.
If the control experiments and statistical analysis I described, which
you also elided, are done properly, the lack of correlation with
`reality' (or anything else) will be painfully evident, and the whole
thing will be written off as an exercise in futility.

Come to think of it, it might be a good result to have, as proof that
dictionary comparisons are in fact worthless. Then Larry T would not
have to spend so much time correcting `data' about Basque.

But the real point of my post was not the use of dictionaries, but the
use of an unbiased `mechanism' to compare languages, with control
experiments on known cases to calibrate it.
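
One way such a control could be set up is a permutation test: score the
two word lists as paired by meaning, then re-score them many times with
the pairing shuffled, and see how often chance does at least as well.
What follows is only a rough sketch under assumptions. The scoring rule
(same initial symbol), the function names and the toy word lists are all
invented for illustration and are not the procedure from the earlier
post. Calibration on known cases would mean running the same thing on
language pairs whose relationship is already accepted.

```python
import random


def shared_initial(a, b):
    """Toy similarity (an assumption): 1 if both words begin with the same symbol."""
    return 1 if a and b and a[0] == b[0] else 0


def total_score(list_a, list_b, score=shared_initial):
    """Sum the similarity over words paired by position (i.e. by meaning slot)."""
    return sum(score(a, b) for a, b in zip(list_a, list_b))


def permutation_p_value(list_a, list_b, score=shared_initial, trials=1000, seed=0):
    """Estimate how often a random pairing scores at least as high as the real one.

    Returns (observed_score, p_value). A small p_value means the observed
    similarity is unlikely to arise from chance pairings under this scoring rule.
    """
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = total_score(list_a, list_b, score)
    shuffled = list(list_b)            # copy, so the caller's list is untouched
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)          # destroy the meaning-to-meaning pairing
        if total_score(list_a, shuffled, score) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (trials + 1)


if __name__ == "__main__":
    # Invented strings, not data from any real language.
    words_a = ["pa", "ta", "ka", "ma", "na", "sa", "la", "ra"]
    words_b = ["pi", "tu", "ko", "me", "ni", "su", "lo", "re"]
    print(permutation_p_value(words_a, words_b))
```

The point of the design is that the same mechanical scoring is applied
to the real pairing and to the shuffled ones, so whatever bias the
scoring rule has is present in both.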

   [...]
   Dictionaries also may not give you an accurate idea of the
   phonology or the comparative phonology between two languages.

That's exactly why my other suggested procedure was to have experts in
each single language produce lists of words in IPA, to be compared by
a computer program. At shallow time depths, this might actually have a
chance of producing results that fit tolerably well with accepted
views --- while still producing nonsense for longer range comparisons,
I would expect.
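
A minimal sketch of what such a comparison program might do, assuming
each expert supplies one IPA transcription per meaning slot, in the same
order for every language. The similarity measure used here (edit
distance normalized by word length) is a stand-in chosen for the
example, not something specified in the earlier post, and the function
names are hypothetical. It also treats every Unicode code point as one
segment, which is crude for IPA: diacritics and tie bars would need
proper segmentation in any serious attempt.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of symbols."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]: 1 for identical words, 0 for nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def mean_similarity(list_a, list_b):
    """Average similarity over two word lists aligned by meaning slot."""
    if len(list_a) != len(list_b) or not list_a:
        raise ValueError("word lists must be non-empty and of equal length")
    pairs = zip(list_a, list_b)
    return sum(similarity(a, b) for a, b in pairs) / len(list_a)
```

A raw number from mean_similarity means nothing by itself; it would have
to be compared against the figure obtained for language pairs whose
relationship, or lack of one, is already known.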

   [...]
   And even phonetic or historical linguistic dictionaries can REALLY
   ask too much when it comes to the supposed meaning of old words.
   [...]

And that is why I was talking only about comparing modern languages.
(But I may have forgotten to write that.)

Lars Mathiesen (U of Copenhagen CS Dep) <thorinn at diku.dk> (Humour NOT marked)


