Algorithm for comparative method?

Sean Crist kurisuto at unagi.cis.upenn.edu
Tue Aug 31 04:12:14 UTC 1999


On Thu, 26 Aug 1999 ECOLING at aol.com wrote:

> I would suggest that the phonology hides problems just as great as the
> semantics, because we do not have an explicit empirically based metric
> there either!  No metric of how probable each state-to-state transition
> is between particular sounds, in specified contexts of languages of
> particular overall phonological or phonetic structure, etc.

I don't see that that makes any difference.  The Comparative Method
operates by looking at correspondences between languages and working
backwards through the mergers which have applied.  Occasionally, applying the
Comparative Method to a body of data leads us to posit sound changes which
seem phonetically unnatural.  This should not distress us, however, since
we occasionally find crazy rules within the phonologies of well-documented
modern languages; it's just one of those things that happens every now and
then ("crazy rule" is actually the correct technical term for such rules).
For example, the language Kashaya has a rule i -> u / d __ (/i/ becomes
/u/ after /d/), which is phonologically and phonetically absurd, yet it
happens.
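
To make the rule notation concrete, here is a minimal sketch of what
"i -> u / d __" does when applied to a string of segments.  The function
name and the test strings are made up for illustration; they are not real
Kashaya forms, and each character is assumed to stand for one segment.

```python
import re

def apply_i_to_u_after_d(word: str) -> str:
    """Apply the rule i -> u / d __ to a word.

    Assumption: the word is a plain string in which each character is
    one segment.  The lookbehind (?<=d) matches an "i" only when the
    segment immediately before it is "d".
    """
    return re.sub(r"(?<=d)i", "u", word)

# Made-up strings, not attested forms:
for form in ["dina", "tina", "didi"]:
    print(form, "->", apply_i_to_u_after_d(form))
```

The rule is trivial to state and to apply mechanically; what makes it
"crazy" is only that nothing about /d/ would lead us to expect it.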

When the Comparative Method yields a rule which seems phonetically
implausible, it's an occasion to double check your work.  However, it's
bad methodology to impugn the analysis, or the Comparative Method, just
because a rule we've posited goes against our expectations.  We know that
there _are_ sometimes rules right here in the light of the present which
are phonetically unnatural.  So I don't see how a metric for what's
phonologically or phonetically plausible is going to be of direct help
here.  This isn't how the Comparative Method works, in any case.

When I say that a computer can already compute the phonological half of
the work in the Comparative Method, I mean just this: it could look at the
correspondences between cognates and compute backwards to undo the mergers
which have applied.  For example, if Language A has /a/ contrasting with
/o/, but Language B has /a/ for both, and if there is nothing in the
phonological environments which allows us to predict where you get /a/ and
where you get /o/ in Language A, then we reconstruct an */a/ - */o/
contrast for the protolanguage, and posit a rule which merged them in
Language B.
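
The /a/ - /o/ case above can be sketched roughly as follows.  This is only
a toy illustration under strong assumptions, not a worked-out
implementation of the Comparative Method: it assumes the cognates have
already been identified and aligned segment by segment, it represents an
environment as a bare (preceding, following) pair, and it treats a
contrast as unpredictable whenever two different Language A segments turn
up in exactly the same environment.  The function name and the data are
hypothetical.

```python
from collections import defaultdict

def find_mergers(correspondences):
    """Find Language B segments that continue an unpredictable contrast in A.

    correspondences: iterable of (a_seg, b_seg, env) triples, one per
    aligned position in a cognate pair; env is any hashable description
    of the phonological environment (here a (preceding, following) pair).

    Returns a dict mapping each merged B segment to the sorted list of
    A segments it corresponds to.
    """
    # For each B segment, record the environments of each A correspondent.
    by_b = defaultdict(lambda: defaultdict(set))
    for a_seg, b_seg, env in correspondences:
        by_b[b_seg][a_seg].add(env)

    mergers = {}
    for b_seg, a_envs in by_b.items():
        segs = sorted(a_envs)
        # Simplifying assumption: the contrast is unpredictable if any two
        # A segments share at least one identical environment.
        unpredictable = any(
            a_envs[x] & a_envs[y]
            for i, x in enumerate(segs)
            for y in segs[i + 1:]
        )
        if unpredictable:
            mergers[b_seg] = segs
    return mergers

# Hypothetical aligned data: A contrasts /a/ and /o/ in the same
# environments, while B has /a/ for both.
toy = [
    ("a", "a", ("p", "t")),
    ("o", "a", ("p", "t")),
    ("a", "a", ("k", "n")),
    ("o", "a", ("k", "n")),
    ("i", "i", ("p", "t")),
]

for b_seg, a_segs in find_mergers(toy).items():
    protos = ", ".join("*/" + s + "/" for s in a_segs)
    print("reconstruct", protos, "and posit a merger to /" + b_seg + "/ in B")
```

If the /a/ and /o/ of Language A never occurred in the same environment,
the sketch would report nothing, since the distribution could then be the
result of a conditioned split in A.  A serious version would have to
compare environments by natural class rather than by exact identity.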

I _think_ that this is a computationally tractable problem, but nobody has
done it yet as far as I know.

> Developing a handbook of such information is very much like the task of
> developing the handbook of physical and chemical constants.
> Linguistics is in its infancy to a great degree
> because this is still handled on an intuitive basis, and one investigator's
> judgement may be quite different from another's, based on the accidents
> of what "odd" languages that investigator happens to be familiar with.
> There needs to be a much better way of integrating data from many different
> sources, making it available to all.

This is really a matter of opinion, but I think you're underestimating
how much we know.  Linguistics is not a new discipline; in its modern
incarnation, it's been around for two centuries, and the formal,
mathematical models of linguistics have been pursued for the better part
of this century, depending on what you take as their beginning (Saussure,
Trubetskoy, etc.).

A relatively comprehensive and recent statement of the theories of
phonology which have been distilled from this huge mass of data is John
Goldsmith, ed. 1995 _The Handbook of Phonological Theory_.  I think I'd be
hard pressed to come away from a text of that sort and describe the state
of our knowledge as one of "infancy".

  \/ __ __    _\_     --Sean Crist  (kurisuto at unagi.cis.upenn.edu)
 ---  |  |    \ /     http://www.ling.upenn.edu/~kurisuto/
  _| ,| ,|   -----
  _| ,| ,|    [_]
   |  |  |    [_]


