The UPenn IE Tree (the stem)

Brian M. Scott BMScott at stratos.net
Wed Sep 22 07:33:03 UTC 1999


X99Lynx at aol.com wrote:

> <<The program was not handed a list of Sanskrit, Greek,
> Latin, and Germanic words and expected to derive Grimm's
> Law therefrom, but was rather given the information that
> Grimm's Law encodes...>>

> And if I'm not mistaken it was also given a date for Grimm's
> Law and an assumption that of course that Law is relatively
> unique to Germanic languages.

You are mistaken: it was given no dates, even relative ones.  It was
given data on 300+ characters for 12 languages, the earliest attested in
each of the branches (counting Balto-Slavic and Indo-Iranian as two
branches each).  A character is a property on which languages can
differ; the centum/satem character, for instance, is two-valued
(satemized vs. non-satemized).  In effect it had a 12 by 300+ matrix of
numerically encoded character values.
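To make the idea of a character matrix concrete, here is a minimal Python sketch. Only the centum/satem column reflects a real character; the names `toy_a` and `toy_b`, their values, and the choice of six languages are invented for illustration and are not taken from the UPenn data.

```python
# Rows are languages, columns are characters; every entry is a numeric code.
LANGUAGES = ["Latin", "Old Irish", "Greek", "Sanskrit", "Avestan", "Lithuanian"]
CHARACTERS = ["satem", "toy_a", "toy_b"]  # toy_a and toy_b are hypothetical

MATRIX = [
    [0, 1, 0],  # Latin       (0 = non-satemized)
    [0, 1, 0],  # Old Irish
    [0, 0, 0],  # Greek
    [1, 0, 1],  # Sanskrit    (1 = satemized)
    [1, 0, 1],  # Avestan
    [1, 0, 0],  # Lithuanian
]


def languages_by_value(languages, matrix, j):
    """Group language names by the value they show for character j."""
    groups = {}
    for name, row in zip(languages, matrix):
        groups.setdefault(row[j], set()).add(name)
    return groups


satem = languages_by_value(LANGUAGES, MATRIX, CHARACTERS.index("satem"))
```

Each column partitions the languages into classes, and those partitions are all the tree-building step has to work with.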

It then attempted to build an unrooted tree on the assumption that
identical values of a character in different languages are not the
result of independent, identical innovation or borrowing.  (The reality
of borrowing is also dealt with; I'm describing the basic idea only.)
This is the assumption underlying the notion of a perfect phylogeny,
which you can find described at
<http://www.cis.upenn.edu/~histling/home.html>.  The papers posted there
give quite detailed information.
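For two-valued characters the perfect-phylogeny condition can be checked pair by pair: two such characters fit on one unrooted tree unless all four value combinations turn up among the languages. The sketch below shows only that textbook test. It is not the UPenn group's algorithm, it assumes every character is binary (multi-valued characters need more than a pairwise check), and the function names are my own.

```python
from itertools import combinations


def compatible(col_a, col_b):
    """Two binary characters conflict only if all four value pairs occur."""
    return len(set(zip(col_a, col_b))) < 4


def admits_perfect_phylogeny(matrix):
    """True if every pair of binary character columns is compatible.

    `matrix` is a list of rows, one per language, of 0/1 values.
    """
    columns = list(zip(*matrix))
    return all(compatible(a, b) for a, b in combinations(columns, 2))


# Hypothetical toy data: four languages, two binary characters.
tree_like = [[0, 0], [0, 0], [1, 0], [1, 1]]
conflicting = [[0, 0], [0, 1], [1, 0], [1, 1]]
```

In the second matrix every combination of values occurs, so at least one of the two characters must have arisen twice or been borrowed, which is exactly what the assumption rules out.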

The algorithm does not root the tree; that was done afterward on the
basis of linguistic considerations.

> I don't have a copy of the methods statement - I'd love to see
> it - but the point of branching is supposed to be a real event.

Do you mean historically, or in the construction of the tree?
Historically it is certainly not a point event, and in the tree it is
simply a consequence of what character values are shared by which
languages.  For example, all character values common to Latin and Old
Irish that are shared at most with Tocharian B or Hittite contribute to
the branching that separates Lat. and OIr. from everything else except
Toch. B and Hitt.
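The Latin and Old Irish example amounts to a simple set condition on the languages that share a value. A minimal sketch, with a function name and sample sets that are purely illustrative and not drawn from the actual data:

```python
def supports_grouping(sharers, core, optional=()):
    """True if all of `core` share the value and nothing outside
    `core` plus `optional` does."""
    sharers, core = set(sharers), set(core)
    return core <= sharers <= (core | set(optional))


CORE = {"Latin", "Old Irish"}
OPTIONAL = {"Tocharian B", "Hittite"}

# Hypothetical sets of languages sharing some character value.
examples = [
    {"Latin", "Old Irish"},
    {"Latin", "Old Irish", "Hittite"},
    {"Latin", "Old Irish", "Greek"},   # shared too widely
    {"Latin", "Hittite"},              # Old Irish missing
]

supporting = [s for s in examples if supports_grouping(s, CORE, OPTIONAL)]
```

Only the first two sets count toward the branch; the tree records nothing more than which such groupings the character values support.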

> My problem has always been with what this apparent use of
> technology adds.

This is discussed in some detail in papers available at the UPenn site.

Brian M. Scott


