The UPenn IE Tree: correction

Sean Crist kurisuto at
Wed Aug 25 16:29:10 UTC 1999

On Tue, 24 Aug 1999, Larry Trask wrote:

> In an earlier posting, I remarked that the Penn tree for IE recognized
> no Italo-Celtic grouping.  This was true for the original publication.
> However, I've just been informed off-list that the Penn group have more
> recently produced a revised version of their tree, which *does*
> recognize such a grouping.

> My apologies for misleading you, but I am a little concerned by the
> news.  As I understand it, the change was produced by adding only a
> single character to the set of characters used.  If this is right, it
> suggests that the approach used at Penn is somewhat less than robust.

We shouldn't be particularly disturbed that the addition of a single
character could result in a differently structured tree.  If the character
we've added is the first to indicate a shared innovation between two
branches, then it's the expected and desired result that the recomputed
tree should group those two branches together (assuming that there aren't
other conflicting characters).  Celtic and Italic were always very close
together in the tree; now they're grouped together.
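
To make the point concrete, here is a toy sketch in Python. It is not the
Penn team's algorithm, and the characters below are invented purely for
illustration (they are not the Penn data). I assume each character is
recorded simply as the set of branches sharing an innovation, and that two
characters conflict only when their sets overlap without one containing
the other. The function name `supported_clades` is likewise made up.

```python
from itertools import combinations


def supported_clades(characters):
    """Return the groupings implied by a set of characters.

    Each character maps to the set of branches sharing its innovation.
    Two characters conflict if their sets overlap without nesting.
    """
    groups = {frozenset(g) for g in characters.values() if len(g) > 1}
    for a, b in combinations(groups, 2):
        if a & b and not (a <= b or b <= a):
            raise ValueError(
                f"conflicting characters: {sorted(a)} vs {sorted(b)}"
            )
    return groups


# Invented toy characters (assumption: NOT the real character set).
before = {
    # an innovation shared by everything except Anatolian
    "char_1": {"Tocharian", "Italic", "Celtic", "Greek", "Armenian"},
    # an innovation shared by Greek and Armenian only
    "char_2": {"Greek", "Armenian"},
}

# Add a single character marking an innovation shared by Italic and Celtic.
after = dict(before, char_new={"Italic", "Celtic"})

italo_celtic = frozenset({"Italic", "Celtic"})
print(italo_celtic in supported_clades(before))  # False
print(italo_celtic in supported_clades(after))   # True
```

Nothing in the `before` set contradicts an Italo-Celtic grouping; there is
simply no evidence for it yet. One compatible character is enough to make
the grouping appear, while the other groupings are left untouched.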

Even from the earliest stages of this work by Ringe, Taylor, and Warnow
(at least, from the earliest versions of their handouts that I have),
there have been certain structures within the tree which have been very
robust and which have changed little even with later refinements in the
character set.  This is true of the early separation of Anatolian, of the
grouping of Greek and Armenian, and of the grouping of the Satem core
(Indic, Iranian, Slavic, Baltic).

As the character set has been refined, there have been some resulting
changes in the placement of Tocharian, Italic, and Celtic.  In nearly all
of the versions of the tree, these three branches separated from the
others some time after the separation of Anatolian, but prior to the
separation of Greco-Armenian.  The big picture hasn't changed much.

What has changed is this:

1) the team now claim that Italic and Celtic together form an Italo-Celtic
branch. In an earlier version, the four best trees returned by the
algorithm either had Italic and Celtic branching off separately (but one
right after the other), or else had a certain indeterminate structure which
could be resolved several ways, one of which is an Italo-Celtic grouping.

2) the team now claim that Tocharian branched off before, not after,
Italic and Celtic.

I'm racking my brains trying to remember which character it was that
caused Italo-Celtic to pop out as a group; I asked Don Ringe that
specific question, and I remember that it was some morphological
character. I would have thought it was the optative */a:/, but that
character has been in there since early versions of the work.  I'll ask
Don next time I see him.

  \/ __ __    _\_     --Sean Crist  (kurisuto at
 ---  |  |    \ /
  _| ,| ,|   -----
  _| ,| ,|    [_]
   |  |  |    [_]

