Principled Comparative Method - a new tool

Tue Aug 31 06:01:24 UTC 1999

<<The comparative method doesn't compute anything.>>

In a message dated 8/26/99 5:10:47 PM, kurisuto at unagi.cis.upenn.edu replied:

<<It does.  It's essentially a function which takes attested languages as its
input and gives reconstructed languages as its output.  In principle, a
program could be written to do it; the major unsolved problem is modeling the
semantics in a way that allows a program to make human-like judgments
regarding what semantic developments are reasonable.  The phonology, tho,
could almost certainly already be computed by program.>>

Which brings up an interesting question.  Why use 'semantics'?

After all, in the usual presentation of the Comparative Method, the meaning
of the word is just really a way of squaring up different languages.  First
of all, to make a guess about whether they may be related.  And secondly
"meanings" - generally dictionary/glossary meanings - just line up the
languages in a convenient way so that phonology can be more easily compared.

We can after all imagine two hypothetical languages where every word in one
is phonologically cognate with a corresponding word in the other - and even
have clear historical proof of this total cognation - but at the same time
find in usage none of these cognates are 'semantically' similar in any way
that is apparent.

True, this is an imaginary situation.  But it is possible because we know that
the paths of phonetic cognation may be quite distinct from the paths that
yield 'semantic' change.  Some 25th Century NeoGrammarian could have
difficulty seeing the relationship between the prescribed "gaiety" of the
Easter season's priestly vestures and the judicial recognition of "gay"
rights.  Presume sketchy contextual information and the 'semantic'
relatedness might seem implausible.

So might there not be, with the mega-statistical probabilities created by a
mega-data base, a way to avoid the whole issue of meaning?  The co-occurrence
of sound categories in well-populated distributions should yield high degrees
of statistical certainties (so long as you got the dates right).
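
A rough sketch of what such a meaning-free tally might look like, in Python.
Everything here is an assumption for illustration: the function names, the
toy word lists (simplified spellings, not real data), and above all the
crude rule that two words are compared only when they have the same length,
which stands in for a real alignment procedure.

```python
from collections import Counter
from itertools import product


def correspondence_counts(words_a, words_b):
    """Tally segment-to-segment pairings across two word lists.

    No glosses are consulted: every word of A is compared with every
    word of B, position by position, whenever the lengths match.
    """
    counts = Counter()
    for word_a, word_b in product(words_a, words_b):
        if len(word_a) == len(word_b):  # crude stand-in for real alignment
            counts.update(zip(word_a, word_b))
    return counts


def recurrent(counts, minimum=2):
    """Keep only the pairings seen at least `minimum` times."""
    return {pair: n for pair, n in counts.items() if n >= minimum}


# Hypothetical toy lists, simplified spellings, for illustration only.
WORDS_A = ["pater", "pisk", "ped"]
WORDS_B = ["fader", "fisk", "fot"]

if __name__ == "__main__":
    print(recurrent(correspondence_counts(WORDS_A, WORDS_B)))
```

On these toy lists the only pairing that recurs is p:f.  Whether that kind
of signal survives in a real, noisy data base is exactly the open question.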

But then if we just did the phonetics we might find ourselves asking whether
the result really describes what we mean by language.  But I guess it's all a
matter of what we are trying to understand.

<<If there's some other way of going from attested forms to reconstructions
of prehistoric forms without using the Comparative Method, I'd like to know
about it.>>

I believe internal reconstruction is one often mentioned.  Typological
inference is another.

<<The only reason we're able to say anything at all about
prehistoric languages is that sound changes have a particular property,
namely, they are exceptionless (with a small amount of hand-waving here).
The Comparative Method crucially exploits this property of sound changes.>>

Now, to be fair, what you are speaking about is a working assumption.  (In
fact, the most productive thing about Grimm's Law may have been the
'exceptions'.) But it is a little silly to say that the only thing we can say
about prehistoric languages is that their sound changes were exceptionless.
In fact, by definition, we don't know anything directly about the sounds of
prehistoric languages.   So we don't know, by definition, if the sound
categories included exceptions or not.  But we have decrypted prehistoric
languages without any knowledge of what sounds the characters represented.

<<Automata normally perform a concatenation operation across each arc between
states.  One can imagine an automaton-like machine where the transitions can
perform other sorts of operations,....  But if the machine in question is
strictly concatenative (as automata at least canonically are), I'm puzzled
as to how you would model historical sound change in such a machine, since
historical sound change isn't concatenative.>>

It's a little like looking at the pistons in a car engine and asking which
one will get you to Chicago.  You are assuming a point-for-point analogy
between the internal system or structure used by the automaton and the
external structure it is being applied to analyze.  The "linkages" in
"concatenative" do not have to mirror the elements you are analyzing.  They
are rather internal relationships yielding values that mathematically
correspond to but do not have to structurally mirror the values you've
attached to external events.

/a/>/a/ may correspond to a single "link" in your concatenation.  /a/ > /b/
may correspond to six, even though your real-life event may correspond to
only one.  Those six links represent values you have assigned to /a/ > /b/,
which the machine achieves any way it must in order to match the operations
required.  'Invisible' intermediate formulae in a spreadsheet are a good
example.
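
A toy illustration of the point, in Python.  The table, the arc labels
s1..s6 and the function name are all made up for the purpose; this is not
a model of any real automaton package, just a picture of one external
correspondence being backed by a longer chain of internal steps, each of
which does nothing but concatenate.

```python
# Hypothetical internal paths.  Each external correspondence is backed by a
# chain of internal arcs; the arc labels carry no linguistic meaning.
INTERNAL_ARCS = {
    ("a", "a"): ["s1"],                                # one link
    ("a", "b"): ["s1", "s2", "s3", "s4", "s5", "s6"],  # six links
}


def apply_change(source, target):
    """Return the external result plus the internal trace that produced it.

    The trace is built purely by concatenation, one arc at a time, even
    though the external event it stands for is a single change.
    """
    trace = []
    for arc in INTERNAL_ARCS[(source, target)]:
        trace.append(arc)  # the only operation performed on each arc
    return target, trace


if __name__ == "__main__":
    for change in INTERNAL_ARCS:
        result, trace = apply_change(*change)
        print(change, "->", result, "via", len(trace), "internal link(s)")
```

The caller sees only /a/ > /b/; the six links stay inside the machine, much
like the hidden cells of the spreadsheet.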

<<Whether or not loans happened in the light of written history, you can
identify a word as a loan from a related language because of the sound
changes it has and has not undergone.  For example, while English "cardiac"
does ultimately go back to the PIE word for "heart", you can readily tell
that it is a loan from a non-Germanic language, because it has not undergone
Grimm's Law, which applied exceptionlessly in prehistoric Germanic.>>

Unless of course you are among the number of linguists (no small number) that
find Grimm's Law representing archaisms, in which case you must find another
path for the loan.  But, in one very important definitional sense,  every
word in modern English is a "loan" word.  What, for example, is not a loan
word in Old French, if 'Frankish' is described as a "different language?"
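
For what it is worth, the test described in the quoted passage can be
sketched in a few lines of Python.  This is a deliberately simplified
version under stated assumptions: only the two plain stop series of Grimm's
Law are included (no aspirates, no Verner's Law), vowels are ignored, and
the root "kerd" and the stems "heart" and "kard" are simplified spellings
chosen for illustration.  The function names are hypothetical.

```python
# Simplified Grimm's Law: voiceless stops become fricatives and voiced
# stops become voiceless.  Aspirates and Verner's Law are left out.
GRIMM = {"p": "f", "t": "th", "k": "h", "b": "p", "d": "t", "g": "k"}

VOWELS = set("aeiou")


def apply_grimm(root):
    """Apply the simplified shift to every segment in a single pass."""
    return "".join(GRIMM.get(seg, seg) for seg in root)


def skeleton(form):
    """Drop vowels so that only the consonants are compared."""
    return "".join(ch for ch in form if ch not in VOWELS)


def classify(root, stem):
    """Say whether a stem looks shifted, unshifted, or neither."""
    if skeleton(stem) == skeleton(apply_grimm(root)):
        return "shifted"    # consistent with inheritance through Germanic
    if skeleton(stem) == skeleton(root):
        return "unshifted"  # consistent with a loan from outside Germanic
    return "unclear"


if __name__ == "__main__":
    print(classify("kerd", "heart"))  # stem of "heart"
    print(classify("kerd", "kard"))   # stem of "cardiac"
```

Note that the sketch takes the exceptionless shift as given, which is the
very assumption under discussion, and it says nothing about which language
the unshifted form was borrowed from.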

<<This same method of identifying loans among related languages works just as
well for languages which don't have a long written tradition.>>

Just as well, eh?  No added element of uncertainty at all caused by a lack of
writing?  Have you tried your hand at finding the loans in Thracian?

<<Now, it's true that there is a problematic case: it's hard to detect loans
which occurred between related languages soon after their branching, before
very many of the telltale sound changes took place.>>

There is also the problematic case where loans went back and forth without
documentation or were loaned from a third language of which we have an
incomplete record.  And another where the chronology of the loan is based on
erroneous historical information, so that the giver and taker have been
confused.  And another where the inherent arbitrariness of sound changes (why
p>f?) can suggest relationships where commonalities are purely accidental.
Etc.

By the way, do you think there was an intermediate period between p>f where
there was /p'h/?   Just curious.

<<You don't need a long written tradition to be able to work out the
relative chronology of prehistoric sound changes.>>

We have trouble being sure of the continuity of atomic half-lives, the
constancy of gravity and the accuracy of radio-carbon dating.  Surely, you
might take a slightly less certain tone about the chronology of prehistoric
sound changes.  A certain humility seems to be a characteristic of the better
scientist.  After all, you never know when an IE Rosetta Stone or a Quantum
Physics of Linguistics may show up and demand the humility you can voluntarily
adopt beforehand.

regards,
Steve Long