Hist Ling, a Primer: Part 1 (was Re: The Single Parent Question)

Thu Jul 26 15:07:47 UTC 2001

On Thu, 28 Jun 2001 X99Lynx at aol.com wrote:

>In a message dated 6/27/2001 12:29:59 AM, larryt at cogs.susx.ac.uk
>writes: << And what is that?  So far as I know, all I have ever
>claimed about the comparative method is that it cannot produce
>proto-languages that never existed.  And that's just true.  Do
>you want to challenge this? >>

>Yes.

>There are perhaps a number of ways in which the comparative
>method might "produce" a language or a part of a language that
>never existed.  There is perhaps one way that is relevant to
>this discussion.

>If you assume only one parent where there was more than one
>parent, the comparative method can be used to reconstruct a
>language that never existed.

This is rather like saying that if you assume that the world is
flat then you can walk to the edge and jump off.

>If a language family "inherited" from more than one prehistoric
>parent, the comparative method will not be able to distinguish
>more than one parent - IF you assume only one parent.  If you
>assume all reconstructible features descended from one parent -
>where there were actually multiple parents - you will reconstruct
>a language that never existed.

Such a statement shows a complete lack of comprehension of what
the term "genetic relationship" means.  Very simply, languages
that are genetically related were once the same language.  That
is all there is to it.  Thus genetically related languages do not
"inherit" from more than one language.  They only inherit from
the common parent (i.e., the language that these languages were
once identical to).  Thus English and German are genetically
related because they were once the same language
(Proto-Germanic) and French and Spanish are genetically related
because they were once the same language (Latin).  On a different
level, English and Spanish are genetically related because they
were once the same language (Proto-Indo-European).

Interestingly enough, Sir William Jones, who is usually credited
with being the first to recognize the Indo-European unity and the
founder of comparative linguistics, hit the nail pretty much on
the head back in 1786 when he said:

   The Sanskrit language, whatever may be its antiquity, is of a
   wonderful structure; more perfect than Greek, more copious
   than Latin, and more exquisitely refined than either, yet
   bearing to both of them a stronger affinity, both within the
   roots of verbs and in the forms of grammar, than could
   possibly have been produced by accident; so strong indeed that
   no philologer could examine them all, without believing them
   to have sprung from some common source, which, perhaps, no
   longer exists:  there is a similar reason, though not quite so
   forcible, for supposing that both the Gothic and the Celtic,
   though blended with a very different idiom, had the same
   origin with the Sanskrit; and the Old Persian might be added
   to the same family.

Here it all is, tied up in a neat ribbon.  These words, delivered
as an address to the Bengal Asiatic Society, are often quoted in
introductory texts on comparative linguistics or on Indo-European
studies and constitute the creation story of the discipline.  Sir
William pretty much got it right the first time.  It took about a
century to work out the details, but his observations have stood
the test of time.  That is why his words are so often quoted.  He
even left room for the influence of other languages ("blended
with a very different idiom"), but you will notice that he didn't
say 'sprung from several common sources'.

But you cannot show any such thing as a "language family that
'inherited' from more than one prehistoric parent" because a
language family is by definition a group of languages that were
once the same language (cf. David Crystal _The Cambridge
Encyclopedia of Language_ [2nd edition, 1997], p. 427, s.v.
"family":  "A set of languages that derive from a common ancestor
(parent) language...").  It is not possible for a group of
languages to have once been two different languages (at least not
without the intervening stage of a single language).  It may be
possible that the parent language was a meld of two or more
different languages, but if it was, then this is what the
comparative method will reconstruct.  This is what the
comparative method does.  It tells you (if used competently) what
the parent of the daughter languages looked like (just before the
parent split).  It doesn't tell you how the parent language got
that way because it can't.  That is the job of internal
reconstruction, or, if other proto-languages are available for
comparison, the comparative method taken to the next higher level.

Now a language can be influenced by any number of languages
through borrowing or convergence, but that does not make any of
these languages a parent of that language.  If it did, the
parents of English would include Arabic, Hebrew, Chinese,
Japanese, Korean, Turkish, Etruscan, Finnish, Hungarian, Malay,
Mayan, Nahuatl, Sumerian, Swahili, Kaffir, Algonkian, Hawaiian,
and hundreds if not thousands of other languages from which
English with its capacity for swallowing foreign words whole has
appropriated words for its own use.  Similarly, it could be said
that most of the world's 6000 (give or take a few thousand) or so
languages have English as a parent if they have at least one
English loanword (such as 'hamburger', 'television' or some form
of 'automobile').

Carrying this reductio ad absurdum a step further, English has
been heavily influenced by Latin in its lexicon, morphology, and
syntax.  Basque has also been heavily influenced by Latin.
According to you this makes Latin a parent of both English and
Basque.  The next logical claim is that English and Basque must
be related because they share a common parent.

But to get back to the Romance languages (the daughters of
Latin), French and Spanish were once the same language (Latin).
We can tell this by using the comparative method, which gives
reconstructions of many, many lexical, morphological, and
syntactic features of the parent.  Since the parent is an
attested language (Latin) we can check the forms against the
reconstruction (Proto-Romance) to verify the effectiveness of the
method.  Let's say, for the purposes of discussion, that we can
account for the differences between French and Spanish by the
influence of different substratum languages (let's say *Gaulish
for French and *Iberian for Spanish without committing to either
the truth of the claim or of the nature of the hypothetical
substratum languages).

According to Steve's view this means that French now has two
parents (Latin and *Gaulish) and so does Spanish (Latin and
*Iberian).  If we use the comparative method on French and
Spanish does it reveal these second "parents"?  No, it still
reconstructs Latin forms because that is what is common to the
two languages.  Using the comparative method on French and
Spanish *cannot* reconstruct *Gaulish and/or *Iberian because
they are not common to the two languages being compared.  The
comparative method will only reconstruct what is common to the
languages being compared (incidentally, I am using "languages
being compared" as shorthand for "comparing forms from the
languages being compared").
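
The point can be put as a toy sketch in Python.  Everything in it
is an assumption made for illustration: each language is modelled
as a bare set of feature labels (L1, G1, I1 and so on are
placeholders, not real data), and "comparison" is reduced to set
intersection, which is far cruder than the real method.

```python
# Toy model: each language is a set of feature labels.
# The labels are hypothetical placeholders, not real linguistic data.
latin = {"L1", "L2", "L3", "L4"}
gaulish_substrate = {"G1", "G2"}   # hypothetical substratum, as in the text
iberian_substrate = {"I1", "I2"}   # hypothetical substratum, as in the text

# Each daughter keeps most of the parent, loses something,
# and adds material from its own substratum.
french = (latin - {"L4"}) | gaulish_substrate
spanish = (latin - {"L3"}) | iberian_substrate


def shared_features(*daughters):
    """Return only the features present in every daughter."""
    return set.intersection(*daughters)


reconstructed = shared_features(french, spanish)
print(sorted(reconstructed))
```

Whatever comes out is a subset of the Latin set.  The substratum
labels cannot appear in it, because neither is common to both
daughters.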

Okay, if that doesn't work let's say that French and Spanish both
have the same substratum language and then compare them.  That
way both the parent and the substratum language will be common to
both languages.  But wait a minute -- we were accounting for the
differences between French and Spanish by different substratum
languages.  If they both have the same substratum then we don't
have different languages.  There is no way to tell the difference
between French and Spanish.  We just have *Spench or *Franish or
some such thing and nothing to use the comparative method on.

Okay then, let's compare *Spench with some other Romance language,
say Italian.  In this case since Italian is in the ancestral
homeland, we can do without a substratum language, thus creating
that great rarity (in Steve's view), a language with only one
parent.  When we reconstruct the parent of Italian and *Spench
what do we get? -- Latin again, because that is the only language
common to both.

So we see Steve's dream of being able to reconstruct multiple
parents using the comparative method constantly receding before
us, much as the pot of gold at the end of the rainbow constantly
recedes as we walk toward the rainbow.  Aw, shucks.

>How does the comparative method tell if there was more than one
>parent language?

It doesn't because there is no such thing.  A mixed language is
just that.  One language cannot be "genetically related."  You
can't use the comparative method on a language like Michif
because there is nothing to compare it to.  It's like asking
"which weighs more, a pound of feathers?"

If you have daughters of a mixed language, using the comparative
method on them will reconstruct the parent language, mixed or
not (and since every language is mixed to some extent, the
concept has no particular significance).  That's what the
comparative method does.  It reconstructs the probable form of
the parent of genetically related languages.  You can't use the
comparative method on a parent language and its daughter because
that's not what the comparative method does (besides, in most
cases you don't have the parent until you reconstruct it from the
daughters using the comparative method).  The comparative method
reconstructs the probable ancestor of two or more now different
forms that were once the same form (in one language).  It
provides a possible solution to the many-from-one situation.  It
can do nothing about a one-from-many situation.

But using the comparative method is not like solving an equation
for the volume of a sphere as you seem to think.  Using the
comparative method requires judgment, common sense, training, and
experience.  You don't just plug in the data, push a button and
come back later and see what the proto-language looks like.  The
results have to be interpreted based on the linguist's knowledge
of the kinds of things that are possible in terms of sound
changes and other types of linguistic developments.  To do this
the linguist has to be familiar with as many different languages
and their histories as possible and with typological
classification and its ramifications.

>It depends on the assumption one makes from the start -  I think
>its ability to see multiple descent is canceled out by the single
>parent assumption.  It will show "systematic correspondences" but
>has no way of distinguishing multiple descent for those
>correspondences.

No, what the comparative method does is identify the features of
the parent language that are still present in its daughters.
"Single parent" is not an assumption of the comparative method.
It is the definition of genetically related languages in
historical linguistics.

The assumption that makes the comparative method work is that
sound change (within a language) is regular.  Irregular sound
changes block the comparative method.
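
As a rough illustration of what "regular" buys you, here is a
minimal Python sketch that tallies initial-sound correspondences
across a handful of familiar Latin/English word pairs.  The
assumptions are mine: spelling stands in for sound, only the first
segment is looked at, and the threshold of two occurrences for
"recurrent" is arbitrary.  The p : f and t : th pairings are the
ones usually discussed under Grimm's Law; "paternal" is included
as a word English took from Latin rather than inherited.

```python
from collections import Counter

# Well-known Latin/English pairs, plus one Latin-derived loan in English.
pairs = [
    ("pater", "father"),
    ("piscis", "fish"),
    ("pes", "foot"),
    ("tres", "three"),
    ("tu", "thou"),
    ("tenuis", "thin"),
    ("pater", "paternal"),  # borrowed, not inherited
]


def initial(word):
    """Return the first sound, treating the spelling 'th' as one unit."""
    return "th" if word.startswith("th") else word[0]


def correspondences(word_pairs):
    """Count how often each pairing of initial sounds occurs."""
    return Counter((initial(a), initial(b)) for a, b in word_pairs)


counts = correspondences(pairs)
for (lat, eng), n in sorted(counts.items()):
    status = "recurrent" if n >= 2 else "isolated"
    print(f"Latin {lat}- : English {eng}-  x{n}  ({status})")
```

The inherited words line up in recurring pairings, while the loan
stands alone with p : p.  A real analysis works through whole
words and many more items, but the principle is the same.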

Now there are some situations in which the comparative method
could point to erroneous conclusions, but none of them have
anything to do with "the single parent assumption."

One such situation is where two or more daughter languages have
independently undergone identical innovations.  Features shared
by daughter languages are likely to be reconstructed for the
proto-language simply because it is more likely for some feature
to have arisen only once and have been transmitted to the
daughters than for it to have arisen independently in the
daughters.  But as Steve himself pointed out some time ago,
unlikely things do happen.  Here typology of changes comes into
play.  If the change is a typologically common one, say
palatalization of velars before front vowels or s > h > 0, then it
is less unlikely for it to have arisen independently.  On the
other hand, if the change is typologically unusual or complex,
like Grimm's Law, or depends on other changes that must have
taken place in the proto-language, like Verner's Law, then the
change can safely be reconstructed for the proto-language.

Another situation that plays merry hell with the comparative
method is changes in the proto-language that reverse themselves
in one or more daughters.  Such an event is rare, but not unheard
of.  It makes it very difficult to say what belongs to the
proto-language and what doesn't.

Yet another situation that causes glitches in the reconstruction
of a proto-language, and that appears to be closest to what you
are proposing as a method of reconstructing a nonexistent
proto-language, is when two (or more) daughter languages are
influenced in exactly the same way by another language.  Under
these circumstances, the common features from this source could
be reconstructed for the proto-language when in fact the
proto-language never heard of them.  If the influencing language
(or a descendant of it) is attested, it may be possible to
identify these features and attribute them to their proper
source, but if the influencing language has become extinct
without leaving a trace (except for its influence on the
languages we are investigating) then this influence may not be
recognized for what it is.  But in any case, if it is attributed
incorrectly to the proto-language, it is likely to appear as an
anomaly of some kind.

The fact that these situations could arise shows that the
comparative method is not foolproof.  But then no method is
foolproof (Murphy's Law).  Nor can a method be made foolproof.
If someone devises a better method, someone else will devise a
better fool.

But these potential pitfalls do not invalidate the comparative
method.  They simply show that those who would use the
comparative method need to be aware of them so that they can
evaluate how much confidence can be placed in a given
reconstruction.  And this is where experience and training come
in.

>The comparative method is a powerful tool,

And powerful tools require training and experience to operate.

>but even the Hubbell can't see the far side of the moon.

Which is why we assume that the far side of the moon is, grosso
modo, not much different from the side that we can see.  But
until you know the answer to "how do we know that the moon is not
made of green cheese?", you won't be able to come to grips with
the concept of historical linguistics.

>Without the single parent assumption, I suspect the comparative
>method could also support explanations that include multiple
>"genetic strains."

Which shows how misplaced your suspicions are.  You seem to be
under a considerable genetic strain yourself. :)

But just to show you that "the single parent assumption" doesn't
have to be abandoned in order to do historical linguistics or even
to uncover different "genetic strains," let's look at an actual
example.  By using the comparative method on English, German,
Dutch, the Scandinavian languages, Gothic, etc., it is possible
to reconstruct Proto-Germanic in considerable detail.  The
comparative method will filter out the extensive French, Latin,
and Greek influence on English because it is not shared
systematically by the sister languages.  What we get from this is
a reconstruction of the Proto-Germanic language just before it
split into its daughters.
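
The filtering step can be sketched in Python under some loud
assumptions: the word list is tiny, the cognate-class letters are
assigned by hand (in real work they come from regular sound
correspondences, not from labels), and the function name
shared_by_sisters is invented for this example.

```python
# meaning -> {language: (word, cognate_class)}
# Cognate classes are hand-assigned for illustration only.
wordlist = {
    "father": {"English": ("father", "A"), "German": ("Vater", "A"),
               "Dutch": ("vader", "A")},
    "water":  {"English": ("water", "B"), "German": ("Wasser", "B"),
               "Dutch": ("water", "B")},
    "hand":   {"English": ("hand", "C"), "German": ("Hand", "C"),
               "Dutch": ("hand", "C")},
    # English 'beef' and 'very' came in from French.
    "beef":   {"English": ("beef", "D"), "German": ("Rindfleisch", "E"),
               "Dutch": ("rundvlees", "E")},
    "very":   {"English": ("very", "F"), "German": ("sehr", "G"),
               "Dutch": ("zeer", "G")},
}


def shared_by_sisters(entry, language):
    """True if the word in `language` has a cognate in any sister."""
    own_class = entry[language][1]
    return any(cls == own_class
               for lang, (_, cls) in entry.items() if lang != language)


inherited = [m for m, e in wordlist.items() if shared_by_sisters(e, "English")]
filtered_out = [m for m in wordlist if m not in inherited]
print("kept for comparison:", inherited)
print("filtered out:", filtered_out)
```

The French-derived English words drop out simply because nothing
in the sister languages matches them.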

Now we can take this back a step farther by using the comparative
method on Proto-Germanic and its sister languages, Proto-Italic,
Proto-Balto-Slavic, Proto-Indo-Iranian, etc.  This gives us yet
another proto-language, which we call Proto-Indo-European.  Now
it turns out that a number of features that we can reconstruct
for Proto-Germanic, we can't reconstruct for PIE, which means
that they are not found in the sister languages.  This suggests
that Proto-Germanic had been influenced by some other language or
languages before it split into its daughters and that its sister
languages were not subjected to this influence.

To account for this, we propose that this influence took place
between the time that Germanic split from PIE and the time it split
into its own daughters, a stage of the development that we refer
to as pre-Proto-Germanic or sometimes simply as pre-Germanic.  In
order to account for the large number of features of
Proto-Germanic that can't be traced back to PIE we hypothesize
that there was a substratum language that heavily influenced
Germanic during the pre-Germanic phase (so that Proto-Germanic is
"blended with a very different idiom").

Unfortunately, unless this substratum language is actually
attested, it is very difficult to say (scientifically) anything
very specific about this hypothetical substrate language because:

  a) We can never be entirely sure whether something that appears
  in Proto-Germanic is an internal development or is to be
  attributed to the substratum.

  b) There is no way to reconstruct the substratum language
  because there is nothing to compare.  The comparative method
  works (scientifically) because it uses two or more forms to
  triangulate on the original form.  Proposing an origin for a
  single form is simply speculative because there are too many
  possible origins.

So one can suggest that the features of Proto-Germanic not
accounted for by inheritance are the result of influence from some
other language (which is not unreasonable because inter-language
influence is a normal thing), but there is no way to prove it,
and even worse, there is no way to disprove it.  But the
comparative method can detect this possible influence if there is
enough data.  It just can't say much of anything else about it.
That has to come from other sources.

>In which case, the method would produce data that could be used
>to reconstruct one or multiple parents.  In which case, one of
>those two reconstructions would be false.  And that would be one
>way the comparative method could be used to reconstruct a
>proto-language that never existed.

Which is as good a way as any of saying that you don't know what
the comparative method is, how it works, what the inputs are and
what comes out of it.  The comparative method doesn't produce
data -- data is the input to the comparative method.  Data is
something that exists in nature.  Interpretations of data are
different from data.  Interpretations of data involve judgment,
common sense, training and experience.  Interpretations of data
are hermeneutics.  That is why comparative linguistics is a
hermeneutic discipline.

Now the output of the comparative method is data in the sense
that it can be used in further applications of the method to
reconstruct a higher level proto-language.  But there is a
qualitative difference between this kind of datum and a naturally
occurring one.  That is why linguists (and philologists) put an
asterisk (*) in front of reconstructed forms -- to show that they
are not naturally occurring data.  The asterisk means:  caution --
this is a reconstructed datum that is not actually attested but
was arrived at by interpretation of other data.

><<Steve, have you ever *done* any comparative linguistics?  Have
>you ever grappled with linguistic data in an effort to
>demonstrate common ancestry, or to challenge someone else's
>efforts in this direction?>>

>This won't help you here.  I won't ask if you've ever argued a
>science case in Federal Court or ever did plasma analysis or
>worked on neural systems or did any high-order economic analysis.

The difference is that Larry is not trying to tell you how to do
any of those things or telling you that you are doing them wrong.
What he is telling you is that everything that you have said
indicates that you don't understand the methodology of
comparative linguistics.  And the fact that you may have done
the things you say and perhaps even done them right doesn't mean
that you know anything about how to do comparative linguistics.
What he is telling you is that using the methods of comparative
linguistics competently requires training and experience and he
is asking you what your qualifications are for critiquing the
methodology.  Your answer seems to be that you don't need any.

>You've claimed an extremely high level of certainty with regard
>to the reconstruction of proto-languages.  I'm looking at the
>scientific validity of that claim.

No you're not.  You are just making it clear that you don't
understand either the terminology or the methodology involved in
that claim.  You are making it clear that you have only one
concept of science:  predictability based on precise
measurements.  You are making it clear that you do not realize
that sometimes science is merely explanatory without being able
to make accurate or detailed predictions.  You are making it
clear that you have no grasp of the concept of epistemology.

You seem to be under the impression that all sciences have the
same methods and reach their conclusions in the same way as the
sciences that you are familiar with and therefore the methods of
physics should be used in linguistics.  But this is not so.
Physics gets its cachet from very precise measurements of
universal properties or characteristics of the real world.  Its
plausibility stems from the fact that these constants are the
same wherever or whenever they are measured and can be used to
predict the actions of a physical body when subjected to certain
forces in a cause and effect relationship.  But comparative
linguistics does not get its plausibility from precise
measurements of universal constants or from predictions of cause
and effect relationships.  It gets its plausibility from the fact
that the pieces of a correct linguistic solution must all be
interlocking.

This is why regular sound correspondences and pattern matching
are so important in historical/comparative linguistics.  Parts of
the solution must be brought into uniformity with other parts of
the solution (pattern matching).  If the patterns of regular
sound correspondences don't match throughout the reconstruction,
then the reconstruction is wrong in one or more details.
Historical linguistics is hermeneutic; it is a science of
interpretation, of ideas and rationality, not of universal
constants and causality.

>That demands that the process should be rational and
>reproducible.  If you're saying I'm missing something, spell it
>out.

Okay, try this:  LANGUAGES THAT ARE GENETICALLY RELATED WERE ONCE
THE SAME LANGUAGE.

And what in the world is rational and reproducible about
high-order economic analyses?  Economic analysis is as chaotic as
language change.  Incidentally, chaotic here is a technical term.
It does not have its free meaning of "in a state of utter
confusion."  It refers to systems that are unusually sensitive to
variations in their initial conditions or are affected by a large
number of independent variables.  Such systems may show patterns,
even classic patterns that develop quite often, but predicting
the outcome of these patterns in a particular instance usually
doesn't work because the classic development of the pattern can
be upset by a wide range of more or less unpredictable and
unrelated variables.  It is like trying to predict the weather.
You can keep all the statistics you want.  You can have a record
of the temperature and amount of rainfall on July 4th for the
past hundred years.  But none of these statistics will tell you
on July 3rd whether it's going to rain on your parade the next
day or not (unless perhaps you are having your 4th of July parade
in Riyadh).

>But not with the conclusions or unexplained assumptions
>that you have been relying on so far.  And I assume you aren't
>claiming any kind of unique psychic powers in your use of the
>comparative method that are beyond ordinary comprehension.

I really don't think the idea of genetically related languages is
beyond your comprehension despite your claim that it is.  But it
is obviously beyond your limited concept of "science."  Of
course, it is quite possible that the aforementioned judgment,
common sense, and linguistic experience are psychic powers that
are beyond your comprehension.

But just in case you haven't gotten it yet:  LANGUAGES THAT ARE
GENETICALLY RELATED WERE ONCE THE SAME LANGUAGE.  This is the
basic assumption of historical/comparative linguistics.  This has
been true for over 200 years -- from Jones' 1786 "sprung from
some common source" to Anttila's 1989 "'Related' is a technical
term ... meaning that the items were once identical" (p. 300).
If you can just keep this in mind, you won't get hung up on the
50 parent problem.

Bob Whiting
whiting at cc.helsinki.fi