The Significance of Comecrudan

Thu Feb 18 21:03:46 UTC 1999

----------------------------Original message----------------------------
Alexis Manaster-Ramer wrote:

> Before we got sidetracked, I thought we had seen the beginning
> of a really interesting and useful (and indeed potentially
> revolutionary) debate about how languages really get classified.

> I don't know how many people are interested in this, so I will not pursue
> this any further here if there is no interest, of course.

I am interested in this, so here are my $0.02.

>
> work on actual languages.  If there is to be a discussion of
> methodologies, then we should look at language families that have been proposed
> on the basis of specific methodologies and see what we find, not start out
> by a priori accepting some completely arbitrary methodological assertion
> made up from whole cloth and then rejecting any language family whose
> recognition would force us to abandon that assertion.  I am referring of
> course to the two assertions:
>
> (I) Language relatedness can only be shown by reference to morphology
> (falsified by the history of how Tai, Comecrudan, and (an example I forgot
> to cite earlier) how Uto-Aztecan was discovered),
>
> (II) Language relatedness can only be shown by establishing a system of
> sound laws (falsified by the history of how Niger-Kordofanian,
> Indo-European, Afro-Asiatic, and indeed probably most of the currently
> accepted language families were established).

Some comments here.

1. In order to apply (I), the languages have to have morphology.
2. In order to apply (II), there are two issues to be settled.
    2.1) If data is plentiful, then finding sound laws is easy.
    2.2) If data is scarce, then one may still find sound laws, even if
    by nothing more than brute-force methods using computers.
    2.3) If we do use brute-force methods, we still have to compare the
    results against some baseline of effects due to chance.
3. Executing (I) still requires some kind of sound laws. If that were not the
    case, then languages could be said to be related for having the same
    kind of morphology without having sound laws among bound morphemes.

4. So, in both cases, we are still working with morphemes (whether they are
free or bound). The bound-morpheme method will only work on languages that
have morphology, but again sound laws are required.

5. In both cases, we are attempting to determine whether the existing
situation can be attributed to chance. If the answer is no, then we go to
the next step.

6. If the existing situation cannot easily be attributed to chance, then can
it be due to borrowing? To rule this out, we put other rules into action:
    6.1) The morphemes in which we see the realization of sound laws must be
    those that cannot be attributed to copying/borrowing. The heuristic
    employed here is that some special set of words is resistant to
    exactly this kind of borrowing/copying. The formalization of this
    concept, first attempted by Swadesh, is the so-called Swadesh list.
    6.2) The rules in (6.1) are further modified by not allowing words that
    are phonetically similar to a specific set of words {ata, ana, ama, ...}

Now, we have to justify these rules. The claim that certain words are
resistant to copying/borrowing/diffusion has to be backed up by some kind
of evidence that is not circular. It can't come only from IE languages.

The circularity can be seen in the fact that these words are among the first
that would exist in any language and can thus point to Protoworld. Therefore,
(6.2) has to be fully justified and justifiable from empirical evidence that
is not circular but independent of historical linguistics, especially of the
IE kind.
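
To make the chance baseline of (2.3) and (5) a little more concrete, here is
a minimal sketch in Python of a permutation test. Everything in it is an
assumption made for illustration: the function names are hypothetical, the
word forms are invented, and "same first segment" is only a crude stand-in
for a real sound correspondence. The idea is to score the meaning-aligned
word lists, then shuffle one list many times to see how often a random
pairing of meanings scores at least as well.

```python
import random

def initial_matches(words_a, words_b):
    """Count meaning-aligned pairs whose first segments are identical."""
    return sum(1 for a, b in zip(words_a, words_b) if a[:1] == b[:1])

def permutation_p_value(words_a, words_b, trials=2000, seed=0):
    """Estimate how often a random re-pairing scores >= the observed pairing."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    observed = initial_matches(words_a, words_b)
    shuffled = list(words_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)          # destroy the meaning alignment
        if initial_matches(words_a, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (trials + 1)

# Invented toy forms, aligned by meaning slot; not real language data.
lang_a = ["pata", "tika", "kuna", "masi", "nalo", "sipu"]
lang_b = ["pado", "tiga", "kuno", "mazi", "nalu", "sibu"]

observed, p = permutation_p_value(lang_a, lang_b)
print("observed matches:", observed, "estimated chance probability:", p)
```

A small estimated probability only says that chance is an unlikely
explanation; it says nothing yet about borrowing, which is the job of (6).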

Other problems:

i) We need a measure of complexity.
ii) We need a measure of semantic distance.
iii) We need a measure of phonological/phonetic distance.

A. (i) is necessary in cases like this: Does Kabardian have one or two vowels?
    If some brute-force computer program changed all the known words of some
    language (say Etruscan, or some other language with only a few known
    words) into those of some other language by using regular changes, then
    how many of these regular sound changes are we willing to tolerate? If the
    rules become very convoluted, do we throw in the towel and disallow it, or
    do we continue to insist on regular sound change, no matter how complex?
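
One crude way to put a number on this trade-off, offered only as a sketch:
treat each "sound law" as an ordered string replacement, apply the rules to
every source word, and report both the share of attested forms explained and
the number of rules spent per explained form. The rule format, the function
names, and the word pairs below are all hypothetical and invented for
illustration; a serious measure would have to handle conditioning
environments and rule ordering far more carefully.

```python
def apply_rules(word, rules):
    """Apply (old, new) replacements to a word, in the order given."""
    for old, new in rules:
        word = word.replace(old, new)
    return word

def evaluate(pairs, rules):
    """Return (coverage, rules per explained word) for a rule set."""
    explained = sum(1 for src, tgt in pairs if apply_rules(src, rules) == tgt)
    coverage = explained / len(pairs)
    rules_per_word = len(rules) / explained if explained else float("inf")
    return coverage, rules_per_word

# Invented (source, target) pairs; not real language data.
pairs = [("pata", "fata"), ("pili", "fili"), ("tapu", "tafu"),
         ("kano", "kano"), ("miku", "mixu")]

general = [("p", "f")]                      # one general rule
ad_hoc = [("p", "f"), ("miku", "mixu")]     # plus a rule that fits one word only

print(evaluate(pairs, general))   # (0.8, 0.25)
print(evaluate(pairs, ad_hoc))    # (1.0, 0.4)
```

The second rule set explains everything, but only by spending a rule on a
single word. When rules per explained word creeps toward 1, the "sound laws"
are no longer compressing anything, and that is the point at which one might
reasonably throw in the towel.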

B. (ii) is needed because we have to be able to determine if two words are
cognates. We can't allow 'foot' and 'wagon' to become cognates, in general,
unless we can trace the word across time extremely accurately. This could
only occur if we had writings from the language stretching back thousands
of years. Lacking that, we have to guess, as we always do. And this guessing
has to have some validity so that we can compare guesses against other
guesses and be consistent.

C. (iii) is the easiest thing to do and has been done. There are many such
distance metrics (in my book), but we have to clearly define what is meant by
phonetic, acoustic, perceptual and phonemic distances. There is a lot of
utter confusion in the literature.
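
As a sketch of the very simplest kind of metric, here is a plain edit
(Levenshtein) distance between two transcribed forms, divided by the length
of the longer form so that the result falls in [0,1]. This is not one of the
metrics referred to above, and it is not phonetic in any real sense: it
assumes one character per segment and charges every substitution the same
cost, so p/f counts as much as p/a. A phonetic version would weight
substitutions by feature differences. The function names are my own.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions from a to b."""
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, 1):
        curr = [i]
        for j, seg_b in enumerate(b, 1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the length of the longer form."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0

print(normalized_distance("pata", "fata"))   # 0.25
print(normalized_distance("pata", "pata"))   # 0.0
```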

There is no way to create a consistent comparison of linguists' attempts to
create 'genetic' trees except to produce some number in some normalized
interval like [0,1], even if the first attempt is not good. Eventually, even
this concept can be more clearly explained, after we've taken the bugs out of
the simpler constituent concepts.

>
> (c) I agree with Johanna Nichols, pace what Larry and Stefan seem to say,
> that, when dealing with languages of which we know only a small number of
> words, it matters not just that we can only, therefore, only find at best
> a small number of cognates with other languages but also it matters what
> percentage of the attested forms we can explain.  This is why I posted the
> entire Garza and Mamulique corpus, to see whether she (and others) would
> agree with me that the Comecrudan hypothesis (which links these two with
> Comecrudo) is a reasonable one.  I say 'reasonable' because as stated I
> think the proof of "Comecrudan" lies in the broader Pakawan comparison.

And that is why I posted the material from Hintze on Meroitic, Altaic and
Uralic. There has to be a consistent way to evaluate all such data.

Historical linguistics has to be taken out of the realm of gut feelings.
It is almost the 21st century, and that kind of gut feeling will not hold up.
There are computer programs that "compose music like Bach" and that
paint. It is not believable to claim that historical linguistics is less
structured than music or art, or that it has to stay at the level of
intuition and black magic expertise. It is not.

--
M. Hubey
Email:          hubeyh at Montclair.edu    Backup:hubeyh at alpha.montclair.edu
WWW Page:       http://www.csam.montclair.edu/Faculty/Hubey.html