Refining early Basque criteria

Larry Trask larryt at cogs.susx.ac.uk
Fri Oct 15 16:09:17 UTC 1999


ECOLING at aol.com writes:

> Concerning Larry Trask's list of criteria for potential candidates
> for early Basque vocabulary lists:

[on the choice of cut-off date]

> The details in the paragraph above suggest to me OBVIOUSLY
> if you want early Basque, you use 1700 in preference to 1600,
> because the 16th-century materials are so limited in content.
> It is always possible to study any differences between 16th and 17th-century
> equivalent grammatical morphemes, forms of the same words, etc.,
> where those are attested in both centuries, but obviously much
> non-religious vocabulary will be systematically disfavored by the earlier
> cutoff date.

Well, I've already explained that I am prepared to consider 1700 rather than
1600.  But I don't think the choice is obvious.  My impression at this stage --
which might prove to be wrong, of course -- is that most of the words that meet
my other criteria are already attested by 1600, and so, if possible, I'd prefer
to use the more restrictive early date.

For example, <uko> 'forearm' is nowhere attested before the 17th-century
writer Oihenart, but then it appears to be attested *only* in Oihenart, so it
will fail to be included anyway.

But this example raises another interesting point.  Though <uko> itself is not
found outside of Oihenart, its transparent compound <ukondo> (and variants)
'elbow' (from <uko> plus <ondo> 'bottom') is close to universal in the
language, and recorded from 1596.  Now <ukondo> must be excluded as obviously
polymorphemic, but I will have to decide whether its existence should or should
not license the listing of <uko>, which itself does not meet my criteria.  At
the moment, I have not yet decided, though I lean toward the negative.

[on my criterion of very widespread distribution]

> As explained in a long and detailed message sent many days ago,
> focused on sound-symbolic vocabulary ...
> given the limited recording of sound-symbolic vocabulary,
> an insistence on very wide distribution will have the effect of biasing
> against this type of vocabulary,
> and in this case will certainly bias against a variety of canonical forms,
> in favor of canonical forms more uniform and more limited than they
> actually were in very early Basque.
> A systematic distortion, in other words,
> in this case not merely a lack of particular lexical items, but even
> a systemic distortion by changing the hypotheses of canonical forms.

But consider the alternative.

If I admit words found, say, only in one of the recognized dialects, then I'm
inevitably going to be admitting a vast number of words whose native and
ancient status is at best deeply questionable and at worst certainly zero.  And
this would be catastrophic for my purposes.

I can't afford to sweep up huge numbers of non-ancient words in order to avoid
overlooking a much smaller number of genuinely ancient words.

> Thinking of subject matters attested or not, we have the following,
> which ties this issue back to the specifics of subject matter noted
> by Larry Trask for 16th vs. 17th centuries:
> If only two dialect areas have documents in certain subject matters,
> then vocabulary specific to those subject matters will be systematically
> excluded by requiring their attestation from more than two dialect areas.
> This is obviously undesirable.  It suggests that a moderate position might
> be to categorize documentary attestations by subject matter,
> and vary the number of dialect areas required according to the number
> of areas attesting documents in each subject matter.
> Of course in practice, this can be done in another way.
> Record ALL vocabulary items for a particular concept,
> and study the UNIFORMITY of etyma for that concept,
> without much regard AT FIRST for whether it comes from two or from five
> areas.
> If variants for a particular concept cannot be established as loans from
> neighboring languages, then remaining variety of non-cognate terms
> argues against immediately positing any of the conflicting forms
> as candidates for very early Basque (even though one or more
> of them MIGHT be a direct descendant of very early Basque).
> Additional argumentation would then be necessary, either way.
> Of course things are not this simple,
> but Larry Trask is an expert at using all of these varied sorts of
> information.

Well, much of this is very reasonable, but only for a different task from the
one I have in mind.

In written texts, the great bulk of the Basque vocabulary pertaining to
specific areas is clearly neither native nor ancient -- as we might expect.
Religion, law, government, seafaring, even agriculture -- most of the words are
either obvious compounds or derivatives, or obvious borrowings.  Hence I can
see little point at this stage in worrying about them.  First things first.

> More dialect areas of course gives additional security,
> and perhaps additional phonological information.

Not sure what this means.

Requiring a given word to be attested in a large number of dialects certainly
does give additional security, and that's why I require this.

[on my suspicion of possible borrowings]

> Some would use "caution" in not throwing out things for which loanword
> origin is merely suspected, for which the argument is not a strong one.
> "Strong" is not the same as "certain".
> Moderation in all criteria, as in all things.

Sure, but I have to make a decision here.

Given the vast impact of Latin and Romance on the Basque vocabulary, and the
near-total absence of traffic in the other direction, I think it's wise to
exclude words for which a loan origin looks even moderately plausible.

After all, it is hardly conceivable that genuinely native and ancient Basque
words for which a loan origin has been seriously (but wrongly) suggested are
likely to be systematically different in form from other native and ancient
words -- now is it?

[on the rare Basque loans into Romance]

> Would such examples be those in which the Castilian and Portuguese words
> have no cognates in other Romance languages?

Not necessarily, but such cases are so few anyway that no generalizations can
be made.

> In such a case,
> would not the identical sort of criteria dictate that they be excluded from
> studies of early Castilian and Portuguese?

Students of early Romance must draw up their own criteria, which need not be
the same as mine -- especially since the Romanists have not only a whole bunch
of languages to look at, but also Latin.  My criteria are designed only for the
particular task I have in mind.  I claim no universal validity.

> Of course, there is no necessary
> contradiction here, because items of this sort could in principle be
> excluded from BOTH sides of any puzzling sharing, in the approach
> Larry Trask is taking.  Or they can be included on BOTH sides.
> My own position would be simply to include them on both sides,
> but with a note that they might be from either side,

Why?  There are thousands of Latino-Romance loans into Basque, while there
exists just one apparently clear case of a Basque loan into Romance with any
great currency.

> and if they are
> from the Romance side, but limited to Iberian Romance, then we
> must have an additional hypothesis that there was some innovation within
> Iberian Romance, or else a borrowing from some third language family
> related neither to Romance nor to early Basque.  Is there some gap in
> that reasoning?

Yes; I think so.  See below.

> Because it seems to me to suggest that words limited
> to Basque and to Iberian Romance (not found in other Romance languages),
> are better assigned to early Basque than to early Romance,
> since by definition of the situation they are not reconstructible to early
> Romance.

No; I can't agree.  Given the established paucity of Basque loans into Romance,
it is out of order to impute a Basque origin to any Romance word not derivable
from Latin, even if it does occur in Basque.

> But this is not certain, Occam's razor can suggest a route
> to follow, but it cannot absolutely exclude the more complex case that
> there was an extinct third language family from which a word was
> borrowed both into Basque and into Iberian Romance.

It is widely suspected by specialists that such words exist.  But clearly I do
not want to sweep up such words if I can help it.

[on words like Basque <ama> 'mother' and <tu> 'spit']

> That does not argue either for or against such words actually being
> inherited from Proto-Basque.

Of course, but not the point.

The point is this: is there any good reason to suppose that a given word was
not present in Pre-Basque, and native there?  If the answer to this question is
"yes", for any reason at all, then I prefer to exclude the word.

Remember: I'm looking for the *best* candidates, not for *all possible*
candidates.

> It DOES make it difficult to use such words in trying to prove
> a deep genetic relation between languages, because one must then
> have sufficient knowledge of sound-symbolic forces to argue
> something more specific is shared between particular languages,
> not merely a vague resemblance.
> That is quite a separate issue.

Yes, but I don't think that <ama> 'mother' is "only vaguely" a nursery word, or
that <tu> 'spit' is "only vaguely" an imitative word.

[LT]

>> When -- as so often -- a word exists in
>> several regional variant forms, what form should go into the list?
>> My answer is that we should simply appeal to the known phonological
>> prehistory of Basque, and use the form which can be reconstructed as the
>> common ancestor.

> I have great confidence that Larry Trask will almost always draw the
> correct conclusions in such cases, given his knowledge of the
> phonological history of Basque.  But it nevertheless should be clear
> that there is a potential circularity, of exactly the kind pointed out by
> Steve Long, that a theory of the historical development of a language
> is used to select which forms are considered to have been in a proto-
> language.  That virtually guarantees that a different theory of
> historical development of the language cannot easily be developed
> from data thus selected.  Elementary common sense.

But there exists *only one* theory of the phonological prehistory of Basque,
and that theory is massively and meticulously documented.  It has never been
seriously challenged by anyone, and no alternative of substance has ever been
put forward.  Hence vague invocations of "alternative theories" are without
foundation.

> That does not make this procedure wrong.
> Because it is the totality of the COMBINATION of the attested data
> and the hypothesized sound changes (etc.) which we evaluate,
> in the long run.  But it does make this procedure less than absolutely
> certain to give the correct results.  (Using terminology from other
> fields, it is often possible to find a "local minimum" or solution
> which is better than any nearby points (closely similar solutions),
> yet which is not an absolute minimum, not the absolute best solution.)
> In our field,
> changing BOTH some of the hypotheses about sound changes and
> other historical developments AND some of the hypothesized proto-forms,
> changing both together, in a co-ordinated fashion, may
> yield a better solution.  Such shifts of paradigm do occur.

This is becoming extremely abstract.

If Lloyd, or anybody else, wants to propose an alternative theory of the
prehistory of Basque, let's hear about it.  Meanwhile, I am entirely
comfortable working with the conclusions we have already reached.

[LT]

>> Finally ... if anybody out there still believes that my primary criteria are
>> somehow likely to skew the results in some phonological way, or if
>> anybody thinks that there exist better criteria for the purpose of
>> identifying the best candidates for native and ancient status in the
>> language, let's hear about it.

> I don't understand the word "still" here,
> it should be evident that I do and have previously explained the
> concrete reasons why.  There is no need to repeat the details here.
> As far as I know, Larry Trask has not
> argued against the reasons I gave.

Yes, I have.  I have addressed every single comment of substance I have seen on
this list -- and there haven't been many.

> I have repeatedly pointed to the problem of selection against
> sound-symbolic vocabulary through accidents of limited recording,
> having the effect of biasing our notions of canonical forms.
> Using Larry's mention of the difference of subject matters
> between 16th-century and 17th-century documentations,
> it is easy to explain why using too early a cutoff in time,
> or requiring too many or the wrong dialect attestations,
> can systematically bias against vocabulary in certain semantic
> fields, because these, like sound-symbolic items more generally,
> were not within the subject matter favored by the documents.

Possibly, but not a problem for me.

If there is no persuasive evidence for particular sound-symbolic forms in
Pre-Basque, then there is no such evidence.  That's all there is to it.

> Additionally, criteria for what are likely to be descendants of early
> Basque forms are NOT THE SAME THING as criteria for
> what are good items to use in any consideration of potential
> external relationships of Basque.

Sorry; I don't follow.

I myself have only one immediate goal: identifying the best candidates for
Pre-Basque words.  My other goals can only follow on later.

> I say this latter NOT because I hold
> out any hopes for finding distant relatives of Basque in my lifetime,
> but simply because mixing these two goals can distort the picture
> of proto-Basque by excluding many items which were in fact
> part of proto-Basque.

It is inevitable that I will not succeed in listing every single attested word
which was in fact present in Pre-Basque.  But there is no way around this fact,
and it doesn't worry me in the slightest.

Remember what I'm doing.  I'm trying to find the best candidates for Pre-Basque
status, so that I can then determine their phonological characteristics.
*After* that, I can hope to examine further words, and try to judge whether
they too might be plausible candidates for Pre-Basque status, even if they fail
my initial criteria.
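
The first-pass screening described in this message can be illustrated with a
minimal sketch. The record layout, field names, and function name below are
hypothetical, chosen only for illustration; the flags are set only where this
message states them, and anything not stated is left as None and not evaluated.
The sketch assumes the stricter 1600 cut-off; under a 1700 cut-off <uko> would
no longer fail on date, but it would still fail on distribution.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical record for one word; None means 'not stated in the message'."""
    form: str
    gloss: str
    attested_by_cutoff: Optional[bool] = None  # assumes the 1600 cut-off
    widespread: Optional[bool] = None          # found across many dialects
    polymorphemic: bool = False
    loan_suspected: bool = False
    nursery_or_imitative: bool = False


def exclusion_reasons(c: Candidate) -> List[str]:
    """Return every reason to exclude a word; an empty list means it survives."""
    reasons = []
    if c.attested_by_cutoff is False:
        reasons.append("not attested by the cut-off date")
    if c.widespread is False:
        reasons.append("not widespread across dialects")
    if c.polymorphemic:
        reasons.append("polymorphemic")
    if c.loan_suspected:
        reasons.append("loan origin plausible")
    if c.nursery_or_imitative:
        reasons.append("nursery or imitative word")
    return reasons


# Only the examples discussed in this message, with only the stated facts.
WORDS = [
    Candidate("uko", "forearm", attested_by_cutoff=False, widespread=False),
    Candidate("ukondo", "elbow", attested_by_cutoff=True, widespread=True,
              polymorphemic=True),
    Candidate("ama", "mother", nursery_or_imitative=True),
    Candidate("tu", "spit", nursery_or_imitative=True),
    Candidate("abere", "animal", loan_suspected=True),
]

for w in WORDS:
    print(w.form, exclusion_reasons(w))
```

A single reason is enough to exclude a word at this stage, which matches the
stated aim of finding the best candidates rather than all possible ones. Words
set aside here can be re-examined later, once the phonological characteristics
of the surviving candidates are known.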

> In addition to all of the criteria Larry Trask mentions, I think there
> should be another criterion:  For each item on the basic 100-word
> or 200-word Swadesh list, be sure to INCLUDE SOME vocabulary item
> whose meaning matches that item.
> Simply on the grounds that every language will have vocabulary
> for such meanings, so reconstructing an early Basque without any
> term for such a meaning is contra-indicated.

Absolutely not.

One of Swadesh's words is 'mountain'.  If we applied this criterion to English,
then we'd have to include 'mountain' as the only English representative -- but
it's a shrieking loan word.

Likewise, there are three Basque words which might be glossed as 'animal':
<animalia>, <piztia>, and <abere>.  All three are transparent borrowings from
Romance.  So what on earth could be the point of including them?
This is perverse.  We cannot decide *in advance* which Basque words must be
native.

> This is not a criterion for evaluating any particular proposed vocabulary
> item in Proto-Basque, it is rather a global criterion which can be
> used to evaluate the sum total of the judgments on individual candidates
> for inclusion.  It can tell us that we have excluded too much,
> and in what semantic ranges we should probably seek additional
> candidates for inclusion.

I am not concerned at this stage about excluding too much.  I am far more
concerned about including things that shouldn't be there.

> I would bet there are many other criteria which might be added,

Let's see these, then.

> and balancing them all together to make decisions will yield
> better results than using a simpler set of criteria and allowing
> any one otherwise reasonable criterion to dictate inclusion or exclusion.
> Larry Trask has shown his ability to use many criteria beyond the
> simple set in discussing particular vocabulary items
> (such as /sei/ or any other).

Thank you.  But the criteria I invoke depend upon the problem in hand.
Not all problems require the same criteria.

Larry Trask
COGS
University of Sussex
Brighton BN1 9QH
UK

larryt at cogs.susx.ac.uk



More information about the Indo-european mailing list