Excluding Basque data

Larry Trask larryt at cogs.susx.ac.uk
Wed Oct 6 08:34:24 UTC 1999


On Sat, 2 Oct 1999, Roslyn M. Frank wrote:

[LT]

>> So tell me: how do my principal criteria of early attestation,
>> widespread distribution, and absence from neighboring languages "have
>> the effect" of biasing my results on phonological form?

> Could you share with us once more precisely what these criteria are?
> I seem to recall that earlier you listed them but I can't locate
> that email.

OK.  I propose the following primary criteria.

1. Early attestation

The word should be recorded early.  I have proposed a cut-off date of
1600, since the first substantial literature appears in the 16th
century.  Someone else (Jon Patrick?) suggested 1700 instead.  This is
reasonable: the 16th-century texts are not numerous; they are all
written by clerics; and they are overwhelmingly religious, with many of
them being translations.  The 17th-century literature, in contrast, is
much more voluminous, and it includes the first lay writers, notably the
important Oihenart.  I'm happy with 1700, though I suspect it won't make
a great deal of difference.  But nothing later.

2. Widespread distribution

There exists a fairly conventional system of dialect boundaries, and
words are commonly reported according to the recognized dialects in
which they are attested.  Since some dialects are smaller than others,
and also less well recorded, I suggest the following provisional
groupings:

	(a) Bizkaian
	(b) Gipuzkoan
	(c) High Navarrese, Salazarese, Aezkoan
	(d) Lapurdian, Low Navarrese
	(e) Zuberoan, Roncalese

Now, I suggest counting a word as widespread if it is securely attested
in at least four of these five groupings.  Insisting on attestation in
all five would be more rigorous, and is possible, but it would certainly
exclude an unknown number of good candidates for native and ancient
status.  For example, <(h)itz> `word' is universal except that it is
absent from Bizkaian, which has only the Romance loan <berba>.  We could
argue about the details, but I wouldn't like to relax the criteria much
beyond my proposal.
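
For anyone who wants to apply this criterion mechanically, here is a minimal sketch in Python. The grouping table and the four-of-five threshold come from the proposal above; the function names and the idea of passing in a plain list of dialect names are assumptions made for illustration, and deciding what counts as "securely attested" is left to whoever compiles that list.

```python
# Provisional dialect groupings (a)-(e) from the proposal above.
GROUPINGS = {
    "a": {"Bizkaian"},
    "b": {"Gipuzkoan"},
    "c": {"High Navarrese", "Salazarese", "Aezkoan"},
    "d": {"Lapurdian", "Low Navarrese"},
    "e": {"Zuberoan", "Roncalese"},
}


def groupings_attested(dialects):
    """Return the labels of the groupings in which the word is attested.

    `dialects` is any iterable of dialect names in which the word is
    securely attested (hypothetical input format).
    """
    attested = set(dialects)
    return {label for label, members in GROUPINGS.items() if members & attested}


def is_widespread(dialects, minimum=4):
    """True if the word is attested in at least `minimum` of the five groupings."""
    return len(groupings_attested(dialects)) >= minimum


# <(h)itz> `word': attested everywhere except Bizkaian.
ALL_DIALECTS = [d for members in GROUPINGS.values() for d in members]
hitz = [d for d in ALL_DIALECTS if d != "Bizkaian"]
print(is_widespread(hitz))                       # True: four groupings of five
print(is_widespread(["Bizkaian", "Gipuzkoan"]))  # False: only two groupings
```

Raising `minimum` to 5 gives the stricter all-groupings version, under which <(h)itz> would be excluded.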

3. Absence from neighboring languages

Basque has borrowed a vast number of words from neighboring languages,
mostly from Latin and Romance, but also a few from Celtic and Arabic,
and also from Germanic, though all the Germanic loans seem to have
entered via Romance.  In contrast, very few Basque words have penetrated
into Romance.

This is the hardest criterion to formalize, mainly because there exist a
number of Basque words which *may* be borrowed but for which a secure
source acceptable to all has not been identified.  I suggest that, if
Agud and Tovar's etymological dictionary shows a widespread belief or
suspicion among specialists that a word is borrowed, then it should be
excluded -- even if the loan origin is not certain.  Caution is vital
here, in my view.

A decision must be made about the very few shared words which are
thought to be of Basque origin.  For example, everybody believes that
the Castilian and Portuguese words for `left (hand)' are borrowed from
Basque <ezker>.  A policy must be adopted here, but such words are
vanishingly few anyway, and the decision is most unlikely to have any
significant consequences.

These are my primary criteria.  I personally would like to exclude, in
advance, obvious nursery words like <ama> `mother' and obvious imitative
words like <tu> `spit'.  However, I agree that it may not be easy to get
general agreement as to which words "obviously" fall into one of these
categories.  Still, I am hopeful that appeal to universal properties of
such formations will allow me to exclude at least the most blatant
cases, such as my two examples: it is well known that words like <ama>
`mother' and <tu> `spit' occur in languages all over the planet.
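
Taken together, the three primary criteria and the optional exclusion amount to a simple filter. The sketch below is one possible way to write it, not a description of any existing database: the `Candidate` record, its field names, and the sample entries are all hypothetical. It also assumes that "cut-off date" means first recorded in or before that year, and that the loan-suspicion and nursery/imitative judgments have already been made by hand.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    form: str
    first_attested: int   # year of the earliest record (hypothetical field)
    grouping_count: int   # how many of the five dialect groupings attest it
    suspected_loan: bool  # widespread specialist suspicion of borrowing
    nursery_or_imitative: bool = False  # judged by hand, not computed


def passes_primary_criteria(c, cutoff=1700, min_groupings=4):
    """Apply the three primary criteria: early, widespread, not borrowed."""
    return (
        c.first_attested <= cutoff
        and c.grouping_count >= min_groupings
        and not c.suspected_loan
    )


def is_candidate(c, cutoff=1700, min_groupings=4):
    """Primary criteria plus the exclusion of nursery and imitative words."""
    return passes_primary_criteria(c, cutoff, min_groupings) and not c.nursery_or_imitative


# Invented placeholder records, purely to exercise the filter.
samples = [
    Candidate("example-1", 1571, 5, suspected_loan=False),
    Candidate("example-2", 1745, 5, suspected_loan=False),  # recorded too late
    Candidate("example-3", 1571, 2, suspected_loan=False),  # too localized
    Candidate("example-4", 1571, 5, suspected_loan=True),   # suspected loan
    Candidate("example-5", 1571, 5, suspected_loan=False, nursery_or_imitative=True),
]
kept = [c.form for c in samples if is_candidate(c)]
print(kept)
```

Passing `cutoff=1600` gives the stricter date originally proposed.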

> Could you also define more precisely what you mean by "widespread
> distribution"?

I've just done so.

> For example, what would an unacceptable distribution be?
> Present in only the Gipuzkoan and Bizkaian dialects? Present in the
> "northern" dialects of Iparralde (French Basque region) but not in any of
> the other dialects?

Both of these distributions would be unacceptable under my proposed
criteria.

> Or contrarily would you argue that the phonology of a term used in
> all dialects should always take precedence over one that is limited
> to one dialectal variant only?

This raises another matter.  When -- as so often -- a word exists in
several regional variant forms, what form should go into the list?
My answer is that we should simply appeal to the known phonological
prehistory of Basque, and use the form which can be reconstructed as the
common ancestor.

For example, take the word for `wine', which has the following regional
variants:

	old B		<arda~o> (nasalized)
	B		<ardao>
	G, HN, Sal, Aez	<ardo>
	L, LN		<arno>
	Z		<ardu~'> (final vowel nasalized and stressed)
	R		<ardau~> (nasalized)

The combining form is <ardan->, as in <ardantza> `vineyard' (<-tza>
noun-forming suffix) and <ardandu> `ferment' (<-tu> verb-forming
suffix).  We therefore reconstruct *<ardano> for Pre-Basque, with
familiar developments leading to the attested variants and to the
combining form.

I hope this clarifies things.

Finally (this is not a reply to Roz, who I don't think has raised this
point), if anybody out there still believes that my primary criteria are
somehow likely to skew the results in some phonological way, or if
anybody thinks that there exist better criteria for the purpose of
identifying the best candidates for native and ancient status in the
language, let's hear about it.

Larry Trask
COGS
University of Sussex
Brighton BN1 9QH
UK

larryt at cogs.susx.ac.uk


