Excluding much?

ECOLING at aol.com
Sun Dec 12 02:28:21 UTC 1999

It has been pointed out to me several times in private communication
that Larry Trask's criteria for establishing his list of
"best" candidates for early Basque monomorphemic lexical items
(I hope that phrasing accurately represents Trask's statements
of his goals)
contain two parts which together will cause the total list to be
very seriously reduced, to a tiny number of items.

I think Trask should have an opportunity to clarify or deal
with this issue publicly, so here are the specifics.

Among other reasons for asking this explicitly,
I understood one of Trask's more recent communications
about Basque to be saying that there are more documentary
sources available than I had previously understood him as saying.
(The question on interaction of these two criteria is quite
independent of other questions which have been under discussion.)

1.  Only include candidates attested in four out of five major dialect
          (groupings I believe Trask has established based on his study of
          which are more closely related to each other, which more
2.  Only include candidates from the earliest documentation,
          either pre-1600, as Trask prefers,
          or pre-1700, as he has said he is willing to consider a

Do these two criteria together mean that almost no lexical
items will qualify, only the most rudimentary lexicon,
words which are almost totally independent of subject matter and style,
such as English "and, the, good, very, when, not, come, go"?

How many does Trask estimate this would permit to be included?
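
To make the interaction of the two criteria concrete, here is a minimal
sketch in Python. Everything in it is hypothetical: the dialect labels
D1 to D5, the placeholder forms word_a to word_d, and the attestation
years are invented for illustration and are not Basque data. It only
shows how requiring both wide dialect attestation and an early first
attestation shrinks a candidate list, and how moving the cutoff from
1600 to 1700 changes the result.

```python
from dataclasses import dataclass

# Hypothetical labels standing in for the five major dialect groupings.
DIALECT_GROUPS = frozenset({"D1", "D2", "D3", "D4", "D5"})


@dataclass(frozen=True)
class Candidate:
    form: str               # placeholder form, not a real Basque word
    attested_in: frozenset  # dialect groupings in which the form is attested
    first_attested: int     # year of earliest documentation (invented)


def passes(c, min_groups=4, cutoff_year=1600):
    """Apply both criteria: wide dialect attestation AND early attestation."""
    widely_attested = len(c.attested_in & DIALECT_GROUPS) >= min_groups
    early_enough = c.first_attested < cutoff_year
    return widely_attested and early_enough


def surviving(candidates, **criteria):
    """Return the forms that satisfy both criteria."""
    return [c.form for c in candidates if passes(c, **criteria)]


# Invented sample: each entry fails or passes for a different reason.
SAMPLE = [
    Candidate("word_a", frozenset({"D1", "D2", "D3", "D4", "D5"}), 1545),
    Candidate("word_b", frozenset({"D1", "D2", "D3", "D4"}), 1650),
    Candidate("word_c", frozenset({"D1", "D2"}), 1571),
    Candidate("word_d", frozenset({"D1", "D2", "D3"}), 1820),
]

if __name__ == "__main__":
    print(surviving(SAMPLE))                    # pre-1600 cutoff
    print(surviving(SAMPLE, cutoff_year=1700))  # pre-1700 cutoff
```

With this invented sample, only word_a survives the pre-1600 cutoff, and
word_b joins it under the pre-1700 cutoff. How many real items survive
is of course the empirical question asked above.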

Lloyd Anderson

After writing the above, I read Trask's listings of ordinary vocabulary
from the religious texts.  That is a very good list, and I thank Trask for it.
Most of it, though by no means all, fits the category I mention just above of
words almost independent of subject matter and style.

There remain for me two content questions.

(a)  What happens to the list Trask posted, when he applies the rule
    that four out of five dialects must attest each item to be included?
    Still the very most basic vocabulary should survive,
    perhaps a list much like what Trask just provided us,
    but though I believe so, it is an empirical question.

(b) When I wrote the following, there was a scope ambiguity:
     Speaking of attestation primarily in religious texts:
     >  That is a very strong bias of content, I would think against quite a
     >  range of vocabulary from ordinary life.

I intended not "against [there being] a range of vocabulary from ordinary life",
which was Trask's interpretation in this case (a reasonable possibility),
but rather I intended "against quite [large parts of the] vocabulary
of ordinary life [which are not religious in content]".  The grammatical
items and most basic adjectives, nouns, and verbs, would still occur,
probably in all dialects, but whether measured in a single dialect group
attestation in religious texts, or if requiring four out of five dialect groups,
I would think many words like these would not be found
in such a high proportion of texts which are not oriented toward
a subject content which would promote their inclusion.
I will *of course* be wrong about some of the following,
but others could be substituted.  And I know from previous
correspondence that Trask believes some of these are in
vocabulary domains where almost all Basque vocabulary
is loanwords from other languages.  I assume not all of the items of
ordinary life which would fail to be widely enough attested would
be such loanwords.  The following list is *of course* not
designed with any knowledge specifically of Basque in mind.
But it illustrates roughly some of what I mean is included
by "ordinary life" on land and sea.

"cartwheel", (horse's) "bit", "fleece", "canal",
"rafter", "threshold", "sunrise", "planet", "yoke",
"thresh", "root-cellar", "eye" (of potato), "scrape",
"consult", "dig", "build", "rudder", "hull" (of boat),
"hull" (of seed), "bill" (of bird), "down" (of bird),
"mast", "shear", "midwife", "stillbirth", "roe",
"kitten", "chick", "kid" (of goat), "foal", "filly",
"vixen", "badger", etc.
names of quite a number of plants and animals,
perhaps some kin terms, some terms relating to marriage
ceremonies, "visit", "adopted" (child), "village idiot",
various kinds of earth and minerals and plant products,
etc. etc. etc.

The items I find in Trask's list posted today which are
not the strongest core grammatical items but slightly more like
those I have sketched above are from his first list:

acquaintance, ability, arrive, country, drown,
hair, custom, prudent, dark,

and from the second list:

king, star, gather, lord, dream, mouth

Notice that there are no overlaps between these two
sets of vocabulary from ordinary life,
which are not in the highest-frequency sets.
So I believe (perhaps wrongly) that many of these will not
satisfy the criterion of occurring in four out of five dialects.

Of course, Trask's samples were only a few lines
from a New Testament preface and from Chapter One of Matthew.
Once we include the entire text, things should be better.
How much better?  I do not claim to know.
That is why I think my questions in this message
are really empirical questions.

Of course, if we correct our estimates again,
by noting that even the rarest items which did occur
in Trask's two short samples are not as rare as
most of the sample items in the sketch I gave above,
it again looks less probable that a wide range of
rare vocabulary of ordinary life will be covered,
therefore not a wide range of polysyllables,
relative to the monomorphemic polysyllables
which really did exist in spoken Basque of the time.
