Expressive forms (long)

Larry Trask larryt at
Sat Nov 9 17:01:10 UTC 1996

The following message I have recently posted to the Nostratic list, as
part of the discussion of the proper methodology for comparing
languages which has been taking place on that list.  But it occurs to
me that the point raised here may be of more general interest to
historical linguists.  I would be interested to find out to what
degree expressive forms of the type chiefly discussed below occur in
other languages, and what problems they pose in historical work.

There has been some skepticism expressed on this list (and elsewhere)
about the reality of what I have called "expressive forms" and about
the validity of my policy of excluding such forms from any possible
comparison with other languages when trying to establish a genetic
link in the first place.  In this posting, I'll try to explain what I
mean by this term and show why I take the position I do.

Everyone is familiar with words of purely onomatopoeic origin, such as
COCK-A-DOODLE-DO, and few people would dream of trying to use such
items as comparanda -- or so I hope.  These names for noises are the
most obvious type of what we might call "expressive forms", but they're
not really what I have in mind.

Somewhat more interesting are items which were originally coined as
onomatopoeias but have acquired transferred senses.  Among these are
the numerous words in the world's languages of the form BER(BER),
which were coined as imitations of the sound of boiling water but have
been transferred to senses like `hot', `burn', `fire' and `cook'.  At
the very least, such items cannot be given full weight in
comparisons, because they are treacherous, and it is for this reason
that I have objected to the use of Basque BERO `hot' in remote
comparisons.  Another possible example is English OOMPH, which some
scholars think may have originated as an imitation of the bellow of a
mating bull.  More certain is Basque TU `spit, saliva', whose origin
as an onomatopoeia for the sound of spitting is surely too obvious to
need pointing out.  This too has been compared with similar words in
other languages, entirely improperly, since all such words doubtless
have the same origin.  Also certain is the English verb SLURP, derived
from a name for the sound.

More interesting again are words obtained by some kind of playful
alteration of existing words.  This process does not appear to be
prominent in English, but consider French.  Spoken French uses a large
number of playful alterations: AMERICAN becomes AMERLO, APERITIF
NUNU, SAC is inverted to KEUS, FEMME is inverted to MEUF, and so on.
Some of these remain confined to vernacular styles, while others, like
APERO, almost seem to have driven their source words out of the
language.

The French formations, being recent, present no problems to
etymologists, but ancient examples of the same type may be more
difficult to identify.  Basque provides an excellent example.  As a
general rule, word-forming prefixes are absolutely wanting in Basque.
But we find a curious little group of words which exist in two very
different forms.  Here are some examples:

UDARI `pear', MADARI `pear'
GAKO `hook', MAKO `hook'
HEGAL `wing', MAGAL `wing'

It is clear that, at some time in the distant past, there was in
Basque a process by which the morph MA- could be attached to the
beginning of a word to produce some kind of playful or expressive
variant.  The process is long dead, and we can no longer recover the
original function of MA-addition, but we may be confident that the
forms with initial MA- are secondary, and hence that they are
unavailable for comparison.  Since Pre-Basque had no phoneme /m/ in
lexical items, it may well be that the very rarity of this consonant
favored its use in expressive variants.

The detection of this ancient process has a further consequence: *any*
Basque word with initial MA- for which no secure etymology is
available should be treated with suspicion, since it might well be in
origin just such an expressive variant of an earlier word now lost.
Yet those whose work I have criticized do not hesitate to adduce such
words as comparanda, and indeed MAGAL itself has been invoked in a
comparison -- though in this case, compounding error with error, the
word was cited as `belly', a meaning it does not have, apparently
resulting from a misinterpretation of the Spanish gloss meaning `lap'
(the development of the sense is roughly `wing' > `edge' > `hem' >
`lap').

But the most dramatic and interesting cases of expressive forms are
words which appear to have been coined more or less out of thin air
because of their appealing sound.  English does this much less
frequently than some other languages, but we none the less have
of course the celebrated BLURB.  A particularly interesting set is
represented by Middle English TINE, which was converted by the Great
Vowel Shift into TINY, which has more recently been joined by the
expressive coinages TEENY, TEENSY and TEENSY-WEENSY.  The
embarrassment of lexicographers and etymologists with these words is
patent, and they sometimes prefer wild guesses to an admission of
defeat, as with SHAM, sometimes linked to SHAME on the basis of zero
evidence.  A few of these may have some kind of vague source, such as
GLOB, thought to have something to do with GLOBE and BLOB.

Many other languages do this far more frequently than does English,
and Basque is a language in which such formations have clearly been
very frequent indeed, probably throughout the history and prehistory
of the language.  And these are precisely the expressive forms I
usually have in mind in dismissing a proposed comparison as
immediately untenable on the Basque side.

In English, such expressive items often have forms which are not
conspicuously different from the forms of ordinary lexical items,
though they do often tend to have meanings in particular semantic
areas, such as `mucky stuff' and `devastating action'.  In Basque,
however, in which we generally lack the kind of historical testimony
available for English, we are fortunate that such expressive forms
tend strongly to have highly distinctive phonological forms, quite
different from those of ordinary lexical items, while they too tend to
have meanings in certain semantic domains.  Thus, even in the absence
of explicit records, we can *usually* manage to identify such
formations with some confidence -- and such formations clearly have no
business being included in a remote comparison.

Here are some of the characteristics of expressive forms in Basque.
No single word shows all of them, or even most of them, but I wouldn't
normally venture to class a word as an expressive form unless it
showed several of them.

1. Frequency of /m/, both initially and medially.
2. The specific form /mVPV(-)/, where V is any vowel and P is any
     voiceless plosive, or /mVSPV(-)/, where S is any sibilant.
3. Frequency of palatal consonants, especially TX and X and especially
     in initial position.
4. Initial (and sometimes final) voiceless plosives.
4a. The specific form PVPV(-), where both consonants are voiceless
     plosives.
5. Unusual clusters.
6. Initial clusters.
7. Unusual length (four or more syllables).
8. Reduplication (partial or total) (very often with initial /m/ in
     the second occurrence).
9. Unusual variation in form (not parallel to the variation observed
     in ordinary lexical items).
10. Unusual range of seemingly unrelated meanings.
11. One of several identifiable semantic domains:
     Physical and moral defects;
     Noises and noise-making objects;
     Varieties of movement or activity;
     Conspicuous meteorological phenomena;
     Small creatures (creepy-crawlies, insects, fish, birds, other
       arthropods, small reptiles and mammals);
     Sexual terms;
     Names of unpleasant substances;
     Words for projections and extremities.
       (This last is perhaps a little more unexpected than the others,
       but there appear to be a number of examples.)
12. Confinement to one area of the country.
13. Notable regional preference for particular shapes.
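The shapes named in properties (2) and (4a) are simple enough to state as patterns.  What follows is only a minimal sketch in Python, assuming ordinary Basque spelling, treating just the letters S, Z and X as sibilants, and using a hypothetical helper name `decisive_shape`.  A match shows only that a word has the shape in question; borrowing still has to be ruled out separately.

```python
import re

VOWEL = "[aeiou]"
PLOSIVE = "[ptk]"    # voiceless plosives
SIBILANT = "[szx]"   # assumption: only the plain sibilant letters are counted

# Checked in this order; each pattern is anchored at the start of the word.
SHAPES = [
    ("mVSPV", re.compile(f"^m{VOWEL}{SIBILANT}{PLOSIVE}{VOWEL}")),
    ("mVPV", re.compile(f"^m{VOWEL}{PLOSIVE}{VOWEL}")),
    ("PVPV", re.compile(f"^{PLOSIVE}{VOWEL}{PLOSIVE}{VOWEL}")),
]


def decisive_shape(word):
    """Return the label of the first matching shape, or None if none matches."""
    w = word.lower()
    for label, pattern in SHAPES:
        if pattern.match(w):
            return label
    return None


for w in ["MUTUR", "MUSTUR", "POTORRO", "KUKUR", "ZATI", "BAZTER"]:
    print(w, decisive_shape(w))
```

Because each plosive must be followed directly by a vowel, a word beginning with the affricate TX is not mistaken for one beginning with the plosive T.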
Now these phonological properties are simply not exhibited by ordinary
indigenous lexical items, even when they fall into a relevant semantic
domain, and hence words exhibiting them are almost certainly of
expressive origin, providing we can rule out other sources, such as
borrowing.  Not all these features are equally significant.  In
particular, properties (2) and (4a) are, even on their own, virtually
decisive, since ordinary indigenous words absolutely never have such a
shape.

Now consider a case like PINPIRIN `butterfly'.  This word has not only
been cited in remote comparisons, it has even been cited by Bengtson
and Ruhlen as continuing their putative "Proto-World" root *PAR `fly'
(verb).  But this word exhibits very many of my features.  It has
property (4) (initial voiceless plosive), property (5) (strange
cluster /np/, not found in ordinary lexical items), property (9)
(unusual variation in form: PINPIRIN(A), PINPILIN, PINPILINPAUXA, and
others, not comparable to ordinary words), property (10) (unusual
range of meanings, including `butterfly', `garfish', `bud'), property
(11) (insect name), property (12) (entirely confined to one small corner
of the country, all other regions having quite distinct words for
`butterfly', often also expressive), and property (13) (this region
strongly favors expressive formations in PIN- and PAN-, which are rare
elsewhere).  We may therefore safely conclude that this is an
expressive formation, especially since initial /p/ is categorically
absent in indigenous lexical items.
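The reasoning applied to PINPIRIN can be laid out as a small tally.  This is a sketch only: the function name `classify`, the labels it returns, and the reading of "several" as three or more properties are all assumptions made for illustration, and the property numbers are those of the list given earlier.

```python
DECISIVE = {"2", "4a"}   # shapes described as virtually decisive on their own
SEVERAL = 3              # assumption: "several" is read as three or more


def classify(properties, loan_ruled_out=True):
    """Rough verdict from the set of property numbers a word exhibits."""
    props = set(properties)
    if not loan_ruled_out:
        return "possible loan"
    if props & DECISIVE:
        return "expressive (decisive shape)"
    if len(props) >= SEVERAL:
        return "expressive (several properties)"
    return "undetermined"


# Properties noted for PINPIRIN in the paragraph above.
pinpirin = {"4", "5", "9", "10", "11", "12", "13"}
print(classify(pinpirin))
```

A word showing only one non-decisive property comes out as "undetermined", which matches the policy of not classing a word as expressive on slender grounds.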
Or consider POTORRO `vulva', also adduced in long-range comparisons.
This at once shows both initial /p/ and the decisive property (4a),
the PVPV(-) shape.  It cannot possibly be indigenous.  Moreover, this
is a sexual term (property 11), and it exhibits an almost astronomical
number of variant forms differing in ways not paralleled by ordinary
words.  This is an expressive formation, and it can't be invoked in
comparisons.

Another word which has been cited in comparisons is MUTUR `snout,
extremity'.  This is one of the `extremity' words mentioned above, and
it has the shape MVPV(-), which, outside of loan words, points
categorically to an expressive origin.  Moreover, it has a variant
MUSTUR, and such variation in form is typical of expressive forms but
absolutely unknown in ordinary words: Basque ZATI `segment' does not
have a variant *ZAZTI, nor does BAZTER `edge' have a variant *BATER.
Unless it's a loan word (which is possible), this is an expressive
form; in any case, it has no business being adduced in comparisons.

Also adduced in comparisons are KUKUR and TUTUR, both `crest' (on a
bird or animal).  But these are categorically identified as expressive
forms by their shape PVPV(-); they appear to be reduplicated; they are
`extremity' words; they may even be the same item in origin, in
which case they show a type of variation unknown in indigenous
words.

The great majority of expressive formations in Basque are easy to
recognize by these criteria.  Such words as ZIRIMIRI `drizzle',
ZURRUMURRU `whisper, rumor, gossip', KARRAMARRO `crab', ARMIARMA (and
many variants) `spider', TRIKI-TRAKA `toddling', TXIKILI-TXAKALA
`coitus', MARA-MARA `steadily, continuously, smoothly', MAKAR `the
crud you rub out of your eyes in the morning', TXISTMIST (and several
variants) `lightning', ZARRAMARRA `trash', KOKO `larva infesting
maize', AIKOMAIKO `pretext, excuse', and many hundreds of others are
so obviously expressive formations that there is nothing to discuss.
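Several of these words show property (8) in its commonest form: the first part of the word is repeated with /m/ in place of its initial consonant, or with /m/ added if it begins with a vowel.  A minimal sketch of a check for that pattern follows; the helper names `strip_onset` and `is_m_echo` are hypothetical, and the test is deliberately strict, so that a partial reduplication such as KARRAMARRO is not counted.

```python
VOWELS = set("aeiou")


def strip_onset(s):
    """Drop any leading consonants, returning the rest from the first vowel on."""
    for i, ch in enumerate(s):
        if ch in VOWELS:
            return s[i:]
    return ""


def is_m_echo(word):
    """True if the word has the form X + m + (X without its initial consonants)."""
    w = word.lower().replace("-", "")
    for i in range(1, len(w)):
        if w[i] != "m":
            continue
        rest = strip_onset(w[:i])
        if rest and rest == w[i + 1:]:
            return True
    return False


for w in ["ZIRIMIRI", "ZURRUMURRU", "TXISTMIST", "AIKOMAIKO", "KARRAMARRO"]:
    print(w, is_m_echo(w))
```

Total reduplication of a word that itself begins with /m/, as in MARA-MARA, also passes this check.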
Basque has one particularly interesting class of expressive forms with
especially distinctive properties.  This is the class of adjectives
denoting physical or moral defects.  They all begin with the
expressive consonant /m/, and they are usually two syllables long
(sometimes three).  Here are just a few examples of what is a rather
large group: MOTEL `weak, insipid', MATZER ~ MATXAR `deformed,
twisted, defective', MAZKELO `clumsy', MAZKARO `blackened, dirty',
MAKAL ~ MAZKAL ~ MASKAL `weak, feeble, sick', MOKOR `perverse', MOZKOR
~ MOXKOR `squat, stout, fat; drunk', MUTZULU `wild, savage,
unsociable', MIXKIRI `envious', MOTROTX `stocky, plump', MOTZOR
`crude', MUKER `unsociable, arrogant', MUKUR `clumsy, crude', MUTXIN
`angry', MALMUTZ `fat, obese; tricky, shrewd', MAKUR `twisted,
crooked', MIRRIN `shriveled, scraggly', MALTZUR `dishonest'.  Any
given word is usually confined to a certain part of the country, but
every region has a number of these formations.  These words have no
source, and it appears that this pattern has long been available for
coining new words in the relevant semantic domain.  Such words cannot
be invoked in comparisons, and that is the end of it.

Naturally, not every case is beyond dispute.  The three words for
`hail', BABAZUZA, KAZKABAR and TXINGOR, illustrate the defining
features less well than my other examples, but on balance they are
more likely to be expressive forms than not.  The word for `ant',
INHURRI ~ TXINGURRI ~ TXINAURRI (and other variants) cannot with
certainty be regarded as an expressive form, but the bizarre regional
variation in form nevertheless points strongly to an expressive
origin.

Loan words may accidentally present many of my features.  For example,
MUKU ~ MUKI `mucus' and MIKA `magpie' qualify rather well as
expressive forms, but we can nonetheless be sure they are loans from
the synonymous Latin MUCU and PICA.  Particularly interesting is
KIRIKIN~O `hedgehog'.  This looks for all the world like an expressive
form, but it seems impossible to separate this word from the
synonymous Latin ERICINEU, which might readily have been borrowed as
*IRIKIN~O, and it rather looks as if the Basques borrowed the word and
then added an initial /k/ in order to make the word look more like a
typical expressive form.

The point of all this is that Basque possesses an exceedingly large
number of words of purely expressive origin, words that have been
coined more or less out of thin air because of their appealing sound.
In most cases, these items can be securely identified by appealing to
the criteria I have listed, some of which are almost totally decisive.
Such words have absolutely no business being adduced in comparisons.
Therefore, when I reject a comparison involving one of these words, as
I very often do, I am not just pig-headedly flinging the label
`expressive form' around as a kind of magical curse to get rid of
impressive-looking matches that offend my determination to keep Basque
isolated: I have very good reasons for rejecting the comparison.
Surely nobody would dream of trying to use English TEENSY in a remote
comparison, and Basque cannot be an exception to the ordinary
standards of good practice in comparative work.

Larry Trask
University of Sussex
Brighton BN1 9QH
larryt at

