[Lexicog] roots, stems, idioms, & phrases in an Oceanic language

Ronald Moe ron_moe at SIL.ORG
Mon May 31 19:46:07 UTC 2010


Hi Paul,

There is no neat way to distinguish various kinds of complex forms. Instead
languages employ a variety of means to build complex forms out of the
building blocks of the lexicon. These building blocks are things like roots,
affixes, reduplication rules, tone/stress patterns, vowel replacement rules,
and other strategies. We have some terms like "compound" and "derivative"
which have standard definitions. But I've seen lots of examples of complex
forms that used a combination of roots and derivational affixes. So there
are no neat categories. Your examples illustrate a number of strategies. Any
time a complex form contains two or more words, we would generally call it a
lexical phrase or multi-word expression (MWE). But you have examples of MWEs
that contain compounds and reduplicated forms. So these complex forms are
built using more than one strategy. In such cases it is fruitless from a
theoretical perspective to try to label them. However, in a practical
dictionary, users may need a label. The general practice is to label anything
with more than one word a "phrase" of some type. Any single word with more
than one root is labeled a "compound". Any word with a single root and one
or more derivational affixes is labeled a "derivative". Other than these
cases, the labeling gets tricky. You could use the term "reduplicated root"
for ghanaghana. The term "phrasal verb" is used for a phrase that contains a
verb and some sort of particle. In English it is used of a verb plus
preposition such as "think up". But a verb-adjective phrase like "think
high" would not be called a phrasal verb.
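
The rule of thumb above can be sketched in a few lines of Python. This is
only an illustration: the Word class, its fields, and the "unlabeled"
fallback are names I am assuming here, not part of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    """One orthographic word, analysed into roots and derivational affixes."""
    roots: List[str]
    derivational_affixes: List[str] = field(default_factory=list)


def label_complex_form(words: List[Word]) -> str:
    """Apply the general labeling practice; anything else stays unlabeled."""
    if not words:
        raise ValueError("a form needs at least one word")
    if len(words) > 1:
        return "phrase"          # more than one word
    word = words[0]
    if len(word.roots) > 1:
        return "compound"        # single word, more than one root
    if len(word.roots) == 1 and word.derivational_affixes:
        return "derivative"      # single root plus derivational affix(es)
    return "unlabeled"           # e.g. reduplicated roots: the tricky cases


print(label_complex_form([Word(["ghana"]), Word(["dea"])]))  # phrase
print(label_complex_form([Word(["ghila", "ghana"])]))        # compound
```

A form such as ghanaghana, with one root and no affix, falls through to
"unlabeled", which is exactly where the labeling gets tricky.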

 

There are theories of the lexicon (e.g. Cognitive Grammar, Construction
Grammar) that view the lexicon and grammar as a large continuum running from
morphemes on the simple end to grammatical patterns on the complex end.
Grammatical patterns, such as a clause structure, are seen as the same sort
of thing as lexemes, except they are less specified. Languages are full of
set phrases of various types. An example in English is "the X-er, the Y-er"
as in "The more, the merrier," and "The bigger they come, the harder they
fall." In the formula "the X-er, the Y-er" the "X" and "Y" can be filled
with a comparative adjective or some sort of phrase that contains a
comparative adjective. Such a formula is not specified for lexical content.
It has a general form and a general meaning that can be described.

 

The more you study the lexicon of a language, the more examples you will
discover that follow particular patterns. (Notice that the preceding sentence is
of the "the X-er, the Y-er" type.) One task of the lexicographer/grammarian
is to identify the lexical and grammatical patterns of the language,
including word forming patterns and lexico-grammatical phrase forming
patterns. English has many examples of the verb + preposition "phrasal verb"
pattern. But it has many other patterns. It has an adverb +
quantity.adjective pattern, such as "not many," "very little," "hardly any."
It has idiom-like phrases that include an unspecified possessive pronoun,
such as "stub my/your/his toe". This in turn is part of an even less
specified formula verb(strike) + poss.pro + noun(body.part), such as "hit my
head" or "banged his elbow".
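
As a rough sketch, a formula like verb(strike) + poss.pro + noun(body.part)
can be treated as a template with three slots, each filled from a word class.
The word lists below are small illustrative samples I am assuming for the
example, not a complete inventory of English, and the function name is
hypothetical.

```python
# Illustrative (incomplete) word classes for the three slots of the formula.
STRIKE_VERBS = {"hit", "banged", "bumped", "stub", "stubbed"}
POSSESSIVE_PRONOUNS = {"my", "your", "his", "her", "its", "our", "their"}
BODY_PARTS = {"head", "elbow", "toe", "knee"}


def matches_strike_pattern(phrase: str) -> bool:
    """Check a three-word phrase against verb(strike) + poss.pro + noun(body.part)."""
    tokens = phrase.lower().split()
    if len(tokens) != 3:
        return False
    verb, pronoun, noun = tokens
    return (
        verb in STRIKE_VERBS
        and pronoun in POSSESSIVE_PRONOUNS
        and noun in BODY_PARTS
    )


for example in ["hit my head", "banged his elbow", "think up"]:
    print(example, matches_strike_pattern(example))
```

The point is only that the formula fixes the slots and their word classes
while leaving the lexical content open.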

 

So I would recommend that you study the patterns in your language. You will
have to decide what patterns (or examples of patterns) you want to include
in your dictionary and what patterns should be handled in your grammar,
recognizing of course that there is not a sharp distinction between
"lexemes" and "grammatical structures". The general principle for inclusion
in the dictionary or grammar is what your users would expect and be able to
understand. You will also have to develop a set of terms to label the
various structures. In your examples I see what appear to be the following:

 

verb + verb

/bere ghilaghana/ v. 'recognise; lit. see know'
/ghana dea/ v. 'consider; lit. think go'
/ghana lubathi/ v. 'forget; lit. think leave'
/talu ghilaghana/ v. '"tag"; attach an identifying mark on a chicken (foot /
wing) to indicate ownership; lit. put know'



verb + adj

/ghana bule-kaghini/ v. 'think carelessly; lit. think crazy-CAUS'
/ghana mava/ v. 'respect; lit. think heavy'
/ghanaghana doku/ v. 'think hard'



verb + prep/adv

/ghana iti/ v. 'respect; lit. think up/high'
/ghana le/ v. 'forget; lit. think purposelessly'



verb + num

/ghana ruka/ v. 'doubt; lit. think two [times], twice'



complete reduplication

/ghanaghana/ n. 'thought, opinion'



Plus you have combinations:

 

complete reduplication, num + verb

/ruka ghanaghana/ v. 'reconsider, think twice'
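
If you want to pull candidate forms for the "complete reduplication" pattern
out of a wordlist, a simple string check is one possible starting point. This
is a sketch under the assumption that complete reduplication shows up as two
identical halves in the written form; it works on spelling only, so every hit
still needs checking against an attested base.

```python
def is_complete_reduplication(form: str) -> bool:
    """True if the written form consists of two identical, non-empty halves."""
    half, remainder = divmod(len(form), 2)
    return remainder == 0 and half > 0 and form[:half] == form[half:]


for form in ["ghanaghana", "ghilaghana", "ghana"]:
    print(form, is_complete_reduplication(form))
```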



You also have longer phrases that probably (but not necessarily) fit general
grammatical structures:

 

/ghana sapa longa/ v. 'think comprehensively; lit. think seaward landward'



I seriously doubt that your structures will correspond to the labels that we
use for English (e.g. phrasal verb). I have pretty much given up on trying
to label complex form types because there are just too many. I think the
terms "derivative", "compound", and "phrase" (or "idiom") are well enough
understood by the general populace to be used in a popular dictionary.
Beyond that, you run the risk of being too technical. It is always good
to test your users to see what they can understand and profit from. An entry
such as the following could work:

 

/ghilaghana/ (compound of ghila + ghana) v. Know; know how [to do s.t.].

 

But I think only a serious student would bother to find out what "comp.
redup." means:

 

/ghanaghana/ (comp. redup. of ghana) n. Thought, opinion.

 

In FieldWorks you can link complex forms to their roots. Then you can
configure your dictionary to output entries, such as the two above, that
indicate the root(s) of complex forms. You can also configure the dictionary
to provide labels such as "compound" or "comp. redup." In FieldWorks these
labels are called "Complex Form Types". These labels are different from the
grammatical category labels, such as "verb" and "noun". A label such as
"compound" indicates the internal structure of the complex form. A
grammatical category label indicates the morphological potential and
syntactic function of a word. For instance, the grammatical category "mass
noun" indicates that a word, such as "milk", functions syntactically like
other nouns. The "mass" part of the label indicates that "milk" does not
take the plural suffix. For this reason many dictionaries do not give
phrases a grammatical category label. (English has some orthographic
phrases, such as "of course", that function as single words, and can
therefore be given a grammatical category.) All these linguistic issues have
to be worked out for each language and we, as lexicographers, have to make
decisions about how to handle them in our dictionaries.
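
To make the distinction between the two kinds of label concrete, here is a
minimal sketch of an entry record that keeps the grammatical category
separate from the complex form type and its component roots. This is not the
FieldWorks data model or API; the Entry class, its field names, and the
render function are all hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    headword: str
    category: str                  # grammatical category, e.g. "v" or "n"
    gloss: str
    complex_form_type: str = ""    # internal structure, e.g. "compound"
    components: List[str] = field(default_factory=list)  # linked root(s)


def render(entry: Entry) -> str:
    """Format an entry, showing the complex form label only when present."""
    parts = [f"/{entry.headword}/"]
    if entry.complex_form_type and entry.components:
        joined = " + ".join(entry.components)
        parts.append(f"({entry.complex_form_type} of {joined})")
    parts.append(f"{entry.category}.")
    parts.append(entry.gloss)
    return " ".join(parts)


print(render(Entry("ghilaghana", "v", "Know; know how [to do s.t.].",
                   "compound", ["ghila", "ghana"])))
print(render(Entry("ghanaghana", "n", "Thought, opinion.",
                   "comp. redup.", ["ghana"])))
```

Keeping the two labels in separate fields means a popular edition could hide
the complex form type while a more technical edition shows it.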

 

Ron Moe

 

  _____  

From: lexicographylist at yahoogroups.com
[mailto:lexicographylist at yahoogroups.com] On Behalf Of lengosi
Sent: Sunday, May 30, 2010 10:27 PM
To: lexicographylist at yahoogroups.com
Subject: [Lexicog] roots, stems, idioms, & phrases in an Oceanic language

 

  

I'm plugging away at a lexicon for an Oceanic language and I've found myself
in a bit of a muddle... I've been using something called the Dictionary
Development Program to gather 'words' (and FieldWorks to work with the
data), and much of the data are clearly compound words or idioms / phrases.
Sometimes I think I know what I'm doing in terms of roots and stems, idioms
and phrases, and I happily forge ahead, but there are other times I haul
back on the reins and wonder if I've really got it straight. This is one of
those 'other times'...

Here's a short example concerning the word /ghana/:
/bere ghilaghana/ v. 'recognise; lit. see know'
/ghana/ v. 'think'
/ghana bule-kaghini/ v. 'think carelessly; lit. think crazy-CAUS'
/ghana dea/ v. 'consider; lit. think go'
/ghana iti/ v. 'respect; lit. think up/high'
/ghana le/ v. 'forget; lit. think purposelessly'
/ghana lubathi/ v. 'forget; lit. think leave'
/ghana mava/ v. 'respect; lit. think heavy'
/ghana ruka/ v. 'doubt; lit. think two [times], twice'
/ghana sapa longa/ v. 'think comprehensively; lit. think seaward landward'
/ghanaghana/ n. 'thought, opinion'
/ghanaghana doku/ v. 'think hard'
/ghanaghana kasuni/ v. 'think before acting'
/ghanaghini/ v. 'remember'
/ghilaghana/ v. 'know, know how [to do s.t.]'
/ruka ghanaghana/ v. 'reconsider, think twice'
/talu ghilaghana/ v. '"tag"; attach an identifying mark on a chicken (foot /
wing) to indicate ownership; lit. put know'

I think it would be safe to call /ghana/ a root, and something like
/ghanaghana/ a stem (full and partial reduplication are used for both N > V
and V > N derivation [among other things]), but things get muddled after
that... How do you discern a compound from an idiom from a phrase / phrasal
verb? 

Of course, as in many other Oceanic languages, there tends to be a lot of
reduplication to indicate things like intensity, plurality, etc., but I've
tried not to include such forms in the lexicon. But with something like the
verb /ruka ghanaghana/ 'reconsider', it looks like the /ghanaghana/ part is
reduplicated for plurality rather than V > N derivation.

Well, any clarity you could add to my muddle would be most welcome! Thanks,

Paul


