[Lexicog] database structure

John Roberts dr_john_roberts at SIL.ORG
Sat May 21 08:20:16 UTC 2005


Dick Watson wrote:

Second, at the price of oversimplifying, I would like to propose an approach to all of the rest of the relationships between lexemes.  Rather than entering some words as subentries or minor entries of other words, we should enter all of them as major entries.  

-------------------

In principle this is a very good suggestion and I would agree with it. However, how would you apply this proposal to a language like Amele (Papuan)? Any verb can be inflected for any of the following categories.

one of
    6 independent realis tenses and aspects, or
    7 independent irrealis tenses and moods, or
    9 dependent tenses, aspects and moods

plus
    SuAgr (subject agreement) in 1, 2, or 3 person and
    singular, dual, or plural number
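
To give a feel for the size of the problem, here is a rough count of the inflectional cells implied by the list above. It is only a sketch: it assumes that every tense/aspect/mood category combines with every person and number value and that all the resulting forms are distinct, which is not claimed in this post, so the figure is an upper bound. The names in the code are made up for illustration.

```python
from itertools import product

# Counts taken from the list of categories above.
TENSE_SETS = {
    "independent realis": 6,
    "independent irrealis": 7,
    "dependent": 9,
}
PERSONS = ("1", "2", "3")
NUMBERS = ("singular", "dual", "plural")


def max_inflected_forms():
    """Upper bound on inflected forms per verb (assumes no syncretism)."""
    # A verb takes exactly one tense/aspect/mood category: 6 + 7 + 9 = 22.
    tense_choices = sum(TENSE_SETS.values())
    # Subject agreement: 3 persons x 3 numbers = 9 cells.
    agreement_cells = len(list(product(PERSONS, NUMBERS)))
    return tense_choices * agreement_cells


print(max_inflected_forms())  # 198
```

Even before any derivation is considered, that is up to 198 forms for one verb.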

 

A verb like *qoc* 'to hit' has a stem plus infinitive suffix. The infinitive suffix can be replaced by any of the inflectional categories.



q-oc                'to hit'

Base+INF    



E.g. *qugina* 'I am hitting'  

 

Most of the inflectional paradigms are regular, so you wouldn't need to record them as entries in the dictionary. But the simultaneous (sim.) tense forms are irregular. There are three sim. tense paradigms (SS, DS.R and DS.IR), illustrated below first for 'hit'. Some verbs, like 'hit', reduplicate the first CV of the "stem" to express sim. tense. Others, like 'do', reduplicate the first V of the stem, and some, like 'call', reduplicate the first V of the subject agreement inflection.



ququg            'while I hit (SS) ...'

ququgin         'while I hit (DS.R) ...'

qoqomin        'while I hit (DS.IR) ...'

 

od-ec              'to do'

Base+INF      

 

oodig              'while I do (SS) ...'

oodigin          'while I do (DS.R) ...'

oodemin         'while I do (DS.IR) ...'

 

uta-ec             'to call'

Base+INF      

 

utaiig              'while I call (SS) ...'

utaiigin           'while I call (DS.R) ...'

utaeemin        'while I call (DS.IR) ...'
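
The three reduplication patterns can be sketched as three small string operations. This is only an illustration, not an analysis of Amele: the unreduplicated inputs (qug, odig, utaig and so on) and the stem/suffix split for 'call' are hypothetical, obtained simply by undoing the reduplication in the forms quoted above, and the function names are invented.

```python
VOWELS = set("aeiou")


def redup_first_cv(form):
    """Pattern 1 ('hit'): copy the initial consonant + vowel."""
    if len(form) < 2 or form[0] in VOWELS or form[1] not in VOWELS:
        raise ValueError(f"{form!r} does not begin with CV")
    return form[:2] + form


def redup_first_v(form):
    """Pattern 2 ('do'): double the first vowel of the form."""
    for i, ch in enumerate(form):
        if ch in VOWELS:
            return form[:i] + ch + form[i:]
    raise ValueError(f"{form!r} contains no vowel")


def redup_suffix_v(stem, suffix):
    """Pattern 3 ('call'): double the first vowel of the agreement suffix."""
    return stem + redup_first_v(suffix)


# Hypothetical unreduplicated inputs; the outputs are the forms in the post.
print(redup_first_cv("qug"))         # ququg
print(redup_first_v("odig"))         # oodig
print(redup_suffix_v("uta", "ig"))   # utaiig
```

The rules themselves are trivial. What cannot be computed is which of the three rules a given verb takes, and that is the information the dictionary has to carry.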



So, these forms would need to be entered in the dictionary because they are unpredictable. If your policy is to make a main entry for every unpredictable inflectional form, then you would have to enter 21 forms for each verb, i.e. the full 3 paradigms. In practice we just enter the third person singular SS form as a minor entry cross-referenced to the main entry for *qoc* 'to hit'.
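
As a rough picture of what that looks like in a database, here is a minimal sketch of a main entry with a minor entry pointing back to it. The class and field names are invented for illustration and do not correspond to any particular lexicon tool. Since the third person singular form is not quoted in this post, the first person singular form ququg stands in for it here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    headword: str
    gloss: str
    kind: str = "main"         # "main" or "minor"
    see: Optional[str] = None  # headword of the main entry, for minor entries


class Lexicon:
    def __init__(self):
        self.entries = {}

    def add(self, entry):
        # A minor entry must point at a main entry that already exists.
        if entry.kind == "minor":
            target = self.entries.get(entry.see)
            if target is None or target.kind != "main":
                raise ValueError(f"{entry.headword}: no main entry {entry.see!r}")
        self.entries[entry.headword] = entry

    def resolve(self, headword):
        """Follow the cross-reference from a minor entry to its main entry."""
        entry = self.entries[headword]
        return self.entries[entry.see] if entry.kind == "minor" else entry


lex = Lexicon()
lex.add(Entry("qoc", "to hit"))
# Stand-in form: the post quotes the 1sg form, not the 3sg one.
lex.add(Entry("ququg", "while I hit (SS)", kind="minor", see="qoc"))

print(lex.resolve("ququg").headword)  # qoc
```

One irregular form per verb then costs one small record, rather than 21 main entries.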



Then there are all the derivational categories to deal with. These categories are derivational because (a) they do not replace the INF suffix and (b) they do not apply to every verb.

 

Derivational categories:

Aspect: iterative (IT) vs. irregular iterative (IRIT)

Voice: reciprocal (RECIP) and impersonal (IMPERS)

Object agreement: direct (DOAgr), indirect (IOAgr), and oblique (OOAgr)

in 1, 2, 3 person and singular, dual, plural number

 

q-oc                                                'to hit'

Base+INF                                      

aqec                                                'to hit them'

+DOAgr

aqalec                                             'to hit them(2)'

+DOAgr

aqitec                                             'to hit them to/for me'

+DOAgr+IO/OOAgr                    

aaqitiec                                          'to hit them to/for me repeatedly'

+IT+DOAgr+IO/OOAgr             

qututuec                                        'to hit him repeatedly'

+IT+DOAgr                                  

ququocobocobec                         'to hit each other repeatedly'

+IT+RECIP                                    

qudoga doc                                   'for him to want to hit him'

+IMPERS+DOAgr                       

qoga duduec                                 'for him to repeatedly want to hit'

+IT+IMPERS                                

qocobqocobeiga adec                 'for them to want to hit each other'

+RECIP+IMPERS                         

qudocobqudocobec                    'to hit each other'

+RECIP+DOAgr                           

qutocobqutocobec                      'to hit to/for each other'

+RECIP+IO/OOAgr                     

aqaga doc                                      'to want to hit them'

+IMPERS+DOAgr                       

ququdocobdocobec                    'to hit each other repeatedly'

+IT+RECIP+DOAgr                     

ququtocobtocobec                      'to hit to/for each other repeatedly'

+IT+RECIP+IO/OOAgr               

qudoga duduec                            'for him to repeatedly want to hit him'

+IT+IMPERS+DOAgr                 

qudocobqudocobeiga adec     'for them to want to hit each other'

+RECIP+IMPERS+DOAgr     



For all of these forms it would be appropriate to have a main entry for each one. However, the different combinations can generate a large number of forms. Also, they are semi-predictable, so a better solution is to have a sampling listed within the main entry for *qoc*. The question then is whether you also list their sim. tense forms as minor entries. E.g.

 

qudocobqudocobeiga aadeig     'while they wanted to hit each other (SS) ...'



You could write a whole dictionary on "the forms of 'hit' in Amele". But that wouldn't be all that useful to the Amele people. However, you still have to make critical decisions about what should be a major entry, minor entry or subentry at an early stage of constructing the database; otherwise you have to do a lot of reconstructing when you change your mind later.



I think Amele is one of those languages that you can't do a complete dictionary for.



John Roberts

 

