[Lexicog] Database structure

Kenneth C. Hill kennethchill at YAHOO.COM
Thu May 19 21:58:01 UTC 2005


At the beginning of the Hopi Dictionary Project in 1985 I had to come up
with a database structure that could remain stable as various parts of it
were being worked on at different sites. In retrospect I'm rather
surprised --having been exposed now for some time to the Lexicography
list-- that I was able to establish a structure that served us so well.
(The Hopi Dictionary/Hopìikwa Lavàytutuveni was published in 1998 by the
University of Arizona Press.) I have continued working with our database,
introducing corrections, additions, etc., with an eye to a second edition,
and have modified the database structure in small details.

I work with Notebook II, a DOS program by Pro/Tem Software (now defunct),
and I have found that many of the useful properties of Notebook II are
available in FileMaker Pro. Neither Notebook II nor FileMaker Pro is a
program that automates lexicography. You have to think for yourself. Each
field can be printed in any order with respect to other fields and can be
printed in its own typeface. It can also be marked in various other ways,
such as being in parentheses or being preceded by a label. Font changes
within a field have to be specified by the user (i.e., the lexicographer).
Only a few typographical details are mentioned below.
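
For readers who think in code, here is a minimal sketch of the per-field
print settings just described: each field gets its own typeface, may be put
in parentheses, and may be preceded by a label, and the print order is
independent of the storage order. This is not how Notebook II or FileMaker
Pro work internally; the names FieldStyle and render, the angle-bracket
markup, and the sample values are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class FieldStyle:
    """Hypothetical print settings for one database field."""
    typeface: str                # e.g. "bold", "italic", "roman", "small"
    parenthesized: bool = False  # print the field inside parentheses
    label: str = ""              # text printed before the field, if any


def render(record, layout):
    """Render the non-empty fields of one record in the order given by layout.

    record is a dict of field name -> text; layout is a list of
    (field name, FieldStyle) pairs. Typefaces are shown as made-up
    angle-bracket tags purely for illustration.
    """
    parts = []
    for name, style in layout:
        value = record.get(name, "")
        if not value:
            continue  # empty fields are simply skipped
        text = f"<{style.typeface}>{value}</{style.typeface}>"
        if style.parenthesized:
            text = f"({text})"
        if style.label:
            text = f"{style.label} {text}"
        parts.append(text)
    return " ".join(parts)


# Placeholder data, not real dictionary content.
layout = [
    ("headword", FieldStyle("bold")),
    ("usage", FieldStyle("roman", parenthesized=True)),
    ("definition", FieldStyle("roman")),
    ("loan_source", FieldStyle("small", label="From")),
]
record = {"headword": "HEADWORD", "usage": "ritual", "definition": "gloss"}
print(render(record, layout))
```

Because the layout is a separate list, the same record can be printed with
the fields in a different order, or with some fields left out, without
touching the stored data.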

The Hopi Dictionary database contains the following 21 fields:

1. Headword (large boldface)
2. Alphabetizer (this is the "ordering handle" and was not printed. In
electronic versions of the dictionary it should be included.)
3. Form class (= "part of speech"; italics)
4. Definition (normal type. For multiple-definition entries, this contains
only definition 1.)
5. Word division (the headword broken by hyphens; small italics)
6. Morphology (labels for each item in Field 5 separated by corresponding
hyphens; small roman)
7. Underlying form (an invariant phonological specification of each
morpheme of the item, written with /(space) at the beginning and (space)/
at the end. When morphemes cluster such that two items, X Y, correspond to
one item in Fields 5 and 6, these are represented < X Y >. In the
published dictionary this field was not printed, but now that I have
developed it further I think it should be made available to the user.)
8. Inflected forms (boldface. For Hopi the inflected forms are mainly
plurals and case forms.)
9. Combining forms (boldface. Allomorphs of the headword.)
10. Pausal (boldface. This is a Hopi-specific variant form of the
headword.)
11. Cross-reference (normal type)
12. Examples (normal type. This field also includes definitions other than
the first, with their respective illustrative example sentences.)
13. Intensive form (small type. In effect a cross-reference field to
entries with a particular sort of derivation in Hopi.)
14. Compounds (small type. A list of all the items in the dictionary with
the headword or one of its allomorphs as the last part.)
15. Usage comments (normal type, printed in parentheses before the
definition, Field 4. "Usage" includes masculine speaker, feminine speaker,
babytalk, ritual register, botanical, astronomical, etc.)
16. Loanword source (small)
17. Semantic domain (not printed. I use this to generate specialized
wordlists for interested parties. This is a flat field; I have no way of
specifying hierarchical things as per earlier discussions in the
Lexicography list.)
18. English (not printed. We used this field to keep track of what we were
to put in the English-Hopi finder part of the dictionary. There was no
mechanical way of generating the English-Hopi list from the Hopi entries
themselves.)
19. Scientific name (not printed. When this was of importance, it was
included in the definition --Fields 4 and 12.)
20. Editorial comments (not printed. A very useful field even if only one
person is working with the database.)
21. Edited by (not printed. A record of who worked on a record and when.)
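
As a rough illustration only, the 21 fields above could be modelled as one
flat record along the following lines. The attribute names are my own
renderings of the field names, and the two helper functions are hypothetical:
wordlist shows the kind of specialized list Field 17 makes possible, sorted
by the Field 2 ordering handle, and division_matches_morphology checks that
Fields 5 and 6 have the same number of hyphen-separated parts. Both assume
details the list does not spell out (one semantic domain per entry, and no
hyphens inside a single morphology label).

```python
from dataclasses import dataclass


@dataclass
class Entry:
    headword: str                 # 1
    alphabetizer: str             # 2  ordering handle, not printed
    form_class: str = ""          # 3
    definition: str = ""          # 4  definition 1 only
    word_division: str = ""       # 5  headword broken by hyphens
    morphology: str = ""          # 6  labels matching Field 5
    underlying_form: str = ""     # 7
    inflected_forms: str = ""     # 8
    combining_forms: str = ""     # 9
    pausal: str = ""              # 10
    cross_reference: str = ""     # 11
    examples: str = ""            # 12 also holds definitions after the first
    intensive_form: str = ""      # 13
    compounds: str = ""           # 14
    usage_comments: str = ""      # 15
    loanword_source: str = ""     # 16
    semantic_domain: str = ""     # 17 flat, not hierarchical
    english: str = ""             # 18
    scientific_name: str = ""     # 19
    editorial_comments: str = ""  # 20
    edited_by: str = ""           # 21


def wordlist(entries, domain):
    """Headwords in one semantic domain, ordered by the alphabetizer field."""
    selected = [e for e in entries if e.semantic_domain == domain]
    return [e.headword for e in sorted(selected, key=lambda e: e.alphabetizer)]


def division_matches_morphology(entry):
    """True if Fields 5 and 6 have the same number of hyphen-separated parts."""
    return len(entry.word_division.split("-")) == len(entry.morphology.split("-"))


# Placeholder entries, not real dictionary content.
entries = [
    Entry("hw2", "b", semantic_domain="plants"),
    Entry("hw1", "a", semantic_domain="plants"),
    Entry("hw3", "c", semantic_domain="stars"),
]
print(wordlist(entries, "plants"))
```

Keeping the alphabetizer separate from the headword means the sort order
never depends on how the headword happens to be spelled or typeset.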

I hope the above will be useful to some List reader.

I offer the above in blissful semi-ignorance of Toolbox, Shoebox, and
perhaps other lexicographical software that might have sent the Hopi
Dictionary Project in different directions. When I was at the
Research Centre for Linguistic Typology in late 2000, several colleagues
there were trying to get into Shoebox. The major problem seemed to be font
definitions. That was not a problem for us working on the Hopi Dictionary.
Font problems for us were separate from the problem of compiling the
information for a dictionary.

--Ken

--- Mike Maxwell <maxwell at ldc.upenn.edu> wrote:
> Claire Bowern wrote:
> > Coming up with a good database structure early on is really important
> > (I've learnt this the hard way, several times actually!)
>
> I can't speak for others on this list, but personally I'd like to hear
> more about that.  When I worked on SIL's LinguaLinks (and later on the
> model for FieldWorks), one thing we noticed was that we had considerable
> trouble coming up with a good data model.



		