[Lexicog] database structure

John Roberts dr_john_roberts at SIL.ORG
Thu May 19 10:17:07 UTC 2005


>> Claire Bowern wrote:
>>> Coming up with a good database structure early on is really important
>>> (I've learnt this the hard way, several times actually!)
>
>> Mike Maxwell wrote:
>> I can't speak for others on this list, but personally I'd like to hear
>> more about that.
>
> John Roberts wrote:
>> Well, one of the main issues in structuring your database is whether you
>> decide to have a form-based dictionary or a meaning-based dictionary.
>
> Ron Moe wrote:
> When you structure a database, you have to decide how these kinds of
> words are to be related. In FieldWorks we have a list of "wordforms"--a
> list of every word that occurs in our text corpus. We also have a list of
> lexical entries in the dictionary. We then have to decide how to link a
> word in the list of wordforms to an entry in the dictionary. <snip>
> ... but there are lots and lots of different kinds of relationships
> between words. Trying to design a program to handle every possible kind
> becomes rather daunting.

Yes, the issue in database structure is how to organise the networks of
relationships between word-forms, and beyond that, how to make it possible
to select one network of relationships over another. In a program like
FieldWorks I assume you could set up links from the word-form *went* to a
paradigm field, a subentry, a minor entry and a major entry. In other
words, you could enter all the possible arrangements. But when you came to
"produce" the dictionary in, say, a published form, you certainly wouldn't
want to output both the minor entry and the major entry for *went*. You
would need to have them linked in the database in some way so that at the
output stage you can choose either the minor entry or the major entry for
that word-form.
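
To make that concrete, here is a minimal Python sketch of one way it might
be done. It is not how FieldWorks stores anything; the names Role, Link,
WordForm and render are all hypothetical, and the only linguistic fact
assumed is that *went* belongs to *go*. Every possible arrangement is
stored as a link, and the choice between them is put off until output.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Role(Enum):
    PARADIGM = "paradigm"
    SUBENTRY = "subentry"
    MINOR = "minor entry"
    MAJOR = "major entry"


@dataclass
class Link:
    role: Role
    target: str  # headword the link points to ("" for a standalone entry)


@dataclass
class WordForm:
    form: str
    links: List[Link] = field(default_factory=list)

    def render(self, preferred: List[Role]) -> Optional[Link]:
        """Return the one link to publish, trying roles in preference order."""
        for role in preferred:
            for link in self.links:
                if link.role is role:
                    return link
        return None  # no stored link has any of the preferred roles


# All arrangements are entered once in the database ...
went = WordForm("went", [
    Link(Role.PARADIGM, "go"),
    Link(Role.SUBENTRY, "go"),
    Link(Role.MINOR, "go"),
    Link(Role.MAJOR, ""),
])

# ... and only one of them is selected when the dictionary is produced.
chosen = went.render([Role.MINOR, Role.MAJOR])
print(chosen.role.value, "->", chosen.target)
```

The point of the preference list is that the same database could give a
different published dictionary simply by passing a different order.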

With a derivative like *houseboat* you could enter this in the database as
both a major entry and a subentry under *house* and a subentry under *boat*.
But the problem here is that you would need to enter two different kinds of
subentry in each case. If you have *houseboat* as a major entry then in the
subentry under *boat* this subentry would function as illustrating a type of
boat and all you would need is a cross-reference - see *houseboat*. But if
you don't have a major entry for *houseboat* in the dictionary then you
would have to put all the relevant lexical information under the subentry
for *houseboat*. Then you have to decide whether to put all the information
under both the subentry under *boat* and the subentry under *house*. For a
form-based dictionary like CHAMBERS all this information goes under the
subentry under *house* and you don't have a subentry under *boat*. But for a
meaning-based dictionary you would want to fill out the subentry under
*boat* as a type of boat - although a meaning-based dictionary would more
likely have a major entry for *houseboat*. You would then have to link all
these various major and minor entries and subentries so that only one
consistent network of relationships is produced.
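
The *houseboat* choices could be sketched roughly as follows, again in
Python and again only as an illustration. The function layout and the
Policy values are hypothetical names. The form-based branch follows the
CHAMBERS arrangement described above; the assumption that a meaning-based
layout puts nothing under the first element is mine and could well be set
up differently.

```python
from enum import Enum


class Policy(Enum):
    FORM_BASED = "form-based"
    MEANING_BASED = "meaning-based"


def layout(parts, policy, has_major_entry):
    """Say where a compound's lexical information goes.

    Returns a dict mapping "major" and each component headword to
    "full" (all lexical information), "xref" (see the major entry)
    or "omit" (no entry there).
    """
    first, head = parts[0], parts[-1]
    # If a major entry exists it carries the information, and any
    # subentry shrinks to a cross-reference.
    sub = "xref" if has_major_entry else "full"
    plan = {"major": "full" if has_major_entry else "omit"}
    if policy is Policy.FORM_BASED:
        # Filed under the first element (*house*), nothing under *boat*.
        plan[first], plan[head] = sub, "omit"
    else:
        # Assumption: a meaning-based layout files it under the head only.
        plan[first], plan[head] = "omit", sub
    return plan


for policy in Policy:
    for has_major in (False, True):
        plan = layout(["house", "boat"], policy, has_major)
        print(policy.value, has_major, plan)
```

Whatever the policy, the full lexical information comes out in exactly one
place, which is the "one consistent network" requirement in miniature.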

And this would only apply to one sense of *house*, for example. Under
*house* 'building in which people live ...' you would have subentries for
*boarding house*, *council house*, etc. and under *house* 'building used for
a special purpose ...' you would have subentries for *chapter house*,
*clearing house*, *slaughterhouse*, etc. Under *boat* you might have
*boathouse* and so on. All these major or minor entries would have to be
linked to the relevant subentries so that only one consistent network of
relationships is produced in each case. And then you would want the
networks you choose to be consistent with one another as well, e.g. all
form-based or all meaning-based.

Then how would you apply the idea of entering a major entry, a minor entry,
a subentry, and a paradigm link for every word-form to a language like Amele
(Papuan) where any given verb can have hundreds of thousands of possible
word-forms depending on the combinations of inflectional and derivational
categories?
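
A rough Python sketch of why per-word-form entry does not scale: the number
of forms is the product of the sizes of the categories that combine. The
slot names and values below are invented purely for illustration and are
NOT an analysis of Amele; count_forms and iter_forms are hypothetical
helpers.

```python
from itertools import product
from math import prod

# Invented slot inventories, for illustration only (not Amele data).
SLOTS = {
    "object": ["o1", "o2", "o3"],
    "tense": ["t1", "t2"],
    "subject": ["s1", "s2", "s3", "s4"],
}


def count_forms(slots):
    """Number of word-forms per stem if every slot combines freely."""
    return prod(len(values) for values in slots.values())


def iter_forms(stem, slots):
    """Generate the forms on demand instead of storing one record each."""
    for combo in product(*slots.values()):
        yield "-".join((stem, *combo))


print(count_forms(SLOTS))
print(next(iter_forms("stem", SLOTS)))
```

With three small slots the count is already 24 per stem, and each extra
category multiplies it again. Generating the forms from one entry per verb
is one alternative to storing four links for every word-form.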

Ron Moe said this is a daunting task. I would agree.

John Roberts



