[Lexicog] Collaborative lexicography software?

maxwell at LDC.UPENN.EDU maxwell at LDC.UPENN.EDU
Wed Apr 30 20:18:52 UTC 2008


Quoting Emmanuel HABUMUREMYI <emmahab at yahoo.fr>:
> You are true Ron. WeSay cannot be compared to Fieldworks or Toolbox. 
> ...Some issues I can consider as shortcomings are:

I have recently been involved in an annotation task, in which the 
annotator (a native speaker of a language) was asked to provide the 
dictionary citation form for each word in the text, based on a printed 
dictionary.  The language has a mild amount of agglutinating 
morphology, and in some cases there is stem allomorphy.

The language has been written for hundreds of years, people use it in 
writing (there are newspapers, magazines, web sites...), the dictionary 
is in the third edition (meaning that at least someone must find it 
useful and usable!), and our current annotator has a bachelor's degree 
(although college in the country where this language is spoken is 
mostly taught in English).

Despite all this, our annotator has had a very hard time producing the 
annotation, and has made lots of mistakes, such as finding the wrong 
citation form (either a citation form which is phonologically similar, 
or a form which belongs to the wrong part of speech).  Based on this 
(admittedly limited) experience, it seems to me that coming up with a 
citation form for a particular word in an agglutinating language is a 
difficult task for someone without linguistic training.  So the thought 
of having such people come up with words to go into a dictionary fills 
me with a bit of trepidation; I picture a dictionary with separate 
entries for 'walk', 'walked', 'walking', 'walks' and so forth.

I have heard similar anecdotes about the difficulty native speakers 
of Arabic have finding citation forms in dictionaries (whether the 
dictionary is a root dictionary or a stem dictionary, although the 
latter is said to be easier).  And I understand that there is (was?) a 
college-level, semester-long course for native speakers of Diné (Navajo) 
teaching how to use the Young-Morgan Navajo dictionary, which tells me 
that Athabaskan morphology is very hard.  (I didn't need that fact to 
tell me, though...)

So--I would be interested in hearing success (or failure) stories 
about building dictionaries for languages with complex morphologies, 
particularly where native speakers without linguistic training are 
collecting the vocabulary.  And in particular about how well a simple 
method like WeSay works in practice.

   Mike Maxwell
   CASL/ U MD
