Fw: [Lexicog] Discovering the lexicon via semantic domains

List Facilitator lexicography2004 at YAHOO.COM
Tue Jan 13 00:28:46 UTC 2004


----- Original Message -----
From: "List Facilitator" <lexicography2004 at yahoo.com>
To: <lexicographylist at yahoogroups.com>
Sent: Friday, January 09, 2004 2:17 PM
Subject: [Lexicog] Discovering the lexicon via semantic domains


> The DDP version 2 files that Ron refers to, below, are now in the DDP
> folder in the Files section of our list website (URL in the trailer to
> this message).
>
> Wayne Leman
> List Facilitator
>
>
> > For the past three years I have been developing a lexicography tool
> > that I call the Dictionary Development Program (DDP). At this point in
> > its development it is useful as a word collection tool. I use a list
> > of 1,750 semantic domains that I have compiled from numerous sources.
> > I have attempted to make the list as universal and exhaustive as
> > possible, but it needs more input from non-Indo-European languages. In
> > November I released version 2 of the DDP. I am currently on vacation
> > and leaving in an hour for a week. When I get back I would be happy to
> > send the materials to anyone interested. Send me an email and I'll
> > send you the materials; they are about 1 MB.
> >
> > Essentially the method uses semantic domains to prompt speakers of a
> > language to think of the words in their language that belong to each
> > domain. I've collected sample words from English, organized them into
> > lexical sets, and written an elicitation question for each lexical
> > set. An example from the domain 'Wind' would be:
> >
> > What words describe a wind that only lasts a short time? breath of
> > air, puff of wind, gust, blast, flurry
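> >
> > As a minimal sketch of the idea (Python; the structures and names are
> > illustrative, not the actual DDP file format), each domain can be seen
> > as a bundle of lexical sets, each carrying one elicitation question
> > and its prompt words:
> >
> > from dataclasses import dataclass, field
> >
> > @dataclass
> > class LexicalSet:
> >     question: str    # elicitation question put to the speakers
> >     examples: list   # sample English words that prompt recall
> >
> > @dataclass
> > class Domain:
> >     name: str
> >     sets: list = field(default_factory=list)
> >
> > wind = Domain("Wind", [
> >     LexicalSet(
> >         question="What words describe a wind that only lasts a short time?",
> >         examples=["breath of air", "puff of wind", "gust", "blast",
> >                   "flurry"],
> >     ),
> > ])
> >
> > # Each answer is recorded against the domain that prompted it, so
> > # every collected word arrives already classified.
> > responses = {}    # domain name -> words collected under it
> > for ls in wind.sets:
> >     print(ls.question, "e.g.:", ", ".join(ls.examples))
> >     words = ["<speaker's words here>"]    # input().split(",") live
> >     responses.setdefault(wind.name, []).extend(w.strip() for w in words)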
> >
> > In a ten-day workshop, about 30 speakers of Lunyole (Bantu, Uganda)
> > collected a total of 17,000 lexical items that boiled down to 12,000
> > unique words and phrases. The extra 5,000 were duplicates that showed
> > up in more than one domain and often represent multiple senses. The
> > method utilizes the mental network in each person's head. With a
> > little practice it is possible to think of words almost as fast as you
> > can write them down. This is far more efficient than using the text
> > corpus method or any other method I have heard of.
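> >
> > A sketch of that duplicate-merging step (Python; toy data and field
> > names, not the DDP's actual routine): the items collected per domain
> > are folded into one record per unique form, keeping every domain the
> > form appeared in as a candidate sense.
> >
> > from collections import defaultdict
> >
> > # (domain, word) pairs as they come out of a collection session
> > collected = [
> >     ("Wind", "blast"),
> >     ("Fire", "blast"),    # same form in two domains: likely two senses
> >     ("Wind", "breeze"),
> > ]
> >
> > senses = defaultdict(list)    # word -> domains it was collected under
> > for domain, word in collected:
> >     if domain not in senses[word]:
> >         senses[word].append(domain)
> >
> > unique_count = len(senses)    # 12,000 in the Lunyole case
> > multi_sense = {w: d for w, d in senses.items() if len(d) > 1}
> >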
> > I also have the workshop participants gloss the words in the national
> > language. Since the words are collected by domain, you end up with a
> > classified, glossed word list and a 1,750-entry thesaurus. Once you
> > collect the words, you can use automated routines to expand the word
> > list into a basic dictionary.
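> >
> > One such routine might look like this (Python; a hedged illustration
> > of the kind of expansion meant here, not the DDP's actual code): sort
> > the (word, gloss, domain) triples alphabetically and merge duplicate
> > forms into entries with numbered senses.
> >
> > from itertools import groupby
> >
> > rows = [
> >     ("blast", "strong gust of wind", "Wind"),
> >     ("blast", "explosion", "Fire"),
> >     ("breeze", "light wind", "Wind"),
> > ]
> >
> > rows.sort(key=lambda r: r[0])
> > for word, group in groupby(rows, key=lambda r: r[0]):
> >     senses = list(group)
> >     entry = word
> >     for i, (_, gloss, domain) in enumerate(senses, 1):
> >         label = f"{i}) " if len(senses) > 1 else ""
> >         entry += f" {label}{gloss} [{domain}]."
> >     print(entry)
> > # blast 1) strong gust of wind [Wind]. 2) explosion [Fire].
> > # breeze light wind [Wind].
> >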
> > I'm beginning to work on materials to help speakers of a language
> > define the words in each domain. My goal is to produce a tool that is
> > as easy and efficient as possible, so that speakers of a language with
> > little or no training in lexicography can produce a reasonably good
> > dictionary of massive proportions. Lexicographers estimate that there
> > are 23,000 unique words in most languages, with perhaps 50,000 lexical
> > items including multiple senses and phrases. So even with my method we
> > are only collecting about half the words. I hope to refine the method
> > and increase this percentage. A dictionary of 3,000-4,000 entries is
> > rather pitiful.
> > (Pardon me for saying so.) The text corpus method is advantageous in
> > many ways, but it is useless in a language for which there are no
> > texts. Even when you have some texts, setting up the parser and
> > manually adding entries to your database results in a small dictionary
> > that is uneven in its depth and breadth of treatment. Even automating
> > some sort of concordance program yields only a simple list of words
> > with no glosses and no semantic classification. I prefer to collect
> > the words all at once and use automated routines to develop the word
> > list. Then, as time and opportunity allow, I use the text corpus
> > method to collect natural examples of usage. Many lexicographers
> > recommend investigating semantics within the context of semantic
> > domains.
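> >
> > As a small sketch of that follow-up step (Python; the corpus and word
> > list are toy data), the domain-collected word list can drive the
> > corpus search, so each hit attaches to an already glossed and
> > classified entry instead of a bare concordance line:
> >
> > import re
> >
> > corpus = ["A sudden blast rattled the door.",
> >           "The breeze died at dusk."]
> > wordlist = {"blast", "breeze"}    # from the domain-based collection
> >
> > examples = {w: [] for w in wordlist}
> > for sentence in corpus:
> >     for token in re.findall(r"[a-z']+", sentence.lower()):
> >         if token in examples:
> >             examples[token].append(sentence)
> > # examples["blast"] == ["A sudden blast rattled the door."]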
> >
> > Ron Moe
>
> Yahoo! Groups Links
>
> To visit your group on the web, go to:
>  http://groups.yahoo.com/group/lexicographylist/
>
> To unsubscribe from this group, send an email to:
>  lexicographylist-unsubscribe at yahoogroups.com
>
> Your use of Yahoo! Groups is subject to:
>  http://docs.yahoo.com/info/terms/
>
>


