[Lexicog] Digest Number 1050

Ronald Moe ron_moe at SIL.ORG
Tue Nov 18 00:33:22 UTC 2008


Bob Parks wrote:

"I'm interested in finding out which words are the most central nodes 
in a dictionary."

I would be very interested in the results of your research. I can imagine a
way to produce a simple database that could be mined for central nodes. For
instance, I know of software that could sort a database by the frequency of a
"node". If you had an electronic copy of Wordsmyth, it would probably take a
day or so to generate the database. Essentially, you would break each
definition into separate fields, one word per field. Then you would reverse
the database to produce a list of the defining vocabulary. Finally, you would
sort the database by frequency, yielding a list of words used in definitions
with the most frequent ones first. We could keep the original headwords and
definitions so that you could look for multiple senses of the defining words
(e.g. 'a *kind* of shoe' versus 'to be *kind* and compassionate'). We could
do the same thing with synonyms and antonyms (and any other lexical
relations in the database). I've done this sort of thing before. None of it
is particularly hard.
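
The steps above can be sketched in a few lines of Python. This is only a
rough illustration, not the software mentioned above: the sample entries,
the tiny stop list, and the function names are all made up for the example,
and nothing here comes from Wordsmyth. It assumes each entry is a headword
paired with one definition, and it counts a defining word once per headword
rather than once per occurrence.

```python
import re
from collections import defaultdict

# Hypothetical sample entries (headword -> definition); not Wordsmyth data.
ENTRIES = {
    "sandal": "a kind of open shoe",
    "boot": "a kind of shoe that covers the ankle",
    "gentle": "kind and mild in manner",
    "slipper": "a soft shoe worn indoors",
}

# A tiny illustrative stop list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "in", "of", "that", "the", "to"}


def tokenize(definition):
    """Lowercase a definition and split it into words."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", definition.lower())


def build_reverse_index(entries, stop_words=STOP_WORDS):
    """Map each defining word to the headwords whose definitions use it."""
    index = defaultdict(list)
    for headword, definition in entries.items():
        # set() so a word repeated inside one definition is counted once.
        for word in set(tokenize(definition)) - stop_words:
            index[word].append(headword)
    return index


def rank_defining_words(index):
    """Return (word, count) pairs, most widely used defining words first."""
    counts = [(word, len(headwords)) for word, headwords in index.items()]
    # Sort by descending count, then alphabetically to break ties.
    return sorted(counts, key=lambda item: (-item[1], item[0]))


index = build_reverse_index(ENTRIES)
for word, count in rank_defining_words(index)[:3]:
    # Keeping the headwords makes it possible to go back to the original
    # definitions and separate senses such as the two uses of "kind".
    print(word, count, sorted(index[word]))
```

Because the reversed index keeps the headwords, the entries behind a frequent
defining word can be pulled up and sorted by sense by hand. The same reversal
would presumably work for synonym and antonym fields by swapping the
definition text for the list of related words.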

I just read Steyvers and Tenenbaum. I don't know anything about statistics,
so I didn't gain much from it. But the article raises a number of issues
that I am very interested in. Their semantic networks and central nodes are
very similar to semantic domains and other perspectives on semantics. I've
been trying to get a handle on why the mental lexicon tends to clump and
cluster around key words.

Ron Moe

  _____  

From: lexicographylist at yahoogroups.com
[mailto:lexicographylist at yahoogroups.com] On Behalf Of Bob Parks
Sent: Friday, November 14, 2008 12:52 PM
To: lexicographylist at yahoogroups.com
Subject: Re: [Lexicog] Digest Number 1050

Greetings:
I'm interested in finding out which words are the most central nodes 
in a dictionary - i.e., the words that are most used in definitions 
(aside from stop words); as well as those that have the largest 
number of synonyms/antonyms, etc. The analysis in Steyvers and 
Tenenbaum, "The Large-Scale Structure of Semantic Networks" seems to 
indicate there is a "small worlds" network in WordNet, but they don't 
give an actual list of the central node-words. I'm interested in 
doing this sort of an analysis with the Wordsmyth 
Dictionary-Thesaurus. Does anyone have any suggestions for analysis? 
Would Ken Litkowski's "Di-graph" software discover this sort of a 
network? Any advice/assistance is appreciated.
Bob Parks
-- 
* The best dictionary and integrated thesaurus on the web: 
http://www.wordsmyth.net
* Robert Parks - Wordsmyth - (607) 272-2190
* "To imagine a language is to imagine a form of life." (LW)
* "Philosophers have only interpreted the world. The point, however, 
is to change it." (KM)
* Community grows as we communicate, honing our words till their 
meanings tap the rich voice of our full human potential.

