[Lexicog] Digest Number 1050
Ronald Moe
ron_moe at SIL.ORG
Tue Nov 18 00:33:22 UTC 2008
Bob Parks wrote:
"I'm interested in finding out which words are the most central nodes
in a dictionary."
I would be very interested in the results of your research. I can imagine a
way to produce a simple database that could be mined for central nodes. For
instance I know of software that could sort a database by the frequency of a
"node". If you had an electronic copy of Wordsmyth, it would probably take a
day or so to generate the database. Essentially you would have to break each
definition into separate fields, one word per field. Then you would reverse
the database to produce a list of the defining vocabulary. Then you would
sort the database by frequency, yielding a list of words used in definitions
with the most frequent ones first. We could keep the original headwords and
definitions so that you could look for multiple senses of the defining words
(e.g. 'a *kind* of shoe' versus 'to be *kind* and compassionate'). We could
do the same thing with synonyms and antonyms (and any other lexical
relations in the database). I've done this sort of thing before. None of it
is particularly hard.
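
The steps above (split each definition into words, reverse the database, sort by frequency) can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not the database software mentioned above: the sample entries, the stop-word list, and the function names defining_vocabulary and rank_by_frequency are all made up for the example. "Frequency" here means the number of headwords whose definition uses the word. The headwords are kept alongside each defining word so that different senses (such as the two senses of 'kind') can be checked by hand afterwards.

```python
import re
from collections import defaultdict

# Hypothetical sample entries; a real run would load the dictionary's own data.
ENTRIES = {
    "sandal": "a kind of shoe held on the foot by straps",
    "gentle": "kind and mild in manner",
    "boot": "a shoe that covers the foot and ankle",
}

# Illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "be", "in", "on", "by", "that"}


def defining_vocabulary(entries, stop_words=STOP_WORDS):
    """Reverse the dictionary: map each defining word to the headwords that use it."""
    index = defaultdict(set)
    for headword, definition in entries.items():
        # One word per "field": lowercase the definition and pull out word tokens.
        for word in re.findall(r"[a-z']+", definition.lower()):
            if word not in stop_words:
                index[word].add(headword)
    return index


def rank_by_frequency(index):
    """Sort defining words by how many headwords use them, most frequent first."""
    # Ties are broken alphabetically so the ordering is reproducible.
    return sorted(index.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    for word, headwords in rank_by_frequency(defining_vocabulary(ENTRIES)):
        print(len(headwords), word, sorted(headwords))
```

The same reversal would work for synonym or antonym fields by feeding in the list of related words in place of the tokenized definition.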
I just read Steyvers and Tenenbaum. I don't know anything about statistics,
so I didn't gain much from it. But the article raises a number of issues
that I am very interested in. Their semantic networks and central nodes are
very similar to semantic domains and other perspectives on semantics. I've
been trying to get a handle on why the mental lexicon tends to clump and
cluster around key words.
Ron Moe
_____
From: lexicographylist at yahoogroups.com
[mailto:lexicographylist at yahoogroups.com] On Behalf Of Bob Parks
Sent: Friday, November 14, 2008 12:52 PM
To: lexicographylist at yahoogroups.com
Subject: Re: [Lexicog] Digest Number 1050
Greetings:
I'm interested in finding out which words are the most central nodes
in a dictionary - i.e., the words that are most used in definitions
(aside from stop words); as well as those that have the largest
number of synonyms/antonyms, etc. The analysis in Steyvers and
Tenenbaum, "The Large-Scale Structure of Semantic Networks" seems to
indicate there is a "small worlds" network in WordNet, but they don't
give an actual list of the central node-words. I'm interested in
doing this sort of an analysis with the Wordsmyth
Dictionary-Thesaurus. Does anyone have any suggestions for analysis?
Would Ken Litkowski's "Di-graph" software discover this sort of a
network? Any advice/assistance is appreciated.
Bob Parks
--
* The best dictionary and integrated thesaurus on the web:
http://www.wordsmyth.net
* Robert Parks - Wordsmyth - (607) 272-2190
* "To imagine a language is to imagine a form of life." (LW)
* "Philosophers have only interpreted the world. The point, however,
is to change it." (KM)
* Community grows as we communicate, honing our words till their
meanings tap the rich voice of our full human potential.