[Lexicog] Digest Number 1050
Ronald Moe
ron_moe at SIL.ORG
Wed Nov 19 05:16:50 UTC 2008
Mike Maxwell wrote:
"what does it mean for one word to be a 'hub', rather than another?"
This is a metaphor used in a theory of semantic networks. In the theory each
lexeme is a "node" and each relationship is an "edge". You can picture a
node as a little circle and an edge as a line:
o--------o
Some lexemes are related to many other lexemes. So you can picture one of
these important lexemes and its many relationships as a wheel hub with many
spokes. There would be no rim to the wheel, just spokes.
o   o
 \ /
o-o-o
 / \
o   o
A better image might be a pin cushion with lots of pins sticking out of it
at all angles. Chemists use a similar metaphor to display molecules. Each
atom is a different color ball. The atoms are stuck together with sticks at
various angles to portray molecular bonds. For a pictorial example go to:
http://www.blc.arizona.edu/Molecular_Graphics/DNA_Structure/DNA_Tutorial.HTML
Let's imagine that 'armchair' is only related to 'chair', but that 'chair'
is related to 'high chair', 'sofa', 'furniture', and 'stool'. In this case
'armchair' would be a node with only a single link to 'chair':
armchair-----chair
But 'chair' would be a hub (and a node) with lots of spokes:
           furniture
               |
armchair-----chair----sofa
             /   \
        stool     high chair
It turns out that 'furniture' is also a hub:
        lamp       desk
            \     /
bookcase---furniture---table
               |
armchair-----chair----sofa
             /   \
        stool     high chair
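The furniture example above can also be written down as a small graph. What
follows is only a minimal Python sketch, not anything taken from the theory
itself. The edge list is copied from the diagram, while the names EDGES,
degree_counts and HUB_THRESHOLD, and the cutoff of four links, are assumptions
made purely for illustration.

```python
from collections import defaultdict

# Edges copied from the diagram above; each pair is one link ("spoke").
EDGES = [
    ("armchair", "chair"), ("chair", "sofa"), ("chair", "stool"),
    ("chair", "high chair"), ("chair", "furniture"),
    ("furniture", "lamp"), ("furniture", "desk"),
    ("furniture", "bookcase"), ("furniture", "table"),
]


def degree_counts(edges):
    """Count how many links touch each word (its degree)."""
    counts = defaultdict(int)
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


DEGREES = degree_counts(EDGES)

# Assumption: a word with at least this many links counts as a hub.
# The cutoff is arbitrary and only serves this toy example.
HUB_THRESHOLD = 4
HUBS = sorted(w for w, d in DEGREES.items() if d >= HUB_THRESHOLD)

print(HUBS)  # ['chair', 'furniture']
```

In this toy network 'chair' and 'furniture' each have five links, while
'armchair' and the other outer words have only one.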
I prefer the pin cushion metaphor. So you can think of the semantic network
as a big pile of pin cushions with some pins just sticking out and other
pins linking one pin cushion to another. A semantic domain would be a pin
cushion and all the pins sticking in it.
Now we can start examining the features of this pile of pin cushions. There
are about 2,000 pin cushions and 50,000 nodes. So on average each pin
cushion has 25 pins sticking in it. Why so many? Why not 10,000 pin cushions
and only five pins in each? Stacked as a block in three-dimensional space, a
pile of 2,000 pin cushions would measure about 13 by 13 by 12. Yet on average
any node is linked to any other node by a chain of only four or five pins. Why
not 12 or 13? Why so
dense? What sort of structure would result in an average of only four links
between any two words? How would you link up 2,000 pin cushions so that they
would all be tied together in such a dense mass?
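The arithmetic behind these figures can be checked with a short Python sketch.
It rests on an assumption that is mine and not part of the theory: that the
pin cushions sit on a 13 by 13 by 12 grid and are linked only to their
immediate neighbours, so the distance between two cushions is the number of
grid steps between them. The function names are hypothetical.

```python
NODES = 50_000
PIN_CUSHIONS = 2_000
GRID = (13, 13, 12)  # 13 * 13 * 12 = 2,028, roughly 2,000 cushions

# Average number of pins per pin cushion.
MEAN_PINS = NODES / PIN_CUSHIONS


def mean_axis_distance(n):
    """Mean |i - j| over all ordered pairs of positions 0..n-1 on one axis."""
    total = sum(abs(i - j) for i in range(n) for j in range(n))
    return total / (n * n)


def mean_grid_distance(dims):
    """Mean number of grid steps between two randomly chosen cells.

    Grid-step (Manhattan) distance is the sum of the per-axis distances,
    so the per-axis means simply add up.
    """
    return sum(mean_axis_distance(n) for n in dims)


GRID_MEAN = mean_grid_distance(GRID)

print(MEAN_PINS)            # 25.0
print(round(GRID_MEAN, 1))  # 12.6
```

Under that neighbour-only assumption the average separation comes out at
about 12.6 steps, which is one way to read the "12 or 13" figure. An observed
average of four or five links means the real network must contain many
shortcuts that such a grid lacks.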
One way to determine which words are hubs is to do a free association test.
In this test a person is given a word and asked to say the first word that
comes to mind. For instance the person is given the word 'cat' and the first
word that comes to mind is 'dog'. By testing lots of people with lots of
words the researcher can determine which words are associated with which
other words. It turns out that some words have many associations and others
very few. Those with many associations are the hubs (or pin cushions). (The
DDP word collection method is basically a free association exercise.)
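Tallying the results of such a test could look something like the Python
sketch below. The cue and response pairs are invented for illustration and do
not come from any real study. Treating an association as a two-way link is
also my assumption, as are the names RESPONSES and association_counts.

```python
from collections import defaultdict

# Hypothetical (cue, response) pairs, invented for illustration only.
RESPONSES = [
    ("cat", "dog"), ("cat", "mouse"), ("dog", "cat"), ("dog", "bone"),
    ("mouse", "cat"), ("kitten", "cat"), ("whisker", "cat"), ("bone", "dog"),
]


def association_counts(pairs):
    """Count the distinct words each word is associated with.

    Assumption: an association links both words, whichever one was the cue.
    """
    neighbours = defaultdict(set)
    for cue, response in pairs:
        neighbours[cue].add(response)
        neighbours[response].add(cue)
    return {word: len(linked) for word, linked in neighbours.items()}


COUNTS = association_counts(RESPONSES)

# Most-associated words first; ties are broken alphabetically.
RANKED = sorted(COUNTS.items(), key=lambda item: (-item[1], item[0]))

print(RANKED[0])  # ('cat', 4)
```

With this made-up data 'cat' has four distinct associates and would be the
hub, while 'kitten' and 'whisker' have only one each.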
Steyvers and Tenenbaum's article investigated the results of such a research
project and two other sources, WordNet and Roget's Thesaurus. Unfortunately the
article is mostly a very obscure and technical discussion of the statistical
properties of the semantic networks contained in these three works. My poor
brain could only glean a few insights and observations from it. The primary
conclusion they draw is that the mental semantic network is not random. It
has peculiar statistical properties.
Ron Moe
_____
From: lexicographylist at yahoogroups.com
[mailto:lexicographylist at yahoogroups.com] On Behalf Of Mike Maxwell
Sent: Monday, November 17, 2008 10:58 PM
To: lexicographylist at yahoogroups.com
Subject: Re: [Lexicog] Digest Number 1050
Ronald Moe wrote:
> Words tend to cluster around hubs. This is the basis of a semantic
> domain: a hub with a cluster of related words. As I've studied
> semantic domains, I've come to several conclusions. For instance I've
> known for a long time that words were not uniformly distributed, but
> tend to cluster.
> ...
> Another puzzling feature of the lexicon is that high-frequency,
> mono-morphemic lexemes tend to be hubs.
I'm trying to wrap my mind around this...what does it mean for one word
to be a 'hub', rather than another? I.e. how do we know which words are
hubs?
And what does it mean for words to cluster around hubs--that there is
some kind of semantic empty space between clusters, like the more or
less empty space between galaxies? The latter can be measured in light
years, and the average distance between stars in a galaxy is much less
than the average distance between galaxies (or between a star in one
galaxy and the nearest star in a neighboring galaxy). What is the
metric for measuring the space between word clusters, i.e. how do we
know that the clusters don't abut each other or even overlap?
--
Mike Maxwell
maxwell at ldc.upenn.edu