Wordnet
Chaumont Devin
devil at lava.net
Fri Feb 4 17:32:04 UTC 2000
Dear Jean-Paul,
Before saying anything else, I would like to thank you for your very
imaginative and fanciful wordnet, which almost got me right out of my
Honolulu apartment and back into the SE Asian jungle (beam me over, dear
Scotty)!
POTETJP at wanadoo.fr wrote:
>Now, as regards your wordnet, I must confess I don't see very well what
>it exactly is.
Unfortunately, there would seem to be no easy explanation.
The name, WordNet, is not mine, but the property of Princeton University.
I have a somewhat more sophisticated product called SEMLEX, which does the
things WordNet does and more.
Some of the inadequacies of traditional dictionaries might be summed up as
follows:
1. They may completely ignore various kinds of important semantic
relationships.
2. Their approach to semantic relationships is gung-ho, haphazard, and
reliably unreliable. As an example, here is the American Heritage
Dictionary definition for "capybara":
capybara, n. A large, short-tailed semiaquatic rodent, Hydrochoerus
hydrochaeris, of tropical South America, often attaining a length of four
feet.
This kind of essay definition may carry a lot of semantic information, but
in an imprecise and nonsystematic manner. The following is what a more
organized definition might look like:
capybara isa rodent
* are large
* are short-tailed
* are semiaquatic
* syn Hydrochoerus hydrochaeris
* are tropical
* hab South America
These kinds of information can be represented conveniently in semantic
networks in very precise and systematic ways. The computational advantage
gained by this kind of representation is that no parsing or interpretation
is required for its use--in other words, it is immediately available for
processing by automated systems. Linguists want this because it lets them
get at the information easily for various kinds of linguistic analyses,
and computer software engineers want it because it gives them the
information needed for parsing particular languages and for text
generation. In both cases, of course, I mean access by automated means.
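
To make the point about "no parsing or interpretation" concrete, here is a
minimal sketch in Python of how the capybara entry above might be held as
labeled links. The relation names (isa, are, syn, hab) come from the example
itself; the class name SemanticNetwork and its add/query methods are
hypothetical, chosen only for illustration, and are not how SEMLEX or
WordNet actually store their data.

```python
from collections import defaultdict


class SemanticNetwork:
    """Hypothetical store of (subject, relation, object) links."""

    def __init__(self):
        # subject -> relation -> set of linked objects
        self._links = defaultdict(lambda: defaultdict(set))

    def add(self, subject, relation, obj):
        """Record one labeled link from subject to obj."""
        self._links[subject][relation].add(obj)

    def query(self, subject, relation):
        """Return the sorted objects linked to subject by relation."""
        return sorted(self._links.get(subject, {}).get(relation, ()))


# The organized capybara definition, entered link by link.
net = SemanticNetwork()
for relation, obj in [
    ("isa", "rodent"),
    ("are", "large"),
    ("are", "short-tailed"),
    ("are", "semiaquatic"),
    ("syn", "Hydrochoerus hydrochaeris"),
    ("are", "tropical"),
    ("hab", "South America"),
]:
    net.add("capybara", relation, obj)

# No text has to be parsed to answer a question about the entry.
print(net.query("capybara", "are"))
print(net.query("capybara", "hab"))
```

Each question is answered by a direct lookup, which is the advantage
claimed above over the essay-style dictionary definition.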
Notice that dictionaries also carry various kinds of information that
CANNOT be handled by semantic networks. For example, "often attaining a
length of four feet". Computers can also store this kind of information,
but in a different way, which space does not permit me to explain here.
The important thing is that ontologies (semantic networks) are a very
powerful means for getting at the guts of language using minimal
resources, and they will be indispensable for machine translation, which
would seem to be impossible without them. And because of their more
rigorous precision, they are of obvious value for the preservation of
dying languages. A key consideration is the fact that a good semantic
network, besides just returning information about linkages between words,
can also record the way the user of a language views the world within
his/her language and culture.
There are, of course, many other things that might be said about this
subject, but these are the general outlines. If you are interested in
further information about ontologies and the representation of knowledge
within automated systems, I have a manual about the subject available free
of charge. It is the result of something over five years of private
research, and describes in fairly detailed fashion how ontologies will be
used later in automated translation between natural languages.
With best regards from Honolulu,
Chaumont Devin.