15.2475, Diss: Comp Ling/Semantics: Old: 'The Semantic...'

LINGUIST List linguist at linguistlist.org
Tue Sep 7 20:05:09 UTC 2004


LINGUIST List:  Vol-15-2475. Tue Sep 7 2004. ISSN: 1068-4875.

Subject: 15.2475, Diss: Comp Ling/Semantics: Old: 'The Semantic...'

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Sheila Collberg, U. of Arizona
	Terence Langendoen, U. of Arizona

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Takako Matsui <tako at linguistlist.org>
 ==========================================================================
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.
=================================Directory=================================

1)
Date:  Mon, 6 Sep 2004 13:42:09 -0400 (EDT)
From:  leonard_old at hotmail.com
Subject:  The Semantic Structure of Roget's, A Whole-Language Thesaurus

-------------------------------- Message 1 -------------------------------

Date:  Mon, 6 Sep 2004 13:42:09 -0400 (EDT)
From:  leonard_old at hotmail.com
Subject:  The Semantic Structure of Roget's, A Whole-Language Thesaurus

Institution: Indiana University
Program: School of Library and Information Science
Dissertation Status: Completed
Degree Date: 2003

Author: Leonard J Old

Dissertation Title: The Semantic Structure of Roget's, A
Whole-Language Thesaurus

Dissertation URL:
http://www.dcs.napier.ac.uk/~cs171/LJOld/publications_l_john_old.htm

Linguistic Field: Computational Linguistics, Semantics, Lexicography,
Cognitive Science

Dissertation Director 1: Charles H Davis
Dissertation Director 2: Ralf Shaw

Dissertation Abstract:

This study analyzed a database version of Roget's Thesaurus (Roget's
International Thesaurus, 3rd Edition, 1962) for frequency and
connectivity patterns among the words, senses, and cross-references in
order to identify the implicit conceptual structure. Descriptive
statistics, lattices, and information maps were used to identify
semantic patterns implicit in the data at both the local and global
levels of the structure.

The explicit organizational structure of the thesaurus is, at the
local level, sets of synonyms and, at the global level, a hierarchy of
concepts. In contrast, the implicit organization at the local level
has the characteristics of dictionary sense definitions (genus and
differentiae), and at the global level has the characteristics of a
small-world social network. The concept of genus and differentiae
provides a model that can be seen to account for the distribution of
polysemy within senses and across the Thesaurus. The small-world
network model can be seen to account for the incidence of semantic
hubs and authorities among cross-references, and conceptual and
semantic switching centers among senses and words in the Thesaurus.

Previous work on Roget's Thesaurus calculated chains and equivalence
relations algorithmically from senses and words. That research found
an inner semantic core of most-densely-connected words and senses.
This study expanded on that research by identifying the semantic
structure of the inner core and relating it to the most polysemous
words in Roget's.
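
As a rough illustration of how words can be grouped through shared
senses, the sketch below links any two words that appear under the same
sense and collects the resulting connected groups. The toy data, the
sense ids, and the function name word_components are hypothetical; this
is not the algorithm used in the dissertation or in the earlier
research, only a minimal sketch of the general idea.

```python
from collections import defaultdict

# Hypothetical toy data: sense id -> words listed under that sense.
SENSES = {
    "s1": {"run", "flow"},
    "s2": {"run", "operate"},
    "s3": {"operate", "work"},
    "s4": {"leaf", "frond"},
}


def word_components(senses):
    """Group words into classes connected through shared senses."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for words in senses.values():
        ordered = sorted(words)
        for w in ordered:
            find(w)  # register every word, including singletons
        for w in ordered[1:]:
            union(ordered[0], w)

    groups = defaultdict(set)
    for w in list(parent):
        groups[find(w)].add(w)
    # Largest group first; each group is an alphabetical list of words.
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)


components = word_components(SENSES)
```

In this toy example the largest group plays the role of a densely
connected core, while "leaf" and "frond" form a separate, smaller group.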

While the largest thesaurus Categories relate to concrete objects such
as plants, animals, food, clothing and technology, the most-connected
words (in terms of numbers of senses and synonyms) were found to
relate to abstract concepts such as motion, agitation and what appear
to be concepts related to survival. This observation was supported by
frequency counts, and global cross-reference and word connectivity
patterns.
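
The contrast between category size and word connectivity can be
sketched with simple counts. The example below assumes a hypothetical
toy table keyed by (category, sense id); the data and the names
connectivity and category_sizes are illustrative only and do not come
from the dissertation or from the Roget's database.

```python
from collections import defaultdict

# Hypothetical toy data: (category, sense id) -> words in that sense.
ENTRIES = {
    ("Plants", "p1"): {"leaf", "frond"},
    ("Plants", "p2"): {"tree", "sapling", "leaf"},
    ("Motion", "m1"): {"run", "flow"},
    ("Motion", "m2"): {"run", "dash"},
    ("Agitation", "a1"): {"run", "stir"},
}


def connectivity(entries):
    """Return word -> (number of senses, number of distinct synonyms)."""
    senses_of = defaultdict(set)
    synonyms_of = defaultdict(set)
    for (_category, sense), words in entries.items():
        for w in words:
            senses_of[w].add(sense)
            synonyms_of[w] |= words - {w}
    return {w: (len(senses_of[w]), len(synonyms_of[w])) for w in senses_of}


def category_sizes(entries):
    """Return category -> number of distinct words it contains."""
    words_in = defaultdict(set)
    for (category, _sense), words in entries.items():
        words_in[category] |= words
    return {c: len(ws) for c, ws in words_in.items()}


scores = connectivity(ENTRIES)
sizes = category_sizes(ENTRIES)
most_connected = max(scores, key=lambda w: scores[w])
largest_category = max(sizes, key=lambda c: sizes[c])
```

In this toy table the largest category is the concrete one ("Plants"),
while the most-connected word ("run") belongs to the abstract
categories, which mirrors the kind of pattern the abstract describes.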

---------------------------------------------------------------------------
LINGUIST List: Vol-15-2475


