[Corpora-List] semantic similarity

Eric Atwell eric at comp.leeds.ac.uk
Thu Jan 20 17:33:44 UTC 2005


George Demetriou used the word definitions in the LDOCE (Longman Dictionary
of Contemporary English) online lexicon to measure the similarity between two
words in a text by computing the overlap between their definitions. See:

Demetriou G and Atwell E. 2001.
A domain-independent semantic tagger for the study of meaning associations
in English text. <http://www.comp.leeds.ac.uk/eric/iwcs.ps>
In Harry Bunt, Ielka van der Sluis and Elias Thijsse (editors),
Proceedings of the Fourth International Workshop on Computational
Semantics (IWCS-4) pp.67-80. Tilburg, Netherlands.
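
The definition-overlap idea can be sketched roughly as below. This is only an
illustration, not the method from the paper: the toy definitions are invented
(they are not LDOCE entries), the stopword list is an assumption, and the
Jaccard ratio is one plausible way to score overlap, chosen here for
simplicity.

```python
# Toy definitions invented for illustration; these are NOT LDOCE entries.
TOY_DEFINITIONS = {
    "bank": "a business that keeps and lends money",
    "loan": "an amount of money that a person or business lends",
    "river": "a wide natural stream of water",
}

# Assumed stopword list, so that function words do not count as overlap.
STOPWORDS = {"a", "an", "and", "of", "or", "that", "the"}


def definition_words(word, definitions=TOY_DEFINITIONS):
    """Return the set of content words in a word's definition (empty if unknown)."""
    text = definitions.get(word.lower(), "")
    return {w for w in text.lower().split() if w not in STOPWORDS}


def definition_overlap(w1, w2, definitions=TOY_DEFINITIONS):
    """Jaccard overlap between two definitions (an assumed scoring choice)."""
    d1 = definition_words(w1, definitions)
    d2 = definition_words(w2, definitions)
    if not d1 or not d2:
        return 0.0  # no definition available for at least one word
    return len(d1 & d2) / len(d1 | d2)


print(definition_overlap("bank", "loan"))
print(definition_overlap("bank", "river"))
```

With these toy entries, "bank" and "loan" share three of six distinct
definition words, while "bank" and "river" share none.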

Xiao Yuan Duan used the same metric to compute semantic overlap between
two web pages, by aggregating semantic overlaps between every pair of
words across the two documents; see:

Xiao Yuan DUAN, 2002,
Lexical Semantic Association Between Web Documents
(/research/pubs/theses/duan.pdf),
MSc research thesis, School of Computing, Leeds University
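
Aggregating word-pair overlaps into a document-level score might look
something like the sketch below. The aggregation used in the thesis is not
stated here, so taking the mean over all cross-document word pairs is an
assumption, as are the invented toy definitions, the stopword list, the
whitespace tokenisation and the Jaccard word-level score.

```python
from itertools import product

# Toy definitions invented for illustration; these are NOT LDOCE entries.
TOY_DEFINITIONS = {
    "bank": "a business that keeps and lends money",
    "loan": "an amount of money that a person or business lends",
    "river": "a wide natural stream of water",
}

# Assumed stopword list, so that function words do not count as overlap.
STOPWORDS = {"a", "an", "and", "of", "or", "that", "the"}


def word_overlap(w1, w2):
    """Jaccard overlap between the definitions of two words (assumed metric)."""
    d1 = {w for w in TOY_DEFINITIONS.get(w1, "").split() if w not in STOPWORDS}
    d2 = {w for w in TOY_DEFINITIONS.get(w2, "").split() if w not in STOPWORDS}
    if not d1 or not d2:
        return 0.0  # no definition available for at least one word
    return len(d1 & d2) / len(d1 | d2)


def document_overlap(doc1, doc2):
    """Mean word-level overlap over every cross-document word pair.

    Averaging is an assumed aggregation; a sum or a maximum per word
    would be equally easy to substitute.
    """
    pairs = list(product(doc1.lower().split(), doc2.lower().split()))
    if not pairs:
        return 0.0  # at least one document is empty
    return sum(word_overlap(a, b) for a, b in pairs) / len(pairs)


print(document_overlap("bank loan", "loan"))
print(document_overlap("river", "bank"))
```

Note that comparing every pair is quadratic in document length, so a real
system would probably cache word-pair scores or restrict the comparison to
distinct content words.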

LDOCE is good for this because all definitions are in terms of a small
defining vocabulary; but other online dictionaries can also work.

Hope this helps,

Eric Atwell, Leeds University



On Thu, 20 Jan 2005, Jana Diesner wrote:

> Dear list members,
>
> We are looking for strategies, algorithms or code to automatically find
> single terms or multiple adjacent terms that are semantically similar within
> and across documents. The approach must not require POS tagging or an
> initial input of a reference term to the system. The resulting clusters of
> semantically similar terms suggested by the system do not need to be
> exclusive. We are familiar with secondstring, the software developed by
> William Cohen, and semantic similarity based on string-edit distances.
>
>
>
> Thank you very much.
>
> Jana
>
>
>
> ____________________
>
> Jana Diesner
> Carnegie Mellon University
>
> jdiesner at andrew.cmu.edu
>
>
>
>

--
Eric Atwell, Senior Lecturer, Computer Vision and Language research group,
School of Computing, University of Leeds, LEEDS LS2 9JT, England
TEL: +44-113-2335430  FAX: +44-113-2335468  http://www.comp.leeds.ac.uk/eric


