[Corpora-List] Semantic Similarity summary

Daniel Midgley dmidgley at arts.uwa.edu.au
Sun Nov 3 05:45:51 UTC 2002


Thanks to all who responded to my inquiry about semantic similarity using
distributional techniques.
Here are some of the results.

Websites:

* http://www.ilc.pi.cnr.it/EAGLES96/rep2/node37.html
A quick rundown of methods and concepts involved in Word Clustering.

* http://www.cs.ualberta.ca/~lindek/demos.htm
A quite enjoyable set of demos. You can type in words and look for similar
words in a newspaper corpus based on measures of dependency-based and
proximity-based similarity. There's also a "Usage Checker", where language
learners can get suggestions for the most common usages for pairs of
keywords (so as to avoid anomalous usages).

* Latent Semantic Analysis
http://lsa.colorado.edu/whatis.html
A corpus-based method of analysing the content of documents based on word
distribution.


Articles:

* Demetriou G and Atwell E. 2001. A domain-independent semantic tagger for
the study of meaning associations in English text. In Harry Bunt, Ielka
van der Sluis and Elias Thijsse (editors), Proceedings of the Fourth
International Workshop on Computational Semantics (IWCS-4) pp.67-80.
Tilburg, Netherlands. ISBN: 90-74029-16-7.
http://www.comp.leeds.ac.uk/eric/iwcs.ps

* Wilson, A. and Rayson, P. 1993. Automatic Content Analysis of Spoken
Discourse: a report on work in progress. In: C. Souter and E. Atwell
(eds), Corpus Based Computational Linguistics. Amsterdam: Rodopi. pp. 215-226.
http://www.comp.lancs.ac.uk/computing/research/ucrel/papers/war93.txt

* Wilson, A. and Thomas, J.A. 1997. Semantic annotation,
in Garside, R., Leech, G., and McEnery, A. (eds.) Corpus Annotation:
Linguistic Information from Computer Text Corpora. Longman, London, pp.53-65.

* Natural Semantic Metalanguage
This is an ongoing attempt by Cliff Goddard and Anna Wierzbicka (and
others) to find the language primitives (or the "semantic core") that are
present in all languages.
http://www.une.edu.au/arts/LCL/disciplines/linguistics/nsmpage1.htm

* Sahlgren, M. 2001. Vector-Based Semantic Analysis: Representing Word
Meanings Based on Random Labels.
Author's website:
http://www.sics.se/~mange/

* Pereira F., Tishby N., and Lee L. (1993) Distributional clustering of
English words. In Proc. of the 31st Annual Meeting of the ACL, pp. 183-190.
http://citeseer.nj.nec.com/pereira93distributional.html

* Vasileios Hatzivassiloglou and Kathleen McKeown. 1993. Towards the
automatic identification of adjectival scales: Clustering of adjectives
according to meaning. In 31st Annual Meeting of the ACL, pages 172-182.
http://citeseer.nj.nec.com/context/114108/0 (Not the article itself, but
similar ones.)

A link to a query for CiteSeer:
http://citeseer.nj.nec.com/cs?q=Distributional+clustering&submit=Search+Documents&cs=1

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Daniel Midgley
dmidgley at arts.uwa.edu.au
+ (61 8) 9371 3730
http://www.cs.uwa.edu.au/~fontor


