[Corpora-List] Discover Word Meanings with SenseClusters!

Ted Pedersen tpederse at d.umn.edu
Sun Jan 4 21:25:10 UTC 2004


We are pleased to announce the release of SenseClusters, a free software
package that performs unsupervised discovery of word senses by clustering
instances of a word (or words) that are used in similar contexts
in raw text. It supports a wide range of clustering techniques based on
both context vectors and similarity matrices.

SenseClusters is flexible, and can be used in any application that
requires clustering of similar instances of text. Examples could include
word sense discrimination, synonymy identification, text classification,
and summarization. It can also be used to implement models such as Latent
Semantic Analysis (LSA).

SenseClusters takes a user through the entire process of unsupervised
learning of word senses, including text preprocessing, feature selection,
context vector and similarity matrix construction, dimensionality
reduction via singular value decomposition (SVD), and clustering via both
agglomerative and partitional algorithms.
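
As a rough illustration of the steps just listed, here is a minimal
sketch in plain Python. It is not SenseClusters code and does not use its
interfaces: the function names, the tiny stoplist (standing in for feature
selection), the example contexts, and the choice of cosine similarity with
average-link agglomerative clustering are all assumptions made for the
example, and the SVD step is left out to keep it short.

```python
import math
from collections import Counter

# Hypothetical stoplist standing in for a real feature selection step.
STOPLIST = {"the", "at", "on", "and", "was", "after", "a"}

def context_vector(text, target):
    """Bag-of-words counts for one context, minus the target and stopwords."""
    words = text.lower().split()
    return Counter(w for w in words if w != target and w not in STOPLIST)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity_matrix(vectors):
    """Pairwise cosine similarities between all context vectors."""
    return [[cosine(a, b) for b in vectors] for a in vectors]

def agglomerative(sim, k):
    """Average-link agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                score = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Four made-up instances of the ambiguous word "bank".
contexts = [
    "deposit money at the bank branch",
    "the bank approved the loan and money transfer",
    "fishing on the river bank at dawn",
    "the river bank was muddy after fishing",
]
vectors = [context_vector(c, "bank") for c in contexts]
sim = similarity_matrix(vectors)
clusters = agglomerative(sim, 2)
print(clusters)
```

With these four contexts, the two financial instances end up in one
cluster and the two river instances in the other. A partitional method
would instead assign instances to a fixed number of clusters and refine
the assignment iteratively.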

SenseClusters provides a great deal of native functionality, and also
offers seamless interfaces to a number of powerful tools, including
Cluto (a clustering toolkit), SVDPACKC (which carries out singular
value decomposition), and the Ngram Statistics Package.

For general information please visit:
http://senseclusters.sourceforge.net

For immediate download of the first public release (0.47) please visit:
http://sourceforge.net/projects/senseclusters/

This is an active project, and the principal designer and lead developer
(Amruta Purandare, pura0010 at d.umn.edu) and I would be delighted to hear
any comments, requests, or even bug reports that you might have. You can
see some of our future plans in our Todo list, which is distributed with
the package.

Cordially,
Ted and Amruta

PS To subscribe to the SenseClusters mailing lists, visit:

http://lists.sourceforge.net/lists/listinfo/senseclusters-users  (discussion)
http://lists.sourceforge.net/lists/listinfo/senseclusters-news (announcements)

--
# Ted Pedersen                              http://www.umn.edu/~tpederse #
# Department of Computer Science                        tpederse at umn.edu #
# University of Minnesota, Duluth                                        #
# Duluth, MN 55812                                        (218) 726-8770 #


