[Corpora-List] SenseClusters version 0.85 released!

ted pedersen tpederse at d.umn.edu
Tue May 9 02:01:06 UTC 2006


It has been a year or two since I have sent a SenseClusters announcement
to this list, so I thought it was time to update you on the current state
of the project.

SenseClusters is a free software package that clusters similar contexts
using a variety of lexical features and representation methods. It
includes support for SVD and a range of clustering algorithms. It
also provides several methods for automatically determining the number
of clusters in your input contexts. It is language independent, and we
hope easy to use!
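
To give a rough idea of what "clustering similar contexts" means, here is a
minimal Python sketch. It is not SenseClusters code and does not use its
interface. It assumes the simplest possible representation (unigram counts
per context) and a toy stopword list, and the names context_vector and
cosine are made up for this illustration. Contexts that use a word in the
same sense should end up with a higher cosine similarity.

```python
import math
from collections import Counter

# Toy stopword list, an assumption made only for this illustration.
STOPWORDS = {"the", "a", "and", "from", "has", "we", "near"}

def context_vector(text):
    """Represent one context as unigram counts, ignoring stopwords."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

# Three contexts of the ambiguous word "bank".
contexts = [
    "the bank approved the loan and the interest rate",
    "the loan from the bank has a low interest rate",
    "we fished from the river bank near the water",
]
vectors = [context_vector(c) for c in contexts]

sim_01 = cosine(vectors[0], vectors[1])  # both financial contexts
sim_02 = cosine(vectors[0], vectors[2])  # financial vs. river context
print(sim_01, sim_02)
```

A clustering algorithm would then group the contexts by similarities of
this kind. The lexical features and representations SenseClusters offers
are richer than this toy example.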

We have mostly applied SenseClusters to word sense and name
discrimination, but it is really much more general than that. For
example, we have run some email clustering experiments that have been
quite promising.

You can download and install SenseClusters on your own Linux or Unix
system. If you would prefer not to install it, or you do not have access
to a Linux or Unix system, you can use our web interface, or you can run
it from a Knoppix CD we have created with SenseClusters already installed.

You can find SenseClusters at the following site, which includes
a link to the web interface, and a link to download the system.

http://senseclusters.sourceforge.net/

If you would like a Knoppix CD, please visit our demo at NAACL this June
in New York City, or write to us and we can either send you a CD or make
the iso image available to you.

The most current version of SenseClusters is 0.85. This release features
our adaptation of the Gap Statistic, a state-of-the-art method for
automatically finding the number of clusters in a data set.
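
For readers who have not met the Gap Statistic, here is a minimal Python
sketch of the standard form of the method (Tibshirani, Walther and Hastie).
It is not the SenseClusters adaptation. The helper names, the plain k-means
used inside it, the toy data and all parameter values are assumptions made
for illustration. The idea is to compare the log within-cluster dispersion
of the data with that of uniform reference data drawn over the same range,
and to pick the smallest k where the gap stops growing.

```python
import math
import random

def kmeans_dispersion(points, k, rng, restarts=5, iters=20):
    """Smallest within-cluster sum of squared distances over a few restarts."""
    best = float("inf")
    for _ in range(restarts):
        centers = rng.sample(points, k)
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for p in points:
                j = min(range(k), key=lambda c: math.dist(p, centers[c]))
                clusters[j].append(p)
            # Move each center to its cluster mean; keep it if the cluster is empty.
            centers = [
                tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
                for i, c in enumerate(clusters)
            ]
        w = sum(math.dist(p, centers[i]) ** 2
                for i, c in enumerate(clusters) for p in c)
        best = min(best, w)
    return best

def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (chosen k, list of gap values for k = 1..k_max)."""
    rng = random.Random(seed)
    dims = list(zip(*points))
    lows, highs = [min(d) for d in dims], [max(d) for d in dims]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(kmeans_dispersion(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            # Reference data: uniform over the bounding box of the real data.
            ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                   for _ in points]
            ref_logs.append(math.log(kmeans_dispersion(ref, k, rng)))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((x - mean_ref) ** 2 for x in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s(k+1).
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps

# Toy data: two well-separated 2-D blobs of 15 points each.
data_rng = random.Random(1)
data = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
        for cx, cy in [(0.0, 0.0), (10.0, 10.0)] for _ in range(15)]

best_k, gaps = gap_statistic(data, k_max=4)
print(best_k, gaps)
```

In SenseClusters the points would be context vectors rather than 2-D toy
data, and the details of our adaptation differ from this sketch.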

In addition to clustering contexts, SenseClusters does provide some
support for finding word clusters, and one of the things we will be
working on this summer is adding support for Latent Semantic Analysis.
So, there is a lot already included in SenseClusters, and more planned.

Please check it out, and let us know if you have any questions, comments,
or suggestions!

Cordially,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse


