[Corpora-List] token clustering tool

Steven Bird sb at cs.mu.oz.au
Wed May 12 01:07:58 UTC 2004


On Wed, 2004-05-12 at 09:07, Normand Peladeau wrote:
> At 2004-05-11 03:24, you wrote:
> > Dear all,
> >
> > Does anyone know of a tool (or algorithm), preferably available freely
> > for research purposes, that takes as its input a corpus only and
> > produces as its output clusters of tokens that occur close to each other
> > relatively often?
>
> I created such software, but it is a commercial product...

It's easy to write a program to do this using NLTK, and it's free.

Please see: http://nltk.sourceforge.net/
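
To give a rough idea of what such a program might look like, here is a
minimal sketch in plain Python. It is not NLTK's API and not a definitive
method: the function name, the window size, the thresholds, and the choice
of a crude pointwise mutual information (PMI) score followed by merging
linked tokens into groups are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def cooccurrence_clusters(tokens, window=3, min_count=2, min_pmi=1.0):
    """Group tokens that co-occur within `window` positions unusually often.

    Hypothetical helper: the scoring and the thresholds are illustrative only.
    """
    unigram = Counter(tokens)
    pairs = Counter()
    # Count unordered pairs of distinct tokens at most `window` positions apart.
    for i, left in enumerate(tokens):
        for right in tokens[i + 1:i + 1 + window]:
            if left != right:
                pairs[frozenset((left, right))] += 1

    total = len(tokens)
    total_pairs = sum(pairs.values())
    parent = {}  # union-find forest over tokens that pass the thresholds

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pair, n in pairs.items():
        if n < min_count:
            continue
        a, b = tuple(pair)
        # Rough PMI estimate: observed pair rate vs. product of token rates.
        pmi = log2((n / total_pairs)
                   / ((unigram[a] / total) * (unigram[b] / total)))
        if pmi >= min_pmi:
            parent[find(a)] = find(b)  # merge the two tokens' groups

    groups = defaultdict(set)
    for tok in parent:
        groups[find(tok)].add(tok)
    return [g for g in groups.values() if len(g) > 1]


if __name__ == "__main__":
    corpus = ["new", "york"] * 5 + list("abcdefghijklmnop")
    print(cooccurrence_clusters(corpus, window=1))
```

On the toy corpus at the bottom, the only pair that is both frequent enough
and strongly associated is "new"/"york", so that is the single cluster
returned. A real corpus would need proper tokenization and tuned thresholds.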

-Steven Bird


