[Corpora-List] token clustering tool
Jose Maria Gomez Hidalgo
jmgomez at uem.es
Tue May 11 08:19:33 UTC 2004
At 09:24 11/05/2004, Murk Wuite wrote:
>Dear all,
>
>Does anyone know of a tool (or algorithm), preferably available freely
>for research purposes, that takes as its input a corpus only and
>produces as its output clusters of tokens that occur close to each other
>relatively often?
The document clustering toolkit CLUTO may fit your needs, perhaps with
some adaptation.
http://www-users.cs.umn.edu/~karypis/cluto/
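
As a rough illustration of the kind of output the question describes (this is not CLUTO's method or API), here is a minimal sketch in Python. It assumes the corpus is already tokenized, counts how often two tokens occur within a small window of each other, and groups tokens that are linked by frequent co-occurrence. The function names and the `window` and `min_count` parameters are hypothetical, chosen only for this example.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count unordered pairs of distinct tokens that occur within
    `window` positions of each other."""
    counts = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                counts[frozenset((left, right))] += 1
    return counts


def cluster_tokens(tokens, window=2, min_count=2):
    """Return clusters of tokens: connected components of the graph whose
    edges are token pairs seen together at least `min_count` times.
    Tokens with no frequent partner are left out."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for pair, n in cooccurrence_counts(tokens, window).items():
        if n >= min_count:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    clusters = {}
    for tok in parent:
        clusters.setdefault(find(tok), set()).add(tok)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


if __name__ == "__main__":
    corpus = ("new york rocks san francisco shines "
              "new york sleeps san francisco wakes").split()
    print(cluster_tokens(corpus, window=1, min_count=2))
```

Raw counts favour very frequent words, so on a real corpus an association measure such as pointwise mutual information would probably be a better edge weight than a plain count threshold.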
>Best wishes,
>
>Murk Wuite
>MA student at the Department of Language and Speech, Katholieke
>Universiteit Nijmegen, The Netherlands
Jose Maria Gomez Hidalgo
Departamento de Inteligencia Artificial
Universidad Europea de Madrid
28670 - Villaviciosa de Odon - MADRID
(+34) 912115670
jmgomez at uem.es
http://www.esi.uem.es/~jmgomez/
Spanish law guarantees privacy in electronic communications. This
electronic transmission is strictly confidential and intended solely for
the addressee. If you are not the intended addressee, you are kindly
requested not to disclose nor to copy this transmission and to notify us as
soon as possible.