[Corpora-List] Developing and testing new similarity measures for word clustering

Mark P. Line mark at polymathix.com
Fri Oct 8 20:46:47 UTC 2004


Normand Peladeau said:
>
> I am working now on some modified versions of those indices and I need
> some ways to benchmark those new similarity measures.  I would like to
> have a series of benchmarks for several kinds of application (dimension
> reduction, automatic identification of themes, automatic taxonomy
> development, etc.).

Although I don't think I can offer any advice on where to get your
benchmark data without stating the obvious, I do have one suggestion about
how you might proceed once you know what results you're looking for
against the benchmark:

You could use _genetic programming_ to breed indices that have just the
properties you're looking for, as long as you can recognize a good index
when you see one (that is, as long as you can score each candidate index
against the benchmark).
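
To make the idea concrete, here is a minimal sketch in Python of what
"breeding" an index might look like. Everything in it is an assumption made
for illustration, not something from the original discussion: candidate
indices are small expression trees over three co-occurrence counts (a, b, c),
the fitness is negative mean squared error against benchmark scores, and
the benchmark itself (toy_benchmark) is synthetic data whose targets follow
the Jaccard coefficient, standing in for whatever real benchmark you settle on.

```python
import math
import random

# Terminals are cells of a 2x2 co-occurrence table for a word pair
# (hypothetical names): a = contexts containing both words,
# b = contexts with only the first word, c = contexts with only the second.
TERMINALS = ("a", "b", "c")
OPS = {
    "add": lambda x, y: x + y,
    "mul": lambda x, y: x * y,
    "div": lambda x, y: x / y if abs(y) > 1e-9 else 0.0,  # protected division
}

def random_tree(rng, depth=3):
    """Grow a random expression tree: a terminal name or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, env):
    """Compute the index value for one word pair's counts."""
    if isinstance(tree, str):
        return float(env[tree])
    op, left, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))

def fitness(tree, benchmark):
    """Negative mean squared error against benchmark scores (0.0 is perfect)."""
    err = 0.0
    for env, target in benchmark:
        diff = evaluate(tree, env) - target
        err += diff * diff
    return -err / len(benchmark) if math.isfinite(err) else float("-inf")

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if not isinstance(tree, str):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, mother, father, max_nodes=31):
    """Graft a random subtree of father into a random position in mother."""
    path, _ = rng.choice(list(subtrees(mother)))
    _, donor = rng.choice(list(subtrees(father)))
    child = replace(mother, path, donor)
    # Reject oversized children so trees cannot grow without bound.
    return child if sum(1 for _ in subtrees(child)) <= max_nodes else mother

def mutate(rng, tree):
    """Mutation as crossover with a freshly generated random tree."""
    return crossover(rng, tree, random_tree(rng, 2))

def evolve(benchmark, rng, pop_size=60, generations=30):
    """Return (best tree, its fitness, best fitness per generation)."""
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=lambda t: fitness(t, benchmark), reverse=True)
        history.append(fitness(ranked[0], benchmark))
        elite = ranked[: pop_size // 4]  # the top quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            child = crossover(rng, rng.choice(elite), rng.choice(elite))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            children.append(child)
        population = elite + children
    best = max(population, key=lambda t: fitness(t, benchmark))
    return best, fitness(best, benchmark), history

def toy_benchmark(rng, n=40):
    """Synthetic stand-in for a real benchmark: targets follow Jaccard, a/(a+b+c)."""
    rows = []
    for _ in range(n):
        env = {name: rng.randint(1, 50) for name in TERMINALS}
        rows.append((env, env["a"] / (env["a"] + env["b"] + env["c"])))
    return rows

if __name__ == "__main__":
    rng = random.Random(0)
    best, score, _ = evolve(toy_benchmark(rng), rng)
    print(best, score)
```

The only part that really depends on your application is fitness(): for
dimension reduction, theme identification, or taxonomy building you would
presumably swap in whatever score tells you a candidate index is doing what
you want on that benchmark. The rest (tree representation, crossover,
elitism) is one common way to set up genetic programming, not the only one.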


-- Mark

Mark P. Line
Polymathix
San Antonio, TX


