[Corpora-List] Corpus for clustering

Florian Petran florian.petran at googlemail.com
Tue Mar 13 15:05:06 UTC 2007


There is a recent article about LSA and Wikipedia; I believe it
deals with somewhat similar problems:

http://www.cs.technion.ac.il/~shaulm/papers/pdf/Gabrilovich-Markovitch-ijcai2007.pdf

Hope this helps.

2007/3/10, Bob Parks <bobp at clarityconnect.com>:
> I'm looking for references on how to construct corpora that reflect
> documents that use particular concepts and topics. I'm assuming it's
> necessary to first cluster a larger document set. But how does one
> conceptualize the problem of creating the larger set to analyze for a
> set of concepts/topics, before the analysis?
> Thanks,
> Bob Parks
> --
> * The best dictionary and integrated thesaurus on the web:
> http://www.wordsmyth.net
> * Robert Parks - Wordsmyth - (607) 272-2190
> * "To imagine a language is to imagine a form of life."  (LW) And to
> imagine new forms of life is to create pathways to the language for
> living that life.
> * "Philosophers have only interpreted the world. The point, however,
> is to change it." (KM) And the best way to change the world is to
> first imagine a better form of life, and shape and offer your words
> as tools for living that world. This is the highest calling of a
> wordsmyth: to enrich the deep structure of communication and
> community.
>
>


