[Corpora-List] English-language paraphrase corpora
David Evans
devans at cs.columbia.edu
Tue Feb 1 15:20:56 UTC 2005
We have a system at Columbia that crawls the web and clusters documents
into sets of related articles:
http://newsblaster.cs.columbia.edu/
It has archives going back to 2001 or so.
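
For anyone who wants to build comparable sets themselves, here is a rough
sketch of the "cluster documents into related sets" step. It is only an
illustration, not how Newsblaster actually works: it assumes plain
bag-of-words cosine similarity and a greedy single-pass grouping, and the
function names and the 0.3 threshold are made up for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering (an assumed, simplified scheme).

    Each document joins the first cluster whose seed document is at
    least `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []  # each entry: (seed_vector, [document indices])
    for i, text in enumerate(docs):
        vec = bag_of_words(text)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]
```

Clusters that end up containing articles from more than one source are
then candidates for a comparable collection. A real system would likely
want TF-IDF weighting and a date window on top of this.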
Dave
Olga Shaumyan wrote:
> Dear All,
>
> I am looking for English-language "comparable" corpora. That is, I want,
> for example, two collections of articles from different sources describing
> the same events.
>
> Alternatively, would anyone know off-hand how one would go about
> constructing such comparable collections?
>
> (This is to be used for automatic paraphrasing.)
>
> Any pointers greatly appreciated,
>
> Olga
> University of Sussex NLP group