[Corpora-List] English-language paraphrase corpora

Gregor Erbach gor at acm.org
Tue Feb 1 10:16:14 UTC 2005


Hi Olga,
Google News (news.google.com) groups news articles from different
sources that relate to the same event, and can be used
for constructing such a corpus.
However, many of the articles will be duplicates, as different
newspapers reprint the same text from the press agencies.
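
One way to deal with the duplicates might be to compare the articles
within each event group by word n-gram overlap and keep only one copy of
any near-identical pair. The following is only a rough sketch in Python,
not a tested pipeline: it assumes the articles for one event have already
been collected as plain-text strings, and the function names, the trigram
size and the 0.8 threshold are my own illustrative choices.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) found in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        # Very short text: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(articles, threshold=0.8):
    """Keep the first article of each near-duplicate group.

    `articles` is a list of plain-text strings for a single event.
    The threshold is an assumed value and would need tuning on real data.
    """
    kept = []
    kept_shingles = []
    for text in articles:
        s = shingles(text)
        # Keep the article only if it differs enough from all kept ones.
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(text)
            kept_shingles.append(s)
    return kept
```

Articles that survive this filter but still describe the same event are
then the candidates for the comparable collections. The pairwise
comparison is quadratic in the number of articles per event, which should
be acceptable for groups of a few dozen articles.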

regards,

    Gregor

Quoting Olga Shaumyan <olgas at sussex.ac.uk>:

>
> Dear All,
>
> I am looking for English-language "comparable" corpora, i.e. I want,
> e.g., two collections of articles from different sources describing the
> same events.
>
> Alternatively, would anyone know off-hand how one would go about
> constructing such comparable collections?
>
> (This is to be used for automatic paraphrasing.)
>
> Any pointers greatly appreciated,
>
> Olga
> University of Sussex NLP group



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr. Gregor Erbach                     http://purl.org/net/gregor/
DFKI GmbH, Language Technology Lab    http://www.dfki.de/
Tel. +49 (681) 302-5354               mailto:erbach at dfki.de


