[Corpora-List] English-language paraphrase corpora
Gregor Erbach
gor at acm.org
Tue Feb 1 10:16:14 UTC 2005
Hi Olga,
Google News (news.google.com) groups news articles from different
sources that relate to the same event, and can be used
for constructing such a corpus.
However, many of the articles will be duplicates, as different
newspapers reprint the same text from the press agencies.
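
One way to filter out such agency duplicates before pairing articles
is to compare word n-gram ("shingle") overlap and keep only one copy
of texts that are nearly identical. The following is a rough sketch
only: the function names, the shingle size of 3 and the 0.8 threshold
are my own assumptions, not anything Google News provides, and the
quadratic comparison is only practical for the small clusters of
articles that belong to one event.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) found in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        # Very short texts: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(articles, threshold=0.8):
    """Keep the first copy of each article; drop later near-copies.

    threshold is an assumed cut-off and would need tuning on real data.
    """
    kept = []
    kept_shingles = []
    for text in articles:
        s = shingles(text)
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(text)
            kept_shingles.append(s)
    return kept


if __name__ == "__main__":
    cluster = [
        "The minister resigned on Monday after a long dispute over the budget.",
        "The minister resigned on Monday after a long dispute over the budget!",
        "After weeks of budget quarrels, the minister stepped down at the start of the week.",
    ]
    for article in drop_near_duplicates(cluster):
        print(article)
```

The articles that survive this filter describe the same event in
different words, which is what you want for paraphrase extraction.
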
Regards,
Gregor
Quoting Olga Shaumyan <olgas at sussex.ac.uk>:
>
> Dear All,
>
> I am looking for English-language "comparable" corpora. I.e. I want,
> e.g., 2 collections of articles from different sources describing the same
> events.
>
> Alternatively, would anyone know off-hand how one would go about
> constructing such comparable collections?
>
> (This is to be used for automatic paraphrasing.)
>
> Any pointers greatly appreciated,
>
> Olga
> University of Sussex NLP group
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr. Gregor Erbach http://purl.org/net/gregor/
DFKI GmbH, Language Technology Lab http://www.dfki.de/
Tel. +49 (681) 302-5354 mailto:erbach at dfki.de