[Corpora-List] English-language paraphrase corpora

Paula Newman paulan at earthlink.net
Tue Feb 1 10:50:54 UTC 2005


Olga,
The website for DUC (the Document Understanding Conference),
run by US NIST, contains clusters of relatively short articles
on the same topics: http://www-nlpir.nist.gov/projects/duc/data.html
Accessing the data requires obtaining some permissions, which are
described on that web page.
Paula

> [Original Message]
> From: Olga Shaumyan <olgas at sussex.ac.uk>
> To: <corpora at uib.no>
> Date: 2/1/2005 3:41:26 AM
> Subject: [Corpora-List] English-language paraphrase corpora
>
>
> Dear All,
>
> I am looking for English-language "comparable" corpora, i.e., I want,
> e.g., 2 collections of articles from different sources describing the
> same events.
>
> Alternatively, would anyone know off-hand how one would go about
> constructing such comparable collections?
>
> (This is to be used for automatic paraphrasing.)
>
> Any pointers greatly appreciated,
>
> Olga
> University of Sussex NLP group


