[Corpora-List] Where can I find a copy of MSE corpus?

Thu Mar 17 09:09:20 UTC 2011

Dear Aqil,
A few months ago, we released a multilingual summary evaluation corpus in
seven languages for research and evaluation purposes. You can download the
set from http://langtech.jrc.ec.europa.eu/JRC_Resources.html .

This dataset consists of a manually annotated collection of document
clusters of parallel texts in seven languages (Arabic, Czech, English,
French, German, Russian and Spanish) that can be used to evaluate
multi-document, or even single-document, summarisation software. The data is
particularly useful for comparing the performance of software across languages.

Each of the four document clusters consists of five high-level commentaries
selected from http://www.project-syndicate.org/. The clusters cover topics
that can roughly be described as malaria, the Israel-Palestine conflict,
genetics, and science and society.

The resource and its use are described in:

    Marco Turchi, Josef Steinberger, Mijail Kabadjov and Ralf Steinberger (2010).
    Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation.
    Springer Lecture Notes in Computer Science (LNCS), Volume 6360/2010, 52-63.

Thanks a lot
Marco Turchi

On Thu, Mar 17, 2011 at 7:27 AM, Aqil Azmi <aazmi at yahoo.com> wrote:

> Hello everybody,
>
> Any idea where I can find a copy of MSE (Multilingual Summarization
> Evaluation) corpus? Thank you very much.
>
> --Aqil
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>