[Corpora-List] Document summarization evaluation dataset needed

Ralf Steinberger ralf.steinberger at jrc.ec.europa.eu
Thu Dec 11 12:47:33 UTC 2014


Dear Tomáš,

 

You did not mention the language of the summarisation corpus you are looking for, but at the following web site, you can find manually produced single-document and multi-document summaries in seven languages, together with many more multilingual parallel corpora:

 

https://ec.europa.eu/jrc/en/language-technologies

 


The International Standard Language Resource Number (ISLRN) for this ‘Multilingual summary evaluation data’ is: 762-292-165-648-8 <http://islrn.org/resources/762-292-165-648-8>.


 


The data is described in detail in:


 


Marco Turchi, Josef Steinberger, Mijail Kabadjov & Ralf Steinberger (2010). Using Parallel Corpora for Multilingual (Multi-Document) Summarisation Evaluation. Multilingual and Multimodal Information Access Evaluation. Springer Lecture Notes in Computer Science, LNCS 6360/2010, pp. 52-63.

 

I hope you find this useful.

 

All the best,

 

Ralf

 

 

Ralf Steinberger

European Commission – Joint Research Centre (JRC)

URL – Applications: http://emm.newsbrief.eu/overview.html

URL – The science behind them: http://ipsc.jrc.ec.europa.eu/?id=179

21027 Ispra (VA), Italy

 

From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Tomáš Kociský
Sent: 10 December 2014 21:25
To: corpora at uib.no
Subject: [Corpora-List] Document summarization evaluation dataset needed

 

Hi All,

 

Could anyone provide me with pointers to datasets for evaluating (single) document summarization (extractive and/or abstractive) for research purposes? I was unable to obtain the DUC datasets.

 

Alternatively, if you have any of the DUC datasets please contact me!

 

Many thanks,




Tomas Kocisky
