[Corpora-List] Essex Arabic Summaries Corpus (EASC)

El-Haj, Mahmoud melhaj at essex.ac.uk
Fri Feb 19 19:39:20 UTC 2010


-----------------------------------------------------------
The Essex Arabic Summaries Corpus (EASC)
-----------------------------------------------------------

We are pleased to announce the immediate availability of EASC 1.0,
the Essex Arabic Summaries Corpus, free of charge for research purposes.

The EASC is an Arabic natural language resource. It contains 153
Arabic articles and 765 human-generated extractive summaries of those
articles. These summaries were generated using Mechanical Turk
(http://www.mturk.com/).

You can request a copy of the EASC corpus through the following link:
(http://privatewww.essex.ac.uk/~melhaj/easc.htm)

Among the major features of EASC are:
* Names and extensions are formatted to be compatible with current
   evaluation systems such as ROUGE and AutoSummENG.
* Available in two encodings: UTF-8 and ISO-8859-6 (Arabic).
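
Since the corpus comes in two encodings, a loader may need to accept
either one. The following is a minimal Python sketch, not part of the
EASC distribution: the function name decode_arabic and the strategy of
trying UTF-8 first are assumptions made for illustration.

```python
def decode_arabic(data: bytes) -> str:
    """Decode bytes that are either UTF-8 or ISO-8859-6 (hypothetical helper).

    UTF-8 is tried first because Arabic text encoded in ISO-8859-6
    is almost never a valid UTF-8 byte sequence.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to the Arabic single-byte encoding.
        return data.decode("iso8859_6")


def to_utf8(data: bytes) -> bytes:
    """Normalise either encoding to UTF-8 bytes."""
    return decode_arabic(data).encode("utf-8")
```

Reading a file as bytes (open(path, "rb").read()) and passing the result
to decode_arabic gives a Python string in both cases.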

Extra files:
(These do not come with the corpus and can only be provided separately.)
* Arabic version of ROUGE <ROUGE-1.5.5.pl>
    To request ROUGE: (http://berouge.com/default.aspx)
* ROUGE Arabic XML configuration file.
* ROUGE Arabic lst input file.
* 153 single-sentence Arabic system summaries
   (could be used as testing baseline).
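
To illustrate how a baseline summary might be scored against a human
summary, here is a simplified Python sketch of unigram recall in the
spirit of ROUGE-1. It is not the ROUGE script itself: the function name
rouge1_recall and the whitespace tokenisation are assumptions, and the
real tool offers options (stemming, stop words, multiple references)
that are not modelled here.

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Simplified unigram recall (hypothetical helper, not ROUGE-1.5.5).

    Counts how many reference tokens also appear in the candidate,
    clipping repeated tokens, and divides by the reference length.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    return overlap / total
```

With several human summaries per article, one simple option is to
compute this score against each reference and average the results.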

The Essex Arabic Summaries Corpus (EASC) uses copyright material.
Users of the corpus are responsible for ensuring that they comply with
the terms of the copyrights that apply to the source material and the
derived works (summaries) and the terms of relevant copyright law.

Any other original data that is distributed with this corpus is
made available under the Creative Commons Attribution-ShareAlike
licence (http://creativecommons.org/licenses/by-sa/3.0/).  You must
provide details of the source of the material when using it.


--
The EASC was created by Mahmoud El-Haj <melhaj at essex.ac.uk>, under
the supervision of Dr Udo Kruschwitz <udo at essex.ac.uk> and Dr Chris Fox
<foxcj at essex.ac.uk>.

Corpus URL: (http://privatewww.essex.ac.uk/~melhaj/easc.htm)


School of Computer Science and Electronic Engineering, University of
Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, United Kingdom.


Best wishes,
Mahmoud El-Haj
http://privatewww.essex.ac.uk/~melhaj/
School of Computer Science and Electronic Engineering
Essex University, Wivenhoe Park,
Colchester CO4 3SQ, United Kingdom.