Arabic-L:LING:Essex Arabic Summaries Corpus
Dilworth Parkinson
dil at BYU.EDU
Wed Feb 24 00:12:39 UTC 2010
------------------------------------------------------------------------
Arabic-L: Mon 23 Feb 2010
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to
arabic-l at byu.edu
]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu
with first line reading:
unsubscribe arabic-l ]
-------------------------Directory------------------------------------
1) Subject:Essex Arabic Summaries Corpus
-------------------------Messages-----------------------------------
1)
Date: 23 Feb 2010
From:"El-Haj, Mahmoud" <melhaj at essex.ac.uk>
Subject:Essex Arabic Summaries Corpus
-----------------------------------------------------------
The Essex Arabic Summaries Corpus (EASC)
-----------------------------------------------------------
We are pleased to announce the immediate availability of EASC 1.0,
the Essex Arabic Summaries Corpus, free of charge for research purposes.
The EASC is an Arabic natural language resource. It contains 153
Arabic articles and 765 human-generated extractive summaries of those
articles. These summaries were generated using Mechanical Turk
(http://www.mturk.com/).
You can request a copy of the EASC corpus through the following link:
(http://privatewww.essex.ac.uk/~melhaj/easc.htm)
Among the major features of EASC are:
* Names and extensions are formatted to be compatible with current
evaluation systems such as ROUGE and AutoSummENG.
* Available in two encodings: UTF-8 and ISO-8859-6 (Arabic).
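For readers who receive the ISO-8859-6 version but work with UTF-8 tools, the conversion between the two encodings can be sketched as below. This is only a minimal illustration: the function name is hypothetical and is not part of the EASC distribution, and the sample word is made up for the example.

```python
# Hypothetical helper, not shipped with EASC: re-encode ISO-8859-6 bytes as UTF-8.
def transcode_iso8859_6_to_utf8(data: bytes) -> bytes:
    """Decode bytes as ISO-8859-6 (Arabic) and re-encode them as UTF-8."""
    return data.decode("iso-8859-6").encode("utf-8")


# Illustrative sample only: the Arabic word for "summary".
sample = "ملخص"
legacy_bytes = sample.encode("iso-8859-6")   # one byte per letter
utf8_bytes = transcode_iso8859_6_to_utf8(legacy_bytes)

# Round trip: the UTF-8 bytes decode back to the same text.
assert utf8_bytes.decode("utf-8") == sample
```

ISO-8859-6 stores each Arabic letter in a single byte, while UTF-8 uses two bytes for the same letters, so converted files will be larger than the originals.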
Extra files:
(These do not come with the corpus and can only be provided separately.)
* Arabic version of ROUGE <ROUGE-1.5.5.pl>
To request ROUGE: (http://berouge.com/default.aspx)
* ROUGE Arabic XML configuration file.
* ROUGE Arabic lst input file.
* 153 single-sentence Arabic system summaries
(could be used as testing baseline).
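To give a rough idea of how a single-sentence baseline might be scored against a human summary, here is a simplified unigram-recall sketch in the spirit of ROUGE-1. It is not the ROUGE script itself: the function name is hypothetical, tokens are split on whitespace only, and none of the options of the real tool are modelled.

```python
from collections import Counter


# Hypothetical, simplified stand-in for a ROUGE-1 style recall score.
def unigram_recall(system: str, reference: str) -> float:
    """Share of reference tokens (counted with multiplicity) found in the system summary."""
    system_counts = Counter(system.split())
    reference_counts = Counter(reference.split())
    total = sum(reference_counts.values())
    if total == 0:
        return 0.0  # empty reference: define recall as zero
    # Clip each token's match count by how often it appears in the system summary.
    overlap = sum(min(count, system_counts[token])
                  for token, count in reference_counts.items())
    return overlap / total


# Placeholder tokens stand in for words of a summary.
score = unigram_recall("a b c", "a b d e")
```

In this placeholder example two of the four reference tokens appear in the system summary, so the score is 0.5. For reportable results the actual ROUGE or AutoSummENG tools should be used.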
The Essex Arabic Summaries Corpus (EASC) uses copyright material.
Users of the corpus are responsible for ensuring that they comply with
the terms of the copyrights that apply to the source material and the
derived works (summaries) and the terms of relevant copyright law.
Any other original data that is distributed with this corpus is
made available under the Creative Commons Attribution-ShareAlike
licence (http://creativecommons.org/licenses/by-sa/3.0/). You must
provide details of the source of the material when using it.
--
The EASC was created by Mahmoud El-Haj <melhaj at essex.ac.uk>, under
the supervision of Dr Udo Kruschwitz <udo at essex.ac.uk> and Dr Chris Fox
<foxcj at essex.ac.uk>.
Corpus URL: (http://privatewww.essex.ac.uk/~melhaj/easc.htm)
School of Computer Science and Electronic Engineering, University of
Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, United Kingdom.
Best wishes,
Mahmoud El-Haj
http://privatewww.essex.ac.uk/~melhaj/
School of Computer Science and Electronic Engineering
Essex University, Wivenhoe Park,
Colchester CO4 3SQ, United Kingdom.
--------------------------------------------------------------------------
End of Arabic-L: 23 Feb 2010