[Corpora-List] Classical Arabic corpora
Eric Atwell
csc6ea at leeds.ac.uk
Fri Feb 3 17:45:56 UTC 2012
Dear Mai,
I guess you already know about the Quranic Arabic Corpus
http://corpus.quran.com/ - annotated with morphological tagging,
pronoun reference resolution, parallel English translation,
partial parsing etc.; BUT this covers only the Quran,
c. 70K words (depending on how you tokenise Arabic into words).
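To illustrate why the word count depends on tokenisation, here is a
minimal Python sketch (not taken from the Quranic Arabic Corpus tools).
It counts the opening verse of the Quran once as whitespace-delimited
words and once as segments, using a hand-made segmentation in which "+"
marks an assumed clitic or article boundary. The function names and the
"+" convention are illustrative assumptions only.

```python
# Unvocalised text of Quran 1:1, with words separated by spaces.
verse = "بسم الله الرحمن الرحيم"

# Hand-segmented copy of the same verse. "+" marks an assumed boundary
# between a prefix (preposition or article) and its stem; this is an
# illustrative convention, not the output of any real analyser.
segmented = "ب+سم الله ال+رحمن ال+رحيم"


def count_whitespace_tokens(text):
    """Count tokens obtained by splitting on whitespace only."""
    return len(text.split())


def count_segments(text):
    """Count segments, treating each '+'-separated part as one unit."""
    return sum(len(word.split("+")) for word in text.split())


print(count_whitespace_tokens(verse))  # 4
print(count_segments(segmented))       # 7
```

The same four written words yield seven units once prefixes are split
off, so any corpus size is best quoted together with the tokenisation
used.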
Claudia Soria recommended the LRE Map
http://www.resourcebook.eu/LreMap/faces/views/resourceMap.xhtml
BUT the Arabic corpora there are all Modern Standard Arabic,
except for Qurany: A Tool to Search for Concepts in the Quran
http://quranytopics.appspot.com/
... and this is also limited to the Quran.
Gregory Crane's Perseus digital library of classical texts is mainly
Classical Greek and Latin, but there is a section labelled "Arabic"
- BUT currently this contains only the Quran, plus dictionaries.
The Perseus website does have a pointer to another source:
"Perseus also wants to highlight the release on Alpheios.net of key
texts in Classical Arabic, including Book of Songs, Arabian Nights,
Arabic Reading Lessons, The Autobiography Of The Constantinopolitan
Story-Teller, Selection from the Annals of Tabari, Selections from
Arabic geographical literature and Voyages D'Ibn Batutah ..."
BUT while Alpheios.net enables online reading, I am not sure
whether you can download a whole book as a corpus text file.
At the NITS'2011 National Information Technology Symposium on
"Arabic and Islamic Content on the Internet" at King Saud
University, Riyadh, Mansour Alghamdi outlined the KACST initiative
to collect a large Arabic corpus including Classical and Modern Arabic
http://nits2011.ksu.edu.sa/en/cap/CD/Keynote%20Speakers/Mansour%20Alghamdi.pdf
BUT I have not heard how far this has succeeded yet.
I believe the Kuwait government ministry of religious studies has plans
to put online its collection of Classical Arabic texts;
but again I have no news of progress on this.
If you get any better answers, please do let me know.
Eric Atwell, Leeds University
On Fri, 3 Feb 2012, Mai Zaki wrote:
> Dear all,
>
> I was wondering if you could advise me if there are any available corpora
> for Classical Written Arabic in any genre. I'm looking for a corpus of
> Written Arabic of any age between the Classical Arabic of the Qur'an and
> Modern Standard Arabic.
>
> Thanks a lot in advance,
>
> Mai Zaki
>
>
--
Eric Atwell, Senior Lecturer, Language Processing research group,
I-AIBS Institute for Artificial Intelligence and Biological Systems
School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
Leeds LS2 9JT, England. TEL: 0113-3435430 FAX: 0113-3435468
WWW: http://www.comp.leeds.ac.uk/eric
http://www.comp.leeds.ac.uk/nlp