[Corpora-List] Fwd: Arabic Corpus work in Python

Lisa Hesterberg lisahesterberg2013 at u.northwestern.edu
Mon Oct 12 18:21:10 UTC 2009


---------- Forwarded message ----------
From: Majdi Sawalha <maj_sawalha at yahoo.com>
Date: Mon, Oct 12, 2009 at 11:40 AM
Subject: Re: [Corpora-List] Arabic Corpus work in Python
To: Lisa Hesterberg <lisahesterberg2013 at u.northwestern.edu>


Hi Lisa,

I would suggest using Unicode (UTF-8) for reading and writing Arabic text in
Python. There is a UTF-8 copy of the CCA Arabic corpus which you can use. If
you mean typing Arabic words directly into the code in IDLE, this might not
work, and even if it works on one machine, it might cause problems on other
machines that do not support Arabic characters. So the best way is to use a
string of Unicode escapes instead, e.g. Alif is equivalent to u"\u0627". The
Arabic letters run from \u0621 to \u0652, including the short vowels.
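
To make that concrete, here is a rough sketch rather than a definitive recipe.
It assumes the range \u0621 to \u0652 mentioned above is the test you want, and
the names ALIF, WORD, is_arabic_letter, count_arabic_letters and
round_trip_utf8 are only illustrative. The temporary file stands in for a
corpus file; it is not the CCA file itself.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Escapes keep the source file pure ASCII, so it does not depend on
# the editor or the machine supporting Arabic characters.
ALIF = u"\u0627"
WORD = u"\u0643\u062a\u0627\u0628"  # kaf, ta, alif, ba


def is_arabic_letter(ch):
    """True if the single character ch lies in \\u0621..\\u0652."""
    return u"\u0621" <= ch <= u"\u0652"


def count_arabic_letters(text):
    """Count the characters of text that fall in the range above."""
    return sum(1 for ch in text if is_arabic_letter(ch))


def round_trip_utf8(text):
    """Write text to a temporary UTF-8 file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # io.open takes an explicit encoding, so the bytes on disk are
        # UTF-8 regardless of the platform default.
        with io.open(path, "w", encoding="utf-8") as out:
            out.write(text)
        with io.open(path, "r", encoding="utf-8") as src:
            return src.read()
    finally:
        os.remove(path)
```

For a real corpus file you would open it the same way, with io.open and
encoding="utf-8", and work on the Unicode strings it returns.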

I hope this helps,

Majdi

------------------------------
Majdi Sawalha
Faculty of Engineering
School of Computing
University of Leeds
Leeds, LS2 9JT
UK
http://www.comp.leeds.ac.uk/sawalha
------------------------------



 ------------------------------
From: Lisa Hesterberg <lisahesterberg2013 at u.northwestern.edu>
To: CORPORA at uib.no
Sent: Mon, October 12, 2009 4:49:49 PM
Subject: [Corpora-List] Arabic Corpus work in Python

Hi,

I'm currently working with Python on the CCA Arabic corpus, and IDLE is
giving me problems with the Arabic characters. Does anyone have any
experience working with Arabic in IDLE, or is there a better way to deal
with Arabic characters in Python? I would very much appreciate any help on
this matter.

Thanks,

Lisa Hesterberg
Department of Linguistics
Northwestern University