Arabic-L:LING:Vowelled Corpora responses

Dilworth Parkinson dilworth_parkinson at BYU.EDU
Mon Oct 8 17:22:13 UTC 2007


------------------------------------------------------------------------
Arabic-L: Mon 08 Oct 2007
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
            unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject:Vowelled Corpora response
2) Subject:Vowelled Corpora response
3) Subject:Vowelled Corpora response
4) Subject:Vowelled Corpora response

-------------------------Messages-----------------------------------
1)
Date: 08 Oct 2007
From:Hassan Gadalla <hgadalla at yahoo.com>
Subject:Vowelled Corpora response

Dear Alex Magidow,
The following site has a vowelled copy of the Quran:
http://www.quraat.com/Quran_emlaei.htm

Hassan Gadalla
Associate Professor of Linguistics
Girls' Faculty of Education at Al-Baha
Al-Baha University
Saudi Arabia

------------------------------------------------------------------------
2)
Date: 08 Oct 2007
From:"John Joseph Colangelo" <yaacolangelo at hotmail.com>
Subject:Vowelled Corpora response

Hello Alex,

I know that the Arabic language books used in primary education in
the Arab world are vocalized. Sometimes you might even find
academically demanding books, such as the Muqaddima of Ibn Khaldun
published by Al-Ola Book Shop in the UAE, that have the tashkeel.
Their address is:
Al-Ola Book Shop
P.O. Box: 4594
Sharjah, UAE
Tel. 0097165614459
Fax. 0097165613225

------------------------------------------------------------------------
3)
Date: 08 Oct 2007
From:"Mahmoud Elsayess" <melsayess at socal.rr.com>
Subject:Vowelled Corpora response

Dear Mr. Magidow,

Our website http://www.readverse.com/ has the entire Quran in Arabic
and 4 English translations by 4 different authors.  Our Quran database
has over 17,000 words with their roots.  I believe our free website
can be useful for your research.
Please take a look, and if you need help, drop me a line.

Happy Ramadan.
Mahmoud Elsayess

------------------------------------------------------------------------
4)
Date: 08 Oct 2007
From:Dil Parkinson <dil at byu.edu>
Subject:Vowelled Corpora response

The online corpus arabiCorpus.byu.edu has a vowelled version of the  
Quran.  The regular search engine on the main page strips all vowels  
before searching, since normal texts are either not vowelled or are  
unpredictably vowelled and this makes searching with vowels a  
nightmare (since the computer would consider, for example, each of  
the following to be entirely different: yktb, yaktb, yktub, yktbu,  
yakotb, yakotubu, yaktubu, etc, etc.).  However, if you click on  
advanced search and then search by hand, you can click on a box that  
allows you to search with vowels.  You have to remember, in this  
case, that if you search for ywm, for example, it will not find yawom  
(i.e. with the vowels).  You have to type in exactly what you are  
looking for.  Also, the morphological analysis the program does is  
less felicitous when using the vowels, so sometimes it is better to
handle it yourself with regular expressions, and use the 'String'  
category instead of any of the morphological categories.  As another
example, if you have the vowel box clicked and choose noun and type
yawom, it will find 0 hits, since you didn't type an ending vowel and
all examples of ywm have an ending vowel in the Quran.  String will give
you what you want.  Things can also get frustrating since this  
version of the Quran has some vowels in unexpected orders.  For  
example, if you type LyAm into the basic search (without vowels) you  
find that there are many examples in the Quran.  But then if you go  
into the vowelled search, you will find that both Lay~Am and Lay~aAm  
(with String) bring up 0 hits, since this version of the Quran  
typically types the short vowel BEFORE the shadda, so to get results  
you have to type: Laya~Am.  This is not easy to discover by oneself,  
and can get frustrating, but the capability is there if you need it.
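
To illustrate why the main search strips vowels first, here is a minimal
Python sketch.  It is not the arabiCorpus code; it only assumes a
Buckwalter-style transliteration in which a, i, u, o, ~, F, N and K are
the diacritic characters.  The names DIACRITICS and strip_vowels are
made up for this example.

```python
# Assumption: Buckwalter-style transliteration, where these characters mark
# short vowels (a, i, u), sukun (o), shadda (~) and tanwin (F, N, K).
DIACRITICS = set("aiuo~FNK")


def strip_vowels(word: str) -> str:
    """Drop every diacritic character, keeping only the consonantal skeleton."""
    return "".join(ch for ch in word if ch not in DIACRITICS)


# The partially vowelled spellings listed above all collapse to one form.
variants = ["yktb", "yaktb", "yktub", "yktbu", "yakotb", "yakotubu", "yaktubu"]
skeletons = {strip_vowels(v) for v in variants}
print(skeletons)  # {'yktb'}
```

Once both the query and the text are reduced this way, yawom and ywm
compare equal, which is why the unvowelled search finds hits that the
exact vowelled search misses.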
With regular expressions you CAN handle the possible variation of
vowel endings and vowel presence or absence, but you have to be a bit
clever: for example, yawom[uiaUIN] allows for all the possible
endings, and ya?wo?m\w? allows for any or no vowels.  After doing
any search, always click on "word forms" first, to see if it indeed  
found what you intended.  This can help you learn to refine your  
regular expressions until they are giving you what you are expecting.
dil
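
The two patterns from the message can be tried out offline before
running them against the corpus.  The sketch below uses Python's re
module as a stand-in; the regular expression dialect of the arabiCorpus
search engine may differ.  The third pattern, which accepts the short
vowel on either side of the shadda, is a hypothetical addition and does
not come from the message.

```python
import re

# Patterns quoted in the message.
ENDINGS = re.compile(r"yawom[uiaUIN]")   # fully vowelled stem plus one ending
ANY_VOWELS = re.compile(r"ya?wo?m\w?")   # each vowel mark and the ending optional

# Hypothetical extra pattern: short vowel before, after, or missing at the shadda.
SHADDA_ORDER = re.compile(r"Lay(?:a~|~a?)Am")


def matches(pattern: re.Pattern, word: str) -> bool:
    """Return True only if the whole word matches the pattern."""
    return pattern.fullmatch(word) is not None


for word in ["ywm", "yawom", "yawomu", "yawoma"]:
    print(word, matches(ENDINGS, word), matches(ANY_VOWELS, word))
```

Checking candidate word forms like this serves the same purpose as
clicking on "word forms" after a search: it shows whether a pattern
accepts what you intended before you rely on its hit counts.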


------------------------------------------------------------------------
End of Arabic-L:  08 Oct 2007


