<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><span class="Apple-style-span" style="font-size: 12px; white-space: pre; ">------------------------------------------------------------------------
</span><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Arabic-L: Wed 16 Sep 2009</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Moderator: Dilworth Parkinson <<a href="mailto:dilworth_parkinson@byu.edu">dilworth_parkinson@byu.edu</a>></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">[To post messages to the list, send them to <a href="mailto:arabic-l@byu.edu">arabic-l@byu.edu</a>]</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">[To unsubscribe, send message from same address you subscribed from to</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><a href="mailto:listserv@byu.edu">listserv@byu.edu</a> with first line reading:</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "> unsubscribe arabic-l ]</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">-------------------------Directory------------------------------------</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">1) Subject:New version of Quran Morphology Website</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">-------------------------Messages-----------------------------------</div><div style="margin-top: 0px; margin-right: 0px; 
margin-bottom: 0px; margin-left: 0px; ">1)</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Date: 16 Sep 2009</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">From:<a href="mailto:dukes.kais@googlemail.com">dukes.kais@googlemail.com</a></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Subject:New version of Quran Morphology Website</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><span class="Apple-style-span" style="font-size: medium; "><p>Hello,</p><p>Apologies for the mass email. Hopefully an e-mail list for this project will soon be set up, so that you can subscribe or unsubscribe depending on whether you remain interested in this project.</p><p>I have uploaded a new version of the website <a href="http://quran.uk.net/">http://quran.uk.net</a>. There has been a lot of good feedback about the work being done, and I have tried to respond to this by adding new features to the morphologically annotated corpus of the Quran. I am now continuing this research under the supervision of Eric Atwell at the University of Leeds, so the website now includes my School of Computing email as a contact address. On busy days we get a few hundred visitors to the website, and this has been growing over time.</p><p>(1) <strong>Corrections to the morphological and syntactic annotation of the Quran.</strong> Over the last few months, corrections were suggested to nearly 1000 words (there are 77,430 words in the Quran). As a result of these suggestions, accuracy is now much improved on key passages of the text. 
I have gone through and reviewed each of these suggestions by comparing against traditional sources, and I have approved about half of them. The other half were mostly subtle tagging issues that it may be hard to find agreement on. For example, adjectives as predicates versus nouns, and nouns versus proper nouns. In classical Arabic these distinctions do not usually change meaning, nor are the differences critical for further syntactic analysis of the text. I plan to improve the annotator guidelines to cover these cases. For now, I have enforced consistency by reviewing each change made to the tagged corpus.</p><p>(2) <strong>Arabic terminology on the website.</strong> This is the biggest change to the site. Hopefully including Arabic terms for the morphological analysis will attract a bigger audience. I am generating the Arabic analysis automatically from the morphological features. So if these are wrong in any way, it would be great if annotators could let me know before I make a public announcement about the new version of the website via the mailing lists.</p><p>(3) <strong>Improvements to the segmentation scheme.</strong> Previously only attached object pronouns were segmented (maf3ul bihi). Now subject pronouns (fa3il) are also segmented, and the morphemes are shown in blue. This is to keep the analysis more in line with traditional Arabic grammar. This segmentation has been performed automatically according to traditional inflection rules, so I believe this should be quite accurate. Annotators are welcome to review this new change.</p><p>(4) <strong>Audio recitation of the Quran.</strong> Some volunteers requested that a feature be added whereby they can listen to each verse being recited by an authentic source. My guess for why this is useful is perhaps that the tone of voice makes disambiguation easier. For each verse, you can now click the play button (at the bottom of the page) and hear the verse in Arabic. 
Please allow time for the audio to load on slow connections.</p><p>(5) <strong>A link to the new treebank project.</strong> This has been included on the main page. The idea here is that we want to attract more volunteers to help with the syntactic analysis of the Quran.</p><p>(6) <strong>The root list has been reviewed and improved.</strong> We now have an accurate root for each word in the Quran. The Buckwalter analyzer used to provide the initial tagging did not provide roots, only stems. However, I have managed to get hold of a more accurate root list than before.</p><p>(7) <strong>Changes to the part-of-speech tags.</strong> The adverb tag (ADV) has been removed. Instead, there are two new tags (LOC and T) for location and time adverbs. This is to keep the tagging more aligned with traditional Arabic grammar. In Arabic, these are tags for Dharf Makan and Dharf Zaman. If used as adverbs, these words will always be in the accusative case.</p><p>(8) <strong>Changes to preposition tagging.</strong> The preposition tag (P) is now only used for Harf Jar (genitive prepositions), so that the P tag now agrees 100% with traditional Arabic grammar. Words previously tagged as P are now either nouns (N) or time/location adverbs (T/LOC) depending on context. The idea behind this change was that traditional Arabic grammar defines a set of prepositions (harf jar) and we were not previously using this list, so we used to mistake T/LOC adverbs (Dharf Makan/Zaman) for prepositions.</p><p>(9) <strong>Proper nouns.</strong> The list of proper nouns has been extended. For example, Satan and Quran are now considered to be proper nouns.</p><p>(10) <strong>Entries in the dictionary are now sorted by verb form.</strong> The lexicon page shows words in the Quran grouped by root. The words are now subdivided according to form I, form II, form III, etc.</p><p>Outstanding work...</p><p>Here is a list of other good suggestions that never made it into this release. 
Hopefully they will be included in the next version.</p><p>(1) Changes to feminine/masculine. There have been quite a few suggestions that we change the gender of various words. This needs to be reviewed.</p><p>(2) We should show root counts in the dictionary. This will help with manual verification against published root lists of the Quran.</p><p>(3) We should consider showing the pattern of each word, as well as the root. The Buckwalter analyzer used to produce the initial tagging didn’t give us patterns. However, now that we have accurate roots and words, it may be possible to derive the patterns automatically. One idea would be to use regular expressions.</p><p>(4) Linguistic search tool. Similar to the search tool for the British National Corpus, we should be able to search by word, part-of-speech tag and proximity, e.g. 20 words away from another word.</p><p>(5) Translations. We should include multiple English translations and allow searches over them.</p><p>(6) A testimonials page, listing the positive feedback to the project. This might encourage other interested volunteers to join.</p><p>(7) For each word, give links to existing Arabic lexicons showing the analysis of the word. This might speed up annotation and corrections.</p><div>Any feedback is welcome! If you are interested in volunteering for the morphology or treebank projects, do let me know.</div><div> </div><div>Kind Regards,</div><div> </div><div>-- Kais Dukes</div><div>School of Computing</div><div>University of Leeds</div><div><br></div><div><br></div></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">--------------------------------------------------------------------------</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">End of Arabic-L: 16 Sep 2009</div></div></body></html>