<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@SimSun";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
text-align:center;
font-size:10.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
mso-add-space:auto;
text-align:center;
font-size:10.0pt;
font-family:"Times New Roman","serif";}
p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst
{mso-style-priority:34;
mso-style-type:export-only;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
mso-add-space:auto;
text-align:center;
font-size:10.0pt;
font-family:"Times New Roman","serif";}
p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle
{mso-style-priority:34;
mso-style-type:export-only;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
mso-add-space:auto;
text-align:center;
font-size:10.0pt;
font-family:"Times New Roman","serif";}
p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast
{mso-style-priority:34;
mso-style-type:export-only;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
mso-add-space:auto;
text-align:center;
font-size:10.0pt;
font-family:"Times New Roman","serif";}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:18238759;
mso-list-type:hybrid;
mso-list-template-ids:593685446 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link="#0563C1" vlink="#954F72"><div class=WordSection1><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal><b><span style='font-size:14.0pt'>Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools<o:p></o:p></span></b></p><p class=MsoNormal style='text-align:justify'><b><span style='font-size:14.0pt'><o:p> </o:p></span></b></p><p class=MsoNormal style='text-align:justify'><b><span style='font-size:14.0pt'>Workshop URL: </span></b><a href="http://www.kacstac.org.sa/osact/index.html"><b><span style='font-size:14.0pt'>http://www.kacstac.org.sa/osact/index.html</span></b></a><b><span style='font-size:14.0pt'> <o:p></o:p></span></b></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;color:red'><o:p> </o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'>Workshop description <o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal style='text-align:justify'><span style='font-size:11.0pt'>The Natural Language Processing (NLP) and Computational Linguistics (CL) communities have long regarded Arabic as a resource-poor language. This was thought to be the reason for the lack of corpus-based studies of Arabic. However, recent years have witnessed the emergence of a considerable number of new, freely available Arabic corpora and, to a lesser extent, Arabic corpora processing tools. <o:p></o:p></span></p><p class=MsoNormal style='text-align:justify'><span style='font-size:11.0pt'><o:p> </o:p></span></p><p class=MsoNormal style='text-align:justify'><span style='font-size:11.0pt'>Freely available Arabic corpora can be divided into two groups. 
The first group contains large Arabic corpora designed and constructed primarily for Arabic linguistics research and related activities, and possibly for Arabic NLP. These corpora cover diverse genres, and their sizes range from one million to 700 million words. The second group contains corpora designed primarily for Arabic text classification and clustering; they consist mainly of newspaper articles and range from less than 1 million to 11 million words. <o:p></o:p></span></p><p class=MsoNormal style='text-align:justify'><span style='font-size:11.0pt'><o:p> </o:p></span></p><p class=MsoNormal style='text-align:justify'><span style='font-size:11.0pt'>Some Arabic corpora, mostly the large ones, can be explored on the web using different tools, while others are available only for download. For downloadable corpora, the user may need standalone corpus processing tools. These tools offer many functions, such as word frequency lists, concordancing, and collocation analysis. Yet despite the availability of large and diverse Arabic corpora, the situation has not changed: there is still a lack of Arabic corpus-based studies. Is this because of the representativeness of these corpora? Because of the functions and tools associated with them? Or is it because they are not well enough known in the Arabic linguistics community? 
<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'>Motivation and topics of interest<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal style='text-align:justify'>This half-day workshop aims to encourage researchers and developers to make greater use of freely available Arabic corpora and open-source Arabic corpora processing tools, to help highlight the drawbacks of these resources, and to discuss techniques and approaches for improving them<span style='font-size:11.0pt'>. The workshop topics include, but are not limited to:<o:p></o:p></span></p><p class=MsoListParagraphCxSpFirst style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>1.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt'>Surveying and critiquing the design of freely available Arabic corpora, their associated tools, and standalone Arabic corpora processing tools.<o:p></o:p></span></p><p class=MsoListParagraphCxSpMiddle style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>2.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt'>The applications and uses of freely available Arabic language resources in fields such as Arabic language education, e.g. 
L1 and L2.<o:p></o:p></span></p><p class=MsoListParagraphCxSpMiddle style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>3.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt;color:black;background:white'>Arabic language modeling.</span><span style='font-size:11.0pt'><o:p></o:p></span></p><p class=MsoListParagraphCxSpLast style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>4.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt;color:black;background:white'>Corpus-based Arabic lexicography.</span><span style='font-size:11.0pt'><o:p></o:p></span></p><ol start=5 type=1><li class=MsoNormal style='color:black;mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-align:left;mso-list:l0 level1 lfo1;background:white'><span style='font-size:11.0pt'>Lexical semantics and word sense.<o:p></o:p></span></li></ol><p class=MsoListParagraphCxSpFirst style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>6.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt'>Corpus-based Arabic syntax.<o:p></o:p></span></p><p class=MsoListParagraphCxSpMiddle style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>7.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt'>Corpus-based Arabic morphology.<o:p></o:p></span></p><p class=MsoListParagraphCxSpMiddle style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 
lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>8.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt'>Development of Arabic mobile applications based on the available Arabic language resources.<o:p></o:p></span></p><p class=MsoListParagraphCxSpMiddle style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>9.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt'>Evaluation and assessment of </span><span style='font-size:11.0pt'>Arabic Corpora and Corpora Processing Tools</span><span style='font-size:11.0pt'>.</span><span style='font-size:11.0pt'><o:p></o:p></span></p><p class=MsoListParagraphCxSpLast style='text-align:justify;text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt'><span style='mso-list:Ignore'>10.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span dir=LTR></span><span style='font-size:11.0pt'>Future directions of Free/Open </span><span style='font-size:11.0pt'>Arabic Corpora and Corpora Processing Tools</span><span style='font-size:11.0pt'>.</span><span style='font-size:11.0pt'><o:p></o:p></span></p><p class=MsoNormal style='text-align:justify'><span style='font-size:11.0pt'><o:p> </o:p></span></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'>Important Dates<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;color:black'>Submission deadline: 10 February 2014<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span 
style='font-size:11.0pt;color:black'>Notification of acceptance: 10 March 2014<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;color:black'>Final submission of manuscripts: 21 March 2014<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;color:black'>Workshop date: 27 May 2014 (</span><span style='font-size:11.0pt;color:#282828;background:white'>morning session</span><span style='font-size:11.0pt;color:black'>) </span><span style='font-size:11.0pt'><o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'>Submission guidelines<o:p></o:p></span></p><p class=MsoNormal style='text-align:justify'><span style='font-size:11.0pt'>The language of the workshop is English, and submissions should follow the LREC 2014 paper submission instructions. All papers will be peer reviewed, possibly by three independent referees. Papers must be submitted electronically in PDF format to the START system. When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. 
Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.), to enable their reuse, replicability of experiments, including evaluation ones, etc.<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;color:#282828'><br></span><span style='font-size:12.0pt;color:red'>Organising Committee<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'><o:p> </o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt'>Hend Al-Khalifa, King Saud University, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt'>Abdulmohsen Al-Thubaity, King Abdul Aziz City for Science and Technology, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'><o:p> </o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'>Program Committee<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:12.0pt;color:red'><o:p> </o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt'>Eric Atwell, University of Leeds, UK <o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Khaled Shaalan, The British University in Dubai (BUiD), UAE <o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Dilworth Parkinson, Brigham Young University, USA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><em><span style='font-size:11.0pt;background:white'>Nizar Habash</span></em><span style='font-size:11.0pt;background:white'>, Columbia University, USA <o:p></o:p></span></p><p class=MsoNormal align=left 
style='text-align:left'><span style='font-size:11.0pt;background:white'>Khurshid Ahmad, Trinity College Dublin, Ireland<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Abdulmalik AlSalman, </span><span style='font-size:12.0pt'>King Saud University, KSA</span><span style='font-size:11.0pt;background:white'> <o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Maha Alrabiah, </span><span style='font-size:12.0pt'>King Saud University</span><span style='font-size:11.0pt;background:white'>, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Saleh Alosaimi, Imam University, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Sultan almujaiwel, </span><span style='font-size:12.0pt'>King Saud University</span><span style='font-size:11.0pt;background:white'>, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Adam Kilgarriff, Lexical Computing Ltd<span class=apple-converted-space>, UK </span><o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Amal AlSaif, Imam University, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Maha AlYahya, </span><span style='font-size:12.0pt'>King Saud University</span><span style='font-size:11.0pt;background:white'>, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Auhood AlFaries, </span><span style='font-size:12.0pt'>King Saud University</span><span style='font-size:11.0pt;background:white'>, KSA<o:p></o:p></span></p><p class=MsoNormal align=left 
style='text-align:left'><span style='font-size:11.0pt;background:white'>Salwa Hamada, Taibah University, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;background:white'>Mansour Algamdi, </span><span style='font-size:12.0pt'>King Abdul Aziz City for Science and Technology</span><span style='font-size:11.0pt;background:white'>, KSA<o:p></o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt'>Abdullah Alfaifi, </span><span style='font-size:11.0pt'>University of Leeds<span style='background:white'>, UK<o:p></o:p></span></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;color:#444444;background:white'><o:p> </o:p></span></p><p class=MsoNormal align=left style='text-align:left'><span style='font-size:11.0pt;color:#444444;background:white'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'><o:p> </o:p></span></p></div>
</body></html>