<div dir="ltr"><div>=======================================================</div><div><br></div><div>First Call for Papers and Participation</div><div>EMNLP Workshop on Arabic Natural Language Processing </div><div>Including Shared Task on Automatic Arabic Error Correction</div>

<div><br></div><div>      Apologies for multiple postings</div><div>      Please distribute to colleagues</div><div><br></div><div>=======================================================</div>

<div><br></div><div>Arabic Natural Language Processing Workshop</div><div>collocated with EMNLP 2014, Doha, Qatar </div><div><br></div><div>Workshop date: Saturday October 25, 2014</div><div>Paper submission deadline: July 26, 2014</div>

<div>Shared task registration deadline: July 1, 2014</div><div><br></div><div>=======================================================</div><div><br></div><div>====================</div><div>WORKSHOP DESCRIPTION</div><div>

====================</div><div><br></div><div>There has been a lot of progress in the last 15 years in the area of</div><div>Arabic Natural Language Processing (NLP).  Many Arabic NLP (or Arabic</div><div>NLP-related) workshops and conferences have taken place, both in the</div>

<div>Arab World and in association with international conferences, e.g.,</div><div>the conference on Arabic Language Resources and Tools (MEDAR-2009,</div><div>NEMLAR-2004), the workshop on Computational Approaches to Semitic</div>

<div>Languages (LREC 2010, EACL 2009, ACL 2007, ACL 2005, ACL 2002, ACL</div><div>1998), the workshop on Computational Approaches to Arabic Script-based</div><div>Languages (MTSummit XII 2009, LSA 2007, COLING 2004), the</div>

<div>International Symposium on Computer and Arabic Language (ISCAL 2009,</div><div>ISCAL 2007), the Colloque International sur le Traitement Automatique</div><div>de la Langue Arabe (CITALA 2007), the International Symposium on</div>

<div>Processing of Arabic (Tunisia 2002), the workshop on Arabic Language</div><div>Resources and Evaluation (LREC 2002), and the workshop on Arabic</div><div>Language Processing (ACL 2001), among others. This workshop</div>

<div>follows in the footsteps of these efforts to provide a forum for</div><div>researchers to share and discuss their ongoing work. This workshop is</div><div>timely given the continued rise in research projects focusing on</div>

<div>Arabic NLP in the Arab World and the West.</div><div><br></div><div>We invite submissions on topics that include, but are not limited to,</div><div>the following:</div><div><br></div><div>* Basic core technologies: morphological analysis, disambiguation,</div>

<div>  tokenization, POS tagging, named entity detection, chunking,</div><div>  parsing, semantic role labeling, sentiment analysis, Arabic dialect</div><div>  modeling, etc.</div><div><br></div><div>* Applications: machine translation, speech recognition, speech</div>

<div>  synthesis, optical character recognition, pedagogy, assistive</div><div>  technologies, social media, etc.</div><div><br></div><div>* Resources: dictionaries, annotated data, specialized databases, etc.</div><div><span class="" style="white-space:pre">      </span></div>

<div>Submissions may include work in progress as well as finished work.</div><div>Submissions must have a clear focus on specific issues pertaining to</div><div>the Arabic language whether it is standard Arabic, dialectal, or</div>

<div>mixed. Descriptions of commercial systems are welcome, but authors</div><div>should be willing to discuss the details of their work.  Submissions</div><div>are expected to be 8 pages long plus 2 pages for references.</div>

<div>Associated with the workshop will be a shared task on Arabic text</div><div>error correction (details below).</div><div><br></div><div>===========</div><div>SHARED TASK</div><div>===========</div><div><br></div><div>

As part of the Arabic Natural Language Processing Workshop at EMNLP</div><div>2014 (to be held in Doha, Qatar), we will conduct a shared task on</div><div>Automatic Arabic Error Correction. We designed this task in the</div>

<div>tradition of high-profile shared tasks in natural language processing</div><div>such as CoNLL's grammar/error detection and correction shared tasks in</div><div>2011-2013 and numerous machine translation campaigns by</div>

<div>NIST/WMT/MEDAR, among others.  The task relies on resources created</div><div>under the Qatar Arabic Language Bank (QALB) project (currently over 1M</div><div>words of manually corrected Arabic text).  A participating system in</div>

<div>this shared task will be given Modern Standard Arabic texts, which are</div><div>to be automatically corrected. The input will be provided in</div><div>Arabic script and in a standard Romanization scheme, and will be</div>

<div>annotated for part-of-speech (in three different granularities),</div><div>clitics (which appear in 20% of Arabic words), lemmas, English</div><div>glosses, and dependency tree relations.  All of the input text will be</div>

<div>preprocessed in a common way to make sure all participants have access</div><div>to all of these features at no additional overhead. An</div><div>XML format will be used to encode all of this information.  A</div>

<div>participating system then returns a corrected version of the Arabic</div><div>text that is one sentence per line in an XML format.  The task is</div><div>focused on correction as opposed to identification. There will not be</div>

<div>an error identification task per se.  Participants need to register.</div><div>Once registered, all participating teams will be provided with a</div><div>common training data set, which includes common preprocessed input and</div>

<div>corrected output. A common development set will also be provided. A</div><div>blind test data set will be used to evaluate the output of the</div><div>participating teams. An evaluation script will be provided to all the</div>

<div>teams.  Participants are expected to author a short paper (4 pages + 2</div><div>for references) describing their approach, resources, and experiments.</div><div>The paper must follow the standard EMNLP format.</div>

<div><br></div><div>===============</div><div>IMPORTANT DATES</div><div>===============</div><div><br></div><div>Shared task registration period: April 8, 2014 through July 1, 2014</div><div>Shared task test release: July 7, 2014</div>

<div>Shared task system output collection: July 18, 2014</div><div>Submission deadline (Workshop and shared task papers): July 26, 2014</div><div>Author notification: August 26, 2014</div><div>Camera Ready: September 15, 2014</div>

<div>Workshop:<span class="" style="white-space:pre">   </span>October 25, 2014 </div><div><br></div><div>==========</div><div>ORGANIZERS</div><div>==========</div><div><br></div><div>Program Co-chairs</div><div>Nizar Habash, Columbia University</div>

<div>Stephan Vogel, Qatar Computing Research Institute</div><div><br></div><div>Publication Co-chairs</div><div>Nadi Tomeh, Paris 13 University</div><div>Houda Bouamor, Carnegie Mellon University Qatar</div><div><br></div>

<div>Website Committee</div><div>Kareem Darwish, Qatar Computing Research Institute</div><div>Noura Farra, Columbia University </div><div><br></div><div>Shared Task Committee</div><div>Behrang Mohit, Carnegie Mellon University Qatar</div>

<div>Alla Rozovskaya, Columbia University</div><div>Wajdi Zaghouani, Carnegie Mellon University Qatar </div><div>Ossama Obeid, Carnegie Mellon University Qatar</div><div>Nizar Habash, Columbia University (advisory)</div><div>

<br></div><div>Program Committee Members </div><div>(TBA in Second Call)</div><div><br></div></div>