Call: EMNLP Workshop on Arabic Natural Language Processing & Shared Task on Automatic Arabic Error Correction

Wed Mar 26 21:19:30 UTC 2014

Date: Tue, 25 Mar 2014 23:07:53 -0700 (PDT)
From: Wajdi Zaghouani <wajdiuqam at yahoo.com>
Message-ID: <1395814073.7190.YahooMailNeo at web121701.mail.ne1.yahoo.com>

=======================================================

First Call for Papers and Participation
EMNLP Workshop on Arabic Natural Language Processing
Including Shared Task on Automatic Arabic Error Correction

Apologies for multiple postings
Please distribute to colleagues

=======================================================

Arabic Natural Language Processing Workshop
co-located with EMNLP 2014, Doha, Qatar

Workshop date: Saturday October 25, 2014
Paper submission deadline: July 26, 2014
Shared task registration deadline: July 1, 2014

=======================================================

====================
WORKSHOP DESCRIPTION
====================

There has been a lot of progress in the last 15 years in the area of
Arabic Natural Language Processing (NLP). Many Arabic NLP (or Arabic
NLP-related) workshops and conferences have taken place, both in the
Arab World and in association with international conferences, e.g., the
conference on Arabic Language Resources and Tools (MEDAR-2009,
NEMLAR-2004), the workshop on Computational Approaches to Semitic
Languages (LREC 2010, EACL 2009, ACL 2007, ACL 2005, ACL 2002, ACL
1998), the workshop on Computational Approaches to Arabic Script-based
Languages (MTSummit XII 2009, LSA 2007, COLING 2004), the International
Symposium on Computer and Arabic Language (ISCAL 2009, ISCAL 2007), the
Colloque International sur le Traitement Automatique de la Langue Arabe
(CITALA 2007), the International Symposium on Processing of Arabic
(Tunisia 2002), the workshop on Arabic Language Resources and Evaluation
(LREC 2002), and the workshop on Arabic Language Processing (ACL 2001),
among others. This workshop follows in the footsteps of these
efforts to provide a forum for researchers to share and discuss their
ongoing work. This workshop is timely given the continued rise in
research projects focusing on Arabic NLP in the Arab World and the West.

We invite submissions on topics that include, but are not limited to,
the following:

* Basic core technologies: morphological analysis, disambiguation,
  tokenization, POS tagging, named entity detection, chunking,
  parsing, semantic role labeling, sentiment analysis, Arabic dialect
  modeling, etc.

* Applications: machine translation, speech recognition, speech
  synthesis, optical character recognition, pedagogy, assistive
  technologies, social media, etc.

* Resources: dictionaries, annotated data, specialized databases, etc.

Submissions may include work in progress as well as finished work.
Submissions must have a clear focus on specific issues pertaining to the
Arabic language whether it is standard Arabic, dialectal, or
mixed. Descriptions of commercial systems are welcome, but authors
should be willing to discuss the details of their work.  Submissions are
expected to be 8 pages long plus 2 pages for references.  Associated
with the workshop will be a shared task on Arabic text error correction
(details below).

===========
SHARED TASK
===========

As part of the Arabic Natural Language Processing Workshop at EMNLP 2014
(to be held in Doha, Qatar), we will conduct a shared task on Automatic
Arabic Error Correction. We designed this task in the tradition of
high-profile shared tasks in natural language processing, such as CoNLL's
grammar/error detection and correction shared tasks in 2011-2013 and
numerous machine translation campaigns by NIST/WMT/MEDAR, among others.
The task relies on resources created under the Qatar Arabic Language
Bank (QALB) project (currently over 1M words of manually corrected
Arabic text).  A participating system in this shared task will be given
Modern Standard Arabic texts, which are to be automatically
corrected. The input will be provided in Arabic script and in a
standard Romanization scheme, and will be annotated for part-of-speech
(in three different granularities), clitics (which appear in 20% of
Arabic words), lemmas, English glosses, and dependency tree relations.
All of the input text will be preprocessed in a common way to make sure
all participants have access to all of these features at no additional
overhead cost. An XML format will be used to encode all of this
information.  A participating system then returns a corrected version of
the Arabic text, one sentence per line, in an XML format.  The
task is focused on correction as opposed to identification. There will
not be an error identification task per se.  Participants need to
register.  Once registered, all participating teams will be provided
with a common training data set, which includes common preprocessed
input and corrected output. A common development set will also be
provided. A blind test data set will be used to evaluate the output of
the participating teams. An evaluation script will be provided to all
the teams.  Participants are expected to author a short paper (4 pages +
2 for references) describing their approach, resources and experiments.
The paper needs to follow the standard format of the EMNLP conference.

===============
IMPORTANT DATES
===============

Shared task registration period: April 8, 2014 through July 1, 2014
Shared task test release: July 7, 2014
Shared task system output collection: July 18, 2014
Submission deadline (Workshop and shared task papers): July 26, 2014
Author notification: August 26, 2014
Camera Ready: September 15, 2014
Workshop: October 25, 2014

==========
ORGANIZERS
==========

Program Co-chairs
Nizar Habash, Columbia University
Stephan Vogel, Qatar Computing Research Institute

Publication Co-chairs
Nadi Tomeh, Paris 13 University
Houda Bouamor, Carnegie Mellon University Qatar

Website Committee
Kareem Darwish, Qatar Computing Research Institute
Noura Farra, Columbia University 

Shared Task Committee
Behrang Mohit, Carnegie Mellon University Qatar
Alla Rozovskaya, Columbia University
Wajdi Zaghouani, Carnegie Mellon University Qatar 
Ossama Obeid, Carnegie Mellon University Qatar
Nizar Habash, Columbia University (advisory)

Program Committee Members 
(TBA in Second Call)

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/

ATALA declines all responsibility for the content of
messages distributed on the LN list
-------------------------------------------------------------------------