[Corpora-List] Last Call for System Participation: Shared Task on Automatic Arabic Error Correction (Registration Deadline: July 1st)

Behrang Mohit i_am_behrang at yahoo.com
Sat Jun 14 13:27:49 UTC 2014


=======================================================
Last Call for System Participation
Shared Task on Automatic Arabic Error Correction
In conjunction with EMNLP Workshop on Arabic Natural Language Processing 


      Apologies for multiple postings
      Please distribute to colleagues

=======================================================

Shared Task on Automatic Arabic Error Correction
co-located with EMNLP 2014, Doha, Qatar

Registration deadline: July 1, 2014
System test period: July 7-18, 2014
Workshop date: Saturday October 25, 2014

Shared Task Website:
http://www.emnlp2014.org/workshops/anlp/shared_task.html 

=======================================================
Shared Task Description

As part of the Arabic Natural Language Processing Workshop at EMNLP
2014, we will conduct a shared task on Automatic Arabic Error
Correction. We designed this task in the tradition of high-profile
shared tasks in natural language processing such as CoNLL's
grammar/error detection and correction shared tasks in 2011-2013 and
numerous machine translation campaigns by NIST/WMT/MEDAR, among
others.  The task relies on resources created under the Qatar Arabic
Language Bank (QALB) project (currently over 1M words of manually
corrected Arabic text).

A participating system in this shared task will be given Modern
Standard Arabic texts, which are to be automatically corrected. The
input will be provided in Arabic script, and will be annotated for
part-of-speech (in different granularities), inflectional features,
clitics (which appear in 20% of Arabic words), lemmas, and English
glosses.  All of the input text will be preprocessed in a common way
so that all participants have access to all of these features at no
additional overhead.  We follow the file format and evaluation
framework used by the CoNLL shared tasks on error correction. The task
is focused on correction as opposed to identification. There will not
be an error identification task per se.
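
For teams new to the CoNLL error correction setup, here is a rough
sketch of how gold annotations are typically read. It assumes the
plain-text "M2" layout used by the CoNLL error correction scorer, where
a line starting with "S" holds a tokenized sentence and each following
line starting with "A" holds one edit (token span, error type,
correction, then further fields). The sample sentence is English for
readability, the error type labels are placeholders, and the function
names are our own. The released data and evaluation script remain the
authority on the exact format.

```python
from dataclasses import dataclass, field

# Illustrative sample only: English text, placeholder error type labels.
SAMPLE = """S This are a a sentence .
A 1 2|||TypeA|||is|||REQUIRED|||-NONE-|||0
A 3 4|||TypeB||||||REQUIRED|||-NONE-|||0
"""

@dataclass
class Edit:
    start: int        # index of the first token covered by the edit
    end: int          # index one past the last covered token
    error_type: str
    correction: str   # empty string means the span is deleted

@dataclass
class Sentence:
    tokens: list
    edits: list = field(default_factory=list)

def parse_m2(text):
    """Parse M2-style text into Sentence objects (hypothetical helper)."""
    sentences = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("S "):
            current = Sentence(tokens=line[2:].split())
            sentences.append(current)
        elif line.startswith("A ") and current is not None:
            fields = line[2:].split("|||")
            start, end = (int(x) for x in fields[0].split())
            if start < 0:
                continue  # "no change" annotations carry a negative span
            current.edits.append(Edit(start, end, fields[1], fields[2]))
    return sentences

def apply_edits(sentence):
    """Return the corrected token list for one sentence."""
    tokens = list(sentence.tokens)
    # Apply edits right to left so earlier token offsets stay valid.
    for e in sorted(sentence.edits, key=lambda e: (e.start, e.end), reverse=True):
        tokens[e.start:e.end] = e.correction.split()
    return tokens

if __name__ == "__main__":
    for sent in parse_m2(SAMPLE):
        print(" ".join(apply_edits(sent)))  # prints: This is a sentence .
```

The sketch assumes a single annotator and non-overlapping edits; it is
meant for inspecting data, not as a replacement for the evaluation
script.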

Participants need to register.  The registration link is on the Shared
Task Website (see above).  Once registered, all participating teams
will be provided with a common training data set (about 1 million
words), which includes common preprocessed input and corrected output.
A common development set will also be provided.  A blind test data set
will be used to evaluate the output of the participating teams.  An
evaluation script will be provided to all the teams.  Each
participating team can submit up to three systems.  Participants are
welcome to use additional resources and tools that are not part of the
released data set. However, all such additions must be fully
disclosed.

All registered participants will receive an email message on July 7,
2014 with specific instructions on how to download the test set and
how to submit their automatically corrected output. The information
will also be available in the shared task discussion group
(https://groups.google.com/forum/#!forum/qalb-shared-task).

Participants are expected to author a short paper (4 pages + 2 for
references) describing their approach, resources and experiments.  The
paper needs to follow the standard EMNLP conference format.

DISCUSSION GROUP
The following discussion group has been created and is used for all announcements and discussions related to the shared task.
https://groups.google.com/forum/#!forum/qalb-shared-task 

Participants are encouraged to subscribe and follow the discussion group.

IMPORTANT DATES

Shared task registration period: April 8, 2014 through July 1, 2014
Shared task test release:  July 7, 2014
Shared task system output collection: July 18, 2014
Submission deadline for system description papers: July 26, 2014
Author notification: August 26, 2014
Camera Ready: September 15, 2014
Arabic NLP Workshop: October 25, 2014 

ORGANIZERS
Behrang Mohit (co-chair), Carnegie Mellon University Qatar
Alla Rozovskaya (co-chair), Columbia University
Wajdi Zaghouani, Carnegie Mellon University Qatar 
Ossama Obeid, Carnegie Mellon University Qatar
Nizar Habash (advisor), Columbia University 



