Call: Workshop on Arabic dialect processing at ICCTIM2014
Thierry Hamon
hamon at LIMSI.FR
Wed Jan 29 09:37:00 UTC 2014
Date: Sun, 26 Jan 2014 22:41:10 +0100
From: smaili <Kamel.Smaili at loria.fr>
Message-ID: <52E580F6.9090700 at loria.fr>
X-url: http://sdiwc.net/conferences/2014/icctim2014/workshops/
* Workshop at **ICCTIM2014**, Dubai, April 10, 2014 *
http://sdiwc.net/conferences/2014/icctim2014/workshops/
Title: Arabic natural dialect processing
Summary
Modern Standard Arabic (MSA) is the language of more than 250 million
people. It is used mainly in writing and in formal
speech. Unfortunately, most Arab people do not use MSA in their
daily conversations; as a result, different Arabic dialects are
spoken across more than twenty countries. In fact, MSA is not acquired
as a mother tongue; rather, it is learned as a second language at
school and through exposure to formal broadcast programs (such as the
daily news), religious practice, and newspapers. Spoken Arabic is often
referred to as colloquial Arabic, dialects, or vernaculars. It is a mixed
form with many variations, often strongly influenced by
local languages (from before the introduction of Arabic) and by the
languages of the countries which occupied the Arabic region. Differences
between the variants of spoken Arabic can be large enough to
make them incomprehensible to Arabic speakers from different
regions.
Hence, given the large differences between these spoken varieties, we
can consider them as disparate languages or, more exactly, as different
dialects depending on the geographical area in which they are
spoken: Morocco, Algeria, Egypt, ... Because they are generally not
written, corpora are not available. Such corpora are essential
for mining texts or for developing
applications such as speech recognition or machine translation, which are
based on statistical models. The only existing corpora, not yet
explored, are those found on social networks; they cannot be used easily
because of the multiplicity of formats, the number of foreign words, the
mixing of dialects with French or English, and so on.
Objective of the workshop
This workshop is an opportunity for the NLP community to focus on this
challenging topic, to develop new Arabic dialect resources and tools for
processing Arabic dialect corpora, and to highlight
the difficulties of processing Arabic dialects, especially those which
use many foreign words adapted lexically and grammatically to
Arabic. The workshop topics include, but are not limited to:
1. Collecting Arabic dialect corpora
2. Diacritization of Arabic dialects
3. Mining Arabic social networks
4. Language modelling
5. Arabic dialect morphology
6. Development of mobile Arabic dialect applications: speech
recognition, machine translation, ...
7. Tagging Arabic dialect corpora
8. Maghreb Arabic dialects versus Orient Arabic dialects: Linguistic
study.
Format and duration: a full-day workshop will be held on April 10,
2014. The language of the workshop is English, and submissions should
follow the ICCTIM2014 paper submission instructions. All papers
will be peer reviewed. Papers must be submitted electronically in PDF
format as soon as possible and before March 10, 2014.
When you submit through the OpenConf management system, please select
"others" in the proposed topics and enter "Arabic dialect" in the keywords.
In all cases, when you submit, please send an email to the chairman
of the workshop: smaili at loria.fr