Call: LREC 2014 Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools

Fri Dec 27 20:22:39 UTC 2013

Date: Thu, 26 Dec 2013 20:05:38 +0000
From: OSACT <OSACT at kacst.edu.sa>
Message-ID: <76B661A7930AFC49A0DC6418253B1FA2C6AC8E at ex-itu01-002>
X-url: http://www.kacstac.org.sa/osact/index.html

We apologize for multiple postings. Please distribute to interested
colleagues.

----------------------------------------------

1st Call for Papers

WORKSHOP ON Free/Open-Source Arabic Corpora and Corpora Processing Tools

http://www.kacstac.org.sa/osact/index.html

  May 27, 2014
  Co-located with LREC 2014
  Harpa Conference Centre, Reykjavik (Iceland)

  DEADLINE FOR PAPERS: February 10, 2014
https://www.softconf.com/lrec2014/OSACT/

============================================================
Workshop description

In the Natural Language Processing (NLP) and Computational Linguistics
(CL) communities, Arabic has long been regarded as a resource-poor
language. This was thought to be the reason for the lack of
corpus-based studies of Arabic. However, recent years have witnessed
the emergence of a considerable number of new free Arabic corpora and,
to a lesser extent, Arabic corpora processing tools.

Freely available Arabic corpora can be divided into two groups. The
first group contains large Arabic corpora designed and constructed
primarily for Arabic linguistics research and activities, and possibly
for Arabic NLP. These corpora are diverse in the genres they cover, and
their sizes range from one million words to 700 million words. The
second group contains corpora designed primarily for Arabic text
classification and clustering; they mainly contain newspaper articles
and range from less than 1 million words to 11 million words.

Some Arabic corpora, mainly the large ones, can be explored on the web
using different tools, while others are only available for download.
For the downloadable corpora, the user may need standalone corpus
processing tools. These tools offer many functions, such as word
frequency, concordance, collocation, etc. Nevertheless, even with the
availability of large and diverse Arabic corpora, the situation has not
changed: there is still a lack of Arabic corpus-based studies. Is this
because of the representativeness of these corpora? The functions and
tools associated with them? Or is it because they are not well enough
known in the Arabic linguistics community?

Motivation and topics of interest

This half-day workshop aims to encourage researchers and developers to
foster the use of freely available Arabic corpora and open-source
Arabic corpora processing tools, to help highlight the drawbacks of
these resources, and to discuss techniques and approaches for improving
them. The workshop topics include, but are not limited to:

- Surveying and critiquing the design of freely available Arabic
  corpora, their associated tools and standalone Arabic corpora
  processing tools.
- Applications and uses of freely available Arabic language resources
  in fields such as Arabic language education (e.g. L1 and L2).
- Arabic language modeling.
- Corpus-based Arabic lexicography.
- Lexical semantics and word sense.
- Corpus-based Arabic syntax.
- Corpus-based Arabic morphology.
- Development of Arabic mobile applications based on the available
  Arabic language resources.
- Evaluation and assessment of Arabic Corpora and Corpora Processing
  Tools.
- Future directions of Free/Open Arabic Corpora and Corpora Processing
  Tools.

Organising Committee

- Hend Al-Khalifa, King Saud University, KSA
- Abdulmohsen Al-Thubaity, King Abdul Aziz City for Science and
  Technology, KSA

Program Committee

- Eric Atwell, University of Leeds, UK
- Khaled Shaalan, The British University in Dubai (BUiD), UAE
- Dilworth Parkinson, Brigham Young University, USA
- Nizar Habash, Columbia University, USA
- Khurshid Ahmad, Trinity College Dublin, Ireland
- Abdulmalik AlSalman, King Saud University, KSA
- Maha Alrabiah, King Saud University, KSA
- Saleh Alosaimi, Imam University, KSA
- Sultan Almujaiwel, King Saud University, KSA
- Adam Kilgarriff, Lexical Computing Ltd, UK
- Amal AlSaif, Imam University, KSA
- Maha AlYahya, King Saud University, KSA
- Auhood AlFaries, King Saud University, KSA
- Salwa Hamada, Taibah University, KSA
- Mansour Algamdi, King Abdul Aziz City for Science and Technology, KSA
- Abdullah Alfaifi, University of Leeds, UK

Important Dates

- Submission deadline: 10 February 2014
- Notification of acceptance: 10 March 2014
- Final submission of manuscripts: 21 March 2014
- Workshop date: 27 May 2014 (morning session)

Submissions

The language of the workshop is English, and submissions should follow
the LREC 2014 paper submission instructions. All papers will be peer
reviewed, possibly by three independent referees. Papers must be
submitted electronically in PDF format to the START system
<https://www.softconf.com/lrec2014/OSACT/>. When submitting a paper
from the START page, authors will be asked to provide essential
information about resources (in a broad sense, i.e. also technologies,
standards, evaluation kits, etc.) that have been used for the work
described in the paper or are a new result of their research. Moreover,
ELRA encourages all LREC authors to share the described LRs (data,
tools, services, etc.) to enable their reuse and the replicability of
experiments (including evaluation ones).