Arabic-L:LING: 1st CfP: LREC 2014 Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools

Sat Jan 4 04:33:47 UTC 2014

------------------------------------------------------------------------
Arabic-L: Sat 04 Jan 2014
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
           unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject:  1st CfP: LREC 2014 Workshop on Free/Open-Source Arabic Corpora
and Corpora Processing Tools

-------------------------Messages-----------------------------------
1)
Date: 04 Jan 2014
From: OSACT <OSACT at kacst.edu.sa>
Subject:  1st CfP: LREC 2014 Workshop on Free/Open-Source Arabic Corpora
and Corpora Processing Tools

 1st Call for Papers

 WORKSHOP ON Free/Open-Source Arabic Corpora and Corpora Processing Tools

http://www.kacstac.org.sa/osact/index.html

  May 27, 2014
  Co-located with LREC 2014
  Harpa Conference Centre, Reykjavik (Iceland)

  DEADLINE FOR PAPERS: February 10, 2014
https://www.softconf.com/lrec2014/OSACT/

============================================================
Workshop description

In the Natural Language Processing (NLP) and Computational Linguistics (CL)
communities, Arabic has long been regarded as a resource-poor language.
This was thought to be the reason for the lack of corpus-based studies of
Arabic. However, recent years have witnessed the emergence of a
considerable number of new, free Arabic corpora and, to a lesser extent,
Arabic corpora processing tools.

Freely available Arabic corpora can be divided into two groups. The first
group contains large Arabic corpora designed and constructed primarily for
Arabic linguistics research and related activities, and possibly for
Arabic NLP. These corpora are diverse in the genres they cover, and their
sizes range from one million words to 700 million words. The second group
contains corpora designed primarily for Arabic text classification and
clustering; they mainly contain newspaper articles and range from fewer
than 1 million words to 11 million words.

Some Arabic corpora, mainly the large ones, can be explored on the web
using different tools, while others are available only for download. For
corpora that are available for download, the user may need standalone
corpus processing tools. These tools offer functions such as word
frequency, concordance, and collocation. Yet despite the availability of
large and diverse Arabic corpora, the situation has not changed: there is
still a lack of Arabic corpus-based studies. Is this because of the
representativeness of these corpora, because of the functions and tools
associated with them, or because they are not well known enough in the
Arabic linguistics community?

Motivation and topics of interest

This half-day workshop aims to encourage researchers and developers to
make greater use of freely available Arabic corpora and open-source Arabic
corpora processing tools, to help highlight the drawbacks of these
resources, and to discuss techniques and approaches for improving them.
The workshop topics include, but are not limited to:

  *       Surveying and critiquing the design of freely available Arabic
corpora, their associated tools, and standalone Arabic corpora processing
tools.
  *       The applications and uses of freely available Arabic language
resources in fields such as Arabic language education (e.g., L1 and L2).
  *       Arabic language modeling.
  *       Corpus-based Arabic lexicography.
  *       Lexical semantics and word sense.
  *       Corpus-based Arabic syntax.
  *       Corpus-based Arabic morphology.
  *       Development of Arabic mobile applications based on the available
Arabic language resources.
  *       Evaluation and assessment of Arabic Corpora and Corpora
Processing Tools.
  *       Future directions of Free/Open Arabic Corpora and Corpora
Processing Tools.

Organising Committee

  *       Hend Al-Khalifa, King Saud University, KSA
  *       Abdulmohsen Al-Thubaity, King Abdul Aziz City for Science and
Technology, KSA

Program Committee

  *       Eric Atwell, University of Leeds, UK
  *       Khaled Shaalan, The British University in Dubai (BUiD), UAE
  *       Dilworth Parkinson, Brigham Young University, USA
  *       Nizar Habash, Columbia University, USA
  *       Khurshid Ahmad, Trinity College Dublin, Ireland
  *       Abdulmalik AlSalman, King Saud University, KSA
  *       Maha Alrabiah, King Saud University, KSA
  *       Saleh Alosaimi, Imam University, KSA
  *       Sultan Almujaiwel, King Saud University, KSA
  *       Adam Kilgarriff, Lexical Computing Ltd, UK
  *       Amal AlSaif, Imam University, KSA
  *       Maha AlYahya, King Saud University, KSA
  *       Auhood AlFaries, King Saud University, KSA
  *       Salwa Hamada, Taibah University, KSA
  *       Mansour Algamdi, King Abdul Aziz City for Science and Technology,
KSA
  *       Abdullah Alfaifi, University of Leeds, UK

Important Dates

  *       Submission deadline: 10 February 2014
  *       Notification of acceptance: 10 March 2014
  *       Final submission of manuscripts: 21 March 2014
  *       Workshop date: 27 May 2014 (morning session)

Submissions

The language of the workshop is English, and submissions should follow
the LREC 2014 paper submission instructions. All papers will be peer
reviewed, possibly by three independent referees. Papers must be submitted
electronically in PDF format to the START system
<https://www.softconf.com/lrec2014/OSACT/>. When submitting a paper from
the START page, authors will be asked to provide essential information
about resources (in a broad sense, i.e. also technologies, standards,
evaluation kits, etc.) that have been used for the work described in the
paper or are a new result of their research. Moreover, ELRA encourages all
LREC authors to share the described LRs (data, tools, services, etc.) to
enable their reuse and the replicability of experiments, including
evaluation ones.

--------------------------------------------------------------------------
End of Arabic-L: 04 Jan 2014