Call for Papers: LREC 2012 Workshop on Creating Cross-language Resources for Disconnected Languages and Styles (CREDISLAS)
Sara Morrissey
sara.morrissey2 at MAIL.DCU.IE
Mon Feb 6 11:34:23 UTC 2012
Dear all,
On behalf of the CREDISLAS organising committee, please find below details
of the LREC workshop that may be of interest to the list.
Regards,
Sara
======================================================================================
Workshop on
CREATING CROSS-LANGUAGE RESOURCES FOR DISCONNECTED LANGUAGES AND STYLES
Co-located with LREC 2012 (http://www.lrec-conf.org/lrec2012/)
Istanbul, Turkey
May 27, 2012 (afternoon session)
Deadline for paper submissions: February 26, 2012
http://www-lium.univ-lemans.fr/credislas2012
======================================================================================
This half-day workshop aims at developing strategies and sharing
experiences on creating resources for reducing the linguistic gap between
those language pairs for which cross-language resources are scarce.
Although this situation has most commonly been addressed for minority
languages that have scarce resources by themselves, it is also an
important issue in other situations, such as: majority languages that,
because of their cultural, historical and/or geographical disconnection,
do not have a significant amount of cross-language resources between them
(Chinese and Spanish being a good example of this category); or single
languages in which new communication trends and styles lack cross-language
resources linking them to the main formal language (such as chat speak
and the corresponding formal language).
Current computational and data storage capabilities have favoured the
proliferation of data-driven and statistical approaches in natural language
processing and computational linguistics. Empirical evidence has
demonstrated in a large number of cases and applications how the
availability of appropriate datasets can boost the performance of
processing methods and analysis techniques. In this scenario, the
availability of data has come to play a fundamental role. On the other
hand, both the diversity of languages and the emergence of new
communication media and stylistic trends are responsible for the scarcity
of resources for some specific tasks and applications. Accordingly,
this workshop focuses on those specific
applications or cases for which data scarcity poses a restrictive problem
for data-driven approaches. This includes the following three specific
situations:
Minority Languages, for which scarcity of resources is a consequence of the
minority nature of the language itself. In this case, attention is focused
on the development of both monolingual and cross-lingual resources.
Examples in this category include Basque, Pashto and Haitian Creole.
Disconnected Languages, for which large amounts of monolingual resources
are available, but for which cross-language resources are scarce due to
cultural, historical and/or geographical reasons. Examples in this
category include language pairs such as Chinese and Spanish, Russian and
Portuguese, and Arabic and Japanese.
New Language Styles, which represent different communication forms or
emerging stylistic trends in languages for which the available resources
are practically useless. This case includes the typical examples of tweets
and chat speak communications, as well as other informal forms of
communication, in many languages.
The main topics of interest for this workshop include, but are not limited
to, the following ones:
* Construction and collection of monolingual resources
* Construction and collection of cross-language resources
* Annotation guidelines and evaluation
* Automatic extraction of linguistic resources
* Automatic annotation of linguistic resources
* Use of crowdsourcing for generating and annotating resources
* Use of pivot languages for bridging unconnected languages
* Methods to adapt existing resources to new domains and styles
* Generation of resources for informal communication styles
* Evaluation of monolingual resources: tasks and protocols
* Evaluation of cross-language resources: tasks and protocols
SUBMISSION INSTRUCTIONS
Authors are invited to submit papers on original and previously unpublished
work. Formatting should follow the LREC 2012 specifications (see
http://www.lrec-conf.org/lrec2012/?Authors-Kit), using the LaTeX or MS-Word
style files (available for download at
http://www.lrec-conf.org/lrec2012/?Download-Templates,178).
Submission is electronic in PDF format using the START submission system at
https://www.softconf.com/lrec2012/CREDISLAS2012/
Double submission policy: Parallel submission to other meetings or
publications is possible, but the workshop contact person (see below)
must be notified immediately.
Authors of accepted papers will be invited to present their research at the
workshop.
The workshop papers will be part of the LREC proceedings and published on
the web site of LREC 2012 before the conference.
IMPORTANT DATES
February 26, 2012: Paper submissions due
March 16, 2012: Notification of acceptance
March 30, 2012: Camera-ready papers due
May 27, 2012: Workshop in Istanbul (afternoon session)
ORGANIZERS
Contact person: Patrik Lambert (e-mail: patrik.lambert at lium.univ-lemans.fr )
Patrik Lambert (University of Le Mans),
Marta R. Costa-jussà (Barcelona Media Innovation Center),
Rafael E. Banchs (Institute for Infocomm Research)
PROGRAMME COMMITTEE
Iñaki Alegria, University of the Basque Country, Spain
Marianna Apidianaki, LIMSI-CNRS, Orsay, France
Victoria Arranz, ELDA, Paris, France
Jordi Atserias, Yahoo! Research, Barcelona, Spain
Joan Codina, Barcelona Media, Barcelona, Spain
Gareth Jones, Dublin City University, Ireland
Min-Yen Kan, National University of Singapore
Philipp Koehn, University of Edinburgh, UK
Udo Kruschwitz, University of Essex, UK
Yanjun Ma, Baidu Inc. Beijing, China
Sara Morrissey, Dublin City University, Ireland
Maja Popovic, DFKI, Berlin, Germany
Paolo Rosso, Universidad de Valencia, Spain
Marta Recasens, Stanford University, USA
Wade Shen, Massachusetts Institute of Technology, Cambridge, USA
Haifeng Wang, Baidu Inc. Beijing, China