[Corpora-List] CFP: Symposium on Learning Language Models from Multilingual Corpora

Preslav Nakov (GMail) preslavn at gmail.com
Tue Jan 11 11:40:07 UTC 2011


=======================================================================

Symposium on Learning Language Models from Multilingual Corpora (LLMMC)

(Part of the AISB 2011 Convention, 4-7 April 2011.)

http://www.cs.york.ac.uk/aig/LLMMC/

(Apologies for Cross-Posting)

=======================================================================


Second Call for Papers

International organizations, such as the UN and the EU, news agencies, and
companies operating internationally are producing large volumes of texts in
different languages. As a result, large publicly-available parallel
paragraph- or sentence-aligned corpora have been created for many language
pairs, e.g., French-English, Chinese-English or Arabic-English. The
multilingual nature of the EU has given rise to many documents available in
all or many of its official languages, which have been assembled in
multilingual parallel corpora such as Europarl (11 languages, 34-55M words
for each) and JRC-Acquis (22 languages, 11-22M words for each).

These parallel corpora have been used, both monolingually and
multilingually, for a variety of NLP tasks, including but not limited to
machine translation, cross-lingual information retrieval, word sense
disambiguation, semantic relation extraction, named entity recognition, POS
tagging, and syntactic parsing. With the advent of the Internet, there has
also been an explosion in the availability of semi-parallel multilingual online
resources like Wikipedia that have been used for similar tasks and have
great potential for future exploration and research.

In this symposium, we are interested in explicit models, usable and
verifiable by humans, which could be used either for translation or for
modelling individual languages, e.g., as applied to morphology, where the
available translations can help identify word forms of the same lexical
entry in a given language; or lexical semantics, where parallel corpora can
help extract instances of relations, such as synonymy and hypernymy, which
are essential for building thesauri and ontologies. The results may be
compared against existing approaches for the acquisition of language models
(morphology, syntax, semantics) from monolingual corpora, or combined with
these in order to use the best of both approaches.

The main purpose of the symposium will be to gather and disseminate the best
ideas in this new area. Thus, we welcome discussions of previously published
ideas alongside original contributions. The submission format is limited to
a 4-page extended abstract, which may be a position paper, or one outlining
an initial idea, work in progress or completed research. The camera-ready
copy of the article to be published in the symposium proceedings can be the
same extended abstract of up to 4 pages or a full-length paper of up to 8
pages in the AISB convention format:
http://www.aisb.org.uk/convention/aisb11/style.html. All accepted papers
will be allocated 25 minutes for presentation and questions. At least one of
the authors will need to be registered for the event before a paper can
appear in the proceedings. A considerable part of this one-day symposium
will be dedicated to discussions to encourage the formation of new
collaborations and consortia.

The symposium will take place alongside 9 other symposia on various aspects
of AI-related research, drawing strongly on Computer Science, Psychology and
Philosophy, among other disciplines. A number of plenary speakers of
international fame (Alan Baddeley, Katie Slocombe, Mark Steedman, Stephen
Wolfram) will add to the excitement of this interdisciplinary international
event, which will take place in the historic city of York, one of the
oldest and most visited cities in England.

Duration: a one-day symposium.



Important dates:
================

Submissions: January 19, 2011

Notification: February 14, 2011

Submission of camera-ready versions: February 28, 2011

Symposium: April 6, 2011



Organizers:
===========

Dimitar Kazakov, The University of York, UK (kazakov AT cs DOT york DOT ac
DOT uk)

Preslav Nakov, National University of Singapore, Singapore (preslav DOT
nakov AT gmail DOT com)

Ahmad R. Shahid, The University of York, UK (ahmad AT cs DOT york DOT ac DOT
uk)


Program Committee:
==================

Graeme Blackwood, University of Cambridge, UK

Phil Blunsom, University of Oxford, UK

Francis Bond, Nanyang Technological University, Singapore

Yee-Seng Chan, University of Illinois at Urbana-Champaign, USA

Daniel Dahlmeier, National University of Singapore, Singapore

Marc Dymetman, Xerox Research Centre Europe, France

Andreas Eisele, Directorate-General for Translation, Luxembourg

Michel Galley, Stanford University, USA

Kuzman Ganchev, University of Pennsylvania, USA

Corina R Girju, University of Illinois at Urbana-Champaign, USA

Philipp Koehn, University of Edinburgh, UK

Krista Lagus, Aalto University School of Science and Technology, Finland

Wei Lu, National University of Singapore, Singapore

Elena Paskaleva, Bulgarian Academy of Sciences, Bulgaria

Katerina Pastra, Institute for Language and Speech Processing, Greece

Khalil Sima'an, University of Amsterdam, The Netherlands

Ralf Steinberger, Joint Research Centre, Italy

Joerg Tiedemann, Uppsala University, Sweden

Marco Turchi, Joint Research Centre, Italy

Jaakko Väyrynen, Aalto University School of Science and Technology, Finland

