[Corpora-List] Call for participation: LREC 2008 Workshop on Comparable Corpora
Pierre Zweigenbaum
pz at limsi.fr
Mon May 19 07:05:02 UTC 2008
=============================================================
Building and Using Comparable Corpora
LREC 2008 Post-Conference Workshop
May 31st, 2008, Marrakesh
Program and Call for Participation
http://www.limsi.fr/~pz/lrec2008-comparable-corpora/
=============================================================
This workshop aims to bring together researchers interested in the
constitution and use of comparable corpora. Contributions on the
constitution and application of comparable corpora are presented
by both linguists and computational linguists.
We are pleased to announce that Dr Serge Sharoff, Centre for
Translation Studies, School of Modern Languages and Cultures,
University of Leeds, will give a special talk on "Parallel worlds:
Recent advances in finding translations in comparable corpora"
at the Workshop.
---------------------
PROGRAM
----------------
09:00 Welcome and Introduction
----------------
09:15-10:15 Oral Session 1: Some Challenges
09:15
Gloria Corpas Pastor, Ruslan Mitkov, Naveed Afzal, Lisette Garcia Moya
Translation universals: do they exist? A corpus-based and NLP approach
to convergence
09:45
Sanjika Hewavitharana, Stephan Vogel
Enhancing a Statistical Machine Translation System by using an
Automatically Extracted Parallel Corpus from Comparable Sources
----------------
10:15-11:00 Coffee break
10:15-11:00 Poster session 1 (see list of posters below)
----------------
11:00-12:30 Oral Session 2:
Extracting Bilingual Lexicons from Comparable Corpora
11:00
Iñaki Alegria, Nerea Ezeiza, Izaskun Fernandez
Translating Named Entities using Comparable Corpora
11:30
Pablo Gamallo Otero
Evaluating Two Different Methods for the Task of Extracting Bilingual
Lexicons from Comparable Corpora
12:00
Xabier Saralegi, I. San Vicente, A. Gurrutxaga
Automatic extraction of bilingual terms from comparable corpora in a
popular science domain
12:30-13:30 Invited session
Serge Sharoff (University of Leeds, UK)
Parallel worlds: Recent advances in finding translations in
comparable corpora
----------------
13:30-14:30 Lunch break
----------------
14:30-16:00 Oral Session 3: Linguistic studies
14:30
Christel Stolz, Thomas Stolz
Functional-Typological Approaches To Parallel And Comparable
Corpora: The Bremen Mixed Corpus
15:00
Maria Fernanda Bacelar do Nascimento, Antónia Estrela, Amália
Mendes, Luísa Pereira
On the use of comparable corpora of African varieties of Portuguese
for linguistic description and teaching/learning applications
15:30
Oliver Culo, Silvia Hansen-Schirra, Stella Neumann, Mihaela Vela
Empirical studies on language contrast using the English-German
comparable and parallel CroCo corpus
----------------
16:00-16:45 Coffee break
16:00-16:45 Poster session 2 (see list of posters below)
----------------
16:45-18:00 Panel session
Comparable corpora: varying definitions, varying uses
18:00 End of workshop
----------------
** List of poster presentations
----------------
Magnar Brekke
Term Extraction from Parallel and Comparable Text: The KB-N Legacy
Carmen Dayrell, Sandra Aluísio
Using a comparable corpus to investigate lexical patterning in English
abstracts written by non-native speakers
Meng Ji
A Comparative Approach to Diachronic Comparable Corpus Investigation
Natalie Kübler
A comparable Learner Translator Corpus: creation and use
Belinda Maia, Sérgio Matos
Corpógrafo V.4 -- tools for researchers and teachers using comparable
corpora
Emmanuel Prochasson, Kyo Kageura, Emmanuel Morin, Akiko Aizawa
Looking for Transliterations in a trilingual English, French and
Japanese Specialised Comparable Corpus
Richard Rohwer, Zhiqiang (John) Wang
Coarse Lexical Translation with no use of Prior Language Knowledge
----------------
Workshop Description
Research in comparable corpora is motivated by the scarcity of
parallel corpora. Parallel corpora are a key resource to mine
translations for statistical machine translation or for building
or extending bilingual lexicons and terminologies. However, beyond
a few language pairs such as English-French or English-Chinese and
a few contexts such as parliamentary debates or legal texts, they
remain a scarce resource, despite the creation of automated
methods to collect parallel corpora from the Web. A more
fundamental limitation is that translated texts, whatever the
skills of translators, are generally influenced by the very
translation process and by the language of source texts, so that
they may not be fully adequate for the task at hand.
This has motivated research into the use of comparable corpora:
pairs of monolingual corpora selected according to the same set of
criteria, but in different languages or language
varieties. Comparable corpora overcome both limitations of
parallel corpora: original, monolingual texts are much more
abundant than translated texts, and they are not shaped by a
translation process. However, because of
their nature, mining translations in comparable corpora is much
more challenging than in parallel corpora. What constitutes a good
comparable corpus, for a given task or per se, also requires
specific attention: while the definition of a parallel corpus is
fairly straightforward, building a comparable corpus requires
control over the selection of source texts in both languages.
----------------
Workshop Organisers
Pierre Zweigenbaum
LIMSI, CNRS, Orsay, France
& ERTIM, INALCO, Paris, France
Eric Gaussier
LIG, Université J. Fourier, Grenoble, France
Pascale Fung
Department of Electronic & Computer Engineering,
University of Science & Technology, Hong Kong
Scientific Committee
Lynne Bowker (University of Ottawa, Canada)
Hervé Déjean (Xerox Research Centre Europe, Grenoble, France)
Éric Gaussier (Université Joseph Fourier, Grenoble, France)
Gregory Grefenstette (CEA/LIST, Fontenay-aux-Roses, France)
Pascale Fung (University of Science & Technology, Hong Kong)
Natalie Kübler (Université Paris Diderot, France)
Tony McEnery (Lancaster University, UK)
Emmanuel Morin (Université de Nantes, France)
Dragos Stefan Munteanu (Information Sciences Institute, Marina Del Rey, USA)
Carol Peters (ISTI-CNR, Pisa, Italy)
Reinhard Rapp (Johannes Gutenberg-Universität Mainz, Germany)
Serge Sharoff (University of Leeds, UK)
Monique Slodzian (INALCO, Paris, France)
Richard Sproat (University of Illinois at Urbana-Champaign, USA)
Pierre Zweigenbaum (LIMSI-CNRS, Orsay, France)