Call: AMTA Workshop on Monolingual Machine Translation (MONOMT 2012)

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Sat Jun 23 20:28:14 UTC 2012


Date: Sat, 23 Jun 2012 11:01:04 +0200 (CEST)
From: "Artem Sokolov" <artem at limsi.fr>
Message-ID: <8c72079ac6fb947320d8c267d0cb72f7.squirrel at webmail.limsi.fr>
X-url: http://computing.dcu.ie/~tokita/MONOMT/monomt.htm

Call for Papers:

AMTA 2012 Workshop on Monolingual Machine Translation (MONOMT 2012)

Submission deadline: August 3, 2012
Workshop date: November 1, 2012
Location: San Diego, United States
http://computing.dcu.ie/~tokita/MONOMT/monomt.htm

Description:

Due to the increasing demand for high-quality translation, monolingual
Machine Translation (MT) subtasks are frequently encountered when one MT
task is decomposed into several subtasks, some of which can be called
`monolingual'. Such monolingual MT subtasks include:

(1) MT for morphologically rich languages [Bojar, 08], aimed at dealing
    with the morphological richness of the target language, as is the
    case with the English-Czech (EN-CZ) language pair. An MT task is
    thus split into two subtasks: first, English is (`bilingually')
    translated into simplified Czech, and then the resulting
    morphologically normalized Czech is (`monolingually') translated
    into morphologically rich Czech;

(2) system combination [Matusov et al., 05], where a source sentence is
    first translated into the target language by several MT systems,
    and the resulting translations are then combined to generate the
    output in the same language;

(3) statistical post-editing [Dugast et al., 07; Simard et al., 07],
    where a source sentence is first translated into the target
    language by a rule-based MT system, and the resulting output is
    then `monolingually' translated by an SMT system;

(4) domain adaptation using transfer learning [Daume III, 07], where
    the source side written in a `source' domain (e.g., newswires) is
    converted into the target side written in a `target' domain (e.g.,
    patents);

(5) transliteration between phonemes / alphabets [Knight and Graehl,
    98];

(6) reordering for language pairs with different word order (SVO and
    SOV) [Katz-Brown et al., 11];

(7) the MERT (minimum error rate training) process [Arun et al., 10];

(8) translation memory (TM) and MT integration [Ma et al., 11];

(9) paraphrasing for creating additional training data or for
    evaluation purposes.

A distinction can be drawn between bilingual MT tools (B-tools) and
monolingual MT tools (M-tools), both of which may be exploited for
monolingual MT. Consider two groups of monolingual subtasks: on the one
hand, MT for morphologically rich languages, statistical post-editing,
and transliteration; on the other hand, system combination and domain
adaptation. The latter group is often approached with M-tools such as
monolingual word alignment [Matusov et al., 05; He et al., 08] and the
minimization of Bayes risk [Kumar and Byrne, 02] (on the outputs of
combined systems). The former group, however, usually employs B-tools
such as GIZA++ [Och and Ney, 04] to extract bilingual phrases, followed
by MAP decoding over them. The way M-tools and B-tools are used for
monolingual MT is an issue of particular interest for this workshop.
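
As an illustration of Bayes risk minimization over the outputs of
combined systems, here is a minimal Python sketch. It is not taken from
the cited papers: the unigram-F1 gain function, the uniform weights, and
the names overlap_f1 and mbr_select are assumptions chosen for brevity.
Real systems typically use BLEU- or TER-based losses and model
posteriors instead.

```python
from collections import Counter


def overlap_f1(hyp: str, ref: str) -> float:
    """Unigram F1 between two sentences (assumed stand-in gain function)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    common = sum((h & r).values())  # clipped count of shared tokens
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(hypotheses, weights=None):
    """Pick the hypothesis with the highest expected gain over all hypotheses.

    `weights` plays the role of the posterior over hypotheses; it
    defaults to uniform, which is an assumption of this sketch.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    if weights is None:
        weights = [1.0 / len(hypotheses)] * len(hypotheses)

    def expected_gain(i):
        return sum(w * overlap_f1(hypotheses[i], other)
                   for other, w in zip(hypotheses, weights))

    best = max(range(len(hypotheses)), key=expected_gain)
    return hypotheses[best]


# Hypothetical outputs of three MT systems for the same source sentence.
outputs = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is on a rug",
]
consensus = mbr_select(outputs)
```

The selected sentence is the one that agrees most, on average, with the
other system outputs. Only target-language strings are compared, which
is what makes this a monolingual tool in the sense used above.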

This workshop is intended to provide the opportunity to discuss ideas
and share opinions on the question of the applicability of M-tools or
B-tools for monolingual MT subtasks, and on their respective strengths
and weaknesses in specific settings. Furthermore, we wish to provide an
opportunity to demonstrate successful use cases of M-tools.

Questions that participants are encouraged to address during the
workshop include:

- ways of applying M-tools to monolingual MT subtasks such as MT for
  morphologically rich languages and statistical post-editing.

- investigation of the suitability of B-tools or M-tools for monolingual
  MT subtasks.

- performance improvements of monolingual word alignment tools, since
  these are necessary for specific monolingual subtasks, such as MT for
  morphologically rich languages and statistical post-editing.

TOPICS OF INTEREST

Original papers are invited on different aspects of monolingual MT, such
as:
    MT for morphologically rich languages
    system combination
    statistical post-editing
    domain adaptation
    MERT process
    MT for language pairs with mismatched word order (SVO and SOV)
    MT-Translation Memory integration
    transliteration
    MT using textual entailment
    MT using confidence estimation
    paraphrasing
    hybrid MT

Papers describing the mechanism of MT tools that may be considered
`monolingual' are also encouraged. Some possible topics are listed
below:
    MBR decoding, consensus decoding
    monolingual word alignment (based on TER, METEOR,...)
    language models constructed by learning the representation of data
    data structure related matters
    ranking algorithms
    multitask learning (in the context of domain adaptation)

SUBMISSION

Authors are invited to submit long papers (up to 10 pages) and short
papers (2 - 4 pages). Long papers should describe unpublished,
substantial and completed research. Short papers should be position
papers, papers describing work in progress or short, focused
contributions. Papers will be accepted until August 3, 2012, in PDF
format via the submission system. Submitted papers must follow the style
and formatting guidelines available from the AMTA main conference site. As
the reviewing will be blind, the papers must not include the authors'
names and affiliations. Furthermore, self-references that reveal the
author's identity, e.g., "We previously showed (Smith, 1991) ..." must
be avoided. Instead, use citations such as "Smith previously showed
(Smith, 1991) ..." Papers that do not conform to these requirements will
be rejected without review.

ORGANIZERS

Tsuyoshi Okita (DCU, Ireland)
Artem Sokolov (LIMSI, France)
Taro Watanabe (NICT, Japan)

PROGRAM COMMITTEE (Tentative)

Bogdan Babych (University of Leeds, UK)
Loic Barrault (LIUM, Universite du Maine, France)
Nicola Bertoldi (FBK, Italy)
Boxing Chen (NRC Institute for Information Technology, Canada)
Trevor Cohn (University of Sheffield, UK)
Marta Ruiz Costa-jussa (Barcelona Media, Spain)
Josep M. Crego (SYSTRAN, France)
John DeNero (Google, USA)
Jinhua Du (Xi'an University of Technology, China)
Kevin Duh (Nara Institute of Science and Technology, Japan)
Chris Dyer (CMU, USA)
Christian Federmann (DFKI, Germany)
Barry Haddow (University of Edinburgh, UK)
Xiaodong He (Microsoft, USA)
Jagadeesh Jagarlamudi (University of Maryland, USA)
Philipp Koehn (University of Edinburgh, UK)
Shankar Kumar (Google, USA)
Alon Lavie (CMU, USA)
Yanjun Ma (Baidu, China)
Aurelien Max (LIMSI, University Paris Sud, France)
Stefan Riezler (University of Heidelberg, Germany)
Lucia Specia (University of Sheffield, UK)
Marco Turchi (JRC, Italy)
Antal van den Bosch (Radboud University Nijmegen, Netherlands)
Xianchao Wu (Baidu, Japan)
Dekai Wu (HKUST, Hong Kong)
Francois Yvon (LIMSI, University Paris Sud, France)

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------


