--------------------------------------------------------------------<br>Apologies for multiple postings<br>--------------------------------------------------------------------<br><br>CALL FOR PAPERS<br><br>AMTA 2012 Workshop on Monolingual Machine Translation (MONOMT 2012)<br>


* Colocated with AMTA 2012 (The Tenth Biennial Conference of the<br>

Association for Machine Translation in the Americas)<br>

<br>Title: Monolingual Machine Translation (MONOMT 2012).<br>Date: Nov 1, 2012<br>Location: San Diego, United States<br><a href="http://computing.dcu.ie/%7Etokita/MONOMT/monomt.htm" target="_blank">http://computing.dcu.ie/~tokita/MONOMT/monomt.htm</a><br>


<br>DESCRIPTION<br><br>Due to the increasing demand for high-quality translation,<br>monolingual Machine Translation (MT) subtasks are frequently<br>encountered in settings where one MT task is decomposed into<br>


several subtasks, some of which can be called `monolingual'. Such<br>monolingual MT subtasks include: (1) MT for morphologically rich<br>languages [Bojar, 08], aimed at dealing with the morphological richness of<br>the target, as is the case with the English-Czech (EN-CZ) language<br>


pair. An MT task is thus split into two subtasks: first, English is<br>(`bilingually') translated into simplified Czech and then, the<br>obtained morphologically normalized Czech is (`monolingually')<br>translated into morphologically rich Czech; (2) system combination<br>


[Matusov et al., 05], where a source sentence is first translated into<br>the target language by several MT systems, and then the obtained<br>translations are combined to generate the output in the same<br>language; (3) statistical post-editing [Dugast et al., 07; Simard et<br>


al., 07], where a source sentence is first translated into the target<br>language by a rule-based MT system and then, the obtained output is<br>`monolingually' translated by an SMT system; (4) domain adaptation<br>using transfer learning [Daume III, 07]: the source side written in a<br>


`source' domain (e.g., newswires) is converted into the target side<br>written in a `target' domain (e.g., patents); (5) transliteration<br>between phonemes / alphabets [Knight and Graehl, 98]; (6) considering<br>


reordering issues (SVO and SOV) [Katz-Brown et al., 11]; (7) MERT<br>process [Arun et al., 10]; (8) translation memory (TM) and MT<br>integration [Ma et al., 11]; (9) paraphrasing for creating additional<br>training data or for evaluation purposes.<br>


<br>A distinction could be established between bilingual MT tools<br>(B-tools) and monolingual MT tools (M-tools) that may be exploited for<br>monolingual MT. Consider, e.g., monolingual subtasks such as MT for<br>morphologically rich languages, statistical post-editing, or<br>


transliteration on the one hand, and system combination or domain<br>adaptation on the other. The latter group is often approached<br>with M-tools such as monolingual word alignment [Matusov et<br>al., 05; He et al., 08] and the minimization of Bayes risk [Kumar and<br>


Byrne, 02] (on the outputs of combined systems). However, the former<br>group usually employs B-tools, such as GIZA++ [Och and Ney, 04], to<br>extract bilingual phrases, followed by MAP decoding over them. The way M-tools<br>and B-tools are used for monolingual MT is an issue of particular<br>


interest for this workshop.<br><br>This workshop is intended to provide the opportunity to discuss ideas<br>and share opinions on the question of the applicability of M-tools or<br>B-tools for monolingual MT subtasks, and on their respective strengths<br>


and weaknesses in specific settings. Furthermore, we wish to provide<br>an opportunity to demonstrate successful use cases of M-tools.<br><br>Possible questions to be addressed during the<br>workshop include:<br>


<br>- ways of applying M-tools to monolingual MT subtasks such as MT for<br>morphologically rich languages and statistical post-editing.<br>- investigation of the suitability of B-tools or M-tools for<br>monolingual MT subtasks.<br>


- performance improvements of monolingual word alignment tools,<br>since these are necessary for specific monolingual subtasks, such as<br>MT for morphologically rich languages and statistical post-editing.<br><br><br>IMPORTANT DATES<br>


<br>Submission deadline: August 3, 2012<br>Notification to authors: August 31, 2012<br>Camera ready: September 7, 2012<br>Workshop: November 1, 2012<br><br>TOPICS OF INTEREST<br><br>Original papers are invited on different aspects of monolingual MT,<br>


such as:<br>-MT for morphologically rich languages<br>-system combination<br>-statistical post-editing<br>-domain adaptation<br>-MERT process<br>-MT for language pairs with mismatched word order (SVO and SOV, ...)<br>-MT-TM integration (i.e., MT systems whose prior knowledge includes<br>


bilingual terminology and TM)<br>-transliteration<br>-MT using textual entailment<br>-MT using confidence estimation<br>-paraphrasing<br>-hybrid MT<br>-...<br><br>Papers describing the mechanism of MT tools that may be considered<br>


`monolingual' are also encouraged. Some possible topics are listed<br>below:<br>-MBR decoding, consensus decoding<br>-monolingual word alignment (based on TER, METEOR,...)<br>-language models constructed by learning the representation of data<br>


-data structure related matters<br>-ranking algorithms<br>-multitask learning (in the context of domain adaptation)<br>-...<br><br>SUBMISSION<br>    <br>Authors are invited to submit long papers (up to 10 pages) and short<br>


papers (2 - 4 pages). Long papers should describe unpublished,<br>substantial and completed research. Short papers should be position<br>papers, papers describing work in progress or short, focused<br>contributions. Papers will be accepted until August 3, 2012 in PDF<br>


format via the system. Submitted papers must follow the style and<br>formatting guidelines available from the AMTA main conference site<br>(see below). As the reviewing will be blind, the papers must not<br>include the authors' names and affiliations. Furthermore,<br>


self-references that reveal the author's identity, e.g., "We<br>previously showed (Smith, 1991) ..." must be avoided. Instead, use<br>citations such as "Smith previously showed (Smith, 1991) ..." Papers<br>


that do not conform to these requirements will be rejected without<br>review.<br><br>ORGANIZERS<br><br>Tsuyoshi Okita (Dublin City University, Ireland)<br>Artem Sokolov (LIMSI, France)<br>Taro Watanabe (National Institute of Information and Communications<br>


Technology, Japan)<br><br>PROGRAM COMMITTEE (Tentative)<br><br>Bogdan Babych (University of Leeds, UK)<br>Loic Barrault (LIUM, Universite du Maine, France)<br>Nicola Bertoldi (FBK, Italy)<br>Boxing Chen (NRC Institute for Information Technology, Canada)<br>


Trevor Cohn (University of Sheffield, UK)<br>Marta Ruiz Costa-jussa (Barcelona Media, Spain)<br>Josep M. Crego (SYSTRAN, France)<br>John DeNero (Google, USA)<br>Jinhua Du (Xi'an University of Technology, China)<br>Kevin Duh (Nara Institute of Science and Technology, Japan)<br>


Chris Dyer (CMU, USA)<br>Christian Federmann (DFKI, Germany)<br>Barry Haddow (University of Edinburgh, UK)<br>Xiaodong He (Microsoft, USA)<br>Jagadeesh Jagarlamudi (University of Maryland, USA)<br>Philipp Koehn (University of Edinburgh, UK)<br>


Shankar Kumar (Google, USA)<br>Alon Lavie (CMU, USA)<br>Yanjun Ma (Baidu, China)<br>Aurelien Max (LIMSI, University Paris Sud, France)<br>Stefan Riezler (University of Heidelberg, Germany)<br>Lucia Specia (University of Sheffield, UK)<br>


Marco Turchi (JRC, Italy)<br>Antal van den Bosch (Radboud University Nijmegen, Netherlands)<br>Xianchao Wu (Baidu, Japan)<br>Dekai Wu (HKUST, Hong Kong)<br>Francois Yvon (LIMSI, University Paris Sud, France)<br><br>---------------------------<br>


<br>