<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
ACL 2014 NINTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION<br>
Shared Tasks on news translation, quality estimation, metrics and
medical text translation.<br>
June 26-27, in conjunction with ACL 2014 in Baltimore, USA<br>
<br>
<a class="moz-txt-link-freetext" href="http://www.stamt.org/wmt14">http://www.stamt.org/wmt14</a><br>
<br>
As part of the ACL WMT14 workshop, as in previous years, we will be
organising a collection of shared tasks related to machine
translation. We hope that both beginners and established research
groups will participate. This year we are pleased to present the
following tasks:<br>
<br>
- Translation task<br>
- Quality estimation task<br>
- Metrics task<br>
- Medical translation task<br>
<br>
Further information, including task rationale, timetables and data
can be found on the WMT14 website. Brief descriptions of each task
are given below. Prospective participants are encouraged to register
with the mailing list for further announcements
(<a href="https://groups.google.com/forum/#%21forum/wmt-tasks">https://groups.google.com/forum/#!forum/wmt-tasks</a>).<br>
<br>
For all tasks, participants will also be invited to submit a short
paper describing their system.<br>
<br>
Translation Task<br>
---------------------<br>
This task will compare translation quality on four European language
pairs (English-Czech, English-French, English-German and
English-Russian), as well as a low-resource language pair
(English-Hindi). The last pair is *new* for this year. The test sets
will be drawn from online newspapers and translated specifically for
the task.<br>
<br>
We will provide extensive monolingual and parallel data sets for
training, as well as development sets, all available for download
from the task website. Translations will be evaluated using both
automatic metrics and human evaluation. Participants will be
expected to contribute to the human evaluation of the translations.<br>
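<p>As a small illustration of the automatic side of the evaluation, a
metric such as BLEU scores a system translation by its n-gram overlap
with the reference. The following sketch uses NLTK's implementation;
the sentences are invented purely for illustration:</p>
<pre>
# A minimal BLEU example using NLTK; the sentences are illustrative only.
# Real submissions are scored on the full test sets against the
# references produced for the task.
from nltk.translate.bleu_score import corpus_bleu

# One list of (tokenised) reference translations per hypothesis.
references = [[["the", "cat", "sat", "on", "the", "mat"]]]
hypotheses = [["the", "cat", "sat", "on", "a", "mat"]]

print("BLEU: %.3f" % corpus_bleu(references, hypotheses))
</pre>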
<br>
For this year's task we will be releasing the following new or
updated corpora:<br>
- An updated version of news-commentary<br>
- A monolingual news crawl for 2013 in all the task languages<br>
- A development set for English-Hindi<br>
- A parallel corpus of English-Hindi (HindEnCorp), prepared by
Charles University, Prague<br>
- A cleaned-up version of the JHU English-Hindi corpus.<br>
Not all data sets are available on the website yet, but they will be
uploaded as soon as they are ready.<br>
<br>
The translation task test week will be February 24-28.<br>
<br>
This task is supported by MosesCore
(<a href="http://www.mosescore.eu">http://www.mosescore.eu</a>),
and the Russian test sets are provided by Yandex.<br>
<br>
Quality Estimation<br>
------------------------<br>
<p>This shared task will examine automatic <b>methods for estimating
the quality of machine translation output at run-time</b>,
without relying on reference translations. In this third edition
of the shared task, we will once again consider <b>word-level</b>
and <b>sentence-level</b> estimation. However, this year we will
focus on settings for quality prediction that are MT
system-independent and rely on a limited number of training
instances. More specifically, our tasks have the following <b>goals</b>:
</p>
<ul>
<li> To investigate the effectiveness of different quality labels.
</li>
<li> To explore word-level quality prediction at different levels
of granularity. </li>
<li> To study the effects of training and test datasets with mixed
domains, language pairs and MT systems. </li>
<li> To analyse the effectiveness of quality prediction methods on
human translations. </li>
</ul>
The WMT12-13 quality estimation shared tasks provided a set of
baseline features, datasets, evaluation metrics, and oracle results.
Building on the last two years' experience, this year's shared task
will reuse some of these resources, but provide additional training
and test sets for four language pairs (English-Spanish,
English-German, Spanish-English, German-English) and use different
quality labels at the word level (specific types of errors) and at
the sentence level. These new datasets have been collected using
professional translators as part of the QTLaunchPad project
(<a href="http://www.qt21.eu/launchpad/">http://www.qt21.eu/launchpad/</a>).
<br>
<br>
Metrics Task<br>
----------------<br>
<br>
The shared metrics task will examine automatic evaluation metrics
for machine translation. We will provide you with all of the
translations produced in the translation task along with the
reference human translations. You will return your automatic metric
scores for each of the translations at the system-level and/or at
the sentence-level. We will calculate the system-level and
sentence-level correlations of your rankings with WMT14 human
judgements once the manual evaluation has been completed.<br>
<br>
The task will be very similar to previous years'. The most visible
change this year is that we are going to use Pearson's (instead of
Spearman's) correlation coefficient to compute system-level
correlations.<br>
<br>
<br>
The important dates for metrics task participants are:<br>
<br>
March 7, 2014 - System outputs distributed for metrics task<br>
March 28, 2014 - Submission deadline for metrics task<br>
<br>
Medical Translation Task<br>
--------------------------------<br>
<br>
In the Medical Translation Task, participants are welcome to test
their MT systems on a genre- and domain-specific exercise. The goal
is to translate sentences from summaries, as well as short queries,
in the medical domain. As usual, we provide training data specific
to the task. Unlike the standard translation task, the medical task
will be evaluated only automatically.<br>
<br>
More details: <a class="moz-txt-link-freetext"
href="http://www.statmt.org/wmt14/medical-task.html">http://www.statmt.org/wmt14/medical-task.html</a><br>
<br>
-----<br>
<br>
Barry Haddow<br>
(on behalf of the organisers)<br>
</body>
</html>