<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
EMNLP 2015 TENTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION<br>
<div class="moz-forward-container">
<div class="moz-forward-container">
<div class="moz-forward-container">
<div class="moz-forward-container"> Shared Tasks on news
translation, automatic post-editing, quality estimation and
metrics.<br>
September 2015, in conjunction with EMNLP 2015 in Lisbon,
Portugal<br>
<br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://www.statmt.org/wmt15/">http://www.statmt.org/wmt15/</a><br>
<br>
As part of the EMNLP WMT15 workshop, as in previous years,
we will be organising a collection of shared tasks related
to machine translation. We hope that both beginners and
established research groups will participate. This year we
are pleased to present the following tasks:<br>
<br>
- Translation task<br>
- Automatic Post-editing task (pilot)<br>
- Quality estimation task<br>
- Metrics task (including tunable metrics)<br>
<br>
Further information, including task rationale, timetables
and data can be found on the WMT15 website. Brief
descriptions of each task are given below. Prospective
participants are encouraged to subscribe to the mailing
list for further announcements (<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="https://groups.google.com/forum/#%21forum/wmt-tasks">https://groups.google.com/forum/#!forum/wmt-tasks</a>).<br>
<br>
For all tasks, participants will also be invited to submit
a short paper describing their system.<br>
<br>
---------------------<br>
Translation Task<br>
---------------------<br>
This task will compare translation quality on five European
language pairs (English-Czech, English-Finnish,
English-French, English-German and English-Russian). <br>
*New* for this year:<br>
- Finnish appears as a "guest" language<br>
- The English-French text will be drawn from informal news
discussions. All other test sets will be from professionally
written news articles.<br>
<br>
We will provide extensive monolingual and parallel data sets
for training, as well as development sets, all available for
download from the task website. Translations will be
evaluated using both automatic metrics and human
evaluation. Participants will be expected to contribute to
the human evaluation of the translations.<br>
<br>
For this year's task we will be releasing the following new
or updated corpora:<br>
- An updated version of news-commentary<br>
- A monolingual news crawl for 2014 in all the task
languages<br>
- Development sets for English-French and English-Finnish<br>
Not all data sets are available on the website yet, but they
will be uploaded as soon as they are ready.<br>
<br>
The translation task test week will be April 20-27.<br>
<br>
This task is supported by the EU projects MosesCore (<a
moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://www.mosescore.eu">http://www.mosescore.eu</a>),
QT21 and Cracker, and the Russian test sets are provided by
Yandex.<br>
<br>
-----------------------------------------------------<br>
Pilot task on Automatic Post-Editing<br>
-----------------------------------------------------<br>
This shared task will examine automatic methods for
correcting errors produced by machine translation (MT)
systems. Automatic Post-editing (APE) aims at improving MT
output in black box scenarios, in which the MT system is
used "as is" and cannot be modified.<br>
From an application point of view, APE components would
make it possible to:<br>
<br>
* Cope with systematic errors of an MT system whose decoding
process is not accessible<br>
* Provide professional translators with improved MT output
quality to reduce (human) post-editing effort<br>
<br>
In this first edition of the task, the evaluation will focus
on one language pair (English-Spanish), measuring systems'
capability to reduce the distance (HTER) that separates an
automatic translation from its human-revised version
approved for publication. Training and test data are
provided by Unbabel.<br>
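Concretely, HTER can be thought of as the word-level edit distance between the MT output and its human post-edit, divided by the length of the post-edit. The sketch below is a simplified illustration of that idea (the official HTER/TER computation also allows block shifts and uses dedicated tooling; the function name is ours, not the task's):<br>

```python
def hter(hypothesis: str, reference: str) -> float:
    """Simplified HTER: word-level edit distance between the MT
    output and its human post-edit, divided by the post-edit
    length. True TER also allows block shifts; this sketch uses
    plain Levenshtein distance over tokens as an approximation."""
    hyp, ref = hypothesis.split(), reference.split()
    # Standard dynamic-programming edit distance over word tokens.
    d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        d[i][0] = i
    for j in range(len(ref) + 1):
        d[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(hyp)][len(ref)] / len(ref)
```

A lower score means the MT output needed fewer edits to reach its approved human revision, so an APE system succeeds by lowering this distance.<br>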
<br>
Important dates<br>
Release of training data: January 31, 2015<br>
Test set distributed: April 27, 2015<br>
Submission deadline: May 15, 2015<br>
<br>
------------------------<br>
Quality Estimation<br>
------------------------<br>
<p>This shared task will examine automatic <b>methods for
estimating the quality of machine translation output at
run-time</b>, without relying on reference translations.
In this fourth edition of the shared task, in addition to
<b>word-level</b> and <b>sentence-level</b> estimation,
we will introduce <b>document-level </b>estimation. Our
main <b>goals</b> are the following: </p>
<ul>
<li> To investigate the effectiveness of quality labels
and features for document-level prediction. </li>
<li> To explore differences between sentence-level and
document-level prediction. </li>
<li> To analyse the effect of training data sizes and
quality for sentence and word-level prediction,
particularly for negative (i.e. low translation quality)
examples. </li>
</ul>
The WMT12-14 quality estimation shared tasks provided a set
of baseline features, datasets, evaluation metrics, and
oracle results. Building on the last three years' experience
and focusing on English, Spanish and German, this year's
shared task will reuse some of these resources, but provide
additional training and test sets.<br>
<br>
----------------<br>
Metrics Task<br>
----------------<br>
<br>
The shared metrics task will examine automatic evaluation
metrics for machine translation. We will provide you with
all of the translations produced in the translation task,
along with the reference human translations. You will
return your automatic metric scores for each of the
translations at the system level and/or the sentence level.
We will calculate the system-level and sentence-level
correlations of your rankings with the WMT15 human
judgements once the manual evaluation has been completed.<br>
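As an illustration, system-level correlation pairs one metric score with one human score per MT system and measures how well they agree; a common choice is the Pearson correlation. A minimal sketch with made-up scores (the official evaluation uses its own correlation measures and the real WMT15 judgement data):<br>

```python
def pearson(xs, ys):
    """Pearson correlation between metric scores and human judgements."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores: one metric score and one human judgement
# score per participating MT system (values invented for the demo).
metric_scores = [0.31, 0.27, 0.35, 0.22]
human_scores = [0.60, 0.55, 0.70, 0.40]
correlation = pearson(metric_scores, human_scores)
```

A metric whose scores rank systems the same way humans do yields a correlation close to 1.<br>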
<br>
In addition to this evaluation task, we will run a tunable
metrics task, similar to the one we ran in 2010. The idea
of this task is to evaluate which metrics give the best
performance (according to human evaluation) when used to
tune an SMT system. We will provide the system; you will
then tune it using your metric and send us the resulting
tuned weights.<br>
<br>
Full details of the metrics tasks will be made available on
the task website.<br>
<br>
<br>
The important dates for metrics task participants are:<br>
<br>
May 4, 2015 - System outputs distributed for metrics task<br>
May 25, 2015 - Submission deadline for metrics task<br>
<br>
-----<br>
<br>
Barry Haddow<br>
(on behalf of the organisers)<br>
<br>
<br>
</div>
<br>
<br>
</div>
<br>
<br>
</div>
<br>
<br>
</div>
<br>
</body>
</html>