<div dir="ltr">LAST CALL FOR PAPERS - Deadline February 14, 2014<br>


 <br>

Automatic and Manual Metrics for Operational Translation Evaluation<br><span class="">Workshop</span> at Language Resources and Evaluation Conference (<span class="">LREC</span>) 2014<br>

(<a href="http://mte2014.github.io/" target="_blank">http://mte2014.github.io/</a>)<br>

<br>

While a significant body of work has been done by the machine 

translation (MT) research community towards the development and 

meta-evaluation of automatic metrics to assess overall MT quality, less 

attention has been dedicated to more operational evaluation

 metrics aimed at testing whether translations are adequate within a 

specific context: purpose, end-user, task, etc., and why the MT system 

fails in some cases. Both of these can benefit from some form of manual 

analysis. Most work in this area is limited to

 productivity tests (e.g. contrasting time for human translation and MT 

post-editing). A few initiatives consider more detailed metrics for the 

problem, which can also be used to understand and diagnose errors in MT 

systems. These include the Multidimensional

 Quality Metrics (MQM) recently proposed by the EU FP7 project 

QTLaunchPad, the TAUS Dynamic Quality Framework, and past projects such 

as FEMTI, EAGLES and ISLE. Some of these metrics are also applicable to 

human translation evaluation. A number of task-based

 metrics have also been proposed for applications such as topic ID / 

triage and reading comprehension.<br>

The purpose of this <span class="">workshop</span> is to bring 

together representatives from academia, industry and government 

institutions to discuss and assess metrics for manual quality evaluation

 and compare them through correlation analysis with well-established 

metrics for

 automatic evaluation such as BLEU, METEOR and others, as well as 

reference-less metrics for quality prediction.<br>

The <span class="">workshop</span> will benefit from datasets already 

collected and manually annotated for translation errors by the 

QTLaunchPad project as part of a shared task on error annotation and 

automatic translation quality estimation.<br>

<br>

Submissions: We will accept two types of submissions:<br>

1. Abstract (of up to one page)<br>

2. One-page abstract plus full paper (6-10 pages)<br>

<br>

Both abstracts and full papers may address any of the topics included 

in this CFP (see below), but full papers have the advantage of 

presenting the authors’ work and ideas at a greater level of detail. 

Both abstract submissions and abstract + paper submissions

 must be received by the submission deadline below and will be reviewed 

by experts in the field. Short slots for oral presentation will be given

 to all accepted submissions, regardless of their format (abstract only 

or abstract + full paper).<br>


<br>

Topics: The <span class="">workshop</span> welcomes submissions on the topics of<br>

<br>

- task-based translation evaluation metrics,<br>

● specifically, metrics for machine (and/or human) translation quality 

evaluation and quality estimation, be these metrics automatic, 

semi-automatic or manual (rubric, error annotation, etc.),<br>

<br>

- error analysis of machine (and human) translations (automated and manual),<br>

● for example, studies exploring whether manually annotated translations

 can contribute to the automatic detection of specific translation 

errors and whether this can be used to automatically correct 

translations,<br>

- correlation between translation evaluation metrics, error analysis, and task-suitability of translations.<br>

<br>

The format of the <span class="">workshop</span> will be a half-day of

 short presentations on the above topics, followed by a half-day of 

hands-on collaborative work with MT metrics that show promise for the 

prediction of task suitability of MT output. The afternoon hands-on work

 will follow from the morning’s presentations. Thus, all submissions, 

both abstracts and abstracts + papers, should address at least the 

following points:<br>

● definition of the metric(s) being proposed, along with an indication of whether the metric is manual or automated,<br>

● method of computation of the metric(s), if not already well-known,<br>

● discussion of the applicability of the metric(s) to determining task suitability of MT output, and<br>

● indication of human (annotation) effort necessary to produce the metric(s).<br>

<br>

Submissions must be made via the START Conference Manager at <a href="https://www.softconf.com/lrec2014/MTE/" target="_blank">

https://www.softconf.com/lrec2014/MTE/</a>. Email submissions will not be reviewed.<br>

<br>

Share your LRs: When making a submission from the START page, authors 

will be asked to provide essential information about resources (in a 

broad sense, i.e. also technologies, standards, evaluation kits, etc.) 

that have been used for the work described in the

 paper or are a new result of their research. Moreover, ELRA encourages 

all <span class="">LREC</span> authors to share the described LRs 

(data, tools, services, etc.) to enable their reuse and the replicability of 

experiments, including evaluation experiments.<br>


<br>

Important Dates:<br>

Submission of Abstract or Abstract plus Paper: February 14, 2014<br>

[NOTE: Author(s) intending to submit a full paper must submit the full 

paper along with the abstract in order for it to be considered for 

inclusion in the <span class="">workshop</span> and publication in the <span class="">workshop</span> proceedings.]<br>

Notification to authors: March 10, 2014<br>

Camera-ready versions of accepted Abstract or Abstract plus Paper due to organizing committee: March 28, 2014<br>

<span class="">Workshop</span> Date: May 26, 2014<br>

<br>

Organizing Committee: Keith J. Miller (MITRE), Lucia Specia (University 

of Sheffield), Kim Harris (GALA and text & form), Stacey Bailey 

(MITRE)</div>