Conf: MT Summit IX 2003: DEADLINE EXTENDED!

alexis.nasr at LINGUIST.JUSSIEU.FR
Tue May 13 13:06:11 UTC 2003

                      Workshop on MT Evaluation
                   Towards Systematizing MT Evaluation

                         (deadline extended!)


Estimating the quality of any machine-translation system accurately is
only possible if the evaluation methodology is robust and
systematic. The NSF and EU-funded ISLE project has created a taxonomy
that relates situations and measures for a variety of MT
applications. The "Framework for MT Evaluation in ISLE" (FEMTI) is now
available online. The effort of matching these measures correctly with
their appropriate evaluation tasks, however, is an area that needs
further attention. For example, what effect do "user needs" have on
the "functionality characteristics" specified in the FEMTI guidelines?
To what extent are
there unseen relationships in the branches of the taxonomy? How can we
judge when a given evaluation measure is appropriate? Issues that come
to bear on this question are the automation of MT evaluation, the
extension to MT applications such as automated speech-translation, and
the evaluation of the very training corpora that an MT system relies
on to improve output quality.

This workshop welcomes papers for 30-minute presentations on the
comparison of MT evaluation measures, studies of the behavior of
individual measures (i.e., meta-evaluation), new uses for measures,
analysis of MT evaluation tasks with respect to measures, and related
topics. We solicit submissions that address some of the following
issues; however, any other topic related to MT testing and evaluation
is also acceptable:

Machine Translation Evaluation Measures:

    Use of existing measures in the ISLE hierarchy (FEMTI guidelines)
    New measures and their uses
    Matching evaluation requirements (e.g., translation tasks, user
profiles) with measures
    Effects of combining measures

Evaluation Measures and Languages

    Is a metric's effectiveness language-independent?
    Counting grammatical features for evaluation

Evaluation and Domains

    Measures for spoken-language translation
    Domain-specific evaluation techniques
    Using measures to evaluate the quality of a training corpus for a
given task

Automation vs. Human Testing

    Which measures are suitable for automation?
    Human/machine scoring comparisons
    Human tester agreement: which measures fare best?

Submission Format

Full papers (up to 8 pages in length) must be submitted
electronically to:
barrett at, or andrei.popescu-belis at

Papers are preferred in .pdf, .ps, .rtf, or .txt format.

Important Dates: NEW

Paper submission deadline: May 26

Notification: June 30

Camera-ready due: July 31

Note: the workshop will be held on September 23, 2003

Also see workshop website for updates:


Leslie Barrett (Transclick, Inc., New York, NY)

Andrei Popescu-Belis (ISSCO/TIM/ETI, University of Geneva)


Organizing Committee:

Leslie Barrett (Transclick, Inc., New York, NY)

Maghi King (ISSCO/TIM/ETI, University of Geneva)

Keith Miller (MITRE Corp)

Andrei Popescu-Belis (ISSCO/TIM/ETI, University of Geneva)

Program Committee:

Bonnie Dorr (University of Maryland)

Eduard Hovy (Information Sciences Institute, University of Southern California)

Maghi King (ISSCO/TIM/ETI, University of Geneva)

Bente Maegaard (Center for Sprogteknologi, Copenhagen, Denmark)

Keith Miller (MITRE Corp.)

Martha Palmer (University of Pennsylvania)

Ted Pedersen (University of Minnesota)

Andrei Popescu-Belis (ISSCO/TIM/ETI, University of Geneva)

Florence Reeder (MITRE Corp)

Nancy Underwood (ISSCO/TIM/ETI, University of Geneva)

Michelle Vanni (National Computer Security Center)
Message distributed via the Langage Naturel list <LN at>

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)

More information about the Ln mailing list