Corpora: NAACL-01 Machine Translation Workshop

Priscilla Rasmussen rasmusse at cs.rutgers.edu
Wed Jan 17 18:38:26 UTC 2001


________________________________________________________________

                          CALL FOR PARTICIPATION


                 Workshop on Machine Translation Evaluation
                       in conjunction with NAACL-2001

                         WORKSHOP ON MT EVALUATION:

                            Hands-On Evaluation

                               3 June, 2001
                             Pittsburgh, PA
                               United States

MOTIVATION

Evaluation of language tools, particularly tools that generate language,
remains an interesting and general problem, and Machine Translation (MT) is
a prime example. Approaches to evaluating MT are even more plentiful than
approaches to MT itself; the sheer number of evaluations and the range of
variants are confusing to anyone considering an evaluation. In an effort to
systematize MT evaluation, the NSF-funded ISLE project has created a
taxonomy of evaluation-related features and measures. Unfortunately, many
prior evaluations do not include an adequate specification of important
aspects such as evaluation process complexity, cost, and score variance.

To drive MT evaluation to the next level, this workshop will focus on
hands-on exercises with methods for acquiring such information for several
important MT evaluation measures. The workshop thus embodies the challenge
of Hands-On Evaluation, within the framework being developed by the ISLE MT
Evaluation effort. It follows the workshop on MT Evaluation held at the
AMTA Conference in Cuernavaca, Mexico, in October 2000, and a related
workshop planned for April 2001 in Geneva.


STRUCTURE OF THE WORKSHOP

The first part of the workshop will introduce the ISLE MT Evaluation
effort, funded by NSF and the EU, which aims to create a general framework
of characteristics in terms of which MT evaluations, past and future, can
be described and classified. The framework, whose antecedents are the JEIDA
and EAGLES reports, consists of taxonomies of increasingly specific
features, with associated measures and pointers to systems. The discussion
will review the current state of the classification effort as well as the
MT evaluation history from which it was drawn.

The second and principal part of the workshop will focus on real-world
evaluation. To establish common ground for discussion, participants will be
given specific evaluation exercises, defined by the taxonomy and by recent
MT evaluation trends, together with a set of texts generated by MT systems
and corresponding human reference translations. During the workshop they
will be asked to carry out the evaluation exercises on this data. This
common framework will give insights into the evaluation process and useful
metrics for driving the development process. The results of the exercises
will then be presented by the participants, synthesized into a uniform
description of each evaluation, and added to the ISLE taxonomy, which has
been made available on the web to support future analysis of MT
evaluation. The results of the workshop will also be incorporated into a
publicly available resource, and the workshop workbook will be usable by
teachers of evaluation and MT.


QUESTIONS AND ISSUES

Since this is a hands-on workshop, participants will be asked to submit an
intent to participate. At that time, they will be able to download the
relevant data for review. During the workshop, they will be given a series
of exercises and split into teams to work through them. The result of the
workshop will be at least one paper that addresses the following threads of
investigation within the framework:

   * What is the variance inherent in an evaluation measure? (A toy
     numerical sketch of this question follows the list.)
   * How complex is it to employ a measure?
   * What task(s) is the evaluation measure suited to?
   * What kinds of tools automate the evaluation process?
   * What kinds of metrics are useful for users versus system developers?
   * How can we use the evaluation process to speed up or improve the MT
     development process?
   * What impact does real-world data have?
   * How can we evaluate MT when MT is a small part of the data flow?
   * How independent is MT of the subsequent processing? That is, cleaning
     up the data improves performance, but does it improve it enough? How
     do we quantify that?
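
To make the first question concrete, here is a minimal sketch in Python of
one way to estimate judge-to-judge variance for a simple adequacy measure.
The judge names, ratings, and the 1-5 scale are invented for illustration;
this is not part of the workshop materials or of the ISLE framework.

    # Hypothetical illustration only: how much does a per-system adequacy
    # score move when different human judges produce the ratings?
    from statistics import mean, pstdev

    # Invented data: each judge rates the same five output segments of one
    # MT system for adequacy on a 1-5 scale.
    scores = {
        "judge_1": [4, 3, 5, 2, 4],
        "judge_2": [3, 3, 4, 2, 5],
        "judge_3": [5, 4, 4, 3, 4],
    }

    per_judge_means = [mean(ratings) for ratings in scores.values()]
    system_score = mean(per_judge_means)           # overall adequacy estimate
    judge_variance = pstdev(per_judge_means) ** 2  # spread due to judges

    print(f"system score: {system_score:.2f}")
    print(f"variance across judges: {judge_variance:.3f}")

Repeating such a calculation over systems, judges, and samples of segments
is one simple way to put numbers on the variance, cost, and complexity
questions above.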

TO REGISTER

Since this is a hands-on workshop, no papers are being solicited.
Participants will be expected to take part in the exercises and report their
conclusions. They will additionally be encouraged to contribute to a summary
paper of the workshop proceedings. The data will be sent to participants in
advance of the workshop, with instructions on what to do and what to
prepare. The amount of work required should not exceed four hours (much
less than preparing a paper).

To register an intent to participate, please send a paragraph outlining
your interest in MT, your experience with MT evaluation, and your knowledge
of either Spanish or Arabic, along with the following information, to Flo
Reeder (contact information below):

   * name
   * address
   * e-mail address
   * knowledge of other foreign languages
   * translation domain specialization

Participants will need to register for the workshop as part of their NAACL
registration.


IMPORTANT DATES

Intent to Participate: April 16, 2001
Release of Data: April 23, 2001
Workshop date: June 3, 2001


CONTACT POINTS

Florence Reeder
MITRE Corporation
1820 Dolley Madison Blvd.
McLean, VA 22102-3481
TEL: 703-883-7156
FAX: 703-883-1379
EMAIL: freeder at mitre.org

Eduard Hovy
Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292-6695
TEL: 310-448-8731
FAX: 310-823-6714
EMAIL: hovy at isi.edu

Workshop URL: http://www.isi.edu/natural-language/mt-eval-naacl.html


