12.353, Confs: Machine Translation Evaluation - Geneva

Sun Feb 11 23:15:42 UTC 2001

LINGUIST List:  Vol-12-353. Sun Feb 11 2001. ISSN: 1068-4875.

Subject: 12.353, Confs: Machine Translation Evaluation - Geneva

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
            Andrew Carnie, U. of Arizona <carnie at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Editors (linguist at linguistlist.org):
	Karen Milligan, WSU 		Naomi Ogasawara, EMU
	Lydia Grebenyova, EMU		Jody Huellmantel, WSU
	James Yuells, WSU		Michael Appleby, EMU
	Marie Klopfenstein, WSU		Ljuba Veselinova, Stockholm U.

Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
          Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Lydia Grebenyova <lydia at linguistlist.org>
 ==========================================================================
Please keep conference announcements as short as you can; LINGUIST
will not post conference announcements which in our opinion are
excessively long.

=================================Directory=================================

1)
Date:  Fri, 09 Feb 2001 12:52:13 -0500
From:  "Reeder,Florence M." <freeder at mitre.org>
Subject:  Machine Translation Evaluation, Geneva

-------------------------------- Message 1 -------------------------------

Date:  Fri, 09 Feb 2001 12:52:13 -0500
From:  "Reeder,Florence M." <freeder at mitre.org>
Subject:  Machine Translation Evaluation, Geneva

MT Evaluation : An invitation to get your hands dirty!
(in conjunction with the MT Evaluation Working Group
of the ISLE project)
================================================

Background
--------
A workshop on MT evaluation organised during the AMTA conference in
Cuernavaca in 2000 included a series of practical exercises on machine
translation evaluation. Carrying out the exercises provided insights
into the difficulties and subtleties of MT evaluation, thus inspiring
several of those present to suggest the organisation of a longer
workshop whose primary focus would be to design and carry out portions
of a thorough evaluation.

At the same time, the Evaluation Working Group of the ISLE project
(funded by the EU, the NSF in the USA and by the Swiss and Danish
governments) has been working on the provision of support material for
those involved in MT evaluation.  This material takes the form of
classification schemes intended to be helpful in the definition of user
needs, the choice of system characteristics of importance to the
specific evaluation and the choice of metrics to be applied to system
characteristics.  The current version of the ISLE proposals can be seen
at http://www.isi.edu/natural-language/mteval/ .

Date and Place
------------
We invite you to a practical workshop to be held in Geneva between
April 19th and 24th 2001.

Organisation and activities
-------------------------
Participants in the workshop will be provided with a scenario
describing a practical situation in which an evaluation of an MT system
or systems might be undertaken.  The organisers will ensure that the
scenario(s) reflect real-life situations.  Participants will then spend
two days designing an evaluation which is appropriate to their scenario,
using a unified framework (ISLE) described in the introductory talks.
They may choose to work alone or in small groups.  Participants will
have free access to the machine translation systems available on the
web, and to the considerable computing support available at the
University of Geneva School of Translation and Interpretation.  As far
as possible, the evaluations will actually be carried out. Results and
experience will be pooled and discussed on the final day of the workshop.
This workshop can be seen as one in an ongoing series (LREC 2000, AMTA 2000,
NAACL 2001, MT Summit VIII 2001), where each workshop builds on the
experience and the results of previous workshops. Potential participants
need not, however, have participated in earlier workshops, although we
would of course like to encourage them to participate in later ones.

More information about the conferences where workshops have been held
or will be held can be found at:
LREC: http://www.icp.grenet.fr/ELRA/lrec2000.html
AMTA: http://www.isi.edu/natural-language/conferences/AMTA2000.html
NAACL 2001: http://www.cs.cmu.edu/~ref/NAACL2001
MT Summit VIII: http://www.eamt.org/summitVIII

Week outline timetable
--------------------
* April 19th, morning :  Introduction of the ISLE proposals.
	Distribution and discussion of scenarios, formation of working
	groups.
* April 19th, afternoon, and April 20th :  Design of evaluations.
* April 21st, morning : Execution of evaluations.
* April 21st, afternoon : Free time.
* April 22nd : Free time.
* April 23rd : Interpretation of evaluation scores and metrics.
* April 24th : Reports and discussion of results.

Major themes of interest
----------------------
	* What metrics are suitable for assessing what system
	  characteristics?
	* What system characteristics reflect what user needs?
	* Is there a radical difference between evaluation focusing on
	  research or development needs and evaluation focusing on
	  end-user needs?
	* When should real world data be used, and what is the impact of
	  using it?
	* What constitutes a valid metric? How can you demonstrate that
	  a metric is valid?
	* What metrics can be automated?
	* What are the advantages and disadvantages of specific
	  metrics?
	* For the metric(s) selected for the evaluation, what are the
	  difficulties in applying them?
	* For a given metric, what variations in scores are typically
	  produced?
	  What are the statistical error variances?
	* For a given metric, what are the score ranges for 'good' and
	  for 'bad' systems?
	* Are there metrics which correlate with one another? Are there
	  metrics which indicate an overall quality score?
	* Are there metrics which work better with specific language
	  pairs?

Participation
-----------
Participation in the workshop is free of charge, although participants
must pay their own travel and living expenses. Because of the nature of
the exercise, participation is limited to a maximum of 20 persons, and
will be on a first-come, first-served basis. Note, though, that if a
group would like to participate as a team, these restrictions may be
relaxed in order to accommodate them.

Further information can be obtained from
	Maghi King (Margaret.King at issco.unige.ch) or
	Florence Reeder (freeder at mitre.org)

How to register
-------------
Send your registration request to Gisella Anspach
(Gisella.Anspach at issco.unige.ch) as soon as possible and at the
absolute latest by March 15th. Gisella will also be able to help
you if necessary with finding accommodation in Geneva, as will the
Geneva Tourist Office whose site at
	http://www.geneva-tourism.ch/eng/
will provide you with much information about the city.

DEADLINES
-------
	REGISTRATION - MARCH 15TH
	LATEST NOTIFICATION OF ACCEPTANCE - MARCH 22ND
	WORKSHOP - APRIL 19TH - 24TH

---------------------------------------------------------------------------
LINGUIST List: Vol-12-353