18.2516, Confs: Computational Linguistics/Denmark
LINGUIST Network
linguist at LINGUISTLIST.ORG
Tue Aug 28 18:43:33 UTC 2007
LINGUIST List: Vol-18-2516. Tue Aug 28 2007. ISSN: 1068 - 4875.
Subject: 18.2516, Confs: Computational Linguistics/Denmark
Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Randall Eggert, U of Utah
<reviews at linguistlist.org>
Homepage: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.
Editor for this issue: Jeremy Taylor <jeremy at linguistlist.org>
================================================================
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.
===========================Directory==============================
1)
Date: 28-Aug-2007
From: Helene Mazo < mazo at elda.org >
Subject: Automatic Procedures in MT Evaluation at MT Summit XI
-------------------------Message 1 ----------------------------------
Date: Tue, 28 Aug 2007 14:41:25
From: Helene Mazo [mazo at elda.org]
Subject: Automatic Procedures in MT Evaluation at MT Summit XI
Automatic Procedures in MT Evaluation at MT Summit XI
Date: 11-Sep-2007 - 11-Sep-2007
Location: Copenhagen, Denmark
Contact: Gregor Thurmair
Contact Email: g.thurmair at linguatec.de
Meeting URL:
http://mtsummitcph.ku.dk/workshops/mts-automatic_procedures_in_mt_evaluation.doc
Linguistic Field(s): Computational Linguistics
Meeting Description:
This workshop, during MT Summit XI, Copenhagen 2007 (Sept. 11), focusses
on the discussion of automatic evaluation procedures in MT: BLEU / NIST,
d-score, x-score, edit distance, and other such tools.
The questions to be discussed are:
- What do the scores really measure? Are they biased towards
specific MT technologies? (validity)
- What kind of initial effort do they require (e.g. pre-translating
the test corpus)? (economy)
- What kind of implicit assumptions do they make?
- What kind of resources do they need (e.g.: third party
grammars)? (economy, feasibility)
- What kind of diagnostic support can they give? (where to
improve the system)
- What kind of evaluation criteria (related to the FEMTI
framework) do they support (adequacy, fluency, ...)?
The objective of the workshop is to learn from recent evaluation
activities, to create a better understanding of the strengths and
limitations of the respective approaches, and to get closer to a common
methodology for MT output evaluation.
Draft programme
9.00 Welcome and introduction
9.20 The place of automatic evaluation metrics in external quality
models for machine translation
Andrei Popescu-Belis, University of Geneva
10.00 Evaluating Evaluation --- Lessons from the WMT'07 Shared Task
Philipp Koehn, University of Edinburgh
10.30 Coffee break
11.00 Investigating Why BLEU Penalizes Non-Statistical Systems
Eduard Hovy, University of Southern California
11.30 Edit distance as an evaluation metric
Christopher Cieri, Linguistic Data Consortium (TBC)
12.00 Experience and conclusions from the CESTA evaluation project
Olivier Hamon, ELDA
12.30 Lunch
13.30 Automatic Evaluation in MT system production
Gregor Thurmair, Linguatec
14.00 Sensitivity of performance-based and proximity-based models for MT
evaluation
Bogdan Babych, Univ. Leeds
14.30 Automatic & human Evaluations of MT in the framework of a speech to
speech communication
Khalid Choukri, ELDA
15.00 Coffee break
15.30 Discussion and conclusions
17.00 Close