[Corpora-List] Interlingual Machine Translation Systems (fwd)

Sun Nov 21 18:06:01 UTC 2004

This discussion about the quality of MT systems is well worth having,
though I don't quite understand what it has to do with corpora.  There
has been serious discussion on how to evaluate MT systems since the
1950s or even earlier, and my understanding of the conclusions is:

1) Task-related evaluation, where most of the questions one asks are
things like "would this translation result allow me to judge the
subject of the document?", is significantly less controversial than any
attempt to judge good/bad without regard to context or purpose.

2) Automatic methods such as IBM's BLEU are potentially useful tools
for system builders, but don't always yield deep insights, because the
way that they measure "quality" is pretty crude.  See
(http://www.amtaweb.org/summit/MTSummit/FinalPapers/90-Turian-final.pdf),
where Melamed's group do a nice job of demystifying BLEU, proposing an
alternative and re-raising the very challenging question of what we
should expect from automatic measures. To my mind the main message of
this work is that we should be cautious in trusting human judgements
of translation quality when comparing multiple systems of divergent
quality on short documents.  As the doctor in the joke knows:
sometimes the right answer to "It hurts when I do this" is "Then don't
do that". Of course, that doesn't mean that we should give up on
automatic evaluation, just that we shouldn't expect superb results
all the time.

3) If you are serious about understanding the nature of the evaluation
task, one place to start is
(http://www.issco.unige.ch/projects/isle/MT-Summit-wsp.html) which has
good pointers to how things looked in 2001.
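
To make the "pretty crude" remark in point 2 concrete, here is a minimal
Python sketch of clipped n-gram precision, the kind of counting that
BLEU-style measures are built on. It is an illustration only, not IBM's
full BLEU (no brevity penalty, no combination across n-gram orders, a
single reference), and the function names and example sentences are my
own assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference ("clipping"). Tokenisation is plain
    whitespace splitting, which is itself a simplifying assumption.
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


reference = "the cat is on the mat"
scrambled = "mat the on is cat the"  # same words, nonsensical order

print(clipped_precision(scrambled, reference, n=1))  # 1.0
print(clipped_precision(scrambled, reference, n=2))  # 0.0
```

The scrambled sentence gets a perfect unigram score and a zero bigram
score. Neither number says much about whether a reader could use the
translation, which is the sense in which such counts are crude.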

But whatever you think of the state of MT evaluation, one of the most
important lessons of the last 40 years is that it doesn't help to dismiss work
in AI or AI-like fields simply because humans can (perhaps) do
better. The argument for pragmatism as a defense against inflated
expectations is made in one of the classic papers of this field, which
is Church and Hovy's

"Good Applications for Crummy Machine Translation"
(http://www.isi.edu/natural-language/people/hovy/papers/93churchhovy.pdf).

Some of the examples are dated now that MT is more available on the web, but
this is a paper that everybody should read and consider.

--
==================================================================
Dr. Chris Brew,  Associate Professor of Computational Linguistics
Department of Linguistics, The Ohio State University
1712 Neil Avenue, Columbus OH 43210
Tel:  +614 292 5420 Fax: +614 292 8833
Web:http://www.ling.ohio-state.edu/~cbrew
Email:c-b-r-e-w at acm.org (delete hyphens)

If you do not use correct grammar, people will lose respect for you,
and they will burn down your house.  - Dave Barry

==================================================================