<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

</head>

<body bgcolor="#ffffff" text="#000000">

<font face="Trebuchet MS"><b style=""><span style="" lang="EN-GB">CALL

FOR PAPERS<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="" lang="EN-GB"><o:p> </o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="font-size: 14pt;"

 lang="EN-GB">ELRA

Workshop on Evaluation<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="font-size: 14pt;"

 lang="EN-GB">Looking

into the Future of Evaluation: when automatic metrics meet task-based

and

performance-based approaches<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="" lang="EN-GB"><o:p> </o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="" lang="EN-GB"><o:p> </o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="font-size: 11pt;" lang="EN-GB">To

be held in conjunction with the 6th International Language Resources

and Evaluation Conference (LREC 2008)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><b>27 May 2008, </b>Palais des Congrès

Mansour

Eddahbi, Marrakech<br>

<br>

<br>

Submission page:<br>

</font>

<div align="left"></div>

<font face="Trebuchet MS"><o:p> </o:p></font>

<div align="left"></div>

<font face="Trebuchet MS"><b><u><span style="color: blue;"><a

 href="https://www.softconf.com/LREC2008/ELRA-EVAL2008/submit.html"><span

 style="font-size: 10pt;">https://www.softconf.com/LREC2008/ELRA-EVAL2008/submit.html</span></a><o:p></o:p></span></u></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><o:p> </o:p></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="font-size: 14pt;"

 lang="EN-GB">Background<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Automatic methods
for evaluating system performance play an important role in the
development of language technology systems.
They speed up research and development by providing fast feedback, and
they are also intended to make results comparable while aiming to match
human judgements of output quality. However, after several years of
studying and using such metrics, we still face problems such as the
following:<o:p></o:p></span></font>

<div align="left">

<ul style="margin-top: 0cm;" type="disc">

  <li class="MsoNormal" style=""><font face="Trebuchet MS"><span

 style="" lang="EN-GB">they only evaluate part of what should be

evaluated<o:p></o:p></span></font></li>

  <li class="MsoNormal" style=""><font face="Trebuchet MS"><span

 style="" lang="EN-GB">they produce measurements that are hard to

understand/explain, and/or hard to relate to the concept of quality<o:p></o:p></span></font></li>

  <li class="MsoNormal" style=""><font face="Trebuchet MS"><span

 style="" lang="EN-GB">they fail to match human evaluation<o:p></o:p></span></font></li>

  <li class="MsoNormal" style=""><font face="Trebuchet MS"><span

 style="" lang="EN-GB">they require resources that are expensive to

create<o:p></o:p></span></font></li>

</ul>

</div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Therefore,
an effort to integrate
knowledge from a multitude of evaluation activities and methodologies
should
help us solve some of these immediate problems and avoid creating new
metrics
that reproduce them.<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Taking MT as
a sample case, two problems stand out immediately: reference
translations and distance measurement. Reference translations are
difficult and expensive to produce, they do not cover the usually wide
spectrum of translation possibilities and, even more discouragingly,
worse results are obtained when they are of higher quality (more
spontaneous and natural, and thus sometimes more lexically and
syntactically distant from the source text). As for distance
measurement, the distance between the source text and the output text
is measured by automatic metrics that do not match human intuition as
well as claimed. Furthermore, different metrics perform differently,
which has already led researchers to study combinations of metrics and
approaches that integrate automatic methods into a deeper,
linguistically oriented evaluation.
Hopefully, this should help soften the unfair treatment received by
some rule-based systems, which are clearly penalised by certain metrics
that are sensitive to the system approach.<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">On the other
hand, there is the key issue of
« what needs to be measured » in order to conclude that
« something is of good quality », or perhaps rather that « something
is useful for a particular purpose ». In this regard, work such as that
carried out within the FEMTI framework has shown that aspects such as
usability, reliability, efficiency and portability should also be
considered. However, measuring such quality characteristics cannot
always be automated, and
there may be many other aspects that could usefully be measured.<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">This workshop
continues a series
of workshops that have addressed methodological problems, not only for
MT but for evaluation
in general. Along the lines of these discussions,
and
aiming to go one step further, the current workshop will take into
account
both the advantages of automatic methods and the shortcomings of
current methods, and will
focus on task-based and performance-based approaches to the evaluation
of natural
language applications, with key questions such as:<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<!--[if !supportLists]--><font face="Trebuchet MS"><span style=""

 lang="EN-GB"><span style="">-<span

 style="font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">        

</span></span></span><span dir="ltr"><span style="" lang="EN-GB">How

can it be determined how<b style="">

useful</b> a given system is for a given task?<o:p></o:p></span></span></font><!--[endif]-->

<div align="left"></div>

<!--[if !supportLists]--><font face="Trebuchet MS"><span style=""

 lang="EN-GB"><span style="">-<span

 style="font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">        

</span></span></span><span dir="ltr"><span style="" lang="EN-GB">How
can focusing on such issues, and combining these approaches with the
experience we have already acquired in automatic evaluation, help us
develop new
metrics
and methodologies that avoid the shortcomings of current
automatic
metrics? <o:p></o:p></span></span></font><!--[endif]-->

<div align="left"></div>

<!--[if !supportLists]--><font face="Trebuchet MS"><span style=""

 lang="EN-GB"><span style="">-<span

 style="font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">        

</span></span></span><span dir="ltr"><span style="" lang="EN-GB">Should

we work on hybrid methodologies of automatic and human evaluation

for certain technologies and not for others?<o:p></o:p></span></span></font><!--[endif]-->

<div align="left"></div>

<!--[if !supportLists]--><font face="Trebuchet MS"><span style=""

 lang="EN-GB"><span style="">-<span

 style="font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">        

</span></span></span><span dir="ltr"><span style="" lang="EN-GB">Can we

already envisage the integration of these approaches?<o:p></o:p></span></span></font><!--[endif]-->

<div align="left"></div>

<!--[if !supportLists]--><font face="Trebuchet MS"><span style=""

 lang="EN-GB"><span style="">-<span

 style="font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">        

</span></span></span><span dir="ltr"><span style="" lang="EN-GB">Can we

already plan for some immediate collaborations/experiments?<o:p></o:p></span></span></font><!--[endif]-->

<div align="left"></div>

<!--[if !supportLists]--><font face="Trebuchet MS"><span style=""

 lang="EN-GB"><span style="">-<span

 style="font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">        

</span></span></span><span dir="ltr"><span style="" lang="EN-GB">What

would it mean for the FEMTI framework to be extended to other HLT

applications, such as summarization, IE, or QA? Which new aspects would

it need

to cover?<o:p></o:p></span></span></font><!--[endif]-->

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">We solicit papers

that address these questions

and other related issues relevant to the workshop.<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"><font face="Trebuchet MS"><b style=""><span

 style="font-size: 14pt; font-family: 'Times New Roman';" lang="EN-GB">Workshop

Programme and Audience Addressed<o:p></o:p></span></b></font>

</div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">This full-day
workshop is intended for
researchers and developers working on different evaluation technologies
who have
experience with the issues raised in this call and are interested
in
defining a methodology to move forward.<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">The workshop
will feature invited talks and submitted
papers, and will conclude with a discussion on future developments and
collaboration.<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="font-size: 14pt;"

 lang="EN-GB">Workshop

Chairing Team<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Gregor Thurmair

(Linguatec Sprachtechnologien

GmbH, </span><st1:country-region><st1:place><span style="" lang="EN-GB">Germany</span></st1:place></st1:country-region><span

 style="" lang="EN-GB">) – chair<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Khalid Choukri (</span><st1:City><st1:place><span

 style="" lang="EN-GB">ELDA</span></st1:place></st1:City><span style=""

 lang="EN-GB"> - Evaluations and Language

resources </span><st1:place><st1:City><span style="" lang="EN-GB">Distribution

Agency</span></st1:City><span style="" lang="EN-GB">, </span><st1:country-region><span

 style="" lang="EN-GB">France</span></st1:country-region></st1:place><span

 style="" lang="EN-GB">) – co-chair<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Bente Maegaard

(CST, </span><st1:place><st1:City><span style="" lang="EN-GB">University

of Copenhagen</span></st1:City><span style="" lang="EN-GB">, </span><st1:country-region><span

 style="" lang="EN-GB">Denmark</span></st1:country-region></st1:place><span

 style="" lang="EN-GB">) – co-chair<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="font-size: 14pt;"

 lang="EN-GB">Organising Committee<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Victoria Arranz (</span><st1:City><st1:place><span

 style="" lang="EN-GB">ELDA</span></st1:place></st1:City><span style=""

 lang="EN-GB"> - Evaluations and Language

resources </span><st1:place><st1:City><span style="" lang="EN-GB">Distribution

Agency</span></st1:City><span style="" lang="EN-GB">, </span><st1:country-region><span

 style="" lang="EN-GB">France</span></st1:country-region></st1:place><span

 style="" lang="EN-GB">)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Khalid Choukri (</span><st1:City><st1:place><span

 style="" lang="EN-GB">ELDA</span></st1:place></st1:City><span style=""

 lang="EN-GB"> - Evaluations and Language

resources </span><st1:place><st1:City><span style="" lang="EN-GB">Distribution

Agency</span></st1:City><span style="" lang="EN-GB">, </span><st1:country-region><span

 style="" lang="EN-GB">France</span></st1:country-region></st1:place><span

 style="" lang="EN-GB">)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="IT">Christopher Cieri

(LDC - Linguistic Data Consortium, USA)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Eduard Hovy (</span><a

 href="http://www.isi.edu"><span

 style="color: windowtext; text-decoration: none;" lang="EN-GB">Information

Sciences Institute </span></a><span style="" lang="EN-GB">of the </span><a

 href="http://www.usc.edu"><span

 style="color: windowtext; text-decoration: none;" lang="EN-GB">University

of Southern

California</span></a><span style="" lang="EN-GB">, </span><st1:country-region><st1:place><span

 style="" lang="EN-GB">USA</span></st1:place></st1:country-region><span

 style="" lang="EN-GB">)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Bente Maegaard

(CST, </span><st1:place><st1:City><span style="" lang="EN-GB">University

of Copenhagen</span></st1:City><span style="" lang="EN-GB">, </span><st1:country-region><span

 style="" lang="EN-GB">Denmark</span></st1:country-region></st1:place><span

 style="" lang="EN-GB">)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Keith J. Miller

(The MITRE Corporation, </span><st1:country-region><st1:place><span

 style="" lang="EN-GB">USA</span></st1:place></st1:country-region><span

 style="" lang="EN-GB">)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Satoshi Nakamura

(National Institute of Information and Communications </span><st1:place><st1:City><span

 style="" lang="EN-GB">Technology</span></st1:City><span style=""

 lang="EN-GB">, </span><st1:country-region><span style="" lang="EN-GB">Japan</span></st1:country-region></st1:place><span

 style="" lang="EN-GB">)</span><span style="" lang="EN-GB"><o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Andrei

Popescu-Belis (IDIAP Research Institute,

</span><st1:country-region><st1:place><span style="" lang="EN-GB">Switzerland</span></st1:place></st1:country-region><span

 style="" lang="EN-GB">)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Gregor Thurmair

(Linguatec Sprachtechnologien </span><st1:place><st1:City><span

 style="" lang="EN-GB">GmbH</span></st1:City><span style="" lang="EN-GB">,

</span><st1:country-region><span style="" lang="EN-GB">Germany</span></st1:country-region></st1:place><span

 style="" lang="EN-GB">)<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="font-size: 14pt;"

 lang="EN-GB">Important dates<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Deadline

for abstracts: </span><st1:date month="2" day="18" year="2008"><span

 style="" lang="EN-GB">Monday 18 February 2008</span></st1:date><span

 style="" lang="EN-GB"><o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Notification

to Authors: </span><st1:date month="3" day="10" year="2008"><span

 style="" lang="EN-GB">Monday 10 March 2008</span></st1:date><span

 style="" lang="EN-GB"><o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Submission

of Final Version: </span><st1:date month="3" day="31" year="2008"><span

 style="" lang="EN-GB">Monday 31 March 2008</span></st1:date><span

 style="" lang="EN-GB"><o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Workshop: </span><st1:date

 month="5" day="27" year="2008"><span style="" lang="EN-GB">Tuesday 27

May 2008</span></st1:date><span style="" lang="EN-GB"><o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><b style=""><span style="font-size: 14pt;"

 lang="EN-GB">Submission Format<o:p></o:p></span></b></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB">Abstracts should

be no longer than 1500 words

and should be submitted in PDF format through the <a

 href="https://www.softconf.com/LREC2008/ELRA-EVAL2008/submit.html">online

submission form</a> on START. For further queries, please contact

Gregor

Thurmair at <a href="mailto:g.thurmair@linguatec.de">g.thurmair@linguatec.de</a>.

<o:p></o:p></span></font>

<div align="left"></div>

<font face="Trebuchet MS"><span style="" lang="EN-GB"><o:p> </o:p></span></font>

</body>

</html>