Dear Colleagues,
The EACL 2003 workshop on the reuse of evaluation resources addresses - among other topics - the reuse of corpora that have been developed for evaluation purposes, and it may therefore be of interest to some of you. Detailed information on the workshop can be found in the call for papers below.
All best,
Katerina Pastra
===============================================================
[Apologies for multiple postings]
*********************** Call for Papers ***********************

EACL 2003 Workshop on:

“Evaluation Initiatives in Natural Language Processing:
are evaluation methods, metrics and resources reusable?”

13 April 2003, Budapest, Hungary
11th Conference of the European Chapter of the Association for Computational Linguistics (April 12-17, 2003)
http://www.dcs.shef.ac.uk/~katerina/EACL03-eval
Introduction:
Systems that accomplish different Natural Language Processing (NLP) tasks have different characteristics and, it would therefore seem, different requirements for evaluation. However, are there common features in the evaluation methods used across language technologies? Could the evaluation methods established for one type of system be ported or adapted to another NLP research area? Could automatic evaluation metrics be ported? For instance, could Papineni's MT evaluation metric be used for the evaluation of generated summaries? Could the extrinsic evaluation method used within SUMMAC be applied to the evaluation of Natural Language Generation systems? What are the reusability obstacles encountered, and how could they be overcome? What are the evaluation needs of system types, such as dialogue systems, that have been less strenuously evaluated until now, and how could they benefit from current practices in evaluating Language Engineering technologies? What are the evaluation challenges that emerge from systems that integrate a number of different language processing functions (e.g. multimodal dialogue systems such as Smartkom)? Could resources (e.g. corpora) used for a specific NLP task be reused for the evaluation of the output of an NLP system, and if so, what adaptations would this require? Some years ago, John White suggested a hierarchy of difficulty, or compositionality, of NLP tasks; if correct, does this have implications for evaluation?
End-to-end evaluation of systems in specific NLP research areas has been attempted both within European initiatives (e.g. EAGLES/ISLE, ELSE, TEMAA) and within U.S. evaluation regimes with international participation (e.g. MUC, TREC, SUMMAC). It has been reported that evaluation techniques in the different Language Engineering areas are growing more similar (Hovy et al. 1999), a fact that emphasizes the need for co-ordinated and reusable evaluation techniques and measures. The time has come to bring together all of the above attempts to address the evaluation of NLP systems as a whole and to explore ways of reusing established evaluation methods, metrics and resources, thus contributing to a more co-ordinated approach to the evaluation of language technology.
Target audience:
The aim of this workshop is to bring together leading researchers from various NLP areas (such as Machine Translation, Information Extraction, Information Retrieval, Automatic Summarization, Question-Answering, Dialogue Systems and Natural Language Generation) in order to explore ways of making the most of currently available evaluation methods, metrics and resources.
Workshop format:
The workshop will open with an invited speaker who will introduce the topic and present the research questions and challenges that need to be addressed. Oral presentations divided into thematic sessions will follow; at the end of each session a panel discussion will take place. The panels will consist of members of the program committee. The workshop will close with an overview talk.
Topics of interest:
We welcome submissions of both discussion papers and papers presenting applied experiments relevant to - but not limited to - the following topics:
- cross-fertilization of evaluation methods and metrics
- reuse of resources for evaluation (corpora, evaluation tools etc.)
- feasibility experiments for the reuse of established evaluation methods/metrics/resources in different NLP system types
- reusability obstacles and the notion of compositionality of NLP tasks
- evaluation needs and challenges for less strenuously evaluated system types (e.g. multimodal dialogue systems), and possible benefits from established evaluation practices
- evaluation standards and reusability
- reuse within big evaluation initiatives
- application of e.g. Machine Translation methods to Information Retrieval: implications for evaluation
Submission format:
Submissions must be electronic only and should consist of full papers of max. 8 pages (inclusive of references, tables, figures and equations). Authors are strongly encouraged to use the style files suggested for the EACL main conference submissions at: http://ufal.ms.mff.cuni.cz/~hajic/eacl03/submission.html
Please mail your submissions to Katerina Pastra: e.pastra@dcs.shef.ac.uk
Important dates:
* Deadline for workshop paper submissions: TUESDAY, 7 January 2003 (NOTE: strict deadline)
* Notification of workshop paper acceptance: TUESDAY, 28 January 2003
* Deadline for camera-ready workshop papers: THURSDAY, 13 February 2003
* Workshop date: SUNDAY, 13 April 2003
Program Committee:
Rob Gaizauskas (University of Sheffield, UK)
Donna Harman (NIST, US)
Lynette Hirschman (MITRE, US)
Maghi King (ISSCO, Switzerland)
Steven Krauwer (Utrecht University, Netherlands)
Inderjeet Mani (MITRE, US)
Joseph Mariani (LIMSI, France)
Patrick Paroubek (LIMSI, France)
Katerina Pastra (University of Sheffield, UK)
Martin Rajman (EPFL, Switzerland)
Karen Sparck-Jones (University of Cambridge, UK)
Horacio Saggion (University of Sheffield, UK)
Simone Teufel (University of Cambridge, UK)
Yorick Wilks (University of Sheffield, UK)
Registration details:
Information on registration fees and procedures will be published on the main EACL 2003 conference pages at:
http://www.conferences.hu/EACL03/
For detailed and up-to-date information on the workshop, please visit the workshop’s website:
http://www.dcs.shef.ac.uk/~katerina/EACL03-eval
For any queries, please don't hesitate to contact me.

Best,
Katerina Pastra
*************************************************************
Katerina Pastra
Research Associate & ILASH Research Co-ordinator
Natural Language Processing Group
Department of Computer Science, University of Sheffield
*************************************************************
Regent Court - Room G35
211 Portobello Street, Sheffield, U.K.
Tel. +44 114 2221945
Fax +44 114 2221810
http://www.dcs.shef.ac.uk/~katerina
*************************************************************