Dear Colleagues,
The EACL 2003 workshop on the reuse of evaluation resources addresses - among other topics - the reuse of corpora that have been developed for evaluation purposes, and it may therefore be of interest to some of you. Detailed information on the workshop can be found in the call for papers below.
All best,
Katerina Pastra
===============================================================
[Apologies for multiple postings]
*********************** Call for Papers ***********************

EACL 2003 Workshop on:

“Evaluation Initiatives in Natural Language Processing:
are evaluation methods, metrics and resources reusable?”

13 April 2003, Budapest, Hungary
11th Conference of the European Chapter of the Association for Computational Linguistics (April 12-17, 2003)
http://www.dcs.shef.ac.uk/~katerina/EACL03-eval
Introduction:
Systems that accomplish different Natural Language Processing (NLP) tasks have different characteristics and, it would therefore seem, different requirements for evaluation. However, are there common features in the evaluation methods used across language technologies? Could the evaluation methods established for one type of system be ported or adapted to another NLP research area? Could automatic evaluation metrics be ported? For instance, could Papineni's MT evaluation metric be used for the evaluation of generated summaries? Could the extrinsic evaluation method used within SUMMAC be applied to the evaluation of Natural Language Generation systems? What are the reusability obstacles encountered, and how could they be overcome? What are the evaluation needs of system types, such as dialogue systems, that have been less strenuously evaluated until now, and how could they benefit from current practices in evaluating Language Engineering technologies? What are the evaluation challenges that emerge from systems that integrate a number of different language processing functions (e.g. multimodal dialogue systems such as Smartkom)? Could resources (e.g. corpora) used for a specific NLP task be reused for the evaluation of the output of an NLP system, and if so, what adaptations would this require? Some years ago, John White suggested a hierarchy of difficulty, or compositionality, of NLP tasks; if correct, does this have implications for evaluation?
End-to-end evaluation of systems in specific NLP research areas has been attempted both within European initiatives (e.g. EAGLES/ISLE, ELSE, TEMAA) and within U.S. evaluation regimes with international participation (e.g. MUC, TREC, SUMMAC). It has been reported that evaluation techniques in the different Language Engineering areas are growing more similar (Hovy et al. 1999), a fact that emphasizes the need for co-ordinated and reusable evaluation techniques and measures. The time has come to bring together all of the above attempts to address the evaluation of NLP systems as a whole and to explore ways of reusing established evaluation methods, metrics and resources, thus contributing to a more co-ordinated approach to the evaluation of language technology.
Target audience:
The aim of this workshop is to bring together leading researchers from various NLP areas (such as Machine Translation, Information Extraction, Information Retrieval, Automatic Summarization, Question-Answering, Dialogue Systems and Natural Language Generation) in order to explore ways of making the most of currently available evaluation methods, metrics and resources.
Workshop format:
The workshop will open with an invited speaker who will introduce the topic and present the research questions and challenges that need to be addressed. Oral presentations divided into thematic sessions will follow; at the end of each session a panel discussion will take place. The panels will consist of members of the program committee. The workshop will close with an overview talk.
Topics of interest:
We welcome submissions of both discussion papers and papers presenting applied experiments relevant to - but not limited to - the following topics:
- cross-fertilization of evaluation methods and metrics
- reuse of resources for evaluation (corpora, evaluation tools etc.)
- feasibility experiments for the reuse of established evaluation methods/metrics/resources in different NLP system types
- reusability obstacles and the notion of compositionality of NLP tasks
- evaluation needs and challenges for less strenuously evaluated system types (e.g. multimodal dialogue systems), and possible benefits from established evaluation practices
- evaluation standards and reusability
- reuse within big evaluation initiatives
- application of e.g. Machine Translation methods to Information Retrieval: implications for evaluation
Submission format:
Submissions must be electronic only and should consist of full papers of max. 8 pages (inclusive of references, tables, figures and equations). Authors are strongly encouraged to use the style files suggested for the EACL main conference submissions at: http://ufal.ms.mff.cuni.cz/~hajic/eacl03/submission.html
Please mail your submissions to Katerina Pastra: e.pastra@dcs.shef.ac.uk
Important dates:
* Deadline for workshop paper submissions: TUESDAY, 7 January 2003 (NOTE: strict deadline)
* Notification of workshop paper acceptance: TUESDAY, 28 January 2003
* Deadline for camera-ready workshop papers: THURSDAY, 13 February 2003
* Workshop date: SUNDAY, 13 April 2003
Program Committee:
Rob Gaizauskas (University of Sheffield, UK)
Donna Harman (NIST, US)
Lynette Hirschman (MITRE, US)
Maghi King (ISSCO, Switzerland)
Steven Krauwer (Utrecht University, Netherlands)
Inderjeet Mani (MITRE, US)
Joseph Mariani (LIMSI, France)
Patrick Paroubek (LIMSI, France)
Katerina Pastra (University of Sheffield, UK)
Martin Rajman (EPFL, Switzerland)
Karen Sparck-Jones (University of Cambridge, UK)
Horacio Saggion (University of Sheffield, UK)
Simone Teufel (University of Cambridge, UK)
Yorick Wilks (University of Sheffield, UK)
Registration details:
Information on registration fees and procedures will be published on the main EACL 2003 conference pages at:
http://www.conferences.hu/EACL03/
For detailed and up-to-date information on the workshop, please visit the workshop’s website:
http://www.dcs.shef.ac.uk/~katerina/EACL03-eval
For any queries, please don't hesitate to contact me.

Best,
Katerina Pastra
*************************************************************
Katerina Pastra
Research Associate & ILASH Research Co-ordinator
Natural Language Processing Group
Department of Computer Science, University of Sheffield
*************************************************************
Regent Court - Room G35
211 Portobello Street, Sheffield, U.K.
Tel. +44 114 2221945
Fax +44 114 2221810
http://www.dcs.shef.ac.uk/~katerina
*************************************************************