[Corpora-List] MULTILINGUAL QUESTION ANSWERING TRACK AT CLEF-2004

Alessandro Vallin vallin at itc.it
Fri Nov 14 15:29:17 UTC 2003


[APOLOGIES FOR MULTIPLE POSTINGS]



   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    MULTILINGUAL QUESTION ANSWERING TRACK AT CLEF-2004
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

            ------------------------------------------------------------------
            First Announcement and Call for Participation
            ------------------------------------------------------------------

We are glad to announce that the second CLEF Question Answering evaluation exercise is starting.
For information about the track and instructions for participants, visit the QA at CLEF website:

     http://clef-qa.itc.it

Question Answering (QA) systems are of great interest to the Language Engineering community because they combine Information Retrieval and Natural Language Processing within the same task. QA systems receive natural language queries (and not keywords) as input, process large unstructured document collections, and return precise answers (and not entire documents) as output.
Within the framework of the Cross Language Evaluation Forum (CLEF), a pilot evaluation exercise for non-English and cross-language QA systems was successfully carried out in 2003. Three monolingual tasks (with Dutch, Italian and Spanish questions) and five bilingual tasks (where Dutch, French, German, Italian and Spanish queries searched for an answer in an English target corpus) were proposed. Eight groups tested their systems. The results showed that multilingual QA is a promising field, and the experience was encouraging in terms of participation and future prospects. The CLEF-2003 campaign culminated in a workshop held in Trondheim last August; the proceedings are available at the CLEF website (http://clef.iei.pi.cnr.it:2002/).




    CLEF-2004 QA TRACK

A new, challenging evaluation exercise is planned for 2004, with six main tasks. Each main task is identified by a target language and is divided into several sub-tasks. We plan to have Dutch, French, German, Italian, Spanish and English as target languages.
In the monolingual tasks, queries, document collection and responses are formulated in the same language. In the cross-language tasks the document collection and the queries are written in two different languages, and responses are due in the language of the target corpus. For instance, in the "Dutch => Spanish" sub-task, participants are provided with Dutch queries whose answers must be retrieved in a Spanish document collection.
Questions will be mostly factoid, but the test sets could also include definition queries (like "Who/What is X") and questions that do not have a known answer in the target corpora.
Participants will be allowed to submit only one exact answer per question, and up to two runs per sub-task.
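
As an informal illustration of the task layout described above, the following Python sketch models a sub-task as a (query language, target language) pair and checks the stated submission limits. It is only a sketch under assumptions: the names SubTask, validate_submission and MAX_RUNS_PER_SUBTASK are hypothetical and are not part of any official CLEF tool or submission format, and the query languages are left unchecked because this announcement does not list them for 2004.

```python
from dataclasses import dataclass

# Target languages named in this announcement.
TARGET_LANGUAGES = {"Dutch", "French", "German", "Italian", "Spanish", "English"}

# "up to two runs per sub-task" (hypothetical constant name).
MAX_RUNS_PER_SUBTASK = 2


@dataclass(frozen=True)
class SubTask:
    """Hypothetical model of a sub-task: query language => target language."""
    source: str  # language of the queries (not validated: not listed in the call)
    target: str  # language of the document collection and of the responses

    def __post_init__(self):
        if self.target not in TARGET_LANGUAGES:
            raise ValueError(f"unknown target language: {self.target}")

    @property
    def kind(self):
        # Monolingual when queries and collection share a language.
        return "monolingual" if self.source == self.target else "cross-language"

    @property
    def label(self):
        return f"{self.source} => {self.target}"


def validate_submission(runs):
    """Check a list of runs, each a dict mapping question id -> one answer string."""
    if len(runs) > MAX_RUNS_PER_SUBTASK:
        raise ValueError(f"at most {MAX_RUNS_PER_SUBTASK} runs per sub-task")
    for run in runs:
        for question_id, answer in run.items():
            # Only one exact answer per question is allowed.
            if not isinstance(answer, str):
                raise ValueError(f"{question_id}: exactly one answer expected")


task = SubTask(source="Dutch", target="Spanish")
print(task.label, "is a", task.kind, "sub-task")
```

For the "Dutch => Spanish" example above, the sketch reports a cross-language sub-task; the answers would then be expected in Spanish, the language of the target corpus.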





    SCHEDULE

Registration Open: January 15, 2004
Corpora Release: February 2004
Track Guidelines Available: by February 2004
Trial data: March 2004
Test Sets Release: May 10, 2004
Submissions of Runs by Participants: May 17, 2004
Release of Individual Results: from July 15, 2004
Submission of Papers for Working Notes: August 15, 2004
CLEF Workshop (in Bath, UK, after ECDL): 16-17 September 2004



You are all encouraged to participate!
To register, please contact Carol Peters (carol.peters at isti.cnr.it), who is in charge of the general co-ordination of the CLEF campaign. Registration will open on the 15th of January, 2004, via the CLEF website at www.clef-campaign.org.

A QA at CLEF mailing-list (clef-qa at itc.it) has been established: to be added to it, or for further information, please contact Bernardo Magnini (magnini at itc.it) or Alessandro Vallin (vallin at itc.it).



--------------------------------------------------
    ORGANIZING COMMITTEE

ITC-irst
Centro per la Ricerca Scientifica e Tecnologica, Trento - Italy
Bernardo Magnini, Simone Romagnoli, Alessandro Vallin


UNED
Spanish Distance Learning University, Madrid - Spain 
Felisa Verdejo, Anselmo Peñas, Jesús Herrera


ILLC
Language and Inference Technology Group, University of Amsterdam - The Netherlands
Maarten de Rijke


DFKI
German Research Center for Artificial Intelligence, Saarbruecken - Germany 
Hans Uszkoreit


ELDA/ELRA
Evaluations and Language Resources Distribution Agency, Paris - France
Khalid Choukri


University of Limerick - Ireland
Richard Sutcliffe

ISTI-CNR
Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo", Pisa - Italy 
Carol Peters


NIST
National Institute of Standards and Technology, Gaithersburg, Md. - United States of America
Donna Harman
-------------------------------------------------