20.4405, FYI: Call for Participation: ResPubliQA 2010

linguist at LINGUISTLIST.ORG
Sun Dec 20 20:12:08 UTC 2009


LINGUIST List: Vol-20-4405. Sun Dec 20 2009. ISSN: 1068 - 4875.

Subject: 20.4405, FYI: Call for Participation:  ResPubliQA 2010

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
 
Reviews: Monica Macaulay, U of Wisconsin-Madison  
Eric Raimy, U of Wisconsin-Madison  
Joseph Salmons, U of Wisconsin-Madison  
Anja Wanner, U of Wisconsin-Madison  
       <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Danielle St. Jean <danielle at linguistlist.org>
================================================================  

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.

===========================Directory==============================  

1)
Date: 18-Dec-2009
From: Pamela Forner < forner at celct.it >
Subject: Call for Participation:  ResPubliQA 2010
 

	
-------------------------Message 1 ---------------------------------- 
Date: Sun, 20 Dec 2009 15:09:53
From: Pamela Forner [forner at celct.it]
Subject: Call for Participation:  ResPubliQA 2010

  


ResPubliQA 2010 
Question Answering Evaluation over European Legislation 

Preliminary Call for Participation

Following the success of ResPubliQA 2009, we are pleased to announce
ResPubliQA 2010, the second evaluation campaign of Question Answering
systems over European Legislation, to be held within the framework of the
CLEF 2010 conference.
  
For more information and updates visit the ResPubliQA website at:

http://celct.isti.cnr.it/ResPubliQA/

We invite participation from IR and NLP practitioners and potential users
of QA systems concerned with European texts. Detailed guidelines describing
the task will be distributed to the participants and made available for
download from the ResPubliQA website.

The results of the evaluation campaign will be disseminated at the final
workshop, which will be organized in conjunction with the CLEF 2010
conference, 20-23 September in Padua, Italy.

ResPubliQA 2010: Task Overview

The aim of ResPubliQA 2010 is to capitalize on what has been achieved in
the previous evaluation campaign while at the same time adding a number of
refinements:

- The addition of new question types and the refinement of old ones;
- The opportunity to return both a paragraph and an exact answer;
- The addition of a new collection: EUROPARL.

Two separate tasks are proposed for the ResPubliQA 2010 evaluation campaign:

1. Paragraph Selection (PS) Task: to retrieve one paragraph containing the
answer to a question in natural language. One of the following responses
must be returned:
a) One single paragraph containing the candidate answer.
b) The string NOA to indicate that the system prefers not to answer the
question. 

2. Answer Selection (AS) Task: beyond retrieving a paragraph containing the
answer to a question in natural language, systems are also required to
demarcate the exact answer. One of the following responses must be returned:
a) The exact answer highlighted inside one paragraph. 
b) The string NOA to indicate that the system prefers not to answer the
question. 

N.B. Systems that prefer to leave some questions unanswered may optionally
also submit a candidate paragraph/answer for those questions, so that their
validation performance can be evaluated.

The two tasks differ only in the required output. The document
collection and test data are the same for both tasks.

Document Collection: the following multilingual parallel-aligned document
collections are used:
- The ResPubliQA collection: a subset of JRC-Acquis with parallel-aligned
documents in 9 languages. 
- A small subset of the EUROPARL collection with parallel-aligned documents
in 9 languages, created by crawling the website of the European Parliament
(starting from January 2009).

Both collections will be available at the ResPubliQA website. The subject
of the Acquis documents is European legislation, while EUROPARL deals with
the parliamentary domain. The two collections differ in style and content
but are fully compatible with each other.
 
Languages: Parallel-aligned documents are available in 9 languages:
Bulgarian, Dutch, English, French, German, Italian, Portuguese, Romanian
and Spanish.

Only tasks with at least one registered participant will be activated.

Test Data: a pool of 200 questions will be provided:
- independent questions that can be answered by a paragraph
- question types: factoid, definition, purpose, reason, opinion, other
- No NIL; No LIST

Evaluation: The output of both the PS and AS tasks is automatically
evaluated against a manually produced gold standard. Non-matching
paragraphs and answers are manually evaluated by native-speaker assessors.

The adoption of the c@1 evaluation metric encourages systems to maintain
the number of correct answers while reducing the number of incorrect ones
by leaving some questions unanswered (NOA). Answer Validation techniques
(including Machine Learning) are expected to be used to make this final
decision. For more details, please read the ResPubliQA 2009 Overview,
available at the campaign website.
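
As an illustration of why the metric named above, c@1, rewards leaving a
question unanswered rather than answering it wrongly, here is a minimal
sketch. It assumes the usual definition c@1 = (n_R + n_U * n_R / n) / n,
where n_R is the number of correctly answered questions, n_U the number of
unanswered (NOA) questions and n the total number of questions. The function
and parameter names are illustrative assumptions and are not part of any
campaign tooling.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """Compute c@1 = (n_R + n_U * n_R / n) / n.

    n_correct    -- questions answered correctly (n_R)
    n_unanswered -- questions left unanswered with NOA (n_U)
    n_total      -- total number of questions (n)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct < 0 or n_unanswered < 0:
        raise ValueError("counts must be non-negative")
    if n_correct + n_unanswered > n_total:
        raise ValueError("correct + unanswered cannot exceed n_total")
    # Each unanswered question is credited with the system's overall
    # accuracy (n_R / n) instead of counting as a wrong answer.
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


if __name__ == "__main__":
    # 200 questions, 100 correct: answering everything vs. withholding 40.
    print(c_at_1(100, 0, 200))
    print(c_at_1(100, 40, 200))
```

With 200 questions and 100 correct answers, a system that answers every
question scores 0.5, while one that replaces 40 of its wrong answers with NOA
scores 0.6. When no question is left unanswered, c@1 reduces to accuracy.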

Runs: Systems are allowed to participate in one or both tasks, which will
operate simultaneously on the same input questions. A maximum of two runs
in total can be submitted, i.e., two PS runs, two AS runs, or one PS plus
one AS run.
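
The run limit above can be checked mechanically before submission. The
following is a minimal sketch under the stated rule (at most two runs in
total across both tasks); the function name and arguments are hypothetical
and do not correspond to any official submission tool.

```python
def is_valid_run_set(ps_runs: int, as_runs: int, max_total: int = 2) -> bool:
    """Return True if the planned runs respect the campaign limit.

    ps_runs   -- number of Paragraph Selection (PS) runs
    as_runs   -- number of Answer Selection (AS) runs
    max_total -- maximum number of runs in total (two in this call)
    """
    if ps_runs < 0 or as_runs < 0:
        return False
    total = ps_runs + as_runs
    # At least one run is needed to participate; no more than max_total.
    return 1 <= total <= max_total


if __name__ == "__main__":
    for ps, as_ in [(2, 0), (0, 2), (1, 1), (2, 1)]:
        print(ps, as_, is_valid_run_set(ps, as_))
```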

Preliminary Timeline:
- Track guidelines: January 25
- Registration at the ResPubliQA website: by March
- Test set release: May 17
- Run submissions: May 27*
- Results to the participants: July 9
- Submission of Papers: August 15
- Workshop: 20-23 September 2010, in Padua, Italy

* Participants will have 5 days to upload their submissions, starting from
the moment when the questions are downloaded.


Lab Organizers:
- Anselmo Peñas, E.T.S.I. Informática de la UNED, Madrid, Spain
- Pamela Forner, CELCT, Trento, Italy
- Richard Sutcliffe, Dept. of Computer Science, University of Limerick,
Limerick, Ireland

Advisory Board:
- Donna Harman (National Institute of Standards and Technology (NIST), USA)
- Maarten de Rijke (University of Amsterdam, The Netherlands)
- Dominique Laurent (Synapse Développement, France) 



Linguistic Field(s): Computational Linguistics
                     Text/Corpus Linguistics





 



