[Corpora-List] Call for participation: Robust WSD-CLIR task at CLEF2008
Eneko Agirre
e.agirre at ehu.es
Tue Mar 4 11:28:46 UTC 2008
Apologies for cross-postings
Call for participation
Robust WSD CLIR at CLEF2008
Word Sense Disambiguation
for (Cross-Lingual) Information Retrieval
http://ixa2.si.ehu.es/clirwsd
The robust task will bring semantic and retrieval evaluation
together. Participants will be offered topics and document
collections from previous CLEF campaigns that have been annotated by
word sense disambiguation (WSD) systems. The goal of the task is to
test whether WSD can improve retrieval systems.
Note that there will be another related task on WSD and Question
Answering (http://ixa2.si.ehu.es/qawsd).
The organizers believe that polysemy is one of the reasons
information retrieval (IR) systems fail, and that WSD could allow
more targeted retrieval. Last year, the SemEval and CLEF campaigns
cooperated to create a task (http://ixa2.si.ehu.es/semeval-clir)
in which participants were required to provide WSD on CLEF data
collections. The organizers used that WSD data in a retrieval
experiment, but it did not lead to an improvement. This year,
participants are given the WSD data (or can derive their own) and can
run their own retrieval experiments with various retrieval strategies.
The WSD data is based on WordNet version 1.6 and will be supplemented
with data from the English and Spanish WordNets in order to test
different expansion strategies. Several leading WSD experts will run
their systems, and provide those WSD results for the participants to
use.
Participants are required to submit at least one baseline run without
WSD and one run using the WSD data. They can submit four further
baseline runs without WSD and four further runs that use WSD in
various ways.
The robust task will use two languages often used in previous CLEF
campaigns (English, Spanish). Documents will be in English, and topics
in both English and Spanish.
A subset of highly ambiguous topics will be identified by the
organizers and used for a separate evaluation to see how WSD works for
these hard topics.
The evaluation will be based on Mean Average Precision (MAP) as well
as Geometric Mean Average Precision (GMAP). GMAP, the robust measure,
is intended to reward stable performance across all topics rather
than high average performance, in both mono- and cross-language IR
("ensure that all topics obtain minimum effectiveness level",
Voorhees 2005, SIGIR Forum).
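
To illustrate the difference between the two measures, here is a
minimal sketch in Python. It is not the official evaluation code of
the task. The function names and the small floor value `epsilon`
(used so that a topic with zero average precision does not make the
geometric mean undefined) are assumptions made for this example.

```python
import math


def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list for one topic."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank of each relevant document found.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(ap_values):
    """Arithmetic mean of the per-topic average precision values."""
    return sum(ap_values) / len(ap_values)


def geometric_map(ap_values, epsilon=1e-5):
    """Geometric mean of the per-topic average precision values.

    epsilon is an assumed floor for this sketch; it keeps log() defined
    when a topic has an average precision of zero.
    """
    log_sum = sum(math.log(max(ap, epsilon)) for ap in ap_values)
    return math.exp(log_sum / len(ap_values))


# Two hypothetical systems with the same MAP over three topics.
steady = [0.5, 0.5, 0.5]
uneven = [0.98, 0.5, 0.02]

print(mean_average_precision(steady), geometric_map(steady))
print(mean_average_precision(uneven), geometric_map(uneven))
```

Both hypothetical systems have the same MAP, but the second one has a
clearly lower GMAP because it nearly fails on one topic. This is the
sense in which GMAP favours stable performance over all topics.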
Time Schedule:
Registration Opens - 10 February 2008 (closes 1 May 2008)
Data Release - from 1 March 2008
Topic Release - from 1 May 2008
Submission of Runs by Participants - 15 June 2008
Release of Relevance Assessments and Individual Results - 15 July 2008
Submission of Paper for Working Notes - 15 August 2008
Workshop - 17-19 September 2008
Contact
Thomas Mandl, University of Hildesheim, mandl at uni-hildesheim de
Eneko Agirre, University of the Basque Country, e.agirre at ehu es
For more details please visit http://ixa2.si.ehu.es/clirwsd
Please join the mailing list:
http://groups.google.com/group/clirwsd