[Corpora-List] Call for participation: Robust WSD-CLIR task at CLEF2009
Eneko Agirre
e.agirre at ehu.es
Tue Feb 17 14:35:42 UTC 2009
Apologies for cross-postings
Call for participation
Robust WSD CLIR at CLEF2009
Word Sense Disambiguation
for (Cross-Lingual) Information Retrieval
http://ixa2.si.ehu.es/clirwsd
Following the success of the 2007 joint SemEval-CLEF task and the 2008
Robust WSD task at CLEF, a follow-up task will be held in 2009 with the
aim of exploring the contribution of Word Sense Disambiguation to
monolingual and multilingual Information Retrieval. The 2009 exercise
will be very similar to the 2008 one. Those interested in exploring the
2008 data can check it here:
http://ixa2.si.ehu.es/clirwsd/index.php?option=com_content&task=view&id=19&Itemid=35
The robust task will bring semantic and retrieval evaluation
together. The participants will be offered topics and document
collections from previous CLEF campaigns which were annotated by
systems for word sense disambiguation (WSD). The goal of the task is
to test whether WSD can be used beneficially for retrieval systems.
The organizers believe that polysemy is one of the reasons why
information retrieval (IR) systems fail, and that WSD could allow more
targeted retrieval. Robust-WSD at CLEF 2008 showed that some
top-scoring systems improved their IR and CLIR results with the use of
WSD tags. See working notes:
http://www.clef-campaign.org/2008/working_notes/adhoc-final.pdf
The WSD data is based on WordNet version 1.6 and will be supplemented
with data from the English and Spanish WordNets in order to test
different expansion strategies. Several leading WSD experts will run
their systems, and provide those WSD results for the participants to
use.
Participants are required to submit at least one baseline run without
WSD and one run using the WSD data. They can submit up to four further
baseline runs without WSD and up to four further runs that use the WSD
data in various ways.
The robust task will use two languages often used in previous CLEF
campaigns (English, Spanish). Documents will be in English, and topics
in both English and Spanish.
The evaluation will be based on Mean Average Precision (MAP) as well
as Geometric Mean Average Precision (GMAP). GMAP, the robust measure,
is intended to reward stable performance across all topics rather than
high average performance in mono- and cross-language IR ("ensure that
all topics obtain minimum effectiveness level", Voorhees 2005, SIGIR
Forum).
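To illustrate how the two measures differ, here is a minimal sketch in
Python. It is not the official evaluation code for the task. The function
names and the epsilon floor (used so that a topic with zero average
precision does not make the logarithm undefined) are assumptions made for
this example.

```python
import math


def average_precision(ranked_doc_ids, relevant_ids):
    """Average precision for a single topic.

    Sums the precision at each rank where a relevant document is
    retrieved and divides by the total number of relevant documents.
    """
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(ap_scores):
    """MAP: arithmetic mean of the per-topic average precision values."""
    return sum(ap_scores) / len(ap_scores)


def geometric_map(ap_scores, epsilon=1e-5):
    """GMAP: geometric mean of the per-topic average precision values.

    epsilon is an assumed floor for this sketch; it keeps log() defined
    when a topic has an average precision of zero.
    """
    log_sum = sum(math.log(max(ap, epsilon)) for ap in ap_scores)
    return math.exp(log_sum / len(ap_scores))


# Hypothetical per-topic scores for two systems over three topics.
system_a = [0.8, 0.8, 0.01]  # strong on two topics, fails on one
system_b = [0.5, 0.5, 0.5]   # moderate but stable on all topics

print(mean_average_precision(system_a), geometric_map(system_a))
print(mean_average_precision(system_b), geometric_map(system_b))
```

With these made-up scores, system A has the higher MAP, while system B
has the higher GMAP, because the geometric mean penalizes the single
topic on which A performs poorly.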
Time Schedule:
Registration Opens - 1 February 2009 (closes on 1 May)
Data Release - from 15 March 2009
Topic Release - 24 April 2009
Submission of Runs by Participants - 1 June 2009
Release of Relevance Assessments and Individual Results - from 26 June 2009
Submission of Paper for Working Notes - around August 2009 (to be announced)
Workshop - 30 September to 2 October 2009 (co-located with ECDL 2009)
Contact
Thomas Mandl, University of Hildesheim, mandl at uni-hildesheim de
Eneko Agirre, University of the Basque Country, e.agirre at ehu es
For more details please visit
http://ixa2.si.ehu.es/clirwsd
http://www.clef-campaign.org
Please join the mailing list:
http://groups.google.com/group/clirwsd
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora