[Corpora-List] Data collections available: Robust WSD-CLIR task at CLEF2008

Eneko Agirre e.agirre at ehu.es
Fri Feb 13 12:04:10 UTC 2009


Release of the data collections from the Robust WSD CLIR exercise at CLEF 2008

		      Word Sense Disambiguation
	      for (Cross-Lingual) Information Retrieval

http://ixa2.si.ehu.es/clirwsd/index.php?option=com_content&task=view&id=19&Itemid=35


The CLEF 2008 robust task brought semantic and retrieval evaluation
together. Participants were given topics and document
collections from previous CLEF campaigns, annotated by
word sense disambiguation (WSD) systems. The goal of the task was
to test whether WSD could benefit retrieval systems,
with some positive results (see the working notes at

In preparation for the 2009 Robust task (to be announced soon), we
have compiled all the necessary data to replicate the 2008 experiments,
including topics, relevance judgements, and an unordered version of
the LA94 and GH95 document collections with WSD data.

The WSD information is based on WordNet version 1.6 and was supplemented
with data from the English and Spanish WordNets in order to test
different expansion strategies. Several leading WSD experts ran
their systems and provided the WSD results for the participants to
use.

The robust task used two languages common in previous CLEF
campaigns (English and Spanish). Documents are in English and topics
are in both English and Spanish, so the task covered both monolingual
and cross-lingual information retrieval.

For more details please visit http://ixa2.si.ehu.es/clirwsd

For information on future developments, please join the mailing list:
   http://groups.google.com/group/clirwsd





_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


