[Corpora-List] Release of SemEval2010-Task1 datasets (Coreference Resolution in Multiple Languages)

Marta Recasens mrecasens at ub.edu
Tue Sep 28 15:02:10 UTC 2010

We are pleased to announce the release of the SemEval2010-Task1 datasets for coreference resolution in multiple languages, including Catalan, Dutch, German, Italian, and Spanish. The English data will be released by LDC in early 2011.

The data is now freely available for download from the task website at http://stel.ub.edu/semeval2010-coref/download/

The general format is shared by all languages and is inspired by the previous CoNLL shared tasks (2008/2009 editions: http://ufal.mff.cuni.cz/conll2009-st). The data is presented in a uniform column-based format with information about coreference, lemma, PoS, morphological features, head, dependency relations, NEs, and semantic dependencies. Both gold-standard and automatically predicted information are provided (availability varies by language). The availability of both gold and automatic preprocessing, together with already published results, makes the datasets an ideal resource for training and testing coreference resolution systems, especially when cross-language portability is a goal.
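
As a rough illustration of working with a column-based format of this kind, here is a minimal Python sketch. It is not the official reader for these datasets. It assumes tab-separated columns, blank lines between sentences, lines starting with "#" as document markers, and a coreference column (taken to be the last one) that uses bracketed entity ids such as "(1", "1)" and "(1)", joined by "|" when several apply to one token. The function names and the column position are hypothetical; please check the format description distributed with the data before relying on any of this.

```python
import re
from collections import defaultdict


def read_sentences(lines):
    """Group token rows into sentences (assumed: blank line = sentence break)."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):      # assumed document markers / comments
            continue
        if not line.strip():          # blank line closes the current sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append(line.split("\t"))
    if sentence:                      # last sentence without trailing blank line
        yield sentence


def extract_mentions(sentence, coref_col=-1):
    """Return {entity_id: [(start, end), ...]} with inclusive token indices.

    coref_col is an assumption: the coreference annotation is taken to be
    in the last column of each row.
    """
    open_starts = defaultdict(list)   # entity id -> stack of open start indices
    mentions = defaultdict(list)
    for i, row in enumerate(sentence):
        cell = row[coref_col]
        if cell in ("_", "-", ""):    # no coreference annotation on this token
            continue
        for part in cell.split("|"):
            m = re.fullmatch(r"(\()?(\d+)(\))?", part)
            if m is None:
                continue              # skip anything this sketch does not understand
            entity = int(m.group(2))
            if m.group(1):            # "(" opens a mention at this token
                open_starts[entity].append(i)
            if m.group(3) and open_starts[entity]:   # ")" closes the latest open one
                mentions[entity].append((open_starts[entity].pop(), i))
    return dict(mentions)
```

A file could then be processed with something like `for sent in read_sentences(open(path, encoding="utf-8")): print(extract_mentions(sent))`, where `path` points to one of the downloaded files. Mentions that share an entity id across sentences belong to the same coreference chain.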

Marta Recasens, Lluís Màrquez, Emili Sapena, M. Antònia Martí, Mariona Taulé, Véronique Hoste, Massimo Poesio, and Yannick Versley. 2010. SemEval-2010 Task 1: Coreference Resolution in Multiple Languages. In Proceedings of the 5th International Workshop on Semantic Evaluation (SemEval-2010), ACL 2010, pages 1-8, Uppsala, Sweden.

We will be happy to publish your results or articles related to any of these datasets on our website: http://stel.ub.edu/semeval2010-coref/posttask
Please feel free to let us know.

* Marta Recasens, M. Antònia Martí, Mariona Taulé
  University of Barcelona
* Lluís Màrquez, Emili Sapena
  Technical University of Catalonia
* Massimo Poesio
  University of Essex / University of Trento
* Véronique Hoste
  University College Ghent
* Yannick Versley
  University of Tübingen