21.3904, FYI: Release of SemEval2010-Task1 Datasets

Mon Oct 4 17:47:55 UTC 2010

LINGUIST List: Vol-21-3904. Mon Oct 04 2010. ISSN: 1068 - 4875.

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews: Monica Macaulay, U of Wisconsin-Madison
         Eric Raimy, U of Wisconsin-Madison
         Joseph Salmons, U of Wisconsin-Madison
         Anja Wanner, U of Wisconsin-Madison
         <reviews at linguistlist.org>

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Brent Miller <brent at linguistlist.org>
================================================================  

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.cfm.

===========================Directory==============================  

1)
Date: 04-Oct-2010
From: Mariona Taulé <mtaule at ub.edu>
Subject: Release of SemEval2010-Task1 Datasets

-------------------------Message 1 ---------------------------------- 
Date: Mon, 04 Oct 2010 13:46:53
From: Mariona Taulé [mtaule at ub.edu]
Subject: Release of SemEval2010-Task1 Datasets

We are pleased to announce the release of the SemEval2010-Task1 datasets
for coreference resolution in multiple languages, including Catalan, Dutch,
German, Italian, and Spanish. The English data will be released by the LDC
in early 2011.

The data are now freely available for download from the task website at
http://stel.ub.edu/semeval2010-coref/download/.

General formatting is shared by all languages and is inspired by the
previous CoNLL shared tasks (2008/2009 editions:
http://ufal.mff.cuni.cz/conll2009-st). The data are displayed in a uniform
column-based format with information about coreference, lemma, PoS,
morphological features, head, dependency relations, NEs, and semantic
dependencies. Both gold-standard and automatically predicted information
are provided (availability depends on the language). The combination of
gold-standard and automatic preprocessing with already published results
makes the datasets an ideal resource for training and testing coreference
resolution systems, especially when cross-language portability is the goal.
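
As a rough illustration of how such column-based files might be read, the
Python sketch below splits a file into sentences and decodes a bracketed
coreference column. The announcement does not specify the column order or
the annotation syntax, so everything here is an assumption: that sentences
are separated by blank lines, that lines beginning with "#" are document
delimiters, that coreference is the last column, and that it uses labels
such as "(1", "1)" and "(1)" joined by "|". The function names and the
four-column sample are hypothetical; please check the documentation on the
task website for the actual layout.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple

Sentence = List[List[str]]  # one list of column strings per token


def read_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield sentences from CoNLL-style lines (blank line = sentence break)."""
    sentence: Sentence = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#"):  # ASSUMPTION: "#..." lines delimit documents
            continue
        if not line:
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append(line.split())
    if sentence:
        yield sentence


def coref_spans(sentence: Sentence, coref_col: int = -1) -> List[Tuple[str, int, int]]:
    """Decode bracket labels into (entity_id, first_token, last_token) spans.

    ASSUMPTION: the coreference column is the last one and holds "_" or
    "|"-separated labels of the form "(id", "id)" or "(id)".
    """
    open_starts: Dict[str, List[int]] = {}
    spans: List[Tuple[str, int, int]] = []
    for i, row in enumerate(sentence):
        label = row[coref_col]
        if label in ("_", "-"):
            continue
        for part in label.split("|"):
            m = re.fullmatch(r"(\()?(\d+)(\))?", part)
            if m is None:
                continue  # skip labels this sketch does not understand
            opens, entity, closes = m.groups()
            if opens:
                open_starts.setdefault(entity, []).append(i)
            if closes and open_starts.get(entity):
                spans.append((entity, open_starts[entity].pop(), i))
    return spans


# Hypothetical four-column sample: ID, token, lemma, coreference.
SAMPLE = """\
#begin document demo
1 Mary Mary (1)
2 said say _
3 that that _
4 her her (2|(1)
5 brother brother 2)

1 She she (1)
2 left leave _
#end document
"""

if __name__ == "__main__":
    for sent in read_sentences(SAMPLE.splitlines()):
        print(coref_spans(sent))
```

Token indices in the returned spans are zero-based positions within the
sentence. For the real files, one would pass an open file handle to
read_sentences and set coref_col to whichever column the task
documentation assigns to coreference.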

REFERENCE
Marta Recasens, Lluís Màrquez, Emili Sapena, M. Antònia Martí, Mariona
Taulé, Véronique Hoste, Massimo Poesio, and Yannick Versley. 2010.
SemEval-2010 Task 1: Coreference Resolution in Multiple Languages. In
Proceedings of the 5th International Workshop on Semantic Evaluation
(SemEval-2010), ACL 2010, pages 1-8, Uppsala, Sweden.

We will be happy to publish your results or articles related to any of
these datasets on our website: http://stel.ub.edu/semeval2010-coref/posttask.

Please feel free to let us know.

ORGANIZERS
Marta Recasens, M. Antònia Martí, Mariona Taulé
University of Barcelona
{mrecasens, amarti, mtaule}@ub.edu

Lluís Màrquez, Emili Sapena
Technical University of Catalonia
{lluism, esapena}@lsi.upc.edu

Massimo Poesio
University of Essex / University of Trento

Véronique Hoste
University College Ghent

Yannick Versley
University of Tübingen 

Linguistic Field(s): General Linguistics
