[Corpora-List] new annotations for Recognizing Textual Entailment corpus

Mark Sammons mssammon at illinois.edu
Tue Nov 30 17:25:52 UTC 2010


New Corpus/Call for Participation
-----------------------------------------------

The Cognitive Computation Group at the University of Illinois has released new
annotations of a Recognizing Textual Entailment (RTE) corpus. The annotations are
phenomenon-specific and indicate the inference steps required to "prove" entailment
(see http://agora.cs.illinois.edu/display/rtedata/Explanation+Based+Analysis+of+RTE+Data),
following the definition in our challenge paper at ACL 2010 [2]. We solicit
involvement from the community as a way to grow this corpus. More detail is given
below.



Explanation-Based Analysis of Textual Entailment Data
--------------------------------------------------------------------------------

We seek to start -- and allow coordination of -- a community-wide effort to annotate 
Recognizing Textual Entailment (RTE) examples from the RTE main task [1] with the 
inference steps required to reach a decision about the example label (Entailed vs. 
Contradicted vs. Unknown), as proposed in [2]. The Explanation-Based Analysis 
(EBA) wiki (link above) contains the annotation resources described in that paper, and 
is intended to be a starting point for the NLP community to develop the annotation 
scheme into a formal, consistent standard, and to augment, extend, and refine the 
existing annotations accordingly.

We invite interested researchers to review our annotation scheme and 
annotations at

http://agora.cs.illinois.edu/display/rtedata/Explanation+Based+Analysis+of+RTE+Data

We seek help with improving the annotation scheme (and annotation tools), and 
applying that scheme to label RTE data. 


Motivation
---------------

Broadly speaking, the RTE challenge frames Natural Language
Understanding in the context of recognizing when two text fragments
share the same meaning. The task is specified operationally, with a
human-annotated gold standard (similar to many NLP task formulations),
which allows solutions to use any means to reach the correct
decision. This open-endedness has many advantages, but at least one
significant drawback: it is hard to assign partial credit to systems
that make a partially correct inference about a given RTE
example. Most RTE examples require a number of inference steps to
reach the correct answer, so solving only one of them is unlikely to
significantly improve performance on the overall task. As a result,
the RTE evaluation does not provide a much-needed resource for
researchers developing NLP solutions to more focused inference
tasks: a means of evaluating their work in the context of an
end-to-end inference application.


Anticipated Benefits
-----------------------------

Anyone is free to augment the annotation of existing data with a new
entailment phenomenon, and/or to augment the pool of RTE data with
examples that highlight a phenomenon in which they are interested
(provided they also annotate this data with the full explanation-based
analysis), thereby motivating other researchers to engage with these
phenomena.

RTE system developers and NLP researchers can assess the relative
importance of entailment phenomena in RTE corpora, by examining the
distribution of labeled entailment phenomena.

The RTE evaluation itself improves, allowing RTE system developers to
assess the contributions of individual components that may have broad
application to textual inference tasks.


References
-----------------

1. Bentivogli, L., Dagan, I., Dang, H. T., Giampiccolo, D., and
Magnini, B. (2009). The Fifth PASCAL Recognizing Textual Entailment
Challenge. In Proceedings of the TAC Workshop, Gaithersburg, MD, USA.

2. Sammons, M., Vydiswaran, V., and Roth, D. (2010). "Ask not what
Textual Entailment can do for you...". In Proceedings of the Annual
Meeting of the Association for Computational Linguistics (ACL).


Mark Sammons
Principal Research Scientist
Cognitive Computation Group
University of Illinois,  Dept. Computer Science
(217) 265-6759  mssammon at uiuc.edu



