[Corpora-List] new annotations for Recognizing Textual Entailment corpus
Mark Sammons
mssammon at illinois.edu
Tue Nov 30 17:25:52 UTC 2010
New Corpus/Call for Participation
-----------------------------------------------
The Cognitive Computation Group at the University of Illinois has released new
annotations of a Recognizing Textual Entailment (RTE) corpus. The annotations are
phenomenon-specific and indicate the inference steps required to "prove" entailment,
following the definition in our challenge paper at ACL 2010 [2] (see
http://agora.cs.illinois.edu/display/rtedata/Explanation+Based+Analysis+of+RTE+Data).
We solicit involvement from the community as a way to grow this corpus. More detail
is given below.
Explanation-Based Analysis of Textual Entailment Data
--------------------------------------------------------------------------------
We seek to start -- and allow coordination of -- a community-wide effort to annotate
Recognizing Textual Entailment (RTE) examples from the RTE main task [1] with the
inference steps required to reach a decision about the example label (Entailed vs.
Contradicted vs. Unknown), as proposed in [2]. The Explanation-Based Analysis
(EBA) wiki (link above) contains the annotation resources described in that paper, and
is intended to be a starting point for the NLP community to develop the annotation
scheme into a formal, consistent standard, and to augment, extend, and refine the
existing annotations accordingly.
We invite interested researchers to review our annotation scheme and
annotations at
http://agora.cs.illinois.edu/display/rtedata/Explanation+Based+Analysis+of+RTE+Data
We seek help with improving the annotation scheme (and annotation tools), and
applying that scheme to label RTE data.
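To give a rough sense of what an explanation-based annotation records, here is a
minimal, hypothetical sketch in Python of one annotated RTE example. The field names,
phenomenon labels, and the text/hypothesis pair are illustrative assumptions only and
do not reflect the actual format used on the EBA wiki.

    # Hypothetical sketch of an explanation-annotated RTE example.
    # Field names and phenomenon labels are illustrative only; see the
    # EBA wiki for the actual annotation scheme.
    annotated_example = {
        "text": "John Smith, the CEO of Acme Corp., resigned on Monday.",
        "hypothesis": "Acme Corp.'s chief executive stepped down.",
        "label": "Entailed",  # Entailed / Contradicted / Unknown
        # Inference steps required to "prove" the entailment, each tagged
        # with the phenomenon it exercises.
        "inference_steps": [
            {"phenomenon": "apposition",
             "step": "'John Smith' is the CEO of Acme Corp."},
            {"phenomenon": "synonymy/paraphrase",
             "step": "'CEO' corresponds to 'chief executive'."},
            {"phenomenon": "lexical entailment",
             "step": "'resigned' entails 'stepped down'."},
        ],
    }

    # A distribution over phenomena across a corpus of such examples could
    # then be computed to assess their relative importance (see "Anticipated
    # Benefits" below).
    from collections import Counter
    phenomena = Counter(
        step["phenomenon"]
        for example in [annotated_example]
        for step in example["inference_steps"]
    )
    print(phenomena)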
Motivation
---------------
Broadly speaking, the RTE challenge frames Natural Language
Understanding in the context of recognizing when two text fragments
share the same meaning. The task is specified operationally, with a
human-annotated gold standard (similar to many NLP task formulations),
which allows solutions to use any means to reach the correct
decision. This open-endedness has many advantages, but at least one
significant drawback: it is hard to assign partial credit to systems
that make a partially correct inference about a given RTE
example. Most RTE examples require a number of inference steps to
reach the correct answer, so solving only one of them is unlikely to
significantly improve performance on the overall task. As a result,
the RTE evaluation does not provide researchers who develop NLP solutions
for more focused inference tasks with a much-needed resource: a means of
evaluating those solutions in the context of an end-to-end inference
application.
Anticipated Benefits
-----------------------------
Anyone is free to augment the annotation of existing data with a new
entailment phenomenon, and/or to augment the pool of RTE data with
examples that highlight a phenomenon in which they are interested
(provided they also annotate this data with the full explanation-based
analysis), motivating other researchers to engage with these phenomena.
RTE system developers and NLP researchers can assess the relative
importance of entailment phenomena in RTE corpora, by examining the
distribution of labeled entailment phenomena.
RTE evaluation improves, allowing RTE system developers to evaluate the
contributions of individual components that may have broad application
to textual inference tasks.
References
-----------------
1. L. Bentivogli, I. Dagan, H. T. Dang, D. Giampiccolo, and B. Magnini.
"The Fifth PASCAL Recognizing Textual Entailment Challenge." In Proceedings
of the Text Analysis Conference (TAC) Workshop, Gaithersburg, MD, USA, 2009.
2. M. Sammons, V. Vydiswaran, and D. Roth. "Ask not what Textual
Entailment can do for you..." In Proceedings of the Annual Meeting of the
Association for Computational Linguistics (ACL), 2010.
Mark Sammons
Principal Research Scientist
Cognitive Computation Group
University of Illinois, Dept. Computer Science
(217) 265-6759 mssammon at uiuc.edu