Danilo Giampiccolo giampiccolo at celct.it
Mon May 26 12:27:51 UTC 2008

[Apologies for cross-postings]



The three Recognising Textual Entailment (RTE) challenges held so far (see the RTE websites: http://www.pascal-network.org/Challenges/RTE/; http://www.pascal-network.org/Challenges/RTE2/; http://www.pascal-network.org/Challenges/RTE3/) have shown that interest in Textual Entailment research has grown steadily over the years. The RTE organizing committee is now glad to announce the 4th round of the Recognizing Textual Entailment (RTE) Challenge, organized as a track within NIST's new Text Analysis Conference (TAC, http://www.nist.gov/tac/).


Although RTE4 will maintain the basic structure of the previous challenges, some important changes will be introduced in order to make the task more stimulating and bring the research in textual entailment to the next level. In particular:

- the task will include the 3-way classification task piloted in RTE3 (see http://nlp.stanford.edu/RTE3-pilot/), allowing the systems to make a further distinction between hypotheses unknown on the basis of the texts and hypotheses contradicted or proved false by the texts (submitting 2-way classifications will still be possible, and 2-way results will be announced as well).

- the number of pairs will be increased to 300 in two of the application settings, namely Information Extraction and Information Retrieval, as analysis of the results of previous challenges has shown them to be more difficult. The test set will consist of 1000 pairs (300 each for IE and IR, 200 each for SUM and QA).

- there will be no development data set for RTE4; participants are invited to use the past RTE data for training.

- in 2008 the RTE challenge will be carried out for the first time as a track at the Text Analysis Conference, organised by NIST (more details at http://www.nist.gov/tac/). Registration is open until June 1, 2008, at the TAC 2008 RTE Track website (http://www.nist.gov/tac/tracks/2008/rte/index.html).


RTE is the task of recognizing that the meaning of one text, termed H(ypothesis), can be inferred from the content of another, termed T(ext). Given a set of pairs of T's and H's as input, the systems must recognise whether each T entails the corresponding H, deciding whether:

-T entails H
-T contradicts H, or shows it false
-the veracity of H is unknown on the basis of T.

System results will be compared to a human-annotated gold-standard test corpus. Examples of three-way judgments taken from last year's pilot task are given at the bottom of this message.
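
As an illustration of how a run might be compared against the gold standard, here is a minimal sketch in Python. It is not the official scorer: the label names (YES, NO, UNKNOWN) follow the examples in this message, while the use of plain accuracy, the rule that collapses NO and UNKNOWN into a single 2-way NO, and the function names `to_two_way` and `accuracy` are assumptions made for this example. The track guidelines, once available, are the authoritative reference.

```python
from typing import Dict

# Label names follow the examples in this message; everything else is hypothetical.
THREE_WAY = {"YES", "NO", "UNKNOWN"}


def to_two_way(label: str) -> str:
    """Collapse a 3-way judgment to 2-way (assumption: only YES counts as entailment)."""
    if label not in THREE_WAY:
        raise ValueError(f"unexpected label: {label!r}")
    return "YES" if label == "YES" else "NO"


def accuracy(gold: Dict[str, str], predicted: Dict[str, str], two_way: bool = False) -> float:
    """Fraction of gold pairs judged correctly; a missing prediction counts as wrong."""
    if not gold:
        raise ValueError("gold standard is empty")
    correct = 0
    for pair_id, gold_label in gold.items():
        pred = predicted.get(pair_id)
        if pred is None:
            continue
        if two_way:
            gold_label, pred = to_two_way(gold_label), to_two_way(pred)
        if pred == gold_label:
            correct += 1
    return correct / len(gold)


# Toy data, invented for illustration only (pair id -> judgment).
gold = {"1": "YES", "2": "NO", "3": "UNKNOWN", "4": "UNKNOWN"}
system = {"1": "YES", "2": "UNKNOWN", "3": "UNKNOWN", "4": "YES"}

three_way_acc = accuracy(gold, system)                # pairs 1 and 3 match
two_way_acc = accuracy(gold, system, two_way=True)    # pairs 1, 2 and 3 match
print(three_way_acc, two_way_acc)
```

Under this collapsing rule, a system that confuses NO with UNKNOWN is penalised in the 3-way score but not in the 2-way score, which is why the same run can be reported under both settings.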

As in previous challenges, the test data sets will be based on multiple data sources, intended to be representative of typical problems encountered by applied systems. Specifically, data types corresponding to the following application areas will be used:

-Question Answering (QA): simulating a QA scenario in which the hypothesized answer has to be inferred from the candidate text passage

-Information Retrieval (IR): choosing propositional queries as hypotheses, and proposing relevant and irrelevant sentences retrieved by IR systems as texts

-Information Extraction/Relation Extraction (IE): generating T-H pairs, picking positive and negative examples of typical outputs of IE systems

-Summarization (SUM): converting sentence pairs produced by multi-document text summarization systems into T-H pairs

More details can be found at the RTE-3 website (http://www.pascal-network.org/Challenges/RTE3/). The guidelines for participants will be available at the track website shortly.



The RTE Resource Pool, set up for the first time during RTE3, serves as a portal and forum for publicizing and tracking resources, and reporting on their use. RTE participants and other members of the NLP community who develop or use relevant resources are encouraged to contribute to this important resource.


Registration deadline:                      EXTENDED to 27 June 2008

Test Set Release:                           2 September 2008

Submissions:                                9 September 2008

Release of individual evaluated results:    12 September 2008

Workshop:                                   17-18 November 2008, at TAC 2008


Danilo Giampiccolo, CELCT (Trento), Italy (Coordinator, giampiccolo at celct.it)
Hoa Dang, NIST, USA (Coordinator, hoa.dang at nist.gov)
Ido Dagan, Bar Ilan University, Israel
Bill Dolan, Microsoft Research, USA
Bernardo Magnini, FBK-irst (Trento), Italy


Johan Bos, University of Rome "La Sapienza", Italy
Christopher Manning, Stanford, USA
Dan Moldovan, University of Texas at Dallas, USA
Dan Roth, UIUC, USA
Annie Zaenen, Palo Alto Research Center, USA
Fabio Massimo Zanzotto, University of Rome "Tor Vergata", Italy


Examples of three-way judgments taken from last year's pilot task:

T: After his release, the clean-shaven Magdy el-Nashar told reporters outside his home that he had nothing to do with the July 7 transit attacks, which killed 52 people and the four bombers.
H: 52 people and four bombers were killed on July 7.
Entailment: YES

T: Mrs. Bush's approval ratings have remained very high, above 80%, even as her husband's have recently dropped below 50%.
H: 80% approve of Mr. Bush.
Entailment: NO

T: Recent Dakosaurus research comes from a complete skull found in Argentina in 1996, studied by Diego Pol of Ohio State University, Zulma Gasparini of Argentina's National University of La Plata, and their colleagues.
H: A complete Dakosaurus was discovered by Diego Pol.
Entailment: UNKNOWN

T: The British tabloids portrayed Nicholas Leeson as a working-class villain who single-handedly brought down Barings PLC, a 233-year-old London merchant bank that helped finance the Napoleonic wars.
H: Barings was Britain's oldest merchant bank.
Entailment: UNKNOWN

T: The floods were exceptional since they affected an extensive area across Europe from the UK to Spain and as far east as the Black Sea coast. Economic losses amounted to EUR 9.2 bn in Germany, EUR 2.9 bn in Austria and EUR 2.3 bn in the Czech Republic. Total economic damage exceeds EUR 15 bn.
H: Flooding in Europe causes major economic losses.
Entailment: YES

T: Oscar-winning director Franco Zeffirelli has been awarded an honorary knighthood for his "valuable services to British performing arts".
H: Italian director is awarded an honorary Oscar.
Entailment: NO
