22.5129, FYI: 2nd Call for Participation: CLTE at SemEval-2012

Tue Dec 20 16:24:46 UTC 2011

LINGUIST List: Vol-22-5129. Tue Dec 20 2011. ISSN: 1069 - 4875.

Subject: 22.5129, FYI: 2nd Call for Participation: CLTE at SemEval-2012

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews: Veronika Drake, U of Wisconsin-Madison
         Monica Macaulay, U of Wisconsin-Madison
         Rajiv Rao, U of Wisconsin-Madison
         Joseph Salmons, U of Wisconsin-Madison
         Anja Wanner, U of Wisconsin-Madison
         <reviews at linguistlist.org>

Homepage: http://linguistlist.org

The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.

Editor for this issue: Brent Miller <brent at linguistlist.org>
================================================================  

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.cfm.

===========================Directory==============================  

1)
Date: 20-Dec-2011
From: Danilo Giampiccolo [giampiccolo at celct.it]
Subject: 2nd Call for Participation: CLTE at SemEval-2012

-------------------------Message 1 ---------------------------------- 
Date: Tue, 20 Dec 2011 11:24:34
From: Danilo Giampiccolo [giampiccolo at celct.it]
Subject: 2nd Call for Participation: CLTE at SemEval-2012

Second Call for Participation: Cross-lingual Textual Entailment for 
Content Synchronization

Update: The CLTE training set and the test scripts are now available!

For further information on how to obtain them, please visit 
http://www.cs.york.ac.uk/semeval-2012/task8/index.php?id=data

We invite participants to a new SemEval-2012 task: Cross-lingual 
Textual Entailment (CLTE) for Content Synchronization. 

http://www.cs.york.ac.uk/semeval-2012/task8/

Given a pair of topically related text fragments (T1 and T2) in different 
languages, the CLTE task consists of automatically annotating it with 
one of the following entailment judgments: 
- Bidirectional (T1 entails T2; T2 entails T1)
- Forward (T1 entails T2; T2 does not entail T1) 
- Backward (T1 does not entail T2; T2 entails T1)
- No Entailment (T1 does not entail T2; T2 does not entail T1)
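
As an illustration only, the four judgments above can be read as the
combination of two independent directional decisions. The sketch below
assumes a system has already decided, for each direction, whether
entailment holds; the names Judgment and combine, and the label strings,
are hypothetical and are not taken from the task guidelines or the
official test scripts.

```python
from enum import Enum


class Judgment(Enum):
    # Label strings are illustrative assumptions, not the official
    # labels used in the CLTE datasets.
    BIDIRECTIONAL = "bidirectional"
    FORWARD = "forward"
    BACKWARD = "backward"
    NO_ENTAILMENT = "no_entailment"


def combine(t1_entails_t2: bool, t2_entails_t1: bool) -> Judgment:
    """Map two directional entailment decisions to one CLTE judgment."""
    if t1_entails_t2 and t2_entails_t1:
        return Judgment.BIDIRECTIONAL
    if t1_entails_t2:
        return Judgment.FORWARD
    if t2_entails_t1:
        return Judgment.BACKWARD
    return Judgment.NO_ENTAILMENT


if __name__ == "__main__":
    # Print the full decision table for the two directions.
    for forward in (True, False):
        for backward in (True, False):
            print(forward, backward, combine(forward, backward).value)
```

How the two directional decisions are obtained (for example, by a
cross-lingual classifier) is outside the scope of this sketch.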

Datasets are available for the following language combinations:
- Spanish/English  
- German/English
- Italian/English
- French/English

The CLTE task addresses textual entailment recognition under a new 
dimension (cross-linguality), and within a new challenging application 
scenario (content synchronization).

Cross-linguality represents a dimension of the TE recognition problem 
that so far has been only partially investigated. The great potential of 
integrating monolingual TE recognition components into NLP 
architectures has been reported in several areas, including question 
answering, information retrieval, information extraction, and document 
summarization. However, mainly due to the absence of CLTE 
recognition components, similar improvements have not been achieved 
yet in any cross-lingual application. The CLTE task aims at prompting 
research to fill this gap.

Content synchronization represents a challenging application scenario 
to test the capabilities of advanced NLP systems. Given two documents 
about the same topic written in different languages (e.g. Wikipedia 
articles), the task consists of automatically detecting and resolving 
differences in the information they provide, in order to produce aligned, 
mutually enriched versions of the two documents. Towards this 
ambitious objective, a crucial requirement is to identify the information 
in one page that is equivalent or novel (more informative) with respect 
to the content of the other. The task can be naturally cast as an 
entailment-related problem, where bidirectional and unidirectional 
entailment judgments for two text fragments are respectively mapped 
into judgments about semantic equivalence and novelty. Alternatively, 
the task can be seen as a Machine Translation evaluation problem, 
where judgments about semantic equivalence and novelty relate to the 
possibility that one text fragment is the full or partial translation of the 
other.
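
A minimal sketch of the entailment-based reading described above,
assuming that a judgment is already available for a pair of fragments.
The function name sync_status and all label strings are hypothetical.
The reading of a one-way entailment as "the entailing fragment is the
more informative one" is an interpretation of the paragraph above, and
the announcement does not say how pairs with no entailment are handled,
so the sketch returns None for them.

```python
from typing import Optional


def sync_status(judgment: str) -> Optional[str]:
    """Map a CLTE judgment to a content-synchronization reading.

    Assumption: if T1 entails T2 but not vice versa, T1 carries
    information that T2 lacks, so T1 is the more informative fragment.
    """
    mapping = {
        "bidirectional": "equivalent",
        "forward": "t1_more_informative",
        "backward": "t2_more_informative",
    }
    # No mapping is stated for pairs without entailment.
    return mapping.get(judgment)


if __name__ == "__main__":
    for label in ("bidirectional", "forward", "backward", "no_entailment"):
        print(label, "->", sync_status(label))
```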

The Task Guidelines are available at: 
http://www.cs.york.ac.uk/semeval-2012/task8/index.php?id=guidelines

Proposed schedule:

* September 1, 2011: Trial Dataset released (40 English/Spanish pairs)
* December 16, 2011: Training data and test scripts released
* February 10, 2012: Test data release
* February 20, 2012: Task submissions deadline
* March 1, 2012: Release of individual results
* March 10, 2012: System reports due to organizers
* March 25, 2012: Paper reviews due to participants
* April 1, 2012: Camera Ready deadline

If you are interested in the task, please join the discussion group 
http://groups.google.com/group/clte-semeval

Best regards,

The CLTE track organizers

Matteo Negri, Yashar Mehdad, Luisa Bentivogli (FBK-irst, Trento, Italy)
Danilo Giampiccolo, Alessandro Marchetti (CELCT, Trento, Italy) 

Linguistic Field(s): Computational Linguistics
