27.4072, FYI: Shared Task on Semantic Similarity (STS 2017)
LINGUIST List: Vol-27-4072. Tue Oct 11 2016. ISSN: 1069-4875.
Subject: 27.4072, FYI: Shared Task on Semantic Similarity (STS 2017)
Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry,
Robert Coté, Michael Czerniakowski)
Homepage: http://linguistlist.org
***************** LINGUIST List Support *****************
Fund Drive 2016
25 years of LINGUIST List!
Please support the LL editors and operation with a donation at:
http://funddrive.linguistlist.org/donate/
Editor for this issue: Kenneth Steimel <ken at linguistlist.org>
================================================================
Date: Tue, 11 Oct 2016 19:42:47
From: Daniel Cer [cer at google.com]
Subject: Shared Task on Semantic Similarity (STS 2017)
Call for Participation:
SemEval 2017 Task 1: Semantic Textual Similarity (STS)
Semantic Textual Similarity (STS) measures the degree of equivalence in the
underlying semantics of paired snippets of text. While making such an
assessment is trivial for humans, constructing algorithms and computational
models that mimic human-level performance represents a difficult and deep
natural language understanding problem.
STS evaluations have seen significant progress in methods targeted at a
specific language such as English or Spanish. For the 2017 shared task, the
emphasis is on building multilingual textual similarity models that are
capable of assessing both same language and cross-lingual sentence pairs. The
primary evaluation for the shared task assesses methods over a combination of
same language pairs in Arabic, English and Spanish as well as cross-lingual
Arabic-English and Spanish-English pairs.
To encourage the development of methods that can be readily applied or adapted
to new languages, we also provide an optional evaluation track with a surprise
language that will only be announced at the beginning of the evaluation
period. This optional track provides an opportunity to explore STS models
capable of zero-shot learning via mechanisms such as multilingual embeddings.
In addition to the multilingual primary evaluation and the surprise language
track, a number of language and language pair specific tracks are also
provided. We hope that these tracks will give participants with expertise in a
particular language a chance to excel, as well as provide an opportunity to
compare the performance of multilingual and language-specific methods.
Task Definition:
Given two sentences, participants are asked to produce a continuous-valued
similarity score on a scale from 0 to 5, with 0 indicating that the semantics
of the sentences are completely independent and 5 signifying semantic
equivalence. Performance is assessed by computing the Pearson correlation
between machine-assigned semantic similarity scores and human judgments.
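For concreteness, this metric can be reproduced with standard tools. The short
Python sketch below illustrates the calculation; the file names and the
one-score-per-line format are assumptions for illustration only, not the
official evaluation script.

    # Illustrative Pearson-correlation scoring sketch (not the official script).
    # Assumes two plain-text files, aligned line by line, with one similarity
    # score per line: the gold standard and the system output.
    from scipy.stats import pearsonr

    def evaluate(gold_path, system_path):
        with open(gold_path) as f:
            gold = [float(line) for line in f if line.strip()]
        with open(system_path) as f:
            system = [float(line) for line in f if line.strip()]
        assert len(gold) == len(system), "gold and system scores must be aligned"
        r, _ = pearsonr(gold, system)  # correlation coefficient and p-value
        return r

    # Example (hypothetical file names):
    # print(evaluate("STS.gs.sample.txt", "my_system_scores.txt"))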
Following the emphasis on building multilingual and cross-lingual models, the
2017 shared task is organized into the following seven multilingual and
cross-lingual tracks:
- Track 0 - Primary: Combined evaluation of all announced monolingual and
cross-lingual language pairings explored by the 2017 task: ar-ar, ar-en,
en-en, es-en, and es-es. The primary track will not include the surprise
language evaluation data.
- Track 1 - Arabic-Arabic: Evaluation only on ar-ar pairs.
- Track 2 - Arabic-English: Evaluation only on ar-en pairs.
- Track 3 - Spanish-Spanish: Evaluation only on es-es pairs.
- Track 4 - Spanish-English: Evaluation only on es-en pairs.
- Track 5 - English-English: Evaluation only on en-en pairs.
- Track 6 - Surprise language track (announced during the evaluation period)
For all language pairings, participants will be provided with two
sentence-length snippets of text, s1 and s2, and asked to compute and return a
continuous-valued semantic similarity score for the pair.
The cross-lingual language pairings (ar-en, es-en) only differ from the
monolingual language pairings (ar-ar, en-en, es-es) in that the two text
snippets in each pair are written in different languages. The inclusion of
cross-lingual STS pairs follows a successful pilot in 2016 that paired English
and Spanish sentences. Depending on the approach used to compute the
similarity scores, the cross-lingual pairs may present varying degrees of
difficulty when adapting the underlying model to handle them.
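As one illustration of a model that treats monolingual and cross-lingual pairs
uniformly, the Python sketch below embeds both snippets in a shared
multilingual vector space, averages the word vectors, and rescales the cosine
similarity to the 0-5 range. The embedding table, the whitespace tokenizer,
and the rescaling are assumptions of the sketch, not part of the task
definition.

    # Minimal sketch of a language-agnostic similarity scorer.
    # Assumes `embeddings` maps words from all relevant languages into one
    # shared vector space, e.g. {word: numpy array of dimension dim}.
    import numpy as np

    def sentence_vector(snippet, embeddings, dim):
        vectors = [embeddings[w] for w in snippet.lower().split() if w in embeddings]
        if not vectors:
            return np.zeros(dim)
        return np.mean(vectors, axis=0)

    def sts_score(s1, s2, embeddings, dim=300):
        v1 = sentence_vector(s1, embeddings, dim)
        v2 = sentence_vector(s2, embeddings, dim)
        denom = np.linalg.norm(v1) * np.linalg.norm(v2)
        if denom == 0.0:
            return 0.0
        cosine = float(np.dot(v1, v2) / denom)  # cosine similarity in [-1, 1]
        return 5.0 * max(cosine, 0.0)           # crude mapping onto the 0-5 scale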
Participants are encouraged to review the successful approaches to monolingual
and cross-lingual STS from prior years of the STS shared task (Agirre et al.
2016; Agirre et al. 2015; Agirre et al. 2014; Agirre et al. 2013; Agirre et
al. 2012).
2017 Data:
This year's shared task includes one evaluation set for each of the seven
tracks described above. Each evaluation set consists of between 200 and 250
sentence pairs. Within each evaluation set, we will attempt to approximately
balance the distribution of STS scores.
For training data, participants are encouraged to make use of all existing
English, Spanish and cross-lingual English-Spanish data sets from prior STS
evaluations. This includes all previously released trial, training and
evaluation data.
Since this is the first year that we will include Arabic as part of an STS
evaluation, we will release training data for both monolingual Arabic and
cross-lingual Arabic-English. Each training set will consist of approximately
14,000 pairs sourced from prior English STS evaluations.
As with the 2016 evaluation, participants are allowed and very much encouraged
to train purely unsupervised models and model components on arbitrary data
(e.g., unsupervised word embeddings).
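For example, unsupervised word embeddings could be trained on any raw
monolingual corpus and then plugged into a similarity model. The sketch below
uses gensim's Word2Vec (gensim 4.x API); the corpus file and hyperparameters
are illustrative choices, not tools prescribed by the task.

    # Sketch: training unsupervised word embeddings on arbitrary raw text.
    # Assumes a plain-text corpus with one whitespace-tokenized sentence per line.
    from gensim.models import Word2Vec

    with open("monolingual_corpus.txt", encoding="utf-8") as f:
        sentences = [line.split() for line in f if line.strip()]

    model = Word2Vec(sentences=sentences, vector_size=300, window=5,
                     min_count=5, workers=4)
    model.wv.save("embeddings.kv")  # reusable word vectors for a similarity model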
Participation:
Registration: To register, please complete the following form:
https://docs.google.com/forms/d/e/1FAIpQLScXnt7qeioCPyxu6dv9wrSDYaF04bRgVBFCUbahxsAG6F43Sg/viewform
Website and trial data: For more details, including trial data, see the STS
SemEval 2017 Task 1 webpage at: http://alt.qcri.org/semeval2017/task1/
Mailing List: Join the mailing list for task updates and discussion at:
http://groups.google.com/group/STS-semeval.
Important Dates:
Trial data ready: Wed 21 Sep 2016
Training data ready: Mon 24 Oct 2016
Evaluation start: Mon 09 Jan 2017
Evaluation end: Mon 30 Jan 2017
Results posted: Mon 06 Feb 2017
Paper submissions due: Mon 27 Feb 2017
Author notifications: Mon 03 Apr 2017
Camera ready submissions due: Mon 17 Apr 2017
SemEval workshop: Summer 2017
Organizers (alphabetical order):
Eneko Agirre
Daniel Cer
Mona Diab
Lucia Specia
References:
Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Rada
Mihalcea, German Rigau, Janyce Wiebe. SemEval-2016 Task 1: Semantic Textual
Similarity, Monolingual and Cross-Lingual Evaluation. Proceedings of SemEval
2016.
Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor
Gonzalez-Agirre, Weiwei Guo, Inigo Lopez-Gazpio, Montse Maritxalar, Rada
Mihalcea, German Rigau, Larraitz Uria and Janyce Wiebe. SemEval-2015 Task 2:
Semantic Textual Similarity, English, Spanish and Pilot on Interpretability.
Proceedings of SemEval 2015.
Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor
Gonzalez-Agirre, Weiwei Guo, Rada Mihalcea, German Rigau and Janyce Wiebe.
SemEval-2014 Task 10: Multilingual Semantic Textual Similarity. Proceedings of
SemEval 2014.
Eneko Agirre, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre and Weiwei Guo.
*SEM 2013 shared task: Semantic Textual Similarity. Proceedings of *SEM 2013.
Eneko Agirre, Daniel Cer, Mona Diab and Aitor Gonzalez-Agirre. SemEval-2012
Task 6: A Pilot on Semantic Textual Similarity. Proceedings of SemEval 2012.
Linguistic Field(s): Cognitive Science
Computational Linguistics
Linguistic Theories
Pragmatics
Semantics
Subject Language(s): Arabic, Standard (arb)
English (eng)
Spanish (spa)
------------------------------------------------------------------------------