RepEval 2017: The Second Workshop on Evaluating Vector-Space Representations for NLP

Fri Feb 24 05:41:39 UTC 2017

CALL FOR PAPERS

==========================================================================================
===RepEval 2017: The Second Workshop on Evaluating Vector-Space
Representations for NLP===
==========================================================================================

Mission Statement: To foster the development of new and improved ways of
measuring the quality and understanding the properties of vector space
representations in NLP.

Time & Location: September 8, 2017, Copenhagen, Denmark (EMNLP 2017 workshop).

Website: https://repeval2017.github.io/

===Motivation===

Models that learn real-valued vector representations of words, phrases,
sentences, and even documents are ubiquitous in today's NLP landscape. These
representations are usually obtained by training a model on large amounts
of unlabeled data, and then employed in NLP tasks and downstream
applications. While such representations should ideally be evaluated
according to their value in these applications, doing so is laborious, and
it can be hard to rigorously isolate the effects of different
representations for comparison. There is therefore a need for evaluation
via simple and generalizable proxy tasks. To date, these proxy tasks have
been mainly focused on lexical similarity and relatedness, and do not
capture the full spectrum of interesting linguistic properties that are
useful for downstream applications. This workshop challenges its
participants to propose methods and/or design benchmarks for evaluating the
next generation of vector space representations, for presentation and
detailed discussion at the event.

===Submissions===

We encourage researchers at all levels of experience to consider
contributing to the discussion at RepEval by making a short submission to
either of two tracks:

=Shared Task=

Starting this year, RepEval will feature a shared task for evaluating
general-purpose sentence representations. This year’s task will be natural
language inference (also known as recognizing textual entailment, or RTE)
in the style of SNLI - a three-class balanced classification problem over
sentence pairs. The shared task will feature a new, dedicated dataset that
spans several genres of text. It will include two evaluations: a standard
in-domain evaluation in which the training and test data are drawn from the
same sources, and a cross-domain evaluation in which the training and test
data differ substantially. This cross-domain evaluation
will test the ability of submitted systems to learn representations of
sentence meaning that capture broadly useful features.
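
As an informal illustration of the two evaluations described above, the
following minimal sketch scores a classifier separately on in-domain and
cross-domain examples. It assumes accuracy as the metric, which this call
does not specify, and the function name, the example tuple layout, and the
genre names are hypothetical rather than part of the official task.

```python
from collections import defaultdict


def accuracy_by_split(examples, predict, training_genres):
    """Return accuracy per split for a three-class NLI classifier.

    examples: iterable of (premise, hypothesis, genre, gold_label) tuples
              (hypothetical layout, not the official data format).
    predict: callable mapping (premise, hypothesis) to a label string.
    training_genres: set of genres that appeared in the training data.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for premise, hypothesis, genre, gold in examples:
        # A genre seen during training counts as in-domain; any other
        # genre counts as cross-domain.
        split = "in-domain" if genre in training_genres else "cross-domain"
        total[split] += 1
        if predict(premise, hypothesis) == gold:
            correct[split] += 1
    return {split: correct[split] / total[split] for split in total}
```

Reporting the two numbers side by side makes it visible when a system does
well only on the genres it was trained on.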

More details are available online: https://repeval2017.github.io/shared/

=Proposals=

A proposal submission should propose a novel method for evaluating
representations. It does not have to construct an actual dataset, but it
should describe a way (or several optional ways) of collecting one.
Proposals are expected to provide roughly 5-10 examples in the manuscript
as a proof of concept.

In addition, each proposal should explicitly mention:
* Which type of representation it evaluates (e.g. word, sentence, document)
* For which downstream application(s) it functions as a proxy
* Any linguistic/semantic/psychological properties it captures

Among other important points, proposals should take the following into
consideration:
* If the task captures some linguistic phenomenon via annotators, what
evidence is there that it is robustly observed in humans (e.g.,
inter-annotator agreement)?
* How easy would it be for other researchers to accurately reproduce the
evaluation (not necessarily the dataset)?
* Will the dataset be cost-effective to produce?
* Is a specific family of models expected to perform particularly well
(or poorly) on the task? In other words, which types of models is this
evaluation targeted at?
* How should the evaluation's results be interpreted?

We hope that one or more of these proposals will evolve into next year’s
shared task (RepEval 2018).

=Submission Format=

Submissions to both tracks should be 2-4 pages of content in EMNLP format,
with an unlimited number of pages for references. For the proposal track,
we encourage shorter content (2-3 pages), leaving more room for examples
and their visualization.

===Important Dates===

=Shared Task=

By March 15: Training and development data available, draft data
description paper available, competition begins
By May 1: Expert-tagged development data for error analysis available
June 1: Unlabeled test data available, evaluation period begins, Kaggle
evaluation site opens
June 14 (GMT-11, 23:59:59): Evaluation period ends, system description
papers and code packages due
June 16: Winners formally announced
July 3 (GMT-11, 23:59:59): Reviews due
July 6: Notification of presentation acceptance
July 21 (GMT-11, 23:59:59): Camera-ready papers due
September 8: Workshop at EMNLP 2017, Copenhagen: shared task poster session
and selected short talks

=Proposals=

June 14 (GMT-11, 23:59:59): Proposal papers due
July 3 (GMT-11, 23:59:59): Reviews due
July 6: Acceptance notification
July 21 (GMT-11, 23:59:59): Camera-ready papers due

===Organizers===

Sam Bowman, New York University
Yoav Goldberg, Bar-Ilan University
Felix Hill, Google DeepMind
Angeliki Lazaridou, University of Trento
Omer Levy, University of Washington
Roi Reichart, Technion - Israel Institute of Technology
Anders Søgaard, University of Copenhagen