LINGUIST List: Vol-30-482. Wed Jan 30 2019. ISSN: 1069 - 4875.

Subject: 30.482, Calls: Computational Linguistics/USA

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Peace Han, Nils Hjortnaes, Yiwen Zhang, Julian Dietrich
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: Wed, 30 Jan 2019 05:55:00
From: Anna Rogers [anna_rogers at uml.edu]
Subject: Third Workshop on Evaluating Vector Space Representations for NLP (co-located with NAACL 2019)

 
Full Title: Third Workshop on Evaluating Vector Space Representations for NLP (co-located with NAACL 2019) 
Short Title: RepEval2019 

Date: 06-Jun-2019 - 06-Jun-2019
Location: Minneapolis, MN, USA 
Contact Person: Anna Rogers
Meeting Email: anna_rogers at uml.edu
Web Site: https://repeval2019.github.io/ 

Linguistic Field(s): Computational Linguistics 

Call Deadline: 06-Mar-2019 

Meeting Description:

General-purpose dense word embeddings have come a long way since the beginning
of their boom in 2013, and they are still the most widely used way of
representing words in both industrial and academic NLP systems. However, the
problem of finding intrinsic metrics that are predictive of performance on
downstream tasks, and that can help to develop better representations, is far
from solved. At the sentence level and above, we now have a number of probing
tasks and large extrinsic evaluation datasets targeting high-level verbal
reasoning, but there is still much to learn about what features make a
compositional representation successful. Last but not least, there are no
established intrinsic methods for newer kinds of representations such as ELMo,
BERT, or box embeddings.
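
As a toy illustration of the intrinsic/extrinsic distinction above, the sketch
below computes one common intrinsic score: the Spearman correlation between
cosine similarities of word vectors and human similarity judgments (as in
WordSim-style benchmarks). The embeddings and rated word pairs are
hypothetical placeholders, not data associated with this call.

    # Minimal sketch of an intrinsic word-embedding evaluation:
    # correlate cosine similarities of word vectors with human
    # similarity ratings. All data below is hypothetical toy data.
    import numpy as np
    from scipy.stats import spearmanr

    embeddings = {                      # word -> vector (toy values)
        "cat":   np.array([0.9, 0.1, 0.0]),
        "dog":   np.array([0.8, 0.2, 0.1]),
        "car":   np.array([0.1, 0.9, 0.3]),
        "truck": np.array([0.2, 0.8, 0.4]),
    }

    # (word1, word2, human similarity rating on a 0-10 scale)
    rated_pairs = [("cat", "dog", 8.5), ("car", "truck", 8.0),
                   ("cat", "car", 1.5), ("dog", "truck", 1.0)]

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    model_scores = [cosine(embeddings[w1], embeddings[w2])
                    for w1, w2, _ in rated_pairs]
    human_scores = [rating for _, _, rating in rated_pairs]

    rho, _ = spearmanr(model_scores, human_scores)
    print(f"Spearman correlation with human judgments: {rho:.2f}")

Whether such a score actually predicts downstream performance is exactly the
kind of open question the workshop addresses.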

The third edition of RepEval aims to foster discussion of the above issues,
and to support the search for high-quality, general-purpose representation
learning techniques for NLP. We hope to encourage interdisciplinary dialogue
by welcoming diverse perspectives on these issues: submissions may focus on
properties of the embedding space, performance analysis on various downstream
tasks, or approaches based on linguistic and psychological data. In
particular, experts from the latter fields are encouraged to contribute
analyses of claims previously made in the NLP community.

Research paper submissions may consist of 4-6 pages of content, plus unlimited
references. An additional page in the camera-ready version will be available
for addressing reviewers' comments. Please refer to the NAACL author
guidelines for the style files, policy on double submissions and preprints:
https://naacl2019.org/calls/papers/#author-guidelines


Call for Papers:

RepEval 2019 invites submissions on approaches to extrinsic and intrinsic
evaluation of distributional meaning representations, including evaluation
motivated by linguistic, psycholinguistic or neurological evidence. For the
former, we still know very little about which linguistic phenomena
distributional meaning representations should capture in order to perform
better on downstream tasks, and how general such representations can be.
Improved diagnostic tests for word and morpheme compositionality and for all
kinds of semantic relations, interpretability of distributional meaning
representations, and validation of existing approaches in cross-lingual
studies, especially with languages of different families - these are all
topics in which computational linguistics needs to be truly interdisciplinary.
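
Purely as an illustration of what a diagnostic ("probing") test can look like
in practice, the following sketch trains a simple linear classifier to recover
a hypothetical linguistic property from frozen sentence representations; high
probe accuracy would suggest the property is encoded in the representation.
The feature matrix and labels here are random placeholders, not a proposed
dataset.

    # Minimal sketch of a probing-style diagnostic: train a linear
    # classifier to predict a linguistic property (e.g. grammatical
    # number of the subject) from frozen sentence embeddings.
    # X (embeddings) and y (property labels) are random toy data.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))     # 200 sentence vectors, dim 50
    y = rng.integers(0, 2, size=200)   # binary property labels

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    acc = accuracy_score(y_test, probe.predict(X_test))
    print(f"Probing accuracy: {acc:.2f}")  # near chance here: data is random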

We invite practical proposals for new evaluation techniques that
experimentally demonstrate the benefits of the new approach. Submissions may
also focus on critical analysis of existing approaches (especially by experts
in other domains such as linguistics or psychology), or on methodological
caveats (reproducibility, the impact of parameters, the attribution of results
to the representation rather than the whole system, and dataset
structure/balance/representativeness).

Paper Submission:

Submission is electronic, using the Softconf START conference management at
https://www.softconf.com/naacl2019/repeval/

Authors of analysis papers may wish to consider the following questions:

- Pros and cons of existing evaluations;
- (Mis)attribution of performance improvements to various elements of the
pipeline in complex NLP systems;
- Given a specific downstream application, which existing evaluation (or
family of evaluations) is a good predictor of performance improvement?
- Which linguistic/semantic/psychological properties are captured by existing
evaluations? Which are not?
- What methodological mistakes were made in the creation of existing
evaluation datasets?
- What linguistic/psychological properties of meaning representations are
supposed to make them ''better'', and why?
- The recent tendency is to take high-level reasoning tasks such as question
answering or inference as the ''ultimate'' evaluation for meaning
representations (effectively, a Turing test proxy). How justified is this
approach? Should a ''good'' representation excel at all such tasks (and also
all the low-level ones), or specialize? What alternatives do we have?

Proposal papers should introduce a novel method for evaluating
representations, accompanied by a proof-of-concept dataset (of which at least
a sample should be made available to the reviewers at submission time). The
new method should highlight some previously unnoticed properties of the target
representations, enable a faster or more cost-effective way of measuring some
previously known properties, or demonstrate a significant improvement over
previous proposals (e.g. an update to an imbalanced or noisy dataset showing
that previous claims were misattributed). Each proposal should explicitly
mention:

- Which type of representation it evaluates (e.g. word, sentence, document,
contextualized or not), and what properties of that representation it targets;
- For which downstream application(s) it functions as a proxy;
- Any syntactic/semantic/psychological properties it captures, in comparison
with previous work;
- If any annotation was performed, what was the inter-annotator agreement, and
how cost-effective would it be to scale it up and/or create a similar resource
for other languages?
- Permissions to use and/or release the data.

See the full CFP at https://repeval2019.github.io/cfp/




----------------------------------------------------------
LINGUIST List: Vol-30-482
----------------------------------------------------------