[HPSG-L] 2nd CfP: REPROLANG 2020 (Shared Task on the Reproduction of Research Results in Science and Technology of Language)
Antonio Branco
antonio.branco at di.fc.ul.pt
Fri Jun 21 11:53:29 UTC 2019
[Apologies for multiple postings]
SECOND CALL FOR PAPERS
REPROLANG 2020
Shared Task on the Reproduction of Research Results in Science and
Technology of Language
(part of LREC 2020 conference)
Marseille, France
May 13-15, 2020
http://wordpress.let.vupr.nl/lrec-reproduction
We are very pleased to announce REPROLANG 2020, the Shared Task on the
Reproduction of Research Results in Science and Technology of Language,
organized by ELRA - European Language Resources Association with the
technical support of CLARIN - European Research Infrastructure for
Language Resources and Technology, as part of the LREC 2020 conference.
BACKGROUND
Scientific knowledge is grounded on falsifiable predictions, and thus its
credibility and raison d’être rely on the possibility of repeating
experiments and obtaining results similar to those originally
reported. In many young scientific areas, including ours, the
acknowledgement and promotion of the reproduction of research results
need to be greatly increased.
For this reason, a special track on reproducibility is included in the
LREC 2020 conference regular program (side by side with sessions
on other topics) for papers on the reproduction of research results, and
the present community-wide shared task is launched to elicit and
motivate the spread of scientific work on reproduction. This initiative
builds on the previous pioneering LREC workshops on reproducibility,
4REAL 2016 and 4REAL 2018.
SHARED TASK
The shared task is of a new type. It is partly similar to the usual
competitive shared tasks, in the sense that all participants share a
common goal, but it is partly different from previous shared tasks, in
the sense that its primary focus is on seeking support for and
confirmation of previous results, rather than on surpassing those
results with superior ones. Thus, instead of a competitive shared task,
with each participant striving for an individual top system that scores
as far as possible above a rough baseline, this will be a cooperative
shared task, with participants striving for systems that reproduce an
original complex research experiment as closely as possible, and thus
reinforce the reliability of its results when their outcomes converge.
At the same time, as with competitive shared tasks, participation in
the cooperative shared task offers excellent ground for new ideas for
improvement and new advances beyond the reproduced results.
We invite researchers to reproduce the results of a selected set of
articles, whose authors have consented to their use in this shared
task. Papers submitted for this task are expected to report on
reproduction findings, to document how the results of the original
paper were reproduced, to discuss reproducibility challenges, to report
the time, space or data requirements found for training and testing, to
reflect on lessons learned, to elaborate on recommendations for best
practices, etc.
Submissions that, in addition to the reproduction exercise, also report
on results of the replication of the selected tasks with other
languages, domains, data sets, models, methods, algorithms, downstream
tasks, etc. are also encouraged. These should provide insight
into the robustness of the replicated approaches, their learning
curves and potential for incremental performance, their capacity for
generalization, their transferability across experimental circumstances
and into possible real-life usage scenarios, their suitability to
support further progress, etc.
PUBLICATION
LREC conferences have one of the top h5-index scores of research impact
among the world-class venues for research on Human Language Technology.
Accepted papers for the shared task will be published in the Proceedings
of the LREC 2020 main conference. LREC Proceedings are freely available
from ELRA and ACL Anthology. They are indexed in Scopus (Elsevier) and
in DBLP. LREC 2010, LREC 2012 and LREC 2014 Proceedings are included in
the Thomson Reuters Conference Proceedings Citation Index (the other
editions are being processed).
Substantially extended versions of papers selected by reviewers as the
most appropriate will be considered for publication in special issues of
the Language Resources and Evaluation Journal published by Springer (a
SCI-indexed journal).
IMPORTANT DATES
November 25, 2019: deadline for paper submission (aligned with LREC 2020)
November 27, 2019: deadline for projects in gitlab.com to go public
February 14, 2020: notification of acceptance
May 11-16, 2020: LREC conference takes place
SELECTED TASKS
The Selection Committee has selected a broad range of papers and tasks.
Chapter A: Lexical processing
Task A.1: Cross-lingual word embeddings
Artetxe, Mikel, Gorka Labaka, and Eneko Agirre. 2018. “A robust
self-learning method for fully unsupervised cross-lingual mappings of
word embeddings”. In Proceedings of the 56th Annual Meeting of the
Association for Computational Linguistics (ACL 2018), pp. 789–798.
http://aclweb.org/anthology/P18-1073
Major reproduction comparables: Accuracy scores (tables 1 to 4).
Task A.2: Named entity embeddings
Newman-Griffis, Denis, Albert M Lai, and Eric Fosler-Lussier. 2018.
“Jointly Embedding Entities and Text with Distant Supervision”. In
Proceedings of The Third Workshop on Representation Learning for NLP,
pp. 195–206.
http://aclweb.org/anthology/W18-3026
Major reproduction comparables: Spearman’s ρ scores for semantic
similarity predictions (tables 3 and 4), and accuracy scores (table 6).
Chapter B: Sentence processing
Task B.1: POS tagging
Bohnet, Bernd, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily
Pitler, and Joshua Maynez. 2018. “Morphosyntactic Tagging with a
Meta-BiLSTM Model over Context Sensitive Token Encodings”. In
Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (ACL 2018), pp. 2642–2652.
http://aclweb.org/anthology/P18-1246
Major reproduction comparables: f-score values (tables 2 to 8).
Task B.2: Sentence semantic relatedness
Gupta, Amulya, and Zhu Zhang. 2018. “To Attend or not to Attend: A Case
Study on Syntactic Structures for Semantic Relatedness”. In Proceedings
of the 56th Annual Meeting of the Association for Computational
Linguistics (ACL 2018), pp. 2116–2125.
http://aclweb.org/anthology/P18-1197
Major reproduction comparables: Pearson’s r and Spearman’s ρ scores for
semantic relatedness (table 1), and f-score values for paraphrase
detection (table 2).
Chapter C: Text processing
Task C.1: Relation extraction and classification
Rotsztejn, Jonathan, Nora Hollenstein, and Ce Zhang. 2018. “ETH-DS3Lab
at SemEval-2018 Task 7: Effectively Combining Recurrent and
Convolutional Neural Networks for Relation Classification and
Extraction”. In Proceedings of the 12th International Workshop on
Semantic Evaluation (SemEval 2018), pp. 689–696.
http://aclweb.org/anthology/S18-1112
Major reproduction comparables: precision, recall and f-score values
(tables 3 and 4).
Task C.2: Privacy preserving representation
Li, Yitong, Timothy Baldwin, and Trevor Cohn. 2018. “Towards Robust and
Privacy-preserving Text Representations”. In Proceedings of the 56th
Annual Meeting of the Association for Computational Linguistics (ACL
2018), pp. 25–30.
http://aclweb.org/anthology/P18-2005
Major reproduction comparables: POS accuracy scores (tables 1 and 2),
and sentiment analysis f-score values (table 3).
Task C.3: Language modelling
Howard, Jeremy, and Sebastian Ruder. 2018. “Universal Language Model
Fine-tuning for Text Classification”. In Proceedings of the 56th Annual
Meeting of the Association for Computational Linguistics (ACL 2018), pp.
328–339.
http://aclweb.org/anthology/P18-1031
Major reproduction comparables: Error rate (%) scores in sentiment
analysis and question classification tasks (tables 2 and 3).
Chapter D: Applications
Task D.1: Text simplification
Nisioi, Sergiu, Sanja Stajner, Simone Paolo Ponzetto, and Liviu P. Dinu.
2017. “Exploring Neural Text Simplification Models”. In Proceedings of
the 55th Annual Meeting of the Association for Computational Linguistics
(ACL 2017), pp. 85–91.
http://aclweb.org/anthology/P/P17/P17-2014.pdf
Major reproduction comparables: Averaged human evaluation scores, by 3
evaluators, in 1 to 5 and -2 to +2 scales (table 2).
Task D.2: Language proficiency scoring
Vajjala, Sowmya, and Taraka Rama. 2018. “Experiments with Universal CEFR
Classification”. In Proceedings of the Thirteenth Workshop on Innovative
Use of NLP for Building Educational Applications, pp. 147–153.
http://aclweb.org/anthology/W18-0515
Major reproduction comparables: f-score values (tables 2, 3 and 4).
Task D.3: Neural machine translation
Vanmassenhove, Eva, and Andy Way. 2018. “SuperNMT: Neural Machine
Translation with Semantic Supersenses and Syntactic Supertags”. In
Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (ACL 2018), pp. 67–73.
http://aclweb.org/anthology/P18-3010
Major reproduction comparables: BLEU scores (tables 1 and 2; plots in
figures 2, 3 and 4).
Chapter E: Language resources
Task E.1: Parallel corpus construction
Brunato, Dominique, Andrea Cimino, Felice Dell'Orletta, and Giulia
Venturi. 2016. “PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences
for Automatic Text Simplification”. In Proceedings of the 2016
Conference on Empirical Methods in Natural Language Processing (EMNLP
2016), pp. 351–361.
https://aclweb.org/anthology/D16-1034
Major reproduction comparables: data set.
Participants are expected to obtain the data and tools for the
reproduction from the information provided in the paper. Using the
description of the experiment is part of the reproduction exercise.
SUBMISSION
The START platform of LREC 2020 will be used for the submission of the
following required elements: a paper describing the reproduction effort,
and a link to the software and data used to obtain the results reported
in the paper (more details below). The submitted materials and results
will be checked by a CLARIN panel. Papers will be peer-reviewed.
PAPER PREPARATION
REPROLANG 2020 invites the submission of full papers from 4 pages to 8
pages (plus more pages for references if needed). These submissions must
strictly follow the LREC 2020 conference stylesheet which will be
available on the conference website.
MATERIALS PREPARATION
For the submission to be complete and checkable by a CLARIN panel, the
software used to obtain the results reported in the paper must be made
available as a Docker container through a project in gitlab.com. Detailed
instructions are available at https://gitlab.com/CLARIN-ERIC/reprolang/
For technical support, the CLARIN team can be contacted at
reprolang-tc at clarin.eu or an issue can be created under
https://gitlab.com/CLARIN-ERIC/reprolang/issues.
Submissions are done via the START conference management system used by
LREC 2020 and include the following elements:
- URL of your gitlab.com project
- URL of the tar.gz with the datasets
- the md5 checksum of the above tar.gz
- .pdf with the paper, which must include the above URL of your
gitlab.com project and its commit hash and tag
The project in gitlab.com should be made public within 2 days after the
submission deadline.
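As an illustration of the checksum element listed above, the following is a minimal Python sketch of one way to compute the md5 checksum of a dataset archive. It is not part of the official instructions; the function name md5_of_file and the throwaway demonstration file are assumptions made only for this example.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    `md5_of_file` is a hypothetical helper name used for illustration.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large dataset archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Demonstration on a throwaway file; replace its path with the path
    # of the real tar.gz archive before submitting.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".tar.gz") as tmp:
        tmp.write(b"hello")
    try:
        print(md5_of_file(tmp.name))
    finally:
        os.remove(tmp.name)
```

The printed hexadecimal string is the value to report alongside the URL of the tar.gz.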
PRESENTATION
Papers accepted for publication will be presented in a specific session
of the LREC main conference. There is no difference in quality between
oral and poster presentations. Only the appropriateness of the type of
communication (more or less interactive) to the content of the paper
will be considered. The format of the presentations will be decided by
the Program Committee. The proceedings will include both oral and poster
papers in the same format.
REGISTRATION
For a selected paper to be included in the programme and to be published
in the proceedings, at least one of its authors must register for the
LREC 2020 conference by the early bird registration deadline. A single
registration only covers one paper, following the general LREC policy on
registration. The registration service is available on the LREC 2020 website.
CONTACTS
About the shared task:
Piek Vossen
p.t.j.m.vossen at vu.nl
About the preparation and submission of materials:
reprolang-tc at clarin.eu
REPROLANG 2020 website: http://wordpress.let.vupr.nl/lrec-reproduction
STEERING COMMITTEE
António Branco, University of Lisbon (chair of Steering Committee)
Nicoletta Calzolari, ILC, Pisa (co-chair of Steering Committee)
Gertjan van Noord, University of Groningen (chair of Task Selection
Committee)
Piek Vossen, VU University Amsterdam (chair of Program Committee)
Khalid Choukri, ELRA/ELDA
TASK SELECTION COMMITTEE
Gertjan van Noord, University of Groningen (chair)
Tim Baldwin, University of Melbourne
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Çağrı Çöltekin, University of Tuebingen
Nancy Ide, Vassar College, New York
Malvina Nissim, University of Groningen
Stephan Oepen, University of Oslo
Barbara Plank, University of Copenhagen
Piek Vossen, VU University Amsterdam
Dan Zeman, Prague University
PROGRAM COMMITTEE
Piek Vossen, VU University Amsterdam (chair)
Gilles Adda, LIMSI-CNRS, Paris
Eneko Agirre, Basque University
Francis Bond, Nanyang Technological University, Singapore
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Khalid Choukri, ELRA/ELDA
Kevin Cohen, University of Colorado Boulder
Thierry Declerck, DFKI Saarbruecken
Nancy Ide, Vassar College, New York
Antske Fokkens, VU University Amsterdam
Karën Fort, University of Paris-Sorbonne
Cyril Grouin, LIMSI-CNRS
Mark Liberman, University of Pennsylvania
John McCrae, Galway University
Margo Mieskes, University of Applied Sciences Darmstadt
Aurélie Névéol, LIMSI-CNRS
Gertjan van Noord, University of Groningen
Stephan Oepen, University of Oslo
Ted Pedersen, University of Minnesota
Senja Pollak, Jozef Stefan Institute, Ljubljana
Paul Rayson, Lancaster University
Martijn Wieling, University of Groningen
TECHNICAL COMMITTEE
reprolang-tc at clarin.eu
Dieter Van Uytvanck, CLARIN (chair)
André Moreira, CLARIN
Twan Goosen, CLARIN
João Ricardo Silva, CLARIN and University of Lisbon
Luís Gomes, CLARIN and University of Lisbon
Willem Elbers, CLARIN