32.917, FYI: SIGMORPHON 2021 Shared Task 0 on Morphological Inflection - Call for Participation

Sat Mar 13 03:20:31 UTC 2021

LINGUIST List: Vol-32-917. Fri Mar 12 2021. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn, Lauren Perkins
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Nils Hjortnaes, Joshua Sims, Billy Dickson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================

Date: Fri, 12 Mar 2021 22:20:05
From: Edoardo Maria Ponti [edoardomaria.ponti at gmail.com]
Subject: SIGMORPHON 2021 Shared Task 0 on Morphological Inflection - Call for Participation

We invite you to participate in SIGMORPHON’s 6th installment of its inflection
generation shared task, which will be divided into two parts:

Part 1: Generalization Across Typologically Diverse Languages
Part 2: Are We There Yet? A Shared Task on Cognitively Plausible Morphological
Inflection

Please join our Google Group to stay up to date:
https://groups.google.com/forum/#!forum/sigmorphon2021-sharedtask0/join
Register for the task: https://forms.gle/tu4tX648F9kA9eps7
Consult our website for additional information:
https://github.com/sigmorphon/2021Task0
Contact the organizers at the following email address:
sigmorphon+workshop2021 at gmail.com

The shared task will be part of the SIGMORPHON workshop, co-located with
ACL-IJCNLP 2021 in Bangkok, Thailand, on either August 5 or 6, 2021.

Part 1: Generalization Across Typologically Diverse Languages

For the first part of the shared task, participants will design a model that
learns to generate morphological inflections from a lemma and a set of
morphosyntactic features of the target form. Each language has its own
training, development, and test splits. Training and development splits
contain triples, each consisting of a lemma, a target form, and a set of
morphological features, provided in the UniMorph format. Test splits only
provide lemmas and morphological tags: the participants' models will need to
predict the missing target form.
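
As a rough illustration of the data layout described above, the following Python sketch reads one split into (lemma, form, features) records. It assumes tab-separated lines in the column order lemma, target form, features, with features joined by semicolons, and test lines that omit the form column; these layout details, the helper name read_split, and the sample English lines are assumptions made for illustration, so please check the files on the shared task website for the actual format.

```python
import io
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    lemma: str
    form: Optional[str]      # None for test splits, where the form must be predicted
    features: frozenset      # e.g. frozenset({"V", "PST"})


def read_split(handle, has_form: bool = True) -> List[Entry]:
    """Parse one split. Assumed layout (not confirmed by the announcement):
    train/dev lines are lemma<TAB>form<TAB>features, test lines are
    lemma<TAB>features, and features are separated by ';'."""
    entries = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        columns = line.split("\t")
        if has_form:
            lemma, form, tags = columns
        else:
            lemma, tags = columns
            form = None
        entries.append(Entry(lemma, form, frozenset(tags.split(";"))))
    return entries


# Hypothetical sample lines, used only to exercise the parser.
train = read_split(io.StringIO("walk\twalked\tV;PST\n"), has_form=True)
test = read_split(io.StringIO("jump\tV;PST\n"), has_form=False)
print(train[0].lemma, train[0].form, sorted(train[0].features))
print(test[0].lemma, test[0].form, sorted(test[0].features))
```

A system would fill in the missing form for each test entry and write the completed lines back out in the same layout as the training data.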

The model should be general enough to work for natural languages of any
typological patterning. For example, Tagalog verbs exhibit circumfixation;
thus, a model with a strong inductive bias towards suffixing will likely not
work well for Tagalog.
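
To make the point about inductive bias concrete, here is a minimal sketch of a learner that can only rewrite suffixes. It is not one of the task baselines; the function names are hypothetical, and the German participle circumfix ge-...-t is used as a stand-in for a non-suffixing pattern because no Tagalog data are quoted in this announcement.

```python
import os


def learn_suffix_rule(lemma: str, form: str):
    """Strip the longest common prefix and return (old_suffix, new_suffix)."""
    stem = os.path.commonprefix([lemma, form])
    return lemma[len(stem):], form[len(stem):]


def apply_suffix_rule(rule, lemma: str) -> str:
    """Apply a learned suffix rule; fall back to the lemma if it does not match."""
    old, new = rule
    if not lemma.endswith(old):
        return lemma
    return lemma[:len(lemma) - len(old)] + new


# Suffixing pattern: the rule learned from one verb transfers to another.
english_rule = learn_suffix_rule("walk", "walked")
print(english_rule, apply_suffix_rule(english_rule, "jump"))

# Circumfixing pattern: lemma and form share no prefix, so the "rule"
# memorizes the whole word and does not transfer to a new lemma.
german_rule = learn_suffix_rule("machen", "gemacht")
print(german_rule, apply_suffix_rule(german_rule, "spielen"))
```

The English rule generalizes to an unseen verb, whereas the German rule reduces to a whole-word lookup and leaves "spielen" unchanged instead of producing "gespielt".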

As part of the task, we will release data for 50 new languages annotated in
the UniMorph schema. The data for the 35 development languages are already
available on the shared task website. These include a number of languages
indigenous to Russia, such as Itelmen and Chukchi, as well as many languages
from the Americas, such as Aymara and Seneca.

Important Dates:

- February 28, 2021: Training and development splits for development
languages and baselines released.
- March 7, 2021: Development language data are frozen.
- April 20, 2021: Training and development splits for surprise languages
released.
- April 27, 2021: Test splits for all languages (both development and
surprise) released.
- May 4, 2021: Participants submit test predictions on all languages.
- June 1, 2021: Participants’ system description papers due.
- June 7, 2021: Camera-ready versions of participants’ system description
papers due.

Part 2: Are We There Yet? A Shared Task on Cognitively Plausible Morphological
Inflection

An open question in the use of neural networks for the study of language is
the degree to which they resemble humans in how they generate language.

This shared task adopts the experimental paradigm introduced by Albright and
Hayes (2003). We have created a large number of new nonce words in four
languages: English, German, Portuguese and Russian. To the best of our
knowledge, this will be the largest and most multilingual collection of nonce
words in existence. The goal of the participants in the shared task is to
design a model that morphologically inflects the nonce words according to the
grammar of the given languages. 

Important Dates:

- February 25, 2021: Training data for English, German, Portuguese and Russian
are released.
- March 8, 2021: Neural and non-neural baselines for development languages
released.
- May 1, 2021: Development data for nonce inflections are released. (This
includes human judgments.)
- May 23, 2021: Test data for the nonce inflections are released. (This
includes human judgments.)
- June 1, 2021: Participants submit their system output.
- June 7, 2021: Participants submit their system description paper.

Linguistic Field(s): Cognitive Science
                     Computational Linguistics
                     Morphology
