31.2554, Calls: Comp Ling/Online

Wed Aug 12 13:12:25 UTC 2020

LINGUIST List: Vol-31-2554. Wed Aug 12 2020. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Lauren Perkins, Nils Hjortnaes, Yiwen Zhang, Joshua Sims
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Lauren Perkins <lauren at linguistlist.org>
================================================================

Date: Wed, 12 Aug 2020 09:11:49
From: Tewodros Gebreselassie [tewodros.gebreselassie at gu.se]
Subject: RESOURCEs and Representations For Under-Resourced Languages and Domains

Full Title: RESOURCEs and Representations for Under-Resourced Languages and Domains 
Short Title: RESOURCEFUL-2020 

Date: 25-Nov-2020 - 25-Nov-2020
Location: Gothenburg, Sweden 
Contact Person: Tewodros Gebreselassie
Meeting Email: tewodros.gebreselassie at gu.se
Web Site: https://gu-clasp.github.io/resourceful-2020/index.html 

Linguistic Field(s): Computational Linguistics 

Call Deadline: 29-Sep-2020 

Meeting Description:

All areas of natural language processing have achieved visible breakthroughs
from the use of data-driven models. Contemporary machine learning is
significantly influenced by techniques that rely on large datasets and demand
substantial computational resources to solve practical problems in a tangible
way (e.g. transformer-based models such as BERT, ViLBERT, ALBERT, and GPT-2,
which are pre-trained on large corpora of unlabelled data).

However, many of the world’s languages lack both linguistic descriptions and
sufficiently large computer-readable corpora of linguistic material. Even
languages that are considered well-resourced have some domains where resources
are scarce, for example corpora of dialogue and situated interaction. These
domains resemble under-resourced languages in another respect: because they
focus on spoken or spoken-like interaction (in either written or audio form),
their input data is highly variable. Applying state-of-the-art methods based
on deep neural networks to develop data-driven systems in such
resource-constrained environments is a non-trivial task.

Intended participants are researchers, PhD students and practitioners from
diverse backgrounds (linguistics, computational linguistics, speech, machine
learning, etc.). We foresee an interactive workshop with plenty of time for
discussion, complemented by invited talks and short presentations of ongoing
or completed research.

Call for Papers: 

For this workshop, we encourage contributions in the area of resource creation
and representation learning in limited or low-resource environments that
tackle the above-mentioned problems. In particular, we would like to open a
forum that brings together students, researchers, and experts to address and
discuss the following questions:
 - How can new resources be constructed or extended for languages and domains
that lack standardised representations of linguistic units?
 - What experience from building resources for languages that have good
coverage today (for example Scandinavian languages) can be ported to building
resources for under-resourced languages and domains?
 - How can the variability of data and its standardisation be dealt with in
machine learning approaches?
 - What algorithms and methods can we employ to transfer learning from related
domains/languages that have good coverage?
 - What is the role of multi-task learning in this domain?
 - What representations can be learned and how effective are they in different
low-resource scenarios?
 - How can newly created resources and learned representations be evaluated?
 - What ethical considerations are involved?

We invite submissions of 2-page extended non-anonymous abstracts, with any
number of additional pages for references, using the ACL/EMNLP template.
Papers related to our theme that have already been presented at other venues
or published elsewhere will also be considered for presentation. The abstracts
will be reviewed by the workshop organisers, and the accepted ones will be
posted on the website unless the authors prefer otherwise. There will be no
workshop proceedings, but post-proceedings may be organised depending on the
interest of authors.

Extended abstracts should be submitted in PDF format at
https://easychair.org/conferences/?conf=resourceful2020

Submission of extended abstracts: 29 September 2020
Notification of acceptance: 23 October 2020
Final version: 10 November 2020
Workshop date: 25 November 2020
All times are 11:59PM UTC-12:00 (“anywhere on Earth”).
