<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<div>https://sigtyp.github.io/st2020.html<br>
</div>
<div><br>
</div>
<div>The SIGTYP workshop, co-located with the EMNLP 2020 conference in Punta Cana (Dominican Republic), is offering a shared task on the prediction of typological features. The shared task encompasses nearly 2,000 languages, with typological features taken
from the World Atlas of Language Structures (WALS; Dryer and Haspelmath 2013).<br>
</div>
<div><br>
</div>
<div>To participate in the shared task, you will build a system that can predict typological properties of languages, given a handful of observed features. Training examples and development examples have already been provided (see link below). All submitted
systems will be compared on a held-out test set.<br>
</div>
<div><br>
</div>
<div>Moreover, you will be invited to describe your system in a system paper for the SIGTYP workshop proceedings. The task organisers will write an overview paper that describes the task and summarises the different approaches taken, and their results.<br>
</div>
<div><br>
</div>
<div><b>Important Links</b></div>
<div><b><br>
</b></div>
<div>- Download Train and Dev data: https://github.com/sigtyp/ST2020/tree/master/data<br>
</div>
<div>- Register for the Task! https://sigtyp.github.io/st2020-reg.html<br>
</div>
<div><br>
</div>
<div><b>Important Dates</b></div>
<div><br>
</div>
<div>- Training data Release: 26 March 2020<br>
</div>
<div>- Test data Release: 20 June 2020<br>
</div>
<div>- Submissions Due: 1 July 2020<br>
</div>
<div>- Writeup Due: 1 August 2020<br>
</div>
<div><br>
</div>
<div><b>Description</b></div>
<div><br>
</div>
<div>The typological features in WALS represent one approach to categorizing the languages of the world according to their linguistic properties, e.g. their syntax, morphology, and phonology. One example of such a typological feature
is basic word order: English is best described as a subject-verb-object (SVO) language, whereas Japanese is best described as a subject-object-verb (SOV) language.<br>
</div>
<div><br>
</div>
<div>One major issue with WALS, however, is that it is both sparse and skewed in terms of language-feature annotations. It is sparse in the sense that most languages only have annotations for a handful of features, and skewed in the sense that a few features
have much wider coverage than others. Luckily, such features often correlate with one another, which allows for prediction of those features from others. For instance, languages where the verb precedes the object tend to have prepositions, e.g. Norwegian,
whereas languages where the object precedes the verb tend to have postpositions, e.g. Japanese.<br>
</div>
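<div><br>
</div>
<div>The correlation described above can be illustrated with a minimal sketch in Python. It is not the shared task's data format or a reference system: the dictionary layout, the feature names <i>object_verb_order</i> and <i>adposition</i>, the simplified value labels and the helper functions are all assumptions made for illustration. The sketch counts how often each adposition type co-occurs with each verb-object order in a few hand-written examples, then predicts the most frequent adposition type for a given order.<br>
</div>

```python
from collections import Counter, defaultdict

# Toy, hand-written annotations with simplified labels.
# ASSUMPTION: this layout and these feature names are illustrative only;
# they are not the format of the shared-task data files.
TRAIN = {
    "English":   {"object_verb_order": "VO", "adposition": "prepositions"},
    "Norwegian": {"object_verb_order": "VO", "adposition": "prepositions"},
    "Japanese":  {"object_verb_order": "OV", "adposition": "postpositions"},
    "Turkish":   {"object_verb_order": "OV", "adposition": "postpositions"},
    "Hindi":     {"object_verb_order": "OV", "adposition": "postpositions"},
}


def fit_cooccurrence(languages, source, target):
    """Count target-feature values for each observed source-feature value."""
    counts = defaultdict(Counter)
    for features in languages.values():
        # Skip languages where either feature is unannotated (sparse data).
        if source in features and target in features:
            counts[features[source]][features[target]] += 1
    return counts


def predict(counts, source_value, fallback=None):
    """Return the target value seen most often with source_value."""
    if source_value in counts:
        return counts[source_value].most_common(1)[0][0]
    return fallback


counts = fit_cooccurrence(TRAIN, "object_verb_order", "adposition")
print(predict(counts, "OV"))
print(predict(counts, "VO"))
```

<div>A baseline of this kind uses a single observed feature. A competitive system would presumably combine many features, and would have to account for the genetic and geographical relatedness of training and evaluation languages.<br>
</div>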
<div><br>
</div>
<div>Although there is a significant amount of previous work dealing with versions of this task (<i>Daumé III and Campbell 2007; Bjerva et al. 2019; Ponti et al. 2019</i>), important design choices have frequently been ignored. Some papers controlled for genetic
relationships between training and evaluation languages, but little to no work has considered controlling for geographical proximity.<br>
</div>
<div><br>
</div>
<div>The shared task will consist of two settings (subtasks):<br>
</div>
<div>
<ol>
<li><i>Constrained</i>: only the provided training data may be used.</li>
<li><i>Unconstrained</i>: the training data may be extended with any external source of information (e.g. pre-trained embeddings, raw texts, etc.).</li></ol>
</div>
<div><b>Organizers</b></div>
<div><br>
</div>
<div>Johannes Bjerva<br>
</div>
<div>Isabelle Augenstein<br>
</div>
<div>Aditi Chaudhary<br>
</div>
<div>Edoardo M. Ponti<br>
</div>
<div>Giuseppe Celano<br>
</div>
<div>Liz Salesky<br>
</div>
<div>Ryan Cotterell<br>
</div>
<div>Michael Regan<br>
</div>
<div>Sabrina J. Mielke<br>
</div>
<div><br>
</div>
<div><b>Contact</b></div>
<div><b><br>
</b></div>
<div>- email: sigtyp AT gmail DOT com<br>
</div>
<span>- website: <a href="https://sigtyp.github.io/st2020.html" id="LPNoLP782020">
https://sigtyp.github.io/st2020.html</a></span><br>
<br>
</div>
</body>
</html>