30.2260, FYI: GermEval 2019 Task 1 - Shared Task on hierarchical classification of German blurbs (short texts): 3rd Call for Participation

Thu May 30 22:04:12 UTC 2019

LINGUIST List: Vol-30-2260. Thu May 30 2019. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Peace Han, Nils Hjortnaes, Yiwen Zhang, Julian Dietrich
Homepage: http://linguistlist.org

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================

Date: Thu, 30 May 2019 17:58:12
From: Steffen Remus [remus at informatik.uni-hamburg.de]
Subject: GermEval 2019 Task 1 - Shared Task on hierarchical classification of German blurbs (short texts): 3rd Call for Participation

3rd Call for Participation:

We invite interested parties to participate in this shared task.
Hierarchical multi-label classification (HMC) of blurbs is the task of
assigning multiple labels to short descriptive texts of books, where each
label is part of an underlying hierarchy of categories. Further information
can be found here: https://competitions.codalab.org/competitions/21226

Tasks:
This shared task consists of two subtasks, described below. Participants may
take part in either subtask or in both.

- *Subtask A*: The task is to classify German books into *one or more of the
most general categories*. It can thus be considered a non-hierarchical
multi-label classification task. Eight classes can be assigned in total:
'Literatur & Unterhaltung', 'Ratgeber', 'Kinderbuch & Jugendbuch', 'Sachbuch',
'Ganzheitliches Bewusstsein', 'Glaube & Ethik', 'Künste', 'Architektur &
Garten'.

- *Subtask B*: The second task targets hierarchical multi-label
classification, where the full hierarchy of labels should be assigned to a
book. In addition to the most general category (Subtask A), further
categories of varying specificity can be assigned to a book. In total, 343
different classes can be assigned, organized in a hierarchy of at most 4
levels.
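
To make the two label formats concrete, here is a minimal Python sketch. It
assumes that 'Künste' and 'Architektur & Garten' are separate classes (which
gives the eight classes mentioned for Subtask A) and that, for Subtask B, a
label set is completed by adding every ancestor of each assigned label. The
child labels and the PARENT map are hypothetical placeholders, not the real
task hierarchy, and the function names are chosen for illustration only.

```python
# Illustrative sketch only. The eight top-level names follow the call;
# everything in PARENT is a made-up placeholder, NOT the real hierarchy.
TOP_LEVEL = [
    "Literatur & Unterhaltung",
    "Ratgeber",
    "Kinderbuch & Jugendbuch",
    "Sachbuch",
    "Ganzheitliches Bewusstsein",
    "Glaube & Ethik",
    "Künste",
    "Architektur & Garten",
]


def multi_hot(labels, classes=TOP_LEVEL):
    """Subtask A view: one 0/1 indicator per top-level class."""
    chosen = set(labels)
    unknown = chosen - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in chosen else 0 for c in classes]


# Hypothetical child -> parent map (assumption for illustration).
PARENT = {
    "Example Genre": "Literatur & Unterhaltung",
    "Example Subgenre": "Example Genre",
}


def with_ancestors(labels, parent=PARENT):
    """Subtask B view: close a label set under the parent relation."""
    closed = set()
    for label in labels:
        # Walk upwards until the root or an already visited label.
        while label is not None and label not in closed:
            closed.add(label)
            label = parent.get(label)
    return closed


if __name__ == "__main__":
    print(multi_hot(["Ratgeber", "Sachbuch"]))
    print(sorted(with_ancestors(["Example Subgenre"])))
```

A book may carry several top-level labels at once, so the Subtask A vector
can contain more than one 1. The official data format and evaluation are
described on the task's webpage and may differ from this sketch.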

Data:

The entire dataset consists of 20,784 examples. Sample data is provided so
that participants can familiarize themselves with the structure of the data.
14,548 training samples have been released and can be downloaded after
registering for the shared task. A validation set (2,079 samples) has been
published with its gold labels withheld. Submissions for the validation set
are accepted via the CodaLab page and published on a leaderboard until
June 1st. On June 1st, the final evaluation phase begins: we will release the
gold labels of the validation set, which can then be used as additional
training data, together with the test set samples, for which we accept
submissions until July 15th. More information can be found on the task's
webpage:
https://competitions.codalab.org/competitions/21226
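
The call does not state the size of the test set. Assuming the 20,784
examples are split only into training, validation, and test data, the
remainder can be worked out from the figures above. The variable names are
chosen for illustration.

```python
# Figures quoted in the call; the three-way split is an assumption.
total = 20_784
train = 14_548
validation = 2_079

# Whatever is not training or validation data would be test data.
remaining = total - train - validation
print(remaining)  # 4157
```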

Important Dates:

- January 2019: Release of trial data
- February 01, 2019: Release of training data (train + validation)
- June 01, 2019: Release of gold labels for validation set + test data
- July 15, 2019: Final deadline for submissions of test results
- July 31, 2019: Submission of description papers
- August 20, 2019: Notification of acceptance
- September 15, 2019: Camera-ready deadline for system description papers
- October 08, 2019: Workshop in Erlangen, Germany

The shared task will be accompanied by a pre-conference workshop of the
Conference on Natural Language Processing ("Konferenz zur Verarbeitung
natürlicher Sprache", KONVENS) hosted on October 8, 2019 at FAU
Erlangen-Nuremberg (http://2019.konvens.org/).

Workshop Proceedings:

Description papers will appear in online workshop proceedings. Participants
who submit a description paper will be asked to register for the workshop and
present their system as a poster or in an oral presentation (depending on the
number of submissions).

Organizers:
The task is organized by Rami Aly, Steffen Remus and Chris Biemann, Language
Technology, Department of Informatics, Universität Hamburg,
https://lt.informatik.uni-hamburg.de

GermEval:

GermEval is a series of shared task evaluation campaigns that focus on Natural
Language Processing for the German language. GermEval has been conducted four
times since 2014 in co-location with KONVENS/GSCL conferences. For an overview
of the currently conducted tasks, please see http://2019.konvens.org/germeval.
We highly encourage readers to also take note of task 2 (Identification of
offensive language, https://projects.fzai.h-da.de/iggsa/) and task 3
(Lemmatization of German Web and Social Media Texts,
https://fau-klue.github.io/empirist-lemmatization/).

Linguistic Field(s): Applied Linguistics
                     Computational Linguistics
                     Text/Corpus Linguistics

Subject Language(s): German (deu)

Language Family(ies): German

------------------------------------------------------------------------------
