29.1129, Calls: Computational Linguistics/USA

Mon Mar 12 22:44:36 UTC 2018

LINGUIST List: Vol-29-1129. Mon Mar 12 2018. ISSN: 1069 - 4875.

Subject: 29.1129, Calls: Computational Linguistics/USA

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté,
                                   Michael Czerniakowski)
Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Kenneth Steimel <ken at linguistlist.org>
================================================================

Date: Mon, 12 Mar 2018 18:44:17
From: Agata Savary [agata.savary at univ-tours.fr]
Subject: Shared Task on Automatic Identification of Verbal Multiword Expressions

Full Title: Shared Task on Automatic Identification of Verbal Multiword Expressions 

Date: 25-Aug-2018 - 26-Aug-2018
Location: Santa Fe, New Mexico, USA
Contact Person: Agata Savary
Meeting Email: agata.savary at univ-tours.fr
Web Site: http://multiword.sourceforge.net/sharedtask2018 

Linguistic Field(s): Computational Linguistics 

Call Deadline: 04-May-2018 

Meeting Description:

The second edition of the PARSEME shared task on automatic identification of
verbal multiword expressions (VMWEs) aims at identifying verbal MWEs in
running texts.  Verbal MWEs include, among others, idioms (*to let the cat out
of the bag*), light verb constructions (*to make a decision*), verb-particle
constructions (*to give up*), multi-verb constructions (*to make do*) and
inherently reflexive verbs (*se suicider* 'to suicide' in French).  Their
identification is a well-known challenge for NLP applications, due to their
complex characteristics including discontinuity, non-compositionality,
heterogeneity and syntactic variability.
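
To make the notion of discontinuity concrete, here is a minimal sketch in
Python. The representation (a VMWE as a category label plus a tuple of token
positions), the names `VMWE` and `is_discontinuous`, and the category labels
are assumptions made for illustration only; they are not the shared task's
data format or evaluation code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VMWE:
    """Hypothetical in-memory representation of one annotated VMWE."""
    category: str               # illustrative label, e.g. "LVC" or "VPC"
    token_ids: Tuple[int, ...]  # 0-based positions of the component tokens


def is_discontinuous(mwe: VMWE) -> bool:
    """Return True if other tokens intervene between the VMWE's components."""
    ids = sorted(mwe.token_ids)
    return any(b - a > 1 for a, b in zip(ids, ids[1:]))


# "made ... decision": a light verb construction whose parts are not adjacent.
tokens = ["She", "made", "a", "very", "difficult", "decision"]
lvc = VMWE(category="LVC", token_ids=(1, 5))

# "gave up": a verb-particle construction whose parts are adjacent here.
tokens2 = ["He", "gave", "up"]
vpc = VMWE(category="VPC", token_ids=(1, 2))
```

Storing token positions rather than a contiguous span is what lets a single
expression skip over intervening words, which a plain start/end span cannot
express.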

The shared task is highly multilingual: PARSEME members have elaborated
annotation guidelines based on annotation experiments in about 20 languages
from several language families.  These guidelines take both universal and
language-specific phenomena into account.  We hope that this will boost the
development of language-independent and cross-lingual VMWE identification
systems.

Call for Papers:

Participation:

Participation is open and free worldwide.

We ask potential participant teams to register using the expression of
interest form:
https://docs.google.com/forms/d/e/1FAIpQLSd6L8IntkNKXbMp8QVLLvCYzzhoH-_8ovSW0DL3BtYGNnsFhA/viewform?c=0&w=1

Task updates and questions will be posted to our public mailing list:
http://groups.google.com/group/verbalmwe

More details on the annotated corpora can be found here:
https://typo.uni-konstanz.de/parseme/index.php/2-general/202-parseme-shared-task-on-automatic-identification-of-verbal-mwes-edition-1-1

The annotation guidelines used in manual annotation of the training and test
sets are available here:
http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.1

Publication and workshop:

Shared task participants will be invited to submit a system description paper
to a special track of the Joint Workshop on Linguistic Annotation, Multiword
Expressions and Constructions (LAW-MWE-CxG-2018) at COLING 2018, to be held on
August 25-26, 2018, in Santa Fe, New Mexico, USA:
http://multiword.sourceforge.net/lawmwecxg2018

Provided data:

The training and test sets of edition 1.0 of the shared task exemplify the
type of data and annotations (with a slightly different set of VMWE
categories) that we will provide, and are available at:
http://hdl.handle.net/11372/LRT-2282

When available, morphosyntactic data (parts of speech, lemmas, morphological
features and/or syntactic dependencies) will also be provided.

We are currently preparing corpora for the following languages: Arabic,
Basque, Bulgarian, Croatian, English, Farsi, French, German, Greek, Hebrew,
Hindi, Hungarian, Italian, Lithuanian, Polish, Brazilian Portuguese,
Romanian, Slovene, Spanish, Turkish.
The amount of annotated data will depend on the language, and the list of
covered languages may vary until the release of the training corpora.

Tracks:

System results can be submitted in two tracks:

Closed track: Systems using only the provided training data (VMWE annotations
plus morphosyntactic data, if any) to learn VMWE identification models and/or
rules.

Open track: Systems that may or may not use the provided training data, plus
any additional resources deemed useful (MWE lexicons, symbolic grammars,
wordnets, raw corpora, word embeddings, parsers, language models trained on
external data, etc.). This track notably includes purely symbolic and
rule-based systems.

Important dates:

March 21, 2018: shared task trial data and evaluation script released
April 4, 2018: shared task training data released
April 30, 2018: shared task blind test data released
May 4, 2018: submission of system results
May 11, 2018: announcement of results
May 25, 2018: submission of system description papers
June 20, 2018: notification
June 30, 2018: camera-ready papers
August 25-26, 2018: shared task workshop co-located with LAW-MWE-CxG-2018

Organizing team:

Silvio Ricardo Cordeiro, Carlos Ramisch, Agata Savary, Veronika Vincze

------------------------------------------------------------------------------
