31.1365, FYI: April 2020 Newsletter - LDC

Thu Apr 16 00:00:03 UTC 2020

LINGUIST List: Vol-31-1365. Wed Apr 15 2020. ISSN: 1069 - 4875.

Subject: 31.1365, FYI: April 2020 Newsletter - LDC

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Lauren Perkins, Nils Hjortnaes, Yiwen Zhang, Joshua Sims
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Sarah Robinson <srobinson at linguistlist.org>
================================================================

Date: Wed, 15 Apr 2020 19:59:51
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: April 2020 Newsletter - LDC

In this newsletter: 

New Publications:
(1) 2018 NIST Speaker Recognition Evaluation Test Set
(2) Abstract Meaning Representation 2.0 - Four Translations
(3) TAC KBP English Temporal Slot Filling - Comprehensive Training and
    Evaluation Data 2011 and 2013

New publications:

(1) 2018 NIST Speaker Recognition Evaluation Test Set was developed by LDC and
NIST (National Institute of Standards and Technology) and contains
approximately 396 hours of Tunisian Arabic telephone recordings and English
web video speech used as development and test data in the NIST-sponsored 2018
Speaker Recognition Evaluation (SRE). This release also contains answer keys,
trial and train files, development data, and evaluation documentation.

The SRE task is speaker detection: determining whether a specified target
speaker is speaking during a segment of speech. In addition to the
traditional focus on conversational telephone speech recorded over a variety
of handset types for the training and test conditions, SRE18 added VoIP (voice
over IP) data and audio from video.
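
As a rough illustration of what a single speaker detection trial involves,
the sketch below compares two fixed-length speaker representations and
applies a threshold. The vectors, the cosine scoring, the function names, and
the 0.5 threshold are all assumptions made for illustration; the newsletter
does not describe how SRE18 systems were built or scored.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_target_speaker(enrollment_vec, segment_vec, threshold=0.5):
    """Decide one trial: does the test segment appear to contain the target speaker?

    enrollment_vec and segment_vec are hypothetical speaker representations;
    the threshold is an arbitrary illustrative value.
    """
    return cosine_similarity(enrollment_vec, segment_vec) >= threshold


# Hypothetical trials: identical vectors are accepted, orthogonal ones rejected.
same = is_target_speaker([1.0, 0.0], [1.0, 0.0])
different = is_target_speaker([1.0, 0.0], [0.0, 1.0])
```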

The English audio was sampled from amateur web videos collected by LDC as part
of the Video Annotation for Speech Technology (VAST) project.

2018 NIST Speaker Recognition Evaluation Test Set is distributed via web
download.

2020 Subscription Members will automatically receive copies of this corpus.
2020 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee. 

(2) Abstract Meaning Representation 2.0 - Four Translations was developed by
researchers at the University of Edinburgh, School of Informatics and consists
of Spanish, German, Italian, and Mandarin Chinese translations of the test
split sentences from Abstract Meaning Representation (AMR) Annotation Release
2.0 (LDC2017T10): 1,371 sentences per language, 5,484 translated sentences in
total.

AMR Annotation Release 2.0 is a semantic treebank of over 39,000 English
natural language sentences from broadcast conversations, newswire, and web
text. The translated data in this release was designed for use in
cross-lingual parsing.

Abstract Meaning Representation 2.0 - Four Translations is distributed via web
download.

2020 Subscription Members will automatically receive copies of this corpus.
2020 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.

(3) TAC KBP English Temporal Slot Filling - Comprehensive Training and
Evaluation Data 2011 and 2013 was developed by LDC and contains training and
evaluation data produced in support of the TAC KBP English Temporal Slot
Filling tasks in 2011 and 2013. This release includes queries, manual runs
produced by LDC annotators, and the final rounds of assessment results.

The goal of the Temporal Slot Filling task was to identify and capture
temporal information in text indicating when a given relation between a
slot-filling query entity and its filler held true. This built upon the
technology developed for regular Slot Filling, which involved mining
information about entities from text.
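
To make the idea of a relation that "held true" over a period more concrete,
here is a minimal sketch of one possible record for a temporally bounded slot
fill. The class name, field names, example entity, and the choice to treat a
missing bound as open-ended are assumptions for illustration only; they do
not reproduce the file formats in this release.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalFill:
    """A relation between a query entity and a filler, with optional time bounds."""
    query_entity: str
    slot: str
    filler: str
    start: Optional[date] = None  # None means the start is unknown (treated as open)
    end: Optional[date] = None    # None means the end is unknown (treated as open)

    def holds_on(self, day: date) -> bool:
        """Return True if the relation is consistent with holding on the given day."""
        if self.start is not None and day < self.start:
            return False
        if self.end is not None and day > self.end:
            return False
        return True


# Hypothetical example: an employment relation bounded on both sides.
fill = TemporalFill(
    query_entity="Jane Doe",
    slot="per:employee_of",
    filler="Example Corp",
    start=date(2009, 1, 1),
    end=date(2011, 6, 30),
)
during = fill.holds_on(date(2010, 5, 1))
after = fill.holds_on(date(2012, 1, 1))
```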

TAC KBP English Temporal Slot Filling - Comprehensive Training and Evaluation
Data 2011 and 2013 is distributed via web download.

2020 Subscription Members will automatically receive copies of this corpus.
2020 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.

Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104

Linguistic Field(s): Computational Linguistics
