32.3625, FYI: November 2021 Newsletter - LDC
The LINGUIST List
linguist at listserv.linguistlist.org
Tue Nov 16 13:19:55 UTC 2021
LINGUIST List: Vol-32-3625. Tue Nov 16 2021. ISSN: 1069 - 4875.
Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn, Lauren Perkins
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Nils Hjortnaes, Joshua Sims, Billy Dickson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Please support the LL editors and operation with a donation at:
https://funddrive.linguistlist.org/donate/
Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================
Date: Tue, 16 Nov 2021 08:19:37
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: November 2021 Newsletter - LDC
In this newsletter:
Join LDC for Membership Year 2022
Spring 2022 Data Scholarship Application Deadline
New Publications:
BOLT Egyptian Arabic PropBank and Sense – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech
Second DIHARD Challenge Development – Eleven Sources
Second DIHARD Challenge Development – SEEDLingS
--
Join LDC for Membership Year 2022
Membership Year 2022 (MY2022) is open and discounts are available for those
who keep their membership current and join early. Current MY2021 members who
renew their LDC membership before March 1, 2022 will receive a 10% discount
off the membership fee. New or returning organizations will receive a 5%
discount when joining by March 1.
Visit the Join LDC page for details on membership, user accounts, and payment.
Spring 2022 Data Scholarship Application Deadline
Applications for the Spring 2022 LDC Data Scholarship program, which provides
university students with no-cost access to LDC data, are being accepted
through January 15, 2022. Consult the LDC Data Scholarship page for more
information about program rules and submission requirements.
--
New Publications:
(1) BOLT Egyptian Arabic PropBank and Sense – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech was developed by the University of Colorado
Boulder - CLEAR (Computational Language and Education Research) for the DARPA
BOLT program and consists of PropBank annotation on Egyptian Arabic informal
text and telephone speech.
BOLT Egyptian Arabic PropBank and Sense – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech is distributed via web download.
2021 Subscription Members will automatically receive copies of this corpus.
2021 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.
*
(2) Second DIHARD Challenge Development – Eleven Sources was developed by LDC
and contains approximately 22 hours of English and Chinese speech data along
with corresponding annotations used in support of the Second DIHARD Challenge.
Second DIHARD Challenge Development – Eleven Sources is distributed via web
download.
2021 Subscription Members will automatically receive copies of this corpus.
2021 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.
*
(3) Second DIHARD Challenge Development – SEEDLingS was developed by Duke
University and LDC and contains approximately two hours of English child
language recordings along with corresponding annotations used in support of
the Second DIHARD Challenge. The DIHARD Challenges are a set of shared tasks
on "hard" speech diarization; that is, diarization of challenging corpora on
which existing state-of-the-art systems were expected to fare poorly.
Source data is from the SEEDLingS (The Study of Environmental Effects on
Developing Linguistic Skills) corpus, designed to investigate how infants'
early linguistic and environmental input plays a role in their learning.
Recordings were generated in the home environment of infants in the Rochester,
New York area. A subset of that data was annotated by LDC for use in the first
and second DIHARD Challenges.
The data in this release consists of files provided in the Second DIHARD
Challenge as well as subsequently updated annotated files not provided to
second challenge participants.
Second DIHARD Challenge Development – SEEDLingS is distributed via web
download.
2021 Subscription Members will automatically receive copies of this corpus
provided they have submitted a completed copy of the special license
agreement. 2021 Standard Members may request a copy as part of their 16 free
membership corpora. Non-members may license this data for a fee.
Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
Linguistic Field(s): Computational Linguistics