32.3625, FYI: November 2021 Newsletter - LDC

The LINGUIST List linguist at listserv.linguistlist.org
Tue Nov 16 13:19:55 UTC 2021


LINGUIST List: Vol-32-3625. Tue Nov 16 2021. ISSN: 1069 - 4875.

Subject: 32.3625, FYI: November 2021 Newsletter - LDC

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn, Lauren Perkins
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Nils Hjortnaes, Joshua Sims, Billy Dickson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: Tue, 16 Nov 2021 08:19:37
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: November 2021 Newsletter - LDC

 
In this newsletter: 
Join LDC for Membership Year 2022 
Spring 2022 Data Scholarship Application Deadline 

New Publications:
BOLT Egyptian Arabic PropBank and Sense – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech
Second DIHARD Challenge Development – Eleven Sources 
Second DIHARD Challenge Development - SEEDLingS

--
Join LDC for Membership Year 2022 
Membership Year 2022 (MY2022) is open and discounts are available for those
who keep their membership current and join early. Current MY2021 members who
renew their LDC membership before March 1, 2022 will receive a 10% discount
off the membership fee. New or returning organizations will receive a 5%
discount when joining by March 1.

Visit Join LDC for details on membership, user accounts and payment.

--
Spring 2022 Data Scholarship Application Deadline
Applications are being accepted through January 15, 2022 for the Spring 2022
LDC Data Scholarship program, which provides university students with no-cost
access to LDC data. Consult the LDC Data Scholarship page for more information
about program rules and submission requirements.
--

New publications:
(1) BOLT Egyptian Arabic PropBank and Sense – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech was developed by the University of Colorado
Boulder - CLEAR (Computational Language and Education Research) for the DARPA
BOLT program and consists of PropBank annotation on Egyptian Arabic informal
text and telephone speech. 

BOLT Egyptian Arabic PropBank and Sense – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech is distributed via web download.

2021 Subscription Members will automatically receive copies of this corpus.
2021 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.

*

(2) Second DIHARD Challenge Development - Eleven Sources was developed by LDC
and contains approximately 22 hours of English and Chinese speech data along
with corresponding annotations used in support of the Second DIHARD Challenge.

Second DIHARD Challenge Development – Eleven Sources is distributed via web
download. 

2021 Subscription Members will automatically receive copies of this corpus.
2021 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee. 

*

(3) Second DIHARD Challenge Development - SEEDLingS was developed by Duke
University and LDC and contains approximately two hours of English child
language recordings along with corresponding annotations used in support of
the Second DIHARD Challenge. The DIHARD Challenges are a set of shared tasks
on diarization focusing on "hard" diarization; that is, speech diarization for
challenging corpora where there was an expectation that existing
state-of-the-art systems would fare poorly.

Source data is from the SEEDLingS (The Study of Environmental Effects on
Developing Linguistic Skills) corpus, designed to investigate how infants'
early linguistic and environmental input plays a role in their learning.
Recordings were generated in the home environment of infants in the Rochester,
New York area. A subset of that data was annotated by LDC for use in the first
and second DIHARD Challenges.

The data in this release consists of files provided in the Second DIHARD
Challenge as well as subsequently updated annotated files not provided to
second challenge participants.

Second DIHARD Challenge Development – SEEDLingS is distributed via web
download. 

2021 Subscription Members will automatically receive copies of this corpus
provided they have submitted a completed copy of the special license
agreement. 2021 Standard Members may request a copy as part of their 16 free
membership corpora. Non-members may license this data for a fee.

Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
 



Linguistic Field(s): Computational Linguistics





 



------------------------------------------------------------------------------

 







