35.958, FYI: March 2024 Newsletter - LDC

The LINGUIST List linguist at listserv.linguistlist.org
Fri Mar 15 23:05:09 UTC 2024


LINGUIST List: Vol-35-958. Fri Mar 15 2024. ISSN: 1069 - 4875.

Subject: 35.958, FYI: March 2024 Newsletter - LDC

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Justin Fuller <justin at linguistlist.org>

LINGUIST List is hosted by Indiana University College of Arts and Sciences.
================================================================


Date: 15-Mar-2024
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: March 2024 Newsletter - LDC


In this newsletter:
LDC data and commercial technology development

New publications:
RATS Low Speech Density
BabyEars Affective Vocalizations
________________________________________
LDC data and commercial technology development
For-profit organizations are reminded that an LDC membership is a
prerequisite for obtaining a commercial license to almost all LDC
databases. Non-member organizations, including non-member for-profit
organizations, cannot use LDC data to develop or test products for
commercialization, nor can they use LDC data in any commercial product
or for any commercial purpose. LDC data users should consult
corpus-specific license agreements for limitations on the use of
certain corpora. Visit the Licensing page for further information.
________________________________________
New publications:
RATS Low Speech Density was developed by LDC and comprises 87 hours of
English, Levantine Arabic, Farsi, Pashto, and Urdu speech and
non-speech samples. The recordings were assembled by concatenating
a randomized selection of speech, communications systems sounds, and
silence. This corpus was created to measure false alarm performance in
RATS speech activity detection systems.

The source audio was extracted from RATS development and progress sets
and consists of conversational telephone speech recordings collected
by LDC. Non-speech samples were selected from communications systems
sounds, including telephone network special information tones, radio
selective calling signals, HF/VHF/UHF digital mode radio traffic,
radio network control channel signals, two-way radio traffic
containing roger beeps, and short duration shift-key modulated handset
data transmissions.

The goal of the RATS (Robust Automatic Transcription of Speech)
program was to develop human language technology systems capable of
performing speech detection, language identification, speaker
identification, and keyword spotting on the severely degraded audio
signals that are typical of various radio communication channels,
especially those employing various types of handheld portable
transceiver systems.

2024 members can access this corpus through their LDC accounts.
Non-members may license this data for a fee.

*

BabyEars Affective Vocalizations contains 22 minutes of spontaneous
English speech by 12 adults interacting with their infant children,
for a total of 509 infant-directed utterances and 185 adult-directed
or neutral utterances. Speech data was collected in a quiet room
during a one-hour session where each parent was asked to play and
otherwise interact normally with their infant (aged 10-18 months). A
trained research assistant then extracted discrete utterances and
classified them into three categories: approval, attention, and
prohibition.

2024 members can access this corpus through their LDC accounts
provided they have submitted a completed copy of the special license
agreement. Non-members may license this data for a fee.

To unsubscribe from this newsletter, log in to your LDC account and
uncheck the box next to “Receive Newsletter” under Account Options or
contact LDC for assistance.

Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
M: 3600 Market St. Suite 810
      Philadelphia, PA 19104

Linguistic Field(s): Computational Linguistics




------------------------------------------------------------------------------

Please consider donating to the Linguist List https://give.myiu.org/iu-bloomington/I320011968.html


LINGUIST List is supported by the following publishers:

Cambridge University Press http://www.cambridge.org/linguistics

De Gruyter Mouton https://cloud.newsletter.degruyter.com/mouton

Equinox Publishing Ltd http://www.equinoxpub.com/

John Benjamins http://www.benjamins.com/

Lincom GmbH https://lincom-shop.eu/

Multilingual Matters http://www.multilingual-matters.com/

Narr Francke Attempto Verlag GmbH + Co. KG http://www.narr.de/

Wiley http://www.wiley.com




