34.2506, FYI: August 2023 Newsletter - LDC

The LINGUIST List linguist at listserv.linguistlist.org
Thu Aug 17 00:05:05 UTC 2023


LINGUIST List: Vol-34-2506. Thu Aug 17 2023. ISSN: 1069 - 4875.

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: 15-Aug-2023
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: August 2023 Newsletter - LDC


In this newsletter:
LDC at Interspeech 2023
LDC releases speech activity detector
Fall 2023 LDC Data Scholarship Program

New publications:
2019 OpenSAT Public Safety Communications Simulation
Samrómur Queries Icelandic Speech 1.0
________________________________________

LDC at Interspeech 2023
LDC is happy to be back in person as an exhibitor and longtime
supporter of Interspeech, taking place this year August 20-24 in
Dublin, Ireland. Stop by Stand A2 to say hello and learn about the
latest developments at the Consortium. LDC is also delighted to once
again be a silver sponsor for the Young Female Researchers in Speech
Workshop and to provide data in support of the CHiME-7 challenge
satellite workshop and the MERLIon CCS Challenge.

LDC will post conference updates via our social media platforms. We
look forward to seeing you in Dublin!

LDC releases speech activity detector
LDC announces the release of the LDC Broad Phonetic Class Speech
Activity Detector. Based on the broad phonetic class recognizer
implemented in the HTK Speech Recognition Toolkit, LDC’s speech
activity detector model runs the speech signal through a GMM-HMM
recognizer to identify five broad phonetic classes: vowel,
stop/affricate, fricative, nasal, and glide/liquid. The LDC Broad
Phonetic Class Speech Activity Detector is available at no cost on
GitHub under a GPL v3 license.
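
As a rough illustration of how broad phonetic class labels can yield
speech activity regions, the following Python sketch merges adjacent
segments labeled with any of the five classes into continuous speech
spans. It is not LDC's implementation and does not reflect the tool's
actual output format: the (start, end, label) tuple layout, the label
strings, the "non-speech" label, and the max_gap parameter are all
assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, label).
Segment = Tuple[float, float, str]

# The five broad phonetic classes named in the announcement.
# The exact label strings are assumed for illustration.
SPEECH_CLASSES = {"vowel", "stop/affricate", "fricative", "nasal", "glide/liquid"}


def speech_regions(segments: List[Segment], max_gap: float = 0.0) -> List[Tuple[float, float]]:
    """Merge speech-class segments into (start, end) speech regions.

    Segments whose label is not one of the five classes are skipped.
    Two speech segments are merged when the pause between them is no
    longer than max_gap seconds.
    """
    regions: List[Tuple[float, float]] = []
    for start, end, label in sorted(segments):
        if label not in SPEECH_CLASSES:
            continue
        if regions and start - regions[-1][1] <= max_gap:
            # Extend the current region instead of opening a new one.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Made-up recognizer output for a short recording.
example = [
    (0.00, 0.40, "non-speech"),
    (0.40, 0.55, "fricative"),
    (0.55, 0.80, "vowel"),
    (0.80, 1.50, "non-speech"),
    (1.50, 1.62, "nasal"),
    (1.62, 1.90, "vowel"),
]

print(speech_regions(example))
print(speech_regions(example, max_gap=1.0))
```

With max_gap left at zero, the two bursts of speech stay separate;
raising it bridges the short pause between them into a single region.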

Fall 2023 LDC Data Scholarship Program
Student applications for the Fall 2023 LDC Data Scholarship program
are being accepted now through September 15, 2023. This program
provides eligible students with no-cost access to LDC data. Students
must complete an application consisting of a data use proposal and
letter of support from their advisor. For application requirements and
program rules, visit the LDC Data Scholarships page.
________________________________________

New publications:
2019 OpenSAT Public Safety Communications Simulation contains 141
hours of English speech recordings and transcripts used in the NIST
Open Speech Analytic Technologies (OpenSAT) 2019 evaluation's
automatic speech recognition, speech activity detection, and keyword
search tasks. The data is part of the SAFE-T (Speech Analysis For
Emergency Response Technology) corpus created by LDC, in which
speakers engage in a collaborative problem-solving activity
representative of public safety communications in terms of speech
content, noise types, and noise levels.

US English speakers played the board game Flash Point Fire Rescue.
Background noise was played through a participant's headset during the
recording session. Recording sessions consisted of two 30-minute games.
The corpus is divided into training, development, and evaluation data.

2023 members can access this corpus through their LDC accounts.
Non-members may license this data for a fee.

*

Samrómur Queries Icelandic Speech 1.0 was developed by the Language
and Voice Lab, Reykjavik University in cooperation with Almannarómur,
Center for Language Technology. The corpus contains 20 hours of
prompted Icelandic queries from 3,809 speakers, totaling 17,475
utterances.

Speech data was collected between October 2019 and December 2021 using
the Samrómur website, which displayed prompts to participants. The
prompts were drawn mainly from The Icelandic Gigaword Corpus, which
includes text from novels, news, and plays, and from a list of
location names in Iceland. Additional prompts were taken from the
Icelandic Web of Science, and others were created by combining a name
with a question or a demand. Prompts and speaker metadata are included
in the corpus.

2023 members can access this corpus through their LDC accounts
provided they have submitted a completed copy of the special license
agreement. Non-members may license this data for a fee.

Linguistic Field(s): Computational Linguistics



