34.1513, FYI: May 2023 Newsletter - LDC

The LINGUIST List linguist at listserv.linguistlist.org
Tue May 16 02:05:07 UTC 2023


LINGUIST List: Vol-34-1513. Tue May 16 2023. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Lauren Perkins
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Joshua Sims, Daniel Swanson, Matthew Fort, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: 16-May-2023
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: May 2023 Newsletter - LDC


In this newsletter:
LDC at ICASSP 2023

New publications:
2019 NIST Speaker Recognition Evaluation Test Set – CTS Challenge
LORELEI Zulu Representative Language Pack
________________________________________
LDC at ICASSP 2023
LDC will be exhibiting at ICASSP 2023, held this year June 4-10 in
Rhodes, Greece. Stop by booth 15 to learn more about recent
developments at the Consortium and the latest publications.

LDC will post conference updates via Twitter and Facebook. We look
forward to seeing you there!
________________________________________
New publications:
2019 NIST Speaker Recognition Evaluation Test Set – CTS Challenge,
developed by LDC and NIST, contains 635 hours of Tunisian Arabic
telephone recordings for development and test, answer keys,
enrollment, trial files, and documentation from the CTS Challenge
portion of the NIST-sponsored 2019 Speaker Recognition Evaluation. The
2019 evaluation was conducted in two parts: (1) a leaderboard-style
challenge based on conversational telephone speech from LDC's Call My
Net 2 (CMN2) corpus; and (2) a separate evaluation using audio-visual
material collected by LDC for the VAST (Video Annotation for Speech
Technology) project (released as LDC2023V01).

The telephone speech data for the CTS Challenge was drawn from the
CMN2 collection, conducted by LDC in Tunisia, in which Tunisian Arabic
speakers called friends or relatives who agreed to record their
telephone conversations, each lasting 8 to 10 minutes. The speech
segments include PSTN (public switched telephone network) and VOIP
(voice over IP) data.

2023 members can access this corpus through their LDC accounts.
Non-members may license this data for a fee.
*
LORELEI Zulu Representative Language Pack comprises over 5
million words of Zulu monolingual text, 2.7 million words of found
Zulu-English parallel text, and 71,000 Zulu words translated from
English data. Approximately 100,000 words were annotated for named
entities, and over 23,000 words were annotated for entity discovery and
linking and for situation frames (identifying entities, needs, and
issues). Data was collected from discussion forums, news, reference
material, social networks, and weblogs.

The LORELEI (Low Resource Languages for Emergent Incidents) program
was concerned with building human language technology for low resource
languages in the context of emergent situations. Representative
languages were selected to provide broad typological coverage.

The knowledge base for entity linking annotation is available
separately as LORELEI Entity Detection and Linking Knowledge Base
(LDC2020T10).

2023 members can access this corpus through their LDC accounts.
Non-members may license this data for a fee.

To unsubscribe from this newsletter, log in to your LDC account and
uncheck the box next to “Receive Newsletter” under Account Options or
contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104

Linguistic Field(s): Computational Linguistics



