36.1283, FYI: April 2025 Newsletter - LDC

The LINGUIST List linguist at listserv.linguistlist.org
Thu Apr 17 11:05:05 UTC 2025


LINGUIST List: Vol-36-1283. Thu Apr 17 2025. ISSN: 1069 - 4875.

Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Joel Jenkins, Daniel Swanson, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Joel Jenkins <joel at linguistlist.org>

================================================================


Date: 15-Apr-2025
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: April 2025 Newsletter - LDC


In this newsletter:
  LDC launches upgraded, mobile-friendly website
  Connect with LDC on Bluesky
  New publications:
    DEFT Spanish Light and Rich ERE Annotation
    MATERIAL Kazakh-English Language Pack
________________________________________
LDC launches upgraded, mobile-friendly website

We are pleased to announce the launch of the newly upgraded LDC main
website: https://www.ldc.upenn.edu/. Designed with a modern layout,
the site now offers an improved experience across all devices. While
the LDC Catalog, LDC user accounts, and LDC Submissions are not
affected by this upgrade, they are now more accessible than ever from
any page on the site. We invite you to explore the website and enjoy a
smoother, more intuitive LDC web experience.

Connect with LDC on Bluesky

In addition to Facebook, X, and LinkedIn, you can now connect with LDC
on the microblogging platform Bluesky. Follow us today to learn the
latest news, announcements, and corpus releases from the Consortium.
________________________________________
New publications:

DEFT Spanish Light and Rich ERE Annotation was developed by LDC and
consists of 158 Spanish discussion forum and newswire documents
annotated for entities, relations, and events (ERE). Light ERE
annotation labels entity mentions and, for the target set of types,
the relations and events between and among those entities, including
coreference. Rich ERE annotation expands the types and tagging in the
entity, relation, and event annotation tasks and replaces strict
event coreference with a more loosely defined event hopper annotation.

The source data consists of Spanish newswire text and Latin American
discussion forum data from DEFT Spanish Treebank (LDC2018T01). Of the
158 documents, 128 were annotated following the Light ERE annotation
guidelines and 154 were labeled with Rich ERE annotation; 124 of the
Rich ERE files were also labeled with Light ERE annotation.

DARPA's Deep Exploration and Filtering of Text (DEFT) program aimed to
address remaining capability gaps in state-of-the-art natural language
processing technologies related to inference, causal relationships, and
anomaly detection. LDC supported the DEFT program by collecting,
creating, and annotating a variety of data sources.

2025 members can access this corpus through their LDC accounts.
Non-members may license this data for a fee.

*
MATERIAL Kazakh-English Language Pack was developed by Appen for the
IARPA MATERIAL program and contains 57 hours of Kazakh conversational
telephone speech, transcripts, English translations, annotations, and
queries. Calls were made using different telephones (e.g., mobile,
landline) from a variety of environments. Transcripts cover
approximately 17% of the speech files, and all of the transcripts were
translated into English. This release also includes English queries and
their relevance annotations.

The MATERIAL program focused on underserved languages, with the
ultimate goal of building cross-language information retrieval systems
that find speech and text content using English search queries.

2025 members can access this corpus through their LDC accounts,
provided they have submitted a completed copy of the special license
agreement. Non-members may license this data for a fee.

To unsubscribe from this newsletter, log in to your LDC account and
uncheck the box next to “Receive Newsletter” under Account Options, or
contact LDC for assistance.

Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
M: 3600 Market St. Suite 810
      Philadelphia, PA 19104

Linguistic Field(s): Computational Linguistics



