35.3563, FYI: December 2024 Newsletter - LDC (Linguistic Data Consortium)
The LINGUIST List
linguist at listserv.linguistlist.org
Wed Dec 18 01:05:02 UTC 2024
LINGUIST List: Vol-35-3563. Wed Dec 18 2024. ISSN: 1069 - 4875.
Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Joel Jenkins, Daniel Swanson, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Editor for this issue: Joel Jenkins <joel at linguistlist.org>
================================================================
Date: 16-Dec-2024
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: December 2024 Newsletter - LDC (Linguistic Data Consortium)
In this newsletter:
LDC 2025 membership discounts now available
Approaching deadline for Spring 2025 data scholarship applications
LDC closed for Winter Break December 25-January 1
New publications:
MATERIAL Farsi-English Language Pack
Abstract Meaning Representation 3.0 - Machine Translations
________________________________________
LDC 2025 membership discounts now available
Now through March 3, 2025, current 2024 members receive a 10% discount
for renewing their membership, and new or returning organizations
receive a 5% discount. Membership remains the most economical way to
access current and past LDC releases. Consult the Join LDC page for
details on membership options and benefits.
Approaching deadline for Spring 2025 data scholarship applications
Attention students: don’t miss out on the chance to receive no-cost
access to LDC data for your research. Applications for Spring 2025
data scholarships are due January 15, 2025. For more information on
requirements and program rules, see LDC Data Scholarships.
LDC closed for Winter Break December 25-January 1
LDC will be closed from Wednesday, December 25, 2024, through
Wednesday, January 1, 2025, in accordance with the University of
Pennsylvania Winter Break Policy. Our offices will reopen on Thursday,
January 2, 2025. Requests received by the Membership Office during
Winter Break will be processed when the office reopens.
________________________________________
New publications:
MATERIAL Farsi-English Language Pack was developed by Appen for the
IARPA MATERIAL program and contains 61 hours of Farsi conversational
telephone speech, transcripts, English translations, annotations, and
queries. Calls were made using different telephones (e.g., mobile,
landline) from a variety of environments. Transcripts cover
approximately 30% of the speech files, and approximately 3% of the
speech files were translated into English. This release also includes
English queries and their relevance annotations.
The MATERIAL program focused on underserved languages, with the
ultimate goal of building cross-language information retrieval systems
that find speech and text content using English search queries.
2024 members can access this corpus through their LDC accounts
provided they have submitted a completed copy of the special license
agreement. Non-members may license this data for a fee.
*
Abstract Meaning Representation 3.0 - Machine Translations was
developed by the Center for Computational Linguistics at KU Leuven in
the HORIZON2020 project SignON. It is an automatic translation of a
subset of sentences from Abstract Meaning Representation (AMR)
Annotation Release 3.0 (LDC2020T02) into Spanish, Irish Gaelic, and
Dutch.
AMR 3.0 training, development, and test splits were translated using
Google Translate. "Unsplit" directories were not translated and are
not included in this release. Translations were not manually verified,
but formal issues (such as unexpected line breaks) were corrected, and
special tokens and encoding issues were fixed with the fix_text
function of the Python library ftfy.
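
As a rough illustration of the kind of cleanup described above, the
sketch below collapses stray line breaks and repairs one common
encoding error (UTF-8 text wrongly decoded as Latin-1) using only the
Python standard library. It is a simplified stand-in, not the pipeline
used for this release, and it covers far fewer cases than
ftfy.fix_text. The function names and the sample sentence are
hypothetical.

```python
import re
import unicodedata


def collapse_newlines(text: str) -> str:
    """Replace each run of line breaks (plus adjacent whitespace) with one space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", text).strip()


def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was wrongly decoded as Latin-1.

    If the round trip fails, the text is assumed to be fine already
    and is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


def clean_translation(text: str) -> str:
    """Apply both repairs, then normalize to Unicode NFC."""
    repaired = repair_mojibake(collapse_newlines(text))
    return unicodedata.normalize("NFC", repaired)


if __name__ == "__main__":
    # Hypothetical machine-translated sentence with a stray line break
    # and a mis-decoded "ñ".
    print(clean_translation("El niÃ±o\nquiere jugar."))
```

Correctly encoded text, such as "El niño juega.", passes through
unchanged because the Latin-1 round trip fails and the original string
is returned.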
AMR 3.0 is a semantic treebank of over 59,000 English natural language
sentences drawn from material collected by LDC, specifically,
discussion forum text from the DARPA BOLT and DARPA DEFT programs,
transcripts and English translations of Mandarin Chinese broadcast
news programming, Wall Street Journal text, translated Xinhua news
texts, various newswire texts from NIST OpenMT evaluations, and weblog
data from the DARPA GALE program.
2024 members can access this corpus through their LDC accounts.
Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account and
uncheck the box next to “Receive Newsletter” under Account Options or
contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
M: 3600 Market St. Suite 810
Linguistic Field(s): Computational Linguistics
------------------------------------------------------------------------------
********************** LINGUIST List Support ***********************
Please consider donating to the Linguist List to support the student editors:
https://www.paypal.com/donate/?hosted_button_id=87C2AXTVC4PP8
LINGUIST List is supported by the following publishers:
Bloomsbury Publishing http://www.bloomsbury.com/uk/
Brill http://www.brill.com
Cambridge University Press http://www.cambridge.org/linguistics
Cascadilla Press http://www.cascadilla.com/
De Gruyter Mouton https://cloud.newsletter.degruyter.com/mouton
Edinburgh University Press https://edinburghuniversitypress.com
Elsevier Ltd http://www.elsevier.com/linguistics
Equinox Publishing Ltd http://www.equinoxpub.com/
European Language Resources Association (ELRA) http://www.elra.info
John Benjamins http://www.benjamins.com/
Language Science Press http://langsci-press.org
Lincom GmbH https://lincom-shop.eu/
Multilingual Matters http://www.multilingual-matters.com/
Narr Francke Attempto Verlag GmbH + Co. KG http://www.narr.de/
Netherlands Graduate School of Linguistics / Landelijke (LOT) http://www.lotpublications.nl/
Oxford University Press http://www.oup.com/us
Wiley http://www.wiley.com