33.261, FYI: January 2022 Newsletter - LDC

The LINGUIST List linguist at listserv.linguistlist.org
Mon Jan 24 10:16:31 UTC 2022


LINGUIST List: Vol-33-261. Mon Jan 24 2022. ISSN: 1069 - 4875.

Subject: 33.261, FYI: January 2022 Newsletter - LDC

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Billy Dickson
Managing Editor: Lauren Perkins
Team: Helen Aristar-Dry, Everett Green, Sarah Goldfinch, Nils Hjortnaes,
      Joshua Sims, Billy Dickson, Amalia Robinson, Matthew Fort
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: Mon, 24 Jan 2022 05:02:51
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: January 2022 Newsletter - LDC

 
In this newsletter: 
Renew your LDC Membership today 

New Publications:
2017 NIST OpenSAT Pilot - SSSF
LORELEI Kinyarwanda Incident Language Pack
________________________________________
Renew your LDC Membership today 
The importance of curated resources for language-related education, research,
and technology development drives LDC’s mission to create them, to accept data
contributions from researchers across the globe, and to broadly share such
resources through the LDC Catalog. LDC members enjoy no-cost access to new
corpora released annually, as well as the ability to license legacy data sets
from among our 900 holdings at reduced fees. Ensure that your data needs
continue to be met by renewing your LDC membership or by joining the
Consortium today.

Now through March 1, 2022, organizations that were LDC members in 2021 receive
a 10% discount on 2022 membership, and new or returning organizations receive a
5% discount. Membership remains the most economical way to access current and
past LDC releases. Consult the Join LDC page for more details on membership
options and benefits.
________________________________________
New Publications:

(1) 2017 NIST OpenSAT Pilot - SSSF was developed by NIST (National Institute
of Standards and Technology) and contains approximately one hour of
operational speech data, transcripts, and annotation files used in the speech
activity detection, automatic speech recognition, and keyword search tasks of
the 2017 OpenSAT Pilot evaluation. The source audio consists of radio and
telephone dispatches from the Sofa Super Store fire (SSSF) in Charleston,
South Carolina, in June 2007.

2017 NIST OpenSAT Pilot - SSSF is distributed via web download.  

2022 Subscription Members will automatically receive copies of this corpus.
2022 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.
*
(2) LORELEI Kinyarwanda Incident Language Pack was developed by LDC and
comprises approximately 11.9 million words of Kinyarwanda monolingual text,
35,000 words of English monolingual text, 3.4 million words of parallel and
comparable Kinyarwanda-English text, and 50,000 words each of English and
Kinyarwanda data annotated for Entity Discovery and Linking and Situation
Frames. It constitutes all of the text data, annotations, supplemental
resources, and related software tools for the Kinyarwanda language that were
used in the DARPA LORELEI / LoReHLT 2018 Evaluation. 

Data was collected from news, social network, weblog, newsgroup, discussion
forum, and reference material. Entity detection and linking annotation
identified entities to be detected by systems for scoring purposes. Situation
frame analysis was designed to extract basic information about needs and
relevant issues for planning a disaster response effort.

The knowledge base for entity linking annotation is available separately as
LORELEI Entity Detection and Linking Knowledge Base (LDC2020T10).

LORELEI Kinyarwanda Incident Language Pack is distributed via web download. 

2022 Subscription Members will automatically receive copies of this corpus.
2022 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.

Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc at ldc.upenn.edu
M: 3600 Market St. Suite 810
      Philadelphia, PA 19104

Linguistic Field(s): Computational Linguistics


