32.2408, FYI: July 2021 Newsletter - LDC

The LINGUIST List linguist at listserv.linguistlist.org
Sat Jul 17 02:12:37 UTC 2021


LINGUIST List: Vol-32-2408. Fri Jul 16 2021. ISSN: 1069 - 4875.

Subject: 32.2408, FYI: July 2021 Newsletter - LDC

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn, Lauren Perkins
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Nils Hjortnaes, Joshua Sims, Billy Dickson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: Fri, 16 Jul 2021 22:05:39
From: Membership Coordinator [ldc at ldc.upenn.edu]
Subject: July 2021 Newsletter - LDC

 
In this newsletter: 
LDC Submissions: a new platform for sharing data through LDC 
Fall 2021 LDC Data Scholarship Program 

New Publications:
Ethnobotanical Research and Language Documentation of Nahuatl
Chinese Abstract Meaning Representation 2.0
BOLT Egyptian Arabic Co-reference – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech

__
LDC Submissions: a new platform for sharing data through LDC 
LDC is pleased to announce the launch of LDC Submissions, a platform that
provides infrastructure and resources for sharing data through the Catalog.
After registering for a user account, corpus submitters can create a
submission, upload files, and communicate with LDC’s publications team during
the review process. After all reviews are complete, the final, release-ready
version of your data set is uploaded to the platform and enters the
publications queue. 

Sharing your corpus through LDC ensures access to the global research
community and the permanent preservation of your data according to best
practices for archiving digital language resources. Get started and register
for an LDC Submissions user account today.

Fall 2021 LDC Data Scholarship Program 
Student applications for the Fall 2021 LDC Data Scholarship program are being
accepted now through September 15, 2021. This program provides eligible
students with no-cost access to LDC data. Students must complete an
application consisting of a data use proposal and a letter of support from their
advisor.
__

New Publications:
(1) Ethnobotanical Research and Language Documentation of Nahuatl consists of
approximately 190 hours of field recordings collected in the Sierra
Nororiental and Sierra Norte regions of Puebla, Mexico. The corpus contains
audio and video recordings of native Nahuatl speakers during the collection of
particular plants; partial transcripts (Nahuatl and Spanish); a Highland
Puebla Nahuat dictionary; botanical and ethnobotanical data; and speaker
metadata.

Ethnobotanical Research and Language Documentation of Nahuatl is distributed
via web download.

2021 Subscription Members will automatically receive copies of this corpus
provided they have submitted a completed copy of the special license
agreement. 2021 Standard Members may request a copy as part of their 16 free
membership corpora. Non-members may license this data for a fee.

*

(2) Chinese Abstract Meaning Representation 2.0 was developed by Brandeis
University and Nanjing Normal University and consists of semantic
representations of a set of approximately 20,000 Chinese sentences from
Chinese Treebank (CTB) 8.0 (LDC2013T21). CAMR 2.0 includes the content of
Chinese Abstract Meaning Representation 1.0 (LDC2019T07) (CTB 8.0 weblog and
discussion forum sentences), plus an additional 9,933 sentences from the
newswire portion of CTB 8.0.

Chinese Abstract Meaning Representation 2.0 is distributed via web download.

2021 Subscription Members will automatically receive copies of this corpus.
2021 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.

*

(3) BOLT Egyptian Arabic Co-reference – Discussion Forum, SMS/Chat, and
Conversational Telephone Speech was developed by Raytheon BBN Technologies.
Co-reference annotation aims to fill in the connections between specific
mentions in the text that refer to the same entities and events in the
discourse context. BOLT co-reference annotation was performed on BOLT treebank
annotation. It covers noun phrases (including proper nouns, nominals, pronouns
and null arguments), possessives, proper noun pre-modifiers, and verbs.

BOLT Egyptian Arabic Co-reference is distributed via web download.

2021 Subscription Members will automatically receive copies of this corpus.
2021 Standard Members may request a copy as part of their 16 free membership
corpora. Non-members may license this data for a fee.
 



Linguistic Field(s): Computational Linguistics





 



 







