36.449, FYI: Launch of the Parsed Corpus of Southern Dutch Dialects (GCND)
The LINGUIST List
linguist at listserv.linguistlist.org
Tue Feb 4 02:05:05 UTC 2025
LINGUIST List: Vol-36-449. Tue Feb 04 2025. ISSN: 1069 - 4875.
Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Joel Jenkins, Daniel Swanson, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Editor for this issue: Joel Jenkins <joel at linguistlist.org>
================================================================
Date: 04-Feb-2025
From: Melissa Farasyn [melissa.farasyn at ugent.be]
Subject: Launch of the Parsed Corpus of Southern Dutch Dialects (GCND)
We are excited to announce the release of the first parsed corpus of
spoken Dutch dialects, the Gesproken Corpus van de
zuidelijk-Nederlandse Dialecten (GCND). This resource offers extensive
data for linguistic research and is now accessible online.
Corpus Highlights:
• Speakers: 1,206 individuals, with the eldest born in 1871.
• Geographical Coverage: 639 distinct locations.
• Audio Data: Over 430 hours of recordings across 650 sessions.
• Transcriptions: Over 600 time-aligned, highly detailed
transcriptions.
• Total Tokens: Approximately 4.77 million.
• GrETEL Treebank: 50,111 verified sentences and 452,459
verified tokens.
These figures represent the corpus as of its initial release. Ongoing
efforts, supported by additional funding (GCND+), aim to expand the
corpus with more transcriptions, including northern dialects from the
Meertens Institute collection, and to enhance grammatical annotations.
The latest updates are available through the corpus application.
Access Information:
The GCND is available online via:
• GCND corpus application (requires CLARIN login)
• GCND project website
Acknowledgments:
This project was made possible by funding from the Research
Foundation Flanders and by the dedicated efforts of numerous student
assistants, volunteers and our project partners.
The GCND team (at Ghent University):
Anne Breitbarth (anne.breitbarth at ugent.be)
Anne-Sophie Ghyselen (annesophie.ghyselen at ugent.be)
Melissa Farasyn (melissa.farasyn at ugent.be)
Lien Hellebaut (lien.hellebaut at ugent.be)
Linguistic Field(s): Computational Linguistics; Historical Linguistics; Sociolinguistics; Syntax; Text/Corpus Linguistics
Subject Language(s): Dutch (nld)
Language Family(ies): West Germanic