37.1475, Qs: GUM 13 Corpus Survey
The LINGUIST List
linguist at listserv.linguistlist.org
Fri Apr 17 13:05:02 UTC 2026
LINGUIST List: Vol-37-1475. Fri Apr 17 2026. ISSN: 1069 - 4875.
Subject: 37.1475, Qs: GUM 13 Corpus Survey
Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Valeriia Vyshnevetska
Team: Helen Aristar-Dry, Mara Baccaro, Daniel Swanson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Editor for this issue: Daniel Swanson <daniel at linguistlist.org>
================================================================
Date: 16-Apr-2026
From: Lauren Levine [lel76 at georgetown.edu]
Subject: GUM 13 Corpus Survey
The GUM Corpus - Public Survey
Georgetown University Multilayer Corpus
The Corpling Lab at Georgetown University invites you to take part in
this survey (https://forms.gle/SQkfN8MTHNXo32Z3A) to help us better
understand how GUM is used and which current and potential new genres
users prefer. Your responses will guide our future selection of
genres and the formats and annotation layers we make available.
Survey Link: https://forms.gle/SQkfN8MTHNXo32Z3A
GUM is an open-source corpus of richly annotated English texts from
multiple genres: academic, bio, fiction, interview, news, travel,
how-to, Reddit forum discussions, conversations, political speeches,
CC vlogs, textbooks, podcasts, letters, L1 essays, and oral court
arguments. The corpus is created by students as part of the
Computational Linguistics curriculum at Georgetown University and is
available under Creative Commons licenses. To date, 12 series of the
GUM Corpus have been released, containing over 291K tokens annotated
on multiple layers. For more information and to search or download the
corpus online, see: https://gucorpling.org/gum/
We value your opinions and appreciate your participation and help! For
full consideration, please respond to the survey by the end of July.
Our lab will be attending the ACL 2026 main conference, CODI-CRAC, and
LAW XX in San Diego, so please feel free to come talk to us if you are
in attendance as well!
Linguistic Field(s): Computational Linguistics, Text/Corpus Linguistics
Subject Language(s): English (eng)
------------------------------------------------------------------------------