37.1475, Qs: GUM 13 Corpus Survey

The LINGUIST List linguist at listserv.linguistlist.org
Fri Apr 17 13:05:02 UTC 2026


LINGUIST List: Vol-37-1475. Fri Apr 17 2026. ISSN: 1069 - 4875.

Subject: 37.1475, Qs: GUM 13 Corpus Survey

Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Valeriia Vyshnevetska
Team: Helen Aristar-Dry, Mara Baccaro, Daniel Swanson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Daniel Swanson <daniel at linguistlist.org>

================================================================


Date: 16-Apr-2026
From: Lauren Levine [lel76 at georgetown.edu]
Subject: GUM 13 Corpus Survey


The GUM Corpus - Public Survey
Georgetown University Multilayer Corpus

The Corpling Lab at Georgetown University invites you to take part in
this survey (https://forms.gle/SQkfN8MTHNXo32Z3A). Your responses will
help us better understand how GUM is used and which current and
potential new genres users prefer, and will guide our future selection
of genres, formats, and annotation layers.

Survey Link: https://forms.gle/SQkfN8MTHNXo32Z3A

GUM is an open-source corpus of richly annotated English texts from
multiple genres: academic, bio, fiction, interview, news, travel,
how-to, Reddit forum discussions, conversations, political speeches,
CC vlogs, textbooks, podcasts, letters, L1 essays, and oral court
arguments. The corpus is created by students as part of the
Computational Linguistics curriculum at Georgetown University and is
available under Creative Commons licenses. To date, 12 series of the
GUM Corpus have been released, containing over 291K tokens annotated
on multiple layers. For more information, and to search or download
the corpus online, see: https://gucorpling.org/gum/

We value your opinions and appreciate your participation and help! For
full consideration, please respond to the survey by the end of July.

Our lab will be attending the ACL 2026 main conference, CODI-CRAC, and
LAW XX in San Diego, so please feel free to come talk to us if you are
in attendance as well!

Linguistic Field(s): Computational Linguistics
                     Text/Corpus Linguistics

Subject Language(s): English (eng)




------------------------------------------------------------------------------

