36.1502, Confs: LexiVault Workshop: Developing Annotated Corpora tools for Under-resourced Languages (United Kingdom)
The LINGUIST List
linguist at listserv.linguistlist.org
Tue May 13 00:05:02 UTC 2025
LINGUIST List: Vol-36-1502. Tue May 13 2025. ISSN: 1069 - 4875.
Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Joel Jenkins, Daniel Swanson, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Editor for this issue: Erin Steitz <ensteitz at linguistlist.org>
================================================================
Date: 09-May-2025
From: Farida Soliman [f.a.i.soliman at qmul.ac.uk]
Subject: LexiVault Workshop: Developing Annotated Corpora tools for Under-resourced Languages
LexiVault Workshop: Developing Annotated Corpora tools for
Under-resourced Languages
Short Title: LexiVault
Date: 12-Jun-2025 - 13-Jun-2025
Location: London, United Kingdom
Linguistic Field(s): Text/Corpus Linguistics
Where: Queen Mary University of London, Mile End Campus
When: 10 AM – 4 PM, 12–13 June 2025
Sign-up/EOI:
https://docs.google.com/forms/d/e/1FAIpQLSfGgEQZGXM2EwZF9-EASYNRQHafphNNfyU-dv1jN5c2vb0iSA/viewform?pli=1
Join us for a two-day hands-on workshop exploring LexiVault, a
user-friendly, open-source web tool for querying annotated lexicons,
especially those of low-resource languages. LexiVault was developed by
Samantha Wray, Hind Saddiki and Daisy Li as part of the SAVANT project.
Psycholinguistic research on lesser-studied languages often requires
researchers to build corpora and compute measures like word frequency
and phonological neighborhood density from scratch. LexiVault closes
that gap by making these metrics easily accessible and searchable.
Currently, the tool hosts lexicons for Tagalog, Bangla, and multiple
Arabic dialects, with searchable annotations including part-of-speech
tags, morpheme frequency, transition probability, and more.
We'd like to expand our offerings while helping you convert your
language data into a usable, shareable resource, whether you're
starting with raw audio, a text corpus, or existing annotations. This
workshop is intended for anyone with any amount of corpus or behavioral
data that they would like to process or annotate further for storage
and use on the LexiVault site.
The focus of this two-day workshop will differ from individual to
individual depending on the starting state of your dataset and your
interests, but could take the following forms:
- automatic transcription of auditory data to create a text corpus
from speech
- stemming a text corpus to create a list of morphemes and their
frequencies
- part-of-speech tagging a text corpus
- calculating minimal pairs and phonological neighborhood density from
a text corpus
All paths lead to your resource being in a form you (and others!) can
easily query in the future.
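
As a rough illustration of two of the measures mentioned above, the
sketch below counts word frequencies, neighborhood density, and
minimal pairs over a toy token list. It is not LexiVault's own code.
It assumes that each character stands for one segment (so for real
data you would substitute phonemic transcriptions) and that a
"neighbor" is any word one substitution, insertion, or deletion away.
The function names and the sample tokens are made up for the example.

```python
from collections import Counter
from itertools import combinations


def differs_by_one_edit(a, b):
    """True if a and b differ by exactly one substitution, insertion, or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make sure a is the shorter (or equal-length) form
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the shared prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]  # one extra segment in b at position i


def neighborhood_density(lexicon):
    """Map each word to the number of other words one edit away from it."""
    return {
        word: sum(differs_by_one_edit(word, other) for other in lexicon)
        for word in lexicon
    }


def minimal_pairs(lexicon):
    """Pairs of equal-length words that differ in exactly one segment."""
    return [
        (a, b)
        for a, b in combinations(sorted(lexicon), 2)
        if len(a) == len(b) and differs_by_one_edit(a, b)
    ]


# Toy data, invented for this example only.
tokens = ["cat", "bat", "cat", "at", "cart", "dog"]

frequencies = Counter(tokens)   # token counts per word form
lexicon = sorted(set(tokens))   # unique word forms
density = neighborhood_density(lexicon)
pairs = minimal_pairs(lexicon)

print(frequencies["cat"])  # 2
print(density["cat"])      # 3 (at, bat, cart)
print(pairs)               # [('bat', 'cat')]
```

The pairwise comparison is quadratic in lexicon size, which is fine
for a small word list but would likely need indexing for a large one.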
To book a spot at the workshop (places are limited), or to express
your interest in future workshops, click here:
https://docs.google.com/forms/d/e/1FAIpQLSfGgEQZGXM2EwZF9-EASYNRQHafphNNfyU-dv1jN5c2vb0iSA/viewform?pli=1