37.77, Software: NyishiBERT: A Transformer Language Model for the Nyishi Language
The LINGUIST List
linguist at listserv.linguistlist.org
Thu Jan 8 20:05:02 UTC 2026
LINGUIST List: Vol-37-77. Thu Jan 08 2026. ISSN: 1069-4875.
Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Valeriia Vyshnevetska
Team: Helen Aristar-Dry, Mara Baccaro, Daniel Swanson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Editor for this issue: Daniel Swanson <daniel at linguistlist.org>
================================================================
Date: 08-Jan-2026
From: Badal Nyalang [nyalang at mwirelabs.com]
Subject: NyishiBERT: A Transformer Language Model for the Nyishi Language
NyishiBERT is a transformer-based language model developed for the
Nyishi language, a low-resource Tibeto-Burman language spoken in
Northeast India. The model is intended to support linguistic research
and language technology development for underrepresented languages,
including tasks such as language modeling, downstream NLP
experimentation, and corpus-based analysis.
The model is trained on Nyishi text data and released openly to
encourage reproducibility, reuse, and further research on low-resource
and indigenous languages. NyishiBERT follows the BERT-style masked
language modeling paradigm and is suitable for adaptation to tasks
such as part-of-speech tagging, named entity recognition, and text
classification.
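For readers unfamiliar with the masked language modeling objective mentioned above, the following is a minimal sketch of how training inputs are typically corrupted. The masking rates (15% of positions selected; of those, 80% replaced by a mask symbol, 10% by a random token, 10% left unchanged) are those of the original BERT recipe; the announcement does not state which settings NyishiBERT used. The function name `mask_tokens`, the `[MASK]` symbol, and the placeholder tokens are illustrative assumptions, not part of the release.

```python
import random

MASK = "[MASK]"  # assumed mask symbol; a real tokenizer defines its own


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Corrupt a token sequence in the style of BERT pretraining.

    Returns (corrupted, labels). labels[i] holds the original token at
    positions the model must predict, and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK  # 80%: replace with the mask symbol
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels


# Placeholder tokens stand in for real Nyishi subword units.
tokens = ["w%d" % i for i in range(20)]
corrupted, labels = mask_tokens(tokens, vocab=tokens)
```

During training, the loss is computed only at positions where `labels` is not `None`, so the model learns to recover the original token from its context.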
This release builds on prior work on language-specific BERT models for
Northeast Indian languages and contributes to the growing ecosystem of
open language resources for the region.
Model page: https://huggingface.co/MWirelabs/nyishibert
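As a starting point for experimentation, the sketch below queries the model for masked-token predictions through the Hugging Face `transformers` library. It assumes that `transformers` and a backend such as PyTorch are installed, and that the repository ships a tokenizer and a masked-language-modeling head compatible with the `fill-mask` pipeline; the announcement does not confirm these details. The model identifier is taken from the URL above, the helper name `top_predictions` is hypothetical, and the input sentence is left for the user to supply.

```python
MODEL_ID = "MWirelabs/nyishibert"  # repository path from the announcement URL


def top_predictions(template, top_k=5, model_id=MODEL_ID):
    """Fill the "{mask}" slot in `template` and return (token, score) pairs.

    `template` is a sentence containing the literal text "{mask}" exactly
    once, at the position to be predicted.
    """
    # Imported lazily so this module can be loaded without transformers.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    # Use the tokenizer's own mask symbol rather than assuming "[MASK]".
    text = template.replace("{mask}", fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(text, top_k=top_k)]


# Example call (downloads the model on first use):
#   top_predictions("<a Nyishi sentence with {mask} in place of one word>")
```

Reading the mask symbol from the tokenizer avoids hard-coding a value that may differ between models.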
Linguistic Field(s): Applied Linguistics; Computational Linguistics; Language Documentation
Subject Language(s): Nyishi (njz)
Language Family(ies): Tibeto-Burman
------------------------------------------------------------------------------