34.3377, FYI: Announcing CreoleVal

The LINGUIST List linguist at listserv.linguistlist.org
Sun Nov 12 15:05:05 UTC 2023


LINGUIST List: Vol-34-3377. Sun Nov 12 2023. ISSN: 1069 - 4875.

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Justin Fuller <justin at linguistlist.org>
================================================================


Date: 01-Nov-2023
From: Johannes Bjerva [jbjerva at cs.aau.dk]
Subject: Announcing CreoleVal


We are proud to announce the release of CreoleVal, a collection of
benchmarks for 28 Creole languages. The datasets span tasks such as
relation classification, machine comprehension, machine translation,
and named entity recognition, as well as use cases such as language
modeling. We cover Haitian Creole, Bislama, Chavacano, Pitkern,
Singlish, Tok Pisin, Papiamento, and others.

We hope the NLP community will include this collection of datasets in
ongoing and future evaluations of methods directed at low-resource
languages. We also hypothesise that CreoleVal will open the door to
controlled experimentation with transfer learning methodology.

This resource has been long in the making and was made possible by
many collaborators.

For a pre-print, see: https://arxiv.org/abs/2310.19567

For code and data, see: https://github.com/hclent/CreoleVal
(Repository under construction)

Linguistic Field(s): Computational Linguistics
                     Text/Corpus Linguistics

Subject Language(s): Bislama (bis)
                     Chavacano (cbk)
                     Creole, Haitian (hat)
                     Papiamento (pap)
                     Pitcairn-Norfolk (pih)
                     Tok Pisin (tpi)

Language Family(ies): Creole


