34.1831, Confs: Shared Task for Corpus Generation and Corpus Augmentation for Machine Translation

The LINGUIST List linguist at listserv.linguistlist.org
Thu Jun 8 01:05:02 UTC 2023


LINGUIST List: Vol-34-1831. Thu Jun 08 2023. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Lauren Perkins
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Joshua Sims, Daniel Swanson, Matthew Fort, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: 08-Jun-2023
From: John Ortega [johneortega at gmail.com]
Subject: Shared Task for Corpus Generation and Corpus Augmentation for Machine Translation


Full Title: Shared Task for Corpus Generation and Corpus Augmentation
for Machine Translation
Short Title: CoCo4MT Shared Task

Date: 05-Sep-2023 - 05-Sep-2023
Location: Macau, China
Contact Person: Ananya Ganesh
Meeting Email: coco4mt-shared-task at googlegroups.com
Web Site: https://sites.google.com/view/coco4mt/shared-task

Linguistic Field(s): Applied Linguistics; Computational Linguistics;
Text/Corpus Linguistics; Translation
Subject Language(s): Burmese (mya)
                     English (eng)

Call Deadline: 12-Jul-2023

Meeting Description:

We are excited to introduce a new shared task for this year’s CoCo4MT
workshop! Our aim is to encourage and facilitate research on corpus
construction for low-resource machine translation.

Corpus creation for machine translation is typically constrained by
the cost and availability of human translators. When a new dataset
needs to be created for a low-resource language or a specialized
domain, the annotation budget should be used efficiently and any
sentences chosen for translation should be of high quality.

In this shared task, we ask participants to devise ways of identifying
such sentences for a target language that has no existing data.
Specifically, given a parallel corpus between high-resource languages,
the goal is to choose a subset of that corpus to be manually
translated into the low-resource language, so that the resulting data
yields a good machine translation system. The shared task winner will
be the team whose selected instances produce the best final system
after training.
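
To make the selection problem concrete, here is a minimal sketch of one
possible heuristic: greedily picking sentences that add the most unseen
word types, up to a fixed translation budget. This is an illustration
only, not the organizers' baseline or evaluation method; the function
name, the whitespace tokenization, and the budget measured in sentences
are all assumptions made for the example.

```python
from typing import List


def select_by_vocab_coverage(sentences: List[str], budget: int) -> List[int]:
    """Greedily pick up to `budget` sentence indices.

    Hypothetical helper: at each step it takes the sentence that adds the
    most word types not yet covered by the sentences already chosen.
    Tokenization is naive whitespace splitting (an assumption).
    """
    token_sets = [set(s.lower().split()) for s in sentences]
    covered = set()
    chosen = []
    remaining = set(range(len(sentences)))
    while remaining and len(chosen) < budget:
        # Iterating in sorted order makes max() return the lowest index
        # among ties, so the result is deterministic.
        best = max(sorted(remaining), key=lambda i: len(token_sets[i] - covered))
        chosen.append(best)
        covered |= token_sets[best]
        remaining.remove(best)
    return chosen
```

For the four sentences "the cat sat", "the cat sat down", "a dog barked
loudly" and "the dog" with a budget of 2, this sketch returns the
indices [1, 2]. A real submission would likely weigh other signals as
well, such as sentence length or domain match.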

Call for Papers:

Detailed information:
https://sites.google.com/view/coco4mt/shared-task
Registration: https://forms.gle/jfKSPQMKEmaaXFHy5

Important Dates
May 19, 2023: Release of train, dev and test data
May 30, 2023: Release of baselines
July 12, 2023: Deadline to submit results
July 20, 2023: System description papers due

Organizers (listed alphabetically)
Ananya Ganesh, University of Colorado Boulder
Constantine Lignos, Brandeis University
John E. Ortega, Northeastern University
Jonne Sälevä, Brandeis University
Katharina Kann, University of Colorado Boulder
Marine Carpuat, University of Maryland
Rodolfo Zevallos, Universitat Pompeu Fabra
Shabnam Tafreshi, University of Maryland
William Chen, Carnegie Mellon University


