34.2770, Calls: Computational Linguistics / CORPUS (Jrnl)
The LINGUIST List
linguist at listserv.linguistlist.org
Thu Sep 21 15:05:03 UTC 2023
LINGUIST List: Vol-34-2770. Thu Sep 21 2023. ISSN: 1069 - 4875.
Subject: 34.2770, Calls: Computational Linguistics / CORPUS (Jrnl)
Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Please support the LL editors and operation with a donation at:
https://funddrive.linguistlist.org/donate/
Editor for this issue: Zachary Leech <zleech at linguistlist.org>
================================================================
Date: 20-Sep-2023
From: Luca Pallanti [luca.pallanti at univ-lyon2.fr]
Subject: Computational Linguistics / CORPUS (Jrnl)
Call for Papers:
AAC/CFP Corpus 26 - 2025 - https://journals.openedition.org/corpus/
Background noise or added value? Managing noise during computer
processing of linguistic corpora
Elisa Gugliotta, Luca Pallanti, Olivier Kraif, Iris Fabry and Martina
Barletta (eds.)
This special issue aims to delve into the definition of noise, from a
linguistic perspective, and the practices employed by researchers to
mitigate the biases that can arise from it. These practices are
implemented during collection, recording, and annotation of data. The
question of noise inevitably emerges at each stage of the empirical
process involved in data construction and analysis:
1. Noise during data collection and recording
If one accepts the postulate that "linguistic data is a result"
(Benveniste, 1966), decoding the noise stemming from data collection
and recording becomes crucial. Depending on the research object,
various factors may contribute to data alteration, including the
researcher's preconceptions or the biases introduced by an OCR system
(Jentsch & Porada, 2020).
2. Data preparation and pre-processing
The methods employed to refine raw data and prepare it for advanced
manipulation can themselves be a significant source of noise (or,
conversely, of silence, if noise elimination filters are applied).
This is particularly evident during the data normalization process (Al
Sharou et al., 2021).
3. The annotation process and metadata
In principle, corpus annotation aims to enrich the data by categorizing
units through a labelling process that depends on the analysis model
adopted (Péry-Woodley et al., 2011). However, this process can itself
introduce noise, and it can also result in detrimental silence
(when missing or erroneous labels lead to incomplete results during
data analysis or querying).
At every step of the process, key methodological questions
arise: What threshold of noise can be considered acceptable? How can
we differentiate between noise and methodological bias? Is it possible
to estimate noise without a ground truth? Which statistical tools are
specific to corpus studies and enable the definition of confidence
intervals? How can we strike a balance that prevents the resulting
noise from compromising research outcomes?
Proposals for articles may address these topics from a general point
of view, offering a theoretical and methodological perspective.
Alternatively, they can be based on one or more case studies that
focus on specific observations, while highlighting the noise
management methods employed throughout the study.
Timeline
* July 2023: call for papers.
* 10 November 2023: abstract submission deadline, followed by
pre-selection based on abstracts.
* March 2024: article submission deadline.
* June 2024: notification to authors.
* June-October 2024: review process, with authors submitting the final
version of the article.
* November-December 2024: editing process.
* January 2025: publication.
Please note that this timeline is indicative and may vary depending
on the specific publication requirements.
Abstract submission
* Your abstract should be no longer than 1,500 words, including
bibliographical references.
* Please submit your abstracts by November 10, 2023 to
elisa.gugliotta at ilc.cnr.it and luca.pallanti at univ-lyon2.fr.
Languages: French and English