34.2221, Calls: Text/Corpus Linguistics / CORPUS (Jrnl)

The LINGUIST List linguist at listserv.linguistlist.org
Mon Jul 17 01:05:02 UTC 2023


LINGUIST List: Vol-34-2221. Mon Jul 17 2023. ISSN: 1069 - 4875.

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Zackary Leech <zleech at linguistlist.org>
================================================================


Date: 13-Jul-2023
From: Luca Pallanti [luca.pallanti at univ-lyon2.fr]
Subject: Text/Corpus Linguistics / CORPUS (Jrnl)


Call for Papers:

AAC/CFP Corpus 26 - 2025 - https://journals.openedition.org/corpus/

Background noise or added value? Managing noise during computer
processing of linguistic corpora
Elisa Gugliotta, Luca Pallanti, Olivier Kraif, Iris Fabry and Martina
Barletta (eds.)

This special issue of Corpus builds upon a workshop held in April 2023
(https://je-bruit-corpus.sciencesconf.org/) and offers an opportunity
to examine noise management methods in the fields of NLP and corpus
linguistics, as well as their impact on the quality of linguistic data
(Kraif & Ponton, 2007; Goutte et al., 2012; Zeroual, 2018).
This special issue aims to examine how noise is defined from a
linguistic perspective and which practices researchers employ to
mitigate the biases that can arise from it. These practices are
implemented during the collection, recording, and annotation of data. The
question of noise inevitably emerges at each stage of the empirical
process involved in data construction and analysis:
1. Noise during data collection and recording
If one accepts the postulate that "linguistic data is a result"
(Benveniste, 1966), decoding the noise stemming from data collection
and recording becomes crucial. Depending on the research object,
various factors may contribute to data alteration, including the
researcher's preconceptions or the biases introduced by an OCR system
(Jentsch & Porada, 2020).
2. Data preparation and pre-processing
The methods employed to refine raw data and prepare it for advanced
manipulation can be a significant source of noise (or, conversely, of
silence, if noise elimination filters are applied).
This is particularly evident during the data normalization process (Al
Sharou et al., 2021). When transcribing data or correcting errors,
researchers must make choices that inevitably influence the nature of
the data, either by reducing or enriching its content.
3. The annotation process and metadata
In principle, corpus annotation aims to enrich the data by categorizing
units through a labelling process, according to the analysis model
adopted (Péry-Woodley et al., 2011). However, this process can itself
introduce noise, and it can also result in detrimental silence
(when missing or erroneous labels lead to incomplete results during
data analysis or querying). The concept of metadata also raises
questions: does categorizing data transform it into something
different?

At every step of the process, key methodological questions arise: what
threshold can be considered acceptable for noise? How can we
differentiate between noise and methodological bias? Is it possible
to estimate noise without a ground truth? Which statistical tools are
specific to corpus studies and enable the definition of confidence
intervals? How can we strike a balance that prevents the resulting
noise from compromising research outcomes?

Proposals for articles may address these topics from a general point
of view, offering a theoretical and methodological perspective.
Alternatively, they can be based on one or more case studies that
focus on specific observations, while highlighting the noise
management methods employed throughout the study.

Timeline
* July 2023: call for papers.
* November 2023: pre-selection based on abstracts.
* March 2024: article submission deadline.
* June 2024: response to the authors.
* June-October 2024: review process with

Abstract submission
* Your abstract should be no longer than 1,500 words, including
bibliographical references.
* Please submit your abstracts by November 6, 2023 to
elisa.gugliotta at ilc.cnr.it and luca.pallanti at univ-lyon2.fr.


