35.223, Review: Corpus Dialectology: Pustka, Quijada Van den Berghe, Weiland (eds.) (2023)

The LINGUIST List linguist at listserv.linguistlist.org
Wed Jan 17 21:05:08 UTC 2024


LINGUIST List: Vol-35-223. Wed Jan 17 2024. ISSN: 1069 - 4875.

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Homepage: http://linguistlist.org

Editor for this issue: Justin Fuller <justin at linguistlist.org>
================================================================


Date: 17-Jan-2024
From: Troy Spier [tspier2 at gmail.com]
Subject: Linguistic Theories, Sociolinguistics: Pustka, Quijada Van den Berghe, Weiland (eds.) (2023)


Book announced at https://linguistlist.org/issues/34.2738

EDITOR: Elissa Pustka
EDITOR: Carmen Quijada Van den Berghe
EDITOR: Verena Weiland
TITLE: Corpus Dialectology
SERIES TITLE: Studies in Corpus Linguistics 110
PUBLISHER: John Benjamins
YEAR: 2023

REVIEWER: Troy Spier

SUMMARY

Chapter 1, written by the three editors, motivates this edited volume
by arguing for the need to bridge the gap between traditional studies
in dialectology and the more recent methodologies employed in corpus
linguistics. It begins with
a thorough explication of the role that monographs and linguistic
atlases have historically played before shifting to the primary
objective of corpus dialectology: “[...] not only to use current data
to update and revise existing descriptions of, and approaches to,
regional variation but also to further develop linguistic theories in
the domain of variation and change, in the different levels of
structure (phonology, morphosyntax, semantics), and in the combination
of both” (p. 3). Finally, a variety of common corpora are described,
and all subsequent chapters are briefly contextualized for the reader.

Chapter 2, written by Anne Kruijt et al., focuses on the usefulness of
crowdsourced data in corpus dialectology studies by foregrounding two
projects: Advancing the European Multilingual Experience (AThEME) and
Varieties in Contact (VinKo). The benefits of the latter are
explained in detail, and the reader is guided through the process of
data collection for this particular project. Next, three case studies
drawing on data from both projects are offered: pronoun usage in
South Tyrol (Germanic), verbal agreement in declarative and indirect
interrogatives in Trentino (Romance), and obligatory and optional
subject proclitics in Veneto (Romance). Finally, the chapter
concludes by arguing that the crowdsourced data contained within these
two projects “[are] of comparable quality to data gathered in
traditional fieldwork” (p. 30).

Chapter 3, written by Céline Dugua et al., offers a strikingly novel
examination of the validity of collected data by interrogating “the
effect of the socio-cultural and linguistic biography of the
researchers who conducted the interviews” (p. 35) and whether this
impacts one’s transcriptional practices. Interestingly, the authors
undertake a decidedly ‘meta’ approach to addressing these issues by
interviewing those who conducted the interviews and completed the
transcriptions for the ESLO Corpus. In doing so, they consider not
only the corpus/database construction, but also the language used to
elicit data, the kinds of questions asked, and the types of data
ultimately collected, offering a uniquely human perspective on the
experience of the researcher.

Chapter 4, written by Marie-Hélène Côté and Hugo Saint-Amant Lamy,
begins by situating the two primary varieties of French in Canada and
distinguishing them according to dialectal affiliation (Laurentian vs.
Acadian), location with reference to the Province of Quebec (inside
vs. outside), and relationship with English (majority vs. minority).
The parameters for the fieldwork, data collection, and data processing
are delineated before the chapter shifts to a dialectometric analysis
of vowel length, thereby illustrating the potential for “[...]
quantitative analysis of macro-level geolinguistic features (dialect
areas, bundles of isoglosses, etc.)” (p. 71). Numerous figures are
presented to illustrate the distinction in length and the geographic
distribution of these forms.

Chapter 5, written by Gianmario Raimondi et al., shifts the focus to
non-standardized linguistic varieties of the Western Alps,
specifically Occitan (Romance), Franco-Provençal (Romance), and Walser
(Germanic). After providing a brief geolinguistic overview, the
chapter introduces the Corpus Linguistics Meets Alpine Cultural
Heritage (CLiMAlp) initiative. Finally, an in-depth methodological
discussion of the corpus is provided, followed by some implications
for the resulting database.

Chapter 6, written by Linda Bäumler and Frederik Hartmann, offers a
quantitative, phonological treatment of Anglicized loanwords from
speakers in Mexico and Spain. The data collected for this chapter come
from the former’s doctoral dissertation, in which seventy-four native
speakers of Spanish from both countries were recorded. Three
generations and both urban and rural speakers are represented. After
temporally positioning the loanwords considered in the corpus, the
chapter shifts to the statistical analysis before finally offering
some general findings, noting, in particular, that greater exposure to
English led to a closer approximation of the corresponding English
phoneme.

Chapter 7, written by Vicente J. Marcet Rodríguez and Manuel Nevot
Navarro, unifies diachronic linguistics and corpus linguistics by
utilizing four corpora of Spanish (covering Castile and León) over
four centuries. Once the authors introduce the objectives of their
study, they shift toward a broad overview of the corpora employed.
Finally, the chapter presents and exemplifies a sequence of linguistic
phenomena, viz. the development of periphrastic verbal phrases to
supplant erstwhile synthetic forms, the aspiration and loss of
syllable-initial /f/ (e.g. ‘habla’ from ‘fabla’), and the selective
devoicing of voiced sibilants in the Romance languages more broadly.

Chapter 8, written by Borja Alonso Pascua, centers the usage of the
aoristic present perfect in rural varieties of Spanish with data
present in the Audible Corpus of Spoken Rural Spanish. The chapter
begins with a direct justification for the usage of a corpus in this
study before offering a review of the morphosyntactic definitional
criteria for and the historical development of the aoristic present
perfect. A total of 177 instances of the present perfect were
investigated, and the possible impacting factors, alongside the
location of the speaker, were considered.

Chapter 9, written by Andrés Enrique-Arias and Marina Gomila Albal,
utilizes some thirteen thousand tweets to determine whether any
differences exist between the usage of the synthetic and periphrastic
future in Peninsular (i.e. European) and Latin American Spanish. The
chapter opens with a discussion of the future tense more broadly
before reviewing extant scholarship from the last sixty years on the
future tense in relevant varieties of Spanish. Finally, after
reviewing the context of the corpus itself, the chapter considers
whether register, verb frequency, lexical type, temporal distance,
and/or assertiveness (in addition, of course, to the location of the
author of the tweet) impact the usage of the periphrastic or synthetic
form.

Chapter 10, written by Thomas Krefeld and Stephan Lücke, serves as a
concluding chapter to the volume by reiterating the importance of
corpus dialectology as a field, remarking, though, that it is “a
discipline that is still searching for the best solutions and is far
from establishing corresponding standards” (p. 198). The authors
remind the reader that geographically situated studies of linguistic
varieties constitute an empirical pursuit that follows, or should
follow, the best standards for corpus development and analysis, noting
that existing data are limited in a number of ways, methodologically
or otherwise. They also offer a digitization scale to characterize the
extent to which non-electronic works are made available through the
internet. Finally, and perhaps most importantly, the chapter closes
with a conversation about the importance of ethical data management
and both the involvement of (non-)traditional informants and the
inclusion of their speech.

EVALUATION

As a co-edited volume organized into four parts (data collection,
methodology, case studies, and theory), “Corpus Dialectology”
offers, in approximately two hundred pages, unique insight into some of
the possible applications of corpus linguistics, with specific emphasis
upon linguistic varieties of the Romance family. Although not
enumerated within the volume itself, ten chapters are included, nine
of which are co- or multi-authored. Excluding the introductory
chapter, each chapter ranges from eighteen to twenty-eight pages, and
the inclusion of extensive references indicates thorough engagement
with the literature. Nonetheless, there are a few areas where this
volume could be improved, particularly if revised for a subsequent
reprinting.
        First, although the first chapter serves as an introduction,
it is quite short and offers only three pages of actual content; as a
result, an opportunity was missed to discuss best practices within
corpus linguistics (see e.g. Biber 2006, McEnery and Wilson 2001,
McEnery and Hardie 2011, and Stefanowitsch 2020), quantitative corpus
linguistics (see e.g. Desagulier 2017, Gries 2016, and Johnson 2008),
and/or dialectology more generally. Similarly, the tenth chapter takes
the place of a conclusion, though it is not formally demarcated as
such. The introduction also takes for granted the value of corpora in
linguistic research, a view not shared by all, most notably
through Chomsky’s (1964) statement that “[...] a direct record—an
actual corpus—is almost useless, as it stands, for linguistic analysis
of any but the most superficial kind” (p. 36). Thus, an expanded
introduction and a dedicated conclusion would more effectively ‘frame’
the other chapters and help preempt substantial criticism of the
methodology employed throughout.
        Second, while the distribution of chapters into four sections
was both meaningful and important, its actual execution was less
effective. This is because many of the chapters are themselves case
studies, regardless of their inclusion in or exclusion from the
so-named section in this volume. Moreover, only the tenth
chapter is situated within the section dedicated to theory, suggesting
to the reader that the theoretical implications of the prior chapters
are less worthy of similar space. Additionally, the chapters focus
almost exclusively on a single branch of Indo-European,
rendering the volume’s title too broad to be representative. With a
clearer demarcation of chapter-specific objectives, this volume could
make an incredibly compelling case for the usage of corpus methods in
dialectology more broadly.
        Third, there are numerous points where formatting conventions
are inconsistent. For instance, interlinear glosses would greatly
benefit from the presence of some limited white-space to aid in the
readerly experience, as it appears initially that free translations
are missing when two variations of the same utterance are offered.
Similarly, the eighth chapter contains neither interlinear glosses nor
free translations, thus requiring the reader to have enough knowledge
of Spanish to read a volume intended, at least based on the current
title, for anyone with an interest in corpus dialectology.
        Despite these areas for improvement in a revised edition,
this volume does successfully engage standardized and non-standardized
linguistic varieties in a range of spoken and written corpora, for
which the authors should be acknowledged. Additionally, the editors
should be recognized for the unimaginable effort that it must have
required to coordinate among some two dozen scholars in undertaking
this project and bringing to light such interesting findings in
variation through phonological and morphosyntactic accounts of these
linguistic varieties. Finally, the third chapter in this volume
offered a genuine ‘breath of fresh air’ through the authors’ treatment
of self-reflection and ethics, two areas of growing importance within
scholarship across all disciplines.

REFERENCES

Biber, Douglas. 2006. University Language: A Corpus-Based Study of Spoken
and Written Registers. Philadelphia, PA: John Benjamins Publishing
Company.

Chomsky, Noam. 1964. The Development of Grammar in Child Language:
Discussion. Monographs of the Society for Research in Child
Development 29(1). 35-42.

Desagulier, Guillaume. 2017. Corpus Linguistics and Statistics with R:
Introduction to Quantitative Methods in Linguistics. New York, NY:
Springer.

Gries, Stefan Th. 2016. Quantitative Corpus Linguistics with R: A
Practical Introduction. New York, NY: Routledge.

Johnson, Mark. 2008. Essential Python for Corpus Linguistics. Hoboken,
NJ: Blackwell.

McEnery, Tony and Andrew Wilson. 2001. Corpus Linguistics: An
Introduction. Edinburgh, Scotland: Edinburgh University Press.

McEnery, Tony and Andrew Hardie. 2011. Corpus Linguistics: Method,
Theory and Practice. Cambridge, UK: Cambridge University Press.

Stefanowitsch, Anatol. 2020. Corpus Linguistics: A Guide to the
Methodology. Berlin, DE: Language Science Press.

ABOUT THE REVIEWER

Troy E. Spier is Assistant Professor of English and Linguistics at
Florida A&M University. He earned his MA and Ph.D. in Linguistics at
Tulane University, his B.S.Ed. in English/Secondary Education at
Kutztown University, and a graduate certificate in Islamic Studies at
Dallas International University. His research interests include
language documentation and description, discourse analysis, corpus
linguistics, and linguistic landscapes.


