31.1618, Review: Computational Linguistics; Discourse Analysis; Text/Corpus Linguistics: Callies, Levin (2019)
linguist at listserv.linguistlist.org
Fri May 15 01:41:15 UTC 2020
LINGUIST List: Vol-31-1618. Thu May 14 2020. ISSN: 1069 - 4875.
Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Lauren Perkins, Nils Hjortnaes, Yiwen Zhang, Joshua Sims
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Editor for this issue: Jeremy Coburn <jecoburn at linguistlist.org>
Date: Thu, 14 May 2020 21:40:30
From: Shuyi Sun [shuyi.amelia.sun at uq.net.au]
Subject: Corpus Approaches to the Language of Sports
Book announced at http://linguistlist.org/issues/30/30-3538.html
EDITOR: Marcus Callies
EDITOR: Magnus Levin
TITLE: Corpus Approaches to the Language of Sports
SUBTITLE: Texts, Media, Modalities
SERIES TITLE: Corpus and Discourse
PUBLISHER: Bloomsbury Publishing (formerly The Continuum International Publishing Group)
YEAR: 2019
REVIEWER: Shuyi Amelia Sun, The University of Queensland
To date, the world of sports has witnessed fundamental changes with regard to
a diversification of sports types and events, an increasing commercialization
and globalization of major spectator sports, and an ever-increasing public
attention and intensive coverage in various media and modalities. Despite this
increasing popularization, studies of the language and discourse of sports are
not only relatively heterogeneous in nature, but also scattered across
different academic disciplines. In the field of applied linguistics, there is
not a single, specialized journal that deals with sports discourse to date,
though several journals in the larger field of social studies serve as such
research outlets. In addition, the emergence of new sports genres in the age
of computer-mediated communication (CMC; Herring, 1996) has opened up an
innovative way of studying sports discourse, i.e. by means of large online
electronic resources, while at the same time, linguistic research has greatly
benefitted from corpus-based and corpus-driven investigations of real-world
language thanks to the compilation and accessibility of computer corpora and
software tools.
Accordingly, this timely volume Corpus Approaches to the Language of Sports:
Texts, Media, Modalities, edited by Marcus Callies and Magnus Levin, brings
together innovative empirical studies that adopt a usage-based perspective and
use corpus data and corpus linguistic methods to examine language occurring in
a variety of genres and pragmatic contexts of different types of sports. The
editors attempt to extend the scope of applied linguistic research on sports
beyond football/soccer, which has been very much at the center of attention.
Furthermore, they aim to advance the scope of corpus linguistic research more
generally by throwing light on both the potential and the necessity of
exploring sports language in association with its accompanying audio-visual
modes of communication from a multimodal perspective. Given this scope, the
volume is expected to be of great interest to a broad readership,
including those researchers working on (sports) discourse analysis and corpus
linguistics, or even in the larger fields of applied linguistics and social
Structurally, the volume comprises an introductory chapter and ten empirical
studies, where the introductory chapter (Chapter 1) lays down the theoretical
and methodological contexts required to appreciate the subsequent studies,
while the following ten empirical studies (Chapters 2-11) are divided into
three parts. Part one ‘Texts. Contrastive and Comparative Aspects of the
Phraseology of Football Match Reports’ (Chapters 2-4) explores the phraseology
of football reporting across different text types and languages by adopting a
comparative/contrastive linguistic approach. Part two ‘Media. Expanding the
Scope of Research to New Contexts of Use’ (Chapters 5-8) extends the existing
research to new media and to relatively neglected sports discourse outside of
football. The final part ‘Modalities. Multimodal Studies’ (Chapters 9-11)
addresses sports language from a multimodal perspective which has rarely been
applied to the language of sports.
The volume begins with the introduction (Chapter 1 penned by the editors),
which lays the foundation for the following parts by briefly reviewing
sports-related linguistic research with a view to main research topics and
recent trends, presenting new research initiatives and resources, and finally
introducing the rest of the chapters in this volume. Over the last few
decades, football/soccer has largely dominated the research agenda with an
extensive focus on sports reporting, including such aspects as structural
linguistics, linguistic borrowing, metaphors, and diachronic studies to trace
the history of reporting genres. Outside of sports reporting, several studies
have analyzed the sociolinguistic aspects and significance of football chants.
Although previous research appears to have become more diverse and
interdisciplinary, it has mostly been limited in approach (monomodal corpus
analysis drawing only on the textual level) and in scope (football/soccer).
The editors then present useful resources, especially the Innsbruck Football
Research Group and Simon Meier’s electronic corpora of cross-linguistic
football reporting (Meier, 2017), and introduce a recently initiated research
network, ‘Applied Linguistics in Sport’. Finally, the editors introduce each
of the following chapters and explain how they are arranged into the three
broader parts of the present volume.
To come to the first part, Chapter 2 (Simon Meier) studies authors’ strategies
to produce online football coverage while meeting the challenge of reconciling
linguistic routines and emotional involvement under high time pressure. Meier
adopts data-driven corpus-linguistic methods to investigate two types of
formulaicity, namely recurrent schematic constructions and idioms. A large
corpus of over 12 million tokens was built based on German and English data
from two online football-related genres: live text commentaries (LTC) and
match reports (MR). The results suggest that the production of online football
coverage oscillates between preconstructed patterns and word-for-word
combinations. More specifically, syntactic patterns and idioms serve as text
routines to present texts in a cross-cultural register-specific way that is
tied to the communicative and social needs in the domain of sports coverage,
while there are still enough open choices to modify these ‘templates’ so as to
demonstrate creativity and make narratives more appealing. As the author
argues, such findings provide evidence for what Sinclair (1991) described as
the alternation between the ‘idiom principle’ (preconstructed multi-word
combinations) and the ‘open choice principle’ (word-for-word combinations) in
the production of texts (Erman & Warren, 2009).
Chapter 3 (Signe Oksefjell Ebeling) explores the English-Norwegian Match
Report Corpus (ENMaRC) to address a gap: little contrastive research between
English and Norwegian has been done on more specific and homogeneous
non-fictional text types, such as the football match reports examined in the
present study. Relying on ENMaRC, with the Premier League (PL) match reports
reaching more than 500,000 tokens and the ‘Eliteserie’ (ES) match reports
reaching roughly 155,000 tokens, the author applies corpus-driven extraction
methods in the forms of word lists, n-gram lists and keyword lists. Results of
this study show that, on the one hand, post-match reports in the two languages
under study are similar to other text types in the use of time and space
expressions; on the other hand, there are cross-linguistic differences when
reporting on victories and defeats. Given the pioneering observations based on
ENMaRC, this study is impressive in its exploratory nature, and it offers rich
food for thought and several avenues for future research. At the same time,
as the ENMaRC is still under construction, the corpus itself could have
prevented the author from offering more in-depth studies of specific
linguistic tendencies.
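As a rough illustration of the corpus-driven extraction steps named above (word lists, n-gram lists and keyword lists), the following minimal Python sketch may help newcomers. The two toy sentences are invented and do not come from ENMaRC, and the log-likelihood keyness measure is an assumption chosen because it is widely used in corpus linguistics; the review does not state which statistic the chapter applies.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Deliberately simple tokenizer (an assumption): lowercase runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    # Slide a window of size n over the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_likelihood(a, b, total_a, total_b):
    # Unsigned log-likelihood keyness for a word occurring a times in the
    # study corpus (total_a tokens) and b times in the reference corpus (total_b tokens).
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_a)
    if b > 0:
        ll += b * math.log(b / expected_b)
    return 2 * ll


# Toy texts standing in for two sub-corpora (invented for illustration only).
study = "the home side took the lead before the break and held on for the win"
reference = "the visitors lost the match after the break and the fans went home"

study_tokens = tokenize(study)
ref_tokens = tokenize(reference)

word_list = Counter(study_tokens)               # word (frequency) list
bigram_list = Counter(ngrams(study_tokens, 2))  # n-gram list with n = 2
ref_counts = Counter(ref_tokens)

# Keyword list: study-corpus words ranked by keyness against the reference corpus.
keyness = {
    w: log_likelihood(c, ref_counts[w], len(study_tokens), len(ref_tokens))
    for w, c in word_list.items()
}
keywords = sorted(keyness, key=keyness.get, reverse=True)

print(word_list.most_common(3))
print(bigram_list.most_common(3))
print(keywords[:3])
```

With real data, the same three lists would be computed per sub-corpus (e.g. PL vs ES reports, or reports on victories vs defeats) and then compared.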
Chapter 4 (Rita Juknevičienė and Paulius Viluckas) presents a comparison
between human-mediated and computer-mediated football reports, since the
(dis)similarities between human- and computer-mediated football language
remain unknown. In this study, the two modes (computer- vs human-mediated
reports) are represented respectively by ‘Football Manager 2017’ (FM) reports
and BBC online football reports, and correspondingly two corpora were
compiled, each containing 200 texts and spanning around 20,000 tokens. The
researchers then use a corpus-driven approach to analyze keywords
that distinguish one mode from the other and the relationships regarding
lexical bundles. The results reveal a number of prominent differences in terms
of both individual lexemes (particularly among function words) and four-word
lexical bundles where only 11 out of the 200 most frequent lexical bundles are
shared by both corpora. In addition, this study shows a limited use of
conjunctions, cohesive and linking devices in computer-generated football
reports, which may explain a major cross-mode difference related to text
cohesion. However, findings of this study are not without shortcomings since
the analyzed corpora are still relatively small with only 20,000 tokens in
each sub-corpus, which may affect the generalizability of the present
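The comparison of the most frequent four-word lexical bundles across two corpora can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the simple tokenizer, the optional dispersion filter (`min_texts`) and the toy sentences are all hypothetical, and the review does not report the exact extraction criteria used in the chapter.

```python
import re
from collections import Counter


def four_word_bundles(texts, min_texts=1):
    # Count four-word sequences in a corpus and record in how many texts each occurs.
    freq = Counter()
    dispersion = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        grams = [tuple(tokens[i:i + 4]) for i in range(len(tokens) - 3)]
        freq.update(grams)
        dispersion.update(set(grams))
    # Keep only bundles that occur in at least min_texts different texts.
    return Counter({g: c for g, c in freq.items() if dispersion[g] >= min_texts})


def shared_top_bundles(corpus_a, corpus_b, top_n=200, min_texts=1):
    # Intersect the top_n most frequent bundles of each corpus.
    # Ties at the cutoff are resolved by Counter.most_common ordering.
    top_a = {g for g, _ in four_word_bundles(corpus_a, min_texts).most_common(top_n)}
    top_b = {g for g, _ in four_word_bundles(corpus_b, min_texts).most_common(top_n)}
    return top_a & top_b


# Invented toy corpora standing in for human-written and game-generated reports.
human = ["at the end of the first half", "at the end of the game"]
generated = ["at the end of the match", "the match ended in a draw"]

shared = shared_top_bundles(human, generated)
print(sorted(" ".join(g) for g in shared))
```

On corpora of the size described, a small intersection (such as the 11 shared bundles out of 200 reported above) would point to largely distinct phraseologies in the two modes.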
Part two, titled ‘Media. Expanding the Scope of Research to New Contexts of
Use’, begins with Chapter 5 (Turo Hiltunen), which explores framing in news
media accounts of cycling crashes. The motivation is that the language of
cycling crash reports remains largely unexplored, even though such reports
reflect and can shape the public’s and policy makers’ understanding,
attitudes, and behavior towards the sport (Rissel et al., 2010). Thus, this
chapter aims to identify the structure and functions of such reports, describe
the ways different social actors are represented, and investigate what is
identified as the cause of the crash and whether the cause is expressed
neutrally. To achieve these goals, Hiltunen applies a corpus-based approach to
study framing in these reports on the basis of a 79,000-word corpus of 230
English reports collected from the Internet, and identifies and discusses the
main textual functions and lexico-grammatical patterns from a
discourse-analytical perspective. His findings suggest that similar textual
strategies are employed for framing crashes and representing social actors,
although the shortest texts and most others still vary in the specific aspects
they stress. In addition, the findings highlight clear differences in the
representation of riders and drivers of motor vehicles involved in the crash;
this could be interpreted as evidence of a media bias against cycling.
Chapter 6 (Jukka Tyrkkö and Hanna Limatius) studies race radio interactions
between drivers and teams (i.e. race engineers) in Formula One, addressing the
fact that no previous linguistic research has touched upon race radio
interactions or any similar context of language use. Based on a newly compiled
corpus, which consists of 5,432 individual messages (63,183 tokens) from the
2016 and 2017 seasons of Formula One, the authors apply corpus-based
quantitative and qualitative methods to investigate the structure and
complexity of the dialogic turns, and present a breakdown analysis of the
stylometric markers of both driver and team broadcasts. The findings largely
support their assumption: the effects of stress are indeed observed in most of
the linguistic markers selected, though there are significant differences
among individual drivers and race engineers during a race. While the sampling
method for constructing the corpus could be ‘opportunistic’ (p. 117), the
present research takes a preliminary step into the language of Formula One
and provides heuristic implications for further studies.
Chapter 7 (Isabel Balteiro) investigates the English swearword f-expressions
(‘WTF’, ‘fucking’, and ‘fuck’) used by Spanish football followers in
spontaneous synchronic comments in online chats. The corpus
consists of over 390,500 authentic online messages and/or comments produced by
Spanish football followers between 2007 and 2018, manually compiled from the
comments sections and messages in chatrooms in the online version of the
Spanish sports newspaper Marca. However, the actual hits of the expletives are
unexpectedly low: there are only 144 examples in total (28 examples of
fuck, 16 examples of fucking and 100 examples of wtf) by 139 different users.
The findings show that the f-expressions used as code-switches by Spanish
football followers in chats have by and large lost their taboo status.
Instead, they are primarily used to contribute to organizing the sequentiality
of discourse (Li & Milroy, 1995) as well as to communicate speakers’ attitude
and mostly negative emotions. In addition, the distribution patterns of
f-expressions indicate that Spanish users replicate or imitate native uses of
those words, probably motivated by context and/or chat (in-group) norms. As is
claimed by the author, the main interests of this study lie in the functions
of these expletives as pragmatic markers and their interactional significance,
position, and distribution in football-related discourse. Hence, this study is
expected to provide interesting insights into both the language of football
followers and/or online communities and the cross-linguistic pragmatic role of
The last chapter in this part (Chapter 8 written by Miguel Ángel
Campos-Pardillos) addresses sports-related legal discourse by illustrating the
presence of metaphor in the description of sports fraud (e.g. bribing) and the
fight against it. The author manually extracted 203 metaphors associated with
sports fraud and the measures against it from nine academic studies (72,809
words in total), including four journal papers and five book chapters, all of
which deal with this topic from a legal or law enforcement perspective.
Following a qualitative approach, the author analyzes how scholars in the
field of law use metaphorical language to justify the fight against sports
fraud, and how the metaphorical discourse is influenced by its identification
with other criminal activities. His analysis shows that, on the one hand, some
metaphors have a clear and objective ontological basis which usually pertains
to the world of concrete objects; on the other hand, process and event
scenarios are employed to justify measures and actions that seem strict or
controversial at first but are eventually accepted as something inevitable
within the ‘war against fraud’ (p. 176). This chapter concludes with a
reminder of the fundamental role of metaphors in creating a discourse, and
thus of the need to be aware of the metaphors used both in sports law contexts
and in general discourse on sports.
In the final part, ‘Modalities. Multimodal Studies’, Chapter 9 (Valentin
Werner) explores the multimodal nature of football live text (FLT) and the
role of audience participation in the electronic medium, aiming at expanding
the description of sports/football discourse. Based on a corpus of 68 FLT
reports (around 160,000 tokens in total), Werner takes a multimodal approach
towards FLT as an artefact connecting offline and online practices. Accounting
for the combined linear and non-linear nature of live text commentaries, his
findings indicate that the genre increasingly taps into elements from various
external sources (e.g. information from a commercial statistics provider and
images) that also enable audience participation, thus emerging as a hybrid and
complex multimodal ensemble characterized by media convergence. While calls
for further studies of (F)LT as an under-studied form of journalism have
repeatedly been voiced from the angles of both linguistics (Hauser, 2008) and
media research (McEnnis, 2016), relying on a multimodal framework for the
analysis of this type of communication is undoubtedly an appropriate choice
due to the very multimodal nature of the artefact, which thus simultaneously
facilitates the exploration of issues such as media convergence.
Chapter 10 (Peter Crosthwaite and Joyce Cheung) studies multimodal discourse
practices in 4chan (www.4chan.org), an anonymous online community where users
post images and texts on a wide range of topics. The analyzed corpus consists
of eleven full threads (including 35,850 posts and 1,169 image posts) relating
to the Ultimate Fighting Championship’s (UFC) 2017-2018 New Year’s Eve
flagship event UFC 219 – Cyborg vs. Holm. The authors then apply a sentiment
analysis to the multimodal corpus’s text and images and focus on positive and
negative appraisals of action from the sports event as it occurs in real time,
as well as reaction images of the poster’s personal response to the event or
to other users’ reactions. Their primary goal is to quantify and characterize
the intermodality regarding 4chan posters’ juxtaposition of text, images and
videos while communicating their reactions to the event itself and to other
posters as the event unfolds, which is expected to reveal how the meanings
made in one mode are interwoven with the other to co-present and co-operate
during the event. The authors also examine how hyperlinks serve to
direct the sentiment of text and images to other specific posters on an
anonymous message board. According to the results, the general sentiment
conveyed in 4chan’s discussion facilitated by computer-mediated communication
is almost entirely negative, with a strong sense of both fear and disgust
expressed multimodally via texts and images. This negativity is directed at
the fighters, at other posters, and even to the self-identity of the posters
involved, and is accompanied by fantasies of white male dominance over other
races and genders. As such, results of this study shed light on the discourse
practices within a typically shady corner of the internet population (as
occupied by 4chan users), as well as contribute to a greater understanding of
online sports discourse as mediated by thousands of users in (semi-)
Finally, Chapter 11 contributed by this volume’s editors (Marcus Callies and
Magnus Levin) presents a comparative study of dislocation in live (i.e.
play-by-play) TV football commentary. The point of departure for this study is
the assumption that live TV sports commentary is a specialized register which
is characterized by largely unplanned discourse and shaped not only by the
time-critical nature of the action that unfolds on and off the field, but also
by what is visible on the TV screen. To test the assumption, a trilingual
comparative corpus-based study is conducted on 14,726 words of English, German
and Swedish transcripts of live TV commentaries of
the 2014 men’s football FIFA World Cup final between Germany and Argentina.
Findings support the authors’ preliminary argument that right dislocation
can be considered a register-specific, functionally motivated discourse
feature of live TV sports commentary. In addition, since there are no major
differences in the use of dislocation in sports commentary regarding
frequency or the distribution of the different discourse functions, the
authors suggest that future research has to determine to what
extent these cross-linguistic similarities hold true in general. It should be
noted that the present study highlights both the potential and the necessity
of examining language use in association with accompanying modes of
communication and visualization from a multimodal perspective.
On the whole, the present volume makes a strong contribution to corpus
linguistics and to the application of corpus linguistic methods to the
language of sports and related contexts. As the only volume dealing with
sports in the
Corpus and Discourse series, it offers innovative empirical studies that use
new corpus resources to showcase the structural-linguistic and discourse
aspects of a wide range of sports (e.g. football, cycling, motor racing),
genres (e.g. live commentary, post-match reports, legal texts) and contexts of
use (e.g. sports media, in-team communication). Considering the pioneering
investigations involved in each chapter, the volume is especially impressive
in its exploratory nature and rich implications for future research. In
addition, detailed corpus-linguistic research methods in each chapter make it
easier for both experienced corpus linguists and newcomers to immediately
apply these approaches to their research in corpus-based/driven (sports)
discourse analysis. Newcomers especially will benefit from the thorough
literature review of sports discourse research, which is expected to serve as
the theoretical foundations for their work.
Despite these strong points, the volume is not devoid of certain limitations,
the main one being the corpora analyzed. As mentioned, the scale of corpus
data in some chapters (e.g. Chapter 4) is relatively limited. Similarly, as
the ENMaRC in Chapter 3 is still under construction, the corpus itself could
have prevented researchers from offering more in-depth studies of specific
linguistic tendencies. The chapters that do provide some evidence are still
limited in the number of subjects and, thus, in the generalizability of their
results. Nevertheless, this shortcoming does not in essence detract from the
strength of the present volume. Indeed, it could be said that each chapter,
while filling one gap in the literature of sports discourse analysis,
simultaneously opens another avenue of academic research in the application of
corpus-based/driven methods to a wider range of real-world language contexts.
As such, the present volume serves as an indispensable step for future
research which, if based on larger corpora of sports language, will
corroborate the present observations and offer new insights into the
specificity of computer-mediated sports language.
Erman, Britt & Warren, Beatrice. 2009. The idiom principle and the open
choice principle. Text 20(1). 29-62.
Hauser, Stefan. 2008. Live-Ticker: ein neues Medienangebot zwischen
medienspezifischen Innovationen und stilistischem Trägheitsprinzip.
kommunikation @ gesellschaft 9(1). 1–10.
Herring, Susan (ed.). 1996. Computer-mediated communication: Linguistic,
social and cross-cultural perspectives. Amsterdam: John Benjamins.
Li, Wei & Milroy, Lesley. 1995. Conversational codeswitching in a Chinese
community in Britain: A sequential analysis. Journal of Pragmatics 23.
McEnnis, Simon. 2016. Following the action: How live bloggers are reimagining
the professional ideology of sports journalism. Journalism Practice 10(8).
Meier, Simon. 2017. Korpora zur Fußballlinguistik-eine mehrsprachige
Forschungsressource zur Sprache der Fußballberichterstattung. Zeitschrift für
germanistische Linguistik 45(2). 345-349.
Sinclair, John. 1991. Corpus, concordance, collocation. Oxford: Oxford
University Press.
Shuyi Amelia Sun is a postgraduate student in the Applied Linguistics-TESOL
program at the University of Queensland. Her research interests are in the
areas of (learner) corpus linguistics, English for Academic Purposes (EAP),
and quantitative text/data analysis using R. Her previous research experience
includes managing a project of the Student Research Training Program for
Colleges and Universities in China, publishing in academic journals and
conferences, and working as a reviewer for a linguistic journal. She aspires to
pursue a Ph.D. in the field of corpus linguistics in the future.