28.2935, Review: Applied Linguistics; Discourse Analysis; Language Acquisition; Pragmatics; Text/Corpus Linguistics: Dobrić, Graf, Onysko (2016)

Wed Jul 5 13:52:01 UTC 2017

LINGUIST List: Vol-28-2935. Wed Jul 05 2017. ISSN: 1069 - 4875.

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté,
                                   Michael Czerniakowski)
Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Clare Harshey <clare at linguistlist.org>
================================================================

Date: Wed, 05 Jul 2017 09:51:57
From: Luciana Forti [Luciana.forti at unistrapg.it]
Subject: Corpora in Applied Linguistics

Book announced at http://linguistlist.org/issues/27/27-3774.html

EDITOR: Nikola Dobrić
EDITOR: Eva-Maria Graf
EDITOR: Alexander Onysko
TITLE: Corpora in Applied Linguistics
SUBTITLE: Current Approaches
PUBLISHER: Cambridge Scholars Publishing
YEAR: 2016

REVIEWER: Luciana Forti, Università per Stranieri di Perugia

SUMMARY

This volume contains eight studies stemming from the Klagenfurt Conference of
Corpus-Based Applied Linguistics (CALK14). 

The first contribution by Marcus Callies, entitled “Research on L2 Pragmatics
at a conceptual and methodological interface” (pp. 9-31), presents a case study
situated at the intersection between the pragmatics, syntax and discourse
interface, and the SLA and LCR interface. The study is based on an analysis of
demonstrative clefts, which Callies defines as “a syntactic means of
information highlighting located at the interface of syntax and
discourse-pragmatics” (p. 15), and operationalises as “all instances of that
and this followed by a form of be (‘s, is, was) and a wh-word (what, when,
why, where, how)” (p. 17). 
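
The operationalisation quoted above can be read almost directly as a search pattern. The following is a minimal sketch of that reading, not the query Callies actually ran: the regular expression, the function name find_clefts and the sample sentence are all assumptions made for illustration, and any hits would still be candidates requiring manual checking.

```python
import re

# Pattern following the quoted operationalisation: "that"/"this",
# then a form of be ('s, is, was), then a wh-word.
# Assumption: plain orthographic text with straight or curly apostrophes.
CLEFT = re.compile(
    r"\b(that|this)(?:\s+(?:is|was)|['’]s)\s+(what|when|why|where|how)\b",
    re.IGNORECASE,
)

def find_clefts(text):
    """Return the matched candidate strings in order of appearance."""
    return [m.group(0) for m in CLEFT.finditer(text)]

# Invented sample; the last sentence contains "That ... is" but no wh-word.
sample = "Well, that's what I meant. This is why we left. That book is old."
print(find_clefts(sample))
```

A pattern of this kind only retrieves surface strings; deciding which discourse function each hit serves remains a qualitative step.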

The aim of the study is to compare the speech of native and non-native
speakers of English in terms of differences in frequency of use, range of
discourse functions, and L1 effects. Callies uses three corpora: the French
and German sections of LINDSEI (Louvain International Database of Spoken
English Interlanguage), respectively named LINDSEI-F and LINDSEI-G, and the
comparable LOCNEC (Louvain Corpus of Native English Conversation). 

French learners are found to use fewer demonstrative clefts and with a more
restricted functional spectrum compared not only to natives, but also to
German learners, who display all discourse functions used by native speakers,
though less frequently. Since demonstrative cleft constructions are
dispreferred in both French and German, L1 influence may be an explanatory
factor for the results based on the French subcorpus, but not for those based
on the German subcorpus. On the other hand, the time spent learning English is
much longer in LINDSEI-G than in LINDSEI-F. As a result, Callies concludes that
the overall explanatory factor for the differences observed in the two
language groups may be the number of years spent studying English at school. 

The second contribution, “A focus of pragmatic competence: the use of
pragmatic markers in a corpus of Business English textbooks” (pp. 33-51), by
Peter Furkó is a replication study, based on a previous publication (Furkó &
Mónos, 2013). The main aim of the study is to evaluate the treatment, in terms
of frequency of occurrence and quality of occurrence, that two pragmatic
markers (PMs), well and of course, display in textbooks when compared to a
reference corpus containing naturally-occurring discourse. Firstly, the author
conducts a qualitative analysis of well and of course based on previous
literature, in order to identify the functional range of their uses in
naturally-occurring spoken discourse; five super categories are defined for
both units of analysis. In accordance with these categories, quantitative data
is presented and discussed. 

The textbook corpus from the 2013 study comprised textbooks published between
1987 and 2006, while the corpus used in the present study contains textbooks
published between 2000 and 2011. 

The more recent corpus displays a slightly higher proportion of attention
devoted to well and of course (47.9% compared to 44% in the 2013 study), but
these percentages are still much lower than those based on
naturally-occurring discourse. The functional spectrum of PMs observed in
both textbook corpora has remained unchanged; however, the author hypothesises
that the higher the occurrence of PMs, the higher the likelihood of them being
used with three or more super-functions. Finally, the PM of course is found to
be characterised by an utterance-initial position in most of its occurrences
in the reference corpus, while in both textbook corpora it is found to be
described mostly in its medial-final utterance position. 

“Written summarisation for academic writing skills development: a corpus-based
contrastive investigation of EFL student writing” (pp. 53-77), by Gyula Tankó,
is the third study presented in the volume. It aims to evaluate the role of
task effects on the kind of language elicited, by comparing the effect of
writing an academic essay versus a guided summary. First, the researcher
identifies the characteristic features of academic prose in terms of syntactic
features (prevalence of nominalisation, coordination and use of the passive
voice) and of lexical features (higher density of lexical words, adjectives,
linking adverbs, etc.), through a literature review of previous studies. Then,
50 first year BA English major students, with varying proficiency levels, are
asked to write a short independent argumentative essay as well as a guided
summary task. The resulting corpus, 13,903 tokens, is analysed in relation to
23 syntactic complexity indices, and 25 lexical complexity indices.
Statistical tests are employed in order to determine significant differences
between the two subcorpora, the existence of correlations between the essay
and summary syntactic and lexical complexity indices, and the use of academic
texts in the two types of texts. 

The results ultimately indicate that the typical features of academic prose
are more prominently elicited via a guided summary task, rather than an essay.
While warning against possible theme effects, which were not controlled for in
this study and which may affect the obtained results, Tankó indicates the
implications of the study for EAP pedagogy and testing, as well as
highlighting the need for further research. 

The fourth contribution is by Günther Sigott, Hermann Cesnik and Nikola Dobrić
and it is entitled “Refining the scope-substance error taxonomy: a closer look
at substance” (pp. 79-94). The aim of the study is to establish the
effectiveness of an error coding taxonomy, by determining the extent of
agreement amongst a group of error annotators. The taxonomy, originally
formulated by Lennon (1991), as the authors discovered after developing their
own in 2014 (Dobrić & Sigott, 2014), relies on the notions of scope and
substance. In the authors’ words, “Scope refers to the amount of context that
is necessary in order for an error to become perceptible. Substance, by
contrast, refers to the amount of text that needs to be changed so that the
error will disappear” (p. 80). In order to create a coding system for the
annotation of errors, with special reference to the substance dimension, the
authors take into consideration four textual units beyond the word (i.e.
phrase, clause, sentence, text), as well as punctuation, thus arriving at a
taxonomy of 14 error types. These had already been identified in a previous
study, which is not cited in this volume (Dobrić, 2015). A group of thirteen
corpus linguistics students served as annotators of five texts produced by
Austrian learners of English. 

Overall, the study indicates a low rate of agreement amongst annotators, which
the authors discuss in light of two kinds of possible factors: those related
to annotators, who may have had an inadequate command of the language or may
have had different perceptions as to what constitutes a norm, and those
related to the taxonomy itself, which may be lacking in clarity for dealing
with difficult phenomena that may arise in error analyses. 
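
To make the notion of agreement concrete, the sketch below computes raw pairwise agreement, i.e. the share of annotator pairs that assign the same category to an item, averaged over items. It is only an illustration: this review does not state which agreement measure the authors used, the labels and the function name pairwise_agreement are invented, and the figure is not corrected for chance agreement as a kappa-type coefficient would be.

```python
from itertools import combinations

def pairwise_agreement(annotations):
    """Mean proportion of annotator pairs assigning the same label per item.

    annotations: list of items, each a list of labels (one per annotator).
    """
    per_item = []
    for labels in annotations:
        pairs = list(combinations(labels, 2))
        same = sum(1 for a, b in pairs if a == b)
        per_item.append(same / len(pairs))
    return sum(per_item) / len(per_item)

# Hypothetical substance labels from three annotators for four flagged errors.
toy = [
    ["word", "word", "word"],        # all 3 pairs agree
    ["word", "phrase", "word"],      # 1 of 3 pairs agrees
    ["clause", "phrase", "word"],    # no pair agrees
    ["phrase", "phrase", "phrase"],  # all 3 pairs agree
]
print(round(pairwise_agreement(toy), 2))
```

With thirteen annotators, each item would yield 78 pairs instead of 3, but the calculation is otherwise the same.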

The role of spoken metadiscourse in intercultural contexts is at the centre of
Hermine Penz’s contribution “The uses and functions of metadiscourse in
intercultural project discussions on language education” (pp. 95-119). By
defining metadiscourse as “discourse about the evolving discourse” (p. 98),
the study aims at identifying the types and functions of metadiscoursive
strategies, in terms of frequency and variation, and at establishing whether
there is a connection between interactivity and the kind of metadiscourse
employed. In order to do this, the researcher analyses the production of two
discussion groups, the first consisting of 4 participants talking about the
topic “Language at the work place” and the second consisting of 6 participants
talking about the topic “Intercultural communication in teacher education”. 

After the types of metadiscoursive units are qualitatively categorised, a
quantitative analysis is conducted on the corpus resulting from the data
collection, which contains about 14 thousand and 22 thousand tokens for the
two groups respectively. The quantitative analysis is based on raw frequencies and
percentages. The results indicate the use of similar metadiscourse functions
for both groups, while observing a certain degree of variation within the
single activities, in terms of frequency and types of metadiscourse. 

Olga Grebeshkova’s contribution, entitled “Does code-switching exist in
personal writing,” constitutes the sixth study presented in this volume (pp.
121-144). It aims to describe code-switching in a specific type of text:
personal writing, which may be defined as the act of producing texts where
“the author and the reader are the same person” (p. 124). This appears to be
an underexplored area in research: in tracing the lines of background
literature, Grebeshkova cites examples based on intra-sentential
code-switching from Tolstoy’s “War and Peace,” or studies based on
conversational code-switching (p. 122). This study aims to evaluate the extent
to which existing models of description developed for analysing code-switching
in speech are applicable to the analysis of personal writings. The study is
based on the collection of 83 examination notes from French students, and 83
examination notes from Russian students, all having a high proficiency level
in English. In the former sample group, the cases of code-switching found were
18, while in the latter they were 25, for a total of 43 notes. The analysis of
the texts is conducted according to two parameters: the first one relates to
Sebba’s language content relationships of multilingual texts (p. 138); the
second one, to the use of intra-sentential and inter-sentential
code-switching. In relation to both parameters, the two groups display
opposing trends, thus making it difficult to describe the phenomenon in terms
of common features of development. 

The seventh study is by Vesna Lazović and it is entitled “Frequency analysis
of trigger words and money-based expressions in British and Serbian bank
offers” (pp. 145-163). It applies a corpus-based methodology to a contrastive
analysis between two native languages. The author creates a corpus based on
texts found on the websites of 65 different banks, 33 Serbian and 32 British,
for a total of about 43 thousand tokens (about 14 thousand Serbian, and about
30 thousand British). The study analyses and compares three aspects of lexical
use in the two corpora. 

First, the most frequent words. The two frequency lists reflect cross-cultural
differences in terms of different products being offered: the British
subcorpus shows an emphasis on mortgages and fixed rates, as opposed to the
Serbian subcorpus showing an emphasis on loans or payments in instalments. 

Second, the study analyses the quality and quantity of trigger words and
money-saving expressions, finding that they recur slightly more often in the
Serbian subcorpus. Third, the lexical strategies used to express some form of
restriction to the offer presented are analysed. In this case, the comparison
seems to reveal a marked difference between the two subcorpora: in terms of
normalised absolute frequencies, British banks use restrictions 12.30 times,
while Serbian banks do so only 3.66 times. 

Branka Drljača Margić and Irena Vodopija-Krstanović conclude the volume with
their study entitled ‘“I use English, but if need be I’m fluent in German as
well”: Croatian Business professionals’ use of English and other languages’
(pp. 165-186). The study aims at evaluating the role of English in the
Croatian business environment, in terms of its use and perceived status and
importance. In order to do this, the researchers ask a sample of 117 business
professionals to respond to an online questionnaire, made up of five parts built
to gain data about: field of work, mother tongue and English proficiency
level; use and perceived status of English in their jobs; corporate languages
used in respective companies; opinions about the ideal native speaker, and
whether nativeness facilitates or hinders communication in business; the
extent to which they agree with a series of statements. The findings indicate
the primacy of English as a Lingua Franca in the business corporate sector,
without disregarding the use of other languages if needed, a perceived need to
further English language education in the corporate field, and that although
it is not deemed indispensable to attain native-like English proficiency,
close to native-like proficiency is seen as a factor that is able to
positively influence the image of a business professional in the corporate
field. 

EVALUATION

The first feature of the volume catching one’s attention is its title.
“Corpora in Applied Linguistics” is, in fact, Susan Hunston’s classic volume
published in 2002 and focused on the potential of corpus-based descriptions of
language in contexts of second language teaching, with brief accounts related
to other areas of Applied Linguistics (Hunston, 2002). In the volume under
evaluation, however, we find an addition to the original title: current
approaches.

The reader is thus inevitably led towards a few basic though specific
expectations. Firstly, that all contributions deal with corpora, i.e. large
collections of texts that are authentic, representative and in electronic
format. Secondly, that all contributions deal with studies relating to second
language acquisition, or other related areas. 

The volume opens with a sound study by Marcus Callies on demonstrative cleft
constructions, which applies the principles of CIA, Contrastive Interlanguage
Analysis. The aim of the author is not only to present the findings of the
study, but also to use them to draw attention to the potential that learner
corpus research has within the study of L2 pragmatics. The concluding remarks
about the most likely explanation of the results obtained rely, in fact, on the
corpus metadata. Thanks to the way in which the corpus was designed, the
descriptive analysis can be substantiated with the analysis of the variables
displayed. 

The only shortcoming of this first paper is that references to previous work
by the author are not included in the bibliography, which makes it impossible
for the reader to know that, in fact, this study is not new, but was
originally published in the same form in 2013, alongside another case study
regarding the use of emphatic do (Romero-Trillo, 2013, pp. 18–19; 25–35). 

Furkó’s study continues along the path of expanding the research agenda
pertaining to pragmatics in the field of second language acquisition, thus
going beyond the sole focus on speech acts. The literature review aimed at
describing ‘well’ and ‘of course’ in terms of their pragmatic functions is
sound and serves the purpose of identifying the qualitative categories that
are necessary in order to conduct the quantitative analysis based on the
textbook corpus. Moreover, it is informed by a corpus-based analysis of ‘well’
and ‘of course’ using the Larry King Corpus, a corpus made of transcriptions
of a popular TV show that the author compiled and analysed for the same
purpose at the time of his PhD research (Furkó, 2005).

However, on more than one occasion, a misleading assumption seems to be made:
that all of the learners’ input derives from textbooks. One may argue that the
unit of learning is the lesson, and not the textbook, and that teachers
frequently plan a lesson by integrating textbook content with other activities
that they may invent or take from resource books. Secondly, textbooks may come
with audio components, in which case, the analysis would have to be extended
to the audio transcriptions as well, which are either included in the student
textbook or in the teacher’s book. This aspect does not seem to be specified
in the paper. 

Since the corpus is considered as a whole, the two tables reporting on the
quantitative analysis conducted in the 2013 paper and in the present one (pp.
43, 45) would perhaps have benefited from aggregated quantitative measures,
such as means and dispersion rates. The tables, instead, provide only absolute
occurrence values, along with percentages (D-values). 

Tankó’s study on the syntactic and lexical characteristics of learner prose
elicited via two different tasks is one of the most interesting in the volume.
It is grounded in a solid theoretical framework and a sound preliminary
qualitative analysis, and provides a comprehensive quantitative analysis by
taking into consideration a number of different measures, which are ultimately
able to create an integrated picture in response to the proposed research
question. The combination of descriptive as well as inferential statistics,
along with the detailed description of the corpus used, makes the contribution
stand out. The implications for pedagogy and language testing are made clear,
as well as the shortcomings that may be addressed by future studies. The only
shortcoming that does not seem to be mentioned is the need to create larger
corpora to conduct similar studies, as almost 14 thousand tokens may be
insufficient to make solid generalisations regarding the results. 

The fourth study by Günther Sigott, Hermann Cesnik and Nikola Dobrić, based on
analysing inter-annotator agreement in relation to the application of an error
coding system, is particularly valuable in its methodology because it unveils
the difficulties of error analysis and annotation and, as a result, of
analysing interlanguage as a whole. The results that the study comes to,
indicating a low degree of agreement amongst the annotators, may be due to
many reasons which are partly discussed by the authors themselves. The issue
of the norm is a central one in second language acquisition studies, and the
continuum between correctness and incorrectness is very often made of a series
of intermediate areas. 

Valuable as the study is, it is not clear how it fits with the title
of the volume. There is no mention of the concept of corpus in the study, unless
one assumes that the corpus is represented by those five texts that the
annotators are required to analyse. If we go back to the definition of corpus
as a large collection of texts that are authentic, representative and in
electronic format, we see that the implied notion of corpus emerging from the
present study falls short.  

Penz’s study of metadiscourse helps to shed light on the ways in which
metadiscourse is used in intercultural communication contexts, in the field of
language education studies. The study adds to the pragmatic interest
significantly manifested in the volume so far and does so by providing a sound
qualitative analysis upon which the quantitative analysis bases itself. 

The only minor shortcoming concerns, perhaps, the fact that the data regarding
the two groups of participants, whose productions make up the two corpora
being analysed, are not normalised. Percentages are given for each raw
frequency value, but they seem to be of little help because they do not
provide a common ground for comparing two corpora that differ considerably in
size, one being around 14 thousand tokens and the other around 22 thousand
tokens. 
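
The point about normalisation can be illustrated with a small calculation. In the sketch below, the corpus sizes are the approximate figures reported above, whereas the raw counts and the function name per_thousand are invented purely for illustration; they are not taken from Penz's data.

```python
def per_thousand(raw_count, corpus_tokens, base=1000):
    """Relative frequency of a feature per `base` tokens."""
    return raw_count * base / corpus_tokens

# Corpus sizes are the approximate figures given in the review;
# the raw counts (70 and 88) are hypothetical.
group_a = per_thousand(70, 14_000)
group_b = per_thousand(88, 22_000)
print(group_a, group_b)
```

In this invented case the larger corpus has the higher raw count (88 against 70) but the lower relative frequency (4 against 5 occurrences per thousand tokens), which is precisely the kind of difference that raw counts, or percentages computed within each corpus, can hide.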

Grebeshkova’s interest in personal writing interestingly stems from her own
experience as a bilingual writer of personal notes (p. 124). As she points out
at the beginning of the article, the study is a work-in-progress, but at the
same time seems to derive from her uncited doctoral dissertation, “Written
code-switching in the note taking of second-language learners in bilingual
classroom environments” (Grebeshkova, 2016). The attention devoted to this
particular kind of writing is certainly valuable for the field of studies
related to code-switching, especially with regard to the
possibilities of widening the scope of the empirical basis upon which such
studies are based, in terms of text variety and text medium. The divergent
results that the study ultimately attains are discussed in light of two
possible causes: first, different exercises and different pedagogical
traditions that the students are accustomed to may have affected the extent
and nature of code-switching; second, the oralised structure of code-switching
may play a role in how this takes place in personal writing. Unfortunately, it
is not clear whether metadata regarding the students producing the text were
collected, i.e. information about the sociolinguistic background
characterising each student. This kind of information may help in further
interpreting the results obtained so far, by continuing and deepening the
quantitative analysis. Here, the notions of ‘corpora’ and ‘applied
linguistics’ are implied in their broadest meaning. 

The seventh study, by Vesna Lazović, unites corpus-based discourse analysis
with contrastive analysis, thus contributing to widening the meaning of
applied linguistics which is implied in the present book. It is not clear
where the list of trigger words used in the analysis is taken from. The
research provides valuable data indicating the variable according to which the
main cross-cultural differences between Serbian and British bank advertising
are observable, namely the use of restrictions in offering certain products.
In the concluding remarks, Lazović provides an overall picture of the study
through a useful table that summarises the main findings. Although more
sophisticated statistical analysis could be performed in order to establish
the significance of the results, and even larger corpora of this kind could be
built, the study is an example of the usefulness of such comparisons in
cross-cultural studies. 

The last study presented in the volume, conducted by Branka Drljača Margić and
Irena Vodopija-Krstanović, confirms that English as Lingua Franca in the
business field is the main language used and to be used. It is certainly
useful in order to underline the importance of English language skills in the
corporate sector, which implies the need to invest in specialised teaching
courses and specialised training courses for English teachers. Interestingly,
the study indicates that the status of ELF does not hinder the use of national
languages or other languages whenever the need arises, in contrast with the
fears that we read about in newspapers or even in some research. However, it
is not clear how the study fits in the volume. The data collection tool used
in this study is a questionnaire, and the aim of the study is to investigate
the perception and use of ELF amongst a sample of speakers in a specific
working sector. There is no use of corpora and, again, the expression applied
linguistics seems to be considered in its broadest meaning. 

Overall, the volume presents eight interesting studies, which reflect the
topic indicated by the title with varying degrees of relevance. As we have
seen, not all studies are corpus-based, and not all studies deal with second
language acquisition. More specifically, five make clear use of corpus
linguistics methods, while three do not; on the other hand, five studies
pertain to the field of second language acquisition, one to bilingualism, one
to contrastive linguistics, and one to ELF. With regard to the studies that
are explicitly corpus-based and focused on SLA, the volume is a testimony to
one of the characteristics of corpus
linguistics so far, namely the fact that it is still mostly focused on English
language learning, with little space devoted to studies dealing with the
acquisition of other L2s. 

Inspired by the principles stated in Hunston’s publication from 2002, this
volume takes a number of different directions both methodologically and
conceptually. It is, of course, a worthwhile read for specialists in the
field who are interested in widening the scope of corpus linguistics by reflecting on
areas in which corpus linguistics methods may be employed. 

REFERENCES

Dobrić, N. (2015). Quality Measurements of Error Annotation-Ensuring Validity
Through Reliability. The European English Messenger, 24, 36–42.

Dobrić, N., & Sigott, G. (2014). Towards an error taxonomy for student
writing. Zeitschrift Für Interkulturellen Fremdsprachenunterricht, 19(2),
111–118.

Furkó, B. P. (2005). The pragmatic marker - discourse marker dichotomy
reconsidered - the case of well and of course. Unpublished PhD thesis.

Furkó, B. P., & Mónos, K. (2013). The teachability of communicative competence
and the acquisition of pragmatic markers–a case study of some widely-used
Business English coursebooks. Argumentum, 9, 132–148.

Grebeshkova, O. (2016). Written code-switching in the note taking of
second-language learners in bilingual classroom environments. Unpublished PhD
thesis.

Hunston, S. (2002). Corpora in applied linguistics. Cambridge: Cambridge
University Press.

Romero-Trillo, J. (Ed.). (2013). Yearbook of Corpus Linguistics and Pragmatics
2013 (Vol. 1). Dordrecht: Springer Netherlands.

ABOUT THE REVIEWER

I am a PhD candidate at the University for Foreigners of Perugia, Italy. My
research project deals with the use of corpora in Italian as a second language
learning and teaching, with a focus on the acquisition of collocations by
Chinese native speakers. It involves the creation of a corpus informed
syllabus, followed by an experimental evaluation of its effectiveness. I am
interested in the corpus-based analysis of Italian and English learner
language, and in the design of corpus-based pedagogical materials and
activities. I am also a CELTA qualified EFL teacher.
