27.1078, Review: Applied Ling; Text/Corpus Ling; Translation: Boulton, Leńko-Szymańska (2015)

The LINGUIST List via LINGUIST linguist at listserv.linguistlist.org
Tue Mar 1 16:07:18 UTC 2016


LINGUIST List: Vol-27-1078. Tue Mar 01 2016. ISSN: 1069 - 4875.


Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry, Sara Couture)
Homepage: http://linguistlist.org

*****************    LINGUIST List Support    *****************
                   25 years of LINGUIST List!
Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Sara  Couture <sara at linguistlist.org>
================================================================


Date: Tue, 01 Mar 2016 11:06:53
From: Peter Crosthwaite [drprc80 at hku.hk]
Subject: Multiple Affordances of Language Corpora for Data-driven Learning

 
Discuss this message:
http://linguistlist.org/pubs/reviews/get-review.cfm?subid=36082238


Book announced at http://linguistlist.org/issues/26/26-2774.html

EDITOR: Agnieszka Leńko-Szymańska
EDITOR: Alex Boulton
TITLE: Multiple Affordances of Language Corpora for Data-driven Learning
SERIES TITLE: Studies in Corpus Linguistics 69
PUBLISHER: John Benjamins
YEAR: 2015

REVIEWER: Peter R Crosthwaite, University of Hong Kong

Reviews Editor: Robert Arthur Cote

SUMMARY

The volume, entitled ‘Multiple Affordances of Language Corpora for Data-driven
Learning’ and edited by Agnieszka Leńko-Szymańska & Alex Boulton, begins with
an introduction to data-driven learning (henceforth DDL) in language pedagogy.
After outlining the development of corpora over the last 50 years, the editors
introduce the central theme of the volume, that of the ‘affordances’ of
corpora. In this respect, an affordance is defined as any use of an object
that a person can perceive. The editors view this as a useful analogy to the
many uses (or affordances) of corpora that are currently available for
language teaching, many of which may not have been envisaged by the original
corpus developers.

After introducing seminal works in DDL such as Leech (1997) and Johns (1991),
the editors bemoan (and rightfully so) the current dearth of classroom-based
studies on language corpora, and of corpus use in the language teaching
profession in general. In addition, they point out some of the well-known
difficulties with implementation, yet stress the positive gains made in this
area over the last ten years. The introduction then provides an overview of
each of the following chapters.

Following this introduction, Lynne Flowerdew provides an overview of DDL and
SLA theory, looking at how corpora have been used in relation to the noticing
hypothesis (Schmidt, 1990), constructivist learning and Vygotskyan
sociocultural theory. Given DDL’s lexico-grammatical approach, the connection
between DDL and ‘noticing’ is particularly apt, with Flowerdew stressing that
the inductive approach of DDL is ‘entirely dependent’ (p. 20) on noticing.
Through teacher-led guided activities comprising illustration,
interaction, induction and intervention, students can become ‘pattern hunters’
or ‘pattern definers’, learning inductively via data-driven means.  The link
between DDL and constructivism is perhaps less well-defined, as it is not
entirely clear how the extra functionality employed in modern DDL software
front-ends promotes ‘constructivist’ learning, despite the excellent variety
of usability options afforded by the software reviewed in this section. By the
author’s admission, however, DDL and constructivism is ‘a very complex issue
meriting more experimental investigation’ (p. 27).  Finally, the author
considers social aspects affecting classroom uptake of DDL such as learner
agency and learning styles.  This is particularly interesting given the
current lack of research connecting corpora and individual differences, with a
warning that those holding more deductive than inductive learning styles might
not be able to take full advantage of DDL. The author finishes with a call
for more large-scale research to be conducted in this area, a call on which
all can agree.

One of the more fascinating sections of the book, written by Christopher
Tribble, offers an insight into how language corpora and their affordances for
teaching have developed over the last 30 years. Beginning with the creation of
‘proto-DDL’ courses analysis in the 1970s that relied on typewriters and
photocopy machines, the chapter covers the introduction of personal computers
and the first concordancers in the 1980s, the development of corpus tools and
large corpora such as the British National Corpus (BNC) in the 1990s, and the
rise of the internet and vast storage capabilities of the 21st century before
focusing on developments in corpus applications for teaching. The chapter
next focuses on a series of surveys including data on who uses language
corpora in general, who uses them for teaching, and what tools and resources
they are using. Corpus data on 4-grams taken from respondents’ stated reasons
for using and for not using corpora are provided in an interestingly ‘meta’
take on corpus uptake for teaching. The chapter ends with comments on what
more needs to be done to
take corpus-based approaches to language teaching further.

Following these introductory overviews of corpora, teaching and DDL, the
volume then divides the remainder of the content into three ‘parts’, each
tackling a different ‘affordance’ of corpora.  Part I looks at corpora for
language learning, Part II looks at corpora for skills development, while Part
III covers corpora for translation training, all explored in further detail
below.

Part I: Corpora for language learning
The first section of Part I, written by Guy Aston, uses a 1,000,000-word
spoken corpus derived from TED talks to examine the learning of phraseology in
speech
corpora.  After pointing out the advantages to discourse processing of using
lexicalized phrases and their importance to pronunciation, the author outlines
the construction of a corpus in which the audio is aligned with the
transcription to facilitate learning (and ease of transcription).  The writer
outlines how n-grams taken from the corpus data provide evidence of
phraseological items (and when they do not), with an accompanying description
of the phonological qualities of such items in a manner that is clear and
supported with authentic examples. The implications for language learning are
then put forward, including listening and repeating segments, shadowing, and
reading aloud before listening. The author is also keen to stress potential
difficulties with the corpus-based approach suggested here, as well as the
investment in guidance and practice that teachers and students will have to
put in to make such an approach viable.
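
For readers unfamiliar with n-gram extraction, the basic idea can be sketched
as follows. This is only an illustrative Python sketch, not the tool or corpus
used in the chapter: the sample sentence, the helper names (tokenize,
ngram_counts) and the frequency threshold of 2 are all assumptions made for
the example.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical stand-in for a transcript; a real corpus would be far larger.
sample = ("at the end of the day we all want the same thing "
          "and at the end of the day that is what matters")

counts = ngram_counts(tokenize(sample), 4)

# Keep only 4-grams that recur; recurrence is a first hint (not proof) that a
# sequence may be a lexicalized phrase.
recurrent = {" ".join(gram): c for gram, c in counts.items() if c >= 2}
print(recurrent)
```

Recurrence alone does not establish that a sequence is phraseological, which
is the point the chapter makes about n-grams that do not provide such
evidence.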

The following section, by James Thomas, introduces the design and application
of the Collocation Plus procedure for collocation analysis. After outlining
the educational context of the study, the author proceeds to introduce the
various lexicographic advantages of SketchEngine®, before providing a detailed
analysis of the concept of collocation. This leads to an explanation of
Collocation Plus, an investigation of collocation that takes into account the
patterns of normal usage of collocations. This includes
information on typical grammatical and phrasal contexts in which a collocation
is found and suggestions on how this information can aid teaching. 
Specifically, a study is introduced where groups of students are presented
with ‘topic-trails’, or lists of keywords related to a topic (such as ‘In a
magazine article […] there are words about ‘animals’, ‘food’, ‘diseases’ and
‘research’), and have to produce ‘word templates’ consisting of collocates to
these keywords.  Students must then consider the representativeness of each
construction using the ‘Hoey Procedure’ (e.g. Hoey, 2005), where the frequency
of these constructions is checked against corpus data for validity. Such
word templates can also be pre-generated by teachers to save time.  However,
the results of the study are not provided, and while the chapter is certainly
a glowing advertisement for the approach (and also, it must be said, for
SketchEngine), further study is required before the procedure can be said to
have aided language learning.

The final section of Part I, by Kiyomi Chujo, Kathryn Oghigian & Shiro
Akasegawa, explores the use of corpora and DDL with low-proficiency L2
learners, with a focus on the Japanese context. The authors clearly state the
lack of success with DDL reported for low-proficiency L2 learners compared to
higher-level learners, in part due to the lack of available level-appropriate
materials. They introduce the construction of a needs-driven corpus of
level-appropriate sentences controlled for reading grade, word familiarity,
and sentence length, accompanied by L1 translation, alongside a new DDL tool
titled the Grammatical Pattern Profiling System that allows for corpus queries
via word or grammatical category. The chapter mainly describes the
construction and suggested pedagogical uses of these tools rather than
presenting any experimental data on their success with learners, which would
have been of great interest to readers.

Part II: Corpora for skills development
The first section of Part II, by Maggie Charles, outlines learner-constructed
discipline-specific corpora for multidisciplinary EAP classes, where
students work on the same task but use their own corpora. After presenting
information about the corpora and participants, Charles presents a number of
qualitative results based on student achievement using the concordance tool,
wordlists, collocates tool, and concordance plot tool before evaluating the
success of the course as a whole. Students finished with positive attitudes
towards the corpus work, although the typical negative responses regarding
reading concordance lines and the amount of time required to master the
approach are also highlighted, alongside difficulties with the smaller corpus
sizes involved in discipline-specific texts. I consider this a
groundbreaking approach to bridging the gap between EAP and subsequent
discipline-specific investigation, given that the majority of tertiary EAP
programs start as general as possible in the first year before moving on to
disciplinary discourse in the following years. If successful, this approach
could significantly reduce the amount of time spent on this process.

The subsequent section, by Svitlana Babych, looks at how corpora can promote
the acquisition of reading strategies via the creation of contrastive
multilingual resources, in this case a thesaurus of textual connectors for
Russian undergraduates learning to read Ukrainian.  According to the author,
by promoting an understanding of the contrastive uses of such connectors,
students are able to better process L2/L3 texts in a top-down manner by
allowing them to predict L2/L3 text structure more effectively.  A
classification scheme of these connectors is included with examples from
Ukrainian, Russian, and English, which is helpful for other researchers
looking to explore such cohesive devices in future research.  Finally, a brief
overview of pre- and post-training test data suggests that the students
benefitted greatly from corpus-driven sessions on contrastive linguistic
features (although more detail could have been provided here), which is
followed by a useful presentation of exercises tailored for the development of
reading skills.

The final section of Part II, written by Alejandro Curado Fuentes, also
examines lower-level L2 students, this time focusing on keywords in texts for
reading comprehension. These keywords were used in targeted concordance
searches by an experimental DDL group compared with a paper-based activity for
a control group using pre- and post-tests on the decoding of keywords, with
increased benefits to comprehension reported for the DDL group.  The author
uses small home-made news corpora from the internet to generate the keywords
for subsequent DDL, which he states is an under-researched use of such
keywords.  Detailed explanations of the teaching procedures are provided,
making the study much more replicable. The attempt to link students’
perceptions of DDL and their actual test performance is also an interesting
and novel feature of this study.
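
A targeted concordance search of the kind described above can be illustrated
with a minimal keyword-in-context (KWIC) sketch in Python. It is not the
software used in the study: the function name kwic, the context width and the
two-sentence sample text are hypothetical and chosen only for illustration.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit of keyword."""
    tokens = re.findall(r"\w+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Hypothetical stand-in for a small home-made news corpus.
sample = ("The bank raised interest rates. "
          "Rates at the central bank may rise again.")

hits = kwic(sample, "bank", width=2)
for left, key, right in hits:
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20}  [{key}]  {right}")
```

Aligning the keyword in a central column is what lets learners scan the
surrounding words for recurring patterns.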

Part III: Corpora for translation training
The first section of Part III, provided by Teresa Molés-Cases and Ulrike
Oster, looks at the use of ‘webquests’, an autonomous or group-based,
task-based, internet-fronted approach to scaffolded learning. Pointing out the
potential applications of corpus use for translation training early on in the
chapter, the study describes the benefits of webquests in terms of
collaboration and student-teacher feedback before introducing a pilot study
where translation students conduct webquests using an in-house multilingual
corpus titled ‘Covalt’ in order to develop translation strategies. Links to
the webquests themselves are helpfully provided for the reader. The
authors claim that through the webquests outlined in this chapter, students
were ‘pushed’ to make linguistic decisions autonomously and were generally
positive in their perceptions of the webquests and of their learning
post-training.

Josep Marco and Heike van Lawick wrote the following section, which looks at
how translator trainees’ awareness of L1 transfer or interference can be
enhanced using ‘comparable’ corpora.  The difference between the authors’
definition of comparable corpora and previous approaches for translation-based
studies is that existing comparable corpora have both texts and translations
in the native language (constituting a monolingual corpus in that source texts
in the target language are not found), which leads to a lack of data on L1
interference. Such interference is particularly evident in student
translations. Under the authors’ approach, by expanding corpus data to include
a variety of L1/L2/L3 texts including translations and non-translations,
students were better able to recognise potential interference across a
five-step process leading to a retranslation of a text they had translated in
the first step. The authors note qualitative examples of student improvement.

The next section, by Patricia Sotelo, looks at the design of a multimedia
corpus of subtitles for translation training. After suggesting that
multimedia corpora for translation training have received little attention
compared to other types of corpora, the author introduces the
development of the Veiga corpus – a multimedia corpus of English-language
films subtitled in English and Galician – the purpose of which is to develop
‘translation competence’ in subtitling training.  The chapter outlines the
construction of the corpus and the accompanying query tools, and offers highly
detailed descriptions of corpus-driven tasks carried out by a small group of
undergraduate students, who responded positively to the process.  The author
finishes with an impassioned ‘call to arms’ regarding the adoption of corpus
technology for translation, which to me, at least, was quite convincing.

In the final section of Part III, which concludes the volume, Alex Boulton
looks at applying DDL to the web via web searches rather than via traditional
corpus concordancing. It is certainly refreshing to see Boulton espouse the
value of Google or other web software for teaching and learning, despite the
hesitance many corpus linguists feel about using Google as a corpus resource.
Furthermore, Boulton does well to highlight Tim Johns’s concern with the
processes involved in DDL (that of an inductive learning process) rather than
with the data or specific tools used for it. While Boulton points out that the
web ‘fails’ to meet a textbook definition of a corpus and notes some of the
difficulties involved in using it for this purpose, the idea that Google CAN
be a concordancer in certain cases is one that I believe both teachers and
students will be positive towards. Boulton finishes by pointing out some
existing functions of Google for DDL that educators might be unfamiliar with
and by reviewing some studies testing the use of the web for DDL.

EVALUATION

The main contribution of the volume is that of presenting new ideas or
‘affordances’ of language corpora that are still in their infancy, as
evidenced by the lack of actual data in some of the chapters, but that are
ripe for further development. Therefore, the title should be incredibly useful
for
those who are looking to begin or continue research on DDL but seek new ground
to break.

Another refreshing element of the text is the emphasis on lower-level L2
learners by some of the authors, which is indeed a vital area to explore if we
in the corpus linguistics field really want to break into the classroom.  Too
much of the research on DDL (as well as too much existing corpus data) is
simply inaccessible to those at the frontlines of teaching lower-level
learners, despite the fact that lower-level learners make up the majority of
L2 learners and that such learners are arguably more in need of
DDL. As such, I applaud the efforts of the authors here in focusing on
lower-level students.

Another feature of the volume is its accessibility for educators. Most of the
chapters contain very detailed step-by-step guides of the DDL training that
the studies’ participants embarked upon. This is most welcome from the point
of view of a teacher who is considering implementing DDL in their context.
Often, these details are missing, so budding DDL practitioners are forced to
contact the authors of such studies through personal correspondence for more
in-depth information. However, given the detail provided in many of these
chapters, they act as ‘how-to’ guides for those interested in adopting the
particulars of DDL. As a result, the scope of the volume as a whole
encompasses a wider audience than those who are merely interested in DDL for
the sake of it. Boulton’s chapter regarding Google for DDL is also likely to
come as a relief to teachers and students who want to incorporate DDL into the
classroom but find it impossible to manage existing corpora and corpus tools.
Thus, closing the volume with this chapter should serve as a point of
encouragement for educators to get started.

One issue with the volume is that, while it is certainly practical for
potential teachers and researchers in terms of the ‘affordances’ and
opportunities possible for DDL, a number of the studies provide little
empirical data as to their effectiveness, or the studies presented are small
in scope. In particular, none of the studies included in Part I offers
experimental data regarding whether the methodologies or technologies
described have led to documented language ‘learning’. Teachers are well known
for being hesitant to try something new if they do not think they are going to
get a significant payoff, even if the potential pedagogical advantages are
numerous. Therefore, further studies where performance data is collected under
these approaches are obviously needed. (I’m sure the authors would agree, or
are at least in the process of collecting such data themselves.) This leads to
a second issue with the volume: the distinction made between DDL for language
learning in Part I and DDL for skills development in Part II is not so clear.
While I can see the logic of the distinction, the studies involved in Part II
have arguably more (or at least comparable) evidence of language learning due
to their more empirical nature, and there is no reason why the studies in
Part I (and even Part III) cannot also lead to skills development. This is a
minor fault, however.

In summary, the volume can be considered an essential resource for those
already well-versed in DDL. Perhaps more importantly, however, the volume acts
as an accessible guide for educational administrators, teachers and students
who are thinking about incorporating a data-driven approach to teaching and
learning, and who need the know-how, practical applications and most of all,
encouragement to start experimenting with DDL and language corpora more
generally.

REFERENCES 

Biber, D. (1988). Variation across Speech and Writing. Cambridge: Cambridge
University Press.

Hoey, M. (2005). Lexical Priming: A New Theory of Words and Language. London:
Routledge.

Johns, T. (1991). From printout to handout: Grammar and vocabulary teaching in
the context of data-driven learning. In Johns, T. & King, P. (eds.) Classroom
Concordancing, English Language Research Journal 4 (pp. 27-45).

Leech, G. (1997). Teaching and language corpora: A convergence. In Wichmann,
A., Fligelstone, S., McEnery, T. & Knowles, G. (eds.) Teaching and Language
Corpora (pp. 11-23). Harlow: Addison Wesley Longman.

Schmidt, R. (1990). The role of consciousness in second language learning.
Applied Linguistics, 11(2), 129-158. DOI: 10.1093/applin/11.2.129.


ABOUT THE REVIEWER

Peter has enjoyed a varied career in both the TESOL and applied linguistics
fields. His previous experience includes materials preparation, editing and
consultancy work with publishers including Cambridge University Press, two
sessions as director of studies for language schools in the UK, over six
years’ experience in the Korean EFL context, lengthy experience as an IELTS
examiner, as well as teaching and supervision experience at Cambridge
University. He is currently an assistant professor at the Centre for Applied
English Studies, University of Hong Kong, and his research areas include
second language acquisition, (learner) corpus linguistics, and language
assessment. He is also an editor of The Linguistics Journal.




