27.1801, Review: Applied Ling; Lang Acq; Text/Corpus Ling: Callies, Götz (2015)

Mon Apr 18 20:06:41 UTC 2016

LINGUIST List: Vol-27-1801. Mon Apr 18 2016. ISSN: 1069 - 4875.

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry, Robert Coté, Sara Couture)
Homepage: http://linguistlist.org

*****************    LINGUIST List Support    *****************
                       Fund Drive 2016
                   25 years of LINGUIST List!
Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Sara Couture <sara at linguistlist.org>
================================================================

Date: Mon, 18 Apr 2016 16:06:21
From: Islam Farag [islamfarag at missouristate.edu]
Subject: Learner Corpora in Language Testing and Assessment

Discuss this message:
http://linguistlist.org/pubs/reviews/get-review.cfm?subid=36119397

Book announced at http://linguistlist.org/issues/26/26-2283.html

EDITOR: Marcus Callies
EDITOR: Sandra Götz
TITLE: Learner Corpora in Language Testing and Assessment
SERIES TITLE: Studies in Corpus Linguistics 70
PUBLISHER: John Benjamins
YEAR: 2015

REVIEWER: Islam Medhat Farag, Missouri State University

Reviews Editor: Robert Arthur Cote

SUMMARY 

“Learner Corpora in Language Testing and Assessment”, edited by Marcus Callies
and Sandra Götz, explores how learner corpora can be used in assessing L2
proficiency, especially speaking and writing proficiency. This 219-page
edited volume consists of eight chapters, which are evenly divided into two
sections. The first section, ‘New Corpus Resources, Tools and Methods,’
presents new corpora and software that were created specifically for use in
language testing and assessment (LTA). The second section, ‘Data-Driven
Approaches to the Assessment of Proficiency,’ shows how research studies can
use learner corpora in assessing L2 proficiency.

In Chapter 1, “The Marburg Corpus of Intermediate Learner English (MILE),”
Rolf Kreyer introduces a new learner corpus, MILE, which is based on data
collected from German learners of English at an intermediate level (grades
nine to twelve). MILE fills two gaps in existing corpora: it contains
longitudinal data, and it has annotations that can be used for better
assessment (14-20). The main advantage of MILE is that it shows in detail
how students’ language proficiency improves from grade to grade. This record
of gradual improvement will give researchers and teachers a clear description
of what intermediate proficiency looks like in terms of the breadth and
mastery of vocabulary, expressions, and grammar usage (20-31).

In Chapter 2, “Avalingua: Natural Language Processing for Automatic Error
Detection,” Pablo Gamallo et al. present a software tool that detects and
classifies different types of written errors, such as syntactic, spelling,
and lexical errors (35). This error detection software can be used to assess
students’ language proficiency level and to enhance their learning
experience, since it suggests corrections for the detected errors. To
evaluate the accuracy and performance of Avalingua, the authors used two
learner corpora and found that Avalingua achieved “91% precision and 65%
recall” (54) with regard to identifying spelling and lexical errors and
classifying syntactic errors. The software is intended for both first and
second language learners of Galician; however, its design makes it easy to
adapt to other languages (54-55).
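
For readers less familiar with these two measures, the short Python sketch
below shows how precision and recall are computed for an error detector. The
function name and the counts are hypothetical and were chosen only so that
the arithmetic reproduces the percentages quoted above; they are not data
from the chapter, and the authors' actual evaluation procedure may differ.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for an error detector.

    precision: share of flagged items that are real errors
    recall:    share of real errors that were flagged
    """
    flagged = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not taken from the book):
# 91 real errors flagged, 9 false alarms, 49 real errors missed.
precision, recall = precision_recall(91, 9, 49)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these assumed counts, high precision with lower recall means that most
of what the tool flags is a genuine error, while a sizeable share of the
errors actually present in the texts goes undetected.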

In Chapter 3, “Data Commentary in Science Writing: Using a Small, Specialized
Corpus for Formative Self-Assessment Practices,” Lene Nordrum and Andreas
Eriksson observe that Swedish science students often have difficulty writing
a proper data commentary. One reason for this is the shortage of suitable
English learning materials, which usually provide only very general data
commentary examples (59-61). To overcome this challenge, the authors
introduce an approach that promotes self-learning and self-assessment through
the use of an applied chemistry corpus, which is based on data commentaries
taken from both master’s theses and published papers. The corpus also
provides annotations of the data commentaries. The authors suggest different
activities that teachers can use to help students make better use of the
corpus and become more autonomous (70-79).

The last chapter of section one, Chapter 4, is entitled “First Steps in
Assigning Proficiency to Texts in a Learner Corpus of Computer-Mediated
Communication” and is written by Tim Marchand and Sumie Akutsu. In this
chapter,
the authors propose a Computer-Mediated Communication (CMC) corpus designed
for Japanese EFL students at the university level by collecting students’
comments that are posted on two news websites. This corpus provides a
different way of L2 proficiency assessment through the use of binary decision
trees that enable teachers to identify the accuracy, complexity, and fluency
of students’ written texts (90-109).

The second section starts with Chapter 5, “The English Vocabulary Profile as
a Benchmark for Assigning Levels to Learner Corpus Data”. Here, Agnieszka
Lenko-Szymanska examines how effectively the English Vocabulary Profile (EVP)
can be used to assess learners’ written texts through an analysis of their
lexical content (121-122). The researcher analyzes 90 essays that were taken
from the International Corpus of Crosslinguistic Interlanguage and rated by
human raters (125). The results show that there is a strong correlation
between the EVP description of students’ essays and the CEFR levels assigned
by the human raters (134-135).

In Chapter 6, “A Multidimensional Analysis of Learner Language during Story
Reconstruction in Interviews,” Pascual Perez-Paredes and Maria Sanchez-Tornel
compare L2 learners’ oral production, drawn from the Louvain International
Database of Spoken English Interlanguage (LINDSEI), to native speakers’
samples taken from the Louvain Corpus of Native English Conversation (LOCNEC)
(142). They find that non-native speakers describe pictures differently from
native speakers in terms of many linguistic features, such as the use of
pronouns (151-158).

In Chapter 7, “Article Use and Criterial Features in Spanish EFL Writing: A
Pilot Study from CEFR A2 to B2 Levels,” Maria Belen Diez-Bedmar investigates
the use of articles in the writing of Spanish EFL learners. This pilot study
uses learner corpora to analyze students’ use of English articles (164-165).
The researcher traces both the frequency and the accuracy of article use and
concludes that Spanish EFL students use the zero article very effectively and
accurately in non-referential contexts. For this reason, she believes that
students tend to favor non-referential contexts in order to avoid using other
articles. She finds that there are two main types of contexts in which
article errors occur frequently: (1) students make article errors when there
is a long noun phrase, and (2) they use articles in contexts in which
articles are not needed (175-185).

Finally, in Chapter 8, “Tense and Aspect in Spoken Learner English:
Implications for Language Testing and Assessment,” Sandra Götz uses the
LINDSEI corpus to analyze the accuracy of verb tense use in the speech of
German learners of English. The results show that the errors fall into
heterogeneous groups, but the most common verb-tense errors are as follows:
the use of the simple present instead of the past progressive, the use of the
present progressive instead of the simple present, and the use of the present
perfect instead of the simple past (205-206).

EVALUATION

The main goal of the book is to explore how learner corpora can be used in
testing and assessing second language writing and speaking proficiency; the
authors have achieved this goal. The first section introduces new corpora and
a software tool and shows how they can be used in L2 assessment, whereas the
second section presents research papers that investigate the benefits of
using corpora in L2 assessment. The volume is well organized and coherent,
for all chapters address the main question of the book: how can learner
corpora be used in L2 proficiency assessment? It is suitable for researchers
and graduate students who are interested in learning about and using corpora
in testing and assessment, especially in Europe, because the authors propose
corpora that are based on European learners. Prior knowledge of computational
linguistics will make it much easier to read the volume and to understand
what is said about the design of the different corpora and the software tool.

One main limitation is that the book does not come with a CD containing the
corpora and software that were created and discussed in it, forcing the
reader to search for them online. I found some of them, but I could not find
others. Another issue is that second language (L2) proficiency lacks a clear
and adequate measurement or definition; one reason for this inadequacy is
that there is not enough information about L2 proficiency levels (Thomas
1994). The need for transparent, reliable, and valid measures of L2
proficiency has become urgent; this edited volume thus constitutes a good
step towards filling this gap. Despite these shortcomings, this book
contributes to the booming subfield of using learner corpora in L2
assessment.

REFERENCES

Thomas, M. (1994). Assessment of L2 proficiency in second language acquisition
research. Language Learning, 44(2), 307.

ABOUT THE REVIEWER

Islam M. Farag is a second-year graduate student doing his MA in linguistics,
TESOL track, at Missouri State University. He is interested in dialect
variation, second language acquisition, and language assessment. His career
goal is to earn a PhD in applied linguistics.
