16.1893, Review: Corpus Ling/Applied Ling: Aston et al. (2004)

Sun Jun 19 22:27:14 UTC 2005

LINGUIST List: Vol-16-1893. Sun Jun 19 2005. ISSN: 1068 - 4875.

Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org) 
        Sheila Dooley, U of Arizona  
        Terry Langendoen, U of Arizona  

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Naomi Ogasawara <naomi at linguistlist.org>
================================================================  

What follows is a review or discussion note contributed to our 
Book Discussion Forum. We expect discussions to be informal and 
interactive; and the author of the book discussed is cordially 
invited to join in. If you are interested in leading a book 
discussion, look for books announced on LINGUIST as "available 
for review." Then contact Sheila Dooley at collberg at linguistlist.org. 

===========================Directory==============================  

1)
Date: 16-Jun-2005
From: Przemysław Kaszubski < przemka at amu.edu.pl >
Subject: Corpora and Language Learners 

-------------------------Message 1 ---------------------------------- 
Date: Sun, 19 Jun 2005 18:20:38
From: Przemysław Kaszubski < przemka at amu.edu.pl >
Subject: Corpora and Language Learners 

EDITORS: Aston, Guy; Bernardini, Silvia; Stewart, Dominic
TITLE: Corpora and Language Learners
SERIES: Studies in Corpus Linguistics 17
PUBLISHER: John Benjamins
YEAR: 2004
Announced at http://linguistlist.org/issues/16/16-33.html

Przemyslaw Kaszubski, School of English, Adam Mickiewicz University, 
Poznan, Poland

SUMMARY

CORPORA AND LANGUAGE LEARNERS features a selection of 
papers presented at the fifth meeting of the biennial TaLC (Teaching 
and Language Corpora) conference, which was held in Bertinoro, Italy, 
in the summer of 2002. The book is divided into five parts, the three 
central sections exploring three areas involving corpora and 
learners: "Corpora by learners" (i.e. corpus-based studies of learner 
language, 6 papers), "Corpora for learners" (various types of target 
language corpora, 4 papers), and "Corpora with learners" (data-driven 
learning, 3 papers). These 'core' contents are framed by two more 
general contributions: a proposal for a corpus-informed theory for 
applied linguistics, and an overview of prospects for applying the Web 
to corpus-based pedagogy. An index (pp. 301-305) and contributors' 
bionotes (pp. 307-311) complete the volume.

In their "Introduction: Ten years of TaLC", the editors, previewing the 
book's organization and contents, note the field's constantly evolving 
and diversifying efforts to optimize the link between corpus application 
and language pedagogy. Central to these efforts are attempts to 
understand learners and their needs, and the necessity to resolve the 
vexed notion of input 'authenticity', surfacing in several papers.

The first major contribution, Michael Hoey's "The textual priming of 
lexis", is the one that offers "A theory for TaLC". The author claims 
that lexical units -- central to his proposal -- display the property of 
becoming loaded ('primed') in a mind exposed to frequently repeating 
patterns of usage. Priming may concern any broadly understood 
grammatical and collocational properties, both within and beyond the 
sentence. Thus, a word may, for example, be primed for acting as a 
noun or verb, for representing certain meanings, for preceding or 
following specific modification patterns (colligation), for appearing in 
particular textual positions (textual colligation), for contributing to 
textual relations (e.g. a Problem-Solution pattern), etc. Such primings 
are, in addition, relative to specific genres and domains of use. Priming 
may "change through an individual's lifetime" (p. 24); it also precedes 
grammatical categorizations, which are likely to be post hoc creations. 
According to Hoey, effective studies of primings must be based on 
specialized corpora, which do not regularize any specific preferences 
in favour of the 'big picture'. Evaluating the pedagogical relevance of 
his theory, the author points to the role of teachers and materials in 
ensuring correct, though gradual, priming of lexical content, properly 
contextualized. Priming may also account for creative uses of 
language, which, as Hoey claims, can breach some -- but never all -- 
of the priming constraints (the latter would produce "non-language"). 
Overall, priming theory, recently elaborated in a monograph (Hoey 
2005), merits attention in that it aptly positions, and legitimizes, corpus-
based lexical research within the larger scope of psycholinguistics, 
language variation, and acquisition theory.

The first paper in the "Corpora by learners" part is Yukio 
Tono's "Multiple comparisons of IL, L1 and TL corpora: The case of L2 
acquisition of verb subcategorization patterns by Japanese learners of 
English". Solidly grounded in L1 and L2 acquisition theory and Levin's 
verb classes, the paper makes a methodological case for a multiple 
corpus comparison method in corpus studies of learner language. Tono 
shows that combining interlanguage (IL) material (possibly from 
various stages of proficiency) with, on the one hand, appropriate 
target language corpora (here: English textbooks) and, on the other, 
comparable L1 corpora makes it possible to capture computationally 
the diverse effects influencing SLA, such as L1 effects, L2 input 
effects, and developmental effects. The advanced linguistic analysis 
relies on syntactic parsing, database systems and log-linear analysis 
of clusters, whose brief discussion some readers may find a little 
obscure. The author's concluding wish is to see international 
collaboration for the development of a computational model of SLA.

In "New wine in old skins? A corpus investigation of L1 syntactic 
transfer in learner language", Lars Borin and Klaus Prütz attempt to 
investigate the syntax of Swedish university-level students through 
frequencies of part-of-speech (POS) n-grams (a procedure feasible for 
languages with fixed-order syntax, as the authors rightly point out, p. 
71). A contrastive, multi-corpus environment is also advocated here, 
the corpora ranging between 350,000 and 1 million tokens. The 
counted frequencies of 1-4 grams (excluding sequences containing 
proper nouns and punctuation, as well as, controversially, those 
exclusive to either language) are compared and tested statistically 
(Mann-Whitney), revealing a predominant overuse pattern in the 
learner data. The authors illustrate their findings and compare them 
with earlier studies, most notably Aarts and Granger (1998). The final 
outcome is far from definitive, but the discussion sheds interesting light 
on the significance of methodological decisions for this kind of 
research, such as the size of the adopted tagset or the degree 
of manual adjustment in the frequency lists.
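
As a rough illustration of the counting step described above, the 
following Python sketch tallies POS n-grams of length 1 to 4 and drops 
any sequence containing a proper-noun or punctuation tag. The tag 
names (NP, PUNC), the function names and the toy tag sequence are 
assumptions made for this example, not details taken from Borin and 
Prütz's paper, and the statistical comparison itself is left out.

```python
from collections import Counter

# Hypothetical tag names for proper nouns and punctuation; a real
# tagset would use its own labels.
EXCLUDED_TAGS = frozenset({"NP", "PUNC"})


def pos_ngrams(tags, max_n=4, excluded=EXCLUDED_TAGS):
    """Count POS n-grams (n = 1..max_n), skipping any n-gram that
    contains an excluded tag."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tags) - n + 1):
            gram = tuple(tags[i:i + n])
            if not excluded.intersection(gram):
                counts[gram] += 1
    return counts


def shared_ngrams(counts_a, counts_b):
    """Return the n-grams attested in both corpora, i.e. drop those
    exclusive to either one."""
    return sorted(set(counts_a) & set(counts_b))


# Toy tag sequences standing in for a learner and a reference corpus.
learner = pos_ngrams(["DT", "NN", "VB", "PUNC", "NP", "VB"])
reference = pos_ngrams(["DT", "NN", "VB", "DT", "NN"])
common = shared_ngrams(learner, reference)
```

The frequencies of the n-grams in `common` could then be passed to a 
rank-based test such as `scipy.stats.mannwhitneyu`, if SciPy is 
available.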

Agnieszka Lenko-Szymanska's "Demonstratives as anaphora markers 
in advanced learners' English" adopts a comparatively light 
computational approach and a more traditional comparison paradigm, 
with a Polish university learner corpus (PELCRA; four proficiency 
levels) set against a single native-speaker reference corpus (BNC 
Sampler). The log-likelihood and chi-square statistics applied 
demonstrate that Polish learner writers overuse distal anaphoric 
signals ('that', 'those'), primarily in the determiner function, and 
that the problem does not seem to disappear with rising proficiency. 
The author attributes this to the lack of appropriate, explicit 
explanations in grammar books.
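
For readers unfamiliar with the log-likelihood statistic, the sketch 
below shows the two-corpus form commonly used in corpus linguistics 
for overuse/underuse comparisons. It is an assumption that the paper 
uses this exact formulation, and the frequencies and corpus sizes in 
the example are invented for illustration only.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood for one item.

    freq_a, freq_b: occurrences of the item in corpus A and corpus B.
    size_a, size_b: total token counts of the two corpora.
    """
    total = freq_a + freq_b
    # Expected frequencies if the item were equally common in both.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Invented counts: 120 hits in a 100,000-token learner corpus
# against 60 hits in a 100,000-token reference corpus.
score = log_likelihood(120, 100_000, 60, 100_000)
```

With one degree of freedom, a score above 3.84 is conventionally 
taken as significant at p < 0.05; the direction of the difference 
(overuse or underuse) has to be read off the relative frequencies.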

In "How learner corpus analysis can contribute to language teaching: 
A study of support verb constructions", Nadia Nesselhauf presents yet 
another learner corpus research scheme, in which the uses 
of 'make', 'have', 'take', and 'give' in support constructions (extracted 
by eyeball analysis of concordance lines) are judged for appropriacy 
not just against a comparable native English reference corpus (written 
BNC), but also using lexicographic sources and native-speaker 
informants. The author also undertakes to seek correspondences and 
clusters across the error types (despite rather low frequencies). Some 
of the suggested implications for teaching may seem obvious (e.g. that 
frequency information in learner data is insufficient and should be 
complemented by appropriate native-speaker genre/text-type 
frequency); more importantly, Nesselhauf reminds us of the need to 
consider non-corpus factors in judging errors, such as the degree of 
communicative disruption. One interesting pedagogical suggestion for 
her data is the idea of focusing learners' attention on instances where 
single verb uses differ semantically from the corresponding support 
constructions (e.g. 'take notice' vs 'notice').

Lynne Flowerdew's article "The problem-solution pattern in apprentice 
vs. professional technical writing: An application of appraisal theory" 
explores the possibility of applying the systemic-functional Appraisal 
framework of categorizing evaluative language to an analysis of cross-
corpus keyword and key-keyword listings generated with Scott's 
WordSmith Tools. The author concentrates on apprentice and 
professional authors' use of 'inscribed' (explicitly evaluative, 
e.g. 'problem') vs. 'evoking' (inviting evaluation, e.g. 'impact') lexis in 
signalling problems and/or solutions. The findings indicate that, for the 
genre in question, the majority of keywords identified are indeed 
problem-solution in nature, and that while learner writers tend to use 
inscribed terms for both the Problem and Solution elements, native-
English professionals signal problems with more evoking terms. This, 
Flowerdew argues, may be a teaching-induced phenomenon; other 
incongruities encountered are attributed to the mismatch of topics in 
the two corpora under comparison.

Ngoni Chipere, David Malvern and Brian Richards' paper "Using a 
corpus of children's writing to test a solution to the sample size 
problem affecting type-token ratios" is primarily computational in 
character. The authors review and criticize various existing measures 
of lexical richness, in particular the type-token ratio (TTR), and put 
forward their own formula for a D parameter, which is independent of 
the text sample size and, as empirically tested in the study, better 
correlated with varied proficiency levels, determined by human scorers 
and certain known measures (word length, text length). The D metric 
thus appears especially well suited for tracing linguistic development, 
and it is only regrettable that the authors do not provide download or 
ordering details for readers wishing to test the tool (by comparison, a 
mildly criticized measure, standardized TTR, is readily available in 
WordSmith Tools).
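
The sample size problem itself is easy to demonstrate. The Python 
sketch below computes a plain type-token ratio and a standardized 
TTR, understood here as the mean TTR over consecutive fixed-size 
chunks. It does not implement the authors' D measure, whose formula 
is not reproduced in this review; the function names, the chunk size 
and the toy data are assumptions made for illustration.

```python
def ttr(tokens):
    """Plain type-token ratio: distinct word forms / running words."""
    return len(set(tokens)) / len(tokens)


def standardized_ttr(tokens, chunk_size):
    """Mean TTR over consecutive, non-overlapping chunks of
    chunk_size tokens; a trailing partial chunk is ignored."""
    scores = [
        ttr(tokens[i:i + chunk_size])
        for i in range(0, len(tokens) - chunk_size + 1, chunk_size)
    ]
    return sum(scores) / len(scores)


# Toy text: the same four word forms repeated five times.
tokens = ["the", "cat", "sat", "down"] * 5

short_sample = ttr(tokens[:4])            # every token is a new type
long_sample = ttr(tokens)                 # same vocabulary, more tokens
chunked = standardized_ttr(tokens, 4)     # unaffected by total length
```

The plain ratio falls as the sample grows although the vocabulary is 
unchanged, whereas the chunked average stays constant for this toy 
text.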

Opening the "Corpora for learners" section, Ute Römer's "Comparing 
real and ideal language learner input: The use of an EFL textbook 
corpus in corpus linguistics and language teaching" assesses the 
linguistic value of pedagogical materials for classroom use, taking 
spoken 'if' constructions as an example. While conceding that the 
contextual authenticity of attested language cannot be fully 
transferred to a classroom setting, the author declares 
confidence in learners' ability to adapt, and in the overall positive 
influence of naturalistic language exposure as opposed to special 
input. Suggestions are also offered for applying the findings from a 
contrast between the language of the scanned German textbook 
conversations and the evidence of the spoken BNC. Römer's optimism 
may be open to some 
question, as thus far relatively little empirical evidence exists 
confirming the effectiveness of corpus-driven material selection; 
conversely, authors such as Aston (2001: 8), Gabrielatos (2005), or 
Nesselhauf and Mauranen (this volume) admit the necessity of 
considering non-frequency factors.

An interesting proposal for a corpus-based stylistics programme is 
described by Bernhard Kettemann and Georg Marko in "Can the L in 
TALC stand for Literature?". The authors plan to offer an integrative 
and 'hands-on' awareness-raising course for students at English 
departments (in particular at Graz University), whose knowledge often 
gets excessively compartmentalized. It is claimed that corpus-based 
analyses of literary texts should help students integrate their 
knowledge and build five important, inter-related types of awareness: 
(1) language, (2) discourse, (3) literary, (4) cultural/social, and (5) 
methodological / metatheoretical (= how to organize and logically 
conduct research). Although the course is still at an early stage of 
design, Kettemann and Marko characterize each of its parts in 
considerable detail, providing illustrations of concordancing and other 
corpus activities (e.g. how to discuss the role of performatives 
retrieved on the basis of "I * you" frame searches in a Shakespeare 
corpus). 
Special attention is devoted to methodological awareness, which is 
meant to build gradually throughout the course, incorporating such 
elements as acquisition of strict research procedures, co-textual and 
transtextual analysis of data, and the faculty of critical analysis. The 
authors hope that, when properly combined with other components in 
the curriculum, their course may be successful, especially in view of its 
focus on culturally vital literary texts.
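
A frame search of the kind mentioned above can be approximated with 
a regular expression. The Python sketch below treats '*' as exactly 
one word and returns keyword-in-context triples. The function name, 
the context width and the sample sentences (invented, not quoted from 
Shakespeare) are assumptions for illustration; this is not the tool 
used by Kettemann and Marko.

```python
import re


def frame_search(text, frame, width=30):
    """Return (left context, match, right context) triples for a
    frame such as 'I * you', where '*' stands for exactly one word.
    Matching is case-sensitive."""
    # Literal words are escaped; '*' becomes a one-word wildcard.
    parts = [r"\w+" if tok == "*" else re.escape(tok)
             for tok in frame.split()]
    pattern = re.compile(r"\b" + r"\s+".join(parts) + r"\b")
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines


# Invented sample text, for illustration only.
sample = "I thank you, sir. I pray you tell me. You and I know."
hits = frame_search(sample, "I * you")

# Print the hits in a simple aligned concordance layout.
for left, match, right in hits:
    print(f"{left:>30} [{match}] {right}")
```

Students could then sort the middle slot by verb to see which 
performatives fill the frame most often.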

The possibility of enhancing academic speaking skills with the help of 
the Michigan Corpus of Academic Spoken English (MICASE) is in turn 
reviewed by Anna Mauranen ("Speech corpora in the classroom"), 
who reports on the responses from a teacher and her students after 
running such an experimental course. While the teacher found corpus 
use fascinating and stimulating (though humbling), students' 
appreciation depended on the level of computer-literacy. Most cited 
problems sound familiar: the need for longer pre-training, high time 
cost, the questionable value of corpora for less proficient learners. In 
addition, some users found inductive learning uncomfortable and 
studying frequency irrelevant. In her comments on these results, 
Mauranen proposes that the pedagogical authenticity of corpora be seen 
as including both 'objective authenticity' (the linguistic evidence) and 
'subjective authenticity' (how students relate to corpus 
material); secondly, she notes that the appeal of corpus material may 
relate to its discourse nature: "interactively saturated" spoken data 
may deactivate students more than, e.g., written prose. Other issues 
concern adapting corpus activities to analytically processing learners 
(e.g. adults), and taking a stand on the native-English vs. English as a 
lingua franca (ELF) controversy.

In "Lost in parallel concordances", Ana Frankenberg-Garcia gives 
recipes for using parallel concordancing in a general language course. 
The assumption is that such practice encourages explicit L1-L2 
comparison, which, as current research shows, may facilitate rather 
than necessarily impede effective learning, since it engages students 
and, provided the teacher is sufficiently experienced, brings to the 
fore relevant L1-related difficulties. "Navigating through a parallel 
corpus" may depend on whether uni-directional or bidirectional 
translations are available. Frankenberg-Garcia considers all the 
different options for initiating parallel searches (beginning with source 
texts in L1, source texts in L2, target texts in L1, or target texts in 
L2) and compares their pedagogical value. Some interesting points 
are raised (e.g. about the possibility of using L1 translations as 
models), although the activities presented seem unsupported by 
classroom practice, which poses the question of their genuine 
effectiveness. What is perhaps lacking is some proof of parallel 
concordancing actually outperforming bilingual dictionaries in some 
contexts. Also, little attention is paid to the age or proficiency of 
learners, or the importance of genres. Overall, the paper emerges as 
a catalogue of ideas that may (some would say should) be useful, but 
which have yet to be proved so. (For those interested, an online 
version of the paper is available at 
http://www.linguateca.pt/Repositorio/Frankenberg-GarciaTALC2002.rtf )

The third section of the volume, "Corpora with learners", begins with 
Passapong Sripicharn's research report on "Examining native 
speakers' and learners' investigation of the same concordance data 
and its implications for classroom concordancing with EFL learners". In 
the recounted experiment, six BA-level Thai and British students were 
presented with brief, pre-selected concordance material and asked to 
perform three simple tasks: (1) compare collocations of two verbs, (2) 
name the difference between two groups of sentences arranged 
according to grammatical patterns, (3) guess the meaning of a 
concordanced word, complete a gapped line and (most interestingly) 
justify the answer during a taped interview. The results showed that 
the Thai students were eager to apply data-driven strategies, while the 
native-English students preferred to rely on intuition, generalize 
beyond the data, question the evidence and call up exceptions. Such 
results, while probably anticipated, may have been prompted by the 
non-native English group having been introduced to 
concordancing prior to the experiment. This flaw in the set-up appears 
rather unfortunate; however, the study validly points out that 
concordancing does not always have to be used in a data-driven way 
(cf. Aston 2001: 22-25), and that limited corpus evidence can encourage 
overgeneralizing -- a point for teachers preparing material to beware of.

In "Some lessons students learn: Self-discovery and corpora", Pascual 
Pérez-Paredes and Pascual Cantos-Gómez outline a corpus-based, 
form-focused protocol designed to help English learners attain greater 
awareness of and control over their spoken performance. For 
convenience, the protocol only monitors the use of words. Students 
access and query hyperlinked transcriptions and audio recordings of 
their aural output, and, guided by a series of open-ended questions, 
compare the statistics from their own file with the average class results 
and then with data derived from reference corpora (it is, however, not 
clear which corpora are used for reference). Pérez-Paredes and 
Cantos-Gómez describe their system as promoting Nunan's fifth stage 
of learner autonomy (learner as researcher) and report generally 
positive feedback from their students. A convenient feature of this 
networked database environment is that student data are processed 
statistically (cluster analysis), allowing teachers to classify learners by 
performance. Overall, the system described is an interesting example 
of how learner-corpus data can inform IT solutions for learning, a 
promising line of development for intelligent CALL (I-CALL).

In the final paper of the book's third section, entitled "Student use of 
large, annotated corpora to analyze syntactic variation", Mark Davies 
describes his corpora-supported advanced online course in Spanish 
syntax, in which students learn to retrieve and combine data from 
multiple corpora in order to solve variation tasks. The corpora are 
large (100 M words; 200 M words; and the Spanish web -- Google and 
Google Groups), diversified, and, in one case, richly annotated to 
enable more powerful searches (Davies' Corpus del Español 
resembles in this respect his VIEW interface to the British National 
Corpus, http://view.byu.edu/ ). The course is not corpus-driven, 
however: hands-on practice follows readings from a grammar book, 
and mainly involves testing the validity of the rules and claims found 
there. The author emphasizes the importance of an intensive, task-
based training stage, and of supervising students' early projects 
during which they can develop expertise in choosing and combining 
corpora and search patterns. A valuable pedagogical suggestion is the 
shift from purely quantitative to more explanatory tasks in mid-course. 
Concluding, Davies argues that, at advanced levels of proficiency, 
even less experienced students can learn to use and appreciate 
corpora, a cogent point considering the author's enormous experience 
in the field.

In the last, forward-looking article on "Facilitating the compilation and 
dissemination of ad-hoc web corpora", William H. Fletcher summarizes 
the current possibilities for linguistic exploitation of the World Wide 
Web and outlines the prospects for future developments. According to 
Fletcher, "[t]he quantity of information online greatly surpasses its 
overall quality" (p. 275); on the other hand, the infrequency of some 
phenomena and genres and the inevitable ageing of finite corpora 
force linguists to embrace the web. Techniques of access range from 
the most widely known 'browsing' to 'hunting', 'grazing' and 
automatic 'crawling', but none of them guarantees immediately high 
quality results to linguistic searches. There is therefore a strong need 
to filter search engine output by applying linguistic and heuristic "noise-
reduction techniques", which, however, can unduly prolong access 
time. Fletcher considers two possibilities for breaking the deadlock: (1) 
the creation of a special Web Corpus Archive (WCA), whereby 
professionals would help one another by analysing and classifying 
web content and submitting reports which would trigger automatic 
download and annotation of the pages for future use; and (2) the 
creation of a special Search Engine for Applied Linguists (SEAL), 
enabling direct, highly sophisticated KWiC concordancing of the web. 
Neither solution is free from problems (securing copyright, providing 
sufficient processing power, etc.). Fletcher then compares 
his 'idealistic' visions with the existing facilities: online concordancers 
for static corpora, commercial meta search engines, web 
concordancers (e.g. WebCorp), the Internet Archive ('Wayback 
machine'), advanced linguistic search engines. A practical tip resulting 
from this discussion is that students should be taught "responsible 
online searching techniques". Overall, the paper offers a useful, if 
slightly lengthy (28 pages), overview of the workings of the 
web-as-corpus sub-domain, supported by numerous URLs 
that IT-minded teachers should be willing to explore.

EVALUATION:

As transpires from the extended summary above, Aston et al.'s 2004 
collection will be a valuable resource for teachers seeking working and 
prospective solutions, as well as up-to-date theoretical motivations, for 
corpus-informed teaching practice. The book offers an admirable 
continuation to Aston's edited volume of 2001 as well as to the 
previous volumes of TaLC proceedings. A sceptical reader could 
require more theory and more empirical verification, but there is no 
doubt that the field of 'applied corpus linguistics' (a broader term, 
borrowed here from the name of an American association and a recent 
volume of proceedings from a conference it organized) is growing, 
maturing and slowly developing its standards (Hoey, Mauranen, 
Römer). This progress should lead to the establishment of models for 
integrating corpora with other teaching methods and programmes -- a 
key to success not just in academic education (Kettemann and Marko), 
but also at schools. As noted by several authors, some technical and 
practical issues must be resolved before corpus-driven tasks can be 
added to the bank of regular in-course activities. Both Davies and 
Mauranen point out the need for extensive, task-based pre-training, a 
point all the more vital if the level of initial computer literacy affects 
students' motivation and performance. In addition, the relatively long 
time required to complete corpus-based activities may confine them to 
some tasks and/or some learners: more empirical testing is needed to 
explore such feasibilities. Thirdly, ensuring universal and dependable 
access (cf. Fletcher) to 'optimal' corpora, both general and 
specialized, large and small, will be another key factor determining the 
popularity of corpus methods among teachers and learners. Despite 
these as yet unresolved problems, Aston et al.'s collection clearly 
demonstrates that enough experience has been accumulated in this 
area for a comprehensive resource book for teachers to be offered, 
which could recommend specific tools, corpora, methods, techniques, 
exercises, etc., for meeting specific teaching aims in a typical (not 
necessarily task-based, Gabrielatos 2005) language learning syllabus.

Compared with data-driven learning, the 'behind-the-scenes' (Aston 
2000) approach, i.e. corpus-based linguistic research, is well 
entrenched. The large size of the "Corpora by learners" section shows 
that learner corpora have become a staple component of corpus 
networks exploited for educational purposes (all major ELT publishers 
today rely on their collections of learner data). Ignoring corpus 
evidence is likely to lead to artificiality of input, which many applied 
corpus linguists openly criticize. However, as already mentioned, 
corpus-derived results, even those supported by the most 
sophisticated statistical methods, must be used wisely and in 
proportion with other factors. On the other hand, as this volume richly 
demonstrates, progress in learner corpus research is on-going and 
constantly diversifying inasmuch as ever larger and better annotated 
resources are created and new (networking) technologies are reached 
for (e.g. Tono, Pérez-Paredes and Cantos-Gómez). The field of 
pedagogical exploitation of corpora is thus hardly ready to settle, 
inviting interested educators continually to refresh their position on its 
development. Of course, Aston et al.'s volume could not be 
comprehensive in this respect.

There are no obviously weak papers in the reviewed volume, 
although, as indicated, some contributions could be questioned for 
methodological assumptions or for insufficient scepticism. Additionally, 
some debatable omissions in the use of sources may be noted, e.g. 
Römer's lack of reference to earlier word-based analyses of written 
textbooks (e.g. Ljung 1990) or Kettemann and Marko's lack of mention 
of the Web Concordances service. Fletcher, at the time of writing his 
article, could not have heard of the WebCorp team's plans to develop 
their own linguistic search engine, or of LexWare Culler -- a fast, 
Google-based web concordancer equipped with part-of-speech 
search syntax and lemmatization rules for grouping results (several 
major languages are supported). These gaps, however, hardly 
undermine the overall quality of the volume.

The editing is also generally careful, the few slips mostly concerning 
orthography and punctuation. The grossest oversight is the missing 
Table 2 in Flowerdew's article, an omission preventing comparison 
with Table 1, called upon several times.

REFERENCES:

Aarts, Jan and Sylviane Granger. 1998. "Tag sequences in learner 
corpora: a key to interlanguage grammar and discourse". In: Sylviane 
Granger (ed.), Learner English on computer, London, Longman. 
132-141.

Aston, Guy. 2000. "Learning English with the British National Corpus". 
In: M. Paz Battaner & Carmen López (eds), VI jornada de corpus 
lingüístics, Barcelona, Institut universitari de lingüística aplicada, 
Universitat Pompeu Fabra. 15-40.

Aston, Guy. 2001. "Learning with corpora: an overview". In: Guy Aston 
(ed.). 7-45. 

Aston, Guy. (ed.). 2001. Learning with corpora. Houston, TX: 
Athelstan.

Gabrielatos, Costas. 2005. "Corpora and language teaching: just a 
fling or wedding bells?". TESL-EJ 8, 4. 
http://www-writing.berkeley.edu/TESL-EJ/ej32/a1.html

Hoey, Michael. 2005. Lexical priming: a new theory of words and 
language. London: Routledge.

LexWare Culler. 2004-5. 
http://82.182.103.45/lexware/concord/culler.html

Ljung, Magnus. 1990. A study of TEFL vocabulary. Stockholm: 
Almqvist & Wiksell.

Scott, Mike. 1996. WordSmith Tools. Oxford: Oxford University Press.

The Web Concordances. [nd]. 
http://www.dundee.ac.uk/english/wics/wics.htm

WebCorp. 1999-2005.   http://www.webcorp.org.uk/ 

ABOUT THE REVIEWER

Dr Przemyslaw Kaszubski is a teacher of academic writing and a 
corpus linguistics researcher and lecturer at the School of English, 
Adam Mickiewicz University, Poznan, Poland. His current research 
interests concern the use of online corpus resources for academic 
writing instruction. He maintains an online concordancer for English 
students, and a large corpus linguistics bibliography 
( http://www.staff.amu.edu.pl/~przemka/ ). In 1995-2002 he co-
ordinated the compilation of the Polish subcorpus of the International 
Corpus of Learner English.
