[Corpora-List] "errors and the art of correcting"

Kristina Hmeljak kristina.hmeljak at guest.arnes.si
Mon Nov 13 15:38:35 UTC 2006


Another corpus of learner English with native-speaker corrections
is being developed by Kevin Mark at Meiji University. It is made up of
sentences produced by Japanese university students and their
reformulations, written by Mark himself.

A paper on this subject from the 2003 Corpus Linguistics conference in
Lancaster is cached at
http://scholar.google.com/scholar?hl=en&lr=&q=cache:8XJdEcxkfxUJ:korpus.dsl.dk/cl2003/cdrom/papers/mark.pdf+Kevin+Mark,+learner+corpus

Kristina Hmeljak
Dept. of Asian and African Studies, Faculty of Arts,
University of Ljubljana


On 11 Nov 2006, at 9:50 PM, TadPiotr wrote:

> A collection of corpora along those lines -- native vs non-native
> English -- has been compiled by Sylviane Granger. At least the Polish
> sub-corpus contained texts corrected later by native speakers. The
> analysis of the errors was done by Przemek Kaszubski in his PhD. Here
> are some quotations and links:
>
> "One of the major international collections built on strict sampling
> principles is the International Corpus of Learner English (ICLE),
> which contains argumentative essays acquired from learners in more
> than a dozen different EFL countries in Europe and beyond. Although
> the ICLE corpus is not yet available to the public, research on it
> has been carried out for years."
> Przemek Kaszubski http://www.hltmag.co.uk/dec99/idea.htm
>
> The Louvain Centre for English Corpus Linguistics has played a
> pioneering role in promoting computer learner corpora (CLC) and was
> among the first, if not the first, to begin compiling such a corpus.
> The Centre's computerised databank is known as the International
> Corpus of Learner English (ICLE) and is the result of over ten years
> of collaborative activity between a number of universities
> internationally and currently contains over 2 million words of
> writing by learners of English from 19 different mother tongue
> backgrounds. The writing in the corpus has been contributed by
> advanced learners of English as a foreign language rather than as a
> second language and is made up of 19 distinct sub-corpora, each
> containing one language variety (E2French, E2German, E2Swedish etc).
> The type of writing being collected is essay writing (see below for
> fuller details). Advanced students can, for the purpose of the
> project, be broadly defined as university students of English in
> their 3rd or 4th year of study. In cases where the comparability of
> the level is in doubt, sample pieces of writing should be submitted
> beforehand.
> http://cecl.fltr.ucl.ac.be/Cecl-Projects/Icle/icle.htm#heading1
>
> Best
> Tadeusz Piotrowski


