[Corpora-List] "errors and the art of correcting"
Kristina Hmeljak
kristina.hmeljak at guest.arnes.si
Mon Nov 13 15:38:35 UTC 2006
Another corpus of learner English with native-speaker corrections
is being developed by Kevin Mark at Meiji University. It is made up of
sentences produced by Japanese university students and their
reformulations, written by Mark himself.
A paper on this subject, presented at the 2003 Corpus Linguistics
conference in Lancaster, is cached at
http://scholar.google.com/scholar?hl=en&lr=&q=cache:8XJdEcxkfxUJ:korpus.dsl.dk/cl2003/cdrom/papers/mark.pdf+Kevin+Mark,+learner+corpus
Kristina Hmeljak
Dept. of Asian and African Studies, Faculty of Arts,
University of Ljubljana
On 11. Nov 2006, at 9:50 PM, TadPiotr wrote:
> A collection of corpora along those lines -- native vs non-native
> English -- has been compiled by Sylviane Granger. At least the Polish
> sub-corpus contained texts corrected later by native speakers. The
> analysis of the errors was done by Przemek Kaszubski in his PhD. Here
> are some quotations and links:
>
> "One of the major international collections built on strict sampling
> principles is the International Corpus of Learner English (ICLE), which
> contains argumentative essays acquired from learners in more than a
> dozen different EFL countries in Europe and beyond. Although the ICLE
> corpus is not yet available to the public, research on it has been
> carried out for years."
> Przemek Kaszubski http://www.hltmag.co.uk/dec99/idea.htm
>
> The Louvain Centre for English Corpus Linguistics has played a
> pioneering role in promoting computer learner corpora (CLC) and was
> among the first, if not the first, to begin compiling such a corpus.
> The Centre's computerised databank is known as the International
> Corpus of Learner English (ICLE) and is the result of over ten years
> of collaborative activity between a number of universities
> internationally and currently contains over 2 million words of writing
> by learners of English from 19 different mother tongue backgrounds.
> The writing in the corpus has been contributed by advanced learners of
> English as a foreign language rather than as a second language and is
> made up of 19 distinct sub-corpora, each containing one language
> variety (E2French, E2German, E2Swedish etc). The type of writing being
> collected is essay writing (see below for fuller details). Advanced
> students can, for the purpose of the project, be broadly defined as
> university students of English in their 3rd or 4th year of study. In
> cases where the comparability of the level is in doubt, sample pieces
> of writing should be submitted beforehand.
> http://cecl.fltr.ucl.ac.be/Cecl-Projects/Icle/icle.htm#heading1
>
> Best
> Tadeusz Piotrowski