[Corpora-List] Corpora of Learner English and Learner German

Andrea Mulloni andrea2 at wlv.ac.uk
Wed May 16 08:08:16 UTC 2007


Hi Barbara,

Have a look at the appendix of the document linked below; you might
find something there. It's quite outdated now (1999), but it might
serve as a reference to corpora you don't know of.

http://clg.wlv.ac.uk/~andrea/tesi%20completa.pdf

Best,

Andrea

On 16 May 2007, at 10:03, Eric Atwell wrote:

> Barbara,
>
> When looking for suitable corpora, try ELDA http://www.elda.org/
> - a search of the catalogue for "learner English Corpus" finds:
>
> ISLE Speech Corpus
>
> Approx. 20 minutes of speech (per speaker) from 23 German and 23 Italian
> intermediate learners of English. Each speaker recorded sentences from
> several blocks of differing types (reading simple sentences, using
> minimal pairs, giving answers to multiple choice questions). The prompts
> were of varying perplexities.
> About 2/3 of the data for each speaker was annotated by one of a team of
> linguists. The files were corrected first at the word level, and an
> automatic recognizer was then used to produce phone-level annotations.
> The annotator then re-annotated each sentence to mark phone and stress
> errors (e.g., substitutions, insertions, or deletions). Corpus details:
> 46 speakers (23 German and 23 Italian); 11484 utterances; 1.92
> gigabytes of WAV files (4 CDs); 17 hours, 54 minutes, and 44
> seconds of speech data. For more details, see:
>
> Menzel, W; Atwell, E; Bonaventura, P; Herron, D; Howarth, P; Morton, R;
> Souter, C. The ISLE Corpus of non-native spoken English. In Proc.
> LREC2000, vol. 2, pp. 957-964, European Language Resources
> Association. 2000. http://www.comp.leeds.ac.uk/eric/menzel00lrec.pdf
>
> Atwell, Eric; Howarth, Peter; Souter, Clive. The ISLE corpus: Italian
> and German spoken learner's English. ICAME Journal, vol. 27, pp. 5-18.
> 2003. http://www.comp.leeds.ac.uk/eric/atwell03icamej.pdf
>
>
> I hope this helps...
>
>
> Eric Atwell,
>
> Senior Lecturer, Language research group leader, School of
> Computing, Faculty of Engineering, UNIVERSITY OF LEEDS, Leeds LS2
> 9JT, England
> TEL: 0113-3435430  FAX: 0113-3435468  WWW/email: google Eric Atwell
>
>
> On Wed, 16 May 2007, Barbara Schiftner wrote:
>
>>
>> Dear all,
>>
>>
>> I am a student at the department of English at the University of
>> Vienna. In my diploma thesis, I am investigating developments
>> in learner corpus research, focusing in particular on corpora of
>> learner English and learner German.
>>
>> An integral part of my paper will be an analysis of the status
>> quo, which should incorporate a representative sample of available
>> corpora of learner English and learner German. Therefore, I would
>> be grateful for any up-to-date information about the corpora
>> listed below, or suggestions for other learner corpora that should
>> not be left out of my discussion.
>>
>>
>> Thank you for your help!
>>
>>
>>
>> Best regards,
>>
>> Barbara Schiftner
>>
>>
>>
>> This is a list of the learner corpora I have found out about so far:
>>
>>
>>
>> Corpora of Learner English
>>
>>
>> CLC (Cambridge Learner Corpus)
>>
>> CLEC (Chinese Learner English Corpus)
>>
>> HKUST (Hong Kong University of Science and Technology)
>>
>> ICLE (International Corpus of Learner English)
>>
>> JEFLL (Japanese EFL Learner)
>>
>> JPU (Janus Pannonius University Corpus)
>>
>> LLC (Longman Learners' Corpus)
>>
>> MELD (Montclair Electronic Language Database)
>>
>> Polish Learner English Corpus
>>
>> SILS (School of International Liberal Studies at Waseda University)
>>
>> TeleNex Student Corpus
>>
>> USE (Uppsala Student English Project)
>>
>>
>> Corpora of Learner German
>>
>>
>> FALKO (fehlerannotiertes Lernerkorpus des Deutschen als
>> Fremdsprache, HU Berlin)
>>
>> LeKo (Lernerkorpus, HU Berlin)
>>
>> Telecorp (Pennsylvania)
>>
>> Corpus collected by Ursula Weinberger (Lancaster)
>>
>>
>>
>> (My main focus is on written texts, but remarks about corpora of
>> spoken learner language are also welcome.)
>>
>>
>> ______________________________
>> Barbara Schiftner
>>
>> Fachdidaktisches Zentrum
>> Institut fuer Anglistik und Amerikanistik
>> Universitaet Wien
>> Spitalgasse 2-4, Hof 8
>> A-1090 Wien
>> Austria
>>
>> phone: +43-1-4277-424-53
>> e-mail: barbara.schiftner at univie.ac.at


