[Corpora-List] new Falko German learner corpus release - FalkoL2WHIGv2.0
Marc Reznicek
marc.reznicek at staff.hu-berlin.de
Mon Sep 17 14:26:13 UTC 2012
The error-annotated German learner corpus Falko has released a new
subcorpus: FalkoEssayL2WHIGv2.0 including 195 argumentative essays by
advanced learners of German (117,189 tokens).
For each text, two full-text target hypotheses (a minimal morphosyntactic
normalization and an extended semantic-pragmatic version) have been
manually annotated.
Each representation (the learner text and both target hypotheses) has been
POS-tagged and lemmatized (TreeTagger & RFTagger). RFTagger morphological
annotation has been integrated as well.
On this basis, tags have been added that mark the differences between the
learner text (with its POS and lemma annotations) and the respective target
hypotheses (with their POS and lemma annotations).
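To illustrate what such difference tags capture, here is a minimal Python
sketch that compares one learner token sequence with one target hypothesis.
It is not the procedure used to build Falko: the function name diff_tags,
the tag labels CHA/DEL/INS, and the example sentence are assumptions made
for illustration only, and the alignment comes from Python's standard
difflib module. The authoritative tag inventory is defined in the
annotation guidelines.

```python
from difflib import SequenceMatcher

# Hypothetical tag labels, chosen for this sketch only.
# They describe how the target hypothesis differs from the learner text.
TAG_NAMES = {"replace": "CHA", "delete": "DEL", "insert": "INS"}


def diff_tags(learner, target):
    """Return (learner_span, target_span, tag) for every differing region."""
    matcher = SequenceMatcher(a=learner, b=target, autojunk=False)
    result = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue  # identical stretches receive no difference tag
        result.append((learner[i1:i2], target[j1:j2], TAG_NAMES[op]))
    return result


# Invented example sentence, not taken from the corpus.
learner = ["Ich", "habe", "gestern", "in", "Kino", "gegangen"]
target = ["Ich", "bin", "gestern", "ins", "Kino", "gegangen"]

for learner_span, target_span, tag in diff_tags(learner, target):
    print(tag, learner_span, "->", target_span)
# CHA ['habe'] -> ['bin']
# CHA ['in'] -> ['ins']
```

The same comparison could be run on the POS or lemma sequences instead of
the word forms, which is how a difference on one annotation layer can be
told apart from a difference on another.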
The corpus is freely available under the following link:
http://korpling.german.hu-berlin.de/falko-suche
The annotation guidelines can be found here:
http://www.linguistik.hu-berlin.de/institut/professuren/korpuslinguistik/forschung/falko/Falko-Handbuchv2.0.pdf

Linguistic Field: Language Acquisition, Text/Corpus Linguistics
--
Marc Reznicek
Research Associate
Corpus Linguistics
Humboldt-Universität zu Berlin
Marc.Reznicek at hu-berlin.de
Tel: +49 (0)30 2093-9727
Dorotheenstr.24, 10099 Berlin
Room 3.310
http://u.hu-berlin.de/reznicek