New German corpus

Brian MacWhinney macw at cmu.edu
Fri Jan 5 07:17:37 UTC 2001


Dear Info-CHILDES,
  I am happy to announce the availability of a new corpus of German child
language data from Gisela Szagun at Oldenburg.  As the following readme file
indicates, this large corpus includes data from both normally-hearing
children and children with cochlear implants.  The data are currently only
available directly from Dr. Szagun.

--Brian MacWhinney

German - Szagun
Prof. Dr. Gisela Szagun
Fb 5, Institut fuer Kognitionsforschung
Carl-von-Ossietzky Universität Oldenburg
Postfach 2503, Gebäude A6
D-26111 Oldenburg
Germany
gisela.szagun at uni-oldenburg.de


This large corpus of German child language includes speech from
normally-hearing (NH) children, as well as children with cochlear implants
(CI).  The title of the project is "Language acquisition in children with
cochlear implants and with normal hearing."  It was funded by Deutsche
Forschungsgemeinschaft (DFG) grants Sz 41/5-1 (1996-98) and Sz 41/5-2
(1999-2000).  In addition to documenting language development in these two
groups, this corpus is the first comprehensive data collection of
child-directed adult speech in German.  Each of the 426 data files is a
transcript from a two-hour session.  Researchers contributing to the project
include Sonja Arnhold-Kerri, Tanja Hampf, Elfrun Klauke, Stefanie Kraft,
Dorit Pefferkorn, Dagmar Roesner, Claudia Steinbrink, Gisela Szagun, Bettina
Timmermann, and Sylke Wilken. These data are currently only available
directly from Gisela Szagun.

For the NH children, there are 6 children with 22 data points each (ann,
eme, fal, lis, rah, soe) for a total of 132 files.  For these 6 children,
recordings were taken every 5-6 weeks from ages 1;4 to 3;8.  For the other 16
children, there are only five data points between 1;4 and 2;10 for a total
of 80 files.

The 22 NH children and the 22 CI children were matched for initial language
level at an MLU of 1.25.  The 22 CI subjects (12 male, 10 female) were all
deaf before implantation.  The mean age at implantation was 2;5 with a SD of
0;9 and a range of 1;2 to 3;10.  Each of the CI subjects is given a tune-up
age, which is the time since the first fitting of the device to the child's
comfortable level of hearing.  In this group, all 22 children were recorded
every 4 months between 5 and 44 months after implantation, i.e. up to the
first 2 years 8 months after implantation.  In addition 9 children were
recorded more frequently within this time span.
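
Ages in this description appear to follow the usual years;months notation
(for example, 2;5 for two years and five months).  As a minimal sketch under
that assumption, the following Python helpers convert between this notation
and a plain month count, which makes it easier to compare chronological ages
with the month-based recording schedule.  The function names are hypothetical
and are not part of the corpus or of any CHILDES tool.

```python
import re


def age_to_months(age: str) -> int:
    """Convert a 'years;months' age such as '2;5' to a total number of months."""
    match = re.fullmatch(r"(\d+);(\d+)", age.strip())
    if match is None:
        raise ValueError(f"not a years;months age: {age!r}")
    years, months = int(match.group(1)), int(match.group(2))
    if months > 11:
        raise ValueError(f"months part must be 0-11: {age!r}")
    return years * 12 + months


def months_to_age(total_months: int) -> str:
    """Convert a non-negative month count back to 'years;months' notation."""
    if total_months < 0:
        raise ValueError("month count must be non-negative")
    years, months = divmod(total_months, 12)
    return f"{years};{months}"


if __name__ == "__main__":
    # Mean age at implantation and its range, as given in the text above.
    for age in ("2;5", "1;2", "3;10"):
        print(age, "=", age_to_months(age), "months")
```

Only the years;months part is handled here; ages that carry an additional
day component would need a wider pattern.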

At four data points, at least 600 utterances of child-directed speech were
transcribed.  The data points for normally-hearing children are 1;4, 1;8, 2;1,
and 2;5.  For CI children, the corresponding tune-up ages are 0;5, 0;9, 1;2,
and 1;6.  There are between 2000 and 2400 utterances per adult.

Transcription conventions include:
1. Nouns are placed in lower case, except for proper nouns and family forms
such as Papa, Mama, Oma, and Opa.
2. In accord with CHAT style, initial words are not capitalized unless they
are proper nouns.  
3. Comma use is avoided.
4. German ß is written as ss.
5. Schwa is written as 6.
6. The word "nein" is represented as "mm" and "66".  The word "ja" is
represented as "mhm" and "hm."
7. Vowel length is represented by adding "h" as in &eh.
8. The forms "kuck ma" and "kuck mal" are transcribed as "guck mal."
9. Animal sounds are coded with ampersand, as in &miau, &muh, &wuf, &quak,
&gronk, or &gack.
10. Interjections found in Duden are given in word form, as in aua (hurt),
uih, oh, oi (surprise), aha (insight), etc.
11. Shortenings are not transcribed in CHAT parenthesis form; instead they
are transcribed directly, so that shortened "ist" is "is" rather than CHAT
"is(t)".  The same applies to nich(t), jetz(t), (e)'s, and un(d).
12. Similarly, verb suffix deletion is marked by an apostrophe, as in
kommst du      komms' du
kommt der      komm' der
geht dein      geh' dein
sind die       sin' die
passt da       pass' da
kommt denn     komm' denn    or    kommt' nn
geht doch      geh' doch
hat es         hat's
13. Even stronger contractions occur with "du", as in
full form:    shortened form:    even shorter form:
willst du     willst 'e          wills' 'e
hast du       hast 'e            has' 'e
bist du       bist 'e            bis' 'e
hörst du      hörst 'e           hörs' 'e
14. Contracted nominal endings are:
ein         'n
eine        'ne
einen       ein'n     'nen    'n
einem       ei'm      'nem    'm
einer       'ner
deinen      dein'n
deinem      dei'm
meinen      mein'n
meinem      mei'm
seinen      sein'n
in den      in'n      in'n kindergarten
mit der     mit'er    mit'er schale
auf den     auf'n     auf'n tisch
auf dem     auf'm     auf'm tisch
für das     für's     auf's bett
mit dem     mit'm     mit'm schuh
den kleinen elefant    den klein'n elefant
die kleinen löwen      die klein'n löwen
15. Other contractions include:
nehmen      nehm'n    etc., for some infinitives
blumen      blum'n    for -en in some plurals
16. Eye-dialect is used for: haben → ham, du → de, wir → wa.
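
Because shortened forms are transcribed directly rather than in CHAT
parenthesis form, anyone who needs the full forms has to restore them in a
separate step.  The following Python sketch shows one possible way to do this
with a lookup table built only from unambiguous examples listed above.  The
table is partial, the name expand_contractions is hypothetical, and forms
such as 'n (which may stand for "ein" or "einen") are deliberately left out
because they cannot be resolved without context.

```python
# Partial lookup table taken from the examples in the conventions above.
# Ambiguous forms (e.g. 'n for "ein" or "einen") are intentionally omitted.
CONTRACTIONS = {
    "is": "ist",
    "nich": "nicht",
    "jetz": "jetzt",
    "un": "und",
    "'ne": "eine",
    "'ner": "einer",
    "ei'm": "einem",
    "'nem": "einem",
    "in'n": "in den",
    "mit'er": "mit der",
    "auf'n": "auf den",
    "auf'm": "auf dem",
    "mit'm": "mit dem",
    "für's": "für das",
    "ham": "haben",
}


def expand_contractions(utterance: str) -> str:
    """Replace each whitespace-separated token that has a known full form."""
    tokens = utterance.split()
    return " ".join(CONTRACTIONS.get(token, token) for token in tokens)


if __name__ == "__main__":
    for example in ("auf'm tisch", "mit'er schale", "in'n kindergarten"):
        print(example, "->", expand_contractions(example))
```

Tokens that are not in the table pass through unchanged, so the sketch never
guesses at a form it does not know.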

The transcripts also mark whether the mothers used hyperclarity in their
speech.  Hyperclarity includes stressed pronunciation of -en, -em, -el,
-er, -e at the ends of words as well as other syllable stressings.  Another
form of hyperclarity involves the drawling of vowels and nasals, which is
marked in the standard CHAT format with a colon.

Postcodes:

Fillers, interjections, exclamations:      [+ F]
Routines:                                  [+ R]
One-word answers to Yes/No questions:      [+ Q]
Partially unintelligible:                  [+ PI]
Isolated onomatopoeic:                     [+ O]
Isolated vocalizations:                    [+ V]
Imitations:                                [+ I]
Elicited imitations:                       [+ EI]
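
As an illustration of how these postcodes might be used when filtering
utterances, the following Python sketch pulls the codes out of a transcript
line with a regular expression.  It assumes that postcodes appear in the
bracketed form shown above; the sample line and the function name
extract_postcodes are hypothetical and are not taken from the corpus.

```python
import re

# Postcode labels as listed in the readme above.
POSTCODES = {
    "F": "filler, interjection, or exclamation",
    "R": "routine",
    "Q": "one-word answer to a yes/no question",
    "PI": "partially unintelligible",
    "O": "isolated onomatopoeic",
    "V": "isolated vocalization",
    "I": "imitation",
    "EI": "elicited imitation",
}

# Matches "[+ X]" where X is one or more capital letters.
POSTCODE_PATTERN = re.compile(r"\[\+ ([A-Z]+)\]")


def extract_postcodes(line: str) -> list[str]:
    """Return the known postcodes found in a transcript line, in order."""
    return [code for code in POSTCODE_PATTERN.findall(line) if code in POSTCODES]


if __name__ == "__main__":
    sample = "*CHI:\toh . [+ F]"  # hypothetical utterance line
    for code in extract_postcodes(sample):
        print(code, "=", POSTCODES[code])
```

Codes that are not in the list above are ignored, so the sketch stays limited
to the eight postcodes this readme documents.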


                
Speech Samples for CI Children
Child            Sex    Samples in Set 1    Samples in Set 2    Total
Adriane          f      7                   1                   8
Anne             f      7                   2                   9
Claudia          f      10                  1                   11
Daniel           m      10                                      10
Eileen           f      11                  1                   11
Erik             m      10                                      10
Finn             m      11                                      11
Finn-Hendrick    m      7                                       7
Lara             f      8                   2                   10
Laura            f      8                   1                   9
Lena             f      11                  1                   12
Maik             m      6                   1                   7
Marco            m      7                                       7
Marius           m      7                   1                   8
Michelle         f      6                                       6
Mike             m      6                   2                   8
Nancy            f      10                  4                   14
Philipp          m      9                                       9
Ricardo          m      6                   1                   7
Sara             f      11                                      11
Sarah-M          f      11                  2                   13
Silja            f      10                  2                   12


Speech Samples for NH Children
Child            Sex    Samples in Set 1    Samples in Set 2    Total
Anna             f      15                  7                   22
Emely            f      15                  7                   22
Falko            m      15                  7                   22
Lisa             f      15                  7                   22
Rahel            f      15                  7                   22
Soeren           m      15                  7                   22

Celina           f      5                                       5
Emely S          f      5                                       5
Finn G           m      5                                       5
Ina              f      5                                       5
Isabel           f      5                                       5
Jores            m      5                                       5
Konstantin       m      5                                       5
Leo              m      5                                       5
Leon             m      5                                       5
Luisa            f      5                                       5
Mario            m      5                                       5
Marlou           f      5                                       5
Martin           m      5                                       5
Neele            f      5                                       5
Sina             f      5                                       5
Sino             m      5                                       5



The following manual describes the coding of MLU, morphology, syntax, and
mothers' speech acts:

Szagun, G. (1999b). Rules for transcribing and analyzing German child
language. Institut für Kognitionsforschung, University of Oldenburg,
Germany. 

Articles based on these data should cite one or more of these sources:

Szagun, G. (1998). Spracherwerb bei Kindern mit Cochlea-Implantat: Erste
Ergebnisse einer entwicklungspsycholinguistischen Studie. Sprache - Stimme -
Gehör, 22, 133-138.
Szagun, G. (2000). The acquisition of grammatical and lexical structures in
children with cochlear implants: A developmental psycholinguistic approach.
Audiology & Neuro-Otology, 5, 39-47.
Szagun, G. (in press). Learning the h(e)ard way: The acquisition of grammar
in young German-speaking children with cochlear implants and with normal
hearing. In Windsor, F., Kelly, L. & Newlett, N. (Eds.), Themes in Clinical
Phonetics and Linguistics. Mahwah, New Jersey: Erlbaum.
Steinbrink, C. & Szagun, G. (1999). Der Einfluß überdeutlichen Sprechens auf
den Spracherwerb von Kindern mit Cochlea Implantat. Sprache - Stimme -
Gehör, 23, 213-217.


