new Spanish corpus

Brian MacWhinney macwhinn at hku.hk
Thu Jun 28 10:04:37 UTC 2001


Dear Info-CHILDES,
  I am happy to announce the addition to CHILDES of a new corpus of data on
children learning Castilian Spanish.  The corpus is contributed by Maria
Benedet, Cruz Celis, Maria Carrasco, and Catherine Snow.  It can be found in
becacesno.sit and becacesno.zip in the /spanish folder.  Here is the readme:

The database that we present here was created under the project
"Psycholinguistic assessment of language abilities in children", partially
supported by the Comunidad Autónoma de Madrid (C.A.M.) and conducted under
the direction of M. J. Benedet. The data were collected by students in the
speech therapy program at the Universidad Complutense de Madrid (U.C.M.).

The transcription was performed by students in the Speech Therapy and
Psychology programs at the U.C.M. These students were trained and
supervised by M. Carrasco and C. Celis. Catherine Snow played a central
role in our own training and in advising us. She also made possible the
partial support that we had at this stage. BECACESNO is a combination of
our initials.
    
The principal aim of this corpus is to support a descriptive study of the
normal development of conversational language skills in children between
ages 4 (3;6) and 12 (11;6).

The corpus has 81 transcriptions, each based on a one-hour audio recording.
Each tape includes a free conversation between one or more children and an
adult. The target children are normal monolingual speakers of Spanish.

The file names begin with the age of the child in years and months, then the
gender of the target child, then the letter "g" if there is a group of
children or "c" if there is a class of children.  For example, 10c03.cha
indicates a class of children between ages 9;6 and 10;6.
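
As a rough illustration of the naming convention, the following Python
sketch splits a file name into its parts. It is not part of the corpus or
of the CHILDES tools. The function name describe, the regular expression,
and the age-band rule (label N covers (N-1);6 to N;6) are assumptions
inferred only from the single example 10c03.cha. The gender codes and the
meaning of the trailing digits are not stated in the readme, so they are
returned uninterpreted.

```python
import re

# Assumed layout: leading digits (age label), a run of letters
# (gender and/or group marker), optional trailing digits, then ".cha".
NAME_RE = re.compile(r"^(\d+)([A-Za-z]+)(\d*)\.cha$")

# Markers named in the readme; assumed to be the last letter of the run.
SETTINGS = {"g": "group", "c": "class"}


def describe(filename):
    """Return a dict describing a file name, or None if it does not match."""
    m = NAME_RE.match(filename)
    if m is None:
        return None
    age_label = int(m.group(1))
    letters = m.group(2).lower()
    return {
        "age_label": age_label,
        # Assumption from the example: label 10 covers 9;6 to 10;6.
        "age_from": f"{age_label - 1};6",
        "age_to": f"{age_label};6",
        # None when the last letter is neither "g" nor "c".
        "setting": SETTINGS.get(letters[-1]),
        # Left uninterpreted: the readme does not define these parts.
        "letters": letters,
        "trailing_digits": m.group(3),
    }


if __name__ == "__main__":
    print(describe("10c03.cha"))
```

Any file name that does not fit this assumed pattern yields None, so a
caller can fall back to checking the transcript header by hand.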


--Brian MacWhinney


