Russian corpora

Fri Jan 22 22:16:04 UTC 1999

In reply to Tanja Anstatt's query about Russian corpora --

Eva Bar-Shalom and I have videotaped and transcribed a longitudinal corpus
of fifteen spontaneous speech samples from a monolingual Russian-learning
child (pseudonym 'Tanya', ages 2;5.14 - 2;11.20) who was recorded here in
the United States at a rate of approximately twice per month.  At the time
of the study, the child was cared for at home by her monolingual (native
Russian) mother and her bilingual (native Russian, ESL) father.  The
language used at home was consistently Russian, and exposure to English
was minimal.

The Tanya corpus was transcribed by Eva Bar-Shalom, a native
Russian speaker, at the Language Acquisition Laboratory, Department of
Linguistics, University of Connecticut.  Transcription followed CHAT
conventions.  The resulting transcripts must be considered preliminary,
however, because they have not yet been subjected to any rigorous form
of reliability checking.

Analyses of syntax and morphology in the preliminary version of Tanya's
corpus have thus far been included in two research reports:

Bar-Shalom, Eva and Snyder, William (1997) "Optional infinitives in
  Russian and their implications for the pro-drop debate."  In Martina
  Lindseth and Steven Franks (eds.) _Formal Approaches to Slavic
  Linguistics: The Indiana Meeting 1996_.  Ann Arbor: Michigan Slavic
  Publications, pp. 38-47.

Bar-Shalom, Eva and Snyder, William (1998) "Root infinitives in Child
  Russian:  A comparison with Italian and Polish." In Richard Shillcock,
  Antonella Sorace, and Caroline Heycock (eds.) _Language Acquisition:
  Knowledge Representation and Processing.  Proceedings of GALA '97._
  Edinburgh, UK: The University of Edinburgh.

At present Eva and I are prepared to share the Tanya corpus with other
researchers, although we still hope to carry out more thorough reliability
checking before we submit the corpus to the CHILDES database.  In particular,
we'd like to ensure that the intended use of the Tanya corpus by other
researchers is consistent with its current level of reliability.

Thus, for the time being, requests for a copy of the Tanya corpus should
be directed to me, William Snyder, at the following (electronic or
postal) address:

  wsnyder at sp.uconn.edu

  Prof. William Snyder, Ph.D.
  Dept. of Linguistics, U-145
  University of Connecticut
  341 Mansfield Road
  Storrs, CT 06269-1145
  USA

With best regards,

William