New Thai corpus

Brian MacWhinney macw at cmu.edu
Wed Feb 4 23:47:41 UTC 2004


Dear Info-CHILDES,
  I am happy to announce the addition to CHILDES of a new corpus documenting
the development of language and communicative interaction in Thai children
aged 6-24 months.  This corpus is a collaboration between the Centre for
Research in Speech and Language Processing (CRSLP), Chulalongkorn
University, Thailand, led by Assistant Professor Dr. Sudaporn
Luksaneeyanawin, and MARCS Auditory Laboratories, University of Western
Sydney, Australia, led by Professor Dr. Denis Burnham. To recognize this
collaboration, we call it the CRSLP-MARCS corpus.

The data consist of video-linked transcriptions of 18 Thai adult-child dyads,
recorded at three-month intervals from the child's age of 6 months to 24
months. Each session lasted 20 minutes, and for CHILDES the sessions have
been split into 10-minute files, for a total of 242 files.

The data comprised the major part of a doctoral thesis by Sorabud
Rungrojsuwan, 'First Words: Communicative Development of 9- to 24-Month-Old
Thai Children'. Data were collected by Sorabud Rungrojsuwan and Nirattisai
Krajaikiat, postgraduate research assistants at CRSLP, during the period
January 2000-January 2002, using a Sony Digital Handycam DCR-TRV320E video
camera. The videotaped data were then computerized and converted into 242
video files (in .mpg format) using the Ulead Video Studio 4.0 SE Basic
program. Using the CLAN program (CHAT mode), Sorabud transcribed the data in
Thai script. Roman phonological representations of these Thai transcriptions
were added automatically by a Thai text-to-phonological representation
program developed at the CRSLP by Assistant Professor Dr. Sudaporn
Luksaneeyanawin.

The transcripts can be found at
http://childes.psy.cmu.edu/data/eastasian/thai/

And the video can be browsed from the "directly browsable" link at
http://childes.psy.cmu.edu/data/

I would like to add an editorial note here.  Even if you cannot speak Thai,
the videos should be of rather universal interest for exploring ideas about
early vocalizations and mother-child interactions.

Many thanks to both of these work groups for contributing this new corpus.

--Brian MacWhinney


