request for ideas

Brian MacWhinney macw at cmu.edu
Fri Feb 5 17:03:26 UTC 1999


Dear Info-CHILDES,
  Thanks to all of you who have been sending me letters for NIH for the
CHILDES renewal.  In a very similar vein, I have been working on a
proposal to extend the usage of the CHILDES system and related tools
from other projects to a wider scope of data in the social sciences.
At the end of this note, I summarize some of the projects that come to
mind.  What all these projects share is an interest in non-scripted
social interactions.  Many use video recording and all use audio
recording.  Almost all do some type of transcription, coding, and
annotation that is linked to the original recordings.  Working with the
Informedia Project here at CMU and the Linguistic Data Consortium at
Penn, I have been trying to collect clear examples of projects of this
type throughout the social sciences.  Mark Liberman and Steven Bird
have completed a fairly nice survey of computational approaches to this
problem, including formats and tools.  This summary can be found at
http://morph.ldc.upenn.edu/annotation/
  What I would like to create next is a set of links to projects that
actually have rich datasets and interests in the collection and
analysis of such datasets.  These would be pointers to web sites,
the literature, or people.  Appended is my current set of good
candidates.  Can people suggest additional candidates?  If so, feel
free to send the info either to me or to the list.  Please note that the
names given in this list reflect my own parochial emphasis and this is
exactly what I am trying to correct.

--Brian MacWhinney

1.  CHILDES
2.  Classroom interactions. Researchers such as James Stigler have
collected videotaped data comparing Japanese, German, Czech,
Spanish, and American instruction in mathematics.
3.  Conversation analysis.  Conversation analysis is a methodology and
intellectual tradition developed by Harvey Sacks, Gail Jefferson,
Emanuel Schegloff, and others. Recently, workers in this field have
begun to publish fragments of their transcripts over the Internet.
CHILDES now supports this type of transcription.
4.  Second language learning.  Reiko Uemura of Fukuoka Institute of
Technology has collected a large database of videotaped and transcribed
interactions of English speakers learning Japanese and Japanese
speakers learning English.  Manfred Pienemann in Sydney has a similar
database for Australian learners of Japanese, French, and German.  The
audio quality of these recordings is high and they provide excellent
material for error analysis and other studies of second language
acquisition.  Uemura has already put these data onto the Internet using
RealAudio and still pictures.
5.  National corpora.  There are many large computerized corpora of the
major languages, often identified as national projects, that contain
interactional material. These include the British National Corpus, the
London-Lund Corpus, the Australian National Database of Spoken
Language, the Corpus of Spoken American English, the Vincent Voice
Library of historical American recordings, and others.
6.  SignStream.  The NSF-sponsored SignStream project (Carol Neidle at
Boston University, Dimitri Metaxas, Penn) has formulated programs for
coding videotaped data of signed language.   Researchers such as David
McNeill at the University of Chicago have developed schemes for coding
the relations between language and gesture.
7.  Speech production, aphasia, language disorders, and disfluency. (A
lot of this is already in CHILDES, but more is needed).
8.  Clinical psychology.  Psychiatrists such as Mardi Horowitz have
explored transcript analysis and annotation.
9.  Intensive behavioral analyses.  (I need concrete references to this
area.)
10.  Animal behavior.  Videotapes of animals in experimental situations
are often coded using tools such as The Observer.  The formal issues in
coding audio or video records of animal behavior are similar to those
that arise for coding human interaction, though of course the content
may be quite different.
11.  Documentary.  Since the beginning of the century, ethnographers
have pioneered the use of film documentaries to record the lives of
non-Western peoples.  Much of this documentary material is still
available and includes excellent video footage.  Another example of a
documentary collection is Steven Spielberg's documentary of the
Holocaust.
12.  Human tutoring.  Researchers such as Kurt VanLehn, Micki Chi, Ken
Koedinger, and Arthur Graesser have conducted detailed video studies of
the tutoring process.
13.  Computer tutoring.  The process of human tutoring can be
successfully compared to the process of computer tutoring. Researchers
at CMU such as Ken Koedinger, Al Corbett, Bonnie John, Martha Alibali,
John Anderson, Steve Ritter, and Kevin Gluck have begun to make these
comparisons.
14.  Human-computer interaction.  Finally, there is a large volume of
work in the field of Human-Computer Interaction that relies on
videotapes, codes, and analyses similar to the ones required in the
CHILDES project.

If you have any further pointers I can add to this list, including
self-referential ones, please tell me.  Thanks.


