TalkBank

Tue Jun 8 17:33:04 UTC 1999

Project Announcement for
"TalkBank: A Multimodal Database of Communicative Interaction"

The goal of TalkBank is to create a distributed, web-based data
archiving system for transcribed video and audio data on communicative
interactions.  TalkBank builds on our experience with CHILDES and LDC
corpora, and is expected to be a major new tool for the social
sciences.  TalkBank data will be stored in an XML-based transcription
framework incorporating richly structured, time-aligned annotations.

For detailed information, please consult:
CMU  - http://childes.psy.cmu.edu/talkbank.html
Penn - http://www.ldc.upenn.edu/annotation/talkbank.html

We believe that TalkBank will benefit four types of research enterprises:

Cross-corpora comparisons.  For those interested in quantitative
  analyses of large corpora, TalkBank will provide direct access to
  enormous amounts of real-life data, subject to strict controls
  designed to protect confidentiality.

Folios.  Other researchers wish to focus on qualitative analyses
  involving the collection of a carefully sampled folio or casebook of
  evidence regarding specific fine-grained interactional
  patterns. TalkBank programs will facilitate the construction of
  these folios.

Single corpus studies.  For those interested in analyzing their own
  datasets rather than the larger database, TalkBank will provide a
  rich set of open-source tools for transcription, alignment, coding,
  and analysis of audio and video data.

Collaborative commentary.  For researchers interested in contrasting
  theoretical frameworks, TalkBank will provide support for entering
  competing systems of annotations and analytic profiles either
  locally or over the Internet.

The creation of this distributed database with its related analysis
tools will free researchers from many tedious aspects of data analysis
and will stimulate fundamental improvements in the study of
communicative interactions.  The initiative unites ongoing efforts
from the Linguistic Data Consortium (LDC) at Penn, the Penn Database
Group, the Informedia Project at CMU, and the CHILDES Project at
CMU.  The initiative also establishes an ongoing interaction among
computer scientists, linguists, psychologists, sociologists, political
scientists, criminologists, educators, ethologists, cinematographers,
psychiatrists, and anthropologists.

We are pursuing a variety of funding possibilities for TalkBank, and we have
recently received a commitment of support from NSF for initial planning
meetings.  We are also using the initiative to foster wide-ranging cooperation
among ongoing research efforts.  The TalkBank homepage
[http://www.ldc.upenn.edu/annotation/talkbank.html] lists current
participants and links to a document that sets out our vision for TalkBank
in detail.

We invite anyone interested in participating actively in TalkBank, or simply
in offering suggestions and criticism, to contact one or more of us:

Brian MacWhinney (Psychology, CMU)
Howard Wactlar (Computer Science, CMU)
Peter Buneman (Computer Science, U Penn)
Mark Liberman (Linguistic Data Consortium, U Penn)
Steven Bird (Linguistic Data Consortium, U Penn)

***********************

This message is being posted on June 4, 1999 to the mailing lists named below.
Our apologies if you receive multiple copies.

If you think this announcement should be posted to additional mailing lists,
please send the addresses of those lists to Brian MacWhinney (macw at cmu.edu).
It is particularly important to reach lists outside the domains of linguistics
and psycholinguistics.  Many thanks.

corpora at hd.uib.no
elsnet-list at let.ruu.nl
empiricists at unagi.cis.upenn.edu
language-culture at cs.uchicago.edu
linganth at cc.rochester.edu
linguist at listserv.linguistlist.org
ap-mate at mate.mip.ou.dk
nl-kr at cs.rpi.edu
info-childes at childes.psy.cmu.edu
info-psyling at psy.gla.ac.uk
funknet at rice.edu
discours at linguist.ldc.upenn.edu
phonet at mailbase.ac.uk