New Corpus from LDC
LDC Office
ldc at unagi.cis.upenn.edu
Thu Jul 6 15:10:55 UTC 2000
********************************************************
Santa Barbara Corpus of Spoken American English - Part I
********************************************************
LDC is pleased to announce the availability of the
Santa Barbara Corpus of Spoken American English -
Part I. This CD-ROM release contains 14 speech files
from the Santa Barbara Corpus of Spoken American
English, which was collected by the University of
California, Santa Barbara Center for the Study of
Discourse under the direction of John W. Du Bois.
Associate Editors were Wallace L. Chafe (UCSB),
Charles Meyer (UMass, Boston), and Sandra A. Thompson
(UCSB). The Santa Barbara Corpus of Spoken American
English is part of the International Corpus of
English (Charles W. Meyer, Director), representing
the American Component.
The Santa Barbara Corpus of Spoken American English
is based on hundreds of recordings of natural speech
from all over the United States, representing a wide
variety of people of different regional origins,
ages, occupations, and ethnic and social backgrounds.
It reflects many ways that people use language in
their lives: conversation, gossip, arguments,
on-the-job talk, card games, city council meetings,
sales pitches, classroom lectures, political
speeches, bedtime stories, sermons, weddings, and
more.
Each speech file is accompanied by a transcript in
which phrases are time stamped with respect to the
audio recording. Personal names, place names, phone
numbers, etc., in the transcripts have been altered to
preserve the anonymity of the speakers and their
acquaintances, and the audio files have been filtered
to make these portions of the recordings
unrecognizable.
For the latest information on this corpus, please refer to
the UCSB and Linguistic Data Consortium (LDC) web sites
devoted to it:
http://humanitas.ucsb.edu/depts/linguistics/research/csae/
http://www.ldc.upenn.edu/Publications/SBC/
These sites may also offer software or revised
versions of the data for download.
Institutions that have membership in the LDC during
the 2000 Membership Year will be able to receive this
corpus free of charge. Nonmembers may purchase the
Santa Barbara Corpus of Spoken American English -
Part I for $75.
If you would like to order a copy of this corpus,
please email your request to <ldc at ldc.upenn.edu>. If
you need additional information before placing your
order, or would like to inquire about membership in
the LDC, please send email or call (215) 573-1275.