Spoken English data base

Carol Genetti cgenetti at LINGUISTICS.UCSB.EDU
Mon Oct 10 18:01:53 UTC 2011


Hi Dan,

It sounds like you are looking for the Santa Barbara Corpus of Spoken 
American English. I've pasted the blurb below. The corpus is available 
through the Linguistic Data Consortium: <http://projects.ldc.upenn.edu/SBCSAE/>.

Enjoy!
Carol

****
The Santa Barbara Corpus of Spoken American English is based on a large 
body of recordings of naturally occurring spoken interaction from all over 
the United States. The Santa Barbara Corpus represents a wide variety of 
people of different regional origins, ages, occupations, genders, and 
ethnic and social backgrounds. The predominant form of language use 
represented is face-to-face conversation, but the corpus also documents 
many other ways that people use language in their everyday lives: 
telephone conversations, card games, food preparation, on-the-job talk, 
classroom lectures, sermons, story-telling, town hall meetings, tour-guide 
spiels, and more.

The Santa Barbara Corpus was compiled by researchers in the Linguistics 
Department of the University of California, Santa Barbara. The Director of 
the Santa Barbara Corpus is John W. Du Bois, working with Associate Editors 
Wallace L. Chafe and Sandra A. Thompson (all of UC Santa Barbara), and 
Charles Meyer (UMass, Boston). For the publication of Parts 3 and 4, the 
authors are John W. Du Bois and Robert Englebretson.

The Santa Barbara Corpus of Spoken American English also forms part of the 
International Corpus of English (ICE). The Santa Barbara Corpus provides 
the main source of data for the spontaneous spoken portions of the American 
component of the International Corpus of English. In order to meet the 
specific design specifications of the International Corpus of English 
(allowing comparison between American and other national varieties of 
English), the Santa Barbara Corpus data have been supplemented by 
additional materials in certain genres (e.g. read speech), filling out the 
American component of ICE.

The Research Centre for Empirical Pragmatics (RCEP) at Bonn Applied English 
Linguistics (BAEL) maintains a bibliography of works which make use of the 
Santa Barbara Corpus.

--On Monday, October 10, 2011 1:22 PM -0400 "Everett, Daniel" 
<DEVERETT at BENTLEY.EDU> wrote:

> Folks,
>
> I am wondering if there is a data base of spoken English (in particular
> American English, but any dialect would be interesting to know about).
> What I have in mind is something along the lines of the Brazilian
> Portuguese Projeto Nurc (one site for that is here:
> http://www.letras.ufrj.br/nurc-rj/).
>
> Could anyone point me to such a data base of spoken (adult) English?
>
> Thanks in advance.
>
> Dan Everett
>
> **********************
> Daniel L. Everett
>
> http://daneverettbooks.com



More information about the Lingtyp mailing list