[Corpora-List] are there corpora of fast speech?

Eric Atwell eric at comp.leeds.ac.uk
Tue Jan 14 20:34:14 UTC 2003


Dinoj,
You don't say what counts as "fast" speech - anything spontaneous?
The ICAME CD-ROM includes spoken corpora (transcription texts):
London Lund Corpus
Lancaster/IBM Spoken English Corpus (SEC)
Corpus of London Teenage Language (COLT)
Wellington Spoken Corpus (New Zealand)
The International Corpus of English - East African component

see http://www.hd.uib.no/icame/newcd.htm for samples and manuals.
These are transcriptions only, but SEC sound files are available
separately (as MARSEC), and I think some sound files from other corpora
may be available from the corpus collectors.

At Leeds we collected Italian and German Spoken Learners' English for
the ISLE corpus - but these are sound files recorded by getting subjects
to read out loud on-screen prompts, NOT spontaneous dialogue, so I guess
not what you want (not particularly fast either, subjects generally read
rather slowly trying to "get it right"...)

hope this helps

Eric Atwell



On Tue, 14 Jan 2003, Dinoj Surendran wrote:

> Dear list members,
>
> Does anyone know if there is a (at least) phonetically transcribed corpus
> of fast English speech? A corpus of spontaneous speech known to have
> several fast speakers could also work. And while I would prefer to have
> both the sound files and the transcription files, the latter only will
> still be of use.
>
> Thanks,
>
> Dinoj Surendran
> Graduate Student
> Computer Science Dept
> University of Chicago
>
>
>

--
Eric Atwell, CVL: Computer Vision and Language research group
Distributed Multimedia Systems MSc Tutor & SOCRATES/JYA Tutor
School of Computing, University of Leeds, LEEDS LS2 9JT
TEL: 0113-3435761  MOBILE: 0775-1039104 FAX: 0113-3435468
WWW: http://www.comp.leeds.ac.uk/eric  EMAIL: eric at comp.leeds.ac.uk
Visit http://www.computingLEEDS.ac.uk - our newsletter for industry


