[Corpora-List] Summary: prosodically annotated dialogue corpora

Matthew Purver matthew.purver at kcl.ac.uk
Thu Nov 28 14:56:46 UTC 2002


On 15th November, I asked the following question:

> Does anyone know of any freely available prosodically annotated corpora
> that include samples of dialogue?

Here is a summary of what I found out:

(1) The London-Lund corpus contains prosodic annotation of spontaneous
English dialogue. It is available from ICAME
(http://www.hd.uib.no/icame.html). It isn't free - the ICAME CD-ROM costs
3500 NOK (about 300 UKP or 480 USD/EUR) for an individual license.

(2) The Multext prosodic corpus contains prosodic data for English,
French, German, Italian and Spanish, although it consists of read passages
rather than spontaneous dialogue. It is available from ELRA/ELDA
(http://www.icp.grenet.fr/ELRA/). It isn't free: the CD-ROM costs 100 EUR
for an academic non-ELRA-member license, less if you're a member.

(3) The Wellington Spoken Corpus contains markup for emphatic stress in
spontaneous dialogue. It is also on the ICAME CD-ROM (see above).

(4) IViE contains some spontaneous conversation data
(http://www.phon.ox.ac.uk/~esther/ivyweb/).


Good places to find dialogue corpora:
http://www-rcf.usc.edu/~billmann/diversity/DDivers-site.htm
http://devoted.to/corpora


Many thanks to Bas Aarts, Bill Mann, Geoffrey Sampson and Jean Veronis for
their help.

--
Matthew Purver  <matthew.purver at kcl.ac.uk>

Logic, Language and Computation Group
Department of Computer Science
King's College London, Strand, London WC2R 2LS


