[Corpora-List] corpus of speech transcriptions

Lucia Specia lspecia at gmail.com
Thu Mar 18 14:44:30 UTC 2010


Hi all,

We have a student interested in working on detecting errors in
human or automatic transcriptions of English speech, but she is struggling
to find a corpus for that. Ideally, we would like a corpus
where errors are marked, or some sort of "parallel" corpus that pairs the
correct version of the transcriptions (or just written text) with
error-prone automatic or human transcriptions.

We could use some speech recognition system to produce such a corpus,
but it'd be better to use an already existing one, if possible.

Would anyone recommend a corpus for that?

Best,

Lucia Specia and Alice Kaiser-Schatzlein

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


