Corpora: Please help with Kids Corpus repetition study!

James P. Salsman bovik at best.com
Fri Jan 12 04:05:18 UTC 2001


Recently I have been studying the CMU/LDC Kids Corpus of children's
oral English reading, in order to help build reading skills evaluation
systems.  I have come to the point where I need a lot of help, and have
set up a web page to make it easy for anyone to contribute as much or
as little as they like.  It is fun, too, because the page has embedded
audio of the Pittsburgh children recorded for the Corpus, each
repeating part of what they were supposed to say.  Please have a look
and listen:

  http://www.bovik.org/reps-char.cgi

Please also try to submit some of the requested characterizations; it
takes less than a minute each.  We need to collect thousands of those
submissions to get a statistical model of the kinds of repetitions that
occurred, which will help computerized reading skills assessment systems
tell the difference between harmless self-corrections and bona fide
mispronunciations.  So, please forward this message to any friends
and associates who might be able to contribute some, too.

The resulting data will remain available for everyone at:
  http://www.bovik.org/reps-char.txt

Thank you!

Cheers,
James


