[Corpora-List] Question concerning audio file search

Doug Cooper doug at th.net
Thu Dec 21 08:50:19 UTC 2006


You might want to check the DAISY Consortium site, especially
the tools area:  http://www.daisy.org/tools/  They produce both
open tools and standards for digital talking book data (esp. for
the blind), including recorded speech.

   On a related topic, I recently built an audio corpus tool to locate
single words in Thai texts that were recorded by many speakers and
transcribed, with audio and text aligned at either the sentence or
the short-paragraph level.

    It turned out that the naive approach -- using the relative
character-count position of a search string within the larger
transcription to locate the corresponding spoken word within
the recording of that segment -- worked reasonably well, given
a +/- 1.25-second window around the estimated position.  One critical
requirement was getting rid of pauses in the sound files, since the
estimate assumes that speech time tracks character count.  For my data,
applying the SoX "silence" effect with these parameters worked pretty
well at normalizing speaking rates without introducing artifacts:

   sox -V a.wav a-trimmed.wav silence 1 0:0:0.1 -55d -1 0:0:0.1 -55d
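
   For anyone who wants to try the naive position estimate described
above, here is a minimal sketch in Python.  It is not the code from my
tool: the function name, its parameters, and the choice to centre the
window on the midpoint of the match are all assumptions made for
illustration.  It also assumes the duration is measured after pauses
have been removed, and it counts Unicode code points, so Thai vowel and
tone marks each count as one character.

```python
def estimate_window(transcript, query, duration_s, half_window_s=1.25):
    """Guess where `query` is spoken within a recording of `transcript`.

    Hypothetical helper: maps the relative character position of the
    match onto the recording's duration and returns (start, end) in
    seconds, or None if the query does not occur in the transcript.
    """
    idx = transcript.find(query)
    if idx < 0 or not transcript:
        return None

    # Relative position of the middle of the match, between 0.0 and 1.0.
    centre_frac = (idx + len(query) / 2) / len(transcript)

    # Assume speech time is proportional to character count.
    centre_s = centre_frac * duration_s

    # Clamp the window to the bounds of the recording.
    start = max(0.0, centre_s - half_window_s)
    end = min(duration_s, centre_s + half_window_s)
    return start, end


if __name__ == "__main__":
    # Toy example: a 10-character "transcript" spoken over 10 seconds.
    print(estimate_window("abcdefghij", "ef", 10.0))
```

   Only the first occurrence of the search string is located here; a
segment that contains the word more than once would need a loop over
all matches.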

   Doug Cooper
_______________________________________
Center for Research in Computational Linguistics
http://sealang.net     http://crcl.th.net
CRCL Inc. is a US 501(c)3 nonprofit organization


