auto alignment of transcript and audio?

Doug Cooper doug.cooper.thailand at gmail.com
Sat Apr 9 09:24:12 UTC 2011


Hi, Nick:
We made extensive use of Lipsync (working with Russian).  Performance
was extremely good on reasonably clean input, and fell off on very noisy
recordings, musical interludes, and (of course) inaccurate transcription.
http://www.annosoft.com/lipsync-tool
The desktop tool is US$500 (that's about $79.99 Aussie); there's also an
SDK.  I think they have save-disabled and 30-day trial versions available
for demo.  Their XML input and output formats were easy to modify.

   I had some correspondence with the brains behind the operation; he was
friendly and quick to respond.  I've attached part of one note, below.

    A fundamental issue with audio alignment is that it is not robust in the
way that, say, even bad OCR is.  Once sync is lost, it's difficult to recover
automatically.  The workaround is to embed breakpoints -- not terribly
difficult, but it does require a human sitting at a computer for long hours.
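
    To make the breakpoint idea concrete, here is a minimal Python sketch.
It assumes each hand-marked breakpoint is a (word index, time in seconds)
pair; the function name and data layout are hypothetical and are not
Lipsync's actual format.  It only cuts the transcript into independent
segments, so a loss of sync in one segment cannot spill into the next.

```python
from typing import List, Tuple

Segment = Tuple[float, float, List[str]]  # (start_sec, end_sec, words)

def split_at_breakpoints(
    words: List[str],
    breakpoints: List[Tuple[int, float]],
    total_duration: float,
) -> List[Segment]:
    """Cut a transcript into segments bounded by hand-marked breakpoints.

    Each breakpoint (word_index, time_sec) says that words[word_index]
    starts at time_sec in the recording (an assumed convention).
    """
    # Add implicit anchors at the very start and very end of the recording.
    anchors = [(0, 0.0)] + sorted(breakpoints) + [(len(words), total_duration)]
    segments: List[Segment] = []
    for (i0, t0), (i1, t1) in zip(anchors, anchors[1:]):
        if i1 > i0:  # skip empty spans, e.g. a breakpoint placed at word 0
            segments.append((t0, t1, words[i0:i1]))
    return segments

words = "a b c d e f".split()
for start, end, chunk in split_at_breakpoints(words, [(2, 1.5), (4, 3.0)], 5.0):
    print(start, end, chunk)
```

Each segment would then be passed to the aligner on its own, with the
audio trimmed to the same start and end times.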

    Good luck,
    Doug

Mark Zartler <mzartler at annosoft.com>  wrote:
 > I've tried a couple of times to implement a speech/music/noise discriminator,
 > but have come up with lackluster results. It's a high value feature for us,
 > but it has been an elusive thing.
[snip]
 > We have a few processes to develop new languages. If we have access to a
 > pronunciation dictionary, we can crunch a letter to sound system, using
 > methods by Dr Alan Black in his paper (Issues in Building General Letter
 > to Sound Rules).
[snip]
 > I'd be happy to do some research into your needed languages to get a feel
 > for the difficulties in each of them. I enjoy this work.




On 4/9/2011 2:33 PM, Nick Thieberger wrote:
> Has anyone had experience of software that takes a textual transcript
> and aligns it with the media it transcribes? I know it exists for
> major languages but have not seen it working and do not know of the
> software. It would be interesting to know how applicable it could be
> to the many hours of (handwritten/typed) transcripts of recordings we
> have in the PARADISEC collection.
>
> Thanks,
>
> Nick
>



More information about the Resource-network-linguistic-diversity mailing list