[Corpora-List] Query on Linking Text & Sound Files

Jean Veronis Jean.Veronis at up.univ-mrs.fr
Sat Oct 19 17:14:28 UTC 2002

My team has reasonably good experience with text-sound alignment: we have 
aligned more than 500,000 words of transcripts at this point. The technique 
we use was developed by a student of mine in her PhD thesis (in French):

Campione, E. (2001). Etiquetage prosodique semi-automatique de corpus oraux 
: algorithmes et méthodologie. Thèse de doctorat. Aix-en-Provence: 
Université de Provence [online :

Our strategy is to align as we transcribe, with the Transcriber tool 
already mentioned by Khalid Choukri on this list 
(http://www.etca.fr/CTA/gip/Projets/Transcriber/). The same approach can be 
used on pre-existing transcripts as well, although it is a bit less 
practical.

The strategy is based on a pre-segmentation of the sound files by means of 
a pause detector. Pause detection is fairly reliable (90-95% precision and 
recall, depending on language and type of speech; more results on p. 200 of 
the thesis). It produces segments of a few seconds, which is the perfect 
span for transcribing audio files, since it matches fairly well what the 
transcriber can memorise at a time. We actually found that, with this 
technique, transcription time did not increase compared with our old 
technique using a simple tape recorder, and the alignment came as a bonus. 
In addition, the result is more precise than with the old methods of 
transcription, because the transcriber can replay the exact segment at 
will. That was rather impractical with tape recorders, which made 
transcribers reluctant to listen to the same segment several times.
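
To make the pre-segmentation idea concrete, here is a minimal sketch in 
Python. It is not the algorithm from the thesis: it assumes a plain 
frame-energy threshold, and the function names, the 10 ms frame, the 0.01 
RMS threshold and the 200 ms minimum pause are all illustrative choices 
that would need tuning for real recordings.

```python
import math


def frame_rms(samples, frame_len):
    """RMS energy of each non-overlapping frame (a trailing partial frame is dropped)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def detect_pauses(samples, rate, frame_ms=10, threshold=0.01, min_pause_ms=200):
    """Return (start_s, end_s) pairs for low-energy runs lasting at least min_pause_ms.

    Assumption: samples are floats in [-1.0, 1.0]; the threshold is a guess.
    """
    frame_len = max(1, rate * frame_ms // 1000)
    min_frames = max(1, min_pause_ms // frame_ms)
    energies = frame_rms(samples, frame_len)
    pauses = []
    run_start = None
    # The infinite sentinel closes a low-energy run that reaches the end of the file.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                pauses.append((run_start * frame_len / rate, i * frame_len / rate))
            run_start = None
    return pauses


def speech_segments(pauses, duration):
    """Return the (start_s, end_s) stretches of the recording that lie between pauses."""
    segments = []
    cursor = 0.0
    for start, end in pauses:
        if start > cursor:
            segments.append((cursor, start))
        cursor = end
    if cursor < duration:
        segments.append((cursor, duration))
    return segments
```

Each segment returned by speech_segments could then be offered to the 
transcriber as one unit to replay and transcribe, so that every chunk of 
text carries its start and end times from the outset.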

Hope this helps.

Jean Véronis
