transcriber-soundFormats

Joe Blythe joe.blythe at ARTS.USYD.EDU.AU
Tue May 2 01:53:44 UTC 2006


I had a slightly worse problem than John's, but it relates to the same 
thing.
I transcribed a number of recordings in CLAN from mp3s. They were all 
encoded with a fixed rather than a variable bit rate; I think I made 
the files with Amadeus. Like John, I did this to save hard drive 
space. Some time later I received a warning about mp3s, so I checked 
the time codes against the corresponding wav files and found that they 
were out of alignment. At the end of a twenty-minute transcription the 
bullet points didn't select even one word of the correct utterance, 
although at the beginning of the transcript the match was fine. I now 
have the problem of recoding or adjusting the time codes in those 
transcripts and realigning them to the wav files.
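
For the adjustment itself, here is a minimal sketch of one possible 
approach. It assumes the drift grows evenly through the file, which is 
consistent with a good match at the start and a bad one at the end but 
would need checking against the wav file. The function name and the 
numbers in the usage note are hypothetical, and time codes are taken 
to be in milliseconds. It uses a single anchor point near the end: the 
position the transcript currently points to, and the position of the 
same utterance in the wav file.

```python
def rescale_timecodes(segments, old_anchor_ms, new_anchor_ms):
    """Linearly rescale (start, end) time codes, given in milliseconds.

    old_anchor_ms: where a late utterance sits according to the transcript
    new_anchor_ms: where that same utterance actually sits in the wav file
    Assumes the drift is proportional to elapsed time (an assumption).
    """
    if old_anchor_ms <= 0:
        raise ValueError("old_anchor_ms must be positive")

    def shift(t):
        # Scale each time code by the ratio of the two anchor positions.
        return round(t * new_anchor_ms / old_anchor_ms)

    return [(shift(start), shift(end)) for start, end in segments]
```

With a made-up drift of 5 seconds over an hour, 
rescale_timecodes([(0, 1800000)], 3600000, 3605000) moves the end of 
the first segment from 1800000 to 1802500. If the drift is not even, 
several anchor points and piecewise rescaling would be needed instead.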

It's possible, as Bartek suggests, that converting wav to mp3 with 
Akustyk would be an improvement, but I would strongly discourage 
transcribing from mp3s. If hard drive space is an issue then I suspect 
a hardware solution would be far better than a software one.
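
One way to check a converted file before transcribing from it is to 
compare durations. The following is a rough sketch using only the 
Python standard library; the function names are illustrative. The wav 
duration comes straight from the file header, while the mp3 figure can 
only be estimated, here from file size and a constant bit rate. The 
estimate ignores ID3 tags and padding, so it is approximate by 
construction and does not apply to variable bit rate files.

```python
import wave


def wav_duration_seconds(source):
    """Exact duration of a PCM wav file: frame count / sample rate.

    `source` may be a file name or an open binary file object.
    """
    with wave.open(source, "rb") as w:
        return w.getnframes() / w.getframerate()


def cbr_mp3_duration_estimate(size_bytes, bitrate_kbps):
    """Rough duration of a constant-bit-rate mp3, in seconds.

    Treats the whole file as audio data at the stated bit rate,
    so tags and padding make the result slightly too long.
    """
    if bitrate_kbps <= 0:
        raise ValueError("bitrate_kbps must be positive")
    return size_bytes * 8 / (bitrate_kbps * 1000)
```

If the two figures differ by more than a fraction of a second for the 
same recording, time codes made against the mp3 are unlikely to line 
up with the wav file.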

Cheers
Joe

On 02/05/2006, at 9:00 AM, Bartlomiej Plichta (by way of Nicholas 
Thieberger) wrote:

> PCM (wav, aiff, au, etc.) files have a very different structure 
> from an MP3 file. While a PCM file contains one file header followed 
> by raw sample data (in one chunk), an MP3 file has many small data 
> chunks, each with its own header, and each containing different 
> compression. There is no duration information in an MP3 file because 
> of that, so an application that reads the file must compute it, 
> roughly. This is particularly problematic with MP3 files encoded with 
> Variable Bit Rate (VBR). Duration may also vary if the MP3 file is 
> created by resampling the wav file, which happens very often. So 
> there will always be some degree of duration mismatch.
>
> I would therefore recommend using Constant Bit Rate (CBR) and the same 
> sample rate as the wav file. This will help minimize the problem. I 
> would also avoid using conversion software where these parameters are 
> not transparent to the user. Also, most commercial MP3 converters are 
> optimized for music, not speech. That's potentially a problem, as 
> well.
>
> In my experience, Akustyk (http://bartus.org) does a decent job 
> converting from wav to MP3 using the Lame codec. It is optimized for 
> speech and produces decent results. Finally, may I ask why one would 
> want to use MP3 in speech research? There's quite a bit of signal 
> degradation involved, particularly with multiple resampling.
>
> Hope this helps.
>
> Best,
>
> Bartek
>
>>
>> From: John Giacon <jgiacon at ozemail.com.au>
>> Subject: transcriber-soundFormats
>> To: rnld list <Resource-Network-Linguistic-Diversity at unimelb.edu.au>
>> Hello,
>>
>> I have been using Transcriber, and the sound files were .aif files. 
>> To save room I converted the sound files to mp3 using iTunes. When 
>> I reopened the files in Transcriber the timings got out of sync, so 
>> that by the end of the file [around 60 min] the text and sound file 
>> were about 5 seconds out of sync: the sound has been 'stretched' by 
>> the 5 seconds.
>> In fact the image of the sound file corresponds to the text 
>> divisions, but the actual sound is delayed, so there is 5 seconds of 
>> a plain line at the end of the sound image, but it corresponds to 
>> actual sound.
>> I could do the transcription using the mp3 files, but the problem of 
>> the sound and the image not corresponding still remains.
>> Any suggestions are very welcome.
>>
>> John
>>
>>
>> John Giacon
>> Christian Brothers, 14 Landsborough St
>> Griffith, ACT 2603
>> 02 6239 6300
>> [0421 177 932 when away]
>> jgiacon at ozemail.com.au
>
> --
> Bartlomiej Plichta
> http://bartus.org
>
>
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Joe Blythe

Department of Linguistics
Transient Building
University of Sydney
NSW 2006


