Clan/mp3 timecoding update
Bartlomiej Plichta
plichtab at MSU.EDU
Wed May 31 01:51:55 UTC 2006
Hello all,
Let me address the issues below.
Linguist - Wangka Maya wrote:
> I just brought some interested friends into the discussion, Mark
> Piggott says:
>
> Interesting discussion....... Actually most digital formats compress
> using one algorithm or another and are therefore "lossy" to some extent.
This is not exactly true. Not all digital audio formats use lossy
compression. In fact, there are two popular standards, PCM and DSD, that
store raw sample data without compression. PCM is by far the most common in
professional applications (e.g., audio CDs), but DSD is becoming more
and more widely accepted in the pro audio industry. Psychoacoustic
compression is a relatively new thing and is used primarily in consumer
applications (e.g., iPod, iTunes Music Store, etc.). There are other
audio compression standards used, for example, in digital telephony, but
they have little relevance to field linguists.
> For normal beings, like you and I Grant, mp3 and mp3 vers2 will
> reproduce sound to a quality that is indistinguishable from the real
> thing. I personally never use wav format - it's a Microsoft proprietary
> format not open source.........
This is not exactly true. Numerous psychoacoustic tests have shown that
compressed audio can be distinguished from the original. Of course, that depends
on a lot of variables, including the compression ratio, the codec used,
the playback hardware, the listener's auditory system, etc.
The WAVE standard is not proprietary. It is the native audio standard
for Microsoft, but the spec is open. Anyone can write WAVE files,
without licensing fees. The biggest problem with the WAVE standard is
that over the years it has allowed a lot of variation in how the
various data and metadata chunks are used in a WAVE file. Therefore,
the Broadcast WAVE Format (BWF), which is a variant of WAVE, is perhaps
a better choice for recording and storage. It is a widely used standard,
with very good archival prospects.
The only truly standard-agnostic way of storing PCM data would be in
a headerless file, but then proper metadata would have to be supplied to
read this file, e.g., sample rate, bit depth, byte order, etc. This is a
solution that some people have used, especially in speech
engineering circles.
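To illustrate why the metadata matters, here is a minimal Python sketch of
decoding headerless PCM. It assumes 16-bit signed samples; the function name
read_raw_pcm16 and its parameters are hypothetical and not part of any
standard. The same bytes decode to different values if the byte order is
guessed wrongly.

```python
import struct


def read_raw_pcm16(data, byte_order="little", channels=1):
    """Decode headerless 16-bit signed PCM bytes into one list per channel.

    Nothing in the bytes themselves says how to read them: the caller must
    supply the byte order and channel count (and, for playback or analysis,
    the sample rate) from external metadata.
    """
    if byte_order not in ("little", "big"):
        raise ValueError("byte_order must be 'little' or 'big'")
    frame_bytes = 2 * channels
    # Ignore any trailing partial frame.
    usable = len(data) - (len(data) % frame_bytes)
    prefix = "<" if byte_order == "little" else ">"
    samples = struct.unpack(f"{prefix}{usable // 2}h", data[:usable])
    # De-interleave: channel 0 is samples 0, channels, 2*channels, ...
    return [list(samples[ch::channels]) for ch in range(channels)]


def duration_seconds(frame_count, sample_rate_hz):
    """Duration can only be computed once the sample rate is known."""
    return frame_count / sample_rate_hz
```

For example, two stereo frames written little-endian decode correctly only
when read back with byte_order="little" and channels=2.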
> So, how much "lossy" is ok? Is there a minimum standard, beyond which
> is unacceptable (I know, least "lossy" is best)? And is .wav format
> the least "lossy"?
Any audio file can be compressed in a lossy process. There are important
differences, though. You can "compress" a PCM file (e.g., WAVE, AIFF,
AU, etc.) by reducing its sample rate and bit depth. Suppose you
originally acquired your recording at 48,000 Hz and 24-bit, and then
downsampled it to 16,000 Hz and 16-bit to use the file with older
software. This process is lossy, as it removes original samples. This
is a linear process.
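A rough Python sketch of this kind of reduction follows. Both function names
are hypothetical, and the averaging step is only a crude stand-in for the
anti-aliasing filter a real resampler would apply before discarding samples;
no dithering is applied either.

```python
def reduce_bit_depth_24_to_16(samples):
    """Drop the 8 least significant bits of each signed 24-bit sample."""
    # Python's >> on ints is an arithmetic shift, so the sign is preserved.
    return [s >> 8 for s in samples]


def decimate_by_averaging(samples, factor):
    """Crude downsampling: replace each group of `factor` samples by its mean.

    Going from 48,000 Hz to 16,000 Hz corresponds to factor=3. A trailing
    group shorter than `factor` is discarded.
    """
    usable = len(samples) - (len(samples) % factor)
    return [
        sum(samples[i:i + factor]) // factor
        for i in range(0, usable, factor)
    ]
```

In both steps the discarded information cannot be recovered from the output,
which is what makes the process lossy.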
There is also non-linear, psychoacoustic compression, such as that in
MP3, which removes samples from the original uncompressed file in a
dynamic process.
Both types of compression are bad for long-term preservation, but the
linear type is acceptable for, say, some types of acoustic analysis of
speech. For instance, for formant analysis of male voices, the sample
rate of 16,000 Hz is sufficient. The lack of original high-frequency
content does not harm my analysis in any way, because the low-frequency
content (below 8,000 Hz) is left intact. The same is not true of MP3
compression.
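The 8,000 Hz figure is half the 16,000 Hz sample rate (the Nyquist limit).
A minimal sketch of that arithmetic in Python, with hypothetical function
names:

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent, in Hz."""
    return sample_rate_hz / 2


def can_represent(frequency_hz, sample_rate_hz):
    """True if the frequency lies below the Nyquist limit for this rate."""
    return frequency_hz < nyquist_hz(sample_rate_hz)
```

With these definitions, nyquist_hz(16000) gives 8000.0 and nyquist_hz(48000)
gives 24000.0, so downsampling removes only the content between 8,000 Hz and
24,000 Hz.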
I would like to add that the WAVE format is basically a container. It
can also, in theory, store dynamically compressed data. It is rare, but
the spec allows it.
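One way to check what a given WAVE file actually contains is to read the
format tag stored at the start of its "fmt " chunk. The Python sketch below
walks the RIFF chunks by hand; the function name wave_format_tag is
hypothetical, and the table lists only a few well-known tag values.

```python
import struct

# A few well-known format tags; many others are registered.
FORMAT_NAMES = {
    0x0001: "PCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0x0055: "MPEG Layer 3",
    0xFFFE: "extensible",
}


def wave_format_tag(data):
    """Return the format tag from the 'fmt ' chunk of RIFF/WAVE bytes."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        # Each chunk: 4-byte ID, then a 4-byte little-endian size.
        chunk_id, size = struct.unpack("<4sI", data[pos:pos + 8])
        body = pos + 8
        if chunk_id == b"fmt ":
            if body + 2 > len(data):
                raise ValueError("truncated fmt chunk")
            return struct.unpack("<H", data[body:body + 2])[0]
        # Chunk bodies are padded to an even number of bytes.
        pos = body + size + (size & 1)
    raise ValueError("no fmt chunk found")
```

A tag of 1 means plain PCM; a tag such as 0x0055 means the container holds
MP3-compressed data even though the file is still a WAVE file.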
So, to answer the question of how much sample loss is acceptable, I would
say that for long-term preservation, none. For other purposes, it is far
more important to use the right hardware and recording technique. Then
evaluate what you need these recordings for. If you can do your analysis
well with a compressed MP3 file at 256 kbps, then that's fine.
Best regards,
Bartek