xiphmont | Why 24-bit/192kHz music downloads make no sense

From:

I, for one, agree with the audiophiles that compression is the problem with modern music distribution. Unfortunately, unlike most of them, I'm referring to dynamic range compression. Do you have a solution for us there? :-)

Edited Date: 2012-03-05 08:28 pm (UTC)

From: (Anonymous)

Wow, I really enjoyed reading this article. I have nothing to do with music (besides listening to it occasionally) and it's good to know that 16/44.1 will be sufficient and that I don't have to be anxious about it anymore.

I switched my entire music library over to FLAC some time ago, and I was even planning to spend a fair amount of money on good headphones. Unfortunately I quickly discovered that I know nothing about headphones at all, and reading dozens of online reviews was even more disturbing. Even my friends weren't able to rationalize their buying decisions. As a matter of fact, I still don't have good headphones. :)

Now it just happened that I've discovered your article by accident and I was really glad to read something more scientific about audio in general. Can you recommend some in-depth articles about headphones and how to tell apart the good ones?

From:

xiphmont.livejournal.com

The Loudness War may finally be receding; unfortunately, it's something the industry as a whole needs to let go of. :-|

From: (Anonymous)

Thanks for a thoughtful analysis. I have no reason to question anything you've written here. It still leaves me with the question: If digital is better than the analog stereo setup of 30 years ago, why did I (and, we now learn, Steve Jobs) always find my CDs tiring, and prefer my vinyl records? Identical receiver and speakers; just CD player vs. turntable. It can't be confirmation bias. I wanted and expected CD to be better than vinyl. Thanks.

Dave

From:

xiphmont.livejournal.com

Objectively, by any fidelity measure, digital far surpasses what vinyl is capable of. It's entirely possible you prefer either the more veiled distorted sound of vinyl (many people like tube amps for the same reason), or the physical interaction with it (something I kinda miss myself), or both.

Or... perhaps you're objecting to how badly overcompressed modern pop music is, which has nothing to do with digital; it's just a modern trend. BTW, vinyl reissues are remastered, so they've likely been compressed just like modern releases.

I came from vinyl and tape and for the first few years of CDs bought into the whole 'there's no way this could be as good as vinyl' that all my hi-fidelity buddies repeated. I eventually wanted a release that wasn't going to come out on vinyl, and I had a real job, so I figured I'd get a CD player finally, but I wouldn't _like_ it.

I felt like a complete fool! I could not believe how much better the CD was in every way. It was deeper, blacker, crisper, no noise, no pops... wow. I never looked back. I bought my first computer not long after for the express purpose of using it to record...

That's my experience. It's too bad if modern mastering trends are ruining it for others :-(

From: (Anonymous)

"why did I (and, we now learn, Steve Jobs) always find my CDs tiring" This is the real question. I have been a music fanatic most of my life, and found that in the late eighties I pretty much abandoned my stereo and new music collection. After a remodel a couple of years ago I pulled out my high end gear, including a turntable, and found myself listening again. CDs were and still are fatiguing and I tend to wander away. Vinyl is not. This is not something that is readily apparent via A/B testing, but it is absolutely there on extended listening and there are a lot of people with the same complaint. Modding my player, different players, and outboard DACs help some, but it's still there.

About a year ago I bought a high end DAC and started listening to some hi-res digital stuff. CDs do sound better when they are upsampled, de-jittered, and run through a decent DAC, but they're still not right. Some of the higher res stuff is just stunning, and I don't want to flee the room. (Some of the hi-res stuff is very obviously upsampled 44/16 too)

I'm an EE and a long-time audio hobbyist, and I understand the theory, but the practical implementation doesn't quite live up to it. I suspect a lot of it is the analog filtering to get rid of the HF crap, which you discuss in your extended post. But that doesn't explain why hi-res stuff sounds better on the same DAC, so I'm inclined to say that, theory or not, 44/16 just isn't enough.
Regards,
Jim W.

From:

some41.livejournal.com

Good read. Looks like a reference to footnote 5 is missing in the text

From:

some41.livejournal.com

And a word missing?
...or even by a good lossy encoder *used* incorrectly.

From:

xiphmont.livejournal.com

Excellent catches both. Fixed and Thanks.

From:

fkobeh.livejournal.com

It was very nice to find such a well documented article! I've been trying for many years to persuade my friends not to fall for the marketing of charlatans and 'high end' companies.

I have dedicated my life to Audio (+30 years), and I have been an Audio Engineer for 22 years. Unfortunately when you become professional you get more critical and less enthusiastic...

What I can tell you guys is that there is a huge difference between 16/44.1 and 24/192. 16/44.1 just doesn't sound right. When you mix a project (usually I work with 24/96 kHz) you have a sonic depth of the elements, let's say a voice vs. reverberation; finally you get a mix down which is your "Master" but as soon as you convert it to 16/44.1 your work goes to the trash, you lose much of the program you had. The voice will get 'into your face' and you will lose a lot of the reverb you had, you don't get things in the space they were.

Going from 16 to 20 bit is like going from vinyl to CD. Remember that every bit more represents twice the information, so going from 16 to 18 you'll get 4 times more depth. Please don't do less. I agree that you don't need 24 bit but 16 is not enough.

As regards the sampling frequency, 192 kHz does sound softer and much more natural, but unfortunately it takes more resources. 96 kHz is a good trade-off.

From:

xiphmont.livejournal.com

The issue I take with this statement is that it's made constantly... and if the change is so obvious, it would be easily observable in a controlled blind test. Yet in every controlled test, no one can tell the difference.

From:

prodicus.myopenid.com (from livejournal.com)

Great writeup!

It is also worth mentioning that increasing the bit depth of the audio representation from 16 to 24 bits does not increase the perceptible resolution or 'fineness' of the audio. It only increases the dynamic range, the range between the softest possible and the loudest possible sound, by lowering the noise floor.

This is an extremely common and very reasonable-sounding misconception. I think your page could definitely benefit from elaborating more on why this is wrong.
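For anyone who wants to check the "dynamic range, not fineness" point numerically, here is a minimal sketch. It assumes NumPy, an undithered full-scale 997 Hz sine at 48 kHz, and a helper name (`quantization_snr_db`) invented for illustration. It rounds the sine to 16 and 24 bits and measures the signal-to-quantization-noise ratio, which should land near the textbook 6.02 dB per bit plus 1.76 dB.

```python
import numpy as np

FS = 48_000   # sample rate in Hz (assumed for this sketch)
FREQ = 997    # test tone in Hz; shares no factor with FS, so phases do not repeat early


def quantization_snr_db(bits, n=FS):
    """Round a full-scale sine to `bits` of depth and return signal/noise in dB."""
    t = np.arange(n) / FS
    full_scale = 2 ** (bits - 1) - 1            # largest positive integer sample
    signal = full_scale * np.sin(2 * np.pi * FREQ * t)
    error = np.round(signal) - signal           # the quantization noise itself
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(error ** 2))


if __name__ == "__main__":
    for bits in (16, 24):
        measured = quantization_snr_db(bits)
        textbook = 6.02 * bits + 1.76
        print(f"{bits} bits: measured {measured:.1f} dB, textbook {textbook:.1f} dB")
```

The only thing that changes between the two depths is the level of the error term; the sine itself is the same. The eight extra bits push the noise floor down by roughly 48 dB and do nothing else.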

BTW I recently added a little writeup on the hydrogenaudio wiki about TOS 8 (http://wiki.hydrogenaudio.org/index.php?title=TOS_8) - just a starting point and unlikely to convince the unconvinced but perhaps it may be useful to somebody.

From:

xiphmont.livejournal.com

Noted, on my tweak list.

From:

xiphmont.livejournal.com

I got this in email, a few others will likely find it interesting:

http://www.audiocheck.net/blindtests_index.php

I've not yet looked at everything there, but the pieces I've played with looked quite good.

From: (Anonymous)

Just want to say thanks for the effort you put into this. It must have taken some time to get it all laid out. This blog has really helped my audio angst attacks - I'm a trained electrical engineer and an audiophile, and even I have a hard time occasionally cutting through the buffer-bloat on this topic.

Anyway, this article will be read for years to come I'm sure.

Now, however, I shall go listen to some wonderful music without a thought of how it was engineered!

From:

xiphmont.livejournal.com

Summing up in one sentence why I'd never want to be a porn star.

From:

valdis skesters (from livejournal.com)

In your article you say that hardly anybody understands basic signal theory or the sampling theorem, and that an analog signal in practice "can be reconstructed losslessly" from the information the samples contain. As it appears from your article that you are one of the very few who understand the sampling theorem, I have a big question for you: how do the signals most widely encountered in practice (e.g., musical ones) differ from signals that fully correspond to the sampling theorem? Please list all the differences and then substantiate why we should ignore them.

From:

xiphmont.livejournal.com

You're right that 'lossless' happens only in ideal circumstances.

No part of the process is going to be ideal, so you see small deviations at every step. All of them will be measurable (and predictable), but it's unusual to find one that's audible unless you're dealing with a flat out bug. Flat out bugs do happen.

The most common truly digital 'bugs' are bad digital antialiasing filters (almost always in software; have a look at http://src.infinitewave.ca/) and linearity errors in the hardware DAC. However, it's still much more common to hit analog shortcomings (an output stage that can't go full range without distorting badly, etc).

How sampling in practice differs from the ideal:

1) the sampling theorem assumes the sampling period is infinitely small. In practice, it's not. It's close enough, though, that you will have trouble measuring the effect on the bench. You have no hope of ever hearing the effect unless you're designing a bad DAC on purpose just to hear it.

2) quantization assumes perfect linearity. Again, in practice, it's not perfectly linear. This was a problem in the bad old days, not so much now with oversampling (and cascading). The audible effect is primarily harmonic and intermodulation distortion. Here's an example file for you:

http://people.xiph.org/~xiphmont/demo/30_and_33.wav

That's two tones, one at 30kHz and one at 33kHz. They're completely inaudible, and the file should sound 100% silent. If it does not, you're probably hearing intermodulation distortion from a linearity problem in your DAC or analog amplifier (or maybe even from your transducers). Regardless of the source, if you hear it, that's what nonlinearity sounds like.

Oversampling has killed this problem in the DAC, but you might still hit problems in the analog stages.

The other reason you might hear something is bad resampling in your computer's sound drivers. You can't blame that on the DAC.

3) clock jitter adds noise. It was established in the 80s that it can be audible. The problem's been thoroughly addressed since then with better clocks.

4) the antialiasing filters: These were once the weak link before oversampling. A bad filter will roll off too early, too late, not fast enough (causing aliasing) or ripple in the passband's frequency response. Oversampling has effectively killed this problem. The only time you'll hear it is in those singing greeting cards.
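To make point 2 concrete, here is a rough simulation of what the 30kHz + 33kHz test exposes. It is only a sketch: it assumes NumPy, a 192kHz simulation rate, and a made-up "amplifier" (`nonlinear_stage`) with a small second-order term chosen purely for illustration, not a model of any real device.

```python
import numpy as np

FS = 192_000   # simulation sample rate in Hz (assumed)
N = FS         # one second of samples, so FFT bin k sits at exactly k Hz


def two_tone(f1=30_000, f2=33_000, amp=0.4):
    """Two ultrasonic sines, as in the 30_and_33 demo file."""
    t = np.arange(N) / FS
    return amp * np.sin(2 * np.pi * f1 * t) + amp * np.sin(2 * np.pi * f2 * t)


def nonlinear_stage(x, k2=0.05):
    """Hypothetical weakly nonlinear stage: the input plus a little of its square."""
    return x + k2 * x ** 2


def level_db(x, freq):
    """Level of the component at `freq` Hz, in dB relative to amplitude 1.0."""
    amp = 2 * np.abs(np.fft.rfft(x)[freq]) / x.size
    return 20 * np.log10(amp + 1e-20)   # tiny offset avoids log10(0)


if __name__ == "__main__":
    clean = two_tone()
    distorted = nonlinear_stage(clean)
    print(f"3 kHz level, linear path:    {level_db(clean, 3_000):.1f} dB")
    print(f"3 kHz level, nonlinear path: {level_db(distorted, 3_000):.1f} dB")
```

The linear path has essentially nothing at 3kHz. The squared term creates a difference tone at 33kHz - 30kHz = 3kHz, squarely in the audible band, which is the kind of thing you would hear if the file does not play back silent.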

Edited Date: 2012-03-08 06:29 pm (UTC)

From: (Anonymous)

I am very pleased you have written this article. I understood, and agreed with most of it, but I did not understand a couple of parts.

I had most problems understanding the section "The dynamic range of 16 bits"

It says "16 bit audio is commonly said to have a dynamic range of 96dB (each bit doubles the range and a doubling is about 6dB so, 6dB*16=96dB). This is incorrect."

It then offers an encoding of a 1kHz tone, at a level of -105dB, using 16bit/48kHz in a wav file. It provides a spectral analysis plot to show that 16 bits can encode such a signal.

While that is sufficient to show something can be encoded, I don't think that is sufficient to prove that 16 bits is sufficient to encode *all* sounds with a level below -96dB. Nor does it prove that such encoding would be perceived in the same way as a higher-bit encoding of the same signal.

I can understand that frequencies which have simple integer relationships with the sampling frequency can be encoded and have spectral plots which show lower energy than -96dB.

For example, let's assume the encoding uses a signed integer, with integer 0 as no signal, and the sample rate is 48kHz.

We could construct a data stream which has +1 and -1 at appropriate sample-rate distances to produce any single tone of 1kHz, 2kHz, 4kHz, 500Hz, 250Hz, etc.

We could then remove alternate pairs of +1 and -1, setting the value to 0, and that has reduced the energy content of the signal. So I assume it must be below -96dB. We could remove two samples from three, and so on, reducing the energy in the signal. I apologise if I have already made an error, but this seemed 'intuitive'.

Looking at it a slightly different way, is this 1kHz signal just an artefact of taking some large number of samples? Are there other frequencies which can be carefully encoded to give an audible tone at selected frequencies, but in general, some frequencies can not be encoded down to -105dB? Using that approach of encoding signals at specific sub-frequencies of the sample frequency may be misleading me, but it seems like some combinations of frequencies will be handled less well than others. After all, it seems the signal must be encoded in a tiny number of sample values (or the signal will have an average power above -96dB), and I assume there is a finite duration after which the ear+brain no longer hears 'sound' and it degenerates into noise.

Summary: the existence of that wav file that encodes a specific frequency at a level below -96dB doesn't seem to be enough to prove that 16bits can encode *all* sounds with a level below -96dB, only some of them.

My second concern is the ability to encode that signal with that low energy, for example using the technique I mention, doesn't demonstrate that a human would perceive the signal as the original sound. Does a human perceive this as a 1kHz tone which is identical to a 1kHz signal encoded with 24 bits?

Further, how does that sample prove that all audible sounds can be encoded in 16 bits with a level down to 120dB? It may be obvious to you, since you think about this a lot, but I need a bit more help and information.

A small point. The article says "Handled correctly, the dynamic range of 16 bit audio reaches 120dB in practice [10], more than twenty times deeper than the 96dB claim."

The difference between 120dB and 96dB is 24dB. Earlier the article says "a doubling is about 6dB", so I'd expect the difference to be about sixteen, not "more than twenty". On my calculator, the difference between 120dB and 96dB is a factor of 15.85.
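The arithmetic can be checked in a few lines. This is a small sketch with helper names invented for illustration; it treats the 24dB gap as an amplitude ratio (the convention behind "a doubling is about 6dB") and also prints the power ratio for comparison.

```python
def db_to_amplitude_ratio(db):
    """Amplitude (voltage) ratio corresponding to a difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Power ratio corresponding to a difference in dB."""
    return 10 ** (db / 10)


diff_db = 120 - 96                                   # 24 dB

print(round(db_to_amplitude_ratio(diff_db), 2))      # 15.85
print(round(db_to_power_ratio(diff_db), 1))          # 251.2
print(2 ** (diff_db / 6))                            # 16.0, four doublings at 6 dB each
```

The 15.85 and 16 figures differ only because "6dB per doubling" is a rounding of about 6.02dB.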

The spectrum shows a wide range of frequencies as well as the 1kHz peak; for example, there is a peak about 20dB lower than the 1kHz tone at just above 8kHz. Is this showing that it is impossible to encode a low-level signal without other frequency content? If that is the case, can all the other signal content be digitally removed without affecting the 1kHz signal? If it can't be removed without a-priori knowledge of what the actual signal should contain, then might that be a problem? If 16 bits is the *only* format available for storing and transporting sound, yet it introduces artefacts that can not be removed, then it is not the encoding we need for sound which may be further processed.

From:

xiphmont.livejournal.com

Hello (I took the liberty of deleting a mostly duplicate earlier comment; I hope that's OK).

I've substantially rewritten the 'dynamic range of 16 bits' section, as I wasn't happy with it and it confused many people. Hopefully it's much easier to understand now, and I think it answers your questions directly (you weren't the only person to have them).

>On my calculator, the difference between 120dB and 96dB is a factor of 15.85.

Yup, you spotted an error. Fixed.

> The spectrum shows a wide range of frequencies as well as the 1kHz peak, for
> example there is a peak about 20dB lower than the 1kHz tone at just above
> 8kHz. Is this showing that it is impossible to encode a low-signal level
> without other frequency content?

That's noise from the shaped dither. And yes, dither of some variety is needed; it doesn't just allow the very low level signal to be encoded, dither is also the mechanism by which quantization is rendered distortion-free.

It's not leaked energy or an artifact, it is merely injected uncorrelated noise.
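A small experiment shows the mechanism. This is a sketch only: it assumes NumPy and uses plain, unshaped TPDF dither (the sum of two uniform random values) rather than the shaped dither used for the article's file, and the function names are invented for illustration. A 1kHz tone at -105dB peaks at well under half of one 16-bit step, so plain rounding erases it entirely, while rounding after adding dither preserves it underneath a noise floor.

```python
import numpy as np

FS = 48_000        # sample rate in Hz
N = FS             # one second of samples, so FFT bin k sits at exactly k Hz
FREQ = 1_000       # tone frequency in Hz
LEVEL_DB = -105.0  # tone level relative to 16-bit full scale


def tone_in_lsb():
    """The test tone, expressed in units of one 16-bit step (LSB)."""
    t = np.arange(N) / FS
    amp = 10 ** (LEVEL_DB / 20) * 32768      # about 0.18 LSB
    return amp * np.sin(2 * np.pi * FREQ * t)


def quantize(x, rng=None):
    """Round to whole steps; if `rng` is given, add TPDF dither first."""
    if rng is not None:
        x = x + rng.uniform(-0.5, 0.5, x.size) + rng.uniform(-0.5, 0.5, x.size)
    return np.round(x)


def amplitude_at(x, freq):
    """Amplitude (in LSB) of the component at `freq` Hz."""
    return 2 * np.abs(np.fft.rfft(x)[freq]) / x.size


if __name__ == "__main__":
    x = tone_in_lsb()
    plain = quantize(x)
    dithered = quantize(x, np.random.default_rng(0))
    print("undithered output is all zero:", not np.any(plain))
    print(f"original tone amplitude:  {amplitude_at(x, FREQ):.3f} LSB")
    print(f"recovered after dither:   {amplitude_at(dithered, FREQ):.3f} LSB")
```

The dithered samples are still only small integers, yet the 1kHz component survives at close to its original amplitude; what the dither costs is a broadband noise floor uncorrelated with the signal.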

> If that is the case, can all the other
> signal content be digitally removed without affecting the 1kHz signal? If it
> can't be removed, without a-priori knowledge of what the actual signal
> should contain, then might that be a problem? If 16 bits is the *only*
> format available for storing and transporting sound, yet it introduces
> artefacts that can not be removed, then it is not the encoding we need for
> sound which may be further processed.

The short answer is: correct. Processing breaks the dither (rendering it partly or completely useless) but the added noise from the dither is still there.

16 bit depth, dithered or no, is still going to be deeper than any analog format that preceded it. But if you have the choice between processing 24 bit audio and 16, you should choose the 24.

From: (Anonymous)

You mentioned squishyball. Do you have any intention to make a release? I'd like to see a Debian package, and one of the guidelines is that any uploaded package should stem from an actual release. There's the occasional -svn package, but I've never seen one without some justification (most often a non-svn package that's hopelessly outdated).

Also, is there a (more-or-less) canonical set of samples to test against?

From: (Anonymous)

A .tar.xz of the sources would already increase the comfort level, because I wouldn't have to install git or click 50 times until I have the source code ready to compile.

From:

garage-rob.livejournal.com

lol, I was definitely saying "whaaa?" thanks for linking the whole article :-D

From:

denisekw.livejournal.com

thank YOU

From:

xiphmont.livejournal.com

Looks like the spambots are out in full force again. It's been ~ 6 months, so I think it's not a bad time to close comments to keep the spam out.
Thanks to everyone who wrote, and there's still email :-)