What specs would digital music have to have to be indistinguishable from analog?

I have heard about this war since the CD was invented. The main argument I hear about digital music is that the sampling rate is not good enough to match analog.

What would the sampling rate have to be to produce music that is technically indistinguishable from high-quality analog? When I say technically indistinguishable, I mean to go beyond whatever humans can subjectively detect and look at the wave that comes out of the speaker on playback. I am assuming that sampling rate would be some finite number, as analog quality must somehow be limited by physical characteristics. This sampling rate may turn out to be infeasible for all practical purposes, but this is not a practical question.

What else do we need to consider besides sampling rate? I don’t want to consider the “loudness wars”–let’s leave the mastering process out of this and just consider what it takes to capture the source signal faithfully.

What rate (and other parameters) you need for it to be indistinguishable depends on what equipment you’re using to try to tell the difference.

If your measuring equipment is the human ear, then most digital music is already past the sample rate where you can’t tell the difference, and audiophiles who say otherwise are just deluding themselves. Where there can still be a difference, though, is dynamic range. A typical sound system can’t reproduce both the quietest sounds humans can hear and the loudest sounds that don’t do damage. Most of them can’t do either.

I want to go beyond the human ear and use whatever is the current state of the art. I’m not sure what that is. Is there a way to capture a sound wave directly from the speaker leads in an analog format? Of course, if we capture it digitally then we’re right back to limitations (if any) of digital audio.

Which is a bunch of nonsense. A sampling rate of 44.1 kHz is perfectly adequate. The only problem is the source material.

The sample rate you need is entirely dependent on the highest frequency you wish to record. Multiply that highest frequency by 2, and you have your answer.

The other measure is the bit depth. This number dictates the noise floor, i.e. the quietest possible sound you can record.
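
To put rough numbers on those two knobs, here is a back-of-the-envelope sketch in Python. The factor of 2 (Nyquist) and the roughly 6 dB per bit rule are standard; the function names are just made up for illustration.

```
import math

def required_sample_rate(highest_freq_hz):
    # Nyquist: at least twice the highest frequency you want to keep.
    # Real converters add a little headroom for the anti-alias filter roll-off.
    return 2 * highest_freq_hz

def required_bit_depth(dynamic_range_db):
    # Each bit of linear PCM buys about 6.02 dB of dynamic range,
    # plus ~1.76 dB for a full-scale sine relative to the quantization noise.
    return math.ceil((dynamic_range_db - 1.76) / 6.02)

print(required_sample_rate(20_000))  # 40000 Hz -> 44.1 kHz once you add filter headroom
print(required_bit_depth(96))        # 16 bits covers roughly 96 dB
```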

Without both the highest frequency and noise floor being defined, there is no answer to your question. If we can’t use human hearing as the limits, and practicality doesn’t matter, then I think the only thing left is the equipment itself.

But that would be the same limitation that analog audio would have. It’s not about the digitization. It’s just that, for practical purposes, digital audio has to set those two variables.

Note that these limitations exist on any audio recording medium. Tape and records have a limited frequency response and noise floor, too. And digital has far surpassed those, and did so with CD-quality audio.

Your question is akin to asking what is the highest possible resolution with the highest possible framerate.

Analog what? Records are analog, and you’d have to purposely make CDs worse to match them.

The top level answer is trivial.
The bandwidth and dynamic range of the required signal exactly determine the sample rate and bit depth.
This is precisely given by the Shannon-Nyquist sampling theorem.
If we take ordinary human hearing, CD’s 44.1 kHz sample rate and 16 bit sample depth are sufficient. This comes with a couple of caveats. The signal needs to be properly dithered (which is a technical term), preferably with a perceptually noise-shaped dithering technique. This gets the effective bit depth in the frequency range of highest sensitivity up to about 18 bits.
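
For anyone curious what “properly dithered” means in practice, here is a minimal sketch of plain TPDF dither (not the perceptually noise-shaped kind mentioned above) applied when reducing a float signal to 16 bits. The function name and the test tone are just illustrative.

```
import numpy as np

rng = np.random.default_rng(0)

def quantize_16bit(x, dither=True):
    # x is a float signal in [-1, 1). TPDF dither (two uniform values summed,
    # spanning +/-1 LSB) is added before rounding, which decorrelates the
    # quantization error from the signal: distortion becomes a flat noise floor.
    scale = 2 ** 15
    y = x * scale
    if dither:
        y = y + rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)
    return np.clip(np.round(y), -scale, scale - 1).astype(np.int16)

# A very quiet 1 kHz tone (a few LSBs in amplitude): undithered it quantizes
# into harmonically related distortion, dithered it becomes tone plus noise.
fs = 44100
t = np.arange(fs) / fs
tone = 1e-4 * np.sin(2 * np.pi * 1000 * t)
undithered = quantize_16bit(tone, dither=False)
dithered = quantize_16bit(tone, dither=True)
```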

The other caveat is that the chain is properly designed, which isn’t a given. This is especially a problem with implementation of anti-alias and reconstruction filters. In the modern world there is no excuse for this not to be true.

Modern recordings tend to be done at 24 bit 96kHz. This is way excessive, but affords the production chain the freedom to attenuate and boost levels and generally futz with the audio without getting into trouble with clipping (which can occur even if the samples don’t actually saturate) and do things like pitch shifting.
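
The parenthetical about clipping without saturated samples refers to intersample peaks: the continuous waveform the samples describe can swing above full scale between samples. A small sketch of the classic worst case (the 1024-sample length and the x8 oversampling factor are arbitrary):

```
import numpy as np
from scipy.signal import resample_poly

# Samples of a sine at fs/4 with a 45-degree phase offset: every stored
# sample is exactly +/-1.0 (full scale), but the band-limited waveform
# passing through those samples is a sine of amplitude sqrt(2), about +3 dBFS.
n = np.arange(1024)
x = np.sign(np.sin(np.pi / 2 * n + np.pi / 4))

# Oversample x8 to approximate the reconstructed (continuous) waveform.
y = resample_poly(x, 8, 1)
print(np.max(np.abs(x)), np.max(np.abs(y)))  # 1.0 vs roughly 1.41
```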

When the final mix is created it can be converted to 44.1/16 - which includes dithering. Once this happens the recording does not contain enough leeway to survive any further messing around.

Why are you so worried about the sampling rate and digital vs analogue when your speakers have limits on maximum loudness, frequency range, etc.? For reference, a DXD file has 24-bit PCM sampled at 352.8 kHz. I’m not saying to throw away your 2-inch tape, but come on. (In fact some people deliberately record to tape to get that analog “tape warmth”.)

I am not the least bit worried. I am trying to gain an understanding of 1) the audiophile POV on digital vs. analog audio quality, and 2) what it would take to nullify any argument they might have, even if it’s wrong to begin with. To wit, sampling rate is often cited.

1) I don’t know about “audiophiles”; you realize that’s a marketing buzzword, and most of them are probably not professional sound engineers, nor do they necessarily have particularly good hearing. But there are technical issues specific to digital recording, like eliminating clock jitter. 2) Blind listening tests.

The frequency ceiling would be 20 kHz, because that is the limit of human hearing. But I specified analysis using technology and not humans because humans are subjective and not every human will agree on what they hear. I want to take humans out as a variable. If a digital recording captured frequencies only up to 10 kHz, some people would not be physically capable of hearing the difference, some people might be physically capable of hearing the difference but not know how to listen and therefore be unaware of it, and some people with training or experience could tell the difference.

The noise floor would be the threshold of human hearing, using the same rationale as for the frequency ceiling. This threshold varies with frequency, so let’s just say 0 dB SPL.

If this is true then my entire question is rendered moot. So do audiophiles even have a defensible position that vinyl is somehow superior to digital, in any way at all?

In a word - No. And furthermore they never did, not even back in 1980.

But digital audio sounded different. And to the curmudgeons, that was prima facie evidence that it was worse. Because, tautologically, they defined the sound of vinyl as the “correct accurate genuine sound”. Despite the wow, the pops & crackles, the funky RIAA equalization, mechanical wear of their silly physical disks, and all the rest.

Mixed in with a lot of this was both the transition from tube amps to transistors, and a mad dash by the CD production companies to slap out as many badly remastered, or not remastered at all, new CDs as they could ASAP. All hoping to catch the wave of demand for CD content that was then exploding.

A bunch has been learned about how to make better CD sound. But those lessons came early, and sampling rate was never the true issue. It was just something simple and “digital-ish” sounding for the analog curmudgeon brigade to hitch their whine-wagon to.

Curmudgeons about newfangled airplanes: If god had meant for man to fly, he’d have given us wings.

Curmudgeons about newfangled CDs: If god had meant for man to listen to digital audio, he’d have given us ears with high frequency clocks.

It’s about that dumb. And always has been.

@Francis_Vaughan is an actual expert in this stuff. I’m just repeating what I’ve learned along the way.

Yeah. As far as “what else do we need to consider besides sampling rate?” goes, you need to make it worse. That from the beginning has been the way to make digital indistinguishable from analog; do things that add noise to the recording.

So your question isn’t actually about being indistinguishable from the original audio, but being indistinguishable from the highest quality analog recording of said audio. That’s more doable.

I’m not finding clear numbers on the noise floor of various analog audio formats, but I am finding the dynamic range, which is what determines the noise floor in digital audio. The highest dynamic range I’m finding is 80 dB for studio reel-to-reel tape running at 30 IPS (inches per second). Vinyl seems to have a maximum dynamic range of 70 dB. CD audio, with its 16-bit depth, has a dynamic range of 96 dB.
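
The rule of thumb behind the digital figures is roughly 6 dB of dynamic range per bit. A quick sketch putting the numbers side by side (the analog figures are the ones quoted above, not measurements of mine):

```
# Roughly 6 dB of dynamic range per bit of linear PCM (the often-quoted
# 96 dB for CD is essentially 16 x 6 dB; the exact full-scale-sine-vs-noise
# figure is a couple of dB higher).
for bits in (16, 24):
    print(f"{bits}-bit PCM: ~{6.02 * bits:.0f} dB")

# Analog figures as quoted above:
print("30 IPS studio reel-to-reel: ~80 dB")
print("vinyl: ~70 dB")
```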

CD audio has a sample rate of 44.1 kHz, and thus can handle audio up to 22.05 kHz. So it is already above the maximum for human hearing. The highest quality vinyl and reel-to-reel match this, though there is the potential issue with low frequencies in vinyl causing the needle to jump out of the groove.

So the answer seems to be that CD-quality audio already surpassed all analog formats before it, and that there haven’t been any real improvements since.

The only arguments I’ve seen that have any possible relevance are that some humans might be able to hear frequencies above 22.05kHz-24kHz* but only subconsciously, but I believe A/B tests have not found this to be true.

*Most audio these days has 48kHz sampling, because it’s a nicer round number. I believe 44.1kHz was chosen for space saving reasons, but this is less of a problem today with audio compression.

But the limit of human hearing still matters even if you are doing your analysis with technology because human hearing determines what the source material is. There is no music with a note at 50 kHz because no one (except maybe bats) could hear it. If theoretically someone built an analog system that could play such notes then indeed a digital system using a 44.1 kHz sample rate couldn’t replicate it, but I don’t think that’s what you’re asking.

To this curmudgeon the “worse” parts of going digital were losing the fun of album art and reading the liner notes and going to the record store and flipping through all the albums in the various bins.

Also, there is just something fun about record players that CD players lack akin to the difference between a mechanical and digital watch. I get they do not sound better and are less convenient but still…I kinda miss them.

And, some actually like the little errors a record has over the sterile CD sound.

(Disclosure: I do not own any vinyl or tape audio anymore…it is all digital for me but a part of me does miss the days when I had a vinyl record collection.)

This is a fun one. 44.1 kHz was chosen for a series of arcane reasons. Back when CDs were being designed, the 600-odd MB of data on one was a lot. Disk drives that held that much were the size of washing machines. What was designed was an adapter that could store the digital data on a videotape. The format chosen ran the data at the 44.1 kHz sample rate - so it was a useful recording device as well - you could feed it from a stereo analog-to-digital converter.

But Philips was in Europe, and Sony in Japan. One used PAL TV, the other NTSC. PAL was 625 lines per interlaced frame (and thus 25 Hz); NTSC was 525 lines at 29.97 Hz. The line rates are pretty close: PAL is 15,625 Hz, NTSC 15,734 Hz. Close enough that you could pack three stereo samples (so 3 * 2 * 16 = 96 bits) on each usable line of either format, and with the number of usable lines per field on each standard the arithmetic works out to exactly 44.1 kHz. The result was a format that could store 16-bit stereo at 44.1 kHz. You could hold an entire CD of data on a single video cassette. (They actually ran the NTSC recorders at 30 Hz, not 29.97.)
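
The arithmetic, for anyone who wants to see where 44.1 falls out. The usable-lines-per-field counts are the ones usually quoted for those PCM adaptors; treat them as illustrative.

```
# Three stereo samples per usable video line, on either TV standard:
samples_per_line = 3
ntsc = samples_per_line * 245 * 60   # 245 usable lines per field, 60 fields/s
pal  = samples_per_line * 294 * 50   # 294 usable lines per field, 50 fields/s
print(ntsc, pal)                     # 44100 44100
```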

48 kHz is just a nice round number, something engineers like. But there were real reasons as well. There was a digital format used for distributing audio to FM broadcast stations. FM only goes out to 15 kHz, so a sample rate above 30 kHz would work (with a bit of room for the anti-aliasing filter), and 32 kHz was a perfect match. There was a desire for a professional digital format that was able to work with the 32 kHz rate. Back then sample-rate conversion was not well understood, and a simple decimator (drop every 3rd sample) and low-pass filter could be used to convert 48 kHz to 32 kHz. Moreover, 48 kHz gave producers that extra leeway to do stuff. And now 96 kHz is common. Rate conversion is vastly easier with integer multiples, although the mathematics to do arbitrary conversion was worked out decades ago and has made worrying about it much less of a problem.
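The modern equivalent of that 48 kHz to 32 kHz step is a polyphase resampler, which bundles the low-pass filter and the keep-2-of-every-3 decimation together. A minimal sketch using scipy (the 1 kHz test tone is arbitrary):

```
import numpy as np
from scipy.signal import resample_poly

fs_in, fs_out = 48000, 32000
t = np.arange(fs_in) / fs_in
x = np.sin(2 * np.pi * 1000 * t)   # one second of a 1 kHz tone at 48 kHz

# Low-pass filter and keep 2 of every 3 samples: 48 kHz * 2/3 = 32 kHz.
y = resample_poly(x, 2, 3)
print(len(x), len(y))              # 48000 32000
```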

There was lots of comment that another reason pro formats differed from CD was to prevent the gear being used to pirate recordings on CD. Lots of moaning when DAT became available to consumers that it couldn’t record a CD digitally.

Sure. That’s exactly what analog recording equipment does: it takes an analog electrical signal and records it on analog media. But then what do you do with the recorded speaker signals from your analog and digital sources? If the whole point was for a listener to compare the sounds of an analog source and a digital source, then why would you want to insert another analog device into the whole sequence? For that matter, why are you proposing recording from the speaker leads into an analog format rather than a digital format?

Just play the analog source through the speaker directly to a listener’s ear, then play the digital source through the same speaker. The only way you’ll get digital and analog to be indistinguishable to the ear is if the analog source is a studio-grade analog mastering system, the performance of which far exceeds consumer grade (vinyl, cassette) analog formats.

Yes, I explained that the limit of human hearing was my benchmark for what the technical analysis would look at.

I’m not interested in the ear. I’m interested in a technical analysis that does not depend on the subjective judgement of a person.

How can you take the output of a digital recording and an analog recording and compare the wave forms they produce? Wouldn’t capturing the signal sent to the speakers do that?
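
One way to make that comparison concrete, once both speaker-lead signals have been captured as sample arrays (however they were captured), is a null test: time-align the two captures, subtract, and see how far down the residual sits. A minimal sketch, assuming equal sample rates and only a constant time offset between the captures; the function name is just illustrative, and a real comparison would also need gain matching and clock-drift correction.

```
import numpy as np
from scipy.signal import correlate

def null_test(ref, test):
    # ref, test: 1-D float captures of the speaker-lead signal from the two
    # playback chains, at the same sample rate. Handles only a constant time
    # offset between them.
    lag = np.argmax(correlate(test, ref, mode="full")) - (len(ref) - 1)
    if lag >= 0:
        test = test[lag:]
    else:
        ref = ref[-lag:]
    n = min(len(ref), len(test))
    residual = test[:n] - ref[:n]
    # Residual power relative to the reference, in dB (more negative = closer match).
    return 10 * np.log10(np.mean(residual ** 2) / np.mean(ref[:n] ** 2))
```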