Please explain Bell's theorem

I came across this page on Wikipedia about something called Bell’s Theorem:

Sounds interesting but unfortunately most of it is over my head. Can someone explain what this theorem is and why someone would say “Bell’s theorem is the most profound discovery of science”? (see footnotes 3 and 12 in wiki article.)

Many quantum physics events have paradoxical results. Bell’s theorem simply asserts that some of those results are indeed paradoxical, in a well-defined sense.

Stripped of detail, in a typical such experiment each of two “entangled” photons is tested (with a polarizing filter of variable angle); the results are seen to be inconsistent with “local variable” models. That is, one photon seems to “sense” the angle with which the remote photon was/will be tested.

Theoretical physicists are well aware that the “paradox”, at least in the simplest cases, goes away by simply allowing “advanced action” (i.e. future-to-past) causality, but few embrace this perspective, preferring non-intuitive notions of “entanglement.”

Can you explain what “local variable” and “hidden variable” means? Seems to be central to bell’s theorem.

The Wiki article states “No physical theory of local hidden variables can ever reproduce all of the predictions of quantum mechanics.”

The “local variables” denote parameters (or state, or “memory”) associated with a particle. These are not enough to reproduce the results; as I stated

… the photon’s behavior also seems to depend on information gathered/produced by its twin.

To put it another way: quantum mechanics has the weird property that if you run an experiment on (say) an electron, you’ll often get a distribution of different results instead of the same result every time. You might think that this simply means that we didn’t really do the experiment the same way each time; that the electron had some “hidden” information that we weren’t privy to. This extra information is what is referred to as “hidden variables” in Bell’s Theorem. Moreover, one would intuitively expect this hidden information to be “local”, i.e., innate to the electron itself. In other words, the electron really had a well-defined state all along, and this state was independent of the state of the rest of the Universe.

What Bell’s Theorem shows is that the Universe does not have local hidden variables; one can do experiments whose results are inconsistent with the mere existence of this type of information.

It’s been quite a while since I had occasion to learn about this, but I remember an analogy with throwing dice. Throwing dice seems random, but in a sense it really isn’t. If one knew and could take proper account of all the variables involved - the force with which the dice are thrown, the angle they hit the table, the friction with the table, the distribution of weight within each die, etc., etc. - one should be able to predict with high accuracy (in theory 100%) what numbers will come up each time. But all of those variables, though theoretically knowable, can never be known in any real world situation. So those variables remain “hidden”. What seems random is really just a byproduct of our ignorance of all the variables involved.

What Bell’s Theorem seems to say (and please edify me if this is way off base) is that even if you knew all of those hidden variables, you would still not be able to predict the throw of the (quantum) dice, as randomness is, in the quantum world, not just a byproduct of ignorance, but a fundamental property.

Or rather, that there are no hidden variables to know at all. The dice are in a “mid throw” state and later in a “lying on the ground” state, and that’s all there is to know.

One possible Bell’s theorem set up involves two particles that are set up so that they have undetermined anti-aligned spins (e.g. by having a spin-0 particle decay into two spin-1/2 particles). The two particles are then separated, and their spins are measured along the same axis. We find that one is spin up, and the other is spin down 100% of the time.

There are two ways to interpret this result:

[ol]
[li]At the time of interaction of the particles, the spins were actually fully determined, but unknown to us (i.e. a hidden variable), and our later measurements just measured that.[/li][li]The spin of the particles was, as is typical in QM, in a superposition of states, and the perfect correlation between one particle’s state and the other is due to some implied communication between them at the time of measurement.[/li][/ol]

Bell’s inequality is a proposed experimental test that, by measuring the spins of the two particles along axes at an angle to each other, can actually distinguish between these two cases. QM’s predictions imply the latter, but we don’t have to trust QM because we can do the experiment.

Einstein famously didn’t like quantum mechanics very much, and coined the term “spooky action at a distance” to describe the weird behavior, and formulated it as the EPR paradox. He thought that this type of behavior, where measuring one particle could determine the state of an entangled particle at any distance, apparently instantaneously, was self-evidently wrong, and therefore quantum mechanics must be wrong or incomplete. He argued that you couldn’t really be changing one particle by acting on a different one, so instead the particles were somehow defined in advance and this information was carried with the particle (the local hidden variable), and you were just measuring the hidden variable, not actually forcing a particle into a particular state by measuring it.

Bell’s theorem showed that this could not be the case. The resolution of this “paradox” isn’t really a resolution at all, it’s just the recognition that quantum mechanics really is that weird.

Now Bell’s theorem does allow the possibility of global hidden variables (the de Broglie-Bohm formulation of quantum mechanics relies on them), but that doesn’t really resolve the apparent paradox: you still have things that appear to act over long distances instantaneously, which doesn’t really correspond to our intuition.

Not to mention special relativity. If in any reference frame any variable is conserved globally but not locally, then there exists another reference frame in which it is not conserved at all.

In order to appreciate Bell’s theorem, we need to first have a look at the concept of correlation in good, old classical physics. So consider the following example: there’s two shoe-boxes, a red ball, and a green ball. If we put each of the balls in one of the boxes, with equal probability, then distribute the boxes—say, you keep one here, I take one with me to Mars—, we share a classically correlated state. The correlation simply consists in the fact that if either you or I opens their box, we will immediately know what colour ball the other has.

The question that Bell’s theorem now assesses is whether all correlations, and particularly, those of quantum theory, can be explained in terms of such shoe boxes. The coloured balls are, in this example, the local hidden variable: they are local because whatever I do to my shoebox will not impact on yours; they are hidden because until we open the box (perform a measurement), we don’t know their values (but they always possess a definite value). The terminology is actually somewhat inconvenient: actually, anything we ever see is the value of such a ‘hidden variable’; their hiddenness only applies in cases when, in a sense, nobody’s looking.

So, to recap, the question we have before us is: can all correlations be explained with parameters that 1. do not influence one another instantaneously, and 2. always have a well-defined value?

Bell’s theorem answers this question in the negative. The reason it is held in such high regard is that at first sight, this doesn’t seem to be a question that permits of a definite answer at all, and additionally, it’s a question about the foundations of our world—it’s been characterized as ‘a piece of metaphysics decided in the laboratory’.

Now, let’s have a look at how, exactly, this feat is performed. The basic strategy is to formulate a combination of correlations, that is, of shoe-box experiment outcomes, that, if it is to be explained by parameters of the type discussed above, is limited in overall value. How does one do this?

Well, once more, there’s two shoeboxes, let’s call them A and B. However, this time, we can check whatever is in the shoebox with regards to two different properties–say, colour and weight. We are, however, only allowed to check either one or the other of these properties, not both at once (yes, this has, in quantum mechanics, to do with the uncertainty principle).

Now, let’s distribute a great many shoeboxes between A and B. An experiment will consist in both A and B checking one of their shoeboxes with respect to one of the properties of the balls inside—so A might check colour, B weight, or both might check colour, and so on. There are, thus, four possible checks that could be performed. Indicating a colour check with an index ‘c’, and a weight check with an index ‘w’, these are simply: a[sub]c[/sub]b[sub]c[/sub], a[sub]c[/sub]b[sub]w[/sub], a[sub]w[/sub]b[sub]c[/sub], and a[sub]w[/sub]b[sub]w[/sub].

In order to ease our notation somewhat, we will denote outcomes of the individual experiments by the values +1 and -1, for instance: green = +1, red = -1, heavy = +1, light = -1. A joint outcome is simply the product of both outcomes, so if the experiment a[sub]c[/sub]b[sub]w[/sub] (that is, A measures colour and B measures weight) is performed, and A obtains green = +1, while B obtains light = -1, we will note down a -1 for the total experiment. Now consider the quantity
C = a[sub]c[/sub]b[sub]c[/sub] + a[sub]c[/sub]b[sub]w[/sub] + a[sub]w[/sub]b[sub]c[/sub] - a[sub]w[/sub]b[sub]w[/sub].

It is simple (and I really mean simple, not mathematician-simple meaning that there exists a finite number of operations leading to the desired conclusion) to show that C ≤ 2. Consider that
C = a[sub]c[/sub](b[sub]c[/sub] + b[sub]w[/sub]) + a[sub]w[/sub](b[sub]c[/sub] - b[sub]w[/sub]).

This is just a simple reordering. But then, you can just try all possible combinations: whenever (b[sub]c[/sub] + b[sub]w[/sub]) = ±2, (b[sub]c[/sub] - b[sub]w[/sub]) = 0; and vice versa, whenever (b[sub]c[/sub] - b[sub]w[/sub]) = ±2 (e.g. b[sub]c[/sub] = +1, b[sub]w[/sub] = -1), (b[sub]c[/sub] + b[sub]w[/sub]) = 0. Since a[sub]c[/sub] and a[sub]w[/sub] are each ±1, only one of the two terms survives, and C is always either +2 or -2. This will remain true if we perform our experiment many times, and average over the results; in any single experiment, of course, we only have access to one of the terms.
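For anyone who would rather let a computer do the "try all combinations" step, here is a minimal sketch. The function name `chsh` is just a label I picked for the quantity C; it assumes, as the argument does, that all four values exist at once for each pair of boxes.

```python
from itertools import product

def chsh(a_c, a_w, b_c, b_w):
    """C = a_c*b_c + a_c*b_w + a_w*b_c - a_w*b_w for one fully specified pair of boxes."""
    return a_c * b_c + a_c * b_w + a_w * b_c - a_w * b_w

# Every way of pre-assigning +1/-1 to the four properties (2**4 = 16 cases).
values = sorted({chsh(*assignment) for assignment in product((+1, -1), repeat=4)})
print(values)  # [-2, 2]
```

Since every single assignment gives +2 or -2, no average over many pairs can end up above 2.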

Thus, whenever we perform our experiments, we will find for C a value that is no greater than 2. But note that we have made certain assumptions here: most notably, that A’s choice of measurement can’t affect B’s outcome (and vice versa); for if that were not the case, then whenever A measures c, things at B’s side could sort themselves such that both b[sub]c[/sub] and b[sub]w[/sub] turn up +1, and whenever A measures w, they could conspire such that b[sub]c[/sub] = +1 and b[sub]w[/sub] = -1. This is the assumption of locality.

We have furthermore assumed that there always is a fixed value for any given property, i.e. that we could, in principle, obtain all the values for a[sub]c[/sub], a[sub]w[/sub], b[sub]c[/sub] and b[sub]w[/sub]; this is necessary in order to even speak about the quantity C intelligibly for a single experiment (in which we only ascertain the values of one pair of these values).

And now, for the punchline: it is possible to produce a quantum mechanical set up of such ‘boxes’ such that the classical bound on the correlations is violated. The boxes are, commonly, something like two electrons in a special (‘entangled’) state, and it’s not their colour and weight, but rather, their spin along a certain axis (which is always either +1 or -1) that is checked. Both A and B have two axes along which to check the spin, but otherwise, they perform just the same kind of protocol as above: take one of their electrons (‘shoe-boxes’), measure the spin along either axis, note down the value, and continue; later, then, A and B meet, and multiply their values for corresponding pairs.

Now, for Bell’s theorem: if the above assumptions are valid, no system could produce a value of C > 2; however, in quantum mechanics, it is possible to obtain a greater value (equal to C = 2√2, in fact; why this is not the theoretically possible maximum value of C = 4 is a very deep question on which much research is being performed). Hence, one of those assumptions—locality or value definiteness—must be wrong. And there you have it.
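To see where the 2√2 comes from, here is a rough sketch. It takes as given the standard quantum prediction for two spin-1/2 particles in the singlet state: the average product of the two outcomes is -cos of the angle between the measurement axes. The particular axis angles below are my own illustrative choice, not something fixed by the theorem.

```python
import math

def E(angle_a, angle_b):
    """Quantum prediction for the average product of the two spin outcomes (singlet state)."""
    return -math.cos(angle_a - angle_b)

# Hypothetical axis choices (in radians), picked so that C comes out as large as possible.
a_c, a_w = 0.0, math.pi / 2
b_c, b_w = 5 * math.pi / 4, 3 * math.pi / 4

C = E(a_c, b_c) + E(a_c, b_w) + E(a_w, b_c) - E(a_w, b_w)
print(C, 2 * math.sqrt(2))
```

The two printed numbers agree (up to rounding), and both are larger than the bound of 2 that any assignment of definite local values has to respect.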

Just parenthetically, I would like to note that there are other assumptions you can make, in order to derive similar inequalities—for instance, non-contextuality (the outcome of one measurement is independent of what other measurements are performed simultaneously) together with value definiteness yield the Kochen-Specker theorem, while ‘macroscopic realism’—the assumption that a given object is always in one of the states available to it—and measurement nondisturbance yield Leggett-Garg inequalities.

I’m a programmer, not a physicist, but it sounds like what it is, is:

Say that you have a ball. The ball has a position on a graph, which we can define as its ‘x, y’ coordinates. It also has a speed that it is traveling, which we could call ‘v’. And it has a particular direction that it is traveling, which we’ll call ‘a’. So we can say that we have an object ‘Ball’ defined like so:

Ball {
x = 10
y = 42
v = 35
a = 73
}

Now, I could hand the above information to another person, and he would be able to take all of the variables contained within (local to) the Ball and he would be able to figure out what will happen to the ball next. After one unit of time, the ball’s ‘x’ and ‘y’ coordinates would have changed in accordance with its ‘v’ and ‘a’ parameters.
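As a runnable version of the same idea, here is a small sketch. It assumes that ‘a’ is an angle in degrees and that ‘v’ is distance per unit of time; neither is specified above, and the `step` method is a name I made up for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Ball:
    x: float
    y: float
    v: float  # speed, in distance units per time unit
    a: float  # direction of travel, assumed here to be in degrees

    def step(self, dt=1.0):
        """Advance the ball by dt using only its own (local) variables."""
        self.x += self.v * math.cos(math.radians(self.a)) * dt
        self.y += self.v * math.sin(math.radians(self.a)) * dt

ball = Ball(x=10, y=42, v=35, a=73)
ball.step()
print(ball)
```

Everything `step` needs is stored inside the object itself, which is the sense of "local" meant here.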

Bell’s theorem seems to be saying that the sorts of effects we see in Quantum Mechanics can’t be explained by variables like these. What happens to a quantum particle next cannot be explained solely by information built into the particle. Exterior variables of some form are manipulating its behavior. If I was to hand the object QuantumParticle to another person, he would not be able to make any guess about what will happen to it, based on the information enclosed in the object.

Half Man Half Wit, I would not consider the color of the balls in your example to be a “hidden variable”, because it is directly measurable. One can, however, construct models with variables which aren’t directly measurable (but which still have some inherent value), which are much more deserving of the name “hidden”.

For instance, suppose that instead of balls in boxes, we had needles in spherical eggshells. Before we open the eggshell, we can’t tell which way the needle is pointing. And (in this hypothetical) the only way we have of opening an eggshell is by smashing it between two hard parallel surfaces. If the needle within was not already lying in that plane, it’ll be deflected into the plane when we smash it. Now suppose that we have some way of preparing pairs of eggshells: When we take two eggs from such a specially-prepared pair, and smash them in the same plane (no matter what plane we choose), we find that the resulting orientations of the needles are always opposite.

This is easily-enough explained: Our special method of preparation must always produce eggs with needles pointing in opposite directions in three-dimensional space. We can never observe the three-dimensional orientations of the needles (they’re hidden), but if we assume that, whatever they are, they’re opposite, this will give us the result we observe.

But now suppose that we don’t always choose exactly the same plane to smash our two eggshells. If we smash them on almost exactly the same plane, then we would expect the needles after smashing to almost always point in almost exactly opposite directions. If we smash them in very different planes, we wouldn’t expect any particular relationship between the ways that the needles point, and so on. We could, based on our model, describe just exactly how correlated we expect the needles to be, for smashing-planes at various angles.

Bell took this one step further: He didn’t just look at the case where the hidden variable was a needle in an egg, he looked at all possible hidden-variable scenarios. And he wasn’t able to find an exact calculation for the correlation, because that depends on just how the hidden variables are set up, but he was able to find a lower bound for it. And quantum mechanics violates that bound.

Well, sort of. In fact, there is a set of observables for a quantum system that behaves pretty much classically: if you know their values, you can predict them, using the system’s time evolution, at any point in time. However, the difference to classical mechanics is that not all of the system’s observables—i.e. all its measurable properties—can be definite, or sharp, at the same time; when you have complete knowledge about one set of observables, you can’t know anything about the values of a different (conjugate) set. In classical mechanics, however, all properties of an object are simultaneously sharply measurable.

Also note that, contrary to what’s sometimes claimed, Bell’s theorem does not suffice to establish nonlocality: it only does so when we assume that all properties have definite values at all times. Since this isn’t the case in QM, it would be more accurate to say that quantum mechanics is a local theory, but any ‘classical-like’ theory yielding the same predictions must be nonlocal.

As are, for instance, the hidden variables in Bohmian mechanics—the particle positions. Measurement here merely uncovers their exact values at the time of the measurement. In your example, in contrast, measurement ‘throws away’ some information—from a position on a 2d manifold down to the direction of its projection onto the measurement axis. An equivalent model would be a set of spheres that are half-red, half-green, with any measurement corresponding to looking at the sphere through a narrow slit, and noting whether you see green on top and red on the bottom, or the other way around.

If you want your model to more accurately conform to quantum statistics, say by e.g. having the needle flip to the slit’s orientation with a cos[sup]2[/sup]-distribution, then actually the argument I’ve given above no longer works, because you lose the possibility to reason counterfactually in every single instance of the experiment: that is, if you actually do measure one set of orientations, you can’t any more argue that if you had measured the other set, you’d have gotten a certain set of outcomes, and thus, C is no longer bounded for any individual measurement. You can of course still derive Bell inequalities, but you’d have to use a probabilistic argument (in fact, a Bell inequality is nothing but the assertion that a certain set of random variables, giving the measurement outcomes, has a joint probability distribution, and as such, they were already introduced by George Boole in the 1860s).

As I understand it, when two particles are entangled, measuring one will affect the other. Is there a way to know if the particle has been affected? In other words, Bob and Steve are two scientists, separated by a large distance, and each has one from a pair of entangled photons. Would Bob know instantaneously if Steve did something to his photon?

Nope. Entanglement gets us results that make it seem, in retrospect, that the particles had some capability to communicate faster than light, but doesn’t give us the ability to use them to communicate faster than light ourselves.

In your example, Bob can measure his photon, and get some result A, and say that, “if Steve measured his photon, he would get some result A’ because of entanglement”, but he can’t tell whether Steve has actually made that measurement or not.

And if Steve “does something” to his photon (i.e. changes its state), that action breaks the entanglement completely, so all bets are off about the relationship between measurements of the photons after that.
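A quick numerical sketch of why Bob can’t tell. It uses the standard singlet-state probabilities for a pair of spin-1/2 particles (as in the spin example earlier in the thread) rather than photons, and the function names are made up for illustration; for photons the angle dependence is different but the conclusion is the same.

```python
import math

def joint_probs(theta):
    """Singlet-state probabilities for (Bob, Steve) outcomes, with axes theta apart."""
    same = 0.5 * math.sin(theta / 2) ** 2   # both up, or both down
    diff = 0.5 * math.cos(theta / 2) ** 2   # one up, one down
    return {("up", "up"): same, ("down", "down"): same,
            ("up", "down"): diff, ("down", "up"): diff}

def bob_up(theta):
    """Bob's chance of seeing 'up', summed over whatever Steve gets."""
    p = joint_probs(theta)
    return p[("up", "up")] + p[("up", "down")]

for deg in (0, 30, 60, 90, 180):
    print(deg, bob_up(math.radians(deg)))
```

Whatever axis Steve picks, Bob’s own odds stay at one half. The correlation only shows up when the two lists of results are compared afterwards.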

Here’s the main argument:

Let there be a set of objects where there are 3 dichotomous properties that can describe it. Let’s call them Firm/Soft, Heavy/Light, Old/New. Consider the following two sets: {(The number of Firm Heavy objects) plus (the number of Light New objects)} vs. {the number of Firm New objects.} Let’s write that as (FH + LN) ?= (FN)

Because the properties are dichotomous, we have the following:
FH = FHO + FHN (every FH object is either O or N, and is exactly one of the two)
LN = FLN + SLN (similar)

So we have FH + LN = FHO + FHN + FLN + SLN. The middle two terms, FHN + FLN, together make up FN (every Firm New object is either Heavy or Light), and whatever FHO and SLN are, they aren’t negative. Thus we have FH + LN >= FN. This must be true for any set of objects that have three dichotomous properties, regardless of their correlation with each other, assuming only that each object possesses exactly one value of each of the three properties and that no negative numbers of objects exist.
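The counting argument can also be checked mechanically. A minimal sketch: there are only 8 kinds of object, and if each kind adds at least as much to FH + LN as it does to FN, then the inequality holds for any collection of them. The one-letter labels follow the ones above; the function name is my own.

```python
from itertools import product

# The 8 possible kinds of object: one value for each dichotomous property.
kinds = list(product("FS", "HL", "ON"))

def contribution(kind):
    """How much one object of this kind adds to (FH + LN) and to FN."""
    f, h, n = kind
    fh = int(f == "F" and h == "H")
    ln = int(h == "L" and n == "N")
    fn = int(f == "F" and n == "N")
    return fh + ln, fn

holds = all(lhs >= rhs for lhs, rhs in map(contribution, kinds))
print(holds)  # True
```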

Now, in the world of Quantum mechanics, we have particles that have spin. We can measure that spin along a certain axis and always get up or down, never anything in between. If we measure the spin along an axis that is at a slightly different angle, we might get a different result, so let’s consider an experiment where we measure the spin of a particle along three different axes. To get something interesting, we’re not going to have them perpendicular to each other; instead, two of them will be offset from the third by some angle, in opposite directions. We now have three (potentially correlated, but that doesn’t matter) dichotomous properties, so we can use the above inequality to get bounds on the size of certain sets. Now we can run experiments using entangled particles where we measure the spin of each in different directions; we normally can only measure one direction at a time (and measurement destroys the state it’s in), but entangled particles allow us to firmly say what the other particle would have measured had we chosen the angle that was used for its entangled partner’s measurement.

Now let Firm/Soft be the measurement outcome of spin along the angle in one direction of the center, Heavy/Light for the central direction, and Old/New for the opposite angled direction. Quantum mechanics predicts that the above inequality that I proved does not actually hold for certain angles, experiments have been run that say that quantum mechanics is right, and thus our logic about dichotomous properties does not extend to the quantum realm. Something in our assumptions about them is fundamentally flawed. We’re not sure how exactly, but there’s something very strange going on. Quantum mechanics says it’s because the particles have no definite spin until measured - something that just doesn’t sit right with most people.
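Here is a rough numerical sketch of the violation, under some assumptions that are mine rather than part of the argument above: the pair is in the singlet state (so the partner always reads opposite on the same axis), Firm means "up on the left axis", Heavy means "down on the centre axis", New means "down on the right axis", and each outer axis is 60 degrees from the centre. With that labelling, each of the three sets corresponds to both particles reading "up" on two different axes, which quantum mechanics predicts with probability ½ sin² of half the angle between them.

```python
import math

def p_up_up(theta):
    """Singlet state: chance that both particles read 'up' on axes theta apart."""
    return 0.5 * math.sin(theta / 2) ** 2

theta = math.radians(60)      # hypothetical offset of each outer axis from the centre
FH = p_up_up(theta)           # left axis vs centre axis
LN = p_up_up(theta)           # centre axis vs right axis
FN = p_up_up(2 * theta)       # left axis vs right axis
print(FH + LN, FN)
```

The first number comes out smaller than the second, which is exactly what the counting argument says can never happen for objects with three definite properties.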

While Bell’s Theorem deals with paradox involving two entangled particles, there is also the GHZ experiment involving three entangled particles. The paradox for such a three-particle case may be easier to describe than that of the EPR paradox (though you might not see that from the Wikipedia article).

What about entangling four or five particles? Can even more elegant paradoxes be described? I hope our expert physicists will comment.

It is trivially true that there exist paradoces for more particles, since any two-particle or three-particle paradox still applies to them. It is almost certainly also true that other, new paradoces arise. Whether any of these are more elegant is a matter of taste.

That said, however, it generally takes a lot of contriving to get more than two or maybe three particles entangled together.

The two-particle paradox associated with Bell’s Theorem involves a probability greater than 0 and less than one that appears impossible, IIRC. No paradox arises until you do enough experiments for statistical significance.

The three-particle paradox associated with the GHZ Experiment leads to a probability which, in defiance of common-sense, must be zero instead of one (ignoring any uncertainties in the lab setup). This seems like a more “definite” paradox. (“Elegant” wasn’t the right word, but neither is “definite.”)

Although the three-particle paradox is more “definite”, it still takes a little thought to perceive it. I wondered if an even more “definite” paradox can be constructed.

Paradoces.

Huh.
ETA: They let you get away with that at the water fountain and in professional communication?