Can somebody explain the two envelope paradox to me

I meant that, in your example, expected value is used to predict whether X will be larger or smaller than Y. But from “the expected value of X/Y is greater than 1” I wouldn’t have derived “X is probably greater than Y” but rather “making the choice which, in our example, has a value determined by the expression X/Y, will tend to yield a gain greater than 1.”

Me too. I’m saying that in your example, expected value is being used the wrong way. You’re saying this too, correct?

If you roll a standard six-sided die, the expected value of the number that comes up is 3.5. You wouldn’t go making any choices that are based on the assumption that you’ll see a non-integer, would you?
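For anyone who wants to see the die arithmetic spelled out, here is a minimal sketch in Python. It assumes a fair die, and the variable names are just for illustration. It shows that the expected value is an average of outcomes and not itself an outcome you could ever roll.

```python
from fractions import Fraction

# Faces of a standard six-sided die, each assumed equally likely.
faces = [1, 2, 3, 4, 5, 6]

# Expected value = sum of (face * probability of that face).
expected = sum(Fraction(face, len(faces)) for face in faces)

# True when the expected value is itself a face you could roll.
is_possible_roll = expected in faces

print("expected value:", expected)
print("is it a possible roll?", is_possible_roll)
```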

I apologize if this was said earlier in the thread, but it seems to me the key is that while the value of your envelope is X, the expected value of your envelope is 1.25X. It seems weird to think that your envelope has an expected value that is higher than its actual value, but I believe it does. So you’re being asked to exchange your envelope, which has an expected value of 1.25X, with another envelope with the same expected value. You are indifferent. The fact that you know the actual value of your envelope is X, I think, is irrelevant.

I could be thinking about this wrong.

Call your envelope X and the other envelope Y.

E[X | Y], i.e., the expected value of X conditioned on Y, is a term which depends on Y but not on X; it tells you “Supposing you conditioned the probability distribution on Y having a particular value; with this weighting, what would the arithmetic mean value of X be?”. Specifically, E[X | Y] = 1.25Y, as correctly shown by the familiar reasoning of the problem.

E[X | X], i.e., the expected value of X conditioned on X, is a term which depends on X but not on Y; it tells you “Supposing you conditioned the probability distribution on X having a particular value; with this weighting, what would the arithmetic mean value of X be?”. Specifically, E[X | X] = X; if you knew what X was, well, then, there you go, that’s what X has to be.

As for E[X], i.e., the unconditional expected value of X, it is a term which depends on neither X nor Y; it tells you, “Supposing you took the given probability distribution as is and did not condition it on anything further; with this weighting, what would the arithmetic mean value of X be?”. Specifically, E[X] is infinitely positive, as shown previously in this thread.

E[X | Y], E[X | X], and E[X] are all different terms which may be called “the expected value of X” in different contexts; however, they mean different things. One can speak of E[X | C] for any context C of random variables upon whose values to condition, and the resulting terms may look widely varied indeed. However, there is no C such that E[X | C] = 1.25X; if C includes the information as to what X is, then that’s what the expected value of X is, straight-up, and if C doesn’t include such information, then the term E[X | C] cannot depend on X.

Here’s the simple summary of the situation:

M = the smaller amount of money, i.e., half of the larger one (you have to choose one or the other as your unit to write it up)

Envelope 1 can have M or 2M with equal probability
Envelope 2 has ((M+2M)-Envelope1)

Both envelopes will have (M+2M)/2 as their average value over many iterations, and, of course, it’s the same average value for both envelopes.

When you choose an envelope, all you know is that both envelopes have the same average value over time, nothing should compel you to choose one over the other.
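The setup above can be sketched as a quick simulation. This is only an illustration under stated assumptions: M is fixed at a hypothetical 100, and the function name `simulate`, the trial count, and the seed are made up for the example. It only covers the case where the two amounts M and 2M are fixed in advance.

```python
import random

def simulate(m=100, trials=100_000, seed=0):
    """Average contents of each envelope over many plays, with M fixed."""
    rng = random.Random(seed)
    total1 = 0
    total2 = 0
    for _ in range(trials):
        # Envelope 1 holds M or 2M with equal probability.
        env1 = rng.choice([m, 2 * m])
        # Envelope 2 holds whatever is left of the 3M total.
        env2 = (m + 2 * m) - env1
        total1 += env1
        total2 += env2
    return total1 / trials, total2 / trials

avg1, avg2 = simulate()
print("average of envelope 1:", avg1)
print("average of envelope 2:", avg2)
```

Both averages should land close to (M+2M)/2, which is 150 for the assumed M of 100, so neither envelope comes out ahead over many iterations.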

Well yes, there are many ways to get the correct answer. OP asks to explain the fallacy in his (failing) approach. Just as a 2=1 paradox isn’t busted just by proving 2>1, so OP’s query isn’t to prove the 1.25X idea wrong but rather to explain why it is wrong.

Several perspectives have been offered on that. I think it’s time to bring the competition to a close and ask OP to score submissions, from most to least helpful.

In the case where you look inside the envelope:

If there is a 50% chance the other envelope is the larger value regardless of what value you see in the envelope, then the expected value of the envelopes is undefined/infinite. There’s no finite value you can put on the expected value of the other envelope, so it should be no surprise that you get garbage if you try to relate that expected value to what you see initially, working with it as if it has some finite value.

In the case where you don’t look inside the envelope:

The claim is that E(Y) = 1.25E(X), but that’s not what you’re actually saying. You’re first fixing X=k and determining the expected value of Y based on that, so you’re actually claiming E(Y|X=k) = 1.25k. That statement in no way implies E(Y) = 1.25E(X); it only appears to if you use non-specific language and don’t fully analyze what you’re trying to say. You can try to claim that E(Y|X=k) = E(Y) and E(X) = k, but those statements are not true universally, and in this instance are generally false. It might be strange that once you fix one envelope the expected value of the other is greater, but that is not in itself contradictory, because it is not possible to reverse it. You can say that E(X|Y=n) = 1.25n, but you’re stuck in the same situation: you have a bunch of different quantities, some of which are related, that you’re claiming are equal when you have no basis for that claim.

The paradox in either case arises from using pseudo-mathematical concepts, and not following through on precisely what you mean. The initial formulation used “X” to mean multiple things; actually following through with the probability calculations shows that two different things called X are actually completely mathematically unrelated quantities.

It’s true that E[Y | X] (i.e., the function f(X) such that f(k) = E[Y | X = k]) is a different entity from E[Y]. However, they’re not unrelated, and, in fact, they’re related enough to allow the very reasoning you’re decrying: E[Y] is simply the expected value of E[Y | X]. So, the fact that E[Y | X] = 1.25X is, in fact, enough to allow the conclusion that E[Y] = E[E[Y | X]] = E[1.25X] = 1.25E[X].

Go ahead, check it out for yourself. Try as you might, you cannot make a probability distribution on X and Y in which E[Y | X = k] = 1.25k, yet E[Y] and 1.25E[X] are distinct.
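One way to check this by hand is with exact fractions on a small, bounded distribution. The sketch below is an illustration, not anything posted in the thread. It assumes a hypothetical prior in which the smaller amount is 1, 2, 4, 8 or 16 with equal probability and you are equally likely to hold either envelope; all function names are made up for the example. In this finite case E[Y | X = k] = 1.25k holds only for the middle values of k, and that is exactly why E[Y] comes out equal to E[X] and not 1.25E[X], while E[Y] = E[E[Y | X]] still holds.

```python
from collections import defaultdict
from fractions import Fraction

def joint_distribution(n=5):
    """Hypothetical bounded prior: smaller amount is 2**i for i < n."""
    joint = defaultdict(Fraction)
    p = Fraction(1, 2 * n)  # n equally likely pairs, 2 ways to hold each
    for i in range(n):
        small = 2 ** i
        joint[(small, 2 * small)] += p  # X is the smaller envelope
        joint[(2 * small, small)] += p  # X is the larger envelope
    return joint

def expectation(joint, f):
    """Exact expected value of f(x, y) under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

def conditional_expectation_of_y(joint):
    """Map each possible k to E[Y | X = k]."""
    mass = defaultdict(Fraction)
    weighted = defaultdict(Fraction)
    for (x, y), p in joint.items():
        mass[x] += p
        weighted[x] += p * y
    return {x: weighted[x] / mass[x] for x in mass}

joint = joint_distribution()
cond = conditional_expectation_of_y(joint)

e_x = expectation(joint, lambda x, y: x)
e_y = expectation(joint, lambda x, y: y)
e_of_cond = expectation(joint, lambda x, y: cond[x])  # E[E[Y | X]]

for k in sorted(cond):
    print(f"E[Y | X = {k}] = {cond[k]}  (1.25k = {Fraction(5, 4) * k})")
print("E[X] =", e_x, " E[Y] =", e_y, " E[E[Y | X]] =", e_of_cond)
```

At the smallest and largest possible values of k the conditional expectation is 2k and k/2 respectively, which is what cancels out the apparent 25% gain in the middle.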

I thought of that, but, since you are playing the same game either way (since the contents of the envelopes don’t change), you’ll find that Y = X. The total money is M=3X before the first switch, and M=3Y after, so 3X = 3Y and Y = X.

Again, you have to assume that the game changes in order for your original analysis to make sense. Either what’s already in the envelope changes, or the total amount of money changes. As long as both are the same, then both envelopes have the same expected value.

Hunh? I think you’re making basically the same point I am. You have to use expected value in the right way if you’re going to use it to deliberate on how to act. You’re pointing out another way in which this is true.

Maybe you were thrown off by my injudicious use of the phrase “greater than one.” I used that because the point of the 1.25<–>X>Y example Indistinguishable used turned not on the fact that the E.V. was 1.25 but just on the fact that it was greater than one.

But what I’d get from an E.V. of 1.25, to phrase myself better than in my previous post, isn’t that I’ll tend to get a value “greater than one” but rather that I’ll tend to get values that over the long term average out to 1.25.

I would not derive, in Indistinguishable’s scenario, that X will tend to be greater than Y. There’d be no reason to think that, as I take it Indistinguishable was illustrating.

Exactly. I fully understand that there’s no logical reason to switch envelopes. The problem I’m asking is why a procedure that to all appearances should produce a correct answer does not.

And maybe it’s me. Maybe I just haven’t learned the math I need to see what’s obvious to others.

Have you looked at this summary post? Do any of those four bullets address the specific aspect of the paradox you are concerned with?

(In case you are unfamiliar with the notation used in that post, E[Y | X] means “The expected value of Y given any particular value for X, in terms of that value of X”, while E[Y] means “The expected value of Y overall”. Thus, for example, in our problem E[Y | X] = 1.25X, while E[Y], which by definition is not given in terms of X, must be given as a constant, whatever the overall expected value of a random envelope is, without reference to the value of the other envelope (in fact, E[Y] is infinite, for reasons explained in that post))

You are defining “X” as the current value in your envelope, and then you are defining the expected value of the other envelope using the term “X” in that definition, which is incorrect.

The expected value of the other envelope has nothing to do with the current value in your envelope, so the term “X” does not belong anywhere in the expected value equation.

Summary:
X=Current value in envelope 1
(M+2M)/2=expected value of both envelope 1 and envelope 2

Here’s a key point:

Thus E[other] = E[this], and not (1.25 E[this]) which even some attempting to refute the paradox have accepted.

Most everyday probability problems, including this one, do not require measure theory, Kolmogorov axioms, summing infinite series, etc. Much probabilistics is simply a craft for solving real-world problems, as I emphasized earlier:

OP’s question is straightforward and can be answered simply.

For the unopened envelope case, the fallacy assumes X’=X’’ even though that defies the facts and common sense.
For the opened envelope case, the fallacy assumes the over/under probabilities are each 50% even though there’s no reason to assume that.
Simple? These correct approaches emerge the more you apply simple common-sense, and the less abstract math you try to apply.

Indistinguishable changes simple math to complicated math. I seek to reduce simple math to common-sense. The cited post considers infinite series, which would be irrelevant if OP had taken the precaution to add “Assume neither envelope contains more banknotes than all the banknotes ever printed on Earth.”

Again, there are different notions of expected value which make sense. There’s E[the other envelope] (aka, E[Y]), which indeed should not have an X anywhere in it, and then there’s E[the other envelope conditioned on the value of this envelope being X] (aka, E[Y | X]), which should have an X in it. It’s not incoherent to consider the latter expectation.

Yes, but surely, the OP didn’t intend to add that precaution because it’s not part of the mathematical problem they are interested in considering, the one which is apparently paradoxical. To resolve the problem by stipulating such finiteness conditions is to not actually deal with the paradox, but to run off and think about some other, nicer situation. Similarly, to a lesser extent, for supposing the over/under probabilities are not 50/50; clearly, implicit in the OP is the stipulation that they are, a stipulation that should be respected.

For unopened-envelope case, probabilities are clearly 50/50 by symmetry, not stipulation.

For opened-envelope case, no general 50/50 assumption is plausible; any stipulation would be counterfactual.
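To make the opened-envelope point concrete, here is a rough sketch that computes the chance the other envelope is larger, given what you see, under one bounded prior. The prior is an assumption chosen purely for illustration (smaller amount 1, 2, 4 or 8, equally likely, and you open either envelope with probability 1/2), and the function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def prob_other_is_larger(n=4):
    """P(other envelope is larger | seen amount) for a bounded prior."""
    seen = defaultdict(Fraction)    # probability of seeing each amount
    larger = defaultdict(Fraction)  # ...and the other envelope being larger
    p = Fraction(1, 2 * n)
    for i in range(n):
        small, large = 2 ** i, 2 ** (i + 1)
        # You opened the smaller envelope, so the other one is larger.
        seen[small] += p
        larger[small] += p
        # You opened the larger envelope, so the other one is smaller.
        seen[large] += p
    return {k: larger[k] / seen[k] for k in sorted(seen)}

odds = prob_other_is_larger()
for amount, prob in odds.items():
    print(f"saw {amount}: P(other is larger) = {prob}")
```

Under this assumed prior the probability is 1 for the smallest amount, 0 for the largest, and 1/2 only in between, so the blanket 50/50 assumption fails at both ends of any bounded range.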

Why do you say this? There are certainly mathematical models which would allow the 50/50 assumption.

Infinite models. I think OP would be content with the assumption that no envelope contains less than a half-farthing nor more than all the gold on Planet Earth.

I think that’s fighting the hypothetical. Of course if there’s always a (50%) chance of a higher envelope, the envelopes go arbitrarily high; I doubt anyone feels the need to retract the paradox because of this. Surely everyone who poses the paradox is perfectly happy in so doing to take the range of envelope possibilities to be infinite; if not, they’d have posed a different problem, or no problem at all.

Which isn’t to say that the paradox isn’t fruitfully studied by thinking about its relation with these finite analogues; just that it isn’t, I think, helpfully resolved by barring its intrinsic infinity by fiat.

My intuition says that E[Y|X] would provide no valuable information for the problem at hand, and would require adding the very same variables/constants (M, 2M) that exist in E[Y], so I don’t see how it can “make sense” in this particular case. If it yields something other than the intuitively obvious notion that envelope 2 contains the value that envelope 1 does not, then tell me what that info is, I am prepared to be educated.