Can somebody explain the two envelope paradox to me?

Oh… I may have misinterpreted why you picked the .5s, then.

So would the scenario where it is known that one envelope contains $1 and one envelope contains $2 be an accurate model of what you were getting at, then? That is, is the scenario in your mind possibly one where, upon opening one envelope, you instantly know whether the other is larger or smaller?

If so, I sheepishly admit I misinterpreted what paradox you were interested in discussing and I will not mention the infinitely many possibilities scenario anymore, instead restricting discussion to the 2 possibilities scenario.

In that case, everyone is right to chastise me for being off-base. I assumed you were actually talking about another well-known problem, also called the two envelope paradox, in which the contents of the envelopes could be arbitrary quantities, and you would not know, even after opening one envelope, whether the other was larger or smaller.

You can do lots of calculations.

The only problem is that you thought .5(2X)+.5(X/2) represented something it doesn’t represent. It’s a formula that represents something, just not what you were looking for.

Your fundamental mistake in the calculation E[Y] = 0.5 * 2X + 0.5 * X/2 = 1.25 * X is this:

The formula for expected value of Y is “The probability of case 1 * the expected value of Y over all case 1 scenarios + the probability of case 2 * the expected value of Y over all case 2 scenarios”. The actual values of X and Y have NOTHING to do with the expected value. Zip. Nada. That’s why I wanted to rephrase discussion in terms of “averages” to make this clearer. When you take an average of a bunch of data points, there isn’t any one real data point; just a lot of different things you’re looking to add up. The class average for a test has nothing to do with which student calculates it, even though only one of the test scores is the “real” score for any particular student.

So what?

So when you write “X” in your calculations, you’re referring to the ACTUAL value of X. But the ACTUAL value of X has nothing to do with the expected value.

The expected value of Y is (the probability that Y = 2X) * (the expected value of Y over all possibilities where Y = 2X) + (the probability that Y = X/2) * (the expected value of Y over all scenarios where Y = X/2) = 0.5 * (the expected value of Y over all possibilities where Y = 2X) + 0.5 * (the expected value of Y over all possibilities where Y = X/2) = 0.5 * (the expected value of 2X over all possibilities where Y = 2X) + 0.5 * (the expected value of X/2 over all possibilities where Y = X/2).

If you want to turn this into 0.5 * 2X + 0.5 * X/2, you have to replace “the expected value of 2X over all possibilities where Y = 2X” with just “2X”, and similarly for the X/2 one. But you can’t do this! “The expected value of 2X over all possibilities where Y = 2X” has nothing to do with the ACTUAL value of 2X, so you can’t replace the one with the other. They aren’t equal.

That’s the fundamental error. Where you write “2X” in your calculation, you should be writing “The expected value of 2X over all possibilities where Y = 2X”, which is a different number. And similarly for where you write “X/2” in your calculation.

In other words, “the expected value of X” and X generally aren’t equal, and your mistake is in using the latter term in your calculations in places where the former belongs. “X” refers to the actual value of X, which is a different beast from the expected value of X.

E[Y] = 0.5 * E[2X | Y = 2X] + 0.5 * E[X/2 | Y = X/2] is correct.

E[Y] = 0.5 * 2X + 0.5 * X/2 is incorrect.
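To make the difference between those two lines concrete, here is a small sketch for the case where the envelopes are known to hold $1 and $2. The names `OUTCOMES`, `expected_y`, and `naive_y` are made up for illustration; the only assumption is that each of the two assignments is equally likely.

```python
from fractions import Fraction

# The two equally likely outcomes (X, Y) when the envelopes hold $1 and $2.
# X is the envelope you hold, Y is the other one.
OUTCOMES = [(Fraction(1), Fraction(2)), (Fraction(2), Fraction(1))]


def mean(values):
    """Plain average of a non-empty list of Fractions."""
    return sum(values) / len(values)


def expected_y(outcomes):
    """E[Y] split by case: P(Y=2X) * E[2X | Y=2X] + P(Y=X/2) * E[X/2 | Y=X/2]."""
    doubled = [2 * x for x, y in outcomes if y == 2 * x]
    halved = [x / 2 for x, y in outcomes if y == x / 2]
    p_doubled = Fraction(len(doubled), len(outcomes))
    p_halved = Fraction(len(halved), len(outcomes))
    return p_doubled * mean(doubled) + p_halved * mean(halved)


def naive_y(x):
    """The incorrect shortcut: plug one actual value of X into both cases."""
    return Fraction(1, 2) * 2 * x + Fraction(1, 2) * x / 2


print(expected_y(OUTCOMES))   # 3/2
print(naive_y(Fraction(1)))   # 5/4
print(naive_y(Fraction(2)))   # 5/2
```

The case-by-case version gives $1.50 no matter which envelope you happen to hold, while the shortcut gives a different answer depending on the actual value of X you plug in, and neither of those answers is $1.50.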

Let’s look at the two possible outcomes if the envelopes contain $1 and $2.

X = $1 and Y = $2. You swap your choice of envelopes, you gain $1, which is equal to the value of X. In this case Y = 2X.

X = $2 and Y = $1. You swap your choice of envelopes, you lose $1, which is equal to the value of X/2. In this case Y = X/2.

X is a random variable. It doesn’t have “a value”; it has a range of possible values. You can’t “use” X like a normal variable; you have to take into account that it has one of multiple values. This is why you have a paradox: you are using X when you should be using X = $1 or X = $2.

Note that what is true in both situations is that X + Y = $3, which simply results in E(X) + E(Y) = $3. You could also express it as a ratio, Y/X = ($3 - X)/X. When X = $1 or $2 it gives the same Y values as above, but it is very hard to find the expected value of Y directly from that sort of expression.

I said right in the title line that I needed somebody to explain this to me. It’s not like I’m claiming knowledge I don’t have.

I asked this earlier. Is the problem I have that expected value is not the appropriate method to address this situation?

Expected value is perfectly fine. It’s just that the expected value of Y is 0.5 * (the expected value of 2X over cases where Y = 2X) + 0.5 * (the expected value of X/2 over cases where Y = X/2). This isn’t the same thing as 0.5 * 2X + 0.5 * X/2. You can’t replace “the expected value of 2X over cases where Y = 2X” with 2X, and you can’t replace “the expected value of X/2 over cases where Y = X/2” with X/2.

You misunderstood the formula for expected value; that’s ok, it’s rather subtle in this instance. It’s not “Expected value of Y = probability of case 1 * (any expression equal to Y in case 1) + probability of case 2 * (any expression equal to Y in case 2)”.

Instead, it’s “Expected value of Y = probability of case 1 * (expected value over case 1 of any expression equal to Y in case 1) + probability of case 2 * (expected value over case 2 of any expression equal to Y in case 2)”.

Sure I understand that.

But my question has been what happens if you don’t start out with the assumption you started out with. What happens if you started out with the assumption I started out with.

Let’s look at the two possible outcomes, if your first envelope contains $1 and is the smaller one or your first envelope contains $1 and is the larger one.

X = $1 and Y = $2. You swap your choice of envelopes, you gain $1, which is equal to the value of X. In this case Y = 2X.

X = $1 and Y = 50c. You swap your choice of envelopes, you lose 50c, which is equal to the value of X/2. In this case Y = X/2.

Is there something inherently wrong at this point with this line of reasoning? I’m not asking if other lines of reasoning are possible - I’m asking if this particular line of reasoning is flawed and if so where the flaw is.

Okay, maybe it’s just late. But what is the difference between 0.5 * (2X) and 0.5 * (the expected value of 2X over cases where Y = 2X)?

Nothing in this is wrong so far. But you haven’t said anything about expected values yet.

Don’t worry; it’s a subtle point and it’s totally natural that the nuances here are tricky.

The difference is that the former is an expression that depends on the actual value of X and the latter is not.

Suppose there are a bunch of students in some class, and they each take two tests; their score on the first is called their X score and their score on the second is called their Y score. As it happens, every student did twice as well on one test as on the other.

Suppose I’m a student in that class, and I got a 40 on my first test. That is, so far as I’m concerned, X = 40. But that’s just my one particular score; it tells me nothing about the overall distribution of X scores in the class. If I want to calculate information about the rest of the class (for example, the average second test score of all the students who got 40s on their first test), I had better not mistakenly substitute references to my one particular test score in places where information about the overall class’s distribution belongs.

In this case, X = 40, but “the expected value of 2X over cases where Y = 2X” is the average second test score of those students who did better on the second test, which is probably some completely different number. That’s the difference.

Because they are different situations. In one of them X + Y = $3, and the expected value of either envelope is $1.50. If you knew X = $1, then you know Y = $2 and the expected value of Y is $2, or 2X. In the other, the total is $1.50, and the expected value of either envelope is $0.75. If you knew X = $1, then you know Y = $0.50 and the expected value of Y is $0.50, or X/2.

Er, I should say:

In this case, 2X = 80, but “the expected value of 2X over cases where Y = 2X” is the average doubled first test score (equivalently, their second test score) for those students who did better on the second test, which is probably some completely different number. That’s the difference.
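Here is the class example as a rough sketch. The score list is entirely invented for illustration (it is not data from anywhere); the only property it is meant to have is that every student scored twice as well on one test as on the other. `SCORES` and `avg_2x_given_improved` are hypothetical names.

```python
from fractions import Fraction

# Hypothetical (X, Y) pairs: first-test score, second-test score.
# Every student scored twice as well on one test as on the other.
SCORES = [(40, 80), (40, 20), (30, 60), (50, 25), (10, 20), (90, 45)]


def average(values):
    """Exact average of a non-empty list of integers."""
    return Fraction(sum(values), len(values))


def avg_2x_given_improved(scores):
    """Average of 2X over only those students for whom Y = 2X."""
    return average([2 * x for x, y in scores if y == 2 * x])


my_2x = 2 * 40  # my own doubled first-test score
print(my_2x)                          # 80
print(avg_2x_given_improved(SCORES))  # 160/3
```

With these made-up scores, my own 2X is 80, but the class-wide average of 2X among the students who improved is about 53.3. Knowing one number tells you nothing about the other.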

The point is, to know what “X” is, you have to know, well, the actual value of X. To know what “the expected value of X over cases such-and-such” is, you don’t have to know what X means; there doesn’t even have to be any particular notion of one “actual” value of X. What’s more, knowing one doesn’t tell you the other; knowing the actual value of X doesn’t tell you anything about the average of various possible values of X, nor vice versa. That makes the expressions “X” and “the expected value of X over cases such-and-such” very different.

It’s again unclear to me exactly which paradox the OP wants to discuss. That’s alright; maybe it’s just not a clear-cut situation of wanting to discuss one thing or another. But I’ll try to get it clear.

Do you want to discuss:

  1. There are two envelopes, one containing twice the money of the other. I pick one at random and I see $1 in it. I think “Ok, the other envelope can have $2 or $0.50, each equally likely. So the other envelope has an average value of $1.25. I should switch over”. But no matter what value I saw in the first envelope, I could have carried out the same reasoning and decided to switch over. Which appears to indicate I should keep switching envelopes back and forth, without opening them. Which is nonsense; what happened?!

Or do you want to discuss:

  2. There are two envelopes, one containing twice the money of the other. Alice and Bob pick envelopes from the two at random. Alice’s envelope is just as likely to be 2 * Bob’s as it is to be 1/2 * Bob’s. Accordingly, the average value of Alice’s envelope is 1.25 * Bob’s. But this means the average value of Alice’s envelope is higher than Bob’s envelope, which appears to indicate Alice is magically a better envelope-picker than Bob. Which is nonsense; what’s wrong with this reasoning?!

Or do you want to discuss both of these or something else?

The difference is that 1) above assumes that, no matter what, even after learning the value of one envelope, the other is equally likely to be twice or half that. However, 2) doesn’t depend on that [it would work even if it was known that the bigger envelope had 100 bucks and the smaller had 50 bucks], and is actually a different, simpler paradox. I spent most of this thread talking about 1); my last few posts were about 2). But I’m still not really sure which the thread is supposed to be about.
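For the second paradox, a quick sketch with the 100 and 50 bucks case may help. It assumes only that Alice is equally likely to get either envelope; `ASSIGNMENTS` and `summarize` are names invented here. It computes three averages: Alice’s amount, Bob’s amount, and the ratio of Alice’s amount to Bob’s.

```python
from fractions import Fraction

# Two equally likely assignments (alice, bob) of the $100 and $50 envelopes.
ASSIGNMENTS = [(Fraction(100), Fraction(50)), (Fraction(50), Fraction(100))]


def mean(values):
    """Plain average of a non-empty list of Fractions."""
    return sum(values) / len(values)


def summarize(assignments):
    """Average amounts for Alice and Bob, and the average of Alice/Bob."""
    return {
        "mean_alice": mean([a for a, _ in assignments]),
        "mean_bob": mean([b for _, b in assignments]),
        "mean_ratio": mean([a / b for a, b in assignments]),
    }


result = summarize(ASSIGNMENTS)
print(result["mean_alice"])  # 75
print(result["mean_bob"])    # 75
print(result["mean_ratio"])  # 5/4
```

The average of the ratio really is 1.25, but the two average amounts are identical. An average ratio of 1.25 does not mean Alice’s average amount is 1.25 times Bob’s average amount.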

Here is the summary again:

  1. Yes, you can use expected value
  2. Here is the expected value formula for both envelopes: (M+2M)/2
  3. Here is a formula that is NOT the expected value for either envelope: (2X+.5X)/2
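As a rough check of point 2 in the summary, here is a sketch that writes out (M+2M)/2 for both keeping and switching. M is the smaller amount; `keep_and_switch` is a hypothetical helper, and the only assumption is that you are equally likely to be holding either envelope.

```python
from fractions import Fraction


def keep_and_switch(m):
    """Return (E[keep], E[switch]) when the envelopes hold m and 2m.

    You hold the smaller envelope with probability 1/2 and the larger one
    with probability 1/2, so both averages run over the same two amounts.
    """
    half = Fraction(1, 2)
    keep = half * m + half * (2 * m)      # you hold m, or you hold 2m
    switch = half * (2 * m) + half * m    # the other one holds 2m, or m
    return keep, switch


keep, switch = keep_and_switch(Fraction(1))
print(keep, switch)  # 3/2 3/2
```

Both come out to 1.5 * M, so on this accounting there is no advantage to switching.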

If you are still wondering why (2X+.5X)/2 is not the expected value formula for the problem you described in the OP, then I would ask you the following questions:

  1. During this one single run of the game, are the values 2X and .5X both going to appear in one of the 2 envelopes?
  2. If those 2 values are not both going to appear in the envelopes during this one single run of your game, then how can we possibly include both of those numbers in an average?

Er, well, it depends on what you’re averaging, doesn’t it? It’s very common to take an average including numbers which do not appear together in a single run of a game; e.g., the average value of a die roll is (1 + 2 + 3 + 4 + 5 + 6)/6, even though only one of these appears in any single roll.
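The die average can be written out as a probability-weighted sum, which is the same shape as the expected value formulas discussed earlier in the thread. This is just a sketch assuming a fair six-sided die; `FACES` and `die_average` are illustrative names.

```python
from fractions import Fraction

FACES = [1, 2, 3, 4, 5, 6]


def die_average(faces):
    """Probability-weighted average of a fair die with the given faces."""
    p = Fraction(1, len(faces))  # each face is equally likely
    return sum(p * face for face in faces)


print(die_average(FACES))  # 7/2
```

The result, 3.5, is not even a value that can appear on a single roll, which is fine for an average.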

You seem to be comfortable calculating the average of the two possible values of the other envelope as (2X+.5X)/2=1.25X, but I think this is incorrect and here is why:

  1. The OP is clearly stating this is 1 single trial of a game, so we are calculating an average based on 2 and only 2 discrete values
  2. There are 2 different amounts, 2 unknowns and therefore 2 variables involved: X and Y
  3. The average of the 2 possible values for both envelope 1 and envelope 2 is clearly (X+Y)/2
  4. (X+Y) is clearly not equal to (2X+.5X)
  5. Therefore you cannot substitute (2X+.5X) for (X+Y) when calculating the average

We are talking about averaging the POSSIBLE values.

In your game 6 values are possible, so they should all be included.

In the OP’s game there are 2 values possible, only those 2 should be included.

Do you disagree with this?

Forget I posted this, I’m not sure what you are averaging or what the rules are of your game with the dice, so I don’t know if you should include them all or not.