If I were to give myself a sandwich, and you a piece of wood, you would have a piece of wood, and I would have a sandwich. That has nothing to do with the situation described in the “paradox”.
Then you don’t understand what paradox Little Nemo is referring to.
Good, let’s discuss the unbounded case. But first, let me discuss a somewhat simpler example illustrative of the same phenomenon.
Let Z be a random integer; it’s just picked at random from all the integers, each considered equally likely to be picked.
Is the expected value of Z positive, zero, or negative?
Well, the expected value of Z just means its average value. And the average of a bunch of (equiprobable) values is just the sum of all the values divided by the number of values. The denominator there will always be positive, so all we’re really looking for is the numerator; is the sum of all the integers positive, zero, or negative?
Well, let’s line those bad boys up and add them.
One way we could do it is this:
     1 +  2 +  3 +  4 +  5 +  6 + ...
0 + -1 + -2 + -3 + -4 + -5 + -6 + ...
-------------------------------------
0 +  0 +  0 +  0 +  0 +  0 +  0 + ...
If we add down each column, we get 0, and then we sum the results from each column up to get a total of 0. It looks like the integers add up to 0. So the average integer is 0.
Great. Only, we could instead have added up the integers with a slightly different arrangement:
1 +  2 +  3 +  4 +  5 +  6 + ...
0 + -1 + -2 + -3 + -4 + -5 + ...
-------------------------------------
1 +  1 +  1 +  1 +  1 +  1 + ...
Now if we add down each column, we get a whole bunch of positive values. Adding the results from all the columns, it comes out to a positive total. So now it looks like the average integer is positive.
Or we could use this arrangement:
 0 +  1 +  2 +  3 +  4 +  5 + ...
-1 + -2 + -3 + -4 + -5 + -6 + ...
-------------------------------------
-1 + -1 + -1 + -1 + -1 + -1 + ...
Now if we add down each column, we get a whole bunch of negative values. Adding the results from all the columns, it looks like we get a negative total. So apparently the average integer is negative.
So there’s this tremendous ambiguity lurking in the question as to whether the average integer is bigger than 0, less than 0, or equal to 0, with the calculations coming out differently depending how you arrange them.
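For anyone who would rather check those three arrangements numerically, here is a minimal Python sketch. The function name running_totals and the lambdas are my own illustrative choices, not anything from the thread. It adds down each column and keeps a running total across the columns, so you can watch the three totals sit at 0, climb, or sink depending only on how the same integers are paired up.

```python
def running_totals(column, n_columns):
    """Return the running total after each of the first n_columns columns."""
    total, totals = 0, []
    for k in range(n_columns):
        total += column(k)
        totals.append(total)
    return totals

# Arrangement 1: the lone 0 comes first, then k is paired with -k.
first = running_totals(lambda k: 0 if k == 0 else k + (-k), 6)

# Arrangement 2: k + 1 is paired with -k, so every column sums to +1.
second = running_totals(lambda k: (k + 1) + (-k), 6)

# Arrangement 3: k is paired with -(k + 1), so every column sums to -1.
third = running_totals(lambda k: k + (-(k + 1)), 6)

print(first)   # [0, 0, 0, 0, 0, 0]
print(second)  # [1, 2, 3, 4, 5, 6]
print(third)   # [-1, -2, -3, -4, -5, -6]
```

Only the first six columns are shown; taking more columns just continues each pattern.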
Does that make sense so far? If so, next, I’ll show how this same phenomenon is the one underlying the unbounded two-envelope paradox.
Absolutely not according to the OP.
Little Nemo does not have knowledge about the contents of the first envelope’s specific value.
Well, then they’re looking at the trickier paradox I’m outlining, in which the expected value of the second envelope conditioned upon any particular value for the first envelope would be larger than that particular value, whence it appears to follow that the expected value overall of the second envelope is some positive value plus the expected value of the first envelope, which seems crazy. And I’ll explain why it’s not so crazy. But Little Nemo’s expected value calculations aren’t the problem. Little Nemo’s expected value calculations are just fine. Are you naysayers familiar with conditional expectation (conditioned on a random variable rather than an event)?
Look, the phrasing in terms of what people know and so on is throwing you guys. It’s not fundamentally a question about knowledge; it’s a question about averages. Out of all the 2d points whose X and Y coordinates are, let’s say, adjacent powers of 2, is the average Y coordinate greater than the average X coordinate? On the one hand, by the symmetry of the situation, it seems like it shouldn’t be; on the other hand, by setting up the calculations in a certain way, we find that for any vertical line, the average Y coordinate of the two points on it is 1.25 times their shared X coordinate; averaged over all lines, by linearity of expectations, this demonstrates that the average Y coordinate overall is 1.25 times the average X coordinate overall, despite the symmetry of the situation. What’s going on?
That’s the paradox. It has nothing to do with what people know when. Talk of “expected value” is just talk of averages. The question is a question about averages. The specific averages being looked at are exactly as in the paragraph above.
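To make the two competing calculations concrete, here is a rough Python sketch on a finite chunk of that point set. The cutoff max_exp = 10 and all the names are assumptions of mine for illustration; the paradox itself concerns the unbounded set. On every vertical line that still has both of its points, the average Y is 1.25 times X, and yet over the whole (symmetric) chunk the average X and average Y are equal, because the two boundary lines have only one point each.

```python
from collections import defaultdict
from fractions import Fraction

def adjacent_power_points(max_exp):
    """All points (2^k, 2^(k+1)) and (2^(k+1), 2^k) for k = 0 .. max_exp - 1."""
    points = []
    for k in range(max_exp):
        low, high = 2 ** k, 2 ** (k + 1)
        points += [(low, high), (high, low)]
    return points

points = adjacent_power_points(10)

# Group the Y coordinates by their shared X coordinate (one vertical line each).
ys_on_line = defaultdict(list)
for x, y in points:
    ys_on_line[x].append(y)

# Ratio (average Y) / X on each line that has both of its points.
line_ratios = {
    x: Fraction(sum(ys), len(ys)) / x
    for x, ys in ys_on_line.items()
    if len(ys) == 2
}

mean_x = Fraction(sum(x for x, _ in points), len(points))
mean_y = Fraction(sum(y for _, y in points), len(points))

print(set(line_ratios.values()))  # {Fraction(5, 4)}
print(mean_x == mean_y)           # True
```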
To find E(Y | X = c) you need P(Y | X = c). You can’t say that P(Y = a | X = c) = P(Y = b|X = c) without knowing about P(Y = a), P(Y = b) and P(X = c). This is Bayes’ Theorem. Little Nemo probably doesn’t know about it. If you had some knowledge of P(Y = a), you have some idea about the distribution of how much was put in both envelopes. The “paradox” is to assume that every possible value has equal probability. It might seem the right thing to assume, but no possible distribution has it. Once you put a specific distribution on what was put in you know about P(Y) and P(X).
Maybe you could put in an upper bound of h, then P(Y = 2X | X = h) =/= P(Y = X/2 | X = h). If you knew that there couldn’t be more than $100,000 in an envelope, and you knew your envelope had $100,000, the expected value of the other envelope would be $50,000. If you knew your envelope had $50,001, the expected value of the other would be $25,000.5. For roughly half of all values for X, P(Y = 2X | X = c) is 0.
Instead of putting an upper bound you might put in something like an exponential distribution, but then again P(Y = 2X| X = c) =/= P(Y = X/2| X = c) is extremely likely.
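Here is a small Python sketch of that Bayes calculation, under assumptions of my own: the smaller amount is drawn from a prior, the other envelope gets double, and you pick one of the two at random. The two priors (uniform on whole dollars 1 to 50, so no envelope exceeds 100, and a geometric one on powers of 2) and every function name are hypothetical illustrations, not anything specified in the thread.

```python
from fractions import Fraction

def prob_other_is_double(c, prior):
    """P(Y = 2X | X = c), where prior(m) is the probability that the
    smaller amount is m and either envelope is equally likely to be picked."""
    as_smaller = prior(Fraction(c))      # c is the smaller amount, so Y = 2c
    as_larger = prior(Fraction(c) / 2)   # c is the larger amount, so Y = c/2
    total = as_smaller + as_larger
    if total == 0:
        raise ValueError("X = c cannot occur under this prior")
    return as_smaller / total

def uniform_prior(m, cap=50):
    """Assumed prior: smaller amount uniform on the whole numbers 1 .. cap."""
    if m.denominator == 1 and 1 <= m <= cap:
        return Fraction(1, cap)
    return Fraction(0)

def geometric_prior(m):
    """Assumed prior: smaller amount is 2^k with probability (1/3)(2/3)^k."""
    if m.denominator != 1 or m < 1:
        return Fraction(0)
    n = m.numerator
    if n & (n - 1):                      # not a power of two
        return Fraction(0)
    k = n.bit_length() - 1
    return Fraction(1, 3) * Fraction(2, 3) ** k

print(prob_other_is_double(10, uniform_prior))    # 1/2
print(prob_other_is_double(100, uniform_prior))   # 0
print(prob_other_is_double(7, uniform_prior))     # 1
print(prob_other_is_double(8, geometric_prior))   # 2/5
```

With the cap, the 50/50 split only holds for some values of c; with the geometric prior it fails for every c above 1.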
Now, you are trying to be “correct” by talking about the ratio. Yes, if you were to take two numbers X and Y and find X/Y, and sum it over many trials, you’d get a number higher than E(X) or E(Y) if P(X>Y)=P(Y>X) (or even in other cases, there’s plenty of room to set things). It’s a side issue, with the main issue that Little Nemo doesn’t know about Bayes’ Theorem.
Then explain how it is different because it’s rapidly approaching the point of being word for word identical.
You never have 2X in the first envelope. X was defined as the contents of the first envelope. There is no uncertainty about what the content of the first envelope is. The uncertainty is what the content of the second envelope is.
And the amount was already placed in the envelope (presumably by Erwin Schrödinger). But it’s still an uncertainty to you because you haven’t opened it. It’s like flipping a coin in a dark room. It’s landed but you can’t see what it is. So if somebody asked you what it might be, you could meaningfully say that there’s a 50% chance that it’s heads and a 50% chance that it’s tails - even though it already has been flipped.
Er, that bolded word should read “positive”, of course.
So you were destined, no matter which envelope you chose, to get the same value? If X = 1, if you had picked the other envelope, would X also be = 1?
Let’s say tails = 0 and heads = 1. U is the value of the coin currently showing. You don’t know this. If the coin is flipped again, V is the new value. If U = 0, then V = U + 1 with 50% chance, correct? Then if you flip the coin again, U, which was 1 or 0, results in a V of 2 or 1 or 0? Correct?
I almost certainly don’t know as much about Bayes’ theorem as you do. But I don’t think you’ve got enough information to apply Bayes’ theorem in the situation I described.
To reiterate, here’s what you know.
There are two envelopes, both containing money. One contains twice as much money as the other does.
You choose one envelope at random and open it. It contains ten dollars.
That’s all the information you have. You don’t know how much money there is in total.
But based on this information, can you determine the possible amounts of money that might be in the sealed envelope?
And if you are able to compose a complete list of the possible amounts can you determine what are the chances of any amount being the actual amount?
No. Where did you possibly come up with that?
This is not your OP.
In your OP you did not open the first envelope.
In your OP “X”=M OR 2M
“X” does not have 1 distinct value.
How do you compose that list of values that could appear in the other envelope in this one trial of the game?
At best you can word it one of two ways:
The two possible values are X and Y
The two possible values are M and 2M
If you choose to write that Y can be 2X or .5X you need to realize you did not just list all of the values that Y can actually be.
Instead, you have created a set of values with the following properties:
The set contains the value Y for sure
The set also contains a value that is not even allowed in this game, for sure, 100% probability
So you created a formula that includes 1 good value and 1 bad value for this one trial of the game, but what can you do with a formula that has these properties? Not much during this one trial of the game.
In my OP I used a scenario where you didn’t open the envelope. This seemed to lead to some confusion, so I abandoned that point as it wasn’t really central to the paradox.
But you are incorrect about the rest. I specifically said in the OP that “Call the sum of money in your envelope X. The sum of money in the other envelope is therefore either 2X or X/2. There’s an equal probability that the other envelope will have either a higher or lower amount than your envelope.” There was never any time when I said the content of the first envelope had more than one amount. X was always one number. It was never “M or 2M” - that’s an example of the confusion I spoke of.
No, the answer is very simple. If the contents of the first envelope are ten dollars, then the contents of the second envelope can only be five dollars or twenty dollars. Forget about probabilities - this is elementary algebra.
People who aren’t Little Nemo,
Imagine a game, which isn’t Little Nemo’s game, which works like this: A) First, one envelope is stuffed with money according to some random process or another, B) Then, a die is rolled. Based on the result of the die roll, the second envelope is filled with either 1, 2, 3, 4, 5, or 6 times the contents of the first envelope.
If I tell you that the first envelope is, on average, filled with $12, can you tell me how much the second envelope is, on average, filled with?
Please show your work (either a calculation of the expected value of the second envelope, or a proof that you cannot deduce this from the information provided).
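One way the work might be shown, as a hedged Python sketch: assuming the die is fair and rolled independently of whatever went into the first envelope (the post does not say either explicitly), the average of the second envelope is the average multiplier times the average of the first. The concrete distribution first_dist below is purely a made-up example with mean $12, included only to confirm the shortcut by brute-force enumeration.

```python
from fractions import Fraction

mean_first = 12                       # given: first envelope averages $12
multipliers = range(1, 7)             # faces of the die

# Average multiplier for a fair die: (1 + 2 + ... + 6) / 6 = 7/2.
mean_multiplier = Fraction(sum(multipliers), len(multipliers))

# For each first-envelope value x, the second averages (7/2) * x;
# averaging that over x gives (7/2) * (average of x).
mean_second = mean_multiplier * mean_first

# Cross-check by enumerating every outcome for one made-up distribution
# of the first envelope whose mean is 12.
first_dist = {4: Fraction(1, 2), 20: Fraction(1, 2)}
mean_second_enumerated = sum(
    p * Fraction(1, 6) * m * x
    for x, p in first_dist.items()
    for m in multipliers
)

print(mean_second)             # 42
print(mean_second_enumerated)  # 42
```

Note that the answer never needed the actual distribution of the first envelope, only its average.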
People who are Little Nemo,
Have you had a chance to look at this post yet (concerning the ambiguity of the average value of a uniformly random integer, and how different calculations of this expected value give inconsistent results)? If so, what’d you think so far? If you feel it’s unhelpful or confusing, let me know. Otherwise, if you are happy to proceed to discussing how that phenomenon relates to the paradox you are rightly curious about, I will gladly proceed.
“No” what?
Tell me if you agree or disagree with the following statements about the set of values (2X,.5X) that you used to replace “Y”:
The set of values (2X,.5X) contains the value Y for sure
The set of values (2X,.5X) also contains a value that is not even allowed in this game
Raftpeople, you are aware of what “expected value of Y (conditioned on information K)” means, right?
It means “average of all the Y values over all possible outcomes consistent with information K”.
There’s nothing in the definition of expected value that says “Don’t look at any outcomes other than the current, actual situation”. Probability isn’t about the actual situation, it doesn’t care what the actual situation is (beyond the specified information K to condition on), it has nothing to do with the actual situation. Probability is about taking averages over a distribution of outcomes. And in the distribution the OP is specifying, there are two possible outcomes where X = k: one where Y = 2k and one where Y = 0.5k. Yes, of course only one of these is the actual outcome in any given run of the game. That has nothing to do with what expected value is, though. Expected value is the average over all outcomes matching whatever condition, even though those outcomes can’t all happen simultaneously.
If I roll a die and hide it behind my back, and then ask you what your expected value for the die is, it’s not ridiculous of you to say “Well, it’s the average of 1, 2, 3, 4, 5, and 6”, even though you know it can’t be all of those, that 5 of those are “bad” values. Because “expected value” is not meant to be the actual value; using information about what the actual value is, beyond the information you’re given to condition on, will generally lead to a very poor approximation of the expected value. They don’t mean the same thing; they mean rather different things.
So, again, this is why I propose rephrasing the conversation purely in terms of averages and not even mentioning probability. We have the following paradoxical situation:
Consider all the points on the two lines Y = 2X and X = 2Y (with positive coordinates). Is their average Y coordinate larger than their average X coordinate?
Well, along any vertical cross-section, the two points on that cross-section do have an average Y coordinate which is 1.25 times as large as their shared X coordinate. Which (correctly) implies that overall, the average Y coordinate is 1.25 times as large as the average X coordinate, despite the symmetry of the situation! And the average X coordinate is also 1.25 times as large as the average Y coordinate!
That’s the paradoxical result. It doesn’t matter whether you pick a particular point out of those lines to call “the actual point”; it doesn’t matter that no point actually has two different Y coordinates simultaneously. That’s not what the problem is about. The problem is about averages, and averages naturally involve values taken from many points, not just one.
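As a rough numerical illustration of how both "1.25 times" results can come out of the same two lines, here is a Python sketch. It samples the lines at whole-number coordinates and cuts them off at a finite bound; the sampling, the bound of 100, and the function names are all assumptions of mine. Cutting off by X makes the average Y come out 1.25 times the average X, while cutting off by Y makes the average X come out 1.25 times the average Y.

```python
from fractions import Fraction

def averages(points):
    """Return (average X, average Y) over a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def window_by_x(bound):
    """Both points on each vertical line X = 1, 2, ..., bound."""
    points = []
    for x in range(1, bound + 1):
        points.append((Fraction(x), Fraction(2 * x)))  # on the line Y = 2X
        points.append((Fraction(x), Fraction(x, 2)))   # on the line X = 2Y
    return points

def window_by_y(bound):
    """Mirror image: both points on each horizontal line Y = 1, ..., bound."""
    return [(y, x) for x, y in window_by_x(bound)]

avg_x, avg_y = averages(window_by_x(100))
print(avg_y / avg_x)   # 5/4

avg_x, avg_y = averages(window_by_y(100))
print(avg_x / avg_y)   # 5/4
```

Which answer you get depends entirely on how the finite window is chosen, much like the different arrangements for summing the integers.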