Name of this probability paradox?

The subtle deception is in using a particular number in your calculation, when you should be using c, the cupcake that you will tell me you ate. The probability that you will tell me c, given U, is still 1 under the conditions of the problem. The probability that c = 96 is the same in both cases (A or U), 1 in 100, so there’s no help there.

Okay, I made a bit of a mistake trying to distinguish between the act of ‘telling’ and the act of ‘eating’. Where it should lie is in the act of choosing. The problem isn’t different if you walk into the room holding a cupcake, eat it, and tell me that you either just ate the last cupcake or only the one you entered the room with. Do you think it makes a difference if you tell me the number in this case?

Again, this is the same as the “two children” paradox. It’s not just that I know “one of the children is a boy”; it matters how I know that. Likewise, it’s not just that I know “you ate cupcake #96”; it matters how I know that.

First the alternative computation:

Ignoring the visible numbers on top of the cupcakes, there are 200 equally probable outcomes: 100 where you ate one cupcake, and 100 where you ate them all. Of the 100 possibilities where you ate one cupcake, for 1 of them, you will tell me you ate the cupcake with invisible number 96. Of the 100 where you ate all of them, for 1 of them you will tell me you ate the cupcake with invisible number 96. For the other 99 possibilities, you ate that cupcake, but you tell me some other number. So of the 200 possible outcomes, only for two of them will you tell me that you ate the cupcake with invisible printed-label 96. Those are equally probable, so it is equally likely you ate one or ate all of the cupcakes.
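The counting argument above can be checked by brute force. This is only a sketch under the stated assumptions (fair coin, a single cupcake chosen uniformly, and a uniformly chosen eaten label announced when all were eaten); the names `N`, `TOLD`, and `outcomes` are mine, not from the thread.

```python
from fractions import Fraction

N = 100      # number of cupcakes
TOLD = 96    # the number that gets announced

# Each outcome is (how_many_eaten, announced_number). A fair coin picks
# "one" or "all"; the announced number is taken to be uniform over 1..N in
# both cases (assumption: a random eaten cupcake is named when all were eaten).
outcomes = [(kind, n) for kind in ("one", "all") for n in range(1, N + 1)]

# Keep only the outcomes in which the announced number is TOLD.
matching = [o for o in outcomes if o[1] == TOLD]

p_all_given_told = Fraction(
    sum(1 for kind, _ in matching if kind == "all"), len(matching)
)

print(len(outcomes), len(matching), p_all_given_told)
```

Two of the 200 outcomes survive, one from each branch, which is where the 1/2 comes from.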

I believe Tyrrell McAllister means for 96 to not be special. That is, he will randomly choose one of the invisible numbers from among the ones on the cupcake(s) he ate. If this is what he means, then it is an error.

You calculated p(A|E_96) = 100/101 > 99%, which is the probability that you ate all the cupcakes given that you ate cupcake 96. This is correct, but it is not what you set out to prove. You are supposed to show us the probability that you ate all the cupcakes if you tell us that you ate cupcake 96. This is what panamajack showed, but he had a small error:

The last line should be
= 1/100 * 1/2 + 1/100 * 1/2

Of course, there are 99 other similar calculations for p(tE_1), etc. Telling us a number selects between those 100 cases, but for each of them it’s equally probable whether you ate one or all the cupcakes.

I don’t believe Tyrrell McAllister intends for us to condition on “Having randomly selected a number from among all the cupcakes I ate, that number came out to be 96”. I think he merely intends for us to condition on “I ate cupcake number 96”, no more and no less. Which is a legitimate thing to condition on.

As I said, conditioning on “I ate cupcake number X” can shift the probability upwards no matter what X is because the various Xes are not mutually exclusive, so there is no bar to this.

If we conditioned on “Having randomly selected a number from among all the cupcakes I ate, that number came out to be X” instead, in this case, the various Xes yield an exhaustive set of disjoint events, so that the current pre-conditioning probability of anything is the average of all of these post-conditioned probabilities of the same thing [weighted by the probability of this randomly selected number coming out to be X]. Thus, if conditioning on this is what is being asked about, probabilities will not always shift up no matter what X is; indeed, given a suitably symmetric setup, they won’t shift at all.
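To make the contrast concrete, here is a small sketch in exact arithmetic. It assumes a fair coin, a uniformly chosen single cupcake, and (for the second family of events) a label drawn uniformly from the eaten cupcakes; the variable names are illustrative only.

```python
from fractions import Fraction

N = 100
half = Fraction(1, 2)

# Events E_x: "cupcake x was eaten". These overlap when all were eaten.
p_E = half * 1 + half * Fraction(1, N)   # P(E_x), the same for every x
p_A_given_E = (half * 1) / p_E           # P(A | E_x)

# Events T_x: "a number drawn at random from the eaten cupcakes is x".
# Exactly one T_x occurs, so they are exhaustive and disjoint.
p_T = half * Fraction(1, N) + half * Fraction(1, N)   # P(T_x)
p_A_given_T = (half * Fraction(1, N)) / p_T           # P(A | T_x)

total_E = N * p_E    # exceeds 1: the E_x are not mutually exclusive
total_T = N * p_T    # equals 1: the T_x partition the sample space

# Weighted average of the post-conditioning probabilities over all T_x.
average_T = sum(p_T * p_A_given_T for _ in range(N))

print(p_A_given_E, total_E, p_A_given_T, total_T, average_T)
```

The E_x probabilities sum to 101/2, so nothing forces P(A | E_x) = 100/101 to average back to the prior. The T_x probabilities sum to 1, and their weighted average is exactly the prior of 1/2.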

As an analogous example:

Suppose a 3-sided die is rolled, with faces A, B, and C.

To begin with, the probability that the result is B is 1/3.

Conditioned on the information that the result is in {A, B}, the probability that it is B is 1/2.

Conditioned on the information that the result is in {B, C}, the probability that it is B is 1/2.

But we already know that at least one of these two things (result is in {A, B} or result is in {B, C}) occurs. So shouldn’t we already be able to say the probability that the result is B is 1/2, without having to conditionalize on further information to obtain this shift? Why delay the inevitable, etc.?

Well, no, clearly, we can’t reason like that; such reasoning only applies when looking at an exhaustive set of disjoint events.

For example, suppose we now introduce some guy who honestly tells us either “The result is in {A, B}” or “The result is in {B, C}”, flipping a coin to determine which to say if necessary. Then we can reason as follows:

Conditioned on the information that we are told “The result is in {A, B}” (i.e., either the result is A or (the result is B and the coin came up heads)), the probability that the result is B is 1/3.

Conditioned on the information that we are told “The result is in {B, C}” (i.e., either the result is C or (the result is B and the coin came up tails)), the probability that the result is B is 1/3.

Since this is an exhaustive, disjoint set of events, we know the “current” probability that the result is B, without conditioning on any new information, is the weighted average of these probabilities [weighted by the probabilities of their conditions obtaining]; thus, the unconditioned probability that the result is B is (1/3 + 1/3 * 1/2) * 1/3 + (1/3 + 1/3 * 1/2) * 1/3 = 1/3. Which is just as it should be. In this case, there would be a problem if the different conditioned cases all gave the same higher probability than the current unconditioned one… but crucial to the fact that this would be a problem is the selection of an exhaustive, exclusive list of events to conditionalize upon.
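For anyone who wants to verify the die example mechanically, the following sketch enumerates the six (face, coin) pairs. The function names and the string encoding of the two statements are my own choices for illustration.

```python
from fractions import Fraction
from itertools import product

third, half = Fraction(1, 3), Fraction(1, 2)

def statement(face, coin):
    """What the honest informant says for a given roll and coin flip."""
    if face == "A":
        return "AB"
    if face == "C":
        return "BC"
    return "AB" if coin == "heads" else "BC"   # face B: the coin decides

# Joint distribution over (face, coin); each pair has probability 1/6.
joint = {(f, c): third * half for f, c in product("ABC", ("heads", "tails"))}

def p_told(said):
    """Probability that the informant makes the given statement."""
    return sum(p for k, p in joint.items() if statement(*k) == said)

def p_B_given_told(said):
    """P(result is B | informant made the given statement)."""
    told = {k: p for k, p in joint.items() if statement(*k) == said}
    return sum(p for (f, _), p in told.items() if f == "B") / sum(told.values())

# Conditioning on bare set membership rather than on the statement.
p_B_given_in_AB = third / (third + third)

# Law of total probability over the two (disjoint, exhaustive) statements.
weighted = sum(p_told(s) * p_B_given_told(s) for s in ("AB", "BC"))

print(p_B_given_in_AB, p_B_given_told("AB"), p_B_given_told("BC"), weighted)
```

Conditioning on membership in {A, B} gives 1/2, while conditioning on being told so gives 1/3, and only the latter family averages back to the unconditioned 1/3.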

I haven’t had time to review the computations in the thread, but I think one thing that needs to be clarified is that eating a cupcake and then telling us it was number 96 after the fact gives an *entirely* different probability than if we first tell you to let us know if you eat #96, and then after your snack you tell us, “Yes, I ate #96.” The latter case better fits the logic that the OP is trying to put forward.

Yes, there are multiple interpretations possible, which is why, way back in the first reply, I said that Tyrrell McAllister needed to be more clear about how exactly the number 96 is chosen. It certainly isn’t the case that all possible ways he may have chosen 96 give the same answer.

Chronos is absolutely correct here. Think about this carefully until you can defend this perspective yourself.

While it’s legitimate in a mathematical sense to condition on it, it’s difficult, as in the twin problem, to come up with a practical situation in which it should occur. The usually assumed epistemological situation doesn’t allow for it.

Zenbeam, thanks for pointing out the gap in my reasoning. I had already slipped into thinking of ‘96’ as c.

I think I’ve made explicit what you know and how you know it. You know that I ate the cupcake with underside label 96 because I told you that I did. You don’t know why I chose to tell you that particular number from among the available ones, or even if there were any other available ones.

So the question is, given this state and provenance of your knowledge, what degree of credence should you give to the proposition that I ate all the cupcakes?

OK. Let me expand my argument in post 24 a bit. First, I’m assuming a fair coin flip, and I’m assuming that when you eat only one cupcake, you choose it randomly. Also, you will always tell me exactly one number.

Consider the point in time where you have flipped the coin, eaten any cupcakes, and chosen the number you’re going to tell me, but haven’t told me the number yet. At this point, we all agree (I think) that the 100 possibilities where you eat one cupcake total to 50 percent, and the other 100 also total 50 percent. Since the one-cupcake-eaten possibilities are all equally probable, they each have a 1/200 chance of occurring. The question left is how to divvy up the remaining 50 percent probability among the other 100 possibilities. Since you’re always telling me exactly one number, we know that all those 100 possibilities are mutually exclusive.

Now from my point of view, I have no knowledge of how you chose the number. There are two approaches to looking at this situation. One approach is that since I don’t know what method you’re using, I won’t know, after you tell me the number, whether that implies eating all of the cupcakes is more likely or less likely. For example, if you used the “lowest underside label” rule you mentioned in post 7, and the number you tell me is 96, there is zero chance that you ate all the cupcakes. For this approach, all I can say once you tell me the number is that I don’t know the likelihood of you eating all the cupcakes; I can at best give a range, from 0 to 99 percent.

The other approach (again since I don’t know how the number was chosen) is to assume it is equally likely that any one number was chosen. This is a valid approach, since I have no information to choose otherwise. In this case, all the other 100 possibilities are also equally likely, with a 1/200 chance of occurring. For this approach, once you tell me a number, 198 possibilities are eliminated, with one left from each of eat-one and eat-all. Those two possibilities were and are equally probable, so there’s a 50/50 chance of you having eaten all the cupcakes.
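The dependence on the selection rule can be sketched as follows. The two rules shown are the "uniform" assumption and the "lowest underside label" rule discussed above; representing a rule as a function from the set of eaten labels to a distribution over announced labels is my own modelling choice, as are all the names.

```python
from fractions import Fraction

N = 100
ALL = frozenset(range(1, N + 1))

def uniform_rule(eaten):
    """Announce a label chosen uniformly from the eaten cupcakes."""
    return {n: Fraction(1, len(eaten)) for n in eaten}

def lowest_rule(eaten):
    """Announce the lowest label among the eaten cupcakes."""
    return {min(eaten): Fraction(1)}

def p_all_given_told(rule, told):
    """P(ate all | announced `told`), fair coin, single cupcake uniform."""
    half = Fraction(1, 2)
    # Joint probability of "ate all" and announcing `told`.
    joint_all = half * rule(ALL).get(told, Fraction(0))
    # Joint probability of "ate one" and announcing `told`, summed over
    # which single cupcake was eaten.
    joint_one = sum(
        half * Fraction(1, N) * rule(frozenset({n})).get(told, Fraction(0))
        for n in range(1, N + 1)
    )
    return joint_all / (joint_all + joint_one)

print(p_all_given_told(uniform_rule, 96))
print(p_all_given_told(lowest_rule, 96))
print(p_all_given_told(lowest_rule, 1))
```

Under the uniform rule the answer is 1/2 for every announced number. Under the lowest-label rule it is 0 for 96 and 100/101 for 1, which is the 0 to roughly 99 percent range mentioned in the first approach.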

If you want to argue that once I’m given a specific number, that the possibility of eating all the cupcakes is 99 percent, you’ll have to explain why, just before I was told that number, the possibility of that particular choice was 1/2 instead of 1/200, and how I have knowledge of that.

This is exactly right. But I see only now how much my original formulation invited the other interpretation. panamajack, ZenBeam, and others are right that if you condition on “I tell you that I ate cupcake 96”, you get the probabilities that they computed. I’ll need to think about how to reformulate the problem to disallow the unintended interpretation.

I still don’t see this. Are the cupcakes the children? And what is it in the cupcake story that maps to “is a boy”?

To formulate it so that the probability increases when you name a number, you’ll have to take a rather odd approach:
You will tell me that you ate a particular cupcake only in the case that you actually eat that cupcake. This isn’t too much different from choosing in advance that you’re going to tell me the status of C96. If you choose the single one and it isn’t 96, then you won’t tell me anything. But it hardly takes much effort to see that you’ll almost certainly only tell me about it when you eat all of them.

One could take an utterly naive approach and assume that, with absolutely no outside information, it’s equiprobable that you might do this, or that you might choose a single random one. That’ll give a slightly different answer that still favors the ‘eat em all’ choice. This amounts to a guessing game, though. Any method could be mixed in: I might similarly add the chance that you would lie about eating a particular cupcake, or any number of options. In most cases, these scenarios are not considered equally likely but set to a probability of 0. It’s only in problems like this, which do not appear to be ambiguously worded but are, that it becomes important to consider them.
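Here is a rough calculation of those cases, assuming a fair coin and a uniformly chosen single cupcake. The 50/50 weighting between the two protocols is exactly the naive assumption described above, not something the problem statement provides, and the helper `posterior` is hypothetical.

```python
from fractions import Fraction

N = 100
half = Fraction(1, 2)
one_in_n = Fraction(1, N)

def posterior(p_told_given_all, p_told_given_one):
    """Bayes' rule with a fair coin between 'ate all' and 'ate one'."""
    return (half * p_told_given_all) / (
        half * p_told_given_all + half * p_told_given_one
    )

# Protocol 1: "I ate #96" is announced exactly when #96 was eaten.
only_if_eaten = posterior(Fraction(1), one_in_n)

# Protocol 2: a label is drawn uniformly from the eaten cupcakes.
random_label = posterior(one_in_n, one_in_n)

# Naive mixture: each protocol assumed equally likely (an assumption).
mixture = posterior(
    half * 1 + half * one_in_n,
    half * one_in_n + half * one_in_n,
)

print(only_if_eaten, random_label, mixture)
```

The pure "only if eaten" protocol gives 100/101, the random-label protocol gives 1/2, and the equal mixture gives 101/103, which is slightly different from 100/101 but still heavily favors eating them all.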

Here’s a reformulation:

100 cupcakes, all with labels, one of them blue. Tyrrell flips a coin and then either eats all of them or just the blue one. At this point, you, dear reader, enter the room.
Sign on the door: “Ask Tyrrell about any one particular label, and he’ll tell you whether or not he ate that cupcake”.
You: “Ok…”
You: “Did you eat cupcake #96?”
Tyrrell: “Yes”.

At this point, given the information you currently have (and all the usual conventions about how to assign invariance to aspects of the setup, fairness of coins, honesty, not being surprised by the results of one’s own volition, etc.), what’s the probability that Tyrrell ate all 100 cupcakes?

Suppose you instead had asked “Did you eat cupcake #X?” (for any other particular label X) and got the answer “Yes”. What’s the probability then? [Well, obviously, the same, since the problem is symmetrically set up; nothing special about #96].

Suppose you instead had asked “Did you eat cupcake #X for any label X?” and got the answer “Yes”. What’s the probability then? [That the answer now is different is the paradoxical result]
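A sketch of the two questions side by side, under the usual conventions plus one assumption of mine: the blue cupcake's label is equally likely to be any of the 100. The names `worlds`, `ate`, and `p_all_given` are illustrative.

```python
from fractions import Fraction

N = 100
ASKED = 96

# A world is (ate_all, blue_label): fair coin, blue label uniform (assumed).
worlds = {
    (ate_all, blue): Fraction(1, 2) * Fraction(1, N)
    for ate_all in (True, False)
    for blue in range(1, N + 1)
}

def ate(world, label):
    """True if the cupcake with this label was eaten in this world."""
    ate_all, blue = world
    return ate_all or blue == label

def p_all_given(answer_is_yes):
    """P(ate all | the answer to the question is 'Yes')."""
    yes = {w: p for w, p in worlds.items() if answer_is_yes(w)}
    return sum(p for (ate_all, _), p in yes.items() if ate_all) / sum(yes.values())

# "Did you eat cupcake #96?" answered "Yes".
p_specific = p_all_given(lambda w: ate(w, ASKED))

# "Did you eat any cupcake?" answered "Yes" (true in every world).
p_any = p_all_given(lambda w: any(ate(w, x) for x in range(1, N + 1)))

print(p_specific, p_any)
```

Asking about a specific label and hearing "Yes" gives 100/101, whichever label is asked about. Asking whether any cupcake was eaten carries no information, so the probability stays at 1/2.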

I don’t see what’s paradoxical there. In the first two cases, it’s obvious that the odds are much higher that he ate all of them if the answer is yes.

I’m having trouble understanding what is meant in the last question. Is ‘X’ meant literally there? That’d be just asking if he ate every cupcake; but I don’t know what the question “Did you eat cupcake #3 for label 3?” means.

The last question basically means “Did you eat any cupcake?”. Of course, there’s no point asking this; you already know the answer will be “Yes”. I just phrased it this way to contrast it with the above.

The ostensible paradox (which, of course, like anything true is not actually contradiction-causingly paradoxical) is that there is a difference between “If I ask of any label ‘Did you eat this label?’ and get the answer ‘Yes’, then the prob. you ate them all is p” and “If I ask ‘Did you eat any label?’ and get the answer ‘Yes’, then the prob. you ate them all is p”. These are two different statements and neither implies the other.

Of course, that may not actually seem paradoxical at all; you may think “Well, why wouldn’t they be different?”. And you’d be right to think so, because the problem’s whole raison d’etre is to illustrate precisely that they can be different. So, yeah, the paradox may not seem so paradoxical anymore once formulated this way, but that’s fine by me; I’d rather dissolve confusions than figure out how to keep them propped up.

(Emphasis added.) The part I bolded isn’t a necessary assumption. You could have rolled a hundred-sided die to decide which cupcake you’d ask about.

Which I think is a good thing, because I think that your prior probability function should reflect a state of knowledge that doesn’t yet know anything about what the result of your own deliberation will be, beyond the fact that the result will lie in a particular set.

Of course, you often do have knowledge about your propensity for deciding in a certain way, but I hold that this should be incorporated as a Bayesian updating of the prior “ignorant” probability function.