Another tricky math/probability question

(This was the “puzzler” on 538.com recently)

There’s a game at a fictional casino. If you choose to play, the casino randomly generates a number between 0 and 1. Then they generate another number between 0 and 1, and add it to the first. They repeat this process until the sum of all the generated numbers is greater than 1.

They then pay you $100 for every random number they generated.

So if the random numbers were .25, then .42, then .61, you would be paid $300, because .25 + .42 is not greater than 1, but adding in .61 is, and they generated three numbers total.

They charge you $250 to play this game.

Is that a good deal? Why or why not? And how much should you be willing to pay to play?

My (partial) answer:

I’m quite certain that you should pay $250 to play the game.

Obviously you always get at least 2 pulls, so at least $200. The question is how often it takes at least 3 pulls, so that you come out ahead with $300.

So let’s restrict the casino real numbers to fractions with n as the denominator: 0, 1/n, 2/n, and so on up to n/n = 1. What are the odds that two random such fractions sum to 1 or less? Well, if the first number they chose was 0, then it’s certain, n/n. If the first number they chose was 1/n, then it’s (n-1)/n. If the first number they chose was 2/n, then it’s (n-2)/n. Each of their possible choices was equally likely for their first pick, so the overall probability is the average of all of those, (1 + 2 + 3 + … + n)/n^2, which, as n approaches infinity, becomes 1/2. So if you maxed out at 3 pulls, you would still get 2 pulls half the time and 3 pulls half the time, and that’s break-even at $250. Since you have some non-zero chance of winning more than $300, the game is definitely in your favor.
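
A quick brute-force check of that limit in Python, assuming both picks are drawn uniformly from the multiples 0, 1/n, …, n/n (a discretized model of my own, just to see the 1/2 emerge):

from fractions import Fraction

# Probability that two picks from {0, 1/n, 2/n, ..., n/n} sum to 1 or less,
# assuming each of the n+1 values is equally likely (a discretized model).
def p_two_picks_at_most_one(n):
    good = sum(1 for i in range(n + 1) for j in range(n + 1) if i + j <= n)
    return Fraction(good, (n + 1) ** 2)

for n in (10, 100, 1000):
    p = p_two_picks_at_most_one(n)
    print(n, p, float(p))   # tends to 1/2 as n grows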

As for what the actual EV is, I have absolutely no idea.

In some puzzles like this, the answer is that the EV is infinite. After all, there’s a non-zero probability that you win a googol dollars. But I don’t think that’s the case here, because as you get to larger and larger amounts, the value you win increases linearly but the likelihood of winning it decreases exponentially.

That’s all I got… any of our math experts want to take a crack at this? (Or is it a well known problem?)

My answer is simpler, though it reaches the opposite conclusion from yours:

The game is not a good deal, because if it were, the casino wouldn’t be offering it.

He did say it was a fictional casino…

I’ll buy MaxTheVool’s reasoning on the first two numbers, and then say that at every stage after that it is 50/50 as to whether you will get a number that takes you above 1. It is 50/50 simply because there are always as many numbers that will take you above 1 as will not take you above 1.

It is a good deal, because it is a fictional casino. As is well known, fictional casinos offer good deals to show how smart the authors are.

I’ve been enjoying the FiveThirtyEight Riddler quite a lot since they started it. Below is the solution that I came up with on Friday.

The answer as to whether you should play the game is relatively simple:

Yes. You always get $200. Moreover, there is a 50-50 chance that the first two numbers will add up to less than 1. You can see this by plotting the first number along the x-axis in the Cartesian plane and the second number along the y-axis. Each possibility for the first two draws will lie somewhere in the square with opposite vertices (0,0) and (1,1). To be able to keep going after the second draw, the point in the square corresponding to your first two draws must lie below the line x + y = 1 (which runs diagonally across the square, from the corner at (1,0) to the corner at (0,1)). The area inside the square below this line is exactly 1/2 the area of the square. Thus the chance that you’ll get a third pull is 50-50, which means that your expected winnings are at least $250: $200 automatically, an average of $50 from the third round (since you have a 50-50 chance of making it to the third round), and whatever extra money you can expect to win from the fourth round and beyond.
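
A quick Monte Carlo check of that 50-50 claim, assuming the two draws are independent and uniform on [0, 1]:

import random

# Throw random points into the unit square and see how often they land below
# the line x + y = 1 (i.e. how often the first two draws leave you under 1).
trials = 1_000_000
below = sum(random.random() + random.random() < 1.0 for _ in range(trials))
print(below / trials)   # about 0.5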

As far as the expected winnings for the entire game, they are

$100 * e ≈ $271.83.

I showed this as follows, though there may be a simpler method:

[spoiler]Suppose we select a set of n numbers x[sub]1[/sub], x[sub]2[/sub], … x[sub]n[/sub] between 0 and 1 at random. What is the probability that their sum x[sub]1[/sub] + x[sub]2[/sub] + … + x[sub]n[/sub] is less than or equal to 1? This can be shown (via calculus and induction, which I won’t get into here unless someone wants to see it) to be 1/n!. So the expected winnings from round n are $100 / (n - 1)!. (Note that you get $100 from round 3, for example, if the first two numbers add up to less than 1.) Thus, the total expected winnings are

$100 * (1/0! + 1/1! + 1/2! + … )

This infinite sum is known to add up to e, the base of natural logarithms.
[/spoiler]
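
For anyone who’d rather check numerically, here’s a rough Monte Carlo sketch of the whole game, assuming independent uniform draws on [0, 1]; the average payout should land near $271.83:

import random

# Rough Monte Carlo of the whole game, assuming independent uniform draws on [0, 1].
def play_once():
    total, draws = 0.0, 0
    while total <= 1.0:
        total += random.random()
        draws += 1
    return draws

trials = 1_000_000
mean_draws = sum(play_once() for _ in range(trials)) / trials
print("average number of draws:", mean_draws)         # about e = 2.71828...
print("average payout: $%.2f" % (100 * mean_draws))   # about $271.83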

I want to comment on a peculiarity of the underlying function.

The expected number of turns, f(x), to reach target x is

[SPOILER] f(x) = 1 + INTEG f(x-t) dt
where INTEG is the definite integral from 0 to min(x,1)
When x < 1 this is just f(x) = e^x so, as MikeS points out, f(1) = 2.71828. If you only need to get to .5, you’ll need an average of f(.5) = e^.5 = 1.6487 turns, and so on.

Of course if your target is very large, say x=100, you’ll have, approximately
f(x) = f(x-1) + 2
which is a totally different sort of function than f(x) = e^x. The function changes character (due to the min(x,1) above) even though it’s obvious intuitively that f(x) is very smooth: a target of 0.999 isn’t much different from 1.001.

I once solved this, or a similar, problem and discovered that f(x) was a spline where not only was there a switch at x=1, but also (counterintuitively?) at x=2 and x=3 and so on. But I won’t try to repeat this now (even my sense of smell is going :mad: )[/SPOILER]
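
A quick simulation (again assuming uniform draws) agrees with both regimes of f(x):

import math
import random

# Average number of uniform [0, 1] draws needed for the running total to exceed x.
def mean_draws_to_exceed(x, trials=100_000):
    total = 0
    for _ in range(trials):
        s, n = 0.0, 0
        while s <= x:
            s += random.random()
            n += 1
        total += n
    return total / trials

print(mean_draws_to_exceed(0.5), math.exp(0.5))   # both about 1.6487
print(mean_draws_to_exceed(1.0), math.e)          # both about 2.7183
# For large targets the curve grows like f(x) = f(x - 1) + 2 (each draw adds 1/2 on average):
print(mean_draws_to_exceed(100.0, 20_000) - mean_draws_to_exceed(99.0, 20_000))  # about 2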

That depends on what you mean by “as many.” The sets have the same cardinality but not the same measure. It’s probably the latter that’s relevant here, but we don’t know for sure without knowing how the casino generates its random numbers.

MikeS is correct in their calculations (on the presumably intended interpretation of the random numbers as generated independently uniformly from [0, 1]). I’ll just finish off by noting two different approaches to the bit they didn’t want to get into: that the sum of n independent uniform [0, 1] variables has probability 1/n! of being less than 1.

A) Along the lines of MikeS’s visualization for n = 2, we might in general consider this to be the question “What is the volume of the n-dimensional pyramid with one vertex at the origin and other vertices at a distance of 1 along each axis from this?”. The n-dimensional volume of a pyramidal shape is the (n-1)-dimensional volume of its base times its height, divided by n; thus, the volume we are interested in will be 1 * 1/2 * 1/3 * … * 1/n = 1/n!. (If you prefer to think arithmetically rather than geometrically, you can think of this as repeated integration, with 1 integrated n times becoming x |-> x^n/n!)
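
If you’d like to see the repeated-integration version done mechanically, here is a small sketch (using sympy purely as a convenient way to carry out the symbolic integrals):

import sympy as sp

x, t = sp.symbols('x t', nonnegative=True)

# Integrate the constant 1 over [0, x], n times in a row; after n integrations
# the result should be x**n/n!, whose value at x = 1 is the volume 1/n!.
g = sp.Integer(1)
for n in range(1, 6):
    g = sp.integrate(g.subs(x, t), (t, 0, x))
    print(n, g, g.subs(x, 1))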

B) Another way of looking at it: Call the original random variables x1, x2, …, xn, and define from these the new variables y1 = x1, y2 = (x1 + x2) mod 1, y3 = (x1 + x2 + x3) mod 1, etc., where by “mod 1” I mean “keeping only the part after the decimal point”. Note that these are also independent uniform [0, 1] variables; furthermore, the original variables will sum to less than 1 if and only if the new variables come out in increasing order. Of course, the new variables are equally likely to be in any of the n! possible orders, and so the probability of them lining up as we want is 1/n!.
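
And a quick simulation check of that equivalence (shown here for n = 4, where both events should occur about 1/4! of the time):

import math
import random

# Check that the running sums of n uniforms stay below 1 exactly when the
# "mod 1" running sums come out in increasing order, and that both events
# happen with probability about 1/n!.
def trial(n):
    xs = [random.random() for _ in range(n)]
    sums = [sum(xs[:k + 1]) for k in range(n)]
    ys = [s % 1.0 for s in sums]
    below_one = sums[-1] < 1.0
    increasing = all(ys[k] < ys[k + 1] for k in range(n - 1))
    assert below_one == increasing   # the two events coincide
    return below_one

n, trials = 4, 500_000
hits = sum(trial(n) for _ in range(trials))
print(hits / trials, 1 / math.factorial(n))   # both about 0.0417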

Beautiful!

(For what it’s worth, MaxTheVool’s reasoning was also essentially correct, and corresponds to the beginning of MikeS’s reasoning, discretely approximating a familiar area that we might also recognize directly.

TATG’s reasoning was… unlikely to be the intended interpretation of the problem, for the reasons pointed out by Thudlow Boink (if your total is already 0.75, for example, we probably want to say you have a 25% chance of staying under 1 on the next roll and a 75% chance of breaking over). Indeed, considering all nontrivial intervals as having 50-50 odds because they have the same cardinality as their complement is incoherent: if one has 50-50 odds of drawing values below A or above A, and also 50-50 odds of drawing values below B or above B, one must have zero chance at all of drawing values between A and B, and we can’t manage to have this be true for every A and B if you are able to draw any values at all.

Illustrated another way, this cardinality-as-probability argument would have us say we have 50-50 odds of drawing a value less than 1/3 (because there are as many values below 1/3 as above it), but at the same time a 1/3 chance of drawing values less than 1/3 (because there are as many values below 1/3 as there are between 1/3 and 2/3 and as there are above 2/3, thus giving us three equally likely possibilities), and in the same way a 1/(whatever value you like) chance of drawing values within whatever interval you like, splitting the world appropriately; of course, these cannot simultaneously be true.)

All true (what you said, not me). Assuming we have a uniform probability distribution over intervals allows us to get an answer. My question is, if we don’t assume that, can we get any coherent answer?

Let’s say we start with a wheel where each point picks out an ordering of the numbers in [0,1]. We have the pre-game session where we spin that wheel, then use the result of that spin to pick a second wheel, where the numbers are ordered as determined by our pre-game spin. We then use that second wheel for all remaining spins. If we have the rules otherwise as in the OP, is this game a good deal?

I want to marry that math.

I’m not quite sure I understand what you’re asking, TATG. It’s not the ordering, per se, that matters; it’s the probability distribution. Are you asking “Suppose we first pick a random probability distribution P on the unit interval, then use P for all the draws in the problem. Is the game a good deal (in the sense of having positive ‘expected value’)?”. The answer will depend on what the probability distribution of P itself (i.e., the probability distribution of probability distributions on [0, 1]) is to be taken as.

I just wanted to illustrate it with a more concrete case. So imagine a clockface where the edge of the clock between 9 and 12 is larger than that between 1 and 2. Instead of having a nicely ordered wheel like this, we have a randomly ordered wheel (as one might have in a casino). But forget the concrete case.

Let me just try to define different games.
In the first game let’s assume we have the probability distribution over [0,1] that others above did. Now let’s take all of the bijections from [0,1]->[0,1], and pick one of those (where we are equally likely to pick any of those functions). In this new game we first randomly pick a bijection, b. Then we randomly pick a number, x, from [0,1] (according to the distribution being assumed by others, again), but we instead take the value for the game to be b(x). Is this a favourable game?
In the second game let’s just assume that each number is as likely to be drawn as each other number. Is this a favourable game?
The game you define isn’t the same as either of those, so far as I can tell.

It’s not clear what “equally likely to pick any of those functions” means (there are infinitely many of them, so each particular one will have to have probability zero of being exactly the chosen one, but beyond that constraint, it does not follow that we will know what the probability of other properties of the chosen bijection will be). If you want this to mean cardinality-as-probability, we run into exactly the same problems seen with trying to use cardinality-as-probability before; cardinality-as-probability only works well over a finite space.

With the uniform probability distribution on [0, 1], every number IS exactly as likely to be drawn as each other number: each particular number has probability zero of being exactly the one drawn. (The same is true of many other probability distributions on [0, 1] as well; the distributions it’s true of are precisely the “continuous” ones). The only interesting facts are probabilities of falling within larger intervals, but specifying the behavior of our distribution on these intervals involves more than just saying “each number is exactly as likely to be drawn as each other number”.
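
To make that concrete with a toy example of my own: here are two continuous distributions on [0, 1], a uniform draw and the square of a uniform draw; each gives every individual point probability zero, yet they disagree on intervals:

import random

trials = 1_000_000
# Probability of landing below 1/3 under a uniform draw...
uniform_hits = sum(random.random() < 1 / 3 for _ in range(trials))
# ...and under the square of a uniform draw (a different continuous distribution).
squared_hits = sum(random.random() ** 2 < 1 / 3 for _ in range(trials))
print(uniform_hits / trials)   # about 0.333
print(squared_hits / trials)   # about 0.577, i.e. sqrt(1/3)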

And it cannot be the case, no matter what distribution we pick, that all intervals of equal cardinality have the same probability, for the reasons noted before.

From what you have said, it seems to me like it is fairly clear what it means. “Each is equally likely” may just have to mean that each has zero probability (or, say, some infinitesimal probability). The problem, it seems, is not that; it’s that to get an answer we need to know more than just that each has equal probability. (And if so, fair enough. That is an answer to what I want to know, if not an answer to the questions I managed to specify.)

I don’t. What I said (about equal probability) was restricted to the special case of individual functions (not arbitrary sets of them).

I know. None of my examples were meant to assume this (or suggest it, or anything like it).

Alright, yes; specifying a distribution on an infinite space necessarily involves making choices above and beyond just what the probabilities of individual points will be (in particular, these will all be zero if we are to have them all equal), with everything interesting coming down to questions as to what probabilities are assigned to larger (infinite) subsets.

[Pedantically, one can get away with just choosing probabilities of individual points if one imposes countable additivity on the probability distribution, as is common, and is also dealing with a countable space, but never mind that for now; it doesn’t help with anything here]

Grepping around I see that I did indeed encounter the very same problem over two decades ago. :eek:



	f(x) = e^x			, 0 < x < 1
	f(x) = P_1(x-1) * e^(x-1)	, 1 < x < 2
	f(x) = P_2(x-2) * e^(x-2)	, 2 < x < 3
	f(x) = P_3(x-3) * e^(x-3)	, 3 < x < 4
		et cetera
where
	P_0(t) = 1
	P_1(t) = -t + e
	P_2(t) = 1/2 * t^2 - e*t + e^2 - e
	P_3(t) = -1/6 * t^3 + e/2 * t^2 + (e - e^2)*t + e^3 - 2e^2 + e/2
		et cetera


For what it’s worth, the general form of f(x) is the alternating series f[sub]0[/sub](x - 0) - f[sub]1[/sub](x - 1) + f[sub]2[/sub](x - 2) - f[sub]3[/sub](x - 3) + … where f[sub]n[/sub](x) = e[sup]x[/sup] x[sup]n[/sup]/n! for positive x, and 0 otherwise. (Thus, one can ignore all the terms f[sub]n[/sub] for n > x)

[In other words, f(x) is the sum, over all n < x, of e[sup](x - n)[/sup] (n - x)[sup]n[/sup]/n!]
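
For anyone who wants to double-check, here’s a short script comparing that series against the piecewise polynomials quoted above; the two forms should agree wherever both are written out:

import math

e = math.e

# f(x) from the alternating series: the sum over integers n <= x of
# e^(x - n) * (n - x)^n / n!
def f_series(x):
    return sum(math.exp(x - n) * (n - x) ** n / math.factorial(n)
               for n in range(int(math.floor(x)) + 1))

# f(x) from the piecewise polynomials P_0 through P_3 given earlier.
def f_spline(x):
    t = x - math.floor(x)
    if x < 1:
        return math.exp(x)
    if x < 2:
        return (-t + e) * math.exp(t)
    if x < 3:
        return (t**2 / 2 - e * t + e**2 - e) * math.exp(t)
    if x < 4:
        return (-t**3 / 6 + e / 2 * t**2 + (e - e**2) * t
                + e**3 - 2 * e**2 + e / 2) * math.exp(t)
    raise ValueError("only the pieces up to x = 4 are written out above")

for x in (0.5, 1.0, 1.5, 2.5, 3.5):
    print(x, f_series(x), f_spline(x))   # the two columns should match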