'nother probability"?" which bin empties first?

Wouldn’t the number of states be huge making it somewhat prohibitive?

Here’s an easy way to look at it. Consider two bins that have the same amount of balloons. They are equally likely to empty first because they are exactly the same. Add one balloon to one bin. In order to get back to the 50/50 stage one balloon in the larger bin has to pop first which is not guaranteed. You can add as many as you like to convince yourself of other combinations of bins.

Okay, I think I got the right answer in terms of calculating, because my numbers are very close to what CookingWithGas got with the simulation. Here’s the numbers I got:


0.413023204	0.26152622	0.163113676	0.100589796	0.061747104

I got this with a somewhat complex formula, but basically, for a given n number of pops, it was the probability that it was already the first one empty + the probability that it wasn’t times the probability that it empties with this pop and none of the others were already empty on the previous step.

I’m not 100% sure I have the nitty-gritty of the math 100% correct, so I’ll play with it a bit more before I post it.

This is exactly how I think of it.

At the point at which the probability of a balloon in one bin popping is equal to that of another bin (ie, the number of balloons in each bin is the same), the bin with fewer balloons to start with has already had an oportunity to drop to the next lower number of balloons. So, it is more likely that the bin with one balloon initially will empty before the bin with five balloons, because bin A always has a larger than 0% chance of emptying on each successive pop, while bin B has a 0% chance up until the last pop.

Yet another way to look at it - say we allow all balloons to pop and carefully note the order in which they pop. Consider only the last balloon in each bin to pop. How many of the possible orders that the balloons could pop in would lead to that balloon popping ahead of all the other last-in-bin balloons? Well, the last balloon in the 3-bin must have had at least 2 other balloons pop before it, whereas the last balloon in the 2-bin need only have had one balloon pop before it. Therefore, there are more possible orders of balloon pops that lead to the 2-bin emptying first.

I don’t think so, because the number of transitions into/out of a given state is relatively small.

Okay, I think I figured out the right math… but it’s still recursive. Again, using the same i, n, s’s (ie, sj is the size of the jth bucket), also let F(j,n) be a boolean function such that bucket j is the first bucket to be empty at or before n balloons have popped.

p(F(j, n)) = (1-p(F(k, n-1)) * p(sj-1|n-1) * ForEach(i<>j, (1-p(si|n-1))) / n + p(F(k, n-1)

Then, using that, I get these numbers at the 16th balloon (the last place the first one could have emptied) which, as expected, sum to 1 (okay, very close to it due to rounding):


0.402583292	0.256953873	0.165277465	0.106630356	0.068605352

I’ve posted my balloons2.xls spreadsheet here. The highlighted values in the left chart are the probabilities of a particular bucket being empty (without consideration of what’s in the other buckets) after so many balloons are popped… as I said before, the higher value for the smaller buckets indicates a higher probability of the smaller buckets being empty. The highlighted values on the right charts (near the bottom) indicate the over all probability of that particular bucket being the first to be empty and are very close to the values from the simulation, so I’m hoping they’re right… maybe statistician can correct me if my calculations are wrong.

I have to admit, getting those values was tougher than I’d thought it’d be.

From ultrafilter’s suggestion of using a lattice, the numbers I get are:

Bin with 2: 0.418486
Bin with 3: 0.245758
Bin with 4: 0.156669
Bin with 5: 0.105379
Bin with 6: 0.073707

I do this by creating a set of boxes with (2, 3, 4, 5, 6) baloons in them, and give it a probability of 1.0. I initialize the top level of the lattice with this set. At each step down in the lattice, I check the sets in the current level. If any bin in the current has 0 elements in it, then this set is a terminal set, and a vector of probabilities has the element for this bin incremented by this set’s probability. For each set with no 0-bins, I create 5 new sets, each with all but one bin equal to the bins of current set, and the remaining bin with one less balloon than the equivalent bin in the current set. For these new sets, I give each the probability of the current set, multiplied by the number of balloons in the current set’s relevent bin divided by the total balloons in the current set. I merge the new sets into the next level of the lattice, and continue on in the current level.

For a simpler example, consider the set with 3, 1, and 2 balloons:

Level 0: [3, 1, 2] @ 1.0
Level 1: [2, 1, 2] @ 0.5; [3, 0, 2] @ 0.167; [3, 1, 1] @ 0.333
Level 2: [1, 1, 2] @ 0.2; [2, 0, 2] @ 0.1; [2, 1, 1] @ 0.4; [3, 0, 1] @ 0.067; [3, 1, 0] @ 0.067
Level 3: [0, 1, 2] @ 0.05; [1, 0, 2] @ 0.05; [1, 1, 1] @ 0.3; [2, 0, 1] @ 0.1; [2, 1, 0] @ 0.1
Level 4: [0, 1, 1] @ 0.1; [1, 0, 1] @ 0.1; [1, 1, 0] @ 0.1

Summing the probabilities of the terminal sets, we get the probability that the first bin will empty first of 0.15, the probability that the second bin will empty first is 0.5833, and the probability that the third bin will empty first is 0.2667.

Note that the set [2, 1, 1] has probability 0.4, which comes from the probability of set [2, 1, 2] (1/2) multiplied by the odds for selecting a balloon in the third bin (2/5) , plus the probability of set [3, 1, 1] (1/3) multiplied by the odds for selecting a balloon in the first bin (3/5). A similar thing happens for set [1, 1, 1], coming from both set [2, 1, 1] and set [1, 1, 2].

I didn’t track the percentage of time each bin empties first, but here’s another way of answering the OP. . .

Expected finishing time for each bin (from a simulation with 100,000 runs).

[ 9.085 12.37973 14.96164 16.91972 18.3609 ]

I ran that a few times. There’s actually a bit of variation, but it’s close. If you really want to prove it to yourself, you could get a standard error for each of those estimates, and show there is no overlap.

Here’s the python code I used in case anyone wants to check it. It very well might be screwed up, but I think it’s all right. The only odd thing I do is that each time, I generate a random integer between 0 and 4. If the bin that number corresponds to that is already empty, I just regenerate one from 0 to 4 until I get a non-empty bin.

I don’t (for example) reorder all the bins, and regenerate a random from 0 to “Number of non empty bins” but it still uniformly distributes the “pops” over the remaining bins.


from random import *
from numpy import *

"""
myMat[r][j] adds a 1 if bin r empties on the jth step. I use it at the end
to calculate the expected value, and you can also print it out to see how
often each bin empties on each step. Divide it by steps.
"""

myMat = zeros((5,20))
steps= 100000

for i in range(steps):
    bin = array((2,3,4,5,6))       #fill the bins with balloons
    j = 0                          #number of steps.     
    while (sum(bin) > 0):
        r = randint(0, 4)
        if (bin[r] != 0):          #if it does == 0, regenerate a random       
            if (bin[r] == 1):      #empties the bin 
               myMat[r][j] += 1    #indicate the empty point
               bin[r] = 0           
               j += 1
            else:
               bin[r] -= 1
               j += 1
   

step = array((1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20))

ExpectedValue = dot(myMat, step)/steps

#print myMat/steps

print ExpectedValue


But we don’t want to distribute the pops uniformly over the bins. We want to distribute them uniformly over the balloons.

Oh, you’re exactly right.

I’ll redo if I can find a few minutes.

Let’s try this. Expected length of time until empty:

[14.16689 15.07231 15.64298 16.08396 16.37662]

I’ll put in a calculation to figure the percentage of time each bin finishes first, and see if that jives with what others got.



from random import *
from numpy import *

"""
myMat[r][j] adds a 1 if bin r empties on the jth step. I use it at the end
to calculate the expected value, and you can also print it out to see how
often each bin empties on each step.
"""

myMat = zeros((5,20))
steps= 100000
"""
now, we'll make a vector that contains bin values, and then pop balloons
randomly over that vector. If we pop in an empty bin, do nothing, and go 
to next step.
"""
whichBin = array((0,0,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4,4))

for i in range(steps):
    bin = array((2,3,4,5,6))       #fill the bins with balloons
    j = 0                          #number of steps.     
    while (sum(bin) > 0):
        r = randint(0, 19)
        idx = whichBin[r]
        if (bin[idx] != 0):
            if (bin[idx] == 1):      #empties the bin 
               myMat[idx][j] += 1    #indicate the empty point
               bin[idx] = 0           
               j += 1
            else:
               bin[idx] -= 1
               j += 1
    

step = array((1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20))

ExpectedValue = dot(myMat, step)/steps

print myMat/steps

print ExpectedValue

OK, I used my simulation to find the percentage of time each bin empties first. I got something quite different than the earlier posters.

[0.30141, 0.22628, 0.18305, 0.1543 , 0.13496]

If anyone sees something wrong with the code I wrote, let me know. This seems pretty transparent, though. I tried sifting through punoq’s formula and BlasterMaster’s code, but I didn’t really get anywhere with it.

code with added “firstFlag” and isFirst to accumulate firsts.



from random import *
from numpy import *

"""
myMat[r][j] adds a 1 if bin r empties on the jth step. I use it at the end
to calculate the expected value, and you can also print it out to see how
often each bin empties on each step.
"""

myMat = zeros((5,20))
steps= 100000
whichBin = array((0,0,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4,4))
isFirst = zeros(5)

for i in range(steps):
    bin = array((2,3,4,5,6))       #fill the bins with balloons
    j = 0                          #number of steps.
    firstFlag = 0                  #gets set to 1 when a bin empties. 
    while (sum(bin) > 0):
        r = randint(0, 19)
        idx = whichBin[r]
        if (bin[idx] != 0):
            if (bin[idx] == 1):      #empties the bin 
               myMat[idx][j] += 1    #indicate the empty point
               bin[idx] = 0           
               j += 1
               if (firstFlag == 0):    #hasn't been set yet.
                  isFirst[idx] += 1    #increment number of "firsts"
                  firstFlag = 1        #set flag.
            else:
               bin[idx] -= 1
               j += 1
    

step = array((1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20))

ExpectedValue = dot(myMat, step)/steps

print myMat/steps

print ExpectedValue

print isFirst/steps


OK, now I’m just responding to myself, but there’s an error in that code, too.

As long as there is any balloon in a bin, it assigns a probability to that bin as if it were full. That’s what I get for just hacking it together.

From 10000 runs:

Expected steps until empty: [ 14.017 15.7561 16.8068 17.5233 17.9998]
Finish first probabilities: [ 0.414 0.2449 0.1588 0.1055 0.0768]

final code I’m going to write on the subject:



from random import *
from numpy import *

"""
myMat[r][j] adds a 1 if bin r empties on the jth step. I use it at the end
to calculate the expected value, and you can also print it out to see how
often each bin empties on each step.
"""

myMat = zeros((5,20))
steps= 10000
isFirst = zeros(5)
numBalloons = 20

for i in range(steps):
    bin = array((2,3,4,5,6))       #fill the bins with balloons
    j = 0                          #number of steps.
    firstFlag = 0
    numBalloons = 20
    oldBin = array((0,0,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4,4))
    newBin = oldBin
    while (sum(bin) > 0):
        oldBin = newBin
        r = randint(0, len(newBin)-1)
        idx = oldBin[r]
        if (bin[idx] != 0):
            if (bin[idx] == 1):      #empties the bin 
               myMat[idx][j] += 1    #indicate the empty point
               bin[idx] = 0           
               j += 1
               k = 0
               numBalloons -= 1
               newBin = zeros(numBalloons)
               for m in range(len(oldBin)):
                   if (m != r):
                       newBin[k] = oldBin[m]
                       k += 1
               if (firstFlag == 0):
                  isFirst[idx] += 1
                  firstFlag = 1
            else:
               bin[idx] -= 1
               j += 1
               numBalloons -= 1
               k = 0
               newBin = zeros(numBalloons)
               for m in range(len(oldBin)):
                   if (m != r):
                       newBin[k] = oldBin[m]
                       k += 1
    

step = array((1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20))

ExpectedValue = dot(myMat, step)/steps

print myMat/steps

print ExpectedValue

print isFirst/steps