Could somebody calculate this for me? Over a 10-hour period, event X will happen exactly 20 times, with each occurrence independently and uniformly random across the period. If we split the 10-hour period into 10 discrete hours and check them individually, is there any way to calculate the probabilities for how many of them have at least 2 events?
Am I correct that this is equivalent to saying “roll a d10 twenty times; what is the probability that a particular number 0-9 is rolled at least twice”?
I think it’s how many of the numbers came up twice or more.
Yes that would be exactly equivalent.
I’m getting:
- 0: 0%
- 1: 0%
- 2: 0%
- 3: 0.3%
- 4: 4.3%
- 5: 23%
- 6: 39.6%
- 7: 26%
- 8: 6.5%
- 9: 0.4%
- 10: 0%
ETA: this is a low-precision answer.
Ok thanks.
I don’t know a good way to get the answers except by random simulations.
My results are in the same ballpark as those of Snarky_Kong.
- 0: 0%
- 1: 0%
- 2: 0.003%
- 3: 0.257%
- 4: 4.37%
- 5: 22.2%
- 6: 40.0%
- 7: 26.7%
- 8: 6.09%
- 9: 0.368%
- 10: 0.002%
How many simulations?
300 million! (Computers are fast these days.)
I didn’t calculate error-bars. I did 3 runs of 100 million each and eye-balled the variation; there seemed to be slightly more sig-figs than I showed.
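For anyone who wants to reproduce numbers like these, here is a minimal simulation sketch in Python. It is not the code used for the results above; the function name `simulate` and its parameters are made up for illustration, and it assumes each event lands in a uniformly random hour, independently of the others.

```python
import random
from collections import Counter

def simulate(trials, events=20, slots=10, seed=None):
    """Estimate P(exactly k slots receive at least 2 events) for k = 0..slots."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(trials):
        # Drop each event into a uniformly random slot.
        per_slot = Counter(rng.randrange(slots) for _ in range(events))
        # Count the slots that ended up with two or more events.
        busy = sum(1 for n in per_slot.values() if n >= 2)
        tally[busy] += 1
    return [tally[k] / trials for k in range(slots + 1)]

if __name__ == "__main__":
    for k, p in enumerate(simulate(100_000, seed=1)):
        print(f"{k:2d}: {100 * p:.3f}%")
```

Pure Python like this is far too slow for hundreds of millions of trials, so it only illustrates the method; the rare outcomes (1, 2 or 10 busy hours) will mostly show up as 0 at this sample size.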
Thanks a lot. That should make the results as good as they can be.
I just used a computer algebra system. You don’t need to simulate anything, just symbolic algebra.
0: 0.000% (i.e. zero)
1: 0.000% (i.e. a probability of 0.0000000123320412181)
2: 0.003%
3: 0.257%
4: 4.371%
5: 22.2%
6: 40.0%
7: 26.7%
8: 6.09%
9: 0.368%
10: 0.002%
Do these results mean that there’s a 40% chance that 6 one-hour time slots have at least 2 events?
(I’m not a mathematician)
That’s for the OP to say, but that is how I interpreted it: you have 20 balls, and you throw them randomly into 10 bins. Then you ask what the probability is that exactly 6 bins received at least 2 balls, and you get 40%.
Thanks.
I did some simulations too (a bit over 4 billion of them).
Exactly 1 hour with at least 2 events occurs 0.000001247% of the time
2 hours occur 0.00283% of the time
3 hours occur 0.2565% of the time
4 hours occur 4.37% of the time
5 hours occur 22.2% of the time
6 hours occur 40.03% of the time
7 hours occur 26.67% of the time
8 hours occur 6.1% of the time
9 hours occur 0.368% of the time
10 hours occur 0.00238% of the time
Further thoughts:
There are 10^20 possible outcomes.
For the case of only 1 hour having 2 or more events, there are
10 ways for that to happen with all the events in one hour
90 ways (10 times 9 choose 1) for that to happen with 19 events in one hour and 1 in a different hour
360 ways (10 times 9 choose 2) for that to happen with 18 events in one hour, and two in two different slots in the 9 hours remaining
and so on.
That gives a total of 5120 possible ways to end up with only one hour having 2 or more events (out of 10^20), which suggests my simulation is not doing an accurate job with the very unlikely events, which I can believe.
But working through the problem this way makes me think we’re dealing with something very much like entropy. The only-one-hour-with-2-or-more-events macrostate is very unlikely, and a small change (moving one event) probably changes the macrostate. The most likely case is the highest entropy, with small changes probably leaving you in the same macrostate.
What do our physicist friends think @Chronos and @Asympotically_fat ?
OK…
What happened to the 20 events? You mean 90 times 20
We had 1233204121810
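That figure can be checked with a short sum. The sketch below assumes the events are distinguishable (as in the d10 picture): pick the busy hour, choose which events fall outside it, and place those in distinct hours among the other nine. The function name `ways_one_busy_hour` is just for illustration.

```python
from math import comb, perm

def ways_one_busy_hour(events=20, slots=10):
    """Count assignments where exactly one slot holds 2 or more events."""
    total = 0
    # `outside` = events that do not fall in the busy slot. They must sit
    # in distinct other slots, and the busy slot must keep at least 2.
    max_outside = min(slots - 1, events - 2)
    for outside in range(max_outside + 1):
        total += (
            slots                          # which slot is the busy one
            * comb(events, outside)        # which events fall outside it
            * perm(slots - 1, outside)     # distinct slots for those events
        )
    return total

print(ways_one_busy_hour())           # 1233204121810
print(ways_one_busy_hour() / 10**20)  # about 1.23e-8
```

The `comb(events, outside)` factor is the "times 20" from the post above: with 19 events in one hour and 1 elsewhere, there are 20 choices for which event is the odd one out.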
I took this as a challenge and wrote a simple procedure that gives exact results. The procedure can be translated into cumbersome math formulae.
40.032742724298324% to be precise; or, if you prefer,
10008185681074581 / 25000000000000000
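One possible procedure of this kind, as a sketch and not necessarily the one used above, is to fill the hours one at a time and keep track of how many events have been placed and how many hours are already "busy". The name `exact_distribution` is invented for this example, and exact rational arithmetic comes from Python's `fractions` module.

```python
from fractions import Fraction
from math import comb

def exact_distribution(events=20, slots=10):
    """Exact P(exactly k slots receive at least 2 events), k = 0..slots."""
    # ways[(used, busy)] = number of ways to give `used` of the labelled
    # events to the slots handled so far, with `busy` of them holding >= 2.
    ways = {(0, 0): 1}
    for _ in range(slots):
        nxt = {}
        for (used, busy), count in ways.items():
            remaining = events - used
            for n in range(remaining + 1):
                # Choose which n of the remaining events go to this slot.
                key = (used + n, busy + (1 if n >= 2 else 0))
                nxt[key] = nxt.get(key, 0) + count * comb(remaining, n)
        ways = nxt
    total = slots ** events
    # Only states where every event has been placed count as outcomes.
    return [Fraction(ways.get((events, k), 0), total) for k in range(slots + 1)]

if __name__ == "__main__":
    for k, p in enumerate(exact_distribution()):
        print(f"{k:2d}: {float(100 * p):.10f}%")
```

Because everything is done with integers and `Fraction`, there is no rounding until the final conversion to a percentage for printing.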
I am not being coy (though I agree it is a fun challenge). If anyone wants, I am happy to post some cumbersome math formulae.
In fact, here you go:
We will use the interpretation in Post #2, or we can think of the periods as labelled A through J and form random 20-letter words like IFHJGDJFCFFDEAEIFHDI.
Now we want [the computer] to count how many letters show up at least twice. A quick way is to define an exponential generating function and introduce a variable u to mark the events of interest:
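One generating function that fits this description (a reconstruction under the interpretation above, not necessarily the exact form the poster used): a letter that appears zero times or once contributes 1 + t, and a letter that appears twice or more contributes the rest of the exponential series, marked with u. With ten letters:

```latex
F(t, u) = \left( 1 + t + u \left( e^{t} - 1 - t \right) \right)^{10}
```

Setting u = 1 collapses this to e^{10t}, whose t^20 coefficient times 20! is 10^20, the total number of words, which is a quick sanity check.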
Finally, read off the counts by taking the coefficient of t^20 and multiplying by 20!:
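Written out, and assuming the generating function is (1 + t + u(e^t - 1 - t))^10 as the description suggests, the number of words N_k in which exactly k letters appear at least twice would be:

```latex
N_k = 20!\,[t^{20} u^{k}] \left( 1 + t + u\left(e^{t} - 1 - t\right) \right)^{10}
    = 20!\,\binom{10}{k}\,[t^{20}]\,(1 + t)^{10-k} \left(e^{t} - 1 - t\right)^{k},
\qquad
P(k) = \frac{N_k}{10^{20}}
```

For k = 1 this reduces to 10 times the sum over j of C(20, j) * 9!/(9 - j)! for j = 0..9, which gives 1233204121810, the count quoted earlier in the thread.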
You’re right. I got stuck on the number of slots and not the number of events and got the wrong multiplication. Embarrassing.
Which is actually very close to my simulation. So that’s cool.