Simple probability, and World of Warcraft

I was trying to think of a clever, more universal example for this question, but I’ll go ahead and admit that what got me thinking about it was World of Warcraft, so I’ll just use that to demonstrate the problem I’m facing.

If an item has a 1 in 400, or 0.25%, chance of dropping off a boss, and you kill that boss 400 times, it is statistically likely that it should drop, right? But each time there is a 99.75% chance that it won’t drop, and that happening 400 times in a row would be represented by 0.9975 ^ 400 = ~0.37, or a 37% chance that it would not drop at all in 400 trials. So you really only have a 63% chance of seeing it. This is creating cognitive dissonance in me because I’m still thinking “statistically, it should be likely to happen in 400 trials”. 63% is more likely than not, but still seems low to me.
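(If anyone wants to check that arithmetic, here’s a quick Python sketch; the variable names are just my own, plugging in the 1-in-400 chance and the 400 kills from the example.)

```python
# Chance of an item with a 0.25% drop rate never dropping in 400 kills
p_drop = 1 / 400          # 0.25% drop chance per kill
kills = 400

p_no_drop = (1 - p_drop) ** kills   # 0.9975 ^ 400
print(p_no_drop)        # ~0.37 -- about a 37% chance it never drops
print(1 - p_no_drop)    # ~0.63 -- about a 63% chance of seeing it at least once
```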

So I figure the reason is because I’m still only looking at one “series” of 400. When looking at a coin flip, we know there is a 50% chance of landing heads yet it is not that rare for tails to come up twice in a row (25% chance), but in the long run it is going to even out. So what are the chances you will see it drop twice in 800 trials? Or see it 2500 times in 1,000,000 trials? Does it ever approach 1? And how is that computed?

If you flip a coin, there is a 50% chance of a head. If you flip a coin twice, this doesn’t equate to a 100% chance of a head, which I’m sure you would have no issue with. Is this not, in effect, the same situation?

So, applying this to the OP, you would not necessarily expect to see one drop out of 400 kills, but over a longer course you’d expect the ratio to correspond to 1:400. For example, about 10 out of 4,000.

Basically, the expected number of drops over 400 trials is, exactly, 1.

But there will be some sequences of 400 trials in which there will be 2 (or even more) drops – and some (~37%, as you calculated), where none will occur.

And yes, as **ShibbOleth** says, as the sequence gets longer, the number of drops you will see in each such sequence will be a “closer” match to the “expected” value.

You can model it as a Poisson distribution.

P(k) = L^k * e^(-L) / k!

where k is the number of successes and L is the expected number of successes in the length of time in question. If you’re not familiar with the “!” operator, it’s defined over the non-negative integers such that

0! = 1
1! = 1
2! = 1 * 2
3! = 1 * 2 * 3
4! = 1 * 2 * 3 * 4

n! = 1 * 2 * … * n

If you’re not familiar with e, it’s the natural logarithm base, approximately 2.718281828459045.

So when L = 1,
P(0) = 36.8%
P(1) = 36.8%
P(2) = 18.4%
P(3) = 6.1%
etc.

So that would model your case where there’s a 0.25% drop chance and 400 trials. Bumping it up to 4,000 trials would make L = 10, so:

P(0) = 0.0045%
P(1) = 0.045%
P(2) = 0.23%
P(3) = 0.76%
P(4) = 1.9%

P(8) = 11.3%
P(9) = 12.5%
P(10) = 12.5%
P(11) = 11.4%
P(12) = 9.5%
etc.
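
If you want to compute those numbers yourself, here’s a minimal Python sketch of that formula; the function name poisson_pmf is just my own, with L and k meaning the same things as above:

```python
import math

def poisson_pmf(k, L):
    """Poisson probability of exactly k successes when L successes are expected."""
    return L ** k * math.exp(-L) / math.factorial(k)

# 400 kills at a 1-in-400 drop chance: L = 1
for k in range(4):
    print(k, poisson_pmf(k, 1.0))     # ~0.368, 0.368, 0.184, 0.061

# 4,000 kills: L = 10
for k in (8, 9, 10, 11, 12):
    print(k, poisson_pmf(k, 10.0))    # ~0.113, 0.125, 0.125, 0.114, 0.095
```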

Oh, minor note, since I wasn’t clear. While you can model it as a Poisson process, it’s only an approximation, since the function will return nonzero probability for impossible values, e.g., for 400 trials the approximation says that you’ll get 800 successes

0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000047711303342725068197552% of the time.

Using a binomial coefficient will get you an exact answer: (aCb) * P(b)^b * (1 - P(b))^(a-b), where a is the number of kills, b is the number of drops, C is the choose function, and P(b) is the probability of a drop. If you are looking at a large number of trials this gets tricky, since a lot of calculators can’t handle it.
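
To illustrate, here’s a rough Python version of that exact calculation; math.comb handles the choose function (and dodges the calculator-overflow problem), and the variable names just mirror the a, b, and P(b) above, using the OP’s 1-in-400 drop chance:

```python
import math

def binom_pmf(a, b, p):
    """Exact probability of b drops in a kills, each kill having drop probability p."""
    return math.comb(a, b) * p ** b * (1 - p) ** (a - b)

p = 1 / 400
print(binom_pmf(400, 0, p))   # ~0.37 -- no drops in 400 kills
print(binom_pmf(400, 1, p))   # ~0.37 -- exactly one drop in 400 kills
print(binom_pmf(800, 2, p))   # ~0.27 -- exactly two drops in 800 kills (from the OP's question)
```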

It is incidentally very unlikely that an item that drops with probability p (with distinct trials independent, of course) will drop exactly Np times over N trials, right on the nose. Indeed, the expected value of |D - Np|, where D is the number of drops, grows on the order of sqrt(N). But precisely because of that, we can see that the expected value of |D/N - p| is on the order of 1/sqrt(N), and thus approaches 0. To put it plainly, it’s very unlikely that you’ll achieve D = N*p right on the nose, but it’s very likely that the ratio of drops to trials is very close to p (for large N, of course).
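
If you’d rather see that than take it on faith, here’s a quick Monte Carlo sketch; the trial counts and the number of simulated runs are arbitrary choices of mine:

```python
import random

p = 1 / 400   # drop chance per kill

# For increasing N, estimate E|D - N*p| and E|D/N - p| by brute-force simulation.
for N in (400, 4_000, 40_000):
    runs = 200
    total_abs_dev = 0.0
    for _ in range(runs):
        drops = sum(random.random() < p for _ in range(N))
        total_abs_dev += abs(drops - N * p)
    mean_abs_dev = total_abs_dev / runs
    # The first printed value grows roughly like sqrt(N); the second shrinks roughly like 1/sqrt(N).
    print(N, mean_abs_dev, mean_abs_dev / N)
```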

Speaking of which, with large numbers of trials, as always, the central limit theorem kicks in, and you can model it with a bell curve. As N increases, the probability distribution of number of drops approaches that of a bell curve with mean Np and standard deviation sqrt(p(1-p)*N). Snarky_Kong gives the exact answer, which is the binomial distribution, but that’s much more difficult to compute exactly for large N and the bell curve approximation will be very very good anyway.
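
As a sketch of how you’d actually use that bell curve (nothing beyond math.erf needed), here’s the approximate probability of landing between 5 and 15 drops over 4,000 kills; that interval is just an arbitrary example of mine, and the half-unit continuity correction is a standard tweak rather than anything from the posts above:

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

p = 1 / 400
N = 4_000
mu = N * p                          # expected drops = 10
sigma = math.sqrt(N * p * (1 - p))  # ~3.16

# Approximate P(5 <= drops <= 15), with a 0.5 continuity correction at each end
approx = normal_cdf(15.5, mu, sigma) - normal_cdf(4.5, mu, sigma)
print(approx)   # roughly 0.92
```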

Incidentally, I’m not sure why you indicate P(b) as a function of b. What did you mean by that? This formula only works when the probability of a drop is unvarying, and thus P will be a constant.

I should clarify that the bell curve approximation is only useful for estimating the probability that the proportion of drops to kills will fall within a particular nontrivial interval (e.g., between 30% and 60%). If you try to estimate the probability that the proportion will equal some particular number exactly, it’ll tell you 0, since, well, that is what the probability approaches as the number of kills increases.

Or, to be blunt about it,

No, the probability of this happening (that is, of the ratio of drops to kills exactly equalling the drop probability) actually approaches 0, except when the drop probability is 0 or 1, of course. The long-run evening out does not cause the ratio to tend to exactly equal the drop probability; rather, it causes the ratio to tend to be very close (closer and closer) to the drop probability, while very probably not being exactly equal.

(That exact equality isn’t eventually reached and maintained is obvious enough, when you think about it, from the fact that the number of kills usually isn’t an integer multiple of the reciprocal of the drop probability, so the expected number of drops usually isn’t even a whole number… though the actual deviation from exact equality will generally be more than such quantization forces. Like I said above, the absolute difference between the number of drops and the expected number of drops grows on the order of the square root of the number of kills.)

Ah, I just meant that it was the probability of that event occurring. In this case, P(b) = 1/400.

Ever heard of exponential notation? :rolleyes:

Or significant figures?

Ah, I see now. I was confused by the fact that you were using “b” to indicate both the number of drops and the event of a drop. But it’s all good now.