Calculating and comparing probabilities

Is there some relatively uncomplicated way or, failing that, some software, to calculate, for example, how often someone rolling 4D8 will outroll someone rolling 5D8? There surely must be a simpler way than counting the number of possible results and comparing them, but this mathematical idiot is at a loss.

This ought to get you partway there:
http://www.anwu.org/games/dice_calc.html

Here’s how I would approach the problem. Let S(n, k) denote the number of ways to express the integer n as a sum of k integers in the interval [1, 8]. Then the probability you want is given by sum(S(i, 4)/8[sup]4[/sup] * sum(S(j, 5)/8[sup]5[/sup], 5 ≤ j ≤ i - 1), 6 ≤ i ≤ 32).

I don’t know an explicit formula for S(n, k), but it can be calculated by the recurrence S(n, k) = sum(S(n - i, k - 1), 1 ≤ i ≤ 8) with the following base conditions:
[ul][li]S(n, k) = 0 if n < k[/li][li]S(n, k) = 0 if n > 8k[/li][li]S(n, 1) = 1 if 1 ≤ n ≤ 8[/li][/ul]It would be pretty straightforward to write a program to do the calculation.
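For instance, here’s a quick sketch of that calculation in MATLAB (my own stab at it, so treat it as illustrative rather than gospel):

% Build a table of S(n, k) = number of ways to write n as a sum
% of k integers in [1, 8].  Rows are n = 1..40, columns k = 1..5.
S = zeros(40, 5);
S(1:8, 1) = 1;                   % base case: S(n, 1) = 1 for 1 <= n <= 8
for k = 2:5
    for n = k:8*k                % S(n, k) = 0 outside k <= n <= 8k
        for i = 1:8
            if n - i >= 1
                S(n, k) = S(n, k) + S(n - i, k - 1);
            end
        end
    end
end

% P(4d8 beats 5d8) = sum over i of P(4d8 = i) * P(5d8 <= i - 1)
p = 0;
for i = 6:32
    p = p + (S(i, 4)/8^4) * sum(S(5:i-1, 5))/8^5;
end
p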

As a rough estimate, it ought to be close to 0.55.

The sums of dice rolls are approximately normally distributed, and the difference of two independent normally distributed variables is itself normal (with mean = difference of means, and variance = sum of variances).

If I did this right:
1d8 has mean = 4.5, var = 3.94
4d8 has mean = 18, var = 15.75
5d8 has mean = 22.5, var = 19.69

Then 5d8 - 4d8 has mean = 4.5, var = 35.44

We want the value corresponding to 0; I normalized and used this calculator to get the result.

Though it’s probably not too much work to get the exact answer, and ultrafilter’s approach looks like it’ll do it.

For this specific case, maybe. But if you’re wondering about the chance that a 20th level wizard would survive a great wyrm red dragon’s breath (20d4 vs. 12d12, IIRC), it might get hairier. Fortunately, though, for large numbers of dice, the Gaussian approximation improves, and with that many, the difference between the approximation and exact result would be negligible.

Priceguy, isn’t it nice that you don’t have to explain to this crowd what “5d8” means? :slight_smile:

This is why you should take a toad familiar. 20d4+20, FTW…

Oops, it seems I misread the OP. I calculated the odds that 5d8 beats 4d8. Funny thing was, I had the number right, then changed it because I’d written my post wrong.

Anyways, the odds of 4d8 beating 5d8 would be approximately .45, just to be complete.

That is nice. I actually thought about adding something lame like “five eight-sided dice” but thought better of it.

What is less nice, however, is that I am in fact a mathematical idiot and don’t understand most of the answers. I have no idea how to do what ultrafilter suggests and while panamajack’s post is mostly within my grasp I don’t know how he did the final calculation.

Is it possible to dumb it down a handful of shades?

Start with a single n-sided die roll, identified by the variable X.

The mean is (n+1)/2.

The variance is the expected value of X^2 minus the square of the mean.

The expected value of X^2 is (n+1)(2n +1)/6.

Thus the variance is (n^2 -1)/12.
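If you want to check those formulas numerically, a couple of lines of MATLAB (since that’s what gets used later in the thread) will do it:

x = 1:8;                         % the faces of a d8
mean(x)                          % 4.5, matching (n+1)/2
mean(x.^2) - mean(x)^2           % 5.25, matching (n^2 - 1)/12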

For 1d8, this gives mean=4.5 and variance=5.25

For 4d8, mean = 18 and variance=21

For 5d8, mean = 22.5 and variance=26.25

Then 5d8 - 4d8 has mean 4.5 and variance 47.25
We want the value corresponding to .5. Normalizing gives (.5-4.5)/47.25^.5= -.582

(to normalize, subtract the mean then divide by the square root of the variance. I used .5 instead of 0 because the dice can only have integer values while the normal distribution is continuous)
Enter this z-value into the calculator panamajack linked to.

You get .28 for the probability of 4d8 beating 5d8.
You can easily do the same for any number of dice with any number of sides. As mentioned, the estimation improves for more dice.
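If you’d rather let the computer do the arithmetic, here’s the whole estimate as a few lines of MATLAB (a sketch; I’m using erfc for the normal CDF so no toolbox is needed):

n = 8;                           % sides per die
mu = 5*(n+1)/2 - 4*(n+1)/2;      % mean of 5d8 - 4d8 = 4.5
v = (5 + 4)*(n^2 - 1)/12;        % variance of the difference = 47.25
z = (0.5 - mu)/sqrt(v);          % normalize, with continuity correction
p = 0.5*erfc(-z/sqrt(2))         % normal CDF at z; about .28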

Actually in the last post, I calculated the probability that 4d8 will beat or tie 5d8.

The estimated probability of 4d8 winning outright is only .24

Why did alive and panamajack get different results and different variances?

Because I figured it wrong … I normalized by dividing by the variance, instead of the square root of it (aka standard deviation, or sigma) :smack:

I knew I’d missed something there.

No more time to post now, but it appears alive calculated the initial variances based on normal approx., and I based them on the multinomial (which they actually are at the start). I’m not sure which gives a better estimate, though.

Okay, I have to seriously apologize for posting flat-out wrong information above. I clearly wasn’t thinking straight, and my answers were incorrect; especially my last post, which was early this morning before I was even thinking.

About the only thing I got right was that using the normal approximation is a decent approach; despite that, I ran a quick program in MATLAB to figure the exact answer.

I used two quick functions to compute joint probability distributions (a summing one and a difference one). To get 4d8 & 5d8, I ran the summing one 4 times, starting with 8 x 1 vectors that had each entry set to 1/8.
This is a fairly crude approach, though applicable to any distribution on the positive integers (and even then it could be done better). It generates a matrix of probabilities and then gathers the results where the sum or difference of indices is the same. Note that the starting distributions must live on the positive integers (since indices must be positive), but that’s okay for as far as we’re going.

The code for the two is essentially the same (I’m posting it all so you can see how it works):



% Calculate the joint distribution difference
% of two discrete distributions, x1-y1
% x1 = nx1 vector where x1(i) = probability x = i
% y1 = mx1 similarly for y1
% 
% x1 and y1 must be independent

% The result will be in the jsum/jdif array.
% jind is an index for a difference array 
% (for sum results, the index is the vector index) 

% set up a table multiplying the values in x1 & y1

[X1 Y1] = meshgrid(x1, y1);

JM = X1 .* Y1;


JM is essentially a multiplication table, with each entry equal to the product of the probability at the ‘top’ and the one at the ‘side’.
(I use quotes since those values aren’t actually in the matrix itself.)

From here, it’s slightly different:



clear jsum;
jsum = zeros((length(x1)+length(y1)),1);

xl1 = length(x1);
yl1 = length(y1);
for i = 1:yl1,
	for j = 1:xl1,
		jsum(i+j) = jsum(i+j) + JM(i,j);
	end;
end;


% This is for difference calculations
% (jind must cover every possible difference, from 1-length(y1)
% up to length(x1)-1)
jind = [(1 - length(y1)):(length(x1) - 1)]';
jdif = zeros(length(jind),1);

xl1 = length(x1);
yl1 = length(y1);

for i = 1:yl1,
	for j = 1:xl1,
		dind = j - i;
		k = find(jind == dind);
		jdif(k) = jdif(k) + JM(i,j);
	end;
end;

The last bit then displays a list of the joint difference distribution, and the probability of it being > 0.



%display the output
[jind jdif]

% find the index of the first positive difference (value 1)

gt0 = find(jind == 1);

% and display the total probability of a positive difference
sum(jdif(gt0:length(jind)))

The answer, finally, is: 0.2365
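Incidentally, summing independent dice is just convolving their distributions, so MATLAB’s built-in conv gets the same answer much more compactly. A quick sketch:

% One d8: entry j is P(roll = j)
d8 = ones(8, 1)/8;

% Repeated convolution builds up the sums
d4x8 = d8;
for i = 1:3
    d4x8 = conv(d4x8, d8);       % entry j is P(4d8 = j + 3)
end
d5x8 = conv(d4x8, d8);           % entry j is P(5d8 = j + 4)

% P(4d8 beats 5d8) = sum over i of P(4d8 = i) * P(5d8 <= i - 1)
p = 0;
for i = 6:32
    p = p + d4x8(i - 3)*sum(d5x8(1:i - 5));
end
p                                % 0.2365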

That is almost exactly what I got with the normal approximation. I just rounded up to .24.

The nice thing about using the normal approximation is that you can easily generalize it.
If you want to know the probability that k n-sided dice beat m n-sided dice, using the same formulas you get

mean= (m-k)(n+1)/2 and variance= (m+k)(n^2-1)/12

Normalizing, you get Z= (-.5 - mean)/variance^.5

Then plug Z into the normal CDF calculator, and you are done.
For example, if you want to know the probability that 6d10 beats 8d10,

mean= 2(11)/2= 11 and variance= (14)(99)/12= 115.5

Then Z= (-.5 - 11)/10.75= -1.07

So there is a .14 probability of 6d10 beating 8d10.
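To wrap that up in code, something like this MATLAB function should do it (a sketch, again using erfc for the normal CDF; the name beatprob is just made up for illustration):

% Approximate P(k n-sided dice beat m n-sided dice) with the
% normal approximation and a continuity correction
% ('beatprob' is just an illustrative name)
function p = beatprob(k, m, n)
mu = (m - k)*(n + 1)/2;          % mean of (m dice) - (k dice)
v = (m + k)*(n^2 - 1)/12;        % variance of the difference
z = (-0.5 - mu)/sqrt(v);         % normalize with continuity correction
p = 0.5*erfc(-z/sqrt(2));        % normal CDF at z
end

Then beatprob(6, 8, 10) returns the .14 above.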

What is the meaning of a negative probability? Is it something like the square root of -1 on the number line? I ask because, if you want to reduce a probability, or compare different probabilities, you always have to invoke a meaningless concept - the negative probability. Take the reduction of a 3/5ths probability to a 2/5ths probability. It means having to add -1/5th probability (not -1/5 the number, -1/5 the probability) to the former, but any negative probability is a meaningless concept, no? So how can arithmetic with probabilities work?

I have never encountered any need to invoke the concept of a negative probability, which, so far as I know, is completely meaningless.

As Chronos said, ‘negative probability’ is meaningless. If you are ever reducing a probability, you must thereby be increasing the probability that that event won’t occur.

Generally the only time you can directly subtract probabilities in a formula is when it’s known there’s no way the result will be negative. As an example, the probability that either of two events A or B will occur is P(A) + P(B) - P(AB). P(AB) means the case where they both occur, and clearly can’t be greater than either P(A) or P(B) alone.
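For instance, rolling one d8: P(even) = 1/2, P(greater than 4) = 1/2, and P(both) = 1/4 (only 6 and 8 qualify), so P(even or greater than 4) = 1/2 + 1/2 - 1/4 = 3/4. No negative quantity ever shows up.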

In case there was any confusion, in this thread we’ve also added & subtracted means (which are based on the values something may equal, not the probability that they equal it).