On the MathWorld site, it gives a formula for calculating the probability of rolling a total of p on n dice with s sides each. It starts by saying the number of possible ways to roll this total is the coefficient of the x^p term in the multinomial:
(x + x^2 + … + x^s)^n
The rest of the article is simply about how to calculate this multinomial coefficient using a number of binomial coefficients. But it doesn’t elaborate on why the multinomial coefficient itself is the correct answer, treating it as self-evident. However, it doesn’t seem obvious to me. How can we prove that this is the case?
Consider what happens when you distribute out the calculation of (some sum of powers of x) * (some sum of powers of x) * … * (some sum of powers of x).
Each way of choosing one term (i.e., one power of x) from each factor contributes 1 to the coefficient of x[sup]r[/sup], where r is the sum of the powers you chose.
Thus, the overall coefficient of x[sup]r[/sup] is the total number of ways to choose one term from each factor with powers that add up to r.
In your particular case, each of the n factors corresponds to a particular die, with the s terms in each factor corresponding to the possible outcomes for that die.
This is an application of what’s known as generating functions, which are pretty slick. A good introductory combinatorics/discrete math book will have a section on generating functions, and there’s a lot more out there.
A fairly straightforward induction argument will do the trick. I can supply it upon request.