If you just want a random number between 0 and 1, though, imagine simply rolling a 10-sided die to generate digits. Why go through a Gaussian distribution if you do not need to?
What are we using the random number for?
How quickly do we need random numbers?
How well do we know the actual distribution of the physical source of randomness?
As I was typing that up I could not recall whether that two-dice distribution was actually Gaussian or just an example of something symmetrical with a peak in the middle. IIRC all Gaussians are symmetrical with a single peak in the center, but the converse is not true: Gaussians are a proper subset of the single-peaked symmetrical distributions.
And I was too lazy to look it up. Hence my weasel words.
Does anybody do that? The Box-Muller transform is used widely in the other (non-inverse) direction to turn a source of uniformly distributed random values into a source of Gaussian distribution random values.
(And to define it for folks: if you have two independent random numbers u_1 and u_2 distributed uniformly between 0 and 1, then the variables x_1 = \sqrt{-2\log u_1}\cos(2\pi u_2) and x_2 = \sqrt{-2\log u_1}\sin(2\pi u_2) are independent random numbers distributed as a standard normal, i.e. as a Gaussian with mean 0 and standard deviation 1.)
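For anyone who wants to play with it, here's a minimal Python sketch of that forward transform (the function name box_muller and the sanity check at the end are mine, not from any particular library):

```python
import math
import random

def box_muller(u1: float, u2: float) -> tuple[float, float]:
    """Map two independent uniform(0,1] draws to two independent N(0,1) draws."""
    r = math.sqrt(-2.0 * math.log(u1))   # the radius comes from u1
    theta = 2.0 * math.pi * u2           # the angle comes from u2
    return r * math.cos(theta), r * math.sin(theta)

# Sanity check: the outputs should have mean ~0 and variance ~1.
samples = []
for _ in range(50_000):
    u1 = 1.0 - random.random()           # shift [0,1) to (0,1] so log() is safe
    x1, x2 = box_muller(u1, random.random())
    samples.extend((x1, x2))

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(f"mean ~ {mean:.3f}, variance ~ {var:.3f}")
```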
To distill the point here: “uniformly distributed” isn’t a requirement on the output; it’s a requirement on the probability of the output. Drawing ten random digits uniformly distributed from 0 to 9 is very, very unlikely to get you a “uniform” set of all ten possible digits. But each digit had the same (hence: uniform) probability of being drawn each time.
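If anyone wants to see just how unlikely, here's a quick check (my own sketch): ten uniform draws land on all ten distinct digits only about 0.036% of the time, i.e. 10!/10^10.

```python
import random

trials = 100_000
hits = sum(len({random.randrange(10) for _ in range(10)}) == 10
           for _ in range(trials))
print(hits / trials)   # roughly 0.00036, i.e. 10!/10**10
```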
The Gaussian limit is of interest to be sure, but for two rolls of a die (or any other uniform distribution), the distribution of the sum is perfectly triangular. With more dice, you can read the probabilities off Pascal's triangle. For six dice:
P(6)  =   1/46656    (1 = 1st entry on row 6)
P(7)  =   6/46656    (6 = 2nd entry on row 7)
P(8)  =  21/46656    (21 = 3rd entry on row 8)
P(9)  =  56/46656    (56 = 4th entry on row 9)
P(10) = 126/46656    (126 = 5th entry on row 10)
P(11) = 252/46656    (252 = 6th entry on row 11)
and so on
With that many dice, you get a reasonably good bell curve made of piecewise 5th order polynomial bits.
This works with any number of dice. Just start on the row corresponding to the number of dice.
ETA: you do need to stop at the halfway point, though, i.e. for 2 dice, stop at P(7) and for 6 dice, stop at P(21). The probabilities are naturally symmetric about that point.
Welp, never mind that bit about reading everything straight off the triangle. That works only up to P(11); it gets much more complicated for the higher sums, where you have to subtract off a correction.
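If anyone wants to see exactly where the pattern breaks, here's a small Python sketch (mine) that builds the exact distribution of the sum by convolution:

```python
from collections import Counter

def dice_sum_counts(n_dice: int, sides: int = 6) -> Counter:
    """Exact number of ways to roll each possible total with n_dice dice."""
    counts = Counter({0: 1})
    for _ in range(n_dice):
        new_counts = Counter()
        for total, ways in counts.items():
            for face in range(1, sides + 1):
                new_counts[total + face] += ways
        counts = new_counts
    return counts

counts = dice_sum_counts(6)
for s in range(6, 13):
    print(f"P({s}) = {counts[s]}/{6 ** 6}")
# Sums 6..11 give 1, 6, 21, 56, 126, 252 -- matching the rows above --
# but P(12) is 456/46656, not the 462 the next row would suggest.
```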
Of course, you almost never actually have a true Gaussian, and you’ll certainly never get one by adding together dice, because NdM (n rolls of an m-sided die) always has a minimum possible value and a maximum possible value, while the tails of a true Gaussian extend infinitely. And there are also many real-world distributions that fail to be Gaussian by having fatter tails than a true Gaussian.
What makes Gaussians so useful is not that they’re common, but that they’re a good approximation for many real-world distributions, especially when you’re not too far from the center of the distribution.
I think one reason might be that you don’t fully trust your dice. If I want 10 decimal digits, I could roll a 10-sided die ten times, but if my d10 is biased, then my results will be, too. Even with a biased die, though, if I roll it ten times and add the results together, I’m going to get something very close to a Gaussian distribution (even if it’s not the same Gaussian distribution I expected).
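A quick simulation of that point, if it helps (the lopsided weights below are just something I made up):

```python
import random
from collections import Counter

def biased_d10() -> int:
    """A deliberately lopsided d10: the digit 9 is five times too likely."""
    return random.choices(range(10), weights=[1] * 9 + [5])[0]

# Sum ten rolls at a time and tally the totals.
totals = Counter(sum(biased_d10() for _ in range(10)) for _ in range(100_000))
for total in sorted(totals):
    print(f"{total:3d} {'#' * (totals[total] // 250)}")
# Still a bell-shaped pile, just centred near 58 instead of the fair die's 45.
```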
This will never be a practical way to unbias a die, though. Even after the upfront cost of normalizing the Gaussian, you would need a massive number of rolls per draw to wash out the discretization. No one would do this. Worst case, you would just generate a boring unbiased bitstring from the biased die outputs instead.*
*This could simply be via the “coin flip” unbiasing from upthread: Let rolls 0-4 be called output A and rolls 5-9 be called output B. Roll twice to get a pair of outputs. AB = bit 0. BA = bit 1. AA = BB = discard the outputs and try again. This will generate an unbiased string of 0s and 1s to use for whatever, including implementing an unbiased d10 despite the original entropy source being a biased d10. For maximum efficiency (optional), group the rolls into output buckets A and B in as close to a 50%/50% way as possible with the biased die.
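Here's what that scheme looks like in Python, as a sketch (biased_d10 is just a stand-in entropy source I made up, and I've skipped the optional regrouping for efficiency):

```python
import random

def biased_d10() -> int:
    """Stand-in for the biased entropy source: 0 comes up far too often."""
    return random.choices(range(10), weights=[4, 1, 1, 1, 1, 1, 1, 1, 1, 1])[0]

def unbiased_bit() -> int:
    """The pairing trick: A = roll in 0-4, B = roll in 5-9; AB -> 0, BA -> 1."""
    while True:
        first_is_a = biased_d10() < 5
        second_is_a = biased_d10() < 5
        if first_is_a != second_is_a:            # AA and BB get discarded
            return 0 if first_is_a else 1

def unbiased_d10() -> int:
    """Assemble 4 unbiased bits into 0..15 and reject anything above 9."""
    while True:
        value = sum(unbiased_bit() << i for i in range(4))
        if value < 10:
            return value

rolls = [unbiased_d10() for _ in range(20_000)]
print([rolls.count(d) for d in range(10)])       # each digit appears ~2000 times
```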
I am not suggesting anyone would or should; it was meant as an example. If your x_1 and x_2 are independent standard normals, then the inverse of the Box–Muller transform (which, yes, is actually used in the non-inverse direction) will be something like u_1 = e^{-(x_1^2+x_2^2)/2}, u_2 = \frac{1}{2\pi}\arctan(x_2/x_1).
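For what it's worth, here is a round-trip check of that inverse in Python (my sketch; note atan2 rather than a bare arctan, so the angle comes back in the right quadrant):

```python
import math
import random

u1 = 1.0 - random.random()          # (0, 1], keeps log() happy
u2 = random.random()

# Forward Box-Muller...
r = math.sqrt(-2.0 * math.log(u1))
x1 = r * math.cos(2.0 * math.pi * u2)
x2 = r * math.sin(2.0 * math.pi * u2)

# ...and back again; the mod 1 folds negative angles into [0, 1).
u1_back = math.exp(-(x1 ** 2 + x2 ** 2) / 2.0)
u2_back = (math.atan2(x2, x1) / (2.0 * math.pi)) % 1.0

print(u1, u1_back)   # should agree
print(u2, u2_back)   # should agree
```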
And, for the sake of completeness, the Box-Muller transform is computationally efficient, but it only works for moving between Gaussian and uniform distributions.
For any arbitrary distribution, though, you can move between that distribution and a uniform distribution by applying the cumulative distribution function or its inverse. Although this works for any distribution, it might not be the fastest way to do it on a large scale.
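Here's the inverse-CDF idea in a few lines of Python, using the exponential distribution as the example because its CDF inverts by hand (a sketch, not production code):

```python
import math
import random

# CDF of the exponential: F(x) = 1 - exp(-lam * x), so F^{-1}(u) = -ln(1 - u) / lam.
def exponential_from_uniform(lam: float = 1.0) -> float:
    u = random.random()                  # uniform on [0, 1)
    return -math.log(1.0 - u) / lam      # exponentially distributed with rate lam

# And the other direction: pushing a sample through its own CDF
# gives back a uniform value (the probability integral transform).
x = exponential_from_uniform(2.0)
u = 1.0 - math.exp(-2.0 * x)             # uniform on [0, 1)
print(x, u)
```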
If you look at algorithms like Fortuna (https://www.schneier.com/wp-content/uploads/2015/12/fortuna.pdf), what they do is ingest unpredictable data from several different sources, which you may not know everything about, and use it to seed a pseudo-random number generator in such a way that the output remains unpredictable.
For many purposes, it’s good enough to just take a completely predictable (but constantly changing) input, like the current system time, and use that to seed the PRNG.
Perhaps for simulations or something like that, but such an algorithm would be completely unacceptable for cryptographic purposes. An adversary could easily guess the seed in a small number of attempts, and then predict the sequence of “random” numbers.
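To make the contrast concrete, a small Python sketch of the two approaches (simulation-grade vs. cryptographic-grade):

```python
import random
import secrets
import time

# Simulation-grade: reproducible, and guessable by design.
sim_rng = random.Random(int(time.time()))
print([sim_rng.randint(0, 9) for _ in range(5)])
# An attacker who knows the seeding time to within a few seconds just tries
# each candidate seed and checks which one reproduces the observed output.

# Cryptographic-grade: draws from the OS entropy pool instead of a seed.
print([secrets.randbelow(10) for _ in range(5)])
```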