This is a specific case of the more general Vandermonde’s identity.
Beaten to the finish line by a mere 714 years by Zhu Shijie!
But random bits aren’t free. Generating even one random bit has a nonnegligible cost, certainly greater than the cost of one comparison. And if you’re using true random numbers instead of pseudorandom ones, then you want to budget those random bits carefully. Generating a random number from 1 to 32 requires only 5 bits, and if you have a standard package that generates 8 random bits at a time, you’re not going to want to just throw out the extra 3 every time you generate a number in that range. So “generate a random number in a given range” should not be counted as constant cost.
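To make the “don’t throw away the extra bits” idea concrete, here’s a minimal Python sketch (the name BitBuffer and the use of the secrets module are just for illustration, not anybody’s actual library) that buffers random bytes and hands out exactly as many bits as each request needs:

[code]
import secrets

class BitBuffer:
    """Illustrative sketch: hand out k random bits at a time from buffered bytes,
    so none of the 8 bits in each fetched byte get thrown away."""

    def __init__(self):
        self._bits = 0      # buffered random bits, packed into an int
        self._count = 0     # how many buffered bits are still unused

    def take(self, k):
        """Return an integer built from k random bits."""
        while self._count < k:
            # Pull one more random byte only when the buffer runs dry.
            self._bits = (self._bits << 8) | secrets.token_bytes(1)[0]
            self._count += 8
        self._count -= k
        result = self._bits >> self._count          # top k bits
        self._bits &= (1 << self._count) - 1        # keep the leftovers for later
        return result

buf = BitBuffer()
print(buf.take(5) + 1)   # a number from 1 to 32, consuming exactly 5 buffered bits
[/code]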
Just saying, you may want to double check anything else this team does related to statistics or probability…
Generating truly random bits with special hardware may indeed be expensive, but in practice, when cost is an issue, aren’t M “true” bits used as a PRNG seed to generate N bits, where N is much larger than M? How large can the ratio N/M get before this approach becomes cryptographically insecure?
Chronos, of course your point has some validity, but when does this become of practical relevance? There is a pretty good PRNG that generates 64 pseudo-random bits with 43 Intel[sup]®[/sup] instructions, so with a net cost of *less than 1 instruction per bit*, generating a random bit is certainly not greater than the cost of one comparison!
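For what it’s worth, here’s a Python transcription of splitmix64, a well-known branch-free 64-bit generator, just to show how little work 64 bits costs (a few adds, shifts, xors and multiplies per output word). I’m not claiming it’s the particular 43-instruction generator referred to above:

[code]
MASK64 = (1 << 64) - 1

def splitmix64(seed):
    """Branch-free 64-bit PRNG (splitmix64): a few adds, shifts, xors and
    multiplies per 64 output bits. Python transcription for illustration;
    a compiled version is only a handful of instructions per word."""
    state = seed & MASK64
    while True:
        state = (state + 0x9E3779B97F4A7C15) & MASK64
        z = state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        yield z ^ (z >> 31)

gen = splitmix64(12345)
print([hex(next(gen)) for _ in range(3)])   # three 64-bit pseudo-random words
[/code]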
High-quality PRNGs like Marsaglia’s KISS or the famous Mersenne Twister aren’t much slower than the pretty good generator just mentioned. It appears that my cheap laptop generates well over 1 billion pseudo-random bits per second with any of these generators. The overhead of saving unnecessary PRNG bits for later use may become significant, or even counterproductive. These PRNGs are branch-free code, which pipelines much better than the branching bookkeeping would.
And BTW, generating a random number from 1 to 17 is more difficult than generating a random number from 1 to 32, and will often require more than 5 bits! (In fact, if you insist that each of the 17 possibilities for x have probability exactly 1/17, there is no algorithm guaranteed to find x within any fixed number of random bits, because 17 doesn’t divide any power of 2; rejection sampling terminates with probability 1, but not with certainty. I’ll bet many programmers use a simple “good-enough” way to find x in which the wastage of bits from a 32-bit PRNG turns into a feature, not a bug!)
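For the exact-1/17 case, the standard trick is rejection sampling; here’s a minimal Python sketch (using secrets.randbits for the raw bits). It takes 32/17 ≈ 1.88 five-bit draws on average, but there is no fixed worst-case bound on the number of draws, which is exactly the point above:

[code]
import secrets

def uniform_1_to_17():
    """Exactly uniform value in 1..17 via rejection sampling of 5-bit draws.
    Terminates with probability 1, but with no fixed bound on the draws."""
    while True:
        x = secrets.randbits(5)      # uniform on 0..31
        if x < 17:                   # accept 17 of the 32 outcomes
            return x + 1             # shift to 1..17
        # otherwise reject (15 of every 32 draws, on average) and retry

counts = [0] * 18
for _ in range(100_000):
    counts[uniform_1_to_17()] += 1
print(counts[1:])    # roughly equal counts for each of 1..17
[/code]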
The cost of calling an RNG is of course of practical importance when evaluating an algorithm, but it’s irrelevant to the big-O complexity of the algorithm. Big-O doesn’t measure the absolute cost of the algorithm; it merely measures how the cost changes as the input size changes. If the input size changes from n to 10n, how much longer does the algorithm run? Is it twice as slow, 10 times slower, 100 times slower, or 10 billion times slower? An O(n[sup]2[/sup]) algorithm can be faster than an O(n) algorithm for small values of n. And “small” may even be large enough to cover all practical uses, in which case the O(n[sup]2[/sup]) algorithm is preferable. Evaluation of an algorithm must consider the actual runtime for practical values of n, but that doesn’t change the big-O value of the algorithm.
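To put some made-up numbers on that: say the O(n) algorithm costs roughly 1000·n operations (a big constant factor) while the O(n[sup]2[/sup]) one costs roughly 2·n[sup]2[/sup]. Then 2·n[sup]2[/sup] < 1000·n exactly when n < 500, so the “worse” algorithm wins for every input smaller than 500, even though the O(n) one eventually pulls ahead.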
Not true: multiplication is only O(n log n) (actually, there might be another factor of log(log(n)) in there; I need to work it out again). Multiplication is just convolution of two digit vectors, which can be done in n log n time by first taking the Fourier transform of each one, multiplying them term by term, doing the inverse transform, and then performing the carry operation. All of these steps are O(n log n) or less.
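Here’s a minimal Python sketch of that convolution-plus-carry recipe, using numpy’s floating-point FFT. It is only good up to sizes where the round-off error stays under 0.5 per coefficient, which is exactly the precision issue discussed below:

[code]
import numpy as np

def fft_multiply(a, b):
    """Multiply two non-negative integers via FFT convolution of their digits.
    Sketch only: double-precision FFT limits the usable size before round-off bites."""
    da = [int(d) for d in str(a)][::-1]        # base-10 digits, least significant first
    db = [int(d) for d in str(b)][::-1]
    n = 1
    while n < len(da) + len(db):               # pad to a power of two
        n *= 2
    fa = np.fft.rfft(da, n)                    # forward transforms
    fb = np.fft.rfft(db, n)
    conv = np.fft.irfft(fa * fb, n)            # pointwise product, then inverse transform
    sums = [int(round(c)) for c in conv]       # each entry is a digit-product sum, maybe > 9
    carry, out = 0, []
    for s in sums:                             # propagate carries
        carry, r = divmod(s + carry, 10)
        out.append(r)
    while carry:
        carry, r = divmod(carry, 10)
        out.append(r)
    return int(''.join(str(d) for d in reversed(out)).lstrip('0') or '0')

x, y = 123456789, 987654321
assert fft_multiply(x, y) == x * y
print(fft_multiply(x, y))
[/code]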
The classic Schönhage–Strassen algorithm is O(n log n log log n). If you really want to tweak things, then there’s Martin Fürer’s algorithm with time complexity O(n log n 2[sup]Θ(log* n)[/sup]). (It’s barely an improvement, but it does show that the bound is more likely to be just Θ(n log n).) If you have constant-time log n operations from hardware, then you can go lower.
markn+'s comment on it being O(n[sup]2[/sup]) is common. Too often people assume that the method they are taught in school for an operation is the fastest. A simple algorithm (Karatsuba’s) reduces the exponent from 2 to log[sub]2[/sub] 3 ≈ 1.585 quite easily. Lots of examples of this sort of thing.
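That simple algorithm is Karatsuba’s trick: split each number in half and spend three recursive multiplications instead of four. A minimal Python sketch on built-in non-negative integers:

[code]
def karatsuba(x, y):
    """Karatsuba multiplication: three recursive multiplies instead of four,
    giving O(n^(log2 3)), roughly O(n^1.585), instead of the schoolbook O(n^2)."""
    if x < 10 or y < 10:                        # small enough: multiply directly
        return x * y
    m = max(x.bit_length(), y.bit_length()) // 2
    hi_x, lo_x = x >> m, x & ((1 << m) - 1)     # split x = hi_x*2^m + lo_x
    hi_y, lo_y = y >> m, y & ((1 << m) - 1)     # split y = hi_y*2^m + lo_y
    a = karatsuba(hi_x, hi_y)                   # high * high
    b = karatsuba(lo_x, lo_y)                   # low * low
    c = karatsuba(hi_x + lo_x, hi_y + lo_y)     # (hi + lo) * (hi + lo)
    # the cross terms are c - a - b, so only three multiplications are needed
    return (a << (2 * m)) + ((c - a - b) << m) + b

print(karatsuba(123456789, 987654321) == 123456789 * 987654321)   # True
[/code]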
Yeah, as I recall, the limit for the “straightforward” fast algorithm is that the elements of your vector need to be big enough to hold components of size O(n), which means you need O(log(n)) bits of precision, which gives the extra log(log(n)) dependency.
Of course, we could make the same argument as above; 64-bit math is enough for billions of digits, and is constant time on modern machines. So we might just say that multiplication is O(n log n) for n up to the billions.