This is a specific case of the more general Vandermonde’s identity.
Beaten to the finish line by a mere 714 years by Zhu Shijie!
But random bits aren’t free. Generating even one random bit has a nonnegligible cost, certainly greater than the cost of one comparison. And if you’re using true random numbers instead of pseudorandom ones, then you want to budget those random bits carefully. Generating a random number from 1 to 32 requires only 5 bits, and if you have a standard package that generates 8 random bits at a time, you’re not going to want to just throw out the extra 3 every time you generate a number in that range. So “generate a random number in a given range” should not be counted as constant cost.
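To make the “don’t throw away the extra bits” idea concrete, here’s a minimal Python sketch (the name BitBuffer and the use of the secrets module are just for illustration, not anybody’s actual library) that buffers random bytes and hands out exactly as many bits as each request needs:

[code]
import secrets

class BitBuffer:
    """Illustrative sketch: hand out k random bits at a time from buffered bytes,
    so none of the 8 bits in each fetched byte get thrown away."""

    def __init__(self):
        self._bits = 0      # buffered random bits, packed into an int
        self._count = 0     # how many buffered bits are still unused

    def take(self, k):
        """Return an integer built from k random bits."""
        while self._count < k:
            # Pull one more random byte only when the buffer runs dry.
            self._bits = (self._bits << 8) | secrets.token_bytes(1)[0]
            self._count += 8
        self._count -= k
        result = self._bits >> self._count          # top k bits
        self._bits &= (1 << self._count) - 1        # keep the leftovers for later
        return result

buf = BitBuffer()
print(buf.take(5) + 1)   # a number from 1 to 32, consuming exactly 5 buffered bits
[/code]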
Just saying, you may want to double check anything else this team does related to statistics or probability…
Generating truly random bits with special hardware may indeed be expensive, but in practice, when cost is an issue, aren’t M “true” bits used as a PRNG seed to generate N bits, where N is much larger than M? How large can the ratio N/M get before this approach becomes cryptographically insecure?
Chronos, of course your point has some validity, but when does this become of practical relevance? There is a pretty good PRNG that generates 64 pseudo-random bits with 43 Intel[sup]®[/sup] instructions, so with a net cost of *less than 1 instruction per bit*, generating a random bit is certainly not greater than the cost of one comparison!
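For what it’s worth, here’s a Python transcription of splitmix64, a well-known branch-free 64-bit generator, just to show how little work 64 bits costs (a few adds, shifts, xors and multiplies per output word). I’m not claiming it’s the particular 43-instruction generator referred to above:

[code]
MASK64 = (1 << 64) - 1

def splitmix64(seed):
    """Branch-free 64-bit PRNG (splitmix64): a few adds, shifts, xors and
    multiplies per 64 output bits. Python transcription for illustration;
    a compiled version is only a handful of instructions per word."""
    state = seed & MASK64
    while True:
        state = (state + 0x9E3779B97F4A7C15) & MASK64
        z = state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        yield z ^ (z >> 31)

gen = splitmix64(12345)
print([hex(next(gen)) for _ in range(3)])   # three 64-bit pseudo-random words
[/code]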
High-quality PRNGs like Marsaglia’s KISS or the famous Mersenne Twister aren’t much slower than the pretty good generator just mentioned. It appears that my cheap laptop generates well over 1 billion pseudo-random bits per second with any of these generators. The overhead of saving unnecessary PRNG bits for later use may become significant, or even counterproductive. These PRNGs are branch-free code, which pipelines much better than the branching bookkeeping would.
And BTW, generating a random number from 1 to 17 is more difficult than generating a random number from 1 to 32, and will often require more than 5 bits! (In fact, if you insist that each of the 17 possibilities for x have probability exactly 1/17, there is no algorithm guaranteed to find x within any fixed number of random bits, because 17 doesn’t divide any power of 2; rejection sampling terminates with probability 1, but not with certainty. I’ll bet many programmers use a simple “good-enough” way to find x in which the wastage of bits from a 32-bit PRNG turns into a feature, not a bug!)
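For the exact-1/17 case, the standard trick is rejection sampling; here’s a minimal Python sketch (using secrets.randbits for the raw bits). It takes 32/17 ≈ 1.88 five-bit draws on average, but there is no fixed worst-case bound on the number of draws, which is exactly the point above:

[code]
import secrets

def uniform_1_to_17():
    """Exactly uniform value in 1..17 via rejection sampling of 5-bit draws.
    Terminates with probability 1, but with no fixed bound on the draws."""
    while True:
        x = secrets.randbits(5)      # uniform on 0..31
        if x < 17:                   # accept 17 of the 32 outcomes
            return x + 1             # shift to 1..17
        # otherwise reject (15 of every 32 draws, on average) and retry

counts = [0] * 18
for _ in range(100_000):
    counts[uniform_1_to_17()] += 1
print(counts[1:])    # roughly equal counts for each of 1..17
[/code]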
The cost of calling an RNG is of course of practical importance when evaluating an algorithm, but it’s irrelevant to the big-O complexity of the algorithm. Big-O doesn’t measure the absolute cost of the algorithm; it merely measures how the cost changes as the input size changes. If the input size changes from n to 10n, how much longer does the algorithm run? Is it twice as slow, 10 times slower, 100 times slower, or 10 billion times slower? An O(n[sup]2[/sup]) algorithm can be faster than an O(n) algorithm for small values of n. And “small” may even be large enough to cover all practical uses, in which case the O(n[sup]2[/sup]) algorithm is preferable. Evaluation of an algorithm must consider the actual runtime for practical values of n, but that doesn’t change the big-O value of the algorithm.
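To put some made-up numbers on that: say the O(n) algorithm costs roughly 1000·n operations (a big constant factor) while the O(n[sup]2[/sup]) one costs roughly 2·n[sup]2[/sup]. Then 2·n[sup]2[/sup] < 1000·n exactly when n < 500, so the “worse” algorithm wins for every input smaller than 500, even though the O(n) one eventually pulls ahead.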
Not true: multiplication is only O(n log n) (actually, there might be another factor of log(log(n)) in there; I need to work it out again). Multiplication is just convolution of two digit vectors, which can be done in n log n time by first taking the Fourier transform of each one, multiplying them term by term, doing the inverse transform, and then performing the carry operation. All of these steps are O(n log n) or less.
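Here’s a minimal Python sketch of that convolution-plus-carry recipe, using numpy’s floating-point FFT. It is only good up to sizes where the round-off error stays under 0.5 per coefficient, which is exactly the precision issue discussed below:

[code]
import numpy as np

def fft_multiply(a, b):
    """Multiply two non-negative integers via FFT convolution of their digits.
    Sketch only: double-precision FFT limits the usable size before round-off bites."""
    da = [int(d) for d in str(a)][::-1]        # base-10 digits, least significant first
    db = [int(d) for d in str(b)][::-1]
    n = 1
    while n < len(da) + len(db):               # pad to a power of two
        n *= 2
    fa = np.fft.rfft(da, n)                    # forward transforms
    fb = np.fft.rfft(db, n)
    conv = np.fft.irfft(fa * fb, n)            # pointwise product, then inverse transform
    sums = [int(round(c)) for c in conv]       # each entry is a digit-product sum, maybe > 9
    carry, out = 0, []
    for s in sums:                             # propagate carries
        carry, r = divmod(s + carry, 10)
        out.append(r)
    while carry:
        carry, r = divmod(carry, 10)
        out.append(r)
    return int(''.join(str(d) for d in reversed(out)).lstrip('0') or '0')

x, y = 123456789, 987654321
assert fft_multiply(x, y) == x * y
print(fft_multiply(x, y))
[/code]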
The classic Schönhage–Strassen algorithm is O(n log n log log n). If you really want to tweak things, then there’s Martin Fürer’s algorithm with time complexity O(n log n 2[sup]Θ(log* n)[/sup]). (It’s barely an improvement, but it does show that the bound is more likely to be just Θ(n log n).) If you have constant-time log n operations from hardware, then you can go lower.
markn+'s comment on it being O(n[sup]2[/sup]) is common. Too often people assume that the method they are taught in school for an operation is the fastest. A simple algorithm (Karatsuba’s) reduces the exponent from 2 to log[sub]2[/sub] 3 ≈ 1.585 quite easily. Lots of examples of this sort of thing.
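That simple algorithm is Karatsuba’s trick: split each number in half and spend three recursive multiplications instead of four. A minimal Python sketch on built-in non-negative integers:

[code]
def karatsuba(x, y):
    """Karatsuba multiplication: three recursive multiplies instead of four,
    giving O(n^(log2 3)), roughly O(n^1.585), instead of the schoolbook O(n^2)."""
    if x < 10 or y < 10:                        # small enough: multiply directly
        return x * y
    m = max(x.bit_length(), y.bit_length()) // 2
    hi_x, lo_x = x >> m, x & ((1 << m) - 1)     # split x = hi_x*2^m + lo_x
    hi_y, lo_y = y >> m, y & ((1 << m) - 1)     # split y = hi_y*2^m + lo_y
    a = karatsuba(hi_x, hi_y)                   # high * high
    b = karatsuba(lo_x, lo_y)                   # low * low
    c = karatsuba(hi_x + lo_x, hi_y + lo_y)     # (hi + lo) * (hi + lo)
    # the cross terms are c - a - b, so only three multiplications are needed
    return (a << (2 * m)) + ((c - a - b) << m) + b

print(karatsuba(123456789, 987654321) == 123456789 * 987654321)   # True
[/code]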
Yeah, as I recall, the limit for the “straightforward” fast algorithm is that the elements of your vector need to be big enough to hold components of size O(n), which means you need O(log(n)) bits of precision, which gives the extra log(log(n)) dependency.
Of course, we could make the same argument as above; 64-bit math is enough for billions of digits, and is constant time on modern machines. So we might just say that multiplication is O(n log n) for n up to the billions.