A similar thing that has bitten people many times in my industry: certain computational architectures support a fused-multiply-add instruction that computes A*B+C in one shot. Depending on exactly how the code got optimized, that sequence may end up getting fused or not, which alters the rounding behavior and thus the final result. Even within the same program there could be differences here.
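To see the effect in isolation, here’s a small C sketch (my own, not from any particular codebase): it compares x*x - 1.0 computed with the product rounded separately against an explicit fma() call from C99’s math.h, with x picked arbitrarily as 1 + 2^-30 so the low bits that fusion preserves actually show up in the output. Whether a compiler silently turns a plain a*b+c into an FMA depends on things like -ffp-contract, which is exactly the inconsistency being described. Link with -lm.
#include <math.h>
#include <stdio.h>
int main(void)
{
    double x = 1.0 + ldexp(1.0, -30);   /* x = 1 + 2^-30, exactly representable */
    /* Product rounded to double first, then the subtraction.  (An aggressive
     * compiler may contract even this into an FMA, e.g. under
     * -ffp-contract=fast -- which is the very inconsistency at issue.) */
    double separate = x * x - 1.0;
    /* fma(): a*b+c with a single rounding at the very end. */
    double fused = fma(x, x, -1.0);
    printf("separate: %.20e\n", separate);
    printf("fused:    %.20e\n", fused);
    return 0;
}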
Yeah, I wrote it the way I did in the hope of defeating, as much as possible, some attempts at optimization that some compilers might do.
Gaak! Whatever kind of optimization a compiler/interpreter may do, I’d at least hope it would be more consistent (and hence, predictable) than that. We’d like to hang onto our fantasy that computers are deterministic, but this calls even that into question!
Dude, that’s not even the half of it. This was a performance bug rather than a results bug, but the JIT once unrolled an O(n^2) loop for me that was taking forever. But it only unrolled it on Windows machines – and if it was a laptop, it would only do it if it was running on outlet power and not battery. Needless to say I was very confused when my groupmates on other systems yelled at me for making the game unplayable.
Edit:
And as for scaled integers: you absolutely have to remember the multiply-before-divide rule if you use them.
92 * (1/92) =/= (92*1)/92
in integer arithmetic.
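Spelled out as a compilable C snippet, in case anyone wants to watch it go wrong:
#include <stdio.h>
int main(void)
{
    /* Divide first: 1/92 is integer division and truncates to 0. */
    printf("92 * (1/92) = %d\n", 92 * (1 / 92));    /* 92 * 0  -> 0 */
    /* Multiply first: the intermediate product survives the division. */
    printf("(92*1)/92   = %d\n", (92 * 1) / 92);    /* 92 / 92 -> 1 */
    return 0;
}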
Can it? It can only show values in decimal, but the ‘>’ test is being done in binary.
Not necessarily. If you enable that option, it’s possible they use a decimal format or some other trick: either a tedious string compare, or a Rational type. Note that when you enable this option you have to pick a decimal level of precision. Consider
type Rat struct {
Numerator, Denominator int
}
If I have 2.53, I can express this as Rat{253, 100}, both components of which can be expressed in binary with perfect precision. While it’s a bit more costly in terms of memory and performance, you get exact equality and ordering (<, >) tests on this type, using the same methods you learned for adding fractions with unlike denominators in grade school. In fact, it should be relatively cheap, since all numbers should have the same Denominator component (in this case, 100); the only relevant bit is the ordering of numerators. Though whether the denominators stay 100 depends, I suppose, on how you handle division.
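For what it’s worth, here’s a minimal C sketch of the same idea (the Rat and rat_less names are mine, not from any library). The ordering test is just grade-school cross-multiplication, and when the denominators match it collapses to comparing numerators:
#include <stdio.h>
/* Same idea as the Go struct above, written out in C. */
struct Rat {
    long long num;   /* numerator */
    long long den;   /* denominator, assumed positive */
};
/* a < b  <=>  a.num * b.den < b.num * a.den
 * (cross-multiply, as with unlike denominators in grade school;
 * exact as long as the products don't overflow). */
int rat_less(struct Rat a, struct Rat b)
{
    return a.num * b.den < b.num * a.den;
}
int main(void)
{
    struct Rat x = { 253, 100 };   /* 2.53 */
    struct Rat y = { 260, 100 };   /* 2.60 */
    printf("2.53 < 2.60 ? %s\n", rat_less(x, y) ? "yes" : "no");
    return 0;
}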
What kind of internal criteria do you suppose the JIT was using to decide when to do that and when not to?
It’s not a big secret that the JVM is better optimized on Windows than on other platforms; I assume the logic simply didn’t exist in the OS X and Linux implementations. As for the plugged-in/on-battery thing, I assume it has something to do with clock speed. Windows laptops underclock their components when running off battery power so the battery lasts longer, and my guess is the JIT decides it can’t unroll loops or do other expensive optimizations when the clock is below some acceptable level, so while Windows has the machine underclocked the loop stays rolled up.
Gotta be careful! You could sum up:
int sum = 0 ;
for ( int i = 1 ; i < ∞ ; i++ ) {
sum = sum + i ;
}
and, if the compiler optimized stupidly, you could end up with sum = -1/12
(But I suppose no matter how the compiler does or doesn’t optimize it, it would take O(∞) time. )
BTW, the Kahan summation algorithm may be worth knowing, for programmers who lose precision during very long summations. I do very little floating-point work myself, so have seldom if ever needed it. (I use it sometimes anyway, just as a fetish :rolleyes: ) If there’s interest, I’ll post demonstration C code.
No … I’ll just post demonstration code despite total lack of interest.
/*
* This is a demonstration of floating-point addition.
* We sum 1 + 1/2 + 1/3 + ... + 1/100000000 four different ways.
* (a) sum ordinarily, largest addends first
* (b) sum ordinarily, smallest addends first
* (c) sum with error carries, per William Kahan
* (d) predict the correct sum, using Euler's formula
*
* To emphasize summation precision problems, we use 'float' sums
* rather than 'double' sums.
*/
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
typedef float Real;
struct Floatsum {
    Real sum, carry;
};

/* Plain accumulation: every add rounds, and whatever is lost stays lost. */
void usual_add(double addend, struct Floatsum *kp)
{
    kp->sum += addend;
}

/* Kahan (compensated) summation: 'carry' remembers the rounding error
 * from the previous add and folds it back into the next addend. */
void kahan_add(double addend, struct Floatsum *kp)
{
    Real y = addend - kp->carry;     /* addend corrected by the carried error */
    Real tmp = kp->sum + y;          /* big + small: low bits of y may be lost */
    kp->carry = (tmp - kp->sum) - y; /* (what was added) - (what we meant to add) */
    kp->sum = tmp;
}

int main(void)
{
    struct Floatsum J;
    int k, Many = 100000000;

    /* Sum the first Many reciprocals. */

    /* (a) naive addition, largest addends first */
    J.sum = 0;
    for (k = 1; k <= Many; k++)
        usual_add(1.0 / k, &J);
    printf("%f\n", J.sum);

    /* (b) small-to-large addition */
    J.sum = 0;
    for (k = Many; k >= 1; k--)
        usual_add(1.0 / k, &J);
    printf("%f\n", J.sum);

    /* (c) Kahan adds */
    J.sum = J.carry = 0;
    for (k = 1; k <= Many; k++)
        kahan_add(1.0 / k, &J);
    printf("%f\n", J.sum);

    /* (d) Euler's formula: H_n ~ ln(n) + gamma + 1/(2n) */
    printf("%f\n", log(Many + 0.0) + 0.57721566 + 0.5 / Many);

    return 0;
}
I’m not familiar with the arrow operator used there. Is that the same thing as the dot used in structs?
Equivalent. If you’ve got a pointer to a struct, you use the -> operator to say ‘get me this member of the struct it points to.’ It’s just like using the dot for a member of an ordinary struct variable.
The operator ‘->’ is an unnecessary frill in C. The expression
P->B
is simply shorthand for
(*P).B
It provides a “smoother-looking” way to follow pointer chains. For example, here’s a line from the Linux kernel inode code:
bdi = sb->s_bdev->bd_inode->i_mapping->backing_dev_info;
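A toy example of the equivalence, for anyone following along at home (the struct and variable names are invented for illustration):
#include <stdio.h>
struct Point { int x, y; };
int main(void)
{
    struct Point pt = { 3, 4 };
    struct Point *p = &pt;
    /* All three expressions read the same member. */
    printf("%d %d %d\n", pt.x, p->x, (*p).x);
    return 0;
}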
If you were writing cash-register software or any kind of business or accounting software you would probably do what the entire industry does and that is use decimal/numeric data types instead of floating point.
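Standard C has no portable decimal type, so when you’re stuck in plain C the usual stand-in is the scaled-integer idea from earlier in the thread: keep money as an integer count of cents and only format it as dollars at the edges. A rough sketch, with made-up helper names and a deliberately truncating tax step (a real system would pick an explicit rounding rule):
#include <stdio.h>
/* Money held as an integer count of cents -- no binary-fraction rounding. */
typedef long long cents_t;
/* Hypothetical helper: print a cent count as dollars and cents. */
static void print_money(const char *label, cents_t c)
{
    printf("%s$%lld.%02lld\n", label, c / 100, c % 100);
}
int main(void)
{
    cents_t price = 1999;        /* $19.99 */
    cents_t total = 3 * price;   /* exact: 5997 cents */
    /* 7% tax, multiply before divide (the scaled-integer rule above),
     * truncated toward zero for simplicity. */
    cents_t tax = (total * 7) / 100;
    print_money("subtotal: ", total);
    print_money("tax:      ", tax);
    print_money("total:    ", total + tax);
    return 0;
}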