To go “outside the box” and answer the question from another perspective…
Because the standard that specifies this type of binary number (IEEE 754) calls it “floating point”. Regardless of whatever reason the standard had for choosing that name, once it’s in the standard, it’s gospel.
Indeed, I’ve encountered this problem in scientific coding. The quick-and-dirty solution is to store the logs of the variables instead of the variables themselves (and if the logs of your variables are growing out of control, forget it, it’s hopeless). But then you need to work out a way to add such variables together, which is a royal pain. The real solution is to look at your actual equations and do the math yourself, instead of relying on the computer, to convert them into a form that doesn’t produce the indeterminate forms in the first place.
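For what it’s worth, the usual way to handle the “add two variables that are stored as logs” problem is the log-sum-exp trick. Here’s a minimal sketch in C (my own code; the function name log_add is mine, not from any library):

#include <math.h>
#include <stdio.h>

/* Add two numbers that are stored as natural logs, returning
 * log(exp(la) + exp(lb)) without ever forming exp(la) or exp(lb) directly.
 * Factoring out the larger value keeps exp() from overflowing or underflowing. */
double log_add(double la, double lb)
{
    if (la < lb) { double t = la; la = lb; lb = t; }  /* make la the larger one */
    return la + log1p(exp(lb - la));                  /* la + log(1 + e^(lb - la)) */
}

int main(void)
{
    /* e^-800 and e^-805 both underflow to 0.0 as plain doubles,
     * but their sum is still perfectly well defined in log form. */
    printf("log of sum = %f\n", log_add(-800.0, -805.0));  /* about -799.9933 */
    return 0;
}

It’s still a royal pain if you have long sums of such terms, but at least each pairwise add is cheap and safe.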
This is why a serious programmer, who expects to be doing mathematical programming, needs to take a class in numerical analysis. Is that still required of Computer Science majors these days? That is a full-semester class in doing heavy-duty computational mathematics, where you study how to do exactly this sort of stuff.
Note that you can’t solve all problems simply by using double-precision floats instead of single-precision, or by using quadruple-precision instead of double-precision. If the problem is that you need 18 significant digits instead of only 8 to compute the trajectory to get your rocket ship to Jupiter, then using greater precision may be the solution. If you’re computing sums of infinite series by adding up successive terms until you’ve got the precision you need, that often won’t work. You’ll end up with your 18 or whatever “significant” digits, but they will be random gibberish.
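A concrete illustration of that last point (my own toy example, not anything from a real rocket program): evaluate e^-20 from its Taylor series by adding successive terms until they stop mattering. Every term is computed to full double precision, the loop “converges”, and the answer is still mostly noise, because the terms near n = 20 are around 4 × 10^7 while the true answer is around 2 × 10^-9, so the cancellation wipes out essentially all the significant digits.

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Sum the Taylor series for e^x at x = -20, term by term,
     * until adding another term no longer changes the sum. */
    double x = -20.0, term = 1.0, sum = 1.0;
    for (int n = 1; n < 200; n++) {
        term *= x / n;            /* term is now x^n / n! */
        double next = sum + term;
        if (next == sum)          /* "converged" to machine precision */
            break;
        sum = next;
    }
    printf("series sum: %.15e\n", sum);      /* wrong in almost every digit */
    printf("exp(-20):   %.15e\n", exp(x));   /* about 2.061e-09 */
    return 0;
}

Computing e^20 by the same series and taking the reciprocal works fine, because then all the terms are positive and nothing cancels, which is exactly the “rearrange the math yourself” point above.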
There is also the problem of using computed values in comparisons. Suppose you’ve added a bunch of floats, and you want to see if the result is < 10.0 thus:
if ( sum < 10.0 ) . . .
It’s entirely possible the sum should be exactly 10.0 but comes out a little below it due to rounding error, so the branch goes the wrong way, and no amount of additional precision will fix it. This is a very common problem in carelessly written financial software, where accuracy down to the exact penny is required. (I’ve spent, and still spend, a lot of time working with a financial application that is full of bugs like this.)
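If you want to watch it happen, here’s about the smallest possible demonstration, in C (mine, not from the financial application in question):

#include <stdio.h>

int main(void)
{
    /* Add up 100 payments of $0.10.  The sum "should" be exactly 10.0,
     * but 0.1 has no exact binary representation, so it isn't. */
    double sum = 0.0;
    for (int i = 0; i < 100; i++)
        sum += 0.10;

    printf("sum = %.17f\n", sum);     /* prints a value just under 10 */
    if (sum < 10.0)
        printf("the branch goes the wrong way\n");
    return 0;
}

The usual fixes are to compare against a tolerance instead of testing exact values, or, in money code, to do the arithmetic in integer cents in the first place.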
Recommended reading: We discussed that in some detail a while back, in this thread, from February 2012, starting in post #4.
For additional introductory reading, also check out this thread, from June 2012. There is a lot of discussion there about what constitutes good programming practices.
This is a problem with how you’re doing the division, nothing to do with the following multiply. Your answer for just the division 92.1/100.1 should be 0.9.
Then you’d end up with 92.1 / 100.1 * 100.1 = 90.0 or 90.1. Not quite as egregious.
Well, yes, but the point is that most fixed-point arithmetic uses scaled ints, and if you use scaled ints you don’t really have a choice about how the division is done, since you’re a slave to how the ALU does it.
li $t0, 921 # t0 = 921
li $t1, 1001 # t1 = 1001
div $t0, $t1 # stores result of t0/t1 in lo
mflo $v0 # get result of division
(Or the equivalent in your favorite assembler) is always going to result in the value in $v0 being 0. Hence you multiply before you divide, because otherwise, for any a/b with a < b, you get a/b = 0; even 999/1000 == 0. If you don’t, you lose all of the scaling precision between the two integers.
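Same thing in C, if assembly isn’t your thing (my translation of the snippet above):

#include <stdio.h>

int main(void)
{
    /* 921 and 1001 are 92.1 and 100.1 scaled by 10.  Integer division
     * truncates toward zero, so any a/b with a < b is simply 0, and no
     * later multiply can bring the lost digits back. */
    printf("%d\n", 921 / 1001);           /* 0 */
    printf("%d\n", 999 / 1000);           /* 0 */
    printf("%d\n", (921 / 1001) * 1001);  /* still 0, not 921 */
    return 0;
}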
If I have 16 bits, and I want to be able to store a number with precision to one part in a thousand (.001), I need to dedicate 10 bits to the fractional part, leaving only 6 bits for the integer part. This means I can only store numbers from 0.000 to 63.999 (or half that range if I need both positive and negative numbers). This is a big problem in microcontroller code, where memory is tight and the word size is small. If you don’t pick your fixed-point representation correctly at the outset, you may find that something simple (like multiplying by a scale factor) will cause the result to exceed the 16 bits available.
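Here’s roughly what that looks like in C, using the 6.10 split described above (a sketch; the macro names and the sample values are mine):

#include <stdint.h>
#include <stdio.h>

/* Unsigned 16-bit fixed point: 6 integer bits, 10 fractional bits.
 * The stored integer is the real value times 1024, so the representable
 * range is 0.000 through 63.999 in steps of about 0.001. */
#define FRAC_BITS 10
#define TO_FIX(x)  ((uint16_t)((x) * (1u << FRAC_BITS) + 0.5))
#define TO_DBL(f)  ((double)(f) / (1u << FRAC_BITS))

int main(void)
{
    uint16_t reading = TO_FIX(40.0);        /* stored as 40 * 1024 = 40960 */

    /* Multiply by a scale factor of 2.  The true result, 80.0, needs 7
     * integer bits, so the 16-bit value silently wraps around. */
    uint16_t scaled = (uint16_t)(reading * 2);

    printf("reading = %.3f\n", TO_DBL(reading));  /* 40.000 */
    printf("scaled  = %.3f\n", TO_DBL(scaled));   /* 16.000 -- not 80.0 */
    return 0;
}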
Or alternately, you choose appropriate numbers of digits for the various quantities you’re working with, such that anything you’re dividing will always come out with reasonable precision. Which admittedly might not always be possible, but sometimes it will be.
And I can’t speak for computer scientists, but I was required to take a numerical-analysis course as a physicist. We didn’t cover fixed point much, though: most of what I know about the practicalities of that, I learned from video game theorycrafting (fixed-point operations used to be really popular in video games, because a little lost precision isn’t usually a big deal in a game, but speed is).
When you calculate A/B, somewhere you need to multiply by 10. If I have A = 100.1 and B = 92.1, and I do the integer division, I get 1001/921 = 1. But the integer 1 doesn’t represent 1.0, it represents 0.1. You need to multiply by 10 to get the integer that represents 1.0.
So when you’re calculating 92.1/100.1, you can do that multiply-by-10 first instead of last, giving you (921 * 10) / 1001 = 9 (which represents 0.9).
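In C, the two orderings side by side (my sketch, same one-decimal-place scaling as above):

#include <stdio.h>

int main(void)
{
    /* One decimal place of fixed point: 921 means 92.1, 1001 means 100.1,
     * and a result of 9 means 0.9. */
    int a = 921, b = 1001;

    int divide_first   = (a / b) * 10;   /* 0 * 10    = 0 -> 0.0, useless */
    int multiply_first = (a * 10) / b;   /* 9210/1001 = 9 -> 0.9, correct */

    printf("%d %d\n", divide_first, multiply_first);
    return 0;
}

The catch, as noted earlier, is that the intermediate a * 10 has to fit in your integer type, which is exactly the headroom problem from the 16-bit example above.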
Yeah, no kidding. I wish I had, but I don’t remember it being recommended for computer engineering students back in the 70’s.
Another one programmers routinely get wrong is probability. Most of the time, we don’t need to know probability, but every now and then a serious probability problem comes up, and I usually chuckle inside at their confidence in their answers, remembering how easy it was to get it wrong in the one prob/stat course I did take way back when. IMHO, a big company should have a small applied math department available when needed for these kinds of questions. Or perhaps an applied mathematician on retainer, a prof at a nearby university. Oh well!
I’m delighted that I can get the Monty Python problem correct. When it gets much more complicated than that (just about anytime I can’t enumerate all the cases), I lose confidence pretty fast.
OK, I’m lost. What is the Monty Python problem? I’m guessing it’s some reference to “Three is the number to which thou shalt count”, but I can’t see how to make that relevant to programming.