I’m playing around writing code in Python these days (hence my series of IT-related questions lately), and one of the little practice projects I’m doing involves, inter alia, totaling up the value of heaps of coins. So I get integers stating the number of coins of each denomination, then I multiply the number of quarters by 0.25, the number of dimes by 0.10, etc., and add all this up to get the total monetary value of whatever assortment of coins there is. So far so good.
The odd thing is that when, further down the road, I calculate sums or differences of several assortments of coins, I get results such as $4.510000027 (made-up example, but it is along these lines) where the correct result ought to be just $4.51. That really surprised me. The number of coins for each denomination is typed as an integer; the monetary value of each coin is a float number, but they’re all multiples of 0.01 (with 0.01 being the penny). So you would assume the final outcome of all these calculations should be a multiple of 0.01 as well.
It’s easy to fix this issue by simply rounding the intermediate results to two decimal places along the way, but I’d be curious how this arose. Is it that somewhere deep in the internal workings of the processor, a number such as 0.01 is really represented as a much more granular number, and these imprecisions got carried along in the various calculations? Along the lines of what caused that infamous Intel Pentium bug in the 1990s?
I never did much in Python, but floating-point imprecision like that is common in other languages. Can you tell us more about the environment you are working in? Any particular version of Python, on a PC, Unix machine, or other system? There is probably a spec somewhere that will tell you the parameters of the floating-point arithmetic.
However, as you describe it, you are only calculating sums and differences, which should not lead to such problems unless you are talking about billions of coins. The integral portion of the number is internally limited to some number of bits; any integer of greater magnitude than that will lose precision. Short of that kind of magnitude, you must be doing something besides adding and subtracting, unless this is a severe weakness in Python.
I would keep everything in integer cents until the final calculation. You could even avoid the final division by 100 by converting the value to a string and inserting the “.” appropriately (for display purposes).
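A rough sketch of what I mean (the counts, denominations, and names here are just made up for illustration):

    # Made-up example counts and denominations:
    counts = {"quarter": 3, "dime": 4, "penny": 2}
    value_in_cents = {"quarter": 25, "dime": 10, "penny": 1}

    # Everything stays in whole cents, so the arithmetic is exact:
    total_cents = sum(n * value_in_cents[coin] for coin, n in counts.items())

    # Convert to dollars-and-cents notation only for display:
    print(f"${total_cents // 100}.{total_cents % 100:02d}")   # $1.17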
As mentioned, this has little to do with Python and a lot to do with how floating-point numbers are represented internally.
Some fractions can be expressed exactly in both binary and decimal notation. For example, one-quarter is exactly 0.25 in decimal and exactly 0.01 in binary. But some fractions that can be expressed exactly in decimal notation cannot be expressed exactly as a binary number with a finite number of digits. For example, one-tenth is exactly 0.1 in decimal but 0.000110011001100… (the 0011 pattern repeating ad infinitum) in binary. This often causes rounding errors in floating-point operations. The issue is not limited to Python. See the page from Python’s official documentation, Floating Point Arithmetic: Issues and Limitations, for a succinct explanation.
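You can see this directly in Python, since a float will report the exact fraction it actually stores:

    # 0.25 is a power-of-two fraction, so it is stored exactly:
    print((0.25).as_integer_ratio())   # (1, 4)

    # 0.1 is not; the nearest double is a slightly different fraction:
    print((0.1).as_integer_ratio())    # (3602879701896397, 36028797018963968)
    print(0.1 + 0.2 == 0.3)            # False, because of that tiny discrepancy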
Workarounds can involve using Python’s Fraction objects (from the fractions module), which keep track of integer numerators and denominators separately, or Decimal objects (from the decimal module), which work in base ten so that decimal fractions stay exact.
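For example (a minimal sketch; note that Decimal values should be built from strings, not from floats, or you just capture the binary error all over again):

    from fractions import Fraction
    from decimal import Decimal

    # Exact rational arithmetic:
    print(Fraction(1, 10) + Fraction(1, 5))      # 3/10

    # Exact decimal arithmetic:
    print(Decimal("0.10") + Decimal("0.20"))     # 0.30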
The moral of this story is that you shouldn’t be using floating-point numbers for money, anyway. Money comes in discrete units, and so you should be measuring it with integers. Just call it an integer number of cents (or mills, if you need something finer), and only do a conversion when displaying a result to the user.
Although the math that you perform in your script is written in decimal, the computer fundamentally works in strictly binary numbers and logical operations, where everything is represented in powers of two. This is not generally a problem with integer mathematics (until you get into numbers too large for the machine word), but precision in fractional calculations is problematic. You can improve this to some extent by explicitly truncating values in calculations, or (as @N9IWP recommends) keeping everything in units of integer cents (or, if you are dealing with fractional cents as banks do, in mills), which is what you would do in an older language such as COBOL, FORTRAN, or C. But fortunately for you, Python has in its standard library a decimal module with a Decimal data type specifically for these kinds of “discrete decimal” operations, which will address your problem without any messy and potentially problematic rounding steps.
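Applied to your coin problem, it would look something like this (the counts here are made up):

    from decimal import Decimal

    # Made-up coin counts for illustration:
    quarters, dimes, pennies = 3, 4, 2
    total = (quarters * Decimal("0.25")
             + dimes * Decimal("0.10")
             + pennies * Decimal("0.01"))
    print(total)   # 1.17, exactly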
Coinage and bank notes come in discrete units, but many forms of ‘money’, particularly those involving compound interest (or whatever you consider a cryptocurrency increment to be), are inherently fractional and often irregular. In Python, if I were dealing specifically with coins and bills, I would actually make a money container with discrete units, in which the operations were predefined using the decimal type so as to avoid ever getting an irregular intermediate value (e.g. coins and bills could only be exchanged as whole units, not fractions, avoiding potential problems with rounding calculations), but something like a Bitcoin would have to be represented as a high-precision floating-point value.
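Roughly the kind of thing I have in mind (the class name and interface are invented for illustration):

    from decimal import Decimal

    # Hypothetical "money container": only whole cents are ever stored.
    class CoinPurse:
        CENT = Decimal("0.01")

        def __init__(self):
            self.cents = 0

        def add_coins(self, denomination_cents, count):
            self.cents += int(denomination_cents) * int(count)

        def total(self):
            return self.cents * self.CENT        # exact Decimal, e.g. Decimal('4.51')

    purse = CoinPurse()
    purse.add_coins(25, 18)   # 18 quarters
    purse.add_coins(1, 1)     # 1 penny
    print(purse.total())      # 4.51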
Bitcoins also have discrete individual units. The discrete individual unit is many orders of magnitude smaller than a bitcoin, but that doesn’t matter: It’s still well-defined.
Well, sort of. The ostensible base unit of a bitcoin is a tonal, which is defined as 0.00065536 BTC. Bitcoins are actually inherently binary, so they can be represented ‘exactly’ in a digital computer (although it becomes more complicated once the value exceeds what can be carried in a single 64-bit word), but for all practical purposes of exchange they are converted to decimal values. If you are exchanging between two purses of bitcoins, it is theoretically possible to simply perform any calculations in whole binary numbers, but since there is no monetary exchange system that is entirely within the bitcoin economy, any practical purchase or exchange for real goods will require conversion, and any kind of compounding or other exponent operations will result in irregular fractions.
This is all an aside from the o.p.’s question regarding floating-point math operations; it should be noted (if not already clear) that this is not an issue with Python but rather a fundamental consequence of how all digital computers perform arithmetic. Floating-point ‘errors’ are actually a big problem in high-performance computing, where even tiny rounding errors can add up to large numbers or create artificial computational asymmetries, and also a moderately clever plot device in the occasional movie, e.g. Superman III or Office Space.
And, yeah, it’s nothing specific to Python, just the nature of floats and how they are represented in memory (essentially analogous to something like scientific notation). I remember as a kid tearing my hair out programming the Commodore 64 and discovering all these weird floating-point quirks. For example, on a C64, if you have a variable starting at 0 and keep adding 0.01 to it (as you would in dealing with something like cents, as in your own example), you’re fine up until you hit 0.79, at which point it becomes 0.790000001. But once you get to 0.990000001, the next one properly displays as 1, and then 1.01, 1.02, etc., but at 1.23 it becomes 1.230000001. It’s quirky.
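The same experiment in modern Python (IEEE 754 doubles rather than the C64’s shorter float format, so the drift shows up at different points, but the effect is the same):

    x = 0.0
    for _ in range(100):
        x += 0.01          # one cent at a time
    print(x)               # 1.0000000000000007, not 1.0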
Once upon a time, someone told me something to the effect of, “The only compiler I trust to implement floating point correctly is the one I wrote myself for the Motorola 68000.” I would like to imagine standards compliance is a bit better these days, and that pretty much every compiler and language now offers predictable and correct results. Loss of precision when calculating logarithms and Bessel functions may be a subtle issue, but basic IEEE floating-point arithmetic, normal versus denormal numbers, etc., is something that at least used to be taught to first-year CS students.
In the OP’s case, “rounding intermediate results to two decimal places” is not the way to go, because (1) you are not supposed to round intermediate results, and (2) decimal rounding will not work too well for binary numbers. Decimal floating-point arithmetic is trivial to use in Python, of course. Maybe someone in the finance industry can point out when floating-point calculations are appropriate and authorized, as opposed to fixed-point or integer arithmetic.
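As a quick sketch of the decimal route (the value and the rounding policy here are just examples; ROUND_HALF_EVEN is banker’s rounding):

    from decimal import Decimal, ROUND_HALF_EVEN

    subtotal = Decimal("4.505")
    # Quantize only when you need a final figure rounded to whole cents:
    print(subtotal.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN))   # 4.50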
Along these lines: Never do a check for exact equality in floating-point numbers. Like, if you were incrementing something by 0.01, and wanted to do it until you got to 1, you shouldn’t say “until x = 1.0”, you should say “until x >= 1.0”, or whatever. Or maybe “until x >= 0.995”. Because it’s likely never going to be exactly equal to 1.0, but will instead just keep on running with x getting larger and larger indefinitely.
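For example (a minimal sketch; with an exact-equality test the loop below would never terminate):

    x = 0.0
    while x < 1.0:         # safe; "while x != 1.0" would run forever
        x += 0.01
    print(x)               # ends slightly above 1.0, at 1.0000000000000007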
I never really understood just how likely floats are to be imprecise until I became an accountant. If you represent debits as positives and credits as negatives, every transaction has to add up to 0. If you get more than two decimal values (ones with cents) in a transaction, Excel will adamantly maintain that they add up to 3.69023424e-12 or whatever. What’s more, the number Excel claims they add up to will be different in different instances of Excel, or perhaps if the values are ordered differently; I don’t really know what causes the tiny sum to be different when the entire transaction is transferred to a different Excel file, but it happens.
Obviously all of our actual accounting software deals with discrete cents so it’s not actually a problem, but it’s something you get used to really quick when using Excel to help summarize tons of transactions into one single gigantic transaction to import between systems.
In this case, using some sort of fixed decimal system (whether integers or some special decimal type) makes sense, since you’re just counting things.
But there are plenty of calculations, even with our current currency, that work at precision levels finer than $0.01. They are rounded off only at the end, like when you do calculations with scientific measurements.
My understanding is that financial institutions do use fixed-point math, but that it has precision to something like the eighth decimal place, or 1 millionth of a cent.
1/2 is no problem. 1/4 is no problem. 1/10 in binary is an infinitely repeating number. Since it is infinitely repeating, you run out of bits and you can’t store the whole thing, leading to errors.
Never use floating point for money.
ETA: Somehow I missed bibliophage’s post upthread which said the same thing. Apologies.
So some here are saying accounting is done in discrete cents; others mention fixed point with 8 places, and also (binary or decimal) floating point. Which is used where, if you go by authoritative standards?
That’s the exam question my first-year students miss every time. The correct way is to take the absolute value of the difference of the two numbers and test whether that result is “close enough”, whatever that means in the specific instance.
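In Python that looks something like this (the tolerance is just a placeholder; pick one that makes sense for your problem):

    import math

    a = 0.1 + 0.2
    b = 0.3
    print(a == b)                 # False
    print(abs(a - b) < 1e-9)      # True: explicit "close enough" test
    print(math.isclose(a, b))     # True: the standard library version of the same idea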
What I know is that binary floats are generally not used in financial software, and that decimal floats aren’t used in much of anything because they’re slow.
I also am not sure how many decimal places they use in fixed point. That’s why I said “like the eighth decimal place.” I just know they use more precision than just to the cent.
Looking around, it seems that either four or six decimal places is the norm. And US currency uses a scale factor of 100, which means that currency conversion in USD goes to four decimal places (two past the penny).
Excel uses floating point, and that is one of the things people are cautioned about when using Excel to keep track of money.