Traditional rounding 'bias'. I don't get it.

I mentioned the Jpeg tricks because they seemed interesting. We’re not talking of banks that misplace a few pennies, but of lossy compression methods (not just Jpeg) which get a big performance boost by understanding rounding bias.

It’s safe to say that I have more than a passing familiarity with Jpeg. :cool:

The compression “trick” I mentioned was designed to operate with any ordinary decompressor. (And obviously, reduction of the frequency of ±1 would immediately yield a Huffman code optimization.) The decompression “trick” I mentioned operates with any ordinary compressor. (Jpeg files can include custom markers for compressor-decompressor cooperation to get further improvement, but the tricks I described don’t require them.)

The tricks are so obvious that it may be a mystery why Jpeg did NOT explicitly incorporate them. I am aware of one company which kept its knowledge of these ideas secret to obtain a competitive advantage over “vanilla” implementations. (I had a hunch what they were doing and confirmed it with a code disassembler.)

People on the Jpeg-2000 committee (some of whom were aware of my own paper on this topic) did add a new quantization parameter.

If the company was keeping secret that they were doing this, then it couldn’t have been protected under patent, and anyone who managed to reverse-engineer their trick would be legally entitled to do whatever they wished with it (as long as they did so without help from any employees of that company, who are probably bound by their contracts).

I found your post very interesting, but I can’t reply to it other than to say that. I don’t like replying to GQ posts without adding to the discussion.

Thank you, Pleonast! Your kind words are appreciated.

I have a bad habit of tongue-in-cheekiness and this paragraph is a case in point.

A few decades ago, there was a craze to patent software algorithms. Were they even patentable? Some patent attorneys insisted that the improper patent claim “I claim the algorithm which blah blah blah” could be rendered proper by simply adding four words — “I claim an apparatus which implements the algorithm which blah blah blah.” (Even though the apparatus wasn’t the inventor’s at all — just an off-the-shelf Pentium or whatever.) The U.S. PTO would eventually get tired of arguing and issue the patent, but would it stand up in court if challenged?

This software-patenting craze did generate income for patent attorneys, so I guess it was at least as good as breaking windows. :smack:

Patents for methods or techniques, as opposed to physical objects, were well-established long before the age of computers. I don’t see why a method patent couldn’t apply to an algorithm.

Many patent jurisdictions specifically exclude computer programs, among other things. So one could argue that an algorithm is a description of a computer program and therefore not patentable, if it is not already excluded as an “abstract idea”. On the other hand, certain “practical applications” or “implementations” of a mathematical formula or algorithm are theoretically patentable…

There are dozens of court cases on the subject of patenting mathematical algorithms, several of which went to the Supreme Court. Consider Parker v. Flook (1978).

One test, I think, is to eliminate any details of a mathematical formula and see if the result is vacuous. In our example, “I claim a process wherein said intermediate arithmetic value is altered in some way …” Sounds almost vacuous to me.

But the Supreme Court has reversed itself and then reversed again since Parker v. Flook. Contact an attorney!

Leaving aside JPEG and JPEG2k, there are many newer standards and proposed standards which include steps along the lines of computing the discrete cosine transform or wavelet transform of a block, followed by quantization and entropy coding. How do they handle the inevitable rounding?

JPEG2k was after my time, let alone anything newer, but since no one else answered I’ll summarize the basic difference in quantization rounding between Jpeg and Jpeg2k. Recall that ‘x’ in the following has already been multiplied by a scaling constant which depends on psychometrics and user-desired compression ratio.

Jpeg:
-1.5 < x < -0.5 --> x = -1
-0.5 < x < +0.5 --> x = 0
+0.5 < x < +1.5 --> x = +1
+1.5 < x < +2.5 --> x = +2
+2.5 < x < +3.5 --> x = +3

The spec offers no flexibility beyond this. (One implementation increases the x=0 domain to -1.5 < x < +1.5 — i.e. eliminates all ±1’s which would be encoded — in image regions the user specifies as being of low interest.)
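For concreteness, here's a minimal Python sketch of the Jpeg rounding table above, including the widened zero bin from that implementation trick (the function name and the `wide_zero` flag are my own, not from the spec; `x` is assumed already divided by the quantizer step):

```python
import math

def jpeg_quantize(x, wide_zero=False):
    """Round a scaled DCT coefficient per the Jpeg table above."""
    # Optional trick: widen the zero bin to (-1.5, +1.5) in
    # low-interest regions, eliminating all +/-1 codes.
    if wide_zero and abs(x) < 1.5:
        return 0
    # Round half away from zero (the table's open intervals leave
    # exact halves unspecified; this is one reasonable choice).
    return int(math.floor(abs(x) + 0.5)) * (1 if x >= 0 else -1)

print(jpeg_quantize(0.7))                  # -> 1
print(jpeg_quantize(0.7, wide_zero=True))  # -> 0
print(jpeg_quantize(-1.2))                 # -> -1
```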

Jpeg2k, default:

-2.0 < x < -1.0 --> x = -1.5
-1.0 < x < +1.0 --> x = 0
+1.0 < x < +2.0 --> x = +1.5
+2.0 < x < +3.0 --> x = +2.5
+3.0 < x < +4.0 --> x = +3.5

This policy recovers most of the inefficiency that applied to Jpeg (cf. my ‘tricks’ above); further meddling would yield only diminishing returns. However, IIUC, two optional parameters are provided to tune this quantization further: the width of the zero region can be adjusted, and the reconstruction values can be moved slightly toward zero to reduce average squared error. (Never mind the more complicated modes of Jpeg2k.)
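A sketch of the Jpeg2k deadzone rule above, in Python, with the reconstruction offset exposed as a parameter (the names are mine; a real codec passes quantizer indices, not floats, and the deadzone width is likewise adjustable, which I omit here):

```python
import math

def j2k_quantize(x):
    # Deadzone quantizer: q = sign(x) * floor(|x|), so the zero
    # bin (-1, +1) is twice as wide as every other bin.
    s = -1 if x < 0 else 1
    return s * math.floor(abs(x))

def j2k_dequantize(q, delta=0.5):
    # Reconstruct at sign(q) * (|q| + delta). The default 0.5 is
    # the bin midpoint; delta < 0.5 nudges values toward zero,
    # which can reduce mean squared error.
    if q == 0:
        return 0.0
    s = -1 if q < 0 else 1
    return s * (abs(q) + delta)

print(j2k_dequantize(j2k_quantize(1.7)))   # -> 1.5
print(j2k_dequantize(j2k_quantize(0.9)))   # -> 0.0
print(j2k_dequantize(j2k_quantize(-2.3)))  # -> -2.5
```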

You’re right that float isn’t appropriate for almost any business/accounting software, but neither is integer for currency and catch weight values.

Fixed precision decimal/numeric data types are what is used.

Fixed precision is the same thing as integer, just with different units.

It’s not like the financial programmer can ignore rounding issues just because she uses decimal and/or fixed-point numbers. All the issues discussed in this thread (the possibility of incorrect rounding, etc.) still apply.

I was assuming you were using the term “integer” to mean the software data type of integer because the context was software running on the cash register. Maybe you were using the term more generically.

My point was that there are data types that are designed for this type of stuff as opposed to the integer data type which just requires a lot more work to achieve the same result as the decimal/numeric types.

No, I mean that a decimal type is the same thing as an integer type. Like, measuring in dollars to two places after the decimal point is equivalent to measuring in integer cents.

Not exactly the same; you also need to store the number of digits after the decimal point (2 in your example) and keep track of it when multiplying, etc.

If you’re storing the place of the decimal point, it’s floating point, not fixed point.

I meant fixed point.

Say you have one decimal digit, and the operation is 1.2 x 3.4. The intermediate value is 4.08, and it must be correctly rounded to 4.1. On the other hand, 12 x 34 = 408. So, arithmetic is not quite the same.
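That bookkeeping looks like this in Python, both by hand with scaled integers and via the decimal module (a sketch of the point being made, not any particular product's code):

```python
from decimal import Decimal, ROUND_HALF_UP

# One decimal digit of precision, stored as scaled integers (tenths).
a, b = 12, 34            # represent 1.2 and 3.4
raw = a * b              # 408 -- but in HUNDREDTHS now, i.e. 4.08
# Rescale back to tenths with correct (half-up) rounding: 4.08 -> 4.1
rounded = (raw + 5) // 10
print(rounded)           # -> 41, i.e. 4.1

# The Decimal type does the same bookkeeping for you:
d = (Decimal("1.2") * Decimal("3.4")).quantize(
    Decimal("0.1"), rounding=ROUND_HALF_UP)
print(d)                 # -> 4.1
```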

There’s something else in rounding that bugs me.

2.3217 could be rounded up to 2.322. And ultimately to 2, if we need an integer.

But 4.4449 could be rounded up to 4.445, and then 4.445 to 4.45, then to 4.5, then to 5.

I’m not saying that’s happening very often, but if multiple people are working on the same spreadsheet, it’s possible. I did it myself a few times, absentmindedly.

I’m just saying: rounding is a convention, and we should be careful to follow the rules of that convention.
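The 4.4449 example, run both ways with Python's decimal module (round-half-up is assumed throughout; the point is that repeated rounding drifts where a single rounding does not):

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to(x, places):
    # Quantize x to the given number of decimal places, half-up.
    return x.quantize(Decimal(10) ** -places, rounding=ROUND_HALF_UP)

x = Decimal("4.4449")
print(round_to(x, 0))    # -> 4  (correct single rounding)

# Rounding repeatedly, one digit at a time, drifts upward:
for places in (3, 2, 1, 0):
    x = round_to(x, places)   # 4.445 -> 4.45 -> 4.5 -> 5
print(x)                 # -> 5
```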

Under what circumstances would you ever be multiplying two amounts of money?

And even if you do, if you have two amounts of money expressed in cents, and you multiply them, you’ll end up with a perfectly valid answer in square cents. Why are square cents any less valid of a unit than square dollars?

It’s not typically two amounts of money; it’s typically money (e.g. unit price, invoice subtotal) multiplied by things like shipped weight (food typically uses catch weight systems), yardage (e.g. 3.07 yards consumed), a currency conversion factor (typically 7 decimal places), tax rates, discounts, or the proration of costs across multiple invoices or lines, etc.

Manually tracking the decimal position during all of these calculations requires extra work and is more error prone than just using the software (and sometimes hardware) that already handles it.
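A sketch of the catch-weight case with Python's decimal module (the price and weight here are made up, and ROUND_HALF_UP is one common commercial convention; banks often prefer half-even):

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def line_total(unit_price, weight):
    # e.g. $3.79/lb times a catch weight of 12.47 lb; the raw
    # product carries 4 decimal places, so round to cents once,
    # at the end, rather than at every intermediate step.
    return (unit_price * weight).quantize(CENT, rounding=ROUND_HALF_UP)

print(line_total(Decimal("3.79"), Decimal("12.47")))  # -> 47.26
```

The decimal type tracks the scale through the multiplication automatically, which is exactly the bookkeeping that would otherwise have to be done by hand with scaled integers.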