My wife was having a discussion with her manager yesterday about how they should be recording their lab measurements. One piece of equipment is really only accurate to 2 decimal places, even though they sometimes get readings to 3 decimal places. They were debating whether to record all 3 decimal places, to truncate the numbers at 2 decimal places, or to round the 3rd decimal place up or down.
As they discussed the rounding, her boss said that, when rounding, one should round the numbers 1,2,3,4,5 DOWN and round 6,7,8,9 UP.
For example:
3.394 = 3.39
3.395 = 3.39
3.396 = 3.40
This dumbfounded my wife. She (and I) had been taught to always round 5 and above UP, and to round 4 and below DOWN.
Thus you’d get the following:
3.394 = 3.39
3.395 = 3.40
3.396 = 3.40
Have we been doing it wrong all these years? Or is her boss off base?
They are both wrong. You skew the data either way. Her boss skews low, and she skews high. You round to the nearest even number. 3.395 rounds to 3.40, while 3.405 rounds to 3.40. In the long run you introduce less bias.
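The round-to-nearest-even rule described above (often called "banker's rounding") can be sketched with Python's `decimal` module, which supports it directly; the helper name `round_even` is mine, and the values are the thread's examples:

```python
# Round-half-to-even ("banker's rounding") using Python's decimal module.
# Exact halves go to whichever neighbor has an even final digit.
from decimal import Decimal, ROUND_HALF_EVEN

def round_even(value: str, places: str = "0.01") -> Decimal:
    # Quantize to two decimal places; exact halves round to even.
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)

print(round_even("3.395"))  # 3.40  (retained digit 9 is odd, so round up)
print(round_even("3.405"))  # 3.40  (retained digit 0 is even, so stay)
print(round_even("3.394"))  # 3.39  (below the half, ordinary round down)
```

Note the strings: feeding binary floats like `3.395` straight in would smuggle in representation error before the rounding even happens.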
Superman 3 featured a scheme built on this conundrum - whichever way you go, you're going to tilt the data a little bit high or low. My high school math teacher championed rounding 5 to the nearest even number.
Well, if it’s a lab, really, they should be recording all decimal places and doing all calculations with +/- expected errors (so 3.394 +/- 0.011) - unless the expected error is large enough that the third decimal place is essentially random (3.394 +/- 0.24 doesn’t really give any more information than 3.39 +/- 0.24), or there’s something about the device itself that makes them believe the third decimal place is random.
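The value-plus-expected-error bookkeeping suggested above can be sketched as a tiny class; the class name and the quadrature rule for combining independent errors are my assumptions, not anything from the thread:

```python
# A minimal sketch of carrying a measurement together with its expected
# error (value +/- uncertainty). Assumes the errors are independent, so
# they combine in quadrature when measurements are added.
import math
from dataclasses import dataclass

@dataclass
class Measurement:
    value: float
    err: float  # expected error on the value

    def __add__(self, other: "Measurement") -> "Measurement":
        # Independent errors add in quadrature: sqrt(e1^2 + e2^2).
        return Measurement(self.value + other.value,
                           math.hypot(self.err, other.err))

    def __str__(self) -> str:
        return f"{self.value:.3f} +/- {self.err:.3f}"

a = Measurement(3.394, 0.011)
b = Measurement(3.405, 0.011)
print(a + b)  # 6.799 +/- 0.016
```

The point is that the uncertainty travels with the number through the calculation, instead of being silently discarded by rounding at recording time.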
Not true; that would always skew downwards, with a maximum error of 0.01 per calculation. Rounding up or down at 0.005 leads to a max error of 0.005 per calculation. Rounding to the nearest even digit will not lead to a smaller error in any individual calculation, but it will tend to lessen the total upward or downward skew across a list of calculations.
ETA: I was taught to use the 5 = round up rule, but most of my code rounds to nearest even.
When I worked at a place that used whole numbers, cents were truncated from amounts. This often resulted in crossfoot errors (i.e., the five aging amounts did not add up to the balance amount within a $5 limit). Records that exceeded the $5 crossfoot error limit were dropped. So for the data I needed to write a program for, I wrote some code that rounded the numbers to the nearest whole dollar (.50 or higher, round up; .49 or lower, round down). This resulted in far fewer records exceeding the $5 crossfoot error limit.
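The crossfoot effect described above is easy to demonstrate: truncating each bucket always pushes every part low, so the errors pile up in one direction. The dollar figures below are made up purely for the sketch:

```python
# Hypothetical illustration: five aging-bucket amounts, compared against
# their balance, after dropping cents by truncation vs. by rounding.
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

buckets = [Decimal(s) for s in ("101.73", "250.49", "87.91", "443.62", "19.55")]
balance = sum(buckets)  # 903.30

def to_whole(amount: Decimal, mode) -> Decimal:
    return amount.quantize(Decimal("1"), rounding=mode)

for name, mode in (("truncate", ROUND_DOWN), ("round", ROUND_HALF_UP)):
    parts = sum(to_whole(b, mode) for b in buckets)
    total = to_whole(balance, mode)
    print(f"{name}: buckets sum to {parts}, balance {total}, "
          f"crossfoot error {parts - total}")
```

Truncation loses up to 99 cents per bucket, all in the same direction; rounding's errors go both ways and largely cancel, so fewer records blow past a fixed crossfoot limit.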
I agree that you should leave in the remaining digit. There’s no way that lopping off the extra digit whose accuracy is questionable will make the data more accurate as long as the uncertainty is recorded. On the other hand there is a good chance the data will be less accurate.
I also learned the round-to-even rule. More specifically, if the digits that are being dropped are “50000…” with the number of zeros being 0 or more, and the last retained digit was odd, increase it by one, otherwise leave the last retained digit alone. This does not introduce a bias in the mean. It does, though, introduce other kinds of bias, including making even numbers a little more common.
My preference is never to do any rounding, except perhaps in the final result if it is being published or distributed to many or shown in a slide or something else that makes it expensive or ugly not to round. The reason is that the supposedly meaningless numbers actually do have a small meaning. They don’t tell you anything more about the quantity you were measuring, but they create a distinctive sort of fingerprint. It often seems that we want to check or verify after the fact that there hasn’t been some kind of confusion or malfunction. If there are 5 insignificant digits and they keep repeating, perhaps some assumption about the recordkeeping is wrong. If only certain combinations of those digits appear, there is probably some significant rounding happening someplace upstream, like losing the fractions of an ounce before it got converted to grams, or only reading the 12 most significant bits out of a 16-bit A/D converter. If these digits repeat at the tops of two pages, it might tip you off that a page got printed twice. They let you check all sorts of things that you usually can’t anticipate needing to check.
I also learned to round fives to even. And that was in my numerical methods class in Computer Science graduate school, where we had to calculate the error bounds on complicated computations.
Another question is: What exactly is your wife doing with the data after it’s gathered? Are any calculations being done with it? Or is it just used raw?
I know that you can also analyze the error to expect from different people taking the measurements. I haven’t done anything with that in 8 years, so I can’t tell you what the name would be.
Think of it this way: Numbers ending in 0, 1, 2, 3, and 4 are rounded down, numbers ending in 5, 6, 7, 8, or 9 are rounded up. Five of each digit. That eliminates bias. Truncation is biased.
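The bias claims in this thread can be checked numerically. A sketch using Python's `decimal` module, averaging the signed rounding error over every three-decimal value from 0.000 to 9.999 (exact arithmetic, so no float artifacts); note that because exact halves occur in this grid, round-half-up still picks up a slight upward skew, which is exactly what round-to-even removes:

```python
# Average signed error of three rounding rules over all values
# 0.000, 0.001, ..., 9.999 when rounded to two decimal places.
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, ROUND_HALF_EVEN

def mean_error(rounding_mode) -> Decimal:
    total = Decimal(0)
    count = 10000
    for i in range(count):
        x = Decimal(i) / 1000  # exact: 0.000 through 9.999
        rounded = x.quantize(Decimal("0.01"), rounding=rounding_mode)
        total += rounded - x
    return total / count

print("truncate:  ", mean_error(ROUND_DOWN))       # skews low
print("half-up:   ", mean_error(ROUND_HALF_UP))    # skews slightly high
print("half-even: ", mean_error(ROUND_HALF_EVEN))  # no net skew
```

Truncation averages -0.0045 per value and half-up averages +0.0005 (the exact fives all go up), while half-even's mean error is zero because the exact fives split evenly between up and down.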
As for the third digit on the gauge: do you think it’s just unreliable, or actually noise? If the former, round it; if the latter, truncate it.