 # Traditional rounding 'bias'. I don't get it.

In elementary school when I was taught how to round numbers to a whole number, I was taught that if the last digit is 0, 1, 2, 3, or 4, you round down. Otherwise you round up to the next whole number.

But later on, I was told that this introduces bias and that there are different techniques to deal with it.

So I want to believe that it does cause bias because people much smarter than me in this have developed these techniques to reduce it.

But I can’t convince myself of it. Can someone explain this to me like I’m a 5 year old?

Well, ending in 0 isn’t really rounding down… So you have 9 digits that actually get rounded.

1-4 round down (four numbers)
5-9 round up (five numbers)

So you have an inherent bias toward rounding up.

As I remember it:

10.46 = 10
10.5 = 10
10.51 = 11
11.5 = 12

So, if .5 has no following digits, you round down if the number before the decimal point is even and up if it is odd.

If the .5 has a number after it (e.g. .51 .52, .500006 etc) then you round up.

The bias comes in if you always round up (or down) on .5…by splitting it between even and odd numbers (down/up) then you average out that bias when calculating a lot of numbers.

If it is .500001 then it is over .5 and you go up.
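For the programmers in the thread, here is a minimal sketch of that even/odd rule next to the always-round-.5-up rule from school, using Python's standard `decimal` module. The helper names `round_half_even` and `round_half_up` are just made up for illustration, and the values are passed as strings so binary floating point doesn't muddy the picture.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_half_even(value: str) -> int:
    """Round to a whole number; an exact .5 goes to the even neighbour."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))

def round_half_up(value: str) -> int:
    """Round to a whole number; an exact .5 always goes up (for positive values)."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

# Print each value with its half-even and half-up result side by side.
for v in ["10.46", "10.5", "10.51", "11.5", "10.500001"]:
    print(v, round_half_even(v), round_half_up(v))
```

The two rules only disagree on exact ties such as 10.5. Anything past the halfway point, like 10.51 or 10.500001, goes up under both.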

I’m a programmer, so I wrote a script that repeatedly generates a random integer from 0 to 9. If the integer is greater than or equal to 5, I add 1 to an ups count; otherwise I add 1 to a downs count. When I run this over a million iterations, I usually end up with the ups and downs counts nearly equal. Sometimes ups > downs, sometimes downs > ups, but I don’t see a ‘bias’ either way. What am I missing?
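A rough sketch of that kind of script, assuming the digit stands for tenths (so 7 means x.7). The function name `simulate`, the fixed seed, and the extra tally of how far each value moves are my own additions for illustration, not the original script.

```python
import random

def simulate(n: int, seed: int = 0):
    """Draw n random digits 0-9 and round each 'x.d' the traditional way."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    ups = downs = 0
    net_tenths = 0  # total signed change caused by rounding, in tenths
    for _ in range(n):
        d = rng.randint(0, 9)
        if d >= 5:
            ups += 1
            net_tenths += 10 - d  # x.d moves up to x+1
        else:
            downs += 1
            net_tenths -= d       # x.d moves down to x (0 moves nothing)
    mean_change = net_tenths / 10 / n
    return ups, downs, mean_change

ups, downs, mean_change = simulate(1_000_000)
print("ups:", ups, "downs:", downs, "mean change per value:", mean_change)
```

The up and down counts come out close to 50/50, which matches what the script above reports. The mean change per value is a different measurement: its expected value is +0.05 rather than 0, because the 5 moves up by a full half while the 0 moves by nothing.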

But sometimes you have a number ending in 0, right? So why exclude that?

1.0 = 1 (no rounding needed)

So:

1.1
1.2
1.3
1.4
Those are rounded down in your method (=1).

1.5
1.6
1.7
1.8
1.9
Those are rounded up in your method (=2).

Now count how many are in each set. See the bias?

I can tell right now that this is where my confusion sets in. Why does it matter if 0 rounding down to 0 is counted as being ‘rounded’ or not?

For sake of discussion, let’s assume we are rounding a float number to an integer. The idea is that we are changing the actual value to a value that meets our needs.

So if the “actual” value is 10.0, then the “rounded” value is exactly the same. It is unchanged, so no rounding occurred, just a change of type (float to int).

For literally any other value, we will actually have to change the value to get our rounded integer. So, a decimal ending in any of the other 9 digits will have to be rounded one way or another, the “actual” value changed. And if you always round #.5 UP, you should see a divergence somewhere around 56% rounded up and 44% rounded down.

As Whack-a-Mole alluded to, the way we can combat it is to round down if the number leading your cutoff point is even, and up if it is odd. So 11.5 rounds up to 12, 12.5 rounds down to 12.

I agree that the percentage of last digit being changed is as you state. But the results of the rounding are still 50/50, half rounded up, half either staying the same or rounding down. Bias in the number of changed values? Sure. Bias in the results? I can’t see it.

Consider you are a grocery store. Almost none of your products are priced with no decimal amount.

That is where the bias creeps in.

No biggie for a small store with few transactions (won’t amount to much) but can add up in a big supermarket.

Now imagine you are Amazon.

The question is, why are you lumping “stay the same” with rounding down? They are two different things. That would definitely be the source of the disagreement.

It is surely crucial to consider where these numbers are coming from. Are they actually truncations of random numbers (so 10.5 is really something like 10.524673…), or are they whole amounts which you are trying to reduce to a smaller number of significant digits (so \$100 -> \$100 is exact, but \$102 -> \$100 loses \$2 and \$105 -> \$110 gains \$5)?

Definitely.

For instance, I forget exactly but I think the interest on your mortgage or car loan is calculated to 4 decimal places with the 5th decimal place being used to round.

It may seem like crazy nitpicking but this stuff really matters because it can add up to a fair bit of money when you have millions of loans.

Wasn’t that the central premise in the movie “Office Space”? They figure they would skim off the fractions of a cent that no one will ever miss because no one sees it and then find, to their dismay, that they skimmed a few hundred thousand dollars in a month.

If you want to consider a value of x.0 as being rounded, you’d have to consider it as both being rounded up and down—you can just as well argue that it’s the limit of numbers you round up, and include it with those, as that it’s the limit of numbers you round down.

Run a simulation yourself, or take my word for it, but if we’re dealing with a resolution of 1 cent, the count of 0-49 equals the count of 50-99 and rounding will be non-biased.

Ok…not the best example since they are in whole cents.

Run a simulation using a 4.385% APR for 30 years on a loan of \$734,314.77.

I see it as a result of a rounding function. 0 is equally likely to be a result in a random distribution of 0-9 so I don’t see why it would be excluded.

This is what I’ve been struggling to find a way to explain simply. 0 is either both rounding up and rounding down, or neither. But you can’t call it one over the other.

In my mental model, I call it neither, as I explained above… We aren’t rounding the value at all.

Just look at the total change that the rounding causes:

1.0 -> 1 (no change)
1.1 -> 1 (down .1)
1.2 -> 1 (down .2)
1.3 -> 1 (down .3)
1.4 -> 1 (down .4)
1.5 -> 2 (up .5)
1.6 -> 2 (up .4)
1.7 -> 2 (up .3)
1.8 -> 2 (up .2)
1.9 -> 2 (up .1)

So if you take these ten numbers (1.0, 1.1, …, 1.9) and round them, the total amount of rounding up is 1.5, and the total amount of rounding down is 1.0.

Or, look at the sums. The original numbers sum to: 1.0+1.1+1.2+1.3+1.4+1.5+1.6+1.7+1.8+1.9 = 14.5. The rounded numbers sum to 1+1+1+1+1+2+2+2+2+2 = 15.
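The sums above are easy to check in code. This is a small sketch using Python's `decimal` module so the tenths are exact; `rounded_sum` is a made-up helper name. It also runs the same check over 1.0 to 2.9 with the round-half-to-even rule mentioned earlier in the thread, so that one tie (1.5) goes up and the other (2.5) goes down.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def rounded_sum(values, rounding):
    """Sum the values after rounding each one to a whole number."""
    return sum(v.quantize(Decimal("1"), rounding=rounding) for v in values)

one_to_two = [Decimal(n) / 10 for n in range(10, 20)]    # 1.0 ... 1.9
one_to_three = [Decimal(n) / 10 for n in range(10, 30)]  # 1.0 ... 2.9

print(sum(one_to_two), rounded_sum(one_to_two, ROUND_HALF_UP))
# → 14.5 15
print(sum(one_to_three),
      rounded_sum(one_to_three, ROUND_HALF_UP),
      rounded_sum(one_to_three, ROUND_HALF_EVEN))
# → 39.0 40 39
```

Always rounding .5 up adds 0.5 for every ten values. Over the twenty values, half-even lands back on the original total because the two ties cancel each other out.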

But if you have a uniform distribution of 0 - 9, and round all of the samples the traditional way, you’d have about half at 0 and half at 10, right? If we can agree on that, I can sleep well tonight.