Often I see a percentage or decimal that is a rounded value of a fraction and wonder what actual fraction it might represent. For example, a baseball player’s batting average is .342; how many hits and at bats does he have?
Is there an algorithm that finds all pairs m, n satisfying 0.3415 < m/n < 0.3425 with n < N (a fixed value), and that is better than brute-force trial and error for each n?
Almost certainly, although maybe not a trivially simple one. There’s a whole field of mathematics that discusses approximating real numbers with “optimal” fractions (Diophantine approximation).
I don’t think there is a good answer, especially if you’re allowing for large denominators. Just using .342… if we allowed for 2000 at-bats, then the answer could be 683/2000 or 684/2000, both of which would fit the allowable criteria (685/2000 is exactly .3425, right on the rounding boundary). And 684/2001 and 685/2001 would also work.
As the denominator range gets smaller, you’re more likely to have unique values, but you still can’t guarantee it. A batting average of .200 would have as possible answers any denominator divisible by 5.
Yes, there is an algorithm, if you want the best approximation with a low denominator. If the fraction is between 0 and 1, start with the best approximations with denominator 1 on each side. Then, for each best pair of approximations a/b and c/d, find the fraction (a+c)/(b+d), and discard the approximation that’s worse than that in favour of this new approximation. So, with 0.342:
0/1 and 1/1 – new fraction 1/2 is closer than 1/1
0/1 and 1/2 – new fraction 1/3 is closer than 0/1
1/3 and 1/2 – new fraction 2/5 is closer than 1/2
1/3 and 2/5 – new fraction 3/8 is closer than 2/5
1/3 and 3/8 – new fraction 4/11 is closer than 3/8
1/3 and 4/11 – new fraction 5/14 is closer than 4/11
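The steps above can be sketched in a few lines of Python. This is only an illustration, not anyone's posted code: the function name mediant_steps is made up, and it reads "discard the worse approximation" as "replace whichever bound lies on the same side of the target as the new fraction", which is what the worked steps do.

```python
from fractions import Fraction

def mediant_steps(target, steps):
    """Narrow in on target (a Fraction strictly between 0 and 1) by repeated mediants.

    Returns the list of new fractions produced, as (numerator, denominator) pairs.
    Hypothetical helper name; assumes 0 < target < 1.
    """
    lo, hi = (0, 1), (1, 1)          # best approximations with denominator 1
    produced = []
    for _ in range(steps):
        med = (lo[0] + hi[0], lo[1] + hi[1])   # (a+c)/(b+d)
        produced.append(med)
        value = Fraction(*med)
        if value == target:
            break                    # hit the target exactly
        if value < target:
            lo = med                 # new fraction replaces the lower bound
        else:
            hi = med                 # new fraction replaces the upper bound
    return produced

for num, den in mediant_steps(Fraction(342, 1000), 6):
    print(f"{num}/{den}")
```

Run for six steps on 0.342, this prints 1/2, 1/3, 2/5, 3/8, 4/11 and 5/14, matching the list above.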
I would use a simple continued fraction. Each step is guaranteed to give a fraction that is as close as possible to the actual number, in the sense that each step gives a result that is closer to the actual value than any other fraction with the same or a smaller denominator (http://en.wikipedia.org/wiki/Continued_fraction). Calculating them is easy thanks to the continued fraction calculator (Continued Fraction Calculator).
Anyway, with 0.342, the “best” approximations are: 1/2, 1/3, 7/20, 8/23, 9/26, 10/29, 11/32, 12/35, 13/38, 53/155, 66/193, 79/231, 92/269, and 171/500.
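For anyone who would rather compute this than use an online calculator, here is a minimal sketch using Python's standard fractions module. The function names continued_fraction and convergents are my own, and the code only produces the convergents (1/2, 1/3, 13/38, 79/231, 171/500 for 0.342); the other entries in the list above are intermediate fractions that lie between consecutive convergents, which this sketch does not generate.

```python
from fractions import Fraction

def continued_fraction(x):
    """Return the simple continued fraction terms [a0; a1, a2, ...] of a Fraction x >= 0."""
    terms = []
    while True:
        a = x.numerator // x.denominator   # integer part
        terms.append(a)
        x -= a
        if x == 0:
            return terms                   # rational input always terminates
        x = 1 / x                          # continue with the reciprocal of the remainder

def convergents(terms):
    """Return the convergents p/q built from continued fraction terms."""
    p_prev, q_prev = 1, 0
    p, q = terms[0], 1
    result = [Fraction(p, q)]
    for a in terms[1:]:
        # standard recurrence: p_k = a_k * p_(k-1) + p_(k-2), same for q
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

terms = continued_fraction(Fraction(342, 1000))
print(terms)
print([str(c) for c in convergents(terms)])
```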
Now of these, 171/500 is exact, but 1/3 is within 0.01 of the actual value (obviously), 12/35 is within 0.001, and 53/155 is within 0.0001. So if someone has a 0.342 batting average, you are off by less than one percentage point in saying that the batter gets a hit in 1 of every 3 official at bats, which is certainly good enough for a rough mental guide.
[nitpick]It would be better to say “1/3 of official at bats” or “1 of 3 official at bats” - when you throw in “every” it means (in a strict sense) that there is no case of three at bats without a hit.[/np]
(I readily concede that this inaccuracy is extremely common.)
But even though 1/3 is a very good approximation, you do know that the batter has not had exactly 3 at-bats. On the other hand, if the batting average is given as 0.333, then it might be a rookie who has had exactly 3 at-bats (or exactly 6, or exactly 9, etc.). I interpret the OP not as being “if the number is 0.342, how can I reasonably approximate that?”, but as “If the exact number is rounded to 0.342, what are the possible values of the denominator?”.
Let’s say a player has a batting total of 401 in 723 (I have no idea how realistic this is), their batting average is given as 0.555 (rounded to 3 sig figs), and we know that their at bats are somewhere between 700 and 800.
389/701 rounded to 3 sig figs is 0.555 and 444/800 is exactly 0.555 (these are just two examples, one at each end of the range), so our knowledge is very little help at all in reconstructing the original data from the batting average. Though we could say he didn’t have 700 at bats (for example), as no fraction with 700 as the denominator rounds to 0.555 at 3 sig figs.
For a really deep explanation of the method Giles talks about, look up the Stern–Brocot tree. It is the best way to approximate a fraction to the desired number of digits.
“Concrete Mathematics: A Foundation for Computer Science” by Graham, Knuth, and Patashnik has an excellent section on it, including a neat matrix-based way of representing numbers.
I’m not sure about that. What it means to me is: take at bats in groups of three (and hits in groups of one, I guess), and then, if you do that, the groups are the same size. “Every” here does not mean (at least in my humble opinion) “all,” but as it turns out the OED doesn’t agree with me. So perhaps I will rephrase and say “1 out of every 3, on average.”
Ah. I see now. You’re saying that if the batting average is .333, we can be pretty sure that the batter hasn’t had too many at bats, since if there were a lot of at bats, it becomes very unlikely that exactly a third of them were hits. But if the batting average is something like .342, it’s harder to guess if the guy has say 57 at bats or 157. Is that right?
To answer this question, I’m going to say no, overall. There are simply too many possibilities to make a meaningful guess. For example, it is possible to have a batting average that rounds to .342 with the following numbers of at bats:
38, 73, 76, 79, 111, 114, 117, 120, 146, 149, 152, 155, 158, 161, 184, 187, 190, 193, 196, 199, 202, 219, 222, 225, 228, 231, 234, 237, 240, 243, 257, 260, 263, 266, 269, 272, 275, 278, 281, 284, 292, 295, 298, 301, 304, 307, 310, 313, 316, 319, 322, 325, 330, 333, 336, 339, 342, 345, 348, 351, 354, 357, 360, 363, 365, 366, 368, 371, 374, 377, 380, 383, 386, 389, 392, 395, 398, 401, 403, 404, 406, 407, 409, 412, 415, 418, 421, 424, 427, 430, 433, 436, 438, 439, 441, 442, 444, 445, 447, 448, 450, 453, 456, 459, 462, 465, 468, 471, 473, 474, 476, 477, 479, 480, 482, 483, 485, 486, 488, 489, 491, 494, 497, and 500. It goes on a bit from here, but as you see, as the numbers get bigger, there are more possibilities, and after 943, every number of at bats can result in an average of .342.
At the beginning of the season, you can make some guesses. It seems that you might be able to guess between 38 and 73 at bats. But later, it doesn’t seem like you’re going to be able to get much out of it. At bats happen randomly enough that you’re just not going to have any way of deciding between 111 and 114, for example.
Oh, as for an algorithm, I used brute force for these numbers. The tree-branches are a bit better than this, but, as I stated above, you just can’t get too much good information out of the batting average. But for other situations, the branches work well.
It should also be clarified that there are multiple levels of brute force. One could, for instance, systematically try every numerator and every denominator, and just take the ones that give the desired number. Force doesn’t get much more brutish than that. On the other hand, you could also systematically try every denominator, take that denominator and multiply it by the decimal to get an approximate numerator, and then check the integer immediately above and below that approximate numerator and see if either of them works. You’re still brute-forcing the denominators, but for any given denominator, you’re doing very little work on the numerators. This is probably what RadicalPi actually did.
One last bit of procrastination before I get back to work: The at bats that can give an average of .555 do not have to lie between 700 and 800. For example, 61/110, 66/119, 71/128, 76/137, and 81/146 also make an average of .555. It is true, though, that some averages have a particularly high minimum number of at bats to be possible (for example, here, for .555, 110 at bats is the first one), but with only 3 significant figures to work with, I don’t think it’ll ever get as high as 700.
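The "minimum number of at bats" idea is easy to check by machine. The sketch below is an illustration under stated assumptions: the name min_at_bats is made up, the average is passed as an integer number of thousandths (555 for .555), and rounding is assumed to be round-half-up, so a ratio counts when (2k-1)/2000 <= hits/at_bats < (2k+1)/2000. Integer arithmetic is used throughout to avoid floating-point edge cases.

```python
def min_at_bats(avg_thousandths):
    """Return (hits, at_bats) with the fewest at bats that rounds to the given average.

    avg_thousandths is an integer from 0 to 1000 (e.g. 555 means .555).
    Assumes round-half-up to three decimal places.
    """
    lo = 2 * avg_thousandths - 1    # lower bound is lo/2000 (inclusive)
    hi = 2 * avg_thousandths + 1    # upper bound is hi/2000 (exclusive)
    n = 1
    while True:                     # always stops by n = 1000, where hits = avg_thousandths works
        m = max(0, -((-lo * n) // 2000))      # smallest hits count at or above the lower bound
        if m <= n and 2000 * m < hi * n:      # also below the upper bound?
            return m, n
        n += 1

print(min_at_bats(555))   # fewest at bats that can show .555
print(min_at_bats(342))   # fewest at bats that can show .342

# To test the hunch about 700, take the largest minimum over every possible average:
print(max(min_at_bats(k)[1] for k in range(1001)))
```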
I took the ceiling of .3415 times all the denominators as one data point and the floor of .3425 times all the denominators as the other. Then I ran a check to see if the ceiling matched or was less than the floor for each denominator.
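A rough Python version of that ceiling-and-floor check might look like the following. It is a sketch, not the code actually used: the function name possible_at_bats is invented, the average is given in thousandths, and one deliberate difference is that the upper bound is treated as exclusive (assuming round-half-up, a ratio of exactly .3425 would display as .343). That only matters when .3425 times the denominator is a whole number, i.e. for multiples of 400.

```python
def possible_at_bats(avg_thousandths, max_at_bats):
    """List every at-bat count up to max_at_bats that can round to the given average.

    avg_thousandths: integer 0..1000 (342 means .342).
    Uses exact integer arithmetic: bounds are (2k-1)/2000 and (2k+1)/2000.
    """
    lo = 2 * avg_thousandths - 1
    hi = 2 * avg_thousandths + 1
    results = []
    for n in range(1, max_at_bats + 1):
        m_min = max(0, -((-lo * n) // 2000))     # ceiling of lower bound times n
        m_max = min(n, (hi * n - 1) // 2000)     # largest hits count strictly below upper bound times n
        if m_min <= m_max:                       # some whole number of hits fits
            results.append(n)
    return results

print(possible_at_bats(342, 120))
```

For .342 and at most 120 at bats this gives 38, 73, 76, 79, 111, 114, 117 and 120, the same as the start of the list posted earlier.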