Penny Jar Statistics

Let’s suppose that I have a jar of pennies, which I’ve dumped onto a table and sorted according to the year in which they were minted. I don’t know when the pennies were put into the jar; it’s possible they were bought from the bank yesterday and dumped in all at once, or that they were taken out of someone’s pocket at the end of each day for the last five years or more, so that the jar filled gradually. I simply don’t know, but I’m curious to find out which of these possibilities, one I haven’t thought of, or some combination thereof, was used to fill the jar.

I’m assuming that there would be some predictable pattern of distribution of pennies from a given year, which when compared to the data collected, would indicate whether they’re representative of the current set of circulating pennies, or not. How much could I find out, simply by looking at the minted dates? How large would the data set have to be to obtain any accuracy?
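One rough way to make "compare the pattern to the data" concrete is a Pearson chi-square statistic. The sketch below is only an illustration: the function name `chi_square_statistic` and the example numbers are made up, and it assumes you already have an expected share of circulating pennies for each mint year from somewhere. A common rule of thumb for this kind of test is to want an expected count of about five or more pennies in every year bin, which gives a feel for how large the jar needs to be.

```python
import math


def chi_square_statistic(observed_counts, expected_shares):
    """Pearson chi-square statistic for mint-year counts.

    observed_counts: {mint_year: number of pennies found in the jar}
    expected_shares: {mint_year: expected fraction of circulating pennies},
                     assumed positive and summing to 1 (hypothetical input).
    """
    n = sum(observed_counts.values())
    # A penny from a year the model says cannot occur rules the model out.
    for year, count in observed_counts.items():
        if count > 0 and expected_shares.get(year, 0.0) <= 0.0:
            return math.inf
    stat = 0.0
    for year, share in expected_shares.items():
        expected = n * share
        observed = observed_counts.get(year, 0)
        stat += (observed - expected) ** 2 / expected
    return stat


# Made-up example: two mint years expected in equal shares.
observed = {2000: 10, 2001: 30}
expected = {2000: 0.5, 2001: 0.5}
stat = chi_square_statistic(observed, expected)
degrees_of_freedom = len(expected) - 1
print(stat, degrees_of_freedom)
```

A large statistic relative to the chi-square distribution with that many degrees of freedom suggests the jar does not look like the assumed circulating mix.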

Tantalizing problem. Some things I have thought of that might complicate the reasoning:

  1. Since you have no clue as to the frequency of adding coins to the jar, it’s possible there were some big gaps of weeks, months, or even years where none were added.

  2. If the pennies all came from the bank on some recent date, I’d suspect there’d be a heavier weighting to recent years.

  3. Knowing the numbers of pennies minted each year might assist in evaluating the weighting of pennies in the jar.

  4. Since the timing of the placement into the jar is unknown, I’d expect that even taking the entire population as the sample still wouldn’t give the necessary data for judging the method of placement.

I’m just working off of “common sense” thinking, with no real clues as to statistical methods or assumptions. I would tend to doubt any conclusions based purely on statistical explanations. Too many variables yet to be explained.

I’m sure this is of no real help, but at least I “thought” about it some.

Were this a homework problem, there’d be all kinds of assumptions. One assumption that seems very sound is that you can’t put a penny into the jar with a date later than the date of placement. An assumption that is probably pretty good is that any penny in circulation at time t is equally likely to be put into the jar. Slightly less innocuous is the assumption that pennies enter circulation uniformly during the year they are minted. I don’t know about this one, and it may well be that the mint releases pennies more on a demand schedule. The most tenuous assumption concerns the probability distribution of the lifetime of pennies. Here some standard probability distribution would probably be assumed, something like exponential.

Given the last two assumptions, the date make-up of the penny population at any time could be determined. Then, with the second assumption, a maximum likelihood estimate could be made of when each penny was put into the jar. The maximum likelihood estimate of when the pennies in aggregate were put into the jar would be the aggregation of these.

This would give you something like: The most likely form of saving was
July 2006 86.3 pennies
Jun 2006 72.4 pennies
etc.
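Here is a minimal sketch of the kind of likelihood comparison described above, simplified to whole years and to just two candidate stories: everything dumped in one year versus an equal number of pennies added each year over a span. Every name and number in it is an assumption for illustration: the equal `mintage` table, the 0.95 yearly survival rate (a discrete stand-in for an exponential lifetime), and the jar counts are all made up.

```python
import math


def circulation_shares(deposit_year, mintage, survival=0.95):
    """Share of circulating pennies by mint year, as seen in deposit_year.

    Assumes each penny stays in circulation from one year to the next with
    probability `survival`, and that no penny is dated after deposit_year.
    """
    weights = {
        year: minted * survival ** (deposit_year - year)
        for year, minted in mintage.items()
        if year <= deposit_year
    }
    total = sum(weights.values())
    if total == 0:
        return {}
    return {year: w / total for year, w in weights.items()}


def gradual_fill_shares(first_year, last_year, mintage, survival=0.95):
    """Expected mint-year shares if equal numbers were added every year."""
    years = range(first_year, last_year + 1)
    mix = {}
    for deposit_year in years:
        shares = circulation_shares(deposit_year, mintage, survival)
        for year, share in shares.items():
            mix[year] = mix.get(year, 0.0) + share / len(years)
    return mix


def log_likelihood(jar_counts, shares):
    """Log-likelihood of the jar's mint-year counts under the given shares."""
    total = 0.0
    for year, count in jar_counts.items():
        if count == 0:
            continue
        p = shares.get(year, 0.0)
        if p <= 0.0:
            return -math.inf  # an impossible penny rules the story out
        total += count * math.log(p)
    return total


# Hypothetical inputs: equal mintage every year, made-up jar counts.
mintage = {year: 1.0 for year in range(1980, 2007)}
jar = {2006: 12, 2005: 11, 2000: 9, 1995: 7, 1990: 5}
dump = log_likelihood(jar, circulation_shares(2006, mintage))
slow = log_likelihood(jar, gradual_fill_shares(2001, 2006, mintage))
print("single dump in 2006:", dump)
print("gradual fill 2001-2006:", slow)
```

Whichever story has the larger log-likelihood fits the jar better under these assumptions. Real mintage figures and a better lifetime model would change the numbers, and a fuller version would search over many candidate fill schedules rather than two.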

If you wanted your estimate to give you only integer pennies (you can’t put 86.3 pennies into the jar in a month), you’d want to do something like a probit model.

I save all my pennies, and occasionally sort them into dates and store them in plastic tubes. I haven’t sorted them since May 2004, but at that time the year with the greatest number was 1991. It apparently takes a while for newly-minted coins to circulate more-or-less evenly throughout the country, and it takes a while for the coins to get into the hands of hoarders like me. My WAG is that if I sorted my pennies today, the tally would peak around 1993-4.