Help me understand this statistic

According to CBS News and the Census Bureau:

http://www.cbsnews.com/stories/2004/08/26/politics/main638642.shtml?CMP=ILC-SearchStories

What does this number mean?

And why did they use the median instead of the average? Wouldn’t that stat make more sense?

I’m not a math person. ::bows head in shame::

The median of a set of numbers is the “middle point”, i.e. the point where half of the observations are above and half are below. This will often be quite different from the average.

As an example, if 5 students score 10%, 15%, 20%, 25% and 100% on a test, then the median mark is 20% (since it represents the middle score), but the average mark is 34%. The skewed distribution of the scores means that only one student has an above-average mark.
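For anyone who wants to check the arithmetic, here is a minimal Python sketch of the test-score example above. It uses only the standard statistics module; the variable names are just for illustration.

```python
from statistics import mean, median

# The five test marks from the example above, in percent.
scores = [10, 15, 20, 25, 100]

score_median = median(scores)  # middle value once the marks are sorted
score_mean = mean(scores)      # sum of the marks divided by how many there are

# Marks that beat the mean: only the 100% makes it.
above_mean = [s for s in scores if s > score_mean]

print("median:", score_median)
print("mean:", score_mean)
print("students above the mean:", len(above_mean))
```

The median comes out at 20 and the mean at 34, with a single student above the mean, just as described.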

Continuing my previous post, since I hit the submit button too quickly:

Household incomes are also likely to be skewed and so the quoted statistic is the median income. Theoretically 50% of households have an income below the number and 50% have an income above the number.

If you used the mean (average), Bill Gates would throw everything off.

“Average” is a misleading term. There are several different kinds of average. The most common are the mean, the median and the mode.

Median is a useful statistic to quote for populations that are greatly skewed or have large extremes (such as incomes). By definition, 50% are on each side of the median.

Mean is more useful for any kind of mathematical analysis of the sample taken. It is also used in any situation where the total has a useful meaning. Therefore “mean rainfall” is useful, whereas “median rainfall” is rather pointless. (I was going to say meaningless, but you get the idea.)

Mode is the most commonly occurring value. Useful for when you want to make decisions to accommodate the greatest proportion of the population. “What size shoes will I run out of first?”
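A rough sketch of the shoe-size question in Python, assuming a made-up list of sizes sold (the numbers are purely hypothetical, not real sales data):

```python
from collections import Counter
from statistics import mode

# Hypothetical shoe sizes sold in one day; invented for illustration only.
sizes_sold = [8, 9, 9, 10, 9, 11, 10, 9, 8, 10]

most_common_size = mode(sizes_sold)             # the single most frequent value
top_two = Counter(sizes_sold).most_common(2)    # (size, count) pairs, most frequent first

print("mode:", most_common_size)
print("two most common sizes with counts:", top_two)
```

Here size 9 shows up four times, so that is the size to stock up on first.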

There are other averages. A weighted mean is probably the next most commonly used. It is a bit like giving some data more votes than other data. A typical example would be a grade in a course that is made up 30% of assignments and 70% of the final exam.
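To make the “more votes” idea concrete, here is a small Python sketch of a weighted mean. The function name weighted_mean and the two marks (80 on assignments, 60 on the exam) are assumptions picked for illustration; only the 30/70 split comes from the example above.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical marks: 80 on assignments (weight 30), 60 on the final exam (weight 70).
course_grade = weighted_mean([80, 60], [30, 70])
plain_average = (80 + 60) / 2

print("weighted grade:", course_grade)
print("unweighted average:", plain_average)
```

The weighted grade is 66, lower than the plain average of 70, because the weaker exam mark gets more of the votes.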

Then there are specialist averages. Consider the problem of working out the average sized rock in a pile of rocks.
If you took the average diameter, you would get a different rock from what you’d get if you took the average mass.
Suppose you were interested in the diameter. If you considered the large number of small gravel particles, sand and dust, you are going to have a very skewed distribution and your average is not going to be very representative. There are ways of “un-skewing” your data by considering the volume, surface area, or even the moment of inertia of the rocks you are measuring and performing a weighted mean accordingly. Of course it gets kinda complicated about that point.

The Greeks studied ten different means. The formulas to calculate them were all permutations of the mean we most commonly use. I think around seven of them had practical uses. The others were just for fun. :dubious:

But for layman’s use, the mean, median and mode will suffice. And for the most part, the correct one is used and reported by journalists etc. The conclusions and implications that are then drawn from that are another matter. It is usually not wise to trust anyone spouting a whole lot of statistics.

Another reason for using the median is that it ignores outliers (extreme values). The mean income is affected by those multi-megabuck hockey players and movie stars, so it will make the average Joe look like he’s even worse off in comparison to everyone else than he really is.

A “household income” is, not surprisingly, the sum of the money earned in that home. If Dad has a $70,000 a year job, Mom works part-time while the kids are in school as a store clerk and earns $12,000, and Junior has a paper route that nets him $1,000 a year, the household income is $83,000.

You use the median rather than the mean (arithmetic average) to avoid throwing off the balance by one or two large numbers.

Say you have a small town of 10 families – one of whom is a multimillionaire who was born there and has devoted part of his fortune to keeping the town alive and growing. Three families are below poverty level, five are lower middle class, one is upper middle class, and then you have the multimillionaire’s family, with a household income of $750,000 a year. Averaging their incomes out gives you a mean somewhere around the upper middle class family’s income; using the median fixes it between two of the lower middle class families’ income, where it ought to be in order to represent a cross section of the families.
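The small-town example can be tried out with a few lines of Python. Only the $750,000 figure comes from the description above; every other income in the list is a made-up number chosen to fit the “below poverty”, “lower middle class” and “upper middle class” labels.

```python
from statistics import mean, median

# Hypothetical household incomes for the ten families. Only the 750,000
# figure is from the post; the rest are invented to match the description.
incomes = [
    12_000, 14_000, 15_000,                  # three families below poverty level
    30_000, 32_000, 35_000, 38_000, 40_000,  # five lower middle class families
    95_000,                                  # one upper middle class family
    750_000,                                 # the multimillionaire's family
]

town_mean = mean(incomes)
town_median = median(incomes)  # with ten values, the midpoint of the 5th and 6th

print("mean household income:", town_mean)
print("median household income:", town_median)
```

With these invented figures the mean lands a little over $106,000, near the upper middle class family, while the median is $33,500, between two of the lower middle class families.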

Median household income is a useful standard for a lot of uses. Several federal programs use it to ensure that federal funding goes where it is most needed, not where the most creative proposal writers happen to be. (USDA and HUD programs, for example, will fund water and sewer system projects in communities where the MHI is below the statewide average, along with other criteria – such as a demonstrated need for the funds owing to nonexistent or deteriorated systems.)

Thanks everyone! A small part of my own ignorance has been successfully fought.

Sometimes, the median figures support the changes you want in government policy. Sometimes, the mean figures are more friendly. In the latter case, we say that the means justify the end. :smack:

IIRC, those are all measures of central tendency. There are several means: the arithmetic mean, what we think of as an average; a geometric mean, which is the product of the outcomes divided by their number; a harmonic mean, which I don’t know what the heck it is; and maybe more for all I know.

I’ve searched the census website quite a bit lately for work, and it appears that income is put into ranges. For example, it may be $0 to $10,000, $10,001 to $20,000, and so on. One can take an average of these, IIRC, by just making every member of each income range equal to the range’s midpoint. But if there are lots of people earning $9,000 and few earning $2,000, then that average will be inaccurate. Additionally, it looks as though the top range is something like $150,000 and up. Thus there is no midpoint to take a guesstimate by.
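Here is a sketch of that midpoint trick in Python, assuming some invented brackets and head counts (none of this is real census data, and midpoint_mean is just a name made up for the example). An open-ended top bracket is marked with None, and the function refuses to guess a midpoint for it.

```python
# Each bracket is (lower bound, upper bound, number of households).
# All figures are hypothetical; upper=None would mean "this amount and up".
brackets = [
    (0, 10_000, 50),
    (10_000, 20_000, 30),
    (20_000, 30_000, 20),
]

def midpoint_mean(brackets):
    """Estimate the mean by treating everyone in a bracket as earning its midpoint."""
    total_income = 0.0
    households = 0
    for lower, upper, count in brackets:
        if upper is None:
            raise ValueError("open-ended bracket has no midpoint")
        total_income += (lower + upper) / 2 * count
        households += count
    return total_income / households

estimate = midpoint_mean(brackets)
print("estimated mean income:", estimate)
```

With these made-up counts the estimate is $12,000. Adding a bracket such as (150_000, None, 5) raises an error, which is the “no midpoint to guesstimate by” problem in code form.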

I haven’t submitted a question to the census answer people on this point, but that is my guess as to why they use median…the mean is just too sloppy to be really accurate. We can get average income from national output, but I think the census works from a different direction and when the data is just ranges of incomes, it makes it hard to get a good average number.

YMMV. Odds are pretty good that I’m totally wrong, I’m sure.

Just checked the site and the form takes the actual income, not broken into categories. So I was mistaken.

A great book (concise, clear, not overly long) is Darrell Huff’s How to Lie With Statistics. (I have the 1954 hardcover, but I have seen it both as a trade paperback and as a mass-market edition.) It explains the various uses of “averages” and notes the ways in which they are manipulated, all presented so that the math-challenged can understand.

The geometric mean is the nth root of the product of the numbers, e.g. the geometric mean of a, b, and c is the cube root of abc. It’s more useful than the arithmetic mean if the values that you are taking the mean of increase exponentially.

The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the numbers (1/((1/a+1/b+1/c)/3)). I seem to recall that it is appropriate for taking the mean of rates (such as speeds over equal distances), although I’ve never had occasion to use it in “real life”.
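Both definitions translate almost word for word into Python. The sketch below writes them out by hand and compares them with the versions in the standard statistics module (geometric_mean needs Python 3.8 or later). The helper names and sample numbers are made up for illustration.

```python
import math
from statistics import geometric_mean, harmonic_mean

def geo_mean(values):
    """nth root of the product of n positive numbers."""
    return math.prod(values) ** (1 / len(values))

def harm_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals (values must be nonzero)."""
    return len(values) / sum(1 / v for v in values)

values = [2, 4, 8]  # arbitrary sample; the product is 64, so the cube root is about 4
print("geometric mean:", geo_mean(values), "vs library:", geometric_mean(values))
print("harmonic mean:", harm_mean(values), "vs library:", harmonic_mean(values))

# Driving the same distance at 30 and then at 60 (hypothetical speeds):
# the overall average speed is the harmonic mean, about 40, not 45.
print("average speed:", harm_mean([30, 60]))
```

Results are floating point, so they may differ from the exact answers in the last decimal place.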

If you really want to rot your brain (unless you’re a mathematician who is into such things - I most emphatically am not), you can see the definitions of lots of other means in the “Means” entry at MathWorld.

Curses! I would have gotten it right, if it weren’t for those meddling kids!

I see that you have read ‘How to Lie with Statistics’! :wink: And, although the success of wordplay is in the “Oy!” of the beholder, I’d have to say that, while you may be right on this, I much prefer to seek the happy median. :stuck_out_tongue:

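This is exactly on the mark. Too bad we can’t post a curve to illustrate the lesson! The income curve for the U.S. is not a symmetric bell: the bulk of households sit in a hump on the left, with a long thin tail stretching out to the right (statisticians call this right-skewed, after the tail). While the average income may rise because a few billionaires make a few more billion, the majority of us will remain wallowing in the blob to the left.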