Use of median prices to track housing market

I’m hoping someone can shed some light on why median prices are used in all statistics on home sales. I don’t know much about such things, so hopefully someone can explain it at a layman’s level. I’ve been following home sales data for California, and I started getting interested as to why sales volume could be its lowest since the mid-90s without exerting much downward pressure on prices. Something I noticed was that in many areas, looking at the data purely on a “gut” level, one would have the impression that prices were falling, yet the overall median price for the area often shows as positive.

The most striking example I’m noticing for last month is in the East Bay Area. If I look at a chart for last month’s sales: http://www.dqnews.com/ZIPCAR.shtm - I notice that in Alameda county, prices fell in 11 out of 15 cities. And if I take the median prices for each city and average them, the total is lower year-over-year. Yet the overall median supposedly increased by 1.67%.

Similarly, in Contra Costa County, only 4 out of 24 cities showed an increase in the median price year-over-year, with 19 cities having considerable price drops, and if I average the prices for each city, it decreases, yet the overall median for the county shows a slight increase.

I first noticed this a few months ago, when it looked like Santa Barbara County took a huge nosedive, with only one city’s median price increasing, but it still showed an overall increase. (That data isn’t up anymore.)

What I’m wondering is, when comparing two lists of numbers, what factors would result in the median being higher, but the average being lower in one of the lists. (The answer is probably really obvious, but I never studied this aspect of math.) Is it possible to glean any information about the state of the housing market from such a phenomenon existing, or are these just arbitrary statistical glitches? Am I perhaps simply engaging in selective reinforcement?

Well, if you take in all the ‘fire sales’, sheriff sales and exchanges that happen for almost nothing, the average gets dinged, big time. It is dragged down, and it skews what is happening to the regular house that is just up for sale as regular business. In a bad housing market, there are some ridiculously low prices that ring up, from families bailing out ahead of foreclosure, auctions and just one lowball deal after another. These sales barely affect the median, because the median only depends on which sale sits in the middle, not on how extreme the sales at either end are. These property transactions do drag the average down.

To someone who is just trying to sell a home and move, the median has all the meaning. To someone with their finger on the pulse of the market, the average shows there is trouble and some really lowball deals out there, from people taking short sales out of desperation to sheriff sales and foreclosure exchanges.

Typical average Joe: watch the median

Wanna be a broker? watch the average, too.

Averages are very deceptive, and using the median can help correct the deception. However, you might be smart to watch the average too (say you are a loan officer for a mortgage company, or you are speculating on real estate, or maybe want to invest, etc.).

The median has a perfectly legitimate claim to being the “average measure” too, and it is the best fit for summarizing lots of data sets. (The others are the “mean”, which people sometimes gravitate to as the true “average measure” for little good reason, and the mode, which is pretty crappy for most things.)

Medians are much more stable than means and often provide a more realistic picture of what is happening in the middle as opposed to the mean which can easily get distorted by unusual things at both ends of the curve. Income is another thing that is usually reported as a median with good reason.

Let’s say you live in a tiny hamlet with only four other families. It is a very poor place but strikingly beautiful and attractive for people who want to escape the pressure once and for all.

Household incomes for each household in town are:

  1. 20,000
  2. 21,000
  3. 22,000
  4. 23,000
  5. 24,000

The mean household income is $22,000 and the median is $22,000.

Perfect statistics you think.

Now Bill Gates gets fed up with all this computer crap and decides to become a mountain man and move to this place but he still has his various investments of course. He buys the cabin from family 5 above.

After he moves in, household incomes for each household in town are:

  1. 20,000
  2. 21,000
  3. 22,000
  4. 23,000
  5. 1,000,000,000

The new mean income for this town is:

$200,017,200, which suggests that any income summary statistic based on the mean will report this little place as by far the wealthiest town ever known on earth, even though Bill Gates just wanted to hang up his hat and hunt bears for a while.

However, when we use the median, household income is still $22,000 (the middle value in the sorted list above). This is an extreme illustration of the distortion caused by outliers, but it is a very real effect: a single outlier can shift the mean severely, while the median keeps describing what the typical household actually earns.
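If you want to check the arithmetic in the hamlet example yourself, here is a minimal Python sketch using only the standard library. The variable names are just mine for illustration; the income figures are the ones listed above.

```python
from statistics import mean, median

# Household incomes from the example, before and after the newcomer
# buys the cabin from family 5.
incomes_before = [20_000, 21_000, 22_000, 23_000, 24_000]
incomes_after = [20_000, 21_000, 22_000, 23_000, 1_000_000_000]

mean_before = mean(incomes_before)
median_before = median(incomes_before)
mean_after = mean(incomes_after)
median_after = median(incomes_after)

print(f"before: mean = {mean_before:,}, median = {median_before:,}")
print(f"after:  mean = {mean_after:,}, median = {median_after:,}")
```

The mean jumps from 22,000 to 200,017,200 while the median stays at 22,000, which is the whole point: one extreme value moves the mean a lot and the median not at all.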

The other posters seemed to answer why the median is used. Now, to answer why your averaging wasn’t working. Presumably the cities in the counties in question did not all have the same number of sales over the course of the year. Thus, when you take a straight average of the city medians, you are giving every city equal weight, which effectively gives more weight to the cities with fewer sales than they have in the countywide numbers.

For an idealized example, imagine City A and City B both with a median sale price of $150K last year. This year, City A has 100 sales evenly distributed from $115K to $215K with a median of $165K (+10%) and City B has 10 sales evenly distributed from $85K to $185K with a median of $135K (-10%). If we just average the two medians, it would appear to be no net change, but the actual median price countywide would be about $163K (roughly +8.7%).
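Here is a rough Python sketch of that City A / City B example, in case anyone wants to play with the numbers. The `evenly_spaced` helper is something I made up for illustration, and how you space the "evenly distributed" sales is an assumption, so the countywide figure can shift by a few hundred dollars depending on that choice. Prices are in thousands of dollars.

```python
from statistics import mean, median

def evenly_spaced(low, high, n):
    """Hypothetical helper: n sale prices spread evenly from low to high."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]

# This year's sales, in thousands of dollars (figures from the example).
city_a = evenly_spaced(115, 215, 100)  # 100 sales
city_b = evenly_spaced(85, 185, 10)    # 10 sales

median_a = median(city_a)
median_b = median(city_b)

# Straight average of the two city medians: each city counts equally,
# no matter how many sales it had.
average_of_medians = mean([median_a, median_b])

# Countywide median: pool every sale, so City A's 100 sales dominate.
county_median = median(city_a + city_b)

print(f"City A median:      {median_a:.1f}K")
print(f"City B median:      {median_b:.1f}K")
print(f"Average of medians: {average_of_medians:.1f}K")
print(f"Countywide median:  {county_median:.1f}K")
```

The average of the city medians comes out at about 150K (no change from last year), while the pooled countywide median lands between 162K and 163K, because City A contributes ten times as many sales as City B.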