One of the most misleading ideas in all of climate science is that there is actually such a thing as an “average global temperature”. It simply does not exist.
The measurable aspects of physical objects are divided into two groups - “extensive” attributes and “intensive” attributes.
Extensive attributes include such things as mass and length. We can take the arithmetic mean (the most common type of average, but far from the only type) of extensive attributes quite easily. First, we add up the individual masses or lengths. This gives us a meaningful quantity, the total mass or the total length. If we add up the masses of three people, we get a total mass for the three which has a physical meaning. We then divide this total by three, and we get the average mass.
Intensive attributes, on the other hand, include such things as temperature and pressure. But attempting to take an average of these quickly leads us into unphysical wonderland. To start with, we add up the temperatures, just as we did the masses. But that total has no physical meaning, and it is not even a fixed figure: it depends entirely on how many pieces we happen to slice the system into.
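A toy calculation makes the distinction concrete. In the sketch below (invented numbers, and equal specific heats assumed), the arithmetic mean of two water temperatures is not the temperature you would actually measure after mixing the two parcels:

```python
# Two parcels of water; all numbers invented, equal specific heats assumed.
masses = [1.0, 9.0]    # kg  -- extensive: the total (10 kg) is physical
temps  = [10.0, 30.0]  # °C  -- intensive: the "total" (40 °C) means nothing

total_mass = sum(masses)
naive_avg_temp = sum(temps) / len(temps)   # 20 °C, with no physical referent

# The temperature actually measured after mixing is the mass-weighted mean:
mix_temp = sum(m * t for m, t in zip(masses, temps)) / total_mass

print(naive_avg_temp)  # 20.0
print(mix_temp)        # 28.0
```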
For example, suppose we are looking at the different counties in California, and we have one temperature reading for each county, and one area measurement for each county. We can add up the areas of each of the counties to get a total area for the counties, which has a physical meaning - it is the total area of California. And we can divide by the number of counties to get an average area for each county.
But when we try that with the temperatures, we get a meaningless figure, the “total temperature” of the counties … is this the “total temperature” of California? And because this figure is meaningless, so is the result of dividing it by the number of counties.
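Here is the county example in miniature, with invented numbers. The total area is a real quantity; the “total temperature” is not, and the “average temperature” changes with an arbitrary choice of weighting:

```python
# Hypothetical county data -- invented numbers, not real California figures.
areas = [500.0, 4000.0, 12000.0]   # km², extensive
temps = [15.0, 25.0, 35.0]         # °C,  intensive

total_area = sum(areas)            # 16500 km²: a real physical quantity
total_temp = sum(temps)            # 75 °C: the "total temperature" of... what?

unweighted = total_temp / len(temps)                                   # 25.0 °C
area_weighted = sum(a * t for a, t in zip(areas, temps)) / total_area  # ~32.0 °C
print(unweighted, area_weighted)   # two different "averages", same raw data
```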
To put it bluntly, “average temperature” is a very poor metric for measuring climate changes, because it has no physical basis. An excellent explanation of this problem, along with other problems with temperature averaging, is available here. (A better metric would be the heat content of the ocean. Heat content is an extensive quantity, so a total or an average of it actually has a physical meaning.)
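For a sense of scale, here is a back-of-envelope heat content calculation, using round assumed numbers. Because heat content is extensive, the heat contents of the ocean’s sub-volumes add up to a physically meaningful total:

```python
# Back-of-envelope, with round assumed numbers: Q = m * c * dT.
mass_ocean = 1.4e21   # kg, rough mass of the world ocean
c_seawater = 3990.0   # J/(kg·K), approximate specific heat of seawater
dT         = 0.1      # K, an assumed average warming of the bulk ocean

Q = mass_ocean * c_seawater * dT
print(f"{Q:.2e} J")   # ~5.6e23 J -- a total with a real physical meaning
```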
But let us suppose that we choose to ignore the fact that we are using an imaginary figure, the “average temperature”. If we want to use the mythical “average temperature” as our metric, the problems have only started. For example, should we average all of the stations in the world together, or average the northern and southern hemispheres separately and then average the two to get the final answer? There are advantages and disadvantages to each approach, and there is no rule or reason for favoring one over the other … but they give different answers. (The usual method, by the way, is to average the hemispheres separately, and then average the two of them.)
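A toy example with invented station values shows how the order of averaging matters whenever the two hemispheres have different numbers of stations (as they always do, since the great majority of stations are in the north):

```python
# Toy illustration: the order of averaging changes the answer when the
# hemispheres have different station counts. All values are invented.
nh = [14.0, 15.0, 16.0, 15.5, 14.5]   # five northern-hemisphere stations
sh = [12.0, 13.0]                     # two southern-hemisphere stations

# Method 1: pool every station into one global mean
global_pooled = sum(nh + sh) / len(nh + sh)

# Method 2: average each hemisphere, then average the two means
global_hemis = (sum(nh) / len(nh) + sum(sh) / len(sh)) / 2

print(global_pooled)  # ~14.29
print(global_hemis)   # 13.75 -- same data, different "global average"
```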
Having made a selection between those two methods on some basis, we are then faced with a host of other problems. How do we deal with missing data? Ignore it? Estimate it? How much data does a station need to have before it is considered at all? Do we average the stations individually, or do we combine them into “gridcells” and average the gridcells? Again, there are no a priori “correct” answers to these questions. The custom is to use gridcell averages which are then averaged, but there is no reason to pick that over the alternative. And remember … each of these options will give us a different answer for the “average”.
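Again, a small invented example shows that station-by-station and gridcell averaging are genuinely different metrics, not two roads to the same answer:

```python
# Toy illustration: station averaging vs gridcell averaging (invented data).
# Gridcell A is densely sampled and warm; gridcell B has one cool station.
cell_a = [20.0, 21.0, 22.0, 21.5]   # four stations in gridcell A
cell_b = [10.0]                     # one station in gridcell B

stations = cell_a + cell_b
station_mean = sum(stations) / len(stations)        # every station gets equal weight

cell_means = [sum(cell_a) / len(cell_a), sum(cell_b) / len(cell_b)]
gridcell_mean = sum(cell_means) / len(cell_means)   # every cell gets equal weight

print(station_mean)   # 18.9
print(gridcell_mean)  # 15.5625 -- same data, very different "average"
```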
Next, how do we deal with the changing numbers (and thus locations) of stations over time? The number of stations dropped radically in the 1990s, and the closures fell disproportionately on rural stations. How do we adjust for that shift from rural toward urban coverage?
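Even when no individual station warms at all, dropping the cooler rural stations shifts the raw network average. A sketch with invented numbers:

```python
# Toy illustration of station dropout (all numbers invented).
# Rural stations run cooler; when they close, the raw network mean jumps
# even though no individual station warmed at all.
rural = [10.0, 10.5, 9.5]     # cooler rural stations
urban = [14.0, 14.5]          # warmer urban stations

before = sum(rural + urban) / len(rural + urban)   # full network
after  = sum(urban) / len(urban)                   # rural stations dropped

print(before)  # 11.7
print(after)   # 14.25 -- apparent "warming" from station loss alone
```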
What about gridcells (currently about 20% of the earth’s surface, and a much larger fraction in the past) where there is no data at all? Do we ignore those gridcells, or estimate them in some manner? Again, arguments can be made for either of these options, and different organizations use different methods.
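Here is a miniature version of the choice, with invented values and a deliberately crude neighbour-interpolation scheme standing in for the more elaborate infilling methods actually used:

```python
# Toy illustration: two policies for gridcells with no data (invented values).
cells = [15.0, None, 21.0, None, 5.0]   # None = no observations in that cell

# Policy 1: ignore the empty cells entirely
observed = [c for c in cells if c is not None]
mean_ignore = sum(observed) / len(observed)        # ~13.67

# Policy 2: infill each empty cell from its immediate neighbours
# (crude; works here only because the gaps are isolated interior cells)
filled = list(cells)
for i, c in enumerate(cells):
    if c is None:
        filled[i] = (cells[i - 1] + cells[i + 1]) / 2
mean_infill = sum(filled) / len(filled)            # 14.4

print(mean_ignore, mean_infill)   # different answers, same raw data
```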
And if a very cold gridcell has data for only half of the period of record, then including it lowers the average for that half of the record alone … How do we adjust for that?
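A sketch of the problem, with invented numbers: neither cell changes temperature at all, yet the combined average takes a ten-degree step down the moment the cold cell begins reporting:

```python
# Toy illustration: a cold gridcell that only reports in the second half
# of a ten-year record (invented numbers).
warm_cell = [20.0] * 10                # reports in all ten years
cold_cell = [None] * 5 + [0.0] * 5     # reports in the last five years only

series = []
for w, c in zip(warm_cell, cold_cell):
    vals = [v for v in (w, c) if v is not None]
    series.append(sum(vals) / len(vals))

print(series)
# [20.0, 20.0, 20.0, 20.0, 20.0, 10.0, 10.0, 10.0, 10.0, 10.0]
# Neither cell changed temperature, yet the "average" fell ten degrees.
```

(The usual remedy is to work in anomalies from each cell’s own baseline rather than in absolute temperatures, but that choice brings its own set of decisions.)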
Next, gridcells that contain a large number of stations have less variance in their average than gridcells that contain only a few stations. Should we “variance adjust” the gridcells to have the same variance, and if so, should they be adjusted to have the larger or the smaller variance?
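The effect is just the familiar statistical fact that the mean of n independent readings has a standard deviation of σ / √n. A quick simulation, assuming a common station-level scatter of σ = 2:

```python
# Sketch: the mean of n station readings has standard deviation sigma/sqrt(n),
# so a densely sampled gridcell is inherently "calmer" than a sparse one.
# The station-level scatter (sigma = 2) is an assumption for illustration.
import random

random.seed(0)
SIGMA = 2.0

def cell_mean(n):
    """Mean of n simulated station readings in one gridcell."""
    return sum(random.gauss(0, SIGMA) for _ in range(n)) / n

def spread(n, trials=10_000):
    """Standard deviation of the gridcell mean across many trials."""
    means = [cell_mean(n) for _ in range(trials)]
    mu = sum(means) / trials
    return (sum((m - mu) ** 2 for m in means) / trials) ** 0.5

print(spread(2))    # ≈ 1.41, i.e. sigma / sqrt(2)
print(spread(50))   # ≈ 0.28, i.e. sigma / sqrt(50)
```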
What do we do with individual temperature records that seem wildly extreme, and are perhaps in error? Do we ignore them, and if so, how do we choose which ones to ignore? Again, whatever rule we choose will have proponents and detractors. (The method used by Jones et al. in the HadCRUT dataset is to remove data points which are more than 5 standard deviations from the 1961-1990 average. I leave it as an exercise for the reader to figure out why, in a period of generally rising temperatures such as 1850-2000, this method will change the overall trend.)
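One way to set up the exercise as a simulation, with every parameter invented: the key is that the 1961-1990 yardstick sits near the warm end of a rising record, so an early cold excursion lies much farther from the baseline mean than an equally large early warm excursion, and only the cold one gets screened out:

```python
# A sketch of the exercise (all parameters invented): screen a rising series
# against 5 standard deviations from a fixed 1961-1990 baseline, and compare
# the trend before and after screening.
import random

random.seed(42)
years = list(range(1850, 2001))
data = [(y - 1850) * 0.004 + random.gauss(0, 0.2) for y in years]  # ~0.6 °C rise

data[years.index(1860)] -= 1.0   # hand-placed cold excursion, early
data[years.index(1870)] += 1.0   # hand-placed warm excursion, same size

# 5-sigma screen against the 1961-1990 mean, as in the HadCRUT rule
base = [d for y, d in zip(years, data) if 1961 <= y <= 1990]
mu = sum(base) / len(base)
sd = (sum((b - mu) ** 2 for b in base) / len(base)) ** 0.5

kept = [(y, d) for y, d in zip(years, data) if abs(d - mu) <= 5 * sd]
removed = [y for y, d in zip(years, data) if abs(d - mu) > 5 * sd]

def slope(pairs):
    """Ordinary least-squares trend, °C per year."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(v for _, v in pairs) / n
    return (sum((x - mx) * (v - my) for x, v in pairs)
            / sum((x - mx) ** 2 for x, _ in pairs))

print(removed)                        # expect the 1860 cold spike in this list
print(slope(list(zip(years, data))))  # trend with all points
print(slope(kept))                    # trend after screening -- slightly flatter
```

The warm spike survives because it happens to lie closer to the late-record baseline; the screening is symmetric about the baseline, but the data are not.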
Finally, we come to the question of the stations themselves. These are plagued with a host of problems, which include inappropriate locations (too close to buildings or roads, etc.), changes in location, changes in instrumentation, changes in averaging methods, changes in observation times, lack of maintenance, changes in surroundings (local heat islands), changes in cleaning frequency, changes in observation frequency, changes in elevation, and changes in instrument enclosures (from painting to repair to replacement).
All of these problems with the stations might be tolerable if we had accurate historical records of these changes … but we have only the most scattered, fragmentary records for many stations, particularly those in developing countries. This means that we can’t even begin to correct the historical data for these problems.
In short, while it seems clear that the world is warming overall, exactly how much it is warming is still a very open question. This can be seen in the difference between successive versions of the same dataset. For example, the 1940 to 2000 temperature change differs between the 2000 version of the USHCN temperature record and the current version of USHCN by about 0.5°C … which is about the same size as the temperature difference being measured. See here for details.
In other words … the 1940-2000 difference between the two successive versions (0.5°C), both produced by the same organization but using different assumptions as detailed above, is nearly as large as the entire claimed change in the 20th century (0.6°C).
And all of this, of course, is merely one more example of what I have been pointing out all along. Our knowledge of the climate system, both instrumental and theoretical, is woefully inadequate to support the sweeping claims being made by AGW proponents.
w.