Is Global Warming science?

It wouldn’t surprise me at all if 75% of the American population was already so inclined.

Anyway, let’s assume that a general-population distrust of science comes into being (or gets worse) if the anti-AGW stance becomes widely believed (not proven, mind you, just believed). As I figure it, the net effect is fewer American kids go into science because they get called “nerds” by their willfully-troglodyted peers, America falls behind in science, the Chinese build a base on the moon, the Americans experience a freakout similar to the one their grandparents felt after Sputnik, science education resumes in a big way, and in 2033 President-Elect Bristol Palin takes office and a new cycle of “DUH-ning in America” begins.
Big whoop.

So how much money are we talking, here? Besides, the notions of how to combat AGW aren’t nearly as timewasting and meaningless as, say, going to church. More efficient vehicles and cleaner power plants are a benefit in themselves.

I’m just sarcastically pointing out that the very thing Bricker is trying to warn me about already exists in America, and in a big way.
And actually, it’s not clear to me that even the remaining 10% has a firm handle on how gravity works, in the sense that they’ve fully charted out the mechanism and can replicate it under laboratory conditions.

There has been huge political import to climate study for at least twenty years. The scientists who do climate research are well aware of how the results of their work can be and are being used.

The study of climate is science. The issue of “Global Warming”–how climate science is being portrayed in the media and used to achieve political ends–is obviously not.

Sam, have you ever done research programming? I have. It’s a different world. I never saw any source control, no one cared about code quality, there was no bug tracking, and coding deadlines were frequently dictated by publication deadlines.

You think this is bad science. The truth is, based on your description, it’s not any worse than average.

Yeah, I have. When I was in college I worked summers in a private chemistry lab, writing software for doing Fourier synthesis and other things. And you’re right - there were no procedures in place, no source control, no versioning, and no one to check if a college student’s work was any damned good.

But that kind of makes my point. If the modeling software is this screwed up, then I certainly don’t want to put my kid’s well-being in its hands. We’re being asked to radically change the economy of the world, in large part based on results of modeling software that seems to me to be quite unreliable and potentially fatally flawed.

You know, once you get away from software used to model things of academic interest and you start talking about software models that can kill people (or save them), I expect a hell of a lot higher quality. NASA does software modeling, as do many big engineering shops (finite element analysis, etc). And I guarantee you their code is written rigorously, is quality controlled under known processes, and stored and versioned properly. Go have a look at the processes in place to guarantee code quality for space shuttle algorithms.

The problem here is we’re talking academic-level quality, and using it to justify the biggest damned production decisions in the history of the world. It’s crazy.

But I wasn’t talking just about the code. The people at CRU were the custodians of critical databases of original raw data. The value of that stuff to mankind has to be measured in the billions of dollars, considering the decisions we’re thinking of making based on it. And as far as I can tell, they took absolutely no care or due diligence to protect that data. In fact, they claim that much of it is lost, and the rest is hopelessly contaminated. That’s damned criminal. It’s like finding out that the custodians at the Louvre use the back of the Mona Lisa as a tablecloth on their lunch breaks.

Look, I don’t think this destroys the case for AGW. It does, however, weaken it or at least add much uncertainty to the conclusions that have been drawn. And it seriously damages the credibility of the science among the public. The people who should be angriest are the climate science community and the people working to pass global warming legislation. They’ve been dealt a serious blow because of the unprofessional conduct of some leading figures in a major research center for climate.

Rather than circling the wagons and trying to defend these people, the AGW community should be working feverishly to regain the trust of the public and to prove that the science is still intact. Here’s what they should be doing:

  1. Put together a public commission of scientists, managers, and engineers. Some of them should be leading researchers in the climate field (who are not associated in any way with CRU), and others should be pulled in from other fields. The Challenger Commission is a good model for this. Their goal is to do a survey of the literature, to find out how much of it is based in part on the results from CRU. Then they should get climate researchers to re-run the results with that data entirely removed. The goal is to find out whether a strong case can still be made for global warming even if every piece of data out of CRU turned out to be fraudulent or shoddy. Then they can do an analysis of CRU’s results and determine how much of that data is still useful.

  2. This commission should be given access to all source code for any models that have been used in published papers on global warming. They can then enlist the computing community to analyze it, debug it, and report on the state of the software.

  3. The commission should inventory all original source data, catalog it, and archive it. They should also give recommendations for proper safeguards for code, including version control, archiving, documentation, etc.

  4. The commission should work to set up a form of open-source licensing for scientific research. Any paper published in a peer-reviewed journal should be accompanied by all source data and any software used to derive its conclusions. Most sciences already do this. This nonsense of protecting your models and your data and simply asking everyone to trust that you’re honest has got to stop. This subject is way too important. Of course, you can’t force people to do this if they are privately funded, but you can certainly make note of the refusal to release data in the paper submission, and journals can choose not to accept papers whose authors refuse to archive their data.

I’m not out to ‘debunk’ global warming. I spent a long time reviewing the science to my satisfaction and decided that there was enough support for it that it was the best hypothesis. I’m pissed off now, because a lot of that effort was spent on garbage information masquerading as science. So now I want to get the science back on track. This mess has to be cleaned up and the public trust in climate science restored. Then we can look at the data again and re-evaluate.

That’s because we don’t pay them to do it. They are scientists, not conspirators on a jewel heist. They don’t plan on criminals hacking into their systems. If the government wants the data protected, then the government should offer a grant to hire someone to protect it. These people aren’t wealthy oil companies with millions at their disposal to hire goons.

There is positively nothing in those e-mails to even suggest a conspiracy. Nothing at all. In fact, I’ll bet the people releasing these e-mails are specifically holding back conversations that demonstrate genuine debate. Rather than a flood of profound e-mails, we have a scattering of individual e-mails that are easily explainable to anyone who has published in and cited scientific journals. Unfortunately, the general population has not.

The debate is a waste of time. We lose.

No one’s talking about conspiracies. This is a straw man set up by the people trying to circle the wagons. The first defense I saw of CRU was that no grand conspiracy was found. But no one alleged there was one. What seems to have happened is that a bunch of overzealous scientists took a whole lot of shortcuts because they were already convinced of the conclusion they were seeking. Their attitude wasn’t “we need to run models and experiments to see if AGW is occurring”; it was more like “We KNOW AGW is occurring. There’s no doubt. Our job now is to build the public case for it. And if we’ve got to fudge a little data, well… Any data that sheds doubt on AGW must be wrong anyway - we just have to figure out why it’s wrong, or come up with some new parameters in our models to explain it or make it go away.”

And by “losing data”, I’m not talking about protecting it from hackers. I’m talking about literally losing it. As in, “We don’t know where it went. We don’t have it any more. We think someone might have purged it because they needed the storage space.” What we’re left with is data that’s already been ‘corrected’ by various algorithmic and statistical means, except that’s not documented well either, so no one really knows what the corrections are. And if you read some of the comments by their data analyst guy, he was trying to validate their data sets and finding measurements for weather stations that don’t exist. He even said in an E-mail that it was possible that someone simply invented fake weather stations and inserted them into the data to help smooth over the gaps. But no one really knows.

Seriously - forget the simple-minded conspiracy stuff. Think more like an auditor. Go look at those E-mails. Document the many lapses in quality, judgment, and good working practices. It will astound you. I won’t go as far as to say the data is completely worthless, because I don’t have a good sense of what has been compromised and what hasn’t, and how much is recoverable from outside sources. But as of this moment, ALL of it needs to be re-analyzed and questioned.

And let’s not forget the outright illegal activity, such as stonewalling FOIA requests, threatening to destroy data rather than release it under FOIA, exhortations in E-mails reminding everyone to delete them afterwards so there is no paper trail of what they were up to, etc. Pretty damning stuff.

Purely anecdotal, but I do a lot of IT work for Sandia Labs, and I’ll give you my take. Mind, I am not involved in their modeling, but wrt IT security and basic systems design, my impression of most of the scientists and admins is that they are a total pain in the ass. Why? Because they THINK they know…well, everything. Oh, they are VERY smart folks, but their idea of security or systems design is a joke because, frankly, they aren’t trained in those fields. It’s like they think that because they are really smart, with degrees in highly technical and challenging fields, they somehow know everything there is to know about how IT security and infrastructure work and should be set up. As anyone who knows Sandia can vouch, their security is a mess and their infrastructure is inadequate, poorly documented, and poorly designed, with all kinds of weird, Byzantine rules and responsibilities.

FWIW, I can easily see at least the part about the model code being spaghetti-like, with poor version control on both code and data. It would be like pulling teeth to TELL the scientists (and administrator types) that there already exists a methodology for coding and control…they would dismiss it if they didn’t come up with it themselves, and be such pains in the ass about it that in the end it would be done their way.

I don’t believe they are intentionally trying to skew results towards an AGW conclusion…but I could see how confirmation bias and their own sense of being right could cause them to bulldoze any dissenting view, or to toss out seemingly anomalous data. The history of science is full of instances where these kinds of things happened. I’m confident that, in the end, a very solid theory will come out that is provable, repeatable, etc etc…and that the theory will still be continually refined, just as ALL scientific theories are continually refined and updated as new information comes out. I’ll be surprised if whatever emerges radically contradicts the current climate theories, as I think there is a LOT of evidence that the earth is in a warming period, and that mankind is having a non-zero impact on that warming.

-XT

This will make you cry: Code Comments from HadCrut analyst.

Just to clarify, HADCRUT is the global temperature record that CRU produces (jointly with the Hadley Centre). Its data is the single most cited resource in other global warming literature (i.e. a lot of non-CRU global warming research is based in part on data from HADCRUT).

The link contains comments from the guy charged with maintaining and running the HADCRUT processing code after the original authors of the software apparently left the job.

Some highlights:

And it goes on, and on, and on… In another comment, he notices that there are Canadian weather stations in the database that don’t even exist, and he speculates that someone may have simply invented them for some reason.

Also, the software was written on another computer, and when he ran it on the current one, the results did not match - not even close in some cases. He speculates that it could be rounding errors, differences in data lengths, compiler bugs, whatever. But the data is not reproducible.
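To make that concrete - and to be clear, this is not their code (which as I understand it was Fortran and IDL), just a generic Python sketch of the effect - precision and summation order alone can shift a result between machines, which is enough to break any expectation that a rerun will match the archived output exactly:

```python
# Illustration only: how the same calculation can give different answers on
# different machines or builds. The data here is random and hypothetical; the
# point is that precision and accumulation order alone move the result.

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a batch of gridded temperature anomalies.
anomalies = rng.normal(loc=0.0, scale=2.0, size=50_000).astype(np.float32)

mean_forward = np.float32(0.0)
for x in anomalies:                 # naive accumulation in float32, forward order
    mean_forward += x
mean_forward /= np.float32(anomalies.size)

mean_reversed = np.float32(0.0)
for x in anomalies[::-1]:           # same data, same arithmetic, reversed order
    mean_reversed += x
mean_reversed /= np.float32(anomalies.size)

mean_f64 = anomalies.astype(np.float64).mean()   # higher-precision reference

print(mean_forward, mean_reversed, mean_f64)
# The three values typically disagree in the low digits. Harmless for one
# mean, but chain thousands of such steps and "the output must match the old
# run" stops being a meaningful test.
```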

A lot of the results aren’t even really based on algorithms representing known processes, but simply ‘calibrations’ made by hand. Someone would inject a file containing a bunch of strange integers, used for no other purpose than to massage the output. It looks like they’d tweak these sets of numbers to ‘tune’ the output to get rid of results they didn’t think were ‘correct’.
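For anyone who hasn’t seen what that looks like in practice, here’s a minimal sketch of a hand-entered adjustment file being folded into a series. The file name, numbers, and Python itself are all my inventions, not anything from their code:

```python
# A minimal sketch of a hand-made "calibration": a small set of numbers chosen
# by eye, interpolated onto the series, and added to the output.

import numpy as np

years = np.arange(1900, 2001)
raw_series = np.random.default_rng(1).normal(0.0, 0.3, size=years.size)

# Imagine these came from a file like "adjustments.dat" (hypothetical): one
# value per decade, with no documented physical basis for any of them.
decade_knots = np.arange(1900, 2001, 10)
hand_adjustments = np.array([0.0, 0.0, 0.0, -0.1, -0.2, -0.1, 0.0, 0.1, 0.3, 0.5, 0.75])

# Interpolate the decadal fudge factors to yearly resolution and add them in.
adjustment = np.interp(years, decade_knots, hand_adjustments)
tuned_series = raw_series + adjustment

# The tuned series now "looks right" to whoever picked the numbers, but unless
# the adjustment file and the reason for each value are archived with the
# code, nobody downstream can tell corrected output from raw output.
```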

At Realclimate they already replied to this.

One thing I noticed about that cherry-picking: many are forgetting that it involved ongoing investigation, programming, and processing of data. What’s missing is the context of whether the steps or items in the emails were actually used in the final papers or conclusions.

I wasn’t talking specifically about that. And I understand the reason for the ‘hide the decline’ stuff - a determination was made that tree ring data after 1960 is unreliable, so they’re trying to correct it by overlaying it with actual measured temperature. Fine. Maybe. I believe there is still controversy about this. And the problem with models that don’t release their source is that other scientists using the data have no way of knowing how it’s already been ‘corrected’. Some of the corrections may be published and known, but others may not be. Some people might have added fudge factors that they thought were no big deal, and didn’t even bother to tell anyone. There appear to be comments in the simulation code to that effect.
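Roughly, the kind of correction being described looks like the toy below. This is not the actual reconstruction code, and the cutoff and data are made up; the point is only that once the splice is done, nothing in the output itself tells a downstream user it happened:

```python
# Toy illustration of splicing instrumental measurements over a proxy series
# after a cutoff year. All data is invented.

import numpy as np

years = np.arange(1900, 2000)
proxy = np.random.default_rng(2).normal(0.0, 0.2, size=years.size)          # e.g. a tree-ring index
instrumental = 0.01 * (years - 1900) + np.random.default_rng(3).normal(0.0, 0.1, size=years.size)

cutoff = 1960
corrected = proxy.copy()
corrected[years >= cutoff] = instrumental[years >= cutoff]   # replace proxy values after the cutoff

# Anyone handed only `corrected` sees a single continuous series; without the
# source code and a note about `cutoff`, they cannot tell where the proxy
# record ends and the instrumental overlay begins.
```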

But there is a lot more going on here than that. What it looks like to me is that a lot of their data is the result of hand-corrections by well-meaning computer people operating largely in the dark as to what other computer people did.

What do you think the odds are that this program’s output is exactly based on the models and algorithms as originally designed, as opposed to being the result of buggy code, tweaked data, and assumptions made by people who are no longer around to explain them?

Just look at the number of times poor Harry the data analyst had to make a ‘best guess’ as to what the data was supposed to be. He’d write his own programs to try to parse files of unknown origin, then run comparators to see if the data matched other files of known origin, just to figure out what the hell he was working with. He had to remove weather station data that appeared bad or nonsensical, and there was a lot of it (implying that there is probably an equal amount of weather station data that’s also bad, but not bad enough to trigger his limit algorithms). In places he commented out code he didn’t understand, simply because it didn’t seem to give the results he expected.
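I don’t have his actual code, but the ‘limit algorithm’ part presumably amounts to something like this rough Python sketch (file format and bounds are my assumptions), which is exactly why values that are wrong but still plausible slip straight through:

```python
# A rough sketch of a sanity filter: drop station records whose values fall
# outside plausible bounds, keep the rest. Format and limits are hypothetical.

PLAUSIBLE_C = (-90.0, 60.0)   # crude bounds on a monthly mean temperature, in deg C

def load_station_records(path):
    """Parse a whitespace-delimited file of (station_id, year, month, temp_c) rows."""
    records = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 4:
                continue                    # silently skip malformed rows
            station, year, month, temp = parts[0], int(parts[1]), int(parts[2]), float(parts[3])
            records.append((station, year, month, temp))
    return records

def filter_obvious_garbage(records):
    """Keep records whose temperature is within the plausible range."""
    lo, hi = PLAUSIBLE_C
    kept = [r for r in records if lo <= r[3] <= hi]
    dropped = len(records) - len(kept)
    return kept, dropped

# The weakness: a sentinel value like 999.9 gets caught, but a reading that is
# wrong by only a few degrees sails through, and the silently skipped
# malformed rows leave no audit trail at all.
```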

In the end, it looks to me like what he did was run the HADCRUT simulation against the old data, trying to get the same results. If he got the same results, he’d be satisfied that the data had been reconstructed correctly.

However, since their databases are such a mess, he had no idea what the correct data was. And since the HADCRUT program was now running on new hardware, it wasn’t going to return the same results even against the same data, because things like precision had changed (and he suspected bugs in the compiler, or in the code, that hadn’t surfaced on the old hardware). So he wrote programs to analyze the output and categorize it by how closely each resolved value matched the old one. If he could get a result matching the old HADCRUT data within some level of tolerance, he would be satisfied that it was once again operating correctly, and they could use it with new data.
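Something like the sketch below is what I mean by categorizing output by tolerance. The shapes, tolerances, and numbers are mine, not his:

```python
# A minimal sketch of the acceptance test described above: compare each value
# of a new run against the old run's output and report, for several
# tolerances, what fraction of values agree that closely.

import numpy as np

def fraction_within(old, new, tolerances=(0.01, 0.05, 0.1, 0.5)):
    """For each tolerance, return the fraction of values whose new result is
    within that tolerance of the old result."""
    diff = np.abs(np.asarray(new) - np.asarray(old))
    return {tol: float((diff <= tol).mean()) for tol in tolerances}

# Hypothetical example: a handful of grid-cell temperatures from each run.
old_run = np.array([14.1, 13.9, 15.2, 16.0])
new_run = np.array([14.1, 13.95, 15.6, 16.0])
print(fraction_within(old_run, new_run))
# -> fraction of cells agreeing within 0.01, 0.05, 0.1, and 0.5 degrees.
# Note what this actually proves: only that the new run reproduces the old
# run, not that either run is right.
```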

But in the meantime, other people had dicked around with the code, adding their own ‘corrections’ and tunings, and other people had dicked around with the databases, adding and removing weather stations and such. So an awful lot of what he was doing appears to be little more than educated guesswork.

Here’s the kind of error I’d be worried about: someone runs the simulation against known data, trying to see if it matches. It doesn’t. So they assume the algorithm is wrong and tweak it to ‘calibrate’ the sim so it matches the known data. But if the algorithm was actually right, and the mismatch was the result of a bug, then the tuning will only work in that specific case. They think they’ve tuned the algorithm to be more accurate, but what they’ve actually done is tune it to absorb bad data from a bug specific to those particular numbers. So it always looks right when applied to the historical data, but is totally useless going forward.
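Here’s a toy version of that failure mode, with completely invented numbers, just to show how the ‘calibration’ ends up encoding the bug rather than the physics:

```python
# Toy failure mode: the historical "reference" output came from a buggy step,
# the calibration is tuned until the model matches that reference, and the
# tuned model is then wrong on everything new. All numbers are invented.

import numpy as np

def true_model(x):
    return 2.0 * x            # the algorithm as designed

def buggy_reference(x):
    return 2.0 * x + 3.0      # the archived "known" output, contaminated by a bug

x_hist = np.arange(10.0)                                          # historical cases used for tuning
offset = np.mean(buggy_reference(x_hist) - true_model(x_hist))    # fitted "calibration" = 3.0

def tuned_model(x):
    return true_model(x) + offset

# Against the historical data the tuned model matches the reference perfectly,
# so it looks "calibrated". But for a new input x = 100 the true answer is
# 200, while the tuned model reports 203: the bug is now baked into every
# future run, and nothing in the output flags it.
print(tuned_model(x_hist) - buggy_reference(x_hist))   # all zeros
print(tuned_model(100.0), true_model(100.0))           # 203.0 vs 200.0
```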

Have you ever worked on a codebase that’s been around for 20 years, tweaked by people who no longer work for the company? I have. Usually, it becomes unstable and unpredictable, and eventually has to be scrapped and rewritten because no one can tell what the hell it does anymore. A product I worked on was like that - it had a very complex module written by a bad coder who was long gone, with years and years of hacks and bug fixes applied to it by other developers who were no longer around. Eventually, it became useless. A bug fix would trigger five other bugs. No one could explain exactly what it was doing or what entire sections of code were for. It became an Achilles heel of the product, and was eventually removed completely.

This entire HADCRUT application looks like that. The quality of the data this thing is producing is seriously in question. I’m not claiming grand conspiracies to cover up global warming, I’m talking about processes getting out of control and people having an “Oh, shit!” moment when they realize how crappy their systems are. My guess is that the real reason CRU didn’t want to release their data and source code is because they knew that it was in an incredibly poor state, and they were trying to protect their reputations.

Michael Mann Responds to CRU Hack

Long, dense. Hard to “highlight”, link offered. Daily Kos. Moderately centrist site.