Last Hours of the Deepwater Horizon

The NY Times has a very good, lengthy investigative piece today on the last hours of the Deepwater Horizon: http://www.nytimes.com/2010/12/26/us/26spill.html?_r=2&hp=&pagewanted=all

The central theme seems to be that, even after the blowout, the rig might have been saved without loss of life if shipboard safety measures had been followed, if the chains of command for emergency procedures had been more clearly understood, and (possibly) if the captain hadn’t been a spineless jellyfish.

Any thoughts on this article? Any Dopers with relevant expertise care to comment on it?

Single, solitary bump. No one wants to talk about this article? It’s quite good, honest.

Just started reading and it’s already quite interesting. Nothing to share personally as this is very much out of my personal areas of experience and expertise, but I’ll be interested to see if anybody else has input.

What a massive clusterf*ck. My gosh a dozen or more people made bad decisions at once. I didn’t realize how bad it was until reading this article. Everyone was overwhelmed by the magnitude of the system failures.

This is what I want to know:
If mud and gas spewing from the drill hole, giant explosions setting the rig on fire, generators exploding, power out, and emergency safety systems failing aren’t enough reason to hit the “kill” switch, WTF is?

An engraved letter from the C-suites of both companies saying “We won’t attempt to find some way to pin this on you and fire your ass for taking a sensible and necessary step to ensure the safety of the crew of this platform and prevent a seriously damaging spill.”

Without a detailed read (can’t access it, at work), it sounds a lot like both the Bhopal disaster and the nobody-got-hurt accidents I’ve encountered at work (an example is explained a bit more in the paragraphs below). They too were situations where the final mess would have been averted if someone had had the cojones to do what they knew needed to be done.

We had a strange double situation in one of my jobs, where two similar products had the same problem (reaction not starting) in two factories, in the same week and without the second factory having heard about the first one.
In the first one (ours), it was on the weekend, and when the shift manager saw that the “Emergency Manager” he had to call for authorization to dump was an engineer known for his utter uselessness in emergencies, he said “to hell with it, I’m dumping”, and did: the product turned into a coagulated mass as soon as it touched the cold floor. We spent the rest of Saturday, all of Sunday and half of Monday morning cleaning up - no special protective equipment needed, although we did make sure the room was very well ventilated (simply by opening the doors wide; two opposing walls were whole-wall doors).
The second one took place on a weekday, the whole management team was present, and they kept saying “give it one more shot of starter… heat it a bit more… try cooling…” - the reaction finally started, but inside the reactor: two weeks of cleanup (by men wearing what looked like astronaut suits, with both self-contained and tubed air). We got kudos (I still want to know exactly how the factory manager defended to Those Above that not calling the engineer was absolutely the right thing to do); they got fired.

The NYT is blocked from your work but the SDMB isn’t? What an odd filter.

A lot of IT departments only block sites that management has heard of and/or disapproves of.

Re: the OP. I work (as a contractor) at a government safety agency, and although not a safety engineer myself, I’ve read several books in the library here and I see all kinds of safety reports and accident analyses. One thing that stands out in my experience is how often, especially but not exclusively in the big disasters, multiple things went wrong at the same time. It (subjectively) seems to me it’s hardly ever one single thing wrong, and when I read accounts of disasters, I’ve come to expect the “but that’s not all!” additions to the crisis.

It seems like that would be rare – and maybe it is…maybe individual things go wrong so often that having a bunch of them go wrong together, though rare by comparison, is still frequent enough to notice.
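Just to put rough numbers on that intuition (my own illustrative assumptions, nothing taken from the article or any report): suppose a rig has a couple of dozen independent safeguards and each one is quietly degraded a small fraction of the time. The chance of several being degraded on the same day is small, but over a long campaign it becomes close to a sure thing. A quick sketch:

    # Toy back-of-the-envelope sketch (made-up numbers, purely illustrative).
    from math import comb

    n = 20        # assumed number of independent safeguards/subsystems
    p = 0.02      # assumed chance any one of them is degraded on a given day
    k = 3         # "a bunch of them" = three or more degraded at once

    # P(at least k of n degraded on one day), via the binomial distribution
    p_day = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
    # chance of that coincidence showing up at least once over a year of days
    p_year = 1 - (1 - p_day)**365

    print(f"per-day chance of {k}+ simultaneous problems: {p_day:.5f}")
    print(f"chance of seeing that at least once in a year: {p_year:.3f}")

With those made-up numbers, the three-or-more coincidence happens on well under 1% of days, yet the odds of seeing it at least once over a year come out above 90%. So yes: individually rare, collectively frequent enough to notice.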

Charles Perrow’s Normal Accident Theory (reviewed comprehensibly here) holds that processes which are:

  1. high risk,
  2. massively complex, and
  3. tightly coupled*

will inevitably suffer accidents, thus such accidents are “normal” for such systems.

*[from the linked review I quoted: The sub-components of a tightly coupled system have prompt and major impacts on each other. If what happens in one part has little impact on another part, or if everything happens slowly (in particular, slowly on the scale of human thinking times), the system is not described as “tightly coupled.” Tight coupling also raises the odds that operator intervention will make things worse, since the true nature of the problem may well not be understood correctly.]
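To make the “tightly coupled” idea a bit more concrete, here is a deliberately crude toy model of my own (it’s nowhere in Perrow or the review, just an illustration): a single fault either jumps quickly from one subsystem to the next, or spreads slowly enough for an operator to get ahead of it.

    # Crude toy model of tight vs. loose coupling (my own simplification).
    import random

    def fraction_of_runaways(coupling_prob, operator_delay, n_units=10, trials=10_000):
        """Fraction of trials where one initial fault spreads to most of the
        system before the operator intervenes."""
        runaways = 0
        for _ in range(trials):
            failed = {0}                              # a single initial fault
            for step in range(operator_delay):        # operator's reaction time, in steps
                # each already-failed unit may knock out one more unit this step
                hits = {random.randrange(n_units)
                        for unit in failed if random.random() < coupling_prob}
                failed |= hits
            if len(failed) > n_units // 2:            # spread to most of the system
                runaways += 1
        return runaways / trials

    random.seed(0)
    print("tight coupling (p=0.9):", fraction_of_runaways(0.9, operator_delay=5))
    print("loose coupling (p=0.2):", fraction_of_runaways(0.2, operator_delay=5))

With those made-up parameters the tightly coupled case runs away far more often than the loosely coupled one, even though the only thing that changed is how promptly one unit’s failure hits the next. That’s the flavor of Perrow’s argument, stripped of all the real-world detail.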

This theory is somewhat at odds with the theory of Highly Reliable Organizations, which was sort of a response to Perrow’s book, in which some organizations (the US Navy nuclear program being frequently cited as an example) seem to have avoided “normal” accidents so far. Whether that really indicates they’ve overcome the problems or they’ve just been lucky is the subject of much debate.

IMHO (and it’s just semi-informed lay opinion), the few relatively high-reliability organizations out there with good records really have held “normal” accidents at bay, but that isn’t the result of a better understanding of technology or better systems; it’s simply the work of outstanding individual people, and as such it’s difficult to sustain when they retire or move on. We’ll eventually find that the vaunted “high reliability” is impermanent.

That’d be me, I guess.

Let’s start with Shakespeare.

You could argue that Romeo and Juliet might both have lived if lines of communication had been better, but you’d be missing the point of the play. The Montagues and Capulets had been at war long before the onstage action even begins.

Now let’s talk about offshore drilling.

Oil companies such as BP pay the bills. Drilling companies such as Transocean do the drilling. It’s been that way for almost a hundred years and it will be that way for the next hundred years. The war was not about killing people, but about maximizing profits. The collateral damage was blown out wells and dead people.

Notice that I used the past tense in that last paragraph. Ten or fifteen years ago, oil companies started to drill far offshore in very deep water. The CEOs and Presidents of BP and Transocean etc. signed a peace treaty and looked forward to an age of peace and prosperity. A memo was issued to their minions and the bosses went back to smoking cigars.

Now let’s get firmly back to reality. In the Gulf of Mexico (it’s also true in many other places with a long established oil and gas industry), working on rigs is a way of life. If you work on a rig, chances are so did your daddy and maybe grand daddy. Hopefully your kid will too. It’s well paid, you don’t need a college degree and there’s no other work. Also, when you’re actually on a rig, it’s very, very hierarchical. It’s not supposed to be that way. It’s just the system that’s developed over decades.

The people who run Transocean and BP are blissfully unaware of this. They live in a world where all of their employees take individual responsibility for safety and are prepared to shut down million-dollar-a-day operations because things look a little squirrelly. All their glossy brochures say “Safety is our #1 priority” etc.

And on to specifics.

Blow Out Preventer.

This was due for an inspection that would have taken 90 days. This wasn’t done. If Tony Hayward (former BP CEO and arch villain in many people’s eyes) had known this, he would have stopped the operation before it got started. I suspect the CEO of Transocean would have done the same. Of course, the guys at the very top never hear about these kinds of problems, and the further down the food chain the problem goes, the less likely it is that safety really is the #1 priority.

Negative Pressure Test.

Doing a pressure test on a rig is sometimes an art form. There’s always lots of piping and valves involved. There are always slow system losses. Except for the negative test. It’s virtually impossible to get a false-positive reading. Alarm bells must have been ringing in everyone’s heads, but the BP and Transocean guys decided to ignore them. Maybe having “Drill Baby, Drill!” as an industry-wide slogan for fifty years had something to do with it.

The Blowout

By the time this happened, it was too late; the rig was probably doomed regardless of what anyone did. As a famous engineer once said, “ya canna change the laws of physics”. In fact, it might have been a good thing that the Blow Out Preventer failed. If it had worked correctly, there’s a good chance it would have blown the top of the well off, leading to a much worse spill than actually happened.

Conclusion

The people who run multi billion dollar industries aren’t out to make a quick buck. Neither are they touchy feely types in touch with their inner selves. They are out to make as much money as possible over the next several decades. Taking safety seriously is a good way to do this.

Unfortunately, this hasn’t filtered down through the food chain. Someone in charge of an individual rig is only looking at the yearly profit. The guys on the rig only look at the paycheck at the end of the week.

Is there a solution? I don’t know. The structural solution is already in place and has been for many years. The problem is people. If you’ve barely graduated high school, are you really going to tell some senior engineer that you don’t like what’s going on? And if you run an oil rig, do you really want to listen?

Thanks, Tapioca!

Pretty good article, with maybe a few technical errors; reasonable given the complexity of the subject.

Meanwhile, Tapioca’s excellent post stands on its own, but I’ll add a few further comments:

  1. I’ve been in a blowout situation on land (fairly mild compared to this one, and with no fire), and in well control situations that could have devolved into blowouts offshore. One thing I can say is once a high-velocity flow gets going from the well, all bets are off. That is to say, even if the BOP were in as-new condition, closure on a high rate of flow (filled with all sorts of eroded material from the wellbore and drill string) would be very difficult indeed. The best time to attempt activating the emergency well control equipment would have been when signs of fluid imbalance first began showing up, and there is virtually no question from the articles, testimony and well data I’ve seen that there were in fact warning signs of a kick prior to the first eruption at drill floor level.

  2. It’s easy in hindsight to criticize some of the decisions made by rig staff once the blowout ignited, but a lot of that is down to the horrifying nature of the emergency they were facing, and the simple fact that emergency training does not and probably cannot accurately simulate the conditions the crew was subjected to. It’s pretty much a given that none of the crew (except perhaps some with military combat experience) had faced a situation anything like that before, and I’d be very surprised if the rig had ever practised a full emergency shutdown sequence: individual parts of the systems most likely were tested from time to time, and there are of course frequent lifeboat drills, but realistic testing of full emergency scenarios seems almost non-existent in the industry, due to the relative difficulty and expense of doing so.

I’ll close by mentioning that for several years BP has freely passed out to its contractors a training CD containing a video concerning energy isolation hazards. Tony Hayward himself introduces the scenarios therein. The centerpiece is a re-creation of a (non-fatal) fire on an offshore production platform in Egypt in the early 2000s that occurred when a set of valves were mistakenly left open prior to maintenance work on an oil/gas separator. Point is made of the fact that in the confusion once the fire erupted, supervisors on the platform were unable or unwilling to make an immediate decision to shut in the producing wells without permission from their bosses onshore. As Tapioca and others have said, there is much built-in inertia that has to be overcome when possibly hundreds of millions of dollars are riding on a shutdown decision. Tragically, it seems that while this lesson learned was obviously known within the company, for whatever reason it had not been fully absorbed.

Sorry, should be clear on one point: as I think everyone who has followed this case knows by now, BOPs are subject to regulatory requirements for frequent, documented testing; I was referring to other emergency shutdown components, where it may be more difficult to do a reasonably realistic test on a frequent basis.

That’s why it seems to me so important to have strict procedures to follow: it absolves you of the personal responsibility of having to explain to your boss why you took Drastic Step X, and it also means that you don’t have to *think* about how to deal with the emergency; you just *do* it. That’s also why it’s so dangerous that they did something like moving that alarm activation from being automatically triggered to being turned on by a person.
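A hypothetical sketch of the distinction being made here (the class name, threshold and “inhibited” flag are my own invention, not how the Horizon’s actual alarm system was configured):

    # Hypothetical sketch of automatic vs. operator-confirmed alarm logic.
    from dataclasses import dataclass

    GAS_ALARM_THRESHOLD = 0.10  # assumed fraction of lower explosive limit

    @dataclass
    class GeneralAlarm:
        inhibited: bool = False  # True = a human must trigger it manually

        def on_sensor_reading(self, gas_level: float, operator_confirms: bool = False) -> bool:
            """Return True if the rig-wide alarm actually sounds."""
            if gas_level < GAS_ALARM_THRESHOLD:
                return False
            if not self.inhibited:
                return True               # automatic: sounds the instant the threshold is crossed
            return operator_confirms      # inhibited: waits on a stressed human to act

    # Automatic mode: the reading alone is enough.
    print(GeneralAlarm(inhibited=False).on_sensor_reading(0.5))         # True
    # Inhibited mode: same reading, nothing happens until someone decides.
    print(GeneralAlarm(inhibited=True).on_sensor_reading(0.5))          # False
    print(GeneralAlarm(inhibited=True).on_sensor_reading(0.5, True))    # True

The point of the strict-procedures argument is exactly that the inhibited branch puts a judgment call, made under pressure and with career consequences attached, where a reflex used to be.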

Although the press has perhaps focused too much on the Blow Out Preventer, which was only the final stopgap measure in a long chain of circumstances, it bears mentioning that Senate testimony included the gem that the device had 241 failure modes showing at the time the decision was made to resume drilling.

I’m surprised the device has that many things that could go wrong in the first place, let alone that two hundred and forty-one of them would show as failures. There’s plenty of culpability in this tale, but somebody decided that 241 failures showing on their main safety device meant it was OK to proceed. AFTER questionable pressure readings had indicated this might not be routine. That’s flat-out Russian-roulette crazy.