I suppose billg@microsoft.com has a few levels of eyeballs filtering the inbox before Bill actually sees any of it.
Nobody died, but when my dad was in the Navy something similar happened during night maneuvers - big ol’ hole in the side of either the Buck or the John W. Thomason, I forget which one (the Thomason was my dad’s ship). Albeit without the awful codas to your story, as far as I know.
Evidently the chaplain was taking flash photos of the accident and the captain lost his shit about it (the flashes, obviously.)
I have one that cost no money or lives but was a breathtaking violation of professionalism that should have cost me my job. But somehow didn’t.
When I was the computer support person for the county 911 agency, at one point we needed to upgrade the operating system of our 911 servers. If you read my earlier post, you remember these were two redundant servers in a cluster configuration. I had just returned from vendor training where I learned how to upgrade those exact systems, but the agency decided to spend the money to have our 911 system software vendor do it instead. I went along thinking they were pros and I’d learn a little more from watching.
So the dude shows up and disappears into the server room. About 30 minutes later, the systems go down unannounced, so I wander at a high rate of speed to the server room to see what was going on. Remember, redundant; he should have been able to upgrade one server at a time while the dispatchers ran on the other server. I see the dude sitting at the console table with one of the OS manuals open on his lap. Long story short (because I interrogated him in disbelief), he had no experience with that platform (hardware or OS), so he showed up at a client site, took down all of their machines in preparation for working on them, and proceeded to read about how to do an upgrade in the manual. Starting with backups. He was struggling to do backups because the examples in the manual were a little different from our actual disk names. Moron with a capital M.
Yes, while my users - while my 911 freaking operators - were dead in the water, this dude was learning how to do the job he was hired for. My mind was blown.
I ran back to my director and told him what was going on. Shrug. If I’d done what this dude was doing, I would have been fired. But for him, they shrug? I explained to them the severity of the problem and recommended that they call the vendor and tell them that when we gave them $$,$$$ we expected experienced talent and we wanted someone else out asap to replace this phony. They patted me on the head and told me to just let him do his job.
Fuming, I did what any self-respecting, professional person would NOT do: I logged onto the internet (newsgroups back then) and posted in an industry newsgroup about what was going on and how beyond pissed I was and how DARE this vendor send this inexperienced flunky out to hose our real-time users with critical system needs. It caused a shit storm to say the least. I was scolded, but somehow not fired. Perhaps because the idiot actually kept our 911 dispatchers computer-free for a total of THREE and a HALF days.
As if this isn’t mind blowing enough, it gets mind blowing in the other direction: about 5 years later, I was actually hired by that vendor. Yes, they remembered me. No, nothing was said about that incident. Not even by the idiot, who actually still worked there!
No, the real idiot was the manager who sent an untrained, inexperienced person out alone to do this critical job.
(I bet he was still there at that company, too.)
This sort of thing is pretty hard to quantify. I used to work at the hospital (oh, excuse me - the “medical center”) on the campus where I NOW work (in a TOTALLY different capacity) and I’m SURE some bad things were done then (not that the hospital I worked at was a BAD one. Quite the contrary - it’s considered one of the very best “teaching” hospitals around. But let’s face it - when people are being operated on and there are humans involved [why wouldn’t there be?], bad things happen sometimes. Fact of life) but since I can’t think of any specific examples from there I’ll go with the one that sticks out in my head right now. It’s from a place I worked at from Y2K until 2006. It’s a little bit of a long story so I’ll try to keep it (relatively) short: the company I was working for at that time had 5 buildings, all within walking distance of one another. One time a giNORmous machine was brought to the building I worked in with the intention of having it installed there (how “ginormous” was the machine in question? Ginormous enough that when the workers tried to move it off the flatbed it was delivered on, the wheels of the forklift[s] that were used to try to move it actually sank several inches into the asphalt of the parking lot{!!!}. Also, I seem to recall that at one point, somehow, the machine fell off the flatbed into a small grove of trees that was next to [and slightly downhill from] it. It took an industrial-sized crane to retrieve said monstrosity). Anyhoo, all this time and trouble was spent trying to get that machine installed in the building I worked in. Only for it to be discovered that…(if someone could let me know how to do that “spoiler” thing I would REALLY appreciate it!) the building in question didn’t have the power necessary to run that monstrous thing!
So I think the monster machine had to be sent back to wherever it was sent to us from, costing the company untold (at least to me) amounts of $$$. Needless to say the powers-that-be were NOT happy with the results of THAT little exercise and I do believe that the person whose bright idea it was to bring that monstrosity over in the first place without first checking to see if the building he planned to have it put in could even house it later, uh…left the company of his own accord. And not MUCH later, either.
Yep, been there, done that. Unfortunately, I’d say it’s quite a common thing in the IT industry to send some poor, untrained sap out to just figure something out as they go.
The first time I ever saw an ADSL connection was when I was sent to a client site to install it and show the client how everything works. Training? Hahaha.
I had absolutely no clue how it worked, and more alarmingly, neither did the phone ‘support’ people. And this wasn’t just ‘plug the router in and it just works’ like it is these days - this was back in the early days (it wasn’t even commercially available - we were running some field tests for some trial customers, about a year before it was properly released), so there were an awful lot of settings to go through and tests to do. I just had to muddle on through and figure it out, all with the client sat there beside me.
Then again, that wasn’t on a live, mission critical system. :smack:
I was doing software QA for a company that makes educational games for young children. We had just got the license to make Star Wars games, and I was lead test for the first title. This title was also the first in a new corporate cost-saving strategy that was basically, “Fire all our full time developers, and outsource everything to China.” That’s a difficult situation to start with, only exacerbated by the fact that the company we ended up working with was… really ill equipped for the project. By which I mean, they had neither A) a single employee who could speak English, nor B) a functioning internet connection in their office. Also, I suspect, C) nobody who had ever written code before. At any rate, what should have been a simple, two month testing cycle on a Flash game written for five year olds, turned into a six month ordeal that only ended when we started relaxing release standards from, “Game never crashes,” to “Game doesn’t crash in less than one hour of play.”
This was also around the start of the trend of having achievements or trophies in video games. Ours were web-based: you played the game, then went online to see a collection of all the trophies you’d accumulated on your various games. Among other firsts, this Star Wars title was also one of the first to use this system. And it was one of the few things that was implemented correctly. Four or five months into the testing cycle, the achievement system worked fine. And with the last two months of testing basically being one ginormous, ongoing crisis over the game just not fucking working for more than thirty minutes at a stretch, double checking the achievement system sort of fell off my radar. So, I wasn’t best pleased when, after I’d signed off on the game working (more or less) to our standards, after the cartridges had been manufactured, the boxes printed, the former inserted into the latter, and whole pallets of both ready to be shipped to stores… we found out that some asshole in China, instead of working on fixing the varied and multitudinous crashes in the game, went in and redesigned the (fully functional) achievement system. So that it would no longer work with our servers.
Only time in my life I genuinely expected to be fired, but mine was really one of the smallest in an unscalable mountain of fuck ups that went into this project. About a month before the project was finished, it came out that the senior VP who had headed the “China initiative” had taken his mistress on a Caribbean vacation, and paid for the whole thing (including first class flights and a four star hotel) with his corporate credit card. Which only came to light when his wife’s divorce lawyer subpoenaed the company’s financial records.
I got to tell you, there’s nothing like seeing a senior VP getting perp walked out of the building by security, carrying all his belongings in a cardboard box. Certainly made being stuck in the cluster fuck he was responsible for a lot easier to bear.
While this wasn’t me, I was there to see it happen.
I work for a lottery corporation. When our lottery system is down on a typical day, it costs around $100,000/hour. On a busy draw day it could get up to $200,000 an hour.
Anyway, this was before we upgraded our network to IP; it was all a legacy serial network. All the connections came back to our head office to a device called a Walcom. It was a giant A/B serial switch. It filled up about six 42U racks.
Anyway one of my coworkers was doing something in the base of one of the racks and somehow shorted out a power supply. It then caused a chain reaction taking down the backup power supply. Needless to say our lottery system was down for about 6 hours while we brought it back online!
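For a sense of scale, here is a rough back-of-the-envelope sketch in Python using the hourly figures quoted above. The function name is made up for illustration, and it assumes the loss scales linearly with downtime, which the real numbers almost certainly didn't.

```python
def outage_cost(hours, rate_per_hour):
    """Estimated loss for an outage, assuming a flat hourly rate (a simplification)."""
    return hours * rate_per_hour


# Figures from the post: about 6 hours down, $100,000/hour on a typical day,
# up to $200,000/hour on a busy draw day.
typical_day = outage_cost(6, 100_000)
busy_draw_day = outage_cost(6, 200_000)

print(f"Typical day:   ${typical_day:,}")    # $600,000
print(f"Busy draw day: ${busy_draw_day:,}")  # $1,200,000
```

So somewhere between $600,000 and $1.2 million, depending on what kind of day it was.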
No action was taken. And you know what, with the majority of these stories it was a simple mistake. Hey shit happens! It sucks, but that’s no reason for someone to lose their job. Now if they were doing something stupid, or something they shouldn’t have been doing that is a different story!
MtM
One lighter one, from when I was working in my dad’s machine shop:
I’d spent two days making about a hundred parts, and was nearly finished on the third day when my dad came over and looked at one, and immediately blew his top. “These are all fucked up!” he yelled, “You’ve drilled all the holes backwards! We need to start over!”
I looked at the part in his hand, gently took it from him, turned it around, and handed it back. He stared at it for a minute, then said, “Let’s knock off early today.”
OK - 30 years in the IT business, it would be a shame to let these stories die.
Large bank in SF (stagecoach) had dual systems - there was what is now called an “air gap” between test and production environments (which, all by itself eliminated an incalculable number of screw-ups).
You go into a bank branch (do it quick while they still exist), and you see flyers for various kinds of accounts, services, any money-making scheme. Each of those is, in bank parlance, a “product”. Your Super Platinum EZ Checking is a product.
In the mid-80’s, simply everyone decided to go after the same “high value” client.
This one came up with a new product:
High Limit Credit Card (and it is metallic GOLD - which was brand new technology then)
Safe Deposit Box
“Personal” banker* to handle all your needs/wants/hairs up your ass
At this time, the bank issued its own cards and generated the PINs.
Remember: dual mirror-image systems test/production.
Came time to mail out the cards to these “High Value”, super-desirable clients.
The PIN generator died and could not be brought up. These customers are not the kind noted for not noticing screw-ups.
An SVP approved running the PINs on the test system. All is well, cards and PINs go out on schedule.
Until someone tried to use one of these shiny new gold cards - and the ATM ate their card.
These are also not people who are used to having cards taken from them.
It turns out that, for security reasons, the test machine’s PIN routine hashes PINs differently than production does*. A moment’s reflection should tell you that an organization which spends the bucks to keep FOUR friggin’ top-of-the-line mainframe COMPLEXES (5 CPU’s per system) is probably going to prevent a developer from generating the PINs for customer cards.
There was one less SVP.
-
- the really good part: blind squirrel, nut, the PINs did, on occasion, log on. To the WRONG account. Somebody who had nothing to do with this program gets this activity.
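To make the mismatch concrete, here is a minimal Python sketch of why a PIN issued on one system would fail on the other. Everything in it is an assumption for illustration: the keys, the account number, the `pin_check_value` helper, and the use of HMAC-SHA256, which is certainly not what a mid-80's mainframe used. It only shows the general idea that the same PIN run through differently keyed routines gives values that don't match. It does not try to model the occasional wrong-account hit.

```python
import hashlib
import hmac

# Hypothetical keys, made up for this sketch. A real bank would never keep these in source.
TEST_KEY = b"test-environment-key"
PROD_KEY = b"production-environment-key"


def pin_check_value(key, account, pin):
    """Keyed hash of account + PIN; a stand-in for whatever the bank actually used."""
    message = f"{account}:{pin}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key, account, pin, stored_value):
    """True if the PIN, hashed under this system's key, matches the stored value."""
    return hmac.compare_digest(pin_check_value(key, account, pin), stored_value)


account = "4000123412341234"  # made-up account number
pin = "4821"                  # made-up PIN

# The check value is generated on the TEST system...
issued_on_test = pin_check_value(TEST_KEY, account, pin)

# ...but the ATM network verifies against PRODUCTION.
print(verify(PROD_KEY, account, pin, issued_on_test))  # False: card gets eaten
print(verify(TEST_KEY, account, pin, issued_on_test))  # True: only the test system agrees
```

The customer types the right PIN and it still fails, because production never produced the value it is being compared against.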
I was not around for this - I don’t know how long it took, or how much it cost just in dollars, let alone reputation among the “High Value” population.
I am amazed that banks now admit to hacks and screw-ups. Back in the day, they quietly ate any losses, made everybody whole and never said a word for fear of scaring off the general population.
Remember - this was when people (many) still thought that banks could, in an emergency, revert to manual processing.
Does anyone still believe that? In the First World?
All I had to do with this was have to drive to work:
The BART system was obscenely over-sold (“a train every 60 seconds!”) back in the 60’s. It got built anyway.
It is, of course, dependent on computers (no, don’t know type), and, to make people comfy (computer-controlled Trains?! and you expect me to put my ass on it?), they installed not just one, but 6 machines, so there would be instant recovery if one died. This actually worked, as they discovered on the day in question.
Some nobody-ever-dreamed-of-THIS situation arose, and the #1 machine rolled over with its feet in the air. Cool, number 2 - you’re up.
Guess what - they have identical programming. If the input kills the first, it will kill all of them.
It was "Watch them die": 1, 2, 3, 4, 5, 6.
Now what, hotshot?
Took them several hours to clear the jam and bring them back up. Peak of the commute hours.
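The failure mode is easy to show in a few lines. This is only a toy Python sketch, not how BART's system worked: the `Replica` class, the `process_with_failover` function, and the "unexpected" message are all made up. The point is just that failover to an identical copy doesn't help when the input itself is what kills the program.

```python
class Replica:
    """One of several machines running the exact same program."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, message):
        # Identical logic on every replica means the identical bug on every replica.
        if message == "unexpected":
            self.alive = False
            raise RuntimeError(f"machine {self.name} crashed")
        return f"machine {self.name} handled {message}"


def process_with_failover(replicas, message):
    """Try each live replica in order; return the first result, or None if all fail."""
    for replica in replicas:
        if not replica.alive:
            continue
        try:
            return replica.handle(message)
        except RuntimeError:
            continue  # fail over to the next machine, with the same input
    return None


machines = [Replica(str(n)) for n in range(1, 7)]

print(process_with_failover(machines, "normal"))      # machine 1 handled normal
print(process_with_failover(machines, "unexpected"))  # None
print([m.alive for m in machines])                    # all six are down
```

Six copies protect you from a machine dying on its own. They do nothing against a case nobody programmed for, since every copy gets it wrong the same way.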
One from a friend who works for a local utility supervising field work.
They were to dig on a street. So he collected the plans and called 1-call to mark the other utilities. All was marked and ready to go.
The contractors start to dig into the pavement and just after breaking through the asphalt sparks fly, and the power in the neighborhood goes out. Luckily no-one was hurt as breaking through a high voltage line usually means disaster much worse than a blackout.
My friend checks the plans and sees they are old but not ancient, so he checks the area’s work on the city master files and finds that there was both electrical and fiber optic work done a couple of years before. The electrical utility had put their cable far too close to the surface to be legal, and the cable company contractor thought this must be how they do things in the city and ran their cable right next to it (also illegal). Both submitted plans that claimed a much deeper cable depth, even though the listed depth was physically impossible given the street’s topography.
To top it all off the city plans department decided not to bother adding in the new (albeit false) information that they had. The contractor, and even the 1-call crew had no idea there were any electrical or fiber optics cables in the ground.
Another one: a specialist crew was working in a storm drain my friend was monitoring for repair work. Since this was the ‘final leg’ of a network of storm drains they had to be careful about the weather, as rain miles up north in the city could mean the water they were working in could rise several feet in a few minutes even if it was sunny right above them. So part of the specialist crew team had the job of watching a weather radar screen and warning of any storms happening. They even had a panic button.
But one day a couple of the specialist crew were ill so they assigned one of the regular contractors to watch the screen. Not a hard job. You sit and keep an eye on the screen and let people down below know if there is a problem. Easy work for a guy who might otherwise be shoveling a ditch. He had one job.
He decided this was too much hard work and went off to grab a long lunch without telling anyone. You can guess what happened: A storm roared in up north and the crew in the storm drain thankfully saw the water level rising and got out before disaster struck. Had the storm been heavier the crew might not have made it out. There was damn near a fist fight.
A Finance Director made a simple formula error in a spreadsheet and overestimated the benefits of a project by a factor of 10. She was tight with the CFO, rumored to be more than just his protege. No one could convince her she was wrong. No one dared go over her head. Company invested $25 million before it became obvious it wasn’t going to work. The whole thing had to be written off.
They were both fired, but he quickly became a partner at a consulting firm, and soon she was working there as well. Not a rinky dink firm, either. One of the largest in the world. Incidentally they both came to our company from the consulting world.
Our company was big enough to handle a $25m loss. Actually a few years later we absorbed a $100m+ project write-off and no one at the top level got fired. The stock price went up on the day the project was canceled, because everyone outside the company could see it was a disaster.
I used to think these farcical things only happened in the public sector. I’ve found that in a large enough company people’s careers can survive multiple multi-million dollar screw-ups.
[hijack]
[noparse]Enclose the text you want to hide in . . . tags. Thus, you type this:
Only for it to be discovered that…
the building in question didn’t have the power necessary to run that
monstrous thing!
[/noparse]
and it comes out looking like this:
Only for it to be discovered that…
the building in question didn’t have the power necessary to run that
monstrous thing![/hijack]
Not sure this fits exactly into this thread but here is my fishing story anyway.
In my twenties I worked on tuna and prawn fishing boats in between slow periods in the music industry. One prawn trawler boat’s skipper and his crew were particularly fond of the foamy frothy stuff; they would routinely drink a slab each (24 cans of full strength beer) in an evening of jolly fun. I was studying biology at the time and drank very little in favour of hitting the books while the guys were at the pub, when in port.
One night, when the boys were out drinking, after a few hours studying I fell asleep in my bunk (above the skipper’s). I awoke to the sound of a combination of the main engine running smoothly, indicating we had gone to sea, and the bilge alarm sounding off. Being the cook, I was not expected to have intimate knowledge of the workings of the ship, but I knew where to switch off the very loud alarm, which I did. I then made an unimpressive effort to wake the skipper to ensure the issue was properly dealt with, and went straight back to sleep.
Bilge alarm sounded again shortly after, skipper unrouseable, checked crew deck to find every member, including the engineer, dead drunk. Floor littered with empty beer cans.
Summary, skipper and crew had returned from the pub while I slept, drank an enormous amount of beer, somehow managed to negotiate the ship out of port, across the shallow bar and into open ocean putting the autopilot on and crashed on their bunks, dead to the world.
I hefted a near full container of cold sea water onto the face of said skipper, dodged his flailing fists and, believing I had directed his attention to the bilge alarm problem, promptly went back to bed and sleep.
Some time later, a fearsome combination of alarms woke me (and no one else). I noticed the somnolent captain still in his bunk and wondered what the foamy stuff was that seemed to have collected near the open door to the back deck. It was a combination of diesel and water thrown out of the main exhaust in a thick whitish foamy blanket completely covering the rear deck. This was because the bilge had filled up to the level of the intake and was being mixed and expelled through the engine, which by this time was sounding very rough.
Checked the time, it seemed we had been steaming at 3/4 throttle with no one at the helm for approximately 4 hours. I didn’t know how to shut down the engine, couldn’t rouse anyone and was getting more concerned about the 30 degree list to starboard.
Eventually the water intake problem solved that and the main engine shut down, leaving our 90 foot trawler listing about 35 degrees in calm tropical water. That meant that the very full diesel tank on the starboard side began to leak fuel out of the air intake on deck. After about 15 minutes I noticed the sea around the ship had a very oily calmer look, caused by thousands of gallons of diesel that now surrounded us.
Still unable to wake anyone, I did what I ought to have hours before, got on the radio and called mayday.
Happily, there were other vessels nearby and one came to rescue us, by which time the crew were beginning to wake.
I made the decision not to continue commercial fishing once we returned to port.
There were several at the feed mill I used to work at. The most serious would have to be sending feed to a hog barn with the concentrated drug instead of the diluted mix we sold to customers and a bunch of the hogs died, and the rest had to be put down. Then there was the time a salesman accidentally faxed a bunch of feed formulas to a rival feed mill. That feed mill picked up an important (now former) customer a few weeks later. Then there was the time someone dumped one ingredient on top of another, which happened from time to time- except this time nobody noticed until we had already run two days worth of production. We had to recall everything we had made and remake everything, and we found out about it late on New Year’s Eve and had to come in to work on New Year’s Day. This is just the tip of the iceberg of the crappy things that went on at that place, but these are the only things I can think of where it could have harmed the public. (Well, the faxing incident wouldn’t have hurt anyone, but it was damaging to the company. And the guy still didn’t lose his job.)
Mine is another pharma/biotech story. We were testing a number of natural product compounds for breast and uterine cancer applications when we suddenly got a hit relative to the control. Oh my God, this had huge applications and efforts were made to re-create the hit with more of the natural product, which was from a rare sea invertebrate. Weeks and easily hundreds of thousands of dollars were devoted to this effort, which included a vice president leaking the news to investors on a truly unique way to target cancer with this new class of drugs…which was all due to an inexperienced lab tech using a contaminated pipette tip (which was contaminated with the control compound) on the natural product.
That guy lasted about a year longer after that, but never lived that mistake down, and ultimately got fired when they discovered he didn’t even have a degree, which was a requirement for the job.
By "that guy" I’m assuming you meant the VP who lied to the investors in violation of SEC regs, right?? Yeah sure you did. He probably got a medal.
A patient died because a six hour potassium infusion was given over an hour.
My dad tells stories of the worst field-hand ever (WFHE) deployed during a season.
- Doing some seismic surveying, the field-hands would each grab a cable with several geophones attached and head off in a direction specified by the geologist.
Each cable could be around 500m or so long. The field hands would place the geophones and then wait until the geologist radioed to let them know he had enough readings, they would then move to a new location and repeat.
WFHE gets radio call to come back, for some reason leaves the cable and geophones in place and heads off in a random direction he thinks will get him back to the vehicle. Yep, this guy got lost when all he had to do was rewind a cable that would get him back.
- WFHE and geologist are surveying in a 4WD. Geologist stops at a creek, climbs down into the creek bed (several metres) then tells WFHE “I’m going to walk downstream, give me half an hour and then follow me down with the car”.
Sure enough, half an hour later the geologist hears an almighty crash. WFHE had taken him literally and rather than driving alongside the creek to meet the geologist had driven over the embankment into the creek and smashed up the vehicle. They had to get a truck in from basecamp to haul the damn thing out.
- After this WFHE was kept in camp. One day he was detailed to take a load of garbage to the rubbish pit for disposal – by burning it. :eek:
An hour later he arrives back in camp smoke blackened and a bit singed but otherwise unhurt. After throwing diesel fuel over the rubbish he couldn’t get it to burn by tossing lit matches onto it and so climbed down into the pit. He got it lit OK, just couldn’t climb out as quickly as he had hoped.
They sent him home after that.