I crashed a mainframe. That is not supposed to be possible, but I did. Um, achievement unlocked??
In the early 90s I was installing a fully redundant system that talked to the local mainframe at the client site. I had to write custom code changes for this site, and finally got them all working just before the mainframe went down at 6:00 PM, per the nightly schedule. Yeah, that’s pathetic, but it’s how the client worked. They claimed they needed 12 hours to process a database. Anyway, between 6:00 and 8-ish that night, I documented my changes, backed everything up to tape, and copied everything from the primary computer to the stand-by/backup server. Then I went to my hotel, confident that everything would be great when the mainframe came back up at 6:00 AM. I could relax after my long day - I’d come in as soon after 6:00 AM as I could, so I’d have a full day with the mainframe up to (hopefully) complete my site-custom code changes.
Each machine accessed the mainframe through regular user accounts. We had a file for each machine’s mainframe passwords. I was careful - I had a backup of each machine’s password file on the other machine. During my long day I changed all the account passwords. What I neglected to do was save a current copy of the backup machine’s password file on the primary before I sprayed the files around. While copying my files around I had, without realizing it, saved the old password file to each machine. Neither machine had the correct password file, but the mainframe was down anyway, so the failure would not be visible until the mainframe came back up at 6:00 AM.
Cue scary music.
When I drifted back to the client site at 9-ish in the morning and walked to where my system was located, I could hear people whispering “He’s the one!” I got to my system and saw that both mainframe connections were down. I shut them down and checked the virtual screens. Each of the accounts had tried to log in with the old password. That failed, so my system was programmed to wait one minute and try to log in again. For each account. And again, and again. Lather, rinse, repeat for 48 accounts every minute from 6:00 AM until 9:00-ish. I realized what had happened, so I called the help desk.
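For anyone wondering what a less aggressive retry might look like: 48 accounts retrying once a minute for about 3 hours works out to roughly 48 × 180 = 8,640 failed logins. Here is a minimal sketch of a capped retry with a doubling delay. It is not what the original system did, and the names (`login_with_backoff`, `try_login`, `max_attempts`, `base_delay`) are made up for illustration.

```python
import time
from typing import Callable


def login_with_backoff(
    try_login: Callable[[], bool],      # hypothetical: returns True on a successful login
    max_attempts: int = 5,              # assumption: give up after a handful of failures
    base_delay: float = 60.0,           # start at one minute, as in the story
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to log in, doubling the wait after each failure, then give up."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if try_login():
            return True
        if attempt < max_attempts:
            sleep(delay)    # wait before the next attempt
            delay *= 2      # back off: 60s, 120s, 240s, ...
    return False            # stop hammering the host; a human needs to look at this
```

With a cap of 5 attempts per account, 48 accounts with bad passwords would produce at most 48 × 5 = 240 failed logins instead of thousands.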
I did not realize the magnitude of the consequences of my screw-up. It turns out my machine’s 48 failed log-ins every minute for 3 hours had overflowed the mainframe’s security logs, and nobody could log in. That’s the part that’s not supposed to happen - the security logs should not have overflowed at all, let alone so easily, and even with some repeated bad logins other folks should still have been able to log in. I don’t know why someone at the client site didn’t call me at my hotel - they had all the information, and the mainframe security folks had ID’ed my system as the problem. The mainframe comms folks could have shut down those two lines, but instead they waited for me to come back in.
I called the help desk and said I needed to reset a number of account passwords, and the help desk person said “Darn right you do!” Anyway, it was all fixed in a few minutes, but I had crashed a mainframe.
My boss at the time was a former tech person, so she was very understanding. And I took pains in the future to make sure that the backup password file for the secondary machine was stored in a separate place, so I was less likely to overwrite it by mistake.
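A different guard against the same mistake, beyond keeping the backup in a separate place, is to refuse to copy a file over a newer one. This is only a sketch under assumptions: the function name `copy_if_not_older` is hypothetical, and it trusts file modification times, which is only reasonable if the clocks on both machines agree.

```python
import shutil
from pathlib import Path


def copy_if_not_older(src: Path, dst: Path) -> bool:
    """Copy src over dst unless dst exists and is newer than src.

    Returns True if the copy happened, False if it was refused.
    """
    if dst.exists() and dst.stat().st_mtime > src.stat().st_mtime:
        return False            # dst is newer: do not clobber it with a stale file
    shutil.copy2(src, dst)      # copy2 preserves the modification time
    return True
```

A refused copy is the signal to stop and check which password file is actually current before spraying anything around.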
I don’t know how much crap my boss took for me. Probably a lot. That client was very particular about keeping access up - that was the reason they’d insisted on our system being fully redundant.