How does the military keep its computers from crashing at the wrong time?

I currently work in the S6 shop of a BN. My responsibilities include keeping all the computers running and supplying communication links between our lower units and higher HQ. All of our regular office computers run the same software as any other normal company: Windows XP, MS Office. Nothing fancy. Just like any other large company, we lock the systems down so users cannot make any important changes. They can still change the background and pick which of the default screensavers they want. Software installs have to be done by us.

Antivirus software is updated regularly, and vulnerability scans run constantly; they not only detect systems that need patches but also catch adware/spyware that managed to get itself installed. Those reports get sent to us, and we have to clean the systems up or remove them from the network.

The secret to keeping the systems running is performing maintenance and running down problems quickly.

It is a fun job, and I am glad I am doing it. As someone above said, there is no more fun than going out to the field, setting up a gen set, and getting a full office up and running with laptops so they can do their jobs. Unlike in the “real world,” where the network is set up once and then just maintained, when we go to the field you set up from scratch: run the wires, terminate, configure routers, switches, and systems all to talk to each other. Then a week later you tear it all down, clear the configurations, and get it ready to go next time.
-Otanx

GPS

Which raises the question: how does the military keep GPS from crashing?

Modern computer users assume an operating system is required. If you only need the computer to run one application, and nothing else, then an operating system is just so much overhead and endless points of potential failure.

If you don’t attempt to use an operating system at all, much less an operating system that attempts to be all things to all users; and if you don’t run your processor at 99% of the frequency at which it will produce one error every 2 hours; and if you cool the hell out of everything; and if you use vastly overrated memory, capacitors, etc., etc., then crashes needn’t be an issue. At all. You test it for weeks, nay, months on end, and you troubleshoot every single failure to determine the cause, and you fix it so it can never happen again. Really. You don’t “reboot and see if it happens again”.
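The no-OS approach usually boils down to a “superloop”: one program that reads inputs, computes, writes outputs, forever, with nothing else running. Here’s a minimal sketch in C of what I mean; the sensor register and `process()` math are made up for illustration, and real firmware would poll actual memory-mapped I/O and loop forever rather than taking a cycle cap.

```c
#include <stdint.h>

/* Stand-in for a hardware input register. On real hardware this would
   be a volatile pointer into memory-mapped I/O; here it is simulated
   so the sketch can run on a desktop. */
static uint32_t fake_sensor = 0;

static uint32_t read_sensor(void) { return fake_sensor++; }

/* The one job this computer has (the math is arbitrary). */
static uint32_t process(uint32_t raw) { return raw * 2u + 1u; }

/* The entire "program": read, compute, (write), repeat.
   No scheduler, no drivers, no OS -- nothing else to crash.
   max_cycles caps the loop for demonstration; firmware would
   use for (;;) and never return. */
uint32_t superloop(int max_cycles) {
    uint32_t last = 0;
    for (int i = 0; i < max_cycles; i++) {
        last = process(read_sensor());
        /* on real hardware: write 'last' to an output register here */
    }
    return last;
}
```

With only one code path to exercise, the months of testing the post describes actually cover the whole system, which is the point.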

And then you build a watchdog timer in hardware. It looks for the processor to perform a complicated and unique set of steps every so often. If that doesn’t happen, it reboots the processor. And they are a royal pain in the ass when you are only at the “hello world” stage of programming…so you do that on an engineering test bed where you can disable the WDT.
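To make the “complicated and unique set of steps” concrete, here’s a sketch of that kind of watchdog in C. The two-byte kick sequence (0x55 then 0xAA) and the register layout are illustrative, not any particular chip’s; the idea is that a runaway program writing garbage can’t accidentally keep itself alive, because only the exact sequence reloads the countdown.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simulated hardware watchdog timer (WDT). */
typedef struct {
    uint32_t counter;     /* counts down once per hardware tick   */
    uint32_t reload;      /* value restored by a correct kick     */
    uint8_t  kick_state;  /* progress through the kick sequence   */
    bool     reset_fired; /* set when the timeout expires         */
} watchdog_t;

enum { KICK_FIRST = 0x55, KICK_SECOND = 0xAA };

void wdt_init(watchdog_t *w, uint32_t timeout_ticks) {
    w->counter = w->reload = timeout_ticks;
    w->kick_state = 0;
    w->reset_fired = false;
}

/* The processor must write the two magic bytes in order.
   Any wrong write resets the sequence, so a crashed program
   spraying random values won't refresh the timer. */
void wdt_write(watchdog_t *w, uint8_t value) {
    if (w->kick_state == 0 && value == KICK_FIRST) {
        w->kick_state = 1;
    } else if (w->kick_state == 1 && value == KICK_SECOND) {
        w->counter = w->reload;  /* sequence complete: reload */
        w->kick_state = 0;
    } else {
        w->kick_state = 0;       /* wrong step: start over */
    }
}

/* Driven by a hardware clock, independent of the CPU. */
void wdt_tick(watchdog_t *w) {
    if (w->counter == 0 || --w->counter == 0)
        w->reset_fired = true;   /* timeout: reboot the processor */
}
```

In real parts the tick source is a clock the software can’t touch, which is why a hung processor always gets rebooted eventually, and why you want it disabled on the bench while you’re still single-stepping “hello world”.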

Properly planned out orbital trajectories? :smiley:

It’s long been obsolete, but the old SAGE network consisted of two completely redundant vacuum-tube systems (which were, at the time, considered the most reliable way to go). The now sadly departed Computer Museum in Boston had a chunk of the old SAGE computer set up that you could walk through, which illustrated the redundancy. It was pretty sophisticated for its time. Computer glitches at the time were mainly the result of physical problems (like burned-out tubes) rather than software problems, and the system was built to make those easy to handle – huge, far-apart racks of components to allow maximum cooling and accessibility to parts. None of the commercial computers from the same era that I’ve seen were built that way at all.