When computers HAVE to work reliably

Voyager · March 7, 2018, 9:40pm

I think that problem would be the least of it. How are you going to get three differently coded programs synched up well enough to have their outputs arrive at the voting hardware at the same time?
In any case I suspect many if not most of the problems are requirements and spec issues, not traditional bugs, so all three copies could be wrong.

AHunter3 · March 7, 2018, 10:44pm

That’s what I was going to say (although it would not have been said as well). A bulletproof computer would be one that can do a tiny handful of chores extremely reliably because it is hard-coded to do those chores instead of being coded to interpret any instruction within a large instruction set within a larger environmental context.

iamthewalrus_3 · March 8, 2018, 6:07pm

Hard real-time requirements are a thing, so as long as the result of each calculation arrives prior to some cutoff time that gives the deciding hardware time to compare them and then take over if they don’t agree, this could work in theory.
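To make the cutoff idea concrete, here is a minimal sketch of a majority voter with a deadline. Everything in it (the `vote` function, the `(value, arrival_ms)` tuples, the millisecond units) is made up for illustration; real voting hardware would not look like this, but the logic is the same: late answers don't count, and without a majority the voter declines to answer so something else can take over.

```python
from collections import Counter

def vote(results, deadline_ms):
    """Return the majority value among channels that answered in time.

    results: list of (value, arrival_ms) pairs, one per redundant channel
             (hypothetical format, for illustration only).
    Returns None when no value is backed by a majority of ALL channels,
    which is the signal for the deciding hardware to take over.
    """
    # Discard any result that missed the hard real-time cutoff.
    on_time = [value for value, arrival_ms in results if arrival_ms <= deadline_ms]
    if not on_time:
        return None
    value, count = Counter(on_time).most_common(1)[0]
    # Require a strict majority of all channels, not just of the punctual ones.
    if count * 2 > len(results):
        return value
    return None

# Two of three channels agree and arrive before the 5 ms cutoff.
print(vote([(10, 3), (10, 4), (11, 2)], deadline_ms=5))
```

Note that a late channel is treated the same as a wrong one, which is the point of a hard deadline.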

I’ve never heard of a case where this was done, though. As pointed out, the cost of designing the same thing three times vastly exceeds the utility you get here. Much better to spend twice as much time on one really good implementation.

There are cases where redundant hardware is used, but they don’t run separately-written software on each. The redundant hardware is used to protect against hardware failures, not software bugs.

Stranger_On_A_Train · March 8, 2018, 6:20pm

And more specifically, redundancy protects against random failures of hardware, not limit failures due to exceeding design capability or system failures due to unexpected complex interactions of different parts of the system. Designing and performing independent verification and validation (IV&V) of independently redundant systems is more likely to introduce uncaptured errors than to protect against them.

Stranger

Chimera · March 8, 2018, 6:24pm

Hence my old joke about Worf (Star Trek TNG) getting an MS blue screen of death in a critical situation.

Grrr · March 8, 2018, 7:12pm

The company I work with had a contract to build some electronic parts for the government. Everyone in my area had to go through a background check. And they implemented a two badge system.

Badge in once at the door of the building, and again once you got to your work area.

We weren’t even allowed to know what the devices we were building were being used for.

Francis_Vaughan · March 9, 2018, 12:55am

Here is Ian Sommerville explaining the multiple redundancy and multiple software systems in the Airbus flight control system: "Airbus FCS - software and hardware redundancy" (YouTube).

Two levels of flight control computers, each with two types of CPU and two different software development streams.

Crafter_Man · March 9, 2018, 1:13am

A couple years ago I was on a project for an aircraft that had quadruple redundant hardware to control the flight surfaces.

iamthewalrus_3 · March 9, 2018, 5:41pm

I stand partly corrected.

Note that there aren’t two different software teams developing to the same spec here. The primary flight control system is more complex, and the secondary one is simpler (and the tertiary, etc. are simpler still).

This makes a lot of sense: after the amount of care that’s put into those systems, it’s as likely to be a spec bug as a software bug that causes problems. Having two different teams develop the same buggy spec is bad redundancy. Having each failover system be simpler than the level above it means that you lose some advanced capabilities but the simpler systems are more likely to be correct.
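A rough sketch of that "each level is simpler than the one above" arrangement, purely as an illustration. The control laws, the `is_sane` check, and the names are all invented here and have nothing to do with the actual Airbus software.

```python
def run_with_failover(controllers, sensor_input, is_sane):
    """Try controllers from most capable to simplest; use the first sane output.

    controllers: list of (name, control_law) pairs, ordered primary first.
    is_sane:     hypothetical plausibility check on a command.
    """
    for name, control_law in controllers:
        try:
            command = control_law(sensor_input)
        except Exception:
            continue  # this level failed outright; drop to the next one
        if is_sane(command):
            return name, command
    raise RuntimeError("no controller produced a usable command")

# Invented stand-ins: the primary is fancier but misbehaves on large inputs,
# the secondary does less but is easy to reason about.
def primary(error):
    return 2.0 * error

def secondary(error):
    return max(-1.0, min(1.0, error))

def is_sane(command):
    return -1.0 <= command <= 1.0

levels = [("primary", primary), ("secondary", secondary)]
print(run_with_failover(levels, 0.25, is_sane))  # primary handles it
print(run_with_failover(levels, 0.9, is_sane))   # falls back to secondary
```

You lose the extra gain of the primary law when you fall back, but the fallback is small enough to check by inspection.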

Francis_Vaughan · March 9, 2018, 10:43pm

Sommerville does go on to say that within each of the two levels of control system there are also two different processor types and two different software streams. He goes over this rather quickly - but there really are two software teams developing identical spec code as well as two different levels of flight control - so four separate flight control software developments.

However, it should be noted that these identical spec software systems are not totally separate flight control software systems. They are only the software implementations of the already designed, specified, and tested flight control rules. The real meat of the system has already been done by the time the specification of this software has been created. Indeed the specification of these implementations is arguably the real meat - not the individual implementations. Each implementation is required to behave identically. They are not autonomous flight control systems being developed from scratch - they are just the reification of a core component. Each component is required to process all inputs and create control outputs in an identical manner. It is that spec that is the real flight control specification. Grinding out the code to do it is not the difficult bit.
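As a toy illustration of "each implementation is required to behave identically": two deliberately different codings of the same trivial spec (limit a demand to a range), plus a check that they agree on every test input. The functions and the spec are invented for this sketch and are not taken from any real flight control code.

```python
def clamp_a(demand, low, high):
    # Implementation A: written with min/max.
    return max(low, min(high, demand))

def clamp_b(demand, low, high):
    # Implementation B: written independently with explicit branches.
    if demand < low:
        return low
    if demand > high:
        return high
    return demand

def disagreements(impl_a, impl_b, cases):
    """Return every input on which the two implementations differ."""
    return [case for case in cases if impl_a(*case) != impl_b(*case)]

# Sweep demands from -3.0 to 3.0 against limits of -1.0 and 1.0.
cases = [(d / 10, -1.0, 1.0) for d in range(-30, 31)]
print(disagreements(clamp_a, clamp_b, cases))
```

An empty list only shows that the two codings match each other. If the spec itself is wrong, both will be wrong in the same way, which is the point made earlier in the thread.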

smithsb · March 10, 2018, 1:02am

Another untested/unintended example. The US and other navies mechanically locked out gun systems from shooting out their own superstructures. Computerized fire control systems carried this over for newer gun systems (I hope). One of the US Army air defense systems, when used in combat, had a glitch where it could compute below-ground intercepts and launch a missile trying to do exactly that. Further exploration of the problem discovered that if the intercept path was through an air traffic control tower or skyscraper, the system had no problem launching. Hasty changes were implemented, mostly on the manual control end of things, until terrain mapping was modded in the software (it was mostly there already - just locking out different locations/scenarios).
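For what a software lockout of this kind might look like, here is a very simplified sketch. It assumes a straight-line flight path, a `ground_height(x, y)` terrain lookup, and box-shaped obstacles, all of which are inventions for the example; a real system would use actual trajectories and terrain data.

```python
def path_is_clear(launcher, intercept, ground_height, obstacles, steps=100):
    """Sample a straight line from launcher to intercept and reject it if any
    sample is at or below ground level or inside an obstacle box.

    launcher, intercept: (x, y, z) points in metres (hypothetical frame).
    ground_height:       function (x, y) -> terrain height in metres.
    obstacles:           list of (xmin, xmax, ymin, ymax, top) boxes.
    """
    for i in range(1, steps + 1):
        f = i / steps
        x, y, z = (launcher[k] + f * (intercept[k] - launcher[k]) for k in range(3))
        if z <= ground_height(x, y):
            return False  # below-ground intercept or path into terrain
        for xmin, xmax, ymin, ymax, top in obstacles:
            if xmin <= x <= xmax and ymin <= y <= ymax and z <= top:
                return False  # path passes through a tower or building
    return True

def flat_ground(x, y):
    return 0.0

tower = (490.0, 510.0, -10.0, 10.0, 400.0)
launcher = (0.0, 0.0, 1.0)

print(path_is_clear(launcher, (1000.0, 0.0, 500.0), flat_ground, []))       # open sky
print(path_is_clear(launcher, (1000.0, 0.0, -50.0), flat_ground, []))       # underground
print(path_is_clear(launcher, (1000.0, 0.0, 500.0), flat_ground, [tower]))  # tower in the way
```

Sampling a fixed number of points can miss a thin obstacle between samples, so this is only the shape of the check, not something to rely on.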

Another discovered fault in the above system was the data recording. The previous day's interactions were recorded for investigation and training purposes. Through some glitch or user action, the previous day's intercepts could be generated as if they were happening in real time. Phantom aircraft would appear and get interrogated to determine friend-or-foe. Of course there would be no response from the ghost aircraft, so off the missile would go for intercept. A mad dash for the self-destruct button in the control van. Then boom, missile parts would rain down on our ammunition supply point - not good. We made them move the missile battery.
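The actual cause of the replay problem isn't known from the description above, so the following is only a guess at one generic guard: drop any track whose timestamp is too old (or in the future) before it reaches the engagement logic. The track format, field names, and the two-second limit are all assumptions made up for the sketch.

```python
def live_tracks(tracks, now_s, max_age_s=2.0):
    """Keep only tracks whose timestamps are recent relative to the clock.

    tracks: list of dicts with a 'timestamp_s' field (hypothetical format).
    A replayed recording from yesterday would be about 86400 seconds old
    and would be filtered out here.
    """
    return [t for t in tracks if 0.0 <= now_s - t["timestamp_s"] <= max_age_s]

now = 100000.0
tracks = [
    {"id": "real", "timestamp_s": now - 0.5},
    {"id": "ghost", "timestamp_s": now - 86400.0},  # yesterday's recording
]
print([t["id"] for t in live_tracks(tracks, now)])
```

This only helps if the replayed data keeps its original timestamps; if the replay restamps everything with the current time, a different guard is needed.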

TimeWinder · March 10, 2018, 1:56am

I’d have made them destroy it entirely. Whether it’s getting FOF or not, autonomous systems shouldn’t be launching live ordnance without a human confirmation. I thought that was forbidden by a number of treaties.

smithsb · March 10, 2018, 5:26am

Someone is supposed to be in the launch trailer, but sometimes things like smoke and BS breaks take precedence. The launch is autonomous, and two missiles are the default.
