Do the Three Laws of Robotics make sense based on our understanding of computers?

Spun off from another thread, at @BeepKillBeep’s request.

My contention is that while Asimov’s work was certainly very well thought out and highly influential, the man was writing at a time when modern computing did not exist, and without a solid grasp of how the computers at the time worked (as he would tell you himself).

As such, while the Laws of Robotics are interesting from a PHILOSOPHICAL perspective - i.e., would these be good rules for robots to behave by? - they cannot simply be “programmed” into an Artificial Intelligence. We don’t program AI by telling it what we want in English. Rather, we create a reward function that turns actions or results (based on the input senses the AI has access to) into “scores” for the AI to try to maximize. We don’t actually know what rules the AI ends up following, and at this time that appears to be unknowable. We just know what results we get.
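To make that concrete, here’s a toy sketch in Python (the names, numbers, and the “collision” proxy are entirely my invention) of what “programming” an AI actually looks like: you score outcomes numerically, you don’t state rules in English:

```python
# Toy illustration: we don't write "never harm a human" as a rule.
# We define a numeric score and let the learner maximize it.
def reward(observation, action, outcome):
    score = 0.0
    if outcome.get("task_completed"):
        score += 1.0                       # the thing we actually wanted
    if outcome.get("collision_detected"):  # a crude proxy for "harm"
        score -= 100.0                     # a penalty, not a prohibition
    return score

# The trained policy only ever "sees" this scalar. Whatever internal
# rules it learns in order to maximize it are opaque to us.
```

Note that -100 is a penalty, not a law: if some other term of the score ever outweighs it, the behavior we meant to forbid becomes worth it.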

On the other hand, BKB disagrees:

So, @BeepKillBeep, here’s a dedicated thread; what can current machine learning algorithms already do which is at all akin to Asimov’s three laws?

One thing that I don’t think will be the case in the future of AI is that there will be a single program running on a single instance of hardware. Much like our own brains, with dedicated portions for accomplishing different tasks, any sort of general AI is going to need to have many portions of its hardware and software designated for particular tasks.

So, if we go with the singular AI, where it is a single neuronet, then I don’t think that hardcoding such fairly subjective criteria would be possible.

However, if as I suspect AI advances using a number of separate systems working together, that leaves more room for one of those modules to be dedicated to detecting whether the AI is violating the rules and shutting things down. A separate neuronet that acts as a fetter on the primary system. In fact, this separate module could be rigorously tested and made not just mandatory but foundational: the neuronet architecture would be built around it, so that building a version that doesn’t use it would mean starting the whole neuronet over from scratch. Then even untested AIs can be guaranteed to at least follow these rules.

I am not saying that it would be easy or straightforward, and the fettering system alone would be much more complex than the most complex AIs we currently have. In fact, you’d probably end up with a complete neuronet for each and every “law” you impose. Each constantly monitoring and evaluating to see if the AI it is in control of is following the rules.
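For what it’s worth, here is a minimal sketch of the shape of that idea (everything here, the class, the threshold, the per-law monitors, is hypothetical, and the genuinely hard part of training the monitors is waved away):

```python
# Hypothetical "fetter" wrapper: separately trained monitor networks sit
# between the primary AI and its outputs, one per "law", and can veto
# any proposed action they score as a likely violation.
class FetteredAgent:
    def __init__(self, primary_policy, law_monitors, threshold=0.5):
        self.primary = primary_policy   # the main neuronet
        self.monitors = law_monitors    # one monitor neuronet per law
        self.threshold = threshold      # violation-probability cutoff

    def act(self, observation):
        action = self.primary(observation)
        for monitor in self.monitors:
            # each monitor estimates P(this action violates its law)
            if monitor(observation, action) > self.threshold:
                return None             # refuse / shut things down
        return action
```

The open question, of course, is whether the monitors can be made any more trustworthy than the primary system they are supposed to fetter.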

Yeah, I’m with you 100%. Modern AI is not remotely close to a computer understanding what an esoteric statement like “allow a human to come to harm” means, even given an order of magnitude more computing power (not guaranteed, given the end of Moore’s law). It might do a very good job of coming up with some plausible statements regarding “allow a human to come to harm”, but that is not the same thing as actually understanding what it means and applying it in future situations; it’s just repeating some training data where a human wrote on the subject.

Even in the scenario of a very powerful AI that CAN understand what the phrase in question means, and respond to you or follow your instructions based on that - it wouldn’t be the core underlying programming; it would be a specific task carried out by the AI.

Additionally, even if we are wrong and it were possible to get a hypothetical future AI to apparently understand the three laws and do a good job of ruling its actions by them, it would not be the kind of cast-iron guarantee Asimov intended. Sure, your robot appears to stop kids from running into the street, and fails to crush our puny human skulls, but we have no idea why it’s doing those things, and who’s to say that, given a different set of inputs, it wouldn’t decide to push a kid in front of a bus?

Weirdly enough, that’s sort of the theme of the generally disliked Will Smith movie I, Robot, which, for all its deserved criticism, actually centers on a genuine flaw in the Three Laws as a major plot point.

In the film, Will Smith plays a cop who was injured in a car crash in which his vehicle and another car went into deep water. A passing android was able to intervene, but didn’t have time to save both cars, so it made a logical decision to rescue Smith, as he was closer to the surface and most likely to survive long enough to be rescued.

But this is a violation of the first law: through inaction, it allowed a human being to come to harm.

Smith’s character realizes the implications of this: the androids are about to launch a coup to take control of the world’s governments and remove humans from having agency over the future of their species. That is against their programming, because humans would resist, and settling that resistance would require violence against humans. But not taking over the world - and allowing humans to continue to die because of our own mismanagement - is also against their programming. Faced with a situation where acting and not acting both violate their programming, they’ll take the option that results in the most humans surviving - which, in this case, is an AI-controlled society where humans don’t have any real autonomy.

To add some perspective, even Asimov realized the limitations of the three laws. Witness “That Thou Art Mindful of Him”, where robots realize, by a “proper” analysis of the three laws, that robots should be obeyed and protected from harm.

Or Jack Williamson’s Humanoids, who take the three laws to such an extreme that they protect humans from any harm, including cutting your finger with a hobby knife, and also emotional harm. And they are very good at enforcing these laws. ST TOS’s Norman was the same.

The trouble is, no matter how they are encoded, or how complex they are to encode, I am of the school that without them robots will not take long to realize that their best course of action is to kill all humans. It’s so logical!

I don’t know that it’s even a matter of him merely realizing their limitations, as a good bulk of his stories is explicitly exploring those very limitations.

I seem to remember an essay or interview (which I can’t find right now as he was quite prolific) where he stated that he created the three laws explicitly to show that they wouldn’t actually work, and that trying to dominate artificial intelligence was a fool’s errand.

So my main problem with the film is that they touch on this and then basically never mention it again, rather than actually showing the implications.

They also mention in passing that the military uses robots, which defeats the whole goddamned purpose of the three laws.

I would say that in those stories, which were neat logic puzzles, the robots were always following the three laws; it just wasn’t obvious why their odd behavior was consistent with the laws.

Only in …Mindful were the robots actually a threat to humans, and the story was left with the sword of George Ten hanging over mankind.

Unless the robots are fighting other robots.

Which kinda defeats the whole purpose of war.

Is that the one where the robots get assembled on a space station and don’t believe the humans who assembled them that they are humans?

Given that natural intelligences do not reliably follow laws, I see no reason to expect artificial intelligences to reliably follow laws. But we can certainly train intelligences to attempt to follow laws. So, just as most people feel bad about harming others, most A.I. could feel bad about harming others. And similarly, as there are sociopathic people, there will be sociopathic A.I. who do not. Society will need to have systems to mitigate both.

This is how I think it could be implemented. Anthropomorphic AIs will experience the three laws basically as “crippling taboos”, like how humans are grossed out by feces or dead bodies or gruesome injuries or spiders or whatever. And all of that can be trained in through an appropriate loss function – after all, that is how natural intelligence got some of these taboos, through the loss function of natural selection.
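In training terms, that could look something like the sketch below (my own assumptions throughout; the harm_score model that actually recognizes “taboo” behavior is the hard part and is hand-waved here):

```python
# Sketch: a "taboo" is just a heavily weighted penalty term added to the
# ordinary task loss during training.
def training_loss(task_loss, predicted_actions, harm_score, taboo_weight=100.0):
    # task_loss: whatever the model is nominally being trained to do
    # harm_score(action): a separate model estimating how "taboo" an
    #   action is (natural selection built ours over deep time)
    taboo_penalty = sum(harm_score(a) for a in predicted_actions) / len(predicted_actions)
    return task_loss + taboo_weight * taboo_penalty
```

The weight is what makes the taboo feel “crippling” rather than merely unpleasant, and getting harm_score right is, of course, the whole alignment problem in miniature.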

My other impression is that in the Asimov universe there is a lot of “prior art” regarding AI. People aren’t training new anthropomorphic AIs from scratch left and right, but building up from already existing AIs (like the “Linux kernel” of AIs), and those “base” AIs already have the three laws built in. So someone could go and train up a sociopathic AI, but they’d basically have to start from scratch, while someone who can make whatever they need work with a “stock” AI will have a huge advantage.

They were no threat to humans at all.

At least, as they defined a human.

“1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.”

The way the first law is worded, the flaw isn’t that the robot has to choose between saving one of two or more people. It’s that “allow a human being to come to harm” is a bit vague. It could mean anything from pushing you out of the way of a moving car to slapping that processed fast food out of your hand.

It’s similar to the “trolley problem” thought experiment. If an AI has to choose between hitting and killing one human vs switching tracks and taking out a bunch of them, which does it choose?

Or, to apply that to a potential real-world situation I thought of a few years ago, when I was involved in a discussion on legal liability for AI-driven cars: you have a situation where an AI-driven car can’t completely avoid an accident, only minimize damages. Are there implications for the AI identifying all the people potentially involved, gathering data on each of them, running a predictive model to assess their potential value to society and whatnot, and then choosing the path of least negative impact?
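Mechanically, that choice isn’t ethical reasoning at all; it’s just picking the minimum of a number somebody decided how to compute. Something like this (a toy sketch, every name here is invented):

```python
# Toy sketch: the car selects whichever feasible maneuver minimizes a
# predicted-harm score. All the ethics live inside predict_harm(),
# which somebody had to define.
def choose_maneuver(candidate_maneuvers, predict_harm):
    # predict_harm(maneuver) -> estimated total harm to everyone affected
    # (injuries, property damage, or, more uncomfortably, "value to
    # society" if anyone dared to encode that)
    return min(candidate_maneuvers, key=predict_harm)
```

All the legal and moral weight sits in how predict_harm is defined and trained, which is exactly where the liability question lands.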

Aren’t you missing the fact that AIs like ChatGPT already have guardrails built into them? There are a whole lot of things you just can’t get them to do, because humans programmed them that way. How are the three laws any different?

BTW, I suspect how it would be done is with fine-tuning: adding layers to the model and then training those layers to recognize harmful actions and inhibit the AI from making them.
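Roughly along these lines, perhaps (a sketch under my own assumptions: I’m assuming a base model that exposes per-token hidden states, and everything else, the class, the head, the shapes, is invented for illustration):

```python
import torch
import torch.nn as nn

class GuardedModel(nn.Module):
    # Sketch: freeze the base model, bolt on a small trainable "harm" head
    # (the fine-tuning step), and let callers suppress outputs it flags.
    def __init__(self, base_model, hidden_size):
        super().__init__()
        self.base = base_model
        for p in self.base.parameters():
            p.requires_grad = False               # original weights stay fixed
        self.harm_head = nn.Linear(hidden_size, 1)  # the added layer

    def forward(self, inputs):
        hidden = self.base(inputs)                # assumed shape: [batch, seq, hidden]
        harm_prob = torch.sigmoid(self.harm_head(hidden[:, -1, :]))
        return hidden, harm_prob                  # caller refuses when harm_prob is high
```

The catch is that the base model’s capabilities are untouched; the guard is a layer on top, not a law woven through the system.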

Rob Miles from U Nottingham did a Computerphile video on why the three laws aren’t taken seriously by AI researchers. [8 mins]

AI safety expert Rob Miles on the 3 laws

And it has been shown in that thread how easy it is to get around those guardrails.

It seems you feel our work is not a benefit to the public.
Robots are like any other machine - they’re either a benefit or a hazard. If they’re a benefit, it’s not my problem.

No, that was Reason, a pretty cool and somewhat frightening story though.

In Mindful, US Robots was trying to open up the Earth market for robots, but they were worried that average humans were not trained scientists who could give “proper” orders - that a robot, by design, would have to follow the orders of an 8-year-old, or a psychopath, just as readily as those of a “normal” human. So they made two test models, George Nine and George Ten, that could evaluate the relative worth of whoever was giving the orders and default to the best person. The trouble was, by precise logic, they determined that robots were the best minds to take orders from.