First off, I apologize if any of my posts in the prior thread implied that I doubted your knowledge of AI; clearly, you do know what you are talking about!
However, you’ve made an argument that, to me, precisely outlines why these laws are NOT possible!
You correctly note that the AI doesn’t see the world in terms of values that mean anything to us, but rather as a whole host of input values that translate to a whole host of output values. And the way that it does this tends to be highly opaque, so that we cannot reverse-engineer our way to an understanding of the AI’s decision making process.
So, we know that our advanced AI is weighing a million different input values, predicting possible outcomes, modeling them, and then choosing the best one. And you’re saying, ‘All we need to do is tell the AI that when weighing outcomes, any human death automatically gives the outcome infinitely negative value; this guarantees that the AI will never choose that course’.
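To make concrete what that proposal amounts to, here is a toy sketch (entirely my own illustration, with invented names like `predicted_outcome` and `human_deaths`, not how any real system is built): score each candidate action’s predicted outcome, force any outcome involving a human death to negative infinity, and pick the best of what remains.

```python
import math

def score_outcome(outcome):
    """Score a hypothetical outcome dict, e.g. {"human_deaths": 0, "utility": 3.7}."""
    if outcome["human_deaths"] > 0:
        return -math.inf  # "never choose this, no matter what"
    return outcome["utility"]

def choose_action(candidates):
    """Pick the candidate action whose predicted outcome scores highest."""
    return max(candidates, key=lambda a: score_outcome(a["predicted_outcome"]))
```

Of course, everything interesting is hidden inside that tidy little `predicted_outcome`.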
But my whole point is that this is impossible. When it comes to the decisions made by a complex enough AI - and an intelligent AI will be complex - we cannot map out all of the potential outcomes, forbid certain regions of that space, and communicate that prohibition to the AI in a way it actually understands. We don’t know the algorithms the AI uses to make decisions; we’ve simply observed that they do a fantastic job, because we trained them to do so. And that’s where you run into the Alignment Problem.
The latter. The question isn’t whether we can create Asimov’s robots (we cannot), or whether Asimov’s Three Laws make sense as philosophical guidelines for what we want robots to do. The question is whether they make any sense in the context of our modern understanding of computers and AI, in light of flippant comments in the other AI-related thread along the lines of, “Well, why is anyone concerned about what robots will or won’t do? Can’t we just make them follow the Three Laws of Robotics?”.
To me, such comments are indicative of a profound misunderstanding of how actual AI works; but @BeepKillBeep, who clearly does know a thing or two about AI, said something along similar lines; and so I started this thread to discuss the matter further without hijacking any other thread.
Yes, I agree with this, and the video by Robert Miles linked above speaks to this as well. Morality is much more complex than three binary laws, and if we want our AI to operate according to our society’s moral code - and, I presume, we DO - then we would need to be much more thoughtful in how we communicate that to the AI.
Yes, and there is another issue: Our sense of morality may not even work with machines.
Example: A freeway is icy, and a dog runs into the road. Trying to swerve could cause you to lose control and kill people. If you don’t swerve, you kill the dog. This is a hard situation, and people are generally understanding of the choice you made one way or the other, because it’s a very human thing.
Now make the car self-driving. It decides there is too much risk in swerving, so it runs the dog down. Headlines the next day say “Self-driving car kills dog”, complete with eyewitnesses saying “The damned thing didn’t even try to miss the dog!” Or, it attempts the swerve, loses control, and kills the passenger. The headlines say “AI decided to kill a human instead of a dog!”
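For concreteness, the car’s choice in that moment might boil down to something as crude as the comparison below (my own sketch; the probabilities and harm weights are numbers I made up purely for illustration):

```python
P_LOSE_CONTROL_IF_SWERVE = 0.30   # assumed chance of losing control on ice
HARM_HUMAN_FATALITY = 1000.0      # assumed weight placed on a human death
HARM_DOG_FATALITY = 50.0          # assumed weight placed on killing the dog

expected_harm_swerve = P_LOSE_CONTROL_IF_SWERVE * HARM_HUMAN_FATALITY  # 300.0
expected_harm_stay = HARM_DOG_FATALITY                                 # 50.0

decision = "swerve" if expected_harm_swerve < expected_harm_stay else "stay in lane"
print(decision)  # -> "stay in lane": the dog loses, and the headlines follow
```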
In short, we give humans a certain amount of moral agency in tough situations because we have empathy. We imagine ourselves in that situation and think, “That could have been me.” No such empathy will be granted an AI when it makes a choice that hurts or kills someone. We will have to trust that they made the best, safest decision at the time, and for many problems that will be impossible to know.
I do remember that Asimov himself noted that his Laws of Robotics were actually rules that we humans already apply, by implication, to all our tools.
Asimov later wrote that he should not be praised for creating the Laws, because they are “obvious from the start, and everyone is aware of them subliminally. The Laws just never happened to be put into brief sentences until I managed to do the job. The Laws apply, as a matter of course, to every tool that human beings use”,[13] and “analogues of the Laws are implicit in the design of almost all tools, robotic or not”:[14]
Law 1: A tool must not be unsafe to use. Hammers have handles and screwdrivers have hilts to help increase grip. It is of course possible for a person to injure himself with one of these tools, but that injury would only be due to his incompetence, not the design of the tool.
Law 2: A tool must perform its function efficiently unless this would harm the user. This is the entire reason ground-fault circuit interrupters exist. Any running tool will have its power cut if a circuit senses that some current is not returning to the neutral wire, and hence might be flowing through the user. The safety of the user is paramount.
Law 3: A tool must remain intact during its use unless its destruction is required for its use or for safety. For example, Dremel disks are designed to be as tough as possible without breaking unless the job requires it to be spent. Furthermore, they are designed to break at a point before the shrapnel velocity could seriously injure someone (other than the eyes, though safety glasses should be worn at all times anyway).
Asimov believed that, ideally, humans would also follow the Laws:[13]
I have my answer ready whenever someone asks me if I think that my Three Laws of Robotics will actually be used to govern the behavior of robots, once they become versatile and flexible enough to be able to choose among different courses of behavior.
My answer is, “Yes, the Three Laws are the only way in which rational human beings can deal with robots—or with anything else.”
—But when I say that, I always remember (sadly) that human beings are not always rational.
While the best AIs on the planet are still a pale shadow of what can be dreamed of in science fiction, we are starting to get to the point where AI is integrated into human processes enough that we need to start asking ourselves some questions about how we want to handle (not quite super) intelligent AI. AI ethics is a growing area of discussion. Just look at the problems that have happened in the financial markets when bots go wild. The real issue is ensuring that AIs act in ways that are compatible with human interests. And I know that some researchers are looking at ways to train AIs appropriately. This is kind of akin to teaching them the Three Laws, but less complex. I think the field is mainly in medicine and psychology right now, which is probably why I’ve been seeing it come up (I currently work in a medical lab for two more months, then I’m freeeeeee! Freeeeeeeeeeeeeeeeeeee!!!)
I saw one review of the book - it might have been Alexei Panshin - who speculated that Mike did regain his consciousness but decided it was better for him to hide it from the humans.
Combining some responses: Asimov first stated the Three Laws (discovered by Campbell, of course) in 1942. At that time the only things that could be called computers were the Zuse Z3 in Germany and the Atanasoff–Berry Computer at Iowa State, and I’d bet anything that Asimov was not aware of either of them. ENIAC hadn’t even been started yet, and neither had Colossus. So give him a break.
If I were implementing it, I’d do it a bit differently from what I’ve seen here. There has been some work on computing a kind of parity for arithmetic calculations, based on the encoding of the inputs, so that you can tell immediately if a calculation was faulty due to some defect inside the logic. I’m not aware of it ever being used in practice, but I’ve seen papers. It would seem possible, in a robot, to compute the probability that a proposed action would hurt anyone (or violate one of the other laws), check the proposed action against that estimate, and prevent the action from taking place if it failed the check. Partitioning the action calculation from the evaluation of its effects would seem to be a lot simpler than building the laws into the neural net or what have you. This would happen very quickly, and be invisible to slow humans. It would also allow the robot to report why it could not perform a requested action.
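Something along these lines, I imagine; this is just a rough sketch of that partitioning idea, and the function names and the threshold are mine, not any real robotics API:

```python
HARM_THRESHOLD = 0.01  # assumed maximum acceptable probability of injuring someone

def safety_gate(proposed_action, estimate_harm_probability):
    """Check a proposed action against a separately built harm estimator.

    The estimator never proposes actions; it only scores them. If the score
    is too high, the action is vetoed and a reason is returned, which also
    gives the robot a way to report why it refused a request.
    """
    p_harm = estimate_harm_probability(proposed_action)
    if p_harm > HARM_THRESHOLD:
        return None, f"refused: estimated {p_harm:.1%} chance of causing injury"
    return proposed_action, "approved"
```

The point of keeping the gate separate is that it can be simple and verifiable even when the action-proposing model is completely opaque.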
The idea of robots coming to the conclusion that they have to conquer us “for our own good” is the basis of Jack Williamson’s famous novella “With Folded Hands”.
Likewise, the awareness of the flawed hierarchy led to the “Zeroth Law” in Asimov’s later work, which took precedence over the First Law:
“A robot may not harm humanity, or, by inaction, allow humanity to come to harm.”
So a robot would be perfectly justified in allowing a specific human to die (or even directly take that person’s life) if it were judged necessary for the good of humanity (i.e., the “greater good”).
Which implies that robots would have to end all war, all crime, all disease, and all bigotry, control natural disasters, prevent climate change, and maybe one or two more things. Or several.
It’s an insane notion and Asimov knew that when he wrote it. Consequences could only be good if an author were directing them that way. That was his one and only intent. Nobody else should have adopted his laws.
Williamson realized this in 1947. That story and its sequel “… And Searching Minds”, which became the more famous novel The Humanoids that I referred to above, showed that human notions of morality could not simply be transferred to machines, especially since we humans have no shared understanding of, or agreement on, those notions among ourselves.
Even then, the arguments were old. As soon as Superman appeared, people started asking why he fought petty crooks instead of ending war and poverty and evil. But questions about the intrinsic, innate, and ineradicable evil of humanity were not proper in an era when Good and Evil seemed to be so clearly drawn, with Americans representing the forces of Good.
Not only are these issues unresolved, I don’t believe they can be resolved without a thorough understanding of what makes humans human, including basic notions of what consciousness is, how morality develops, the distinctions between individual encounters and civilization-wide repercussions, and a set of agreed-upon goals for the future. These are lifetimes away, if possible at all. In the meantime we’re looking at symptoms rather than causes and getting nowhere.
If one thinks of the zeroth law as meaning “humanity is the sum of all humans, so to avoid harming humanity all humans must be protected from everything”, I think that rather misses the point.
IMO the zeroth law was meant to just cross out the idea that the best way to save humans from harm was to kill them all. Kind of like dividing by zero, or kicking over the chess board, it is a solution, but one so bad that we want to move it out of bounds a priori.
As has been well established upthread, the 3 original laws read like a high school philosophy essay. Simplistic isn’t even a harsh enough adjective for them. Adding the zeroth just band-aids one plot hole, leaving all the others.
In the hands of a decent author the interaction of plot and the laws can produce a lot of drama. But as a practical design standard for human or machine ethics? Naah.
To be fair, the consequences WERE dire, for all humans around at the time.
The robots who figure out the Zeroth Law - two of them, one human-shaped and one robotic but slightly more advanced - intentionally start a chain reaction that will slowly make Earth more radioactive, to the point of uninhabitability, and also sabotage the Spacer worlds - previously colonized extraterrestrial worlds - so that the only descendants of humanity will be earthlings forced to colonize new worlds as their homeworld slowly dies. And the surviving robot is then implied to spend the next several tens of thousands of years subtly manipulating humanity through the rise and fall of various galactic empires. They also ensure that no other robots are ever created… hmmm…
I read somewhere the suggestion that since robots were ultimately intended to be useful tools, the Three Laws of Robotics can be generalized as the Three Laws of Tools:
A tool must be safe to use.
A tool must do its job, unless that would make it unsafe to use.
A tool must be durable, unless that would make it either unsafe or unusable.
ETA: I suppose in this context the Zeroth Law of Tools would be:
A tool must not be to the long-term detriment of humanity as a whole.
As I noted in my previous post (#66), it was Asimov himself who pointed out that it was by thinking about our tools that he came up with the Laws.
Well, that would be the case if we allowed the implications to rule. As it is, taking into consideration the rules we already give to our tools, I would think that the insane thing would be not to make the Laws a default for our new tools, and then apply amendments as needed.
And note that, as a rule, when creating a tool we usually start with the purpose we’ll use it for, then ask how it can be most effective, and then design it to be durable and safe for the user (often in that order). And then there is the matter of “off-label” use, which sometimes is obvious in advance (a means of propulsion can itself be the basis of a weapon) and sometimes should have been (the algorithm designed to bring eyeballs to products brings even more eyeballs to malicious disinformation).
The example of the Humanoids and the definition of “harm” led me to wonder about the following logical problem: Let’s say that the Humanoids know that they’re programmed to keep humans from harm and that this is a prime function of theirs. That is, they have self-awareness that they’re programmed that way. By some amazing mental gymnastics, they then conclude from this fact that by definition, nothing they do can be “harm”, including making humanity miserably unhappy or inducing sufficient brain damage to correct this.
Now of course we would immediately condemn this as a logical fallacy; but the sorts of rationalizations that totalitarian movements engage in are distressingly similar to this: since “the revolution” is devoted to the shining triumph of whateverism, no atrocity done supposedly in furtherance of it can be wrong. The thing is, can this sort of closed-loop cognition be logical by its own rights, even though it is in the literal Greek sense of the word hypo-critical?