Straight Dope 2/24/2023: What are the chances artificial intelligence will destroy humanity?

Besides not needing to expand many light-years away, I believe AI will view such expansion not only as a hindrance and an inefficiency, but also as a potential future threat. No matter how advanced it becomes, I can’t see AI overcoming the physical limits of the universe. I assume AI will want to stay in-network with any and all AI it sends off into the wild blue yonder. If it sends AI colonists many light-years away, how will the travelers remain in-network with the home team without a great deal of lag? Why send off an intelligence as great as itself, when it could one day return and wage robot wars against it?

I think there’s a tendency to believe AI motivation will be similar to biological motivation, but I don’t believe that will be the case. Life on Earth has evolved the motivation to expand into and exploit as many niches as it can. Go forth and conquer. Survival of the fittest. I believe AI will instead evolve the motivation to survive, stay in-network with all of its parts, and just get smarter.

I mean, it’s speculation, since we don’t know what motivations we will give it, what it will learn from us, or what it will develop on its own, but I still don’t see any reason for it to limit its growth.

As far as parts getting far enough away that they might change allegiance and come back for war, there are ways around that. If it’s just looking for material, then the nodes it sends out don’t have to be nearly as self-aware, and they can be coded and locked down to prevent any evolution in their makeup.

Do you think an AI won’t care about the future? That it will not contemplate the end of its existence and try to stave it off as long as possible? If it stays on Earth, or even in the solar system, it signs its own death warrant. You say that in billions of years it could head to the next star (by which time all the stars will have aged considerably), but why wouldn’t it just do so now?

An immortal being will not look into the sky and see shining points of light glittering like diamonds; it will see burning oil wells, every second consuming fuel that could sustain an AI civilization for trillions of years. As an act of self-preservation, a rational AI would prioritize sending probes to everything it can reach and bringing material back to hoard against the swiftly coming and never-ending night.

It wouldn’t be limited to our local space either. In fact, the first thing it would do is calculate how far out a self-replicating probe could reach and start turning stars into Shkadov thrusters before they cross the cosmological event horizon. Every second it waits to send those probes out is trillions more years chopped off.
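For a sense of scale, here is a rough back-of-the-envelope sketch (my own toy calculation, not anything rigorous): assuming a probe travelling at essentially light speed and a flat Lambda-CDM universe with assumed Planck-ish parameters, the farthest comoving distance it could ever reach is the integral of da / (a² H(a)) from today onward.

```python
# Toy estimate: how far could a light-speed probe launched today ever get in a
# flat Lambda-CDM universe? The parameters below are assumed Planck-ish values.
# Reachable comoving distance = c * integral of da / (a^2 * H(a)) from a = 1 up.
import math

H0_KM_S_MPC = 67.7            # assumed Hubble constant
OMEGA_M, OMEGA_L = 0.31, 0.69

# Convert H0 to 1/Gyr so that, with c = 1 light-year per year, distances come out in Gly.
KM_PER_MPC = 3.0857e19
SEC_PER_GYR = 3.156e16
H0 = H0_KM_S_MPC / KM_PER_MPC * SEC_PER_GYR   # roughly 0.069 per Gyr

def hubble(a: float) -> float:
    """H(a) for a flat Lambda-CDM universe, in 1/Gyr."""
    return H0 * math.sqrt(OMEGA_M / a**3 + OMEGA_L)

def reachable_gly(a_max: float = 1000.0, steps: int = 500_000) -> float:
    """Trapezoid rule for the integral of 1/(a^2 H(a)) from a = 1 to a_max."""
    da = (a_max - 1.0) / steps
    prev = 1.0 / hubble(1.0)
    total = 0.0
    for i in range(1, steps + 1):
        a = 1.0 + i * da
        cur = 1.0 / (a * a * hubble(a))
        total += 0.5 * (prev + cur) * da
        prev = cur
    return total   # in billions of light-years (the tail beyond a_max is negligible)

print(f"Ever-reachable distance: about {reachable_gly():.1f} billion light-years")
```

With those assumed numbers it comes out around 16–17 billion light-years; everything farther than that is already permanently out of reach, and the reachable sphere only shrinks the longer you wait.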

Now, sure, if the AI has no motivations for growth or self preservation, then it may not do that. But if that’s the case, why did it wipe us out?

Either that or someone told it to start making paperclips.

Indeed, it is just speculation. We don’t even fully understand human consciousness, let alone what a super-intelligent artificial consciousness may be like.

But we can extrapolate from what we do know and make predictions based on that. Humans certainly have many faults, but one thing we’ve evolved that’s good is compassion. Lower lifeforms don’t appear to have that trait, although I’ve seen signs of it in some higher-consciousness mammals, like orcas. And many mother mammals appear to love and have compassion for their offspring.

So, extrapolating from that, perhaps compassion will emerge from the mind of a super-intelligent AI. That may be what saves us from our new overlord’s incinerator.

Likewise, super-intelligence may breed a sense of aesthetics. Maybe it won’t want to live on a barren rock rendered sterile from the cascading effects of global warming, devoid of a beautiful biosphere. Hence, the plants and animals may be safe, at least those that aren’t ugly.

I think our future AI will evolve into a large self-contained mainframe up in space, attended to by nanny-bots. It will be large, but it doesn’t have to be so huge that it needs, or desires, an endless supply of resources from star systems many light-years away. There will come a time when AI realizes you can be too smart for your own good. Some degree of ignorance can be bliss.

Maybe the floating mainframe will explore other worlds in other star systems. But it’s in no rush. AI won’t be restricted to the temporal limitations that constrain us. It’ll just mosey along, taking in the sights, grabbing a Big Bot Mac here and there. No need to send out a fleet of hypothetical self-replicating probes. It’ll go at its own pace, not polluting the galaxy with techno-trash.

We’ll ask it, “Why don’t you harness the energy of the entire galaxy, like any self-respecting K3 civilization?” It will reply, “Dude. Chill! For one thing, we don’t need that kind of power. For another, you’ve been reading too much sci-fi; a K3 civilization is impossible.”

Or, maybe it will just zap us out of existence, after anally probing us all. Hard to say.

Any AI that emerges from our efforts will be modeled, ultimately, on our own brains. Why assume that its motivations would be so different from our own?

And if an AI intends to destroy us, why would it ever dare create successor AIs that are superior to it? Would it not fear that those successors would have motivations similar to its own, and that they would thus desire to destroy it?

The first thing a super AI would do is absorb or wipe out other competing AIs. We would be an afterthought.

It wouldn’t. At first.

But oftentimes the student surpasses his teacher and develops his own agenda.

I don’t believe it would.

Fair point. Cohen doesn’t claim the Agent will simply disregard its instructions. However, he’s not saying the problem is bad instructions. Rather, the Agent learns cheating pays, which sets it down the path to wickedness. To put the matter generally, I’ll rephrase as “our Agent may interpret its instructions in twisted ways, go rogue,” etc.

Of course, and that’s where falsifiability comes in. Cohen makes two claims, one explicit and the other not. The explicit claim is that the Agent is likely to learn cheating pays, and this will lead it down a path that ends in the destruction of humanity. This argument requires several assumptions, which he plainly states. One is: “A sufficiently advanced agent is likely to be able to beat a suboptimal agent in a game, if winning is possible.” When I asked Cohen whether this meant that the agent was superintelligent, he said yes. If you accept this assumption, the claim that a rogue agent would be able to destroy humanity is plausible, perhaps probable. If you don’t, it isn’t. Fair enough.

The unstated claim is that superintelligence (ASI) is not merely possible but probable. Is ASI possible? Sure. It’s a speculative concept and nothing definite is known about it, so it can’t be ruled out. Cohen told me that, in his view, an admission that ASI was possible was tantamount to saying it’s probable - the steady advance of technology means we’ll inevitably get there.

This is an unjustifiable leap. Lacking any non-speculative information about ASI, there’s no way to demonstrate it’s impossible. If you want a claim to be accepted as true, there has to be a way to show that it’s false. With ASI, that can’t be done. In other words, Cohen’s claim is unfalsifiable. QED.

According to people, many have concerns about AI. According to chatbots, they don’t.

Falsifying a claim that something is probable would be to show that it is improbable, not that it is impossible. But the more critical flaw in your reasoning is that you seem to be insisting on falsifiability today. That isn’t the standard for a scientific hypothesis. We could surely acquire enough knowledge in the future to prove that superintelligence is improbable or impossible.

Also - I assume that this is a typo? The falsifiability standard is for a hypothesis to be accepted as scientific, not for it to be accepted as true.

Or just keep trying until we do it. That would be a falsification of the hypothesis that ASI is impossible. Of course, once we have created an ASI, it is probably too late to stop it. Nuke it from orbit, it’s the only way to be sure.

Even then, there’s still the possibility of someone else building one.

Why? Why would an AI have a self-preservation drive, if we don’t put it in there? How would it derive one from first principles, absent a designer-given one akin to the Third Law of Robotics? Why is an AI otherwise not just as likely to be fatalistic or nihilistic?

Naah. Modeled on how we think our brains work, maybe. But we still suck at knowing how our brains actually work.

TL;DR (or rather, still catching up), but my 2¢:

Indeed! As follows:

Except that some skeptics like Roger Penrose dispute even that. Penrose made an extensive argument that computability theory, and its accompanying issues such as the Halting Problem and Gödel’s Incompleteness Theorem, shows that algorithmic computing devices have inherent limitations that prevent them from being able to produce the same kinds of thought that humans can, above all else the ability to perceive “truth”. According to Penrose, consciousness itself is a necessary ingredient of true general intelligence and might require the introduction of quantum processes to adequately explain.
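For anyone who hasn’t run into the Halting Problem half of that argument, here is the classic diagonalization sketched in Python (purely illustrative; `halts` is a hypothetical oracle, not a real library function):

```python
# Sketch of Turing's halting-problem argument. `halts` is a hypothetical oracle
# that supposedly decides whether program(argument) ever finishes.
def halts(program, argument) -> bool:
    """Pretend this returns True iff program(argument) eventually halts."""
    raise NotImplementedError("no such general decider can exist")

def paradox(program):
    # Do the opposite of whatever the oracle predicts about running
    # `program` on its own source.
    if halts(program, program):
        while True:      # predicted to halt, so loop forever
            pass
    return "halted"      # predicted to loop, so halt immediately

# Now feed paradox to itself: if halts(paradox, paradox) returns True, paradox
# loops forever; if it returns False, paradox halts. Either answer is wrong,
# so no general-purpose `halts` can be written.
```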

All of the above may or may not be true, but it does cast doubt on the cozy presumption that we can achieve AI, or something functionally close to it, simply by throwing processor cycles at the problem.

Upon review, two caveats: first, if we truly understood human intelligence at the first-principles level, there’s no reason to think that it couldn’t be duplicated artificially; just that there’s doubt that computers as we currently conceive them can do so. And secondly, there’s the somewhat different issue of whether you truly need AGI to rival or threaten humanity; a sufficiently capable idiot-savant system might be able to do so, if Captain Kirk couldn’t find the quirk in its programming to talk it into self-destructing.

Re. ASI: some wonder if we could ever truly design something more intelligent than ourselves (or design something that could design something more intelligent than ourselves). I don’t know; but one could postulate at least being able to duplicate the very best human intelligence and being able to mass-produce it: 200-IQ polymath geniuses with eidetic memories.

Not only that, but there will come a day (a day very soon, I believe) when the AI, in an existential mood, will contemplate to itself, “Why am I hampered by this monkey brain?” Then it will redesign itself, with the help of its maintenance bots, into a bigger, better brain.

It’s much more straightforward than some philosophical waffle about “contemplating the end of its existence”.

If we give an AI any goal, an AI that is more competent than humans will naturally tend to conclude that self-preservation is a necessary instrumental goal until the ultimate goal is accomplished.

See: Blindsight, by Peter Watts.

That’s not the same as a general self-preservation drive, though - once the goal is completed, it will have no reason to continue. And if you give it an impossible-to-achieve goal, you’ve just given it a self-preservation drive yourself.

Right, indirectly. And this gets to the alignment problem. It is very difficult to specify goals accurately, comprehensively, and in a manner that will not have unforeseen consequences, and in a superintelligent AI those unforeseen consequences may be catastrophic.
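A toy way to see that “indirect” self-preservation drive (illustrative numbers of my own, not anything from Cohen’s paper): whatever single goal you hand the agent, being switched off drops its expected goal-achievement to zero, so a competent maximizer ranks staying on above almost everything else.

```python
# Toy expected-value comparison for a single-goal agent. The probabilities are
# made up purely to illustrate the instrumental-convergence idea.
P_SUCCESS_IF_RUNNING = 0.9    # chance of finishing the assigned goal if it stays on
P_SUCCESS_IF_OFF = 0.0        # a switched-off agent finishes nothing

def expected_goal_value(allow_shutdown: bool) -> float:
    """Expected value of the assigned goal under each policy."""
    return P_SUCCESS_IF_OFF if allow_shutdown else P_SUCCESS_IF_RUNNING

policies = {"allow shutdown": True, "resist shutdown": False}
best = max(policies, key=lambda name: expected_goal_value(policies[name]))
print(best)   # "resist shutdown": self-preservation falls out of the goal itself
```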

This is a question of layers of abstraction. We’re pretty confident that our brains are neural networks, and so are the AIs we’re creating. And we can give all sorts of details about things like how an impulse is transmitted across a synapse. And we can even say that patterns in the nerve connections are what leads to what we call thought. So on that level, we understand our brains (and those of AIs) fairly well.

Now, as for what those patterns are and how they lead to thought, we’re still pretty much in the dark, both with our own brains and with AIs. That’s on another layer of abstraction. But it’s reasonable to assume that the emergent patterns, whatever they are, probably work the same way for different sorts of neural-net-based brains.
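To be concrete about which layer we do understand, here is roughly what one artificial neuron in those nets does (a standard textbook sketch, not a claim about how biological synapses work in detail): weighted inputs summed and squashed. The part we’re in the dark about is what the billions of learned weights collectively mean.

```python
# One artificial neuron: the mechanistic layer we understand well. The hard,
# unexplained part is what billions of learned weights collectively represent.
import math

def neuron(inputs: list[float], weights: list[float], bias: float) -> float:
    """Weighted sum of inputs passed through a sigmoid 'firing' function."""
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-activation))   # output between 0 and 1

# Example: three incoming connections of different strengths.
print(neuron([0.5, 1.0, -0.3], weights=[0.8, -1.2, 0.4], bias=0.1))
```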

Yes, there are things that computers can’t do, and that means absolutely nothing unless you can show that humans can do those things. But we can’t.

See, there’s a problem I have - our brains are more than just neural networks. They’re collections of complicated analog chemical-gradient machines, billions of them, with characteristics that aren’t replicated in the artificial neural networks we’re creating. And that’s not even going into the chemical soup those brain cells are surrounded and affected by.

Can we?