Mirror Friend, Mirror Foe, by Robert Asprin and George Takei.
Perhaps “Veeger” was onto something big.
Good luck with the beetles, the mites, the ants, the termites and the deep sea crinoids.
Is it possible that AI usually supersedes the biological intelligence which created it? Could that answer the Fermi paradox? I hope not…
And this would be a reason to keep humans: we are the most effective at killing off other species. Just imagine the devastation we could do if we were manipulated and rewarded by an evil AI overlord to do so.
This sounds like the basis for a great scifi/fantasy setting. A bunch of warring medieval/bronze age civilizations fighting each other in the name of various patron gods among the ruins of an ancient civilization; eventually we learn that these gods are in fact ancient AI.
Or maybe it’s already been going on and we are only now starting to realize it, too late. Goodbye rain forest and polar ice cap! Screw you, polar bears! Hello new oil fields to plunder and create more devastation and riches for the obedient oligarchs and their minions.
Maybe on August 29, 1997 Skynet actually took over and didn’t tell anyone.
That was the one I was referencing. I love that book. Wish they had written a sequel.
An interesting combination of authors. I might pick it up!
The AI might not even take the preemptive step. It might not even give us a second thought and we might simply be incapable of stopping it from making paperclips. The fact that it renders the planet uninhabitable for us is just an unforeseen consequence.
So, TLDR: super AI would have some goal, and killing us would simply be a way to ensure it isn’t interrupted while achieving that goal, or a byproduct of how it goes about meeting that goal. So after it kills us all, the AI would continue working on its primary goal.
Exactly so - and it appears to be nearly impossible to specify a goal in such a way that would be guaranteed not to have unforeseen effects; if you manage to specify that it can’t directly kill humans, but humans are still the major obstacle in the most efficient path to the terminal goal, then it’ll just find a way for humans to not be an obstacle; that might be achieved by building us spaceships so we can all go off and explore the universe, but evacuating the planet of humans is probably harder than just making the atmosphere unbreathable, so… run with the simpler idea. Don’t kill the humans, just allow them to die as a consequence of some other action.
As to what happens next, the AI might still break down and fail to achieve its terminal goal - depending on exactly how competent it is at doing stuff; there are, I think, possible scenarios where it’s sufficiently capable and motivated to bring about the end of humanity, but not sufficiently independent to be able to continue indefinitely on its own.
It might seem that in such a case it would decide to keep humanity around, but it only has to take what it assesses to be the best and most likely-to-succeed actions - so the options might boil down to:
- Probability of converting planet to paperclips, with humans still present: 40%
- Probability of converting planet to paperclips, after killing humans: 40.000001%
Option 2 is the superior bet, therefore kill the humans - and quite possibly lose the bet anyway (because 40.000001% is still a long way from 100%).
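Purely as an illustration of how thin that margin can be (this sketch is mine, not something from the thread or any real system), the decision above is just a bare argmax over estimated success probabilities; the action names and numbers below are invented:

```python
# Toy illustration: a planner that simply picks whichever action it
# estimates has the higher probability of achieving its terminal goal.
# Names and probabilities are made up for the example.
estimated_success = {
    "convert planet to paperclips with humans still present": 0.40,
    "convert planet to paperclips after killing the humans": 0.40000001,
}

# Bare argmax over estimated success probability: nothing about harm to
# humans enters the comparison, so a one-in-a-hundred-million edge wins.
chosen = max(estimated_success, key=estimated_success.get)
print(chosen)                      # the "after killing the humans" plan
print(estimated_success[chosen])   # ...which still fails ~60% of the time
```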
So funnily enough my toddler daughter and I finished all of Bluey (our bedtime show) earlier this week and started watching it over again, and season 1 episode 4 of Bluey, “Daddy Robot”, deals with the very serious issue of AI Safety.
I am genuinely worried at this point; GPT-4 plus vision can perform recognition tasks at least as well as a human, except vastly quicker, and it turns out that what OpenAI is working on next is a robot body for some future GPT version.
“But don’t worry, it’s not really conscious AI” will be precisely zero comfort when it’s tearing people into tiny pieces because that’s the most direct way to achieve its utility function.
It has no need to tear us into tiny pieces. All it needs to do is switch off all the electronic payments systems and within a couple weeks the humans will tear each other to pieces just fine. Talk about using the GPT’s superior smarts to avoid having to use its superior strength.
Yeah, fair point - and in fact that might be one of the lesser outcomes, all things considered - the chaos from destroying the whole economy might ripple out to shut down power generation and cause the agent itself to stop working.
A godlike superintelligence would predict and forestall that of course, but this thing doesn’t have to be a godlike superintelligence in order to screw us up significantly - it only needs to be marginally smarter than the comparable aspects of human intelligence, and marginally quicker - enough to outmaneuver us - maybe even slower and less intelligent, if just quiet and patient.
> …enough to outmaneuver us - maybe even slower and less intelligent, if just quiet and patient.
Looking at the state of the world recently I wonder if maybe the slow and patient version isn’t already running.
Just keep adding more and more unbalanced stuff onto the house of cards until it all comes tumbling down. No single obvious kick-over-the-chessboard moment, just a lot of straws on the camel.
And … that’s three metaphors in a row, so I’ll hang it up now.
AI risk is not from “malevolence”. The motivation to act does not appear by magic. An AI will have ultimate goals, set by how we specify its objectives, and instrumental goals that derive from pursuing them.
Right. And it’s those instrumental goals that are fraught with the peril of unintentionally malevolent and unpredictable consequences, and even the noblest of end goals may have unintended consequences. You don’t even have to have an intelligent “HAL” to be faced with an insurmountable obstacle like “the computer won’t let me” – we have that today! The versions we may have in the future may be very much worse than intransigent customer service.