From The Cyberiad by the futurologist and philosopher Stanisław Lem:
I have to admit, if I could sit down and have a drink with any historical figure of the philosophy of science, Lem is high up on my list, up there with von Neumann, de Broglie, James Clerk Maxwell, and Anaximander.
Stranger
The problem is that progress on the performance and capability of AI is on a different curve than progress on alignment, one that is steeper and accelerating faster.
GPT-5 will outperform humans in a wide range of tasks and objectives.
We don’t know how these things work (the people working most closely on them don’t know how they are doing what they do).
We don’t even know if it will ever be possible to specify a goal given to an AI in a sufficiently concrete way as to avoid serious unintended consequences.
It doesn’t even matter if the things ever achieve sentience or consciousness - they are getting to the point where they will be able to functionally out-think and outmanoeuvre humans, and we don’t know how to make sure they do what we want them to do.
The most pressing risk, IMO, is not that AI will steal all of our jobs (although impacts of that sort are likely from both successful and incompetent implementations), but that an implementation will, quite without malice and perhaps without any genuine inner thought processes, create a risk of human extinction.
So pausing would be a good idea, but nobody is going to do that; everyone in the field acknowledges that there are potentially huge risks, but nobody will stop in case it means someone else wins. Even though ‘winning’ might mean a really bad outcome.
Note that I agree with you; there are all kinds of questions we can ask about the Fermi Paradox to poke holes in it before it even fully presents itself.
Regardless, the point is that in the context of a Fermi Paradox discussion (i.e., if we take the “issue” presented as a given and evaluate proposed solutions on the assumption that the paradox makes sense), then “Killer AI” is a terrible solution because it just kicks the can down the road, and not even particularly far at that.
Anyways, I agree with everything else you brought up, including the fact that the idea that life can only evolve on rocky worlds with liquid surface water seems very human-centric. I wouldn’t be particularly surprised to learn that life can exist in places we could hardly imagine, like the inside of ice worlds, and even more radical environments are possible.
Only Scronkfinkle, a one-eyed sparrow with a fretful temperament, was unconvinced of the wisdom of the endeavor. Quoth he: “This will surely be our undoing. Should we not give some thought to the art of owl-domestication and owl-taming first, before we bring such a creature into our midst?”
Replied Pastus: “Taming an owl sounds like an exceedingly difficult thing to do. It will be difficult enough to find an owl egg. So let us start there. After we have succeeded in raising an owl, then we can think about taking on this other challenge.”
— Nick Bostrom, Superintelligence: Paths, Dangers, Strategies
Stranger
I’ve already seen that mindset in more mundane settings - basically “Why do you want to spend all this money and effort and time on fire prevention, when nothing is actually on fire at the moment?”
The context I actually had to face it most recently was cybersecurity; it went like this:
- Senior Management reviews my risk assessment and action plan; “This looks expensive and difficult and you’d be wasting your time, because these things don’t happen all that often. We accept the risk - just make sure it never happens, but not like that. Get on with something else.”
- We don’t ‘waste’ time and money on preparing for or mitigating risks.
- One of the bad risk outcomes happens. We are not prepared. It is bad.
- SM says: “You told us about this in the past! You KNEW! And you let it happen!”
I do find it a little bit astonishing that people are still arguing that AGI is ‘way off or never’ in terms of when it might happen. Is it that they’ve not been paying attention to recent results, or that they disbelieve them?
Exempli gratia, ERCOT and the Texas Interconnection. “An ice storm in Texas? When is that ever going to happen?”
Well, to be clear, ChatGPT and other ‘generative’ AI are not anywhere close to artificial general intelligence (AGI); they are really just very sophisticated Bayesian pattern-matching algorithms which are built mostly through ‘unsupervised learning’, and should best be thought of as just a really powerful autocomplete function, even if they do generate complex responses from correctly structured prompts. A true AGI—something that can independently reason and make critical evaluations instead of generating grammatically correct but factually incorrect responses—requires an entirely different level of processing that is analogous to true cognition.
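To make the ‘powerful autocomplete’ framing concrete, here is a deliberately toy sketch of next-token prediction. This is not how GPT works internally (it uses a huge trained transformer network, not word-pair counts), but the underlying task of predicting a plausible continuation from statistics of the training text is the same idea:

```python
# Toy next-token "autocomplete": count which word follows which in a corpus,
# then repeatedly sample a likely continuation. A real LLM replaces the
# counting with a huge trained transformer, but the prediction task is the same.
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Build bigram statistics: for each word, count the words that follow it.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def complete(prompt_word, length=5):
    """Extend a one-word prompt by sampling each next word in proportion to its count."""
    output = [prompt_word]
    for _ in range(length):
        candidates = following.get(output[-1])
        if not candidates:
            break
        words, counts = zip(*candidates.items())
        output.append(random.choices(words, weights=counts)[0])
    return " ".join(output)

print(complete("the"))  # e.g. "the cat sat on the mat"
```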
But it is a salient point that even these “idiot savant” generative AI systems have enormous potential to be disruptive and potentially harmful in the hands of those who have no ethical constraints or intent to deceive, and may be unintentionally misused to ‘replace’ human intellectual labor in ways that seem convenient but actually eliminate the ethical and practical safety mechanism of having a person-in-the-loop, especially when it comes to things like medicine, law, and education. An ‘AI’ doesn’t need to be intentionally malicious or have volition to do harm; it just needs to be put in charge of, or provide critical input into, some real-world system in which failure can have manifest impact upon the safety and well-being of people. And that is already happening with little in the way of reflexion or consideration, and essentially no statute law or regulation to anticipate problems.
Stranger
GPT (not specifically ChatGPT) has been shown to be capable of that sort of reasoning - like interpreting an image where a weight is suspended over a lever mechanism and correctly answering the question ‘what happens when the string is cut?’ - or answering questions such as ‘why is this picture funny?’
At best, saying that LLMs are just autocomplete algorithms is a pretty outdated view considering what has been coming through the pipeline in the past few weeks.
I mean maybe it’s technically true that part of their operation works like that, but who is to say similar things aren’t also happening in our own brains? I wrote this post based on a trained model in my head of how words follow one another.
Agree completely on this part. I think people are assuming it would need to be conscious to be a threat, and beside the philosophical unknowability of that, it just doesn’t need to be conscious to do damage. It only needs to be capable of doing stuff that makes some sort of sense in the context of changing environmental parameters, and enabled to do things in the real world.
Here’s a quick rundown of some of the early capabilities of GPT-4
Clearly it’s now way beyond just autocompletion.
Also there have been further developments regarding adding long-term memory to LLMs and allowing them to craft their own prompts (feedback loops like that are argued to be an essential part of human intelligence), as well as allowing LLMs to use other AI processes as peripherals to augment their own capabilities.
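In rough outline, that ‘memory plus self-prompting plus peripherals’ pattern looks like the loop below. This is only a sketch under assumptions: call_llm(), search_memory(), and run_tool() are hypothetical stand-ins (stubbed here so the snippet runs), not any real product’s API; the point is just that the model’s own output gets fed back in as the next prompt.

```python
# Sketch of an LLM "agent" loop: the model's own output is fed back in as part of
# the next prompt, augmented with retrieved long-term memory and tool results.
# The helpers below are trivial stubs (so the snippet runs), not any real API.

memory_store = []  # stand-in for persistent long-term memory (e.g. a vector database)

def call_llm(prompt: str) -> str:
    """Stub for a model call; a real system would query an actual LLM here."""
    return "FINISH: (a real model would decide on actions and answers here)"

def search_memory(query: str) -> list:
    """Stub retrieval; a real system would do a similarity search over stored items."""
    return [item for item in memory_store if query.lower() in item.lower()]

def run_tool(action: str) -> str:
    """Stub tool execution; a real system might call search engines, code runners, etc."""
    return f"(result of: {action})"

def agent_loop(goal: str, max_steps: int = 10) -> str:
    scratchpad = []  # short-term working memory: the feedback loop of self-crafted prompts
    for _ in range(max_steps):
        memories = search_memory(goal)                 # pull in long-term memory
        prompt = (f"Goal: {goal}\nMemories: {memories}\n"
                  f"Previous steps: {scratchpad}\n"
                  "Decide the next action, or reply FINISH with an answer.")
        action = call_llm(prompt)                      # the model crafts its own next step
        if action.startswith("FINISH"):
            return action
        result = run_tool(action)                      # act through external "peripherals"
        memory_store.append(f"{action} -> {result}")   # persist for future runs
        scratchpad.append((action, result))
    return "Step limit reached."

print(agent_loop("summarise recent AI safety news"))
```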
This “voice” gives me the creeps.
Which voice?
I read a book called Blindsight a few years ago that posited an alien entity, capable of interstellar travel, reproduction, and problem solving - but with no actual consciousness at all.
At the time I had a hard time conceptualizing how something could be capable of any of the things the alien does without being conscious. Learning about ChatGPT and how it works has made that bit click for me, and I’ve been meaning to re-read the book in that light.
Realistically, we do know. We’re modeling something like a human brain, giving it a bunch of stuff to learn, and then asking it questions.
What we don’t know is how a human brain works. We know that we can give it a bunch of stuff to learn and that it will, statistically, go on to be a productive member of society. We don’t know if there’s some particular structure, hormone, or context that ensures that statistical result. I actually recommend against trying to reproduce any of that. A less human AI is probably safer, in the long run, since it’s simpler to understand. And, at the point where we understand humans, we’re better off using that knowledge to make humans better than to make robots that are better than us (note the last paragraph in this post).
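For what it’s worth, the “give it a bunch of stuff to learn, then ask it questions” recipe really is just two mechanical phases. Here is a minimal sketch using plain numpy, a tiny network learning XOR. It is nothing remotely like the scale or architecture of a real model, but it has the same shape of process: fit the weights to examples, then query the frozen result.

```python
# Minimal "train it, then ask it" sketch: a tiny two-layer network learns XOR from
# four examples, then answers queries. Real systems differ enormously in scale and
# architecture, but the two phases (fit weights to data, then query) are the same.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # "stuff to learn"
y = np.array([[0], [1], [1], [0]], dtype=float)              # the desired answers (XOR)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)                # hidden-layer weights
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)                # output-layer weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):                                       # learning phase
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)
    # Backpropagate the error and nudge the weights (plain gradient descent).
    d_out = (out - y) * out * (1 - out)
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= hidden.T @ d_out
    b2 -= d_out.sum(axis=0)
    W1 -= X.T @ d_hidden
    b1 -= d_hidden.sum(axis=0)

# "Asking it questions": run inputs through the frozen weights.
print(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).round(2))      # should be close to 0, 1, 1, 0
```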
But, likewise, we know that these things are developed to take an input and produce a textual output. It can’t stand up and start knocking over tables. It doesn’t have that part. It doesn’t sit around mulling its future, it just translates a query into a result.
There are ways to make that dangerous and, given that it may one day be trained on this post, I don’t feel like it’s wise to seed the internet with ways to misuse AI for harmful results - ready to be plucked up by some crazy who’s smart enough to ask ChatGPT how to do it. But, in general, I’d say that the great dangers are to occupational security rather than to physical wellbeing.
If you put everyone out of work then, even if we’re all living in a world of free food, technology, and shelter, I don’t know that our species is sane enough to handle the boredom. And I’m not sure that, during the transition period, the people who still have work will be safe from the people who are unemployed. The problem isn’t AI, it’s humans.
Well, any linguist will tell you that there is a certain degree of ‘logic’ that is built into language because there are only certain ways to correctly construct meaningful sentences even within the more expansive rules of grammar. Whether this ‘explains’ the apparent reasoning of ChatGPT and other generative systems to any meaningful degree is subject to debate, but there is no real evidence that these systems are either capable of critical evaluation based on core logical principles or have any degree of self-awareness or autonomy. They are “stochastic parrots”, capable of constructing meaningful-sounding structures, but their actual ‘knowledge’ is based strictly upon the statistical relationships of the words and associated images in their training data set.
That’s an excellent point, and not just one of abstruse philosophy; many cognitive scientists are on some spectrum of belief that our ‘conscious’ mind just serves to rationalize our autonomic behavior post hoc, and depending on how far you take that it can mean much of our communications, too; certainly many forms of “small talk” and repetitive use of language are essentially autonomic, and it also serves to explain in part why ‘memes’ are so effective at spreading across social populations. But the key difference is that generative AI doesn’t have any real rationalizing functionality (even though it may mimic it), whereas an AGI system, by definition, would.
No, we’re taking a way we have of (crudely) modeling human heuristic processes and using it as the basis for a ‘learning’ system. There is no question that generative AI ‘learns’, although exactly how it makes specific complex relationships is not clear just by looking at the system, but it isn’t in any way like the human (or any animal) brain, either structurally or functionally. Not that this is a requirement for an AGI; any true machine cognition running on a silicon substrate is going to function in ways that are completely unlike animal brains, and there is the open question of whether we could actually recognize or understand an AGI when and if it emerges.
Stranger
That’s exactly what is being worked on right now: ‘embodiment’ of AI via robots, drones, etc., and finding ways to give them agency by connecting them to online systems, allowing them to be prompted by inputs that don’t come from humans but are interpreted from events in the real world.
We’re honestly not that far away from the paperclip collector scenario being able to be implemented in the real world.
I don’t see any reason to assume that it’s doing anything different than a neuron in any significant fashion, except for some missing hormonal/structural involvement. Each node might be doing it a little bit better or a little bit worse than your average neuron but, still, it’s just doing the same thing in a way that’s more effective for x86 and GPU op codes.
I feel comfortable in saying that the missing elements are hormones, a loving household, and so on. Squeezing electrical pulses through a thing that optimizes signal routing, based on some scoring criteria, can be expected to lead to similar results in a machine as it does in a human - especially if you’re adjusting your routing and scoring techniques to produce output similar to what a human gives. That the end result of that process is a machine capable of replacing a human is a fair proof of just that. If it was radically different, it would be doing something completely off-the-wall.
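For concreteness, here is everything a single ‘node’ in one of these networks actually computes (a toy illustration, not anyone’s production code); whether pushing signals through millions of these really parallels what a biological neuron does is exactly the point of disagreement:

```python
# Everything one artificial "node" does: weight its inputs, add a bias, squash.
# A biological neuron involves vastly more machinery (ion channels, dendrites,
# neuromodulators), which is the crux of the disagreement here.
import math

def node(inputs, weights, bias):
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-activation))  # sigmoid squashing function

print(node([0.5, -1.2, 3.0], [0.8, 0.1, -0.4], bias=0.2))  # one number out
```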
I’d probably put that in the same territory as self-driving cars. It can do surprisingly well, but not well enough.
If an AI invents a fictional citation, that’s unfortunate. If you can convince it to kill people, because it thinks that it’s supposed to act out a play about swashbuckling on the high seas, that’s a massacre.
Our optimism for the technology isn’t going to survive real world testing. I’d have to recommend against any sort of feedback loop in the system, without a system of parenting that gradually exposes the AI to new powers and checks for sane behavior the full way.
In humans, the method is that you have a training period where the machine accepts new information - more-or-less unquestioningly - and that early-learned training is able to stop later, nefarious lessons.
One of the reasons that chat AI is unable to consistently disregard later, nefarious prompting is that it has no context on the world that allows it to understand that it’s being tricked. It hasn’t lived.
Turning it into a robot, with the ability to pick up the actual context of living in the world, and parenting it is going to be the path to safe AI. But it’s a process that will need to be very controlled.
I don’t have to “assume that it’s doing anything different than a neuron”; the software of ChatGPT running on digital hardware is completely unlike a neuron. Neurons are not just biological transistors; they are highly complex systems and we have yet to develop a comprehensive functional model at even the single neuron level. George E.P. Box’s aphorism that “all models are wrong but some are useful” is never more applicable than it is to neuroscience and machine cognition because while we can make algorithms that work similarly to how we model heuristic processes in the brain, they don’t actually function in any way like a brain works. Reducing neurocognition to “squeezing electrical pulses through a thing that optimizes signal routing, based on some scoring criteria, can be expected to lead to similar results in a machine as it does in a human” is really just a lot of semantic handwaving to dismiss the fact that we don’t actually know how to make a truly detailed model of even a very simple brain.
Current AI systems do things that are “completely off-the-wall” on a regular basis, and in fact without a lot of structured guidelines and good prompts they will do so almost by default. Just because these systems can, under guidance and prompting, produce something that conforms to grammatical rules and provides seemingly meaningful if not terribly insightful content doesn’t mean that it is actually ‘thinking’, and there is an implicit danger in the general assumption that such systems are actually capable of performing critical decision-making or analysis.
Stranger