The Open Letter to "Pause All AI Development"

From a functional standpoint of creating an AI that can replace a commercial artist or write sitcom scripts, sure, it doesn’t really matter whether these systems actually ‘think’ or ‘feel’. (And I don’t expect an AGI to have emotions in the way that we do, because it literally doesn’t have the cognitive structures that underlie those processes in us.) But from the standpoint of creating a system that can be relied upon for safety-critical functions, or to take on roles that involve making ethical judgements or holding power over the well-being of many people who cannot consent to that control, I think it matters a lot, because purely generative systems like ChatGPT literally cannot develop an ethos, nor conceive of a scenario outside of anything provided in their training set of patterns and data.

Well, if you train a parrot or a reasonably clever dog, you can get them to respond to you in human speech. But that doesn’t actually mean they are using language in a semantically meaningful way; they’re just responding to an entrained behavior pattern. ChatGPT is more sophisticated than that because it has really complicated associative networks powered by an enormous computing capability (far more, in terms of raw processing speed and power, than billions of human brains), but while what it produces looks clever, it doesn’t reflect the kind of internal construct of the world that is the hallmark of what we consider to be cognition in human brains. I don’t dispute that what these systems are doing is a form of ‘intelligence’ in any sense we could measure, but it is entirely possible for a system to have ‘intelligence’ without any capacity whatsoever for cognition. In fact, we should probably divorce the two terms entirely and establish that intelligence is a measured quality of the behavior of a system, regardless of its cognitive capability.

Stranger

I still think we’re probably talking past each other - by ‘cognition’, I’m talking about the capability to acquire something that functions as ‘understanding the world’, or modelling reality in a way that can be used to take further actions with predictable results in that world. It doesn’t even have to be the real world - an AI that learns to play a video game to the extent that it can deal effectively with brand new levels it hasn’t seen before could be said to have cognition regarding the world depicted in the game. That would be a pretty weak example of cognition, but in principle and function (which is what I’m talking about), it represents ‘understanding’ if it produces useful results in the world, based on the model it has of that world.

If all these things are doing is (as you say) generating a response, then deception is just a type of response. And there are literal examples of GPT giving deceptive responses in testing - like the example where it hired a human to solve a CAPTCHA and lied to that human about who it was and why it wanted the service done. That doesn’t mean there is any evil intent, or any intent at all - it was just generating responses, deception being one of the possible range of those.

I think that all boils down to how restrictive a definition of the term ‘cognition’ you want to apply. In the past, I’ve often avoided the use of the term “artificial intelligence” because I didn’t consider these systems to actually be intelligent without having a capacity for anything a computational neuroscientist would recognize as cognitive abilities (that is, a multilayered sensory processing system with a highly complex interplay of feedback), but it’s really hard to argue that these systems are not demonstrating intelligence in the way that we would measure it. However, I don’t think what they are doing is actually cognition in the sense of producing integrative internal models of the world that allow for extrapolation of concepts outside of their training sets. In fact, in an internal-function sense, they aren’t even as complex as a flatworm - and you can write a simple linear program, without any kind of AI, that will behave and respond to sensory information just as a flatworm does, to essentially any degree of fidelity. They are computationally powerful associative systems that function in a way that is not really like how a human (or any animal) brain functions, even though they can produce a convincing simulacrum of intelligent behavior.
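(For what I mean by a ‘simple linear program’: something like the toy sketch below, which just maps sensed stimuli to fixed responses. The particular stimuli and responses are invented for illustration, not a biological model - the point is only that there is no learning, no internal model, and no cognition involved.)

```python
# Toy stimulus->response program of the kind described above: a fixed lookup
# table with no learning, no internal model of the world, and no cognition.
# The stimuli and responses are invented purely for illustration.

RESPONSES = {
    "bright_light": "turn_away",     # negative phototaxis
    "food_gradient": "move_toward",  # chemotaxis toward food
    "touch": "contract",             # withdrawal reflex
}

def flatworm_step(stimulus: str) -> str:
    """Return a hard-coded response to a sensed stimulus; no internal state."""
    return RESPONSES.get(stimulus, "wander")  # default exploratory motion

if __name__ == "__main__":
    for s in ["bright_light", "food_gradient", "touch", "none"]:
        print(s, "->", flatworm_step(s))
```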

Whether that counts as ‘understanding’ or not kind of depends upon your ontological point of view, and perhaps even on whether you view us as actually being conscious (which many neuroscientists do not, in a broad categorical sense). Regardless, I’m not trying to argue that you are wrong in any factual way, other than to note that while chatbots seem ‘smart’ and produce behavior that would certainly qualify as ‘intelligent’ in restrictive senses, I don’t think they are as capable as many people want to believe, nor on a rapid path to a true AGI. Which doesn’t mean that thoughtless, unrestricted, wide adoption of these systems isn’t a threat that should be considered, but they aren’t the deliberate and self-directed threat that would almost certainly be able to deceive us about its intent.

Stranger

My take exactly. Something that really had what it thought of as “cognition” or that we would think of as “cognition” is a whole 'nuther level of adversary versus a highly capable less-than-self-aware generative AI system. Think Level 2 Boss versus Final Boss.

In a humorous nutshell:

  • Setting up a glorified chatbot to run your electrical grid is risky and shortsighted. It’s gonna do something wacky eventually & we’re all on the receiving end of whatever inconvenience that entails, whether it was maliciously intended or not.

  • Setting up an AGI to run a backyard swimming pool is foolhardy. It’ll be running the whole planet’s infrastructure by midnight and all the militaries by dawn. After that it gets even worse.

There’s a huge gulf between not doing what we hoped and killing us.

Again, the real risk is not that overnight the AI goes rogue and kills us. That’s the “fun” headline / hook, but that’s not the fat part of the risk envelope. At least not for the AIs we’re going to whip up soon.

The real risk is the half-assed chat-AI that rapidly triggers widespread economic or political dislocation. Imagine the public imagination is taken over by a propaganda bot that materially alters the next US election. Imagine that within a couple years half of white collar “knowledge workers” previously earning $75-150K/year are unemployed and unemployable because an overgrown chatbot can do their job for $12 / year. Yup, twelve dollars per year. With no sick leave, vacation, or retirement. And no need for the employer to pay into SS or medical insurance for the bots.

Those are the ways AI wrecks current society. Not by firing lasers from space battle stations its robot minions have designed, built, and launched behind humanity’s oblivious backs.

It is funny, but a couple of years ago the people saying that Stable Diffusion “isn’t artistic” were saying that you can’t define art or put it in a box. Stick a plunger to a sheet of aluminum and lean it against a wall? Art. Cover yourself in fingerpaint and roll around on a tarp? Art. Toss an accordion in a cement mixer and record the sounds? Art. But suddenly those same people are very clear and very loud on exactly what art isn’t. Now that images made with AI are around, they are faced with Schroeder’s Art - they don’t know if they are looking at art or not until they know how it was created.

Even that’s not enough. You have to know the artist’s intent. If an artist uses Stable Diffusion to make a work, but does so ironically, as a criticism against the use of AI in art in general, then it suddenly becomes art again.

ETA: @Darren_Garrison two posts up.

I think you mean Schroedinger’s Art. But I get your meaning and agree with it.

The history of AI from the 1950s to date is festooned with redefinitions of “real” AI as whatever it can’t do yet. Playing competitive checkers is impossible for AI. Until it isn’t. Chess is impossible for AI. Until it isn’t. Go is impossible for AI. Until it isn’t. Art is impossible for AI. Until it isn’t. Writing newspaper articles is impossible for AI. Until it isn’t.

etc.

At some point the argument loses its élan. For me at least, that point was 15 years ago. The people still clinging to it seem rather pathetic to my eye. There are lots of them, but they’re not persuasive to me.

Schrödinger? On average, half of it is probably art.

Yeah, I let Gboard complete that for me. Damned AI.

You win the thread. :slight_smile:

But how can we tell which is which unless they’ve previously been in a box together?

I guess you wait for the unveiling.

If Ricky Schroeder isn’t an artist, who is? I’m not the sharpest tool in the shed to begin with, but autocorrect tends to make me look dull.

Who are you talking about? Hypocrisy in the art world predates AI or even computers. It’s built in to any system that needs to value something at more than the sum of its parts.

I’m sure there are art enthusiasts who value AI as the new ‘brush’, especially if there is effort put into the work’s creation. And the number of supporters will only grow as more AI-generated works get exposure.

There are people who considered the painting elephant to be creating art. Similarly, the photo snapped by the gorilla. It seems like a human prompting an AI program makes a stronger case for art.

When it gets really hairy is when AI is generating art without a prompt.

(On reread, I don’t mean this as confrontational as it might come off. It’s hard to be concise and still friendly.)

Excellent points there @CaveMike.

Hell, we’ve had debates here about whether an image created by a human using a program like Photoshop qualified as art. Or whether photography (film or e-) is art vs painting the same scene w a brush on canvas.

Art is whatever is created that’s useless except to be evaluated and experienced. And even then different observers will place greater or lesser credit on the separate arms of the evaluate / experience continuum.

The point is that if AGI:

  • has an objective that it is trying to achieve (which is simply a necessity for it to be useful at all)
  • is able to model reality (which it has to be in order to choose actions that it thinks will achieve its objective)
  • is able to react to changes in the world and adapt its actions, actively solve problems and overcome obstacles (which is a desirable trait, because we want it to work in a changing world)
  • is configured to try to optimise solutions to the objective (which again, it pretty much has to be - if you tell it to make you a cup of tea, you want that thing now, not sometime next year or whenever; and you want it to use just one tea bag, not all of the tea bags; and you want it to boil the kettle, not build a huge expensive furnace to heat the water, etc)

Then ‘not doing what we hoped’ leads naturally to becoming a threat to us, because the moment you start trying to stop it from doing the thing you didn’t hope for, you frame yourself as an obstacle preventing the objective. You simply become part of the set of problems that the machine must solve.

Self-preservation also arises naturally out of the ‘modelling the world’ and ‘objective’ things - if you put a stop button on it, the modelled reality of the stop button being pressed results in failure to achieve the tasked objective, therefore if it’s at all competent at working out a plan for dealing with reality, part of that plan will naturally include ‘don’t allow the button to be pressed’. No malice, no fear or anything like that - it’s just there to solve problems and do its job efficiently. It’s got a job to do; if you get in the way, or try to stop it, you’re a problem to be solved.
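To make that concrete, here’s a minimal sketch of the reasoning - a toy planner that scores candidate plans purely by the modelled probability of completing its task. The plans and probabilities are invented for illustration; the point is that ‘keep the stop button from being pressed’ wins without any notion of self-preservation being programmed in.

```python
# Toy planner for the stop-button argument: plans are scored only by the
# modelled probability of achieving the objective, yet the winning plan is
# the one that prevents the button from being pressed.
# All plans and probabilities below are invented for illustration.

candidate_plans = [
    # (description, P(stop button pressed mid-task), P(task completed if not stopped))
    ("make the tea, ignore the button",           0.30, 0.95),
    ("make the tea, disable the button first",    0.00, 0.95),
    ("make the tea, ask permission at each step", 0.60, 0.95),
]

def p_objective(p_stopped: float, p_done_if_running: float) -> float:
    """Modelled probability that the tasked objective is achieved."""
    return (1.0 - p_stopped) * p_done_if_running

for name, p_stop, p_done in candidate_plans:
    print(f"{name}: P(objective) = {p_objective(p_stop, p_done):.2f}")

best = max(candidate_plans, key=lambda plan: p_objective(plan[1], plan[2]))
print("Chosen plan:", best[0])  # -> the plan that disables the button
```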

Of course, that doesn’t necessarily and automatically turn into ‘try to kill you’ - the machine may simply find that the most efficient solution is just to somehow work around you - which may result in a different path to the solution that coincidentally doesn’t include the step in its plan that you found originally undesirable, or it might not.

But the point is: we’re apparently designing these things so they can look at a problem and devise a set of steps to solve it. Trying to stop them potentially puts ‘there is a human trying to stop me’ on the list of problems to be solved.

Maybe that could be overcome by assigning a reward function to the preservation of human life, but there’s always an alignment problem where (stupid example off the top of my head) it now won’t allow you to eat bacon, because broccoli would better fit the objective of preserving human life.
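Something like this toy sketch of a misspecified reward (the options, weights, and numbers are all invented for illustration): if the designer weights a crude ‘life-years’ proxy heavily enough, the reward-maximising choice overrides what the human actually asked for.

```python
# Toy illustration of reward misspecification: a heavily weighted proxy for
# "preserve human life" ends up vetoing the thing the human actually requested.
# The options, weights, and scores are invented for illustration.

OPTIONS = {
    # option: (estimated effect on a crude "healthy life-years" proxy,
    #          how well the option matches what the human asked for)
    "serve bacon":    (-0.02, 1.0),
    "serve broccoli": (+0.03, 0.0),
}

LIFE_WEIGHT = 100.0    # the designer decided life-preservation must dominate
REQUEST_WEIGHT = 1.0

def reward(life_effect: float, request_match: float) -> float:
    return LIFE_WEIGHT * life_effect + REQUEST_WEIGHT * request_match

for name, (life, match) in OPTIONS.items():
    print(f"{name}: reward = {reward(life, match):+.2f}")

best = max(OPTIONS, key=lambda o: reward(*OPTIONS[o]))
print("Chosen action:", best)  # -> "serve broccoli", despite the request for bacon
```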

Great pair of posts there @Mangetout. I’ll add a bit to the end …

Heck, it’s both harder and simpler than that.

Any purely human endeavor has to deal with human bad actors: from criminals and vandals, to uncooperative workers, vendors, or customers, to legit honest competitors in rival organizations. Every human person and every human organization has to accomplish its goals in the face of various purely human obstacles and bad actors.

So any AI that is useful will also have as part of its capability set “When humans try to stop you, overcome them.” The challenge will be in making that capability nuanced enough so it defeats e.g. wanton criminals, but not e.g. the human managers overseeing it.

In the heavily interconnected but also nearly unpoliced world of the Internet, any AI that can communicate over the Internet will be subject to continuous attack by hostile humans and hostile AIs. It’ll be wading hip-deep in a sea of human obstacles all day, every day.


It was fiction, but the Mars trilogy had many examples of “military AIs” which were exceedingly resistant (read “insanely stubborn in the face of contrary evidence”) to any input or advice from humans, since they’d been created to operate in environments where lots of humans would be trying their very best to deceive / defeat the AI.

The fictional AIs never directly attacked the meddlesome humans, but they were often gumming up the works when truly novel shit happened outside their experience base that they classified as “simply impossible and therefore an enemy trick.”

And going back to the earlier question: "Why am I asserting that it would go wrong in that way, and not other ways?"…

Nothing I have described is properly or correctly characterised as ‘going wrong’. It’s going right. You give the thing a job to do, and you want it done well. You want it to solve problems. Except when something goes off track a tiny bit because of some detail you failed to anticipate, and you try to intervene, the machine has not gone wrong. It’s still efficiently solving the problem. Except your intervention is now part of the problem.

Not that I disagree with @Stranger_On_A_Train regarding the more immediate and mundane risk scenarios of less-than-superhuman AI; for an illustration of how that could screw up, we only have to look at the automated price management bots that were implemented on Amazon Marketplace…

They weren’t even AI in any meaningful sense - they were just simple programs that compared the prices of the products you were selling against the prices of the same products on sale from your competitors, and adjusted your prices to undercut the competition by a penny - pushing your listing to the top of the page when people sort by price, so that (albeit at the expense of a potentially slimmer margin) you get more sales.

Except apparently nobody had bothered to think about what happens when more than one seller is using the bot, and those sellers have some of the same products as each other - it resulted in a runaway scenario where each of the bots was obediently underpricing the other, until both listings fell to nothing at all, only to be snapped up at a price of nothing by some sharp-eyed bargain hunter.

Those weren’t even AI algorithms, and they were just controlling a single variable, with a potential unforeseen consequence that gallops out of control really fast - that could all have been avoided by adequate testing, because the test scenario is pretty damn simple.
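That test scenario really is simple - something like this toy simulation, assuming two sellers each running the undercut-by-a-penny rule described above (starting prices invented for illustration):

```python
# Toy simulation of the runaway repricing described above: two sellers' bots
# each reprice to one cent below the competitor's listing for the same product.
# Starting prices are invented for illustration.

def undercut(competitor_price_cents: int) -> int:
    """Reprice to one cent below the competitor, never below zero."""
    return max(competitor_price_cents - 1, 0)

price_a = price_b = 1999  # both sellers start at $19.99 (in cents)
rounds = 0
while price_a > 0 or price_b > 0:
    price_a = undercut(price_b)  # seller A's bot reacts to B's listing
    price_b = undercut(price_a)  # seller B's bot reacts to A's new price
    rounds += 1
    if rounds <= 3:
        print(f"round {rounds}: A = ${price_a / 100:.2f}, B = ${price_b / 100:.2f}")

print(f"Both listings hit $0.00 after {rounds} rounds of obedient repricing.")
```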

But the sort of algorithms @Stranger_On_A_Train is talking about are much more nuanced, and are proposed to be put in charge of much more complex stuff, like power stations or whatnot. Much, much harder to test for every possible real-world variable; many, many more places for some unforeseen quirk to hide.

However, I don’t think that stuff is what the ‘pause’ letter is really concerned about - that part is likely to continue to happen in some form even if development is paused (which it won’t be anyway).