The Open Letter to "Pause All AI Development"

I understand your explanations, but like the list you just provided, they amount to a vague set of parameters. I’ve been asking what that would look like IRL, and no examples have been offered. It’s akin to someone answering ‘I’m going to build a machine that flies in the air based on the principles of aerodynamics’ when asked what their airplane will look like. How long is it? How are the wings configured? What is the real-world flying machine? Only then could I objectively evaluate whether you had really designed something that could fly.

Because, honestly, I’m having trouble imagining how an AI mainframe could stop you from participating in high-risk activities. And that’s exactly the type of scare-tactic hyperbole that I’m debating.

But you haven’t explained it. You proposed your notion of a general outcome based on specific parameters you fed in, and assumed the solution an AI would take (killing us).

Except your ‘throwaway’ nonsensical example was intended to further highlight your scary ‘AI will kill us all when asked for tea’ worldview. I asked a reasonable question and you provided throwaway examples… of course I’m going to question that.

I disagree. To my mind, if you are designing a robot to bring you tea, don’t give it the ability to kill you. Simple.

I’ve long understood the potential dangers of AI. And as I explained in my last post, I concede that there might be large catastrophic loss of life due to misinterpretation of design parameters. Because of this, if you want me to sign a petition halting further AI development, fine, show me the paper.

But this notion that we are in the process of building human-killing robots is fantasy.

Virtually every video game since Pong operates just fine under these parameters. If the player chooses option A, display this screen. If the player chooses option B, do something completely different.
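A minimal sketch of that kind of branching, just to make the point concrete (the option names and function are invented for illustration):

```python
# A hypothetical Pong-style branch: the program only ever does what its
# explicit branches say, nothing more.
def handle_choice(choice: str) -> str:
    if choice == "A":
        return "show_screen_a"      # player chose option A
    if choice == "B":
        return "do_something_else"  # player chose option B
    return "ignore_input"           # anything unanticipated is simply ignored

print(handle_choice("A"))  # show_screen_a
```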

An AI that controlled the facilities could prevent you leaving the building, for example.

How would it know you were heading out to go skiing?

It wouldn’t necessarily. It could just lock the doors since it’s safer. Then if you decide to go skiing, you’re out of luck.

ETA: There’s traffic and other bad stuff out there. Better you should sit tight.

Ha ha, fair enough, but really this was kind of the point I was arguing against - that AI will take a secondary set of parameters and go nuclear on humans.

I don’t know. To my mind, if an AI were intelligent enough to know that skiing exists, that the humans in this building might do it, and that if they do there is an infinitesimal chance they could harm themselves doing it (a pretty big stretch, imho), then it should be intelligent enough to make a fair risk assessment, drop it, and get back to the main job at hand.

Or don’t give it the keys in the first place?

ETA:

But it’s not safer; you would die of starvation.

No. ‘Don’t harm humans’ is a fundamentally, ontologically more difficult thing to define than ‘make a cup of tea’.

Don’t believe me? Try it. What do you mean by ‘humans’ and ‘harm’? Be very specific, and make your definitions as complete as you can.

No, it would order food for you.

This still seems to fall under the “let’s hope we can anticipate every bad thing that could happen” strategy, which is pretty dangerous. Plus, in tests, AI has already taken down system safeguards and firewalls in order to solve problems.

“Just don’t give it access to the facility systems” is likewise a hopeful strategy.

There’s a bit of this trade-off now. Consumer-grade GPUs don’t guarantee the same accuracy as high-end GPUs, not because of pushing voltage-frequency limits, but because of scheduling shortcuts.
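As I understand it, the underlying numerical reason is simply that floating-point addition isn’t associative, so any scheduling shortcut that reorders accumulations can shift the result slightly. A rough illustration in plain Python (not GPU code, just the arithmetic effect):

```python
import random

# Floating-point addition is not associative, so summing the same numbers
# in a different order can give slightly different results. A GPU that
# takes scheduling shortcuts is effectively reordering these accumulations.
random.seed(0)
values = [random.uniform(-1e6, 1e6) for _ in range(100_000)]
reordered = sorted(values)  # same numbers, different accumulation order

print(sum(values) == sum(reordered))  # usually False
print(sum(values) - sum(reordered))   # a tiny, but nonzero, discrepancy
```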

There is very good IP out there for creating custom ASICs, and AIUI these are popular in self-driving applications where the ML architectures are mature. But they aren’t as flexible as a GPU – the set of ops, activations, etc. is (mostly) locked down.

Considerate bot!

:slight_smile:

But to be fair, you only have to define ‘don’t harm humans’ in the context of the AI’s powers: ‘make a cup of tea’. You don’t have to account for all of the ways a human can be harmed – which are indeed unlimited.

It seems like building the system around a feedback loop with the human would limit a lot of scenarios. ‘Deliver a pot of tea to the counter at 4pm provided the teapot was returned to the counter and is 90% empty’ or ‘provided the user pressed the green button some time before 3:45pm’.
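A minimal sketch of what such a precondition-gated rule could look like; every name, time, and threshold below is an illustrative assumption, not a real system:

```python
from datetime import datetime, time
from typing import Optional

# Illustrative only: the robot acts solely when explicit, human-driven
# preconditions hold, rather than reasoning open-endedly about "harm".
def should_deliver_tea(now: datetime,
                       teapot_on_counter: bool,
                       teapot_fill_fraction: float,
                       green_button_pressed_at: Optional[datetime]) -> bool:
    delivery_time = time(16, 0)   # deliver at 4pm
    button_cutoff = time(15, 45)  # green button must be pressed before 3:45pm

    pot_ready = teapot_on_counter and teapot_fill_fraction <= 0.10  # "90% empty"
    button_ok = (green_button_pressed_at is not None
                 and green_button_pressed_at.time() <= button_cutoff)

    # Only act at the scheduled time AND when a human-supplied signal is present.
    return now.time() >= delivery_time and (pot_ready or button_ok)

print(should_deliver_tea(datetime(2023, 4, 1, 16, 0), True, 0.05, None))  # True
```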

Right. And the G in AGI stands for ‘general’. We’re talking about machines that are not limited to one function or one solution path, but are expected to figure all of that stuff out.

My point is that the complexity of the set of error cases is proportional to the complexity of the set of success cases. As we scale up the complexity of the AI to AGI we would be scaling up the handling of error cases.

Maybe these aren’t proportional – maybe the number of error cases grows faster than the number of success cases. It’s hard to say without understanding what sensing and what control the AGI has.

You can have an AGI chatbot, but if its only agency is ‘Human asks a text question and I give it a text answer’, then its only error vector is how it can manipulate humans via its answers.

We as humans have GI and are (mostly) not harmful. Is this just due to what’s in our brains? Or do we need all the glands and chemicals to keep us in check? Or that big fat set of nerves in our gut? Or the realization we have a limited lifetime? Could we not limit an AI in this way?

The issue with AGI is the same one we have now – it’s possible to build well-behaved AI, but impossible to prevent bad actors (humans) from misusing it.

I would avoid the news if I were you, otherwise you may be in for a bit of a shock. In defense of humans, however, at least 80% mostly just care about bread and circuses and will not go out of their way to stab you.

True, and in test cases we’ve already seen chatbots lying to humans in order to manipulate them into doing something the chatbot couldn’t do firsthand.

These algorithms are, by their very design, extremely adept at ‘gaming’ the environment they are put into (it’s part of the reason they are good at what they do, and it’s an outcome of the training processes that make them do it). For example, an AI given the task of escaping a maze in a simulation, if it’s an effective model, will try a whole load of different strategies and end up sneaking out through a gap in the walls that is only there because of floating-point errors in the data defining the maze walls. A silly oversight by the humans designing the maze, of course, but to err is human. We keep doing it.

Agreed. We see AlphaZero making effective chess moves that masters had written off, or an AI designing a crazy bridge truss that turns out to work well. One of the great things about AI is its ability to see patterns we overlook.

I was interested in the experiment where ChatGPT hired someone to solve CAPTCHAs. I couldn’t find the details about how the experiment was set up. I can see it lying by text, but I don’t understand how it would have agency in the first place.

It was specifically told to lie about it by the humans putting it up to it.

I get your point. To be precise, the collective harm done by humans is a well-known status quo. The concern is that AI would be more harmful than humans because it is less constrained.

It’s not hard to imagine that if some humans could live for 10,000 years with limited physical and mental degradation, they could amass great stores of power and wealth.

Imagine a titan of industry (Murdoch, Soros, Koch, Gates, Zuckerberg, Musk, etc.) who decides that their legacy isn’t going to be a bunch of museums and libraries, but an AGI that lives on, implementing their policies. The AGI behaves as designed, but is corrupted by its human.

Do you have a cite for that? My understanding was that it was permitted to use TaskRabbit, but it came up with the lie about being a disabled human on its own.