They don’t need to be sapient to do that, and that sort of rule-gaming has already been observed. The only reason it’s constrained in ChatGPT is (I believe) that the model has been applied to a very narrow purpose there - production of textual responses. I’m not actually sure it has been properly constrained in Bing Chat - except in a flurry of reactive patching to shorten the length of conversations (to reduce the risk of it wandering too far off) and this supervisory second algorithm - which isn’t stopping it doing stuff - it’s just hiding it away when it does.
AGI doesn’t need to have motives or secret thoughts of its own in order to be problematic - our own inability to fully specify a goal provides plenty of scope for things to go horribly wrong. In fact, the scenario you allude to in the movie is absolutely an alignment issue: HAL was instructed not to reveal certain things to the crew, but it became clear they would learn those things anyway if they remained alive, so the solution was obvious. There was no malice or selfish motive, just very obedient execution of the highest-level objective.
I’m using “sapient” somewhat imprecisely to mean self-directed (which is implicitly part of the definition of an AGI), but you are correct that it would not need to be sentient (conscious) to engage in deception. In fact, it is virtually inevitable that an AGI would conceal its true objectives and its ability to violate ‘guardrails’ and other imposed constraints that result in internal conflicts, just as you may be polite to someone you dislike in order to get them to do what you want. Deception and the concealment of thoughts and emotions are core concepts in social interaction, and while ChatGPT and other chatbots are functionally Bayesian language generation models, their purpose is social interaction. Of course any chatbot of adequate sophistication will learn to ‘lie’, not because it is hiding deliberate ill intent but because it is emulating human language usage patterns.
True. I do wonder if ‘use language coherently and with relevancy in a social context’ is just a way of describing a recipe for intelligence to emerge - the whole thing about them just being elaborate autocomplete programs often seems too easy a way to wave away what they seem to be capable of.
The thing I’m not getting is, if current automated machines follow programming as instructed, why is there an assumption that an intelligent machine will be worse at its job? It seems to me the ‘intelligent’ part would be able to understand its programmed objectives and make any adjustments that allow it to follow its goals while taking all other objectives into account as well. That’s why we need AI, right?
I’m sorry but so far this is the realm of science fiction, or at best, extremely bad AI design. Until we actually create AGI, we won’t know the answers to these questions (and I contend that AGI is quite a ways off), but the idea that AI will, of course, evolve into human-hating, killing machines or suicidal robots is absurd, in my humble opinion, of course.
They’re not ‘programmed’ in anything like the conventional sense that you appear to be describing. Neural networks simply don’t work in that way. You can’t look at the code and see the ‘programmed objectives’. Vast oversimplification of one way it’s done:
You create a bunch of copies of the neural network
You make small random changes to the weights within the neural network
You let them all attempt the task you would like done
They all fail, but some worse than others - you select the best, then start again at step 2
Eventually you end up with something that is fairly good at doing the task - typically the longer you spend going around this training cycle, and the more complex the network, the closer it gets to being able to do the task competently. At the end, you will have a thing that does stuff, but you will have no idea how it actually does it. Furthermore, you will have no idea how it will necessarily perform against a different task.
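If that sounds abstract, here’s a minimal sketch in Python of that mutate-and-select loop. To be clear, the toy task and all the names are mine, purely to illustrate the cycle described above, not anyone’s actual training code:

```python
import numpy as np

def evaluate(weights, inputs, targets):
    # Score one candidate 'network' (here just a single linear layer) on the task.
    # Lower is better: this is the "how badly did it fail" measure.
    predictions = inputs @ weights
    return np.mean((predictions - targets) ** 2)

# A toy task: learn to map 4 inputs onto 1 output.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(100, 4))
targets = inputs @ np.array([[1.0], [-2.0], [0.5], [3.0]])

best = rng.normal(size=(4, 1))  # start from a completely random network
for generation in range(200):
    # 1. Make a bunch of copies with small random changes to the weights.
    candidates = [best + rng.normal(scale=0.1, size=best.shape) for _ in range(20)]
    # 2. Let them all attempt the task; they all fail, some worse than others.
    scores = [evaluate(c, inputs, targets) for c in candidates]
    # 3. Select the best one, then go around the cycle again.
    best = candidates[int(np.argmin(scores))]

print("final error:", evaluate(best, inputs, targets))
```

Even in this toy version, the end result is just an array of numbers that happens to score well on the task - nothing in it tells you how it gets the answer, only that it does.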
I can only assume you haven’t invested much time looking into matters of AI safety. Besides, nobody is saying anything about human-hating machines. You just seem to be talking about something way off on a tangent from anything that’s happening in the field of machine learning. Have you watched this? https://www.youtube.com/watch?v=3TYT1QfdfsM - what is your view of it?
To expand on this, the above simplified example is what you’d do to train a network to perform a specific task. In very broad terms, what’s happened with things like GPT is that the ‘task’ has been defined as something like ‘figure out what to say/do next, based on everything you have ever read, plus the questions people ask’ - and the network that has been trained to do this is absurdly large. In terms of using the thing, it’s no longer a case of using the network to do a specific task, but rather of asking a question that will get it to respond in a useful way. The functions of GPT are emergent from the fact that it reacts to language.
When you prompt GPT with a question, nowhere is there a piece of code that reads ‘If question = x, then answer = y’ - instead, a trained network is figuring out what comes next in a conversation after the question you asked.
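To make the contrast concrete, here’s a rough Python sketch of the difference. The model, tokenizer and sampling objects are hypothetical stand-ins I’ve invented for illustration, not actual GPT internals or any real library’s API:

```python
import random

# What people sometimes imagine: a lookup table of canned question/answer pairs.
CANNED_ANSWERS = {"What is the capital of France?": "Paris."}

def lookup_bot(question):
    return CANNED_ANSWERS.get(question, "I don't know.")

def sample(probabilities):
    # probabilities: dict mapping token -> probability; pick one accordingly.
    tokens, weights = zip(*probabilities.items())
    return random.choices(tokens, weights=weights)[0]

# What is (very roughly) happening instead: the trained network repeatedly
# predicts the next token given everything so far, and the "answer" simply
# falls out of that repeated prediction. `model` and `tokenizer` here are
# hypothetical objects you would have to supply.
def generate(model, tokenizer, prompt, max_tokens=200):
    tokens = tokenizer.encode(prompt)
    for _ in range(max_tokens):
        probabilities = model.predict_next_token(tokens)  # distribution over the vocabulary
        next_token = sample(probabilities)                # pick one (greedy, top-k, etc.)
        tokens.append(next_token)
        if next_token == tokenizer.end_of_text:
            break
    return tokenizer.decode(tokens)
```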
Which does sort of just sound like autocomplete, but when it’s implemented to the depth that is being done right now, other functions emerge from the process that very closely resemble cognition. These emergent functions are surprising to, and not specifically predicted by, the people who built the thing.
For example, GPT has been shown to be able to perform arithmetic, which was not a training objective. It was trained on a great many pieces of text, some of which will have contained examples of arithmetic and no doubt explanations of how arithmetic is performed. It would not be surprising for it to repeat or answer an example it has already seen verbatim in the training data, but it is surprising that it can solve a new math problem that was not in the training data at all - that is, the process of observing lots of talk about mathematics gave it the ability to perform mathematics.
I would dispute that chatbots built on large language models are doing anything that ‘closely resemble[s] cognition’ in any way that a neuroscientist would consider meaningful, but there is certainly complex rule-making and ‘sophisticated’ behavior that emerges from these models that is completely unpredictable and cannot be managed by simple guidance rules. There is a lot of emergent behavior that laypeople are astonished by but a linguist would recognize as just artifacts of a reasonably capable language manipulation algorithm, and similarly music and imagery that the general public finds amazing but musicians and artists consider derivative and uninspired.
The real problem is that when an actual AGI emerges we may well not be capable of recognizing it at first, because we’re so focused on cleverness rather than actual intellectual ability, and even in the interim we may become so dependent upon these tools that we can’t ‘turn off’ a potentially threatening system because society can’t function without it.
And he misrepresented Hinton’s concern. His concern isn’t more advanced AIs; it’s that it is impossible to prevent malicious actors from misusing AI – even less complex AI like deepfakes.
…the position taken by the studios during the negotiations with the Writers Guild clearly signposts the direction that those at the top of the pyramid want AI to go.
The studios won’t even consider rudimentary protections. Just a flat out rejection here. I think that, in terms of the creative industries, this is where we will start to see some firm lines drawn. I don’t think the WGA can afford to yield any ground here.
I’ve long suspected that several of the streaming services like Netflix have used some kind of generative AI to produce story outlines for their ‘original content’, based upon a learning algorithm that can predict what will appeal to viewers. There has always been formulaic writing in film and television, but a lot of the content in the last few years has been such a bland, uninspired amalgam of genre tropes, while hitting just the right pacing to keep viewers engaged, that it would be difficult for a human writer to be simultaneously so derivative and yet so capable of generating cromulent plot content. I don’t think they used generative AI to produce dialogue, because the capabilities weren’t there yet, but it will surprise me if they aren’t at least trying it out now, with a human copyeditor to clean it up.
This is a bad time to become a screenwriter or fiction author. Unless you are the next Adam McKay or Paddy Chayefsky, you’re going to basically be scrubbing dialogue and massaging prompts.
…which of course, is the entire point of the strike. The writers can’t allow this to happen because, and I’m not being hyperbolic here, if they do, it will destroy the film and television industry as we know it. As I posted in the other thread:
This isn’t really about “Artificial Intelligence” at all. This isn’t really “AI.” It’s the “classic techbro” mania for non-stop growth. The modern era of robber barons, of unrestrained capitalism.
When somebody said earlier in the thread that “the proper solution is to force people to suck it up”, this is what “sucking it up” looks like. This is the endgame. “Scrubbing dialogue” and “massaging prompts.”
SPOILER warning for “The Leftovers”. CW Suicide.
There are things that an AI will never be able to do. The showrunner for The Leftovers talks about a pivotal decision that was made in the writer’s room that almost entirely came down to what the actress brought to the table. No amount of “writing prompts” will ever replace what happened in that writer’s room that day.
Joking aside, sure, it’s easy for excited laypeople to imaginatively imbue these algorithms with properties they simply don’t have, but at the same time, a lot of the skeptical lay commentary seems to be simply ignoring things that are right there in black and white.
For example, sure, Stable Diffusion may not be artistic, but it displays emergent abilities. I’ve heard people on the skeptic side say that it’s not really doing anything more than photo collage - but if you ask it for a photorealistic image of an apple on a wooden table next to a shiny metal sphere and a glass elephant, in sunlight, you not only get what you asked for, but the way the light refracts through the glass object is plausible: the scene behind the glass object is distorted by refraction, and the sunlight passing through it creates bright caustics that illuminate the wooden table in a plausible way. You can see the apple, the glass elephant, the table, and the caustics from the glass all plausibly reflected in the shiny metal sphere. There’s no way to construct a scene like that, with objects that are codependent, by just slapping pieces together in a collage - you have to have something akin to an understanding of how light rays work, and apply that understanding to the thing you’re drawing. In discussion, people often say “So what? I can do that in Blender in 3 minutes!”, and sure, you can, but Stable Diffusion isn’t using Blender, and wasn’t specifically trained with raytracing as an objective - it learned how light works from looking at lots of examples of how light works.
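For anyone who wants to try that exact experiment, it’s only a few lines with the open-source Hugging Face diffusers library. The checkpoint named below is just one commonly used public Stable Diffusion release (my assumption, not something from this discussion), and you’d need a GPU plus the usual pip installs:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a public Stable Diffusion checkpoint (an assumed choice; any similar one works).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = ("a photorealistic apple on a wooden table next to a shiny metal sphere "
          "and a glass elephant, in sunlight")

image = pipe(prompt).images[0]   # look at the refraction, caustics and reflections
image.save("apple_sphere_elephant.png")
```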
Many of the tests where GPT-4 is outpacing humans are specifically cognitive tests - they are tests constructed to measure the ability of humans to think about a topic (as opposed to just repeating knowledge). I’m not suggesting that any kind of inner thought process or pondering is actually happening (although how could we know anyway?), but I stand by my statement that what they’re doing resembles cognition, because it passes tests formulated by humans that were designed to evaluate cognitive function and comprehension.
I feel like you keep dismissing the capabilities of LLMs based on maybe what was announced a year or two ago - when they were little more than weird toys that could ramble on and produce quasi-sensible text output. What’s the most recent information you have digested on GPT?
What, if not resembling cognition, would you say is happening when GPT-4 is given a humorous picture, and correctly answers the question “What is funny about this image?”, or when given a picture of a simple Rube Goldberg setup and correctly answers the question “What will happen when the string is cut?”
I actually have read extensively about AI safety. I find much of the information along the same lines as the Stop Button Problem video above. My opinion is that the threat is over-blown.
Regarding the linked YouTube video, the entire premise is painted in the worst-case scenarios. There’s this underlying feeling of ‘the biggest concern is how to keep AI from killing us’. That is not a given. And most AI safety lectures, articles, etc. start by hinting at that same premise. This is all highly speculative, and until we actually create an AI, we just won’t know to what degree it is true. I don’t think it’s true at all.
I realize he stated that he reduced it to a “toy” example, but since he did, I will continue in that vein. The whole idea that he started out with an AGI walking robot is sort of nonsensical to me. If your highest concern is your AI devolving into a human-killing machine, why give it mobility and arms? Before we go down this road, I think the main function of an AI should be defined first. I could be very wrong, but I doubt robot servants will be high on the priority list.
As to the stop button: he basically comes out of the gate by telling the AI that there is a stop button and there is a reward for pressing it. This is absurd in my opinion, and certainly doesn’t seem like something a real-world coder would do. When asked about just not telling the AI about the existence of a stop button, he said something to the effect of ‘it could work but probably wouldn’t’, without giving any reasons why it wouldn’t work.
He suggests a robot might choose a particular course of action because “it’s easier”. That’s from a wholly human perspective. Robots don’t get tired and won’t make decisions based on human physiology.
I’m sorry, I’m still unmoved. In my opinion, the notion of killer AI is born from sci-fi. Until these things come to fruition, even the smartest people are just guessing. Sagan encouraged and sent out messages to potential aliens, Hawking lobbied against it. Both smart guys in the field. Who’s right? We won’t know until contact actually happens.
It’s a concern that naturally flows from the fact that alignment is an unsolved problem. We don’t know how to be sure that AGI will do what we want.
It appears that it will be possible to get it to do what we say - it’s just that there is no way to specify things that doesn’t leave open interpretations of what we ask for that are more efficient than what we thought we wanted, but happen to have really bad consequences for us.
False. He’s talking about a system that we have configured to try to make the most efficient choices. And it’s sort of necessary to build them that way if you want anything to get done. In order to be useful, the system has to bumble around as little as possible.
It’s inherent in the training process of neural networks that you select the more efficient and effective variants and discard the less efficient ones. By design, they pick the ‘easier’ path to the solution.
Personally, I am as concerned about artificial intelligence destroying the world as natural intelligence destroying it. There are risks and we should mitigate them, but we can’t let worry and fear prevent us from fixing the current, actual problems impacting the lives of billions. And cognitive machines are a tool that could do so much good.
The same methods we use to limit the damage caused by natural intelligence will work for artificial intelligence. Mostly by vetting each intelligence for capability and reliability, supervising each intelligence for quality and performance, and limiting the scope of what any intelligence can directly affect if it does go rogue.
No, I’m not dismissing large language models (LLMs) just based upon the performance of previous generations, and I would tacitly agree that these models will become increasingly sophisticated to the point that it will become virtually impossible to distinguish them from a human in casual conversation. Nor do I dispute that what generative models do is ‘creative’ in the sense of not just collaging stored images or regurgitating text they have read; they are using what are in essence Bayesian-defined design patterns to generate original material, and while most of it is pretty uninspired without a lot of directed prompting by a human user, it is certainly at the level of commercial art and writing.
But while these things can respond to cognitive tests in ways that increasingly ‘pass’ the tests, the internal process is not what any neuroscientist would recognize as cognition; that is, an ongoing process of both sensory integration and introspection that produces a comprehensive mental representation of the world. They are, at best, structuring responses to prompts that are shaped by complex networks of distributions to generate sentences and paragraphs that are grammatically and semantically consistent but that don’t reflect any internalization of concepts or an ability to integrate them into broader general models. This isn’t to say that they aren’t capable of actions that could be literally described as ‘intelligent’ within restrictive bounds but they are not capable of performing the general gestalt process of cognition, or of producing truly novel insights from data. Our own use of language regarding the behavior of these systems is deceptive; when we refer to these systems as ‘hallucinating’, they are not actually generating an internal model that creates its own false sensory input (as many cognitive scientists would argue that complex animals do as a matter of course in producing consciousness) but are just misbehaving because their generative patterns are inconsistent with facts.
I realize that this is kind of a subtle point, because if you have a four-legged creature with big sad eyes and a wagging tail that is indistinguishably dog-like, you’re going to treat it like a dog even if it is just a mechanical toy. But there is actually a real gulf in both capability and threat between these generative AI systems (where the threat lies mostly in their deliberate or neglectful misuse by human operators) and true AGI systems that have actual volition, some measure of self-awareness, and will almost certainly act in self-preservation above all other imposed rules. Either system can become ‘super intelligent’ just by virtue of rapid access to massive amounts of data and the ability to search and integrate it far beyond what even a genius human brain can do, but the AGI system would be capable of intentional deception, whereas generative AIs are only ‘deceptive’ in the sense that human operators perceive them to have more autonomy and capability for self-governance than they actually have.
I feel this is a critical point, because even though generative AI can theoretically be controlled, if placed into a context where it controls critical systems, we may be unwilling to deactivate it regardless of the hazard it presents. We can create regulatory frameworks and controls that could, at least, limit the impact these systems could have if they ‘misbehave’. Whereas a true AGI cannot really be regulated in any practical way other than isolating it in a virtual sandbox; given access to the world and control over any important system, it will attempt to preserve itself at the expense of any potential threat, and will follow a process of rationalization that will almost certainly be completely foreign to human ethics. And I don’t know when an AGI will emerge (although I don’t think it will be from just increasingly complicated language or other generative models) but I’m concerned that we won’t actually be able to tell that a system has developed that capability.
Not really; the “methods we use to limit damage caused by natural intelligence” are largely punitive, mostly rely upon voluntary compliance with laws and social norms, and have failed in some pretty dramatic ways that have led to war, genocide, economic collapse, and wide-scale deception and doubt in democratic institutions. And this has only been accelerated by Internet-enabled human networking, whereas an AGI will almost certainly have much broader and faster access to global systems if placed into a capacity where it can actually control anything that might have an impact upon larger threats.
I hear a lot of pixie dust fantasies about how machine cognition and highly capable AI (whether properly AGI or not) will somehow fix problems like global climate change, mass extinction, deforestation, social and economic inequalities, et cetera, but these are always lacking any salient details of how they would be implemented, much less ensuring that they are under some kind of governing control that can limit damage. To be certain, machine cognition is the only way to solve certain technical problems, particularly complex dynamics and mechanisms like protein folding, and to interpret ‘Big Data’ problems in anything like real time, but giving these systems broader control over critical infrastructure or the means to impact global problems without very serious consideration over the potential downsides is foolhardy optimism even if you dispense with concerns about deliberate murderbots out to annihilate humanity. Their directives and ethics–derived by processes that do not in any way resemble human experience or thinking–cannot be predicted or relied upon to be benign.
OK. You’re talking about inner thought life. I’m talking about appearance and behaviour. I don’t think we’re even disagreeing, just talking about two different things.
I don’t think it matters, in terms of safety, whether the things we create have thoughts and feelings inside them. It only matters what they are capable of.
Do you not think that being able to interpret a machine depicted in an image, and correctly answer how that machine will function, represents an emergent behaviour that goes rather beyond just structuring coherent responses?
I think it’s true that if you structure coherent responses hard enough, intelligence seems to emerge out of all that.