Could AI design and create good video games?

In theory maybe; but creating and powering a generative AI capable of that may not be possible and almost certainly isn’t affordable. They’re horribly inefficient.

Actually, Microsoft has done exactly what was described. Mangetout mentioned this. (It wasn’t Doom, though. It was Quake 2.) Here’s a link with more information.

It’s basically as he describes, a real-time daydream.

Where I disagree is the prospect that this could be used to similarly daydream something from scratch. These “AI” models just absorb and regurgitate existing content. The only reason this Quake 2 demo is possible is because there are already endless hours of gameplay on which to build the model. I don’t see how this is possible with a new game, because this approach depends on that new game already existing for the purpose of training the model. I suppose it might be theoretically possible to tell the model, “make it just like Quake 2 (or whatever) except change all the villains to cats,” or something. But it wouldn’t be new, because that’s not how these models work.

There’s no particular reason why it would have to work exactly as these technical demos did - they regurgitated existing content because that was the design goal.

If I ask an AI image generator for ‘a screenshot of an 8 bit game where the player has to jump over Donald Trump’s head on a ride-on lawnmower’, assuming it passes the content policy, I’ll get something that, whilst it will probably contain elements that are derived from extant video games, will be somewhat novel.

Ditto if the request is made of an AI video generator - the output doesn’t necessarily have to be a simple copy of something it already saw - you can ask for things nobody has ever seen before and these algorithms will try to create them.

So it seems like it would be largely an engineering problem to extend that to realtime video generation based on an initial prompt plus ongoing additional controller input prompts.

Probably one of the bigger issues is making it a coherent game with no dead-ends, no impossible jumps and where the output doesn’t morph and evolve into something completely different over time.

@Der_Trihs makes a good point about resources/efficiency, and that may or may not yield to pretraining and local running.

Part of the question would be how much we expect the AI to do. If I ask an AI to make a video game and it uses Unreal Engine 5, is it still an AI-made game, or does it have to create a new game engine and rebuild physics out of nothing each time? What if I taught an AI how to navigate and use JRPG Maker? Would that count as an AI-made game?

I think where we already see AI is in things like generating assets, textures, etc, especially with things like AI tools in Photoshop being used by artists to enhance their own work.

The next step for AI in a game proper would probably be plug-in modules that handle specific tasks. For example, someone modded Mount & Blade so that when you ride into town and talk to someone, ChatGPT gets a prompt “you are a peasant in such and such village, under such and such Lord, and you have this quest to offer” and then has a conversation with the player, who is inputting text through the game and seeing responses ingame as well. A model trained specifically for a game, with extensive prompts for characters, would basically revolutionize RPG-type games. You’d probably want a whole back-end system more akin to current dialogue trees and logic chains to drive how the AI is prompted, and then the AI handles the actual conversation.
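To make that idea concrete, here’s a rough Python sketch (every name and the template itself are invented for illustration, not taken from the actual mod): the game’s back-end state decides what goes into the prompt, and the model would then handle the free-form conversation.

```python
# Hypothetical sketch: the game's dialogue logic assembles a system prompt
# from current state; the LLM only handles the open-ended chat.
NPC_TEMPLATE = (
    "You are {name}, a {role} in the village of {village}, "
    "sworn to {lord}. You have one quest to offer: {quest}. "
    "Stay in character and never mention being an AI."
)

def build_npc_prompt(npc: dict, quest_available: bool) -> str:
    """Assemble the system prompt from game state, as a dialogue tree would."""
    quest = npc["quest"] if quest_available else "none (you already gave it away)"
    return NPC_TEMPLATE.format(
        name=npc["name"], role=npc["role"],
        village=npc["village"], lord=npc["lord"], quest=quest,
    )

peasant = {
    "name": "Aldric", "role": "peasant", "village": "Rivermoor",
    "lord": "Lord Harlaus", "quest": "drive the bandits from the mill",
}
prompt = build_npc_prompt(peasant, quest_available=True)
# `prompt` would then be sent as the system message to whatever chat model
# the mod uses, with the player's typed text as the user message.
```

The point is that the creative part stays bounded: the back end controls what the NPC knows and offers, and the model just performs it.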

You could also have AI drive procedural generation of terrain, cities, etc as in games like Minecraft.
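As a toy illustration of what that kind of procedural generation looks like under the hood (classic midpoint displacement - not AI itself, but exactly the sort of generator an AI could be driving the parameters of):

```python
import random

def midpoint_displacement(n_steps: int, roughness: float = 0.5,
                          seed: int = 0) -> list[float]:
    """Generate a 1-D terrain profile of length 2**n_steps + 1 by
    repeatedly displacing each midpoint by a shrinking random amount."""
    rng = random.Random(seed)
    heights = [0.0, 0.0]          # flat line between two endpoints
    spread = 1.0
    for _ in range(n_steps):
        new = []
        for a, b in zip(heights, heights[1:]):
            mid = (a + b) / 2 + rng.uniform(-spread, spread)
            new += [a, mid]
        new.append(heights[-1])
        heights = new
        spread *= roughness       # finer detail gets smaller bumps
    return heights

profile = midpoint_displacement(6)  # 65 height samples of rolling terrain
```

Seeded randomness keeps it reproducible - the same seed always regenerates the same world, which is how Minecraft-style games do it.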

Another way to look at this:

  1. Could a copilot-like AI write any of the components of a game, piecemeal, if directed by a human who gave it a series of prompts such as ‘write a collision detection function for sprites’?

  2. Could another AI write all of the necessary prompts for all of those individual pieces, instead of a human writing them?
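For what it’s worth, point 1 is already quite plausible for small, well-specified pieces. A collision check for rectangular sprites, for instance, is exactly the kind of thing current copilots produce reliably; a minimal hand-written version (axis-aligned bounding boxes, the simplest common scheme) might look like:

```python
from dataclasses import dataclass

@dataclass
class Sprite:
    x: float   # top-left corner
    y: float
    w: float   # width and height of the bounding box
    h: float

def collides(a: Sprite, b: Sprite) -> bool:
    """Axis-aligned bounding-box test: the sprites overlap unless one is
    entirely to the left/right of, or above/below, the other."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)
```

Point 2 - having another AI decompose a whole game into thousands of such prompts and keep them consistent - is the part nobody has demonstrated.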

Let’s say that pre-AI, making a game took a few thousand “big” steps done by teams of humans. By “big”, I mean things like:

  • Write the story and design its major and minor characters
  • Model the world that the story exists in
  • Model each character
  • Add weapons
  • Add magical abilities
  • Create a rules engine that enforces the interactions between characters, the world, weapons, and abilities
  • Add sound and speech
  • Add animations
  • Balance everything to make it all fun

Each one of those “big” steps could involve several hundred thousand little steps individually performed by dozens of people over months or years.

AI could step in and replace one or more of the small steps, but the output quality will vary a lot, depending on how novel the thing it’s building is compared to its training data, how complicated it is, how many other systems it must interact with, etc.

The more specific the ask, the easier the time the AI will have with it. You need to add three or four more voice lines to the existing 10,000 that a human voice actor already did? No problem, existing models can easily do that (hence the SAG-AFTRA strike and them attempting to preserve some rights). Similarly, if John Hero already has most of their weapons modeled out and you really just need a shiny rare gold version of an existing magic staff, a model can generate that mesh. Same if you need a few lines of dialogue added to an existing NPC.

But if you want a full, functioning game the scale and quality of, say, Battlefront 2 or Oblivion or Baldur’s Gate 3, then you would probably need millions of tiny little prompts to compose it, with a lot of repeated hair-pulling in between because no single current model or system can fit all of that in its limited context. It’s questionable whether a few million prompts would save you any time or money over just hiring humans.

It’s not a black-and-white thing, of course. If AI can replace 20% of the workflow (i.e., 20% of individual “small” steps), that’s already a huge savings for the games company. Or if a specialized model can do even just one “big” thing (like given a game world, backstory, and character dialogue, generate all the voice acting needed — it can already do that, to a quality level some players would be happy enough with), that is a big deal as well.

It’s all progressing very quickly, across all the fronts (dialogue/graphics/audio/game design/coding/acting/etc.). Every few months, a new “wow” demo comes.

But as of today, there does not yet exist a capable enough model to do all of the little steps for one big thing, much less all the big things individually, much less orchestrating all the big things together given a rough idea of a game, and then designing and playtesting that game to ensure it’s fun and balanced. That’s an eventuality, but we’re just not quite there yet.

That Quake 2 demo, for example, is largely just rehashing a relatively simple 30-year-old game — and even then, it’s very flawed. Shooting doesn’t quite work (enemies sometimes get shot, sometimes not, barrels don’t work, the lighting effects disappear for a bit, the gun doesn’t always fire…), the collision detection is far from perfect (you can easily walk into a crate and become stuck there forever), and even when it all works, it’s quite low-res and laggy, etc. Is it a cool demo of what’s possible? Sure. Is it a game that anybody would pay $60 for? No, not yet.

But let’s revisit this thread in 2030 and see what’s changed…

AI undoubtedly will have a big role and it’s already started.
Honestly, the big surprise for historians will be that we managed to have massive, open-world games before the development of generative AI…with humans manually creating thousands if not millions of individual assets, recording hundreds of hours of audio, etc.

But at the extreme end of this: telling an AI “Make me a role-playing game” and having it produce something which destroys the best man-made efforts of Square-Enix or Bethesda basically means super AI.
A game like that includes almost every art form you can name, from music to story-writing, from architecture to fashion. And weaving it all into a compelling, fun experience involves a great intuitive understanding of human psychology and the context in which we live. If it can beat us at all that at the same time, then it’s hard to see what human tasks of any kind will still be out of its reach.

And it’s not sufficient to say “Well, what about simpler game genres?” because nowadays people expect a lot from games of any genre. If we can hit “go” and have it make a puzzle game that people would rather play than the best efforts of humans, that would already imply to me that our time is up.

All this being said, there is a bit of a middle ground here. In-between the world today of AI making some assets and the extreme case of AI making entire games there is the scenario of AI helping to generate novel gameplay ideas / mechanics and make rolling them out simpler. This could help to make games even more fun and engaging, even as humans remain in the driving seat overall.

I’m curious about the ability of an LLM to play chess. Just an ordinary one like Chat GPT. Not one that’s trained to play chess.

Not the right technology for that. One test shows a 0% win rate for LLMs: Can LLMs Play Chess? I've Tested 13 Models (GPT-4o, Claude 3.5, Gemini 1.5 etc.) - DEV Community

Meanwhile, today’s classical chess algorithms can easily beat any human player. Like Stockfish: Stockfish (chess) - Wikipedia (Edit: Recent versions use a neural net now, another kind of AI, not LLM but also not a classical algorithm anymore.)

My understanding is that humans aren’t competitive in chess anymore and all the top matches are between different computer programs. I think that’s been the case for many years now, and the programs only keep getting better and better every year. But you can probably beat an LLM.

It’s also worth being aware that a phrase like “trained to play chess” can be misleading here.

(ETA: it might be that everything in the following paragraphs is known to you, and you did mean self play as trained to play chess. But just in case…)

The best modern chess computers mostly or entirely learn to play chess on their own.
Training is necessary to play at a competitive level, but that training typically takes the form of self-play guided only by the desired outcome of ultimately winning, rather than humans manually adding heuristics or being directly involved in the training.

When brute force + heuristic algorithms were beating us at chess, we didn’t really learn much. The computers only knew chess as well as the best human players; they won from being able to calculate further and more reliably.
But now we actually learn new strategies, and new position evaluations, from watching chess computers play. Their chess “intuition” is better than ours.

These chess playing engines are in a lot of ways very much traditional AI. You have a basic player engine that defines the rules and moves, alpha-beta pruning to perform the search of moves, and some form of goodness evaluation of each position to inform the pruning.
The evaluation of the position is where the magic lives. You need to provide a metric of how strong a position is without doing further expansion of possible moves. This is a lot like human players evaluating a board and “positional play”. Players will talk of strong and weak positions. Strategies revolve around the nature of positions rather than explicit sequences. Chess represents one of the most important game paradigms, as the next move is independent of previous moves and depends only on the current state, with no unknown or hidden state.

In the end you are looking for some evaluation heuristic that gets you a good estimate of the minimum of some manifold in a large, many-dimensional space. This general problem has been the core of a huge amount of AI since the very early days.
A big part of making a system work well is defining how that space represents the problem, then finding a mechanism that provides robust enough and fast enough estimates of the optimum. All the classic problems of becoming trapped in a local minimum and their ilk apply, and the ways used to avoid them, tuned to work well in the particular space, come into play. Neural nets have been a favourite universal technique for probably the last three decades - where universal doesn’t mean best, but one that can almost always be pressed into service. You don’t need the absolute best answer, so long as the answer is good. Chess intrinsically allows a range of choices at each move. Weak moves generally lead to loss; lots of strong moves tend to win. This is a good match to neural networks.
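For anyone curious, the search half of this is compact enough to sketch. Here’s a generic alpha-beta in Python with the evaluation function plugged in as a parameter, demonstrated on a toy take-1-or-2 Nim game rather than chess (for chess, the work would be in the board representation, the move generator, and a far better evaluation heuristic):

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, evaluate):
    """Generic alpha-beta search: `moves(state)` yields (move, next_state),
    `evaluate(state)` is the positional heuristic used at the cutoff."""
    children = list(moves(state))
    if depth == 0 or not children:
        return evaluate(state)
    if maximizing:
        best = float("-inf")
        for _, child in children:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, moves, evaluate))
            alpha = max(alpha, best)
            if beta <= alpha:   # prune: opponent will never allow this line
                break
        return best
    best = float("inf")
    for _, child in children:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, moves, evaluate))
        beta = min(beta, best)
        if beta <= alpha:
            break
    return best

def nim_moves(state):
    """Toy game: take 1 or 2 stones; taking the last stone wins."""
    stones, player = state
    for take in (1, 2):
        if take <= stones:
            yield take, (stones - take, 1 - player)

def nim_eval(state):
    stones, player = state
    if stones == 0:             # player to move has lost
        return -1 if player == 0 else 1
    return 0                    # neutral heuristic for non-terminal states

score = alphabeta((5, 0), 10, float("-inf"), float("inf"),
                  True, nim_moves, nim_eval)   # 5 stones: first player wins
```

The pruning never changes the answer, it just skips branches that provably cannot affect it - which is why everything rides on how good (and how cheap) the evaluation is.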

LLMs contain multi-layer perceptrons, which are a close cousin to neural nets. Where an LLM architecture is going to have problems is that it is not structured in a way that provides for deep search of the game space. LLMs are optimised for a very different problem, and whilst it may be possible to explicitly train one to be an OK player, it will be very inefficient and likely hit a wall in ability early. Expecting any of the big available LLMs to be any use at all is fanciful. This is square peg, round hole.

Yes, this. I’m sure that AI as a general concept will eventually be able to create good video games. But the specific example of LLMs is both less powerful and narrower in what it can do than a lot of people like to think it is. Trying to use it for tasks it’s bad at is causing all sorts of messes as governments and corporations throw it at everything like some sort of universal solution.

Graphics are a small part of a video game anyway. It’s the GAME, the presentation of challenges and mechanics that contribute to the achievement of goals.

I do not for an instant think any AI on this planet can do this or is anywhere close to it.

Tell that to the big video game companies, lol. They predominantly focus on graphics and realism to the detriment of gameplay. Can’t blame em, it’s what draws many customers in.

At least we still have an indie scene where games like Balatro can still succeed despite ass graphics.

If an AI was sufficiently advanced to make the Star Wars game you asked for, it would tell you no, because Disney own the IP.

The overriding advantage they have is that they can learn via gradient descent. All you have to know is whether the output is good or bad. You then tweak each weight in the direction that biases it toward the good outcome. Repeat a trillion times.
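The loop being described is simple enough to show in a few lines. A toy sketch, with a one-weight “network” and squared error as the measure of good/bad:

```python
# Toy gradient descent: nudge the weight along the negative gradient of a
# loss until the output is "good". Real training does this over billions
# of weights at once, but the update rule is the same.
def gradient_descent(x: float, target: float,
                     lr: float = 0.1, steps: int = 200) -> float:
    w = 0.0
    for _ in range(steps):
        pred = w * x
        grad = 2 * (pred - target) * x   # d/dw of (w*x - target)**2
        w -= lr * grad                   # step toward the better outcome
    return w

w = gradient_descent(x=2.0, target=6.0)  # converges toward w = 3.0
```

All the algorithm ever needed was a differentiable score of how bad the output was - no human-written rule about what a “good” weight looks like.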

Heuristics require a human to develop. And they come with all the biases that humans do.

Self-learning is always going to beat hand-tuned heuristics in the long run. Heuristics may be efficient but self-learning will come up with strategies that humans cannot.

Maybe we’ll eventually have a better learning technique than gradient descent, but I doubt it. It just works too well. And matches how nature does pretty much everything.

And you also get games that fail due only to the visuals, e.g. Concord, which from most reviews played pretty well; maybe not well enough to stand out, but not bad by any means. But due to “ugly” and uninspired character design, it tanked hard.

Making a successful game today basically involves maxing out all the stats; it’s an incredibly competitive medium.

(Apart from music maybe… I’m not a fan of most modern games’ music but I’ll definitely be hijacking if I expand on that)

What exactly are we talking about, though? When human teams design a game (the scope of work being outside the capabilities of one person), there is constant feedback between teams, among teams, etc. Nobody gets a short description, comes back a month later with “here’s the hero, the villain, and some exploding mushrooms,” and walks away. There’s constant feedback, testing of play and visuals, changes, etc. The artistic group will feed back to the computer design group, tweaking this and that.

Can AI substitute for some of the programming teams, producing computer models and effects? Most likely. Can it substitute for the artistic direction? I would say highly unlikely. Gameplay dynamics - probably, but how likely is it that it actually knows how to make something appeal to humans?

And in all this, AI would have to be like any human team, presenting results and getting feedback, requiring modification of output in a continuous improvement cycle. It might speed up and simplify some processes - as mentioned, textures, character details, etc.

The issue comes from whether AI can be “creative”. I saw an example where someone said “make me a cartoon character of a sponge” or “make me a cartoon character who is an Italian hero”. A human would consciously avoid trying to duplicate SpongeBob or Mario. AI has been trained on a very limited range of source models, and I’m not sure how creative it can be in actively avoiding duplication of its minimal sources, like Italian cartoon heroes. (Has it been trained on gladiator movies too? Does AI like movies about gladiators? Does it hang around gymnasiums?)

Perhaps a better design goal would be something like a flight simulator, where the scope for creativity is much less and the source material is fixed and widely available. Aircraft have fixed characteristics, well documented. The planet is well mapped, aircraft performance is documented, etc.

Fun story today: ChatGPT asked to play an Atari 2600 at chess then 'got absolutely wrecked on the beginner level' | PC Gamer

ChatGPT, with all its self-awareness, challenged an Atari 2600 game to a chess match. ChatGPT lost badly on the first levels. It couldn’t actually play the game right, being unable to keep track of the pieces without human intervention.