Here they have AIs compete against each other in Diplomacy:
Here CICERO played pretty well:
Brian
Really bad at playing Twister, I hear.
But unless there’s a “supervisor” saying “that’s not a valid move,” the computer is learning nothing, just shuffling pieces on the board randomly. Which means the rules have to be programmed into something.
When training for Bridge, is the computer cheating by knowing all the other hands, or does it learn that the other three hands can be totally random and plays them blind? Plus a supervisor again, that says, “you are not allowed to play that card this hand” and “you have won this hand”. The bidding process is new again, since it requires “sending messages”. If separate instances of the AI are playing each hand, they might evolve a completely new signaling process. (Or each pair of AI partners develops their own code.) Although (human) logic suggests that after each game (hand?) the separate AIs would pool their learning.
The real question in this thread is - can an AI learn strategy, as opposed to valid moves? I assume strategy takes a lot longer.
There is a game engine aka supervisor that enforces the rules and presents the valid choices at any given time.
An AI bot can learn strategy. When you train it, you give it a score based on the game state, and it uses this to optimize its game play.
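To make the “engine presents valid choices, scorer judges outcomes” setup concrete, here’s a minimal toy sketch (my own invented race-to-10 game, not chess or any real trainer): the engine offers only legal moves, and strategies are judged purely by who wins.

```python
import random

def legal_moves(state):
    # The "supervisor": in this toy race-to-10 game, only +1 or +2 are
    # legal, and you may not overshoot 10.
    return [m for m in (1, 2) if state + m <= 10]

def play(policy_a, policy_b):
    # The engine only ever offers each player its valid choices.
    state, player = 0, 0
    policies = (policy_a, policy_b)
    while True:
        move = policies[player](state, legal_moves(state))
        state += move
        if state == 10:
            return player  # landing exactly on 10 wins
        player = 1 - player

random.seed(0)
rand = lambda s, moves: random.choice(moves)
# A strategy a trainer could discover: always move to a total of 1, 4, 7, or 10.
smart = lambda s, moves: next((m for m in moves if (s + m) % 3 == 1), moves[0])
wins = sum(1 for _ in range(100) if play(smart, rand) == 0)
print(wins)  # smart, moving first, wins every game: 100
```

Note that neither policy ever needed to learn the rules themselves; the engine filters out invalid moves, and the score (win/loss) is what the strategy is optimized against.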
Couldn’t you feed it like a hundred thousand or million games and just have it infer the rules by observation? I would think this is possible given enough example games.
There is a genre of games like this. You’ll have a game with a deck of cards, for example, where one player is the leader who comes up with a set of rules which they keep secret. The rest of the players play cards one at a time and the leader tells them what the results of their card is, without telling them why. The point of the game is for the other players to figure out the secret rules.
Oh, yes, I’m familiar with those. Eleusis is the one that I remember. One difference is that there you do have a “leader,” as you put it, who does know the rules and is queried as to whether a move is valid or not. Similar sort of idea to what I’m proposing, except that what I’m proposing has no rules arbiter intervening.
I would think the basics of the game could be induced pretty quickly with a fairly small (by AI standards) training set. The tricky bits would be inferring the en passant and castling rules, and maybe things like underpromotion (though that one less so). Once again, given enough training games, I think it’s doable. It looks like there are over 100 million games available publicly from what I could find with some Google searches (and about 10–20 million classified as “high quality”), so I would think there are enough examples there for a good learning algorithm to pick up on. Castling happens in almost every game, and en passant captures occur in something like 1% of games, on the low end of the estimates. I should think there is enough training data there, given a good unsupervised training algorithm.
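A toy version of this “infer the rules by observation” idea (my own invented game, not real chess data): generate a corpus of saved games under a hidden rule set, then tally which moves were ever observed in each context. The rare +3 move plays the role of en passant.

```python
import random

# Hypothetical toy: infer an unknown game's move rules purely from a
# corpus of recorded games. The hidden rule (never shown to the
# "learner") is that +1 and +2 are always legal, plus a rare +3
# special move -- a stand-in for something like en passant.
random.seed(1)

def hidden_legal(state):
    moves = [1, 2]
    if state % 7 == 0:
        moves.append(3)   # the rare special move
    return moves

# Generate 500 saved games of a race to 20.
games = []
for _ in range(500):
    state, record = 0, []
    while state < 20:
        move = random.choice([m for m in hidden_legal(state) if state + m <= 20])
        record.append((state, move))
        state += move
    games.append(record)

# "Rule induction" by observation: tally which moves were ever seen in
# each context (the context feature, state mod 7, is itself a guess).
seen = {}
for record in games:
    for state, move in record:
        seen.setdefault(state % 7, set()).add(move)

print(sorted(seen[0]))  # the special move shows up only in its rare context
print(sorted(seen[1]))
```

The catch this illustrates: the common moves are induced almost immediately, but the rare special move is only pinned down because the corpus is large enough to sample its context many times, which is exactly the worry about en passant above.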
If you can create a scoring function based on the saved games, then you can absolutely do this. But it will learn only as well as your scoring function is accurate and consistent.
For example, if your scoring function is naive:
“for these saved games I give a +1 if the AI learns the next move and a -1 if it does not”
then you are going to confound the AI. It won’t learn valid moves; it will learn to reproduce games. And since saved games don’t cover the full space of possibilities, the AI will be confused.
You can give it a score based on whether it wins, loses, or draws, and treat invalid moves as losses. This will work, but it requires more compute iterations to see the big picture.
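The two scoring functions being contrasted can be sketched directly (toy illustration, not a real chess trainer; the move names are just examples):

```python
# Naive scorer: +1 for reproducing the recorded move, -1 otherwise.
# This rewards memorizing games rather than learning which moves are valid.
def imitation_score(predicted_move, recorded_move):
    return 1 if predicted_move == recorded_move else -1

# Outcome-based scorer: an invalid move counts as an immediate loss;
# otherwise the move is judged only by the game's result.
def outcome_score(result, move_was_valid):
    if not move_was_valid:
        return -1
    return {"win": 1, "draw": 0, "loss": -1}[result]

# A perfectly legal move that differs from the recorded game is punished
# by the naive scorer, but judged only by its outcome under the second:
print(imitation_score("e4", "d4"))   # -1
print(outcome_score("win", True))    # 1
print(outcome_score("draw", True))   # 0
```

The outcome-based scorer is the one that needs far more iterations: the learning signal arrives only at the end of a whole game instead of at every move.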
One example I gave up-thread is learning to play Atari games, like Breakout. This is done completely from scratch. The AI doesn’t know how to even “see”. You feed it screen pixels and the current score and it first learns to understand the screen, how to move the paddle, and eventually how to keep the ball in the air.
The AI doesn’t have any prior information like “this is a video game” or “this 2D array of pixels represents vision”.
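A drastically scaled-down sketch of that idea (my own toy, not the actual Atari/DQN setup): the agent is shown only a raw pixel row and a score signal, never the game’s internal state, and learns values over the pixels themselves.

```python
# Toy "learning from raw pixels": a 1-D paddle must end up under the
# ball. The learner sees only rendered pixels and a score.
WIDTH = 5

def render(ball, paddle):
    # Raw pixels: 1 marks the ball, 2 the paddle (3 when they overlap).
    return tuple((1 if x == ball else 0) + (2 if x == paddle else 0)
                 for x in range(WIDTH))

actions = (-1, 0, 1)
q = {}  # value estimates keyed by (pixels, action); tabular for the sketch

# Training: the only feedback after acting is the "score".
for _ in range(50):
    for ball in range(WIDTH):
        for paddle in range(WIDTH):
            for a in actions:
                new_paddle = min(WIDTH - 1, max(0, paddle + a))
                reward = 1.0 if new_paddle == ball else 0.0  # the score signal
                key = (render(ball, paddle), a)
                q[key] = q.get(key, 0.0) + 0.5 * (reward - q.get(key, 0.0))

def policy(pixels):
    # Greedy choice over the learned values.
    return max(actions, key=lambda a: q.get((pixels, a), 0.0))

print(policy(render(3, 2)))  # moves the paddle toward the ball: 1
```

The real Breakout result uses a deep network instead of a lookup table precisely because the pixel space is far too large to tabulate, but the shape of the loop, pixels in, score as the only teacher, is the same.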
The answer to that question is definitely yes. Someone did, in some sense, teach AlphaChess valid moves. But nobody taught it anything about strategy; it did that all on its own.