Genetic Algorithms and Checkers

I first read about Genetic Algorithms years ago in an article describing one that evolved checkers-playing robots (these were not physical robots, just programs). The article stated that they gave the robots the rules, and after many generations the robots were competing at master level.

Shortly after reading about GAs, I tried to see if I could make a hound-and-hare robot, but I couldn’t figure out how the “DNA” part of the algorithm worked. What I decided to use was a set of neural-net threshold values and input weights. I let the robots run at a constant speed and fed the range and bearing of the other robot into a set of neurodes. These passed their outputs to an intermediate set of neurodes (I arbitrarily decided on 4), which in turn passed their outputs to a final set of neurodes, each of which produced one bit of the output: the robot’s new bearing. I wrote the whole thing in Macromedia Director, so it took all night to run the trials for the first individuals. I got laid off shortly after that, so I never went any further with it.

I have always wondered whether that is how it’s done, and also how the checkers players worked. Does anyone know?

Thanks for your help,
Rob

I’ve only heard of genetic algorithms in the context of optimization problems. That is to say, you have a set of parameters which can take various values, and some figure of merit which is derived in some complicated way from those values, and you want to find the set of values which gives you the largest (or smallest, depending on the problem) figure of merit. For instance, if you own a company that makes widgets, the parameters might be how durable you make your widgets, how much you charge for them, how much money you spend on advertising them, and what color (or colors) to make them, and the figure of merit would be the profit you earn. Even if you have a formula or some other way to calculate the profit, it’s not necessarily easy to go from that formula to the set of values that would maximize it. This is where you would use a genetic algorithm.

The basic idea of a genetic algorithm is that you start with a population of “organisms”, each of which has a full set of values for the parameters. These values correspond to the DNA. In each generation, you calculate the figure of merit for each organism and take some number of the best ones. You combine the DNA from these best organisms in some manner to come up with new parameter values, and give the results some small chance to change further (mutations). The resulting new organisms become the next generation. Calculate the fitness for each of these, and repeat.
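To make that loop concrete, here’s a minimal sketch in Python. The fitness function, population size, and mutation scheme are all made up for illustration; a real problem would substitute its own:

```python
import random

POP_SIZE = 50         # organisms per generation
N_PARAMS = 4          # length of each "DNA" vector
KEEP = 10             # how many of the best organisms survive to mate
MUTATION_RATE = 0.05  # chance that any one parameter gets nudged

def fitness(dna):
    """The figure of merit. This toy version rewards DNA close to an
    arbitrary target vector; a real one would compute profit, wins, etc."""
    target = [1.0, 2.0, 3.0, 4.0]
    return -sum((g - t) ** 2 for g, t in zip(dna, target))

def crossover(mom, dad):
    """Take each parameter from one parent or the other."""
    return [random.choice(pair) for pair in zip(mom, dad)]

def mutate(dna):
    """Give each parameter a small chance to change further."""
    return [g + random.gauss(0, 0.5) if random.random() < MUTATION_RATE else g
            for g in dna]

# start with a random first generation
population = [[random.uniform(-5, 5) for _ in range(N_PARAMS)]
              for _ in range(POP_SIZE)]

for generation in range(200):
    population.sort(key=fitness, reverse=True)   # best organisms first
    parents = population[:KEEP]
    # breed the next generation from the survivors (parents carry over)
    population = parents + [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(POP_SIZE - KEEP)
    ]

population.sort(key=fitness, reverse=True)
print("best DNA found:", [round(g, 2) for g in population[0]])
```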

There are, of course, many variations on this basic theme. The programmer typically chooses how large to make each generation, how many organisms from each generation “mate” to produce the next one, and the probability of mutation. And there are a number of different ways to combine the “DNA” of the parents: you might simply average the values, you might take each parameter whole from one parent or the other, or you might pick each individual bit of the child’s DNA from one of the parents.
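Those three combination schemes might look something like this (the function names are mine, just for illustration):

```python
import random

def crossover_average(mom, dad):
    """Child is the average of the two parents, parameter by parameter."""
    return [(m + d) / 2 for m, d in zip(mom, dad)]

def crossover_pick_parameter(mom, dad):
    """Each whole parameter comes from one parent or the other."""
    return [random.choice((m, d)) for m, d in zip(mom, dad)]

def crossover_pick_bits(mom, dad, bits=16):
    """Each bit of the child's DNA comes from one of the parents
    (this one assumes parameters are stored as unsigned integers)."""
    child = []
    for m, d in zip(mom, dad):
        c = 0
        for b in range(bits):
            parent = m if random.random() < 0.5 else d
            c |= parent & (1 << b)
        child.append(c)
    return child
```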

In the case of the checkers-playing programs, I suspect that someone first wrote a checkers program with a bunch of free parameters representing how much the program should value certain moves. These parameters formed the DNA. The programmer could have just chosen values for the parameters himself, or asked expert human checkers players for advice, but instead he decided to evolve them. So he started with a bunch of programs with different values and determined their fitness by having them play against each other, with the winner of each match mating more often than the loser (and maybe even “killing” the loser). Repeat for many generations, and you end up with a very good set of values for those parameters, probably close to the values strong human players actually use.
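If that guess is right, the fitness step might look roughly like this sketch, where `play_match` is a hypothetical function that plays one game between two parameter sets:

```python
import random

def tournament_fitness(population, play_match, rounds=5):
    """Score each parameter set by how many random pairings it wins.
    `play_match(a, b)` is a hypothetical function that plays one game
    and returns 0 if `a` won, 1 if `b` won."""
    wins = [0] * len(population)
    for _ in range(rounds):
        order = list(range(len(population)))
        random.shuffle(order)
        # pair neighbours off and credit each winner
        for i, j in zip(order[::2], order[1::2]):
            winner = (i, j)[play_match(population[i], population[j])]
            wins[winner] += 1
    return wins   # higher score -> mates more often
```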

For a neural network (which is what was used in the checkers example), the DNA is some sort of encoding that completely describes the network: neurons, connections, transfer functions, weights, etc. Many encoding techniques have been used, and coming up with better ones is still an active area of research.
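As a deliberately simple example (the topology and encoding here are my assumptions, not what the checkers work actually used), a fixed-topology feed-forward net can be encoded as one flat list of numbers, which is exactly the kind of genome a GA can operate on:

```python
import math

# fixed topology: 2 inputs (range, bearing) -> 4 hidden neurodes -> 1 output
N_IN, N_HID, N_OUT = 2, 4, 1
GENOME_LEN = (N_IN + 1) * N_HID + (N_HID + 1) * N_OUT  # the +1s are thresholds

def run_network(genome, inputs):
    """Interpret a flat genome as the thresholds and weights of the net."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    k = 0
    hidden = []
    for _ in range(N_HID):
        total = genome[k]; k += 1              # threshold (bias) term
        for x in inputs:
            total += genome[k] * x; k += 1     # one weight per input
        hidden.append(sigmoid(total))
    outputs = []
    for _ in range(N_OUT):
        total = genome[k]; k += 1
        for h in hidden:
            total += genome[k] * h; k += 1     # one weight per hidden neurode
        outputs.append(sigmoid(total))
    return outputs

# usage: run_network(some_genome_of_length_GENOME_LEN, [range_, bearing])
```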

But it sounds like you were on the right track, basically just a set of numbers indicating the structure of the neural network.

However, as you dig deeper into this topic, you find that some approaches work better than others. For example, a small change in the genotype (the DNA) should result in a small change in the phenotype (the ANN). If it doesn’t, then good solutions can suddenly turn into horrible ones.

Yes, my thoughts are similar. You could keep standard alpha-beta pruning and use a genetic algorithm to train the heuristic function used to estimate position strength.
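A rough sketch of how those two pieces could fit together, assuming a generic `game` object (with `is_terminal`, `features`, `legal_moves`, and `apply` methods that you’d have to supply yourself):

```python
def alphabeta(state, depth, alpha, beta, maximizing, weights, game):
    """Standard alpha-beta search; position strength at the horizon is
    estimated by a weighted sum of hand-picked board features. The
    `weights` vector is the 'DNA' a GA would tune."""
    if depth == 0 or game.is_terminal(state):
        # heuristic: dot product of feature values and evolved weights
        return sum(w * f for w, f in zip(weights, game.features(state)))
    if maximizing:
        value = float("-inf")
        for move in game.legal_moves(state):
            value = max(value, alphabeta(game.apply(state, move), depth - 1,
                                         alpha, beta, False, weights, game))
            alpha = max(alpha, value)
            if alpha >= beta:
                break   # beta cutoff
        return value
    else:
        value = float("inf")
        for move in game.legal_moves(state):
            value = min(value, alphabeta(game.apply(state, move), depth - 1,
                                         alpha, beta, True, weights, game))
            beta = min(beta, value)
            if alpha >= beta:
                break   # alpha cutoff
        return value
```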

The person who did the first successful checkers program was Arthur Samuel; his first learning version was done in 1955. This was long before neural nets or GAs.
Here’s a link: http://www.cs.ualberta.ca/~sutton/book/11/node3.html. He didn’t use anything like DNA. First, he remembered positions he’d seen before, to save effort when searching for the best move from a given position. Second, he gave each position a value, which could be adjusted depending on whether that position resulted in having to back up the search tree. I think by the early ’60s Samuel’s program could beat any checkers player.
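Very loosely, that value-adjustment idea might be sketched like this (this is not Samuel’s actual method, which scored positions with a weighted polynomial of board features; `lookahead` is a hypothetical search routine):

```python
seen = {}   # position -> remembered value (the "rote learning" part)

def position_value(position, lookahead):
    """Reuse a remembered value when we have one, and nudge it toward
    the value found by searching ahead. `position` must be hashable;
    `lookahead(position)` is assumed to return the value backed up
    from the search tree below this position."""
    value = seen.get(position, 0.0)
    target = lookahead(position)
    value += 0.1 * (target - value)   # move part of the way toward the target
    seen[position] = value
    return value
```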

GAs work slightly differently, using an evaluation function and a mutation function, and working on a string. Say you wanted to use a GA to write Shakespeare. You’d generate a whole bunch of strings of characters and rate each one on how close it was to Shakespeare. You’d pick the top couple and mutate them, perhaps by exchanging substrings with other candidates or rearranging them. You’d also generate some new strings. Then you’d apply the evaluation function again, and repeat until you got a match. (A sonnet would be faster to generate than Hamlet.)
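Here’s a toy version of that, aiming at a single line rather than a sonnet; the target string, mutation rate, and population size are all arbitrary:

```python
import random
import string

TARGET = "shall i compare thee to a summers day"
CHARS = string.ascii_lowercase + " "

def score(s):
    """Evaluation function: count characters that match the target."""
    return sum(a == b for a, b in zip(s, TARGET))

def breed(top):
    """Splice two good candidates at a random cut point, then mutate."""
    cut = random.randrange(len(TARGET))
    child = random.choice(top)[:cut] + random.choice(top)[cut:]
    return "".join(random.choice(CHARS) if random.random() < 0.02 else c
                   for c in child)

# random first generation
population = ["".join(random.choice(CHARS) for _ in TARGET)
              for _ in range(100)]

for generation in range(100000):
    population.sort(key=score, reverse=True)
    if population[0] == TARGET:
        print("matched after", generation, "generations")
        break
    top = population[:10]                      # pick the top couple
    population = top + [breed(top) for _ in range(90)]
```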

For checkers, I suppose that for each position you’d generate the possible next legal positions, or a tree of the next few moves, and evaluate them. It doesn’t sound very efficient - minimax searching is used for a reason.