Wolfpup's great computer chess competition

Finally got around to doing something I’d wanted to do for a while, setting up the chessboard in front of the computer and playing the classic and historic MacHack chess program against an unexceptional modern program . MacHack is the famous chess program developed in the 60s at the MIT AI Lab by Richard Greenblatt, famous among other things for being the first computer chess program capable of tournament-level play, and also famous for beating the AI skeptic and philosopher Hubert Dreyfus who had claimed that computers would never be able to play a good game of chess.

And in the other corner, we have a program called Shredder, which is very good in some of its PC implementations, but I wanted to run the lesser version that I have on my humble Android tablet. It actually has some serious issues at lower settings like sometimes letting an important piece be captured for no apparent gain (the author claims it mimicks a human player by “deliberately” making mistakes), but these failings seem to go away at higher settings. Still, I was skeptical about its proficiency. To at least get rid of obvious weaknesses I cranked it up to what it alleges to be ELO 2350 (supposedly “master” – it goes all the way to 2600 - “grandmaster”, although I don’t believe this is at all accurate) and away we go …

MacHack was written for the PDP-10 timesharing system (actually, originally for its predecessor, the PDP-6) and for anyone curious, I was running it on the SIMH PDP-10 simulator. Also for the curious, an MIT research memo about MacHack [PDF].

This was the game. Shredder was white, MacHack played black.

Not only did the humble tablet beat MacHack at the latter’s default settings, but it did it quite readily and decisively. I have never been able to beat MacHack myself (so it was sweet to see it humiliated!), though I’m a pretty poor chess player. I can beat Shredder, but only when set to somewhere around “casual player” level and it makes a stupid mistake. I guess cranking it up to the “master” range was probably unfair, but even the humble tablet was coming up with its moves almost instantaneously at that setting. Also surprising, after about the first 6 moves or so, when entering MacHack’s moves, on at least three occasions Shredder warned “that last move wasn’t so good, do you want to take it back?”. I have new respect for that program, at least when it’s allowed a far amount of scope in the search tree.

The default MacHack settings I was using were the “standard” rather than tournament level settings. These were: default search depth of 4, number of plausible moves to consider on each ply 7, 7, 6, 6. The tournament setting is a depth of 6, plausible-move settings 15, 15, 9, 9, 7, 7. I think I’ll try that next.

It may seem “obvious” that MacHack would be beaten by a program with presumably deeper searches, but MacHack has long had a reputation for excellence due to its amazing heuristics, and it also tends to create interesting games because of its aggressive tactics. Shredder seems to me to be more plodding, and, as said, even makes dumb mistakes at certain settings.



   Tablet        MacHack
---------------------------------
1  P/K2-K4       P/K2-K4
2  N/KN1-KB3     N/KN1-KB3
3  B/KB1-QB4     N/KB3*P/K5
4  N/QN1-QB3     N/K5*N/QB6
5  P/Q2*N/QB3    P/QB2-QB3
6  O-O           P/Q2-Q4
7  N/KB3*P/K5    P/KB2-KB3
8  Q/Q1-KR5      P/KN2-KN3
9  N/K5*P/KN6    P/KR2*N/KN3
10 Q/KR5*R/KR8   P/Q4*B/QB5
11 R/KB1-K1      K/K1-KB2
12 B/QB1-KR6     B/QB1-K3
13 Q/KR8-KR7     K/KB2-K1
14 R/K1*B/K6     B/KB1-K2
15 R/QR1-K1      K/K1-Q2
16 R/K6*B/K7     Q/Q1*R/K2
17 Q/KR7*Q/K7    K/Q2-QB1
18 B/KR6-KB4     N/QN1-Q2
19 Q/K7-K8  Mate


The fact that Shredder makes deliberate mistakes at low settings is irrelevant, when you’re not running it at those settings.

And no matter how clever MacHack was, remember that all subsequent programs were written by programmers who knew MacHack.

Yeah, but I was wondering how “deliberate” those mistakes really were, as opposed to flaws in its algorithms that were compensated for by brute-force look-ahead at higher settings.

I don’t know how universally this was true, but certainly I grant you that we’ve had all kinds of incremental progress in this area that has relied on learnings from previous efforts.

My fascination with MacHack stems from several factors. I was fascinated from an early age by the idea of computer programs that could play chess, and I was duly impressed by MacHack back in an era when it was a privilege to be able to play games with a million-dollar mainframe computer. Later on I played lesser programs on various lesser devices but MacHack remained the gold standard. I was a poor enough player that I often lost to those, too, but it was a boring, plodding kind of loss, just getting stuck and eventually being outmaneuvered. MacHack was different. It came at you like a ton of bricks. When you lost, it was often devastatingly impressive and fast.

So it was interesting to me, with the availability of a PDP-10 simulator and this legacy software, to see how it stacked up against a modern program.

And now I have the definitive answer.

My first attempt at a second round was to set MacHack to levels way higher than the recommended “tournament” settings. That didn’t go too well and reflects the ancient technology involved. With unrealistically high depth and width settings, the program crashed partway through with a stack overflow error. I suspect the poor thing just ran out of memory in the simulated PDP-10 environment.

So I set it to the tournament level mentioned previously, kept Shredder at the same level as the first time, and played another game. The significant thing here was that Shredder still had untapped capability, but MacHack was at the limits of the technology it was running on. And, in fact, the search depth was such that it sometimes took as long as ten or even fifteen seconds to discover a move, with the simulator running on a fast quad core i7 processor that was about a gazillion times faster than a real PDP-10, while my handheld little cheap tablet never took more than a second or two to make its move.

And so the two went to battle. This was the game, again with Shredder on the tablet playing White, MacHack playing Black:



   Tablet        MacHack
-----------------------------------
1  P/K2-K4       P/K2-K4
2  N/KN1-KB3     N/KN1-KB3
3  P/Q2-Q4       N/KB3*P/K5
4  B/KB1-Q3      P/Q2-Q4
5  N/KB3*P/K5    B/QB1-K3
6  N/QN1-Q2      N/K5*N/Q7
7  B/QB1*N/Q2    N/QN1-Q2
8  O-O           B/KB1-Q3
9  P/KB2-KB4     O-O
10 P/KB4-KB5     N/Q2*N/K4
11 P/Q4*N/K5     B/Q3-QB4
12 K/KN1-KR1     B/K3-Q2
13 Q/Q1-KR5      B/QB4-Q5  ?* <-- Black totally went south at this point
14 P/KB5-KB6     B/Q2-KB4
15 B/Q3*B/KB5    P/KR2-KR3
16 P/KB6*P/KN7   Q/Q1-K2
17 B/KB5-KR7     K/KN1*P/KN2
18 Q/KR5*P/KR6   K/KN2-KR1
19 B/KR7-K4      K/KR1-KN1
20 Q/KR6-KR7  Mate

MacHack move 13 – QB4-Q5 – followed by White’s P-KB6 was the point at which Shredder’s “winning” meter, formerly hovering around “White is slightly better” went off the scale to “White is winning”.

Case closed. This is my first definitive proof that the much-vaunted best chess program of the pioneering days of AI can now be beaten by an “app” running on a phone or a tablet.

Independent of the quality of the software, wouldn’t the power of the hardware suggest this?

If it were me, I’d go for even more rounds, but this time lower the settings on the new software. Keep doing that until the older one can beat it. Though, once it becomes closer, definitely do more than one game per configuration, to rule out luck.

And also, of course, to balance out the white advantage.

Not necessarily. After all, the physical hardware that the simulator is running on is a fast modern i7-based PC, while the tablet is just an ARM processor – although of course the simulator is taking a big hit from the fact of having to interpret the foreign PDP-10 instruction set. Still, speed doesn’t seem to be a problem, which is why I tried cranking it up to much higher search settings, at which it’s entirely possible that MacHack would have performed much better.

And I think it’s fair to say that the “stack overflow” error was significantly symptomatic of the real problem – the architectural limitations – and specifically the intrinsic memory limitations – of the old hardware. The program probably pre-allocates a limited amount of stack space considering the maximum search settings that would ever realistically be used with the processor speeds and memory limitations of the time. The first-generation PDP-10 had a maximum total memory capacity of 256K 36-bit words. It was comprised of expensive core memory, each 32K-word “module” being contained in a cabinet a little bigger than a household refrigerator. So systems were designed to be correspondingly frugal in their use of memory, and even if a programmer wanted to do otherwise, profligate use of memory would quickly use up the equally limited address space.

At first I wondered if it was worth the bother now that the basic challenge has been answered, but I believe I’ll do that. I played MacHack so often in my younger days that it’s kind of a known quantity, and it would be cool to know what skill setting in Shredder corresponds to it.

This is some more interesting background on MacHack, and it says that in its final incarnation it was rated at 1529. In Shredder, this is labeled the “experienced player” range, and is only one-third of the way up the scale. I’m convinced that Shredder greatly overestimates its skill at any specific setting and that MacHack would probably beat it at this setting, but there’s obviously some setting somewhere – probably a good bit higher – where they would be about even.