Shortest possible encoding of a chessboard?

ultrafilter · January 27, 2006, 5:42pm

I’m down to 22 bytes and wondering if I can do any better.

I got that far by generating a prefix-free code for the contents of each square. 0 signifies an empty square, while a sequence of 1s represents a piece (ranging from 1 for a pawn to 111111 for a king). After the 1s, 00 represents a white piece, and 01 a black one. The starting board works out to be 170 bits, which is padded out to 176.

Can anyone beat that?

friedo · January 27, 2006, 5:53pm

You might be able to shave a little off by adding a code for skipping a number of squares. For example, the starting board has 32 empty squares right in the middle. Let’s say a 1111111 followed by a four-bit number means “skip that many squares.” The middle of the board is then reduced form thirty-two zeros to eleven ones.

acaos · January 27, 2006, 5:53pm

You could squeeze it down a little more by having 01111111 (empty square, then seven 1s) stand for ‘this many empty squares’, following that with five bits which you would then decode into a number. That would shave the 32 0’s for the middle of the board in the starting position down to 01111111 10000.

This would mostly be useful at the very beginning and in an open endgame where most of the pieces have been taken.

acaos · January 27, 2006, 6:01pm

This is definitely a little better than my suggestion - I’d say ‘skip one plus this many squares’ so you could represent up to 32 with the four following bits.

friedo · January 27, 2006, 6:04pm

Good point!

chrisk · January 27, 2006, 6:11pm

Well, all the suggestions about square-skipping aside, (and remember guys, that might save a lot of space on the opening position, but they might cost you when the pieces are all over the place.) I applied the huffman algorithm and got it down to 164 bits, less than 21 bytes.



square status	count	prefix code	length	total space taken
empty		32	0		1	32
white pawn	8	100		3	24
black pawn	8	101		3	24
white knight	2	11000		5	10
black knight	2	11001		5	10
white bishop	2	11010		5	10
black bishop	2	11011		5	10
white rook	2	11100		5	10
black rook	2	11101		5	10
white queen	1	111100		6	6
black queen	1	111101		6	6
white king	1	111110		6	6
black king	1	111100		6	6

Your scheme turned out pretty good… but it left a few holes that could be filled in, and wasn’t ideally balanced for the mix of pieces. The huffman algorithm kicks ass.

friedo: I pulled up the first chess problem I could find on google and found fourteen empty ‘strings’ between the pieces, counting the squares top down and left to right. Spending 5 bits on each of them would take 70 bits, as opposed to just one bit per space.

friedo · January 27, 2006, 6:18pm

friedo: I pulled up the first chess problem I could find on google and found fourteen empty ‘strings’ between the pieces, counting the squares top down and left to right. Spending 5 bits on each of them would take 70 bits, as opposed to just one bit per space.
[/QUOTE]

You don’t have to use the skip code for all blocks of empty space. The single zeros in ultrafilter’s original spec are still available. Thus the skip code is only advantageous when you have twelve or more empty spaces in a row.

chrisk · January 27, 2006, 6:22pm

Whoops… I think I misunderstood that one. Hmmm… wonder what the best way of encoding multiple spaces into my code would be, or if it would be worth it to bother.

CaveMike · January 27, 2006, 6:27pm

This is a page that uses Huffman encoding for chess positions. The results are similar to chrisk’s:

Chronos · January 27, 2006, 6:35pm

One improvement I can see: It’s impossible for a pawn to be on the first or last ranks. So for those ranks, you could use a different encoding, with knights as 1 and kings as 11111. This will save two bytes in the initial configuration, though if everything moves off of the back ranks, you’d end up getting those bytes back. Another improvement: There are limits to the number of each piece you can have. So once you’ve enumerated all of the pawns, for instance, you can let all subsequent 1s mean the next piece up. This will also not guarantee any savings, but if you enumerate the squares in the correct order (probably starting from the middle ranks and working out), it’ll usually save a couple of bytes or so. There may be configurations which would nullify both of these advantages without giving any others (the representation shrinks every time a piece is captured, of course), but I’m not sure that any such configurations are actually attainable, and they almost certainly wouldn’t occur in a real game.

On the other hand, there’s more to the state of a chessboard than which pieces are where. You’d also need to keep track of whether the kings and rooks have moved yet, to determine whether castling is currently allowed. This could be done in your six padding bits, or less if some of them are not in their original positions. I’m not sure how many bits would be needed to decide whether en passant is allowed in the current position, though.

acaos · January 27, 2006, 6:35pm

I did a little more work and got this new table:



square status	count	prefix code	length	total space taken
empty		32	0		1	32
white pawn	8	100		3	24
black pawn	8	101		3	24
white knight	2	10000		5	10
black knight	2	10001		5	10
white bishop	2	10010		5	10
black bishop	2	10011		5	10
white rook	2	11101		5	10
black rook	2	11110		5	10
white queen	1	1111110		7	7
black queen	1	1111111		7	7
white king	1	111000		6	6
black king	1	111001		6	6
empty RLE	1	111110		6	0

As in friedo’s code, ‘empty RLE’ stands for ‘one plus the following four bits worth of empty spaces’.

This code is advantageous any time there are more than 10 consecutive empty spaces on the board, and has only a 2-bit disadvantage otherwise (for the two queens). I almost selected the kings for the 7-bit codes, but then I realized they will always be on the board, but the queens might be captured, saving more space in the long run.

chrisk · January 27, 2006, 6:38pm

hmm… as nearly as I can see, that page’s strategy is quite a bit different, even though the results end up at the same place. that’s kinda funny.

chrisk · January 27, 2006, 6:42pm

On the other hand, extra queens can be created by pawn promotion, but probably only when some other pieces have been captured. Probably on average queens are captured more often than promoted.

iamthewalrus_3 · January 27, 2006, 6:59pm

acaos:

I did a little more work and got this new table:
square status	count	prefix code	length	total space taken
empty		32	0		1	32
white pawn	8	100		3	24
black pawn	8	101		3	24
white knight	2	10000		5	10
<rest of table removed>

How can you tell between WP-empty-empty and WK?

Seems like WP should be 110.

CaveMike · January 27, 2006, 7:00pm

True. They use Huffman for most pieces, but not the king (not sure why – I guess to shorten the Huffman encodings by one byte). They also store some extra info: 1-bit for the active color and 4-bits for the e.p. flag and e.p. column of the baggy pawn (I don’t know what those mean).

They also use a “filling matrix” for each square. I don’t know why, since it seems to just move a bit out of the Huffman encodings and in to the filling matrix. For example, their pieces (without the king) can get encoded in 2-4 bits, but they also use a bit for each piece in the filling matrix. This translates to a 3-5 bit encoding – which is what you would have above (if you also excluded the king).

Ludovic · January 27, 2006, 7:02pm

If this is to represent any chess board, it also has to include information on promoted pawns.

If we ignore that, we could instead focus on which pieces are left and where they are, instead of encoding each space.

You’d have to use 17.84962501 bits (using fractional compression) to indicate how many pieces and which kind are left on each side (1 for each queen, 1.584962501 for each major piece [of course, you could designate each bishop and save a bit later on…], and 3.169925001 for how many pawns.

Then you could designate where each piece is, using 6 bits for the first one, 5.977279923
for the second one, etc.

It ends up using a lot more space in the beginning, 197 bits, but in endgame conditions it’ll save bunches. For instance if you only have 3 pieces left, you only use 35.78110124
bits or 5 bytes.

ultrafilter · January 27, 2006, 7:03pm

Why? Why can’t we just store all of the pieces and figure it out from there?

acaos · January 27, 2006, 7:05pm

You’re absolutely correct. My mistake, sorry.

Bryan_Ekers · January 27, 2006, 7:11pm

Isn’t that a black pawn that advises a white knight about his golf game?

(pop culture reference challenge)

Pleonast · January 27, 2006, 7:13pm

It may be worth considering building a Huffman table based on piece and position (vs just the piece, as chrisk has done). For example, white pawn at A1, white pawn at A2, etc… Of course, you’d then need a statistical description of where different pieces are likely to be. There are online databases of games, right?

This method would have the advantage of being the shortest encoding on average for a random chess position pulled from a random game.

Topic		Replies	Views
Is the game of chess infinite? Factual Questions	33	2892	January 15, 2003
Chess, a solved game in the future? Factual Questions	121	4391	December 17, 2007
How many chess moves? Factual Questions	31	1502	May 13, 2001
Why can't computer-aided chess players wipe the floor with human players? Factual Questions	28	2549	October 18, 2002
Great idea for a chess game In My Humble Opinion	33	1385	July 5, 2001

Shortest possible encoding of a chessboard?

Related topics