Has this variant of a ‘book cipher’ ever been used?

I’ve been mildly interested in book ciphers ever since learning about the Beale ciphers and I recently started thinking about using a ‘Word Finder’ puzzle grid and relative coordinates in place of a book and ‘page/line/word’ instructions.

How it would work is:

Grid can be any size. Let’s take a 20x20 one as an example. Starting letter ‘H’ might be in eight separate places on the grid, but we select the one at coordinates 18,12. Our second letter, ‘E’, could be in any of maybe twenty-three places, but we choose the one at 7,7. The relative coordinates from the ‘H’ are -11,-5, but instead we treat the grid like a ‘wraparound’ screen, so go 9 rows ‘forwards’ and 15 columns ‘up’ (20 - 11 = 9 and 20 - 5 = 15).

The coded message for ‘He’ becomes ‘18120915’.
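For anyone who wants to play with it, here is a minimal Python sketch of the scheme as I read it. The function names `encode` and `decode_cells` are made up for illustration, and it assumes coordinates run from 0 to 19 and that every value is sent as two digits. Picking which ‘H’ or ‘E’ to use is left to the sender; the code only turns the chosen cells into digits and back.

```python
def encode(cells, size=20):
    """Turn a list of chosen (first, second) grid coordinates into digits.

    The first cell is sent as-is; every later cell is sent as its offset
    from the previous cell, wrapped around the grid (modulo `size`).
    """
    first, second = cells[0]
    groups = [f"{first:02d}{second:02d}"]
    for (prev_a, prev_b), (a, b) in zip(cells, cells[1:]):
        groups.append(f"{(a - prev_a) % size:02d}{(b - prev_b) % size:02d}")
    return "".join(groups)


def decode_cells(code, size=20):
    """Recover the list of grid cells from the digit string."""
    pairs = [(int(code[i:i + 2]), int(code[i + 2:i + 4]))
             for i in range(0, len(code), 4)]
    a, b = pairs[0]
    cells = [(a, b)]
    for d_a, d_b in pairs[1:]:
        a, b = (a + d_a) % size, (b + d_b) % size
        cells.append((a, b))
    return cells


print(encode([(18, 12), (7, 7)]))   # 18120915
print(decode_cells("18120915"))     # [(18, 12), (7, 7)]
```

The receiver then looks up each recovered cell in their copy of the grid to read off the letters.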

Where (if ever) has this type of ‘coordinate’ coding been used before (either real life or fiction)?

Possibly more suited to the IMHO forum, but would this cipher rate as strong? My guess is it would, since number frequency doesn’t necessarily correlate with letter frequency. There’s still the same weakness as with any book code, of course.

If you didn’t use relative addressing, so the “E” in your example would be “0707” then you just have a simple letter substitution cipher. So negligible security. Adding in that e.g. H’s and E’s occur multiple times and you can select which encoding to use at random blunts the letter frequency statistics but does not erase them. So it might take 2x or 10x as much ciphertext to break the code, but it’ll break trivially once the enemy has enough to work with.

You adding the relative addressing idea is sneaky. That obfuscates the fixed one-to-many mapping I described in the prior paragraph. But also introduces challenges for the legit recipient in that any mistake in encoding, transmission, or reception, means the entire rest of the decoding train is totally derailed into gibberish. From your example imagine trying to decode a much longer message that begins with “he” but the recipient received as “18120914…” with the rest received correctly. Oops. Not only is the second letter, the “e” wrong, but so is every subsequent letter.
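To make the derailment concrete, here is a rough Python sketch. It assumes 0-based coordinates on a 20x20 grid with two digits per value; the helper name `walk` and the two extra groups tacked on after “18120915” are invented purely for the demonstration.

```python
def walk(code, size=20):
    """Follow the relative offsets and return every cell visited."""
    groups = [(int(code[i:i + 2]), int(code[i + 2:i + 4]))
              for i in range(0, len(code), 4)]
    row, col = groups[0]
    cells = [(row, col)]
    for d_row, d_col in groups[1:]:
        row, col = (row + d_row) % size, (col + d_col) % size
        cells.append((row, col))
    return cells


sent = "18120915" + "03020501"       # hypothetical four-letter message
received = "18120914" + "03020501"   # one digit garbled in the second group

wrong = [i for i, (a, b) in enumerate(zip(walk(sent), walk(received)))
         if a != b]
print(wrong)   # [1, 2, 3]: every letter after the first lands on the wrong cell
```

With absolute addresses the same one-digit slip would damage only the one letter it hit.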

One comment: when you’re filling your encoding grid with letters, how secure it will be will change depending on whether you fill it with letters according to English letter frequency stats, same number of each letter, or even the opposite, where “e” is rare and “z” is common in your grid.

Sorta bottom line:
None of this sort of ciphering is strong in the modern 21st century computer-driven codebreaking sense of strong. But would this have kept folks in the 1910s out of your secret bidness? Better than a plain Caesar cipher for sure, but it still suffers from the fact that by encrypting at the letter level, you’re preserving some echo of the natural letter frequency. Which is enough of a leg up for it to be broken in principle, so it’s decipherable given enough ciphertext and enough stubby pencil work.

You want your correspondent to have access to the same book you’re using. That’s why they used an almanac in Sherlock Holmes’ Valley of Fear and they used law books (instead of the implied book of the bible) in Manhunter/Red Dragon.

Both of those were deduced by people “breaking” the cipher by trying to figure out which books the two correspondents would be likely to have access to. If I understand your description, this one would require both of them to have access to the same “word search” puzzle. I could see this if they’d arranged beforehand to use the same edition of a word search from the same publisher, or to always use the puzzle from Tuesday of that week found in a particular newspaper. But misunderstandings could render the cipher unusable to the receiver.

And, of course, it’s an archaic code for this day of computer-driven encryption, as LSL pointed out.


Both versions of the proposed cipher are polyalphabetic ciphers. The scheme is more secure than some classical ciphers because of the size of the key; many classical ciphers use a key small enough to memorize. The book cipher uses a whole book as the key, which would make it more secure, except that guessing the book reveals the whole key. The OP’s cipher is somewhere in between: it uses a 400-letter key, which is too long to memorize so must be written down by both parties, and can’t be easily guessed, but because it is written down it could be stolen. Nevertheless, cryptanalysis would not be very difficult. Techniques like those used to cryptanalyze the Vigenère cipher would probably work, given enough ciphertext to work with.

You’re right, but any advantage of the relative addressing is out the window once the codebreaker knows the size of the grid. At that point, the security is identical to using absolute addresses.

And determining the grid size is probably pretty easy by looking at the range of numbers in the code. You could further obfuscate that by wrapping around multiple times, but I still suspect it wouldn’t be difficult to figure it out.
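A quick Python sketch of what the codebreaker would do, under the assumptions that values are two digits each, coordinates are 0-based, and the sender never wraps around more than once. The names `min_grid_size` and `to_absolute` are mine, not anything standard.

```python
def two_digit_values(code):
    """Split the digit string into its two-digit numbers."""
    return [int(code[i:i + 2]) for i in range(0, len(code), 2)]


def min_grid_size(code):
    """Largest value seen plus one: a lower bound on the grid size."""
    return max(two_digit_values(code)) + 1


def to_absolute(code, size):
    """Undo the relative addressing with a running sum modulo `size`."""
    vals = two_digit_values(code)
    row, col = vals[0], vals[1]
    out = [f"{row:02d}{col:02d}"]
    for i in range(2, len(vals), 2):
        row = (row + vals[i]) % size
        col = (col + vals[i + 1]) % size
        out.append(f"{row:02d}{col:02d}")
    return out


print(min_grid_size("18120915"))     # 19, so the grid is at least 19 wide
print(to_absolute("18120915", 20))   # ['1812', '0707']
```

A short sample like this one only gives a lower bound on the size; with more ciphertext the largest value should eventually show up. Once the size is right, the output is just a list of absolute addresses, and those can be attacked with ordinary frequency counting.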

I just noticed that some Discourse or copy/paste issue apparently corrupted the URL that I entered for Vigenère cipher. It should be Vigenère cipher - Wikipedia.

That’s an excellent point I hadn’t considered. The risk of error probably makes it impractical for anything but short non-vital messages.

Plenty of historically-used ciphers also had the problem of a single error propagating through the whole message. That was also a feature of the Enigma, for instance.

Not at all. Assuming the 400 letters form a passage of text, that’s only ~80 words, which almost anyone could memorize with a little practice. It’s just a few sentences. Of course, you’d want it to cover all the letters, but pangrams of that length are easy.

Having the key be human-readable text lowers the entropy vs. random letters, but this isn’t exactly a secure system to start with.

This post is just over 400 characters. Easy.

Agree with the point about memorising. Even though a typical Word Finder grid is designed to look like a random collection of letters, I’d neglected to consider that it does contain plenty of non-random actual words. That probably would aid memorisation.

There are varying degrees of randomness you could include. Even 400 random letters wouldn’t be that difficult to memorize, particularly in some earlier era where people are more used to memorizing long passages (just come up with a word for each letter and memorize that). A series of totally random words would be significantly easier, and an actual sensible passage of text easier yet.

You could add a bit of challenge by having a few different geometries: zig-zag, spiral, boustrophedonic, etc. Still no challenge to a computer but would increase the workload a bit for someone doing it manually (in the non-random-letter case, that is).

There is also the fact that this cipher considerably expands the length of the coded transmission – your 2-character plaintext becomes 8 digits when encoded. (Note that this is not true in a book cipher – 3 numbers, usually 6-7 digits, would encode a full word. Still some expansion, since the average English word is about 5 characters. Though skipping articles would improve that. An advantage of the Enigma machine was that the ciphered text was the same length as the original plaintext.)

A longer coded transmission can cause practical problems in use. Longer messages are harder to conceal in some secret method. And such messages are often sent via clandestine transmitters – a longer transmission gives the enemy more time to notice and radio-locate the source.

Also, a longer cipher increases the chances for a simple mistake in ciphering, transmitting, receiving, or deciphering. And as several respondents have already pointed out, such a mistake can garble much of the intended message.

If, instead of using numbers, you just encoded 0->A, 1->B, etc., it would only require two symbols per character.
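As a rough illustration in Python, assuming every coordinate or offset is below 26 so it fits in a single letter (the function names are made up):

```python
import string


def digits_to_letters(code):
    """Map each two-digit value to one letter: 00->A, 01->B, ..., 25->Z."""
    values = [int(code[i:i + 2]) for i in range(0, len(code), 2)]
    return "".join(string.ascii_uppercase[v] for v in values)


def letters_to_digits(text):
    """Inverse mapping, back to two digits per letter."""
    return "".join(f"{string.ascii_uppercase.index(ch):02d}" for ch in text)


print(digits_to_letters("18120915"))   # SMJP
print(letters_to_digits("SMJP"))       # 18120915
```

So “He” goes out as four letters instead of eight digits: still double the plaintext, but half of what the numeric version needs.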