Yeah, what I’m thinking is that this is essentially just a simple Vigenère rotation cipher with a long key text, which means you can work on the plaintext and the key text at the same time. What I mean is something along the lines of what has already been said: look for certain key words like “the” or, better yet, longer ones like “that” or “would”, slide them along the ciphertext at successive offsets, and see if any yield an English-like sequence of letters. If so, try guessing what the previous or following letters might be. For example, if the key word “THAT” yields the sequence “INVA”, try guessing “SION” (to complete “invasion”) and see what that does to the possible key text there. If you get “THEF”, you might try “ollowing” (hoping the key text is “that the following”); if that yields junk, try “irst” (“that the first”), and so on. You work iteratively back and forth between the key text and the plaintext to figure each other out, since it’s a simple ROT cipher, letter by letter. Or if you know that the word “NORMANDY” or “TOMORROW” is likely encoded in the ciphertext, work backward from that assumption and see if any likely hits in the key text turn up.
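Here is a rough sketch of that word-sliding idea in Python, in case it helps. It assumes the A=0 to Z=25 convention with wrap-around addition, uppercase letters only, and a toy message and key that I made up to mirror the “THAT”/“INVA” example; the function names are just for illustration.

```python
from string import ascii_uppercase as ALPHABET

def add(text, other):
    """Letter-wise (text + other) mod 26, with A=0 ... Z=25."""
    return "".join(
        ALPHABET[(ALPHABET.index(t) + ALPHABET.index(o)) % 26]
        for t, o in zip(text, other)
    )

def subtract(text, other):
    """Letter-wise (text - other) mod 26, with A=0 ... Z=25."""
    return "".join(
        ALPHABET[(ALPHABET.index(t) - ALPHABET.index(o)) % 26]
        for t, o in zip(text, other)
    )

def drag_crib(ciphertext, crib):
    """Try the guessed word at every offset; return (offset, implied fragment) pairs."""
    hits = []
    for offset in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[offset:offset + len(crib)]
        hits.append((offset, subtract(window, crib)))
    return hits

# Made-up toy message and key (assumptions, not from any real cipher).
plaintext = "THEINVASIONISON"
key = "SAYTHATTHEFIRST"
ciphertext = add(plaintext, key)

# Slide "THAT" along the ciphertext; a human picks out the English-looking fragments.
for offset, fragment in drag_crib(ciphertext, "THAT"):
    print(offset, fragment)

# Offset 3 gives "INVA"; guessing that the plaintext continues "SION" exposes more key.
print(subtract(ciphertext[7:11], "SION"))  # THEF
```

Most offsets print gibberish; the one that reads like English is the one to extend. A real attack would score the fragments automatically instead of eyeballing them.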
Something like that. It’s not a fully realized thought yet, as I have to run out the door, but I’m thinking that sort of approach.
In addition to the other comments, remember it’s a two-way relationship. If you find a “THIS” in the key, you can use that as a hint as to what comes next (“is” is a good guess). And once you guess correctly (which, as others have said, is obvious because what comes out is English, not random gibberish), you can do the same thing from the plaintext to the key (i.e. if the key “this is” reveals “nd this” in the plaintext, then you can be pretty sure the letter before is ‘a’).
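To make the two-way part concrete, here is a small Python sketch. It assumes A=0 to Z=25 with wrap-around, and `combine` is a helper name I made up. Since ciphertext = plaintext + key, subtracting either stream from the ciphertext gives you the other one.

```python
A = ord("A")

def combine(left, right, sign):
    """Letter-wise (left + sign * right) mod 26, with A=0 ... Z=25."""
    return "".join(
        chr((ord(l) - A + sign * (ord(r) - A)) % 26 + A)
        for l, r in zip(left, right)
    )

# Fragments taken from the example above, aligned letter for letter.
plain_fragment = "NDTHIS"
key_fragment = "THISIS"
cipher_fragment = combine(plain_fragment, key_fragment, +1)

# A correct guess at the key reveals the plaintext ...
print(combine(cipher_fragment, key_fragment, -1))    # NDTHIS
# ... and a correct guess at the plaintext reveals the key.
print(combine(cipher_fragment, plain_fragment, -1))  # THISIS
```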
Once you’ve decrypted a lot of things like this, you’ll probably have a bunch more clues to go on (e.g. if the book has a main character called “Fred”, you will find that out pretty quickly, and that can be used as a clue).
For these reasons, even if you follow all the rules of one-time-pad cyphers (you happen to have thousands of books that only exist at your house and the house of the person who will be decrypting the message, and you never reuse them), but use plain-text human language as the key rather than random numbers, it’s still super weak encryption.
And . . . too late for edit . . . we seem to have some pretty savvy crackers here. Who can crack this simple cypher? Please tell how long it took you and how you did it.
Just treat letters as numbers, 0 to 25 (as computers always start from 0), and then the arithmetic wraps around (so A=0, and 0-3 wraps around to -3, or 23, which means X).
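In Python the wrap-around is a one-liner, because `%` with a positive modulus never returns a negative result. A minimal sketch (the helper names are mine):

```python
def to_num(letter):
    """Map A..Z to 0..25."""
    return ord(letter.upper()) - ord("A")

def to_letter(number):
    """Map any integer back to A..Z, wrapping around modulo 26."""
    return chr(number % 26 + ord("A"))

print((0 - 3) % 26)                 # 23
print(to_letter(to_num("A") - 3))   # X
print(to_letter(to_num("Z") + 1))   # A
```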
Indeed, many a decrypt has been based on sloppiness or typos from a specific operator. One of the flaws in the Enigma was that it required the user to enter a random setting (by choosing three random letters) as part of the encryption process, and it turns out humans are really bad at coming up with random letters (one common setting that operators used was Hitler’s initials), so this was one of the ways Bletchley Park decrypted it.
I’m not sure how a cryptographer really approaches a problem like this but one thing that I notice is that if we don’t loop the numbers then you’re going to create a modified normal curve on the distribution of values.
It’s like if you roll two dice and add their values together: a 2 and a 12 are fairly uncommon, whereas a 7 is quite likely. When you add two independent random numbers like that, the totals bunch up in the middle (a triangle shape for two dice, and closer to a bell curve the more numbers you add).
Counting a=1 through z=26, a+a, for example, is the only way to get a 2. Likewise, z+z is the only possible combination that outputs a 52. Starting from cases like those, we can pretty quickly work back to both the original text and the key.
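You can check the counting with a few lines of Python. This sketch assumes every value is equally likely (true for dice, not for English letters, so treat the letter numbers as a baseline only) and counts ordered pairs; `ways_to_sum` is just a name I picked.

```python
from collections import Counter
from itertools import product

def ways_to_sum(low, high):
    """Count how many ordered pairs drawn from low..high give each total."""
    values = range(low, high + 1)
    return Counter(a + b for a, b in product(values, repeat=2))

dice = ways_to_sum(1, 6)
letters = ways_to_sum(1, 26)   # a=1 ... z=26, no wrap-around

print(dice[2], dice[7], dice[12])            # 1 6 1
print(letters[2], letters[27], letters[52])  # 1 26 1
```

The extreme totals pin down both letters outright, while a total in the middle is consistent with many pairs.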
While applying a modulus 26 will make this a little less readily apparent, there would probably be a notable skew to the distribution of numbers.
Given that adding letters to letters is a fairly common scheme, there’s a fair chance that an attacker would guess that as the cause of the skew and find that your distribution matches what you would expect from adding two texts together.
Once you do that, you can start to statistically work back to the most likely letters and thence the most likely words.
There are no numbers in this cipher. And, yes, it does loop around. It’s just a type of rotation cipher where the rotation value changes with each character. So if “a” codes to “b” and “b” to “c”, then at “z” it loops around and codes to “a.”
(Unless I’m misunderstanding what you’re saying, of course)
You wouldn’t get a bell curve, because of the cyclical boundary conditions, but you probably would still get a telltale spectrum of letter frequencies that would reveal to an attacker that you’re using a code like this. For instance, since ‘e’ is common in both your plaintext and your keys, ‘e+e’ (that is, ‘i’ if you count from a=0, or ‘j’ if you count from a=1) would be common in the ciphertext.
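A sketch of why the skew survives the wrap-around, in Python. It assumes plaintext and key letters are independent and counts from a=0 (so e+e lands on ‘i’). The skewed distribution is a toy one I made up with half the weight on ‘e’; it is not real English frequency data.

```python
def ciphertext_distribution(plain_probs, key_probs):
    """Distribution of (p + k) mod 26 for independent plaintext and key letters."""
    out = [0.0] * 26
    for p, p_prob in enumerate(plain_probs):
        for k, k_prob in enumerate(key_probs):
            out[(p + k) % 26] += p_prob * k_prob
    return out

uniform = [1 / 26] * 26        # a truly random key, as in a one-time pad
skewed = [0.5 / 25] * 26       # toy distribution, NOT real English
skewed[4] = 0.5                # index 4 is 'e'

flat = ciphertext_distribution(skewed, uniform)
peaked = ciphertext_distribution(skewed, skewed)

most_common = max(range(26), key=lambda i: peaked[i])
print(chr(most_common + ord("a")))   # i
print(max(flat) - min(flat) < 1e-12) # True
```

With a random key the ciphertext comes out flat no matter how lopsided the plaintext is; with a language-like key the peak at e+e shows through.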
Actually, to the OP, if you look at the Wikipedia article on running key ciphers, the “Security” heading provides a better explanation of the attack method I tried to explain above.
The main flaw here is, obviously, using plain text as the key, whereas a one-time pad uses randomly selected sequences of letters. Because English sequences are recognizable (“SIHT”), it makes decryption too easy. Also, whether the sequence is in the message or the key, the effect is the same. In fact, the extra-long key of English words makes things even easier, because the chance of encountering several common English words goes up. (That is also a factor in other decryptions: a larger volume of data makes the work easier.)
In addition to the sloppiness you mentioned, a common failure of ciphers is including known (or easily assumed) plaintext, such as a standard greeting or a name at the end. Known plaintext hurts this cipher the same way it hurt the Enigma, where the fact that no letter ever maps to itself made it easier to work out where a guessed word could sit.