IIRC it wasn’t that they would commonly include “Heil Hitler” in the message itself, but they were required to choose a three-letter indicator setting for each message. These were not part of the daily setup instructions for the operator, so they had to pick a “random” value for each message. Doing this is a pain (and humans are really bad at generating random values anyway), so they would choose things like Hitler’s initials, making it easier for the code breakers.
They used both ideas. “Cribs” were common words or phrases at the beginning of a message sent from a particular operator or station. Bletchley Park would collect messages sent from those stations each day and attack them by looking for the cribs. Once one message had been successfully decoded, the day’s Enigma settings could be recovered, making the rest of the traffic easy to break until midnight, when the settings were changed again. The indicator settings, specific to an individual message, were vulnerable to operator sloppiness: each operator had to choose three random letters, and many were a bit lazy and tended to use letters forming a diagonal on the machine’s keyboard, such as E D C or T G B. These lapses were dubbed “Sillies”.
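To make the crib idea concrete, here is a rough Python sketch (not anything Bletchley Park actually ran, obviously). It exploits one real property of the Enigma, that it never encrypts a letter to itself, to rule out alignments of a guessed crib; the `intercept` variable and the example crib are just illustrative.

```python
def possible_crib_positions(ciphertext: str, crib: str) -> list[int]:
    """Enigma never encrypts a letter to itself, so any alignment where the
    crib and the ciphertext share a letter at the same position is impossible."""
    positions = []
    for start in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[start:start + len(crib)]
        if all(c != p for c, p in zip(window, crib)):
            positions.append(start)
    return positions

# Sliding a stereotyped phrase such as "WETTERVORHERSAGE" ("weather forecast")
# along an intercepted message narrows down where the crib could sit:
# possible_crib_positions(intercept, "WETTERVORHERSAGE")   # intercept is hypothetical
```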
This was a problem for SOE cryptographers during World War 2. The radio operators sent into occupied territories were taught a cryptographic method that relied on a self-composed or original poem that they had to memorize. This became the cryptographic key used to encrypt the message.
Under stress, many of the agents could not reliably recreate the poem, leading to decryption failures. Using knowledge gained during training and familiarity with the agents themselves, operators and analysts would attack the cipher-text by trying the kinds of mistakes a given agent was known to make, and in many cases they were able to decrypt it by logically reconstructing the garbled key the operator had actually used.
In fact, high quality cipher-texts with no errors aroused suspicions, and in some cases this suspicion was well founded - some operators in occupied Holland were compromised and were sending information the Germans wanted them to send.
A lot of the answers above assume that the message is in English and make assumptions about word frequency and possible next words based on that. What if the message and/or key text are not in English? What if you don’t know the languages used? Does that make decryption impossible, or are there still ways to decrypt it?
There are not that many languages, so it just makes the task harder, but not infeasibly so. Just a larger search space and more effort needed. Assuming the code breaker knows of the existence of the language in the first place.
This does bring to mind the Code Talkers used by the USA.
Here are the frequencies of the letters in quite a few languages:
In general, it’s quick to calculate the letter frequencies of any language if you have a fairly long text in that language in computer-readable form (it would be best to use a lot of different texts). By a long text, I mean one that is at least ten thousand letters long, and preferably a million. That is now possible in at least hundreds (and probably more like thousands) of languages. If you can build a good letter frequency table for whatever language the message is in, breaking the message should be no harder than breaking one in English. It helps if the message is fairly long, since that means less guessing about some of the letters. It also helps to build a table that shows not just the frequency of each letter but also the frequency of each pair (or triple, or whatever) of letters for that language.
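For what it’s worth, here’s a minimal Python sketch of building such a table; the corpus filename is just a placeholder for whatever large text you have in the target language.

```python
from collections import Counter

def letter_frequencies(text: str, n: int = 1) -> dict[str, float]:
    """Relative frequencies of n-grams (single letters, pairs, triples...) in a text."""
    letters = [c for c in text.lower() if c.isalpha()]
    grams = ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}

# e.g. build single-letter and pair tables from a large corpus in the target language:
# corpus = open("corpus.txt", encoding="utf-8").read()   # placeholder filename
# singles = letter_frequencies(corpus)
# pairs = letter_frequencies(corpus, n=2)
```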
But are you assuming you know the language the message is in? If you don’t know the language the message is in, then how does knowing the letter frequencies of different languages help you?
In most cases, you have some notion of what language the message is in. At any rate, compared with the number of possible keys, the number of possible languages is trivial - it costs almost the same amount of effort to test for all documented languages as it does to test for just one.
There are only 4,065 written languages in the world at the moment, and for many of them most speakers can’t actually write the language, only speak it. If you have a set of texts you can use to find the frequencies of the letters (and pairs and triples and so forth) of some language, you can very quickly check whether the message you’re trying to break matches that language, so you can quickly eliminate a lot of those 4,065 languages. Unless the message is fairly short, you can figure out which language it was in. Besides, if you know what languages the sender and receiver speak fluently, the message will very likely be in one of those. If the sender and receiver are native English speakers, do you seriously think they will write the message in a language spoken only by 4,381 people in New Guinea, only 387 of whom can write it?
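As a rough sketch of how cheap that check is: score a candidate decryption against each language’s frequency profile (built as in the earlier snippet) with something like a chi-squared statistic and keep the best match. The `profiles` dictionary here is hypothetical.

```python
def chi_squared(observed: dict[str, float], expected: dict[str, float]) -> float:
    """Lower score = the observed frequencies look more like this language's profile."""
    score = 0.0
    for gram, exp in expected.items():
        if exp > 0:
            score += (observed.get(gram, 0.0) - exp) ** 2 / exp
    return score

# profiles = {"english": {...}, "german": {...}, ...}   # built per language as above
# candidate_freqs = letter_frequencies(candidate_decryption)
# best_guess = min(profiles, key=lambda lang: chi_squared(candidate_freqs, profiles[lang]))
```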
If the idea is to make it more difficult for your message to be decrypted then sending it in other than your native language would be the obvious thing to do.
But then you have to learn that language. That would take several years to do. Sure, it’s always possible for the sender and the receiver to learn a new language and send messages in encrypted versions of that language, but there are now vastly easier ways to encrypt messages.
It is worth reiterating the basic ideas of encryption and how it affects difficulty in breaking the code.
There is only one unbreakable code: the one-time pad. The reason it is unbreakable is that there is exactly as much entropy in the key used for encryption as there is information in the plain text. This means that there is no possible internal redundancy in the cipher text, and all possible decodings are equally likely. So you can decode the message into anything†.
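A toy demonstration of that last point, using XOR as the combining operation: for any ciphertext there is some key that “decrypts” it to any message of the same length you like, so the ciphertext alone tells you nothing.

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(plaintext))   # truly random, used once, as long as the message
ciphertext = xor(plaintext, key)

# The real key recovers the real message...
assert xor(ciphertext, key) == plaintext

# ...but for ANY candidate plaintext of the same length there exists a key
# that "decrypts" the ciphertext to it, so all decodings are equally plausible.
fake_plaintext = b"RETREAT AT TEN"
fake_key = xor(ciphertext, fake_plaintext)
assert xor(ciphertext, fake_key) == fake_plaintext
```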
As soon as the key drops in entropy to lower than the information in the plain text there must arise some level of redundancy in the encryption. Eventually, given enough cipher text, you can, in principle, build up enough information to break the encryption.
All encryption other than the OTP attempts to straddle the line between complexity of encryption (which mostly ends up being equivalent to the length of the key) and feasibility of a decryption attack. Session keys are commonly used and discarded after a set amount of text has been encrypted, the amount being judged as still not enough to give an attacker useful traction. It is not perfect, just a judgement call about feasibility. A new session key is then exchanged using public key encryption, which is expected to be much more secure and thus protects the new session key.
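A minimal sketch of that hybrid scheme in Python, assuming the third-party cryptography package is available: a cheap symmetric session key encrypts the bulk traffic, the expensive public-key operation only protects that small key, and the session key can be thrown away and replaced once enough traffic has flowed.

```python
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Long-lived key pair; the receiver publishes the public half.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# Fresh session key for the bulk traffic; discard and replace it after a set amount of data.
session_key = Fernet.generate_key()
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(session_key, oaep)      # small, slow, well protected

token = Fernet(session_key).encrypt(b"bulk message traffic goes here")

# Receiver: unwrap the session key once, then decrypt the traffic cheaply.
recovered = private_key.decrypt(wrapped_key, oaep)
assert Fernet(recovered).decrypt(token) == b"bulk message traffic goes here"
```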
Messing about with layers of various traditional encryption methods just adds a small amount of entropy to the key. Moving from a selection of books in one language to a larger selection in many languages adds more entropy, but relative to the information in the plain text it is possibly still not enough to render decryption infeasible, and it never renders it impossible. The worst situation is when the inventors of the layered code system feel very smug about their creation and proceed to use it far too much, wiping out the entropy advantage they gained by adding other languages. This is a failing all too common in cryptography, and why such emphasis is placed on using known, peer-reviewed algorithms rather than clever ideas you had in the shower.
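One concrete illustration of how little layering classical methods buys (using Vigenère-style layers as a stand-in for the general idea): stacking two Vigenère keys is exactly equivalent to a single Vigenère key whose length is the lcm of the two key lengths, so the attacker faces one slightly longer key rather than two independent ciphers.

```python
from itertools import cycle
from math import lcm

A = ord("A")

def vigenere(text: str, key: str) -> str:
    """Shift each letter of an uppercase A-Z message by the repeating key."""
    return "".join(chr((ord(c) - A + ord(k) - A) % 26 + A)
                   for c, k in zip(text, cycle(key)))

def combine(k1: str, k2: str) -> str:
    """Two stacked Vigenere keys collapse into one key of length lcm(len(k1), len(k2))."""
    n = lcm(len(k1), len(k2))
    return "".join(chr((ord(k1[i % len(k1)]) + ord(k2[i % len(k2)]) - 2 * A) % 26 + A)
                   for i in range(n))

msg = "LAYERINGCLASSICALCIPHERSBUYSLESSTHANITSEEMS"
assert vigenere(vigenere(msg, "LEMON"), "ATTACK") == vigenere(msg, combine("LEMON", "ATTACK"))
```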
The Code Talkers I linked to above worked for no other reason than that the mere existence of the language was essentially unknown outside the US. But one slip and the entire thing could have unravelled. The entire key was expressible in one word.
† In one adventure in Stanislaw Lem’s The Cyberiad, Trurl the constructor creates an automaton that can perform any task for a king. The first thing the king does is ask the automaton to tell him how to avoid paying Trurl, and Trurl is ejected from the court unpaid. In revenge, in order to discredit the automaton, Trurl sends his creation a message containing random data. The automaton is unable to convince the king that the data does not contain a hidden message, as all possible messages are encoded within, and the king’s cryptographers can find anything they wish within.
Another thing to note about the code talkers is that they were used for the lowest level of secrets: things like “bomb that guy over there”, where you only need to keep what you are saying secret for a matter of minutes (knowing your enemy wanted to bomb that guy over there, an hour after he’s been bombed, doesn’t help you).
That’s the opposite of the kind of thing one-time pads are used for, which is communication between spies and diplomats, where it’s still a big deal if the message is decrypted months or years later.
It wasn’t quite that bad. They actually used codes, so that another speaker of the language wouldn’t necessarily understand what they were talking about. I’m not sure how often they replaced the code book—maybe never.
In fact, the Japanese came closer than you might think to breaking the code. The codetalkers used the same radio channels as the rest of the American military, so other American personnel sometimes heard them on a channel, got angry, and told them the channel was not for the Japanese; the codetalkers replied that they were Navajo, not Japanese. The Japanese recorded some of the codetalking messages. Among their many American prisoners of war was one Navajo; they played the messages for him, and he said it sounded just like random Navajo words. Navajo codetalking was only in use for about two years; it would eventually have been broken if the Japanese had worked harder on it for many more years.
There are plenty of examples of decoding languages that are long dead using purely mathematical methods (just like decryption: frequency analysis, and then “cribs”, i.e. educated guesses as to what a given phrase might mean, which can then be proved or disproved by trial and error).
There are even more examples of writing systems that nobody has been able to decode yet.
The really crazy thing about the Code Talkers is that they took what was, on its face, a nearly unbreakable code, and then “improved” it so much that it became a trivial code that could have been broken without any knowledge of the language whatsoever, because they weren’t actually speaking in Navajo. What they did was take each English letter and associate it with an English word, like D for Dog, and then translate those words into Navajo, so that whatever the Navajo word for “dog” is would represent the English letter D. Which means that what they had was a trivial letter-substitution cipher, the sort of thing newspapers use to create puzzles that amateurs solve in half an hour for fun, if only it had occurred to the Japanese to approach the problem that way. They’d have been much better off just letting the code talkers speak in their native language: sure, that would leave them vulnerable to the Japanese having a Navajo speaker of their own, but then, any method was vulnerable to a code talker being captured.
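To see why that scheme, as described above, is just a substitution cipher, here is a toy sketch; the word tables are invented placeholders, not the real code-talker vocabulary.

```python
# Invented placeholder tables -- NOT the real code-talker vocabulary.
letter_to_word = {"D": "dog", "A": "ant", "W": "weasel", "N": "nut"}
word_to_navajo = {"dog": "NAVAJO-DOG", "ant": "NAVAJO-ANT",
                  "weasel": "NAVAJO-WEASEL", "nut": "NAVAJO-NUT"}

def encode(message: str) -> list[str]:
    # Each English letter always becomes the same spoken word: a monoalphabetic substitution.
    return [word_to_navajo[letter_to_word[c]] for c in message.upper() if c in letter_to_word]

print(encode("DAWN"))   # ['NAVAJO-DOG', 'NAVAJO-ANT', 'NAVAJO-WEASEL', 'NAVAJO-NUT']
# An eavesdropper only needs to notice that each distinct spoken word stands for one letter;
# ordinary frequency analysis then works exactly as it does on a newspaper cryptogram.
```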
Where the code talkers were really useful was in identifying friendlies. There was one incident where two American battleships mistook each other for hostiles, at a range too long for clear identification. The incident was resolved by both ships putting their code talkers on the radio, where they presumably just conversed colloquially in Navajo. That’s something much harder for an enemy to learn to do.
Wow. I thought Randall was kidding
If I remember correctly, they only used the letter-by-letter codes when the word they wanted to say wasn’t in their code book. They had general codes (use this word for “Sherman tank”, that word for “Wildcat fighter”, etc). And then they had code words for specific tactical instructions (“climb the hill”, “flank to the left”, “enemies advancing”, etc). The boundary between “cryptographic code” and “enhancing existing language for tactical combat situations” is blurry.
Interesting note: this is where Frank Herbert took the concept for the “battle languages” used in his Dune novels.