I’ve been doing a bit of historical research for a project, and I came across a commercial telegram that appears to have been encoded for secrecy.
I’ve got the draft of the text, and then the worksheet used to turn it into a cipher for sending by telegram. Each word from the text is assigned a number, then a number is added to that number, which gives a new word. That new word is what would be sent by telegram, so the message goes out as an apparently nonsensical string of words.
I assume it was common to use ciphers when you had to give the text of your message to the telegram company. Anyone there could read it in clear, so you put it in a cipher, with the recipient at the other end knowing how to decrypt it.
Anyone have any experience with this sort of thing? I assume there were thick code books, which assigned words to numbers. If the recipient knew which code book, and the number you add, they could decipher it at the other end.
There have been a vast number of cyphers, going back to the Sumerians and the Egyptians. The Vigenère cypher long had a reputation for being exceptionally strong, but Babbage found a way to break it in the 1850s.
Here is an interesting article on the subject of encoded telegraphs.
From the paper I found, it looks like the person composing the cipher wrote out the first word of his text, say “Shipment”. Then beside it he wrote a five-digit number, something like 22691. Then he added a number to it, say 100, so now the number is 22791. Then he wrote a completely different word, unrelated to shipment, let’s say “bellows”.
So what I think happened is that he had a commercial cipher book, with thousands of words, each assigned a unique number. He looks up “Shipment” and finds it is assigned 22691. He decides what number he’ll add, choosing 100, then looks up 22791, finds that it’s the number for “bellows” and writes out “bellows” as the first word of the cipher.
He then goes through his message, getting the number for each subsequent word, adding 100, and then looking up the corresponding word. That string of nonsense words is sent by telegram.
The recipient has the same cipher book. He and the sender have agreed in advance that the cipher key number will be 100. So he looks up “bellows” in the cipher book and finds that its number is 22791. He subtracts 100, getting 22691. Then he looks up 22691 and finds that it’s the number for “shipment” and that’s the first word of the message. Then he does the same process for each subsequent word.
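For anyone who wants to try the arithmetic, here is a minimal Python sketch of that look-up, add, look-up procedure. The four-word codebook and its numbers are invented for illustration (they are not from any real commercial book), and the function names are my own.

```python
# Toy codebook: the words and numbers are invented for illustration,
# not taken from any real commercial cipher book.
CODEBOOK = {
    "shipment": 22691,
    "bellows": 22791,
    "delayed": 40210,
    "orchid": 40310,
}
NUMBER_TO_WORD = {number: word for word, number in CODEBOOK.items()}


def encipher(words, key):
    """Look up each word's number, add the key, return the word at the new number."""
    return [NUMBER_TO_WORD[CODEBOOK[w] + key] for w in words]


def decipher(words, key):
    """Reverse the process: look up each word's number, subtract the key."""
    return [NUMBER_TO_WORD[CODEBOOK[w] - key] for w in words]


sent = encipher(["shipment", "delayed"], key=100)
print(sent)                     # ['bellows', 'orchid']
print(decipher(sent, key=100))  # ['shipment', 'delayed']
```

With a real book every number in range has an entry, so the look-up after adding the key always lands on some word; in this toy version a missing number simply raises a KeyError.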
Note that this isn’t a code word system, where certain words have an agreed upon expanded meaning. Those telegram codes were designed to compress data, saving money (telegram companies charged by the word), but were publicly available and didn’t conceal the meaning of the message.
I recall reading about them in a book on cryptography, possibly The Code Book by Simon Singh. You have the right idea about how they worked. Frank Miller invented the most secure system, known as the one-time pad. If used properly and if the pre-shared key of shift numbers is kept secret, it can’t be broken.
Each correspondent would buy a commercially available book listing the codes and their meanings, such as this one by Miller himself. They would share (usually by postal mail) a list of the (preferably randomly generated) shift numbers they intend to use in composing their future telegraphic messages. When he wants to send a secret message, the sender uses the first shift number from his list that he hasn’t used yet. He looks in the book for the word or phrase he wants to relay, notes the corresponding number, adds the shift number to it and writes the word corresponding to that number. To decode it, the receiver looks on the list of the shift numbers his correspondent earlier sent him, and finds the first shift number that hasn’t been used yet. He looks for the encoded word in the book, finds the associated number, subtracts the shift number, and finds the word or phrase corresponding to that number. The numbers roll over, so if the sum comes to, say, 15,345 with a book that contains only 14,000 words, it’s really 1,345 ( = 15,345 - 14,000). Likewise, if the receiver’s subtraction goes below the first entry, he adds 14,000 back.
For maximum security, you have to use a different shift number for each word of a message, but that requires sharing huge lists of the shift numbers you intend to use. In practice people often used the same shift number for all the words of the message, changing the shift number only when there was a new telegram to send. Some people even used the same pre-agreed shift number for all their telegrams, which is not very secure at all.
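Here is a rough Python sketch of the per-word shift scheme with the roll-over, in case it helps to see it spelled out. The eight-word book, the shift list, and all the function names are made up for the example; a real pad would use randomly generated shifts that are never reused.

```python
# Hypothetical miniature code book; a real one would have thousands of entries.
BOOK = ["advise", "bank", "cargo", "delay", "freight", "price", "ship", "wheat"]


def wrap(number, size):
    """Roll a code number over into the range 1..size."""
    return (number - 1) % size + 1


def number_of(word):
    """Code numbers start at 1, as in a printed book."""
    return BOOK.index(word) + 1


def word_at(number):
    return BOOK[wrap(number, len(BOOK)) - 1]


def encipher_pad(words, shifts):
    """Add a fresh shift to each word's number; one shift per word."""
    if len(shifts) < len(words):
        raise ValueError("not enough unused shift numbers for this message")
    return [word_at(number_of(w) + s) for w, s in zip(words, shifts)]


def decipher_pad(words, shifts):
    """Subtract the same shifts, in the same order, to recover the text."""
    if len(shifts) < len(words):
        raise ValueError("not enough unused shift numbers for this message")
    return [word_at(number_of(w) - s) for w, s in zip(words, shifts)]


shifts = [5, 3, 11]  # illustrative only, not random
sent = encipher_pad(["ship", "wheat", "delay"], shifts)
print(sent)                        # ['delay', 'cargo', 'ship']
print(decipher_pad(sent, shifts))  # ['ship', 'wheat', 'delay']
print(wrap(15345, 14000))          # 1345
```

Using the same shift for every word corresponds to passing a list like [100, 100, 100], which is where the security falls apart.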
Note that many, or perhaps most (or all?), of the commercial code books also included instructions for encryption. Encrypted telegrams cost significantly more to send, but prior compression using the code books made them significantly shorter (and cheaper).
At least at the beginning, manual encryption was very slow and laborious, so the encryption methods were kept simple, and were often simplistic.
Another point about encrypted telegraph messages was that they had to be sent and received by ordinary telegraph clerks.
If the message was only a meaningless string of letters and numbers, there would be far more chance of mistakes. By sending actual words, the chance of errors was reduced. And if one letter of a word was wrong, you could probably guess what it was supposed to be.
A simple code that anyone can use is to have a published book agreed between sender and recipient. Any book will do and in any language so long as both parties use the same version/edition.
At its simplest, all you have to do is to find the word you want to send and send a number which points the recipient to that word. Since no one else knows which book you are using, it is superficially secure, but open to decryption by analysis. It’s not hard to imagine putting other steps into the encryption that will make it much harder, although it would never be totally secure.
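A bare-bones Python sketch of that idea, assuming the "pointer" is just the position of the word in the agreed text. The one-sentence text stands in for a whole book, and a real scheme would more likely point by page, line and word; all the names here are my own.

```python
# Stand-in for the agreed book: any text both parties hold in the same edition.
AGREED_TEXT = "the ship will sail at dawn if the price of wheat holds"
WORDS = AGREED_TEXT.split()


def encode(message):
    """Replace each word with the 1-based position of its first occurrence."""
    return [WORDS.index(w) + 1 for w in message.split()]


def decode(numbers):
    """Turn each position back into the word found there."""
    return " ".join(WORDS[n - 1] for n in numbers)


numbers = encode("sail if wheat holds")
print(numbers)          # [4, 7, 11, 12]
print(decode(numbers))  # sail if wheat holds
```

Because a given word always maps to the same number here, repeated words show up as repeated numbers, which is the sort of pattern that makes this open to analysis.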
This encryption algorithm was exactly at the center of a Sherlock Holmes story. (Sorry, don’t remember which. Someone here will surely fill in the blanks.) Needless to say, Holmes quickly guessed what the magic book was (although, to be sure, it took him two tries.)
It is theoretically unbreakable if the pad is kept secret. Using a published book is not a secret pad, even though the code breaker would have to figure out what book it is.
I will add a little bit of detail on why peculiarities of Morse Code caused telegraph companies to charge a premium for encrypted traffic:
Numeric data is much less compact in Morse code. Letters are encoded in as few as one symbol for the most common English letters (E & T) up to four symbols for the least common (like Y, Z, Q). You can send two dits in the space of one dah, so there was a tradeoff between fewer symbols and more dits, with a bias toward more dits in the more common letters and more dahs in the less common.
Numerals all encode as 5 symbols, and most punctuation marks as 6 symbols. In some cases, where the data was all numeric, the telegraphers might agree to use “cut numbers”, many of which are the same as letters. For example a full numeral two would be ..--- but a cut two would be ..- which is also the letter U.
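To put rough numbers on that, here is a small Python sketch that counts sending time in dit-lengths. It assumes International Morse and the usual timing convention (dit = 1 unit, dah = 3 units, 1 unit of silence between the elements of a character); landline telegraphers often used American Morse, which differs, so treat this as illustrative only.

```python
# A handful of International Morse characters (assumed alphabet, see above).
MORSE = {
    "E": ".", "T": "-", "U": "..-", "Q": "--.-", "Y": "-.--", "Z": "--..",
    "1": ".----", "2": "..---", "5": ".....", "0": "-----",
}


def duration(symbols):
    """Sending time in units: dit = 1, dah = 3, plus a 1-unit gap between elements."""
    marks = sum(1 if s == "." else 3 for s in symbols)
    gaps = len(symbols) - 1
    return marks + gaps


for char in ["E", "T", "U", "Q", "2", "0"]:
    print(char, MORSE[char], duration(MORSE[char]))
# E . 1
# T - 3
# U ..- 7
# Q --.- 13
# 2 ..--- 15
# 0 ----- 19
```

On these assumptions a full numeral two takes about twice as long to send as the cut version that doubles as U.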
Also, for plain English text, telegraphers might use some standard abbreviations, like ES for AND. Due to per-word charges, there wasn’t much need to abbreviate some other common words like THE, because the sending party would tend to omit them from the original message.
Thus not only was it easier for a receiving telegrapher to fill in any blanks in sensible English text, knowing that a U always follows a Q and other common combinations, it was also MUCH faster to send. Random letters don’t follow the normal English distributions, rules, or conventions, so cypher text made up of 5-letter groups (which became the standard for encrypted traffic) is generally slower to transmit and much harder to receive accurately than English words.
It should be, but so was Enigma; the weak link, as always, is the human factor. For example, if the book was one you carry with you, other people might work out that it must be significant.