I have a question for real cryptogeeks: Is it really any more difficult to create a one-time pad system in letters than in digits? (We’re talking non-computerized technology, here).
I just read a book by a codemaker in an isolated branch of the British secret service in WWII (Between Silk And Cyanide – good read if you can get past how conceited the author was and still is). In it he describes how numeric one-time pads work and basically says their use was widespread and common knowledge (messages were first converted to numeric groups using a code book). But he makes a big deal of how hard he worked to create a one-time pad system for agents using letters instead of numbers, and says that a code expert described thinking up one-time pads for letters all by himself as quite an accomplishment.
Why is this so hard? Isn’t a letter one-time pad exactly like a numeric one-time pad (except letters instead of numbers, duh), and using it is identical except you add each plaintext letter to the pad letter modulo 26 to get the ciphertext letter instead of adding plaintext digit to the pad digit modulo 10 to get a ciphertext digit?
The letter one-time pad system he describes uses a complex printed substitution square rather than just adding the plaintext letter to the pad letter. So that an “A” plaintext with a “C” pad gives, for instance “Q” ciphertext while an “A” plaintext with a “D” pad gives “J” ciphertext, and so on.
Why do you need a substitution square? Why not just say A=1… Z=26 and add the plaintext and pad letters (modulo 26), so that an ‘A’ in the pad always gives as the ciphertext letter the letter alphabetically following the plaintext letter, and a ‘Z’ always gives a ciphertext letter the same as the plaintext? What am I missing?
Just guessing on my part, but the substitution code that you referred to, with the A=1 and so on, is actually centuries old. One-time pads work best when they have a life expectancy of a month; any more than that and the OPFOR crypto guys get to try out their skills due to your message traffic.
WW2 was the first war in which crypto was being decrypted via computers, so using a number-based cypher had to be done a certain way. BTW, as far as I know, the Germans and Japanese never used computers other than the mechanical Enigma setup, which was an encoder/decoder, but for Allied traffic I have not seen reference to any kind of mechanical brute-force decryption.
I think the above coding genius who came up with that version of the one-time pad may have based it around existing books or tomes like the King James Bible or the like. So the one-time pad may have just been key references to pages, paragraphs, and such.
Someone will probably be along shortly to either add more info or totally refute what I have said.
I think the scheme the OP is referring to just uses substitution as a means of numerically shifting the letters, not as the cipher in itself. That is, if the message contains B (=2) and the next character on the one-time pad is a D (=4) then the ciphertext is an F (=6). This is like a ROT13 cipher, but with a variable shift based on the OTP.
Message traffic has no bearing on security because you never reuse any part of the OTP. If you ever reuse any part of the sequence then the bad guys can get information about your pad by comparing ciphertext, but you’ve violated one of the basic rules of OTP implementation. You could stream OTP-encrypted bits at 10G/s 24/7 and they would be just as impenetrable as a low-traffic message. Of course, you’d have the problem of securely sharing a similar amount of key data with the recipient, which is the downfall of most OTP systems. Basically, OTP message traffic is generally kept low because the pad bits are “valuable” in the sense that when you run out, you can’t communicate securely anymore. BTW, if you use key bits obtained from a public source like a book or the bible, then you’ve violated the premises of the one time pad that leads to it being “provably secure”, so you don’t really have a secure cryptosystem.
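Just to make the reuse point concrete, here’s a toy sketch (Python, made-up messages) of what goes wrong: XOR two ciphertexts that were encrypted under the same pad and the pad cancels out, handing the analyst the XOR of the two plaintexts without the key ever being touched.

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

p1 = b"ATTACK AT DAWN"
p2 = b"RETREAT AT SIX"      # same length, to keep the demo simple
pad = os.urandom(len(p1))   # the one-time pad

c1 = xor(p1, pad)
c2 = xor(p2, pad)           # pad reused -- the cardinal sin

# The pad cancels out: the analyst learns p1 XOR p2 without ever seeing the key.
assert xor(c1, c2) == xor(p1, p2)
```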
I don’t have an answer for the OP. Digital OTP schemes are based on a very simple operation (XORing a message bit with a key bit) and they get their security from the secrecy of the key bits. I don’t see any reason why a letter-based OTP couldn’t use a similarly simple operation like a mod-26 shift in letter position and still be secure, and I don’t see how a complicated letter-substitution scheme increases the security much (unless you’re relying on security-by-obscurity in keeping your substitution scheme secret in addition to your key bits, which is rather bad design). However, all of my experience in crypto is in implementation rather than design and analysis, so a more qualified cryptographer may have more insight.
I think you are right in your assertions; there’s nothing difficult at all about combining plaintext and padtext (as opposed to plain numbers and pad numbers). If one’s agents are particularly flaky then maybe you couldn’t trust them to “add” X to Y modulo 26, so providing them with a look-up table as an aide-memoire might make sense, but there’s no added cryptographic advantage in “shuffling” this pad in the way you describe.
Nitpick: number your letters A=0 to Z=25 if you’re doing mod 26 arithmetic.
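To show there really is nothing to it, here’s a rough sketch of the straight mod-26 scheme the OP describes (Python, random pad pulled from the OS just for illustration; with A=0 a pad letter of ‘A’ leaves the plaintext letter unchanged):

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # A=0 ... Z=25

def make_pad(length: int) -> str:
    """Generate a pad of random letters (OS randomness, for illustration only)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def encrypt(plain: str, pad: str) -> str:
    """Add each plaintext letter to the corresponding pad letter, mod 26."""
    return "".join(ALPHABET[(ord(p) - 65 + ord(k) - 65) % 26]
                   for p, k in zip(plain, pad))

def decrypt(cipher: str, pad: str) -> str:
    """Subtract the pad letter, mod 26, to recover the plaintext."""
    return "".join(ALPHABET[(ord(c) - 65 - (ord(k) - 65)) % 26]
                   for c, k in zip(cipher, pad))

pad = make_pad(10)
cipher = encrypt("HELLOWORLD", pad)
assert decrypt(cipher, pad) == "HELLOWORLD"
```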
Anyways, I think that, as per the original post, one of Britain’s problems towards the end of the war was that their people generating the one-time pads became lazy and started to repeat some of the previous one-time stuff.
As for using open-source books as part of the one-time crypto, that would be based on what the agent has available at any one time, rather than the military, which would have both the one-time pad and the orders or circumstances that the code was designed to take advantage of.
AFFTIE, for example, may advise Subron 16 to rendezvous at such a time and place, and the action to be taken; of course the code string would be longer.
But an agent in deep cover may have needed to keep message traffic to under three minutes, depending on how long before he might be D/F’ed, so in that case using outsourced tomes may make more sense.
Maybe he’s talking about that encryption key algorithm that produces keys with English digraphs so they can be remembered more easily. That probably would have been a feat back then and would have certainly solved the problem of using poems as keys.
Thanks guys, if no-one else can see the problem either, then I’ll assume that there’s some key point missing from the book about why this is difficult.
Current schemes just use mod-two addition on bits, which isn’t conceptually different from mod-ten addition on digits or mod-26 addition on letters.
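In other words it’s the same combine step whatever the symbol set; just as a sketch of the idea (the numbers here are arbitrary):

```python
def combine(message, pad, modulus):
    """Add message symbols to pad symbols, modulo the alphabet size."""
    return [(m + k) % modulus for m, k in zip(message, pad)]

combine([1, 0, 1, 1], [0, 1, 1, 0], 2)      # bits -- this is just XOR
combine([3, 1, 4, 1], [2, 7, 1, 8], 10)     # decimal digits
combine([7, 4, 11, 11], [19, 0, 2, 5], 26)  # letters, A=0
```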
Interestingly, though, according to his book, he had originally wanted to create a different substitution square for each agent, until someone pointed out that this was unnecessary, so maybe he was just confused about how secure a very vanilla one-time pad system would be.
And KidCharlemagne is right, The Code Book is good at telling some interesting code stories and giving a general overview of code evolution through history.
It’s very simple to prove that one-time pads are secure under a few basic assumptions. Saying that one implementation is “more secure” than another, assuming both fulfil all the assumptions, is stupid.
The difficulty is in producing the one-time pads in the first place. It is VERY difficult to produce truly random sequences. With today’s volumes of traffic to work with, almost any trace of non-randomness can give the analysts a toehold into the system.
I recall a story told to me by a cryptanalyst who once worked on a system where the random key sequence was generated by someone sitting at a typewriter and hitting keys “randomly”. It turns out that this is one of the worst ways to produce a key sequence. The analyst claimed that he could tell what music the operator was listening to by the rhythm of the right-hand vs. left-hand keys.
I understand that modern one-time sequences are generated by putting a Geiger counter on a radioactive source…
I found something to elaborate on what I meant by the digraph keying algorithm:
http://www.fourmilab.ch/onetime/otpjs.html
Look at the difference in keys generated when “word-like” is selected vs. “alphabetic.” I wasn’t using the term digraph in the reverse engineering sense typically associated with cryptography.
That cryptanalyst must be talking about WWII-era crypto or something. Decent-quality random numbers are not too expensive. Of course, if you want guaranteed NSA-unbreakable RNGs, then they will cost a lot more. But you can get halfway decent true-RNG PCI cards for a couple of hundred $$$. Hell, even some modern motherboards and CPUs have basic true RNGs.
Do you have a cite for a card or motherboard with a real RNG? I’m not doubting they exist, but most products contain only a pseudo-random generator. Any sequence generated from a deterministic source (e.g. an algorithm) is pseudo-random. There are plenty of cryptographically-strong pseudo-random generators (i.e. good enough for most practical purposes like crypto key generation), but it’s important to distinguish these from a true random generator which needs a source of non-deterministic input like the Geiger counter methods BrotherCadfael mentioned or the lava-lamp sampler that got some press.
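The distinction is easy to see in code. A seeded PRNG will reproduce its entire sequence from the seed; the OS entropy pool won’t (with the caveat that something like os.urandom is itself a CSPRNG fed from hardware noise, not a pure true-random source):

```python
import os
import random

# Deterministic: the same seed reproduces the same "random" sequence forever.
a = random.Random(12345)
b = random.Random(12345)
assert [a.random() for _ in range(5)] == [b.random() for _ in range(5)]

# Non-deterministic (as far as the application can tell): OS-gathered entropy.
print(os.urandom(16).hex())
print(os.urandom(16).hex())  # different every call, not reproducible
```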
Conventional pseudo-random number generators are useless for serious cryptography. Any deterministic component can be exploited by cryptanalysts, especially given the enormous amount of encrypted text available for analysis in today’s environment.
All cypher systems, other than one-time pad systems, are essentially pseudo-random number generators, but to eliminate as many exploitable weaknesses as possible (you can never eliminate them all), they have to be designed with extraordinary sophistication.
What if you keep changing the seed value for a pseudo-random number generator? Granted, it’s still theoretically breakable because, as has been said already, the only truly random event we know of is a quantum event. But isn’t it practical even for serious cryptography? How can you isolate a seed value for one of the RNGs if it’s dependent on another one?
This isn’t much different than adding additional rounds to a multi-round algorithm. It might make it harder to crack and it might make some exploits impractical, but it’s still a deterministic system. For some things like session keys, cryptographically-strong pseudo-random is usually adequate, depending on your threats and the value of what you’re protecting. For other things, like generating an RSA key pair which may exist for many years and protect thousands of transactions, it’s worth the effort to use a true random system.
But what do you change the seed to? If you’re getting your new seed from a deterministic source, then your total algorithm is still deterministic. If you get your seed from a random source, then why not just use that random source to begin with, and not bother with the pseudo-random algorithm?
And even with WWII technology, it’s still very easy to produce good random numbers. Heck, dice have been around since Caesar’s time, and probably for a long time before that. I’m sure that a national government could cobble together a few roulette wheels in such a way as to quickly produce all the pads they needed.
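Even the mechanics aren’t hard. The only trap is modulo bias when you squeeze die rolls into 26 letters, which rejection sampling avoids. A rough sketch (simulating the dice in software, which obviously defeats the purpose; it’s just to show the bookkeeping):

```python
import random  # stand-in for a person actually rolling physical dice

def roll_d6() -> int:
    return random.randint(1, 6)

def pad_letter() -> str:
    """Turn two d6 rolls into one uniform letter A-Z, rejecting biased outcomes."""
    while True:
        value = (roll_d6() - 1) * 6 + (roll_d6() - 1)  # uniform over 0..35
        if value < 26:                                  # keep 0..25, reroll 26..35
            return chr(ord("A") + value)

pad = "".join(pad_letter() for _ in range(50))
print(pad)
```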
Once you have figured out the algorithm, you have to determine the initial key, or cryptovariable. If the algorithm is sufficiently well-crafted that only a brute-force attack on the key is possible, and the time required for a brute-force attack is long enough as to make any information contained in the message useless, the cipher is considered secure for that purpose.
Commercial random-number generators are nowhere near sophisticated enough for this purpose. (By the way, a brute-force attack is an exhaustive search, which tests all possible key values.)
The key to breaking a cipher system is to find a method of attack that does not require brute force (or which drastically reduces the size of the brute-force attack). Basically, you are looking for something which the mathematicians who designed the system overlooked.
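To put rough numbers on “long enough”, here’s the back-of-the-envelope arithmetic; the guesses-per-second figure is just an assumption for illustration:

```python
GUESSES_PER_SECOND = 1e9      # assumed attacker speed, purely illustrative
SECONDS_PER_YEAR = 3.15e7

for bits in (40, 56, 128):
    keys = 2 ** bits
    years = keys / GUESSES_PER_SECOND / SECONDS_PER_YEAR
    print(f"{bits}-bit key: about {years:.2e} years to exhaust")

# 40-bit: minutes; 56-bit: a couple of years; 128-bit: on the order of 1e22 years.
```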
Not that this is directly relevant to the OP, but in Cryptonomicon one of the main characters was able to crack an OTP. The Brits in the book were using bingo-style containers (cages that rotate when a handle is turned) with a letter on each ball. A woman would turn the handle a random number of times, then, without looking, would pull a ball out and write it down. The man who cracked the OTP reasoned that the procedure wouldn’t be consistently followed. Occasionally the person pulling the ball out might glance at where their hand was going first. Or they might pull the same letter 2 or 3 times in a row, decide that wasn’t random enough, and replace the last pull with a new one. Based on this, and his analysis of the ways people might subconsciously select letters, he was able to get an advantage in determining the probable contents of the cipher. The point being, as I see it, that any procedure which relies on humans to generate randomness is flawed simply due to human nature.