My alarm system requires a 4-digit code for me to turn it on and off. I have also learned that I can enter in this 4-digit code in the middle of any key sequence.
For example, if my code was 5678, I could press 12345678 and
it would successfully turn off when I hit 5678.
Now, let’s say that I forgot my code. What would be the fewest key presses that I would need in order to try every 4-digit code?
For example, I would be able to do all of the 4-digit codes that are sequences by pressing 1234567890123 (gotta get the wrap-arounds). In other words, that would only take me 13 key presses to try all of those.
There are 10,000 four digit numbers.
Entered individually that would be 40,000 keystrokes to try them all.
You can just count 4 digit subsequences to get an idea of the minimum number of keystrokes needed to get all 10,000 four digit numbers.
sequence sequence length # of four digit subsequences
x 1 0
xx 2 0
xxx 3 0
xxxx 4 1
xxxxx 5 2
xxxxxx 6 3
xxxxxxx 7 4
n n n-3
So to get 10,000 distinct four digit sequences you’d need a minimum of n + 3 or 10,003 keypresses. So 10,003 keystrokes is the minimum needed to try all the possible combinations. Not just any 10,003 digit number would do however; you’d have to get very lucky to pick a 10,0003 digit number at random and get one that contains all possible 4 digit subsequences. As a WAG, there is at most 1 such number; so unless you are entering it out of a book, you’d be better off just trying all 10,000 distinct numbers in sequence.
If there is one, then there are two, since you can also enter the same 10,003-digit sequence in reverse. Oh shoot. What if it’s palindromic? Nevermind that. But, if you have a sequence that works, you should get a different sequence that works by swapping all 2’s and 1’s, or any digits for that matter. It’s my WAG that there are many many many successful sequences, but I agree that you’d be darn lucky to guess one.
Granted there are a lot of sequences that work, but you can order them. Let’s consider the sequence that represents the smallest integer. You must have ‘0000’ somewhere. The smallest sequence will start with ‘0000’. Then place a ‘1’ three more zeros and a two. So, the smallest such sequence will start like this:
Oops, swapping 1’s for 2’s really would work wouldn’t it ? Isn’t symmetry grand when you see it properly ! Then counting the possible swaps gives at least 45 10,003 digit numbers that’d work. Has anyone got a proof that there is 1 such number ?
Seems to me that you ought to be able to construct such a 10,003-digit number. For example, pick any four-digit sequence to start; say, 9439 (just a random choice). For each subsequent digit, follow this procedure: 1) examine the previous three digits (439, in this case). 2) Determine which of the four-digit sequences of the form 439X have been used already in previous steps. 3) Pick any digit for X that gives a sequence not previously used. 4) Repeat the next 9998 times.
The trick here is that there are only ten sequences of the form Y439. Thus, you’re guaranteed to see this sequence ten and only ten times, and you simply choose a different digit X every time you see the same sequence.
Assuming this logic is correct, there ought to be, uh, (10000)(10[sup]99[/sup])(9[sup]100[/sup])(8[sup]100[/sup])(7[sup]100[/sup])(6[sup]100[/sup])(5[sup]100[/sup])(4[sup]100[/sup])(3[sup]100[/sup])(2[sup]100[/sup]) distinct 10,003-digit numbers that would work.
I’m pretty sure that if you take a number that works, and circular shift it by some number k, which is smaller than the number of digits, you’ll get a valid solution. What k could be is still an open question…
I checked out that link, ultrafilter. It says you can generate these things using Mathematica. Does anyone with access to Mathematica want to give it a go and check to make sure that one of length 10,003 actually does exist?
Maybe I don’t quite understand these things totally, but one sentence from the MathWorld link confused me. It says, “For example, a de Bruijn sequence of order n on the alphabet {a,b,c} is given by {a,a,c,b,b,c,c,a,b}.” It would make sense to me if they said 2 instead of n, but for, say, n = 3, the sequence a,a,a is not represented. What’s up with that?
Apparently Lempel (of LZW fame) came up with an efficient method of generating De Bruijn sequences.
You can buy a description of the algorithm for $10 here. The sequences probably do awful things to lossless compression algorithms.
This is actually too many numbers since 00000001 is too long to account for the actual sequence 0000. You actually only need 00001 to account for 0000 and 0001. We would need to start a pattern as follows:
0000 & 0001 = 00001
0001 & 1000 = 10001
0000 & 0001 & 1000 = 000010001.
Also note:
0001 with 1000 = 10001
0002 with 2000 = 20002
0003 with 3000 = 30003
0004 with 4000 = 40004
etc.
All palindromic numbers can be “chopped” in such a manner.
So there are 10,000 possible combinations (40000 maximum button presses), but there are a ton of palinromes in there for which we can lose a digit. In fact, there are 9 palindromes for every three digit “inner” palindrome.
X000X = X000 and 000X
XY0YX = XY0Y and Y0YX
Or, most generally,
XYXYX = XYZY and YZYX. This pattern can cut 3 digits for each number of form XYZY less than 5000. Or, more specifically, there are (I believe) 999 = 729 such numbers, or 2187 buttons we don’t have to press.
Okay, there are 10[sup]10,000[/sup] distinct 10,000-digit sequences, no? I wanted to figure out how many of them contain each of the 10 digits exactly 1,000 times. I seem to remember this being equal to 10,000!/(1,000![sup]10[/sup]). Using Stirling’s approximation, I got that this was equal to 10[sup]10,000[/sup], so I guessed that 1,000 is too small a number to be using this approximation on. I tried punching it up in my calculator, though, and got that it was equal to 10[sup]9,983.4[/sup]. This seems abnormally high to me. One in 4×10[sup]16[/sup]. Where’s my error, if anywhere?
There is a theorem that proves this type of thing can always be done minimally (i.e., each “word” (string of four digits) occurs once and only once throughout the entire sequence). So there is a 10,000 digit string which has all possible four digit combinations (finding it is another matter, of course). This is assuming that “wrap-arounds” are allowed, i.e., the two digit sequence (a,b) represents two 2 letter words: ab, and ba (since we can wrap around from the end to the beginning). If we don’t allow wrap arounds, then we know there is a sequence of 10,003 digits that includes all possible four digit strings.
So the answer to the OP is in fact 10,003 (since wrap arounds won’t be recognized by the alarm system).
So it’s always easy to determine the minimum length of such a sequence; the hard part is actually finding it. The thereom was proven by de Bruijn (for two letter alphabets, and arbitrary finit word length) in 1946, and by Good (for arbitrary finite alphabets and arbitrary finite word length), also in 1946.
But that is what I posted. I started with 0000 1000 which will have all sequences with zeros and at most one 1.
The really strange thing is I originally made that mistake, but I caught it in preview. You are weirding me out.
I just realized that I did make a mistake. after 9000, go to 1100. I counted 1000 again. (ten followed by two 0s = one followed by four 0s). I should have eliminated multiples of ten.
I would contend that finding a number sufficient to contain all n-digit sequences is easy, but finding the shortest such number is difficult. Your simple example illustrates this pretty well, as I can represent 0000, 0001, and 1000 three characters shorter than you have: 100001. As more and more numbers are involved, it gets trickier to spot the optimizations.
With computers these days, even this isn’t as difficult as I imagined. I wanted to see how hard it would be to locate a valid sequence and wrote a quick-and-dirty program to find one. It took less than 5 seconds of running time.
The sequence is very interesting. It begins “0001000200030004000500060007…” and ends “…9799879998888988998989999000”. One property of a valid sequence is that the first three digits will be the same as the final three. Of course, another 10! sequences could be created through substitution.