A tough logic problem that might interest maths/computer folks

InvidiousCourgette · May 18, 2005, 11:28am

I am contributing to a discussion on the maths behind sudoku’s on a small message board. A sudoku is a puzzle like this:

71* *** 53
*** 75 ***
65 *** 79

2 3*9 4
4 *** 9
7 1*6 2

16 *** 97
*** 98 ***
57 *** *84

where you have to replace all the stars with the digits 1 to 9, such that no number is repeated in any of the 9 mini grids, any row or any column.

The discussion is focussed on how many unique completed sudoku grids can be devised. Several people have calculated an estimate of around 6 * 10^21. Getting from there to an absolute answer is proving more difficult. As this board is much more active, I wondered if anyone here could see how it might be done?

The discussion can be found here

Tyrrell_McAllister · May 18, 2005, 3:16pm

This sounds like a nasty inclusion-exclusion problem if you want to compute the exact number.

The total number of ways to fill the grid with the numbers 1 through 9 is clearly 9[sup]81[/sup]. If we subtract from this the number of grids which are fail the sudoku condition, then we get the number you seek. Here’s an outline of one way to do that which you might be able to work out to the point that Maple or Mathematica could compute the answer.

Let F be the set of pairs of entries that are forbidden to be equal. (So, F consists of pairs of entries on the same row, the same column, or the same minigrid.) For each pair p in F, let S[sub]p[/sub] be the set of completed grids in which both of the entries in p have the same value. (So, S[sub]p[/sub] is a set of grids which fail the sudoku condition).

Evidently, the grids that fail the sudoku condition are precisely those that are in the union of the S[sub]p[/sub]'s. Using #S to denote the number of elements in a set, the number you want is
9[sup]81[/sup] ? #(?[sub]p ? F[/sub] S[sub]p[/sub])

This is where the inclusion-exclusion principle comes in. According to this principle, you have that

#(?[sub]p ? F[/sub] S[sub]p[/sub]) = ?[sub]G?F, G ? ?[/sub] (?1)[sup]#G ? 1[/sup] #(?[sub]p?G[/sub]S[sub]p[/sub])

Now the problem is “reduced” to computing the values #(?[sub]p?G[/sub]S[sub]p[/sub]) for each nonempty subset G of F. This should be reasonably do-able if you break the problem up into cases, such as how many of the pairs in G share entries.

InvidiousCourgette · May 18, 2005, 5:00pm

Thanks for the input Tyrell. I am afraid I don’t understand the notation you have used (or quite possibly the operators you are using). Nonetheless, the idea of reducing the problem to something that can be tackled by brute force is attractive.

My approach had been to prefill 3 of the mini grids (1,5 and 9) with numbers and then brute force how many valid ways there were to complete the grid. The trouble with this technique is the brute force attack would take approx. 10 minutes to solve. Multiply that time with the number of unique ways to fill mini grids 5 and 9 (9!^2), and you will be waiting a long time to get a complete solution. Of course many of the 9!^2 ways will be mirror images and what not, but I could not see a way to reduce the total to a sensible number for attack.

Tyrrell_McAllister · May 18, 2005, 6:13pm

I don’t know if it will help, but I see that I screwed up the coding. My final paragraphs were supposed to look like this:

Evidently, the grids that fail the sudoku condition are precisely those that are in the union of the S[sub]p[/sub]'s. Using #S to denote the number of elements in a set, the number you want is
9[sup]81[/sup] − #(∪[sub]p ∈ F[/sub] S[sub]p[/sub])

This is where the inclusion-exclusion principle comes in. According to this principle, you have that

#(∪[sub]p ∈ F[/sub] S[sub]p[/sub]) = ∑[sub]G⊂F, G ≠ ∅[/sub] (−1)[sup]#G − 1[/sup] #(∩[sub]p ∈ G[/sub]S[sub]p[/sub])

Now the problem is “reduced” to computing the values #(∩[sub]p∈G[/sub]S[sub]p[/sub]) for each nonempty subset G of F. This should be reasonably do-able if you break the problem up into cases, such as how many of the pairs in G share entries.

In case you are unfamiliar with the notation even when I don’t screw up the coding, here is the basic idea. I have to admit that the explanation below is very poorly done. It is so bad that if I came back and looked at it a year from now, even I would have difficulty understanding what I was trying to say. But maybe it will help you get some ideas.

We know how many ways total there are to fill the grid with numbers (9[sup]81[/sup]), so if we can figure out how many of these grid-fillings are not valid sudoku fillings, we are done.

Each non-sudoku filling (NSF) fails to be sudoku because it has at least one “violation”. That is, at least one pair of entries are equal that should not be equal. Hence, each NSF ends up in at least one of my S[sub]p[/sub]. However, some NSFs have more than one violation, so they’ll end up in more than one S[sub]p[/sub]. So I can’t just add up the sizes of the S[sub]p[/sub] without overcounting. This fact is encapsulated by the inclusion-exclusion principle.

Using the formula I gave, you begin by adding up the size of each S[sub]p[/sub]. Each S[sub]p[/sub] has size 9*(9[sup]81[/sup] − 2). The number of S[sub]p[/sub] is a little harder to figure out because you have to break up into three cases (how many are row violations, how many are column violations, and how many are in-the-same-minigrid violations; this is only really two cases because the number of row violations is equal to the number of column violations). That number shouldn’t be too hard to figure out, so call it C. Then adding up the size of each S[sub]p[/sub] gives you
9 * (9[sup]81[/sup] − 2) * C.

However, as I said, when you add all this up you’ve over counted. In particular, you’ve counted the NSFs with exactly two violations twice, so, for each possible pair of violation, you subtract off the number of NSFs with that pair of violations.

Here’s where things get a little complicated. There are two ways that an NSF can have two violations: Either the two violations overlap, or they don’t. That is, the two violations either do or do not share an entry. And you still have to break up into cases based on whether the violations are row, column, or minigrid violations.

Once you’ve done that, you get a new number that you subtract from the previous one. But now you are underestimating, because, so far, an NSF with a given triple of violations has been removed too many times and isn’t being counted, so you need to add it in again. (This is best seen with the aid of Venn diagrams).

And so on :D. As poorly explained as it is, I still think that this direction is do-able. It seems like a cop-out to leave it at that, but there it is.

Punoqllads · May 18, 2005, 9:48pm

Well, let’s start with a simply constructed sudoku S:

123456789
234567891
345678912
456789123
567891234
678912345
789123456
891234567
912345678

I make the conjecture that every possible sudoku is the above matrix S multiplied by some permutation matrix R that permutes rows and then multiplied by some permutation matrix C that permutes columns. If this conjecture is correct, then the number of possible sudokus is 9![sup]2[/sup] = 131681894400. I haven’t worked out the math; perhaps others could look into it.

Punoqllads · May 18, 2005, 10:07pm

Minor alteration of my above conjecture:
Given the matrix S above, then for any sudoku S’, there exist two matrices R and C such that S’ = R * S * C, where R * M is a matrix that is M with its rows permuted and where M * C is a matrix with its columns permuted. I want to further say that 131,681,894,400 is the upper bound on the number of solutions, for I’m pretty sure that R and C are not necessarily unique for every S’.

Chronos · May 18, 2005, 10:09pm

Punoqllads (and how do you pronounce that, anyway?), I don’t think that’s a valid sudoku, since it fails the subgrid condition. That is to say, your first subgrid is

123
234
345

which has duplicates. Your value might be good for an upper bound, however.

Punoqllads · May 18, 2005, 10:14pm

You’re right, missed the mini-grid part.

Punoqllads · May 18, 2005, 10:24pm

And my name is the word spellbound upside-down, so I pronounce it, “spellbound”.

ticker · May 18, 2005, 10:41pm

FWIW my Sudoku Solver program, when starting with an empty grid, produces the following which looks, at least to start with, quite close to Punoqllads’ erroneous simple grid:

123 456 789
456 789 123
789 123 456

231 674 895
875 912 364
694 538 217

317 265 948
542 897 631
968 341 572

Punoqllads · May 19, 2005, 12:40am

Nope. My conjecture is wrong. Sorry for the bum steer. I’ll keep lookin’.

InvidiousCourgette · May 19, 2005, 9:57am

Thanks for the inputs again. Tyrrell, I think I sort of follow what you are suggesting. Not enough to know how to begin with the technique you are proposing, however!

Punoqllads, I think your upper bound is wrong. It does not tally with an approximation I calculated:

By filling in the mini grids 1, 5 and 9 randomly you can set a program to brute force the number of valid ways to fill in the remaining mini grids. The number you get turns out to vary depending on how you filled in those first three grids. I tried it 25 times with 25 set of randomly filled in grids (1, 5 and 9). I got a figure of 133700 as the average number of valid ways left to fill in the remaining grids. It is then possible to say that the total number of valid soduku’s, based on the average calculated by my 25 trials, is = 133700*9!^3, or 6.4 * 10^21

Note that when I say “randomly” filling in the first three grids, I mean random in the sense that each grid has all the numbers 1 to 9 put in them, but in a random order.

Tyrrell_McAllister · May 19, 2005, 2:23pm

My method can at least be used to find an upper and lower bound for the number sought if someone is willing to take the time to figure out the following values. These bounds can then be used to test other peoples’ conjectures about the precise number.

First, a piece of terminology: If a pair of positions are forbidden to be equal in a Sudoku, I call that pair of positions a potential violation locus, or PVL.

The upper bound: What is the total number of PVLs in the Sudoku grid? (If this value is plugged in for C in my previous post, it will give an upper bound).

The lower bound: What is the number in each of the following classes of pairs of PVLs?

(a) How many pairs of PVLs are there that share a position, but the two unshared positions do not form a PVL? An example of such a pair is indicated by astericks in the diagram below. The two positions oriented horizontally form one PVL, and the two positions oriented vertically form the other PVL.



.*. .*. ...
.*. ... ...
... ... ...

... ... ...
... ... ...
... ... ...

... ... ...
... ... ...
... ... ...

(b) How many pairs of PVLs share a position, and the two unshared positions also form a PVL?

Now consider the pairs of PVLs do not share an position, but you can choose an entry from each PVL to form yet another PVL. For example, the two astericked positions in the first row below form on PVL, and the two astericked positions in the second row from another PVL, but you can also take the astericked positions in the middle column to form yet a third PVL.



.*. .*. ...
... .*. .*.
... ... ...

... ... ...
... ... ...
... ... ...

... ... ...
... ... ...
... ... ...

Now you need to count each of the following four subcases, depending on whether you can form © one (as above), (d) two, (e) three, or (f) four additional PVLs from the original two PVLs.

(g) How many pairs of PVLs do not share a position and, moreover, you cannot choose a position from one and a position from the other that together form a PVL? For example:



.*. .*. ...
... ... ...
... ... ...

... ... ...
... ..* .*.
... ... ...

... ... ...
... ... ...
... ... ...

Once the sizes of each of the classes (b)–(g) is computed, you need to multiply the number in each class by the number of ways to fill in a grid so that it has actual violations at each of the PVLs in one of the elements of that class.

Finally, add the resulting numbers up and subtract the result from the upper bound derived above. This will give you the lower bound.

bonzer · May 19, 2005, 9:21pm

The argument below is pretty much an unimaginative frontal attack, but a) it gives a good upper and lower bounds and b) only one step is not exact. That step isn’t that complicated and a little more effort directed at it should yield the exact result.

As a preliminary, a word about what counts as a unique solution. In sudoku the numbers are merely labels. If we were to replace, say, 2 by 3 and vice versa throughout, we’d still have a possible solution. That also applies to the logic of solving an unfinished puzzle. Personally, I’d regard grids that can be produced from each other by such swapping of labels as equivalent. There are 9! such permutations and so any solution is equivalent to 9! others. I’ll regard all these equivalent grids as being a single unique one.

The strategy is to fill in overall grid by subgrid. How many possible allowed ways there are to fill in a particular subgrid will be constrained by the grids that are already filled in, but not by the grids that are filled in later.
We’ll start in the top-left hand corner, then fill in the two immediately next to it. Then top-right, bottom-left, middle, middle-right, bottom-middle and finally bottom-right. That breaks the problem into 4 non-trivial stages.

The first step is trivial. Because of the equivalence under permuting labels, we’re free to fill the top-left subgrid as

123
456
789

The second step is to fill the top-middle and middle-left subgrids. Consider the top-middle one. We can place the 1 in one of 6 allowed spaces, the 2 in one of five, etc. But this runs into the problem that how many spaces are available when we come to 4 depends on where we’ve placed 1-3. Arggh. The following is also messy, but gets there.The whole top row from the first square either winds up together in another row in the second or it is (somehow) split into two possible rows. If split, then the other rows from the first square are also split. Represent all this as an ijth matrix. The ith row is how many members from the i row from the top-right matrix is in the jth column. There are 4 possibilities:

003 030 012 021
300 003 201 102
030 300 120 210

In the last two of these, we have to also consider where the “lone” member finds itself: the first second or third position in the row. That’s 1+1+3[sup]2[/sup]+3[sup]2[/sup]=20 possibilities. None of this considers how we distribute the 3 members of the original row amongst its 3 new locations scattered about the new square. Total number of possibilities = 20(3!)[sup]3[/sup]=4320.

The third step is to fill the top-right and bottom-left subgrids. This step is non-trivial, but easy enough. Take the top-right square. We now have a top row that is already filled with 6 numbers. From the requirement that this row contain 1-9 when completed, we know what 3 numbers need to be in this row of the subgrid. There are 3!=6 ways of entering these numbers in this row. The total number of ways of completing this subgrid is then 6[sup]3[/sup]=216. Same for the bottom-left square.

The fourth step is the middle square and is the hardest. We’ll just find an upper bound on the result. The completed square to the left imposes a set of constraints. But this is just like finding the top-middle square. The number of possible cases is thus less than 4320, as before. The complication is the middle square is constrained by this top-middle square. Although not explcitly part of the argument about the top-middle square, we can permute the entries in each row of that and still get an allowed arrangement. The same applies to the dependence of the middle square on the middle-left one. In each row, the top-middle square at least eliminates at least two of each of these 3! permutations in each row. (Take the row with a 1 in it in some case; out of the 6 permuted cases, in two of these the 1 falls in the column where there’s a 1 in the top-middle square.) The number of possible cases is thus less than 4320/8 = 540.

The fifth step is to fill the bottom-middle and middle-right subgrids. Consider the top row of the middle-right one. As in the third step, we know what 3 numbers must go in these three spaces. But we now have the additional constraint of the completed top-right square. We can take one of these 3 numbers and place it either of the two allowed spaces.There’s then no choice about where to place the other two numbers. Given the three rows, there are thus 2[sup]3[/sup]=8 ways of completing this square. Same for bottom-middle.

The sixth step to complete the final, bottom-right square. With the rest of the grid now decided in the order above, we can always consistently complete this square, but we have no choice about how to do so.

We can summarise all this as follows:

1[sub] [/sub] N[sub]2[/sub] N[sub]3[/sub]

N[sub]2[/sub] N[sub]4[/sub] N[sub]5[/sub]

N[sub]3[/sub] N[sub]5[/sub] 1

where N[sub]2[/sub]=4320, N[sub]3[/sub]=216, N[sub]4[/sub] < 540, N[sub]5[/sub]=8. The total number of solutions is then

N = (N[sub]2[/sub])[sup]2/sup[sup]2[/sup]N[sub]4/sub[sup]2[/sup] < 3 x 10[sup]16[/sup].

Dropping the issue of equivalence (i.e. multiplying by 9!) gives the total as less than about 1 x 10[sup]22[/sup].

That’s larger than what you’re claiming for your previous upper bound, but note that the only slop here is in the value of N[sub]4[/sub], which we know has to satisfy 8 < N[sub]4[/sub] < 540. That’s all the room left. (The lower bound here follows from noticing that N[sub]4[/sub] > N[sub]5[/sub].) And fixing N[sub]4[/sub] exactly doesn’t seem that much more difficult than doing so for N[sub]2[/sub], which wasn’t too difficult a case.

Punoqllads · May 20, 2005, 7:25pm

I get a different number for N[sub]2[/sub]. Start with the matrix:

ABC
DEF
GHI

For the 3x3 matrix adjacent to it, the top row must contain 3 of the set [D E F G H I]. That means that the top row will contain 0, 1, 2, or 3 of the set [G H I].

If it contains 0, then the top row is made up of D, E, and F in some permutation
If it contains 1, then there are 3 choose 1 ([sub]3[/sub]C[sub]1[/sub]) ways to choose one of G, H, and I, and [sub]3[/sub]C[sub]2[/sub] ways to select 2 values from D, E, and F, or 9 ways total. Similarly if 2 or 3 of the bottom three values are placed in the top row there are 9 and 1 ways of choosing which values, respectively. In any case, once the threee values are chosen, there are 6 ways to permute them.

Next, any of the values G, H, and I that are not in the top row must be in the middle row. So, if there are 0 elements from [G H I] in the top row, then all three must be in the second row, so there will be 0 elements from [A B C]. If 1 of them is in the top row, there are 3 ways to select 1 element from A, B, and C. If 2, then 2 elements from A, B, and C will be in the middle row (3 ways) and if 3, then A, B, and C will be in the middle row. Again, in all cases there are 6 ways to permute the selected elements.

Finally, for the bottom row there are only 3 elements to select and 6 ways to permute them.

So, I believe that the number of ways to create a 3x3 matrix that obeys the sudoku’s constraint with respect to itself and another 3x3 matrix, there are exactly 6[sup]3[/sup] * (1 * 1 + 9 * 3 + 9 * 3 + 1 * 1) = 12096 matrices.

bonzer · May 21, 2005, 12:19pm

You’re right. In my version the mistake is in the statement:

There are actually 3 positions in three rows to consider, so the correct calculation is 1+1+3[sup]3[/sup]+3[sup]3[/sup]=56, which agrees with your result. This bumps the final answers up by about a factor of 10.

However, there’s a bigger logical problem with my argument. It’s the assertion that

This isn’t quite true. For example, the suggested method of filling in the grid allows

9 * *
8 * *
1 * *

to arise as a possible case in the top-right subgrid. But it also allows

2 3 4 5 6 7 * * *

to arise as the 4th row after the middle subgrid has been filled. This is actually an impossible case - the next number in the row has to be 1, 8 or 9, but the partially filled portion of the column doesn’t allow these.

Granted, such unallowed-for constraints merely lower the calculated upper bound. And probably not by that much; indeed, it looks to me that they can all be regarded as reducing the values of N[sub]4[/sub] and N[sub]5[/sub].

Chronos · May 22, 2005, 8:31pm

Depends on how you interpret that statement. There are partially-completed sudokus which contain no violations in the completed parts, but which are nonetheless impossible. But one could say that, when filling in subgrids, one is constrained by the existing subgrids in that one cannot fill in a subgrid in such a way as to create such a situation. In other words, any given subgrid is only constrained by previously-filled subgrids, but in a complicated way.

To add my contribution: Given any valid sudoku, I can construct another valid sudoku using any composition of the following steps:
1: Permuting rows (across the entire sudoku) within a single subgrid (as, for instance, permuting rows 4, 5, and 6)
2: Permuting columns within a single subgrid
3: Permuting the rows of subgrids (for instance, swapping the 1st, 2nd, and 3rd subgrids with the 4th, 5th, and 6th)
4: Permuting the columns of subgrids
5: Permuting the labels of the elements

There are then two questions: First, do these operations exhaust the space of valid sudokus? That is to say, if we consider sudokus related by one or more of these operations to be equivalent, are all sudokus in the same equivalence class?

Second, are these operations independant of each other? That is to say, is there any way to construct one of these operations (or a special case of it) by composition of the others?

If we assume that all sudokus can be formed by action of these operations, and that the operations are independant, then the number of sudokus is (3!)[sup]8[/sup]9! = 6.09510[sup]11[/sup]. This is far fewer than InvidiousCourgette’s Monte Carlo estimation of 6.4*10[sup]21[/sup], which leads me to suspect that I’m missing some other way to construct one sudoku from another. Anyone have any suggestions?

If we assume that the operations I’ve listed are at least independant (an assumption I’m not at all confident is justified, incidentally), then my figure of (3!)[sup]8[/sup]*9! would at least be a lower bound, and the true answer would be some integer multiple of that bound.

Omphaloskeptic · May 22, 2005, 9:50pm

Not for all sudokus; some sudokus are very symmetric. For example consider


1 2 3  5 6 4  9 7 8
4 5 6  8 9 7  3 1 2
7 8 9  2 3 1  6 4 5

5 6 4  9 7 8  1 2 3
8 9 7  3 1 2  4 5 6
2 3 1  6 4 5  7 8 9

9 7 8  1 2 3  5 6 4
3 1 2  4 5 6  8 9 7
6 4 5  7 8 9  2 3 1

This is invariant under quite a few of the permutations you list (for example the permutation which rotates all rows down 3 (e.g., row 1 to row 4, row 8 to row 2) and all columns left 3), so the answer to your second question is No.

Note too that your operations 1-4 leave invariant the row and column subsets within a block, merely permuting their order or moving the blocks around. In this sudoku, the row subsets are {1,2,3},{4,5,6},{7,8,9} and the column subsets are {1,4,7},{2,5,8},{3,6,9}, for all 3x3 subblocks; operations 1-4 do not change this. The relabeling can change the particular 3-element subsets involved, but cannot change the fact that the row subsets and column subsets are identical for all 3x3 subblocks. ticker’s example above is thus unreachable from this sudoku. So the answer to your first question is also No.

Of course, this is a very nongeneric sudoku; your conjectures may hold for some more “generic” sudokus. I suspect not, though. The arguments above show that your operations preserve some invariants (e.g., the counts of the various three-element row and column subsets, modulo labeling) which have a number of possible values in the space of sudokus.

Chronos · May 23, 2005, 2:26am

Thanks; I had a hunch that both of those conjectures were false. I’m just not experienced enough with these things to have constructed that high-symmetry counterexample. It’s also an illustration of just how hard it’s going to be to calculate the number, since there exist operations which will leave that one unchanged but which would not leave another sudoku unchanged. Sigh.

Punoqllads · July 2, 2005, 12:01am

Well, like a song trapped in my head, this question has haunted me for a while. I don’t claim to have solved it, but I think that I’ve developed some ideas that simplify the calculation. The first idea is that of using a matrix’s profile.

There’s probably some other name for it in the mathematical literature, so let me explain what I mean by “profile”. The profile of a matrix is just that matrix viewed from one side, essentially an ordered list of sets of number. For example, the 3x3 matrix:

1 2 3
4 5 6
7 8 9

has a horizontal profile of { (1, 2, 3), (4, 5, 6), (7, 8, 9) } and a vertical profile of { (1, 4, 7), (2, 5, 8), (3, 6, 9) }. The number of profiles for a matrix made up of the numbers 1 through 9 is 9 choose 3 * 6 choose 3 * 3 choose 3 = 1680. The hope is that we can break down the problem into steps where the temporary results can be described by profiles, each profile having a count of how many solutions to that point have that profile.

Note that when I write matrix below, I mean only a 3x3 grid. I hope I don’t confuse anybody by my terminology, but I’ve been thinking about this problem using these terms up to this point, and they seem natural to me.

Next we want to decrease the search space. As noted by others, we can specify that one of the 9 matrices, say the top left, can have its order explicitly set, and merely multiply the solution by 9!. We can also say that for the three middle columns, that we can easily permute their order and still have a valid answer, so we can specify a strict ordering for them and then multiply the final result by 6. We can also do this for the final 3 columns, the center 3 rows, and the final 3 rows, and then multiply the final result by 6[sup]4[/sup]. Finally, we can see that we can swap the center 3 columns with the final 3 columns and again have no loss of generality, so we can again specify a strict ordering for the two of them, as well as the center and bottom bars, and multiply the result by 4.

Another speedup is to analyze how many permutations can exist for a particular horizontal profile for a matrix when it must meet the sudoku condition vertically with respect to another matrix. It turns out that for that particular horizontal profile, there are exactly 0 or 8 valid permutations. The proof is as follows: for any row R of a matrix’s horizontal profile H, the three elements show up in either 1, 2, or 3 distinct columns of the vertically constraining profile V. If all three elements are in one column, then there is no number to place in that column and therefore there are no valid matrices for this profile, regardless of the other two rows this profile. If the elements are in two distinct columns, then two of the elements must be in one column, and one element in another. The latter element is therefore the only element that can go into the column forbidden to the first two, and then there are two permutations of the other two elements in the remaining two spaces. Finally, if all three elements are in distinct columns, then there are also two permutations allowed: where the element forbidden from column 1 is in column 2, the element forbidden from column 2 is in column 3, and the element forbidden from column 3 is in column 1; or where the element forbidden from column 1 is in column 3, the element forbidden from column 3 is in column 2, and the element forbidden from column 2 is in column 1. Since there are 3 rows, either at least one of the rows is forbidden, in which case there are 0 valid permutations, or all 3 rows are permitted, in which case exactly 2[sup]3[/sup] = 8 are permitted.

The basic algorithm is as follows: start off with the fixed matrix in the top left corner. There are 56 valid profiles for its horizontal neighbor. Each of those profiles combined with the fixed matrix’s profile specify an exact profile for the matrix in the upper right corner. We only need to concern ourselves with half of those, since for each solution we can swap the middle matrix with the right matrix and have exactly one of the other solutions. Next, for each of those matrices, there are 216 different permutations of the top, middle, and bottom rows. We can drop that down to 36 each since we can specify an exact ordering for the 3 resultant columns in each matrix. Finally, for all of those solutions we have generated, there is a 9-set vertical profile, of which the first 3 sets will always be (1, 4, 7), (2, 5, 8), and (3, 6, 9). For each profile, we would store the number of permutations that have that particular vertical profile. I have already written a program to calculate them, and there are 22266 distinct profiles.

Next, using similar math we can calculate 28 different vertical profiles to look at for the center left matrix and the 36 relevant permutations of each of those. It turns out that there will be only 225 distinct horizontal profiles for those 1008 vertical profiles; 27 with 8 vertical profiles and the remainign 198 with 4 vertical profiles. Now here, I believe that the right way to procede would be to iterate through those distinct profiles. For each one of those horizontal profiles H, we would iterate through the 22266 distinct vertical profiles V of the top blocks. For each pairing (H, V), there are 56 (note, not 28 – we’ve already exploited the search space’s symmetry) possible horizontal profiles of the center matrix C and a single counterpart R for the right matrix. If any of the rows of C are identical to the center 3 columns of V, then this combination is invalid. If any of the rows of R are identical to the right 3 columns of V, then this combination is invalid. Otherwise, there are 64 (8 * 8) valid permutations of the elements of C and R. We determine the vertical profile of these permutations along with the valid permutations of H and note that, together with V they express an inverse profile – the inverse of the elements that would be allowed in the vertical profile of the bottom bar. Similar to the creation of the 22266 vertical profiles, we will then add the count on the vertical profile V to the bottom bar profile in question. Once all of those profiles have been created and enumerated, we go through the bottom profiles. For each one of those, there are 36 permutations of the left matrix to consider. For each of those permutations there are either 0 or 8 valid permutations of the center matrix. If there are 8, then there is either 0 or 1 valid matrix for the right-hand matrix. We can iterate over all of these and count them up. If the estimate of 6 *10[sup]21[/sup] is right, then the total that we’ll be counting up will be about 3 trillion, well within the limits of 64-bit integers – remember, we’ve already divided out 9! * 6[sup]4[/sup] * 4 = 1881169920.

Obviously, the middle part is the bottleneck in the calculation. It will take about 225 * 22266 * (56 * 64) = ~18 billion steps to do, with each step taking 100-200 operations. It will need to store values for 28 * 1680[sup]2[/sup] = ~8 million different profiles for the bottom bar. I can’t really see how to simplify it any more. Those numbers are right on the threshhold of what todays computers can do in a reasonable amount of time. The nice thing about it, though, is that it’s easily parallelizable. It’s still a work in progress, so if you have anything to add or point out, feel free to say it.

Topic		Replies	Views
Sudoku - How difficult to make one? Cafe Society	2	901	November 16, 2005
Sudoku Generation Algorithm Factual Questions	34	4114	January 9, 2007
How many Sudoku puzzle solutions are there? Factual Questions	12	2922	July 18, 2007
Sudoku impasse The Game Room	8	3605	April 29, 2013
Sudoko theory The Game Room	22	1798	December 3, 2008

A tough logic problem that might interest maths/computer folks

Related topics