Sudoku Generation Algorithm

Well none of us did in the thread, but I bumped it a while later to note the Felgenhauer and Jarvis paper that concludes that the answer we were all puzzling over is 6670903752021072936960. See post 21 for a link to the paper.

Note that was counting reflected or rotated grids as different. Off the top of my head, regarding them as equivalent reduces that number by a factor of 4! = 24.
More significantly, however, that thread was only concerned with the number of different possible completed grids. The number of different possible puzzles is presumably significantly larger and I suspect that question is unanswered.

This physicist (and this sort of thing crops up all the time in theoretical physics) just thinks of that as “permuting the labels”. Makes a difference of 9! in the final answer.

If so then your Palm probably just isn’t very well coded. If a human can solve a puzzle in a reasonable amount of time, a computer can do it instantaneously. But that’s can not will. A poorly coded application can take centuries to calculate anything.

Figure that the thing which will be the most computationally expensive will be to permutate through random possibilities to see if any cells can be determined. But, permutating through more test-cells than a human could do is worthless. So the limit to have much trial-and-error calculation which will need to be done is limitted to what a human can do in their head. And a human is only going to need to try a trial-and-error approach where there is two or three possible answers in a cell, so the amount of locations which will be tried is limitted (and doing it smart you can invalidate tests on a certain number in a related square so you don’t test it as well.) Overall you might have to do a good bit of work, but doing it smart you should be able to do it all pretty instantaneously, even given the CPU of a Palm. Even they can do a few million calculations a second.

Maybe. But the problem with that is that you then have no idea how hard the puzzle is. You could create a bunch of random puzzles in the background and then go back with different solvers to determine the difficulty of each and store that with them for later retrieval, but this could just as well lead to a case where the user has to wait a good moment while random puzzles are created and solved looking for one of the desired difficulty.

But if your solver takes guesses to generate the solution, it doesn’t have any (good) way of knowing that there is a unique solution, and in the creation step, knowing that a partially filled grid has a unique solution is critical, isn’t it?

Which is the case on my Palm, and how I assumed that all puzzle generators work. They come up with a puzzle which has a unique solution (and taking away more given cells has multiple solutions). My Palm doesn’t rate them, but I assumed that other generators, like the ones used in the newspaper and the websudoku site, generate them ahead of time and then use their solvers to rank the difficulty. Is that not how they do it?

A puzzle without a unique solution is one which isn’t solvable. Until the puzzle becomes solvable, it will just keep uncovering cells.

It’s a way to do it. But if you only want a certain level of difficulty, it’s easier to just create puzzles of that kind. There’s always multiple ways to do something.

My, isn’t this an old thread?

I haven’t actually done it, but I think that writing a sudoku solver that runs “instantly” is fairly trivial; it seems a straightforward constraint propagation algorithm, on a relatively small number of squares/moves. It’s the sort of thing I’d assign as a final project for a first or second year programming class (and then probably have to eat my words as some student demonstrates that the problem is NP-complete, or somesuch).

My question, back in 2005 when the dinosaurs roamed the earth, was about generators. I realize many people are using “solvers” as steps in generation, but I just wanted to remind folks of the original point (in case this thread actually remains open after a mod notices its creation date). And the algorithm should come up with all possible puzzles as possible outputs (preferably, equally likely).

My current thinking is that you start by generating a completed board, by any straightforward algorithm (and I consider the symbols on the board arbitrary, so any rotation or complete substitution of one digit with another, however many “substitutions” are involved, shouldn’t count as an additional board). The question then becomes how you take away squares. And using my awe-inspring OP powers, I’m removing the supposed constraint that the “disclosed” squares be symmetric in any way – most of the (mostly machine generated) sudoku’s I’ve seen don’t seem to honor this – only human-generated ones do.

If you remove the restriction that the grid be 3 x 3, then Sudoku is NP-complete.

Hey TimeWinder. I’m the one who revived it, on the theory that reviving a GQ was less objectionable than reviving a GD. If the mods disagree, my apologies. Anyhoo, did you notice that I linked to a generator?

If your solver uses guesses (permutations as you put it) to solve, then it will easily find one of those solutions. And when it does that, it will have no way of knowing whether that was the only solution.

If your solver uses logic alone, then a puzzle with multiple solutions won’t be solvable, but then you’re back to that problem that if your generator is dependent on your solver, then it will generate only puzzles that your solver has been programmed with the techniques to solve. It’s easy to imagine a nearly brain-dead solver algorithm being used to generate the puzzles, and this will only generate ones that are super-simple to solve.

Permutating through possibilities can only disprove cells as being a certain number. If you try a certain value in one square and work off it X-steps ahead and it creates an impossible state somewhere down the line, you know that the original square can’t be that value–thus you erase it from the possibles.

Those techniques will be the same ones as humans use–hence the ability to vary the difficulty. That’s the whole point. The goal isn’t just to find a single solution, it’s to find a single solution based on a certain starting point, utilizing a certain subset of techniques.

Sweet, thanks. Consider it “Wikipediaed”.

Incidentally, people spend a lot of time on this sort of thing. I’m here at the Joint Mathematics Meetings, where the Mathematical Association of America is running a special session on Sudoku puzzles (“Sudoki”?).

Yes, professional mathematicians investigating this. And there are a couple dozen talks or so.

I do hope you didn’t include that along with rotations and reflections. You’d be dividing out by far too much in that case.

Nope.

OK, thanks for explaining that. I thought that when you were talking about permuting, you were looking for guesses that would allow you to solve the puzzle.

I still think that the other way is better, since it can potentially generate puzzles that you haven’t yet figured out a way to solve. And what I was mentioning depends on this test I mentioned, which I read about one time somewhere, that can tell you quickly whether a puzzle has zero, one, or more solutions. I’ll look for that again, but it’s been a long time since I read that. I do recall specifically the mention of Don Knuth in that article.

By the way - the way that I solve puzzles that are printed in my paper on Saturdays, which are all six-stars-out-of-five difficulty, is with a technique that I came up with, and I wonder if it has a name. Let’s say that cell A, somewhere in the grid, has only the possibilities 5 and 7. And that cell B, at some other arbitrary location, has possibilities 5, 6, and 9. I will first assume that A is 5, and work my way to see what that makes B. Then I assume that A is 7, and see what that makes B. For example, say that when A is 5, B is 6, and when A is 7, B is 5. That means that B can never be 9, so I get to eliminate that. This technique usually allows me to solve the six-star puzzles in my paper. Is there a name for it?

If you’re talking about a Knuth algorithm, you’re probably talking about “Dancing Links” (a.k.a. “Algorithm X”), Knuth’s algorithm for solving EXACT COVER. Note that this is an NP-complete problem; but Sudoku is small enough that even a brute-force solution, if implemented fairly efficiently, can quickly solve the problem. The Dancing Links algorithm actually provides the solution (not just determining solvability), though.

Note that because the problem is in NP, there’s not much difference between just determining whether a solution exists and actually solving the puzzle—that is, finding the solution is polynomially reducible to determining the existence of a solution. (Just guess at a value for a cell and then see if a solution exists. If not, erase your guess. Repeat.)