Interesting numerical exercise with little practical value (crossword puzzles)

How would one go about determining the maximum possible number of letter indexes that would be required for a crossword puzzle?

Suppose you have a crossword puzzle grid that is 15 x 15. If you have ever worked a crossword puzzle, you know that any square that begins a word is indexed with a number, starting with the upper-leftmost square as 1, incrementing across, then continuing to increment down the rows. In addition, there are three unwritten rules (well, they’re probably written down somewhere) of American crosswords:

  1. Each square must belong to both an across and a down word
  2. No word may be shorter than three letters
  3. The puzzle must be symmetric. The most common form is symmetric with respect to the origin, although some more eccentric puzzles are symmetric w.r.t. the center column or center row.

There are 225 squares so that’s an absolute limit, but they can’t all be indexed, because they are not all the first letter of a word.

I think it would be complex to determine the puzzle configuration that would result in the maximum number of indexes, and I don’t even know how to generate all legal configurations. There are 2[sup]225[/sup] = 5.39 x 10[sup]67[/sup] possible configurations (225 squares that can each be either used or not used). If we assume symmetry w.r.t. origin then that number can be cut in half but still it’s not feasible to do this computationally, and only a small fraction of these would be a legal crossword configuration.

This question occurred to me as I was developing an Excel VBA tool for crossword construction, as I need to dimension an array to hold these numbers. It’s easy to dimension it as 225 but I am wondering as an exercise if there is a way to determine the true theoretical maximum.

There are a few more unwritten rules, for NY Times publication (and I think for most syndicates).

No more than about a dozen 3-letter words, and at most about 76 words.

But ignoring those rules - the most I can find googling for 15x15 grids is a couple of 90 word configurations.

For what it is worth, I imagine that the densest possible packing of word starts in a 15x15 grid is:



xox xox xox xox
o xoo xoo xoo o
xxo xxo xxo xxo
 o   o   o   o
xox xox xox xox
o xoo xoo xoo o
xxo xxo xxo xxo
 o   o   o   o
xox xox xox xox
o xoo xoo xoo o
xxo xxo xxo xxo
 o   o   o   o
xox xox xox xox
o xoo xoo xoo o
xoo xoo xoo xoo


Where x is an indexed square, and o is a non-index square, with, unless I’ve miscounted, 72 indexed squares.

I’d have a hard time proving that that is the densest, though.

(And after some googling, it’s clear that it is not the densest possible.)

It should be possible to restate the rules in terms of what neighbors, or near neighbors, each square is allowed to have.

Legal configurations of immediate neighbors for a white square, with X indicating a black square, and O a white square:



XXX
XOO
XOO

XXO
XOO
OOX

XXX
OOO
OOO

XXO
XOO
OOO

OOO
OOO
OOO


… and all rotations and reflections of these configurations. Did I miss any?

Edited to add: I think for these purposes, we can consider the puzzle to be surrounded by a border of black squares, no?

Oh, of course I missed some.



XXO
OOO
OOO

XOO
OOO
OOX

XXO
OOO
OOX

XXO
OOX
OOX

XXO
OOX
OOO


… and all reflections and rotations of those.

I’m sure I missed some more legal arrangements.

The rule about no words shorter than three letters is easier. If a black square has a white square as its neighbor in any cardinal direction, the next two squares in that direction must be white.

Maybe it would be better to say:

Each white square must be part of a contiguous row of at least three white squares in both the horizontal and vertical directions.

There are crosswords where a square belongs only to an across or a down word but not both. They show up occasionally in asymmetric puzzles where the squares form an image, but also in puzzles that just don’t like to conform.

So, to generate the puzzle algorithmically …
First, make a 17x17 border of black squares.
Then, starting at the upper left, you add squares to empty spaces inside the border. You can choose whether to add a black or white square according to any method you want, but every time you add a white square, you must also add enough white squares around it to satisfy the rule.

I guess that this process would not result in any illegal configurations.

Then you’re getting into nonstandard crosswords, which I assume are outside the scope of the OP’s question.

So, for example, if you start by putting a white square in the upper left corner, you must also add two more white squares immediately to the right of it, and two more immediately below it. Now you’ve got five squares determined already. But if you start with a black square, you don’t have to add any extra squares around it.

It occurs to me that we should probably add an upper limit for black squares in the outermost rows and columns. It would be at most 14, because if you don’t have at least one white square on each edge, it’s not really a 15 by 15 puzzle.

Edited to add: actually, at most 12, because of the 3-letter word rule.

In fact, just flipping through a bunch of NYT crosswords, I find that I hardly ever see one with more than 4 black squares on an edge row or column. And of course, puzzles with an unusually large number of black squares would have fewer words, and thus fewer indexed squares, making them irrelevant to what we are trying to investigate.

OK, I’m working on an algorithm, but I have some other things to do. However, I can say that I don’t think there are really all that many “standard-looking” grid designs. Definitely nowhere near 2[sup]225[/sup].

I can get 80 with the given rules, and I’m pretty sure that’s the maximum: The grid consists of 16 separate 3x3 squares. But that violates another unwritten rule, that all of the white squares in the puzzle must connect to each other.

By getting rid of a certain block (let’s use that as the term for a black square), you can both add and lose an indexed square. Others you just lose one. I can’t think of a way you would just add one. So maybe 80 is the theoretical upper limit, but I’m not sure it’s achievable.

Here’s a perfectly legal crossword with 78 indexed squares. That’s pretty damn good.

For what it’s worth, I see that the New York Times Crossword’s submission guidelines specify a maximum of 78 words (not indexes), so that grid probably wouldn’t be accepted for a NYT crossword—not that that’s really relevant to the question posed by this thread.

Right, and Tribune Media Services is about the same. But one of those Dell or Penny Press crossword books for old people would allow it. It doesn’t violate any hard rules.

I’m not sure what you’re saying here. Assuming the rules as stated, 80 is definitely achievable, because I just achieved it. And removing any black block from the grid I described would result in an unchecked square (a square that’s part of only one word), which is illegal.

First, you achieved the 3 rules specifically laid out in the OP, but he was also clearly talking about legal crossword puzzles, which yours is not.

I was trying to think if one could start with your configuration and remove blocks (not just one, of course - as you point out, that would result in a square that was in only one word), maintaining the number of indices until you got a legal construction. Since removing a block from yours at best leads to keeping the same number of indexed squares (add one, and take away one), I surmise one could not.