Backtracking & Search/Constraint Search: N-Queens & Sudoku

Lesson 9.21,694 words

Constraint Search: N-Queens & Sudoku

Many hard puzzles are constraint satisfaction problems: assign each variable a value from its domain so that every constraint holds. Backtracking solves them by assigning variables one at a time and rejecting a partial assignment the instant a constraint breaks.

╌╌╌╌

The previous lesson built backtracking as a general tool: explore the tree of partial solutions depth-first, extend a partial solution one choice at a time, and abandon (backtrack) the moment the partial solution cannot possibly be completed. This lesson specializes that machinery to its most natural habitat, the constraint satisfaction problem (CSP), and asks the question that decides whether the search returns in milliseconds or never: how cheaply, and how early, can we detect that a partial assignment is doomed?

A CSP is three things:1

  • a set of variables ;
  • for each variable a domain of values it may take;
  • a set of constraints, each forbidding certain combinations of values on some subset of the variables.

A solution assigns every variable a value from its domain so that all constraints hold. Backtracking treats the variables as levels of a search tree: at level we try each value in , check the constraints that involve and the already-assigned variables, and recurse only if none is violated. The single most important design decision is to check constraints incrementally, the moment we place , not after a full assignment, so that a violated constraint prunes an entire subtree of would-be assignments at once. A cheap, early constraint check plus a smart variable ordering is what makes exponential search terminate.

N-Queens: the archetype

Place queens on an board so that no two attack each other. A queen attacks along its row, its column, and both diagonals. The first pruning insight is structural: since no two queens may share a row, place exactly one queen per row and let the variable be the column of the queen in row . That choice bakes the row constraint into the encoding: we never even consider two-in-a-row.

What remains is to check, when placing a queen at , that it shares no column and no diagonal with an earlier queen. The trick is to notice that the two diagonal families have constant labels. Every square on a diagonal has the same value of ; every square on a diagonal has the same value of . So three boolean sets, one for occupied columns, one for occupied diagonals, one for occupied diagonals, give an conflict check and an update.2

conflict check via column & diagonal sets (, ); the conflict is highlighted

The shaded squares are exactly those attacked by the placed queen: its column, its row, and its two diagonals. The marked square (, in accent) lies on that queen's diagonal: it has the same value, so the lookup rejects it in and the column is pruned before we ever descend to the next row.

Algorithm:Queens(r)\textsc{Queens}(r) — place one queen per row using column \& diagonal sets
  1. 1
    if r=nr = n then
  2. 2
    record a solution; return
  3. 3
    for c0c \gets 0 to n1n-1 do
  4. 4
    if ccolsc \in cols or (r+c)diag+(r+c) \in diag_{+} or (rc)diag(r-c) \in diag_{-} then
  5. 5
    continue
    conflict — prune
  6. 6
    add cc to colscols; add r+cr+c to diag+diag_{+}; add rcr-c to diagdiag_{-}
  7. 7
    place[r]cplace[r] \gets c
  8. 8
    Queens(r+1)\textsc{Queens}(r+1)
  9. 9
    remove cc from colscols, r+cr+c from diag+diag_{+}, rcr-c from diagdiag_{-}
    undo

Each node does work over the columns with per column, and the recursion is deep. The number of solutions grows fast and irregularly ( for , for , for , for ) with no closed form; counting them is exactly N-Queens II. The board's symmetries (rotations and reflections form a group of order ) let a solver explore only a fundamental region and multiply, a standard constant-factor speedup. The asymptotic cost is still exponential in the worst case; the constraint sets buy a large constant factor and prune most of the tree, but they do not change the complexity class, and no fast exact algorithm is known.

The whole search fits in one picture for . Each level commits the next row's queen to a column, and a clash with the column or a diagonal set prunes the branch immediately. Two of the four opening columns ( and ) die in conflicts before the board fills; the other two each lead, by a single surviving path, to one of the board's two solutions, and .

The complete search tree drawn as partial boards — each level places the next row's queen; a conflict prunes the branch (), and only two paths fill the board (in acc)

Sudoku: propagation and the most-constrained variable

A Sudoku is a CSP with variables (the cells), each with domain , and constraints that the nine cells of every row, column, and box are all distinct. Naive backtracking already works: pick an empty cell, try each digit consistent with its row, column, and box, recurse, and undo on failure. As with queens, keep a boolean set per row, per column, and per box so the consistency check and update are .

But naive ordering is slow, and two ideas turn it sharp:

  • Constraint propagation. Before branching, repeatedly fill every cell whose candidate set has collapsed to a single value (a naked single), and remove that value from its peers' candidate sets. One forced fill often triggers a cascade, solving easy puzzles with no search at all and shrinking the tree dramatically on hard ones.
  • Most-constrained-variable (MRV) heuristic. When you must branch, branch on the empty cell with the fewest remaining candidates. Branching on a cell with two options instead of nine cuts the fan-out where it matters, and it fails fast: a cell that has been narrowed to zero candidates is discovered immediately, pruning that branch at the top instead of after a deep fruitless descent. MRV is the single biggest practical speedup for Sudoku.
Propagation cascade: filling a naked single () strikes from a peer, collapsing it to a new naked single () — one forced fill triggers the next
MRV branches on the empty cell with the smallest candidate set (, in acc), minimizing fan-out and failing fast

Branching on the two-candidate cell forks the search only two ways instead of four or nine; and if propagation later narrows a cell to , MRV reaches it first and prunes that branch at the top.

Together these collapse a search that is hopeless under naive row-major ordering into one that finishes quickly on every newspaper puzzle. The algorithm (backtracking) is unchanged; the ordering and the propagation are what make it tractable.

Graph -coloring: the same skeleton again

Given a graph and colors, assign a color to each vertex so that adjacent vertices differ. This is a CSP whose variables are vertices, whose domains are the colors, and whose constraints are one inequality per edge. Backtracking colors vertices in some order, trying each color not already used by an assigned neighbor and backtracking when a vertex has no legal color: exactly the queens/Sudoku skeleton with a different constraint. Deciding whether a -coloring exists is NP-complete, so we again expect exponential worst case. The same speedups (order vertices by degree, a form of MRV, and propagate forced colors) are what make real instances solvable.3

-coloring as a CSP: vertex sees neighbors using all three colors , so its domain is empty — no legal color, backtrack

General CSP speedups

The techniques above are instances of a small, reusable toolkit. They share one goal: discover failure as early and as cheaply as possible, so that the search never descends into a subtree that cannot contain a solution.

Concretely, forward checking acts on the domains themselves: assigning a value strikes it from every neighbor's remaining choices, and a domain that empties is the signal to backtrack.

Forward checking after : is struck from each neighbor's domain; collapses to , so the branch is dead before descending

The figure below shows forward checking pruning a branch the instant a choice empties a neighbor's domain, long before a constraint check on a complete assignment would have caught it.

forward checking prunes dead branches before descending; the surviving path is in accent

Setting leaves a neighbor with an empty domain, so forward checking cuts that branch immediately (); only keeps every neighbor non-empty, and the search descends along the accented path.

Word Search and Palindrome Partitioning

The same constraint-pruned backtracking drives grid and string puzzles where the constraint is a property of the partial path rather than a global relation.

  • Word Search asks whether a word can be traced through a grid by moving to adjacent cells without reusing a cell. The variables are the successive characters; the domain at each step is the four neighbors; the constraints are matches the next letter and not already visited. Mark a cell visited before recursing and unmark it on backtrack (the canonical make-move/undo-move pair), and prune the instant a neighbor's letter mismatches.
  • Palindrome Partitioning cuts a string into substrings that are all palindromes. The choice at each position is where to make the next cut; the constraint is that the prefix you cut off is a palindrome, checked before you recurse on the rest. Rejecting a non-palindromic prefix prunes every partition that would have started with it: early constraint checking, exactly as in the CSPs above.

Takeaways

  • A constraint satisfaction problem is variables + domains + constraints; backtracking assigns variables one at a time, checks constraints incrementally, and abandons a partial assignment the moment a constraint breaks, pruning an entire subtree at once.
  • N-Queens places one queen per row and tests safety with three boolean sets (columns, diagonals, diagonals) for an conflict check; solution counts grow fast and irregularly with no closed form.
  • Sudoku combines row/column/box consistency with constraint propagation (fill forced cells, narrow candidates) and the MRV heuristic (branch on the cell with the fewest candidates), the big practical speedup.
  • Graph -coloring is the same skeleton: one inequality constraint per edge, solved by the same ordering-and-propagation toolkit.
  • Forward checking, MRV + least-constraining-value ordering, and arc consistency / AC-3 all serve one end, detecting failure as early and as cheaply as possible, which is what lets exponential search finish.
  • Word Search and Palindrome Partitioning are grid/string backtracking with path constraints (visited cells; palindromic prefixes), pruned by the same early-check principle.

Footnotes

  1. Erickson, Ch. — Backtracking: CSPs as variables/domains/constraints solved by recursive, incrementally-checked assignment.
  2. Skiena, § — Combinatorial Search: N-Queens with column and diagonal occupancy sets for pruning, and pruning as the core of practical backtracking.
  3. Skiena, § — Combinatorial Search: graph coloring as a CSP and the role of vertex ordering and propagation.
Practice

╌╌ END ╌╌