Backtracking: Subsets, Permutations & Combinations
Backtracking builds a solution one choice at a time and abandons a partial solution the moment it cannot be completed, exploring a state-space tree by depth-first search. We meet the universal choose/explore/un-choose template, derive the canonical enumerations — subsets (), permutations ($n!
╌╌╌╌
Dynamic programming, which closed the previous module, works when a problem has overlapping subproblems we can tabulate. But a great many problems ask us instead to enumerate or search a combinatorial space: list every subset, every permutation, every way to place eight queens, every assignment satisfying a formula. These spaces are exponential, so we cannot afford to materialize them, yet we can walk them cleverly. Backtracking is the disciplined depth-first walk of such a space: it builds a candidate solution one decision at a time and, crucially, abandons a partial candidate the instant it proves it cannot be extended to a valid complete one.1 This act of abandonment, the backtrack, is what separates a directed search from blind brute force.
This lesson opens the module by establishing the paradigm and instantiating it on the three enumerations every later technique builds on (subsets, permutations, and combinations) together with the two ideas that make backtracking fast in practice: deduplication of equal choices and pruning of infeasible subtrees.
The paradigm: choose, explore, un-choose
Think of a solution as a sequence of decisions. At each step we have a partial solution and a set of valid choices to extend it. We pick one choice, recurse to extend further, and then undo that choice before trying the next one. The recursion thus traverses a state-space tree (also called a decision tree or choice tree): the root is the empty partial solution, each edge is one choice, each node is the partial solution accumulated so far, and the leaves are complete candidates.2 The walk is depth-first.
The un-choose step is what lets a single mutable buffer serve the entire tree: by the time control returns from a child, the buffer is byte-for-byte what it was before we descended, so the next sibling starts from a clean slate.
- 1if then
- 2record a copy of
- 3return
- 4for each do
- 5if then
- 6continueprune
- 7choose
- 8explore
- 9un-choose
Everything in this lesson is a specialization of this skeleton: the four primitives , , , and the choose/undo pair are all we ever change.
Subsets: the power set in
The cleanest decision tree is the power set of . Each element faces one binary decision, in the subset or out, so the tree is a perfect binary tree of depth , and its leaves are exactly the subsets. This is the include–exclude recursion:
- 1if then
- 2record a copy of
- 3return
- 4exclude
- 5include
- 6explore
- 7un-choose
An equivalent and often handier formulation uses a start index so that every node, not only the leaves, is a valid subset. We loop over choices , and each recursion only ever looks forward from the chosen index, which guarantees we generate each subset once, in lexicographic order of indices:
- 1record a copy ofevery node is a subset
- 2for to do
- 3choose
- 4forward only
- 5un-choose
Advancing the start index to in the recursive call is what stops the same subset from being generated twice, once as -then- and once as -then-: an element earlier in the array is never chosen after a later one. Both formulations do recursive calls and spend to copy each emitted subset, for total, unavoidable, since the output itself has that size.
Permutations: orderings
A permutation uses every element, so completeness is all chosen,
and the
valid choices at each step are the elements not yet used. We track membership
with a boolean used[] array; the branching factor shrinks from at the root
to , then , giving exactly leaves.
- 1if then
- 2record a copy of
- 3return
- 4for to do
- 5if then continuenot a valid choice
- 6;choose
- 7explore
- 8;un-choose
An alternative avoids the auxiliary array by swapping in place: to permute , swap each candidate into position , recurse on , then swap it back: the swap-back is the un-choose. Either way the work is , dominated by emitting permutations of length .
Each root-to-leaf path consumes every element exactly once; the accented path builds the permutation . The fan-out is at the root, at depth one, and at depth two, so the leaf count is the falling product .
Combinations: with a start index
A combination is a subset of a fixed size where order does not matter.
We want but not also , so we reuse the start-index trick
from subsets, which generates each combination exactly once by only ever choosing
forward. Completeness is now we have collected elements.
- 1if then
- 2record a copy of
- 3return
- 4for to do
- 5choose
- 6forward only
- 7un-choose
The start index does the combinatorial bookkeeping: because the chosen indices strictly increase along any root-to-leaf path, each -subset corresponds to exactly one path, and we enumerate all of them with no duplicates. This is the same idea as the start-index subset enumerator: a combination is simply a subset enumeration cut off at depth .
Handling duplicates: skip equal siblings
When the input multiset contains repeated values, say , the naive enumerator emits the same combination twice, because the two s are distinguishable by position but identical in value. The standard fix is to sort the array, then at each level skip a choice equal to the one just tried at the same depth.
The condition carries the logic. The first occurrence of a value at a given level (when ) is always allowed; we only skip subsequent equal values at the same depth. Why does this dedup correctly? Two sibling branches that choose equal values would root identical subtrees, the same set of completions, so keeping only the first sibling drops the duplicate subtrees while losing no distinct solution. Note we skip equal siblings, not equal ancestors: choosing then descending and choosing is legitimate (it uses both copies), and there so the guard does not fire.
At the root, repeats sibling (here ), so its whole subtree is pruned as a duplicate. Inside the branch the same guard fires, killing . But descending from to is ancestor reuse, not a sibling repeat — there , the guard stays silent, and both copies of are legitimately used.
This same machinery distinguishes two classic problems. In Combination Sum,
each number may be reused unboundedly, so after choosing we recurse with
start = i (do not advance); staying on the same element keeps it available.
In Combination Sum II, each input number may be used at most once and
duplicates exist, so we recurse with start = i+1 (advance) and apply the
equal-sibling skip above to avoid duplicate combinations.
Pruning: kill infeasible subtrees early
Everything so far enumerates all of a space. Backtracking earns its keep when the problem constrains the answer, because then we can prune: refuse to descend into a subtree the moment we can prove it contains no solution. Pruning prunes whole subtrees, so a single early cut can save exponentially many leaves.
Take Combination Sum with a target over sorted positive numbers. Two
prunes apply at the point we consider extending partial (current sum ) by
:
- Over-target cut. If , this choice overshoots; and since the
array is sorted ascending, every later overshoots too, so we
breakout of the entire loop, not merelycontinue. - Reachability cut (a general lower-bound prune). If even the smallest remaining additions cannot reach , abandon the branch.
Each branch reaching (the dashed nodes) is dropped without
descending further; the blue path is the live solution
. Pruning does the heavy lifting: a search that looks exponential can
run in milliseconds when the prune severs the bulk of the tree, while a poorly
pruned search of the same size is hopeless. This is why Skiena frames practical
combinatorial search as the art of pruning.
2
Complexity: the size of the explored tree
There is no single formula for backtracking complexity
; the running time is
simply the size of the state-space tree we actually explore, times the work
per node. For a full enumeration this is the output size:
for subsets, for permutations, for
combinations. More generally the cost is , and for problems whose answers we emit, it is bounded below by
: we cannot beat the cost
of writing the output.
The lever is always the same: a tighter predicate cuts subtrees nearer the root, and the savings compound exponentially with the depth of the cut.
Takeaways
- Backtracking is depth-first search of a state-space tree: build a partial solution incrementally, and abandon any branch that cannot reach a valid complete solution.
- The universal template is choose / explore / un-choose: a single mutable buffer suffices because the undo step restores it before each sibling is tried.
- Subsets () come from an include/exclude binary recursion or a
forward-only start index; permutations () from a
used[]array or in-place swapping; combinations () from a start index that forces increasing choices so each set is generated once. - Duplicates are handled by sorting and skipping equal siblings at the same
depth (
if i > start and a[i] == a[i-1] continue), which drops identical subtrees without losing distinct solutions; reuse-allowed variants keepstart = iinstead of advancing. - Pruning, cutting infeasible or non-improving subtrees early (sum exceeds target, bound can't beat the best), is what separates exponential-but-fast from hopeless; one early cut saves an exponential subtree.
- Running time is the size of the explored tree times per-node work: exponential in the worst case, but tamed to practicality by good pruning on typical inputs.
Footnotes
- Erickson, Ch. — Backtracking: incrementally constructing a solution and abandoning partial candidates that cannot be completed. ↩
- Skiena, §7 — Combinatorial Search and Heuristic Methods: the state-space search tree and pruning as the core of efficient exhaustive search. ↩ ↩2
- CLRS, Ch. — Exhaustive Search: backtracking explores only feasible extensions, exponential in the worst case but far smaller in practice. ↩
╌╌ END ╌╌