Lower Bounds for Comparison Sorting
Every sort we have seen runs in , and that is no accident. Modeling a sort as a decision tree of comparisons, we show any such tree must have $n!
╌╌╌╌
Mergesort, heapsort, and (in expectation) quicksort all run in . Insertion sort and the rest do worse. None does better. It is natural to ask whether some cleverer algorithm could break the barrier. The striking answer is no, not if it learns about the input only by comparing elements. This lesson proves a matching lower bound: any comparison sort needs comparisons in the worst case.1 This is one of the rare cases where we can prove a problem hard, pinning its complexity from both sides.
The comparison model
A comparison sort determines the sorted order using only comparisons between
pairs of input elements. The only questions it may ask are of the form is ?
(or , , , ). It never inspects the elements'
values; it cannot read a digit, hash a key, or use one as an array index. All
of insertion, merge, quick, and heap sort live in this model, and so does almost
every general-purpose sort.
This restriction is exactly what makes a lower bound possible. Because the only information the algorithm extracts is the outcomes of comparisons, we can account for everything it could possibly learn by counting those outcomes. The elements' actual values are irrelevant beyond their relative order, so we may assume without loss of generality that the input is some permutation of the distinct values .
A sort is a decision tree
Fix the number of elements and fix a comparison sort. We can model its entire behavior as a decision tree: a binary tree whose internal nodes are comparisons and whose leaves are answers.2
- Each internal node is labeled , meaning
compare with .
- Its two outgoing edges are the two outcomes, and . The algorithm follows the edge matching the actual data.
- Each leaf is labeled with a permutation , the sorted order the algorithm declares once it reaches that leaf.
Running the algorithm on a particular input traces a single root-to-leaf path: at each node the data picks the branch, and the leaf reached announces the ordering. The structure of the tree depends only on , not on the input; the input merely chooses which path through the fixed tree gets walked.
Here is the decision tree for sorting three elements ; each leaf names the order from smallest to largest.
Trace the input . At the root we ask , that is ; since we branch right. Next , i.e. , so sends us left. Finally , i.e. , again , landing at the leaf , meaning , which reads off as . Correct.
Counting leaves and bounding height
Two observations turn this picture into a theorem.
Every permutation needs its own leaf. A correct sort must, for each of the possible input orderings, output a different permutation to put it in order. If two distinct input orderings led to the same leaf, that leaf's single fixed output would be wrong for at least one of them. Hence the tree must have at least reachable leaves, one per permutation it might be asked to produce.3
Height equals worst-case comparisons. The number of comparisons the algorithm makes on a given input is the length of the root-to-leaf path it walks. The worst-case number of comparisons is therefore the height of the tree — the longest such path.
Now we connect the two. A binary tree of height has at most leaves (the count at most doubles each level). Combining with the leaf count above,
and taking of both ends gives the key inequality
So every comparison sort, whatever its strategy, must make at least comparisons on its worst input. It remains to see how large that is.
is
We need a lower bound on . Stirling's approximation gives the sharp estimate
from which, taking logarithms,
The leading term is , so .4
If Stirling seems like a heavy tool, an elementary argument reaches the same conclusion. Drop the smallest half of the factors in and bound each survivor below by :
The picture for : keep only the larger half of the factors (each at least ) and throw the rest away. Half the factors, each , already force .
Taking ,
Either way the height satisfies . (For intuition on tightness, mergesort's shows the bound is achieved up to constants, so exactly.)
The conclusion
Because comparisons are a lower bound on total work, no comparison-based algorithm can sort elements in worst-case time . and mergesort are therefore asymptotically optimal: they match the lower bound, and no comparison sort can beat them by more than a constant factor.
Two points are worth stressing about what this argument does and does not say.
- It is an information-theoretic bound. The algorithm must distinguish possible answers, and each comparison is a single yes/no question yielding one bit; bits are needed to identify one answer among . The lower bound counts questions, independent of any cleverness in choosing them.
- It bounds only comparison sorts. The proof leans entirely on the restriction that information arrives one comparison at a time. An algorithm that learns about elements another way, by reading their digits or using them as array indices, sits outside the model, and the floor simply does not apply to it.5 The next lesson exploits exactly this loophole to sort in linear time.
Takeaways
- A comparison sort learns only the outcomes of element comparisons; this restriction is precisely what makes a lower bound provable.
- Any comparison sort is a decision tree: internal nodes are comparisons, leaves are output orderings, and worst-case comparisons equal the tree's height.
- Correctness forces leaves; a height- binary tree has leaves, so .
- By Stirling, , so every comparison sort needs comparisons in the worst case.
- Mergesort and heapsort hit this bound and are thus asymptotically optimal; beating it requires stepping outside the comparison model.
Footnotes
- CLRS, §8.1 — Lower Bounds for Sorting — any comparison sort requires comparisons in the worst case. ↩
- Erickson, Algorithms, Ch. — Lower Bounds via Decision Trees — modeling a comparison sort as a binary decision tree whose internal nodes are comparisons and leaves are output orderings. ↩
- CLRS, §8.1 — Lower Bounds for Sorting — a correct sort's decision tree has at least reachable leaves, one per permutation. ↩
- CLRS, §8.1 — Lower Bounds for Sorting — via Stirling's approximation, , bounding the tree height below. ↩
- Erickson, Algorithms, Ch. — Lower Bounds via Decision Trees — the bound holds only for comparison sorts; algorithms reading digits or indices fall outside the model. ↩
╌╌ END ╌╌