Lesson 2.21,723 words

Quicksort

Quicksort sorts in place by partitioning around a pivot and recursing on each side. We give Lomuto and Hoare partitioning with a correctness invariant, see why a bad pivot costs Θ(n2)\Theta(n^2) while a balanced one gives Θ(nlogn)\Theta(n\log n), and prove that randomizing the pivot makes the expected cost Θ(nlogn)\Theta(n\log n) on every input.

╌╌╌╌

Mergesort does its hard work in the combine step: splitting is trivial, merging is where the sorting happens. flips this around. It does its hard work in the divide step, partitioning the array so that everything small comes before everything large, after which the combine step is empty. Sort the two parts in place and the whole array is sorted, with no merging required.1

The paradigm applied

To sort :

  • Divide. Choose a pivot element and partition into two regions: a left part whose elements are all the pivot, and a right part whose elements are all the pivot, with the pivot itself in between at some index .
  • Conquer. Recursively sort and .
  • Combine. Nothing to do: the subarrays are already in place and in order relative to each other.
Algorithm 1:Quicksort(A,p,r)\textsc{Quicksort}(A, p, r) — sort A[p..r]A[p..r] in place
  1. 1
    if p<rp < r then
  2. 2
    qq \gets call Partition(A,p,r)\textsc{Partition}(A, p, r)
    pivot at final index q
  3. 3
    call Quicksort(A,p,q1)\textsc{Quicksort}(A, p, q - 1)
    everything \le pivot
  4. 4
    call Quicksort(A,q+1,r)\textsc{Quicksort}(A, q + 1, r)
    everything \ge pivot

Everything hinges on .

Partitioning around a pivot

Partition rearranges an array around a pivot so that it falls into three contiguous regions: everything less than the pivot, then the pivot itself sitting at its final sorted index , then everything greater. Sorting around a chosen pivot value rearranges it into

Partition splits the array into elements less than the pivot, the pivot at index , then greater elements.

No element to the left of exceeds the pivot, and none to the right is smaller, so the pivot is already in its final position. The two recursive sorts never have to look across the boundary at , which is exactly why the combine step vanishes. The whole game of quicksort is to make this split, and to make it balanced.

Lomuto partition

The simplest scheme, due to Lomuto, takes the last element as the pivot and sweeps an index across the array, maintaining a boundary between the elements known to be the pivot and those known to be it.2

Algorithm 2:Partition(A,p,r)\textsc{Partition}(A, p, r) — Lomuto scheme, pivot =A[r]= A[r]
  1. 1
    xA[r]x \gets A[r]
    the pivot
  2. 2
    ip1i \gets p - 1
    end of x\le x region
  3. 3
    for jpj \gets p to r1r - 1 do
  4. 4
    if A[j]xA[j] \le x then
  5. 5
    ii+1i \gets i + 1
  6. 6
    exchange A[i]A[i] with A[j]A[j]
  7. 7
    exchange A[i+1]A[i + 1] with A[r]A[r]
    pivot into its slot
  8. 8
    return i+1i + 1

Correctness of partition

Partition is correct by a four-region loop invariant. At the start of each iteration of the for loop, for any array index:

  • if then ;
  • if then ;
  • .

In words: everything up to is small, everything from to is large, the slice from onward is unexamined, and the pivot waits at the end.

  • Initialization. Before the first iteration and , so both regions (1) and (2) are empty. The invariant holds vacuously, and .
  • Maintenance. If , only advances, extending the large region (2), which stays valid. If , we increment and swap with : the newly small element joins region (1), and the large element that was at moves to the back of region (2). Both regions stay correct.
  • Termination. The loop ends with . Regions (1) and (2) together cover . The final swap places the pivot at index , with all smaller elements to its left and all larger to its right: exactly the postcondition, and is the pivot's final position .

Partition does comparisons on elements.

A snapshot of the sweep makes the four regions concrete. On with pivot , just after the scan reaches the boundary has collected the small elements on the left, the large ones trail behind, and the rest is still unexamined:

Lomuto sweep on at : the region (boundary ) precedes the region, with pivot parked at the end.

Hoare partition

Hoare's original scheme uses two indices that march toward each other from the ends, swapping out-of-place pairs as they meet. It does fewer swaps on average than Lomuto and handles arrays with many duplicate keys more gracefully, at the cost of a subtler invariant (the returned index splits the array but is not necessarily the pivot's final resting place).

The two pointers and start outside the array and walk inward: stops at the first element that does not belong on the left, at the first that does not belong on the right, and the pair is swapped. When the pointers cross, marks the boundary.

Hoare partition with pivot : scans right past small elements, scans left past large ones, and the out-of-order pair is swapped before both resume marching inward.
Algorithm 3:Hoare-Partition(A,p,r)\textsc{Hoare-Partition}(A, p, r) — pivot =A[p]= A[p]
  1. 1
    xA[p]x \gets A[p]
    the pivot
  2. 2
    ip1i \gets p - 1
  3. 3
    jr+1j \gets r + 1
  4. 4
    repeat
  5. 5
    repeat jj1j \gets j - 1 until A[j]xA[j] \le x
    scan down from right
  6. 6
    repeat ii+1i \gets i + 1 until A[i]xA[i] \ge x
    scan up from left
  7. 7
    if i<ji < j then
  8. 8
    exchange A[i]A[i] with A[j]A[j]
  9. 9
    else
  10. 10
    return jj
    split point: A[p..j]A[j+1..r]A[p..j] \le A[j+1..r]

With Hoare partition the recursive calls become and , since is a boundary rather than the pivot's final index. Either scheme yields a correct, in-place quicksort; the difference lies purely in the constants and in robustness to duplicates.

Worst case versus best case

Partition is always , so quicksort's total cost is governed entirely by how balanced the splits are.

Worst case. Suppose every partition is maximally lopsided, with one side empty and the other holding elements. This happens, for Lomuto with a last-element pivot, on an array that is already sorted (or reverse sorted). The recurrence is

The recursion tree degenerates into a path of depth , with work at the top shrinking by one at each level: . Quicksort's worst case is no better than insertion sort.

The reason a sorted array is the villain for Lomuto is concrete: the last-element pivot is then the largest value, so the whole scan stays and the partition peels off just the pivot, leaving an empty right side and an -element left side to do it all again.

On a sorted array Lomuto's pivot is the maximum, so every element is : partition strips off one element and recurses on the other , the maximally lopsided split.

Best case. If every partition splits evenly, the recurrence is the mergesort recurrence,

The happy surprise. Balance need not be perfect. Even a fixed -to- split gives

because the recursion tree still has only levels (the longest root-to-leaf path shrinks by a factor of each step) and each level does work. Any split by a constant fraction yields . Only splits that are lopsided by a constant number of elements, like the worst case, push us to quadratic.

Balanced splits keep the tree deep; a constant-size lopsided split stretches it into a depth- path costing .

The shrinking-recurrence lemma

This constant-fraction is enough intuition packages into a clean lemma that we will reuse for linear-time selection. It says that as long as the recursive subproblems together are a constant fraction smaller than the original, linear work at each level collapses to linear work overall.

The proof is a tidy induction that pins the hidden constant exactly. Claim: for all , where

The constant is positive precisely because ; that is where the shrinkage is spent.

  • Base case (). .
  • Inductive step (). Assuming the bound for all smaller arguments,

where the last inequality uses , which is exactly how was chosen. So .

For balanced quicksort the per-level work grows a logarithmic number of times rather than collapsing, since the splits sum to the whole array (, the boundary case the lemma deliberately excludes), which is why quicksort is and not . The lemma's strict inequality is what separates recurse on both halves (sorting) from throw away a constant fraction and recurse on one piece (selection).

Why randomization helps

The danger is a pivot rule that an adversary, or merely unlucky real-world data, can drive into the worst case. The fix is to randomize: choose the pivot uniformly at random from (equivalently, swap a random element to the end before running Lomuto partition).

Algorithm 4:Randomized-Partition(A,p,r)\textsc{Randomized-Partition}(A, p, r)
  1. 1
    kk \gets a uniformly random integer in [p,r][p, r]
  2. 2
    exchange A[k]A[k] with A[r]A[r]
    randomize pivot, reuse Lomuto
  3. 3
    return call Partition(A,p,r)\textsc{Partition}(A, p, r)

Now no particular input is bad: the coin flips, not the input order, decide the split. The worst case still exists in principle (every random choice could be unlucky), but its probability is vanishingly small, and we can prove the expected running time is on every input.3

The expected-comparisons argument

Let the sorted order of the elements be , and let . Two elements are compared at most once over the whole run, since comparisons only ever happen against a pivot, and a pivot is removed from future partitions. Define the indicator . The total comparison count is , so by linearity of expectation

The combinatorial fact that matters: and are compared iff the first pivot chosen from the range is either or . If instead some middle element (with ) is picked first, it splits and into different subarrays and they never meet.

Endpoints are compared only when the first pivot drawn from is one of them; any middle pivot separates them forever.

Since the first pivot drawn from the elements of is equally likely to be any of them,

Substituting and reindexing with ,

using the harmonic-number bound . So randomized quicksort makes comparisons in expectation, regardless of the input arrangement.

Quicksort versus mergesort

QuicksortMergesort
Worst case
Expected / average
Extra space (stack)
In placeyesno
Stablenoyes
Constantssmall (cache-friendly)larger

In practice quicksort is usually the fastest comparison sort on arrays in memory: it works in place, has tight inner loops, and accesses memory sequentially, which the cache loves.4 Its weaknesses are the worst case (tamed by randomization or median-of-three pivoting) and instability. Mergesort wins when you need a worst-case guarantee, stability, or are sorting linked lists or data too large for memory. A common engineering compromise, introsort, runs quicksort but switches to heapsort once the recursion depth exceeds , capturing quicksort's speed with a worst-case ceiling.

Takeaways

  • front-loads the work into ; once the array is partitioned around a pivot into , the pivot is in its final slot and the recursive sorts need no combine step.
  • (single sweep, pivot at the end) and (two converging indices) are both correct via partition loop invariants; Hoare does fewer swaps and handles duplicates better.
  • Cost is set by split balance: lopsided-by-a-constant gives , but any constant-fraction split gives . The shrinking-recurrence lemma () makes constant fraction is enough rigorous and powers linear-time selection next door.
  • Randomizing the pivot makes the expected cost on every input; the proof counts each pair's chance of being compared.
  • Quicksort is typically the fastest in-memory sort; mergesort wins on worst-case guarantees, stability, and external data.

Footnotes

  1. Erickson, Algorithms, Ch. 1 — Recursion: quicksort as divide-and-conquer that front-loads the work into partitioning, leaving an empty combine step.
  2. CLRS, Ch. 7 — Quicksort: the Lomuto single-sweep partition scheme and its four-region loop invariant.
  3. CLRS, Ch. 7 — Quicksort: randomized quicksort and the proof that its expected comparison count is on every input.
  4. Skiena, The Algorithm Design Manual, §4 — Sorting and Searching: quicksort as the fastest in-memory comparison sort in practice, and the role of pivot selection.
Practice

╌╌ END ╌╌