Linear-Time Selection

How do you find the median of $n$ numbers? The obvious answer, sorting them and reading off the middle, costs $Θ (n log n)$ . But the median, and more generally the $k$ -th smallest element, can be found in linear time.¹ The insight is that selection requires less than sorting: we want one element's value, not the full ordering, and we can stop as soon as we have it.

The selection problem

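Given $n$ numbers and a rank $k$ with $1 \leq k \leq n$ , the selection problem asks for the $k$ -th smallest element, also called the $k$ -th order statistic. Special cases: $k = 1$ is the minimum, $k = n$ the maximum, and $k = ⌊ (n + 1) /2 ⌋$ the median. The minimum and the maximum are each easy in $n - 1$ comparisons. The median is the interesting case, and the algorithms below solve it as a byproduct of solving general selection.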

Minimum and maximum together

The minimum alone costs $n - 1$ comparisons, and no algorithm can do better: every element except the eventual winner must lose at least one comparison, so $n - 1$ losses are unavoidable. Finding both the minimum and the maximum naively costs $2 (n - 1)$ — run the minimum scan and the maximum scan separately. But you can do it in about $3 n /2$ by processing elements in pairs. For each pair, compare the two against each other first ( $1$ comparison), then send the smaller to the running minimum and the larger to the running maximum ( $2$ more). That is $3$ comparisons per $2$ elements, or $3 n /2$ total, versus $4 n /2 = 2 n$ for the separate scans. The saving comes from never comparing the larger of a pair against the current minimum, nor the smaller against the current maximum — half the comparisons in the naive method are provably pointless.
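The pairing argument can be checked with a short sketch. The function name `min_max_pairs` and the explicit comparison counter are illustrative assumptions, not part of the lesson's own listings; the counter exists only to make the roughly $3 n /2$ cost visible.

```python
from typing import Sequence


def min_max_pairs(values: Sequence[float]) -> tuple[float, float, int]:
    """Return (minimum, maximum, comparisons used), scanning elements in pairs."""
    if not values:
        raise ValueError("min_max_pairs needs at least one value")

    comparisons = 0
    n = len(values)

    # Seed the running extremes: a lone element if n is odd,
    # one compared pair if n is even.
    if n % 2 == 1:
        smallest = largest = values[0]
        start = 1
    else:
        comparisons += 1
        if values[0] < values[1]:
            smallest, largest = values[0], values[1]
        else:
            smallest, largest = values[1], values[0]
        start = 2

    # Every remaining pair costs exactly 3 comparisons: one inside the pair,
    # one against the running minimum, one against the running maximum.
    for index in range(start, n, 2):
        first, second = values[index], values[index + 1]
        comparisons += 1
        low, high = (first, second) if first < second else (second, first)
        comparisons += 1
        if low < smallest:
            smallest = low
        comparisons += 1
        if high > largest:
            largest = high

    return smallest, largest, comparisons


print(min_max_pairs([7, 2, 9, 4, 1, 6, 3, 8, 5]))  # (1, 9, 12)
```

For $n = 9$ the sketch uses $12$ comparisons, against $2 (n - 1) = 16$ for two separate scans.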

Quickselect: partition, then recurse on one side

Quicksort partitions around a pivot and recurses on both halves. But for selection we only care about one of them. After partitioning $A [p .. r]$ around a pivot that lands at index $q$ , the pivot is the $(q - p + 1)$ -th smallest element of the subarray. Compare that rank to $k$ :

if it equals $k$ , the pivot is the answer;
if $k$ is smaller, the answer lies in the left part, so recurse there;
if $k$ is larger, the answer lies in the right part; recurse there, adjusting $k$ to skip the elements we discarded.

Throwing away the side that cannot contain the answer is what turns quicksort's expected $Θ (n log n)$ into an expected $Θ (n)$ .²

Figure: Quickselect partitions once around $x$ , then recurses into only the side holding rank $k$ (here $k < i$ , the left part) and discards the other.

Algorithm 1: $\textsc{Quickselect}(A, p, r, k)$ : return the $k$ -th smallest of $A[p..r]$

  if $p = r$ then
    return $A[p]$    // only element: the answer
  $q \gets$ call $\textsc{Randomized-Partition}(A, p, r)$
  $i \gets q - p + 1$    // pivot rank in A[p..r]
  if $k = i$ then
    return $A[q]$    // pivot is the k-th smallest
  else if $k < i$ then
    return call $\textsc{Quickselect}(A, p, q - 1, k)$    // recurse left
  else
    return call $\textsc{Quickselect}(A, q + 1, r, k - i)$    // recurse right, shift k

quickselect.py

import random
from typing import TypeVar

from comparable import Comparable

Element = TypeVar("Element", bound=Comparable)

def _partition(
  values: list[Element], low: int, high: int, pivot_index: int
) -> int:
  """
    Lomuto-style partition of `values[low..high]` around the element at\n
    `pivot_index`. Returns the pivot's final resting index, with every\n
    smaller element to its left and every other element to its right.\n
  """
  pivot_value: Element = values[pivot_index]

  # park the pivot at the end while we sweep the rest of the range.
  values[pivot_index], values[high] = values[high], values[pivot_index]
  boundary: int = low
  for current in range(low, high):
    if values[current] < pivot_value:
      values[boundary], values[current] = values[current], values[boundary]
      boundary += 1

  # drop the pivot into the gap between the small and large halves.
  values[boundary], values[high] = values[high], values[boundary]
  return boundary

def quickselect(values: list[Element], k: int) -> Element:
  """
    Return the `k`-th smallest element of `values`, 1-indexed, so `k = 1`\n
    is the minimum and `k = len(values)` the maximum. Works on a private\n
    copy, leaving the caller's list untouched. Raises ValueError when `k`\n
    falls outside `1..len(values)`.\n
  """
  if not 1 <= k <= len(values):
    raise ValueError(f"rank {k} out of range 1..{len(values)}")

  # work on a copy; track the 0-indexed rank and the live search window.
  working: list[Element] = list(values)
  target_index: int = k - 1
  low: int = 0
  high: int = len(working) - 1

  # iterative recursion: shrink [low, high] toward the target rank.
  while low < high:

    # partition around a random pivot; stop once it lands on the target.
    pivot_index: int = random.randint(low, high)
    settled: int = _partition(working, low, high, pivot_index)
    if settled == target_index:
      break

    # keep only the side of the pivot that still contains the target rank.
    if target_index < settled:
      high = settled - 1
    else:
      low = settled + 1

  return working[target_index]

def median(values: list[Element]) -> Element:
  """
    The lower median of `values`: the ceil(n / 2)-th smallest element.\n
  """
  if not values:
    raise ValueError("median of an empty list is undefined")
  return quickselect(values, (len(values) + 1) // 2)

comparable.py

from typing import Any, Protocol


class Comparable(Protocol):
  """
    Anything orderable with `<` (int, float, str, tuple, date, …).\n
  """

  # `other` is position-only so built-ins (int, str, …), whose dunder
  # operands are position-only, structurally satisfy the protocol.
  def __lt__(self, other: Any, /) -> bool: ...
  def __gt__(self, other: Any, /) -> bool: ...
  def __le__(self, other: Any, /) -> bool: ...
  def __ge__(self, other: Any, /) -> bool: ...

A worked trace

Take $A = [7, 2, 9, 4, 1, 6, 3, 8, 5]$ ( $n = 9$ ) and ask for the $k = 4$ -th smallest. Suppose each call happens to pick the last element of its range as the pivot.

The first partition uses pivot $5$ . Everything smaller slides left, everything larger slides right, and $5$ settles into the slot where it belongs:

$$[\,\underbrace{2, 4, 1, 3}_{< 5} \mid 5 \mid \underbrace{7, 9, 6, 8}_{> 5}\,].$$

The pivot landed at index $q = 5$ , so its rank in the range is $i = 5$ . We want rank $k = 4 < 5$ , so the answer is in the left part $[2, 4, 1, 3]$ and we recurse there with $k$ unchanged.

Partition $[2, 4, 1, 3]$ around pivot $3$ : $[2, 1 ∣ 3 ∣ 4]$ . The pivot's rank in this range is $i = 3$ . We still want rank $4$ , and $4 > 3$ , so we recurse right into $[4]$ and shift the target to $k - i = 4 - 3 = 1$ .

That range has a single element, $4$ , which is trivially its own $1$ st smallest — so $Quickselect$ returns $4$ . Checking against the sorted array $[1, 2, 3, 4, 5, 6, 7, 8, 9]$ : the $4$ -th smallest is indeed $4$ . At every step we partitioned one range and then threw away everything on the wrong side of the pivot, never sorting the parts we kept.

Figure: Quickselect on $[7, 2, 9, 4, 1, 6, 3, 8, 5]$ seeking $k = 4$ . Each level partitions around a pivot, keeps only the side holding rank $k$ (accent), and discards the rest (faded); the target rank is re-based after every right recursion.
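The trace can be replayed mechanically. The sketch below is a hypothetical helper, `trace_quickselect`, that always takes the last element of the range as pivot (the same assumption the trace makes) and uses an order-preserving split so the printed ranges match the ones written above. It assumes distinct values.

```python
def trace_quickselect(
    values: list[int], k: int
) -> tuple[int, list[tuple[list[int], int, int, int]]]:
    """Quickselect with a last-element pivot, recording every step.

    Each recorded step is (range, pivot, pivot rank i, target rank k).
    Assumes distinct values.
    """
    if not 1 <= k <= len(values):
        raise ValueError(f"rank {k} out of range 1..{len(values)}")

    steps: list[tuple[list[int], int, int, int]] = []
    current = list(values)
    while True:
        pivot = current[-1]
        # Order-preserving split, so the ranges read exactly as in the prose.
        smaller = [value for value in current if value < pivot]
        larger = [value for value in current if value > pivot]
        rank = len(smaller) + 1
        steps.append((current, pivot, rank, k))

        if k == rank:
            return pivot, steps
        if k < rank:
            current = smaller          # answer is on the left, k unchanged
        else:
            current = larger           # answer is on the right, re-base k
            k -= rank


answer, steps = trace_quickselect([7, 2, 9, 4, 1, 6, 3, 8, 5], 4)
for step in steps:
    print(step)
# ([7, 2, 9, 4, 1, 6, 3, 8, 5], 5, 5, 4)
# ([2, 4, 1, 3], 3, 3, 4)
# ([4], 4, 1, 1)
print(answer)  # 4
```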

Expected linear time

Partition costs $Θ (n)$ . The win over quicksort is that we recurse on only one side. With a randomized pivot, the partition splits the array at a uniformly random rank, so on average we discard a constant fraction each time. Intuitively, a random pivot lands in the middle half of the array with probability $1/2$ , in which case the surviving side has at most $3 n /4$ elements. The expected work satisfies, roughly,

$$T(n) \leq T(3n/4) + Θ(n),$$

a geometric (not branching) recurrence. Unrolling it,

$$T(n) \leq c \left(n + \frac{3}{4} n + \left(\frac{3}{4}\right)^{2} n + \dots\right) = c n \cdot \frac{1}{1 - 3/4} = 4 c n = Θ(n).$$

The geometric series converges to a constant, so the total is linear.

The hand-wave above hides one step: why is the expected surviving size a constant fraction of $n$ ? With a uniformly random pivot, its rank is equally likely to be any of $1, \dots, n$ . Call the pivot good if its rank falls in the middle half, between $n /4$ and $3 n /4$ ; that happens with probability $1/2$ . A good pivot leaves at most $3 n /4$ elements on either side. So on average we need at most two partitions to shrink the range to $\leq 3 n /4$ , and each partition costs $O (n)$ . That gives $E [T (n)] \leq E [T (3 n /4)] + O (n)$ , which unrolls to the geometric sum above. CLRS reaches the same $E [T (n)] = O (n)$ more carefully with indicator variables summed over every possible pivot rank; the constant it extracts is small.
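An empirical check makes the linear bound tangible. The following is a minimal simulation sketch, not the lesson's implementation: `quickselect_work` and `average_work_ratio` are hypothetical names, the input is assumed to hold distinct integers, and "work" is counted as the total number of elements handed to a partition.

```python
import random


def quickselect_work(values: list[int], k: int, rng: random.Random) -> tuple[int, int]:
    """One randomized quickselect run: (k-th smallest, elements partitioned)."""
    current = list(values)
    work = 0
    while True:
        work += len(current)           # a partition touches every live element
        pivot = rng.choice(current)
        smaller = [value for value in current if value < pivot]
        larger = [value for value in current if value > pivot]
        rank = len(smaller) + 1        # valid because values are distinct
        if k == rank:
            return pivot, work
        if k < rank:
            current = smaller
        else:
            current, k = larger, k - rank


def average_work_ratio(n: int, trials: int, seed: int = 0) -> float:
    """Average of (elements partitioned) / n over repeated median queries."""
    rng = random.Random(seed)
    values = list(range(n))
    rng.shuffle(values)
    median_rank = (n + 1) // 2
    total = 0
    for _ in range(trials):
        answer, work = quickselect_work(values, median_rank, rng)
        assert answer == median_rank - 1   # sanity check: median of 0..n-1
        total += work
    return total / (trials * n)


for size in (500, 2000, 8000):
    print(size, round(average_work_ratio(size, 100), 2))
```

If the analysis holds, the printed ratio should stay near the same small constant as $n$ grows, rather than growing like $log n$ .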

Figure: A single-side recurrence shrinks the subproblem by a constant factor each level, so the bars $n, \frac{3}{4} n, (\frac{3}{4})^{2} n, \dots$ sum to a geometric $Θ (n)$ .

The catch is the same as quicksort's: the worst case, with every pivot maximally unbalanced, is

$$T(n) = T(n - 1) + Θ(n) = Θ(n^{2}).$$

Randomization makes that astronomically unlikely, but it is still possible. Can we guarantee linear time?
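To see the quadratic case concretely, drop the randomization. The sketch below (a hypothetical helper, `last_pivot_comparisons`, assuming distinct values) always picks the last element as pivot and counts one comparison per non-pivot element in each partition. On already-sorted input with $k = 1$ , every pivot is the maximum of its range, so the range shrinks by one element at a time.

```python
def last_pivot_comparisons(values: list[int], k: int) -> int:
    """Comparisons made by quickselect when the pivot is always the last element."""
    current = list(values)
    comparisons = 0
    while len(current) > 1:
        pivot = current[-1]
        rest = current[:-1]
        comparisons += len(rest)       # each other element is compared to the pivot
        smaller = [value for value in rest if value < pivot]
        larger = [value for value in rest if value > pivot]
        rank = len(smaller) + 1
        if k == rank:
            break
        if k < rank:
            current = smaller
        else:
            current, k = larger, k - rank
    return comparisons


n = 100
print(last_pivot_comparisons(list(range(1, n + 1)), 1))  # 4950 = n(n-1)/2
```

The count is exactly $(n - 1) + (n - 2) + \dots + 1 = n (n - 1) /2$ , the $Θ (n^{2})$ behaviour of the recurrence above.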

Median of medians: a guaranteed-good pivot

The deterministic algorithm of Blum, Floyd, Pratt, Rivest, and Tarjan (1973) achieves worst-case $O (n)$ by spending a little effort to choose a pivot that is provably not too extreme.³ The idea is to pick the pivot as a median of medians.

Algorithm 2: $\textsc{Select}(A, k)$ : deterministic $k$ -th smallest, worst-case $O(n)$

  if $A$ has at most $5$ elements then
    return the $k$ -th smallest of $A$ by direct sorting
  divide $A$ into $\lceil n/5 \rceil$ groups of $5$ elements (last group may be smaller)
  foreach group do
    find that group's median by sorting its $\le 5$ elements
  let $M$ be the array of the $\lceil n/5 \rceil$ group medians
  $x \gets$ call $\textsc{Select}(M, \lceil |M|/2 \rceil)$    // median of medians
  partition $A$ around the pivot $x$ ; let $i$ be the rank of $x$ in $A$
  if $k = i$ then
    return $x$
  else if $k < i$ then
    return call $\textsc{Select}(A[\,\text{left part}\,], k)$
  else
    return call $\textsc{Select}(A[\,\text{right part}\,], k - i)$

median_of_medians.py

from typing import TypeVar

from comparable import Comparable

Element = TypeVar("Element", bound=Comparable)

_GROUP_SIZE: int = 5

def _median_by_sorting(group: list[Element]) -> Element:
  """
    The lower median of a short list, found by sorting its <= 5 elements\n
    directly — O(1) work per call.\n
  """
  ordered: list[Element] = sorted(group)
  return ordered[(len(ordered) - 1) // 2]

def _median_of_medians_pivot(values: list[Element]) -> Element:
  """
    The pivot for one Select step: chop `values` into groups of five, take\n
    each group's median, and recurse to select the median of that smaller\n
    array of medians.\n
  """
  if len(values) <= _GROUP_SIZE:
    return _median_by_sorting(values)

  # one median per group of five.
  medians: list[Element] = [
    _median_by_sorting(values[start : start + _GROUP_SIZE])
    for start in range(0, len(values), _GROUP_SIZE)
  ]

  # the pivot is the median of the group medians — a size-n/5 subproblem.
  return select(medians, (len(medians) + 1) // 2)

def select(values: list[Element], k: int) -> Element:
  """
    Return the `k`-th smallest element of `values`, 1-indexed, in worst-case\n
    O(n) time. Leaves the caller's list untouched. Raises ValueError when\n
    `k` falls outside `1..len(values)`.\n
  """
  if not 1 <= k <= len(values):
    raise ValueError(f"rank {k} out of range 1..{len(values)}")

  # pick a provably balanced pivot in linear time.
  pivot_value: Element = _median_of_medians_pivot(values)

  # three-way split keeps duplicate pivots in their own band, so the
  # recursion stays well-defined even when elements repeat.
  smaller: list[Element] = [item for item in values if item < pivot_value]
  equal: list[Element] = [item for item in values if item == pivot_value]
  larger: list[Element] = [item for item in values if item > pivot_value]

  # recurse into the band that holds the target rank.
  if k <= len(smaller):
    return select(smaller, k)
  if k <= len(smaller) + len(equal):
    return pivot_value

  # discard the smaller and equal bands; shift the rank past them.
  return select(larger, k - len(smaller) - len(equal))

def median(values: list[Element]) -> Element:
  """
    The lower median of `values`: the ceil(n / 2)-th smallest element, in\n
    guaranteed linear time.\n
  """
  if not values:
    raise ValueError("median of an empty list is undefined")
  return select(values, (len(values) + 1) // 2)


Why groups of five give a good pivot

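Here is the core of it. Picture the $⌈ n /5 ⌉$ groups as columns, each sorted top (large) to bottom (small), and now imagine the columns reordered left to right by their medians. The pivot $x$ , the median of medians, is the median of that middle row, so it sits dead-center in the grid below. (We assume distinct elements for simplicity.) About half of the columns have a median at most $x$ , and in each of those columns the median and the two elements below it are also at most $x$ : roughly $3 \cdot \frac{1}{2} \cdot \frac{n}{5} = 3 n /10$ elements are guaranteed $\leq x$ . By symmetry, roughly $3 n /10$ elements are guaranteed $\geq x$ . Whichever side the recursion keeps therefore holds at most about $7 n /10$ elements.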

The pivot is built in stages: chop the array into groups of five, sort each group to expose its median, collect those $⌈ n /5 ⌉$ medians, and recurse to find their median — the median of medians.

Figure: Building the pivot: split into groups of $5$ , sort each to surface its median (accent cell), gather the $⌈ n /5 ⌉$ medians, then recurse on that smaller array to get the median of medians $x$ .

Figure: Grid of groups of five with the median of medians $x$ , showing the regions guaranteed to be at least or at most $x$ .

For example, take these $15$ numbers and split them into three groups of five:

$$G_{1} = [12, 3, 20, 7, 15], \quad G_{2} = [9, 1, 18, 4, 11], \quad G_{3} = [6, 17, 2, 14, 8].$$

Sort each group and read off its median (the third of five):

$$G_{1} = [3, 7, 12, 15, 20], \quad G_{2} = [1, 4, 9, 11, 18], \quad G_{3} = [2, 6, 8, 14, 17].$$

The three group medians are $\{12, 9, 8\}$ . Their median is $x = 9$ : the median of medians, our pivot. Now count what $9$ guarantees. Its own group $G_{2}$ contributes $9$ and everything below it there ( $1, 4$ ) as elements $\leq 9$ ; $G_{3}$ , whose median $8 < 9$ , contributes its median and the two below it ( $2, 6, 8$ ) as elements $\leq 9$ . That already fixes $6$ of the $15$ values as $\leq 9$ before we even scan the array. The symmetric count ( $9, 11, 18$ from $G_{2}$ and $12, 15, 20$ from $G_{1}$ ) fixes $6$ values as $\geq 9$ . Partitioning around $9$ therefore cannot strand it near either end: the split is provably balanced. This is the $\geq 3 n /10$ guarantee in miniature: with $n = 15$ the bound asks for at least $3 \cdot 15/10 = 4.5$ elements pinned to each side, and here $6$ are.

(Five is the smallest odd group size that makes the recurrence below close; groups of $3$ fail because their fractions sum to exactly $1$ .)
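The fifteen-number example can be recomputed with a few lines. This is a sketch for illustration only: `pivot_guarantee` is a hypothetical helper, and it finds the median of the group medians by sorting them directly instead of recursing, which is fine for a three-group example but is not how the linear-time algorithm proceeds.

```python
def pivot_guarantee(
    values: list[int], group_size: int = 5
) -> tuple[list[int], int, list[int], list[int]]:
    """Return (group medians, pivot, values pinned <= pivot, values pinned >= pivot)."""
    groups = [
        sorted(values[start:start + group_size])
        for start in range(0, len(values), group_size)
    ]
    medians = [group[(len(group) - 1) // 2] for group in groups]

    # Shortcut for the demo: sort the medians instead of recursing.
    pivot = sorted(medians)[(len(medians) - 1) // 2]

    # A group whose median is <= pivot pins its median and everything below it.
    pinned_low = sorted(
        value
        for group, median in zip(groups, medians) if median <= pivot
        for value in group if value <= median
    )
    # Symmetrically, a group whose median is >= pivot pins its upper part.
    pinned_high = sorted(
        value
        for group, median in zip(groups, medians) if median >= pivot
        for value in group if value >= median
    )
    return medians, pivot, pinned_low, pinned_high


numbers = [12, 3, 20, 7, 15, 9, 1, 18, 4, 11, 6, 17, 2, 14, 8]
medians, pivot, pinned_low, pinned_high = pivot_guarantee(numbers)
print(medians, pivot)   # [12, 9, 8] 9
print(pinned_low)       # [1, 2, 4, 6, 8, 9]
print(pinned_high)      # [9, 11, 12, 15, 18, 20]
```

The pinned sets are only what the grouping proves in advance; a full scan of this particular array finds $8$ values $\leq 9$ and $8$ values $\geq 9$ .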

Solving the recurrence

Tallying the work: splitting into groups and finding their medians is $O (n)$ (each group is sorted in $O (1)$ ). Partitioning is $O (n)$ . There are two recursive calls:

finding the median of the $⌈ n /5 ⌉$ medians, a subproblem of size $n /5$ ;
recursing into the surviving side, a subproblem of size at most $7 n /10$ .

$$T(n) \leq T\left(\frac{n}{5}\right) + T\left(\frac{7n}{10}\right) + O(n).$$

The two fractions are what make this work: $\frac{1}{5} + \frac{7}{10} = \frac{9}{10} < 1$ , so the two subproblems together are strictly smaller than the input. That is the shrinkage — the total work contracts by a constant factor at every level.

Figure: The two recursive calls span $\frac{1}{5} + \frac{7}{10} = \frac{9}{10} < 1$ of the input; the leftover $\frac{1}{10}$ gap is the shrinkage that forces $O (n)$ .

Start with the single-call cousin $f (n) \leq f (n /2) + bn$ . The master theorem gives $f (n) = O (n)$ ; unrolling shows why directly,

$$f(n) \leq f\left(\frac{n}{2}\right) + bn \leq f\left(\frac{n}{4}\right) + b \frac{n}{2} + bn \leq \dots \leq f(1) + \left(bn + \frac{bn}{2} + \frac{bn}{4} + \dots\right) \leq f(1) + 2bn.$$

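The constant-factor shrinkage plus linear cleanup work sums to $O (n)$ . The median-of-medians recurrence has two calls instead of one, but the same shrinkage idea carries it through, stated as a general lemma:

Lemma. If $T (n) \leq T (λ n) + T (μ n) + bn$ for constants $λ, μ > 0$ with $λ + μ < 1$ and $b > 0$ , then $T (n) = O (n)$ . (By induction, $T (n) \leq cn$ holds with $c = b / (1 - λ - μ)$ .)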

Plugging in $λ = \frac{1}{5}$ , $μ = \frac{7}{10}$ (so $λ + μ = \frac{9}{10}$ ) gives $T (n) = O (n)$ , that is, worst-case linear time. Had the fractions summed to $1$ or more (as for groups of $3$ , where $\frac{1}{3} + \frac{2}{3} = 1$ ), the lemma's hypothesis fails: the per-level savings vanish, the recursion tree carries $log n$ levels of $Θ (n)$ work, and the bound degrades to $Θ (n log n)$ .
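The contrast between fractions that sum to less than $1$ and fractions that sum to exactly $1$ shows up numerically. The sketch below evaluates both recurrences exactly under simplifying assumptions that are mine, not the lesson's: the linear term is taken as exactly $n$ , subproblem sizes are rounded down, and $T (n) = n$ for $n \leq 1$ . `make_recurrence`, `fives`, and `threes` are hypothetical names.

```python
from functools import lru_cache
from typing import Callable


def make_recurrence(fractions: tuple[tuple[int, int], ...]) -> Callable[[int], int]:
    """Build T(n) = n + sum of T(floor(n * a / b)) over the given (a, b) fractions."""

    @lru_cache(maxsize=None)
    def cost(n: int) -> int:
        if n <= 1:
            return n
        return n + sum(cost(n * a // b) for a, b in fractions)

    return cost


fives = make_recurrence(((1, 5), (7, 10)))    # T(n/5) + T(7n/10) + n
threes = make_recurrence(((1, 3), (2, 3)))    # T(n/3) + T(2n/3) + n

for n in (10**2, 10**4, 10**6):
    print(n, round(fives(n) / n, 2), round(threes(n) / n, 2))
```

With fractions $\frac{1}{5}$ and $\frac{7}{10}$ the ratio $T (n) /n$ never exceeds $1 / (1 - \frac{9}{10}) = 10$ , while for groups of three it keeps climbing with $log n$ .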

Which to use

Both algorithms are linear, but they trade off differently.

Algorithm	expected time	worst case	pivot cost	in practice
Randomized quickselect	$O (n)$	$Θ (n^{2})$	one random draw	fast; the default
Median of medians	$O (n)$	$O (n)$	recursive, heavy	slow; large constant factor

The median-of-medians algorithm settled a real question: it proves selection is possible in worst-case linear time, with no randomness and no probabilistic escape hatch. But its constant factor is large. Every level does the grouping, the per-group sort, and a second recursive call just to pick the pivot, so the hidden constant dwarfs quickselect's. On real inputs randomized quickselect is faster and is the algorithm to reach for.⁴ Its quadratic worst case is a theoretical possibility that a random pivot makes vanishingly unlikely — an adversary who cannot see your coin flips cannot force it.

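The deterministic version matters in two places. First, when a hard worst-case guarantee is mandatory (a real-time deadline, or an adversarial setting where inputs are chosen to break you). Second, and more commonly, as a pivot-selection subroutine for quicksort: use $Select$ to find the true median in $O (n)$ , partition around it, and every quicksort split is perfectly balanced, giving a worst-case $Θ (n log n)$ sort. The practical compromise, introselect, runs plain quickselect but watches its progress, typically the recursion depth; if that ever exceeds a threshold (a sign of bad pivots), it switches to a fallback with a guaranteed bound for the rest. With median-of-medians as the fallback, and a switch that triggers once the wasted partitioning work reaches a constant multiple of $n$ , this keeps quickselect's speed on ordinary inputs while capping the worst case at $O (n)$ . Implementations of C++'s std::nth_element commonly follow this introspective strategy, though the fallback varies by library: libstdc++, for example, falls back to a heap-based selection rather than median-of-medians, which bounds the worst case at $O (n log n)$ instead.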
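The introspective idea can be sketched in a few dozen lines. This is an illustration under stated assumptions rather than any library's implementation: instead of watching recursion depth it charges every partition against a work budget of `budget_factor * n` elements (the value $8$ is arbitrary), and once the budget runs out it hands the remaining range to a compact median-of-medians routine. Both function names are hypothetical.

```python
import random
from typing import Optional


def _median_of_medians_select(values: list[int], k: int) -> int:
    """Deterministic k-th smallest (1-indexed) using groups of five."""
    if len(values) <= 5:
        return sorted(values)[k - 1]
    groups = [sorted(values[s:s + 5]) for s in range(0, len(values), 5)]
    medians = [group[(len(group) - 1) // 2] for group in groups]
    pivot = _median_of_medians_select(medians, (len(medians) + 1) // 2)
    smaller = [value for value in values if value < pivot]
    larger = [value for value in values if value > pivot]
    equal = len(values) - len(smaller) - len(larger)
    if k <= len(smaller):
        return _median_of_medians_select(smaller, k)
    if k <= len(smaller) + equal:
        return pivot
    return _median_of_medians_select(larger, k - len(smaller) - equal)


def introselect(
    values: list[int],
    k: int,
    budget_factor: int = 8,
    rng: Optional[random.Random] = None,
) -> int:
    """Randomized quickselect that falls back to median-of-medians when the
    total partitioning work exceeds budget_factor * len(values)."""
    if not 1 <= k <= len(values):
        raise ValueError(f"rank {k} out of range 1..{len(values)}")
    if rng is None:
        rng = random.Random()

    current = list(values)
    budget = budget_factor * len(current)
    while True:
        if len(current) > budget:
            # Too much work spent on unlucky pivots: switch strategies.
            return _median_of_medians_select(current, k)
        budget -= len(current)

        pivot = rng.choice(current)
        smaller = [value for value in current if value < pivot]
        larger = [value for value in current if value > pivot]
        equal = len(current) - len(smaller) - len(larger)
        if k <= len(smaller):
            current = smaller
        elif k <= len(smaller) + equal:
            return pivot
        else:
            current, k = larger, k - len(smaller) - equal


print(introselect([7, 2, 9, 4, 1, 6, 3, 8, 5], 4))                   # 4
print(introselect([7, 2, 9, 4, 1, 6, 3, 8, 5], 4, budget_factor=0))  # 4, via the fallback
```

Because the quickselect phase can never spend more than `budget_factor * n` element visits and the fallback is linear, the total stays $O (n)$ in the worst case under these assumptions.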

Bonus: Closest Pair of Points

Sorting and selection are not the only classics that fall to divide-and-conquer. Finding the closest pair among $n$ points in the plane beats the $Θ (n^{2})$ brute force the same way: split by the median $x$ -coordinate, recurse on each half to get the best distance $δ$ within each side, and combine by checking only pairs that straddle the dividing line. The combine looks dangerous — a naive cross-check is $Θ (n^{2})$ — but the same kind of counting argument as in median-of-medians fixes it: any straddling pair closer than $δ$ lies in a width- $2 δ$ strip, and a packing bound shows each strip point need only be compared against a constant number of $y$ -neighbours. That makes the combine $O (n)$ and the whole recurrence $T (n) = 2 T (n /2) + O (n) = O (n log n)$ . The full algorithm, the strip-packing proof, and the pseudocode live in Polygons & Proximity.

closest_pair.py

import math
from typing import NamedTuple, Sequence

class Point(NamedTuple):
  """
    A point in the plane, with x and y coordinates.\n
  """
  x: float
  y: float

def distance(first: Point, second: Point) -> float:
  """
    Euclidean distance between two points.\n
  """
  return math.hypot(first.x - second.x, first.y - second.y)

class ClosestPair(NamedTuple):
  """
    The result of a closest-pair query: the two nearest points and the\n
    distance between them.\n
  """
  first: Point
  second: Point
  distance: float

def _brute_force(points: Sequence[Point]) -> ClosestPair:
  """
    Direct O(m^2) scan over every pair — the base case for tiny inputs.\n
  """
  # seed with the first pair, then keep the smallest gap over all pairs.
  best: ClosestPair = ClosestPair(
    points[0], points[1], distance(points[0], points[1])
  )
  for left in range(len(points)):
    for right in range(left + 1, len(points)):
      gap: float = distance(points[left], points[right])
      if gap < best.distance:
        best = ClosestPair(points[left], points[right], gap)

  return best

def _strip_closest(
  strip: list[Point], best_so_far: ClosestPair
) -> ClosestPair:
  """
    Scan points already sorted by y within the delta-wide strip. Each point\n
    need only be compared against the few followers within `best.distance`\n
    in y, so the inner loop breaks early and the whole scan stays O(n).\n
  """
  best: ClosestPair = best_so_far
  for lower in range(len(strip)):

    # compare against followers until the y-gap alone exceeds the best.
    upper: int = lower + 1
    while (
      upper < len(strip)
      and strip[upper].y - strip[lower].y < best.distance
    ):
      gap: float = distance(strip[lower], strip[upper])
      if gap < best.distance:
        best = ClosestPair(strip[lower], strip[upper], gap)
      upper += 1

  return best

def _closest_recursive(
  by_x: list[Point], by_y: list[Point]
) -> ClosestPair:
  """
    Recurse on points held in two orders at once: `by_x` sorted by x to find\n
    the split line, `by_y` sorted by y to feed the strip scan in linear\n
    time without re-sorting at every level.\n
  """
  if len(by_x) <= 3:
    return _brute_force(by_x)

  # split at the median x; left/right keep their x-sorted order.
  middle: int = len(by_x) // 2
  split_point: Point = by_x[middle]
  left_by_x: list[Point] = by_x[:middle]
  right_by_x: list[Point] = by_x[middle:]

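  # partition the y-sorted list to match the x-sorted halves, preserving order.
  # membership is tested by object identity, so this assumes every input point
  # is a distinct object (the same Point instance must not appear twice).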
  left_set = set(map(id, left_by_x))
  left_by_y: list[Point] = [p for p in by_y if id(p) in left_set]
  right_by_y: list[Point] = [p for p in by_y if id(p) not in left_set]

  left_best: ClosestPair = _closest_recursive(left_by_x, left_by_y)
  right_best: ClosestPair = _closest_recursive(right_by_x, right_by_y)
  best: ClosestPair = (
    left_best if left_best.distance <= right_best.distance else right_best
  )

  # only points within delta of the split line can beat the recursive best.
  strip: list[Point] = [
    point for point in by_y if abs(point.x - split_point.x) < best.distance
  ]
  return _strip_closest(strip, best)

def closest_pair(points: Sequence[Point]) -> ClosestPair:
  """
    Return the two closest points and their distance. Raises ValueError on\n
    fewer than two points. Coincident points yield a distance of 0.\n
  """
  if len(points) < 2:
    raise ValueError("closest pair needs at least two points")
  by_x: list[Point] = sorted(points, key=lambda point: point.x)
  by_y: list[Point] = sorted(points, key=lambda point: point.y)
  return _closest_recursive(by_x, by_y)

Selection in practice

Selection is a solved problem in theory — both algorithms are linear — but the constant factors and the shift to huge or distributed data keep it a live engineering topic.

Floyd–Rivest: fewer comparisons than either. The practical winner is often neither plain quickselect nor median-of-medians but the Floyd–Rivest algorithm (1975). It samples a small random subset of the array, selects two pivots from the sample chosen to straddle the target rank $k$ with high probability, and partitions into three parts so that after one pass the surviving range is a tiny $O (n^{2/3})$ -sized sliver almost certain to contain the answer. It finds the median in $1.5 n + o (n)$ expected comparisons, beating quickselect's constant and far below median-of-medians', which is why high-performance numeric libraries reach for it when comparisons are the bottleneck. It refines the same "sample to guess a good pivot" idea, pushed to two pivots and a sampled estimate of where $k$ lands.

Streaming and approximate selection. When the data is a stream too large to store (network packets, sensor readings, query logs), you cannot partition an array you never hold. Exact selection provably needs $Ω (n)$ space in one pass, so practical systems compute approximate quantiles instead. The Greenwald–Khanna sketch (2001) maintains a small summary, $O (\frac{1}{ε} log (ε n))$ space, that answers the $k$ -th order statistic within rank error $ε n$ for any $k$ . The t-digest (Dunning, 2019) is a widely used alternative that gives up that formal rank-error bound in exchange for good accuracy in practice, especially at extreme quantiles. These power the percentile latency dashboards ( $p_{50}$ , $p_{99}$ ) that every production service watches; the exact median of a billion requests is neither needed nor affordable, but a $99.9$ th percentile good to a fraction of a percent is both.

Parallel and distributed medians. On a cluster the data is sharded across machines and no single node sees it all. The median-of-medians idea reappears: each machine computes a local summary or weighted median, a coordinator combines these into a pivot estimate, and one round of counting how many global elements fall below the pivot narrows the search — a distributed echo of the sequential partition-and-recurse. The recurring theme across all three settings is the one this lesson opened with: selection requires less than sorting, and every regime, sequential, streaming, or distributed, finds a way to do only the work the answer needs.⁴
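The count-and-narrow round can be simulated on one machine. In this minimal sketch each inner list stands in for a shard; the coordinator sees only a pivot and the per-shard counts. For simplicity the pivot is a random live element rather than the combined summary or weighted median described above, and `distributed_select` is a hypothetical name.

```python
import random


def distributed_select(shards: list[list[int]], k: int, rng: random.Random) -> int:
    """k-th smallest (1-indexed) across all shards, by repeated counting rounds."""
    total = sum(len(shard) for shard in shards)
    if not 1 <= k <= total:
        raise ValueError(f"rank {k} out of range 1..{total}")

    live = [list(shard) for shard in shards]   # each shard's surviving elements
    while True:
        # Coordinator: pick a pivot from some shard that still holds elements.
        candidates = [shard for shard in live if shard]
        pivot = rng.choice(rng.choice(candidates))

        # Each shard reports two counts; no elements cross the network.
        below = sum(sum(1 for value in shard if value < pivot) for shard in live)
        equal = sum(shard.count(pivot) for shard in live)

        if k <= below:
            live = [[value for value in shard if value < pivot] for shard in live]
        elif k <= below + equal:
            return pivot
        else:
            live = [[value for value in shard if value > pivot] for shard in live]
            k -= below + equal


shards = [[9, 1, 7], [], [4, 4, 8, 2], [6]]
print(distributed_select(shards, 4, random.Random(7)))  # 4
```

Each round discards at least the pivot itself, so the loop always terminates; with reasonably balanced pivots the number of rounds is logarithmic in the total size.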

Takeaways

Selection finds the $k$ -th smallest element; it needs less than sorting, so it can run in $O (n)$ .
$Quickselect$ = quicksort's partition, but recurse into only the side that holds the answer; expected $O (n)$ via a geometric (non-branching) recurrence, worst case $Θ (n^{2})$ .
Median of medians chooses a provably balanced pivot using groups of five, guaranteeing at least $\approx 3 n /10$ elements on each side of the pivot, so the recursion keeps at most $\approx 7 n /10$ .
That balance yields $T (n) \leq T (n /5) + T (7 n /10) + O (n)$ , and because $\frac{1}{5} + \frac{7}{10} < 1$ , the recurrence solves to worst-case $O (n)$ .
In practice randomized quickselect wins on constants; median-of-medians matters for guarantees and as a quicksort pivot rule.
Closest pair of points is divide-and-conquer with the same flavor: split by median $x$ , recurse on each half, then combine over a width- $2 δ$ strip. A geometric argument caps the strip work at $7$ comparisons per point, giving $T (n) \leq 2 T (n /2) + O (n) = O (n log n)$ .

Footnotes

1. CLRS, Ch. 9, Medians and Order Statistics: selecting the $k$ -th order statistic in linear time without fully sorting.
2. Erickson, Algorithms, Ch. 1, Recursion: quickselect adapting quicksort's partition to recurse into only the side that holds the answer.
3. CLRS, Ch. 9, Medians and Order Statistics: the Blum–Floyd–Pratt–Rivest–Tarjan median-of-medians algorithm achieving worst-case $O (n)$ via groups of five.
4. Skiena, The Algorithm Design Manual, §4, Sorting and Searching: randomized quickselect as the practical choice over deterministic median-of-medians.
