Sequences & Strings/Binary Search on the Answer

Lesson 5.31,578 words

Binary Search on the Answer

Binary search is not really about arrays — it is about locating the boundary of a monotone predicate p(x)p(x) in O(log(range))O(\log(\text{range})) probes. We first pin down the half-open `while (lo < hi)` template for lower_bound\textsc{lower\_bound} and upper_bound\textsc{upper\_bound}, then generalize to "binary search on the answer": whenever feasibility is monotone in a numeric parameter, we binary search the parameter itself, calling a feasibility check at each step.

╌╌╌╌

We have seen binary search as the canonical divide-and-conquer search: a sorted array of keys, a target , and a halving loop that finds (or proves it absent) in comparisons. That framing is correct but narrow. The array was never the point. What binary search needs is a monotone predicate: a boolean test that is false up to some boundary and true from there on. Sorted membership is just one instance, where . Once we see binary search as locating the boundary of a monotone predicate, we can search ranges that no array ever materialises, the technique known as binary search on the answer.

Recap: binary search on a sorted array

Fix a sorted array and a target . The loop maintains an interval guaranteed to contain if it is present at all.

Because the interval length drops geometrically, the loop runs times. The plain does occur? version is easy; the subtle and far more useful versions are the boundary queries.

lower_bound and upper_bound

Two queries answer almost every practical question about a sorted array:

  • is the first index with .
  • is the first index with .

Their difference is the count of elements equal to ; itself is the insertion point that keeps sorted. Both are boundary searches over the monotone predicate (respectively ): a sorted array makes go false, …, false, true, …, true exactly once.

The reliable template is half-open and uses lo < hi, never lo <= hi. We search for the smallest index in at which the predicate holds, treating index as a virtual past the end sentinel that is always feasible.

Algorithm:lower_bound(A,x)\textsc{lower\_bound}(A, x) — first index ii with A[i]xA[i] \ge x
  1. 1
    lo0lo \gets 0
  2. 2
    hinhi \gets n
    half-open: hi is past-the-end
  3. 3
    while lo<hilo < hi do
  4. 4
    midlo+(hilo)/2mid \gets lo + \lfloor (hi - lo)/2 \rfloor
    floor, and overflow-safe
  5. 5
    if A[mid]xA[mid] \ge x then
  6. 6
    himidhi \gets mid
    mid may be the answer
  7. 7
    else
  8. 8
    lomid+1lo \gets mid + 1
    mid infeasible
  9. 9
    return lolo
    lo=hilo = hi: boundary

Two details make this correct, and getting either wrong is the classic off-by-one bug:

  • The interval is half-open, as a search space but inclusive as an answer. We initialize , not , because the answer can be no element is , i.e. index . The loop's exit then names a valid boundary in .
  • The mid uses and the two branches are asymmetric. When holds we set (not ), because is itself a candidate boundary. When fails we set , because is now known-infeasible and must be excluded. With a floored mid, always, so strictly shrinks the interval and the loop cannot spin forever. For , change the test to ; nothing else moves.

The generalization: searching a monotone predicate

Nothing in the loop above inspected the array except through . Abstract it away. Let be monotone:

If is monotone and computable, we can find by binary search over the numeric range , with no array at all. This is binary search on the answer: we are searching the space of candidate answers, and the only thing that makes it work is that feasibility is monotone in the answer.

A monotone predicate flips false→true exactly once; binary search finds that boundary

The cost is uniform across every application:

We pay probes, and each probe runs the feasibility check once. Replacing the array access by an arbitrary monotone test is the entire idea.

Each probe tests and discards the infeasible half — probes

Worked examples

In each case the work is the same three steps: name the answer parameter and its range , name the monotone predicate , and write the feasibility check. The binary search loop never changes.

Koko eating bananas

Koko has piles and hours; at speed she clears hours on pile . Minimize such that she finishes within hours.

  • Answer parameter: the speed , an integer in .
  • Predicate: , can finish in hours.
  • Monotonicity: a larger never increases any term , so is monotone increasing in (false for tiny speeds, true once fast enough). Binary search the smallest feasible .

Each check is , the range is , so the cost is .

Koko feasibility over speeds for , . The predicate flips once; binary search returns the boundary

Capacity to ship / split array largest sum

These two problems are the same problem. Given an array and a count (days / parts), partition it into contiguous groups to minimize the maximum group sum. (Capacity to ship within days reads the array as package weights; Split array largest sum reads it as integers, with identical structure.)

  • Answer parameter: the cap on a group's sum, in . (You cannot go below the largest single element; you never need to exceed the whole sum.)
  • Predicate: the array can be split into contiguous groups, each with sum .
  • Feasibility check (greedy, ): sweep left to right, accumulating into the current group; whenever adding would exceed , close the group and start a new one at . The number of groups this greedy uses is the minimum possible for cap , so holds iff that count is .
Feasibility check for cap on with : the greedy sweep cuts whenever the next element would push a group past , using groups (sums , ). Since , holds

A larger only ever merges groups, so the group count is non-increasing in : is monotone, and we binary search the smallest feasible . Cost: .

Integer square root

A pure-numeric instance with no array in sight: given , compute , the largest with .

  • Answer parameter: , in (or tighten to ).
  • Predicate: here the natural test is monotone the other way, namely true, …, true, false, …, so we want the last true. Search the first with via the standard template and subtract one; or flip the comparison and keep the largest feasible form.

The check is , so the integer square root costs , and the same shape computes any .1

Reversed monotonicity: runs , so the answer is the LAST true, not the first. For the boundary sits at ()

Correctness and termination

Every variant rests on one invariant, stated here for the smallest feasible half-open template ( inclusive as an answer):

Termination and correctness. With we have whenever . If is true, setting keeps feasible and strictly decreases . If is false, setting keeps everything below infeasible and strictly increases . Either way drops by at least one and at most halves, so after iterations .

Binary search on a real interval

When the answer is a real number rather than an integer (minimize a continuous radius, a rate, a time), the boundary need not be representable exactly, so we run parametric search: the same loop on a real interval, stopping after a fixed number of iterations or once .

Algorithm:real-valued binary search — first xx with p(x)p(x) within ε\varepsilon
  1. 1
    lo; hirlo \gets \ell;\ hi \gets r
  2. 2
    repeat KK times:
    K=100K=100 \Rightarrow error (r)2100\le(r-\ell)2^{-100}
  3. 3
    mid(lo+hi)/2mid \gets (lo + hi)/2
  4. 4
    if p(mid)p(mid) then himidhi \gets mid else lomidlo \gets mid
  5. 5
    return hihi

Each iteration halves the interval, so iterations reach absolute error ; a fixed buys machine precision. There is no bookkeeping because we never need the exact integer boundary, only an -close one. As always, the predicate's monotonicity is the whole game: if is monotone over the loop converges to its boundary; if is not monotone, binary search is simply the wrong tool, since there may be several transitions and no guarantee which one we land on.3

Takeaways

  • Binary search locates the boundary of a monotone predicate in probes; the sorted array is just the special case .
  • Memorise one template. The half-open while (lo < hi) form with a floored mid, on feasible and on infeasible, returning , computes and without off-by-one errors.
  • Binary search on the answer: when feasibility is monotone in a numeric parameter, binary search the parameter and call a feasibility check at each step, for total cost .
  • Recipe for each problem: name the answer range , the monotone predicate , and an efficient check. Koko (speed; sum of ceilings), ship/split (max-group cap; greedy partition in ), integer square root () all fit this mold.
  • The correctness invariant keeps infeasible and feasible; floored mid plus asymmetric updates guarantee termination. Mixing the closed and half-open templates is where the classic bugs live.
  • For a real-valued answer, run parametric search: a fixed iteration count or tolerance. Monotonicity of is the only precondition that matters.

Footnotes

  1. CLRS, Ch. 2 — Binary search (Exercise 2.3-5): the sorted-array search and its boundary (insertion-point) variants.
  2. Skiena, §4.x — Binary Search and Related: searching a monotone predicate over a numeric range (binary search on the answer) and parametric search on a real interval.
Practice

╌╌ END ╌╌