Sequences & Strings/Two Pointers, Sliding Windows & Prefix Sums

Lesson 5.11,533 words

Two Pointers, Sliding Windows & Prefix Sums

A family of array idioms that collapse an obvious O(n2)O(n^2) scan into a single O(n)O(n) pass by maintaining an invariant as indices move. We meet two pointers (converging on a sorted array, and a fast/slow pair for in-place rewriting), the sliding window (fixed and variable size, amortized O(n)O(n)), and prefix sums, which answer any range-sum in O(1)O(1) and count subarrays summing to kk.

╌╌╌╌

We open a new module on sequences (arrays and strings), and the first thing to learn is a way of thinking that recurs everywhere in it. A great many array questions have an obvious brute-force answer that examines every pair or every subarray: two nested loops, work. The techniques in this lesson share a single trick for collapsing that quadratic scan into a single linear pass. Instead of re-examining the array from scratch at each step, we maintain a small amount of state (one or two indices, a window, a running sum) together with an invariant that this state always tells us something true. Each step advances an index and patches the invariant in , so the whole pass is .1 The art is choosing the invariant. The loop-invariant discipline from our first algorithm, establish it, maintain it, read off the answer at termination, is exactly the reasoning we use to prove these correct.2

Two pointers, opposite ends

The cleanest instance lives on a sorted array. Suppose is sorted ascending and we want indices with for a target . Brute force tries all pairs. Instead, place one pointer at each end, and , and let them converge:

At each step we look at . If we are done. If , then paired with anything still available is too small. Since is the largest sum involving in the window and it already fell short, cannot be part of any solution, and we discard it by advancing . The case is symmetric: is too large to pair with anything remaining, so we drop it by decrementing .

Converging pointers on sorted , target . Each step compares to and discards one end; the pair is found in five steps

The pointers start apart and each step closes the gap by one, so the loop runs at most times: time, space. This is exactly Two Sum II on a sorted input, and the reason sorting first () can beat a hash table when the array is already sorted or space is tight.

The same converging-pointers move solves Container With Most Water: with heights , the area between walls and is . We start at the widest pair and always advance the pointer at the shorter wall. Moving the taller one can only shrink the width without ever raising the binding height, so it can never improve on the current best. The shorter wall, by contrast, might be replaced by a taller one. One pass, .

Container With Most Water: the area is , bound by the shorter wall (here ). Moving the taller wall inward only loses width at the same height, so we advance the shorter wall instead

Two pointers, same direction (fast/slow)

A second flavor sends both pointers the same way at different speeds. The canonical use is rewriting an array in place: a write pointer trails a read pointer , and only advances when is an element we want to keep. To remove duplicates from a sorted array:

Algorithm:Dedup(a)\textsc{Dedup}(a) — compact a sorted array in place, returning new length
  1. 1
    w1w \gets 1
  2. 2
    for r1r \gets 1 to n1n-1 do
  3. 3
    if a[r]a[w1]a[r] \ne a[w-1] then
  4. 4
    a[w]a[r]a[w] \gets a[r]
  5. 5
    ww+1w \gets w + 1
  6. 6
    return ww

The read pointer scans every element once and the write pointer never overtakes it, so the routine is time and extra space; it overwrites the input rather than allocating output. The same trailing-write pattern underlies in-place partition (the core of quicksort and quickselect): a write pointer marks the boundary between elements already placed below the pivot and the rest.

Fast/slow dedup on a sorted array: write index trails read , advancing only when , so holds the compacted prefix so far

Sliding windows

A window is a contiguous range that we slide rightward across the array while keeping it consistent with some property. Two regimes appear.

Fixed size . To compute, say, every window's sum, we do not re-add elements each time. We add the entering element and subtract the leaving one: when the window advances from to , update . Each slide is , so all windows cost total.

Variable size. Here the window grows and shrinks to stay feasible. The pattern: advance to expand the window greedily; whenever the window violates its constraint, advance to shrink it until the constraint holds again. The double loop looks , but it is not:

A variable window slides across the array; and each move only rightward, so the pass is amortized

Worked: smallest subarray with sum . Given positive integers and a target , find the shortest contiguous subarray whose sum is at least (Minimum Size Subarray Sum). Keep a running window sum; expand to grow it, and the moment the sum reaches , shrink from to find the tightest window ending at .

Algorithm:MinSubarray(a,S)\textsc{MinSubarray}(a, S) — shortest window with sum S\ge S, in O(n)O(n)
  1. 1
    l0,  sum0,  bestl \gets 0,\ \ \text{sum} \gets 0,\ \ \text{best} \gets \infty
  2. 2
    for r0r \gets 0 to n1n-1 do
  3. 3
    sumsum+a[r]\text{sum} \gets \text{sum} + a[r]
  4. 4
    while sumS\text{sum} \ge S do
  5. 5
    bestmin(best, rl+1)\text{best} \gets \min(\text{best},\ r - l + 1)
  6. 6
    sumsuma[l]\text{sum} \gets \text{sum} - a[l]
  7. 7
    ll+1l \gets l + 1
  8. 8
    return (best=) ? 0:best(\text{best} = \infty)\ ?\ 0 : \text{best}

Because all entries are positive, the window sum is monotone in width, so once it drops below no further shrinking helps — the while exits and moves on.

Worked: longest substring without repeats. For Longest Substring Without Repeating Characters, the window must contain no duplicate character. We keep a map last[c] of the most recent index of each character. Expand ; if was last seen at a position , that occurrence is inside the window, so we jump to one past it. The window is always duplicate-free, and we track its maximum length.

Algorithm:LongestUnique(a)\textsc{LongestUnique}(a) — longest duplicate-free window, in O(n)O(n)
  1. 1
    l0,  best0,  last{}l \gets 0,\ \ \text{best} \gets 0,\ \ \text{last} \gets \{\}
  2. 2
    for r0r \gets 0 to n1n-1 do
  3. 3
    if a[r]lasta[r] \in \text{last} and last[a[r]]l\text{last}[a[r]] \ge l then
  4. 4
    llast[a[r]]+1l \gets \text{last}[a[r]] + 1
  5. 5
    last[a[r]]r\text{last}[a[r]] \gets r
  6. 6
    bestmax(best, rl+1)\text{best} \gets \max(\text{best},\ r - l + 1)
  7. 7
    return best\text{best}

Each character is visited once by , and only moves forward, so this is time and space for an alphabet of size .

Prefix sums

The sliding window assumed we could update a sum incrementally. Prefix sums generalize this to answer any range-sum query in after a linear precompute. Define

so has length and is built in one pass: . Then the sum of any subarray telescopes:

With , any range sum is the difference of two prefix entries:

The highlighted entries are and ; their difference is exactly . One subtraction, no rescanning.

Counting subarrays with sum . Prefix sums turn a subarray scan into for Subarray Sum Equals K. A subarray has sum iff , i.e. . So as we sweep a running prefix left to right, the number of valid left endpoints is the number of earlier prefixes equal to . Keep a hash map of prefix-value frequencies seen so far:

Algorithm:CountSubarrays(a,k)\textsc{CountSubarrays}(a, k) — number of subarrays summing to kk, in O(n)O(n)
  1. 1
    count0,  prefix0\text{count} \gets 0,\ \ \text{prefix} \gets 0
  2. 2
    freq{0:1}\text{freq} \gets \{\,0 : 1\,\}
    empty prefix seen once
  3. 3
    for j0j \gets 0 to n1n-1 do
  4. 4
    prefixprefix+a[j]\text{prefix} \gets \text{prefix} + a[j]
  5. 5
    countcount+freq[prefixk]\text{count} \gets \text{count} + \text{freq}[\text{prefix} - k]
    0 if absent
  6. 6
    freq[prefix]freq[prefix]+1\text{freq}[\text{prefix}] \gets \text{freq}[\text{prefix}] + 1
  7. 7
    return count\text{count}

Seeding freq with accounts for subarrays that start at index (those need ). One pass, time and space. Note this works for any integers, positive or negative, unlike the sliding window, which needed positivity for monotonicity.

Difference arrays, the dual. To apply many range updates "add to every " and only then read the array, invert the relationship. Keep a difference array and for each update do and , two touches. A single prefix-sum pass over at the end materializes the final array, since . Thus range-adds cost instead of .

Difference array: range-add to pokes and (two touches); one prefix-sum sweep of then materializes the update

Prefix sums extend to two dimensions as well: precompute , and any axis-aligned rectangle sum is recovered by inclusion–exclusion with four lookups in .

2-D prefix sums: the sum over the shaded rectangle is — subtract the two overhanging strips, then add back corner which both subtractions removed twice

Takeaways

  • These idioms all replace a pair-or-subarray scan with a single pass by maintaining an invariant as indices advance and patching it in per step.3
  • Two pointers from opposite ends solve sorted-array pair problems (Two Sum II, Container With Most Water): the move that discards an index provably discards no solution, giving time, space.
  • Fast/slow pointers (a trailing write behind a read) rewrite an array in place, dedup or partition, in time and extra space.
  • A sliding window expands and shrinks to keep a property; since each index enters and leaves the window once, the nested loop is amortized .
  • Prefix sums answer any range sum as in , and a hash map of prefix frequencies counts subarrays with sum in , even with negative numbers.
  • The difference-array dual turns range-adds into , and prefix sums generalize to 2-D rectangle queries in .

Footnotes

  1. Erickson, Ch. — Arrays and Amortization: the amortized argument that a two-pointer window, though nested, does total work because each index is enqueued and dequeued once. 2
  2. CLRS, Ch. 2 — Getting Started (§2.1): the loop-invariant method (initialization, maintenance, termination) used here to prove each pointer scheme correct.
  3. Skiena, § — Sorting & array techniques: two-pointer and windowing idioms on sorted arrays as the linear-time alternative to a quadratic scan.
Practice

╌╌ END ╌╌