Network Flow
How much can flow through a network from source to sink? Max-flow is a surprisingly general model — once you see a problem as flow, a whole toolbox opens up.
╌╌╌╌
Imagine a network of pipes carrying water from a source to a sink, each pipe with a maximum capacity. How much water can you push through end to end? This is the maximum flow problem, and its reach extends far past plumbing: routing traffic, scheduling jobs, matching applicants to jobs, even segmenting images all reduce to it. Max-flow is one of algorithm design's great modeling tools; the art, as Erickson stresses, lies in recognizing a flow problem.1
Flow networks
Think of routing a single commodity (water, electricity, traffic, money) from a source to a sink across a network whose edges have limited throughput.
A flow is a function on the edges. The single most useful piece of bookkeeping is the net flow out of a vertex , written using the boundary symbol :
A flow is feasible when it obeys two rules:
- Non-negativity & capacity. for every edge: no edge runs backward, and none carries more than its capacity.
- Conservation. for every vertex : nothing is created or destroyed at an interior node. (When conservation holds at every vertex, is a circulation.)
The value of an – flow is the net amount leaving the source, . A short computation shows this is the same as the net amount arriving at the sink: summing over all vertices counts each edge's flow once with a and once with a , so ; conservation kills every interior term, leaving , i.e. . The maximum-flow problem is to find a feasible flow of greatest value.
Below is a worked network; each edge is labeled (flow over capacity), realizing units pushed from to :
Conservation is exactly the statement that each interior node balances. Take vertex : its incident flows are the outgoing , , and the incoming , so
Checking at every interior vertex this way is how you verify a flow is feasible.
Augmenting paths and the residual graph
How do we increase a flow? Find a path from to that still has spare room and push more along it. The bookkeeping device that makes this precise, and makes the algorithm correct, is the residual graph.
Concretely, each original edge splits into three cases on its residual capacity :
- If (empty), put with .
- If (partly used), put both: with , and with .
- If (saturated), put only the reverse with .
(We assume has no anti-parallel edges, at most one of , is in , so these reverse edges are unambiguous. Any graph can be preprocessed to satisfy this by splitting an edge through a dummy vertex.)
The backward edges are the subtle part that matters: pushing flow along cancels existing flow on , letting the algorithm undo earlier,
suboptimal decisions. That is exactly why a greedy fill paths until stuck
approach can get stuck below optimum, while augmenting paths cannot. Below, the
feasible flow from above (top) and the residual graph it induces (below);
each residual edge is labeled with its capacity :
A single augmentation is easy to picture on a small network. Each edge below is labeled ; the highlighted residual path has residual capacities , so its bottleneck is . Pushing one unit along it saturates and lifts the flow value from to , with every interior vertex still balanced:
The augmentation lemma
Before trusting the algorithm we must prove that augmenting actually produces a better feasible flow. Augmenting along a path by amount produces a new function defined edge-by-edge:
The proof is four short claims, each just unfolding the definitions:
- Non-negativity (). The only danger is a decrease, , which happens when the reverse lies on . But then forces , and , so .
- Capacity (). The only danger is an increase, , which happens when lies on . Then forces , and , so .
- Conservation ( for ). If , none of its incident terms change. If is an interior vertex of , the path enters once and leaves once; that pair of edges shifts the in- and out-flow by the same (with matching signs whether the path step is a forward or reverse edge), so .
- Value (). uses exactly one edge out of and none into . That single edge gains (forward) or loses across a reverse step, so , i.e. .
So every augmentation strictly increases the value while keeping legal. The converse, which the min-cut proof will need, also holds: if is not maximum, must contain an augmenting path. Putting the two together gives the equivalence at the center of the theory
Ford-Fulkerson and Edmonds-Karp
The method is then irresistibly simple: while an augmenting path exists, push flow along it.2
- 1foreach edge do
- 2
- 3while there exists an augmenting path from to in do
- 4bottleneck
- 5foreach edge on do
- 6push forward
- 7cancel on reverse edge
- 8return
Correctness is immediate from the augmentation lemma: each iteration produces a feasible flow of strictly larger value, and the method halts exactly when no augmenting path remains, which (by the equivalence above, proved in full below) is precisely the condition for a maximum flow.
Running time rests on an invariant: if every capacity is an integer, , then all residual capacities and flow values stay integers throughout. Each augmentation then raises by at least , so there are at most iterations, each costing to find a path and push along it (here , ):
If additionally every capacity lies in , then , giving . This is pseudo-polynomial: fine for small capacities, but it can genuinely crawl when they are huge. The classic bad case has a middle edge of capacity between two paths of capacity ; a naïve solver alternately pushes one unit forward and one unit back, taking augmentations. (On irrational capacities the method may not even terminate.)
The fix, due to , is to always pick the shortest augmenting path (fewest edges): find it with BFS in the residual graph. The key lemma is that each edge disappears from at most times; since each iteration causes at least one disappearance, the number of iterations is . With per BFS this gives a strongly polynomial bound, independent of capacities. (Orlin's 2012 algorithm improves this to ; you may cite it as a black-box max-flow subroutine.)
- 1foreach edge do
- 2
- 3while BFS finds a shortest path from to in do
- 4augment along by its bottleneck
- 5return
The max-flow min-cut theorem
When does Ford-Fulkerson stop, and why is the result optimal? The answer is one of the most elegant theorems in algorithms, linking flow to cuts.3
The dual view is a minimum-cut problem: an adversary wants to cut a cheap set of edges so that no – flow can get through. The two problems turn out to be the same problem.
The proof has two directions.
Easy direction (weak duality): every feasible flow and every – cut satisfy . We compute by summing over all of , where conservation makes the interior terms vanish, leaving only :
So ; flow is bounded by the narrowest cut.
Hard direction: there exist a flow and an – cut with . Take to be a maximum flow. If had an augmenting path, the augmentation lemma would yield a larger flow, a contradiction. Therefore is not reachable from in . Define the reachable set and its complement:
Then and , so is a genuine – cut. Now the two boundary observations:
- Forward edges are saturated. Every with , has ; otherwise it would contribute a forward residual edge, making reachable from , so , a contradiction.
- Backward edges are empty. Every with , has ; otherwise it would contribute a reverse residual edge , again making reachable.
These are exactly the two inequalities that were slack in the easy direction. With
forward edges saturated () and backward edges unused (), both
steps become equalities, so . Combined with weak duality,
is a maximum flow and is a minimum cut.
Because the hard direction shows the converse of the augmentation lemma, we get the promised three-way equivalence — all three statements are the same fact:
This is also why is correct: it halts exactly when has no – path, which is precisely when is maximum.
Application: bipartite matching
The payoff of the flow abstraction is that other problems melt into it.4 A useful rule of thumb: whenever a graph problem looks for paths or cycles under some per-edge or per-vertex budget constraint, reach for max-flow as a subroutine, and many non-graph problems (task assignment, base-station connection, workshop scheduling) yield to it too once you find the right network.
Consider bipartite matching: a set of applicants, a set of jobs, and an edge for each applicant qualified for a job (edge iff ). A matching pairs applicants to jobs with no one used twice; we want a maximum matching.
Build a flow network: add a source with a unit-capacity edge to every applicant, add a sink with a unit-capacity edge from every job, and orient each qualification edge from to with capacity .
The reduction works because of an exact correspondence in both directions: if tasks can be assigned, those disjoint routes form a flow of value (easy); conversely, if the max-flow value is , then tasks can be assigned (less obvious, since it needs integrality).
Here every capacity is , so the maximum flow is / on every edge. Its path-flow decomposition is a set of vertex-disjoint routes; reading off the middle edge of each gives a matching, and the value of the max flow equals the size of the maximum matching. So one run of solves bipartite matching.
For the network above, the integral max flow saturates three vertex-disjoint routes, shown in blue: , , . The flow value is , so the maximum matching has size — every applicant is placed:
This is the template for
countless reductions (assignment, base-station connection, workshop scheduling,
vertex-disjoint paths), all of which become build a network, compute max flow, read off an integral solution.
Takeaways
- A flow network routes flow from to under capacity and conservation () constraints; the value is what we maximize.
- The residual graph adds reverse edges (capacity ) that let us undo flow; the augmentation lemma proves pushing along any – path of yields a feasible flow of value .
- augments until has no – path; with integer capacities it runs in (pseudo-polynomial), and 's BFS choice makes it , capacity-independent.
- The max-flow min-cut theorem (max flow min cut) gives a three-way equivalence: is maximum has no augmenting path for some cut. The hard direction builds the cut from the set reachable from in .
- By the integrality theorem, an all-integer network has an integral max flow that decomposes into path flows; bipartite matching is the canonical unit-capacity instance, the gateway to flow's vast catalog of applications.
Footnotes
- Erickson, Ch. 10 & 11 — Maximum Flows and Applications — the art of recognizing a problem as a flow problem. ↩
- CLRS, Ch. 26 — Maximum Flow — the Ford-Fulkerson method augmenting along residual paths. ↩
- CLRS, Ch. 26 — Maximum Flow — the max-flow min-cut theorem equating maximum flow with minimum cut. ↩
- Skiena, §6 — Weighted Graph Algorithms — modeling problems such as bipartite matching as max-flow instances. ↩
╌╌ END ╌╌