Dynamic Programming on Graphs

The Graphs module's Shortest Paths lesson introduced three algorithms (Dijkstra, Bellman–Ford, Floyd–Warshall) and remarked that one operation, relaxation, underlies them all. The Principles of Dynamic Programming lesson then distilled DP into optimal substructure plus overlapping subproblems. This lesson connects the two ideas with a single thesis: many graph algorithms are dynamic programs, and relaxation is the DP transition.

The pattern is always the same. We define a subproblem as the best value obtainable using a restricted resource, and we grow the resource one unit at a time. The resource is whatever we ration:

the set of intermediate vertices a path may pass through (Floyd–Warshall);
the number of edges a path may use (Bellman–Ford, $K$ -stops);
a topological prefix of an acyclic graph (DAG-DP);
a subset of vertices already visited (Held–Karp, the Bitmask DP lesson).

Relaxing an edge $(u, v)$, that is, testing whether routing through $u$ improves the current estimate for $v$, is the $\min$ over choices in a DP recurrence. Under this framing, "is there a path?", "what is the cheapest path?", and "how many shortest paths are there?" all become the same exercise: pick the resource, write the recurrence, and choose an evaluation order that respects the dependencies.¹
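As a minimal sketch of that correspondence, relaxation can be written as a single min-update. The function name `relax`, the dict-based distance table, and the three-vertex example are illustrative assumptions, not part of the lesson's own code.

```python
from math import inf


def relax(distance: dict[str, float], u: str, v: str, weight: float) -> bool:
    """Try to improve the estimate for v by routing through u; report success."""
    candidate = distance[u] + weight
    if candidate < distance[v]:
        distance[v] = candidate
        return True
    return False


# hypothetical example: s -> a costs 4, a -> b costs 1, s -> b costs 7.
distance = {"s": 0.0, "a": inf, "b": inf}
improved_a = relax(distance, "s", "a", 4.0)
improved_b_direct = relax(distance, "s", "b", 7.0)
improved_b_via_a = relax(distance, "a", "b", 1.0)  # 4 + 1 = 5 beats 7
repeated_direct = relax(distance, "s", "b", 7.0)   # 7 no longer beats 5
```

Every algorithm in this lesson differs only in which edges it relaxes and in what order, not in the update itself.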

Floyd–Warshall: intermediate vertices as the resource

Number the vertices $1, \dots, V$ . Restrict which vertices a path is allowed to pass through internally, and relax that restriction one vertex at a time.

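Let $d_{k}[i][j]$ be the length of the shortest $i \to j$ path whose internal vertices all lie in $\{1, \dots, k\}$. The base case $d_{0}[i][j]$ is $0$ if $i = j$, $w(i, j)$ if the edge exists, and $+\infty$ otherwise: with no intermediate vertices allowed, only direct edges count. The induction is the core of the method. Assuming no negative cycles, a shortest path with internal vertices in $\{1, \dots, k\}$ either avoids $k$ entirely, or passes through $k$ exactly once and splits into an $i \to k$ half and a $k \to j$ half whose internal vertices lie in $\{1, \dots, k - 1\}$.

This lemma is a verbatim DP transition, route through $k$ or don't: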

$$d_{k}[i][j] = \min\Bigl(\underbrace{d_{k-1}[i][j]}_{\text{avoid } k},\ \underbrace{d_{k-1}[i][k] + d_{k-1}[k][j]}_{\text{through } k}\Bigr).$$

Figure: route through $k$, or don't: the Floyd–Warshall $\min$.

Because $d_{k}$ depends only on $d_{k - 1}$, and round $k$ never changes the cells it reads (row $k$ and column $k$ stay fixed, since $d_{k}[i][k] = d_{k-1}[i][k]$ and $d_{k}[k][j] = d_{k-1}[k][j]$ whenever $d[k][k] = 0$), we can drop the $k$ index and update the matrix in place. This yields the entire algorithm in three nested loops with $k$ outermost:

Algorithm: \textsc{Floyd-Warshall}(W), all-pairs shortest paths in $O(V^3)$

    $d \gets W$    // $w(i,j)$; $0$ on diagonal, else $\infty$
    for $k \gets 1$ to $V$ do
        for $i \gets 1$ to $V$ do
            for $j \gets 1$ to $V$ do
                if $d[i][k] + d[k][j] < d[i][j]$ then
                    $d[i][j] \gets d[i][k] + d[k][j]$
                    $\text{next}[i][j] \gets \text{next}[i][k]$    // for reconstruction
    return $d$

The triple loop is $Θ (V^{3})$ time and $Θ (V^{2})$ space, independent of the edge count, which is what makes Floyd–Warshall the method of choice for dense graphs where we want every pair at once.²

For a worked example, take four vertices with the weight matrix (rows are sources, $\infty$ where no direct edge exists):

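$$d_{0} = \begin{pmatrix} 0 & 3 & \infty & 7 \\ 8 & 0 & 2 & \infty \\ 5 & \infty & 0 & 1 \\ 2 & \infty & \infty & 0 \end{pmatrix} \;\longrightarrow\; d_{4} = \begin{pmatrix} 0 & 3 & 5 & 6 \\ 5 & 0 & 2 & 3 \\ 3 & 6 & 0 & 1 \\ 2 & 5 & 7 & 0 \end{pmatrix}$$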

Each level admits one more intermediate vertex. Allowing vertex $1$ ($k = 1$) fills three entries that were $\infty$: $d[2][4] = d[2][1] + d[1][4] = 8 + 7 = 15$, $d[3][2] = d[3][1] + d[1][2] = 5 + 3 = 8$, and $d[4][2] = d[4][1] + d[1][2] = 2 + 3 = 5$. Allowing vertex $2$ ($k = 2$) sets $d[1][3] = d[1][2] + d[2][3] = 3 + 2 = 5$ and $d[4][3] = d[4][2] + d[2][3] = 5 + 2 = 7$. Allowing vertex $3$ ($k = 3$) improves $d[1][4]$ to $d[1][3] + d[3][4] = 5 + 1 = 6$, $d[2][1]$ to $d[2][3] + d[3][1] = 2 + 5 = 7$, and $d[2][4]$ from $15$ to $d[2][3] + d[3][4] = 2 + 1 = 3$. The last level ($k = 4$) routes through vertex $4$ to drop $d[2][1]$ further to $d[2][4] + d[4][1] = 3 + 2 = 5$, $d[3][1]$ to $d[3][4] + d[4][1] = 1 + 2 = 3$, and $d[3][2]$ to $d[3][4] + d[4][2] = 1 + 5 = 6$. Reading $d_{4}$ off gives every pairwise shortest distance at once.

Figure: Floyd–Warshall on four vertices: the base matrix $d_{0}$ (direct edges only) relaxes level by level; the $k = 1$ pass fills $d[4][2] = 5$ via vertex 1, shaded, illustrating the route-through-$k$ update.
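To check the four-vertex example by machine, the sketch below runs the in-place triple loop and keeps a snapshot after each value of $k$. It assumes the weight matrix as read from the worked example (rows are sources), shifted to 0-based indices; the function name `floyd_warshall_levels` is hypothetical.

```python
from math import inf


def floyd_warshall_levels(weights: list[list[float]]) -> list[list[list[float]]]:
    """Return [d_0, d_1, ..., d_V]: a snapshot of the in-place matrix after each k."""
    size = len(weights)
    d = [row[:] for row in weights]           # copy so the input stays untouched
    levels = [[row[:] for row in d]]          # d_0: direct edges only
    for k in range(size):                     # admit vertex k as an intermediate
        for i in range(size):
            for j in range(size):
                through_k = d[i][k] + d[k][j]
                if through_k < d[i][j]:
                    d[i][j] = through_k
        levels.append([row[:] for row in d])  # d_{k+1} in the text's 1-based numbering
    return levels


# the worked example, 0-indexed: vertex 1 in the text is index 0 here.
d0 = [
    [0, 3, inf, 7],
    [8, 0, 2, inf],
    [5, inf, 0, 1],
    [2, inf, inf, 0],
]
levels = floyd_warshall_levels(d0)
d4 = levels[-1]
```

Here `levels[1][3][1]` is the text's $d[4][2]$ after the $k = 1$ pass, and `d4` is the final all-pairs matrix.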

Path reconstruction uses a $next$ matrix (above) initialized to $next [i] [j] = j$ for each edge: whenever routing through $k$ wins, the first hop out of $i$ toward $j$ becomes the first hop toward $k$ . To rebuild a path, follow $i \to next [i] [j] \to \dots \to j$ . (A $pred$ matrix storing the last vertex before $j$ is the symmetric alternative.) The practice problem Find the City With the Smallest Number of Neighbors at a Threshold Distance is Floyd–Warshall verbatim: compute all-pairs distances, then count for each city how many others lie within the threshold.
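The next-hop bookkeeping described above might look like the following sketch. It reuses the four-vertex example with 0-based indices; the names `floyd_warshall_with_next` and `rebuild_path`, and the choice of `None` for "no path known", are assumptions made for illustration.

```python
from math import inf
from typing import Optional


def floyd_warshall_with_next(weights: list[list[float]]):
    """All-pairs distances plus a next-hop matrix for path reconstruction."""
    size = len(weights)
    d = [row[:] for row in weights]
    # next_hop[i][j] is the first vertex after i on the best known i -> j path.
    next_hop: list[list[Optional[int]]] = [
        [j if i != j and weights[i][j] != inf else None for j in range(size)]
        for i in range(size)
    ]
    for k in range(size):
        for i in range(size):
            for j in range(size):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
                    next_hop[i][j] = next_hop[i][k]  # head toward k first
    return d, next_hop


def rebuild_path(next_hop, source: int, target: int) -> Optional[list[int]]:
    """Follow next-hop links from source to target; None if no path exists."""
    if source == target:
        return [source]
    if next_hop[source][target] is None:
        return None
    path = [source]
    cursor = source
    while cursor != target:
        cursor = next_hop[cursor][target]
        path.append(cursor)
    return path


weights = [
    [0, 3, inf, 7],
    [8, 0, 2, inf],
    [5, inf, 0, 1],
    [2, inf, inf, 0],
]
distance, next_hop = floyd_warshall_with_next(weights)
route = rebuild_path(next_hop, 1, 0)  # the text's vertex 2 to vertex 1
```

Storing only the first hop keeps the extra space at $Θ(V^{2})$; each path is rebuilt on demand in time proportional to its length.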

transitive_closure.py (Python)

from collections.abc import Hashable
from typing import Generic, TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

class ReachabilityMatrix(Generic[Label]):
  """
    The transitive closure of a graph: which vertices reach which.\n
    `reaches(source, target)` is the boolean analog of an all-pairs distance.\n
  """

  def __init__(self, labels: list[Label]) -> None:
    self.labels: list[Label] = labels
    self._index: dict[Label, int] = {
      label: position for position, label in enumerate(labels)
    }
    self._size: int = len(labels)
    self._reachable: list[list[bool]] = [
      [False for _ in range(self._size)] for _ in range(self._size)
    ]

  @property
  def size(self) -> int:
    """
      The number of vertices, i.e. the side length of the matrix.\n
    """
    return self._size

  def position_of(self, label: Label) -> int:
    """
      The row/column index that `label` occupies in the matrix.\n
    """
    return self._index[label]

  def connect(self, source_position: int, target_position: int) -> None:
    """
      Record that the vertex at `target_position` is reachable from the\n
      vertex at `source_position`.\n
    """
    self._reachable[source_position][target_position] = True

  def reaches_positions(self, source_position: int, target_position: int) -> bool:
    """
      Whether the vertex at `target_position` is reachable from the vertex\n
      at `source_position`, indexed by matrix position.\n
    """
    return self._reachable[source_position][target_position]

  def reaches(self, source: Label, target: Label) -> bool:
    """
      Whether some directed path leads from `source` to `target`.\n
    """
    return self._reachable[self._index[source]][self._index[target]]

def transitive_closure(graph: Graph[Label]) -> ReachabilityMatrix[Label]:
  """
    The reachability matrix of `graph` via Warshall's algorithm.\n
    Every vertex reaches itself; a direct edge seeds the base case, then\n
    each intermediate vertex `pivot` is folded in with the or/and step.\n
  """
  # an empty reachability matrix over the graph's labels.
  labels: list[Label] = graph.labels
  result: ReachabilityMatrix[Label] = ReachabilityMatrix(labels)
  size: int = result.size

  # base case: every vertex reaches itself.
  for position in range(size):
    result.connect(position, position)

  # base case: every direct edge is a one-hop reachability.
  for edge in graph.edges():
    result.connect(
      result.position_of(edge.source.label),
      result.position_of(edge.target.label),
    )

  # fold in one intermediate vertex at a time, in place.
  for pivot in range(size):
    for source in range(size):
      if not result.reaches_positions(source, pivot):
        continue

      # source reaches pivot, so it reaches everything pivot reaches.
      for target in range(size):
        if result.reaches_positions(pivot, target):
          result.connect(source, target)

  return result

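graph.py (Python)

# postpone annotation evaluation: Edge's signatures name Vertex before it is defined.
from __future__ import annotations

from collections.abc import Hashable, Iterator
from typing import Generic, Optional, TypeVar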


Label = TypeVar("Label", bound=Hashable)


class Edge(Generic[Label]):
  """
    A directed connection from `source` to `target`, carrying a weight.\n
  """

  def __init__(
    self,
    source: Vertex[Label],
    target: Vertex[Label],
    weight: float = 1.0,
  ) -> None:
    self.source: Vertex[Label] = source
    self.target: Vertex[Label] = target
    self.weight: float = weight

  def __repr__(self) -> str:
    return f"Edge({self.source.label!r} -> {self.target.label!r}, w={self.weight})"


class Vertex(Generic[Label]):
  """
    A graph vertex: a label plus the list of edges leaving it.\n
  """

  def __init__(self, label: Label) -> None:
    self.label: Label = label
    self.outgoing: list[Edge[Label]] = []

  def neighbors(self) -> list[Vertex[Label]]:
    """
      The vertices reachable from this one by a single edge.\n
    """
    return [edge.target for edge in self.outgoing]

  def edge_to(self, label: Label) -> Optional[Edge[Label]]:
    """
      The outgoing edge to the vertex with `label`, or None.\n
    """
    for edge in self.outgoing:
      if edge.target.label == label:
        return edge
    return None

  def __repr__(self) -> str:
    return f"Vertex({self.label!r})"


class Graph(Generic[Label]):
  """
    A graph of Vertex objects linked by Edge objects.\n
    Pass `directed=True` for a digraph; otherwise each `add_edge` inserts\n
    the reverse edge too.\n
  """

  def __init__(self, directed: bool = False) -> None:
    self.directed: bool = directed
    self._vertices: dict[Label, Vertex[Label]] = {}

  def add_vertex(self, label: Label) -> Vertex[Label]:
    """
      Return the vertex for `label`, creating it if it is absent.\n
    """
    # reuse the existing vertex, or mint and register a fresh one.
    vertex = self._vertices.get(label)
    if vertex is None:
      vertex = Vertex(label)
      self._vertices[label] = vertex
    return vertex

  def add_edge(
    self,
    source_label: Label,
    target_label: Label,
    weight: float = 1.0,
  ) -> None:
    """
      Connect two labels (creating either vertex as needed).\n
      Adds the reverse edge as well when the graph is undirected.\n
    """
    source = self.add_vertex(source_label)
    target = self.add_vertex(target_label)

    # link source to target, and mirror it back when undirected.
    source.outgoing.append(Edge(source, target, weight))
    if not self.directed:
      target.outgoing.append(Edge(target, source, weight))

  def vertex(self, label: Label) -> Vertex[Label]:
    """
      The vertex carrying `label` (raises KeyError if absent).\n
    """
    return self._vertices[label]

  @property
  def vertices(self) -> list[Vertex[Label]]:
    """
      Every vertex, in insertion order.\n
    """
    return list(self._vertices.values())

  @property
  def labels(self) -> list[Label]:
    """
      Every vertex label, in insertion order.\n
    """
    return list(self._vertices)

  def edges(self) -> Iterator[Edge[Label]]:
    """
      Each edge once — an undirected edge is yielded a single time.\n
    """
    # track undirected endpoint pairs so each is emitted only once.
    seen: set[frozenset[Label]] = set()

    for vertex in self._vertices.values():
      for edge in vertex.outgoing:
        # skip an undirected edge already yielded from the other endpoint.
        if not self.directed:
          endpoints = frozenset((edge.source.label, edge.target.label))
          if endpoints in seen:
            continue
          seen.add(endpoints)

        yield edge

  def __contains__(self, label: Label) -> bool:
    return label in self._vertices

  def __iter__(self) -> Iterator[Vertex[Label]]:
    return iter(self._vertices.values())

  def __len__(self) -> int:
    return len(self._vertices)

Bellman–Ford: edges as the resource

Floyd–Warshall rations intermediate vertices. Bellman–Ford rations edges, which makes it a single-source DP over path length.

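Let $D_{t}[v]$ be the cost of the cheapest walk from the source $s$ to $v$ that uses at most $t$ edges, so $D_{0}[s] = 0$ and $D_{0}[v] = \infty$ for every other $v$. A walk of at most $t$ edges to $v$ is either a walk of at most $t - 1$ edges to $v$, or such a walk to some predecessor $u$ followed by the edge $(u, v)$: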

$$D_{t}[v] = \min\Bigl(D_{t-1}[v],\ \min_{(u, v) \in E} \bigl(D_{t-1}[u] + w(u, v)\bigr)\Bigr).$$

Each round is one full sweep of edge relaxations, the same relaxation primitive from the Shortest Paths lesson, now indexed by a layer $t$. Because a shortest path in a graph with no negative cycle is simple, it uses at most $V - 1$ edges, so $D_{V - 1}$ is the answer: after $V - 1$ rounds the table has converged. If one more round still relaxes some edge, a walk of at least $V$ edges beats every shorter one, which can only happen along a negative cycle reachable from the source, so that extra relaxation is the standard negative-cycle test.³ The cost is $V - 1$ sweeps of $E$ edges, $O(VE)$ in total.
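A minimal sketch of the unbounded version with the extra-round negative-cycle test follows. It assumes vertices numbered `0..vertex_count-1` and an edge list of `(u, v, weight)` triples; the function name and the `None` return for a detected cycle are illustrative choices. Because there is no edge cap here, it updates in place, which can only speed convergence.

```python
from math import inf
from typing import Optional


def bellman_ford(
    vertex_count: int,
    edges: list[tuple[int, int, float]],
    source: int,
) -> Optional[list[float]]:
    """Shortest distances from source, or None if a negative cycle is reachable."""
    distance = [inf] * vertex_count
    distance[source] = 0.0

    # V - 1 sweeps suffice when no negative cycle is reachable.
    for _ in range(vertex_count - 1):
        changed = False
        for u, v, weight in edges:
            if distance[u] + weight < distance[v]:
                distance[v] = distance[u] + weight
                changed = True
        if not changed:  # already converged; later sweeps would do nothing
            break

    # one extra sweep: any further improvement implies a negative cycle.
    for u, v, weight in edges:
        if distance[u] + weight < distance[v]:
            return None
    return distance


acyclic_edges = [(0, 1, 4.0), (0, 2, 5.0), (2, 1, -3.0)]
finite = bellman_ford(3, acyclic_edges, 0)
# adding 1 -> 2 with weight 1 closes the cycle 1 -> 2 -> 1 of total weight -2.
cyclic = bellman_ford(3, acyclic_edges + [(1, 2, 1.0)], 0)
```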

The layered view pays off directly when the problem caps the number of edges. Cheapest Flights Within K Stops asks for the cheapest $s \to t$ route using at most $K$ stops, i.e. at most $K + 1$ edges. That is just $D_{K + 1} [t]$ : run exactly $K + 1$ Bellman–Ford rounds, no more.

Figure: Bellman–Ford as a DP over path length: $D_{t}[v]$ = cheapest $s \to v$ using $\leq t$ edges; each round relaxes against the previous layer, so capping $t$ solves the $K$-stops variant.

A concrete Cheapest Flights Within K Stops instance shows how the cap works. Take cities $s, a, b, t$ with directed flights $s \to a$ (cost $100$), $a \to t$ (cost $100$), $s \to b$ (cost $500$), and $b \to t$ (cost $50$), so the two-leg route $s \to a \to t$ competes against the two-leg route $s \to b \to t$. Ask for the cheapest $s \to t$ route with at most $K = 1$ stop, that is, at most $2$ edges, so we run exactly $K + 1 = 2$ rounds, each relaxing against the frozen previous layer:

layer	$s$	$a$	$b$	$t$
$D_{0}$	$0$	$\infty$	$\infty$	$\infty$
$D_{1}$	$0$	$100$	$500$	$\infty$
$D_{2}$	$0$	$100$	$500$	$min (100 + 100, 500 + 50) = 200$

Round $1$ reaches the one-edge neighbors $a$ and $b$. Round $2$ reaches $t$ two ways ($s \to a \to t = 200$ and $s \to b \to t = 550$) and keeps $200$. The answer is $200$, using exactly $2$ edges (one stop). A cheaper $3$-edge route, had one existed, would be invisible here precisely because we stopped after $2$ rounds: the cap is enforced by the round count. Freezing each layer before the next round matters just as much: with in-place updates, round $1$ could relax $s \to a$ and then $a \to t$ in the same sweep, reaching $t$ through two edges while claiming to have used one.
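The effect of the frozen layer can be seen by running a single round both ways on the four-city instance. This is a small sketch with a hypothetical helper `relax_round`; the edge order (with $s \to a$ listed before $a \to t$) is chosen deliberately so that the in-place variant misbehaves.

```python
from math import inf


def relax_round(
    distance: dict[str, float],
    edges: list[tuple[str, str, float]],
    freeze: bool,
) -> dict[str, float]:
    """One sweep over edges; read from the old layer when freeze is True."""
    updated = dict(distance)
    read_from = distance if freeze else updated  # in-place reads see fresh writes
    for u, v, cost in edges:
        if read_from[u] + cost < updated[v]:
            updated[v] = read_from[u] + cost
    return updated


flights = [("s", "a", 100), ("a", "t", 100), ("s", "b", 500), ("b", "t", 50)]
d0 = {"s": 0, "a": inf, "b": inf, "t": inf}

frozen_d1 = relax_round(d0, flights, freeze=True)          # t is still unreachable
frozen_d2 = relax_round(frozen_d1, flights, freeze=True)   # t reached with 2 edges
in_place_d1 = relax_round(d0, flights, freeze=False)       # t reached after "1" round
```

With the snapshot, `frozen_d1` and `frozen_d2` reproduce the $D_{1}$ and $D_{2}$ rows of the table; without it, `in_place_d1` already reports a price for $t$ after a single round, which overstates what a one-edge budget allows.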

bounded_bellman_ford.py (Python)

from collections.abc import Hashable
from math import inf
from typing import TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

def bounded_bellman_ford(
  graph: Graph[Label],
  source: Label,
  max_edges: int,
) -> dict[Label, float]:
  """
    Cheapest distance from `source` to every vertex using a path of at most\n
    `max_edges` edges. Unreachable vertices (within the budget) map to inf.\n
    Each round snapshots the previous layer so no extra edge is smuggled in.\n
  """
  # only the source is reachable with zero edges.
  distance: dict[Label, float] = {vertex.label: inf for vertex in graph}
  distance[source] = 0.0

  for _ in range(max_edges):

    # one round per allowed edge; relax against a frozen copy of the layer.
    previous: dict[Label, float] = dict(distance)
    for edge in graph.edges():
      origin: float = previous[edge.source.label]
      if origin == inf:
        continue
      candidate: float = origin + edge.weight
      distance[edge.target.label] = min(distance[edge.target.label], candidate)

  return distance

def cheapest_within_stops(
  number_of_cities: int,
  flights: list[tuple[int, int, int]],
  source: int,
  destination: int,
  max_stops: int,
) -> int:
  """
    Cheapest price from `source` to `destination` using at most `max_stops`\n
    intermediate stops, i.e. at most `max_stops + 1` flights — the LeetCode\n
    "Cheapest Flights Within K Stops" framing. Returns -1 if unreachable.\n
    Each flight is (from_city, to_city, price); cities are 0..n-1.\n
  """
  # build the flight network: one vertex per city, one weighted edge per flight.
  graph: Graph[int] = Graph(directed=True)
  for city in range(number_of_cities):
    graph.add_vertex(city)
  for from_city, to_city, price in flights:
    graph.add_edge(from_city, to_city, float(price))

  # K stops means K+1 flights; report -1 when the destination stays unreachable.
  distance: dict[int, float] = bounded_bellman_ford(
    graph, source, max_stops + 1
  )
  best: float = distance[destination]
  return -1 if best == inf else int(best)


DAG-DP: a topological prefix as the resource

When the graph is acyclic, the resource becomes trivial to ration: process vertices in topological order. A topological order lists every vertex after all of its predecessors, so by the time we reach $v$ , every $D [u]$ for an incoming edge $(u, v)$ is already final. The recurrence has no cycles, so a single pass suffices, with no convergence over $V - 1$ rounds and no $min$ over $k$ layers.

Algorithm: \textsc{DAG-Relax}(G, s), shortest/longest paths on a DAG in $O(V+E)$

    topologically sort $G$
    $D[v] \gets \infty$ for all $v$; $D[s] \gets 0$
    for each $u$ in topological order do
        for each edge $(u,v)$ do
            if $D[u] + w(u,v) < D[v]$ then    // use $\max$ for longest path
                $D[v] \gets D[u] + w(u,v)$
    return $D$

Each vertex and edge is touched once, so DAG-DP runs in $Θ (V + E)$ , faster than Dijkstra and immune to negative weights, because acyclicity replaces the non-negativity that Dijkstra needs. The longest path problem, NP-hard in general graphs, is just as easy on a DAG: swap $min$ for $max$ . Acyclicity is what makes the difference.⁴
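To illustrate the claim about negative weights, here is a minimal shortest-path sketch on a small DAG with one negative edge. It uses the standard library's `graphlib.TopologicalSorter` for the ordering; the function name `dag_shortest_paths`, the edge-list input format, and the example graph are assumptions made for this sketch.

```python
from graphlib import TopologicalSorter
from math import inf


def dag_shortest_paths(
    edges: list[tuple[str, str, float]],
    source: str,
) -> dict[str, float]:
    """Single-source shortest distances on a DAG; negative weights are allowed."""
    outgoing: dict[str, list[tuple[str, float]]] = {source: []}
    predecessors: dict[str, set[str]] = {source: set()}
    for u, v, weight in edges:
        outgoing.setdefault(u, []).append((v, weight))
        outgoing.setdefault(v, [])
        predecessors.setdefault(v, set()).add(u)
        predecessors.setdefault(u, set())

    distance = {vertex: inf for vertex in predecessors}
    distance[source] = 0.0

    # static_order yields every vertex after all of its predecessors
    # (and raises graphlib.CycleError if the graph is not acyclic).
    for u in TopologicalSorter(predecessors).static_order():
        if distance[u] == inf:
            continue  # unreachable vertices never relax their successors
        for v, weight in outgoing[u]:
            if distance[u] + weight < distance[v]:
                distance[v] = distance[u] + weight
    return distance


# hypothetical DAG: the negative edge a -> b makes the detour through a cheaper.
dag_edges = [("s", "a", 2), ("s", "b", 6), ("a", "b", -5), ("b", "t", 1), ("a", "t", 4)]
shortest = dag_shortest_paths(dag_edges, "s")
```

Dijkstra could settle `b` at cost 6 before discovering the cheaper route through `a`; the topological sweep cannot, because `a` is always processed before `b`.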

Figure: one pass in topological order, showing longest-path values with the chosen edge in the accent color.

In the figure, $d$ is reached by $3 + 4 = 7$ via $b$ versus $2 + 1 = 3$ via $c$; the $\max$ keeps the through-$b$ edge (drawn in the accent color), and because $b$ and $c$ were finalized before $d$ in topological order, one forward sweep settles it.

Counting paths uses the same order. To count all paths $s \to t$ in a DAG, set $cnt[s] = 1$ and accumulate $cnt[v] \mathrel{+}= cnt[u]$ over incoming edges in topo order. To count shortest paths in a weighted graph that may have cycles (Number of Ways to Arrive at Destination), process vertices in non-decreasing distance order (the order Dijkstra finalizes them, which is a topological order of the shortest-path DAG) and carry a parallel count:

$$\text{when } D[u] + w(u, v) = D[v]:\quad cnt[v] \mathrel{+}= cnt[u] \qquad \text{(a strict improvement instead resets } cnt[v]\text{)}.$$

The distance DP finds the best value; the count DP, evaluated in the same order, tallies how many ways achieve it.

dag_longest_path.py (Python)

from collections.abc import Hashable
from math import inf
from typing import Generic, NamedTuple, Optional, TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

class LongestPaths(NamedTuple, Generic[Label]):
  """
    Longest-path lengths from a single source on a DAG, with the predecessor\n
    links needed to rebuild the path that realizes each length.\n
  """
  length: dict[Label, float]
  predecessor: dict[Label, Optional[Label]]

  def path_to(self, target: Label) -> Optional[list[Label]]:
    """
      The longest source-to-`target` path as a label list, or None when\n
      `target` is unreachable from the source.\n
    """
    # an unreachable target has no realizing path.
    if self.length[target] == -inf:
      return None

    # walk predecessor links back from the target, then flip to forward order.
    path: list[Label] = []
    cursor: Optional[Label] = target
    while cursor is not None:
      path.append(cursor)
      cursor = self.predecessor[cursor]

    path.reverse()
    return path

def topological_order(graph: Graph[Label]) -> list[Label]:
  """
    A topological ordering of `graph`'s labels via Kahn's algorithm.\n
    Raises ValueError if the graph contains a directed cycle.\n
  """
  # tally each vertex's indegree across all edges.
  indegree: dict[Label, int] = {label: 0 for label in graph.labels}
  for edge in graph.edges():
    indegree[edge.target.label] += 1

  # seed the queue with every source (indegree zero), in insertion order.
  ready: list[Label] = [
    label for label in graph.labels if indegree[label] == 0
  ]

  # pop a ready vertex, then release neighbors that just lost their last edge.
  order: list[Label] = []
  while ready:
    current_label: Label = ready.pop()
    order.append(current_label)
    for edge in graph.vertex(current_label).outgoing:
      neighbor_label: Label = edge.target.label
      indegree[neighbor_label] -= 1
      if indegree[neighbor_label] == 0:
        ready.append(neighbor_label)

  # leftover vertices mean an unbroken cycle.
  if len(order) != len(graph):
    raise ValueError("graph is not a DAG: a directed cycle was found")
  return order

def dag_longest_path(graph: Graph[Label], source: Label) -> LongestPaths[Label]:
  """
    Longest paths from `source` on a directed acyclic `graph`.\n
    Relaxes outgoing edges toward the *larger* candidate in one topological\n
    sweep; unreachable vertices keep length -inf.\n
  """
  # every length starts at -inf; only the source is reachable for free.
  length: dict[Label, float] = {label: -inf for label in graph.labels}
  predecessor: dict[Label, Optional[Label]] = {
    label: None for label in graph.labels
  }
  length[source] = 0.0

  for current_label in topological_order(graph):

    # vertices unreachable from the source never relax their successors.
    if length[current_label] == -inf:
      continue

    # relax toward the larger candidate, recording the predecessor on a win.
    for edge in graph.vertex(current_label).outgoing:
      neighbor_label: Label = edge.target.label
      candidate: float = length[current_label] + edge.weight
      if candidate > length[neighbor_label]:
        length[neighbor_label] = candidate
        predecessor[neighbor_label] = current_label

  return LongestPaths(length, predecessor)

dag_path_count.py (Python)

from collections.abc import Hashable
from typing import TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

def topological_order(graph: Graph[Label]) -> list[Label]:
  """
    A topological ordering of `graph`'s labels via Kahn's algorithm.\n
    Raises ValueError if the graph contains a directed cycle.\n
  """
  # tally each vertex's indegree across all edges.
  indegree: dict[Label, int] = {label: 0 for label in graph.labels}
  for edge in graph.edges():
    indegree[edge.target.label] += 1

  # seed the queue with every source (indegree zero), in insertion order.
  ready: list[Label] = [
    label for label in graph.labels if indegree[label] == 0
  ]

  # pop a ready vertex, then release neighbors that just lost their last edge.
  order: list[Label] = []
  while ready:
    current_label: Label = ready.pop()
    order.append(current_label)
    for edge in graph.vertex(current_label).outgoing:
      neighbor_label: Label = edge.target.label
      indegree[neighbor_label] -= 1
      if indegree[neighbor_label] == 0:
        ready.append(neighbor_label)

  # leftover vertices mean an unbroken cycle.
  if len(order) != len(graph):
    raise ValueError("graph is not a DAG: a directed cycle was found")
  return order

def count_paths(graph: Graph[Label], source: Label, target: Label) -> int:
  """
    The number of distinct directed paths from `source` to `target` in a\n
    DAG. Counts the single empty path when `source == target`, and 0 when\n
    `target` is unreachable.\n
  """
  # exactly one way to stand at the source: the empty path.
  reachable_count: dict[Label, int] = {label: 0 for label in graph.labels}
  reachable_count[source] = 1

  for current_label in topological_order(graph):

    # stop pushing once we are past the target in the order.
    if current_label == target:
      break

    # unreached vertices contribute nothing downstream.
    ways_here: int = reachable_count[current_label]
    if ways_here == 0:
      continue

    # every outgoing edge carries this vertex's path count to its neighbor.
    for edge in graph.vertex(current_label).outgoing:
      reachable_count[edge.target.label] += ways_here

  return reachable_count[target]

shortest_path_count.py (Python)

import heapq
from collections.abc import Hashable
from math import inf
from typing import Generic, NamedTuple, TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

class ShortestPathCount(NamedTuple, Generic[Label]):
  """
    The shortest distance to each vertex and the number of distinct shortest\n
    paths reaching it from the source. Unreachable vertices have distance inf\n
    and count 0; the source has distance 0 and count 1.\n
  """
  distance: dict[Label, float]
  multiplicity: dict[Label, int]

def count_shortest_paths(
  graph: Graph[Label],
  source: Label,
  modulus: int = 0,
) -> ShortestPathCount[Label]:
  """
    Shortest distances from `source` plus the count of shortest paths to each\n
    vertex. Distances need non-negative edge weights (Dijkstra order); the\n
    counts additionally assume strictly positive weights, since a zero-weight\n
    edge can add to a vertex that was already settled and propagated.\n
    Pass a positive `modulus` to keep counts under it, as the LeetCode\n
    variant requires; the default 0 leaves the exact integer counts.\n
  """
  # the source sits at distance 0 with exactly one (empty) path.
  distance: dict[Label, float] = {label: inf for label in graph.labels}
  count: dict[Label, int] = {label: 0 for label in graph.labels}
  distance[source] = 0.0
  count[source] = 1

  # tie_break keeps the heap total-ordered so equal distances never compare labels.
  settled: set[Label] = set()
  frontier: list[tuple[float, int, Label]] = [(0.0, 0, source)]
  tie_break: int = 1

  while frontier:

    # settle the nearest unsettled vertex; skip stale heap entries.
    current_distance, _, current_label = heapq.heappop(frontier)
    if current_label in settled:
      continue
    settled.add(current_label)

    # relax each outgoing edge against the neighbor's best known distance.
    for edge in graph.vertex(current_label).outgoing:
      neighbor_label: Label = edge.target.label
      candidate: float = current_distance + edge.weight

      if candidate < distance[neighbor_label]:

        # a strictly shorter route discards the old tally and inherits ours.
        distance[neighbor_label] = candidate
        count[neighbor_label] = count[current_label]
        heapq.heappush(frontier, (candidate, tie_break, neighbor_label))
        tie_break += 1
      elif candidate == distance[neighbor_label]:

        # another route of the same length: add its ways to the total.
        count[neighbor_label] += count[current_label]

      # keep the running tally bounded when a modulus is requested.
      if modulus > 0:
        count[neighbor_label] %= modulus

  return ShortestPathCount(distance, count)

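graph.py (Python)

# postpone annotation evaluation: Edge's signatures name Vertex before it is defined.
from __future__ import annotations

from collections.abc import Hashable, Iterator
from typing import Generic, Optional, TypeVar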


Label = TypeVar("Label", bound=Hashable)


class Edge(Generic[Label]):
  """
    A directed connection from `source` to `target`, carrying a weight.\n
  """

  def __init__(
    self,
    source: Vertex[Label],
    target: Vertex[Label],
    weight: float = 1.0,
  ) -> None:
    self.source: Vertex[Label] = source
    self.target: Vertex[Label] = target
    self.weight: float = weight

  def __repr__(self) -> str:
    return f"Edge({self.source.label!r} -> {self.target.label!r}, w={self.weight})"


class Vertex(Generic[Label]):
  """
    A graph vertex: a label plus the list of edges leaving it.\n
  """

  def __init__(self, label: Label) -> None:
    self.label: Label = label
    self.outgoing: list[Edge[Label]] = []

  def neighbors(self) -> list[Vertex[Label]]:
    """
      The vertices reachable from this one by a single edge.\n
    """
    return [edge.target for edge in self.outgoing]

  def edge_to(self, label: Label) -> Optional[Edge[Label]]:
    """
      The outgoing edge to the vertex with `label`, or None.\n
    """
    for edge in self.outgoing:
      if edge.target.label == label:
        return edge
    return None

  def __repr__(self) -> str:
    return f"Vertex({self.label!r})"


class Graph(Generic[Label]):
  """
    A graph of Vertex objects linked by Edge objects.\n
    Pass `directed=True` for a digraph; otherwise each `add_edge` inserts\n
    the reverse edge too.\n
  """

  def __init__(self, directed: bool = False) -> None:
    self.directed: bool = directed
    self._vertices: dict[Label, Vertex[Label]] = {}

  def add_vertex(self, label: Label) -> Vertex[Label]:
    """
      Return the vertex for `label`, creating it if it is absent.\n
    """
    # reuse the existing vertex, or mint and register a fresh one.
    vertex = self._vertices.get(label)
    if vertex is None:
      vertex = Vertex(label)
      self._vertices[label] = vertex
    return vertex

  def add_edge(
    self,
    source_label: Label,
    target_label: Label,
    weight: float = 1.0,
  ) -> None:
    """
      Connect two labels (creating either vertex as needed).\n
      Adds the reverse edge as well when the graph is undirected.\n
    """
    source = self.add_vertex(source_label)
    target = self.add_vertex(target_label)

    # link source to target, and mirror it back when undirected.
    source.outgoing.append(Edge(source, target, weight))
    if not self.directed:
      target.outgoing.append(Edge(target, source, weight))

  def vertex(self, label: Label) -> Vertex[Label]:
    """
      The vertex carrying `label` (raises KeyError if absent).\n
    """
    return self._vertices[label]

  @property
  def vertices(self) -> list[Vertex[Label]]:
    """
      Every vertex, in insertion order.\n
    """
    return list(self._vertices.values())

  @property
  def labels(self) -> list[Label]:
    """
      Every vertex label, in insertion order.\n
    """
    return list(self._vertices)

  def edges(self) -> Iterator[Edge[Label]]:
    """
      Each edge once — an undirected edge is yielded a single time.\n
    """
    # track undirected endpoint pairs so each is emitted only once.
    seen: set[frozenset[Label]] = set()

    for vertex in self._vertices.values():
      for edge in vertex.outgoing:
        # skip an undirected edge already yielded from the other endpoint.
        if not self.directed:
          endpoints = frozenset((edge.source.label, edge.target.label))
          if endpoints in seen:
            continue
          seen.add(endpoints)

        yield edge

  def __contains__(self, label: Label) -> bool:
    return label in self._vertices

  def __iter__(self) -> Iterator[Vertex[Label]]:
    return iter(self._vertices.values())

  def __len__(self) -> int:
    return len(self._vertices)

Counting $s \to t$ paths in topological order: $cnt[s] = 1$, then $cnt[v] \mathrel{+}= cnt[u]$ over incoming edges; here $cnt[t] = 1 + 2 = 3$.

Warshall's transitive closure: the boolean analog

Replace "shortest distance" with "is there any path?", and $\min / +$ with $\lor / \land$, and Floyd–Warshall becomes Warshall's transitive-closure algorithm. Let $r_{k}[i][j]$ be true iff $j$ is reachable from $i$ using only intermediate vertices in $\{1, \dots, k\}$:

$$r_{k}[i][j] = r_{k - 1}[i][j] \lor \left( r_{k - 1}[i][k] \land r_{k - 1}[k][j] \right).$$

Same triple loop, same $O(V^{3})$, same in-place collapse of the $k$ index; only the semiring changed (booleans under or/and instead of reals under min/plus). Course Schedule IV is this very problem: prerequisites form a DAG, and each query, "is course $a$ a prerequisite of course $b$?", is a lookup in the reachability matrix $r_{V}$.
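The boolean recurrence above can be sketched in a few lines. This is a minimal illustration, not the lesson's reference solution: the function name `transitive_closure`, the integer vertex numbering from 0, and the three-course instance are all assumptions made for the example.

```python
def transitive_closure(vertex_count: int, edges: list[tuple[int, int]]) -> list[list[bool]]:
  """
    reach[i][j] is True iff j is reachable from i by a path of one or more edges.
  """
  # base case: only direct edges count.
  reach = [[False] * vertex_count for _ in range(vertex_count)]
  for source, target in edges:
    reach[source][target] = True

  # admit one more intermediate vertex k per outer round, updating in place.
  for k in range(vertex_count):
    for i in range(vertex_count):
      if not reach[i][k]:
        continue  # i cannot reach k, so nothing routes through k from i.
      for j in range(vertex_count):
        if reach[k][j]:
          reach[i][j] = True
  return reach


# hypothetical instance: course 0 is a prerequisite of 1, and 1 of 2.
closure = transitive_closure(3, [(0, 1), (1, 2)])
queries = [(0, 2), (2, 0), (1, 2)]
answers = [closure[a][b] for a, b in queries]
```

After the $O(V^{3})$ preprocessing, each query is a single matrix lookup, which is why the closure pays off when there are many queries.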

Held–Karp: a subset as the resource

The resource need not be a single number. In Held–Karp bitmask TSP (the Bitmask DP lesson), the subproblem $D[S][v]$ is the cost of the cheapest path that starts at the origin, visits exactly the vertex set $S$, and ends at $v$; the transition relaxes over the last hop, $D[S][v] = \min_{u \in S \setminus \{v\}} \left( D[S \setminus \{v\}][u] + w(u, v) \right)$. The rationed resource is the subset of visited vertices, grown one vertex per layer. This is the same skeleton as Floyd–Warshall, with $2^{V}$ subsets in place of $V$ intermediate-vertex levels. Shortest Path Visiting All Nodes is the unweighted cousin: a BFS over $(mask, v)$ states, where the mask is again the resource. The same framing scales from a single scalar resource up to an exponential subset.
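As a sketch of the subset recurrence, the following evaluates $D[S][v]$ in increasing mask order, so every smaller subset is ready before it is read. The name `held_karp`, the choice of vertex 0 as the origin, and the four-city weight matrix are assumptions for illustration; the Bitmask DP lesson's own code may differ.

```python
from math import inf


def held_karp(weight: list[list[float]]) -> float:
  """
    Cost of the cheapest tour that starts at vertex 0, visits every vertex once, and returns to 0.
  """
  n = len(weight)
  if n == 1:
    return 0.0

  # best[mask][v]: cheapest path from 0 that visits exactly the set `mask` and ends at v.
  best = [[inf] * n for _ in range(1 << n)]
  best[1][0] = 0.0

  for mask in range(1, 1 << n):
    if not mask & 1:
      continue  # every state must contain the origin.
    for v in range(1, n):
      if not mask & (1 << v):
        continue
      # relax over the last hop u -> v, reading the layer without v.
      previous = mask ^ (1 << v)
      best[mask][v] = min(
        (best[previous][u] + weight[u][v] for u in range(n) if previous & (1 << u)),
        default=inf,
      )

  # close the tour by returning from the final vertex to the origin.
  full = (1 << n) - 1
  return min(best[full][v] + weight[v][0] for v in range(1, n))


# hypothetical symmetric instance with four cities.
example_weights = [
  [0, 10, 15, 20],
  [10, 0, 35, 25],
  [15, 35, 0, 30],
  [20, 25, 30, 0],
]
tour_cost = held_karp(example_weights)
```

The table has $2^{n} \cdot n$ states and each state scans up to $n$ predecessors, which gives the $O(2^{n} n^{2})$ bound.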

Choosing the framing

All four are DP, but the right resource depends on the graph:

Floyd–Warshall: dense, all-pairs, negative edges allowed; $O (V^{3})$ time, $O (V^{2})$ space. The default when you need every pair.
$V \times$ Dijkstra: sparse graphs with non-negative weights; $O(V (E + V \log V))$, which beats $V^{3}$ when $E \ll V^{2}$.
Bellman–Ford: single-source with negative edges, negative-cycle detection, or an edge-count cap ( $K$ -stops); $O (V E)$ .
DAG-DP: acyclic graphs; $O (V + E)$ , handles negative weights and even longest paths, because topological order removes the need to iterate to convergence.

The algebraic view and its descendants

The unification in this lesson is an instance of an algebraic fact. Floyd–Warshall, Warshall's closure, and even the path-counting variant are all the same algorithm over different semirings. A semiring supplies a $+$ that combines alternatives and a $\times$ that concatenates path segments in sequence: $(\min, +)$ for shortest paths, $(\lor, \land)$ for reachability, $(+, \times)$ for path counting, and $(\max, \min)$ for widest-path (bottleneck) routing. Replace the operators and the triple loop computes the corresponding closure with no other change; the one caveat is that path counting needs an acyclic graph, since a cycle creates infinitely many walks. This is the algebraic path problem, developed by Backhouse and Carré (1975) and by Lehmann (1977), and it explains why "what is the cheapest?", "is there any?", and "how many?" are the same exercise: one program parameterized by a semiring.⁵
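One way to see the "one program, many semirings" claim concretely is to pass the two operators in as parameters. The `Semiring` dataclass and `closure` function below are hypothetical helpers written for this illustration, not part of the lesson's code. Each input matrix is assumed to already hold the semiring's identities: the $\times$-identity on the diagonal for the idempotent semirings, the $+$-identity for absent edges, and, for counting, a zero diagonal on an acyclic graph so that only paths with at least one edge are counted.

```python
import operator
from collections.abc import Callable
from dataclasses import dataclass
from math import inf
from typing import Generic, TypeVar

Value = TypeVar("Value")


@dataclass(frozen=True)
class Semiring(Generic[Value]):
  # hypothetical helper type for this sketch.
  combine: Callable[[Value, Value], Value]  # the semiring "+": merge alternatives.
  extend: Callable[[Value, Value], Value]  # the semiring "x": chain two segments.


def closure(matrix: list[list[Value]], semiring: Semiring[Value]) -> list[list[Value]]:
  """
    Floyd–Warshall's triple loop with the operators supplied by `semiring`.
  """
  n = len(matrix)
  result = [row[:] for row in matrix]
  for k in range(n):
    for i in range(n):
      for j in range(n):
        through_k = semiring.extend(result[i][k], result[k][j])
        result[i][j] = semiring.combine(result[i][j], through_k)
  return result


SHORTEST = Semiring(min, operator.add)
REACHABLE = Semiring(operator.or_, operator.and_)
COUNT = Semiring(operator.add, operator.mul)
WIDEST = Semiring(max, min)

# one small DAG encoded four ways: edges 0->1, 1->2, 0->2.
cheapest = closure([[0, 4, 7], [inf, 0, 1], [inf, inf, 0]], SHORTEST)
reachable = closure(
  [[False, True, True], [False, False, True], [False, False, False]], REACHABLE
)
path_count = closure([[0, 1, 1], [0, 0, 1], [0, 0, 0]], COUNT)
widest = closure([[inf, 5, 2], [0, inf, 3], [0, 0, inf]], WIDEST)
```

In this instance the route through vertex 1 lowers the cost from 0 to 2 to 5, raises the bottleneck capacity to 3, and brings the number of distinct paths to 2, all from the same loop body.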

The same all-pairs closure connects to linear algebra through matrix multiplication over the $(\min, +)$ semiring (the min-plus or tropical product). With zeros on the diagonal of the weight matrix and no negative cycles, the all-pairs shortest-path matrix equals its $(V - 1)$-th tropical power, computable by repeated squaring in $O(V^{3} \log V)$. That is slower than Floyd–Warshall's $O(V^{3})$, but the connection underlies the theory of subcubic all-pairs shortest paths. Williams (2014) gave an APSP algorithm running in $V^{3} / 2^{\Omega(\sqrt{\log V})}$ time, which beats every polylogarithmic speedup but is still not truly subcubic; whether a genuinely polynomially faster ($O(V^{3 - \varepsilon})$) algorithm exists is a central open question, tied by fine-grained complexity to the (min,+)-matrix-multiplication and Boolean-matrix-multiplication conjectures.
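The repeated-squaring route can be sketched directly with the naive cubic product. The names `min_plus_product` and `apsp_by_squaring` are assumptions for this illustration, and the sketch assumes a weight matrix with 0 on the diagonal, `inf` for absent edges, and no negative cycle.

```python
from math import inf

Matrix = list[list[float]]


def min_plus_product(left: Matrix, right: Matrix) -> Matrix:
  """
    The tropical product: (left x right)[i][j] = min over k of left[i][k] + right[k][j].
  """
  n = len(left)
  return [
    [min(left[i][k] + right[k][j] for k in range(n)) for j in range(n)]
    for i in range(n)
  ]


def apsp_by_squaring(weights: Matrix) -> Matrix:
  """
    All-pairs shortest paths by squaring the weight matrix about log2(V) times.
  """
  n = len(weights)
  distance = [row[:] for row in weights]

  # distance currently covers every path of at most `edges_covered` edges.
  edges_covered = 1
  while edges_covered < n - 1:
    distance = min_plus_product(distance, distance)
    edges_covered *= 2
  return distance


# hypothetical four-vertex instance.
example = [
  [0, 3, inf, 7],
  [8, 0, 2, inf],
  [5, inf, 0, 1],
  [2, inf, inf, 0],
]
all_pairs = apsp_by_squaring(example)
```

Overshooting $V - 1$ edges is harmless: with a zero diagonal and no negative cycle, allowing more edges never lowers a distance below the true shortest path.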

Held and Karp's subset DP for TSP (Held and Karp, 1962, J. SIAM) remains, sixty years on, the fastest known exact TSP algorithm in the worst case at $O (2^{n} n^{2})$ time — no algorithm with a better exponential base is known, which indicates how hard exact TSP is. On the practical side, the DAG shortest path with a topological order underlies critical-path scheduling (PERT/CPM): the longest path through a task-dependency DAG is the minimum project completion time, and it is computed by exactly the $max$ variant of DAG-Relax above. And the layered Bellman–Ford view — relax against the previous layer, snapshotting each round — is the shape of the Viterbi algorithm for hidden Markov models, where each layer is a time step and the DP finds the most likely state sequence.
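The critical-path computation mentioned above can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The function name `project_duration`, the choice to put durations on tasks rather than on edges, and the four-task project are assumptions made for this illustration.

```python
from graphlib import TopologicalSorter


def project_duration(
  duration: dict[str, float],
  prerequisites: dict[str, list[str]],
) -> float:
  """
    Minimum completion time: the longest duration-weighted chain in the task DAG.
    Assumes every prerequisite also appears as a key of `duration`.
  """
  # static_order yields each task only after all of its prerequisites.
  dependency_map = {task: prerequisites.get(task, []) for task in duration}
  order = TopologicalSorter(dependency_map).static_order()

  finish: dict[str, float] = {}
  for task in order:
    # the max variant of relaxation: a task starts when its last prerequisite ends.
    start = max((finish[p] for p in prerequisites.get(task, [])), default=0.0)
    finish[task] = start + duration[task]

  return max(finish.values(), default=0.0)


# hypothetical project: ship needs build and docs, which both need design.
example_duration = {"design": 3, "build": 5, "docs": 2, "ship": 1}
example_prerequisites = {"build": ["design"], "docs": ["design"], "ship": ["build", "docs"]}
total_time = project_duration(example_duration, example_prerequisites)
```

Here the chain design, build, ship is the critical path, so the shorter docs task has slack and does not affect the total.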

Takeaways

Many graph algorithms are dynamic programs. The subproblem is the best value under a restricted resource (intermediate vertices, edges, a topological prefix, or a visited subset), and edge relaxation is the DP transition.
Floyd–Warshall is the archetype: $d_{k} [i] [j] = min (d_{k - 1} [i] [j], d_{k - 1} [i] [k] + d_{k - 1} [k] [j])$ , route through $k$ or don't, collapsing in place to $O (V^{3})$ time, $O (V^{2})$ space; negative edges OK, and $d [i] [i] < 0$ flags a negative cycle.
Bellman–Ford is a DP over path length, $D_{t} [v]$ for at-most- $t$ edges; $V - 1$ rounds converge, an extra relaxation detects a negative cycle, and capping $t$ at $K + 1$ solves Cheapest Flights Within K Stops ( $O (V E)$ ).
DAG-DP processes vertices in topological order in a single $O(V + E)$ pass; the same pass handles shortest paths, longest paths, and counting (shortest) paths by carrying a count alongside the distance.
Warshall's transitive closure is the boolean Floyd–Warshall ( $\lor / \land$ ), giving reachability, exactly Course Schedule IV.
The resource can be a subset (Held–Karp / bitmask TSP), unifying scalar shortest paths with exponential-state DP under one frame.

Erickson, Ch. — Dynamic Programming: the DP recipe is to define the subproblem, write the recurrence, and evaluate in an order respecting the dependency DAG. ↩
CLRS, Ch. 23 — All-Pairs Shortest Paths (§23.2): Floyd–Warshall's intermediate-vertex recurrence, $Θ (V^{3})$ in-place evaluation, and negative-cycle detection via $d [i] [i] < 0$ . ↩
CLRS, Ch. 22 — Single-Source Shortest Paths: Bellman–Ford relaxes all edges $V - 1$ times; a further relaxation reveals a negative cycle. ↩
Skiena, § — Shortest Paths / DP: shortest and longest paths on a DAG in $O (V + E)$ by relaxing edges in topological order; longest path is NP-hard only in general graphs. ↩
Backhouse & Carré (1975) and Lehmann (1977) on the algebraic path problem: Floyd–Warshall, transitive closure, and path counting are one closure algorithm over different semirings ( $(min, +)$ , $(\lor, \land)$ , $(+, \times)$ ). See Williams (2014, STOC) for subcubic APSP via min-plus matrix products, and Held & Karp (1962, J. SIAM 10) for the $O (2^{n} n^{2})$ exact-TSP subset DP. ↩
