Coping with NP-Hardness

This is the capstone of the course, so let us first take stock of the tools we have assembled. Almost every algorithm we built fell under a handful of paradigms, recurring structures that apply across problems:

Divide and conquer. Split, recurse, combine; analyzed by recurrences and the master theorem (merge sort, counting inversions, closest pair of points, Karatsuba, Strassen).
Dynamic programming. Tabulate overlapping subproblems (LCS, weighted interval scheduling, Bellman–Ford, Floyd–Warshall, knapsack, subset-sum).
Greedy. Commit to a locally best choice and prove it is safe (Huffman, Dijkstra, Prim, Kruskal; Dijkstra and Prim share the same greedy skeleton, differing only in the priority key).
Graph methods. BFS, DFS, topological sort, strongly connected components, and the shortest-path and spanning-tree algorithms built atop them.
Network flow. Max-flow / min-cut (Ford–Fulkerson, Edmonds–Karp) as a modeling tool covering matching, scheduling, and connectivity problems.
Reductions and $NP$-completeness. The meta-tool: relate one problem to another, transferring either an algorithm or a hardness proof.

This lesson builds on that last paradigm. Suppose you have followed the recipe from the previous lesson and proved that the problem on your desk is $NP$-hard. This is genuine progress: you now know not to waste months hunting for a fast exact algorithm that almost certainly does not exist. But the problem still has to be solved. $NP$-hardness is a statement about the worst case over all instances; it does not forbid doing well on the instances you actually meet, or doing nearly as well as optimal, or doing exactly as well but slowly.

There are four honest ways to cope, and a well-designed system often blends them. We can approximate: settle for a solution provably close to optimal. We can use heuristics, methods that work well in practice without guarantees. We can pay for an exact exponential algorithm that is merely smart about its exponential search. Or we can exploit special structure in our instances. We take each in turn.

The decision tree below summarizes the choices: first ask whether the problem is even hard; only the rightmost branch forces the compromises of this lesson.

Decision tree choosing a coping strategy for a possibly NP-hard problem.

Approximation algorithms

The first response is to relax optimality while keeping a guarantee. An approximation algorithm runs in polynomial time and returns a solution provably within a bounded factor of the best possible.

A ratio of $ρ = 2$ means the output is never more than twice optimal: guaranteed, on every input, with no exceptions. Formally, for a minimization problem with optimal cost $C^{*}$, a $ρ$-approximation returns a solution of cost $C \leq ρ C^{*}$. The subtlety is that we prove this without ever knowing $C^{*}$. The proof almost always uses a lower bound: find some quantity $L$ that provably underestimates $C^{*}$, then show that $C \leq ρ L$, which gives $C \leq ρ L \leq ρ C^{*}$.

A $ρ$-approximation pins the output cost $C$ into the band $[C^{*}, ρ C^{*}]$, proved via a lower bound $L \leq C^{*}$.

A worked example: 2-approximation for Vertex Cover

A vertex cover of a graph $G = (V, E)$ is a set of vertices touching every edge: for each edge, at least one endpoint is chosen. $\textsc{Min-Vertex-Cover}$, the problem of finding the smallest such set, is $NP$-hard. Yet a simple algorithm comes within a factor of $2$.

Algorithm 1: $\textsc{Approx-Vertex-Cover}(G)$, returning a cover at most twice optimal

  $C \gets \emptyset$
  $E' \gets E$  (working copy)
  while $E' \neq \emptyset$ do
    pick any edge $(u, v) \in E'$
    $C \gets C \cup \{u, v\}$  (take both endpoints)
    remove from $E'$ every edge incident to $u$ or $v$
  return $C$

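The algorithm repeatedly picks an uncovered edge and adds both its endpoints to the cover. Taking both looks wasteful, but it is what makes the analysis work.

Let $M$ be the set of edges the algorithm picks. No two edges of $M$ share an endpoint, because picking $(u, v)$ removes every edge incident to $u$ or $v$; so $M$ is a matching. Any vertex cover, including an optimal one of size $C^{*}$, must contain at least one endpoint of each edge of $M$, and these endpoints are all distinct, hence $C^{*} \geq |M|$. The algorithm returns exactly the endpoints of $M$, so $|C| = 2|M| \leq 2 C^{*}$. The result is a valid cover because an edge leaves $E'$ only once one of its endpoints is in $C$.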

So $\textsc{Approx-Vertex-Cover}$ is a $2$-approximation.¹ We bounded our output against $|M|$, a quantity that lower-bounds the unknown optimum, which is the standard technique of approximation analysis. (Whether vertex cover admits a ratio better than $2$ is a famous open question; assuming the Unique Games Conjecture, no polynomial-time algorithm achieves a ratio of $2 - ε$ for any constant $ε > 0$.)

$\textsc{Approx-Vertex-Cover}$: each picked matching edge (heavy) contributes both endpoints (filled) to the cover, giving $|C| = 2|M| \leq 2 C^{*}$.
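As a quick sanity check of the bound, the sketch below runs the same rule (pick an uncovered edge, take both endpoints) on a small path graph and compares the result with a brute-force optimum. It is a self-contained illustration on plain edge lists; the helper names `greedy_matching_cover` and `brute_force_cover_size` are made up for this example and are not part of the lesson's code.

```python
from itertools import combinations

Edge = tuple[int, int]


def greedy_matching_cover(edges: list[Edge]) -> tuple[set[int], int]:
    """Return (cover, matching size) for the take-both-endpoints rule."""
    cover: set[int] = set()
    matching_size = 0
    for u, v in edges:
        # an edge with a covered endpoint is already handled
        if u in cover or v in cover:
            continue
        cover.update((u, v))
        matching_size += 1
    return cover, matching_size


def brute_force_cover_size(edges: list[Edge]) -> int:
    """Size of a minimum vertex cover, by trying every subset (tiny graphs only)."""
    vertices = sorted({x for edge in edges for x in edge})
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return size
    return len(vertices)


# the path 0 - 1 - 2 - 3 - 4 - 5
path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
cover, matching_size = greedy_matching_cover(path)
optimum = brute_force_cover_size(path)
print(len(cover), matching_size, optimum)
```

On this path the run prints `6 3 3`: the matching has three edges, the returned cover has six vertices, and the optimum is three, so the factor of $2$ is attained exactly. The analysis is therefore tight for this algorithm, even though many inputs do better.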

approx_vertex_cover.py (Python)

from __future__ import annotations

from collections.abc import Hashable
from typing import TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

def approx_vertex_cover(graph: Graph[Label]) -> set[Label]:
  """
    A vertex cover of `graph` whose size is at most twice optimal.\n
    Returns the set of chosen vertex labels; every edge has at least one\n
    endpoint in the result.\n
  """
  cover: set[Label] = set()

  for edge in graph.edges():
    source_label: Label = edge.source.label
    target_label: Label = edge.target.label

    # skip self-loops and edges already covered by a prior pick.
    if source_label == target_label:
      continue
    if source_label in cover or target_label in cover:
      continue

    # this edge joins the matching M: take both its endpoints.
    cover.add(source_label)
    cover.add(target_label)

  return cover

def is_vertex_cover(graph: Graph[Label], cover: set[Label]) -> bool:
  """
    Whether `cover` touches every edge of `graph`.\n
  """
  for edge in graph.edges():
    if edge.source.label == edge.target.label:
      continue
    if edge.source.label not in cover and edge.target.label not in cover:
      return False
  return True

graph.py (Python)

from __future__ import annotations

from collections.abc import Hashable, Iterator
from typing import Generic, Optional, TypeVar


Label = TypeVar("Label", bound=Hashable)


class Edge(Generic[Label]):
  """
    A directed connection from `source` to `target`, carrying a weight.\n
  """

  def __init__(
    self,
    source: Vertex[Label],
    target: Vertex[Label],
    weight: float = 1.0,
  ) -> None:
    self.source: Vertex[Label] = source
    self.target: Vertex[Label] = target
    self.weight: float = weight

  def __repr__(self) -> str:
    return f"Edge({self.source.label!r} -> {self.target.label!r}, w={self.weight})"


class Vertex(Generic[Label]):
  """
    A graph vertex: a label plus the list of edges leaving it.\n
  """

  def __init__(self, label: Label) -> None:
    self.label: Label = label
    self.outgoing: list[Edge[Label]] = []

  def neighbors(self) -> list[Vertex[Label]]:
    """
      The vertices reachable from this one by a single edge.\n
    """
    return [edge.target for edge in self.outgoing]

  def edge_to(self, label: Label) -> Optional[Edge[Label]]:
    """
      The outgoing edge to the vertex with `label`, or None.\n
    """
    for edge in self.outgoing:
      if edge.target.label == label:
        return edge
    return None

  def __repr__(self) -> str:
    return f"Vertex({self.label!r})"


class Graph(Generic[Label]):
  """
    A graph of Vertex objects linked by Edge objects.\n
    Pass `directed=True` for a digraph; otherwise each `add_edge` inserts\n
    the reverse edge too.\n
  """

  def __init__(self, directed: bool = False) -> None:
    self.directed: bool = directed
    self._vertices: dict[Label, Vertex[Label]] = {}

  def add_vertex(self, label: Label) -> Vertex[Label]:
    """
      Return the vertex for `label`, creating it if it is absent.\n
    """
    # reuse the existing vertex, or mint and register a fresh one.
    vertex = self._vertices.get(label)
    if vertex is None:
      vertex = Vertex(label)
      self._vertices[label] = vertex
    return vertex

  def add_edge(
    self,
    source_label: Label,
    target_label: Label,
    weight: float = 1.0,
  ) -> None:
    """
      Connect two labels (creating either vertex as needed).\n
      Adds the reverse edge as well when the graph is undirected.\n
    """
    source = self.add_vertex(source_label)
    target = self.add_vertex(target_label)

    # link source to target, and mirror it back when undirected.
    source.outgoing.append(Edge(source, target, weight))
    if not self.directed:
      target.outgoing.append(Edge(target, source, weight))

  def vertex(self, label: Label) -> Vertex[Label]:
    """
      The vertex carrying `label` (raises KeyError if absent).\n
    """
    return self._vertices[label]

  @property
  def vertices(self) -> list[Vertex[Label]]:
    """
      Every vertex, in insertion order.\n
    """
    return list(self._vertices.values())

  @property
  def labels(self) -> list[Label]:
    """
      Every vertex label, in insertion order.\n
    """
    return list(self._vertices)

  def edges(self) -> Iterator[Edge[Label]]:
    """
      Each edge once — an undirected edge is yielded a single time.\n
    """
    # track undirected endpoint pairs so each is emitted only once.
    seen: set[frozenset[Label]] = set()

    for vertex in self._vertices.values():
      for edge in vertex.outgoing:
        # skip an undirected edge already yielded from the other endpoint.
        if not self.directed:
          endpoints = frozenset((edge.source.label, edge.target.label))
          if endpoints in seen:
            continue
          seen.add(endpoints)

        yield edge

  def __contains__(self, label: Label) -> bool:
    return label in self._vertices

  def __iter__(self) -> Iterator[Vertex[Label]]:
    return iter(self._vertices.values())

  def __len__(self) -> int:
    return len(self._vertices)

Approximability varies widely across problems:

Some problems admit a polynomial-time approximation scheme (PTAS): for any fixed $ε > 0$, a $(1 + ε)$-approximation in polynomial time. You can get as close to optimal as you like, paying in running time. Euclidean TSP and Knapsack are of this kind.²
Some admit a constant-factor approximation but, unless $P = NP$, no PTAS; vertex cover, with its ratio of $2$, is of this kind.
Some are inapproximable: unless $P = NP$, no polynomial-time algorithm achieves any constant ratio. General TSP (without the triangle inequality) is the classic example: approximating it within any constant factor is $NP$-hard.
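To make the first category concrete, here is a minimal sketch of the classic value-scaling scheme for 0/1 Knapsack. Knapsack is a maximization problem, so the guarantee reads "at least $(1 - ε)$ times optimal". The function name `knapsack_fptas` and its signature are assumptions for this illustration; weights are assumed non-negative, and floating-point rounding is ignored.

```python
import math


def knapsack_fptas(
    values: list[float],
    weights: list[float],
    capacity: float,
    epsilon: float,
) -> list[int]:
    """Indices of items whose total value is at least (1 - epsilon) * optimal."""
    # items that do not fit on their own, or are worthless, can be ignored
    items = [
        i for i in range(len(values))
        if weights[i] <= capacity and values[i] > 0
    ]
    if not items:
        return []

    # round values down to multiples of `scale`: a coarser scale means a
    # smaller table but a larger rounding loss (at most `scale` per item)
    scale = epsilon * max(values[i] for i in items) / len(items)

    # best[s] = (least weight, items) reaching scaled value exactly s
    best: dict[int, tuple[float, list[int]]] = {0: (0.0, [])}
    for i in items:
        scaled = math.floor(values[i] / scale)
        # iterate over a snapshot so each item is used at most once
        for total, (weight, chosen) in list(best.items()):
            new_weight = weight + weights[i]
            if new_weight > capacity:
                continue
            key = total + scaled
            if key not in best or new_weight < best[key][0]:
                best[key] = (new_weight, chosen + [i])

    return best[max(best)][1]


values = [60.0, 100.0, 120.0]
weights = [10.0, 20.0, 30.0]
chosen = knapsack_fptas(values, weights, capacity=50.0, epsilon=0.5)
print(chosen, sum(values[i] for i in chosen))
```

The table has at most about $n^2 / ε$ entries, so halving $ε$ roughly doubles the work: that is the "paying in running time" trade-off in miniature.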

When the triangle inequality does hold (the metric case), a $2$-approximation follows from the minimum spanning tree: double every MST edge to get an Eulerian multigraph, walk an Euler tour, and shortcut past already-visited vertices. By the triangle inequality the shortcuts never lengthen the walk, so the tour costs at most $2 \cdot \mathrm{MST}$. Deleting one edge from an optimal tour leaves a spanning tree, so $\mathrm{MST} \leq \mathrm{OPT}$, and the tour costs at most $2 \cdot \mathrm{MST} \leq 2 \cdot \mathrm{OPT}$.

Metric-TSP $2$-approximation: double the MST into an Euler tour, then shortcut repeats; cost $\leq 2 \cdot \mathrm{MST} \leq 2 \cdot \mathrm{OPT}$.

metric_tsp_approx.py (Python)

from typing import Sequence

def _mst_adjacency(
  distances: Sequence[Sequence[float]],
) -> list[list[int]]:
  """
    Prim's MST as a child-list adjacency, rooted at city 0.\n
    `distances[i][j]` is the symmetric distance between cities i and j.\n
  """
  city_count: int = len(distances)
  in_tree: list[bool] = [False for _ in range(city_count)]
  best_edge: list[float] = [float("inf") for _ in range(city_count)]
  parent: list[int] = [-1 for _ in range(city_count)]
  best_edge[0] = 0.0

  for _ in range(city_count):

    # pick the cheapest fringe city not yet in the tree.
    current: int = -1
    for candidate in range(city_count):
      if not in_tree[candidate]:
        if current == -1 or best_edge[candidate] < best_edge[current]:
          current = candidate
    in_tree[current] = True

    # relax every neighbor against the new tree vertex.
    for neighbor in range(city_count):
      weight: float = distances[current][neighbor]
      if not in_tree[neighbor] and weight < best_edge[neighbor]:
        best_edge[neighbor] = weight
        parent[neighbor] = current

  # invert parents into a child-list adjacency.
  children: list[list[int]] = [[] for _ in range(city_count)]
  for city in range(1, city_count):
    children[parent[city]].append(city)

  return children

def metric_tsp_tour(
  distances: Sequence[Sequence[float]],
) -> list[int]:
  """
    A closed tour visiting every city once, of length at most twice optimal.\n
    Returns the visiting order (a permutation of city indices) starting at 0;\n
    the salesman returns to 0 after the last city. Assumes a symmetric metric.\n
  """
  if len(distances) == 0:
    return []

  children: list[list[int]] = _mst_adjacency(distances)
  order: list[int] = []

  # pre-order DFS of the MST; visiting cities in this order, then shortcutting,
  # is exactly the doubled-tree Euler walk with repeats removed.
  stack: list[int] = [0]
  visited: list[bool] = [False for _ in range(len(distances))]
  while stack:

    # take the next unvisited city, recording it in the tour.
    city: int = stack.pop()
    if visited[city]:
      continue
    visited[city] = True
    order.append(city)

    # push children reversed so the leftmost child is popped first.
    for child in reversed(children[city]):
      stack.append(child)

  return order

def tour_length(
  distances: Sequence[Sequence[float]],
  order: Sequence[int],
) -> float:
  """
    Total length of the closed tour that visits cities in `order` and\n
    returns to the start.\n
  """
  if len(order) == 0:
    return 0.0

  # sum each leg, wrapping the last city back to the start.
  total: float = 0.0
  for position in range(len(order)):
    current_city: int = order[position]
    next_city: int = order[(position + 1) % len(order)]
    total += distances[current_city][next_city]

  return total

Heuristics and local search

When a provable ratio is out of reach or simply unnecessary, we turn to heuristics, strategies that are usually good but carry no worst-case promise. The most general is local search: start from some feasible solution and repeatedly apply a small modification, a move, that improves the objective, stopping at a local optimum where no single move helps.

For the traveling salesman, the famous 2-opt move deletes two edges of the current tour and reconnects the pieces the other way; iterating it untangles crossings and converges to short, though not always optimal, tours.

A 2-opt move deletes two crossing edges (discarded, red) and reconnects the four endpoints the other way, uncrossing the tour and shortening it.
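The gain of a single move can be written down directly. In the notation assumed for this note, $p$ is the city just before the reversed segment, $s$ and $t$ are the segment's first and last cities, $q$ is the city just after it, and $d$ is a symmetric distance function:

```latex
\Delta \;=\; \underbrace{d(p, t) + d(s, q)}_{\text{new edges}}
       \;-\; \underbrace{d(p, s) + d(t, q)}_{\text{old edges}},
\qquad \text{apply the move iff } \Delta < 0 .
```

Only the two swapped edges appear in $\Delta$, because reversing the segment leaves the lengths of its interior edges unchanged when distances are symmetric. Each candidate move can therefore be evaluated in constant time.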

two_opt_tsp.py (Python)

from typing import Sequence

def tour_length(
  distances: Sequence[Sequence[float]],
  order: Sequence[int],
) -> float:
  """
    Length of the closed tour visiting cities in `order` and returning home.\n
  """
  if len(order) == 0:
    return 0.0

  # sum each leg, wrapping the last city back to the start.
  total: float = 0.0
  for position in range(len(order)):
    current_city: int = order[position]
    next_city: int = order[(position + 1) % len(order)]
    total += distances[current_city][next_city]

  return total

def two_opt(
  distances: Sequence[Sequence[float]],
  initial_order: Sequence[int],
) -> list[int]:
  """
    Improve `initial_order` to a 2-opt local optimum and return the tour.\n
    Repeatedly reverses a tour segment whenever doing so shortens the tour,\n
    until no improving reversal remains.\n
  """
  order: list[int] = list(initial_order)
  city_count: int = len(order)
  if city_count < 4:
    return order

  improved: bool = True
  while improved:
    improved = False
    for first in range(city_count - 1):
      for second in range(first + 1, city_count):

        # the two edges leaving the segment endpoints.
        before_first: int = order[first - 1]
        segment_start: int = order[first]
        segment_end: int = order[second]
        after_second: int = order[(second + 1) % city_count]

        # reversing order[first..second] swaps these two edges; skip the
        # wrap-around degenerate case where both edges share a vertex.
        if before_first == segment_end or after_second == segment_start:
          continue

        # cost of the two current edges vs. the two reconnected edges.
        old_cost: float = (
          distances[before_first][segment_start]
          + distances[segment_end][after_second]
        )
        new_cost: float = (
          distances[before_first][segment_end]
          + distances[segment_start][after_second]
        )

        # apply the reversal only when it strictly shortens the tour.
        if new_cost + 1e-12 < old_cost:
          order[first : second + 1] = order[first : second + 1][::-1]
          improved = True

  return order

Local search has a characteristic failure mode: it can stall in a local optimum far from the global one. The standard escapes are metaheuristics that occasionally accept worsening moves to climb out of bad valleys:

Simulated annealing accepts uphill moves with a probability that cools over time, mimicking the physics of slowly freezing metal.
Tabu search forbids recently visited solutions to avoid cycling back.
Genetic algorithms evolve a population of solutions by recombination and mutation.

Local search descends to the nearest valley and stalls at a local optimum; a metaheuristic must accept an uphill move to escape and reach the global optimum.
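For simulated annealing, the acceptance rule is commonly stated as the Metropolis criterion with a geometric cooling schedule. The symbols below are assumptions of this note: $\Delta$ is the change in cost of a proposed move, $T_k$ the temperature at step $k$, and $\alpha \in (0, 1)$ the cooling factor. Geometric cooling is one common choice among several.

```latex
\Pr[\text{accept}] \;=\;
\begin{cases}
  1                 & \text{if } \Delta < 0, \\[2pt]
  e^{-\Delta / T_k} & \text{otherwise,}
\end{cases}
\qquad\qquad
T_{k+1} = \alpha \, T_k .
```

At high temperature almost every move is accepted and the search wanders freely. As $T_k \to 0$ the probability of accepting an uphill move vanishes, and the method behaves like plain local search.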

simulated_annealing.py (Python)

from __future__ import annotations

import math
import random
from typing import Sequence

def tour_length(
  distances: Sequence[Sequence[float]],
  order: Sequence[int],
) -> float:
  """
    Length of the closed tour visiting cities in `order` and returning home.\n
  """
  if len(order) == 0:
    return 0.0

  # sum each leg, wrapping the last city back to the start.
  total: float = 0.0
  for position in range(len(order)):
    current_city: int = order[position]
    next_city: int = order[(position + 1) % len(order)]
    total += distances[current_city][next_city]

  return total

def anneal_tsp(
  distances: Sequence[Sequence[float]],
  initial_order: Sequence[int],
  temperature: float = 10.0,
  cooling: float = 0.995,
  iterations: int = 5000,
  seed: int | None = None,
) -> list[int]:
  """
    A short tour for `distances`, found by simulated annealing from\n
    `initial_order`. Each step reverses a random tour segment; an uphill\n
    move is accepted with probability exp(-delta / temperature), and the\n
    temperature is scaled by `cooling` every step. Returns the best tour seen.\n
  """
  generator: random.Random = random.Random(seed)
  current_order: list[int] = list(initial_order)
  city_count: int = len(current_order)

  # too few cities for a non-trivial segment reversal.
  if city_count < 4:
    return current_order

  # track the running solution and the best tour seen so far.
  current_cost: float = tour_length(distances, current_order)
  best_order: list[int] = list(current_order)
  best_cost: float = current_cost
  current_temperature: float = temperature

  for _ in range(iterations):

    # pick a random segment [left, right]; a degenerate single city is a no-op.
    left: int = generator.randint(0, city_count - 1)
    right: int = generator.randint(0, city_count - 1)
    if left > right:
      left, right = right, left
    if left == right:
      continue

    # propose: reverse that segment (a 2-opt-style move).
    candidate_order: list[int] = list(current_order)
    candidate_order[left : right + 1] = candidate_order[left : right + 1][::-1]
    candidate_cost: float = tour_length(distances, candidate_order)

    # accept downhill always; accept uphill with the Boltzmann probability.
    delta: float = candidate_cost - current_cost
    if delta < 0 or generator.random() < math.exp(-delta / current_temperature):
      current_order = candidate_order
      current_cost = candidate_cost

    # remember the best tour encountered.
    if current_cost < best_cost:
      best_order = list(current_order)
      best_cost = current_cost

    # cool geometrically, clamped above zero to keep exp() well-defined.
    current_temperature *= cooling
    current_temperature = max(current_temperature, 1e-9)

  return best_order

Heuristics dominate industrial practice precisely because they are flexible and fast. Their cost is the loss of guarantees: you rarely know how far from optimal you landed. The honest practice, Skiena stresses, is to test against known optima on small instances and against lower bounds on large ones.³
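That testing advice can be sketched as a small harness. The example below scores a nearest-neighbor TSP tour in two ways: against the exact optimum found by brute force (feasible only for a handful of cities), and against the MST weight, a lower bound that stays cheap to compute at any size. The function names and the five-city instance are invented for illustration.

```python
from itertools import permutations

Matrix = list[list[float]]


def tour_length(distances: Matrix, order: list[int]) -> float:
    """Length of the closed tour visiting `order` and returning to its start."""
    return sum(
        distances[order[i]][order[(i + 1) % len(order)]]
        for i in range(len(order))
    )


def nearest_neighbor_tour(distances: Matrix) -> list[int]:
    """Heuristic under test: always walk to the closest unvisited city."""
    order = [0]
    unvisited = set(range(1, len(distances)))
    while unvisited:
        here = order[-1]
        nearest = min(unvisited, key=lambda city: distances[here][city])
        unvisited.remove(nearest)
        order.append(nearest)
    return order


def optimal_tour_length(distances: Matrix) -> float:
    """Exact optimum by trying every tour; only for roughly ten cities or fewer."""
    rest = range(1, len(distances))
    return min(tour_length(distances, [0, *perm]) for perm in permutations(rest))


def mst_weight(distances: Matrix) -> float:
    """Prim's algorithm; the MST weight is a lower bound on any tour."""
    count = len(distances)
    in_tree = [False] * count
    best = [float("inf")] * count
    best[0] = 0.0
    total = 0.0
    for _ in range(count):
        # cheapest city not yet attached to the tree
        current = min(
            (city for city in range(count) if not in_tree[city]),
            key=lambda city: best[city],
        )
        in_tree[current] = True
        total += best[current]
        for city in range(count):
            if not in_tree[city] and distances[current][city] < best[city]:
                best[city] = distances[current][city]
    return total


# five cities on a line; the start (position 0) sits between the others
positions = [0.0, -1.0, 3.0, -6.0, 13.0]
distances = [[abs(a - b) for b in positions] for a in positions]

heuristic = tour_length(distances, nearest_neighbor_tour(distances))
optimum = optimal_tour_length(distances)
lower_bound = mst_weight(distances)
print(heuristic, optimum, lower_bound)
```

This prints `46.0 38.0 19.0`. Against the known optimum the heuristic is about 21% too long. Against the lower bound alone we could only certify a factor of $46 / 19 \approx 2.4$, which is the looser but still honest statement available when the optimum is out of reach.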

Exact exponential methods: branch and bound

Sometimes you genuinely need the optimal answer and the instances are small enough to afford exponential time, provided it is spent wisely. Search the space of solutions as a tree, pruning subtrees that provably cannot beat the best solution found so far — this is branch and bound.

The method interleaves two operations. Branching splits the problem into subproblems (e.g. vertex $v$ is in the cover vs. $v$ is out), forming a search tree. Bounding computes, for each subproblem, an optimistic estimate, a bound, on the best solution reachable within it. If that optimistic estimate is no better than the best complete solution found so far (the incumbent), the entire subtree is discarded unexplored.

The worst case is still exponential; branch and bound does not remove $NP$-hardness. But on real instances a good bound can prune away the overwhelming majority of the tree, making problems with thousands of variables routinely solvable. The quality of the bound is everything: a tight bound (often from a relaxation such as linear programming) prunes aggressively; a loose one leaves a near-complete exponential search. Modern integer-programming solvers are highly engineered branch-and-bound implementations.

Branch and bound: split on a variable, bound each subtree, and prune any whose optimistic bound $L$ is no better than the incumbent $U$.
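To illustrate a bound that comes from a relaxation, here is a minimal sketch for 0/1 Knapsack, a maximization problem, so the optimistic estimate is an upper bound. Allowing the last item to be taken fractionally gives the linear-programming relaxation, which a greedy fill by value density solves exactly. The function name is hypothetical, and all weights are assumed strictly positive.

```python
def knapsack_branch_and_bound(
    values: list[float],
    weights: list[float],
    capacity: float,
) -> float:
    """Best total value of a 0/1 knapsack, found by branch and bound."""
    # order by value density so the greedy fractional fill is the LP optimum
    items = sorted(
        range(len(values)),
        key=lambda i: values[i] / weights[i],
        reverse=True,
    )
    best_value = 0.0  # the incumbent

    def fractional_bound(position: int, value: float, room: float) -> float:
        """Optimistic estimate: fill the remaining room, splitting one item."""
        bound = value
        for i in items[position:]:
            if weights[i] <= room:
                room -= weights[i]
                bound += values[i]
            else:
                bound += values[i] * room / weights[i]
                break
        return bound

    def search(position: int, value: float, room: float) -> None:
        nonlocal best_value
        best_value = max(best_value, value)
        if position == len(items):
            return

        # prune: even the relaxed optimum cannot beat the incumbent
        if fractional_bound(position, value, room) <= best_value:
            return

        i = items[position]
        if weights[i] <= room:
            # branch 1: take item i
            search(position + 1, value + values[i], room - weights[i])
        # branch 2: skip item i
        search(position + 1, value, room)

    search(0, 0.0, capacity)
    return best_value


print(knapsack_branch_and_bound([60.0, 100.0, 120.0], [10.0, 20.0, 30.0], 50.0))
```

On this instance the search finds the optimum of 220 and discards the subtree that skips both of the two densest items, because its fractional bound of 120 cannot beat the incumbent.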

exact_vertex_cover.py (Python)

from __future__ import annotations

from collections.abc import Hashable
from typing import TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

def _remaining_edges(
  graph: Graph[Label],
  chosen: set[Label],
) -> list[tuple[Label, Label]]:
  """
    Edges of `graph` not yet covered by `chosen`, as endpoint-label pairs.\n
  """
  uncovered: list[tuple[Label, Label]] = []
  for edge in graph.edges():
    source_label: Label = edge.source.label
    target_label: Label = edge.target.label

    # skip self-loops and edges already touched by a chosen vertex.
    if source_label == target_label:
      continue
    if source_label in chosen or target_label in chosen:
      continue

    uncovered.append((source_label, target_label))

  return uncovered

def _matching_bound(uncovered: list[tuple[Label, Label]]) -> int:
  """
    Size of a greedy maximal matching on the uncovered edges.\n
    Any vertex cover of these edges needs at least this many vertices, so it\n
    is a valid lower bound for branch-and-bound pruning.\n
  """
  used: set[Label] = set()
  matching_size: int = 0

  # greedily grab disjoint edges; each pulls both endpoints out of play.
  for source_label, target_label in uncovered:
    if source_label not in used and target_label not in used:
      used.add(source_label)
      used.add(target_label)
      matching_size += 1

  return matching_size

def exact_vertex_cover(graph: Graph[Label]) -> set[Label]:
  """
    A minimum-size vertex cover of `graph`, found by branch and bound.\n
  """
  best_cover: set[Label] = set(graph.labels)

  def search(chosen: set[Label]) -> None:
    nonlocal best_cover
    uncovered: list[tuple[Label, Label]] = _remaining_edges(graph, chosen)

    # leaf: every edge is covered — record if it beats the incumbent.
    if not uncovered:
      if len(chosen) < len(best_cover):
        best_cover = set(chosen)
      return

    # prune: chosen + matching lower bound cannot improve the incumbent.
    lower_bound: int = len(chosen) + _matching_bound(uncovered)
    if lower_bound >= len(best_cover):
      return

    # branch on an uncovered edge (source, target): every cover must contain
    # at least one endpoint, so try each side.
    source_label, target_label = uncovered[0]
    search(chosen | {source_label})
    search(chosen | {target_label})

  search(set())
  return best_cover

Exploiting special structure

The final and most underrated strategy is to remember that $NP$-hardness is a worst case over all inputs, and your inputs may not be the worst. Many hard problems become easy when restricted to the structured instances that arise in practice.

Restricted graph classes. Problems that are $NP$-hard on general graphs frequently admit polynomial (even linear) algorithms on trees, on bipartite graphs, or on planar graphs. Independent Set, hard in general, falls to a simple greedy/dynamic program on trees.
Bounded parameters. A problem may be solvable in time $f(k) \cdot n^{O(1)}$, where $k$ is some small parameter of the instance (the solution size, the treewidth, the number of constraints). The exponential blow-up is confined to $k$, so if $k$ is small the algorithm is fast. This is the domain of fixed-parameter tractability: Vertex Cover, for instance, is solvable in $O(2^{k} \cdot n)$ for covers of size $k$, practical whenever the cover is small even if the graph is huge.⁴
Pseudo-polynomial algorithms. Numeric problems like $\textsc{Subset-Sum}$ and Knapsack have dynamic programs running in time polynomial in the numeric values, fast when the numbers are modest, exponential only because values can be exponentially large in their bit-length.

The lesson is to look hard at the instances you must actually solve before declaring defeat. Hardness in the worst case is fully compatible with easiness in your case.
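The pseudo-polynomial case can be sketched in a few lines. The example below is the textbook reachability table for Subset-Sum, assuming non-negative integer inputs; the function name and test values are chosen for illustration.

```python
def subset_sum(numbers: list[int], target: int) -> bool:
    """Whether some subset of the non-negative `numbers` sums to `target`."""
    if target < 0:
        return False

    # reachable[s] is True when some subset of the numbers seen so far sums to s
    reachable = [True] + [False] * target
    for number in numbers:
        # scan downward so each number is used at most once
        for total in range(target, number - 1, -1):
            if reachable[total - number]:
                reachable[total] = True
    return reachable[target]


print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True: 4 + 5
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

The running time is $O(n \cdot t)$ for $n$ numbers and target $t$. That is fast when $t$ is in the thousands, and hopeless when $t$ is a 200-bit integer, which is exactly the sense in which the algorithm is polynomial in the values but not in the input length.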

tree_independent_set.py (Python)

from __future__ import annotations

from collections.abc import Hashable
from typing import Generic, Optional, TypeVar

Label = TypeVar("Label", bound=Hashable)

class TreeNode(Generic[Label]):
  """
    One tree vertex: a label, a weight, and its child nodes.\n
  """

  def __init__(self, label: Label, weight: float = 1.0) -> None:
    self.label: Label = label
    self.weight: float = weight
    self.children: list[TreeNode[Label]] = []

  def add_child(
    self,
    label: Label,
    weight: float = 1.0,
  ) -> TreeNode[Label]:
    """
      Attach and return a new child carrying `label` and `weight`.\n
    """
    child: TreeNode[Label] = TreeNode(label, weight)
    self.children.append(child)
    return child

  def __repr__(self) -> str:
    return f"TreeNode({self.label!r}, w={self.weight})"

def max_weight_independent_set(
  root: Optional[TreeNode[Label]],
) -> float:
  """
    The largest total weight of a set of tree vertices, no two adjacent.\n
    An empty tree (`root` is None) has weight 0.\n
  """
  if root is None:
    return 0.0

  def solve(node: TreeNode[Label]) -> tuple[float, float]:
    """
      Returns (best with `node` taken, best with `node` skipped).\n
    """
    taken: float = node.weight
    skipped: float = 0.0
    for child in node.children:
      child_taken, child_skipped = solve(child)

      # taking `node` forbids taking its children.
      taken += child_skipped

      # skipping `node` lets each child pick its own better option.
      skipped += max(child_taken, child_skipped)
    return (taken, skipped)

  root_taken, root_skipped = solve(root)
  return max(root_taken, root_skipped)

fpt_vertex_cover.py (Python)

from __future__ import annotations

from collections.abc import Hashable
from typing import Optional, TypeVar

from graph import Graph

Label = TypeVar("Label", bound=Hashable)

def _first_uncovered_edge(
  graph: Graph[Label],
  chosen: set[Label],
) -> Optional[tuple[Label, Label]]:
  """
    An edge with neither endpoint in `chosen`, or None if all are covered.\n
  """
  for edge in graph.edges():
    source_label: Label = edge.source.label
    target_label: Label = edge.target.label

    # the first real edge with both endpoints still outside the cover.
    if source_label == target_label:
      continue
    if source_label not in chosen and target_label not in chosen:
      return (source_label, target_label)

  return None

def vertex_cover_within(
  graph: Graph[Label],
  budget: int,
) -> Optional[set[Label]]:
  """
    A vertex cover of `graph` of size at most `budget`, or None if none\n
    exists. Runs in O(2^budget * (V + E)) time.\n
  """

  def search(chosen: set[Label], remaining: int) -> Optional[set[Label]]:
    edge: Optional[tuple[Label, Label]] = _first_uncovered_edge(graph, chosen)

    # base case: nothing left uncovered — `chosen` is a valid cover.
    if edge is None:
      return set(chosen)

    # an uncovered edge remains but the budget is spent: failure here.
    if remaining == 0:
      return None

    # branch: one endpoint of this edge must join the cover.
    source_label, target_label = edge
    for endpoint in (source_label, target_label):
      result: Optional[set[Label]] = search(chosen | {endpoint}, remaining - 1)
      if result is not None:
        return result
    return None

  return search(set(), budget)

graph.py

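# postpone annotation evaluation: Edge refers to Vertex before it is defined.
from __future__ import annotations

from collections.abc import Hashable, Iterator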
from typing import Generic, Optional, TypeVar


Label = TypeVar("Label", bound=Hashable)


class Edge(Generic[Label]):
  """
    A directed connection from `source` to `target`, carrying a weight.\n
  """

  def __init__(
    self,
    source: Vertex[Label],
    target: Vertex[Label],
    weight: float = 1.0,
  ) -> None:
    self.source: Vertex[Label] = source
    self.target: Vertex[Label] = target
    self.weight: float = weight

  def __repr__(self) -> str:
    return f"Edge({self.source.label!r} -> {self.target.label!r}, w={self.weight})"


class Vertex(Generic[Label]):
  """
    A graph vertex: a label plus the list of edges leaving it.\n
  """

  def __init__(self, label: Label) -> None:
    self.label: Label = label
    self.outgoing: list[Edge[Label]] = []

  def neighbors(self) -> list[Vertex[Label]]:
    """
      The vertices reachable from this one by a single edge.\n
    """
    return [edge.target for edge in self.outgoing]

  def edge_to(self, label: Label) -> Optional[Edge[Label]]:
    """
      The outgoing edge to the vertex with `label`, or None.\n
    """
    for edge in self.outgoing:
      if edge.target.label == label:
        return edge
    return None

  def __repr__(self) -> str:
    return f"Vertex({self.label!r})"


class Graph(Generic[Label]):
  """
    A graph of Vertex objects linked by Edge objects.\n
    Pass `directed=True` for a digraph; otherwise each `add_edge` inserts\n
    the reverse edge too.\n
  """

  def __init__(self, directed: bool = False) -> None:
    self.directed: bool = directed
    self._vertices: dict[Label, Vertex[Label]] = {}

  def add_vertex(self, label: Label) -> Vertex[Label]:
    """
      Return the vertex for `label`, creating it if it is absent.\n
    """
    # reuse the existing vertex, or mint and register a fresh one.
    vertex = self._vertices.get(label)
    if vertex is None:
      vertex = Vertex(label)
      self._vertices[label] = vertex
    return vertex

  def add_edge(
    self,
    source_label: Label,
    target_label: Label,
    weight: float = 1.0,
  ) -> None:
    """
      Connect two labels (creating either vertex as needed).\n
      Adds the reverse edge as well when the graph is undirected.\n
    """
    source = self.add_vertex(source_label)
    target = self.add_vertex(target_label)

    # link source to target, and mirror it back when undirected.
    source.outgoing.append(Edge(source, target, weight))
    if not self.directed:
      target.outgoing.append(Edge(target, source, weight))

  def vertex(self, label: Label) -> Vertex[Label]:
    """
      The vertex carrying `label` (raises KeyError if absent).\n
    """
    return self._vertices[label]

  @property
  def vertices(self) -> list[Vertex[Label]]:
    """
      Every vertex, in insertion order.\n
    """
    return list(self._vertices.values())

  @property
  def labels(self) -> list[Label]:
    """
      Every vertex label, in insertion order.\n
    """
    return list(self._vertices)

  def edges(self) -> Iterator[Edge[Label]]:
    """
      Each edge once — an undirected edge is yielded a single time.\n
    """
    # track undirected endpoint pairs so each is emitted only once.
    seen: set[frozenset[Label]] = set()

    for vertex in self._vertices.values():
      for edge in vertex.outgoing:
        # skip an undirected edge already yielded from the other endpoint.
        if not self.directed:
          endpoints = frozenset((edge.source.label, edge.target.label))
          if endpoints in seen:
            continue
          seen.add(endpoints)

        yield edge

  def __contains__(self, label: Label) -> bool:
    return label in self._vertices

  def __iter__(self) -> Iterator[Vertex[Label]]:
    return iter(self._vertices.values())

  def __len__(self) -> int:
    return len(self._vertices)

Parameterized complexity as its own theory

The bounded-parameter idea above grew into a full complexity theory, parameterized complexity, developed by Downey and Fellows in the 1990s.⁵ It replaces the single input size $n$ with a pair $(n, k)$, where $k$ is a chosen parameter, and asks for algorithms whose exponential cost is confined to $k$. The central class is FPT (fixed-parameter tractable): problems solvable in time $f(k) \cdot n^{O(1)}$ for some computable function $f$. Vertex Cover's $O(2^{k} \cdot n)$ bound puts it in FPT; some superpolynomial dependence on $k$ is unavoidable unless $P = NP$ (the problem is still $NP$-hard), but it is isolated from $n$.
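As a sketch of where the $O(2^{k} \cdot n)$ bound comes from, consider the bounded search in `vertex_cover_within` above. The derivation assumes that each call spends at most $c \cdot n$ steps finding an uncovered edge, where $n$ stands for the size of the graph and $c$ is a constant introduced here for illustration, and that each call then recurses at most twice with the budget reduced by one:

```latex
\begin{aligned}
T(0) &\le c\,n, \\
T(k) &\le 2\,T(k-1) + c\,n \qquad (k \ge 1), \\
T(k) &\le c\,n\,\bigl(1 + 2 + \dots + 2^{k}\bigr) = c\,n\,\bigl(2^{k+1} - 1\bigr) = O(2^{k} \cdot n).
\end{aligned}
```

The factor $2^{k+1} - 1$ is just the number of nodes in a full binary tree of depth $k$; the input size enters only through the linear work done at each node.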

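Two results make this more than a definition. The first is kernelization, a provable form of preprocessing. A kernelization reduces an instance $(x, k)$ in polynomial time to an equivalent instance whose size is bounded by a function of $k$ alone. For Vertex Cover, the Buss reduction gets down to $O(k^{2})$ vertices: any vertex of degree $> k$ must be in every cover of size at most $k$, so take it and lower the budget by one. Once no such vertex remains, each vertex covers at most $k$ edges, so $k$ vertices cover at most $k^{2}$ of them; an instance with more edges left has no cover within budget and can be rejected, and isolated vertices can simply be dropped. After kernelization the surviving core is small whenever $k$ is small, and brute force finishes the job. A theorem ties the two ideas together: a decidable parameterized problem is in FPT if and only if it has a kernel.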

Kernelization shrinks a large instance to a core of size bounded in terms of $k$ in polynomial time; brute force then finishes on the small kernel.
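The Buss reduction described above can be sketched in a few lines. This is a minimal illustration rather than a tuned implementation: it assumes a simple undirected graph given as a list of integer pairs, and the names `buss_kernel` and `cover_via_kernel` are hypothetical helpers, not part of the course's `graph.py`. After forcing every high-degree vertex, it rejects any instance with more than `left * left` edges, since `left` vertices of degree at most `left` cannot cover more than that.

```python
from __future__ import annotations

from collections import Counter
from itertools import combinations
from typing import Optional

Edge = tuple[int, int]


def buss_kernel(
  edges: list[Edge],
  budget: int,
) -> Optional[tuple[list[Edge], int, set[int]]]:
  """
    Returns (kernel edges, remaining budget, forced vertices), or None when
    no cover of size at most `budget` can exist.
  """
  # assumption: simple undirected graph; self-loops are dropped and
  # (u, v) and (v, u) count as one edge.
  remaining: set[Edge] = {(min(u, v), max(u, v)) for u, v in edges if u != v}
  forced: set[int] = set()

  while True:
    left: int = budget - len(forced)
    if left < 0:
      return None

    degree: Counter[int] = Counter()
    for u, v in remaining:
      degree[u] += 1
      degree[v] += 1

    # a vertex with more than `left` edges must be in every cover that fits.
    high = next((v for v, d in degree.items() if d > left), None)
    if high is None:
      break
    forced.add(high)
    remaining = {edge for edge in remaining if high not in edge}

  # `left` vertices of degree <= `left` cover at most left^2 edges.
  if len(remaining) > left * left:
    return None
  return (sorted(remaining), left, forced)


def cover_via_kernel(edges: list[Edge], budget: int) -> Optional[set[int]]:
  """
    Kernelize, then brute-force the kernel; None if no cover fits the budget.
  """
  reduced = buss_kernel(edges, budget)
  if reduced is None:
    return None
  kernel, left, forced = reduced

  # only vertices that still touch a kernel edge are worth trying.
  vertices = sorted({v for edge in kernel for v in edge})
  for size in range(left + 1):
    for subset in combinations(vertices, size):
      picked = set(subset)
      if all(u in picked or v in picked for u, v in kernel):
        return forced | picked
  return None
```

The brute-force step enumerates subsets of the kernel only, so its cost depends on the budget rather than on the size of the original graph. A production solver would more likely hand the kernel to the bounded search tree than to plain subset enumeration.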

The second result is a hardness theory for parameters. Not every parameterized problem is FPT; in the $W$-hierarchy ($FPT \subseteq W[1] \subseteq W[2] \subseteq \dots$), $W[1]$-hardness plays the role that $NP$-hardness plays in ordinary complexity. Clique parameterized by solution size is $W[1]$-hard, which is strong evidence it has no $f(k) \cdot n^{O(1)}$ algorithm; its best known algorithms run in $n^{O(k)}$ time, with the exponent growing in $k$. So parameterized complexity draws a second, finer tractability line right through the class of $NP$-hard problems: Vertex Cover and Clique are both $NP$-complete and inter-reducible, yet parameterized by solution size one is FPT and the other is $W[1]$-hard. The choice of parameter decides which side a problem lands on.
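To make the gap between $f(k) \cdot n^{O(1)}$ and $n^{O(k)}$ concrete, the following sketch compares two rough step counts for a fixed parameter $k = 10$. Both functions are illustrative stand-ins, not measured running times: `fpt_work` models the $2^{k} \cdot n$ bound for Vertex Cover, and `subset_count` counts the $k$-vertex subsets that a naive Clique search would inspect.

```python
import math


def fpt_work(n: int, k: int) -> int:
  """Rough step count for an O(2^k * n) bounded search tree."""
  return 2**k * n


def subset_count(n: int, k: int) -> int:
  """Number of k-vertex subsets a brute-force clique search may inspect."""
  return math.comb(n, k)


# with k fixed, the first column grows linearly in n, the second like n^k.
for n in (100, 1_000, 10_000):
  print(f"n={n:>6}  2^k*n={fpt_work(n, 10):.2e}  C(n,k)={subset_count(n, 10):.2e}")
```

Doubling $n$ doubles the first count but multiplies the second by more than a thousand, which is the practical meaning of keeping $k$ out of the exponent of $n$.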

Choosing a strategy

These responses are not rivals so much as a toolkit; the right choice depends on what you can tolerate.

Need a guarantee and can accept near-optimal? Reach for an approximation algorithm.
Need speed and flexibility and can live without guarantees? Use a heuristic or local search, validated empirically.
Need the exact optimum on instances of modest size? Invest in branch and bound with the tightest bound you can compute.
Do your real instances have structure, such as small parameters, special graph shape, or modest numbers? Exploit it, possibly turning the problem polynomial outright.

Final thoughts

Looking back, the course covered three things: how to analyze (asymptotics, recurrences, invariants), how to design across a small repertoire of paradigms (divide and conquer, dynamic programming, greedy, graph search, network flow), and how to recognize the limits of design through reductions and $NP$-completeness. When the limit is real, change the question: trade exactness, optimality, or generality for tractability, and state clearly what was given up.

The paradigms compose. A branch-and-bound solver gets its bound from a relaxation (often linear programming); an approximation proof rests on a combinatorial lower bound like a matching; a special-case algorithm is frequently just dynamic programming rediscovered on a tree. The reduction habit — map my problem onto one I already understand — is the same whether you are proving hardness or borrowing an algorithm. A few paradigms, understood deeply, cover problems you have never seen.

Natural sequels to this material are advanced algorithms and data structures (Fibonacci heaps, the engineering behind Dijkstra/Prim and disjoint-set union), complexity and computability theory (the formal machinery beneath $P$ and $NP$), and the specialized branches (randomized, streaming, distributed, and geometric algorithms) where each of the paradigms above reappears in a new guise.

Takeaways

$NP$-hardness rules out a fast exact algorithm for the worst case (unless $P = NP$), not useful answers in practice. There are four honest responses.
An approximation algorithm trades optimality for a provable ratio $C \leq \rho C^{*}$. The proofs lean on a lower bound for the unknown optimum, as in the 2-approximation for vertex cover, which takes both endpoints of every edge in a maximal matching $M$, giving $|C| = 2|M| \leq 2 C^{*}$.
Heuristics and local search (2-opt, simulated annealing, tabu search) are fast and flexible but unguaranteed; validate them empirically.
Branch and bound finds the exact optimum by searching a tree and pruning subtrees whose optimistic bound cannot beat the incumbent; still exponential in the worst case, often fast in practice.
Special structure (trees, planar or bounded-treewidth graphs, small parameters, modest numeric values) frequently turns a worst-case-hard problem tractable on the instances you actually face.
Capstone view. The course is a small kit of composable paradigms (divide and conquer, dynamic programming, greedy, graph methods, network flow) bounded by reductions and $NP$-completeness. When hardness is real, you change the question (approximation, heuristic, exact-but-exponential, special case) rather than abandon it.

Footnotes

1. CLRS, Ch. 35, Approximation Algorithms (§35.1): the $2$-approximation for vertex cover via a maximal matching lower bound, $|C| = 2|M| \leq 2 C^{*}$.
2. CLRS, Ch. 35, Approximation Algorithms: polynomial-time approximation schemes (PTAS), including Knapsack and Euclidean TSP.
3. Skiena, §11, NP-Completeness; Heuristics: local search and metaheuristics, and validating unguaranteed heuristics against known optima and lower bounds.
4. Erickson, Ch. 12, NP-Hardness: exploiting special structure, including fixed-parameter tractability such as $O(2^{k} \cdot n)$ vertex cover.
5. Rodney G. Downey and Michael R. Fellows, Parameterized Complexity (Springer, 1999), and Fundamentals of Parameterized Complexity (2013): the FPT class, the kernelization/FPT equivalence, and the $W$-hierarchy with $W[1]$-hardness of Clique by solution size.
