Lowest Common Ancestor & Binary Lifting
Given a rooted tree, the lowest common ancestor of and is the deepest node that is an ancestor of both. A naive walk answers one query in ; binary lifting precomputes the -th ancestor of every node in , then answers -th-ancestor and LCA queries in each.
╌╌╌╌
The previous lessons gave us a rooted tree and a single root-to-node path for each vertex. Many problems instead ask about two vertices at once: how far apart are and ? what is the highest fork their paths share? which company contains both of two nested regions? Each reduces to one question: the lowest common ancestor.
The LCA is well defined and unique: the sets of ancestors of and of are each a chain from the root, so their intersection is a chain, and a finite chain has a unique deepest element.
The naive walk
If every node stores a parent pointer and a depth, one query is easy. Lift the deeper of until both sit at the same depth, then advance both pointers up in lockstep; the first node they agree on is the LCA.
- 1while do
- 2
- 3while do
- 4
- 5while do
- 6
- 7
- 8return
This needs no preprocessing and is correct, but each step moves up one edge, so a query costs where is the tree's height. On a balanced tree , but on a degenerate path , and queries cost . We want a query cost that does not depend on shape. (For the asymptotic notation, see asymptotic analysis.)
Binary lifting
The fix is to make each jump cover an exponentially larger distance. Instead of
go up one,
precompute, for every node and every , a pointer that goes up
edges at once.
The whole table is built from a single doubling identity: climbing edges is climbing edges twice.
So column of the table is computed entirely from column , one pass per power of two. The number of columns is , since no node has an ancestor more than edges up.
- 1run a DFS/BFS from the root to fill and
- 2for each node do
- 3root points to itself
- 4for to do
- 5for each node do
- 6
The table has entries and each costs , so preprocessing is time and space.
-th ancestor in
Any non-negative integer has a unique binary expansion, so the climb of
edges decomposes into jumps of size , one jump per set bit.
Take each set bit from low to high and follow the matching column of up.
- 1for to do
- 2if has bit set then
- 3
- 4if then return nilran off the root
- 5return
At most bits are set, so this is . The order of the jumps does not matter for the destination (they compose to the same total climb), but processing low bits first keeps the running node well-defined at each step.
LCA in
The LCA query reuses the same jumps as two phases. Phase 1 lifts the deeper node up by exactly , a single call, so and sit at equal depth. If they now coincide, one was an ancestor of the other and we are done. Phase 2 lifts both nodes simultaneously: scanning from high to low, we jump both up by only when that keeps them distinct. When the loop ends, and are the two distinct children-side nodes just below the LCA, so the answer is their common parent.
- 1if then swap
- 2phase 1
- 3if then return
- 4for downto dophase 2
- 5if then
- 6
- 7
- 8returntheir common parent
The depth-equalizing jump is and the second loop runs times, so each LCA query is after the one-time build.
Here and already share depth; both lift to and (kept distinct),
then one parent step lands on , drawn in acc.
A worked table
The whole method lives in the up grid. For the small tree below, rows are nodes
and columns are , with each entry the -th ancestor. Column is
the parent; every later column is the previous column composed with itself.
Reading row : its st ancestor is (), its nd is , its th
is . To find the th ancestor of we write and jump by the
and columns — cells highlighted in acc — climbing then
edges; on this depth- tree that runs off the root, which
reports as nil.
Application: tree distance and path queries
LCA turns a two-vertex path question into arithmetic on depths. The unique path from to in a tree goes up from to and back down to , so its length is
Each query is one LCA plus work, hence . The same decomposition
answers is on the path?
, aggregates a value along the path (split
into the two vertical legs), or, combined with , emits
step-by-step U/L/R directions: climb steps
up, then walk the recorded downward path to .
Alternatives
Binary lifting is the most broadly useful LCA method, but two others are worth knowing.1
- Euler tour + sparse-table RMQ. Record the Euler traversal of the tree (each
node appended on entry and after each child returns); within it, the LCA of
and is the shallowest node visited between any occurrence of and of
. That reduces LCA to a range-minimum query over the depth array, which a
sparse table answers in after an build.2 So queries
drop to , but the structure is static and does not directly give -th
ancestors.
The reduction is best seen laid out. Below the tree, the Euler tour writes each node as it is entered and re-entered, with its depth underneath. The LCA of and is the shallowest entry anywhere between an occurrence of and one of — i.e. the minimum of that depth subarray (shaded), which here is :
- Tarjan's offline LCA. If all query pairs are known in advance, a single DFS with a union-find structure answers them in near-linear total time, processing each query when its second endpoint is first reached.3
Takeaways
- The lowest common ancestor of and is the deepest node ancestral to both; it is unique because ancestor sets are root-chains.
- The naive walk (equalize depth, climb together) needs no preprocessing but costs per query, or on a degenerate tree.
- Binary lifting precomputes , the -th ancestor, via the doubling identity in time and space.
- A -th-ancestor query jumps by each -bit of ; an LCA query lifts the deeper node to equal depth, then jumps both up by decreasing powers of two while they stay distinct — each .
- Tree distance is , turning path queries into arithmetic.
- Alternatives: Euler tour + sparse-table RMQ gives queries on a static tree; Tarjan's union-find DFS answers all queries offline in near-linear time. Binary lifting wins on being online and also serving -th ancestors.
Footnotes
- Skiena, § — Trees / LCA: survey of LCA strategies and the preprocessing/query trade-off across query models. ↩
- Erickson, Ch. — Trees: the Euler-tour reduction of LCA to range-minimum, with sparse-table RMQ giving queries after preprocessing. ↩
- CLRS, Ch. — (trees): Tarjan's offline LCA via depth-first search and disjoint-set union, near-linear in . ↩
╌╌ END ╌╌