Data Structures/AVL Trees

Lesson 4.41,388 words

AVL Trees

An AVL tree is the first balanced BST: at every node the two subtrees' heights differ by at most 11. A Fibonacci-style minimal-node argument forces height $h \le 1.

╌╌╌╌

The previous lesson left us with a binary search tree that does everything in time: magnificent when the tree is bushy, ruinous when a run of sorted insertions stretches it into a height- path. The fix is to refuse to let the tree get tall: pick a structural invariant that pins height to , and restore it after every update. The AVL tree, named for Adelson-Velsky and Landis (1962), is the oldest such scheme and the easiest to reason about, balancing by a direct height constraint at every node.1

To talk about rebalancing we need one primitive, which the next lesson on balanced search trees treats in full: the rotation. A rotation rearranges three pointers to change a subtree's shape, and so its height, while preserving the BST ordering and running in time. A right rotation at lifts its left child to the root and makes the right child of ; a left rotation is the inverse. That is all we need here: a cheap, order-preserving lever for trading height between siblings.

The AVL invariant

Define the height of a node as the number of edges on the longest downward path to a leaf, with . The balance factor is

and the invariant says exactly for every node. We store (or, equivalently, ) in each node and keep it current as the tree changes; both a search and an in-order walk ignore the field entirely, so a read-only operation on an AVL tree is just a BST operation.

A node with is left-heavy past tolerance, right-heavy; these are the two illegal states an update can create, and the two we must repair.

Height is logarithmic

The invariant is worth what it buys, and what it buys is a logarithmic height bound. The cleanest argument is extremal: ask how few nodes an AVL tree of a given height can possibly contain. A sparse tree is the dangerous case, so if even the sparsest legal tree is bushy, all of them are.

The minimal-node trees are exactly the Fibonacci trees: the sparsest height- AVL tree is a root over a Fibonacci tree of height and one of height . Because , an AVL tree is provably shorter than the worst-case red-black tree of the same size, so its lookups touch fewer nodes. With guaranteed, search, , , , and successor all run in worst-case time.3

The sparsest AVL tree of height — a Fibonacci tree with the minimum nodes. Each node's two subtrees differ in height by exactly (the extreme the invariant allows): the root sits over a minimal height- tree and a minimal height- tree, recursively. Subtree heights are labeled.

Insertion: rebalancing on the way up

We insert a key exactly as in a plain BST: descend to a leaf position and attach the new node. That can only push heights up along the single root-to-leaf path we just walked, so we retrace it upward, recomputing at each node. Let be the lowest node whose balance factor has become . Its imbalance was caused by the new node landing in one of four positions relative to , named by the two steps down the heavy path from toward the insertion:

  • LL: left child left-heavy, a single right rotation at .
  • RR: right child right-heavy, a single left rotation at .
  • LR: left child right-heavy, a left rotation at the child, then a right rotation at .
  • RL: right child left-heavy, a right rotation at the child, then a left rotation at .

Consider the LL case. Node has balance , its left child leans left, and the new node sits under 's left child . A single right rotation at lifts into 's place, hangs as 's right child, and rehomes 's old right subtree as 's new left subtree.

The LL case — a left-left-heavy subtree fixed by one right rotation at

The LR case cannot be fixed by a single rotation: the heavy path zig-zags, so we first rotate the child to straighten it into an LL shape, then finish with the right rotation at . Here has balance , its left child leans right, and the offending grandchild is 's right child. A left rotation at lifts above ; the subtree is now left-left-heavy, and a right rotation at completes the repair, leaving as the new root.

The LR case — left rotation at , then right rotation at (a double rotation)

The RR and RL cases are the left-right mirrors of LL and LR. For completeness, here is the RL double rotation, the mirror of the LR case just shown:

The RL case — right rotation at , then left rotation at (the mirror of LR). Here has balance , its right child leans left, and the offending grandchild is 's left child; the double rotation lifts to the root.

In every case the rebalanced subtree ends up with the same height it had before the insertion, which is the decisive fact:

So insertion is to descend, to walk back up adjusting heights, and at most a double rotation ( rotations) to repair: overall. The dispatch is driven entirely by the stored heights:

Algorithm:Insert-Fixup\textsc{Insert-Fixup} — retrace the insertion path, repairing the lowest violation
  1. 1
    xparent of the new leafx \gets \text{parent of the new leaf}
  2. 2
    while xnilx \ne \text{nil} do
  3. 3
    h(x)1+max(h(left(x)),h(right(x)))h(x) \gets 1 + \max\bigl(h(left(x)),\, h(right(x))\bigr)
  4. 4
    bh(right(x))h(left(x))b \gets h(right(x)) - h(left(x))
    balance factor of xx
  5. 5
    if b=2b = -2 then
    left-heavy: LL or LR
  6. 6
    if h(left(left(x)))h(right(left(x)))h(left(left(x))) \ge h(right(left(x))) then
  7. 7
    Right-Rotate(x)\textsc{Right-Rotate}(x)
    LL
  8. 8
    else
  9. 9
    Left-Rotate(left(x))\textsc{Left-Rotate}(left(x)) ; Right-Rotate(x)\textsc{Right-Rotate}(x)
    LR
  10. 10
    break
    height restored, ancestors safe
  11. 11
    else if b=+2b = +2 then
    right-heavy: RR or RL
  12. 12
    if h(right(right(x)))h(left(right(x)))h(right(right(x))) \ge h(left(right(x))) then
  13. 13
    Left-Rotate(x)\textsc{Left-Rotate}(x)
    RR
  14. 14
    else
  15. 15
    Right-Rotate(right(x))\textsc{Right-Rotate}(right(x)) ; Left-Rotate(x)\textsc{Left-Rotate}(x)
    RL
  16. 16
    break
  17. 17
    xparent(x)x \gets parent(x)

Deletion: the same cases, but up the whole path

Deletion starts as in a BST, splicing out the node, or its in-order successor if it has two children, then retraces the path to the root updating heights, applying exactly the same four rotation cases at any node that reaches . The one difference is consequential. A rotation that repairs an insertion preserves the subtree's height, but a rotation that repairs a deletion can shrink the subtree by one. That shorter subtree may unbalance the node's parent, which may unbalance its parent, and so on. Deletion can therefore trigger up to rotations cascading toward the root, though each is still , so the operation remains .3

Why deletion cascades. A rotation that repairs the subtree at leaves it one shorter (height ), so the height drop propagates to the parent : with its other subtree still height , 's balance becomes and must itself be rotated — and that rotation can shorten , unbalancing 's parent, all the way to the root. (Insertion never does this: its repair restores the original height.)

AVL versus red-black

AVL and red-black trees both guarantee height, but they sit at opposite ends of a tradeoff. The AVL height-difference invariant is strict: , noticeably shorter than red-black's , so AVL lookups are faster, the structure of choice for read-heavy workloads. The price is paid on updates: AVL trees track exact heights and may rotate to repair even small imbalances, whereas red-black trees tolerate looser balance and lean on cheap recolorings, doing fewer rotations per update. Roughly: AVL is more rigidly balanced and faster to search; red-black is cheaper to mutate, which is why it backs most standard-library ordered maps.4

Takeaways

  • An AVL tree is a BST with the invariant that every node's two subtrees differ in height by at most , i.e. its balance factor ; each node stores its height (or balance).
  • A minimal-node / Fibonacci-tree argument gives , so and height , hence all operations are worst-case .
  • Insertion descends as a BST, retraces the path updating heights, and repairs the lowest unbalanced node with one of four cases, LL (right), RR (left), LR (left-then-right), RL (right-then-left), using rotations, because the fix restores the subtree's prior height.
  • Deletion uses the same four cases but its rotations can shrink a subtree, so the rebalancing may cascade up to the root, taking rotations.
  • Versus red-black trees: AVL is more rigidly balanced (shorter, faster lookups) but does more rotations per update.

Footnotes

  1. Erickson, Ch. — Balanced Binary Search Trees: the height-balance invariant and its restoration by rotation after each update.
  2. CLRS, Problem 13-3 — AVL Trees: the Fibonacci minimal-node recurrence and the height bound; insertion via with rotations.
  3. Skiena, §3.4 — Balanced Search Trees: AVL operations are ; deletion may rebalance along the full root path. 2
  4. Skiena, §3.4 — Balanced Search Trees: the AVL-vs-red-black tradeoff — tighter balance and faster search versus cheaper update.
Practice

╌╌ END ╌╌