Elementary Data Structures

Every data structure in this course (every tree, heap, hash table, and graph) is ultimately stored in one of two ways, and the choice affects everything built on top of it: either the elements are laid out contiguously in one block of memory, or they are scattered and joined by pointers. This first lesson works out the two strategies and the four ordered containers (array, linked list, stack, queue) that the rest of the module builds on.

Two ways to store a sequence

A contiguous structure stores its elements in a single block of memory, one after another. An array of $n$ elements each of size $s$ occupies one run of $n s$ bytes, so element $i$ lives at a known offset from the start. A linked structure stores each element in its own separately-allocated node and uses a pointer in each node to find the next; the nodes may sit anywhere in memory.¹

The contrast is sharp, and it drives every later choice:

Random access. Contiguous wins outright. Element $i$ of an array is at address $base + i \cdot s$, computed with one multiply-add, so indexing is $O(1)$. In a linked list there is no address arithmetic; to reach the $i$-th node you must follow $i$ pointers, which is $O(n)$.
Splicing. Linked wins outright. To insert or delete an element given a pointer to its node, a linked list rewires a constant number of pointers in $O (1)$ ; an array must shift every later element to keep the block contiguous, which is $O (n)$ .
Cache locality. Contiguous wins. A modern CPU reads memory in cache lines and prefetches sequentially, so a linear scan of an array is far faster than chasing pointers across scattered nodes, even though both perform $Θ(n)$ comparisons. The gap is a constant factor, not an asymptotic one, but it is a large constant.
Space overhead. Linked pays per-element: every node carries one or two pointers besides its key. An array pays nothing per element but may reserve unused capacity (below).
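The random-access contrast can be sketched in a few lines. This is only an illustration under stated assumptions: Python's `struct` module stands in for raw memory, elements are assumed to be 4-byte little-endian integers, and the names `array_get`, `Node`, and `list_get` are hypothetical helpers rather than part of the lesson's code.

```python
import struct

ELEMENT_SIZE = 4  # s, assuming 4-byte little-endian signed ints

# contiguous: the three elements packed back to back in one block of bytes.
block = struct.pack("<3i", 9, 16, 4)

def array_get(block: bytes, i: int) -> int:
  """Read element i at byte offset i * s; no traversal is involved."""
  return struct.unpack_from("<i", block, i * ELEMENT_SIZE)[0]

class Node:
  """One separately allocated node: a key and a link to its successor."""

  def __init__(self, key: int, next: "Node | None" = None) -> None:
    self.key = key
    self.next = next

# linked: three nodes joined by pointers.
head = Node(9, Node(16, Node(4)))

def list_get(head: Node, i: int) -> tuple[int, int]:
  """Return (key, hops) for element i, following i pointers from the head."""
  node, hops = head, 0
  for _ in range(i):
    assert node.next is not None  # assumes i is within the list
    node = node.next
    hops += 1
  return node.key, hops

print(array_get(block, 2))  # 4, from one offset computation
print(list_get(head, 2))    # (4, 2): the same element after two hops
```

The array read costs the same for any index, while the hop count returned by `list_get` grows with the index.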

The same sequence $\langle 9, 16, 4 \rangle$ stored two ways. Contiguous: one block, element $i$ at $base + i \cdot s$, so indexing is address arithmetic. Linked: three separately-allocated nodes scattered in memory, each pointing to the next, so reaching element $i$ means following $i$ pointers.

Arrays and dynamic arrays

A fixed-size array is the contiguous structure in its purest form: allocate $n$ slots up front, index any of them in $O (1)$ . Its limitation is that $n$ is fixed at allocation. Real programs rarely know the final size in advance, so we want a structure that grows.

A dynamic array (C++ vector, Python list, Java ArrayList) keeps a contiguous backing block of some capacity $\geq$ the current size, and appends into the spare room. When the block fills, it allocates a larger block, copies the elements over, and frees the old one. The design decision that matters is how much larger, and the answer is to double the capacity.

Algorithm: $\textsc{Append}(A, x)$ (push $x$, doubling the backing block on overflow)

if $size(A) = capacity(A)$ then
    $cap' \gets \max(1, 2 \cdot capacity(A))$
    allocate new block $B$ of capacity $cap'$
    copy $A[0\,..\,size(A)-1]$ into $B$    // the $O(n)$ resize
    free old block; $store(A) \gets B$; $capacity(A) \gets cap'$
$A[size(A)] \gets x$
$size(A) \gets size(A) + 1$
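One way the pseudocode might look in Python. This is a sketch, not a description of how Python's built-in list is implemented: the class name `DynamicArray` is hypothetical, a plain list of fixed length plays the role of the backing block, and the array is assumed to start at capacity 0, so the `max(1, ...)` guard matters on the first append. The old block is not freed explicitly because Python's garbage collector reclaims it.

```python
from typing import Any, Optional

class DynamicArray:
  """Hypothetical sketch of Append with capacity doubling."""

  def __init__(self) -> None:
    # the backing block; its length is the capacity, not the size.
    self._store: list[Optional[Any]] = []
    self._size: int = 0

  @property
  def capacity(self) -> int:
    return len(self._store)

  def append(self, x: Any) -> None:
    # on overflow, allocate a block twice as large and copy everything over.
    if self._size == self.capacity:
      new_capacity = max(1, 2 * self.capacity)
      new_store: list[Optional[Any]] = [None] * new_capacity
      for i in range(self._size):  # the O(n) resize
        new_store[i] = self._store[i]
      self._store = new_store

    # the common case: write into spare room and bump the size.
    self._store[self._size] = x
    self._size += 1

  def __getitem__(self, i: int) -> Any:
    if not 0 <= i < self._size:
      raise IndexError("index out of range")
    return self._store[i]

  def __len__(self) -> int:
    return self._size

array = DynamicArray()
capacities = []
for value in range(5):
  array.append(value)
  capacities.append(array.capacity)

print(capacities)  # [1, 2, 4, 4, 8]
```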

A single append is usually $O (1)$ , writing into spare room and bumping the size, but the appends that trigger a resize cost $Θ (n)$ because they copy the whole array. The worst case of one operation is therefore $O (n)$ . Yet the average cost over a sequence of appends is $O (1)$ , and this is worth proving.

A resize on overflow. The full block of capacity $4$ cannot take a fifth element, so a new block of capacity $8$ is allocated, the four elements are copied over (the $Θ(n)$ step, accented arrows), $x$ is written into the first spare slot, and the old block is freed. The four trailing slots are reserved-but-unused capacity.

The geometric growth is what makes this work: each resize is twice as expensive as the last but happens half as often, so the costs telescope into a constant per operation. Growing by a fixed increment instead of doubling would make the same $n$ appends cost $Θ (n^{2})$ . Amortized $O (1)$ has two costs: an occasional latency spike on the doubling step, and up to $2 \times$ wasted capacity right after a resize.²

Doubling growth over $16$ appends: the capacity staircase (accent) jumps $1 \to 2 \to 4 \to 8 \to 16$, while the size (filled) rises by one each append. Each jump is a $Θ(n)$ copy, and the gap above the fill is the reserved-but-unused capacity.

The trace, append by append

The lemma's algebra is worth checking on a concrete trace. Start from an empty array with capacity $1$ and run $16$ appends, counting one unit per element written or copied:

append	capacity before	resize?	copies	cost
$1$	$1$	no	$0$	$1$
$2$	$1$	grow to $2$	$1$	$2$
$3$	$2$	grow to $4$	$2$	$3$
$4$	$4$	no	$0$	$1$
$5$	$4$	grow to $8$	$4$	$5$
$6$ – $8$	$8$	no	$0$	$1$ each
$9$	$8$	grow to $16$	$8$	$9$
$10$ – $16$	$16$	no	$0$	$1$ each

The total is $16$ writes plus $1 + 2 + 4 + 8 = 15$ copies: $31$ units for $16$ appends, under $2$ per operation, matching the aggregate bound. The spikes at appends $2, 3, 5, 9$ (and next at $17, 33, \dots$ ) double in height each time but arrive half as often, which is the telescoping made visible.
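The table can be reproduced mechanically. The function below is a small sketch (the name `append_costs` is hypothetical) that simulates only the bookkeeping, using the same cost model as the trace: one unit per element written or copied, starting from capacity $1$.

```python
def append_costs(n: int, initial_capacity: int = 1) -> list[int]:
  """Cost of each of n appends: one unit per element written or copied."""
  size, capacity = 0, initial_capacity
  costs: list[int] = []
  for _ in range(n):
    cost = 1  # writing the new element
    if size == capacity:
      cost += size  # a resize copies every existing element
      capacity *= 2
    size += 1
    costs.append(cost)
  return costs

costs = append_costs(16)
spikes = [k + 1 for k, cost in enumerate(costs) if cost > 1]

print(costs)       # [1, 2, 3, 1, 5, 1, 1, 1, 9, 1, 1, 1, 1, 1, 1, 1]
print(spikes)      # [2, 3, 5, 9]
print(sum(costs))  # 31
```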

Per-append cost for the same $16$ appends. Cheap appends (muted) cost one write; the resizing appends at $2, 3, 5, 9$ (accent) pay a copy of $1, 2, 4, 8$ on top. Spikes double in height but halve in frequency, so the running total never crosses the $3$-credit budget line (dashed).

The aggregate proof above sums costs after the fact. The accounting method explains the same bound as a budget you could enforce up front: charge every append $3$ credits. One credit pays for writing the new element. The other two are banked on the element itself. When a resize hits at capacity $2 k$ , the $k$ elements appended since the previous resize each hold $2$ banked credits, enough to pay for copying themselves and one element from the older half of the array, which spent its own credits at an earlier resize. Every copy is prepaid, so no operation ever draws on future income, and $3 n$ credits cover any $n$ appends.²
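The budget can be checked by simulation. The sketch below pools all credits into one balance rather than attaching them to individual elements, which is a simplification of the argument; `bank_balances` is a hypothetical name, and the array is assumed to start at capacity $1$ as in the trace. Each append deposits its charge, then pays one credit for its write and one credit per element copied.

```python
def bank_balances(n: int, charge: int = 3) -> list[int]:
  """Credit balance after each of n appends under a flat per-append charge."""
  size, capacity, balance = 0, 1, 0
  balances: list[int] = []
  for _ in range(n):
    balance += charge  # the amortized price of this append
    if size == capacity:
      balance -= size  # pay for copying every existing element
      capacity *= 2
    balance -= 1  # pay for writing the new element
    size += 1
    balances.append(balance)
  return balances

print(min(bank_balances(1000)))            # 2: the balance never goes negative
print(min(bank_balances(1000, charge=2)))  # negative: two credits are not enough
```

Under these assumptions a charge of $3$ keeps the balance non-negative at every step, while a charge of $2$ first overdraws at the resize on the fifth append.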

Doubling is essential, not incidental. Suppose the array instead grew by a fixed increment $c$ each time it filled. Resizes would then occur at sizes $c, 2 c, 3 c, \dots$ , and the resize at size $i c$ copies $i c$ elements, so $n$ appends cost

$$\sum_{i=1}^{n/c} i c = c \cdot \frac{(n/c)(n/c + 1)}{2} = Θ\left(\frac{n^{2}}{c}\right),$$

which is $Θ(n)$ per append for any constant $c$. Concretely, $n = 10^{6}$ appends with $c = 1024$ perform about $4.9 \times 10^{8}$ copy operations where doubling performs under $2 \times 10^{6}$. Any geometric factor works ($1.5 \times$ trades a smaller memory overshoot for more frequent copies); arithmetic growth does not.
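The two growth policies can be compared by counting copies directly. In this sketch `total_copies` is a hypothetical helper that takes the growth rule as a function; the fixed-increment run is assumed to start at capacity $c = 1024$ and the doubling run at capacity $1$.

```python
from typing import Callable

def total_copies(n: int, initial_capacity: int, grow: Callable[[int], int]) -> int:
  """Elements copied across all resizes during n appends."""
  size, capacity, copies = 0, initial_capacity, 0
  for _ in range(n):
    if size == capacity:
      copies += size  # a resize copies the whole array
      capacity = grow(capacity)
    size += 1
  return copies

n = 10**6
fixed = total_copies(n, 1024, lambda capacity: capacity + 1024)
doubled = total_copies(n, 1, lambda capacity: 2 * capacity)

print(fixed)    # 488218624, about 4.9e8
print(doubled)  # 1048575, under 2e6
```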

The same discipline runs in reverse for a shrinking array. Popping elements should eventually release memory, but halving the block the instant the array is half full invites thrashing: alternating push/pop at the boundary would resize on every operation. The standard fix is hysteresis, halving only when the array falls to a quarter full. After any resize, in either direction, the array is exactly half full, so at least $Θ (n)$ cheap operations must pass before the next resize, and the amortized bound survives deletion too.
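The thrashing scenario is easy to reproduce by counting resizes. This is a bookkeeping-only sketch: `count_resizes` and its `shrink_divisor` parameter are hypothetical, with a divisor of $2$ modelling eager halving at half full and $4$ modelling the quarter-full rule.

```python
def count_resizes(ops: str, shrink_divisor: int) -> int:
  """Count resizes over a sequence of '+' (push) and '-' (pop) operations.

  The block doubles when full and halves once the size falls to
  capacity // shrink_divisor.
  """
  size, capacity, resizes = 0, 1, 0
  for op in ops:
    if op == "+":
      if size == capacity:
        capacity *= 2
        resizes += 1
      size += 1
    else:
      size -= 1
      if capacity > 1 and size <= capacity // shrink_divisor:
        capacity //= 2
        resizes += 1
  return resizes

# nine pushes cross the capacity-8 boundary, then pop/push alternate 100 times.
ops = "+" * 9 + "-+" * 100

print(count_resizes(ops, shrink_divisor=2))  # 204: every boundary operation resizes
print(count_resizes(ops, shrink_divisor=4))  # 4: only the initial growth
```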

Linked lists

A linked list threads elements through pointers. In a singly linked list each node stores a $key$ and a $next$ pointer to its successor; a $head$ pointer names the first node and the last node's $next$ is $nil$. A doubly linked list adds a $prev$ pointer, so the list can be traversed in both directions and a node can be removed knowing only itself.

A doubly linked list; deleting the middle node is $O(1)$ pointer splicing.

The complexities follow directly from the pointer structure:

Insert / delete given the node. $O(1)$. To delete node $x$ from a doubly linked list, set $next(prev(x)) \leftarrow next(x)$ and $prev(next(x)) \leftarrow prev(x)$: a constant number of pointer writes, no shifting. This is the linked list's signature advantage over an array.
Search by key, or index by position. $O (n)$ . There is no address arithmetic; you must walk the chain.

The boundary cases (deleting the head, deleting the tail, operating on an empty list) force nil checks that clutter the code. A standard trick removes them: a sentinel is a dummy node that is always present and never holds real data. Wrap the list into a ring around one sentinel $nil$, with $next(nil)$ the first real node and $prev(nil)$ the last; now every node has a real predecessor and successor, and delete needs no special cases.³

Algorithm: $\textsc{List-Delete}(x)$ (remove $x$ from a doubly linked list, sentinel form)

$next(prev(x)) \gets next(x)$
$prev(next(x)) \gets prev(x)$

With a sentinel there are no nil guards: even at the ends, $prev(x)$ and $next(x)$ point at real nodes (possibly the sentinel itself), so the two assignments always make sense.

The splice, pointer by pointer

Watch the delete on a concrete list. Take $9 \leftrightarrow 16 \leftrightarrow 4$ and delete the node holding $16$; call it $x$. The first assignment, $next(prev(x)) \leftarrow next(x)$, rewrites the $next$ field of the $9$-node to point at the $4$-node. The second, $prev(next(x)) \leftarrow prev(x)$, rewrites the $prev$ field of the $4$-node to point back at the $9$-node. Two writes and the list reads $9 \leftrightarrow 4$ in both directions. Nothing was shifted, nothing else was touched, and the cost is the same whether the list holds three nodes or three million: that locality is the whole case for linked storage.

Deleting the $16$-node from $9 \leftrightarrow 16 \leftrightarrow 4$. Before: the chain runs through $x$. After: two pointer writes (accent) bypass it. The unlinked node still points into the list (muted dashes), which is harmless; it is simply unreachable and can be freed.

Insertion is the same idea with four writes instead of two. To splice a new node $y$ in immediately after a node $x$ :

Algorithm: $\textsc{List-Insert-After}(x, y)$ (splice node $y$ in right after $x$)

1  $next(y) \gets next(x)$    // $y$ learns its successor first
2  $prev(y) \gets x$
3  $prev(next(x)) \gets y$    // old successor points back at $y$
4  $next(x) \gets y$    // finally $x$ lets go of the old link

The order of the writes is the classic pitfall. The first line reads $next(x)$, so the last line, which overwrites $next(x)$, must come after it: swap them and $y$'s successor becomes $y$ itself, quietly turning the tail of the list into a self-loop. Run the trace on $9 \leftrightarrow 16 \leftrightarrow 4$, inserting $y = 11$ after the $16$-node: line 1 points $next(y)$ at the $4$-node, line 2 points $prev(y)$ at the $16$-node, line 3 rewrites the $4$-node's $prev$ to $y$, and line 4 rewrites the $16$-node's $next$ to $y$. The list now reads $9 \leftrightarrow 16 \leftrightarrow 11 \leftrightarrow 4$, again in $O(1)$ regardless of length. Deleting the head or splicing at the tail is still the same code under a sentinel, which is why the sentinel is worth its one node of overhead.
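The ordering pitfall can be made concrete with a bare node type. Everything named here (`Node`, `insert_after`, `insert_after_swapped`, `build`, `keys_forward`) is a hypothetical illustration, separate from the lesson's list classes; the ring uses one sentinel whose key is `None`.

```python
from typing import Optional

class Node:
  """A doubly linked node; a fresh node is a one-element ring."""

  def __init__(self, key: Optional[int] = None) -> None:
    self.key = key
    self.prev: "Node" = self
    self.next: "Node" = self

def insert_after(x: Node, y: Node) -> None:
  """The four writes in the order of the pseudocode."""
  y.next = x.next
  y.prev = x
  x.next.prev = y
  x.next = y

def insert_after_swapped(x: Node, y: Node) -> None:
  """The same writes with lines 1 and 4 exchanged (the pitfall)."""
  x.next = y       # the link to the old successor is overwritten too early
  y.prev = x
  x.next.prev = y  # x.next is already y, so this sets y.prev to y
  y.next = x.next  # reads y itself, not the old successor

def build(keys: list[int]) -> tuple[Node, list[Node]]:
  """Build a sentinel ring holding keys in order; return (sentinel, nodes)."""
  nil = Node()
  nodes: list[Node] = []
  last = nil
  for key in keys:
    node = Node(key)
    insert_after(last, node)
    nodes.append(node)
    last = node
  return nil, nodes

def keys_forward(nil: Node) -> list[Optional[int]]:
  out, node = [], nil.next
  while node is not nil:
    out.append(node.key)
    node = node.next
  return out

nil, nodes = build([9, 16, 4])
insert_after(nodes[1], Node(11))
print(keys_forward(nil))  # [9, 16, 11, 4]

_, broken = build([9, 16, 4])
y = Node(11)
insert_after_swapped(broken[1], y)
print(y.next is y)  # True: y is its own successor, so a forward walk never ends
```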

operation	array	linked list
index / random access	$O (1)$	$O (n)$
search (unsorted)	$O (n)$	$O (n)$
insert/delete at known position	$O (n)$ shift	$O (1)$ splice
insert/delete at end	$O (1)$ amortized	$O (1)$
cache locality	excellent	poor
extra space per element	none	1–2 pointers

Neither structure dominates: choose contiguous when you index and scan, linked when you splice in the middle and never need the $i$ -th element by number.

singly_linked_list.py (Python)

from collections.abc import Iterable, Iterator
from typing import Generic, Optional, TypeVar

Value = TypeVar("Value")

class ListNode(Generic[Value]):
  """
    One singly linked node: a key and a link to its successor.\n
  """

  def __init__(self, value: Value) -> None:
    self.value: Value = value
    self.next: Optional[ListNode[Value]] = None

  def __repr__(self) -> str:
    return f"ListNode({self.value!r})"

class SinglyLinkedList(Generic[Value]):
  """
    A forward-only chain of nodes with O(1) ends and O(n) search.\n
  """

  def __init__(self, values: Iterable[Value] = ()) -> None:
    # start empty: no nodes, no length.
    self.head: Optional[ListNode[Value]] = None
    self.tail: Optional[ListNode[Value]] = None
    self._size: int = 0

    # append any seed values in order.
    for value in values:
      self.push_back(value)

  def push_front(self, value: Value) -> ListNode[Value]:
    """
      Insert `value` at the head in O(1) and return its node.\n
    """
    # splice the new node in front of the old head.
    node: ListNode[Value] = ListNode(value)
    node.next = self.head
    self.head = node

    # a previously empty list now has this node as its tail too.
    if self.tail is None:
      self.tail = node

    self._size += 1
    return node

  def push_back(self, value: Value) -> ListNode[Value]:
    """
      Insert `value` at the tail in O(1) (we cache the tail) and return it.\n
    """
    # empty list: the new node is both head and tail.
    node: ListNode[Value] = ListNode(value)
    if self.tail is None:
      self.head = self.tail = node

    # otherwise link it after the cached tail and advance the tail.
    else:
      self.tail.next = node
      self.tail = node

    self._size += 1
    return node

  def pop_front(self) -> Value:
    """
      Remove and return the head value in O(1) (raises if empty).\n
    """
    if self.head is None:
      raise IndexError("pop_front from empty list")

    # unlink the old head and advance to its successor.
    node: ListNode[Value] = self.head
    self.head = node.next

    # the list just went empty, so drop the dangling tail too.
    if self.head is None:
      self.tail = None

    self._size -= 1
    return node.value

  def find(self, value: Value) -> Optional[ListNode[Value]]:
    """
      The first node whose key equals `value`, walking the chain in O(n).\n
    """
    # walk the chain, returning the first node that matches.
    current: Optional[ListNode[Value]] = self.head
    while current is not None:
      if current.value == value:
        return current
      current = current.next

    return None

  def remove(self, value: Value) -> bool:
    """
      Delete the first node holding `value`; report whether one was found.\n
      A singly linked list must track the predecessor to rewire its `next`.\n
    """
    # walk the chain keeping the predecessor so we can rewire its `next`.
    previous: Optional[ListNode[Value]] = None
    current: Optional[ListNode[Value]] = self.head
    while current is not None:
      if current.value == value:
        # unlink the match: from the head when it has no predecessor.
        if previous is None:
          self.head = current.next
        else:
          previous.next = current.next

        # if we dropped the tail, the predecessor becomes the new tail.
        if current is self.tail:
          self.tail = previous

        self._size -= 1
        return True

      previous, current = current, current.next

    return False

  def reverse(self) -> None:
    """
      Reverse the list in place by flipping every `next` pointer in O(n).\n
    """
    # the old head becomes the new tail before we start flipping.
    previous: Optional[ListNode[Value]] = None
    current: Optional[ListNode[Value]] = self.head
    self.tail = self.head

    # flip each `next` to point at the predecessor, saving the successor first.
    while current is not None:
      following: Optional[ListNode[Value]] = current.next
      current.next = previous
      previous, current = current, following

    self.head = previous

  def __len__(self) -> int:
    return self._size

  def __iter__(self) -> Iterator[Value]:
    # yield each value as we walk from head to tail.
    current: Optional[ListNode[Value]] = self.head
    while current is not None:
      yield current.value
      current = current.next

  def __repr__(self) -> str:
    return f"SinglyLinkedList({list(self)!r})"

doubly_linked_list.py (Python)

from collections.abc import Iterable, Iterator
from typing import Generic, Optional, TypeVar

Value = TypeVar("Value")

class DoublyLinkedNode(Generic[Value]):
  """
    A node with both a successor and a predecessor link.\n
    The sentinel reuses this class with its value left as None.\n
  """

  def __init__(self, value: Optional[Value] = None) -> None:
    self.value: Optional[Value] = value
    self.prev: DoublyLinkedNode[Value] = self
    self.next: DoublyLinkedNode[Value] = self

  def __repr__(self) -> str:
    return f"DoublyLinkedNode({self.value!r})"

class DoublyLinkedList(Generic[Value]):
  """
    A two-way chain with O(1) splicing at and given any node.\n
  """

  def __init__(self, values: Iterable[Value] = ()) -> None:
    self._sentinel: DoublyLinkedNode[Value] = DoublyLinkedNode()
    self._size: int = 0
    for value in values:
      self.push_back(value)

  def _insert_between(
    self,
    value: Value,
    predecessor: DoublyLinkedNode[Value],
    successor: DoublyLinkedNode[Value],
  ) -> DoublyLinkedNode[Value]:
    """
      Splice a fresh node holding `value` between two adjacent nodes.\n
    """
    # point the new node at its neighbors.
    node: DoublyLinkedNode[Value] = DoublyLinkedNode(value)
    node.prev = predecessor
    node.next = successor

    # point the neighbors back at the new node.
    predecessor.next = node
    successor.prev = node

    self._size += 1
    return node

  def push_front(self, value: Value) -> DoublyLinkedNode[Value]:
    """
      Insert `value` just after the sentinel (the front) in O(1).\n
    """
    return self._insert_between(value, self._sentinel, self._sentinel.next)

  def push_back(self, value: Value) -> DoublyLinkedNode[Value]:
    """
      Insert `value` just before the sentinel (the back) in O(1).\n
    """
    return self._insert_between(value, self._sentinel.prev, self._sentinel)

  def delete(self, node: DoublyLinkedNode[Value]) -> Value:
    """
      Unlink `node` from the list in O(1) and return its value.\n
      With the sentinel, both neighbors are real nodes, so the two pointer\n
      writes always make sense — no head/tail special cases.\n
    """
    if node is self._sentinel:
      raise ValueError("cannot delete the sentinel node")

    # bridge the two neighbors past the node, then drop it.
    node.prev.next = node.next
    node.next.prev = node.prev
    self._size -= 1

    value = node.value
    assert value is not None  # every non-sentinel node carries a real value.
    return value

  def pop_front(self) -> Value:
    """
      Remove and return the front value in O(1) (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("pop_front from empty list")
    return self.delete(self._sentinel.next)

  def pop_back(self) -> Value:
    """
      Remove and return the back value in O(1) (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("pop_back from empty list")
    return self.delete(self._sentinel.prev)

  def front(self) -> Value:
    """
      The front value without removing it (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("front of empty list")
    value = self._sentinel.next.value
    assert value is not None  # size > 0, so the front node is real.
    return value

  def back(self) -> Value:
    """
      The back value without removing it (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("back of empty list")
    value = self._sentinel.prev.value
    assert value is not None  # size > 0, so the back node is real.
    return value

  def find(self, value: Value) -> Optional[DoublyLinkedNode[Value]]:
    """
      The first node whose key equals `value`, walking forward in O(n).\n
    """
    # walk forward from the first real node until the sentinel comes round.
    current: DoublyLinkedNode[Value] = self._sentinel.next
    while current is not self._sentinel:
      if current.value == value:
        return current
      current = current.next
    return None

  def __len__(self) -> int:
    return self._size

  def __iter__(self) -> Iterator[Value]:
    # yield values front-to-back, stopping when the ring returns to start.
    current: DoublyLinkedNode[Value] = self._sentinel.next
    while current is not self._sentinel:
      value = current.value
      assert value is not None  # the loop skips the sentinel; nodes are real.
      yield value
      current = current.next

  def __reversed__(self) -> Iterator[Value]:
    # yield values back-to-front by following prev links instead.
    current: DoublyLinkedNode[Value] = self._sentinel.prev
    while current is not self._sentinel:
      value = current.value
      assert value is not None  # the loop skips the sentinel; nodes are real.
      yield value
      current = current.prev

  def __repr__(self) -> str:
    return f"DoublyLinkedList({list(self)!r})"

Stacks: last in, first out

A stack restricts access to one discipline: LIFO, last in, first out. Only the most recently inserted element is reachable. The operations are $Push$ (add to the top), $Pop$ (remove the top), and $Peek$ (read the top without removing) — all $O (1)$ .

A stack is trivial to back with a dynamic array: keep a $top$ index, push by writing $A[top]$ and incrementing, pop by decrementing. (Amortized $O(1)$ if it must grow.) Equally, a singly linked list with push/pop at the head is a stack with worst-case $O(1)$ operations and no resize spikes.

Stacks appear wherever computation is nested: the call stack that holds function activation records, depth-first search, evaluating arithmetic expressions, and matching brackets. For brackets, push each opener, then pop and check on each closer, accepting iff the stack ends empty. That last pattern is the Valid Parentheses problem.

stack.py (Python)

from collections.abc import Iterable, Iterator
from typing import Generic, TypeVar

Value = TypeVar("Value")

# matching closers keyed by their opener, for `balanced_brackets`.
_BRACKET_PAIRS: dict[str, str] = {"(": ")", "[": "]", "{": "}"}

class Stack(Generic[Value]):
  """
    Last-in, first-out access at one end (the top).\n
  """

  def __init__(self, values: Iterable[Value] = ()) -> None:
    self._store: list[Value] = list(values)

  def push(self, value: Value) -> None:
    """
      Add `value` to the top in amortized O(1).\n
    """
    self._store.append(value)

  def pop(self) -> Value:
    """
      Remove and return the top element in O(1) (raises if empty).\n
    """
    if not self._store:
      raise IndexError("pop from empty stack")
    return self._store.pop()

  def peek(self) -> Value:
    """
      Read the top element without removing it (raises if empty).\n
    """
    if not self._store:
      raise IndexError("peek at empty stack")
    return self._store[-1]

  def is_empty(self) -> bool:
    """
      Whether the stack holds no elements.\n
    """
    return not self._store

  def __len__(self) -> int:
    return len(self._store)

  def __iter__(self) -> Iterator[Value]:
    # top-of-stack first, mirroring repeated pops.
    return reversed(self._store)

  def __repr__(self) -> str:
    return f"Stack({self._store!r})"

def balanced_brackets(text: str) -> bool:
  """
    Whether every bracket in `text` is matched and properly nested.\n
    The canonical stack application: push each opener, and on each closer pop\n
    and check that it matches; accept iff the stack ends empty.\n
  """
  stack: Stack[str] = Stack()
  closers: set[str] = set(_BRACKET_PAIRS.values())
  for character in text:
    if character in _BRACKET_PAIRS:
      stack.push(character)
    elif character in closers:
      if stack.is_empty() or _BRACKET_PAIRS[stack.pop()] != character:
        return False
  return stack.is_empty()

Queues and deques: first in, first out

A queue enforces the opposite discipline: FIFO, first in, first out, like a line at a counter. $Enqueue$ adds at the tail; $Dequeue$ removes from the head. Both are $O (1)$ .

A stack is LIFO (one end, the top); a queue is FIFO (insert at tail, remove at head)

Backing a queue with an array needs care: if we always dequeued from index $0$ we would shift the whole array each time, $O(n)$. The fix is a circular buffer. Keep a fixed array of capacity $m$ and two indices, $head$ and $tail$; enqueue writes $A[tail]$ and advances $tail \leftarrow (tail + 1) \bmod m$, dequeue reads $A[head]$ and advances $head \leftarrow (head + 1) \bmod m$. The indices chase each other around the ring, reusing freed slots, so both operations stay $O(1)$ with no shifting and no wasted scanning.⁴ (When the buffer fills, resize and re-lay-out into a larger ring at amortized $O(1)$, exactly as for the dynamic array.)

A circular buffer of capacity $8$ holding four elements: $head$ and $tail$ chase each other around the ring, and advancing past slot $7$ wraps to slot $0$.

Wraparound, index by index

The modular arithmetic deserves one full trace. Take capacity $m = 8$ and start empty with $head = tail = 0$. Enqueue $a$ through $f$: each write lands at $A[tail]$ and advances $tail$, leaving $a \dots f$ in slots $0$–$5$ with $head = 0$, $tail = 6$. Dequeue four times: the reads return $a, b, c, d$ in insertion order while $head$ advances to $4$; slots $0$–$3$ still contain the old values, but they are logically free and are never erased. Now enqueue $g$, $h$, $i$:

operation	write	$tail$ update	state after
enqueue $g$	$A[6] \leftarrow g$	$(6 + 1) \bmod 8 = 7$	$head = 4$, $tail = 7$
enqueue $h$	$A[7] \leftarrow h$	$(7 + 1) \bmod 8 = 0$	$head = 4$, $tail = 0$
enqueue $i$	$A[0] \leftarrow i$	$(0 + 1) \bmod 8 = 1$	$head = 4$, $tail = 1$

The enqueue of $h$ is the wrap: $tail$ steps off the right end of the array and the $\bmod$ folds it back to slot $0$, where the next write overwrites the stale $a$. The queue now holds $e, f, g, h, i$, physically split across slots $4$–$7$ and $0$ but logically contiguous around the ring. Dequeues would keep reading $e, f, g, \dots$ in FIFO order, with $head$ making the same wrap on the fourth dequeue, when it steps from slot $7$ back to slot $0$.
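The trace can be replayed in code. The class below is a deliberately bare sketch (the name `TraceRing` is hypothetical): it has a fixed capacity and no full or empty checks, which is safe only because this particular trace never overfills or underflows the ring. As in the prose, dequeued slots are skipped rather than erased.

```python
class TraceRing:
  """Fixed-capacity ring with no full/empty checks (illustration only)."""

  def __init__(self, m: int) -> None:
    self.slots: list = [None] * m
    self.head: int = 0
    self.tail: int = 0

  def enqueue(self, x) -> None:
    self.slots[self.tail] = x
    self.tail = (self.tail + 1) % len(self.slots)

  def dequeue(self):
    x = self.slots[self.head]  # the slot is not erased, only skipped
    self.head = (self.head + 1) % len(self.slots)
    return x

ring = TraceRing(8)
for letter in "abcdef":
  ring.enqueue(letter)
first_four = [ring.dequeue() for _ in range(4)]
for letter in "ghi":
  ring.enqueue(letter)

print(first_four, ring.head, ring.tail)  # ['a', 'b', 'c', 'd'] 4 1
print(ring.slots)  # ['i', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

rest = [ring.dequeue() for _ in range(5)]
print(rest, ring.head)  # ['e', 'f', 'g', 'h', 'i'] 1
```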

The same ring, unrolled: three snapshots of the capacity-$8$ array. Top: six enqueues fill slots $0$–$5$. Middle: four dequeues advance $head$ past the stale values (muted). Bottom: enqueuing $g, h, i$ runs $tail$ off the right end and the modulus wraps it back to slot $0$ (accent arc), overwriting stale $a$.

One boundary case needs a decision. With only the two indices, $head = tail$ describes both the empty queue and the full one, since a full ring's $tail$ has lapped all the way around to $head$. Either keep an explicit element count alongside the indices, or declare the buffer full at $m - 1$ elements so the two states stay distinguishable; both choices are $O(1)$ and both appear in production code. Forgetting the ambiguity entirely is the classic circular-buffer bug: the full buffer reports empty and silently drops a lap of data.
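A sketch of the second option, where the ring is declared full at $m - 1$ elements. The class name `BoundedRing` and the choice of `OverflowError` for a full ring are assumptions made for this illustration; the lesson's own queue class takes the other route and tracks a size.

```python
class BoundedRing:
  """Fixed ring of m slots that treats m - 1 stored elements as full."""

  def __init__(self, m: int) -> None:
    self._slots: list = [None] * m
    self._head: int = 0
    self._tail: int = 0

  def is_empty(self) -> bool:
    # head == tail now means empty and nothing else.
    return self._head == self._tail

  def is_full(self) -> bool:
    # one slot always stays unused, so tail never laps onto head.
    return (self._tail + 1) % len(self._slots) == self._head

  def enqueue(self, x) -> None:
    if self.is_full():
      raise OverflowError("ring is full")
    self._slots[self._tail] = x
    self._tail = (self._tail + 1) % len(self._slots)

  def dequeue(self):
    if self.is_empty():
      raise IndexError("ring is empty")
    x = self._slots[self._head]
    self._head = (self._head + 1) % len(self._slots)
    return x

ring = BoundedRing(4)
for value in (1, 2, 3):
  ring.enqueue(value)

print(ring.is_full())   # True: three elements fill a four-slot ring
print(ring.is_empty())  # False: the two states are now distinguishable
```

The price of this choice is one permanently idle slot; the explicit count avoids that at the cost of one more field to keep in sync.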

A deque (double-ended queue, pronounced deck) generalizes both: it supports $O (1)$ insert and delete at both ends. A deque used at one end only is a stack; used to push at one end and pop at the other, it is a queue, so the deque subsumes everything in this lesson. A doubly linked list with a head and tail sentinel implements a deque directly, and a circular buffer with both indices movable in either direction does too; the Design Circular Deque problem asks for the latter.
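The claim that a deque subsumes both disciplines can be seen with Python's standard-library `collections.deque`, used here purely as a ready-made deque rather than as the lesson's own implementation.

```python
from collections import deque

# used at one end only, a deque behaves as a stack (LIFO).
stack: deque[int] = deque()
for value in (1, 2, 3):
  stack.append(value)
popped = [stack.pop() for _ in range(3)]

# pushed at one end and popped at the other, it behaves as a queue (FIFO).
queue: deque[int] = deque()
for value in (1, 2, 3):
  queue.append(value)
served = [queue.popleft() for _ in range(3)]

print(popped)  # [3, 2, 1]
print(served)  # [1, 2, 3]
```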

circular_queue.py (Python)

from collections.abc import Iterable, Iterator
from typing import Generic, Optional, TypeVar

Value = TypeVar("Value")

class CircularQueue(Generic[Value]):
  """
    First-in, first-out access via a wrapping ring buffer.\n
  """

  def __init__(self, values: Iterable[Value] = ()) -> None:
    self._store: list[Optional[Value]] = [None]
    self._head: int = 0
    self._size: int = 0
    for value in values:
      self.enqueue(value)

  @property
  def capacity(self) -> int:
    """
      The number of allocated ring slots.\n
    """
    return len(self._store)

  def _resize(self, new_capacity: int) -> None:
    """
      Re-lay the elements out into a larger ring, head-aligned at index 0.\n
    """
    new_store: list[Optional[Value]] = [None for _ in range(new_capacity)]
    for offset in range(self._size):
      new_store[offset] = self._store[(self._head + offset) % self.capacity]
    self._store = new_store
    self._head = 0

  def enqueue(self, value: Value) -> None:
    """
      Add `value` at the tail in amortized O(1), doubling the ring if full.\n
    """
    # grow the ring before it overflows.
    if self._size == self.capacity:
      self._resize(2 * self.capacity)

    # drop the value into the slot just past the current tail.
    tail: int = (self._head + self._size) % self.capacity
    self._store[tail] = value
    self._size += 1

  def dequeue(self) -> Value:
    """
      Remove and return the head element in O(1) (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("dequeue from empty queue")

    # read the head slot, then clear it so the value can be freed.
    value = self._store[self._head]
    assert value is not None  # a live slot (size > 0) always holds a value.
    self._store[self._head] = None

    # advance the head one step around the ring.
    self._head = (self._head + 1) % self.capacity
    self._size -= 1
    return value

  def peek(self) -> Value:
    """
      Read the head element without removing it (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("peek at empty queue")
    value = self._store[self._head]
    assert value is not None  # a live slot (size > 0) always holds a value.
    return value

  def is_empty(self) -> bool:
    """
      Whether the queue holds no elements.\n
    """
    return self._size == 0

  def __len__(self) -> int:
    return self._size

  def __iter__(self) -> Iterator[Value]:
    # head-to-tail, the order elements would dequeue.
    for offset in range(self._size):
      value = self._store[(self._head + offset) % self.capacity]
      assert value is not None  # offset < size, so this slot is live.
      yield value

  def __repr__(self) -> str:
    return f"CircularQueue({list(self)!r})"

deque.py (Python)

from collections.abc import Iterable, Iterator
from typing import Generic, Optional, TypeVar

Value = TypeVar("Value")

class Deque(Generic[Value]):
  """
    O(1) insertion and removal at either end via a ring buffer.\n
  """

  def __init__(self, values: Iterable[Value] = ()) -> None:
    # start with a one-slot ring; head and size pin the live window.
    self._store: list[Optional[Value]] = [None]
    self._head: int = 0
    self._size: int = 0

    # seed from the given iterable, front-to-back.
    for value in values:
      self.push_back(value)

  @property
  def capacity(self) -> int:
    """
      The number of allocated ring slots.\n
    """
    return len(self._store)

  def _resize(self, new_capacity: int) -> None:
    """
      Re-lay the elements out into a larger ring, head-aligned at index 0.\n
    """
    # copy the live window into a fresh ring, head-aligned at index 0.
    new_store: list[Optional[Value]] = [None for _ in range(new_capacity)]
    for offset in range(self._size):
      new_store[offset] = self._store[(self._head + offset) % self.capacity]

    self._store = new_store
    self._head = 0

  def _grow_if_full(self) -> None:
    if self._size == self.capacity:
      self._resize(2 * self.capacity)

  def push_back(self, value: Value) -> None:
    """
      Insert `value` at the back (tail) in amortized O(1).\n
    """
    # grow first, then write into the slot just past the current tail.
    self._grow_if_full()
    tail: int = (self._head + self._size) % self.capacity
    self._store[tail] = value
    self._size += 1

  def push_front(self, value: Value) -> None:
    """
      Insert `value` at the front (head) in amortized O(1).\n
    """
    # grow first, then step head back one slot and write there.
    self._grow_if_full()
    self._head = (self._head - 1) % self.capacity
    self._store[self._head] = value
    self._size += 1

  def pop_front(self) -> Value:
    """
      Remove and return the front element in O(1) (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("pop_front from empty deque")

    # read the head slot (always live while size > 0).
    value = self._store[self._head]
    assert value is not None

    # clear the slot, advance head, shrink the live window.
    self._store[self._head] = None
    self._head = (self._head + 1) % self.capacity
    self._size -= 1
    return value

  def pop_back(self) -> Value:
    """
      Remove and return the back element in O(1) (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("pop_back from empty deque")

    # read the tail slot (always live while size > 0).
    tail: int = (self._head + self._size - 1) % self.capacity
    value = self._store[tail]
    assert value is not None

    # clear the slot and shrink the live window.
    self._store[tail] = None
    self._size -= 1
    return value

  def front(self) -> Value:
    """
      Read the front element without removing it (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("front of empty deque")

    # the head slot is live whenever size > 0.
    value = self._store[self._head]
    assert value is not None
    return value

  def back(self) -> Value:
    """
      Read the back element without removing it (raises if empty).\n
    """
    if self._size == 0:
      raise IndexError("back of empty deque")

    # the tail slot is live whenever size > 0.
    tail: int = (self._head + self._size - 1) % self.capacity
    value = self._store[tail]
    assert value is not None
    return value

  def is_empty(self) -> bool:
    """
      Whether the deque holds no elements.\n
    """
    return self._size == 0

  def __len__(self) -> int:
    return self._size

  def __iter__(self) -> Iterator[Value]:
    # front-to-back order.
    for offset in range(self._size):
      value = self._store[(self._head + offset) % self.capacity]
      assert value is not None  # offset < size, so this slot is live.
      yield value

  def __repr__(self) -> str:
    return f"Deque({list(self)!r})"
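The `push_front` step above relies on Python's `%` returning a non-negative result for a positive modulus. A short sketch of why that matters when porting the ring arithmetic; the variable names are illustrative, and `math.fmod` is used here only to imitate the truncating remainder of languages such as C and Java:

```python
import math

capacity = 4
head = 0

# Python's % takes the sign of the divisor, so stepping back from 0 wraps to 3.
python_wrap = (head - 1) % capacity

# a truncating remainder keeps the sign of the dividend and would yield -1,
# which is not a valid slot index.
truncated = int(math.fmod(head - 1, capacity))

# adding capacity before reducing is safe under either convention.
portable = (head - 1 + capacity) % capacity

print(python_wrap, truncated, portable)  # 3 -1 3
```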

doubling_array.py (Python)

from collections.abc import Iterator
from typing import Generic, Optional, TypeVar

Value = TypeVar("Value")

class DoublingArray(Generic[Value]):
  """
    A resizable contiguous sequence backed by a fixed-capacity block.\n
    `size` is the number of live elements; `capacity` is the allocated room.\n
  """

  def __init__(self) -> None:
    self._store: list[Optional[Value]] = []
    self._size: int = 0

  @property
  def size(self) -> int:
    """
      The number of live elements.\n
    """
    return self._size

  @property
  def capacity(self) -> int:
    """
      The number of allocated slots, live or spare.\n
    """
    return len(self._store)

  def _resize(self, new_capacity: int) -> None:
    """
      Allocate a fresh block of `new_capacity` and copy the elements over.\n
      This is the O(n) step a doubling append occasionally pays.\n
    """
    # copy the live prefix into a fresh, larger block.
    new_store: list[Optional[Value]] = [None for _ in range(new_capacity)]
    for index in range(self._size):
      new_store[index] = self._store[index]

    self._store = new_store

  def append(self, value: Value) -> None:
    """
      Push `value` onto the end, doubling the backing block on overflow.\n
    """
    # grow the block before it overflows.
    if self._size == self.capacity:
      self._resize(max(1, 2 * self.capacity))

    # park the value in the next free slot.
    self._store[self._size] = value
    self._size += 1

  def pop(self) -> Value:
    """
      Remove and return the last element (raises IndexError if empty).\n
    """
    if self._size == 0:
      raise IndexError("pop from empty DoublingArray")

    # uncover the last live slot and read it out.
    self._size -= 1
    value = self._store[self._size]
    assert value is not None  # slots below size are live and never None.

    # clear the vacated slot so the reference can be collected.
    self._store[self._size] = None
    return value

  def get(self, index: int) -> Value:
    """
      The element at `index` in O(1) by address arithmetic.\n
    """
    self._check_index(index)
    value = self._store[index]
    assert value is not None  # _check_index guarantees a live (non-None) slot.
    return value

  def set(self, index: int, value: Value) -> None:
    """
      Overwrite the element at `index` in O(1).\n
    """
    self._check_index(index)
    self._store[index] = value

  def _check_index(self, index: int) -> None:
    if not 0 <= index < self._size:
      raise IndexError(f"index {index} out of range for size {self._size}")

  def __getitem__(self, index: int) -> Value:
    return self.get(index)

  def __setitem__(self, index: int, value: Value) -> None:
    self.set(index, value)

  def __len__(self) -> int:
    return self._size

  def __iter__(self) -> Iterator[Value]:
    for index in range(self._size):
      value = self._store[index]
      assert value is not None  # index < size, so this slot is live.
      yield value

  def __repr__(self) -> str:
    live = [self._store[index] for index in range(self._size)]
    return f"DoublingArray({live!r}, capacity={self.capacity})"
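The class above only grows. A hedged sketch of the complementary shrink rule, halving the block once it is at most one-quarter full, follows; `ShrinkingArray` is a hypothetical companion written for illustration (integer elements only), not part of the lesson's code:

```python
from typing import Optional

class ShrinkingArray:
  """Doubles when full; halves when at most one-quarter full (illustrative)."""

  def __init__(self) -> None:
    self._store: list[Optional[int]] = [None]
    self._size: int = 0

  @property
  def capacity(self) -> int:
    return len(self._store)

  def _resize(self, new_capacity: int) -> None:
    # copy the live prefix into a fresh block of the requested capacity.
    new_store: list[Optional[int]] = [None] * new_capacity
    new_store[:self._size] = self._store[:self._size]
    self._store = new_store

  def append(self, value: int) -> None:
    if self._size == self.capacity:
      self._resize(2 * self.capacity)
    self._store[self._size] = value
    self._size += 1

  def pop(self) -> int:
    if self._size == 0:
      raise IndexError("pop from empty ShrinkingArray")
    self._size -= 1
    value = self._store[self._size]
    assert value is not None
    self._store[self._size] = None

    # shrink only at one-quarter full, so the new block is left half full.
    if self.capacity > 1 and self._size <= self.capacity // 4:
      self._resize(self.capacity // 2)
    return value

array = ShrinkingArray()
for item in range(16):
  array.append(item)      # capacity grows 1, 2, 4, 8, 16.
for _ in range(12):
  array.pop()             # 4 elements remain, so the block halves to 8.
print(array.capacity)     # 8
```

Because a halved block is left half full, the next resize in either direction is a constant fraction of the capacity away, which is what prevents thrashing at the boundary.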

Elementary structures in practice

The textbook trade-off, contiguous versus linked, is only the starting point; real systems adjust it in several ways.

Growth factors. The amortized argument works for any geometric factor greater than $1$ , and standard libraries pick different ones for different reasons. GCC's libstdc++ and LLVM's libc++ double their std::vector capacity, while Microsoft's implementation grows by $1.5 \times$ , as does Facebook's folly::fbvector, whose design notes argue that doubling can never reuse the freed blocks. With a factor below the golden ratio $φ \approx 1.618$ , the sum of all previously freed block sizes eventually exceeds the next request, so an allocator can place the new array in the coalesced space the old ones left behind. With doubling, the blocks freed before a request of size $2^{k + 1}$ total only $1 + 2 + \dots + 2^{k - 1} = 2^{k} - 1$ , so the new block never fits. The choice is a genuine trade of memory footprint against copy frequency, and both live in production.
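To make the golden-ratio claim concrete, here is a rough sketch under an idealized model that is an assumption for illustration, not the behaviour of any real allocator: block $k$ has size $g^{k}$, all freed blocks are adjacent and coalesce perfectly, and the block being copied from is still live when the next one is requested. The function name `first_reusable_step` is hypothetical.

```python
from typing import Optional

def first_reusable_step(factor: float, steps: int = 64) -> Optional[int]:
  """
    Index of the first block that fits in the space freed by earlier blocks,
    or None if that never happens within `steps` growth steps.
  """
  # idealized block sizes: 1, g, g^2, ...
  sizes: list[float] = [1.0]
  for _ in range(steps):
    sizes.append(sizes[-1] * factor)

  freed = 0.0
  for k in range(1, steps):
    # once block k exists, block k - 1 has been copied out and freed.
    freed += sizes[k - 1]
    # block k is still live while block k + 1 is being requested.
    if freed >= sizes[k + 1]:
      return k + 1
  return None

print(first_reusable_step(1.5))  # 5
print(first_reusable_step(2.0))  # None
```

In this model a factor of $1.5$ first reuses freed space for block $5$ (freed total $8.125$ against a request of about $7.59$), while doubling never does.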

Bulk nodes for cache locality. A plain linked list's one-node-per-element layout has poor cache behavior, so practical linked structures store many elements per node. An unrolled linked list keeps a small array (say $16$ to $64$ elements) in each node, recovering most of an array's locality while keeping $O (1)$ splicing at node boundaries; production rope text structures apply the same chunking by holding a run of characters in each leaf (a gap buffer, by contrast, is a single contiguous array with a movable hole). A B-tree and its cache-oblivious cousins push the idea to a full tree, which is why they dominate on disk.
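A minimal sketch of an unrolled linked list, assuming integer elements and a deliberately tiny `NODE_CAPACITY` of 4 so that splits are easy to see. The names `Chunk` and `UnrolledList` are hypothetical, and only insertion and iteration are shown:

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass, field
from typing import Optional

NODE_CAPACITY = 4  # illustrative only; the text suggests 16 to 64 in practice.

@dataclass
class Chunk:
  items: list[int] = field(default_factory=list)
  next: Optional[Chunk] = None

class UnrolledList:
  def __init__(self) -> None:
    self.head = Chunk()
    self.size = 0

  def insert(self, index: int, value: int) -> None:
    if not 0 <= index <= self.size:
      raise IndexError(f"index {index} out of range for size {self.size}")

    # skip whole chunks at a time instead of single elements.
    node = self.head
    while index > len(node.items) and node.next is not None:
      index -= len(node.items)
      node = node.next

    # a full chunk splits in half; only pointers and one small array move.
    if len(node.items) == NODE_CAPACITY:
      half = NODE_CAPACITY // 2
      tail = Chunk(node.items[half:], node.next)
      node.items = node.items[:half]
      node.next = tail
      if index > half:
        index -= half
        node = tail

    node.items.insert(index, value)
    self.size += 1

  def chunk_sizes(self) -> list[int]:
    sizes: list[int] = []
    node: Optional[Chunk] = self.head
    while node is not None:
      sizes.append(len(node.items))
      node = node.next
    return sizes

  def __iter__(self) -> Iterator[int]:
    node: Optional[Chunk] = self.head
    while node is not None:
      yield from node.items
      node = node.next

numbers = UnrolledList()
for item in range(10):
  numbers.insert(numbers.size, item)
print(numbers.chunk_sizes())  # [2, 2, 2, 4]
```

Reaching a position costs one pointer hop per chunk rather than per element, and each chunk is scanned as a small contiguous array.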

Standard-library deques. Neither Python's collections.deque nor Java's ArrayDeque is a one-element-per-node linked list. CPython implements collections.deque as a doubly linked list of fixed-size blocks (in effect an unrolled list), and ArrayDeque is a single resizable circular array; both give $O (1)$ push/pop at both ends (amortized for ArrayDeque) and cache-friendly iteration. In immutable/functional languages, by contrast, the everyday list is a persistent singly linked list: prepend is $O (1)$ and every new version shares its tail with the list it was built from, a different sweet spot from the mutable dynamic array that dominates imperative code.⁵
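The persistent list mentioned above can be sketched in Python with immutable cons cells. This is an illustration of structural sharing under assumed names (`Cons`, `prepend`, `to_list`), restricted to integer elements, and is not how any particular functional language implements its lists:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Cons:
  head: int
  tail: Optional[Cons] = None

def prepend(value: int, rest: Optional[Cons]) -> Cons:
  # O(1): allocate one cell that points at the existing, unchanged list.
  return Cons(value, rest)

def to_list(cell: Optional[Cons]) -> list[int]:
  out: list[int] = []
  while cell is not None:
    out.append(cell.head)
    cell = cell.tail
  return out

base = prepend(2, prepend(3, None))  # the list 2, 3
left = prepend(1, base)              # the list 1, 2, 3
right = prepend(9, base)             # the list 9, 2, 3

# both new versions share the same tail object; nothing was copied.
print(left.tail is right.tail)  # True
```

Because cells are frozen, no holder of `base` can observe a change when `left` and `right` are built from it.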

Takeaways

Every container is either contiguous (an array — $O (1)$ random access by address arithmetic, cache-friendly, $O (n)$ to splice) or linked (nodes joined by pointers — $O (1)$ splice given the node, $O (n)$ to index, a pointer of overhead per element). The choice is a trade, not a winner.
A dynamic array appends in amortized $O (1)$ by doubling capacity on overflow: the aggregate copy cost over $n$ appends is $1 + 2 + \dots + 2^{k} < 2 n$ , where $2^{k}$ is the largest power of two below $n$ , or by the accounting method, $3$ prepaid credits per append cover every copy. Fixed-increment growth by $c$ slots costs $\Theta (n^{2} / c)$ instead; shrinking halves only at one-quarter full to avoid thrashing. The worst-case single append is still $O (n)$ on the resize step.
A doubly linked list inserts and deletes in $O (1)$ given the node: delete is two pointer writes, insert-after is four, with write order mattering (read $\mathit{next}(x)$ before overwriting it). A sentinel node erases the boundary cases.
A stack is LIFO ( $\mathrm{Push}$ , $\mathrm{Pop}$ , $\mathrm{Peek}$ , all $O (1)$ ) and underlies the call stack, DFS, expression evaluation, and bracket matching.
A queue is FIFO, implemented as a circular buffer whose $\mathit{head}$ and $\mathit{tail}$ indices advance $\bmod\ m$ , giving $O (1)$ enqueue and dequeue. Since $\mathit{head} = \mathit{tail}$ holds both when the buffer is empty and when it is full, keep a count (or cap at $m - 1$ elements) to tell them apart. The deque generalizes both stack and queue to $O (1)$ operations at either end.
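The aggregate bound in the dynamic-array takeaway can be checked by counting copies directly. A small sketch follows; the function names are hypothetical, and the code only counts element copies rather than allocating anything:

```python
def copies_with_doubling(n: int) -> int:
  """Element copies paid by n appends when capacity doubles on overflow."""
  capacity = size = copies = 0
  for _ in range(n):
    if size == capacity:
      copies += size                  # every live element moves to the new block.
      capacity = max(1, 2 * capacity)
    size += 1
  return copies

def copies_with_increment(n: int, c: int) -> int:
  """Element copies paid by n appends when capacity grows by c slots."""
  capacity = size = copies = 0
  for _ in range(n):
    if size == capacity:
      copies += size
      capacity += c
    size += 1
  return copies

print(copies_with_doubling(1000))       # 1023, below 2n = 2000
print(copies_with_increment(1000, 10))  # 49500, close to n^2 / (2c) = 50000
```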

Footnotes

¹ Skiena, §3.1–3.2, Contiguous vs. Linked Structures: the array-vs-pointer trade-off and its consequences for access, splicing, and locality.
² CLRS, Ch. 10, Elementary Data Structures (with the amortized analysis of Ch. 16): geometric doubling gives $O (1)$ amortized table append.
³ CLRS, Ch. 10, Elementary Data Structures (§10.2): doubly linked lists and the sentinel that removes boundary cases from insert/delete.
⁴ CLRS, Ch. 10, Elementary Data Structures (§10.1): stacks and the circular-array queue with head/tail indices taken $\bmod\ m$ .
⁵ On sub-$φ$ growth factors reusing freed memory, see the folly fbvector design notes; on unrolled lists, Shao & Reps, Unrolling lists (1994); persistent lists are standard in Okasaki, Purely Functional Data Structures (1998).
