Elementary Data Structures
Every container is built one of two ways: contiguous in an array, or linked through pointers. We trade cache-friendly random access against splicing, derive the **amortized append of a doubling dynamic array, and assemble the two ordered access disciplines — the LIFO stack and the FIFO queue (with its generalization, the deque**) — on top of both.
╌╌╌╌
Before we can balance trees or hash keys, we need the raw material they are made of. Every data structure in this course (every tree, heap, hash table, and graph) is ultimately stored one of two ways, and the choice echoes through everything built on top of it. You can lay the elements out contiguously, packed shoulder-to-shoulder in one block of memory, or you can scatter them and stitch them together with pointers. This first lesson works out the two strategies and the four ordered containers (array, linked list, stack, queue) that the rest of the module assumes you already own.
Two ways to store a sequence
A contiguous structure stores its elements in a single block of memory, one after another. An array of elements each of size occupies one run of bytes, so element lives at a known offset from the start. A linked structure stores each element in its own separately-allocated node and uses a pointer in each node to find the next; the nodes may sit anywhere in memory.1
The contrast is sharp, and it drives every later choice:
- Random access. Contiguous wins outright. Element of an array is at address , computed with one multiply-add, so indexing is . In a linked list there is no address arithmetic; to reach the -th node you must follow pointers, which is .
- Splicing. Linked wins outright. To insert or delete an element given a pointer to its node, a linked list rewires a constant number of pointers in ; an array must shift every later element to keep the block contiguous, which is .
- Cache locality. Contiguous wins. A modern CPU reads memory in cache lines and prefetches sequentially, so a linear scan of an array is far faster than chasing pointers across scattered nodes, even though both are comparisons. Constant factors, not asymptotics, yet they are large.
- Space overhead. Linked pays per-element: every node carries one or two pointers besides its key. An array pays nothing per element but may reserve unused capacity (below).
Arrays and dynamic arrays
A fixed-size array is the contiguous structure in its purest form: allocate slots up front, index any of them in . Its limitation is exactly that is fixed at allocation. Real programs rarely know the final size in advance, so we want a structure that grows.
A dynamic array (C++ vector, Python list, Java ArrayList) keeps a
contiguous backing block of some capacity the current size, and
appends into the spare room. When the block fills, it allocates a larger block,
copies the elements over, and frees the old one. The design decision that matters is
how much larger, and the answer is to double the capacity.
- 1if then
- 2
- 3allocate new block of capacity
- 4copy intothe resize
- 5free old block; ;
- 6
- 7
A single append is usually , writing into spare room and bumping the size, but the appends that trigger a resize cost because they copy the whole array. The worst case of one operation is therefore . Yet the average cost over a sequence of appends is , and this is worth proving.
The geometric growth is what makes this work: each resize is twice as expensive as the last but happens half as often, so the costs telescope into a constant per operation. Growing by a fixed increment instead of doubling would make the same appends cost . The price of amortized is twofold: an occasional latency spike on the doubling step, and up to wasted capacity right after a resize.2
Linked lists
A linked list threads elements through pointers. In a singly linked list each node stores a and a pointer to its successor; a pointer names the first node and the last node's is . A doubly linked list adds a pointer, so the list can be traversed in both directions and a node can be removed knowing only itself.
The complexities follow directly from the pointer structure:
- Insert / delete given the node. . To delete node from a doubly linked list, set and , a constant number of pointer writes, no shifting. This is the linked list's signature advantage over an array.
- Search by key, or index by position. . There is no address arithmetic; you must walk the chain.
The boundary cases (deleting the head, deleting the tail, operating on an empty
list) force nil checks that clutter the code. A standard trick removes them: a
sentinel is a dummy node that is always present and never holds real data. Wrap
the list into a ring around one sentinel , with the first real
node and the last; now every node has a real predecessor and
successor, and delete needs no special cases.3
- 1
- 2
With a sentinel there are no nil guards: even at the ends, and
point at real nodes (possibly the sentinel itself), so the two
assignments always make sense.
| operation | array | linked list |
|---|---|---|
| index / random access | ||
| search (unsorted) | ||
| insert/delete at known position | shift | splice |
| insert/delete at end | amortized | |
| cache locality | excellent | poor |
| extra space per element | none | 1–2 pointers |
Neither structure dominates: choose contiguous when you index and scan, linked when you splice in the middle and never need the -th element by number.
Stacks: last in, first out
A stack restricts access to one discipline: LIFO, last in, first out. Only the most recently inserted element is reachable. The operations are (add to the top), (remove the top), and (read the top without removing) — all .
A stack is trivial to back with a dynamic array: keep a index, push by writing and incrementing, pop by decrementing. (Amortized if it must grow.) Equally, a singly linked list with push/pop at the head is a stack with worst-case operations and no resize spikes.
Stacks appear wherever computation is nested: the call stack that holds function activation records, depth-first search, evaluating arithmetic expressions, and matching brackets. For brackets, push each opener, then pop and check on each closer, accepting iff the stack ends empty. That last pattern is the Valid Parentheses problem, and it is a stack in one line of intuition.
Queues and deques: first in, first out
A queue enforces the opposite discipline: FIFO, first in, first out, like a line at a counter. adds at the tail; removes from the head. Both are .
Backing a queue with an array needs care: if we always dequeued from index we would shift the whole array each time, . The fix is a circular buffer. Keep a fixed array of capacity and two indices, and ; enqueue writes and advances , dequeue reads and advances . The indices chase each other around the ring, reusing freed slots, so both operations stay with no shifting and no wasted scanning.4 (When the buffer fills, resize and re-lay-out into a larger ring at amortized , exactly as for the dynamic array.)
A deque (double-ended queue, pronounced deck
) generalizes both: it supports
insert and delete at both ends. A deque used at one end only is a stack;
used to push at one end and pop at the other, it is a queue, so the deque
subsumes everything in this lesson. A doubly linked list with a head and tail
sentinel implements a deque directly, and a circular buffer with both indices
movable in either direction does too; the Design Circular Deque problem asks for
exactly the latter.
Takeaways
- Every container is either contiguous (an array — random access by address arithmetic, cache-friendly, to splice) or linked (nodes joined by pointers — splice given the node, to index, a pointer of overhead per element). The choice is a trade, not a winner.
- A dynamic array appends in amortized by doubling capacity on overflow: the aggregate copy cost over appends is . The worst-case single append is still on the resize step.
- A doubly linked list inserts and deletes in given the node; a sentinel node erases the boundary cases so delete is two pointer writes.
- A stack is LIFO (, all ) and underlies the call stack, DFS, expression evaluation, and bracket matching.
- A queue is FIFO, implemented as a circular buffer with and indices advanced for ends; the deque generalizes both stack and queue to operations at either end.
Footnotes
- Skiena, §3.1–3.2, Contiguous vs. Linked Structures: the array-vs-pointer trade-off and its consequences for access, splicing, and locality. ↩
- CLRS, Ch. 10, Elementary Data Structures (with the amortized analysis of Ch. 16): geometric doubling gives amortized table append. ↩
- CLRS, Ch. 10, Elementary Data Structures (§10.2): doubly linked lists and the sentinel that removes boundary cases from insert/delete. ↩
- CLRS, Ch. 10, Elementary Data Structures (§10.1): stacks and the circular-array queue with head/tail indices taken . ↩
╌╌ END ╌╌