Asymptotic Analysis
We measure an algorithm's running time as a function of its input size, then strip away machine-specific constants and lower-order terms to compare algorithms cleanly. This lesson defines the RAM model and the , , , , and notations, and shows how to read the cost of loops off the page.
╌╌╌╌
In the previous lesson we saw the same algorithm, insertion sort, cost quadratically many comparisons on one input and linearly many on another, all on the same machine. To compare algorithms as algorithms, independent of the hardware and the particular input, we need two things: a model of computation abstract enough to ignore the machine, and a notation coarse enough to ignore constants. This lesson supplies both.
The RAM model of computation
We analyze algorithms against an idealized machine, the random-access machine (RAM). It has the properties all three books assume, usually tacitly:
- Instructions execute one at a time, no concurrency.
- The basic operations, namely arithmetic (, , , ), comparisons, data movement (load, store, copy), and control flow, each take a constant amount of time.
- Memory is an unbounded array of cells, and accessing any cell by its index costs the same constant (this is what random access means).
- Each cell holds an integer or float of
reasonable
size, roughly bits for an input of size , so a single value fits in a machine word.
The RAM is a deliberate fiction. Real multiplication is not truly constant-time for arbitrarily large numbers; real memory has caches that make some accesses far cheaper than others. But the model is predictive: an algorithm that is fast on the RAM is, overwhelmingly, fast in practice. Skiena stresses this engineering payoff;1 CLRS is careful to flag the places, such as bignum arithmetic, where the constant-word assumption breaks down.2
From problem to
It helps to be precise about what we are even measuring. A
computational problem is just a function
from a set of possible inputs to a set of possible outputs; each element
is an instance of . Sorting integer arrays, for
example, has . A size function
records how big
each instance
is. For an array we take . An algorithm solves if for every
instance .
Now charge = the number of elementary RAM operations performs on input . Inputs of the same size can cost different amounts, so we take the worst one of each size:
When the algorithm is understood, this is exactly the function we denote , the running time as a function of the input size . The size is usually the number of elements, but sometimes the number of bits, or two parameters (e.g. and for a graph); choosing the right size measure is the first decision in any analysis. What we ultimately want is a good, convenient-to-understand upper bound on , which is precisely what the notation below provides.
Worst, average, and best case
For a fixed input size , different inputs of that size may cost different amounts. Insertion sort costs on a sorted array and on a reversed one. So is not one number; it is a range. We summarize it three ways:
- Worst case : the maximum cost over all inputs of size .
- Best case: the minimum cost over all inputs of size .
- Average case: the expected cost over a probability distribution on inputs of size (usually the uniform distribution).
We almost always report the worst case. It is a guarantee: the algorithm never does worse, no matter how adversarial the input. The best case is nearly useless as a promise, since any algorithm looks good on its luckiest input. The average case is the most honest predictor of typical performance but requires us to commit to a distribution, and the analysis is usually harder (it often needs the probabilistic tools of a later module). CLRS develops all three; Skiena argues that for design purposes the worst case is what you should plan for.
Why we drop constants and lower-order terms
Suppose careful counting gives the running time of some algorithm as
Two facts make most of this expression noise:
- The leading term dominates. As grows, swamps . At the quadratic term is and the rest is , under . The growth rate is governed entirely by .
- The constants are machine artifacts. The depends on how many RAM operations our particular pseudocode spends per iteration; recompile on a different machine and it changes. It says nothing about the algorithm's intrinsic scaling.
So we throw both away and say the running time is order . This is not
sloppiness; it is the right level of abstraction. An algorithm with a tiny
constant still loses eventually to an algorithm with a large one,
and eventually
is exactly what asymptotic analysis captures. The notation below
makes order
precise.
The asymptotic notations
Let and be functions from the positive integers to the nonnegative reals. The notations describe how behaves relative to for all sufficiently large .
Big-O: asymptotic upper bound
State it as a clean existential:
The constant lets us ignore multiplicative factors; the threshold lets us ignore small inputs where lower-order terms might still dominate. (CLRS phrases as a set of functions and adds to keep things nonnegative; the two readings agree.)
The picture to keep in mind is the for large
sketch: the scaled curve
rises above once passes the threshold , and stays above
forever after. What happens to the left of is irrelevant.
The wanted inequality
method. Proofs of -bounds follow a fixed recipe:
write down the inequality you want to hold, then reverse-engineer constants
and that make it true. To prove , we want
. For each lower term is at most , so
; thus ,
work. The same move handles any polynomial; see the theorem below.
Big-Omega: asymptotic lower bound
Mirror the quantifiers, flip the inequality:
It is the mirror image of : if and only if .
Big-Theta: asymptotic tight bound
pins between two constant multiples of : it grows exactly as fast as , up to constants. The fundamental link is
When we say insertion sort is in the worst case,
we mean its
worst-case cost is sandwiched between and , a precise,
two-sided claim. When we only have an upper bound we say ; this is why people
loosely write even where holds. But the distinction matters, as the
next two results make concrete.
The polynomial theorem
The single most useful fact for everyday analysis collapses every polynomial to its leading power:
Upper bound (). Replace every coefficient by its absolute value and every lower power by (valid for ):
So define and pick .
Lower bound (). Pull out the leading term and bound the rest below; for ,
The wanted inequality
method now finds an past which the negative tail is,
say, at most half of , leaving for all .
Together the two directions give : drop the constants and
lower-order terms, keep the leading power.
is an upper bound, not a promise of tightness
Consider a worked cautionary case. Exchange sort compares with for every pair , so its running time satisfies
by the polynomial theorem. But also implies the true but useless statement : a correct upper bound need not be tight. So how loose can we go? Not below . The number of pairs is itself a lower bound on the work:
This forces ; indeed for any . No constant can keep under once is large enough, because whenever . The moral: alone tells you the cost is no worse than something; only matching it with (i.e. proving ) certifies you have found the true growth rate.
Picture the exponents on a line. Exchange sort's cost sits at . Every with is a valid upper bound, but only is tight; the side forbids any upper bound with , walling off the left. Where the two bounds meet is .
Little-o and little-omega: strict bounds
and allow and to grow at the same rate. The lowercase versions forbid that; they assert a strict gap.
So means becomes negligible compared to : e.g. and . Symmetrically, is defined by the reciprocal limit, so dominates . A useful analogy from CLRS: are to functions as are to numbers.
Comparing functions with limits
The limit of the ratio is how you rank growth rates:
To compare, say, and for any constants , one shows the ratio , so every polynomial beats every polylogarithm.3 Likewise for every constant : exponentials dominate all polynomials. A useful fact for sums: (by Stirling's approximation),4 which is why comparison sorts that do work are optimal in a sense we will prove later.
Why we write with no base. Inside asymptotic notation the base is irrelevant, because changing base only multiplies by a constant:
So , , and all live in the same -class, and means the same thing whichever base you had in mind. (The constant is exactly the multiplicative factor and are designed to absorb.)
The growth hierarchy
The functions algorithm designers meet most often, in strictly increasing order of growth, are:
| Class | Name | A typical algorithm |
|---|---|---|
| constant | array index, hash lookup (expected) | |
| logarithmic | binary search | |
| linear | scan / find max | |
| linearithmic | merge sort, heapsort | |
| quadratic | insertion sort (worst case) | |
| cubic | naive matrix multiply | |
| exponential | subset enumeration | |
| factorial | brute-force permutations (TSP) |
Each row's growth dwarfs every row above it for large . The practical
dividing line Skiena draws is between polynomial time ( for a constant
), generally considered tractable,
and exponential time, which becomes
hopeless very fast.5 A factorial-time algorithm that handles in
a second needs years at .
A sketch of the curves makes the separation clear. Even with a generous constant on the slower-growing function, the faster one wins past some crossover :
Reading the cost off loops
Most analysis reduces to counting how many times each line runs. A few rules cover the common cases.
Sequential blocks add; we keep the max. If block costs and is followed by block costing , the total is : the larger term absorbs the smaller.
A simple loop multiplies the body by the iteration count. This nested fragment runs the constant-time body times:
- 1for to do
- 2for to do
- 3body, times
so its cost is .
When the inner bound depends on the outer index, sum a series. If the inner loop runs , the body executes times — still quadratic, because triangular work is half of square work, and the constant vanishes. This is exactly the shape of insertion sort's worst case.
When the loop variable is scaled, take a logarithm. A loop that does until runs about times, since doubles each pass. This is the source of every logarithm in algorithm analysis: repeatedly halving (or doubling) gives steps. Binary search and balanced-tree depth are the canonical examples.
The three loop shapes read off as area (or, for doubling, as the number of rungs). A square grid of iterations is work; bounding the inner loop by the outer index fills only the triangle below the diagonal, half as much but still ; and a doubling index touches just the powers of two.
Two more loop shapes round out the catalog, and both read off as areas just like the three above. An outer loop of passes wrapped around a doubling inner loop does work per pass — an grid, . And a loop whose live problem halves every pass while doing linear work on what remains sums a geometric series , so the whole stack of shrinking bars is no taller than two of the first — .
The recurrences that arise when a loop is replaced by recursion, a function calling smaller copies of itself, need their own machinery, which is the subject of the next lesson.
Takeaways
- The RAM model charges constant time per primitive operation and lets us measure running time as a function of input size, machine-independently.
- Report the worst case by default: it is a guarantee. Best case promises nothing; average case is honest but needs a distribution.
- Drop constants and lower-order terms. They are machine artifacts and noise; the leading term's growth rate is what scales.
- is an upper bound, a lower bound, a tight (two-sided) bound; and are their strict versions. and .
- The limit ranks two functions: , a constant, or give , , or .
- Memorize the hierarchy ; the polynomial/exponential boundary is the line between tractable and hopeless.
Footnotes
- Skiena, §2 — Algorithm Analysis: the RAM model as a machine-independent abstraction whose engineering payoff is predicting real-world speed. ↩
- CLRS, Ch. 3 — Characterizing Running Times: the RAM model's constant-word assumption and where it breaks down (e.g. bignum arithmetic). ↩
- Erickson, Algorithms, Appendix — Solving Recurrences (analysis throughout): ranking growth rates by the limit of the ratio, so every polynomial dominates every polylogarithm. ↩
- CLRS, Ch. 3 — Characterizing Running Times: via Stirling's approximation. ↩
- Skiena, §2 — Algorithm Analysis: the polynomial-vs-exponential dividing line between tractable and hopeless running times. ↩
╌╌ END ╌╌