What Is an Algorithm?
An algorithm is a finite, mechanical recipe that transforms inputs into outputs. We pin down what counts as an algorithm, how we write one down, and the three things we always ask of it: is it correct, is it fast, and can we prove it.
╌╌╌╌
You have already met dozens of algorithms without being told that's what they were. Merge sort, quicksort, binary search, linear search, breadth-first search; even the rote procedures for multiplying () or adding () two integers by hand; even tidying a room by a fixed set of rules. What unites them is captured by a simple working definition:
def fib(n: int) -> int:
if n <= 1:
return n
return fib(n-1) + fib(n-2)
def fib2(n: int) -> int:
M = [0, 1]
for i in range(1, n):
M.append(M[-1] + M[-2])
return M[-1]
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)
fib2 :: Int -> Int
fib2 n
| n <= 1 = n
| otherwise = fib2 (n - 1) + fib2 (n - 2)
That definition is deliberately blunt. Erickson sharpens it the same way: an
algorithm is a procedure a rock could follow, where every step is so
mechanical that no intelligence, intuition, or luck is required to carry it
out.1 CLRS frames it operationally: a well-defined computational procedure
that takes a value (or set of values) as input and produces a value
as output.2 Skiena adds the engineer's caveat: it must work correctly on
every instance, not merely on the examples we happen to test.3
Three demands run through all of these, and through this entire course:
- Correctness. The algorithm produces the right output on every valid input. One counterexample is enough to sink it.
- Efficiency. It uses few resources (time, space) as the input grows.
- Provability. We can argue, rather than assert, that the first two hold.
Communicating an algorithm
Having an algorithm in your head is not enough; you must convey it so that someone else can run it, trust it, and predict its cost. In this course every algorithm comes with four deliverables, and you should reach for all four by reflex:
- High-level idea. One or two sentences of plain English.
- Pseudocode. The steps, precise enough to analyze.
- Proof of correctness. An argument that it always returns the right answer.
- Complexity analysis. How its running time (and sometimes space) grows with the input.
These four are not independent checkboxes; they form a pipeline. The problem spec fixes what counts as a correct answer; the idea and pseudocode are two resolutions of the same method; the proof certifies the pseudocode against the spec; and the analysis measures that same pseudocode. Each deliverable feeds the next.
We will build a complete example of all four below. First, two preliminaries: what exactly the algorithm is for, and how we write it down.
Specifying the problem
Before writing an algorithm we must agree on the problem it solves: the set of legal inputs and, for each, the required output. A problem is a relation between inputs and outputs; an instance is one particular input. For sorting:
Notice what the specification hides: it says nothing about how to reorder the numbers, only what the result must satisfy. That separation of what from how is what algorithm design turns on.
Our running example will be a smaller, more concrete problem:
It is simple enough that you can see the whole algorithm at once, yet rich enough to demand all four deliverables, including a proof that is more subtle than it first looks.
Deliverable 1 — the high-level idea
Before any pseudocode, say what you intend to do in plain words. For :
That sentence is the whole algorithm. Everything that follows makes it precise and proves it works.
Deliverable 2 — the pseudocode
We describe algorithms in pseudocode: precise enough to analyze, free of the syntactic noise of any real language. The high-level idea translates directly:
- 1largest value seen so far
- 2for to do
- 3if then
- 4
- 5return
Two small but real design choices deserve attention. We seed with
rather than with or . This is why the specification insisted
, so that exists and the maximum
is well defined. The loop
starts at , since has already been accounted for by the seed.
As a second specimen of pseudocode, here is the classic insertion sort, which sorts in place by growing a sorted prefix one element at a time. We will return to it when we study sorting; for now it shows what nested loops and an in-place rearrangement look like on the page.
- 1for to do
- 2
- 3insert into sorted prefix
- 4while and do
- 5
- 6
- 7
- 8return
The outer loop walks a marker from left to right; everything before is already sorted. Each pass lifts out as , slides the larger sorted-prefix elements one slot right, and drops into the gap that opens up — exactly how you would tidy a hand of playing cards. The trace below shows the sorted prefix (shaded) absorbing one new element per row.
A picture of the idea
Here is mid-sweep on . The cursor has just reached ; everything to its left has been scanned, and holds the largest value among . Since , the if fires and is updated to .
The shaded cell is the element under the cursor; the region to its left (positions through ) is the part already summarized by .
Deliverable 3 — proof of correctness
How do we know works? Erickson's a rock could run it
intuition tells us the steps are mechanical, but it does not tell us the
answer is right. Correctness needs an argument.
It is tempting to argue by contradiction (suppose is the true maximum but returns something else), but the clean way is to name what the loop preserves and induct on it. The key observation is that is only ever overwritten by a larger value, so never decreases and never holds anything that wasn't actually in the array. Made precise, that is a loop invariant: a statement true before and after every iteration.
We verify it with the three-part rubric that will recur throughout the course:
- Initialization. Before the first iteration , so we must check . The seed line set , and the maximum of a one-element set is that element. ✓
- Maintenance. Assume the invariant holds entering iteration , i.e. . The body sets (it overwrites exactly when , and leaves it otherwise). Hence after the body , which is precisely the invariant for the next cursor value . ✓
- Termination. The loop ends once the cursor would exceed , i.e. with the invariant established for . So , and returns exactly the value the specification demands. ✓
Initialization, maintenance, termination: that triple drives correctness proofs, and we will use it constantly.
Aside: claims and the contrapositive
For algorithms that answer yes/no rather than compute a value, the same rigor takes a slightly different shape. Consider , which reports whether key occurs in . Correctness splits into two claims:
- Claim 1. If , then returns
found. - Claim 2. If , then returns
not found.
Claim 1 is direct: if sits at position , the loop's -th iteration
tests and returns found. Claim 2 is cleaner to prove in its
contrapositive form. The logical identity is worth memorizing:
So instead of if then it returns
we prove the
equivalent not found,if it returns
which is immediate, since the
only way to return found then ,found is to have just tested for some ,
witnessing . Choosing the easier of two logically identical statements
is a recurring move in correctness proofs.
Deliverable 4 — complexity, in brief
The fourth deliverable asks how many steps the algorithm takes as a function of the input size . For , the seed and the final return cost a constant; the loop runs times, doing a comparison and at most one assignment each pass. If each line costs some machine-dependent constant , the total is
a linear running time: double the array and you roughly double the work. The point of the is that it throws away the machine-specific constants and keeps only the growth rate.
That insertion sort, by contrast, is not always so cheap is exactly why fast
needs care. On an array already in reverse order, every new element sifts past
all its predecessors, costing
comparisons, which is quadratic in . On an already-sorted array it does
only comparisons, which is linear: the same algorithm, wildly different
costs.
Defining , , and precisely, and measuring this growth independently of the machine, is the subject of asymptotic analysis in the next lesson.
Takeaways
- An algorithm is a precise recipe of steps for solving a computational problem: a finite, mechanical, input-to-output procedure that must be correct on every instance.
- Every algorithm comes with four deliverables: high-level idea, pseudocode, proof of correctness, and complexity analysis. shows all four at full size.
- Specify the problem (legal inputs → required outputs) before the method. That is also what tells you to seed and require .
- Pseudocode is for humans to reason about; a loop invariant (
only ever grows, to a value actually in the array
) turnsit looks right
into a proof by initialization, maintenance, termination. Pin down which moment your invariant describes. - For yes/no algorithms, prove each direction separately, and use the contrapositive to pick the easier statement.
- Correctness and efficiency are separate questions: an algorithm can win one and lose the other.
Footnotes
- Erickson, Algorithms, Ch. 0 — Introduction: an algorithm as a procedure mechanical enough that
a rock could follow it.
↩ - CLRS, Ch. 1 — The Role of Algorithms (§1.1): the operational
well-defined computational procedure
framing. ↩ - Skiena, The Algorithm Design Manual, §1.1–1.3 — Introduction to Algorithm Design: an algorithm must be correct on every instance. ↩
╌╌ END ╌╌