Mathematical Algorithms/Number Theory: GCD & Modular Arithmetic

Lesson 10.11,533 words

Number Theory: GCD & Modular Arithmetic

This lesson opens the mathematical-algorithms module with the bedrock of computational number theory. We prove Euclid's recurrence gcd(a,b)=gcd(b,amodb)\gcd(a,b)=\gcd(b,\,a\bmod b) and its O(logmin(a,b))O(\log\min(a,b)) running time, extend it to recover Bézout coefficients x,yx,y with ax+by=gcd(a,b)ax+by=\gcd(a,b), and build modular arithmetic on residue classes — including when a modular inverse a1modma^{-1}\bmod m exists and how to compute it.

╌╌╌╌

Most of this course has measured algorithms against the size of their input: elements, vertices, levels of a tree. Number-theoretic algorithms break that habit. Their inputs are single integers, and the interesting cost is measured against the magnitude of those integers, or equivalently the number of bits needed to write them down. The oldest non-trivial algorithm we know, Euclid's, from around 300 BCE, lives here, and it is still the right way to compute a greatest common divisor. This lesson develops it carefully, extends it to solve linear Diophantine equations, and uses both to lay the foundations of modular arithmetic, the arithmetic that underlies hashing, cryptography, and the math problems you will meet in practice.

Divisibility and the greatest common divisor

For integers and we say divides (written ) if for some integer . A common divisor of and is an integer dividing both. The greatest common divisor is the largest such integer, with the conventions and . Throughout we take ; signs only flip the answer's sign.1

The naive way to compute , factoring both numbers and multiplying the shared prime powers, is a trap: integer factorization is believed to be hard, and the sieve and factorization methods that find those prime powers are themselves a separate study. Euclid's insight is that you never need the factorization at all.

Euclid's algorithm

The entire algorithm rests on one recurrence.

The base case is immediate: every integer divides , so the largest divisor of and is itself. The recurrence follows from a sharper claim: the two pairs share exactly the same set of common divisors, hence the same greatest one.

Because the second argument strictly shrinks () and stays non-negative, the recursion must terminate, and it terminates at a pair whose answer is .

Algorithm:Euclid(a,b)\textsc{Euclid}(a, b) — greatest common divisor, O(logmin(a,b))O(\log\min(a,b))
  1. 1
    while b0b \ne 0 do
  2. 2
    ramodbr \gets a \bmod b
  3. 3
    aba \gets b
  4. 4
    brb \gets r
  5. 5
    return aa

Why it is fast

Each iteration replaces with . The key fact is that two iterations at least halve the larger argument.

So after every two steps the first argument drops below half its value; the number of iterations is therefore once the first swap orders the arguments.2 Each iteration does one division on numbers of bits, so the bit-complexity is polynomial in the input size, exponentially better than factoring. (For a refresher on this kind of logarithmic bound, see asymptotic analysis.)

The worst case is slow precisely because every quotient is the smallest it can be, , so each step subtracts only once and the pair merely slides to the previous Fibonacci pair. A single larger quotient would collapse the chain far faster.

Fibonacci inputs are the worst case: every quotient is , so the pair steps down through every Fibonacci number one rung at a time.

Each iteration simply replaces the pair by and recurses; tracing shows the second argument collapsing to in three steps.

Euclid's remainder steps for . Each arrow applies : the divisor slides down to become the new first argument and the remainder becomes the new second, until the second argument hits and the first is the answer.

Euclid, geometrically

Replacing by repeated subtraction gives the subtractive form of the algorithm, and it has a geometric reading: tile an rectangle greedily with the largest squares that fit. Cut off a square as many times as you can, then recurse on the leftover strip. The side of the last square is the gcd.

as the largest square that tiles an rectangle

Extended Euclid: Bézout's identity

Euclid tells us what the gcd is; the extended algorithm tells us how to build it out of and .

The coefficients fall out of running Euclid in reverse. At the base case we have the trivial identity , so . For the recursive step, suppose the recursive call on has already returned with Substitute and regroup by and :

So the back-substitution recurrence is

Algorithm:Extended-Euclid(a,b)\textsc{Extended-Euclid}(a, b) — returns (g,x,y)(g, x, y) with ax+by=g=gcd(a,b)ax+by=g=\gcd(a,b)
  1. 1
    if b=0b = 0 then
  2. 2
    return (a, 1, 0)(a,\ 1,\ 0)
  3. 3
    (g, x, y)Extended-Euclid(b, amodb)(g,\ x',\ y') \gets \textsc{Extended-Euclid}(b,\ a \bmod b)
  4. 4
    xyx \gets y'
  5. 5
    yxa/byy \gets x' - \lfloor a / b \rfloor \cdot y'
  6. 6
    return (g, x, y)(g,\ x,\ y)

It performs the same divisions as plain Euclid, so it is also . The table below traces : the forward pass fills the remainder/quotient columns top-down, and the coefficients are filled bottom-up by the back-substitution recurrence, landing on Bézout coefficients for the original pair in the top row.

back-substitution yields for

When does have a solution?

Bézout pins down solvability of the general linear Diophantine equation.

Why. Let . Any combination is a multiple of , so is necessary. Conversely, if , scale Bézout's coefficients by : from we get . This is exactly the predicate behind the Water and Jug Problem (can we measure liters using jugs of capacity and ? iff and ) and Check if Point Is Reachable, where the reachable lattice is governed by the gcd of the allowed steps.

Modular arithmetic

Fix a modulus . We say is congruent to modulo , written i.e. and leave the same remainder on division by . Congruence is an equivalence relation, and it partitions the integers into residue classes. The decisive property is that the class operations are well-defined: if and , then So you may reduce mod at any point in a chain of , , without changing the final residue, the foundation of every answer modulo problem and of combinatorics modulo a prime.3

When two coprime moduli are at play, the residue classes interlock perfectly: the pair pins down uniquely modulo . The grid below tabulates that bijection, with landing in each cell, the constructive heart of the Chinese Remainder Theorem we revisit in combinatorics.

recovered from — a bijection of residues

Modular inverse and linear congruences

A modular inverse of modulo is an integer with . It is exactly what lets you divide by .

Why. The congruence is the Diophantine equation , which by the corollary above is solvable iff , i.e. . There are two standard ways to produce the inverse:

  1. Extended Euclid. Run to get . Reducing mod kills the term, leaving , so is the inverse. This works for any coprime modulus and costs .
  2. Fermat's little theorem. When is prime, every is coprime to , and , hence . Computed by fast exponentiation in multiplications, the subject of the next lesson, on modular exponentiation and primality.

The reason an inverse exists exactly when is visible directly: multiplying every nonzero residue by such an permutes them, so some residue must land on , and that residue is . Below, multiplying by modulo shuffles the set, and the arrow into comes from , so .

, so ; multiplying by permutes
Algorithm:Mod-Inverse(a,m)\textsc{Mod-Inverse}(a, m) — inverse of aa modulo mm, or "none"
  1. 1
    (g, x, y)Extended-Euclid(amodm, m)(g,\ x,\ y) \gets \textsc{Extended-Euclid}(a \bmod m,\ m)
  2. 2
    if g1g \ne 1 then
  3. 3
    return "no inverse"
    not coprime
  4. 4
    return ((xmodm)+m)modm((x \bmod m) + m) \bmod m
    normalize into [0,m)[0, m)

The same machinery solves the general linear congruence . Let .

When this reduces to multiply both sides by and yields the single solution , the everyday case. The general count is what you reach for when the modulus and coefficient share a factor.

Takeaways

  • is computed by Euclid's algorithm via , base ; the recurrence is exact because and have identical common divisors.
  • Euclid runs in iterations because two steps at least halve the argument; the worst case is consecutive Fibonacci numbers.
  • Extended Euclid returns Bézout coefficients with , and is solvable iff , the test behind Water-and-Jug and Check-if-Point-Is-Reachable.
  • Modular arithmetic is arithmetic on residue classes; are well-defined, but division requires an inverse and overflow must be guarded.
  • A modular inverse exists iff , found by extended Euclid, or by Fermat () when is prime; the linear congruence is solvable iff , with exactly solutions.

Footnotes

  1. CLRS, Ch. 31 — Number-Theoretic Algorithms (§31.1–31.2): divisibility, common divisors, and the recursive characterization of the gcd.
  2. CLRS, Ch. 31 — Number-Theoretic Algorithms (§31.2): Euclid's and the extended algorithm; the Fibonacci worst case bounds the iteration count to .
  3. Skiena, § — Number Theory: residue classes, congruences, and practical modular-arithmetic pitfalls (overflow, negative remainders).
Practice

╌╌ END ╌╌