📖 Chapter 8.5 ⏱️ ~55 min read 🎯 Gold / Hard

Chapter 8.5: Combinatorics & Number Theory

📝 Before You Continue: This chapter requires basic algebra and an understanding of modular arithmetic from Appendix E (Math Foundations). Chapter 6.1 (DP) is also helpful since many combinatorics problems are solved via DP.

Combinatorics and number theory problems appear regularly in USACO Gold — often as the "math" problem in a contest set. They typically involve counting (how many valid configurations exist?) or divisibility (which numbers satisfy a property?), and answers are usually requested modulo a prime (typically 10⁹+7).

Learning objectives:

Implement modular arithmetic correctly (add, multiply, power, inverse)
Compute binomial coefficients C(n, k) mod p efficiently
Apply the inclusion-exclusion principle
Use the Sieve of Eratosthenes for prime factorization
Recognize and solve USACO counting problems

8.5.0 Why Modular Arithmetic?

Combinatorics answers can be astronomically large. C(100, 50) has 30 digits! USACO problems always ask you to output the answer modulo a prime p (usually p = 10⁹+7 = 1,000,000,007).

The key operations under mod:

(a + b) mod p = ((a mod p) + (b mod p)) mod p ✓
(a × b) mod p = ((a mod p) × (b mod p)) mod p ✓
(a − b) mod p = ((a mod p) − (b mod p) + p) mod p ✓ (add p to avoid negative)
(a / b) mod p = (a mod p) × (b⁻¹ mod p) mod p ← requires modular inverse

const long long MOD = 1e9 + 7;

long long mod_add(long long a, long long b) { return (a + b) % MOD; }
long long mod_sub(long long a, long long b) { return (a - b + MOD) % MOD; }
long long mod_mul(long long a, long long b) { return (a % MOD) * (b % MOD) % MOD; }

8.5.1 Fast Power (Binary Exponentiation)

Computing a^n mod p in O(log n) using repeated squaring:

// Returns a^n mod p
long long power(long long a, long long n, long long p = MOD) {
    a %= p;
    long long result = 1;
    while (n > 0) {
        if (n & 1)             // if current bit of n is 1
            result = result * a % p;
        a = a * a % p;         // square the base
        n >>= 1;               // move to next bit
    }
    return result;
}

Example: 2^10 mod 1000:

n=10 (1010₂):  a=2,  result=1
n=5  (101₂):   a=4,  result=1   (bit 1 not set)
n=2  (10₂):    a=16, result=4   (bit 0 set: result = 1*4 = 4, then a=16)

Wait, let's redo: n=10=1010₂ means bits 1 and 3 are set:
2^10 = 2^8 * 2^2 = 256 * 4 = 1024 → 24 mod 1000

8.5.2 Modular Inverse

To compute a/b mod p, you need the modular inverse of b: a value b⁻¹ such that b × b⁻¹ ≡ 1 (mod p).

When does it exist? Only when gcd(b, p) = 1. If p is prime and 0 < b < p, the inverse always exists.

Method 1: Fermat's Little Theorem (p must be prime)

Fermat's little theorem: a^(p−1) ≡ 1 (mod p) for prime p and gcd(a,p)=1.

Therefore: a^(p−2) ≡ a⁻¹ (mod p).

long long mod_inv(long long a, long long p = MOD) {
    return power(a, p - 2, p);  // O(log p)
}

// Division mod p:
long long mod_div(long long a, long long b) {
    return mod_mul(a, mod_inv(b));
}

Method 2: Extended Euclidean Algorithm (works for non-prime moduli)

// Returns x such that a*x ≡ 1 (mod m)
// Uses extended GCD: finds x, y with a*x + m*y = gcd(a, m)
long long ext_gcd(long long a, long long b, long long& x, long long& y) {
    if (b == 0) { x = 1; y = 0; return a; }
    long long x1, y1;
    long long g = ext_gcd(b, a % b, x1, y1);
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

long long mod_inv_general(long long a, long long m) {
    long long x, y;
    long long g = ext_gcd(a, m, x, y);
    if (g != 1) return -1;  // inverse doesn't exist
    return (x % m + m) % m;
}

8.5.3 Binomial Coefficients C(n, k) mod p

The binomial coefficient C(n, k) = n! / (k! × (n−k)!) counts the number of ways to choose k items from n.

Precomputed Factorials (for repeated queries, n ≤ 10⁶)

const int MAXN = 1000001;
const long long MOD = 1e9 + 7;

long long fact[MAXN], inv_fact[MAXN];

void precompute_factorials(int n) {
    fact[0] = 1;
    for (int i = 1; i <= n; i++)
        fact[i] = fact[i-1] * i % MOD;

    inv_fact[n] = power(fact[n], MOD - 2);  // Fermat's little theorem
    for (int i = n - 1; i >= 0; i--)
        inv_fact[i] = inv_fact[i+1] * (i+1) % MOD;
    // inv_fact[i] = 1/i! mod p, computed backwards
}

long long C(int n, int k) {
    if (k < 0 || k > n) return 0;
    return fact[n] % MOD * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
}

Pascal's Triangle DP (for small n, k ≤ 2000)

long long dp[2001][2001];  // dp[n][k] = C(n, k) mod p

void precompute_pascal(int maxn) {
    for (int i = 0; i <= maxn; i++) {
        dp[i][0] = 1;
        for (int j = 1; j <= i; j++)
            dp[i][j] = (dp[i-1][j-1] + dp[i-1][j]) % MOD;
    }
}
// C(n, k) = dp[n][k]

8.5.4 Common Combinatorics Formulas

Formula	Value	Meaning
C(n, k)	n! / (k!(n-k)!)	Choose k from n (unordered, no repetition)
P(n, k)	n! / (n-k)!	Arrange k from n (ordered, no repetition)
n^k	n^k	Place k distinct items into n distinct bins
C(n+k-1, k)	(n+k-1)! / (k!(n-1)!)	Stars and bars: k items into n bins (with repetition)
n! / (a! b! c! ...)		Multinomial: arrange items of a types
C(2n, n) / (n+1)	Catalan number	Binary trees, valid bracket sequences

Catalan number (appears surprisingly often in USACO):

long long catalan(int n) {
    // C_n = C(2n, n) / (n+1)
    return C(2*n, n) % MOD * mod_inv(n+1) % MOD;
}
// C_0=1, C_1=1, C_2=2, C_3=5, C_4=14, C_5=42, ...

8.5.5 Inclusion-Exclusion Principle

The inclusion-exclusion principle counts elements in a union of sets by alternating addition and subtraction:

|A₁ ∪ A₂ ∪ ... ∪ Aₙ| = Σ|Aᵢ| − Σ|Aᵢ ∩ Aⱼ| + Σ|Aᵢ ∩ Aⱼ ∩ Aₖ| − ...

Template for 2-3 sets:

|A ∪ B| = |A| + |B| − |A ∩ B|
|A ∪ B ∪ C| = |A| + |B| + |C| − |A∩B| − |A∩C| − |B∩C| + |A∩B∩C|

USACO Pattern: Count sequences satisfying "at least one condition"

"Count N-length sequences where each element is 1..M, such that every value from 1..K appears at least once."

Inclusion-exclusion over "missing" values:

Total = Σ_{j=0}^{K} (-1)^j × C(K, j) × (M-j)^N

Choose j values to exclude (C(K, j) ways)
Fill N positions with remaining M-j values: (M-j)^N sequences
Alternating sign for inclusion-exclusion

long long count_surjective(int n, int m, int k) {
    // Count N-length sequences with each of K values appearing at least once
    long long ans = 0;
    for (int j = 0; j <= k; j++) {
        long long term = C(k, j) * power(m - j, n) % MOD;
        if (j % 2 == 0) ans = (ans + term) % MOD;
        else             ans = (ans - term + MOD) % MOD;
    }
    return ans;
}

8.5.6 Sieve of Eratosthenes

Find all primes up to N in O(N log log N):

const int MAXN = 1000001;
bool is_prime[MAXN];
vector<int> primes;

void sieve(int n) {
    fill(is_prime, is_prime + n + 1, true);
    is_prime[0] = is_prime[1] = false;

    for (int i = 2; i <= n; i++) {
        if (is_prime[i]) {
            primes.push_back(i);
            for (long long j = (long long)i * i; j <= n; j += i)
                is_prime[j] = false;
        }
    }
}

Linear Sieve (O(N)) — for prime factorization

int min_prime[MAXN];  // smallest prime factor of each number

void linear_sieve(int n) {
    for (int i = 2; i <= n; i++) {
        if (min_prime[i] == 0) {  // i is prime
            min_prime[i] = i;
            primes.push_back(i);
        }
        for (int p : primes) {
            if (p > min_prime[i] || (long long)i * p > n) break;
            min_prime[i * p] = p;
        }
    }
}

// Factorize n using min_prime[] in O(log n)
vector<pair<int,int>> factorize(int n) {
    vector<pair<int,int>> factors;
    while (n > 1) {
        int p = min_prime[n], cnt = 0;
        while (n % p == 0) { n /= p; cnt++; }
        factors.push_back({p, cnt});
    }
    return factors;
}

Number of Divisors

If n = p₁^a₁ × p₂^a₂ × ... × pₖ^aₖ, then the number of divisors is (a₁+1)(a₂+1)...(aₖ+1).

int count_divisors(int n) {
    auto factors = factorize(n);
    int cnt = 1;
    for (auto [p, e] : factors)
        cnt *= (e + 1);
    return cnt;
}

8.5.7 Euler's Totient Function φ(n)

φ(n) (Euler's totient function) counts the number of integers in [1, n] that are coprime with n (i.e., gcd(k, n) = 1).

φ(1) = 1
φ(2) = 1  (only 1 is coprime with 2)
φ(6) = 2  (1 and 5 are coprime with 6)
φ(12) = 4 (1, 5, 7, 11)
φ(p) = p-1 for any prime p  (all 1..p-1 are coprime with p)

Formula

If n = p₁^a₁ × p₂^a₂ × ... × pₖ^aₖ, then:

φ(n) = n × (1 - 1/p₁) × (1 - 1/p₂) × ... × (1 - 1/pₖ)

Implementation: Single Value

int euler_phi(int n) {
    int result = n;
    for (int p = 2; (long long)p * p <= n; p++) {
        if (n % p == 0) {
            while (n % p == 0) n /= p;  // remove all factors of p
            result -= result / p;        // result *= (1 - 1/p)
        }
    }
    if (n > 1) result -= result / n;     // n is a remaining prime factor
    return result;
}

Sieve for φ(1..N) — O(N log log N)

const int MAXN = 1000001;
int phi[MAXN];

void phi_sieve(int n) {
    // Initialize phi[i] = i (multiplicative identity step)
    iota(phi, phi + n + 1, 0);

    for (int p = 2; p <= n; p++) {
        if (phi[p] == p) {   // p is prime (not yet modified)
            for (int j = p; j <= n; j += p) {
                phi[j] -= phi[j] / p;  // phi[j] *= (1 - 1/p)
            }
        }
    }
}
// After calling phi_sieve(n), phi[i] = φ(i) for all i in [1, n]

Why φ Matters in USACO/Combinatorics

Fermat's little theorem generalization: For any a with gcd(a, n) = 1: a^φ(n) ≡ 1 (mod n). This is Euler's theorem.
Primitive roots / multiplicative order: The order of a divides φ(n).
Necklace counting (Burnside): The formula uses φ(d) for each divisor d of N.
Sum of φ: Σ_{d|n} φ(d) = n. Useful in inclusion-exclusion on divisors.

// Example: count pairs (a,b) with 1<=a<=b<=n, gcd(a,b)=1
// Answer = 1 + Σ_{i=2}^{n} φ(i)   (the "+1" accounts for pair (1,1))
phi_sieve(n);
long long count = 1;
for (int i = 2; i <= n; i++)
    count += phi[i];

8.5.8 Chinese Remainder Theorem (CRT)

The Chinese Remainder Theorem says: if you have a system of congruences with pairwise coprime moduli:

x ≡ r₁ (mod m₁)
x ≡ r₂ (mod m₂)
...
x ≡ rₖ (mod mₖ)

Then there exists a unique solution x modulo M = m₁ × m₂ × ... × mₖ.

Two-Equation CRT

For two equations x ≡ r₁ (mod m₁) and x ≡ r₂ (mod m₂) where gcd(m₁, m₂) = 1:

// Returns x such that x ≡ r1 (mod m1) and x ≡ r2 (mod m2)
// Requires gcd(m1, m2) = 1
// Solution is unique mod (m1 * m2)
long long crt(long long r1, long long m1, long long r2, long long m2) {
    // x = r1 + m1 * k for some k
    // r1 + m1 * k ≡ r2 (mod m2)
    // m1 * k ≡ r2 - r1 (mod m2)
    // k ≡ (r2 - r1) * inv(m1) (mod m2)
    long long k = (r2 - r1 % m2 + m2) % m2 * mod_inv(m1 % m2, m2) % m2;
    return r1 + m1 * k;
    // Result is in range [0, m1*m2), may overflow if m1*m2 > 10^18
    // Use __int128 if needed
}

Generalized CRT (Non-Coprime Moduli)

When moduli are NOT coprime, a solution may not exist. Use extended GCD to check:

// Returns {x, lcm(m1,m2)} such that x ≡ r1 (mod m1) and x ≡ r2 (mod m2)
// Returns {-1, -1} if no solution exists
// Works even when gcd(m1, m2) > 1
pair<long long, long long> crt_general(long long r1, long long m1, long long r2, long long m2) {
    long long g = __gcd(m1, m2);
    if ((r2 - r1) % g != 0) return {-1, -1};  // no solution

    long long lcm = m1 / g * m2;
    long long diff = (r2 - r1) / g;
    long long m2g = m2 / g;

    // k ≡ diff * inv(m1/g) (mod m2/g)
    long long k = diff % m2g * mod_inv(m1 / g % m2g, m2g) % m2g;
    long long x = (r1 + m1 * k) % lcm;
    if (x < 0) x += lcm;
    return {x, lcm};
}

Multi-Equation CRT (Iterative)

// Solve system: x ≡ r[i] (mod m[i]) for i = 0..k-1
// Returns {x, M} where M = lcm of all moduli
// or {-1, -1} if no solution
pair<long long, long long> crt_multi(vector<long long>& r, vector<long long>& m) {
    long long cur_r = r[0], cur_m = m[0];
    for (int i = 1; i < (int)r.size(); i++) {
        auto [x, M] = crt_general(cur_r, cur_m, r[i], m[i]);
        if (x == -1) return {-1, -1};
        cur_r = x;
        cur_m = M;
    }
    return {cur_r, cur_m};
}

USACO Pattern: CRT

"Process events that happen every A₁ steps, B₁ days, C₁ hours. Find the next time all three coincide."

x ≡ r₁ (mod A₁)
x ≡ r₂ (mod B₁)
x ≡ r₃ (mod C₁)
→ Solve iteratively with crt_multi

8.5.9 USACO Gold Math Patterns

Pattern 1: Counting with DP

"Count the number of valid sequences of length N where each element is chosen from 1..M satisfying constraints."

Model as DP: dp[i][state] = number of sequences of length i ending in some state. Often the answer requires modular arithmetic.

Pattern 2: Divisibility Constraints

"How many numbers from 1 to N are divisible by at least one of {a₁, a₂, ..., aₖ}?"

Inclusion-exclusion: Σ|multiples of aᵢ| − Σ|multiples of lcm(aᵢ, aⱼ)| + ...

// Count multiples of m up to n:
int count_multiples(long long n, long long m) {
    return n / m;
}

Pattern 3: Stars and Bars

"Distribute N indistinguishable balls into K bins with certain constraints."

Without constraints: C(N+K-1, K-1). With "at most X per bin": inclusion-exclusion.

Pattern 4: Symmetry / Burnside's Lemma

"Count distinct necklaces / colorings up to rotation/reflection."

Burnside's lemma: average number of fixed points over all group actions. Appears rarely but memorably in USACO.

💡 思路陷阱（Pitfall Patterns）

陷阱 1：对非素数模数使用费马小定理求逆元

错误判断： "求 a 的逆元就是 power(a, MOD-2, MOD)" 实际情况： 费马小定理要求 MOD 是素数，且 gcd(a, MOD)=1；若 MOD 不是素数（如 MOD=10⁶）则结果错误

// 错误：MOD = 10^6（不是素数）
long long inv = power(6, 1e6 - 2, 1e6);  // 6^(10^6-2) mod 10^6 ≠ 6⁻¹

// 正确：MOD 不是素数时，用扩展欧几里得
long long inv = mod_inv_general(6, 1000000);  // ext_gcd 方法
// 或者，大多数 USACO 题的 MOD=10⁹+7（素数），直接用 Fermat 没问题

识别信号： 题目给的模数不是 10⁹+7 或 998244353 → 先验证是否为素数再决定求逆方法

陷阱 2：容斥原理的符号方向搞反

错误判断： "偶数大小子集加，奇数大小子集减"（或者反过来） 实际情况： 容斥公式：|A₁∪...∪Aₙ| = Σ|单集合| - Σ|二元交| + Σ|三元交| - ... 奇数大小子集加，偶数大小子集减（包含-排除交替）

// 常见题：N 个元素，统计"至少满足 k 个条件之一"
// 错误：把 + - 方向搞反
long long ans = 0;
for (int mask = 1; mask < (1<<k); mask++) {
    int bits = __builtin_popcount(mask);
    long long term = compute_intersection(mask);
    if (bits % 2 == 0) ans += term;  // ← 错误！偶数应该减
    else               ans -= term;  // ← 错误！奇数应该加
}

// 正确
    if (bits % 2 == 1) ans += term;  // 奇数大小交集：加
    else               ans -= term;  // 偶数大小交集：减

识别信号： 容斥答案出现负数或明显偏大 → 检查 +/- 符号是否与 popcount % 2 对应正确

陷阱 3：C(n, k) 中 k < 0 或 k > n 时未做边界检查

错误判断： "公式直接套，fact[n] * inv_fact[k] * inv_fact[n-k]" 实际情况： 当 k < 0 或 k > n 时，inv_fact 数组越界或数学上 C(n,k) 应为 0

// 错误：没有边界检查
long long C(int n, int k) {
    return fact[n] * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
    // 若 k=-1 或 k=n+1，inv_fact[-1] 或 inv_fact[-1] 是野指针访问
}

// 正确：加边界保护
long long C(int n, int k) {
    if (k < 0 || k > n || n < 0) return 0;  // ← 必须有！
    return fact[n] * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
}

识别信号： 组合数计算在某些特殊输入下崩溃或返回极大值 → 检查边界条件

⚠️ Common Mistakes

Integer overflow in a * b % MOD: If a, b ≈ 10⁹, then a * b overflows int and even long long if you're not careful. Cast to long long first: (long long)a * b % MOD.
Negative result from subtraction: (a - b) % MOD can be negative in C++. Always write (a - b + MOD) % MOD.
inv_fact[0] = 1: Make sure inv_fact[0] = 1 (since 0! = 1). The backwards loop in precompute_factorials handles this.
C(n, k) = 0 when k > n or k < 0: Always guard against these edge cases.
MOD not being prime: Fermat's little theorem requires p to be prime. If the problem uses a non-prime modulus (rare), use ext_gcd for modular inverse.
Lucas' theorem for large n: When n > 10⁷ and p is small, use Lucas' theorem: C(n, k) mod p = C(n mod p, k mod p) × C(n/p, k/p) mod p.

📋 Chapter Summary

📌 Key Takeaways

Concept	Summary
Modular inverse	a⁻¹ mod p = a^(p-2) mod p (Fermat, p prime); O(log p)
Factorial table	Precompute fact[], inv_fact[] up to 10⁶; O(N) space
C(n, k) mod p	fact[n] * inv_fact[k] * inv_fact[n-k] mod p; O(1) per query
Inclusion-exclusion	Alternating sum over subsets of constraints
Sieve	All primes up to N in O(N log log N); factorization in O(log N)
Catalan numbers	C(2n,n)/(n+1); counts binary trees, bracket sequences

❓ FAQ

Q: What is the most common modulus in USACO? A: 10⁹+7 (1,000,000,007), which is prime. Occasionally 998,244,353 (also prime, used in NTT).

Q: How do I know if a problem needs combinatorics vs DP? A: If the problem has a "nice" closed-form answer (like C(n,k)), combinatorics works. If the constraints have complex dependencies, DP may be necessary. Often you need both: DP to compute a table, then combinatorics to sum it up.

Q: What is gcd and when do I need it? A: gcd(a, b) = greatest common divisor. Use __gcd(a, b) in C++. You need it for: simplifying fractions, checking divisibility, computing lcm = a*b/gcd(a,b).

Q: When do I use Lucas' theorem? A: When n is very large (10¹²+) but the prime modulus p is small (< 10⁶). Rare in USACO Gold but appears at Platinum.

🔗 Connections to Later Chapters

Appendix E (Math Foundations): This chapter extends what you learned there — modular arithmetic, primes, and combinatorics are all introduced there.
Ch.6.3 (Advanced DP): Digit DP (counting integers with digit constraints) combines number theory with DP.
Ch.8.2 (DAG DP): Path counting in DAGs often requires taking the result mod p.

🏋️ Practice Problems

🟢 Easy

8.5-E1. Power mod p Compute a^n mod p for given a, n, p where p is prime and n ≤ 10¹⁸.

Hint

Binary exponentiation (power(a, n, p)). Watch for overflow: use __int128 or careful multiplication if a, p ≈ 10¹⁸.

8.5-E2. Count Paths in Grid Count the number of monotone paths (right or down only) from (0,0) to (n,m) in an N×M grid. Output mod 10⁹+7.

Hint

The answer is C(n+m, n) = (n+m)! / (n! × m!). Use precomputed factorials and modular inverses.

🟡 Medium

8.5-M1. Counting Sequences (USACO-style) Count N-length sequences where each element is chosen from {1, 2, ..., M} and all K "special" values appear at least once. Output mod 10⁹+7.

Hint

Inclusion-exclusion: count_surjective(N, M, K) from section 8.5.5. Enumerate over j values to exclude.

8.5-M2. Divisor Sum Given N numbers a₁, a₂, ..., aₙ. For each aᵢ, output the sum of its divisors mod 10⁹+7.

Hint

Factorize each aᵢ using the linear sieve (precompute smallest prime factors). If aᵢ = p₁^e₁ × ... × pₖ^eₖ, divisor sum = product of (1 + pᵢ + pᵢ² + ... + pᵢ^eᵢ) = product of (pᵢ^(eᵢ+1) - 1) / (pᵢ - 1). Use modular inverse for division.

🔴 Hard

8.5-H1. Necklace Counting (Burnside's Lemma) Count the number of distinct necklaces with N beads, each colored with one of K colors, where two necklaces are the same if one is a rotation of the other.

Hint

Burnside's lemma: answer = (1/N) × Σ_{d|N} φ(N/d) × K^d, where φ is Euler's totient function and the sum is over divisors d of N. Requires GCD, modular inverse, and Euler's totient computation.

🏆 Challenge

8.5-C1. Expected Value on Trees (USACO Platinum-adjacent) Given a tree with N vertices, each initially colorless. Randomly color each vertex red with probability 1/2, blue with probability 1/2. Find the expected number of edges where both endpoints have the same color. Output as a fraction p/q reduced to lowest terms, then output p × q⁻¹ mod 10⁹+7.

Hint

By linearity of expectation: E[same-color edges] = (number of edges) × P(both endpoints same color). For each edge, P(same color) = 1/2 (both red) + 1/2 (both blue) = 1/2. Wait — actually P = 1/4 + 1/4 = 1/2. So the answer is (N-1)/2. Output as (N-1) × mod_inv(2) % MOD.

(The interesting version has non-uniform color probabilities — extend this idea.)

C++ for Competitive Programming: A USACO Guide