📖 Chapter 4.1 ⏱️ ~55 min read 🎯 Intermediate

Chapter 4.1: Greedy Fundamentals

📝 Before You Continue: You should be comfortable with sorting (Chapter 3.3) and basic priority_queue usage (Chapter 3.1). Some problems also use interval reasoning.

A greedy algorithm is like a traveler who always takes the nearest oasis — no map, no planning, just the best move visible right now. For the right problems, this always works out. For others, it leads to disaster.


4.1.1 What Makes a Problem "Greedy-Solvable"?

A greedy approach works when the problem has the greedy choice property: making the locally optimal choice at each step leads to a globally optimal solution.

Contrast with DP

Consider making change for 11 cents:

  • Coins: {1, 5, 6, 9}
  • Greedy: 9 + 1 + 1 = 3 coins
  • Optimal: 6 + 5 = 2 coins

Here greedy fails. The greedy choice (always take the largest coin) doesn't lead to the global optimum.

But with US coins {1, 5, 10, 25, 50}:

  • 41 cents: Greedy → 25 + 10 + 5 + 1 = 4 coins ✓ (optimal)

US coins have a special structure that makes greedy work. Always verify!
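Both claims above are easy to check mechanically. A minimal sketch (helper names `greedyCoins` and `dpCoins` are illustrative) comparing the largest-coin-first greedy against a DP that considers every choice:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Largest-coin-first greedy: repeatedly take the biggest coin that fits.
int greedyCoins(int amount, vector<int> coins) {
    sort(coins.rbegin(), coins.rend());      // largest coin first
    int count = 0;
    for (int c : coins) {
        count += amount / c;
        amount %= c;
    }
    return amount == 0 ? count : -1;         // -1 if the amount can't be made
}

// DP: dp[x] = minimum coins to make x, considering every coin choice.
int dpCoins(int amount, const vector<int>& coins) {
    const int INF = 1e9;
    vector<int> dp(amount + 1, INF);
    dp[0] = 0;
    for (int x = 1; x <= amount; x++)
        for (int c : coins)
            if (c <= x && dp[x - c] != INF)
                dp[x] = min(dp[x], dp[x - c] + 1);
    return dp[amount] == INF ? -1 : dp[amount];
}
```

Running the DP alongside the greedy on small inputs is a quick way to hunt for counterexamples before committing to a greedy solution.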

💡 Key Insight: Greedy works when there's a "no regret" property — once you make the greedy choice, you'll never need to undo it. If you can always swap any non-greedy choice for the greedy one without making things worse, greedy is optimal.

Greedy vs. DP decision flowchart:

flowchart TD
    Start["Optimization problem"] --> Q1{"Can you find a counterexample?"}
    Q1 -->|"Yes, greedy fails"| DP["Use DP\nconsider all choices"]
    Q1 -->|"No, try to prove it"| Q2{"Can an exchange argument show the greedy choice is safe?"}
    Q2 -->|"Yes"| Greedy["Use greedy\ntake the local optimum each step"]
    Q2 -->|"Unsure"| Both["Try greedy first\nswitch to DP on WA"]
    style Greedy fill:#dcfce7,stroke:#16a34a
    style DP fill:#dbeafe,stroke:#3b82f6
    style Both fill:#fef9ec,stroke:#d97706

4.1.2 The Exchange Argument

The exchange argument is the standard proof technique for greedy algorithms:

  1. Assume there's an optimal solution O that makes a different choice than our greedy at some step
  2. Show that we can "swap" our greedy choice for theirs without making things worse
  3. By repeated swaps, transform O into the greedy solution — it remains optimal throughout
  4. Conclude: the greedy solution is optimal

💡 Key Insight: The exchange argument works by showing that greedy choices are "at least as good" as any alternative. You don't need to show greedy is uniquely optimal — just that no swap can improve it.

Visual: Greedy Exchange Argument


The diagram illustrates the exchange argument: if two adjacent elements are "out of order" relative to the greedy criterion, swapping them produces a solution that is at least as good. By repeatedly applying swaps we can transform any solution into the greedy solution without losing value.

Let's see this in action.


4.1.3 Activity Selection Problem

Problem: Given N activities, each with a start time s[i] and end time f[i], select the maximum number of non-overlapping activities.

Visual: Activity Selection Gantt Chart


The Gantt chart shows all activities on a timeline. Selected activities (green) are non-overlapping and maximally many. Rejected activities (gray) are skipped because they overlap with an already-selected one. The greedy rule is: always pick the activity with the earliest end time that doesn't conflict.

Greedy Algorithm:

  1. Sort activities by end time
  2. Always select the activity that ends earliest among those compatible with previously selected activities

Activity selection greedy walkthrough:

flowchart LR
    subgraph sorted["Sorted by end time"]
        direction TB
        A1["A(1,3)"] 
        B1["B(2,5)"]
        C1["C(5,7)"]
        D1["D(6,8)"]
        F1["F(8,11)"]
    end
    subgraph select["Greedy selection"]
        direction TB
        S1["lastEnd=-1\nselect A(1,3) ✓\nlastEnd=3"]
        S2["B starts at 2 < lastEnd=3\nskip B ✗"]
        S3["C starts at 5 ≥ lastEnd=3\nselect C(5,7) ✓\nlastEnd=7"]
        S4["D starts at 6 < lastEnd=7\nskip D ✗"]
        S5["F starts at 8 ≥ lastEnd=7\nselect F(8,11) ✓\nlastEnd=11"]
        S1 --> S2 --> S3 --> S4 --> S5
    end
    sorted --> select
    style S1 fill:#dcfce7,stroke:#16a34a
    style S3 fill:#dcfce7,stroke:#16a34a
    style S5 fill:#dcfce7,stroke:#16a34a
    style S2 fill:#fef2f2,stroke:#dc2626
    style S4 fill:#fef2f2,stroke:#dc2626

💡 Why sort by end time? Picking the earliest-ending activity leaves the most remaining time for later activities. Sorting by start time could pick an activity that starts early but ends very late, blocking a large stretch of time.

// Solution: Activity Selection — O(N log N)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin >> n;

    vector<pair<int,int>> activities(n);  // {end_time, start_time}
    for (int i = 0; i < n; i++) {
        int s, f;
        cin >> s >> f;
        activities[i] = {f, s};  // sort by end time
    }

    sort(activities.begin(), activities.end());  // ← KEY LINE: sort by end time

    int count = 0;
    int lastEnd = -1;  // end time of the last selected activity

    for (auto [f, s] : activities) {
        if (s >= lastEnd) {      // this activity starts after the last one ends
            count++;
            lastEnd = f;         // update last end time
        }
    }

    cout << count << "\n";
    return 0;
}

Complete Walkthrough: USACO-Style Activity Selection

Problem: Given activities: [(1,3), (2,5), (3,9), (6,8), (5,7), (8,11), (10,12)] (format: start, end)

Step 1 — Sort by end time:

Activity:  A      B      C      D      E      F      G
(s,e):  (1,3)  (2,5)  (5,7)  (6,8)  (3,9)  (8,11) (10,12)

Sorted: A(1,3), B(2,5), C(5,7), D(6,8), E(3,9), F(8,11), G(10,12)

Step 2 — Greedy selection (lastEnd = -1 initially):

Activity A (1,3):  start=1 ≥ lastEnd=-1 ✓ SELECT. lastEnd = 3. Count = 1
Activity B (2,5):  start=2 ≥ lastEnd=3? NO (2 < 3). SKIP.
Activity C (5,7):  start=5 ≥ lastEnd=3 ✓ SELECT. lastEnd = 7. Count = 2
Activity D (6,8):  start=6 ≥ lastEnd=7? NO (6 < 7). SKIP.
Activity E (3,9):  start=3 ≥ lastEnd=7? NO (3 < 7). SKIP.
Activity F (8,11): start=8 ≥ lastEnd=7 ✓ SELECT. lastEnd = 11. Count = 3
Activity G (10,12):start=10 ≥ lastEnd=11? NO (10 < 11). SKIP.

Result: 3 activities selected — A(1,3), C(5,7), F(8,11)

ASCII Timeline Diagram:

Time:  0  1  2  3  4  5  6  7  8  9  10 11 12
       |  |  |  |  |  |  |  |  |  |  |  |  |
A:        [===]                                   ✓ SELECTED
B:           [======]                             ✗ overlaps A
C:                   [======]                     ✓ SELECTED
D:                      [======]                  ✗ overlaps C
E:              [============]                    ✗ overlaps A and C
F:                            [======]            ✓ SELECTED
G:                               [======]         ✗ overlaps F

Selected: A ===    C ===    F ===
          1-3      5-7      8-11

Formal Exchange Argument Proof (Activity Selection)

Claim: Sorting by end time and greedily selecting is optimal.

Proof:

Let G be the greedy solution and O some optimal solution. Since O is optimal, it selects at least as many activities as G; we will show O can be transformed into G, so both select the same number k of activities.

Step 1 — Show first selections can be made equivalent: Let a₁ be the first activity selected by G (earliest-ending activity overall). Let b₁ be the first activity selected by O.

Since G sorts by end time, end(a₁) ≤ end(b₁).

Now "swap" b₁ for a₁ in O: replace b₁ with a₁. Does O remain feasible?

  • a₁ ends no later than b₁, so a₁ conflicts with at most as many activities as b₁ did
  • All activities in O that came after b₁ and didn't conflict with b₁ also don't conflict with a₁ (since a₁ ends ≤ b₁ ends)
  • So O' (with a₁ replacing b₁) is still a valid selection of k activities ✓

Step 2 — Induction: After the first selection, G picks the earliest-ending activity compatible with a₁, and O' has a₁ as its first activity. Apply the same argument to the remaining activities.

Conclusion: By induction, any optimal solution O can be transformed into G (the greedy solution) without losing optimality. Therefore G is optimal. ∎

💡 Key Insight from the proof: The greedy choice (earliest end time) is "safe" because it leaves the most remaining time for future activities. Choosing any later-ending first activity can only hurt future flexibility.


4.1.4 Interval Scheduling Maximization vs. Minimization

Visual: Interval Scheduling on a Number Line


The number line diagram shows multiple intervals and the greedy selection process. By sorting by end time and always taking the next non-overlapping interval, we achieve the maximum number of selected intervals. Green intervals are selected; gray ones are rejected due to overlap.

Maximization: Maximum Non-Overlapping Intervals

→ Sort by end time, greedy select as above.

Minimization: Minimum "Points" to Stab All Intervals

Problem: Given N intervals, find the minimum number of points such that each interval contains at least one point.

Greedy: Sort by end time. For each interval whose left endpoint is to the right of the last placed point, place a new point at its right endpoint.

sort(intervals.begin(), intervals.end());  // sort by end time

int points = 0;
int lastPoint = INT_MIN;

for (auto [end, start] : intervals) {
    if (start > lastPoint) {  // this interval not yet covered
        lastPoint = end;       // place point at its end (covers as many future intervals as possible)
        points++;
    }
}

cout << points << "\n";

Minimization: Minimum Intervals to Cover a Range

Problem: Cover the range [0, T] with minimum intervals from a given set.

Greedy: Sort by start time. At each step, among all intervals starting at or before the current position, pick the one that extends furthest to the right.

sort(intervals.begin(), intervals.end());  // sort by start time

int covered = 0;    // the range [0, covered] is covered so far
int count = 0;
int i = 0;
int n = intervals.size();

while (covered < T) {
    int farthest = covered;

    // Among all intervals that start at or before 'covered', pick the farthest-reaching
    while (i < n && intervals[i].first <= covered) {
        farthest = max(farthest, intervals[i].second);
        i++;
    }

    if (farthest == covered) {
        cout << "Impossible\n";
        return 0;
    }

    covered = farthest;
    count++;
}

cout << count << "\n";

4.1.5 The Scheduling Problem: Minimize Lateness

Problem: N jobs with deadlines d[i] and processing times t[i]. Schedule all jobs on one machine to minimize maximum lateness (how much the latest job exceeds its deadline).

Lateness of job i = max(0, finish_time[i] - d[i]).

Greedy: Sort jobs by deadline (Earliest Deadline First — EDF).

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;

    vector<pair<int,int>> jobs(n);  // {deadline, processing_time}
    for (int i = 0; i < n; i++) cin >> jobs[i].second >> jobs[i].first;  // input order: t[i] then d[i]

    sort(jobs.begin(), jobs.end());  // sort by deadline

    int time = 0;
    int maxLateness = 0;

    for (auto [deadline, proc] : jobs) {
        time += proc;                          // finish time of this job
        int lateness = max(0, time - deadline); // how late is it?
        maxLateness = max(maxLateness, lateness);
    }

    cout << maxLateness << "\n";
    return 0;
}

Proof sketch: Suppose job B is scheduled immediately before job A even though d[A] ≤ d[B] (an "inversion"). Swap them. A's lateness can only decrease, since A now finishes earlier. B now finishes exactly when A used to finish, and since d[B] ≥ d[A], B's new lateness is at most A's old lateness. So the maximum lateness does not increase, and by repeatedly removing inversions we reach the EDF order. Hence EDF is optimal.
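For tiny inputs you can also verify the claim by brute force. A sketch (helper names are illustrative) comparing EDF against every permutation of a small job set, with jobs stored as {deadline, processing_time}:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Max lateness when jobs run in the given order; jobs are {deadline, proc_time}.
int maxLateness(const vector<pair<int,int>>& order) {
    int time = 0, worst = 0;
    for (auto [d, t] : order) {
        time += t;
        worst = max(worst, max(0, time - d));
    }
    return worst;
}

// EDF: sort by deadline, then evaluate.
int edf(vector<pair<int,int>> jobs) {
    sort(jobs.begin(), jobs.end());
    return maxLateness(jobs);
}

// Brute force over all orderings (only feasible for very small n).
int bruteForce(vector<pair<int,int>> jobs) {
    sort(jobs.begin(), jobs.end());   // start from the lexicographically smallest order
    int best = INT_MAX;
    do best = min(best, maxLateness(jobs));
    while (next_permutation(jobs.begin(), jobs.end()));
    return best;
}
```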


4.1.6 Huffman Coding (Greedy Tree Building)

Problem: Given N symbols with frequencies, build a binary tree minimizing total encoding length (frequency × depth summed over all symbols).

Greedy: Always merge the two symbols/groups with smallest frequency.

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;

    priority_queue<long long, vector<long long>, greater<long long>> pq;  // min-heap
    for (int i = 0; i < n; i++) {
        long long f; cin >> f;
        pq.push(f);
    }

    long long totalCost = 0;
    while (pq.size() > 1) {
        long long a = pq.top(); pq.pop();
        long long b = pq.top(); pq.pop();
        totalCost += a + b;  // cost of merging a and b
        pq.push(a + b);      // merged group has frequency a+b
    }

    cout << totalCost << "\n";
    return 0;
}

⚠️ Common Mistakes in Chapter 4.1

  1. Applying greedy to DP problems: Just because greedy is simpler doesn't mean it's correct. Always test your greedy on small counterexamples. Coin change with arbitrary denominations is a classic trap.
  2. Wrong sort criterion: Sorting by start time instead of end time for activity selection is a classic bug. The justification for WHY we sort a certain way (the exchange argument) is what tells you the correct criterion.
  3. Off-by-one in overlap check: s >= lastEnd (allows adjacent activities) vs. s > lastEnd (requires a gap). Check which interpretation the problem intends.
  4. Assuming greedy works without proof: Always verify with a small example or brief exchange argument. If you can't find a counterexample AND you can sketch why the greedy choice is "safe," it's likely correct.
  5. Forgetting to sort: Greedy algorithms almost always begin with a sort. Forgetting to sort means the greedy "order" doesn't exist.

Chapter Summary

📌 Key Takeaways

Problem | Greedy Strategy | Sort By | Time
Max non-overlapping intervals | Pick earliest-ending | End time ↑ | O(N log N)
Min points to stab intervals | Place point at end of each uncovered interval | End time ↑ | O(N log N)
Min intervals to cover range | Pick farthest-reaching at each step | Start time ↑ | O(N log N)
Minimize max lateness | Earliest Deadline First (EDF) | Deadline ↑ | O(N log N)
Huffman coding | Merge two smallest frequencies | Min-heap | O(N log N)

❓ FAQ

Q1: How do I tell if a problem can be solved greedily?

A: Three signals: ① After sorting, there's a clear processing order; ② You can use an exchange argument to show the greedy choice is never worse than any alternative; ③ You can't find a counterexample. If you find one (e.g., coin change with {1,5,6,9}), greedy fails — use DP instead.

Q2: What's the real difference between greedy and DP?

A: Greedy makes the locally optimal choice at each step and never looks back. DP considers all possible choices and builds the global optimum from subproblem solutions. Greedy is a special case of DP — it works when the local optimum happens to equal the global optimum.

Q3: What is the "binary search on answer + greedy check" pattern?

A: When a problem asks to "minimize the maximum" or "maximize the minimum," binary search on the answer X and use a greedy check(X) to verify feasibility. See the Convention problem in Chapter 4.2.
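As a sketch of the pattern (the task and names here are illustrative, not from this chapter): split an array into at most k contiguous parts so that the largest part sum is minimized. The feasibility check is greedy; the answer is found by binary search over the candidate cap:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Greedy check: can 'a' be split into at most k contiguous parts,
// each with sum <= cap? Extend the current part as long as it fits.
bool canSplit(const vector<int>& a, int k, long long cap) {
    int parts = 1;
    long long cur = 0;
    for (int x : a) {
        if (x > cap) return false;           // one element already exceeds cap
        if (cur + x > cap) { parts++; cur = 0; }
        cur += x;
    }
    return parts <= k;
}

// Binary search on the answer: smallest cap for which the check passes.
long long minMaxPartSum(const vector<int>& a, int k) {
    long long lo = *max_element(a.begin(), a.end());
    long long hi = accumulate(a.begin(), a.end(), 0LL);
    while (lo < hi) {
        long long mid = lo + (hi - lo) / 2;
        if (canSplit(a, k, mid)) hi = mid;   // feasible: try a smaller cap
        else lo = mid + 1;                   // infeasible: cap must grow
    }
    return lo;
}
```

The binary search is valid because feasibility is monotone in the cap: if a cap works, every larger cap works too.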

Q4: Why sort Activity Selection by end time instead of start time?

A: Sorting by end time ensures we always pick the activity that "frees up resources" earliest, leaving the most room for future activities. Sorting by start time might select an activity that starts early but ends very late, blocking all subsequent ones.

🔗 Connections to Other Chapters

  • Chapters 6.1–6.3 (DP) are the "upgrade" of greedy — when greedy fails, DP considers all choices
  • Chapter 3.3 (Sorting & Binary Search) is the prerequisite — almost every greedy algorithm starts with a sort
  • Chapter 4.2 applies greedy to real USACO problems, showcasing the classic "binary search on answer + greedy check" pattern
  • Chapter 5.3 (Kruskal's MST) is fundamentally greedy — sort edges and greedily pick the minimum, one of the most classic greedy algorithms

Practice Problems

Problem 4.1.1 — Meeting Rooms II 🟡 Medium N meetings with start/end times. Find the minimum number of rooms needed so all meetings can occur simultaneously.

Solution sketch: Sort by start time. Use a min-heap of end times (when each room becomes free). For each meeting, if its start ≥ earliest-free room, reuse that room. Otherwise, add a new room.

Hint The minimum rooms needed = the maximum number of meetings happening simultaneously. Use a priority queue (min-heap) to track when rooms become available.
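A sketch of that approach (the function name is illustrative), assuming meetings are given as {start, end} pairs:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Minimum rooms: sort by start; a min-heap holds the end times of occupied rooms.
int minRooms(vector<pair<int,int>> meetings) {                // {start, end}
    sort(meetings.begin(), meetings.end());
    priority_queue<int, vector<int>, greater<int>> freeAt;    // earliest-ending room on top
    for (auto [s, e] : meetings) {
        if (!freeAt.empty() && freeAt.top() <= s) freeAt.pop();  // reuse that room
        freeAt.push(e);                                          // this meeting occupies a room until e
    }
    return (int)freeAt.size();    // heap size = rooms ever needed simultaneously
}
```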

Problem 4.1.2 — Gas Station 🔴 Hard N gas stations in a circle. Station i has gas[i] liters and requires cost[i] to reach the next. Can you complete the circuit? If yes, find the starting station.

Solution sketch: If total gas ≥ total cost, a solution exists. Greedy: try each starting station. If tank drops negative, reset starting station to the next one.

Hint Key insight: if the total gas ≥ total cost, at least one valid starting station exists. Track cumulative gas balance; when it goes negative, the starting station must be after the current failed position.
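One way to implement the sketch (the function name echoes the well-known LeetCode problem; treat the code as an illustrative sketch):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Returns a valid starting index, or -1 if the circuit is impossible.
int canCompleteCircuit(const vector<int>& gas, const vector<int>& cost) {
    int total = 0, tank = 0, start = 0;
    for (int i = 0; i < (int)gas.size(); i++) {
        int diff = gas[i] - cost[i];
        total += diff;
        tank += diff;
        if (tank < 0) {       // can't reach station i+1 from 'start'
            start = i + 1;    // no station in (start..i] can work either
            tank = 0;
        }
    }
    return total >= 0 ? start : -1;   // feasible iff total gas >= total cost
}
```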

Problem 4.1.3 — Minimum Platforms 🟡 Medium Given arrival and departure times for N trains, find the minimum number of platforms needed so no train waits.

Hint Create events: +1 for each arrival, -1 for each departure. Sort by time. Sweep and track the running count; the maximum is the answer.
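A sketch of the event-sweep hint (names are illustrative). Note the tie-break: sorting {time, delta} pairs processes departures (-1) before arrivals (+1) at the same instant, i.e. a platform freed at time t is assumed reusable by a train arriving at t:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Sweep line: +1 at each arrival, -1 at each departure; answer = peak count.
int minPlatforms(const vector<int>& arr, const vector<int>& dep) {
    vector<pair<int,int>> events;                 // {time, delta}
    for (int t : arr) events.push_back({t, +1});
    for (int t : dep) events.push_back({t, -1});
    sort(events.begin(), events.end());           // ties: -1 sorts before +1
    int cur = 0, best = 0;
    for (auto [t, d] : events) {
        cur += d;                                 // trains currently in the station
        best = max(best, cur);
    }
    return best;
}
```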

Problem 4.1.4 — Fractional Knapsack 🟢 Easy You can take fractions of items. Weight w[i], value v[i], capacity W. Maximize value.

Solution sketch: Sort by value/weight ratio (highest first). Take as much as possible of each item until knapsack is full.

Hint Greedy works here (unlike 0/1 knapsack) because you can take fractions. Always take from the highest value/weight ratio item first.
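A sketch (names illustrative), storing items as {weight, value} pairs:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Take items in decreasing value/weight order; only the last item taken
// may be fractional.
double fractionalKnapsack(vector<pair<double,double>> items, double W) {  // {weight, value}
    sort(items.begin(), items.end(), [](auto& a, auto& b) {
        return a.second / a.first > b.second / b.first;   // higher ratio first
    });
    double total = 0;
    for (auto [w, v] : items) {
        if (W <= 0) break;
        double take = min(w, W);        // whole item, or whatever still fits
        total += v * (take / w);        // proportional value for a fraction
        W -= take;
    }
    return total;
}
```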

Problem 4.1.5 — Jump Game 🟡 Medium Array A of non-negative integers. From position i, you can jump up to A[i] steps forward. Can you reach the last position from position 0?

Solution sketch: Track farthest = furthest position reachable so far. At each position i ≤ farthest, update farthest = max(farthest, i + A[i]). If we reach farthest ≥ n-1, return true.

Hint If you can reach position i, you can reach all positions ≤ i + A[i]. Greedily maintain the farthest reachable position.
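A sketch of the farthest-reach greedy (function name illustrative):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Maintain the farthest index reachable; fail if position i lies beyond it.
bool canReachEnd(const vector<int>& a) {
    int farthest = 0;
    for (int i = 0; i < (int)a.size(); i++) {
        if (i > farthest) return false;                      // i is unreachable
        farthest = max(farthest, i + a[i]);
        if (farthest >= (int)a.size() - 1) return true;      // last index reachable
    }
    return true;   // not reached for non-empty input; kept for completeness
}
```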

🏆 Challenge Problem: USACO 2016 February Silver: Fencing the Cows (Activity Selection Variant) Farmer John has N fence segments on the x-axis, each defined by [L_i, R_i]. He wants to select a minimum set of "anchor points" such that every fence segment contains at least one anchor point. (This is the interval stabbing problem — greedy with end-time sorting.)