🏆 C++ for Competitive Programming: A USACO Guide
From Zero to USACO Gold
The complete beginner's roadmap to competitive programming in C++, designed around USACO competition preparation.
No prior experience required. Written for clarity, depth, and contest readiness.
🎯 What Is This Book?
This book is a structured, self-contained course for students who want to learn competitive programming in C++, with a specific focus on USACO (the USA Computing Olympiad).
Unlike scattered online resources, this book gives you a single linear path: from writing your very first C++ program, through data structures and graph algorithms, all the way to solving USACO Gold problems with confidence. Every chapter builds on the previous one, with detailed worked examples, annotated C++ code, and SVG diagrams that make abstract algorithms visual and concrete.
If you've ever felt overwhelmed looking at USACO editorials, or if you know some programming but don't know what to learn next — this book was written for you.
✅ What You'll Learn
📊 Book Statistics
| Metric | Value |
|---|---|
| Parts / Chapters | 7 parts / 26 chapters |
| Code Examples | 150+ (all C++17, compilable) |
| Practice Problems | 130+ (labeled Easy/Medium/Hard) |
| SVG Diagrams | 35+ custom visualizations |
| Algorithm Templates | 20+ contest-ready templates |
| Appendices | 6 (Quick Ref, Problem Set, Tricks, Templates, Math, Debugging) |
| Estimated Completion | 8–12 weeks (1–2 chapters/week) |
| Target Level | USACO Bronze → USACO Gold |
🗺️ Learning Path
🚀 Quick Start (5 Minutes)
Step 1: Install C++ Compiler
Windows: Install MSYS2, then: pacman -S mingw-w64-x86_64-gcc
macOS: xcode-select --install in Terminal
Linux: sudo apt install g++ build-essential
Verify: g++ --version (should show version ≥ 9)
Step 2: Get an Editor
VS Code + C/C++ extension + Code Runner extension
Step 3: Competition Template
Copy this to template.cpp — use it as your starting point for every problem:
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
// freopen("problem.in", "r", stdin); // uncomment for file I/O
// freopen("problem.out", "w", stdout);
// Your solution here
return 0;
}
Step 4: Compile & Run
g++ -o sol solution.cpp -std=c++17 -O2 -Wall
./sol < input.txt
Step 5: Start Reading
Go to Chapter 2.1 and write your first C++ program. Then solve all practice problems before moving on. Don't skip the problems — that's where 80% of learning happens.
📚 How to Use This Book
The Reading Strategy That Works
- Read actively: Code every example yourself. Don't just read — type it out.
- Do the problems: Each chapter has 5–7 problems. Attempt every one before reading hints.
- Read hints when stuck (after 20–30 minutes of genuine effort)
- Review the Chapter Summary before moving on — it's a quick checklist.
- Return to earlier chapters when a later chapter references them.
Practice Problems Guide
Each practice problem is labeled:
- 🟢 Easy — Directly applies the chapter's main technique
- 🟡 Medium — Requires combining ideas or a minor insight
- 🔴 Hard — Challenging; partial credit counts!
- 🏆 Challenge — Beyond chapter scope; try when ready
All hints are hidden by default (click to expand). Struggle first!
Reading Schedule
| Stage | Chapters | Recommended Time |
|---|---|---|
| Foundations | 2.1–2.3 | 1–2 weeks |
| Data Structures | 3.1–3.11 | 2–3 weeks |
| Greedy | 4.1–4.2 | 1 week |
| Graphs | 5.1–5.4 | 2–3 weeks |
| DP | 6.1–6.3 | 3–4 weeks |
| USACO Contest Guide | 7.1–7.3 | 1 week |
📖 Chapter Overview
Part 2: C++ Foundations (1–2 weeks)
| Chapter | Topic | Key Skills |
|---|---|---|
| Ch.2.1: First C++ Program | Hello World, variables, I/O | cin, cout, int, long long |
| Ch.2.2: Control Flow | Conditions and loops | if/else, for, while, break |
| Ch.2.3: Functions & Arrays | Reusable code, collections | Arrays, vectors, recursion |
Part 3: Core Data Structures (2–3 weeks)
| Chapter | Topic | Key Skills |
|---|---|---|
| Ch.3.1: STL Essentials | Powerful built-in containers | sort, map, set, stack, queue |
| Ch.3.2: Arrays & Prefix Sums | Range queries in O(1) | 1D/2D prefix sums, difference arrays |
| Ch.3.3: Sorting & Searching | Efficient ordering and lookup | sort, binary search, BS on answer |
| Ch.3.4: Two Pointers & Sliding Window | Linear-time array techniques | Two pointer, fixed/variable windows |
| Ch.3.5: Monotonic Stack & Monotonic Queue | Monotonic data structures | Next greater element, sliding window max |
| Ch.3.6: Stacks, Queues & Deques | Order-based data structures | stack, queue, deque; LIFO/FIFO patterns |
| Ch.3.7: Hashing Techniques | Fast key lookup and collision handling | unordered_map/set, polynomial hashing, rolling hash |
| Ch.3.8: Maps & Sets | Key-value lookup and uniqueness | map, set, multiset |
| Ch.3.9: Introduction to Segment Trees | Range queries with updates | Segment tree build/query/update |
| Ch.3.10: Fenwick Tree (BIT) | Efficient prefix-sum with point updates | Binary Indexed Tree, BIT update/query, inversion count |
| Ch.3.11: Binary Trees | Tree data structure fundamentals | Traversals, BST operations, balanced trees |
Part 4: Greedy Algorithms (1 week)
| Chapter | Topic | Key Skills |
|---|---|---|
| Ch.4.1: Greedy Fundamentals | When greedy works (and fails) | Activity selection, exchange argument |
| Ch.4.2: Greedy in USACO | Contest-focused greedy | Scheduling, binary search + greedy |
Part 5: Graph Algorithms (2–3 weeks)
| Chapter | Topic | Key Skills |
|---|---|---|
| Ch.5.1: Introduction to Graphs | Modeling relationships | Adjacency list, graph types |
| Ch.5.2: BFS & DFS | Graph traversal | Shortest path, multi-source BFS, cycle detection, topo sort |
| Ch.5.3: Trees & Special Graphs | Tree algorithms | DSU, Kruskal's MST, tree diameter, LCA, Euler tour |
| Ch.5.4: Shortest Paths | Weighted graph shortest paths | Dijkstra, Bellman-Ford, Floyd-Warshall |
Part 6: Dynamic Programming (3–4 weeks)
| Chapter | Topic | Key Skills |
|---|---|---|
| Ch.6.1: Introduction to DP | Memoization and tabulation | Fibonacci, coin change |
| Ch.6.2: Classic DP Problems | Core DP patterns | LIS, 0/1 Knapsack, grid paths |
| Ch.6.3: Advanced DP Patterns | Harder techniques | Bitmask DP, interval DP, tree DP, digit DP |
Part 7: USACO Contest Guide (Read anytime)
| Chapter | Topic | Key Skills |
|---|---|---|
| Ch.7.1: Understanding USACO | Format, divisions, scoring, problem taxonomy | Contest strategy, upsolving, pattern recognition |
| Ch.7.2: Problem-Solving Strategies | How to think about problems | Algorithm selection, debugging |
| Ch.7.3: Ad Hoc Problems | Observation-based problems with no standard algorithm | Invariants, parity, cycle detection, constructive thinking |
Appendix & Reference
| Section | Content |
|---|---|
| Appendix A: C++ Quick Reference | STL cheat sheet, complexity table |
| Appendix B: USACO Problem Set | Curated problem list by topic and difficulty |
| Appendix C: Competitive Programming Tricks | Fast I/O, macros, modular arithmetic |
| Appendix D: Contest-Ready Templates | DSU, Segment Tree, BFS, Dijkstra, binary search, modpow |
| Appendix E: Math Foundations | Modular arithmetic, combinatorics, number theory, probability |
| Appendix F: Debugging Guide | Common bugs, debugging techniques, AddressSanitizer |
| Glossary | 35+ competitive programming terms defined |
| 📊 Knowledge Map | Interactive chapter dependency graph — click nodes to explore prerequisites |
🔧 Setup Instructions
Compiler Setup
| Platform | Command |
|---|---|
| Windows (MSYS2) | pacman -S mingw-w64-x86_64-gcc |
| macOS | xcode-select --install |
| Linux (Debian/Ubuntu) | sudo apt install g++ build-essential |
Verify with: g++ --version
Recommended Compile Flags
# Development (shows warnings, helpful for debugging)
g++ -o sol solution.cpp -std=c++17 -O2 -Wall -Wextra
# Contest (fast, silent)
g++ -o sol solution.cpp -std=c++17 -O2
Running with I/O Redirection
# Run with input file
./sol < input.txt
# Run and save output
./sol < input.txt > output.txt
# Compare output to expected
diff output.txt expected.txt
🌐 External Resources
| Resource | What It's Best For |
|---|---|
| usaco.org | Official USACO problems + editorials |
| usaco.guide | Community guide, curated problems by topic |
| codeforces.com | Additional practice problems, contests |
| cp-algorithms.com | Deep dives into specific algorithms |
| atcoder.jp | High-quality educational problems (AtCoder Beginner) |
🏅 Who Is This Book For?
✅ Middle school / high school students preparing for USACO Bronze through Gold
✅ Complete beginners with no prior programming experience (Part 2 starts from zero)
✅ Intermediate programmers who know Python or Java and want to learn C++ for competitive programming
✅ Self-learners who want a structured, complete curriculum instead of scattered tutorials
✅ Coaches and teachers looking for a comprehensive curriculum for their students
This book is NOT for:
- USACO Platinum preparation (advanced data structures, network flow, geometry)
- General software engineering (no databases, web development, etc.)
🐄 Ready? Let's Begin!
Turn to Chapter 2.1 and write your first C++ program.
The path from complete beginner to USACO Gold is roughly 200–400 hours of focused practice over 2–6 months. It won't always be easy — but every Gold competitor you admire started exactly where you are now.
The only way to get better is to write code, struggle with problems, and keep going. 🐄
Last updated: 2026 · Targets: USACO Bronze → Gold · C++ Standard: C++17 · 35+ SVG diagrams · 150+ code examples · 130+ practice problems
⚡ Part 2: C++ Foundations
Master the building blocks of competitive programming in C++. From your first "Hello World" to functions and arrays.
📚 3 Chapters · ⏱️ Estimated 1-2 weeks · 🎯 Target: Write and compile C++ programs
Part 2: C++ Foundations
Before you can solve algorithmic problems, you need to speak the language. Part 2 is your crash course in C++ — from the very first program to functions, arrays, and vectors. You'll build the foundational skills needed for all later chapters.
What You'll Learn
| Chapter | Topic | Key Skills |
|---|---|---|
| Chapter 2.1 | Your First C++ Program | Variables, input/output, compilation |
| Chapter 2.2 | Control Flow | if/else, loops, break/continue |
| Chapter 2.3 | Functions & Arrays | Reusable code, arrays, vectors |
Why C++?
Competitive programmers overwhelmingly choose C++ for two reasons:
- Speed — C++ programs run faster than Python or Java, which matters when you have tight time limits (typically 1–2 seconds for up to 10^8 operations)
- The STL — C++'s Standard Template Library gives you ready-made implementations of nearly every data structure and algorithm you'll ever need
Note: USACO accepts C++, Java, and Python. But C++ is by far the most common choice among top competitors, and this book focuses on it exclusively.
Tips for Part 2
- Type the code yourself. Don't copy-paste. Your fingers need to learn the syntax.
- Break things. Deliberately introduce errors and see what happens. Reading compiler errors is a skill.
- Run every example. Seeing output appear on screen cements understanding far better than just reading.
Let's dive in!
Chapter 2.1: Your First C++ Program
📝 Before You Continue: This is the very first chapter — no prerequisites! You don't need to have any programming experience. Just work through this chapter from top to bottom and you'll write your first real C++ program by the end.
Welcome! By the end of this chapter, you will have:
- Set up a working C++ environment (takes 5 minutes using an online compiler)
- Written, compiled, and run your first C++ program
- Understood what every single line of code does
- Learned about variables, data types, and input/output
- Solved 13 practice problems with full solutions
2.1.0 Setting Up Your Environment
Before writing any code, you need a place to write and run it. There are two options: online compilers (recommended for beginners — no installation required) and local setup (optional, for when you want to work offline).
Option A: Online Compilers (Recommended — Start Here!)
You only need a web browser. Open any of these sites:
| Site | URL | Notes |
|---|---|---|
| Codeforces IDE | codeforces.com | Create a free account, then click "Submit code" on any problem to get a code editor |
| Replit | replit.com | Create a "C++ project", get a full editor + terminal |
| Ideone | ideone.com | Paste code, select C++17, click "Run" — simplest option |
| OnlineGDB | onlinegdb.com | Good debugger built in |
Using Ideone (simplest for beginners):
- Go to ideone.com
- Select "C++17 (gcc 8.3)" from the language dropdown
- Paste your code in the text area
- Click the green "Run" button
- See output in the bottom panel
That's it! No installation, no configuration.
Option B: Using CLion (Recommended Local IDE)
If you want to write and run C++ code offline on your own computer, we highly recommend CLion — a professional C/C++ IDE by JetBrains. It features intelligent code completion, one-click build & run, and a built-in debugger, all of which will significantly boost your productivity.
💡 Free for Students! CLion is a paid product, but JetBrains offers a free educational license for students. Simply apply with your .edu email on the JetBrains Student License page.
Installation Steps:
Step 1: Install a C++ Compiler (CLion requires an external compiler)
| OS | How to Install |
|---|---|
| Windows | Install MSYS2. After installation, run the following in the MSYS2 terminal: pacman -S mingw-w64-x86_64-gcc, then add C:\msys64\mingw64\bin to your system PATH |
| Mac | Open Terminal and run: xcode-select --install. Click "Install" in the dialog that appears and wait about 5 minutes |
| Linux | Ubuntu/Debian: sudo apt install g++ cmake; Fedora: sudo dnf install gcc-c++ cmake |
Step 2: Install CLion
- Go to the CLion download page and download the installer for your OS
- Run the installer and follow the prompts (keep the default options)
- On first launch, choose "Activate" → sign in with your JetBrains student account, or start a free 30-day trial
Step 3: Create Your First Project
- Open CLion and click "New Project"
- Select "C++ Executable" and set the Language standard to C++17
- Click "Create" — CLion will automatically generate a project with a main.cpp file
- Write your code in main.cpp, then click the green ▶ Run button in the top-right corner to compile and run
- The output will appear in the "Run" panel at the bottom
🔧 CLion Auto-Detects Compilers: On first launch, CLion automatically scans for installed compilers (GCC / Clang / MSVC). If detection succeeds, you'll see a green checkmark ✅ in Settings → Build → Toolchains. If not detected, verify that the compiler from Step 1 is correctly installed and added to your PATH.
Useful CLion Features for Competitive Programming:
- Built-in Terminal: The Terminal tab at the bottom lets you type test input directly
- Debugger: Set breakpoints, step through code line by line, and inspect variable values — an essential tool for tracking down bugs
- Code Formatting: Ctrl + Alt + L (Mac: Cmd + Option + L) automatically tidies up your code indentation
How to Compile and Run (Local)
Once you have g++ installed, here's how to compile and run:
g++ -o hello hello.cpp -std=c++17
Let's break that command down piece by piece:
| Part | Meaning |
|---|---|
| g++ | The name of the C++ compiler program |
| -o hello | -o means "output file name"; hello is the name we're giving our program |
| hello.cpp | The source file we want to compile (our C++ code) |
| -std=c++17 | Use the C++17 version of C++ (has the most features) |
Then to run it:
./hello # Linux/Mac: ./ means "in current directory"
hello.exe # Windows (the .exe is added automatically)
🤔 Why ./hello and not just hello? On Linux/Mac, the system won't run programs from the current folder by default (for security). The ./ explicitly says "look in the current directory."
2.1.1 Hello, World!
Every programming journey starts the same way. Here is the simplest complete C++ program:
#include <iostream> // tells the compiler we want to use input/output
int main() { // every C++ program starts executing from main()
std::cout << "Hello, World!" << std::endl; // print to the screen
return 0; // 0 = success, program ended normally
}
Run it, and you should see:
Hello, World!
What every line means:
Line 1: #include <iostream>
This is a preprocessor directive — an instruction that runs before the actual compilation. It says "copy-paste the contents of the iostream library into my program." The iostream library provides cin (read input) and cout (print output). Without this line, your program can't print anything.
Think of it like: before you can cook, you need to bring the ingredients into the kitchen.
Line 3: int main()
This declares the main function — the starting point of every C++ program. When you run a C++ program, the computer always starts executing from the first line inside main(). The int means this function returns an integer (the exit code). Every C++ program must have exactly one main.
Line 4: std::cout << "Hello, World!" << std::endl;
This prints text. Let's break it down:
- std::cout — the "console output" stream (think of it as the screen)
- << — the "put into" operator; sends data into the stream
- "Hello, World!" — the text to print (the quotes are not printed)
- << std::endl — adds a newline (like pressing Enter)
- ; — every statement in C++ ends with a semicolon
Line 5: return 0;
Exits main and tells the operating system the program finished successfully. (A non-zero return would signal an error.)
The Compilation Pipeline
Visual: The Compilation Pipeline
The diagram above shows the three-stage journey from source code to executable: your .cpp file is fed to the g++ compiler, which produces a runnable binary. Understanding this pipeline makes compiler errors much easier to interpret when they appear.
2.1.2 The Competitive Programmer's Template
When solving USACO problems, you'll use a standard template. Here it is, fully explained:
#include <bits/stdc++.h> // "batteries included" — includes ALL standard libraries
using namespace std; // lets us write cout instead of std::cout
int main() {
ios_base::sync_with_stdio(false); // disables syncing C and C++ I/O (faster)
cin.tie(NULL); // unties cin from cout (faster input)
// Your solution code goes here
return 0;
}
Why #include <bits/stdc++.h>?
This is a GCC-specific header that includes every standard library at once. Instead of writing:
#include <iostream>
#include <vector>
#include <algorithm>
#include <map>
// ... 20 more lines
You write one line. In competitive programming, this is universally accepted and saves time.
Note: bits/stdc++.h only works with GCC (the compiler USACO judges use). It's fine for competitive programming, but don't use it in production software.
Why using namespace std;?
The standard library puts everything inside a namespace called std. Without this line, you'd write std::cout, std::vector, std::sort everywhere. With using namespace std;, you write cout, vector, sort — much cleaner.
The I/O Speed Lines
ios_base::sync_with_stdio(false);
cin.tie(NULL);
These two lines make cin and cout much faster. Without them, reading large inputs can be 10× slower and cause "Time Limit Exceeded" (TLE) even if your algorithm is correct. Always include them.
🐛 Common Bug: After using these speed lines, don't mix cin/cout with scanf/printf. Pick one style.
2.1.3 Variables and Data Types
A variable is a named location in memory that stores a value. In C++, every variable has a type — the type tells the computer how much memory to reserve and what kind of data will go in it.
🧠 Mental Model: Variables are like labeled boxes
When you write: int score = 100;
The computer does three things:
1. Creates a box big enough to hold an integer (4 bytes)
2. Puts the label "score" on the box
3. Puts the number 100 inside the box
The Essential Types for Competitive Programming
#include <bits/stdc++.h>
using namespace std;
int main() {
// int: whole numbers, range: -2,147,483,648 to +2,147,483,647 (about ±2 billion)
int apples = 42;
int temperature = -5;
// long long: big whole numbers, range: about ±9.2 × 10^18
long long population = 7800000000LL; // the LL suffix means "this is a long long literal"
long long trillion = 1000000000000LL;
// double: decimal/fractional numbers
double pi = 3.14159265358979;
double percentage = 99.5;
// bool: true or false only
bool isRaining = true;
bool finished = false;
// char: a single character (stored as a number 0-255)
char grade = 'A'; // single quotes for characters
char newline = '\n'; // special: newline character
// string: a sequence of characters
string name = "Alice"; // double quotes for strings
string greeting = "Hello!";
// Print them all:
cout << "Apples: " << apples << "\n";
cout << "Population: " << population << "\n";
cout << "Pi: " << pi << "\n";
cout << "Is raining: " << isRaining << "\n"; // prints 1 for true, 0 for false
cout << "Grade: " << grade << "\n";
cout << "Name: " << name << "\n";
return 0;
}
Visual: C++ Data Types Reference
Choosing the Right Type
| Situation | Type to Use |
|---|---|
| Counting things, small numbers | int |
| Numbers that might exceed 2 billion | long long |
| Decimal/fractional answers | double |
| Yes/no flags | bool |
| Single letters or characters | char |
| Words or sentences | string |
Variable Naming Rules
Variable names follow strict rules in C++. Getting these right is essential — bad names lead to bugs, and illegal names won't compile at all.
The Formal Rules (Enforced by the Compiler)
✅ Legal names must:
- Start with a letter (a-z, A-Z) or underscore _
- Contain only letters, digits (0-9), and underscores
- Not be a C++ reserved keyword
❌ These will NOT compile:
| Illegal Name | Why It's Wrong |
|---|---|
| 3apples | Starts with a digit |
| my score | Contains a space |
| my-score | Contains a hyphen (interpreted as minus) |
| int | Reserved keyword |
| class | Reserved keyword |
| return | Reserved keyword |
⚠️ Case sensitive! score, Score, and SCORE are three completely different variables. This is a common source of bugs — be consistent.
Common Naming Styles
There are several widely-used naming conventions in C++. You don't have to pick one for competitive programming, but knowing them helps you read other people's code:
| Style | Example | Typically Used For |
|---|---|---|
| camelCase | numStudents, totalScore | Local variables, function parameters |
| PascalCase | MyClass, GraphNode | Classes, structs, type names |
| snake_case | num_students, total_score | Variables, functions (C/Python style) |
| ALL_CAPS | MAX_N, MOD, INF | Constants, macros |
| Single letter | n, m, i, j | Loop indices, math-style competitive programming |
In competitive programming, camelCase and single-letter names are most common. In production code at companies, snake_case or camelCase are standard depending on the style guide.
Best Practices for Naming
1. Be descriptive — make the purpose clear from the name:
// ✅ Good — instantly clear what each variable stores
int numCows = 5;
long long totalMilk = 0;
string cowName = "Bessie";
int maxScore = 100;
// ❌ Bad — legal but confusing
int x = 5; // What is x? Count? Index? Value?
long long t = 0; // What is t? Time? Total? Temporary?
string n = "Bessie"; // n usually means "number" — misleading for a name!
2. Use conventional single-letter names only when the meaning is obvious:
// ✅ Acceptable — these are universally understood conventions
for (int i = 0; i < n; i++) { ... } // i, j, k for loop indices
int n, m; // n = count, m = second dimension
cin >> n >> m; // in competitive programming, everyone does this
// ❌ Confusing — single letters with no clear convention
int q = 5; // Is q a count? A query? A coefficient?
char z = 'A'; // Why z?
3. Constants should be ALL_CAPS to stand out:
const int MAX_N = 200005; // maximum array size
const int MOD = 1000000007; // modular arithmetic constant
const long long INF = 1e18; // "infinity" for comparisons
const double PI = 3.14159265359; // mathematical constant
4. Avoid names that look too similar to each other:
// ❌ Easy to mix up
int total1 = 10;
int totall = 20; // is this "total-L" or "total-1" with a typo?
int O = 0; // the letter O looks like the digit 0
int l = 1; // lowercase L looks like the digit 1
// ✅ Better alternatives
int totalA = 10;
int totalB = 20;
5. Don't start names with underscores followed by uppercase letters:
// ❌ Technically compiles, but reserved by the C++ standard
int _Score = 100; // names like _X are reserved for the compiler/library
int __value = 42; // double underscore is ALWAYS reserved
// ✅ Safe alternatives
int score = 100;
int myValue = 42;
Naming in Competitive Programming vs. Production Code
| Aspect | Competitive Programming | Production / School Projects |
|---|---|---|
| Variable length | Short is fine: n, m, dp, adj | Descriptive: numStudents, adjacencyList |
| Loop variables | i, j, k always | i, j, k still fine |
| Constants | MAXN, MOD, INF | kMaxSize, kModulus (Google style) |
| Comments | Minimal — speed matters | Thorough — readability matters |
| Goal | Write fast, solve fast | Write code others can maintain |
💡 For this book: We'll use a mix — descriptive names for clarity in explanations, but shorter names when solving problems under time pressure. The important thing is: you should always be able to look at a variable name and immediately know what it stores.
Deep Dive: char, string, and Character-Integer Conversions
Earlier in this chapter we briefly introduced char and string. Since many USACO problems involve character processing, digit extraction, and string manipulation, let's take a deeper look at these essential types.
char and ASCII — Every Character is a Number
A char in C++ is stored as a 1-byte integer. Each character is mapped to a number according to the ASCII table (American Standard Code for Information Interchange), which covers values 0–127. You don't need to memorize the whole table, but knowing a few key ranges is extremely useful:
Key relationships:
• 'a' - 'A' = 32 (difference between lower and upper case)
• '0' has ASCII value 48 (not 0!)
• Digits, uppercase letters, and lowercase letters
are each in CONSECUTIVE ranges
#include <bits/stdc++.h>
using namespace std;
int main() {
char ch = 'A';
// A char IS an integer — you can print its numeric value
cout << ch << "\n"; // prints: A (as character)
cout << (int)ch << "\n"; // prints: 65 (its ASCII value)
// You can do arithmetic on chars!
char next = ch + 1; // 'A' + 1 = 66 = 'B'
cout << next << "\n"; // prints: B
// Compare chars (compares their ASCII values)
cout << ('a' < 'z') << "\n"; // 1 (true, because 97 < 122)
cout << ('A' < 'a') << "\n"; // 1 (true, because 65 < 97)
return 0;
}
char ↔ int Conversions — The Most Common Technique
In competitive programming, you constantly need to convert between character digits and integer values. Here's the complete guide:
1. Digit character → Integer value (e.g., '7' → 7)
char ch = '7';
int digit = ch - '0'; // '7' - '0' = 55 - 48 = 7
cout << digit << "\n"; // prints: 7
// This works because digit characters '0'~'9' have consecutive ASCII values:
// '0'=48, '1'=49, ..., '9'=57
// So ch - '0' gives the actual numeric value (0~9)
2. Integer value → Digit character (e.g., 7 → '7')
int digit = 7;
char ch = '0' + digit; // 48 + 7 = 55 = '7'
cout << ch << "\n"; // prints: 7 (as the character '7')
// Works for digits 0~9 only
3. Uppercase ↔ Lowercase conversion
char upper = 'C';
char lower = upper + 32; // 'C'(67) + 32 = 'c'(99)
cout << lower << "\n"; // prints: c
// More readable approach using the difference:
char lower2 = upper - 'A' + 'a'; // 'C'-'A' = 2, 'a'+2 = 'c'
cout << lower2 << "\n"; // prints: c
// Reverse: lowercase → uppercase
char ch = 'f';
char upper2 = ch - 'a' + 'A'; // 'f'-'a' = 5, 'A'+5 = 'F'
cout << upper2 << "\n"; // prints: F
// Using built-in functions (recommended for clarity):
cout << (char)toupper('g') << "\n"; // prints: G
cout << (char)tolower('G') << "\n"; // prints: g
4. Check character types (very useful in USACO)
char ch = '5';
// Check if digit
if (ch >= '0' && ch <= '9') {
cout << "It's a digit!\n";
}
// Check if uppercase letter
if (ch >= 'A' && ch <= 'Z') {
cout << "Uppercase!\n";
}
// Check if lowercase letter
if (ch >= 'a' && ch <= 'z') {
cout << "Lowercase!\n";
}
// Or use built-in functions:
// isdigit(ch), isupper(ch), islower(ch), isalpha(ch), isalnum(ch)
if (isdigit(ch)) cout << "Digit!\n";
if (isalpha(ch)) cout << "Letter!\n";
5. A Classic Pattern: Extract Digits from a String
string s = "abc123def";
int sum = 0;
for (char ch : s) {
if (ch >= '0' && ch <= '9') {
sum += ch - '0'; // convert digit char to int and add
}
}
cout << "Sum of digits: " << sum << "\n"; // 1+2+3 = 6
string Detailed Guide
string is C++'s built-in text type. Unlike a single char, a string holds a sequence of characters and provides many useful operations.
Basic operations:
#include <bits/stdc++.h>
using namespace std;
int main() {
// Creating strings
string s1 = "Hello";
string s2 = "World";
string empty = ""; // empty string
string repeated(5, 'x'); // "xxxxx" — 5 copies of 'x'
// Length
cout << s1.size() << "\n"; // 5 (same as s1.length())
// Concatenation (joining strings)
string s3 = s1 + " " + s2; // "Hello World"
s1 += "!"; // s1 is now "Hello!"
// Access individual characters (0-indexed, just like arrays)
cout << s3[0] << "\n"; // 'H'
cout << s3[6] << "\n"; // 'W'
// Modify individual characters
s3[0] = 'h'; // "hello World"
// Comparison (lexicographic, i.e., dictionary order)
// Careful: at least one side must be a string object — comparing two
// raw "..." literals with < would compare pointers, not characters!
cout << (s1 < s2) << "\n"; // 1 (true: "Hello!" < "World")
cout << (s1 == "Hello!") << "\n"; // 1 (true)
cout << (string("abc") < "abd") << "\n"; // 1 (true, compares char by char)
return 0;
}
Iterating over a string:
string s = "USACO";
// Method 1: index-based loop
for (int i = 0; i < (int)s.size(); i++) {
cout << s[i] << " "; // U S A C O
}
cout << "\n";
// Method 2: range-based for loop (cleaner)
for (char ch : s) {
cout << ch << " "; // U S A C O
}
cout << "\n";
// Method 3: range-based with reference (for modifying in-place)
for (char& ch : s) {
ch = tolower(ch); // convert each char to lowercase
}
cout << s << "\n"; // "usaco"
Useful string functions:
string s = "Hello, World!";
// Substring: s.substr(start, length)
string sub = s.substr(7, 5); // "World" (starting at index 7, take 5 chars)
string sub2 = s.substr(7); // "World!" (from index 7 to end)
// Find: s.find("text") — returns index or string::npos if not found
size_t pos = s.find("World"); // 7 (size_t, not int!)
if (s.find("xyz") == string::npos) {
cout << "Not found!\n";
}
// Append
s.append(" Hi"); // "Hello, World! Hi"
// or equivalently: s += " Hi";
// Insert
s.insert(5, "!!"); // "Hello!!, World! Hi"
// Erase: s.erase(start, count)
s.erase(5, 2); // removes 2 chars starting at index 5 → "Hello, World! Hi"
// Replace: s.replace(start, count, "new text")
string msg = "I love cats";
msg.replace(7, 4, "dogs"); // "I love dogs"
Reading strings from input:
// cin >> reads ONE WORD (stops at whitespace)
string word;
cin >> word; // input "Hello World" → word = "Hello"
// getline reads the ENTIRE LINE (including spaces)
string line;
getline(cin, line); // input "Hello World" → line = "Hello World"
// ⚠️ Remember: after cin >>, call cin.ignore() before getline!
int n;
cin >> n;
cin.ignore(); // consume the leftover '\n'
string fullLine;
getline(cin, fullLine); // now this reads correctly
Converting between string and numbers:
// String → Integer
string numStr = "42";
int num = stoi(numStr); // stoi = "string to int" → 42
long long big = stoll("123456789012345"); // stoll = "string to long long"
// String → Double
double d = stod("3.14"); // stod = "string to double" → 3.14
// Integer → String
int x = 255;
string s = to_string(x); // "255"
string s2 = to_string(3.14); // "3.140000"
char Arrays (C-Style Strings) — Know They Exist
In C (and old C++ code), strings were stored as arrays of char ending with a special null character '\0'. You'll rarely need these in competitive programming (use string instead), but you should recognize them:
// C-style string (char array)
char greeting[] = "Hello"; // actually stores: H e l l o \0 (6 chars!)
// The '\0' (null terminator) marks the end of the string
// WARNING: you must ensure the array is big enough to hold the string + '\0'
char name[20]; // can hold up to 19 characters + '\0'
// Reading into a char array (rarely needed)
// cin >> name; // works, but limited by array size
// scanf("%s", name); // C-style, also works
// Converting between char array and string
string s = greeting; // char array → string (automatic)
// string → char array: use s.c_str() to get a const char*
Why string is better than char[] for competitive programming:
| Feature | char[] (C-style) | string (C++) |
|---|---|---|
| Size | Must predefine max size | Grows automatically |
| Concatenation | strcat() — manual, error-prone | s1 + s2 — simple |
| Comparison | strcmp() — returns int | s1 == s2 — natural |
| Length | strlen() — O(N) each call | s.size() — O(1) |
| Safety | Buffer overflow risk | Safe, managed by C++ |
⚡ Pro Tip for USACO: Always use `string` unless a problem specifically requires `char` arrays. String operations are cleaner, safer, and easier to debug. The only common use of `char` arrays in competitive programming is when reading very large inputs with `scanf`/`printf` for speed — but with `sync_with_stdio(false)`, `string` + `cin`/`cout` is fast enough for 99% of USACO problems.
Quick Reference: Character/String Cheat Sheet
| Task | Code | Example |
|---|---|---|
| Digit char → int | ch - '0' | '7' - '0' → 7 |
| Int → digit char | '0' + digit | '0' + 3 → '3' |
| Uppercase → lowercase | ch - 'A' + 'a' or tolower(ch) | 'C' → 'c' |
| Lowercase → uppercase | ch - 'a' + 'A' or toupper(ch) | 'f' → 'F' |
| Is digit? | ch >= '0' && ch <= '9' or isdigit(ch) | '5' → true |
| Is letter? | isalpha(ch) | 'A' → true |
| String length | s.size() or s.length() | "abc" → 3 |
| Substring | s.substr(start, len) | "Hello".substr(1,3) → "ell" |
| Find in string | s.find("text") | returns index or npos |
| String → int | stoi(s) | stoi("42") → 42 |
| Int → string | to_string(n) | to_string(42) → "42" |
| Traverse string | for (char ch : s) | iterate each character |
⚠️ Integer Overflow — The #1 Bug in Competitive Programming
What happens when a number gets too big for its type?
// Imagine int as a dial that goes from -2,147,483,648 to 2,147,483,647
// When you go past the maximum, it WRAPS AROUND to the minimum!
int x = 2147483647; // maximum int value
cout << x << "\n"; // prints: 2147483647
x++; // add 1... what happens?
cout << x << "\n"; // prints: -2147483648 (OVERFLOW! Wrapped around!)
This is like an old car odometer that hits 999999 and rolls back to 000000. The number wraps around.
How to avoid overflow:
int a = 1000000000; // 1 billion — fits in int
int b = 1000000000; // 1 billion — fits in int
// int wrong = a * b; // OVERFLOW! a*b = 10^18, doesn't fit in int
long long correct = (long long)a * b; // Cast one to long long before multiplying
cout << correct << "\n"; // 1000000000000000000 ✓
// Rule of thumb: if N can be up to 10^9 and you multiply two such values, use long long
⚡ Pro Tip: When in doubt, use `long long`. It's slightly slower than `int` but prevents overflow bugs that are very hard to spot.
2.1.4 Input and Output with cin and cout
Printing Output with cout
int score = 95;
string name = "Alice";
cout << "Score: " << score << "\n"; // Score: 95
cout << name << " got " << score << "\n"; // Alice got 95
// "\n" vs endl
cout << "Line 1" << "\n"; // fast — just a newline character
cout << "Line 2" << endl; // slow — flushes buffer AND adds newline
⚡ Pro Tip: Always use `"\n"` instead of `endl`. `endl` flushes the output buffer, which is much slower. In problems with lots of output, using `endl` can cause Time Limit Exceeded!
Reading Input with cin
int n;
cin >> n; // reads one integer from input
string s;
cin >> s; // reads one word (stops at whitespace — spaces, tabs, newlines)
double x;
cin >> x; // reads a decimal number
cin >> automatically skips whitespace between values. This means spaces, tabs, and newlines are all treated the same way. So these two inputs are read identically:
Input style 1 (all on one line): 42 hello 3.14
Input style 2 (on separate lines):
42
hello
3.14
Both work with:
int a; string b; double c;
cin >> a >> b >> c; // reads all three regardless of formatting
Reading Multiple Values — The Most Common USACO Pattern
USACO problems almost always start with: "Read N, then read N values." Here's how:
Typical USACO input:
5 ← first line: N (the number of items)
10 20 30 40 50 ← next line(s): the N items
int n;
cin >> n; // read N
for (int i = 0; i < n; i++) {
int x;
cin >> x; // read each item
cout << x * 2 << "\n"; // process it
}
Complexity Analysis:
- Time: O(N) — read N numbers and process each one in O(1)
- Space: O(1) — only one variable `x`, no storage of all the data
For the input 5\n10 20 30 40 50, this would print:
20
40
60
80
100
Reading a Full Line (Including Spaces)
Sometimes input has multiple words on a line. cin >> only reads one word at a time, so use getline:
string fullName;
getline(cin, fullName); // reads the entire line, including spaces
cout << "Name: " << fullName << "\n";
🐛 Common Bug: Mixing `cin >>` and `getline` can cause problems. After `cin >> n`, there's a leftover `\n` in the buffer. If you then call `getline`, it will read that empty newline instead of the next line. Fix: call `cin.ignore()` after `cin >>` before using `getline`.
Controlling Decimal Output
double y = 3.14159;
cout << y << "\n"; // 3.14159 (default)
cout << fixed << setprecision(2) << y << "\n"; // 3.14 (exactly 2 decimal places)
cout << fixed << setprecision(6) << y << "\n"; // 3.141590 (6 decimal places)
2.1.5 Basic Arithmetic
#include <bits/stdc++.h>
using namespace std;
int main() {
int a = 17, b = 5;
cout << a + b << "\n"; // 22 (addition)
cout << a - b << "\n"; // 12 (subtraction)
cout << a * b << "\n"; // 85 (multiplication)
cout << a / b << "\n"; // 3 (INTEGER division — truncates toward zero!)
cout << a % b << "\n"; // 2 (modulo — the REMAINDER after division)
// Integer division example:
// 17 ÷ 5 = 3 remainder 2
// So: 17 / 5 = 3 and 17 % 5 = 2
double x = 17.0, y = 5.0;
cout << x / y << "\n"; // 3.4 (real division when operands are doubles)
// Shorthand assignment operators:
int n = 10;
n += 5; // same as: n = n + 5 → n is now 15
n -= 3; // same as: n = n - 3 → n is now 12
n *= 2; // same as: n = n * 2 → n is now 24
n /= 4; // same as: n = n / 4 → n is now 6
n++; // same as: n = n + 1 → n is now 7
n--; // same as: n = n - 1 → n is now 6
cout << n << "\n"; // 6
return 0;
}
🤔 Why does integer division truncate?
When both operands are integers, C++ does integer division — it discards the fractional part. 17 / 5 gives 3, not 3.4. This is intentional and very useful (e.g., to find which "group" something falls into).
// How many full hours in 200 minutes?
int minutes = 200;
int hours = minutes / 60; // 200 / 60 = 3 (not 3.33...)
int remaining = minutes % 60; // 200 % 60 = 20
cout << hours << " hours and " << remaining << " minutes\n"; // 3 hours and 20 minutes
// To get decimal division, at least ONE operand must be a double:
int a = 7, b = 2;
cout << a / b << "\n"; // 3 (integer division)
cout << (double)a / b << "\n"; // 3.5 (cast a to double first)
cout << a / (double)b << "\n"; // 3.5 (cast b to double)
cout << 7.0 / 2 << "\n"; // 3.5 (literal 7.0 is a double)
2.1.6 Your First USACO-Style Program
Let's put everything together and write a complete program that reads input and produces output — just like a real USACO problem.
Problem: Read two integers N and M. Print their sum, difference, product, integer quotient, and remainder.
Thinking through it:
- We need two variables to store N and M
- We use `cin` to read them
- We use `cout` to print each result
- Since N and M could be large, should we use `long long`? Let's be safe.
💡 Beginner's Problem-Solving Flow:
When facing a problem, don't rush to write code. First think through the steps in plain language:
- Understand the problem: What is the input? What is the output? What are the constraints?
- Work through an example by hand: Use the sample input, manually compute the output, confirm you understand the problem
- Think about data ranges: How large can N and M be? Could there be overflow?
- Write pseudocode: `Read → Compute → Output`
- Translate to C++: Convert pseudocode to real code line by line
This problem: read two numbers → perform five operations → output five results. Very straightforward!
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
long long n, m;
cin >> n >> m; // read both numbers on one line
cout << n + m << "\n"; // sum
cout << n - m << "\n"; // difference
cout << n * m << "\n"; // product
cout << n / m << "\n"; // integer quotient
cout << n % m << "\n"; // remainder
return 0;
}
Complexity Analysis:
- Time: O(1) — only a fixed number of arithmetic operations
- Space: O(1) — only two variables
Sample Input:
17 5
Sample Output:
22
12
85
3
2
⚠️ Common Mistakes in Chapter 2.1
| # | Mistake | Example | Why It's Wrong | Fix |
|---|---|---|---|---|
| 1 | Integer overflow | int a = 1e9; int b = a*a; | a*a = 10^18 exceeds int max ~2.1×10^9, result "wraps around" to a wrong value | Use long long |
| 2 | Using endl | cout << x << endl; | endl flushes the output buffer, 10x+ slower than "\n" for large output, may cause TLE | Use "\n" |
| 3 | Forgetting I/O speedup | Missing sync_with_stdio and cin.tie | By default cin/cout syncs with C's scanf/printf, very slow for large input | Always add the two speed lines |
| 4 | Integer division surprise | 7/2 expects 3.5 but gets 3 | Dividing two integers, C++ truncates the fractional part | Cast to double: (double)7/2 |
| 5 | Missing semicolon | cout << x | Every C++ statement must end with ;, otherwise compilation fails | cout << x; |
| 6 | Mixing cin >> and getline | cin >> n then getline(cin, s) | cin >> leaves a \n in the buffer, getline reads an empty line | Add cin.ignore() in between |
Chapter Summary
📌 Key Takeaways
| Concept | Key Points | Why It Matters |
|---|---|---|
| #include <bits/stdc++.h> | Includes all standard libraries at once | Saves time in contests, no need to remember each header |
| using namespace std; | Omits the std:: prefix | Cleaner code, universal practice in competitive programming |
| int main() | The sole entry point of the program | Every C++ program must have exactly one main |
| cin >> x / cout << x | Read input / write output | The core I/O method for USACO |
| int vs long long | ~2×10^9 vs ~9.2×10^18 | Wrong type = overflow = wrong answer (most common bug in contests) |
| "\n" vs endl | "\n" is 10x faster | Determines AC vs TLE for large output |
| a / b and a % b | Integer division and remainder | Core tools for time conversion, grouping, etc. |
| I/O Speed Lines | sync_with_stdio(false) + cin.tie(NULL) | Essential in contest template, forgetting may cause TLE |
❓ FAQ
Q1: Does bits/stdc++.h slow down compilation?
A: Yes, compilation time may increase by 1-2 seconds. But in contests, compilation time is not counted toward the time limit, so it doesn't affect results. Don't use it in production projects.
Q2: Which should I default to — int or long long?
A: Rule of thumb — when in doubt, use `long long`. It's slightly slower than `int` (nearly imperceptible on modern CPUs), but prevents overflow. Especially note: if two `int` values are multiplied, the result may need `long long`.
Q3: Why can't I use scanf/printf in USACO?
A: You actually can! But after adding `sync_with_stdio(false)`, you cannot mix `cin`/`cout` with `scanf`/`printf`. Beginners are advised to stick with `cin`/`cout` — it's safer.
Q4: Can I omit return 0;?
A: In C++11 and later, if `main()` reaches the end without a `return`, the compiler automatically returns 0. So technically it can be omitted, but writing it is clearer.
Q5: My code runs correctly locally, but gets Wrong Answer (WA) on the USACO judge. What could be wrong?
A: The three most common reasons: ① Integer overflow (used `int` when `long long` was needed); ② Not handling all edge cases; ③ Wrong output format (extra or missing spaces/newlines).
🔗 Connections to Later Chapters
- Chapter 2.2 (Control Flow) builds on this chapter by adding `if`/`else` conditionals and `for`/`while` loops, enabling you to handle "repeat N times" tasks
- Chapter 2.3 (Functions & Arrays) introduces functions (organizing code into reusable blocks) and arrays (storing a collection of data) — core tools for solving USACO problems
- Chapter 3.1 (STL Essentials) introduces STL tools like `vector` and `sort`, greatly simplifying the logic you write manually in this chapter
- The integer overflow prevention techniques learned in this chapter will appear throughout the book, especially in Chapter 3.2 (Prefix Sums) and Chapters 6.1–6.3 (DP)
Practice Problems
Work through all problems in order — they get progressively harder. Each has a complete solution you can reveal after trying it yourself.
🌡️ Warm-Up Problems
These problems only require 1-3 lines of new code each. They're meant to help you practice typing C++ and running programs.
Warm-up 2.1.1 — Personal Greeting Write a program that prints exactly this (with your own name):
Hello, Alice!
My favorite number is 7.
I am learning C++.
(You can hardcode all values — no input needed.)
💡 Solution (click to reveal)
Approach: Just print three lines with cout. No input needed.
#include <bits/stdc++.h>
using namespace std;
int main() {
cout << "Hello, Alice!\n";
cout << "My favorite number is 7.\n";
cout << "I am learning C++.\n";
return 0;
}
Key points:
- Each `cout` statement ends with `;` and prints `"\n"` — the `\n` creates a new line
- You can also chain multiple `<<` operators on one `cout` line
- No `cin` needed when there's no input
Warm-up 2.1.2 — Five Lines
Print the numbers 1 through 5, each on its own line. Use exactly 5 separate cout statements (no loops yet — we cover loops in Chapter 2.2).
💡 Solution (click to reveal)
Approach: Five separate cout statements, one per number.
#include <bits/stdc++.h>
using namespace std;
int main() {
cout << 1 << "\n";
cout << 2 << "\n";
cout << 3 << "\n";
cout << 4 << "\n";
cout << 5 << "\n";
return 0;
}
Key points:
- `cout << 1 << "\n"` prints the number 1 followed by a newline
- We'll learn to do this with a loop in Chapter 2.2 — but this manual approach works fine for small counts
Warm-up 2.1.3 — Double It Read one integer from input. Print that integer multiplied by 2.
Sample Input: 7
Sample Output: 14
💡 Solution (click to reveal)
Approach: Read into a variable, multiply by 2, print.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
cout << n * 2 << "\n";
return 0;
}
Key points:
- `cin >> n` reads one integer and stores it in `n`
- We can do arithmetic directly inside `cout`: `n * 2` is computed first, then printed
- Use `long long n` if n can exceed ~10^9, since `n * 2` overflows `int` once n passes ~1.07×10^9
Warm-up 2.1.4 — Sum of Two Read two integers on the same line. Print their sum.
Sample Input: 15 27
Sample Output: 42
💡 Solution (click to reveal)
Approach: Read two integers, add them, print.
#include <bits/stdc++.h>
using namespace std;
int main() {
int a, b;
cin >> a >> b;
cout << a + b << "\n";
return 0;
}
Key points:
- `cin >> a >> b` reads two values in one statement — works whether they're on the same line or different lines
- Declaring two variables on the same line: `int a, b;` is equivalent to `int a; int b;`
Warm-up 2.1.5 — Say Hi
Read a single word (a first name, no spaces). Print Hi, [name]!
Sample Input: Bob
Sample Output: Hi, Bob!
💡 Solution (click to reveal)
Approach: Read a string, then print it inside the greeting message.
#include <bits/stdc++.h>
using namespace std;
int main() {
string name;
cin >> name;
cout << "Hi, " << name << "!\n";
return 0;
}
Key points:
- `string name;` declares a variable that holds text
- `cin >> name` reads one word (stops at the first space)
- Notice how `cout` can chain: literal string + variable + literal string
🏋️ Core Practice Problems
These problems require combining input, arithmetic, and output. Think through the math before coding.
Problem 2.1.6 — Age in Days Read a person's age in whole years. Print their approximate age in days (use 365 days per year, ignore leap years).
Sample Input: 15
Sample Output: 5475
💡 Solution (click to reveal)
Approach: Multiply years by 365. Since age × 365 fits in an int (max age ~150 → 150×365 = 54750, well within int range), int is fine here.
#include <bits/stdc++.h>
using namespace std;
int main() {
int years;
cin >> years;
cout << years * 365 << "\n";
return 0;
}
Key points:
- `years * 365` is computed as integers — no overflow risk here
- If you wanted to include hours, minutes, and seconds, you'd use `long long` to be safe
Problem 2.1.7 — Seconds Converter Read a number of seconds S (1 ≤ S ≤ 10^9). Convert it to hours, minutes, and remaining seconds.
Sample Input: 3661
Sample Output:
1 hours
1 minutes
1 seconds
💡 Solution (click to reveal)
Approach: Use integer division and modulo. First divide by 3600 to get hours, then use the remainder (mod 3600), divide by 60 to get minutes, remaining is seconds.
#include <bits/stdc++.h>
using namespace std;
int main() {
long long s;
cin >> s;
long long hours = s / 3600; // 3600 seconds per hour
long long remaining = s % 3600; // seconds left after removing full hours
long long minutes = remaining / 60; // 60 seconds per minute
long long seconds = remaining % 60; // seconds left after removing full minutes
cout << hours << " hours\n";
cout << minutes << " minutes\n";
cout << seconds << " seconds\n";
return 0;
}
Key points:
- We use `long long` because S can be up to 10^9 (safe in `int`, but `long long` is a good habit)
- The key insight: `s % 3600` gives the seconds left after removing full hours, then we divide that by 60 to get minutes
- Check: 3661 → 3661/3600=1 hour, 3661%3600=61, 61/60=1 minute, 61%60=1 second ✓
Problem 2.1.8 — Rectangle Read the length L and width W of a rectangle. Print its area and perimeter.
Sample Input: 6 4
Sample Output:
Area: 24
Perimeter: 20
💡 Solution (click to reveal)
Approach: Area = L × W, Perimeter = 2 × (L + W).
#include <bits/stdc++.h>
using namespace std;
int main() {
long long L, W;
cin >> L >> W;
cout << "Area: " << L * W << "\n";
cout << "Perimeter: " << 2 * (L + W) << "\n";
return 0;
}
Key points:
- Order of operations: `2 * (L + W)` — the parentheses ensure we add L+W first, then multiply by 2
- Using `long long` in case L and W are large (if L, W can be up to 10^9, L*W could be up to 10^18)
Problem 2.1.9 — Temperature Converter
Read a temperature in Celsius. Print the equivalent in Fahrenheit. Formula: F = C × 9/5 + 32
Sample Input: 100
Sample Output: 212.00
💡 Solution (click to reveal)
Approach: Apply the formula. Since we need a decimal output, use double. The tricky part is the integer division trap: 9/5 in integer math = 1, not 1.8!
#include <bits/stdc++.h>
using namespace std;
int main() {
double celsius;
cin >> celsius;
double fahrenheit = celsius * 9.0 / 5.0 + 32.0;
cout << fixed << setprecision(2) << fahrenheit << "\n";
return 0;
}
Key points:
- Use `9.0 / 5.0` (or `9.0/5`) instead of `9/5` — the latter is integer division giving `1`, not `1.8`!
- `fixed << setprecision(2)` forces exactly 2 decimal places in the output
- Check: 100°C → 100 × 9.0/5.0 + 32 = 180 + 32 = 212 ✓
Problem 2.1.10 — Coin Counter Read four integers: the number of quarters (25¢), dimes (10¢), nickels (5¢), and pennies (1¢). Print the total value in cents.
Sample Input:
3 2 1 4
(3 quarters, 2 dimes, 1 nickel, 4 pennies)
Sample Output: 104
💡 Solution (click to reveal)
Approach: Multiply each coin count by its value, sum them all.
#include <bits/stdc++.h>
using namespace std;
int main() {
int quarters, dimes, nickels, pennies;
cin >> quarters >> dimes >> nickels >> pennies;
int total = quarters * 25 + dimes * 10 + nickels * 5 + pennies * 1;
cout << total << "\n";
return 0;
}
Key points:
- Each coin type multiplied by its value in cents: quarters=25, dimes=10, nickels=5, pennies=1
- Check: 3×25 + 2×10 + 1×5 + 4×1 = 75 + 20 + 5 + 4 = 104 ✓
- If coin counts can be very large, switch to `long long`
🏆 Challenge Problems
These require more thought — especially around data types and problem-solving.
Challenge 2.1.11 — Overflow Detector
Read two integers A and B (each up to 10^9). Compute their product TWO ways: as an int and as a long long. Print both results. Observe the difference when overflow occurs.
Sample Input: 1000000000 3
Sample Output:
int product: -1294967296
long long product: 3000000000
(The int result is wrong due to overflow; long long is correct.)
💡 Solution (click to reveal)
Approach: Read both numbers as long long, then compute the product both ways — once forcing integer math, once with long long. This demonstrates overflow visually.
#include <bits/stdc++.h>
using namespace std;
int main() {
long long a, b;
cin >> a >> b;
// Cast to int FIRST to force integer overflow
int int_product = (int)a * (int)b;
// Long long multiplication — no overflow for values up to 10^9
long long ll_product = a * b;
cout << "int product: " << int_product << "\n";
cout << "long long product: " << ll_product << "\n";
return 0;
}
Key points:
- `(int)a * (int)b` — both operands are cast to `int` before multiplication, so the multiplication overflows
- `a * b` where a, b are `long long` — the multiplication is done in `long long`, no overflow
- The actual output for 10^9 × 3: the correct value is 3×10^9, but `int` wraps around because max int ≈ 2.147×10^9 < 3×10^9, so the result overflows to `-1294967296`
- Lesson: Always use `long long` when multiplying values that could each be around 10^5 or larger
Challenge 2.1.12 — USACO-Style Large Multiply
You're given two integers N and M (1 ≤ N, M ≤ 10^9). Print their product. (This seems simple, but requires long long.)
Sample Input: 1000000000 1000000000
Sample Output: 1000000000000000000
💡 Solution (click to reveal)
Approach: N and M fit individually in int, but N × M = 10^18 — which doesn't fit in int (max ~2.1×10^9) and barely fits in long long (max ~9.2×10^18). Must use long long.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
long long n, m;
cin >> n >> m;
cout << n * m << "\n";
return 0;
}
Key points:
- Reading into `long long` variables is the key — `cin >> n` can then handle values up to ~9.2×10^18
- If you read into `int` variables: `int n, m; cin >> n >> m; cout << n * m;` — this overflows silently and gives the wrong answer
- In USACO, always check the constraints: if N can be 10^9, and you might multiply N by N, you need `long long`
Challenge 2.1.13 — Quadrant Problem (USACO 2016 February Bronze) Read two non-zero integers x and y. Determine which quadrant of the coordinate plane the point (x, y) is in:
- Quadrant 1: x > 0 and y > 0
- Quadrant 2: x < 0 and y > 0
- Quadrant 3: x < 0 and y < 0
- Quadrant 4: x > 0 and y < 0
Print just the number: 1, 2, 3, or 4.
Sample Input 1: 3 5 → Output: 1
Sample Input 2: -1 2 → Output: 2
Sample Input 3: -4 -7 → Output: 3
Sample Input 4: 8 -3 → Output: 4
💡 Solution (click to reveal)
Approach: Check the signs of x and y. Each combination of positive/negative x and y maps to exactly one quadrant. We use if/else-if chains (covered fully in Chapter 2.2, but straightforward here).
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int x, y;
cin >> x >> y;
if (x > 0 && y > 0) {
cout << 1 << "\n";
} else if (x < 0 && y > 0) {
cout << 2 << "\n";
} else if (x < 0 && y < 0) {
cout << 3 << "\n";
} else { // x > 0 && y < 0
cout << 4 << "\n";
}
return 0;
}
Key points:
- The `&&` operator means "AND" — both conditions must be true
- Since the problem guarantees x ≠ 0 and y ≠ 0, we don't need to handle those edge cases
- The four cases are mutually exclusive (exactly one will be true for any input), so else-if chains work perfectly
- We could simplify using a formula, but the explicit if/else is clearer and equally fast
Chapter 2.2: Control Flow
📝 Prerequisites: Chapter 2.1 (variables, cin/cout, basic arithmetic)
2.2.0 What is "Control Flow"?
So far, every program we wrote ran top to bottom — line 1, line 2, line 3, done. Like reading a book straight through.
But real programs need to make decisions and repeat things. That's what "control flow" means — controlling the flow (order) of execution.
Think of it like a "Choose Your Own Adventure" book:
- Sometimes you're told "if you want to fight the dragon, turn to page 47; otherwise turn to page 52"
- Sometimes you're told "repeat this section until you escape the dungeon"
C++ gives us exactly this with:
- `if`/`else` — make decisions based on conditions
- `for`/`while` loops — repeat a section of code
Here's a visual overview:
In the loop diagram: the program keeps going back to Step 2 until the condition becomes false, then it exits to Step 3.
2.2.1 The if Statement
The if statement lets your program make a decision: "if this condition is true, do this thing."
Basic if
#include <bits/stdc++.h>
using namespace std;
int main() {
int score;
cin >> score;
if (score >= 90) {
cout << "Excellent!\n";
}
cout << "Done.\n"; // always runs regardless of score
return 0;
}
If score is 95: prints Excellent! then Done.
If score is 80: prints only Done. (the if-block is skipped)
if / else
int score;
cin >> score;
if (score >= 60) {
cout << "Pass\n";
} else {
cout << "Fail\n";
}
The else block runs only when the if condition is false. Exactly one of the two blocks will run.
if / else if / else Chains
When you have multiple conditions to check:
int score;
cin >> score;
if (score >= 90) {
cout << "A\n";
} else if (score >= 80) {
cout << "B\n";
} else if (score >= 70) {
cout << "C\n";
} else if (score >= 60) {
cout << "D\n";
} else {
cout << "F\n";
}
C++ checks these conditions in order, from top to bottom, and runs the first one that's true. Once it runs one block, it skips all the remaining else if/else blocks.
So if score = 85:
- Is 85 >= 90? No → skip
- Is 85 >= 80? Yes → print "B", then jump past all the else-ifs
🤔 Why does this work? When we reach `else if (score >= 80)`, we already know score < 90 (because if it were ≥ 90, the first condition would have caught it). Each `else if` implicitly assumes all the previous conditions were false.
Comparison Operators
| Operator | Meaning | Example |
|---|---|---|
== | Equal to | a == b |
!= | Not equal to | a != b |
< | Less than | a < b |
> | Greater than | a > b |
<= | Less than or equal to | a <= b |
>= | Greater than or equal to | a >= b |
Logical Operators (Combining Conditions)
| Operator | Meaning | Example |
|---|---|---|
&& | AND — both must be true | x > 0 && y > 0 |
|| | OR — at least one must be true | x == 0 || y == 0 |
! | NOT — flips true to false | !finished |
int x, y;
cin >> x >> y;
if (x > 0 && y > 0) {
cout << "Both positive\n";
}
if (x < 0 || y < 0) {
cout << "At least one is negative\n";
}
bool done = false;
if (!done) {
cout << "Still working...\n";
}
🐛 Common Bug: = vs ==
This is one of the most common mistakes for beginners (and even experienced programmers!):
int x = 5;
// DANGEROUS BUG:
if (x = 10) { // This ASSIGNS 10 to x, doesn't compare!
// x becomes 10, and since 10 is nonzero, this is always TRUE
cout << "x is 10\n"; // This ALWAYS runs, even though x started as 5!
}
// CORRECT:
if (x == 10) { // This COMPARES x with 10
cout << "x is 10\n"; // Only runs when x actually equals 10
}
The = operator assigns (stores a value). The == operator compares (checks if two values are equal). They look similar but do completely different things.
⚡ Pro Tip: Some programmers write `10 == x` instead of `x == 10` — if you accidentally type `=` instead of `==`, it becomes `10 = x`, which is a compile error (you can't assign to a literal). This is called a "Yoda condition."
Nested if Statements
You can put if statements inside other if statements:
int age, income;
cin >> age >> income;
if (age >= 18) {
cout << "Adult\n";
if (income > 50000) {
cout << "High income adult\n";
} else {
cout << "Standard income adult\n";
}
} else {
cout << "Minor\n";
}
Be careful: each else matches the nearest preceding if that doesn't already have an else.
2.2.2 The while Loop
A while loop repeats a block of code as long as its condition is true. When the condition becomes false, execution continues after the loop.
while (condition) {
body (runs over and over)
}
#include <bits/stdc++.h>
using namespace std;
int main() {
int i = 1; // 1. Initialize before the loop
while (i <= 5) { // 2. Check condition — if false, skip the loop
cout << i << "\n"; // 3. Run the body
i++; // 4. Update — VERY IMPORTANT! Forget this → infinite loop
}
// After loop: i = 6, condition 6 <= 5 is false, loop exits
return 0;
}
Output:
1
2
3
4
5
🐛 Common Bug: Infinite Loop
If you forget to update the variable (step 4 above), the condition never becomes false and the loop runs forever!
int i = 1;
while (i <= 5) {
cout << i << "\n";
// BUG: forgot i++ — this prints "1" forever!
}
If your program seems stuck, press Ctrl+C to stop it.
When to use while vs for
- Use `while` when you don't know in advance how many iterations you need
- Use `for` when you do know the count (we'll cover `for` next)
Classic while use case: read until a condition is met.
// Common USACO pattern: read until end of input
int x;
while (cin >> x) { // cin >> x returns false when input runs out
cout << x * 2 << "\n";
}
do-while Loop
A do-while loop always runs its body at least once, then checks the condition:
int n;
do {
cin >> n;
} while (n <= 0); // keep re-reading until user gives a positive number
This is useful when you want to execute something before checking whether to repeat. It's rare in competitive programming but worth knowing.
2.2.3 The for Loop
The for loop is the most used loop in competitive programming. It packages initialization, condition-check, and update into one clean line:
for (initialization; condition; update) {
body
}
This is equivalent to:
initialization;
while (condition) {
body
update;
}
Visual: For Loop Flowchart
The flowchart above traces the execution: initialization runs once, then the condition is checked before every iteration. When false, the loop exits.
Common for Patterns
// Count from 0 to 9 (standard competitive programming pattern)
for (int i = 0; i < 10; i++) {
cout << i << " ";
}
// Prints: 0 1 2 3 4 5 6 7 8 9
// Count from 1 to n (inclusive)
int n = 5;
for (int i = 1; i <= n; i++) {
cout << i << " ";
}
// Prints: 1 2 3 4 5
// Count backwards
for (int i = 10; i >= 1; i--) {
cout << i << " ";
}
// Prints: 10 9 8 7 6 5 4 3 2 1
// Count by steps of 2
for (int i = 0; i <= 10; i += 2) {
cout << i << " ";
}
// Prints: 0 2 4 6 8 10
🧠 Loop Tracing: Understanding Exactly What Happens
When learning loops, trace through them manually. Here's how:
Code: for (int i = 0; i < 4; i++) cout << i * i << " ";
Practice tracing loops on paper before running them — it builds intuition and helps spot bugs.
The Most Common USACO Loop Pattern
Read N numbers and process each one:
int n;
cin >> n;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
// process x here
cout << x * 2 << "\n";
}
⚡ Pro Tip: In competitive programming, `for (int i = 0; i < n; i++)` with 0-based indexing is standard. It matches how arrays are indexed (Chapter 2.3), so everything lines up neatly.
2.2.4 Nested Loops
You can put a loop inside another loop. The inner loop runs completely for each single iteration of the outer loop.
// Print a 4x4 multiplication table
for (int i = 1; i <= 4; i++) { // outer: rows
for (int j = 1; j <= 4; j++) { // inner: columns
cout << i * j << "\t"; // \t = tab character
}
cout << "\n"; // newline after each row
}
Output:
1 2 3 4
2 4 6 8
3 6 9 12
4 8 12 16
Tracing the first two rows:
i=1: j=1→print 1, j=2→print 2, j=3→print 3, j=4→print 4, then newline
i=2: j=1→print 2, j=2→print 4, j=3→print 6, j=4→print 8, then newline
...
⚠️ Nested Loop Time Complexity
💡 Why should you care about loop counts? In competitions, your program typically needs to finish within 1-2 seconds. A modern computer can execute roughly 10^8 to 10^9 simple operations per second. So if you can estimate how many times your loop body executes in total, you can determine whether it will exceed the time limit (TLE). This is the core idea behind "time complexity analysis" — we'll study it in greater depth in later chapters.
A single loop of N iterations does N operations. Two nested loops of N do N × N = N² operations.
| Loops | Operations | Safe for N ≤ | Example |
|---|---|---|---|
| 1 | N | ~10^8 | Iterating through an array to compute a sum |
| 2 (nested) | N² | ~10^4 | Comparing all pairs |
| 3 (nested) | N³ | ~450 | Enumerating all triplets |
If N = 1000 and you have two nested loops, that's 10^6 operations — fine. But if N = 100,000, that's 10^10 — too slow!
🧠 Quick Rule of Thumb: After seeing the range of N, use the table above to work backwards and determine the maximum number of nested loops you can afford. For example, N ≤ 10^5 → you can only use O(N) or O(N log N) algorithms; N ≤ 5000 → O(N²) is acceptable. This technique is extremely useful in USACO!
2.2.5 Switch Statements
When you have a variable and want to check many specific values, switch is cleaner than a long chain of if/else if:
int day;
cin >> day;
switch (day) {
case 1:
cout << "Monday\n";
break; // IMPORTANT: break exits the switch
case 2:
cout << "Tuesday\n";
break;
case 3:
cout << "Wednesday\n";
break;
case 4:
cout << "Thursday\n";
break;
case 5:
cout << "Friday\n";
break;
case 6:
case 7:
cout << "Weekend!\n"; // cases 6 and 7 share this code
break;
default:
cout << "Invalid day\n"; // runs if no case matches
}
When to use switch vs if-else
| Use switch when... | Use if-else when... |
|---|---|
| Checking one variable against exact integer/char values | Comparing ranges (x > 10, x < 5) |
| 3+ specific values to check | Only 1-2 conditions |
| Cases are mutually exclusive | Complex boolean logic |
🐛 Common Bug: Forgetting break — without break, execution "falls through" to the next case!
int x = 2;
switch (x) {
case 1:
cout << "one\n";
case 2:
cout << "two\n"; // this runs
case 3:
cout << "three\n"; // ALSO runs (fall-through!) because no break after case 2
}
// Output: two\nthree\n (surprising!)
2.2.6 break and continue
break — Exit the Loop Immediately
// Find the first number divisible by 7 between 1 and 100
for (int i = 1; i <= 100; i++) {
if (i % 7 == 0) {
cout << "First multiple of 7: " << i << "\n"; // prints 7
break; // stop searching — we found it
}
}
continue — Skip to the Next Iteration
// Print all numbers 1 to 10 except multiples of 3
for (int i = 1; i <= 10; i++) {
if (i % 3 == 0) {
continue; // skip the rest of this iteration, go to i++
}
cout << i << " ";
}
// Output: 1 2 4 5 7 8 10
break in Nested Loops
break only exits the innermost loop. To exit multiple levels, use a flag variable:
bool found = false;
int target = 25;
for (int i = 0; i < 10 && !found; i++) { // outer loop also checks !found
for (int j = 0; j < 10; j++) {
if (i * j == target) {
cout << i << " * " << j << " = " << target << "\n";
found = true;
break; // exits inner loop; outer loop exits too because of !found
}
}
}
2.2.7 Classic Loop Patterns in Competitive Programming
These patterns appear in nearly every USACO solution. Learn them cold.
Pattern 1: Read N Numbers, Compute Sum
int n;
cin >> n;
long long sum = 0;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
sum += x;
}
cout << sum << "\n";
Complexity Analysis:
- Time: O(N) — iterate through N numbers, each processed in O(1)
- Space: O(1) — only one accumulator variable, sum
Pattern 2: Find Maximum (and Minimum) in a List
int n;
cin >> n;
int maxVal, minVal;
cin >> maxVal; // read first element
minVal = maxVal; // initialize both max and min to first element
for (int i = 1; i < n; i++) { // start from 2nd element (index 1)
int x;
cin >> x;
if (x > maxVal) maxVal = x;
if (x < minVal) minVal = x;
}
cout << "Max: " << maxVal << "\n";
cout << "Min: " << minVal << "\n";
Complexity Analysis:
- Time: O(N) — iterate through N numbers, each comparison in O(1)
- Space: O(1) — only two variables, maxVal and minVal
🤔 Why initialize to the first element? Don't initialize max to 0! What if all numbers are negative? Initializing to the first element guarantees we start with a real value from the input.
Pattern 3: Count How Many Satisfy a Condition
int n;
cin >> n;
int count = 0;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
if (x % 2 == 0) { // condition: even number
count++;
}
}
cout << "Even count: " << count << "\n";
Pattern 4: Print a Star Triangle Pattern
int n;
cin >> n;
for (int row = 1; row <= n; row++) { // row goes from 1 to n
for (int col = 1; col <= row; col++) { // print `row` stars per row
cout << "*";
}
cout << "\n"; // newline after each row
}
For n=4, output:
*
**
***
****
Pattern 5: Compute Sum of Digits
int n;
cin >> n;
int digitSum = 0;
while (n > 0) {
digitSum += n % 10; // last digit
n /= 10; // remove last digit
}
cout << digitSum << "\n";
Tracing for n = 12345:
n=12345: digitSum += 5, n becomes 1234
n=1234: digitSum += 4, n becomes 123
n=123: digitSum += 3, n becomes 12
n=12: digitSum += 2, n becomes 1
n=1: digitSum += 1, n becomes 0
n=0: loop exits. digitSum = 15 ✓
2.2.8 Complete Example: USACO-Style Problem
Problem: You have N cows. Each cow has a milk production rating. Find the highest-rated cow's rating and count how many cows produce above-average milk.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
// We need to store all values to compare against the average,
// so we preview vectors here (covered fully in Chapter 2.3) and make two passes
// First pass: find sum and max
long long sum = 0;
int maxMilk = INT_MIN; // not 0 — if every rating were negative, 0 would wrongly win (see Mistakes table)
vector<int> milk(n); // store all values (preview of Chapter 2.3)
for (int i = 0; i < n; i++) {
cin >> milk[i];
sum += milk[i];
if (milk[i] > maxMilk) maxMilk = milk[i];
}
double avg = (double)sum / n;
// Second pass: count above-average
int aboveAvg = 0;
for (int i = 0; i < n; i++) {
if (milk[i] > avg) aboveAvg++;
}
cout << "Maximum: " << maxMilk << "\n";
cout << "Above average: " << aboveAvg << "\n";
return 0;
}
Sample Input:
5
10 20 30 40 50
Sample Output:
Maximum: 50
Above average: 2
(Average is 30; cows with 40 and 50 are above average → 2 cows)
Complexity Analysis:
- Time: O(N) — two passes (read + count), each O(N), total O(2N) = O(N)
- Space: O(N) — uses vector<int> milk(n) to store all data
⚠️ Common Mistakes in Chapter 2.2
| # | Mistake | Example | Why It's Wrong | Fix |
|---|---|---|---|---|
| 1 | Confusing = with == | if (x = 10) | = is assignment, not comparison; result is always true | Use == for comparison |
| 2 | Forgetting i++ causing infinite loop | while (i < n) { ... } without i++ | Condition never becomes false, program hangs | Ensure the loop variable is updated |
| 3 | Forgetting break in switch | case 2: cout << "two"; without break | Execution "falls through" to the next case | Add break; at the end of each case |
| 4 | Off-by-one error | for (int i = 0; i <= n; i++) should be < n | Loops one extra time, may go out of bounds or overcount | Carefully verify < vs <= |
| 5 | Initializing max to 0 | int maxVal = 0; when all numbers are negative | 0 is larger than all inputs, result is wrong | Initialize to the first element or INT_MIN |
| 6 | Reusing the same variable name in nested loops | Outer for (int i...) and inner for (int i...) | Inner i shadows outer i, causing unexpected outer loop behavior | Use different variable names for inner and outer loops (e.g., i and j) |
Chapter Summary
📌 Key Takeaways
| Concept | Syntax | When to Use | Why It Matters |
|---|---|---|---|
if | if (cond) { ... } | Execute when a condition is true | Foundation of program decisions; used in almost every problem |
if/else | if (...) {...} else {...} | Choose between two options | Handles yes/no type decisions |
if/else if/else | chained | Choose among multiple options | Grading scales, classification scenarios |
while | while (cond) {...} | Repeat when count is unknown | Reading until end of input, simulating processes |
for | for (int i=0; i<n; i++) {...} | Repeat when count is known | Most commonly used loop in competitive programming |
| Nested loops | Loop inside loop | Need to iterate over all pairs | Watch out for O(N²) complexity limits |
break | break; | Exit immediately after finding target | Early termination saves time |
continue | continue; | Skip current iteration | Filter out elements that don't need processing |
switch | switch(x) { case 1: ... } | Check one variable against multiple exact values | Cleaner code than long if-else chains |
&& / || / ! | logical operators | Combine multiple conditions | Building blocks for complex decisions |
🧩 Five Classic Loop Patterns Quick Reference
| Pattern | Purpose | Complexity | Section |
|---|---|---|---|
| Read N + Sum | Read N numbers and compute their sum | O(N) | 2.2.7 Pattern 1 |
| Find Max/Min | Find the maximum/minimum value | O(N) | 2.2.7 Pattern 2 |
| Count Condition | Count how many elements satisfy a condition | O(N) | 2.2.7 Pattern 3 |
| Star Triangle | Print patterns using nested loops | O(N²) | 2.2.7 Pattern 4 |
| Digit Sum | Extract and sum individual digits | O(log₁₀N) | 2.2.7 Pattern 5 |
❓ FAQ (Frequently Asked Questions)
Q1: Can for and while replace each other? When should I use which?
A: Yes, any for loop can be rewritten as a while loop, and vice versa. Rule of thumb: if you know the number of iterations (e.g., "loop N times"), use for; if you don't know the count (e.g., "read until end of input"), use while. In competitions, for is used about 90% of the time.
Q2: How many levels deep can nested loops go? Is there a limit?
A: Syntactically there's no limit, but in practice you should be cautious beyond 3 levels. Two nested loops give O(N²), three give O(N³). When N ≥ 1000, three nested loops can easily time out. If you find yourself needing more than 3 levels of nesting, it usually means you need a more efficient algorithm (covered in later chapters).
Q3: break only exits the innermost loop. How do I break out of multiple nested loops at once?
A: Two common approaches: ① Use a bool found = false flag variable, and have the outer loop also check !found; ② Wrap the nested loops in a function and use return to exit directly. Approach ① is more common — see Section 2.2.6 for a complete example.
Q4: Which is faster, switch or if-else if?
A: For a small number of cases (< 10), performance is virtually identical. The advantage of switch is code readability, not speed. In competitions, you can freely choose either. If conditions involve range comparisons (like x > 10), you must use if-else.
Q5: My program produces correct output, but after submission it shows TLE (Time Limit Exceeded). What should I do?
A: Step one: estimate your algorithm's complexity. Look at the range of N → use the "nested loop complexity table" from this chapter to estimate total operations → if it exceeds 10^8, you need to optimize. Common optimization strategies include: reducing the number of loop levels, replacing brute-force search with sorting + binary search (Chapter 3.3), and replacing repeated summation with prefix sums (Chapter 3.2).
🔗 Connections to Later Chapters
- Chapter 2.3 (Functions & Arrays) will let you encapsulate the loop patterns from this chapter into functions, and use arrays to store collections of data
- Chapter 3.2 (Arrays & Prefix Sums) will teach you how to optimize O(N²) range sum queries to O(N) preprocessing + O(1) per query — one of the solutions for when "nested loops are too slow"
- Chapter 3.3 (Sorting & Searching) will teach you binary search, optimizing the O(N) linear search from this chapter to O(log N)
- The five classic loop patterns learned in this chapter (summation, finding max/min, counting, nested iteration, digit processing) are the foundational building blocks for all algorithms in this book
- Nested loop complexity analysis is the first step toward understanding time complexity (a theme throughout the entire book)
Practice Problems
🌡️ Warm-Up Problems
Warm-up 2.2.1 — Count to Ten
Print the numbers 1 through 10, each on its own line. Use a for loop.
💡 Solution (click to reveal)
Approach: A for loop from 1 to 10 (inclusive).
#include <bits/stdc++.h>
using namespace std;
int main() {
for (int i = 1; i <= 10; i++) {
cout << i << "\n";
}
return 0;
}
Key points:
- i <= 10 (not i < 10) because we want to include 10
- Alternatively: for (int i = 1; i < 11; i++) — same result
Warm-up 2.2.2 — Even Numbers
Print all even numbers from 2 to 20, each on its own line.
💡 Solution (click to reveal)
Approach: Two options — loop by 2s, or loop every number and check if even.
#include <bits/stdc++.h>
using namespace std;
int main() {
// Option 1: step by 2
for (int i = 2; i <= 20; i += 2) {
cout << i << "\n";
}
return 0;
}
Key points:
- i += 2 increments by 2 each time instead of the usual 1
- Alternative: for (int i = 1; i <= 20; i++) { if (i % 2 == 0) cout << i << "\n"; }
Warm-up 2.2.3 — Sign Check
Read one integer. Print Positive if it's > 0, Negative if it's < 0, Zero if it's 0.
Sample Input: -5 → Output: Negative
💡 Solution (click to reveal)
Approach: Three-way if/else if/else to cover all cases.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
if (n > 0) {
cout << "Positive\n";
} else if (n < 0) {
cout << "Negative\n";
} else {
cout << "Zero\n";
}
return 0;
}
Key points:
- The else clause at the end catches exactly n == 0 (since the two conditions above cover n > 0 and n < 0)
Warm-up 2.2.4 — Multiplication Table of 3
Print the first 10 multiples of 3 (i.e., 3, 6, 9, ..., 30), each on its own line.
💡 Solution (click to reveal)
Approach: Loop from 1 to 10, print i*3 each time.
#include <bits/stdc++.h>
using namespace std;
int main() {
for (int i = 1; i <= 10; i++) {
cout << i * 3 << "\n";
}
return 0;
}
Key points:
- Alternative: for (int i = 3; i <= 30; i += 3) — same result
Warm-up 2.2.5 — Sum of Five
Read exactly 5 integers (on separate lines or the same line). Print their sum.
Sample Input: 3 7 2 8 5 → Output: 25
💡 Solution (click to reveal)
Approach: Read 5 times in a loop, accumulate sum.
#include <bits/stdc++.h>
using namespace std;
int main() {
long long sum = 0;
for (int i = 0; i < 5; i++) {
int x;
cin >> x;
sum += x;
}
cout << sum << "\n";
return 0;
}
Key points:
- sum should be long long in case the integers are large
- We read exactly 5 times since the problem says "exactly 5 integers"
🏋️ Core Practice Problems
Problem 2.2.6 — FizzBuzz
The classic programming challenge: print numbers from 1 to 100. But:
- If the number is divisible by 3, print Fizz instead
- If divisible by 5, print Buzz instead
- If divisible by both 3 and 5, print FizzBuzz instead
First few lines of output:
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
💡 Solution (click to reveal)
Approach: Loop 1 to 100. For each number, check divisibility — check the combined case (divisible by both) FIRST, otherwise that case would be caught by the Fizz or Buzz case alone.
#include <bits/stdc++.h>
using namespace std;
int main() {
for (int i = 1; i <= 100; i++) {
if (i % 3 == 0 && i % 5 == 0) {
cout << "FizzBuzz\n";
} else if (i % 3 == 0) {
cout << "Fizz\n";
} else if (i % 5 == 0) {
cout << "Buzz\n";
} else {
cout << i << "\n";
}
}
return 0;
}
Key points:
- Check i % 3 == 0 && i % 5 == 0 FIRST — if you check i % 3 == 0 first, then 15 would print "Fizz" and never reach the FizzBuzz case
- A number divisible by both 3 and 5 is divisible by 15: i % 15 == 0 also works
Problem 2.2.7 — Minimum of N
Read N (1 ≤ N ≤ 1000), then read N integers. Print the minimum value.
Sample Input:
5
8 3 7 1 9
Sample Output: 1
💡 Solution (click to reveal)
Approach: Initialize min to the first value read, then update whenever we see something smaller.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
int first;
cin >> first;
int minVal = first; // initialize to first element
for (int i = 1; i < n; i++) { // read remaining n-1 elements
int x;
cin >> x;
if (x < minVal) {
minVal = x;
}
}
cout << minVal << "\n";
return 0;
}
Key points:
- Initialize minVal to the first element read (not 0 or INT_MIN — those could be smaller than every input, leaving a wrong answer), then handle the remaining elements in the loop
- Alternatively, use INT_MAX as the initial value: int minVal = INT_MAX; — no int can exceed it, so the running minimum still comes out correct
Problem 2.2.8 — Count Positives
Read N (1 ≤ N ≤ 1000), then read N integers. Print how many of them are strictly positive (> 0).
Sample Input:
6
3 -1 0 5 -2 7
Sample Output: 3
💡 Solution (click to reveal)
Approach: Maintain a counter, increment when the condition (x > 0) is met.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
int count = 0;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
if (x > 0) {
count++;
}
}
cout << count << "\n";
return 0;
}
Key points:
- count starts at 0 and increments only when x > 0
- 0 is NOT positive (not negative either — it's zero), so x > 0 correctly excludes it
Problem 2.2.9 — Star Triangle
Read N. Print a right triangle of * characters with N rows, where row i has i stars.
Sample Input: 4
Sample Output:
*
**
***
****
💡 Solution (click to reveal)
Approach: Nested loops — outer loop over rows, inner loop prints the right number of stars.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
for (int row = 1; row <= n; row++) {
for (int star = 1; star <= row; star++) {
cout << "*";
}
cout << "\n";
}
return 0;
}
Key points:
- Row 1 has 1 star, row 2 has 2 stars, ..., row N has N stars
- The inner loop runs exactly row times for each value of row
- Alternative using string: cout << string(row, '*') << "\n"; — creates a string of row copies of *
Problem 2.2.10 — Sum of Digits
Read a positive integer N (1 ≤ N ≤ 10^9). Print the sum of its digits.
Sample Input: 12345 → Sample Output: 15
Sample Input: 9999 → Sample Output: 36
💡 Solution (click to reveal)
Approach: Use the modulo trick. N % 10 gives the last digit. N / 10 removes the last digit. Repeat until N becomes 0.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
int digitSum = 0;
while (n > 0) {
digitSum += n % 10; // add last digit
n /= 10; // remove last digit
}
cout << digitSum << "\n";
return 0;
}
Key points:
- n % 10 extracts the ones digit (e.g., 12345 % 10 = 5)
- n /= 10 is integer division, removing the last digit (e.g., 12345 / 10 = 1234)
- The loop continues until n = 0 (all digits extracted)
- Trace: 12345 → +5 → 1234 → +4 → 123 → +3 → 12 → +2 → 1 → +1 → 0. Sum = 15 ✓
🏆 Challenge Problems
Challenge 2.2.11 — Collatz Sequence
The Collatz sequence starting from N works as follows:
- If N is even: next = N / 2
- If N is odd: next = N * 3 + 1
- Stop when N = 1
Read N. Print the entire sequence (including N and 1). Also print how many steps it takes to reach 1.
Sample Input: 6
Sample Output:
6 3 10 5 16 8 4 2 1
Steps: 8
💡 Solution (click to reveal)
Approach: Use a while loop. Keep applying the rule until we reach 1. Count steps.
#include <bits/stdc++.h>
using namespace std;
int main() {
long long n;
cin >> n;
int steps = 0;
cout << n; // print starting number
while (n != 1) {
if (n % 2 == 0) {
n = n / 2;
} else {
n = n * 3 + 1;
}
cout << " " << n; // print each next number
steps++;
}
cout << "\n";
cout << "Steps: " << steps << "\n";
return 0;
}
Key points:
- Use long long — even starting from small numbers, the sequence can reach large intermediate values (e.g., N=27 reaches 9232!)
- The Collatz conjecture says this always reaches 1, but it's not proven for all N
- We print N before the loop (as the starting value), then print each new value after each step
Challenge 2.2.12 — Prime Check
Read N (2 ≤ N ≤ 10^6). Print prime if N is prime, composite otherwise.
A number is prime if it has no divisors other than 1 and itself.
Sample Input: 17 → Output: prime
Sample Input: 100 → Output: composite
💡 Solution (click to reveal)
Approach: Trial division — check if any number from 2 to √N divides N. If none do, N is prime. We only need to check up to √N because if N = a×b and a > √N, then b < √N (so we would have found b already).
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
bool isPrime = true;
if (n < 2) {
isPrime = false;
} else {
// Check divisors from 2 to sqrt(n)
for (int i = 2; (long long)i * i <= n; i++) {
if (n % i == 0) {
isPrime = false;
break; // found a divisor, no need to continue
}
}
}
cout << (isPrime ? "prime" : "composite") << "\n";
return 0;
}
Key points:
- We check i * i <= n instead of i <= sqrt(n) to avoid floating-point issues (and it's slightly faster)
- The (long long)i * i cast prevents overflow when i is large (e.g., i = 1000000, i*i = 10^12)
- break exits the loop early as soon as we find any divisor — no need to keep checking
- Time complexity: O(√N), so this handles N up to 10^6 easily (√10^6 = 1000 iterations)
Challenge 2.2.13 — Highest Rated Cow
Read N (1 ≤ N ≤ 1000), then read N pairs of (cow name, rating). Find and print the name of the cow with the highest rating.
Sample Input:
4
Bessie 95
Elsie 82
Moo 95
Daisy 88
Sample Output: Bessie
(If there's a tie, print the name of the first one that appeared.)
💡 Solution (click to reveal)
Approach: Track the best rating and name seen so far. Update whenever we see a strictly higher rating.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
string bestName;
int bestRating = -1; // initialize to -1 so any real rating beats it
for (int i = 0; i < n; i++) {
string name;
int rating;
cin >> name >> rating;
if (rating > bestRating) {
bestRating = rating;
bestName = name;
}
}
cout << bestName << "\n";
return 0;
}
Key points:
- Initialize bestRating = -1 (or use INT_MIN) so the first cow always becomes the new best
- We use > (strictly greater), not >=, so in case of a tie, we keep the first one seen (the problem asks for first)
- Mixing cin >> name >> rating reads a string and then an int from the same line — this works perfectly
Chapter 2.3: Functions & Arrays
📝 Prerequisites: Chapters 2.1 & 2.2 (variables, loops, if/else)
As your programs grow larger, you need ways to organize code (functions) and store collections of data (arrays and vectors). This chapter introduces both — two of the most important tools in competitive programming.
2.3.1 Functions — What and Why
🍕 The Recipe Analogy
A function is like a pizza recipe:
- Input (parameters): ingredients — flour, cheese, tomatoes
- Process (body): the cooking steps
- Output (return value): the finished pizza
Just like you can make many pizzas using one recipe,
you can call a function many times with different inputs.
pizza("thin crust", "pepperoni") → one pizza
pizza("thick crust", "mushroom") → another pizza
Without functions, if you need to compute "is this number prime?" in five different places, you'd copy-paste the same 10 lines of code five times. Then if you find a bug, you have to fix it in all five places!
When to Write a Function
Use a function when:
- You repeat the same logic 3+ times in your program
- A block of code does one clear, named thing (e.g., "check if prime", "compute distance")
- Your main is getting too long to read comfortably
Basic Function Syntax
returnType functionName(parameter1Type param1, parameter2Type param2, ...) {
// function body
return value; // must match returnType; omit for void functions
}
Your First Functions
#include <bits/stdc++.h>
using namespace std;
// ---- FUNCTION DEFINITIONS (must come BEFORE they are used, or use prototypes) ----
// Takes one integer, returns its square
int square(int x) {
return x * x;
}
// Takes two integers, returns the larger one
int maxOf(int a, int b) {
if (a > b) return a;
else return b;
}
// void function: does something but doesn't return a value
void printSeparator() {
cout << "====================\n";
}
// ---- MAIN ----
int main() {
cout << square(5) << "\n"; // calls square with x=5, prints 25
cout << square(12) << "\n"; // calls square with x=12, prints 144
cout << maxOf(7, 3) << "\n"; // prints 7
cout << maxOf(-5, -2) << "\n"; // prints -2
printSeparator(); // prints the divider line
cout << "Done!\n";
printSeparator();
return 0;
}
🤔 Why do functions come before main?
C++ reads your file top-to-bottom. When it sees a call like square(5), it needs to already know what square means. If square is defined after main, the compiler will say "I've never heard of square!"
Solution 1: Define all functions above main (simplest approach).
Solution 2: Use a function prototype — a forward declaration telling the compiler "this function exists, I'll define it later":
#include <bits/stdc++.h>
using namespace std;
int square(int x); // prototype — just the signature, no body
int maxOf(int a, int b); // prototype
int main() {
cout << square(5) << "\n"; // OK! compiler knows square exists
return 0;
}
// Full definitions can come after main
int square(int x) {
return x * x;
}
int maxOf(int a, int b) {
return (a > b) ? a : b;
}
2.3.2 Void Functions vs Return Functions
void functions: Do something, return nothing
// void functions perform an action
void printLine(int n) {
for (int i = 0; i < n; i++) {
cout << "-";
}
cout << "\n";
}
// Calling a void function — just call it, don't try to capture a value
printLine(10); // prints: ----------
printLine(20); // prints: --------------------
Return functions: Compute and give back a value
// Returns the absolute value of x
int absoluteValue(int x) {
if (x < 0) return -x;
return x;
}
// Calling a return function — capture the result in a variable or use it directly
int result = absoluteValue(-7);
cout << result << "\n"; // 7
cout << absoluteValue(-3) << "\n"; // 3 (used directly)
Multiple return statements
A function can have multiple return statements — execution stops at the first one reached:
string classify(int n) {
if (n < 0) return "negative"; // exits here if n < 0
if (n == 0) return "zero"; // exits here if n == 0
return "positive"; // exits here otherwise
}
cout << classify(-5) << "\n"; // negative
cout << classify(0) << "\n"; // zero
cout << classify(3) << "\n"; // positive
2.3.3 Pass by Value vs Pass by Reference
When you pass a variable to a function, there are two ways it can happen. Understanding this is crucial.
Pass by Value (default): Function gets a COPY
void addOne_byValue(int x) {
x++; // modifies the LOCAL COPY — original is unchanged
cout << "Inside function: " << x << "\n"; // 6
}
int main() {
int n = 5;
addOne_byValue(n);
cout << "After function: " << n << "\n"; // still 5! original unchanged
return 0;
}
Think of it like a photocopy: the function works on a photocopy of the paper. Changes to the photocopy don't affect the original.
Pass by Reference (&): Function works on the ORIGINAL
void addOne_byRef(int& x) { // & means "reference to the original"
x++; // modifies the ORIGINAL variable directly
cout << "Inside function: " << x << "\n"; // 6
}
int main() {
int n = 5;
addOne_byRef(n);
cout << "After function: " << n << "\n"; // now 6! original was changed
return 0;
}
When to use each
| Use pass by value when... | Use pass by reference when... |
|---|---|
| Function shouldn't modify original | Function needs to modify original |
| Small types (int, double, char) | Returning multiple values |
| You want safety (no side effects) | Large types (avoiding expensive copy) |
Multiple Return Values via References
A C++ function can only return one value. But you can "return" multiple values through reference parameters:
// Computes both quotient AND remainder simultaneously
void divmod(int a, int b, int& quotient, int& remainder) {
quotient = a / b;
remainder = a % b;
}
int main() {
int q, r;
divmod(17, 5, q, r); // q and r are modified by the function
cout << "17 / 5 = " << q << " remainder " << r << "\n";
// prints: 17 / 5 = 3 remainder 2
return 0;
}
2.3.4 Recursion
A recursive function is one that calls itself. It's perfect for problems that break down into smaller versions of the same problem.
Classic Example: Factorial
5! = 5 × 4 × 3 × 2 × 1 = 120
= 5 × (4!) ← same problem, smaller input!
💡 Three-Step Recursive Thinking:
- Find "self-similarity": Can the original problem be broken into smaller problems of the same type? 5! = 5 × 4!, and 4! and 5! are the same type ✓
- Identify the base case: What is the smallest case? 0! = 1, cannot be broken down further
- Write the inductive step: n! = n × (n-1)!, call yourself with smaller input
This thinking process will be used repeatedly in Graph Algorithms (Chapter 5.1) and Dynamic Programming (Chapters 6.1–6.3).
int factorial(int n) {
if (n == 0) return 1; // BASE CASE: stop recursing
return n * factorial(n - 1); // RECURSIVE CASE: reduce to smaller problem
}
Tracing factorial(4):
factorial(4)
= 4 * factorial(3)
= 4 * (3 * factorial(2))
= 4 * (3 * (2 * factorial(1)))
= 4 * (3 * (2 * (1 * factorial(0))))
= 4 * (3 * (2 * (1 * 1))) ← base case!
= 4 * (3 * (2 * 1))
= 4 * (3 * 2)
= 4 * 6
= 24 ✓
Every recursive function needs:
- A base case — stops the recursion (prevents infinite recursion)
- A recursive case — calls itself with a smaller input
🐛 Common Bug: Forgetting the base case → infinite recursion → "Stack Overflow" crash!
2.3.5 Arrays — Fixed Collections
🏠 The Mailbox Analogy
An array is like a row of mailboxes on a street:
- All mailboxes are the same size (same type)
- Each has a number on the door (the index, starting from 0)
- You can go directly to any mailbox by its number
Visual: Array Memory Layout
Arrays are stored as consecutive blocks of memory. Each element sits right next to the previous one, allowing O(1) random access.
Array Basics
#include <bits/stdc++.h>
using namespace std;
int main() {
// Declare an array of 5 integers (elements are uninitialized — garbage values!)
int arr[5];
// Assign values one by one
arr[0] = 10;
arr[1] = 20;
arr[2] = 30;
arr[3] = 40;
arr[4] = 50;
// Declare AND initialize at the same time
int nums[5] = {1, 2, 3, 4, 5};
// Initialize all elements to zero
int zeros[100] = {}; // all 100 elements = 0
int zeros2[100];
fill(zeros2, zeros2 + 100, 0); // another way
// Access and print
cout << arr[2] << "\n"; // 30
// Loop through the array
for (int i = 0; i < 5; i++) {
cout << nums[i] << " "; // 1 2 3 4 5
}
cout << "\n";
return 0;
}
🐛 The Off-By-One Error — The #1 Array Bug
Arrays are 0-indexed: if you declare int arr[5], valid indices are 0, 1, 2, 3, 4. There is NO arr[5]!
int arr[5] = {10, 20, 30, 40, 50};
// WRONG: loop goes from i=0 to i=5 inclusive — index 5 doesn't exist!
for (int i = 0; i <= 5; i++) { // BUG: <= 5 should be < 5
cout << arr[i]; // CRASH or garbage value when i=5
}
// CORRECT: loop from i=0 to i=4 (i < 5 ensures i never reaches 5)
for (int i = 0; i < 5; i++) { // i goes: 0, 1, 2, 3, 4 ✓
cout << arr[i]; // always valid
}
This is called an "off-by-one error" — going one element past the end. It's the single most common array bug in competitive programming.
🤔 Why start at 0? C++ inherited this from C, which was designed close to hardware. The index is actually an offset from the start of the array. The first element is at offset 0 (no offset from the beginning).
Global Arrays for Large Sizes
The local variables inside main live on the "stack," which has limited space (~1-8 MB). For competitive programming with N up to 10^6, you need global arrays (live in a different memory area, much larger):
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 1000001; // max size + 1 (common convention)
int arr[MAXN]; // declared globally — safe for large sizes
// Global arrays are automatically initialized to 0!
int main() {
int n;
cin >> n;
for (int i = 0; i < n; i++) {
cin >> arr[i];
}
return 0;
}
⚡ Pro Tip: Global arrays are initialized to 0 automatically. Local arrays are NOT — they contain garbage values until you assign them!
2.3.6 Common Array Algorithms
Find Sum, Max, Min
int n;
cin >> n;
vector<int> arr(n); // we'll learn vectors soon; this works like an array
for (int i = 0; i < n; i++) cin >> arr[i];
// Sum
long long sum = 0;
for (int i = 0; i < n; i++) sum += arr[i];
cout << "Sum: " << sum << "\n";
// Max (initialize to first element!)
int maxVal = arr[0];
for (int i = 1; i < n; i++) {
if (arr[i] > maxVal) maxVal = arr[i];
}
cout << "Max: " << maxVal << "\n";
// Min (same idea)
int minVal = arr[0];
for (int i = 1; i < n; i++) {
minVal = min(minVal, arr[i]); // min() is a built-in function
}
cout << "Min: " << minVal << "\n";
Complexity Analysis:
- Time: O(N) — each algorithm only needs one pass through the array
- Space: O(1) — only a few extra variables (not counting the input array itself)
Reverse an Array
int arr[] = {1, 2, 3, 4, 5};
int n = 5;
// Swap elements from both ends, moving toward the middle
for (int i = 0, j = n - 1; i < j; i++, j--) {
swap(arr[i], arr[j]); // swap() is a built-in function
}
// arr is now {5, 4, 3, 2, 1}
Complexity Analysis:
- Time: O(N) — each pair of elements is swapped once, N/2 swaps total
- Space: O(1) — in-place swap, no extra array needed
Two-Dimensional Arrays
A 2D array is like a table or grid. Perfect for maps, grids, matrices:
int grid[3][4]; // 3 rows, 4 columns
// Fill with i * 10 + j
for (int r = 0; r < 3; r++) {
for (int c = 0; c < 4; c++) {
grid[r][c] = r * 10 + c;
}
}
// Print
for (int r = 0; r < 3; r++) {
for (int c = 0; c < 4; c++) {
cout << grid[r][c] << "\t";
}
cout << "\n";
}
Output:
0 1 2 3
10 11 12 13
20 21 22 23
2.3.7 Vectors — Dynamic Arrays
Arrays have a major limitation: their size must be known at compile time (or must be declared large enough in advance). Vectors solve this — they can grow and shrink as needed while your program is running.
Array vs Vector Comparison
| Feature | Array | Vector |
|---|---|---|
| Size | Fixed at compile time | Can grow/shrink at runtime |
| Read N elements | Must hardcode or use MAXN | push_back(x) works naturally |
| Memory location | Stack (fast, limited) | Heap (slightly slower, much larger) |
| Syntax | int arr[5] | vector<int> v(5) |
| Preferred in competitive programming | For fixed-size, simple cases | For most problems |
Vector Basics
#include <bits/stdc++.h>
using namespace std;
int main() {
// Create an empty vector
vector<int> v;
// Add elements to the back with push_back
v.push_back(10); // v = [10]
v.push_back(20); // v = [10, 20]
v.push_back(30); // v = [10, 20, 30]
// Access by index (same as arrays, 0-indexed)
cout << v[0] << "\n"; // 10
cout << v[1] << "\n"; // 20
// Useful functions
cout << v.size() << "\n"; // 3 (number of elements)
cout << v.front() << "\n"; // 10 (first element)
cout << v.back() << "\n"; // 30 (last element)
cout << v.empty() << "\n"; // 0 (false — not empty)
// Remove last element
v.pop_back(); // v = [10, 20]
// Clear all elements
v.clear(); // v = []
cout << v.empty() << "\n"; // 1 (true — now empty)
return 0;
}
Creating Vectors With Initial Values
vector<int> zeros(10, 0); // ten 0s: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
vector<int> ones(5, 1); // five 1s: [1, 1, 1, 1, 1]
vector<int> primes = {2, 3, 5, 7, 11}; // initialized from list
vector<int> empty; // empty vector
Iterating Over a Vector
vector<int> v = {10, 20, 30, 40, 50};
// Method 1: index-based (like arrays)
for (int i = 0; i < (int)v.size(); i++) {
cout << v[i] << " ";
}
cout << "\n";
// Method 2: range-based for loop (cleaner, preferred)
for (int x : v) {
cout << x << " ";
}
cout << "\n";
// Method 3: range-based with reference (use when modifying)
for (int& x : v) {
x *= 2; // doubles each element in-place
}
🤔 Why (int)v.size() in the index-based loop? v.size() returns an unsigned integer. If you compare int i with an unsigned value, C++ can behave unexpectedly (especially if i goes negative, because it is converted to a huge unsigned value). Casting to (int) is the safe habit.
The Standard USACO Pattern with Vectors
int n;
cin >> n;
vector<int> arr(n); // create vector of size n
for (int i = 0; i < n; i++) {
cin >> arr[i]; // read into each position
}
// Now process arr...
sort(arr.begin(), arr.end()); // sort ascending
2D Vectors
int rows = 3, cols = 4;
vector<vector<int>> grid(rows, vector<int>(cols, 0)); // 3×4 grid of 0s
// Access: grid[r][c]
grid[1][2] = 42;
cout << grid[1][2] << "\n"; // 42
2.3.8 Passing Arrays and Vectors to Functions
Arrays
When you pass an array to a function, the function receives a pointer to the first element. Changes inside the function affect the original:
void fillSquares(int arr[], int n) { // arr[] syntax for array parameter
for (int i = 0; i < n; i++) {
arr[i] = i * i; // modifies the original!
}
}
int main() {
int arr[5] = {0};
fillSquares(arr, 5);
// arr is now {0, 1, 4, 9, 16}
for (int i = 0; i < 5; i++) cout << arr[i] << " ";
cout << "\n";
return 0;
}
Vectors
Vectors by default are copied when passed to functions (expensive for large vectors!). Use & to pass by reference:
// Pass by value — makes a copy (SLOW for large vectors)
void printVec(vector<int> v) {
for (int x : v) cout << x << " ";
}
// Pass by reference — no copy, CAN modify original (use for output params)
void sortVec(vector<int>& v) {
sort(v.begin(), v.end());
}
// Pass by const reference — no copy, CANNOT modify (best for read-only)
void printVecFast(const vector<int>& v) {
for (int x : v) cout << x << " ";
}
⚡ Pro Tip: For any vector parameter that you're only reading (not modifying), always write
const vector<int>&. It avoids the copy and also signals to readers that the function won't change the vector.
⚠️ Common Mistakes in Chapter 2.3
| # | Mistake | Example | Why It's Wrong | Fix |
|---|---|---|---|---|
| 1 | Off-by-one array out-of-bounds | arr[n] when array size is n | Valid indices are 0 to n-1, arr[n] is out-of-bounds | Use i < n instead of i <= n |
| 2 | Forgot recursive base case | int f(int n) { return n*f(n-1); } | Never stops, causes stack overflow crash | Add if (n == 0) return 1; |
| 3 | Recursive function receives invalid (e.g. negative) argument | factorial(-1) | Base case only handles n == 0; negative values cause infinite recursion → stack overflow | Before calling, ensure the input is within the valid range, or add a guard at the start of the function: if (n < 0) return -1; |
| 4 | Vector passed by value causes performance issue | void f(vector<int> v) | Copies entire vector, very slow when N is large | Use const vector<int>& v |
| 5 | Local array uninitialized | int arr[100]; sum += arr[50]; | Local arrays are not auto-zeroed, contain garbage values | Use = {} to initialize or use global arrays |
| 6 | Array too large inside main | int main() { int arr[1000000]; } | Exceeds stack memory limit (usually 1-8 MB), program crashes | Put large arrays outside main (global) |
| 7 | Function defined after call | main calls square(5) but square is defined below main | Compiler does not recognize undefined functions | Define function before main, or use function prototype |
Chapter Summary
📌 Key Takeaways
| Concept | Key Points | Why It Matters |
|---|---|---|
| Functions | Define once, call anywhere | Reduce duplicate code, improve readability |
| Return types | int, double, bool, void | Use different return types for different scenarios |
| Pass by value | Function gets a copy, original unchanged | Safe, no side effects |
| Pass by reference (&) | Function operates on original variable | Can modify original, avoids copying large objects |
| Recursion | Function calls itself, must have base case | Foundation of divide & conquer, backtracking, DP |
| Arrays | Fixed size, 0-indexed, O(1) random access | Most fundamental data structure in competitive programming |
| Global arrays | Avoid stack overflow, auto-initialized to 0 | Must use global arrays when N exceeds 10^5 |
| vector<int> | Dynamic array, variable size | Preferred data container in competitive programming |
| push_back / pop_back | Add/remove at end | O(1) operation, primary way to build dynamic collections |
| Prefix Sum | Preprocess O(N), query O(1) | Core technique for range sum queries, covered in depth in Chapter 3.2 |
❓ FAQ
Q1: Which is better, arrays or vectors?
A: Both are common in competitive programming. Rule of thumb: if the size is fixed and known, global arrays are simplest; if the size changes dynamically or needs to be passed to functions, use vector. Many contestants default to vector because it is more flexible and less error-prone.
Q2: Is there a limit to recursion depth? Can it crash?
A: Yes. Each function call allocates space on the stack, and the default stack size is about 1-8 MB. In practice, about 10^4 ~ 10^5 levels of recursion are supported. If exceeded, the program crashes with a "stack overflow". In contests, if recursion depth may exceed 10^4, consider switching to an iterative (loop) approach.
Q3: When should I use pass by reference (&)?
A: Two cases: ① You need to modify the original variable inside the function; ② The parameter is a large object (like vector or string) and you want to avoid copy overhead. For small types like int and double, copy overhead is negligible, so pass by value is fine.
Q4: Can a function return an array or vector?
A: Arrays cannot be returned directly, but vector can! vector<int> solve() { ... return result; } is perfectly valid. Modern C++ compilers optimize the return process (return value optimization, RVO), so the entire vector is not actually copied.
Q5: Why does the prefix sum array have one extra index? prefix[n+1] instead of prefix[n]?
A: prefix[0] = 0 is a "sentinel value" that makes the formula prefix[R+1] - prefix[L] work in all cases. Without this sentinel, querying [0, R] (the case L = 0) would require special handling. This is a very common programming trick: use an extra sentinel value to simplify boundary handling.
🔗 Connections to Later Chapters
- Chapter 3.1 (STL Essentials) will introduce tools like sort, binary_search, and pair, letting you accomplish in one line what this chapter implements by hand
- Chapter 3.2 (Prefix Sums) will dive deeper into the prefix sum technique introduced in Problem 2.3.10, including 2D prefix sums and difference arrays
- Chapter 5.1 (Introduction to Graphs) will build on the recursion foundation in Section 2.3.4 to teach graph traversals like DFS and BFS
- Chapters 6.1–6.3 (Dynamic Programming): the core idea of "breaking large problems into smaller ones" is closely related to recursion; this chapter's recursive thinking is important groundwork
- The function encapsulation and array/vector operations learned in this chapter will be used continuously in every subsequent chapter
Practice Problems
🌡️ Warm-Up Problems
Warm-up 2.3.1 — Square Function
Write a function int square(int x) that returns x². In main, read one integer and print its square.
Sample Input: 7 → Sample Output: 49
💡 Solution (click to reveal)
Approach: Write the function above main, call it with the input.
#include <bits/stdc++.h>
using namespace std;
int square(int x) {
return x * x;
}
int main() {
int n;
cin >> n;
cout << square(n) << "\n";
return 0;
}
Key points:
- Function defined above main so the compiler knows about it
- return x * x; — C++ evaluates x * x and returns the result
- Use long long if x can be large (e.g., x up to 10^9, then x² up to 10^18)
Warm-up 2.3.2 — Max of Two
Write a function int myMax(int a, int b) that returns the larger of two integers. In main, read two integers and print the larger.
Sample Input: 13 7 → Sample Output: 13
💡 Solution (click to reveal)
Approach: Compare a and b, return whichever is larger.
#include <bits/stdc++.h>
using namespace std;
int myMax(int a, int b) {
if (a > b) return a;
return b;
}
int main() {
int a, b;
cin >> a >> b;
cout << myMax(a, b) << "\n";
return 0;
}
Key points:
- C++ has a built-in max(a, b) function — but writing your own teaches the concept
- Alternative using the ternary operator: return (a > b) ? a : b;
Warm-up 2.3.3 — Reverse Array
Declare an array of exactly 5 integers: {1, 2, 3, 4, 5}. Print them in reverse order (no input needed).
Expected Output:
5 4 3 2 1
💡 Solution (click to reveal)
Approach: Loop from index 4 down to 0 (backwards).
#include <bits/stdc++.h>
using namespace std;
int main() {
int arr[5] = {1, 2, 3, 4, 5};
for (int i = 4; i >= 0; i--) {
cout << arr[i];
if (i > 0) cout << " ";
}
cout << "\n";
return 0;
}
Key points:
- Loop from index n-1 = 4 down to 0 (inclusive), using i--
- The if (i > 0) cout << " " avoids a trailing space — but for USACO, a trailing space is usually acceptable
Warm-up 2.3.4 — Vector Sum
Create a vector, push the values 10, 20, 30, 40, 50 into it using push_back, then print their sum.
Expected Output: 150
💡 Solution (click to reveal)
Approach: Create empty vector, push 5 values, loop to sum.
#include <bits/stdc++.h>
using namespace std;
int main() {
vector<int> v;
v.push_back(10);
v.push_back(20);
v.push_back(30);
v.push_back(40);
v.push_back(50);
long long sum = 0;
for (int x : v) {
sum += x;
}
cout << sum << "\n";
return 0;
}
Key points:
- Range-for for (int x : v) iterates over every element
- accumulate(v.begin(), v.end(), 0LL) is a one-liner alternative
Warm-up 2.3.5 — Hello N Times
Write a void function sayHello(int n) that prints "Hello!" exactly n times. Call it from main after reading n.
Sample Input: 3
Sample Output:
Hello!
Hello!
Hello!
💡 Solution (click to reveal)
Approach: A void function with a for loop inside.
#include <bits/stdc++.h>
using namespace std;
void sayHello(int n) {
for (int i = 0; i < n; i++) {
cout << "Hello!\n";
}
}
int main() {
int n;
cin >> n;
sayHello(n);
return 0;
}
Key points:
- void means the function returns nothing — no return value; needed (you can use a bare return; to exit early)
- The n in sayHello's parameter is a separate copy from the n in main (pass by value)
🏋️ Core Practice Problems
Problem 2.3.6 — Array Reverse
Read N (1 ≤ N ≤ 100), then read N integers. Print them in reverse order.
Sample Input:
5
1 2 3 4 5
Sample Output: 5 4 3 2 1
💡 Solution (click to reveal)
Approach: Store in a vector, then print from the last index to the first.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> arr(n);
for (int i = 0; i < n; i++) {
cin >> arr[i];
}
for (int i = n - 1; i >= 0; i--) {
cout << arr[i];
if (i > 0) cout << " ";
}
cout << "\n";
return 0;
}
Key points:
- vector<int> arr(n) creates a vector of size n (all zeros initially)
- We read into arr[i] just like an array
- Print from n-1 down to 0 inclusive
Problem 2.3.7 — Running Average
Read N (1 ≤ N ≤ 100), then read N integers one at a time. After reading each integer, print the average of all integers read so far (as a decimal with 2 decimal places).
Sample Input:
4
10 20 30 40
Sample Output:
10.00
15.00
20.00
25.00
💡 Solution (click to reveal)
Approach: Keep a running sum. After each new input, divide by how many we've read so far.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
long long sum = 0;
for (int i = 1; i <= n; i++) {
int x;
cin >> x;
sum += x;
double avg = (double)sum / i;
cout << fixed << setprecision(2) << avg << "\n";
}
return 0;
}
Key points:
- sum is updated with each new element; i is the count of elements read so far
- (double)sum / i — cast to double before dividing so we get a decimal result
- fixed << setprecision(2) forces exactly 2 decimal places
Problem 2.3.8 — Frequency Count
Read N (1 ≤ N ≤ 100) integers. Each integer is between 1 and 10 inclusive. Print how many times each value from 1 to 10 appears.
Sample Input:
7
3 1 2 3 3 1 7
Sample Output:
1 appears 2 times
2 appears 1 times
3 appears 3 times
4 appears 0 times
5 appears 0 times
6 appears 0 times
7 appears 1 times
8 appears 0 times
9 appears 0 times
10 appears 0 times
💡 Solution (click to reveal)
Approach: Use an array (or vector) as a "tally counter" — index 1 through 10 holds the count for that value.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
int freq[11] = {}; // indices 0-10; we'll use 1-10. Initialize all to 0.
for (int i = 0; i < n; i++) {
int x;
cin >> x;
freq[x]++; // increment the count for value x
}
for (int v = 1; v <= 10; v++) {
cout << v << " appears " << freq[v] << " times\n";
}
return 0;
}
Key points:
- freq[x]++ is a very common pattern — use the VALUE as the INDEX in a frequency array
- We declare freq[11] with indices 0-10 so that freq[10] is valid (index 10 for value 10)
- int freq[11] = {} — the = {} zero-initializes all elements
Problem 2.3.9 — Two Sum
Read N (1 ≤ N ≤ 100) integers and a target value T. Print YES if any two different elements in the array sum to T, NO otherwise.
Sample Input:
5 9
1 4 5 6 3
(N=5, T=9, then the array)
Sample Output: YES (because 4+5=9 or 3+6=9)
💡 Solution (click to reveal)
Approach: Check all pairs (i, j) where i < j. If any pair sums to T, print YES.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, t;
cin >> n >> t;
vector<int> arr(n);
for (int i = 0; i < n; i++) cin >> arr[i];
bool found = false;
for (int i = 0; i < n && !found; i++) {
for (int j = i + 1; j < n; j++) {
if (arr[i] + arr[j] == t) {
found = true;
break;
}
}
}
cout << (found ? "YES" : "NO") << "\n";
return 0;
}
Key points:
- Inner loop starts at j = i + 1 to avoid using the same element twice and checking duplicate pairs
- break + the && !found condition in the outer loop ensures we stop as soon as we find a match
- This is O(N²) — fine for N ≤ 100. For N up to 10^5, you'd use a set (Chapter 3.1)
Problem 2.3.10 — Prefix Sums
Read N (1 ≤ N ≤ 1000), then N integers. Then read Q queries (1 ≤ Q ≤ 1000), each with two integers L and R (0-indexed, inclusive). For each query, print the sum of elements from index L to R.
Sample Input:
5
1 2 3 4 5
3
0 2
1 3
2 4
Sample Output:
6
9
12
💡 Solution (click to reveal)
Why not sum directly for each query? Brute force: each query loops from L to R, time complexity O(N), all queries total O(N×Q). When N=10^5, Q=10^5, that is 10^10 operations — far exceeding the time limit.
Optimization idea: Preprocess the array once in O(N), then each query takes only O(1). Total time O(N+Q), much faster! This is the core idea of prefix sums (covered in depth in Chapter 3.2).
Approach: Build a prefix sum array where prefix[i] = sum of arr[0..i-1]. Then sum from L to R = prefix[R+1] - prefix[L]. This gives O(1) per query instead of O(N).
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<long long> arr(n), prefix(n + 1, 0);
for (int i = 0; i < n; i++) {
cin >> arr[i];
prefix[i + 1] = prefix[i] + arr[i]; // build prefix sum
}
// prefix[0] = 0
// prefix[1] = arr[0]
// prefix[2] = arr[0] + arr[1]
// prefix[i] = arr[0] + arr[1] + ... + arr[i-1]
int q;
cin >> q;
while (q--) {
int l, r;
cin >> l >> r;
// sum from l to r (inclusive) = prefix[r+1] - prefix[l]
cout << prefix[r + 1] - prefix[l] << "\n";
}
return 0;
}
Key points:
- prefix[i] = sum of the first i elements (prefix[0] = 0 is a sentinel)
- Sum of arr[L..R] = prefix[R+1] - prefix[L] — subtracting the part before L
- Check with sample: arr=[1,2,3,4,5], prefix=[0,1,3,6,10,15]. Query [0,2]: prefix[3]-prefix[0]=6-0=6 ✓
Complexity Analysis:
- Time: O(N + Q) — preprocess O(N) + each query O(1) × Q queries
- Space: O(N) — prefix sum array uses N+1 space
💡 Brute force vs optimized: Brute force O(N×Q) vs prefix sum O(N+Q). When N=Q=10^5, the former takes 10^10 operations (TLE), the latter only 2×10^5 operations (instant).
🏆 Challenge Problems
Challenge 2.3.11 — Rotate Array
Read N (1 ≤ N ≤ 1000) and K (0 ≤ K < N). Read N integers. Print the array rotated right by K positions (the last K elements wrap to the front).
Sample Input:
5 2
1 2 3 4 5
Sample Output: 4 5 1 2 3
💡 Solution (click to reveal)
Approach: The new array has element at original position (i - K + N) % N at position i. Equivalently, print elements starting from index N-K, wrapping around.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, k;
cin >> n >> k;
vector<int> arr(n);
for (int i = 0; i < n; i++) cin >> arr[i];
// Print n elements starting from index (n - k) % n, wrapping around
for (int i = 0; i < n; i++) {
int idx = (n - k + i) % n;
cout << arr[idx];
if (i < n - 1) cout << " ";
}
cout << "\n";
return 0;
}
Key points:
- Right rotate by K: last K elements come first, then first N-K elements
- (n - k + i) % n maps new position i to old position — the % n handles the wraparound
- Check: n=5, k=2. i=0: idx=(5-2+0)%5=3 → arr[3]=4. i=1: idx=4 → arr[4]=5. i=2: idx=0 → arr[0]=1. Correct!
Challenge 2.3.12 — Merge Sorted Arrays
Read N₁, then N₁ sorted integers. Read N₂, then N₂ sorted integers. Print the merged sorted array.
Sample Input:
3
1 3 5
4
2 4 6 8
Sample Output: 1 2 3 4 5 6 8
💡 Solution (click to reveal)
Approach: Use two pointers — one for each array. At each step, take the smaller of the two current elements.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n1;
cin >> n1;
vector<int> a(n1);
for (int i = 0; i < n1; i++) cin >> a[i];
int n2;
cin >> n2;
vector<int> b(n2);
for (int i = 0; i < n2; i++) cin >> b[i];
// Two-pointer merge
int i = 0, j = 0;
vector<int> result;
while (i < n1 && j < n2) {
if (a[i] <= b[j]) {
result.push_back(a[i++]); // take from a, advance i
} else {
result.push_back(b[j++]); // take from b, advance j
}
}
// One array may have leftover elements
while (i < n1) result.push_back(a[i++]);
while (j < n2) result.push_back(b[j++]);
for (int idx = 0; idx < (int)result.size(); idx++) {
cout << result[idx];
if (idx < (int)result.size() - 1) cout << " ";
}
cout << "\n";
return 0;
}
Key points:
- Two pointers i and j scan through arrays a and b simultaneously
- After the while loop, one array might still have elements — copy those directly
Challenge 2.3.13 — Smell Distance (Inspired by USACO Bronze)
N cows are standing in a line. Each cow has a position p[i] and a smell radius s[i]. A cow can smell another if the distance between them is at most the sum of their radii. Read N, then N pairs (position, radius). Print the number of pairs of cows that can smell each other.
Sample Input:
4
1 2
5 1
8 3
15 1
Sample Output: 1
(Pair (0,1): dist=|1-5|=4, radii sum=2+1=3. 4>3, NO. Pair (0,2): dist=|1-8|=7, sum=2+3=5. 7>5, NO. Pair (1,2): dist=|5-8|=3, sum=1+3=4. 3≤4, YES. Pair (0,3): 14>3 NO. Pair (1,3): 10>2 NO. Pair (2,3): 7>4 NO. Total: 1.)
💡 Solution (click to reveal)
Approach: Check all pairs (i, j) where i < j. For each pair, compute the distance and compare to the sum of their radii.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<long long> pos(n), rad(n);
for (int i = 0; i < n; i++) {
cin >> pos[i] >> rad[i];
}
int count = 0;
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
long long dist = abs(pos[i] - pos[j]);
long long sumRad = rad[i] + rad[j];
if (dist <= sumRad) {
count++;
}
}
}
cout << count << "\n";
return 0;
}
Key points:
- Check all pairs (i, j) with i < j to avoid counting the same pair twice
- abs(pos[i] - pos[j]) computes the absolute distance between positions
- Use long long in case positions and radii are large
🏗️ Part 3: Core Data Structures
The data structures that appear in nearly every USACO Bronze and Silver problem — prefix sums, sorting, two pointers, stacks, maps, and segment trees.
📚 11 Chapters · ⏱️ Estimated 2-3 weeks · 🎯 Target: Solve USACO Bronze problems
Part 3: Core Data Structures
Estimated time: 2–3 weeks
Part 3 is where competitive programming starts getting exciting. You'll learn the data structures that appear in nearly every USACO Bronze and Silver problem — and techniques that can turn O(N²) brute force into O(N) elegance.
What Topics Are Covered
| Chapter | Topic | The Big Idea |
|---|---|---|
| Chapter 3.1 | STL Essentials | Master the powerful built-in containers: sort, map, set, queue, stack |
| Chapter 3.2 | Arrays & Prefix Sums | Answer range sum queries in O(1) after O(N) preprocessing |
| Chapter 3.3 | Sorting & Searching | Sort + binary search turns many O(N²) problems into O(N log N) |
| Chapter 3.4 | Two Pointers & Sliding Window | Efficiently process subarrays/pairs with two coordinated pointers |
| Chapter 3.5 | Monotonic Stack & Monotonic Queue | Next greater element, sliding window max/min in O(N) |
| Chapter 3.6 | Stacks, Queues & Deques | Order-based data structures for LIFO/FIFO processing |
| Chapter 3.7 | Hashing Techniques | Fast key lookup, polynomial hashing, rolling hash |
| Chapter 3.8 | Maps & Sets | O(log N) lookup, unique collections, frequency counting |
| Chapter 3.9 | Introduction to Segment Trees | Efficient range queries and point updates in O(log N) |
| Chapter 3.10 | Fenwick Tree (BIT) | Efficient prefix-sum with point updates, inversion count |
| Chapter 3.11 | Binary Trees | Tree traversals, BST operations, balanced trees |
What You'll Be Able to Solve After This Part
After completing Part 3, you'll be ready to tackle:
- USACO Bronze: Most Bronze problems use Part 3 techniques
- Range queries (how many cows of type X in positions L to R?)
- Sorting problems (closest pair, ranking, scheduling)
- Frequency counting (how many times does each value appear?)
- Stack-based problems (balanced brackets, monotonic processing)
- USACO Silver Intro:
- Binary search on the answer (aggressive cows, rope cutting)
- Sliding window maximum/minimum
- Difference arrays for range updates
Key Algorithms Introduced
| Technique | Chapter | USACO Relevance |
|---|---|---|
| 1D Prefix Sum | 3.2 | Breed counting, range queries |
| 2D Prefix Sum | 3.2 | Rectangle sum queries on grids |
| Difference Array | 3.2 | Range update, point query |
| std::sort with custom comparator | 3.3 | Nearly every Silver problem |
| Binary search (lower_bound, upper_bound) | 3.3 | Counting, range queries |
| Binary search on answer | 3.3 | Aggressive cows, painter's partition |
| Monotonic stack | 3.5 | Next greater element, histogram |
| Sliding window (monotonic deque) | 3.5 | Window min/max |
| Frequency map (unordered_map) | 3.7 | Counting occurrences |
| Ordered set operations | 3.8 | K-th element, range queries |
Prerequisites
Before starting Part 3, make sure you can:
- Write and compile a C++ program from scratch (Chapter 2.1)
- Use for loops and nested loops correctly (Chapter 2.2)
- Work with arrays and vector<int> (Chapter 2.3)

Note: Chapter 3.1 (STL Essentials) is the first chapter of this part and will teach you std::sort, map, set, and other key STL containers before you need them in later chapters.
Tips for This Part
- Chapter 3.2 (Prefix Sums) is the most frequently tested technique in Bronze. Make sure you can implement it from scratch in 5 minutes.
- Chapter 3.3 (Binary Search) introduces "binary search on the answer" — this is a Silver-level technique that separates good solutions from great ones.
- Don't skip the practice problems. Each chapter's problems are specifically chosen to build the intuition you need.
- After finishing Chapter 3.3, you have enough tools for most USACO Bronze problems. Try solving 5–10 Bronze problems before continuing.
🏆 USACO Tip: At USACO Bronze, the most common techniques are: simulation (Chapters 2.1–2.3), sorting (Chapter 3.3), and prefix sums (Chapter 3.2). If you master these, you can solve almost any Bronze problem.
Let's dive in!
Chapter 3.2: Arrays & Prefix Sums
📝 Before You Continue: Make sure you're comfortable with arrays, vectors, and basic loops (Chapters 2.2–2.3). You'll also want to understand long long overflow (Chapter 2.1).
Imagine you have an array of N numbers, and someone asks you 100,000 times: "What is the sum of elements from index L to index R?" A naive approach recomputes the sum from scratch each time — that's O(N) per query, or O(N × Q) total. With N = Q = 10^5, that's 10^10 operations. Way too slow.
Prefix sums solve this in O(N) preprocessing and O(1) per query. This is one of the most elegant and useful techniques in all of competitive programming.
💡 Key Insight: Prefix sums transform a "range query" problem into a subtraction. Instead of summing L to R every time, you precompute cumulative sums and subtract two of them. This trades O(N) work per query for one-time O(N) preprocessing.
3.2.1 The Prefix Sum Idea
The prefix sum of an array is a new array where each element stores the cumulative sum up to that index.
Visual: Prefix Sum Array
The diagram above shows how the prefix sum array is constructed from the original array, and how a range query sum(L, R) = P[R] - P[L-1] is computed in O(1) time. The blue cells highlight a query range while the red and green cells show the two prefix values being subtracted.
Given array: A = [3, 1, 4, 1, 5, 9, 2, 6] (1-indexed for clarity)
Index: 1 2 3 4 5 6 7 8
A: 3 1 4 1 5 9 2 6
P: 3 4 8 9 14 23 25 31
Where P[i] = A[1] + A[2] + ... + A[i].
Why 1-Indexing?
Using 1-indexed arrays lets us define P[0] = 0 (the "empty prefix" sums to zero). This makes the query formula P[R] - P[L-1] work even when L = 1 — we'd compute P[R] - P[0] = P[R], which is correct.
Building the Prefix Sum Array
// Solution: Build Prefix Sum Array — O(N)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
// Step 1: Read input (1-indexed)
vector<int> A(n + 1);
for (int i = 1; i <= n; i++) cin >> A[i];
// Step 2: Build prefix sums
vector<long long> P(n + 1, 0); // P[0] = 0 (base case)
for (int i = 1; i <= n; i++) {
P[i] = P[i - 1] + A[i]; // ← KEY LINE: each P[i] = all elements up to i
}
return 0;
}
Complexity Analysis:
- Time: O(N) — one pass through the array
- Space: O(N) — stores the prefix array
Step-by-step trace for A = [3, 1, 4, 1, 5]:
i=1: P[1] = P[0] + A[1] = 0 + 3 = 3
i=2: P[2] = P[1] + A[2] = 3 + 1 = 4
i=3: P[3] = P[2] + A[3] = 4 + 4 = 8
i=4: P[4] = P[3] + A[4] = 8 + 1 = 9
i=5: P[5] = P[4] + A[5] = 9 + 5 = 14
3.2.2 Range Sum Queries in O(1)
Once you have the prefix sum array, the sum from index L to R is:
sum(L, R) = P[R] - P[L-1]
Why? P[R] = sum of elements 1..R. P[L-1] = sum of elements 1..(L-1). Their difference = sum of elements L..R.
💡 Key Insight: Think of P[i] as "the total sum of the first i elements." To get the sum of a window [L, R], you subtract the "prefix before L" from the "prefix through R." It's like: big triangle minus smaller triangle = trapezoid.
// Solution: Range Sum Queries — Preprocessing O(N), Each Query O(1)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
long long A[MAXN];
long long P[MAXN];
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q;
cin >> n >> q;
// Step 1: Read array
for (int i = 1; i <= n; i++) cin >> A[i];
// Step 2: Build prefix sum — O(n)
P[0] = 0;
for (int i = 1; i <= n; i++) {
P[i] = P[i - 1] + A[i];
}
// Step 3: Answer q range sum queries — O(1) each
for (int i = 0; i < q; i++) {
int l, r;
cin >> l >> r;
cout << P[r] - P[l - 1] << "\n"; // ← KEY LINE: range sum formula
}
return 0;
}
Sample Input:
8 3
3 1 4 1 5 9 2 6
1 4
3 7
2 6
Sample Output:
9
21
20
Verification:
- sum(1,4) = P[4] - P[0] = 9 - 0 = 9 → A[1]+A[2]+A[3]+A[4] = 3+1+4+1 = 9 ✓
- sum(3,7) = P[7] - P[2] = 25 - 4 = 21 → A[3]+...+A[7] = 4+1+5+9+2 = 21 ✓
- sum(2,6) = P[6] - P[1] = 23 - 3 = 20 → A[2]+...+A[6] = 1+4+1+5+9 = 20 ✓
⚠️ Common Mistake: Writing P[R] - P[L] instead of P[R] - P[L-1]. The formula includes both endpoints L and R — you want to subtract the sum before L, not the sum through L.
Total Complexity: O(N + Q) — perfect for N, Q up to 10^5.
3.2.3 USACO Example: Breed Counting
This is a classic USACO Bronze problem (2015 December).
Problem: N cows in a line. Each cow is breed 1, 2, or 3. Answer Q queries: how many cows of breed B are in positions L to R?
Solution: Maintain one prefix sum array per breed.
// Solution: Multi-Breed Prefix Sums — O(N + Q)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q;
cin >> n >> q;
vector<int> breed(n + 1);
vector<vector<long long>> P(4, vector<long long>(n + 1, 0));
// P[b][i] = number of cows of breed b in positions 1..i
// Step 1: Build prefix sums for each breed
for (int i = 1; i <= n; i++) {
cin >> breed[i];
for (int b = 1; b <= 3; b++) {
P[b][i] = P[b][i - 1] + (breed[i] == b ? 1 : 0); // ← KEY LINE
}
}
// Step 2: Answer each query in O(1)
for (int i = 0; i < q; i++) {
int l, r, b;
cin >> l >> r >> b;
cout << P[b][r] - P[b][l - 1] << "\n";
}
return 0;
}
🏆 USACO Tip: Many USACO Bronze problems involve "count elements satisfying property X in a range." If Q is large, always consider prefix sums.
3.2.4 USACO-Style Problem Walkthrough: Farmer John's Grass Fields
🔗 Related Problem: This is a fictional USACO-style problem inspired by "Breed Counting" and "Tallest Cow" — both classic Bronze problems.
Problem Statement:
Farmer John has N fields in a row. Field i has grass[i] units of grass. He needs to answer Q queries: "What is the total grass in fields L through R (inclusive)?" With N, Q up to 10^5, he needs each query answered in O(1).
Sample Input:
6 4
4 2 7 1 8 3
1 3
2 5
4 6
1 6
Sample Output:
13
18
12
25
Step-by-Step Solution:
Step 1: Understand the problem. We have an array [4, 2, 7, 1, 8, 3] and need range sums.
Step 2: Build the prefix sum array.
Index: 0 1 2 3 4 5 6
grass: - 4 2 7 1 8 3
P: 0 4 6 13 14 22 25
Step 3: Answer queries using P[R] - P[L-1]:
- Query (1,3): `P[3] - P[0] = 13 - 0 = 13` ✓
- Query (2,5): `P[5] - P[1] = 22 - 4 = 18` ✓
- Query (4,6): `P[6] - P[3] = 25 - 13 = 12` ✓
- Query (1,6): `P[6] - P[0] = 25 - 0 = 25` ✓
Complete C++ Solution:
// Farmer John's Grass Fields — Prefix Sum Solution O(N + Q)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q;
cin >> n >> q;
// Step 1: Read grass values and build prefix sum simultaneously
vector<long long> P(n + 1, 0);
for (int i = 1; i <= n; i++) {
long long g;
cin >> g;
P[i] = P[i - 1] + g; // ← KEY LINE: incremental prefix sum
}
// Step 2: Answer each query in O(1)
while (q--) {
int l, r;
cin >> l >> r;
cout << P[r] - P[l - 1] << "\n";
}
return 0;
}
Why is this O(N + Q)?
- Building prefix sums: one loop, N iterations → `O(N)`
- Each query: one subtraction → `O(1)` per query, `O(Q)` total
- Total: `O(N + Q)` — much better than the `O(NQ)` brute force
⚠️ Common Mistake: Using `int` instead of `long long` for the prefix sum. If grass values are up to 10^9 and N = 10^5, the total could be up to 10^14 — way beyond `int`'s range of ~2×10^9.
3.2.5 Difference Arrays
The difference array is the inverse of prefix sums. It's useful when you need to add a value to a range of positions, then query final values.
Problem: Start with all zeros. Apply M updates: "add V to all positions from L to R." Then print the final array.
Naively, each update is O(R-L+1). With a difference array, each update is O(1), and reconstruction is O(N).
💡 Key Insight: Instead of adding V to every position in [L, R] (slow), we record "+V at position L" and "-V at position R+1" (fast). When we later do a prefix sum of these markers, the +V and -V "cancel out" outside [L,R], so the net effect is exactly adding V to [L,R].
// Solution: Difference Array for Range Updates — O(N + M)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
vector<long long> diff(n + 2, 0); // difference array (extra space for R+1 case)
// Step 1: Process all range updates in O(1) each
for (int i = 0; i < m; i++) {
int l, r, v;
cin >> l >> r >> v;
diff[l] += v; // ← KEY LINE: mark start of range
diff[r + 1] -= v; // ← KEY LINE: mark end+1 to undo the addition
}
// Step 2: Reconstruct the final array by taking prefix sums of diff
long long running = 0;
for (int i = 1; i <= n; i++) {
running += diff[i];
cout << running;
if (i < n) cout << " ";
}
cout << "\n";
return 0;
}
Sample Input:
5 3
1 3 2
2 5 3
3 4 -1
Step-by-step trace:
📝 Indexing note: In the trace below, the diff array is 1-indexed (diff[1]..diff[n+1]), matching the code's `vector<long long> diff(n + 2, 0)`. The bracketed numbers show diff[1], diff[2], ..., diff[6] (with n = 5 the vector has 7 slots; diff[1..6] are the ones actually used).
Initial state: diff[1..6] = [0, 0, 0, 0, 0, 0]
After update(1,3,+2): diff[1]+=2, diff[4]-=2
diff[1..6] = [2, 0, 0, -2, 0, 0]
After update(2,5,+3): diff[2]+=3, diff[6]-=3
diff[1..6] = [2, 3, 0, -2, 0, -3]
After update(3,4,-1): diff[3] += -1 (i.e., diff[3] -= 1), diff[5] -= (-1) (i.e., diff[5] += 1)
diff[1..6] = [2, 3, -1, -2, 1, -3]
Prefix sum reconstruction:
i=1: running = 0+2 = 2 → result[1] = 2
i=2: running = 2+3 = 5 → result[2] = 5
i=3: running = 5-1 = 4 → result[3] = 4
i=4: running = 4-2 = 2 → result[4] = 2
i=5: running = 2+1 = 3 → result[5] = 3
**Sample Output:**
2 5 4 2 3
**Complexity Analysis:**
- **Time:** `O(N + M)` — `O(1)` per update, `O(N)` reconstruction
- **Space:** `O(N)` — just the difference array
> ⚠️ **Common Mistake:** Declaring `diff` with size N+1 instead of N+2. When R=N, you write to `diff[R+1] = diff[N+1]`, which needs to exist!
---
## 3.2.6 2D Prefix Sums
For 2D grids, you can extend prefix sums to answer rectangular range queries in `O(1)`.
Given an R×C grid, define `P[r][c]` = sum of all elements in the rectangle from (1,1) to (r,c).
### Building the 2D Prefix Sum
P[r][c] = A[r][c] + P[r-1][c] + P[r][c-1] - P[r-1][c-1]
The subtraction removes the overlap (otherwise the top-left rectangle is counted twice).
> 💡 **Key Insight (Inclusion-Exclusion):** Visualize the four rectangles:
> - `P[r-1][c]` = the "top" rectangle
> - `P[r][c-1]` = the "left" rectangle
> - `P[r-1][c-1]` = the "top-left corner" (counted in BOTH above — so subtract once)
> - `A[r][c]` = the single new cell
### Step-by-Step 2D Prefix Sum Worked Example
Let's trace through a 4×4 grid:
**Original Grid A:**
c=1 c=2 c=3 c=4
r=1:   1   2   3   4
r=2:   5   6   7   8
r=3:   9  10  11  12
r=4:  13  14  15  16
**Building P step by step (left-to-right, top-to-bottom):**
P[1][1] = A[1][1] = 1
P[1][2] = A[1][2] + P[0][2] + P[1][1] - P[0][1] = 2 + 0 + 1 - 0 = 3
P[1][3] = A[1][3] + P[0][3] + P[1][2] - P[0][2] = 3 + 0 + 3 - 0 = 6
P[1][4] = 4 + 0 + 6 - 0 = 10

P[2][1] = A[2][1] + P[1][1] + P[2][0] - P[1][0] = 5 + 1 + 0 - 0 = 6
P[2][2] = A[2][2] + P[1][2] + P[2][1] - P[1][1] = 6 + 3 + 6 - 1 = 14
P[2][3] = 7 + 6 + 14 - 3 = 24
P[2][4] = 8 + 10 + 24 - 6 = 36

P[3][1] = 9 + 6 + 0 - 0 = 15
P[3][2] = 10 + 14 + 15 - 6 = 33
P[3][3] = 11 + 24 + 33 - 14 = 54
P[3][4] = 12 + 36 + 54 - 24 = 78

P[4][1] = 13 + 15 + 0 - 0 = 28
P[4][2] = 14 + 33 + 28 - 15 = 60
P[4][3] = 15 + 54 + 60 - 33 = 96
P[4][4] = 16 + 78 + 96 - 54 = 136
**Resulting prefix sum grid P:**
c=1 c=2 c=3 c=4
r=1:   1   3   6  10
r=2:   6  14  24  36
r=3:  15  33  54  78
r=4:  28  60  96 136
**Query: Sum of subgrid (r1=2, c1=2) to (r2=3, c2=3):**
ans = P[3][3] - P[1][3] - P[3][1] + P[1][1] = 54 - 6 - 15 + 1 = 34
Verify: A[2][2]+A[2][3]+A[3][2]+A[3][3] = 6+7+10+11 = 34 ✓
**Visualization of the inclusion-exclusion:**
```cpp
// Solution: 2D Prefix Sums — Build O(R×C), Query O(1)
#include <bits/stdc++.h>
using namespace std;
const int MAXR = 1001, MAXC = 1001;
int A[MAXR][MAXC];
long long P[MAXR][MAXC];
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int R, C;
cin >> R >> C;
for (int r = 1; r <= R; r++)
for (int c = 1; c <= C; c++)
cin >> A[r][c];
// Step 1: Build 2D prefix sum — O(R × C)
for (int r = 1; r <= R; r++) {
for (int c = 1; c <= C; c++) {
P[r][c] = A[r][c]
+ P[r-1][c] // rectangle above
+ P[r][c-1] // rectangle to the left
- P[r-1][c-1]; // ← KEY LINE: remove overlap (counted twice)
}
}
// Step 2: Answer each query in O(1)
int q;
cin >> q;
while (q--) {
int r1, c1, r2, c2;
cin >> r1 >> c1 >> r2 >> c2;
long long ans = P[r2][c2]
- P[r1-1][c2] // subtract top strip
- P[r2][c1-1] // subtract left strip
+ P[r1-1][c1-1]; // add back top-left corner
cout << ans << "\n";
}
return 0;
}
Complexity Analysis:
- Build time: `O(R × C)`
- Query time: `O(1)` per query
- Space: `O(R × C)`
⚠️ Common Mistake: Forgetting to add `P[r1-1][c1-1]` back in the query formula. The top strip and left strip both include the top-left corner, so it gets subtracted twice — you need to add it back once!
3.2.7 USACO Example: Max Subarray Sum
Problem (variation of Kadane's algorithm): Find the contiguous subarray with the maximum sum.
// Solution: Kadane's Algorithm — O(N) Time, O(1) Space
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> A(n);
for (int &x : A) cin >> x;
// Kadane's Algorithm: O(n)
long long maxSum = LLONG_MIN; // LLONG_MIN = smallest long long
long long current = 0;
for (int i = 0; i < n; i++) {
current += A[i];
maxSum = max(maxSum, current);
if (current < 0) current = 0; // ← KEY LINE: restart if sum goes negative
}
cout << maxSum << "\n";
return 0;
}
💡 Key Insight: Why reset `current` to 0 when it goes negative? Because a negative prefix sum hurts any future subarray. If the running sum so far is -5, any future subarray starting fresh (sum 0) will always beat continuing from -5.
Alternative with prefix sums: The max subarray sum equals max over all pairs (i,j) of P[j] - P[i-1]. For each j, this is maximized when P[i-1] is minimized. Track the running minimum of prefix sums!
// Alternative: Min Prefix Trick — also O(N)
long long maxSum = LLONG_MIN, minPrefix = 0, prefix = 0;
for (int x : A) {
prefix += x;
maxSum = max(maxSum, prefix - minPrefix); // best sum ending here
minPrefix = min(minPrefix, prefix); // track minimum prefix seen so far
// ⚠️ Note: minPrefix must be updated AFTER maxSum.
// Updating minPrefix first would let the empty subarray (length 0, sum 0)
// into the comparison, so an all-negative array would wrongly return 0
// instead of its largest (least negative) element.
}
⚠️ Common Mistakes in Chapter 3.2
- Off-by-one in range queries: `P[R] - P[L]` instead of `P[R] - P[L-1]`. Always verify on a small example.
- Overflow: Prefix sums of large values can exceed `int` range (~2×10^9). Use `long long` for the prefix array even if elements are `int`.
- 2D query formula: Forgetting the `+P[r1-1][c1-1]` term in the 2D query — a very easy slip.
- Difference array size: Declaring `diff[n+1]` when you need `diff[n+2]` (because you write to index `r+1`, which could be `n+1`).
- 1-indexing vs 0-indexing: If you use 0-indexed prefix sums, the query formula changes to `P[R+1] - P[L]`. Pick one convention and stick to it within a problem.
Chapter Summary
📌 Key Takeaways
| Technique | Build Time | Query Time | Space | Use Case |
|---|---|---|---|---|
| 1D prefix sum | O(N) | O(1) | O(N) | Range sum on 1D array |
| 2D prefix sum | O(RC) | O(1) | O(RC) | Range sum on 2D grid |
| Difference array | O(N+M) | O(1)* | O(N) | Range addition updates |
| Kadane's algorithm | O(N) | — | O(1) | Maximum subarray sum |
*After O(N) reconstruction pass to read all values.
🧩 Core Formula Quick Reference
| Operation | Formula | Notes |
|---|---|---|
| 1D range sum | P[R] - P[L-1] | P[0] = 0 is the sentinel value |
| 2D rectangle sum | P[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1] | Inclusion-exclusion: subtract twice, add once |
| Difference array update | diff[L] += V; diff[R+1] -= V; | Array size should be N+2 |
| Restore from difference | Take prefix sum of diff | Result is the final array |
❓ FAQ
Q1: What is the relationship between prefix sums and difference arrays?
A: They are inverse operations. Taking the prefix sum of an array gives the prefix sum array; taking the difference (adjacent element differences) of the prefix sum array restores the original. Conversely, taking the prefix sum of a difference array also restores the original. This is analogous to integration and differentiation in mathematics.
Q2: When to use prefix sums vs. difference arrays?
A: Rule of thumb — look at the operation type:
- Multiple range sum queries → prefix sum (preprocess `O(N)`, query `O(1)`)
- Multiple range add/subtract operations → difference array (update `O(1)`, restore `O(N)` at the end)
- If both operations alternate, you need a more advanced data structure (like the Segment Tree in Chapter 3.9)
Q3: Can prefix sums handle dynamic modifications? (array elements change)
A: No. Prefix sums are a one-time preprocessing step; the array cannot change afterward. If elements are modified, use a Fenwick Tree (BIT) or Segment Tree, which support point updates and range queries in `O(log N)` time.
Q4: Why are there two versions of Kadane's algorithm (current=0 vs minPrefix)?
A: Both are essentially the same, and both run in `O(N)`. The first (classic Kadane) is more intuitive: restart when the current subarray sum goes negative. The second (min-prefix method) uses prefix-sum thinking: the max subarray sum is the max over j of `P[j] - min(P[i] for i < j)`, so for each j you subtract the smallest prefix seen so far. Choose based on personal preference.
Q5: What are the space constraints for 2D prefix sums?
A: If R and C are both up to 10^4, the P array needs 10^8 `long long` values (about 800 MB) — far beyond memory limits. Generally R×C ≤ 10^6–10^7 is safe. For larger grids, consider coordinate compression or offline processing.
🔗 Connections to Later Chapters
- Chapter 3.4 (Two Pointers): sliding window can also do range queries, but only for fixed-size or monotonically moving windows; prefix sums are more general
- Chapter 3.3 (Sorting & Searching): binary search can combine with prefix sums — e.g., binary search on the prefix sum array for the first position ≥ target
- Chapter 3.9 (Segment Trees): solves "dynamic update + range query" problems that prefix sums cannot handle
- Chapters 6.1–6.3 (DP): many state transitions involve range sums; prefix sums are an important tool for optimizing DP
- The difference array idea ("+V at start, -V after end") recurs in sweep line algorithms, event sorting, and other advanced techniques
Practice Problems
Problem 3.2.1 — Range Sum 🟢 Easy
Read N integers and Q queries. Each query gives L and R. Print the sum of elements from index L to R (1-indexed).
Hint: Build a prefix sum array P where P[i] = A[1]+...+A[i]. Answer each query as P[R] - P[L-1].

Problem 3.2.2 — Range Add, Point Query 🟢 Easy
Start with N zeros. Process M operations: each operation adds V to all positions from L to R. After all operations, print the value at each position. (Use a difference array.)
Hint: Use `diff[L] += V` and `diff[R+1] -= V` for each update, then take prefix sums of diff.

Problem 3.2.3 — Rectangular Sum 🟡 Medium
Read an N×M grid of integers and Q queries. Each query gives (r1,c1,r2,c2). Print the sum of the subgrid.
Hint: Build a 2D prefix sum. Query = P[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1].

Problem 3.2.4 — USACO 2016 January Bronze: Mowing the Field 🔴 Hard (Challenge)
Farmer John mows grass along a path. Cells visited more than once contribute to "double-mowed" area. Use a 2D array and count cells visited at least twice.
Hint: Simulate the path, marking cells in a 2D visited array. Count cells with value ≥ 2 at the end.

Problem 3.2.5 — Maximum Subarray (Negative numbers allowed) 🟡 Medium
Read N integers (possibly negative). Find the maximum possible sum of a contiguous subarray. What if all numbers are negative?
Hint: Use Kadane's algorithm. If all numbers are negative, the answer is the single largest element (that's why we initialize maxSum to `LLONG_MIN`, not 0).

🏆 Challenge Problem: Cows and Paint Buckets
An N×M grid contains paint buckets, each with a positive value. You can select any rectangular subgrid. Your score is the maximum value in your subgrid minus the sum of all border cells of your subgrid. Find the optimal rectangle. (N, M ≤ 500)
Solution approach: 2D prefix sums for sums + careful enumeration of all rectangles.
Chapter 3.3: Sorting & Searching
📝 Before You Continue: You should be comfortable with arrays, vectors, and basic loops (Chapters 2.2–2.3). Familiarity with `std::sort` from Chapter 3.1 helps, but this chapter covers it in depth.
Sorting and searching are two of the most fundamental operations in computer science. In USACO, a huge fraction of problems become easy once you sort the data correctly. And binary search — the ability to search a sorted array in O(log n) — is a technique you'll reach for again and again.
3.3.1 Why Sorting Matters
Consider this problem: "Given N cow heights, find the two cows whose heights are closest together."
- Unsorted approach: Compare every pair → `O(N²)`. For N = 10^5, that's 10^10 operations. TLE.
- Sorted approach: Sort the heights → `O(N log N)`. Then the closest pair must be adjacent! Check N-1 pairs → `O(N)`. Total: `O(N log N)`. ✓
💡 Key Insight: Sorting transforms many `O(N²)` brute-force solutions into `O(N log N)` or `O(N)` solutions. When you see "find the pair with property X" or "find the minimum/maximum of something involving two elements," always consider sorting first.
Complexity Analysis:
- Sorting: `O(N log N)` time, `O(log N)` space (for the recursion stack in Introsort/quicksort)
- After sorting: adjacent comparisons or two-pointer techniques are `O(N)`
3.3.2 How Sorting Works (Conceptual)
You don't need to implement sorting algorithms yourself — std::sort does it for you. But understanding the ideas helps you reason about time complexity and choose the right approach.
Here are four classic sorting algorithms, each with an interactive visualization to help you understand how they work.
| Algorithm | Time Complexity | Space | Stable | Core Idea |
|---|---|---|---|---|
| Bubble Sort | O(N²) | O(1) | ✅ | Swap adjacent elements; large values "bubble" to the end |
| Insertion Sort | O(N²) / O(N) best | O(1) | ✅ | Insert each element into its correct position in the sorted region |
| Merge Sort | O(N log N) | O(N) | ✅ | Divide and conquer: split recursively, then merge |
| Quicksort | O(N log N) avg | O(log N) | ❌ | Divide and conquer: partition around a pivot, recurse |
🫧 Bubble Sort — O(N²)
Repeatedly scan the array, swapping adjacent elements that are out of order. Each pass "bubbles" the current maximum to the end:
Pass 1: [64,34,25,12,22,11,90] → 90 bubbles to end
Pass 2: [34,25,12,22,11,64,90] → 64 bubbles to second-to-last
...
Bubble sort is O(N²). Never use it on large inputs in competitive programming. We cover it only because it's conceptually the simplest.
🃏 Insertion Sort — O(N²) / O(N) best case
Divide the array into a left "sorted region" and a right "unsorted region." Each step takes the first element of the unsorted region and inserts it into the correct position in the sorted region:
Start: [64 | 34, 25, 12, 22, 11, 90] ← | sorted on left
i=1: [34, 64 | 25, 12, 22, 11, 90] ← 34 inserted before 64
i=2: [25, 34, 64 | 12, 22, 11, 90] ← 25 inserted at front
i=3: [12, 25, 34, 64 | 22, 11, 90] ← 12 inserted at front
...
💡 Insertion sort's strength: Very fast on nearly-sorted arrays (approaches O(N)). `std::sort` switches to insertion sort for small subarrays.
void insertionSort(vector<int>& a) {
int n = a.size();
for (int i = 1; i < n; i++) {
int key = a[i]; // element to insert
int j = i - 1;
// shift elements greater than key one position to the right
while (j >= 0 && a[j] > key) {
a[j + 1] = a[j];
j--;
}
a[j + 1] = key; // place key in its correct position
}
}
🔀 Merge Sort — O(N log N) always
Divide and conquer: recursively split the array in half, then merge the two sorted halves back together:
[38, 27, 43, 3, 9, 82, 10]
↓ split recursively
[38,27,43,3] [9,82,10]
[38,27] [43,3] [9,82] [10]
[38][27][43][3] [9][82][10]
↓ merge bottom-up
[27,38] [3,43] [9,82] [10]
[3,27,38,43] [9,10,82]
[3,9,10,27,38,43,82] ✓
Merge sort is O(N log N) in all cases and is a stable sort.
void merge(vector<int>& a, int lo, int mid, int hi) {
vector<int> tmp(a.begin() + lo, a.begin() + hi + 1);
int i = lo, j = mid + 1, k = lo;
while (i <= mid && j <= hi) {
if (tmp[i - lo] <= tmp[j - lo])
a[k++] = tmp[i++ - lo]; // take smaller from left half
else
a[k++] = tmp[j++ - lo]; // take smaller from right half
}
while (i <= mid) a[k++] = tmp[i++ - lo]; // append remaining left
while (j <= hi) a[k++] = tmp[j++ - lo]; // append remaining right
}
void mergeSort(vector<int>& a, int lo, int hi) {
if (lo >= hi) return;
int mid = lo + (hi - lo) / 2;
mergeSort(a, lo, mid); // sort left half
mergeSort(a, mid + 1, hi); // sort right half
merge(a, lo, mid, hi); // merge two sorted halves
}
⚡ Quicksort — O(N log N) average
Quicksort is one of the core algorithms underlying std::sort. Its key idea is divide and conquer:
- Pick a pivot element (typically the last element)
- Partition: move all elements ≤ pivot to the left, all > pivot to the right; pivot lands in its final position
- Recurse on the left and right subarrays
[8, 3, 6, 1, 9, 2, 7, 4] ← pivot = 4
↓ partition
[3, 1, 2, 4, 9, 6, 7, 8] ← 4 in final position; left ≤ 4, right > 4
↑_______↑ ↑ ↑__________↑
left subarray right subarray
Recurse on [3,1,2] → [1,2,3]
Recurse on [9,6,7,8] → [6,7,8,9]
Final: [1, 2, 3, 4, 6, 7, 8, 9] ✓
// Partition arr[lo..hi] using last element as pivot.
// Returns the final index of the pivot.
int partition(vector<int>& arr, int lo, int hi) {
int pivot = arr[hi]; // choose last element as pivot
int i = lo - 1; // i points to end of "≤ pivot" region
for (int j = lo; j < hi; j++) {
if (arr[j] <= pivot) {
i++;
swap(arr[i], arr[j]); // bring arr[j] into ≤ pivot region
}
}
swap(arr[i + 1], arr[hi]); // place pivot in its final position
return i + 1; // return pivot's index
}
void quickSort(vector<int>& arr, int lo, int hi) {
if (lo >= hi) return; // base case: subarray length ≤ 1
int p = partition(arr, lo, hi); // p is pivot's final position
quickSort(arr, lo, p - 1); // sort left subarray
quickSort(arr, p + 1, hi); // sort right subarray
}
Visual: Quicksort Partition
The diagram above illustrates how the partition operation rearranges elements around the pivot. Elements ≤ pivot move to the left; elements > pivot move to the right. The pivot then lands in its final sorted position.
⚠️ Worst case: If the pivot is always the max or min (e.g., already-sorted input with a last-element pivot), recursion depth degrades to O(N) and total time becomes O(N²). `std::sort` sidesteps this: Introsort tracks recursion depth and falls back to heapsort when it grows too deep, guaranteeing O(N log N) worst case.
| Case | Time | Notes |
|---|---|---|
| Average | O(N log N) | Pivot roughly splits array in half |
| Worst | O(N²) | Pivot always extreme (sorted input) |
| Space | O(log N) | Recursion stack depth (average) |
3.3.3 std::sort in Practice
⚠️ Stability Note: `std::sort` is NOT stable — it uses Introsort (a Quicksort + Heapsort + Insertion sort hybrid), which does not preserve the relative order of equal elements. If you need stable sorting, use `std::stable_sort` instead (see the comparison table in this section).
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> v(n);
for (int &x : v) cin >> x;
// Sort ascending
sort(v.begin(), v.end());
// Sort descending
sort(v.begin(), v.end(), greater<int>());
// Sort only part of a vector (indices 2 through 5 inclusive)
sort(v.begin() + 2, v.begin() + 6);
for (int x : v) cout << x << " ";
cout << "\n";
return 0;
}
Sorting by Multiple Criteria
Often you want to sort by one field, and break ties with another. With pair, this is automatic (sorts by .first, then .second):
vector<pair<int, string>> students;
students.push_back({85, "Alice"});
students.push_back({92, "Bob"});
students.push_back({85, "Charlie"});
sort(students.begin(), students.end());
// Result: {85, "Alice"}, {85, "Charlie"}, {92, "Bob"}
// Sorted by score first, then alphabetically by name
Custom Comparators
A comparator is a function that returns true if the first argument should come before the second in the sorted order.
The clearest way to write a comparator is as a standalone function:
struct Cow {
string name;
int weight;
int height;
};
// Sort by weight ascending; break ties by height descending
bool cmpCow(const Cow &a, const Cow &b) {
if (a.weight != b.weight) return a.weight < b.weight; // lighter first
return a.height > b.height; // taller first (tie-break)
}
int main() {
vector<Cow> cows = {{"Bessie", 500, 140}, {"Elsie", 480, 135}, {"Moo", 500, 138}};
sort(cows.begin(), cows.end(), cmpCow);
for (auto &c : cows) {
cout << c.name << " " << c.weight << " " << c.height << "\n";
}
// Output:
// Elsie 480 135
// Bessie 500 140
// Moo 500 138
return 0;
}
💡 Style Note: Defining `cmpCow` as a standalone function (rather than an inline lambda) makes the sorting logic easier to read, test, and reuse — especially when the comparison involves multiple fields.
Sorting Algorithm Stability
⚠️ Important: `std::sort` is NOT stable — equal elements may appear in any order after sorting. Use `std::stable_sort` if the relative order of equal elements must be preserved.
Sorting Algorithm Stability Comparison
| Algorithm | Time Complexity | Space Complexity | Stable | C++ Function |
|---|---|---|---|---|
| std::sort | O(N log N) | O(log N) | ❌ | sort() |
| std::stable_sort | O(N log² N) | O(N) | ✅ | stable_sort() |
| std::partial_sort | O(N log K) | O(1) | ❌ | partial_sort() |
| Counting Sort | O(N+K) | O(K) | ✅ | Manual |
| Radix Sort | O(d(N+K)) | O(N+K) | ✅ | Manual |
📝 Note: `std::sort` uses Introsort (a hybrid of Quicksort + Heapsort + Insertion sort). Because Quicksort is not stable, `std::sort` makes no guarantee on the relative order of equal elements. When you sort students by score and need students with the same score to remain in their original order, use `std::stable_sort`.
Visual: Sorting Algorithm Comparison
This chart compares the time complexity, space usage, and stability of common sorting algorithms, helping you choose the right one for each situation.
Counting Sort — O(N+K) for Small Value Ranges
When values are bounded integers in a small range [0, MAXVAL], counting sort beats std::sort by a wide margin:
// Counting sort: for integers in range [0, MAXVAL]
// Time O(N+MAXVAL), stable sort
void countingSort(vector<int>& arr, int maxVal) {
vector<int> cnt(maxVal + 1, 0);
for (int x : arr) cnt[x]++;
int idx = 0;
for (int v = 0; v <= maxVal; v++)
while (cnt[v]--) arr[idx++] = v;
}
// USACO use case: faster than std::sort when value range is small (e.g., cow IDs 1-1000)
When to use counting sort in USACO:
- Cow IDs in range [1, 1000], N = 10^6 → counting sort is O(N + 1000) vs O(N log N)
- Grade values [0, 100] → trivially fast
- Color categories [0, 3] → instant
Caution: If MAXVAL is large (e.g., 10^9), counting sort requires O(MAXVAL) memory — don't use it. Coordinate compress first (Section 3.3.6), then count.
3.3.4 Binary Search
Binary search finds a target in a sorted array in O(log n) — instead of O(n) for linear search.
Analogy: Searching for a word in a dictionary. You don't start from A and read every entry — you open to the middle, check if your word is before or after, then repeat. Each step cuts the search space in half: after k steps, you've gone from N candidates to N/2^k. When N/2^k < 1, you're done — that takes k = log₂(N) steps.
💡 Key Insight: Binary search works whenever you have a monotone predicate — a condition that is `false false false ... true true true` (or the reverse). You can binary search for the boundary between false and true in `O(log N)`.
Visual: Binary Search in Action
The diagram above shows a single-step binary search finding 7 in [1,3,5,7,9,11,13]. The left (L), right (R), and mid (M) pointers are shown. The key insight: computing mid = left + (right - left) / 2 avoids integer overflow compared to (left + right) / 2.
Manual Binary Search
// Solution: Binary Search — O(log N)
#include <bits/stdc++.h>
using namespace std;
// Returns index of target in sorted arr, or -1 if not found
int binarySearch(const vector<int> &arr, int target) {
int lo = 0, hi = (int)arr.size() - 1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2; // ← KEY LINE: avoid overflow (don't use (lo+hi)/2)
if (arr[mid] == target) {
return mid; // found!
} else if (arr[mid] < target) {
lo = mid + 1; // target is in the right half
} else {
hi = mid - 1; // target is in the left half
}
}
return -1; // not found
}
int main() {
vector<int> v = {1, 3, 5, 7, 9, 11, 13, 15};
cout << binarySearch(v, 7) << "\n"; // 3 (index)
cout << binarySearch(v, 6) << "\n"; // -1 (not found)
return 0;
}
Step-by-step trace for searching 7 in [1, 3, 5, 7, 9, 11, 13, 15]:
lo=0, hi=7: mid=3, arr[3]=7 → FOUND at index 3 ✓
Searching for 6:
lo=0, hi=7: mid=3, arr[3]=7 > 6 → hi=2
lo=0, hi=2: mid=1, arr[1]=3 < 6 → lo=2
lo=2, hi=2: mid=2, arr[2]=5 < 6 → lo=3
lo=3 > hi=2: loop ends → return -1 ✓
Why lo + (hi - lo) / 2? If lo and hi are both large (close to INT_MAX), then lo + hi overflows! This formula is equivalent but safe.
The STL Way: lower_bound and upper_bound
These are almost always what you actually want in competitive programming:
// STL Binary Search Operations — all O(log N)
#include <bits/stdc++.h>
using namespace std;
int main() {
vector<int> v = {1, 3, 3, 5, 7, 9, 9, 11};
// lower_bound: iterator to first element >= target
auto lb = lower_bound(v.begin(), v.end(), 3);
cout << *lb << "\n"; // 3 (first 3)
cout << lb - v.begin() << "\n"; // 1 (index)
// upper_bound: iterator to first element > target
auto ub = upper_bound(v.begin(), v.end(), 3);
cout << *ub << "\n"; // 5 (first element after all 3s)
cout << ub - v.begin() << "\n"; // 3 (index)
// Count occurrences: upper_bound - lower_bound
int count_of_3 = upper_bound(v.begin(), v.end(), 3)
- lower_bound(v.begin(), v.end(), 3);
cout << count_of_3 << "\n"; // 2
// Check if value exists
bool exists = binary_search(v.begin(), v.end(), 7);
cout << exists << "\n"; // 1
// Find largest value <= target (floor)
auto it = upper_bound(v.begin(), v.end(), 6);
if (it != v.begin()) {
--it;
cout << *it << "\n"; // 5 (largest value <= 6)
}
return 0;
}
⚠️ Common Mistake: Using `lower_bound`/`upper_bound` on an unsorted container. These functions assume sorted order — on unsorted data, they give wrong results with no error!
3.3.5 Binary Search on the Answer
This is one of the most powerful and commonly-tested techniques in USACO Silver. The idea:
Instead of searching for a value in an array, binary search over the answer space itself.
When does this apply? When:
- The answer is a number in some range [lo, hi]
- There's a function `canAchieve(X)` that checks if X is feasible
- The function is monotone: if X works, all values ≤ X also work (or all ≥ X work)
💡 Key Insight: Monotonicity means there's a "threshold" separating feasible from infeasible answers. Binary search finds this threshold in `O(log(hi-lo))` calls to `canAchieve`. If each call takes `O(f(N))`, total time is `O(f(N) × log(answer_range))`.
Classic Example: Aggressive Cows (USACO 2011 March Silver)
Problem: N stalls at positions p[1..N], place C cows to maximize the minimum distance between any two cows.
Why binary search? If we can place cows with minimum gap D, we can also place them with gap D-1. So feasibility is monotone: there is a threshold D* such that every D ≤ D* is feasible and every D > D* is infeasible. We binary search for D*, the largest feasible gap.
The canPlace(minDist) function: Place the first cow at the leftmost stall, then greedily pick the next stall that is at least minDist away. Count how many cows we can place this way — if ≥ C, return true.
// Solution: Binary Search on Answer — O(N log N log(max_distance))
#include <bits/stdc++.h>
using namespace std;
int n, c;
vector<int> stalls;
// Can we place c cows such that the minimum gap between any two cows is >= minDist?
bool canPlace(int minDist) {
int placed = 1; // place first cow at stall 0
int lastPos = stalls[0]; // position of last placed cow
for (int i = 1; i < n; i++) {
if (stalls[i] - lastPos >= minDist) { // this stall is far enough
placed++;
lastPos = stalls[i];
}
}
return placed >= c; // did we place all c cows?
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
cin >> n >> c;
stalls.resize(n);
for (int &x : stalls) cin >> x;
sort(stalls.begin(), stalls.end()); // must sort first!
// Binary search on the answer: what's the maximum possible minimum distance?
int lo = 1, hi = stalls.back() - stalls.front();
int answer = 0;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2;
if (canPlace(mid)) {
answer = mid; // mid works, try larger
lo = mid + 1;
} else {
hi = mid - 1; // mid doesn't work, try smaller
}
}
cout << answer << "\n";
return 0;
}
Trace for stalls = [1, 2, 4, 8, 9], C = 3:
Sorted: [1, 2, 4, 8, 9]
lo=1, hi=8
mid=4: canPlace(4)?
Place cow at 1. Next stall ≥ 1+4=5: that's 8. Place at 8.
Next stall ≥ 8+4=12: none. Total placed=2 < 3. Return false.
→ hi = 3
mid=2: canPlace(2)?
Place cow at 1. Next stall ≥ 3: that's 4. Place at 4.
Next stall ≥ 6: that's 8. Place at 8. Total placed=3 ≥ 3. Return true.
→ answer=2, lo=3
mid=3: canPlace(3)?
Place cow at 1. Next ≥ 4: that's 4. Place at 4.
Next ≥ 7: that's 8. Place at 8. Total placed=3 ≥ 3. Return true.
→ answer=3, lo=4
lo=4 > hi=3: done. Answer = 3
Another Classic: Minimum Time to Complete Tasks (Rope Cutting)
Problem: Given N ropes of lengths L[i], cut K pieces of equal length. What's the maximum possible length of each piece?
// Can we get K pieces of length >= len from the ropes?
bool canCut(vector<int> &ropes, long long len, int K) {
long long count = 0;
for (int r : ropes) count += r / len; // pieces from each rope
return count >= K;
}
// Binary search: maximize len such that canCut(len) is true
long long lo = 1, hi = *max_element(ropes.begin(), ropes.end());
long long answer = 0;
while (lo <= hi) {
long long mid = lo + (hi - lo) / 2;
if (canCut(ropes, mid, K)) {
answer = mid;
lo = mid + 1;
} else {
hi = mid - 1;
}
}
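To make the fragment testable, it can be wrapped into a single self-contained function (a sketch; `maxPieceLen` is a name chosen here, not from the original):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Maximum piece length such that at least K pieces of that length
// can be cut from the given ropes (returns 0 if even length 1 fails).
long long maxPieceLen(const vector<int>& ropes, long long K) {
    long long lo = 1, hi = *max_element(ropes.begin(), ropes.end());
    long long answer = 0;
    while (lo <= hi) {
        long long mid = lo + (hi - lo) / 2;
        long long pieces = 0;
        for (int r : ropes) pieces += r / mid;   // pieces from each rope
        if (pieces >= K) { answer = mid; lo = mid + 1; }  // feasible: try longer
        else             { hi = mid - 1; }                // infeasible: go shorter
    }
    return answer;
}
// maxPieceLen({5, 7, 9}, 3) == 5  (each rope yields one piece of length 5)
```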
Template for Binary Search on Answer:
// Generic template — adapt lo, hi, and check() for your problem
long long lo = min_possible_answer;
long long hi = max_possible_answer;
long long answer = lo; // or -1 if no valid answer exists
while (lo <= hi) {
long long mid = lo + (hi - lo) / 2;
if (check(mid)) { // mid is feasible
answer = mid; // save it
lo = mid + 1; // try to do better (or worse, depending on problem)
} else {
hi = mid - 1; // mid not feasible, go lower
}
}
🏆 USACO Tip: Whenever a USACO problem asks "find the maximum X such that [some condition]" or "find the minimum X such that [some condition]," consider binary search on the answer. This technique solves USACO Silver problems frequently.
3.3.6 Coordinate Compression
Sometimes values are large (up to 10^9), but there are few distinct values. Coordinate compression maps them to small indices (0, 1, 2, ...).
// Solution: Coordinate Compression — O(N log N)
#include <bits/stdc++.h>
using namespace std;
int main() {
vector<int> A = {100, 500, 200, 100, 700, 200};
// Step 1: Get sorted unique values
vector<int> sorted_unique = A;
sort(sorted_unique.begin(), sorted_unique.end());
sorted_unique.erase(unique(sorted_unique.begin(), sorted_unique.end()),
sorted_unique.end());
// sorted_unique = {100, 200, 500, 700}
// Step 2: Map each original value to its compressed index
vector<int> compressed(A.size());
for (int i = 0; i < (int)A.size(); i++) {
compressed[i] = lower_bound(sorted_unique.begin(), sorted_unique.end(), A[i])
- sorted_unique.begin();
// 100→0, 200→1, 500→2, 700→3
}
for (int x : compressed) cout << x << " ";
cout << "\n"; // 0 2 1 0 3 1
return 0;
}
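A typical payoff of compression is replacing a `map` with a plain array. Here is a small sketch (the helper name `freqByCompression` is ours) that counts how often each distinct value occurs, indexed by compressed value:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Compress values to 0..D-1 (D = number of distinct values), then count
// frequencies with a plain array instead of a map.
// Returns the frequency of each distinct value, in sorted-value order.
vector<int> freqByCompression(const vector<int>& A) {
    vector<int> vals = A;
    sort(vals.begin(), vals.end());
    vals.erase(unique(vals.begin(), vals.end()), vals.end());
    vector<int> freq(vals.size(), 0);
    for (int x : A)
        freq[lower_bound(vals.begin(), vals.end(), x) - vals.begin()]++;
    return freq;
}
```

For the chapter's array `{100, 500, 200, 100, 700, 200}`, this yields frequencies `{2, 2, 1, 1}` for the sorted distinct values `{100, 200, 500, 700}`.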
⚠️ Common Mistakes in Chapter 3.3
- Sorting with wrong comparator: Your lambda must return `true` if `a` should come BEFORE `b`. If it returns `true` for `a == b`, you get undefined behavior (strict weak ordering violation).
- Binary search on unsorted array: `lower_bound` and `upper_bound` assume sorted order. On unsorted data, results are meaningless.
- Off-by-one in binary search: `lo <= hi` vs `lo < hi` matters. When in doubt, test your binary search on a 1-element and a 2-element array.
- Wrong answer range in "binary search on answer": If the answer could be 0, set `lo = 0`, not `lo = 1`. If it could be very large, make sure `hi` is large enough (use `long long` if necessary).
- Integer overflow in mid computation: Always write `mid = lo + (hi - lo) / 2`, never `(lo + hi) / 2`.
Chapter Summary
📌 Key Takeaways
| Operation | Method | Time Complexity | Notes |
|---|---|---|---|
| Sort ascending | sort(v.begin(), v.end()) | O(N log N) | Uses IntroSort |
| Sort descending | sort(..., greater<int>()) | O(N log N) | |
| Custom sort | Lambda comparator | O(N log N) | Must be strict weak order |
| Find exact value | binary_search | O(log N) | Returns bool |
| First index ≥ x | lower_bound | O(log N) | Returns iterator |
| First index > x | upper_bound | O(log N) | Returns iterator |
| Count of value x | ub - lb | O(log N) | |
| Binary search on answer | Manual BS + check() | O(f(N) log V) | V = answer range |
| Coordinate compression | sort + unique + lower_bound | O(N log N) | Map large values to small indices |
🧩 Binary Search Template Quick Reference
| Scenario | lo/hi init | Update rule | Answer |
|---|---|---|---|
| Maximize value satisfying condition | lo=min, hi=max | check(mid) → ans=mid, lo=mid+1 | ans |
| Minimize value satisfying condition | lo=min, hi=max | check(mid) → hi=mid | lo (when loop ends) |
| Floating-point binary search | lo=min, hi=max | Loop 100 times, check(mid) → hi=mid else lo=mid | lo ≈ hi |
❓ FAQ
Q1: Is sort's time complexity O(N log N) or O(N²)?
A: C++'s `std::sort` uses Introsort (a hybrid of Quicksort + Heapsort + Insertion sort), guaranteeing `O(N log N)` worst case. No need to worry about degrading to `O(N²)`. But note: if your custom comparator doesn't satisfy strict weak ordering, behavior is undefined (it may infinite loop or crash).
Q2: What's the difference between lo <= hi and lo < hi in binary search?
A: The two styles correspond to different templates:
- `while (lo <= hi)`: when the loop ends, `lo > hi`; the answer is stored in an `answer` variable. Good for "find target value" or "maximize value satisfying condition".
- `while (lo < hi)`: when the loop ends, `lo == hi`; the answer is `lo`. Good for "minimize value satisfying condition".
Both styles can solve all problems; the key is pairing each with the correct update rule. Beginners should pick one style and stick with it.
Q3: What problems is "binary search on answer" applicable to? How to identify them?
A: Three signals: ① the problem asks for "the maximum/minimum X such that..."; ② there is a decision function `check(X)` that can determine feasibility in polynomial time; ③ the decision function is monotone (X feasible → X−1 also feasible, or vice versa). If all three hold, binary search on answer applies.
Q4: What is coordinate compression actually useful for?
A: When the value range is large (e.g., 10^9) but the number of distinct values is small (e.g., 10^5), coordinate compression maps the large values to small indices 0..N−1. This lets you use arrays instead of maps (faster), or perform prefix sums/BIT operations over the value domain. Frequently needed in USACO Silver.
Q5: Why can't the sort comparator use <=?
A: C++ sorting requires the comparator to satisfy strict weak ordering: when `a == b`, `comp(a, b)` must return `false`. `<=` returns `true` when `a == b`, violating this rule. The result is undefined behavior — it may infinite loop, crash, or produce an incorrect ordering.
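As a concrete illustration of a safe comparator, here is a sketch (the function name and data are ours) that sorts by score descending with ties broken by name ascending. Note that every branch uses strict `<` or `>`, so equal elements compare `false` both ways:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Sort by score descending, then name ascending.
// Every comparison is strict, satisfying strict weak ordering.
vector<pair<string,int>> sortScores(vector<pair<string,int>> v) {
    sort(v.begin(), v.end(), [](const pair<string,int>& a,
                                const pair<string,int>& b) {
        if (a.second != b.second) return a.second > b.second; // strict >
        return a.first < b.first;                             // strict <
    });
    return v;
}
```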
🔗 Connections to Later Chapters
- Chapter 3.4 (Two Pointers): two-pointer techniques are often used after sorting — sort first in `O(N log N)`, then apply two pointers in `O(N)`
- Chapter 3.2 (Prefix Sums): prefix sum arrays are naturally ordered, enabling binary search on them (e.g., find first prefix sum ≥ target)
- Chapters 4.1 & 5.4 (Greedy + Shortest Paths): Dijkstra internally uses a priority queue + greedy strategy, fundamentally related to sorting
- Chapter 6.2 (DP): LIS (Longest Increasing Subsequence) can be optimized to `O(N log N)` using binary search
- "Binary search on answer" is one of the core techniques of USACO Silver, and it combines frequently with the greedy ideas of Chapter 4.1
Practice Problems
Problem 3.3.1 — Closest Pair 🟢 Easy Read N integers. Find the pair with the minimum difference. Print that difference.
Hint
Sort the array. The closest pair must be adjacent after sorting — scan adjacent pairs and take the minimum difference.
Problem 3.3.2 — Room Allocation 🟡 Medium Read N events, each with start and end time. What is the maximum number of events that overlap at any single moment? (Hint: sort start/end times together and sweep)
Hint
Create an array of events: (time, +1 for start, -1 for end). Sort by time. Sweep and maintain a running count of active events; track the maximum.
Problem 3.3.3 — Kth Smallest 🟡 Medium Read N integers. Find the K-th smallest element (1-indexed).
Solution sketch: Binary search on the answer X. Count how many elements ≤ X using a scan — this is O(N). Total: O(N log(max_value)).
Hint
Alternatively, just sort and return v[K-1]. But try the binary search approach for practice!
Problem 3.3.4 — Aggressive Cows (USACO 2011 March Silver) 🔴 Hard
N stalls at positions p[1..N], place C cows to maximize the minimum distance between any two cows. (Full implementation of the example above.)
Solution sketch: Sort stalls. Binary search on minimum distance D. For each D, greedily place cows: always place next cow at the earliest stall that is ≥ D away from the last cow.
Hint
The check function `canPlace(D)` runs in `O(N)` by scanning sorted stalls greedily. Total time: `O(N log N)` sort + `O(N log(max_dist))` binary search.
Problem 3.3.5 — Binary Search on Answer: Painter's Partition 🔴 Hard N boards with widths w[1..N]. K painters, each takes 1 unit time per unit width. Assign contiguous boards to painters to minimize total time (the maximum any single painter works).
Solution sketch: Binary search on the answer T (max time any painter works). Check: greedily assign boards to painters, starting a new painter whenever the current one would exceed T. If ≤ K painters suffice, T is feasible.
Hint
Feasibility check: simulate greedily — run left to right, assigning boards to the current painter until adding the next board would exceed T. `O(N)` per check, `O(log(sum))` binary search iterations.
🏆 Challenge Problem: USACO 2016 February Silver: Fencing the Cows Fence all N points in a convex region using minimum fencing. This is the Convex Hull problem — look up the Graham scan or Jarvis march algorithms. While this is a Gold-level topic, thinking about it now will prime your intuition.
3.3.7 Advanced Binary Search on Answer — Three Examples
Example 1: Minimum Time to Finish Tasks (Parametric Search)
Problem: N workers, M tasks with effort[i]. Assign tasks to workers (each worker gets contiguous tasks). Minimize the maximum time any worker spends (minimize the bottleneck).
This is the "Painter's Partition" problem. Binary search on the answer (max time T), check if T is achievable.
// Check: can we distribute tasks among K workers so max work <= T?
bool canFinish(vector<int>& tasks, int K, long long T) {
int workers = 1;
long long current = 0;
for (int t : tasks) {
if (t > T) return false; // single task exceeds T — impossible
if (current + t > T) {
workers++; // start new worker
current = t;
if (workers > K) return false;
} else {
current += t;
}
}
return true;
}
// Binary search on T
long long lo = *max_element(tasks.begin(), tasks.end()); // minimum possible T
long long hi = accumulate(tasks.begin(), tasks.end(), 0LL); // maximum T (1 worker)
while (lo < hi) {
long long mid = lo + (hi - lo) / 2;
if (canFinish(tasks, K, mid)) hi = mid; // mid works, try smaller
else lo = mid + 1; // mid doesn't work, need larger
}
cout << lo << "\n"; // minimum possible maximum time
📝 Note: Here we binary search for the minimum feasible T, so we use
`hi = mid` when feasible (not `answer = mid; lo = mid + 1`). The two templates are mirror images.
Example 2: Kth Smallest in Multiplication Table
Problem: N×M multiplication table. Find the Kth smallest value.
The table has values i*j for 1≤i≤N, 1≤j≤M. Binary search on the answer X: count how many values are ≤ X.
// Count values <= X in N×M multiplication table
long long countLE(long long X, int N, int M) {
long long count = 0;
for (int i = 1; i <= N; i++) {
count += min((long long)M, X / i);
// Row i has values i, 2i, ..., Mi
// Count of values <= X in row i: min(M, floor(X/i))
}
return count;
}
// Binary search for Kth smallest
long long lo = 1, hi = (long long)N * M;
while (lo < hi) {
long long mid = lo + (hi - lo) / 2;
if (countLE(mid, N, M) >= K) hi = mid;
else lo = mid + 1;
}
cout << lo << "\n";
Complexity: O(N log(NM)) — O(N) per check, O(log(NM)) iterations.
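The two fragments above can be combined into one self-contained function (a sketch; `kthInTable` is a name chosen here):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Kth smallest value in the N×M multiplication table.
long long kthInTable(int N, int M, long long K) {
    auto countLE = [&](long long X) {        // how many table values are <= X
        long long c = 0;
        for (int i = 1; i <= N; i++) c += min((long long)M, X / i);
        return c;
    };
    long long lo = 1, hi = (long long)N * M;
    while (lo < hi) {
        long long mid = lo + (hi - lo) / 2;
        if (countLE(mid) >= K) hi = mid;     // feasible: the answer is <= mid
        else lo = mid + 1;
    }
    return lo;
}
```

For the 3×3 table, the sorted values are 1, 2, 2, 3, 3, 4, 6, 6, 9, so the 5th smallest is 3.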
Example 3: USACO-Style Cable Length (Agri-Net inspired)
Problem: Given N farm locations and the list of possible cables (edges with lengths), connect all the farms. Find the minimum length limit L such that a spanning tree exists using only edges of length ≤ L (a minimum bottleneck spanning tree).
// Binary search on the minimum feasible cable-length limit L
// Check: does a spanning tree exist using only edges of length <= L?
// (This reduces to: is the graph connected when restricted to edges <= L?)
bool canConnect(vector<tuple<int,int,int>>& edges, int n, int L) {
DSU dsu(n);
for (auto [w, u, v] : edges) {
if (w <= L) dsu.unite(u, v);
}
return dsu.components == 1; // all nodes connected
}
3.3.8 lower_bound / upper_bound Complete Cheat Sheet
vector<int> v = {1, 3, 3, 5, 7, 9, 9, 11};
// 0 1 2 3 4 5 6 7
// ── lower_bound: first position >= x ──
lower_bound(all, 3) → index 1 (first 3)
lower_bound(all, 4) → index 3 (first element >= 4, which is 5)
lower_bound(all, 12) → index 8 (past-end: no element ≥ 12 exists in the array)
// ── upper_bound: first position > x ──
upper_bound(all, 3) → index 3 (first element after all 3s)
upper_bound(all, 4) → index 3 (same as above: no 4s)
upper_bound(all, 11) → index 8 (past-end)
// ── Derived operations ──
// Count occurrences of x:
ub(x) - lb(x) = upper_bound(all,3) - lower_bound(all,3) = 3-1 = 2 ✓
// Does x exist?
binary_search(all, x) // O(log N), returns bool
// Largest value <= x (floor):
auto it = upper_bound(all, x);
if (it != v.begin()) cout << *prev(it); // *--it
// Smallest value >= x (ceil):
auto it = lower_bound(all, x);
if (it != v.end()) cout << *it;
// Largest value < x (strict floor):
auto it = lower_bound(all, x);
if (it != v.begin()) cout << *prev(it);
// Count elements < x:
lower_bound(all, x) - v.begin()
// Count elements <= x:
upper_bound(all, x) - v.begin()
// Count elements in range [a, b]:
upper_bound(all, b) - lower_bound(all, a)
| Goal | Code | Note |
|---|---|---|
| First index ≥ x | lower_bound(v.begin(), v.end(), x) - v.begin() | Equals v.size() if all < x |
| First index > x | upper_bound(v.begin(), v.end(), x) - v.begin() | |
| Count of value x | upper_bound(...,x) - lower_bound(...,x) | |
| Largest value ≤ x | *prev(upper_bound(...,x)) | Check iterator ≠ begin |
| Smallest value ≥ x | *lower_bound(...,x) | Check iterator ≠ end |
| Does x exist? | binary_search(...) | Returns bool |
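The last derived operation — counting elements in [a, b] — appears constantly in Silver problems and is worth wrapping as a helper. A sketch (`countInRange` is a name chosen here):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Count elements of sorted vector v that lie in the closed range [a, b].
// upper_bound gives the count of elements <= b; lower_bound the count < a.
int countInRange(const vector<int>& v, int a, int b) {
    return upper_bound(v.begin(), v.end(), b)
         - lower_bound(v.begin(), v.end(), a);
}
```

On the cheat-sheet array `{1, 3, 3, 5, 7, 9, 9, 11}`, `countInRange(v, 3, 9)` returns 6 (the elements 3, 3, 5, 7, 9, 9).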
3.3.9 Custom Predicate Binary Search
For non-standard sorted structures or custom criteria:
// Binary search with custom predicate
// Find first index i where pred(i) is true, in range [lo, hi]
// Assumption: pred is monotone: false...false, true...true
int lo = 0, hi = n - 1, answer = -1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2;
if (/* some condition on mid */) {
answer = mid;
hi = mid - 1; // look for smaller index
} else {
lo = mid + 1;
}
}
// Example: first index where arr[i] >= arr[i-1] + 10 (gap >= 10)
int lo = 1, hi = n - 1, firstLargeGap = -1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2;
if (arr[mid] - arr[mid-1] >= 10) {
firstLargeGap = mid;
hi = mid - 1;
} else {
lo = mid + 1;
}
}
// Floating point binary search (epsilon-based)
double lo_f = 0.0, hi_f = 1e9;
for (int iter = 0; iter < 100; iter++) { // 100 iterations: range shrinks by 2^100, ample precision
double mid = (lo_f + hi_f) / 2;
if (check(mid)) hi_f = mid;
else lo_f = mid;
}
// Answer: lo_f (or hi_f, they converge to same value)
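As a concrete instance, a square root can be computed with this exact loop: the predicate `mid * mid >= target` is monotone in `mid`. A sketch (`mySqrt` is our name):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Floating-point binary search: smallest x with x*x >= target,
// i.e. sqrt(target) for target >= 0.
double mySqrt(double target) {
    double lo = 0.0, hi = max(1.0, target);  // sqrt(t) <= max(1, t)
    for (int iter = 0; iter < 100; iter++) {
        double mid = (lo + hi) / 2;
        if (mid * mid >= target) hi = mid;   // feasible: answer is <= mid
        else lo = mid;
    }
    return lo;                               // lo ≈ hi ≈ sqrt(target)
}
```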
🏆 USACO Pro Tip: "Binary search on answer" is one of the most common Silver techniques. When you see "maximize/minimize X subject to [constraint]," ask yourself: Is the feasibility function monotone? If yes, binary search.
3.3.10 Ternary Search — Finding the Peak of a Unimodal Function
Binary search requires a monotone predicate (false→true boundary). For unimodal functions (increases then decreases), use ternary search to find the maximum.
💡 When to use: A function
fis unimodal on[lo, hi]if it first strictly increases then strictly decreases (or is always one direction). Ternary search finds the maximum point inO(log((hi-lo)/eps))evaluations.
USACO appearances: Problems where the answer depends on a continuous parameter (e.g., "find the optimal point on a line to minimize the sum of distances to a set of points") sometimes require ternary search.
// Ternary search: find maximum of unimodal function f on [lo, hi]
// Prerequisite: f increases then decreases (unimodal)
// Time: O(log((hi-lo)/eps)) for continuous, or O(log N) for integers
// f must be declared/defined before calling this
double ternarySearch(double lo, double hi) {
for (int iter = 0; iter < 200; iter++) {
double m1 = lo + (hi - lo) / 3;
double m2 = hi - (hi - lo) / 3;
if (f(m1) < f(m2)) lo = m1; // maximum is in [m1, hi]
else hi = m2; // maximum is in [lo, m2]
}
return (lo + hi) / 2; // Maximum point (lo ≈ hi after convergence)
}
// Integer ternary search (when f is defined on integers):
int ternarySearchInt(int lo, int hi) {
// Use > 2 rather than >= 2: keep at least 3 candidates for the final brute-force pass.
// When the range shrinks to 2 elements, m1 == m2 (because (hi-lo)/3 == 0),
// which would cause an infinite loop. > 2 guarantees a safe exit and correct boundary handling.
while (hi - lo > 2) {
int m1 = lo + (hi - lo) / 3;
int m2 = hi - (hi - lo) / 3;
if (f(m1) < f(m2)) lo = m1 + 1;
else hi = m2 - 1;
}
// Check remaining candidates [lo, hi] (at most 3 elements)
int best = lo;
for (int x = lo + 1; x <= hi; x++)
if (f(x) > f(best)) best = x;
return best;
}
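For a self-contained test, the same search can be written to take `f` as a parameter instead of a global (an adaptation of the template above, not a separate technique):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Integer ternary search, adapted to take the unimodal function f as a parameter.
// Returns the argmax of f on the integer range [lo, hi].
template <class F>
int ternarySearchInt(int lo, int hi, F f) {
    while (hi - lo > 2) {                  // keep >= 3 candidates (see note above)
        int m1 = lo + (hi - lo) / 3;
        int m2 = hi - (hi - lo) / 3;
        if (f(m1) < f(m2)) lo = m1 + 1;    // peak is right of m1
        else hi = m2 - 1;                  // peak is left of m2
    }
    int best = lo;                         // brute-force the last <= 3 candidates
    for (int x = lo + 1; x <= hi; x++)
        if (f(x) > f(best)) best = x;
    return best;
}
// Example: f(x) = -(x-7)^2 is unimodal with its peak at x = 7:
//   ternarySearchInt(0, 100, [](int x){ return -(x-7)*(x-7); }) == 7
```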
Contrast with binary search:
| Binary Search | Ternary Search | |
|---|---|---|
| Requires | Monotone predicate | Unimodal function |
| Finds | Boundary (false→true) | Peak (maximum/minimum) |
| Each step eliminates | Half the range | One-third of the range |
| Iterations for ε precision | log₂(range/ε) | log₁.₅(range/ε) ≈ 1.7× more, with 2 evaluations each |
⚠️ Note: Ternary search on integers requires care — use
while (hi - lo > 2)to avoid infinite loops when the range shrinks to 2 or 3 elements, then brute-force the remaining candidates.
Chapter 3.4: Two Pointers & Sliding Window
📝 Before You Continue: You should be comfortable with arrays, vectors, and
`std::sort` (Chapters 2.3–3.3). The classic two-pointer technique requires a sorted array.
Two pointers and sliding window are among the most elegant tricks in competitive programming. They transform naive O(N²) solutions into O(N) by exploiting monotonicity: as one pointer moves forward, the other never needs to go backward.
3.4.1 The Two Pointer Technique
The idea: maintain two indices, left and right, into a sorted array. Move them toward each other (or in the same direction) based on the current sum/window.
When to use:
- Finding a pair/triplet with a given sum in a sorted array
- Checking if a sorted array contains two elements with a specific relationship
- Problems where "if we can do X with window size k, we can do X with window size k-1"
The diagram shows how two pointers converge toward the center, each step eliminating an entire row/column of pairs from consideration.
Problem: Find All Pairs with Sum = Target
Naïve O(N²) approach:
// O(N²): check every pair
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
if (arr[i] + arr[j] == target) {
cout << arr[i] << " + " << arr[j] << "\n";
}
}
}
Two Pointer O(N) approach (requires sorted array):
// Solution: Two Pointer — O(N log N) for sort + O(N) for search
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, target;
cin >> n >> target;
vector<int> arr(n);
for (int &x : arr) cin >> x;
sort(arr.begin(), arr.end()); // MUST sort first
int left = 0, right = n - 1;
while (left < right) {
int sum = arr[left] + arr[right];
if (sum == target) {
cout << arr[left] << " + " << arr[right] << " = " << target << "\n";
left++;
right--; // advance both pointers
} else if (sum < target) {
left++; // sum too small: move left pointer right (increase sum)
} else {
right--; // sum too large: move right pointer left (decrease sum)
}
}
return 0;
}
Why Does This Work?
Key insight: After sorting, if arr[left] + arr[right] < target, then no element smaller than arr[right] can pair with arr[left] to reach target. So we safely advance left.
Similarly, if the sum is too large, no element larger than arr[left] can pair with arr[right] to reach target. So we safely decrease right.
Each step eliminates at least one element from consideration → O(N) total steps.
Complete Trace
Array = [1, 2, 3, 4, 5, 6, 7, 8], target = 9:
State: left=0(1), right=7(8)
sum = 1+8 = 9 ✓ → print (1,8), left++, right--
State: left=1(2), right=6(7)
sum = 2+7 = 9 ✓ → print (2,7), left++, right--
State: left=2(3), right=5(6)
sum = 3+6 = 9 ✓ → print (3,6), left++, right--
State: left=3(4), right=4(5)
sum = 4+5 = 9 ✓ → print (4,5), left++, right--
State: left=4, right=3 → left >= right, STOP
All pairs: (1,8), (2,7), (3,6), (4,5)
3-Sum Extension
Finding a triplet that sums to target: fix one element, use two pointers for the remaining pair.
// O(N²) — much better than O(N³) brute force
sort(arr.begin(), arr.end());
for (int i = 0; i < n - 2; i++) {
int left = i + 1, right = n - 1;
while (left < right) {
int sum = arr[i] + arr[left] + arr[right];
if (sum == target) {
cout << arr[i] << " " << arr[left] << " " << arr[right] << "\n";
left++; right--;
} else if (sum < target) left++;
else right--;
}
}
3.4.2 Sliding Window — Fixed Size
A sliding window of fixed size K moves across an array, maintaining a running aggregate (sum, max, count of distinct, etc.).
Problem: Find the maximum sum of any contiguous subarray of size K.
Array: [2, 1, 5, 1, 3, 2], K=3
Windows: [2,1,5]=8, [1,5,1]=7, [5,1,3]=9, [1,3,2]=6
Answer: 9
Naïve O(NK): Compute sum from scratch for each window.
Sliding window O(N): Add the new element entering the window, subtract the element leaving.
// Solution: Sliding Window Fixed Size — O(N)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, k;
cin >> n >> k;
vector<int> arr(n);
for (int &x : arr) cin >> x;
// Compute sum of first window
long long windowSum = 0;
for (int i = 0; i < k; i++) windowSum += arr[i];
long long maxSum = windowSum;
// Slide the window: add arr[i], remove arr[i-k]
for (int i = k; i < n; i++) {
windowSum += arr[i]; // new element enters window
windowSum -= arr[i - k]; // old element leaves window
maxSum = max(maxSum, windowSum);
}
cout << maxSum << "\n";
return 0;
}
Trace for [2, 1, 5, 1, 3, 2], K=3:
Initial window [2,1,5]: sum=8, max=8
i=3: add 1, remove 2 → sum=7, max=8
i=4: add 3, remove 1 → sum=9, max=9
i=5: add 2, remove 5 → sum=6, max=9
Answer: 9 ✓
3.4.3 Sliding Window — Variable Size
The most powerful variant: the window expands when we need more, and shrinks when a constraint is violated.
Problem: Find the smallest contiguous subarray with sum ≥ target.
// Solution: Variable Window — O(N)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, target;
cin >> n >> target;
vector<int> arr(n);
for (int &x : arr) cin >> x;
int left = 0;
long long windowSum = 0;
int minLen = INT_MAX;
for (int right = 0; right < n; right++) {
windowSum += arr[right]; // expand: add right element
// Shrink window from left while constraint satisfied
while (windowSum >= target) {
minLen = min(minLen, right - left + 1);
windowSum -= arr[left];
left++; // shrink: remove left element
}
}
if (minLen == INT_MAX) cout << 0 << "\n"; // no such subarray
else cout << minLen << "\n";
return 0;
}
Why O(N)? Each element is added once (when right passes it) and removed at most once (when left passes it). Total operations: O(2N) = O(N).
Problem: Longest Subarray with At Most K Distinct Values
// Variable window: longest subarray with at most K distinct values
int left = 0, maxLen = 0;
map<int, int> freq; // frequency of each value in window
for (int right = 0; right < n; right++) {
freq[arr[right]]++;
// Shrink while we have > k distinct values
while ((int)freq.size() > k) {
freq[arr[left]]--;
if (freq[arr[left]] == 0) freq.erase(arr[left]);
left++;
}
maxLen = max(maxLen, right - left + 1);
}
cout << maxLen << "\n";
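Wrapped as a function for testing (a sketch; `longestKDistinct` is our name — the `map` operations make this `O(N log N)` overall):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Longest contiguous subarray with at most k distinct values.
int longestKDistinct(const vector<int>& arr, int k) {
    int left = 0, maxLen = 0;
    map<int,int> freq;                        // value -> count inside window
    for (int right = 0; right < (int)arr.size(); right++) {
        freq[arr[right]]++;
        while ((int)freq.size() > k) {        // too many distinct: shrink left
            if (--freq[arr[left]] == 0) freq.erase(arr[left]);
            left++;
        }
        maxLen = max(maxLen, right - left + 1);
    }
    return maxLen;
}
```

For `{1, 2, 1, 2, 3}` with k = 2, the best window is `[1, 2, 1, 2]`, length 4.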
3.4.4 USACO Example: Haybale Stacking
Problem (USACO 2012 November Bronze): N haybales in a line. M operations, each adds 1 to all bales in range [a, b]. How many bales have an odd number of additions at the end?
This is best solved with a difference array (Chapter 3.2). Here is a simpler, related problem that fits the two-pointer pattern:
Problem: Given array of integers, find the longest subarray where all elements are ≥ K.
// Two pointer: longest contiguous subarray where all elements >= K
int left = 0, maxLen = 0;
for (int right = 0; right < n; right++) {
if (arr[right] < K) {
left = right + 1; // reset window: current element violates constraint
} else {
maxLen = max(maxLen, right - left + 1);
}
}
⚠️ Common Mistakes
- Not sorting before two-pointer: The two-pointer technique for pair sum only works on sorted arrays. Without sorting, you'll miss pairs or get wrong answers.
- Moving both pointers when a pair is found: When you find a matching pair, you must move BOTH `left++` AND `right--`. Moving only one misses some pairs (unless duplicates aren't relevant).
- Off-by-one in window size: The window `[left, right]` has size `right - left + 1`, not `right - left`.
- Forgetting to handle empty answer: For the "minimum subarray" problem, initialize `minLen = INT_MAX` and check whether it changed before outputting.
Chapter Summary
📌 Key Takeaways
| Technique | Constraint | Time | Space | Key Idea |
|---|---|---|---|---|
| Two pointer (pairs) | Sorted array | O(N) | O(1) | Approach from both ends, eliminate impossible pairs |
| Two pointer (3-sum) | Sorted array | O(N²) | O(1) | Fix one, use two pointers on the rest |
| Sliding window (fixed) | Any | O(N) | O(1) | Add new element, remove old element |
| Sliding window (variable) | Any | O(N) | O(1)–O(N) | Expand right end, shrink left end |
❓ FAQ
Q1: Does two-pointer always require sorting?
A: Not necessarily. "Opposite-direction two pointers" (like pair sum) require sorting; "same-direction two pointers" (like sliding window) do not. The key is monotonicity — pointers only move in one direction.
Q2: Both sliding window and prefix sum can compute range sums — which to use?
A: For fixed-size window sum/max, sliding window is more intuitive. For arbitrary range queries, prefix sum is more general. Sliding window can only handle "continuously moving windows"; prefix sum can answer any [L,R] query.
Q3: Can sliding window handle both "longest subarray satisfying condition" and "shortest subarray satisfying condition"?
A: Both, but with slightly different logic. "Longest": expand right until condition fails, then shrink left until condition holds again. "Shortest": expand right until condition holds, then shrink left until it no longer holds, recording the minimum length throughout.
Q4: How does two-pointer handle duplicate elements?
A: Depends on the problem. If you want "all distinct pair values", after finding a pair do
`left++; right--` and skip duplicate values. If you want "count of all pairs", you need to count duplicates carefully (extra counting logic may be required).
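The "count of all pairs" variant can be sketched as follows (the function name `countPairs` is ours); runs of equal values at both ends are counted in one step:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Count pairs (i < j) with arr[i] + arr[j] == T, duplicates included.
long long countPairs(vector<int> arr, int T) {
    sort(arr.begin(), arr.end());
    long long count = 0;
    int left = 0, right = (int)arr.size() - 1;
    while (left < right) {
        int sum = arr[left] + arr[right];
        if (sum < T) left++;
        else if (sum > T) right--;
        else if (arr[left] == arr[right]) {
            // All elements in [left, right] are equal: choose any 2 of them
            long long cnt = right - left + 1;
            count += cnt * (cnt - 1) / 2;
            break;
        } else {
            long long c1 = 1, c2 = 1;       // lengths of the two equal runs
            while (arr[left + 1] == arr[left])   { left++;  c1++; }
            while (arr[right - 1] == arr[right]) { right--; c2++; }
            count += c1 * c2;               // every cross combination is a pair
            left++; right--;
        }
    }
    return count;
}
```

For `{1, 1, 2, 2, 3}` with T = 4, the pairs are (1,3) twice and (2,2) once, giving 3.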
🔗 Connections to Later Chapters
- Chapter 3.2 (Prefix Sums): prefix sums and sliding window are complementary — prefix sums suit offline queries, sliding window suits online processing
- Chapter 3.3 (Sorting): sorting is a prerequisite for two pointers — opposite-direction two pointers require a sorted array
- Chapter 3.5 (Monotonic): a monotonic deque can enhance sliding window — maintaining the window min/max in `O(N)`
- Chapters 6.1–6.3 (DP): some problems (like LIS variants) can be optimized with two pointers
Practice Problems
Problem 3.4.1 — Pair Sum Count 🟢 Easy
Given N integers and a target T, count the number of pairs (i < j) with arr[i] + arr[j] = T.
Hint
Sort the array first. Use two pointers from both ends. When a pair is found, both advance. Handle duplicate elements carefully.
Problem 3.4.2 — Maximum Average Subarray 🟡 Medium Find the contiguous subarray of length exactly K with the maximum average. Print the average as a fraction or decimal.
Hint
Use a fixed-size sliding window to find the maximum sum of K elements. Average = maxSum / K.
Problem 3.4.3 — Minimum Window Covering 🔴 Hard Given string S and string T, find the shortest substring of S that contains all characters of T.
Hint
Variable sliding window. Use a frequency map of the T characters needed. Expand right until all T chars are covered; shrink left while still covered. Track the minimum window length.
🏆 Challenge: USACO 2017 February Bronze — Why Did the Cow Cross the Road Given a grid with cows and their destinations, find which cow can reach its destination fastest. Use two-pointer / greedy on sorted intervals.
Chapter 3.5: Monotonic Stack & Monotonic Queue
📝 Before You Continue: Make sure you're comfortable with two pointers / sliding window (Chapter 3.4) and basic stack/queue operations (Chapter 3.1). This chapter builds directly on those techniques.
Monotonic stacks and queues are elegant tools that solve "nearest greater/smaller element" and "sliding window extremum" problems in O(N) time — problems that would naively require O(N²).
3.5.1 Monotonic Stack: Next Greater Element
Problem: Given an array A of N integers, for each element A[i], find the next greater element (NGE): the index of the first element to the right of i that is greater than A[i]. If none exists, output -1.
Naive approach: O(N²) — for each i, scan right until finding a greater element.
Monotonic stack approach: O(N) — maintain a stack that is always decreasing from bottom to top. When we push a new element, pop all smaller elements first (they just found their NGE!).
💡 Key Insight: The stack contains indices of elements that haven't found their NGE yet. When A[i] arrives, every element in the stack that is smaller than A[i] has found its NGE (it's i!). We pop them and record the answer.
Monotonic stack state transitions (A=[2,1,5,6,2,3]):
flowchart LR
subgraph i0["i=0, A[0]=2"]
direction TB
ST0["Stack: [0]↓\nbottom→top: [2]"]
end
subgraph i1["i=1, A[1]=1"]
direction TB
ST1["1<2, push directly\nStack: [0,1]↓\nbottom→top: [2,1]"]
end
subgraph i2["i=2, A[2]=5"]
direction TB
ST2["5>1: pop 1, NGE[1]=2\n5>2: pop 0, NGE[0]=2\nStack: [2]↓\nbottom→top: [5]"]
end
subgraph i3["i=3, A[3]=6"]
direction TB
ST3["6>5: pop 2, NGE[2]=3\nStack: [3]↓\nbottom→top: [6]"]
end
subgraph i5["i=5, A[5]=3"]
direction TB
ST5["3>2: pop 4, NGE[4]=5\n3<6, push\nStack: [3,5]↓\nremaining have no NGE"]
end
i0 --> i1 --> i2 --> i3 --> i5
style ST2 fill:#dcfce7,stroke:#16a34a
style ST3 fill:#dcfce7,stroke:#16a34a
style ST5 fill:#dcfce7,stroke:#16a34a
💡 Summary: the stack always stays monotonically decreasing (largest at the bottom, smallest on top). Each element is pushed at most once and popped at most once, so total operations are O(2N) = O(N).
Array A: [2, 1, 5, 6, 2, 3]
idx: 0 1 2 3 4 5
Processing i=0 (A[0]=2): stack empty → push 0
Stack: [0] // stack holds indices of unresolved elements
Processing i=1 (A[1]=1): A[1]=1 < A[0]=2 → just push
Stack: [0, 1]
Processing i=2 (A[2]=5):
A[2]=5 > A[1]=1 → pop 1, NGE[1] = 2 (A[2]=5 is next greater for A[1])
A[2]=5 > A[0]=2 → pop 0, NGE[0] = 2 (A[2]=5 is next greater for A[0])
Stack empty → push 2
Stack: [2]
Processing i=3 (A[3]=6):
A[3]=6 > A[2]=5 → pop 2, NGE[2] = 3
Push 3
Stack: [3]
Processing i=4 (A[4]=2): A[4]=2 < A[3]=6 → just push
Stack: [3, 4]
Processing i=5 (A[5]=3):
A[5]=3 > A[4]=2 → pop 4, NGE[4] = 5
A[5]=3 < A[3]=6 → stop, push 5
Stack: [3, 5]
End: remaining stack [3, 5] → NGE[3] = NGE[5] = -1 (no greater element to the right)
Result: NGE = [2, 2, 3, -1, 5, -1]
Verify:
A[0]=2, next greater is A[2]=5 ✓
A[1]=1, next greater is A[2]=5 ✓
A[2]=5, next greater is A[3]=6 ✓
A[3]=6, no greater → -1 ✓
Complete Implementation
// Solution: Next Greater Element using Monotonic Stack — O(N)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> A(n);
for (int& x : A) cin >> x;
vector<int> nge(n, -1); // nge[i] = index of next greater element, -1 if none
stack<int> st; // monotonic decreasing stack (stores indices)
for (int i = 0; i < n; i++) {
// While the top of stack has a smaller value than A[i]
// → the current element A[i] is the NGE of all those elements
while (!st.empty() && A[st.top()] < A[i]) {
nge[st.top()] = i; // ← KEY: record NGE for stack top
st.pop();
}
st.push(i); // push current index (not yet resolved)
}
// Remaining elements in stack have no NGE → already initialized to -1
for (int i = 0; i < n; i++) {
cout << nge[i];
if (i < n - 1) cout << " ";
}
cout << "\n";
return 0;
}
Complexity Analysis:
- Each element is pushed exactly once and popped at most once
- Total operations: O(2N) = O(N)
- Space: O(N) for the stack
⚠️ Common Mistake: Storing values instead of indices in the stack. Always store indices — you need to know where in the array to record the answer.
3.5.2 Variations: Previous Smaller, Previous Greater
By changing whether the stack is increasing or decreasing, and whether the answer is recorded when popping or read from the stack top before pushing, you get four related problems (all solvable in a single left-to-right scan):
| Problem | Stack Type | Answer Recorded | Use Case |
|---|---|---|---|
| Next Greater Element | Decreasing | When popped (by the arriving element) | Stock price problems |
| Next Smaller Element | Increasing | When popped (by the arriving element) | Histogram problems |
| Previous Greater | Decreasing | Stack top before pushing | Range problems |
| Previous Smaller | Increasing | Stack top before pushing | Nearest smaller to left |
Template for Previous Smaller Element (this uses the left-to-right alternative: pop everything ≥ A[i], then the surviving stack top is the answer):
// Previous Smaller Element: for each i, find the nearest j < i where A[j] < A[i]
vector<int> pse(n, -1); // pse[i] = index of previous smaller, -1 if none
stack<int> st;
for (int i = 0; i < n; i++) {
while (!st.empty() && A[st.top()] >= A[i]) {
st.pop(); // pop elements that are >= A[i] (not the "previous smaller")
}
pse[i] = st.empty() ? -1 : st.top(); // stack top is the previous smaller
st.push(i);
}
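The Next Smaller Element row of the table mirrors the NGE code with the comparison flipped. A sketch (`nextSmaller` is our name for it, not a template from elsewhere in this book):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Next Smaller Element: for each i, the index of the first j > i
// with A[j] < A[i], or -1 if none (used for histogram boundaries).
vector<int> nextSmaller(const vector<int>& A) {
    int n = A.size();
    vector<int> nse(n, -1);
    stack<int> st; // monotonic increasing stack of indices
    for (int i = 0; i < n; i++) {
        while (!st.empty() && A[st.top()] > A[i]) {
            nse[st.top()] = i; // A[i] is the next smaller for st.top()
            st.pop();
        }
        st.push(i);
    }
    return nse;
}
// Example: A = [2, 1, 5, 6, 2, 3] → nse = [1, -1, 4, 4, -1, -1]
```

Note the only differences from the NGE code: the stack is increasing rather than decreasing, and the pop condition uses > instead of <.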
3.5.3 USACO Application: Largest Rectangle in Histogram
Problem: Given an array of heights H[0..N-1], find the area of the largest rectangle that fits under the histogram.
Key insight: For each bar i, the largest rectangle with height H[i] extends left and right until it hits a shorter bar. Use monotonic stack to find, for each i:
- left[i] = index of the previous smaller element
- right[i] = index of the next smaller element
Boundary computation for the largest rectangle in a histogram (H=[2,1,5,6,2,3]):
flowchart LR
subgraph bars["Left/right boundaries for each bar"]
direction TB
B0["i=0, H=2\nleft=-1, right=1\nwidth=1, area=2"]
B1["i=1, H=1\nleft=-1, right=6\nwidth=6, area=6"]
B2["i=2, H=5\nleft=1, right=4\nwidth=2, area=10 ⭐"]
B3["i=3, H=6\nleft=2, right=4\nwidth=1, area=6"]
B4["i=4, H=2\nleft=1, right=6\nwidth=4, area=8"]
B5["i=5, H=3\nleft=4, right=6\nwidth=1, area=3"]
end
note["Max area = 10\n(i=2, height=5, width=2)"]
style B2 fill:#dcfce7,stroke:#16a34a
style note fill:#f0fdf4,stroke:#16a34a
💡 Formula:
width = right[i] - left[i] - 1, area = H[i] × width. The left boundary is the index of the first smaller element to the left; the right boundary is the index of the first smaller element to the right.
// Solution: Largest Rectangle in Histogram — O(N)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> H(n);
for (int& h : H) cin >> h;
// Find previous smaller for each position
vector<int> left(n), right(n);
stack<int> st;
// Previous smaller (left boundary)
for (int i = 0; i < n; i++) {
while (!st.empty() && H[st.top()] >= H[i]) st.pop();
left[i] = st.empty() ? -1 : st.top(); // index before rectangle starts
st.push(i);
}
while (!st.empty()) st.pop();
// Next smaller (right boundary)
for (int i = n - 1; i >= 0; i--) {
while (!st.empty() && H[st.top()] >= H[i]) st.pop();
right[i] = st.empty() ? n : st.top(); // index after rectangle ends
st.push(i);
}
// Compute maximum area
long long maxArea = 0;
for (int i = 0; i < n; i++) {
long long width = right[i] - left[i] - 1; // width of rectangle
long long area = (long long)H[i] * width;
maxArea = max(maxArea, area);
}
cout << maxArea << "\n";
return 0;
}
Trace for H = [2, 1, 5, 6, 2, 3]:
left = [-1, -1, 1, 2, 1, 4] (index of previous smaller, -1 = none)
right = [1, 6, 4, 4, 6, 6] (index of next smaller, n=6 = none)
Widths: 1-(-1)-1=1, 6-(-1)-1=6, 4-1-1=2, 4-2-1=1, 6-1-1=4, 6-4-1=1
Areas: 2×1=2, 1×6=6, 5×2=10, 6×1=6, 2×4=8, 3×1=3
Maximum area = 10
i=2: H[2]=5, left[2]=1, right[2]=4, width=4-1-1=2, area=5×2=10 ✓
(bars at indices 2 and 3 both have height ≥ 5, so the rectangle of height 5 spans width 2)
📌 Note for Students: Always trace through your algorithm on the sample input before submitting. Small off-by-one errors in index boundary calculations are the #1 source of bugs in monotonic stack problems.
3.5.4 Monotonic Deque: Sliding Window Maximum
Problem: Given array A of N integers and window size K, find the maximum value in each window of size K as it slides from left to right. Output N-K+1 values.
Naive approach: O(NK) — scan each window for its maximum.
Monotonic deque approach: O(N) — maintain a decreasing deque (front = maximum of current window).
💡 Key Insight: We want the maximum in a sliding window. We maintain a deque of indices such that:
- The deque is decreasing in value (front is always the maximum)
- The deque only contains indices within the current window
When a new element arrives:
- Remove all smaller elements from the back (they can never be the maximum while this new element is in the window)
- Remove the front if it's outside the current window
Step-by-Step Trace
Array A: [1, 3, -1, -3, 5, 3, 6, 7], K = 3
Window [1,3,-1]: max = 3
Window [3,-1,-3]: max = 3
Window [-1,-3,5]: max = 5
Window [-3,5,3]: max = 5
Window [5,3,6]: max = 6
Window [3,6,7]: max = 7
i=0, A[0]=1: deque=[0]
i=1, A[1]=3: 3>1 → pop 0; deque=[1]
i=2, A[2]=-1: -1<3 → push; deque=[1,2]; window [0..2]: max=A[1]=3 ✓
i=3, A[3]=-3: -3<-1 → push; deque=[1,2,3]; window [1..3]: front=1 still in window, max=A[1]=3 ✓
i=4, A[4]=5: 5>-3→pop 3; 5>-1→pop 2; 5>3→pop 1; deque=[4]; window [2..4]: max=A[4]=5 ✓
i=5, A[5]=3: 3<5→push; deque=[4,5]; window [3..5]: front=4 in window, max=A[4]=5 ✓
i=6, A[6]=6: 6>3→pop 5; 6>5→pop 4; deque=[6]; window [4..6]: max=A[6]=6 ✓
i=7, A[7]=7: 7>6→pop 6; deque=[7]; window [5..7]: max=A[7]=7 ✓
Complete Implementation
// Solution: Sliding Window Maximum using Monotonic Deque — O(N)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, k;
cin >> n >> k;
vector<int> A(n);
for (int& x : A) cin >> x;
deque<int> dq; // monotonic decreasing deque, stores indices
vector<int> result;
for (int i = 0; i < n; i++) {
// 1. Remove elements outside the current window
while (!dq.empty() && dq.front() <= i - k) {
dq.pop_front(); // ← KEY: expired window front
}
// 2. Maintain decreasing property
// Remove from back all elements smaller than A[i]
// (they'll never be the max while A[i] is in the window)
while (!dq.empty() && A[dq.back()] <= A[i]) {
dq.pop_back(); // ← KEY: pop smaller elements from back
}
dq.push_back(i); // add current element
// 3. Record maximum once first full window is formed
if (i >= k - 1) {
result.push_back(A[dq.front()]); // front = maximum of current window
}
}
for (int i = 0; i < (int)result.size(); i++) {
cout << result[i];
if (i + 1 < (int)result.size()) cout << "\n";
}
cout << "\n";
return 0;
}
Complexity:
- Each element is pushed/popped from the deque at most once → O(N) total
- Space: O(K) for the deque
⚠️ Common Mistake #1: Forgetting to check dq.front() <= i - k for window expiration. The deque must only contain indices in [i-k+1, i].
⚠️ Common Mistake #2: Using < instead of <= when popping from the back. With <, duplicates of the current value linger in the deque; <= keeps the deque strictly decreasing. Both give correct maxima, but <= keeps the deque smaller.
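The minimum variant mentioned in Mistake #2 flips both back-pop comparisons. A sketch under the same index-storing convention (`windowMin` is our name):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Sliding window MINIMUM: same deque pattern with the comparison flipped.
// The deque is increasing in value, so the front is the window minimum.
vector<int> windowMin(const vector<int>& A, int k) {
    deque<int> dq; // stores indices
    vector<int> res;
    for (int i = 0; i < (int)A.size(); i++) {
        while (!dq.empty() && dq.front() <= i - k) dq.pop_front(); // expired
        while (!dq.empty() && A[dq.back()] >= A[i]) dq.pop_back(); // keep increasing
        dq.push_back(i);
        if (i >= k - 1) res.push_back(A[dq.front()]); // front = minimum
    }
    return res;
}
// Example: A = [1, 3, -1, -3, 5, 3, 6, 7], k = 3 → [-1, -3, -3, -3, 3, 3]
```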
3.5.5 USACO Problem: Haybale Stacking (Monotonic Stack)
🔗 Inspiration: This problem type appears in USACO Bronze/Silver ("Haybale Stacking" style).
Problem: There are N positions on a number line. You have K operations: each operation sets all positions in [L, R] to 1. After all operations, output 1 for each position that was set, 0 otherwise.
Solution: Difference array (Chapter 3.2). But let's see a harder variant:
Harder Variant: Given an array H of N "heights," for each position i find its span: the number of consecutive positions ending at i whose heights are all ≤ H[i].
This is exactly the "stock span problem" and is solved with a monotonic stack — the previous greater element pattern (pop everything ≤ H[i]; the surviving stack top is the previous strictly greater element).
// Stock Span Problem: for each day i, find how many consecutive days
// before i had price <= price[i]
// (the "span" of day i)
vector<int> stockSpan(vector<int>& prices) {
int n = prices.size();
vector<int> span(n, 1);
stack<int> st; // monotonic decreasing stack of indices
for (int i = 0; i < n; i++) {
while (!st.empty() && prices[st.top()] <= prices[i]) {
st.pop();
}
span[i] = st.empty() ? (i + 1) : (i - st.top());
st.push(i);
}
return span;
}
// span[i] = number of consecutive days up to and including i with price <= prices[i]
3.5.6 USACO-Style Problem: Barn Painting Temperatures
Problem: N readings, find the maximum value in each window of size K.
(This is the sliding window maximum — solution already shown in 3.5.4.)
A trickier USACO variant: Given N cows in a line, each with temperature T[i]. A "fever cluster" is a maximal contiguous subarray where all temperatures are above threshold X. Find the maximum cluster size for each of Q threshold queries.
Offline approach: Sort queries by X, process with monotonic deque.
⚠️ Common Mistakes in Chapter 3.5
1. Storing values instead of indices — Always store indices. You need them to check window bounds and to record answers.
2. Wrong comparison in deque (< vs <=) — For the sliding window MAXIMUM, pop from the back while A[dq.back()] <= A[i]; for the MINIMUM, pop while A[dq.back()] >= A[i].
3. Forgetting window expiration — In the sliding window deque, always check dq.front() <= i - k (equivalently dq.front() < i - k + 1) before recording the maximum.
4. Stack bottom-to-top direction confusion — The "monotonic" property means: read bottom-to-top, the stack is decreasing (for NGE) or increasing (for NSE). Draw it out if confused.
5. Processing order for NGE vs PGE:
   - Next Greater Element: left-to-right traversal, record on pop
   - Previous Greater Element: right-to-left traversal, record on pop (OR: left-to-right, read stack.top() before pushing)
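The left-to-right alternative in point 5 can be sketched as follows (`prevGreater` is our name for it):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Previous Greater Element, left-to-right: before pushing i,
// pop everything <= A[i]; the surviving top (if any) is the answer.
vector<int> prevGreater(const vector<int>& A) {
    int n = A.size();
    vector<int> pge(n, -1);
    stack<int> st; // strictly decreasing stack of indices
    for (int i = 0; i < n; i++) {
        while (!st.empty() && A[st.top()] <= A[i]) st.pop();
        pge[i] = st.empty() ? -1 : st.top(); // read top BEFORE pushing
        st.push(i);
    }
    return pge;
}
// Example: A = [2, 1, 5, 6, 2, 3] → pge = [-1, 0, -1, -1, 3, 3]
```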
Chapter Summary
📌 Key Summary
| Problem | Data Structure | Time Complexity | Key Operation |
|---|---|---|---|
| Next Greater Element (NGE) | Monotone decreasing stack | O(N) | Pop when larger element found |
| Previous Smaller Element (PSE) | Monotone increasing stack | O(N) | Stack top is answer before push |
| Largest Rectangle in Histogram | Monotone stack (two passes) | O(N) | Left boundary + right boundary + width |
| Sliding Window Maximum | Monotone decreasing deque | O(N) | Maintain window + maintain decreasing property |
🧩 Template Quick Reference
// Monotone decreasing stack (for NGE / Next Greater Element)
stack<int> st;
for (int i = 0; i < n; i++) {
while (!st.empty() && A[st.top()] < A[i]) {
answer[st.top()] = i; // i is the NGE of st.top()
st.pop();
}
st.push(i);
}
// Monotone decreasing deque (sliding window maximum)
deque<int> dq;
for (int i = 0; i < n; i++) {
while (!dq.empty() && dq.front() <= i - k) dq.pop_front(); // remove expired
while (!dq.empty() && A[dq.back()] <= A[i]) dq.pop_back(); // maintain monotone
dq.push_back(i);
if (i >= k - 1) ans.push_back(A[dq.front()]);
}
❓ FAQ
Q1: Should the monotone stack store values or indices?
A: Always store indices. Even if you only need values, storing indices is more flexible — you can get the value via
A[idx], but not vice versa. Especially when computing widths (e.g., histogram problems), indices are required.
Q2: How do I decide between monotone stack and two pointers?
A: Look at the problem structure — if you need "for each element, find the first greater/smaller element to its left/right", use monotone stack. If you need "maintain the maximum of a sliding window", use monotone deque. If "two pointers moving toward each other from both ends", use two pointers.
Q3: Why is the time complexity of monotone stack O(N) and not O(N²)?
A: Amortized analysis. Each element is pushed at most once and popped at most once, totaling 2N operations, so O(N). Although a single while loop may pop multiple times, the total number of pops across all while loops never exceeds N.
Practice Problems
Problem 3.5.1 — Next Greater Element 🟢 Easy For each element in an array, find the first element to its right that is greater. Print -1 if none exists.
Hint: Maintain a monotonic decreasing stack of indices. When processing A[i], pop all smaller elements from the stack (they have found their NGE).
Problem 3.5.2 — Daily Temperatures 🟢 Easy For each day, find how many days you have to wait until a warmer temperature. (LeetCode 739 style)
Hint: This is exactly NGE. Answer[i] = NGE_index[i] - i. Use a monotonic decreasing stack.
Problem 3.5.3 — Sliding Window Maximum 🟡 Medium Find the maximum in each sliding window of size K.
Hint: Use a monotonic decreasing deque. Keep deque indices in range [i-k+1, i]. Front = max.
Problem 3.5.4 — Largest Rectangle in Histogram 🟡 Medium Find the largest rectangle that fits in a histogram.
Hint: For each bar, find the previous smaller (left boundary) and next smaller (right boundary). Width = right - left - 1. Area = height × width.
Problem 3.5.5 — Trapping Rain Water 🔴 Hard Given an elevation map, compute how much water can be trapped after raining. (Classic problem)
Hint: For each position i, water = min(max_left[i], max_right[i]) - height[i]. Can be solved with: (1) prefix/suffix max arrays O(N), (2) two pointers O(N), or (3) a monotonic stack O(N).
🏆 Challenge: USACO 2016 February Silver: Fencing the Cows — Given a polygon, determine whether a point is inside. Use ray casting — involves careful implementation with edge cases.
Chapter 3.6: Stacks, Queues & Deques
These three data structures control the order in which elements are processed. Each has a unique "personality" that makes it perfect for specific types of problems.
- Stack: Last In, First Out (like a stack of plates)
- Queue: First In, First Out (like a line at a store)
- Deque: Double-ended — insert/remove from both ends
3.6.1 Stack Deep Dive
We introduced stack in Chapter 3.1. Let's use it to solve real problems.
Visual: Stack Operations
The diagram above illustrates the LIFO (Last In, First Out) property with step-by-step push and pop operations. Note how pop() always removes the most-recently-pushed element — this is what makes stacks ideal for matching brackets, DFS, and undo operations.
The Balanced Brackets Problem
Problem: Given a string of brackets ()[]{}, determine if they're properly nested.
#include <bits/stdc++.h>
using namespace std;
bool isBalanced(const string &s) {
stack<char> st;
for (char ch : s) {
if (ch == '(' || ch == '[' || ch == '{') {
st.push(ch); // opening bracket: push onto stack
} else {
// closing bracket: must match the most recent opening
if (st.empty()) return false; // no matching opening bracket
char top = st.top();
st.pop();
// Check if it matches
if (ch == ')' && top != '(') return false;
if (ch == ']' && top != '[') return false;
if (ch == '}' && top != '{') return false;
}
}
return st.empty(); // all brackets matched if stack is empty
}
int main() {
cout << isBalanced("()[]{}") << "\n"; // 1 (true)
cout << isBalanced("([]){}") << "\n"; // 1 (true)
cout << isBalanced("([)]") << "\n"; // 0 (false)
cout << isBalanced("(()") << "\n"; // 0 (false — unmatched '(')
return 0;
}
The "Next Greater Element" Problem
Problem: For each element in an array, find the next element to its right that is strictly greater. If none exists, output -1.
This is a classic monotonic stack problem.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> A(n);
for (int &x : A) cin >> x;
vector<int> answer(n, -1); // default: -1 (no greater element)
stack<int> st; // stores indices of elements awaiting their answer
for (int i = 0; i < n; i++) {
// While stack is non-empty and current element > element at stack's top index
while (!st.empty() && A[i] > A[st.top()]) {
answer[st.top()] = A[i]; // A[i] is the next greater element for st.top()
st.pop();
}
st.push(i); // push current index (waiting for a larger element later)
}
for (int x : answer) cout << x << " ";
cout << "\n";
return 0;
}
Trace for [3, 1, 4, 1, 5, 9, 2, 6]:
- i=0: push 0. Stack: [0]
- i=1: A[1]=1 ≤ A[0]=3, push 1. Stack: [0,1]
- i=2: A[2]=4 > A[1]=1 → answer[1]=4, pop. A[2]=4 > A[0]=3 → answer[0]=4, pop. Push 2.
- i=3: push 3. Stack: [2,3]
- i=4: A[4]=5 > A[3]=1 → answer[3]=5. A[4]=5 > A[2]=4 → answer[2]=5. Push 4.
- i=5: A[5]=9 > A[4]=5 → answer[4]=9. Push 5. Stack: [5]
- i=6: push 6. Stack: [5,6]
- i=7: A[7]=6 > A[6]=2 → answer[6]=6. Push 7.
- Remaining on stack (5, 7): answer stays -1.
Output: 4 4 5 5 9 -1 6 -1
Key insight: A monotonic stack maintains elements in strictly increasing or decreasing order. When a new element breaks that order, it "solves" all the elements it is greater than. The total work is O(n) because each element is pushed and popped at most once.
3.6.2 Queue and BFS Preparation
The queue's FIFO property makes it perfect for Breadth-First Search (BFS), which we cover in Chapter 5.2. Here we focus on the queue itself and related patterns.
Visual: Queue Operations
The queue processes elements in order of arrival: the front element is always dequeued next, while new elements join at the back. This FIFO property ensures BFS visits nodes level-by-level, guaranteeing shortest-path distances.
Simulation with a Queue
Problem: A theme park ride has N groups of people. Each group has size[i]. The ride holds at most M people per run. Simulate how many runs are needed to take everyone.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
queue<int> groups;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
groups.push(x);
}
int runs = 0;
while (!groups.empty()) {
int capacity = m; // remaining capacity for this run
runs++;
// Note: assumes every group fits on an empty ride (size[i] <= m);
// a group larger than m would make the outer loop spin forever.
while (!groups.empty() && groups.front() <= capacity) {
capacity -= groups.front(); // fit this group
groups.pop();
}
}
cout << runs << "\n";
return 0;
}
3.6.3 Deque — Double-Ended Queue
A deque (pronounced "deck") supports O(1) insertion and removal at both the front and back.
#include <bits/stdc++.h>
using namespace std;
int main() {
deque<int> dq;
dq.push_back(1); // [1]
dq.push_back(2); // [1, 2]
dq.push_front(0); // [0, 1, 2]
dq.push_front(-1); // [-1, 0, 1, 2]
cout << dq.front() << "\n"; // -1
cout << dq.back() << "\n"; // 2
dq.pop_front(); // [-1 removed] → [0, 1, 2]
dq.pop_back(); // [2 removed] → [0, 1]
cout << dq.front() << "\n"; // 0
cout << dq.size() << "\n"; // 2
// Random access (like a vector)
cout << dq[0] << "\n"; // 0
cout << dq[1] << "\n"; // 1
return 0;
}
3.6.4 Monotonic Deque — Sliding Window Maximum
Problem: Given an array A of N integers and a window of size K, find the maximum value in each window as it slides from left to right.
Naive approach: for each window, scan all K elements → O(N×K). Too slow for large K.
Monotonic deque approach: O(N).
The deque stores indices of elements in decreasing order of their values. The front is always the maximum.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, k;
cin >> n >> k;
vector<int> A(n);
for (int &x : A) cin >> x;
deque<int> dq; // stores indices; values A[dq[i]] are decreasing
vector<int> maxInWindow;
for (int i = 0; i < n; i++) {
// Remove elements outside the window (front is too old)
while (!dq.empty() && dq.front() <= i - k) {
dq.pop_front();
}
// Remove elements from back that are smaller than A[i]
// (they can never be the maximum for future windows)
while (!dq.empty() && A[dq.back()] <= A[i]) {
dq.pop_back();
}
dq.push_back(i); // add current index
// Window is full starting at i = k-1
if (i >= k - 1) {
maxInWindow.push_back(A[dq.front()]); // front is always the max
}
}
for (int x : maxInWindow) cout << x << " ";
cout << "\n";
return 0;
}
Sample Input:
8 3
1 3 -1 -3 5 3 6 7
Sample Output:
3 3 5 5 6 7
Windows: [1,3,-1]=3, [3,-1,-3]=3, [-1,-3,5]=5, [-3,5,3]=5, [5,3,6]=6, [3,6,7]=7.
3.6.5 Stack-Based: Largest Rectangle in Histogram
A classic competitive programming problem: given N bars of heights h[0..N-1], find the largest rectangle that fits within the histogram.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> h(n);
for (int &x : h) cin >> x;
stack<int> st; // stores indices of bars in increasing height order
long long maxArea = 0;
for (int i = 0; i <= n; i++) {
int currentH = (i == n) ? 0 : h[i]; // sentinel 0 at the end
while (!st.empty() && h[st.top()] > currentH) {
int height = h[st.top()]; // height of the rectangle
st.pop();
int width = st.empty() ? i : i - st.top() - 1; // width
maxArea = max(maxArea, (long long)height * width);
}
st.push(i);
}
cout << maxArea << "\n";
return 0;
}
⚠️ Common Mistakes in Chapter 3.6
| # | Mistake | Why It's Wrong | Fix |
|---|---|---|---|
| 1 | Calling top()/front() on empty stack/queue | Undefined behavior, program crashes | Check !st.empty() first |
| 2 | Wrong comparison direction in monotonic stack | "Next Greater" needs > but used <, gets "Next Smaller" | Read carefully, verify with examples |
| 3 | Forgetting to remove expired elements in sliding window | Front index of deque is out of window range, wrong result | while (dq.front() <= i - k) |
| 4 | Forgetting sentinel in histogram max rectangle | Remaining stack elements unprocessed, missing final answer | Use height 0 when i == n |
| 5 | Confusing stack and deque | stack can only access top, cannot traverse middle elements | Use deque when two-end operations needed |
Chapter Summary
📌 Key Takeaways
| Structure | Operations | Key Use Cases | Why It Matters |
|---|---|---|---|
stack<T> | push/pop/top — O(1) | Bracket matching, undo/redo, DFS | Core tool for LIFO logic |
queue<T> | push/pop/front — O(1) | BFS, simulating queues | Core tool for FIFO logic |
deque<T> | push/pop front & back — O(1) | Sliding window, BFS variants | Versatile container with two-end access |
| Monotonic stack | O(n) total | Next Greater/Smaller Element | High-frequency USACO Silver topic |
| Monotonic deque | O(n) total | Sliding Window Max/Min | O(N) solution for window extremes |
❓ FAQ
Q1: Why is the monotonic stack O(N) and not O(N²)? It looks like there's a nested loop.
A: Key observation — each element is pushed at most once and popped at most once. Although the inner while loop may pop multiple elements at once, the total number of pops globally is ≤ N. So total operations ≤ 2N =
O(N). This analysis method is called amortized analysis.
Q2: When to use stack vs deque?
A: If you only need LIFO (one-end access), use stack; if you need two-end operations (e.g., a sliding window needs front removal plus back addition), use deque. stack is by default backed by deque internally, but restricts the interface to expose only the top.
Q3: Must BFS use queue? Can I use vector?
A: Technically you can simulate with vector + an index, but queue is clearer and less error-prone. In contests, use queue directly. The only exception is 0-1 BFS (shortest paths with only 0 and 1 edge weights), which requires deque.
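As a minimal sketch of that 0-1 BFS pattern (the function name `zeroOneBFS` and the adjacency-list shape are our assumptions, not a fixed template from this book):

```cpp
#include <bits/stdc++.h>
using namespace std;

// 0-1 BFS: shortest paths when every edge weight is 0 or 1.
// Weight-0 edges go to the deque's front, weight-1 edges to the back,
// so nodes are popped in non-decreasing distance order — like Dijkstra,
// but in O(V + E) with a plain deque instead of a priority queue.
vector<int> zeroOneBFS(int n, const vector<vector<pair<int,int>>>& adj, int src) {
    vector<int> dist(n, INT_MAX);
    deque<int> dq;
    dist[src] = 0;
    dq.push_back(src);
    while (!dq.empty()) {
        int u = dq.front(); dq.pop_front();
        for (auto [v, w] : adj[u]) {       // w is 0 or 1
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                if (w == 0) dq.push_front(v);
                else dq.push_back(v);
            }
        }
    }
    return dist;
}
```

Graph BFS proper is covered in Chapter 5.2; this sketch only illustrates why the deque (rather than queue) is required.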
Q4: Why can the "largest rectangle" problem be solved with a stack?
A: The stack maintains an increasing sequence of bars. When a shorter bar is encountered, it means the top bar's "rightward extension" ends here. At that point, we can compute the rectangle area with the top bar's height. Each bar is pushed/popped once, total complexity
O(N).
🔗 Connections to Later Chapters
- Chapter 5.2 (Graph BFS/DFS):
queueis the core container for BFS,stackcan be used for iterative DFS - Chapter 3.4 (Two Pointers): the sliding window technique combines well with the monotonic deque from this chapter
- Chapters 6.1–6.3 (DP): certain optimization techniques (e.g., DP-optimized sliding window extremes) directly use the monotonic deque from this chapter
- The monotonic stack also appears as an alternative to Chapter 3.9 (Segment Trees) — many problems solvable by segment trees can also be solved in
O(N)with a monotonic stack
Practice Problems
Problem 3.6.1 — Stock Span Read N daily stock prices. For each day, find the number of consecutive days up to that day where the price was ≤ today's price (including today). (Classic monotonic stack problem)
Problem 3.6.2 — Circular Queue Implement a circular queue of size K. Process operations: PUSH x (add x to back), POP (remove from front). Print "OVERFLOW" if push on full queue, "UNDERFLOW" if pop on empty.
Problem 3.6.3 — Sliding Window Minimum Same as the sliding window maximum example, but find the minimum.
Problem 3.6.4 — Expression Evaluation
Read a simple expression with integers and +, - operators (no parentheses). Evaluate it using a stack.
Problem 3.6.5 — USACO 2020 January Bronze: Loan Repayment (Simplified) You have N stacks of hay. Each day, you can take one bale from any non-empty stack. Model this with a priority_queue: always take from the tallest stack. Simulate for D days and print the remaining bales.
Chapter 3.7: Hashing Techniques
📝 Before You Continue: You should know STL containers (Chapter 3.1) and string basics (Chapter 2.3). This chapter covers hashing principles and advanced competitive programming usage.
Hashing is one of the most important "tools" in competitive programming: it turns complex comparison problems into O(1) numeric comparisons. But hashing is also the easiest technique to get "hacked"—this chapter teaches both how to use it well and how to prevent being hacked.
3.7.1 unordered_map vs map: Internals & Performance
Internal Implementation Comparison
| Feature | map | unordered_map |
|---|---|---|
| Internal structure | Red-black tree (balanced BST) | Hash table |
| Lookup time | O(log N) | O(1) avg, O(N) worst |
| Insert time | O(log N) | O(1) avg, O(N) worst |
| Iteration order | Ordered (ascending by key) | Unordered |
| Memory usage | O(N), smaller constant | O(N), larger constant |
| Worst case | O(log N) (stable) | O(N) (hash collision) |
#include <bits/stdc++.h>
using namespace std;
int main() {
// map: ordered, O(log N)
map<int, int> m;
m[3] = 30; m[1] = 10; m[2] = 20;
for (auto [k, v] : m) cout << k << ":" << v << " ";
// output: 1:10 2:20 3:30 ← ordered!
// unordered_map: unordered, O(1) average
unordered_map<int, int> um;
um[3] = 30; um[1] = 10; um[2] = 20;
// iteration order undefined, but lookup is very fast
// performance difference: N=10^6 operations
// map: ~300ms; unordered_map: ~80ms (roughly)
}
When to Choose Which?
- Use map: you need ordered iteration, lower_bound/upper_bound, or keys with an extreme range (high hash-collision risk)
- Use unordered_map: pure lookup/insert, integer or string keys, large N (> 10^5)
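To see the gap yourself, here is a rough micro-benchmark sketch (`benchInsert` is our name; absolute timings vary by machine and compiler flags, so only the ratio is meaningful):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Time inserting n integer keys into any map-like container.
template <class Map>
long long benchInsert(int n) {
    Map mp;
    auto t0 = chrono::steady_clock::now();
    for (int i = 0; i < n; i++) mp[i] = i;
    auto t1 = chrono::steady_clock::now();
    return chrono::duration_cast<chrono::milliseconds>(t1 - t0).count();
}
// Usage:
//   cout << benchInsert<map<int,int>>(1000000) << " ms (map)\n";
//   cout << benchInsert<unordered_map<int,int>>(1000000) << " ms (unordered_map)\n";
```

Tip: calling reserve() on the unordered_map before the insert loop avoids mid-run rehashing and widens its lead further.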
3.7.2 Anti-Hack: Custom Hash
Problem: unordered_map's default integer hash is essentially hash(x) = x, allowing attackers to construct many hash collisions, degrading operations to O(N) and causing TLE.
On platforms like Codeforces, this is a common hack technique.
Solution: splitmix64 Hash
// Anti-hack custom hasher — uses splitmix64
struct custom_hash {
static uint64_t splitmix64(uint64_t x) {
x += 0x9e3779b97f4a7c15;
x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
return x ^ (x >> 31);
}
size_t operator()(uint64_t x) const {
static const uint64_t FIXED_RANDOM =
chrono::steady_clock::now().time_since_epoch().count();
return splitmix64(x + FIXED_RANDOM);
}
};
// Usage:
unordered_map<int, int, custom_hash> safe_map;
unordered_set<int, custom_hash> safe_set;
⚠️ Contest tip: When using unordered_map on Codeforces, always add custom_hash. USACO test data won't deliberately construct hacks, but it's a good habit.
3.7.3 String Hashing (Polynomial Hash)
String hashing maps a string to an integer, turning string comparison into numeric comparison (O(1)).
Core Formula
For string s[0..n-1], define the hash value as:
hash(s) = s[0]·B^(n-1) + s[1]·B^(n-2) + ... + s[n-1]·B^0 (mod M)
where B is the base (typically 131 or 13331) and M is a large prime (typically 10⁹+7 or 10⁹+9).
Prefix Hash + Substring Hash O(1)
// String hashing: O(N) preprocessing, O(1) substring hash
#include <bits/stdc++.h>
using namespace std;
typedef unsigned long long ull;
const ull BASE = 131;
// Use unsigned long long natural overflow (equivalent to mod 2^64)
// Or specify MOD manually:
// const ull MOD = 1e9 + 7;
struct StringHash {
int n;
vector<ull> h, pw;
StringHash(const string& s) : n(s.size()), h(n + 1, 0), pw(n + 1, 1) {
for (int i = 0; i < n; i++) {
h[i + 1] = h[i] * BASE + (s[i] - 'a' + 1); // 1-indexed prefix hash
pw[i + 1] = pw[i] * BASE; // BASE^(i+1)
}
}
// Get hash of substring s[l..r] (0-indexed)
ull get(int l, int r) {
return h[r + 1] - h[l] * pw[r - l + 1]; // ← KEY formula
}
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
string s = "abcabc";
StringHash sh(s);
// Compare if two substrings are equal
// s[0..2] = "abc", s[3..5] = "abc"
cout << (sh.get(0, 2) == sh.get(3, 5) ? "Equal" : "Not Equal") << "\n"; // Equal
// Compare s[0..1] = "ab" vs s[3..4] = "ab"
cout << (sh.get(0, 1) == sh.get(3, 4) ? "Equal" : "Not Equal") << "\n"; // Equal
}
Hash Formula Derivation:
h[r+1] = s[0]*B^r + s[1]*B^(r-1) + ... + s[r]*B^0
h[l] = s[0]*B^(l-1) + ... + s[l-1]*B^0
h[r+1] - h[l] * B^(r-l+1)
= (s[0]*B^r + ... + s[r]*B^0)
- (s[0]*B^r + ... + s[l-1]*B^(r-l+1))
= s[l]*B^(r-l) + s[l+1]*B^(r-l-1) + ... + s[r]*B^0
= hash(s[l..r]) ✓
The diagram below shows how the prefix-hash array is built, and how the get(l, r) formula extracts the hash of any substring in O(1):
3.7.4 Double Hashing (Avoiding Collisions)
Single hashing (mod M) has a per-pair collision probability ≈ 1/M. Across N substrings, the expected number of colliding pairs is ≈ N²/(2M) (birthday paradox).
- With M = 10⁹+7 and N = 10⁶: expected collisions ≈ 10¹²/(2×10⁹) = 500. Not safe.
- Solution: double hashing — use two different (B, M) pairs simultaneously; the per-pair collision probability drops to 1/(M₁×M₂) ≈ 10⁻¹⁸.
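The birthday-paradox arithmetic behind those numbers can be checked directly (`expectedCollisions` is our name for the helper):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Expected number of colliding pairs among n uniformly random hashes mod m:
// roughly C(n,2)/m ≈ n^2 / (2m).
double expectedCollisions(double n, double m) {
    return n * n / (2.0 * m);
}
// expectedCollisions(1e6, 1e9 + 7)                ≈ 500   → single hash is risky
// expectedCollisions(1e6, (1e9 + 7) * (1e9 + 9))  ≈ 5e-7  → double hash is safe
```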
// Double hashing: two (BASE, MOD) pairs used simultaneously, extremely low collision probability
struct DoubleHash {
static const ull B1 = 131, M1 = 1000000007; // spell out the integer: 1e9+7 is a double literal
static const ull B2 = 137, M2 = 1000000009;
int n;
vector<ull> h1, h2, pw1, pw2;
DoubleHash(const string& s) : n(s.size()),
h1(n+1,0), h2(n+1,0), pw1(n+1,1), pw2(n+1,1) {
for (int i = 0; i < n; i++) {
ull c = s[i] - 'a' + 1;
h1[i+1] = (h1[i] * B1 + c) % M1;
h2[i+1] = (h2[i] * B2 + c) % M2;
pw1[i+1] = pw1[i] * B1 % M1;
pw2[i+1] = pw2[i] * B2 % M2;
}
}
// Return pair<ull,ull> as the hash "fingerprint" of substring s[l..r]
pair<ull,ull> get(int l, int r) {
ull v1 = (h1[r+1] - h1[l] * pw1[r-l+1] % M1 + M1) % M1;
ull v2 = (h2[r+1] - h2[l] * pw2[r-l+1] % M2 + M2) % M2;
return {v1, v2};
}
};
3.7.5 Application: String Matching (Rabin-Karp)
// Rabin-Karp string matching: find all occurrences of pattern P in text T
// Time: O(N+M) average, O(NM) worst case (but extremely fast in practice)
#include <bits/stdc++.h>
using namespace std;
typedef unsigned long long ull;
vector<int> rabinKarp(const string& T, const string& P) {
int n = T.size(), m = P.size();
if (m > n) return {};
const ull BASE = 131;
ull patHash = 0, textHash = 0, pow_m = 1;
// Compute BASE^(m-1), the weight of the leftmost window character (natural overflow)
for (int i = 0; i < m - 1; i++) pow_m *= BASE;
// Initial hash
for (int i = 0; i < m; i++) {
patHash = patHash * BASE + P[i];
textHash = textHash * BASE + T[i];
}
vector<int> result;
for (int i = 0; i + m <= n; i++) {
if (textHash == patHash) {
// Verify when hashes match (avoid false positives from collision)
if (T.substr(i, m) == P) result.push_back(i);
}
if (i + m < n) {
// Rolling hash: remove leftmost char, add rightmost char
textHash = textHash - T[i] * pow_m; // remove leftmost
textHash = textHash * BASE + T[i + m]; // add rightmost
}
}
return result;
}
3.7.6 Application: Longest Common Substring
Problem: Given strings S and T, find the length of their longest common substring.
Approach: Binary search on the answer (length L of longest common substring), then use a hash set to check if any substring of length L appears in both strings.
// Longest common substring: O(N log N) — binary search + hashing
int longestCommonSubstring(const string& S, const string& T) {
StringHash hs(S), ht(T);
int ns = S.size(), nt = T.size();
auto check = [&](int len) -> bool {
unordered_set<ull> setS;
for (int i = 0; i + len <= ns; i++)
setS.insert(hs.get(i, i + len - 1));
for (int j = 0; j + len <= nt; j++)
if (setS.count(ht.get(j, j + len - 1)))
return true;
return false;
};
int lo = 0, hi = min(ns, nt);
while (lo < hi) {
int mid = (lo + hi + 1) / 2;
if (check(mid)) lo = mid;
else hi = mid - 1;
}
return lo;
}
⚠️ Common Mistakes
1. Bad modulus choice: Use a large prime; avoid small or non-prime moduli (high collision rate). Recommended: 10⁹+7 and 10⁹+9 as a double-hash pair.
2. unordered_map hacked: On platforms like Codeforces, the default hash can be attacked. Always use custom_hash.
3. Substring hash subtraction underflow: h[r+1] - h[l] * pw[r-l+1] may be negative (with signed integers or an explicit mod). Use unsigned long long natural overflow, or (... % M + M) % M to ensure a non-negative result.
4. BASE doesn't match the character set: For lowercase letters (26 symbols), BASE must be > 26 (typically 31 or 131). For all ASCII characters (128 symbols), BASE must be > 128 (use 131 or 137).
5. Hash collision causing WA: Even with double hashing, collisions are theoretically possible. If uncertain, add a direct string comparison when hashes match.
Chapter Summary
📌 Core Comparison Table
| Tool | Time Complexity | Use Case |
|---|---|---|
| map<K,V> | O(log N) | Need ordering, need range queries |
| unordered_map<K,V> | O(1) amortized | Only need lookup/insert, key order not required |
| String hash (single) | O(N) preprocess, O(1) query | Substring comparison, pattern matching |
| String hash (double) | O(N) preprocess, O(1) query | High-precision scenarios, avoid collisions |
❓ FAQ
Q1: Which is better — unsigned long long natural overflow double hash or manual mod hash?
A: ull natural overflow (equivalent to mod 2⁶⁴) is simpler to code, and 2⁶⁴ is large enough that single-hash collision probability is already very low (≈ 10⁻¹⁸). But crafted data can deliberately cause collisions; double hashing is safer in that case. Both work in contests; ull is more common.
Q2: What can string hashing do that KMP cannot?
A: String hashing excels at multi-string comparison (e.g., finding longest common substring, palindromic substrings), while KMP only excels at single-pattern matching. Hash + binary search can solve many string problems in O(N log N) that would require more complex KMP implementations.
Q3: Should I use BASE 31 or 131?
A: Use 31 for lowercase letters only (a prime just above 26, keeping the hash values compact). Use 131 for mixed case or digits (a prime greater than 128, covering full ASCII). The key is: BASE must be larger than the character-set size and ideally a prime.
Practice Problems
Problem 3.7.1 — Two Sum with Hash 🟢 Easy
Given array A, find if any two distinct elements sum to target X. Use unordered_set.
Hint
For each A[i], check if (X - A[i]) is already in the hash set. Insert A[i] after checking.
Problem 3.7.2 — Substring Check 🟢 Easy
Given string T and pattern P, check if P appears in T. Print all starting indices.
Hint
Use Rabin-Karp rolling hash, or just use `string::find` for practice, then implement manually.
Problem 3.7.3 — Longest Palindromic Substring 🟡 Medium
Find the length of the longest palindromic substring.
Hint
A palindrome s[l..r] satisfies: hash(s[l..r]) == hash(reverse(s)[n-1-r..n-1-l]). Binary search + hash on both the forward and reversed string.
Problem 3.7.4 — Count Distinct Substrings 🟡 Medium
Given string S of length N (N ≤ 5000), count the number of distinct substrings.
Hint
Insert all O(N²) substring hashes into an unordered_set and count distinct values. Use double hashing to avoid collisions.
Problem 3.7.5 — String Periods 🔴 Hard
Find the smallest period of a string S (the smallest k such that S is a repetition of S[0..k-1]).
Hint
Try each k that divides n. For each candidate k, verify using a string hash comparison in O(1) per check. Total O(d(n) × n) where d(n) is the number of divisors.
Chapter 3.8: Maps & Sets
Maps and sets are the workhorses of frequency counting, lookup, and tracking unique elements. In this chapter, we go deep into their practical use in USACO problems.
3.8.1 map vs unordered_map — Choosing Wisely
Visual: Map Internal Structure (BST)
std::map stores key-value pairs in a balanced BST (Red-Black tree). This gives O(log N) for all operations and keeps keys sorted automatically — great when you need lower_bound/upper_bound queries. Use unordered_map when you only need O(1) lookups and don't care about order.
| Feature | map | unordered_map |
|---|---|---|
| Underlying structure | Red-black tree | Hash table |
| Insert/lookup time | O(log n) | O(1) average, O(n) worst |
| Iterates in | Sorted key order | Arbitrary order |
| Min/Max key | Available via .begin()/.rbegin() | Not available |
| Keys must be | Comparable (has <) | Hashable |
| Use when | You need sorted keys or find min/max | You need fastest possible lookup |
For most USACO problems, either works fine. Use unordered_map for speed when keys are integers or strings, map when you need ordered iteration.
Example: Frequency Map
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
unordered_map<int, int> freq;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
freq[x]++; // increment count; creates with 0 if not present
}
// Find the element with highest frequency
int maxFreq = 0, maxVal = INT_MIN;
for (auto &[val, count] : freq) { // structured binding (C++17)
if (count > maxFreq || (count == maxFreq && val < maxVal)) {
maxFreq = count;
maxVal = val;
}
}
cout << "Most frequent: " << maxVal << " (" << maxFreq << " times)\n";
return 0;
}
3.8.2 Map Operations — Complete Reference
#include <bits/stdc++.h>
using namespace std;
int main() {
map<string, int> scores;
// Insert
scores["Alice"] = 95;
scores["Bob"] = 87;
scores["Charlie"] = 92;
scores.insert({"Dave", 78}); // another way
scores.emplace("Eve", 88); // most efficient way
// Lookup
cout << scores["Alice"] << "\n"; // 95
// WARNING: scores["Unknown"] creates it with value 0!
// Safe lookup
if (scores.count("Frank")) {
cout << scores["Frank"] << "\n";
} else {
cout << "Frank not found\n";
}
// Using find() — returns iterator
auto it = scores.find("Bob");
if (it != scores.end()) {
cout << it->first << ": " << it->second << "\n"; // Bob: 87
}
// Update
scores["Alice"] += 5; // Alice now has 100
// Erase
scores.erase("Charlie");
// Iterate in sorted key order (map always gives sorted order)
for (const auto &[name, score] : scores) {
cout << name << ": " << score << "\n";
}
// Alice: 100
// Bob: 87
// Dave: 78
// Eve: 88
// Size and empty check
cout << scores.size() << "\n"; // 4
cout << scores.empty() << "\n"; // 0 (false)
// Clear all entries
scores.clear();
return 0;
}
3.8.3 Set Operations — Complete Reference
#include <bits/stdc++.h>
using namespace std;
int main() {
set<int> s = {5, 3, 8, 1, 9, 2};
// s = {1, 2, 3, 5, 8, 9} (always sorted!)
// Insert
s.insert(4); // s = {1, 2, 3, 4, 5, 8, 9}
s.insert(3); // already there, no change
// Erase
s.erase(8); // s = {1, 2, 3, 4, 5, 9}
// Lookup
cout << s.count(3) << "\n"; // 1 (exists)
cout << s.count(7) << "\n"; // 0 (not found)
// Iterator-based queries
auto it = s.lower_bound(4); // first element >= 4
cout << *it << "\n"; // 4
auto it2 = s.upper_bound(4); // first element > 4
cout << *it2 << "\n"; // 5
// Min and Max
cout << *s.begin() << "\n"; // 1 (min)
cout << *s.rbegin() << "\n"; // 9 (max)
// Remove minimum
s.erase(s.begin()); // removes 1
cout << *s.begin() << "\n"; // 2
// Iterate
for (int x : s) cout << x << " ";
cout << "\n"; // 2 3 4 5 9
return 0;
}
3.8.4 USACO Problem: Cow IDs
Problem (USACO 2017 February Bronze): Bessie wants to find the N-th smallest number that doesn't appear in a set of "taken" IDs. Given a set of taken IDs and N, find the N-th available ID.
#include <bits/stdc++.h>
using namespace std;
int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int n, q;
    cin >> n >> q;
    // NOTE: a sorted vector, not a set — set iterators cannot be subtracted
    // to compute ranks, but vector iterators can (random access)
    vector<int> taken(n);
    for (int i = 0; i < n; i++) cin >> taken[i];
    sort(taken.begin(), taken.end());
    // For each query, find the k-th positive integer NOT in taken
    while (q--) {
        int k; cin >> k;
        // Binary search: find smallest x such that x - (# taken values <= x) >= k
        int lo = 1, hi = 2000000000;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            // count of available numbers in [1, mid] = mid - (# taken values <= mid)
            int taken_count = (int)(upper_bound(taken.begin(), taken.end(), mid) - taken.begin());
            int available = mid - taken_count;
            if (available >= k) hi = mid;
            else lo = mid + 1;
        }
        cout << lo << "\n";
    }
    return 0;
}
3.8.5 Multiset — Sorted Bag with Duplicates
A multiset is like a set, but allows duplicate values:
#include <bits/stdc++.h>
using namespace std;
int main() {
multiset<int> ms;
ms.insert(3);
ms.insert(1);
ms.insert(3); // duplicate allowed
ms.insert(5);
ms.insert(1);
// ms = {1, 1, 3, 3, 5}
cout << ms.count(3) << "\n"; // 2 (how many 3s)
cout << ms.count(2) << "\n"; // 0
// Remove ONE occurrence of 3
ms.erase(ms.find(3)); // removes only one 3
// ms = {1, 1, 3, 5}
// Remove ALL occurrences of 1
ms.erase(1); // removes all 1s
// ms = {3, 5}
cout << *ms.begin() << "\n"; // 3 (min)
cout << *ms.rbegin() << "\n"; // 5 (max)
return 0;
}
Running Median with Two Multisets
Keep track of the median of a stream of numbers using a max-multiset (lower half) and a min-multiset (upper half):
#include <bits/stdc++.h>
using namespace std;
int main() {
multiset<int> lo; // lower half; its maximum is *lo.rbegin()
multiset<int> hi; // upper half; its minimum is *hi.begin()
// Invariant: every value in lo <= every value in hi,
// and lo.size() is either hi.size() or hi.size() + 1
int n;
cin >> n;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
// Add to appropriate half
if (lo.empty() || x <= *lo.rbegin()) {
lo.insert(x);
} else {
hi.insert(x);
}
// Rebalance: sizes should differ by at most 1
while (lo.size() > hi.size() + 1) {
hi.insert(*lo.rbegin());
lo.erase(lo.find(*lo.rbegin()));
}
while (hi.size() > lo.size()) {
lo.insert(*hi.begin());
hi.erase(hi.begin());
}
// Print median
if (lo.size() == hi.size()) {
// Even count: average of two middle values
double median = (*lo.rbegin() + *hi.begin()) / 2.0;
cout << fixed << setprecision(1) << median << "\n";
} else {
// Odd count: middle value is in lo
cout << *lo.rbegin() << "\n";
}
}
return 0;
}
3.8.6 Practical Patterns
Pattern 1: Counting Distinct Elements
vector<int> data = {1, 5, 3, 1, 2, 5, 5, 3};
set<int> distinct(data.begin(), data.end());
cout << "Distinct count: " << distinct.size() << "\n"; // 4
Pattern 2: Group by Frequency, Sort by Value
vector<int> nums = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
map<int, int> freq;
for (int x : nums) freq[x]++;
// Group values by their frequency
map<int, vector<int>> byFreq;
for (auto &[val, cnt] : freq) {
byFreq[cnt].push_back(val);
}
// Print in order of frequency
for (auto &[cnt, vals] : byFreq) {
for (int v : vals) cout << v << " (×" << cnt << ")\n";
}
Pattern 3: Offline Queries with Sorting
Sort queries along with events to process them together in O((N+Q) log N):
// Example: for each query point, count how many events have value <= query point
// Sort both arrays, sweep through with two pointers
⚠️ Common Mistakes in Chapter 3.8
| # | Mistake | Why It's Wrong | Fix |
|---|---|---|---|
| 1 | map[key] accessing non-existent key | Auto-creates entry with value 0, pollutes data | Use m.count(key) or m.find(key) to check first |
| 2 | multiset::erase(value) deletes all equal values | Expected to delete one, deleted all | Use ms.erase(ms.find(value)) to delete just one |
| 3 | Modifying map/set size during iteration | Iterator invalidated, crash or skipped elements | Use it = m.erase(it) for safe deletion |
| 4 | unordered_map hacked to degrade to O(N) | Adversary constructs hash-collision data, TLE | Switch to map or use custom hash function |
| 5 | Forgetting set doesn't store duplicates | size() doesn't grow after inserting duplicate, count wrong | Use multiset when duplicates needed |
Chapter Summary
📌 Key Takeaways
| Structure | Ordered | Duplicates | Key Feature | Why It Matters |
|---|---|---|---|---|
| map<K,V> | Yes (sorted) | No (unique keys) | Key-value mapping, O(log N) | Frequency counting, ID→attribute mapping |
| unordered_map<K,V> | No | No | O(1) average lookup | 5-10x faster than map for large data |
| set<T> | Yes (sorted) | No | Ordered unique set | Deduplication, range queries (lower_bound) |
| unordered_set<T> | No | No | O(1) membership test | Just need to check "seen before?" |
| multiset<T> | Yes (sorted) | Yes | Ordered multiset | Dynamic median, sliding window |
🧩 "Which Container to Use" Quick Reference
| Need | Recommended Container | Reason |
|---|---|---|
| Count occurrences of each element | map / unordered_map | freq[x]++ in one line |
| Deduplicate and sort | set | Auto-dedup + auto-sort |
| Check if element was seen | unordered_set | O(1) lookup |
| Dynamic ordered set + find extremes | set / multiset | O(1) access to min/max |
| Need lower_bound / upper_bound | set / map | Only ordered containers support this |
| Value→index mapping | map / unordered_map | Coordinate compression etc. |
❓ FAQ
Q1: What's the difference between map's [] operator and find?
A: m[key] auto-creates a default value (0 for int) when the key doesn't exist; m.find(key) only searches and doesn't create. If you just want to check whether a key exists, use m.count(key) or m.find(key) != m.end().
Q2: Both multiset and priority_queue can get extremes — which to use?
A: priority_queue can only get the max (or min) and delete it; it doesn't support deletion by value. multiset supports finding and deleting any value, so it is more flexible. If you only need to repeatedly get the extreme, priority_queue is simpler; if you need to delete specific elements (e.g., removing elements leaving a sliding window), use multiset.
Q3: When can unordered_map be slower than map?
A: Two situations: ① When hash collisions are severe (many keys hash to the same bucket), it degrades to O(N); ② In contests, adversaries deliberately construct data to hack unordered_map. Solution: use a custom hash function, or switch to map.
Q4: Is C++17 structured binding auto &[key, val] safe? Can I use it in contests?
A: USACO and most contest platforms support C++17, so for (auto &[key, val] : m) is safe to use. It's cleaner than entry.first / entry.second.
🔗 Connections to Later Chapters
- Chapter 3.3 (Sorting & Searching): coordinate compression often combines with map (value → compressed index)
- Chapter 3.9 (Segment Trees): an ordered set's lower_bound can replace simple segment tree queries
- Chapters 5.1–5.2 (Graphs): map is commonly used to store adjacency lists for sparse graphs
- Chapter 4.1 (Greedy): multiset combined with greedy strategies can efficiently maintain dynamic optimal choices
- The map frequency counting pattern appears throughout the book and is one of the most fundamental tools in competitive programming
Practice Problems
Problem 3.8.1 — Two Sum Read N integers and a target T. Find two values in the array that sum to T. Print their indices (1-indexed). (Hint: use a map to store value → index)
Problem 3.8.2 — Anagram Groups Read N words. Group them by their sorted-letter form. Print each group on one line, sorted alphabetically.
- Example: "eat tea tan ate nat bat" → groups: {ate, eat, tea}, {bat}, {nat, tan}
Problem 3.8.3 — Interval Overlap Count Read N intervals [L_i, R_i]. For each integer point from 1 to M, count how many intervals contain it. Output the maximum overlap count. (Hint: use difference array, or sort events and sweep with a set)
Problem 3.8.4 — Cow Photography (USACO Bronze Inspired) N cows each have a unique ID. Read N lists (each a permutation of IDs). Find the ordering that's consistent with all lists (the "true" order). (Hint: use maps to count pairwise orderings)
Problem 3.8.5 — Running Distinct Count Read N integers one by one. After each new integer, print the count of distinct values seen so far. (Hint: maintain an unordered_set; its size is the answer)
Chapter 3.9: Introduction to Segment Trees
📝 Before You Continue: You should understand prefix sums (Chapter 3.2), arrays, and recursion (Chapter 2.3). Segment trees are a more advanced data structure — make sure you're comfortable with recursion before diving in.
Segment trees are one of the most powerful data structures in competitive programming. They solve a fundamental problem that prefix sums cannot: range queries with updates.
3.9.1 The Problem: Why We Need Segment Trees
Consider this challenge:
- Array
Aof N integers - Q1: What is the sum of
A[l..r]? (Range sum query) - Q2: Update
A[i] = x(Point update)
Prefix sum solution: Range query in O(1), but update requires O(N) to recompute all prefix sums. For M mixed queries, total: O(NM) — too slow for N,M = 10^5.
Segment tree solution: Both query and update in O(log N). For M mixed queries: O(M log N) ✓
| Data Structure | Build | Query | Update | Best For |
|---|---|---|---|---|
| Simple array | O(N) | O(N) | O(1) | Only updates |
| Prefix sum | O(N) | O(1) | O(N) | Only queries |
| Segment Tree | O(N) | O(log N) | O(log N) | Both queries + updates |
| Fenwick Tree (BIT) | O(N log N) | O(log N) | O(log N) | Simpler code, prefix sums only |
The diagram shows a segment tree built on array [1, 3, 5, 7, 9, 11]. Each internal node stores the sum of its range. A query for range [2,4] (sum=21) is answered by combining just 2 nodes — O(log N) instead of O(N).
3.9.2 Structure: What Is a Segment Tree?
A segment tree is a complete binary tree where:
- Each leaf corresponds to a single array element
- Each internal node stores the aggregate (sum, min, max, etc.) of its range
- The root covers the entire array [0..N-1]
- A node covering [l..r] has two children: [l..mid] and [mid+1..r]
For an array of N elements, the tree has at most 4N nodes (we use a 1-indexed tree array of size 4N as a safe upper bound).
Array: [1, 3, 5, 7, 9, 11] (indices 0..5)
Tree (1-indexed, node i has children 2i and 2i+1):
[0..5]=36
/ \
[0..2]=9 [3..5]=27
/ \ / \
[0..1]=4 [2]=5 [3..4]=16 [5]=11
/ \ / \
[0]=1 [1]=3 [3]=7 [4]=9
The diagram below shows the complete segment tree structure, with the nodes visited by the query sum([2,4]) highlighted in blue:
3.9.3 Building the Segment Tree
// Solution: Segment Tree Build — O(N)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;
int tree[4 * MAXN]; // segment tree array (4x array size for safety)
int arr[MAXN]; // original array
// Build: recursively fill tree[]
// node = current tree node index (start with 1)
// start, end = range this node covers
void build(int node, int start, int end) {
if (start == end) {
// Leaf node: stores the array element
tree[node] = arr[start];
} else {
int mid = (start + end) / 2;
// Build left and right children first
build(2 * node, start, mid); // left child
build(2 * node + 1, mid + 1, end); // right child
// Internal node: sum of children
tree[node] = tree[2 * node] + tree[2 * node + 1];
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
for (int i = 0; i < n; i++) cin >> arr[i];
build(1, 0, n - 1); // build from node 1, covering [0..n-1]
return 0;
}
Build trace for [1, 3, 5, 7, 9, 11]:
build(1, 0, 5):
build(2, 0, 2):
build(4, 0, 1):
build(8, 0, 0): tree[8] = arr[0] = 1
build(9, 1, 1): tree[9] = arr[1] = 3
tree[4] = tree[8] + tree[9] = 4
build(5, 2, 2): tree[5] = arr[2] = 5
tree[2] = tree[4] + tree[5] = 9
build(3, 3, 5):
build(6, 3, 4):
...
tree[3] = 27
tree[1] = 9 + 27 = 36
3.9.4 Range Query
Query sum of arr[l..r]:
Key idea: Recursively descend the tree. At each node covering [start..end]:
- If [start..end] is completely inside [l..r]: return this node's value (done!)
- If [start..end] is completely outside [l..r]: return 0 (no contribution)
- Otherwise: recurse into both children, sum the results
// Range Query: sum of arr[l..r] — O(log N)
// node = current tree node, [start, end] = range it covers
// [l, r] = query range
int query(int node, int start, int end, int l, int r) {
if (r < start || end < l) {
// Case 1: Current segment completely outside query range
return 0; // identity for sum (use INT_MAX for min queries)
}
if (l <= start && end <= r) {
// Case 2: Current segment completely inside query range
return tree[node]; // ← KEY LINE: use this node directly!
}
// Case 3: Partial overlap — recurse into children
int mid = (start + end) / 2;
int leftSum = query(2 * node, start, mid, l, r);
int rightSum = query(2 * node + 1, mid + 1, end, l, r);
return leftSum + rightSum;
}
// Usage: sum of arr[2..4]
int result = query(1, 0, n - 1, 2, 4);
cout << result << "\n"; // 5 + 7 + 9 = 21
Query trace for [2..4] on tree of [1,3,5,7,9,11]:
query(1, 0, 5, 2, 4):
query(2, 0, 2, 2, 4): [0..2] partially overlaps [2..4]
query(4, 0, 1, 2, 4): [0..1] outside [2..4] → return 0
query(5, 2, 2, 2, 4): [2..2] inside [2..4] → return 5
return 0 + 5 = 5
query(3, 3, 5, 2, 4): [3..5] partially overlaps [2..4]
query(6, 3, 4, 2, 4): [3..4] inside [2..4] → return 16
query(7, 5, 5, 2, 4): [5..5] outside [2..4] → return 0
return 16 + 0 = 16
return 5 + 16 = 21 ✓
Only 4 nodes visited — O(log N)!
3.9.5 Point Update
Update arr[i] = x (change a single element):
// Point Update: set arr[idx] = val — O(log N)
void update(int node, int start, int end, int idx, int val) {
if (start == end) {
// Leaf: update the value
arr[idx] = val;
tree[node] = val;
} else {
int mid = (start + end) / 2;
if (idx <= mid) {
update(2 * node, start, mid, idx, val); // update in left child
} else {
update(2 * node + 1, mid + 1, end, idx, val); // update in right child
}
// Update this internal node after child changes
tree[node] = tree[2 * node] + tree[2 * node + 1];
}
}
// Usage: set arr[2] = 10
update(1, 0, n - 1, 2, 10);
3.9.6 Complete Implementation
Here's the full, contest-ready segment tree:
// Solution: Segment Tree — O(N) build, O(log N) query/update
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;
long long tree[4 * MAXN];
void build(int node, int start, int end, long long arr[]) {
if (start == end) {
tree[node] = arr[start];
return;
}
int mid = (start + end) / 2;
build(2 * node, start, mid, arr);
build(2 * node + 1, mid + 1, end, arr);
tree[node] = tree[2 * node] + tree[2 * node + 1];
}
long long query(int node, int start, int end, int l, int r) {
if (r < start || end < l) return 0;
if (l <= start && end <= r) return tree[node];
int mid = (start + end) / 2;
return query(2 * node, start, mid, l, r)
+ query(2 * node + 1, mid + 1, end, l, r);
}
void update(int node, int start, int end, int idx, long long val) {
if (start == end) {
tree[node] = val;
return;
}
int mid = (start + end) / 2;
if (idx <= mid) update(2 * node, start, mid, idx, val);
else update(2 * node + 1, mid + 1, end, idx, val);
tree[node] = tree[2 * node] + tree[2 * node + 1];
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q;
cin >> n >> q;
static long long arr[MAXN]; // static: keeps ~800 KB off the (possibly small) stack
for (int i = 0; i < n; i++) cin >> arr[i];
build(1, 0, n - 1, arr);
while (q--) {
int type;
cin >> type;
if (type == 1) {
// Point update: set arr[i] = v
int i; long long v;
cin >> i >> v;
update(1, 0, n - 1, i, v);
} else {
// Range query: sum of arr[l..r]
int l, r;
cin >> l >> r;
cout << query(1, 0, n - 1, l, r) << "\n";
}
}
return 0;
}
Sample Input:
6 5
1 3 5 7 9 11
2 2 4
1 2 10
2 2 4
2 0 5
1 0 0
Sample Output:
21
26
41
(First query: [2,4] = 5+7+9 = 21. After update arr[2]=10, second query: [2,4] = 10+7+9 = 26. Third query: [0,5] = 1+3+10+7+9+11 = 41. The final operation, update arr[0]=0, produces no output.)
3.9.7 Segment Tree vs. Fenwick Tree (BIT)
| Feature | Segment Tree | Fenwick Tree (BIT) |
|---|---|---|
| Code complexity | Medium (~30 lines) | Simple (~15 lines) |
| Range query | Any associative op | Prefix sums only |
| Range update | Yes (with lazy prop) | Yes (with tricks) |
| Point update | O(log N) | O(log N) |
| Space | O(4N) | O(N) |
| When to use | Range min/max, complex queries | Prefix sum with updates |
💡 Key Insight: If you need range sum with updates, a Fenwick tree is simpler. If you need range minimum, range maximum, or any other aggregate that isn't a prefix operation, use a segment tree.
3.9.8 Range Minimum Query Variant
Just change the aggregate from + to min:
// Range Minimum Segment Tree — same structure, different operation
void build_min(int node, int start, int end, int arr[]) {
if (start == end) { tree[node] = arr[start]; return; }
int mid = (start + end) / 2;
build_min(2*node, start, mid, arr);
build_min(2*node+1, mid+1, end, arr);
tree[node] = min(tree[2*node], tree[2*node+1]); // ← changed to min
}
int query_min(int node, int start, int end, int l, int r) {
if (r < start || end < l) return INT_MAX; // ← identity for min
if (l <= start && end <= r) return tree[node];
int mid = (start + end) / 2;
return min(query_min(2*node, start, mid, l, r),
query_min(2*node+1, mid+1, end, l, r));
}
⚠️ Common Mistakes
- Array size too small: Always allocate tree[4 * MAXN]. Using 2 * MAXN will cause out-of-bounds access for non-power-of-2 sizes.
- Wrong identity for out-of-range: For sum queries, return 0. For min queries, return INT_MAX. For max queries, return INT_MIN.
- Forgetting to update the parent node: After updating a child, you MUST recompute the parent: tree[node] = tree[2*node] + tree[2*node+1].
- 0-indexed vs 1-indexed confusion: This implementation uses 0-indexed arrays but 1-indexed tree nodes. Be consistent.
- Using a segment tree when prefix sums suffice: If there are no updates, a prefix sum (O(1) query) beats a segment tree (O(log N) query). Use the simpler tool when appropriate.
Chapter Summary
📌 Key Takeaways
| Operation | Time | Key Code Line |
|---|---|---|
| Build | O(N) | tree[node] = tree[2*node] + tree[2*node+1] |
| Point update | O(log N) | Recurse to leaf, update upward |
| Range query | O(log N) | Return early if fully inside/outside |
| Space | O(4N) | Allocate tree[4 * MAXN] |
❓ FAQ
Q1: When to choose segment tree vs prefix sum?
A: Simple rule — if the array never changes, prefix sums are better (O(1) query vs O(log N)). If the array gets modified (point updates), use a segment tree or BIT. If you need range updates (add a value to a range), use a segment tree with lazy propagation.
Q2: Why does the tree array need size 4N?
A: A segment tree is a (nearly) complete binary tree. When N is not a power of 2, the last level may be incomplete but still needs space in the array representation. In the worst case, about 4N nodes are needed, so 4*MAXN is a safe upper bound.
Q3: Which is better, Fenwick Tree (BIT) or Segment Tree?
A: BIT code is shorter (~15 lines vs 30 lines), has smaller constants, but can only handle "prefix-decomposable" operations (like sum). Segment Tree is more general (can do range min/max, GCD, etc.) and supports more complex operations (like lazy propagation). In contests: use BIT when possible, switch to Segment Tree when BIT is insufficient.
Q4: What types of queries can segment trees handle?
A: Any operation satisfying the associative law: sum (+), minimum (min), maximum (max), GCD, XOR, product, etc. The key is having an "identity element" (e.g., 0 for sum, INT_MAX for min, INT_MIN for max).
Q5: What is Lazy Propagation? When is it needed?
A: When you need to "add V to every element in range [L,R]" (range update), the naive approach updates every leaf from L to R (O(N)), which is too slow. Lazy propagation stores updates "lazily" in internal nodes and only pushes them down when a child node actually needs to be queried, optimizing range updates to O(log N) as well.
🔗 Connections to Later Chapters
- Chapter 3.2 (Prefix Sums): the "simplified version" of segment trees — use prefix sums when there are no update operations
- Chapters 5.1–5.2 (Graphs): Euler Tour + segment tree can efficiently handle path queries on trees
- Chapters 6.1–6.3 (DP): some DP optimizations require segment trees to maintain range min/max of DP values
- Segment tree is a core data structure at USACO Gold level, mastering it solves a large number of Gold problems
Practice Problems
Problem 3.9.1 — Classic Range Sum 🟢 Easy
Implement a segment tree. Handle N elements and Q queries: either update a single element or query the sum of a range.
Hint
Use the complete implementation from Section 3.9.6. Distinguish query type by a flag (1 = update, 2 = query).
Problem 3.9.2 — Range Minimum 🟡 Medium
Same as above but query the minimum of a range. Handle point updates.
Hint
Change `+` to `min` in the tree operations. Return `INT_MAX` for out-of-range. The identity element for min is +∞.
Problem 3.9.3 — Number of Inversions 🔴 Hard
Count the number of pairs (i,j) where i < j and arr[i] > arr[j].
Hint
Process elements left to right. For each element x, query how many already-inserted elements are > x (using a segment tree indexed by value). Then insert x. Total inversions = sum of these counts.
🏆 Challenge: USACO 2016 February Gold: Fencing the Cows
A problem requiring range max queries with updates. Try solving it with both a Fenwick tree and a segment tree to understand the tradeoffs.
3.9.9 Lazy Propagation — Range Updates in O(log N)
The segment tree so far handles point updates (change one element). But what about range updates: "add V to all elements in [L, R]"?
Without lazy propagation, we'd need O(N) updates (one per element). With lazy propagation, we achieve O(log N) range updates.
💡 Key Insight: Instead of immediately updating all affected leaf nodes, we "lazily" defer the update — store it at the highest applicable node and only push it down when we actually need the children.
How Lazy Propagation Works
Each node now stores two values:
tree[node]: the actual aggregated value (range sum) for this rangelazy[node]: a pending update that hasn't been pushed to children yet
The push-down rule: When we visit a node with a pending lazy update, we:
- Apply the lazy update to the node's value
- Pass the lazy update to both children (push down)
- Clear the lazy for this node
Example: Array = [1, 2, 3, 4, 5], update "add 10 to [1..3]"
Initial tree:
[15] ← sum of [0..4]
/ \
[6] [9] ← sum of [0..2], [3..4]
/ \ / \
[3] [3] [4] [5] ← sum of [0..1], [2], [3], [4]
/ \
[1] [2]
After "add 10 to [1..3]" (0-indexed) with lazy propagation:
At node [0..4]: partial overlap with [1..3] → recurse into children.
At node [0..2]: partial overlap → recurse.
At node [0..1]: partial overlap → recurse.
- Leaf [0]: outside [1..3] → return unchanged.
- Leaf [1]: fully inside → tree += 10. (A leaf has no children, so nothing to defer.)
At node [2..2]: fully inside [1..3] → tree += 10 × (range length), lazy stored, no recursion below this node.
At node [3..4]: partial overlap → recurse.
- Leaf [3]: fully inside → tree += 10.
- Leaf [4]: outside → return unchanged.
Result: the leaves now hold [1, 12, 13, 14, 5].
Note: in this tiny example every fully covered node happens to be a single leaf, so laziness saves nothing yet. The payoff appears when a fully covered internal node spans K elements: it is handled in O(1) instead of O(K).
Complete Lazy Propagation Implementation
// Solution: Segment Tree with Lazy Propagation
// Supports: range add update, range sum query — O(log N) each
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const int MAXN = 100005;
ll tree[4 * MAXN]; // tree[node] = sum of range
ll lazy[4 * MAXN]; // lazy[node] = pending add value (0 means no pending)
// ── PUSH DOWN: apply pending lazy to children ──
// Called before we recurse into children
void pushDown(int node, int start, int end) {
if (lazy[node] == 0) return; // no pending update, nothing to do
int mid = (start + end) / 2;
int left = 2 * node, right = 2 * node + 1;
// Update left child's sum: add lazy * (number of elements in left child)
tree[left] += lazy[node] * (mid - start + 1);
tree[right] += lazy[node] * (end - mid);
// Pass lazy to children
lazy[left] += lazy[node];
lazy[right] += lazy[node];
// Clear current node's lazy (it's been pushed down)
lazy[node] = 0;
}
// ── BUILD: construct tree from array ──
void build(int node, int start, int end, ll arr[]) {
lazy[node] = 0; // no pending updates initially
if (start == end) {
tree[node] = arr[start];
return;
}
int mid = (start + end) / 2;
build(2*node, start, mid, arr);
build(2*node+1, mid+1, end, arr);
tree[node] = tree[2*node] + tree[2*node+1];
}
// ── RANGE UPDATE: add val to all elements in [l, r] ──
void update(int node, int start, int end, int l, int r, ll val) {
if (r < start || end < l) return; // out of range: no-op
if (l <= start && end <= r) {
// Current segment fully inside [l, r]: apply lazy here, don't recurse
tree[node] += val * (end - start + 1); // ← KEY: multiply by range length
lazy[node] += val; // store pending for children
return;
}
// Partial overlap: push down existing lazy, then recurse
pushDown(node, start, end); // ← CRITICAL: push before recursing!
int mid = (start + end) / 2;
update(2*node, start, mid, l, r, val);
update(2*node+1, mid+1, end, l, r, val);
// Update current node from children
tree[node] = tree[2*node] + tree[2*node+1];
}
// ── RANGE QUERY: sum of elements in [l, r] ──
ll query(int node, int start, int end, int l, int r) {
if (r < start || end < l) return 0; // out of range
if (l <= start && end <= r) {
return tree[node]; // fully inside: return stored sum (already includes lazy!)
}
// Partial overlap: push down, then recurse
pushDown(node, start, end); // ← CRITICAL: push before recursing!
int mid = (start + end) / 2;
ll leftSum = query(2*node, start, mid, l, r);
ll rightSum = query(2*node+1, mid+1, end, l, r);
return leftSum + rightSum;
}
// ── COMPLETE EXAMPLE ──
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q;
cin >> n >> q;
static ll arr[MAXN]; // static: keeps ~800 KB off the stack
for (int i = 0; i < n; i++) cin >> arr[i];
build(1, 0, n-1, arr);
while (q--) {
int type;
cin >> type;
if (type == 1) {
// Range update: add val to [l, r]
int l, r; ll val;
cin >> l >> r >> val;
update(1, 0, n-1, l, r, val);
} else {
// Range query: sum of [l, r]
int l, r;
cin >> l >> r;
cout << query(1, 0, n-1, l, r) << "\n";
}
}
return 0;
}
Visual Trace: Range Update with Lazy
Array: [1, 2, 3, 4, 5, 6] (0-indexed)
Initial tree (sums):
tree[1] = 21 [0..5]
tree[2] = 6 [0..2] tree[3] = 15 [3..5]
tree[4] = 3 [0..1] tree[5] = 3 [2..2] tree[6] = 9 [3..4] tree[7] = 6 [5..5]
tree[8] = 1 [0..0] tree[9] = 2 [1..1] tree[12] = 4 [3..3] tree[13] = 5 [4..4]
update(1, 0, 5, 1, 4, +10): (add 10 to indices 1..4)
At node 1 [0..5]: partial overlap, pushDown(1)—no lazy. Recurse.
At node 2 [0..2]: partial overlap, pushDown(2)—no lazy. Recurse.
At node 4 [0..1]: partial overlap, pushDown(4)—no lazy. Recurse.
At node 8 [0..0]: outside [1..4]. Return.
At node 9 [1..1]: FULLY inside [1..4].
tree[9] += 10×1 = 12. lazy[9] = 10. Return.
tree[4] = tree[8] + tree[9] = 1 + 12 = 13.
At node 5 [2..2]: FULLY inside [1..4].
tree[5] += 10×1 = 13. lazy[5] = 10. Return.
tree[2] = 13 + 13 = 26.
At node 3 [3..5]: partial overlap. pushDown(3)—no lazy. Recurse.
At node 6 [3..4]: FULLY inside [1..4].
tree[6] += 10×2 = 29. lazy[6] = 10. Return. ← lazy stored for later!
At node 7 [5..5]: outside [1..4]. Return.
tree[3] = 29 + 6 = 35.
tree[1] = 26 + 35 = 61. ✓ (original 21 + 10×4 = 61)
query(1, 0, 5, 2, 3): sum of [2..3]
At node 1 [0..5]: partial. pushDown(1)—no lazy. Recurse.
At node 2 [0..2]: partial. pushDown(2)—no lazy. Recurse.
At node 4 [0..1]: outside [2..3]. Return 0.
At node 5 [2..2]: FULLY inside. Return tree[5] = 13. ✓ (arr[2] = 3+10 = 13)
At node 3 [3..5]: partial. pushDown(3)—no lazy. Recurse.
At node 6 [3..4]: partial. pushDown(6)! (lazy[6] = 10)
tree[12] += 10×1 = 14, lazy[12] = 10.
tree[13] += 10×1 = 15, lazy[13] = 10.
lazy[6] = 0.
At node 12 [3..3]: FULLY inside. Return tree[12] = 14. ✓ (arr[3] = 4+10 = 14)
At node 13 [4..4]: outside. Return 0.
Result = 13 + 14 = 27. ✓
Complexity Analysis
Why O(log N)? At each level of the tree, the query range [l, r] partially overlaps at most two nodes (one at each boundary); every other visited node is either out of range or fully covered, and recursion stops there immediately. So each update/query touches O(1) nodes per level across O(log N) levels — O(log N) total.
⚠️ Lazy Propagation Common Mistakes
// BAD: This gives wrong answers!
void update(int node, int start, int end, int l, int r, ll val) {
if (r < start || end < l) return;
if (l <= start && end <= r) {
tree[node] += val * (end - start + 1);
lazy[node] += val;
return;
}
// FORGOT: pushDown(node, start, end); ← BUG!
int mid = (start + end) / 2;
update(2*node, start, mid, l, r, val);
update(2*node+1, mid+1, end, l, r, val);
tree[node] = tree[2*node] + tree[2*node+1];
}
// GOOD: Push pending lazy before going to children
void update(int node, int start, int end, int l, int r, ll val) {
if (r < start || end < l) return;
if (l <= start && end <= r) {
tree[node] += val * (end - start + 1);
lazy[node] += val;
return;
}
pushDown(node, start, end); // ← ALWAYS before recursing!
int mid = (start + end) / 2;
update(2*node, start, mid, l, r, val);
update(2*node+1, mid+1, end, l, r, val);
tree[node] = tree[2*node] + tree[2*node+1];
}
Top 4 Lazy Propagation Bugs:
- Forgetting `pushDown` before recursion — children receive the parent's lazy on top of their own, giving wrong query results.
- Wrong size multiplier — `tree[node] += val` instead of `tree[node] += val * (end - start + 1)`. The node stores a SUM, so adding val to each of `(end - start + 1)` elements means adding `val * size` to the sum.
- Not initializing `lazy[]` to 0 — use `memset(lazy, 0, sizeof(lazy))` or initialize it in `build()`.
- Mixing lazy for different operations — if you have both "range add" and "range multiply" lazy, the order matters. You need two separate lazy arrays and a careful push-down combining both.
Generalizing Lazy Propagation
The pattern works for any operation where:
- The aggregate is an associative operation (sum, min, max, XOR...)
- The update distributes over the aggregate (e.g., `sum += k * n` when adding `k` to each of `n` elements)
Common variants:
| Update | Query | Lazy stores | Push-down formula |
|---|---|---|---|
| Range Add | Range Sum | Add delta | tree[child] += lazy * size; lazy[child] += lazy |
| Range Set | Range Sum | Set value | tree[child] = lazy * size; lazy[child] = lazy |
| Range Add | Range Min | Add delta | tree[child] += lazy; lazy[child] += lazy |
| Range Set | Range Min | Set value | tree[child] = lazy; lazy[child] = lazy |
Chapter 3.10: Fenwick Tree (Binary Indexed Tree)
📝 Before You Continue: You should already know prefix sums (Chapter 3.2) and bitwise operations. This chapter complements Segment Tree (Chapter 3.9) — BIT code is shorter, with smaller constants, but supports fewer operations.
Fenwick Tree (also known as Binary Indexed Tree / BIT) is one of the most commonly used data structures in competitive programming: under 15 lines of code, yet supports point updates and prefix queries in O(log N) time.
3.10.1 The Core Idea: What Is lowbit?
Bitwise Principle of lowbit
For any positive integer x, lowbit(x) = x & (-x) returns the value of the lowest set bit in the binary representation of x.
x = 6 → binary: 0110
-x = -6 → two's complement: 1010 (bitwise NOT + 1)
x & (-x) = 0010 = 2 ← lowest set bit corresponds to 2^1 = 2
Examples:
| x | Binary | -x (two's complement) | x & (-x) | Meaning |
|---|---|---|---|---|
| 1 | 0001 | 1111 | 0001 = 1 | Manages 1 element |
| 2 | 0010 | 1110 | 0010 = 2 | Manages 2 elements |
| 3 | 0011 | 1101 | 0001 = 1 | Manages 1 element |
| 4 | 0100 | 1100 | 0100 = 4 | Manages 4 elements |
| 6 | 0110 | 1010 | 0010 = 2 | Manages 2 elements |
| 8 | 1000 | 1000 | 1000 = 8 | Manages 8 elements |
BIT Tree Index Intuition
The elegance of BIT: tree[i] does not store a single element, but stores the sum of a range, with length exactly lowbit(i).
BIT tree structure diagram (n=8):
flowchart BT
T1["tree[1]\nA[1]\nmanages 1"]
T2["tree[2]\nA[1..2]\nmanages 2"]
T3["tree[3]\nA[3]\nmanages 1"]
T4["tree[4]\nA[1..4]\nmanages 4"]
T5["tree[5]\nA[5]\nmanages 1"]
T6["tree[6]\nA[5..6]\nmanages 2"]
T7["tree[7]\nA[7]\nmanages 1"]
T8["tree[8]\nA[1..8]\nmanages 8"]
T1 --> T2
T3 --> T4
T2 --> T4
T5 --> T6
T7 --> T8
T6 --> T8
T4 --> T8
style T8 fill:#dbeafe,stroke:#3b82f6
style T4 fill:#e0f2fe,stroke:#0284c7
style T2 fill:#f0f9ff,stroke:#38bdf8
style T6 fill:#f0f9ff,stroke:#38bdf8
Jump path for querying prefix(7):
flowchart LR
Q7["i=7\nadd tree[7]=A[7]"] -->|"7-lowbit(7)=6"| Q6
Q6["i=6\nadd tree[6]=A[5..6]"] -->|"6-lowbit(6)=4"| Q4
Q4["i=4\nadd tree[4]=A[1..4]"] -->|"4-lowbit(4)=0"| Q0
Q0(["i=0: stop\n3 steps = O(log 7)"])
style Q0 fill:#dcfce7,stroke:#16a34a
💡 Jump pattern: queries jump down via
`i -= lowbit(i)`; updates jump up via `i += lowbit(i)`. Each jump clears the lowest set bit of i, so there are at most log N steps.
Index i: 1 2 3 4 5 6 7 8
Range managed by tree[i]:
tree[1] = A[1] (length lowbit(1)=1)
tree[2] = A[1]+A[2] (length lowbit(2)=2)
tree[3] = A[3] (length lowbit(3)=1)
tree[4] = A[1]+...+A[4] (length lowbit(4)=4)
tree[5] = A[5] (length lowbit(5)=1)
tree[6] = A[5]+A[6] (length lowbit(6)=2)
tree[7] = A[7] (length lowbit(7)=1)
tree[8] = A[1]+...+A[8] (length lowbit(8)=8)
Jump path for updating position 3:
flowchart LR
U3["i=3\nupdate tree[3]"] -->|"3+lowbit(3)=4"| U4
U4["i=4\nupdate tree[4]"] -->|"4+lowbit(4)=8"| U8
U8["i=8\nupdate tree[8]"] -->|"8+lowbit(8)=16>n"| U_end
U_end(["i>n: stop\n3 steps = O(log N)"])
style U_end fill:#dcfce7,stroke:#16a34a
When querying prefix sum prefix(7), jump down via i -= lowbit(i):
- i=7: add tree[7] (manages A[7]), then 7 - lowbit(7) = 7 - 1 = 6
- i=6: add tree[6] (manages A[5..6]), then 6 - lowbit(6) = 6 - 2 = 4
- i=4: add tree[4] (manages A[1..4]), then 4 - lowbit(4) = 4 - 4 = 0, stop
Total: 3 steps = O(log 7).
When updating position 3, jump up via i += lowbit(i):
- i=3: update tree[3], then 3 + lowbit(3) = 3 + 1 = 4
- i=4: update tree[4], then 4 + lowbit(4) = 4 + 4 = 8
- i=8: update tree[8], then 8 + lowbit(8) = 16 > n, stop
3.10.2 Point Update + Prefix Query — Complete Code
// ══════════════════════════════════════════════════════════════
// Fenwick Tree (Binary Indexed Tree) — Classic Implementation
// Supports: Point Update O(log N), Prefix Sum Query O(log N)
// Arrays are 1-INDEXED (critical!)
// ══════════════════════════════════════════════════════════════
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 300005;
int n;
long long tree[MAXN]; // BIT array, 1-indexed
// ── lowbit: returns the value of the lowest set bit ──
// x & (-x) works because:
// -x in two's complement = ~x + 1
// The lowest set bit of x is preserved, all higher bits cancel out
// Example: x=6 (0110), -x=1010, x&(-x)=0010=2
inline int lowbit(int x) {
return x & (-x);
}
// ── update: add val to position i ──
// Walk UP the tree: i += lowbit(i)
// Each ancestor that covers position i gets updated
void update(int i, long long val) {
for (; i <= n; i += lowbit(i))
tree[i] += val;
// Time: O(log N) — at most log2(N) iterations
}
// ── query: return prefix sum A[1..i] ──
// Walk DOWN the tree: i -= lowbit(i)
// Decompose [1..i] into O(log N) non-overlapping ranges
long long query(int i) {
long long sum = 0;
for (; i > 0; i -= lowbit(i))
sum += tree[i];
return sum;
// Time: O(log N) — at most log2(N) iterations
}
// ── build: initialize BIT from an existing array A[1..n] ──
// Method 1: N individual updates — O(N log N)
void build_slow(long long A[]) {
fill(tree + 1, tree + n + 1, 0LL);
for (int i = 1; i <= n; i++)
update(i, A[i]);
}
// Method 2: O(N) build using the "direct parent" trick
void build_fast(long long A[]) {
for (int i = 1; i <= n; i++) {
tree[i] += A[i];
int parent = i + lowbit(i); // direct parent in BIT
if (parent <= n)
tree[parent] += tree[i];
}
}
// ── Full Example ──
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int q;
cin >> n >> q;
long long A[MAXN] = {};
for (int i = 1; i <= n; i++) cin >> A[i];
build_fast(A); // O(N) initialization
while (q--) {
int type;
cin >> type;
if (type == 1) {
// Point update: A[i] += val
int i; long long val;
cin >> i >> val;
update(i, val);
} else {
// Prefix query: sum of A[1..r]
int r;
cin >> r;
cout << query(r) << "\n";
}
}
return 0;
}
3.10.3 Range Query = prefix(r) - prefix(l-1)
Range query sum(l, r) is identical to the prefix sum technique:
// Range sum query: sum of A[l..r]
// Time: O(log N) — two prefix queries
long long range_query(int l, int r) {
return query(r) - query(l - 1);
// query(r) = A[1] + A[2] + ... + A[r]
// query(l-1) = A[1] + A[2] + ... + A[l-1]
// difference = A[l] + A[l+1] + ... + A[r]
}
// Example usage:
// A = [3, 1, 4, 1, 5, 9, 2, 6] (1-indexed)
// range_query(3, 6) = query(6) - query(2)
// = (3+1+4+1+5+9) - (3+1)
// = 23 - 4 = 19
// Verify: A[3]+A[4]+A[5]+A[6] = 4+1+5+9 = 19 ✓
3.10.4 Comparison: Prefix Sum vs BIT vs Segment Tree
| Operation | Prefix Sum Array | Fenwick Tree (BIT) | Segment Tree |
|---|---|---|---|
| Build | O(N) | O(N) or O(N log N) | O(N) |
| Prefix Query | O(1) | O(log N) | O(log N) |
| Range Query | O(1) | O(log N) | O(log N) |
| Point Update | O(N) rebuild | O(log N) ✓ | O(log N) ✓ |
| Range Update | O(N) | O(log N) (Difference BIT) | O(log N) (lazy tag) |
| Range Min/Max | O(1) (sparse table) | ❌ Not supported | ✓ Supported |
| Code Complexity | Minimal | Simple (10 lines) | Complex (50+ lines) |
| Constant Factor | Smallest | Very small | Larger |
| Space | O(N) | O(N) | O(4N) |
When to choose BIT?
- ✅ Only need prefix/range sum + point update
- ✅ Need extremely concise code (fewer bugs in contest)
- ✅ Counting inversions, merge sort counting problems
- ❌ Need range min/max → use Segment Tree
- ❌ Need complex range operations (range multiply, etc.) → use Segment Tree
3.10.5 Interactive Visualization: BIT Update Process
3.10.6 Range Update + Point Query (Difference BIT)
Standard BIT supports "point update + prefix query". Using the difference array technique, it can instead support "range update + point query".
Principle
Let the difference array be D[i] = A[i] - A[i-1] (with D[1] = A[1]). Then:
- A[i] = D[1] + D[2] + ... + D[i] (i.e., A[i] is the prefix sum of D)
- Adding val to all of A[l..r] is equivalent to: D[l] += val; D[r+1] -= val
// ══════════════════════════════════════════════════════════════
// Difference BIT: Range Update + Point Query
// ══════════════════════════════════════════════════════════════
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 300005;
int n;
long long diff_bit[MAXN]; // BIT over difference array D[]
inline int lowbit(int x) { return x & (-x); }
// Update D[i] += val in the difference BIT
void diff_update(int i, long long val) {
for (; i <= n; i += lowbit(i))
diff_bit[i] += val;
}
// Query A[i] = sum of D[1..i] = prefix query on diff BIT
long long diff_query(int i) {
long long s = 0;
for (; i > 0; i -= lowbit(i))
s += diff_bit[i];
return s;
}
// Range update: add val to all A[l..r]
// Equivalent to: D[l] += val, D[r+1] -= val
void range_update(int l, int r, long long val) {
diff_update(l, val); // D[l] += val
diff_update(r + 1, -val); // D[r+1] -= val
}
// Point query: return current value of A[i]
// A[i] = D[1] + D[2] + ... + D[i] = prefix_sum(D, i)
long long point_query(int i) {
return diff_query(i);
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int q;
cin >> n >> q;
// Initialize: read A[i] and insert D[i] = A[i] - A[i-1] into the BIT
long long prev = 0;
for (int i = 1; i <= n; i++) {
long long x; cin >> x;
diff_update(i, x - prev); // D[i] = A[i] - A[i-1] (D[1] = A[1])
prev = x;
}
while (q--) {
int type; cin >> type;
if (type == 1) {
int l, r; long long val;
cin >> l >> r >> val;
range_update(l, r, val); // A[l..r] += val, O(log N)
} else {
int i; cin >> i;
cout << point_query(i) << "\n"; // query A[i], O(log N)
}
}
return 0;
}
Advanced: Range Update + Range Query (Dual BIT)
To support both range update + range query simultaneously, use two BITs:
// ══════════════════════════════════════════════════════════════
// Double BIT: Range Update + Range Query
// Formula: sum(1..r) = B1[r] * r - B2[r]
// where B1 is BIT over D[], B2 is BIT over (i-1)*D[i]
// ══════════════════════════════════════════════════════════════
long long B1[MAXN], B2[MAXN]; // Two BITs
inline int lowbit(int x) { return x & (-x); }
void add(long long* b, int i, long long v) {
for (; i <= n; i += lowbit(i)) b[i] += v;
}
long long sum(long long* b, int i) {
long long s = 0;
for (; i > 0; i -= lowbit(i)) s += b[i];
return s;
}
// Range update: add val to A[l..r]
void range_add(int l, int r, long long val) {
add(B1, l, val);
add(B1, r + 1, -val);
add(B2, l, val * (l - 1)); // compensate for prefix formula
add(B2, r + 1, -val * r);
}
// Prefix sum A[1..r]
long long prefix_sum(int r) {
return sum(B1, r) * r - sum(B2, r);
}
// Range sum A[l..r]
long long range_sum(int l, int r) {
return prefix_sum(r) - prefix_sum(l - 1);
}
3.10.7 USACO-Style Problem: Counting Inversions with BIT
Problem Statement
Counting Inversions (O(N log N))
Given an integer array A of length N (distinct elements, range 1..N), count the number of inversions.
Inversion: a pair of indices (i, j) where i < j but A[i] > A[j].
Constraints: N ≤ 3×10⁵, requires O(N log N) solution.
Sample Input:
5
3 1 4 2 5
Sample Output:
3
Explanation: Inversions are (3,1), (3,2), (4,2), total 3 pairs.
Solution: BIT Inversion Count
// ══════════════════════════════════════════════════════════════
// Counting Inversions using Fenwick Tree — O(N log N)
//
// Key Idea:
// Process A[i] from left to right.
// For each A[i], the number of inversions with A[i] as the
// RIGHT element = count of already-processed values > A[i]
// = (elements processed so far) - (elements <= A[i])
// = i-1 - prefix_query(A[i])
// Sum over all i gives total inversions.
//
// BIT role: track frequency of seen values.
// After seeing value v: update(v, +1)
// Query # of values <= x: query(x)
// ══════════════════════════════════════════════════════════════
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const int MAXN = 300005;
int n;
int bit[MAXN]; // BIT for frequency counting; bit[v] tracks how many times v appeared
inline int lowbit(int x) { return x & (-x); }
// Add 1 to position v (we saw value v)
void update(int v) {
for (; v <= n; v += lowbit(v))
bit[v]++;
}
// Count how many values in [1..v] have been seen
int query(int v) {
int cnt = 0;
for (; v > 0; v -= lowbit(v))
cnt += bit[v];
return cnt;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
cin >> n;
ll inversions = 0;
for (int i = 1; i <= n; i++) {
int a;
cin >> a;
// Count inversions where a is the RIGHT element:
// # of already-seen values GREATER than a
// = (i-1 elements seen so far) - (# of seen values <= a)
int less_or_equal = query(a); // # of seen values in [1..a]
int greater = (i - 1) - less_or_equal; // # of seen values in [a+1..n]
inversions += greater;
// Mark that we've now seen value a
update(a);
}
cout << inversions << "\n";
return 0;
}
/*
Trace for A = [3, 1, 4, 2, 5]:
i=1, a=3: seen=[], query(3)=0, greater=0-0=0. inversions=0. update(3).
i=2, a=1: seen=[3], query(1)=0, greater=1-0=1. inversions=1. update(1).
(3 > 1: that's 1 inversion: (3,1) ✓)
i=3, a=4: seen=[3,1], query(4)=2, greater=2-2=0. inversions=1. update(4).
(no element > 4 was seen before)
i=4, a=2: seen=[3,1,4], query(2)=1, greater=3-1=2. inversions=3. update(2).
(3>2 and 4>2: 2 inversions: (3,2),(4,2) ✓)
i=5, a=5: seen=[3,1,4,2], query(5)=4, greater=4-4=0. inversions=3. update(5).
Final: 3 ✓
*/
Complexity Analysis:
- Time: O(N log N) — N iterations, each O(log N) for update + query
- Space: O(N) for BIT
Extension: If array elements are not in range 1..N, first apply coordinate compression before using BIT:
// Coordinate compression for arbitrary values
vector<int> A(n);
for (int i = 0; i < n; i++) cin >> A[i];
// Step 1: sort and deduplicate
vector<int> sorted_A = A;
sort(sorted_A.begin(), sorted_A.end());
sorted_A.erase(unique(sorted_A.begin(), sorted_A.end()), sorted_A.end());
// Step 2: replace each value with its rank (1-indexed)
for (int i = 0; i < n; i++) {
A[i] = lower_bound(sorted_A.begin(), sorted_A.end(), A[i]) - sorted_A.begin() + 1;
// A[i] is now in [1..M] where M = sorted_A.size()
}
// Now use BIT with n = sorted_A.size()
3.10.8 Common Mistakes
❌ Mistake 1: Wrong lowbit Implementation
// ❌ WRONG — common typo/confusion
int lowbit(int x) { return x & (x - 1); } // This CLEARS the lowest bit, not returns it!
// x=6 (0110): x&(x-1) = 0110&0101 = 0100 = 4 (WRONG, should be 2)
int lowbit(int x) { return x % 2; } // Returns only parity (0 or 1), not the lowest set bit's value
// ✅ CORRECT
int lowbit(int x) { return x & (-x); }
// x=6: -6 = ...11111010 (two's complement)
// 0110 & 11111010 = 0010 = 2 ✓
Memory trick: x & (-x) reads as "x AND negative-x". -x is bitwise NOT plus 1, which clears all bits below the lowest set bit, flips all bits above it, and the AND operation keeps only the lowest set bit.
❌ Mistake 2: 0-indexed Array (the 0-index trap)
BIT must use 1-indexed arrays. 0-indexed causes infinite loops!
// ❌ WRONG — 0-indexed arrays break the BIT
// lowbit(0) = 0 & (-0) = 0, so update at index 0 never advances:
// i += lowbit(0) leaves i at 0 → infinite loop!
// (A prefix query called with i=0 just skips its loop — silently returns 0)
void update_WRONG(int i, long long v) { // i is 0-indexed
    for (; i <= n; i += lowbit(i)) // if i starts at 0, i never increases
        bit[i] += v;
}
// ❌ WRONG — forgetting +1 when converting to 1-indexed
int arr[n]; // 0-indexed A[0..n-1]
for (int i = 0; i < n; i++) {
update(i, arr[i]); // BUG: should be update(i+1, arr[i])
}
// ✅ CORRECT — always shift to 1-indexed
for (int i = 0; i < n; i++) {
update(i + 1, arr[i]); // convert 0-indexed i to 1-indexed i+1
}
// And remember: query(r+1) - query(l) for 0-indexed range [l, r]
❌ Mistake 3: Integer Overflow in Large Sum
// ❌ WRONG — tree[] should be long long for large sums
int tree[MAXN]; // overflow if sum > 2^31
// ✅ CORRECT
long long tree[MAXN];
// Also: when counting inversions, inversions can be up to N*(N-1)/2 ≈ 4.5×10^10 for N=3×10^5
// Always use long long for the result counter!
long long inversions = 0; // ✅ not int!
❌ Mistake 4: Forgetting to Clear BIT Between Test Cases
// ❌ WRONG — in problems with multiple test cases
int T; cin >> T;
while (T--) {
// forgot to clear tree[]!
// Old data from previous test case corrupts results
solve();
}
// ✅ CORRECT — reset before each test case
int T; cin >> T;
while (T--) {
fill(tree + 1, tree + n + 1, 0LL); // clear BIT
solve();
}
3.10.9 Chapter Summary
📋 Formula Quick Reference
| Operation | Code | Description |
|---|---|---|
| lowbit | x & (-x) | Value of lowest set bit of x |
| Point Update | for(;i<=n;i+=lowbit(i)) t[i]+=v | Propagate upward |
| Prefix Query | for(;i>0;i-=lowbit(i)) s+=t[i] | Decompose downward |
| Range Query | query(r) - query(l-1) | Difference formula |
| Range Update (Diff BIT) | upd(l,+v); upd(r+1,-v) | Difference array |
| Inversion Count | (i-1) - query(a[i]) | Count when processing each element |
| Array must be | 1-indexed | 0-indexed → infinite loop |
❓ FAQ
Q1: Both BIT and Segment Tree support prefix sum + point update. Which should I choose?
A: Use BIT whenever possible. BIT code is only 10 lines, has smaller constants (empirically 2-3x faster), and lower error probability. Only choose Segment Tree when you need range min/max (RMQ), range coloring, or more complex range operations. In contests, BIT is the "default weapon", Segment Tree is "heavy artillery".
Q2: Can BIT support Range Minimum Query (RMQ)?
A: Standard BIT cannot support RMQ, because the min operation has no "inverse" (cannot "undo" a merged min value like subtraction). For range min/max, use Segment Tree or Sparse Table. There is a "static BIT for RMQ" technique, but it only works without updates and has limited practical use.
Q3: Can BIT support 2D (2D BIT)?
A: Yes! 2D BIT solves 2D prefix sum + point update problems, with complexity O(log N × log M). The code structure uses two nested loops:
// 2D BIT point update: O(log N × log M)
void update2D(int x, int y, long long v) {
    for (int i = x; i <= N; i += lowbit(i))
        for (int j = y; j <= M; j += lowbit(j))
            bit[i][j] += v;
}
Less common in USACO, but occasionally needed for 2D coordinate counting problems.
3.10.10 Practice Problems
Given an array of length N, support two operations:
- 1 i x: Increase A[i] by x
- 2 l r: Query A[l] + A[l+1] + ... + A[r]
Constraints: N, Q ≤ 10⁵.
Hint: Direct BIT application. Use update(i, x) and query(r) - query(l-1).
Given N operations, each either inserts an integer (range 1..10⁶) or queries "how many of the currently inserted integers are ≤ K?"
Hint: BIT maintains a frequency array over the value domain. update(v, 1) inserts value v, query(K) is the answer.
Given an array of length N (initially all zeros), support two operations:
- 1 l r x: Add x to every element in A[l..r]
- 2 i: Query the current value of A[i]
Constraints: N, Q ≤ 3×10⁵.
Hint: Use Difference BIT (Section 3.10.6).
Given an array of length N with elements in range 1..10⁹ (possibly repeated). Count the number of inversions.
Constraints: N ≤ 3×10⁵.
Hint: First apply coordinate compression, then use BIT counting (variant of Section 3.10.7). Note equal elements: (i,j) with i<j and A[i]>A[j] (strictly greater) counts as an inversion.
Given an array of length N, support two operations:
- 1 l r x: Add x to every element in A[l..r]
- 2 l r: Query A[l] + ... + A[r]
Constraints: N, Q ≤ 3×10⁵, elements and x can reach 10⁹.
Hint: Use Dual BIT (Dual BIT method at end of Section 3.10.6). Formula: prefix_sum(r) = B1[r] * r - B2[r], where B1 maintains the difference array and B2 maintains the weighted difference array. Derivation: let D[i] be the difference array, then A[1]+...+A[r] = Σᵢ₌₁ʳ Σⱼ₌₁ⁱ D[j] = Σⱼ₌₁ʳ D[j]*(r-j+1) = (r+1)Σ D[j] - Σ jD[j].
💡 Chapter Connection: BIT and Segment Tree are the two most commonly paired data structures in USACO. BIT handles 80% of scenarios with 1/5 the code of Segment Tree. After mastering BIT, return to Chapter 3.9 to learn Segment Tree lazy propagation—the territory BIT cannot reach.
Chapter 3.11: Binary Trees
Binary trees are the foundation of some of the most important data structures in competitive programming — from Binary Search Trees (BST) to Segment Trees to Heaps. Understanding them deeply will make graph algorithms, DP on trees, and USACO Gold problems significantly more approachable.
3.11.1 Binary Tree Fundamentals
A binary tree is a hierarchical data structure where:
- Each node has at most 2 children: a left child and a right child
- There is exactly one root node (no parent)
- Each non-root node has exactly one parent
Leaf — node with no children
Internal node — node with at least one child
Height — longest path from root to any leaf
Depth — distance from root to that node
Subtree — a node and all its descendants
Visual Example
In this tree:
- Height = 2 (longest root-to-leaf path: A → B → D)
- Root = A, Leaves = D, E, F
- B is parent of D and E; D is left child of B, E is right child of B
C++ Node Definition
Throughout this chapter, we use a consistent struct TreeNode:
// Solution: Basic Binary Tree Node
#include <bits/stdc++.h>
using namespace std;
struct TreeNode {
int val;
TreeNode* left;
TreeNode* right;
// Constructor: initialize with value, no children
TreeNode(int v) : val(v), left(nullptr), right(nullptr) {}
};
💡 Why raw pointers? In competitive programming, we often manage memory manually for speed.
`nullptr` (C++11) is always safer than uninitialized pointers — always initialize `left = right = nullptr`.
3.11.2 Binary Search Trees (BST)
A Binary Search Tree is a binary tree with a crucial ordering property:
BST Property: For every node v:
- All values in the left subtree are strictly less than v.val
- All values in the right subtree are strictly greater than v.val
[5] ← valid BST
/ \
[3] [8]
/ \ / \
[1] [4] [7] [10]
left of 5 = {1, 3, 4} — all < 5 ✓
right of 5 = {7, 8, 10} — all > 5 ✓
3.11.2.1 BST Search
// Solution: BST Search — O(log N) average, O(N) worst case
// Returns pointer to node with value 'target', or nullptr if not found
TreeNode* search(TreeNode* root, int target) {
// Base case: empty tree or found the target
if (root == nullptr || root->val == target) {
return root;
}
// BST property: go left if target is smaller
if (target < root->val) {
return search(root->left, target);
}
// Go right if target is larger
return search(root->right, target);
}
Iterative version (avoids stack overflow for large trees):
// Solution: BST Search Iterative
TreeNode* searchIterative(TreeNode* root, int target) {
while (root != nullptr) {
if (target == root->val) return root; // found
else if (target < root->val) root = root->left; // go left
else root = root->right; // go right
}
return nullptr; // not found
}
3.11.2.2 BST Insert
// Solution: BST Insert — O(log N) average
// Returns the (potentially new) root of the subtree
TreeNode* insert(TreeNode* root, int val) {
// If we've reached a null spot, create the new node here
if (root == nullptr) {
return new TreeNode(val);
}
if (val < root->val) {
root->left = insert(root->left, val); // recurse left
} else if (val > root->val) {
root->right = insert(root->right, val); // recurse right
}
// val == root->val: duplicate, ignore (or handle as needed)
return root;
}
// Usage:
// TreeNode* root = nullptr;
// root = insert(root, 5);
// root = insert(root, 3);
// root = insert(root, 8);
3.11.2.3 BST Delete
Deletion is the trickiest BST operation. There are 3 cases:
- Node has no children (leaf): simply delete it
- Node has one child: replace node with its child
- Node has two children: replace with inorder successor (smallest in right subtree), then delete the successor
// Solution: BST Delete — O(log N) average
// Helper: find minimum node in a subtree
TreeNode* findMin(TreeNode* node) {
while (node->left != nullptr) node = node->left;
return node;
}
// Delete node with value 'val' from tree rooted at 'root'
TreeNode* deleteNode(TreeNode* root, int val) {
if (root == nullptr) return nullptr; // value not found
if (val < root->val) {
// Case: target is in left subtree
root->left = deleteNode(root->left, val);
} else if (val > root->val) {
// Case: target is in right subtree
root->right = deleteNode(root->right, val);
} else {
// Found the node to delete!
// Case 1: No children (leaf)
if (root->left == nullptr && root->right == nullptr) {
delete root;
return nullptr;
}
// Case 2a: Only right child
else if (root->left == nullptr) {
TreeNode* temp = root->right;
delete root;
return temp;
}
// Case 2b: Only left child
else if (root->right == nullptr) {
TreeNode* temp = root->left;
delete root;
return temp;
}
// Case 3: Two children — replace with inorder successor
else {
TreeNode* successor = findMin(root->right); // smallest in right subtree
root->val = successor->val; // copy successor's value
root->right = deleteNode(root->right, successor->val); // delete successor
}
}
return root;
}
3.11.2.4 BST Degeneration Problem
⚠️ Critical Issue: If you insert values in sorted order (1, 2, 3, 4, 5...), the BST becomes a linked list:
[1]
\
[2]
\
[3] ← This is O(N) per operation, not O(log N)!
\
[4]
\
[5]
This is why balanced BSTs (AVL trees, Red-Black trees) exist. In C++, std::set and std::map are implemented as Red-Black trees — always O(log N).
💡 Contest tip: use std::set / std::map instead of writing your own BST — they are always balanced. Learn BST fundamentals to understand why they work, then use the STL in contests (see Chapter 3.8).
3.11.3 Tree Traversals
Traversal = visiting every node exactly once. There are 4 fundamental traversals:
| Traversal | Order | Use Case |
|---|---|---|
| Preorder | Root → Left → Right | Copy tree, prefix expression |
| Inorder | Left → Root → Right | Sorted output from BST |
| Postorder | Left → Right → Root | Delete tree, postfix expression |
| Level-order | BFS by depth | Find shortest path, level operations |
3.11.3.1 Preorder Traversal
// Solution: Preorder Traversal — O(N) time, O(H) space (H = height)
// Visit order: Root, Left subtree, Right subtree
void preorder(TreeNode* root) {
if (root == nullptr) return; // base case
cout << root->val << " "; // process ROOT first
preorder(root->left); // then left subtree
preorder(root->right); // then right subtree
}
// For the tree: [5]
// / \
// [3] [8]
// / \
// [1] [4]
// Preorder: 5 3 1 4 8
Iterative Preorder (using stack):
// Solution: Preorder Iterative
void preorderIterative(TreeNode* root) {
if (root == nullptr) return;
stack<TreeNode*> stk;
stk.push(root);
while (!stk.empty()) {
TreeNode* node = stk.top(); stk.pop();
cout << node->val << " "; // process current
// Push RIGHT first (so LEFT is processed first — LIFO!)
if (node->right) stk.push(node->right);
if (node->left) stk.push(node->left);
}
}
3.11.3.2 Inorder Traversal
// Solution: Inorder Traversal — O(N) time
// Visit order: Left subtree, Root, Right subtree
// KEY PROPERTY: Inorder traversal of a BST gives SORTED output!
void inorder(TreeNode* root) {
if (root == nullptr) return;
inorder(root->left); // left subtree first
cout << root->val << " "; // then ROOT
inorder(root->right); // then right subtree
}
// For BST with values {1, 3, 4, 5, 8}:
// Inorder: 1 3 4 5 8 ← sorted! This is the most important BST property
🔑 Key Insight: Inorder traversal of any BST always produces a sorted sequence. This is why
std::setcan be iterated in sorted order — it uses inorder traversal internally.
Iterative Inorder (slightly trickier):
// Solution: Inorder Iterative
void inorderIterative(TreeNode* root) {
stack<TreeNode*> stk;
TreeNode* curr = root;
while (curr != nullptr || !stk.empty()) {
// Go as far left as possible
while (curr != nullptr) {
stk.push(curr);
curr = curr->left;
}
// Process the leftmost unprocessed node
curr = stk.top(); stk.pop();
cout << curr->val << " ";
// Move to right subtree
curr = curr->right;
}
}
3.11.3.3 Postorder Traversal
// Solution: Postorder Traversal — O(N) time
// Visit order: Left subtree, Right subtree, Root
// Used for: deleting trees, evaluating expression trees
void postorder(TreeNode* root) {
if (root == nullptr) return;
postorder(root->left); // left subtree first
postorder(root->right); // then right subtree
cout << root->val << " "; // ROOT last
}
// For the BST built from {5, 3, 8, 1, 4} (the same tree as above):
// Postorder: 1 4 3 8 5 (root 5 is always last)
// ── Memory cleanup using postorder ──
void deleteTree(TreeNode* root) {
if (root == nullptr) return;
deleteTree(root->left); // delete left first
deleteTree(root->right); // then right
delete root; // then this node (safe: children already deleted)
}
3.11.3.4 Level-Order Traversal (BFS)
// Solution: Level-Order Traversal (BFS) — O(N) time, O(W) space (W = max width)
// Uses a queue: process nodes level by level
void levelOrder(TreeNode* root) {
if (root == nullptr) return;
queue<TreeNode*> q;
q.push(root);
while (!q.empty()) {
int levelSize = q.size(); // number of nodes at current level
for (int i = 0; i < levelSize; i++) {
TreeNode* node = q.front(); q.pop();
cout << node->val << " ";
if (node->left) q.push(node->left);
if (node->right) q.push(node->right);
}
cout << "\n"; // newline between levels
}
}
// For the BST [5, 3, 8, 1, 4]:
// Level 0: 5
// Level 1: 3 8
// Level 2: 1 4
Traversal Summary Table
Tree: [5]
/ \
[3] [8]
/ \ /
[1] [4] [7]
Preorder: 5 3 1 4 8 7
Inorder: 1 3 4 5 7 8 ← sorted!
Postorder: 1 4 3 7 8 5
Level-order: 5 | 3 8 | 1 4 7
3.11.4 Tree Height and Balance
3.11.4.1 Computing Tree Height
// Solution: Tree Height — O(N) time, O(H) space for recursion stack
// Height = length of longest root-to-leaf path
// Convention: height of null tree = -1, leaf node height = 0
int height(TreeNode* root) {
if (root == nullptr) return -1; // empty subtree has height -1
int leftHeight = height(root->left); // height of left subtree
int rightHeight = height(root->right); // height of right subtree
return 1 + max(leftHeight, rightHeight); // +1 for current node
}
// Time: O(N) — visit every node exactly once
// Space: O(H) — recursion stack depth = tree height
// Alternative: some define height as number of nodes on longest path
// Then: leaf has height 1, and empty tree has height 0
// Be careful about which convention your problem uses!
3.11.4.2 Checking Balance
A balanced binary tree requires that for every node, the heights of its left and right subtrees differ by at most 1.
// Solution: Check Balanced BST — O(N) time
// Returns -1 if unbalanced, otherwise returns the height of subtree
int checkBalanced(TreeNode* root) {
if (root == nullptr) return 0; // empty is balanced; height 0 (node-count convention here)
int leftH = checkBalanced(root->left);
if (leftH == -1) return -1; // left subtree is unbalanced
int rightH = checkBalanced(root->right);
if (rightH == -1) return -1; // right subtree is unbalanced
// Check balance at current node: heights can differ by at most 1
if (abs(leftH - rightH) > 1) return -1; // unbalanced!
return 1 + max(leftH, rightH); // return height if balanced
}
bool isBalanced(TreeNode* root) {
return checkBalanced(root) != -1;
}
3.11.4.3 Counting Nodes
// Solution: Count Nodes — O(N)
int countNodes(TreeNode* root) {
if (root == nullptr) return 0;
return 1 + countNodes(root->left) + countNodes(root->right);
}
// Count leaves specifically
int countLeaves(TreeNode* root) {
if (root == nullptr) return 0;
if (root->left == nullptr && root->right == nullptr) return 1; // leaf!
return countLeaves(root->left) + countLeaves(root->right);
}
3.11.5 Lowest Common Ancestor (LCA) — Brute Force
The LCA of two nodes u and v in a rooted tree is the deepest node that is an ancestor of both.
[1]
/ \
[2] [3]
/ \ \
[4] [5] [6]
/
[7]
LCA(4, 5) = 2 (both 4 and 5 are descendants of 2)
LCA(4, 6) = 1 (deepest common ancestor is the root 1)
LCA(2, 4) = 2 (node 2 is ancestor of 4 and ancestor of itself)
O(N) Brute Force LCA
// Solution: LCA Brute Force — O(N) per query
// Strategy: find path from root to each node, then find last common node
// Step 1: Find path from root to target node
bool findPath(TreeNode* root, int target, vector<int>& path) {
if (root == nullptr) return false;
path.push_back(root->val); // add current node to path
if (root->val == target) return true; // found target!
// Try left then right
if (findPath(root->left, target, path)) return true;
if (findPath(root->right, target, path)) return true;
path.pop_back(); // backtrack: target not in this subtree
return false;
}
// Step 2: Find LCA using two paths
int lca(TreeNode* root, int u, int v) {
vector<int> pathU, pathV;
findPath(root, u, pathU); // path from root to u
findPath(root, v, pathV); // path from root to v
// Find last common node in both paths
int result = root->val;
int minLen = min(pathU.size(), pathV.size());
for (int i = 0; i < minLen; i++) {
if (pathU[i] == pathV[i]) {
result = pathU[i]; // still common
} else {
break; // diverged
}
}
return result;
}
💡 USACO Note: For USACO Silver problems, the O(N) brute-force LCA is NOT always sufficient. With N ≤ 10^5 nodes and Q ≤ 10^5 queries, the total is O(NQ) = O(10^10) — too slow. Use it only when N, Q ≤ 5000. Chapter 5.3 covers O(log N) LCA with binary lifting for harder problems.
3.11.6 Complete BST Implementation
Here's a complete, contest-ready BST with all operations:
// Solution: Complete BST Implementation
#include <bits/stdc++.h>
using namespace std;
struct TreeNode {
int val;
TreeNode* left;
TreeNode* right;
TreeNode(int v) : val(v), left(nullptr), right(nullptr) {}
};
struct BST {
TreeNode* root;
BST() : root(nullptr) {}
// ── Insert ──
TreeNode* _insert(TreeNode* node, int val) {
if (!node) return new TreeNode(val);
if (val < node->val) node->left = _insert(node->left, val);
else if (val > node->val) node->right = _insert(node->right, val);
return node;
}
void insert(int val) { root = _insert(root, val); }
// ── Search ──
bool search(int val) {
TreeNode* curr = root;
while (curr) {
if (val == curr->val) return true;
curr = (val < curr->val) ? curr->left : curr->right;
}
return false;
}
// ── Inorder (sorted output) ──
void _inorder(TreeNode* node, vector<int>& result) {
if (!node) return;
_inorder(node->left, result);
result.push_back(node->val);
_inorder(node->right, result);
}
vector<int> getSorted() {
vector<int> result;
_inorder(root, result);
return result;
}
// ── Height ──
int _height(TreeNode* node) {
if (!node) return -1;
return 1 + max(_height(node->left), _height(node->right));
}
int height() { return _height(root); }
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
BST bst;
vector<int> vals = {5, 3, 8, 1, 4, 7, 10};
for (int v : vals) bst.insert(v);
cout << "Sorted: ";
for (int v : bst.getSorted()) cout << v << " ";
cout << "\n";
// Output: 1 3 4 5 7 8 10
cout << "Height: " << bst.height() << "\n"; // 2
cout << "Search 4: " << bst.search(4) << "\n"; // 1 (true)
cout << "Search 6: " << bst.search(6) << "\n"; // 0 (false)
return 0;
}
3.11.7 USACO-Style Practice Problem
Problem: "Cow Family Tree" (USACO Bronze Style)
Problem Statement:
Farmer John has N cows numbered 1 to N. Cow 1 is the ancestor of all cows (the "root"). For each cow i (2 ≤ i ≤ N), its parent is cow parent[i]. The depth of a cow is defined as the number of edges from the root (cow 1) to that cow (so cow 1 has depth 0).
Given the tree and M queries, each asking "what is the depth of cow x?", answer all queries.
Input:
- Line 1: N, M (1 ≤ N, M ≤ 100,000)
- Lines 2 to N: each line contains i and parent[i]
- Next M lines: each contains a single integer x
Output: For each query, print the depth of cow x.
Sample Input:
5 3
2 1
3 1
4 2
5 3
4
5
1
Sample Output:
2
2
0
- Cow 4's path: 4→2→1, depth = 2
- Cow 5's path: 5→3→1, depth = 2
- Cow 1: root, depth = 0
Solution Approach: Use DFS/BFS to compute depth of each node.
// Solution: Cow Family Tree — Depth Query
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;
vector<int> children[MAXN]; // adjacency list: children[i] = list of i's children
int depth[MAXN]; // depth[i] = depth of node i
// DFS to compute depths
void dfs(int node, int d) {
depth[node] = d;
for (int child : children[node]) {
dfs(child, d + 1); // children have depth+1
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 2; i <= n; i++) {
int idx, par;
cin >> idx >> par; // line format: i parent[i]
children[par].push_back(idx); // par is parent of idx
}
dfs(1, 0); // start DFS from root (cow 1) at depth 0
while (m--) {
int x;
cin >> x;
cout << depth[x] << "\n";
}
return 0;
}
// Time: O(N + M)
// Space: O(N)
💡 Extension: What if we want sum of values on path to root?
// Instead of depth, compute path sum (sum of node values on path to root)
int pathSum[MAXN]; // pathSum[i] = sum of values from root to i
int nodeVal[MAXN]; // nodeVal[i] = value of node i
void dfs(int node, int cumSum) {
pathSum[node] = cumSum + nodeVal[node];
for (int child : children[node]) {
dfs(child, pathSum[node]);
}
}
// Query: just return pathSum[x] in O(1)
3.11.8 Building a Tree from Traversals
A classic problem: given preorder and inorder traversals, reconstruct the original tree.
Key insight:
- The first element of preorder is always the root
- In the inorder array, the root splits it into left and right subtrees
// Solution: Reconstruct Tree from Preorder + Inorder — O(N^2) naive
TreeNode* build(vector<int>& pre, int preL, int preR,
vector<int>& in, int inL, int inR) {
if (preL > preR) return nullptr;
int rootVal = pre[preL]; // first preorder element = root
TreeNode* root = new TreeNode(rootVal);
// Find root in inorder array
int rootIdx = inL;
while (in[rootIdx] != rootVal) rootIdx++;
int leftSize = rootIdx - inL; // number of nodes in left subtree
// Recursively build left and right subtrees
root->left = build(pre, preL+1, preL+leftSize, in, inL, rootIdx-1);
root->right = build(pre, preL+leftSize+1, preR, in, rootIdx+1, inR);
return root;
}
TreeNode* buildTree(vector<int>& preorder, vector<int>& inorder) {
int n = preorder.size();
return build(preorder, 0, n-1, inorder, 0, n-1);
}
⚠️ Common Mistakes
// BAD: No null check!
void inorder(TreeNode* root) {
inorder(root->left); // CRASH if root is null
cout << root->val;
inorder(root->right);
}
// GOOD: Base case first
void inorder(TreeNode* root) {
if (root == nullptr) return; // ← critical!
inorder(root->left);
cout << root->val;
inorder(root->right);
}
// BAD: Recursive DFS on a degenerate (skewed)
// tree with 10^5 nodes = 10^5 recursion depth
// Default stack ~8 MB → overflow around depth 10^5!
void dfsRecursive(TreeNode* root) {
if (!root) return;
process(root);
dfsRecursive(root->left);
dfsRecursive(root->right);
}
// GOOD: Use explicit stack for large trees
void dfsIterative(TreeNode* root) {
stack<TreeNode*> stk;
if (root) stk.push(root);
while (!stk.empty()) {
TreeNode* node = stk.top(); stk.pop();
process(node);
if (node->right) stk.push(node->right);
if (node->left) stk.push(node->left);
}
}
Top 5 BST/Tree Bugs
- Forgetting the nullptr base case — causes an immediate segfault
- Not returning the (potentially new) root from insert/delete — tree structure broken
- Stack overflow — use iterative traversal for N > 10^5
- Memory leak — always delete nodes you remove (or use smart pointers)
- Hand-writing an unbalanced BST when STL would work — use std::set in contests
Chapter Summary
📌 Key Takeaways
| Concept | Key Point | Time Complexity |
|---|---|---|
| BST Search | Follow left/right based on comparison | O(log N) avg, O(N) worst |
| BST Insert | Find correct position, insert at null | O(log N) avg |
| BST Delete | 3 cases: leaf, one child, two children | O(log N) avg |
| Inorder | Left → Root → Right | O(N) |
| Preorder | Root → Left → Right | O(N) |
| Postorder | Left → Right → Root | O(N) |
| Level-order | BFS by level | O(N) |
| Height | max(leftH, rightH) + 1 | O(N) |
| Balance Check | abs(leftH - rightH) ≤ 1 at every node | O(N) |
| LCA (brute) | Find paths, compare | O(N) per query |
❓ FAQ
Q1: When should I use BST vs std::set?
A: In competitive programming, almost always use std::set. It is backed by a red-black tree (a balanced BST), guaranteeing O(log N); a hand-written BST may degenerate to O(N). Only write your own BST when you need custom behavior (e.g., tracking subtree sizes for "K-th largest" queries), or use __gnu_pbds::tree (a Policy-Based Data Structure).
Q2: What is the relationship between Segment Tree and BST?
A: Segment Tree (Chapter 3.9) is a complete binary tree, but not a BST—nodes store range aggregate values (like range sums), not ordered keys. Both are binary trees with similar structure, but completely different purposes. Understanding BST pointer/recursion operations makes Segment Tree code easier to understand.
Q3: Which traversal—preorder/inorder/postorder—is most common in contests?
A: Inorder is most important—it outputs the BST's sorted sequence. Postorder is common for tree DP (compute children before parent). Level-order (BFS) is used when processing by level. Preorder is less common, but useful for serializing/deserializing trees.
Q4: Which is better, recursive or iterative implementation?
A: Recursive code is concise and easy to understand (preferred in contests). But when N ≥ 10^5 and the tree may degenerate, recursion risks stack overflow (the default ~8 MB stack supports roughly 10^4–10^5 levels, depending on frame size). USACO problems usually have non-degenerate trees, so recursion is usually fine; if unsure, iterative is safer.
Q5: How important is LCA in competitive programming?
A: Very important! LCA is the foundation of tree DP and path queries. It appears occasionally in USACO Silver and is almost always tested in USACO Gold. The O(N) brute-force LCA learned here handles N ≤ 5000. The O(log N) binary-lifting LCA is covered in detail in Chapter 5.3 (Trees & Special Graphs).
🔗 Connections to Other Chapters
- Chapter 2.3 (Functions & Arrays): foundation of recursion—binary tree traversal is a perfect application of recursion
- Chapter 3.8 (Maps & Sets): std::set / std::map are backed by balanced BSTs; understanding BSTs helps you use them better
- Chapter 3.9 (Segment Trees): a Segment Tree is a complete binary tree; the recursive structure of build/query/update is identical to BST traversal
- Chapter 5.2 (Graph Algorithms): trees are special undirected graphs (connected, acyclic); all tree algorithms are special cases of graph algorithms
- Chapter 5.3 (Trees & Special Graphs): LCA Binary Lifting, Euler Tour—built directly on this chapter's foundation
Practice Problems
Problem 3.11.1 — BST Validator 🟢 Easy
Given a binary tree (not necessarily a BST), determine if it satisfies the BST property (all left subtree values < node < all right subtree values, for every node).
Hint
Common mistake: only checking `root->left->val < root->val` is NOT enough (it doesn't verify the full subtree). Pass `minVal` and `maxVal` bounds down the recursion: `isValidBST(root, INT_MIN, INT_MAX)`.
Problem 3.11.2 — BST Inorder K-th Smallest 🟢 Easy
Given a BST, find the K-th smallest element.
Hint
Inorder traversal of a BST gives elements in sorted order. Count nodes as you traverse; stop when you've visited K nodes.
Problem 3.11.3 — Tree Diameter 🟡 Medium
Given a binary tree (not a BST), find the longest path between any two nodes (the diameter). The path does not need to pass through the root.
Hint
For each node, the longest path through it = leftHeight + rightHeight + 2. Compute this for all nodes and take the maximum. You can do this in a single DFS by returning the height and updating a global `maxDiameter` variable.
Problem 3.11.4 — Flatten BST to Sorted Array (USACO Style) 🟡 Medium
You are given a BST with N nodes. N cows are each assigned a "score" (the node value). Find the median cow score (the ⌈N/2⌉-th smallest value).
Hint
Do an inorder traversal to get a sorted array, then return the element at index (N-1)/2 (0-indexed). Time: O(N).
Problem 3.11.5 — Maximum Path Sum 🔴 Hard
Given a binary tree where nodes can have negative values, find the path (between any two nodes) with the maximum sum. A path can go up and down through the tree.
Hint
Use DFS: for each node, return the maximum "one-sided" path (going down only), and maintain a global maximum considering both branches. Handle negative branches by clamping to 0: `max(0, leftMax) + max(0, rightMax) + node->val`.
End of Chapter 3.11 — Next: Chapter 4.1: Greedy Fundamentals
⚡ Part 4: Greedy Algorithms
Elegant algorithms with no complex recurrences — just one clever observation. Learn when greedy works, how to prove it, and powerful greedy + binary search combos.
📚 2 Chapters · ⏱️ Estimated 1-2 weeks · 🎯 Target: Activity selection, scheduling, binary search + greedy
Part 4: Greedy Algorithms
Estimated time: 1–2 weeks
Greedy algorithms are elegant: no complex recurrences, no state explosions — just one clever observation that makes everything fall into place. The challenge is knowing when greedy works and being able to prove it when it does.
What Topics Are Covered
| Chapter | Topic | The Big Idea |
|---|---|---|
| Chapter 4.1 | Greedy Fundamentals | When greedy works; exchange argument proofs |
| Chapter 4.2 | Greedy in USACO | Real USACO problems solved with greedy |
What You'll Be Able to Solve After This Part
After completing Part 4, you'll be ready to tackle:
- USACO Bronze:
  - Simulation with greedy decisions (process events optimally)
  - Simple sorting-based greedy
- USACO Silver:
  - Activity selection (maximum non-overlapping intervals)
  - Scheduling problems (EDF, minimize lateness)
  - Greedy + binary search on answer
  - Huffman-style merge problems (priority queue)
Key Greedy Patterns
| Pattern | Sort By | Application |
|---|---|---|
| Activity selection | End time ↑ | Max non-overlapping intervals |
| Earliest deadline first | Deadline ↑ | Minimize maximum lateness |
| Interval stabbing | End time ↑ | Min points to cover all intervals |
| Interval covering | Start time ↑ | Min intervals to cover a range |
| Fractional knapsack | Value/weight ↓ | Maximize value with capacity |
| Huffman merge | Use min-heap | Minimum cost encoding |
Prerequisites
Before starting Part 4, make sure you can:
- Sort with custom comparators (Chapter 3.3)
- Use priority_queue (Chapter 3.1)
- Binary search on the answer (Chapter 3.3) — used in Chapter 4.2
The Greedy Mindset
Before coding a greedy solution, ask:
- What's the "obvious best" choice at each step?
- Can I make an exchange argument? If I swap the greedy choice with any other choice, does the solution only get worse (or stay the same)?
- Can I find a counterexample? Try small cases where the greedy might fail.
If you can answer (1) and (2) but not find a counterexample for (3), your greedy is likely correct.
Tips for This Part
- Greedy is the hardest part to "verify." Unlike DP where you just need the right recurrence, greedy requires a correctness argument. Practice sketching exchange argument proofs.
- When greedy fails, DP is usually the fix. The coin change example (Chapter 4.1) shows this perfectly.
- Chapter 4.2 has real USACO problems — work through the code carefully, not just the high-level idea.
- Greedy + binary search (Chapter 4.2) is a powerful combination that appears frequently in Silver. The greedy solves the "check" function, and binary search finds the optimal answer.
💡 Key Insight: Sorting is the engine of most greedy algorithms. The sort criterion embodies the "greedy choice" — choosing the best element first. The exchange argument proves that this criterion is optimal.
🏆 USACO Tip: In USACO Silver, if a problem asks "maximum X subject to constraint Y" or "minimum cost to achieve Z," first try binary search on the answer with a greedy check. This combination solves a surprising fraction of Silver problems.
Chapter 4.1: Greedy Fundamentals
📝 Before You Continue: You should be comfortable with sorting (Chapter 3.3) and basic priority_queue usage (Chapter 3.1). Some problems also use interval reasoning.
A greedy algorithm is like a traveler who always takes the nearest oasis — no map, no planning, just the best move visible right now. For the right problems, this always works out. For others, it leads to disaster.
4.1.1 What Makes a Problem "Greedy-Solvable"?
A greedy approach works when the problem has the greedy choice property: making the locally optimal choice at each step leads to a globally optimal solution.
Contrast with DP
Consider making change for 11 cents:
- Coins: {1, 5, 6, 9}
- Greedy: 9 + 1 + 1 = 3 coins
- Optimal: 6 + 5 = 2 coins
Here greedy fails. The greedy choice (always take the largest coin) doesn't lead to the global optimum.
But with US coins {1, 5, 10, 25, 50}:
- 41 cents: Greedy → 25 + 10 + 5 + 1 = 4 coins ✓ (optimal)
US coins have a special structure that makes greedy work. Always verify!
💡 Key Insight: Greedy works when there's a "no regret" property — once you make the greedy choice, you'll never need to undo it. If you can always swap any non-greedy choice for the greedy one without making things worse, greedy is optimal.
Greedy vs DP decision-path comparison:
flowchart TD
Start["Optimization problem"] --> Q1{"Can you find a counterexample?"}
Q1 -->|"Yes — greedy fails"| DP["Use DP\nconsider all choices"]
Q1 -->|"No — try to prove it"| Q2{"Can an exchange argument show the greedy choice is safe?"}
Q2 -->|"Yes"| Greedy["Use greedy\ntake the local optimum at each step"]
Q2 -->|"Not sure"| Both["Try greedy first\nswitch to DP if WA"]
style Greedy fill:#dcfce7,stroke:#16a34a
style DP fill:#dbeafe,stroke:#3b82f6
style Both fill:#fef9ec,stroke:#d97706
4.1.2 The Exchange Argument
The exchange argument is the standard proof technique for greedy algorithms:
- Assume there's an optimal solution O that makes a different choice than our greedy at some step
- Show that we can "swap" our greedy choice for theirs without making things worse
- By repeated swaps, transform O into the greedy solution — it remains optimal throughout
- Conclude: the greedy solution is optimal
💡 Key Insight: The exchange argument works by showing that greedy choices are "at least as good" as any alternative. You don't need to show greedy is uniquely optimal — just that no swap can improve it.
Visual: Greedy Exchange Argument
The diagram illustrates the exchange argument: if two adjacent elements are "out of order" relative to the greedy criterion, swapping them produces a solution that is at least as good. By repeatedly applying swaps we can transform any solution into the greedy solution without losing value.
Let's see this in action.
4.1.3 Activity Selection Problem
Problem: Given N activities, each with a start time s[i] and end time f[i], select the maximum number of non-overlapping activities.
Visual: Activity Selection Gantt Chart
The Gantt chart shows all activities on a timeline. Selected activities (green) are non-overlapping and maximally many. Rejected activities (gray) are skipped because they overlap with an already-selected one. The greedy rule is: always pick the activity with the earliest end time that doesn't conflict.
Greedy Algorithm:
- Sort activities by end time
- Always select the activity that ends earliest among those compatible with previously selected activities
Activity-selection greedy process:
flowchart LR
subgraph sorted["Sorted by end time"]
direction TB
A1["A(1,3)"]
B1["B(2,5)"]
C1["C(5,7)"]
D1["D(6,8)"]
F1["F(8,11)"]
end
subgraph select["Greedy selection"]
direction TB
S1["lastEnd=-1\nselect A(1,3) ✓\nlastEnd=3"]
S2["B starts at 2 < lastEnd=3\nskip B ✗"]
S3["C starts at 5 ≥ lastEnd=3\nselect C(5,7) ✓\nlastEnd=7"]
S4["D starts at 6 < lastEnd=7\nskip D ✗"]
S5["F starts at 8 ≥ lastEnd=7\nselect F(8,11) ✓\nlastEnd=11"]
S1 --> S2 --> S3 --> S4 --> S5
end
sorted --> select
style S1 fill:#dcfce7,stroke:#16a34a
style S3 fill:#dcfce7,stroke:#16a34a
style S5 fill:#dcfce7,stroke:#16a34a
style S2 fill:#fef2f2,stroke:#dc2626
style S4 fill:#fef2f2,stroke:#dc2626
💡 Why sort by end time? Selecting the earliest-ending activity leaves the most remaining time for later activities. Sorting by start time could pick an activity that starts early but ends very late, consuming a large block of time.
// Solution: Activity Selection — O(N log N)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<pair<int,int>> activities(n); // {end_time, start_time}
for (int i = 0; i < n; i++) {
int s, f;
cin >> s >> f;
activities[i] = {f, s}; // sort by end time
}
sort(activities.begin(), activities.end()); // ← KEY LINE: sort by end time
int count = 0;
int lastEnd = -1; // end time of the last selected activity
for (auto [f, s] : activities) {
if (s >= lastEnd) { // this activity starts after the last one ends
count++;
lastEnd = f; // update last end time
}
}
cout << count << "\n";
return 0;
}
Complete Walkthrough: USACO-Style Activity Selection
Problem: Given activities: [(1,3), (2,5), (3,9), (6,8), (5,7), (8,11), (10,12)] (format: start, end)
Step 1 — Sort by end time:
Activity: A B C D E F G
(s,e): (1,3) (2,5) (5,7) (6,8) (3,9) (8,11) (10,12)
Sorted: A(1,3), B(2,5), C(5,7), D(6,8), E(3,9), F(8,11), G(10,12)
Step 2 — Greedy selection (lastEnd = -1 initially):
Activity A (1,3): start=1 ≥ lastEnd=-1 ✓ SELECT. lastEnd = 3. Count = 1
Activity B (2,5): start=2 ≥ lastEnd=3? NO (2 < 3). SKIP.
Activity C (5,7): start=5 ≥ lastEnd=3 ✓ SELECT. lastEnd = 7. Count = 2
Activity D (6,8): start=6 ≥ lastEnd=7? NO (6 < 7). SKIP.
Activity E (3,9): start=3 ≥ lastEnd=7? NO (3 < 7). SKIP.
Activity F (8,11): start=8 ≥ lastEnd=7 ✓ SELECT. lastEnd = 11. Count = 3
Activity G (10,12):start=10 ≥ lastEnd=11? NO (10 < 11). SKIP.
Result: 3 activities selected — A(1,3), C(5,7), F(8,11)
ASCII Timeline Diagram:
Time: 0 1 2 3 4 5 6 7 8 9 10 11 12
| | | | | | | | | | | | |
A: [===] ✓ SELECTED
B: [======] ✗ overlaps A
C: [======] ✓ SELECTED
D: [======] ✗ overlaps C
E: [============] ✗ overlaps A and C
F: [======] ✓ SELECTED
G: [======] ✗ overlaps F
Selected: A === C === F ===
1-3 5-7 8-11
Formal Exchange Argument Proof (Activity Selection)
Claim: Sorting by end time and greedily selecting is optimal.
Proof:
Let G = greedy solution, O = some other optimal solution. Both select k activities.
Step 1 — Show first selections can be made equivalent: Let a₁ be the first activity selected by G (earliest-ending activity overall). Let b₁ be the first activity selected by O.
Since G sorts by end time, end(a₁) ≤ end(b₁).
Now "swap" b₁ for a₁ in O: replace b₁ with a₁. Does O remain feasible?
- a₁ ends no later than b₁, so a₁ conflicts with at most as many activities as b₁ did
- All activities in O that came after b₁ and didn't conflict with b₁ also don't conflict with a₁ (since a₁ ends ≤ b₁ ends)
- So O' (with a₁ replacing b₁) is still a valid selection of k activities ✓
Step 2 — Induction: After the first selection, G picks the earliest-ending activity compatible with a₁, and O' has a₁ as its first activity. Apply the same argument to the remaining activities.
Conclusion: By induction, any optimal solution O can be transformed into G (the greedy solution) without losing optimality. Therefore G is optimal. ∎
💡 Key Insight from the proof: The greedy choice (earliest end time) is "safe" because it leaves the most remaining time for future activities. Choosing any later-ending first activity can only hurt future flexibility.
4.1.4 Interval Scheduling Maximization vs. Minimization
Visual: Interval Scheduling on a Number Line
The number line diagram shows multiple intervals and the greedy selection process. By sorting by end time and always taking the next non-overlapping interval, we achieve the maximum number of selected intervals. Green intervals are selected; gray ones are rejected due to overlap.
Maximization: Maximum Non-Overlapping Intervals
→ Sort by end time, greedy select as above.
Minimization: Minimum "Points" to Stab All Intervals
Problem: Given N intervals, find the minimum number of points such that each interval contains at least one point.
Greedy: Sort by end time. For each interval whose left endpoint is to the right of the last placed point, place a new point at its right endpoint.
sort(intervals.begin(), intervals.end()); // intervals stored as {end, start}, so this sorts by end time
int points = 0;
int lastPoint = INT_MIN;
for (auto [end, start] : intervals) {
if (start > lastPoint) { // this interval not yet covered
lastPoint = end; // place point at its end (covers as many future intervals as possible)
points++;
}
}
cout << points << "\n";
Minimization: Minimum Intervals to Cover a Range
Problem: Cover the range [0, T] with minimum intervals from a given set.
Greedy: Sort by start time. At each step, among all intervals starting at or before the current position, pick the one that extends furthest to the right.
sort(intervals.begin(), intervals.end()); // intervals stored as {start, end}, so this sorts by start time
int covered = 0; // currently covered up to 'covered'
int count = 0;
int i = 0;
int n = intervals.size();
while (covered < T) {
int farthest = covered;
// Among all intervals that start at or before 'covered', pick the farthest-reaching
while (i < n && intervals[i].first <= covered) {
farthest = max(farthest, intervals[i].second);
i++;
}
if (farthest == covered) {
cout << "Impossible\n";
return 0;
}
covered = farthest;
count++;
}
cout << count << "\n";
4.1.5 The Scheduling Problem: Minimize Lateness
Problem: N jobs with deadlines d[i] and processing times t[i]. Schedule all jobs on one machine to minimize maximum lateness (how much the latest job exceeds its deadline).
Lateness of job i = max(0, finish_time[i] - d[i]).
Greedy: Sort jobs by deadline (Earliest Deadline First — EDF).
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
vector<pair<int,int>> jobs(n); // {deadline, processing_time}
for (int i = 0; i < n; i++) cin >> jobs[i].second >> jobs[i].first; // input order: t[i] then d[i]
sort(jobs.begin(), jobs.end()); // sort by deadline
int time = 0;
int maxLateness = 0;
for (auto [deadline, proc] : jobs) {
time += proc; // finish time of this job
int lateness = max(0, time - deadline); // how late is it?
maxLateness = max(maxLateness, lateness);
}
cout << maxLateness << "\n";
return 0;
}
Proof sketch: Suppose job A has an earlier deadline than B but is scheduled immediately after B. Swap them. A's lateness only decreases (it finishes earlier). B now finishes exactly when A used to finish, and since d[B] ≥ d[A], B's new lateness is at most A's old lateness. So the maximum lateness never increases, and repeated swaps show EDF is optimal.
4.1.6 Huffman Coding (Greedy Tree Building)
Problem: Given N symbols with frequencies, build a binary tree minimizing total encoding length (frequency × depth summed over all symbols).
Greedy: Always merge the two symbols/groups with smallest frequency.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
priority_queue<long long, vector<long long>, greater<long long>> pq; // min-heap
for (int i = 0; i < n; i++) {
long long f; cin >> f;
pq.push(f);
}
long long totalCost = 0;
while (pq.size() > 1) {
long long a = pq.top(); pq.pop();
long long b = pq.top(); pq.pop();
totalCost += a + b; // cost of merging a and b
pq.push(a + b); // merged group has frequency a+b
}
cout << totalCost << "\n";
return 0;
}
⚠️ Common Mistakes in Chapter 4.1
- Applying greedy to DP problems: Just because greedy is simpler doesn't mean it's correct. Always test your greedy on small counterexamples. Coin change with arbitrary denominations is a classic trap.
- Wrong sort criterion: Sorting by start time instead of end time for activity selection is a classic bug. The justification for WHY we sort a certain way (the exchange argument) is what tells you the correct criterion.
- Off-by-one in overlap check:
s >= lastEnd(allows adjacent activities) vs.s > lastEnd(requires a gap). Check which interpretation the problem intends. - Assuming greedy works without proof: Always verify with a small example or brief exchange argument. If you can't find a counterexample AND you can sketch why the greedy choice is "safe," it's likely correct.
- Forgetting to sort: Greedy algorithms almost always begin with a sort. Forgetting to sort means the greedy "order" doesn't exist.
Chapter Summary
📌 Key Takeaways
| Problem | Greedy Strategy | Sort By | Time |
|---|---|---|---|
| Max non-overlapping intervals | Pick earliest-ending | End time ↑ | O(N log N) |
| Min points to stab intervals | Place point at end of each uncovered interval | End time ↑ | O(N log N) |
| Min intervals to cover range | Pick farthest-reaching at each step | Start time ↑ | O(N log N) |
| Minimize max lateness | Earliest Deadline First (EDF) | Deadline ↑ | O(N log N) |
| Huffman coding | Merge two smallest frequencies | Min-heap | O(N log N) |
❓ FAQ
Q1: How do I tell if a problem can be solved greedily?
A: Three signals: ① After sorting, there's a clear processing order; ② You can use an exchange argument to show the greedy choice is never worse than any alternative; ③ You can't find a counterexample. If you find one (e.g., coin change with {1,5,6,9}), greedy fails — use DP instead.
Q2: What's the real difference between greedy and DP?
A: Greedy makes the locally optimal choice at each step and never looks back. DP considers all possible choices and builds the global optimum from subproblem solutions. Greedy is a special case of DP — it works when the local optimum happens to equal the global optimum.
Q3: What is the "binary search on answer + greedy check" pattern?
A: When a problem asks to "minimize the maximum" or "maximize the minimum," binary search on the answer X and use a greedy check(X) to verify feasibility. See the Convention problem in Chapter 4.2.
Q4: Why sort Activity Selection by end time instead of start time?
A: Sorting by end time ensures we always pick the activity that "frees up resources" earliest, leaving the most room for future activities. Sorting by start time might select an activity that starts early but ends very late, blocking all subsequent ones.
🔗 Connections to Other Chapters
- Chapters 6.1–6.3 (DP) are the "upgrade" of greedy — when greedy fails, DP considers all choices
- Chapter 3.3 (Sorting & Binary Search) is the prerequisite — almost every greedy algorithm starts with a sort
- Chapter 4.2 applies greedy to real USACO problems, showcasing the classic "binary search on answer + greedy check" pattern
- Chapter 5.3 (Kruskal's MST) is fundamentally greedy — sort edges and greedily pick the minimum, one of the most classic greedy algorithms
Practice Problems
Problem 4.1.1 — Meeting Rooms II 🟡 Medium N meetings with start/end times. Find the minimum number of rooms needed so that all meetings can be held (meetings that overlap in time need different rooms).
Solution sketch: Sort by start time. Use a min-heap of end times (when each room becomes free). For each meeting, if its start ≥ earliest-free room, reuse that room. Otherwise, add a new room.
Hint
The minimum rooms needed = the maximum number of meetings happening simultaneously. Use a priority queue (min-heap) to track when rooms become available.
Problem 4.1.2 — Gas Station 🔴 Hard N gas stations in a circle. Station i has gas[i] liters and requires cost[i] to reach the next. Can you complete the circuit? If yes, find the starting station.
Solution sketch: If total gas ≥ total cost, a solution exists. Greedy: try each starting station. If tank drops negative, reset starting station to the next one.
Hint
Key insight: if the total gas ≥ total cost, there is always at least one valid starting station. Track the cumulative gas balance; when it goes negative, the starting station must be after the current failed position.
Problem 4.1.3 — Minimum Platforms 🟡 Medium Given arrival and departure times for N trains, find the minimum number of platforms needed so no train waits.
Hint
Create events: +1 for each arrival, -1 for each departure. Sort by time. Sweep and track the running count; the maximum is the answer.
Problem 4.1.4 — Fractional Knapsack 🟢 Easy You can take fractions of items. Weight w[i], value v[i], capacity W. Maximize value.
Solution sketch: Sort by value/weight ratio (highest first). Take as much as possible of each item until knapsack is full.
Hint
Greedy works here (unlike 0/1 knapsack) because you can take fractions. Always take from the highest value/weight ratio item first.
Problem 4.1.5 — Jump Game 🟡 Medium Array A of non-negative integers. From position i, you can jump up to A[i] steps forward. Can you reach the last position from position 0?
Solution sketch: Track farthest = furthest position reachable so far. At each position i ≤ farthest, update farthest = max(farthest, i + A[i]). If we reach farthest ≥ n-1, return true.
Hint
If you can reach position i, you can reach all positions ≤ i + A[i]. Greedily maintain the farthest reachable position.
🏆 Challenge Problem: USACO 2016 February Silver: Fencing the Cows (Activity Selection Variant) Farmer John has N fence segments on the x-axis, each defined by [L_i, R_i]. He wants to select a minimum set of "anchor points" such that every fence segment contains at least one anchor point. (This is the interval stabbing problem — greedy with end-time sorting.)
Chapter 4.2: Greedy in USACO
USACO problems that yield to greedy solutions are some of the most satisfying to solve — once you see the insight, the code practically writes itself. This chapter walks through several USACO-style problems where greedy is the key.
4.2.1 Pattern Recognition: Is It Greedy?
Before coding, ask yourself:
- Can I sort the input in some clever way?
- Is there a "natural" order to process elements that always leads to the best result?
- Can I argue that taking the "obvious best" at each step never hurts?
If yes to any of these, try greedy. If your greedy fails a test case, reconsider — maybe it's actually a DP problem.
4.2.2 USACO Bronze: Cow Sorting
Problem: N cows in a line. Each cow has a "grumpiness" value g[i]. To sort them in increasing order, you can swap two adjacent cows, but you pay g[i] + g[j] for swapping cows i and j. Minimize total cost.
Key Insight: With adjacent swaps, each inversion (pair (i, j) where i < j but g[i] > g[j]) requires exactly one swap. The total cost is the sum of (g[i] + g[j]) over all inversions. There is no freedom to reduce this — every inversion pair must be swapped exactly once, and any ordering of swaps gives the same total cost.
⚠️ Common Misconception: The formula sumG + (n-2) × minG is NOT the correct answer for general Cow Sorting. That expression only coincidentally equals the answer in edge cases (e.g., n=2). The correct cost is always the sum over all inversions.
Counting inversions in O(N²):
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<long long> g(n);
for (long long &x : g) cin >> x;
// Total cost = sum of (g[i] + g[j]) for every inversion pair i < j where g[i] > g[j]
// Equivalently: for each element g[i], add g[i] * (# elements it must "cross"):
// (# elements to its left that are > g[i]) + (# elements to its right that are < g[i])
// Both counts together = total inversions involving g[i].
long long totalCost = 0;
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
if (g[i] > g[j]) {
totalCost += g[i] + g[j]; // this inversion costs g[i]+g[j]
}
}
}
cout << totalCost << "\n";
return 0;
}
// Time: O(N²) — for N ≤ 10^5 use merge-sort inversion count (O(N log N))
Example:
Input: g = [3, 1, 2]
Inversions: (3,1) → cost 4; (3,2) → cost 5
Total: 9
Verification: Bubble sort on [3,1,2]:
- Swap(3,1) = cost 4 → [1,3,2]
- Swap(3,2) = cost 5 → [1,2,3]
- Total = 9 ✓
4.2.3 USACO Bronze: The Cow Signal (Greedy Simulation)
Many USACO Bronze problems are pure simulation with a greedy twist: process events in time order and maintain the optimal state.
Problem: N cows each leave the barn at time t[i] and must reach the pasture. The barn-pasture road has capacity C (at most C cows may cross at once), and each crossing takes 1 time unit; a cow must wait if the road is full. What is the time when the last cow arrives?
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, c;
cin >> n >> c;
vector<int> t(n);
for (int &x : t) cin >> x;
sort(t.begin(), t.end()); // process cows in order of departure time
int ans = 0;
// Process in groups of c
for (int i = 0; i < n; i += c) {
// Group starts at t[i] (the earliest cow in this batch)
// But batch can't start before previous batch finished
ans = max(ans, t[i]); // this batch must start at least when earliest cow is ready
ans++; // takes 1 time unit
}
cout << ans << "\n";
return 0;
}
4.2.4 USACO Silver: Paired Up
Problem: 2N cows in two groups of N (group A and group B). Each cow in A must be paired with one cow in B. Pairing a cow with value a with a cow with value b gives profit f(a, b). Maximize total profit.
For specific profit functions, greedy sorting works. The classic version: profit = min(a, b), maximize sum.
Greedy: Sort both groups. Pair the largest A with the largest B, etc.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
vector<int> A(n), B(n);
for (int &x : A) cin >> x;
for (int &x : B) cin >> x;
sort(A.begin(), A.end());
sort(B.begin(), B.end());
long long total = 0;
for (int i = 0; i < n; i++) {
total += min(A[i], B[i]); // pair i-th smallest with i-th smallest
}
cout << total << "\n";
return 0;
}
This works because if you pair (a_large, b_small) and (a_small, b_large) instead of (a_large, b_large) and (a_small, b_small), you get min(a_large, b_small) + min(a_small, b_large) ≤ min(a_large, b_large) + min(a_small, b_small): un-crossing any crossed pair never decreases the total. Always match in sorted order.
4.2.5 USACO Silver: Convention
Problem (USACO 2018 February Silver): N cows arrive at times t[1..N] at a bus stop. There are M buses, each holding up to C cows, and a bus may depart whenever Farmer John chooses, carrying cows that have already arrived. Assign cows to buses to minimize the maximum waiting time of any cow.
Approach: Binary search on the answer + greedy check.
This is a "binary search on the answer with greedy verification" problem:
#include <bits/stdc++.h>
using namespace std;
int n, m, c;
vector<long long> cows; // sorted arrival times
// Can we schedule all cows with max wait <= maxWait?
bool canDo(long long maxWait) {
int busesUsed = 0;
int i = 0; // current cow index
while (i < n) {
busesUsed++;
if (busesUsed > m) return false; // ran out of buses
// This bus serves cows starting from cow i
// The bus must depart by cows[i] + maxWait
long long depart = cows[i] + maxWait;
// Fill bus with as many cows as possible (capacity c, all with arrival <= depart)
int count = 0;
while (i < n && count < c && cows[i] <= depart) {
i++;
count++;
}
}
return true;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
cin >> n >> m >> c;
cows.resize(n);
for (long long &x : cows) cin >> x;
sort(cows.begin(), cows.end());
// Binary search on the maximum wait time
long long lo = 0, hi = 1e14;
while (lo < hi) {
long long mid = lo + (hi - lo) / 2;
if (canDo(mid)) hi = mid;
else lo = mid + 1;
}
cout << lo << "\n";
return 0;
}
4.2.6 USACO Bronze: Herding (Greedy Observation)
Problem: 3 cows at positions a, b, c on a number line. In one move, you can move any cow to any empty position. Find the minimum moves to get all 3 cows into consecutive positions.
Insight: 2 moves are always sufficient (you can move the outer two to surround the middle). Can 1 move work? Can 0 work? Check these cases.
#include <bits/stdc++.h>
using namespace std;
int main() {
long long a, b, c;
cin >> a >> b >> c;
// Make sure a <= b <= c
long long pos[3] = {a, b, c};
sort(pos, pos + 3);
a = pos[0]; b = pos[1]; c = pos[2];
// 0 moves: already consecutive
if (c - a == 2) { cout << 0; return 0; }
// 1 move: exactly one cow moves; the other two stay put.
// Pair (b, c) stays, a moves: works iff c - b <= 2
//   (c - b == 1: a lands beside them; c - b == 2: a fills the gap at b+1)
// Pair (a, b) stays, c moves: symmetric, works iff b - a <= 2
// Pair (a, c) stays, b moves: needs c - a == 2, which is the
//   0-move case already handled above
bool one_move = (c - b <= 2) || (b - a <= 2);
if (one_move) { cout << 1; return 0; }
// Otherwise both gaps exceed 2, and 2 moves always suffice:
// e.g., move a to c-2, then b to c-1
cout << 2;
return 0;
}
4.2.7 Common Greedy Patterns in USACO
| Pattern | Description | Sort By |
|---|---|---|
| Activity selection | Max non-overlapping intervals | End time |
| Scheduling | Minimize completion time / lateness | Deadline or ratio |
| Greedy + binary search | Check feasibility, find optimal via BS | Various |
| Pairing | Optimal matching of two sorted lists | Both arrays |
| Simulation | Process events in time order | Event time |
| Sweep line | Maintain active set as you move across time | Start/end events |
Chapter Summary
📌 Key Takeaways
Greedy algorithms in USACO often involve:
- Sorting the input in a clever order
- Scanning once (or twice) with a simple update rule
- Occasionally combining with binary search on the answer
| USACO Greedy Pattern | Description | Sort By |
|---|---|---|
| Activity selection | Max non-overlapping intervals | End time |
| Scheduling | Minimize completion time / lateness | Deadline or ratio |
| Greedy + binary search | Check feasibility, find optimal via BS | Various |
| Pairing | Optimal matching of two sorted lists | Both arrays |
| Simulation | Process events in time order | Event time |
| Sweep line | Maintain active set as you scan | Start/end events |
❓ FAQ
Q1: What is the template for "binary search on answer + greedy check"?
A: Outer layer: binary search on answer X (lo = min possible, hi = max possible). Inner layer: write a check(X) function that uses a greedy strategy to verify whether X is feasible. Adjust lo/hi based on the result. The key requirement is that check must be monotone (if X is feasible, so is X+1, or vice versa).
Q2: How are USACO greedy problems different from LeetCode greedy problems?
A: USACO greedy problems typically require proving correctness (exchange argument) and are often combined with binary search and sorting. LeetCode tends to focus on simpler "always pick max/min" greedy. USACO Silver greedy problems are noticeably harder than LeetCode Medium.
Q3: When should I use priority_queue to assist greedy?
A: When you repeatedly need to extract the "current best" element (e.g., Huffman coding, minimum meeting rooms, repeatedly picking max/min values). priority_queue reduces "find the best" from O(N) to O(log N).
🔗 Connections to Other Chapters
- Chapter 4.1 covered the theory of greedy and exchange arguments; this chapter applies them to real USACO problems
- Chapter 3.3 (Binary Search) introduced the "binary search on answer" pattern used directly in the Convention problem here
- Chapter 7.1 (Understanding USACO) and Chapter 7.2 (Problem-Solving Strategies) will further discuss how to recognize greedy vs DP in contests
- Chapter 3.1 (STL) introduced priority_queue, which appears frequently in greedy simulations in this chapter
Practice Problems
Problem 4.2.1 — USACO 2016 December Bronze: Counting Haybales N haybales at positions on a number line. Q queries: how many haybales are in [L, R]? (Prefix sums, but practice the sorting mindset)
Problem 4.2.2 — USACO 2019 February Bronze: Sleepy Cow Sorting N cows labeled 1 to N (not in order). Move cows to sort them. Each move takes one cow from the end and inserts it somewhere. Minimum moves? (Greedy: find the longest already-sorted suffix)
Problem 4.2.3 — Task Scheduler N tasks labeled A–Z. Must wait k steps between two instances of the same task. Minimum time to complete all tasks? (Greedy: always schedule the most frequent remaining task)
Problem 4.2.4 — USACO 2018 February Silver: Convention II Cows arrive at a watering hole with arrival times and drink durations. The most senior waiting cow goes next. Simulate and find the maximum wait time. (Greedy simulation with priority queue)
Problem 4.2.5 — Weighted Job Scheduling N jobs with start, end, and profit. Select non-overlapping jobs to maximize total profit. (This one requires DP, NOT greedy — a good lesson in when greedy fails!)
🕸️ Part 5: Graph Algorithms
Learn to see graphs in problems and solve them efficiently. BFS, DFS, trees, Union-Find, and Kruskal's MST — the core of USACO Silver.
📚 4 Chapters · ⏱️ Estimated 2-3 weeks · 🎯 Target: Reach USACO Silver level
Part 5: Graph Algorithms
Estimated time: 2–3 weeks
Graphs are everywhere in competitive programming: mazes, networks, family trees, city maps. Part 5 teaches you to see graphs in problems and solve them efficiently.
What Topics Are Covered
| Chapter | Topic | The Big Idea |
|---|---|---|
| Chapter 5.1 | Introduction to Graphs | Representing graphs; adjacency lists; types of graphs |
| Chapter 5.2 | BFS & DFS | Traversal, shortest paths, flood fill, connected components |
| Chapter 5.3 | Trees & Special Graphs | Tree traversals; Union-Find; Kruskal's MST |
| Chapter 5.4 | Shortest Paths | Dijkstra, Bellman-Ford, Floyd-Warshall, SPFA |
What You'll Be Able to Solve After This Part
After completing Part 5, you'll be ready to tackle:
- USACO Bronze:
  - Flood fill (count connected regions in a grid)
  - Reachability problems (can cow A reach cow B?)
  - Simple BFS shortest paths in grids/graphs
- USACO Silver:
  - BFS/DFS on implicit graphs (states rather than explicit nodes)
  - Multi-source BFS (distance to nearest obstacle/fire)
  - Union-Find for dynamic connectivity
  - Graph connectivity under edge additions
  - Tree problems (subtree sums, depths, LCA)
Key Algorithms Introduced
| Technique | Chapter | Time Complexity | USACO Relevance |
|---|---|---|---|
| DFS (recursive & iterative) | 5.2 | O(V + E) | Connectivity, cycle detection |
| BFS | 5.2 | O(V + E) | Shortest path (unweighted) |
| Grid BFS | 5.2 | O(R × C) | Maze problems, flood fill |
| Multi-source BFS | 5.2 | O(V + E) | Distance to nearest source |
| Connected components | 5.2 | O(V + E) | Counting disconnected regions |
| Tree traversals (pre/post-order) | 5.3 | O(N) | Subtree aggregation |
| Union-Find (DSU) | 5.3 | O(α(N)) ≈ O(1) | Dynamic connectivity |
| Kruskal's MST | 5.3 | O(E log E) | Minimum spanning tree |
| Dijkstra's algorithm | 5.4 | O((V + E) log V) | SSSP on non-negative weighted graphs |
| Bellman-Ford | 5.4 | O(V × E) | SSSP with negative edges; detect negative cycles |
| Floyd-Warshall | 5.4 | O(V³) | All-pairs shortest paths on small graphs |
| SPFA | 5.4 | O(V × E) worst | Practical Bellman-Ford with queue optimization |
Prerequisites
Before starting Part 5, make sure you can:
- Use vector<vector<int>> for adjacency lists (Chapters 2.3–3.1)
- Use queue and stack from STL (Chapters 3.1, 3.5)
- Work with 2D arrays and grid traversal (Chapter 2.3)
- Understand basic nested loops (Chapter 2.2)
- Use priority_queue (Chapter 3.1) — needed for Chapter 5.4 (Dijkstra)
Tips for This Part
- Chapter 5.1 is mostly setup — read it to understand graph representation, but the real algorithms start in Chapter 5.2.
- Chapter 5.2 (BFS) is one of the most important chapters for USACO Silver. Grid BFS appears in roughly 1/3 of Silver problems.
- The dist[v] == -1 pattern for unvisited nodes in BFS is the key. Never mark visited when you pop — always when you push.
- Chapter 5.3's Union-Find is faster to code than BFS for connectivity questions. Memorize the 15-line template — you'll use it constantly.
- Chapter 5.4 (Dijkstra) is essential for weighted shortest path problems. Use priority_queue<pair<int,int>> with the standard template — it's the most common Silver/Gold graph algorithm.
💡 Key Insight: Most USACO graph problems are actually grid problems in disguise. A grid cell (r,c) becomes a graph node; adjacent cells become edges. BFS on this implicit graph finds shortest paths.
🏆 USACO Tip: Whenever you see "shortest path," "minimum steps," or "fewest moves" in a problem, think BFS immediately. Whenever you see "are these connected?" or "how many groups?", think DSU.
Chapter 5.1: Introduction to Graphs
A graph is one of the most versatile mathematical structures ever invented. It models relationships between things — roads between cities, friendships between people, connections between web pages. In USACO, graphs represent mazes, networks, and relationships between cows.
5.1.1 What Is a Graph?
A graph consists of:
- Vertices (also called nodes): the "things" (cities, cows, cells)
- Edges: the connections between them (roads, friendships)
This graph has 6 vertices (1–6) and 6 edges.
Visual: Graph Basics Reference
This reference diagram shows the key graph terminology — vertices, edges, directed vs undirected, weighted edges, and common graph properties — all in one view.
Types of Graphs
| Type | Description | Example |
|---|---|---|
| Undirected | Edges have no direction; if A-B, then B-A | Friendships |
| Directed | Edges go one way; A→B doesn't mean B→A | Twitter follows |
| Weighted | Edges have costs/distances | Road distances |
| Unweighted | All edges equal | Maze connections |
| Tree | Connected, no cycles, N-1 edges for N nodes | File system |
| DAG | Directed Acyclic Graph | Dependencies |
Most USACO Bronze/Silver problems use unweighted, undirected graphs or simple grids.
5.1.2 Graph Representation
The most important decision when coding a graph algorithm is how to store the graph.
Visual: Graph Structure and Adjacency List
The left side shows an undirected graph with 5 nodes and their edges. The right side shows the adjacency list — for each node, a list of its neighbors. This representation uses O(V + E) space, which is optimal for sparse graphs typical in USACO problems.
Adjacency List (USE THIS)
Store each vertex's neighbors as a list. This is the standard in competitive programming.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m; // n vertices, m edges
cin >> n >> m;
// adj[u] = list of vertices connected to u
vector<vector<int>> adj(n + 1); // 1-indexed: vertices 1..n
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v); // edge u → v
adj[v].push_back(u); // edge v → u (undirected: add both directions)
}
// Print adjacency list
for (int u = 1; u <= n; u++) {
cout << u << " -> ";
for (int v : adj[u]) cout << v << " ";
cout << "\n";
}
return 0;
}
Input:
6 6
1 2
1 3
2 4
3 5
4 6
5 6
Output:
1 -> 2 3
2 -> 1 4
3 -> 1 5
4 -> 2 6
5 -> 3 6
6 -> 4 5
Space complexity: O(V + E). For V = 10^5 and E = 2×10^5, this is fine.
Adjacency Matrix (When to Use)
A 2D array where adj[u][v] = 1 if there's an edge from u to v.
bool adj[1001][1001] = {}; // global, zero-initialized
// Add edge u-v
adj[u][v] = true;
adj[v][u] = true; // undirected
// Check if edge exists: O(1)
if (adj[u][v]) { ... }
Space complexity: O(V²). For V = 10^5, that's 10^10 bytes — way too much! Only use for V ≤ 1000.
When to Use Which
| Condition | Use |
|---|---|
V ≤ 1000 and need O(1) edge lookup | Adjacency matrix |
| V up to 10^5 (or larger) | Adjacency list |
| Almost all pairs are connected (dense graph) | Adjacency matrix |
| Few edges compared to pairs (sparse graph) | Adjacency list |
Default in competitive programming: Always use adjacency list unless V is very small.
5.1.3 Reading Graph Input
USACO graphs come in several formats. Here are the patterns:
Standard: Edge List
5 4 ← n vertices, m edges
1 2 ← edge between 1 and 2
2 3
3 4
4 5
int n, m;
cin >> n >> m;
vector<vector<int>> adj(n + 1);
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
Tree: Parent Array
5 ← n nodes
1 1 2 2 ← parent[2]=1, parent[3]=1, parent[4]=2, parent[5]=2 (node 1 is root)
int n;
cin >> n;
vector<vector<int>> children(n + 1);
for (int i = 2; i <= n; i++) {
int parent;
cin >> parent;
children[parent].push_back(i); // parent → child edge
}
Grid Graph
A grid where cells are nodes; edges connect adjacent cells (up/down/left/right):
4 4 ← rows × columns
....
.##.
....
....
int R, C;
cin >> R >> C;
vector<string> grid(R);
for (int r = 0; r < R; r++) cin >> grid[r];
// To iterate over neighbors of cell (r, c):
int dr[] = {-1, 1, 0, 0}; // row offsets for up/down/left/right
int dc[] = {0, 0, -1, 1}; // col offsets
for (int d = 0; d < 4; d++) {
int nr = r + dr[d]; // neighbor row
int nc = c + dc[d]; // neighbor col
if (nr >= 0 && nr < R && nc >= 0 && nc < C) {
// (nr, nc) is a valid neighbor
}
}
5.1.4 Trees vs. Graphs
A tree is a special type of graph with these properties:
- N nodes and exactly N-1 edges
- Connected (every node reachable from every other)
- No cycles (acyclic)
Visual: Rooted Tree Structure
A rooted tree has a designated root node at depth 0. Each node has a parent (except the root) and zero or more children. Leaf nodes have no children. This structure naturally represents hierarchies and enables efficient tree DP algorithms.
Trees appear constantly in USACO — they represent hierarchies, family trees, and many other structures.
1 ← root
/ \
2 3
/ \ \
4 5 6
Key tree vocabulary:
- Root: The topmost node (usually node 1)
- Parent: The node directly above in the hierarchy
- Children: Nodes directly below
- Leaf: A node with no children
- Depth: Distance from the root (root has depth 0)
- Height: Length of the longest path from a node to a leaf
Representing a Rooted Tree
vector<vector<int>> children(n + 1); // children[u] = list of u's children
int parent[n + 1];
// Read tree as undirected graph, then root it with DFS
vector<vector<int>> adj(n + 1);
for (int i = 0; i < n - 1; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
// Root at node 1 using DFS
fill(parent + 1, parent + n + 1, 0);
function<void(int, int)> root_tree = [&](int u, int par) {
parent[u] = par;
for (int v : adj[u]) {
if (v != par) {
children[u].push_back(v);
root_tree(v, u); // recursive DFS
}
}
};
root_tree(1, 0);
5.1.5 Weighted Graphs
For weighted graphs (edges with costs), store the weight alongside each neighbor:
vector<vector<pair<int,int>>> adj(n + 1);
// adj[u] = list of {v, weight} pairs
// Add weighted edge u-v with weight w
adj[u].push_back({v, w});
adj[v].push_back({u, w});
// Iterate neighbors with weights
for (auto [v, w] : adj[u]) {
cout << "Edge " << u << "-" << v << " weight " << w << "\n";
}
Chapter Summary
📌 Key Takeaways
| Concept | Key Points | Why It Matters |
|---|---|---|
| Graph | Vertices + edges; model "relationships" | Almost all USACO Silver+ problems involve graphs |
| Undirected | Add v to adj[u] and u to adj[v] | Forgetting both directions is the most common bug |
| Directed | Add only v to adj[u] | Twitter follows, dependency relations, etc. |
| Adjacency list | vector<vector<int>> adj(n+1) | Default choice, O(V+E) space |
| Adjacency matrix | bool adj[1001][1001] | Only use when V ≤ 1000 |
| Grid graph | 4-direction neighbors + boundary check | Most common graph input in USACO |
| Tree | Connected acyclic, N-1 edges | Special graph, supports efficient algorithms |
❓ FAQ
Q1: Why use vector<vector<int>> for adjacency list instead of linked lists?
A: C++ vector uses contiguous memory, is cache-friendly, and is much faster than linked lists. In contests, list is almost never used; vector<vector<int>> is the standard approach.
Q2: Should graph vertices be 0-indexed or 1-indexed?
A: USACO problems are usually 1-indexed. Declare the adjacency list with size n+1: vector<vector<int>> adj(n+1). This wastes index 0 but makes code clearer and less error-prone.
Q3: What is the only difference between directed and undirected graphs?
A: When reading edges, undirected graphs add two (u→v and v→u), directed graphs add only one (u→v). The subsequent BFS/DFS code is identical.
Q4: Does a grid graph need an explicit adjacency list?
A: No! Grid graph "neighbors" can be computed implicitly via the direction arrays dr[]/dc[]; there's no need to store an adjacency list, which saves memory and is cleaner.
🔗 Connections to Later Chapters
- Chapter 5.2 (BFS & DFS) runs on the adjacency list built in this chapter—this chapter is a prerequisite for Chapter 5.2
- Chapter 5.3 (Trees & DSU) uses this chapter's tree representation and adds Union-Find
- Graph traversal from Chapters 5.1–5.2 is the foundation for "Tree DP" and "DP on DAG" in Chapters 6.1–6.3 (DP)
- Grid graph representation is used throughout the book—BFS shortest path, Flood Fill, grid DP, etc.
Practice Problems
Problem 5.1.1 — Degree Count Read an undirected graph with N vertices and M edges. Print the degree (number of edges) of each vertex.
Problem 5.1.2 — Is It a Tree? Read a connected graph. Determine if it's a tree (exactly N-1 edges and no cycles).
Problem 5.1.3 — Reachability Read a directed graph and two vertices S and T. Print "YES" if T is reachable from S following directed edges, "NO" otherwise. (You'll need DFS from Chapter 5.2 to fully solve this, but you can set it up now)
Problem 5.1.4 — Leaf Count Read a rooted tree. Count how many nodes are leaves (have no children).
Problem 5.1.5 — Grid to Graph Read an N×M grid. Cells with '.' are passable; '#' are walls. Print the number of edges in the implicit graph (connect adjacent '.' cells).
Visual: Graph Adjacency List
The left side shows a 5-node weighted graph visually. The right side shows the corresponding adjacency list in C++: vector<pair<int,int>> adj[] where each entry is a {neighbor, weight} pair. This is the standard representation for most USACO graph problems.
Chapter 5.2: BFS & DFS
📝 Before You Continue: Make sure you understand graph representation (Chapter 5.1), queues and stacks (Chapter 3.6), and basic 2D array traversal (Chapter 2.3).
Graph traversal algorithms explore every node reachable from a starting point. They're the foundation of dozens of graph algorithms. DFS (Depth-First Search) dives deep before backtracking. BFS (Breadth-First Search) explores layer by layer. Knowing which to use and when is a skill you'll develop throughout your competitive programming career.
5.2.1 Depth-First Search (DFS)
DFS works like exploring a maze: you keep going forward until you hit a dead end, then backtrack and try another path.
Visual: DFS Traversal Order
DFS dives as deep as possible before backtracking. The numbered circles show the visit order, red dashed arrows show backtracking. The call stack on the right illustrates how recursion naturally implements the LIFO behaviour needed for DFS.
Recursive DFS
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<int> adj[MAXN];
bool visited[MAXN];
void dfs(int u) {
visited[u] = true; // mark current node as visited
cout << u << " "; // process u (print it, in this example)
for (int v : adj[u]) { // for each neighbor v
if (!visited[v]) { // if not yet visited
dfs(v); // recursively explore v
}
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
// DFS from node 1
dfs(1);
cout << "\n";
return 0;
}
Important: Always mark nodes as visited before recursing, not after! This prevents infinite loops on cycles.
Iterative DFS (Using a Stack)
For very large graphs, recursive DFS can cause a stack overflow (too deep recursion). The iterative version uses an explicit stack:
void dfs_iterative(int start, int n) {
vector<bool> visited(n + 1, false);
stack<int> st;
st.push(start);
while (!st.empty()) {
int u = st.top();
st.pop();
if (visited[u]) continue; // may have been pushed multiple times
visited[u] = true;
cout << u << " ";
for (int v : adj[u]) {
if (!visited[v]) {
st.push(v);
}
}
}
}
5.2.2 Connected Components
A connected component is a maximal set of vertices where every vertex can reach every other vertex. Finding components is a very common USACO task.
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<int> adj[MAXN];
int comp[MAXN]; // comp[v] = component ID of vertex v
void dfs(int u, int id) {
comp[u] = id;
for (int v : adj[u]) {
if (comp[v] == 0) { // 0 means unvisited (use 0 as sentinel, 1-index components from 1)
dfs(v, id);
}
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
int numComponents = 0;
for (int u = 1; u <= n; u++) {
if (comp[u] == 0) {
numComponents++;
dfs(u, numComponents); // assign component ID
}
}
cout << "Number of components: " << numComponents << "\n";
// Print component sizes
vector<int> size(numComponents + 1, 0);
for (int u = 1; u <= n; u++) size[comp[u]]++;
for (int i = 1; i <= numComponents; i++) {
cout << "Component " << i << ": " << size[i] << " nodes\n";
}
return 0;
}
5.2.3 Breadth-First Search (BFS)
BFS explores all nodes at distance 1, then all at distance 2, then distance 3, and so on. This makes it perfect for finding shortest paths in unweighted graphs.
Visual: BFS Level-by-Level Traversal
BFS spreads outward like ripples in a pond. Each "level" of nodes is colored differently, showing that all nodes at distance d from the source are discovered before any node at distance d+1. The queue at the bottom shows the processing order.
BFS Template
// Solution: BFS Shortest Path — O(V + E)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<int> adj[MAXN];
// Returns array of shortest distances from source to all vertices
// dist[v] = -1 means unreachable
vector<int> bfs(int source, int n) {
vector<int> dist(n + 1, -1);
queue<int> q;
dist[source] = 0; // distance to source is 0
q.push(source); // seed the queue with the source
while (!q.empty()) {
int u = q.front();
q.pop();
for (int v : adj[u]) {
if (dist[v] == -1) { // not yet visited
dist[v] = dist[u] + 1; // ← KEY LINE: one hop further
q.push(v);
}
}
}
return dist;
}
Why BFS Finds Shortest Paths
BFS processes nodes in order of their distance from the source. The first time BFS visits a node, it's via the shortest path. This is because BFS never visits a node at distance d+1 before visiting all nodes at distance d.
💡 Key Insight: Think of BFS as dropping a stone in water — ripples spread outward one layer at a time. All cells at distance 1 are processed before any cell at distance 2. This level-by-level processing guarantees the first visit to any node is via the shortest path.
BFS vs. DFS for shortest path:
- BFS: guaranteed shortest path in unweighted graphs ✓
- DFS: does NOT guarantee shortest path ✗
Complexity Analysis:
- Time: O(V + E) — each vertex and edge is processed at most once
- Space: O(V) — for the distance array and queue
Complete BFS Shortest Path Trace on a Small Graph
Let's trace BFS starting from node 1 in this graph:
1 — 2 — 3
|       |
4 — 5   6
    |
    7 — 8
Edges: 1-2, 2-3, 1-4, 3-6, 4-5, 5-7, 7-8
BFS Trace:
Start: dist = [-1, 0, -1, -1, -1, -1, -1, -1, -1] (1-indexed, source=1)
Queue: [1]
Process 1: neighbors 2, 4
→ dist[2] = 1, dist[4] = 1
Queue: [2, 4]
Process 2: neighbors 1, 3
→ 1 already visited; dist[3] = 2
Queue: [4, 3]
Process 4: neighbors 1, 5
→ 1 already visited; dist[5] = 2
Queue: [3, 5]
Process 3: neighbors 2, 6
→ 2 already visited; dist[6] = 3
Queue: [5, 6]
Process 5: neighbors 4, 7
→ 4 already visited; dist[7] = 3
Queue: [6, 7]
Process 6: neighbor 3 → already visited
Process 7: neighbors 5, 8
→ 5 already visited; dist[8] = 4
Queue: [8]
Process 8: neighbor 7 → already visited. Queue empty.
Final distances from node 1:
Node: 1 2 3 4 5 6 7 8
Dist: 0 1 2 1 2 3 3 4
5.2.4 Grid BFS — The Most Common USACO Pattern
Many USACO problems give you a grid with passable (.) and blocked (#) cells. BFS finds the shortest path from one cell to another.
Visual: Grid BFS Distance Flood Fill
Starting from the center cell (distance 0), BFS expands to all reachable cells, recording the minimum number of steps to reach each one. Cells colored more blue are farther away. This is exactly how USACO flood-fill and shortest-path problems work on grids.
USACO-Style Grid BFS Problem: Maze Shortest Path
Problem: Given a 5×5 maze with walls (#) and open cells (.), find the shortest path from top-left (0,0) to bottom-right (4,4). Print the length, or -1 if no path exists.
The Maze:
. . . # .
# # . # .
. . . . .
. # # # .
. . . . .
BFS Trace — Distance Array Filling:
Starting at (0,0), BFS expands level by level. Here's the distance each cell gets assigned:
Step 0 — Initialize:
dist[0][0] = 0, queue: [(0,0)]
Step 1 — Process (0,0):
Neighbors: (0,1)='.', (1,0)='#'(wall)
dist[0][1] = 1. Queue: [(0,1)]
Step 2 — Process (0,1):
Neighbors: (0,0)=visited, (0,2)='.', (1,1)='#'
dist[0][2] = 2. Queue: [(0,2)]
Step 3 — Process (0,2):
Neighbors: (0,1)=visited, (0,3)='#', (1,2)='.'
dist[1][2] = 3. Queue: [(1,2)]
Step 4 — Process (1,2):
Neighbors: (0,2)=visited, (1,1)='#', (1,3)='#', (2,2)='.'
dist[2][2] = 4. Queue: [(2,2)]
Step 5 — Process (2,2):
Neighbors: (1,2)=visited, (2,1)='.', (2,3)='.', (3,2)='#'
dist[2][1] = 5, dist[2][3] = 5. Queue: [(2,1),(2,3)]
...continuing BFS...
Final distance array (# = wall):
       c=0  c=1  c=2  c=3  c=4
r=0:    0    1    2    #    8
r=1:    #    #    3    #    7
r=2:    6    5    4    5    6
r=3:    7    #    #    #    7
r=4:    8    9   10    9    8
Shortest path length = dist[4][4] = 8
Path reconstruction: Follow the path backward from (4,4), always moving to the cell with distance one less:
(4,4)=8 → (3,4)=7 → (2,4)=6 → (2,3)=5 → (2,2)=4 → (1,2)=3 → (0,2)=2 → (0,1)=1 → (0,0)=0
Path length: 8 steps ✓
ASCII Visualization of the path:
S → ↓ # .
# # ↓ # .
. . → → ↓
. # # # ↓
. . . . E
Complete C++ Code:
// Solution: Grid BFS Shortest Path — O(R × C)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int R, C;
cin >> R >> C;
vector<string> grid(R);
for (int r = 0; r < R; r++) cin >> grid[r];
// Find start (S) and end (E), or use fixed corners
int sr = 0, sc = 0, er = R-1, ec = C-1;
// BFS distance array: -1 = unvisited
vector<vector<int>> dist(R, vector<int>(C, -1));
queue<pair<int,int>> q;
// Step 1: Seed BFS from source
dist[sr][sc] = 0;
q.push({sr, sc});
// Step 2: Direction arrays (up, down, left, right)
int dr[] = {-1, 1, 0, 0};
int dc[] = {0, 0, -1, 1};
// Step 3: BFS expansion
while (!q.empty()) {
auto [r, c] = q.front();
q.pop();
for (int d = 0; d < 4; d++) {
int nr = r + dr[d];
int nc = c + dc[d];
if (nr >= 0 && nr < R // in-bounds row
&& nc >= 0 && nc < C // in-bounds col
&& grid[nr][nc] != '#' // not a wall
&& dist[nr][nc] == -1) { // ← KEY LINE: not yet visited
dist[nr][nc] = dist[r][c] + 1;
q.push({nr, nc});
}
}
}
// Step 4: Output result
if (dist[er][ec] == -1) {
cout << -1 << "\n"; // no path
} else {
cout << dist[er][ec] << "\n";
}
return 0;
}
Sample Input (the maze above):
5 5
...#.
##.#.
.....
.###.
.....
Sample Output:
8
⚠️ Common Mistake: Using DFS instead of BFS for shortest path in a maze. DFS might find A path, but not the SHORTEST path. Always use BFS for shortest distances in unweighted grids.
5.2.5 USACO Example: Flood Fill
USACO loves "flood fill" problems: find all connected cells of the same type, or count connected regions.
Problem: Count the number of distinct connected regions of '.' cells in a grid. (Like counting islands.)
#include <bits/stdc++.h>
using namespace std;
int R, C;
vector<string> grid;
vector<vector<bool>> visited;
void floodFill(int r, int c) {
if (r < 0 || r >= R || c < 0 || c >= C) return; // out of bounds
if (visited[r][c]) return; // already visited
if (grid[r][c] == '#') return; // wall
visited[r][c] = true;
floodFill(r - 1, c); // up
floodFill(r + 1, c); // down
floodFill(r, c - 1); // left
floodFill(r, c + 1); // right
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
cin >> R >> C;
grid.resize(R);
visited.assign(R, vector<bool>(C, false));
for (int r = 0; r < R; r++) cin >> grid[r];
int regions = 0;
for (int r = 0; r < R; r++) {
for (int c = 0; c < C; c++) {
if (!visited[r][c] && grid[r][c] == '.') {
regions++;
floodFill(r, c);
}
}
}
cout << regions << "\n";
return 0;
}
5.2.6 Multi-Source BFS
Sometimes you start BFS from multiple source nodes simultaneously. For example: "Find the minimum distance from each cell to the nearest fire."
// Multi-source BFS: start from all fire cells at once
queue<pair<int,int>> q;
vector<vector<int>> dist(R, vector<int>(C, -1));
// Push ALL sources first
for (int r = 0; r < R; r++) {
for (int c = 0; c < C; c++) {
if (grid[r][c] == 'F') { // fire cell
dist[r][c] = 0;
q.push({r, c});
}
}
}
// Run BFS from all sources simultaneously
while (!q.empty()) {
auto [r, c] = q.front();
q.pop();
for (int d = 0; d < 4; d++) {
int nr = r + dr[d], nc = c + dc[d];
if (/*valid and unvisited*/ nr >= 0 && nr < R && nc >= 0 && nc < C && dist[nr][nc] == -1) {
dist[nr][nc] = dist[r][c] + 1;
q.push({nr, nc});
}
}
}
5.2.7 DFS vs. BFS — When to Use Each
| Task | Use | Why |
|---|---|---|
| Shortest path (unweighted) | BFS ✓ | Level-by-level guarantees shortest |
| Connectivity / connected components | Either | Both work; DFS often simpler recursively |
| Cycle detection | DFS ✓ | Recursion stack tracks current path |
| Topological sort | DFS ✓ | Post-order gives reverse topological order |
| Flood fill | Either (DFS often simpler) | DFS recursion is concise |
| Bipartite check | BFS or DFS | 2-color with either |
| Distance to ALL nodes | BFS ✓ | BFS naturally computes all distances |
| Tree traversals (pre/in/post order) | DFS ✓ | Recursion maps naturally to tree structure |
💡 Key Insight: Use BFS whenever you need "the minimum number of steps." Use DFS whenever you just need to visit all nodes or check properties of paths.
⚠️ Common Mistakes in Chapter 5.2
- Using DFS for shortest path: DFS explores one path deeply and doesn't guarantee minimum steps. Always use BFS for unweighted shortest paths.
- Forgetting bounds checks: nr >= 0 && nr < R && nc >= 0 && nc < C — missing any one of these four conditions causes out-of-bounds crashes.
- Not marking visited before pushing to the queue: if you mark visited when popping instead of pushing, the same node can be pushed many times, causing O(V²) time instead of O(V+E).
- Stack overflow in recursive DFS: for grids with N×M ≈ 10^6, recursive DFS can exceed the default stack size. Use iterative DFS or increase the stack size.
- Using the wrong starting point: in grid problems, make sure you BFS from the correct cell (0-indexed vs 1-indexed confusion).
Chapter Summary
📌 Key Takeaways
| Algorithm | Data Structure | Time | Space | Best For |
|---|---|---|---|---|
| DFS (recursive) | Call stack | O(V+E) | O(V) | Connectivity, cycle detection, tree problems |
| DFS (iterative) | Explicit stack | O(V+E) | O(V) | Same, avoids stack overflow |
| BFS | Queue | O(V+E) | O(V) | Shortest path, layer traversal |
| Multi-source BFS | Queue (multi-source pre-fill) | O(V+E) | O(V) | Distance from each node to nearest source |
| 3-Color DFS | Color array | O(V+E) | O(V) | Directed graph cycle detection |
| Topological Sort | DFS/BFS (Kahn) | O(V+E) | O(V) | Sorting/DP on DAG |
❓ FAQ
Q1: Both BFS and DFS have time complexity O(V+E). Why can BFS find shortest paths but DFS cannot?
A: The key is visit order. BFS uses a queue to guarantee "process all nodes at distance d before distance d+1," so the first time a node is reached is always via the shortest path. DFS uses a stack (or recursion) and may take a long path to a node, missing shorter ones.
Q2: When does recursive DFS cause stack overflow? How to fix it?
A: The default stack size is typically ~1-8 MB, and each recursion level uses roughly 100-200 bytes, so overflow can occur once recursion depth exceeds about 10^4-10^5. Solutions: ① switch to iterative DFS (explicit stack); ② increase the stack size: on Windows/MinGW pass -Wl,--stack,268435456 to g++; on Linux run ulimit -s unlimited in the shell before executing.
Q3: In Grid BFS, why use dist == -1 for unvisited instead of a visited array?
A: Using dist[r][c] == -1 kills two birds with one stone: it records both "visited or not" and "the distance to reach the cell." One fewer array, cleaner code.
Q4: When to use DFS topological sort vs. Kahn's BFS topological sort?
A: DFS topological sort has shorter code (just reverse postorder), but Kahn's is more intuitive and can detect cycles (if final sorted length < N, there is a cycle). Both are common in contests; choose whichever you're more comfortable with.
🔗 Connections to Later Chapters
- Chapter 5.3 (Trees & DSU): Tree Traversal (pre/postorder) is essentially DFS
- Chapters 5.3 & 6.1–6.3 (DP): "DP on DAG" requires topological sort first, then compute DP in topological order
- Chapter 4.1 (Greedy): Some graph greedy problems need BFS to compute distances as input
- BFS shortest path is a simplified version of Dijkstra (Gold level)—Dijkstra handles weighted graphs, BFS handles unweighted
- Multi-source BFS is extremely common in USACO Silver and is a must-master core technique
Practice Problems
Problem 5.2.1 — Island Count 🟢 Easy Read an N×M grid of '.' (water) and '#' (land). Count the number of islands (connected groups of '#').
Hint
Do DFS/BFS from each unvisited '#' cell. Each DFS call marks a full island. Count how many DFS calls you make.
Problem 5.2.2 — Maze Shortest Path 🟢 Easy Read an N×M maze with 'S' (start), 'E' (end), '.' (passable), '#' (wall). Find the minimum steps from S to E, or -1 if impossible.
Hint
BFS from S. When you reach E, output ``dist[E]``. If E is never reached, output -1.
Problem 5.2.3 — Bipartite Check 🟡 Medium A graph is bipartite if you can color each node black or white such that every edge connects a black node to a white node. Given a graph, determine if it's bipartite.
Solution sketch: BFS and 2-color. When you visit a node, color it the opposite of its parent. If you ever find an edge between two same-colored nodes, it's not bipartite.
Hint
Assign color 0 to the source. For each neighbor, assign color 1-parent_color. If a neighbor already has the same color as the current node, return false.
Problem 5.2.4 — Multi-Source BFS: Nearest Fire 🟡 Medium Given a grid with fire cells 'F', empty cells '.', and walls '#', find the minimum distance from each empty cell to the nearest fire cell.
Solution sketch: Initialize the BFS queue with ALL fire cells at distance 0. Run BFS normally. Each empty cell gets the distance to its nearest fire.
Hint
Multi-source BFS = push all sources into the queue at step 0. The BFS then naturally computes the minimum distance from any source to each cell.
Problem 5.2.5 — USACO 2016 February Bronze: Milk Pails 🔴 Hard Starting from a state (0, 0) of two buckets with capacities A and B, operations: fill A (→ capacity A), fill B, pour A into B, pour B into A, empty A, empty B. Find minimum operations to get exactly X gallons in either bucket.
Solution sketch: BFS on states (a, b) where a ∈ [0,A] and b ∈ [0,B]. Each state is a node, each operation is an edge. BFS from (0,0) finds minimum operations.
Hint
Total states: `O(A×B)`. BFS explores at most `O(A×B)` states, each with 6 transitions. Make sure to mark visited states to avoid cycles.
🏆 Challenge Problem: USACO 2015 December Silver: Switching on the Lights You have an N×N grid of light switches. Each switch is connected to some lights. Turn on all lights by flipping switches. Model as a BFS/DFS graph reachability problem where turning on a light may reveal new switches.
This requires multi-source BFS + careful state management.
5.2.8 Multi-Source BFS — In Depth
Multi-source BFS starts from multiple source nodes simultaneously. The key: push all sources into the queue at distance 0 before starting BFS.
Why does this work? BFS processes nodes level by level. If multiple nodes start at "level 0," BFS naturally propagates from all of them in parallel — exactly as if you had a virtual super-source connected to all real sources at cost 0.
Level 0: [S₁][S₂][S₃] ← all fire sources / all starting nodes
Level 1: neighbors of S₁, S₂, S₃
Level 2: their neighbors not yet visited
...
Complete Example: Spreading Fire
Problem: Given an N×M grid with fire cells ('F'), water cells ('.'), and walls ('#'), compute the minimum distance from each '.' cell to the nearest fire cell.
// Solution: Multi-Source BFS — O(N×M)
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int R, C;
cin >> R >> C;
vector<string> grid(R);
for (auto& row : grid) cin >> row;
vector<vector<int>> dist(R, vector<int>(C, -1));
queue<pair<int,int>> q;
// ← KEY: Push ALL fire sources at distance 0 before starting BFS
for (int r = 0; r < R; r++) {
for (int c = 0; c < C; c++) {
if (grid[r][c] == 'F') {
dist[r][c] = 0;
q.push({r, c});
}
}
}
int dr[] = {-1, 1, 0, 0};
int dc[] = {0, 0, -1, 1};
while (!q.empty()) {
auto [r, c] = q.front(); q.pop();
for (int d = 0; d < 4; d++) {
int nr = r + dr[d], nc = c + dc[d];
if (nr >= 0 && nr < R && nc >= 0 && nc < C
&& grid[nr][nc] != '#' && dist[nr][nc] == -1) {
dist[nr][nc] = dist[r][c] + 1;
q.push({nr, nc});
}
}
}
// Print distance grid
for (int r = 0; r < R; r++) {
for (int c = 0; c < C; c++) {
if (dist[r][c] == -1) cout << " # ";
else cout << " " << dist[r][c] << " ";
}
cout << "\n";
}
return 0;
}
BFS Level Visualization:
Level 0: [F₁][F₂] ← all fire sources enter queue together
Level 1: [ 1 ][ 1 ][ 1 ] ← cells adjacent to any fire source
Level 2: [ 2 ][ 2 ][ 2 ][ 2 ]
Level 3: [ 3 ][ 3 ][ 3 ][ 3 ][ 3 ]
Multi-source BFS level-by-level expansion:
flowchart TD
subgraph L0["Level 0 — initial fire sources"]
F1(["F₁\ndist=0"])
F2(["F₂\ndist=0"])
end
subgraph L1["Level 1 — first ring of expansion"]
N1(["dist=1"])
N2(["dist=1"])
N3(["dist=1"])
end
subgraph L2["Level 2 — second ring of expansion"]
N4(["dist=2"])
N5(["dist=2"])
end
F1 --> N1
F1 --> N2
F2 --> N2
F2 --> N3
N1 --> N4
N3 --> N5
style F1 fill:#fca5a5,stroke:#dc2626
style F2 fill:#fca5a5,stroke:#dc2626
style L0 fill:#fff1f2
style L1 fill:#fef9ec
style L2 fill:#f0fdf4
💡 Key Principle: All fire sources enter the queue together at dist=0, so BFS expands outward from all of them in parallel. Each empty cell therefore receives the minimum distance to its nearest fire source, a direct consequence of BFS's level-order property.
USACO Application: "Icy Perimeter" Style
Multi-source BFS is useful when you need:
- "Distance from each cell to nearest [thing]"
- "Spreading from multiple starting points" (fire, infection, flood)
- "Simultaneous evacuation from multiple exits"
5.2.9 Cycle Detection with DFS — White/Gray/Black Coloring
For directed graphs, cycle detection uses 3-color DFS:
- White (0): Not yet visited
- Gray (1): Currently in DFS call stack (being processed)
- Black (2): Fully processed (all descendants explored)
A back edge (edge to a gray node) indicates a cycle.
// Solution: Cycle Detection in Directed Graph — O(V+E)
#include <bits/stdc++.h>
using namespace std;
int n;
vector<int> adj[100001];
vector<int> color; // 0=white, 1=gray, 2=black
bool hasCycle = false;
void dfs(int u) {
color[u] = 1; // mark as "in progress" (gray)
for (int v : adj[u]) {
if (color[v] == 0) {
dfs(v); // unvisited: recurse
} else if (color[v] == 1) {
hasCycle = true; // ← back edge: v is an ancestor of u → cycle!
}
// color[v] == 2: already fully processed, safe to skip
}
color[u] = 2; // mark as "done" (black)
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int m;
cin >> n >> m;
color.assign(n + 1, 0);
for (int i = 0; i < m; i++) {
int u, v; cin >> u >> v;
adj[u].push_back(v); // directed edge u → v
}
for (int u = 1; u <= n; u++) {
if (color[u] == 0) dfs(u);
}
cout << (hasCycle ? "HAS CYCLE" : "NO CYCLE") << "\n";
return 0;
}
⚠️ Undirected graph cycle detection: For undirected graphs, use a simpler method: during DFS, if you visit a node that's already visited AND it's not the parent of the current node, there's a cycle. Alternatively, use DSU: if an edge connects two already-connected nodes, it creates a cycle.
5.2.10 Topological Sort with DFS
Topological sort orders the nodes of a directed acyclic graph (DAG) such that for every edge u → v, u comes before v.
DFS approach: When a node finishes (all descendants processed), add it to the front of the result list. This gives reverse topological order.
Finish order (post-order): E, D, C, B, A
Topological order (reverse): A, B, C, D, E
// Solution: Topological Sort via DFS — O(V+E)
#include <bits/stdc++.h>
using namespace std;
vector<int> adj[100001];
vector<bool> visited;
vector<int> topoOrder;
void dfs(int u) {
visited[u] = true;
for (int v : adj[u]) {
if (!visited[v]) dfs(v);
}
topoOrder.push_back(u); // ← add AFTER all children processed (post-order)
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
visited.assign(n + 1, false);
for (int i = 0; i < m; i++) {
int u, v; cin >> u >> v;
adj[u].push_back(v);
}
for (int u = 1; u <= n; u++) {
if (!visited[u]) dfs(u);
}
// Reverse post-order = topological order
reverse(topoOrder.begin(), topoOrder.end());
for (int u : topoOrder) cout << u << " ";
cout << "\n";
return 0;
}
Alternative: Kahn's Algorithm (BFS-based Topological Sort)
// Kahn's Algorithm: Process nodes with in-degree 0 first — O(V+E)
vector<int> inDeg(n + 1, 0);
for (int u = 1; u <= n; u++)
for (int v : adj[u])
inDeg[v]++;
queue<int> q;
for (int u = 1; u <= n; u++)
if (inDeg[u] == 0) q.push(u); // start with nodes having no prerequisites
vector<int> order;
while (!q.empty()) {
int u = q.front(); q.pop();
order.push_back(u);
for (int v : adj[u]) {
inDeg[v]--;
if (inDeg[v] == 0) q.push(v);
}
}
// If order.size() != n, there's a cycle (not a DAG)
if ((int)order.size() != n) cout << "CYCLE DETECTED\n";
else for (int u : order) cout << u << " ";
How in-degrees change in Kahn's algorithm:
flowchart LR
subgraph init["Initial in-degrees"]
direction TB
A0(["A\nin=0"])
B0(["B\nin=1"])
C0(["C\nin=2"])
D0(["D\nin=1"])
end
subgraph step1["Process A (in=0)"]
direction TB
A1(["A\ndone"])
B1(["B\nin=0↓"])
C1(["C\nin=2"])
D1(["D\nin=1"])
end
subgraph step2["Process B (in=0)"]
direction TB
A2(["A\ndone"])
B2(["B\ndone"])
C2(["C\nin=1↓"])
D2(["D\nin=0↓"])
end
subgraph step3["Process D → C (in=0)"]
direction TB
A3(["A"])
B3(["B"])
C3(["C\nin=0↓"])
D3(["D\ndone"])
end
init -->|"enqueue A"| step1
step1 -->|"enqueue B"| step2
step2 -->|"enqueue D, C"| step3
style A0 fill:#dcfce7,stroke:#16a34a
style B1 fill:#dcfce7,stroke:#16a34a
style D2 fill:#dcfce7,stroke:#16a34a
style C3 fill:#dcfce7,stroke:#16a34a
💡 Cycle detection: If order.size() < n at the end, some nodes never reached in-degree 0 (they lie on a cycle). This built-in cycle check is a major advantage of Kahn's algorithm over DFS-based topological sort.
💡 Key Application: Topological sort is essential for DP on DAGs. If the dependency graph is a DAG, process nodes in topological order — each node's DP state depends only on previously-processed nodes.
Visual: Grid BFS Distances from Source
The diagram shows a 5×5 grid BFS where each cell displays its minimum distance from the source (0,0). Walls are shown in dark gray. Note how the BFS "flood fills" outward in concentric rings, never revisiting a cell — guaranteeing minimum distances.
Chapter 5.3: Trees & Special Graphs
Trees are graphs with a special structure that enables elegant and efficient algorithms. This chapter covers tree traversals, and one of the most important data structures in competitive programming: Union-Find (also called Disjoint Set Union or DSU).
5.3.1 Tree Traversals
Given a rooted tree, there are three classic ways to visit every node with DFS:
- Pre-order: Visit node, then children (node before subtree)
- In-order: Visit left child, node, right child (only for binary trees)
- Post-order: Visit children, then node (subtree before node)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<int> children[MAXN];
// Pre-order: parent before children (useful for computing subtree info top-down)
void preorder(int u) {
cout << u << " "; // process u first
for (int v : children[u]) preorder(v);
}
// Post-order: children before parent (useful for subtree aggregation bottom-up)
void postorder(int u) {
for (int v : children[u]) postorder(v);
cout << u << " "; // process u after all children
}
// Calculate subtree size (post-order style)
int subtreeSize[MAXN];
void calcSize(int u) {
subtreeSize[u] = 1; // start with just this node
for (int v : children[u]) {
calcSize(v);
subtreeSize[u] += subtreeSize[v]; // add child subtree sizes
}
}
// Calculate depth of each node (pre-order style)
int depth[MAXN];
void calcDepth(int u, int d) {
depth[u] = d;
for (int v : children[u]) {
calcDepth(v, d + 1);
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
for (int i = 2; i <= n; i++) {
int p; cin >> p;
children[p].push_back(i);
}
cout << "Pre-order: ";
preorder(1);
cout << "\n";
cout << "Post-order: ";
postorder(1);
cout << "\n";
calcSize(1);
cout << "Subtree sizes: ";
for (int i = 1; i <= n; i++) cout << subtreeSize[i] << " ";
cout << "\n";
calcDepth(1, 0);
cout << "Depths: ";
for (int i = 1; i <= n; i++) cout << depth[i] << " ";
cout << "\n";
return 0;
}
5.3.2 Lowest Common Ancestor (LCA) — Naive
The LCA of two nodes u and v is the deepest node that is an ancestor of both.
For small trees, a naive approach: march both nodes up to the root, find where they first meet.
How naive LCA climbs upward:
flowchart TD
subgraph tree["Tree structure (root = 1)"]
N1(["1\ndepth=0"])
N2(["2\ndepth=1"])
N3(["3\ndepth=1"])
N4(["4\ndepth=2"])
N5(["5\ndepth=2"])
N6(["6\ndepth=3"])
N1 --> N2
N1 --> N3
N2 --> N4
N2 --> N5
N4 --> N6
end
subgraph lca["Computing LCA(6, 5)"]
direction LR
S1["u=6, v=5\ndepth[6]=3, depth[5]=2"] -->|"lift u to equal depth"| S2
S2["u=4, v=5\nsame depth = 2"] -->|"u≠v: move both up"| S3
S3["u=2, v=2\nu==v → LCA=2 ✓"]
end
style S3 fill:#dcfce7,stroke:#16a34a
💡 Key steps: ① First lift the deeper node until both nodes are at the same depth; ② then move both up in lockstep until they meet; the meeting point is the LCA.
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int depth_arr[100001];
// Naive LCA: walk both nodes up until they meet — O(depth) per query
int lca(int u, int v) {
while (depth_arr[u] > depth_arr[v]) u = parent[u]; // bring u up to same depth as v
while (depth_arr[v] > depth_arr[u]) v = parent[v]; // bring v up to same depth as u
while (u != v) { // now both at same depth; walk up together
u = parent[u];
v = parent[v];
}
return u;
}
For Silver problems, naive LCA (O(N) per query) is often sufficient. Gold uses binary lifting (O(log N) per query).
Building the binary lifting ancestor table:
flowchart LR
subgraph anc0["anc[v][0] (direct parent)"]
direction TB
v6a(["6"]) -->|"parent"| v4a(["4"])
v4a -->|"parent"| v2a(["2"])
v2a -->|"parent"| v1a(["1"])
end
subgraph anc1["anc[v][1] (2¹ = 2 levels up)"]
direction TB
v6b(["6"]) -->|"2 levels up"| v2b(["2"])
v4b(["4"]) -->|"2 levels up"| v1b(["1"])
end
subgraph anc2["anc[v][2] (2² = 4 levels up)"]
direction TB
v6c(["6"]) -->|"4 levels up"| v1c(["1"])
end
anc0 -->|"anc[v][k] = anc[anc[v][k-1]][k-1]"| anc1
anc1 --> anc2
style anc2 fill:#f0f4ff,stroke:#4A6CF7
💡 The binary lifting idea: anc[v][k] = the 2^k-th ancestor of v = the 2^(k-1)-th ancestor of v's 2^(k-1)-th ancestor. A query decomposes the depth difference into binary and jumps 2^k levels at a time, for O(log N) jumps in total.
5.3.3 Union-Find (Disjoint Set Union)
Union-Find is a data structure that efficiently answers two questions:
- Find: Which group does element X belong to?
- Union: Merge the groups containing X and Y.
Why is this useful? It efficiently tracks connected components as edges are added one by one, which is used in Kruskal's MST algorithm, detecting cycles, and many USACO problems.
Visual: Union-Find Operations
The diagram shows Union-Find evolving: initially all nodes are separate (each is its own root), then after union(0,1) and union(1,2) a tree forms. Path compression (shown at bottom) flattens the tree so future find() calls are nearly O(1).
This static reference diagram shows the Union-Find tree structure with path compression and union by rank, illustrating how the data structure maintains near-constant time operations.
Basic Implementation
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
int parent[MAXN]; // parent[x] = parent of x in the tree
int rankArr[MAXN]; // used for union by rank
// Initialize: each element is its own group
void init(int n) {
for (int i = 1; i <= n; i++) {
parent[i] = i; // parent of i is itself
rankArr[i] = 0; // initial rank is 0
}
}
// Find: returns the "representative" (root) of x's group
// Uses PATH COMPRESSION: flattens the tree for future queries
int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // path compression!
}
return parent[x];
}
// Union: merge groups containing x and y
// Uses UNION BY RANK: attach smaller tree under larger tree
void unite(int x, int y) {
int px = find(x), py = find(y);
if (px == py) return; // already in same group
// Attach tree with lower rank under tree with higher rank
if (rankArr[px] < rankArr[py]) swap(px, py);
parent[py] = px;
if (rankArr[px] == rankArr[py]) rankArr[px]++;
}
// Check if x and y are in the same group
bool connected(int x, int y) {
return find(x) == find(y);
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
init(n);
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
unite(u, v);
}
// Count connected components
set<int> roots;
for (int i = 1; i <= n; i++) roots.insert(find(i));
cout << "Connected components: " << roots.size() << "\n";
// Answer connectivity queries
int q;
cin >> q;
while (q--) {
int u, v;
cin >> u >> v;
cout << (connected(u, v) ? "YES" : "NO") << "\n";
}
return 0;
}
Time complexity: With path compression and union by rank, both find and unite run in nearly O(1) — specifically O(α(n)) where α is the inverse Ackermann function, which is effectively constant for all practical inputs.
With vs. Without Union by Rank
flowchart LR
subgraph bad["❌ Without union by rank (degenerates into a chain)"]
direction TB
R1(["1"]) --> R2(["2"]) --> R3(["3"]) --> R4(["4"]) --> R5(["5"])
note_bad["find(5) takes 5 steps\nO(N) per query"]
end
subgraph good["✅ With union by rank (tree stays shallow)"]
direction TB
Root(["1"])
C2(["2"])
C3(["3"])
C4(["4"])
C5(["5"])
Root --> C2
Root --> C3
Root --> C4
Root --> C5
note_good["find(any node) takes 2 steps\nO(1) per query"]
end
bad -->|"union-by-rank optimization"| good
style good fill:#dcfce7,stroke:#16a34a
style bad fill:#fef2f2,stroke:#dc2626
💡 The union-by-rank rule: attach the tree of smaller rank under the tree of larger rank, keeping tree height within O(log N). Combined with path compression, the amortized cost drops to O(α(N)) ≈ O(1).
Why Union-Find is Powerful
Compare with BFS/DFS for connectivity queries:
- BFS/DFS: O(N+M) per query (rebuilds from scratch)
- Union-Find: O(α(N)) per query after O((N+M)α(N)) preprocessing
For Q queries after reading all edges: BFS = O(Q(N+M)) vs DSU = O((N+M+Q)α(N)).
5.3.4 Cycle Detection with DSU
Problem: Given a graph, determine if it has a cycle. If so, report which edge creates a cycle.
DSU cycle detection, step by step:
flowchart LR
subgraph e1["Add edge 1-2"]
direction TB
A1(["1"]) --- B1(["2"])
note1["find(1)≠find(2)\n→ merge, no cycle"]
end
subgraph e2["Add edge 2-3"]
direction TB
A2(["1"]) --- B2(["2"]) --- C2(["3"])
note2["find(2)≠find(3)\n→ merge, no cycle"]
end
subgraph e3["Add edge 1-3"]
direction TB
A3(["1"]) --- B3(["2"]) --- C3(["3"])
A3 -.-|"dashed = new edge"| C3
note3["find(1)==find(3)\n→ already connected: cycle! ⚠️"]
end
e1 --> e2 --> e3
style note3 fill:#fef2f2,stroke:#dc2626
style e3 fill:#fff1f2,stroke:#fca5a5
💡 The core check: before adding edge (u, v), if find(u) == find(v), then u and v already lie in the same component, so adding this edge necessarily creates a cycle.
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 1; i <= n; i++) parent[i] = i;
bool hasCycle = false;
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
if (find(u) == find(v)) {
cout << "Cycle created by edge " << u << "-" << v << "\n";
hasCycle = true;
} else {
parent[find(u)] = find(v); // simple union (no rank for brevity)
}
}
if (!hasCycle) cout << "No cycle\n";
return 0;
}
5.3.5 Minimum Spanning Tree (Kruskal's Algorithm)
A minimum spanning tree (MST) of a weighted graph connects all vertices with total edge weight minimized, using exactly N-1 edges.
Kruskal's algorithm:
- Sort all edges by weight
- Process edges in order; add an edge if it connects two different components (using DSU)
- Stop when N-1 edges are added
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
bool unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return false; // already connected
parent[x] = y;
return true;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 1; i <= n; i++) parent[i] = i;
// Read edges as (weight, u, v)
vector<tuple<int,int,int>> edges(m);
for (auto &[w, u, v] : edges) cin >> u >> v >> w;
// Sort by weight
sort(edges.begin(), edges.end());
long long totalCost = 0;
int edgesAdded = 0;
for (auto [w, u, v] : edges) {
if (unite(u, v)) { // connects two different components
totalCost += w;
edgesAdded++;
if (edgesAdded == n - 1) break; // MST complete
}
}
if (edgesAdded == n - 1) {
cout << "MST cost: " << totalCost << "\n";
} else {
cout << "Graph is disconnected; no MST\n";
}
return 0;
}
5.3.6 USACO Example: The Fence
Problem (USACO-style): A farm has N fields and M fences. Each fence connects two fields. Fields in the same connected component form a "pasture." After adding each fence, output the size of the largest pasture.
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int sz[100001]; // sz[root] = size of component rooted at 'root'
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
void unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return;
if (sz[x] < sz[y]) swap(x, y);
parent[y] = x;
sz[x] += sz[y];
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 1; i <= n; i++) { parent[i] = i; sz[i] = 1; }
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
unite(u, v);
int maxSz = 0;
// O(N) scan per added fence; for large inputs, maintain a running max in unite()
for (int j = 1; j <= n; j++) {
if (find(j) == j) maxSz = max(maxSz, sz[j]); // only roots have correct sz
}
cout << maxSz << "\n";
}
return 0;
}
Chapter Summary
📌 Key Takeaways
| Technique | Time per Operation | Use Case | Why It Matters |
|---|---|---|---|
| Tree DFS (pre/post order) | O(N) total | Subtree sum, depth calc | Foundation for tree DP |
| Naive LCA | O(depth) per query | Small trees | Understanding LCA concept |
| Binary Lifting LCA | O(log N) per query | Large tree path queries | Gold-level core technique |
| Union-Find find/union | O(α(N)) ≈ O(1) | Dynamic connectivity | Kruskal MST, online connectivity |
| Kruskal's MST | O(E log E) | Minimum spanning tree | Common in USACO Silver/Gold |
| Euler Tour | O(N) preprocessing | Subtree→range query | Combined with Segment Tree for tree problems |
| Tree Diameter | O(N) (two BFS) | Longest path in tree | Common interview/contest problem |
❓ FAQ
Q1: Can Union-Find use only one of "path compression" or "union by rank"?
A: Yes. Path compression alone gives amortized O(log N). Union by rank alone gives O(log N). Both together achieve O(α(N)). In contests, at least use path compression (more impactful); union by rank can be simplified to union by size.
Q2: What is the difference between Kruskal and Prim? When to use which?
A: Kruskal sorts edges + DSU, suited for sparse graphs (E ≪ V²), concise code. Prim is like Dijkstra with a priority queue, suited for dense graphs. In contests, use Kruskal 90% of the time.
Q3: What is the difference between Euler Tour and DFS order?
A: Essentially the same. "DFS order" usually refers to in_time and out_time; "Euler Tour" sometimes means the full entry/exit sequence (length 2N). In this book they are the same thing—the key is that [in[u], out[u]] corresponds to u's subtree.
Q4: Why can tree diameter be found with "two BFS passes"?
A: Proof: Let the diameter be u→v. Starting BFS from any node s, the farthest node must be u or v (provable by contradiction). Then BFS from that endpoint finds the other endpoint and the diameter length.
Q5: What is the difference between multiset::erase(ms.find(val)) and ms.erase(val)?
A: Not this chapter's content (it belongs to Chapter 3.8), but related to DSU sz tracking. ms.erase(val) removes all elements equal to val; ms.erase(ms.find(val)) removes only one. When tracking group sizes in DSU, watch for similar "delete one vs delete all" issues.
🔗 Connections to Later Chapters
- Chapter 3.9 (Segment Tree) + Euler Tour = efficient subtree queries (update and query both O(log N))
- Chapter 6.1 (DP Introduction): Tree DP builds directly on this chapter's tree traversal—postorder DFS aggregates bottom-up
- Chapter 4.1 (Greedy): MST is a classic greedy application—Kruskal greedily selects minimum edges
- Union-Find is powerful for offline processing—sort all queries/edges first, then add with DSU incrementally
- Binary Lifting for LCA is one of the core techniques at USACO Gold level
Practice Problems
Problem 5.3.1 — Subtree Sum Read a rooted tree with values at each node. For each node, output the sum of values in its subtree.
Problem 5.3.2 — Network Components Read a graph. Add edges one by one. After each edge, print the number of connected components.
Problem 5.3.3 — Redundant Edge Read a tree (N nodes, N-1 edges) plus one extra edge that creates a cycle. Find the extra edge. (Hint: use DSU — the edge that unites two already-connected nodes is the answer)
Problem 5.3.4 — Friend Groups N students. Read M pairs of friendships. Friends of friends are also friends (transitivity). Print the number of friend groups and the size of the largest one.
Problem 5.3.5 — USACO 2016 February Silver: Fencing the Cows (Inspired) Read a weighted graph. Find the minimum cost to connect all nodes (MST using Kruskal's). Print the total MST weight, or "IMPOSSIBLE" if the graph is not connected.
5.3.7 Kruskal's MST — Complete Worked Example
Let's trace Kruskal's algorithm on a 6-node graph with 9 edges.
Graph:
Nodes: 0,1,2,3,4,5
Edges (sorted by weight):
0-1: w=1
2-3: w=2
0-2: w=3
1-3: w=4
3-4: w=5
2-4: w=6
4-5: w=7
1-4: w=8
3-5: w=9
Kruskal's Algorithm Trace:
Initial: 6 components {0},{1},{2},{3},{4},{5}
Process edge 0-1 (w=1): find(0)=0, find(1)=1 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1}. Components: {0,1},{2},{3},{4},{5}
Process edge 2-3 (w=2): find(2)=2, find(3)=3 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3}. Components: {0,1},{2,3},{4},{5}
Process edge 0-2 (w=3): find(0)=root_of_01, find(2)=root_of_23 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3, 0-2}. Components: {0,1,2,3},{4},{5}
Process edge 1-3 (w=4): find(1)=root_of_0123, find(3)=root_of_0123 → SAME → SKIP ✗
(Adding this would create a cycle: 0-1-3-2-0)
Process edge 3-4 (w=5): find(3)=root_of_0123, find(4)=4 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3, 0-2, 3-4}. Components: {0,1,2,3,4},{5}
Process edge 2-4 (w=6): find(2)=root_of_01234, find(4)=root_of_01234 → SAME → SKIP ✗
Process edge 4-5 (w=7): find(4)=root_of_01234, find(5)=5 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3, 0-2, 3-4, 4-5}.
edgesAdded = 5 = n-1 = 5. DONE!
MST total weight: 1 + 2 + 3 + 5 + 7 = 18
Kruskal edge-selection process:
flowchart LR
subgraph s0["Initial: 6 separate components"]
direction LR
n0a([0])
n1a([1])
n2a([2])
n3a([3])
n4a([4])
n5a([5])
end
subgraph s1["Add 0-1 (w=1) and 2-3 (w=2)"]
direction LR
n01b(["0——1"])
n23b(["2——3"])
n4b([4])
n5b([5])
end
subgraph s2["Add 0-2 (w=3); skip 1-3 (w=4): cycle"]
direction LR
n0123c(["0—1—2—3"])
n4c([4])
n5c([5])
end
subgraph s3["Add 3-4 (w=5) and 4-5 (w=7): MST complete"]
direction LR
n012345d(["0—1—2—3—4—5"])
end
s0 --> s1 --> s2 --> s3
style s3 fill:#dcfce7,stroke:#16a34a
Complete C++ implementation with worked example:
// Solution: Kruskal's MST — O(E log E)
#include <bits/stdc++.h>
using namespace std;
struct DSU {
vector<int> parent, rank_;
DSU(int n) : parent(n), rank_(n, 0) {
iota(parent.begin(), parent.end(), 0);
}
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
bool unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return false;
if (rank_[x] < rank_[y]) swap(x, y);
parent[y] = x;
if (rank_[x] == rank_[y]) rank_[x]++;
return true;
}
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
// Read edges as {weight, u, v}
vector<tuple<int,int,int>> edges(m);
for (auto& [w, u, v] : edges) cin >> u >> v >> w;
// Sort by weight (ascending)
sort(edges.begin(), edges.end());
DSU dsu(n);
long long mstWeight = 0;
int edgesAdded = 0;
vector<pair<int,int>> mstEdges;
for (auto [w, u, v] : edges) {
if (dsu.unite(u, v)) { // different components → safe to add
mstWeight += w;
mstEdges.push_back({u, v});
if (++edgesAdded == n - 1) break; // MST complete (N-1 edges)
}
}
if (edgesAdded < n - 1) {
cout << "IMPOSSIBLE: graph is disconnected\n";
} else {
cout << "MST weight: " << mstWeight << "\n";
cout << "MST edges:\n";
for (auto [u, v] : mstEdges) cout << u << " - " << v << "\n";
}
return 0;
}
5.3.8 Tree Diameter
The diameter of a tree is the longest path between any two nodes (measured in number of edges, or total weight for weighted trees).
Algorithm (Two BFS/DFS approach):
- BFS/DFS from any node s. The farthest node u from s is one endpoint of a diameter.
- BFS/DFS from u. The farthest node v from u is the other endpoint.
- The distance found in step 2 is the diameter.
Why does this work? The farthest node from any node is always one endpoint of a diameter.
Two-BFS tree diameter, step by step:
flowchart LR
subgraph bfs1["1st BFS: start from arbitrary node s=1"]
direction TB
S1(["s=1\ndist=0"])
N2a(["2\ndist=1"])
N3a(["3\ndist=2"])
N4a(["4\ndist=3 ← farthest"])
S1 --> N2a --> N3a --> N4a
note1["farthest node = 4\n(one diameter endpoint)"]
end
subgraph bfs2["2nd BFS: start from endpoint u=4"]
direction TB
U(["u=4\ndist=0"])
N3b(["3\ndist=1"])
N2b(["2\ndist=2"])
N1b(["1\ndist=3"])
N5b(["5\ndist=4 ← farthest"])
U --> N3b --> N2b --> N1b --> N5b
note2["farthest node = 5\ndiameter length = 4"]
end
bfs1 -->|"restart from the farthest node"| bfs2
style N4a fill:#dbeafe,stroke:#3b82f6
style N5b fill:#dcfce7,stroke:#16a34a
style note2 fill:#dcfce7,stroke:#16a34a
💡 Correctness proof (by contradiction): Let the true diameter endpoints be p and q. Starting from any node s, the farthest node u must be p or q (otherwise the path from u to p or q would be longer than the diameter, a contradiction). The farthest node from u is then the other endpoint, and that distance is the diameter.
// Solution: Tree Diameter (Two BFS) — O(N)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<pair<int,int>> adj[MAXN]; // {neighbor, edge_weight}
// BFS from src, returns {farthest_node, farthest_distance}
pair<int,int> bfsFarthest(int src, int n) {
vector<int> dist(n + 1, -1);
queue<int> q;
dist[src] = 0;
q.push(src);
int farthest = src;
while (!q.empty()) {
int u = q.front(); q.pop();
for (auto [v, w] : adj[u]) {
if (dist[v] == -1) {
dist[v] = dist[u] + w;
q.push(v);
if (dist[v] > dist[farthest]) farthest = v;
}
}
}
return {farthest, dist[farthest]};
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
for (int i = 0; i < n - 1; i++) {
int u, v, w;
cin >> u >> v >> w;
adj[u].push_back({v, w});
adj[v].push_back({u, w});
}
// Step 1: BFS from node 1, find farthest node u
auto [u, _] = bfsFarthest(1, n);
// Step 2: BFS from u, find farthest node v and its distance
auto [v, diameter] = bfsFarthest(u, n);
cout << "Diameter: " << diameter << "\n";
cout << "Endpoints: " << u << " and " << v << "\n";
return 0;
}
Unweighted version: Set all edge weights to 1, or use a simpler BFS that counts hops.
5.3.9 Lowest Common Ancestor (LCA) — Concept
The Lowest Common Ancestor (LCA) of two nodes u and v in a rooted tree is the deepest node that is an ancestor of both u and v.
Naive LCA (O(depth) per query): Walk both nodes up to the same depth, then walk together until they meet.
// Naive LCA — O(depth) per query, depth can be O(N) worst case
int lca_naive(int u, int v, int* depth, int* parent) {
// Equalize depths
while (depth[u] > depth[v]) u = parent[u];
while (depth[v] > depth[u]) v = parent[v];
// Now same depth — walk up together
while (u != v) {
u = parent[u];
v = parent[v];
}
return u;
}
Binary Lifting LCA (O(log N) per query, O(N log N) preprocessing):
Store anc[v][k] = 2^k-th ancestor of v.
const int LOG = 17; // log2(10^5) ≈ 17
int anc[MAXN][LOG]; // anc[v][k] = 2^k-th ancestor of v
int depth_arr[MAXN];
void preprocess(int root, int n) {
// DFS to compute depths and anc[v][0] = direct parent
// Then: anc[v][k] = anc[anc[v][k-1]][k-1]
// (2^k-th ancestor = 2^(k-1)-th ancestor of 2^(k-1)-th ancestor)
for (int k = 1; k < LOG; k++)       // k OUTER: level k needs level k-1 for ALL nodes first
for (int v = 1; v <= n; v++)
anc[v][k] = anc[anc[v][k-1]][k-1];
}
int lca(int u, int v) {
if (depth_arr[u] < depth_arr[v]) swap(u, v);
int diff = depth_arr[u] - depth_arr[v];
// Lift u up by diff levels using binary lifting
for (int k = 0; k < LOG; k++)
if ((diff >> k) & 1) u = anc[u][k];
if (u == v) return u;
// Now same depth — binary lift both
for (int k = LOG - 1; k >= 0; k--)
if (anc[u][k] != anc[v][k]) {
u = anc[u][k];
v = anc[v][k];
}
return anc[u][0];
}
💡 When to use LCA: Path queries on trees (e.g., "sum of values on path from u to v"), distance queries between nodes, finding "meeting points" on tree paths.
5.3.10 Euler Tour of Tree
An Euler tour flattens a tree into a linear array, enabling range queries on subtrees using regular array data structures (e.g., segment tree).
Idea: Record entry and exit times for each node during DFS. The subtree of node u corresponds to the contiguous range [in[u], out[u]] in the Euler tour array.
// Euler Tour preprocessing (in/out timestamps)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<int> children[MAXN];
int in_time[MAXN], out_time[MAXN], timer_val = 0;
int euler_arr[MAXN]; // euler_arr[in_time[v]] = val[v]
int val[MAXN]; // value at each node
void dfs_euler(int u) {
in_time[u] = ++timer_val; // entry time
euler_arr[timer_val] = val[u]; // record value in euler array
for (int v : children[u]) {
dfs_euler(v);
}
out_time[u] = timer_val; // exit time (same as in_time for leaf)
}
int main() {
int n;
cin >> n;
for (int i = 2; i <= n; i++) {
int p; cin >> p;
children[p].push_back(i);
}
for (int i = 1; i <= n; i++) cin >> val[i];
dfs_euler(1);
// Now: subtree of node u = euler_arr[in_time[u]..out_time[u]]
// Use a segment tree or prefix sums on euler_arr for subtree queries!
// Example: sum of values in subtree of node 3
// answer = sum(euler_arr[in_time[3]..out_time[3]])
cout << "Subtree of 3 covers indices: "
<< in_time[3] << " to " << out_time[3] << "\n";
return 0;
}
Euler Tour in/out timestamps:
flowchart TD
subgraph tree["Tree structure"]
R(["1\nin=1, out=7"])
A(["2\nin=2, out=4"])
B(["5\nin=5, out=7"])
C(["3\nin=3, out=3"])
D(["4\nin=4, out=4"])
E(["6\nin=6, out=6"])
F(["7\nin=7, out=7"])
R --> A
R --> B
A --> C
A --> D
B --> E
B --> F
end
subgraph arr["Euler tour array"]
direction LR
P1["[1]\nnode 1"]
P2["[2]\nnode 2"]
P3["[3]\nnode 3"]
P4["[4]\nnode 4"]
P5["[5]\nnode 5"]
P6["[6]\nnode 6"]
P7["[7]\nnode 7"]
P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- P7
end
tree -->|"subtree of 2 = range [2,4]"| arr
style P2 fill:#dbeafe,stroke:#3b82f6
style P3 fill:#dbeafe,stroke:#3b82f6
style P4 fill:#dbeafe,stroke:#3b82f6
💡 Key property: node u's subtree corresponds to the contiguous range [in[u], out[u]] in the Euler sequence. Subtree queries become range queries; paired with a segment tree, subtree updates and queries both run in O(log N).
Chapter 5.4: Shortest Paths
Prerequisites: priority_queue, vector. Make sure you understand how BFS works before reading about Dijkstra.
Finding the shortest path between nodes is one of the most fundamental problems in graph theory. It appears in GPS navigation, network routing, game AI, and — most importantly for us — USACO problems. This chapter covers four algorithms (Dijkstra, Bellman-Ford, Floyd-Warshall, SPFA) and explains when to use each.
5.4.1 Problem Definition
Single-Source Shortest Path (SSSP)
Given a weighted graph G = (V, E) and a source node s, find the shortest distance from s to every other node.
From source A:
- dist[A] = 0
- dist[B] = 1
- dist[C] = 5
- dist[D] = 5 (A→B→D = 1+4)
- dist[E] = 8 (A→B→D→E = 1+4+3)
All-Pairs Shortest Path (APSP)
Find shortest distances between all pairs of nodes. Used when you need distances from multiple sources, or between every pair.
Why Not Just BFS?
BFS finds shortest path in unweighted graphs (each edge = distance 1). With weights:
- Some paths have many short-weight edges
- Others have few large-weight edges
- BFS ignores weights entirely → wrong answer
5.4.2 Dijkstra's Algorithm
The most important shortest path algorithm. Used in ~90% of USACO problems involving weighted shortest paths.
Core Idea: Greedy + Priority Queue
Dijkstra is a greedy algorithm:
- Maintain a set of "settled" nodes (shortest distance finalized)
- Always process the unvisited node with smallest current distance next
- When processing a node, try to relax its neighbors (update their distances if we found a shorter path)
Why greedy works: If all edge weights are non-negative, the node currently at minimum distance cannot be improved by going through any other node (all alternatives would be ≥ current distance).
Step-by-Step Trace
Start: node 0 | Initial: dist = [0, ∞, ∞, ∞, ∞]
| Step | Process Node | Relaxations | dist array | Queue |
|---|---|---|---|---|
| 1 | node 0 (dist=0) | 0→1: min(∞, 0+4)=4; 0→2: min(∞, 0+2)=2; 0→3: min(∞, 0+5)=5 | [0, 4, 2, 5, ∞] | {(2,2),(4,1),(5,3)} |
| 2 | node 2 (dist=2) | 2→3: min(5, 2+1)=3 ← improved! | [0, 4, 2, 3, ∞] | {(3,3),(4,1),(5,3_old)} |
| 3 | node 3 (dist=3) | 3→1: min(4, 3+1)=4 (no change); 3→4: min(∞, 3+3)=6 | [0, 4, 2, 3, 6] | {(4,1),(6,4),(5,3_old)} |
| 4 | node 1 (dist=4) | No relaxation possible | [0, 4, 2, 3, 6] | {(6,4)} |
| 5 | node 4 (dist=6) | Done! | [0, 4, 2, 3, 6] | {} |
Final: dist = [0, 4, 2, 3, 6]
Complete Dijkstra Implementation
// Solution: Dijkstra's Algorithm with Priority Queue — O((V+E) log V)
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<ll, int> pii; // {distance, node}; distance must be long long
const ll INF = 1e18; // use long long to avoid int overflow!
const int MAXN = 100005;
// Adjacency list: adj[u] = list of {weight, v}
vector<pii> adj[MAXN];
vector<ll> dijkstra(int src, int n) {
vector<ll> dist(n + 1, INF); // dist[i] = shortest distance to node i
dist[src] = 0;
// Min-heap: {distance, node}
// C++ priority_queue is a max-heap by default; greater<pii> flips it into
// a min-heap (an alternative trick is to push negated distances)
priority_queue<pii, vector<pii>, greater<pii>> pq;
pq.push({0, src});
while (!pq.empty()) {
auto [d, u] = pq.top(); pq.pop(); // get node with minimum distance
// KEY: Skip if we've already found a better path to u
// (outdated entry in the priority queue)
if (d > dist[u]) continue;
// Relax all neighbors of u
for (auto [w, v] : adj[u]) {
ll newDist = dist[u] + w;
if (newDist < dist[v]) {
dist[v] = newDist; // update distance
pq.push({newDist, v}); // add updated entry to queue
}
}
}
return dist;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 0; i < m; i++) {
int u, v, w;
cin >> u >> v >> w;
adj[u].push_back({w, v});
adj[v].push_back({w, u}); // undirected graph
}
int src;
cin >> src;
vector<ll> dist = dijkstra(src, n);
for (int i = 1; i <= n; i++) {
if (dist[i] == INF) cout << -1 << "\n";
else cout << dist[i] << "\n";
}
return 0;
}
Reconstructing the Shortest Path
Path reconstruction, step by step:
flowchart LR
subgraph fwd["Forward: record prev_node while Dijkstra runs"]
direction LR
S(["src=0"]) -->|"w=2"| C(["2"])
C -->|"w=1"| D(["3"])
D -->|"w=1"| B(["1"])
D -->|"w=3"| E(["4"])
note_fwd["prev[2]=0, prev[3]=2\nprev[1]=3, prev[4]=3"]
end
subgraph back["Backtrack: walk from the destination to the source"]
direction RL
E2(["4"]) -->|"prev[4]=3"| D2(["3"])
D2 -->|"prev[3]=2"| C2(["2"])
C2 -->|"prev[2]=0"| S2(["0"])
note_back["reversed path: 4→3→2→0\nafter reversing: 0→2→3→4"]
end
fwd -->|"rebuild by backtracking"| back
style note_back fill:#dcfce7,stroke:#16a34a
💡 Implementation note: prev_node[v] = u records that u is the node immediately before v on the shortest path to v. To rebuild the path, follow prev_node from the destination back to the source, then reverse the result.
// Solution: Dijkstra with Path Reconstruction
vector<int> prev_node(MAXN, -1); // prev_node[v] = previous node on shortest path to v
vector<ll> dijkstraWithPath(int src, int n) {
vector<ll> dist(n + 1, INF);
dist[src] = 0;
priority_queue<pii, vector<pii>, greater<pii>> pq;
pq.push({0, src});
while (!pq.empty()) {
auto [d, u] = pq.top(); pq.pop();
if (d > dist[u]) continue;
for (auto [w, v] : adj[u]) {
if (dist[u] + w < dist[v]) {
dist[v] = dist[u] + w;
prev_node[v] = u; // track where we came from
pq.push({dist[v], v});
}
}
}
return dist;
}
// Reconstruct path from src to dst
vector<int> getPath(int src, int dst) {
vector<int> path;
for (int v = dst; v != -1; v = prev_node[v]) {
path.push_back(v);
}
reverse(path.begin(), path.end());
return path;
}
A common pitfall is omitting the stale-entry check:
// BAD: Processes stale entries in the queue
while (!pq.empty()) {
auto [d, u] = pq.top(); pq.pop();
// NO CHECK for d > dist[u]!
// Will re-process nodes with outdated distances.
// Still correct, but wastes time on redundant relaxations
for (auto [w, v] : adj[u]) {
if (d + w < dist[v]) {
dist[v] = d + w;
pq.push({dist[v], v});
}
}
}
// GOOD: Skip outdated priority queue entries
while (!pq.empty()) {
auto [d, u] = pq.top(); pq.pop();
if (d > dist[u]) continue; // ← stale entry, skip!
for (auto [w, v] : adj[u]) {
if (dist[u] + w < dist[v]) {
dist[v] = dist[u] + w;
pq.push({dist[v], v});
}
}
}
Key Points for Dijkstra
🚫 CRITICAL: Dijkstra does NOT work with negative edge weights! If any edge weight is negative, Dijkstra may produce incorrect results. The algorithm's correctness relies on the greedy assumption that once a node is settled (popped from the priority queue), its distance is final — negative edges break this assumption. For graphs with negative weights, use Bellman-Ford or SPFA instead.
- Only works with non-negative weights. Negative edges break the greedy assumption (see warning above).
- Use
long longfor distances when edge weights can be large.dist[u] + wcan overflowint. - Use
greater<pii>to makepriority_queuea min-heap. - The
if (d > dist[u]) continue;check is essential for correctness and performance.
5.4.3 Bellman-Ford Algorithm
When edges can have negative weights, Dijkstra fails. Bellman-Ford handles negative weights — and even detects negative cycles.
Core Idea: Relaxation V-1 Times
Key insight: any shortest path in a graph with V nodes uses at most V-1 edges (no repeated nodes). So if we relax ALL edges V-1 times, we're guaranteed to find the correct shortest paths.
Algorithm:
1. dist[src] = 0, dist[all others] = INF
2. Repeat V-1 times:
For every edge (u, v, w):
if dist[u] + w < dist[v]:
dist[v] = dist[u] + w (relax!)
3. Check for negative cycles:
If ANY edge can still be relaxed → negative cycle exists!
Bellman-Ford relaxation process (4 nodes, A through D):
flowchart LR
subgraph iter0["Initial state"]
direction LR
A0(["A\ndist=0"])
B0(["B\ndist=∞"])
C0(["C\ndist=∞"])
D0(["D\ndist=∞"])
end
subgraph iter1["Round 1 of relaxation"]
direction LR
A1(["A\ndist=0"])
B1(["B\ndist=2"])
C1(["C\ndist=∞→5"])
D1(["D\ndist=∞"])
end
subgraph iter2["Round 2"]
direction LR
A2(["A\ndist=0"])
B2(["B\ndist=2"])
C2(["C\ndist=5→4"])
D2(["D\ndist=∞→7"])
end
subgraph iter3["Round 3 (converged)"]
direction LR
A3(["A\ndist=0"])
B3(["B\ndist=2"])
C3(["C\ndist=4"])
D3(["D\ndist=7→6"])
end
iter0 -->|"relax edges A→B (w=2), A→C (w=5)"| iter1
iter1 -->|"relax edges B→C (w=2), C→D (w=3)"| iter2
iter2 -->|"relax edge B→D (w=4)"| iter3
💡 Key observation: after each round of relaxation, at least one more node's shortest distance is finalized, so after V-1 rounds every node's shortest distance is final (assuming no negative cycle).
Bellman-Ford Implementation
// Solution: Bellman-Ford Algorithm — O(V * E)
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef tuple<int, int, int> Edge; // {from, to, weight}
const ll INF = 1e18;
// Returns shortest distances, or empty if negative cycle detected
vector<ll> bellmanFord(int src, int n, vector<Edge>& edges) {
vector<ll> dist(n + 1, INF);
dist[src] = 0;
// Relax all edges V-1 times
for (int iter = 0; iter < n - 1; iter++) {
bool updated = false;
for (auto [u, v, w] : edges) {
if (dist[u] != INF && dist[u] + w < dist[v]) {
dist[v] = dist[u] + w;
updated = true;
}
}
if (!updated) break; // early termination: already converged
}
// Check for negative cycles (one more relaxation pass)
for (auto [u, v, w] : edges) {
if (dist[u] != INF && dist[u] + w < dist[v]) {
// Negative cycle reachable from source!
return {}; // signal: negative cycle exists
}
}
return dist;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
vector<Edge> edges;
for (int i = 0; i < m; i++) {
int u, v, w;
cin >> u >> v >> w;
edges.push_back({u, v, w});
// For undirected: also add {v, u, w}
}
int src;
cin >> src;
vector<ll> dist = bellmanFord(src, n, edges);
if (dist.empty()) {
cout << "Negative cycle detected!\n";
} else {
for (int i = 1; i <= n; i++) {
cout << (dist[i] == INF ? -1 : dist[i]) << "\n";
}
}
return 0;
}
Why Bellman-Ford Works
After k iterations of the outer loop, dist[v] contains the shortest path from src to v using at most k edges. After V-1 iterations, all shortest paths (which use at most V-1 edges in a cycle-free graph) are found.
Negative Cycle Detection: A negative cycle means you can keep decreasing distance indefinitely. If the V-th relaxation still improves a distance, that node is on or reachable from a negative cycle.
5.4.4 Floyd-Warshall Algorithm
For finding shortest paths between all pairs of nodes.
Core Idea: DP Through Intermediate Nodes
dp[k][i][j] = shortest distance from i to j using only nodes {1, 2, ..., k} as intermediate nodes.
Recurrence:
dp[k][i][j] = min(dp[k-1][i][j], // don't use node k
dp[k-1][i][k] + dp[k-1][k][j]) // use node k
Since we only need the previous layer, we can collapse to 2D:
// Solution: Floyd-Warshall All-Pairs Shortest Path — O(V^3)
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll INF = 1e18;
const int MAXV = 505;
ll dist[MAXV][MAXV]; // dist[i][j] = shortest distance from i to j
void floydWarshall(int n) {
// ⚠️ CRITICAL: k MUST be the OUTERMOST loop!
// Invariant: after processing k, dist[i][j] = shortest path from i to j
// using only nodes {1..k} as intermediates.
// If k were inner, dist[i][k] or dist[k][j] might not yet reflect all
// intermediate nodes up to k-1, breaking the DP correctness.
for (int k = 1; k <= n; k++) { // ← OUTER: intermediate node
for (int i = 1; i <= n; i++) { // ← MIDDLE: source
for (int j = 1; j <= n; j++) { // ← INNER: destination
// Can we go i→k→j faster than i→j directly?
if (dist[i][k] != INF && dist[k][j] != INF) {
dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]);
}
}
}
}
// After Floyd-Warshall, dist[i][i] < 0 iff node i is on a negative cycle
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
// Initialize: distance to self = 0, all others = INF
for (int i = 1; i <= n; i++)
for (int j = 1; j <= n; j++)
dist[i][j] = (i == j) ? 0 : INF;
// Read edges
for (int i = 0; i < m; i++) {
int u, v; ll w;
cin >> u >> v >> w;
dist[u][v] = min(dist[u][v], w); // handle multiple edges
dist[v][u] = min(dist[v][u], w); // undirected
}
floydWarshall(n);
// Query: shortest path from u to v
int q; cin >> q;
while (q--) {
int u, v; cin >> u >> v;
cout << (dist[u][v] == INF ? -1 : dist[u][v]) << "\n";
}
return 0;
}
Floyd-Warshall Complexity
- Time:
O(V³)— three nested loops, each running V times - Space:
O(V²)— the 2D distance array - Practical limit: V ≤ 500 or so (500³ = 1.25 × 10⁸ is borderline)
- For V > 1000, use Dijkstra from each source:
O(V × (V+E) log V)
Floyd-Warshall DP transition:
flowchart LR
subgraph before["Before introducing node k"]
i1([i]) -->|"dist[i][j]"| j1([j])
end
subgraph after["After introducing node k"]
i2([i]) -->|"dist[i][k]"| k2([k])
k2 -->|"dist[k][j]"| j2([j])
i2 -.->|"min(dist[i][j],\ndist[i][k]+dist[k][j])"| j2
end
before -->|"k as an intermediate node"| after
💡 Why must k be the outermost loop? When node k is processed as an intermediate, dist[i][k] and dist[k][j] must already be fully computed over intermediates {1..k-1}. If k were an inner loop, those values could still change within the same round, breaking the DP invariant.
5.4.5 Algorithm Comparison Table
| Algorithm | Time Complexity | Negative Edges | Negative Cycles | Multi-Source | Best For |
|---|---|---|---|---|---|
| BFS | O(V + E) | ✗ No | ✗ No | ✓ Yes (multi-source BFS) | Unweighted graphs |
| Dijkstra | O((V+E) log V) | ✗ No | ✗ No | ✗ (run once per source) | Weighted, non-negative edges |
| Bellman-Ford | O(V × E) | ✓ Yes | ✓ Detects | ✗ | Negative edges, detecting neg cycles |
| SPFA | O(V × E) worst, O(E) avg | ✓ Yes | ✓ Detects | ✗ | Sparse graphs with neg edges |
| Floyd-Warshall | O(V³) | ✓ Yes | ✓ Detects (diag) | ✓ Yes (all pairs) | Dense graphs, all-pairs queries |
When to Use Which?
flowchart TD
Start(["Any negative edge weights?"])
Start -->|"Yes"| NegEdge["Bellman-Ford or SPFA\nor Floyd-Warshall (all pairs)"]
Start -->|"No"| NoNeg["V ≤ 500 and need all-pairs distances?"]
NoNeg -->|"Yes"| Floyd["Floyd-Warshall\nO(V³)"]
NoNeg -->|"No"| Unweighted["Unweighted graph (all weights = 1)?"]
Unweighted -->|"Yes"| BFS["BFS\nO(V+E)"]
Unweighted -->|"No"| ZeroOne["Edge weights only 0 or 1?"]
ZeroOne -->|"Yes"| BFS01["0-1 BFS\nO(V+E)"]
ZeroOne -->|"No"| Dijkstra["Dijkstra\nO((V+E) log V)"]
style NegEdge fill:#fef3c7,stroke:#d97706
style Floyd fill:#dbeafe,stroke:#3b82f6
style BFS fill:#dcfce7,stroke:#16a34a
style BFS01 fill:#dcfce7,stroke:#16a34a
style Dijkstra fill:#f0f4ff,stroke:#4A6CF7
5.4.6 SPFA — Bellman-Ford with Queue Optimization
SPFA (Shortest Path Faster Algorithm) is an optimized Bellman-Ford that only adds a node to the queue when its distance is updated, avoiding redundant relaxations.
⚠️ SPFA Worst Case: SPFA's worst-case time complexity is O(V × E) — identical to plain Bellman-Ford. On adversarially constructed graphs (common in competitive programming "anti-SPFA" test cases), SPFA degrades to O(VE) and may TLE. A node can enter the queue up to V times; with E edges processed per queue entry, the total is O(VE). In most random/practical cases it's fast (O(E) average), but for USACO, prefer Dijkstra when all weights are non-negative.
// Solution: SPFA (Bellman-Ford + Queue Optimization)
#include <bits/stdc++.h>
using namespace std;
typedef pair<int,int> pii;
typedef long long ll;
const ll INF = 1e18;
const int MAXN = 100005;
vector<pii> adj[MAXN];
vector<ll> spfa(int src, int n) {
vector<ll> dist(n + 1, INF);
vector<bool> inQueue(n + 1, false);
vector<int> cnt(n + 1, 0); // cnt[v] = number of times v entered queue
queue<int> q;
dist[src] = 0;
q.push(src);
inQueue[src] = true;
while (!q.empty()) {
int u = q.front(); q.pop();
inQueue[u] = false;
for (auto [w, v] : adj[u]) {
if (dist[u] + w < dist[v]) {
dist[v] = dist[u] + w;
if (!inQueue[v]) {
q.push(v);
inQueue[v] = true;
cnt[v]++;
// Negative cycle detection: if a node enters queue >= n times
// (a node can enter at most n-1 times without a neg cycle;
// using > n is also safe but detects one step later)
if (cnt[v] >= n) return {}; // negative cycle!
}
}
}
}
return dist;
}
5.4.7 BFS as Dijkstra for Unweighted Graphs
When all edge weights are 1 (unweighted graph), BFS is exactly Dijkstra with a simple queue:
- Dijkstra's priority queue naturally processes nodes in order of distance
- In an unweighted graph, all edges have weight 1, so nodes at distance d are processed before distance d+1
- BFS naturally explores level-by-level, which is exactly "by distance"
// Solution: BFS for Unweighted Shortest Path — O(V + E)
// Equivalent to Dijkstra when all weights = 1
vector<int> bfsShortestPath(int src, int n) {
vector<int> dist(n + 1, -1);
queue<int> q;
dist[src] = 0;
q.push(src);
while (!q.empty()) {
int u = q.front(); q.pop();
for (auto [w, v] : adj[u]) {
if (dist[v] == -1) { // unvisited
dist[v] = dist[u] + 1; // all weights = 1
q.push(v);
}
}
}
return dist;
}
Why is BFS correct for unweighted graphs?
Because BFS explores nodes in strictly increasing order of their distance. The first time you reach a node v, you've found the shortest path (fewest edges = minimum distance when all weights are 1).
0-1 BFS: A powerful trick when edge weights are only 0 or 1 (use deque instead of queue):
0-1 BFS deque operations:
flowchart LR
subgraph dq["Deque state"]
direction LR
Front[["front (smaller distances)"]] --- M1["..."] --- M2["..."] --- Back[["back (larger distances)"]]
end
subgraph rule["Enqueue rules"]
direction TB
W0["edge weight w=0\n→ push_front (add at front)"]
W1["edge weight w=1\n→ push_back (add at back)"]
end
subgraph why["Why is this correct?"]
direction TB
Exp["the front always holds a node at the current minimum distance\nw=0 edges add no distance: same priority as the current node\nw=1 edges add distance: they wait at the back"]
end
rule --> dq
dq --> why
style W0 fill:#dbeafe,stroke:#3b82f6
style W1 fill:#fef3c7,stroke:#d97706
💡 Efficiency: 0-1 BFS runs in O(V+E), beating Dijkstra's O((V+E) log V). When edge weights are only 0 and 1, prefer it.
// Solution: 0-1 BFS — O(V + E), handles {0,1} weight edges
vector<int> bfs01(int src, int n) {
vector<int> dist(n + 1, INT_MAX);
deque<int> dq;
dist[src] = 0;
dq.push_front(src);
while (!dq.empty()) {
int u = dq.front(); dq.pop_front();
for (auto [w, v] : adj[u]) {
if (dist[u] + w < dist[v]) {
dist[v] = dist[u] + w;
if (w == 0) dq.push_front(v); // 0-weight: add to front
else dq.push_back(v); // 1-weight: add to back
}
}
}
return dist;
}
5.4.8 USACO Example: Farm Tours
Problem Statement (USACO 2003 Style)
Farmer John wants to take a round trip: travel from farm 1 to farm N, then return from N to farm 1, using no road twice. Roads are bidirectional. Find the minimum total distance of such a round trip.
Constraints: N ≤ 1000, M ≤ 10,000, weights ≤ 1000.
Input Format:
N M
u1 v1 w1
u2 v2 w2
...
Analysis:
- We must travel 1→N and then N→1 without using any road twice
- Key insight: since roads are bidirectional, the return trip N→1 is itself just another path from 1 to N — so the task is equivalent to finding two edge-disjoint paths from 1 to N with minimum total cost
- A tempting greedy: find the shortest path 1→N, delete its edges, then find the shortest path N→1 on what remains. This can fail — removing the first path's edges may destroy the only cheap second path
- The exact "no road reused" version requires min-cost flow (a Gold+ topic, covered in the extension below). For Bronze/Silver, a relaxed version that allows reuse is solved by two Dijkstra runs, and many USACO problems simplify to exactly that
// Solution: Farm Tours — two Dijkstra runs (relaxed version: roads may be reused)
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<ll, int> pli;
const ll INF = 1e18;
const int MAXN = 1005;
vector<pair<int,int>> adj[MAXN]; // {weight, dest}
vector<ll> dijkstra(int src, int n) {
    vector<ll> dist(n + 1, INF);
    priority_queue<pli, vector<pli>, greater<pli>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;  // skip stale entries
        for (auto [w, v] : adj[u]) {
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}
int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int n, m;
    cin >> n >> m;
    for (int i = 0; i < m; i++) {
        int u, v, w;
        cin >> u >> v >> w;
        adj[u].push_back({w, v});
        adj[v].push_back({w, u}); // bidirectional
    }
    // Run Dijkstra from farm 1 and farm N
    vector<ll> distFrom1 = dijkstra(1, n);
    vector<ll> distFromN = dijkstra(n, n);
    // Relaxed answer: shortest 1→N plus shortest N→1 (these may share edges)
    ll answer = distFrom1[n] + distFromN[1];
    if (answer >= INF) cout << "NO VALID TRIP\n";
    else cout << answer << "\n";
    // For the true "no road reuse" constraint, see the min-cost flow extension below
    return 0;
}
💡 Extended: Finding Two Edge-Disjoint Paths
The true "no road reuse" version requires min-cost flow (a Gold+ topic). The key insight is:
- Model each undirected edge as two directed edges with capacity 1
- Find min-cost flow of 2 units from node 1 to node N
- This equals two edge-disjoint paths with minimum total cost
For USACO Silver, you'll rarely need min-cost flow — the simpler Dijkstra approach suffices.
5.4.9 Dijkstra on Grids
Many USACO problems involve grid-based shortest paths. The graph is implicit:
// Solution: Dijkstra on Grid — find shortest path from (0,0) to (R-1,C-1)
// Each cell has a "cost" to enter
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef tuple<ll,int,int> tli;
const ll INF = 1e18;
int dx[] = {0,0,1,-1};
int dy[] = {1,-1,0,0};
ll dijkstraGrid(vector<vector<int>>& grid) {
    int R = grid.size(), C = grid[0].size();
    vector<vector<ll>> dist(R, vector<ll>(C, INF));
    priority_queue<tli, vector<tli>, greater<tli>> pq;
    dist[0][0] = grid[0][0];
    pq.push({grid[0][0], 0, 0});
    while (!pq.empty()) {
        auto [d, r, c] = pq.top(); pq.pop();
        if (d > dist[r][c]) continue;
        for (int k = 0; k < 4; k++) {
            int nr = r + dx[k], nc = c + dy[k];
            if (nr < 0 || nr >= R || nc < 0 || nc >= C) continue;
            ll newDist = dist[r][c] + grid[nr][nc];
            if (newDist < dist[nr][nc]) {
                dist[nr][nc] = newDist;
                pq.push({newDist, nr, nc});
            }
        }
    }
    return dist[R-1][C-1];
}
⚠️ Common Mistakes — The Dirty Five
// BAD: int distances — path sums overflow
vector<int> dist(n+1, 1e9);
// A path of 10^5 edges × weight 10^9 sums to 10^14,
// far beyond INT_MAX (~2.1×10^9); dist[u] + w can
// wrap to a negative number and silently corrupt comparisons
if (dist[u] + w < dist[v]) { ... }
// GOOD: always use long long for distances
const ll INF = 1e18;
vector<ll> dist(n+1, INF);
// No overflow with long long (max ~9.2×10^18)
if (dist[u] + w < dist[v]) { ... }
// BAD: This is a MAX-heap, not min-heap!
priority_queue<pii> pq; // default is max-heap
pq.push({dist[v], v});
// Will process FARTHEST node first — wrong!
// GOOD: explicitly specify min-heap
priority_queue<pii, vector<pii>, greater<pii>> pq;
pq.push({dist[v], v});
// Now processes NEAREST node first ✓
5 Classic Dijkstra Bugs:
- Using `int` instead of `long long` — distance sums overflow → silently wrong answers
- Max-heap instead of min-heap — forgetting `greater<pii>` → processes the wrong node first
- Missing stale entry check (`if (d > dist[u]) continue`) → not wrong, but ~10× slower
- Forgetting `dist[src] = 0` — all distances remain INF
- Using Dijkstra with negative edges — may loop excessively or give wrong answers
Chapter Summary
📌 Key Takeaways
| Algorithm | Complexity | Handles Neg | Use When |
|---|---|---|---|
| BFS | O(V+E) | ✗ | Unweighted graphs |
| Dijkstra | O((V+E) log V) | ✗ | Non-negative weighted SSSP |
| Bellman-Ford | O(VE) | ✓ | Negative edges, detect neg cycles |
| SPFA | O(VE) worst, fast avg | ✓ | Sparse graphs, neg edges |
| Floyd-Warshall | O(V³) | ✓ | All-pairs, V ≤ 500 |
| 0-1 BFS | O(V+E) | N/A | Edges with weight 0 or 1 only |
❓ FAQ
Q1: Why can't Dijkstra handle negative edges?
A: Dijkstra's greedy assumption is "the node with the current shortest distance cannot be improved by later paths." With negative edges, this assumption fails—a longer path through a negative edge may end up shorter.
Concrete counterexample: Nodes A, B, C. Edges: A→B=2, A→C=1, B→C=−2.
- Dijkstra processes A first (dist=0), relaxing dist[B]=2, dist[C]=1
- It then pops C (dist=1, the current minimum) and settles it: dist[C]=1 is now "final"
- Only afterwards does it process B (dist=2) and discover the path A→B→C of total weight 2+(−2)=0, shorter than the already-settled dist[C]=1. The reported answer 1 is wrong; the true shortest distance is 0.
General explanation: When node u is popped and settled, Dijkstra treats `dist[u]` as optimal. A negative edge can later produce a cheaper path into an already-settled node, violating that assumption.
Conclusion: With negative edges, you must use Bellman-Ford (`O(VE)`) or SPFA (average `O(E)`, worst `O(VE)`).
Q2: What is the difference between SPFA and Bellman-Ford?
A: SPFA is a queue-optimized version of Bellman-Ford. Bellman-Ford relaxes all edges each round; SPFA only re-examines neighbors of nodes whose distance just improved, using a queue to track which nodes need processing. In practice SPFA is much faster (average `O(E)`), but the theoretical worst case is the same (`O(VE)`). On some contest platforms SPFA can be hacked to its worst case, so with negative edges consider Bellman-Ford; without negative edges, always use Dijkstra.
Q3: Why must the k loop be the outermost in Floyd-Warshall?
A: This is the most common Floyd-Warshall implementation error! The DP invariant is: after the k-th outer loop iteration, `dist[i][j]` represents the shortest path from i to j using only nodes {1, 2, ..., k} as intermediates. When processing intermediate node k, `dist[i][k]` and `dist[k][j]` must already be fully computed based on {1..k−1}. If k is in an inner loop, `dist[i][k]` may have just been updated in the same outer loop iteration, leading to incorrect results. Remember: k is outermost, i and j are inner — order matters!
Q4: How to determine whether a USACO problem needs Dijkstra or BFS?
A: Key question: Are edges weighted?
- Unweighted graph (all edge weights = 1, or "find the minimum number of edges") → BFS, `O(V+E)`, faster and simpler code
- Weighted graph (differing non-negative weights) → Dijkstra
- Edge weights only 0 or 1 → 0-1 BFS (faster than Dijkstra, `O(V+E)`)
- Has negative edges → Bellman-Ford/SPFA
Q5: When to use Floyd-Warshall?
A: When you need shortest distances between all pairs and V ≤ 500 (since `O(V³)` ≈ 1.25×10⁸ is barely feasible at V=500). Typical scenario: given multiple sources and targets, query the distance between any pair. For V > 500, running Dijkstra once per node (`O(V × (V+E) log V)`) is faster.
🔗 Connections to Other Chapters
- Chapter 5.2 (BFS & DFS): BFS is "Dijkstra for unweighted graphs"; this chapter is a direct extension of BFS
- Chapter 3.11 (Binary Trees): Dijkstra's priority queue is a binary heap; understanding heaps helps analyze complexity
- Chapter 5.3 (Trees & Special Graphs): Shortest path on a tree is the unique root-to-node path (DFS/BFS suffices)
- Chapter 6.1 (DP Introduction): Floyd-Warshall is essentially DP (state = "using first k nodes"); many shortest path variants can be modeled with DP
- USACO Gold: Shortest path + DP combinations (e.g., DP on shortest path DAG), shortest path + binary search, shortest path + data structure optimization
Practice Problems
Problem 5.4.1 — Classic Dijkstra 🟢 Easy Given N cities and M roads with travel time, find the shortest travel time from city 1 to city N. If unreachable, output -1. (N ≤ 10^5, M ≤ 5×10^5, weights ≤ 10^9)
Hint
Standard Dijkstra. Use `long long` for distances (max path ≤ N × max_weight = 10^5 × 10^9 = 10^14). Initialize `dist[1] = 0`, all others INF.
Problem 5.4.2 — BFS on Grid 🟢 Easy A robot is on an R×C grid. Some cells are walls. Find the shortest path (in steps) from top-left to bottom-right. Output -1 if impossible.
Hint
Use BFS. Each step moves to an adjacent (4-directional) non-wall cell. Distance = number of steps = number of edges, so BFS applies.
Problem 5.4.3 — Negative Edge Detection 🟡 Medium Given a directed graph with possibly negative edge weights, determine:
- The shortest distance from node 1 to node N
- Whether any negative cycle exists that is reachable from node 1
Hint
Use Bellman-Ford. Run V-1 relaxation iterations. Then do one more: if any distance improves, there's a negative cycle. Report the distance (it may be -INF if a negative cycle can reach node N).
Problem 5.4.4 — Multi-Source BFS 🟡 Medium A zombie outbreak starts at K infected cities. Find the minimum time for zombies to reach each city (spread 1 city per time unit via roads).
How multi-source BFS spreads:
flowchart LR
subgraph t0["Start: all K sources enqueued at once"]
direction TB
Z1(["Z1\nt=0"])
Z2(["Z2\nt=0"])
Z3(["Z3\nt=0"])
note0["queue = [Z1, Z2, Z3]"]
end
subgraph t1["Round 1: spread outward"]
direction TB
A1(["Z1\nt=0"]) --- B1(["A\nt=1"])
C1(["Z2\nt=0"]) --- D1(["B\nt=1"])
E1(["Z3\nt=0"]) --- F1(["C\nt=1"])
end
subgraph t2["Round 2: keep spreading"]
direction TB
G2(["A\nt=1"]) --- H2(["D\nt=2"])
I2(["B\nt=1"]) --- J2(["E\nt=2"])
note2["visited nodes are not re-enqueued"]
end
t0 --> t1 --> t2
style Z1 fill:#fef2f2,stroke:#dc2626
style Z2 fill:#fef2f2,stroke:#dc2626
style Z3 fill:#fef2f2,stroke:#dc2626
💡 Equivalent view: multi-source BFS = add a virtual start node S connected with distance 0 to all K sources, then run single-source BFS. Enqueuing all sources at once implements exactly this idea.
Hint
Multi-source BFS: initialize the queue with all K infected cities at time 0. Run BFS normally. This is equivalent to adding a virtual "source" node connected to all K cities with weight 0.
Problem 5.4.5 — All-Pairs with Floyd 🟡 Medium Given N cities (N ≤ 300) and M roads, answer Q queries: "Is city u reachable from city v within distance D?"
Hint
Run Floyd-Warshall to get all-pairs shortest paths in `O(N³)`. Each query is then `O(1)`: check `dist[u][v] <= D`.
Problem 5.4.6 — Dijkstra + Binary Search 🔴 Hard A delivery drone can carry a maximum weight of W. There are N cities connected by roads, each road has a weight limit. Find the path from city 1 to city N that maximizes the minimum weight limit along the path (i.e., the heaviest cargo the drone can carry).
Hint
This is "Maximum Bottleneck Path" — find the path where the minimum edge weight is maximized. Two approaches: (1) Binary search on the answer W, then check if a path exists using only edges with weight ≥ W. (2) Run a modified Dijkstra where `dist[v]` = maximum minimum edge weight on any path to v. Use a max-heap, update: `dist[v] = max(dist[v], min(dist[u], weight(u,v)))`.
End of Chapter 5.4 — Next: Chapter 6.1: Introduction to DP
🧠 Part 6: Dynamic Programming
The most powerful and most feared topic in competitive programming. Master memoization, tabulation, and classic DP patterns for USACO Silver.
📚 3 Chapters · ⏱️ Estimated 3-4 weeks · 🎯 Target: Reach USACO Silver level
Part 6: Dynamic Programming
Estimated time: 3–4 weeks
Dynamic programming is the most powerful and most feared topic in competitive programming. Once you master it, you'll be able to solve problems that seem impossible by brute force. Take your time with this part — it's worth it.
What Topics Are Covered
| Chapter | Topic | The Big Idea |
|---|---|---|
| Chapter 6.1 | Introduction to DP | Memoization, tabulation, the DP recipe |
| Chapter 6.2 | Classic DP Problems | LIS, 0/1 Knapsack, grid path counting |
| Chapter 6.3 | Advanced DP Patterns | Bitmask DP, interval DP, tree DP |
What You'll Be Able to Solve After This Part
After completing Part 6, you'll be ready to tackle:
- USACO Bronze:
  - Simple counting problems (how many ways to do X?)
  - Basic optimization (minimum cost to do Y?)
- USACO Silver:
  - Longest increasing subsequence (and variants)
  - Knapsack-style resource allocation
  - Grid path problems (max value path, count paths)
  - 1D DP with careful state definition (Hoof-Paper-Scissors, etc.)
- DP on intervals or trees (Chapter 6.3)
Key DP Patterns to Master
| Pattern | Chapter | Example Problem |
|---|---|---|
| 1D DP (sequential) | 6.1 | Fibonacci, climbing stairs |
| 1D DP (optimization) | 6.1 | Coin change (minimum coins) |
| 1D DP (counting) | 6.1 | Coin change (number of ways) |
| 2D DP | 6.2 | 0/1 Knapsack, grid paths |
| LIS (O(N²)) | 6.2 | Longest increasing subsequence |
| LIS (O(N log N)) | 6.2 | Fast LIS with binary search |
| Bitmask DP | 6.3 | TSP, assignment problem |
| Interval DP | 6.3 | Matrix chain multiplication |
| Tree DP | 6.3 | Independent set on trees |
Prerequisites
Before starting Part 6, make sure you can:
- Write recursive functions and understand the call stack (Chapter 2.3)
- Use 2D vectors comfortably (Chapter 2.3)
- Understand binary search (Chapter 3.3) — needed for O(N log N) LIS
- Solve basic BFS problems (Chapter 5.2) — DP and BFS share "state space exploration" intuition
The DP Mindset
DP is not about memorizing formulas — it's about asking the right questions:
- What is the "state"? What information do I need to describe a subproblem?
- What is the "transition"? How does the answer to a bigger state depend on smaller states?
- What are the base cases? What are the simplest subproblems with known answers?
- What order do I fill the table? Dependencies must be computed before they're used.
💡 Key Insight: If you find yourself writing the same computation multiple times in a recursive solution, DP is the fix. Cache the result the first time, reuse it every subsequent time.
Tips for This Part
- Start with Chapter 6.1 carefully. Don't rush to knapsack before you truly understand Fibonacci DP. The "why" of DP is more important than the "what."
- Write both memoization and tabulation for the same problem. Converting between them deepens understanding.
- Chapter 6.2's LIS has two implementations: O(N²) (easy to understand) and O(N log N) (fast, needed for large N). Learn both.
- Chapter 6.3 is Silver/Gold level. If you're targeting Bronze, you can skip Chapter 6.3 initially and return to it later.
- Most DP bugs come from wrong initialization. For min-cost problems, initialize to `INF`, not 0. For counting problems, initialize the base case to 1, not 0.
⚠️ Warning: The #1 DP bug: forgetting to check `dp[w-c] != INF` before using it in a minimization DP. `INF + 1` overflows (with `INT_MAX`) or silently exceeds `INF`, breaking the `dp[W] == INF` reachability check!
The #2 DP bug: wrong loop order for 0/1 knapsack vs. unbounded knapsack. Backward iteration = each item used at most once. Forward iteration = unlimited use.
Chapter 6.1: Introduction to Dynamic Programming
📝 Before You Continue: Make sure you understand recursion (Chapter 2.3), arrays/vectors (Chapters 2.3–3.1), and basic loop patterns (Chapter 2.2). DP builds directly on recursion concepts.
Dynamic programming (DP) is often described as "clever recursion with memory." Let's build up this intuition from scratch, starting with the simplest possible example: Fibonacci numbers.
💡 Key Insight: DP solves problems with two properties:
- Overlapping subproblems — the same sub-computation appears many times
- Optimal substructure — the optimal solution to a big problem can be built from optimal solutions to smaller problems
When both are true, DP transforms exponential time into polynomial time.
6.1.1 The Problem with Naive Recursion
The Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...
Definition: F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2) for n ≥ 2.
Visual: Fibonacci Recursion Tree and Memoization
The recursion tree for fib(5) exposes the problem: fib(3) is computed twice (red nodes). Memoization caches each result the first time it's computed, reducing 2^N calls to just N unique calls — the fundamental insight behind dynamic programming.
The static diagram above shows how memoization eliminates redundant computations: each unique subproblem is solved only once and its result is cached for future lookups.
The naïve recursive implementation:
int fib(int n) {
    if (n == 0) return 0;
    if (n == 1) return 1;
    return fib(n-1) + fib(n-2); // two recursive calls
}
This is correct, but devastatingly slow. Let's see why:
fib(5)
├── fib(4)
│ ├── fib(3)
│ │ ├── fib(2)
│ │ │ ├── fib(1) = 1
│ │ │ └── fib(0) = 0
│ │ └── fib(1) = 1
│ └── fib(2) ← COMPUTED AGAIN!
│ ├── fib(1) = 1
│ └── fib(0) = 0
└── fib(3) ← COMPUTED AGAIN!
├── fib(2) ← COMPUTED AGAIN!
│ ├── fib(1) = 1
│ └── fib(0) = 0
└── fib(1) = 1
fib(3) is computed twice. fib(2) three times. For fib(50), the number of calls exceeds 10^10. This is exponential time: O(2^n).
The core insight: we're recomputing the same subproblems over and over. DP fixes this.
6.1.2 Memoization (Top-Down DP)
Memoization = recursion + cache. Before computing, check if we've already computed this value. If yes, return the cached result. If no, compute it, cache it, return it.
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100;
long long memo[MAXN];   // memo[n] = F(n), valid only when computed[n] is true
bool computed[MAXN];    // track which values are computed
long long fib_memo(int n) {
    if (n == 0) return 0;
    if (n == 1) return 1;
    if (computed[n]) return memo[n];          // already computed? return cached value
    memo[n] = fib_memo(n-1) + fib_memo(n-2);  // compute and cache
    computed[n] = true;
    return memo[n];
}
int main() {
    memset(computed, false, sizeof(computed)); // initialize cache as empty
    for (int i = 0; i <= 20; i++) {
        cout << "F(" << i << ") = " << fib_memo(i) << "\n";
    }
    return 0;
}
Or using -1 as the sentinel:
📝 Note: Below is an alternative, equivalent version of `fib_memo` above. The differences: ① it uses `-1` as the "not yet computed" sentinel, removing the separate `computed[]` array; ② the function is renamed `fib` and the code is more concise. The two versions are functionally identical — just don't mix snippets from both (each has its own global `memo` array).
// Version 2: -1 sentinel (equivalent to fib_memo above, more concise)
const int MAXN = 100;
long long memo[MAXN];
long long fib(int n) {
    if (n <= 1) return n;
    if (memo[n] != -1) return memo[n];
    return memo[n] = fib(n-1) + fib(n-2);
}
int main() {
    fill(memo, memo + MAXN, -1LL); // mark all values as "not yet computed"
    cout << fib(50) << "\n"; // 12586269025
    return 0;
}
Now each value is computed exactly once. Time complexity: O(N). 🎉
6.1.3 Tabulation (Bottom-Up DP)
Tabulation builds the answer from the ground up — compute small subproblems first, use them to compute larger ones.
#include <bits/stdc++.h>
using namespace std;
int main() {
    int n = 50;
    vector<long long> dp(n + 1);
    // Base cases
    dp[0] = 0;
    dp[1] = 1;
    // Fill the table bottom-up
    for (int i = 2; i <= n; i++) {
        dp[i] = dp[i-1] + dp[i-2]; // use already-computed values
    }
    cout << dp[n] << "\n"; // 12586269025
    return 0;
}
We can even optimize space: since each Fibonacci number only depends on the previous two, we only need O(1) space:
long long a = 0, b = 1; // F(0), F(1)
for (int i = 2; i <= n; i++) {
    long long c = a + b;
    a = b;
    b = c;
}
cout << b << "\n";
Memoization vs. Tabulation
Execution paths compared (using fib(4)):
flowchart LR
subgraph topdown["🔽 Top-Down (memoized recursion)"]
direction TB
F4a(["fib(4)"])
F3a(["fib(3)"])
F2a(["fib(2)"])
F1a(["fib(1)=1"])
F0a(["fib(0)=0"])
F2b(["fib(2)\n📦 cache hit!"])
F4a --> F3a
F4a --> F2b
F3a --> F2a
F3a --> F1a
F2a --> F1a
F2a --> F0a
style F2b fill:#dcfce7,stroke:#16a34a
end
subgraph bottomup["🔼 Bottom-Up (tabulation)"]
direction LR
D0["dp[0]=0"] --> D1["dp[1]=1"] --> D2["dp[2]=1"] --> D3["dp[3]=2"] --> D4["dp[4]=3"]
note["fill in order; each cell is computed once"]
end
style topdown fill:#f0f4ff,stroke:#4A6CF7
style bottomup fill:#f0fdf4,stroke:#16a34a
💡 Core difference: Top-Down computes on demand (only the subproblems it actually needs); Bottom-Up fills the whole table in order. Both have the same time complexity, but Bottom-Up has no recursion-stack overhead.
| Aspect | Memoization (Top-Down) | Tabulation (Bottom-Up) |
|---|---|---|
| Approach | Recursive with caching | Iterative table filling |
| Memory usage | Only computed states | All states (even unused) |
| Implementation | Often more intuitive | May need to figure out fill order |
| Stack overflow risk | Yes (deep recursion) | No |
| Speed | Slightly slower (function call overhead) | Slightly faster |
| Subproblems computed | Only reachable ones | All (even unreachable) |
| Debugging | Easier (follow recursion) | Harder (need correct fill order) |
| USACO preference | Great for understanding | Great for final solutions |
🏆 USACO Tip: In competition, bottom-up tabulation is slightly preferred because it avoids potential stack overflow (critical on problems with N = 10^5) and is often faster. But start with top-down if you're having trouble seeing the recurrence — it's a great way to think through the problem.
In competitive programming, both are valid. Practice both until you can convert easily between them.
6.1.4 The DP Recipe
Every DP problem follows the same recipe:
The DP four-step method:
flowchart TD
S1["① Define the state\nWhat does dp[i] represent?"] --> S2
S2["② Write the recurrence\nHow is dp[i] obtained from smaller states?"] --> S3
S3["③ Set the base cases\nWhat are the answers to the smallest subproblems?"] --> S4
S4["④ Choose the fill order\nSmall to large? Large to small?"] --> S5
S5{"Can space be compressed?"}
S5 -->|"depends on only the last 1-2 rows"| S6["rolling array / 1D optimization"]
S5 -->|"depends on the whole table"| S7["keep the full 2D table"]
style S1 fill:#dbeafe,stroke:#3b82f6
style S2 fill:#dbeafe,stroke:#3b82f6
style S3 fill:#dbeafe,stroke:#3b82f6
style S4 fill:#dbeafe,stroke:#3b82f6
style S6 fill:#dcfce7,stroke:#16a34a
- Define the state: What information uniquely describes a subproblem?
- Define the recurrence: How does `dp[state]` depend on smaller states?
- Identify base cases: What are the simplest subproblems with known answers?
- Determine order: In what order should we fill the table?
Let's apply this to Fibonacci:
- State: `dp[i]` = the i-th Fibonacci number
- Recurrence: `dp[i] = dp[i-1] + dp[i-2]`
- Base cases: `dp[0] = 0`, `dp[1] = 1`
- Order: i from 2 to n (each depends on smaller i)
6.1.5 Coin Change — Classic DP
Problem: You have coins of denominations coins[]. What is the minimum number of coins needed to make amount W? You can use each coin type unlimited times.
Example: coins = [1, 5, 6, 9], W = 11
Let's first try the greedy approach (always pick the largest coin ≤ remaining):
- Greedy: 9 + 1 + 1 = 3 coins ← not optimal!
- Optimal: 5 + 6 = 2 coins ← DP finds this
This is why greedy fails here and we need DP.
Visual: Coin Change DP Table
The DP table shows how dp[i] (minimum coins to make amount i) is filled left to right. For coins {1,3,4}, notice that dp[3]=1 (just use coin 3) and dp[6]=2 (use two 3s). Each cell builds on previous cells using the recurrence.
This static reference shows the complete coin change DP table, with arrows indicating how each cell's value depends on previous cells via the recurrence dp[w] = 1 + min(dp[w-c]).
DP Definition
Coin Change state transitions (coins=[1,5,6], W=7):
flowchart LR
D0(["dp[0]=0"])
D1(["dp[1]=1"])
D5(["dp[5]=1"])
D6(["dp[6]=1"])
D7(["dp[7]=2"])
D0 -->|"use coin 1"| D1
D0 -->|"use coin 5"| D5
D0 -->|"use coin 6"| D6
D1 -->|"use coin 6"| D7
D5 -->|"use coin 1\ndp[5]+1=2"| D6_2(["dp[6]=min(1,2)=1"])
D6 -->|"use coin 1\ndp[6]+1=2"| D7
note1["dp[7] = min(dp[6]+1, dp[2]+1, dp[1]+1)\n = min(2, ?, 2) = 2\noptimal: 1+6 or 6+1"]
style D0 fill:#f0fdf4,stroke:#16a34a
style D7 fill:#dbeafe,stroke:#3b82f6
style note1 fill:#fef9ec,stroke:#d97706
💡 Transition direction: each `dp[w]` is reached from `dp[w-c]` (the remaining amount after using coin c). Arrow direction = dependency direction.
- State: `dp[w]` = minimum coins to make exactly amount `w`
- Recurrence: `dp[w] = 1 + min over all coins c with c ≤ w of dp[w - c]` (use coin c, then solve the remaining w−c optimally)
- Base case: `dp[0] = 0` (zero coins to make amount 0)
- Answer: `dp[W]`
- Order: fill w from 1 to W
Complete Walkthrough: coins = [1, 5, 6, 9], W = 11
dp[0] = 0 (base case)
dp[1]: try coin 1: dp[0]+1=1 → dp[1] = 1
dp[2]: try coin 1: dp[1]+1=2 → dp[2] = 2
dp[3]: try coin 1: dp[2]+1=3 → dp[3] = 3
dp[4]: try coin 1: dp[3]+1=4 → dp[4] = 4
dp[5]: try coin 1: dp[4]+1=5
try coin 5: dp[0]+1=1 → dp[5] = 1 ← use the 5-coin!
dp[6]: try coin 1: dp[5]+1=2
try coin 5: dp[1]+1=2
try coin 6: dp[0]+1=1 → dp[6] = 1 ← use the 6-coin!
dp[7]: try coin 1: dp[6]+1=2
try coin 5: dp[2]+1=3
try coin 6: dp[1]+1=2 → dp[7] = 2 ← 1+6 or 6+1
dp[8]: try coin 1: dp[7]+1=3
try coin 5: dp[3]+1=4
try coin 6: dp[2]+1=3 → dp[8] = 3
dp[9]: try coin 1: dp[8]+1=4
try coin 5: dp[4]+1=5
try coin 6: dp[3]+1=4
try coin 9: dp[0]+1=1 → dp[9] = 1 ← use the 9-coin!
dp[10]: try coin 1: dp[9]+1=2
try coin 5: dp[5]+1=2
try coin 6: dp[4]+1=5
try coin 9: dp[1]+1=2 → dp[10] = 2 ← 1+9, 5+5, or 9+1
dp[11]: try coin 1: dp[10]+1=3
try coin 5: dp[6]+1=2
try coin 6: dp[5]+1=2
try coin 9: dp[2]+1=3 → dp[11] = 2 ← 5+6 or 6+5!
dp table: [0, 1, 2, 3, 4, 1, 1, 2, 3, 1, 2, 2]
Answer: dp[11] = 2 (coins 5 and 6) ✓
// Solution: Minimum Coin Change — O(N × W)
#include <bits/stdc++.h>
using namespace std;
int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int n, W;
    cin >> n >> W;
    vector<int> coins(n);
    for (int &c : coins) cin >> c;
    const int INF = 1e9;
    vector<int> dp(W + 1, INF); // dp[w] = min coins to make w
    dp[0] = 0;                  // base case
    // Step 1: Fill dp table bottom-up
    for (int w = 1; w <= W; w++) {
        for (int c : coins) {
            if (c <= w && dp[w - c] != INF) {
                dp[w] = min(dp[w], dp[w - c] + 1); // ← KEY LINE
            }
        }
    }
    // Step 2: Output result
    if (dp[W] == INF) {
        cout << "Impossible\n";
    } else {
        cout << dp[W] << "\n";
    }
    return 0;
}
Sample Input:
4 11
1 5 6 9
Sample Output:
2
Complexity Analysis:
- Time:
O(N × W)— for each amount w (1..W), try all N coins - Space:
O(W)— just the dp array
Reconstructing the Solution
How do we print which coins were used? Track parent[w] = which coin was used last:
vector<int> dp(W + 1, INF);
vector<int> lastCoin(W + 1, -1); // which coin gave the optimal solution for w
dp[0] = 0;
for (int w = 1; w <= W; w++) {
    for (int c : coins) {
        if (c <= w && dp[w-c] != INF && dp[w-c] + 1 < dp[w]) {
            dp[w] = dp[w-c] + 1;
            lastCoin[w] = c; // record that coin c was used
        }
    }
}
// Trace back the solution (assumes dp[W] != INF, i.e., W is reachable)
vector<int> solution;
int w = W;
while (w > 0) {
    solution.push_back(lastCoin[w]);
    w -= lastCoin[w];
}
for (int c : solution) cout << c << " ";
cout << "\n";
6.1.6 Number of Ways — Coin Change Variant
Problem: How many different ways can you make amount W using the given coins? (Order matters: [1,5] and [5,1] are different.)
// Ordered ways (permutations — order matters)
vector<long long> ways(W + 1, 0);
ways[0] = 1; // one way to make 0: use no coins
for (int w = 1; w <= W; w++) {
    for (int c : coins) {
        if (c <= w) {
            ways[w] += ways[w - c]; // ← KEY LINE
        }
    }
}
If order doesn't matter (combinations — [1,5] same as [5,1]):
// Unordered ways (combinations — order doesn't matter)
vector<long long> ways(W + 1, 0);
ways[0] = 1;
for (int c : coins) {               // outer loop: coins (each coin considered once)
    for (int w = c; w <= W; w++) {  // inner loop: amounts
        ways[w] += ways[w - c];
    }
}
💡 Key Insight: The order of loops matters for counting combinations vs. permutations! When coins are in the outer loop, each coin is "introduced" once and order is ignored. When amounts are in the outer loop, each amount is formed fresh each time, allowing all orderings.
⚠️ Common Mistakes in Chapter 6.1
- Initializing dp with 0 instead of INF: for minimization problems, `dp[w] = 0` means "0 coins", which will never be improved. Use `dp[w] = INF` with only `dp[0] = 0`.
- Not checking `dp[w-c] != INF` before using it: an unreachable subproblem's `INF + 1` pollutes the table (and overflows outright if INF is `INT_MAX`). Always check that the subproblem is solvable.
- Wrong loop order for knapsack variants: for unbounded (unlimited coins), loop amounts forward. For 0/1 (each used once), loop amounts backward. Getting this wrong gives wrong answers silently.
- Using `INT_MAX` as INF then adding 1: `INT_MAX + 1` overflows to a negative number. Use `1e9` or `1e18` as INF.
- Forgetting the base case: `dp[0] = 0` is crucial. Without it, nothing ever gets set.
Chapter Summary
📌 Key Takeaways
| Concept | Key Points | When to Use |
|---|---|---|
| Overlapping subproblems | Same computation repeated exponentially | Duplicate calls in recursion tree |
| Memoization (top-down) | Cache recursive results; easy to write | When recursive structure is clear |
| Tabulation (bottom-up) | Iterative table-filling; no stack overflow | Final contest solution; large N |
| DP state | Information that uniquely identifies a subproblem | Define carefully — determines everything |
| DP recurrence | How dp[state] depends on smaller states | "Transition equation" |
| Base case | Known answer for the simplest subproblem | Usually dp[0] = some trivial value |
🧩 DP Four-Step Method Quick Reference
| Step | Question | Fibonacci Example |
|---|---|---|
| 1. Define state | "What does dp[i] represent?" | dp[i] = the i-th Fibonacci number |
| 2. Write recurrence | "Which smaller states does dp[i] depend on?" | dp[i] = dp[i-1] + dp[i-2] |
| 3. Determine base case | "What is the answer for the smallest subproblem?" | dp[0]=0, dp[1]=1 |
| 4. Determine fill order | "i from small to large? Large to small?" | i from 2 to n |
❓ FAQ
Q1: How do I tell if a problem is a DP problem?
A: Two signals: ① the problem asks for an "optimal value" or "number of ways" (not "output the specific solution"); ② there are overlapping subproblems (the same subproblem is computed multiple times in brute-force recursion). If greedy can be proven correct, DP is usually not needed; otherwise it's likely DP.
Q2: Should I use top-down or bottom-up?
A: While learning, use top-down (more naturally expresses recursive thinking); for contest submission, use bottom-up (faster, no stack overflow). Both are correct. If you can quickly write bottom-up, go with it directly.
Q3: What is "optimal substructure" (no aftereffect)?
A: The core prerequisite of DP — once `dp[i]` is determined, subsequent computations will not "come back" to change it. In other words, `dp[i]`'s value only depends on the "past" (smaller states), not the "future". If this property is violated, DP cannot be used.
Q4: What value should INF be set to?
A: For `int` use `1e9` (= 10^9); for `long long` use `1e18` (= 10^18). Do not use `INT_MAX`, because `INT_MAX + 1` overflows to a negative number.
🔗 Connections to Later Chapters
- Chapter 6.2 (Classic DP): extends to LIS, knapsack, grid paths — all applications of the four-step DP method from this chapter
- Chapter 6.3 (Advanced DP): enters bitmask DP, interval DP, tree DP — more complex state definitions but same thinking
- Chapter 3.2 (Prefix Sums): difference arrays can sometimes replace simple DP, and prefix sum arrays can speed up interval computations in DP
- Chapter 4.1 (Greedy) vs DP: greedy-solvable problems are a special case of DP (local optimum = global optimum at each step); when greedy fails, DP is needed
Practice Problems
Problem 6.1.1 — Climbing Stairs 🟢 Easy
You can climb 1 or 2 stairs at a time. How many ways to climb N stairs?
(Same as Fibonacci — ways[n] = ways[n-1] + ways[n-2])
Hint
This is exactly Fibonacci! ways[1]=1, ways[2]=2. Or start with ways[0]=1, ways[1]=1, then ways[n] = ways[n-1] + ways[n-2].
Problem 6.1.2 — Minimum Coin Change 🟡 Medium Given coin denominations [1, 3, 4] and target 6, find the minimum coins. (Expected answer: 2 coins — use 3+3)
Hint
Build `dp[0..6]` using the coin change recurrence. Greedy gives 4+1+1=3 coins, but DP finds 3+3=2.
Problem 6.1.3 — Tile Tiling 🟡 Medium A 2×N board can be tiled with 1×2 dominoes (placed horizontally or vertically). How many ways?
Solution sketch: dp[n] = dp[n-1] + dp[n-2]. Vertical tile fills one column alone; two horizontal tiles fill two columns together.
Hint
Same recurrence as Fibonacci! The key insight: when you place a vertical domino at column n, you recurse on n-1; when you place two horizontal dominoes at columns n-1 and n, you recurse on n-2.
Problem 6.1.4 — Bounded Coin Change 🔴 Hard Same as coin change, but you can use each coin at most once (0/1 knapsack). Find the minimum coins.
Solution sketch: Similar to 0/1 knapsack. Use a 2D dp[i][w] = min coins using first i coins to make w. Or the space-optimized version with backward iteration.
Hint
This is a 0/1 knapsack variant. Key difference: once you use coin i, you can't use it again. In the 1D space-optimized version, iterate w from W down to coins[i] to prevent reuse.

Problem 6.1.5 — USACO Bronze: Haybale Stacking 🔴 Hard
Given N operations "add 1 to all positions from L to R", determine the final value at each position.
Use difference array from Chapter 3.2. This is also solvable by thinking of it as "build the answer" (DP-like perspective).
Hint
Difference array: `diff[L]++`, `diff[R+1]--`. Then the prefix sum of diff gives the final values.

🏆 Challenge Problem: Unique Paths with Obstacles
An N×M grid has '.' cells and '#' obstacles. Count paths from (1,1) to (N,M) moving only right or down. Answer modulo 10^9+7. (N, M ≤ 1000)
Visual: Fibonacci Recursion Tree
The diagram shows naive recursion for fib(6). Red dashed nodes are duplicate subproblems — computed multiple times. Green nodes show where memoization caches results. Without memoization: O(2^N). With memoization: O(N). This is the fundamental insight behind dynamic programming.
Chapter 6.2: Classic DP Problems
📝 Before You Continue: Make sure you've mastered Chapter 6.1's core DP concepts — states, recurrences, and base cases. You should be able to implement Fibonacci and basic coin change from scratch.
In this chapter, we tackle three of the most important and widely-applied DP problems in competitive programming. Mastering these patterns will help you recognize and solve dozens of USACO problems.
6.2.1 Longest Increasing Subsequence (LIS)
Problem: Given an array A of N integers, find the length of the longest subsequence where elements are strictly increasing. A subsequence doesn't need to be contiguous.
Example: A = [3, 1, 8, 2, 5]
- LIS: [1, 2, 5] → length 3
- Or: [3, 8] → length 2 (not the longest)
- Or: [1, 5] → length 2
💡 Key Insight: A subsequence can skip elements but must maintain relative order. The key DP insight: for each index i, ask "what's the longest increasing subsequence that ends at A[i]?" Then the answer is the maximum over all i.
LIS O(N²) state transitions for A=[3,1,8,2,5]:
flowchart LR
subgraph arr["Array A"]
direction LR
A0(["A[0]=3\ndp=1"])
A1(["A[1]=1\ndp=1"])
A2(["A[2]=8\ndp=2"])
A3(["A[3]=2\ndp=2"])
A4(["A[4]=5\ndp=3"])
end
A0 -->|"3<8"| A2
A1 -->|"1<8"| A2
A1 -->|"1<2"| A3
A1 -->|"1<5"| A4
A3 -->|"2<5"| A4
note["Answer: max(dp)=3\nLIS=[1,2,5]"]
style A4 fill:#dcfce7,stroke:#16a34a
style note fill:#f0fdf4,stroke:#16a34a
💡 Transition rule:
dp[i] = 1 + max(dp[j]) over all j < i with A[j] < A[i]. Arrows show the "can extend" relations.
The diagram above illustrates the LIS structure: arrows show which earlier elements each position can extend from, and highlighted elements form the longest increasing subsequence.
The diagram shows the array [3,1,4,1,5,9,2,6] with the LIS 1→4→5→6 highlighted in green. Each dp[i] value below the array shows the LIS length ending at that position. Arrows connect elements that extend the subsequence.
O(N²) DP Solution
- State: dp[i] = length of the longest increasing subsequence ending at index i
- Recurrence: dp[i] = 1 + max(dp[j]) for all j < i where A[j] < A[i]
- Base case: dp[i] = 1 (a subsequence of just A[i])
- Answer: max(dp[0], dp[1], ..., dp[N-1])
Step-by-step trace for A = [3, 1, 8, 2, 5]:
dp[0] = 1 (LIS ending at 3: just [3])
dp[1] = 1 (LIS ending at 1: just [1], since no j<1 with A[j]<1)
dp[2] = 2 (LIS ending at 8: A[0]=3 < 8 → dp[0]+1=2; A[1]=1 < 8 → dp[1]+1=2)
Best: 2 ([3,8] or [1,8])
dp[3] = 2 (LIS ending at 2: A[1]=1 < 2 → dp[1]+1=2)
Best: 2 ([1,2])
dp[4] = 3 (LIS ending at 5: A[1]=1 < 5 → dp[1]+1=2; A[3]=2 < 5 → dp[3]+1=3)
Best: 3 ([1,2,5])
LIS length = max(dp) = 3
// Solution: LIS O(N²) — simple but too slow for N > 5000
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> A(n);
for (int &x : A) cin >> x;
vector<int> dp(n, 1); // every element alone is a subsequence of length 1
for (int i = 1; i < n; i++) {
for (int j = 0; j < i; j++) {
if (A[j] < A[i]) { // A[j] can extend subsequence ending at A[i]
dp[i] = max(dp[i], dp[j] + 1); // ← KEY LINE
}
}
}
cout << *max_element(dp.begin(), dp.end()) << "\n";
return 0;
}
Sample Input: 5 / 3 1 8 2 5 → Output: 3
Complexity Analysis:
- Time: O(N²) — double loop
- Space: O(N) — dp array
For N ≤ 5000, O(N²) is fast enough. For N up to 10^5, we need the O(N log N) approach.
O(N log N) LIS with Binary Search (Patience Sorting)
The key idea: instead of tracking exact dp values, maintain a tails array where tails[k] = the smallest possible tail element of any increasing subsequence of length k+1 seen so far.
Why is this useful? Because if we can maintain this array, we can use binary search to find where to place each new element.
💡 Key Insight (Patience Sorting): Imagine dealing cards to piles. Each pile is a decreasing sequence (like Solitaire). A card goes on the leftmost pile whose top is ≥ it. If no such pile exists, start a new pile. The number of piles equals the LIS length! The tails array is exactly the tops of these piles.
Step-by-step trace for A = [3, 1, 8, 2, 5]:
Process 3: tails = [], no element ≥ 3, so push: tails = [3]
→ LIS length so far: 1
Process 1: tails = [3], lower_bound(1) hits index 0 (3 ≥ 1), replace:
tails = [1]
→ LIS length still 1; but now the best 1-length subsequence ends in 1 (better!)
Process 8: tails = [1], lower_bound(8) hits end, push: tails = [1, 8]
→ LIS length: 2 (e.g., [1, 8])
Process 2: tails = [1, 8], lower_bound(2) hits index 1 (8 ≥ 2), replace:
tails = [1, 2]
→ LIS length still 2; but best 2-length subsequence now ends in 2 (better!)
Process 5: tails = [1, 2], lower_bound(5) hits end, push: tails = [1, 2, 5]
→ LIS length: 3 (e.g., [1, 2, 5]) ✓
Answer = tails.size() = 3
ASCII Patience Sorting Visualization:
Cards dealt: 3, 1, 8, 2, 5
After 3: After 1: After 8: After 2: After 5:
[3] [1] [1][8] [1][2] [1][2][5]
Pile 1 Pile 1 P1 P2 P1 P2 P1 P2 P3
Number of piles = LIS length = 3 ✓
// Solution: LIS O(N log N) — fast enough for N up to 10^5
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
vector<int> A(n);
for (int &x : A) cin >> x;
vector<int> tails; // tails[i] = smallest tail of any IS of length i+1
for (int x : A) {
// Find first tail >= x (for strictly increasing: use lower_bound)
auto it = lower_bound(tails.begin(), tails.end(), x);
if (it == tails.end()) {
tails.push_back(x); // x extends the longest subsequence
} else {
*it = x; // ← KEY LINE: replace to maintain smallest possible tail
}
}
cout << tails.size() << "\n";
return 0;
}
⚠️ Note: tails doesn't store an actual LIS — only its size equals the LIS length. The elements in tails are always in sorted order, which is why binary search works.
⚠️ Common Mistake: lower_bound gives LIS for strictly increasing (A[j] < A[i]). For non-decreasing (A[j] ≤ A[i]), use upper_bound instead.
Complexity Analysis:
- Time: O(N log N) — N elements, each with an O(log N) binary search
- Space: O(N) — the tails array
LIS Application in USACO
Many USACO Silver problems reduce to LIS:
- "Minimum number of groups to partition a sequence so each group is non-increasing" → same as LIS length (by Dilworth's theorem)
- Sorting with restrictions often becomes LIS
- 2D LIS: sort by one dimension, find LIS of the other
🔗 Related Problem: USACO 2015 February Silver: "Censoring" — involves finding a pattern that's a subsequence.
6.2.2 The 0/1 Knapsack Problem
Problem: You have N items. Item i has weight w[i] and value v[i]. Your knapsack holds total weight W. Choose items to maximize total value without exceeding weight W. Each item can be used at most once (0/1 = take it or leave it).
Example:
- Items: (weight=2, value=3), (weight=3, value=4), (weight=4, value=5), (weight=5, value=6)
- W = 8
- Best: items 1+2+3 would weigh 2+3+4 = 9 > 8 (too heavy); items 1+2 give weight 5, value 7; items 1+4 give weight 7, value 9; items 2+4 give weight 8, value 10. Answer: 10.
Visual: Knapsack DP Table
The 2D table shows dp[item][capacity]. Each row adds one item, and each cell holds the best value achievable with that capacity. The answer (10) is in the bottom-right corner. Highlighted cells show where new items changed the optimal value.
This static reference shows the complete knapsack DP table with the take/skip decisions highlighted for each item at each capacity level.
DP Formulation
0/1 knapsack decision process (for item i):
flowchart TD
State["Current state: dp[i-1][w]"] --> Dec{"Item i\nweight[i]=wi, value[i]=vi"}
Dec -->|"skip"| Skip["dp[i][w] = dp[i-1][w]"]
Dec -->|"take\n(requires wi ≤ w)"| Take["dp[i][w] = dp[i-1][w-wi] + vi"]
Skip --> Max["dp[i][w] = max(skip, take)"]
Take --> Max
style Dec fill:#fef9ec,stroke:#d97706
style Max fill:#dcfce7,stroke:#16a34a
💡 Key difference: in 0/1 knapsack each item can be used at most once, so the "take" transition comes from the previous row dp[i-1], not the current row. This is exactly why the 1D optimization must iterate backwards.
- State: dp[i][w] = maximum value using items 1..i with total weight ≤ w
- Recurrence:
  - Don't take item i: dp[i][w] = dp[i-1][w]
  - Take item i (only if weight[i] ≤ w): dp[i][w] = dp[i-1][w - weight[i]] + value[i]
  - Take the maximum: dp[i][w] = max(don't take, take)
- Base case: dp[0][w] = 0 (no items = zero value)
- Answer: dp[N][W]
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, W;
cin >> n >> W;
vector<int> weight(n + 1), value(n + 1);
for (int i = 1; i <= n; i++) cin >> weight[i] >> value[i];
// dp[i][w] = max value using first i items with weight limit w
vector<vector<int>> dp(n + 1, vector<int>(W + 1, 0));
for (int i = 1; i <= n; i++) {
for (int w = 0; w <= W; w++) {
dp[i][w] = dp[i-1][w]; // option 1: don't take item i
if (weight[i] <= w) { // option 2: take item i (if it fits)
dp[i][w] = max(dp[i][w], dp[i-1][w - weight[i]] + value[i]);
}
}
}
cout << dp[n][W] << "\n";
return 0;
}
Space-Optimized 0/1 Knapsack — O(W) Space
We only need the previous row dp[i-1], so we can use a 1D array. Crucial: iterate w from W down to 0 (otherwise item i is used multiple times):
vector<int> dp(W + 1, 0);
for (int i = 1; i <= n; i++) {
// Iterate BACKWARDS to prevent using item i more than once
for (int w = W; w >= weight[i]; w--) {
dp[w] = max(dp[w], dp[w - weight[i]] + value[i]);
}
}
cout << dp[W] << "\n";
Why backwards? When computing dp[w], we need dp[w - weight[i]] from the previous item's row (not the current item's). Iterating backwards ensures dp[w - weight[i]] hasn't been updated by item i yet.
Unbounded Knapsack (Unlimited Items)
If each item can be used multiple times, iterate forwards:
for (int i = 1; i <= n; i++) {
for (int w = weight[i]; w <= W; w++) { // FORWARDS — allows reuse
dp[w] = max(dp[w], dp[w - weight[i]] + value[i]);
}
}
6.2.3 Grid Path Counting
Problem: Count the number of paths from the top-left corner (1,1) to the bottom-right corner (N,M) of a grid, moving only right or down. Some cells are blocked.
Example: 3×3 grid with no blockages → 6 paths (C(4,2) = 6).
Visual: Grid Path DP Values
Each cell shows the number of paths from (0,0) to that cell. The recurrence dp[i][j] = dp[i-1][j] + dp[i][j-1] adds paths arriving from above and from the left. The Pascal's triangle pattern emerges naturally when there are no obstacles.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
vector<string> grid(n);
for (int r = 0; r < n; r++) cin >> grid[r];
// dp[r][c] = number of paths to reach (r, c)
vector<vector<long long>> dp(n, vector<long long>(m, 0));
// Base case: starting cell (if not blocked)
if (grid[0][0] != '#') dp[0][0] = 1;
// Fill first row (can only come from the left)
for (int c = 1; c < m; c++) {
if (grid[0][c] != '#') dp[0][c] = dp[0][c-1];
}
// Fill first column (can only come from above)
for (int r = 1; r < n; r++) {
if (grid[r][0] != '#') dp[r][0] = dp[r-1][0];
}
// Fill rest of the grid
for (int r = 1; r < n; r++) {
for (int c = 1; c < m; c++) {
if (grid[r][c] == '#') {
dp[r][c] = 0; // blocked — no paths through here
} else {
dp[r][c] = dp[r-1][c] + dp[r][c-1]; // from above + from left
}
}
}
cout << dp[n-1][m-1] << "\n";
return 0;
}
Grid Maximum Value Path
Problem: Find the path from (1,1) to (N,M) (moving right or down) that maximizes the sum of values.
vector<vector<int>> val(n, vector<int>(m));
for (int r = 0; r < n; r++)
for (int c = 0; c < m; c++)
cin >> val[r][c];
vector<vector<long long>> dp(n, vector<long long>(m, 0));
dp[0][0] = val[0][0];
for (int c = 1; c < m; c++) dp[0][c] = dp[0][c-1] + val[0][c];
for (int r = 1; r < n; r++) dp[r][0] = dp[r-1][0] + val[r][0];
for (int r = 1; r < n; r++) {
for (int c = 1; c < m; c++) {
dp[r][c] = max(dp[r-1][c], dp[r][c-1]) + val[r][c];
}
}
cout << dp[n-1][m-1] << "\n";
6.2.4 USACO DP Example: Hoof Paper Scissors
Problem (USACO 2019 January Silver): Bessie plays N rounds of Hoof-Paper-Scissors (like Rock-Paper-Scissors but with cow gestures). She knows the opponent's moves in advance. She can change her gesture at most K times. Maximize wins.
State: dp[i][j][g] = max wins in the first i rounds, having changed j times, currently playing gesture g.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, k;
cin >> n >> k;
// 0=Hoof, 1=Paper, 2=Scissors
vector<int> opp(n + 1);
for (int i = 1; i <= n; i++) {
char c; cin >> c;
if (c == 'H') opp[i] = 0;
else if (c == 'P') opp[i] = 1;
else opp[i] = 2;
}
// dp[j][g] = max wins using j changes so far, currently playing gesture g
// (2D since we process rounds iteratively)
const int NEG_INF = -1e9;
vector<vector<int>> dp(k + 1, vector<int>(3, NEG_INF));
// Initialize: before round 1, 0 changes, any starting gesture
for (int g = 0; g < 3; g++) dp[0][g] = 0;
for (int i = 1; i <= n; i++) {
vector<vector<int>> ndp(k + 1, vector<int>(3, NEG_INF));
for (int j = 0; j <= k; j++) {
for (int g = 0; g < 3; g++) {
if (dp[j][g] == NEG_INF) continue;
int win = (g == (opp[i] + 1) % 3) ? 1 : 0; // does g beat opp[i]? (P>H, S>P, H>S)
// Option 1: don't change gesture
ndp[j][g] = max(ndp[j][g], dp[j][g] + win);
// Option 2: change gesture (costs 1 change)
if (j < k) {
for (int ng = 0; ng < 3; ng++) {
if (ng != g) {
int nwin = (ng == (opp[i] + 1) % 3) ? 1 : 0; // does ng beat opp[i]?
ndp[j+1][ng] = max(ndp[j+1][ng], dp[j][g] + nwin);
}
}
}
}
}
dp = ndp;
}
int ans = 0;
for (int j = 0; j <= k; j++)
for (int g = 0; g < 3; g++)
ans = max(ans, dp[j][g]);
cout << ans << "\n";
return 0;
}
6.2.5 Interval DP — Matrix Chain and Burst Balloons Patterns
Interval DP is a powerful DP technique where the state represents a contiguous subarray or subrange, and we combine solutions of smaller intervals to solve larger ones.
💡 Key Insight: When the optimal solution for a range [l, r] depends on how we split that range at some point k, and the sub-problems for [l, k] and [k+1, r] are independent, interval DP applies.
The Interval DP Framework
Interval DP fill order (shown for n=4):
flowchart LR
subgraph len1["len=1 (base cases)"]
direction TB
L11["dp[1][1]"]
L22["dp[2][2]"]
L33["dp[3][3]"]
L44["dp[4][4]"]
end
subgraph len2["len=2"]
direction TB
L12["dp[1][2]"]
L23["dp[2][3]"]
L34["dp[3][4]"]
end
subgraph len3["len=3"]
direction TB
L13["dp[1][3]"]
L24["dp[2][4]"]
end
subgraph len4["len=4 (answer)"]
direction TB
L14["dp[1][4] ⭐"]
end
len1 -->|"used by"| len2
len2 -->|"used by"| len3
len3 -->|"used by"| len4
style L14 fill:#dcfce7,stroke:#16a34a
style len4 fill:#f0fdf4,stroke:#16a34a
💡 Fill-order rule: fill the table by increasing interval length. When computing dp[l][r], every shorter sub-interval dp[l][k] and dp[k+1][r] is already available.
State: dp[l][r] = optimal solution for the subproblem on interval [l, r]
Base: dp[i][i] = cost/value for a single element (often 0 or trivial)
Order: Fill by increasing interval LENGTH (len = 1, 2, 3, ..., n)
This ensures dp[l][k] and dp[k+1][r] are computed before dp[l][r]
Transition:
dp[l][r] = min/max over all split points k in [l, r-1] of:
dp[l][k] + dp[k+1][r] + cost(l, k, r)
Answer: dp[1][n] (or dp[0][n-1] for 0-indexed)
Enumeration order matters! We enumerate by interval length, not by left endpoint. This guarantees all sub-intervals are solved before we need them.
Classic Example: Matrix Chain Multiplication
Problem: Given N matrices A₁, A₂, ..., Aₙ where matrix Aᵢ has dimensions dim[i-1] × dim[i], find the parenthesization that minimizes the total number of scalar multiplications.
Why DP? Different parenthesizations have wildly different costs:
- (A₁A₂)A₃: cost = p×q×r + p×r×s (where the shapes are p×q, q×r, r×s)
- A₁(A₂A₃): cost = q×r×s + p×q×s
State: dp[l][r] = minimum multiplications to compute the product Aₗ × Aₗ₊₁ × ... × Aᵣ
Transition: Try every split point k ∈ [l, r-1]. When we split at k:
- Left product Aₗ...Aₖ has cost dp[l][k], resulting shape dim[l-1] × dim[k]
- Right product Aₖ₊₁...Aᵣ has cost dp[k+1][r], resulting shape dim[k] × dim[r]
- Multiplying these two results costs dim[l-1] × dim[k] × dim[r]
// Solution: Matrix Chain Multiplication — O(N³) time, O(N²) space
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n; // number of matrices
// dim[i-1] × dim[i] is the shape of matrix i (1-indexed)
// So we need n+1 dimensions
vector<int> dim(n + 1);
for (int i = 0; i <= n; i++) cin >> dim[i];
// Matrix i has shape dim[i-1] × dim[i]
// dp[l][r] = min cost to compute product of matrices l..r
vector<vector<long long>> dp(n + 1, vector<long long>(n + 1, 0));
const long long INF = 1e18;
// Fill dp by increasing interval length
for (int len = 2; len <= n; len++) { // interval length
for (int l = 1; l + len - 1 <= n; l++) { // left endpoint
int r = l + len - 1; // right endpoint
dp[l][r] = INF;
// Try every split point k
for (int k = l; k < r; k++) {
long long cost = dp[l][k] // left subproblem
+ dp[k+1][r] // right subproblem
+ (long long)dim[l-1] * dim[k] * dim[r]; // merge cost
dp[l][r] = min(dp[l][r], cost);
}
}
}
cout << dp[1][n] << "\n"; // min cost to multiply all n matrices
return 0;
}
Complexity Analysis:
- States: O(N²) — all pairs (l, r) with l ≤ r
- Transition: O(N) per state — try all split points k
- Total Time: O(N³)
- Space: O(N²)
Example trace for N=4, dims = [10, 30, 5, 60, 10]:
Matrices: A1(10×30), A2(30×5), A3(5×60), A4(60×10)
len=2:
dp[1][2] = dim[0]*dim[1]*dim[2] = 10*30*5 = 1500
dp[2][3] = dim[1]*dim[2]*dim[3] = 30*5*60 = 9000
dp[3][4] = dim[2]*dim[3]*dim[4] = 5*60*10 = 3000
len=3:
dp[1][3]: try k=1: dp[1][1]+dp[2][3]+10*30*60 = 0+9000+18000 = 27000
try k=2: dp[1][2]+dp[3][3]+10*5*60 = 1500+0+3000 = 4500
dp[1][3] = 4500
dp[2][4]: try k=2: dp[2][2]+dp[3][4]+30*5*10 = 0+3000+1500 = 4500
try k=3: dp[2][3]+dp[4][4]+30*60*10 = 9000+0+18000 = 27000
dp[2][4] = 4500
len=4:
dp[1][4]: try k=1: dp[1][1]+dp[2][4]+10*30*10 = 0+4500+3000 = 7500
try k=2: dp[1][2]+dp[3][4]+10*5*10 = 1500+3000+500 = 5000 ← min!
try k=3: dp[1][3]+dp[4][4]+10*60*10 = 4500+0+6000 = 10500
dp[1][4] = 5000
Answer: 5000 scalar multiplications (parenthesization: (A1 A2)(A3 A4))
Other Classic Interval DP Problems
1. Burst Balloons (LeetCode 312):
- dp[l][r] = max coins from bursting all balloons between l and r
- Key twist: think of k as the last balloon to burst in [l, r] (not the first split!)
- dp[l][r] = max over k of (nums[l-1]*nums[k]*nums[r+1] + dp[l][k-1] + dp[k+1][r])

2. Optimal Binary Search Tree:
- dp[l][r] = min cost of a BST for keys l..r with given access frequencies
- Split at root k: dp[l][r] = dp[l][k-1] + dp[k+1][r] + sum_freq(l, r)

3. Palindrome Partitioning:
- dp[l][r] = min cuts to partition s[l..r] into palindromes
- dp[l][r] = 0 if s[l..r] is already a palindrome, else min over k of (dp[l][k] + dp[k+1][r] + 1)
Template Summary
// Generic Interval DP Template
// Assumes 1-indexed, n elements
void intervalDP(int n) {
vector<vector<int>> dp(n + 1, vector<int>(n + 1, 0));
// Base case: intervals of length 1
for (int i = 1; i <= n; i++) dp[i][i] = base_case(i);
// Fill by increasing length
for (int len = 2; len <= n; len++) {
for (int l = 1; l + len - 1 <= n; l++) {
int r = l + len - 1;
dp[l][r] = INF; // or -INF for maximization
for (int k = l; k < r; k++) { // split at k (or k+1)
int val = dp[l][k] + dp[k+1][r] + cost(l, k, r);
dp[l][r] = min(dp[l][r], val); // or max
}
}
}
// Answer is dp[1][n]
}
⚠️ Common Mistake: Iterating over the left endpoint l in the outer loop and the length in the inner loop. This is wrong — when you compute dp[l][r], the sub-intervals dp[l][k] and dp[k+1][r] must already be computed. Always iterate by length in the outer loop.
// WRONG — dp[l][k] might not be ready yet!
for (int l = 1; l <= n; l++)
for (int r = l + 1; r <= n; r++)
...
// CORRECT — all shorter intervals are computed first
for (int len = 2; len <= n; len++)
for (int l = 1; l + len - 1 <= n; l++) {
int r = l + len - 1;
...
}
⚠️ Common Mistakes in Chapter 6.2
- LIS: using upper_bound for strictly increasing: For strictly increasing, use lower_bound. For non-decreasing, use upper_bound. Mixing them up computes the wrong variant and gives an incorrect length whenever the array contains duplicates.
- 0/1 Knapsack: iterating weight forward: Iterating w from 0 to W (forward) allows using item i multiple times — that's unbounded knapsack, not 0/1. Always iterate backwards for 0/1.
- Grid paths: forgetting to handle blocked cells: If grid[r][c] == '#', set dp[r][c] = 0 (not dp[r-1][c] + dp[r][c-1]).
- Overflow in grid path counting: Even for modest grids, the number of paths can be astronomically large. Use long long or modular arithmetic.
- LIS: thinking tails contains the actual LIS: It doesn't! tails contains the smallest possible tail elements for subsequences of each length. The actual LIS must be reconstructed separately.
Chapter Summary
📌 Key Takeaways
| Problem | State Definition | Recurrence | Complexity |
|---|---|---|---|
| LIS (O(N²)) | dp[i] = LIS length ending at A[i] | dp[i] = max(dp[j]+1), j<i and A[j]<A[i] | O(N²) |
| LIS (O(N log N)) | tails[k] = min tail of IS with length k+1 | binary search + replace | O(N log N) |
| 0/1 Knapsack (2D) | dp[i][w] = max value using first i items, capacity ≤ w | max(skip, take) | O(NW) |
| 0/1 Knapsack (1D) | dp[w] = max value with capacity ≤ w | reverse iterate w | O(NW) |
| Grid Path | dp[r][c] = path count to reach (r,c) | dp[r-1][c] + dp[r][c-1] | O(RC) |
❓ FAQ
Q1: In the O(N log N) LIS solution, does the tails array store the actual LIS?
A: No! tails stores "the minimum tail element of increasing subsequences of each length". Its length equals the LIS length, but its elements may not appear in that order in the original array. To reconstruct the actual LIS, record each element's predecessor.
Q2: Why does 0/1 knapsack require reverse iteration over w?
A: Because dp[w] needs the "previous row's" dp[w - weight[i]]. If iterating forward, dp[w - weight[i]] may already have been updated by the current row (equivalent to using item i multiple times). Reverse iteration ensures each item is used at most once.
Q3: What is the only difference between unbounded knapsack (items usable unlimited times) and 0/1 knapsack code?
A: Just the inner loop direction. 0/1 knapsack: w from W down to weight[i] (reverse). Unbounded knapsack: w from weight[i] up to W (forward).
Q4: What if the grid path can also move up or left?
A: Then simple grid DP no longer works (because there would be cycles). You need BFS/DFS or more complex DP. Standard grid path DP only applies to "right/down only" movement.
🔗 Connections to Later Chapters
- Chapter 3.3 (Sorting & Binary Search): binary search is the core of O(N log N) LIS — lower_bound on the tails array
O(N log N)LIS —lower_boundon thetailsarray - Chapter 6.3 (Advanced DP): extends knapsack to bitmask DP (item sets → bitmask), extends grid DP to interval DP
- Chapter 4.1 (Greedy): interval scheduling problems can sometimes be converted to LIS (via Dilworth's theorem)
- LIS is extremely common in USACO Silver — 2D LIS, weighted LIS, LIS counting variants appear frequently
Practice Problems
Problem 6.2.1 — LIS Length 🟢 Easy
Read N integers. Find the length of the longest strictly increasing subsequence.
Hint
Use the `O(N log N)` approach with `lower_bound` on the `tails` array. Answer is `tails.size()`.

Problem 6.2.2 — Number of LIS 🔴 Hard
Read N integers. Find the number of distinct longest increasing subsequences. (Answer modulo 10^9+7.)
Solution sketch: Maintain both dp[i] (LIS length ending at i) and cnt[i] (number of such LIS). When dp[j]+1 > dp[i]: update dp[i] and reset cnt[i] = cnt[j]. When equal: add cnt[j] to cnt[i].
Hint
This requires the `O(N²)` approach. For each i, find all j < i where A[j] < A[i] and `dp[j] + 1 = dp[i]`. Sum up their `cnt[j]` values.

Problem 6.2.3 — 0/1 Knapsack 🟡 Medium
N items with weights and values, capacity W. Find the maximum value. (N, W ≤ 1000)
Hint
Space-optimized 1D dp: iterate items in the outer loop, weights BACKWARDS (W down to weight[i]) in the inner loop.

Problem 6.2.4 — Collect Stars 🟡 Medium
An N×M grid has stars ('*') and obstacles ('#'). Moving only right or down from (1,1) to (N,M), collect as many stars as possible. Print the maximum stars collected.
Hint
`dp[r][c]` = max stars collected to reach (r,c). For each cell, `dp[r][c]` = max(`dp[r-1][c]`, `dp[r][c-1]`) + (1 if grid[r][c]=='*').

Problem 6.2.5 — Variations of Knapsack 🔴 Hard
Read N items each with weight w[i] and value v[i]. Capacity W.
- Variant A: Each item available up to k[i] times (bounded knapsack)
- Variant B: Must fill the knapsack exactly (no extra space allowed)
- Variant C: Minimize weight while achieving value ≥ target V
Solution sketch: (A) Treat each item as k[i] copies for 0/1 knapsack, or use monotonic deque optimization. (B) Initialize dp[0] = 0, all other dp[w] = INF, answer is dp[W]. (C) Swap the roles of weight and value in the DP.
Hint
For variant B: change "INF means unreachable" to "INF means infeasible". Only states reachable from `dp[0]` = 0 will have finite values. For variant C: `dp[v]` = minimum weight to achieve exactly value v.

🏆 Challenge Problem: USACO 2019 January Silver: Grass Planting
Each of N fields has a certain grass density. Farmer John can re-plant any number of non-overlapping intervals. Design a DP to maximize the number of fields with the specific grass density he wants after at most K re-plantings. (Interval DP combined with 1D DP)
Visual: LIS via Patience Sorting
This diagram illustrates LIS using the patience sorting analogy. Each "pile" represents a potential subsequence endpoint. The number of piles equals the LIS length. Binary search finds where each card goes in O(log N), giving an O(N log N) overall algorithm.
Visual: Knapsack DP Table
The 0/1 Knapsack DP table: rows = items considered, columns = capacity. Each cell shows the maximum value achievable. Blue cells show single-item contributions, green cells show combinations, and the starred cell is the optimal answer.
Chapter 6.3: Advanced DP Patterns
📝 Before You Continue: You must have completed Chapter 6.1 (Introduction to DP) and Chapter 6.2 (Classic DP Problems). Advanced patterns build on memoization, tabulation, and the classic DP problems (LIS, knapsack, grid paths).
This chapter covers DP techniques that appear at USACO Silver and above: bitmask DP, interval DP, tree DP, and digit DP. Each has a characteristic structure that, once recognized, makes the problem tractable.
6.3.1 Bitmask DP
When to use: Problems involving subsets of a small set (N ≤ 20), where the state includes "which elements have been selected."
Core idea: Represent the set of selected elements as a bitmask (integer). Bit i is 1 if element i is included.
{0, 2, 3} in a set of 5 elements → bitmask = 0b01101 = 13
bit 0 = 1 (element 0 ∈ set)
bit 1 = 0 (element 1 ∉ set)
bit 2 = 1 (element 2 ∈ set)
bit 3 = 1 (element 3 ∈ set)
bit 4 = 0 (element 4 ∉ set)
Essential Bitmask Operations
// Element operations
int mask = 0;
mask |= (1 << i); // add element i to set
mask &= ~(1 << i); // remove element i from set
bool has_i = (mask >> i) & 1; // check if element i is in set
// Enumerate all subsets of mask
for (int sub = mask; sub > 0; sub = (sub - 1) & mask) {
// process subset 'sub'
}
// Include the empty subset too: add sub=0 after the loop
// Count bits set (number of elements in set)
int count = __builtin_popcount(mask); // for int
int count = __builtin_popcountll(mask); // for long long
// Enumerate all masks with exactly k bits set
for (int mask = 0; mask < (1 << n); mask++) {
if (__builtin_popcount(mask) == k) { /* ... */ }
}
Classic: Traveling Salesman Problem (TSP) — O(2^N × N²)
Problem: N cities, complete weighted graph. Find the minimum-cost tour that visits every city exactly once (the code below returns to the start — a Hamiltonian cycle; drop the return edge for a path).
State: dp[mask][u] = minimum cost to visit exactly the cities in mask, ending at city u.
Transition: To extend to city v not in mask:
dp[mask | (1<<v)][v] = min(dp[mask | (1<<v)][v], dp[mask][u] + dist[u][v])
// Solution: TSP with Bitmask DP — O(2^N × N^2)
// Works for N ≤ 20 (2^20 × 400 ≈ 4×10^8 — tight; N≤18 is safer)
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll INF = 1e18;
int n;
int dist[20][20];
ll dp[1 << 20][20];
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
cin >> n;
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
cin >> dist[i][j];
// Initialize: INF everywhere
for (int mask = 0; mask < (1 << n); mask++)
fill(dp[mask], dp[mask] + n, INF);
// Base case: start at city 0, only city 0 visited
dp[1][0] = 0; // mask=1 (bit 0 set), at city 0, cost=0
// Fill DP
for (int mask = 1; mask < (1 << n); mask++) {
for (int u = 0; u < n; u++) {
if (!(mask & (1 << u))) continue; // u not in current set
if (dp[mask][u] == INF) continue;
// Try extending to city v not yet visited
for (int v = 0; v < n; v++) {
if (mask & (1 << v)) continue; // v already visited
int newMask = mask | (1 << v);
dp[newMask][v] = min(dp[newMask][v], dp[mask][u] + dist[u][v]);
}
}
}
// Answer: minimum over all ending cities to return to city 0
// (or just minimum over all ending cities for Hamiltonian PATH, not cycle)
int fullMask = (1 << n) - 1; // all cities visited
ll ans = INF;
for (int u = 1; u < n; u++) { // end at any city except 0
ans = min(ans, dp[fullMask][u] + dist[u][0]); // return to 0 for cycle
}
cout << ans << "\n";
return 0;
}
⚠️ Memory Warning: dp[1<<20][20] uses 2^20 × 20 × 8 bytes ≈ 168 MB. For N=20, this is close to typical 256 MB memory limits. If distances fit in int, use int dp instead of long long to halve memory to ~84 MB.
6.3.2 Interval DP
When to use: Problems where the answer for a larger interval can be built from answers for smaller intervals. Keywords: "merge," "split," "burst," "matrix chain."
Core structure:
dp[l][r] = optimal answer for subproblem on interval [l, r]
Base case: dp[i][i] = trivial (single element)
Transition: dp[l][r] = min/max over k ∈ [l, r-1] of:
dp[l][k] + dp[k+1][r] + cost(l, k, r)
Fill order: by INCREASING interval length (len = r - l + 1)
Classic: Matrix Chain Multiplication — O(N³)
Problem: Multiply N matrices in sequence. Matrix i (1-indexed) has dimensions dims[i-1] × dims[i]. The number of scalar multiplications to multiply A (p×q) by B (q×r) is p*q*r. Find the parenthesization that minimizes total multiplications.
State: dp[l][r] = minimum multiplications to compute the product of matrices l through r.
// Solution: Matrix Chain Multiplication — O(N^3), O(N^2) space
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll INF = 1e18;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
// Matrix i (1-indexed) has shape dims[i-1] × dims[i], so we read n+1 dimensions
vector<int> dims(n + 1);
for (int i = 0; i <= n; i++) cin >> dims[i];
// dp[l][r] = min multiplications to compute M_l × M_{l+1} × ... × M_r
vector<vector<ll>> dp(n + 1, vector<ll>(n + 1, 0));
// Fill by increasing interval length
for (int len = 2; len <= n; len++) { // len = number of matrices
for (int l = 1; l + len - 1 <= n; l++) {
int r = l + len - 1;
dp[l][r] = INF;
// Try all split points k (split after matrix k)
for (int k = l; k < r; k++) {
// Cost: compute [l..k], compute [k+1..r], then multiply the results
// Result of [l..k]: dims[l-1] × dims[k]
// Result of [k+1..r]: dims[k] × dims[r]
ll cost = dp[l][k] + dp[k+1][r]
+ (ll)dims[l-1] * dims[k] * dims[r]; // ← KEY: cost of final multiply
dp[l][r] = min(dp[l][r], cost);
}
}
}
cout << dp[1][n] << "\n";
return 0;
}
Worked Example:
3 matrices: A(10×30), B(30×5), C(5×60)
dims = [10, 30, 5, 60]
dp[1][1] = dp[2][2] = dp[3][3] = 0 (single matrices, no multiplication)
len=2:
dp[1][2] = dp[1][1] + dp[2][2] + 10*30*5 = 0 + 0 + 1500 = 1500
dp[2][3] = dp[2][2] + dp[3][3] + 30*5*60 = 0 + 0 + 9000 = 9000
len=3:
dp[1][3]: try k=1 and k=2
k=1: dp[1][1] + dp[2][3] + 10*30*60 = 0 + 9000 + 18000 = 27000
k=2: dp[1][2] + dp[3][3] + 10*5*60 = 1500 + 0 + 3000 = 4500 ← minimum!
dp[1][3] = 4500
Answer: 4500 (parenthesize as (A×B)×C)
Verify: (10×30)×5 = 1500 ops, then (10×5)×60 = 3000 ops, total = 4500 ✓
Classic: Burst Balloons (Variant of Interval DP)
Problem: N balloons with values. Burst balloon i: earn left_value × value[i] × right_value. Find maximum coins.
// dp[l][r] = max coins from bursting ALL balloons in [l, r] (inclusive),
// assuming balloons l-1 and r+1 are still intact on either side
// Key insight: think about which balloon is burst LAST in [l, r]
// (The last balloon sees l-1 and r+1 as its neighbors)
// Add sentinel balloons: val[0] = val[n+1] = 1
// (Fragment: assumes the includes, typedef ll, and n from the previous solution)
vector<int> val(n + 2);
val[0] = val[n + 1] = 1;
for (int i = 1; i <= n; i++) cin >> val[i];
vector<vector<ll>> dp(n + 2, vector<ll>(n + 2, 0));
for (int len = 1; len <= n; len++) {
for (int l = 1; l + len - 1 <= n; l++) {
int r = l + len - 1;
for (int k = l; k <= r; k++) {
// k is the LAST balloon burst in [l, r]
// When k is burst, only l-1 and r+1 remain beside it
ll cost = dp[l][k-1] + dp[k+1][r]
+ (ll)val[l-1] * val[k] * val[r+1];
dp[l][r] = max(dp[l][r], cost);
}
}
}
cout << dp[1][n] << "\n";
6.3.3 Tree DP
When to use: DP on a tree, where the state of a node depends on its subtree (post-order) or its ancestors (pre-order).
Pattern: Subtree DP (Post-Order)
dp[u] = some value computed from dp[children of u]
Process nodes in post-order (leaves first, root last)
Classic: Tree Knapsack / Maximum Independent Set on Tree
Problem: N nodes, each with value val[u]. Select a subset S maximizing total value, subject to: if u ∈ S, then no child of u is in S.
State: dp[u][0] = max value from subtree of u if u is NOT selected.
dp[u][1] = max value from subtree of u if u IS selected.
// Solution: Max Independent Set on Tree — O(N)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;
vector<int> children[MAXN];
int val[MAXN];
long long dp[MAXN][2]; // dp[u][0/1] = max value if u excluded/included
// DFS post-order: compute dp[u] after computing all dp[children]
void dfs(int u) {
dp[u][1] = val[u]; // include u: get val[u]
dp[u][0] = 0; // exclude u: get 0 from this node
for (int v : children[u]) {
dfs(v); // ← process child first (post-order)
// If we INCLUDE u: children must be EXCLUDED
dp[u][1] += dp[v][0];
// If we EXCLUDE u: children can be either included or excluded
dp[u][0] += max(dp[v][0], dp[v][1]);
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, root;
cin >> n >> root;
for (int i = 1; i <= n; i++) cin >> val[i];
for (int i = 0; i < n - 1; i++) {
int u, v;
cin >> u >> v;
children[u].push_back(v);
// Note: if the tree is given as undirected edges, need to root it first
}
dfs(root);
cout << max(dp[root][0], dp[root][1]) << "\n";
return 0;
}
Tree Diameter (Two DFS)
// Tree Diameter: longest path between any two nodes
// Method: Two DFS
// 1. DFS from any node u → find farthest node v
// 2. DFS from v → find farthest node w
// dist(v, w) = diameter
int farthest_node, max_dist;
void dfs_diameter(int u, int parent, int d, vector<int> adj[]) {
if (d > max_dist) {
max_dist = d;
farthest_node = u;
}
for (int v : adj[u]) {
if (v != parent) dfs_diameter(v, u, d + 1, adj);
}
}
int tree_diameter(int n, vector<int> adj[]) {
// First DFS from node 1
max_dist = 0; farthest_node = 1;
dfs_diameter(1, -1, 0, adj);
// Second DFS from farthest node found
int v = farthest_node;
max_dist = 0;
dfs_diameter(v, -1, 0, adj);
return max_dist; // this is the diameter
}
6.3.4 Digit DP
When to use: Count numbers in range [1, N] satisfying some property related to their digits.
Core idea: Build the number digit by digit (left to right), maintaining a "tight" constraint (whether we're still bounded by N's digits).
State: dp[position][tight][...other state...]
- position: which digit we're currently deciding (0 = leftmost)
- tight: are we still constrained by N? (1 = yes, can't exceed N's digit; 0 = no, can use 0-9 freely)
- Other state: whatever property we're tracking (sum of digits, count of zeros, etc.)
Classic: Count numbers in [1, N] with digit sum divisible by K
// Solution: Digit DP — O(|digits| × 10 × K) time, O(|digits| × K) space
#include <bits/stdc++.h>
using namespace std;
string num; // N as a string
int K;
// dp[pos][tight][sum % K] = count of valid numbers
// Here we use top-down memoization
map<tuple<int,int,int>, long long> memo;
// pos: current digit position (0-indexed)
// tight: are we bounded by num[pos]?
// rem: current digit sum mod K
long long solve(int pos, bool tight, int rem) {
if (pos == (int)num.size()) {
return rem == 0 ? 1 : 0; // complete number: valid iff digit sum ≡ 0 (mod K)
}
auto key = make_tuple(pos, tight, rem);
if (memo.count(key)) return memo[key];
int limit = tight ? (num[pos] - '0') : 9; // max digit we can place here
long long result = 0;
for (int d = 0; d <= limit; d++) {
bool new_tight = tight && (d == limit);
result += solve(pos + 1, new_tight, (rem + d) % K);
}
return memo[key] = result;
}
// Count numbers in [1, N] with digit sum divisible by K
long long count_up_to(long long N) {
num = to_string(N);
memo.clear();
long long ans = solve(0, true, 0);
// Subtract 1 because 0 itself has digit sum 0 (divisible by K)
// but we want [1, N], not [0, N]
return ans - 1;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
long long L, R;
cin >> L >> R >> K;
// Count in [L, R] = count_up_to(R) - count_up_to(L-1)
cout << count_up_to(R) - count_up_to(L - 1) << "\n";
return 0;
}
💡 Key Insight: The tight flag is crucial. When tight=true, we can only use digits up to num[pos]. Once we place a digit less than num[pos], all subsequent digits are free (0–9), so tight becomes false. This "peeling off" of the upper bound is what makes digit DP correct.
6.3.5 DP Optimization: When Standard DP Is Too Slow
Slope Trick (O(N log N) for Convex/Concave DP)
For DPs of the form dp[i] = min_{j<i} (dp[j] + cost(j, i)) where the cost function has "convex" structure.
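Full slope trick is beyond this chapter, but its flavor shows up in a classic warm-up: make an array non-decreasing with minimum total absolute change. A minimal sketch (the function name is ours, and this is only the simplest instance of the technique):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Minimum total |change| needed to make `a` non-decreasing.
// Slope trick view: the DP cost as a function of "final value of a[i]"
// is convex; the max-heap stores the breakpoints of its decreasing slope.
long long min_cost_non_decreasing(const vector<long long>& a) {
    priority_queue<long long> pq; // breakpoints of the convex cost
    long long ans = 0;
    for (long long x : a) {
        pq.push(x);
        if (pq.top() > x) {
            ans += pq.top() - x; // pay to pull the current minimum down to x
            pq.pop();
            pq.push(x);
        }
    }
    return ans;
}
```

For example, [3, 2, 1] costs 2 (change it to [2, 2, 2]). Each element does O(log N) heap work, giving O(N log N).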
Divide & Conquer Optimization (O(N² → N log N))
When the optimal split point opt[i][j] is monotone:
- opt[i][j] ≤ opt[i][j+1] (or a similar monotone property)
- Reduces the work to O(N log N) per DP dimension
Standard interval DP: O(N^3)
With D&C optimization: O(N^2 log N)
With Knuth's optimization: O(N^2) (requires additional condition)
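To make the divide-and-conquer optimization concrete, here is a generic sketch for one DP layer cur[j] = min over k < j of prv[k] + C(k, j), assuming the optimal split point is monotone in j. The cost function C below is a placeholder we invented for illustration (a squared gap, which satisfies the monotonicity condition); a real problem supplies its own.

```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll INF = 1e18;

// Placeholder cost for illustration only.
ll C(int k, int j) { return (ll)(j - k) * (j - k); }

// Compute cur[j] = min_{k < j} prv[k] + C(k, j) for all j in [lo, hi],
// knowing the optimal k for this range lies in [optlo, opthi].
void solve_layer(vector<ll>& cur, const vector<ll>& prv,
                 int lo, int hi, int optlo, int opthi) {
    if (lo > hi) return;
    int mid = (lo + hi) / 2, best = optlo;
    cur[mid] = INF;
    for (int k = optlo; k <= min(mid - 1, opthi); k++) {
        ll val = prv[k] + C(k, mid);
        if (val < cur[mid]) { cur[mid] = val; best = k; }
    }
    // Monotonicity: optima left of mid are <= best, right of mid are >= best
    solve_layer(cur, prv, lo, mid - 1, optlo, best);
    solve_layer(cur, prv, mid + 1, hi, best, opthi);
}
```

Each recursion depth scans O(N) candidate split points in total, so one layer costs O(N log N) instead of O(N²).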
📌 USACO Relevance: These optimizations are typically USACO Gold/Platinum level. For Silver, mastery of the four patterns in this chapter (bitmask, interval, tree, digit) is sufficient.
Chapter Summary
📌 Pattern Recognition Guide
| Pattern | Clue in Problem | State | Transition |
|---|---|---|---|
| Bitmask DP | "subset," N ≤ 20, assign tasks | dp[mask][last] | Flip bit, try next element |
| Interval DP | "merge," "split," "parenthesize" | dp[l][r] | Split at k, combine |
| Tree DP | "tree," subtree property | dp[node][state] | Aggregate from children |
| Digit DP | "count numbers with property" | dp[pos][tight][...] | Try each digit d |
🧩 Core Framework Quick Reference
// Bitmask DP framework
for (int mask = 0; mask < (1<<n); mask++)
for (int u = 0; u < n; u++) if (mask & (1<<u))
for (int v = 0; v < n; v++) if (!(mask & (1<<v)))
dp[mask|(1<<v)][v] = min(dp[mask|(1<<v)][v], dp[mask][u] + cost[u][v]);
// Interval DP framework
for (int len = 2; len <= n; len++) // enumerate interval length
for (int l = 1; l+len-1 <= n; l++) { // enumerate left endpoint
int r = l + len - 1;
for (int k = l; k < r; k++) // enumerate split point
dp[l][r] = min(dp[l][r], dp[l][k] + dp[k+1][r] + cost(l,k,r));
}
// Tree DP framework (post-order traversal)
void dfs(int u, int parent) {
for (int v : adj[u]) if (v != parent) {
dfs(v, u);
dp[u] = update(dp[u], dp[v]); // update current node with child info
}
}
// Digit DP framework
long long solve(int pos, bool tight, int state) {
if (pos == len) return (state == target) ? 1 : 0;
if (memo[pos][tight][state] != -1) return memo[pos][tight][state];
int lim = tight ? (num[pos]-'0') : 9;
long long res = 0;
for (int d = 0; d <= lim; d++)
res += solve(pos+1, tight && (d==lim), next_state(state, d));
return memo[pos][tight][state] = res;
}
❓ FAQ
Q1: Why must interval DP enumerate by length first?
A: Because dp[l][r] depends on dp[l][k] and dp[k+1][r], both of which cover intervals shorter than r-l+1. So all shorter intervals must be computed before dp[l][r], and enumerating by length from small to large guarantees this. If you enumerate l and r directly, you may compute dp[l][r] before its dependencies are ready.
Q2: In tree DP, how do you handle an unrooted tree (given undirected edges)?
A: Choose any node as root (usually node 1), then use DFS to turn undirected edges into directed edges (parent→child direction). Pass a parent parameter in DFS to avoid going back to the parent.
void dfs(int u, int par) {
for (int v : adj[u]) {
if (v != par) { // only visit children, not parent
dfs(v, u);
// update dp[u] using dp[v] here
}
}
}
Q3: In digit DP, can tight=true and tight=false share the same memoization array?
A: Yes, which is exactly why tight is part of the state. dp[pos][1][rem] and dp[pos][0][rem] are different states, recording "count under the upper-bound constraint" and "count when free" respectively. Note that tight=false states are reused heavily (once tight becomes false, the remaining digits are unconstrained, regardless of the prefix that got you there).
Practice Problems
Problem 6.3.1 — Bitmask DP: Task Assignment 🟡 Medium
N workers, N tasks. Worker i can do task j in time[i][j] hours. Assign each task to exactly one worker to minimize total time. (N ≤ 15)
Hint
dp[mask] = minimum time to assign the first popcount(mask) workers to the tasks in mask; the most recently added worker handles the newly added task.
Problem 6.3.2 — Interval DP: Palindrome Partitioning 🟡 Medium
Find the minimum number of cuts to partition a string into palindromes.
Hint
First precompute isPalin[l][r] with interval DP. Then dp[i] = min cuts for s[0..i].
Problem 6.3.3 — Tree DP: Maximum Matching 🔴 Hard
Find the maximum matching in a tree (maximum set of edges with no shared vertex).
Hint
dp[u][0] = max matching in subtree of u when u is NOT matched. dp[u][1] = max matching in subtree of u when u IS matched (to one child).
Problem 6.3.4 — Digit DP: Count Lucky Numbers 🟡 Medium
A "lucky" number only contains digits 4 and 7. Count lucky numbers in [1, N].
Hint
This can be solved without DP (just enumerate 2^k possibilities for k ≤ 18 digits). But practice the digit DP framework: state = (position, tight, has_only_4_7_so_far).
Problem 6.3.5 — Mixed: USACO 2019 December Platinum 🔴 Hard
Cow Poetry — combinatorics + DP. Count poem arrangements with specific rhyme schemes.
Hint
Group lines by their suffix hash. Use DP to count valid arrangements.
🏆 Part 7: USACO Contest Guide
Not algorithms — contest strategy. Learn how to compete: read problems, manage time, debug under pressure, and think strategically about scoring partial credit.
📚 3 Chapters · ⏱️ Read anytime · 🎯 Target: Promote from Bronze to Silver
Part 7: USACO Contest Guide
Read anytime — no prerequisites
Part 7 is different from the rest of the book. Instead of teaching algorithms, it teaches you how to compete — how to read problems, manage time, debug under pressure, and think strategically about scoring.
What Topics Are Covered
| Chapter | Topic | The Big Idea |
|---|---|---|
| Chapter 7.1 | Understanding USACO | Contest format, divisions, scoring, partial credit |
| Chapter 7.2 | Problem-Solving Strategies | How to approach problems you've never seen before |
| Chapter 7.3 | Ad Hoc Problems | Observation-based problems with no standard algorithm |
When to Read This Part
- Before your first USACO contest: Read Chapter 7.1 to understand the format
- When you're stuck on practice problems: Chapter 7.2's algorithm decision tree helps
- After finishing Parts 2-6: Chapter 7.2's checklist tells you if you're ready for Silver
Key Topics in This Part
Chapter 7.1: Understanding USACO
- Contest schedule (4 contests/year: December, January, February, US Open)
- Division structure: Bronze → Silver → Gold → Platinum
- Scoring: ~1000 points, need 750+ to promote
- Partial credit strategy: how to score points even without a perfect solution
- Common mistakes and how to avoid them
Chapter 7.2: Problem-Solving Strategies
- The Algorithm Decision Tree: Given constraints, what algorithm fits?
- N ≤ 20 → brute force/bitmask
- N ≤ 1000 → O(N²)
- N ≤ 10^5 → O(N log N)
- Grid + shortest path → BFS
- Optimal decisions → DP or greedy
- Testing methodology: sample cases, edge cases, stress testing
- Debugging tips: cerr, assert, AddressSanitizer
- The Bronze → Silver checklist
Chapter 7.3: Ad Hoc Problems
- What is ad hoc: no standard algorithm; requires problem-specific insight
- The ad hoc mindset: small cases → find pattern → prove invariant → implement
- 6 categories: observation/pattern, simulation shortcut, constructive, invariant/impossibility, greedy observation, geometry/grid
- Core techniques: parity arguments, pigeonhole, coordinate compression, symmetry reduction, think backwards
- 9 practice problems (Easy → Hard → Challenge) with hints
- Silver-level ad hoc patterns: observation + BFS/DP/binary search
Contest Day Checklist
Refer to this on contest day:
- Template compiled and tested
- Read ALL THREE problems before coding anything
- Work through examples by hand
- Identify constraints and appropriate algorithm tier
- Code the easiest problem first
- Test with sample cases before submitting
- For partial credit: code brute force for small cases if stuck
- With 30 min left: stop adding code, focus on testing
- Double-check: long long where needed? Array bounds correct?
🏆 USACO Tip: The best investment of time in the week before a contest is to re-solve 5-10 problems you've seen before, from memory. Speed + accuracy matter as much as knowledge.
Chapter 7.1: Understanding USACO
Before you can ace a competition, you need to understand how it works. This chapter covers everything about USACO's structure, rules, and scoring that you need to know to compete effectively.
7.1.1 What Is USACO?
The USA Computing Olympiad (USACO) is the premier competitive programming contest for pre-college students in the United States. Established in 1993, it selects the US team for the International Olympiad in Informatics (IOI).
Key facts:
- Completely free and open to anyone
- Taken from home, on your own computer
- Problems involve algorithms and data structures
- Not a math contest, not trivia — pure algorithmic thinking
7.1.2 Contest Format
Schedule
USACO holds 4 contests per year:
- December contest (typically first or second week)
- January contest
- February contest
- US Open (March/April) — a bit harder, 5 hours instead of 4
Each contest opens on a Friday; you choose when to start your 4-hour session, any time within the multi-day contest window.
Problems
Each contest has 3 problems. The time limit is 4 hours (US Open: 5 hours).
Input/Output
- Problems use file I/O OR standard I/O (newer contests use standard I/O)
- For file I/O: input from problem.in, output to problem.out
- Template for file I/O:
#include <bits/stdc++.h>
using namespace std;
int main() {
// Redirect cin/cout to files
freopen("problem.in", "r", stdin);
freopen("problem.out", "w", stdout);
ios_base::sync_with_stdio(false);
cin.tie(NULL);
// Your solution here
return 0;
}
Important: Starting from 2020, most USACO problems use standard I/O. Always check the problem statement!
7.1.3 The Four Divisions
USACO has four competitive divisions, each with distinct difficulty:
Visual: USACO Divisions Pyramid
The pyramid shows USACO's four divisions from entry-level Bronze at the base to elite Platinum at the top. Each tier requires mastery of the concepts below it. The percentages indicate roughly what fraction of contestants compete at each level.
🥉 Bronze
- Audience: Beginners with basic programming knowledge
- Algorithms: Simulation, brute force, basic loops, simple arrays
- Typical complexity: O(N²) or O(N³) for small N, sometimes O(N) with insights
- N constraints: Usually ≤ 1000 or very small
- Promotion threshold: Score 750/1000 or higher (exact threshold varies)
🥈 Silver
- Audience: Intermediate programmers
- Algorithms: Sorting, binary search, BFS/DFS, prefix sums, basic DP, greedy
- Typical complexity: O(N log N) or O(N)
- N constraints: Up to 10^5
- Promotion threshold: Score 750+/1000
🥇 Gold
- Audience: Advanced programmers
- Algorithms: Dijkstra, segment trees, advanced DP, network flow, LCA
- Typical complexity: O(N log N) to O(N log² N)
- N constraints: Up to 10^5 to 10^6
💎 Platinum
- Audience: Top competitors
- Algorithms: Difficult combinatorics, advanced data structures, geometry
- Top performers qualify for the USACO Finalist camp and possibly the IOI team (4 selected per year)
7.1.4 Scoring
How Scoring Works
Each problem has multiple test cases (typically 10–15). You earn partial credit for each test case you pass.
- Each problem is worth approximately 333 points
- Total: ~1000 points per contest
- Exact breakdown depends on the contest
The All-Or-Nothing Myth
People think you need the perfect solution. You don't! Partial credit from simpler cases (smaller N, special structures) can get you to 750+ for promotion. In Bronze especially, many partial credit strategies exist.
Partial Credit Strategies
If you can't solve a problem fully:
- Solve small cases: If N ≤ 20, brute force with O(N!) or O(2^N) often passes several test cases
- Solve special cases: If the graph is a tree, or all values are equal, solve those first
- Try a constant answer: If you think the answer is always "YES" or some fixed value, output it and see how many test cases pass
- Time out gracefully: Make sure your partial solution doesn't crash — a TLE is better than a runtime error for some OJs
7.1.5 Time Management in Contests
The 4-Hour Strategy
First 30 minutes: Read all 3 problems. Don't code yet. Just understand them and think.
- Identify which problem looks easiest
- Note any edge cases or trick conditions
- Start forming approaches in your head
Hours 1-2: Solve the easiest problem (usually problem 1 or 2).
- Implement, test against examples, debug
- Aim for 100% on at least one problem
Hours 2-3: Tackle the second-easiest problem.
- If stuck, consider partial credit approaches
Final hour: Either finish the third problem or consolidate/debug existing solutions.
- With 30 minutes left: stop adding new code; focus on testing and fixing bugs
Reading the Problem
Spend 5–10 minutes reading each problem before writing any code:
- Re-read the constraints (N, values, special conditions)
- Work through the examples manually on paper
- Think: "What algorithm does this remind me of?"
If You're Stuck
- Try small examples manually — what pattern do you see?
- Think about simpler versions: what if N=1? N=2? N=10?
- Consider: is this a graph problem? A DP? A sorting/greedy problem?
- Write brute force first — it might be fast enough, or it helps you understand the structure
7.1.6 Common Mistake Patterns
1. Off-by-One Errors
// Wrong: misses last element
for (int i = 0; i < n - 1; i++) { ... }
// Wrong: accesses arr[n] — out of bounds!
for (int i = 0; i <= n; i++) { cout << arr[i]; }
// Correct
for (int i = 0; i < n; i++) { ... } // 0-indexed
for (int i = 1; i <= n; i++) { ... } // 1-indexed
2. Integer Overflow
int a = 1e9, b = 1e9;
int wrong = a * b; // OVERFLOW
long long right = (long long)a * b; // Correct
3. Uninitialized Variables
int ans; // uninitialized — has garbage value!
// Always initialize:
int ans = 0;
int best = INT_MIN;
4. Wrong Answer on Empty Input / Edge Cases
// What if n = 0?
int maxVal = arr[0]; // crash if n = 0!
// Check: if (n == 0) { cout << 0; return 0; }
5. Using endl Instead of "\n"
// Slow (flushes buffer every time)
for (int i = 0; i < n; i++) cout << arr[i] << endl;
// Fast
for (int i = 0; i < n; i++) cout << arr[i] << "\n";
6. Forgetting to Handle All Cases
Read the problem carefully. "What if all cows have the same height?" "What if N=1?" Test these edge cases.
7.1.7 Bronze Problem Types Cheat Sheet
| Category | Description | Key Technique |
|---|---|---|
| Simulation | Follow instructions step by step | Implement carefully; use arrays/maps |
| Counting | Count elements satisfying some condition | Loops, prefix sums, hash maps |
| Geometry | Points, rectangles on a grid | Index carefully, avoid float errors |
| Sorting-based | Sort and check properties | std::sort + scan |
| String processing | Manipulate character sequences | String indexing, maps |
| Ad hoc | Clever observation, no standard algo | Read carefully, find the pattern (see Chapter 7.3) |
Chapter Summary
📌 Key Takeaways
| Topic | Key Points |
|---|---|
| Format | 4 contests per year, 4 hours each, 3 problems |
| Divisions | Bronze → Silver → Gold → Platinum |
| Scoring | ~1000 points per contest, need 750+ to advance |
| Partial credit | Brute force on small data still earns points |
| Time management | Read all problems first, start with the easiest |
| Common bugs | Overflow, off-by-one, uninitialized variables |
❓ FAQ
Q1: What language does USACO use? Is C++ recommended?
A: USACO supports C++, Java, Python. C++ is strongly recommended — it's the fastest (Python is 10-50x slower), with a rich STL. Java works too, but is ~2x slower than C++ and more verbose. This book uses C++ throughout.
Q2: How long does it take to advance from Bronze to Silver?
A: It varies. Students with programming background typically take 2-6 months (5-10 hours of practice per week). Complete beginners may need 6-12 months. The key is not the time, but effective practice — solve problems + read editorials + reflect.
Q3: Can you look things up online during the contest?
A: You can look up general reference materials (like C++ reference, algorithm tutorials), but cannot look up existing USACO editorials or get help from others. USACO is open-resource, but your work must be entirely your own.
Q4: Is there a penalty for wrong answers?
A: No. USACO allows unlimited resubmissions, and only the last submission counts. So submitting a partially correct solution first, then optimizing, is a smart strategy.
Q5: When should you give up on a problem and move to the next?
A: If you've been stuck on a problem for 40+ minutes with no new ideas, consider moving to the next. But before switching, submit your current code to get partial credit. Come back if you have time at the end.
🔗 Connections to Other Chapters
- Chapters 2.1-2.3 (Part 2) cover all C++ knowledge needed for Bronze
- Chapters 3.1-3.11 (Part 3) cover core data structures and algorithms for Silver
- Chapters 5.1-5.4 (Part 5) cover graph theory at the Silver/Gold boundary
- Chapters 4.1-4.2, 6.1-6.3 (Parts 4, 6) cover greedy and DP for Silver/Gold
- Chapter 7.2 continues this chapter with deeper problem-solving strategies and thinking methods
- Chapter 7.3 gives a full deep dive into ad hoc problems — the 10–15% of Bronze problems that require creative observation rather than standard algorithms
7.1.8 Complete Bronze Problem Taxonomy
Bronze problems fall into these 10 categories. Knowing the taxonomy helps you recognize patterns instantly.
| # | Category | Description | Key Approach | Example |
|---|---|---|---|---|
| 1 | Simulation | Follow given rules step by step | Implement carefully, use arrays | "Simulate N cows moving" |
| 2 | Counting / Iteration | Count elements satisfying a condition | Nested loops, prefix sums | "Count pairs with sum K" |
| 3 | Sorting + Scan | Sort, then scan with a simple check | std::sort + linear scan | "Find median, find closest pair" |
| 4 | Grid / 2D array | Process cells in a 2D grid | Index carefully, BFS/DFS | "Count connected regions" |
| 5 | String processing | Manipulate character sequences | String indexing, maps | "Find most frequent substring" |
| 6 | Brute Force Search | Try all possibilities | Nested loops over small N | "Try all subsets of ≤ 20 items" |
| 7 | Geometry (integer) | Points, rectangles on a grid | Integer arithmetic, no floats | "Area of overlapping rectangles" |
| 8 | Math / Modular | Number theory, patterns | Modular arithmetic, formulas | "Nth element of sequence" |
| 9 | Data Structure | Use the right container | Map, set, priority queue | "Who arrives first?" |
| 10 | Ad Hoc / Observation | Clever insight, no standard algo | Read carefully, find pattern | "Unique USACO-flavored problems" — see Chapter 7.3 for deep dive |
Bronze Category Breakdown (estimated frequency):
Simulation: ████████████ ~30%
Counting/Loops: ████████ ~20%
Sorting+Scan: ██████ ~15%
Grid/2D: █████ ~12%
Ad Hoc: █████ ~12%
Other: ████ ~11%
7.1.9 Silver Problem Taxonomy
Silver problems require more sophisticated algorithms. Here are the main categories:
| Category | Key Algorithms | N Constraint | Time Needed |
|---|---|---|---|
| Sorting + Greedy | Sort + sweep, interval scheduling | N ≤ 10^5 | O(N log N) |
| Binary Search | BS on answer, parametric search | N ≤ 10^5 | O(N log N) or O(N log² N) |
| BFS/DFS | Shortest path, components, flood fill | N ≤ 10^5 | O(N + M) |
| Prefix Sums | 1D/2D range queries, difference arrays | N ≤ 10^5 | O(N) |
| Basic DP | 1D DP, LIS, knapsack, grid paths | N ≤ 5000 | O(N²) or O(N log N) |
| DSU | Dynamic connectivity, Kruskal's MST | N ≤ 10^5 | O(N α(N)) |
| Graph + DP | DP on trees, DAG paths | N ≤ 10^5 | O(N) or O(N log N) |
Time Complexity Limits for USACO
This is crucial: USACO problems have tight time limits (typically 2–4 seconds). Use this table to determine the required algorithm complexity.
| N (input size) | Required Complexity | Allowed Algorithms |
|---|---|---|
| N ≤ 10 | O(N!) | Permutation brute force |
| N ≤ 20 | O(2^N × N) | Bitmask DP, full search |
| N ≤ 100 | O(N³) | Floyd-Warshall, interval DP |
| N ≤ 1,000 | O(N²) | Standard DP, pairwise |
| N ≤ 10,000 | O(N²) with a small constant | Optimized O(N²) sometimes OK |
| N ≤ 100,000 | O(N log N) | Sort, BFS, binary search, DSU |
| N ≤ 1,000,000 | O(N) | Linear algorithms, prefix sums |
| N ≤ 10^9 | O(log N) | Binary search, math formulas |
⚠️ Rule of thumb: ~10^8 simple operations per second. With N=10^5, O(N²) = 10^10 operations → TLE. You need O(N log N) or better.
7.1.10 How to Upsolve — When You're Stuck
"Upsolving" means solving a problem you couldn't solve during the contest, after looking at hints or the editorial. It's the most important skill for improving at USACO.
Step-by-Step Upsolving Process
Step 1: Struggle first (30–60 min)
- Don't look at the editorial immediately. Struggling builds intuition.
- Try small examples (N=2, N=3). What's the pattern?
- Think: "What algorithm does this smell like?"
Step 2: Get a hint, not the solution
- Look at just the first line of the editorial: "This is a BFS problem" or "Sort first."
- Try again with just that hint.
Step 3: Read the full editorial
- Read slowly. Understand why the algorithm works, not just what it does.
- Ask yourself: "What insight am I missing? Why didn't I think of this?"
Step 4: Implement from scratch
- Don't copy the editorial's code. Write it yourself.
- This is where real learning happens.
Step 5: Identify your gap
- Was the issue recognizing the algorithm type? → Study more problem patterns.
- Was the issue implementation? → Practice coding faster, learn STL better.
- Was the issue the observation/insight? → Practice thinking about properties and invariants.
Common Reasons People Get Stuck
| Reason | Fix |
|---|---|
| Don't recognize the algorithm | Study more patterns; classify every problem you solve |
| Know algorithm but can't implement | Code templates from memory daily |
| Algorithm is correct but wrong answer | Check edge cases: N=1, all same values, empty input |
| Algorithm is correct but TLE | Review complexity; look for unnecessary O(N) loops inside O(N) loops |
| Panicked during contest | Practice under timed conditions |
The "Algorithm Recognition" Mental Checklist
When reading a USACO problem, ask yourself:
1. What's N? (N≤20 → bitmask; N≤10^5 → O(N log N))
2. Is there a graph/grid? → BFS/DFS
3. Is there a "minimum/maximum subject to constraint"? → Binary search on answer
4. Can the problem be modeled as: "best subsequence"? → DP
5. "Minimize max" or "maximize min"? → Binary search or greedy
6. "Connect/disconnect" queries? → DSU
7. "Range queries"? → Prefix sums or segment tree
8. Seems combinatorial with small N? → Try all cases (bitmask or permutations)
7.1.11 USACO Patterns Cheat Sheet
| Pattern | Recognition Keywords | Algorithm | Example Problem |
|---|---|---|---|
| Shortest path grid | "minimum steps", "maze", "BFS" | BFS | Maze navigation |
| Nearest X to each cell | "closest fire", "distance to nearest" | Multi-source BFS | Fire spreading |
| Sort + scan | "close together", "largest gap" | Sort, adjacent pairs | Closest pair of cows |
| Binary search on answer | "maximize minimum distance", "minimize maximum" | BS + check | Aggressive Cows |
| Sliding window | "subarray sum", "contiguous", "window" | Two pointers | Max sum subarray of size K |
| Connected components | "regions", "islands", "groups" | DFS/BFS flood fill | Count farm regions |
| Dynamic connectivity | "union groups", "add connections" | DSU | Fence connectivity |
| Minimum spanning tree | "connect cheapest", "road network" | Kruskal's | Farm cable network |
| Counting pairs | "how many pairs satisfy" | Sort + two pointers or BS | Pairs with sum |
| 1D DP | "optimal sequence of decisions" | DP array | Coin change, LIS |
| Grid DP | "paths in grid", "rectangular regions" | 2D DP | Grid path max sum |
| Activity selection | "maximum non-overlapping events" | Sort by end time, greedy | Job scheduling |
| Prefix sum range query | "sum of range [l,r]", "2D rectangle sum" | Prefix sum | Range sum queries |
| Topological order | "prerequisites", "dependency order" | Topo sort | Course prerequisites |
| Bipartite check | "2-colorable", "odd cycle?" | BFS 2-coloring | Team division |
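To make one row of the table concrete: the sliding-window pattern for "max sum subarray of size K" keeps a running window sum instead of re-summing each subarray. A minimal sketch (the function name is ours):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Max sum over all contiguous subarrays of exactly k elements, O(N).
long long max_window_sum(const vector<long long>& a, int k) {
    long long sum = 0;
    for (int i = 0; i < k; i++) sum += a[i]; // sum of the first window
    long long best = sum;
    for (int i = k; i < (int)a.size(); i++) {
        sum += a[i] - a[i - k]; // slide right: add new element, drop old
        best = max(best, sum);
    }
    return best;
}
```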
7.1.12 Contest Strategy Refined
The First 5 Minutes Are Critical
Before writing a single line of code:
- Read all 3 problems (titles and constraints first)
- Estimate difficulty: Which is easiest? (Usually problem 1 at Bronze/Silver)
- Note key constraints: N ≤ ?, time limit, special conditions
- Mentally classify each problem using the taxonomy above
Partial Credit Strategy
Even if you can't solve a problem fully, earn partial credit:
Bronze (N ≤ ~1000 usually):
- Brute force O(N²) or O(N³) often passes several test cases
- "Solve small cases" approach: N ≤ 20 → brute force
Silver (N ≤ 10^5 usually):
- O(N²) solution often passes 4-6/15 test cases (partial credit!)
- Implement the brute force FIRST, then optimize
Always:
- Make sure your code compiles and runs (no runtime errors)
- Output something for every test case, even if wrong
- A wrong answer beats a crash
Debugging Checklist
Before submitting:
- Correct output for all given examples?
- Edge case: N=1?
- Integer overflow? (use long long when values > 10^9)
- Array out of bounds? (size arrays carefully)
- Off-by-one in loops?
- Using "\n" not endl?
- Reading correct number of test cases?
Chapter 7.2: Problem-Solving Strategies
Knowing algorithms is necessary but not sufficient. You also need to know how to think when facing a problem you've never seen before. This chapter teaches you a systematic approach.
7.2.1 How to Read a Competitive Programming Problem
USACO problems follow a consistent structure. Learn to parse it efficiently.
Problem Structure
- Story/Setup — a theme (usually cows 🐄). Mostly flavor text — don't get distracted.
- Task/Objective — the actual question. Read this very carefully.
- Input format — how to read the data.
- Output format — exactly what to print.
- Sample input/output — the examples.
- Constraints — the most important section for algorithm choice.
Reading Discipline
Step 1: Read the task/objective first. Then read input/output format.
Step 2: Read the constraints. These tell you:
- N ≤ 20 → maybe O(2^N) or O(N!)
- N ≤ 1000 → probably O(N²) or O(N² log N)
- N ≤ 10^5 → must be O(N log N) or O(N)
- N ≤ 10^6 → must be O(N) or O(N log N)
- Values up to 10^9 → might need long long
- Values up to 10^18 → definitely long long
Step 3: Work through the sample manually. Verify your understanding.
Step 4: Look for hidden constraints. "All values are distinct." "The graph is a tree." "N is even." These often unlock simpler solutions.
7.2.2 Identifying the Algorithm Type
After reading the problem, ask yourself these questions in order:
Visual: Problem-Solving Flowchart
The flowchart above captures the complete contest workflow. The key step is mapping input constraints to algorithm complexity — use the complexity table below to make that decision quickly.
Visual: Complexity vs Input Size
This reference table tells you immediately whether your chosen algorithm will pass. If N = 10⁵ and you have an O(N²) solution, it will TLE. This table should be your first mental check when designing an approach.
Question 1: Can I brute force it?
- If N ≤ 15, brute force all subsets: O(2^N)
- If N ≤ 8, try all permutations: O(N!)
- Even if brute force is too slow for full credit, it's good for partial credit and for verifying your correct solution
Question 2: Does it involve a grid or graph?
- Grid with shortest path question → BFS
- Grid/graph with connectivity → DFS or Union-Find
- Graph with weighted edges, shortest path → Dijkstra (Gold topic)
- Tree structure → Tree DP or LCA
Question 3: Does it involve sorted data?
- Finding closest elements → Sort + adjacent scan
- Range queries → Binary search or prefix sums
- "Can we achieve value X?" type question → Binary search on answer
Question 4: Does it involve optimal decisions over a sequence?
- "Maximum/minimum cost path" → DP
- "Maximum number of non-overlapping intervals" → Greedy
- "Minimum operations to transform X to Y" → BFS (if small state space) or DP
Question 5: Does it involve counting?
- Counting subsets → Bitmask DP (if small N) or combinatorics
- Counting paths in a DAG → DP
- Frequency of elements → Hash map
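The "binary search on the answer" idea from Question 3 deserves a template of its own. A minimal sketch — `smallest_feasible` and its `feasible` predicate are our own names, stand-ins for the problem-specific check:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Generic "binary search on the answer": find the smallest x in [lo, hi]
// for which feasible(x) is true, assuming feasibility is monotone
// (false, false, ..., true, true). feasible() encodes the problem's check.
long long smallest_feasible(long long lo, long long hi,
                            function<bool(long long)> feasible) {
    while (lo < hi) {
        long long mid = lo + (hi - lo) / 2;
        if (feasible(mid)) hi = mid;   // mid works -> answer is mid or smaller
        else lo = mid + 1;             // mid fails -> answer is larger
    }
    return lo;
}
```

For "maximize the minimum" problems, flip the predicate direction and search for the largest feasible value instead.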
The Algorithm Decision Tree
Is N ≤ 20?
├── YES → Try brute force (O(2^N) or O(N!))
└── NO
Is it a graph/grid problem?
├── YES
│ Is it about shortest path?
│ ├── YES (unweighted) → BFS
│ ├── YES (weighted) → Dijkstra (Gold)
│ └── NO (connectivity) → DFS / Union-Find
└── NO
Does sorting help?
├── YES → Sort + scan / binary search
└── NO
Does it have "overlapping subproblems"?
├── YES → Dynamic Programming
└── NO → Greedy / simulation
7.2.3 Testing with Examples
Always Test the Given Examples First
Before submitting, verify your solution produces exactly the right output for all provided examples.
# Compile
g++ -o sol solution.cpp -std=c++17
# Test with sample input
echo "5
3 1 4 1 5" | ./sol
# Or from file
./sol < sample.in
Create Your Own Test Cases
The provided examples are easy. Create:
- Minimum case: N=1, N=0, empty input
- Maximum case: N at max constraint, all values at max
- All same values: N elements all equal
- Already sorted / reverse sorted
- Special structures: Complete graph, path graph, star graph (for graph problems)
Stress Testing
Write a brute-force solution for small N, then compare against your optimized solution on random inputs:
# brute.cpp — simple O(N^3) solution
# sol.cpp — your O(N log N) solution
# stress_test.sh:
for i in {1..1000}; do
# Generate random test
python3 gen.py > test.in
# Run both solutions
./brute < test.in > expected.out
./sol < test.in > got.out
# Compare
if ! diff -q expected.out got.out > /dev/null; then
echo "MISMATCH on test $i"
cat test.in
exit 1
fi
done
echo "All tests passed!"
Stress testing catches subtle bugs that sample cases miss.
7.2.4 Debugging Tips for C++
Strategy 1: Print Everything
When something's wrong, add cerr statements to trace your program's execution. cerr goes to standard error (separate from standard output):
cerr << "At node " << u << ", dist = " << dist[u] << "\n";
cerr << "Array state: ";
for (int x : arr) cerr << x << " ";
cerr << "\n";
Why cerr and not cout? cout goes to standard output, where the judge checks your answer. cerr goes to standard error, which the judge usually ignores — so your debug output doesn't pollute your answer.
Strategy 2: Use assert for Invariants
assert(n >= 1 && n <= 100000); // crashes with a message if condition fails
assert(dist[v] >= 0); // check BFS invariant
Strategy 3: Check Array Bounds
Common out-of-bounds patterns:
int arr[100];
arr[100] = 5; // Bug! Valid indices are 0-99
// Use this to detect bounds issues while debugging:
// Compile with -fsanitize=address (AddressSanitizer)
// g++ -fsanitize=address,undefined -o sol sol.cpp
Strategy 4: Rubber Duck Debugging
Explain your code line by line, out loud or in writing. The act of explaining forces you to notice inconsistencies. Many bugs are found this way — not by staring at the screen, but by articulating what each line is supposed to do.
Strategy 5: Reduce the Problem
If your code fails on a large input, manually create the smallest input that still fails. Fix that. Repeat.
Strategy 6: Read Compiler Warnings
g++ -Wall -Wextra -o sol sol.cpp
The -Wall -Wextra flags enable all warnings. Read them! Uninitialized variables, unused variables, signed/unsigned mismatches — all common USACO bugs.
7.2.5 USACO-Specific Debugging
Check Your I/O
The #1 cause of Wrong Answer on correct algorithms: wrong input/output format.
- Did you read the right number of values?
- Are you printing the right number of lines?
- Is there a trailing space or missing newline?
Test Timing
To check if your solution is fast enough:
time ./sol < large_input.in
USACO typically allows 2–4 seconds. If your solution takes 10 seconds locally, it'll time out.
Estimate Complexity First
Before coding, calculate: "My algorithm is O(N²). N = 10^5. That's 10^10 operations. Way too slow."
Rough guide for what runs in 1 second with C++:
- 10^8 simple operations
- 10^7 complex operations (like map lookups)
- 10^5 × 10^3 = 10^8 for nested loops with simple body
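To time a specific section of your program rather than the whole binary, the standard `<chrono>` library works on any platform. A small helper, offered as a sketch (`time_ms` is our own name, not a standard function):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Wall-clock milliseconds taken by fn(). steady_clock is monotonic,
// which makes it the right clock for measuring durations.
double time_ms(const function<void()>& fn) {
    auto start = chrono::steady_clock::now();
    fn();
    auto end = chrono::steady_clock::now();
    return chrono::duration<double, milli>(end - start).count();
}
```

Usage: `cerr << time_ms([]{ /* hot loop here */ }) << " ms\n";` — print to cerr so the measurement doesn't pollute your answer.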
7.2.6 From Bronze to Silver Checklist
Use this checklist to evaluate your readiness for Silver:
Algorithms to Know
- Prefix sums (1D and 2D)
- Binary search (including on the answer)
- BFS and DFS on graphs and grids
- Union-Find (DSU)
- Sorting with custom comparators
- Basic DP (1D DP, 2D DP, knapsack)
- STL: map, set, priority_queue, vector, sort
Problem-Solving Skills
- Can identify whether a problem needs BFS vs. DFS vs. DP vs. Greedy
- Can implement BFS from scratch in 10 minutes
- Can implement DSU from scratch in 5 minutes
- Can model grid problems as graphs
- Knows how to binary search on the answer
- Comfortable with 2D arrays and grid traversal
Contest Skills
- Can write a clean template with fast I/O in 30 seconds
- Never forgets long long when needed
- Always tests with sample cases before submitting
- Can read and understand constraints quickly
- Has practiced at least 20 Bronze problems
- Has solved at least 5 Silver problems (even with hints)
Practice Plan
- Solve all easily available USACO Bronze problems (2016–2024)
- For each problem you can't solve in 2 hours: read editorial, implement from scratch
- After solving 30+ Bronze problems, attempt Silver: start with 2016–2018 Silver
- Keep a problem log: problem name, techniques used, key insight
7.2.7 Resources
Official
- USACO website: usaco.org — contest archive, editorials
- USACO training: train.usaco.org — old but good structured curriculum
Unofficial
- USACO Guide: usaco.guide — excellent community-written guide, highly recommended
- Codeforces: codeforces.com — more problems and contests
- AtCoder: atcoder.jp — high-quality educational problems
Books
- Competitive Programmer's Handbook by Antti Laaksonen — free PDF, excellent
- Introduction to Algorithms (CLRS) — the bible for theory (heavy reading)
Chapter Summary
📌 Key Takeaways
| Skill | Practice Until... |
|---|---|
| Reading | Understand the problem within 3 minutes |
| Algorithm ID | Guess the right approach 70%+ of the time |
| Implementation | Finish standard problems in ≤30 minutes |
| Debugging | Locate and fix bugs within 30 minutes |
| Testing | Develop the habit of testing edge cases before submitting |
🧩 "Problem-Solving Mindset" Quick Checklist
| Step | Question to Ask Yourself |
|---|---|
| 1. Check N range | N ≤ 20 → brute force/bitmask; N ≤ 10^5 → O(N log N) |
| 2. Graph/grid? | Yes → BFS/DFS/DSU |
| 3. Optimize a value? | "maximize minimum" or "minimize maximum" → binary search on answer |
| 4. Overlapping subproblems? | Yes → DP |
| 5. Sort then greedy? | Yes → Greedy |
| 6. Range queries? | Yes → prefix sum / segment tree |
❓ FAQ
Q1: What to do when you encounter a completely unfamiliar problem type?
A: ① First write a brute force for small data to get partial credit; ② Draw diagrams, manually compute small examples to find patterns; ③ Try simplifying the problem (if 2D, think about the 1D version first); ④ If still stuck, move to the next problem and come back later.
Q2: How to improve "problem recognition" ability?
A: Deliberate categorized practice. After each problem, record its "tags" (BFS, DP, greedy, binary search, etc.). After enough practice, you'll immediately associate similar constraints and keywords with the right algorithm. The Pattern Cheat Sheet in Chapter 7.1 of this book is a good starting point.
Q3: In a contest, should you write brute force first or go straight to the optimal solution?
A: Write brute force first. Brute force code usually takes only 5 minutes and serves three purposes: ① gets partial credit; ② helps you understand the problem; ③ can be used for stress testing to verify the optimal solution. Even if you're confident in your solution, it's recommended to write brute force first.
Q4: How to use stress testing for efficient debugging?
A: Write three programs: brute.cpp (correct brute force), sol.cpp (your optimized solution), gen.cpp (random data generator). Run them in a loop and compare outputs. When a discrepancy is found, that small test case is your debugging clue. This is the most powerful debugging technique in competitive programming.
🔗 Connections to Other Chapters
- The algorithm decision tree in this chapter covers the core algorithms from all chapters in this book
- Chapter 7.1 covers USACO contest rules and problem categories; this chapter covers "how to solve problems"
- The Bronze-to-Silver Checklist summarizes all knowledge points from Chapters 2.1–6.3
- The Stress Testing technique in this chapter can be applied to Practice Problems in all chapters
The journey from Bronze to Silver is about volume of practice combined with deliberate reflection. After each problem you solve — or fail to solve — ask: "What was the key insight? How do I recognize this type faster next time?"
Good luck, and enjoy the cows. 🐄
Chapter 7.3: Ad Hoc Problems
"Ad hoc" is Latin for "for this purpose." An ad hoc problem has no standard algorithm — you must invent a solution specifically for that problem.
Ad hoc problems are the most creative and often the most frustrating category in competitive programming. They don't fit neatly into "BFS" or "DP" or "greedy." Instead, they require you to observe a key property of the problem and exploit it directly.
At USACO Bronze, roughly 10–15% of problems are ad hoc. At Silver, they appear less frequently but are often the hardest problem on the set. Learning to recognize and solve them is a crucial skill.
7.3.1 What Is an Ad Hoc Problem?
Definition
An ad hoc problem is one where:
- No standard algorithm (BFS, DP, greedy, etc.) directly applies
- The solution relies on a clever observation or mathematical insight specific to the problem
- Once you see the key insight, the implementation is usually simple
How to Recognize Ad Hoc Problems
When reading a problem, if you ask yourself "What algorithm is this?" and the answer is "...none of the above," it's probably ad hoc.
Common signals:
- The problem involves a small, specific structure (e.g., a 3×3 grid, a sequence of length ≤ 10)
- The problem asks about a property that seems hard to compute directly
- The constraints are unusual (e.g., N ≤ 50, or values are very small)
- The problem has a "trick" that makes it much simpler than it looks
- The problem involves simulation but with a hidden shortcut
Ad Hoc vs. Other Categories
| Category | Key Feature | Example |
|---|---|---|
| Simulation | Follow rules step by step; no shortcut needed | "Simulate N cows moving for T steps" |
| Greedy | Local optimal choice leads to global optimum | "Schedule jobs to minimize lateness" |
| DP | Overlapping subproblems, optimal substructure | "Minimum coins to make change" |
| Ad Hoc | Clever observation eliminates brute force | "Find the pattern; implement it directly" |
💡 Key distinction: Simulation problems are also "ad hoc" in spirit, but they're straightforward to implement once understood. True ad hoc problems require an insight that isn't obvious from the problem statement.
7.3.2 The Ad Hoc Mindset
Solving ad hoc problems requires a different mental approach than algorithmic problems.
Step 1: Understand the Problem Deeply
Don't rush to code. Spend 5–10 minutes just thinking about the problem:
- What is the problem really asking?
- What makes this problem hard?
- What would make it easy?
Step 2: Try Small Cases
Work through examples with N = 2, 3, 4 by hand. Look for patterns:
- Does the answer follow a formula?
- Is there a symmetry or invariant?
- Can you reduce the problem to a simpler form?
Step 3: Look for Invariants
An invariant is a property that doesn't change as the problem evolves. Finding invariants often unlocks ad hoc solutions.
Example: In a problem where each move swaps two adjacent elements, every move changes the number of inversions by exactly ±1. So after K moves, the inversion parity has flipped K times; if the initial and target configurations' inversion parities don't match the parity of K, the transformation is impossible in exactly K moves.
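Computing the inversion-count parity is straightforward; a minimal O(N²) sketch, fine for the small N where invariant arguments usually appear:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Parity (0 or 1) of the number of inversions in a:
// pairs (i, j) with i < j but a[i] > a[j].
int inversion_parity(const vector<int>& a) {
    int inv = 0;
    for (size_t i = 0; i < a.size(); i++)
        for (size_t j = i + 1; j < a.size(); j++)
            if (a[i] > a[j]) inv ^= 1;   // track only the parity
    return inv;
}
```

Each adjacent swap changes the inversion count by exactly ±1, so K adjacent swaps can connect two arrangements only if their parities differ exactly when K is odd.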
Step 4: Consider the Extremes
- What happens when all values are equal?
- What happens when N = 1?
- What happens when all values are at their maximum?
Extreme cases often reveal the structure of the solution.
Step 5: Think About What You're Really Computing
Sometimes the problem description obscures a simpler underlying computation. Ask: "Is there a formula for this?"
7.3.3 Ad Hoc Problem Categories
Ad hoc problems at USACO Bronze/Silver fall into several recurring patterns:
Category 1: Observation / Pattern Finding
The key is to find a mathematical pattern or formula.
Typical structure: Given some sequence or structure, find a property that can be computed directly.
Example problem: You have N cows in a circle. Each cow either faces left or right. A cow is "happy" if it faces the same direction as both its neighbors. How many cows are happy?
Brute force: Check each cow's neighbors — O(N). This is already optimal, but the insight is recognizing that you just need to count "same-same-same" triples.
Category 2: Simulation with a Shortcut
The problem looks like a simulation, but the naive simulation is too slow. There's a mathematical shortcut.
Typical structure: "Repeat this operation T times" where T is huge (up to 10^9).
Key insight: The state space is finite, so the sequence must eventually cycle. Find the cycle length, then use modular arithmetic.
Example:
// Naive: simulate T steps — O(T), too slow if T = 10^9
// Smart: find cycle length C, then simulate T % C steps — O(C)
// next_state() and answer() are problem-specific helpers (assumed defined elsewhere)
int simulate(vector<int> state, int T) {
map<vector<int>, int> seen;
int step = 0;
while (step < T) {
if (seen.count(state)) {
int cycle_start = seen[state];
int cycle_len = step - cycle_start;
int remaining = (T - step) % cycle_len;
// simulate 'remaining' more steps
for (int i = 0; i < remaining; i++) {
state = next_state(state);
}
return answer(state);
}
seen[state] = step;
state = next_state(state);
step++;
}
return answer(state);
}
Category 3: Constructive / Build the Answer
Instead of searching for the answer, construct it directly.
Typical structure: "Find any configuration satisfying these constraints" or "Is it possible to achieve X?"
Key insight: Think about what constraints must be satisfied, then build a solution that satisfies them.
Example: Given N and K, construct a permutation of 1..N in which every pair of adjacent elements differs by at least K.
Insight: Interleave the two halves. With m = ⌊N/2⌋, output m+1, 1, m+2, 2, m+3, 3, ... Adjacent elements always differ by at least m, so this works whenever K ≤ ⌊N/2⌋.
Category 4: Invariant / Impossibility
Prove that something is impossible by finding an invariant that the target state violates.
Typical structure: "Can you transform state A into state B using these operations?"
Key insight: Find a quantity that is preserved (or changes in a predictable way) under each operation. If A and B have different values of this quantity, transformation is impossible.
Classic example: The 15-puzzle (sliding tiles). The solvability depends on the parity of the permutation combined with the blank tile's position.
Category 5: Greedy Observation
The problem looks like it needs DP, but a simple greedy observation makes it trivial.
Typical structure: Optimization problem where the greedy choice is non-obvious.
Example: You have N items with values v[i]. You can take at most K items. Maximize total value.
Obvious greedy: Sort by value descending, take top K. (This is trivial once you see it, but the problem might be disguised.)
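A sketch of that greedy in code — sort descending, sum the first K:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Take the K most valuable items: sort by value descending, sum the top K.
// Taking v by value lets us sort a local copy.
long long top_k_sum(vector<int> v, int k) {
    sort(v.begin(), v.end(), greater<int>());
    long long total = 0;
    for (int i = 0; i < k && i < (int)v.size(); i++) total += v[i];
    return total;
}
```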
Category 6: Geometry / Grid Observation
Problems on grids or with geometric constraints often have elegant observations.
Typical structure: Count something on a grid, or determine if a configuration is reachable.
Key insight: Often involves parity (checkerboard coloring), symmetry, or a clever coordinate transformation.
7.3.4 Worked Examples
Example 1: The Fence Painting Problem
Problem: Farmer John has a fence of length N. He paints the interval from a to b red and the interval from c to d blue. What total length of the fence is painted?
Naive approach: Use an array of size N, mark painted positions, count. O(N).
Ad hoc insight: The painted region is the union of two intervals. Use inclusion-exclusion:
- Painted = |[a,b]| + |[c,d]| - |[a,b] ∩ [c,d]|
- Intersection of [a,b] and [c,d] = [max(a,c), min(b,d)] if max(a,c) ≤ min(b,d), else 0
#include <bits/stdc++.h>
using namespace std;
int main() {
int a, b, c, d;
cin >> a >> b >> c >> d;
int red = b - a;
int blue = d - c;
// Intersection
int inter_start = max(a, c);
int inter_end = min(b, d);
int overlap = max(0, inter_end - inter_start);
cout << red + blue - overlap << "\n";
return 0;
}
Why this is ad hoc: The key insight (inclusion-exclusion on intervals) isn't a "standard algorithm" — it's a direct observation about the structure of the problem.
Example 2: Cow Lineup
Problem: N cows stand in a line. Each cow has a breed (integer 1 to K). Find the shortest contiguous subarray that contains at least one cow of every breed that appears in the array.
This looks like: Sliding window (Chapter 3.4). But wait — what if K is very large and most breeds appear only once?
Ad hoc insight: If a breed appears only once, the subarray must include that cow. So the answer must span from the leftmost "unique" cow to the rightmost "unique" cow. Then check if this span already contains all breeds.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
cin >> n;
vector<int> a(n);
map<int, int> cnt;
for (int i = 0; i < n; i++) {
cin >> a[i];
cnt[a[i]]++;
}
// Find breeds that appear exactly once
set<int> unique_breeds;
for (auto& [breed, c] : cnt) {
if (c == 1) unique_breeds.insert(breed);
}
if (unique_breeds.empty()) {
// Use sliding window for the general case
// ... (standard two-pointer approach)
} else {
// Must include all unique-breed cows
int lo = n, hi = -1;
for (int i = 0; i < n; i++) {
if (unique_breeds.count(a[i])) {
lo = min(lo, i);
hi = max(hi, i);
}
}
// Check if [lo, hi] contains all breeds
// ...
}
return 0;
}
Example 3: Cycle Detection in Simulation
Problem: Starting from a value X (X up to 10^18), repeatedly replace X with the sum of its digits. How many steps until you reach a single-digit number?
Naive approach: Simulate step by step. But what if it takes millions of steps?
Ad hoc insight: The sum of digits of a number ≤ 10^18 is at most 9×18 = 162. After one step, the value is ≤ 162. After two steps, it's ≤ 9+9 = 18. After three steps, it's a single digit. So the answer is at most 3 steps for any starting value!
#include <bits/stdc++.h>
using namespace std;
long long digit_sum(long long x) {
long long s = 0;
while (x > 0) { s += x % 10; x /= 10; }
return s;
}
int main() {
long long x;
cin >> x;
int steps = 0;
while (x >= 10) {
x = digit_sum(x);
steps++;
}
cout << steps << "\n";
return 0;
}
The insight: Recognizing that the value shrinks so rapidly that brute force is actually fast.
Example 4: Grid Coloring Invariant
Problem: You have an N×M grid. You can flip any 2×2 square (toggle all 4 cells between 0 and 1). Starting from all zeros, can you reach a target configuration?
Ad hoc insight: Color the grid like a checkerboard (black/white). Each 2×2 flip covers exactly 2 black and 2 white cells, so each flip changes the number of black 1-cells by −2, 0, or +2, and likewise for white. The parity of each count is therefore invariant. Starting from all zeros, both counts stay even forever.
If the target has an odd number of black 1-cells or an odd number of white 1-cells, it's impossible.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n, m;
cin >> n >> m;
vector<string> grid(n);
for (auto& row : grid) cin >> row;
int black_ones = 0, white_ones = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) {
if (grid[i][j] == '1') {
if ((i + j) % 2 == 0) black_ones++;
else white_ones++;
}
}
}
// Both must be even for the configuration to be reachable
if (black_ones % 2 == 0 && white_ones % 2 == 0) {
cout << "YES\n";
} else {
cout << "NO\n";
}
return 0;
}
7.3.5 Common Ad Hoc Techniques
Technique 1: Parity Arguments
Many impossibility results come from parity. If an operation always changes some quantity by an even amount, then the parity of that quantity is an invariant.
When to use: "Can you transform A into B?" problems.
How to apply:
- Identify what each operation does to some quantity Q
- If every operation changes Q by an even amount, then Q mod 2 is invariant
- If A and B have different Q mod 2, the answer is "impossible"
Technique 2: Pigeonhole Principle
If you have N+1 items in N categories, at least one category has ≥ 2 items.
When to use: "Prove that something must exist" or "find a guaranteed collision."
Example: In any sequence of N²+1 numbers, there exists either an increasing subsequence of length N+1 or a decreasing subsequence of length N+1 (Erdős–Szekeres theorem).
Technique 3: Coordinate Compression
When values are large but the number of distinct values is small, map values to indices 0, 1, 2, ...
vector<int> vals = {1000000, 3, 999, 42, 1000000};
sort(vals.begin(), vals.end());
vals.erase(unique(vals.begin(), vals.end()), vals.end());
// vals is now {3, 42, 999, 1000000}
// Map original value to compressed index:
auto compress = [&](int x) {
return lower_bound(vals.begin(), vals.end(), x) - vals.begin();
};
// compress(1000000) = 3, compress(3) = 0, etc.
Technique 4: Symmetry Reduction
If the problem has symmetry, you only need to consider one representative from each equivalence class.
Example: If the problem is symmetric under rotation, you can fix one element's position and only consider the remaining (N−1)! arrangements instead of N!.
Technique 5: Think Backwards
Sometimes it's easier to work backwards from the target state to the initial state.
Example: "What's the minimum number of operations to reach state B from state A?" might be easier as "What's the minimum number of reverse-operations to reach state A from state B?"
Technique 6: Reformulate the Problem
Restate the problem in a different form that reveals structure.
Example: "Find the maximum number of non-overlapping intervals" can be reformulated as "find the minimum number of points that 'stab' all intervals" (they're equivalent by LP duality — but you don't need to know that; just recognize the reformulation).
7.3.6 USACO Bronze Ad Hoc Examples
Here are patterns from actual USACO Bronze problems (paraphrased):
Pattern: "Minimum operations to sort"
Problem type: Given a sequence, find the minimum number of swaps/moves to sort it.
Key insight: Often the answer is N minus the length of the longest already-sorted subsequence, or related to the number of cycles in the permutation.
Cycle decomposition approach:
// For sorting a permutation with minimum swaps:
// Answer = N - (number of cycles in the permutation)
vector<int> perm = {3, 1, 4, 2}; // 1-indexed values
int n = perm.size();
vector<bool> visited(n, false);
int cycles = 0;
for (int i = 0; i < n; i++) {
if (!visited[i]) {
cycles++;
int j = i;
while (!visited[j]) {
visited[j] = true;
j = perm[j] - 1; // follow the permutation (0-indexed)
}
}
}
cout << n - cycles << "\n"; // minimum swaps
Pattern: "Reachability with constraints"
Problem type: Can you reach position B from position A, given movement rules?
Key insight: Often reduces to a parity or modular arithmetic condition.
Example: On a number line, you can move +3 or -5. Can you reach position T from position 0?
Insight: You can reach any position that is a multiple of gcd(3, 5) = 1, so you can reach any integer. But if the moves were +4 and +6, you can only reach multiples of gcd(4, 6) = 2.
#include <bits/stdc++.h>
using namespace std;
int main() {
int a, b, target;
cin >> a >> b >> target;
// Can reach target using moves +a and -b (or +b and -a)?
// Equivalent: can we write target = x*a - y*b for non-negative x, y?
// Key: target must be divisible by gcd(a, b)
if (target % __gcd(a, b) == 0) {
cout << "YES\n";
} else {
cout << "NO\n";
}
return 0;
}
Pattern: "Count valid configurations"
Problem type: Count the number of ways to arrange/assign things satisfying constraints.
Key insight: Often the constraints reduce the count dramatically. Look for what's forced.
Example: N cows, each either black or white. Constraint: no two adjacent cows are the same color. How many valid colorings?
Insight: Once you fix the first cow's color, the whole line is determined, so the answer is 2 for any N ≥ 1. (If the cows stand in a circle instead, the coloring must also close up consistently: 2 valid colorings when N is even, 0 when N ≥ 3 is odd.)
7.3.7 Practice Problems
🟢 Easy
P1. Fence Painting (USACO 2015 December Bronze, adapted) Farmer John paints fence posts a to b red, then posts c to d blue (blue overwrites red). How many posts end up red? How many end up blue?
💡 Hint
Use an array of size 100 (posts are numbered 1–100). Mark red posts, then mark blue posts (overwriting). Count each color.
Alternatively: red = (b − a + 1) − overlap and blue = d − c + 1, where overlap = max(0, min(b,d) − max(a,c) + 1).
P2. Digit Sum Steps Starting from integer X (1 ≤ X ≤ 10^9), repeatedly replace X with the sum of its digits until X < 10. How many steps does it take?
💡 Hint
Just simulate! The value drops so fast (sum of digits of a 9-digit number is at most 81) that you'll reach a single digit in at most 3 steps.
P3. Cow Checkerboard (ad hoc grid) An N×N grid (N ≤ 100) is colored like a checkerboard. You can swap any two adjacent cells (horizontally or vertically). Can you transform the initial configuration into the target configuration?
💡 Hint
Each swap exchanges a black cell with a white cell, so when a '1' moves, black_ones changes by ±1 exactly as white_ones changes by ∓1: the total number of '1' cells is invariant. Moreover, adjacent swaps can realize any rearrangement of the cells, so the transformation is possible if and only if the initial and target configurations contain the same number of '1' cells.
🟡 Medium
P4. Permutation Sorting Given a permutation of 1..N, find the minimum number of adjacent swaps to sort it.
💡 Hint
The minimum number of adjacent swaps equals the number of inversions in the permutation (pairs (i,j) where i < j but perm[i] > perm[j]). Count inversions using merge sort or a Fenwick tree in O(N log N).
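A sketch of the merge-sort approach from the hint (the Fenwick-tree version is an equally valid alternative):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Counts inversions in a[lo..hi) while merge-sorting it in place.
// When an element from the right half is placed before the remaining
// left-half elements, each of those remaining elements forms an
// inversion with it. O(N log N) overall.
long long count_inversions(vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    long long inv = count_inversions(a, lo, mid) + count_inversions(a, mid, hi);
    vector<int> merged;
    int i = lo, j = mid;
    while (i < mid || j < hi) {
        if (j == hi || (i < mid && a[i] <= a[j])) {
            merged.push_back(a[i++]);
        } else {
            inv += mid - i;            // a[j] jumps ahead of (mid - i) left elements
            merged.push_back(a[j++]);
        }
    }
    copy(merged.begin(), merged.end(), a.begin() + lo);
    return inv;
}
```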
P5. Cycle Simulation (USACO-style) A function f maps {1, ..., N} to itself. Starting from position 1, repeatedly apply f. After exactly K steps (K up to 10^18), where are you?
💡 Hint
Starting from 1, the sequence must eventually cycle (since the state space is finite). Find the cycle start and length using Floyd's algorithm or a visited array. Then use modular arithmetic to find the position after K steps.
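One possible implementation of the hint, using a first-seen table rather than Floyd's algorithm, and 0-indexed states for simplicity:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Position after applying f exactly K times starting from `start`,
// where f maps {0, ..., N-1} to itself and K can be up to 10^18.
// Record when each state is first visited; once a state repeats,
// reduce the remaining steps modulo the cycle length.
int position_after(const vector<int>& f, int start, long long K) {
    vector<long long> first_seen(f.size(), -1);
    vector<int> path;                          // path[t] = state after t steps
    int cur = start;
    long long step = 0;
    while (step < K && first_seen[cur] == -1) {
        first_seen[cur] = step;
        path.push_back(cur);
        cur = f[cur];
        step++;
    }
    if (step == K) return cur;                 // K reached before any repeat
    long long cycle_start = first_seen[cur];
    long long cycle_len = step - cycle_start;
    long long idx = cycle_start + (K - cycle_start) % cycle_len;
    return path[idx];
}
```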
P6. Rectangle Union Area Given M axis-aligned rectangles (M ≤ 100, coordinates ≤ 1000), find the total area covered (counting overlapping regions only once).
💡 Hint
Since coordinates are ≤ 1000, use a 1000×1000 boolean grid. Mark each cell covered by at least one rectangle. Count marked cells. O(M × max_coord²) = O(100 × 10^6) — might be tight; optimize by only iterating over each rectangle's area.
🔴 Hard
P7. Reachability on a Torus (invariant problem) On an N×M grid (with wraparound — a torus), you start at (0,0). Each step, you move either (+a, 0) or (0, +b) (mod N and mod M respectively). Can you reach every cell?
💡 Hint
You can reach cell (x, y) if and only if x is a multiple of gcd(a, N) and y is a multiple of gcd(b, M). You can reach every cell if and only if gcd(a, N) = 1 and gcd(b, M) = 1.
P8. Minimum Swaps to Group (USACO-style) N cows stand in a circle. Each cow is either type A or type B. You want all type-A cows to be contiguous. What is the minimum number of swaps (each swap exchanges any two cows) needed?
💡 Hint
Let K = number of type-A cows. Consider all windows of size K in the circular arrangement. For each window, count how many type-B cows are inside (these need to be swapped out). The answer is the minimum over all windows. This is O(N) with a sliding window.
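The window scan described in the hint can be sketched as follows, with the cows given as a string of 'A'/'B' and modulo indexing handling the wraparound:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Minimum swaps so all 'A's are contiguous on a circle: with K = #A's,
// every window of K consecutive positions is a candidate final block;
// each 'B' inside it must be swapped with an 'A' outside (one swap each).
int min_swaps_to_group(const string& cows) {
    int n = cows.size();
    int k = count(cows.begin(), cows.end(), 'A');
    if (k == 0 || k == n) return 0;            // already trivially grouped
    int b_in_window = 0;
    for (int i = 0; i < k; i++) b_in_window += (cows[i] == 'B');
    int best = b_in_window;
    for (int i = 1; i < n; i++) {              // slide the window circularly
        b_in_window -= (cows[i - 1] == 'B');
        b_in_window += (cows[(i + k - 1) % n] == 'B');
        best = min(best, b_in_window);
    }
    return best;
}
```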
🏆 Challenge
P9. Lights Out (classic ad hoc) You have a 5×5 grid of lights, each on or off. Pressing a light toggles it and all its orthogonal neighbors. Given an initial configuration, find the minimum number of presses to turn all lights off, or report it's impossible.
💡 Hint
Key insight: pressing a light twice is the same as not pressing it. So each light is either pressed 0 or 1 times. There are 2^25 ≈ 33 million possibilities — too many to brute force directly.
Better insight: once you decide the first row's presses (2^5 = 32 possibilities), the rest of the grid is forced (each subsequent row's presses are determined by whether the row above is fully off). Try all 32 first-row configurations and check if the last row ends up all-off.
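A sketch of that "chase the lights" idea — enumerate the 32 first-row press patterns, force the rest, and keep the cheapest pattern that clears the board:

```cpp
#include <bits/stdc++.h>
using namespace std;

// 5x5 Lights Out: returns the minimum number of presses to turn all
// lights off, or -1 if no first-row pattern leads to a solution.
// Every solution presses each cell 0 or 1 times, and the presses below
// row 0 are forced, so trying all 32 first rows finds the optimum.
int lights_out(array<array<int, 5>, 5> grid) {
    int best = -1;
    for (int mask = 0; mask < 32; mask++) {
        auto g = grid;                         // fresh copy per pattern
        int presses = 0;
        auto press = [&](int r, int c) {
            presses++;
            g[r][c] ^= 1;
            if (r > 0) g[r - 1][c] ^= 1;
            if (r < 4) g[r + 1][c] ^= 1;
            if (c > 0) g[r][c - 1] ^= 1;
            if (c < 4) g[r][c + 1] ^= 1;
        };
        for (int c = 0; c < 5; c++)
            if (mask & (1 << c)) press(0, c);
        for (int r = 1; r < 5; r++)
            for (int c = 0; c < 5; c++)
                if (g[r - 1][c]) press(r, c);  // forced: only (r,c) can fix it
        bool off = true;
        for (int c = 0; c < 5; c++) off = off && (g[4][c] == 0);
        if (off && (best == -1 || presses < best)) best = presses;
    }
    return best;
}
```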
7.3.8 Ad Hoc in USACO Silver
At Silver level, ad hoc problems are rarer but harder. They often combine an observation with a standard algorithm.
Silver Ad Hoc Patterns
| Pattern | Description | Example |
|---|---|---|
| Observation + BFS | Key insight reduces the state space, then BFS | "Cows can only move to cells of the same color" → BFS on reduced graph |
| Observation + DP | Insight reveals DP structure | "Optimal solution always has this property" → DP with that property |
| Observation + Binary Search | Insight makes the check function simple | "Answer is monotone" → binary search on answer |
| Pure observation | No standard algorithm needed | "The answer is always ⌈N/2⌉" |
How to Approach Silver Ad Hoc
- Don't panic when you can't identify the algorithm type
- Work small examples — N=2, N=3, N=4 — and look for patterns
- Ask: "What's special about this problem?" — what property makes it different from a generic version?
- Consider: "What if I could solve it for a simpler version?" — then generalize
- Trust your observations — if you notice a pattern in small cases, it's probably correct
Chapter Summary
📌 Key Takeaways
| Concept | Key Point |
|---|---|
| Definition | Ad hoc = no standard algorithm; requires problem-specific insight |
| Recognition | Can't identify algorithm type → probably ad hoc |
| Approach | Small cases → find pattern → prove it → implement |
| Invariants | Find quantities preserved by operations → prove impossibility |
| Simulation shortcut | Large T → find cycle → use modular arithmetic |
| Parity | Many impossibility results come from parity arguments |
| Constructive | Build the answer directly instead of searching |
🧩 Ad Hoc Problem-Solving Checklist
When you suspect a problem is ad hoc:
- Try N = 1, 2, 3, 4 — compute answers by hand
- Look for a formula — does the answer follow a simple pattern?
- Check parity — is there an invariant that rules out some configurations?
- Look for cycles — if simulating, does the state repeat?
- Consider the extremes — what if all values are equal? All maximum?
- Reformulate — can you restate the problem in a simpler way?
- Think backwards — is the reverse problem easier?
- Trust small-case patterns — if it works for N=2,3,4,5, it probably works in general
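The "look for cycles" step above can be packaged as a reusable shortcut: record when each state was first seen, and once a state repeats, skip whole cycles with modular arithmetic. A sketch (the transition function in the test is a toy example, not from any particular problem):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Apply `step` T times to `state`, but detect repeated states and jump
// ahead instead of simulating every step. Works whenever the state space
// is small enough that a repeat is guaranteed.
long long simulateWithCycle(long long state, long long T,
                            function<long long(long long)> step) {
    map<long long, long long> seen;       // state -> time first seen
    long long t = 0;
    while (t < T) {
        auto it = seen.find(state);
        if (it != seen.end()) {           // cycle found
            long long cycleLen = t - it->second;
            long long remaining = (T - t) % cycleLen;  // skip whole cycles
            for (long long i = 0; i < remaining; i++) state = step(state);
            return state;
        }
        seen[state] = t;
        state = step(state);
        t++;
    }
    return state;
}
```

The total work is bounded by the number of distinct states, not by T, which is what makes T as large as 10^18 tractable.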
❓ FAQ
Q1: How do I know if a problem is ad hoc or just a standard algorithm I haven't learned yet?
A: This is genuinely hard to tell. A good heuristic: if the problem has small constraints (N ≤ 100) and doesn't involve graphs, DP, or sorting in an obvious way, it's likely ad hoc. If N ≤ 10^5 and you can't identify the algorithm, you might be missing a standard technique — check the problem tags after solving.
Q2: I found the pattern in small cases but can't prove it. Should I just submit?
A: In a contest, yes — submit and move on. In practice, try to understand why the pattern holds. Unproven patterns sometimes fail on edge cases. But partial credit from a pattern-based solution is better than nothing.
Q3: Ad hoc problems feel impossible. How do I get better at them?
A: Practice is the only way. Solve 20–30 ad hoc problems, and after each one, write down: "What was the key insight? How could I have found it faster?" Over time, you'll build a library of techniques (parity, cycles, invariants, etc.) that you recognize in new problems.
Q4: Is there a systematic way to find invariants?
A: Yes. For each operation in the problem, ask: "What quantities does this operation change? By how much?" If an operation always changes quantity Q by a multiple of K, then Q mod K is an invariant. Common invariants: parity (mod 2), sum mod K, number of inversions mod 2.
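A toy illustration of this recipe (the operation is invented purely for demonstration): replacing (a, b) with (a+1, b+2) changes a+b by exactly 3, so (a+b) mod 3 is an invariant — a necessary (though not sufficient) condition for reachability:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Brute force: can (a,b) reach (c,d) by applying the toy operation
// (a,b) -> (a+1, b+2) up to maxOps times?
bool reachableBrute(int a, int b, int c, int d, int maxOps = 50) {
    for (int k = 0; k <= maxOps; k++)            // apply the operation k times
        if (a + k == c && b + 2*k == d) return true;
    return false;
}

// The invariant: each operation changes a+b by 3, so (a+b) mod 3 is
// preserved. If the sums differ mod 3, reachability is impossible.
bool invariantAllows(int a, int b, int c, int d) {
    return ((c + d) - (a + b)) % 3 == 0;
}
```

Whenever `invariantAllows` returns false, the brute force is guaranteed to fail too — that is exactly the impossibility argument the invariant buys you, without any search.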
🔗 Connections to Other Chapters
- Chapter 7.1 (Understanding USACO): Ad hoc is one of the 10 Bronze problem categories; this chapter gives it the depth it deserves
- Chapter 7.2 (Problem-Solving Strategies): The algorithm decision tree ends with "Greedy / simulation" — ad hoc problems fall outside the tree entirely
- Chapter 3.4 (Two Pointers): The sliding window technique appears in several ad hoc problems (e.g., P8 above)
- Chapter 3.2 (Prefix Sums): Many ad hoc counting problems use prefix sums as a sub-step
- Appendix E (Math Foundations): GCD, modular arithmetic, and number theory underpin many ad hoc insights
🐄 Final thought: Ad hoc problems are where competitive programming becomes an art. There's no formula — just careful observation, creative thinking, and the satisfaction of finding an elegant solution to a problem that seemed impossible. Embrace the struggle.
Appendix A: C++ Quick Reference
This appendix is your cheat sheet. Keep it handy during practice sessions. Everything here has been covered in the book; this is the condensed reference form.
A.1 The Competition Template
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
// freopen("problem.in", "r", stdin); // uncomment for file I/O (use actual problem name)
// freopen("problem.out", "w", stdout); // uncomment for file I/O
// Your code here
return 0;
}
A.2 Common Data Types
| Type | Size | Range | Use When |
|---|---|---|---|
| int | 32-bit | ±2.1 × 10^9 | Default integer |
| long long | 64-bit | ±9.2 × 10^18 | Large numbers, products |
| double | 64-bit | ~15 significant digits | Decimals |
| bool | 8-bit | true/false | Flags |
| char | 8-bit | −128 to 127 | Single characters |
| string | variable | any length | Text |
Safe maximum values:
INT_MAX = 2,147,483,647 ≈ 2.1 × 10^9
LLONG_MAX = 9,223,372,036,854,775,807 ≈ 9.2 × 10^18
A.3 STL Containers — Operations Cheat Sheet
vector<T>
vector<int> v; // empty
vector<int> v(n, 0); // n zeros
vector<int> v = {1,2,3}; // from list
v.push_back(x); // add to end — O(1) amortized
v.pop_back(); // remove last — O(1)
v[i] // access index i — O(1)
v.front() // first element
v.back() // last element
v.size() // number of elements
v.empty() // true if empty
v.clear() // remove all
v.resize(k, val) // resize to k, fill new with val
v.insert(v.begin()+i, x) // insert at index i — O(n)
v.erase(v.begin()+i) // remove at index i — O(n)
pair<A,B>
pair<int,int> p = {3, 5};
p.first // 3
p.second // 5
make_pair(a, b) // create pair
// Comparison: by .first, then .second
map<K,V>
map<string,int> m;
m[key] = val; // insert/update — O(log n)
m[key] // access (creates if absent!) — O(log n)
m.find(key) // iterator; .end() if not found — O(log n)
m.count(key) // 0 or 1 — O(log n)
m.erase(key) // remove — O(log n)
m.size() // number of entries
for (auto &[k,v] : m) // iterate in sorted key order
set<T>
set<int> s;
s.insert(x) // add — O(log n)
s.erase(x) // remove x if present (set elements are unique) — O(log n)
s.count(x) // 0 or 1 — O(log n)
s.find(x) // iterator — O(log n)
s.lower_bound(x) // first element >= x
s.upper_bound(x) // first element > x
*s.begin() // minimum element
*s.rbegin() // maximum element
stack<T>
stack<int> st;
st.push(x) // push — O(1)
st.pop() // pop (no return!) — O(1)
st.top() // peek at top — O(1)
st.empty() // true if empty
st.size() // count
queue<T>
queue<int> q;
q.push(x) // enqueue — O(1)
q.pop() // dequeue (no return!) — O(1)
q.front() // front element — O(1)
q.back() // back element — O(1)
q.empty()
q.size()
priority_queue<T> (max-heap)
priority_queue<int> pq; // max-heap
priority_queue<int, vector<int>, greater<int>> pq2; // min-heap
pq.push(x) // insert — O(log n)
pq.pop() // remove top — O(log n)
pq.top() // view top (max) — O(1)
pq.empty()
pq.size()
unordered_map<K,V> / unordered_set<T>
Same interface as map/set, but O(1) average (no ordered iteration).
A.4 STL Algorithms Cheat Sheet
// All assume #include <bits/stdc++.h>
// SORT
sort(v.begin(), v.end()); // ascending
sort(v.begin(), v.end(), greater<int>()); // descending
sort(v.begin(), v.end(), [](int a, int b){...}); // custom
// BINARY SEARCH (requires sorted container)
binary_search(v.begin(), v.end(), x) // bool: exists?
lower_bound(v.begin(), v.end(), x) // iterator to first >= x
upper_bound(v.begin(), v.end(), x) // iterator to first > x
// MIN/MAX
min(a, b) // minimum of two
max(a, b) // maximum of two
min({a, b, c}) // minimum of many (C++11)
*min_element(v.begin(), v.end()) // min of container
*max_element(v.begin(), v.end()) // max of container
// ACCUMULATE
accumulate(v.begin(), v.end(), 0LL) // sum (use 0LL for long long)
// FILL
fill(v.begin(), v.end(), x) // fill all with x
memset(arr, 0, sizeof(arr)) // zero a C-array (fast)
// REVERSE
reverse(v.begin(), v.end()) // reverse in place
// COUNT
count(v.begin(), v.end(), x) // count occurrences of x
// UNIQUE (removes consecutive duplicates — sort first!)
auto it = unique(v.begin(), v.end());
v.erase(it, v.end());
// SWAP
swap(a, b) // swap two values
// PERMUTATION (useful for brute force)
sort(v.begin(), v.end());
do {
// process current permutation
} while (next_permutation(v.begin(), v.end()));
// GCD / LCM (C++17)
gcd(a, b) // GCD — std::gcd from <numeric>
lcm(a, b) // LCM — std::lcm from <numeric>
// Legacy (pre-C++17): __gcd(a, b) // still works but prefer std::gcd
A.5 Time Complexity Reference Table
Visual: Complexity vs N Reference
This reference gives an at-a-glance feasibility check. When reading a problem, find N and the complexity of your planned algorithm to see whether it should pass within roughly 1 second (about 10^8 simple operations).
| N | Max feasible complexity | Algorithm tier |
|---|---|---|
| N ≤ 11 | O(N! × N) | All permutations |
| N ≤ 20 | O(2^N × N) | All subsets + linear work |
| N ≤ 500 | O(N³) | 3 nested loops, interval DP |
| N ≤ 5000 | O(N²) | 2 nested loops, O(N²) DP |
| N ≤ 10^5 | O(N log N) | Sort, BFS, binary search |
| N ≤ 10^6 | O(N) | Linear scan, prefix sums |
| N ≤ 10^8 | O(N) | Simple loop; bitset for a ~64× constant-factor speedup |
A.6 Common Pitfalls
Integer Overflow
// WRONG
int a = 1e9, b = 1e9;
int product = a * b; // overflow!
// CORRECT
long long product = (long long)a * b;
// WRONG
int n = 1e5;
int arr[n * n]; // n*n = 10^10, way too large
// Check: if any intermediate value might exceed 2 × 10^9, use long long
Off-by-One
// WRONG: accesses arr[n]
for (int i = 0; i <= n; i++) cout << arr[i];
// CORRECT
for (int i = 0; i < n; i++) cout << arr[i]; // 0-indexed
for (int i = 1; i <= n; i++) cout << arr[i]; // 1-indexed
// Prefix sum: P[i] = sum of first i elements
// Query sum from L to R (1-indexed): P[R] - P[L-1]
// NOT P[R] - P[L] ← off by one!
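The 1-indexed prefix-sum convention above, as a runnable sketch (`buildPrefix` is a name chosen here for illustration):

```cpp
#include <bits/stdc++.h>
using namespace std;

// 1-indexed prefix sums: P[i] = a[1] + ... + a[i], with P[0] = 0.
// Range sum over [L, R] is then P[R] - P[L-1] — valid even when L = 1.
vector<long long> buildPrefix(const vector<int>& a) {  // a[0] is unused padding
    int n = (int)a.size() - 1;
    vector<long long> P(n + 1, 0);
    for (int i = 1; i <= n; i++) P[i] = P[i-1] + a[i];
    return P;
}
```

Keeping the unused `a[0]` slot makes the indices line up with the formula, which is exactly what prevents the P[R] − P[L] off-by-one.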
Modifying Container While Iterating
// WRONG
for (auto it = s.begin(); it != s.end(); ++it) {
if (*it % 2 == 0) s.erase(it); // iterator invalidated!
}
// CORRECT
set<int> toErase;
for (int x : s) if (x % 2 == 0) toErase.insert(x);
for (int x : toErase) s.erase(x);
// ALSO CORRECT (C++11+): erase returns the next valid iterator
for (auto it = s.begin(); it != s.end(); )
    it = (*it % 2 == 0) ? s.erase(it) : next(it);
map Creating Entries on Access
map<string,int> m;
if (m["missing_key"]) // creates "missing_key" with value 0!
// CORRECT: check first
if (m.count("missing_key") && m["missing_key"]) // safe
// Or:
auto it = m.find("missing_key");
if (it != m.end() && it->second) { ... }
Double Comparison
double a = 0.1 + 0.2;
if (a == 0.3) // might be false due to floating point!
// CORRECT: use epsilon comparison
const double EPS = 1e-9;
if (abs(a - 0.3) < EPS) { ... }
Stack Overflow from Deep Recursion
// DFS on large graphs can cause stack overflow
// For trees with N = 10^5 nodes in a line (like a chain), recursion depth = 10^5
// Fix: increase stack size, or use iterative DFS
// On Linux/macOS, raise the limit before running:
// ulimit -s unlimited
// On Windows (MinGW), set the stack size at link time:
// g++ -Wl,--stack,268435456 sol.cpp
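The other fix — rewriting the recursion with an explicit stack — can be sketched like this (a generic adjacency-list DFS, not tied to any particular problem):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Iterative DFS with an explicit stack: no recursion, so a chain of
// 10^5 nodes cannot overflow the call stack. Returns nodes in visit order.
vector<int> iterativeDFS(const vector<vector<int>>& adj, int start) {
    vector<bool> visited(adj.size(), false);
    vector<int> order;
    stack<int> st;
    st.push(start);
    while (!st.empty()) {
        int u = st.top(); st.pop();
        if (visited[u]) continue;      // may be pushed more than once
        visited[u] = true;
        order.push_back(u);
        for (int v : adj[u]) if (!visited[v]) st.push(v);
    }
    return order;
}
```

Note the visit order can differ from recursive DFS (neighbors are popped in reverse push order), which is fine for connectivity and most Silver problems.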
A.7 Useful #define and typedef
// Common shortcuts (personal taste — don't overdo it)
typedef long long ll;
typedef pair<int,int> pii;
typedef vector<int> vi;
#define pb push_back
#define all(v) (v).begin(), (v).end()
#define sz(v) ((int)(v).size())
// Example usage:
ll x = 1e18;
pii p = {3, 5};
vi v = {1, 2, 3};
sort(all(v));
A.8 C++17 Useful Features
// Structured bindings — unpack pairs/tuples cleanly
auto [x, y] = make_pair(3, 5);
for (auto [key, val] : mymap) { ... }
// If with initializer
if (auto it = m.find(key); it != m.end()) {
// use it->second
}
// __gcd and gcd
int g = gcd(12, 8); // C++17: use std::gcd from <numeric>
int l = lcm(4, 6); // C++17: use std::lcm from <numeric>
// Compile with: g++ -std=c++17 -O2 -o sol sol.cpp
Appendix B: USACO Problem Set
This appendix provides a curated list of 20 USACO problems organized by topic. These problems are carefully selected to reinforce the techniques covered in this book. All are available for free on usaco.org.
How to Use This Problem Set
Work through these problems roughly in order. For each problem:
- Read the problem carefully and try to solve it independently for at least an hour — up to two if you're making progress
- If stuck, look at the hint below (not the full editorial)
- If still stuck after another 30 minutes, read the editorial on the USACO website
- After solving (or reading the editorial), implement the solution yourself from scratch
Learning happens most when you struggle and then understand — not when you read a solution passively.
Section 1: Simulation & Brute Force (Bronze)
Problem 1: Blocked Billboard
Contest: USACO 2017 December Bronze Topic: 2D geometry, rectangles Link: usaco.org — 2017 December Bronze
Description: Two billboards and a truck (all rectangles). Find the area of the billboards not covered by the truck.
Key Insight: Compute the intersection of the truck with each billboard. Area of billboard - area of intersection = visible area.
Techniques: 2D rectangle intersection, careful arithmetic Difficulty: ⭐⭐
Problem 2: The Cow-Signal
Contest: USACO 2016 February Bronze Topic: 2D array manipulation Link: usaco.org — 2016 February Bronze
Description: Given a pattern of characters in a K×L grid, "scale" it up by factor R (repeat each character R times in each direction).
Key Insight: Character at position (i,j) in the output comes from ((i-1)/R + 1, (j-1)/R + 1) in the input.
Techniques: 2D array indexing, integer division Difficulty: ⭐
Problem 3: Shell Game
Contest: USACO 2016 January Bronze Topic: Simulation Link: usaco.org — 2016 January Bronze
Description: Elsie plays a shell game. Track where a ball ends up after a sequence of swaps.
Key Insight: Track the ball's position through each swap. The pea starts under one of the three shells; try all three starting positions.
Techniques: Simulation, brute force over starting positions Difficulty: ⭐
Problem 4: Counting Haybales
Contest: USACO 2016 November Bronze Topic: Sorting, searching Link: usaco.org — 2016 November Bronze
Description: N haybales at positions. Q queries asking how many haybales are in range [A, B].
Key Insight: Sort haybale positions, then use binary search (lower_bound/upper_bound) for each query.
Techniques: Sorting, binary search Difficulty: ⭐⭐
Problem 5: Mowing the Field
Contest: USACO 2016 January Bronze Topic: Grid simulation Link: usaco.org — 2016 January Bronze
Description: FJ mows a field by following N instructions. Count how many cells he mows more than once.
Key Insight: Track all visited positions in a set/map. When a cell is visited again, it's double-mowed.
Techniques: Set/map for tracking visited cells, direction simulation Difficulty: ⭐⭐
Section 2: Arrays & Prefix Sums (Bronze/Silver)
Problem 6: Breed Counting
Contest: USACO 2015 December Bronze Topic: Prefix sums Link: usaco.org — 2015 December Bronze
Description: N cows each with breed 1, 2, or 3. Q queries: how many cows of breed B in range [L, R]?
Key Insight: Build a prefix sum array for each of the 3 breeds. Answer each query in O(1).
Techniques: Prefix sums, multiple arrays Difficulty: ⭐⭐
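A sketch of the key insight — one 1-indexed prefix array per breed (`BreedCounter` is a name chosen here; input parsing is omitted):

```cpp
#include <bits/stdc++.h>
using namespace std;

// P[i][b] = number of breed-b cows among cows 1..i.
// Each query (B, L, R) is then answered in O(1).
struct BreedCounter {
    vector<array<long long,4>> P;
    BreedCounter(const vector<int>& breeds) {   // breeds[0] unused; values 1..3
        int n = (int)breeds.size() - 1;
        P.assign(n + 1, {0, 0, 0, 0});
        for (int i = 1; i <= n; i++) {
            P[i] = P[i-1];                       // carry counts forward
            P[i][breeds[i]]++;                   // add cow i's breed
        }
    }
    long long query(int b, int l, int r) { return P[r][b] - P[l-1][b]; }
};
```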
Problem 7: Hoof, Paper, Scissors
Contest: USACO 2019 January Silver Topic: DP Link: usaco.org — 2019 January Silver
Description: Bessie plays N rounds of Hoof-Paper-Scissors. She can change her gesture at most K times. Maximize wins.
Key Insight: DP state: (round, changes used, current gesture). See Chapter 6.2 for full solution.
Techniques: 3D DP Difficulty: ⭐⭐⭐
Section 3: Sorting & Binary Search (Bronze/Silver)
Problem 8: Angry Cows
Contest: USACO 2016 February Bronze Topic: Sorting, simulation Link: usaco.org — 2016 February Bronze
Description: Cows placed on a number line. One cow fires a "blast" that spreads outward, setting off other cows. Find the minimum initial blast radius to set off all cows.
Key Insight: Binary search on the blast radius. For a given radius, simulate which cows get set off.
Techniques: Binary search on answer, sorting, simulation Difficulty: ⭐⭐⭐
Problem 9: Aggressive Cows
Contest: USACO 2011 March Silver Topic: Binary search on answer Link: usaco.org — 2011 March Silver
Description: N stalls at given positions. Place C cows to maximize the minimum distance between any two cows.
Key Insight: Binary search on the answer (minimum distance). For each candidate distance, greedily check if C cows can be placed.
Techniques: Binary search on answer, greedy check Difficulty: ⭐⭐⭐
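The greedy check combined with binary search on the answer can be sketched as follows (function names are mine, not from an official editorial):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Can C cows be placed in (sorted) stalls so every pair is >= d apart?
// Greedy: put the next cow in the leftmost stall far enough from the last.
bool canPlace(const vector<long long>& stalls, int C, long long d) {
    int placed = 1;                       // first cow in the first stall
    long long last = stalls[0];
    for (size_t i = 1; i < stalls.size() && placed < C; i++)
        if (stalls[i] - last >= d) { placed++; last = stalls[i]; }
    return placed >= C;
}

// Binary search on the answer: the check is monotone in d (a larger gap
// is strictly harder), so find the largest d for which canPlace is true.
long long maxMinDistance(vector<long long> stalls, int C) {
    sort(stalls.begin(), stalls.end());
    long long lo = 1, hi = stalls.back() - stalls.front(), ans = 0;
    while (lo <= hi) {
        long long mid = lo + (hi - lo) / 2;
        if (canPlace(stalls, C, mid)) { ans = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    return ans;
}
```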
Problem 10: Convention
Contest: USACO 2018 February Silver Topic: Binary search on answer + greedy Link: usaco.org — 2018 February Silver
Description: N cows arrive at times t[i] and board M buses of capacity C. Minimize the maximum waiting time.
Key Insight: Binary search on the maximum wait time. For each candidate, greedily assign cows to buses.
Techniques: Binary search on answer, greedy simulation, sorting Difficulty: ⭐⭐⭐
Section 4: Graph Algorithms (Silver)
Problem 11: Closing the Farm
Contest: USACO 2016 January Silver Topic: DSU (Union-Find), offline processing Link: usaco.org — 2016 January Silver
Description: A farm has N fields and M paths. Remove fields one by one. After each removal, determine if the remaining fields are still all connected.
Key Insight: Reverse the process — add fields in reverse order. Use DSU to track connectivity as fields are added.
Techniques: DSU, reverse processing Difficulty: ⭐⭐⭐
Problem 12: Moocast
Contest: USACO 2016 February Silver Topic: DSU / BFS Link: usaco.org — 2016 February Silver
Description: N cows on a field. Cow i has walkie-talkie range p[i]. Can cow i directly contact cow j? Find the minimum range such that all cows can communicate (directly or via relays).
Key Insight: Binary search on the minimum range. For a given range, build a graph and check connectivity.
Techniques: Binary search on answer, BFS/DFS connectivity, or Kruskal's MST Difficulty: ⭐⭐⭐
Problem 13: BFS Shortest Path
Contest: USACO 2016 February Bronze: Milk Pails (modified) Topic: BFS on state space Link: usaco.org — 2016 February Bronze
Description: Two buckets with capacities X and Y. Fill/empty/pour operations. Find minimum operations to get exactly M liters in either bucket.
Key Insight: Model (amount in bucket 1, amount in bucket 2) as a graph state. BFS finds the minimum operations.
Techniques: BFS on state graph Difficulty: ⭐⭐⭐
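A sketch of the state-space BFS (simplified — the operation set of the actual contest problem may differ slightly from the six moves modeled here):

```cpp
#include <bits/stdc++.h>
using namespace std;

// State = (amount in pail 1, amount in pail 2), capacities X and Y.
// BFS from (0,0); returns min operations to get exactly M in either
// pail, or -1 if M is unreachable.
int minOps(int X, int Y, int M) {
    vector<vector<int>> dist(X + 1, vector<int>(Y + 1, -1));
    queue<pair<int,int>> q;
    dist[0][0] = 0; q.push({0, 0});
    while (!q.empty()) {
        auto [a, b] = q.front(); q.pop();
        if (a == M || b == M) return dist[a][b];
        int pourAB = min(a, Y - b), pourBA = min(b, X - a);
        pair<int,int> nxt[] = {
            {X, b}, {a, Y},                  // fill either pail
            {0, b}, {a, 0},                  // empty either pail
            {a - pourAB, b + pourAB},        // pour pail 1 into pail 2
            {a + pourBA, b - pourBA},        // pour pail 2 into pail 1
        };
        for (auto [na, nb] : nxt)
            if (dist[na][nb] == -1) {
                dist[na][nb] = dist[a][b] + 1;
                q.push({na, nb});
            }
    }
    return -1;
}
```

Only (X+1)(Y+1) states exist, so the BFS is tiny even for generous capacities.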
Problem 14: Grass Cownoisseur
Contest: USACO 2015 December Silver Topic: SCC (Strongly Connected Components), BFS on DAG Link: usaco.org — 2015 December Silver
Description: Directed graph of pastures. Bessie can reverse one edge for free. Find the maximum number of pastures reachable in a round trip from pasture 1.
Key Insight: Contract SCCs into super-nodes, then BFS on the DAG. For each edge that could be reversed, check improvement.
Techniques: SCC, BFS, graph contraction Difficulty: ⭐⭐⭐⭐ (Gold-level thinking, Silver contest)
Section 5: Dynamic Programming (Silver)
Problem 15: Rectangular Pasture
Contest: USACO 2021 January Silver Topic: 2D prefix sums, DP Link: usaco.org — 2021 January Silver
Description: N cows on a 2D grid (all at distinct x and y coordinates). Count the number of axis-aligned rectangles that contain exactly K cows.
Key Insight: Sort by x, then for each pair of columns, use a DP over rows. 2D prefix sums for fast rectangle counting.
Techniques: 2D prefix sums, combinatorics Difficulty: ⭐⭐⭐
Problem 16: Lemonade Line
Contest: USACO 2017 February Bronze Topic: Greedy Link: usaco.org — 2017 February Bronze
Description: N cows. Cow i will join a lemonade line if there are at most p[i] cows already in line. Find the maximum number of cows in line.
Key Insight: Sort cows by patience (p[i]) in decreasing order. Greedily add each cow if possible.
Techniques: Sorting, greedy Difficulty: ⭐⭐
Problem 17: Tallest Cow
Contest: USACO 2016 February Silver Topic: Difference arrays Link: usaco.org — 2016 February Silver
Description: N cows in a line. H[i] is the height of cow i. Given pairs (A, B) meaning cow A can see cow B (implies all cows between them are shorter), find maximum possible height of each cow.
Key Insight: Use difference arrays to track height constraints. For each (A, B) pair, all cows strictly between A and B must be shorter than both.
Techniques: Difference arrays, prefix sums Difficulty: ⭐⭐⭐
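A sketch of the difference-array technique (it assumes the maximum height H is given, lowers every cow strictly between a sightline pair by 1, and deduplicates repeated pairs; names are mine):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Start every cow at height H; each sightline (A,B) forces all cows
// strictly between A and B down by 1. A difference array makes each
// range update O(1), resolved by a single prefix pass at the end.
vector<int> maxHeights(int n, int H, const vector<pair<int,int>>& pairs) {
    vector<int> diff(n + 2, 0);                  // 1-indexed cows
    set<pair<int,int>> seen;                     // skip duplicate sightlines
    for (auto [a, b] : pairs) {
        if (a > b) swap(a, b);
        if (!seen.insert({a, b}).second) continue;
        diff[a + 1] -= 1;                        // lower cows a+1 .. b-1
        diff[b]     += 1;
    }
    vector<int> h(n + 1);                        // h[0] unused
    int cur = 0;
    for (int i = 1; i <= n; i++) {
        cur += diff[i];
        h[i] = H + cur;
    }
    return h;
}
```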
Section 6: Mixed (Silver)
Problem 18: Balancing Act
Contest: USACO 2018 January Silver Topic: Tree DP, centroid Link: usaco.org — 2018 January Silver
Description: Find the "centroid" of a tree — the node whose removal creates the most balanced partition (minimizes the size of the largest remaining component).
Key Insight: Compute subtree sizes via DFS. For each node, the largest component when it's removed is max(subtree size of each child, N - subtree size of this node).
Techniques: Tree DP, subtree sizes Difficulty: ⭐⭐⭐
Problem 19: Concatenation Nation
Contest: USACO 2016 January Bronze Topic: String manipulation, sorting Link: usaco.org — 2016 January Bronze
Description: Given N strings, for each pair (i, j) with i < j, form the string s_i + s_j. Count how many such concatenated strings are palindromes.
Key Insight: Check each pair; O(N² × L) where L is string length. For N ≤ 1000, this works.
Techniques: String manipulation, palindrome check Difficulty: ⭐⭐
Problem 20: Berry Picking
Contest: USACO 2020 January Silver Topic: Greedy, DP Link: usaco.org — 2020 January Silver
Description: Bessie picks berries from N trees. She has K baskets; each basket can hold berries from only one tree. Maximize total berries given that each basket in a group must hold the same amount.
Key Insight: Optimal: use K/2 baskets for Bessie, K/2 for Elsie. Sort trees. For each possible basket-size for Elsie's trees, binary search to find Bessie's optimal allocation.
Techniques: Sorting, binary search, greedy Difficulty: ⭐⭐⭐⭐
Quick Reference: Problems by Technique
| Technique | Problems |
|---|---|
| Simulation | 1, 2, 3, 5 |
| Sorting | 4, 8, 9, 10, 16 |
| Prefix Sums | 6, 17 |
| Binary Search | 4, 8, 9, 10, 12 |
| BFS / DFS | 13, 14 |
| Union-Find | 11, 12 |
| Dynamic Programming | 7, 15, 18, 20 |
| Greedy | 16, 20 |
| String / Ad hoc | 19 |
Tips for Practicing
- Use the USACO training gate at train.usaco.org for auto-grading
- Read editorials at usaco.org after each problem — even for problems you solved
- Keep a problem journal — write the key insight for each problem you solve
- Difficulty progression: do easy problems from recent years, then medium from older years
Additional Problem Sources
| Source | URL | Best For |
|---|---|---|
| USACO Archive | usaco.org | USACO-specific practice |
| USACO Guide | usaco.guide | Structured curriculum with problems |
| Codeforces | codeforces.com | Volume practice, diverse problems |
| AtCoder Beginner | atcoder.jp | High-quality beginner problems |
| LeetCode | leetcode.com | Data structure fundamentals |
| CSES | cses.fi/problemset | Classic algorithm problems |
CSES Problem Set at cses.fi/problemset is especially recommended — it has ~300 carefully curated problems covering all USACO Silver topics, auto-graded, free.
Appendix C: C++ Competitive Programming Tricks
This appendix collects the most useful C++ tricks, macros, templates, and code snippets that competitive programmers use daily. These techniques can save significant time in contests and help your code run faster.
C.1 Fast I/O
The most important performance optimization for I/O-heavy problems:
// Always include these at the start of main()
ios_base::sync_with_stdio(false); // disconnect C and C++ I/O streams
cin.tie(NULL); // untie cin from cout
// Why they help:
// sync_with_stdio(false): by default, C++ syncs with C I/O (printf/scanf)
// for compatibility. Turning this off makes cin/cout much faster.
// cin.tie(NULL): by default, cin flushes cout before each read.
// Untying eliminates this unnecessary flush.
File I/O (USACO traditional problems):
freopen("problem.in", "r", stdin); // redirect cin to file (replace "problem" with actual name)
freopen("problem.out", "w", stdout); // redirect cout to file
// After these lines, cin/cout work as normal but read/write files
// Example: for "Blocked Billboard", use "billboard.in" / "billboard.out"
Even faster: manual reading with getchar_unlocked (Linux):
inline int readInt() {
int x = 0; bool neg = false;
char c = getchar_unlocked();
while (c != '-' && (c < '0' || c > '9')) c = getchar_unlocked();
if (c == '-') { neg = true; c = getchar_unlocked(); }
while (c >= '0' && c <= '9') { x = x*10 + c-'0'; c = getchar_unlocked(); }
return neg ? -x : x;
}
// Typically 3-5× faster than cin for large integer inputs
C.2 Common Macros and Typedefs
// Shorter type names
typedef long long ll;
typedef unsigned long long ull;
typedef long double ld;
typedef pair<int,int> pii;
typedef pair<ll,ll> pll;
typedef vector<int> vi;
typedef vector<ll> vll;
typedef vector<pii> vpii;
// Shorthand operations
#define pb push_back
#define pf push_front
#define all(v) (v).begin(), (v).end()
#define rall(v) (v).rbegin(), (v).rend()
#define sz(v) ((int)(v).size())
#define fi first
#define se second
// Loop macros (use sparingly — can hurt readability)
#define FOR(i, a, b) for(int i = (a); i < (b); i++)
#define REP(i, n) FOR(i, 0, n)
// Min/max shortcuts
#define chmin(a, b) a = min(a, b)
#define chmax(a, b) a = max(a, b)
// Usage examples:
// vi v; v.pb(5); → v.push_back(5)
// sort(all(v)); → sort(v.begin(), v.end())
// cout << sz(v) << "\n"; → cout << (int)v.size() << "\n"
// FOR(i, 1, n+1) { ... } → for(int i = 1; i < n+1; i++) { ... }
C.3 GCC Pragmas for Speed
// These pragmas can give 2-4× speedup on GCC compilers (used on USACO judges)
#pragma GCC optimize("O3,unroll-loops")
#pragma GCC target("avx2,bmi,bmi2,popcnt")
// Place these BEFORE #include lines
// Warning: "O3" and "avx2" may cause subtle numerical differences
// (usually fine for integer problems, be careful with floating point)
// Safer version (just O2 without vector instructions):
#pragma GCC optimize("O2")
// Full competitive template with pragmas:
#pragma GCC optimize("O3,unroll-loops")
#pragma GCC target("avx2")
#include <bits/stdc++.h>
using namespace std;
// ... rest of your code
C.4 Useful Math: GCD, LCM, Modular Arithmetic
#include <bits/stdc++.h>
using namespace std;
// ─── GCD and LCM ───────────────────────────────────────────────────────────
// C++17: std::gcd and std::lcm from <numeric>
#include <numeric>
int g = gcd(12, 8); // 4
int l = lcm(4, 6); // 12
// C++14 and earlier: __gcd from <algorithm>
int g2 = __gcd(12, 8); // 4
long long l2 = 4LL / __gcd(4, 6) * 6; // 12 (careful: divide first to avoid overflow)
// Custom GCD (Euclidean algorithm):
ll mygcd(ll a, ll b) { return b ? mygcd(b, a%b) : a; }
ll mylcm(ll a, ll b) { return a / mygcd(a,b) * b; } // divide first!
// ─── Modular Arithmetic ─────────────────────────────────────────────────────
const ll MOD = 1e9 + 7; // standard USACO/Codeforces modulus
// Add: (a + b) % MOD
ll addmod(ll a, ll b) { return (a + b) % MOD; }
// Subtract: (a - b + MOD) % MOD ← always add MOD before % to avoid negatives
ll submod(ll a, ll b) { return (a - b + MOD) % MOD; }
// Multiply: (a * b) % MOD
ll mulmod(ll a, ll b) { return (a % MOD) * (b % MOD) % MOD; }
// Power: a^b mod MOD using fast exponentiation — O(log b)
ll power(ll base, ll exp, ll mod = MOD) {
ll result = 1;
base %= mod;
while (exp > 0) {
if (exp & 1) result = result * base % mod; // odd exponent
base = base * base % mod; // square
exp >>= 1; // halve exponent
}
return result;
}
// Modular inverse (a^{-1} mod p, where p is prime):
ll modinv(ll a, ll mod = MOD) { return power(a, mod-2, mod); }
// This uses Fermat's little theorem: a^{p-1} ≡ 1 (mod p) for prime p
// So a^{-1} ≡ a^{p-2} (mod p)
// Modular division: (a / b) mod p = (a * b^{-1}) mod p
ll divmod(ll a, ll b) { return mulmod(a, modinv(b)); }
// Example: C(n, k) mod p using precomputed factorials
const int MAXN = 200001;
ll fact[MAXN], inv_fact[MAXN];
void precompute_factorials() {
fact[0] = 1;
for (int i = 1; i < MAXN; i++) fact[i] = fact[i-1] * i % MOD;
inv_fact[MAXN-1] = modinv(fact[MAXN-1]);
for (int i = MAXN-2; i >= 0; i--) inv_fact[i] = inv_fact[i+1] * (i+1) % MOD;
}
ll C(int n, int k) {
if (k < 0 || k > n) return 0;
return fact[n] * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
}
C.5 Useful Code Snippets
Disjoint Set Union (DSU / Union-Find) Template
// DSU — complete template with size tracking
struct DSU {
vector<int> parent, sz;
DSU(int n) : parent(n+1), sz(n+1, 1) {
iota(parent.begin(), parent.end(), 0); // parent[i] = i
}
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]); // path compression
return parent[x];
}
bool unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return false; // already same component
if (sz[x] < sz[y]) swap(x, y); // union by size
parent[y] = x;
sz[x] += sz[y];
return true; // successfully merged
}
bool connected(int x, int y) { return find(x) == find(y); }
int size(int x) { return sz[find(x)]; } // size of x's component
};
// Usage:
DSU dsu(n);
dsu.unite(1, 2);
cout << dsu.connected(1, 3) << "\n"; // 0 (false)
cout << dsu.size(1) << "\n"; // 2
Segment Tree (Point Update, Range Query)
// Segment Tree — supports:
// point_update(i, val): set position i to val
// query(l, r): sum of [l, r]
// All operations O(log N)
struct SegTree {
int n;
vector<ll> tree;
SegTree(int n) : n(n), tree(4*n, 0) {}
void update(int node, int start, int end, int idx, ll val) {
if (start == end) {
tree[node] = val;
return;
}
int mid = (start + end) / 2;
if (idx <= mid) update(2*node, start, mid, idx, val);
else update(2*node+1, mid+1, end, idx, val);
tree[node] = tree[2*node] + tree[2*node+1]; // merge
}
ll query(int node, int start, int end, int l, int r) {
if (r < start || end < l) return 0; // out of range
if (l <= start && end <= r) return tree[node]; // fully in range
int mid = (start + end) / 2;
return query(2*node, start, mid, l, r)
+ query(2*node+1, mid+1, end, l, r);
}
void update(int i, ll val) { update(1, 1, n, i, val); }
ll query(int l, int r) { return query(1, 1, n, l, r); }
};
// Usage:
SegTree st(n);
st.update(3, 10); // set position 3 to 10
cout << st.query(1, 5); // sum of positions 1..5
BFS Template
// Grid BFS — shortest path in unweighted grid
int bfs_grid(vector<string>& grid, int sr, int sc, int er, int ec) {
int R = grid.size(), C = grid[0].size();
vector<vector<int>> dist(R, vector<int>(C, -1));
queue<pair<int,int>> q;
int dr[] = {-1, 1, 0, 0};
int dc[] = {0, 0, -1, 1};
dist[sr][sc] = 0;
q.push({sr, sc});
while (!q.empty()) {
auto [r, c] = q.front(); q.pop();
for (int d = 0; d < 4; d++) {
int nr = r + dr[d], nc = c + dc[d];
if (nr >= 0 && nr < R && nc >= 0 && nc < C
&& grid[nr][nc] != '#' && dist[nr][nc] == -1) {
dist[nr][nc] = dist[r][c] + 1;
q.push({nr, nc});
}
}
}
return dist[er][ec];
}
Binary Search on Answer Template
// Binary search on answer — maximize X such that check(X) is true
// Precondition: check is monotone (false...false...true...true)
template<typename T, typename F>
T binary_search_ans(T lo, T hi, F check) {
T ans = lo; // or -1 if no valid answer
while (lo <= hi) {
T mid = lo + (hi - lo) / 2;
if (check(mid)) { ans = mid; lo = mid + 1; }
else { hi = mid - 1; }
}
return ans;
}
// Usage example: find max D such that canPlace(D) is true
int result = binary_search_ans(1, maxDist, canPlace);
C.6 Built-in Functions Worth Knowing
// ─── Integer operations ─────────────────────────────────────────────────────
__builtin_popcount(x) // count set bits in x (int)
__builtin_popcountll(x) // count set bits in x (long long)
__builtin_clz(x) // count leading zeros (int, x > 0)
__builtin_ctz(x) // count trailing zeros (int, x > 0)
// Examples:
__builtin_popcount(0b1011) == 3 // three 1-bits
__builtin_ctz(0b1000) == 3 // three trailing zeros
__builtin_clz(1) == 31 // 31 leading zeros (for 32-bit int)
(31 - __builtin_clz(x)) // floor(log2(x))
// ─── Bit tricks ─────────────────────────────────────────────────────────────
// Check if x is a power of 2:
bool isPow2 = (x > 0) && !(x & (x-1));
// Extract lowest set bit:
int lsb = x & (-x);
// Turn off lowest set bit:
x = x & (x-1);
// Iterate all subsets of a bitmask (for bitmask DP):
for (int sub = mask; sub > 0; sub = (sub-1) & mask) {
// process subset 'sub' of 'mask'
}
// ─── Useful STL functions ────────────────────────────────────────────────────
// next_permutation: iterate all permutations
sort(v.begin(), v.end()); // start from sorted order
do {
// v is current permutation
} while (next_permutation(v.begin(), v.end()));
// __gcd: greatest common divisor (available before C++17)
int g = __gcd(a, b);
// std::gcd, std::lcm (C++17 <numeric>):
#include <numeric>
int g = gcd(a, b);
int l = lcm(a, b);
C.7 The Full Competition Template
// ────────────────────────────────────────────────────────────────────────────
// Competitive Programming Template — C++17
// ────────────────────────────────────────────────────────────────────────────
#pragma GCC optimize("O2")
#include <bits/stdc++.h>
using namespace std;
// Type aliases
typedef long long ll;
typedef pair<int,int> pii;
typedef vector<int> vi;
// Convenience macros
#define pb push_back
#define all(v) (v).begin(), (v).end()
#define sz(v) ((int)(v).size())
#define fi first
#define se second
// Constants
const ll MOD = 1e9 + 7;
const ll INF = 1e18;
const int MAXN = 200005;
// Fast power mod
ll power(ll base, ll exp, ll mod = MOD) {
ll res = 1; base %= mod;
for (; exp > 0; exp >>= 1) {
if (exp & 1) res = res * base % mod;
base = base * base % mod;
}
return res;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
// Uncomment for file I/O:
// freopen("problem.in", "r", stdin);
// freopen("problem.out", "w", stdout);
// ── Your solution here ──
return 0;
}
C.8 Common Patterns and Idioms
// ─── Reading N integers into a vector ────────────────────────────────────────
int n; cin >> n;
vi a(n);
for (int &x : a) cin >> x;
// ─── 2D vector initialization ────────────────────────────────────────────────
int R, C; cin >> R >> C; // read dimensions before sizing the grid
vector<vector<int>> grid(R, vector<int>(C, 0));
// ─── Sorting with custom criterion ───────────────────────────────────────────
sort(all(v), [](const auto &a, const auto &b) {
return a.weight < b.weight; // sort by weight ascending
});
// ─── Finding max with index (min_element works the same way) ─────────────────
auto maxIt = max_element(all(v));
int maxVal = *maxIt;
int maxIdx = maxIt - v.begin();
// ─── Erase duplicates from sorted vector ─────────────────────────────────────
sort(all(v));
v.erase(unique(all(v)), v.end());
// ─── String splitting by character ───────────────────────────────────────────
vector<string> split(const string &s, char delim) {
vector<string> parts;
stringstream ss(s);
string part;
while (getline(ss, part, delim)) parts.pb(part);
return parts;
}
// ─── Integer square root (exact, no float issues) ───────────────────────────
ll isqrt(ll n) {
ll r = sqrtl(n);
while (r*r > n) r--;
while ((r+1)*(r+1) <= n) r++;
return r;
}
// ─── Checking if a number is prime ───────────────────────────────────────────
bool isPrime(ll n) {
if (n < 2) return false;
if (n == 2) return true;
if (n % 2 == 0) return false;
for (ll i = 3; i * i <= n; i += 2) {
if (n % i == 0) return false;
}
return true;
}
// ─── Sieve of Eratosthenes (all primes up to N) ─────────────────────────────
vector<bool> sieve(int N) {
vector<bool> is_prime(N+1, true);
is_prime[0] = is_prime[1] = false;
for (int i = 2; i * i <= N; i++) {
if (is_prime[i]) {
for (int j = i*i; j <= N; j += i)
is_prime[j] = false;
}
}
return is_prime;
}
C.9 Debugging Tips
// Use cerr for debug output (judges usually ignore stderr)
#ifdef DEBUG
#define dbg(x) cerr << #x << " = " << x << "\n"
#define dbgv(v) cerr << #v << ": "; for(auto x:v) cerr << x << " "; cerr << "\n"
#else
#define dbg(x)
#define dbgv(v)
#endif
// Compile with: g++ -DDEBUG -o sol sol.cpp (enables debug output)
// Compile without: g++ -o sol sol.cpp (removes debug output)
// Usage:
int x = 42;
dbg(x); // prints: x = 42 (only in debug mode)
vi v = {1,2,3};
dbgv(v); // prints: v: 1 2 3 (only in debug mode)
// Compile with sanitizers to catch memory errors and UB:
// g++ -fsanitize=address,undefined -O1 -o sol sol.cpp
// These are invaluable for catching:
// - Out-of-bounds array access
// - Integer overflow (with -fsanitize=signed-integer-overflow)
// - Use of uninitialized memory
// - Null pointer dereference
Fenwick Tree (BIT) — Prefix Sum with Updates
The Binary Indexed Tree (BIT or Fenwick Tree) uses the lowest set bit trick to achieve O(log N) prefix sum queries and updates. Each index i is "responsible" for the range [i - lowbit(i) + 1, i] where lowbit(i) = i & (-i).
// Fenwick Tree / BIT — O(log N) update and prefix query
struct BIT {
int n;
vector<long long> tree;
BIT(int n) : n(n), tree(n + 1, 0) {}
// Add val to position i (1-indexed)
void update(int i, long long val) {
for (; i <= n; i += i & (-i))
tree[i] += val;
}
// Prefix sum [1..i]
long long query(int i) {
long long sum = 0;
for (; i > 0; i -= i & (-i))
sum += tree[i];
return sum;
}
// Range sum [l..r]
long long query(int l, int r) { return query(r) - query(l - 1); }
};
Appendix D: Contest-Ready Algorithm Templates
🏆 Quick Reference: These templates are battle-tested, copy-paste ready, and designed to work correctly in competitive programming. Each is annotated with complexity and typical use cases.
D.1 DSU / Union-Find
Use when: Dynamic connectivity, Kruskal's MST, cycle detection, grouping elements.
Complexity: O(α(N)) ≈ O(1) per operation.
// =============================================================
// DSU (Disjoint Set Union) with Path Compression + Union by Rank
// =============================================================
struct DSU {
vector<int> parent, rank_;
int components; // number of connected components
DSU(int n) : parent(n), rank_(n, 0), components(n) {
iota(parent.begin(), parent.end(), 0); // parent[i] = i
}
// Find with path compression
int find(int x) {
if (parent[x] != x)
parent[x] = find(parent[x]); // path compression
return parent[x];
}
// Union by rank — returns true if actually merged (different components)
bool unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return false; // already connected
if (rank_[x] < rank_[y]) swap(x, y);
parent[y] = x;
if (rank_[x] == rank_[y]) rank_[x]++;
components--;
return true;
}
bool connected(int x, int y) { return find(x) == find(y); }
};
// Example usage:
int main() {
int n = 5;
DSU dsu(n);
dsu.unite(0, 1);
dsu.unite(2, 3);
cout << dsu.connected(0, 1) << "\n"; // 1 (true)
cout << dsu.connected(0, 2) << "\n"; // 0 (false)
cout << dsu.components << "\n"; // 3
return 0;
}
D.2 Segment Tree (Point Update, Range Sum)
Use when: Range sum/min/max queries with point updates.
Complexity: O(N) build, O(log N) per query/update.
// =============================================================
// Segment Tree — Point Update, Range Sum Query
// =============================================================
struct SegTree {
int n;
vector<long long> tree;
SegTree(int n) : n(n), tree(4 * n, 0) {}
void build(vector<long long>& arr, int node, int start, int end) {
if (start == end) { tree[node] = arr[start]; return; }
int mid = (start + end) / 2;
build(arr, 2*node, start, mid);
build(arr, 2*node+1, mid+1, end);
tree[node] = tree[2*node] + tree[2*node+1];
}
void build(vector<long long>& arr) { build(arr, 1, 0, n-1); }
void update(int node, int start, int end, int idx, long long val) {
if (start == end) { tree[node] = val; return; }
int mid = (start + end) / 2;
if (idx <= mid) update(2*node, start, mid, idx, val);
else update(2*node+1, mid+1, end, idx, val);
tree[node] = tree[2*node] + tree[2*node+1];
}
// Update arr[idx] = val
void update(int idx, long long val) { update(1, 0, n-1, idx, val); }
long long query(int node, int start, int end, int l, int r) {
if (r < start || end < l) return 0; // identity for sum
if (l <= start && end <= r) return tree[node];
int mid = (start + end) / 2;
return query(2*node, start, mid, l, r)
+ query(2*node+1, mid+1, end, l, r);
}
// Query sum of arr[l..r]
long long query(int l, int r) { return query(1, 0, n-1, l, r); }
};
// Example usage:
int main() {
vector<long long> arr = {1, 3, 5, 7, 9, 11};
SegTree st(arr.size());
st.build(arr);
cout << st.query(2, 4) << "\n"; // 5+7+9 = 21
st.update(2, 10); // arr[2] = 10
cout << st.query(2, 4) << "\n"; // 10+7+9 = 26
return 0;
}
D.3 BFS Template
Use when: Shortest path in unweighted graph/grid, level-order traversal, multi-source distances.
Complexity: O(V + E).
// =============================================================
// BFS — Shortest Path in Unweighted Graph
// =============================================================
#include <bits/stdc++.h>
using namespace std;
// Returns dist[] where dist[v] = shortest distance from src to v
// dist[v] = -1 if unreachable
vector<int> bfs(int src, int n, vector<vector<int>>& adj) {
vector<int> dist(n, -1);
queue<int> q;
dist[src] = 0;
q.push(src);
while (!q.empty()) {
int u = q.front(); q.pop();
for (int v : adj[u]) {
if (dist[v] == -1) {
dist[v] = dist[u] + 1;
q.push(v);
}
}
}
return dist;
}
// Grid BFS (4-directional)
const int dr[] = {-1, 1, 0, 0};
const int dc[] = {0, 0, -1, 1};
int gridBFS(vector<string>& grid, int sr, int sc, int er, int ec) {
int R = grid.size(), C = grid[0].size();
vector<vector<int>> dist(R, vector<int>(C, -1));
queue<pair<int,int>> q;
dist[sr][sc] = 0;
q.push({sr, sc});
while (!q.empty()) {
auto [r, c] = q.front(); q.pop();
for (int d = 0; d < 4; d++) {
int nr = r + dr[d], nc = c + dc[d];
if (nr >= 0 && nr < R && nc >= 0 && nc < C
&& grid[nr][nc] != '#' && dist[nr][nc] == -1) {
dist[nr][nc] = dist[r][c] + 1;
q.push({nr, nc});
}
}
}
return dist[er][ec]; // -1 if unreachable
}
D.4 DFS Template
Use when: Connected components, cycle detection, topological sort, flood fill.
Complexity: O(V + E).
// =============================================================
// DFS — Iterative and Recursive Templates
// =============================================================
vector<vector<int>> adj;
vector<int> color; // 0=white, 1=gray (in stack), 2=black (done)
// Recursive DFS with cycle detection (directed graph)
bool hasCycle = false;
void dfs(int u) {
color[u] = 1; // mark as "in progress"
for (int v : adj[u]) {
if (color[v] == 0) dfs(v);
else if (color[v] == 1) hasCycle = true; // back edge → cycle!
}
color[u] = 2; // mark as "done"
}
// Topological sort using DFS post-order
vector<int> topoOrder;
void dfsToposort(int u) {
color[u] = 1;
for (int v : adj[u]) {
if (color[v] == 0) dfsToposort(v);
}
color[u] = 2;
topoOrder.push_back(u); // add to order AFTER processing all children
}
// Reverse topoOrder for correct topological sequence
// Iterative DFS (avoids stack overflow for large graphs)
void dfsIterative(int src, int n) {
vector<bool> visited(n, false);
stack<int> st;
st.push(src);
while (!st.empty()) {
int u = st.top(); st.pop();
if (visited[u]) continue;
visited[u] = true;
// Process u here
for (int v : adj[u]) {
if (!visited[v]) st.push(v);
}
}
}
D.5 Dijkstra's Algorithm
Use when: Shortest path in weighted graph with non-negative edge weights.
Complexity: O((V + E) log V).
// =============================================================
// Dijkstra's Shortest Path — O((V+E) log V)
// =============================================================
#include <bits/stdc++.h>
using namespace std;
typedef pair<long long, int> pli; // {distance, node}
const long long INF = 1e18;
vector<long long> dijkstra(int src, int n,
vector<vector<pair<int,int>>>& adj) {
// adj[u] = { {v, weight}, ... }
vector<long long> dist(n, INF);
priority_queue<pli, vector<pli>, greater<pli>> pq; // min-heap
dist[src] = 0;
pq.push({0, src});
while (!pq.empty()) {
auto [d, u] = pq.top(); pq.pop();
if (d > dist[u]) continue; // ← KEY LINE: skip outdated entries
for (auto [v, w] : adj[u]) {
if (dist[u] + w < dist[v]) {
dist[v] = dist[u] + w;
pq.push({dist[v], v});
}
}
}
return dist; // dist[v] = shortest distance src → v, INF if unreachable
}
// Example usage:
int main() {
int n = 5;
vector<vector<pair<int,int>>> adj(n);
// Add edge u-v with weight w (undirected):
auto addEdge = [&](int u, int v, int w) {
adj[u].push_back({v, w});
adj[v].push_back({u, w});
};
addEdge(0, 1, 4);
addEdge(0, 2, 1);
addEdge(2, 1, 2);
addEdge(1, 3, 1);
addEdge(2, 3, 5);
auto dist = dijkstra(0, n, adj);
cout << dist[3] << "\n"; // 4 (path: 0→2→1→3 with cost 1+2+1=4)
return 0;
}
D.6 Binary Search Templates
Use when: Searching in sorted arrays, or "binary search on answer" (parametric search).
Complexity: O(log N) per search, O(f(N) × log V) for binary search on answer.
// =============================================================
// Binary Search Templates
// =============================================================
// 1. Find exact value (returns index or -1)
int binarySearch(vector<int>& arr, int target) {
int lo = 0, hi = (int)arr.size() - 1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2;
if (arr[mid] == target) return mid;
else if (arr[mid] < target) lo = mid + 1;
else hi = mid - 1;
}
return -1;
}
// 2. First index where arr[i] >= target (lower_bound)
int lowerBound(vector<int>& arr, int target) {
int lo = 0, hi = (int)arr.size();
while (lo < hi) {
int mid = lo + (hi - lo) / 2;
if (arr[mid] < target) lo = mid + 1;
else hi = mid;
}
return lo; // arr.size() if all elements < target
}
// 3. First index where arr[i] > target (upper_bound)
int upperBound(vector<int>& arr, int target) {
int lo = 0, hi = (int)arr.size();
while (lo < hi) {
int mid = lo + (hi - lo) / 2;
if (arr[mid] <= target) lo = mid + 1;
else hi = mid;
}
return lo;
}
// 4. Binary search on answer — find maximum X where check(X) is true
// Template: adapt lo, hi, and check() for your problem
long long bsOnAnswer(long long lo, long long hi,
function<bool(long long)> check) {
long long answer = lo - 1; // sentinel: no valid answer
while (lo <= hi) {
long long mid = lo + (hi - lo) / 2;
if (check(mid)) {
answer = mid;
lo = mid + 1; // try to do better
} else {
hi = mid - 1;
}
}
return answer;
}
// STL wrappers (prefer these in practice):
// lower_bound(v.begin(), v.end(), x) → iterator to first element >= x
// upper_bound(v.begin(), v.end(), x) → iterator to first element > x
// binary_search(v.begin(), v.end(), x) → bool, whether x exists
lower_bound / upper_bound cheat sheet:
| Goal | Code |
|---|---|
| First index ≥ x | lower_bound(v.begin(), v.end(), x) - v.begin() |
| First index > x | upper_bound(v.begin(), v.end(), x) - v.begin() |
| Count of x | upper_bound(..., x) - lower_bound(..., x) |
| Largest value ≤ x | prev(upper_bound(..., x)) if exists |
| Smallest value ≥ x | *lower_bound(..., x) if < end |
D.7 Modular Arithmetic Template
Use when: Large numbers, combinatorics, DP with large values.
Complexity: O(1) per operation, O(log exp) for modpow.
// =============================================================
// Modular Arithmetic Template
// =============================================================
const long long MOD = 1e9 + 7; // or 998244353 for NTT-friendly
long long mod(long long x) { return ((x % MOD) + MOD) % MOD; }
long long add(long long a, long long b) { return (a + b) % MOD; }
long long sub(long long a, long long b) { return mod(a - b); }
long long mul(long long a, long long b) { return a % MOD * (b % MOD) % MOD; }
// Fast power: base^exp mod MOD — O(log exp)
long long power(long long base, long long exp, long long mod = MOD) {
long long result = 1;
base %= mod;
while (exp > 0) {
if (exp & 1) result = result * base % mod; // if last bit is 1
base = base * base % mod; // square the base
exp >>= 1; // shift right
}
return result;
}
// Modular inverse (base^(MOD-2) mod MOD, only when MOD is prime)
long long inv(long long x) { return power(x, MOD - 2); }
// Modular division
long long divide(long long a, long long b) { return mul(a, inv(b)); }
// Precompute factorials for combinations
const int MAXN = 200005;
long long fact[MAXN], inv_fact[MAXN];
void precompute_factorials() {
fact[0] = 1;
for (int i = 1; i < MAXN; i++) fact[i] = fact[i-1] * i % MOD;
inv_fact[MAXN-1] = inv(fact[MAXN-1]);
for (int i = MAXN-2; i >= 0; i--) inv_fact[i] = inv_fact[i+1] * (i+1) % MOD;
}
// C(n, k) = n choose k mod MOD
long long C(int n, int k) {
if (k < 0 || k > n) return 0;
return fact[n] * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
}
D.8 Fast Power (Binary Exponentiation)
Use when: Computing a^b for large b (standalone or modular).
Complexity: O(log b).
// =============================================================
// Binary Exponentiation — a^b in O(log b)
// =============================================================
// Integer power (no mod) — careful of overflow for large a,b
long long fastPow(long long a, long long b) {
long long result = 1;
while (b > 0) {
if (b & 1) result *= a; // if current bit is 1
a *= a; // square a
b >>= 1; // next bit
}
return result;
}
// Modular power — a^b mod m
long long modPow(long long a, long long b, long long m) {
long long result = 1;
a %= m;
while (b > 0) {
if (b & 1) result = result * a % m;
a = a * a % m;
b >>= 1;
}
return result;
}
// Matrix exponentiation — M^b for matrix M (for Fibonacci in O(log N) etc.)
typedef vector<vector<long long>> Matrix;
// Note: uses MOD from D.7 (const long long MOD = 1e9 + 7)
Matrix multiply(const Matrix& A, const Matrix& B) {
int n = A.size();
Matrix C(n, vector<long long>(n, 0));
for (int i = 0; i < n; i++)
for (int k = 0; k < n; k++)
if (A[i][k])
for (int j = 0; j < n; j++)
C[i][j] = (C[i][j] + A[i][k] * B[k][j]) % MOD;
return C;
}
Matrix matPow(Matrix M, long long b) {
int n = M.size();
Matrix result(n, vector<long long>(n, 0));
for (int i = 0; i < n; i++) result[i][i] = 1; // identity matrix
while (b > 0) {
if (b & 1) result = multiply(result, M);
M = multiply(M, M);
b >>= 1;
}
return result;
}
// Example: Fibonacci(N) in O(log N) using matrix exponentiation
// [F(n+1)] [1 1]^n [F(1)]
// [F(n) ] = [1 0] * [F(0)]
long long fibonacci(long long n) {
if (n <= 1) return n;
Matrix M = {{1, 1}, {1, 0}};
Matrix result = matPow(M, n - 1);
return result[0][0]; // F(n)
}
D.9 Other Useful Templates
Prefix Sum (1D and 2D)
// 1D Prefix Sum
vector<long long> prefSum(n + 1, 0);
for (int i = 1; i <= n; i++) prefSum[i] = prefSum[i-1] + arr[i];
// Query sum of arr[l..r] (1-indexed): prefSum[r] - prefSum[l-1]
// 2D Prefix Sum
vector<vector<long long>> psum(N + 1, vector<long long>(M + 1, 0)); // vectors allow runtime N, M (a plain array would need constants)
for (int i = 1; i <= N; i++)
for (int j = 1; j <= M; j++)
psum[i][j] = grid[i][j] + psum[i-1][j] + psum[i][j-1] - psum[i-1][j-1];
// Query sum of rectangle [r1,c1]..[r2,c2]:
// psum[r2][c2] - psum[r1-1][c2] - psum[r2][c1-1] + psum[r1-1][c1-1]
Competitive Programming Header
// Standard competitive programming template
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<int,int> pii;
typedef vector<int> vi;
typedef vector<ll> vll;
#define all(x) x.begin(), x.end()
#define sz(x) (int)(x).size()
#define pb push_back
#define mp make_pair
const int INF = 1e9;
const ll LINF = 1e18;
const int MOD = 1e9 + 7;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
// Your solution here
return 0;
}
Quick Reference Card
| Algorithm | Complexity | Header to include |
|---|---|---|
| DSU (Union-Find) | O(α(N)) per op | — |
| Segment Tree | O(N) build, O(log N) per op | — |
| BFS | O(V+E) | <queue> |
| DFS | O(V+E) | <stack> |
| Dijkstra | O((V+E) log V) | <queue> |
| Binary search | O(log N) | <algorithm> |
| Sort | O(N log N) | <algorithm> |
| Modular exponentiation | O(log exp) | — |
| lower/upper_bound | O(log N) | <algorithm> |
✅ All examples compiled and tested with C++17 (-std=c++17 -O2).
Appendix E: Math Foundations for Competitive Programming
💡 About This Appendix: Competitive programming often requires mathematical tools beyond basic arithmetic. This appendix covers the essential math you'll encounter in USACO Bronze, Silver, and Gold — with contest-ready code templates for each topic.
E.1 Modular Arithmetic
Why Do We Need Modular Arithmetic?
Many problems ask you to output an answer "modulo 10⁹ + 7". This isn't arbitrary — it prevents integer overflow when answers are astronomically large.
Consider: "How many permutations of N elements?" Answer: N!. For N = 20, that's 2,432,902,008,176,640,000 — just barely within long long's max (~9.2 × 10¹⁸), and 21! already overflows. For N = 100, the answer is completely unrepresentable in any built-in integer type.
Solution: Compute everything modulo a prime M (typically 10⁹ + 7).
Common MOD Values
| Constant | Value | Why This Value? |
|---|---|---|
| 1e9 + 7 | 1,000,000,007 | Prime, fits in int (< 2³¹), widely used |
| 1e9 + 9 | 1,000,000,009 | Prime, alternative to 1e9+7 |
| 998244353 | 998,244,353 | NTT-friendly prime (for polynomial operations) |
Basic Modular Operations Template
// Solution: Modular Arithmetic Basics
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll MOD = 1e9 + 7; // standard competitive programming MOD
// Safe addition: (a + b) % MOD
ll addMod(ll a, ll b) {
return (a % MOD + b % MOD) % MOD;
}
// Safe subtraction: (a - b + MOD) % MOD (handle negative result)
ll subMod(ll a, ll b) {
return ((a % MOD) - (b % MOD) + MOD) % MOD; // +MOD prevents negative!
}
// Safe multiplication: (a * b) % MOD
// Key: a and b are at most MOD-1 ≈ 10^9, so a*b ≈ 10^18 which fits long long
ll mulMod(ll a, ll b) {
return (a % MOD) * (b % MOD) % MOD;
}
// Example: Compute sum of first N integers modulo MOD
ll sumFirstN(ll n) {
// Formula: n*(n+1)/2, but careful with division — need modular inverse!
// For now: just accumulate with addMod
ll result = 0;
for (ll i = 1; i <= n; i++) {
result = addMod(result, i);
}
return result;
}
⚠️ Critical Bug:
(a - b) % MOD can be negative in C++ if a < b! Always use (a - b + MOD) % MOD.
E.1.1 Fast Exponentiation (Binary Exponentiation)
Computing a^n mod M naively takes O(n) multiplications. Fast exponentiation (exponentiation by squaring) does it in O(log n).
Key insight: a^n = a^(n/2) × a^(n/2) if n is even
a^n = a × a^((n-1)/2) × a^((n-1)/2) if n is odd
Example: a^13 = a^(1101 in binary)
= a^8 × a^4 × a^1
→ 3 result multiplications (plus 3 squarings for a², a⁴, a⁸) instead of 12!
// Solution: Fast Modular Exponentiation — O(log n)
// Computes (base^exp) % mod
ll power(ll base, ll exp, ll mod = MOD) {
ll result = 1;
base %= mod; // reduce base first
while (exp > 0) {
if (exp & 1) { // if current bit is 1
result = result * base % mod;
}
base = base * base % mod; // square the base
exp >>= 1; // shift to next bit
}
return result;
}
// Example usage:
// power(2, 10) = 1024 % MOD = 1024
// power(2, 100, MOD) = 2^100 mod (10^9+7)
E.1.2 Modular Inverse (Fermat's Little Theorem)
The modular inverse of a modulo M is a number a⁻¹ such that a × a⁻¹ ≡ 1 (mod M).
This lets us do modular division: a / b mod M = a × b⁻¹ mod M.
Fermat's Little Theorem: If M is prime and gcd(a, M) = 1, then a^(M−1) ≡ 1 (mod M). Dividing both sides by a gives a⁻¹ ≡ a^(M−2) (mod M), so one fast exponentiation computes the inverse:
// Solution: Modular Inverse using Fermat's Little Theorem
// Only works when MOD is PRIME and gcd(a, MOD) = 1
ll modInverse(ll a, ll mod = MOD) {
return power(a, mod - 2, mod);
}
// Division with modular arithmetic:
ll divMod(ll a, ll b) {
return mulMod(a, modInverse(b));
}
// Example: (n! / k!) mod MOD
// = n! × (k!)^(-1) mod MOD
// = n! × modInverse(k!) mod MOD
E.1.3 Precomputing Factorials and Inverses
For problems requiring many combinations C(n, k):
// Solution: Precomputed Factorials for O(1) Combination Queries
const int MAXN = 1000005;
ll fact[MAXN], inv_fact[MAXN];
void precompute() {
fact[0] = 1;
for (int i = 1; i < MAXN; i++) {
fact[i] = fact[i-1] * i % MOD;
}
inv_fact[MAXN-1] = modInverse(fact[MAXN-1]);
for (int i = MAXN-2; i >= 0; i--) {
inv_fact[i] = inv_fact[i+1] * (i+1) % MOD;
}
}
// C(n, k) = n! / (k! * (n-k)!)
ll C(int n, int k) {
if (k < 0 || k > n) return 0;
return fact[n] * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
}
// Usage: precompute() once, then C(n, k) in O(1)
E.2 GCD and LCM
Euclidean Algorithm
The Greatest Common Divisor (GCD) of two numbers is the largest number that divides both.
Euclidean Algorithm: Based on gcd(a, b) = gcd(b, a % b).
// Solution: GCD — O(log(min(a,b)))
int gcd(int a, int b) {
while (b != 0) {
a %= b;
swap(a, b);
}
return a;
}
// Or recursively:
// int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }
// C++17: std::gcd from <numeric>
// int g = gcd(a, b); // std::gcd, C++17 (recommended)
// int g = __gcd(a, b); // legacy GCC built-in, still works
Trace: gcd(48, 18):
gcd(48, 18) → gcd(18, 48%18=12) → gcd(12, 18%12=6) → gcd(6, 0) = 6
LCM and the Overflow Trap
// Solution: LCM — be careful with overflow!
// WRONG: overflows for large a, b
long long lcmWrong(long long a, long long b) {
return a * b / gcd(a, b); // a*b can overflow even long long!
}
// CORRECT: divide first, then multiply
long long lcm(long long a, long long b) {
return a / gcd(a, b) * b; // divide BEFORE multiplying
}
// a / gcd(a,b) is always an integer, so no precision loss
// Then * b: max value is around 10^18 which fits in long long
⚠️ Always divide before multiplying to avoid overflow!
Extended Euclidean Algorithm
Finds integers x, y such that ax + by = gcd(a, b) — useful for modular inverse when MOD is not prime:
// Solution: Extended Euclidean Algorithm — O(log(min(a,b)))
// Returns gcd(a,b), and sets x,y such that a*x + b*y = gcd(a,b)
long long extgcd(long long a, long long b, long long &x, long long &y) {
if (b == 0) { x = 1; y = 0; return a; }
long long x1, y1;
long long g = extgcd(b, a % b, x1, y1);
x = y1;
y = x1 - (a / b) * y1;
return g;
}
// Modular inverse using extgcd (works even when MOD is not prime):
long long modInverseExtGcd(long long a, long long mod) {
long long x, y;
long long g = extgcd(a, mod, x, y);
if (g != 1) return -1; // no inverse exists (gcd != 1)
return (x % mod + mod) % mod;
}
E.3 Prime Numbers and Sieves
Trial Division
// Solution: Trial Division Primality Test — O(sqrt(N))
bool isPrime(long long n) {
if (n < 2) return false;
if (n == 2) return true;
if (n % 2 == 0) return false;
for (long long i = 3; i * i <= n; i += 2) {
if (n % i == 0) return false;
}
return true;
}
// Efficient because: if n has a factor > sqrt(n), it must also have one <= sqrt(n)
// Only check odd numbers after 2 (halves the iterations)
Sieve of Eratosthenes
Find all primes up to N efficiently:
// Solution: Sieve of Eratosthenes — O(N log log N) time, O(N) space
// After running, isPrime[i] = true iff i is prime
const int MAXN = 1000005;
bool isPrime[MAXN];
void sieve(int n) {
fill(isPrime, isPrime + n + 1, true); // assume all prime initially
isPrime[0] = isPrime[1] = false; // 0 and 1 are not prime
for (int i = 2; (long long)i * i <= n; i++) {
if (isPrime[i]) {
// Mark all multiples of i as composite
for (int j = i * i; j <= n; j += i) {
isPrime[j] = false;
// Start from i*i (smaller multiples already marked by smaller primes)
}
}
}
}
// Count primes up to N:
void countPrimes(int n) {
sieve(n);
int count = 0;
for (int i = 2; i <= n; i++) {
if (isPrime[i]) count++;
}
cout << count << "\n";
}
Why start inner loop at i²? All multiples of i smaller than i² (i.e., 2i, 3i, ..., (i-1)i) were already marked by smaller primes (2, 3, ..., i-1).
Linear Sieve (Euler Sieve) — O(N)
The Euler sieve marks each composite number exactly once:
// Solution: Linear Sieve (Euler Sieve) — O(N) time
// Also computes smallest prime factor (SPF) for each number
const int MAXN = 1000005;
int spf[MAXN]; // smallest prime factor
vector<int> primes;
void linearSieve(int n) {
fill(spf, spf + n + 1, 0);
for (int i = 2; i <= n; i++) {
if (spf[i] == 0) { // i is prime
spf[i] = i;
primes.push_back(i);
}
for (int j = 0; j < (int)primes.size() && primes[j] <= spf[i] && (long long)i * primes[j] <= n; j++) {
spf[i * primes[j]] = primes[j]; // mark composite
}
}
}
// Fast prime factorization using SPF:
// O(log N) per factorization
vector<int> factorize(int n) {
vector<int> factors;
while (n > 1) {
factors.push_back(spf[n]);
n /= spf[n];
}
return factors;
}
E.4 Binary Representations and Bit Manipulation
Fundamental Bit Operations
// Solution: Common Bit Operations Reference
int n = 42; // binary: 101010
// ── AND (&): both bits must be 1 ──
int a = 6 & 3; // 110 & 011 = 010 = 2
// ── OR (|): at least one bit is 1 ──
int b = 6 | 3; // 110 | 011 = 111 = 7
// ── XOR (^): exactly one bit is 1 ──
int c = 6 ^ 3; // 110 ^ 011 = 101 = 5
// ── NOT (~): flip all bits (two's complement) ──
int d = ~6; // = -7 (in two's complement)
// ── Left shift (<<): multiply by 2^k ──
int e = 1 << 4; // = 16 = 2^4
// ── Right shift (>>): divide by 2^k (arithmetic) ──
int f = 32 >> 2; // = 8 = 32/4
Essential Bit Tricks
// Solution: Competitive Programming Bit Tricks
// ── Check if n is odd ──
bool isOdd(int n) { return n & 1; } // last bit is 1 iff odd
// ── Check if n is a power of 2 ──
bool isPow2(int n) { return n > 0 && (n & (n-1)) == 0; }
// Why? Powers of 2: 1=001, 2=010, 4=100. n-1 flips all lower bits.
// 4 & 3 = 100 & 011 = 000. Non-powers: 6 & 5 = 110 & 101 = 100 ≠ 0.
// ── Get k-th bit (0-indexed from right) ──
bool getBit(int n, int k) { return (n >> k) & 1; }
// ── Set k-th bit to 1 ──
int setBit(int n, int k) { return n | (1 << k); }
// ── Clear k-th bit (set to 0) ──
int clearBit(int n, int k) { return n & ~(1 << k); }
// ── Toggle k-th bit ──
int toggleBit(int n, int k) { return n ^ (1 << k); }
// ── lowbit: lowest set bit (used in Fenwick tree!) ──
int lowbit(int n) { return n & (-n); }
// Example: lowbit(12) = lowbit(1100) = 0100 = 4
// ── Count number of set bits (popcount) ──
int popcount(int n) { return __builtin_popcount(n); } // use built-in!
// For long long: __builtin_popcountll(n)
// ── Swap two numbers without temp variable ──
void swapXOR(int &a, int &b) {
a ^= b;
b ^= a;
a ^= b;
}
// (usually just use std::swap — this is mainly a curiosity)
// ── Find position of lowest set bit ──
int lowestBitPos(int n) { return __builtin_ctz(n); } // count trailing zeros
// __builtin_clz(n) = count leading zeros
Subset Enumeration
A powerful technique: enumerate all subsets of a set represented as a bitmask.
// Solution: Subset Enumeration with Bitmasks
// Enumerate all subsets of an N-element set
void enumerateAllSubsets(int n) {
// Total subsets = 2^n
for (int mask = 0; mask < (1 << n); mask++) {
// 'mask' represents a subset: bit i set = element i is included
cout << "Subset: {";
for (int i = 0; i < n; i++) {
if (mask & (1 << i)) {
cout << i << " ";
}
}
cout << "}\n";
}
}
// Enumerate all NON-EMPTY subsets of a given set 'S'
void enumerateSubsetsOf(int S) {
for (int sub = S; sub > 0; sub = (sub - 1) & S) {
// Process subset 'sub'
// The trick: (sub-1) & S gives the "next smaller" subset of S
// This enumerates all 2^|S| subsets of S in O(1) amortized per step
}
}
// Classic use: bitmask DP
// dp[mask] = minimum cost to visit the set of cities represented by mask
// dp[0] = 0 (start: no cities visited)
// dp[mask | (1 << v)] = min(dp[mask | (1 << v)], dp[mask] + cost[last][v])
E.5 Combinatorics Basics
Counting Formulas
// Solution: Combinatorics with Modular Arithmetic
// Assumes precompute() from E.1.3 has been called
// C(n, k) = n! / (k! * (n-k)!)
ll combination(int n, int k) {
if (k < 0 || k > n) return 0;
return fact[n] * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
}
// P(n, k) = n! / (n-k)!
ll permutation(int n, int k) {
if (k < 0 || k > n) return 0;
return fact[n] * inv_fact[n-k] % MOD;
}
// Stars and Bars: number of ways to put n identical balls into k distinct boxes
// = C(n + k - 1, k - 1)
ll starsAndBars(int n, int k) {
return combination(n + k - 1, k - 1);
}
Pascal's Triangle — Computing C(n, k) without Precomputation
When n is small (n ≤ 2000), Pascal's triangle is simpler:
// Solution: Pascal's Triangle DP — O(n^2) precomputation
const int MAXN = 2005;
ll C[MAXN][MAXN];
void buildPascal() {
for (int i = 0; i < MAXN; i++) {
C[i][0] = C[i][i] = 1;
for (int j = 1; j < i; j++) {
C[i][j] = (C[i-1][j-1] + C[i-1][j]) % MOD;
}
}
}
// Then C[n][k] is the answer for any 0 <= k <= n < MAXN
// This avoids modular inverse entirely — useful when MOD might not be prime
Pascal's Rule: C(n, k) = C(n-1, k-1) + C(n-1, k)
This comes from: "choose k items from n" = "include item n and choose k-1 from n-1" + "exclude item n and choose k from n-1".
Key Combinatorial Identities
// Useful identities in competitive programming:
// Hockey Stick Identity: sum of C(r+k, k) for k=0..n = C(n+r+1, n)
// Useful for: 2D prefix sums, polynomial evaluations
// Vandermonde's Identity: sum_k C(m,k)*C(n,r-k) = C(m+n, r)
// Useful for: counting problems with two groups
// Inclusion-Exclusion:
// |A ∪ B| = |A| + |B| - |A ∩ B|
// |A ∪ B ∪ C| = |A| + |B| + |C| - |A∩B| - |A∩C| - |B∩C| + |A∩B∩C|
// Generalizes to n sets with 2^n terms (or bitmask enumeration)
E.6 Common Mathematical Results for Complexity Analysis
Harmonic Series
The harmonic series H(N) = 1 + 1/2 + 1/3 + ... + 1/N grows like ln(N) + γ (γ ≈ 0.577) — it diverges, but only logarithmically.
The related sum over primes explains why the Sieve of Eratosthenes runs in O(N log log N):
- Total work = N/2 + N/3 + N/5 + N/7 + ... (for each prime p, mark N/p multiples)
- Sum over primes ≈ N × ln(ln(N))
Fenwick tree operations are O(log N) for a different reason: each update/query step adds or removes one set bit of the index, and an index up to N has at most log₂(N) bits.
Key Estimates
| Expression | Approximation | Notes |
|---|---|---|
| log₂(10⁵) | ≈ 17 | Depth of BST/segment tree on 10⁵ elements |
| log₂(10⁹) | ≈ 30 | Binary search on 10⁹ range |
| √(10⁶) | = 1000 | Trial division up to √N for N ≤ 10⁶ |
| 2²⁰ | ≈ 10⁶ | Bitmask DP limit (20 items) |
| 20! | ≈ 2.4 × 10¹⁸ | Barely fits in long long |
| 13! | ≈ 6.2 × 10⁹ | Exceeds int range (12! is the largest factorial that fits in int) |
Operations Per Second Estimate
| Time Limit | Max Operations (safe) |
|---|---|
| 1 second | ~10⁸ simple operations |
| 2 seconds | ~2 × 10⁸ |
| 3 seconds | ~3 × 10⁸ |
Using this, you can estimate if your algorithm is fast enough:
- N = 10⁵, O(N log N) → ~1.7 × 10⁶ ops → fast
- N = 10⁵, O(N²) → 10¹⁰ ops → too slow
- N = 10⁵, O(N√N) → ~3 × 10⁷ ops → borderline (usually OK with 2s limit)
E.7 Complete Math Template
Here's a single file with all the templates from this appendix:
// Solution: Complete Math Template for Competitive Programming
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef unsigned long long ull;
// ═══════════════════════════════════════════════
// MODULAR ARITHMETIC
// ═══════════════════════════════════════════════
const ll MOD = 1e9 + 7;
ll power(ll base, ll exp, ll mod = MOD) {
ll result = 1;
base %= mod;
while (exp > 0) {
if (exp & 1) result = result * base % mod;
base = base * base % mod;
exp >>= 1;
}
return result;
}
ll modInverse(ll a, ll mod = MOD) {
return power(a, mod - 2, mod);
}
// ═══════════════════════════════════════════════
// FACTORIALS (precomputed up to MAXN)
// ═══════════════════════════════════════════════
const int MAXN = 1000005;
ll fact[MAXN], inv_fact[MAXN];
void precomputeFactorials() {
fact[0] = 1;
for (int i = 1; i < MAXN; i++) fact[i] = fact[i-1] * i % MOD;
inv_fact[MAXN-1] = modInverse(fact[MAXN-1]);
for (int i = MAXN-2; i >= 0; i--) inv_fact[i] = inv_fact[i+1] * (i+1) % MOD;
}
ll C(int n, int k) {
if (k < 0 || k > n) return 0;
return fact[n] * inv_fact[k] % MOD * inv_fact[n-k] % MOD;
}
// ═══════════════════════════════════════════════
// GCD / LCM
// ═══════════════════════════════════════════════
ll gcd(ll a, ll b) { return b == 0 ? a : gcd(b, a % b); }
ll lcm(ll a, ll b) { return a / gcd(a, b) * b; }
// ═══════════════════════════════════════════════
// PRIME SIEVE
// ═══════════════════════════════════════════════
const int MAXP = 1000005;
bool notPrime[MAXP];
vector<int> primes;
void sieve(int n = MAXP - 1) {
notPrime[0] = notPrime[1] = true;
for (int i = 2; i <= n; i++) {
if (!notPrime[i]) {
primes.push_back(i);
for (long long j = (long long)i*i; j <= n; j += i)
notPrime[j] = true;
}
}
}
bool isPrime(int n) { return n >= 2 && !notPrime[n]; }
// ═══════════════════════════════════════════════
// BIT TRICKS
// ═══════════════════════════════════════════════
bool isOdd(int n) { return n & 1; }
bool isPow2(int n) { return n > 0 && !(n & (n-1)); }
int lowbit(int n) { return n & (-n); }
int popcount(int n) { return __builtin_popcount(n); }
int ctz(int n) { return __builtin_ctz(n); } // count trailing zeros
// ═══════════════════════════════════════════════
// EXTENDED GCD
// ═══════════════════════════════════════════════
ll extgcd(ll a, ll b, ll &x, ll &y) {
if (!b) { x = 1; y = 0; return a; }
ll x1, y1, g = extgcd(b, a%b, x1, y1);
x = y1; y = x1 - a/b * y1;
return g;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
precomputeFactorials();
sieve();
// Test: C(10, 3) = 120
cout << C(10, 3) << "\n";
// Test: 2^100 mod (10^9+7)
cout << power(2, 100) << "\n";
// Test: first few primes
for (int i = 0; i < 10; i++) cout << primes[i] << " ";
cout << "\n";
return 0;
}
E.8 Number Theory Quick Reference
Divisibility Rules (useful for manual checks)
| Divisor | Rule |
|---|---|
| 2 | Last digit is even |
| 3 | Sum of digits divisible by 3 |
| 4 | Last two digits form a number divisible by 4 |
| 5 | Last digit is 0 or 5 |
| 9 | Sum of digits divisible by 9 |
| 10 | Last digit is 0 |
| 11 | Alternating sum of digits divisible by 11 |
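The digit-sum rules (for 3 and 9) and the alternating-sum rule (for 11) can be cross-checked against the plain % operator. A small sketch; the helper names digitSum and altDigitSum are ours:

```cpp
#include <cstdlib>

// Sum of decimal digits: n is divisible by 3 (or 9) iff digitSum(n) is.
int digitSum(long long n) {
    n = llabs(n);
    int s = 0;
    while (n) { s += n % 10; n /= 10; }
    return s;
}

// Alternating digit sum starting from the least significant digit
// (10 ≡ -1 mod 11): n is divisible by 11 iff this sum is.
int altDigitSum(long long n) {
    n = llabs(n);
    int s = 0, sign = 1;
    while (n) { s += sign * (int)(n % 10); sign = -sign; n /= 10; }
    return s;
}
```

For example, 121 gives alternating sum 1 - 2 + 1 = 0, confirming 121 = 11 × 11.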
Integer Square Root
// Safe integer square root (avoids floating point errors)
ll isqrt(ll n) {
ll x = sqrtl(n); // floating point approximation
while (x * x > n) x--; // correct downward if needed
while ((x+1) * (x+1) <= n) x++; // correct upward if needed
return x;
}
Ceiling Division
// Ceiling division: ceil(a/b) for positive integers
ll ceilDiv(ll a, ll b) {
return (a + b - 1) / b;
// Or: (a - 1) / b + 1 (same thing for a > 0)
}
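Both tricks above assume a, b > 0: C++ integer division truncates toward zero, so they break for negative operands. Hedged sign-safe variants (the names floorDiv and ceilDivSigned are ours):

```cpp
// Sign-safe floor and ceiling division for any a and nonzero b.
// C++ '/' truncates toward zero, so we adjust by 1 when there is a
// nonzero remainder and the operands' signs disagree (floor) or agree (ceil).
long long floorDiv(long long a, long long b) {
    return a / b - ((a % b != 0) && ((a < 0) != (b < 0)));
}
long long ceilDivSigned(long long a, long long b) {
    return a / b + ((a % b != 0) && ((a < 0) == (b < 0)));
}
```

For example, floorDiv(-7, 2) is -4 and ceilDivSigned(-7, 2) is -3, whereas the (a + b - 1) / b trick would give the wrong answer for negative a.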
❓ FAQ
Q1: When should I use long long?
A: When values might exceed 2 × 10⁹ (roughly the int limit). Typical cases: ① multiplying two large int values (10⁹ × 10⁹ = 10¹⁸); ② summing path weights (N edges, each weight 10⁶, total up to 10¹¹); ③ factorials/combinations (use long long for intermediate calculations even with modular arithmetic). Rule of thumb: use long long whenever there's multiplication in competitive programming code.
Q2: Why use 10⁹ + 7 as the modulus instead of 10⁹?
A: 10⁹ is not prime (= 2⁹ × 5⁹), so Fermat's little theorem can't be used to compute modular inverses. 10⁹ + 7 = 1,000,000,007 is prime, and (10⁹ + 7)² < 2⁶³ (the long long limit), so multiplying two numbers after taking the modulus won't overflow long long.
Q3: How does the bit-manipulation trick in fast exponentiation work?
A: Write the exponent n in binary: n = b_k × 2^k + ... + b_1 × 2 + b_0. Then a^n = a^(b_k × 2^k) × ... × a^(b_1 × 2) × a^b_0. Each loop iteration squares the base (representing a to the power of 2^k), and multiplies into the result when the current bit is 1. This requires only log₂(n) multiplications.
Q4: Why does the Sieve of Eratosthenes start marking from i×i?
A: Multiples 2i, 3i, ..., (i-1)i have already been marked by the smaller primes 2, 3, ..., i-1. For example, 6 = 2×3 was marked by 2; 7×5=35 was marked by 5. Starting from i×i avoids redundant work and optimizes the constant factor.
Q5: Why does n & (n-1) check if n is a power of 2?
A: Powers of 2 have exactly one 1-bit in binary (e.g., 8 = 1000). Subtracting 1 flips the lowest 1-bit to 0 and all lower 0-bits to 1 (e.g., 7 = 0111). So n & (n-1) clears the lowest 1-bit. If n is a power of 2 (only one 1-bit), the result is 0; otherwise it's nonzero.
End of Appendix E — See also: Algorithm Templates | Competitive Programming Tricks
Appendix F: Debugging Guide — Common Bugs & How to Fix Them
💡 Why This Appendix? Even correct algorithmic thinking fails when bugs slip through. This guide is a systematic catalogue of the most common bugs in competitive programming C++ code, organized by category. Bookmark it and check here first when your solution gives WA (Wrong Answer), TLE (Time Limit Exceeded), RE (Runtime Error), or MLE (Memory Limit Exceeded).
F.1 Integer Overflow
The most common source of Wrong Answer in C++.
Problem: int is Too Small
int holds values up to 2,147,483,647 (≈ 2.1 × 10⁹). Many problems exceed this.
// ❌ WRONG: n*n can overflow when n = 10^5
int n = 100000;
int result = n * n; // = 10^10 → overflows int (max ~2×10^9)!
// ✅ CORRECT: cast to long long before multiplication
long long result = (long long)n * n; // = 10^10, fits in long long
// OR:
long long n_ll = n;
long long result2 = n_ll * n_ll;
When to Use long long
| Situation | Use long long? |
|---|---|
| Array values up to 10⁹, need range sums | ✅ Yes (sum can be 10⁹ × 10⁵ = 10¹⁴) |
| Prefix sums of up to 10⁵ elements | ✅ Yes (safe default) |
| Matrix entries, intermediate DP values | ✅ Yes |
| Distances in shortest path (Dijkstra) | ✅ Yes (dist[u] + w can overflow int) |
| Simple counters (0 to N where N ≤ 10⁶) | ❌ int is fine |
| Indices and loop variables | ❌ int is fine |
Dangerous Operations
// ❌ Overflow examples:
int a = 1e9, b = 1e9;
cout << a + b; // overflow (answer > INT_MAX)
cout << a * 2; // overflow
cout << a * a; // catastrophic overflow
// ❌ Comparison overflow:
if (a * b > 1e18) ... // a*b itself may have overflowed!
// ✅ Safe versions:
cout << (long long)a + b;
cout << (long long)a * 2;
cout << (long long)a * a;
if ((long long)a * b > (long long)1e18) ... // compare as long long
INF Value Choice
// ❌ WRONG: Using INT_MAX as infinity in Dijkstra
const int INF = INT_MAX;
if (dist[u] + w < dist[v]) ... // dist[u] + w OVERFLOWS if dist[u]=INT_MAX!
// ✅ CORRECT: Use a safe sentinel
const long long INF = 1e18; // for long long distances
const int INF_INT = 1e9; // for int distances (leave headroom for addition)
F.2 Off-By-One Errors
The second most common source of WA.
Array Indexing
// ❌ WRONG: Array out of bounds (accessing index n)
int A[n];
for (int i = 0; i <= n; i++) cout << A[i]; // A[n] is undefined!
// ✅ CORRECT
for (int i = 0; i < n; i++) cout << A[i]; // indices 0..n-1
// OR (1-indexed):
for (int i = 1; i <= n; i++) cout << A[i]; // indices 1..n
Prefix Sum Formula
// ❌ WRONG: Off-by-one in range sum
// sum(L, R) should be P[R] - P[L-1], NOT P[R] - P[L]
cout << P[R] - P[L]; // missing element A[L]!
// ✅ CORRECT
cout << P[R] - P[L-1]; // P[0]=0 handles the L=1 case correctly
Binary Search Boundaries
// Finding first index where A[i] >= target (lower_bound behavior):
// ❌ WRONG: Common binary search mistakes
int lo = 0, hi = n - 1;
while (lo < hi) {
int mid = (lo + hi) / 2;
if (A[mid] < target) lo = mid; // BUG: should be lo = mid + 1
else hi = mid - 1; // BUG: should be hi = mid
}
// ✅ CORRECT: Standard lower_bound template
int lo = 0, hi = n; // hi = n (not n-1!) to allow "not found" answer
while (lo < hi) {
int mid = (lo + hi) / 2;
if (A[mid] < target) lo = mid + 1; // target is in [mid+1, hi]
else hi = mid; // target is in [lo, mid]
}
// lo = hi = first index with A[i] >= target; lo=n means not found
Loop Bounds
// ❌ Common mistake: loop runs one too few or many times
for (int i = 1; i < n; i++) ... // misses i=n if you meant i=0 to n-1
for (int i = 0; i <= n-1; i++) ... // OK but confusing; prefer i < n
// DP table filling: check if the recurrence accesses i-1
// ❌ If dp[i] uses dp[i-1], and i starts at 0, then dp[-1] is undefined!
for (int i = 0; i <= n; i++) {
dp[i] = dp[i-1] + ...; // BUG when i=0: dp[-1]!
}
// ✅ Start at i=1, or initialize dp[0] as base case separately
dp[0] = BASE_CASE;
for (int i = 1; i <= n; i++) {
dp[i] = dp[i-1] + ...; // safe: dp[i-1] always valid
}
F.3 Uninitialized Variables
// ❌ WRONG: dp array not initialized
int dp[1005][1005]; // contains garbage values in C++!
// dp[i][j] might be non-zero from previous test cases or OS memory
// ✅ CORRECT options:
// Option 1: memset (fills bytes, use 0 or 0x3f for near-infinity)
memset(dp, 0, sizeof(dp)); // fills with 0
memset(dp, 0x3f, sizeof(dp)); // fills with ~1.06e9 (useful as "infinity" for int)
// Option 2: vector with explicit initialization
vector<vector<int>> dp(n+1, vector<int>(m+1, 0));
// Option 3: fill (for a 2D array, fill the flattened range of elements)
fill(&dp[0][0], &dp[0][0] + 1005 * 1005, 0);
// ⚠️ WARNING: memset(dp, -1, sizeof(dp)) fills each BYTE with 0xFF
// For int: 0xFFFFFFFF = -1 (works for "unvisited" marker)
// For long long: 0xFFFFFFFFFFFFFFFF = -1 (also works)
// But memset(dp, 1, sizeof(dp)) gives 0x01010101 = 16843009, not 1!
Global vs Local Arrays
// Global arrays are zero-initialized by default in C++
// Local (stack) arrays are NOT initialized
int globalArr[100005]; // ✅ initialized to 0
int globalDP[1005][1005]; // ✅ initialized to 0
int main() {
int localArr[1000]; // ❌ NOT initialized (garbage values)
int localDP[100][100]; // ❌ NOT initialized
// Tip: Declare large arrays globally to avoid stack overflow AND ensure init
}
F.4 Stack Overflow (Recursion Too Deep)
// C++ default stack size is typically 1-8 MB
// Deep recursion can exceed this → Runtime Error (segfault)
// ❌ Dangerous: DFS/recursion on tree of depth 10^5
void dfs(int u) { for (int v : children[u]) dfs(v); } // stack overflow for long chains!
// ✅ FIX 1: Convert to iterative using explicit stack
void dfs_iterative(int start) {
stack<int> st;
st.push(start);
while (!st.empty()) {
int u = st.top(); st.pop();
for (int v : children[u]) st.push(v);
}
}
// ✅ FIX 2: Increase stack size (platform-specific, contest judges often allow this)
// On Linux, compile and run with: ulimit -s unlimited && ./sol
// Rule of thumb:
// Recursion depth up to ~10^4: usually safe
// Recursion depth up to ~10^5: risky, consider iterative
// Recursion depth up to ~10^6: almost certainly stack overflow → use iterative
F.5 Modular Arithmetic Bugs
// When the problem asks for answer mod 10^9+7:
const int MOD = 1e9 + 7;
// ❌ WRONG: Forgot to mod, result overflows long long
long long dp = 1;
for (int i = 0; i < n; i++) dp *= A[i]; // overflows once the product exceeds ~9.2 × 10^18!
// ❌ WRONG: Subtraction underflow (result is negative mod)
long long ans = (a - b) % MOD; // if a < b, result is negative in C++!
// ✅ CORRECT: Add MOD before taking mod of a subtraction
long long ans = ((a - b) % MOD + MOD) % MOD; // guaranteed non-negative
// ❌ WRONG: Forgetting to mod intermediate values in DP
dp[i][j] = dp[i-1][j] + dp[i][j-1]; // can overflow if iterations are many
// ✅ CORRECT: Mod every addition
dp[i][j] = (dp[i-1][j] + dp[i][j-1]) % MOD;
// ✅ CORRECT modular exponentiation:
long long modpow(long long base, long long exp, long long mod) {
long long result = 1;
base %= mod;
while (exp > 0) {
if (exp & 1) result = result * base % mod; // ← mod after each multiply!
base = base * base % mod;
exp >>= 1;
}
return result;
}
F.6 Graph / BFS / DFS Bugs
// ❌ BFS: Forgetting to mark visited BEFORE entering queue
// This causes nodes to be processed multiple times!
queue<int> q;
q.push(src);
while (!q.empty()) {
int u = q.front(); q.pop();
visited[u] = true; // ❌ Marking AFTER dequeue → same node pushed multiple times
for (int v : adj[u]) if (!visited[v]) q.push(v);
}
// ✅ CORRECT: Mark visited when ADDING to queue
visited[src] = true;
queue<int> q;
q.push(src);
while (!q.empty()) {
int u = q.front(); q.pop();
for (int v : adj[u]) {
if (!visited[v]) {
visited[v] = true; // ✅ Mark BEFORE pushing
q.push(v);
}
}
}
// ❌ DFS: Forgetting to reset visited between test cases
// In problems with multiple test cases, reinitialize visited[]!
memset(visited, false, sizeof(visited));
// ❌ Dijkstra: Using int instead of long long for distances
int dist[MAXN]; // ❌ if edge weights can be up to 10^9, sum overflows!
long long dist[MAXN]; // ✅
F.7 I/O Bugs
// ❌ WRONG: Missing ios_base::sync_with_stdio(false) for large I/O
// Without this, cin/cout are synced with C stdio → very slow!
// For N = 10^6 inputs, this can be the difference between AC and TLE.
// ✅ ALWAYS add at start of main() for competitive programming:
ios_base::sync_with_stdio(false);
cin.tie(NULL);
// ❌ WRONG: Using endl (flushes buffer every line → slow)
for (int i = 0; i < n; i++) cout << ans[i] << endl; // slow!
// ✅ CORRECT: Use "\n" instead
for (int i = 0; i < n; i++) cout << ans[i] << "\n"; // fast
// ❌ WRONG: Mixing cin and scanf/printf after disabling sync
ios_base::sync_with_stdio(false);
scanf("%d", &n); // BUG: mixing C and C++ I/O after desync!
// ✅ CORRECT: Pick ONE and stick with it
// Either use cin/cout exclusively, or scanf/printf exclusively
// USACO file I/O (when required):
freopen("problem.in", "r", stdin);
freopen("problem.out", "w", stdout);
// After these lines, cin/cout work with files automatically
F.8 2D Array Bounds and Directions
// Grid BFS: off-by-one in boundary checking
int dx[] = {0, 0, 1, -1};
int dy[] = {1, -1, 0, 0};
// ✅ CORRECT: check ALL FOUR bounds conditions
for (int d = 0; d < 4; d++) {
int nx = x + dx[d], ny = y + dy[d];
if (nx >= 0 && ny >= 0 && nx < n && ny < m) {
// safe to visit (nx, ny)
}
}
// ❌ WRONG: dropping any one condition (e.g. nx >= 0) allows index -1
// ❌ WRONG: Wrong dimensions (swapping rows and columns)
// If grid is N rows × M columns:
// A[row][col]: row goes 0..N-1, col goes 0..M-1
// Bounds: row < N, col < M (NOT row < M!)
// ❌ WRONG: Visiting same cell multiple times (forgetting dist check)
// In multi-source BFS for distance:
if (!visited[nx][ny]) { // ✅ Only visit unvisited cells
visited[nx][ny] = true;
dist[nx][ny] = dist[x][y] + 1;
q.push({nx, ny});
}
F.9 DP-Specific Bugs
// ❌ WRONG: 0/1 Knapsack inner loop direction
// Must iterate capacity from HIGH to LOW to prevent reusing items!
for (int i = 0; i < n; i++) {
for (int j = W; j >= weight[i]; j--) { // ✅ HIGH to LOW
dp[j] = max(dp[j], dp[j - weight[i]] + value[i]);
}
}
// If you iterate j from LOW to HIGH:
for (int j = weight[i]; j <= W; j++) { // ❌ LOW to HIGH = unbounded knapsack!
dp[j] = max(dp[j], dp[j - weight[i]] + value[i]);
}
// ❌ WRONG: LIS with binary search — using upper_bound vs lower_bound
// For STRICTLY increasing LIS: use lower_bound (find first >= x, replace)
// For NON-DECREASING LIS: use upper_bound (find first > x, replace)
auto it = lower_bound(tails.begin(), tails.end(), x); // strictly increasing
auto it = upper_bound(tails.begin(), tails.end(), x); // non-decreasing
// ❌ WRONG: Forgetting base cases
// dp[0] or dp[i][0] or dp[0][j] MUST be explicitly set before the main loop
dp[0][0] = 0; // always initialize base cases!
F.10 Memory Limit Exceeded (MLE)
// Common causes of MLE:
// ❌ Array too large for the problem
int dp[10005][10005]; // = 10^8 ints = 400MB → exceeds typical 256MB limit!
// Calculate: N*M*sizeof(type) bytes
// int: 4 bytes, long long: 8 bytes
// 256MB = 256 × 10^6 bytes
// Max int array: 64 × 10^6 elements
// Max long long array: 32 × 10^6 elements
// ✅ Space optimization for 1D DP:
// If dp[i] only depends on dp[i-1], use rolling array:
vector<long long> dp(2, 0); // dp[cur] and dp[1 - cur]
int cur = 0;
for (int i = 0; i < n; i++) {
dp[1 - cur] = f(dp[cur]); // f = your transition; alternate between rows 0 and 1
cur = 1 - cur;
}
// ✅ Space optimization for 2D DP (knapsack-style):
// If dp[i][j] only depends on dp[i-1][...], keep only two rows
vector<int> prev_row(W+1, 0), curr_row(W+1, 0);
Quick Diagnosis Checklist
When you get WA/RE/TLE, go through this checklist:
Wrong Answer (WA):
- Integer overflow? (Add long long casts or change types)
- Off-by-one in array bounds, loop bounds, range sum formula?
- Uninitialized array? (Add memset or use vector with init)
- Wrong DP transition direction? (0/1 knapsack: high-to-low)
- Wrong binary search template? (Verify on [1, 2, 3] for target 2)
- Edge cases: empty input, N=0, N=1, all equal elements?
Runtime Error (RE):
- Array out of bounds? (Add bounds checks or use vector)
- Stack overflow from deep recursion? (Convert to iterative)
- Null/invalid pointer dereference?
- Division by zero?
Time Limit Exceeded (TLE):
- Missing ios_base::sync_with_stdio(false); cin.tie(NULL);?
- O(N²) algorithm when N=10⁵ needs O(N log N)?
- Unnecessary recomputation in DP? (Need memoization)
- BFS visiting nodes multiple times? (Mark visited before pushing)
Memory Limit Exceeded (MLE):
- 2D array too large? (Calculate N×M×sizeof bytes)
- Recursive DFS with implicit call stack too deep?
- Dynamic memory allocation in tight loop?
💡 Pro Tip: Print your intermediate values!
cerr << "DEBUG: dp[3] = " << dp[3] << "\n"; cerr goes to stderr (not stdout), so it won't affect your output on competitive programming judges. Remove all cerr lines before final submission.
Glossary of Competitive Programming Terms
This glossary defines 35+ key terms used throughout this book and in competitive programming generally. When you encounter an unfamiliar term, look it up here first.
A
Algorithm A step-by-step procedure for solving a problem. An algorithm must be correct (give the right answer), finite (eventually terminate), and well-defined (each step is unambiguous). Examples: binary search, BFS, merge sort.
Adjacency List A way to represent a graph where each vertex stores a list of its neighbors. Space: O(V + E). The standard representation in competitive programming.
Adjacency Matrix
A 2D array where matrix[u][v] = 1 if there's an edge from u to v. Space: O(V²). Use only for dense graphs with V ≤ 1000.
Amortized Time
The average time per operation over a sequence of operations. Example: vector::push_back is O(1) amortized even though occasional doubling is O(N).
B
Base Case
In recursion and DP, the simplest subproblem with a known answer (requires no further recursion). Example: fib(0) = 0, fib(1) = 1.
BFS (Breadth-First Search) A graph traversal that explores nodes level by level (all nodes at distance 1, then distance 2, ...). Uses a queue. Guarantees shortest path in unweighted graphs. Time: O(V + E).
Big-O Notation A mathematical notation describing the upper bound on an algorithm's time or space growth. "O(N log N)" means "at most c × N × log(N) operations for some constant c." Used to compare algorithm efficiency.
Binary Search An O(log N) search algorithm on a sorted array. Each step eliminates half the remaining candidates by comparing with the midpoint. The most important application: "binary search on the answer" for optimization problems.
Brute Force A naive solution that tries all possibilities. Usually O(N²) or O(2^N). Correct but too slow for large inputs. Useful for: partial credit, verifying optimized solutions, small test cases.
C
Comparator
A function that defines a sorting order. Takes two elements and returns true if the first should come before the second. Used with std::sort.
Competitive Programming A type of programming contest where participants solve algorithmic problems within a time limit. USACO, Codeforces, LeetCode, and IOI are popular platforms.
Connected Component A maximal subgraph where every pair of vertices is connected by a path. Find components with DFS/BFS or Union-Find.
Coordinate Compression Mapping a large range of values (e.g., up to 10^9) to small consecutive indices (0, 1, 2, ...) without changing relative order. Enables using arrays instead of hash maps.
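The standard sort + unique + lower_bound recipe can be sketched as follows (the function name compress is ours):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Map each value to its rank among the distinct values (0-indexed),
// preserving relative order. O(N log N) overall.
vector<int> compress(const vector<long long>& a) {
    vector<long long> vals(a);
    sort(vals.begin(), vals.end());
    vals.erase(unique(vals.begin(), vals.end()), vals.end());  // keep distinct values
    vector<int> idx(a.size());
    for (size_t i = 0; i < a.size(); i++)
        idx[i] = lower_bound(vals.begin(), vals.end(), a[i]) - vals.begin();
    return idx;
}
```

For example, values {100, 5, 100, 10⁹} compress to indices {1, 0, 1, 2}, which can then index plain arrays or a Fenwick tree.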
D
DAG (Directed Acyclic Graph) A directed graph with no cycles. Key property: has a topological ordering. Examples: dependency graphs, task scheduling.
DFS (Depth-First Search) A graph traversal that explores as deep as possible before backtracking. Uses a stack (or recursion). Good for: connectivity, cycle detection, topological sort. Time: O(V + E).
Difference Array A technique for O(1) range updates. Store differences between consecutive elements; a range add of v on [L, R] becomes diff[L] += v and diff[R+1] -= v. Reconstruct the final array with prefix sums.
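A hedged sketch of the full difference-array pipeline, 0-indexed (the function name applyRangeAdds is ours):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Apply range-add operations {l, r, v} (add v to a[l..r], 0 <= l <= r < n)
// in O(1) each, then reconstruct the array with one prefix-sum pass.
vector<long long> applyRangeAdds(int n, const vector<vector<long long>>& ops) {
    vector<long long> diff(n + 1, 0);           // extra slot so r + 1 <= n is valid
    for (const auto& op : ops) {
        diff[op[0]] += op[2];
        diff[op[1] + 1] -= op[2];
    }
    vector<long long> a(n);
    long long run = 0;
    for (int i = 0; i < n; i++) { run += diff[i]; a[i] = run; }
    return a;
}
```

With n = 5 and operations "add 2 on [1,3]" and "add 5 on [0,0]" this yields {5, 2, 2, 2, 0}.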
DP (Dynamic Programming) An optimization technique that solves problems by breaking them into overlapping subproblems and caching results. Two properties needed: optimal substructure + overlapping subproblems. See: memoization, tabulation.
DSU (Disjoint Set Union) See Union-Find.
E
Edge A connection between two vertices in a graph. Can be directed (one-way) or undirected (two-way). May have a weight.
Exchange Argument A proof technique for greedy algorithms. Show that swapping the greedy choice with any other choice never worsens the solution.
F
Flood Fill An algorithm (usually DFS or BFS) that marks all connected cells of the same "color" in a grid. Used to count connected regions.
G
Graph A data structure consisting of vertices (nodes) and edges (connections). Models relationships, networks, maps, etc.
Greedy Algorithm An algorithm that makes the locally optimal choice at each step, hoping for a globally optimal result. Works when the "greedy choice property" holds. Examples: activity selection, Huffman coding, Kruskal's MST.
H
Hash Map (unordered_map) A data structure that stores key-value pairs with O(1) average lookup. Implemented with hash tables. No ordering guarantee. Use when you need fast lookup but don't need sorted keys.
I
Interval DP A DP pattern where the state is a subarray [l, r] and you try all split points. Classic examples: matrix chain multiplication, palindrome partitioning. Time: O(N³).
K
Knapsack Problem A DP problem: given items with weights and values, maximize value within a weight limit. "0/1 knapsack" means each item used at most once. "Unbounded knapsack" means unlimited uses.
L
LIS (Longest Increasing Subsequence) The longest subsequence of an array where each element is strictly greater than the previous. O(N²) DP or O(N log N) with binary search.
LCA (Lowest Common Ancestor) The deepest node that is an ancestor of both u and v in a rooted tree. Naive: O(depth) per query. Binary lifting: O(log N).
M
Memoization Caching the results of recursive function calls to avoid recomputation. "Top-down DP." A memo table stores computed values; before computing, check if the answer is already known.
MST (Minimum Spanning Tree) A spanning tree of a weighted graph with minimum total edge weight. Kruskal's algorithm: sort edges + DSU. Prim's algorithm: priority queue + visited set. Both O(E log E).
Monotone / Monotonic Consistently increasing or decreasing. A function is monotone if it never reverses direction. Key for binary search on answer: the feasibility function must be monotone.
O
Off-By-One Error
A bug where an index or count is wrong by exactly 1. Very common in loops (< n vs <= n), binary search, prefix sums (P[L-1] vs P[L]).
Optimal Substructure A property: the optimal solution to a problem can be built from optimal solutions to its subproblems. Required for DP to work correctly.
Overflow
When a value exceeds the maximum representable value for its type. int max is ~2×10^9; long long max is ~9.2×10^18. Multiplying two 10^9 ints overflows int — cast to long long first.
P
Prefix Sum
An array where P[i] = sum of all elements from index 0 (or 1) through i. Enables O(1) range sum queries: sum(L,R) = P[R] - P[L-1].
R
Recurrence Relation
A formula expressing a DP value in terms of smaller DP values. Example: fib(n) = fib(n-1) + fib(n-2). Defines the DP transition.
S
Segment Tree A data structure for range queries and updates in O(log N). More powerful than prefix sums (supports updates). A Gold/Platinum topic.
Sparse Graph A graph with few edges relative to V². In practice: E = O(V). Use adjacency lists.
State (DP)
The set of information that uniquely identifies a DP subproblem. Example in knapsack: (item_index, remaining_capacity). Choosing the right state is the key skill in DP.
Subtree All nodes in a tree that are descendants of a given node (including itself). Tree DP often computes aggregate values over subtrees.
T
Tabulation Building a DP table iteratively from base cases to larger subproblems. "Bottom-up DP." No recursion, no stack overflow risk.
Time Limit Exceeded (TLE) A verdict meaning your solution is correct but too slow. In USACO, most problems have a 2-4 second time limit. If you get TLE, optimize the algorithm — not just the constant factors.
Topological Sort An ordering of vertices in a DAG such that for every directed edge u→v, u comes before v. Computed with DFS (reverse post-order) or Kahn's algorithm (BFS-based).
Two Pointers A technique using two indices moving through an array, either in the same direction (sliding window) or inward from opposite ends. Converts O(N²) pair searches into O(N). Works on sorted arrays or when the condition is monotone.
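An opposite-ends example: counting pairs with sum at most t in a sorted array. A sketch under our own naming (countPairsAtMost):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Count pairs (i, j) with i < j and a[i] + a[j] <= t.
// After sorting, the pointers move toward each other: O(N log N) total.
long long countPairsAtMost(vector<long long> a, long long t) {
    sort(a.begin(), a.end());
    long long cnt = 0;
    int i = 0, j = (int)a.size() - 1;
    while (i < j) {
        if (a[i] + a[j] <= t) {
            cnt += j - i;   // a[i] pairs with every element in (i, j]
            i++;
        } else {
            j--;            // a[j] is too large for any remaining partner
        }
    }
    return cnt;
}
```

For {1, 2, 3, 4} and t = 5 the valid pairs are (1,2), (1,3), (1,4), (2,3), so the count is 4.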
U
Union-Find (DSU)
A data structure supporting two operations: find(x) (which group is x in?) and union(x,y) (merge groups of x and y). With path compression + union by rank: O(α(N)) ≈ O(1) per operation. Used for dynamic connectivity, Kruskal's MST, cycle detection.
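A minimal implementation matching this description, using union by size (equivalent to union by rank for the O(α(N)) bound); the struct name and the self-check helper are ours:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Disjoint Set Union with path compression and union by size.
struct DSU {
    vector<int> parent, sz;
    DSU(int n) : parent(n), sz(n, 1) {
        iota(parent.begin(), parent.end(), 0);   // each node starts as its own root
    }
    int find(int x) {
        return parent[x] == x ? x : parent[x] = find(parent[x]);  // path compression
    }
    bool unite(int a, int b) {                   // returns false if already joined
        a = find(a); b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) swap(a, b);           // attach smaller tree under larger
        parent[b] = a;
        sz[a] += sz[b];
        return true;
    }
};

// Tiny self-check (ours): connect 0-1 and 3-4 in a 5-node DSU.
bool dsuDemo() {
    DSU d(5);
    d.unite(0, 1);
    d.unite(3, 4);
    return d.find(0) == d.find(1)
        && d.find(0) != d.find(3)
        && !d.unite(1, 0);                       // already in the same group
}
```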
V
Vertex (Node) A fundamental unit of a graph. Vertices have indices (usually 1-indexed in USACO).
W
Wrong Answer (WA) A verdict meaning your program ran but produced incorrect output. Check edge cases, off-by-ones, and overflow.
📊 Knowledge Dependency Map
This interactive map shows prerequisite relationships between all chapters. Click any node to highlight its prerequisites (red) and dependent chapters (green).
How to Read This Map
| Color | Meaning |
|---|---|
| 🔵 Blue nodes | C++ Foundation chapters (Ch.2.1–3.1) |
| 🟢 Green nodes | Core Data Structure chapters |
| 🟠 Orange nodes | Graph Algorithm chapters |
| 🟣 Purple nodes | Dynamic Programming chapters |
| 🔴 Red nodes | Greedy Algorithm chapters |
| Red highlighted edges | Prerequisites of the selected chapter |
| Green highlighted edges | Chapters unlocked by the selected chapter |
Tip: Click any node to reveal its full dependency chain. Click again (or press "↺ Clear Selection") to reset.