📖 Chapter 5.2 ⏱️ ~120 min read 🎯 Intermediate

Chapter 5.2: BFS & DFS

📝 Before You Continue: Make sure you understand graph representation (Chapter 5.1), queues and stacks (Chapter 3.6), and basic 2D array traversal (Chapter 2.3).

Graph traversal algorithms explore every node reachable from a starting point. They're the foundation of dozens of graph algorithms. DFS (Depth-First Search) dives deep before backtracking. BFS (Breadth-First Search) explores layer by layer. Knowing which to use and when is a skill you'll develop throughout your competitive programming career.

5.2.1 Depth-First Search (DFS)

DFS works like exploring a maze: you keep going forward until you hit a dead end, then backtrack and try another path. It's the most natural graph traversal — recursion does most of the work for you.

The Core Idea

Imagine you're standing at a crossroads in a maze. DFS says: pick one path and go as far as you can. When you hit a dead end (all neighbors already visited), backtrack to the last crossroads and try the next path. Repeat until every reachable node has been visited.

This "dive deep, then backtrack" behavior maps perfectly to recursion: each recursive call goes one step deeper, and returning from a call is backtracking.

DFS from node 1:

    1 ──── 2 ──── 4
    |      |
    3      5 ──── 6

Visit order: 1 → 2 → 4 (dead end, backtrack) → 5 → 6 (dead end, backtrack×2) → 3

Visual: DFS Traversal Order

DFS Traversal

DFS dives as deep as possible before backtracking. The numbered circles show the visit order, red dashed arrows show backtracking. The call stack on the right illustrates how recursion naturally implements the LIFO behaviour needed for DFS.

Recursive DFS

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
vector<int> adj[MAXN];
bool visited[MAXN];

void dfs(int u) {
    visited[u] = true;           // mark current node as visited
    cout << u << " ";            // process u (print it, in this example)

    for (int v : adj[u]) {       // for each neighbor v
        if (!visited[v]) {       // if not yet visited
            dfs(v);              // recursively explore v
        }
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;

    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    // DFS from node 1
    dfs(1);
    cout << "\n";

    return 0;
}

Step-by-Step Trace: How Recursive DFS Works

Let's trace through a concrete example. Given this graph:

Nodes: 1, 2, 3, 4, 5, 6
Edges: 1-2, 1-3, 2-4, 2-5, 5-6

Adjacency list (after reading input):
adj[1] = {2, 3}
adj[2] = {1, 4, 5}
adj[3] = {1}
adj[4] = {2}
adj[5] = {2, 6}
adj[6] = {5}

DFS from node 1 — full trace with call stack:

Call: dfs(1)
  visited[1] = true.  Print: 1
  Neighbors of 1: [2, 3]
  ├── v=2: not visited → call dfs(2)
  │   Call: dfs(2)
  │     visited[2] = true.  Print: 2
  │     Neighbors of 2: [1, 4, 5]
  │     ├── v=1: already visited → skip
  │     ├── v=4: not visited → call dfs(4)
  │     │   Call: dfs(4)
  │     │     visited[4] = true.  Print: 4
  │     │     Neighbors of 4: [2]
  │     │     └── v=2: already visited → skip
  │     │   Return from dfs(4)  ← backtrack!
  │     ├── v=5: not visited → call dfs(5)
  │     │   Call: dfs(5)
  │     │     visited[5] = true.  Print: 5
  │     │     Neighbors of 5: [2, 6]
  │     │     ├── v=2: already visited → skip
  │     │     └── v=6: not visited → call dfs(6)
  │     │         Call: dfs(6)
  │     │           visited[6] = true.  Print: 6
  │     │           Neighbors of 6: [5]
  │     │           └── v=5: already visited → skip
  │     │         Return from dfs(6)  ← backtrack!
  │     │   Return from dfs(5)  ← backtrack!
  │   Return from dfs(2)  ← backtrack!
  ├── v=3: not visited → call dfs(3)
  │   Call: dfs(3)
  │     visited[3] = true.  Print: 3
  │     Neighbors of 3: [1]
  │     └── v=1: already visited → skip
  │   Return from dfs(3)  ← backtrack!
Return from dfs(1)

Output: 1 2 4 5 6 3

Call stack at the deepest point (when visiting node 6):

┌─────────┐
│  dfs(6) │  ← current (depth 4)
├─────────┤
│  dfs(5) │  ← waiting for dfs(6) to return
├─────────┤
│  dfs(2) │  ← waiting for dfs(5) to return
├─────────┤
│  dfs(1) │  ← waiting for dfs(2) to return
├─────────┤
│  main() │  ← entry point
└─────────┘
Max recursion depth = 4 (for this graph)

💡 Key observation: The recursion depth equals the longest path from the starting node in the DFS tree. For a path graph 1→2→3→...→N, the depth is N — this is when stack overflow becomes a risk.

Important: Always mark nodes as visited before recursing, not after! This prevents infinite loops on cycles.

// ❌ WRONG — marks visited AFTER recursing → infinite loop on cycles!
void dfs_wrong(int u) {
    cout << u << " ";
    for (int v : adj[u]) {
        if (!visited[v]) {
            dfs_wrong(v);    // v might recurse back to u before u is marked!
        }
    }
    visited[u] = true;       // too late — cycle already caused infinite recursion
}

// ✅ CORRECT — marks visited BEFORE recursing
void dfs(int u) {
    visited[u] = true;       // mark FIRST
    cout << u << " ";
    for (int v : adj[u]) {
        if (!visited[v]) {
            dfs(v);
        }
    }
}

Complexity Analysis

Time: O(V + E) — Each vertex is visited exactly once (visited[] check). Each edge is examined exactly twice (once from each endpoint in an undirected graph). Total work = V + 2E = O(V + E).
Space: O(V) — The visited[] array uses O(V). The recursion call stack can go up to O(V) deep in the worst case (a path graph).

Iterative DFS (Using a Stack)

For very large graphs, recursive DFS can cause a stack overflow (too deep recursion). The default stack size is typically 1–8 MB, and each recursion level uses ~100–200 bytes. When the graph depth exceeds ~10^4–10^5, you'll crash.

The iterative version replaces the system call stack with an explicit stack<int>:

void dfs_iterative(int start, int n) {
    vector<bool> visited(n + 1, false);
    stack<int> st;

    st.push(start);

    while (!st.empty()) {
        int u = st.top();
        st.pop();

        if (visited[u]) continue;  // may have been pushed multiple times
        visited[u] = true;
        cout << u << " ";

        for (int v : adj[u]) {
            if (!visited[v]) {
                st.push(v);
            }
        }
    }
}

Trace of iterative DFS on the same graph (starting from node 1):

Stack: [1]
Pop 1 → not visited → mark, print 1. Push neighbors 2, 3.
Stack: [2, 3]

Pop 3 → not visited → mark, print 3. Push neighbor 1.
Stack: [2, 1]

Pop 1 → already visited → skip.
Stack: [2]

Pop 2 → not visited → mark, print 2. Push neighbors 1, 4, 5.
Stack: [1, 4, 5]

Pop 5 → not visited → mark, print 5. Push neighbors 2, 6.
Stack: [1, 4, 2, 6]

Pop 6 → not visited → mark, print 6. Push neighbor 5.
Stack: [1, 4, 2, 5]

Pop 5 → already visited → skip.
Pop 2 → already visited → skip.
Pop 4 → not visited → mark, print 4. Push neighbor 2.
Stack: [1, 2]

Pop 2 → already visited → skip.
Pop 1 → already visited → skip.
Stack empty → done.

Output: 1 3 2 5 6 4

⚠️ Note: Iterative DFS may visit nodes in a different order than recursive DFS (notice the output above is 1 3 2 5 6 4 vs recursive's 1 2 4 5 6 3). This is because the stack processes the last-pushed neighbor first. For most problems this doesn't matter — both visit all reachable nodes. If you need the exact same order as recursive DFS, push neighbors in reverse order.

When to use iterative DFS:

Graph depth could exceed ~10^4 (e.g., path graphs, chains)

Grid problems with N×M ≥ 10^6 cells

Any time you're worried about stack overflow

5.2.2 Connected Components

A connected component is a maximal set of vertices where every vertex can reach every other vertex through edges. Think of it as an "island" of connected nodes — if you start DFS from any node in the component, you'll visit every other node in that same component, but none outside it.

The Core Idea

A graph might not be fully connected. For example:

Component 1:    Component 2:    Component 3:
  1 — 2           5 — 6             8
  |   |           |
  3   4           7

This graph has 3 connected components: {1,2,3,4}, {5,6,7}, and {8}. Finding components is a very common USACO task — it answers questions like "how many separate groups exist?" or "are nodes A and B in the same group?"

Algorithm: Label Each Component with DFS

The strategy is simple:

Scan all nodes from 1 to N
When you find an unvisited node, it's the start of a new component
Run DFS from that node, labeling every reachable node with the same component ID
Repeat until all nodes are labeled

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
vector<int> adj[MAXN];
int comp[MAXN];   // comp[v] = component ID of vertex v (0 = unvisited)

void dfs(int u, int id) {
    comp[u] = id;
    for (int v : adj[u]) {
        if (comp[v] == 0) {   // 0 means unvisited
            dfs(v, id);
        }
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;

    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    int numComponents = 0;
    for (int u = 1; u <= n; u++) {
        if (comp[u] == 0) {
            numComponents++;
            dfs(u, numComponents);  // assign component ID
        }
    }

    cout << "Number of components: " << numComponents << "\n";

    // Print component sizes
    vector<int> size(numComponents + 1, 0);
    for (int u = 1; u <= n; u++) size[comp[u]]++;
    for (int i = 1; i <= numComponents; i++) {
        cout << "Component " << i << ": " << size[i] << " nodes\n";
    }

    return 0;
}

Step-by-Step Trace

Given the graph above (8 nodes, edges: 1-2, 1-3, 2-4, 5-6, 5-7):

Initial: comp[] = [0, 0, 0, 0, 0, 0, 0, 0, 0]  (all unvisited)

Scan u=1: comp[1]==0 → new component! numComponents=1
  dfs(1, 1): comp[1]=1
    → dfs(2, 1): comp[2]=1
      → dfs(4, 1): comp[4]=1 (no unvisited neighbors)
    → dfs(3, 1): comp[3]=1 (no unvisited neighbors)
  comp[] = [_, 1, 1, 1, 1, 0, 0, 0, 0]

Scan u=2: comp[2]==1 → already visited, skip
Scan u=3: comp[3]==1 → already visited, skip
Scan u=4: comp[4]==1 → already visited, skip

Scan u=5: comp[5]==0 → new component! numComponents=2
  dfs(5, 2): comp[5]=2
    → dfs(6, 2): comp[6]=2 (no unvisited neighbors)
    → dfs(7, 2): comp[7]=2 (no unvisited neighbors)
  comp[] = [_, 1, 1, 1, 1, 2, 2, 2, 0]

Scan u=6: already visited, skip
Scan u=7: already visited, skip

Scan u=8: comp[8]==0 → new component! numComponents=3
  dfs(8, 3): comp[8]=3 (no neighbors at all — isolated node)
  comp[] = [_, 1, 1, 1, 1, 2, 2, 2, 3]

Result: 3 components
  Component 1: nodes {1,2,3,4} — size 4
  Component 2: nodes {5,6,7}   — size 3
  Component 3: nodes {8}       — size 1

Complexity Analysis

Time: O(V + E) — The outer loop scans all V nodes. Each DFS call visits each node and edge at most once. Total: O(V + E).
Space: O(V + E) — Adjacency list O(V + E), component array O(V), recursion stack O(V).

Common USACO Applications

Connected components appear in many forms:

"How many groups?" — Count components
"Are A and B connected?" — Check if comp[A] == comp[B]
"Largest group size?" — Find the component with the most nodes
"Can we make the graph connected by adding K edges?" — Need exactly numComponents - 1 edges to connect all components

💡 Alternative: Union-Find (DSU) can also find connected components, and supports dynamic edge additions. We'll cover DSU in Chapter 5.3.

5.2.3 Breadth-First Search (BFS)

BFS explores all nodes at distance 1, then all at distance 2, then distance 3, and so on. This makes it perfect for finding shortest paths in unweighted graphs. While DFS dives deep, BFS spreads wide — like ripples in a pond.

The Core Idea

BFS uses a queue (FIFO: First In, First Out) to process nodes in order of their distance from the source:

Start at the source node (distance 0)
Visit all its neighbors (distance 1)
Visit all their unvisited neighbors (distance 2)
Continue until all reachable nodes are visited

The queue ensures that all nodes at distance d are processed before any node at distance d+1. This level-by-level expansion is what guarantees shortest paths.

BFS from node S:

Level 0:  [S]
Level 1:  [neighbors of S]
Level 2:  [neighbors of level-1 nodes, not yet visited]
Level 3:  [neighbors of level-2 nodes, not yet visited]
...

Visual: BFS Level-by-Level Traversal

BFS Traversal

BFS spreads outward like ripples in a pond. Each "level" of nodes is colored differently, showing that all nodes at distance d from the source are discovered before any node at distance d+1. The queue at the bottom shows the processing order.

BFS Template

The BFS template below is the single most important code pattern in this chapter. You'll use it (or a variant of it) in dozens of problems.

// Solution: BFS Shortest Path — O(V + E)
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
vector<int> adj[MAXN];

// Returns array of shortest distances from source to all vertices
// dist[v] = -1 means unreachable
vector<int> bfs(int source, int n) {
    vector<int> dist(n + 1, -1);   // -1 = unvisited (also serves as "visited" check)
    queue<int> q;

    dist[source] = 0;     // distance to source is 0
    q.push(source);       // seed the queue with the source

    while (!q.empty()) {
        int u = q.front();        // take the EARLIEST-discovered node
        q.pop();

        for (int v : adj[u]) {           // for each neighbor of u
            if (dist[v] == -1) {          // if v hasn't been visited yet
                dist[v] = dist[u] + 1;   // ← KEY LINE: v is one hop further than u
                q.push(v);                // add v to queue for future processing
            }
        }
    }

    return dist;
}

Line-by-line breakdown of the key parts:

Line	What it does	Why it matters
`dist(n+1, -1)`	Initialize all distances to -1	-1 means "not yet reached" — doubles as visited check
`dist[source] = 0`	Source is at distance 0 from itself	Starting point of BFS
`q.push(source)`	Seed the queue	BFS needs at least one node to start
`u = q.front(); q.pop()`	Process the earliest-discovered node	FIFO order guarantees level-by-level processing
`if (dist[v] == -1)`	Only visit unvisited nodes	Prevents revisiting and infinite loops
`dist[v] = dist[u] + 1`	THE KEY LINE — distance increases by 1	Each edge has weight 1 in unweighted graphs
`q.push(v)`	Schedule v for future processing	v's neighbors will be explored later

Why BFS Finds Shortest Paths — A Detailed Explanation

BFS processes nodes in order of their distance from the source. The first time BFS visits a node, it's via the shortest path. This is because BFS never visits a node at distance d+1 before visiting all nodes at distance d.

Proof by induction (informal):

Base case: The source has distance 0. Correct ✓
Inductive step: Assume all nodes at distance ≤ d have been correctly assigned their shortest distance. When BFS processes a node u at distance d, it discovers u's unvisited neighbors and assigns them distance d+1. Since:
- All nodes at distance ≤ d are already visited (by induction)
- Any path to these neighbors through already-visited nodes would be ≥ d+1
- Therefore d+1 is the shortest possible distance ✓

Concrete intuition — why DFS fails but BFS succeeds:

Graph:  1 — 2 — 3 — 4
        |           |
        5 ——————————┘

Shortest path from 1 to 4: 1 → 5 → 4 (distance 2)

BFS from 1:
  Level 0: {1}
  Level 1: {2, 5}        ← discovers 5 at distance 1
  Level 2: {3, 4}        ← discovers 4 at distance 2 (via 5) ✓

DFS from 1 (might go 1→2→3→4):
  Visits: 1, 2, 3, 4     ← finds 4 at distance 3 (via 2→3→4) ✗
  The shorter path 1→5→4 was missed because DFS committed to the 1→2 path first!

💡 Key Insight: Think of BFS as dropping a stone in water — ripples spread outward one layer at a time. All cells at distance 1 are processed before any cell at distance 2. This level-by-level processing guarantees the first visit to any node is via the shortest path.

BFS vs. DFS for shortest path:

BFS: guaranteed shortest path in unweighted graphs ✓

DFS: does NOT guarantee shortest path ✗

Complexity Analysis:

Time: O(V + E) — each vertex and edge is processed at most once
Space: O(V) — for the distance array and queue

Complete BFS Shortest Path Trace on a 4×4 Grid

Let's trace BFS starting from node 1 in this graph:

Edges: 1-2, 2-3, 1-4, 3-6, 4-5, 5-7, 7-8

BFS Trace:

Start: dist = [-1, 0, -1, -1, -1, -1, -1, -1, -1]  (1-indexed, source=1)
Queue: [1]

Process 1: neighbors 2, 4
  → dist[2] = 1, dist[4] = 1
  Queue: [2, 4]

Process 2: neighbors 1, 3
  → 1 already visited; dist[3] = 2
  Queue: [4, 3]

Process 4: neighbors 1, 5
  → 1 already visited; dist[5] = 2
  Queue: [3, 5]

Process 3: neighbors 2, 6
  → 2 already visited; dist[6] = 3
  Queue: [5, 6]

Process 5: neighbors 4, 7
  → 4 already visited; dist[7] = 3
  Queue: [6, 7]

Process 6: neighbor 3 → already visited
Process 7: neighbors 5, 8
  → 5 already visited; dist[8] = 4
  Queue: [8]

Process 8: neighbor 7 → already visited. Queue empty.

Final distances from node 1:
Node: 1  2  3  4  5  6  7  8
Dist: 0  1  2  1  2  3  3  4

5.2.4 Grid BFS — The Most Common USACO Pattern

Many USACO problems give you a grid with passable (.) and blocked (#) cells. BFS finds the shortest path from one cell to another.

Visual: Grid BFS Distance Flood Fill

Grid BFS

Starting from the center cell (distance 0), BFS expands to all reachable cells, recording the minimum number of steps to reach each one. Cells colored more blue are farther away. This is exactly how USACO flood-fill and shortest-path problems work on grids.

USACO-Style Grid BFS Problem: Maze Shortest Path

Problem: Given a 5×5 maze with walls (#) and open cells (.), find the shortest path from top-left (0,0) to bottom-right (4,4). Print the length, or -1 if no path exists.

The Maze:

. . . # .
# # . # .
. . . . .
. # # # .
. . . . .

BFS Trace — Distance Array Filling:

Starting at (0,0), BFS expands level by level. Here's the distance each cell gets assigned:

Step 0 — Initialize:
dist[0][0] = 0, queue: [(0,0)]

Step 1 — Process (0,0):
  Neighbors: (0,1)='.', (1,0)='#'(wall)
  dist[0][1] = 1. Queue: [(0,1)]

Step 2 — Process (0,1):
  Neighbors: (0,0)=visited, (0,2)='.', (1,1)='#'
  dist[0][2] = 2. Queue: [(0,2)]

Step 3 — Process (0,2):
  Neighbors: (0,1)=visited, (0,3)='#', (1,2)='.'
  dist[1][2] = 3. Queue: [(1,2)]

Step 4 — Process (1,2):
  Neighbors: (0,2)=visited, (1,1)='#', (1,3)='#', (2,2)='.'
  dist[2][2] = 4. Queue: [(2,2)]

Step 5 — Process (2,2):
  Neighbors: (1,2)=visited, (2,1)='.', (2,3)='.', (3,2)='#'
  dist[2][1] = 5, dist[2][3] = 5. Queue: [(2,1),(2,3)]

...continuing BFS...

Final distance array (. = reachable, # = wall, X = unreachable):
    c=0  c=1  c=2  c=3  c=4
r=0:  0    1    2    #    X
r=1:  #    #    3    #    X
r=2:  8    5    4    5    6
r=3:  9    #    #    #    7
r=4: 10   11   12   11    8

Shortest path length = dist[4][4] = 8

Path reconstruction: Follow the path backward from (4,4), always moving to the cell with distance one less:

(4,4)=8 → (3,4)=7 → (2,4)=6 → (2,3)=5 → (2,2)=4 → (1,2)=3 → (0,2)=2 → (0,1)=1 → (0,0)=0
Path length: 8 steps ✓

ASCII Visualization of the path:

S → . → . # .
# # ↓ # .
. . ↓ → → →
. # # # ↓
. . . . E

Complete C++ Code:

// Solution: Grid BFS Shortest Path — O(R × C)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int R, C;
    cin >> R >> C;
    vector<string> grid(R);
    for (int r = 0; r < R; r++) cin >> grid[r];

    // Find start (S) and end (E), or use fixed corners
    int sr = 0, sc = 0, er = R-1, ec = C-1;

    // BFS distance array: -1 = unvisited
    vector<vector<int>> dist(R, vector<int>(C, -1));
    queue<pair<int,int>> q;

    // Step 1: Seed BFS from source
    dist[sr][sc] = 0;
    q.push({sr, sc});

    // Step 2: Direction arrays (up, down, left, right)
    int dr[] = {-1, 1, 0, 0};
    int dc[] = {0, 0, -1, 1};

    // Step 3: BFS expansion
    while (!q.empty()) {
        auto [r, c] = q.front();
        q.pop();

        for (int d = 0; d < 4; d++) {
            int nr = r + dr[d];
            int nc = c + dc[d];

            if (nr >= 0 && nr < R           // in-bounds row
                && nc >= 0 && nc < C        // in-bounds col
                && grid[nr][nc] != '#'       // not a wall
                && dist[nr][nc] == -1) {     // ← KEY LINE: not yet visited

                dist[nr][nc] = dist[r][c] + 1;
                q.push({nr, nc});
            }
        }
    }

    // Step 4: Output result
    if (dist[er][ec] == -1) {
        cout << -1 << "\n";   // no path
    } else {
        cout << dist[er][ec] << "\n";
    }

    return 0;
}

📋 Sample Input/Output (6 lines, click to expand)

Sample Input (the maze above):

5 5
...#.
##.#.
.....
.###.
.....

Sample Output:

⚠️ Common Mistake: Using DFS instead of BFS for shortest path in a maze. DFS might find A path, but not the SHORTEST path. Always use BFS for shortest distances in unweighted grids.

5.2.5 USACO Example: Flood Fill

USACO loves "flood fill" problems: find all connected cells of the same type, or count connected regions. Flood fill is essentially DFS/BFS on a grid — it "paints" all reachable cells of the same type starting from a seed cell.

The Core Idea

Flood fill works exactly like the "paint bucket" tool in image editors: click on a pixel, and all connected pixels of the same color get filled. In code:

Start at a cell
Mark it as visited
Recursively visit all 4-directional neighbors that are the same type and unvisited
Stop when no more neighbors qualify

Problem: Count Connected Regions

Problem: Count the number of distinct connected regions of '.' cells in a grid. (Like counting islands, but counting water regions instead of land.)

Example grid:

. . # # .
. . # . .
# # # . .
. . . # #
. . . # .

Regions of '.' cells:

Region 1:     Region 2:     Region 3:     Region 4:
. .           . .           . .           .
. .           . .           . .
              . .

Answer: 4 regions.

Complete Code with Detailed Comments

#include <bits/stdc++.h>
using namespace std;

int R, C;
vector<string> grid;
vector<vector<bool>> visited;

void floodFill(int r, int c) {
    // Base cases: stop recursion if invalid
    if (r < 0 || r >= R || c < 0 || c >= C) return;  // out of bounds
    if (visited[r][c]) return;                          // already visited
    if (grid[r][c] == '#') return;                      // wall (not our target type)

    // Mark this cell as visited (part of current region)
    visited[r][c] = true;

    // Recurse in all 4 directions
    floodFill(r - 1, c);  // up
    floodFill(r + 1, c);  // down
    floodFill(r, c - 1);  // left
    floodFill(r, c + 1);  // right
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    cin >> R >> C;
    grid.resize(R);
    visited.assign(R, vector<bool>(C, false));

    for (int r = 0; r < R; r++) cin >> grid[r];

    int regions = 0;
    for (int r = 0; r < R; r++) {
        for (int c = 0; c < C; c++) {
            if (!visited[r][c] && grid[r][c] == '.') {
                regions++;           // found a new unvisited '.' cell → new region!
                floodFill(r, c);     // mark ALL cells in this region as visited
            }
        }
    }

    cout << regions << "\n";
    return 0;
}

Step-by-Step Trace

Using the 5×5 grid above:

Scan (0,0): grid='.', not visited → NEW REGION #1!
  floodFill(0,0) → marks (0,0), recurses...
    floodFill(0,1) → marks (0,1), recurses...
      floodFill(1,1) → marks (1,1), recurses...
        floodFill(1,0) → marks (1,0), recurses...
          (all neighbors: out of bounds or visited or '#')
        (all other neighbors: visited or '#')
      (all other neighbors: visited or '#')
    (all other neighbors: visited or '#')
  Region 1 = {(0,0), (0,1), (1,0), (1,1)} — 4 cells

Scan (0,2): grid='#' → skip
Scan (0,3): grid='#' → skip

Scan (0,4): grid='.', not visited → NEW REGION #2!
  floodFill(0,4) → marks (0,4), recurses...
    floodFill(1,4) → marks (1,4), recurses...
      floodFill(1,3) → marks (1,3), recurses...
        floodFill(2,3) → marks (2,3), recurses...
          floodFill(2,4) → marks (2,4), recurses...
            (all neighbors: visited or '#' or OOB)
          (all other neighbors: visited or '#')
        (all other neighbors: visited or '#')
      (all other neighbors: visited or '#')
    (all other neighbors: visited)
  Region 2 = {(0,4), (1,3), (1,4), (2,3), (2,4)} — 5 cells

Scan (3,0): grid='.', not visited → NEW REGION #3!
  floodFill(3,0) → marks (3,0), recurses...
    → marks (3,1), (3,2), (4,0), (4,1), (4,2)
  Region 3 = {(3,0), (3,1), (3,2), (4,0), (4,1), (4,2)} — 6 cells
  (note: (4,4) is NOT reachable from here — (3,3)='#', (3,4)='#', (4,3)='#' all block it)

Scan (4,4): grid='.', not visited → NEW REGION #4!
  floodFill(4,4) → marks (4,4)
  Region 4 = {(4,4)} — 1 cell (isolated: surrounded by '#' on all reachable sides)

Final answer: 4 regions ✓

Complexity Analysis

Time: O(R × C) — Each cell is visited at most once by floodFill. The outer double loop also scans each cell once. Total: O(R × C).
Space: O(R × C) — The visited array uses O(R × C). The recursion stack can go up to O(R × C) deep in the worst case (a spiral path).

Variant: BFS-based Flood Fill (Avoids Stack Overflow)

For large grids (R × C ≥ 10^6), recursive flood fill can cause stack overflow. Use BFS instead:

void floodFillBFS(int sr, int sc) {
    queue<pair<int,int>> q;
    visited[sr][sc] = true;
    q.push({sr, sc});

    int dr[] = {-1, 1, 0, 0};
    int dc[] = {0, 0, -1, 1};

    while (!q.empty()) {
        auto [r, c] = q.front(); q.pop();
        for (int d = 0; d < 4; d++) {
            int nr = r + dr[d], nc = c + dc[d];
            if (nr >= 0 && nr < R && nc >= 0 && nc < C
                && !visited[nr][nc] && grid[nr][nc] == '.') {
                visited[nr][nc] = true;
                q.push({nr, nc});
            }
        }
    }
}

Variant: Flood Fill with Region Size Tracking

Often you need not just the count, but the size of each region:

int floodFillSize(int sr, int sc) {
    int size = 0;
    queue<pair<int,int>> q;
    visited[sr][sc] = true;
    q.push({sr, sc});

    int dr[] = {-1, 1, 0, 0};
    int dc[] = {0, 0, -1, 1};

    while (!q.empty()) {
        auto [r, c] = q.front(); q.pop();
        size++;   // count this cell
        for (int d = 0; d < 4; d++) {
            int nr = r + dr[d], nc = c + dc[d];
            if (nr >= 0 && nr < R && nc >= 0 && nc < C
                && !visited[nr][nc] && grid[nr][nc] == '.') {
                visited[nr][nc] = true;
                q.push({nr, nc});
            }
        }
    }
    return size;
}

// Usage: find the largest region
int maxSize = 0;
for (int r = 0; r < R; r++)
    for (int c = 0; c < C; c++)
        if (!visited[r][c] && grid[r][c] == '.')
            maxSize = max(maxSize, floodFillSize(r, c));

💡 USACO tip: Flood fill problems are extremely common at Bronze and Silver levels. Common variations include: counting regions, finding the largest region, checking if two cells are in the same region, and computing the perimeter of a region.

5.2.6 Multi-Source BFS

Sometimes you need to compute the distance from every cell to the nearest special cell — for example, "how far is each empty cell from the nearest fire?" Starting a separate BFS from each fire cell would be O(K × R × C) where K is the number of fires — too slow. Multi-source BFS solves this in a single O(R × C) pass.

The Core Idea

Instead of running BFS from one source, push ALL source cells into the queue at distance 0 before starting BFS. Then run BFS normally. Each cell gets assigned the distance to its nearest source — guaranteed by BFS's level-order property.

Why does this work? Imagine a virtual "super-source" node S* connected to all real sources with cost-0 edges. BFS from S* would first visit all real sources (distance 0), then their neighbors (distance 1), and so on. Multi-source BFS is exactly this — without actually creating the virtual node.

Virtual super-source view:

         S* (virtual, dist=0)
        / | \
       /  |  \
     F₁  F₂  F₃    ← all fire sources at dist=0
     |    |    |
    ...  ...  ...   ← their neighbors at dist=1

Code Template

// Multi-source BFS: start from all fire cells at once
queue<pair<int,int>> q;
vector<vector<int>> dist(R, vector<int>(C, -1));

int dr[] = {-1, 1, 0, 0};
int dc[] = {0, 0, -1, 1};

// Step 1: Push ALL sources at distance 0 BEFORE starting BFS
for (int r = 0; r < R; r++) {
    for (int c = 0; c < C; c++) {
        if (grid[r][c] == 'F') {  // fire cell = source
            dist[r][c] = 0;
            q.push({r, c});
        }
    }
}

// Step 2: Run BFS from all sources simultaneously
while (!q.empty()) {
    auto [r, c] = q.front();
    q.pop();
    for (int d = 0; d < 4; d++) {
        int nr = r + dr[d], nc = c + dc[d];
        if (nr >= 0 && nr < R && nc >= 0 && nc < C
            && grid[nr][nc] != '#' && dist[nr][nc] == -1) {
            dist[nr][nc] = dist[r][c] + 1;
            q.push({nr, nc});
        }
    }
}
// After BFS: dist[r][c] = minimum distance from (r,c) to nearest fire cell

Step-by-Step Trace

Given this 3×4 grid with two fire cells:

. F . .
. # . F
. . . .

Initialization — push all 'F' cells at distance 0:

dist:                Queue:
-1  0 -1 -1         [(0,1), (1,3)]
-1  #  -1  0
-1 -1 -1 -1

Process (0,1) [dist=0]: neighbors (0,0)='.', (0,2)='.', (1,1)='#'(wall)

dist:                Queue:
 1  0  1 -1         [(1,3), (0,0), (0,2)]
-1  #  -1  0
-1 -1 -1 -1

Process (1,3) [dist=0]: neighbors (0,3)='.', (1,2)='.', (2,3)='.'

dist:                Queue:
 1  0  1  1         [(0,0), (0,2), (0,3), (1,2), (2,3)]
-1  #  -1  0
-1 -1 -1  1

Process (0,0) [dist=1]: neighbors (1,0)='.'

dist:                Queue:
 1  0  1  1         [(0,2), (0,3), (1,2), (2,3), (1,0)]
 2  #  -1  0
-1 -1 -1  1

...continuing BFS until queue is empty...

Final distance array:

Each cell shows its minimum distance to the nearest fire cell. Notice cell (2,0) has distance 3 — it's 3 steps from the nearest fire at (0,1).

💡 Key insight: The order of sources in the queue doesn't matter. BFS processes all distance-0 cells first, then all distance-1 cells, etc. Each cell is guaranteed to be reached first by the nearest source.

5.2.7 DFS vs. BFS — When to Use Each

This is one of the most important decisions in graph problems. Here's a comprehensive guide:

Quick Reference Table

Task	Use	Why
Shortest path (unweighted)	BFS ✓	Level-by-level guarantees shortest
Connectivity / connected components	Either	Both work; DFS often simpler recursively
Cycle detection (directed)	DFS ✓	3-color scheme tracks current path
Cycle detection (undirected)	Either	DFS with parent check, or DSU
Topological sort	DFS ✓	Post-order gives reverse topological order
Flood fill	Either (DFS often simpler)	DFS recursion is concise
Bipartite check	BFS or DFS	2-color with either
Distance to ALL nodes	BFS ✓	BFS naturally computes all distances
Tree traversals (pre/in/post order)	DFS ✓	Recursion maps naturally to tree structure
Path existence (just yes/no)	Either	Both find all reachable nodes
Nearest source (multi-source)	BFS ✓	Multi-source BFS is the standard approach

The Decision Flowchart

Do you need the SHORTEST path / MINIMUM steps?
├── YES → Use BFS (always!)
└── NO  → Do you need to explore paths / detect back edges?
          ├── YES → Use DFS (recursion tracks the current path)
          └── NO  → Either works. DFS is often shorter code.

Memory and Performance Comparison

Aspect	DFS (Recursive)	DFS (Iterative)	BFS
Data structure	Call stack (implicit)	Explicit `stack<int>`	`queue<int>`
Time	O(V + E)	O(V + E)	O(V + E)
Space	O(V) stack frames	O(V) stack entries	O(V) queue entries
Max memory usage	Proportional to max depth	Proportional to max depth	Proportional to max width
Stack overflow risk	Yes (depth > ~10^4)	No	No
Visit order	Deep-first	Deep-first	Level-by-level

💡 Key Insight: Use BFS whenever you need "the minimum number of steps." Use DFS whenever you just need to visit all nodes or check properties of paths (cycles, topological order, subtree properties).

Rule of thumb for USACO:

Bronze/Silver grid problems: BFS for shortest path, DFS for flood fill

Silver graph problems: BFS for distances, DFS for components

Gold: DFS for topological sort, cycle detection; BFS for multi-source distances

⚠️ Common Mistakes in Chapter 5.2

These are the bugs that trip up beginners most often. Each one has cost someone a contest submission — learn from their pain!

Mistake 1: Using DFS for Shortest Path

DFS explores one path deeply and doesn't guarantee minimum steps. Always use BFS for unweighted shortest paths.

// ❌ WRONG: DFS does NOT find shortest path!
void dfs(int u, int depth) {
    dist[u] = depth;   // might assign a LONGER path first
    for (int v : adj[u])
        if (dist[v] == -1)
            dfs(v, depth + 1);
}

// ✅ CORRECT: BFS guarantees shortest path
void bfs(int source) {
    dist[source] = 0;
    queue<int> q;
    q.push(source);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v : adj[u])
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;  // guaranteed shortest
                q.push(v);
            }
    }
}

Why DFS fails: DFS might reach node X via a long path (e.g., 1→2→3→4→X, distance 4) before discovering the short path (e.g., 1→X, distance 1). Once X is marked visited, the short path is never explored.

Mistake 2: Forgetting Bounds Check in Grid BFS

nr >= 0 && nr < R && nc >= 0 && nc < C — missing any one of these four conditions causes out-of-bounds crashes.

// ❌ WRONG: missing lower bound check
if (nr < R && nc < C && grid[nr][nc] != '#')  // nr could be -1!

// ❌ WRONG: checking grid BEFORE bounds
if (grid[nr][nc] != '#' && nr >= 0 && nr < R)  // accesses grid[-1][c]!

// ✅ CORRECT: bounds check FIRST, then grid check
if (nr >= 0 && nr < R && nc >= 0 && nc < C && grid[nr][nc] != '#')

Debugging tip: If your grid BFS crashes with a segfault or gives garbage output, the bounds check is the first thing to verify.

Mistake 3: Marking Visited When Popping Instead of Pushing

If you mark visited when popping instead of when pushing, the same node can be pushed multiple times, causing O(V²) time instead of O(V+E).

Why it happens — scenario: Consider a node X with three neighbors A, B, C that are all already in the queue at the same BFS level. When we dequeue them one by one, each of them looks at X and checks "is X visited yet?"

BFS Mark on Pop vs Push

// ❌ WRONG: mark when popping → same node pushed multiple times
while (!q.empty()) {
    auto [r, c] = q.front(); q.pop();
    if (visited[r][c]) continue;   // wasteful: already in queue many times
    visited[r][c] = true;
    for (...) {
        if (!visited[nr][nc]) {
            q.push({nr, nc});      // might push (nr,nc) from MULTIPLE neighbors!
        }
    }
}

// ✅ CORRECT: mark when pushing → each node pushed exactly once
while (!q.empty()) {
    auto [r, c] = q.front(); q.pop();
    for (...) {
        if (dist[nr][nc] == -1) {
            dist[nr][nc] = dist[r][c] + 1;  // mark immediately
            q.push({nr, nc});                // pushed exactly once
        }
    }
}

Trace of the WRONG version (neighbors A, B, C are processed in order, X is their common neighbor):

Step	Dequeue	`visited[X]`?	Action on X	Queue after step
1	A	false	push X	`[B, C, X]`
2	B	false (still!)	push X again	`[C, X, X]`
3	C	false (still!)	push X again	`[X, X, X]`
4	X	false → mark true	—	`[X, X]`
5	X	true → skip	—	`[X]`
6	X	true → skip	—	`[]`

X was enqueued 3 times and dequeued 3 times; only the first actually processes it, the other two are wasted work.

Impact: On a 1000×1000 grid where each cell has ~4 neighbors, the wrong version can push up to 4× more entries into the queue (4 million instead of 1 million) — causing TLE or MLE. In the worst case (dense graph with V nodes, E = O(V²) edges), the wrong version runs in O(V + E) = O(V²) queue operations; the correct version guarantees exactly V pushes. On a sparse graph with E = O(V), both run in O(V), but the wrong version still has a larger constant factor.

Mistake 4: Stack Overflow in Recursive DFS

For grids with N×M = 10^6, recursive DFS can exceed the default stack size (typically 1–8 MB). Each recursion level uses ~100–200 bytes, so depth > ~10^4–10^5 will crash.

// ❌ RISKY: recursive DFS on large grid
void dfs(int r, int c) {
    visited[r][c] = true;
    for (int d = 0; d < 4; d++) {
        int nr = r + dr[d], nc = c + dc[d];
        if (valid(nr, nc) && !visited[nr][nc])
            dfs(nr, nc);   // recursion depth can reach R×C in worst case!
    }
}

// ✅ SAFE: iterative BFS or iterative DFS with explicit stack
void bfs(int sr, int sc) {
    queue<pair<int,int>> q;
    visited[sr][sc] = true;
    q.push({sr, sc});
    while (!q.empty()) { /* ... */ }
}

When does this happen? A spiral-shaped grid can force DFS to recurse R×C times. For R=C=1000, that's 10^6 recursion levels ≈ 100–200 MB of stack — instant crash.

Mistake 5: Using Wrong Starting Point (0-indexed vs 1-indexed)

In grid problems, make sure you're BFSing from the correct cell. USACO problems sometimes use 1-indexed grids, sometimes 0-indexed.

// ❌ WRONG: grid is 0-indexed but starting from (1,1)
dist[1][1] = 0;  // should be dist[0][0] for top-left corner!

// ✅ CORRECT: match the problem's indexing
// If problem says "row 1, column 1" but grid is 0-indexed:
dist[0][0] = 0;  // convert to 0-indexed

Debugging tip: If BFS gives wrong distances or misses the target, print the start and end coordinates and verify they match the problem statement.

Chapter Summary

📌 Key Takeaways

Algorithm	Data Structure	Time	Space	Best For
DFS (recursive)	Call stack	`O(V+E)`	`O(V)`	Connectivity, cycle detection, tree problems
DFS (iterative)	Explicit stack	`O(V+E)`	`O(V)`	Same, avoids stack overflow
BFS	Queue	`O(V+E)`	`O(V)`	Shortest path, layer traversal
Multi-source BFS	Queue (multi-source pre-fill)	`O(V+E)`	`O(V)`	Distance from each node to nearest source
3-Color DFS	Color array	`O(V+E)`	`O(V)`	Directed graph cycle detection
Topological Sort	DFS/BFS (Kahn)	`O(V+E)`	`O(V)`	Sorting/DP on DAG

❓ FAQ

Q1: Both BFS and DFS have time complexity O(V+E). Why can BFS find shortest paths but DFS cannot?

A: The key is visit order. BFS uses a queue to guarantee "process all nodes at distance d before distance d+1," so the first time a node is reached is always via the shortest path. DFS uses a stack (or recursion) and may take a long path to a node, missing shorter ones.

Q2: When does recursive DFS cause stack overflow? How to fix it?

A: Default stack size is ~1-8 MB. Each recursion level uses ~100-200 bytes. When graph depth exceeds ~10^4-10^5, overflow may occur. Solutions: ① Switch to iterative DFS (explicit stack); ② Add -Wl,-z,stacksize=67108864 at compile time to increase stack size.

Q3: In Grid BFS, why use dist == -1 for unvisited instead of a visited array?

A: Using dist[r][c] == -1 kills two birds with one stone: it records both "visited or not" and "distance to reach." One fewer array, cleaner code.

Q4: When to use DFS topological sort vs. Kahn's BFS topological sort?

A: DFS topological sort has shorter code (just reverse postorder), but Kahn's is more intuitive and can detect cycles (if final sorted length < N, there is a cycle). Both are common in contests; choose whichever you're more comfortable with.

🔗 Connections to Later Chapters

Chapter 5.3 (Trees & DSU): Tree Traversal (pre/postorder) is essentially DFS
Chapters 5.3 & 6.1–6.3 (DP): "DP on DAG" requires topological sort first, then compute DP in topological order
Chapter 4.1 (Greedy): Some graph greedy problems need BFS to compute distances as input
BFS shortest path is a simplified version of Dijkstra (Gold level)—Dijkstra handles weighted graphs, BFS handles unweighted
Multi-source BFS is extremely common in USACO Silver and is a must-master core technique

Practice Problems

Problem 5.2.1 — Island Count 🟢 Easy

Problem: You are given an N×M grid. Each cell is either . (water) or # (land). Two land cells are part of the same island if they are adjacent horizontally or vertically. Count the total number of distinct islands.

Input format:

N M
row₁
row₂
...

Sample Input 1:

4 5
.###.
.#.#.
.###.
.....

Sample Output 1:

(All # cells are connected — one island)

Sample Input 2:

3 5
#.#.#
.....
#.#.#

Sample Output 2:

(Six isolated land cells, each is its own island)

Constraints: 1 ≤ N, M ≤ 1000

💡 Hint

Scan every cell. When you find an unvisited # cell, increment the island count and run DFS/BFS to mark all connected # cells as visited.

✅ Full Solution

#include <bits/stdc++.h>
using namespace std;

int R, C;
vector<string> grid;
vector<vector<bool>> visited;

int dr[] = {-1, 1, 0, 0};
int dc[] = {0, 0, -1, 1};

void dfs(int r, int c) {
    if (r < 0 || r >= R || c < 0 || c >= C) return;
    if (visited[r][c] || grid[r][c] == '.') return;

    visited[r][c] = true;
    for (int d = 0; d < 4; d++)
        dfs(r + dr[d], c + dc[d]);
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    cin >> R >> C;
    grid.resize(R);
    visited.assign(R, vector<bool>(C, false));
    for (int r = 0; r < R; r++) cin >> grid[r];

    int islands = 0;
    for (int r = 0; r < R; r++)
        for (int c = 0; c < C; c++)
            if (!visited[r][c] && grid[r][c] == '#') {
                islands++;
                dfs(r, c);
            }

    cout << islands << "\n";
    return 0;
}
// Time: O(N×M),  Space: O(N×M)

Problem 5.2.2 — Maze Shortest Path 🟢 Easy

Problem: Given an N×M maze with S (start), E (end), . (passable), and # (wall), find the minimum number of steps to get from S to E (moving only up/down/left/right). Output −1 if no path exists.

Input format:

N M
row₁
...

📋 Sample Input/Output 1 (6 lines, click to expand)

Sample Input 1:

5 5
S...#
####.
....E
.####
.....

Sample Output 1:

Sample Input 2:

3 3
S#E
.#.
...

Sample Output 2:

-1

(S and E are separated by walls with no passable path)

Constraints: 1 ≤ N, M ≤ 1000, exactly one S and one E exist.

💡 Hint

BFS from S. The first time BFS reaches E, dist[E] is the minimum steps. If BFS ends without reaching E, output −1.

✅ Full Solution

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int R, C;
    cin >> R >> C;
    vector<string> grid(R);
    for (int r = 0; r < R; r++) cin >> grid[r];

    // Find S and E
    int sr, sc, er, ec;
    for (int r = 0; r < R; r++)
        for (int c = 0; c < C; c++) {
            if (grid[r][c] == 'S') { sr = r; sc = c; }
            if (grid[r][c] == 'E') { er = r; ec = c; }
        }

    vector<vector<int>> dist(R, vector<int>(C, -1));
    queue<pair<int,int>> q;
    dist[sr][sc] = 0;
    q.push({sr, sc});

    int dr[] = {-1, 1, 0, 0};
    int dc[] = {0, 0, -1, 1};

    while (!q.empty()) {
        auto [r, c] = q.front(); q.pop();
        for (int d = 0; d < 4; d++) {
            int nr = r + dr[d], nc = c + dc[d];
            if (nr >= 0 && nr < R && nc >= 0 && nc < C
                && grid[nr][nc] != '#' && dist[nr][nc] == -1) {
                dist[nr][nc] = dist[r][c] + 1;
                q.push({nr, nc});
            }
        }
    }

    cout << dist[er][ec] << "\n";
    return 0;
}
// Time: O(N×M),  Space: O(N×M)

Problem 5.2.3 — Bipartite Check 🟡 Medium

Problem: A graph is bipartite if you can color every node either black or white such that every edge connects a black node to a white node. Given an undirected graph, determine if it is bipartite. Print "BIPARTITE" or "NOT BIPARTITE".

Input format:

N M
u₁ v₁
...

Sample Input 1:

Sample Output 1:

BIPARTITE

(A 4-cycle: color 1,3 black and 2,4 white)

Sample Input 2:

Sample Output 2:

NOT BIPARTITE

(A 3-cycle (triangle) — odd cycles are never bipartite)

Constraints: 1 ≤ N ≤ 10^5, 0 ≤ M ≤ 2×10^5

💡 Hint

BFS and 2-color. Assign color 0 to the source. For each uncolored neighbor, assign the opposite color (1 − current color). If a neighbor already has the same color as the current node, the graph is NOT bipartite.

✅ Full Solution

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;

    vector<vector<int>> adj(n + 1);
    for (int i = 0; i < m; i++) {
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    vector<int> color(n + 1, -1);  // -1 = uncolored
    bool bipartite = true;

    for (int start = 1; start <= n && bipartite; start++) {
        if (color[start] != -1) continue;  // already colored

        queue<int> q;
        color[start] = 0;
        q.push(start);

        while (!q.empty() && bipartite) {
            int u = q.front(); q.pop();
            for (int v : adj[u]) {
                if (color[v] == -1) {
                    color[v] = 1 - color[u];  // opposite color
                    q.push(v);
                } else if (color[v] == color[u]) {
                    bipartite = false;   // same color on both ends of edge → not bipartite
                }
            }
        }
    }

    cout << (bipartite ? "BIPARTITE" : "NOT BIPARTITE") << "\n";
    return 0;
}
// Time: O(V + E),  Space: O(V + E)

Step-by-step trace (Sample 2 — triangle 1-2-3):

Start at 1: color[1]=0. Queue: [1]
Process 1: neighbors 2, 3
  → color[2]=1, color[3]=1. Queue: [2, 3]
Process 2: neighbors 1, 3
  → 1: color[1]=0 ≠ color[2]=1 ✓
  → 3: color[3]=1 == color[2]=1 → NOT BIPARTITE ✗

Problem 5.2.4 — Multi-Source BFS: Nearest Fire 🟡 Medium

Problem: Given an N×M grid with fire cells F, passable empty cells ., and walls #, for each empty cell print the minimum distance to the nearest fire cell. Walls are impassable. If an empty cell cannot reach any fire cell, print −1 for that cell.

Input format:

N M
row₁
...

Sample Input:

3 4
.F..
.#.F
....

Sample Output:

1 0 1 1
2 # 1 0
3 2 2 1

(distances shown; # for walls)

Constraints: 1 ≤ N, M ≤ 1000, at least one F cell exists.

💡 Hint

Multi-source BFS: push all F cells into the queue at distance 0 before starting BFS. Then BFS naturally assigns each cell its minimum distance to any fire source.

✅ Full Solution

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int R, C;
    cin >> R >> C;
    vector<string> grid(R);
    for (auto& row : grid) cin >> row;

    vector<vector<int>> dist(R, vector<int>(C, -1));
    queue<pair<int,int>> q;

    // Key: push ALL fire sources at distance 0
    for (int r = 0; r < R; r++)
        for (int c = 0; c < C; c++)
            if (grid[r][c] == 'F') {
                dist[r][c] = 0;
                q.push({r, c});
            }

    int dr[] = {-1, 1, 0, 0};
    int dc[] = {0, 0, -1, 1};

    while (!q.empty()) {
        auto [r, c] = q.front(); q.pop();
        for (int d = 0; d < 4; d++) {
            int nr = r + dr[d], nc = c + dc[d];
            if (nr >= 0 && nr < R && nc >= 0 && nc < C
                && grid[nr][nc] != '#' && dist[nr][nc] == -1) {
                dist[nr][nc] = dist[r][c] + 1;
                q.push({nr, nc});
            }
        }
    }

    for (int r = 0; r < R; r++) {
        for (int c = 0; c < C; c++) {
            if (grid[r][c] == '#') cout << "# ";
            else cout << dist[r][c] << " ";
        }
        cout << "\n";
    }
    return 0;
}
// Time: O(N×M),  Space: O(N×M)

Problem 5.2.5 — USACO 2016 February Bronze: Milk Pails 🔴 Hard

Problem: You have two empty buckets with capacities X and Y. Available operations: fill either bucket to full, empty either bucket, pour one bucket into the other (until one is empty or the other is full). Find the minimum number of operations to get exactly M gallons in either bucket.

Input format:

X Y M

Sample Input 1:

3 5 4

Sample Output 1:

(Fill 5, pour into 3 → (3,2), empty 3 → (0,2), pour 2 into 3 → (2,0), fill 5 → (2,5), pour until 3 full → (3,4). Bucket 2 has 4 gallons. 6 operations.)

Sample Input 2:

2 3 1

Sample Output 2:

Constraints: 1 ≤ X, Y ≤ 100, 0 ≤ M ≤ max(X, Y)

💡 Hint

Model as a BFS on states: each state is a pair (a, b) where a ∈ [0,X] and b ∈ [0,Y] are the current amounts. Apply 6 operations to generate neighbor states. BFS finds the minimum operations from (0,0) to any state where a==M or b==M.

✅ Full Solution

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int X, Y, M;
    cin >> X >> Y >> M;

    // dist[a][b] = min operations to reach state (a, b); -1 if not yet visited
    vector<vector<int>> dist(X + 1, vector<int>(Y + 1, -1));
    queue<pair<int,int>> q;

    dist[0][0] = 0;
    q.push({0, 0});

    while (!q.empty()) {
        auto [a, b] = q.front(); q.pop();

        // Generate all 6 possible operations
        vector<pair<int,int>> next = {
            {X, b},             // fill bucket A
            {a, Y},             // fill bucket B
            {0, b},             // empty bucket A
            {a, 0},             // empty bucket B
            {max(0, a+b-Y), min(Y, a+b)},  // pour A into B
            {min(X, a+b), max(0, a+b-X)}   // pour B into A
        };

        for (auto [na, nb] : next) {
            if (dist[na][nb] == -1) {
                dist[na][nb] = dist[a][b] + 1;
                q.push({na, nb});
            }
        }
    }

    // Find minimum operations where a==M or b==M
    int ans = INT_MAX;
    for (int a = 0; a <= X; a++)
        if (dist[a][M] != -1) ans = min(ans, dist[a][M]);
    for (int b = 0; b <= Y; b++)
        if (dist[M][b] != -1) ans = min(ans, dist[M][b]);

    cout << (ans == INT_MAX ? -1 : ans) << "\n";
    return 0;
}
// Time: O(X×Y × 6) = O(X×Y),  Space: O(X×Y)
// Total states: (X+1)×(Y+1) ≤ 101×101 ≈ 10^4 — very fast

State graph insight: The key insight is that (a, b) can be modeled as a node in a graph where each operation creates an edge to a new state. BFS on this graph finds the minimum operations (= minimum edges) from (0,0) to any goal state.

🏆 Challenge Problem: USACO 2015 December Bronze: Switching on the Lights

Problem: You have an N×N grid of rooms. Each room has a light (initially off) and a light switch connected to some other room's light. You start in room (1,1) (which is lit). You can enter any lit room and flip its switch (which toggles the target room's light). Rooms are passable only when lit. Find all rooms that can ever be lit.

Input format:

N
N lines of N characters: each room's switch target as "(row,col)"

Constraints: 1 ≤ N ≤ 100

💡 Hint

Multi-source BFS with a twist: use BFS to track which rooms are reachable (reachable = connected to a lit room). Each time a new room is lit, add it to the BFS queue (it may now be reachable). Repeat until no more rooms can be lit. This is essentially a BFS where the graph edges are revealed dynamically as rooms get lit.

5.2.8 Multi-Source BFS — In Depth

Multi-source BFS starts from multiple source nodes simultaneously. The key: push all sources into the queue at distance 0 before starting BFS.

Why does this work? BFS processes nodes level by level. If multiple nodes start at "level 0," BFS naturally propagates from all of them in parallel — exactly as if you had a virtual super-source connected to all real sources at cost 0.

Level 0:    [S₁][S₂][S₃]    ← all fire sources / all starting nodes
Level 1:   neighbors of S₁, S₂, S₃
Level 2:   their neighbors not yet visited
...

Complete Example: Spreading Fire

Problem: Given an N×M grid with fire cells ('F'), water cells ('.'), and walls ('#'), compute the minimum distance from each '.' cell to the nearest fire cell.

// Solution: Multi-Source BFS — O(N×M)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int R, C;
    cin >> R >> C;
    vector<string> grid(R);
    for (auto& row : grid) cin >> row;

    vector<vector<int>> dist(R, vector<int>(C, -1));
    queue<pair<int,int>> q;

    // ← KEY: Push ALL fire sources at distance 0 before starting BFS
    for (int r = 0; r < R; r++) {
        for (int c = 0; c < C; c++) {
            if (grid[r][c] == 'F') {
                dist[r][c] = 0;
                q.push({r, c});
            }
        }
    }

    int dr[] = {-1, 1, 0, 0};
    int dc[] = {0, 0, -1, 1};

    while (!q.empty()) {
        auto [r, c] = q.front(); q.pop();
        for (int d = 0; d < 4; d++) {
            int nr = r + dr[d], nc = c + dc[d];
            if (nr >= 0 && nr < R && nc >= 0 && nc < C
                && grid[nr][nc] != '#' && dist[nr][nc] == -1) {
                dist[nr][nc] = dist[r][c] + 1;
                q.push({nr, nc});
            }
        }
    }

    // Print distance grid
    for (int r = 0; r < R; r++) {
        for (int c = 0; c < C; c++) {
            if (dist[r][c] == -1) cout << " # ";
            else cout << " " << dist[r][c] << " ";
        }
        cout << "\n";
    }

    return 0;
}

BFS Level Visualization:

Level 0:    [F₁][F₂]          ← all fire sources enter queue together
Level 1:   [ 1 ][ 1 ][ 1 ]    ← cells adjacent to any fire source
Level 2:  [ 2 ][ 2 ][ 2 ][ 2 ]
Level 3: [ 3 ][ 3 ][ 3 ][ 3 ][ 3 ]

Multi-Source BFS — How Propagation Spreads from Multiple Sources:

Multi-Source BFS Levels

💡 Key principle: All fire sources enter the queue together at distance 0. BFS naturally propagates outward from all of them in parallel. Each empty cell is assigned the distance to its nearest fire source — a direct consequence of BFS's level-order property.

Each cell gets the minimum distance to any fire source — guaranteed by BFS's level-order property.

USACO Application: "Icy Perimeter" Style

Multi-source BFS is useful when you need:

"Distance from each cell to nearest [thing]"
"Spreading from multiple starting points" (fire, infection, flood)
"Simultaneous evacuation from multiple exits"

5.2.9 Cycle Detection with DFS — White/Gray/Black Coloring

Detecting cycles is a fundamental graph problem. The approach differs for directed vs. undirected graphs.

Directed Graph Cycle Detection: 3-Color DFS

For directed graphs, we use a 3-color scheme to track each node's state during DFS:

White (0): Not yet visited — DFS hasn't reached this node
Gray (1): Currently in the DFS call stack — we started processing this node but haven't finished (some descendants are still being explored)
Black (2): Fully processed — all descendants have been explored and we've returned from this node

The key insight: A cycle exists if and only if DFS encounters a back edge — an edge from the current node to a gray node (an ancestor still being processed). Why? Because a gray node is on the current DFS path, so an edge back to it creates a cycle.

Edge types during DFS:
  → White node: Tree edge (normal DFS exploration)
  → Gray node:  BACK EDGE → CYCLE DETECTED!
  → Black node: Cross/forward edge (safe, no cycle)

Why 2 Colors Aren't Enough for Directed Graphs

Consider this directed graph:

1 → 2 → 3
1 → 3

With only 2 colors (visited/unvisited), when DFS from 1 visits 2→3, then backtracks and sees edge 1→3, node 3 is already "visited." But there's NO cycle! The 3-color scheme distinguishes: when we check edge 1→3, node 3 is black (fully processed), not gray — so it's safe.

Complete Code

// Solution: Cycle Detection in Directed Graph — O(V+E)
#include <bits/stdc++.h>
using namespace std;

int n;
vector<int> adj[100001];
vector<int> color;   // 0=white, 1=gray, 2=black
bool hasCycle = false;

void dfs(int u) {
    color[u] = 1;  // mark as "in progress" (gray)

    for (int v : adj[u]) {
        if (color[v] == 0) {
            dfs(v);              // unvisited: recurse
        } else if (color[v] == 1) {
            hasCycle = true;     // ← back edge: v is an ancestor of u → cycle!
        }
        // color[v] == 2: already fully processed, safe to skip
    }

    color[u] = 2;  // mark as "done" (black)
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int m;
    cin >> n >> m;
    color.assign(n + 1, 0);

    for (int i = 0; i < m; i++) {
        int u, v; cin >> u >> v;
        adj[u].push_back(v);  // directed edge u → v
    }

    for (int u = 1; u <= n; u++) {
        if (color[u] == 0) dfs(u);
    }

    cout << (hasCycle ? "HAS CYCLE" : "NO CYCLE") << "\n";
    return 0;
}

Step-by-Step Trace: Cycle Detected

Given directed graph: 1→2, 2→3, 3→1 (a cycle!)

Initial: color = [_, W, W, W]   (W=white, G=gray, B=black)

dfs(1): color[1] = G
  color = [_, G, W, W]
  Neighbor 2: white → dfs(2)
    dfs(2): color[2] = G
      color = [_, G, G, W]
      Neighbor 3: white → dfs(3)
        dfs(3): color[3] = G
          color = [_, G, G, G]
          Neighbor 1: color[1] == G (gray!) → BACK EDGE → hasCycle = true! 🔴
        color[3] = B
      color = [_, G, G, B]
    color[2] = B
  color = [_, G, B, B]
color[1] = B

Result: HAS CYCLE ✓

Step-by-Step Trace: No Cycle

Given directed graph: 1→2, 1→3, 2→3 (a DAG, no cycle)

Initial: color = [_, W, W, W]

dfs(1): color[1] = G
  color = [_, G, W, W]
  Neighbor 2: white → dfs(2)
    dfs(2): color[2] = G
      color = [_, G, G, W]
      Neighbor 3: white → dfs(3)
        dfs(3): color[3] = G
          color = [_, G, G, G]
          No neighbors → done
        color[3] = B
      color = [_, G, G, B]
    color[2] = B
  color = [_, G, B, B]
  Neighbor 3: color[3] == B (black) → safe, skip ✅
color[1] = B

Result: NO CYCLE ✓

Notice: edge 1→3 points to a black node (fully processed), not gray — so it's NOT a back edge and doesn't indicate a cycle.

Undirected Graph Cycle Detection (Simpler!)

For undirected graphs, you don't need 3 colors. The rule is simpler: during DFS, if you encounter a visited node that is not the parent of the current node, there's a cycle.

// Cycle detection in undirected graph — O(V+E)
vector<int> adj[100001];
bool visited[100001];
bool hasCycle = false;

void dfs(int u, int parent) {
    visited[u] = true;
    for (int v : adj[u]) {
        if (!visited[v]) {
            dfs(v, u);                    // v's parent is u
        } else if (v != parent) {
            hasCycle = true;              // visited AND not parent → cycle!
        }
    }
}

// Call: dfs(1, -1);  // start node 1, no parent (use -1 as sentinel)

Why check v != parent? In an undirected graph, edge u–v appears in both adj[u] and adj[v]. When DFS goes from u to v, then looks at v's neighbors, it sees u again. But that's just the edge we came from — not a cycle. We only report a cycle if v is visited AND it's not the node we just came from.

Trace example — undirected graph with cycle: edges 1-2, 2-3, 3-1

dfs(1, -1): visited[1]=true
  Neighbor 2: not visited → dfs(2, 1)
    dfs(2, 1): visited[2]=true
      Neighbor 1: visited, but 1 == parent → skip (just the edge we came from)
      Neighbor 3: not visited → dfs(3, 2)
        dfs(3, 2): visited[3]=true
          Neighbor 2: visited, but 2 == parent → skip
          Neighbor 1: visited AND 1 ≠ parent(2) → CYCLE! 🔴

⚠️ Common pitfall with multi-edges: If there are multiple edges between u and v, the simple v != parent check can fail. For multi-edge graphs, track the edge index instead of the parent node to avoid false negatives.

5.2.10 Topological Sort with DFS

Topological sort orders the nodes of a directed acyclic graph (DAG) such that for every edge u → v, u comes before v in the ordering. Think of it as scheduling tasks with dependencies: if task A must be done before task B (edge A→B), then A appears earlier in the sorted order.

When Is Topological Sort Possible?

Topological sort exists if and only if the graph is a DAG (Directed Acyclic Graph). If there's a cycle (A→B→C→A), no valid ordering exists — you can't put A before B, B before C, and C before A simultaneously.

Real-World Analogy

Consider course prerequisites:

Algebra → Calculus → Differential Equations
Algebra → Linear Algebra
Calculus → Physics

A valid topological order: Algebra, Calculus, Linear Algebra, Physics, Differential Equations. Another valid order: Algebra, Linear Algebra, Calculus, Physics, Differential Equations.

💡 Topological sort is NOT unique — there can be multiple valid orderings. Any ordering that respects all edge directions is valid.

Method 1: DFS-Based Topological Sort

DFS approach: When a node finishes (all descendants processed), add it to the result list. This gives reverse topological order — so reverse the list at the end.

Why does this work? In a DAG, if there's an edge u→v, DFS will finish v before u (because v is a descendant of u). So v appears earlier in the finish order. Reversing gives u before v — exactly what we want.

DFS finish order (post-order):  E, D, C, B, A
Topological order (reverse):    A, B, C, D, E

// Solution: Topological Sort via DFS — O(V+E)
#include <bits/stdc++.h>
using namespace std;

vector<int> adj[100001];
vector<bool> visited;
vector<int> topoOrder;

void dfs(int u) {
    visited[u] = true;
    for (int v : adj[u]) {
        if (!visited[v]) dfs(v);
    }
    topoOrder.push_back(u);  // ← add AFTER all children processed (post-order)
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;
    visited.assign(n + 1, false);

    for (int i = 0; i < m; i++) {
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
    }

    for (int u = 1; u <= n; u++) {
        if (!visited[u]) dfs(u);
    }

    // Reverse post-order = topological order
    reverse(topoOrder.begin(), topoOrder.end());

    for (int u : topoOrder) cout << u << " ";
    cout << "\n";

    return 0;
}

DFS Topological Sort — Step-by-Step Trace

Given DAG: 1→2, 1→3, 2→4, 3→4, 4→5

Graph:
  1 → 2 → 4 → 5
  1 → 3 → 4

Adjacency list:
  adj[1] = {2, 3}
  adj[2] = {4}
  adj[3] = {4}
  adj[4] = {5}
  adj[5] = {}

DFS trace:

dfs(1): visited[1]=true
  → dfs(2): visited[2]=true
    → dfs(4): visited[4]=true
      → dfs(5): visited[5]=true
        No unvisited neighbors
        topoOrder.push_back(5)  ← 5 finishes first
      topoOrder.push_back(4)    ← 4 finishes
    topoOrder.push_back(2)      ← 2 finishes
  → dfs(3): visited[3]=true
    → 4 already visited, skip
    topoOrder.push_back(3)      ← 3 finishes
  topoOrder.push_back(1)        ← 1 finishes last

Post-order (finish order): [5, 4, 2, 3, 1]
Reversed = topological order:  [1, 3, 2, 4, 5]

Verify: 1→2 ✓ (1 before 2), 1→3 ✓ (1 before 3),
        2→4 ✓ (2 before 4), 3→4 ✓ (3 before 4), 4→5 ✓ (4 before 5)

Method 2: Kahn's Algorithm (BFS-based Topological Sort)

Kahn's algorithm uses a different approach: repeatedly remove nodes with in-degree 0 (no incoming edges). These nodes have no prerequisites and can be processed first.

Algorithm:

Compute in-degree for every node
Push all nodes with in-degree 0 into a queue
While queue is not empty:
- Pop a node u, add it to the result
- For each neighbor v of u: decrement in-degree of v. If in-degree becomes 0, push v into queue
If result size < N, there's a cycle (some nodes could never reach in-degree 0)

// Kahn's Algorithm: Process nodes with in-degree 0 first — O(V+E)
vector<int> inDeg(n + 1, 0);
for (int u = 1; u <= n; u++)
    for (int v : adj[u])
        inDeg[v]++;

queue<int> q;
for (int u = 1; u <= n; u++)
    if (inDeg[u] == 0) q.push(u);  // start with nodes having no prerequisites

vector<int> order;
while (!q.empty()) {
    int u = q.front(); q.pop();
    order.push_back(u);
    for (int v : adj[u]) {
        inDeg[v]--;
        if (inDeg[v] == 0) q.push(v);
    }
}

// If order.size() != n, there's a cycle (not a DAG)
if ((int)order.size() != n) cout << "CYCLE DETECTED\n";
else for (int u : order) cout << u << " ";

Kahn's Algorithm — Step-by-Step Trace

Same DAG: 1→2, 1→3, 2→4, 3→4, 4→5

Step 0 — Compute in-degrees:
  Node:    1  2  3  4  5
  In-deg:  0  1  1  2  1
  
  In-degree 0: node 1 → push to queue
  Queue: [1]

Step 1 — Process node 1:
  Remove 1 from queue. Order: [1]
  Edge 1→2: inDeg[2] = 1-1 = 0 → push 2
  Edge 1→3: inDeg[3] = 1-1 = 0 → push 3
  Queue: [2, 3]

Step 2 — Process node 2:
  Remove 2 from queue. Order: [1, 2]
  Edge 2→4: inDeg[4] = 2-1 = 1 (not 0 yet, don't push)
  Queue: [3]

Step 3 — Process node 3:
  Remove 3 from queue. Order: [1, 2, 3]
  Edge 3→4: inDeg[4] = 1-1 = 0 → push 4
  Queue: [4]

Step 4 — Process node 4:
  Remove 4 from queue. Order: [1, 2, 3, 4]
  Edge 4→5: inDeg[5] = 1-1 = 0 → push 5
  Queue: [5]

Step 5 — Process node 5:
  Remove 5 from queue. Order: [1, 2, 3, 4, 5]
  No outgoing edges.
  Queue: empty

Result: [1, 2, 3, 4, 5]  — valid topological order ✓
order.size() == 5 == n → no cycle ✓

Kahn's Algorithm — How In-Degrees Change Step by Step:

Kahn's Algorithm In-Degree Steps

💡 Cycle detection bonus: If order.size() < n at the end, some nodes never reached in-degree 0 — they are part of a cycle. This makes Kahn's algorithm superior to DFS topological sort for cycle detection.

DFS vs. Kahn's: Which to Choose?

Aspect	DFS Topological Sort	Kahn's Algorithm (BFS)
Code length	Shorter (just reverse postorder)	Slightly longer (in-degree computation)
Cycle detection	Needs separate 3-color DFS	Built-in: `order.size() < n` means cycle
Intuition	Less intuitive (why does reverse postorder work?)	More intuitive (remove nodes with no prerequisites)
Lexicographically smallest order	Hard to achieve	Easy: use `priority_queue` instead of `queue`
Memory	O(V) recursion stack	O(V) queue + in-degree array

💡 Contest tip: If the problem asks for "lexicographically smallest topological order," use Kahn's with a priority_queue (min-heap). DFS cannot easily produce this.

💡 Key Application: Topological sort is essential for DP on DAGs. If the dependency graph is a DAG, process nodes in topological order — each node's DP state depends only on previously-processed nodes.

Example DAG and BFS levels visualization: BFS DAG Levels

Visual: Grid BFS Distances from Source

BFS Grid Distances

The diagram shows a 5×5 grid BFS where each cell displays its minimum distance from the source (0,0). Walls are shown in dark gray. Note how the BFS "flood fills" outward in concentric rings, never revisiting a cell — guaranteeing minimum distances.

mark

C++ for Competitive Programming: A USACO Guide