📖 Chapter 3.2 ⏱️ ~55 min read 🎯 Intermediate

Chapter 3.2: Arrays & Prefix Sums

📝 Before You Continue: Make sure you're comfortable with arrays, vectors, and basic loops (Chapters 2.2–2.3). You'll also want to understand long long overflow (Chapter 2.1).

Imagine you have an array of N numbers, and someone asks you 100,000 times: "What is the sum of elements from index L to index R?" A naive approach recomputes the sum from scratch each time — that's O(N) per query, or O(N × Q) total. With N = Q = 10^5, that's 10^10 operations. Way too slow.

Prefix sums solve this in O(N) preprocessing and O(1) per query. This is one of the most elegant and useful techniques in all of competitive programming.

💡 Key Insight: Prefix sums transform a "range query" problem into a subtraction. Instead of summing L to R every time, you precompute cumulative sums and subtract two of them. This trades O(Q) repeated work for one-time O(N) preprocessing.


3.2.1 The Prefix Sum Idea

The prefix sum of an array is a new array where each element stores the cumulative sum up to that index.

Visual: Prefix Sum Array

Prefix Sum Visualization

The diagram above shows how the prefix sum array is constructed from the original array, and how a range query sum(L, R) = P[R] - P[L-1] is computed in O(1) time. The blue cells highlight a query range while the red and green cells show the two prefix values being subtracted.

Given array: A = [3, 1, 4, 1, 5, 9, 2, 6] (1-indexed for clarity)

Index:  1  2  3  4  5  6  7  8
A:      3  1  4  1  5  9  2  6
P:      3  4  8  9  14 23 25 31

Where P[i] = A[1] + A[2] + ... + A[i].

Why 1-Indexing?

Using 1-indexed arrays lets us define P[0] = 0 (the "empty prefix" sums to zero). This makes the query formula P[R] - P[L-1] work even when L = 1 — we'd compute P[R] - P[0] = P[R], which is correct.

Building the Prefix Sum Array

// Solution: Build Prefix Sum Array — O(N)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin >> n;

    // Step 1: Read input (1-indexed)
    vector<int> A(n + 1);
    for (int i = 1; i <= n; i++) cin >> A[i];

    // Step 2: Build prefix sums
    vector<long long> P(n + 1, 0);  // P[0] = 0 (base case)
    for (int i = 1; i <= n; i++) {
        P[i] = P[i - 1] + A[i];   // ← KEY LINE: each P[i] = all elements up to i
    }

    return 0;
}

Complexity Analysis:

  • Time: O(N) — one pass through the array
  • Space: O(N) — stores the prefix array

Step-by-step trace for A = [3, 1, 4, 1, 5]:

i=1: P[1] = P[0] + A[1] = 0 + 3 = 3
i=2: P[2] = P[1] + A[2] = 3 + 1 = 4
i=3: P[3] = P[2] + A[3] = 4 + 4 = 8
i=4: P[4] = P[3] + A[4] = 8 + 1 = 9
i=5: P[5] = P[4] + A[5] = 9 + 5 = 14

3.2.2 Range Sum Queries in O(1)

Once you have the prefix sum array, the sum from index L to R is:

sum(L, R) = P[R] - P[L-1]

Why? P[R] = sum of elements 1..R. P[L-1] = sum of elements 1..(L-1). Their difference = sum of elements L..R.

💡 Key Insight: Think of P[i] as "the total sum of the first i elements." To get the sum of a window [L, R], you subtract the "prefix before L" from the "prefix through R." It's like: big triangle minus smaller triangle = trapezoid.

// Solution: Range Sum Queries — Preprocessing O(N), Each Query O(1)
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
long long A[MAXN];
long long P[MAXN];

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, q;
    cin >> n >> q;

    // Step 1: Read array
    for (int i = 1; i <= n; i++) cin >> A[i];

    // Step 2: Build prefix sum — O(n)
    P[0] = 0;
    for (int i = 1; i <= n; i++) {
        P[i] = P[i - 1] + A[i];
    }

    // Step 3: Answer q range sum queries — O(1) each
    for (int i = 0; i < q; i++) {
        int l, r;
        cin >> l >> r;
        cout << P[r] - P[l - 1] << "\n";  // ← KEY LINE: range sum formula
    }

    return 0;
}

Sample Input:

8 3
3 1 4 1 5 9 2 6
1 4
3 7
2 6

Sample Output:

9
21
20

Verification:

  • sum(1,4) = P[4] - P[0] = 9 - 0 = 9 → A[1]+A[2]+A[3]+A[4] = 3+1+4+1 = 9 ✓
  • sum(3,7) = P[7] - P[2] = 25 - 4 = 21 → A[3]+...+A[7] = 4+1+5+9+2 = 21 ✓
  • sum(2,6) = P[6] - P[1] = 23 - 3 = 20 → A[2]+...+A[6] = 1+4+1+5+9 = 20 ✓

⚠️ Common Mistake: Writing P[R] - P[L] instead of P[R] - P[L-1]. The formula includes both endpoints L and R — you want to subtract the sum before L, not the sum at L.

Total Complexity: O(N + Q) — perfect for N, Q up to 10^5.


3.2.3 USACO Example: Breed Counting

This is a classic USACO Bronze problem (2015 December).

Problem: N cows in a line. Each cow is breed 1, 2, or 3. Answer Q queries: how many cows of breed B are in positions L to R?

Solution: Maintain one prefix sum array per breed.

// Solution: Multi-Breed Prefix Sums — O(N + Q)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, q;
    cin >> n >> q;

    vector<int> breed(n + 1);
    vector<vector<long long>> P(4, vector<long long>(n + 1, 0));
    // P[b][i] = number of cows of breed b in positions 1..i

    // Step 1: Build prefix sums for each breed
    for (int i = 1; i <= n; i++) {
        cin >> breed[i];
        for (int b = 1; b <= 3; b++) {
            P[b][i] = P[b][i - 1] + (breed[i] == b ? 1 : 0);  // ← KEY LINE
        }
    }

    // Step 2: Answer each query in O(1)
    for (int i = 0; i < q; i++) {
        int l, r, b;
        cin >> l >> r >> b;
        cout << P[b][r] - P[b][l - 1] << "\n";
    }

    return 0;
}

🏆 USACO Tip: Many USACO Bronze problems involve "count elements satisfying property X in a range." If Q is large, always consider prefix sums.


3.2.4 USACO-Style Problem Walkthrough: Farmer John's Grass Fields

🔗 Related Problem: This is a fictional USACO-style problem inspired by "Breed Counting" and "Tallest Cow" — both classic Bronze problems.

Problem Statement: Farmer John has N fields in a row. Field i has grass[i] units of grass. He needs to answer Q queries: "What is the total grass in fields L through R (inclusive)?" With N, Q up to 10^5, he needs each query answered in O(1).

Sample Input:

6 4
4 2 7 1 8 3
1 3
2 5
4 6
1 6

Sample Output:

13
18
12
25

Step-by-Step Solution:

Step 1: Understand the problem. We have an array [4, 2, 7, 1, 8, 3] and need range sums.

Step 2: Build the prefix sum array.

Index:  0  1  2  3  4  5  6
grass:  -  4  2  7  1  8  3
P:      0  4  6  13 14 22 25

Step 3: Answer queries using P[R] - P[L-1]:

  • Query (1,3): P[3] - P[0] = 13 - 0 = 13
  • Query (2,5): P[5] - P[1] = 22 - 4 = 18
  • Query (4,6): P[6] - P[3] = 25 - 13 = 12
  • Query (1,6): P[6] - P[0] = 25 - 0 = 25

Complete C++ Solution:

// Farmer John's Grass Fields — Prefix Sum Solution O(N + Q)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, q;
    cin >> n >> q;

    // Step 1: Read grass values and build prefix sum simultaneously
    vector<long long> P(n + 1, 0);
    for (int i = 1; i <= n; i++) {
        long long g;
        cin >> g;
        P[i] = P[i - 1] + g;   // ← KEY LINE: incremental prefix sum
    }

    // Step 2: Answer each query in O(1)
    while (q--) {
        int l, r;
        cin >> l >> r;
        cout << P[r] - P[l - 1] << "\n";
    }

    return 0;
}

Why is this O(N + Q)?

  • Building prefix sums: one loop, N iterations → O(N)
  • Each query: one subtraction → O(1) per query, O(Q) total
  • Total: O(N + Q) — much better than the O(NQ) brute force

⚠️ Common Mistake: Using int instead of long long for the prefix sum. If grass values are up to 10^9 and N = 10^5, the total could be up to 10^14 — way beyond int's range of ~2×10^9.


3.2.5 Difference Arrays

The difference array is the inverse of prefix sums. It's useful when you need to add a value to a range of positions, then query final values.

Problem: Start with all zeros. Apply M updates: "add V to all positions from L to R." Then print the final array.

Naively, each update is O(R-L+1). With a difference array, each update is O(1), and reconstruction is O(N).

💡 Key Insight: Instead of adding V to every position in [L, R] (slow), we record "+V at position L" and "-V at position R+1" (fast). When we later do a prefix sum of these markers, the +V and -V "cancel out" outside [L,R], so the net effect is exactly adding V to [L,R].

// Solution: Difference Array for Range Updates — O(N + M)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;

    vector<long long> diff(n + 2, 0);  // difference array (extra space for R+1 case)

    // Step 1: Process all range updates in O(1) each
    for (int i = 0; i < m; i++) {
        int l, r, v;
        cin >> l >> r >> v;
        diff[l] += v;      // ← KEY LINE: mark start of range
        diff[r + 1] -= v;  // ← KEY LINE: mark end+1 to undo the addition
    }

    // Step 2: Reconstruct the final array by taking prefix sums of diff
    long long running = 0;
    for (int i = 1; i <= n; i++) {
        running += diff[i];
        cout << running;
        if (i < n) cout << " ";
    }
    cout << "\n";

    return 0;
}

Sample Input:

5 3
1 3 2
2 5 3
3 4 -1

Step-by-step trace:

📝 索引说明: 以下追踪中 diff 数组使用 1-indexed(即 diff[1]..diff[n+1]),与代码 vector<long long> diff(n + 2, 0) 一致。方括号内的数字表示 diff[1], diff[2], ..., diff[6](n=5 时,数组共 7 个位置,有效使用 diff[1..6])。

初始状态:           diff[1..6] = [0,  0,  0,  0,  0,  0]

After update(1,3,+2): diff[1]+=2, diff[4]-=2
                      diff[1..6] = [2,  0,  0, -2,  0,  0]

After update(2,5,+3): diff[2]+=3, diff[6]-=3
                      diff[1..6] = [2,  3,  0, -2,  0, -3]

After update(3,4,-1): diff[3]+=-1 即 diff[3]-=1, diff[5]-=(-1) 即 diff[5]+=1
                      diff[1..6] = [2,  3, -1, -2,  1, -3]

Prefix sum reconstruction: i=1: running = 0+2 = 2 → result[1] = 2 i=2: running = 2+3 = 5 → result[2] = 5 i=3: running = 5-1 = 4 → result[3] = 4 i=4: running = 4-2 = 2 → result[4] = 2 i=5: running = 2+1 = 3 → result[5] = 3


**Sample Output:**

2 5 4 2 3


**Complexity Analysis:**
- **Time:** `O(N + M)` — `O(1)` per update, `O(N)` reconstruction
- **Space:** `O(N)` — just the difference array

> ⚠️ **Common Mistake:** Declaring `diff` with size N+1 instead of N+2. When R=N, you write to `diff[R+1] = diff[N+1]`, which needs to exist!

---

## 3.2.6 2D Prefix Sums

For 2D grids, you can extend prefix sums to answer rectangular range queries in `O(1)`.

Given an R×C grid, define `P[r][c]` = sum of all elements in the rectangle from (1,1) to (r,c).

### Building the 2D Prefix Sum

P[r][c] = A[r][c] + P[r-1][c] + P[r][c-1] - P[r-1][c-1]


The subtraction removes the overlap (otherwise the top-left rectangle is counted twice).

> 💡 **Key Insight (Inclusion-Exclusion):** Visualize the four rectangles:
> - `P[r-1][c]` = the "top" rectangle
> - `P[r][c-1]` = the "left" rectangle
> - `P[r-1][c-1]` = the "top-left corner" (counted in BOTH above — so subtract once)
> - `A[r][c]` = the single new cell

### Step-by-Step 2D Prefix Sum Worked Example

Let's trace through a 4×4 grid:

**Original Grid A:**
 c=1  c=2  c=3  c=4

r=1: 1 2 3 4 r=2: 5 6 7 8 r=3: 9 10 11 12 r=4: 13 14 15 16


**Building P step by step (left-to-right, top-to-bottom):**

P[1][1] = A[1][1] = 1

P[1][2] = A[1][2] + P[0][2] + P[1][1] - P[0][1] = 2 + 0 + 1 - 0 = 3 P[1][3] = A[1][3] + P[0][3] + P[1][2] - P[0][2] = 3 + 0 + 3 - 0 = 6 P[1][4] = 4 + 0 + 6 - 0 = 10

P[2][1] = A[2][1] + P[1][1] + P[2][0] - P[1][0] = 5 + 1 + 0 - 0 = 6 P[2][2] = A[2][2] + P[1][2] + P[2][1] - P[1][1] = 6 + 3 + 6 - 1 = 14 P[2][3] = 7 + 6 + 14 - 3 = 24 P[2][4] = 8 + 10 + 24 - 6 = 36

P[3][1] = 9 + 6 + 0 - 0 = 15 P[3][2] = 10 + 14 + 15 - 6 = 33 P[3][3] = 11 + 24 + 33 - 14 = 54 P[3][4] = 12 + 36 + 54 - 24 = 78

P[4][1] = 13 + 15 + 0 - 0 = 28 P[4][2] = 14 + 33 + 28 - 15 = 60 P[4][3] = 15 + 54 + 60 - 33 = 96 P[4][4] = 16 + 78 + 96 - 54 = 136


**Resulting prefix sum grid P:**
 c=1  c=2  c=3  c=4

r=1: 1 3 6 10 r=2: 6 14 24 36 r=3: 15 33 54 78 r=4: 28 60 96 136


**Query: Sum of subgrid (r1=2, c1=2) to (r2=3, c2=3):**

ans = P[3][3] - P[1][3] - P[3][1] + P[1][1] = 54 - 6 - 15 + 1 = 34

Verify: A[2][2]+A[2][3]+A[3][2]+A[3][3] = 6+7+10+11 = 34 ✓


**Visualization of the inclusion-exclusion:**

![2D Prefix Sum Inclusion-Exclusion](../images/prefix_sum_2d_inclusion_exclusion.svg)

```cpp
// Solution: 2D Prefix Sums — Build O(R×C), Query O(1)
#include <bits/stdc++.h>
using namespace std;

const int MAXR = 1001, MAXC = 1001;
int A[MAXR][MAXC];
long long P[MAXR][MAXC];

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int R, C;
    cin >> R >> C;

    for (int r = 1; r <= R; r++)
        for (int c = 1; c <= C; c++)
            cin >> A[r][c];

    // Step 1: Build 2D prefix sum — O(R × C)
    for (int r = 1; r <= R; r++) {
        for (int c = 1; c <= C; c++) {
            P[r][c] = A[r][c]
                    + P[r-1][c]    // rectangle above
                    + P[r][c-1]    // rectangle to the left
                    - P[r-1][c-1]; // ← KEY LINE: remove overlap (counted twice)
        }
    }

    // Step 2: Answer each query in O(1)
    int q;
    cin >> q;
    while (q--) {
        int r1, c1, r2, c2;
        cin >> r1 >> c1 >> r2 >> c2;
        long long ans = P[r2][c2]
                      - P[r1-1][c2]    // subtract top strip
                      - P[r2][c1-1]    // subtract left strip
                      + P[r1-1][c1-1]; // add back top-left corner
        cout << ans << "\n";
    }

    return 0;
}

Complexity Analysis:

  • Build time: O(R × C)
  • Query time: O(1) per query
  • Space: O(R × C)

⚠️ Common Mistake: Forgetting to add P[r1-1][c1-1] back in the query formula. The top strip and left strip both include the top-left corner, so it gets subtracted twice — you need to add it back once!


3.2.7 USACO Example: Max Subarray Sum

Problem (variation of Kadane's algorithm): Find the contiguous subarray with the maximum sum.

// Solution: Kadane's Algorithm — O(N) Time, O(1) Space
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin >> n;
    vector<int> A(n);
    for (int &x : A) cin >> x;

    // Kadane's Algorithm: O(n)
    long long maxSum = LLONG_MIN;  // LLONG_MIN = smallest long long
    long long current = 0;

    for (int i = 0; i < n; i++) {
        current += A[i];
        maxSum = max(maxSum, current);
        if (current < 0) current = 0;  // ← KEY LINE: restart if sum goes negative
    }

    cout << maxSum << "\n";

    return 0;
}

💡 Key Insight: Why reset current to 0 when it goes negative? Because a negative prefix sum hurts any future subarray. If the running sum so far is -5, any future subarray starting fresh (sum 0) will always beat continuing from -5.

Alternative with prefix sums: The max subarray sum equals max over all pairs (i,j) of P[j] - P[i-1]. For each j, this is maximized when P[i-1] is minimized. Track the running minimum of prefix sums!

// Alternative: Min Prefix Trick — also O(N)
long long maxSum = LLONG_MIN, minPrefix = 0, prefix = 0;
for (int x : A) {
    prefix += x;
    maxSum = max(maxSum, prefix - minPrefix);  // best sum ending here
    minPrefix = min(minPrefix, prefix);         // track minimum prefix seen so far
    // ⚠️ 注意:minPrefix 的更新必须在 maxSum 之后。
    // 若提前更新 minPrefix,相当于允许空子数组(长度为0,和为0)参与比较,
    // 会导致结果在全负数组时错误地返回 0 而非最大负数。
}

⚠️ Common Mistakes in Chapter 3.2

  1. Off-by-one in range queries: P[R] - P[L] instead of P[R] - P[L-1]. Always verify on a small example.
  2. Overflow: Prefix sums of large values can exceed int range (2×10^9). Use long long for the prefix array even if elements are int.
  3. 2D query formula: Forgetting the +P[r1-1][c1-1] term in the 2D query — a very easy slip.
  4. Difference array size: Declaring diff[n+1] when you need diff[n+2] (because you write to index r+1 which could be n+1).
  5. 1-indexing vs 0-indexing: If you use 0-indexed prefix sums, the query formula changes to P[R+1] - P[L]. Pick one convention and stick to it within a problem.

Chapter Summary

📌 Key Takeaways

TechniqueBuild TimeQuery TimeSpaceUse Case
1D prefix sumO(N)O(1)O(N)Range sum on 1D array
2D prefix sumO(RC)O(1)O(RC)Range sum on 2D grid
Difference arrayO(N+M)O(1)*O(N)Range addition updates
Kadane's algorithmO(N)O(1)Maximum subarray sum

*After O(N) reconstruction pass to read all values.

🧩 Core Formula Quick Reference

OperationFormulaNotes
1D range sumP[R] - P[L-1]P[0] = 0 is the sentinel value
2D rectangle sumP[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1]Inclusion-exclusion: subtract twice, add once
Difference array updatediff[L] += V; diff[R+1] -= V;Array size should be N+2
Restore from differenceTake prefix sum of diffResult is the final array

❓ FAQ

Q1: What is the relationship between prefix sums and difference arrays?

A: They are inverse operations. Taking the prefix sum of an array gives the prefix sum array; taking the difference (adjacent element differences) of the prefix sum array restores the original. Conversely, taking the prefix sum of a difference array also restores the original. This is analogous to integration and differentiation in mathematics.

Q2: When to use prefix sums vs. difference arrays?

A: Rule of thumb — look at the operation type:

  • Multiple range sum queries → prefix sum (preprocess O(N), query O(1))
  • Multiple range add/subtract operations → difference array (update O(1), restore O(N) at the end)
  • If both operations alternate, you need a more advanced data structure (like Segment Tree in Chapter 3.9)

Q3: Can prefix sums handle dynamic modifications? (array elements change)

A: No. Prefix sums are a one-time preprocessing; the array cannot change afterward. If elements are modified, use Fenwick Tree (BIT) or Segment Tree, which support point updates and range queries in O(log N) time.

Q4: Why are there two versions of Kadane's algorithm (current=0 vs minPrefix)?

A: Both are essentially the same, both O(N). The first (classic Kadane) is more intuitive: restart when the current subarray sum goes negative. The second (min-prefix method) uses prefix sum thinking: max subarray = max(P[j] - P[i-1]) = max(P[j]) - min(P[i]). Choose based on personal preference.

Q5: What are the space constraints for 2D prefix sums?

A: If R, C are both up to 10^4, the P array needs 10^8 long long values (about 800MB) — exceeding memory limits. Generally R×C ≤ 10^6~10^7 is safe. For larger grids, consider compression or offline processing.

🔗 Connections to Later Chapters

  • Chapter 3.4 (Two Pointers): sliding window can also do range queries, but only for fixed-size or monotonically moving windows; prefix sums are more general
  • Chapter 3.3 (Sorting & Searching): binary search can combine with prefix sums — e.g., binary search on the prefix sum array for the first position ≥ target
  • Chapter 3.9 (Segment Trees): solves "dynamic update + range query" problems that prefix sums cannot handle
  • Chapters 6.1–6.3 (DP): many state transitions involve range sums; prefix sums are an important tool for optimizing DP
  • The difference array idea ("+V at start, -V after end") recurs in sweep line algorithms, event sorting, and other advanced techniques

Practice Problems

Problem 3.2.1 — Range Sum 🟢 Easy Read N integers and Q queries. Each query gives L and R. Print the sum of elements from index L to R (1-indexed).

Hint Build a prefix sum array P where P[i] = A[1]+...+A[i]. Answer each query as P[R] - P[L-1].

Problem 3.2.2 — Range Add, Point Query 🟢 Easy Start with N zeros. Process M operations: each operation adds V to all positions from L to R. After all operations, print the value at each position. (Use difference array)

Hint Use ``diff[L]`` += V and ``diff[R+1]`` -= V for each update, then take prefix sums of diff.

Problem 3.2.3 — Rectangular Sum 🟡 Medium Read an N×M grid of integers and Q queries. Each query gives (r1,c1,r2,c2). Print the sum of the subgrid.

Hint Build a 2D prefix sum. Query = P[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1].

Problem 3.2.4 — USACO 2016 January Bronze: Mowing the Field 🔴 Hard (Challenge) Farmer John mows grass along a path. Cells visited more than once contribute to "double-mowed" area. Use a 2D array and count cells visited at least twice.

Hint Simulate the path, marking cells in a 2D visited array. Count cells with value ≥ 2 at the end.

Problem 3.2.5 — Maximum Subarray (Negative numbers allowed) 🟡 Medium Read N integers (possibly negative). Find the maximum possible sum of a contiguous subarray. What if all numbers are negative?

Hint Use Kadane's algorithm. If all numbers are negative, the answer is the single largest element (that's why we initialize maxSum to `LLONG_MIN`, not 0).

🏆 Challenge Problem: Cows and Paint Buckets An N×M grid contains paint buckets, each with a positive value. You can select any rectangular subgrid. Your score is the maximum value in your subgrid minus the sum of all border cells of your subgrid. Find the optimal rectangle. (N, M ≤ 500)

Solution approach: 2D prefix sums for sums + careful enumeration of all rectangles.