Chapter 3.10: Fenwick Tree (BIT)
📝 Prerequisites: Understanding of prefix sums (Chapter 3.2) and bitwise operations. This chapter complements Segment Trees (Chapter 3.9) — BIT has shorter code and smaller constants, but supports fewer operations.
The Fenwick Tree (also known as Binary Indexed Tree / BIT) is one of the most commonly used data structures in competitive programming: fewer than 15 lines of code, yet supporting point updates and prefix queries in O(log N) time.
3.10.1 Core Idea: What Is lowbit?
Bitwise Principle of lowbit
For any positive integer x, lowbit(x) = x & (-x) returns the value represented by the lowest set bit in x's binary representation.
x = 6 → binary: 0110
-x = -6 → two's complement: 1010 (bitwise NOT + 1)
x & (-x) = 0010 = 2 ← lowest set bit corresponds to 2^1 = 2
Examples:
| x | Binary | -x (two's complement) | x & (-x) | Meaning |
|---|---|---|---|---|
| 1 | 0001 | 1111 | 0001 = 1 | Manages 1 element |
| 2 | 0010 | 1110 | 0010 = 2 | Manages 2 elements |
| 3 | 0011 | 1101 | 0001 = 1 | Manages 1 element |
| 4 | 0100 | 1100 | 0100 = 4 | Manages 4 elements |
| 6 | 0110 | 1010 | 0010 = 2 | Manages 2 elements |
| 8 | 1000 | 1000 | 1000 = 8 | Manages 8 elements |
BIT Tree Structure Intuition
The elegance of BIT: tree[i] doesn't store a single element, but the sum of an interval, with length exactly lowbit(i).
BIT Structure (n=8): Each tree[i] covers exactly lowbit(i) elements ending at index i.
Jump path for querying prefix(7) (i -= lowbit(i) jumping down):
💡 Jump Pattern: For queries,
i -= lowbit(i)(jump down); for updates,i += lowbit(i)(jump up). Each jump eliminates the lowest set bit, at most log N steps.
Index i: 1 2 3 4 5 6 7 8
Range managed by tree[i]:
tree[1] = A[1] (length lowbit(1)=1)
tree[2] = A[1]+A[2] (length lowbit(2)=2)
tree[3] = A[3] (length lowbit(3)=1)
tree[4] = A[1]+...+A[4] (length lowbit(4)=4)
tree[5] = A[5] (length lowbit(5)=1)
tree[6] = A[5]+A[6] (length lowbit(6)=2)
tree[7] = A[7] (length lowbit(7)=1)
tree[8] = A[1]+...+A[8] (length lowbit(8)=8)
Jump path for updating position 3 (i += lowbit(i) jumping up):
When querying prefix(7), jump down via i -= lowbit(i):
i=7: addtree[7](manages A[7]), then7 - lowbit(7) = 7 - 1 = 6i=6: addtree[6](manages A[5..6]), then6 - lowbit(6) = 6 - 2 = 4i=4: addtree[4](manages A[1..4]), then4 - lowbit(4) = 4 - 4 = 0, stop
3 steps total = O(log 7) ≈ 3 steps.
When updating position 3, jump up via i += lowbit(i):
i=3: updatetree[3], then3 + lowbit(3) = 3 + 1 = 4i=4: updatetree[4], then4 + lowbit(4) = 4 + 4 = 8i=8: updatetree[8], 8 > n, stop
3.10.2 Point Update + Prefix Query — Complete Code
📄 View Code: 3.10.2 Point Update + Prefix Query — Complete Code
// ══════════════════════════════════════════════════════════════
// Fenwick Tree (BIT) — Classic Implementation
// Supports: point update O(log N), prefix sum query O(log N)
// Array MUST use 1-indexed (critical!)
// ══════════════════════════════════════════════════════════════
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 300005;
int n;
long long tree[MAXN]; // BIT array, 1-indexed
// ── lowbit: returns value of lowest set bit ──
inline int lowbit(int x) {
return x & (-x);
}
// ── Update: add val to position i ──
// Traverse upward: i += lowbit(i)
// Every ancestor node covering position i is updated
void update(int i, long long val) {
for (; i <= n; i += lowbit(i))
tree[i] += val;
// Time: O(log N) — at most log2(N) iterations
}
// ── Query: return prefix sum A[1..i] ──
// Traverse downward: i -= lowbit(i)
// Decompose [1..i] into O(log N) non-overlapping intervals
long long query(int i) {
long long sum = 0;
for (; i > 0; i -= lowbit(i))
sum += tree[i];
return sum;
// Time: O(log N) — at most log2(N) iterations
}
// ── Build: initialize BIT from existing array A[1..n] ──
// Method 1: N separate updates — O(N log N)
void build_slow(long long A[]) {
fill(tree + 1, tree + n + 1, 0LL);
for (int i = 1; i <= n; i++)
update(i, A[i]);
}
// Method 2: O(N) build (using "direct parent" relationship)
void build_fast(long long A[]) {
for (int i = 1; i <= n; i++) {
tree[i] += A[i];
int parent = i + lowbit(i); // Direct parent in BIT
if (parent <= n)
tree[parent] += tree[i];
}
}
// Method 3: O(N) build (using prefix sums)
// Principle: tree[i] = sum(A[i-lowbit(i)+1 .. i])
// = prefix[i] - prefix[i - lowbit(i)]
void build_prefix(long long A[], long long prefix[]) {
// Compute prefix sums first
for (int i = 1; i <= n; i++) prefix[i] = prefix[i-1] + A[i];
// Directly compute each node using prefix sums
for (int i = 1; i <= n; i++)
tree[i] = prefix[i] - prefix[i - lowbit(i)];
}
// ── Complete Example ──
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int q;
cin >> n >> q;
long long A[MAXN] = {};
for (int i = 1; i <= n; i++) cin >> A[i];
build_fast(A); // O(N) initialization
while (q--) {
int type;
cin >> type;
if (type == 1) {
// Point update: A[i] += val
int i; long long val;
cin >> i >> val;
update(i, val);
} else {
// Prefix query: sum of A[1..r]
int r;
cin >> r;
cout << query(r) << "\n";
}
}
return 0;
}
3.10.3 Range Query = prefix(r) - prefix(l-1)
Range query sum(l, r) is identical to the prefix sum technique:
📄 Range query `sum(l, r)` is identical to the prefix sum technique:
// Range sum: sum of A[l..r]
// Time: O(log N) — two prefix queries
long long range_query(int l, int r) {
return query(r) - query(l - 1);
// query(r) = A[1] + A[2] + ... + A[r]
// query(l-1) = A[1] + A[2] + ... + A[l-1]
// difference = A[l] + A[l+1] + ... + A[r]
}
// Example:
// A = [3, 1, 4, 1, 5, 9, 2, 6] (1-indexed)
// range_query(3, 6) = query(6) - query(2)
// = (3+1+4+1+5+9) - (3+1)
// = 23 - 4 = 19
// Verify: A[3]+A[4]+A[5]+A[6] = 4+1+5+9 = 19 ✓
3.10.4 Comparison: Prefix Sum vs BIT vs Segment Tree
| Operation | Prefix Sum Array | Fenwick Tree (BIT) | Segment Tree |
|---|---|---|---|
| Build | O(N) | O(N) or O(N log N) | O(N) |
| Prefix Query | O(1) | O(log N) | O(log N) |
| Range Query | O(1) | O(log N) | O(log N) |
| Point Update | O(N) rebuild | O(log N) ✓ | O(log N) ✓ |
| Range Update | O(N) | O(log N) (difference BIT) | O(log N) (lazy) |
| Range Min/Max | O(1) (sparse table) | ❌ Not supported | ✓ Supported |
| Code Complexity | Trivial | Simple (10 lines) | Complex (50+ lines) |
| Constant Factor | Smallest | Very small | Larger |
| Space | O(N) | O(N) | O(4N) |
When to choose BIT?
- ✅ Only need prefix/range sum + point update
- ✅ Need minimal code (fewer bugs in contests)
- ✅ Inversion counting, merge sort counting problems
- ❌ Need range min/max → use Segment Tree
- ❌ Need complex range operations (range multiply, etc.) → use Segment Tree
3.10.5 Interactive Visualization: BIT Update Process
3.10.6 Range Update + Point Query (Difference BIT)
Standard BIT supports "point update + prefix query". Using the difference array technique, we can switch to "range update + point query".
Principle
Let difference array D[i] = A[i] - A[i-1] (D[1] = A[1]), then:
A[i] = D[1] + D[2] + ... + D[i](A[i] is the prefix sum of D)- Adding val to all elements in A[l..r] is equivalent to:
D[l] += val; D[r+1] -= val
📄 Full C++ Code
// ══════════════════════════════════════════════════════════════
// Difference BIT: Range Update + Point Query
// ══════════════════════════════════════════════════════════════
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 300005;
int n;
long long diff_bit[MAXN]; // BIT on difference array D[]
inline int lowbit(int x) { return x & (-x); }
// Update difference BIT at position i: D[i] += val
void diff_update(int i, long long val) {
for (; i <= n; i += lowbit(i))
diff_bit[i] += val;
}
// Query A[i] = sum of D[1..i] = prefix query on difference BIT
long long diff_query(int i) {
long long s = 0;
for (; i > 0; i -= lowbit(i))
s += diff_bit[i];
return s;
}
// Range update: add val to all elements in A[l..r]
// Equivalent to: D[l] += val, D[r+1] -= val
void range_update(int l, int r, long long val) {
diff_update(l, val); // D[l] += val
diff_update(r + 1, -val); // D[r+1] -= val
}
// Point query: return current value of A[i]
// A[i] = D[1] + D[2] + ... + D[i] = prefix_sum(D, i)
long long point_query(int i) {
return diff_query(i);
}
Advanced: Range Update + Range Query (Double BIT)
Supporting both range updates and range queries using two BITs:
📄 Supporting both range updates and range queries using two BITs:
// ══════════════════════════════════════════════════════════════
// Double BIT: Range Update + Range Query
// Formula: sum(1..r) = B1[r] * r - B2[r]
// Where B1 is BIT on D[], B2 is BIT on (i-1)*D[i]
// ══════════════════════════════════════════════════════════════
long long B1[MAXN], B2[MAXN];
inline int lowbit(int x) { return x & (-x); }
void add(long long* b, int i, long long v) {
for (; i <= n; i += lowbit(i)) b[i] += v;
}
long long sum(long long* b, int i) {
long long s = 0;
for (; i > 0; i -= lowbit(i)) s += b[i];
return s;
}
// Range update: add val to A[l..r]
void range_add(int l, int r, long long val) {
add(B1, l, val);
add(B1, r + 1, -val);
add(B2, l, val * (l - 1)); // Compensate for prefix formula
add(B2, r + 1, -val * r);
}
// Prefix sum A[1..r]
long long prefix_sum(int r) {
return sum(B1, r) * r - sum(B2, r);
}
// Range sum A[l..r]
long long range_sum(int l, int r) {
return prefix_sum(r) - prefix_sum(l - 1);
}
3.10.7 USACO-Style Problem: Counting Inversions with BIT
Problem Description
Count Inversions (O(N log N))
Given an integer array A of length N (distinct elements, range 1..N), count the number of inversions.
An inversion is a pair of indices (i, j) where i < j but A[i] > A[j].
Sample Input:
5
3 1 4 2 5
Sample Output:
3
Explanation: Inversions are (3,1), (3,2), (4,2), total 3 pairs.
Solution: BIT Inversion Counting
📄 View Code: Solution: BIT Inversion Counting
// ══════════════════════════════════════════════════════════════
// Count Inversions with Fenwick Tree — O(N log N)
//
// Core Idea:
// Process A[i] from left to right.
// For each A[i], the number of inversions with A[i] as right endpoint
// = number of already-processed values greater than A[i]
// = (number of elements processed so far) - (number of processed elements <= A[i])
// = i-1 - prefix_query(A[i])
// Sum over all i gives total inversions.
//
// BIT's role: track frequency of seen values.
// After seeing value v: update(v, +1)
// Query count of values <= x: query(x)
// ══════════════════════════════════════════════════════════════
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const int MAXN = 300005;
int n;
int bit[MAXN]; // Frequency count BIT
inline int lowbit(int x) { return x & (-x); }
// Add 1 at position v (saw value v)
void update(int v) {
for (; v <= n; v += lowbit(v))
bit[v]++;
}
// Count values in [1..v] seen so far
int query(int v) {
int cnt = 0;
for (; v > 0; v -= lowbit(v))
cnt += bit[v];
return cnt;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
cin >> n;
ll inversions = 0;
for (int i = 1; i <= n; i++) {
int a;
cin >> a;
// Count inversions with a as right endpoint:
// Number of already-seen values greater than a
// = (i-1 elements seen so far) - (count of seen values <= a)
int less_or_equal = query(a); // Count of [1..a] seen so far
int greater = (i - 1) - less_or_equal; // Count of [a+1..n] seen so far
inversions += greater;
// Mark that we've seen value a
update(a);
}
cout << inversions << "\n";
return 0;
}
/*
Trace for A = [3, 1, 4, 2, 5]:
i=1, a=3: seen=[], query(3)=0, greater=0-0=0. inversions=0. update(3).
i=2, a=1: seen=[3], query(1)=0, greater=1-0=1. inversions=1. update(1).
(3 > 1: 1 inversion: (3,1) ✓)
i=3, a=4: seen=[3,1], query(4)=2, greater=2-2=0. inversions=1. update(4).
(no seen elements > 4)
i=4, a=2: seen=[3,1,4], query(2)=1, greater=3-1=2. inversions=3. update(2).
(3>2 and 4>2: 2 inversions: (3,2),(4,2) ✓)
i=5, a=5: seen=[3,1,4,2], query(5)=4, greater=4-4=0. inversions=3. update(5).
Final: 3 ✓
*/
Complexity Analysis:
- Time: O(N log N) — N iterations, each with O(log N) update + query
- Space: O(N) (BIT)
Extension: If array elements are not in range 1..N, do coordinate compression first:
📄 Full C++ Code
// Coordinate compression for arbitrary values
vector<int> A(n);
for (int i = 0; i < n; i++) cin >> A[i];
// Step 1: Sort and deduplicate
vector<int> sorted_A = A;
sort(sorted_A.begin(), sorted_A.end());
sorted_A.erase(unique(sorted_A.begin(), sorted_A.end()), sorted_A.end());
// Step 2: Replace each value with its rank (1-indexed)
for (int i = 0; i < n; i++) {
A[i] = lower_bound(sorted_A.begin(), sorted_A.end(), A[i]) - sorted_A.begin() + 1;
// A[i] is now in [1..M], M = sorted_A.size()
}
// Now use BIT with n = sorted_A.size()
3.10.8 Common Mistakes
❌ Mistake 1: Wrong lowbit Implementation
// ❌ Wrong — common typo
int lowbit(int x) { return x & (x - 1); } // This clears the lowest bit, doesn't return it!
// x=6 (0110): x&(x-1) = 0110&0101 = 0100 = 4 (wrong, should be 2)
// ✅ Correct
int lowbit(int x) { return x & (-x); }
// x=6: -6 = ...11111010 (two's complement)
// 0110 & 11111010 = 0010 = 2 ✓
Memory trick: x & (-x) reads as "x AND negative x". -x is bitwise NOT plus 1, which preserves the lowest set bit, clears all bits below it, inverts all bits above it; AND keeps only the lowest set bit.
❌ Mistake 2: 0-indexed Array (0-indexed Trap)
BIT must use 1-indexed arrays. 0-indexed causes infinite loops!
// ❌ Wrong — 0-indexed causes infinite loop!
// If i = 0: query loop: i -= lowbit(0) = 0 - 0 = 0 → infinite loop!
// ✅ Correct — convert to 1-indexed
for (int i = 0; i < n; i++) {
update(i + 1, arr[i]); // Convert 0-indexed i to 1-indexed i+1
}
// Note: for 0-indexed range [l, r], use query(r+1) - query(l)
❌ Mistake 3: Integer Overflow for Large Sums
// ❌ Wrong — tree[] should be long long for large sums
int tree[MAXN]; // Overflows when sum exceeds 2^31
// ✅ Correct
long long tree[MAXN];
// Also: inversion count can be up to N*(N-1)/2 ≈ 4.5×10^10 (N=3×10^5)
// Always use long long for result counter!
long long inversions = 0; // ✅ Not int!
❌ Mistake 4: Forgetting to Clear BIT Between Test Cases
📄 View Code: ❌ Mistake 4: Forgetting to Clear BIT Between Test Cases
// ❌ Wrong — multiple test cases
int T; cin >> T;
while (T--) {
// Forgot to clear tree[]!
// Old data from previous test case contaminates results
solve();
}
// ✅ Correct — reset before each test case
int T; cin >> T;
while (T--) {
fill(tree + 1, tree + n + 1, 0LL); // Clear BIT
solve();
}
3.10.9 Chapter Summary
📋 Quick Reference
| Operation | Code | Description |
|---|---|---|
| lowbit | x & (-x) | Value of lowest set bit in x |
| Point update | for(;i<=n;i+=lowbit(i)) t[i]+=v | Propagate upward |
| Prefix query | for(;i>0;i-=lowbit(i)) s+=t[i] | Decompose downward |
| Range query | query(r) - query(l-1) | Difference formula |
| Range update (diff BIT) | upd(l,+v); upd(r+1,-v) | Difference array |
| Inversion count | (i-1) - query(a[i]) | Count per element |
| Array must be | 1-indexed | 0-indexed → infinite loop |
❓ FAQ
Q1: Both BIT and Segment Tree support prefix sum + point update; which to choose?
A: Use BIT whenever possible. BIT has only 10 lines of code, smaller constants (2-3x faster in practice), and lower chance of bugs. Only switch to Segment Tree when you need range min/max (RMQ), range coloring, or more complex range operations. In contests, BIT is the "default weapon", Segment Tree is the "heavy artillery".
Q2: Can BIT support range minimum queries (RMQ)?
A: Standard BIT cannot support RMQ, because minimum has no "inverse operation" (can't "undo" a minimum merge like subtraction). Range min/max requires Segment Tree or Sparse Table. There's a "static BIT RMQ" technique, but it only works without updates, with limited practical use.
Q3: Can BIT be 2-dimensional (2D BIT)?
A: Yes! 2D BIT solves 2D prefix sum + point update problems with complexity O(log N × log M). Code structure uses two nested loops:
// 2D BIT update void update2D(int x, int y, long long v) { for (int i = x; i <= N; i += lowbit(i)) for (int j = y; j <= M; j += lowbit(j)) bit[i][j] += v; }Not common in USACO, but occasionally used in 2D coordinate counting problems.
3.10.10 Practice Problems
🟢 Easy 1: Range Sum (Point Update) Given an array of length N, support two operations:
1 i x: add x to A[i]2 l r: query A[l] + A[l+1] + ... + A[r]
Hint: Direct application of BIT. Use update(i, x) and query(r) - query(l-1).
🟢 Easy 2: Count Elements Less Than K Given N operations, each either inserting an integer (range 1..10^6) or querying "how many of the currently inserted integers are ≤ K?"
Hint: BIT maintains frequency array over value domain. update(v, 1) inserts value v, query(K) is the answer.
🟡 Medium 1: Range Add, Point Query Given an array of length N (initially all zeros), support two operations:
1 l r x: add x to every element in A[l..r]2 i: query current value of A[i]
Hint: Use Difference BIT (Section 3.10.6).
🟡 Medium 2: Inversion Count (with Coordinate Compression) Given an array of length N, elements in range 1..10^9 (may have duplicates), count inversions.
Hint: Coordinate compress first, then use BIT counting (variant of Section 3.10.7). Note for equal elements: (i,j) with i<j and A[i]>A[j] (strictly greater) counts as an inversion.
🔴 Hard: Range Add, Range Sum (Double BIT) Given an array of length N, support two operations:
1 l r x: add x to every element in A[l..r]2 l r: query A[l] + ... + A[r]
Hint: Use Double BIT. Formula: prefix_sum(r) = B1[r] * r - B2[r], where B1 maintains difference array, B2 maintains weighted difference array.
✅ All BIT Practice Problem Solutions
🟢 Easy 1: Range Sum
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;
int n, q;
long long tree[MAXN];
int lowbit(int x) { return x & (-x); }
void update(int i, long long val) { for (; i <= n; i += lowbit(i)) tree[i] += val; }
long long query(int i) { long long s=0; for (; i > 0; i -= lowbit(i)) s += tree[i]; return s; }
int main() {
ios_base::sync_with_stdio(false); cin.tie(NULL);
cin >> n >> q;
while (q--) {
int t; cin >> t;
if (t == 1) { int i; long long x; cin >> i >> x; update(i, x); }
else { int l, r; cin >> l >> r; cout << query(r) - query(l-1) << "\n"; }
}
}
🟡 Medium 1: Range Add, Point Query (Difference BIT)
Core idea: maintain difference array in BIT. range_add(l,r,x) = update(l,x) + update(r+1,-x). Point query = query(i).
void range_add(int l, int r, long long x) { update(l, x); update(r+1, -x); }
long long point_query(int i) { return query(i); }
🟡 Medium 2: Inversion Count
// Coordinate compress first, then for each element x:
// inversions += (number of inserted elements) - query(compressed x)
// Then insert x: update(compressed x, 1)
🔴 Hard: Range Add, Range Sum (Double BIT)
// prefix_sum(r) = (r+1)*sum(D[1..r]) - sum(i*D[i], i=1..r)
// = (r+1)*B1.query(r) - B2.query(r)
// Where B1 stores D[i], B2 stores i*D[i]
struct DoubleBIT {
long long B1[MAXN], B2[MAXN];
int n;
DoubleBIT(int n) : n(n) { memset(B1,0,sizeof(B1)); memset(B2,0,sizeof(B2)); }
void add(int i, long long v) {
for (int x=i; x<=n; x+=x&-x) { B1[x]+=v; B2[x]+=v*i; }
}
void range_add(int l, int r, long long v) { add(l,v); add(r+1,-v); }
long long prefix(int i) {
long long s=0; for(int x=i;x>0;x-=x&-x) s+=(i+1)*B1[x]-B2[x]; return s;
}
long long range_query(int l, int r) { return prefix(r)-prefix(l-1); }
};
3.10.11 Weight BIT: Global K-th Smallest
A weight BIT maintains a frequency array over the value domain: bit[v] represents how many times value v appears in the sequence. It can efficiently query "the K-th smallest element in the sequence".
Naive Approach: Binary Search + Prefix Query, O(log² N)
📄 View Code: Naive Approach: Binary Search + Prefix Query, O(log² N)
// Find K-th smallest value in BIT over value domain [1..MAXV]
int kth_binary_search(int k) {
int lo = 1, hi = MAXV;
while (lo < hi) {
int mid = (lo + hi) / 2;
if (query(mid) >= k)
hi = mid;
else
lo = mid + 1;
}
return lo;
}
Doubling Optimization: O(log N)
Leveraging BIT's tree structure, the doubling method finds the K-th smallest in O(log N):
📄 Leveraging BIT's tree structure, the doubling method finds the K-th smallest in O(log N):
// Global K-th smallest (doubling method) — O(log N)
// Prerequisite: BIT maintains value domain frequency, bit[v] = count of v
int kth(int k) {
int sum = 0, x = 0;
// Determine answer bit by bit from highest to lowest
for (int i = (int)log2(MAXV); i >= 0; --i) {
int nx = x + (1 << i);
if (nx <= MAXV && sum + bit[nx] < k) {
x = nx; // Take this entire segment, continue expanding right
sum += bit[nx];
}
// Otherwise answer is in [x+1, x + 2^(i-1)], don't expand
}
return x + 1; // x is the last position where sum < k, answer is x+1
}
// Complete example: dynamically maintain sequence, support insert and K-th smallest query
// Insert value v: update(v, 1)
// Delete value v: update(v, -1)
// Query K-th smallest: kth(k)
💡 Principle Explained: BIT's tree structure makes
bit[x]exactly the subtree sum rooted at x (the interval before x's lowest binary bit). During doubling, each step tries setting a bit of x to 1: if the prefix sum with that bit set is still < k, the answer is on the right, so expand; otherwise narrow to the left. O(log V) steps total.
💡 Chapter Connection: BIT and Segment Tree are the two most commonly paired data structures in USACO. BIT handles 80% of scenarios with 1/5 the code. After mastering BIT, return to Chapter 3.9 to learn Segment Tree lazy propagation — the territory BIT cannot reach.