Chapter 3.8: Maps & Sets
Maps and sets are the workhorses of frequency counting, lookup, and tracking unique elements. In this chapter, we go deep into their practical use in USACO problems.
π Before You Continue: You should be comfortable with arrays and basic C++ STL (Chapter 2.4). Understanding hash tables conceptually (Chapter 3.7) will help, but is not strictly required β
mapandsetare tree-based and work without hashing.
3.8.1 map vs unordered_map β Choosing Wisely
Visual: Map Internal Structure (BST)
std::map stores key-value pairs in a balanced BST (Red-Black tree). This gives O(log N) for all operations and keeps keys sorted automatically β great when you need lower_bound/upper_bound queries. Use unordered_map when you only need O(1) lookups and don't care about order.
The key structural difference between map and unordered_map:
| Feature | map | unordered_map |
|---|---|---|
| Underlying structure | Red-black tree | Hash table |
| Insert/lookup time | O(log n) | O(1) average, O(n) worst |
| Iterates in | Sorted key order | Arbitrary order |
| Min/Max key | Available via .begin()/.rbegin() | Not available |
| Keys must be | Comparable (has <) | Hashable |
| Use when | You need sorted keys or find min/max | You need fastest possible lookup |
For most USACO problems, either works fine. Use unordered_map for speed when keys are integers or strings, map when you need ordered iteration.
Example: Frequency Map
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
unordered_map<int, int> freq;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
freq[x]++; // increment count; creates with 0 if not present
}
// Find the element with highest frequency
int maxFreq = 0, maxVal = INT_MIN;
for (auto &[val, count] : freq) { // structured binding (C++17)
if (count > maxFreq || (count == maxFreq && val < maxVal)) {
maxFreq = count;
maxVal = val;
}
}
cout << "Most frequent: " << maxVal << " (" << maxFreq << " times)\n";
return 0;
}
3.8.2 Map Operations β Complete Reference
#include <bits/stdc++.h>
using namespace std;
int main() {
map<string, int> scores;
// Insert
scores["Alice"] = 95;
scores["Bob"] = 87;
scores["Charlie"] = 92;
scores.insert({"Dave", 78}); // another way
scores.emplace("Eve", 88); // most efficient way
// Lookup
cout << scores["Alice"] << "\n"; // 95
// WARNING: scores["Unknown"] creates it with value 0!
// Safe lookup
if (scores.count("Frank")) {
cout << scores["Frank"] << "\n";
} else {
cout << "Frank not found\n";
}
// Using find() β returns iterator
auto it = scores.find("Bob");
if (it != scores.end()) {
cout << it->first << ": " << it->second << "\n"; // Bob: 87
}
// Update
scores["Alice"] += 5; // Alice now has 100
// Erase
scores.erase("Charlie");
// Iterate in sorted key order (map always gives sorted order)
for (const auto &[name, score] : scores) {
cout << name << ": " << score << "\n";
}
// Alice: 100
// Bob: 87
// Dave: 78
// Eve: 88
// Size and empty check
cout << scores.size() << "\n"; // 4
cout << scores.empty() << "\n"; // 0 (false)
// Clear all entries
scores.clear();
return 0;
}
3.8.3 Set Operations β Complete Reference
#include <bits/stdc++.h>
using namespace std;
int main() {
set<int> s = {5, 3, 8, 1, 9, 2};
// s = {1, 2, 3, 5, 8, 9} (always sorted!)
// Insert
s.insert(4); // s = {1, 2, 3, 4, 5, 8, 9}
s.insert(3); // already there, no change
// Erase
s.erase(8); // s = {1, 2, 3, 4, 5, 9}
// Lookup
cout << s.count(3) << "\n"; // 1 (exists)
cout << s.count(7) << "\n"; // 0 (not found)
// Iterator-based queries
auto it = s.lower_bound(4); // first element >= 4
cout << *it << "\n"; // 4
auto it2 = s.upper_bound(4); // first element > 4
cout << *it2 << "\n"; // 5
// Min and Max
cout << *s.begin() << "\n"; // 1 (min)
cout << *s.rbegin() << "\n"; // 9 (max)
// Remove minimum
s.erase(s.begin()); // removes 1
cout << *s.begin() << "\n"; // 2
// Iterate
for (int x : s) cout << x << " ";
cout << "\n"; // 2 3 4 5 9
return 0;
}
3.8.4 USACO Problem: Cow IDs
Problem (USACO 2017 February Bronze): Bessie wants to find the N-th smallest number that doesn't appear in a set of "taken" IDs. Given a set of taken IDs and N, find the N-th available ID.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q;
cin >> n >> q;
set<int> taken;
for (int i = 0; i < n; i++) {
int x; cin >> x;
taken.insert(x);
}
// For each query q, find the q-th positive integer NOT in taken
while (q--) {
int k; cin >> k;
// Binary search: find smallest x such that x - (# taken values <= x) >= k
int lo = 1, hi = 2e9;
while (lo < hi) {
int mid = lo + (hi - lo) / 2;
// count of available numbers in [1, mid] = mid - (# taken values <= mid)
int taken_count = (int)(taken.lower_bound(mid + 1) - taken.begin());
int available = mid - taken_count;
if (available >= k) hi = mid;
else lo = mid + 1;
}
cout << lo << "\n";
}
return 0;
}
3.8.5 Multiset β Sorted Bag with Duplicates
A multiset is like a set, but allows duplicate values:
#include <bits/stdc++.h>
using namespace std;
int main() {
multiset<int> ms;
ms.insert(3);
ms.insert(1);
ms.insert(3); // duplicate allowed
ms.insert(5);
ms.insert(1);
// ms = {1, 1, 3, 3, 5}
cout << ms.count(3) << "\n"; // 2 (how many 3s)
cout << ms.count(2) << "\n"; // 0
// Remove ONE occurrence of 3
ms.erase(ms.find(3)); // removes only one 3
// ms = {1, 1, 3, 5}
// Remove ALL occurrences of 1
ms.erase(1); // removes all 1s
// ms = {3, 5}
cout << *ms.begin() << "\n"; // 3 (min)
cout << *ms.rbegin() << "\n"; // 5 (max)
return 0;
}
Running Median with Two Multisets
Keep track of the median of a stream of numbers using a max-multiset (lower half) and a min-multiset (upper half):
#include <bits/stdc++.h>
using namespace std;
int main() {
multiset<int> lo; // max-heap: lower half (use negation or reverse iterator)
multiset<int> hi; // min-heap: upper half
// For simplicity, use two multisets where lo stores values in reverse
// lo's maximum = lo.rbegin(); hi's minimum = hi.begin()
int n;
cin >> n;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
// Add to appropriate half
if (lo.empty() || x <= *lo.rbegin()) {
lo.insert(x);
} else {
hi.insert(x);
}
// Rebalance: sizes should differ by at most 1
while (lo.size() > hi.size() + 1) {
hi.insert(*lo.rbegin());
lo.erase(lo.find(*lo.rbegin()));
}
while (hi.size() > lo.size()) {
lo.insert(*hi.begin());
hi.erase(hi.begin());
}
// Print median
if (lo.size() == hi.size()) {
// Even count: average of two middle values
double median = (*lo.rbegin() + *hi.begin()) / 2.0;
cout << fixed << setprecision(1) << median << "\n";
} else {
// Odd count: middle value is in lo
cout << *lo.rbegin() << "\n";
}
}
return 0;
}
3.8.6 Practical Patterns
Pattern 1: Counting Distinct Elements
vector<int> data = {1, 5, 3, 1, 2, 5, 5, 3};
set<int> distinct(data.begin(), data.end());
cout << "Distinct count: " << distinct.size() << "\n"; // 4
Pattern 2: Group by Frequency, Sort by Value
vector<int> nums = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
map<int, int> freq;
for (int x : nums) freq[x]++;
// Group values by their frequency
map<int, vector<int>> byFreq;
for (auto &[val, cnt] : freq) {
byFreq[cnt].push_back(val);
}
// Print in order of frequency
for (auto &[cnt, vals] : byFreq) {
for (int v : vals) cout << v << " (Γ" << cnt << ")\n";
}
Pattern 3: Offline Queries with Sorting
Sort queries along with events to process them together in O((N+Q) log N):
// Example: for each query point, count how many events have value <= query point
// Sort both arrays, sweep through with two pointers
β οΈ Common Mistakes in Chapter 3.8
| # | Mistake | Why It's Wrong | Fix |
|---|---|---|---|
| 1 | map[key] accessing non-existent key | Auto-creates entry with value 0, pollutes data | Use m.count(key) or m.find(key) to check first |
| 2 | multiset::erase(value) deletes all equal values | Expected to delete one, deleted all | Use ms.erase(ms.find(value)) to delete just one |
| 3 | Modifying map/set size during iteration | Iterator invalidated, crash or skipped elements | Use it = m.erase(it) for safe deletion |
| 4 | unordered_map hacked to degrade to O(N) | Adversary constructs hash-collision data, TLE | Switch to map or use custom hash function |
| 5 | Forgetting set doesn't store duplicates | size() doesn't grow after inserting duplicate, count wrong | Use multiset when duplicates needed |
Chapter Summary
π Key Takeaways
| Structure | Ordered | Duplicates | Key Feature | Why It Matters |
|---|---|---|---|---|
map<K,V> | Yes (sorted) | No (unique keys) | Key-value mapping, O(log N) | Frequency counting, IDβattribute mapping |
unordered_map<K,V> | No | No | O(1) average lookup | 5-10x faster than map for large data |
set<T> | Yes (sorted) | No | Ordered unique set | Deduplication, range queries (lower_bound) |
unordered_set<T> | No | No | O(1) membership test | Just need to check "seen before?" |
multiset<T> | Yes (sorted) | Yes | Ordered multiset | Dynamic median, sliding window |
π§© "Which Container to Use" Quick Reference
| Need | Recommended Container | Reason |
|---|---|---|
| Count occurrences of each element | map / unordered_map | freq[x]++ in one line |
| Deduplicate and sort | set | Auto-dedup + auto-sort |
| Check if element was seen | unordered_set | O(1) lookup |
| Dynamic ordered set + find extremes | set / multiset | O(1) access to min/max |
Need lower_bound / upper_bound | set / map | Only ordered containers support this |
| Valueβindex mapping | map / unordered_map | Coordinate compression etc. |
β FAQ
Q1: What's the difference between map's [] operator and find?
A:
m[key]auto-creates a default value (0 for int) when key doesn't exist;m.find(key)only searches, doesn't create. If you just want to check if a key exists, usem.count(key)orm.find(key) != m.end().
Q2: Both multiset and priority_queue can get extremes β which to use?
A:
priority_queuecan only get the max (or min) and delete it, doesn't support deletion by value.multisetsupports finding and deleting any value, more flexible. If you only need to repeatedly get the extreme,priority_queueis simpler; if you need to delete specific elements (e.g., removing elements leaving a sliding window), usemultiset.
Q3: When can unordered_map be slower than map?
A: Two situations: β When hash collisions are severe (many keys hash to the same bucket), degrades to
O(N); β‘ In contests, adversaries deliberately construct data to hackunordered_map. Solution: use a custom hash function, or switch tomap.
Q4: Is C++17 structured binding auto &[key, val] safe? Can I use it in contests?
A: USACO and most contest platforms support C++17, so
for (auto &[key, val] : m)is safe to use. It's cleaner thanentry.first/entry.second.
π Connections to Later Chapters
- Chapter 3.3 (Sorting & Searching): coordinate compression often combines with
map(value β compressed index) - Chapter 3.9 (Segment Trees): ordered
set'slower_boundcan replace simple segment tree queries - Chapters 5.1β5.2 (Graphs):
mapis commonly used to store adjacency lists for sparse graphs - Chapter 4.1 (Greedy):
multisetcombined with greedy strategies can efficiently maintain dynamic optimal choices - The
mapfrequency counting pattern appears throughout the book and is one of the most fundamental tools in competitive programming
Practice Problems
Problem 3.8.1 β Two Sum π’ Easy Read N integers and a target T. Find two values that sum to T. Print their indices (1-indexed).
β Full Solution
#include <bits/stdc++.h>
using namespace std;
int main() {
int n, T; cin >> n >> T;
map<int, int> seen; // value β index
for (int i = 1; i <= n; i++) {
int x; cin >> x;
if (seen.count(T - x)) {
cout << seen[T - x] << " " << i << "\n";
return 0;
}
seen[x] = i;
}
cout << "no solution\n";
}
Complexity: O(N log N) with map, O(N) with unordered_map.
Problem 3.8.2 β Anagram Groups π‘ Medium Group N words by their sorted-letter form.
β Full Solution
#include <bits/stdc++.h>
using namespace std;
int main() {
int n; cin >> n;
map<string, vector<string>> groups;
for (int i = 0; i < n; i++) {
string w; cin >> w;
string key = w; sort(key.begin(), key.end());
groups[key].push_back(w);
}
for (auto& [key, words] : groups) {
sort(words.begin(), words.end());
for (const string& w : words) cout << w << " ";
cout << "\n";
}
}
Complexity: O(N Γ K log K) where K = average word length.
Problem 3.8.3 β Interval Overlap Count π‘ Medium Count maximum overlap of N intervals over points 1..M.
β Full Solution
#include <bits/stdc++.h>
using namespace std;
int main() {
int n, M; cin >> n >> M;
vector<int> diff(M + 2, 0);
for (int i = 0; i < n; i++) {
int l, r; cin >> l >> r;
diff[l]++; diff[r+1]--; // difference array
}
int cur = 0, ans = 0;
for (int i = 1; i <= M; i++) {
cur += diff[i];
ans = max(ans, cur);
}
cout << ans << "\n";
}
Complexity: O(N + M).
Problem 3.8.4 β Cow Photography π΄ Hard Find ordering consistent with all N lists (each a permutation of IDs).
β Full Solution
Core Idea: For each pair (a, b), count in how many lists a appears before b. If a appears before b in more than half the lists, a comes before b in the true order. Sort using this pairwise comparison.
#include <bits/stdc++.h>
using namespace std;
int main() {
int n, k; cin >> n >> k;
// pos[list][cow] = position of cow in list
vector<vector<int>> pos(k, vector<int>(n+1));
for (int i = 0; i < k; i++)
for (int j = 0; j < n; j++) {
int x; cin >> x; pos[i][x] = j;
}
// count[a][b] = #lists where a before b
// For large N, use the "majority" approach:
// Compare each pair in O(k) β feasible for small N
vector<int> cows(n); iota(cows.begin(), cows.end(), 1);
sort(cows.begin(), cows.end(), [&](int a, int b){
int before = 0;
for (int i = 0; i < k; i++) before += (pos[i][a] < pos[i][b]);
return before > k / 2;
});
for (int c : cows) cout << c << "\n";
}
Problem 3.8.5 β Running Distinct Count π’ Easy After each new integer, print count of distinct values seen.
β Full Solution
#include <bits/stdc++.h>
using namespace std;
int main() {
int n; cin >> n;
unordered_set<int> seen;
for (int i = 0; i < n; i++) {
int x; cin >> x;
seen.insert(x);
cout << seen.size() << "\n";
}
}
Complexity: O(N) average.