Chapter 3.8: Maps & Sets
Maps and sets are the workhorses of frequency counting, lookup, and tracking unique elements. In this chapter, we go deep into their practical use in USACO problems.
3.8.1 map vs unordered_map β Choosing Wisely
Visual: Map Internal Structure (BST)
std::map stores key-value pairs in a balanced BST (Red-Black tree). This gives O(log N) for all operations and keeps keys sorted automatically β great when you need lower_bound/upper_bound queries. Use unordered_map when you only need O(1) lookups and don't care about order.
| Feature | map | unordered_map |
|---|---|---|
| Underlying structure | Red-black tree | Hash table |
| Insert/lookup time | O(log n) | O(1) average, O(n) worst |
| Iterates in | Sorted key order | Arbitrary order |
| Min/Max key | Available via .begin()/.rbegin() | Not available |
| Keys must be | Comparable (has <) | Hashable |
| Use when | You need sorted keys or find min/max | You need fastest possible lookup |
For most USACO problems, either works fine. Use unordered_map for speed when keys are integers or strings, map when you need ordered iteration.
Example: Frequency Map
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
unordered_map<int, int> freq;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
freq[x]++; // increment count; creates with 0 if not present
}
// Find the element with highest frequency
int maxFreq = 0, maxVal = INT_MIN;
for (auto &[val, count] : freq) { // structured binding (C++17)
if (count > maxFreq || (count == maxFreq && val < maxVal)) {
maxFreq = count;
maxVal = val;
}
}
cout << "Most frequent: " << maxVal << " (" << maxFreq << " times)\n";
return 0;
}
3.8.2 Map Operations β Complete Reference
#include <bits/stdc++.h>
using namespace std;
int main() {
map<string, int> scores;
// Insert
scores["Alice"] = 95;
scores["Bob"] = 87;
scores["Charlie"] = 92;
scores.insert({"Dave", 78}); // another way
scores.emplace("Eve", 88); // most efficient way
// Lookup
cout << scores["Alice"] << "\n"; // 95
// WARNING: scores["Unknown"] creates it with value 0!
// Safe lookup
if (scores.count("Frank")) {
cout << scores["Frank"] << "\n";
} else {
cout << "Frank not found\n";
}
// Using find() β returns iterator
auto it = scores.find("Bob");
if (it != scores.end()) {
cout << it->first << ": " << it->second << "\n"; // Bob: 87
}
// Update
scores["Alice"] += 5; // Alice now has 100
// Erase
scores.erase("Charlie");
// Iterate in sorted key order (map always gives sorted order)
for (const auto &[name, score] : scores) {
cout << name << ": " << score << "\n";
}
// Alice: 100
// Bob: 87
// Dave: 78
// Eve: 88
// Size and empty check
cout << scores.size() << "\n"; // 4
cout << scores.empty() << "\n"; // 0 (false)
// Clear all entries
scores.clear();
return 0;
}
3.8.3 Set Operations β Complete Reference
#include <bits/stdc++.h>
using namespace std;
int main() {
set<int> s = {5, 3, 8, 1, 9, 2};
// s = {1, 2, 3, 5, 8, 9} (always sorted!)
// Insert
s.insert(4); // s = {1, 2, 3, 4, 5, 8, 9}
s.insert(3); // already there, no change
// Erase
s.erase(8); // s = {1, 2, 3, 4, 5, 9}
// Lookup
cout << s.count(3) << "\n"; // 1 (exists)
cout << s.count(7) << "\n"; // 0 (not found)
// Iterator-based queries
auto it = s.lower_bound(4); // first element >= 4
cout << *it << "\n"; // 4
auto it2 = s.upper_bound(4); // first element > 4
cout << *it2 << "\n"; // 5
// Min and Max
cout << *s.begin() << "\n"; // 1 (min)
cout << *s.rbegin() << "\n"; // 9 (max)
// Remove minimum
s.erase(s.begin()); // removes 1
cout << *s.begin() << "\n"; // 2
// Iterate
for (int x : s) cout << x << " ";
cout << "\n"; // 2 3 4 5 9
return 0;
}
3.8.4 USACO Problem: Cow IDs
Problem (USACO 2017 February Bronze): Bessie wants to find the N-th smallest number that doesn't appear in a set of "taken" IDs. Given a set of taken IDs and N, find the N-th available ID.
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, q;
cin >> n >> q;
set<int> taken;
for (int i = 0; i < n; i++) {
int x; cin >> x;
taken.insert(x);
}
// For each query q, find the q-th positive integer NOT in taken
while (q--) {
int k; cin >> k;
// Binary search: find smallest x such that x - (# taken values <= x) >= k
int lo = 1, hi = 2e9;
while (lo < hi) {
int mid = lo + (hi - lo) / 2;
// count of available numbers in [1, mid] = mid - (# taken values <= mid)
int taken_count = (int)(taken.lower_bound(mid + 1) - taken.begin());
int available = mid - taken_count;
if (available >= k) hi = mid;
else lo = mid + 1;
}
cout << lo << "\n";
}
return 0;
}
3.8.5 Multiset β Sorted Bag with Duplicates
A multiset is like a set, but allows duplicate values:
#include <bits/stdc++.h>
using namespace std;
int main() {
multiset<int> ms;
ms.insert(3);
ms.insert(1);
ms.insert(3); // duplicate allowed
ms.insert(5);
ms.insert(1);
// ms = {1, 1, 3, 3, 5}
cout << ms.count(3) << "\n"; // 2 (how many 3s)
cout << ms.count(2) << "\n"; // 0
// Remove ONE occurrence of 3
ms.erase(ms.find(3)); // removes only one 3
// ms = {1, 1, 3, 5}
// Remove ALL occurrences of 1
ms.erase(1); // removes all 1s
// ms = {3, 5}
cout << *ms.begin() << "\n"; // 3 (min)
cout << *ms.rbegin() << "\n"; // 5 (max)
return 0;
}
Running Median with Two Multisets
Keep track of the median of a stream of numbers using a max-multiset (lower half) and a min-multiset (upper half):
#include <bits/stdc++.h>
using namespace std;
int main() {
multiset<int> lo; // max-heap: lower half (use negation or reverse iterator)
multiset<int> hi; // min-heap: upper half
// For simplicity, use two multisets where lo stores values in reverse
// lo's maximum = lo.rbegin(); hi's minimum = hi.begin()
int n;
cin >> n;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
// Add to appropriate half
if (lo.empty() || x <= *lo.rbegin()) {
lo.insert(x);
} else {
hi.insert(x);
}
// Rebalance: sizes should differ by at most 1
while (lo.size() > hi.size() + 1) {
hi.insert(*lo.rbegin());
lo.erase(lo.find(*lo.rbegin()));
}
while (hi.size() > lo.size()) {
lo.insert(*hi.begin());
hi.erase(hi.begin());
}
// Print median
if (lo.size() == hi.size()) {
// Even count: average of two middle values
double median = (*lo.rbegin() + *hi.begin()) / 2.0;
cout << fixed << setprecision(1) << median << "\n";
} else {
// Odd count: middle value is in lo
cout << *lo.rbegin() << "\n";
}
}
return 0;
}
3.8.6 Practical Patterns
Pattern 1: Counting Distinct Elements
vector<int> data = {1, 5, 3, 1, 2, 5, 5, 3};
set<int> distinct(data.begin(), data.end());
cout << "Distinct count: " << distinct.size() << "\n"; // 4
Pattern 2: Group by Frequency, Sort by Value
vector<int> nums = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
map<int, int> freq;
for (int x : nums) freq[x]++;
// Group values by their frequency
map<int, vector<int>> byFreq;
for (auto &[val, cnt] : freq) {
byFreq[cnt].push_back(val);
}
// Print in order of frequency
for (auto &[cnt, vals] : byFreq) {
for (int v : vals) cout << v << " (Γ" << cnt << ")\n";
}
Pattern 3: Offline Queries with Sorting
Sort queries along with events to process them together in O((N+Q) log N):
// Example: for each query point, count how many events have value <= query point
// Sort both arrays, sweep through with two pointers
β οΈ Common Mistakes in Chapter 3.8
| # | Mistake | Why It's Wrong | Fix |
|---|---|---|---|
| 1 | map[key] accessing non-existent key | Auto-creates entry with value 0, pollutes data | Use m.count(key) or m.find(key) to check first |
| 2 | multiset::erase(value) deletes all equal values | Expected to delete one, deleted all | Use ms.erase(ms.find(value)) to delete just one |
| 3 | Modifying map/set size during iteration | Iterator invalidated, crash or skipped elements | Use it = m.erase(it) for safe deletion |
| 4 | unordered_map hacked to degrade to O(N) | Adversary constructs hash-collision data, TLE | Switch to map or use custom hash function |
| 5 | Forgetting set doesn't store duplicates | size() doesn't grow after inserting duplicate, count wrong | Use multiset when duplicates needed |
Chapter Summary
π Key Takeaways
| Structure | Ordered | Duplicates | Key Feature | Why It Matters |
|---|---|---|---|---|
map<K,V> | Yes (sorted) | No (unique keys) | Key-value mapping, O(log N) | Frequency counting, IDβattribute mapping |
unordered_map<K,V> | No | No | O(1) average lookup | 5-10x faster than map for large data |
set<T> | Yes (sorted) | No | Ordered unique set | Deduplication, range queries (lower_bound) |
unordered_set<T> | No | No | O(1) membership test | Just need to check "seen before?" |
multiset<T> | Yes (sorted) | Yes | Ordered multiset | Dynamic median, sliding window |
π§© "Which Container to Use" Quick Reference
| Need | Recommended Container | Reason |
|---|---|---|
| Count occurrences of each element | map / unordered_map | freq[x]++ in one line |
| Deduplicate and sort | set | Auto-dedup + auto-sort |
| Check if element was seen | unordered_set | O(1) lookup |
| Dynamic ordered set + find extremes | set / multiset | O(1) access to min/max |
Need lower_bound / upper_bound | set / map | Only ordered containers support this |
| Valueβindex mapping | map / unordered_map | Coordinate compression etc. |
β FAQ
Q1: What's the difference between map's [] operator and find?
A:
m[key]auto-creates a default value (0 for int) when key doesn't exist;m.find(key)only searches, doesn't create. If you just want to check if a key exists, usem.count(key)orm.find(key) != m.end().
Q2: Both multiset and priority_queue can get extremes β which to use?
A:
priority_queuecan only get the max (or min) and delete it, doesn't support deletion by value.multisetsupports finding and deleting any value, more flexible. If you only need to repeatedly get the extreme,priority_queueis simpler; if you need to delete specific elements (e.g., removing elements leaving a sliding window), usemultiset.
Q3: When can unordered_map be slower than map?
A: Two situations: β When hash collisions are severe (many keys hash to the same bucket), degrades to
O(N); β‘ In contests, adversaries deliberately construct data to hackunordered_map. Solution: use a custom hash function, or switch tomap.
Q4: Is C++17 structured binding auto &[key, val] safe? Can I use it in contests?
A: USACO and most contest platforms support C++17, so
for (auto &[key, val] : m)is safe to use. It's cleaner thanentry.first/entry.second.
π Connections to Later Chapters
- Chapter 3.3 (Sorting & Searching): coordinate compression often combines with
map(value β compressed index) - Chapter 3.9 (Segment Trees): ordered
set'slower_boundcan replace simple segment tree queries - Chapters 5.1β5.2 (Graphs):
mapis commonly used to store adjacency lists for sparse graphs - Chapter 4.1 (Greedy):
multisetcombined with greedy strategies can efficiently maintain dynamic optimal choices - The
mapfrequency counting pattern appears throughout the book and is one of the most fundamental tools in competitive programming
Practice Problems
Problem 3.8.1 β Two Sum Read N integers and a target T. Find two values in the array that sum to T. Print their indices (1-indexed). (Hint: use a map to store value β index)
Problem 3.8.2 β Anagram Groups Read N words. Group them by their sorted-letter form. Print each group on one line, sorted alphabetically.
- Example: "eat tea tan ate nat bat" β groups: {ate, eat, tea}, {bat}, {nat, tan}
Problem 3.8.3 β Interval Overlap Count Read N intervals [L_i, R_i]. For each integer point from 1 to M, count how many intervals contain it. Output the maximum overlap count. (Hint: use difference array, or sort events and sweep with a set)
Problem 3.8.4 β Cow Photography (USACO Bronze Inspired) N cows each have a unique ID. Read N lists (each a permutation of IDs). Find the ordering that's consistent with all lists (the "true" order). (Hint: use maps to count pairwise orderings)
Problem 3.8.5 β Running Distinct Count Read N integers one by one. After each new integer, print the count of distinct values seen so far. (Hint: maintain an unordered_set; its size is the answer)