Hash Map
How constant-time O(1) lookups actually work under the hood.
Tiny Summary
Hash maps provide O(1) average-time inserts and lookups by using a hash function to map keys directly to array indices.
Collisions can occur when multiple keys map to the same index, and performance degrades if the hash function is poor or the table becomes too crowded.
How a Hash Map Actually Works
At its core, a Hash Map is just an array with a clever hashing/indexing trick.
How it works:
- You have an array of some size (maybe 10 slots available)
- To insert "apple", you run "apple" through a hash function (which converts the key to a number, the same number every time)
- The hash function outputs a number (the hashcode): hash("apple") = 47823
- Take the hashcode and mod (%) it by the size of the array to find which index to store it at: 47823 % 10 = 3
- Now we store the value at array[3]
What we did above:
key = "apple"
hashcode = hash("apple") = 47823
index = 47823 % 10 = 3
array[3] = ("apple", 5)
This is why lookups are O(1): you compute the index directly from the key using the hash function. No searching through the array is needed.
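A minimal Python sketch of that insert-and-lookup path (the table size of 10, the slots list, and the put/get names are assumptions for illustration; Python's built-in hash() stands in for the hash function):

```python
TABLE_SIZE = 10                      # assumed toy size, matching the example above
slots = [None] * TABLE_SIZE          # the underlying array

def put(key, value):
    hashcode = hash(key)             # key -> big number
    index = hashcode % TABLE_SIZE    # big number -> slot index in the array
    slots[index] = (key, value)      # store the key and value (ignoring collisions for now)

def get(key):
    index = hash(key) % TABLE_SIZE   # recompute the same index from the key
    entry = slots[index]
    return entry[1] if entry is not None and entry[0] == key else None

put("apple", 5)
print(get("apple"))  # 5
```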
One Problem That Happens: Collisions
What happens when two keys hash to the same index?
hash("apple") = 47823 → index 3
hash("grape") = 10003 → index 3 // Collision!
Both want array[3]. Now what?
Collisions are inevitable because:
- Pigeonhole Principle: More possible keys than buckets means some must share
- Birthday paradox: With just 23 people there's a ~50% chance two share a birthday, so collisions show up much sooner than intuition suggests
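A quick sketch of how a collision happens, using a toy hash function (summing character codes is an assumption for illustration, far weaker than a real hash function):

```python
def toy_hash(key):
    # deliberately simplistic: just sum the character codes
    return sum(ord(ch) for ch in key)

TABLE_SIZE = 10
print(toy_hash("kiwi") % TABLE_SIZE)  # 6
print(toy_hash("plum") % TABLE_SIZE)  # 6  -> both keys want the same slot
```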
Two Collision Resolution Strategies
1. Separate Chaining
Each array slot stores a linked list of all items that hash to it.
array[3]: → ("apple", 5) → ("grape", 2) → null
array[5]: → ("banana", 7) → null
Lookup: Hash the key → jump to array index → walk the linked list
Performance:
- Best: O(1) - key is first in chain
- Worst: O(n) - all keys collide into one bucket
- Average: O(1 + α), where α = average items per bucket (the load factor)
Typically, hash maps resize when α gets too big, so on average you can treat inserts and lookups as O(1).
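A rough Python sketch of separate chaining, using a plain list per bucket instead of a linked list (an assumption for brevity; the walk-the-chain behavior is the same), with class and method names made up for illustration:

```python
class ChainedHashMap:
    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]   # each slot holds a list of (key, value) pairs

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):           # key already present: overwrite it
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))                   # otherwise append to the chain

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:   # walk the chain at that slot
            if k == key:
                return v
        return None

m = ChainedHashMap()
m.put("apple", 5)
m.put("grape", 2)       # even if both land in the same bucket, both survive
print(m.get("grape"))   # 2
```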
2. Linear Probing
When a collision happens, probe forward to find the next empty slot.
Try array[hash(key)]
If full, try array[hash(key) + 1]
If full, try array[hash(key) + 2]
... until you find empty space
Lookup: Hash the key → check array index → if wrong key, keep probing
Performance:
- Best: O(1) - key in ideal position
- Worst: O(n) - probe through many slots
- Gets worse as the table fills up (clustering)
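A rough sketch of linear probing under similar assumptions (fixed-size table, no resizing or deletion handling, and it assumes the table never completely fills, otherwise the probe loop would not terminate):

```python
class ProbingHashMap:
    def __init__(self, size=8):
        self.slots = [None] * size            # each slot is None or a (key, value) pair

    def put(self, key, value):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)     # collision: step to the next slot, wrapping around
        self.slots[i] = (key, value)

    def get(self, key):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % len(self.slots)     # wrong key here: keep probing
        return None                           # hit an empty slot: key is not present

m = ProbingHashMap()
m.put("apple", 5)
m.put("grape", 2)       # if these collide, "grape" lands in the next free slot
print(m.get("grape"))   # 2
```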
Load Factor & Resizing
Load Factor (α) = items / buckets: a measure of how full the hash map is. A lower load factor means faster operations, but keeping it low costs more memory.
When α hits ~0.75, performance degrades:
- Chaining: longer linked lists
- Linear probing: more collisions, more probing
Solution: Double the array size and rehash everything (resizing).
8 buckets → 16 buckets
Rehash all items into new positions
This takes O(n) time and extra space, but it happens rarely, so it amortizes to O(1) per insert.
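A rough sketch of the resize step, reusing the chaining layout from the earlier sketch (the 0.75 threshold and the doubling factor are common choices, not requirements; the function name is made up for illustration):

```python
LOAD_FACTOR_LIMIT = 0.75

def maybe_resize(buckets, count):
    """Double the bucket array and rehash every entry when the load factor gets too high."""
    if count / len(buckets) < LOAD_FACTOR_LIMIT:
        return buckets                                   # still roomy enough, keep as-is
    new_buckets = [[] for _ in range(len(buckets) * 2)]  # e.g. 8 buckets -> 16 buckets
    for bucket in buckets:
        for key, value in bucket:
            new_buckets[hash(key) % len(new_buckets)].append((key, value))  # rehash into new position
    return new_buckets
```

Doubling (rather than adding a fixed number of buckets each time) is what lets the occasional O(n) rehash average out to O(1) per insert.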