Tuesday, July 22, 2025

Java Collections (internal workings) - How Hashmap works internally

 

๐Ÿ“ฆ Java Collections – Internal Workings & Use Cases


1. List Implementations

ClassBacked ByKey TraitsTime Complexity
ArrayListDynamic arrayFast random access; resize costsO(1) get, O(n) add (worst)
LinkedListDoubly linked listFast insert/delete at endsO(n) get, O(1) addFirst/last
VectorSynchronized arrayLegacy, thread-safeSlower than ArrayList

Use When:

  • Use ArrayList for frequent reads/random access

  • Use LinkedList for frequent inserts/deletes


2. Set Implementations

ClassBacked ByKey TraitsTime Complexity
HashSetHashMapNo duplicates, no orderO(1) add/contains
LinkedHashSetLinked HashMapMaintains insertion orderO(1) add, ordered
TreeSetRed-Black Tree (self-balancing BST)Sorted, no duplicatesO(log n) add/search

Use When:

  • HashSet for fast lookups

  • LinkedHashSet to preserve order

  • TreeSet to maintain sorted elements


3. Map Implementations

ClassBacked ByKey TraitsTime Complexity
HashMapArray + Linked List / Tree (after Java 8)Unordered, allows nullsO(1) get/put (avg)
LinkedHashMapHashMap + Doubly Linked ListMaintains access/insertion orderO(1) get, ordered
TreeMapRed-Black TreeSorted by keysO(log n)
ConcurrentHashMapSegmented/striped locking → Java 7, lock-free buckets (Java 8+)Thread-safeO(1) concurrent get/put

HashMap (Java 8+) uses:

  • Buckets (array of nodes)

  • Converts bucket to TreeNode (Red-Black Tree) if bucket > 8 elements → speeds up worst-case from O(n) to O(log n)

Use When:

  • HashMap for general usage

  • LinkedHashMap for LRU caches

  • TreeMap for sorted data

  • ConcurrentHashMap in multithreaded scenarios


4. Queue & Deque

InterfaceCommon ImplementationsKey Use
QueueLinkedList, PriorityQueueFIFO
DequeArrayDeque, LinkedListDouble-ended queue
BlockingQueueArrayBlockingQueue, LinkedBlockingQueue, DelayQueueThread-safe, producer/consumer

PriorityQueue is min-heap (O(log n) insert)
ArrayDeque is more efficient than Stack/LinkedList for LIFO/FIFO operations


5. Thread-safe Collections

ClassDescription
Vector, HashtableLegacy synchronized
Collections.synchronizedList()Wrapper for any collection
ConcurrentHashMapHigh-concurrency map
CopyOnWriteArrayListGood for read-heavy workloads; copy on every write

Choose CopyOnWriteArrayList for config-heavy apps (low writes, frequent reads)


๐Ÿง  Interview Answer Template

“I chose LinkedHashMap because we needed fast lookups with insertion order preserved. Internally, it uses a doubly linked list in addition to a HashMap, giving O(1) access while keeping elements ordered. It was ideal for building an LRU cache mechanism in our API gateway.”


✅ Summary Table

NeedCollection
Fast access, no orderHashMap, HashSet
Maintain orderLinkedHashMap, LinkedHashSet
SortedTreeMap, TreeSet
Thread-safe readsConcurrentHashMap, CopyOnWriteArrayList
LIFO / StackArrayDeque
Priority-based queuePriorityQueue

How HashMap works internally?

 What is HashMap?

A HashMap<K, V> is a key-value data structure that allows fast lookups, inserts, and deletions using hashing. It is not thread-safe, allows one null key, and multiple null values.


⚙️ Internal Architecture (Java 8+)

A HashMap is internally backed by:

  • An array of Node<K,V> called the bucket array

  • Each bucket is a linked list (or a Red-Black tree if it grows too large)

  • Hash collisions are handled by chaining

Core Components:

static class Node<K,V> implements Map.Entry<K,V> { final int hash; final K key; V value; Node<K,V> next; }

๐Ÿ”„ Key Operations

1. Put (K key, V value)

Steps:

  1. Compute Hash of the key

    • Uses hashCode() and spreads it using hash(hashCode) = h ^ (h >>> 16) to reduce collisions

  2. Determine bucket index

    index = (n - 1) & hash;
  3. Check if key exists in bucket

    • If yes → update value

    • If no → insert a new node

  4. If bucket has > 8 elements, convert to Red-Black Tree for better performance

✅ Tree threshold is 8, resize threshold is 0.75 × capacity


2. Get (K key)

Steps:

  1. Compute the hash

  2. Find the index

  3. Traverse the bucket (linked list or tree)

  4. Return value if key matches


3. Resize / Rehashing

When the number of entries exceeds the load factor (default: 0.75), the capacity is doubled, and all existing nodes are rehashed and moved to new buckets.


๐Ÿ“Š HashMap vs TreeMap vs LinkedHashMap

FeatureHashMapTreeMapLinkedHashMap
OrderNoSortedInsertion
Backing StructureArray + LinkedList/TreeRed-Black TreeHashMap + Doubly LinkedList
PerformanceO(1)O(log n)O(1)

๐ŸŒณ Java 8 Optimization: Treeification

If too many entries hash to the same bucket (≥ 8), and capacity ≥ 64:

  • The bucket is converted from LinkedList → Tree (Red-Black Tree)

  • Improves worst-case get() and put() from O(n)O(log n)

If size drops below 6, the tree may untreeify back to a linked list.


๐Ÿ” Hash Collision Example

Two different keys with the same hashCode():

String s1 = "FB"; // hashCode = 2236 String s2 = "Ea"; // hashCode = 2236

These will go into the same bucket and form a chain (or tree if more collide).


๐Ÿ”ฅ Interview-Ready Answer

HashMap uses a combination of an array and linked list/tree to store key-value pairs. It computes the hashCode, spreads the bits to avoid collisions, and places the entry in the computed bucket. If too many keys collide into the same bucket, it uses treeification to improve performance. Resizing is triggered when the size exceeds the load factor. I chose HashMap in our application to ensure fast constant-time lookups for large datasets.”




Great! Let’s go deeper into the core concepts that make HashMap tick: hashing, hashCode(), equals(), and how it handles collisions. These are critical interview topics, especially for Java Architect and Senior Developer roles.


๐Ÿ”ข 1. Hashing in HashMap

Hashing is the process of converting a key into an integer (the hash code) that determines where the entry is stored in the bucket array.

Java’s Hashing Formula:

hash = hash(key.hashCode()) = h ^ (h >>> 16)

✅ This bit manipulation (XOR with right-shift) is a Java 8 optimization to reduce collision chances by spreading the hash bits.


๐Ÿงช 2. hashCode() and equals() Contracts

hashCode()

  • Returns an integer value for an object.

  • Should be consistent with equals():

    If a.equals(b) is true, then a.hashCode() == b.hashCode() must be true.

equals()

  • Used to determine object equality, especially within a bucket (after a hash match).

  • Important when keys collide (same hash, different objects).


๐Ÿ”„ Flow in HashMap.get(key):

  1. Compute hash from key.

  2. Go to bucket at index (n - 1) & hash.

  3. Check each node:

    • If node.hash == hash && node.key.equals(key) → return value.

So:

  • hashCode() determines bucket

  • equals() finds the exact key inside the bucket


⚠️ 3. What Are Hash Collisions?

A collision occurs when two different keys produce the same hash index.

String a = "FB"; // hashCode() = 2236 String b = "Ea"; // hashCode() = 2236

Both a and b will go to the same bucket.

HashMap handles collisions using:

  1. Chaining via Linked List (pre-Java 8)

  2. Tree (Red-Black Tree) after Java 8 if:

    • ≥ 8 entries in a bucket

    • HashMap capacity ≥ 64


๐Ÿ”ฅ Example Scenario

Map<String, String> map = new HashMap<>(); map.put("FB", "value1"); map.put("Ea", "value2");

Internally:

  • Both keys have same hashCode()

  • They go to the same bucket

  • HashMap uses equals() to distinguish between them

If:

map.get("FB"); // returns "value1" map.get("Ea"); // returns "value2"

Even though they have same hashCode, they're distinct due to equals().


๐Ÿ’ก Best Practices for Custom Keys

When using custom objects as keys in a HashMap:

✅ Override both hashCode() and equals() properly.

@Override public int hashCode() { return Objects.hash(id, name); } @Override public boolean equals(Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; MyKey other = (MyKey) o; return id == other.id && name.equals(other.name); }

๐Ÿง  Interview Answer Template

HashMap relies on hashCode() to determine the bucket and equals() to resolve exact matches inside the bucket. In the event of a collision, where two keys have the same hash, Java 8+ optimizes this by switching from a linked list to a red-black tree once the threshold is reached. This keeps lookup time near O(1) even under high collision scenarios. Ensuring consistent hashCode() and equals() implementations for custom keys is essential for correct behavior.”


No comments:

Post a Comment