Internal Working of HashMap

Hi All,

This is the most popular question in interviews and definitely an important collection to understand. So, starting with the internal working of HashMap, I will be covering below concepts:

1) Hashing
2) Bucketing
3) Collision
4) Rehashing


Let’s start with the statement with which you declare a HashMap. I will be creating a map of type String, String and will be using default constructor to make life a little simpler 🙂

 Map <String,String> mapObject = new HashMap<>();

Now, the above piece of code will execute the default constructor, in which HashMap will initiate the default load factor with a value of 0.75f. The load factor helps HashMap to decide when to double the number of buckets depending upon the current load of HashMap.

Now, once you start adding an element to a HashMap, a couple of tasks are executed sequentially.

 mapObject.put("key1","obj1");
  • Task 1: Calculating Hash of Key
  • Task 2: Identifying bucket for Key
  • Task 3: Handling Collision

Task 1: Calculating Hash of Key

The way hash is calculated internally for a key is by executing xor and unsigned right shift operation on current hashCode of key.

Step 1: Call hashCode function on key, which will call the object the hashCode of Object if not overridden or will call the hashCode of Class where it is overridden.
Step 2: Do Unsigned Right Shift of 16 bits on the value received in Step 1
Step 3: Execute XOR operation on Values received in Step 1 and Step 2

The value received in Step 3 will now be used as an internal hash of the key in HashMap.

 static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

Task 2: Identifying bucket for the key

Once the internal hash is calculated, the next task is to identify in which bucket this key should go. The way it is identified is by doing an AND operation between the internal hash of the key and size of the bucket (which is default 16). Whatever will be the result of this operation, Entry (Key, Value) Pair will be stored on the same index of bucket array.

i = (n - 1) & hash

Task 3: Handling Collision

Collision occurs when two different keys fall into same bucket. Now that can be because buckets identified by AND operation resulted into same index even though the internal and actual hash code was different or both the keys may be unequal but have generated the same hash code. In this case, the entries will be stored as a linked list. From JAVA8 onwards, as part of JEP 180, an improvement has been made to switch from linked list to balanced tree when number of collisions grows beyond a certain threshold. The below code in HashMap is doing this magic:


 if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }

Rehashing

Now, there is another concept which is called rehashing. This happens when load of map crosses the limit of load factor. After adding a new element in HashMap, if it crosses the load factor limit threshold ( which is calculated as capacity * load factor) , the HashMap will double the size of the buckets and now will again identify the bucket for each entry and put every entry into new bucket array. You can find the below piece of code in the end of putVal function.


     if (++size > threshold)
            resize();
        afterNodeInsertion(evict);

That’s all for now in terms of internal working of HashMap. Please share your feedback and ideas if you want me to cover anything more in this post.

Leave a Reply

Your email address will not be published. Required fields are marked *