3 Jan 2018

How HashMap internally works.

HashMap works internally has become a popular question in almost all the interview now a days for fresher as well as experience people.so here is the complete description how HashMap works.

What is Map?
A Map is an object that maps keys to values. A map cannot contain duplicate keys: Each key can map to at most one value. It models the mathematical function abstraction. The Map interface includes methods for basic operations (such as putgetremovecontainsKeycontainsValuesize, and empty), bulk operations (such as putAll and clear), and collection views (such as keySetentrySet, and values).
In sort we can say-

    1.   If we want to represent a group of objects as "key-value" pair then we should go for Map  interface.
    2.   Both key and value are objects only.
    3.   Duplicate keys are not allowed but values can be duplicated
     4.    Each key-value pair is called "one entry".

What is HashMap?
The underlying data structure of HashMap is Hashtable. In HashMapDuplicate keys are not allowed but values can be duplicated. In it Insertion order is not preserved and it is based on hash code of the keys. HashMap allows Heterogeneous objects for both key and value. Null insertion is allowed for keys(only once) and for values(any number of times). It is best suitable for Search operations. Default initial capacity of HashMap is 16 and default fill is ratio "0.75".
 
Before going to see internal working of HashMap we have to go through the some terminology these are.

Hashing
Hashing is the process of transforming of a String to a sorted fixed-length value, that is also known as Key which represent the original String. And this sorter value helps to make the indexing and searching fast. HashMap works on the principal of hashing.

Hash Function
hashCode() method is called Hash Function.it contains Hash Function code. HashMap implementation calls hashCode() on key object to calculate a hash that is used to find bucket where Entry object will be stored. When get(Object key) method is used to retrieve value ,again key object is used to calculate a hash which is used then to find a bucket where that particular key is stored.

Hash Value
The value returned by the hash function ,that is int type, is called Hash Value for key.

Bucket
A bucket is used to store key value pairs. A bucket can have multiple Key-Value pairs in HashMap, Bucket uses simple Linked List to store object in (Key-Value) pairs called Entry.

Map.Entry Interface
Each key-value pair is called one entry. Hence Map is considered as a group of entry Objects, without existing Map object there is no chance of existing entry object hence interface entry is define inside Map interface(inner interface).
 
Equals ()
equals () method is used to compare object for equality. If bucket is one and we have two object with same hashCode then equals () method come to rescue and after comparison of key in each entries using keys.equals () method ,until it return true.then the corresponding Entry object value is returned.
To get understand the HshMap Internals we have to know the contract between equals() and  hashCode()
Contract between .equals() method and hashCode() method:
1.   If 2 objects are equal by .equals() method compulsory their hashcodes must be equal (or) same. That is Ifr1.equals(r2) is true then r1.hascode()==r2.hashcode( ) must be true.
2.   If 2 objects are not equal by .equals() method then there are no restrictions on hashCode() methods. They may be same (or) may be different. That is If r1.equals(r2) is false then r1.hashCode()==r2.hashCode() may be same (or) may be different.
3.   If hashcodes of 2 objects are equal we can't conclude anything about .equals() method it may returns true (or) false. That is If r1.hashCode()==r2.hashCode() is true then r1.equals(r2) method may returns true (or) false.
4.   If hashcodes of 2 objects are not equal then these objects are always not equal by .equals() method also. That is If r1.hashCode()==r2.hashCode() is false then r1.equals(r2) is always false.
To maintain the above contract between .equals() and hashCode() methods whenever we are overriding .equals() method compulsory we should override hashCode() method. Violation leads to no compile time error and runtime error but it is not good programming practice

There are two method that are used to add the Entry and retrieve the Entry from HashMap.these are-
          1.   put() method
         2.   get() method
let see how does the both method works internally.

How does put() method work internally?
  1. First, it checks if the key given is null or not. If the given key is null, it will be stored in the zero position, as the hashcode of null will be zero.
if (key == null)
       return putForNullKey(value);

  1. Then it applies the hashcode to the key .hashCode() by calling the hashcode method. In
order to get the value within the limits of an array, the hash(key.hashCode()) is called, which performs some shifting operations on the hashcode and provide the hash value.
int hash = hash(key.hashCode());
  1. The indexFor() method is used to get the exact location to store the Entry object.
int i = indexFor(hash, table.length);
  1. Then comes the most important part what happens if two different object has the same hashcode and will it be stored in the same bucket. To handle this let's think of the LinkedList in data structure it will have a next attribute which will always point to the next object. The same way the next attribute in the Entry class points to the next object. Using this different objects with the same hashcode will be placed next to each other.
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
         {
             V oldValue = e.value;
             e.value = value;
             e.recordAccess(this);
             return oldValue;
          }

  1. In the case of the Collision, the HashMap checks for the value of the next attribute if it is null it inserts the Entry object in that location, if next attribute is not null then it keeps the loop running till next attribute is null then stores the Entry object there.
public V put(K key, V value)
{
    if (key == null)
       return putForNullKey(value);
       int hash = hash(key.hashCode());
       int i = indexFor(hash, table.length);
       for (Entry e = table[i]; e != null; e = e.next)
       {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            {
                 V oldValue = e.value;
                 e.value = value;
                 e.recordAccess(this);
                 return oldValue;
             }
     }
        modCount++;
        addEntry(hash, key, value, i);
        return null;
 }

Now Let see how the get() method works.
How does get() method work internally?
  1. First, it gets the hash code of the key object, which is passed, and finds the bucket location.
int hash = hash(key.hashCode());
  1. If the correct bucket is found, it returns the value (e.value).
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
    return e.value;

  1. If no match is found, it returns null.
public V get(Object key)
{
    if (key == null)
       return getForNullKey();
     int hash = hash(key.hashCode());
     for (Entry e = table[indexFor(hash, table.length)];e != null;e = e.next)
     {
         Object k;
         if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
             return e.value;
     }
         return null;
}

There are some question arises while inserting Entry in HashMap. these are-

         1.   What happen when two different key have same hashCode?

         2.   What happen when two key are same and have same hashCode?

So the solution can be done via equals() method it comes in front of and works like as-
        1.   equals() method finds such key already exist in bucket if not found then a new node is created with the map entry and stored with in the same bucket .a Linked List is used to store these node.

        2.   If equals() method returns true means that the key is already exists in the bucket, in that case the new value will overwrite the old value for the matched key.