HashMap works internally has become a popular
question in almost all the interview now a days for fresher as well as
experience people.so here is the complete description how HashMap works.
What is Map?
A Map is
an object that maps keys to values. A map cannot contain duplicate keys: Each
key can map to at most one value. It models the mathematical function abstraction.
The
Map
interface includes methods for
basic operations (such as put
, get
, remove
, containsKey
, containsValue
, size
, and empty
), bulk operations (such as putAll
and clear
), and collection views (such as keySet
, entrySet
, and values
).
In sort we can say-
1.
If we want to
represent a group of objects as "key-value" pair then we should go
for Map interface.
2.
Both key and
value are objects only.
3.
Duplicate keys
are not allowed but values can be duplicated
4.
Each key-value
pair is called "one entry".
What is HashMap?
The underlying data structure of HashMap is Hashtable.
In HashMapDuplicate keys are not allowed but values can be duplicated. In it Insertion
order is not preserved and it is based on hash code of the keys. HashMap allows
Heterogeneous objects for both key and value. Null insertion is allowed for
keys(only once) and for values(any number of times). It is best suitable for
Search operations. Default
initial capacity of HashMap is 16 and default fill is ratio "0.75".
Before
going to see internal working of HashMap we have to go through the some
terminology these are.
Hashing
Hashing is the process of
transforming of a String to a sorted fixed-length value, that is also known as Key
which represent the original String. And this sorter value helps to make the
indexing and searching fast. HashMap works on the principal of hashing.
Hash Function
hashCode() method is called Hash
Function.it contains Hash Function code. HashMap implementation calls
hashCode() on key object to calculate a hash that is used to find bucket where
Entry object will be stored. When get(Object key) method is used to retrieve
value ,again key object is used to calculate a hash which is used then to find
a bucket where that particular key is stored.
Hash Value
The value returned by the hash
function ,that is int type, is called Hash Value for key.
Bucket
A bucket is used to store key value
pairs. A bucket can have multiple Key-Value pairs in HashMap, Bucket uses
simple Linked List to store object in (Key-Value) pairs called Entry.
Map.Entry Interface
Each key-value pair is called one entry. Hence
Map is considered as a group of entry Objects, without existing Map object
there is no chance of existing entry object hence interface entry is define
inside Map interface(inner interface).
Equals ()
equals () method is used to compare
object for equality. If bucket is one and we have two object with same hashCode
then equals () method come to rescue and after comparison of key in each
entries using keys.equals () method ,until it return true.then the
corresponding Entry object value is returned.
To get understand the HshMap
Internals we have to know the contract between equals() and hashCode()
Contract between .equals() method and hashCode() method:
1. If 2 objects are equal by
.equals() method compulsory their hashcodes must be equal (or) same. That is Ifr1.equals(r2)
is true then r1.hascode()==r2.hashcode( ) must be true.
2. If 2 objects are not equal
by .equals() method then there are no restrictions on hashCode() methods. They
may be same (or) may be different. That is If r1.equals(r2) is false
then r1.hashCode()==r2.hashCode() may be same (or) may be different.
3. If hashcodes of 2 objects
are equal we can't conclude anything about .equals() method it may returns true
(or) false. That is If r1.hashCode()==r2.hashCode() is true then r1.equals(r2)
method may returns true (or) false.
4. If hashcodes of 2 objects
are not equal then these objects are always not equal by .equals() method also.
That is If r1.hashCode()==r2.hashCode() is false then r1.equals(r2) is
always false.
To maintain the above contract between .equals() and hashCode()
methods whenever we are overriding .equals() method compulsory we should
override hashCode() method.
Violation leads to no compile time error and runtime error but it is not good
programming practice
There
are two method that are used to add the Entry and retrieve the Entry from
HashMap.these are-
1.
put() method
2.
get() method
let
see how does the both method works internally.
How does put() method work internally?
- First, it checks
if the key given is null or not. If the given key is null, it will be
stored in the zero position, as the hashcode of null will be zero.
if (key == null)
return
putForNullKey(value);
- Then it applies
the hashcode to the key .hashCode() by calling the
hashcode method. In
order to get the value
within the limits of an array, the hash(key.hashCode()) is called, which
performs some shifting operations on the hashcode and provide the hash
value.
int
hash = hash(key.hashCode());
- The indexFor() method
is used to get the exact location to store the Entry object.
int
i = indexFor(hash, table.length);
- Then comes the
most important part what happens if two different object has the same
hashcode and will it be stored in the same bucket. To handle this let's
think of the LinkedList in data structure it will have a next attribute
which will always point to the next object. The same way the next
attribute in the Entry class points to the next object. Using this
different objects with the same hashcode will be placed next to each
other.
if (e.hash == hash && ((k = e.key) == key ||
key.equals(k)))
{
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return
oldValue;
}
- In the case of
the Collision, the HashMap checks for the value of the next attribute if
it is null it inserts the Entry object in that location,
if next attribute is not null then it keeps the loop running till next
attribute is null then stores the Entry object there.
public V put(K key, V value)
{
if (key == null)
return
putForNullKey(value);
int hash =
hash(key.hashCode());
int i =
indexFor(hash, table.length);
for
(Entry e = table[i]; e != null;
e = e.next)
{
Object k;
if (e.hash ==
hash && ((k = e.key) == key || key.equals(k)))
{
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return
oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
Now Let see how the
get() method works.
How does get() method work internally?
- First, it gets
the hash code of the key object, which is passed, and finds the bucket
location.
int hash = hash(key.hashCode());
- If the correct
bucket is found, it returns the value (e.value).
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return
e.value;
- If no match is
found, it returns null.
public V get(Object key)
{
if (key == null)
return
getForNullKey();
int hash =
hash(key.hashCode());
for
(Entry e = table[indexFor(hash, table.length)];e != null;e
= e.next)
{
Object k;
if (e.hash ==
hash && ((k = e.key) == key || key.equals(k)))
return
e.value;
}
return null;
}
There are some
question arises while inserting Entry in HashMap. these are-
1. What
happen when two different key have same hashCode?
2. What
happen when two key are same and have same hashCode?
So the solution can
be done via equals() method it comes in front of and works like as-
1.
equals() method finds such
key already exist in bucket if not found then a new node is created with the
map entry and stored with in the same bucket .a Linked List is used to store
these node.
2.
If equals() method returns
true means that the key is already exists in the bucket, in that case the new
value will overwrite the old value for the matched key.