This is one of my favorite core Java interview questions as not knowing this can cause subtle and intermittent issues in Java. The issues that arise from not understanding equals( ) and hashCode( ) contract can be hard to debug.
Q. When providing a user defined key class for storing objects in a Map (e.g. HashMap), what methods do you have to provide or override (i.e. method overriding)?
A. You should override the equals( ... ) and hashCode( .. ) methods from the Object class. The default implementation of the equals( ... ) and hashCode( ... ) methods are inherited from the java.lang.Object class uses an object instance’s memory location (e.g. MyObject@6c60f2ea). This can cause problems when two instances of car objects have the same color but the inherited equals( .. ) will return false because it uses the memory location as oppose to the color, which is different for the two instances.
The hashCode and equals methods are used by the Map and Set interfaces to store and retrieve objects. The following diagram shows how these methods are used.
As you can see from the above diagram, the hashCode(..) method is used to evaluate the bucket index to store the keys. Since more than one object like "John" and "Peter" can result with the same hashCode, a list of keys are maintained at that index to store all the keys.When retrieving a value with a key, the hashCode(..) method is called first to determine the hashValue for the bucket, and then the equals(...) method is invoked to loop through the keys like "John", "Peter", etc and pick the right key to retrieve the corresponding value.
Q. What are the primary considerations when implementing a user defined key?
A. The user defined key class must consider the following.
- If the class overrides the equals( ... ) method, it must also override the hashCode( ... ) method.
- If 2 objects are equal, then their hashCode values must be equal as well. But, if two objects return the same hashCode value does not mean that those two objects are equal.
- It is a best practice to implement the user defined key class as an immutable object.
- If it is accessed often, the hashCode( ... ) method is a good candidate for caching to enhance performance.
Q. Why it is a best practice to implement the user defined key class as an immutable object?
A.
Problem: As per the code snippet shown below, if you use a mutable user defined class “UserKey” as a Map key and subsequently if you mutate the key after the object has been added to the Map via the setter method key.setName(“Sam”), you will not be able to access the object later on. The original key will still be in the Map and you can still iterate through your Map and print it, but you cannot access it with map.get(key) and querying it with map.containsKey(key) will return false because the key “John” becomes “Sam” after being modified in the “List of keys” at the key index with the hash value of “345678965” as shown in the diagram. When you print the keys, it will print "Sam" twice, once for index “345678965” and again for index "76854676". These types of errors are subtle and can be very hard to trace and fix without understanding this basic relating to how the map and the methods hashCode and equals work as demonstrated in the diagram.
Map myMap = new HashMap(10);
//add the key “John”
UserKey key = new UserKey(“John”); //Assume UserKey class is mutable
myMap.put(key, “Sydney”);
//now key value "John" is modified to key value “Sam”
key.setName(“Sam”); // This line modifies the key value “John” to “Sam” in the “List of keys”
// as shown in the diagram above. This means that the key “John” cannot be
// accessed. There will be two keys with “Sam” in positions with hash
// values 345678965 and 76854676.
myMap.put(key, “Melbourne”);
myMap.get(new UserKey(“John”)); // key cannot be accessed. The key hashes to
// the same hash index position 345678965
// in the “Key index array” but cannot be found
// in the “List of keys”
Solution: Generally you use a java.lang.Integer or a java.lang.String class as the key, which are immutable Java objects. If you define your own key class then it is a best practice to make the key class an immutable object. If a programmer wants to insert a new key, then he/she will always have to instantiate a new object as he/she cannot mutate an existing key.
Map myMap = new HashMap(10);
//add the key “John”
UserKey key1 = new UserKey(“John”); //Assume UserKey is immutable
myMap.put(key1, “Sydney”);
//add the key “Sam”
UserKey key2 = new UserKey(“Sam”); //Since UserKey is immutable, new instance is created.
myMap.put(key2, “Melbourne”);
myMap.get(new UserKey(“John”)); //Now the key can be accessed
Similar issues are possible with the Set (e.g. HashSet) as well. If you add an object to a “Set” and subsequently modify the added object and later on try to query the original object it may not be present. mySet.contains(originalObject) may return false.
Q. If Java did not have a HashMap implementation, how would you go about implementing your own?
A. Writing a HashMap is not a trivial task. It is also not a good practice to reinvent the wheel. The interviewer is trying to evaluate your level of technical knowledge and thought process. What really important here are, how you approach the problem?, how good your understanding of the data structures are?, and how well you can logically think and code?.
- Firstly, decide on the backing data structure. Arrays are fast and memory efficient. Hence an array can be used to back up the map. This will be an indexed bucket.
- Decide on a hashing algorithm to index the array and store the entries in a particular slot of the array.
- Decide on a strategy to store two or more keys that result in the same hash code value. More objects can be linked with each other occupying the same index. This means each bucket can contain 0..N entries. This linking strategy is shown in the diagram.
- Decide on an approach to evaluate the hash code, which determines the array bucket index to use.
int index =Math.abs(key.hashCode( ) % buckets.length);
- Come up with a strategy for resizing the backing array when the capacity is reached. This process is known as rehashing whereby existing items are mapped to new bucket locations.
- Consider implementing the Map interface and fill in the empty methods that need to be implemented.
Note: The above questions is explained in more detail with working code and more core Java interview questions are discussed with clear answers in my book "Core Java Career Essentials".