For open address hashing in Java, how do you get the maximum number of collisions and how do you find the average length of any chain in the table?
Expert Answer
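A collision happens when two different keys produce the same hash code. For example, with Java's String.hashCode():
“Aa” = ‘A’ * 31 + ‘a’ = 65 * 31 + 97 = 2112
“BB” = ‘B’ * 31 + ‘B’ = 66 * 31 + 66 = 2112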
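The arithmetic above can be checked with a few lines of Java. This is only a minimal sketch: the class name `HashCollisionDemo` and the helper `polyHash` are made up for illustration, and the helper simply recomputes the documented `String.hashCode()` polynomial by hand.

```java
class HashCollisionDemo {
    // Recomputes the String.hashCode() polynomial by hand:
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polyHash("Aa"));                      // 2112
        System.out.println(polyHash("BB"));                      // 2112
        System.out.println("Aa".hashCode() == "BB".hashCode());  // true
    }
}
```

Since both strings hash to 2112, they land at the same index for every table size.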
There are several ways to resolve collisions. One of them is to put the keys that collide at the same index into a linked list. The hashtable is then an array of lists, and this technique is called separate chaining. With chaining, the maximum number of collisions is found by scanning the table for its longest list: a list holding n keys means that n - 1 of them collided with the first key stored at that index.
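One way to measure this is to walk the array and record the size of each list. The sketch below is an illustration under assumptions, not a reference implementation: the class `ChainedTable` and its methods `longestChain` and `averageChainLength` are hypothetical names, keys are limited to strings, and the average is taken over all slots, including empty ones.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedTable {
    private final List<List<String>> buckets = new ArrayList<>();
    private int size; // total number of keys stored

    ChainedTable(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    private int indexFor(String key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    // Appends the key to the list at its index; duplicates are ignored.
    boolean add(String key) {
        List<String> chain = buckets.get(indexFor(key));
        if (chain.contains(key)) {
            return false;
        }
        chain.add(key);
        size++;
        return true;
    }

    // Length of the longest list, i.e. the worst collision pile-up.
    int longestChain() {
        int max = 0;
        for (List<String> chain : buckets) {
            max = Math.max(max, chain.size());
        }
        return max;
    }

    // Average list length over all slots: keys / slots.
    double averageChainLength() {
        return (double) size / buckets.size();
    }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable(8);
        for (String k : new String[] {"Aa", "BB", "C"}) {
            t.add(k);
        }
        System.out.println(t.longestChain());       // 2 ("Aa" and "BB" share index 0)
        System.out.println(t.averageChainLength()); // 0.375 (3 keys / 8 slots)
    }
}
```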
The big attraction of a hashtable is constant-time performance for the basic operations add, remove, contains, and size. Because of collisions, however, we cannot guarantee constant runtime in the worst case. Imagine that all our objects collide at the same index. Then searching for one of them is equivalent to searching a list, which takes linear time. We can still guarantee expected constant runtime if we make sure that the lists do not become too long. This is usually implemented by maintaining a load factor, the number of stored elements divided by the number of array slots, which is exactly the average length of the lists. When the load factor reaches a preset threshold, we create a bigger array and rehash all elements from the old table into the new one.
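The load-factor check and the rehash step might look roughly like this. Everything named here is an assumption for the sake of the example: the class `ResizingChainedSet`, the starting capacity of 4, the doubling policy, and the threshold of 0.75 (which happens to be the default load factor of `java.util.HashMap`).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ResizingChainedSet {
    private static final double MAX_LOAD = 0.75; // assumed threshold
    private List<List<String>> buckets = newBuckets(4); // assumed start size
    private int size;

    private static List<List<String>> newBuckets(int capacity) {
        List<List<String>> b = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) {
            b.add(new LinkedList<>());
        }
        return b;
    }

    private static int indexFor(String key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    int capacity() {
        return buckets.size();
    }

    // Load factor = stored keys / slots = average list length.
    double loadFactor() {
        return (double) size / buckets.size();
    }

    boolean add(String key) {
        List<String> chain = buckets.get(indexFor(key, buckets.size()));
        if (chain.contains(key)) {
            return false;
        }
        chain.add(key);
        size++;
        if (loadFactor() > MAX_LOAD) {
            rehash(buckets.size() * 2); // grow before the lists get long
        }
        return true;
    }

    boolean contains(String key) {
        return buckets.get(indexFor(key, buckets.size())).contains(key);
    }

    // Every key must be re-indexed because the index depends on the capacity.
    private void rehash(int newCapacity) {
        List<List<String>> old = buckets;
        buckets = newBuckets(newCapacity);
        for (List<String> chain : old) {
            for (String key : chain) {
                buckets.get(indexFor(key, newCapacity)).add(key);
            }
        }
    }
}
```

With these assumed settings, adding a fourth key to the 4-slot table pushes the load factor to 1.0, which triggers a rehash into 8 slots and brings it back down to 0.5.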
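Another collision resolution technique is linear probing, which is a form of open addressing. If we cannot insert at index k, we try the next slot k+1. If that one is occupied, we go to k+2, and so on, wrapping around to the start of the array when we pass the end. This is quite a simple approach, but it requires thinking about the hashtable differently: every key is stored directly in the array, so there are no lists, and a lookup has to follow the same sequence of slots that the insertion did.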
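Linear probing can be sketched as follows. This is a simplified illustration, not a complete open-addressing table: the class `LinearProbingSet` is a hypothetical name, the capacity is fixed, and removal is left out. Its `add` method returns how many occupied slots were skipped, which is one way to count the collisions a single insertion ran into.

```java
class LinearProbingSet {
    private final String[] slots; // null marks an empty slot

    LinearProbingSet(int capacity) {
        slots = new String[capacity];
    }

    private int home(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    // Returns the number of occupied slots skipped before the key was placed,
    // or -1 if the key was already present.
    int add(String key) {
        int k = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[k] == null) {
                slots[k] = key;
                return probes;
            }
            if (slots[k].equals(key)) {
                return -1;
            }
            k = (k + 1) % slots.length; // try the next slot, wrapping around
        }
        throw new IllegalStateException("table is full");
    }

    // Follows the same probe sequence as add and stops at the first empty slot.
    boolean contains(String key) {
        int k = home(key);
        for (int i = 0; i < slots.length && slots[k] != null; i++) {
            if (slots[k].equals(key)) {
                return true;
            }
            k = (k + 1) % slots.length;
        }
        return false;
    }

    public static void main(String[] args) {
        LinearProbingSet s = new LinearProbingSet(8);
        System.out.println(s.add("Aa"));      // 0: home slot 0 was free
        System.out.println(s.add("BB"));      // 1: slot 0 taken, placed in slot 1
        System.out.println(s.contains("BB")); // true
    }
}
```

Taking the largest value returned by `add` over all insertions gives the longest probe sequence seen so far, the open-addressing counterpart of the longest list in separate chaining.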