
Limiting the Max Size of a HashMap in Java


1. Overview

HashMap is a well-known class from the Java Collections library. It implements the Map interface and allows the storage of key-value pairs. An instance of HashMap does not have a limitation on its number of entries. In some specific scenarios, we might want to change this behavior. In this tutorial, we’ll look at a few possible ways of enforcing a size limit on a HashMap.

2. Notions on Java HashMap

The core of a HashMap is essentially a hash table. A hash table is a fundamental data structure based on two other basic structures: arrays and linked lists.

2.1. Inner Structure

An array is the basic storage entity of a HashMap. Each position of the array contains a reference to a linked list. The linked list can hold a chain of entries, each made of a key and a value. Both keys and values are Java objects, not primitive types, and keys are unique. The Map interface, which HashMap implements, defines the put method as:

V put(K key, V value)

put() uses a so-called “hash function” that calculates a number called the “hash” from the input key. Then, starting from the hash and based on the current size of the array, it calculates the index at which to insert the entry.

Different keys can produce the same hash value and, thus, the same index. This results in a collision, and when this happens, the new entry is appended to the linked list at that index.
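To make the index calculation more concrete, here’s a simplified sketch of how a key’s hashCode() could be mapped to an array index. The bit-spreading step mirrors what OpenJDK’s HashMap does internally, but this is an implementation detail and may change between versions:

// Simplified illustration, not the real HashMap implementation
static int bucketIndex(Object key, int tableLength) {
    if (key == null) {
        return 0; // HashMap stores the null key in bucket 0
    }
    int h = key.hashCode();
    int spread = h ^ (h >>> 16);       // mix the high bits into the low bits
    return (tableLength - 1) & spread; // works because tableLength is a power of two
}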

If the number of entries in the linked list is greater than a specific threshold, defined by the TREEIFY_THRESHOLD constant, HashMap replaces the linked list with a tree, improving the runtime performance from O(n) to O(log(n)).

2.2. Resizing, Re-Hashing, and the Load Factor

From the standpoint of performance, the ideal situation is the one in which the entries are spread over the whole array, with a maximum of one entry per position. As the number of entries grows, though, the number of conflicts rises, and so does the linked list size in each position.

To maintain a situation as close as possible to the ideal one, the HashMap resizes itself when the number of entries reaches a certain limit and then performs a recalculation of the hashes and indexes.

HashMap resizes itself based on the “load factor”. The load factor is the maximum ratio between the number of entries and the number of available positions that is allowed before a resize and re-hash take place. The default load factor of HashMap is 0.75f.
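For example, with the default initial capacity of 16 and the default load factor of 0.75, the internal array is resized once the map holds more than 16 * 0.75 = 12 entries. Both values can also be set explicitly through the dedicated constructor:

// Initial capacity of 32 and load factor of 0.75: the internal array
// is doubled once the map holds more than 32 * 0.75 = 24 entries
Map<String, Integer> map = new HashMap<>(32, 0.75f);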

3. Scenarios in Which the Need for a Max Size of HashMap Could Arise

The size of a HashMap is limited only by the Java Virtual Machine memory available. This is how the class has been designed, and it’s consistent with the regular use cases for the hash table data structure.

Still, there might be some specific scenarios in which we may need to impose a custom limit. For instance:

  • Implementing a cache
  • Gathering metrics on a HashMap under well-defined conditions, avoiding its automatic resizing phase

As we’ll see in the following sections, for the first case, we can use the LinkedHashMap extension and its removeEldestEntry method. For the second case, we have to implement a custom extension of HashMap instead.

These scenarios are far from the original purpose of the HashMap design. A cache is a broader concept than a simple map, and the need to extend the original class makes it impossible to test it as if it were a pure black box.

4. Limiting Max Size Using LinkedHashMap

One possible way of limiting the max size is using LinkedHashMap, a subclass of HashMap. LinkedHashMap combines a HashMap with a doubly-linked list: each entry keeps a pointer to the previous and next entry. It can, therefore, maintain the insertion order of its entries, something a plain HashMap doesn’t guarantee.

4.1. Example with LinkedHashMap

We can limit the max size by overriding the removeEldestEntry() method. This method returns a boolean and is internally invoked to decide if the eldest entry should be removed after a new insert.

The eldest entry is either the least recently inserted or the least recently accessed entry, depending on whether we create the LinkedHashMap instance with the accessOrder constructor parameter set to false (the default) or to true.
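As a quick illustration, the three-argument constructor lets us switch to access order; reading an entry then moves it to the end of the iteration order, so the least recently accessed entry becomes the eldest one (the capacity and load factor passed here are just the usual defaults):

LinkedHashMap<Integer, String> accessOrdered = new LinkedHashMap<>(16, 0.75f, true);
accessOrdered.put(1, "One");
accessOrdered.put(2, "Two");
accessOrdered.put(3, "Three");

accessOrdered.get(1); // accessing key 1 moves it to the end of the iteration order

System.out.println(accessOrdered.keySet()); // [2, 3, 1] - key 2 is now the eldest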

By overriding the removeEldestEntry() method, we define our own rule:

@Test
void givenLinkedHashMapObject_whenAddingNewEntry_thenEldestEntryIsRemoved() {
    final int MAX_SIZE = 4;
    LinkedHashMap<Integer, String> linkedHashMap;
    linkedHashMap = new LinkedHashMap<Integer, String>() {
        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
            return size() > MAX_SIZE;
        }
    };
    linkedHashMap.put(1, "One");
    linkedHashMap.put(2, "Two");
    linkedHashMap.put(3, "Three");
    linkedHashMap.put(4, "Four");
    linkedHashMap.put(5, "Five");
    String[] expectedArrayAfterFive = { "Two", "Three", "Four", "Five" };
    assertArrayEquals(expectedArrayAfterFive, linkedHashMap.values()
        .toArray());
    linkedHashMap.put(6, "Six");
    String[] expectedArrayAfterSix = { "Three", "Four", "Five", "Six" };
    assertArrayEquals(expectedArrayAfterSix, linkedHashMap.values()
        .toArray());
}

In the above example, we created an instance of LinkedHashMap, overriding its removeEldestEntry method. Then, whenever adding a new entry pushes the number of entries above the given limit, the instance removes its eldest entry right after inserting the new one.

In the test case, we set a max size of 4 and fill the LinkedHashMap object with four entries. We can see that after each further insert, the number of entries is still 4: the LinkedHashMap instance removes its eldest entry each time.

We must note that this is not a strict interpretation of the requirement to limit the maximum size, as insertions are still allowed: the maximum size is maintained by removing the eldest entries.

We can implement a so-called LRU (Least Recently Used) cache by creating the LinkedHashMap through the constructor that accepts the accessOrder parameter and setting it to true.
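A minimal sketch of such an LRU cache, reusing the removeEldestEntry() idea from above, might look like this (the class name and maxSize field are ours, not part of any library):

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public LruCache(int maxSize) {
        // accessOrder = true: iteration order follows access order,
        // so the eldest entry is the least recently used one
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }
}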

4.2. Performance Considerations

A regular HashMap has the performance that we would expect from a hash table data structure. LinkedHashMap, on the other hand, has to keep the order of its entries, making its insertion and deletion operations slower. The access operation’s performance does not change unless we set the accessOrder flag to true.

5. Limiting Max Size Using a Custom HashMap

Another possible strategy is extending the HashMap with a custom implementation of the put method. We can make this implementation raise an exception when the HashMap instance reaches an imposed limit.

5.1. Implementing the Custom HashMap

For this approach, we have to override the put method, which makes a preliminary check:

public class HashMapWithMaxSizeLimit<K, V> extends HashMap<K, V> {
    private int maxSize = -1;

    // Default constructor: no size limit
    public HashMapWithMaxSizeLimit() {
        super();
    }

    public HashMapWithMaxSizeLimit(int maxSize) {
        super();
        this.maxSize = maxSize;
    }

    @Override
    public V put(K key, V value) {
        // Allow the insert if there is no limit, the key is already present (replacement),
        // or the limit has not been reached yet
        if (this.maxSize == -1 || this.containsKey(key) || this.size() < this.maxSize) {
            return super.put(key, value);
        }
        throw new RuntimeException("Max size exceeded!");
    }
}

In the above example, we have an extension of HashMap. It has a maxSize attribute with -1 as the default value, which implicitly means “no limit”; the no-argument constructor keeps this default, while a dedicated constructor lets us set an explicit limit. In the put implementation, if maxSize equals the default, the key is already present, or the number of entries is lower than the specified maxSize, the extension calls the corresponding method of the superclass.

If the extension does not meet the above conditions, it raises an unchecked exception. We cannot use a checked exception because the put method does not declare any, and we cannot change its signature when overriding it. In our example, this limit can still be circumvented by using putAll. To avoid this, we may also want to improve the example by overriding putAll with the same logic, as sketched below.

To keep the example simple, we haven’t redefined all the constructors of the superclass. We should do it, though, in a real use case to keep the original design of HashMap. In that case, we should use the max size logic also in the HashMap(Map<? extends K, ? extends V> m) constructor because it adds all the entries of the map argument without using putAll.
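One possible way to close both gaps, sketched on top of our example class, is to route putAll() through our size-checked put() and to add a copy constructor that applies the same limit (both members below are our own additions, not part of HashMap):

// Delegates to our size-checked put() so the limit cannot be bypassed
@Override
public void putAll(Map<? extends K, ? extends V> m) {
    for (Map.Entry<? extends K, ? extends V> entry : m.entrySet()) {
        this.put(entry.getKey(), entry.getValue());
    }
}

// Copy constructor that applies the same limit to the initial entries
public HashMapWithMaxSizeLimit(Map<? extends K, ? extends V> m, int maxSize) {
    super();
    this.maxSize = maxSize;
    this.putAll(m);
}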

This solution follows the requirements in a stricter way than the one described in the previous section. In this case, the HashMap inhibits any insert operation beyond the limit, without removing any existing element.

5.2. Testing the Custom HashMap

As an example, we can test the above custom class:

private final int MAX_SIZE = 4;
private HashMapWithMaxSizeLimit<Integer, String> hashMapWithMaxSizeLimit;
@Test
void givenCustomHashMapObject_whenThereIsNoLimit_thenDoesNotThrowException() {
    hashMapWithMaxSizeLimit = new HashMapWithMaxSizeLimit<Integer, String>();
    assertDoesNotThrow(() -> {
        for (int i = 0; i < 10000; i++) {
            hashMapWithMaxSizeLimit.put(i, i + "");
        }
    });
}
@Test
void givenCustomHashMapObject_whenLimitNotReached_thenDoesNotThrowException() {
    hashMapWithMaxSizeLimit = new HashMapWithMaxSizeLimit<Integer, String>(MAX_SIZE);
    assertDoesNotThrow(() -> {
        for (int i = 0; i < 4; i++) {
            hashMapWithMaxSizeLimit.put(i, i + "");
        }
    });
}
@Test
void givenCustomHashMapObject_whenReplacingValueWhenLimitIsReached_thenDoesNotThrowException() {
    hashMapWithMaxSizeLimit = new HashMapWithMaxSizeLimit<Integer, String>(MAX_SIZE);
    assertDoesNotThrow(() -> {
        hashMapWithMaxSizeLimit.put(1, "One");
        hashMapWithMaxSizeLimit.put(2, "Two");
        hashMapWithMaxSizeLimit.put(3, "Three");
        hashMapWithMaxSizeLimit.put(4, "Four");
        hashMapWithMaxSizeLimit.put(4, "4");
    });
}
@Test
void givenCustomHashMapObject_whenLimitExceeded_thenThrowsException() {
    hashMapWithMaxSizeLimit = new HashMapWithMaxSizeLimit<Integer, String>(MAX_SIZE);
    Exception exception = assertThrows(RuntimeException.class, () -> {
        for (int i = 0; i < 5; i++) {
            hashMapWithMaxSizeLimit.put(i, i + "");
        }
    });
    String messageThrownWhenSizeExceedsLimit = "Max size exceeded!";
    assertEquals(messageThrownWhenSizeExceedsLimit, exception.getMessage());
}

In the above code, we have four test cases:

  • We instantiate the class with the default no-limit behavior. Even if we add a high number of entries, we don’t expect any exceptions.
  • We instantiate the class with a maxSize of 4. If we don’t reach the limit, we don’t expect any exceptions.
  • We instantiate the class with a maxSize of 4. If we have reached the limit and we replace a value, we don’t expect any exceptions.
  • We instantiate the class with a maxSize of 4. If we exceed the limit, we expect a RuntimeException with a specific message.

6. Conclusion

The need to enforce a limit for the maximum size of HashMap might arise in some specific scenarios. In this article, we’ve seen some examples that address this requirement.

Extending LinkedHashMap and overriding its removeEldestEntry() method can be useful if we want to quickly implement an LRU cache or something similar. If we want to enforce a stricter bound, we can extend HashMap and override its insertion methods to fit our needs.

As usual, the example code is available over on GitHub.
