The new factory methods in map collections since JDK 19

Did you know that JDK 19 added new factory methods to the “map” collections based on a hash table? These methods weren’t widely publicized, so I suspect many developers don’t know they exist. This short article gives an overview of this small new feature.

Introduction

The Java Collections Framework provides several “map” collections that are internally based on a hash table, to name a few: HashMap, LinkedHashMap, HashSet, and LinkedHashSet. If you look at the Javadoc documentation of HashMap, for example, you can see it has four constructors. Two of these constructors are of particular interest for this article:

  • public HashMap(int initialCapacity)
  • public HashMap(int initialCapacity, float loadFactor)

Suppose you must create a HashMap with the correct initial capacity to add 100 entries without causing intermediate rehashings. You may be tempted to use new HashMap<X,Y>(100), but that’s incorrect! The main issue with these constructors is that the initialCapacity parameter does not indicate the number of mappings (or elements). Instead, it is the capacity of the internal hash table, and that table is resized before it becomes completely full.

The three factors in collections based on a hash table

There are three essential factors in all collections based on a hash table:

  • the logical size (number of mappings or elements)
  • the hash table’s physical capacity
  • the load factor

The Javadoc documentation is pretty clear about the load factor: “The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased.”

In practice, the hash table is rehashed (its structure is rebuilt using a larger hash table) when the number of entries is greater than the product of the current capacity and the load factor.

Example case

Consider the following (fictional) scenario: you have a hash table-based map object with a current capacity of 64, a load factor of 0.6, and 38 entries stored. If you add one more entry, the map holds 39 entries, and:

39 > 64 × 0.6 = 38.4

In this case, the hash table is rehashed because the capacity is no longer sufficient according to the load factor.
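The check from this example can be written down in a few lines of Java. This is only a minimal sketch of the simplified rule described above; the class RehashCheck and the method needsRehash are hypothetical names made up for illustration, not part of the JDK.

```java
class RehashCheck {

    // Hypothetical helper: applies the simplified rule from the text,
    // "rehash when entries > capacity x load factor".
    static boolean needsRehash(int entries, int capacity, float loadFactor) {
        return entries > capacity * loadFactor;
    }

    public static void main(String[] args) {
        // 38 entries in a table of capacity 64 with load factor 0.6
        System.out.println(needsRehash(38, 64, 0.6f)); // false (38 <= 38.4)
        // One more entry crosses the threshold
        System.out.println(needsRehash(39, 64, 0.6f)); // true  (39 > 38.4)
    }
}
```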

Note: The above is a simplified (and fictional) example scenario to show the general concept. The actual implementation of a collection like HashMap is more sophisticated and performs these calculations slightly differently.

So, what must we do to create an instance of HashMap (or another hash table-based collection) with a suitable capacity for N entries? The calculation is pretty straightforward: divide the number of mappings by the load factor and round up.

int capacity = (int) Math.ceil(numMappings / loadFactor);

If you know N in advance, you can also precalculate the capacity. Thus, for example, to create a HashMap for 100 entries and a load factor of 0.6, you can use:

Map<X,Y> map = new HashMap<X,Y>(167, 0.6f);    // ((int) Math.ceil(100 / 0.6)) = 167
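If N is only known at runtime, the same calculation can be wrapped in a small helper. The following is a sketch under that assumption; CapacityDemo, capacityFor, and newMapFor are hypothetical names chosen for this example, and only the HashMap(int, float) constructor comes from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

class CapacityDemo {

    // Hypothetical helper: smallest capacity whose threshold
    // (capacity x loadFactor) is at least numMappings.
    static int capacityFor(int numMappings, float loadFactor) {
        return (int) Math.ceil(numMappings / (double) loadFactor);
    }

    // Hypothetical helper: a HashMap sized for numMappings entries
    // with a caller-chosen load factor.
    static <K, V> Map<K, V> newMapFor(int numMappings, float loadFactor) {
        return new HashMap<>(capacityFor(numMappings, loadFactor), loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(100, 0.6f));  // 167
        System.out.println(capacityFor(100, 0.75f)); // 134

        Map<String, Integer> map = newMapFor(100, 0.6f);
        for (int i = 0; i < 100; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // 100
    }
}
```

The cast to double before dividing keeps the division in floating point whatever the numeric type of the arguments, so the result is always rounded up rather than truncated.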

The new factory methods since JDK 19

Starting with JDK 19, several hash table-based collections each offer a new static factory method. There are precisely five:

  • HashMap.newHashMap(int numMappings)
  • HashSet.newHashSet(int numElements)
  • LinkedHashMap.newLinkedHashMap(int numMappings)
  • LinkedHashSet.newLinkedHashSet(int numElements)
  • WeakHashMap.newWeakHashMap(int numMappings)

These methods are handy when you know a map will only have N (or at most N) entries/elements, and you don’t want to cause intermediate rehashings. So, for example:

Map<X,Y> map = HashMap.newHashMap(100);    // suitable for 100 entries
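For completeness, here is a short sketch that calls all five factory methods. It assumes the code is compiled and run on JDK 19 or later; the class FactoryMethodsDemo and the method squares are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.WeakHashMap;

class FactoryMethodsDemo {

    // Hypothetical example method: builds a map of n entries,
    // sized up front for exactly n mappings.
    static Map<Integer, Integer> squares(int n) {
        Map<Integer, Integer> map = HashMap.newHashMap(n);
        for (int i = 0; i < n; i++) {
            map.put(i, i * i);
        }
        return map;
    }

    public static void main(String[] args) {
        int expected = 100; // number of entries/elements known in advance

        Map<String, Integer> hashMap = HashMap.newHashMap(expected);
        Set<String> hashSet = HashSet.newHashSet(expected);
        Map<String, Integer> linkedMap = LinkedHashMap.newLinkedHashMap(expected);
        Set<String> linkedSet = LinkedHashSet.newLinkedHashSet(expected);
        Map<String, Integer> weakMap = WeakHashMap.newWeakHashMap(expected);

        hashMap.put("a", 1);
        hashSet.add("a");
        linkedMap.put("a", 1);
        linkedSet.add("a");
        weakMap.put("a", 1);

        System.out.println(squares(expected).size()); // 100
    }
}
```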

The only limitation of these factory methods is that you cannot specify the load factor. These methods use the default load factor of 0.75, as described in Oracle’s Javadoc documentation. The default of 0.75 is generally a “good” value, representing a reasonable tradeoff between space cost and performance. If you need a different load factor, you still have to use the two-argument constructor with a capacity calculated as shown earlier.

Conclusions

In this article, I described the relationship between capacity and load factor in hash table-based collections. I wrote it mainly as a reminder for other developers (and, of course, for myself). So remember these new methods if you are “lucky” enough to use JDK 19 (or higher) in your projects!
