Collecting into Map using Collectors.toMap() vs Collectors.groupingBy()

1. Overview

Working with collections and streams is a common task in Java, especially in code written in a functional style. Java 8 introduced the Stream API, which provides powerful tools for processing collections of data.

Two essential Collectors in the Stream API are Collectors.toMap() and Collectors.groupingBy(), both of which serve distinct purposes when it comes to transforming Stream elements into a Map.

In this tutorial, we’ll delve into the differences between these two Collectors and explore scenarios where each is more appropriate.

2. The City Example

Examples can help us illustrate the problem. So, let’s create a simple immutable POJO class:

class City {
 
    private final String name;
    private final String country;
 
    public City(String name, String country) {
        this.name = name;
        this.country = country;
    }
    // ... getters, equals(), and hashCode() methods are omitted
}

As the code above shows, the City class only has two properties: the city name and the country in which the City is located.
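For reference, the omitted members might look like the following. This is only a sketch: it assumes conventional getters and value-based equality over both fields, since the original implementation isn’t shown. Objects.equals() and Objects.hash() are used because either field may be null in later examples.

```java
import java.util.Objects;

class City {

    private final String name;
    private final String country;

    City(String name, String country) {
        this.name = name;
        this.country = country;
    }

    public String getName() {
        return name;
    }

    public String getCountry() {
        return country;
    }

    // Assumed: two cities are equal when both name and country match.
    // Objects.equals() tolerates null fields.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof City)) {
            return false;
        }
        City other = (City) o;
        return Objects.equals(name, other.name) && Objects.equals(country, other.country);
    }

    // Objects.hash() also accepts null arguments
    @Override
    public int hashCode() {
        return Objects.hash(name, country);
    }
}
```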

Since we’ll be using Collectors.toMap() and Collectors.groupingBy() as the terminal Stream operations in our examples, let’s create some City objects to feed a Stream:

static final City PARIS = new City("Paris", "France");
static final City BERLIN = new City("Berlin", "Germany");
static final City TOKYO = new City("Tokyo", "Japan");

Now, we can create a Stream easily from these City instances:

Stream.of(PARIS, BERLIN, TOKYO);

Next, we’ll use Collectors.toMap() and Collectors.groupingBy() to convert a Stream of City instances to a Map and discuss the differences between these two Collectors.

For simplicity, we’ll use “toMap()” and “groupingBy()” to refer to the two Collectors in the tutorial, and we’ll employ unit test assertions to verify whether a transformation yields the expected result.

3. Taking City.country as the Key

First, let’s explore the basic usages of the toMap() and groupingBy() Collectors. We’ll use these two Collectors to transform a Stream. In the transformed Map result, we’ll take each City.country as the key.

The key can also be null, so let’s create a City with null as its country:

static final City COUNTRY_NULL = new City("Unknown", null);

3.1. Using the toMap() Collector

toMap() allows us to define how keys and values are mapped from elements in the input Stream. We can pass keyMapper and valueMapper parameters to the toMap() method. Both parameters are functions, providing the key and the value in the result Map.

If we want the City instance itself to be the value in the result, and to get a Map<String, City>, we can use Function.identity() as the valueMapper:

Map<String, City> result = Stream.of(PARIS, BERLIN, TOKYO)
  .collect(Collectors.toMap(City::getCountry, Function.identity()));
 
Map<String, City> expected = Map.of(
  "France", PARIS,
  "Germany", BERLIN,
  "Japan", TOKYO
);
 
assertEquals(expected, result);

toMap() also works as expected when the keyMapper function returns null for an element:

Map<String, City> result = Stream.of(PARIS, COUNTRY_NULL)
  .collect(Collectors.toMap(City::getCountry, Function.identity()));
 
Map<String, City> expected = new HashMap<>() {{
    put("France", PARIS);
    put(null, COUNTRY_NULL);
}};
 
assertEquals(expected, result);

3.2. Using the groupingBy() Collector

The groupingBy() Collector is good at segregating Stream elements into groups based on a specified classifier function. Therefore, the value type in the result Map is a Collection. By default, it’s a List.

So next, let’s group our Stream by City.country:

Map<String, List<City>> result = Stream.of(PARIS, BERLIN, TOKYO)
  .collect(Collectors.groupingBy(City::getCountry));
 
Map<String, List<City>> expected = Map.of(
  "France", List.of(PARIS),
  "Germany", List.of(BERLIN),
  "Japan", List.of(TOKYO)
);
 
assertEquals(expected, result);

Unlike toMap(), groupingBy() cannot handle a null key returned by the classifier:

assertThrows(NullPointerException.class, () -> Stream.of(PARIS, COUNTRY_NULL)
  .collect(Collectors.groupingBy(City::getCountry)));

As the example shows, groupingBy() throws a NullPointerException when the classifier function returns null.
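If null countries can occur in the input, one possible workaround is to substitute a placeholder key inside the classifier. The sketch below illustrates the idea; the NullSafeGrouping class, the groupByCountry() method, and the "N/A" placeholder are hypothetical names chosen for this example, and City is a minimal stand-in for the class shown earlier.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NullSafeGrouping {

    // Minimal stand-in for the article's City class
    static class City {
        private final String name;
        private final String country;

        City(String name, String country) {
            this.name = name;
            this.country = country;
        }

        String getName() { return name; }
        String getCountry() { return country; }
    }

    // Hypothetical placeholder used in place of a null country
    static final String NO_COUNTRY = "N/A";

    static Map<String, List<City>> groupByCountry(Stream<City> cities) {
        // The classifier never returns null, so groupingBy() does not throw
        return cities.collect(Collectors.groupingBy(
            city -> city.getCountry() == null ? NO_COUNTRY : city.getCountry()));
    }

    public static void main(String[] args) {
        Map<String, List<City>> result = groupByCountry(Stream.of(
            new City("Paris", "France"),
            new City("Unknown", null)));
        System.out.println(result.get(NO_COUNTRY).size()); // 1
    }
}
```

Whether a placeholder is acceptable depends on the data: it must not collide with a real key.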

We’ve explored the two Collectors’ fundamental usages through examples. However, in our Stream, there are no duplicate countries among the City instances. In real-world projects, key collisions are a common scenario that we need to handle.

So next, let’s see how the two Collectors deal with duplicate keys.

4. When There Are Duplicate Keys

Let’s create three additional cities:

static final City NICE = new City("Nice", "France");
static final City AACHEN = new City("Aachen", "Germany");
static final City HAMBURG = new City("Hamburg", "Germany");

Alongside the previously created City instances, we now have cities with duplicate countries. For example, BERLIN, HAMBURG, and AACHEN have the same country: “Germany”.

Next, let’s explore how the toMap() and groupingBy() Collectors handle duplicate keys.

4.1. Using the toMap() Collector

If we continue with the previous approach, only passing the keyMapper and valueMapper to the toMap() Collector, an IllegalStateException will be thrown due to the presence of duplicate keys:

assertThrows(IllegalStateException.class, () -> Stream.of(PARIS, BERLIN, TOKYO, NICE, HAMBURG, AACHEN)
    .collect(Collectors.toMap(City::getCountry, Function.identity())));

When duplicate keys may occur, it’s necessary to provide a mergeFunction as the third parameter to toMap() to resolve collisions between values associated with the same key.

Next, let’s provide a lambda expression as the mergeFunction to toMap(), which selects the “smaller” City by comparing two city names lexicographically when their countries are the same:

Map<String, City> result = Stream.of(PARIS, BERLIN, TOKYO, NICE, HAMBURG, AACHEN)
  .collect(Collectors.toMap(City::getCountry, Function.identity(),
    (c1, c2) -> c1.getName().compareTo(c2.getName()) < 0 ? c1 : c2));
 
Map<String, City> expected = Map.of(
  "France", NICE, // <-- from Paris and Nice
  "Germany", AACHEN, // <-- from Berlin, Hamburg, and Aachen
  "Japan", TOKYO
);
 
assertEquals(expected, result);

As the above example shows, the mergeFunction returns one City instance based on the given rule. Therefore, we still obtain a Map<String, City> as the result after calling the collect() method.
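The same rule can also be written with the JDK’s BinaryOperator.minBy() and Comparator.comparing() instead of a hand-written lambda. The following is a sketch of that alternative, not the article’s own code; the MergeWithMinBy class and the smallestCityByCountry() method are hypothetical names. One small difference: when two names compare as equal, minBy() keeps the first value, whereas the lambda above keeps the second.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergeWithMinBy {

    // Minimal stand-in for the article's City class
    static class City {
        private final String name;
        private final String country;

        City(String name, String country) {
            this.name = name;
            this.country = country;
        }

        String getName() { return name; }
        String getCountry() { return country; }
    }

    static Map<String, City> smallestCityByCountry(Stream<City> cities) {
        return cities.collect(Collectors.toMap(
            City::getCountry,
            Function.identity(),
            // On a key collision, keep the City whose name sorts first
            BinaryOperator.minBy(Comparator.comparing(City::getName))));
    }

    public static void main(String[] args) {
        Map<String, City> result = smallestCityByCountry(Stream.of(
            new City("Paris", "France"),
            new City("Berlin", "Germany"),
            new City("Nice", "France"),
            new City("Hamburg", "Germany"),
            new City("Aachen", "Germany")));
        System.out.println(result.get("Germany").getName()); // Aachen
        System.out.println(result.get("France").getName());  // Nice
    }
}
```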

4.2. Using the groupingBy() Collector

On the other hand, since groupingBy() groups Stream elements into a Collection using the classifier, the previous code still works even though several cities in the input Stream share the same country:

Map<String, List<City>> result = Stream.of(PARIS, BERLIN, TOKYO, NICE, HAMBURG, AACHEN)
  .collect(Collectors.groupingBy(City::getCountry));
 
Map<String, List<City>> expected = Map.of(
  "France", List.of(PARIS, NICE),
  "Germany", List.of(BERLIN, HAMBURG, AACHEN),
  "Japan", List.of(TOKYO)
);
 
assertEquals(expected, result);

As we can see, cities with the same country are grouped into the same List, as evidenced by the “France” and “Germany” entries in the result.

5. With Value Mappers

Up to this point, we’ve used the toMap() and groupingBy() Collectors to obtain associations of country -> City or country -> a Collection of City instances.

However, at times, we might need to map the Stream elements to different values. For example, we may wish to obtain associations of country -> City.name or country -> a Collection of City.name values.

Further, it’s important to note that the mapped values can be null. So, we may need to address the cases where the value is null. Let’s create a City whose name is null:

static final City FRANCE_NULL = new City(null, "France");

Next, let’s explore how to apply a value mapper to the toMap() and groupingBy() Collectors.

5.1. Using the toMap() Collector

As we’ve mentioned earlier, we can pass a valueMapper function to the toMap() method as the second parameter, allowing us to map the objects in the input Stream to different values:

Map<String, String> result = Stream.of(PARIS, BERLIN, TOKYO)
  .collect(Collectors.toMap(City::getCountry, City::getName));
 
Map<String, String> expected = Map.of(
  "France", "Paris",
  "Germany", "Berlin",
  "Japan", "Tokyo"
);
 
assertEquals(expected, result);

In this example, we use the method reference City::getName as the valueMapper parameter, mapping a City to its name.

However, toMap() fails when the valueMapper function returns null for an element:

assertThrows(NullPointerException.class, () -> Stream.of(PARIS, FRANCE_NULL)
  .collect(Collectors.toMap(City::getCountry, City::getName)));

As we can see, if the mapped values contain a null, toMap() throws a NullPointerException.
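When null values must be kept, one common workaround is to skip toMap() and accumulate into a HashMap with the three-argument Stream.collect() method. The sketch below assumes that approach; the NullValueTolerantMap class and the nameByCountry() method are hypothetical names, and City is a minimal stand-in for the class shown earlier. Note the trade-off: Map.put() silently overwrites an earlier entry with the same key instead of reporting the duplicate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

class NullValueTolerantMap {

    // Minimal stand-in for the article's City class
    static class City {
        private final String name;
        private final String country;

        City(String name, String country) {
            this.name = name;
            this.country = country;
        }

        String getName() { return name; }
        String getCountry() { return country; }
    }

    static Map<String, String> nameByCountry(Stream<City> cities) {
        // HashMap.put() accepts null values, so a null city name is stored as-is.
        // Caveat: a duplicate country overwrites the earlier entry without any error.
        return cities.<Map<String, String>>collect(
            HashMap::new,
            (map, city) -> map.put(city.getCountry(), city.getName()),
            Map::putAll);
    }

    public static void main(String[] args) {
        Map<String, String> result = nameByCountry(Stream.of(
            new City("Berlin", "Germany"),
            new City(null, "France")));
        System.out.println(result.containsKey("France")); // true
        System.out.println(result.get("France"));         // null
    }
}
```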

5.2. Using the groupingBy() Collector

Unlike toMap(), groupingBy() doesn’t directly support a valueMapper function as its parameter. However, we can supply another Collector as the second parameter to groupingBy(), allowing us to perform downstream reduction operations. For example, we can use the mapping() Collector to map grouped City instances to their names:

// mapping() and toList() are statically imported from java.util.stream.Collectors
Map<String, List<String>> result = Stream.of(PARIS, BERLIN, TOKYO)
  .collect(Collectors.groupingBy(City::getCountry, mapping(City::getName, toList())));
 
Map<String, List<String>> expected = Map.of(
  "France", List.of("Paris"),
  "Germany", List.of("Berlin"),
  "Japan", List.of("Tokyo")
);
  
assertEquals(expected, result);

Furthermore, the combination of groupingBy() and mapping() accepts null mapped values and simply adds them to the group’s List:

Map<String, List<String>> resultWithNull = Stream.of(PARIS, BERLIN, TOKYO, FRANCE_NULL)
  .collect(Collectors.groupingBy(City::getCountry, mapping(City::getName, toList())));
 
// List.of() rejects null elements, so Arrays.asList() builds the list that contains null
Map<String, List<String>> expectedWithNull = Map.of(
  "France", Arrays.asList("Paris", null),
  "Germany", List.of("Berlin"),
  "Japan", List.of("Tokyo")
);
 
assertEquals(expectedWithNull, resultWithNull);
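mapping() is only one of the downstream Collectors that groupingBy() accepts. As a further illustration, the sketch below uses Collectors.counting() to reduce each group to its size; the CountPerCountry class and the countByCountry() method are hypothetical names, and City is again a minimal stand-in.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CountPerCountry {

    // Minimal stand-in for the article's City class
    static class City {
        private final String name;
        private final String country;

        City(String name, String country) {
            this.name = name;
            this.country = country;
        }

        String getName() { return name; }
        String getCountry() { return country; }
    }

    static Map<String, Long> countByCountry(Stream<City> cities) {
        // counting() is the downstream Collector: each group is reduced to its element count
        return cities.collect(Collectors.groupingBy(City::getCountry, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> result = countByCountry(Stream.of(
            new City("Paris", "France"),
            new City("Berlin", "Germany"),
            new City("Nice", "France"),
            new City("Hamburg", "Germany"),
            new City("Aachen", "Germany")));
        System.out.println(result.get("Germany")); // 3
        System.out.println(result.get("France"));  // 2
    }
}
```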

6. Conclusion

As we’ve seen in this article, Collectors.toMap() and Collectors.groupingBy() are powerful Collectors, each serving distinct purposes.

toMap() is suitable for transforming a Stream into a Map directly, while groupingBy() is good at categorizing Stream elements into groups based on certain criteria. 

Additionally, toMap() still works when the keyMapper function returns null, whereas groupingBy() throws a NullPointerException if the classifier function returns null.

Since toMap() supports a valueMapper parameter, it’s convenient for mapping values to a desired type. However, it’s important to note that if the valueMapper function returns null, toMap() throws a NullPointerException. In contrast, groupingBy() relies on a downstream Collector such as mapping() to convert the Stream elements to a different type, and this combination accepts null mapped values.

By understanding their differences and use cases, we can effectively use these Collectors in our Java applications to manipulate and process data streams.

As always, the complete source code for the examples is available over on GitHub.

       
