1. Overview
Finding differences between collections of objects of the same data type is a common programming task. As an example, imagine we have a list of students who applied for an exam and another list of students who passed it. The difference between those two lists would give us the students who didn't pass the exam.
In Java, the List API has no dedicated method for finding the differences between two lists, though some of its helper methods come close.
In this quick tutorial, we'll look at how to find the differences between two lists. We'll try a few different approaches, including plain Java (with and without Streams) and third-party libraries such as Guava and Apache Commons Collections.
2. Test Setup
Let's start by defining two lists, which we'll use to test out our examples:
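import java.util.Arrays;
import java.util.List;
public class FindDifferencesBetweenListsUnitTest {
    // Note that "Jack" appears twice in listOne; this matters when we handle duplicates later
    private static final List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
    private static final List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");
}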
3. Using the Java List API
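The List method removeAll() removes from a list every element that's also contained in another collection. Since it modifies the list it's called on, and our test lists created with Arrays.asList() are fixed-size, we first create a copy of one list and then remove all the elements it shares with the other: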
List<String> differences = new ArrayList<>(listOne);
differences.removeAll(listTwo);
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");
Let's reverse this to find the differences the other way around:
List<String> differences = new ArrayList<>(listTwo);
differences.removeAll(listOne);
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Daniel", "Alan", "George");
We should also note that if we want to find the common elements between the two lists instead, List also provides a retainAll() method.
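As a minimal sketch of the retainAll() approach, the example below copies the first list and keeps only the elements that also occur in the second one. The class name CommonElementsExample and the helper commonElements are hypothetical names chosen for illustration; the lists are the same ones used throughout this tutorial.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommonElementsExample {

    // Hypothetical helper: returns the elements of 'first' that also occur in 'second',
    // keeping the order (and any duplicates) of 'first'.
    public static List<String> commonElements(List<String> first, List<String> second) {
        List<String> common = new ArrayList<>(first); // copy, so the input list is not modified
        common.retainAll(second);
        return common;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(commonElements(listOne, listTwo)); // prints [Jack, Sam, James, Jack]
    }
}
```

Like removeAll(), retainAll() doesn't deduplicate, so both occurrences of "Jack" from the first list are kept.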
4. Using the Streams API
A Java Stream can be used for performing sequential operations on data from collections, which includes filtering differences between lists:
List<String> differences = listOne.stream()
    .filter(element -> !listTwo.contains(element))
    .collect(Collectors.toList());
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");
As in our first example, we can switch the order of lists to find the different elements from the second list:
List<String> differences = listTwo.stream()
    .filter(element -> !listOne.contains(element))
    .collect(Collectors.toList());
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Daniel", "Alan", "George");
We should note that List.contains() performs a linear scan, so calling it once per element makes this approach roughly O(n × m) for lists of sizes n and m, which can be costly for larger lists.
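One possible way to reduce that cost, assuming the elements implement equals() and hashCode() consistently (as String does), is to copy the second list into a HashSet once and run the lookups against the set. This is a sketch rather than part of the original examples; the class name StreamDifferenceExample and the method difference are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class StreamDifferenceExample {

    // Hypothetical helper: returns the elements of 'first' that do not occur in 'second',
    // in the order they appear in 'first'.
    public static List<String> difference(List<String> first, List<String> second) {
        // Build the lookup set once; each contains() call is then a hash lookup
        // instead of a scan over the whole second list.
        Set<String> lookup = new HashSet<>(second);
        return first.stream()
            .filter(element -> !lookup.contains(element))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(difference(listOne, listTwo)); // prints [Tom, John]
        System.out.println(difference(listTwo, listOne)); // prints [Daniel, Alan, George]
    }
}
```

Only the lookup side is converted to a set, so the result still follows the order of the streamed list.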
5. Using Third-Party Libraries
5.1. Using Google Guava
Guava contains a handy Sets.difference method, but to use it, we first need to convert both of our lists to sets:
List<String> differences = new ArrayList<>(Sets.difference(Sets.newHashSet(listOne), Sets.newHashSet(listTwo)));
assertEquals(2, differences.size());
assertThat(differences).containsExactlyInAnyOrder("Tom", "John");
We should note that converting a List to a Set removes duplicates and, with a HashSet, doesn't preserve the original element order, which is why the assertion above uses containsExactlyInAnyOrder.
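If the deduplication is acceptable but the original order matters, one option is a LinkedHashSet, which iterates in insertion order. The sketch below uses only the JDK rather than Guava, so it's an alternative technique and not a description of Sets.difference; the class name OrderedSetDifferenceExample and the method orderedDifference are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OrderedSetDifferenceExample {

    // Hypothetical helper: set difference (first minus second) that deduplicates
    // the result but keeps the first-occurrence order of 'first'.
    public static List<String> orderedDifference(List<String> first, List<String> second) {
        Set<String> result = new LinkedHashSet<>(first); // drops duplicates, keeps insertion order
        result.removeAll(second);
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(orderedDifference(listOne, listTwo)); // prints [Tom, John]
    }
}
```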
5.2. Using Apache Commons Collections
The CollectionUtils class from Apache Commons Collections contains a removeAll method.
This method removes the same elements as List.removeAll, but it returns the result as a new collection and leaves the original list unmodified:
List<String> differences = new ArrayList<>(CollectionUtils.removeAll(listOne, listTwo));
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");
6. Handling Duplicate Values
Let's now look at finding differences when two lists contain duplicated values.
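To achieve this, we need to remove each element of the second list from the first list once per occurrence, rather than removing every matching element. List.remove(Object) deletes only the first matching element, so calling it once for each element of the second list gives us exactly this behavior.
In our example, the value “Jack” appears twice in the first list and only once in the second list, so one “Jack” should remain in the result: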
List<String> differences = new ArrayList<>(listOne);
listTwo.forEach(differences::remove);
assertThat(differences).containsExactly("Tom", "John", "Jack");
We can also achieve this using the subtract method from Apache Commons Collections:
List<String> differences = new ArrayList<>(CollectionUtils.subtract(listOne, listTwo));
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Tom", "John", "Jack");
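For larger lists, the same duplicate-aware difference can be sketched in plain Java by counting the occurrences in the second list first, which avoids a scan of the first list for every removal. This is an illustrative sketch, not the implementation used by Apache Commons Collections; the class name CountingDifferenceExample and the method subtract are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CountingDifferenceExample {

    // Hypothetical helper: removes each element of 'second' from 'first' once per occurrence.
    // Later occurrences in 'first' survive, matching the forEach(differences::remove) result.
    public static List<String> subtract(List<String> first, List<String> second) {
        // Count how many times each value still has to be removed
        Map<String, Integer> remaining = new HashMap<>();
        for (String element : second) {
            remaining.merge(element, 1, Integer::sum);
        }

        List<String> result = new ArrayList<>();
        for (String element : first) {
            Integer count = remaining.get(element);
            if (count != null && count > 0) {
                remaining.put(element, count - 1); // consume one occurrence instead of keeping it
            } else {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(subtract(listOne, listTwo)); // prints [Tom, John, Jack]
    }
}
```

Each list is traversed once, so the work grows with the combined size of the two lists rather than with their product.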
7. Conclusion
In this article, we explored a few ways to find differences between lists.
In the examples, we covered a basic Java solution, a solution using the Streams API, and solutions using third-party libraries like Google Guava and Apache Commons Collections.
We also saw how to handle duplicate values.
As always, the complete source code is available over on GitHub.