1. Overview
We've all used Arrays.sort() to sort an array of objects or primitives. In JDK 8, creators enhanced the API to provide a new method: Arrays.parallelSort().
In this tutorial, we'll draw a comparison between the sort() and parallelSort() methods.
2. Arrays.sort()
The Arrays.sort() method sorts the array of objects or primitives. The sorting algorithm used in this method is Dual-Pivot Quicksort. In other words, it is a custom implementation of the Quicksort algorithm to achieve better performance.
This method is single-threaded and there are two variants:
- sort(array) – sorts the full array into ascending order
- sort(array, fromIndex, toIndex) – sorts only the elements from fromIndex to toIndex
Let's see an example of both variants:
@Test public void givenArrayOfIntegers_whenUsingArraysSortMethod_thenSortFullArrayInAscendingOrder() { int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 }; int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }; Arrays.sort(array); assertArrayEquals(expected, array); } @Test public void givenArrayOfIntegers_whenUsingArraysSortMethodWithRange_thenSortRangeOfArrayInAscendingOrder() { int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 }; int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 }; Arrays.sort(array, 2, 8); assertArrayEquals(expected, array); }
Let's summarize the pros and cons of this approach:
PROS | CONS |
---|---|
Works fast on smaller data sets | Performance degrades for large datasets |
Multiple cores of the system aren't utilized |
3. Arrays.parallelSort()
This method also sorts an array of objects or primitives. Similar to sort() it also has two variants to sort a full array and partial array:
@Test public void givenArrayOfIntegers_whenUsingArraysParallelSortMethod_thenSortFullArrayInAscendingOrder() { int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 }; int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }; Arrays.parallelSort(array); assertArrayEquals(expected, array); } @Test public void givenArrayOfIntegers_whenUsingArraysParallelSortMethodWithRange_thenSortRangeOfArrayInAscendingOrder() { int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 }; int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 }; Arrays.parallelSort(array, 2, 8); assertArrayEquals(expected, array); }
The parallelSort() is functionally different. Unlike sort(), which sorts data sequentially using a single thread, it uses a parallel sort-merge sorting algorithm. It breaks the array into sub-arrays that are themselves sorted and then merged.
For executing parallel tasks it uses the ForkJoin pool.
But we need to know that it uses parallelism only when certain conditions are met. If the array size is less than or equal to 8192 or the processor has only one core, then it uses the sequential Dual-Pivot Quicksort algorithm. Otherwise, it uses a parallel sort.
Let's summarize the advantages and disadvantages of using it:
PROS | CONS |
---|---|
Offers better performance for large size datasets | Slower for smaller size arrays |
Utilizes multiple cores of the system |
4. Comparison
Let's now see how both methods performed with different size datasets. Below numbers are derived using JMH benchmarking. The test environment uses AMD A10 PRO 2.1Ghz quad-core processor and JDK 1.8.0_221:
Array Size | Arrays.sort() | Arrays.parallelSort() |
---|---|---|
1000 | o.048 | 0.054 |
10000 | 0.847 | 0.425 |
100000 | 7.570 | 4.395 |
1000000 | 65.301 | 37.998 |
5. Conclusion
In this quick article, we saw how sort() and parallelSort() differ.
Based on performance results, we can conclude that parallelSort() may be a better choice when we have a large dataset to sort. However, in the case of smaller size arrays, it's better to go with sort() since it offers better performance.
As always, the complete source code is available over on GitHub.