Channel: Baeldung

Arrays.sort vs Arrays.parallelSort

1. Overview

We've all used Arrays.sort() to sort an array of objects or primitives. JDK 8 added a new method to the same API: Arrays.parallelSort().

In this tutorial, we'll draw a comparison between the sort() and parallelSort() methods.

2. Arrays.sort()

The Arrays.sort() method sorts an array of objects or primitives. For primitive arrays, it uses Dual-Pivot Quicksort, a custom variant of the Quicksort algorithm tuned for better performance. For object arrays, it uses a stable merge sort (TimSort).

This method is single-threaded and there are two variants:

  • sort(array) – sorts the full array into ascending order
  • sort(array, fromIndex, toIndex) – sorts only the elements from fromIndex (inclusive) to toIndex (exclusive)

Let's see an example of both variants:

@Test
public void givenArrayOfIntegers_whenUsingArraysSortMethod_thenSortFullArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    Arrays.sort(array);

    assertArrayEquals(expected, array);
}

@Test
public void givenArrayOfIntegers_whenUsingArraysSortMethodWithRange_thenSortRangeOfArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 };

    Arrays.sort(array, 2, 8);

    assertArrayEquals(expected, array);
}
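The tests above cover primitive arrays only. Since sort() also accepts arrays of objects, here is a minimal sketch of the overload that takes a Comparator. The class name ObjectSortSketch and the helper sortByLength are hypothetical names used for illustration; Arrays.sort(T[], Comparator) itself is part of the JDK and is documented as stable for object arrays:

```java
import java.util.Arrays;
import java.util.Comparator;

class ObjectSortSketch {

    // Hypothetical helper: sorts a copy of the input by string length.
    // The object sort is stable, so strings of equal length keep
    // their original relative order.
    static String[] sortByLength(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        String[] words = { "pear", "fig", "kiwi", "apple", "plum" };
        System.out.println(Arrays.toString(sortByLength(words)));
        // prints [fig, pear, kiwi, plum, apple]
    }
}
```

Note that "pear", "kiwi" and "plum" all have length 4 and appear in the output in their original order.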

Let's summarize the pros and cons of this approach:

PROS                             | CONS
Works fast on smaller data sets  | Performance degrades for large datasets
                                 | Multiple cores of the system aren't utilized

3. Arrays.parallelSort()

This method also sorts an array of objects or primitives. Like sort(), it has two variants: one sorts the full array and the other sorts a range of it:

@Test
public void givenArrayOfIntegers_whenUsingArraysParallelSortMethod_thenSortFullArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    Arrays.parallelSort(array);

    assertArrayEquals(expected, array);
}

@Test
public void givenArrayOfIntegers_whenUsingArraysParallelSortMethodWithRange_thenSortRangeOfArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 };

    Arrays.parallelSort(array, 2, 8);

    assertArrayEquals(expected, array);
}

The parallelSort() method produces the same result but works differently. Unlike sort(), which sorts data sequentially on a single thread, it uses a parallel sort-merge algorithm: it breaks the array into sub-arrays, sorts each of them, and then merges the sorted sub-arrays.

To execute the parallel tasks, it uses the common ForkJoin pool (ForkJoinPool.commonPool()).
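To make the split, sort and merge idea concrete, here is a simplified sketch built on RecursiveAction. It is not the JDK's actual implementation: the class names, the merge helper and the tiny SEQUENTIAL_CUTOFF value are assumptions chosen to keep the example short.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ForkJoinMergeSortSketch {

    // Hypothetical cutoff: chunks this small are sorted sequentially.
    static final int SEQUENTIAL_CUTOFF = 4;

    static class SortTask extends RecursiveAction {
        private final int[] data;
        private final int from; // inclusive
        private final int to;   // exclusive

        SortTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= SEQUENTIAL_CUTOFF) {
                Arrays.sort(data, from, to);
                return;
            }
            int mid = (from + to) >>> 1;
            // Sort both halves as separate fork-join tasks, then merge them.
            invokeAll(new SortTask(data, from, mid), new SortTask(data, mid, to));
            merge(data, from, mid, to);
        }
    }

    // Merges the sorted ranges [from, mid) and [mid, to) in place,
    // using a temporary copy of the left range.
    static void merge(int[] data, int from, int mid, int to) {
        int[] left = Arrays.copyOfRange(data, from, mid);
        int i = 0, j = mid, k = from;
        while (i < left.length && j < to) {
            data[k++] = (left[i] <= data[j]) ? left[i++] : data[j++];
        }
        while (i < left.length) {
            data[k++] = left[i++];
        }
        // Any remaining right-hand elements are already in their final place.
    }

    static void sort(int[] data) {
        ForkJoinPool.commonPool().invoke(new SortTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
        sort(array);
        System.out.println(Arrays.toString(array));
        // prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
}
```

In practice, we'd simply call Arrays.parallelSort(); the sketch only shows the shape of the work that gets distributed across the pool.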

However, it uses parallelism only when certain conditions are met. In the JDK 8 implementation, if the array size is less than or equal to 8192, or the common pool's parallelism is 1 (for example, on a single-core machine), it falls back to the same sequential algorithm as sort(), which is Dual-Pivot Quicksort for primitives. Otherwise, it uses a parallel sort.
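The condition described above can be sketched as a small predicate. The class name, the constant MIN_PARALLEL_LENGTH and the method wouldUseParallelPath are hypothetical and only mirror the rule as stated here; using ForkJoinPool.getCommonPoolParallelism() as a stand-in for the "only one core" check is also an assumption of this sketch:

```java
import java.util.concurrent.ForkJoinPool;

class ParallelSortThresholdSketch {

    // Threshold quoted in the text (8192 elements); hypothetical constant name.
    static final int MIN_PARALLEL_LENGTH = 8192;

    // Returns true when, per the rule described in the text, a parallel
    // sort would be attempted instead of the sequential fallback.
    static boolean wouldUseParallelPath(int arrayLength, int parallelism) {
        return arrayLength > MIN_PARALLEL_LENGTH && parallelism > 1;
    }

    public static void main(String[] args) {
        // Assumption: common-pool parallelism stands in for the core check.
        int parallelism = ForkJoinPool.getCommonPoolParallelism();
        for (int length : new int[] { 1_000, 8_192, 8_193, 1_000_000 }) {
            System.out.printf("length=%d parallel=%b%n",
                length, wouldUseParallelPath(length, parallelism));
        }
    }
}
```

The printed values depend on the machine, so no expected output is shown.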

Let's summarize the advantages and disadvantages of using it:

PROS                                               | CONS
Offers better performance for large size datasets  | Slower for smaller size arrays
Utilizes multiple cores of the system              |

4. Comparison

Let's now see how both methods perform on datasets of different sizes. The numbers below were obtained with JMH benchmarks on an AMD A10 PRO 2.1 GHz quad-core processor running JDK 1.8.0_221:

Array Size | Arrays.sort() | Arrays.parallelSort()
1000       | 0.048         | 0.054
10000      | 0.847         | 0.425
100000     | 7.570         | 4.395
1000000    | 65.301        | 37.998
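For a quick sanity check on our own machine, we could use a rough timing sketch like the one below. It is not the JMH benchmark behind the numbers above: it uses System.nanoTime() with no warm-up or repeated iterations, so the results are only indicative. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

class SortTimingSketch {

    // Builds a reproducible random array for the given seed.
    static int[] randomArray(int size, long seed) {
        return new Random(seed).ints(size).toArray();
    }

    // Times a single run of the given sorter on a copy of the input,
    // leaving the original array untouched. Returns elapsed nanoseconds.
    static long timeNanos(int[] input, Consumer<int[]> sorter) {
        int[] copy = Arrays.copyOf(input, input.length);
        long start = System.nanoTime();
        sorter.accept(copy);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (int size : new int[] { 1_000, 10_000, 100_000, 1_000_000 }) {
            int[] input = randomArray(size, 42L);
            long sequential = timeNanos(input, Arrays::sort);
            long parallel = timeNanos(input, Arrays::parallelSort);
            System.out.printf("%,d elements: sort=%d ns, parallelSort=%d ns%n",
                size, sequential, parallel);
        }
    }
}
```

For numbers we intend to rely on, a proper JMH benchmark remains the better tool, since it handles JIT warm-up and measurement noise.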

5. Conclusion

In this quick article, we saw how sort() and parallelSort() differ.

Based on the performance results, we can conclude that parallelSort() may be a better choice when we have a large dataset to sort. For smaller arrays, however, it's better to go with sort(), since it performs better.

As always, the complete source code is available over on GitHub.
