1. Introduction
In this article, we'll dive into the bucket sort algorithm. We'll start with a quick bit of theory, before working on the Java implementation alongside unit testing our solution. Finally, we'll look at the time complexity of bucket sorting.
2. The Theory of Bucket Sorting
Bucket sorting, sometimes known as bin sorting, is a specific sorting algorithm. The sort works by distributing the elements we want to sort into several individually sorted buckets. By doing this, we can reduce the number of comparisons between the elements and help cut the sorting time.
Let's take a quick look at the steps required to perform a bucket sort:
- Set up an array of our initially empty buckets
- Distribute our elements into their appropriate buckets
- Sort each bucket
- Concatenate the sorted buckets together to recreate the full list
3. Java Implementation
While this algorithm isn't language-specific, we'll be implementing the sort in Java. Let's go through the above list step by step and write the code to sort a list of integers.
3.1. Bucket Setup
First, we need to determine a hashing algorithm to decide which of our elements gets placed into which bucket:
private int hash(int i, int max, int numberOfBuckets) { return (int) ((double) i / max * (numberOfBuckets - 1)); }
With our hash method defined, we can now specify the number of bins as a square root of the input list size:
final int numberOfBuckets = (int) Math.sqrt(initialList.size()); List<List<Integer>> buckets = new ArrayList<>(numberOfBuckets); for(int i = 0; i < numberOfBuckets; i++) { buckets.add(new ArrayList<>()); }
Finally, we need a short method to determine the maximum integer in our input list:
private int findMax(List<Integer> input) { int m = Integer.MIN_VALUE; for (int i : input) { m = Math.max(i, m); } return m; }
3.2. Distributing the Elements
Now that we have our buckets defined, we can distribute each element of our input list into its relevant bucket using the hash method:
int max = findMax(initialList); for (int i : initialList) { buckets.get(hash(i, max, numberOfBuckets)).add(i); }
3.3. Sorting the Individual buckets
With our buckets defined and full of integers, let's use a Comparator to sort them:
Comparator<Integer> comparator = Comparator.naturalOrder(); for(List<Integer> bucket : buckets){ bucket.sort(comparator); }
3.4. Concatenating Our Buckets
Finally, we need to pull our buckets together to recreate the single list. Since our buckets are sorted, we only need to loop through each bucket once and append the elements to a master list:
List<Integer> sortedArray = new LinkedList<>(); for(List<Integer> bucket : buckets) { sortedArray.addAll(bucket); } return sortedArray;
4. Testing Our Code
With our implementation complete, let's write a quick unit test to make sure it works as expected:
BucketSorter sorter = new IntegerBucketSorter(); List<Integer> unsorted = Arrays.asList(80,50,60,30,20,10,70,0,40,500,600,602,200,15); List<Integer> expected = Arrays.asList(0,10,15,20,30,40,50,60,70,80,200,500,600,602); List<Integer> sorted = sorter.sort(unsorted); assertEquals(expected, sorted);
5. Time Complexity
Next, let's take a quick look at the time complexity of performing a bucket sort.
5.1. Worst Case Scenario
In our worst-case scenario, we'd find all of our elements in the same bucket and in reverse order. When this case occurs, we're reducing our bucket sort to a simple sort in which every element is compared to every other element, yielding a time complexity of O(n²).
5.2. Average Case Scenario
In our average case, we find that the elements are relatively evenly distributed among our input buckets. Since each of our steps requires just one iteration through our input buckets, we find that our bucket sort completes in O(n) time.
6. Conclusion
In this article, we saw how to implement a bucket sort in Java. We also looked at the time complexity of the bucket sort algorithm.
As always, the code shown in this article is available over on GitHub.