Quantcast
Channel: Baeldung
Viewing all articles
Browse latest Browse all 4536

Guide to FastUtil

$
0
0

1. Introduction

In this tutorial, we’ll be looking at the FastUtil library.

First, we’ll code some examples of its type-specific collections.

Then, we’ll analyze the performance that gives FastUtil its name.

Finally, let’s take a peek at FastUtil‘s BigArray utilities.

2. Features

The FastUtil Java library seeks to extend the Java Collections Framework. It provides type-specific maps, sets, lists and queues with a smaller memory footprint and fast access and insertion. FastUtil also provides a set of utilities for working with and manipulating large (64-bit) arrays, sets and lists.

The library also includes a multitude of practical Input/Output classes for binary and text files.

Its latest release, FastUtil 8, also released a host of type-specific functions, extending the JDK’s Functional Interfaces.

2.1. Speed

In many cases, the FastUtil implementations are the fastest available. The authors have even provided their own in-depth benchmark report, comparing it against similar libraries include HPPC and Trove.

In this tutorial, we’ll look to define our own benchmarks using the Java Microbench Harness (JMH).

3. Full Sized Dependency

On top of the usual JUnit dependency, we’ll be using the FastUtils and JMH dependencies in this tutorial.

We’ll need the following dependencies in our pom.xml file:

<dependency>
    <groupId>it.unimi.dsi</groupId>
    <artifactId>fastutil</artifactId>
    <version>8.2.2</version>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.19</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.19</version>
    <scope>test</scope>
</dependency>

Or for Gradle users:

testCompile group: 'org.openjdk.jmh', name: 'jmh-core', version: '1.19'
testCompile group: 'org.openjdk.jmh', name: 'jmh-generator-annprocess', version: '1.19'
compile group: 'it.unimi.dsi', name: 'fastutil', version: '8.2.2'

3.1. Customized Jar File

Due to the lack of generics, FastUtils generates a large number of type-specific classes. And unfortunately, this leads to a huge jar file.

However, luckily for us, FastUtils includes a find-deps.sh script which allows generation of smaller, more focused jars comprising of only the classes we want to use in our application.

4. Type-Specific Collections

Before we begin, let’s take a quick peek at the simple process of instantiating a type-specific collection. Let’s pick a HashMap that stores keys and values using doubles. 

For this purpose, FastUtils provides a Double2DoubleMap interface and a Double2DoubleOpenHashMap implementation:

Double2DoubleMap d2dMap = new Double2DoubleOpenHashMap();

Now that we’ve instantiated our class, we can simply populate data as we would with any Map from the Java Collections API:

d2dMap.put(2.0, 5.5);
d2dMap.put(3.0, 6.6);

Finally, we can check that the data has been added correctly:

assertEquals(5.5, d2dMap.get(2.0));

4.1. Performance

FastUtils focuses on its performant implementations. In this section, we’ll make use of the JMH to verify that fact. Let’s compare the Java Collections HashSet<Integer> implementation against FastUtil’s IntOpenHashSet.

First, let’s see how to implement the IntOpenHashSet:

@Param({"100", "1000", "10000", "100000"})
public int setSize;

@Benchmark
public IntSet givenFastUtilsIntSetWithInitialSizeSet_whenPopulated_checkTimeTaken() {
    IntSet intSet = new IntOpenHashSet(setSize);
    for(int i = 0; i < setSize; i++) {
        intSet.add(i);
    }
    return intSet; 
}

Above, we’ve simply declared the IntOpenHashSet implementation of the IntSet interface. We’ve also declared the initial size setSize with the @Param annotation.

Put simply, these numbers are fed into JMH to produce a series of benchmark tests with different set sizes.

Next, let’s do the same thing using the Java Collections implementation:

@Benchmark
public Set<Integer> givenCollectionsHashSetWithInitialSizeSet_whenPopulated_checkTimeTaken() {
    Set<Integer> intSet = new HashSet<>(setSize);
    for(int i = 0; i < setSize; i++) {
        intSet.add(i);
    }
    return intSet;
}

Finally, let’s run the benchmark and compare the two implementations:

Benchmark                                     (setSize)  Mode  Cnt     Score   Units
givenCollectionsHashSetWithInitialSizeSet...        100  avgt    2     1.460   us/op
givenCollectionsHashSetWithInitialSizeSet...       1000  avgt    2    12.740   us/op
givenCollectionsHashSetWithInitialSizeSet...      10000  avgt    2   109.803   us/op
givenCollectionsHashSetWithInitialSizeSet...     100000  avgt    2  1870.696   us/op
givenFastUtilsIntSetWithInitialSizeSet...           100  avgt    2     0.369   us/op
givenFastUtilsIntSetWithInitialSizeSet...          1000  avgt    2     2.351   us/op
givenFastUtilsIntSetWithInitialSizeSet...         10000  avgt    2    37.789   us/op
givenFastUtilsIntSetWithInitialSizeSet...        100000  avgt    2   896.467   us/op

These results make it clear the FastUtils implementation is much more performant than the Java Collections alternative.

5. Big Collections

Another important feature of FastUtils is the ability to use 64-bit arrays. Arrays in Java, by default, are limited to 32 bits.

To get started, let’s take a look at the BigArrays class for Integer types. IntBigArrays provides static methods for working with 2-dimensional Integer arrays. By using these provided methods, we can essentially wrap our array into a more user-friendly 1-dimensional array.

Let’s take a look at how this works.

First, we’ll start by initializing a 1-dimensional array, and converting it into a 2-dimensional array using IntBigArray’s wrap method:

int[] oneDArray = new int[] { 2, 1, 5, 2, 1, 7 };
int[][] twoDArray = IntBigArrays.wrap(oneDArray.clone());

We should make sure to use the clone method to ensure a deep copy of the array.

Now, as we’d do with a List or a Map, we can gain access to the elements using the get method:

int firstIndex = IntBigArrays.get(twoDArray, 0);
int lastIndex = IntBigArrays.get(twoDArray, IntBigArrays.length(twoDArray)-1);

Finally, let’s add some checks to ensure our IntBigArray returns the correct values:

assertEquals(2, firstIndex);
assertEquals(7, lastIndex);

6. Conclusion

In this article, we’ve taken a dive into FastUtils core features.

We looked at some of the type-specific collections that FastUtil offers, before playing around with some BigCollections.

As always, the code can be found over on GitHub


Viewing all articles
Browse latest Browse all 4536

Trending Articles