Quantcast
Channel: Baeldung
Viewing all articles
Browse latest Browse all 4535

HashSet and TreeSet Comparison

$
0
0

1. Introduction

In this article, we are going to compare two of the most popular Java implementations of the java.util.Set interface – HashSet and TreeSet.

2. Differences

HashSet and TreeSet are leaves of the same branch, but they differ in few important matters.

2.1. Ordering

HashSet stores the objects in random order, whereas TreeSet applies the natural order of the elements. Let’s see the following example:

@Test
public void givenTreeSet_whenRetrievesObjects_thenNaturalOrder() {
    Set<String> set = new TreeSet<>();
    set.add("Baeldung");
    set.add("is");
    set.add("Awesome");
 
    assertEquals(3, set.size());
    assertTrue(set.iterator().next().equals("Awesome"));
}

After adding the String objects into TreeSet, we see that the first one is “Awesome”, even though it was added at the very end. A similar operation done with HashSet does not guarantee that the order of elements will remain constant over time.

2.2. Null Objects

Another difference is that HashSet can store null objects, while TreeSet does not allow them:

@Test(expected = NullPointerException.class)
public void givenTreeSet_whenAddNullObject_thenNullPointer() {
    Set<String> set = new TreeSet<>();
    set.add("Baeldung");
    set.add("is");
    set.add(null);
}

@Test
public void givenHashSet_whenAddNullObject_thenOK() {
    Set<String> set = new HashSet<>();
    set.add("Baeldung");
    set.add("is");
    set.add(null);
 
    assertEquals(3, set.size());
}

If we try to store the null object in a TreeSet, the operation will result in a thrown NullPointerException. The only exception was in Java 7 when it was allowed to have exactly one null element in the TreeSet.

2.3. Performance

Simply put, HashSet is faster than the TreeSet.

HashSet provides constant-time performance for most operations like add(), remove() and contains(), versus the log(n) time offered by the TreeSet. 

Usually, we can see that the execution time for adding elements into TreeSet is much better than for the HashSet. 

Please remember that the JVM might be not warmed up, so the execution times can differ. A good discussion how to design and perform micro tests using various Set implementations is available here.

2.4. Implemented Methods

TreeSet is rich in functionalities, implementing additional methods like:

  • pollFirst() – to return the first element, or null if Set is empty
  • pollLast() – to retrieve and remove the last element, or return null if Set is empty
  • first() – to return the first item
  • last() – to return the last item
  • ceiling() – to return the least element greater than or equal to the given element, or null if there is no such element
  • lower() – to return the largest element strictly less than the given element, or null if there is no such element

The methods mentioned above make TreeSet much easier to use and more powerful than HashSet.

3. Similarities

3.1. Unique Elements

Both TreeSet and HashSet guarantee a duplicate-free collection of elements, as it is a part of the generic Set interface:

@Test
public void givenHashSetAndTreeSet_whenAddDuplicates_thenOnlyUnique() {
    Set<String> set = new HashSet<>();
    set.add("Baeldung");
    set.add("Baeldung");
 
    assertTrue(set.size() == 1);
        
    Set<String> set2 = new TreeSet<>();
    set2.add("Baeldung");
    set2.add("Baeldung");
 
    assertTrue(set2.size() == 1);
}

3.2. Not synchronized

None of the described Set implementations are synchronized. This means that if multiple threads access a Set concurrently, and at least one of the threads modifies it, then it must be synchronized externally.

3.3. Fail-fast Iterators

The Iterators returned by TreeSet and HashSet are fail-fast.

That means that any modification of the Set at any time after the Iterator is created will throw a ConcurrentModificationException:

@Test(expected = ConcurrentModificationException.class)
public void givenHashSet_whenModifyWhenIterator_thenFailFast() {
    Set<String> set = new HashSet<>();
    set.add("Baeldung");
    Iterator<String> it = set.iterator();

    while (it.hasNext()) {
        set.add("Awesome");
        it.next();
    }
}

4. Which Implementation to use?

Both implementations fulfill the contract of the idea of a set so it’s up to the context which implementation we might use.

Here are few quick points to remember:

  • If we want to keep our entries sorted, we need to go for the TreeSet
  • If we value performance more than memory consumption, we should go for the HashSet
  • If we are short on memory, we should go for the TreeSet
  • If we want to access elements that are relatively close to each other according to their natural ordering, we might want to consider TreeSet because it has greater locality
  • HashSet‘s performance can be tuned using the initialCapacity and loadFactor, which is not possible for the TreeSet
  • If we want to preserve insertion order and benefit from constant time access, we can use the LinkedHashSet

5. Conclusion

In this article, we covered the differences and similarities between TreeSet and HashSet.

As always, the code examples for this article are available over on GitHub.


Viewing all articles
Browse latest Browse all 4535

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>