Quantcast
Channel: Baeldung
Viewing all articles
Browse latest Browse all 4535

Find the Longest Substring without Repeating Characters

$
0
0

1. Overview

In this tutorial, compare ways to find the longest substring of unique letters using Java. For example, the longest substring of unique letters in “CODINGISAWESOME” is “NGISAWE”.

2. Brute Force Approach

Let’s start with a naive approach. To begin with, we can examine each substring whether it contains unique characters:

String getUniqueCharacterSubstringBruteForce(String input) {
    String output = "";
    for (int start = 0; start < input.length(); start++) {
        Set<Character> visited = new HashSet<>();
        int end = start;
        for (; end < input.length(); end++) {
            char currChar = input.charAt(end);
            if (visited.contains(currChar)) {
                break;
            } else {
                visited.add(currChar);
            }
        }
        if (output.length() < end - start + 1) {
            output = input.substring(start, end);
        }
    }
    return output;
}

Since there are n*(n+1)/2 possible substrings, the time complexity of this approach is O(n^2).

3. Optimized Approach

Now, let’s take a look at an optimized approach. We start traversing the string from left to right and maintain track of:

  1. the current substring with non-repeating characters with the help of a start and end index
  2. the longest non-repeating substring output
  3. a lookup table of already visited characters
String getUniqueCharacterSubstring(String input) {
    Map<Character, Integer> visited = new HashMap<>();
    String output = "";
    for (int start = 0, end = 0; end < input.length(); end++) {
        char currChar = input.charAt(end);
        if (visited.containsKey(currChar)) {
            start = Math.max(visited.get(currChar)+1, start);
        }
        if (output.length() < end - start + 1) {
            output = input.substring(start, end + 1);
        }
        visited.put(currChar, end);
    }
    return output;
}

For every new character, we look for it in the already visited characters. If the character has already been visited and is part of the current substring with non-repeating characters, we update the start index. Otherwise, we’ll continue traversing the string.

Since we are traversing the string only once, the time complexity will be linear, or O(n).

This approach is also known as the sliding window pattern.

4. Testing

Finally, let’s test thoroughly our implementation to make sure it works:

@Test
void givenString_whenGetUniqueCharacterSubstringCalled_thenResultFoundAsExpected() {
    assertEquals("", getUniqueCharacterSubstring(""));
    assertEquals("A", getUniqueCharacterSubstring("A"));
    assertEquals("ABCDEF", getUniqueCharacterSubstring("AABCDEF"));
    assertEquals("ABCDEF", getUniqueCharacterSubstring("ABCDEFF"));
    assertEquals("NGISAWE", getUniqueCharacterSubstring("CODINGISAWESOME"));
    assertEquals("be coding", getUniqueCharacterSubstring("always be coding"));
}

Here, we try and test boundary conditions as well as the more typical use cases.

5. Conclusion

In this tutorial, we have learned how to use the sliding window technique to find the longest substring with non-repeating characters. As always, the source code is available on GitHub.


Viewing all articles
Browse latest Browse all 4535

Trending Articles