Longest Substring Without Repeating Characters

Difficulty: Medium, Asked-in: Microsoft, Amazon, Morgan Stanley

Key takeaway: An excellent problem to learn time complexity optimization using sliding window approach. We can solve a lot of problems using a similar idea.

Let’s understand the problem

Write a program to find the length of the longest substring without repeating characters.

  • A substring is a continuous sub-part of a string. We aim to determine the maximum substring that has all unique characters.
  • The input string consists of English letters, digits, and symbols.
  • The answer must be a continuous substring, not a subsequence of the string.
  • There can be more than one longest substring without repeating characters of equal length. Therefore, we only need to return the length of the longest substring.

Example 1

Input: str = "abcbbcab", Output: 3

Explanation: There are three substrings of unique characters with the longest length of 3: "abc", "bca", and "cab". So the output is 3.

Example 2

Input: str = "bbbbb", Output: 1

Explanation: There is only one unique character, so the output is 1.

Example 3

Input: str = "enjoyalgorithms", Output: 11

Explanation: The only repeating character is "o", and the answer is "yalgorithms", which has a length of 11.

Discussed solution approaches

  • Brute force approach  using two nested loops
  • Using sliding window approach
  • Optimized sliding window approach

Brute force approach  using two nested loops

Solution idea and steps

This is a simple approach where the idea is to explore all substrings, and for each substring, we check whether it contains unique characters or not. For a string with n characters, there will be n*(n + 1)/2 substrings.

Step 1: We initialize the variable maxLength to keep track of the longest substring with unique characters: maxLength = 0.

Step 2: Now, we explore all possible substrings using two nested loops: an outer loop from i = 0 to n - 1, and an inner loop from j = i to n - 1. The idea is to pick a starting character via the outer loop and explore all substrings starting from that character using the inner loop.

  • For each substring, str[i, j], i.e., starting from index i and ending at index j, we use the function checkUniqueSubstring(str, i, j) to check if all characters in the substring are unique or not. It will return true if all characters are unique; otherwise, it will return false.
  • If the function checkUniqueSubstring(str, i, j) returns true, we compare maxLength with the length of the substring. If maxLength < j - i + 1, we update the variable maxLength.

Step 3: By the end of the nested loops, we return the value stored in maxLength.

Implementation steps of checkUniqueSubstring(str, i, j)

  • To check whether a substring has duplicate characters or not, we use a set visited[256] to store the status of each possible character in the substring, i.e., either true or false.
  • We go through each character from index i to j using a loop. For each character, we check if it is already present in the set visited[] or not. If it is present, we return false. Otherwise, we update the status of that character as true in the set visited[].
  • By the end of the loop, we return true, i.e., the characters are unique in the substring.

Implementation code C++

bool checkUniqueSubstring(string str, int i, int j)
{
    bool visited[256] = {false};
    for (int k = i; k <= j; k =  k + 1)
    {
        if (visited[str[k]] == true)
            return false;
        visited[str[k]] = true;
    }
    return true;
}

int longestSubstring(string str, int n)
{
    int maxLength = 0;
    for (int i = 0; i < n; i = i + 1)
    {
        for (int j = i; j < n; j = j + 1)
        {
            if (checkUniqueSubstring(str, i, j) == true)
                maxLength = max(maxLength, j - i + 1);
        }
    }
    return maxLength;
}

Implementation code Python

def checkUniqueSubstring(str, i, j):
    visited = [False] * 256
    for k in range(i, j+1):
        if visited[ord(str[k])]:
            return False
        visited[ord(str[k])] = True
    return True

def longestSubstring(str, n):
    maxLength = 0
    for i in range(n):
        for j in range(i, n):
            if checkUniqueSubstring(str, i, j):
                maxLength = max(maxLength, j - i + 1)
    return maxLength

Implementation code Java

class Solution
{
    private boolean checkUniqueSubstring(String str, int i, int j) 
    {
        boolean[] visited = new boolean[256];
        for (int k = i; k <= j; k = k + 1) 
        {
            if (visited[str.charAt(k)])
                return false;

            visited[str.charAt(k)] = true;
        }
        return true;
    }

    public int longestSubstring(String str, int n) 
    {
        int maxLength = 0;
        for (int i = 0; i < n; i = i + 1) 
        {
            for (int j = i; j < n; j = j + 1) 
            {
                if (checkUniqueSubstring(str, i, j))
                    maxLength = Math.max(maxLength, j - i + 1);
            }
        }
        return maxLength;
    }
}

Time and space complexity analysis

We are exploring all n(n + 1)/2 substrings using nested loops. For each substring, function areUnique(str, i, j) takes O(j - i + 1) time to check whether the substring str[i, j] has unique characters or not.

Time complexity = n(n + 1)/2 * O( j - i + 1) = O(n^2) * O( j - i + 1). For the upper bound of the worst-case analysis, we consider O(j - i +1) as O(n). Why? Think! For a better understanding of loop analysis, refer to the analysis of loop blog. So, the overall time complexity in the worst case = O(n^2) * O(n) = O(n^3).

We are using a few extra variables and a constant-size set visited[]. Therefore, the space complexity is O(1).

Using  sliding window technique

Solution idea

Now a critical question is: How can we optimize the time complexity of the brute force approach? Think! The idea would be to apply sliding window approach and track repetition of characters in each substring, i.e., whenever we found repetition in the current window, we slide the window and move to the next substring.

Here is an optimization insight: In brute force idea, we repeatedly check a substring starting from character str[i] to see if it has duplicate characters or not. We can optimize it further because, during the loop, if substring str[i, j-1] is already checked to have no duplicate characters, then for substring str[i, j], we only need to check if str[j] is already present in substring str[i, j-1] or not.

We can use a hash set or extra array to check whether character str[j] is there in substring str[i, j-1] or not, which can be done in O(1) time complexity. How do we implement this? Let's think!

Note: Sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in the array/string defined by the start and end indices, where the window "slides" in a forward direction by incrementing the left or right end.

Solution steps

Step 1: We initialize the variable maxLength to keep track of the longest substring with unique characters i.e. maxLength = 0.

Step 2: We run an outer loop until i < n to explore the substring window starting from character str[i].

Step 3: Inside the loop, we initialize the right end of the window i.e., j = i. We also use an array visited[256] as a hash set to store the status of characters in the current window [i, j). Initially, i = j = 0.

Step 4: We run an inner loop to check each substring in the window [i, j). This loop will run until j < n && visited[str[j]] == false.

  • If str[j] is not present in the window [i, j) i.e., if visited[str[j]] == false , we mark str[j] in visited[] and keep track of the longest substring length. Now we slide the current window to one right by incrementing j and continue the same process in the inner loop until str[j] is not present in the window [i, j).
  • If the character str[j] is already present or visited in the current window, we increment i by 1 and go to the next iteration of the outer loop. Now we check the new window of the substring starting from index i + 1. Note: Before moving to the next iteration of the outer loop, we should set the status of the character str[i] in the set visited[] i.e., visited[str[i]] = false. (Think!)

Step 5: The outer loop will stop when the left end of the sliding window reaches the end of the string i.e., i = n. At this stage, we return the value stored in the variable maxLength.

Solution code C++

int longestSubstring(string str, int n)
{
    int maxLength = 0;
    int i = 0;
    while (i < n)
    {
        bool visited[256] = {false};
        int j = i;
        while (j < n && visited[str[j]] == false)
        {
            maxLength = max(maxLength, j - i + 1);
            visited[str[j]] = true;
            j = j + 1;
        }
        visited[str[i]] = false;
        i = i + 1;
    }
    return maxLength;
}

Solution code Python

def longestSubstring(str, n):
    maxLength = 0
    i = 0
    while i < n:
        visited = [False] * 256
        j = i
        while j < n and not visited[ord(str[j])]:
            maxLength = max(maxLength, j - i + 1)
            visited[ord(str[j])] = True
            j = j + 1
        visited[ord(str[i])] = False
        i = i + 1
    return maxLength

Solution code Java

class Solution
{
    public int longestSubstring(String str, int n) 
    {
        int maxLength = 0;
        int i = 0;
        while (i < n) 
        {
            boolean[] visited = new boolean[256];
            int j = i;
            while (j < n && !visited[str.charAt(j)]) 
            {
                maxLength = Math.max(maxLength, j - i + 1);
                visited[str.charAt(j)] = true;
                j = j + 1;
            }
            visited[str.charAt(i)] = false;
            i = i + 1;
        }
        return maxLength;
    }
}

Time and space complexity analysis

We are using nested loops and performing an O(1) operation at each iteration. The worst-case occur when inner loop runs till the end every time. Or in other words, worst-case will occur when all characters in string are unique. Think! In such situation, nested loop will explore all possible substrings i.e. n(n + 1)/2. Time complexity = n(n + 1)/2 * O(1) = O(n^2).

We are using constant size set visited[] of size 256, so space complexity = O(1).

Optimized sliding window solution

Solution idea

Now, the critical questions are: Can we optimize the above approach further and solve this problem in O(n) time complexity? Can we think of removing the inner loop of the previous approach? Here is a hint:

  • At each iteration of the outer loop in the previous approach, we ignore the previous window [i, j) and reset j = i to start a new window. In other words, we are starting with a new window size of 1 at each iteration of the outer loop.
  • If the characters in window [i, j) are unique, then the characters in window [i + 1, j) will also be unique. So, in the next iteration of the outer loop, there is no need to reset j = i (Think!).

Solution steps

  • We scan the string from left to right using two window pointers, i and j. We also keep track of the longest length of the substring using the variable maxLength and maintain a visited[256] table to keep track of visited characters. Note: All values in the visited[] table are initially false.
  • If visited[str[j]] is false, we set visited[str[j]] = true and slide the current window to the right by incrementing the pointer j. We also update the longest length of the substring with unique characters, i.e., maxLength = max(maxLength, j - i).
  • If the character str[j] is already marked true in the visited[] table, we have found a repeated character in the current substring starting from index i. So we set visited[str[j]] = false and slide the window to the right by incrementing the pointer i.
  • This process will end when either pointer reaches the end of the string, i.e., i = n or j = n. At this stage, we return the value stored in the variable maxLength.

Solution code C++

int longestSubstring(string str, int n)
{
    if (n == 0)
        return 0;
    bool visited[256] = {false};
    int maxLength = 0;
    int i = 0, j = 0;
    while (i < n && j < n)
    {
        if (visited[str[j]] == false)
        {
            visited[str[j]] = true;
            j = j + 1;
            maxLength = max(maxLength, j - i);
        }
        else
        {
            visited[str[i]] = false;
            i = i + 1;
        }
    }
    return maxLength;
}

Solution code Python

def longestSubstring(str, n):
    if n == 0:
        return 0
    visited = [False] * 256
    maxLength = 0
    i = 0
    j = 0
    while i < n and j < n:
        if not visited[ord(str[j])]:
            visited[ord(str[j])] = True
            j = j + 1
            maxLength = max(maxLength, j - i)
        else:
            visited[ord(str[i])] = False
            i = i + 1
    return maxLength

Solution code Java

class Solution
{
    public int longestSubstring(String str, int n) 
    {
        if (n == 0)
            return 0;
    
        boolean[] visited = new boolean[256];
        int maxLength = 0;
        int i = 0;
        int j = 0;
        while (i < n && j < n) 
        {
            if (visited[str.charAt(j)] == false) 
            {
                visited[str.charAt(j)] = true;
                j = j + 1;
                maxLength = Math.max(maxLength, j - i);
            } 
            else 
            {
                visited[str.charAt(i)] = false;
                i = i + 1;
            }
        }
        return maxLength;
    }
}

Time and space complexity analysis

At each iteration of the while loop, we either increment pointer i or pointer j. So, the loop will run n times and perform a constant number of operations at each iteration. The time complexity is O(n).

We are using a constant-size set visited[] of size 256, so the space complexity is O(1).

Critical ideas to think!

  • Can we optimize the efficient approach further?
  • Instead of using the visited[256] table, can we use a hash table to store the status of each character in a substring window?
  • Can we think of solving the problem using some other approach?
  • How can we modify the above approaches to print all the longest substrings in lexicographic order?
  • Explore patterns of problem-solving using the sliding window technique.

Suggested coding problems to practice

If you have any queries/doubts/feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!

Share Your Insights

☆ 16-week live DSA course
☆ 16-week live ML course
☆ 10-week live DSA course

More from EnjoyAlgorithms

Self-paced Courses and Blogs

Coding Interview

Machine Learning

System Design

Our Newsletter

Subscribe to get well designed content on data structure and algorithms, machine learning, system design, object orientd programming and math.