**Difficulty:** Medium, **Asked-in:** Microsoft, Amazon, Morgan Stanley.

**Key takeaway:** An excellent problem to learn problem solving using sliding window approach. We can solve a lot of problems using a similar idea.

You are given a string of size n, write a program to find the length of the longest substring without repeating characters. Here substring is a continuous subpart of a string.

- Our goal is to find the longest substring that has all unique characters.
- There can be more than one longest substring without repeating characters of equal length. So we only need to return the length of the longest substring.

Input: str = "abcbbcab", Output: 3

Explanation: There are three substrings of unique characters with the longest length of 3: "abc", "bca", and "cab". So the output is 3.

Input: str = "bbbbb", Output: 1

Explanation: There is only one unique character, so the output is 1.

Input: str = "enjoyalgorithms", Output: 11

Explanation: The only repeating character is "o", and the answer is "yalgorithms", which has a length of 11.

- Brute force using nested loops
- Optimized brute force solution
- Sliding window approach

One basic idea is to explore all substrings, and for each substring, we check whether it contains unique characters or not. For a string with n characters, there will be n*(n + 1)/2 substrings.

**Step 1:** We initialize a variable **maxLength** to track longest substring with unique characters.

**Step 2:** Now, we explore all substrings using two nested loops: an outer loop from i = 0 to n - 1, and an inner loop from j = i to n - 1. The idea is to pick a starting character via the outer loop and explore all substrings starting from that character using the inner loop.

- For each substring str[i, j], we use function
**isUnique(str, i, j)**to check if all characters in the substring are unique or not. It will return true if all characters are unique. Otherwise, it will return false. - If function
**isUnique(str, i, j)**returns true, we compare maxLength with the length of the substring. If**maxLength < j - i + 1**, we update the variable maxLength with the value**j - i + 1.**Note: The length of the substring from i to j is j - i + 1.

**Step 3:** By the end of the nested loops, we return the value stored in maxLength.

- To check whether a substring has duplicate characters or not, we use a set
**visited[256]**to store the visited status of each possible character in the substring. - We go through each character from index i to j using a loop. For each character, we check if it is already present in the set visited[] or not. If it is present, we return
**false**. Otherwise, we update the status of that character as**true**in the visited[]. - By the end of the loop, we return true, i.e., the characters are unique in the substring.

```
bool isUnique(string str, int i, int j)
{
bool visited[256] = {false}
for (int k = i; k <= j; k = k + 1)
{
if (visited[str[k]] == true)
return false
visited[str[k]] = true
}
return true
}
int longestSubstring(string str, int n)
{
int maxLength = 0
for (int i = 0; i < n; i = i + 1)
{
for (int j = i; j < n; j = j + 1)
{
if (isUnique(str, i, j) == true)
maxLength = max(maxLength, j - i + 1)
}
}
return maxLength
}
```

We are exploring all n(n + 1)/2 substrings using nested loops. For each substring, function **isUnique(str, i, j)** takes O(j - i + 1) time to check whether the substring str[i, j] has unique characters or not.

Time complexity = n(n + 1)/2 * O( j - i + 1) = O(n^2) * O( j - i + 1). For the upper bound of the worst-case analysis, we consider O(j - i +1) as O(n). So, overall time complexity in the worst case = O(n^2) * O(n) = O(n^3). We are using a few extra variables and a constant-size set visited[]. So space complexity is O(1).

Now, a critical question is: How can we optimize the time complexity of the brute-force approach? In the brute-force, we repeatedly check a substring starting from the character str[i] to see if it has duplicate characters. In other words, we use O(n) time for every substring. Can we reduce this computation?

Here’s an insight: If we know that the substring str[i, j-1] has no duplicate characters, then for the substring str[i, j], we only need to check if str[j] is already present in the substring str[i, j-1]. To do this, we can use a visited array to check whether the character str[j] is in the substring str[i, j-1], which can be done in O(1) time.

**Step 1:** Initialize the variable **maxLength** to keep track of the longest substring.

**Step 2:** Now we run an outer loop from **i = 0 to n - 1** to explore the substring starting from the character str[i]. Inside the loop, we use **visited[256]** to store the status of characters in the current substring.

**Step 3:** Now, we run an inner loop from **j = i to n - 1** to find the longest substring with unique characters starting from the character str[i]. At each iteration of the inner loop, we check if **visited[str[j]]** is false. If it is, we mark the status of str[j] in visited[] and update the **maxLength**. In other words, we are sliding the current window one position to the right by incrementing j until str[j] is present in the substring [i, j - 1].

**Step 4:** The inner loop will stop when the character str[j] is already present in the current window. Then, we move to the next iteration of the outer loop to check the new window of the substring starting from index i + 1.

**Step 5:** At the end of the outer loop, we return the value stored in the variable maxLength.

```
int longestSubstring(string str, int n)
{
int maxLength = 0
for (int i = 0; i < n; i++)
{
bool visited[256] = {false}
for (int j = i; j < n; j++)
{
if (visited[str[j]] == false)
{
maxLength = max(maxLength, j - i + 1)
visited[str[j]] = true
}
else
break // Stop further checking if a duplicate character is found
}
}
return maxLength
}
```

We are using nested loops and performing an O(1) operation at each iteration. The worst case occurs when the inner loop runs until the end every time. In other words, the worst case occurs when all characters in the string are unique. In such a situation, a nested loop will explore all possible substrings i.e. n(n + 1)/2. Time complexity = n(n + 1)/2 * O(1) = O(n^2). We are using a constant size set visited[] of size 256, so space complexity = O(1).

Now, the critical questions are: Can we optimize the above approach further and solve this problem in O(n) time? Can we eliminate the inner loop? Here is an insight from the previous approach!

In the **ith** iteration of the outer loop, we run the inner loop from **j = i** to find the longest substring with unique characters starting from index i. Now in the **(i+1)th** iteration, we ignore the value of j from the previous iteration and reset it to **j = i+1** to run the inner loop. But there’s no need to do this! If the characters in the substring **[i, j]** are unique, then the characters in the substring **[i+1, j]** will also be unique. So, in the next iteration of the outer loop, there’s no need to start from **j = i+1.**

**Note:** Sliding window is an excellent technique for solving array or string problems. A window is a range of elements defined by the start and end indices, where the window "slides" forward by incrementing either the left or right end.

- We scan the string from left to right using two window pointers
**, i**and**j**. We also keep track of the longest length of the substring using the variable**maxLength**and maintain a**visited[256]**table to keep track of visited characters.**Note:**All values in the visited[] table are initially false. - If
**visited[str[j]]**is**false**, we set**visited[str[j]] = true**and slide the current window to the right by incrementing the pointer j. We also update the longest length of the substring with unique characters, i.e.,**maxLength = max(maxLength, j - i).** - If
**visited[str[j]]**is true, we have found a repeated character in the current substring starting from index i. So we set**visited[str[i]] = false**and slide the window to the right by incrementing the pointer i. - This process will end when either pointer reaches the end of the string, i.e., i = n or j = n. At this stage, we return the value stored in the variable
**maxLength**.

```
int longestSubstring(string str, int n)
{
bool visited[256] = {false}
int maxLength = 0
int i = 0, j = 0
while (i < n && j < n)
{
if (visited[str[j]] == false)
{
visited[str[j]] = true
j = j + 1
maxLength = max(maxLength, j - i)
}
else
{
visited[str[i]] = false
i = i + 1
}
}
return maxLength
}
```

At each iteration of the while loop, we either increment pointer i or pointer j. So, the loop will run n times and perform a constant number of operations at each iteration. The time complexity is O(n). We are using a constant-size set visited[] of size 256, so space complexity is O(1).

- Can we optimize the efficient approach further?
- Instead of using the visited[256] table, can we use a hash table to store the status of each character in a substring window?
- Can we think of solving the problem using some other approach?
- How can we modify the above approaches to print all the longest substrings in lexicographic order?
- Explore patterns of problem-solving using the sliding window technique.

- Longest substring with at most two distinct characters
- Longest substring with at most K distinct characters
- Find maximum consecutive 1s in an array
- Max continuous series of 1s
- Count distinct elements in every window
- Sliding window maximum
- Subarrays with K Different Integers

If you have any queries/doubts/feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!