**Difficulty:** Medium, **Asked-in:** Microsoft, Amazon, Morgan Stanley

**Key takeaway:** An excellent problem to learn time complexity optimization using sliding window approach. We can solve a lot of problems using similar idea.

Write a program to find the length of longest substring without repeating characters.

- Substring is a continuous sub-part of the string, where we aim to determine the maximum substring which has all unique characters.
- Input string consists of English letters, digits, and symbols.
- The answer must be continuous substring, not a subsequence of the string.
- There can be more than one longest substring (without repeating characters) of equal length. So, we just need to return the length of longest substring.

Input: str = "abcbbcab", Output: 3

Explanation: There are three substrings of unique characters with the longest length 3: "abc", "bca", and "cab". So we return 3 as an output.

Input: "bbbbb", Output: 1

Explanation: There is only one unique character. So output is 1.

Input: "enjo**yalgorithms**", Output: 11

Explanation: "o" is the only repeating character, and the answer is "yalgorithms".

- Brute force approach using two nested loops
- Using sliding window approach
- Optimized sliding window approach

This is a simple approach where idea is to explore all substrings, and for each substring, we check whether it contains unique characters or not. For string with n characters, there will be n*(n + 1)/2 substrings.

- We initialize variable
**maxLength**to keep track of the longest substring with unique characters. maxLength = 0 -
Now we explore all the possible substrings using two nested loops: outer loop from i = 0 to n - 1 and inner loop from j = i to n - 1. The idea is to pick starting character via outer loop and explore all substrings starting from that character using inner loop.

- For each substring str[i, j] i.e. starting from index i and ending at index j, we use function
**areUnique(str, i, j)**to check if all characters in the substring are unique or not. It will return true if all characters are unique, otherwise false. - If function areUnique(str, i , j) return true, we compare maxLength with length of the substring.
**If(maxLength > j - i + 1)**, we update variable maxLength.**Note:**Length of substring str[i, j] = j - i + 1.

- For each substring str[i, j] i.e. starting from index i and ending at index j, we use function
- By the end of nested loops, we return value stored in maxLength.

- To check whether a substring has duplicate characters or not, we use a set
**visited[256]**to store status of each possible character in the substring i.e. either true or false. - We go through each character from index i to j using loop. For each character, we check if it is already present in set visited[] or not. If so, we return
**false.**Otherwise, we update the status of that character**true**in set visited[]. - By the end of loop, we return true i.e. characters are unique in the substring.

```
int longestSubstring(string str[], int n)
{
int maxLength = 0
for (int i = 0; i < n; i = i + 1)
{
for (int j = i; j < n; j = j + 1)
{
if (areUnique(str, i, j) == true)
maxLength = max(maxLength, j - i + 1)
}
}
return maxLength
}
bool areUnique(string str[], int i, int j)
{
bool visited[256]
for (int k = i; k < j; k = k + 1)
{
if (visited[str[k]] == true)
return false
visited[str[k]] = true
}
return true
}
```

- We are exploring all n(n + 1)/2 substrings using nested loops.
- For each substring, function
**areUnique(str, i, j)**will take O(j - i + 1) time to check substring str[i, j] has unique characters or not. - Time complexity = n(n + 1)/2 * O( j - i + 1) = O(n^2) * O( j - i + 1). For upper bound of the worst-case analysis, we consider O( j - i +1) as O(n). Why? Think! For better understanding of such loop analysis, refer to this blog. So overall time complexity in the worst case = O(n^2) * O(n) = O(n^3)
- We are using few extra variables and constant size set visited[]. Space complexity = O(1).

Now a critical question is: How can we optimize time complexity of brute force approach? Think! The idea would be to apply sliding window approach and track repetition of characters in each substring i.e., whenever we found repetition in the current window, we slide the window and move to the next substring.

Here is optimization insight: In brute force idea, we repeatedly check a substring starting from character str[i] to see if it has duplicate character or not. We can optimize it further because, during the loop, if substring **str[i, j-1]** is already checked to have no duplicate characters, then for substring **str[i, j],** we only need to check if **str[j]** is already present in substring **str[i, j-1]** or not.

We can use hash set or extra array to check whether character str[j] is there in substring str[i, j-1] or not, which can be done in O(1) time complexity. How do we implement this? Let's think!

**Note:** Sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in the array/string defined by the start and ends indices, where window “slides” in a forward direction by incrementing the left or right end.

- We initialize variable
**maxLength**to keep track of the longest substring with unique characters. maxLength = 0 - We run an outer loop till i < n to explore substring window starting from character str[i].
- Now inside loop, we initialize the right end of window i.e. j = i. We also use an array
**visited[256]**as hash set to store status of characters in the current window [i, j). Initially i = j = 0. -
Now we run inner loop to check each substring in window [i, j). This loop will run till j < n && visited[str[j]] == false.

- If str[j] is not present in window [i, j) i.e. if(visited[str[j]] == false): we mark str[j] in visited[] and keep track of the longest substring length. Now we slide the current window to one right by incrementing j and continue the same process in inner loop till str[j] is not present in window [i, j).
- If character str[j] is already present or visited in the current window, we increment i by 1 and go to the next iteration of outer loop. Now we check the new window of substring starting from index i + 1. Note: Before moving to the next iteration of outer loop, we should set status of the character str[i] in set visited[] i.e. visited[str[i]] = false. (Think!)

- The outer loop will stop when the left end of sliding window reaches the end of string i.e., i = n. At this stage, we return the value stored in variable maxLength
**.**

```
int longestSubstring(string str[], int n)
{
int maxLength = 0
int i = 0
While(i < n)
{
bool visited[256]
int j = i
while(j < n && visited[str[j]] == false)
{
maxLength = max(maxLength, j - i + 1)
visited[str[j]] = true
j = j + 1
}
visited[str[i]] = false
i = i + 1
}
return maxLength
}
```

We are using nested loops and performing an O(1) operation at each iteration. The worst-case occur when inner loop runs till the end every time. Or in other words, worst-case will occur when all characters in string are unique. Think! In such situation, nested loop will explore all possible substrings i.e. n(n + 1)/2. Time complexity = n(n + 1)/2 * O(1) = O(n^2).

We are using constant size set visited[] of size 256, so space complexity = O(1).

Now critical questions are: can we optimize the above approach further and solve this problem in O(n) time complexity? Can we think of removing inner loop of the previous approach? Here is a hint:

- At each iteration of outer loop in the previous approach, we ignore the previous window [i, j) and reset j = i to start a new window. In other words, we are starting with new window size 1 at each iteration of outer loop.
- If characters in window [i, j) are unique, then characters in window [i + 1, j) will also be unique. So in the next iteration of outer loop, there is no need to start with a new window size 1 or reset j = i (Think!).

- Scan string from left to right using two window pointers i and j. Also, keep track of the longest length of substring using variable maxLength and maintain a hash table to keep track of visited characters.
- If character str[j] is not present in hash table, we insert it into hash table and slide the current window right by incrementing pointer j. We also update the longest length of substring with unique characters i.e. maxLength = max (maxLength, j - i).
- If character str[j] is already present in hash table, we delete character str[j] from hash table and slide the window right by incrementing pointer i.
- This process will end when any one of the pointers reaches the end of string i.e. i = n or j = n. At this stage, we return value stored in variable maxLength.

```
int longestSubstring(string str[], int n)
{
if (n == 0)
return 0
HashTable H
int maxLength = 0
int i = 0, j = 0
while (i < n && j < n)
{
if (H.search(str[j]) == false)
{
H.insert(str[j])
j = j + 1
maxLength = max(maxLength, j - i)
}
else
{
H.delete(str[i])
i = i + 1
}
}
return maxLength
}
```

At each iteration of while loop, we either increment pointer i or pointer j, based on the comparison if(H.search(str[j]) == false). So loop will run n times and perform constant number of operations at each iteration. Time complexity = O(n).

The algorithm requires O(n) space for hash table. Space complexity = O(n)

- Can we optimize the efficient approach further and solve it using O(1) extra space? Instead of hash table, can we use a set visited[256] to store status of each character in a substring window?
- How do we modify the above approaches to print all longest substrings in lexicographic order?
- Try to explore the precise analysis of brute force approach?
- Explore patterns of problem-solving using the sliding window.

- Longest substring with at most two distinct characters
- Longest substring with At Most K distinct characters
- Find Maximum Consecutive 1s in an Array
- Max Continuous Series of 1s
- Count distinct elements in every window
- Sliding window maximum
- Subarrays with K Different Integers

**Enjoy learning, Enjoy algorithms!**

Subscribe to get weekly content on data structure and algorithms, machine learning, system design and oops.