Difficulty: Hard, Asked-in: Google, Amazon, LinkedIn, Walmart, Zoho
Key takeaway: An excellent problem to learn problem-solving and step-by-step optimization using sorting and a hash table.
Given an array X[] of n integers, write a program to find the length of the longest consecutive elements sequence. In other words, we need to find the length of the longest subsequence such that elements in the subsequence are consecutive integers. The elements may appear in any order in the array.
Example 1
Input: X[] = [4, 7, 1, 2, 8, 10, 3], Output: 4
Explanation: [1, 2, 3, 4] is the longest subsequence of consecutive elements.
Example 2
Input: X[] = [0, -3, 5, -1, 7, -2, -4, 1, 3], Output: 6
Explanation: [-4, -3, -2, -1, 0, 1] is the longest consecutive sequence, and its length is 6. So, we return 6 as an output.
Example 3
Input: X[] = [0, 3, 7, 2, 5, 8, 4, 6, 0, 2, 1], Output: 9
Explanation: Here 0 and 2 are repeated, but all the unique integers are part of the longest consecutive sequence, i.e., 0, 1, 2, 3, 4, 5, 6, 7, 8.
The longest consecutive sequence must be starting from some element in the array. So the basic idea would be to explore each possibility: pick each element in the input and do a linear search to count the length of the longest consecutive sequence starting from that element. We also keep track of the longest length of consecutive sequences seen so far during this process. The critical question is: how do we implement this? Let's think!
We run a loop from i = 0 to n - 1 to traverse the input array. Inside the loop, we initialize two variables: currLength to store the length of the consecutive sequence starting from X[i], and currElement to track the current element of that sequence. Initially, currElement = X[i] and currLength = 1. We then keep searching linearly for the next consecutive value, extend the sequence while it is found, and finally update longestLength:
while (searchNext(X, n, currElement + 1) == true)
{
    currElement = currElement + 1
    currLength = currLength + 1
}
if (longestLength < currLength)
    longestLength = currLength
int longestConsecutive(int X[], int n)
{
    int longestLength = 0
    for (int i = 0; i < n; i = i + 1)
    {
        // Count the consecutive sequence that starts at X[i]
        int currElement = X[i]
        int currLength = 1
        while (searchNext(X, n, currElement + 1) == true)
        {
            currElement = currElement + 1
            currLength = currLength + 1
        }
        longestLength = max(longestLength, currLength)
    }
    return longestLength
}
// Linear search: returns true if k is present in X[]
bool searchNext(int X[], int n, int k)
{
    for (int i = 0; i < n; i = i + 1)
    {
        if (X[i] == k)
            return true
    }
    return false
}
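The pseudocode above can be sketched in runnable C++ as follows. This is a minimal illustration, not a definitive implementation: the use of std::vector and the function name longestConsecutiveBrute are assumptions made for this example, and currElement is widened to long long only so that currElement + 1 cannot overflow.

```cpp
#include <algorithm>
#include <vector>

// Linear search: true if k occurs anywhere in X.
bool searchNext(const std::vector<int>& X, long long k) {
    for (int value : X) {
        if (value == k) return true;
    }
    return false;
}

// Hypothetical name; brute force in O(n^3) time and O(1) extra space.
int longestConsecutiveBrute(const std::vector<int>& X) {
    int longestLength = 0;
    for (int start : X) {
        long long currElement = start;  // long long avoids overflow on + 1
        int currLength = 1;
        // Extend the sequence while the next consecutive value exists.
        while (searchNext(X, currElement + 1)) {
            ++currElement;
            ++currLength;
        }
        longestLength = std::max(longestLength, currLength);
    }
    return longestLength;
}
```

For the inputs of Example 1, Example 2, and Example 3, this sketch returns 4, 6, and 9 respectively, and it returns 0 for an empty array.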
For each element X[i], we are finding the length of the longest consecutive streak of integers using the inner loop. So overall time complexity = n * Time complexity of finding the longest consecutive streak starting from each element = n * Time complexity of inner while loop
The time complexity of the inner while loop depends on two things: 1) The length of the longest consecutive streak starting from a given element (This could be n in the worst case) and 2) The time complexity of searching an element in the streak linearly (This is O(n) in the worst case).
The time complexity of the inner while loop in the worst case = n * O(n) = O(n²). This process is repeated for each element of the array. So the overall time complexity = n * O(n²) = O(n³).
Space complexity = O(1), as we are using a constant number of variables.
Suppose we sort the input and iterate over each element. In that case, it will be easy to find sequences of consecutive numbers, because consecutive values will be placed next to each other in the sorted array. Think!
Step 1: We initialize two variables currLength and longestLength to track the current length of the consecutive sequence and length of the longest consecutive sequence.
Step 2: We sort the input in increasing order. Suppose we are using an efficient in-place sorting algorithm such as heap sort.
Step 3: Now we traverse the sorted array and compare each element X[i] to its previous element X[i - 1]. If X[i] == X[i - 1], the element is a duplicate and we skip it. If X[i] == X[i - 1] + 1, X[i] extends the current sequence, so we increment currLength by 1. Otherwise, the current sequence has ended: we first update longestLength with currLength and then reset currLength to 1.
By the end of the loop, the last element X[n - 1] may be part of the longest sequence. In other words, if X[n - 1] == X[n - 2] + 1, then X[n - 1] extends the sequence ending at X[n - 2] and currLength is incremented by 1. The loop then ends, so this updated value of currLength is never compared with longestLength inside the loop. To handle this, we return the maximum of currLength and longestLength after the loop, i.e., return max(currLength, longestLength).
int longestConsecutive(int X[], int n)
{
    if (n == 0)
        return 0
    int longestLength = 0
    heapSort(X, n)
    int currLength = 1
    for (int i = 1; i < n; i = i + 1)
    {
        // Duplicates neither extend nor break the current sequence
        if (X[i] != X[i - 1])
        {
            if (X[i] == X[i - 1] + 1)
                currLength = currLength + 1
            else
            {
                // Record the finished sequence before resetting the counter
                longestLength = max(longestLength, currLength)
                currLength = 1
            }
        }
    }
    return max(longestLength, currLength)
}
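The sorting approach can be sketched in runnable C++ as follows. This is an illustration under stated assumptions: the function name longestConsecutiveSorted is hypothetical, and the in-place heap sort is performed with the standard library calls std::make_heap and std::sort_heap in place of the heapSort(X, n) helper used in the pseudocode. Note that the function reorders the caller's array.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical name; sorts X in place, then scans it once.
int longestConsecutiveSorted(std::vector<int>& X) {
    if (X.empty()) return 0;

    // In-place heap sort: build a max-heap, then sort it ascending.
    std::make_heap(X.begin(), X.end());
    std::sort_heap(X.begin(), X.end());

    int longestLength = 0;
    int currLength = 1;
    for (std::size_t i = 1; i < X.size(); ++i) {
        if (X[i] == X[i - 1]) continue;  // skip duplicates
        // Widen to long long so X[i - 1] + 1 cannot overflow.
        if (static_cast<long long>(X[i - 1]) + 1 == X[i]) {
            ++currLength;
        } else {
            // Record the finished sequence before resetting the counter.
            longestLength = std::max(longestLength, currLength);
            currLength = 1;
        }
    }
    // The last sequence is never recorded inside the loop.
    return std::max(longestLength, currLength);
}
```

If the input must stay unchanged, one option is to pass the vector by value instead, at the cost of O(n) extra space for the copy.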
Suppose we are using an efficient O(nlogn) sorting algorithm like merge sort, heap sort, or quicksort (quicksort is O(nlogn) on average). So time complexity = Time complexity of sorting + Linear traversal of the array = O(nlogn) + O(n) = O(nlogn)
Space complexity: If we use heap sort, O(1), else if we use merge sort, O(n).
In the previous solution, sorting helped us calculate the longest sequence in O(n), but the sorting algorithm still dominates overall time complexity. The critical question is: how can we optimize the time complexity further? Let’s think.
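The solution idea is inspired by the brute force approach. Instead of using linear search to find the next element in the sequence, can we use a hash table? As we know, a hash table supports searching in O(1) time on average. Think! If we observe the problem clearly, there are two types of elements in the array: Type 1, elements that are the starting element of a consecutive sequence (X[i] - 1 is not present in the array), and Type 2, elements that belong to a consecutive sequence but are not its starting element (X[i] - 1 is present in the array).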
If we know the starting element of any consecutive sequence (Type 1), we can easily calculate the length of the sequence by searching all the next successive elements. So one solution idea would be to identify all elements of type 1, calculate the consecutive sequence length starting from any such element, and return max among them.
If we observe the sorted array approach, we are doing a similar process. When we encounter a different starting element, we reset the sequence length and update the max sequence length seen so far. But how do we implement this idea using a hash table? Let's think!
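We first create a hash table H and insert all n elements into it. Now again, we traverse each element X[i] using a loop: if X[i] - 1 is not present in H, then X[i] is the starting element of a sequence, so we count the length of that sequence by repeatedly searching for the next consecutive value in H and update longestLength. Otherwise, X[i] is not a starting element and we skip it.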
int longestConsecutive(int X[], int n)
{
    HashTable H
    int longestLength = 0
    for (int i = 0; i < n; i = i + 1)
        H.insert(X[i])
    for (int i = 0; i < n; i = i + 1)
    {
        // X[i] starts a sequence only if X[i] - 1 is absent
        if (H.search(X[i] - 1) == false)
        {
            int currLength = 1
            int currElement = X[i]
            while (H.search(currElement + 1) == true)
            {
                currLength = currLength + 1
                currElement = currElement + 1
            }
            longestLength = max(longestLength, currLength)
        }
    }
    return longestLength
}
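The hash table approach can be sketched in runnable C++ as follows. This is a minimal illustration, not a definitive implementation: std::unordered_set stands in for the generic HashTable of the pseudocode, and the function name longestConsecutiveHashed is hypothetical. As a deliberate deviation from the pseudocode, the second loop iterates over the distinct values stored in the set rather than over X[], so a starting value that is repeated many times in the input is processed only once. Keys are stored as long long so that value - 1 and currElement + 1 cannot overflow.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Hypothetical name; O(n) time on average, O(n) extra space.
int longestConsecutiveHashed(const std::vector<int>& X) {
    // Insert every element; duplicates collapse into one key.
    std::unordered_set<long long> H(X.begin(), X.end());

    int longestLength = 0;
    for (long long value : H) {
        // Skip values that are not the start of a sequence.
        if (H.count(value - 1) > 0) continue;

        long long currElement = value;
        int currLength = 1;
        // Walk forward while the next consecutive value is present.
        while (H.count(currElement + 1) > 0) {
            ++currElement;
            ++currLength;
        }
        longestLength = std::max(longestLength, currLength);
    }
    return longestLength;
}
```

For the inputs of Example 1, Example 2, and Example 3, this sketch returns 4, 6, and 9 respectively.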
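At first sight, the time complexity appears to be quadratic due to the two nested loops. But it requires a closer look, because the while loop runs only when X[i] is the beginning of a sequence. A better idea is to count the critical operations inside the loops: every element is searched once as X[i] - 1 in the if condition, and when the sequence starts are distinct, the while loop performs one search per element of each sequence across the entire run, which adds up to n searches in total.
Overall time complexity = Time complexity of inserting n elements into hash table + time complexity of searching n elements twice = n*O(1) + 2*n*O(1) = O(n). This count assumes each sequence start is processed once; if the same starting value occurs many times in X[], the while loop repeats for every copy, which can be avoided by iterating over the distinct values stored in the hash table.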
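To make this operation count concrete, here is a small instrumented sketch that mirrors the pseudocode (it loops over X[] itself) and counts every hash table search. The function name countHashLookups and the use of std::unordered_set are assumptions made for this illustration, and the sketch assumes the values are not at the extreme ends of the int range.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the number of hash table searches
// performed by the two-loop algorithm on input X.
long long countHashLookups(const std::vector<int>& X) {
    std::unordered_set<int> H(X.begin(), X.end());
    long long lookups = 0;
    for (int x : X) {
        ++lookups;                          // search for x - 1
        if (H.count(x - 1) > 0) continue;   // x is not a sequence start
        int curr = x;
        while (true) {
            ++lookups;                      // search for curr + 1
            if (H.count(curr + 1) == 0) break;
            ++curr;
        }
    }
    return lookups;
}
```

For the array of Example 1 (n = 7, all values distinct), the sketch reports 14 searches, which is exactly 2n. For [1, 1, 1, 1, 2, 3, 4, 5] (n = 8), it reports 28 searches, because the sequence starting at 1 is walked once for each of the four copies of 1.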
Space Complexity = O(n), for the hash table.
Enjoy learning, Enjoy Algorithms!