Difficulty: Medium, Asked-In: Google, Facebook
Key takeaway: An excellent problem to learn time complexity optimization using various approaches. The two-pointer and hash table solutions are intuitive and worth exploring.
Given two unsorted arrays X[] and Y[] of sizes m and n, write a program to find the intersection of these two arrays.
Example 1
Input: X[] = [3, 4, 6, 2, 8, 5, 9], Y[] = [6, 3, 2, 7, 5, 1]
Output: [3, 6, 2, 5]
Explanation: Common elements are 3, 6, 2, and 5. So the intersection of both arrays is 3, 6, 2, and 5.
Example 2
Input: X[] = [3, 4, 6, 7, 10, 12, 5], Y[] = [7, 11, 18, 15, 12]
Output: [7, 12]
Explanation: Common elements are 7 and 12. So the intersection of both arrays is 7 and 12.
Example 3
Input: X[] = [3, 4, 6, 10, 5], Y[] = [7, 11, 18, 15, 12]
Output: []
Explanation: There are no common elements, so the output is an empty array.
A basic idea would be to pick each element from X[] and search linearly in Y[]. We add an element to the output list if it is present in both X[] and Y[].
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    for (int i = 0; i < m; i = i + 1)
    {
        for (int j = 0; j < n; j = j + 1)
        {
            if (X[i] == Y[j])
            {
                intersection.push_back(X[i]);
                break;
            }
        }
    }
    return intersection;
}
We are running two nested loops and performing O(1) operations at each iteration. In the worst case, the inner loop runs n times, so time complexity = O(m) * O(n) * O(1) = O(mn). We are using constant extra space, so space complexity = O(1). Why are we not considering the output list intersection as a part of space complexity? Think!
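For a quick sanity check, the brute-force idea can also be sketched in a self-contained form using std::vector (the vector-based signature and the name findIntersectionBrute are assumptions of this sketch, not part of the original interface):

```cpp
#include <vector>
using namespace std;

// Brute-force intersection: for each element of X, scan Y linearly.
vector<int> findIntersectionBrute(const vector<int>& X, const vector<int>& Y)
{
    vector<int> intersection;
    for (int x : X)
    {
        for (int y : Y)
        {
            if (x == y)
            {
                intersection.push_back(x); // found a common element
                break;                     // stop scanning Y for this x
            }
        }
    }
    return intersection;
}
```

Running it on Example 1 returns the common elements in the order they appear in X[].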
The critical question is: How can we improve the time complexity of the above approach? Searching is the essential operation in the above code. Can we improve the efficiency of searching element X[i] in Y[]? One simple idea: if we sort the array Y[], we can search for each element X[i] using binary search. This helps improve time complexity because binary search takes O(logn) time.
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    heapSort(Y, n);
    for (int i = 0; i < m; i = i + 1)
    {
        if (binarySearch(Y, n, X[i]) != -1)
            intersection.push_back(X[i]);
    }
    return intersection;
}
int binarySearch(int Y[], int n, int target)
{
    int l = 0, r = n - 1;
    while (l <= r)
    {
        int mid = l + (r - l)/2;
        if (Y[mid] == target)
            return mid;
        else if (Y[mid] > target)
            r = mid - 1;
        else
            l = mid + 1;
    }
    return -1;
}
We are using heap sort and iterative binary search for the implementation. Time complexity = time complexity of sorting array Y[] using heap sort + time complexity of searching each element X[i] in Y[] using binary search = O(nlogn) + m * O(logn) = O(nlogn + mlogn). If m > n, then mlogn > nlogn, and the time complexity would be O(mlogn).
Space complexity = Space complexity of heap sort + Space complexity of iterative binary search = O(1) + O(1) = O(1). What would be the space complexity if we use merge sort and recursive binary search for the implementation? Think!
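For reference, the same sort-then-search approach can be sketched with the C++ standard library, letting std::sort and std::binary_search stand in for the hand-written heapSort and binarySearch (std::sort is an introsort rather than a heap sort, but it is also O(nlogn); the vector-based signature here is an assumption of this sketch):

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Sort Y once, then probe it with binary search for every element of X.
// Overall cost: O(nlogn) for sorting + m * O(logn) for the searches.
vector<int> findIntersectionBinarySearch(const vector<int>& X, vector<int> Y)
{
    vector<int> intersection;
    sort(Y.begin(), Y.end()); // Y is passed by value, so the caller's array stays intact
    for (int x : X)
    {
        if (binary_search(Y.begin(), Y.end(), x))
            intersection.push_back(x);
    }
    return intersection;
}
```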
In the above approach, the overall time complexity of searching is O(mlogn). The critical question is: Can we improve the time complexity of searching further? Here is an idea: if we sort both arrays, we can apply a two-pointer approach, similar to the merging procedure of merge sort, to find the common elements. Think!
After sorting both arrays, we start by comparing one element of X[] with one element of Y[]. If both are equal, we add the element to the output list and move to the next element in both arrays. Otherwise, we move to the next element in the array containing the smaller element. Think!
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    heapSort(X, m);
    heapSort(Y, n);
    int i = 0, j = 0;
    while (i < m && j < n)
    {
        if (X[i] == Y[j])
        {
            intersection.push_back(X[i]);
            i = i + 1;
            j = j + 1;
        }
        else if (X[i] < Y[j])
            i = i + 1;
        else
            j = j + 1;
    }
    return intersection;
}
For m > n, if we compare the time complexity of this approach with the previous approach, it is actually not better! We have improved the time complexity of searching by sorting both arrays, but now the overall time complexity is dominated by sorting, which is O(mlogm + nlogn) = O(mlogm). The previous approach runs in O(mlogn), which is better than O(mlogm). Think!
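A self-contained sketch of the two-pointer approach, using std::sort for the sorting step (the vector-based signature is an assumption of this sketch; note that, unlike the earlier approaches, the output comes out in sorted order):

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Sort both arrays, then walk them in lockstep like the merge step
// of merge sort, collecting elements that appear in both.
vector<int> findIntersectionTwoPointer(vector<int> X, vector<int> Y)
{
    sort(X.begin(), X.end());
    sort(Y.begin(), Y.end());
    vector<int> intersection;
    size_t i = 0, j = 0;
    while (i < X.size() && j < Y.size())
    {
        if (X[i] == Y[j])
        {
            intersection.push_back(X[i]); // common element
            i = i + 1;
            j = j + 1;
        }
        else if (X[i] < Y[j])
            i = i + 1;                    // advance past the smaller value
        else
            j = j + 1;
    }
    return intersection;
}
```

On Example 1, this returns [2, 3, 5, 6]: the same common elements as before, but in sorted order.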
We can improve the time complexity of searching further using a hash table, because it can perform searching in O(1) average time. The idea: insert all elements of Y[] into a hash table and search for each element X[i]. If element X[i] is present in the hash table, we add it to the output list.
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    HashTable H;
    for (int i = 0; i < n; i = i + 1)
        H.insert(Y[i]);
    for (int i = 0; i < m; i = i + 1)
    {
        if (H.search(X[i]) == true)
            intersection.push_back(X[i]);
    }
    return intersection;
}
Time complexity = time complexity of inserting n elements of Y[] into the hash table + time complexity of searching m elements of X[] in the hash table = n * O(1) + m * O(1) = O(m + n). Space complexity = O(n), for storing the values in the hash table.
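In standard C++, std::unordered_set can play the role of the generic HashTable above. A minimal sketch (the vector-based signature and the name findIntersectionHash are assumptions of this sketch):

```cpp
#include <unordered_set>
#include <vector>
using namespace std;

// Hash-table approach: insert all of Y into an unordered_set, then
// probe it with each element of X. Expected O(m + n) time, O(n) space.
vector<int> findIntersectionHash(const vector<int>& X, const vector<int>& Y)
{
    unordered_set<int> seen(Y.begin(), Y.end()); // all elements of Y
    vector<int> intersection;
    for (int x : X)
    {
        if (seen.count(x) > 0)       // O(1) average-time lookup
            intersection.push_back(x);
    }
    return intersection;
}
```

Like the brute-force and binary-search versions, this preserves the order of elements as they appear in X[].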
Enjoy learning, Enjoy algorithms!
Subscribe to get weekly content on data structures and algorithms, machine learning, system design, and OOPS.