Difficulty: Medium, Asked-In: Google, Facebook
Key takeaway: An excellent problem to learn time complexity optimization using various approaches. The two-pointer and hash table solutions are intuitive and worth exploring.
Given two unsorted arrays X[] and Y[] of sizes m and n, write a program to find the intersection of these two arrays.
Example 1
Input: X[] = [3, 4, 6, 2, 8, 5, 9], Y[] = [6, 3, 2, 7, 5, 1]
Output: [3, 6, 2, 5]
Explanation: Common elements are 3, 6, 2, and 5. So the intersection of both arrays is 3, 6, 2, and 5.
Example 2
Input: X[] = [3, 4, 6, 7, 10, 12, 5], Y[] = [7, 11, 18, 15, 12]
Output: [7, 12]
Explanation: Common elements are 7 and 12. So the intersection of both arrays is 7 and 12.
Example 3
Input: X[] = [3, 4, 6, 10, 5], Y[] = [7, 11, 18, 15, 12]
Output: []
Explanation: There are no common elements, so the output is an empty array.
A basic idea would be to pick each element from X[] and search linearly in Y[]. We add an element to the output list if it is present in both X[] and Y[].
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    for (int i = 0; i < m; i = i + 1)
    {
        for (int j = 0; j < n; j = j + 1)
        {
            if (X[i] == Y[j])
            {
                intersection.push_back(X[i]);
                break;
            }
        }
    }
    return intersection;
}
We are running two nested loops and performing O(1) operations at each iteration. In the worst case, the inner loop runs n times, so time complexity = O(m) * O(n) * O(1) = O(mn). We are using constant extra space, so space complexity = O(1). Why are we not considering the output list intersection as a part of space complexity? Think!
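For a quick sanity check, the brute-force idea can also be sketched in a self-contained form using std::vector (the vector-based signature and the name findIntersectionBrute are assumptions of this sketch, not part of the original interface):

```cpp
#include <vector>
using namespace std;

// Brute-force intersection: for each element of X, scan Y linearly.
vector<int> findIntersectionBrute(const vector<int>& X, const vector<int>& Y)
{
    vector<int> intersection;
    for (int x : X)
    {
        for (int y : Y)
        {
            if (x == y)
            {
                intersection.push_back(x); // found a common element
                break;                     // stop scanning Y for this x
            }
        }
    }
    return intersection;
}
```

Running it on Example 1 returns the common elements in the order they appear in X[].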
The critical question is: How can we improve the time complexity of the above approach? Searching is the essential operation in the above code. Can we improve the efficiency of searching element X[i] in Y[]? One simple idea: if we sort the array Y[], we can search for each element X[i] using binary search. This helps improve time complexity because binary search takes O(logn) time.
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    heapSort(Y, n);
    for (int i = 0; i < m; i = i + 1)
    {
        if (binarySearch(Y, n, X[i]) != -1)
            intersection.push_back(X[i]);
    }
    return intersection;
}
int binarySearch(int Y[], int n, int target)
{
    int l = 0, r = n - 1;
    while (l <= r)
    {
        int mid = l + (r - l)/2;
        if (Y[mid] == target)
            return mid;
        else if (Y[mid] > target)
            r = mid - 1;
        else
            l = mid + 1;
    }
    return -1;
}
We are using heap sort and iterative binary search for the implementation. Time complexity = time complexity of sorting array Y[] using heap sort + time complexity of searching each element X[i] in Y[] using binary search = O(nlogn) + m * O(logn) = O(nlogn + mlogn). If m > n, then mlogn > nlogn, and the time complexity would be O(mlogn).
Space complexity = Space complexity of heap sort + Space complexity of iterative binary search = O(1) + O(1) = O(1). What would be the space complexity if we use merge sort and recursive binary search for the implementation? Think!
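For reference, the same sort-then-search approach can be sketched with the C++ standard library, letting std::sort and std::binary_search stand in for the hand-written heapSort and binarySearch (std::sort is an introsort rather than a heap sort, but it is also O(nlogn); the vector-based signature here is an assumption of this sketch):

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Sort Y once, then probe it with binary search for every element of X.
// Overall cost: O(nlogn) for sorting + m * O(logn) for the searches.
vector<int> findIntersectionBinarySearch(const vector<int>& X, vector<int> Y)
{
    vector<int> intersection;
    sort(Y.begin(), Y.end()); // Y is passed by value, so the caller's array stays intact
    for (int x : X)
    {
        if (binary_search(Y.begin(), Y.end(), x))
            intersection.push_back(x);
    }
    return intersection;
}
```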
In the above approach, the overall time complexity of searching is O(mlogn). The critical question is: Can we improve the time complexity of searching further? Here is an idea: if we sort both arrays, we can apply a two-pointer approach, similar to the merging procedure of merge sort, to find the common elements. Think!
After sorting both arrays, we start by comparing one element of X[] with one element of Y[]. If both are equal, we add the element to the output list and move to the next element in both arrays. Otherwise, we move to the next element in the array containing the smaller element. Think!
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    heapSort(X, m);
    heapSort(Y, n);
    int i = 0, j = 0;
    while (i < m && j < n)
    {
        if (X[i] == Y[j])
        {
            intersection.push_back(X[i]);
            i = i + 1;
            j = j + 1;
        }
        else if (X[i] < Y[j])
            i = i + 1;
        else
            j = j + 1;
    }
    return intersection;
}
For m > n, if we compare the time complexity of this approach with the previous approach, it is actually not better! We have improved the time complexity of searching by sorting both arrays, but now the overall time complexity is dominated by sorting, which is O(mlogm + nlogn) = O(mlogm). The previous approach runs in O(mlogn), which is better than O(mlogm). Think!
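A self-contained sketch of the two-pointer approach, using std::sort for the sorting step (the vector-based signature is an assumption of this sketch; note that, unlike the earlier approaches, the output comes out in sorted order):

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Sort both arrays, then walk them in lockstep like the merge step
// of merge sort, collecting elements that appear in both.
vector<int> findIntersectionTwoPointer(vector<int> X, vector<int> Y)
{
    sort(X.begin(), X.end());
    sort(Y.begin(), Y.end());
    vector<int> intersection;
    size_t i = 0, j = 0;
    while (i < X.size() && j < Y.size())
    {
        if (X[i] == Y[j])
        {
            intersection.push_back(X[i]); // common element
            i = i + 1;
            j = j + 1;
        }
        else if (X[i] < Y[j])
            i = i + 1;                    // advance past the smaller value
        else
            j = j + 1;
    }
    return intersection;
}
```

On Example 1, this returns [2, 3, 5, 6]: the same common elements as before, but in sorted order.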
We can improve the time complexity of searching further using a hash table, because it can perform searching in O(1) average time. The idea: insert all elements of Y[] into a hash table and search for each element X[i]. If element X[i] is present in the hash table, we add it to the output list.
vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> intersection;
    HashTable H;
    for (int i = 0; i < n; i = i + 1)
        H.insert(Y[i]);
    for (int i = 0; i < m; i = i + 1)
    {
        if (H.search(X[i]) == true)
            intersection.push_back(X[i]);
    }
    return intersection;
}
Time complexity = time complexity of inserting n elements of Y[] into the hash table + time complexity of searching m elements of X[] in the hash table = n * O(1) + m * O(1) = O(m + n). Space complexity = O(n), for storing the values in the hash table.
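In standard C++, std::unordered_set can play the role of the generic HashTable above. A minimal sketch (the vector-based signature and the name findIntersectionHash are assumptions of this sketch):

```cpp
#include <unordered_set>
#include <vector>
using namespace std;

// Hash-table approach: insert all of Y into an unordered_set, then
// probe it with each element of X. Expected O(m + n) time, O(n) space.
vector<int> findIntersectionHash(const vector<int>& X, const vector<int>& Y)
{
    unordered_set<int> seen(Y.begin(), Y.end()); // all elements of Y
    vector<int> intersection;
    for (int x : X)
    {
        if (seen.count(x) > 0)       // O(1) average-time lookup
            intersection.push_back(x);
    }
    return intersection;
}
```

Like the brute-force and binary-search versions, this preserves the order of elements as they appear in X[].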
Enjoy learning, Enjoy algorithms!
Subscribe to get weekly content on data structures and algorithms, machine learning, system design, and OOPS.