Find the Intersection of Two Unsorted Arrays

EnjoyAlgorithms Blog Cover Image

Difficulty: Medium, Asked-In: Google, Facebook

Key takeaway: An excellent problem to learn time complexity optimization using various approaches. The two-pointer and hash table solutions are intuitive and worth exploring.

Let’s understand the problem!

Given two integer arrays X[] and Y [] of size m and n, respectively, write a program to find the intersection of these two arrays.

  • The intersection of two arrays is a list of distinct numbers which are present in both arrays.
  • The values in the intersection can be in any order.
  • Suppose m > n and elements in both arrays are distinct.

Examples

Input: X[] = [1, 4, 3, 2, 5, 8, 9], Y[] = [6, 3, 2, 7, 5]

Output: [3, 2, 5]

Input: X[] = [3, 4, 6, 7, 10, 12, 5], Y[] = [7, 11, 18, 15, 12]

Output: [7, 12]

Discussed solution approaches

  • A brute force approach  using two nested loops
  • A solution approach using sorting and binary search
  • A solution Approach using sorting and two pointers
  • An efficient approach  using a hash table

A brute force approach  using two nested loops

Solution Idea

A basic idea would be to pick each element in X[] and search linearly in Y[]. We add an element to the output list if it is common to both X[] and Y[]. To implement this:

  • We run two nested loops: outer loop from i = 0 to n - 1 to pick each element X[i] from X[] and inner loop from j = 0 to n - 1 to search X[i] linearly in Y[].
  • Whenever we found an element common to both X[] and Y[] i.e. X[i] == Y[j], we add it to the output list result[].
  • By the end of nested loops, we return the output list result[].

Solution Pseudocode

vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> result
    for (int i = 0; i < m; i = i + 1)
    {
        for (int j = 0; j < n; j = j + 1)
        {
            if (X[i] == Y[j])
            {
                result.add(X[i])
                break
            }    
        }
    }
    return result
}

Solution Analysis

  • We are running two nested loops, so the time complexity = O(m*n).
  • Space complexity = O(1)

A solution approach using sorting and binary search

Solution Idea

The critical question is: how can we improve the time complexity of the above approach? Searching is an essential operation in the above code. So, can we think to improve the efficiency of searching an element X[i] in Y[]?

One idea is simple: if we sort the array Y[], we can search each element of X[] using binary search. This could help to improve the time complexity because binary search takes O(logn) time.

Solution Steps

  • We declare an output list result[] to store the common elements.
  • Now we sort array Y[] in increasing order. We can use effcient sorting algorithms like heap sort or quick sort to sort the array in O(nlogn) time.
  • We run a loop from i = 0 to n - 1 and search each element X[i] in the sorted array Y[] using binary search.
  • Whenever we found an element common to both X[] and Y[] i.e. X[i] == Y[j], we add it to the output list result[].
  • By the end of the loop, we return the output list result[] as an output.

Solution Pseudocode

vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> result
    heapSort(Y, n)
    for (int i = 0; i < m; i = i + 1)
    {
        if(binarySearch(Y, n, X[i]) != -1)
            result.add(X[i])
    }
    return result
}

int binarySearch(int Y[], int n, int k)
{
    int l = 0, r = n
    while(l <= r)
    {
        int mid = l + (r - l)/2
        if(Y[mid] == k)
            return mid
        else if(Y[mid] > k)
            r = mid - 1
        else
            l = mid + 1
   }
   return -1
}

Solution Analysis

  • Suppose we are using heap sort and iterative binary search for the implementation. Time Complexity = Time complexity of sorting array Y[] + Time complexity of searching each elements of X[] in Y[] = O(nlogn) + m* O(logn) = O(nlogn + mlogn)
  • Space Complexity = O(1). Think!

A solution approach using sorting and two pointers

In the above approach, the overall time complexity of searching is O(mlogn). Now the critical question is: can we improve the time complexity of the searching further? Think!

Here is an idea: if we sort both arrays, we can apply the two-pointers approach similar to the merging to find the common elements to both the arrays. Think!

The idea is to start comparing one element of X[] with one element of Y[]. If both are equal, we add it to the output list and move to the next elements in both arrays by incrementing the pointers by one. Otherwise, we move to the next element by moving the pointer in the array, which includes the smaller element. (Think!)

Solution Steps

  1. We declare an output list result[] to store the common elements.
  2. Now we sort both the arrays in increasing order. We can use effcient sorting algorithms like heap sort or quick sort to sort the array in O(nlogn) time.
  3. Before starting the two pointers loop to find the common elements, we initialize the pointers i and j i.e. i = 0 and j = 0
  4. We run a loop while i < m and j < n:

    • If X[i] == Y[j], we add X[i] to the output list result[] and increment both pointers i and j.
    • If X[i] < Y[j], we increment i by 1.
    • If X[i] > Y[j], we increment j by 1.
  5. By the end of the loop, we return the output list result[] as an output.

Solution Pseudocode

vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> result
    heapSort(X, m)
    heapSort(Y, n)
    int i = 0, j = 0
    while(i < m and j < n)
    {
        if (X[i] == Y[j])
        { 
            result.add(X[i])
            i = i + 1
            j = j + 1
        }    
 
        else if (X[i] < Y[j])
            i = i + 1
 
        else
            j = j + 1
    }
    return result
}

Solution Analysis

  • Suppose we are using heap sort for the implementation. Time Complexity = Time complexity of the sorting X[] + Time complexity of sorting Y[] + Time complexity of two pointer approach of merging = O(mlogm) + O(nlogn) + O(n+m) = O(mlogm + nlogn)
  • Space Complexity = O(1)

An efficient approach  using a hash table

We can improve the time complexity by using the hash table because it performs the searching in O(1) time average. First, the idea is to insert all elements of array Y[] in the hash table and then search each element of the array X[]. If we find an element X[i] in the hash table, we add it to the output list.

Solution Pseudocode

vector<int> findIntersection(int X[], int Y[], int m, int n)
{
    vector<int> result
    HashTable H
    for (int i = 0; i < n; i = i + 1)
        H.insert(Y[i])
 
    for (int i = 0; i < m; i = i + 1)
    {
        if(H.search(X[i]) == true)
            result.add(X[i])
    }
    return result
}

Solution Analysis

  • Time complexity = Time complexity of inserting n elements of Y[] in the hash table + Time complexity of searching m elements of X[] in the hash table = n* O(1) + m* O(1)) = O(m+n)
  • Space Complexity = O(n), for storing the auxiliary hash table. 

Critical ideas to think!

  • Do we need to modify the above algorithms if elements are repeated?
  • How do time and space complexity change in the last approach if the hash table is created for the larger array instead of the smaller one?
  • What is the worst-case time complexity for searching an element in a hash table?
  • Can we use some other data structures to solve this problem efficiently? Is self-balancing BST useful?
  • In the second approach, we sorted the smaller array. What if we sort the larger array for the solution? How does the time complexity change?
  • What would be the space complexity if we use a recursive binary search for the implementation? 
  • In the 3rd approach, why are we incrementing pointers when X[i] > Y[j] or X[i] < Y[j] ?
  • In the 3rd approach, why the time complexity of the while loop is O(m+n)?

Suggested coding problems to solve

Enjoy learning, Enjoy Algorithm!

We'd love to hear from you

More content from EnjoyAlgorithms

Our weekly newsletter

Subscribe to get free weekly content on data structure and algorithms, machine learning, system design, oops and math. enjoy learning!