Difficulty: Hard, Asked-in: Google, Microsoft, Amazon
Key takeaway: An excellent problem to learn problem-solving using two pointers idea similar to merging and divide and conquer idea similar to the binary search.
There are two sorted arrays A[] and B[] of size n each, write a program to find the median of the array obtained after merging both the arrays i.e. merged array of size 2n.
Example 1
Input: A[] = [1, 3, 6], B[] = [2, 8, 12]
Output: 4.5
Explanation: After merging sorted arrays, we get the larger sorted array [1, 2, 3, 6, 8, 12]. The total number of elements is 6, so the median would be the average of the two middle elements at index 2 and 3 i.e. (3 + 6)/2 = 4.5
Example 2
Input: A[] = [1, 3, 4, 6, 9], B[] = [2, 5, 7, 8, 10]
Output: 5.5
Explanation: After merging sorted arrays, we get the larger sorted array [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. So median would be the average of the two middle elements at index 4 and 5 i.e. (5 + 6)/2 = 5.5
Solution Idea
So one basic idea is to merge both sorted arrays using extra space and get the median by taking an average of both middle elements in the merged array. If we follow the 0 based indexing, the median will be the average value at (n-1)th and nth indexes. Note: for merging sorted arrays, we can use an idea similar to the merging procedure of the merge sort.
Solution Pseudocode
double median(int A[], int B[], int n)
{
int MergedArr[2n]
int i = 0, j = 0, k = 0
while(i < n && j < n)
{
if(A[i] <= B[j])
{
MergedArr[k] = A[i]
i = i + 1
k = k + 1
}
else
{
MergedArr[k] = B[j]
j = j + 1
k = k + 1
}
}
while(i < n)
{
MergedArr[k] = A[i]
i = i + 1
k = k + 1
}
while(j < n)
{
MergedArr[k] = B[j]
j = j + 1
k = k + 1
}
return (double)(MergedArr[n-1]+ MergedArr[n])/2
}
Time and space complexity analysis
Solution Idea
Now critical questions are: do we really need to perform the complete merging of sorted arrays to find the median? can we reduce the extra space used in the merging process? Let's Think!
Here are few critical observations: we are traversing both the arrays from start in incremental order. At each step of the iteration, we are increasing either pointer i or pointer j and placing one value from any one of the smaller arrays to the larger sorted array. At any general step of iteration, k = i + j - 1.
So here is an optimized solution idea: while comparing the elements of two sorted arrays during the merging process, we can track the count of elements in the merged array using variable k. We stop when the count becomes n + 1 and calculate the median by taking an average of the nth and (n+1)th element of the merged array. So there is no need to use extra space and perform complete merging.
Solution Steps
Solution Pseudocode
double median(int A[], int B[], int n)
{
int i = 0, j = 0, count = 0
int middle1 = 0, middle2 = 0
while(count <= n)
{
if(A[i] <= B[j])
{
middle1 = middle2
middle2 = A[i]
i = i + 1
}
else
{
middle1 = middle2
middle2 = B[j]
j = j + 1
}
if(i == n)
{
middle1 = middle2
middle2 = B[0]
break
}
else if(j == n)
{
middle1 = middle2
middle2 = A[0]
break
}
count = count + 1
}
return (double)(middle1 + middle2)/2
}
Time and Space complexity analysis
While loop is running n times and doing O(1) operation at each iteration. So the time complexity = n * O(1) = O(n). We are using a constant number of extra variables, so space complexity = O(1).
Now critical questions are: how do we improve the time complexity? This is a searching problem, so can we take advantage of the sorted array property? Let's think!
Solution Idea
Here idea is to compare the medians of both sorted arrays and recursively reduce the search space by half. Suppose the median of the first array is m1, and the second array is m2. We can get this value in O(1) using the formula: m1 = A[n/2], m2 = B[n/2] (We have assumed in the problem that n is odd)
Case 1 if (m1 = m2): In this case, n-1 elements are less than m1 and n-1 elements are greater than m2. In other words, both m1 and m2 would be the middle elements in the merged array i.e. element at (n-1)th and nth index. We take the average of both, which is equal to m1 or m2. We return any one of the values as an output. (Think!)
Case 2 if (m1 < m2): then in the merged array, both middle elements of the median will lie between the range [m1, m2] i.e. m1 <= (middle1, middle2) <= m2.
Case 3 if (m1 > m2): then in the merged array, both middle elements of the median will lie between the range [m2, m1] i.e. m2 <= (middle1, middle2) <= m1.
The above idea is quite similar to the binary search: we reject half elements in both sorted arrays after each comparison. Overall, we are reducing the search space by half after each step. So, we can think of implementing a solution using an idea similar to a recursive binary search.
Solution Steps
Suppose we use function median(int A[], int B[], int n).
If (m1 > m2), then both middle elements of the median must be present in one of the two sub-arrays: subarray starting from the first element of array A[] and ending at element m1(including m1) and subarray starting from element m2 (including m2) and ending at the last element of array B[]. In other words, we are rejecting n/2 elements from both subarrays and the remaining input size = n - n/2. So, we recursively call the same function with input size n - n/2 i.e. median(A, B + n/2, n - n/2)
For example, if n = 7, then n/2 = 3, m1 = A[3] and m2 = B[3]. So we reject: 3 elements in A[] from index 4 to 6 and 3 elements in B[] from index 0 to 2. Remaining size of array A and B = n - n/2 = 7 - 3 = 4.
If (m2 > m1), both middle elements of the median must be present in one of the two sub-arrays: subarray starting from element m1 (including m1) and ending at the last element of array A[] and subarray starting from the first element of array B[] and ending at element m2 (including m2). In other words, we are rejecting n/2 elements from both subarrays and the remaining input size = n - n/2. So, we recursively call the same function with input size n - n/2 i.e. median(A + n/2, B, n - n/2)
For example, if n = 9, then n/2 = 4, m1 = A[4] and m2 = B[4]. So we reject: 4 elements in A[] from index 0 to 3, and 4 elements in B[] from index 5 to 8. Remaining size of array A and B = n - n/2 = 9 - 4 = 5.
Solution Pseudocode
double median(int A[], int B[], int n)
{
int m1, m2
if(n = 0)
return 0
if(n == 1)
return (double)(A[0] + B[0])/2
m1 = getMedian(A, n)
m2 = getMedian(B, n)
if(m1 == m2)
return m1
if (m1 < m2)
return median(A + n/2, B, n - n/2)
else
return median(A, B + n/2, n - n/2)
}
int getMedian(int C[], int n)
{
if (n % 2 == 0)
return (C[n/2 - 1] + C[n/2])/2
else
return C[n/2]
}
Time and space complexity analysis
Enjoy learning, Enjoy algorithms!
Have you ever thought of any such service that could make our life easier by allowing us to store data over the internet and simultaneously allow others to access the same data? If it is so, then this is the perfect blog for you. Here, in this blog, we'll be discussing the PasteBin System. Here, the aim is to design a highly scalable service that could allow users to paste various content (images, text, etc.) over the internet and allow others to access the pasted content. So let's look at the formal definition of PasteBin.
Level order traversal accesses nodes in level by level order. This is also called breadth-first search traversal or BFS traversal. Here we start from the root node and process it, then process all the nodes at the first level, then process all the nodes at the second level, and so on. In other words, we explore all nodes at the current level before moving on to the nodes at the next level.
Subscribe to get free weekly content on data structure and algorithms, machine learning, system design, oops and math. enjoy learning!