Difficulty: Medium, Asked-in: Microsoft, Amazon, Goldman Sachs, Intel, Yahoo, Qualcomm, Bloomberg, Paytm.
Merge sort is a popular sorting algorithm that uses the divide and conquer approach to sort an array (or linked list) of integers (or characters or strings). Here are some excellent reasons to learn the merge sort algorithm: it is one of the best examples for learning recursion and the divide and conquer strategy, its merging process is a classic application of the two-pointers technique, and its analysis is a great introduction to solving recurrence relations.
Suppose we need to sort an array A[l...r] of n integers starting from index l and ending at index r. The critical question is: Can we solve this sorting problem of size n using the solution of smaller sub-problems or applying the divide and conquer approach? Let's think!
If we observe the above diagram, the divide and conquer idea looks like this:
Divide part: We divide the sorting problem of size n into two equal sub-problems of size n/2 by calculating the mid-index.
Conquer part: We solve each sub-problem recursively. In other words, we sort both smaller subarrays using recursion. During this process, we should not worry about the solutions to the sub-problems because recursion will do this work for us. This is the recursive leap of faith!
Combine part: Now we merge both sorted subarrays to get the final sorted array. In other words, we combine the solutions of the two n/2-size sub-problems to solve the sorting problem of size n.
Suppose the function mergeSortAlgorithm(int A[], int l, int r) sorts the subarray A[l...r], taking the left and right ends as input parameters. The initial call mergeSortAlgorithm(A, 0, n - 1) sorts the entire array.
Note: Why are we not calculating mid using the formula (l + r)/2? The reason is that for large values of l and r, the sum l + r can overflow the integer range, while l + (r - l)/2 cannot. Explore this excellent Google blog to get a detailed answer.
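As a quick illustration (a minimal sketch with hypothetical index values near the int limit), the first formula overflows while the second stays in range:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int l = INT_MAX - 10, r = INT_MAX - 1;  // hypothetical indices near the int limit
    // int mid = (l + r)/2;  // l + r exceeds INT_MAX: signed overflow (undefined behavior)
    int mid = l + (r - l)/2;  // r - l is small, so this always stays in range
    printf("mid = %d\n", mid);
    return 0;
}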
void mergeSortAlgorithm(int A[], int l, int r)
{
    // Base case: a subarray of size 0 or 1 is already sorted.
    if (l >= r)
        return;
    int mid = l + (r - l)/2;
    mergeSortAlgorithm(A, l, mid);      // Recursively sort the left half A[l...mid]
    mergeSortAlgorithm(A, mid + 1, r);  // Recursively sort the right half A[mid + 1...r]
    merge(A, l, mid, r);                // Merge both sorted halves
}
After the conquer step, both the left part A[l…mid] and the right part A[mid + 1…r] will be sorted. Now we need to combine the solutions of the smaller sub-problems to build a solution to the larger problem, i.e., merge both sorted halves to create the larger sorted array.
The critical question is: Both sorted halves are part of the same array A[]. Can we merge these two halves in O(n) time without using extra space? Try this by doing some swapping and comparison operations.
Here is an O(n) time idea using O(n) extra space: If we store the sorted halves in two extra arrays of size n/2 each (X[] and Y[]), we can transform this problem into merging sorted arrays X[] and Y[] into the larger sorted array A[]. For this, we compare values one by one in both smaller arrays and build the larger sorted array A[]. How can we implement it? Let's think!
We use two pointers i and j to traverse X[] and Y[] from the start. We compare elements in both arrays one by one and place the smaller value in array A[]. Another way of thinking: after each comparison, we add one element to the sorted output and incrementally build the partially sorted array A[].
Step 1: Memory allocation and data copy process
We allocate two extra arrays with sizes equal to the sizes of the left and right sorted parts, i.e., size of the left sorted part n1 = mid - l + 1 and size of the right sorted part n2 = r - mid. After this, we copy the left and right sorted parts of A[] into the extra arrays. Note: We include the value at the mid-index in the left part.
int n1 = mid - l + 1;
int n2 = r - mid;
int X[n1], Y[n2];
for (int i = 0; i < n1; i = i + 1)
    X[i] = A[l + i];
for (int j = 0; j < n2; j = j + 1)
    Y[j] = A[mid + 1 + j];
Step 2: Now we start the merging process using a two-pointers loop.
Note: This is the two-pointers approach to problem-solving, where we build a partial solution by moving pointers i and j in the same direction.
int i = 0, j = 0, k = l;
while (i < n1 && j < n2)
{
    if (X[i] <= Y[j])
    {
        A[k] = X[i];
        i = i + 1;
    }
    else
    {
        A[k] = Y[j];
        j = j + 1;
    }
    k = k + 1;
}
Step 3: Loop termination and boundary conditions
The loop will stop when one of the two pointers reaches the end of its array (either i = n1 or j = n2). At this stage, there will be two boundary conditions:
Boundary condition 1: If we exit the loop due to j = n2, we have reached the end of array Y[] and placed all values of Y[] in A[]. But there may be remaining values in X[] that still need to be put in the array A[]. These values are greater than or equal to all the values already placed in A[], so we copy the remaining values of X[] to the end of the array A[].
while (i < n1)
{
    A[k] = X[i];
    i = i + 1;
    k = k + 1;
}
Boundary condition 2: If we exit the loop due to the condition i = n1, we have reached the end of array X[] and placed all its values in A[]. But there may still be some values remaining in Y[] that need to be placed in the array A[]. These values are greater than or equal to all the values already placed in A[], so we copy the remaining values of Y[] to the end of array A[].
while (j < n2)
{
    A[k] = Y[j];
    j = j + 1;
    k = k + 1;
}
Combining all three steps, here is the complete pseudocode of the merging process:
void merge(int A[], int l, int mid, int r)
{
    int n1 = mid - l + 1;
    int n2 = r - mid;
    int X[n1], Y[n2];
    // Copy the left and right sorted halves of A[] into extra arrays.
    for (int i = 0; i < n1; i = i + 1)
        X[i] = A[l + i];
    for (int j = 0; j < n2; j = j + 1)
        Y[j] = A[mid + 1 + j];
    // Two-pointers merging loop: place the smaller value in A[].
    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)
    {
        if (X[i] <= Y[j])
        {
            A[k] = X[i];
            i = i + 1;
        }
        else
        {
            A[k] = Y[j];
            j = j + 1;
        }
        k = k + 1;
    }
    // Copy any remaining values of X[].
    while (i < n1)
    {
        A[k] = X[i];
        i = i + 1;
        k = k + 1;
    }
    // Copy any remaining values of Y[].
    while (j < n2)
    {
        A[k] = Y[j];
        j = j + 1;
        k = k + 1;
    }
}
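For completeness, here is a minimal driver (a sketch with a hypothetical input array) showing how we could call the above functions:

#include <stdio.h>

// Assumes mergeSortAlgorithm() and merge() defined above are in the same file.
int main(void)
{
    int A[] = {5, 2, 9, 1, 7, 3};     // hypothetical unsorted input
    int n = sizeof(A) / sizeof(A[0]);
    mergeSortAlgorithm(A, 0, n - 1);  // sort the whole array A[0...n-1]
    for (int i = 0; i < n; i = i + 1)
        printf("%d ", A[i]);          // prints: 1 2 3 5 7 9
    printf("\n");
    return 0;
}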
This merging code is an excellent example for understanding the analysis of iterative algorithms. To understand it better, let's visualize the time complexity of each critical step and calculate the overall time complexity.
Overall time complexity = O(1) + O(n) + O(n) + O(n1) + O(n2) = O(n) (memory allocation, data copy, merging loop, and the two boundary loops). If we observe closely, the time complexity of the merging algorithm depends on the merging loop, where comparison, assignment, and increment are the critical operations. There are two different perspectives to understanding this analysis: in the worst case, both arrays run out almost together, so the merging loop performs close to n1 + n2 - 1 = n - 1 comparisons; in the best case, all values of one array are smaller than all values of the other, so the loop performs only min(n1, n2) comparisons and the boundary loops copy the rest. In both cases, the total number of assignments to A[] is exactly n, so merging takes O(n) time.
Space complexity = Extra space for storing left part + Extra space for storing right part = O(n1) + O(n2) = O(n1 + n2) = O(n).
Let's assume that T(n) is the worst-case time complexity of merge sort for n integers. We can break down the time complexities as follows: the divide part takes O(1) time (calculating the mid-index), the conquer part takes 2T(n/2) time (recursively sorting two sub-arrays of size n/2 each), and the combine part takes O(n) time (merging both sorted halves).
To calculate T(n), we need to add up the time complexities of the divide, conquer, and combine parts:
T(n) = O(1) + 2T(n/2) + O(n) = 2T(n/2) + O(n) = 2T(n/2) + cn
We can solve this recurrence and calculate T(n) using the recursion tree method or the master theorem method.
Note that the merge sort function works correctly even when the number of elements is odd. To simplify the analysis, we assume that n is a power of 2. This assumption does not affect the order of growth.
Recursion tree method: In this method, we draw the recursion tree and count the total number of operations at each level. Finally, we find the overall time complexity by adding up the operation counts across all levels.
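To see this concretely, here is a level-by-level unrolling of the recurrence (a standard expansion, assuming n is a power of 2 and T(1) = O(1)):

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     ...
     = 2^k * T(n/2^k) + k*cn
     = n*T(1) + cn*logn    (at the last level, k = logn and 2^k = n)
     = O(nlogn)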
As we have seen in the above recursion tree, the cost at each level is O(n), and there are O(logn) levels. So the time complexity of the merge sort algorithm is the sum of the costs across all levels, which is O(logn) * O(n) = O(nlogn).
The best and worst-case time complexity of the merge sort algorithm is O(nlogn). The idea is simple: irrespective of the input, merge sort divides the input into equal halves and takes O(n) time at each level. Note: To learn about the fundamentals of recursion analysis, you can explore the blog post on time complexity analysis of recursion.
Master theorem method: The master method is a direct way to obtain the solution for recurrences that can be transformed to the form T(n) = aT(n/b) + O(n^k), where a ≥ 1 and b > 1. There are three cases in the analysis, depending on how k compares with logb(a).
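For the simplified form above, the standard statement of the three cases is:
If k < logb(a), then T(n) = O(n^logb(a)).
If k = logb(a), then T(n) = O(n^k * logn).
If k > logb(a), then T(n) = O(n^k).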
Let's compare with the merge sort recurrence:
Here, a = 2, b = 2, and k = 1 (because the combine step takes O(n) = O(n^1) time).
So k = logb(a), since log2(2) = 1. This means that the merge sort recurrence satisfies the 2nd case of the master theorem. Merge sort time complexity T(n) = O(n^k * logn) = O(n^1 * logn) = O(nlogn).
The space complexity of the merge sort algorithm depends on the extra space used by the merging process and the size of the call stack used by the recursion. The merging process uses O(n) extra space, and the recursion call stack grows up to O(logn) deep, so the overall space complexity is O(n) + O(logn) = O(n).
We can implement merge sort iteratively without using an explicit auxiliary stack. On the other hand, the recursive merge sort uses a call stack to store intermediate values of the function parameters l and r and other information.
The iterative merge sort works by considering window sizes in exponential order, i.e., 1, 2, 4, 8, ... over the input array. For each window of size k, all adjacent pairs of windows are merged into a temporary space and then put back into the array. The critical question is: what would be the time and space complexity of this approach? We will discuss this idea in a separate blog.
Pseudocode of iterative merge sort algorithm
// Helper function: minimum of two integers (not provided by standard C).
int min(int x, int y)
{
    return (x < y) ? x : y;
}

void iterativeMergeSort(int A[], int n)
{
    // Merge windows in bottom-up order: sizes 1, 2, 4, 8, ...
    for (int windowSize = 1; windowSize < n; windowSize = 2 * windowSize)
    {
        // Merge each adjacent pair of windows of the current size.
        for (int left = 0; left < n - 1; left = left + 2 * windowSize)
        {
            int mid = min(left + windowSize - 1, n - 1);
            int right = min(left + 2 * windowSize - 1, n - 1);
            // Code of this merge function is same as the code
            // used in the above recursive implementation
            merge(A, left, mid, right);
        }
    }
}
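As a quick usage sketch (again with a hypothetical input array), the iterative version is called with just the array and its size:

#include <stdio.h>

// Assumes iterativeMergeSort(), min() and merge() defined above are in the same file.
int main(void)
{
    int A[] = {8, 4, 6, 2, 7, 1};   // hypothetical unsorted input
    int n = sizeof(A) / sizeof(A[0]);
    iterativeMergeSort(A, n);       // windows of size 1, 2, 4 are merged bottom-up
    for (int i = 0; i < n; i = i + 1)
        printf("%d ", A[i]);        // prints: 1 2 4 6 7 8
    printf("\n");
    return 0;
}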
Content reference: Algorithms by CLRS
Please write in the message below if you find anything incorrect or want to share more insights. Enjoy learning, Enjoy algorithms!