**Difficulty:** Medium, **Asked-in:** Microsoft, Amazon, Goldman Sachs, Intel, Yahoo, Qualcomm, Bloomberg.

Merge sort is one of the fastest algorithms for sorting an array (or linked list) in O(nlogn) time. Before exploring the design and analysis of merge sort, let's understand its importance from various angles:

- Merge sort is an excellent algorithm for learning the divide-and-conquer approach. We can solve several coding questions using a similar idea.
- It is one of the best algorithms for learning the analysis of recursion.
- The merging process of merge sort uses a two-pointer approach, where both pointers move in the same direction. We can use similar ideas to solve various coding questions.
- It is an optimal algorithm for sorting linked lists in O(nlogn) time.

Suppose we need to sort an array **A[l...r]** of **n** integers starting from index **l** and ending at index **r**. The critical question is: Can we solve the sorting problem of size n using the solution of smaller sub-problems or applying the idea of recursion? Let's think!

At a high level, the idea looks like this:

**Divide:** Divide the problem of size n into two equal sub-problems of size n/2.

- Subproblem 1: Sorting subarray **A[l, mid]**.
- Subproblem 2: Sorting subarray **A[mid + 1, r]**.

**Conquer:** Solve each sub-problem recursively, i.e., sort both smaller subarrays using the same function. During this process, we should not worry about the solutions to the sub-problems, because the recursive calls will do that work for us. This is the recursive leap of faith!

**Combine part:** Merge both sorted subarrays to get the final sorted array. In other words, combine the solution of two n/2 size sub-problems to solve sorting problems of size n.

Suppose the function **mergeSort(int A[], int l, int r)** sorts the entire array A[].

- **Divide part:** Calculate the mid-index, i.e., **mid = l + (r - l)/2**.
- **Conquer part 1:** Recursively sort the left part of size n/2 by calling the same function with mid as the right end, i.e., **mergeSort(A, l, mid)**.
- **Conquer part 2:** Recursively sort the right part of size n/2 by calling the same function with mid + 1 as the left end, i.e., **mergeSort(A, mid + 1, r)**.
- **Combine part:** Now we use the **merge(A, l, mid, r)** function to merge both sorted halves.
- **Base case:** If l == r during recursive calls, the sub-array has one element left, which is trivially sorted. So recursion will not go further and returns from there. In other words, the sub-array of size 1 is the smallest version of the sorting problem, for which recursion directly returns the solution.

```
void mergeSort(int A[], int l, int r)
{
if(l >= r)
return
int mid = l + (r - l)/2
mergeSort(A, l, mid)
mergeSort(A, mid + 1, r)
merge(A, l, mid, r)
}
```

After the conquer step, both the left part **A[l, mid]** and the right part **A[mid + 1, r]** will be sorted. Now we need to combine the solutions of the smaller sub-problems to build a solution to the larger problem, i.e., merge both sorted halves to create the larger sorted array. The critical question is: Both sorted halves are part of the same array A[]. Can we merge these two halves in O(n) time without using extra space? There is no simple way to do this! You can convince yourself by trying some swapping and comparisons.

**O(n) time idea using O(n) extra space:** If we store the sorted halves in two extra arrays of roughly n/2 size each (X[] and Y[]), we can transform this problem into merging the sorted arrays X[] and Y[] into the larger sorted array A[]. For this, we compare values one by one in both smaller arrays and incrementally build the larger sorted array A[].

How can we implement it? Here is an idea: We use two pointers i and j to traverse X[] and Y[] from the start. During this process, we compare X[i] and Y[j], place the smaller value on A[] and move one of the pointers. Another way of thinking would be: After each comparison, we add one element to the sorted output and incrementally build a partially sorted array A[].

**Step 1: Memory allocation and data copy.**

We allocate two extra arrays with sizes equal to the left and right sorted parts, i.e., size of the left sorted part = mid - l + 1 and size of the right sorted part = r - mid. After this, we copy the left and right sorted parts of A[] into the extra arrays. **Note:** We include the value at the mid-index in the left part.

```
int n1 = mid - l + 1
int n2 = r - mid
int X[n1], Y[n2]
for(int i = 0; i < n1; i = i + 1)
X[i] = A[l + i]
for(int j = 0; j < n2; j = j + 1)
Y[j] = A[mid + 1 + j]
```

**Step 2: Now start the merging process.**

- We initialize pointers i, j, and k to traverse X[], Y[], and A[], respectively, i.e., i = 0, j = 0, and k = l.
- Now we run a loop until either of the smaller arrays reaches its end: **while (i < n1 && j < n2)**.
- In the first iteration, we compare X[0] and Y[0] and place the smaller value at A[l]. Before moving to the second iteration, we increment pointer **k** in A[] and the pointer in the array containing the smaller value (i or j, depending on the comparison).
- Similarly, we move forward in all three arrays using pointers i, j, and k. At each step, we compare X[i] and Y[j], place the smaller value in A[k], and increment k by 1. Based on the comparison, we also increment i or j: if X[i] <= Y[j], increment i by 1; otherwise, increment j by 1.

**Note:** This is a two-pointer approach of problem-solving where we build a partial solution by moving pointers i and j in the same direction.

```
int i = 0, j = 0, k = l
while(i < n1 && j < n2)
{
if(X[i] <= Y[j])
{
A[k] = X[i]
i = i + 1
}
else
{
A[k] = Y[j]
j = j + 1
}
k = k + 1
}
```

**Step 3: Loop termination and boundary conditions.**

The loop will stop when one of the two pointers reaches the end of its array (either i = n1 or j = n2). At this stage, there will be two boundary conditions:

**Boundary condition 1:** If we exit the loop due to j = n2, then we have reached the end of array Y[] and placed all values of Y[] in A[]. But there may be remaining values in X[] that still need to be placed in A[]. So we copy the remaining values of X[] into the end of A[].

```
while(i < n1)
{
A[k] = X[i]
i = i + 1
k = k + 1
}
```

**Boundary condition 2:** If we exit the loop due to the condition i = n1, we have reached the end of array X[] and placed all its values in A[]. But there may still be some values remaining in Y[] that need to be placed in the array A[]. So we copy the remaining values of Y[] into the end of array A[].

```
while(j < n2)
{
A[k] = Y[j]
j = j + 1
k = k + 1
}
```

```
void merge(int A[], int l, int mid, int r)
{
int n1 = mid - l + 1
int n2 = r - mid
int X[n1], Y[n2]
for(int i = 0; i < n1; i = i + 1)
X[i] = A[l + i]
for(int j = 0; j < n2; j = j + 1)
Y[j] = A[mid + 1 + j]
int i = 0, j = 0, k = l
while(i < n1 && j < n2)
{
if(X[i] <= Y[j])
{
A[k] = X[i]
i = i + 1
}
else
{
A[k] = Y[j]
j = j + 1
}
k = k + 1
}
while(i < n1)
{
A[k] = X[i]
i = i + 1
k = k + 1
}
while(j < n2)
{
A[k] = Y[j]
j = j + 1
k = k + 1
}
}
```
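The above pseudocode can be translated into runnable code. Here is a minimal Python version that stays close to the pseudocode (function and variable names mirror the pseudocode; this is a sketch for experimentation, not a production implementation):

```python
def merge(A, l, mid, r):
    # Copy the two sorted halves into extra arrays X and Y.
    X = A[l:mid + 1]          # left part A[l..mid], size n1
    Y = A[mid + 1:r + 1]      # right part A[mid+1..r], size n2
    i, j, k = 0, 0, l
    # Two-pointer merge: place the smaller front value at A[k].
    while i < len(X) and j < len(Y):
        if X[i] <= Y[j]:
            A[k] = X[i]
            i += 1
        else:
            A[k] = Y[j]
            j += 1
        k += 1
    # Boundary conditions: copy whatever remains (only one loop runs).
    while i < len(X):
        A[k] = X[i]
        i += 1
        k += 1
    while j < len(Y):
        A[k] = Y[j]
        j += 1
        k += 1

def merge_sort(A, l, r):
    if l >= r:                    # base case: sub-array of size 0 or 1
        return
    mid = l + (r - l) // 2        # divide
    merge_sort(A, l, mid)         # conquer left half
    merge_sort(A, mid + 1, r)     # conquer right half
    merge(A, l, mid, r)           # combine
```

For example, calling `merge_sort(A, 0, len(A) - 1)` on `A = [5, 2, 9, 1, 5, 6]` sorts it in place to `[1, 2, 5, 5, 6, 9]`.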

This is an excellent code to understand the analysis of iterative algorithms. To understand this better, let's visualize the time complexity of each critical step and calculate the overall time complexity.

- Memory allocation process = O(1).
- Data copy process = O(n1) + O(n2) = O(n1 + n2) = O(n).
- Merging loop in the worst case = O(n1 + n2) = O(n).
- Boundary condition 1 in the worst case = O(n1).
- Boundary condition 2 in the worst case = O(n2).

Overall time complexity = O(1) + O(n) + O(n) + O(n1) + O(n2) = O(n). If we observe closely, the time complexity depends on the merging loop, where comparison, assignment, and increment are the critical operations. There are two perspectives for understanding this:

- **Perspective 1:** At each step of the while loop, we increment either pointer i or j. In other words, we need to access each element of both smaller arrays at least once.
- **Perspective 2:** At each step, we place one element of the smaller sorted arrays into the output array A[].

Space complexity = Extra space for left part + Extra space for right part = O(n1) + O(n2) = O(n1 + n2) = O(n).

Suppose T(n) is the time complexity for sorting n elements using merge sort. So to calculate the value of T(n), we can break down the time complexities as follows.

- **Divide:** The time complexity of the divide part is O(1).
- **Conquer:** We recursively solve two sub-problems, each of size n/2. So the time complexity of each subproblem is T(n/2), and the overall time complexity of the conquer part is 2T(n/2).
- **Combine:** The worst-case time complexity of the merging process is O(n).

To calculate T(n), we need to add up the time complexities of the divide, conquer, and combine parts.

=> T(n) = O(1) + 2T(n/2) + O(n) = 2T(n/2) + O(n) = 2T(n/2) + cn.

Here is the simplified recurrence relation:

- T(n) = c, if n = 1.
- T(n) = 2T(n/2) + cn, if n > 1.
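We can sanity-check this recurrence numerically before solving it. The short Python sketch below (assuming c = 1 and n a power of 2) evaluates T(n) directly from the recurrence; for this choice of constants, it matches the closed form n*log2(n) + n, which is O(nlogn):

```python
import math

def T(n):
    # Recurrence: T(1) = c, T(n) = 2*T(n/2) + c*n, with c = 1.
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

# For powers of 2, this recurrence solves exactly to T(n) = n*log2(n) + n.
for n in [2, 16, 1024]:
    assert T(n) == n * int(math.log2(n)) + n
```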

The merge sort function works correctly even when the number of elements n is odd or not a power of 2. To simplify the analysis, we assume that n is a power of 2. This assumption does not affect the order of growth in the analysis.

In this method, we draw the recursion tree and calculate the total number of operations at each level of recursion. Finally, we calculate the overall time complexity by doing the sum of operations count level by level. Note: To learn about the fundamentals of recursion analysis, you can explore the blog post on time complexity analysis of recursion.

Based on the above recursion tree, the cost at each level is O(n), and there are O(logn) levels. So the time complexity of the merge sort algorithm is the sum of the cost at each level, which is O(logn)*O(n) = O(nlogn).

The best and worst-case time complexity of the merge sort algorithm is O(nlogn). The idea is simple: irrespective of the input, merge sort divides the input into equal halves and takes O(n) time at each level.
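One way to see this concretely is to count comparisons. The instrumented merge sort below (the counter and helper structure are my own illustration) counts element comparisons; for any input of size n, the count falls between (n/2)*log2(n) and n*log2(n), regardless of the input order:

```python
def count_comparisons(A):
    """Merge sort a copy of A and return the number of element comparisons."""
    count = 0

    def sort(lst):
        nonlocal count
        if len(lst) <= 1:
            return lst
        mid = len(lst) // 2
        left, right = sort(lst[:mid]), sort(lst[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            count += 1                      # one comparison per loop step
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    sort(list(A))
    return count
```

For n = 1024, a fully sorted input needs exactly (n/2)*log2(n) = 5120 comparisons, and no input can need more than n*log2(n) = 10240, so the cost stays within a constant factor of nlogn for every input.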

The Master method is a direct way to obtain the solution for recurrences that can be transformed to the type T(n) = aT(n/b) + O(n^k), where a ≥ 1 and b > 1. There are three cases for analysis using the master theorem:

- If k < logb(a), then T(n) = O(n^logb(a)).
- If k = logb(a), then T(n) = O(n^k * logn).
- If k > logb(a), then T(n) = O(n^k).

Let's compare with the merge sort recurrence:

- T(n) = 2T(n/2) + cn
- T(n) = aT(n/b) + O(n^k)

Here, a = 2 and b = 2 (satisfying a ≥ 1 and b > 1).

- O(n^k) = cn = O(n¹) => k = 1
- Similarly, logb(a) = log2(2) = 1

So k = logb(a). This means that the merge sort recurrence satisfies the 2nd case of the master theorem. So merge sort time complexity T(n) = O(n^k * logn) = O(n^1 * logn) = O(nlogn).

The space complexity of the merge sort depends on the extra space used by the merging process and the size of the recursion call stack.

- The space complexity of the merging process is O(n).
- The space complexity for the recursion call stack = height of the recursion tree = O(logn).
- The overall space complexity = Space complexity of the merging process + Space used by the recursion call stack = O(n) + O(logn) = O(n).

The recursive merge sort uses a call stack to store intermediate values of the function parameters l and r and other information. On the other hand, we can implement it iteratively without using an explicit auxiliary stack.

The idea behind iterative merge sort is to consider window sizes in exponential order, i.e., 1, 2, 4, 8, ... over the input array. For each window of size k, we merge all adjacent pairs of windows into a temporary space and then put them back into the array. The critical question is: What would be the time and space complexity of this approach? Explore and think!

**Pseudocode of iterative merge sort**

```
void iterativeMergeSort(int A[], int n)
{
int windowSize
int left
for(windowSize = 1; windowSize < n ; windowSize = 2 * windowSize)
{
for(left = 0; left < n - 1; left = left + 2 * windowSize)
{
int mid = min(left + windowSize - 1, n - 1)
int right = min(left + 2 * windowSize - 1, n - 1)
// Code of this merge function is same as the code
// used in the above recursive implementation
merge(A, left, mid, right)
}
}
}
```
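Here is a runnable Python translation of the iterative version (a sketch; the inner merge reuses the same two-pointer idea as the recursive implementation):

```python
def iterative_merge_sort(A):
    n = len(A)
    window = 1
    # Consider window sizes in exponential order: 1, 2, 4, 8, ...
    while window < n:
        # Merge each pair of adjacent windows A[left..mid] and A[mid+1..right].
        for left in range(0, n - 1, 2 * window):
            mid = min(left + window - 1, n - 1)
            right = min(left + 2 * window - 1, n - 1)
            X, Y = A[left:mid + 1], A[mid + 1:right + 1]
            i, j, k = 0, 0, left
            while i < len(X) and j < len(Y):
                if X[i] <= Y[j]:
                    A[k] = X[i]
                    i += 1
                else:
                    A[k] = Y[j]
                    j += 1
                k += 1
            while i < len(X):
                A[k] = X[i]
                i += 1
                k += 1
            while j < len(Y):
                A[k] = Y[j]
                j += 1
                k += 1
        window *= 2
```

Each pass over the array takes O(n) time and there are O(logn) passes, so the time complexity is still O(nlogn), with O(n) extra space for the temporary copies.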

- Merge sort is one of the best choices for sorting linked lists due to its efficiency and low memory consumption. It takes O(nlogn) time and needs only O(1) extra space for merging, because we can merge linked lists by relinking nodes instead of copying data (the recursive version still uses O(logn) space for the call stack). In contrast, sorting algorithms like quick sort or heap sort do not perform well with linked lists due to the slow random access of linked list elements.
- Merge sort is a stable sorting algorithm i.e. the relative order of equal elements is preserved throughout the sorting process. This property is important in many applications where the original order of equal elements must be maintained.
- For smaller input sizes, merge sort may be slower than other sorting algorithms like insertion sort.
- Merge sort goes through the complete divide-and-conquer process even if the array is already or almost sorted. However, some implementations can detect already-sorted runs and take advantage of this to perform optimizations.
- Merge sort's divide and conquer method makes it suitable for parallelization because it involves independent sub-problems that can be solved in parallel.
- The parallelization of merge sort makes it an attractive choice for sorting large amounts of data, such as those processed in computer clusters. Additionally, in such systems, memory is often not a limiting resource, so the disadvantage of space complexity of merge sort is negligible.
- One of the variations of merge sort is 3-way merge sort, where instead of dividing the array into 2 parts, we divide it into 3 equal parts. The critical question is: what would be the time and space complexity of 3-way merge sort? Is it more efficient than 2-way merge sort?
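As a concrete sketch of the linked-list point above, here is a minimal singly linked list merge sort in Python (the Node class and helper names are my own illustration). The merge step relinks existing nodes instead of copying values, so merging needs no extra arrays:

```python
class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

def sort_list(head):
    """Merge sort a singly linked list; returns the new head."""
    if head is None or head.next is None:
        return head
    # Divide: find the middle using slow/fast pointers and split the list.
    slow, fast = head, head.next
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    second = slow.next
    slow.next = None
    # Conquer: sort both halves recursively.
    left, right = sort_list(head), sort_list(second)
    # Combine: merge by relinking nodes (O(1) extra space per merge).
    dummy = tail = Node(0)
    while left and right:
        if left.val <= right.val:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left or right
    return dummy.next
```

With small helpers to build a list from `[4, 1, 3, 2]` and read it back, `sort_list` returns the nodes relinked in the order 1, 2, 3, 4.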

**Similar problems to explore:**

- Find the inversion count of an array
- Binary search algorithm
- Quicksort algorithm
- Divide and conquer algorithm of the maximum sub-array sum
- Divide and conquer algorithm of finding max and min in an array
- Divide and conquer algorithm of finding the majority element in an array
- External sorting algorithm

**Content reference:** Algorithms by CLRS

Please write in the message below if you find anything incorrect, or you want to share more insight. Enjoy learning, Enjoy algorithms!