**Difficulty:** Medium, **Asked-in:** Google, Microsoft, Adobe, Goldman Sachs, Qualcomm

### Introduction to Quick Sort

Quick Sort is one of the fastest sorting algorithms and uses the divide and conquer approach to problem-solving. Here are some excellent reasons to learn this algorithm:

- Often the best practical choice for sorting because it runs in O(nlogn) time on average.
- One of the best algorithms to learn the idea of recursion. Its recursive structure, the flow of recursion, and the base case are intuitive.
- It is an in-place sorting algorithm that works well in virtual-memory environments.
- A good algorithm to learn the worst, best, and average-case analysis.
- The Quicksort partition algorithm is an excellent way to learn problem-solving using two pointers. We can use a similar approach to solve other coding questions.
- A good algorithm to understand the idea of the randomized algorithm. Randomization is a general tool to improve the efficiency of real-time systems!

### Quicksort Intuition

Let’s explore the pattern of the sorted array and think! Suppose all the elements are arranged in increasing order.

Now, from the above observation, try to think in reverse: suppose we partition an unsorted array around some value from the input **(pivot)** into two parts such that:

- All the values in the left half are less than the **pivot**.
- All the values in the right half are greater than the **pivot**.
- The **pivot** is placed at its correct position in the sorted output.

We now have a situation similar to the sorted array, with one difference: the left and right halves are still unsorted. If we look closely, both unsorted halves are smaller sub-problems of the original sorting problem, i.e., sorting problems of smaller input size.

So here is an idea: if we sort both halves, the whole array will be sorted, because all the values in the left half are less than the pivot, the pivot is at its correct position after the partition, and all the values in the right half are greater than the pivot. Think!

### Divide and Conquer Idea of the Quicksort

The above approach is a divide and conquer approach to problem-solving: we divide the sorting problem into two smaller sub-problems using the partition algorithm and solve each sub-problem recursively to get the final sorted array. We repeat the same partition process on the smaller sub-arrays until we reach the base case of the recursion.

**Divide Part:** we divide the problem into two smaller sub-problems by partitioning the array **X[l…r]** into two smaller subarrays around some **pivot** in the array. The partition process returns the **pivotIndex**, i.e., the index of the pivot in the sorted array.

- Left sub-problem: sorting array X[] from **l** to **pivotIndex - 1**.
- Right sub-problem: sorting array X[] from **pivotIndex + 1** to **r**.

**Conquer Part:** Now we solve both sub-problems, i.e., sort both smaller sub-arrays recursively. We need not worry about the solutions to the sub-problems because recursion will do the rest of the work for us!

**Combine Part:** This is trivial because, after sorting both smaller sub-arrays, the whole array is sorted. In other words, there is no need to write code for the combine part. Think!

### Divide and Conquer Implementation of the Quicksort

Suppose the function **quickSort(X, l, r)** sorts the sub-array X[l…r], where l and r are its left and right ends. The initial call to sort the entire array would be quickSort(X, 0, n - 1).

**Divide Part**

Let’s define a **partition(X, l, r)** function that divides the array around a pivot and returns the **pivotIndex**. We will select the pivot value inside the partition function.

```
pivotIndex = partition(X, l, r)
```

**Conquer Part**

- We recursively sort the left sub-array by calling the same function with **l** and **pivotIndex - 1** as the left and right ends, i.e., **quickSort(X, l, pivotIndex - 1)**.
- Similarly, we recursively sort the right sub-array by calling the same function with **pivotIndex + 1** and **r** as the left and right ends, i.e., **quickSort(X, pivotIndex + 1, r)**.

**Base Case**

This is one of the critical steps of the implementation. The base case is the smallest version of the problem, where recursion terminates and returns directly. What would be the base case? If we find **l ≥ r** during the recursive calls, the sub-array is either empty or has only one value left. The recursion does not go further and returns from the base case.

**Quicksort Pseudocode**
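A minimal sketch of the above steps in Python (function names like `quick_sort` are illustrative; the partition scheme follows the rightmost-pivot idea described in the next section):

```python
def partition(X, l, r):
    # Choose the rightmost value as the pivot (as in the text).
    pivot = X[r]
    i = l - 1  # boundary of the "less than pivot" region X[l..i]
    for j in range(l, r):
        if X[j] < pivot:
            i += 1
            X[i], X[j] = X[j], X[i]
    # Place the pivot at its correct sorted position.
    X[i + 1], X[r] = X[r], X[i + 1]
    return i + 1

def quick_sort(X, l, r):
    # Base case: when l >= r, the sub-array has 0 or 1 element.
    if l < r:
        pivot_index = partition(X, l, r)
        quick_sort(X, l, pivot_index - 1)   # sort the left sub-array
        quick_sort(X, pivot_index + 1, r)   # sort the right sub-array
```

A typical initial call would be `quick_sort(arr, 0, len(arr) - 1)`.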

### Quicksort Partition Algorithm

**Partition algorithm idea**

How do we divide the input array around some pivot value so that values in the left part would be less than pivot and values in the right part would be greater than pivot? Can we do it in place? Let’s think!

Suppose we pick the rightmost value as the pivot, i.e., **pivot = X[r]**, and scan the array from left to right. Whenever we find a value less than the pivot, we place it in the starting part of the array and move forward. Otherwise, we ignore the value and move forward. The idea looks simple, but how do we implement it? Let’s think!

- We need two pointers to track two different parts of the array: 1) Pointer **i** to track the sub-array X[l…i] of values less than the pivot. 2) Pointer **j** to scan the array; the sub-array X[i + 1 … j - 1] holds values greater than the pivot. Initially, the less-than region is empty, i.e., i = l - 1.
- Now we scan the array using a loop from j = l to r - 1. Whenever we find **X[j] < pivot**, we increment pointer i, swap X[j] with X[i], and move to the next iteration. Otherwise, we ignore the value at index j and move to the next iteration.
- By the end of the loop, all the values in sub-array X[l…i] are less than the pivot, and all the values in sub-array X[i + 1 … r - 1] are greater than the pivot. Now we place the pivot at its correct position, i.e., index i + 1: we swap X[r] with X[i + 1] to complete the partition process. Think!
- Finally, we return the position of the pivot as output, i.e., return index i + 1.

**Pseudocode of the partition algorithm**
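The steps above can be sketched in Python (this is the Lomuto partition scheme with the rightmost element as the pivot; names are illustrative):

```python
def partition(X, l, r):
    pivot = X[r]           # rightmost value as the pivot
    i = l - 1              # X[l..i] holds values less than the pivot
    for j in range(l, r):  # scan from l to r - 1
        if X[j] < pivot:
            i += 1
            X[i], X[j] = X[j], X[i]
    # Place the pivot at its correct position, index i + 1.
    X[i + 1], X[r] = X[r], X[i + 1]
    return i + 1
```

For example, `partition([3, 7, 1, 9, 5], 0, 4)` uses 5 as the pivot and returns index 2, with smaller values on the left and larger values on the right of the pivot.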

**Analysis of the partition algorithm**

We run a single loop and do constant work at each iteration. In both the worst and best case, we make one comparison per iteration; the swap depends on the comparison **if (X[j] < pivot)**. So time complexity = O(n). We use constant extra space, so space complexity = O(1).

### Quicksort Algorithm Visualization
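In place of a diagram, here is a small illustrative Python trace (an assumption of this write-up, not part of the original article) that records the array after each partition call, so we can watch the pivot values settle into place:

```python
def partition(X, l, r):
    pivot = X[r]
    i = l - 1
    for j in range(l, r):
        if X[j] < pivot:
            i += 1
            X[i], X[j] = X[j], X[i]
    X[i + 1], X[r] = X[r], X[i + 1]
    return i + 1

def quick_sort_trace(X, l, r, steps):
    if l < r:
        p = partition(X, l, r)
        steps.append(list(X))  # snapshot after each partition
        quick_sort_trace(X, l, p - 1, steps)
        quick_sort_trace(X, p + 1, r, steps)

arr = [6, 3, 8, 1, 5]
steps = []
quick_sort_trace(arr, 0, len(arr) - 1, steps)
for s in steps:
    print(s)
```

Each printed line shows one more pivot fixed at its final sorted position.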

### Quicksort Time Complexity Analysis

Let’s assume that T(n) is the worst-case time complexity of quicksort for **n** integers. Let's analyze it by breaking down the time complexities of each process:

**Divide Part:** The time complexity of the divide part is the time complexity of the partition algorithm, which is O(n).

**Conquer Part:** We are recursively solving two subproblems of different sizes. The size of the subproblems depends on the choice of the pivot in the partition process! Suppose after the partition, i elements are in the left subarray (left of the pivot), and n - i - 1 elements are in the right subarray.

- Size of the left subarray = i
- Size of the right subarray = n - i - 1

Time complexity of the conquer part = Time complexity of sorting the left subarray + Time complexity of sorting the right subarray = T(i) + T(n - i - 1)

**Combine Part**: As mentioned above, there is no operation in this part of the quick sort. So time complexity of the combine part = O(1).

For calculating the overall time complexity T(n), we need to add the time complexities of the divide, conquer, and combine part:

```
T(n) = O(n) + T(i) + T(n - i - 1) + O(1)
     = T(i) + T(n - i - 1) + O(n)
     = T(i) + T(n - i - 1) + cn

Recurrence relation of quicksort:
T(n) = c,                          if n = 1
T(n) = T(i) + T(n - i - 1) + cn,   if n > 1
```

**Worst-case Analysis of Quicksort**

The worst-case scenario of quicksort occurs when the partition process always picks the largest or smallest element as the pivot. In this scenario, the partition is highly unbalanced, i.e., one sub-problem with n - 1 elements and the other with 0 elements. This situation occurs when the input array is already sorted in increasing or decreasing order. Think!

Let us assume that unbalanced partitioning arises at each recursive call. So for calculating the time complexity in the worst case, we put i = n - 1 in the above formula of T(n).

```
T(n) = T(n - 1) + T(0) + cn
T(n) = T(n - 1) + cn
```

**Analysis of worst-case using the substitution method**

We simply expand the recurrence relation by substituting all intermediate values of T(i) (from i = n - 1 down to 1). By the end of this process, we end up with the sum of an arithmetic series.

```
T(n)
= T(n-1) + cn
= T(n-2) + c(n-1) + cn
= T(n-3) + c(n-2) + c(n-1) + cn
………… and so on
= T(1) + 2c + 3c + ... + c(n-3) + c(n-2) + c(n-1) + cn
= c + 2c + 3c + ... + c(n-3) + c(n-2) + c(n-1) + cn
= c (1 + 2 + 3... + n-3 + n-2 + n-1 + n)
this is a simple sum of the arithmetic progression.
T(n) = c (n(n+1)/2) = O(n^2)
Worst case Time complexity of the quick sort = O(n^2)
```

**Analysis of worst-case using recursion tree method**

When we sum the total partitioning times for each level of recursion tree, we get => cn + c(n−1) + c(n−2) +⋯+ 2c + c = c (n + n−1 + n−2 + ⋯+ 2 + 1) = c(n(n+1)/2) = O(n^2)

**Best-case Analysis of Quicksort**

The best-case behavior for quicksort occurs when we are lucky and the partition process always picks the middle element as a pivot. In other words, this is a case of the balanced partition where both sub-problems are n/2 size each.

Let us assume that balanced partitioning arises in each recursive call. So for calculating the time complexity in the best case, we put i = n/2 in the above formula of T(n).

```
T(n) = T(n/2) + T(n - 1 - n/2) + cn
     = T(n/2) + T(n/2 - 1) + cn
     ≈ 2T(n/2) + cn
```

This recurrence is similar to the merge sort recurrence, whose solution is O(nlogn). We can solve it using the recursion tree or the master method. The best-case time complexity of quicksort = O(nlogn).
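As a quick check, applying the master method to this recurrence (a standard step, shown here for completeness):

```
T(n) = 2T(n/2) + cn
a = 2, b = 2, f(n) = cn
n^(log_b a) = n^(log_2 2) = n
f(n) = Θ(n) => case 2 of the master theorem
T(n) = Θ(nlogn)
```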

Image source: https://www.cs.dartmouth.edu/~thc/cs5-F96/lec28.html

**Average-case Analysis of Quicksort**

The behavior of quicksort depends on the relative order of the values in the input. We can assume that all permutations of the input are equally likely. When we run quicksort on random input, the partitioning is highly unlikely to happen in the same way at each level of recursion. So we expect that some of the splits will be reasonably well balanced and that some will be pretty unbalanced. Think!

In the average case, the partition process generates a mix of good (balanced partition) and bad (unbalanced partition) splits. In a recursion tree, the good and bad splits are distributed randomly throughout the tree. Suppose, for intuition, good and bad splits appear at the alternate level of the tree. Think!

Suppose at the root of the recursion tree, partition generates a good split, and at the next level, partition generates a bad split. The cost of the partition process will be O(n) at both levels. So the combined partitioning cost of the bad split followed by the good split is O(n). Certainly, this situation is equivalent to the single level of partitioning, which looks similar to the scenario of a balanced partition! So the average-case running time of quicksort is much closer to the best case than to the worst case.

From the above perspective, the average-case time complexity looks like O(nlogn). For better visualization, let’s assume the partition algorithm always produces a partially unbalanced split in the ratio of 9 to 1. The recurrence relation for this would be: T(n) = T(9n/10) + T(n/10) + cn. The following diagram shows the recursion tree for this recurrence. Image source: CLRS Book.

We can notice the following things from the above diagram:

- The left subtree decreases fast, by a factor of 1/10, so its depth is log10(n). (Think!)
- The right subtree decreases slowly, by a factor of 9/10, so its depth is log10/9(n). **Note:** log10/9(n) = O(logn).
- At every level of recursion, the cost of partition is at most cn. So after doing the sum of cost at each level of the recursion tree, the quick sort cost is O(nlogn).
- Thus, with a 9 to 1 proportional split at each level of recursion, which intuitively seems quite unbalanced, quicksort runs in O(nlogn) time. Even a 99-to-1 split can produce O(nlogn) time complexity!
- In general, any split of constant proportionality produces a recursion tree of depth O(logn), where the cost at each level is O(n). The time complexity is, therefore, O(nlogn) whenever the split has constant proportionality.

**Space Complexity Analysis of Quicksort**

Quicksort is an in-place sorting algorithm: we do not use extra space for manipulating the input. But, as we know, every recursive program uses a call stack in the background to execute the recursion. So the space complexity of quicksort depends on the size of the recursion call stack, which equals the height of the recursion tree. As we saw in the above analysis, the height of the recursion tree depends on the partition process. Think!

In the worst-case scenario, the partition is always unbalanced, and there will be only 1 recursive call at each level of recursion. In such a scenario, the generated tree is skewed in nature. So the height of the tree = O(n) and recursion requires a call stack of size O(n). The **worst-case space complexity** of the quick sort = O(n).

The partition is always balanced in the best-case scenario, and there will be 2 recursive calls at each recursion level. In such a scenario, the generated tree is balanced in nature. So the height of the tree = O(logn), and recursion requires a call stack of size O(logn). The **best-case space complexity** of the quick sort = O(logn).

### Case of Repeated Elements in Quicksort

In the above algorithm, we assume that all input values are distinct. But how do we modify the quick sort algorithm when input values are repeated? Let’s think!

With the above partition algorithm, quicksort exhibits poor performance for inputs with many repeated elements, even if we choose good pivot values. To understand this better, suppose all the input elements are equal. Then, at each stage of recursion, the left partition is empty, and the right partition decreases by only one element. This is similar to the worst-case scenario of quicksort. What would be a good solution to this situation? Here is an idea!

We modify the partition algorithm to separate the input values into three groups: values less than the pivot, values equal to the pivot, and values greater than the pivot. The values equal to the pivot are already sorted, so only the less-than and greater-than partitions need to be sorted recursively. The modified partition algorithm returns two indices, **startIndex** and **endIndex**, such that:

- All elements of X[] from **l** to **startIndex - 1** are less than the pivot.
- All elements of X[] from **startIndex** to **endIndex** are equal to the pivot.
- All elements of X[] from **endIndex + 1** to **r** are greater than the pivot.

**Modified quicksort pseudocode**
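A Python sketch of this idea using a three-way (Dutch national flag) partition around X[l]; the function and variable names here are illustrative:

```python
def partition_three_way(X, l, r):
    # After the loop: X[l..lt-1] < pivot, X[lt..gt] == pivot, X[gt+1..r] > pivot.
    pivot = X[l]
    lt, i, gt = l, l, r
    while i <= gt:
        if X[i] < pivot:
            X[lt], X[i] = X[i], X[lt]
            lt += 1
            i += 1
        elif X[i] > pivot:
            X[i], X[gt] = X[gt], X[i]
            gt -= 1
        else:
            i += 1
    return lt, gt  # startIndex and endIndex of the pivot run

def quick_sort_three_way(X, l, r):
    if l < r:
        start_index, end_index = partition_three_way(X, l, r)
        quick_sort_three_way(X, l, start_index - 1)
        quick_sort_three_way(X, end_index + 1, r)
```

Because the whole run of equal values is excluded from both recursive calls, an all-equal input now finishes after a single partition pass.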

### Strategy to Choose Pivot in the Partition Algorithm

As we know from the analysis, the efficiency of quicksort depends on the smart selection of the pivot element. There are many ways to choose the pivot in quicksort:

- Always choosing the last element as a pivot.
- Always choosing the first element as a pivot.
- Selecting a random element as a pivot.
- Selecting median as a pivot.

In the above implementation, we chose the rightmost element as the pivot. This can result in a worst-case situation when the input array is already sorted. A better idea is to choose a random pivot, which minimizes the chance of hitting the worst case at each level of recursion. Selecting the median element as a pivot is also acceptable in the majority of cases.

**Pseudo-code snippet for the median-of-three pivot selection:**

```
mid = l + (r - l) / 2
if X[mid] < X[l]
    swap(X[l], X[mid])
if X[r] < X[l]
    swap(X[l], X[r])
if X[mid] < X[r]
    swap(X[mid], X[r])
pivot = X[r]
```
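The same selection in Python (a sketch; it rearranges X[l], X[mid], and X[r] so the median of the three ends up at X[r], ready for the rightmost-pivot partition):

```python
def median_of_three(X, l, r):
    mid = l + (r - l) // 2
    if X[mid] < X[l]:
        X[l], X[mid] = X[mid], X[l]   # now X[l] <= X[mid]
    if X[r] < X[l]:
        X[l], X[r] = X[r], X[l]       # now X[l] is the minimum of the three
    if X[mid] < X[r]:
        X[mid], X[r] = X[r], X[mid]   # now X[r] is the median of the three
    return X[r]
```

For example, on [9, 5, 1] the three candidates are 9, 5, and 1, so the median 5 ends up at X[r].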

### Critical ideas to think!

- What value of pivotIndex does partition return when all elements in the array X[] have the same value? What is the time complexity of quicksort when all array elements have the same value?
- How would you modify quicksort to sort in non-increasing order?
- We can improve the running time of quicksort by taking advantage of the fast running time of **insertion sort** when its input is almost sorted. The idea: when quicksort is called on a sub-array with fewer than k elements, let it simply return without sorting the sub-array recursively. After the top-level call to quicksort returns, run insertion sort on the entire array to finish the sorting process. What would be the time complexity of this algorithm? How should we pick the value of k?
- The above quicksort implementation is **not stable**. How can we modify the above algorithm to make it stable?
- Quicksort is an in-place sorting algorithm where we use extra space only for the recursion call stack, not for manipulating the input. What would be the average-case space complexity of quicksort? How do we optimize the space complexity to O(logn) in the worst case?
- How can we implement quicksort **iteratively** using a stack?
- Why do we prefer quicksort over merge sort for sorting arrays?
- How do we **sort a linked list** using the quicksort algorithm? Why do we prefer merge sort over quicksort for sorting linked lists?
- The number of comparisons in quicksort equals the number of comparisons in **constructing a BST** with a sequence of insertion operations. Quicksort is a similar version of **BST sort**, where both algorithms make exactly the same comparisons, but in a different order.

### Similar concepts to explore further

### Similar coding problems to practice

### Content references

**Enjoy learning, Enjoy coding, Enjoy algorithms!**