Recursion is one of the most popular problem-solving approaches in data structures and algorithms. Several problem-solving techniques are based entirely on recursion: decrease and conquer, divide and conquer, DFS traversal of trees and graphs, backtracking, the top-down approach of dynamic programming, etc. So learning the time complexity analysis of recursion is critical to understanding these approaches and improving our code's efficiency.
Sometimes programmers find it challenging to analyze recursive algorithms due to their mathematical details. But working with recursive algorithms becomes easy when we understand the recursive patterns and the methods used in the analysis. Without further delay, let's learn the fundamental concepts related to the time complexity of recursion.
A recurrence relation is an equation describing a sequence where any term is defined using its previous terms. We use recurrence relations to analyze the time complexity of recursive functions in terms of input size.
In the case of recursion, we solve a problem using the solutions of smaller subproblems. If the time complexity of a recursive function is T(n), then the time complexity of the smaller subproblems is defined by the same function, but in terms of the subproblems' input size.
Here is a common approach to writing T(n) if we have k subproblems: T(n) = T(input size of 1st subproblem) + T(input size of 2nd subproblem) + ... + T(input size of kth subproblem) + time complexity of additional operations other than recursive calls.
The above formula provides a general way to define the recurrence relation of any recursive algorithm. We then solve this recurrence relation and calculate the overall time complexity of the recursion in terms of Big-O notation. Let's understand this better via the recurrence relations of several popular recursive algorithms.
reverse (A[], l, r)
- swap(A[l], A[r])
- reverse(A, l + 1, r - 1)
- Base case: if (l >= r), then return
We are reversing an n size array by swapping the first and last values and recursively solving the remaining subproblem of reversing the (n - 2) size array. At each step of recursion, the input size decreases by 2 and we do O(1) extra work (one swap). So the recurrence relation is T(n) = T(n - 2) + c.
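As a quick reference, here is a minimal C++ sketch of this idea (assuming an int array; the signature mirrors the pseudocode above):

```cpp
#include <utility>

// Recursively reverse A[l..r] by swapping the two ends and shrinking the range.
void reverse(int A[], int l, int r)
{
    // Base case: zero or one element left, nothing to swap.
    if (l >= r)
        return;
    std::swap(A[l], A[r]);     // O(1) extra work at this step
    reverse(A, l + 1, r - 1);  // subproblem of size (n - 2)
}
```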
binarySearch(A[], l, r, target)
- mid = l + (r - l)/2
- if (A[mid] == target), return mid
- if (A[mid] > target), return binarySearch(A, l, mid - 1, target)
- if (A[mid] < target), return binarySearch(A, mid + 1, r, target)
Base case: If (l > r), then return -1
At each step of recursion, we do one comparison and decrease the input size by half. In other words, we solve a problem of size n using the solution of one subproblem of input size n/2 (based on the comparison, either the left or the right subproblem). So the recurrence relation is T(n) = T(n/2) + c.
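A minimal C++ sketch of the same recursive binary search (assuming a sorted int array):

```cpp
// Search target in the sorted range A[l..r]; return its index or -1.
int binarySearch(int A[], int l, int r, int target)
{
    // Base case: the search space is empty, so target is not present.
    if (l > r)
        return -1;
    int mid = l + (r - l) / 2;
    if (A[mid] == target)
        return mid;
    if (A[mid] > target)
        return binarySearch(A, l, mid - 1, target);  // left half of size n/2
    return binarySearch(A, mid + 1, r, target);      // right half of size n/2
}
```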
mergeSort (A[], l, r)
- mid = l + (r - l)/2
- mergeSort(A, l, mid)
- mergeSort(A, mid + 1, r)
- merge(A, l, mid, r)
Base case: if (l == r), then return
In merge sort, we solve a problem of size n using the solutions of two equal subproblems of input size n/2 and merge the subproblem solutions in O(n) time. So the recurrence relation is T(n) = 2T(n/2) + cn.
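One possible C++ sketch of this structure is given below. The merge step, which uses a temporary buffer, is an assumed implementation detail; the article only states that merging takes O(n) time:

```cpp
#include <vector>

// Merge the two sorted halves A[l..mid] and A[mid+1..r] in O(n) time.
void merge(int A[], int l, int mid, int r)
{
    std::vector<int> temp;
    int i = l, j = mid + 1;
    while (i <= mid && j <= r)
        temp.push_back(A[i] <= A[j] ? A[i++] : A[j++]);
    while (i <= mid)
        temp.push_back(A[i++]);
    while (j <= r)
        temp.push_back(A[j++]);
    for (int k = 0; k < (int)temp.size(); k++)
        A[l + k] = temp[k];
}

void mergeSort(int A[], int l, int r)
{
    // Base case: a range of one (or zero) elements is already sorted.
    if (l >= r)
        return;
    int mid = l + (r - l) / 2;
    mergeSort(A, l, mid);       // left subproblem of size n/2
    mergeSort(A, mid + 1, r);   // right subproblem of size n/2
    merge(A, l, mid, r);        // combine in O(n) time
}
```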
quickSort(A[], l, r)
- pivot = partition(A, l, r)
- quickSort(A, l, pivot - 1)
- quickSort(A, pivot + 1, r)
Base case: if (l >= r), then return
In quick sort, we divide the array in O(n) time and recursively solve two subproblems of different sizes, where the sizes depend on the choice of pivot in the partition algorithm. Suppose after the partition, i elements are in the left subarray (left of the pivot) and n - i - 1 elements are in the right subarray (right of the pivot). Then the size of the left subproblem = i, the size of the right subproblem = n - i - 1, and the recurrence relation is T(n) = T(i) + T(n - i - 1) + cn.
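A possible C++ sketch is shown below. The article does not fix a partition scheme, so the Lomuto partition with the last element as pivot is an assumption here:

```cpp
#include <utility>

// Lomuto partition (assumed scheme): place A[r] at its sorted position
// and return that position.
int partition(int A[], int l, int r)
{
    int pivot = A[r];
    int i = l - 1;
    for (int j = l; j < r; j++)
        if (A[j] < pivot)
            std::swap(A[++i], A[j]);
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

void quickSort(int A[], int l, int r)
{
    // Base case: zero or one element, already sorted.
    if (l >= r)
        return;
    int pivot = partition(A, l, r);  // O(n) partition step
    quickSort(A, l, pivot - 1);      // left subproblem of size i
    quickSort(A, pivot + 1, r);      // right subproblem of size n - i - 1
}
```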
fib (n) = fib (n - 1) + fib (n - 2)
Base case: if (n <= 1), return n
Here we have 2 base cases: fib(0) = 0 and fib(1) = 1
To find the nth Fibonacci number, we recursively solve and add two subproblems of size (n - 1) and (n - 2), so the recurrence relation is T(n) = T(n - 1) + T(n - 2) + c. Both subproblems are dependent on each other because the value of the (n - 1)th Fibonacci depends on the value of the (n - 2)th Fibonacci, and so on. Such dependent subproblems arise in the case of optimization problems in dynamic programming.
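A minimal C++ sketch of this plain recursive definition (without memoization):

```cpp
// Return the nth Fibonacci number using the recursive definition above.
int fib(int n)
{
    // Two base cases: fib(0) = 0 and fib(1) = 1.
    if (n <= 1)
        return n;
    // Two subproblems of size (n - 1) and (n - 2), plus O(1) extra work.
    return fib(n - 1) + fib(n - 2);
}
```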
We mainly use the following two methods to solve recurrence relations in data structures and algorithms: the recursion tree method and the master method. We can choose between them depending on the nature of the recurrence relation. The master method works best for analysing divide and conquer algorithms, while the recursion tree method is a fundamental approach that applies to all types of recursive algorithms.
A recursion tree is a tree diagram of recursive calls, where each tree node represents the cost of a particular subproblem. The idea is simple: the time complexity of recursion depends on two factors: 1) the total number of recursive calls and 2) the time complexity of the additional operations for each recursive call.
So a recursion tree is a diagram representing the additional cost of each recursive call in terms of its input size. We add the extra cost of every recursive call to get the overall time complexity. Note: one of the best ideas is to perform this cost sum level by level.
Merge sort recurrence relation: T(n) = 2T(n/2) + cn, T(1) = c
We divide a problem of size n into two subproblems (subarrays) of size n/2. Here cn represents the cost of merging the solutions of the smaller subproblems (the smaller sorted arrays of size n/2). So, at each step of recursion, the problem is divided in half until the size of the subproblem becomes 1.
Here, the recursion tree is a full binary tree structure where each node has two children and each level is completely full. At each level, the input size decreases by a factor of 2, and the recursion tree terminates at input size 1. So the height of the recursion tree h = logn (Think!)
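Based on this tree structure, we can perform the level-by-level cost sum mentioned above. Here is the worked calculation in the same notation:

- Level 0: 1 node of cost cn => total cost cn
- Level 1: 2 nodes of cost cn/2 each => total cost cn
- Level 2: 4 nodes of cost cn/4 each => total cost cn
- ...
- Level logn: n nodes of cost c each => total cost cn

Adding the costs of all (logn + 1) levels: T(n) = cn * (logn + 1) = O(nlogn).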
To explore further, understand the analysis of binary search and quick sort using the recursion tree method.
We use the master method to find the time complexity of divide and conquer algorithms that partition an input into smaller subproblems of equal size. It is primarily a direct way to get the solution for recurrences that can be transformed into the form: T(n) = aT(n/b) + O(n^k), where a ≥ 1 and b > 1.
The master theorem recurrence describes the time complexity of an algorithm that divides a problem of size n into a subproblems, each of size n/b, where a and b are positive constants. Here, the a subproblems are solved recursively, each in time T(n/b), and O(n^k) is the cost of dividing the problem and combining the solutions of the subproblems. There are three cases of analysis using the master theorem:
- Case 1: If k < logb(a), then T(n) = O(n^logb(a)).
- Case 2: If k = logb(a), then T(n) = O(n^k * logn).
- Case 3: If k > logb(a), then T(n) = O(n^k).
Comparing the master theorem recurrence with the binary search recurrence relation T(n) = T(n/2) + c:
Here a = 1, b = 2 (a ≥ 1 and b > 1)
O(n^k) = c = O(n^0) => k = 0
=> logb(a) = log2(1) = 0
=> k = logb(a)
We can apply case 2 of the master theorem.
T(n) = O(n^k * log(n)) = O(n^0 * log(n)) = O(logn)
Comparing the master theorem recurrence with the merge sort recurrence relation T(n) = 2T(n/2) + cn:
Here a = 2, b = 2 (a > 1 and b > 1)
O(n^k) = cn = O(n^1) => k = 1
logb(a) = log2(2) = 1
Hence => k = logb(a) = 1
We can apply case 2 of the master theorem.
Time complexity T(n) = O(n^k * logn) = O(n^1 * logn) = O(nlogn)
Comparing the master theorem recurrence with the recurrence relation for finding max and min, T(n) = 2T(n/2) + c:
Here a = 2, b = 2 (a > 1 and b > 1)
O(n^k) = c = O(n^0) => k = 0
logb(a) = log2(2) = 1
Hence => logb(a) > k
We can apply case 1 of the master theorem.
Time complexity T(n) = O(n^logb(a)) = O(n^1) = O(n)
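The max and min algorithm itself is not shown above; one common divide and conquer formulation that produces the recurrence T(n) = 2T(n/2) + c is sketched below (an assumption, returning a (max, min) pair):

```cpp
#include <algorithm>
#include <utility>

// Find both the maximum and minimum of A[l..r] by divide and conquer.
std::pair<int, int> maxMin(int A[], int l, int r)
{
    // Base case: a single element is both the max and the min.
    if (l == r)
        return {A[l], A[l]};
    int mid = l + (r - l) / 2;
    std::pair<int, int> left = maxMin(A, l, mid);       // subproblem of size n/2
    std::pair<int, int> right = maxMin(A, mid + 1, r);  // subproblem of size n/2
    // Combine: a constant number of comparisons => O(n^0) extra cost.
    return {std::max(left.first, right.first),
            std::min(left.second, right.second)};
}
```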
Comparing the master theorem recurrence with Strassen's matrix multiplication recurrence T(n) = 7T(n/2) + cn^2:
Here a = 7, b = 2 (a > 1 and b > 1)
O(n^k) = cn^2 = O(n^2) => k = 2
logb(a) = log2(7) = 2.8 (approximately)
Hence => logb(a) > k
We can apply case 1 of the master theorem.
Time complexity T(n) = O(n^logb(a)) = O(n^log2(7)) = O(n^2.8)
Enjoy learning, Enjoy coding, Enjoy algorithms!