Difficulty: Hard, Asked-in: Google, Amazon, Samsung
Key takeaway: An excellent problem to learning step-by-step optimization using various approaches. In-place hashing solution is an intuitive approach that uses the same input array to process values and generate correct output.
Given an array that includes positive and negative numbers, write a program to find the smallest missing positive integer.
Example 1
Input: X[] = [2, -9, 5, 11, 1, -10, 7], Output: 3
Explanation: Consecutive positive integers 1 and 2 are present in the array. So the first missing positive is 3.
Example 2
Input: X[] = [2, 3, -2, 1], Output: 4
Explanation: Consecutive positive integers 1, 2, and 3 are present in the array. So the first missing positive is 4.
Example 3
Input: X[] = [4, 6, 5, 3, 8], Output: 1
Explanation: All values are positive, but the first positive integer 1 is missing in the input. So the output is 1.
Solution idea
Positive number starts from 1 on the number scale. If the first missing positive number is k, then all the positive numbers from 1 to k - 1 will be present in the array.
The critical question is: What will be the maximum possible missing positive number in an array of size n? Think! The answer is simple: In the n size array, the maximum possible value of missing positive will be n + 1. This is the scenario when all numbers from 1 to n are present in the array.
So one basic approach would be to search all positive integers starting from 1 to n + 1 linearly in the array. If any positive integer is missing in the sequence, we will return that value as an output.
Solution pseudocode
int firstMissingPositive(int X[], int n)
{
for (int i = 1; i <= n + 1; i = i + 1)
{
bool missingFlag = true
for (int j = 0; j < n; j = j + 1)
{
if (X[j] == i)
{
missingFlag = false
break
}
}
if (missingFlag == true)
return i
}
}
Solution analysis
We are running two nested loops and doing constant operations at each iteration. Here both loops will run n times in the worst-case. So worst case time complexity = O(n)*O(n)*O(1) = O(n^2). We are using constant extra space, so space complexity = O(1).
Solution idea and steps
If we sort the input array, then all positive numbers get lined up in increasing order. So we can search linearly to find the first missing positive integer. For sorting, we can use efficient O(nlogn) sorting algorithms like heap sort, merge sort, or quicksort.
Solution pseudocode
int firstMissingPositive(int X[], int n)
{
heapSort(X, n)
int i = 0
While (X[i] < 1)
i = i + 1
missingPositive = 1
for (int j = i; j < n; j = j + 1)
{
if (missingPositive == X[i])
missingPositive = missingPositive + 1
else if (X[i] > missingPositive)
return missingPositive
}
return missingPositive
}
Solution analysis
Time complexity = Time complexity of sorting + Time compexity of linear scan for finding the first missing positive = O(nlogn) + O(n) = O(nlogn). Space complexity = O(1) (Think!).
Solution idea
Now the critical question is: Can we improve the time complexity further? As empahsized in the brute force approach, this is a searching problem where we need to look for positive integers starting from 1 to n + 1. Can we think of optimizing the brute force approach? Think!
To improve efficiency of searching, one idea will be to use hash table instead of linear search. Hash table can help us to perform searching and insertion efficiently in the O(1) average.
Solution steps
Solution pseudocode
int firstMissingPositive(int X[], int n)
{
HashTable H
for (int i = 0; i < n; i = i + 1)
H[X[i]] = true
for (int i = 1; i <= n + 1; i = i + 1)
{
if (H[i] == false)
return i
}
}
Solution analysis
Time complexity = Time complexity of inserting n elements into the hash table + Time complexity of searching n elements into the hash table = O(n) + O(n) = O(n). Space complexity = O(n), for the hash table of size n.
Solution idea and steps
Now the critical question is: Can we solve this problem in-place or without using extra space? Let’s think!
The best part of this idea is: We are modifying the same array and storing the results to get the correct output.
Solution pseudocode
void swap(int* a, int* b)
{
int temp
temp = *a
*a = *b
*b = temp
}
int firstMissingPos(int X[], int n)
{
int i = 0
while (i < n)
{
if (X[i] > 0 && X[i] <= n && X[X[i] - 1] != X[i])
swap(&X[i], &X[X[i] - 1])
else
i = i + 1
}
for (int i = 0; i < n; i = i + 1)
{
if (X[i] != i + 1)
return i + 1
}
return n + 1
}
Solution analysis
We are running two separate single loops. So worst case time complexity = Time complexity of modifying the array + Time complexity of searching first missing positive = O(n) + O(n) = O(n)
Space complexity = O(1), as we use the same input array to place the elements at their correct position. That's why such approaches are always premium!
Solution idea and steps
This idea is similar to the above approach, but there is a difference! We first divide the array into two parts: the first part consists of positive numbers, and the second part consists of negative numbers. We are assuming the array can be modified.
We can better understand this approach via an example.
Initial Array: 1 -1 -5 -3 3 4 2 8
Step 1: Partition: 1 8 2 4 3 | -3 -5 -1, positiveEnd = 5
Step 2: Modifying the input array.
1 < positiveEnd, we change the sign of value at index 1 - 1 = 0
Modified array: -1 8 2 4 3 | -3 -5 -1
2 < positiveEnd, we change the sign of value at index 2 - 1 = 1
Modified array: -1 -8 2 4 3 | -3 -5 -1
4 < positiveEnd, we change the sign of value at index 4 - 1 = 3
Modified array: -1 -8 2 -4 3 | -3 -5 -1
3 < positiveEnd, we change the sign of value at index 3 - 1 = 2
Modified array: -1 -8 -2 -4 3 | -3 -5 -1
Step 3: Traversing from index 0 to positiveEnd, we find X[4] = 3 > 0. The answer is 4 + 1 = 5
Why this approach is working perfectly? What is the critical reason behind it? We are leaving this intuition for the learners to think! Enjoy thinking!
Solution Pseudocode
int firstMissingPositive(int X[], int n)
{
int positiveEnd = partition(X, n)
for (int i = 0; i < positiveEnd; i = i + 1)
{
if (X[i] - 1 < positiveEnd)
X[X[i] - 1] = - X[X[i] - 1]
}
for (int i = 0; i < positiveEnd; i = i + 1)
{
if (X[i] > 0)
return i + 1
}
return positiveEnd + 1
}
int partition(int X[], int n)
{
int j = n - 1, i = 0
while (i <= j)
{
if (A[i] <= 0)
{
swap(A[i], A[j])
j = j - 1
}
else
i = i + 1
}
return i
}
Solution analysis
Time complexity = Time complexity of the partition + time complexity of modifying the input array + Linear scan to find the missing positive = O(n) + O(n) + O(n) = O(n)
Space complexity = O(1), we are solving the problem in place using the same input array!
Enjoy learning, Enjoy algorithms!
Subscribe to get weekly content on data structure and algorithms, machine learning, system design and oops.