0-1 Knapsack Problem

Difficulty: Medium, Asked-In: Amazon, Microsoft, Yahoo, Zoho, Visa

Key takeaway: An excellent problem to learn problem solving using dynamic programming and space complexity optimization.

Let’s understand the problem

We are given n items, where each item has a weight and profit associated with it. We are also given a knapsack with capacity C (the knapsack can hold at most C weight). Write a program to determine the maximum profit that can be obtained by selecting a subset of items without exceeding the knapsack capacity.

  • P[]: An array containing the profits of items.
  • W[]: An array containing the weights of items.
  • n: The number of items.
  • C: The capacity of the knapsack.

Constraint:

  • We can either put an item completely into the knapsack or not put it at all. In other words, it is not possible to put a part of an item into the knapsack.
  • No item can be selected more than once.

Examples

Input: n = 3, W[] = [3, 1, 4], P[] = [25, 20, 20], C = 5

Output: 45

Explanation: There are 7 possible non-empty subsets of items (described here by their weights): {3}, {1}, {4}, {3, 1}, {3, 4}, {1, 4}, and {3, 1, 4}. Out of these, {3, 4} and {3, 1, 4} have combined weights that exceed the maximum capacity of 5, so we do not consider them in the solution. Among the remaining subsets, {3}, {1}, {4}, {3, 1}, and {1, 4}, the subset {3, 1} yields the maximum profit, which is 25 + 20 = 45.

Input: n = 3, W[] = [4, 5, 1], P[] = [1, 2, 3], C = 4
Output: 3

Explanation: There are two items which have a weight less than or equal to 4. If we select the item with weight 4, the profit is 1. And if we select the item with weight 1, the profit is 3. So the maximum possible profit is 3. Note that we cannot put both the items with weights 4 and 1 together as the capacity of the knapsack is 4.

Input: n = 3, W[] = [4, 5, 6], P[] = [1, 2, 3], C = 3
Output: 0

Explanation: The weight of each item exceeds the maximum capacity of the knapsack, and we cannot select any items. So, the maximum possible profit is 0.

Discussed solution approaches

  • Brute force approach using recursion
  • Efficient approach using top-down memoization
  • Efficient approach using bottom-up DP
  • Space-optimized version of the bottom-up approach
  • Another space-optimized version using a 1-D array

Brute force approach using recursion

The most basic solution would be to consider all possible subsets of items and calculate the weight and total profit for each subset. Here, we only choose the subsets whose total weight is smaller than the knapsack capacity C, and among these subsets, we identify the subset with the maximum profit.

The critical question is: How can we generate all subsets of items? Here is an idea! There are two choices for every item: either the item is included in the knapsack or the item is excluded from the knapsack. Suppose we use the function knapsack(P[], W[], n, C) to find the maximum profit for n items.

Case 1: If we include the nth item in the knapsack

If the weight of the nth item (W[n - 1]) is less than or equal to the capacity C, we can include it in the knapsack. If we include the nth item, we will acquire a profit of P[n - 1], and the remaining capacity of the knapsack will be C - W[n - 1].

Now, the problem will be reduced to a smaller version of the knapsack problem: Maximizing the profit from the remaining n - 1 items with a capacity of C - W[n - 1]. So, we call the same function recursively with new input parameters, i.e., knapsack(P[], W[], n - 1, C - W[n - 1]).

if (C >= W[n - 1])
    profitInclude = P[n - 1] + knapsack(P[], W[], n - 1, C - W[n - 1])

Case 2: If we exclude the nth item from the knapsack

If we exclude the nth item, the capacity of the knapsack remains the same, but we have only n - 1 items left. Now, the problem is reduced to another smaller version of the knapsack problem: Maximizing the profit from the remaining n - 1 items with a capacity of C. So, we call the same function recursively with new input parameters, i.e., knapsack(P[], W[], n - 1, C).

profitExclude = knapsack(P[], W[], n-1, C)

The maximum profit obtained from n items will be the maximum of the above two cases, i.e., max(profitInclude, profitExclude).

Base case: If there are no more items to consider (n == 0) or the knapsack capacity becomes zero (C == 0), we can’t select any more items. So, we return 0 as the maximum profit.

Solution code C++

int knapsack(int P[], int W[], int n, int C)
{
    if (n == 0 || C == 0)
        return 0;

    int profitInclude = 0;
    if (C >= W[n - 1])
        profitInclude = P[n - 1] + knapsack(P, W, n - 1, C - W[n - 1]);

    int profitExclude = knapsack(P, W, n - 1, C);

    return max(profitInclude, profitExclude);
}   

Solution code Python

def knapsack(P, W, n, C):
    if n == 0 or C == 0:
        return 0

    profitInclude = 0
    if C >= W[n - 1]:
        profitInclude = P[n - 1] + knapsack(P, W, n - 1, C - W[n - 1])

    profitExclude = knapsack(P, W, n - 1, C)

    return max(profitInclude, profitExclude)
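
As a quick sanity check, the recursive solution can be run on the three examples from this article. Here is a minimal driver sketch, assuming the Python function above is in scope:

print(knapsack([25, 20, 20], [3, 1, 4], 3, 5))   # Expected output: 45
print(knapsack([1, 2, 3], [4, 5, 1], 3, 4))      # Expected output: 3
print(knapsack([1, 2, 3], [4, 5, 6], 3, 3))      # Expected output: 0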

Time complexity analysis

There are two possibilities for each of the n items. So the total number of possible subsets = 2 * 2 * ... * 2 (n times) = 2^n. In the worst case, recursion will explore all these possible combinations of items. So time complexity = O(2^n).

This solution is highly inefficient, and the key reason behind this is that recursion solves the same sub-problems repeatedly. We can observe scenarios of repeated sub-problems by drawing the recursion tree.

Here is an example where one sub-problem is repeated. If we expand the recursion tree further, we will observe more instances of repeated sub-problems.

Recursion tree diagram to visualize repeated sub-problems in the 0-1 knapsack problem
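
To see this repetition concretely, we can instrument the recursion and count how many times each (items, capacity) state is visited. The sketch below is an illustrative, instrumented copy of the recursive function (the name knapsack_traced is hypothetical):

from collections import Counter

calls = Counter()

def knapsack_traced(P, W, n, C):
    calls[(n, C)] += 1   # record every visit to the state (n, C)
    if n == 0 or C == 0:
        return 0
    profitInclude = 0
    if C >= W[n - 1]:
        profitInclude = P[n - 1] + knapsack_traced(P, W, n - 1, C - W[n - 1])
    profitExclude = knapsack_traced(P, W, n - 1, C)
    return max(profitInclude, profitExclude)

knapsack_traced([25, 20, 20], [3, 1, 4], 3, 5)
# Any state visited more than once is a repeated sub-problem;
# for larger inputs, this repetition grows rapidly.
print({state: count for state, count in calls.items() if count > 1})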

Efficient approach using top-down memoization

Solution idea

Now, the critical question is: How can we improve the time complexity? One idea is to use the top-down approach of dynamic programming, which is a modified version of the recursive solution.

Here, we calculate each subproblem only once using recursion and store its value in additional memory. When the same subproblem appears again during recursion, instead of solving it again, we directly retrieve the stored solution in constant time. This significantly reduces unnecessary calculations.

Solution steps

We initialize a 2D array dp[][] of size (n + 1) * (C + 1) and set all the values to -1. We use this array to store the solutions of the sub-problems. Now, the critical question is: Why is the size of the table (n + 1) * (C + 1)? Let's think! During recursion, the item count takes values from n down to 0 and the remaining capacity takes values from C down to 0. So there are at most (n + 1) * (C + 1) distinct sub-problems, and we need to allocate an entry for each one of them.

Now, we call a helper function knapsack(P[], W[], i, C, dp), where the initial value of i is n.

  1. Base case: If there are no more items to consider (i == 0) or the knapsack capacity becomes zero (C == 0), we can't select any more items. So we return 0 as the maximum profit. If the base case is not true, we check if the solution for the current state (dp[i][C]) is already computed or not. 
  2. If dp[i][C] < 0, it means the result is not yet computed. So, we proceed with the recursive calculation:

    • We calculate the profit by including the ith item if C >= W[i-1] i.e. profitInclude = P[i - 1] + knapsack(P, W, i - 1, C - W[i - 1], dp).
    • We calculate the profit by excluding the ith item and considering the remaining (i - 1) items and the current capacity C i.e. profitExclude = knapsack(P, W, i - 1, C, dp).
    • We store the maximum profit among including and excluding the ith item for future reference i.e. dp[i][C] = max (profitInclude, profitExclude).
  3. If dp[i][C] >= 0, it means the result has already been calculated and stored in dp[i][C]. So there is no need to do the calculation again, and we can directly return this value.

Solution code C++

int knapsack(int P[], int W[], int i, int C, int** dp)
{
    if (i == 0 || C == 0)
        return 0;

    if (dp[i][C] < 0)
    {
        int profitInclude = 0;
        if (C >= W[i-1])
            profitInclude = P[i-1] + knapsack(P, W, i - 1, C - W[i-1], dp);
        
        int profitExclude = knapsack(P, W, i - 1, C, dp);
        dp[i][C] = max(profitInclude, profitExclude);
    }
    return dp[i][C];
}

int zeroOneKnapSack(int P[], int W[], int n, int C)
{
    int** dp = new int*[n + 1];
    for (int i = 0; i <= n; i = i + 1)
    {
        dp[i] = new int[C + 1];
        for (int j = 0; j <= C; j = j + 1)
            dp[i][j] = -1;
    }

    int result = knapsack(P, W, n, C, dp);

    // Free the dynamically allocated table before returning.
    for (int i = 0; i <= n; i = i + 1)
        delete[] dp[i];
    delete[] dp;

    return result;
}

Solution code Python

def knapsack(P, W, i, C, dp):
    if i == 0 or C == 0:
        return 0

    if dp[i][C] < 0:
        profitInclude = 0
        if C >= W[i - 1]:
            profitInclude = P[i - 1] + knapsack(P, W, i - 1, C - W[i - 1], dp)

        profitExclude = knapsack(P, W, i - 1, C, dp)
        dp[i][C] = max(profitInclude, profitExclude)

    return dp[i][C]


def zeroOneKnapSack(P, W, n, C):
    dp = [[-1] * (C + 1) for _ in range(n + 1)]

    result = knapsack(P, W, n, C, dp)
    return result
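
As a side note, in Python the same top-down memoization can also be written with functools.lru_cache, which caches the results of the helper function automatically instead of using a hand-managed dp table. This is only an alternative sketch of the same idea (the name zeroOneKnapSackCached is hypothetical):

from functools import lru_cache

def zeroOneKnapSackCached(P, W, n, C):
    @lru_cache(maxsize=None)   # memoizes results keyed by the state (i, c)
    def solve(i, c):
        if i == 0 or c == 0:
            return 0
        profitExclude = solve(i - 1, c)
        if c >= W[i - 1]:
            return max(profitExclude, P[i - 1] + solve(i - 1, c - W[i - 1]))
        return profitExclude

    return solve(n, C)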

Time and space complexity analysis

In the top-down approach, we solve each sub-problem once using recursion and avoid recomputation when sub-problems reappear. So, time complexity = O(n*C). This is a significant improvement compared to the previous approach!

We require extra memory of size (n + 1) * (C + 1), so space complexity = O(n*C). However, there is additional space overhead for the recursion call stack. The critical question is: What would be the size of the recursion call stack? Explore and think!

Efficient approach using bottom-up DP

Solution idea

This is a dynamic programming problem. So, we can efficiently solve the 0-1 knapsack problem using the iterative, bottom-up approach of dynamic programming. How can we do this? Let's think!

In the bottom-up approach, we start by solving subproblems with smaller sizes (0 items and 0 capacity). Then, we build the solution iteratively for larger subproblems using the previously solved smaller subproblems until we find the solution to the given problem (n items and C capacity). During this process, we use additional memory to store the solution of each subproblem.

Solution steps

Defining table size: We define a 2D table dp[n + 1][C + 1] to store the solution for each possible subproblem.

Table initialization: Now, we initialize the table with the solution of the smallest version of the subproblem (similar to the base case in the recursive solution). For this, we set the value 0 in the first row and the first column of the table.

Iterative structure to fill the table: Now, we define an iterative structure to fill the table and store the solution of subproblems in a bottom-up manner. We can get this insight from the recursive structure of the above recursive solution. Here are the steps!

We use nested loops to fill the remaining cells of the dp table. The outer loop runs from i = 1 to n to iterate over each item, and the inner loop runs from j = 1 to C to consider each possible capacity value. At each iteration:

  • If the weight of the ith item (W[i - 1]) is greater than the current knapsack capacity j, then the ith item cannot be included. So at dp[i][j], we copy the solution stored at dp[i - 1][j]. Note: Here dp[i - 1][j] stores the maximum profit obtained for knapsack capacity j after considering the i - 1 items.
  • Otherwise, the ith item can be included and we consider two possibilities: 1) Solution for not including the ith item, i.e., dp[i - 1][j], 2) Solution for including the ith item, i.e., P[i - 1] + dp[i - 1][j - W[i - 1]]. We take the maximum of both possibilities and assign it to dp[i][j]. Note: Here dp[i - 1][j - W[i - 1]] stores the maximum profit obtained for the knapsack capacity j - W[i - 1] after considering the i - 1 items.

if (W[i - 1] > j)
    dp[i][j] = dp[i - 1][j];
else
    dp[i][j] = max(P[i - 1] + dp[i - 1][j - W[i - 1]], dp[i - 1][j]);

Returning the final solution: After filling the entire dp table, we return the value stored in the bottom-right cell of the table, dp[n][C], which represents the maximum profit that can be achieved considering all n items and knapsack capacity C.

Storing results in the bottom-up approach of the 0-1 knapsack problem

Solution code C++

int knapsack(int P[], int W[], int n, int C) 
{
    int dp[n + 1][C + 1];

    for (int i = 0; i <= n; i++)
        dp[i][0] = 0;

    for (int j = 0; j <= C; j++)
        dp[0][j] = 0;

    for (int i = 1; i <= n; i++) 
    {
        for (int j = 1; j <= C; j++) 
        {
            if (W[i - 1] > j)
                dp[i][j] = dp[i - 1][j];
            else
                dp[i][j] = max(P[i - 1] + dp[i - 1][j - W[i - 1]], dp[i - 1][j]);
        }
    }

    return dp[n][C];
}

Solution code Python

def knapsack(P, W, n, C):
    dp = [[0] * (C + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for j in range(1, C + 1):
            if W[i - 1] > j:
                dp[i][j] = dp[i - 1][j]
            else:
                dp[i][j] = max(P[i - 1] + dp[i - 1][j - W[i - 1]], dp[i - 1][j])

    return dp[n][C]
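
To visualize how the table is filled, we can build and print the dp table for the first example (P = [25, 20, 20], W = [3, 1, 4], C = 5). This is a small illustrative sketch that repeats the filling logic of the function above:

P, W, n, C = [25, 20, 20], [3, 1, 4], 3, 5
dp = [[0] * (C + 1) for _ in range(n + 1)]
for i in range(1, n + 1):
    for j in range(1, C + 1):
        if W[i - 1] > j:
            dp[i][j] = dp[i - 1][j]
        else:
            dp[i][j] = max(P[i - 1] + dp[i - 1][j - W[i - 1]], dp[i - 1][j])

for row in dp:
    print(row)   # the bottom-right value dp[n][C] is the answer (45 here)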

Time and space complexity analysis

In the bottom-up approach, we use two separate loops for initialization and one nested loop for storing results in the table. During each iteration of the loop, we perform constant operations. So time complexity = O(n) + O(C) + O(nC) = O(n*C). We use an extra 2D table of size (n + 1)*(C + 1), so space complexity = O(n*C).

This approach is slightly more efficient compared to the top-down approach. The reason is simple: there is no overhead of recursive calls and call stack.

Space-optimized version of the bottom-up approach

Solution idea

The above solution uses O(n*C) space. The critical question is: Can we reduce the space complexity further? Can we use a table with just two rows to get the solution of the 0-1 knapsack problem? Let's think!

If we observe the pattern of filling the table, we can notice an idea: To fill the (i, j) entry, we use two values stored in the previous row, at (i - 1, j) and (i - 1, j - W[i - 1]). In other words, for filling the ith row, we only use values stored in the (i - 1)th row. So, we can use only two rows to store solutions: the previous row and the current row.

We define a 2D array of size 2 x (C + 1) to represent the two rows: dp[0] and dp[1]. For the current row i, we store values at row i % 2 and previous row values at (i - 1) % 2. Taking modulus with 2 helps us switch between these two rows based on the current row index i.

For example, for i = 3, the previous row will be stored at dp[0] and the current row will be filled at dp[1]. Then, for i = 4, dp[1] will contain the entries of the previous row and the current row will be filled at dp[0] (overwriting the values previously stored in dp[0]). This is the idea of a rolling array in dynamic programming.

So there are two major changes in the above solution: 1) Instead of an (n + 1) x (C + 1) table, we use a 2 x (C + 1) table. 2) Instead of using i and i - 1, we use i % 2 and (i - 1) % 2. The remaining ideas will remain the same!

Solution code C++

int knapsack(int P[], int W[], int n, int C)
{
    int dp[2][C + 1]; 
    for (int i = 0; i <= n; i++) 
    {
        for (int j = 0; j <= C; j++) 
        {
            if (i == 0 || j == 0)
                dp[i % 2][j] = 0;
            else if (W[i - 1] <= j)
                dp[i % 2][j] = max(P[i - 1] + dp[(i - 1) % 2][j - W[i - 1]], dp[(i - 1) % 2][j]);
            else
                dp[i % 2][j] = dp[(i - 1) % 2][j];
        }
    }
    return dp[n % 2][C];
}

Solution code Python

def knapsack(P, W, n, C):
    dp = [[0] * (C + 1) for _ in range(2)]

    for i in range(n + 1):
        for j in range(C + 1):
            if i == 0 or j == 0:
                dp[i % 2][j] = 0
            elif W[i - 1] <= j:
                dp[i % 2][j] = max(P[i - 1] + dp[(i - 1) % 2][j - W[i - 1]], dp[(i - 1) % 2][j])
            else:
                dp[i % 2][j] = dp[(i - 1) % 2][j]

    return dp[n % 2][C]

Time and space complexity analysis

The time complexity of this approach will be the same as the previous approach, i.e., O(n*C). However, the space complexity will be reduced to O(C) because we are using only a table with two rows.

Another space-optimized version using a 1-D array

Solution idea

The above solution works well in O(C) space. The critical question is: Can we solve the 0-1 knapsack problem using a single row, i.e., a 1-D array? The idea is: If we fill each row from right to left (iterating the capacity from C down to 0), it can be done with a single row only. How? Let's think!

Suppose we take a 1-D table dp[C + 1] and initialize all values with 0. Suppose that at the start of the ith iteration, the table holds the values computed in the (i - 1)th iteration, i.e., the (i - 1)th row. To calculate the value at each cell dp[j] in the ith iteration (moving from right to left), we have the two required values from the (i - 1)th iteration available: dp[j] and dp[j - W[i - 1]].

In other words, at the start of the ith iteration:

  • dp[j] will store a value equivalent to dp[i - 1][j].
  • dp[j - W[i - 1]] will store a value equivalent to dp[i - 1][j - W[i - 1]].

So, when we fill dp[j] while traversing from right to left in the ith iteration (equivalent to dp[i][j]), both required values are still available: since j - W[i - 1] < j, the cell dp[j - W[i - 1]] has not yet been overwritten in the current iteration.

Solution code C++

int knapsack(int P[], int W[], int n, int C)
{
    int dp[C + 1];
    memset(dp, 0, sizeof(dp));
 
    for (int i = 1; i < n + 1; i++) 
    {
        for (int j = C; j >= 0; j--) 
        {
            if (W[i - 1] <= j)
                dp[j] = max(dp[j], dp[j - W[i - 1]] + P[i - 1]);
        }
    }

    return dp[C];
}

Solution code Python

def knapsack(P, W, n, C):
    dp = [0] * (C + 1)

    for i in range(1, n + 1):
        for j in range(C, -1, -1):
            if W[i - 1] <= j:
                dp[j] = max(dp[j], dp[j - W[i - 1]] + P[i - 1])

    return dp[C]
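
A small illustrative sketch of why the inner loop must run from right to left: if we traverse from left to right, dp[j - W[i - 1]] may already include the current item, so the same item can be counted more than once (this is the behaviour of the unbounded knapsack, not the 0-1 knapsack). The function name knapsack_wrong below is hypothetical:

def knapsack_wrong(P, W, n, C):
    dp = [0] * (C + 1)
    for i in range(1, n + 1):
        for j in range(W[i - 1], C + 1):   # left-to-right traversal (incorrect for 0-1 knapsack)
            dp[j] = max(dp[j], dp[j - W[i - 1]] + P[i - 1])
    return dp[C]

# One item of weight 2 and profit 10 with capacity 4: the correct 0-1 answer is 10,
# but the left-to-right version returns 20 because it reuses the item.
print(knapsack_wrong([10], [2], 1, 4))   # prints 20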

Time and space complexity analysis

The time and space complexity of this approach are the same as the previous approach, i.e., O(n*C) time and O(C) space. However, the actual memory usage is roughly halved because we use a single row of size C + 1 instead of a two-row 2D table.

Critical ideas to think!

  • What would be the size of the recursion call stack in the recursive solution?
  • What would be the solution idea when we have an infinite supply of each item? In other words, how can we approach this problem when we have the option to select any item more than once?
  • Fractional Knapsack Problem: Suppose items are like grains, where we have the option to select a fraction of items. How can we solve this problem? Here, no item can be selected more than once.
  • Which approach is more efficient in terms of time and space complexity: the top-down approach or the bottom-up approach?
  • Can we implement the last approach by traversing from left to right?
  • In the top-down approach, why are we initializing table values as -1?
  • What would be the best and worst-case scenarios in the case of a recursive solution?
  • In the space-optimized solution, instead of using i % 2, can we use flag ^ 1? If the flag is 0, the result of the XOR operation will be 1, and if the flag is 1, the result will be 0. So it will flip the value of the flag variable between 0 and 1 (see the sketch after this list).
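
Here is a minimal sketch of the flag-toggling idea from the last point above. It mirrors the two-row solution, but flips between the rows with XOR instead of computing i % 2 (the function name knapsack_two_rows is hypothetical):

def knapsack_two_rows(P, W, n, C):
    dp = [[0] * (C + 1) for _ in range(2)]   # dp[0] starts as the "0 items" row
    flag = 0                                 # dp[flag] holds the previous row
    for i in range(1, n + 1):
        cur = flag ^ 1                       # flip between row 0 and row 1
        for j in range(C + 1):
            if j == 0 or W[i - 1] > j:
                dp[cur][j] = dp[flag][j]
            else:
                dp[cur][j] = max(P[i - 1] + dp[flag][j - W[i - 1]], dp[flag][j])
        flag = cur                           # the current row becomes the previous row
    return dp[flag][C]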

Similar coding questions to solve

  • Rod Cutting Problem: Given a rod of length n and an array p[] containing prices of rods of different lengths, determine the maximum revenue that can be obtained by cutting up the rod and selling the pieces. The revenue obtained is the sum of the prices of all the pieces.
  • Coin change problem: Given a target amount and a set of coin denominations, the task is to find the minimum number of coins required to make a change for the target amount. Suppose we have an infinite supply of each coin.
  • Weighted Job Scheduling: Given n jobs, where each job i has a start time si, an end time ei, and a weight or value wi, find a subset of jobs that do not overlap in time intervals and maximize the total weight or value of the selected jobs.
  • Subset Sum Problem: Given a set of positive integers and a target sum, determine whether there exists a subset of the integers that adds up to the target sum.
  • Equal Sum Partition: Given an array of positive integers, divide the array into two subsets such that the sum of elements in both subsets is equal.
  • Count of Subset Sum: Given a set of positive integers and a target sum, find the total number of subsets that sum up to the target.
  • Minimum Subset Sum Difference: Given a set of positive integers, partition the set into two subsets such that the difference between the sum of elements in the two subsets is minimized.

If you have any queries or feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!
