Find Length of Largest Subarray with Zero Sum

Difficulty: Medium, Asked-in: Microsoft, Amazon

Key takeaway: An excellent problem to learn problem-solving using a hash table.

Let’s understand the problem!

Given an array X[] of n integers, write a program to find the length of longest subarray with sum equal to 0. In general, for all j > i, find max (j - i + 1) among all subarray with zero-sum. Note: Length of subarray starting from index i and ending at index j = j - i + 1.

Example 1

Input: X[] = [14, -1, 1, -6, 1, 5, 12, 17], Output: 5

Explanation: The longest sub-array with elements summing up to 0 is [-1, 1, -6, 1, 5].

Example 2

Input: X[] = [1, 5, 10], Output: 0

Explanation: There is no subarray with 0 sum.

Example 3

Input: X[] = [1, 0, 2], Output: 1

Explanation: The longest sub-array with elements summing up to 0 is [0]

Discussed solution approaches

  • Brute force approach using three nested loops
  • Solution approach using two nested loops
  • Efficient approach using hash table

Brute force idea using three nested loops

Solution idea and steps

A basic solution would be to explore all possible subarrays, calculate the sum of each subarray and track maximum length of subarray with zero-sum using a variable.

  • We initialize variable maxLengthZeroSubarray to track maximum length of the subarray with zero-sum.
  • Now we run two nested loops to generate all pair of indices (i, j), where i <= j. Here outer loop will run from i = 0 to n - 1 and inner loop will run from j = i to n - 1.
  • For every pair (i, j), we run another loop from k = i to j to calculate sub-array sum between indices (i, j). We store this sum in a variable subarraySum.
  • When we find any sub-array sum equal to zero i.e. if(subarraySum == 0), we update the maximum length found so far i.e. if (maxLengthZeroSubarray < j - i + 1), maxLengthZeroSubarray = j - i + 1. Here length of each subarray is j - i + 1.
  • After exploring the sum of all subarrays using three nested loops, we return value stored in variable maxLengthZeroSubarray.

Solution pseudocode

int largestSubarrayZeroSum(int X[], int n)
{
    int maxLengthZeroSubarray = 0
    for (int i = 0, i < n; i = i + 1)
    {
        for (int j = i; j < n; j = j + 1) 
        {   
            int subarraySum = 0
            for (int k = i; k <= j; k = k + 1)
                subarraySum = subarraySum + X[k]
            
            if (subarraySum == 0)
            {
                if (maxLengthZeroSubarray < j - i + 1)
                    maxLengthZeroSubarray = j - i + 1
            }      
        }
    }
    return maxLengthZeroSubarray
}

Solution analysis

We are running three nested loops. So, time complexity in the worst case = O(n³). Note: Explore the last part of this blog to understand the analysis of above three nested loops: analysis of loop in programming.

We are using constant extra space. Space complexity = O(1)

Optimized solution using two nested loops

Solution idea

Now critical questions are: Can we further optimize the time complexity? Can we avoid the innermost loop from k = i to j to calculate subarray sum? Let's think! One idea is: If we know sub-array sum from i to j, we can easily calculate the sub-array sum from i to j + 1 in O(1). Subarray sum from i to j + 1 = Subarray sum from i to j + X[j + 1].

Here we run outer loop from i = 0 to n - 1 to pick the starting index i, and run inner loop from j = i to n - 1 to calculate the running sum of all sub-arrays starting from index i. Inside inner loop, if subarray sum is equal to zero, we calculate its length and update max subarray length found so far. This is an optimized version of the above approach, where we are running only two nested loops and exploring sub-array starting at all positions.

Solution pseudocode

int largestSubarrayZeroSum(int X[], int n)
{
    int maxLengthZeroSubarray = 0
    for (int i = 0, i < n; i = i + 1)
    {
        int subarraySum = 0
        for (int j = i; j < n; j = j + 1)
        {
            subarraySum = subarraySum + X[j]
            if (subarraySum == 0)
                maxLengthZeroSubarray = max(maxLengthZeroSubarray, j - i + 1)
        }        
    }
    return maxLengthZeroSubarray
}

Solution analysis

We are running two nested loops to explore each subarray from (i, j), where i <= j. Total number of loop iteration = n + (n - 1) + .... + 2 + 1 = n(n + 1)/2 = O(n^2). At each iteration, we are doing O(1) operations. So time complexity = Total number of nested loops iteration * O(1) = O(n^2) * O(1).

We are using constant extra space. So space complexity = O(1)

Solution approach 3: An efficient idea using hash table

Solution idea

Now critical questions are: Can we improve time complexity to O(n)? Can we solve this problem in using single loop? Let’s think! Suppose sum(j, i) is the sub-array sum from index j to i where j < i. Here we can identify two properties related to sum(j, i).

  1. Continuous sum of elements from j to i = Continuous sum of elements from 0 to i - Continuous sum of elements from 0 to j - 1 => sum(j, i) = sum(0, i) - sum(0, j - 1)
  2. From the above equation, if sum(0, i) = sum(0, j - 1), then sum(j, i) = 0.

So there could be two possibilities for sub-array with zero sum:

  • Case 1: Zero subarray sum starting from 0 and ending at some index i. sum(0, i) = 0.
  • Case 2: Zero subarray sum starting from some index j and ending at some index i. sum(j, i) = 0 where 0 < j < i < n.

Using the above hypothesis, we can solve the problem by traversing array linearly and storing the sum of all subarrays starting from the index 0 in the hash table (Think!).

Solution steps

  1. We initialize two variables subarraySumStart (to track the subarray sum starting from index 0 to i) and maxLengthZeroSubarray (to track the max length of sub-array with zero-sum) with 0.
  2. We use hash table H to store the sum of subarrays starting from index 0 as key and end index i as value. In other words, we store it in the form of key-value pair (subarraySumStart, i).
  3. Now we run loop from i = 0 to n - 1. Inside the loop:
  4. We calculate sub-array sum starting from index 0 (subarraySumStart).
  5. If subarraySumStart is equal to zero, we have found case 1 of the above possibilities. So we update variable maxLengthZeroSubarray i.e. if (maxLengthZeroSubarray < i + 1), maxLengthZeroSubarray = i + 1.

    subarraySumStart = subarraySumStart + X[i]
    if (subarraySumStart == 0)
    {
      if (maxLengthZeroSubarray < i + 1)
          maxLengthZeroSubarray = i + 1
    }
  6. Otherwise, there can be chance of case 2. So We search subarraySumStart in the hash table.
  7. If subarraySumStart is not present in hash table, we store current index i in the hash table using subarraySumStart as a key i.e. H.insert(subarraySumStart) = i.
  8. Otherwise, we have already stored some previous index H[subarraySumStart] in the hash table with the same value of subarraySumStart. This is a case 2 where subarray sum from index 0 to current index i is equal to subarray sum from index 0 to index H[subarraySumStart]. In other words, we can say that the sum of subarray from index H[subarraySumStart] + 1 to index i is equal to zero. So we update the value of maximum length i.e. if(maxLengthZeroSubarray < i - H[subarraySumStart]), maxLengthZeroSubarray = i - H[subarraySumStart].

    else if (H.search(subarraySumStart) == false)
      H.insert(subarraySumStart) = i
    else
    {
      if (maxLengthZeroSubarray < i - H[subarraySumStart])
          maxLengthZeroSubarray = i - H[subarraySumStart]
    } 
  9. By end of the whole loop, we return the value stored in maxLengthZeroSubarray.

Solution pseudocode

int largestZeroSubarraySum(int X[], int n)
{
    HashTable H
    int subarraySumStart = 0
    int maxLengthZeroSubarray = 0
    for (int i = 0, i < n; i = i + 1)
    {
        subarraySumStart = subarraySumStart + X[i]
        if (subarraySumStart == 0)
        {
            if (maxLengthZeroSubarray < i + 1)
                maxLengthZeroSubarray = i + 1
        }       
        else if (H.search(subarraySumStart) == false)
            H.insert(subarraySumStart) = i
        else
        {
            if (maxLengthZeroSubarray < i - H[subarraySumStart])
                maxLengthZeroSubarray = i - H[subarraySumStart]
        } 
            
    }
    return maxLengthZeroSubarray
}

Solution analysis

We are linearly traversing the array and performing one search or one insert operation at each step of the iteration. Time complexity of one insert/search operation = O(1) average. So overall time complexity = n*O(1) = O(n). Space Complexity = O(n), for the hash table

Critical ideas to think!

  • Can we solve this problem using some other approach?
  • How do we modify the above approach if we also need to return all indexes of the subarray with zero-sum?
  • How do we modify the above approach to find the largest subarray with a given sum other than zero?
  • Can we solve this problem using some other data structures?

Suggested coding problems to solve

Enjoy learning! Enjoy algorithms!

Share feedback with us

More blogs to explore

Our weekly newsletter

Subscribe to get weekly content on data structure and algorithms, machine learning, system design and oops.

© 2022 Code Algorithms Pvt. Ltd.

All rights reserved.