Merge Overlapping Intervals

Difficulty: Medium, Asked-in: Amazon, Adobe, Hike

Key takeaway: An excellent problem to learn problem solving using sorting.

Let’s understand the problem

Given an array of intervals where the start and end points of each interval are provided, write a program to merge all overlapping intervals and return an array of non-overlapping intervals that cover all the intervals in the input.

  • The output should be an array of only mutually exclusive intervals.
  • The intervals can be given in any order.

Merge overlapping intervals example

More examples

Input: intervals = [ [9, 12], [3, 7], [1, 4], [14, 17] ], Output: [ [1, 7], [9, 12], [14, 17] ]

Explanation: Since intervals [1, 4] and [3, 7] are overlapping, we merge them into [1, 7].

Input: intervals = [ [1, 6], [6, 9] ], Output: [ [1, 9] ]

Explanation: Intervals [1, 6] and [6, 9] are overlapping.

Input: intervals = [ [8, 10], [2, 4], [1, 3], [5, 9] ], Output: [ [1, 4], [5, 10]]

Explanation: Intervals [2, 4] and [1, 3] are overlapping, we merge them into [1, 4]. Similarly, we merge [8, 10] and [5, 9] into [5, 10].

Brute force solution using nested loops

Solution idea

One basic idea is to construct the output array of merged intervals incrementally by traversing the input intervals. To achieve this, we iterate through each interval and check if it can be merged with any existing merged intervals in the output. If it cannot be merged, we include it as a separate interval in the output. How does this idea work? Let's explore!

Let's assume we have already merged intervals from index 0 to i - 1 in an output[] array. Now, for the ith interval, we traverse the output[] and check if the ith interval can be merged with any existing intervals in the output[]. If it cannot be merged, we add it to the output[]. Otherwise, we merge the ith interval with the overlapping interval in the output[].

To accomplish this, we use the merge function, which takes two intervals, X and Y, and verifies if they can be merged. In this function, we compare the end of interval X with the start of interval Y and vice versa. If the intervals do not overlap, we return false. However, if they overlap, we update the start of interval Y to the minimum of the two start points and the end of interval Y to the maximum of the two endpoints. Finally, we return true to indicate a successful merge.

Solution steps

  • We create an array called output[] to store the merged intervals.
  • Next, we iterate through each interval using the outer loop, and for each interval, we execute an inner loop to check if it can be merged with any of the intervals in the output[].
  • Before the start of the inner loop, we initialize a flag variable called mergedFlag to false to track the merging status. If merging of the current interval is possible with any one of the intervals in the output, we merge it using the merge function, set the mergedFlag to true, and break out of the inner loop. Otherwise, we continue checking the scope of merging.
  • After the inner loop, we check the mergedFlag. If it is false, the current interval did not merge with any intervals in the output. So, we add the current interval to the output.
  • Finally, we return the output[].

Solution code C++

struct Interval 
{
    int start;
    int end;

    Interval(int s, int e) 
    {
        start = s;
        end = e;
    }
};

bool merge(Interval& X, Interval& Y) 
{
    if (X.end < Y.start || Y.end < X.start)
        return false;

    Y.start = min(X.start, Y.start);
    Y.end = max(X.end, Y.end);
    return true;
}

vector<Interval> mergeIntervals(vector<Interval>& intervals) 
{
    int n = intervals.size();
    vector<Interval> output;

    for (int i = 0; i < n; i = i + 1) 
    {
        bool merged = false;
        for (int j = 0; j < output.size(); j = i + 1) 
        {
            if (merge(intervals[i], output[j])) 
            {
                merged = true;
                break;
            }
        }      
        if (!merged)
            output.push_back(intervals[i]);
    }

    return output;
}

Solution code Java

class Interval 
{
    public int start;
    public int end;

    public Interval(int s, int e) 
    {
        start = s;
        end = e;
    }
}

public class IntervalMerger
{
    private boolean merge(Interval X, Interval Y) 
    {
        if (X.end < Y.start || Y.end < X.start)
            return false;

        Y.start = Math.min(X.start, Y.start);
        Y.end = Math.max(X.end, Y.end);
        return true;
    }

    public List<Interval> mergeIntervals(List<Interval> intervals) 
    {
        int n = intervals.size();
        List<Interval> output = new ArrayList<>();

        for (int i = 0; i < n; i = i + 1) 
        {
            boolean merged = false;
            for (int j = 0; j < output.size(); j = j + 1) 
            {
                if (merge(intervals.get(i), output.get(j))) 
                {
                    merged = true;
                    break;
                }
            }
            
            if (!merged)
                output.add(intervals.get(i));
        }

        return output;
    }
}

Solution code Python

class Interval:
    def __init__(self, s, e):
        self.start = s
        self.end = e

def merge(X, Y):
    if X.end < Y.start or Y.end < X.start:
        return False

    Y.start = min(X.start, Y.start)
    Y.end = max(X.end, Y.end)
    return True

def mergeIntervals(intervals):
    n = len(intervals)
    output = []

    for i in range(n):
        merged = False
        for j in range(len(output)):
            if merge(intervals[i], output[j]):
                merged = True
                break
        
        if not merged:
            output.append(intervals[i])

    return output

Time and space complexity analysis

We run nested loops where the outer loop runs n times, and the length of the inner loop depends on the size of the output array at given instance. During each iteration of the nested loop, we call the merge function only once, which takes O(1) time.

In the best-case scenario, if all intervals in the input are overlapping, the merge function will always return true, and the inner loop will break after one iteration. In this case, the inner loop will run only once. So the best-case time complexity is O(n).

In the worst-case scenario, if all intervals in the input are non-overlapping, the merge function will always return false and inner loop will run until the end. The total number of loop iterations is 1 + 2 + 3 + ... + n, which is equal to n(n + 1)/2. So the worst-case time complexity is O(n^2).

We are using constant extra space, so space complexity = O(1).

Using sorting and stack data structure 

Solution idea

Now, the critical question is: Can we think of improving the time complexity? Can we perform some pre-processing to make merging easier? The idea is to sort the intervals based on their start times and then merge them iteratively using a stack. Here, sorting will help us easily identify the overlapping intervals, and the stack will help us keep track of the merged intervals.

Solution steps

  1. We create an output array to store merged intervals.
  2. We sort the intervals based on their start times.
  3. Now, we create a stack to store intervals and run a loop to access each interval.

    • If the stack is empty or the top interval in the stack does not overlap with the current interval, we push the current interval into the stack. Note: An interval will not overlap if the end of the interval at the top of the stack is less than the start of the current interval.
    • If the top interval of the stack overlaps with the current interval, we merge both intervals by updating the end of the top interval with the end of the current interval. Note: An interval will overlap if the end of the interval at the top of the stack is less than the end of the current interval.
  4. After processing all intervals, all merged intervals will be present inside the stack. We iterate over the stack and insert the intervals at the end of the output[] array.
  5. Finally, we return the output[] array.

Solution code C++

bool compareIntervals(const Interval& a, const Interval& b) 
{
    return a.start < b.start;
}

vector<Interval> mergeIntervals(vector<Interval>& intervals) 
{
    int n = intervals.size();
    stack<Interval> s;
    vector<Interval> output;
    sort(intervals.begin(), intervals.end(), compareIntervals);

    for (int i = 0; i < n; i = i + 1) 
    {
        Interval curr = intervals[i];
        if (s.empty() || s.top().end < curr.start)
            s.push(curr);
        else if (s.top().end < curr.end)
            s.top().end = curr.end;
    }

    while (!s.empty()) 
    {
        output.push_back(s.top());
        s.pop();
    }

    return output;
}

Solution code Java

class Solution 
{
    public static int compareIntervals(Interval a, Interval b) 
    {
        if(a.start < b.start)
            return 1;
        else 
            return 0;
    }

    public static List<Interval> mergeIntervals(List<Interval> intervals) 
    {
        int n = intervals.size();
        Stack<Interval> stack = new Stack<>();
        List<Interval> output = new ArrayList<>();
        intervals.sort(Solution::compareIntervals);

        for (int i = 0; i < n; i = i + 1) 
        {
            Interval curr = intervals.get(i);
            if (stack.isEmpty() || stack.peek().end < curr.start)
                stack.push(curr);
            else if (stack.peek().end < curr.end)
                stack.peek().end = curr.end;
        }

        while (!stack.isEmpty()) 
        {
            output.add(stack.peek());
            stack.pop();
        }

        return output;
    }
}

Solution code Python

def compareIntervals(a, b):
    return a.start < b.start

def mergeIntervals(intervals):
    intervals.sort(key=lambda x: x.start)
    stack = []
    output = []

    for i in range(len(intervals)):
        curr = intervals[i]
        if not stack or stack[-1].end < curr.start:
            stack.append(curr)
        elif stack[-1].end < curr.end:
            stack[-1].end = curr.end

    while stack:
        output.append(stack[-1])
        stack.pop()

    return output

Time and space complexity analysis

We are sorting the interval and running a single loop to merge intervals using stack. Suppose we are using some efficient O(nlogn) sorting algorithm like quick sort, merge sort or heap sort.

At each iteration of the single loop, we are doing O(1) operation to merge intervals. So overall time complexity = Time complexity of sorting + O(n)*O(1) = O(nlogn) + O(n) = O(nlogn).

Space complexity depends on the implementation of the sorting algorithm and the size of the stack. In the worst case, the size of the stack will be O(n) when all intervals are non-overlapping (Think!) If we use merge sort, then space complexity = O(n) + O(n) = O(n). If we use heap sort, then space complexity = O(1) + O(n) = O(n).

Using the sorting and single scan

The above solution requires O(n) additional space for the stack. How can we solve this problem without using a stack? Here is an observation: If we sort the intervals based on their start values, we can easily identify and merge overlapping intervals in a single scan. The idea is simple: Sorting ensures that overlapping intervals are adjacent to each other.

Solution steps

  1. We declare an array output[] to store the merged intervals.
  2. We sort the intervals based on their start values.
  3. We also initialize a variable currInterval with the first interval to track the current interval.
  4. Now we iterate through the sorted intervals starting from the second interval and compare the end value of the currInterval with the start value of the next interval.
  5. If there is an overlap (currInterval.end >= intervals[i].start), we update the end value of currInterval with max(currInterval.end, intervals[i].end). This will extend the merged interval. If there is no overlap, we add currInterval to the output[] and update the currInterval with the next interval.
  6. After iterating through all intervals, we add the final currInterval to the output since it represents the last merged interval. Finally, we return the output[].

Solution code C++

bool compareIntervals(const Interval& a, const Interval& b) 
{
    return a.start < b.start;
}

vector<Interval> mergeIntervals(Interval intervals[], int n) 
{
    vector<Interval> output;
    sort(In, In + n, compareIntervals);

    Interval currInterval = intervals[0];
    for (int i = 1; i < n; i = i + 1) 
    {
        if (currInterval.end >= intervals[i].start)
            currInterval.end = max(currInterval.end, intervals[i].end);
        else 
        {
            output.push_back(currInterval);
            currInterval = intervals[i];
        }
    }
    output.push_back(currInterval);

    return output;
}

Solution code Java

class Solution 
{
    public static List<Interval> mergeIntervals(Interval[] intervals, int n) 
    {
        List<Interval> output = new ArrayList<>();
        Arrays.sort(intervals, 0, n, Comparator.comparingInt(a -> a.start));

        Interval currInterval = intervals[0];
        for (int i = 1; i < n; i = i + 1) 
        {
            if (currInterval.end >= intervals[i].start)
                currInterval.end = Math.max(currInterval.end, intervals[i].end);
            else 
            {
                output.add(currInterval);
                currInterval = intervals[i];
            }
        }
        output.add(currInterval);

        return output;
    }
}

Solution code Python

def compareIntervals(a, b):
    return a.start < b.start

def mergeIntervals(intervals, n):
    output = []
    intervals.sort(key=lambda x: x.start)

    currInterval = intervals[0]
    for i in range(1, n):
        if currInterval.end >= intervals[i].start:
            currInterval.end = max(currInterval.end, intervals[i].end)
        else:
            output.append(currInterval)
            currInterval = intervals[i]
    output.append(currInterval)

    return output

Time and space complexity analysis

We are sorting the intervals and running a single loop. Suppose we are using an efficient O(nlogn) sorting algorithm like quicksort, merge sort, or heapsort. At each iteration of the single loop, we perform an O(1) operation to merge intervals. So overall time complexity = Time complexity of sorting + O(n)*O(1) = O(nlogn) + O(n) = O(nlogn).

The space complexity depends on the implementation of the sorting algorithm. If we use merge sort, then the space complexity is O(n). If we use heapsort, then the space complexity is O(1).

In-place solution using sorting and single loop

Solution idea

The idea here is to sort the intervals based on the start time and modify the same input array to store the merged intervals. To achieve this, we maintain an index variable to track the index of the last merged interval. We update the start and end points of the new interval based on the interval present at the index.

Solution steps

  1. We sort all intervals in increasing order of the start time.
  2. We initialize an index variable to keep track of the last merged interval in the same input array. Now we traverse the sorted intervals from the second interval up to the last interval.
  3. In each iteration, we check if the current interval (intervals[i]) overlaps with the interval at intervals[index]. If there is an overlap, i.e., the end of the interval at intervals[index] is greater than or equal to the start of the interval at intervals[i].
  4. For merging, we update the end of the interval at intervals[index] to max (intervals[index].end, intervals[i].end) and the start to min (intervals[index].start, intervals[i].start).
  5. If there is no overlap, we increment the index and copy the current interval to the next position in the array (intervals[index] = intervals[i]).
  6. After iterating through all the intervals, we return the length of the merged intervals, which is index + 1.

Solution code C++

bool compareIntervals(const Interval& a, const Interval& b) 
{
    return a.start < b.start;
}

int mergeIntervals(Interval intervals[], int n) 
{
    int index = 0;
    sort(intervals, intervals + n, compareIntervals);
    for (int i = 1; i < n; i = i + 1) 
    {
        if (intervals[index].end >= intervals[i].start) 
        {
            intervals[index].end = max(intervals[index].end, intervals[i].end);
            intervals[index].start = min(intervals[index].start, intervals[i].start);
        }
        else 
        {
            index = index + 1;
            intervals[index] = intervals[i];
        }
    }
    
    return index + 1;
}

void printIntervals(Interval intervals[], int index) 
{
    for (int i = 0; i < index; i = i + 1)
        cout << "[" << intervals[i].start << ", " << intervals[i].end << "] ";
    
    cout << endl;
}

Solution code Java

class Solution 
{
    public static int mergeIntervals(Interval[] intervals, int n) 
    {
        int index = 0;
        Arrays.sort(intervals, 0, n, Comparator.comparingInt(a -> a.start));
        for (int i = 1; i < n; i = i + 1) 
        {
            if (intervals[index].end >= intervals[i].start) 
            {
                intervals[index].end = Math.max(intervals[index].end, intervals[i].end);
                intervals[index].start = Math.min(intervals[index].start, intervals[i].start);
            } 
            else 
            {
                index = index + 1;
                intervals[index] = intervals[i];
            }
        }

        return index + 1;
    }

    public static void printIntervals(Interval intervals[], int index) 
    {
        for (int i = 0; i < index; i = i + 1)
            System.out.println("[" + intervals[i].start + ", " + intervals[i].end + "] ");

        System.out.println();
    }
}

Solution code Python

def mergeIntervals(intervals, n):
    intervals.sort(key=lambda x: x.start)
    index = 0
    for i in range(1, n):
        if intervals[index].end >= intervals[i].start:
            intervals[index].end = max(intervals[index].end, intervals[i].end)
            intervals[index].start = min(intervals[index].start, intervals[i].start)
        else:
            index = index + 1
            intervals[index] = intervals[i]

    return index + 1

Time and space complexity analysis

We are sorting the interval and running a single loop to merge intervals. So overall time complexity = Time complexity of sorting + O(n)*O(1) = O(nlogn) + O(n) = O(nlogn).

Since we store the merged intervals in the same input array, the space complexity depends only on the sorting algorithm used. If we use merge sort, the space complexity is O(n). If we use heap sort, the space complexity is O(1).

Critical idea to think!

  • In the third solution, when (currInterval.end >= intervals[i].start), we merge the endpoints of intervals. Why are we not considering the condition for the start time?
  • In the stack-based approach, what is the count of push and pop operations in the worst case?
  • Can we think of solving this problem using a data structure like a hash table or a binary search tree?
  • Can we think of solving this problem using a graph or a disjoint set data structure?

Suggested coding questions to practice

  • Given a set of non-overlapping intervals and a new interval, insert the interval into the intervals and merge if necessary.
  • Given a collection of intervals, find the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.
  • Given an array of meeting time intervals, find the minimum number of conference rooms required.
  • Given a list of intervals, find the maximum number of intervals that overlap at any point in time.
  • Given a list of intervals, remove the intervals that are completely covered by another interval in the list.
  • Given a set of intervals, for each interval, find the interval with the smallest start time that overlaps with it.
  • Given two lists of closed intervals, return their intersection.

If you have any queries or feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!

Share Your Insights

☆ 16-Week Live DSA Course
☆ 10-Week Live DSA Course

More from EnjoyAlgorithms

Self-paced Courses and Blogs

Coding Interview

Machine Learning

System Design

Our Newsletter

Subscribe to get well designed content on data structure and algorithms, machine learning, system design, object orientd programming and math.