n Repeated element in 2n size array

Difficulty: Medium, Asked-in: Google

Key takeaway: An excellent problem to learn time complexity optimization using the mathematical approach. Sometimes mathematical insights into the problem can help us to get efficient solutions.

Let’s understand the problem

In an array X[] of size 2n, there are n + 1 unique elements, and exactly one of these elements is repeated n times. Write a code to return the element repeated n times.


Input: X[] = [1, 2, 2, 3], Output: 2

Input: X[] = [2, 1, 2, 5, 3, 2, 2, 4], Output: 2

Input: X[] = [5, 1, 5, 2, 5, 3, 5, 4, 6, 5], Output: 5

Discussed solution approaches

  • A brute force approach  using a hash table
  • An efficient approach  using the pigeon hole principle

A brute force approach using a hash table

Solution Idea

A simple approach would be to traverse the array and count the frequency of each element using a hash table. During the traversal, if we found an element with a count larger than 1, we return that element. Why are we returning the element with a count larger than 1? (Think!)

Solution Pseudocode

int repeatedNTimes(int X[], int n)
    HashTable H
    int repeatedElement
    for(int i = 0; i < n; i = i + 1)
        H[X[i]] = H[X[i]] + 1
        if(H[X[i]] > 1)
            repeatedElement = X[i]
   return repeatedElement

Solution Analysis

  • The traversal of the array takes time proportional to the number of elements in the array. At each iteration, we are doing constant operations because inserting and searching in the hash table takes O(1) time. So, time complexity in the worst case = O(n). 
  • We are using O(n) extra space to store the frequency of elements in a hash table. Hence, Space Complexity = O(n).

An efficient approach using the pigeon hole principle

Solution Idea

The idea behind the algorithm is: if an element is repeated, then it must be the element that has been repeated n times because all other elements are unique! So, as given in the problem, there is an element “x” that appears n times in an array size 2n. How can we use this insight to solve this problem efficiently in O(1)? Let's think!

Suppose the indexes of all x’s in the array are X1, X2, ... Xn, where X1 < X2,…< Xn. Let calculate the average gap between all consecutive x’s:

  • Average gap = [ (X2 - X1) + (X3 - X2) +...(Xn - Xn-1)] / (n - 1).
  • Here X2, X3, ...X(n-1) cancel out each other. So Average gap = (Xn - X1)/(n - 1)

As array size is 2n, so max possible value of Xn = 2n - 1, min possible value of X1 = 0. If we put the values of Xn and X1 in the above formula, we will get the max upper limit of the average gap.

  • Average gap <= (2n - 1)/(n - 1).
  • Equality holds when n = 2.
  • For all n > 2, (2n - 1)/(n - 1) is less than 3.
  • So, Average gap <= 3, for n > 2.

Therefore, from the above analysis, there must exist at least two x’s in array X[] such that their gap is at most 3. Simply stating: If half the array is repeated, it doesn’t matter how we shuffle it; for one of the repeated elements, there has to be another repeated element at least three positions away; otherwise, the array size isn’t 2n. Think!

Solution Steps

  • We use a loop from k = 1 to 3 to create a window of gap lengths 1, 2, and 3. In other words: we are using this loop to explore elements at gap lengths 1, 2, and 3, for each element X[i].
  • Now for each window, we traverse the array and compare the first and last elements of the window.
  • If equality holds then, this must be our repeated element.

Solution Pseudocode

int repeatedNTimes(int X[], int n)
    for (int k = 1; k <= 3; k = k + 1)
        for (int i = 0; i < n - k; i = i + 1)
            if (X[i] == X[i + k])
                return X[i]

Solution Analysis

  • We are using two nested loops where the outer loop is running constant times.
  • We are doing one comparison operation at each iteration of the inner loop, i.e., the constant number of operations.
  • So in the worst case, the outer loop runs three times, and we need to traverse each element via the inner loop. So time complexity = Time complexity of the outer loop * Time complexity of the inner loop = O(1)*O(n) = O(n).
  • We are using constant extra space. So, space complexity = O(1)

Critical ideas to think!

  • What is the pigeonhole principle? How can we use it in problem-solving?
  • Can we solve this problem by sorting the input? If yes, then what would be the time and space complexity?
  • Can we solve this problem efficiently using some other approach?
  • What would be the worst and best case input for both approaches?
  • Can we further optimize the above approach? Is it okay to compare each element with its 2 right neighbors instead of 3?
  • Let's design a randomized algorithm: randomly pick two array elements and check whether they come from different cells and have the same value. If they do, then return true. So after how many numbers of repetition algorithm succeeds with probability 1 – 1/n? Out of n² possible combinations of picking two elements, how many will result in success?

Comparisons of time and space complexities

  • Brute force approach: Time = O(n), Space = O(n)
  • Using pigeon hole principle : Time = O(n), Space = O(1)

Suggested coding problems to practice

  • Valid Mountain
  • Remove duplicates from sorted array
  • A Product Array Puzzle
  • Chocolate Distribution Problem
  • Best Time to Buy and Sell Stock
  • Sort an array in the waveform
  • Number of jumps for a thief to cross walls
  • Find the number of subarrays with an even sum
  • Double the first element and move zero to the end

Thanks to Navtosh Kumar for his contribution in creating the first version of the content. Please write comments if you find an error or you want to share more insights about the topic. Enjoy learning, Enjoy coding, Enjoy algorithms!

More From EnjoyAlgorithms

Our weekly newsletter

Subscribe to get free weekly content on data structure and algorithms, machine learning, system design, oops design and mathematics.

Follow Us:


© 2020 EnjoyAlgorithms Inc.

All rights reserved.