Find Most Frequent Element in an Array

Difficulty: Easy, Asked-in: Facebook, Uber

Key takeaway: An excellent-searching problem to learn problem-solving and time complexity optimization using sorting and hash table.

Let’s understand the problem

Given an array X[] of size n, write a program to find the most frequent element in the array, i.e. the element which occurs the maximum number of times.

  • It is assumed that at least one element is repeated.
  • If there are multiple elements with maximum frequency, return the smallest of them.

Example 1

Input: X[] = [2, 12, 1, 2, 8, 2, 2, 1, 8, 2], Output: 2

Explanation: 2 is most frequent element which appears 4 times.

Example 2

Input: X[] = [1, 9, 1, 1, 2], Output: 1

Explanation: 1 is a single repeated element that appears 3 times.

Example 3

Input: X[] = [1, 9, 2, 1, 2], Output: 1

Explanation: 1 and 2 are repeated two times, then we return the smallest of them i.e. 1.

Discussed solution approaches

  • A brute force approach using nested loops
  • Using sorting and linear scan
  • Using a hash table to store the frequency count

A Brute force approach using nested loops

Solution Idea and Steps

The basic idea would be to count occurrences of each element and return element with the maximum number of occurrences. We can use nested loop: pick each element using outer loop and scan entire array to count frequency using inner loop. Here are solutions steps:

  1. We initialize two variables outside nested loops: maxFreq to track the maximum frequency and mostFrequent to track the most frequent element. maxFreq = 0, mostFrequent = -1.
  2. We run outer loop from i = 0 to n - 1 and inner loop from j = 0 to n - 1 to count frequency. Before starting inner loop, we also initialize a variable countFreq to track the frequency count of each element i.e. countFreq = 1.

    • Inside inner loop, when we found X[i] = X[j], we increase the value of countFreq by one and move to the next iteration of inner loop.
    • By the end of inner loop, If(maxFreq < countFreq), we found an element X[i] with frequency greater than maximum frequency count till that point. So we update maxFreq with countFreq and mostFrequent with X[i].
    • If (countFreq == maxFreq), we update the mostFrequent with the minimum of mostFrquent and X[i] i.e. min(mostFrequent, X[i]). Think!
  3. By the end of nested loop, we return the value stored in mostFrequent.

Solution Pseudocode

int mostFrequentElement(int X[], int n)
{
    int maxFreq = 0
    int mostFrequent = -1
    for (int i = 0; i < n; i = i + 1)
    {
        int countFreq = 1
        for (int j = 0; j < n; j = j + 1)
        {
            if (X[j] == X[i])
                countFreq = countFreq + 1
        }        
        if (maxFreq < countFreq)
        {
            maxFreq = countFreq
            mostFrequent = X[i]
        }
        else if (maxFreq == countFreq)
            mostFrequent = min(mostFrequent, X[i])
    }
    return mostFrequent
}

Time and space complexity analysis

We are running nested loop and performing O(1) operation at each iteration. Time complexity = n*n*O(1) = O(n^2). We are using constant number of extra variables, space complexity = O(1)

Using sorting and single scan

Solution Idea

Now critical questions are: How do we improve the time complexity? Searching is an essential operation in the problem. Can we think to improve the time complexity of searching?

If we sort the array, then all duplicate elements will get placed adjacent to each other. We can easily scan the array, count frequency of each element and return the element with max frequency. Compared with the above approach, this ensures that frequency is calculated only once for each unique element.

Solution Steps

  1. We first sort the array using some efficient O(nlogn) sorting algorithm like merge sort, heap sort, or quicksort. Suppose we are using heap sort, which works efficiently in place.
  2. We initialize two variables: maxFreq to track the maximum frequency and mostFrequent to track the most frequent element. maxFreq = 0, mostFrequent = -1.
  3. Now we scan sorted array using a loop till i < n. Inside loop, we initialize variable countFreq to track the frequency count i.e. countFreq = 1.

    • We start from the first element (i = 0) and search its consecutive occurrences using a while loop till X[i] = X[i + 1]. In other words, this inner while loop will end when X[i] != X[i + 1], and in such a situation, we have moved to the next unique element in sorted array. During this process, we increase countFreq and i by 1.
    • After inner while loop, if(maxFreq < countFreq), we found an element X[i] with frequency greater than maximum frequency count till that point. So we update maxFreq with countFreq and mostFrequent with X[i]. If(countFreq == maxFreq), we also update the mostFrequent with the minimum of mostFrquent and X[i].
    • Now we go to the next iteration of outer loop and repeat the same process for the next unique element available at index i.
  4. By the end of outer loop, we return the value stored in mostFrequent.

Solution Pseudocode

int mostFrequentElement(int X[], int n)
{
    heapSort(X, n)
    int maxFreq = 0
    int mostFrequent = -1
    int i = 0
    while (i < n)
    {
        int countFreq = 1
        while (i + 1 < n && X[i] == X[i + 1])
        {
            countFreq = countFreq + 1
            i = i + 1
        }
        if (maxFreq < countFreq)
        {
            maxFreq = countFreq
            mostFrequent = X[i]
        }
        else if (maxFreq == countFreq)
            mostFrequent = min(mostFrequent, X[i])
        i = i + 1
    }
    return mostFrequent
}

Time and space complexity analysis

If we observe nested loop, outer loop pick the unique elements, and inner loop counts the frequency of that element. So we are accessing each element only once and performing O(1) operation (Think!). Time complexity of nested loop = n * O(1) = O(n).

Overall time complexity = Time complexity of heap sort + Time complexity of nested loop = O(nlogn) + O(n) = O(nlogn). Space complexity = O(1), because heap sort works in-place and we are using constant number of variables.

Using a hash table to store the frequency count

Solution Idea and Steps

Again, how do we further improve time complexity? Can we use a hash table to perform searching efficiently? Hash table performs basic dictionary operations like insertion, deletion, and searching operation in O(1) average. So here is another idea:

  • We create hash table H to store elements, and their frequency counts as key-value pairs.
  • Now we scan array to check if value associated with any element exists in Hash Table or not. If it exists, we increment the frequency count (value) of element (key) by 1. If not, we store the frequency count of that element by 1. 
  • During iteration, we also store the most frequent element and its frequency count in variable mostFrequent and maxFreq, respectively. We update the value of these variables if we found stored frequency count of any element greater than the current max frequency count.
  • And finally, we return the most frequent element found i.e. return mostFrequent.

Solution Pseudocode

int mostFrequentElement(int X[], int n)
{
    HashTable H
    int maxFreq = 1
    int mostFrequent = -1
    for (int i = 0; i < n; i = i + 1)
    {
        if (H.search(X[i]))
        {
            H[X[i]] = H[X[i]] + 1
            if (maxFreq < H[X[i]])
            {
                maxFreq = H[X[i]]
                mostFrequent = X[i]
            }
            else if (maxFreq == H[X[i]])
                mostFrequent = min(mostFrequent, X[i])
        }
        else
            H[X[i]] = 1
    }
    return mostFrequent
}

Time and space complexity analysis

We are doing single scan of array and, at each iteration, performing constant operations. In other words, we are searching for each element once and inserting/updating it once. Time Complexity = n*O(1) = O(n). Space complexity = O(n) for storing elements in hash table.

Critical ideas to think!

  • Can we try to solve this problem using BST? If yes, then what would be the time complexity?
  • Can we solve this problem using some other approach?
  • How can we optimize the brute force approach further? How do we ensure that the frequency is calculated only once for each unique element?
  • How can we optimize the 2nd approach further? Would it be possible to use binary search after sorting to find the most frequent element?
  • What if all elements were unique? What changes will you make?

Comparisons of time and space complexities

  • Using nested loops: Time = O(n^2), Space = O(1)
  • Using sorting and linear scan: Time = O(nlogn), Space = O(1)
  • Using hash table: Time = O(n), Space = O(n)

Suggested coding problems to practice

  • Find the most frequent word in a sentence
  • Sort characters by frequency
  • Find the frequency of all words in a sentence
  • Find the least frequent element in the array
  • Find the top k frequent elements in the array

Enjoy learning, Enjoy algorithms!

Share feedback with us

More blogs to explore

Our weekly newsletter

Subscribe to get weekly content on data structure and algorithms, machine learning, system design and oops.

© 2022 Code Algorithms Pvt. Ltd.

All rights reserved.