Check Whether Two Strings are Anagram or Not

Difficulty: Easy, Asked-in: Google, Amazon, Microsoft, Goldman Sachs

Key takeaway: An excellent problem to learn problem-solving and step-by-step optimization using direct address table. We can use similar approach to solve several other coding questions.

Let’s understand the problem

Given two strings str1 and str2 of size m and n, write a program to check whether two given strings are anagram of each other or not. 

  • String str1 is an anagram of str2 if characters of str1 can be rearranged to form str2. In other words, an anagram of a string is another string that contains same characters in different order.
  • Both strings may contain duplicate characters.
  • Characters in both strings are lowercase English characters.

Examples

Input: str1= “dusty” , str2= “study”, Output: true

Input: str1= “state” , str2= “taste”, Output: true

Input: str1= “enjoyalgorithms”, str2= “enjoymathematics”, Output: false

Discussed solution approaches

  • Brute force solution using nested loops
  • Using sorting and comparing characters
  • Counting occurrences of each character using two frequency tables
  • Counting occurrences of each character using one frequency table

Brute force solution using nested loops

Solution idea

One basic solution would be to pick each character from str1 and search for that character in str2. If any character is missing, both strings are not an anagram of each other.

On another side, if characters are repeated, then frequency of repeated characters must be same in anagrams strings. To ensure this, we can replace the found characters in str2 with an empty flag ‘\0’. Think!

Solution pseudocode

bool anagramString(String str1, int m, String str2, int n)
{
    if (m != n)
        return false
    for (int i = 0; i < m; i = i + 1)
    {
        int pos = linearSearch(str2, n, str1[i])
        if (pos == -1)
            return false
        else
            str2[pos] = '\0'
    }
    return true
}

Solution analysis

We are searching each character of str1 in str2 using the linear search. So time complexity = m * O(n) = O(mn). We are using constant extra space, so space complexity = O(1).

Using sorting and comparing characters

Solution idea

Can we think of improving time complexity further? One idea would be to sort both strings and compare each character. This will arrange all repeated characters together and help us to compare their frequency in both strings.

  • If (m != n): Strings are not an anagram of each other.
  • Sort both strings in increasing order. What sorting algorithm will be efficient in this scenario? Only lowercase characters are present in the input, i.e. only 26 possible values repeated several times. So instead of using O(nlogn) sorting algorithm, one can think to use ideas similar to O(n) time sorting algorithms like bucket sort or counting sort. Think!
  • Now run a loop from i = 0 to m - 1 or i = 0 to n - 1 and compare characters at index i on both strings. If we find some unequal character in sequence, return false. Otherwise, by the end of loop, return true.

Solution pseudocode

bool anagramStrings(String str1, int m, String str2, int n)
{
    if (m != n)
        return false
    sort(str1, m)
    sort(str2, n)
    for (int i = 0; i < m; i = i + 1)
    {
        if (str1[i] != str2[i])
            return false
    }        
    return true
}

Solution analysis

We are doing sorting and comparing both strings. So time complexity = Time complexity of sorting str1 + Time complexity of sorting str2 + Time complexity of comparing sorted strings = O(n) + O(n) + O(n) = O(n). We use constant extra space, so space complexity = O(1).

Counting characters using two direct address tables

Searching is one of the critical operations to solve this problem. So can we think of using hash table or direct address table to compare frequency of characters? Hash table is an efficient data structure for searching, insertion and deletion in O(1) time average.

Solution idea

All are lowercase characters, which have 26 possible values. So we can create two separate direct address tables of size 26 to store frequency of each character in both strings. We can directly store values in each table using the ASCII value of characters.

After storing values in direct address tables, we traverse them linearly and compare the frequency count of each character. If values in both tables are equal, then strings are anagrams of each other.

Solution pseudocode

bool anagramStrings(String str1, int m, String str2, int n)
{
     if (m != n)
        return false
    int freqCount1[26]
    int freqCount2[26]
    for(int i = 0; i< 26; i = i + 1)
    {
        freqCount1[i] = 0
        freqCount2[i] = 0
    }
    for(int i = 0; i < n; i = i  + 1) 
    {
        freqCount1[str1[i]-'a'] =  freqCount1[str1[i]-'a'] + 1
        freqCount2[str2[i]-'a'] =  freqCount2[str2[i]-'a'] + 1
    }
    
    for(int i = 0; i < 26; i = i + 1)
    {
        if (freqCount1[i] != freqCount2[i])
            return false
    }        
    return true
}

Solution analysis

We are running three separate loops, where 1st and 3rd loops run constant times and the 2nd loop runs n times. So time complexity = Time complexity of initializing direct address tables + Time complexity of storing direct address tables + Time complexity of comparing direct address tables = = O(1) + O(n) + O(1) = O(n).

The size of direct address tables are constant, so space complexity = O(1) + O(1) = O(1).

Counting occurrences of characters using one direct address tables

The critical question is: Can we optimize space complexity of the above approach and solve problem using one direct address table?

Solution idea

Suppose we traverse the first string and store frequency count of each character in a direct address table of size 26. Now we need to do slight modifications to the above approach.

Instead of storing frequency count of second string in a separate table, we traverse second string and decrement frequency of each character in direct address table of first string. Now at this stage, if both strings are anagrams, then all values stored in direct address table must be 0. 

So we traverse the table to check this. If we find any value not equal to 0, return false. Otherwise, by the end of loop, return true.

Solution pseudocode

bool anagramString(String str1, int m, String str2, int n)
{
    if (m != n)
        return false
    int freqCount[26]
    
    for (int i = 0; i < n; i = i + 1) 
        freqCount[str1[i] - 'a'] = freqCount[str1[i] - 'a'] + 1
      
    for (int i = 0; i < n; i = i + 1) 
        freqCount[str2[i] - 'a'] = freqCount[str2[i] - 'a'] - 1
        
    for (int i = 0; i < 26; i = i + 1)
    {
        if (freqCount[i] != 0)
            return false
    }        
    return true
}

Solution analysis

Time complexity = Time complexity of traversing first string and storing values in direct address table + Time complexity of traversing second string and decrementing values in direct address table + Time complexity of checking 0’s in direct address table = O(n) + O(n) + O(n) = O(n).

We only use one frequency table of constant size, so space complexity = O(1).

Critical ideas to think about!

  • Is there a way to optimize the 4th approach further? Can we add an extra conditional statement in the second loop to avoid going through 3rd loop every time? Or can we think of removing the 3rd loop at all?
  • In 4th approach, can we merge the 1st and 2nd loop operations into a single loop?
  • In 2nd approach, can we solve problem by sorting just one string? Design an algorithm to sort strings in O(n) time.
  • How can we modify above solutions if inputs contain Unicode characters? Can we use direct address table, or do we need to use a hash table?
  • How can we modify the above approaches when uppercase, empty and lowercase characters are allowed?
  • Can we solve this problem using some other data structures or another approach?
  • Why do we assign found characters with a null value in the brute force approach?

Suggested problems to practice

Please write in the message below if you have some additional insights or if you find an error.

Enjoy learning, Enjoy algorithms!

Share feedback with us

More blogs to explore

Our weekly newsletter

Subscribe to get weekly content on data structure and algorithms, machine learning, system design and oops.

© 2022 Code Algorithms Pvt. Ltd.

All rights reserved.