Group words with same set of characters in C++

Think that, we have a group of words. Our aim is to group words with the same set of characters in C++. Assuming that only lower case alphabets are present, for example:

INPUT: { hello, listen, silent, pole, elope,hole}
OUTPUT: hello, hole
        listen, silent
        pole, elope

The simplest way to solve this problem revolves around the sorting of a word lexicographically and reducing the characters in such a way that no two characters of the reduced form of the word are the same. Example: The reduced form of fewer is efrw. We can do this for all the words. Then we can use an unordered_map to group the words with the same set of characters by mapping each of the words to their reduced form. The keys formed are unique and representative of the words with the same set of characters.

Group words with same the set of characters in C++ code

Approach:

  1. We will first create a function named strSort to arrange a word lexicographically and reduce it so that no two characters are the same. This function creates an array of type bool to mark the presence of a character.
  2. For each character present in the word we will mark its corresponding index to be true. In this way even if a character appears twice, we will mark it only once.
    for (char c : s) { 
        present[c-'a']=true; 
    }
  3. The final step of this function will be to make a new string that will act as the key to group words together. We simply traverse the array and if the element is marked true we add it to the string. This ensures that the string is built lexicographically.
    string temp; 
    for (int c = 0; c < 26; c++) { 
        if(present[c]==true){ 
            temp+= c+'a'; 
        } 
    }
  4. Once we have the key for every word, we use it to map the words to their respective keys. This is done using the group() function.
  5. For every word present in the input, we generate its key. Then we add the word to the vector corresponding to that key.
    for(string s: words){ 
        groups[strSort(s)].push_back(s); 
    }
  6. Then we just make a vector named results and add all of the groups by traversing each key in the map.
#include <iostream>
#include<vector>
#include<unordered_map>
using namespace std;

 string strSort(string s) {
        bool present[26] = {false};

//mark the presence of a character
        for (char c : s) {
            present[c-'a']=true;
        }

//make the reduced form
        string temp;
        for (int c = 0; c < 26; c++) {
            if(present[c]==true){
                temp+= c+'a';
            }
        }
        return temp;
    }

vector<vector<string>> group(vector<string> &words){
    unordered_map<string,vector<string>> groups;

//sort the words and map them to their reduced form
    for(string s: words){
        groups[strSort(s)].push_back(s);
    }
    vector<vector<string>> result;
//make a vector to store different groups        
    for (auto i : groups) { 
        result.push_back(i.second);
    }
    return result;
}

int main() {
    vector<string> words;
    words.push_back("hello");
    words.push_back("listen"); 
    words.push_back("silent");
    words.push_back("pole");
    words.push_back("elope");
    words.push_back("hole");

    vector<vector<string>> ans=group(words);
    for(int i=0;i<ans.size();i++){
        for(int j=0;j<ans[i].size();j++){
            cout<<ans[i][j]<<" ";
        }
        cout<<endl;
    }
    return 0;
}

OUTPUT:

pole elope 
hello hole 
listen silent

TIME COMPLEXITY: O(n*k) where n is the number of words and k is the length of the longest word.
SPACE COMPLEXITY: O(n*k) where n is the number of words and k is the length of the longest word.

READ MORE:

Program to check two Strings are Anagram or not using Hashmap in Java
C++ Program to find if the strings can be chained to form a circle

Leave a Reply

Your email address will not be published. Required fields are marked *