C++ program to find Shortest unique prefix for each word in an array

In this tutorial, We’ll learn how to find the shortest unique prefix for every word in an array using C++. Assume that no word from the array is the prefix of the other. This can be done by taking every prefix of each word and matching it with others and, if matched then print but the efficient approach for this problem is using Trie.

Consider an INPUT array of strings,

INPUT:¬† [“BLUE”, “GREEN”, “GRAY”, “YELLOW”, “BROWN”]

OUTPUT: [“BL” , “BR” , “GRA” , “GRE” , “Y”]

EXPLANATION:

BL can uniquely identify BLUE.
BR can uniquely identify BROWN.
GRA can uniquely identify GRAY.
GRE can uniquely identify GREEN.
Y can uniquely identify YELLOW.

What is trie

  • Trie is a tree-based data structure that is used to store strings that can be evoked like graphs.
  • It is an efficient way to retrieve¬†data in a large string dataset.

SHORTEST UNIQUE PREFIX USING TRIE

The idea is to construct a Trie of all words present in the array. While inserting words in the trie, record the frequency of every node.

The initial value of each Trie node’s frequency is zero and will be incremented by one each time the node is visited during insertion. Time complexity of this step is O(N).

Lastly, Traverse the Trie in pre-order trend to display the shortest unique prefix for each word in an array.
For every traversed node, check its frequency. If the frequency is 1, the prefix ending at that Trie node is the shortest unique prefix.
Print all the characters from root to that node and stop traversing down further on the current Trie node. Move to the next path in the pre-order traversal. Time complexity of this step is O(N).

C++ implementation of the above algorithm is as follows:

#include <bits/stdc++.h>
using namespace std;
 
// Construct a trie data structure
struct TrieNode
{
    map<char, TrieNode*> child;
    int freq = 0;
};
// traversal of trie in preorder trend and
// printing the shortest unique prefix
void displayShortPrefix(string s,TrieNode *root)
{
    if (root == nullptr) {
        return;
    }
    if (root->freq == 1)
    {
        cout << s << " ";
        return;
    }
    // recursion for all child nodes
    for (auto &child: root->child) {
        displayShortPrefix(s + child.first, child.second);
    }
}
// insertion of a given word into TrieNode
void insert(TrieNode* &root, string s)
{
    TrieNode* curr = root;
    for (char c: s)
    {
        if (curr->child.find(c) == curr->child.end()) {
            curr->child[c] = new TrieNode();
        }
        curr->child[c]->freq++;
        curr = curr->child[c];
    }
}
// shortest unique prefix for every word in the given array
void ShortestPrefix(vector<string> &word)
{
    TrieNode* root = new TrieNode();
    for(string s: word){
        insert(root, s);
    }
    displayShortPrefix("",root);
}
int main()
{
    vector<string> arr = { "BLUE", "GREEN", "GRAY", "YELLOW", "BROWN" };
    ShortestPrefix(arr);
    return 0;
}

Output:

BL BR GRA GRE Y

 

Also, refer:

Leave a Reply

Your email address will not be published. Required fields are marked *