C++ program to find unique words in a file

Hi there. Today we are going to see how to find the unique words in a text file. For this problem, we will use fstream, a header from the C++ standard library. It provides its own data types (ifstream, ofstream, and fstream) for interacting with files. To store the count of each word, we will use a map.

If you are not familiar with maps, here is a brief explanation. A map stores key-value pairs: each key is unique and has exactly one value, which we can access through the key. Now let’s get to our problem. In our case, the keys are the words and the values are their counts. Because the text is split on whitespace, standalone punctuation marks such as a full stop get counted as well, but we can remove them with the erase method.

Find unique words in a file in C++

We proceed in this way. First, we use the ofstream data type to create a file and write data to it. Afterward, we use the fstream data type to read the text back from that file. We pick each whitespace-separated token (a word or a punctuation mark) and store it with its frequency in a map: the words are the keys and their frequencies are the values. Finally, for each value equal to 1, which indicates a word that occurs exactly once, we print the key. Now let’s look at the program.

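#include <fstream>
#include <iostream>
#include <map>
#include <string>
using namespace std;

int main()
{
    string s;
    string file_name = "test.txt";
    map<string, int> m;

    // Create the file and write the sample text to it.
    ofstream ofs(file_name);
    ofs << "I code in C++ . He also codes in C++ . But she codes in java .";
    ofs.close();

    // Read the file back one whitespace-separated token at a time.
    fstream fs(file_name);
    while (fs >> s)
        m[s]++;  // operator[] inserts a missing key with count 0 first
    fs.close();

    // Remove the standalone full stop so it is not treated as a word.
    m.erase(".");

    // Print every word that occurs exactly once.
    for (map<string, int>::iterator i = m.begin(); i != m.end(); ++i)
        if (i->second == 1)
            cout << i->first << endl;
    return 0;
}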

The program prints only the words that occur exactly once. As mentioned earlier, we can remove tokens such as full stops and commas with the erase method, which removes a key together with its value from the map.

Let’s see the output now.

But
He
I
also
code
java
she

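As you can see, only the unique words are printed: in, C++, and codes are missing because each appears more than once. The words come out in sorted order because a map keeps its keys sorted. If your text contains other punctuation marks, remember to erase them from the map before printing the output.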

I hope you found this article useful. For further practice, you can try the other data types from the fstream header, such as ifstream. That will improve your grasp of the topic.

