Compute Bigram Frequency in a String in Python

In this tutorial, we will learn how to compute bigram frequencies in a string in Python. A bigram here is a pair of adjacent characters, so we count how often each two-character sequence occurs in the string. For example, in the string ababc, ab occurs 2 times, while ba and bc each occur 1 time.
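As a quick check of the ababc example, here is a minimal sketch that lists the adjacent pairs and counts them. The variable names text, bigrams, and counts are chosen only for this illustration.

```python
from collections import Counter

text = 'ababc'

# Collect every pair of adjacent characters, in order of appearance
bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
print(bigrams)        # ['ab', 'ba', 'ab', 'bc']

# Count how many times each pair occurs
counts = Counter(bigrams)
print(dict(counts))   # {'ab': 2, 'ba': 1, 'bc': 1}
```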

We will look at two ways of finding the bigrams:

  1. By using Counter with a generator expression.
  2. By using Counter with zip(), map() and join().

Bigrams Using Counter + a Generator Expression

First we import Counter from the collections module. A generator expression slices the string two characters at a time to produce each bigram, and Counter counts how often each one occurs.

Code:

from collections import Counter

string = 'abracadabra'
# The "for x in range(...)" part is a generator expression that yields
# each two-character slice; Counter counts how often each slice occurs
result = Counter(string[x:x+2] for x in range(len(string) - 1))
# Convert the Counter to a dict and then to a string, because only a string can be concatenated to a string
print("Bigrams Frequency : " + str(dict(result)))

Output:

Bigrams Frequency : {'ab': 2, 'br': 2, 'ra': 2, 'ac': 1, 'ca': 1, 'ad': 1, 'da': 1}

Here, we slice the string two characters at a time to generate the bigrams, and Counter counts the number of occurrences of each bigram. We then convert the Counter to a dictionary for printing.
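To see what the slicing step produces before anything is counted, the sketch below materializes the slices as a list. The names slices and with_tail are used only for this illustration. It also shows why the range stops at len(string) - 1: going one index further would add a trailing one-character slice.

```python
string = 'abracadabra'

# Each slice string[x:x + 2] is one bigram
slices = [string[x:x + 2] for x in range(len(string) - 1)]
print(slices)
# ['ab', 'br', 'ra', 'ac', 'ca', 'ad', 'da', 'ab', 'br', 'ra']

# Iterating over the full length adds a final slice of only one character
with_tail = [string[x:x + 2] for x in range(len(string))]
print(with_tail[-1])  # 'a'
```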

Bigrams Using Counter + zip() + map() + join()

In this method, we build the bigrams with zip(), map() and join(), and then pass them to Counter to count the number of occurrences.

Code:

from collections import Counter

string = 'abracadabra'

# zip pairs each character with the next one; map joins each pair into a bigram
result = Counter(map(''.join, zip(string, string[1:])))

# Convert the Counter to a dict and then to a string, because only a string can be concatenated to a string
print("Bigrams Frequency : " + str(dict(result)))

Output:

Bigrams Frequency : {'ab': 2, 'br': 2, 'ra': 2, 'ac': 1, 'ca': 1, 'ad': 1, 'da': 1}

zip(string, string[1:]) pairs each character with the one that follows it, producing a tuple of two characters for every bigram. map() then applies ''.join to each tuple to turn it into a two-character string, and Counter counts how many times each bigram occurs.
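The intermediate values can be inspected by breaking the one-liner into steps. This is only an illustrative sketch; the names pairs and joined are not part of the code above.

```python
string = 'abracadabra'

# Step 1: zip stops at the shorter input, so the last character has no partner
pairs = list(zip(string, string[1:]))
print(pairs[:3])   # [('a', 'b'), ('b', 'r'), ('r', 'a')]
print(len(pairs))  # 10

# Step 2: ''.join turns each tuple of two characters into a string
joined = list(map(''.join, pairs))
print(joined[:3])  # ['ab', 'br', 'ra']
```

Because zip() stops when the shorter sequence is exhausted, there is no need to subtract 1 from the length as in the slicing approach.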

This is how we find bigram frequencies in a string using Python.

