How to encode a String in Huffman Coding Using Python

In this tutorial, we are going to see how to encode a string in Huffman coding in Python.

In order to encode a string, you have to build a min-heap tree. So we are using a module called heapq in Python. This method is used to build a min-heap tree.

Defaultdict is used to generate the frequency for each character in the string. It is available under the collections packages.

Now we have to create a method. Let’s give the name of the method as encode. Using this user-defined method you have to pass the frequency as the parameter.

Now use a while loop to find the low and high values in the heap that we have already declared.

Inside the loop, we are using two separate for loops which are used to assign the values as ‘0’ and ‘1’ for the left and the right nodes which are here referred to as the low and high.

Finally, we use the sorted function to sort the obtained heap It is noted that in the code we use the lambda keyword which is used to mention an unnamed function as for the key value.

Now we need to give the input string for encoding.

In which the frequency for each character in the string is automatically generated by the defaultdict as we mentioned before.

We use the for loop here to store each unique frequency value of the characters.

Now we call the encode function to pass each of the frequencies of the characters present in the string to the user-defined function encode and it is returned to the dictionary huff.

Finally, we print the encoded values with the help of the for loop.

Python program:  Hoffman coding in Python

Now let’s write our code as we discussed above:

import heapq
from collections import defaultdict
def encode(frequency):
    heap = [[weight, [symbol, '']] for symbol, weight in frequency.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        low = heapq.heappop(heap)

        high = heapq.heappop(heap)

        for value in low[1:]:

            value[1] = '0' + value[1]

        for value in high[1:]:

            value[1] = '1' +value[1]

        heapq.heappush(heap, [low[0] + high[0]] + low[1:] + high[1:])

    return sorted(heapq.heappop(heap)[1:], key=lambda p: (len(p[-1]), p))


string=input("Enter the string to be encoded:")

frequency = defaultdict(int)

for character in string:

    frequency[character] += 1

huff = encode(frequency)

print("character".ljust(10) + "Weight".ljust(10) + "Huffman Code")

for i in huff:

    print(i[0].ljust(10) + str(frequency[i[0]]).ljust(10) + i[1])

 

Output:

Enter the string to be encoded:maran
character Weight Huffman Code
 a           2        11
 m           1        00
 n           1        01
 r           1        10

You may also read:

Leave a Reply

Your email address will not be published. Required fields are marked *