Create an Audiobook from PDF file using Python – Text to speech

Hey there! In this tutorial, we will be learning to convert any regular PDF into an Audiobook using Python in PyCharm.

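This program reads a PDF file aloud: it extracts the text of each page and converts that text to speech. It only works for PDFs that contain extractable text; scanned pages that are just images yield no text to read.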

If you have a PDF file of the desired book, then you can easily convert it into an audiobook, free of cost.

Read aloud texts from PDF file in Python

Step 1

Open PyCharm and create a project titled Audiobook. Then open the terminal and type the commands listed below to install the required libraries:

pip install pyttsx3
pip install PyPDF2
  • pyttsx3: For text-to-speech conversion
  • PyPDF2: Capable of extracting document content, splitting, merging and cropping documents page by page, encrypting and decrypting PDF files, etc

Refer to pyttsx3 documentation and PyPDF2 documentation for more information.
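
Before moving on, it can help to confirm that both installs succeeded. The snippet below is a minimal sketch using only the standard library; the helper name missing_packages is made up for this illustration and is not part of either library.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that Python cannot find for import."""
    # find_spec() returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


# Hypothetical check for the two libraries used in this tutorial.
missing = missing_packages(["pyttsx3", "PyPDF2"])
if missing:
    print("Still need to install:", ", ".join(missing))
else:
    print("Both libraries are available.")
```

If a name is reported as missing, rerun the matching pip install command in the same interpreter that PyCharm uses for the project.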

Step 2

Within the main.py file in this project, type the code given below. Refer to the comments in the code for an explanation of each step.

# Import necessary libraries:
import pyttsx3
import PyPDF2


# Open the file in binary mode:
book = open('demo.pdf', 'rb')

# Create a PdfReader object:
# Note: PdfFileReader, numPages, getPage() and extractText() were removed in PyPDF2 3.0
pdfReader = PyPDF2.PdfReader(book)

# To determine total number of pages in the PDF file:
pages = len(pdfReader.pages)

# Initialize the speaker:
# Here, init() function is used to get a reference to a pyttsx3.Engine instance
speaker = pyttsx3.init()

# To access the voices installed on this system:
voices = speaker.getProperty('voices')

# Choose a voice by index. The order depends on the operating system
# (on Windows, 0 is usually a male voice and 1 a female voice):
if len(voices) > 1:
    speaker.setProperty('voice', voices[1].id)

# Collect the text of every page so that the whole book can be saved later:
full_text = []

# Iterate through the pages you want to access
# For accessing specific pages: Iterate through the corresponding page indices
# Note: Index of first page-> 0
# Here, entire PDF is accessed:
for num in range(pages):
    # To get the page at the current index:
    page = pdfReader.pages[num]
    # To extract the text present in current page (empty for image-only pages):
    text = page.extract_text() or ''
    full_text.append(text)
    # say() function takes a string as the parameter and then queues the same to be converted from text-to-speech
    speaker.say(text)
    # runAndWait() function blocks until all the currently queued commands are processed
    speaker.runAndWait()

# The PDF file is no longer needed:
book.close()

# To save the audio of the whole book as a file, within this project:
# Note: the actual audio encoding is decided by the platform's speech driver,
# so the file may not contain real MP3 data even with the .mp3 extension
speaker.save_to_file('\n'.join(full_text), 'audio.mp3')
speaker.runAndWait()
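
The comments in the code mention that specific pages can be read by iterating over their indices. Here is one way that might look, as a rough sketch rather than a definitive implementation. The names page_indices, tidy and pdf_to_audio are hypothetical helpers written for this example; only PdfReader, pages, extract_text(), init(), save_to_file() and runAndWait() come from the libraries, and the sketch assumes PyPDF2 2.0 or newer.

```python
import re


def page_indices(total_pages, first=1, last=None):
    """Return zero-based indices for the 1-based, inclusive range first..last."""
    if last is None or last > total_pages:
        last = total_pages
    first = max(first, 1)
    return list(range(first - 1, last))


def tidy(text):
    """Rejoin words split by a hyphen at a line break and collapse whitespace.

    Limitation: a genuine hyphenated compound that happens to break across
    lines loses its hyphen as well.
    """
    text = re.sub(r"-\s*\n\s*", "", text)
    return re.sub(r"\s+", " ", text).strip()


def pdf_to_audio(pdf_path, audio_path, first=1, last=None):
    """Save the chosen pages of pdf_path as a single spoken recording."""
    # Imported here so the helpers above work without the libraries installed.
    import pyttsx3
    import PyPDF2

    with open(pdf_path, "rb") as book:
        reader = PyPDF2.PdfReader(book)
        chunks = [
            tidy(reader.pages[i].extract_text() or "")
            for i in page_indices(len(reader.pages), first, last)
        ]

    speaker = pyttsx3.init()
    # One save_to_file() call with all the text gives one recording.
    speaker.save_to_file(" ".join(chunks), audio_path)
    speaker.runAndWait()
```

For example, pdf_to_audio('demo.pdf', 'chapter.mp3', first=3, last=5) would record only pages 3 to 5. Joining the page texts before saving is a design choice: it keeps the recording in one file instead of one file per page.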

Output

In the video attached below, you can see a sample output of this code.
