Converting PDFs to JPEG using Python

In this tutorial, we will see how to convert every page of a PDF into JPEG images using the pdf2image library in Python 3.
pdf2image is an open-source library that converts a PDF file directly into images in formats such as JPEG or PNG. It has one external dependency, Poppler, which is explained later in this post.

pdf2image is a small package that can be installed with pip on the common Python distributions. For more details, you can refer to its documentation or simply follow this post.

Library used:
pdf2image
Use the command "pip install pdf2image" to install the library.

Dependencies

Download and install a suitable version of Poppler separately, then add its bin/ folder to the PATH environment variable so that pdf2image can find the Poppler utilities.
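
Before running any conversion, it can help to confirm that Poppler is actually reachable. The sketch below is one way to check, assuming pdf2image relies on Poppler's pdftoppm and pdfinfo command-line tools being on PATH. The helper name missing_poppler_tools is made up for this post and is not part of pdf2image.

```python
import shutil

# Poppler command-line tools that pdf2image is assumed to call under the hood.
POPPLER_TOOLS = ("pdftoppm", "pdfinfo")


def missing_poppler_tools(tools=POPPLER_TOOLS):
    """Return the names of the given tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Poppler tools found; pdf2image should be able to use them.")
```

If Poppler is installed but you would rather not change PATH, convert_from_path also accepts a poppler_path argument that points at the bin folder.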

Python code to convert PDF into image

Below is our Python program using the pdf2image library:

from pdf2image import convert_from_path    # import the conversion function
images = convert_from_path('example.pdf')  # read the PDF; returns one PIL image per page
for i, image in enumerate(images):
    image.save('img' + str(i) + '.jpg', 'JPEG')  # save each page as img0.jpg, img1.jpg, ...
The above snippet generates one JPEG image for each page of the given PDF file.

The above code can be adapted to PDF data that is already in memory by importing convert_from_bytes from the pdf2image library, which takes the raw bytes of a PDF and converts them into images in the same way.
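
As a rough illustration of the in-memory variant, the sketch below reads a PDF into bytes and passes them to convert_from_bytes. The function pdf_bytes_to_jpegs, the helper page_filename, the zero-padded naming scheme, and the "output" folder are all assumptions made for this example, not part of pdf2image.

```python
from pathlib import Path


def page_filename(prefix, page_number):
    """Build a zero-padded file name such as 'img_003.jpg' (hypothetical scheme)."""
    return f"{prefix}_{page_number:03d}.jpg"


def pdf_bytes_to_jpegs(pdf_bytes, out_dir, prefix="img"):
    """Convert in-memory PDF data to one JPEG per page and return the saved paths."""
    # Imported here so page_filename stays usable without pdf2image installed.
    from pdf2image import convert_from_bytes

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    saved = []
    # convert_from_bytes returns a list of PIL images, one per page.
    for number, image in enumerate(convert_from_bytes(pdf_bytes), start=1):
        target = out_dir / page_filename(prefix, number)
        image.save(target, "JPEG")
        saved.append(target)
    return saved


if __name__ == "__main__":
    source = Path("example.pdf")  # same example file name as in the post
    if source.exists():
        for path in pdf_bytes_to_jpegs(source.read_bytes(), "output"):
            print("saved", path)
```

This form is mainly useful when the PDF never touches the disk, for example when it arrives as an upload or a download response.
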
The convert_from_path call shown earlier converts all the pages of the given PDF file. To convert a specific page only, pass its page number as both first_page and last_page when loading the file.

images = convert_from_path('example.pdf', first_page=2, last_page=2)  # convert only page 2 (pages are numbered from 1)
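
The same idea extends to a range of pages. The sketch below wraps convert_from_path in a small helper that uses its first_page and last_page keyword arguments; the helper name convert_pages and its validation rules are assumptions for illustration only.

```python
import os


def convert_pages(pdf_path, first, last=None):
    """Convert pages first..last (1-based, inclusive) of a PDF to PIL images."""
    if last is None:
        last = first  # a single page
    if first < 1 or last < first:
        raise ValueError(f"invalid page range: {first}-{last}")

    # Imported after validation so bad input fails before any conversion work.
    from pdf2image import convert_from_path

    return convert_from_path(pdf_path, first_page=first, last_page=last)


if __name__ == "__main__":
    if os.path.exists("example.pdf"):
        # Save pages 2 and 3, naming each file after its page number.
        for offset, image in enumerate(convert_pages("example.pdf", 2, 3)):
            image.save(f"img{2 + offset}.jpg", "JPEG")
```

Converting only the pages you need can save noticeable time and memory on long documents, since every converted page is held as a full image.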

Remember to download Poppler and add the path of its bin folder to the system PATH.
The code should run in any environment where both pdf2image and Poppler are installed. Try it and modify it to suit your requirements.

 
