Converting PDFs to JPEG using Python
In this tutorial, we will see how to convert all the pages of a PDF into JPEG images using the pdf2image library in Python 3.
It is a powerful open-source library that converts a PDF file directly into images, for example in JPG or PNG format. It has one dependency, Poppler, which is covered later in this post.
pdf2image is a simple package that can be installed with pip on any Python 3 distribution. For more details, you can refer to its documentation or simply follow this post.
Install the library with the command: pip install pdf2image
Download and install a suitable version of Poppler separately, then add its bin/ folder to the PATH environment variable so that pdf2image can find its dependency.
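Before running any conversion, it can help to confirm that Poppler is actually reachable. The snippet below is a minimal sketch that assumes pdf2image relies on Poppler's pdftoppm executable being on PATH. The function name poppler_on_path is made up for this post and is not part of pdf2image.

```python
import shutil


def poppler_on_path() -> bool:
    """Return True if Poppler's pdftoppm executable can be found on PATH."""
    # shutil.which returns the full path of the executable, or None if absent.
    return shutil.which("pdftoppm") is not None


if __name__ == "__main__":
    if poppler_on_path():
        print("Poppler found on PATH")
    else:
        print("Poppler not found; add its bin/ folder to PATH")
```

If the check fails right after you edited PATH, open a new terminal so the updated variable is picked up.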
Python code to convert PDF into image
Below is our program in Python using the pdf2image library:
from pdf2image import convert_from_path  # import the conversion function

images = convert_from_path('example.pdf')  # read the PDF file; one image per page

for i in range(len(images)):
    # convert each page into a JPEG image and save it to the current directory
    images[i].save('img' + str(i) + '.jpg', 'JPEG')
The snippet above generates one JPEG image for each page of the given PDF file (img0.jpg, img1.jpg, and so on).
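With many pages, names like img10.jpg sort before img2.jpg in most file browsers. As an optional variation, here is a small sketch that builds 1-based, zero-padded filenames. The helpers page_filename and save_pages are hypothetical names chosen for this post; only the save call on each image comes from the code above.

```python
import os


def page_filename(index: int, total: int, prefix: str = "page", ext: str = "jpg") -> str:
    """Build a 1-based, zero-padded filename such as page_03.jpg."""
    width = len(str(total))  # number of digits needed for the last page
    return f"{prefix}_{index + 1:0{width}d}.{ext}"


def save_pages(images, out_dir: str = "."):
    """Save each image as a JPEG in out_dir and return the written paths."""
    paths = []
    for i, image in enumerate(images):
        path = os.path.join(out_dir, page_filename(i, len(images)))
        image.save(path, "JPEG")
        paths.append(path)
    return paths


# Usage, assuming example.pdf exists and pdf2image is installed:
#   from pdf2image import convert_from_path
#   save_pages(convert_from_path("example.pdf"))
```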
The code can be modified further by importing convert_from_bytes from the pdf2image library, which reads PDF data held in memory as bytes and converts it into images.
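As a rough sketch of the bytes-based variant, the helper below reads raw PDF data and passes it to convert_from_bytes. The names looks_like_pdf and pdf_bytes_to_images are illustrative only. The header check is a cheap sanity test that assumes the data starts with the usual %PDF- marker, so it may reject unusual files that a lenient reader would still accept.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: PDF data normally begins with the marker %PDF-."""
    return data.startswith(b"%PDF-")


def pdf_bytes_to_images(data: bytes):
    """Convert in-memory PDF bytes into a list of PIL images, one per page."""
    if not looks_like_pdf(data):
        raise ValueError("data does not start with a PDF header")
    # Imported here so the header check can be used without pdf2image installed.
    from pdf2image import convert_from_bytes
    return convert_from_bytes(data)


# Usage, assuming example.pdf exists:
#   with open("example.pdf", "rb") as f:
#       images = pdf_bytes_to_images(f.read())
```

This is handy when the PDF comes from a download or a database rather than from a file on disk.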
This code converts every page of the given PDF file. To convert only a specific page, pass the page number as both first_page and last_page when loading the file (page numbers start at 1).
images = convert_from_path('example.pdf', first_page=2, last_page=2)  # convert only page 2
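To select a single page or a range of pages, pdf2image accepts the first_page and last_page keyword arguments. The sketch below wraps them in a small helper that validates the range first. The names page_range_kwargs and convert_pages are hypothetical and exist only for this example.

```python
from typing import Optional


def page_range_kwargs(first: int, last: Optional[int] = None) -> dict:
    """Build first_page/last_page keyword arguments (pages are 1-based)."""
    if last is None:
        last = first  # a single page: first and last are the same
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return {"first_page": first, "last_page": last}


def convert_pages(pdf_path: str, first: int, last: Optional[int] = None):
    """Convert only the requested page range of the PDF into PIL images."""
    # Imported here so the range helper can be used without pdf2image installed.
    from pdf2image import convert_from_path
    return convert_from_path(pdf_path, **page_range_kwargs(first, last))


# Usage, assuming example.pdf has at least 4 pages:
#   images = convert_pages("example.pdf", 2)      # page 2 only
#   images = convert_pages("example.pdf", 2, 4)   # pages 2 to 4
```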
Remember to download Poppler and add the path of its bin folder to the system PATH.
The code should run in any environment where pdf2image and Poppler are installed. Try it out and modify it to suit your requirements.