Delete pages from a PDF file in Python

In this tutorial, we will learn how to delete pages from a PDF file in Python. When working with PDF files, we may need to remove unwanted pages, which can also reduce the file size.

We will use the PyMuPDF package to delete pages from the PDF.

To delete pages from a PDF file in Python

Python is popular for its built-in functions and packages, which make it easy to use and keep the code short.

Here we will use the ‘PyMuPDF’ package and its built-in functions.

Install:

Before writing any code, install the package with the command below:

pip install PyMuPDF

This tutorial uses a PDF file with 6 pages, saved as ‘A.pdf’.

PyMuPDF:

The PyMuPDF library makes it easy to delete pages from any PDF file. We can delete a single page as well as multiple pages from a PDF.

We can also use a list of page numbers to delete pages from a PDF.
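As a rough sketch of those options, PyMuPDF documents provide ‘delete_page()’ for a single page and ‘delete_pages()’ for several pages, both using page numbers that start at 0. The helper names ‘remove_pages’ and ‘remove_pages_from_file’ are made up for this illustration, and passing a list to ‘delete_pages()’ assumes a reasonably recent PyMuPDF release.

```python
def remove_pages(doc, unwanted):
    """Delete pages from an open PyMuPDF document.

    `unwanted` is either one 0-based page number or a list of them.
    Returns the number of pages left in the document.
    """
    if isinstance(unwanted, int):
        doc.delete_page(unwanted)         # delete a single page
    else:
        doc.delete_pages(list(unwanted))  # delete several pages in one call
    return doc.page_count


def remove_pages_from_file(src, dst, unwanted):
    """Open `src`, delete the unwanted pages, and save the result as `dst`."""
    import fitz  # PyMuPDF; imported lazily, only needed when a file is opened

    doc = fitz.open(src)
    remaining = remove_pages(doc, unwanted)
    doc.save(dst)
    doc.close()
    return remaining
```

For the 6-page ‘A.pdf’, a call such as remove_pages_from_file("A.pdf", "B.pdf", [2, 3, 5]) would be expected to drop pages 3, 4, and 6 and leave 3 pages in ‘B.pdf’.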

In the example below, we first import the ‘fitz’ module from the package. Then we store the input file name in the ‘ipf’ variable and the output file name in the ‘opf’ variable.
Next, we open the file and store the document in the ‘f’ variable. The list ‘pgls’ holds the page numbers we want to keep; all other pages will be deleted. Note that page numbers are indexed from 0. Finally, we select the pages in the list and save the result to the output file ‘opf’.

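import fitz  # PyMuPDF is imported under the name "fitz"

ipf = "A.pdf"  # input file
opf = "B.pdf"  # output file

f = fitz.open(ipf)  # open the input PDF
pgls = [0, 1, 4]    # pages to keep (0-based), i.e. pages 1, 2, and 5

f.select(pgls)  # keep only the listed pages; all others are dropped
f.save(opf)     # write the result to the output file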

The output is saved as ‘B.pdf’. It is a 3-page PDF file containing pages 1, 2, and 5 of the original file.
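Since ‘select()’ takes the pages to keep, it can be handy to start from the pages you want to delete instead. The sketch below is one way to do that, assuming the pages to delete are given as 1-based page numbers; the helper names ‘pages_to_keep’ and ‘delete_by_page_number’ are hypothetical and not part of PyMuPDF.

```python
def pages_to_keep(page_count, pages_to_delete):
    """Return the 0-based indexes to keep, given 1-based page numbers to delete."""
    unwanted = {number - 1 for number in pages_to_delete}  # convert to 0-based
    return [index for index in range(page_count) if index not in unwanted]


def delete_by_page_number(src, dst, pages_to_delete):
    """Save a copy of `src` as `dst` without the given 1-based page numbers."""
    import fitz  # PyMuPDF; imported lazily, only needed when a file is opened

    doc = fitz.open(src)
    doc.select(pages_to_keep(doc.page_count, pages_to_delete))
    doc.save(dst)
    doc.close()
```

For the 6-page file, pages_to_keep(6, [3, 4, 6]) gives [0, 1, 4], the same list used in ‘pgls’ above.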

 

Hope it is useful.

Thanks for your valuable time!
