Extract text from image in Python

In this tutorial, we will extract text from an image in Python using two modules: cv2 (OpenCV) and pytesseract. pytesseract is a Python wrapper around the Tesseract OCR engine, so you need cv2, pytesseract, and Tesseract itself installed on your machine.

Installation of cv2 and pytesseract

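Download the latest version of Tesseract and install it on your PC as you would install any other software. The two Python modules are installed separately with pip: cv2 comes from the opencv-python package (pip install opencv-python) and pytesseract from the pytesseract package (pip install pytesseract).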
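Before going further, it can help to confirm that all three pieces are visible to Python. The following is a minimal sketch that uses only the standard library; the function name check_ocr_setup is our own, and it assumes the Tesseract executable is called tesseract and is expected on your PATH.

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report which pieces of the OCR toolchain can be found on this machine."""
    return {
        # find_spec returns None when a module cannot be imported.
        "cv2": importlib.util.find_spec("cv2") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # shutil.which returns None when the executable is not on PATH.
        "tesseract": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    for name, found in check_ocr_setup().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If tesseract is reported as missing even though you installed it, the engine is probably just not on your PATH, which is the situation covered in the problem fixing section.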

How to extract text from image in Python

First, we will import pytesseract under the alias tr, along with cv2:

import pytesseract as tr
import cv2

Next, we will declare a variable im to hold the image and read it with the imread function. In the brackets, we give the location of the image we want to load; if the image is in the same folder the script is run from, the file name alone is enough.

im = cv2.imread('image.jpg')

Then we will declare another variable, string_from_image, to store the text read from the image. We'll apply the image_to_string function to read the text, passing the im variable as its argument.

string_from_image = tr.image_to_string(im)
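The string returned by image_to_string is raw OCR output, so it often contains blank lines and stray whitespace. As a small sketch of how it could be tidied before use, the helper below (clean_ocr_text is a hypothetical name of our own, and the sample string is made up for illustration) splits the text into non-empty, stripped lines.

```python
def clean_ocr_text(text):
    """Split raw OCR output into non-empty lines with surrounding whitespace removed."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line]


# A made-up sample standing in for real OCR output.
sample = "Invoice 42 \n\n Total: 10\n"
print(clean_ocr_text(sample))  # prints ['Invoice 42', 'Total: 10']
```

With the tutorial's variables, this would be called as clean_ocr_text(string_from_image).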

The final step is to print the string:

print(string_from_image)

The complete code for the steps above is:

import pytesseract as tr
import cv2
# Read the image from disk
im = cv2.imread('image.jpg')
# Run OCR on the image and store the recognized text
string_from_image = tr.image_to_string(im)
print(string_from_image)
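The short program above assumes that the image exists and can be decoded. A slightly more defensive version might look like the sketch below. The function name extract_text and its lang parameter are our own choices, not part of the tutorial's code. It relies on two details of the libraries: cv2.imread returns None instead of raising an error when it cannot read a file, and OpenCV stores pixels in BGR order while pytesseract expects RGB.

```python
import os


def extract_text(path, lang="eng"):
    """Return the text Tesseract finds in the image at `path`."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No such image file: {path}")

    # Imported lazily so the path check above works without the OCR libraries.
    import cv2
    import pytesseract as tr

    im = cv2.imread(path)
    if im is None:
        # imread signals an unreadable or unsupported file by returning None.
        raise ValueError(f"OpenCV could not decode: {path}")

    # OpenCV loads images as BGR; convert to the RGB order pytesseract expects.
    rgb = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
    return tr.image_to_string(rgb, lang=lang)
```

Calling print(extract_text('image.jpg')) then does the same job as the program above, but a wrong file name produces a clear error message instead of a confusing failure inside the OCR step.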

Problem fixing

You may run into one difficulty with this program: the required packages are installed, but pytesseract still reports that Tesseract cannot be found.
To fix this, add the following line to your script after the imports, using the location of tesseract.exe on your machine:

tr.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

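On a Windows PC, you can instead add the Tesseract folder to the PATH environment variable. Use the folder where Tesseract is actually installed on your machine:
This PC (My Computer) -> Properties -> Advanced system settings -> Environment Variables -> Path -> New -> C:\Program Files\Tesseract-OCR\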
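Because the install folder differs between machines, the location can also be looked up instead of hard-coded. The sketch below is one possible approach: find_tesseract and DEFAULT_CANDIDATES are hypothetical names of our own, and the candidate paths are simply the two folders mentioned in this tutorial, so adjust them to your setup.

```python
import os
import shutil

# Install locations mentioned in this tutorial; adjust them for your machine.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
)


def find_tesseract(candidates=DEFAULT_CANDIDATES):
    """Return a path to the tesseract executable, or None if it is not found."""
    # Prefer whatever is already reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the known install locations.
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

If find_tesseract() returns a path, assign it to tr.pytesseract.tesseract_cmd before calling image_to_string. If it returns None, Tesseract is most likely not installed, or it is installed in a folder that is not in the candidate list.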
