YOLO Object Detection from image with OpenCV and Python

In this tutorial, we will learn how to use Python and OpenCV to detect objects in an image with the help of the YOLO algorithm. We will be using the PyCharm IDE to build this program.

YOLO is an object detection model whose first version was published in 2015 and presented at CVPR 2016. YOLO stands for “You Only Look Once”: the algorithm looks at the entire image in a single pass and detects all the objects in it.
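To make "one pass" concrete, here is a small sketch of the output geometry of YOLOv3 (the version used in this tutorial), assuming its standard configuration of 3 detection scales, 3 anchor boxes per grid cell, and the 80 COCO classes:

```python
# Sketch: how many candidate detections YOLOv3 produces for a 416x416 input.
# Assumes the standard YOLOv3 setup: 3 detection scales, 3 anchors per grid
# cell, 80 COCO classes.
grid_sizes = [13, 26, 52]          # the three detection grids for a 416x416 image
anchors_per_cell = 3
values_per_detection = 4 + 1 + 80  # box (x,y,w,h) + objectness + class scores

total_detections = sum(g * g * anchors_per_cell for g in grid_sizes)
print(total_detections)            # 10647 candidate boxes per image
print(values_per_detection)        # 85 numbers per detection
```

Every one of those 10647 rows is a candidate box; the confidence filtering and Non-Maximum Suppression steps later in this tutorial reduce them to a handful of final detections.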

First we load the model. In order to load it, we need these 3 files:

  • Weights file: the trained model that detects the objects
  • Cfg file: the network configuration file
  • Names file: contains the names of the objects that the model can detect

These files (yolov3.weights, yolov3.cfg and coco.names) can be downloaded from the official YOLO website (pjreddie.com/darknet/yolo).

Prerequisites

In order to build this program, we’ll require the following libraries:

  1. cv2
  2. NumPy
    import cv2
    import numpy as np

We will be testing our program with the following input image of a traffic scene.

Load Yolo In Our Python Program

We follow these steps:

  • Use the files we have downloaded
  • Load the classes from the names file, i.e. the objects that YOLO can detect
  • Use the getLayerNames() and getUnconnectedOutLayers() functions to get the output layers

 

#Load the YOLO network
net=cv2.dnn.readNet("yolov3.weights","yolov3.cfg")


#To load all objects that have to be detected
classes=[]
with open("coco.names","r") as f:
    classes=[line.strip() for line in f]


#Defining layer names
layer_names=net.getLayerNames()
#getUnconnectedOutLayers() returns 1-based indices; newer OpenCV versions
#return them as plain integers, older ones as single-element arrays,
#so we flatten first to handle both
output_layers=[layer_names[i-1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]

Load The Image File

We follow these steps:

  • Use the imread() function to read the image
  • Use .shape to get the height, width and channels of the image
#Loading the Image
img=cv2.imread("Road.jpg")
height,width,channels=img.shape
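As a quick sketch of what `.shape` gives us, here is the same unpacking on a synthetic 480x640 image (cv2.imread returns exactly this kind of NumPy array, so Road.jpg is stood in for by np.zeros here):

```python
import numpy as np

# Sketch: .shape on a BGR image array, using a synthetic 480x640 image in
# place of Road.jpg (cv2.imread returns the same kind of numpy array).
img = np.zeros((480, 640, 3), dtype=np.uint8)
height, width, channels = img.shape
print(height, width, channels)  # 480 640 3
```

We will need `width` and `height` later to scale YOLO's relative box coordinates back to pixels.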

Extracting features to detect objects

BLOB stands for Binary Large Object and refers to a group of connected pixels in a binary image.
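Numerically, blobFromImage with the arguments we use below scales the pixel values, subtracts a mean, optionally swaps the blue and red channels, and reorders the array into NCHW layout. Here is a sketch of those operations in plain NumPy on a tiny 2x2 image (the real function also resizes the image to 416x416, which is omitted here):

```python
import numpy as np

# Sketch of what blobFromImage(img, 0.00392, (416,416), (0,0,0), True) does
# numerically, on a tiny 2x2 image instead of a real photo.
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)  # BGR pixels

scale, mean = 0.00392, 0.0
rgb = img[:, :, ::-1].astype(np.float32)          # swap channels: BGR -> RGB
blob = (rgb - mean) * scale                       # 0.00392 ~ 1/255, normalize to ~[0, 1]
blob = blob.transpose(2, 0, 1)[np.newaxis, ...]   # HWC -> NCHW (batch of 1)

print(blob.shape)  # (1, 3, 2, 2)
```

With the real 416x416 resize, the blob fed to the network has shape (1, 3, 416, 416).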

We follow these steps:

  • Use blobFromImage() function to extract the blob
  • Pass this blob image into the algorithm
  • Use forward() to forward the blob to the output layer to generate the result
#Extracting features to detect objects
blob=cv2.dnn.blobFromImage(img,0.00392,(416,416),(0,0,0),True,crop=False)
#0.00392 ~ 1/255 scales pixel values to [0,1]; (416,416) is the network
#input size; True swaps the blue and red channels (BGR -> RGB)


#We need to pass the img_blob to the algorithm
net.setInput(blob)
outs=net.forward(output_layers)

 

Displaying Information On The Screen

Here, we go through the results to retrieve the scores, class_id and confidence of each detected object. If the confidence is greater than 0.5, we use the coordinate values to draw a rectangle around the object.
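To make the parsing concrete, here is the same logic run on a single synthetic detection row (the values and the class index are made up for illustration; each real row has 4 box values, 1 objectness score, then 80 class scores):

```python
import numpy as np

# Sketch: parsing one synthetic detection row (values are made up).
detection = np.zeros(85, dtype=np.float32)
detection[:4] = [0.5, 0.5, 0.2, 0.3]  # center_x, center_y, w, h (relative to image size)
detection[4] = 0.9                    # objectness score
detection[5 + 2] = 0.8                # pretend class index 2 ("car" in COCO) scored 0.8

scores = detection[5:]
class_id = int(np.argmax(scores))
confidence = scores[class_id]
print(class_id, round(float(confidence), 2))  # 2 0.8

# Convert the relative, center-based box to pixel corner coordinates
width, height = 640, 480
w = int(detection[2] * width)            # 128
h = int(detection[3] * height)           # 144
x = int(detection[0] * width - w / 2)    # 256 (top-left x)
y = int(detection[1] * height - h / 2)   # 168 (top-left y)
```

The loop below does exactly this for every row of every output layer.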

#Displaying information on the screen
class_ids=[]
confidences=[]
boxes=[]
for output in outs:
    for detection in output:
        #Detecting confidence in 3 steps
        scores=detection[5:]                #1
        class_id=np.argmax(scores)          #2
        confidence =scores[class_id]        #3

        if confidence >0.5: #Means if the object is detected
            center_x=int(detection[0]*width)
            center_y=int(detection[1]*height)
            w=int(detection[2]*width)
            h=int(detection[3]*height)

            #Drawing a rectangle
            x=int(center_x-w/2) # top left value
            y=int(center_y-h/2) # top left value

            boxes.append([x,y,w,h])
            confidences.append(float(confidence))
            class_ids.append(class_id)
            cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)

But if we display the result at this point, our program draws multiple overlapping boxes around some objects, which is not correct:

(Intermediate output: detections drawn with duplicate boxes)

 

Removing Double Boxes

We will use Non-Maximum Suppression (the NMSBoxes() function) to remove the duplicate boxes from our result, keeping only the best box for each detected object.
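To build intuition, here is a minimal pure-Python sketch of what Non-Maximum Suppression does (the real cv2.dnn.NMSBoxes implementation, which we actually use, is more sophisticated): greedily keep the highest-scoring box, then drop any remaining box that overlaps it too much.

```python
# Minimal NMS sketch on synthetic boxes; boxes are [x, y, w, h].
def iou(a, b):
    # Intersection-over-Union of two boxes
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, score_thr, nms_thr):
    # Sort by score, keep the best, suppress heavy overlaps, repeat
    order = [i for i in sorted(range(len(boxes)),
                               key=lambda i: scores[i], reverse=True)
             if scores[i] >= score_thr]
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= nms_thr]
    return keep

# Two nearly identical boxes around one object, plus one distinct box
boxes = [[100, 100, 50, 80], [102, 98, 52, 82], [300, 200, 40, 40]]
scores = [0.9, 0.85, 0.7]
print(nms(boxes, scores, 0.3, 0.4))  # [0, 2] -> the duplicate box 1 is suppressed
```

In our program the two thresholds play the same roles: 0.3 is the minimum score to consider a box at all, and 0.4 is the maximum allowed overlap before a box is treated as a duplicate.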

#Removing double boxes with Non-Maximum Suppression
indexes=cv2.dnn.NMSBoxes(boxes,confidences,0.3,0.4) #score threshold 0.3, NMS threshold 0.4

for i in range(len(boxes)):
    if i in indexes:
        x, y, w, h = boxes[i]
        label = classes[class_ids[i]]  # name of the objects
       
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(img, label, (x, y), cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 255), 2)

Printing the Output

Finally, we display the resulting image with imshow(), wait for a key press, and then close all windows.

cv2.imshow("Output",img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Complete Code

Here is the complete code for this program

import cv2
import numpy as np

#Load YOLO Algorithm
net=cv2.dnn.readNet("yolov3.weights","yolov3.cfg")

#To load all objects that have to be detected
classes=[]
with open("coco.names","r") as f:
    classes=[line.strip() for line in f]

#Defining layer names
layer_names=net.getLayerNames()
#getUnconnectedOutLayers() returns 1-based indices; newer OpenCV versions
#return them as plain integers, older ones as single-element arrays,
#so we flatten first to handle both
output_layers=[layer_names[i-1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]


#Loading the Image
img=cv2.imread("Road.jpg")
height,width,channels=img.shape


#Extracting features to detect objects
blob=cv2.dnn.blobFromImage(img,0.00392,(416,416),(0,0,0),True,crop=False)
#0.00392 ~ 1/255 scales pixel values to [0,1]; (416,416) is the network
#input size; True swaps the blue and red channels (BGR -> RGB)


#We need to pass the img_blob to the algorithm
net.setInput(blob)
outs=net.forward(output_layers)
#print(outs)

#Displaying information on the screen
class_ids=[]
confidences=[]
boxes=[]
for output in outs:
    for detection in output:
        #Detecting confidence in 3 steps
        scores=detection[5:]                #1
        class_id=np.argmax(scores)          #2
        confidence =scores[class_id]        #3

        if confidence >0.5: #Means if the object is detected
            center_x=int(detection[0]*width)
            center_y=int(detection[1]*height)
            w=int(detection[2]*width)
            h=int(detection[3]*height)

            #Drawing a rectangle
            x=int(center_x-w/2) # top left value
            y=int(center_y-h/2) # top left value

            boxes.append([x,y,w,h])
            confidences.append(float(confidence))
            class_ids.append(class_id)
           #cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)

#Removing double boxes with Non-Maximum Suppression
indexes=cv2.dnn.NMSBoxes(boxes,confidences,0.3,0.4) #score threshold 0.3, NMS threshold 0.4

for i in range(len(boxes)):
    if i in indexes:
        x, y, w, h = boxes[i]
        label = classes[class_ids[i]]  # name of the objects
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(img, label, (x, y), cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 255), 2)
       


cv2.imshow("Output",img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Now if we run our program, we will be able to see the final output image, just like below:


We get our final image with all the detected objects highlighted with their names.

Hope this post helps you understand the concept of YOLO object detection with OpenCV and Python.
