Real-time object detection using TensorFlow in Python

Hey there everyone! Today we will learn real-time object detection using Python. The TensorFlow Object Detection API, available on GitHub, makes it much easier to train a model and adapt it for real-time object detection.

We will see how to modify an existing “.ipynb” file so that our model detects objects in live camera frames.
So, let’s start.

Real-time object detection in TensorFlow

First of all, we need to download a few things before we start working with the code. Let’s see what they are.

Download Protobuf version 3.4 or above (this article uses version 3.4) and extract it. You can get it here:
https://github.com/protocolbuffers/protobuf/releases

Next, download the “Models and examples built with TensorFlow” repository from the GitHub link below and extract it:
https://github.com/tensorflow/models

Now we will compile the Protobuf files. This must be done from inside the research directory “…….models\research”. You can compile them with the command

protoc object_detection/protos/*.proto --python_out=.

Once you have successfully compiled the Protobuf files, you will see a “.py” file for each “.proto” file in the protos folder. Now, it’s time to work on our code.
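
If you would rather not inspect the protos folder by eye, a short script can do the check. The sketch below assumes that protoc names its Python output “<name>_pb2.py” and that the repository was extracted to “models” in the current directory; the helper name missing_pb2_files is made up for this article.

```python
from pathlib import Path


def missing_pb2_files(protos_dir):
    """Return the names of .proto files with no matching *_pb2.py beside them."""
    protos_dir = Path(protos_dir)
    missing = []
    for proto in sorted(protos_dir.glob("*.proto")):
        # protoc writes foo_pb2.py for foo.proto when --python_out is used.
        compiled = proto.with_name(proto.stem + "_pb2.py")
        if not compiled.exists():
            missing.append(proto.name)
    return missing


if __name__ == "__main__":
    # Assumed location of the extracted repository; adjust to your own path.
    protos = Path("models/research/object_detection/protos")
    if protos.is_dir():
        print("Not compiled:", missing_pb2_files(protos) or "none")
    else:
        print("protos folder not found at", protos)
```

An empty result means every “.proto” file has its compiled counterpart.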

Working with the code

Open “object_detection_tutorial.ipynb”, located in the “models ▸ research ▸ object_detection” directory. This opens a Jupyter notebook containing the complete, well-explained code for object detection.

When you run all the cells of “object_detection_tutorial.ipynb”, it imports the required modules and downloads the model used for object detection from the internet. You can use other models from here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

Once you have run all the cells successfully, the notebook shows the detection output for its two test images.

[Images: object detection results on the two test images]

The above images are the result of object detection performed on “test_images”. For real-time object detection, we need access to a camera and we will make some changes to “object_detection_tutorial.ipynb”.

First, we need to remove this part of the code, since we don’t need the test images for real-time detection.

# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = pathlib.Path('models/research/object_detection/test_images')
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))
TEST_IMAGE_PATHS

You may comment it out or remove it completely.

Next, to access the camera, we have to import cv2 and open the default camera (device index 0).

import cv2
cap = cv2.VideoCapture(0)
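
cv2.VideoCapture(0) does not raise an error when no camera is available; it returns a capture object whose isOpened() is False, and read() then reports failure through its first return value. A small guard makes such problems visible early. This is only a sketch: read_frame is a hypothetical helper, and it works with any object that offers isOpened() and read() the way cv2.VideoCapture does.

```python
def read_frame(cap):
    """Return the next frame from an OpenCV-style capture, or None if none is available."""
    if not cap.isOpened():
        raise RuntimeError("Camera could not be opened; check the device index.")
    ret, frame = cap.read()
    # ret is False when the camera delivered no frame (for example after unplugging).
    return frame if ret else None


# Usage with the capture created above:
#   frame = read_frame(cap)
#   if frame is None:
#       ...stop the detection loop...
```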

Now, we need to change this piece of our code:

def show_inference(model, image_path):
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = np.array(Image.open(image_path))
  # Actual detection.
  output_dict = run_inference_for_single_image(model, image_np)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks_reframed', None),
      use_normalized_coordinates=True,
      line_thickness=8)

  display(Image.fromarray(image_np))

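Replace it with the loop below (detection_parameters is this article’s renamed version of run_inference_for_single_image; it is defined in the complete code further down):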

while True:
    # Read a frame from the camera; stop if no frame could be read.
    ret, image_np = cap.read()
    if not ret:
        break

    # Actual detection. OpenCV delivers BGR frames, while the model expects RGB.
    output_dict = detection_parameters(
        detection_model, cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB))
    # Visualization of the detected objects (drawn onto the BGR frame).
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        line_thickness=8)

    cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
    # Press 'q' to quit.
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break

# Release the camera and close the display window.
cap.release()
cv2.destroyAllWindows()
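
The use_normalized_coordinates=True argument in the loop reflects how the model reports boxes: each entry of detection_boxes is [ymin, xmin, ymax, xmax], scaled to the range 0 to 1 relative to the image size. If you need pixel positions yourself, for example to crop a detected object out of the frame, a small conversion is enough. The helper name box_to_pixels is hypothetical and not part of the Object Detection API; take it as a minimal sketch.

```python
def box_to_pixels(box, image_height, image_width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel (left, top, right, bottom)."""
    ymin, xmin, ymax, xmax = box
    left = int(round(xmin * image_width))
    top = int(round(ymin * image_height))
    right = int(round(xmax * image_width))
    bottom = int(round(ymax * image_height))
    return left, top, right, bottom


# Example: a box covering the right half and the middle band of an 800x600 frame.
print(box_to_pixels([0.25, 0.5, 0.75, 1.0], image_height=600, image_width=800))
```

With a NumPy frame, image_np.shape[0] is the height and image_np.shape[1] is the width, so a crop would be image_np[top:bottom, left:right].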

And finally, here is our complete code for real-time object detection:

!pip install -U --pre tensorflow=="2.*"
!pip install pycocotools
import os
import pathlib


if "models" in pathlib.Path.cwd().parts:
  while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')
elif not pathlib.Path('models').exists():
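  !git clone --depth 1 https://github.com/tensorflow/models

Installing the object detection package (in its own cell, because %%bash must be the first line of a cell):

%%bash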
cd models/research 
pip install .

Importing all the required libraries:

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from IPython.display import display
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
# patch tf1 into `utils.ops`
utils_ops.tf = tf.compat.v1

# Patch the location of gfile
tf.gfile = tf.io.gfile

Preparing our model:

def Load_My_Model(Model_Name):
  base_url = 'http://download.tensorflow.org/models/object_detection/'
  model_file = Model_Name + '.tar.gz'
  model_directory = tf.keras.utils.get_file(
    fname=Model_Name, 
    origin=base_url + model_file,
    untar=True)
  model_directory = pathlib.Path(model_directory)/"saved_model"
  my_model = tf.saved_model.load(str(model_directory))
  my_model = my_model.signatures['serving_default']
  return my_model

Loading the label map:

# Label map used to attach the correct class name to each box.
PATH_TO_LABELS = 'models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
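
The resulting category_index is a dictionary keyed by class id, where each value is itself a small dictionary holding the id and the display name. The sketch below shows how such an index turns numeric detections into readable labels. It uses a hand-written three-entry index instead of loading the real label map; detected_labels and its min_score parameter are hypothetical names chosen for illustration.

```python
# Hand-written stand-in with the same shape as the real category_index.
category_index = {
    1: {"id": 1, "name": "person"},
    2: {"id": 2, "name": "bicycle"},
    3: {"id": 3, "name": "car"},
}


def detected_labels(classes, scores, category_index, min_score=0.5):
    """Return (name, score) pairs for detections scoring at least min_score."""
    labels = []
    for class_id, score in zip(classes, scores):
        if score < min_score:
            continue
        # Fall back to "unknown" for ids that are missing from the label map.
        name = category_index.get(int(class_id), {}).get("name", "unknown")
        labels.append((name, float(score)))
    return labels


print(detected_labels([1, 3, 2], [0.9, 0.6, 0.2], category_index))
```

In the real pipeline, classes and scores would come from the 'detection_classes' and 'detection_scores' entries of the output dictionary.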

Loading the object detection model. The model is trained on the COCO (Common Objects in Context) dataset and is downloaded from the internet:

Model = 'ssd_mobilenet_v1_coco_2017_11_17'
detection_model = Load_My_Model(Model)

Checking the model’s input signature and outputs:

print(detection_model.inputs)
detection_model.output_dtypes
detection_model.output_shapes

Function for calling the model and converting the image to tensor:

def detection_parameters(my_model, obj):
  obj = np.asarray(obj)
  # converting the input using `tf.convert_to_tensor`.
  input_tensor_obj = tf.convert_to_tensor(obj)
  
  input_tensor_obj = input_tensor_obj[tf.newaxis,...]
  # Run inference
  output_dictionary = my_model(input_tensor_obj)
  
  
  # All outputs are batched tensors. Take index [0] to drop the batch dimension
  # and keep only the first num_detections entries.
  num_detections = int(output_dictionary.pop('num_detections'))
  output_dictionary = {key:val[0, :num_detections].numpy() 
                 for key,val in output_dictionary.items()}
  output_dictionary['num_detections'] = num_detections
  
  output_dictionary['detection_classes'] = output_dictionary['detection_classes'].astype(np.int64)
   
  # Handle models with masks:
  if 'detection_masks' in output_dictionary:
    # Reframe the box masks to the image size.
    detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
              output_dictionary['detection_masks'], output_dictionary['detection_boxes'],
               obj.shape[0], obj.shape[1])      
    detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                       tf.uint8)
    output_dictionary['detection_masks_reframed'] = detection_masks_reframed.numpy()
    
  return output_dictionary
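
The post-processing step in this function is easier to follow on fake data. The model returns every output with a leading batch dimension and pads it to a fixed number of slots; num_detections says how many of those slots are real. The sketch below repeats the same slicing with plain NumPy arrays standing in for the model's tensors. The function name strip_batch and the array sizes are assumptions made for illustration.

```python
import numpy as np


def strip_batch(raw_outputs):
    """Drop the batch dimension and keep only the first num_detections entries."""
    outputs = dict(raw_outputs)
    num_detections = int(outputs.pop("num_detections")[0])
    result = {key: np.asarray(val)[0, :num_detections]
              for key, val in outputs.items()}
    result["num_detections"] = num_detections
    # Class ids arrive as floats; the label lookup needs integers.
    result["detection_classes"] = result["detection_classes"].astype(np.int64)
    return result


# Fake model output: batch of 1, padded to 5 slots, 3 real detections.
raw = {
    "detection_boxes": np.zeros((1, 5, 4), dtype=np.float32),
    "detection_classes": np.array([[1.0, 3.0, 2.0, 1.0, 1.0]], dtype=np.float32),
    "detection_scores": np.array([[0.9, 0.6, 0.2, 0.0, 0.0]], dtype=np.float32),
    "num_detections": np.array([3.0], dtype=np.float32),
}
clean = strip_batch(raw)
print(clean["detection_boxes"].shape, clean["detection_classes"])
```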

Instance Segmentation:

model_name = "mask_rcnn_inception_resnet_v2_atrous_coco_2018_01_28"
masking_model = Load_My_Model(model_name)
masking_model.output_shapes

Importing cv2 for real-time detection of objects:

import cv2
cap = cv2.VideoCapture(0)

Running camera and real-time detection of objects:

while True:
    # Read a frame from the camera; stop if no frame could be read.
    ret, image_np = cap.read()
    if not ret:
        break

    # Actual detection. OpenCV delivers BGR frames, while the model expects RGB.
    output_dict = detection_parameters(
        detection_model, cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB))
    # Visualization of the detected objects (drawn onto the BGR frame).
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        line_thickness=8)

    cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
    # Press 'q' to quit.
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break

# Release the camera and close the display window.
cap.release()
cv2.destroyAllWindows()

Here is a screenshot of the generated output:

[Screenshot: TensorFlow object detection]

I hope you enjoyed this tutorial and will try it out on your own.
