Real-time object detection using TensorFlow in Python
Hey there everyone, today we will learn real-time object detection using Python. The TensorFlow Object Detection API, available on GitHub, has made it a lot easier to train our model and modify it for real-time object detection.
We will see how we can modify an existing “.ipynb” file to make our model detect objects in real time.
So, let’s start.
Real-time object detection in TensorFlow
First of all, we need to download a few things before we actually start working with the code. Let’s see what they are.
Download Protobuf version 3.4 or above (this article uses version 3.4) and extract it. You can get it here:
https://github.com/protocolbuffers/protobuf/releases
The next thing you need to do is download the models and examples built with TensorFlow from the GitHub link provided below:
https://github.com/tensorflow/models
Download the repository and then extract it.
Now we will compile the Protobuf files. They should be compiled from within the research directory “…….models\research”. You can compile them using the command:
protoc object_detection/protos/*.proto --python_out=.
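If the wildcard does not expand on your system (this often happens in the Windows command prompt), you can invoke protoc on each file from Python instead. This is a minimal sketch, assuming protoc is on your PATH and you run it from the research directory:

import glob
import subprocess

# Compile every .proto file under object_detection/protos one at a time.
for proto in glob.glob("object_detection/protos/*.proto"):
    subprocess.run(["protoc", proto, "--python_out=."], check=True)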
Once you have successfully compiled the Protobuf files, you will see a “.py” file for each “.proto” file within the protos folder. Now, it’s time to work on our code.
Working with the code
Open the “object_detection_tutorial.ipynb” file located in the “models ▸ research ▸ object_detection” directory. This will open up a Jupyter notebook that contains the complete, well-explained code for object detection.
When you run all the cells of the “object_detection_tutorial.ipynb” file, it imports all the required modules and downloads the model required for object detection from the internet. You can use other models from the detection model zoo here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Once you have successfully run all the cells, you will see the two test image outputs of “object_detection_tutorial.ipynb”.
These outputs are the result of object detection performed on the “test_images” folder. For real-time object detection, we need access to a camera, so we will make some changes to “object_detection_tutorial.ipynb”.
First, we need to remove this part from our code, as we don’t need the test_images for real-time object detection.
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = pathlib.Path('models/research/object_detection/test_images')
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))
TEST_IMAGE_PATHS
You may comment it out or remove this part completely.
Next, to access our camera, we have to import cv2.
import cv2
cap = cv2.VideoCapture(0)
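If the detection window later shows a black frame or cap.read() returns nothing, the camera index is probably wrong. Here is a quick sanity check you can run first; trying index 1 is just an example for a second camera:

import cv2

cap = cv2.VideoCapture(0)
if not cap.isOpened():
    # Fall back to another camera index, e.g. an external webcam.
    cap = cv2.VideoCapture(1)
if not cap.isOpened():
    raise IOError("Cannot open webcam")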
Now, we need to change this piece of our code:
def show_inference(model, image_path):
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = np.array(Image.open(image_path))
    # Actual detection.
    output_dict = run_inference_for_single_image(model, image_np)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        line_thickness=8)
    display(Image.fromarray(image_np))
Modify the above to this:
while True:
    # Read a frame from the camera.
    ret, image_np = cap.read()
    # Actual detection.
    output_dict = detection_parameters(detection_model, image_np)
    # Visualization of the detected objects.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        line_thickness=8)
    cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
    # Press 'q' to quit.
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break

# Free the camera once the loop ends.
cap.release()
And finally, here is our complete code for real-time object detection:
!pip install -U --pre tensorflow=="2.*"
!pip install pycocotools
import os
import pathlib

if "models" in pathlib.Path.cwd().parts:
    while "models" in pathlib.Path.cwd().parts:
        os.chdir('..')
elif not pathlib.Path('models').exists():
    !git clone --depth 1 https://github.com/tensorflow/models
%%bash
cd models/research
pip install .
Importing all the required libraries:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from IPython.display import display
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
# patch tf1 into `utils.ops`
utils_ops.tf = tf.compat.v1

# Patch the location of gfile
tf.gfile = tf.io.gfile
Preparing our model:
def Load_My_Model(Model_Name):
    base_url = 'http://download.tensorflow.org/models/object_detection/'
    model_file = Model_Name + '.tar.gz'
    model_directory = tf.keras.utils.get_file(
        fname=Model_Name,
        origin=base_url + model_file,
        untar=True)
    model_directory = pathlib.Path(model_directory)/"saved_model"
    my_model = tf.saved_model.load(str(model_directory))
    my_model = my_model.signatures['serving_default']
    return my_model
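Note that tf.keras.utils.get_file caches the download, so the archive is only fetched the first time this function runs. If you want to see where it landed, here is a small sketch assuming the default Keras cache location:

import pathlib

# By default, get_file stores downloads under ~/.keras/datasets.
cache_dir = pathlib.Path.home() / ".keras" / "datasets"
print(sorted(p.name for p in cache_dir.iterdir()))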
Loading the label map:
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = 'models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
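The category_index is a plain dictionary mapping each numeric class id to its label; the visualization utility uses it to print a name next to every box. For example, in the COCO label map id 1 is “person”:

# Inspect one entry of the label map.
print(category_index[1])
# {'id': 1, 'name': 'person'}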
Loading the object detection model. The model is trained on the COCO (Common Objects in Context) dataset and is downloaded from the internet:
Model = 'ssd_mobilenet_v1_coco_2017_11_17'
detection_model = Load_My_Model(Model)
Checking the model’s input signature and outputs:
print(detection_model.inputs)

detection_model.output_dtypes
detection_model.output_shapes
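For this SSD model, all outputs come back as float32 tensors. Roughly, you should see something along these lines (a sketch of the expected result, not exact console output):

# detection_model.output_dtypes should look something like:
# {'detection_boxes': tf.float32,
#  'detection_classes': tf.float32,
#  'detection_scores': tf.float32,
#  'num_detections': tf.float32}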
Function for calling the model and converting the image to tensor:
def detection_parameters(my_model, obj):
    obj = np.asarray(obj)
    # converting the input using `tf.convert_to_tensor`.
    input_tensor_obj = tf.convert_to_tensor(obj)
    input_tensor_obj = input_tensor_obj[tf.newaxis, ...]

    # Run inference
    output_dictionary = my_model(input_tensor_obj)

    # considering only the first num_detections
    num_detections = int(output_dictionary.pop('num_detections'))
    output_dictionary = {key: val[0, :num_detections].numpy()
                         for key, val in output_dictionary.items()}
    output_dictionary['num_detections'] = num_detections
    output_dictionary['detection_classes'] = output_dictionary['detection_classes'].astype(np.int64)

    # Handle models with masks:
    if 'detection_masks' in output_dictionary:
        # Reframe the box masks to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            output_dictionary['detection_masks'], output_dictionary['detection_boxes'],
            obj.shape[0], obj.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        output_dictionary['detection_masks_reframed'] = detection_masks_reframed.numpy()

    return output_dictionary
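Before pointing the model at a camera, you can sanity-check this function on one of the bundled test images. A minimal sketch, assuming the default repository layout:

import numpy as np
from PIL import Image

# Run a single inference on a test image shipped with the repository.
image = np.array(Image.open('models/research/object_detection/test_images/image1.jpg'))
output = detection_parameters(detection_model, image)
print(output['num_detections'], output['detection_scores'][:5])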
Instance Segmentation:
model_name = "mask_rcnn_inception_resnet_v2_atrous_coco_2018_01_28"
masking_model = Load_My_Model(model_name)
masking_model.output_shapes
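The Mask R-CNN model adds a detection_masks output, which detection_parameters converts to detection_masks_reframed, so the visualization call will draw instance masks automatically. To see masks in the real-time loop below, the only change needed is to swap the model in the inference call:

# Inside the while loop, use the segmentation model instead of the SSD detector:
output_dict = detection_parameters(masking_model, image_np)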
Importing cv2 for real-time detection of objects:
import cv2
cap = cv2.VideoCapture(0)
Running camera and real-time detection of objects:
while True:
    # Read a frame from the camera.
    ret, image_np = cap.read()
    # Actual detection.
    output_dict = detection_parameters(detection_model, image_np)
    # Visualization of the detected objects.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        line_thickness=8)
    cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
    # Press 'q' to quit.
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break

# Free the camera once the loop ends.
cap.release()
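If you don’t have a webcam handy, the same loop works on a video file, since cv2.VideoCapture also accepts a path. A brief sketch; the file name is just a placeholder, and the extra check stops the loop when the file ends:

# Read from a video file instead of a camera; 'sample.mp4' is a placeholder.
cap = cv2.VideoCapture('sample.mp4')

while True:
    ret, image_np = cap.read()
    if not ret:
        # Stop when the file ends or a frame cannot be read.
        break
    # ... detection and visualization exactly as in the loop above ...

cap.release()
cv2.destroyAllWindows()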
Here is the screenshot of the output generated:
I hope you enjoyed this tutorial and will try it out on your own.
Also, read: Motion Detection using OpenCV in Python