Identify Cats vs Dogs in Python using Deep Learning

Hello guys, in this tutorial we are going to build a deep learning model in Python that classifies images of animals into categories, in this case cats vs dogs.

In this model, I have used a transfer learning approach. Transfer learning is an approach in which a new, similar problem reuses the weights that were trained on a previous problem. This technique is highly popular in the deep learning world because it is fairly simple, easy to apply, and less time-consuming and costly than training from scratch. So, here I have used the VGG16 model, which ships with the Keras library. This architecture is built from convolutional (CNN) layers for feature extraction, max-pooling layers, dense layers, and other important layers.

Binary classification for detecting Cats and Dogs

So let’s start…

Get the path of the directory where the dataset is stored.

train_path = "/content/files/training_set/training_set"
test_path = "/content/files/test_set/test_set"

Names of the folders inside our training directory.

import os
os.listdir(train_path)
['dogs', 'cats']

Initializing some of the variables which we will use later.

img_rows=224
img_cols=224
num_channel=3

num_epoch=100
batch_size=32

image_data=[]
labels=[]

The next step is to get the location of each image and read it using OpenCV. Files whose names contain '(' or '_' are skipped.
We then resize each image to 224x224 and append it to a separate list, i.e. dog_images or cat_images.

import cv2
dog_images = []
cat_images = []
label_index = {}
for folder in os.listdir(train_path):
  if folder == 'dogs':
    label_index[folder] = len(label_index)
    dog_list = os.listdir(os.path.join(train_path, folder))
    for img in dog_list:
      if '(' not in img and '_' not in img: 
        input_img=cv2.imread(train_path + '/'+ folder + '/'+ img )
        input_img_resize=cv2.resize(input_img, (img_rows, img_cols))
        dog_images.append(input_img_resize)
  if folder == 'cats':
    label_index[folder] = len(label_index)
    cat_list = os.listdir(os.path.join(train_path, folder))
    for img in cat_list:
      if '(' not in img and '_' not in img:
        input_img=cv2.imread(train_path + '/'+ folder + '/'+ img )
        input_img_resize=cv2.resize(input_img, (img_rows, img_cols))
        cat_images.append(input_img_resize)

In the above code, we have also created a label index dictionary that we use later to map our prediction.

label_index
{'cats': 1, 'dogs': 0}
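The indices above depend on the order in which os.listdir() happens to return the folders, and that order is arbitrary. As a minimal sketch, assuming the same two folder names, the class order can be fixed explicitly so that it always matches the labels used later (0 for dogs, 1 for cats). The names class_names and index_label are illustrative and not part of the original code.

```python
# Fix the class order explicitly so the indices do not depend on
# the order in which the folders are listed on disk.
class_names = ['dogs', 'cats']  # assumed folder names, as listed earlier

# Forward mapping: class name -> integer label.
label_index = {name: idx for idx, name in enumerate(class_names)}

# Inverse mapping: integer label -> class name, handy for reading predictions.
index_label = {idx: name for name, idx in label_index.items()}

print(label_index)
print(index_label[0])
```

The inverse dictionary is what lets a predicted index be turned back into a readable class name.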

Let’s draw 100 random samples from each of the previously created lists and concatenate them into a single list. Because this is a large dataset and would take a long time to train on, I have used only 100 samples per class.

import random
dog_images = random.sample(dog_images,100)
cat_images = random.sample(cat_images,100)

image_data = dog_images + cat_images

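We can look at one of the sampled images using plt.imshow() from the matplotlib library. OpenCV loads images in BGR channel order, so we convert to RGB before displaying.

import numpy as np
import matplotlib.pyplot as plt
# OpenCV reads images as BGR; convert to RGB so the colours display correctly
plt.imshow(cv2.cvtColor(image_data[101], cv2.COLOR_BGR2RGB))
plt.show()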

Normalize the pixel values to the range [0, 1] for faster training.

image_data = np.array(image_data)
image_data = image_data.astype('float32')
image_data /= 255
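The order of these steps matters: the cast to float32 comes first because NumPy does not allow in-place true division on an integer array. A minimal sketch of the same scaling on a small random batch, which stands in for the real images and is purely illustrative:

```python
import numpy as np

# Stand-in batch: 2 images of 4x4 pixels with 3 channels, uint8 like OpenCV output.
batch = np.random.randint(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)

# Cast to float first, then scale the 0..255 pixel values into 0..1.
scaled = batch.astype('float32') / 255

print(scaled.dtype)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```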

Let’s create a list of labels, the target variable for binary classification, i.e. 0 for dogs and 1 for cats.

labels = list(np.zeros(100,dtype='int')) + list(np.ones(100,dtype='int'))

Keras provides a method to_categorical() to create a one-hot encoded vector of the target variable. Then we use this one-hot encoded vector for the classification at the time of training.

from keras.utils import to_categorical

labels = to_categorical(np.array(labels))
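To see what one-hot encoding produces, here is a small NumPy sketch that builds the same kind of matrix by hand. It is an illustration of the idea, not the Keras implementation, and the four sample labels are made up.

```python
import numpy as np

labels = np.array([0, 0, 1, 1])  # 0 = dog, 1 = cat (illustrative sample)
num_classes = 2

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with the labels encodes all of them at once.
one_hot = np.eye(num_classes, dtype='float32')[labels]
print(one_hot)

# argmax along each row recovers the original integer labels.
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```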

Now let’s shuffle the data.

from sklearn.utils import shuffle

X, Y = shuffle(image_data, labels, random_state=132)

Split the shuffled data into training and validation sets.

from sklearn.model_selection import train_test_split
# Use the shuffled X and Y; stratify keeps the dog/cat ratio equal in both sets
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.3, stratify=Y, random_state=123)
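The stratify argument keeps the class ratio the same in both parts. The pure-Python sketch below illustrates that idea on 100 labels per class. It is not scikit-learn's implementation, and the helper name stratified_split is hypothetical.

```python
import random

def stratified_split(labels, test_size, seed):
    """Return (train_idx, val_idx) with the same class ratio in both parts."""
    rng = random.Random(seed)
    # Group sample indices by their class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, val_idx = [], []
    # Split each class separately so both parts keep the class balance.
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * test_size)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return train_idx, val_idx

labels = [0] * 100 + [1] * 100  # 100 dogs, 100 cats
train_idx, val_idx = stratified_split(labels, test_size=0.3, seed=123)
print(len(train_idx), len(val_idx))  # 140 60
```

With 200 samples and test_size = 0.3, this gives 140 training and 60 validation samples, with 30 of each class in the validation part.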

Let’s build our model.

Keras provides a built-in VGG16 architecture for experimenting with deep learning models. Let’s define the input shape as (224, 224, 3), because the VGG16 model with its fully connected top layers expects images of that size.

from keras.layers import Input, Dense
image_input = Input(shape=(img_rows, img_cols, num_channel))

Here, I am taking the weights of the model previously trained on the ImageNet dataset. That is why this approach is called transfer learning. We could also build our own deep learning model and train it from scratch.

from keras.applications.vgg16 import VGG16

model = VGG16(input_tensor=image_input, include_top=True, weights='imagenet')

The last layer of the pretrained model outputs 1000 classes, so we replace it with a new 2-class output layer for our binary classification, attached to the output of the 'fc2' layer.

last_layer = model.get_layer('fc2').output
out = Dense(2, activation='softmax', name='output')(last_layer)
from keras import Model
custom_vgg_model = Model(image_input, out)
custom_vgg_model.summary()

Compile the model. We freeze every layer except the new output layer, so that only its weights are updated during training.

for layer in custom_vgg_model.layers[:-1]:
    layer.trainable = False

custom_vgg_model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
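As a quick sanity check on how little is left to train after freezing, the arithmetic below counts the parameters of the new output layer. It assumes the standard VGG16 'fc2' width of 4096 units. The variable names are illustrative.

```python
fc2_units = 4096   # width of VGG16's 'fc2' layer, the input to the new Dense layer
num_classes = 2    # dogs and cats

# A Dense layer has one weight per (input, output) pair plus one bias per output.
trainable_params = fc2_units * num_classes + num_classes
print(trainable_params)  # 8194
```

Everything else in the network keeps its ImageNet weights, which is why training on only 200 images is feasible at all.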

Now that we have built our model, let’s train it on our dataset and make a prediction.

custom_vgg_model.fit(X_train, Y_train, batch_size=batch_size, epochs=num_epoch, verbose=1, validation_data=(X_val, Y_val))
y_test_pred = custom_vgg_model.predict(X_val)
# Index of the highest-scoring class for the first validation image
np.argmax(y_test_pred[0])
output: 0

# Reverse the channel axis (BGR to RGB) so the colours display correctly
plt.imshow(X_val[0][..., ::-1])
plt.show()

Displaying the image, I got a dog as expected, since index 0 maps to 'dogs'. You may get different results based on your data samples.
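To read predictions for the whole validation set at once, the predicted indices can be mapped back to class names. A minimal sketch, using made-up softmax outputs in place of real model predictions and an assumed index_label dictionary that inverts the label index:

```python
import numpy as np

index_label = {0: 'dogs', 1: 'cats'}  # assumed inverse of the label index

# Made-up softmax outputs for three images, for illustration only.
y_pred = np.array([[0.9, 0.1],
                   [0.2, 0.8],
                   [0.6, 0.4]])

# argmax along axis 1 picks the most probable class for each image.
pred_idx = np.argmax(y_pred, axis=1)
pred_names = [index_label[int(i)] for i in pred_idx]
print(pred_names)  # ['dogs', 'cats', 'dogs']
```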
