Identify Cats vs Dogs in Python using Deep Learning
Hello guys, in this tutorial we are going to build a machine learning model that classifies different categories of animals, for example cats vs dogs, using deep learning techniques in Python.
In this model, I have used a transfer learning approach. Transfer learning is an approach in which a new, similar problem reuses the weights that were trained on a previous problem. This technique is highly popular in the deep learning world because it is fairly simple, easy to apply, and less time-consuming and costly. Here I have used the VGG16 model, which is available out of the box in the Keras library. This architecture is built from convolutional layers for feature detection, max-pooling layers, dense (fully connected) layers, and other important layers.
Binary classification for detecting Cats and Dogs
So let’s start…
Get the path of the directory where the dataset is stored.
train_path = "/content/files/training_set/training_set"
test_path = "/content/files/test_set/test_set"
Names of the folders inside our directory.
import os
os.listdir(train_path)
['dogs', 'cats']
Initializing some of the variables which we will use later.
img_rows = 224
img_cols = 224
num_channel = 3
num_epoch = 100
batch_size = 32
image_data = []
labels = []
The next step is to get the location of each image and read it using OpenCV. We then append the images to separate lists, i.e. dog_images and cat_images.
import cv2

dog_images = []
cat_images = []
label_index = {}

# Walk through the 'dogs' and 'cats' folders, read each image with OpenCV,
# resize it to 224x224, and collect it in the matching list
for folder in os.listdir(train_path):
    if folder == 'dogs':
        label_index[folder] = len(label_index)
        dog_list = os.listdir(os.path.join(train_path, folder))
        for img in dog_list:
            if '(' not in img and '_' not in img:
                input_img = cv2.imread(train_path + '/' + folder + '/' + img)
                input_img_resize = cv2.resize(input_img, (img_rows, img_cols))
                dog_images.append(input_img_resize)
    if folder == 'cats':
        label_index[folder] = len(label_index)
        cat_list = os.listdir(os.path.join(train_path, folder))
        for img in cat_list:
            if '(' not in img and '_' not in img:
                input_img = cv2.imread(train_path + '/' + folder + '/' + img)
                input_img_resize = cv2.resize(input_img, (img_rows, img_cols))
                cat_images.append(input_img_resize)
In the above code, we have also created a label index dictionary that we will use later to map our predictions back to class names.
label_index
{'cats': 1, 'dogs': 0}
Let's draw 100 random samples per class from the previously created lists and concatenate them into a single list. Because this is a large dataset and would take a long time to train on, I have used only 100 samples per class.
import random
dog_images = random.sample(dog_images, 100)
cat_images = random.sample(cat_images, 100)
image_data = dog_images + cat_images
We can look at a random image using plt.imshow() from the matplotlib library.
import numpy as np
import matplotlib.pyplot as plt

plt.imshow(np.array(image_data[101]))
plt.show()
Normalizing the images for faster training.
image_data = np.array(image_data)
image_data = image_data.astype('float32')
image_data /= 255
Let's create the list of labels, the target variable for binary classification, i.e. 0 for dogs and 1 for cats.
labels = list(np.zeros(100,dtype='int')) + list(np.ones(100,dtype='int'))
Keras provides a method to_categorical() to create a one-hot encoded vector of the target variable. We then use this one-hot encoded vector for classification at training time.
from keras.utils import to_categorical
labels = to_categorical(np.array(labels))
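To see what to_categorical actually produces, here is a minimal illustration (not part of the training pipeline): each integer label becomes a two-element one-hot vector.

# A minimal illustration of to_categorical: integer labels -> one-hot vectors
to_categorical(np.array([0, 1, 1]))
# array([[1., 0.],
#        [0., 1.],
#        [0., 1.]], dtype=float32)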
Now let's shuffle the data.
from sklearn.utils import shuffle
X, Y = shuffle(image_data, labels, random_state=132)
Splitting the data for training and validation.
from sklearn.model_selection import train_test_split
# Split the shuffled data (X, Y) into training and validation sets
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y, test_size=0.3, stratify=Y, random_state=123)
Let’s build our model.
Keras provides the VGG16 architecture out of the box for experimenting with deep learning models. Let's define the input shape as (224, 224, 3), because the VGG16 model expects images in (224, 224, 3) dimensions.
from keras.layers import Input, Dense
image_input = Input(shape=(img_rows, img_cols, num_channel))
Here, I am using the weights of the model previously trained on the ImageNet dataset; that is why this approach is called transfer learning. We could also build our own deep learning model and train it from scratch.
from keras.applications.vgg16 import VGG16
model = VGG16(input_tensor=image_input, include_top=True, weights='imagenet')
The last layer of the pretrained model outputs 1000 classes, so we redefine the last layer with 2 classes, i.e. binary classification.
from keras import Model

# Take the output of the 'fc2' fully connected layer and add a new
# 2-class softmax output layer on top of it
last_layer = model.get_layer('fc2').output
out = Dense(2, activation='softmax', name='output')(last_layer)

custom_vgg_model = Model(image_input, out)
custom_vgg_model.summary()

# Freeze every layer except the new output layer, then compile the model
for layer in custom_vgg_model.layers[:-1]:
    layer.trainable = False

custom_vgg_model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
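If you want to confirm that only the new output layer will be updated during training, a quick optional check (not part of the original walkthrough) is to print each layer's trainable flag:

# Optional sanity check: only the final 'output' layer should show True
for layer in custom_vgg_model.layers:
    print(layer.name, layer.trainable)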
Now that we have built our model, let's train it on our dataset and make a prediction.
custom_vgg_model.fit(X_train, Y_train, batch_size=batch_size, epochs=num_epoch, verbose=1, validation_data=(X_val, Y_val))
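If you also want a single overall score on the validation split after training, Keras models provide evaluate(); this short snippet is an optional addition to the tutorial:

# Optional: overall loss and accuracy on the validation split
val_loss, val_acc = custom_vgg_model.evaluate(X_val, Y_val, verbose=0)
print(val_loss, val_acc)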
y_test_pred = custom_vgg_model.predict(X_val)
np.argmax(y_test_pred[0:1, :])
Output: 0
plt.imshow(X_val[0])
Here I got the image of a dog, as expected, since the predicted index 0 corresponds to dogs in our label_index. You may get different results based on your data samples.
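Finally, since label_index maps class names to indices, we can turn a predicted index back into a readable class name. A small sketch, where the index_to_label dictionary is a hypothetical helper built by inverting label_index:

# Hypothetical helper: invert label_index ({'cats': 1, 'dogs': 0}) so we can
# look up the class name for a predicted index
index_to_label = {v: k for k, v in label_index.items()}
pred_index = np.argmax(y_test_pred[0])
print(index_to_label[pred_index])   # e.g. 'dogs'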