How to Configure Image Data Augmentation in Keras TensorFlow

Training deep convolutional neural networks on more data can improve their performance and their ability to generalize. With limited training data, a model tends to overfit. This article explains how you can expand an existing dataset through Image Data Augmentation in Keras TensorFlow using Python.

We’ll look at what data augmentation is and how to implement it.

Image Data Augmentation

Data augmentation is a method of increasing the size of our training data by transforming the data that we already have. Image data augmentation, likewise, expands the training dataset by creating modified versions of the images that already exist in it. Keep in mind that each transformed image belongs to the same class as the original image. Augmentation both enlarges the training data and introduces variety into it, thereby reducing the model’s tendency to overfit. It must be applied carefully, though. For example, a vertical flip of a photo of a dog does not make sense, but a horizontal flip does, because the photo could have been taken from either the left or the right.

Keras lets us perform Image Data Augmentation automatically while training a model through the ImageDataGenerator class. The class supports a variety of augmentation methods, but we’ll focus on five main strategies:

Image Rotation
Horizontal and Vertical flipping
Zooming
Image Brightness
Image height and width shifting

The ImageDataGenerator class accepts the original training data, transforms it, and returns only the transformed data for the model to train on. Because the transformations are applied randomly, however, the model sees both clearly modified images and images with almost no transformation during training.
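To make the idea of random, on-the-fly transformation concrete, here is a minimal NumPy sketch. It is not how ImageDataGenerator is implemented; the helper `random_horizontal_flip` and the toy 2 x 3 image are assumptions made for illustration. It shows that some generated samples come out unchanged, others come out transformed, and the class label stays the same in both cases.

```python
import numpy as np

def random_horizontal_flip(image, rng):
    """Return the image mirrored left-to-right with probability 0.5 (hypothetical helper)."""
    if rng.random() < 0.5:
        return image[:, ::-1, :]
    return image

rng = np.random.default_rng(seed=0)

# A toy "dataset": one 2 x 3 RGB image with class label 1 (say, "dog").
image = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)
label = 1

# Every request yields a fresh random variant; the label never changes.
variants = [(random_horizontal_flip(image, rng), label) for _ in range(8)]

unchanged = sum(np.array_equal(v, image) for v, _ in variants)
flipped = sum(np.array_equal(v, image[:, ::-1, :]) for v, _ in variants)
print(f"{unchanged} unchanged, {flipped} flipped, labels: {set(l for _, l in variants)}")
```

The split between unchanged and flipped samples depends on the random seed, which is the point: the original dataset is never enlarged on disk, the variety comes from drawing a new random transformation each time.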

Image Data Augmentation using ImageDataGenerator

Image used for illustration:

Dog

Before beginning with Image Augmentation, let us import the required libraries.

import matplotlib.pyplot as plt
import numpy as np
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import ImageDataGenerator

The ‘load_img’ function loads an image file as a PIL image, and the ‘img_to_array’ function converts a PIL image to a NumPy array of pixels.
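The generator works on batches of images, so a single image array needs a leading batch axis before it is passed in. A minimal sketch of the shapes involved, using a small zero-filled array as a stand-in for the loaded photo (the 4 x 6 size is an arbitrary assumption):

```python
import numpy as np

# Stand-in for img_to_array(load_img('dog.jpg')): a 4 x 6 RGB image of float pixels.
image = np.zeros((4, 6, 3), dtype=np.float32)

# Add a leading batch axis: (height, width, channels) -> (1, height, width, channels).
samples = np.expand_dims(image, 0)

print(image.shape)    # (4, 6, 3)
print(samples.shape)  # (1, 4, 6, 3)
```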

Image Rotation

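This augmentation rotates the image by a random angle. Rotation is configured using the ‘rotation_range’ argument of the ImageDataGenerator class, which takes a number of degrees. Each generated image is rotated by an angle drawn at random from between -rotation_range and +rotation_range, so the rotation can go in either direction.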

See the Python code below:

# loading the image
image = load_img('dog.jpg')

# converting the PIL image into a numpy array of pixels
image = img_to_array(image)

# expanding dimensions
samples = np.expand_dims(image, 0)

# creating Image data augmentation generator
datagen = ImageDataGenerator(rotation_range = 120)

# preparing the iterator
iterator = datagen.flow(samples, batch_size = 1)

# generating and plotting the augmented images
for index in range(4):
  # defining the subplot in a 2 x 2 grid
  plt.subplot(2, 2, index + 1)
  # generating the next batch of augmented images
  batch = next(iterator)
  # converting to unsigned integers for viewing
  image = batch[0].astype('uint8')
  # viewing the augmented image
  plt.imshow(image)

# displaying the figure
plt.show()

Here, we first create an object of the ImageDataGenerator class. We then create an iterator, request a batch from it four times in a loop, and plot each augmented image.

OUTPUT:

rot_aug
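Each of the four images above gets its own rotation angle. The sketch below shows how such angles could be drawn, assuming a uniform distribution over the configured range, which is my reading of the Keras behaviour rather than something shown in this article. The seed is arbitrary.

```python
import numpy as np

rotation_range = 120  # same value as in the example above
rng = np.random.default_rng(seed=42)

# One angle per generated image, drawn uniformly from [-rotation_range, +rotation_range].
angles = rng.uniform(-rotation_range, rotation_range, size=4)
print(np.round(angles, 1))
```

Negative and positive angles are equally likely, so a rotation_range of 120 can turn the image up to 120 degrees in either direction.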

Horizontal & Vertical flipping

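A horizontal flip mirrors the image about the vertical axis, whereas a vertical flip mirrors it about the horizontal axis. The two are enabled with the ‘horizontal_flip’ and ‘vertical_flip’ arguments of the ImageDataGenerator class. Since we are using an image of a dog, a vertical flip makes no sense here, so only the horizontal flip is enabled.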

# loading the image
image = load_img('dog.jpg')

# converting the PIL image into a numpy array of pixels
image = img_to_array(image)

# expanding dimensions
samples = np.expand_dims(image, 0)

# creating Image data augmentation generator
datagen = ImageDataGenerator(horizontal_flip = True)

# preparing the iterator
iterator = datagen.flow(samples, batch_size = 1)

# generating and plotting the augmented images
for index in range(4):
  # defining the subplot in a 2 x 2 grid
  plt.subplot(2, 2, index + 1)
  # generating the next batch of augmented images
  batch = next(iterator)
  # converting to unsigned integers for viewing
  image = batch[0].astype('uint8')
  # viewing the augmented image
  plt.imshow(image)

# displaying the figure
plt.show()

OUTPUT:

flip_aug

As we can see, horizontal_flip was randomly applied to some images and not others.
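The difference between the two flip directions is easiest to see on a tiny array. The following NumPy sketch only illustrates what each flip does to the pixel grid; it does not use the Keras generator.

```python
import numpy as np

# A tiny single-channel "image" so the effect is easy to read.
image = np.array([[1, 2, 3],
                  [4, 5, 6]])

horizontal = np.fliplr(image)  # mirror about the vertical axis: columns reversed
vertical = np.flipud(image)    # mirror about the horizontal axis: rows reversed

print(horizontal)  # [[3 2 1]
                   #  [6 5 4]]
print(vertical)    # [[4 5 6]
                   #  [1 2 3]]
```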

Zooming

A zoom augmentation randomly zooms in on or out of the image. Image zooming is configured using the ‘zoom_range’ argument of the ImageDataGenerator class. A zoom_range of [0.6, 1.4] means the zoom factor is picked at random between 60% (zoom in) and 140% (zoom out). Values < 1 zoom in, making the objects in the image larger, whereas values > 1 zoom out, making the objects smaller. A zoom_range of [1.0, 1.0] has no effect on the original image. The code below uses a wider range of [0.3, 3] so that the effect is easy to see.

# loading the image
image = load_img('dog.jpg')

# converting the PIL image into a numpy array of pixels
image = img_to_array(image)

# expanding dimensions
samples = np.expand_dims(image, 0)

# creating Image data augmentation generator
datagen = ImageDataGenerator(zoom_range = [0.3, 3])

# preparing the iterator
iterator = datagen.flow(samples, batch_size = 1)

# generating and plotting the augmented images
for index in range(4):
  # defining the subplot in a 2 x 2 grid
  plt.subplot(2, 2, index + 1)
  # generating the next batch of augmented images
  batch = next(iterator)
  # converting to unsigned integers for viewing
  image = batch[0].astype('uint8')
  # viewing the augmented image
  plt.imshow(image)

# displaying the figure
plt.show()

OUTPUT:

zoom_aug
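Why a factor below 1 makes objects larger can be seen in a simplified NumPy sketch. The helper `zoom_center` is hypothetical and uses nearest-neighbour sampling about the image centre; the Keras generator applies its own transform with interpolation, so this only illustrates the direction of the effect.

```python
import numpy as np

def zoom_center(image, factor):
    """Nearest-neighbour zoom about the image centre (hypothetical helper).

    factor < 1 zooms in (objects look larger), factor > 1 zooms out.
    Pixels that fall outside the source image take the nearest edge value.
    """
    height, width = image.shape[:2]
    # For every output pixel, find the source coordinate it samples from.
    rows = (np.arange(height) - (height - 1) / 2) * factor + (height - 1) / 2
    cols = (np.arange(width) - (width - 1) / 2) * factor + (width - 1) / 2
    rows = np.clip(np.rint(rows), 0, height - 1).astype(int)
    cols = np.clip(np.rint(cols), 0, width - 1).astype(int)
    return image[rows][:, cols]

image = np.arange(16).reshape(4, 4)
zoomed_in = zoom_center(image, 0.5)

print(zoomed_in)
# [[ 5  5  6  6]
#  [ 5  5  6  6]
#  [ 9  9 10 10]
#  [ 9  9 10 10]]
```

With a factor of 0.5 the output samples only the central 2 x 2 region of the source and stretches it over the full 4 x 4 frame, which is why the content appears magnified.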

Image Brightness

This augmentation alters the brightness of the image. It is configured using the ‘brightness_range’ argument of the ImageDataGenerator class, which takes the lower and upper bounds of the brightness factor. Values > 1 brighten the image, values < 1 darken it, and a value of 1.0 leaves it unchanged.

# loading the image
image = load_img('dog.jpg')

# converting the PIL image into a numpy array of pixels
image = img_to_array(image)

# expanding dimensions
samples = np.expand_dims(image, 0)

# creating Image data augmentation generator
datagen = ImageDataGenerator(brightness_range = [0.3, 1.5])

# preparing the iterator
iterator = datagen.flow(samples, batch_size = 1)

# generating and plotting the augmented images
for index in range(4):
  # defining the subplot in a 2 x 2 grid
  plt.subplot(2, 2, index + 1)
  # generating the next batch of augmented images
  batch = next(iterator)
  # converting to unsigned integers for viewing
  image = batch[0].astype('uint8')
  # viewing the augmented image
  plt.imshow(image)

# displaying the figure
plt.show()

OUTPUT:

bright_aug
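As a rough illustration of what a brightness factor does to pixel values, the sketch below multiplies the intensities by the factor and clips the result to the valid range. The helper `adjust_brightness` is hypothetical; Keras performs the adjustment internally, so treat this as an approximation of the effect rather than the library's code.

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor` and clip to 0-255 (hypothetical helper)."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

pixels = np.array([[0, 100, 200]], dtype=np.uint8)

print(adjust_brightness(pixels, 0.5))  # [[  0  50 100]]
print(adjust_brightness(pixels, 1.5))  # [[  0 150 255]]
```

Note how the brightest pixel saturates at 255 for the factor 1.5: large factors wash out detail in already bright regions.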

Height & Width shifting

A height shift moves the image in the vertical direction, whereas a width shift moves it in the horizontal direction. To configure this augmentation, we use the ‘height_shift_range’ and ‘width_shift_range’ arguments of the ImageDataGenerator class. Each parameter can be a floating-point value between 0 and 1, giving the maximum shift as a fraction of the image height or width, or a number of pixels. A list such as [-110, 110] gives the pixel shifts to choose from at random.

# loading the image
image = load_img('dog.jpg')

# converting the PIL image into a numpy array of pixels
image = img_to_array(image)

# expanding dimensions
samples = np.expand_dims(image, 0)

# creating Image data augmentation generator
datagen = ImageDataGenerator(height_shift_range = [-110, 110], width_shift_range = 0.4) 

# preparing the iterator
iterator = datagen.flow(samples, batch_size = 1)

# generating and plotting the augmented images
for index in range(4):
  # defining the subplot in a 2 x 2 grid
  plt.subplot(2, 2, index + 1)
  # generating the next batch of augmented images
  batch = next(iterator)
  # converting to unsigned integers for viewing
  image = batch[0].astype('uint8')
  # viewing the augmented image
  plt.imshow(image)

# displaying the figure
plt.show()

OUTPUT:

height_aug
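The two ways of specifying a shift used above, a list of pixel values and a fraction, can be compared with a simplified sketch. The helper `sample_shift` is hypothetical and only approximates how I understand the Keras parameters to be interpreted; the image sizes of 400 and 600 pixels are made-up values.

```python
import numpy as np

def sample_shift(shift_range, size, rng):
    """Pick a shift in pixels for an axis of length `size` (hypothetical helper)."""
    if isinstance(shift_range, (list, tuple)):
        # A list gives the candidate pixel shifts; one is chosen at random.
        return float(rng.choice(shift_range))
    shift = rng.uniform(-shift_range, shift_range)
    # A value below 1 is a fraction of the axis length, otherwise it is in pixels.
    return shift * size if shift_range < 1 else shift

rng = np.random.default_rng(seed=1)
print(sample_shift([-110, 110], size=400, rng=rng))  # either -110.0 or 110.0
print(sample_shift(0.4, size=600, rng=rng))          # somewhere in [-240, 240]
```

A fraction scales with the image size, which makes it the safer choice when images are later resized, while pixel values stay fixed regardless of resolution.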

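Shifting the image leaves an empty region along one edge, which the generator fills in. By default the nearest edge pixels are replicated (fill_mode = 'nearest'). If you want to change how this region is filled, you can do so with the ‘fill_mode’ parameter, which accepts 'constant', 'nearest', 'reflect', or 'wrap'.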
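The four fill modes are easier to compare on a single row of pixels. The sketch below reproduces the patterns described in the Keras documentation with NumPy's `np.pad`; the mapping between the Keras names and the NumPy modes is my own, chosen so that the padded values match those patterns. It assumes the default fill value of 0 for the 'constant' mode.

```python
import numpy as np

row = np.array([1, 2, 3, 4])  # one row of pixel values

# Keras fill_mode -> np.pad arguments that produce the same pattern.
equivalents = {
    "constant": dict(mode="constant", constant_values=0),
    "nearest": dict(mode="edge"),
    "reflect": dict(mode="symmetric"),
    "wrap": dict(mode="wrap"),
}

padded = {name: np.pad(row, 2, **kwargs) for name, kwargs in equivalents.items()}
for name, values in padded.items():
    print(f"{name:>8}: {values}")

# constant: [0 0 1 2 3 4 0 0]
#  nearest: [1 1 1 2 3 4 4 4]
#  reflect: [2 1 1 2 3 4 4 3]
#     wrap: [3 4 1 2 3 4 1 2]
```

For natural photos, 'nearest' and 'reflect' tend to give the least distracting borders, while 'constant' produces solid bars where the image was shifted away.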

Applying augmentation in a CNN

Now let us see how to apply Image Data Augmentation to the training data when fitting a model. The model we’ll implement is a cat vs. dog classifier. The dataset has 8,000 images in total: 6,000 in the training set (3,000 in each category) and the remaining 2,000 in the test set (1,000 in each category).

# Importing the required libraries
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Defining the CNN architecture
model = Sequential()

# first convolution block: 32 filters of size 3 x 3 on 64 x 64 RGB input
model.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

# second convolution block
model.add(Conv2D(32, (3, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

# flattening the feature maps into a vector
model.add(Flatten())

# fully connected layers; the single sigmoid unit outputs the class probability
model.add(Dense(units = 128, activation = 'relu'))
model.add(Dense(units = 1, activation = 'sigmoid'))

model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])


# Applying Image Augmentation before fitting the model on the dataset

from keras.preprocessing.image import ImageDataGenerator

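# 'rescale = 1./255' transforms every pixel value from range [0, 255] to [0, 1].
# The shifts are given as fractions because the images are resized to 64 x 64,
# so a shift of 110 pixels would push the whole image out of the frame.
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   zoom_range = [0.6, 1.4],
                                   height_shift_range = 0.3,
                                   width_shift_range = 0.3,
                                   brightness_range = [0.3, 1.4],
                                   horizontal_flip = True)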

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 64,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 64,
                                            class_mode = 'binary')

# 'fit' accepts the generators directly; the number of batches per epoch
# is inferred from the size of each generator
model.fit(training_set,
          epochs = 30,
          validation_data = test_set)
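A quick arithmetic check helps confirm that the architecture and batch settings fit together. The sketch below assumes 3 x 3 convolutions with stride 1 and no padding, followed by 2 x 2 pooling, which is how I read the model above; the helper names are made up for this example.

```python
import math

def conv_output(size, kernel=3):
    """Output size of an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_output(size, pool=2):
    """Output size of non-overlapping max pooling."""
    return size // pool

size = 64
for _ in range(2):             # two convolution + pooling stages
    size = pool_output(conv_output(size))

flattened = size * size * 32   # 32 feature maps from the last convolution
print(size, flattened)         # 14 6272

# Batches needed to see every image once with batch_size = 64.
train_batches = math.ceil(6000 / 64)
test_batches = math.ceil(2000 / 64)
print(train_batches, test_batches)  # 94 32
```

So the first Dense layer receives 6,272 inputs, and one epoch consists of 94 training batches, the last of which is smaller than 64 images.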

In conclusion, if your training dataset is relatively small, you can use data augmentation to add more data points to it and improve your model’s performance.
