Autoencoder Implementation in TensorFlow 2.0 in Python
In this article, I will show you how to implement a simple autoencoder using TensorFlow 2.0. You can always make it a deep autoencoder by adding more layers. First, we will see what an autoencoder is, and then we will move on to the code. When you search for autoencoder code, you will find plenty, but when you run it on your machine, it often fails with errors or unexpected output. To save you time and spare you a headache, this article includes tips and tricks for getting past those problems. Read on to the end to find them all.
An autoencoder is a self-supervised neural network that is trained with backpropagation to make its output equal to its input. The input and output layers have the same number of neurons, so the output is an image of the same size as the input and, ideally, the same image. That sounds weird, doesn't it? Why would we want a neural network to do the job of a copying machine?
Here is the answer: one of the layers is a bottleneck. This layer has far fewer neurons than the input and output layers, so the network has to find a way to represent the data as well as it can with a much smaller number of values. Autoencoders therefore learn compressed representations of the input data and can be used for image compression. An autoencoder has four main parts:
- Encoder: This is the part where the model learns how to reduce the number of features required to represent the data. This is the part which performs feature learning.
- Bottleneck: This is the layer that has the minimum number of neurons in the model. It contains the compressed representation of input data with the lowest dimensionality possible.
- Decoder: This part performs the reconstruction of the compressed representation of the input data from the bottleneck. The aim is to produce an output that is as close to the input as possible.
- Reconstruction Loss: This is a mathematical function that computes the difference between the output and input, also called loss. It is a measure of the performance of the autoencoder.
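To make the reconstruction loss concrete, here is a minimal NumPy sketch of the two loss functions discussed later in this article. The function names, the 4-pixel "image", and the two candidate reconstructions are made up for illustration; Keras computes these losses for you during training.

```python
import numpy as np

def mean_squared_error(x, x_hat):
    """Average squared difference between input and reconstruction."""
    return float(np.mean((x - x_hat) ** 2))

def binary_cross_entropy(x, x_hat, eps=1e-7):
    """Average binary cross-entropy; x and x_hat must lie in [0, 1]."""
    x_hat = np.clip(x_hat, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat))))

# Hypothetical 4-pixel "image" and two candidate reconstructions
x = np.array([0.0, 1.0, 1.0, 0.0])
good = np.array([0.1, 0.9, 0.8, 0.2])  # close to the input
bad = np.array([0.9, 0.1, 0.2, 0.8])   # far from the input

print(mean_squared_error(x, good), mean_squared_error(x, bad))
print(binary_cross_entropy(x, good), binary_cross_entropy(x, bad))
```

With either function, the reconstruction that is closer to the input gets the lower loss, which is what training tries to minimize.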
So, finally, we have come to the much-awaited part. Let’s dive in and see how easy it is to code an autoencoder in TensorFlow 2.0. We will do it part by part, making it easier to understand.
To begin with, make sure that you have the correct version of TensorFlow installed. This tutorial is written specifically for TensorFlow 2.0. Here is how to check it:
    import tensorflow as tf
    print(tf.__version__)
Next, import all the libraries required.
    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
Now, as mentioned earlier, we will make a simple autoencoder by using a single fully connected layer as the encoder and another as the decoder. This code is for the MNIST dataset, which is why the input shape is (784,): each image is 28×28 pixels, flattened into a vector of 784 values.
    # bottleneck is the size of the encoded representation
    bottleneck = 32
    # Placeholder for the input
    input_image = tf.keras.layers.Input(shape=(784,))
    # Encoded representation of the input
    encoded_input = tf.keras.layers.Dense(bottleneck, activation='relu')(input_image)
    # Lossy reconstruction of the input
    decoded_output = tf.keras.layers.Dense(784, activation='sigmoid')(encoded_input)
    # Autoencoder model to map an input to its reconstruction
    autoencoder = tf.keras.models.Model(input_image, decoded_output)
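As a quick sanity check on the model above, the number of trainable parameters and the compression factor can be worked out by hand. This is plain arithmetic based on how a fully connected layer is built (one weight per input-unit pair plus one bias per unit); the variable names are only for illustration.

```python
input_dim = 28 * 28   # 784 pixels per flattened MNIST image
bottleneck = 32       # size of the encoded representation

# A Dense layer has (inputs * units) weights plus one bias per unit
encoder_params = input_dim * bottleneck + bottleneck
decoder_params = bottleneck * input_dim + input_dim
total_params = encoder_params + decoder_params

# How many times smaller the encoded vector is than the input
compression_factor = input_dim / bottleneck

print(encoder_params, decoder_params, total_params)
print(compression_factor)
```

Calling autoencoder.summary() on the model should report the same total, so a mismatch is a sign that a layer size was mistyped.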
You may be wondering why I used Keras. The contrib module was removed in TensorFlow 2.0, and Keras is now TensorFlow's recommended high-level API, so it is better to start with Keras. For input placeholders, many tutorials use input = tf.placeholder('float', [None, abc]), but tf.placeholder was removed from TensorFlow 2.0. If you want to use this function, you have to go through the compatibility API, accessible as tensorflow.compat.v1, and disable v2 behaviors. To avoid this mess, use tf.keras.layers.Input(). You will also see tutorials using xyz = tf.Variable(tf.random_normal([abc, efg])) to create weights and biases for various layers, but tf.random_normal is no longer valid in TensorFlow 2.0. It has been replaced by tf.random.normal. To make things even easier, use tf.keras.layers.Dense() to create layers.
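The replacements described above can be put side by side in a short sketch. The shapes 784 and 32 are filled in here as example values for the abc and efg placeholders, and the variable names are illustrative.

```python
import tensorflow as tf

# TF 1.x style (no longer available at the top level in 2.x):
#   x = tf.placeholder('float', [None, 784])
#   w = tf.Variable(tf.random_normal([784, 32]))

# TF 2.x equivalents
x = tf.keras.layers.Input(shape=(784,))        # symbolic input
w = tf.Variable(tf.random.normal([784, 32]))   # manually created weights

# The higher-level route: Dense creates and tracks its own weights and bias
dense = tf.keras.layers.Dense(32, activation='relu')
y = dense(x)

print(w.shape, y.shape)
```

With Dense you rarely need to create a tf.Variable by hand, which is why the rest of this article sticks to Keras layers.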
We will choose the “binary cross-entropy” loss function and “adam” optimizer for our model.
    autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
For autoencoders, the two most widely used loss functions are mean squared error and binary cross-entropy. If the input values are in the range [0, 1], use binary cross-entropy; otherwise, use mean squared error. Many tutorials use the RMSProp or Adadelta optimizer, but in my experience these usually give blurry, hard-to-distinguish output. After many trials, I have found the Adam optimizer to be the most suitable.
The dataset used here, as mentioned earlier, is the MNIST dataset. It is available in the tf.keras.datasets module. Loading the dataset returns two tuples: one holds the images and labels for the training set, and the other holds the images and labels for the test set. We do not need the labels, because the target of an autoencoder is its own input.
    (X_train, _), (X_test, _) = tf.keras.datasets.mnist.load_data()
    # Scale pixel values from [0, 255] to [0, 1]
    X_train = X_train.astype('float32') / 255
    X_test = X_test.astype('float32') / 255
    # Flatten each 28x28 image into a vector of 784 values
    X_train = X_train.reshape((len(X_train), np.prod(X_train.shape[1:])))
    X_test = X_test.reshape((len(X_test), np.prod(X_test.shape[1:])))
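If you want to check what the scaling and reshaping steps do without downloading MNIST, the following sketch runs them on a small random stand-in array. The fake_images array is an assumption made purely for illustration; real MNIST arrays have the same uint8 type and 28×28 shape per image.

```python
import numpy as np

# Stand-in for MNIST: 5 random 28x28 images with uint8 pixel values (assumption)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Same two steps as above: scale to [0, 1], then flatten each image
scaled = fake_images.astype('float32') / 255
flat = scaled.reshape((len(scaled), np.prod(scaled.shape[1:])))

print(flat.shape, flat.dtype)
print(flat.min(), flat.max())

# reshape(len(x), -1) is an equivalent, shorter way to flatten
assert np.array_equal(flat, scaled.reshape(len(scaled), -1))
```

The result has one row of 784 values per image, all between 0 and 1, which matches the input shape and the sigmoid output range of the model.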
Now that our autoencoder model is ready, let’s train our model for 30 epochs.
    autoencoder.fit(X_train, X_train,
                    epochs=30,
                    batch_size=256,
                    shuffle=True,
                    validation_data=(X_test, X_test))
You will notice that I have used X_train as both the input and the target for training, and X_test as both the input and the target for validation. The reason is simple: an autoencoder is trained to reproduce its input.
Many tutorials use 50 epochs, but 30 epochs give practically the same result. The training and validation losses after 50 epochs are 0.0924 and 0.0910 respectively, and after 30 epochs they are 0.0923 and 0.0910. More epochs are not always necessary. Why did I choose 30? Simply because 30 worked fine for me. You can always experiment and see whether you get similar results with even fewer epochs.
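Instead of picking the number of epochs by hand, one option is to let Keras stop training once the validation loss stops improving. The sketch below uses the EarlyStopping callback; the patience value of 3 is an arbitrary choice for illustration, not something I tuned for this model.

```python
import tensorflow as tf

# Stop once validation loss has not improved for 3 consecutive epochs
# (patience=3 is an arbitrary illustrative value)
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=3,
    restore_best_weights=True,
)

# Usage, assuming `autoencoder`, `X_train`, and `X_test` from the steps above:
# autoencoder.fit(X_train, X_train,
#                 epochs=50, batch_size=256, shuffle=True,
#                 validation_data=(X_test, X_test),
#                 callbacks=[early_stop])
```

With this in place, the epochs argument acts as an upper limit rather than a fixed training length.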
The model has now reached a stable train-test loss. Now, let us visualize the original input and the reconstructed input from the encoded representation. For this, we will use a few images from the test set.
    # Get the reconstructed input
    reconstructed_img = autoencoder.predict(X_test)
    # Plot some of the input and output images
    # Here we have plotted 10 images
    n = 10
    plt.figure(figsize=(20, 4))
    for i in range(n):
        # display original
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(X_test[i].reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        # display reconstruction
        ax = plt.subplot(2, n, i + 1 + n)
        plt.imshow(reconstructed_img[i].reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
    plt.show()
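If you also want to look at the encoded representation itself, and not only the reconstruction, one way is to build a second model that stops at the bottleneck. This is a sketch: it rebuilds the same layers so that it runs on its own, the encoder name is my own choice, and it feeds in blank placeholder images instead of real test data.

```python
import numpy as np
import tensorflow as tf

bottleneck = 32
input_image = tf.keras.layers.Input(shape=(784,))
encoded_input = tf.keras.layers.Dense(bottleneck, activation='relu')(input_image)
decoded_output = tf.keras.layers.Dense(784, activation='sigmoid')(encoded_input)
autoencoder = tf.keras.models.Model(input_image, decoded_output)

# Encoder-only model: shares its layer (and weights) with `autoencoder`
encoder = tf.keras.models.Model(input_image, encoded_input)

# Placeholder batch of 3 blank images (assumption); use X_test[:3] in practice
sample = np.zeros((3, 784), dtype='float32')
codes = encoder.predict(sample, verbose=0)
print(codes.shape)
```

Because the encoder reuses the autoencoder's first layer, training the autoencoder also trains the encoder, so you can build it once before calling fit.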
You can see that the reconstructed images are not very clear and are somewhat blurry. This is common with a simple autoencoder. For cleaner output, there are other variants, such as the convolutional autoencoder and the variational autoencoder. We have now seen how to implement an autoencoder in TensorFlow 2.0. As mentioned earlier, you can always make a deep autoencoder by adding more layers to it. I hope the tips come in handy when you start coding.
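As a rough sketch of what adding more layers could look like, here is a deeper version of the same model with one extra hidden layer on each side of the bottleneck. The hidden size of 128 and the variable names are illustrative assumptions, not tuned values.

```python
import tensorflow as tf

# Layer sizes 128 and 32 are illustrative assumptions
input_image = tf.keras.layers.Input(shape=(784,))
hidden = tf.keras.layers.Dense(128, activation='relu')(input_image)
encoded = tf.keras.layers.Dense(32, activation='relu')(hidden)
hidden = tf.keras.layers.Dense(128, activation='relu')(encoded)
decoded = tf.keras.layers.Dense(784, activation='sigmoid')(hidden)

deep_autoencoder = tf.keras.models.Model(input_image, decoded)
deep_autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
deep_autoencoder.summary()
```

It trains with the same fit call as the simple model; whether it gives visibly sharper digits is something to check on your own run.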