# How to Train TensorFlow Models in Python

In this tutorial, I will explain what TensorFlow is and how to build, compile, and train models using the TensorFlow Python deep learning module. So let's continue…

Basically, tensors are multi-dimensional arrays, and these arrays act as the inputs in TensorFlow. Similar to a graph, a tensor has nodes and edges, where the nodes carry out mathematical operations and produce endpoint outputs, and the edges describe the relationships between inputs and outputs.

In this article, we will train a model on the MNIST dataset using TensorFlow, which will predict handwritten digit images ranging from 0 to 9.

## How to use Google Colab for running TensorFlow models?

Google Colab is similar to a Jupyter notebook and provides free GPUs (Graphics Processing Units), so we can compile and run Python code without installing any software on our system. We just need to go to this link: https://colab.research.google.com

It is a very easy and efficient way to learn TensorFlow, as we don't have to go through the long process of downloading Anaconda and setting up the path on our system. We can focus only on the implementation part in Google Colab.

Below are some simple steps that we have to follow to use Google Colab:

- Sign in to your Google account.
- Visit the above link.
- Click on NEW PYTHON3 NOTEBOOK.
- Start Coding.

## Build, Compile and Train TensorFlow Models in Python

To train any TensorFlow model, we have to:

- Load the dataset.
- Build the model (specify how many hidden layers we want along with their activation functions).
- Define the loss function.
- Obtain training data and use an optimizer in your model.

**Optimizers** are used to improve the speed and performance of training a specific model.

In our Google Colab notebook, we have to install and import TensorFlow. We also have to import matplotlib.pyplot to visualize the images to be trained on, and NumPy to perform certain operations while predicting the number present in an image. The code for the above process is:

```python
!pip install tensorflow==2.0.0-beta1
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
```

### How to load and split the dataset?

First of all, see the code below:

```python
handwritten_dataset = tf.keras.datasets.mnist  # downloads the MNIST dataset and stores it in a variable

(x_train, y_train), (x_test, y_test) = handwritten_dataset.load_data()  # splits the dataset into train and test data

x_train, x_test = x_train / 255.0, x_test / 255.0  # pixel values range from 0-255; dividing by 255 scales them to 0-1
```

In the above code, handwritten_dataset contains the MNIST dataset, which is available in Keras. We split the dataset into (x_train, y_train) and (x_test, y_test).

(x_train, y_train) will train the model, and (x_test, y_test) will evaluate the accuracy of the model. x_train and x_test are the handwritten digit images, while y_train and y_test are the labels (the digit in integer format) associated with each image. To normalize the data, the training and testing images are divided by 255.
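The normalization above is just an element-wise scaling. As a small self-contained illustration with NumPy (using a tiny mock image instead of the real dataset):

```python
import numpy as np

# A tiny mock "image" with pixel values in the 0-255 range
pixels = np.array([[0, 128], [255, 64]])

# Dividing by 255.0 rescales every pixel to the 0-1 range
normalized = pixels / 255.0
print(normalized.min(), normalized.max())  # 0.0 1.0
```

Keeping inputs in a small, consistent range like 0-1 generally helps neural networks train faster and more stably.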

The MNIST dataset contains 60,000 training images and 10,000 testing images. To find the shape, we can write:

```python
print(x_train.shape)
print(x_test.shape)
```

The output of the above code will be:

```
(60000, 28, 28)
(10000, 28, 28)
```

Now, to visualize the dataset, we can use matplotlib.pyplot:

```python
plt.imshow(x_train[1205], cmap='gray_r')
print(y_train[1205])
```

Output –

```
7
```

(along with the plotted image of the handwritten digit 7)

### Build the Model

Now we need to build a model to which the training data will be fit in order to predict the test data. First of all, we will add a layer to flatten the image: if the image resolution is 28 x 28 pixels, the flatten layer will generate 784 nodes, which are fed as the input layer of the model.

Next, we will add a single hidden layer having 128 nodes with a ‘*relu*‘ activation function, and then we will add an output layer having 10 nodes with a ‘*softmax*‘ activation function.

**ReLU** (Rectified Linear Unit) – This function outputs the input directly if the input is positive, and outputs 0 if the input is negative.

**Softmax** function – This function returns the probabilities of every possible output. The output having the maximum probability will be considered the correct prediction.

In the above problem of recognizing handwritten digits, softmax will return an array of 10 elements containing the probabilities of all the numbers from 0 to 9.

The number which will have the highest probability will be the result of our program.
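As a minimal sketch of these two activation functions (written in plain NumPy for illustration, not TensorFlow's built-in implementations):

```python
import numpy as np

def relu(x):
    # Outputs the input where positive, 0 where negative
    return np.maximum(0, x)

def softmax(x):
    # Subtracting the max keeps the exponentials numerically stable
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([1.0, -2.0, 0.5])
print(relu(logits))              # negative entries become 0
print(softmax(logits).sum())     # the probabilities sum to 1
```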

Below is the image that represents the above explanation of our program:

The code for building the model is –

```python
classification_model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
```

### Compile the model

Now we have to compile the model by giving it an optimizer and a loss function for calculating and minimizing the loss.

We use an optimizer to speed up the training process. Here we will use the ‘*adam*‘ optimizer, which is a replacement for the classical stochastic gradient descent technique.

In classical stochastic gradient descent, the learning rate remains unchanged throughout the whole training process. The Adam optimization algorithm, on the other hand, takes advantage of both the adaptive gradient technique (AdaGrad) and RMSProp for a faster training process.

Here we will use “*sparse categorical crossentropy*” as our loss function because this is a classification problem where we have to classify each image into one of ten categories (i.e., 0-9). *Sparse categorical crossentropy* calculates the loss for categorizing the image, and we will use “accuracy” as our metric, which represents the accuracy of our model.
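As a rough illustration (in plain NumPy, not Keras's actual implementation), sparse categorical crossentropy for a single example is the negative log of the probability the model assigns to the true integer label:

```python
import numpy as np

def sparse_categorical_crossentropy(probs, true_label):
    # The loss is -log of the predicted probability of the true class
    return -np.log(probs[true_label])

# Hypothetical softmax output for one image whose true digit is 7
probs = np.array([0.01, 0.01, 0.02, 0.01, 0.02, 0.02, 0.01, 0.85, 0.03, 0.02])
print(sparse_categorical_crossentropy(probs, 7))  # small loss for a confident, correct prediction
```

The "sparse" part means the labels are plain integers (like 7) rather than one-hot vectors, which is exactly the format of y_train in MNIST.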

The code for compiling the model is –

```python
classification_model.compile(optimizer='adam',
                             loss='sparse_categorical_crossentropy',
                             metrics=['accuracy'])
```

### Train and evaluate the Model

Now, to train our model, we have to fit the training data into the model, and we also have to mention the number of epochs. An epoch is one iteration over the whole training data. If the number of epochs is 5, then the whole training data will be processed 5 times.

While training, we will see the loss and the accuracy for every epoch. The loss should decrease and the accuracy should increase with every epoch.

The code for training and evaluating the model for 5 epochs is –

```python
classification_model.fit(x_train, y_train, epochs=5)
classification_model.evaluate(x_test, y_test)
```

During training, Keras prints the loss and accuracy after each epoch, and evaluate() reports the final loss and accuracy on the test data.

To summarize: we compiled our model using the ‘adam’ optimizer and set the loss function to ‘sparse_categorical_crossentropy’. Then we trained our model for 5 epochs and evaluated the loss and accuracy on the test data. At last, we can predict the digit in the first image of our test data.
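The prediction step is not shown in the listings above. In Keras, classification_model.predict(x_test) returns one row of 10 softmax probabilities per image, and the predicted digit is the index of the highest probability. A self-contained sketch of that last step, using a hypothetical probability row in place of the model's real output:

```python
import numpy as np

# Hypothetical output row, standing in for classification_model.predict(x_test)[0]
prediction_row = np.array([0.01, 0.02, 0.05, 0.01, 0.02, 0.03, 0.01, 0.80, 0.03, 0.02])

# The predicted digit is the index of the highest probability
predicted_digit = np.argmax(prediction_row)
print(predicted_digit)  # -> 7
```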
