Overfit and underfit in TensorFlow

Hey there, my fellow machine learning enthusiasts! Today we are going to learn about overfitting and underfitting in TensorFlow. These two problems are so closely related that we often run into one while trying to handle the other.

So, let us just start…

Let us first see what these terms mean and why they are talked about so much in the industry:

Underfitting:

Underfitting is the condition where the model fails to fit the training data well, leading to a high training error.
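You can usually recognise underfitting straight from the training metrics. A minimal sketch, assuming history comes from model.fit() and the model was compiled with metrics=['accuracy']:

train_acc = history.history['accuracy'][-1]  # final training accuracy

# An arbitrary, illustrative threshold -- what counts as "high training error" depends on the task:
if train_acc < 0.7:
    print('Low training accuracy: the model is likely underfitting.')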

This could happen due to various reasons:

  • Less complex model:
  • -> The model may have too few layers, or too few nodes in each layer, to capture the patterns in the data.
  • Too few features:
  • -> The features may not be adequate for the model to learn a good relationship between the data and the labels.
  • Too much regularisation:
  • -> The model may have been regularised so heavily that its weights cannot grow enough to fit the data.
  • Trained for too little time:
  • -> Sometimes we train the model for so short a time that it cannot correctly identify the relationships between the data and the labels.

Overfitting:

Overfitting is the condition where the model does not perform as well on the validation or test data as it did on the training data.

That is, the validation or test accuracy is considerably lower than the training accuracy.

This happens because the model fits the training data so closely that it fails to generalise to new data.
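In practice, you can spot this by comparing the two metrics directly. A minimal sketch, again assuming history comes from model.fit() with validation_data supplied:

train_acc = history.history['accuracy'][-1]      # final training accuracy
val_acc = history.history['val_accuracy'][-1]    # final validation accuracy

# An arbitrary, illustrative threshold for "considerably lower":
if train_acc - val_acc > 0.1:
    print('Large train/validation gap: the model is likely overfitting.')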

This can happen for several reasons:

  • Too complicated a model:
  • -> The model has more layers, or more nodes per layer, than required, so it fits the training data too closely.
  • Too little training data:
  • -> With too few training samples, the model can memorise them easily and will soon overfit.
  • Too little regularisation:
  • -> The model may have been regularised too little, leaving its weights free to grow large enough to memorise the training data.
  • Trained for a very long time:
  • -> The model may have been trained for so long that it has learned to fit the training data exactly and hence cannot generalise to new data.

I know, I know… that is a lot of theory. Now let us see how we can tackle these problems in TensorFlow:

Mitigation Strategies for Overfitting and Underfitting in TensorFlow:

Here is how we tackle overfitting and underfitting in TensorFlow:

Data:

  • -> To tackle underfitting, we need more informative features; this can mean collecting more data or engineering new features from the data we already have.
  • -> To tackle overfitting, we need more training samples. These can be created by applying small modifications to the data at hand; the classic example is data augmentation for image classification models, as in the sketch below.
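For instance, here is a minimal augmentation sketch using the Keras preprocessing layers available in recent TensorFlow versions (in older versions they live under keras.layers.experimental.preprocessing):

import tensorflow as tf
from tensorflow import keras

# Each layer applies a random transformation at training time only,
# effectively creating new samples from the images at hand.
data_augmentation = keras.Sequential([
    keras.layers.RandomFlip('horizontal'),
    keras.layers.RandomRotation(0.1),  # rotate by up to +/- 10% of a full turn
    keras.layers.RandomZoom(0.1),      # zoom by up to 10%
])

# Place it at the front of a model, e.g.:
# model = keras.Sequential([data_augmentation, ...rest of the model...])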


Model Complexity:

  • -> To tackle underfitting, we can add more layers, or more nodes per layer, so that the capacity of the model increases and it can identify the relationships in the data more accurately.
  • -> To tackle overfitting, we need to decrease the complexity of the model.
  • -> There are various ways to do so; one is the Dropout method, in which we randomly drop a fraction of each layer's outputs during training, thus reducing the effective complexity.
  • -> This could be implemented in TensorFlow as:
import tensorflow as tf
from tensorflow import keras

# FEATURES is a placeholder for the number of input features in your data.
dropout_method = tf.keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(FEATURES,)),
    keras.layers.Dropout(0.3),  # randomly zero 30% of the previous layer's outputs
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(1, activation='sigmoid')
])

Here, we drop 30% of the outputs of the first layer, 40% of the second, 50% of the third, and 20% of the fourth. Dropout is only active during training; at inference time all nodes are used.


Regularisation:

  • -> To tackle underfitting, we need to reduce the regularisation parameter.
  • -> To tackle overfitting, we need to increase the regularisation parameter.
  • -> What regularisation does is add a penalty term to the loss function, thus keeping the values of the weights under control.
  • -> There are two common types of regularisation:
  • -> L1 Regularisation: here the penalty is proportional to the absolute values of the weights.
  • -> L2 Regularisation: here the penalty is proportional to the squares of the weights.
  • -> L2 is usually preferred over L1 because it penalises the weights in a way that none of them becomes exactly zero, whereas under L1 many of them do (L1 produces sparse weights).
  • -> This could be implemented in TensorFlow as:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import regularizers

# FEATURES is again a placeholder for the number of input features.
regularised_model = tf.keras.Sequential([
    keras.layers.Dense(128, activation='relu',
                       kernel_regularizer=regularizers.l2(0.001),
                       input_shape=(FEATURES,)),
    keras.layers.Dense(128, activation='relu',
                       kernel_regularizer=regularizers.l1(0.001)),
    keras.layers.Dense(128, activation='relu',
                       kernel_regularizer=regularizers.l2(0.001)),
    keras.layers.Dense(128, activation='relu',
                       kernel_regularizer=regularizers.l1(0.001)),
    keras.layers.Dense(1, activation='sigmoid')
])
  • Here, 0.001 is the regularisation factor, the coefficient that scales the penalty added to the loss.
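To make the penalty concrete, here is a small sketch of what each regulariser contributes to the loss (w stands in for one layer's weight tensor; the values are made up for illustration):

import tensorflow as tf

w = tf.constant([0.5, -1.2, 0.0, 2.0])  # illustrative weights

l1_penalty = 0.001 * tf.reduce_sum(tf.abs(w))     # L1 adds 0.001 * sum(|w|)
l2_penalty = 0.001 * tf.reduce_sum(tf.square(w))  # L2 adds 0.001 * sum(w^2)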


Training time:

  • -> To tackle underfitting, we need to train the model for longer so that it can learn the more delicate features of the training data.
  • -> To tackle overfitting, we need to decrease the training time. This can be done with the help of a technique called Early Stopping, implemented in TensorFlow as:
from tensorflow import keras

# Stop training once the monitored metric stops improving.
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_accuracy',
    patience=100
)

Here, training keeps watching the validation accuracy; if it does not improve for 100 consecutive epochs (the patience), training stops early.

This prevents overfitting caused by overtraining.
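As a usage sketch (model, train_data, train_labels, val_data and val_labels are placeholders for your own objects), the callback is passed to fit():

history = model.fit(
    train_data, train_labels,
    epochs=1000,                             # an upper bound; early stopping may end sooner
    validation_data=(val_data, val_labels),  # required so 'val_accuracy' exists
    callbacks=[early_stop],
)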


And that’s how you tackle overfitting and underfitting in TensorFlow.

I hope you loved the read and learned something new today. Thanks for reading!
