Checkpoint in Keras in machine learning
In this tutorial, we will learn how to create a checkpoint in Keras. A checkpoint saves the state of a model during training, so that we can return to it if something goes wrong later. This makes it safer to experiment with our code, because we can always go back to a checkpoint we have saved.
Creating Checkpoint in Keras
A checkpoint lets us decide which weights to save, under which circumstances to save them, and how to name the saved file. In Keras this is done with the ModelCheckpoint callback, which is passed to the fit() function so that it is called during training. In this session, we will create a deep neural network and then create some checkpoints for it.
Firstly make sure to download the dataset that we will use from this link. Keep in mind that the code below uses roughly 2/3rd of the data for training and holds out the remaining 1/3rd for validation (validation_split=0.33).
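To make the split concrete, here is a minimal sketch of how a fractional validation_split turns into row counts. It assumes the CSV file has 768 rows (the usual size of this dataset, not stated above) and that the held-out rows are taken from the end of the arrays before any shuffling, as the Keras documentation describes for validation_split. The helper name split_counts is made up for illustration.

```python
def split_counts(n_rows, validation_split):
    """Return (train_rows, validation_rows) for a fractional hold-out.

    The first int(n_rows * (1 - validation_split)) rows are used for
    training; the remaining rows at the end are used for validation.
    """
    n_train = int(n_rows * (1.0 - validation_split))
    return n_train, n_rows - n_train


# Assumption: the dataset file contains 768 rows.
train_rows, val_rows = split_counts(768, 0.33)
print(train_rows, val_rows)  # 514 254
```

Under that assumption, 254 rows are used for validation, which is consistent with the reported values in the training output (for example 132/254 = 0.51969).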
Let’s now get to the coding part:
There are two parts to it: the first is creating a checkpoint, and the second is fetching it.
Creating a checkpoint:
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint
import numpy

# Fix the random seed for reproducibility
numpy.random.seed(10)

# Load the dataset: the first 8 columns are the inputs, the 9th is the label
dataset = numpy.loadtxt("/home/sumit/pima-indians-diabetes.data.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]

# Build and compile the network
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Save a new file every time the validation accuracy improves
filepath="weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

# Train the model, holding out 33% of the data for validation
model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, callbacks=callbacks_list, verbose=0)
In the above code, we train the model for 150 epochs. After each epoch, ModelCheckpoint looks at val_accuracy and, whenever it is better than the best value seen so far, saves the model to a new .hdf5 file. Since the file name has no directory part, the files are written to the current working directory.
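The placeholders in the file name are ordinary Python format fields, which Keras fills with the epoch number and the logged metric values. The sketch below reproduces that step with str.format alone, so it runs without Keras; the helper checkpoint_name is hypothetical and only meant to show how a name is built.

```python
# Template copied from the training script above.
filepath = "weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"


def checkpoint_name(template, epoch, logs):
    """Fill the file name template with a 1-based epoch and metric values.

    `logs` is a dict of metric values for that epoch
    (hypothetical helper, for illustration only).
    """
    return template.format(epoch=epoch, **logs)


name = checkpoint_name(filepath, 7, {"val_accuracy": 0.65748})
print(name)  # weights-improvement-07-0.66.hdf5
```

Note that {epoch:02d} pads to at least two digits, so epoch 120 still appears in full in the file name.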
Output:
Using TensorFlow backend.
Epoch 00001: val_accuracy improved from -inf to 0.51969, saving model to weights-improvement-01-0.52.hdf5
Epoch 00002: val_accuracy did not improve from 0.51969
Epoch 00003: val_accuracy did not improve from 0.51969
Epoch 00004: val_accuracy did not improve from 0.51969
Epoch 00005: val_accuracy did not improve from 0.51969
Epoch 00006: val_accuracy did not improve from 0.51969
Epoch 00007: val_accuracy improved from 0.51969 to 0.65748, saving model to weights-improvement-07-0.66.hdf5
Epoch 00008: val_accuracy did not improve from 0.65748
Epoch 00009: val_accuracy improved from 0.65748 to 0.66535, saving model to weights-improvement-09-0.67.hdf5
Epoch 00010: val_accuracy did not improve from 0.66535
Epoch 00011: val_accuracy did not improve from 0.66535
Epoch 00012: val_accuracy improved from 0.66535 to 0.68110, saving model to weights-improvement-12-0.68.hdf5
Epoch 00013: val_accuracy did not improve from 0.68110
Epoch 00014: val_accuracy did not improve from 0.68110
Epoch 00015: val_accuracy did not improve from 0.68110
Epoch 00016: val_accuracy did not improve from 0.68110
Epoch 00017: val_accuracy did not improve from 0.68110
Epoch 00018: val_accuracy did not improve from 0.68110
Epoch 00019: val_accuracy did not improve from 0.68110
Epoch 00020: val_accuracy did not improve from 0.68110
Epoch 00021: val_accuracy did not improve from 0.68110
Epoch 00022: val_accuracy did not improve from 0.68110
Epoch 00023: val_accuracy did not improve from 0.68110
Epoch 00024: val_accuracy did not improve from 0.68110
Epoch 00025: val_accuracy did not improve from 0.68110
Epoch 00026: val_accuracy improved from 0.68110 to 0.68898, saving model to weights-improvement-26-0.69.hdf5
Epoch 00027: val_accuracy did not improve from 0.68898
Epoch 00028: val_accuracy did not improve from 0.68898
Epoch 00029: val_accuracy did not improve from 0.68898
Epoch 00030: val_accuracy did not improve from 0.68898
Epoch 00031: val_accuracy did not improve from 0.68898
Epoch 00032: val_accuracy did not improve from 0.68898
Epoch 00033: val_accuracy did not improve from 0.68898
Epoch 00034: val_accuracy did not improve from 0.68898
Epoch 00035: val_accuracy did not improve from 0.68898
Epoch 00036: val_accuracy did not improve from 0.68898
Epoch 00037: val_accuracy did not improve from 0.68898
Epoch 00038: val_accuracy did not improve from 0.68898
Epoch 00039: val_accuracy did not improve from 0.68898
Epoch 00040: val_accuracy did not improve from 0.68898
Epoch 00041: val_accuracy did not improve from 0.68898
Epoch 00042: val_accuracy did not improve from 0.68898
Epoch 00043: val_accuracy did not improve from 0.68898
Epoch 00044: val_accuracy did not improve from 0.68898
Epoch 00045: val_accuracy did not improve from 0.68898
Epoch 00046: val_accuracy did not improve from 0.68898
Epoch 00047: val_accuracy improved from 0.68898 to 0.69291, saving model to weights-improvement-47-0.69.hdf5
Epoch 00048: val_accuracy did not improve from 0.69291
Epoch 00049: val_accuracy improved from 0.69291 to 0.69685, saving model to weights-improvement-49-0.70.hdf5
Epoch 00050: val_accuracy did not improve from 0.69685
Epoch 00051: val_accuracy did not improve from 0.69685
Epoch 00052: val_accuracy did not improve from 0.69685
Epoch 00053: val_accuracy did not improve from 0.69685
Epoch 00054: val_accuracy did not improve from 0.69685
Epoch 00055: val_accuracy did not improve from 0.69685
Epoch 00056: val_accuracy did not improve from 0.69685
Epoch 00057: val_accuracy did not improve from 0.69685
Epoch 00058: val_accuracy did not improve from 0.69685
Epoch 00059: val_accuracy did not improve from 0.69685
Epoch 00060: val_accuracy did not improve from 0.69685
Epoch 00061: val_accuracy improved from 0.69685 to 0.71260, saving model to weights-improvement-61-0.71.hdf5
Epoch 00062: val_accuracy did not improve from 0.71260
Epoch 00063: val_accuracy did not improve from 0.71260
Epoch 00064: val_accuracy did not improve from 0.71260
Epoch 00065: val_accuracy did not improve from 0.71260
Epoch 00066: val_accuracy did not improve from 0.71260
Epoch 00067: val_accuracy did not improve from 0.71260
Epoch 00068: val_accuracy did not improve from 0.71260
Epoch 00069: val_accuracy did not improve from 0.71260
Epoch 00070: val_accuracy did not improve from 0.71260
Epoch 00071: val_accuracy did not improve from 0.71260
Epoch 00072: val_accuracy did not improve from 0.71260
Epoch 00073: val_accuracy did not improve from 0.71260
Epoch 00074: val_accuracy did not improve from 0.71260
Epoch 00075: val_accuracy did not improve from 0.71260
Epoch 00076: val_accuracy did not improve from 0.71260
Epoch 00077: val_accuracy did not improve from 0.71260
Epoch 00078: val_accuracy did not improve from 0.71260
Epoch 00079: val_accuracy did not improve from 0.71260
Epoch 00080: val_accuracy improved from 0.71260 to 0.71654, saving model to weights-improvement-80-0.72.hdf5
Epoch 00081: val_accuracy improved from 0.71654 to 0.72047, saving model to weights-improvement-81-0.72.hdf5
Epoch 00082: val_accuracy did not improve from 0.72047
Epoch 00083: val_accuracy did not improve from 0.72047
Epoch 00084: val_accuracy did not improve from 0.72047
Epoch 00085: val_accuracy did not improve from 0.72047
Epoch 00086: val_accuracy did not improve from 0.72047
Epoch 00087: val_accuracy did not improve from 0.72047
Epoch 00088: val_accuracy did not improve from 0.72047
Epoch 00089: val_accuracy did not improve from 0.72047
Epoch 00090: val_accuracy did not improve from 0.72047
Epoch 00091: val_accuracy did not improve from 0.72047
Epoch 00092: val_accuracy did not improve from 0.72047
Epoch 00093: val_accuracy did not improve from 0.72047
Epoch 00094: val_accuracy did not improve from 0.72047
Epoch 00095: val_accuracy did not improve from 0.72047
Epoch 00096: val_accuracy did not improve from 0.72047
Epoch 00097: val_accuracy did not improve from 0.72047
Epoch 00098: val_accuracy did not improve from 0.72047
Epoch 00099: val_accuracy did not improve from 0.72047
Epoch 00100: val_accuracy did not improve from 0.72047
Epoch 00101: val_accuracy did not improve from 0.72047
Epoch 00102: val_accuracy did not improve from 0.72047
Epoch 00103: val_accuracy did not improve from 0.72047
Epoch 00104: val_accuracy did not improve from 0.72047
Epoch 00105: val_accuracy did not improve from 0.72047
Epoch 00106: val_accuracy did not improve from 0.72047
Epoch 00107: val_accuracy did not improve from 0.72047
Epoch 00108: val_accuracy did not improve from 0.72047
Epoch 00109: val_accuracy did not improve from 0.72047
Epoch 00110: val_accuracy did not improve from 0.72047
Epoch 00111: val_accuracy did not improve from 0.72047
Epoch 00112: val_accuracy did not improve from 0.72047
Epoch 00113: val_accuracy did not improve from 0.72047
Epoch 00114: val_accuracy did not improve from 0.72047
Epoch 00115: val_accuracy did not improve from 0.72047
Epoch 00116: val_accuracy did not improve from 0.72047
Epoch 00117: val_accuracy did not improve from 0.72047
Epoch 00118: val_accuracy did not improve from 0.72047
Epoch 00119: val_accuracy did not improve from 0.72047
Epoch 00120: val_accuracy improved from 0.72047 to 0.73228, saving model to weights-improvement-120-0.73.hdf5
Epoch 00121: val_accuracy did not improve from 0.73228
Epoch 00122: val_accuracy did not improve from 0.73228
Epoch 00123: val_accuracy did not improve from 0.73228
Epoch 00124: val_accuracy did not improve from 0.73228
Epoch 00125: val_accuracy did not improve from 0.73228
Epoch 00126: val_accuracy did not improve from 0.73228
Epoch 00127: val_accuracy did not improve from 0.73228
Epoch 00128: val_accuracy did not improve from 0.73228
Epoch 00129: val_accuracy did not improve from 0.73228
Epoch 00130: val_accuracy did not improve from 0.73228
Epoch 00131: val_accuracy did not improve from 0.73228
Epoch 00132: val_accuracy did not improve from 0.73228
Epoch 00133: val_accuracy did not improve from 0.73228
Epoch 00134: val_accuracy did not improve from 0.73228
Epoch 00135: val_accuracy did not improve from 0.73228
Epoch 00136: val_accuracy did not improve from 0.73228
Epoch 00137: val_accuracy did not improve from 0.73228
Epoch 00138: val_accuracy did not improve from 0.73228
Epoch 00139: val_accuracy did not improve from 0.73228
Epoch 00140: val_accuracy did not improve from 0.73228
Epoch 00141: val_accuracy did not improve from 0.73228
Epoch 00142: val_accuracy did not improve from 0.73228
Epoch 00143: val_accuracy did not improve from 0.73228
Epoch 00144: val_accuracy did not improve from 0.73228
Epoch 00145: val_accuracy did not improve from 0.73228
Epoch 00146: val_accuracy did not improve from 0.73228
Epoch 00147: val_accuracy did not improve from 0.73228
Epoch 00148: val_accuracy did not improve from 0.73228
Epoch 00149: val_accuracy did not improve from 0.73228
Epoch 00150: val_accuracy did not improve from 0.73228
This would have created several weights-improvement-*.hdf5 files in the current working directory, one for every epoch in which the validation accuracy improved. Some of these checkpoint files may seem unnecessary, but it is a good start.
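With many files on disk, it helps to find the most recent one automatically. The following is a minimal sketch, assuming the file names follow the weights-improvement-{epoch}-{val_accuracy}.hdf5 pattern used above; the helper latest_checkpoint is hypothetical. Because a file is only written when the validation accuracy improves, the file with the highest epoch number is also the best one.

```python
import re

# Matches names such as weights-improvement-07-0.66.hdf5
NAME_RE = re.compile(r"weights-improvement-(\d+)-(\d+\.\d+)\.hdf5$")


def latest_checkpoint(filenames):
    """Return the file name with the highest epoch number, or None."""
    best_name, best_epoch = None, -1
    for name in filenames:
        match = NAME_RE.search(name)
        if match and int(match.group(1)) > best_epoch:
            best_name, best_epoch = name, int(match.group(1))
    return best_name


# A few names taken from the output above; in practice the list could
# come from os.listdir() on the directory holding the checkpoints.
files = [
    "weights-improvement-07-0.66.hdf5",
    "weights-improvement-120-0.73.hdf5",
    "weights-improvement-81-0.72.hdf5",
]
print(latest_checkpoint(files))  # weights-improvement-120-0.73.hdf5
```

Sorting the names as plain strings would place epoch 120 before epoch 81, which is why the epoch number is parsed as an integer.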
The next thing we can do is keep only the best model in a single file. This needs just a slight change to the same code: we use a fixed file name this time. A checkpoint is still written only when the validation accuracy improves, but every improvement now overwrites the previous file.
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint
import numpy

# Load the dataset: the first 8 columns are the inputs, the 9th is the label
dataset = numpy.loadtxt("/home/sumit/pima-indians-diabetes.data.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]

# Build and compile the network
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# A fixed file name: each improvement overwrites the previous checkpoint
filepath="weights.best.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

# Train the model, holding out 33% of the data for validation
model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, callbacks=callbacks_list, verbose=0)
This code on execution will create a file named weights.best.hdf5 in the current working directory. Now we have a single checkpoint file that always holds the best model seen during training.
Output:
Using TensorFlow backend.
Epoch 00001: val_accuracy improved from -inf to 0.48425, saving model to weights.best.hdf5
Epoch 00002: val_accuracy improved from 0.48425 to 0.58661, saving model to weights.best.hdf5
Epoch 00003: val_accuracy did not improve from 0.58661
Epoch 00004: val_accuracy improved from 0.58661 to 0.61024, saving model to weights.best.hdf5
Epoch 00005: val_accuracy did not improve from 0.61024
Epoch 00006: val_accuracy improved from 0.61024 to 0.67717, saving model to weights.best.hdf5
Epoch 00007: val_accuracy did not improve from 0.67717
Epoch 00008: val_accuracy did not improve from 0.67717
Epoch 00009: val_accuracy improved from 0.67717 to 0.70079, saving model to weights.best.hdf5
Epoch 00010: val_accuracy did not improve from 0.70079
Epoch 00011: val_accuracy did not improve from 0.70079
Epoch 00012: val_accuracy did not improve from 0.70079
Epoch 00013: val_accuracy did not improve from 0.70079
Epoch 00014: val_accuracy did not improve from 0.70079
Epoch 00015: val_accuracy did not improve from 0.70079
Epoch 00016: val_accuracy did not improve from 0.70079
Epoch 00017: val_accuracy did not improve from 0.70079
Epoch 00018: val_accuracy did not improve from 0.70079
Epoch 00019: val_accuracy did not improve from 0.70079
Epoch 00020: val_accuracy did not improve from 0.70079
Epoch 00021: val_accuracy did not improve from 0.70079
Epoch 00022: val_accuracy did not improve from 0.70079
Epoch 00023: val_accuracy did not improve from 0.70079
Epoch 00024: val_accuracy did not improve from 0.70079
Epoch 00025: val_accuracy did not improve from 0.70079
Epoch 00026: val_accuracy did not improve from 0.70079
Epoch 00027: val_accuracy did not improve from 0.70079
Epoch 00028: val_accuracy did not improve from 0.70079
Epoch 00029: val_accuracy did not improve from 0.70079
Epoch 00030: val_accuracy improved from 0.70079 to 0.71654, saving model to weights.best.hdf5
Epoch 00031: val_accuracy did not improve from 0.71654
Epoch 00032: val_accuracy did not improve from 0.71654
Epoch 00033: val_accuracy did not improve from 0.71654
Epoch 00034: val_accuracy did not improve from 0.71654
Epoch 00035: val_accuracy did not improve from 0.71654
Epoch 00036: val_accuracy did not improve from 0.71654
Epoch 00037: val_accuracy did not improve from 0.71654
Epoch 00038: val_accuracy did not improve from 0.71654
Epoch 00039: val_accuracy did not improve from 0.71654
Epoch 00040: val_accuracy did not improve from 0.71654
Epoch 00041: val_accuracy did not improve from 0.71654
Epoch 00042: val_accuracy did not improve from 0.71654
Epoch 00043: val_accuracy did not improve from 0.71654
Epoch 00044: val_accuracy did not improve from 0.71654
Epoch 00045: val_accuracy did not improve from 0.71654
Epoch 00046: val_accuracy did not improve from 0.71654
Epoch 00047: val_accuracy did not improve from 0.71654
Epoch 00048: val_accuracy did not improve from 0.71654
Epoch 00049: val_accuracy did not improve from 0.71654
Epoch 00050: val_accuracy did not improve from 0.71654
Epoch 00051: val_accuracy did not improve from 0.71654
Epoch 00052: val_accuracy did not improve from 0.71654
Epoch 00053: val_accuracy did not improve from 0.71654
Epoch 00054: val_accuracy did not improve from 0.71654
Epoch 00055: val_accuracy improved from 0.71654 to 0.72441, saving model to weights.best.hdf5
Epoch 00056: val_accuracy did not improve from 0.72441
Epoch 00057: val_accuracy did not improve from 0.72441
Epoch 00058: val_accuracy did not improve from 0.72441
Epoch 00059: val_accuracy did not improve from 0.72441
Epoch 00060: val_accuracy did not improve from 0.72441
Epoch 00061: val_accuracy did not improve from 0.72441
Epoch 00062: val_accuracy did not improve from 0.72441
Epoch 00063: val_accuracy did not improve from 0.72441
Epoch 00064: val_accuracy did not improve from 0.72441
Epoch 00065: val_accuracy did not improve from 0.72441
Epoch 00066: val_accuracy did not improve from 0.72441
Epoch 00067: val_accuracy did not improve from 0.72441
Epoch 00068: val_accuracy did not improve from 0.72441
Epoch 00069: val_accuracy did not improve from 0.72441
Epoch 00070: val_accuracy did not improve from 0.72441
Epoch 00071: val_accuracy did not improve from 0.72441
Epoch 00072: val_accuracy did not improve from 0.72441
Epoch 00073: val_accuracy did not improve from 0.72441
Epoch 00074: val_accuracy did not improve from 0.72441
Epoch 00075: val_accuracy did not improve from 0.72441
Epoch 00076: val_accuracy did not improve from 0.72441
Epoch 00077: val_accuracy did not improve from 0.72441
Epoch 00078: val_accuracy did not improve from 0.72441
Epoch 00079: val_accuracy did not improve from 0.72441
Epoch 00080: val_accuracy did not improve from 0.72441
Epoch 00081: val_accuracy did not improve from 0.72441
Epoch 00082: val_accuracy did not improve from 0.72441
Epoch 00083: val_accuracy did not improve from 0.72441
Epoch 00084: val_accuracy did not improve from 0.72441
Epoch 00085: val_accuracy improved from 0.72441 to 0.72835, saving model to weights.best.hdf5
Epoch 00086: val_accuracy did not improve from 0.72835
Epoch 00087: val_accuracy did not improve from 0.72835
Epoch 00088: val_accuracy did not improve from 0.72835
Epoch 00089: val_accuracy improved from 0.72835 to 0.73228, saving model to weights.best.hdf5
Epoch 00090: val_accuracy did not improve from 0.73228
Epoch 00091: val_accuracy did not improve from 0.73228
Epoch 00092: val_accuracy did not improve from 0.73228
Epoch 00093: val_accuracy did not improve from 0.73228
Epoch 00094: val_accuracy improved from 0.73228 to 0.73622, saving model to weights.best.hdf5
Epoch 00095: val_accuracy did not improve from 0.73622
Epoch 00096: val_accuracy did not improve from 0.73622
Epoch 00097: val_accuracy did not improve from 0.73622
Epoch 00098: val_accuracy did not improve from 0.73622
Epoch 00099: val_accuracy did not improve from 0.73622
Epoch 00100: val_accuracy did not improve from 0.73622
Epoch 00101: val_accuracy did not improve from 0.73622
Epoch 00102: val_accuracy did not improve from 0.73622
Epoch 00103: val_accuracy did not improve from 0.73622
Epoch 00104: val_accuracy did not improve from 0.73622
Epoch 00105: val_accuracy improved from 0.73622 to 0.75197, saving model to weights.best.hdf5
Epoch 00106: val_accuracy did not improve from 0.75197
Epoch 00107: val_accuracy did not improve from 0.75197
Epoch 00108: val_accuracy did not improve from 0.75197
Epoch 00109: val_accuracy did not improve from 0.75197
Epoch 00110: val_accuracy did not improve from 0.75197
Epoch 00111: val_accuracy did not improve from 0.75197
Epoch 00112: val_accuracy did not improve from 0.75197
Epoch 00113: val_accuracy did not improve from 0.75197
Epoch 00114: val_accuracy did not improve from 0.75197
Epoch 00115: val_accuracy did not improve from 0.75197
Epoch 00116: val_accuracy did not improve from 0.75197
Epoch 00117: val_accuracy did not improve from 0.75197
Epoch 00118: val_accuracy did not improve from 0.75197
Epoch 00119: val_accuracy did not improve from 0.75197
Epoch 00120: val_accuracy did not improve from 0.75197
Epoch 00121: val_accuracy did not improve from 0.75197
Epoch 00122: val_accuracy did not improve from 0.75197
Epoch 00123: val_accuracy did not improve from 0.75197
Epoch 00124: val_accuracy did not improve from 0.75197
Epoch 00125: val_accuracy did not improve from 0.75197
Epoch 00126: val_accuracy did not improve from 0.75197
Epoch 00127: val_accuracy did not improve from 0.75197
Epoch 00128: val_accuracy did not improve from 0.75197
Epoch 00129: val_accuracy did not improve from 0.75197
Epoch 00130: val_accuracy did not improve from 0.75197
Epoch 00131: val_accuracy did not improve from 0.75197
Epoch 00132: val_accuracy did not improve from 0.75197
Epoch 00133: val_accuracy improved from 0.75197 to 0.75591, saving model to weights.best.hdf5
Epoch 00134: val_accuracy did not improve from 0.75591
Epoch 00135: val_accuracy did not improve from 0.75591
Epoch 00136: val_accuracy did not improve from 0.75591
Epoch 00137: val_accuracy did not improve from 0.75591
Epoch 00138: val_accuracy did not improve from 0.75591
Epoch 00139: val_accuracy did not improve from 0.75591
Epoch 00140: val_accuracy did not improve from 0.75591
Epoch 00141: val_accuracy did not improve from 0.75591
Epoch 00142: val_accuracy did not improve from 0.75591
Epoch 00143: val_accuracy did not improve from 0.75591
Epoch 00144: val_accuracy did not improve from 0.75591
Epoch 00145: val_accuracy did not improve from 0.75591
Epoch 00146: val_accuracy did not improve from 0.75591
Epoch 00147: val_accuracy did not improve from 0.75591
Epoch 00148: val_accuracy did not improve from 0.75591
Epoch 00149: val_accuracy did not improve from 0.75591
Epoch 00150: val_accuracy did not improve from 0.75591
One can use either of the two above-mentioned ways of creating a checkpoint file. Both methods have their perks: the first creates many checkpoint files, which may be difficult to manage but gives more options to return to, while the second keeps just a single file that always holds the best model so far. In both cases a file is written only when an improvement is observed.
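The difference between the two approaches can be mimicked in plain Python. This is only a sketch of the save-on-improvement rule (monitoring a value where higher is better), not the Keras implementation; the function simulate_checkpoints and the accuracy values are made up for illustration.

```python
def simulate_checkpoints(val_accuracies, template):
    """Return the file names written when saving only on improvement.

    A file is written only when the monitored value beats the best
    value seen so far (higher is better).
    """
    best = float("-inf")
    written = []
    for epoch, value in enumerate(val_accuracies, start=1):
        if value > best:
            best = value
            written.append(template.format(epoch=epoch, val_accuracy=value))
    return written


# Made-up validation accuracies for five epochs.
history = [0.52, 0.50, 0.66, 0.66, 0.70]

many = simulate_checkpoints(
    history, "weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5")
single = simulate_checkpoints(history, "weights.best.hdf5")

print(many)         # three distinct file names
print(set(single))  # {'weights.best.hdf5'}
```

Both runs save three times, at epochs 1, 3 and 5. With the templated name this leaves three files on disk, while the fixed name leaves one file that was overwritten twice.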
Fetching/Loading the created checkpoints:
Now we shall learn to access the created checkpoints so that we can use them whenever required. To do this, you must know the network structure, because the model has to be rebuilt with the same layers before the saved weights can be loaded into it. For this particular example, we will load the previously created weights.best.hdf5 file from the directory it was stored in.
import numpy
from keras.layers import Dense
from keras.models import Sequential

# Rebuild the same network structure that was used for training
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Load the saved weights, then compile the model
model.load_weights("weights.best.hdf5")
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print("Created model and loaded weights from file")

# Load the dataset: the first 8 columns are the inputs, the 9th is the label
dataset = numpy.loadtxt("/home/sumit/pima-indians-diabetes.data.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]

# Evaluate the restored model and print its accuracy
scores = model.evaluate(X, Y, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
Output:
Using TensorFlow backend.
Created model and loaded weights from file
accuracy: 76.04%
So clearly we have successfully loaded the weights from the file and evaluated the model with them. The checkpoint let us go straight to the evaluation, since the training had already been completed and stored in the file by the previous code. Note that the model is evaluated on the whole dataset here, including the rows it was trained on.
I hope you now know how to create checkpoints in your code and how to load them as and when required. I hope you will use this method in your upcoming machine learning models.
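For a sigmoid output like the one in this network, the accuracy figure is the fraction of rows whose thresholded prediction matches the label. The sketch below shows that calculation in plain Python, assuming the usual 0.5 threshold; the function name, labels and probabilities are made up for illustration and do not come from the dataset.

```python
def binary_accuracy(labels, probabilities, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = 0
    for label, prob in zip(labels, probabilities):
        predicted = 1 if prob > threshold else 0
        if predicted == int(label):
            correct += 1
    return correct / len(labels)


# Made-up labels and sigmoid outputs, for illustration only.
labels = [1, 0, 1, 1, 0]
probs = [0.9, 0.2, 0.4, 0.7, 0.6]

acc = binary_accuracy(labels, probs)
print("%s: %.2f%%" % ("accuracy", acc * 100))  # accuracy: 60.00%
```

Three of the five made-up predictions match their labels, so the sketch prints 60.00%, formatted with the same format string as the script above.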
This was a basic tutorial on checkpoints in Keras, hope you enjoyed it. Have a good day and happy learning.