Detecting handwritten digits from an image in Python

In this tutorial, you will build a neural network model that can detect handwritten digits in an image in Python, using Keras for the model, scikit-learn for splitting the data, and OpenCV for locating the digits in the image. A neural network consists of three types of layers: an Input layer that accepts the inputs, Hidden layers whose neurons learn during training, and an Output layer that produces the final output.
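As a minimal illustration of these three layer types, a fully connected Keras network for 784-pixel digit images could be sketched as below. This is only an illustrative sketch, separate from the convolutional model built later in this tutorial; the model name and layer sizes are chosen here just for the example.

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    #input layer is declared implicitly through input_shape (784 pixel values per image)
    #hidden layer of 128 neurons that learn during training
    model.add(Dense(units=128, activation='relu', input_shape=(784,)))
    #output layer with one unit per digit class (0-9)
    model.add(Dense(units=10, activation='softmax'))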

Detection of handwritten digits from an image in Python using Keras and scikit-learn

To get started, we first need to download the dataset for training. The dataset is the MNIST digit recognizer dataset, which can be downloaded from the Kaggle website.

The dataset consists of two CSV (comma-separated values) files, train.csv and test.csv. The model is trained on the train.csv file; the trained model can then be used to generate predictions for test.csv (a short sketch of this is given after the walkthrough).

Dealing with the training dataset

The train.csv file consists of 785 columns: one column holds the label of the digit, and the remaining 784 columns hold the pixel values of the 28x28 image.
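Before doing anything else, it is worth confirming this structure with a quick standalone check. The sketch below assumes train.csv from the Kaggle dataset has been downloaded to the working directory; the 'label' column name follows the Kaggle file layout.

    import pandas as pd

    dataset = pd.read_csv('train.csv')
    print(dataset.shape)                   #(42000, 785): one label column plus 784 pixel columns
    print(dataset['label'].value_counts()) #number of examples of each digit 0-9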

  1. First, we import the required libraries
    import pandas as pd
    import numpy as np
  2. Then we load the dataset
    dataset=pd.read_csv('train.csv')
  3. We view the first few rows of the dataset:
    dataset.head()
    

    The output shows the first five rows, with the label column followed by the 784 pixel-value columns.

  4. Dividing the dataset into two numpy arrays x and y such that x contains all pixel values and y contains the label column.
    x=dataset.iloc[:,1:785].values
    y=dataset.iloc[:,0].values
  5. Splitting the training dataset into two parts: one for fitting the model (x_train, y_train) and one for validating it (x_test, y_test).
    from sklearn.model_selection import train_test_split
    x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=0)

    train_test_split also assigns the matching labels to each part. Here, 80% of the training dataset is used for training and the remaining 20% (x_test, y_test) is held out for validation; a quick evaluation on this held-out set is sketched after the walkthrough.

  6. Reshaping x_train, since each image is of size 28x28 and x_train contains 33600 rows (80% of the 42000 images).
    x_train=x_train.reshape(33600,28,28,1)
  7. Converting y_train into categorical values
    from keras.utils import to_categorical
    y_train=to_categorical(y_train)
  8. Creating the convolutional neural network
    1. Importing all the required libraries and packages
      from keras.models import Sequential
      from keras.layers import Convolution2D
      from keras.layers import MaxPooling2D
      from keras.layers import AveragePooling2D
      from keras.layers import Flatten
      from keras.layers import Dense
      from keras.layers import Dropout
      from keras.models import load_model
    2. Initialising the classifier as a Sequential model
      classifier=Sequential()
    3. Defining the input and hidden layers
      classifier.add(Convolution2D(32,(3,3),input_shape=(28,28,1),activation='relu'))
      classifier.add(MaxPooling2D(pool_size=(2,2)))
      classifier.add(Flatten())
      classifier.add(Dense(units=128,activation='relu'))
      classifier.add(Dropout(0.2))
      
    4. Creating the output layer
      classifier.add(Dense(10, activation='softmax'))
      

      Here, we use the ‘softmax’ activation function because the output is categorical: the layer produces a probability for each of the ten digit classes (0–9).

    5. Compiling the model
      classifier.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
    6. Fitting the model on the training set (validation_split=0.2 reserves 20% of it for validation during training)
      classifier.fit(x_train,y_train,epochs=35,batch_size=500,validation_split=0.2)
    7. Saving the model
      classifier.save('digit_recognizer.h5')
  9. Testing the model on an image
    import cv2
    import imutils
    import numpy as np
    from keras.models import load_model
    #reading image
    img = cv2.imread('image-name') #replace 'image-name' with the path to your input image
    #resizing image
    img = imutils.resize(img,width=300)
    #showing original image
    cv2.imshow("Original",img)
    #converting image to grayscale
    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    #showing grayscale image
    cv2.imshow("Gray Image",gray)
    
    #creating a kernel
    kernel = np.ones((40,40),np.uint8)
    
    #applying blackhat thresholding
    blackhat = cv2.morphologyEx(gray,cv2.MORPH_BLACKHAT,kernel)
    
    
    #applying OTSU's thresholding
    ret,thresh = cv2.threshold(blackhat,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
    
    #performing erosion and dilation
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
    
    #finding contours in the image (imutils.grab_contours handles the differing return values of OpenCV 3 and 4)
    cnts = cv2.findContours(thresh.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    
    #loading our trained CNN model
    classifier = load_model('digit_recognizer.h5')
    for c in cnts:
        try:
            #creating a mask
            mask = np.zeros(gray.shape,dtype="uint8")
            
        
            (x,y,w,h) = cv2.boundingRect(c)
            
            hull = cv2.convexHull(c)
            cv2.drawContours(mask,[hull],-1,255,-1)    
            mask = cv2.bitwise_and(thresh,thresh,mask=mask)
    
            
            #Getting Region of interest
            roi = mask[y-7:y+h+7,x-7:x+w+7]       
            roi = cv2.resize(roi,(28,28))
            roi = np.array(roi)
            #reshaping roi to the (1,28,28,1) shape the trained CNN expects
            roi = roi.reshape(1,28,28,1)
    
            #predicting
            prediction = classifier.predict(roi)
            predict=prediction.argmax()
        
            cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),1)
            cv2.putText(img,str(int(predict)),(x,y),cv2.FONT_HERSHEY_SIMPLEX,0.8,(255,255,0),1)
            
        except Exception as e:
            print(e)
            
    img = imutils.resize(img,width=500)
    
    #showing the output and waiting for a key press before closing the windows
    cv2.imshow('Detection',img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

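Evaluating the saved model

As mentioned earlier, the 20% held-out split (x_test and y_test from step 5) and the Kaggle test.csv file have not been used yet. The sketch below shows one way they might be used with the saved model; it assumes the variables from the training script are still in scope and that test.csv contains only the 784 pixel columns, as in the Kaggle dataset. The names x_val, y_val and test_images are introduced here purely for illustration.

    import pandas as pd
    from keras.models import load_model
    from keras.utils import to_categorical

    classifier = load_model('digit_recognizer.h5')

    #evaluating on the 20% held out by train_test_split in step 5
    x_val = x_test.reshape(x_test.shape[0],28,28,1)
    y_val = to_categorical(y_test)
    loss,accuracy = classifier.evaluate(x_val,y_val)
    print('validation accuracy:',accuracy)

    #generating predictions for the unlabelled test.csv
    test_images = pd.read_csv('test.csv').values.reshape(-1,28,28,1)
    predictions = classifier.predict(test_images).argmax(axis=1)
    print(predictions[:10])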
Output

The original image is displayed with a green bounding box and the predicted digit drawn above each detected digit.

