Using Confusion Matrix in Machine Learning in Python
This article explains how to use a confusion matrix in Python to evaluate classification models such as Logistic Regression, Support Vector Machines, Decision Trees, Boosted Trees, and Random Forests.
Before going to the main topic, let’s see what a confusion matrix is and why it is needed.
Confusion Matrix in ML – Python
A confusion matrix is a table that lets us evaluate the performance of a classification algorithm by comparing its predictions with the true labels. It is also known as the error matrix. In the layout used in this article, rows represent the predicted class and columns represent the actual class. Note that scikit-learn's confusion_matrix() uses the opposite orientation: its rows are the actual classes and its columns are the predicted classes.
Assume we have trained a classification algorithm to decide, say, whether a person has a tumor or not. The confusion matrix summarizes the results we obtained while testing the algorithm. Suppose the test sample has 15 cases, of which 10 are positive (+ve) and 5 are negative (-ve). The resulting confusion matrix could look like the table below:
                  Actual +ve    Actual -ve
Predicted +ve         7             2
Predicted -ve         3             3
From this table we can read off four counts: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
True positives (TP) = 7
True negatives (TN) = 3
False positives (FP) = 2
False negatives (FN) = 3
These counts give the usual evaluation metrics:
Accuracy = (TP + TN) / (TP + FP + TN + FN)
Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
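As a quick worked example, the three formulas can be applied to the counts from the tumor table. This is a minimal sketch in plain Python; the variable names (tp, fp, fn, tn) are chosen here for illustration and are not part of any library.

```python
# Counts read off the example table (15 cases: 10 positive, 5 negative).
tp, fp, fn, tn = 7, 2, 3, 3

total = tp + fp + fn + tn
accuracy = (tp + tn) / total     # share of all cases classified correctly
recall = tp / (tp + fn)          # share of actual positives that were found
precision = tp / (tp + fp)       # share of predicted positives that were right

print(f"Accuracy:  {accuracy:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"Precision: {precision:.3f}")
```

For this table the values come out to roughly 0.667 for accuracy (10 of 15), 0.700 for recall (7 of 10), and 0.778 for precision (7 of 9).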
How to generate Confusion Matrix in Python using sklearn
To build a confusion matrix in Python we can use the scikit-learn library, which provides a confusion_matrix() function in its sklearn.metrics module. The same library is also widely used to implement ML algorithms. Sample Python 3 code is shown below:
    from sklearn.metrics import accuracy_score
    from sklearn.metrics import confusion_matrix

    # True labels and the labels predicted by a classifier
    actual_values = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
    predicted_values = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

    # Rows of the result are actual classes, columns are predicted classes
    CF = confusion_matrix(actual_values, predicted_values)
    print('Matrix:')
    print(CF)
    print('Accuracy:', accuracy_score(actual_values, predicted_values))
Here we defined two lists, actual_values and predicted_values, and passed them to the confusion_matrix() function, with the true labels as the first argument and the predictions as the second. We then printed the resulting matrix along with the accuracy computed by accuracy_score(), as shown in the output.
    Matrix:
    [[2 4]
     [3 1]]
    Accuracy: 0.3
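To see where the numbers in such a matrix come from, the counts for the two lists from the example can be reproduced by hand. The sketch below uses only the standard library and counts each (actual, predicted) pair. It assumes class 1 is the positive class and lays the matrix out the way scikit-learn does (rows are actual, columns are predicted). The names pair_counts, labels, and matrix are illustrative.

```python
from collections import Counter

actual_values = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
predicted_values = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

# Count how often each (actual, predicted) pair occurs.
pair_counts = Counter(zip(actual_values, predicted_values))

# Rows are actual classes and columns are predicted classes,
# the same orientation scikit-learn uses.
labels = [0, 1]
matrix = [[pair_counts[(a, p)] for p in labels] for a in labels]

# Assumption for this sketch: class 1 is treated as the positive class.
tn, fp = matrix[0]
fn, tp = matrix[1]
accuracy = (tp + tn) / len(actual_values)

print(matrix)
print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("Accuracy:", accuracy)
```

Counting the pairs gives [[2, 4], [3, 1]], that is TN = 2, FP = 4, FN = 3, TP = 1, and an accuracy of 3 out of 10. With scikit-learn, the same four counts can be unpacked for a binary problem with confusion_matrix(actual_values, predicted_values).ravel(), which returns them in the order TN, FP, FN, TP.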