Dummy classifiers using sklearn library in Python

Fellow coders, in this tutorial we will learn about dummy classifiers using the scikit-learn library in Python. Scikit-learn is a Python library that provides a range of supervised and unsupervised learning algorithms and is built on Python’s numerical and scientific libraries, NumPy and SciPy. Its functionality includes regression, classification, clustering, model selection, and preprocessing.

What are dummy classifiers in sklearn:

A DummyClassifier is a classifier in the sklearn library that makes predictions using simple rules and does not generate any valuable insights about the data. As the name suggests, dummy classifiers serve as a baseline against which real classifiers are compared, so we must not use them for actual problems. A real classifier is expected to perform better than the dummy classifier on the same dataset; if it does not, the real model has learned little from the features. The dummy classifier ignores the input features entirely. It looks only at the training labels (or at a constant we supply) and applies one of its strategies to predict the class label. The most_frequent strategy always predicts the most common training label, stratified draws labels at random in proportion to the class frequencies seen in training, uniform draws labels at random with equal probability for every class, and constant always predicts a label that we specify. We will implement all these strategies in our code below and check out the results.
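
To make the four rules concrete, here is a minimal pure-Python sketch of what each strategy does with the training labels. The helper dummy_predict and its seed parameter are hypothetical names introduced only for illustration; this is not how scikit-learn implements DummyClassifier, just the same idea written out by hand.

```python
import random
from collections import Counter


def dummy_predict(train_labels, n_samples, strategy, constant=None, seed=0):
    """Return n_samples predicted labels using only the training labels."""
    rng = random.Random(seed)
    counts = Counter(train_labels)
    classes = sorted(counts)
    if strategy == "most_frequent":
        # Always predict the label seen most often in training.
        label = counts.most_common(1)[0][0]
        return [label] * n_samples
    if strategy == "constant":
        # Always predict the label supplied by the caller.
        return [constant] * n_samples
    if strategy == "uniform":
        # Every class is equally likely, whatever its training frequency.
        return [rng.choice(classes) for _ in range(n_samples)]
    if strategy == "stratified":
        # Classes are drawn in proportion to their training frequency.
        weights = [counts[c] for c in classes]
        return rng.choices(classes, weights=weights, k=n_samples)
    raise ValueError(f"unknown strategy: {strategy!r}")


labels = [0, 1, 1, 1]
print(dummy_predict(labels, 4, "most_frequent"))         # [1, 1, 1, 1]
print(dummy_predict(labels, 4, "constant", constant=1))  # [1, 1, 1, 1]
print(dummy_predict(labels, 4, "uniform"))               # random mix of 0 and 1
print(dummy_predict(labels, 4, "stratified"))            # random, mostly 1
```

Notice that the function never receives any features: the labels alone decide every prediction, which is exactly why such a classifier is only useful as a reference point.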

Working with the code:

Let us implement dummy classifiers using the sklearn library:

Create a new Python file and import all the required libraries:

from sklearn.dummy import DummyClassifier
import numpy as np

Now, let’s start writing our code for implementing dummy classifiers:

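# DummyClassifier ignores the features in a; only the labels in b matter.
a = np.array([-1, 1, 1, 1])
b = np.array([0, 1, 1, 1])

strat = ["most_frequent", "stratified", "constant", "uniform"]

for s in strat:
    if s == "constant":
        # The "constant" strategy needs the label it should always predict.
        dummy_clf = DummyClassifier(strategy=s, random_state=None, constant=1)
    else:
        dummy_clf = DummyClassifier(strategy=s, random_state=None)
    # fit() returns the fitted estimator, so printing it shows its parameters.
    print(dummy_clf.fit(a, b))
    print(s)
    print(repr(dummy_clf.predict(a)))
    # score() makes a fresh set of predictions and returns their accuracy.
    print(dummy_clf.score(a, b))
    print("--------------------------------xxxxxxx--------------------------------")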

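After running the code, here is the output from one run. The stratified and uniform strategies are random and random_state is None, so their predictions and scores change from run to run. Because score makes its own predictions, the score shown for these two strategies need not match the array printed just above it. The first line of each block lists the estimator’s parameters and may look different depending on the scikit-learn version: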

DummyClassifier(constant=None, random_state=None, strategy='most_frequent')
most_frequent
array([1, 1, 1, 1])
0.75

--------------------------------xxxxxxx--------------------------------

DummyClassifier(constant=None, random_state=None, strategy='stratified')
stratified
array([1, 1, 0, 1])
0.25

--------------------------------xxxxxxx--------------------------------

DummyClassifier(constant=1, random_state=None, strategy='constant')
constant
array([1, 1, 1, 1])
0.75

--------------------------------xxxxxxx--------------------------------

DummyClassifier(constant=None, random_state=None, strategy='uniform')
uniform
array([0, 0, 1, 0])
1.0

--------------------------------xxxxxxx--------------------------------
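
The run above uses only four samples, so here is a hedged sketch of how a dummy classifier might be used as a baseline on a larger problem. The synthetic dataset, the 80/20 class balance, the choice of LogisticRegression as the "real" model and the seed values are all assumptions made for illustration; none of them come from the tutorial's own code.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data: 500 samples, roughly 80% of them in class 0.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Baseline: always predict the majority class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
# A real model trained on the same split.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

baseline_acc = baseline.score(X_test, y_test)
model_acc = model.score(X_test, y_test)
print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"model accuracy:    {model_acc:.3f}")

# A fixed random_state makes the random strategies repeatable.
first = DummyClassifier(strategy="stratified", random_state=42).fit(X_train, y_train)
second = DummyClassifier(strategy="stratified", random_state=42).fit(X_train, y_train)
print((first.predict(X_test) == second.predict(X_test)).all())  # True
```

On imbalanced data like this, the baseline already reaches an accuracy close to the majority class share, so a real model's accuracy only means something when read next to that number. Passing an integer random_state is one way to get the same stratified or uniform predictions on every run.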
