AdaBoost Algorithm for Machine Learning in Python

In this tutorial, we will learn the AdaBoost algorithm for machine learning in Python. AdaBoost is one of the most important ensemble techniques in machine learning. We will go through it step by step and also execute the program in Python.

Ensemble Methods in Machine Learning

A technique that combines multiple models in machine learning to create a new, stronger model is known as an ensemble technique. It usually gives higher accuracy than a single classifier or regressor. There are four main ensemble techniques-

  • Bagging (Bootstrap aggregation)
  • Boosting
  • Stacking
  • Cascading

Bagging methods are used to reduce the variance, boosting methods are used to reduce the bias, and stacking methods are used to improve the predictions.
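As a minimal sketch of the bagging-versus-boosting contrast, the snippet below trains scikit-learn's BaggingClassifier and AdaBoostClassifier on the same data and compares cross-validated accuracy (the dataset and hyperparameter choices here are illustrative, not prescribed by this tutorial):

```python
# A minimal sketch: bagging vs boosting on the same dataset.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Bagging: many models fit on bootstrap samples, averaged to cut variance
bagging = BaggingClassifier(n_estimators=30, random_state=0)
# Boosting: models fit sequentially, each correcting the last, to cut bias
boosting = AdaBoostClassifier(n_estimators=30, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

Both ensembles use the same number of base models, so any difference in accuracy comes from how the base models are combined, not from model capacity.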

The boosting method has three popular variants-

  1. AdaBoost
  2. Gradient Boosting
  3. XGBoost

AdaBoost Technique:

AdaBoost combines many high-bias, low-variance weak learners into a single strong model, reducing the bias of the ensemble as a whole. We will introduce this method by explaining three points –

  • Weak Learner: A weak learner is a model that performs only slightly better than random guessing. In AdaBoost this is usually a decision stump – a decision tree restricted to depth one, so it is never grown to the maximum depth of a full tree. Each new weak learner is trained with extra focus on the examples the previous learners predicted incorrectly.
  • Weight: Every training example carries a weight, and the final prediction is a weighted majority vote of the models. After each round the weights are adjusted: for correctly classified data we decrease the weight, and for misclassified data we increase the weight, so the next model concentrates on the hard examples.
  • Dependency: The models are not independent; they are interconnected and built sequentially. Each model passes its results to the next one, which tries to correct the remaining mistakes. This chain steadily reduces the training error and leads to a better final prediction.
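The weight adjustment described above can be sketched numerically. This is a hypothetical toy example (five samples, a stump that misclassified two of them), not scikit-learn's internal code:

```python
import numpy as np

# Toy example: 5 samples, the stump got samples 0 and 3 wrong
weights = np.full(5, 1 / 5)                      # start with uniform weights
incorrect = np.array([True, False, False, True, False])

error = weights[incorrect].sum()                 # weighted error of the stump
alpha = 0.5 * np.log((1 - error) / error)        # the stump's say in the vote

# increase weights of misclassified samples, decrease the rest
weights = weights * np.exp(np.where(incorrect, alpha, -alpha))
weights /= weights.sum()                         # normalise so weights sum to 1

# misclassified samples now carry weight 0.25 each, correct ones about 0.167
print(np.round(weights, 3))
```

After the update, the misclassified samples together hold half of the total weight, which is exactly what forces the next weak learner to pay attention to them.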

AdaBoost algorithm steps:

  1. Assigning initial weights to the training samples
  2. Creating all the decision stumps
  3. Choosing the best decision stump
  4. Calculating the weight
  5. Adjusting the weights
  6. Normalising the weights
  7. Preparing data for the next stage
  8. Assigning new weights
  9. Repeating all the steps
  10. Working on a given query point
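The steps above can be sketched as a from-scratch loop. This is an illustrative implementation for binary labels in {-1, +1} using depth-one trees as decision stumps (function names and the toy data are my own, not part of this tutorial):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    """Illustrative AdaBoost for labels in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1 / n)                    # step 1: uniform sample weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        # steps 2-3: fit a depth-1 tree (decision stump) on the weighted data
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)  # step 4: weighted error
        alpha = 0.5 * np.log((1 - err) / err)                # the stump's voting weight
        # steps 5-6: up-weight mistakes, down-weight hits, then normalise
        w = w * np.exp(-alpha * y * pred)
        w = w / w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # step 10: weighted majority vote over all stumps
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(votes)

# Toy 1-D dataset that no single stump can separate
X = np.array([[0], [1], [2], [3], [4], [5]], dtype=float)
y = np.array([-1, -1, 1, 1, -1, 1])
stumps, alphas = adaboost_fit(X, y, n_rounds=10)
print((adaboost_predict(stumps, alphas, X) == y).mean())
```

A single stump can only draw one threshold, so it cannot fit this labelling; the weighted vote of several stumps can, which is the whole point of the sequential reweighting.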
# loading the dataset
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

# splitting into train and test sets
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# applying the AdaBoost classifier
from sklearn.ensemble import AdaBoostClassifier

classifier = AdaBoostClassifier(n_estimators=30, learning_rate=1)
adaboost = classifier.fit(X_train, y_train)
y_pred = adaboost.predict(X_test)

# calculating the accuracy
from sklearn.metrics import accuracy_score

print("Accuracy: ", accuracy_score(y_test, y_pred))

Output:

Accuracy: 0.9473684210526315
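To watch the ensemble improve stage by stage, scikit-learn's staged_predict can be used. This follow-on sketch re-creates the model so the snippet is self-contained, and adds a fixed random_state (my choice, not in the tutorial's code) so the split is reproducible:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = AdaBoostClassifier(n_estimators=30, learning_rate=1,
                         random_state=0).fit(X_train, y_train)

# staged_predict yields predictions after 1, 2, ..., n stages,
# so we can see the accuracy of each partial ensemble
accs = [accuracy_score(y_test, p) for p in clf.staged_predict(X_test)]
print(f"stages trained: {len(accs)}, final accuracy: {accs[-1]:.3f}")
```

Note that fewer than n_estimators stages may be reported, because AdaBoost stops early once it achieves a perfect fit on the training data.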
