AdaBoost Algorithm for Machine Learning in Python
In this tutorial, we will learn the AdaBoost algorithm for machine learning in Python. AdaBoost is one of the most important ensemble techniques in machine learning. We will go through it step by step and then run a short program in Python.
Ensemble Methods in Machine Learning
An ensemble technique combines several machine learning models into a single, stronger model. It often gives better accuracy than any of the individual classifiers or regressors on its own. Common ensemble techniques include-
- Bagging (Bootstrap aggregation)
- Boosting
- Stacking
Bagging methods are used to reduce variance, Boosting methods are used to reduce bias, and Stacking methods are used to improve predictions by combining the outputs of different models.
Boosting methods come in several variants, including-
- AdaBoost (Adaptive Boosting)
- Gradient Boosting
AdaBoost builds its ensemble from high bias, low variance models. We will introduce this method by explaining three points -
- Weak Learner: Each base model is deliberately kept simple. Its tree is not grown to full depth; it is usually a decision stump, a tree with a single split. Each new learner concentrates on the samples that the previous learners predicted incorrectly.
- Weight: Every model gets a weight, and the final prediction is the weighted majority vote of all models. The training samples carry weights too, and these are adjusted after every round. For correctly classified samples, we decrease the weight. For incorrectly classified samples, we increase the weight.
- Dependency: These models are not independent; they are trained one after another. Each model passes its results to the next model, which uses them to focus on the remaining mistakes. This steadily reduces the training error and helps to predict a better output.
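The weighting scheme described in the three points above can be written compactly. The formulas below are the standard discrete AdaBoost update for binary labels in {-1, +1}. The notation is introduced here for illustration and is not taken from the tutorial: w_i is the weight of sample i, h_t is the weak learner chosen in round t, \varepsilon_t is its weighted error, \alpha_t is its weight in the final vote, and H is the final classifier.

```latex
\begin{align*}
\varepsilon_t &= \sum_{i=1}^{n} w_i \,\mathbf{1}\!\left[h_t(x_i) \neq y_i\right] \\
\alpha_t &= \tfrac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t} \\
w_i &\leftarrow \frac{w_i \exp\!\left(-\alpha_t\, y_i\, h_t(x_i)\right)}
                     {\sum_{j=1}^{n} w_j \exp\!\left(-\alpha_t\, y_j\, h_t(x_j)\right)} \\
H(x) &= \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right)
\end{align*}
```

Because y_i h_t(x_i) is +1 for a correct prediction and -1 for an incorrect one, the update multiplies the weight of a correct sample by e^{-\alpha_t} and the weight of an incorrect sample by e^{\alpha_t} before normalising.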
AdaBoost algorithm steps:
- Assigning equal weights to all the training samples
- Creating all the candidate decision stumps
- Choosing the best decision stump (the one with the lowest weighted error)
- Calculating the weight of the chosen stump
- Adjusting the sample weights
- Normalising the weights
- Preparing data for the next stage
- Assigning new weights
- Repeating all the steps
- Working on a given query point using the weighted vote of all the stumps
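The steps above can be sketched from scratch in plain Python. This is a minimal illustration under stated assumptions, not the scikit-learn implementation: labels are binary and encoded as -1 and +1, the weak learner is a one-feature decision stump, and samples are reweighted rather than resampled between rounds. The function names (stump_predict, best_stump, adaboost_fit, adaboost_predict) and the toy data are hypothetical.

```python
import math


def stump_predict(stump, x):
    """Predict +1 or -1 for one sample x (a list of feature values)."""
    feature, threshold, polarity = stump
    return polarity if x[feature] <= threshold else -polarity


def best_stump(X, y, weights):
    """Try every feature/threshold/polarity; keep the lowest weighted error."""
    best, best_error = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump_predict(stump, row) != label)
                if error < best_error:
                    best, best_error = stump, error
    return best, best_error


def adaboost_fit(X, y, n_rounds=5):
    """Return a list of (alpha, stump) pairs, one per boosting round."""
    n = len(X)
    weights = [1.0 / n] * n  # start with equal sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump, error = best_stump(X, y, weights)
        # Clamp the error so the logarithm below stays finite.
        error = min(max(error, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - error) / error)  # weight of this stump
        ensemble.append((alpha, stump))
        # Shrink weights of correct samples, grow weights of incorrect ones.
        weights = [w * math.exp(-alpha * label * stump_predict(stump, row))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]  # normalise to sum to 1
    return ensemble


def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps for one query point."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy one-feature data that no single stump can separate.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]

model = adaboost_fit(X, y, n_rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
print("Predictions:", predictions)
```

On this toy data each individual stump misclassifies at least two points, while the weighted vote of the three stumps classifies all six training points correctly.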
# loading the dataset
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target

# splitting the data into training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# applying the AdaBoost classifier
from sklearn.ensemble import AdaBoostClassifier
classifier = AdaBoostClassifier(n_estimators=30, learning_rate=1)
adaboost = classifier.fit(X_train, y_train)
y_pred = adaboost.predict(X_test)

# calculating the accuracy
from sklearn.metrics import accuracy_score
print("Accuracy: ", accuracy_score(y_test, y_pred))
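Because the train/test split is random, the accuracy from a single split can change from run to run. One way to get a steadier estimate is cross-validation. The sketch below assumes scikit-learn is installed and uses its cross_val_score helper; the choice of 5 folds and the seed random_state=0 are arbitrary values picked for this example, not settings from the tutorial.

```python
from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
X, y = iris.data, iris.target

# random_state=0 is an arbitrary seed, used only to make runs repeatable.
classifier = AdaBoostClassifier(n_estimators=30, learning_rate=1.0,
                                random_state=0)

# 5-fold cross-validation: fit on four folds, score on the fifth, five times.
scores = cross_val_score(classifier, X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```

The mean of the fold accuracies is usually a more reliable summary than the score from one 75/25 split.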