# GridSearchCV in Scikit-learn

In this article, we see how to implement a grid search using **GridSearchCV** from the **scikit-learn** (`sklearn`) library in Python. The approach is a form of hyperparameter tuning.

Grid search is used to find the hyperparameter values that give the most ‘*accurate*‘ predictions.

## GridSearchCV

Grid search is the process of tuning hyperparameters to determine the optimal values for a given model: every combination of the candidate values is tried, and the best-scoring one is kept. Whenever we want to apply an ML model, we can use GridSearchCV to automate this process. It scores each combination with cross-validation and makes life a little easier for ML enthusiasts.
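To make "every combination" concrete, here is a minimal sketch using scikit-learn's `ParameterGrid`, which enumerates the candidates of a grid. The grid values below are hypothetical and chosen only for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical grid for illustration: two values of C and two kernels
param_grid = {"C": [1, 10], "kernel": ["linear", "rbf"]}

# ParameterGrid yields one dict per combination of the listed values
combinations = list(ParameterGrid(param_grid))
for params in combinations:
    print(params)

# 2 values of C x 2 kernels = 4 candidate settings
print(len(combinations))
```

With 10-fold cross-validation, each of these candidates would be fitted 10 times, so the cost of a grid search grows quickly with the size of the grid.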

### Model using GridSearchCV

Here’s a Python implementation of grid search on the Breast Cancer dataset.

Download the dataset required for our ML model.

1. Import the dataset and display the first 5 rows.

```python
import pandas as pd

# Load the dataset and preview its first five rows
df = pd.read_csv('../DataSets/BreastCancer.csv')
df.head()
```

Output:

The ‘**diagnosis**‘ column in the dataset takes one of two possible classes: benign (represented by 0) and malignant (represented by 1). The few attributes shown above will be used for our predictions.

2. Rename the class values as ‘0’ (benign) and ‘1’ (malignant).

```python
# Encode the categorical class labels as integers
from sklearn.preprocessing import LabelEncoder

labelencoder_Y = LabelEncoder()
# Encode the 'diagnosis' column in place so the later steps see integer labels
df['diagnosis'] = labelencoder_Y.fit_transform(df['diagnosis'])
df['diagnosis'].value_counts()
```

Output:

There are 357 benign and 212 malignant cases.
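As a small illustration of how `LabelEncoder` assigns the integers, the sketch below assumes the raw labels are the strings `'B'` and `'M'`. That is an assumption, since the raw column values are not shown here. The encoder sorts the distinct labels and numbers them in that order.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical raw labels: 'B' for benign, 'M' for malignant
labels = ["M", "B", "B", "M", "B"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# classes_ holds the sorted distinct labels; a label's position is its code
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)
print(encoded.tolist())
```

`encoder.inverse_transform` maps the integers back to the original labels when they are needed for reporting.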

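3. Let us now define our attributes and target variable, and save them to ‘X’ and ‘Y’.

```python
# Features: columns at positions 2 to 30; target: column at position 1 (the class label)
X = df.iloc[:, 2:31].values
Y = df.iloc[:, 1].values
```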

4. Perform the train-test split.

```python
from sklearn.model_selection import train_test_split

# Hold out 30% of the samples for testing; fix the seed for reproducibility
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=4
)
```

5. Let us now standardize the features of our dataset using *StandardScaler*.

```python
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training data only, then apply it to both sets
ss = StandardScaler()
X_train = ss.fit_transform(X_train)
X_test = ss.transform(X_test)
```
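The difference between `fit_transform` and `transform` in this step can be seen on a tiny example. This is only a sketch on hypothetical toy numbers, not on the Breast Cancer data: the scaler learns the mean and standard deviation from the training rows and reuses them for the test row.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: one feature, three training rows and one test row
toy_train = np.array([[1.0], [2.0], [3.0]])
toy_test = np.array([[4.0]])

scaler = StandardScaler()
toy_train_scaled = scaler.fit_transform(toy_train)  # learns mean and std from the training rows
toy_test_scaled = scaler.transform(toy_test)        # reuses the training mean and std

# The scaled training feature has mean 0 and standard deviation 1
print(toy_train_scaled.mean(), toy_train_scaled.std())
# The test row is expressed in training-set standard deviations
print(toy_test_scaled)
```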

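6. Apply GridSearchCV to find the best model. The estimator `classifier_df` is not defined in the earlier steps; the parameter grid (`C`, `kernel`) suggests a support vector classifier, so an `SVC` is assumed here.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Assumption: classifier_df is a support vector classifier
classifier_df = SVC()

# Candidate hyperparameter values to search over
parameters = [{'C': [1, 10, 100], 'kernel': ['linear']}]

# 10-fold cross-validated grid search, scored by accuracy
grid_search = GridSearchCV(estimator=classifier_df, param_grid=parameters,
                           scoring='accuracy', cv=10)
grid_search = grid_search.fit(X_train, Y_train)
```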
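One design point worth noting: because the scaler in step 5 is fitted on the whole training set, each cross-validation fold has already seen statistics from its own validation part. A common alternative is to wrap the scaler and the classifier in a `Pipeline`, so the scaler is refit inside every fold. The sketch below is a variant of the article's approach, not the approach itself. It assumes scikit-learn's bundled `load_breast_cancer` data as a stand-in for the CSV file and an `SVC` estimator; the step names `scaler` and `svc` are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the bundled breast cancer data stands in for the CSV file
X_raw, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_raw, y, test_size=0.3, random_state=4)

# Scaling and classification are chained, so the scaler is refit per fold
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# The "svc__" prefix routes each parameter to the pipeline step named "svc"
pipe_grid = {"svc__C": [1, 10, 100], "svc__kernel": ["linear"]}

pipe_search = GridSearchCV(pipe, param_grid=pipe_grid, scoring="accuracy", cv=10)
pipe_search.fit(X_tr, y_tr)

print(pipe_search.best_params_)
print(pipe_search.best_score_)
```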

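7. Retrieve the accuracy score for this model. Note that `best_score_` is the mean cross-validated accuracy of the best parameter combination on the training data, reported as a fraction, so it is multiplied by 100 to print a percentage.

```python
accuracy = grid_search.best_score_
print("The best cross-validated accuracy for our model is: {0:.3f}%".format(accuracy * 100))
```

Output: The best cross-validated accuracy for our model is: 94.234%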
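To see which parameters won and how the tuned model does on the held-out test set, the fitted search object exposes `best_params_` and a `score` method. The following is a self-contained sketch under stated assumptions: scikit-learn's bundled `load_breast_cancer` data stands in for the CSV file (it encodes malignant as 0 and benign as 1, the reverse of this article), and the estimator is an `SVC`. The numbers it prints will not necessarily match the output above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: bundled data as a stand-in; here 0 = malignant, 1 = benign
data_X, data_y = load_breast_cancer(return_X_y=True)
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    data_X, data_y, test_size=0.3, random_state=4
)

# Standardize with statistics learned from the training data only
scaler = StandardScaler()
Xs_train = scaler.fit_transform(Xs_train)
Xs_test = scaler.transform(Xs_test)

search = GridSearchCV(
    estimator=SVC(),
    param_grid=[{"C": [1, 10, 100], "kernel": ["linear"]}],
    scoring="accuracy",
    cv=10,
)
search.fit(Xs_train, ys_train)

print("Best parameters:", search.best_params_)
print("Mean cross-validated accuracy:", search.best_score_)

# score() uses the refitted best estimator and the 'accuracy' scorer
test_accuracy = search.score(Xs_test, ys_test)
print("Held-out test accuracy:", test_accuracy)
```

The cross-validated score guides the choice of parameters, while the test-set score estimates how the chosen model performs on data it has never seen.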
