Terrorism detection using Logistic Regression Algorithm in Python

So, here we’ll be looking at a Python implementation of the logistic regression algorithm. We will be using the dataset available below to implement our algorithm. The dataset consists of details of employees in a company. It contains the employee id, gender salary, and the purchase.

Dataset:-User_Data.csv

We make a logistic regression model that will predict whether the employee will buy the product or not.

Importing the libraries:

import pandas as pnd 
import numpy as nmp 
import matplotlib.pyplot as pt 

 

dataset = pnd.read_csv('...\\User_Data.csv') 

Now we need to find a relation between age and salary to predict whether the employee will purchase the product or not.

x = dataset.iloc[:, [2, 3]].values 
y = dataset.iloc[:, 4].values 

Now we need to split the dataset. For training the model, 75% of data is used and for testing the model, 25% of the data is used.

from sklearn.cross_validation import train_test_split 
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size = 0.25, random_state = 0) 

Now we will perform the feature scaling operation between the age and salary so that the salary doesn’t dominate the age when it finds the nearest neighbor.

from sklearn.preprocessing import StandardScaler 
sc_x = StandardScaler() 
xtrain = sc_x.fit_transform(xtrain) 
xtest = sc_x.transform(xtest) 
print (xtrain[0:10, :]) 

At last, we train our logistic regression model.

from sklearn.linear_model import LogisticRegression 
classifier = LogisticRegression(random_state = 0) 
classifier.fit(xtrain, ytrain) 

For the prediction,

y_pred = classifier.predict(xtest) 

Testing the performance,

from sklearn.metrics import confusion_matrix 
cm = confusion_matrix(ytest, y_pred) 
print ("Confusion Matrix Output: \n", cm) 

Output:

Confusion Matrix : 
 [[65  3]
 [ 8 24]]



TP+TN=65+24
FP+FN=8+3

Finally the accuracy

from sklearn.metrics import accuracy_score 
print ("Accuracy : ", accuracy_score(ytest, y_pred)) 

Output:

Accuracy :  0.89

By this method, we can easily implement the logistic regression algorithm. Implement this algorithm on the Global Terrorism Database(GTD) for the required result. I hope you have clearly understood the concept. For any clarifications or suggestions comment down below.

Leave a Reply

Your email address will not be published. Required fields are marked *