Height-Weight Prediction By Using Linear Regression in Python
Hi everyone. In this tutorial we are going to predict weight from height using linear regression in Python.
What is Linear Regression?
In statistics, linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). The case of one explanatory variable is called simple linear regression. With more than one explanatory variable, the process is called multiple linear regression.
A linear regression line has an equation of the form y=mx+c, where x is the explanatory variable and y is the dependent variable. The slope of the line is m, and c is the intercept (the value of y when x=0).
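As a small illustration of the equation y=mx+c, the sketch below evaluates a line for a few x values. The slope and intercept are made-up numbers chosen only for illustration, and line_value is a hypothetical helper name, not part of any library.

```python
def line_value(x, m, c):
    """Return y = m*x + c for a single x."""
    return m * x + c


# Hypothetical slope and intercept, chosen only for illustration.
m = 2.0
c = 1.0

for x in [0, 1, 2]:
    print(x, line_value(x, m, c))
```

At x=0 the result equals the intercept c, which matches the definition of the intercept given above.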
Image of the linear model:
Implementation of Linear Regression Model: Height-Weight Prediction
In this problem, we want to predict the weight for a given height, specifically when the height is 2 (in the same units as the dataset).
We load the dataset with pandas; NumPy is imported as well because it is used later. The data is stored in a CSV file, so we read it with read_csv. The head() method displays the first 5 rows.
import numpy as np
import pandas as pd

df = pd.read_csv("height-weight.csv")
df.head()
output:
   Height  Weight
0    1.47   52.21
1    1.50   53.12
2    1.52   54.48
3    1.55   55.84
4    1.57   57.20
Next, we check the column names, the dimensions of the dataset, and whether it contains any missing values.
df.columns
df.shape
df.isna().any()
output:
Index(['Height', 'Weight'], dtype='object')
(15, 2)
Height    False
Weight    False
dtype: bool
Now we need to find the correlation between the two variables.
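The call that produces the correlation table is not shown in the tutorial. One plausible way to obtain it, assuming the default Pearson correlation of pandas' DataFrame.corr(), is sketched below. To keep the example self-contained, the DataFrame is rebuilt from the 15 rows of the dataset instead of reading the CSV file.

```python
import pandas as pd

# The same 15 height/weight pairs that appear in the tutorial's dataset.
df = pd.DataFrame({
    "Height": [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
               1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83],
    "Weight": [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
               63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr)
```

A value close to 1 off the diagonal indicates a strong positive linear relationship, which is what makes a straight-line model a reasonable choice here.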
output:
          Height    Weight
Height  1.000000  0.994584
Weight  0.994584  1.000000
Next, we extract the values of both variables. scikit-learn expects the independent variable (height) as a two-dimensional array, so we add an axis with np.newaxis. The dependent variable (weight) stays a one-dimensional array.
height = df.Height.values[:, np.newaxis]
weight = df.Weight.values
height
weight
output:
array([[1.47],
       [1.5 ],
       [1.52],
       [1.55],
       [1.57],
       [1.6 ],
       [1.63],
       [1.65],
       [1.68],
       [1.7 ],
       [1.73],
       [1.75],
       [1.78],
       [1.8 ],
       [1.83]])
array([52.21, 53.12, 54.48, 55.84, 57.2 , 58.57, 59.93, 61.29, 63.11,
       64.47, 66.28, 68.1 , 69.92, 72.19, 74.46])
Now we normalize the variables using min-max scaling.
Formula: Xnormal = (X - Xmin) / (Xmax - Xmin), where X is a value, Xmax is the maximum value of X, and Xmin is the minimum value of X.
Heightmin = height.min()
Heightmax = height.max()
Heightnorm = (height - Heightmin) / (Heightmax - Heightmin)
Weightmin = weight.min()
Weightmax = weight.max()
Weightnorm = (weight - Weightmin) / (Weightmax - Weightmin)
Heightnorm
Weightnorm
output:
array([[0.        ],
       [0.08333333],
       [0.13888889],
       [0.22222222],
       [0.27777778],
       [0.36111111],
       [0.44444444],
       [0.5       ],
       [0.58333333],
       [0.63888889],
       [0.72222222],
       [0.77777778],
       [0.86111111],
       [0.91666667],
       [1.        ]])
array([0.        , 0.04089888, 0.10202247, 0.16314607, 0.22426966,
       0.2858427 , 0.34696629, 0.40808989, 0.48988764, 0.55101124,
       0.63235955, 0.7141573 , 0.79595506, 0.89797753, 1.        ])
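The same scaling can also be done with scikit-learn's MinMaxScaler instead of the manual formula. This is an alternative to the tutorial's approach, not the method used above; the sketch rebuilds the height array from the dataset and checks that both approaches agree.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Heights from the tutorial's dataset, shaped as a 2-D column.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])[:, np.newaxis]

# MinMaxScaler rescales each column to the range [0, 1] by default.
scaler = MinMaxScaler()
height_norm = scaler.fit_transform(height)

# Manual min-max scaling, following the formula in the text.
manual = (height - height.min()) / (height.max() - height.min())

print(np.allclose(height_norm, manual))
```

One practical advantage of the scaler object is that it remembers the minimum and maximum, so the same transformation can later be applied to new values with scaler.transform.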
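Now we can apply the linear regression model. The scikit-learn (sklearn) library provides a built-in LinearRegression class for this. Note that the code below fits the model on the original height and weight arrays rather than the normalized ones, so the predictions are in the dataset's original units.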
import sklearn.linear_model as lm

lr = lm.LinearRegression()
lr.fit(height, weight)
output: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
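After fitting, the slope m and intercept c of the line y=mx+c can be read from the model's coef_ and intercept_ attributes. The sketch below rebuilds the arrays from the dataset so that it runs on its own, then checks that computing m*x+c by hand gives the same value as predict.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Heights (2-D column) and weights (1-D) from the tutorial's dataset.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])[:, np.newaxis]
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

lr = LinearRegression()
lr.fit(height, weight)

m = lr.coef_[0]     # slope: one coefficient because there is one feature
c = lr.intercept_   # intercept: predicted weight at height 0

print("slope m:", m)
print("intercept c:", c)

# Evaluating the line by hand should match lr.predict for the same input.
manual = m * 2 + c
print(manual, lr.predict([[2]])[0])
```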
Now, we need to find out the value of the weight, when the height value is 2.
knownvalue = float(input("Enter the value of height:"))  # float, so decimal heights such as 1.75 are accepted
findvalue = lr.predict([[knownvalue]])
print("When the height is", knownvalue, "the predicted weight is", findvalue)
output:
Enter the value of height:2
When the height is 2.0 the predicted weight is [83.48241717]
We can add the predicted values to the dataset as a new column.
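The code for this step is not shown in the tutorial. A minimal sketch, assuming the column is filled with the model's predictions for the original heights, could look like this. The column name predicted_value is taken from the output table; the DataFrame is rebuilt from the dataset so the example is self-contained.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# The same 15 height/weight pairs that appear in the tutorial's dataset.
df = pd.DataFrame({
    "Height": [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
               1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83],
    "Weight": [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
               63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46],
})

height = df[["Height"]].values  # 2-D array of shape (15, 1)
weight = df["Weight"].values    # 1-D array of shape (15,)

lr = LinearRegression().fit(height, weight)

# Store the model's prediction for every row next to the observed weight.
df["predicted_value"] = lr.predict(height)
print(df.head())
```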
output:
   Height  Weight  predicted_value
0    1.47   52.21        51.008158
1    1.50   53.12        52.846324
2    1.52   54.48        54.071768
3    1.55   55.84        55.909933
4    1.57   57.20        57.135377
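Finally, we calculate the model score. The r2_score function returns the coefficient of determination (R²), which measures how much of the variance in weight the model explains. The code labels this value "accuracy", but it is a goodness-of-fit measure for regression, not a classification accuracy.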
from sklearn.metrics import r2_score

accuracy = r2_score(weight, lr.predict(height))
print("the model accuracy is", accuracy*100, "%")
output: the model accuracy is 98.91969224457968 %
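The value reported by r2_score can be reproduced by hand from the standard definition R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean. The sketch below is self-contained and uses variable names (ss_res, ss_tot, r2_manual) that are chosen here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Heights (2-D column) and weights (1-D) from the tutorial's dataset.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])[:, np.newaxis]
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

lr = LinearRegression().fit(height, weight)
predicted = lr.predict(height)

# Sum of squared differences between observed and predicted weights.
ss_res = np.sum((weight - predicted) ** 2)
# Sum of squared differences between observed weights and their mean.
ss_tot = np.sum((weight - weight.mean()) ** 2)

r2_manual = 1 - ss_res / ss_tot
print(r2_manual, r2_score(weight, predicted))
```

Because the score is computed on the same data the model was fitted on, it describes how well the line fits this dataset rather than how well it would predict unseen heights.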
We have now applied a linear regression model to the height-weight data and seen how linear regression works in practice.