Height-Weight Prediction By Using Linear Regression in Python

Hi everyone, in this tutorial we are going to discuss height-weight prediction using linear regression in Python.

What is a Linear Regression?

In statistics, linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression.

A linear regression line has an equation of the form y = mx + c, where x is the explanatory variable and y is the dependent variable. The slope of the line is m, and c is the intercept (the value of y when x = 0).
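To make the roles of m and c concrete, here is a minimal sketch that estimates them with the standard least-squares formulas. The numbers are the height and weight values used later in this tutorial; the variable names are only illustrative.

```python
import numpy as np

# Heights and weights copied from the dataset used in this tutorial.
x = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
              1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])
y = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
              63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

# Least-squares estimates for y = m*x + c:
#   m = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)**2)
#   c = mean_y - m * mean_x
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
c = y.mean() - m * x.mean()

print("slope m =", round(m, 3))
print("intercept c =", round(c, 3))
```

The slope comes out at roughly 61.27 and the intercept at roughly -39.06, which agrees with the prediction of about 83.48 for a height of 2 in step 7.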

[Image: the linear model]

Our dataset:

height-weight.csv

Implementation of Linear Regression Model: Height-Weight Prediction

In this problem, we need to predict the weight for a given height, specifically when the height is 2.

step1:-

First, we load the dataset with the pandas data science library (numpy is imported as well because it is used in later steps). The dataset is a CSV file, so we read it with read_csv. The head() method then displays the first 5 rows.

import numpy as np
import pandas as pd
df=pd.read_csv("height-weight.csv")
df.head()
output:
      Height     Weight
0      1.47      52.21
1      1.5       53.12
2      1.52      54.48
3      1.55      55.84
4      1.57      57.2

step2:-

Now we check the column names of this dataset, its dimensions, and whether it has any missing values.

df.columns
df.shape
df.isna().any()

output:
Index(['Height', 'Weight'], dtype='object')
(15, 2)
Height    False
Weight    False
dtype: bool
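Our dataset has no missing values, so every flag is False. As a rough illustration of what the check looks like when something is missing, the sketch below uses a small hypothetical frame (not the tutorial's data) with one deliberately missing weight.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: the second Weight entry is deliberately missing.
toy = pd.DataFrame({"Height": [1.47, 1.50, 1.52],
                    "Weight": [52.21, np.nan, 54.48]})

has_missing = toy.isna().any()      # per-column flag: True if any value is NaN
missing_counts = toy.isna().sum()   # per-column count of NaN values
cleaned = toy.dropna()              # drop every row that contains a NaN

print(has_missing)
print(missing_counts)
print(cleaned.shape)
```

Dropping rows is only one option; whether to drop or fill missing values depends on the data.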


step3:-

Now we need to find the correlation between the two variables.

df.corr()
output:
          Height    Weight
Height  1.000000  0.994584
Weight  0.994584  1.000000
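For readers who want to see where the 0.994584 comes from, here is a small sketch of the Pearson correlation coefficient computed directly with numpy. It assumes Pearson correlation, which is the default for df.corr(), and uses the same height and weight values.

```python
import numpy as np

height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

# Deviations of each variable from its mean.
dx = height - height.mean()
dy = weight - weight.mean()

# Pearson r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

print(round(r, 6))
```

A value this close to 1 indicates a strong positive linear relationship, which is why a straight-line model fits this data well.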

step4:-

Now we extract the values of the independent variable (height) as a two-dimensional array, which is the shape scikit-learn expects for its input features, and the values of the dependent variable (weight) as a one-dimensional array.

height=df.Height.values[:,np.newaxis]
weight=df.Weight.values
height
weight
output:
array([[1.47],
       [1.5 ],
       [1.52],
       [1.55],
       [1.57],
       [1.6 ],
       [1.63],
       [1.65],
       [1.68],
       [1.7 ],
       [1.73],
       [1.75],
       [1.78],
       [1.8 ],
       [1.83]])
array([52.21, 53.12, 54.48, 55.84, 57.2 , 58.57, 59.93, 61.29, 63.11,
       64.47, 66.28, 68.1 , 69.92, 72.19, 74.46])
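The effect of np.newaxis is easiest to see by comparing shapes. The short sketch below uses three of the height values and also shows reshape(-1, 1), an equivalent way to get a single-column array.

```python
import numpy as np

values = np.array([1.47, 1.50, 1.52])

col_a = values[:, np.newaxis]   # add a second axis: one column, three rows
col_b = values.reshape(-1, 1)   # same result; -1 lets numpy infer the row count

print(values.shape)                   # (3,)
print(col_a.shape)                    # (3, 1)
print(np.array_equal(col_a, col_b))   # True
```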

step5:-

Now, we need to normalize the variables, that is, apply min-max scaling to them.

Formula:- Xnormal=(X-Xmin)/(Xmax-Xmin), where X is a value, Xmax is the maximum value of X and Xmin is the minimum value of X.

Heightmin=height.min()
Heightmax=height.max() 
Heightnorm=(height-Heightmin)/(Heightmax-Heightmin)
Weightmin=weight.min()
Weightmax=weight.max()
Weightnorm=(weight-Weightmin)/(Weightmax-Weightmin)
Heightnorm
Weightnorm

 

output:
array([[0.        ],
       [0.08333333],
       [0.13888889],
       [0.22222222],
       [0.27777778],
       [0.36111111],
       [0.44444444],
       [0.5       ],
       [0.58333333],
       [0.63888889],
       [0.72222222],
       [0.77777778],
       [0.86111111],
       [0.91666667],
       [1.        ]])
array([0.        , 0.04089888, 0.10202247, 0.16314607, 0.22426966,
       0.2858427 , 0.34696629, 0.40808989, 0.48988764, 0.55101124,
       0.63235955, 0.7141573 , 0.79595506, 0.89797753, 1.        ])
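The same scaling can be wrapped in a small helper. This is only a sketch: min_max_scale and min_max_unscale are hypothetical names introduced here, and the guard against a constant array is an added assumption, not part of the formula above.

```python
import numpy as np

def min_max_scale(values):
    """Scale values to the range [0, 1] with (X - Xmin) / (Xmax - Xmin)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All values identical: the formula would divide by zero.
        raise ValueError("cannot min-max scale a constant array")
    return (values - lo) / (hi - lo)

def min_max_unscale(scaled, lo, hi):
    """Invert the scaling to get back the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

heights = np.array([1.47, 1.50, 1.52, 1.83])
scaled = min_max_scale(heights)
restored = min_max_unscale(scaled, heights.min(), heights.max())

print(scaled)
print(restored)
```

Keeping the minimum and maximum around matters: any prediction made on scaled data has to be converted back with them before it can be read in the original units.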

step6:-

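Now, we can apply the linear regression model. The sklearn library has a built-in class for this linear model. Note that the model is fitted here on the original height and weight values, not on the normalized ones from step 5.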

import sklearn.linear_model as lm
lr=lm.LinearRegression()
lr.fit(height,weight)
output:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
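Once the model is fitted, the slope m and intercept c of y=mx+c can be read from its coef_ and intercept_ attributes. The sketch below rebuilds the arrays inline so that it runs on its own, and checks that a prediction made by hand matches lr.predict.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as the tutorial, rebuilt here so the example is self-contained.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])[:, np.newaxis]
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

lr = LinearRegression().fit(height, weight)

m = lr.coef_[0]      # slope: one coefficient because there is one feature
c = lr.intercept_    # intercept: predicted weight when height is 0

manual = m * 2.0 + c                   # prediction by hand for height 2
from_model = lr.predict([[2.0]])[0]    # the same prediction via the model

print("slope:", m, "intercept:", c)
print("manual:", manual, "model:", from_model)
```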

 

step7:-

Now, we need to find out the value of the weight, when the height value is 2.

knownvalue=float(input("Enter the value of height:"))  # float, so heights such as 1.75 are accepted
findvalue=lr.predict([[knownvalue]])
print("when the height value is",knownvalue,"the predicted weight value is",findvalue)

output:
Enter the value of height:2
when the height value is 2.0 the predicted weight value is [83.48241717]

step8:-

We can add the predicted values to this dataset as a new column.

df["predicted_value"]=lr.predict(height)
df.head()
output:
     Height   Weight   predicted_value
0     1.47    52.21       51.008158
1     1.50    53.12       52.846324
2     1.52    54.48       54.071768
3     1.55    55.84       55.909933
4     1.57    57.20       57.135377

step9:-

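Now, finally, we need to calculate the model score. Here we use the R² score (coefficient of determination), which measures how much of the variation in weight the model explains. It is not a classification accuracy.

from sklearn.metrics import r2_score
r2=r2_score(weight,lr.predict(height))
print("the model R2 score is",r2*100,"%")
output:
the model R2 score is 98.91969224457968 %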
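To show what this score measures, here is a sketch that computes it by hand as 1 - SS_res/SS_tot. To stay numpy-only it fits the line with np.polyfit instead of scikit-learn, which is a swapped-in technique that gives the same least-squares line for this data.

```python
import numpy as np

height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

# Degree-1 least-squares fit: returns the slope first, then the intercept.
m, c = np.polyfit(height, weight, 1)
predicted = m * height + c

ss_res = np.sum((weight - predicted) ** 2)       # variation the line leaves unexplained
ss_tot = np.sum((weight - weight.mean()) ** 2)   # total variation around the mean

r2 = 1 - ss_res / ss_tot
print("R2 score:", r2 * 100, "%")
```

A score near 100 % means the line explains almost all of the variation in weight. For simple linear regression it equals the square of the correlation found in step 3.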

 

Finally, we have applied the linear regression model and understood the concept of linear regression.
