Height-Weight Prediction By Using Linear Regression in Python
Hi everyone, in this tutorial we are going to discuss “Height-Weight Prediction By Using Linear Regression in Python”.
What is a Linear Regression?
In statistics, linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression.
A linear regression line has an equation of the form y = mx + c, where x is the explanatory variable and y is the dependent variable. The slope of the line is m, and c is the intercept (the value of y when x = 0).
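To make the roles of m and c concrete, here is a minimal sketch of the line equation. The slope and intercept values are made up purely for illustration and are not taken from the dataset.

```python
def line(x, m, c):
    """Return y = m*x + c for a single x value."""
    return m * x + c

# Hypothetical slope and intercept, chosen only for illustration.
m, c = 2.0, 1.0

y_at_zero = line(0.0, m, c)    # at x = 0 the line returns the intercept c
y_at_three = line(3.0, m, c)   # 2*3 + 1

# Moving x by one unit changes y by exactly m (the slope).
step = line(4.0, m, c) - line(3.0, m, c)

print(y_at_zero, y_at_three, step)
```

Fitting a linear regression means choosing m and c so that this line passes as close as possible to the observed points.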
Image of the linear model:

our dataset:
Implementation of Linear Regression Model: Height-Weight Prediction
In this problem, you need to predict the weight that corresponds to a given height, here a height of 2.
step1:-
First we load the dataset using the pandas data science library (NumPy is imported as well because it is used in later steps). The data is stored in a CSV file, which is why we use read_csv; the .head() method then displays the first 5 rows.
import numpy as np
import pandas as pd
df=pd.read_csv("height-weight.csv")
df.head()
output:
   Height  Weight
0    1.47   52.21
1    1.50   53.12
2    1.52   54.48
3    1.55   55.84
4    1.57   57.20
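If the file height-weight.csv is not at hand, the same DataFrame can probably be rebuilt by hand. The sketch below assumes the file contains exactly the 15 rows that are printed as arrays in step 4 of this tutorial; the list names heights and weights are illustrative only.

```python
import pandas as pd

# Values copied from the arrays printed in step 4 of this tutorial.
heights = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
           1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83]
weights = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
           63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46]

# Same column names as the CSV version used in the tutorial.
df = pd.DataFrame({"Height": heights, "Weight": weights})

print(df.head())
```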
step2:-
Now we check the column names of the dataset, its dimensions, and whether it has any missing values.
df.columns
df.shape
df.isna().any()
output:
Index(['Height', 'Weight'], dtype='object')
(15, 2)
Height    False
Weight    False
dtype: bool
step3:-
Now we need to find out the correlation between the two variables.
df.corr()
output:
          Height    Weight
Height  1.000000  0.994584
Weight  0.994584  1.000000
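By default df.corr() reports the Pearson correlation coefficient. As a rough sketch of what that number means, it can be recomputed with plain NumPy from the values listed in step 4 of this tutorial. The helper name pearson_r is illustrative, not part of pandas.

```python
import numpy as np

# Height and weight values as printed in step 4 of this tutorial.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

def pearson_r(x, y):
    """Pearson correlation: co-variation divided by the product of the spreads."""
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

r = pearson_r(height, weight)
print(round(r, 6))
```

A value this close to 1 indicates an almost perfectly linear, increasing relationship, which is why a straight-line model is a reasonable choice here.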
step4:-
Now we extract the values of the two variables. The independent variable (height) must be a 2-dimensional array, because scikit-learn expects the features in the form (rows, columns). The dependent variable (weight) stays a 1-dimensional array.
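Before applying this to the dataset, here is a quick sketch of what the reshaping does, using three sample heights. The variable names flat, column and same are illustrative only.

```python
import numpy as np

flat = np.array([1.47, 1.50, 1.52])   # shape (3,): one-dimensional
column = flat[:, np.newaxis]          # shape (3, 1): three rows, one feature column
same = flat.reshape(-1, 1)            # an equivalent way to get the same shape

print(flat.shape, column.shape, same.shape)
```

Either form works; np.newaxis simply inserts a new axis of length 1 at the position where it appears in the index.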
height=df.Height.values[:,np.newaxis]
weight=df.Weight.values
height
weight
output:
array([[1.47],
[1.5 ],
[1.52],
[1.55],
[1.57],
[1.6 ],
[1.63],
[1.65],
[1.68],
[1.7 ],
[1.73],
[1.75],
[1.78],
[1.8 ],
[1.83]])
array([52.21, 53.12, 54.48, 55.84, 57.2 , 58.57, 59.93, 61.29, 63.11,
64.47, 66.28, 68.1 , 69.92, 72.19, 74.46])
step5:-
Now, we need to normalize the variables, that is, apply min-max scaling to them.
Formula:- Xnormal=(X-Xmin)/(Xmax-Xmin), where X is a value of the variable, Xmax is the maximum value of X and Xmin is the minimum value of X.
Heightmin=height.min()
Heightmax=height.max()
Heightnorm=(height-Heightmin)/(Heightmax-Heightmin)
Weightmin=weight.min()
Weightmax=weight.max()
Weightnorm=(weight-Weightmin)/(Weightmax-Weightmin)
Heightnorm
Weightnorm
output:
array([[0. ],
[0.08333333],
[0.13888889],
[0.22222222],
[0.27777778],
[0.36111111],
[0.44444444],
[0.5 ],
[0.58333333],
[0.63888889],
[0.72222222],
[0.77777778],
[0.86111111],
[0.91666667],
[1. ]])
array([0. , 0.04089888, 0.10202247, 0.16314607, 0.22426966,
0.2858427 , 0.34696629, 0.40808989, 0.48988764, 0.55101124,
0.63235955, 0.7141573 , 0.79595506, 0.89797753, 1. ])
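The same formula can be wrapped in a small reusable function, together with its inverse for converting scaled values back to the original units. This is only a sketch: the function names min_max_scale and min_max_unscale are made up here, and the code assumes the array has at least two distinct values.

```python
import numpy as np

def min_max_scale(x):
    """Scale x linearly so its minimum maps to 0 and its maximum to 1.

    Assumes x.max() != x.min(); otherwise the division below is by zero.
    """
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def min_max_unscale(x_norm, x_min, x_max):
    """Invert min_max_scale: map values in [0, 1] back to the original range."""
    return x_norm * (x_max - x_min) + x_min

# First five heights from the dataset, used as sample input.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57])

height_norm, h_min, h_max = min_max_scale(height)
restored = min_max_unscale(height_norm, h_min, h_max)

print(height_norm)
print(restored)
```

Keeping the minimum and maximum around matters whenever a model is trained on scaled data, because new inputs must be scaled with the same two numbers.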
step6:-
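Now we can apply the linear regression model. The scikit-learn library has a built-in class for it, LinearRegression. Note that the code below fits the model on the original height and weight arrays, not on the normalized ones from step 5.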
import sklearn.linear_model as lm
lr=lm.LinearRegression()
lr.fit(height,weight)
output: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
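LinearRegression fits the line by ordinary least squares. With a single feature, the slope and intercept have a simple closed form, sketched below in plain NumPy. This is an illustration of the math, not scikit-learn's internal implementation, and the helper name predict is made up here.

```python
import numpy as np

# Height and weight values as printed in step 4 of this tutorial.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

# Closed-form simple least squares:
#   m = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   c = mean_y - m * mean_x
dx = height - height.mean()
m = (dx * (weight - weight.mean())).sum() / (dx ** 2).sum()
c = weight.mean() - m * height.mean()

def predict(h):
    """Predicted weight for height h on the fitted line y = m*h + c."""
    return m * h + c

print("slope:", m)
print("intercept:", c)
print("prediction at height 2:", predict(2.0))
```

If everything lines up, the slope and intercept should agree with lr.coef_ and lr.intercept_ from the fitted scikit-learn model, and the prediction at 2 should agree with the value reported in step 7.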
step7:-
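Now, we need to find out the value of the weight when the height value is 2. Note that 2 lies outside the range of heights in the dataset (1.47 to 1.83), so this prediction is an extrapolation.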
knownvalue=float(input("Enter the value of height:"))  # float, so decimal heights such as 1.75 are accepted
findvalue=lr.predict([[knownvalue]])  # predict expects a 2D array: one row, one feature
print("when the height value is",knownvalue,"the predicted weight value is",findvalue)
output:
Enter the value of height:2
when the height value is 2.0 the predicted weight value is [83.48241717]
step8:-
We can add the model's predictions for the existing heights to the dataset as a new column.
df["predicted_value"]=lr.predict(height)
df.head()
output:
Height Weight predicted_value
0 1.47 52.21 51.008158
1 1.50 53.12 52.846324
2 1.52 54.48 54.071768
3 1.55 55.84 55.909933
4 1.57 57.20 57.135377
step9:-
Now, finally, we need to calculate the model score.
from sklearn.metrics import r2_score
accuracy=r2_score(weight,lr.predict(height))  # R^2 (coefficient of determination), not a classification accuracy
print("the model accuracy is",accuracy*100,"%")
output: the model accuracy is 98.91969224457968 %
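The score returned by r2_score is the coefficient of determination, R^2 = 1 - SS_res / SS_tot. As a sketch of what that means, it can be recomputed by hand. Here np.polyfit stands in for the scikit-learn model to keep the example self-contained; that swap is an assumption of this sketch, not what the tutorial code uses.

```python
import numpy as np

# Height and weight values as printed in step 4 of this tutorial.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

# Degree-1 least-squares fit; polyfit returns the slope first, then the intercept.
m, c = np.polyfit(height, weight, 1)
pred = m * height + c

ss_res = ((weight - pred) ** 2).sum()           # variation left unexplained by the line
ss_tot = ((weight - weight.mean()) ** 2).sum()  # total variation around the mean
r2 = 1 - ss_res / ss_tot

print("R^2:", r2)
```

Because the score is computed on the same rows the model was fitted on, it describes how well the line fits this data rather than how well it would predict new measurements.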
Finally, we have applied the linear regression model and understood the concept of linear regression.
How can we add a train/test split and plots to this?
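One possible way to add a train/test split is sketched below with plain NumPy. The seed, the choice of holding out 3 of the 15 rows, and the use of np.polyfit in place of the scikit-learn model are all assumptions made for this illustration.

```python
import numpy as np

# Height and weight values as printed in step 4 of this tutorial.
height = np.array([1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
                   1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83])
weight = np.array([52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
                   63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46])

rng = np.random.default_rng(0)         # fixed seed so the split is repeatable
order = rng.permutation(len(height))   # shuffled row indices
n_test = 3                             # hold out 3 of the 15 rows (arbitrary choice)
test_idx, train_idx = order[:n_test], order[n_test:]

# Fit the line on the training rows only.
m, c = np.polyfit(height[train_idx], weight[train_idx], 1)

# Evaluate on the rows the model has not seen.
test_pred = m * height[test_idx] + c
test_mae = np.abs(weight[test_idx] - test_pred).mean()

print("held-out mean absolute error:", test_mae)
```

scikit-learn also provides train_test_split in sklearn.model_selection for the same purpose. For plots, matplotlib's scatter and plot functions could draw the data points and the fitted line; that part is left out here to keep the sketch free of extra dependencies.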