LA ILAHA ILLA HU
Allah, Your Lord There Is No Deity Except Him.
Python Data Science Machine Learning Lesson 9: Multiple Regression
Multiple regression is an extension of linear regression that uses more than one independent variable. It means that we try to predict a value using two or more independent variables.
Take a look at the data set below; it contains some information about abalones.
We could predict the age of an abalone from the weight of its shell alone,
but with multiple regression we can add more variables, such as the height, length and diameter of the abalone,
in order to make our prediction more accurate.
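To make the idea concrete: a multiple regression model predicts by adding an intercept to a weighted sum of the input variables. The short sketch below shows only that arithmetic in plain Python. The function name `predict_linear` and all the numbers are made up for illustration; they are not fitted to the abalone data.

```python
def predict_linear(intercept, coefficients, features):
    """Return intercept + sum(coefficient * feature) for one sample."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return intercept + sum(c * x for c, x in zip(coefficients, features))


# Hypothetical numbers, chosen only to show the arithmetic.
example_intercept = 3.0
example_coefficients = [10.0, 20.0]  # made-up weights for two variables
example_features = [0.5, 0.25]       # made-up measurements for one sample

example_prediction = predict_linear(
    example_intercept, example_coefficients, example_features
)
print(example_prediction)  # 3.0 + 10.0*0.5 + 20.0*0.25 = 13.0
```

Fitting a model means letting the library find the intercept and the weights from the data instead of choosing them by hand.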
How do we create our model and make it work?
Python has modules which will make our work easier. First, import the pandas module.
import pandas
The pandas module can read a CSV file and convert it into a DataFrame object.
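As a small illustration of what reading a CSV into a DataFrame looks like, the sketch below reads three made-up rows from a string instead of a file on disk. The values and the variable names `csv_text` and `sample_df` are assumptions for this example only; they are not taken from abalone.csv.

```python
import io

import pandas as pd

# Three made-up rows standing in for a CSV file on disk.
csv_text = """Length,Height,Shell_weight
0.455,0.095,0.150
0.350,0.090,0.070
0.530,0.135,0.210
"""

# read_csv accepts a file path or a file-like object such as StringIO.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(type(sample_df).__name__)  # DataFrame
print(list(sample_df.columns))   # column names come from the header row
print(sample_df.shape)           # (3, 3): three rows, three columns
```

With a real file you would pass its path, for example pd.read_csv("abalone.csv"), as the lesson code does.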
Now, the Python code for multiple regression is as follows.
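import warnings
warnings.filterwarnings("ignore")  # optional: silences all warnings

import pandas as pd
from sklearn.linear_model import LinearRegression

# Load the abalone data into a DataFrame.
df = pd.read_csv("abalone.csv")

# Independent variables (the Sex column is left out).
X = df[['Length', 'Diameter', 'Height',
        'Whole_weight', 'Shucked_weight',
        'Viscera_weight', 'Shell_weight']]
# Dependent variable: the number of rings.
y = df['Class_number_of_rings']

# Fit the multiple regression model.
rsr = LinearRegression()
rsr.fit(X, y)

# Predict the rings for one abalone; the values follow the column order of X.
pred = rsr.predict([[0.300, 0.350, 0.100, 0.4245,
                     0.2522, 0.1300, 0.099]])

# predict returns an array with one value per sample; age = rings + 1.5.
age = pred[0] + 1.5
print(float(age))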
The output in years will be
8.27128158983664
Explanation:
1. First we import the required libraries. Importing the warnings library is optional. Here it is used to suppress any warnings that the code might raise during execution.
2. The pandas library is imported to load the data into a DataFrame, and sklearn is imported for the multiple regression.
3. The file 'abalone.csv', a comma-separated values file, is loaded into the DataFrame 'df'.
This dataset has 9 columns.
Of these, 8 are independent variables and 1 is the dependent variable.
Here we exclude one independent variable, Sex. It is a categorical column, so it would have to be converted to numbers before the model could use it, and this lesson leaves it out for simplicity.
You can use any two or more of the independent variables for the age prediction.
This dataset will be used to predict the age of an abalone.
4. After that, the selected independent variables are assigned to the variable 'X'.
The dependent variable is assigned to the variable 'y'.
5. Next, the multiple regression model is fitted on 'X' and 'y'.
6. For the final prediction, use the 'predict' method of the fitted model to predict the number of rings,
and then add 1.5 to the rings value to get the age of the abalone in years.
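The steps above can also be tried without the CSV file. The sketch below is only an illustration under stated assumptions: it builds a small synthetic data set from a known rule (y = 2 + 3*x1 + 5*x2, chosen arbitrarily), fits LinearRegression on it, and checks that 'predict' gives the same result as the intercept plus the weighted sum of the inputs. The names `X_demo`, `y_demo` and `demo_model` are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: 50 samples with two independent variables each.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(50, 2))

# Target built from a known rule with no noise: y = 2 + 3*x1 + 5*x2.
y_demo = 2.0 + 3.0 * X_demo[:, 0] + 5.0 * X_demo[:, 1]

# Fit the model, exactly as in step 5.
demo_model = LinearRegression()
demo_model.fit(X_demo, y_demo)

# Predict one new sample, as in step 6, and redo the same sum by hand.
sample = np.array([[0.4, 0.6]])
by_predict = demo_model.predict(sample)[0]
by_hand = demo_model.intercept_ + float(np.dot(demo_model.coef_, sample[0]))

print("intercept:", demo_model.intercept_)   # close to 2
print("coefficients:", demo_model.coef_)     # close to [3, 5]
print("predict:", by_predict, "by hand:", by_hand)
```

After fitting, 'coef_' holds one weight per column of X and 'intercept_' holds the constant term, so the same attributes can be inspected on the abalone model 'rsr'.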
Note: if you have Python installed on your PC, you can install pandas as follows.
Open Command Prompt from the Start menu.
Inside the Command Prompt, type
pip install pandas
and press Enter.
This command installs pandas on your computer, after which you can import it in Python.
Note: if you have Python installed on your PC, you can install sklearn as follows.
Open Command Prompt from the Start menu.
Inside the Command Prompt, type
pip install scikit-learn
and press Enter.
This command installs scikit-learn on your computer. The package is installed under the name scikit-learn but is imported in Python as sklearn.
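One simple way to check that both installations worked is to import the libraries and print their versions. This is just a quick check, not a required step; the version numbers you see depend on what pip installed.

```python
import pandas
import sklearn

# If either import fails, the corresponding package is not installed.
print("pandas version:", pandas.__version__)
print("scikit-learn version:", sklearn.__version__)
```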