لَآ إِلَـٰهَ إِلَّا هُوَ
LA ILAHA ILLA HU
Allah, Your Lord There Is No Deity Except Him.

Python Data Science Machine Learning Lesson 10: Scale

It can be difficult to compare the values in a dataset when the columns have very different ranges or are measured in different units.
Scaling solves this problem: the data is transformed into new values that are easier to compare.

For scaling, we will use the accelerometer dataset from the UCI ML repository.
There are 5 attributes in the dataset: wconfid, pctid, x, y and z.

1. wconfid: Weight Configuration ID (1 - 'red' - normal configuration; 2 - 'blue' - perpendicular configuration; 3 - 'green' - opposite configuration)
2. pctid: Cooler Fan RPM Speed Percentage ID (20 means 20%, and so on)
3. x: Accelerometer x value.
4. y: Accelerometer y value.
5. z: Accelerometer z value.

Take a look at the accelerometer data before Scaling:
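One way to look at the data before scaling is to print the first rows and the summary statistics with pandas. The sketch below uses a small made-up dataframe with the same column names, because the real values are not reproduced here. With the actual file you would load it with pd.read_csv("accelerometer.csv") instead.

```python
import pandas as pd

# Hypothetical sample rows (made-up values, not taken from the real dataset)
df = pd.DataFrame({
    "wconfid": [1, 1, 2, 2, 3, 3],
    "pctid":   [20, 40, 60, 80, 100, 20],
    "x":       [1.004, 1.004, 0.969, 0.973, 1.000, 0.961],
    "y":       [0.090, -0.043, 0.090, -0.012, -0.016, 0.082],
    "z":       [-0.125, -0.125, -0.121, -0.137, -0.121, -0.121],
})

# First rows of the data
print(df.head())

# Count, mean, standard deviation, min and max of each feature column
summary = df[["pctid", "x", "y", "z"]].describe()
print(summary)
```

In a summary like this, a percentage column such as pctid sits on a much larger range than the x, y and z readings, which is the kind of mismatch scaling is meant to remove.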

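The sklearn.preprocessing module of scikit-learn has a class called StandardScaler. Calling StandardScaler() returns a scaler object with methods for transforming datasets. It rescales each column so that it has a mean of 0 and a standard deviation of 1.
The Python code for scaling is given below.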

code

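import pandas as pd
from sklearn.preprocessing import StandardScaler

# Create the scaler object
scale = StandardScaler()

# Load the dataset into a dataframe
df = pd.read_csv("accelerometer.csv")

# Select the columns to scale
X = df[['pctid','x','y','z']]

# Fit the scaler to the data and transform it in one step
scaledX = scale.fit_transform(X)
print(scaledX)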

The output will be:
[[-1.63299316  0.01082468  0.11398969 -0.01398598]
 [-1.63299316  0.01082468 -0.06511094 -0.01398598]
 [-1.63299316 -0.03439754  0.11398969 -0.00624913]
 ...
 [ 1.63299316  0.20721833 -0.13378863 -0.21127571]
 [ 1.63299316 -0.07961976  0.2661579  -0.10489399]
 [ 1.63299316  0.26277706 -0.24421158  0.43861987]]
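StandardScaler standardizes each column as z = (x - mean) / std, using the column's mean and its population standard deviation. The sketch below checks this by hand on a small made-up array (the values are hypothetical, not taken from the accelerometer file) and confirms that every scaled column ends up with a mean of 0 and a standard deviation of 1.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical values: column 0 mimics a percentage, column 1 a small reading
X = np.array([[20.0, 1.004],
              [40.0, 0.969],
              [60.0, 0.973],
              [80.0, 1.000],
              [100.0, 0.961]])

scaled = StandardScaler().fit_transform(X)

# The same calculation by hand: subtract the column mean, then divide by the
# column standard deviation (NumPy's default ddof=0 matches StandardScaler)
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(scaled, manual))   # True

# Per-column mean and standard deviation after scaling
print(scaled.mean(axis=0).round(6))
print(scaled.std(axis=0).round(6))
```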

Explanation:

1. The pandas library is imported for loading the data into a dataframe, and StandardScaler is imported from sklearn for scaling.

2. The file 'accelerometer.csv', which is a comma-separated values file, is loaded into the dataframe 'df'. This dataset has the 5 columns described above: wconfid, pctid, x, y and z.

3. The columns pctid, x, y and z are selected from the dataframe as the features X.

4. After that, standard scaling is applied to all the selected features.

5. The fit_transform method is used to fit the scaler and transform the values in one step.
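fit_transform is shorthand for calling fit (which learns each column's mean and standard deviation) and then transform (which applies them). A minimal sketch on made-up values shows that the two routes give the same result and that a fitted scaler can be reused on new rows. The array contents and the name new_rows are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature values (not from the accelerometer file)
X = np.array([[20.0, 1.004],
              [60.0, 0.969],
              [100.0, 0.973]])

# One step: fit and transform together
one_step = StandardScaler().fit_transform(X)

# Two steps: fit learns mean_ and scale_, transform applies them
scaler = StandardScaler()
scaler.fit(X)
two_step = scaler.transform(X)

print(np.allclose(one_step, two_step))   # True

# The fitted scaler can transform new rows with the same mean and scale
new_rows = np.array([[40.0, 1.000]])
print(scaler.transform(new_rows))
```

Keeping the fitted scaler around is useful when later data has to be scaled in exactly the same way as the data the scaler was fitted on.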

Note: if you have Python installed on your PC, you can install pandas as follows.
Open Command Prompt from the Start menu.
Inside the command prompt, type
pip install pandas
and press Enter.
This command will install pandas on your computer, after which you can use it in Python.

Note: if you have Python installed on your PC, you can install sklearn as follows.
Open Command Prompt from the Start menu.
Inside the command prompt, type
pip install scikit-learn
and press Enter.
This command will install scikit-learn on your computer. The package is named scikit-learn on pip, but it is imported in Python as sklearn.
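A quick way to confirm that both libraries are available is to import them in Python and print their versions. Note that the import name for scikit-learn is sklearn.

```python
import pandas as pd
import sklearn

# Print the installed versions; an ImportError here means the install failed
print("pandas:", pd.__version__)
print("scikit-learn:", sklearn.__version__)
```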