لَآ إِلَـٰهَ إِلَّا هُوَ
LA ILAHA ILLA HU
Allah, Your Lord There Is No Deity Except Him.

Python Data Science with Pandas: Fixing Wrong Data

Wrong Data: "Wrong data" does not have to mean "empty cells" or "wrong format"; a value can simply be wrong.

How to Fix Or Clean Wrong Data In Pandas?

Take a look at the data set. You can simply replace wrong data with something else.


At times you can spot wrong data just by looking at the data set, because you have an expectation of what it should be.

If you take a look at our data set, you can see that in row 7 the value for Play is YesNo, which is Yes and No at the same time. It has to be either Yes or No, as it is in all the other rows.
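Spotting such a value by eye works for small data sets; for larger ones, a check against the expected labels may help. The following is a minimal sketch, assuming a made-up stand-in DataFrame (the values below are hypothetical, not the article's actual file) in which Play should only ever be Yes or No:

```python
import pandas as pd

# Hypothetical stand-in for the article's data set; only the Play column is modelled.
df = pd.DataFrame({
    "Play": ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "YesNo", "No"],
})

# The labels we expect to see in the Play column.
allowed = ["Yes", "No"]

# Keep only the rows whose Play value is NOT one of the expected labels.
bad_rows = df[~df["Play"].isin(allowed)]
print(bad_rows)

# Counting each distinct value also makes a rare, odd label stand out.
print(df["Play"].value_counts())
```

Here bad_rows contains only the row with the unexpected YesNo value, which tells us where a fix is needed.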

How can we fix wrong values, like the one for Play in row 7?

Replacing Values

One way to fix wrong values is to replace them with something else.

In our example, it is most likely a typo, and the value should be "Yes" instead of "YesNo", so we could just insert "Yes" in row 7.
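Inserting a value into one specific row can be sketched with loc, which sets a single cell by index label and column name. This is an illustration only, assuming a made-up stand-in DataFrame with the default integer index, so that row 7 has index label 7:

```python
import pandas as pd

# Hypothetical stand-in for the article's data set; only the Play column is modelled.
df = pd.DataFrame({
    "Play": ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "YesNo", "No"],
})

# Set the single cell at index label 7 in the Play column.
df.loc[7, "Play"] = "Yes"

print(df.loc[7, "Play"])
```

Setting one cell this way suits a one-off fix in a known row; replacing by value, as in the article's own code, also catches the same typo if it appears in other rows.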

Code

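import pandas as pd

# Load the data set
df = pd.read_csv('dummy2.csv')

# Replace the wrong value 'YesNo' with 'Yes' in the Play column.
# Assigning the result back avoids calling replace with inplace=True
# on a selected column, which newer pandas versions warn about.
df['Play'] = df['Play'].replace('YesNo', 'Yes')

# Convert the Date column to datetime (the date format we corrected earlier)
df['Date'] = pd.to_datetime(df['Date'])

# Drop rows whose Date is NaT (the empty value we handled earlier)
df = df.dropna(subset=['Date'])

# Print the whole DataFrame
print(df.to_string())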

The output will be the cleaned data set, printed in full.

Note that the data set is now fully corrected.