لَآ إِلَـٰهَ إِلَّا هُوَ
LA ILAHA ILLA HU
Allah, Your Lord There Is No Deity Except Him.

Python Data Science: Pandas Data Cleaning, Fixing Data of Wrong Format with to_datetime()

Data Cleaning: Data of Wrong Format

Cells with data of wrong format can make it difficult, or even impossible, to analyze data.

How to Fix or Clean Wrong Format Data in Pandas?

To fix such cells, we have two options:

1. Remove the rows.

2. Convert all cells in the columns into the same format.

Converting Into a Correct Format: Observe the dataset below.


In the above dataset, we have two cells with the wrong format.

Look at rows 1 and 5: every cell in the 'Date' column should contain a string that represents a date.
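One way to locate such cells before converting anything is to parse the column against a single expected format and list the values that fail. The sketch below is only an illustration: the sample values and the expected format %Y/%m/%d are assumptions, not the contents of dummy2.csv.

```python
import pandas as pd

# Hypothetical sample values: index 1 is empty, index 3 has no separators.
dates = pd.Series(["2020/12/01", None, "2020/12/03", "20201226"])

# Parse against one assumed format; anything that does not match becomes NaT.
parsed = pd.to_datetime(dates, format="%Y/%m/%d", errors="coerce")

# Keep the original values of the cells that failed to parse.
suspect = dates[parsed.isna()]

print(suspect)
```

Here suspect holds both the empty cell and the value written without separators, which are the kinds of cells that need attention.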

Let's try to convert all cells in the 'Date' column into dates.

Pandas has a to_datetime() function for this.

Example 1: Convert to date.

Code

import pandas as pd

df = pd.read_csv('dummy2.csv')

df['Date'] = pd.to_datetime(df['Date'])

print(df.to_string())

the output will be


Note: The value in row 5 has been corrected, but row 1 now shows NaT (Not a Time), in other words an empty value. One way to deal with empty values is simply to remove the entire row.
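The conversion can also be tried without the CSV file. The following is a minimal sketch on a hypothetical in-memory DataFrame that mimics the situation described (an empty cell in row 1, a date without separators in row 5). It assumes that pandas 2.0 and later infer one format from the first value and raise an error on a differently formatted value unless format="mixed" is passed, while older versions parse each value on its own. The try/except is just one way to cover both cases.

```python
import pandas as pd

# Hypothetical stand-in for dummy2.csv: row 1 is empty, row 5 lacks separators.
df = pd.DataFrame({
    "Date": ["2020/12/01", None, "2020/12/03",
             "2020/12/04", "2020/12/05", "20201226"],
})

try:
    # pandas 2.0+: tell the parser that the formats differ from cell to cell.
    df["Date"] = pd.to_datetime(df["Date"], format="mixed")
except ValueError:
    # Older pandas does not know "mixed" and parses each value individually.
    df["Date"] = pd.to_datetime(df["Date"])

print(df.to_string())

# isna() marks the cells that ended up as NaT.
print(df["Date"].isna())
```

The empty cell becomes NaT, and isna() reports it just like any other missing value.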

Removing Rows: The conversion in the example above gave us a NaT value, which can be handled as a NULL value, so we can remove the row by using the dropna() method.

Example 2: Remove rows with a NULL value in the "Date" column

Code

import pandas as pd

df = pd.read_csv('dummy2.csv')

df['Date'] = pd.to_datetime(df['Date'])

df.dropna(subset=['Date'], inplace=True)

print(df.to_string())

the output will be
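Since dummy2.csv is not reproduced here, the same removal step can be sketched on a small in-memory DataFrame. The column names and values below are hypothetical. The sketch also uses the variant of dropna() that returns a new DataFrame instead of passing inplace=True, which leaves the original data available for comparison.

```python
import pandas as pd

# Hypothetical sample data; row 1 has no date.
df = pd.DataFrame({
    "Duration": [60, 45, 30],
    "Date": ["2020/12/01", None, "2020/12/03"],
})

# The empty cell becomes NaT during conversion.
df["Date"] = pd.to_datetime(df["Date"])

# Without inplace=True, dropna returns a new DataFrame and leaves df unchanged.
cleaned = df.dropna(subset=["Date"])

print(cleaned.to_string())
```

Only the rows with a valid date remain in cleaned, and they keep their original index labels. Calling cleaned.reset_index(drop=True) would renumber them if that is preferred.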