Pandas: Find and replace the missing values in a given DataFrame which do not have any valuable information
Pandas Handling Missing Values: Exercise-4 with Solution
Write a Pandas program to find and replace the missing values in a given DataFrame which do not have any valuable information.
Example:
Missing values: ?, --
Replace those values with NaN
Test Data:
ord_no purch_amt ord_date customer_id salesman_id 0 70001 150.5 ? 3002 5002 1 NaN 270.65 2012-09-10 3001 5003 2 70002 65.26 NaN 3001 ? 3 70004 110.5 2012-08-17 3003 5001 4 NaN 948.5 2012-09-10 3002 NaN 5 70005 2400.6 2012-07-27 3001 5002 6 -- 5760 2012-09-10 3001 5001 7 70010 ? 2012-10-10 3004 ? 8 70003 12.43 2012-10-10 -- 5003 9 70012 2480.4 2012-06-27 3002 5002 10 NaN 250.45 2012-08-17 3001 5003 11 70013 3045.6 2012-04-25 3001 --
Sample Solution:
Python Code :
import pandas as pd
import numpy as np
pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)
df = pd.DataFrame({
'ord_no':[70001,np.nan,70002,70004,np.nan,70005,"--",70010,70003,70012,np.nan,70013],
'purch_amt':[150.5,270.65,65.26,110.5,948.5,2400.6,5760,"?",12.43,2480.4,250.45, 3045.6],
'ord_date': ['?','2012-09-10',np.nan,'2012-08-17','2012-09-10','2012-07-27','2012-09-10','2012-10-10','2012-10-10','2012-06-27','2012-08-17','2012-04-25'],
'customer_id':[3002,3001,3001,3003,3002,3001,3001,3004,"--",3002,3001,3001],
'salesman_id':[5002,5003,"?",5001,np.nan,5002,5001,"?",5003,5002,5003,"--"]})
print("Original Orders DataFrame:")
print(df)
print("\nReplace the missing values with NaN:")
result = df.replace({"?": np.nan, "--": np.nan})
print(result)
Sample Output:
Original Orders DataFrame: ord_no purch_amt ord_date customer_id salesman_id 0 70001 150.5 ? 3002 5002 1 NaN 270.65 2012-09-10 3001 5003 2 70002 65.26 NaN 3001 ? 3 70004 110.5 2012-08-17 3003 5001 4 NaN 948.5 2012-09-10 3002 NaN 5 70005 2400.6 2012-07-27 3001 5002 6 -- 5760 2012-09-10 3001 5001 7 70010 ? 2012-10-10 3004 ? 8 70003 12.43 2012-10-10 -- 5003 9 70012 2480.4 2012-06-27 3002 5002 10 NaN 250.45 2012-08-17 3001 5003 11 70013 3045.6 2012-04-25 3001 -- Replace the missing values with NaN: ord_no purch_amt ord_date customer_id salesman_id 0 70001.0 150.50 NaN 3002.0 5002.0 1 NaN 270.65 2012-09-10 3001.0 5003.0 2 70002.0 65.26 NaN 3001.0 NaN 3 70004.0 110.50 2012-08-17 3003.0 5001.0 4 NaN 948.50 2012-09-10 3002.0 NaN 5 70005.0 2400.60 2012-07-27 3001.0 5002.0 6 NaN 5760.00 2012-09-10 3001.0 5001.0 7 70010.0 NaN 2012-10-10 3004.0 NaN 8 70003.0 12.43 2012-10-10 NaN 5003.0 9 70012.0 2480.40 2012-06-27 3002.0 5002.0 10 NaN 250.45 2012-08-17 3001.0 5003.0 11 70013.0 3045.60 2012-04-25 3001.0 NaN
Python Code Editor:
Have another way to solve this solution? Contribute your code (and comments) through Disqus.
Previous: Write a Pandas program to count the number of missing values in each column of a given DataFrame.
Next: Write a Pandas program to drop the rows where at least one element is missing in a given DataFrame.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
Python: Tips of the Day
Find current directory and file's directory:
To get the full path to the directory a Python file is contained in, write this in that file:
import os dir_path = os.path.dirname(os.path.realpath(__file__))
(Note that the incantation above won't work if you've already used os.chdir() to change your current working directory, since the value of the __file__ constant is relative to the current working directory and is not changed by an os.chdir() call.)
To get the current working directory use
import os cwd = os.getcwd()
Documentation references for the modules, constants and functions used above:
- The os and os.path modules.
- The __file__ constant
- os.path.realpath(path) (returns "the canonical path of the specified filename, eliminating any symbolic links encountered in the path")
- os.path.dirname(path) (returns "the directory name of pathname path")
- os.getcwd() (returns "a string representing the current working directory")
- os.chdir(path) ("change the current working directory to path")
Ref: https://bit.ly/3fy0R6m
- New Content published on w3resource:
- HTML-CSS Practical: Exercises, Practice, Solution
- Java Regular Expression: Exercises, Practice, Solution
- Scala Programming Exercises, Practice, Solution
- Python Itertools exercises
- Python Numpy exercises
- Python GeoPy Package exercises
- Python Pandas exercises
- Python nltk exercises
- Python BeautifulSoup exercises
- Form Template
- Composer - PHP Package Manager
- PHPUnit - PHP Testing
- Laravel - PHP Framework
- Angular - JavaScript Framework
- Vue - JavaScript Framework
- Jest - JavaScript Testing Framework