Pandas: Split a given dataframe into groups with bin counts
Pandas Grouping and Aggregating: Split-Apply-Combine Exercise-18 with Solution
Write a Pandas program to split a given dataframe into groups by customer_id and count how many rows of each group fall into each of three equal-width sales_id bins.
Test Data:
    ord_no  purch_amt  customer_id  sales_id
0    70001     150.50         3005      5002
1    70009     270.65         3001      5003
2    70002      65.26         3002      5004
3    70004     110.50         3009      5003
4    70007     948.50         3005      5002
5    70005    2400.60         3007      5001
6    70008    5760.00         3002      5005
7    70010    1983.43         3004      5007
8    70003    2480.40         3009      5008
9    70012     250.45         3008      5004
10   70011      75.29         3003      5005
11   70013    3045.60         3002      5001
Sample Solution:
Python Code:
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df = pd.DataFrame({
    'ord_no': [70001, 70009, 70002, 70004, 70007, 70005, 70008, 70010, 70003, 70012, 70011, 70013],
    'purch_amt': [150.5, 270.65, 65.26, 110.5, 948.5, 2400.6, 5760, 1983.43, 2480.4, 250.45, 75.29, 3045.6],
    'customer_id': [3005, 3001, 3002, 3009, 3005, 3007, 3002, 3004, 3009, 3008, 3003, 3002],
    'sales_id': [5002, 5003, 5004, 5003, 5002, 5001, 5005, 5007, 5008, 5004, 5005, 5001]})
print("Original DataFrame:")
print(df)
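# Bin sales_id into 3 equal-width intervals, then group by customer_id and bin
groups = df.groupby(['customer_id', pd.cut(df.sales_id, 3)])
# Count the rows in each (customer_id, bin) pair and pivot the bins into columns
result = groups.size().unstack()
print(result)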
Sample Output:
Original DataFrame:
    ord_no  purch_amt  customer_id  sales_id
0    70001     150.50         3005      5002
1    70009     270.65         3001      5003
2    70002      65.26         3002      5004
3    70004     110.50         3009      5003
4    70007     948.50         3005      5002
5    70005    2400.60         3007      5001
6    70008    5760.00         3002      5005
7    70010    1983.43         3004      5007
8    70003    2480.40         3009      5008
9    70012     250.45         3008      5004
10   70011      75.29         3003      5005
11   70013    3045.60         3002      5001
sales_id     (5000.993, 5003.333]  (5003.333, 5005.667]  (5005.667, 5008.0]
customer_id
3001                          1.0                   NaN                 NaN
3002                          1.0                   2.0                 NaN
3003                          NaN                   1.0                 NaN
3004                          NaN                   NaN                 1.0
3005                          2.0                   NaN                 NaN
3007                          1.0                   NaN                 NaN
3008                          NaN                   1.0                 NaN
3009                          1.0                   NaN                 1.0
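To see what the binning step contributes on its own, here is a minimal sketch that uses only the customer_id and sales_id columns of the test data. The names sales_bin, rows_per_bin and counts are illustrative and not part of the original solution. Passing observed=True and fill_value=0 is one possible variation, chosen here so that empty combinations show up as 0 rather than NaN; it is not the exercise's own approach.

```python
import pandas as pd

# Same customer_id and sales_id values as the exercise's test data.
df = pd.DataFrame({
    'customer_id': [3005, 3001, 3002, 3009, 3005, 3007, 3002, 3004, 3009, 3008, 3003, 3002],
    'sales_id': [5002, 5003, 5004, 5003, 5002, 5001, 5005, 5007, 5008, 5004, 5005, 5001],
})

# pd.cut assigns each sales_id to one of 3 equal-width intervals.
sales_bin = pd.cut(df['sales_id'], bins=3)

# Rows per bin across the whole frame, listed in bin order.
rows_per_bin = sales_bin.value_counts().sort_index()
print(rows_per_bin)

# Rows per (customer_id, bin) pair. observed=True is passed explicitly so the
# result does not rely on the pandas default, and fill_value=0 replaces
# missing combinations with 0 instead of NaN.
counts = (
    df.groupby(['customer_id', sales_bin], observed=True)
      .size()
      .unstack(fill_value=0)
)
print(counts)
```

With this data the three bins hold 6, 4 and 2 rows respectively, and the grouped table keeps an integer dtype because no NaN values are introduced.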
Python: Tips of the Day
Find current directory and file's directory:
To get the full path to the directory a Python file is contained in, write this in that file:
import os
dir_path = os.path.dirname(os.path.realpath(__file__))
(Note that the line above may not work if you have already used os.chdir() to change your current working directory: when __file__ holds a relative path, as it can in Python versions before 3.9, that path is relative to the working directory at startup and is not updated by an os.chdir() call.)
To get the current working directory use
import os
cwd = os.getcwd()
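The difference between the two directories can be sketched as follows. Because __file__ is not defined in every context (an interactive session, for example), this sketch uses a temporary file as a stand-in for a script; containing_dir, script.py and elsewhere are hypothetical names chosen for illustration. Inside a real script you would call containing_dir(__file__) instead.

```python
import os
import tempfile


def containing_dir(path):
    """Return the directory that holds `path`, with symbolic links resolved."""
    return os.path.dirname(os.path.realpath(path))


original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for a script file.
    script = os.path.join(tmp, 'script.py')
    with open(script, 'w') as f:
        f.write('')

    # A second directory to switch into.
    elsewhere = os.path.join(tmp, 'elsewhere')
    os.mkdir(elsewhere)

    # Resolve the file's directory from its absolute path.
    script_dir = containing_dir(script)

    try:
        # Changing the working directory changes what os.getcwd() reports,
        # but not the directory the file lives in.
        os.chdir(elsewhere)
        cwd_after = os.getcwd()
    finally:
        # Restore the working directory before the temporary tree is removed.
        os.chdir(original_cwd)

print(script_dir)
print(cwd_after)
```

The two printed paths differ: the first is the directory containing the file, the second is wherever the process was working at the time of the call.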
Documentation references for the modules, constants and functions used above:
- The os and os.path modules.
- The __file__ constant
- os.path.realpath(path) (returns "the canonical path of the specified filename, eliminating any symbolic links encountered in the path")
- os.path.dirname(path) (returns "the directory name of pathname path")
- os.getcwd() (returns "a string representing the current working directory")
- os.chdir(path) ("change the current working directory to path")
Ref: https://bit.ly/3fy0R6m