Please note, this is a STATIC archive of website www.w3resource.com from 19 Jul 2022, cach3.com does not collect or store any user information, there is no "phishing" involved.
w3resource

Pandas: Split the specified dataframe into groups based on customer id and create a list of order date for each group

Pandas Grouping and Aggregating: Split-Apply-Combine Exercise-10 with Solution

Write a Pandas program to split the following dataframe into groups based on customer id and create a list of order date for each group.

Test Data:

    ord_no  purch_amt    ord_date  customer_id  salesman_id
0    70001     150.50  2012-10-05         3005         5002
1    70009     270.65  2012-09-10         3001         5005
2    70002      65.26  2012-10-05         3002         5001
3    70004     110.50  2012-08-17         3009         5003
4    70007     948.50  2012-09-10         3005         5002
5    70005    2400.60  2012-07-27         3007         5001
6    70008    5760.00  2012-09-10         3002         5001
7    70010    1983.43  2012-10-10         3004         5006
8    70003    2480.40  2012-10-10         3009         5003
9    70012     250.45  2012-06-27         3008         5002
10   70011      75.29  2012-08-17         3003         5007
11   70013    3045.60  2012-04-25         3002         5001

Sample Solution:

Python Code :

import pandas as pd
pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)
df = pd.DataFrame({
'ord_no':[70001,70009,70002,70004,70007,70005,70008,70010,70003,70012,70011,70013],
'purch_amt':[150.5,270.65,65.26,110.5,948.5,2400.6,5760,1983.43,2480.4,250.45, 75.29,3045.6],
'ord_date': ['2012-10-05','2012-09-10','2012-10-05','2012-08-17','2012-09-10','2012-07-27','2012-09-10','2012-10-10','2012-10-10','2012-06-27','2012-08-17','2012-04-25'],
'customer_id':[3001,3001,3005,3001,3005,3001,3005,3001,3005,3001,3005,3005],
'salesman_id': [5002,5005,5001,5003,5002,5001,5001,5006,5003,5002,5007,5001]})
print("Original Orders DataFrame:")
print(df)
result = df.groupby('customer_id')['ord_date'].apply(list)
print("\nGroup on 'customer_id' and display the list of order dates in group wise:")
print(result)

Sample Output:

Original Orders DataFrame:
    ord_no  purch_amt    ord_date  customer_id  salesman_id
0    70001     150.50  2012-10-05         3001         5002
1    70009     270.65  2012-09-10         3001         5005
2    70002      65.26  2012-10-05         3005         5001
3    70004     110.50  2012-08-17         3001         5003
4    70007     948.50  2012-09-10         3005         5002
5    70005    2400.60  2012-07-27         3001         5001
6    70008    5760.00  2012-09-10         3005         5001
7    70010    1983.43  2012-10-10         3001         5006
8    70003    2480.40  2012-10-10         3005         5003
9    70012     250.45  2012-06-27         3001         5002
10   70011      75.29  2012-08-17         3005         5007
11   70013    3045.60  2012-04-25         3005         5001

Group on 'customer_id' and display the list of order dates in group wise:
customer_id
3001    [2012-10-05, 2012-09-10, 2012-08-17, 2012-07-2...
3005    [2012-10-05, 2012-09-10, 2012-09-10, 2012-10-1...
Name: ord_date, dtype: object

Python Code Editor:


Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to split a dataset to group by two columns and then sort the aggregated results within the groups.
Next: Write a Pandas program to split the following dataframe into groups and calculate monthly purchase amount.

What is the difficulty level of this exercise?

Test your Programming skills with w3resource's quiz.



Python: Tips of the Day

Find current directory and file's directory:

To get the full path to the directory a Python file is contained in, write this in that file:

import os 
dir_path = os.path.dirname(os.path.realpath(__file__))

(Note that the incantation above won't work if you've already used os.chdir() to change your current working directory, since the value of the __file__ constant is relative to the current working directory and is not changed by an os.chdir() call.)

To get the current working directory use

import os
cwd = os.getcwd()

Documentation references for the modules, constants and functions used above:

  • The os and os.path modules.
  • The __file__ constant
  • os.path.realpath(path) (returns "the canonical path of the specified filename, eliminating any symbolic links encountered in the path")
  • os.path.dirname(path) (returns "the directory name of pathname path")
  • os.getcwd() (returns "a string representing the current working directory")
  • os.chdir(path) ("change the current working directory to path")

Ref: https://bit.ly/3fy0R6m