Issue
I have a pandas dataframe as below which details additional calls into an area:
CommsDate | Area | Day0 Incremental | Day1 Incremental | Day2 Incremental |
---|---|---|---|---|
01/01/24 | Sales | 43 | 36 | 29 |
01/01/24 | Service | 85 | 74 | 66 |
02/01/24 | Sales | 56 | 42 | 31 |
02/01/24 | Service | 73 | 62 | 49 |
03/01/24 | Sales | 48 | 32 | 24 |
03/01/24 | Service | 67 | 58 | 46 |
I am trying to calculate by date how many calls were received, so calls received for Sales on 1st Jan would be Day0_incremental for that date (43), 2nd Jan would be Day0 for 2nd Jan plus Day1 for 1st Jan (36+56) and 3rd Jan would be Day0 for 3rd Jan plus Day1 for 2nd Jan plus Day2 for 1st Jan (48+42+29), resulting in the below dataframe:
CallDate | Sales | Service |
---|---|---|
01/01/24 | 43 | 85 |
02/01/24 | 92 | 147 |
03/01/24 | 119 | 195 |
04/01/24 | 63 | 107 |
05/01/24 | 24 | 46 |
I have managed to create a shell of a dataframe for the second table with no values under the area columns, but at a loss as to the next steps:
df['CommsDate'] = pd.to_datetime(df['CommsDate'], format='%d/%m/%Y')
areaUnique = df['Area'].unique().tolist()
from datetime import timedelta
CallDate = pd.date_range(start=min(df['CommsDate']), end=max(df['CommsDate'])+timedelta(days=6), freq='D')
data = {area: [] for area in areaUnique}
dfNew = pd.DataFrame(data)
dfNew['CallDate'] = CallDate
dfNew = dfNew.melt(id_vars=['CallDate'], var_name='Area')
dfNew = dfNew.pivot(index='CallDate', columns='Area', values='value')
dfNew = dfNew.reset_index()
dfNew = dfNew[['CallDate'] + areaUnique]
I have started to write a for loop, but I've only got this far:
for i in range(1,len(areaUnique)+1):
dfNew.columns(i) =
Solution
df['CommsDate'] = pd.to_datetime(df['CommsDate'], dayfirst=True)
tmp = df.pivot(index='CommsDate', columns='Area')
out = (tmp['Day0 Incremental']
.add(tmp['Day1 Incremental'].shift(freq='1d'), fill_value=0)
.add(tmp['Day2 Incremental'].shift(freq='2d'), fill_value=0)
.reset_index().rename_axis(columns=None)
)
Or, programmatically with functools.reduce
using the extracted number from the DayX …
string:
from functools import reduce
import re
reg = re.compile(r'Day(\d+)')
df['CommsDate'] = pd.to_datetime(df['CommsDate'], dayfirst=True)
tmp = df.pivot(index='CommsDate', columns='Area')
out = reduce(lambda a,b: a.add(b, fill_value=0),
(tmp[d].shift(freq=f'{reg.search(d).group(1)}d') for d in
tmp.columns.get_level_values(0).unique())
).reset_index().rename_axis(columns=None)
Output:
CommsDate Sales Service
0 2024-01-01 43.0 85.0
1 2024-01-02 92.0 147.0
2 2024-01-03 119.0 195.0
3 2024-01-04 63.0 107.0
4 2024-01-05 24.0 46.0
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.