Issue
The following code worked fine:
import pandas as pd
import numpy as np
df = pd.DataFrame({'from_date': ['01-10-2003', '15-12-2004', '01-08-2012', '09-07-2001'],
                   'to_date': ['01-11-2003', '15-12-2006', '01-09-2012', '09-12-2001']})
df['from_date'] = pd.to_datetime(df['from_date'], format='%d-%m-%Y')
df['to_date'] = pd.to_datetime(df['to_date'], format='%d-%m-%Y')
df['Months'] = (df['to_date']-df['from_date'])/np.timedelta64(1, 'M')
df
After updating the packages, it raises the following error:
ValueError: Unit M is not supported. Only unambiguous timedelta values durations are supported. Allowed units are 'W', 'D', 'h', 'm', 's', 'ms', 'us', 'ns'
Versions: pandas 2.1.4, NumPy 1.26.2
How can I now calculate the difference in months between two dates?
Solution
If you want to compute an exact difference in calendar months, convert both columns to monthly periods with to_period and subtract them:
df['Months_period'] = (df['to_date'].dt.to_period('M')
                       .sub(df['from_date'].dt.to_period('M'))
                       .apply(lambda x: x.n)
                       )
Output (Months_period compared with Months_30D, which uses 30 days as an approximation for a month):
   from_date    to_date  Months_30D  Months_period
0 2003-10-01 2003-11-01    1.033333              1
1 2004-12-15 2006-12-15   24.333333             24
2 2012-08-01 2012-09-01    1.033333              1
3 2001-07-09 2001-12-09    5.100000              5
4 2000-01-01 2023-01-01  280.033333            276
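The table above can be reproduced with something like the sketch below. The code for the Months_30D column is not shown in the answer, so dividing by pd.Timedelta(days=30) is an assumption, although it matches the values in the table. The fifth row (2000-01-01 to 2023-01-01) appears in the output but not in the question's data, so it is added here by hand.

```python
import pandas as pd

# The question's four rows plus the extra fifth row seen in the output table.
df = pd.DataFrame({
    'from_date': ['01-10-2003', '15-12-2004', '01-08-2012', '09-07-2001', '01-01-2000'],
    'to_date': ['01-11-2003', '15-12-2006', '01-09-2012', '09-12-2001', '01-01-2023'],
})
df['from_date'] = pd.to_datetime(df['from_date'], format='%d-%m-%Y')
df['to_date'] = pd.to_datetime(df['to_date'], format='%d-%m-%Y')

# Approximation (assumed formula): treat every month as exactly 30 days.
df['Months_30D'] = (df['to_date'] - df['from_date']) / pd.Timedelta(days=30)

# Calendar-month difference: subtract monthly periods, then read the
# integer count (.n) from each resulting month offset.
df['Months_period'] = (df['to_date'].dt.to_period('M')
                       .sub(df['from_date'].dt.to_period('M'))
                       .apply(lambda x: x.n)
                       )

print(df)
```

Note that the period difference only counts how many month boundaries lie between the two dates and ignores the day of the month, so 2003-10-31 to 2003-11-01 would also give 1. The 30-day approximation drifts over long intervals, as the last row shows.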
Answered By - mozway