Issue
I have a dataset like this:
import pandas as pd
dictionary = {'Month1': ['C1','C2',0,0,'C5'], 'Month2': ['C1','C2','C3','C4',0], 'Month3': ['C1','C2','C3','C4',0], 'Month4': [0,'C2','C3',0,0]}
df = pd.DataFrame(dictionary)
Month1 Month2 Month3 Month4
0 C1 C1 C1 0
1 C2 C2 C2 C2
2 0 C3 C3 C3
3 0 C4 C4 0
4 C5 0 0 0
I want to compare each row across all the columns of this DataFrame, i.e. check whether df.loc[0,'Month1'] is equal to df.loc[0,'Month2'], df.loc[0,'Month3'] and df.loc[0,'Month4']. The goal of this comparison is to find out from which month to which month the same string appears: I would like to put the first month in which the value appears in one column and the last month in which it is repeated in another. Something like this:
Month1 Month2 Month3 Month4 First appear Last appear
0 C1 C1 C1 0 Month1 Month3
1 C2 C2 C2 C2 Month1 Month4
2 0 C3 C3 C3 Month2 Month4
3 0 C4 C4 0 Month2 Month3
4 C5 0 0 0 Month1 Month1
I'm thinking of using np.where() or a loop, but I really don't know how to do it. Please help me.
Solution
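Use idxmax. df.ne(0) marks the non-zero cells, and idxmax along axis 1 returns the label of the first True column in each row; applying the same call to the columns in reverse order yields the last one: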
>>> df.assign(first_appear=df.ne(0).idxmax(axis=1),
...           last_appear=df.loc[:, df.columns[::-1]].ne(0).idxmax(axis=1))
Month1 Month2 Month3 Month4 first_appear last_appear
0 C1 C1 C1 0 Month1 Month3
1 C2 C2 C2 C2 Month1 Month4
2 0 C3 C3 C3 Month2 Month4
3 0 C4 C4 0 Month2 Month3
4 C5 0 0 0 Month1 Month1
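One caveat worth checking: for a row that contains only zeros, idxmax on an all-False mask still returns the first column label, which would be misleading. The sketch below is a self-contained variant of the same idea that leaves such rows as NaN. The helper name add_first_last and its empty parameter are hypothetical, introduced only for illustration, and it assumes (as in the sample data) that each row holds a single repeated value, so "first and last non-zero column" is what is wanted.

```python
import pandas as pd

df = pd.DataFrame({
    'Month1': ['C1', 'C2', 0, 0, 'C5'],
    'Month2': ['C1', 'C2', 'C3', 'C4', 0],
    'Month3': ['C1', 'C2', 'C3', 'C4', 0],
    'Month4': [0, 'C2', 'C3', 0, 0],
})

def add_first_last(frame, empty=0):
    """Hypothetical helper: add first/last column labels holding a non-empty value."""
    mask = frame.ne(empty)           # True where a cell holds a real value
    has_value = mask.any(axis=1)     # False for rows made up entirely of `empty`
    # idxmax on a boolean row returns the label of the first True column
    first = mask.idxmax(axis=1).where(has_value)
    # same call on the reversed column order returns the last True column
    last = mask.iloc[:, ::-1].idxmax(axis=1).where(has_value)
    return frame.assign(first_appear=first, last_appear=last)

result = add_first_last(df)
print(result)
```

If a row could contain different strings in different months, this would still report only the first and last non-zero column, so a per-value comparison would be needed instead.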
Answered By - rafaelc