Issue
I have a multiindex dataframe in the form of :
level0 | level1 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
idx1 | name1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
name2 | 1.0 | 2.0 | 5.0 | 4.0 | 2.0 | |
name3 | 1.0 | 4.0 | 2.0 | 6.0 | 8.0 | |
idx2 | name1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
name2 | 4.0 | 2.0 | 9.0 | 5.0 | 5.0 | |
name3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
idx3 | name1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
name2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
name3 | 4.0 | 7.0 | 6.0 | 5.0 | 1.0 |
I would like to drop the rows where I have zeroes across all index of level 0. With the given exemple, (idx1, name1), (idx2, name1) and (idx3, name1) shoud be dropped.
I have tried :
tmp.loc[~(tmp==0).all(axis=1)]
But it drops all rows with zeroes...
I am not able to figure out the proper mask to get what I want. Can someone help ?
Solution
If I understand correctly, you further need to apply a groupby.transform
to test homogeneity within a level:
tmp[(tmp!=0).any(axis=1).groupby(level='level1').transform('any')]
Which is identical to:
tmp[~(tmp==0).all(axis=1).groupby(level='level1').transform('all')]
Output:
1 2 3 4 5
level0 level1
idx1 name2 1.0 2.0 5.0 4.0 2.0
name3 1.0 4.0 2.0 6.0 8.0
idx2 name2 4.0 2.0 9.0 5.0 5.0
name3 0.0 0.0 0.0 0.0 0.0
idx3 name2 0.0 0.0 0.0 0.0 0.0
name3 4.0 7.0 6.0 5.0 1.0
Intermediates:
1 2 3 4 5 (tmp!=0).any(axis=1) (…).groupby(level='level1').transform('any')
level0 level1
idx1 name1 0.0 0.0 0.0 0.0 0.0 False False
name2 1.0 2.0 5.0 4.0 2.0 True True
name3 1.0 4.0 2.0 6.0 8.0 True True
idx2 name1 0.0 0.0 0.0 0.0 0.0 False False
name2 4.0 2.0 9.0 5.0 5.0 True True
name3 0.0 0.0 0.0 0.0 0.0 False True
idx3 name1 0.0 0.0 0.0 0.0 0.0 False False
name2 0.0 0.0 0.0 0.0 0.0 False True
name3 4.0 7.0 6.0 5.0 1.0 True True
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.