Issue
I have two data frames df1
and df2
. The df1
has repeated text wrt column name
but column hobby
changes. The df2
also has repeated text in the column name
. I want to merge both the data frames and keep everything.
df1:
name hobby
mike cricket
mike football
jack chess
jack football
jack vollyball
pieter sleeping
pieter cyclying
my df2
is
df2:
name
mike
pieter
jack
mike
pieter
Now I have to merge df2
with df1
on name
column
So my resultant df3
should look like this:
df3:
name hobby
mike cricket
mike football
pieter sleeping
pieter cyclying
jack chess
jack football
jack vollyball
mike cricket
mike football
pieter sleeping
pieter cyclying
Solution
You want to assign an order for df2
, merge on name
, then sort by the said order:
(df2.assign(rank=np.arange(len(df2)))
.merge(df1, on='name')
.sort_values('rank')
.drop('rank', axis=1)
)
Output:
name hobby
0 mike cricket
1 mike football
4 pieter sleeping
5 pieter cyclying
8 jack chess
9 jack football
10 jack vollyball
2 mike cricket
3 mike football
6 pieter sleeping
7 pieter cyclying
Answered By - Quang Hoang
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.