Issue
When updating one DataFrame based on info from second one the NaN's are not transferred. With below code
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'A':[1,2,3,4,5], 'B':[4,5,6,7,8]})
df2 = pd.DataFrame({'A':[1,2,3], 'B':[7,np.nan,6]})
df1.update(df2)
The new DataFrame will look as below:
So the value from df2 is not transferred as this is a NaN
Normally this is probably an expected outcome but not in the particular case that I'm working on. For the moment how I deal with it is that I replace all NaNs with a value or string that signalizes to me that it's something I need to check but I was wondering if there is an actual solution to pass the NaN as an update?
Solution
What you could do is a bit more involved, but solves the problem as well without using fillna
or so:
df3 = df1.merge(df2, how='left', suffixes=('___',''), on=['A'], indicator=True)
df3['finalB'] = df3.apply(lambda x: x['B'] if x['_merge'] == 'both' else x['B___'], axis=1)
df3[['A','finalB']]
Output:
A finalB
0 1.0 7.0
1 2.0 NaN
2 3.0 6.0
3 4.0 7.0
4 5.0 8.0
Answered By - PrinsEdje80
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.