Issue
In excel if we want to use VLOOKUP on new column, we need to define lookup value, table array & index column number that we need and then the value fill the column that we need. If we want to do it the same with python, how we execute this?
for example, first and second dataframe
data01 = pd.DataFrame({'Code Id':['AA-103', 'BB-203', 'CC-303', 'DD-403'], 'Area':['AA', 'BB', 'CC', 'DD'], 'Sub-Area':['AA', 'BB', 'CC-1', 'DD-3']})
data02 = pd.DataFrame({'Code Id':['AA-103', 'BB-203', 'CC-505', 'FF-606'], 'Area':['AA', 'BB', 'EE', 'FF']})
and then the expected output such a like this
data03 = pd.DataFrame({'Code Id':['AA-103', 'BB-203', 'EE-505', 'FF-606'], 'Area':['AA', 'BB', 'EE', 'FF'], 'Sub-Area':['AA', 'BB', 'Na', 'Na']})
so it's like we put new column in second dataframe based on new contract, not make a new dataframe based both of them. Any idea?
Solution
One of the ways to do it is to use pandas.merge
with a left-join:
Left join: It provides all the rows from the first dataframe and will match rows from the second dataframe. Every row not found/matched in the second dataframe will be replaced by NaN (
vlookup()
will put #N/A instead).
data03 = data02.merge(data01[['Code Id', 'Sub-Area']], on='Code Id', how='left')
>>> print(data03)
Note: there is a small error in your data02
, the third Code Id
has to equal 'EE-505' and not 'CC-505'.
Answered By - L'Artiste
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.