Wednesday, December 20, 2023

[FIXED] How do I check if one column value contains another column value and vice versa - Pandas

December 20, 2023 dataframe, pandas, python, python-3.x, string No comments

Issue

I'm working with string columns on a pandas dataframe and would like to check if I can check if one of 2 columns contains the string in the match column (& check both ways)

Column1       | Column2       |  Match_Column
---------------------------------------------
Customer1     | Customer1     |  Customer1 LLC
              | Customer2     |  Customer2 LLC
Customer3     |               |  Customer3 LLC
              | Customer4 LLC |  Customer4
Customer5 LLC |               |  Customer5
Customer6 LLC |               |  Customer8
              | Customer9 LLC |  Customer4

The expected output is as follows

Column1       | Column2       | Match_Column  | is_Match
----------------------------------------------------------
Customer1     | Customer1     | Customer1 LLC | Yes
NaN           | Customer2     | Customer2 LLC | Yes
Customer3     | NaN           | Customer3 LLC | Yes
NaN           | Customer4 LLC | Customer4     | Yes
Customer5 LLC | NaN           | Customer5     | Yes
Customer6 LLC | NaN           | Customer8     | No
NaN           | Customer9 LLC | Customer4     | No
NaN           | NaN           | Customer4     | No

I tried this approach, but did not seem to work as expected

df.loc[(df['Column1'].isin(df['Match_Column'])) | 
       (df['Match_Column'].isin(df['Column1'])) | 
       (df['Match_Column'].isin(match_df['Column2'])) |
       (df['Column2'].isin(df['Match_Column'])),'is_Match'] = 'Yes'

Solution

Use numpy.where with in statement for test per rows:

df['is_Match'] = np.where([(a in c) or (b in c) or (c in a) or (c in b) for a,b,c  
                           in zip(df['Column1'].fillna('_'), df['Column2'].fillna('_'), 
                                  df['Match_Column'].fillna('nodata'))], 
                          'Yes', 'No')
print (df)
         Column1        Column2   Match_Column is_Match
0      Customer1      Customer1  Customer1 LLC      Yes
1            NaN      Customer2  Customer2 LLC      Yes
2      Customer3            NaN  Customer3 LLC      Yes
3            NaN  Customer4 LLC      Customer4      Yes
4  Customer5 LLC            NaN      Customer5      Yes
5  Customer6 LLC            NaN      Customer8       No
6            NaN  Customer9 LLC      Customer4       No
7            NaN            NaN      Customer4       No

Answered By - jezrael

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Wednesday, December 20, 2023

[FIXED] How do I check if one column value contains another column value and vice versa - Pandas

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels