Issue
Let's say I have the calculation below:
import pandas as pd
dat = pd.DataFrame({'xx1' : [1,2,3], 'aa2' : ['qq', '4', 'd'], 'xx3' : [4,5,6]})
dat2 = (dat
.assign(xx1 = lambda x : [str(i) for i in x['xx1'].values])
.assign(xx3 = lambda x : [str(i) for i in x['xx3'].values])
)
Basically, I need to find the columns whose names match the pattern xx followed by a sequence of digits (i.e. xx1, xx2, xx3, etc.) and then apply some transformation to those columns (e.g. apply the str function).
One way to do this is as shown above, i.e. find those columns manually and transform each one. I wonder if there is any way to generalise this approach. I would prefer to use pipe, like above.
Any pointers would be very helpful.
Solution
You could do:
# Matches all columns starting with 'xx' with a sequence of numbers afterwards.
cols_to_transform = dat.columns[dat.columns.str.match('^xx[0-9]+$')]
# Transform to apply (column-wise).
transform_function = lambda c: c.astype(str)
# Copy first if you want a new DataFrame instead of modifying dat in place.
dat2 = dat.copy()
dat2[cols_to_transform] = dat2[cols_to_transform].transform(transform_function, axis=0)
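Since the question prefers a pipe-style chain, the same idea can probably be wrapped in a small helper and passed to pipe. This is only a sketch: the helper name astype_str_matching and its pattern parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

def astype_str_matching(df, pattern=r'^xx[0-9]+$'):
    """Hypothetical helper: return a new DataFrame with matching columns cast to str."""
    # Select the column names that match the regular expression.
    cols = df.columns[df.columns.str.match(pattern)]
    # assign returns a new DataFrame, so the input is left untouched.
    return df.assign(**{c: df[c].astype(str) for c in cols})

dat = pd.DataFrame({'xx1': [1, 2, 3], 'aa2': ['qq', '4', 'd'], 'xx3': [4, 5, 6]})
dat2 = dat.pipe(astype_str_matching)
```

Because the helper takes the DataFrame as its first argument, it slots into a longer chain of assign and pipe calls without an intermediate variable.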
To use it within assign:
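# Here I put a lambda to avoid precomputing all the transformations in the dict comprehension.
# col=col binds the current column name to each lambda; without it, every lambda would use the last column.
dat.assign(**{col: lambda df, col=col: df[col].astype(str) for col in cols_to_transform})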
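One thing to watch with lambdas built inside a comprehension: Python looks up col when the lambda is called, not when it is created, so unless the name is bound per lambda (for example through a default argument) every lambda ends up reading the last column. A small sketch of the difference on the question's data; the variable names late and bound are only for illustration.

```python
import pandas as pd

dat = pd.DataFrame({'xx1': [1, 2, 3], 'aa2': ['qq', '4', 'd'], 'xx3': [4, 5, 6]})
cols_to_transform = dat.columns[dat.columns.str.match('^xx[0-9]+$')]

# Late binding: each lambda reads `col` only when assign calls it,
# by which time the comprehension has finished and col is 'xx3'.
late = dat.assign(**{col: lambda df: df[col].astype(str) for col in cols_to_transform})

# Binding col as a default argument freezes its value for each lambda.
bound = dat.assign(**{col: lambda df, col=col: df[col].astype(str) for col in cols_to_transform})

print(late['xx1'].tolist())   # ['4', '5', '6'] (values taken from xx3)
print(bound['xx1'].tolist())  # ['1', '2', '3']
```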
Answered By - user2246849