Issue
I want to square a set of variables, 'x1' and 'x2', using FunctionTransformer and rename the output features with the prefix 's2_'. For example, 'x1' would become 's2_x1' in the transformed data set. I have the following code:
import pandas as pd
from sklearn.preprocessing import FunctionTransformer
from sklearn import set_config
set_config(transform_output='pandas') # This ensures the transformed output is a dataframe
df = pd.DataFrame(
    {
        'x1': [1, 2, 3],
        'x2': [2, 3, 4]
    }
)
my_transformer = FunctionTransformer(lambda x: x**2,
                                     feature_names_out=lambda x: [f's2_{c}' for c in x])
transformed_df = my_transformer.fit_transform(df)
transformed_df
However, the renaming did not work as expected: the column names remained 'x1' and 'x2'. How should I fix the code?
Solution
According to the FunctionTransformer docs, a callable passed as the feature_names_out argument must take two positional arguments: this FunctionTransformer (self) and an array-like of input feature names (input_features).
In your case, the callable was missing the first argument, the one for the FunctionTransformer itself. Try this:
my_transformer = FunctionTransformer(lambda x: x ** 2,
                                     feature_names_out=lambda self, input_features: [f's2_{c}' for c in input_features])
Result:
>>> transformed_df
   s2_x1  s2_x2
0      1      4
1      4      9
2      9     16
Answered By - Georgi