Issue
My current understanding is that we can't directly transform or retrieve the y labels passed as (X, y) while using a Pipeline. The fit_transform at the end returns transformations only of the X passed, and y is only used by calls such as fit(), fit_predict(), and the like.
Is my understanding correct?
Also, is there a way to transform and retrieve y (including when dropping instances using a custom transformer) without having to break out of a fully enclosed model-training pipeline?
Solution
In general, your understanding is correct. Pipeline objects are meant for sequential application of several transformations of X. From the user guide:
Pipelines only transform the observed data (X).
Also have a look at the glossary entry for the term transform:
transform
In a transformer, transforms the input, usually only X, into some transformed space (conventionally notated as Xt).
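To make the quoted behavior concrete, here is a minimal sketch with made-up random data (the step names "scale" and "pca" are arbitrary choices, not anything from the question). It shows that fit_transform accepts y but hands back only the transformed X:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.integers(0, 2, size=20)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
])

# y is forwarded to each step's fit, but the return value is only Xt.
Xt = pipe.fit_transform(X, y)
print(type(Xt), Xt.shape)
```

The result is a single array with one row per input sample; nothing in the return value carries y.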
For regression tasks, there is a special TransformedTargetRegressor, which handles transforming the target y and can, for example, be used as the final step of a pipeline.
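As a rough sketch of how that might look, assuming a strictly positive target and a log transform (both are illustrative choices, as are the synthetic data and the step names):

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: log(y) is exactly linear in X, so y is strictly positive.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 2))
y = np.exp(X @ np.array([1.0, 2.0]) + 0.5)

model = Pipeline([
    ("scale", StandardScaler()),  # transforms X only
    ("reg", TransformedTargetRegressor(
        regressor=LinearRegression(),
        func=np.log,          # applied to y before fitting the regressor
        inverse_func=np.exp,  # applied to predictions, back on the original scale
    )),
])

model.fit(X, y)
pred = model.predict(X)  # predictions are on the original scale of y
```

Note that this transforms the values of y for fitting and inverts them at prediction time; it does not drop or reorder samples.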
Other than that, there is no canonical way to control transformations of y in a pipeline.
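For the row-dropping case from the question, one possible workaround is to filter X and y together before calling fit, which does mean that step lives outside the pipeline. The sketch below assumes the rows to drop are those containing NaN; drop_nan_rows is a hypothetical helper written for this example, not a scikit-learn function:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def drop_nan_rows(X, y):
    """Hypothetical helper: keep rows of X without NaN, plus the matching y."""
    mask = ~np.isnan(X).any(axis=1)
    return X[mask], y[mask]


# Toy data, purely for illustration.
X = np.array([
    [1.0, 2.0],
    [np.nan, 0.5],
    [3.0, 1.0],
    [0.0, 4.0],
    [2.0, np.nan],
    [5.0, 5.0],
])
y = np.array([0, 1, 1, 0, 1, 1])

# Filter X and y with the same mask so they stay aligned.
X_clean, y_clean = drop_nan_rows(X, y)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
pipe.fit(X_clean, y_clean)
```

Keeping the filtering in a plain function makes it explicit that the same mask is applied to both arrays, at the cost of one step that the pipeline itself does not manage.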
Answered By - afsharov