Issue
I have created a simple script that uses OneHotEncoder.
from sklearn.preprocessing import OneHotEncoder
X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]
onehotencoder = OneHotEncoder(categories=[0])
X = onehotencoder.fit_transform(X).toarray()
I just want to apply fit_transform to the column at index 0 of X, that is, to the values [0, 0, 1, 2] that you can see in X. But it raises an error like this:
ValueError: Shape mismatch: if categories is an array, it has to be of shape (n_features,).
Can anyone help me solve this problem? I am stuck on it.
Solution
You need to use ColumnTransformer to specify the column index, not the categories parameter.
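The categories constructor parameter tells the encoder the distinct category values explicitly, with one list of values per feature. E.g. you could provide the values [0, 1, 2] for the first column explicitly, but 'auto' will determine them from the data. Further, when selecting columns in ColumnTransformer you can use a slice() object instead of a list of indices.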
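To illustrate what categories expects, here is a minimal sketch, assuming a recent scikit-learn version. It passes one list of values per column of X, which is the shape the error message asks for. The names encoder and encoded are only illustrative, and note that this encodes both columns rather than just column 0.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Keep the mixed int/str values as Python objects so 0, 1, 2 stay integers.
X = np.array([[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']], dtype=object)

# One list of category values per feature: X has two columns,
# so categories needs exactly two entries.
encoder = OneHotEncoder(categories=[[0, 1, 2], ['a', 'b']])
encoded = encoder.fit_transform(X).toarray()

# Three indicator columns for column 0 plus two for column 1.
print(encoded.shape)
```

This is why categories=[0] fails: it is read as the category list for a single feature, while X has two features.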
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]
ct = ColumnTransformer(
    # The column numbers to be transformed (here [0], but it can be e.g. [0, 1, 3])
    [('one_hot_encoder', OneHotEncoder(categories='auto'), [0])],
    # Leave the rest of the columns untouched
    remainder='passthrough'
)
X = ct.fit_transform(X)
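As a variation, the following sketch combines explicit category values with a slice() column selector. It assumes a recent scikit-learn version. The name X_encoded and the choice of sparse_threshold=0 (to always get a dense array back) are illustrative choices here, not part of the original answer.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Keep the mixed int/str values as Python objects so 0, 1, 2 stay integers.
X = np.array([[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']], dtype=object)

ct = ColumnTransformer(
    # slice(0, 1) selects only column 0; categories lists its values explicitly.
    [('one_hot_encoder', OneHotEncoder(categories=[[0, 1, 2]]), slice(0, 1))],
    remainder='passthrough',   # keep column 1 unchanged
    sparse_threshold=0,        # always return a dense array
)
X_encoded = ct.fit_transform(X)

# The encoded columns come first, followed by the passthrough column.
print(X_encoded)
```

The result should have four columns: three indicator columns for the values 0, 1 and 2, followed by the original string column.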
Answered By - TRiNE