One Hot Encoding Preserve The NAs For Imputation
I am trying to use KNN to impute categorical variables in Python. A typical way to do this is to one-hot encode the variables first. However sklearn OneHotEncoder() doe
Solution 1:
Handling of missing values in OneHotEncoder ended up getting merged in PR17317, but it operates by just treating the missing values as a new category (there is no option for other treatments, if I understand correctly).
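To see that behaviour, here is a minimal sketch, assuming a scikit-learn version that includes the change from that PR. The `color` column and its values are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one categorical column with a missing value.
X = pd.DataFrame({"color": ["red", "blue", np.nan, "red"]})

enc = OneHotEncoder()
# The default output is a sparse matrix; densify it so it is easy to inspect.
onehot = enc.fit_transform(X).toarray()

# NaN shows up as its own entry in the learned categories,
# so the missing row gets a 1 in a dedicated column instead of staying NaN.
print(enc.categories_[0])
print(onehot)
```

The missing row is therefore encoded like any other category, which is not what you want if a later imputer is supposed to fill it in.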
One manual approach is described in this answer. Its first step, filling the missing values with custom text, isn't strictly necessary now because of the above PR, but a recognizable placeholder may make the resulting "missing" column easier to find.
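A rough sketch of what such a manual approach could look like, using `pandas.get_dummies` rather than `OneHotEncoder`. This is not necessarily the linked answer's exact code. The helper `one_hot_keep_na`, the `__missing__` placeholder and the toy data are all hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def one_hot_keep_na(series, placeholder="__missing__"):
    """One-hot encode a Series, leaving NaN in every dummy column
    for the rows that were missing in the original data."""
    # Step 1: mark missing values with a recognizable placeholder.
    filled = series.fillna(placeholder)
    # Step 2: ordinary one-hot encoding (float dtype so NaN can be stored).
    dummies = pd.get_dummies(filled, prefix=series.name, dtype=float)
    # Step 3: find the placeholder column, blank out those rows, drop the column.
    missing_col = f"{series.name}_{placeholder}"
    if missing_col in dummies.columns:
        mask = dummies[missing_col] == 1
        dummies = dummies.drop(columns=missing_col)
        dummies.loc[mask, :] = np.nan
    return dummies


# Hypothetical toy data: one categorical and one numeric column.
df = pd.DataFrame({
    "color": ["red", "blue", np.nan, "red", "blue", "red"],
    "size": [1.0, 2.0, 1.1, 0.9, 2.1, 1.0],
})

encoded = pd.concat([one_hot_keep_na(df["color"]), df[["size"]]], axis=1)

# KNNImputer fills the NaN dummy cells from the nearest rows.
imputed = pd.DataFrame(
    KNNImputer(n_neighbors=2).fit_transform(encoded),
    columns=encoded.columns,
    index=encoded.index,
)
print(imputed)
```

KNNImputer averages the neighbours' values, so the imputed dummy columns can be fractional. One option is to take the column with the largest value per row to recover a single category.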