H2O Target Mean Encoder "frames Are Being Sent In The Same Order" ERROR
I am following the H2O example to run target mean encoding in Sparkling Water (Sparkling Water 2.4.2 and H2O 3.22.04). It runs well in all the following paragraph from h2o.targetenco
Solution 1:
The issue occurs because you are trying to encode multiple categorical features at once. I think this is a bug in H2O, but you can work around it by putting the transformer in a for loop that iterates over the categorical column names.
import pandas as pd
import h2o
from h2o.targetencoder import TargetEncoder

h2o.init()

# Small example frame: three categorical features and a binary target.
df = pd.DataFrame({
    'x_0': ['a'] * 5 + ['b'] * 5,
    'x_1': ['c'] * 9 + ['d'] * 1,
    'x_2': ['a'] * 3 + ['b'] * 7,
    'y_0': [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
})
hf = h2o.H2OFrame(df)
hf['cv_fold_te'] = hf.kfold_column(n_folds=2, seed=54321)
hf['y_0'] = hf['y_0'].asfactor()

cat_features = ['x_0', 'x_1', 'x_2']

# Encode one categorical column per pass to avoid the assertion error.
for item in cat_features:
    target_encoder = TargetEncoder(x=[item], y='y_0', fold_column='cv_fold_te')
    target_encoder.fit(hf)
    hf = target_encoder.transform(frame=hf, holdout_type='kfold',
                                  seed=54321, noise=0.0)
hf
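To make clear what each pass of such a loop computes, here is a minimal pure-Python sketch of k-fold target mean encoding applied one column at a time. It is not H2O's implementation: the helper `kfold_target_mean`, the fixed fold assignment, and the fallback to the out-of-fold global mean for unseen levels are all assumptions made for illustration, and H2O's blending and noise options are left out.

```python
def kfold_target_mean(categories, targets, folds):
    """Hypothetical helper: out-of-fold target mean per category level.

    Each row is encoded using only rows from other folds. If the row's
    level does not appear outside its fold, the out-of-fold global mean
    is used instead (an assumption, not necessarily H2O's behaviour).
    """
    encoded = []
    for cat, fold in zip(categories, folds):
        # Rows that belong to a different fold than the current row.
        out_of_fold = [(c, t) for c, t, f in zip(categories, targets, folds)
                       if f != fold]
        same_level = [t for c, t in out_of_fold if c == cat]
        if same_level:
            encoded.append(sum(same_level) / len(same_level))
        else:
            all_targets = [t for _, t in out_of_fold]
            encoded.append(sum(all_targets) / len(all_targets))
    return encoded


data = {
    'x_0': ['a'] * 5 + ['b'] * 5,
    'x_1': ['c'] * 9 + ['d'] * 1,
    'x_2': ['a'] * 3 + ['b'] * 7,
}
y = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
folds = [0, 1] * 5  # assumed fixed fold assignment, for reproducibility

# One column per pass, mirroring the per-feature loop in the workaround.
encoded = {}
for name, values in data.items():
    encoded[name + '_te'] = kfold_target_mean(values, y, folds)
```

Because each column is encoded independently of the others, looping over the features one at a time should give the same encodings as a single multi-column call would.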
Solution 2:
Thanks, everyone, for letting us know. The assertion was a precaution, as I was not sure whether the order could change. The rest of the code was written with that possibility in mind and is therefore safe to use with a changed order anyway, but the assertion was left in and forgotten. I added a test and removed the assertion. The issue is now fixed and merged, and the fix should be available in the upcoming fix release. 0xdata.atlassian.net/browse/PUBDEV-6474