Scikit Grid Search For KNN Regression ValueError: Array Contains NaN Or Infinity
Solution 1:
In addition to what you have tried, you can also check whether the following helps:

import numpy as np
features = np.nan_to_num(features)
rewards = np.nan_to_num(rewards)

This replaces all NaNs in your arrays with 0 (and infinities with very large finite values), and should at least make your algorithm run, unless the error occurs somewhere internal to the algorithm. Make sure there aren't too many non-numeric entries in your data, as setting them all to 0 may introduce strange biases into your estimates.
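As a quick sanity check before replacing anything, you can count the offending entries (a sketch, assuming features and rewards are NumPy arrays as above):

import numpy as np

# Count NaN and infinite entries so you know how much data nan_to_num will alter.
print("NaN in features:", np.isnan(features).sum())
print("inf in features:", np.isinf(features).sum())
print("NaN in rewards:", np.isnan(rewards).sum())
print("inf in rewards:", np.isinf(rewards).sum())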
If this is not the case and you are using weights='distance', then check whether any of the training samples are identical: duplicates have zero distance to each other, which causes a division by zero in the inverse-distance weights.
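One way to check for exact duplicates (a sketch; X_train is a placeholder name for your 2-D training feature array):

import numpy as np

# Rows that appear more than once have zero pairwise distance, which breaks
# inverse-distance weighting at prediction time.
n_duplicates = X_train.shape[0] - np.unique(X_train, axis=0).shape[0]
print(f"{n_duplicates} duplicate training samples found")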
If inverse distances are the cause of the division by zero, you can circumvent this by supplying your own weight function, e.g.
def better_inv_dist(dist):
    c = 1.
    return 1. / (c + dist)
and then use 'weights': better_inv_dist. You may need to adapt the constant c to the right scale. In any case, this avoids division by zero as long as c > 0.
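Put together with the grid search, a minimal sketch might look like this (the data, parameter grid, and variable names are placeholders, not taken from the original question):

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

def better_inv_dist(dist):
    # c > 0 keeps the denominator away from zero even when two samples
    # are identical (dist == 0).
    c = 1.
    return 1. / (c + dist)

# Placeholder data; substitute your own features/rewards arrays.
rng = np.random.RandomState(0)
features = rng.rand(100, 5)
rewards = rng.rand(100)

param_grid = {
    'n_neighbors': [3, 5, 10],
    'weights': ['uniform', better_inv_dist],  # callable instead of 'distance'
}
search = GridSearchCV(KNeighborsRegressor(), param_grid, cv=5)
search.fit(features, rewards)
print(search.best_params_)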
Solution 2:
I ran into the same problem with KNN regression on scikit-learn. I was using weights='distance', and that led to infinite values while computing the predictions (but not while fitting the KNN model, i.e. learning the appropriate KD-tree or ball tree). I switched to weights='uniform' and the program ran to completion correctly, indicating that the supplied weight function was the problem. If you want to use distance-based weights, supply a custom weight function that doesn't explode to infinity at zero distance, as indicated in eickenberg's answer.
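In grid-search terms, that simply means restricting the weights parameter to 'uniform' (a sketch; the estimator and grid values are placeholders):

from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Only uniform weighting is searched, so no inverse-distance division by zero
# can occur at prediction time.
param_grid = {'n_neighbors': [3, 5, 10], 'weights': ['uniform']}
search = GridSearchCV(KNeighborsRegressor(), param_grid, cv=5)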