Probabilistic SVM, Regression

I've implemented a probabilistic SVM (at least I think so) for binary classes. Now I want to extend this approach to regression, and I'm trying to use it on the Boston data.

Solution 1:

There are several issues with your code.

  • To start with, what is taking forever is the first clf.fit (i.e. the grid search one), which is why you didn't see any change when you set max_iter and tol in your second clf.fit.

  • Second, the clf = SVR() part will not work, because:

    • You have to import it; without the import, SVR is not a recognized name
    • You have a bunch of illegal arguments in there (decision_function_shape, probability, random_state, etc.); check the docs for the admissible SVR arguments.
  • Third, you don't need to explicitly fit again with the best parameters; you should simply ask for refit=True in your GridSearchCV definition and subsequently use clf.best_estimator_ for your predictions (EDIT after comment: simply clf.predict will also work).
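As a quick way to check the second point without opening the docs, here is a minimal sketch that compares the constructor parameters of SVC and SVR via get_params(). The variable names are only illustrative, and the exact list printed depends on your scikit-learn version.

```python
from sklearn.svm import SVC, SVR

# Parameter names accepted by each estimator's constructor.
svr_params = set(SVR().get_params())
svc_params = set(SVC().get_params())

# Arguments that exist for the classifier but not for the regressor;
# passing any of these to SVR(...) raises a TypeError.
classifier_only = sorted(svc_params - svr_params)
print(classifier_only)
```

Anything that shows up in classifier_only has to be dropped when you switch from SVC to SVR.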

So, moving the stuff outside of any function definition, here is a working version of your code:

from sklearn.svm import SVR
# other imports as-is
# data loading & splitting as-is

param_C = [0.01, 0.1]
param_grid = {'C': param_C, 'kernel': ['poly', 'rbf'], 'gamma': [0.1, 0.01]}
clf = GridSearchCV(SVR(degree=5, max_iter=10000), cv=5, param_grid=param_grid, refit=True)
clf.fit(inputs_train, targets_train)
a = clf.best_estimator_.predict(inputs_test[0])
# a = clf.predict(inputs_test[0]) will also work
print(a)
# [ 21.89849792]

Apart from degree, all the other admissible argument values you are using are actually the respective default values, so the only arguments you really need in your SVR definition are degree and max_iter.

You'll get a couple of warnings (not errors), i.e. after fitting:

/databricks/python/lib/python3.5/site-packages/sklearn/svm/base.py:220: ConvergenceWarning: Solver terminated early (max_iter=10000). Consider pre-processing your data with StandardScaler or MinMaxScaler.

and after predicting:

/databricks/python/lib/python3.5/site-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample. DeprecationWarning)

which already contain some advice for what to do next...
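Following the advice in those two warnings could look roughly like the sketch below. It is not your original setup: the data is a synthetic stand-in from make_regression (an assumption, so it runs without the Boston set), the grid values are only illustrative, and the SVR keeps its default degree. The scaler goes inside a pipeline so that it is fitted on the training part of each CV fold, and the single test sample is reshaped to 2-D before predicting.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (an assumption, not the Boston set).
X, y = make_regression(n_samples=200, n_features=13, noise=10.0, random_state=0)
inputs_train, inputs_test, targets_train, targets_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scaling lives inside the pipeline, so every CV fold is scaled
# using statistics from its own training part only.
pipe = make_pipeline(StandardScaler(), SVR(max_iter=10000))

# make_pipeline names the SVR step "svr", hence the "svr__" prefix.
param_grid = {
    "svr__C": [0.1, 1.0],
    "svr__kernel": ["poly", "rbf"],
    "svr__gamma": [0.1, 0.01],
}
clf = GridSearchCV(pipe, param_grid=param_grid, cv=5, refit=True)
clf.fit(inputs_train, targets_train)

# One sample must be a 2-D array of shape (1, n_features).
single_sample = inputs_test[0].reshape(1, -1)
prediction = clf.predict(single_sample)
print(clf.best_params_, prediction.shape)
```

With refit=True, clf.predict already uses the best pipeline (scaler included), so the test sample is scaled the same way as the training data.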

Last but not least: a probabilistic classifier (i.e. one that produces probabilities instead of hard labels) is a valid thing, but a "probabilistic" regression model is not...
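In scikit-learn terms, the difference shows up in the estimator interface. A small sketch on synthetic data (an assumption, only for illustration): SVC exposes predict_proba when built with probability=True, while SVR has no such method and only returns continuous predictions.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, SVR

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# A probabilistic classifier: one probability per class for each sample.
svc = SVC(probability=True, random_state=0).fit(X, y)
proba = svc.predict_proba(X[:3])
print(proba.shape)

# The regressor offers no predict_proba at all.
has_proba_svr = hasattr(SVR(), "predict_proba")
print(has_proba_svr)
```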

Tested with Python 3.5 and scikit-learn 0.18.1
