Change X Labels In A Python Sklearn Partial Dependence Plot

I used normalized data to fit a GradientBoostingRegressor and plotted the partial dependencies for the 10 main variables. Now I want to plot them against the real, non-normali

Solution 1:

I found the solution, and it was quite obvious: axs contains all the axes objects as a list, so each one can be accessed by index. The axes of the first subplot is therefore axs[0], and its tick labels can be read with:

labels = [item.get_text() for item in axs[0].get_xticklabels()]

However, this did not work: in my case the labels were always empty, although values were displayed in the figure. I therefore used the axis limits and the following code to create the new, back-transformed labels:

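    import numpy as np
    import matplotlib.pyplot as plt

    # plot_partial_dependence is assumed to be the older scikit-learn
    # function that returns (fig, axs); clf, X and features come from the fit
    fig, axs = plot_partial_dependence(clf, X, features, feature_names=X.columns, grid_resolution=100)
    # take the x limits of the first subplot (still in normalized units)
    lims = plt.getp(axs[0], "xlim")
    # five evenly spaced tick positions across that range
    myxrange = np.linspace(lims[0], lims[1], 5)
    # mean and standard deviation that were used for the normalization
    mymean = mean4bactransform
    mysd = sd4bactransform
    # undo the normalization for the labels only: x * sd + mean
    newlabels = [str(round((myx * mysd) + mymean, 2)) for myx in myxrange]
    plt.setp(axs, xticks=myxrange, xticklabels=newlabels)
    fig.suptitle('Partial dependence')
    plt.subplots_adjust(top=0.9)  # tight_layout causes overlap with suptitle
    fig.set_size_inches(10.5, 7.5)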
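
The label computation can be isolated in a small helper, which makes it easier to check and to reuse per subplot. This is only a sketch: the function name backtransformed_ticks and the example mean and standard deviation are made up for illustration, and it assumes the data was normalized as (x - mean) / sd.

```python
def backtransformed_ticks(lo, hi, mean, sd, n_ticks=5, ndigits=2):
    """Return (positions, labels) for n_ticks evenly spaced ticks in [lo, hi].

    Positions stay in normalized units, because that is what the plot uses.
    Labels show the same points in original units: x * sd + mean.
    """
    step = (hi - lo) / (n_ticks - 1)
    positions = [lo + i * step for i in range(n_ticks)]
    labels = [str(round(p * sd + mean, ndigits)) for p in positions]
    return positions, labels


# Hypothetical example: normalized axis from -2 to 2, feature mean 50, sd 10
positions, labels = backtransformed_ticks(-2.0, 2.0, mean=50.0, sd=10.0)
print(positions)
print(labels)
```

The result can be passed to plt.setp(ax, xticks=positions, xticklabels=labels) for a single axes. Since each feature usually has its own mean, standard deviation and x range, it is probably safer to loop over axs and call the helper once per subplot than to apply the limits of axs[0] to every subplot.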

Solution 2:

If I understood correctly, you want to access the labels based on the feature importance.

If this is the case then you can do the following:

import numpy as np

# after fitting the model, use this to get the feature importances
feature_importance = clf.feature_importances_

# make importances relative to the max importance
feature_importance = 100.0 * (feature_importance / feature_importance.max())

# sort the importances and get the indices of the sorting
sorted_idx = np.argsort(feature_importance)

# match the indices with the column labels of the x matrix
# important: x must have column names (e.g. a pandas DataFrame) to do this
x.columns[sorted_idx]

This will give you the feature names in ascending order of importance: the first name is the least important feature and the last name is the most important one.
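
To see the ordering on concrete numbers, here is a small self-contained sketch. The importance values and column names are invented stand-ins for clf.feature_importances_ and x.columns; only the np.argsort indexing pattern is taken from the answer above.

```python
import numpy as np

# Hypothetical stand-ins for clf.feature_importances_ and x.columns
feature_importance = np.array([0.10, 0.50, 0.15, 0.25])
columns = np.array(["age", "income", "height", "weight"])

# scale so that the most important feature is 100
relative = 100.0 * feature_importance / feature_importance.max()

# argsort returns the indices that sort the array in ascending order
sorted_idx = np.argsort(relative)
names_ascending = columns[sorted_idx]

# reverse the index array to get the most important features first
top_two = columns[sorted_idx[::-1][:2]]

print(list(names_ascending))
print(list(top_two))
```

Reversing sorted_idx is one simple way to pick the top features, for example the 10 main variables to pass on to the partial dependence plot.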