
How To Perform Standardization On The Data In Gridsearchcv?

How do I standardize the data when using GridSearchCV? Here is the code. I have no idea how to do it. import dataset import warnings warnings.filterwarnings('ignore') import

Solution 1:

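Use sklearn.pipeline.Pipeline. With the StandardScaler as a step inside the pipeline, GridSearchCV fits the scaler on the training part of each fold only, so nothing from the validation fold leaks into the scaling.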

Demo:

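import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler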

X_train, X_test, y_train, y_test = \
        train_test_split(X, y, test_size=0.33)

pipe = Pipeline([
    ('scale', StandardScaler()),
    ('clf', LogisticRegression())
])

param_grid = [
    {
        'clf__solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'],
        'clf__C': np.logspace(-3, 1, 5),
    },
]

grid = GridSearchCV(pipe, param_grid=param_grid, cv=3, n_jobs=-1, verbose=2)
grid.fit(X_train, y_train)
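The demo above relies on X and y from the question, which are not shown. As a minimal runnable sketch, the version below swaps in synthetic data from make_classification (an assumption for illustration, not the asker's dataset), narrows the grid to C only, and sets max_iter=1000 so the solver converges quietly. It also checks that the scaler inside the refitted best pipeline was fit on the training data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's X and y (assumption for illustration).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

# The scaler is a pipeline step, so it is refit on each training fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# "<step name>__<parameter>" addresses a parameter of a pipeline step.
param_grid = {"clf__C": np.logspace(-3, 1, 5)}

grid = GridSearchCV(pipe, param_grid=param_grid, cv=3)
grid.fit(X_train, y_train)

best_params = grid.best_params_
# score() scales X_test with the training statistics before predicting.
test_accuracy = grid.score(X_test, y_test)
# With refit=True (the default) the best pipeline is refit on all of X_train.
fitted_scaler = grid.best_estimator_.named_steps["scale"]

print(best_params)
print(test_accuracy)
```

Because the test set passes through the same fitted pipeline, there is no need to call StandardScaler on X_test by hand.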

Solution 2:

If you use refit=True, GridSearchCV refits the best parameter combination on the whole training set, so you can call predict directly on the fitted search object. You can also use cv_results_ to find the best row by rank_test_score and then extract its parameters from the params column. If the search space becomes large, use RandomizedSearchCV instead, which tries a fixed number of sampled candidates rather than every combination.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

pipe = Pipeline([
    ('scale', StandardScaler()),
    ('clf', LogisticRegression())
])

param_grid = [
    {
        'clf__solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'],
        'clf__C': np.logspace(-3, 1, 5),
    },
]

grid_class = GridSearchCV(
    estimator=pipe,
    param_grid=param_grid,
    scoring='accuracy',
    n_jobs=4,  # use 4 cores
    cv=10,  # 10 folds
    refit=True,
    return_train_score=True)

grid_class.fit(X_train, y_train)

# refit=True: predict() uses the best pipeline, refit on all of X_train
predictions = grid_class.predict(X_test)

cv_results_df = pd.DataFrame(grid_class.cv_results_)

# row(s) with the best mean cross-validated score
best_row = cv_results_df[cv_results_df["rank_test_score"] == 1]
print(best_row)

# parameter settings of every candidate that was tried
params_column = cv_results_df.loc[:, ['params']]
print(params_column)
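The randomized search this answer recommends for larger search spaces is available in scikit-learn as RandomizedSearchCV. The sketch below is one way it could look with the same pipeline. The synthetic data, the loguniform range for C, the two-solver list and n_iter=8 are all assumptions chosen for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's X and y (assumption for illustration).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Distributions are sampled; lists are drawn from uniformly.
param_distributions = {
    "clf__C": loguniform(1e-3, 1e1),
    "clf__solver": ["lbfgs", "liblinear"],
}

search = RandomizedSearchCV(
    pipe,
    param_distributions=param_distributions,
    n_iter=8,  # number of sampled candidates, not the full grid
    scoring="accuracy",
    cv=5,
    refit=True,
    random_state=0,
)
search.fit(X_train, y_train)

predictions = search.predict(X_test)
# best_params_ is a shortcut for the params of the rank-1 row in cv_results_.
best_params = search.best_params_
n_candidates = len(search.cv_results_["params"])

print(best_params)
```

The cost is set by n_iter times the number of folds, so it stays fixed however many parameters are added to param_distributions.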
