⚡️ Saturday ML Sparks – Hyperparameter Tuning with GridSearchCV 🎛🧠
Posted on: December 6, 2025
Description:
Tuning hyperparameters is one of the most important steps in improving your ML model.
Instead of manually trying values, GridSearchCV systematically searches combinations of parameters to find the best-performing model.
Today’s ML Spark makes hyperparameter tuning simple, structured, and beginner-friendly.
Understanding the Problem
Most ML models depend heavily on hyperparameters:
- Random Forest → number of trees, depth
- SVM → kernel choice, C value, gamma
- Logistic Regression → regularization strength
- KNN → number of neighbors
Using the wrong hyperparameters leads to poor model performance.
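For intuition, here is a minimal sketch of what a search space might look like for the models listed above. The value ranges are illustrative assumptions chosen only to show the shape of a grid, not tuned recommendations.
# Illustrative search spaces (values are assumptions, not recommendations)
rf_grid = {"n_estimators": [100, 200, 300], "max_depth": [None, 5, 10]}        # Random Forest
svm_grid = {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10], "gamma": ["scale", "auto"]}  # SVM
logreg_grid = {"C": [0.01, 0.1, 1, 10]}    # Logistic Regression (inverse regularization strength)
knn_grid = {"n_neighbors": [3, 5, 7, 11]}  # KNN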
GridSearchCV evaluates multiple parameter combinations using cross-validation, ensuring consistent and fair comparisons.
You get:
- best hyperparameters
- best cross-validated score
- the tuned model ready to use
1. Load the Dataset
We’ll use the Breast Cancer dataset (binary classification).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, stratify=y, random_state=42
)
2. Choose a Model to Tune
We’ll tune a RandomForestClassifier.
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(random_state=42)
3. Define the Hyperparameter Grid
GridSearchCV will try every combination in this grid.
param_grid = {
"n_estimators": [100, 200, 300],
"max_depth": [None, 5, 10],
"min_samples_split": [2, 5],
"min_samples_leaf": [1, 2]
}
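This grid defines 3 × 3 × 2 × 2 = 36 parameter combinations; with the 5-fold CV used in the next step, that means 180 model fits (plus one final refit on the full training set). If you want to sanity-check a grid's size before running it, here is a minimal sketch using scikit-learn's ParameterGrid:
from sklearn.model_selection import ParameterGrid
# Count how many candidate combinations the grid expands to
print("Combinations to evaluate:", len(ParameterGrid(param_grid)))  # 36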
4. Run GridSearchCV
We use 5-fold CV for stable evaluation.
from sklearn.model_selection import GridSearchCV
grid = GridSearchCV(
estimator=model,
param_grid=param_grid,
scoring="accuracy",
cv=5,
n_jobs=-1
)
grid.fit(X_train, y_train)
5. Check the Best Params + Best Score
print("Best Hyperparameters:", grid.best_params_)
print("Best CV Score:", grid.best_score_)
6. Evaluate the Tuned Model
from sklearn.metrics import accuracy_score, classification_report
best_model = grid.best_estimator_
y_pred = best_model.predict(X_test)
print("Test Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))
Key Takeaways
- GridSearchCV automates hyperparameter tuning, saving hours of guesswork.
- It uses cross-validation to ensure each hyperparameter set is evaluated fairly.
- The output includes the best parameters and the best-performing model.
- Works with any estimator (Logistic Regression, SVM, Random Forest, XGBoost, etc.); see the pipeline sketch after this list.
- Useful for both beginners and professionals to improve model performance systematically.
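To show the "any estimator" point in practice, here is a minimal sketch that tunes an SVM inside a Pipeline so feature scaling is refit within each CV fold. The grid values are illustrative assumptions, and pipeline parameters use the "step__param" naming convention.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Scaling lives inside the pipeline, so it is refit on each training fold
pipe = Pipeline([("scaler", StandardScaler()), ("svm", SVC())])
svm_search = GridSearchCV(
    estimator=pipe,
    param_grid={
        "svm__C": [0.1, 1, 10],            # illustrative values
        "svm__kernel": ["linear", "rbf"],
        "svm__gamma": ["scale", "auto"],
    },
    scoring="accuracy",
    cv=5,
    n_jobs=-1
)
svm_search.fit(X_train, y_train)
print("Best SVM params:", svm_search.best_params_)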
Conclusion
Hyperparameter tuning is essential for modern ML workflows.
GridSearchCV provides a structured, reliable, and automated way to improve models using systematic search and cross-validation.
This technique is foundational for performance optimization across all machine learning projects.
Code Snippet:
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the Breast Cancer dataset and make a stratified 80/20 train/test split
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Base estimator to tune
model = RandomForestClassifier(random_state=42)

# Search space: every combination of these values will be evaluated
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2]
}

# Exhaustive grid search with 5-fold cross-validation, using all CPU cores
grid = GridSearchCV(
    estimator=model,
    param_grid=param_grid,
    scoring="accuracy",
    cv=5,
    n_jobs=-1
)
grid.fit(X_train, y_train)

# Best hyperparameters and the corresponding cross-validated accuracy
print("Best Hyperparameters:", grid.best_params_)
print("Best Cross-Validated Score:", grid.best_score_)

# Evaluate the refit best estimator on the held-out test set
best_model = grid.best_estimator_
y_pred = best_model.predict(X_test)
print("Test Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))