AW Dev Rethought

🌟 "The best way to predict the future is to invent it." – Alan Kay

🧠 AI with Python – 📉 Residuals vs Predicted Plot (Regression)


Description:

When working with regression models, evaluating model performance goes beyond metrics like R² or Mean Squared Error. Even a model with strong numerical performance can still violate important assumptions.

One of the most useful diagnostic tools in regression analysis is the Residuals vs Predicted plot, which helps visualize how prediction errors behave across the range of predicted values.


Understanding the Problem

A regression model predicts continuous values, but the true test of its reliability lies in how the errors (residuals) behave.

Residuals are defined as:

Residual = Actual Value − Predicted Value

Ideally, residuals should be randomly scattered around zero. If patterns appear in the residuals, it may indicate that the model is missing important relationships in the data.
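As a minimal numeric illustration of this definition (the values below are made up for demonstration):

```python
import numpy as np

# Hypothetical actual and predicted values
actual = np.array([3.0, 5.0, 7.0, 9.0])
predicted = np.array([2.5, 5.5, 6.0, 9.5])

# Residual = Actual Value - Predicted Value
residuals = actual - predicted
print(residuals)  # residuals: 0.5, -0.5, 1.0, -0.5
```

Each residual tells us how far the model's prediction missed for that observation, with the sign indicating whether the model under- or over-predicted.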


What Is a Residual Plot?

A Residuals vs Predicted plot displays:

  • Predicted values on the x-axis
  • Residual errors on the y-axis

This visualization helps detect issues such as:

  • Non-linear relationships
  • Unequal error variance (heteroscedasticity)
  • Outliers
  • Model misspecification

1. Training a Regression Model

We first train a regression model using a structured dataset.

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)

The model learns a linear relationship between the features and the target variable.


2. Generating Predictions

After training, we generate predictions on unseen test data.

y_pred = model.predict(X_test)

These predictions will be compared with the actual values to compute residuals.


3. Computing Residuals

Residuals represent the error between actual and predicted values.

residuals = y_test - y_pred

These values form the basis for the residual analysis.


4. Visualizing Residuals vs Predictions

We now plot residuals against predicted values.

plt.scatter(y_pred, residuals)
plt.axhline(0, linestyle="--")

The horizontal line represents zero error, making it easier to observe deviations.


How to Interpret the Residual Plot

A good regression model typically produces a random scatter of residuals around zero, with no visible structure. This indicates the model fits well.

However, certain patterns reveal problems:

  • Curved pattern → missing non-linear relationship
  • Funnel shape → heteroscedasticity (changing variance)
  • Clusters or structure → missing variables
  • Extreme points → potential outliers

These insights help guide model improvements.
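To see how a curved pattern arises from a misspecified model, here is a small sketch: we fit a straight line to synthetic data that is truly quadratic, and the residuals bend accordingly (the data and variable names below are illustrative, not from the dataset used in this article):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = x**2 + rng.normal(0, 0.02, size=x.size)  # truly quadratic relationship

# Fit a straight line (degree-1 polynomial): a misspecified model
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept
residuals = y - y_pred

# The residuals curve: positive at the extremes, negative in the middle
print(residuals[0] > 0, residuals[50] < 0, residuals[-1] > 0)
```

Plotting these residuals against `y_pred` would show a U-shape rather than random scatter, which is the visual cue that a non-linear term is missing from the model.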


Why Residual Analysis Matters

Residual diagnostics allow us to:

  • Validate regression assumptions
  • Identify model limitations
  • Detect non-linear relationships
  • Improve feature engineering

Without residual analysis, important model issues may go unnoticed.
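One simple way to back up the visual check with a number is to correlate the absolute residuals with the predicted values: a clearly positive correlation suggests the error variance grows with the prediction. This is only a rough heuristic (a formal test such as Breusch-Pagan is more rigorous), and the simulated data below is illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
y_pred = np.linspace(1, 10, 200)
# Simulate heteroscedastic residuals: noise scale grows with the prediction
residuals = rng.normal(0, 0.1 * y_pred)

# Positive correlation between |residuals| and predictions hints at heteroscedasticity
corr = np.corrcoef(y_pred, np.abs(residuals))[0, 1]
print(f"corr(|residuals|, predictions) = {corr:.2f}")
```

For homoscedastic errors, this correlation would hover near zero.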


Key Takeaways

  1. Residuals measure prediction errors in regression models.
  2. Random scatter around zero indicates a well-fitted model.
  3. Patterns in residuals signal model problems.
  4. Residual plots help detect heteroscedasticity and non-linearity.
  5. Residual plots are a fundamental diagnostic tool for regression analysis.

Conclusion

Residuals vs predicted plots provide a powerful visual diagnostic for regression models. By examining how prediction errors behave, we gain deeper insight into model assumptions and potential weaknesses. This makes residual analysis an essential part of building reliable regression models within the Advanced Visualization & Interpretability module of the AI with Python series.


Code Snippet:

# 📦 Import Required Libraries
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression


# 🧩 Load the Dataset
data = fetch_california_housing()

X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target


# ✂️ Split Data into Train and Test Sets
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,
    random_state=42
)


# 🤖 Train the Regression Model
model = LinearRegression()
model.fit(X_train, y_train)


# 📊 Generate Predictions
y_pred = model.predict(X_test)


# 📉 Compute Residuals
residuals = y_test - y_pred


# 📈 Plot Residuals vs Predicted Values
plt.figure(figsize=(6, 6))

plt.scatter(y_pred, residuals, alpha=0.6)

plt.axhline(0, linestyle="--")

plt.xlabel("Predicted Values")
plt.ylabel("Residuals")
plt.title("Residuals vs Predicted Plot")

plt.tight_layout()
plt.show()
