AW Dev Rethought

"Truth can only be found in one place: the code." - Robert C. Martin

⚡️ Saturday ML Spark – 🔗 Creating Interaction Features


Description:

In machine learning, better models don’t always come from more complex algorithms. Sometimes, the biggest improvements come from better features.

One powerful yet simple technique in feature engineering is creating interaction features — combining existing variables to capture hidden relationships.

In this project, we explore how interaction features can improve model performance on tabular data.


Understanding the Problem

Most basic models, such as linear regression, assume that each feature influences the target independently and additively.

However, in real-world data:

  • features often interact with each other
  • relationships are rarely purely linear
  • combined effects can be stronger than individual effects

For example:

  • income alone → moderate signal
  • education alone → moderate signal
  • income × education → strong predictive signal
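The income × education intuition can be made concrete with a small sketch on synthetic data (the variable names, ranges, and noise level here are illustrative, not from a real dataset). Even though each raw feature carries some signal on its own, adding the product term lets a linear model fit the combined effect far more closely:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)
income = rng.uniform(1, 10, 500)
education = rng.uniform(1, 10, 500)

# The target depends on the *product* of the two features, plus noise
y = income * education + rng.normal(0, 1, 500)

X_plain = np.column_stack([income, education])
X_inter = np.column_stack([income, education, income * education])

r2_plain = r2_score(y, LinearRegression().fit(X_plain, y).predict(X_plain))
r2_inter = r2_score(y, LinearRegression().fit(X_inter, y).predict(X_inter))

print(f"R2 without interaction: {r2_plain:.3f}")
print(f"R2 with interaction:    {r2_inter:.3f}")
```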

What Are Interaction Features?

Interaction features are created by combining two or more features, usually through multiplication.

X["A_B"] = X["A"] * X["B"]

This allows the model to learn relationships that depend on multiple variables together.


Baseline Model

We first train a model without interaction features.

model = LinearRegression()
model.fit(X_train, y_train)

This serves as a reference for comparison.


Creating Interaction Features

We generate new features by combining existing ones. Since the full snippet below uses the California Housing dataset (the Boston dataset is deprecated), the interaction columns are built from its features:

X_train_interact = X_train.copy()
X_train_interact["MedInc_HouseAge"] = X_train["MedInc"] * X_train["HouseAge"]
X_train_interact["AveRooms_Population"] = X_train["AveRooms"] * X_train["Population"]

Each new column captures the joint effect of two variables.


Training with Interaction Features

model.fit(X_train_interact, y_train)

After adding interaction features, the model can capture more complex patterns.


Why Interaction Features Matter

Interaction features help in:

  • capturing non-linear relationships
  • improving performance of simple models
  • uncovering hidden patterns in data
  • enhancing predictive power without changing algorithms

They are especially useful when using:

  • Linear Regression
  • Logistic Regression
  • simpler ML models

When to Use Interaction Features

  • when relationships between features are expected
  • when model performance is limited
  • when using simpler models
  • when domain knowledge suggests feature combinations
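One way to act on these criteria is to test a candidate interaction with cross-validation before committing to it. A sketch on the California Housing data used in the snippet below (the 5-fold setup and the chosen feature pair are illustrative):

```python
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

# Candidate interaction suggested by domain intuition
X_inter = X.copy()
X_inter["MedInc_HouseAge"] = X["MedInc"] * X["HouseAge"]

# Compare 5-fold cross-validated R2 with and without the new column
cv = KFold(n_splits=5, shuffle=True, random_state=42)
base = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2").mean()
inter = cross_val_score(LinearRegression(), X_inter, y, cv=cv, scoring="r2").mean()

print(f"baseline mean CV R2:    {base:.4f}")
print(f"interaction mean CV R2: {inter:.4f}")
```

Keep an interaction only if it improves the cross-validated score; otherwise it just adds noise to the feature set.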

Key Takeaways

  1. Interaction features combine multiple variables into new features.
  2. They help capture relationships that individual features cannot.
  3. Simple models benefit significantly from interaction features.
  4. Feature engineering can improve performance without complex models.
  5. Interaction features are a practical and powerful technique for tabular data.

Conclusion

Interaction features are a simple yet highly effective way to improve machine learning models. By combining existing features, we can reveal hidden patterns and enhance model performance without increasing algorithm complexity.

This marks an important step in the Feature Engineering track of Saturday ML Spark ⚡️, helping you move from just using models to designing better data representations.


Code Snippet:

# 📦 Import Required Libraries
import pandas as pd

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score


# 🧩 Load Dataset (Boston is deprecated → using California Housing)
data = fetch_california_housing()

X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target


# ✂️ Split Data
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,
    random_state=42
)


# =========================================================
# 🚨 Baseline Model (No Interaction Features)
# =========================================================

baseline_model = LinearRegression()
baseline_model.fit(X_train, y_train)

baseline_pred = baseline_model.predict(X_test)

print("Baseline R2 Score:", r2_score(y_test, baseline_pred))


# =========================================================
# 🔗 Create Interaction Features
# =========================================================

X_train_interact = X_train.copy()
X_test_interact = X_test.copy()

# Example interaction features
X_train_interact["MedInc_HouseAge"] = X_train["MedInc"] * X_train["HouseAge"]
X_test_interact["MedInc_HouseAge"] = X_test["MedInc"] * X_test["HouseAge"]

X_train_interact["AveRooms_Population"] = X_train["AveRooms"] * X_train["Population"]
X_test_interact["AveRooms_Population"] = X_test["AveRooms"] * X_test["Population"]


# =========================================================
# 🤖 Train Model with Interaction Features
# =========================================================

interaction_model = LinearRegression()
interaction_model.fit(X_train_interact, y_train)

interaction_pred = interaction_model.predict(X_test_interact)

print("With Interaction Features R2 Score:", r2_score(y_test, interaction_pred))
