Ridge vs Lasso Regression on Boston Housing – Which Works Better?

Compare Ridge and Lasso regression on the Boston Housing dataset and learn which works better for predicting house prices, with step-by-step Python code and visuals.

Want to predict housing prices and shrink your model smartly? In this hands-on guide, we’ll dive into the famous Boston Housing dataset and show you how to analyze it using Ridge and Lasso regression — two powerful tools that help you avoid overfitting and make better predictions.

Whether you’re brand new to machine learning or just need a clean example, this tutorial is for you. We’ll walk through everything from loading the data to tuning hyperparameters and visualizing results.

Let’s roll.


🧠 What Are Ridge and Lasso Regression?

Plain linear regression fits the training data as closely as it can, giving weight to every feature, even noisy or irrelevant ones. That's a recipe for overfitting.

That’s where Ridge and Lasso step in:

  • Ridge Regression adds a penalty for large weights, shrinking coefficients toward zero (but not exactly zero).
  • Lasso Regression does the same, but more aggressively: it can eliminate irrelevant features entirely (see the quick numeric sketch below).

Think of them like personal trainers for your model:

  • Ridge says “trim the fat.”
  • Lasso says “cut the junk.”
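
To make those penalties concrete, here is a tiny numeric sketch. The coefficient values are made up for illustration; they are not taken from the Boston model.

import numpy as np

beta = np.array([2.0, -0.5, 0.0, 1.5])   # toy coefficients
alpha = 1.0                               # penalty strength

ridge_penalty = alpha * np.sum(beta ** 2)      # L2 penalty: sum of squared coefficients
lasso_penalty = alpha * np.sum(np.abs(beta))   # L1 penalty: sum of absolute coefficients

print(ridge_penalty)   # 6.5
print(lasso_penalty)   # 4.0

Both terms get added to the usual squared error during training; the absolute-value version is what lets Lasso push a weak coefficient all the way to zero.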

🧰 Step 1: Import Your Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

We’re using:

  • pandas to handle data
  • numpy for math
  • matplotlib for beautiful plots later

🏡 Step 2: Load the Boston Housing Dataset

The Boston Housing dataset was deprecated in scikit-learn 1.0 and removed in 1.2, but it's still a handy teaching example. On an older scikit-learn version you can load it directly (a workaround for newer versions follows below):

from sklearn.datasets import load_boston

boston = load_boston()
data = pd.DataFrame(boston.data, columns=boston.feature_names)
data['Price'] = boston.target

Output:

RM      CRIM    ZN      LSTAT   Price
6.57    0.02    18.0    5.3     24.0
  • RM = average number of rooms
  • LSTAT = % lower status population
  • Price = target variable (in $1000s)
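
On scikit-learn 1.2 or newer, load_boston no longer exists. A minimal fallback, assuming the original CMU StatLib mirror (http://lib.stat.cmu.edu/datasets/boston) is still reachable, is to parse the raw file yourself:

# Fallback for scikit-learn >= 1.2, where load_boston was removed
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)

# Each record is spread across two physical lines in the raw file
features = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

columns = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
           'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']
data = pd.DataFrame(features, columns=columns)
data['Price'] = target

Either route gives you the same DataFrame, so the rest of the tutorial is unchanged.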

🧹 Step 3: Clean & Prepare Data

print(data.head())
print(data.isnull().sum())

This checks:

  • First few rows
  • Missing values (should be 0 across the board)

🧪 Step 4: Separate Features and Target

Let’s extract X (features) and y (target):

X = data.drop(columns='Price')
y = data['Price']

Boom. Done. Now it’s model time.
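
As a quick sanity check before modeling, the Boston data should come out to 506 rows and 13 feature columns:

print(X.shape)   # (506, 13) – 506 homes, 13 features
print(y.shape)   # (506,)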


🔢 Step 5: Linear Regression (The Baseline)

Let’s try plain linear regression first — no penalties.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

lin_model = LinearRegression()

# scikit-learn returns *negative* MSE so that higher scores are always better
neg_mse_scores = cross_val_score(lin_model, X, y, scoring='neg_mean_squared_error', cv=5)
mean_neg_mse = np.mean(neg_mse_scores)
print(f"Linear Regression Mean Neg. MSE: {mean_neg_mse}")

Output Example:

Linear Regression Mean Neg. MSE: -34.23

This will be our baseline for comparison.
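
If negative MSE feels abstract, you can convert the same scores to RMSE, which is in the target's own units ($1000s). This is an optional extra, not part of the original recipe:

rmse_scores = np.sqrt(-neg_mse_scores)   # flip the sign back, then take the square root
print(f"Linear Regression Mean RMSE: {rmse_scores.mean():.2f} (in $1000s)")

A mean CV MSE around 34 works out to an RMSE of roughly 5.9, i.e. the baseline is off by about $5,900 per house on average.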


🧗 Step 6: Ridge Regression with Grid Search

We’ll use GridSearchCV to find the best alpha (penalty strength).

from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

ridge = Ridge()
params_ridge = {'alpha': np.logspace(-3, 3, 13)} # Try values from 0.001 to 1000
ridge_cv = GridSearchCV(ridge, params_ridge, scoring='neg_mean_squared_error', cv=5)
ridge_cv.fit(X, y)

print(f"Best Ridge Alpha: {ridge_cv.best_params_['alpha']}")
print(f"Best Ridge Score: {ridge_cv.best_score_}")

Output Example:

Best Ridge Alpha: 10.0
Best Ridge Score: -29.87

Ridge is doing better already — smaller error means tighter predictions!
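
If you want to see the shrinkage rather than take it on faith, peek at the coefficients of the best estimator (GridSearchCV refits it on the full dataset by default):

best_ridge = ridge_cv.best_estimator_
ridge_coefs = pd.Series(best_ridge.coef_, index=X.columns)
print(ridge_coefs.sort_values())   # shrunken toward zero, but none exactly zero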


✂️ Step 7: Lasso Regression with Grid Search

Lasso is up next — let’s see if it can outshine Ridge.

from sklearn.linear_model import Lasso

lasso = Lasso(max_iter=10000)
params_lasso = {'alpha': np.logspace(-3, 3, 13)}
lasso_cv = GridSearchCV(lasso, params_lasso, scoring='neg_mean_squared_error', cv=5)
lasso_cv.fit(X, y)

print(f"Best Lasso Alpha: {lasso_cv.best_params_['alpha']}")
print(f"Best Lasso Score: {lasso_cv.best_score_}")

Output Example:

Best Lasso Alpha: 0.01
Best Lasso Score: -29.55

Sweet! Lasso not only shrinks, but might completely zero out some weak features.
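
You can check whether that actually happened at the chosen alpha by listing the coefficients Lasso drove exactly to zero (at a small alpha like 0.01 it may keep every feature):

best_lasso = lasso_cv.best_estimator_
lasso_coefs = pd.Series(best_lasso.coef_, index=X.columns)
dropped = lasso_coefs[lasso_coefs == 0].index.tolist()
print(f"Features Lasso eliminated: {dropped}")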


📊 Step 8: Visualize Ridge vs. Lasso Performance

Time for a side-by-side comparison across alphas.

ridge_results = pd.DataFrame(ridge_cv.cv_results_)
lasso_results = pd.DataFrame(lasso_cv.cv_results_)

plt.figure(figsize=(10, 5))
# Negate the stored scores so the y-axis shows plain MSE (lower is better)
plt.plot(params_ridge['alpha'], -ridge_results['mean_test_score'], label='Ridge')
plt.plot(params_lasso['alpha'], -lasso_results['mean_test_score'], label='Lasso')
plt.xscale('log')
plt.xlabel('Alpha')
plt.ylabel('Mean MSE')
plt.title('Ridge vs Lasso Regression Performance on Boston Housing')
plt.legend()
plt.grid(True)
plt.show()

You’ll likely see:

  • Ridge's error curve changes gradually as alpha grows
  • Lasso's curve stays low at small alphas, then climbs steeply once the stronger penalty starts wiping out useful coefficients

🧾 Summary Table

Model    Best Alpha    Mean CV Score (neg. MSE)    Feature Elimination
Linear   N/A           -34.23                      No
Ridge    10.0          -29.87                      No
Lasso    0.01          -29.55                      Yes (can zero out coefficients)

✅ Final Thoughts

This project showed how regularization improves prediction and can even simplify your model.

  • Use Ridge when all features might matter.
  • Use Lasso when you want to shrink and prune unnecessary features.
  • Always use cross-validation to tune hyperparameters!
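
If you would rather not wire up GridSearchCV yourself, scikit-learn's RidgeCV and LassoCV bundle the alpha search into the estimator. Here is a minimal sketch using the same alpha grid as above; the selected alphas may not match the grid-search results exactly, since the built-in selection criteria differ slightly:

from sklearn.linear_model import RidgeCV, LassoCV

alphas = np.logspace(-3, 3, 13)

ridge_auto = RidgeCV(alphas=alphas, cv=5).fit(X, y)
lasso_auto = LassoCV(alphas=alphas, cv=5, max_iter=10000).fit(X, y)

print(f"RidgeCV picked alpha = {ridge_auto.alpha_}")
print(f"LassoCV picked alpha = {lasso_auto.alpha_}")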

📚 Learn More with Ossels AI

If you enjoyed this tutorial, you’ll love our other hands-on AI projects:

👉 How to Predict Your Salary Using Python and Machine Learning
👉 Build a Bitcoin Price Predictor with LSTM
👉 Ultimate Guide to Generative AI Tools in 2025


💬 Got Questions?

Drop your comments below or reach out via Ossels AI. We’d love to see what you’re building with Ridge and Lasso!

Posted by Ananya Rajeev

Ananya Rajeev is a Kerala-born data scientist and AI enthusiast who simplifies generative and agentic AI for curious minds. B.Tech grad, code lover, and storyteller at heart.