Want to predict housing prices and shrink your model smartly? In this hands-on guide, we’ll dive into the famous Boston Housing dataset and show you how to analyze it using Ridge and Lasso regression — two powerful tools that help you avoid overfitting and make better predictions.
Whether you’re brand new to machine learning or just need a clean example, this tutorial is for you. We’ll walk through everything from loading the data to tuning hyperparameters and visualizing results.
Let’s roll.
🧠 What Are Ridge and Lasso Regression?
Regular linear regression assigns a weight to every feature you give it, even noisy or irrelevant ones, which makes it prone to overfitting. Not ideal.
That’s where Ridge and Lasso step in:
- Ridge Regression adds a penalty for large weights, shrinking coefficients toward zero (but not exactly zero).
- Lasso Regression does the same but is aggressive — it can eliminate irrelevant features entirely.
Think of them like personal trainers for your model:
- Ridge says “trim the fat.”
- Lasso says “cut the junk.”
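In cost-function terms, both methods minimize the usual squared error plus a penalty on the coefficient vector β, with α controlling the penalty strength. These match scikit-learn's own formulations for Ridge and Lasso (note the 1/(2n) scaling in Lasso):

\text{Ridge:}\quad \min_{\beta}\ \lVert y - X\beta \rVert_2^2 + \alpha \lVert \beta \rVert_2^2

\text{Lasso:}\quad \min_{\beta}\ \frac{1}{2n}\lVert y - X\beta \rVert_2^2 + \alpha \lVert \beta \rVert_1

The squared (L2) penalty shrinks all coefficients smoothly, while the absolute-value (L1) penalty can push some of them exactly to zero.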
🧰 Step 1: Import Your Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
We’re using:
- pandas to handle data
- numpy for math
- matplotlib for beautiful plots later
🏡 Step 2: Load the Boston Housing Dataset
Heads up: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, but the dataset is still useful for learning. The snippet below works on older versions; a workaround for newer releases is shown at the end of this step.
from sklearn.datasets import load_boston
boston = load_boston()
data = pd.DataFrame(boston.data, columns=boston.feature_names)
data['Price'] = boston.target
Output:
| CRIM | ZN | … | RM | … | LSTAT | Price |
|---|---|---|---|---|---|---|
| 0.0063 | 18.0 | … | 6.575 | … | 4.98 | 24.0 |
- RM = average number of rooms per dwelling
- LSTAT = % lower-status population
- Price = target variable (median home value, in $1000s)
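If you're on scikit-learn 1.2 or newer, load_boston has been removed entirely. A workaround, essentially the one suggested in scikit-learn's own deprecation notice, is to fetch the raw data from its original host and rebuild the same DataFrame (the feature_names list below is the standard Boston column order):

import numpy as np
import pandas as pd

# Fetch the raw Boston data; each record spans two physical lines in the file
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
features = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]
data = pd.DataFrame(features, columns=feature_names)
data['Price'] = target

Either way, you end up with the same 506-row DataFrame and can follow the rest of the tutorial unchanged.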
🧹 Step 3: Clean & Prepare Data
print(data.head())
print(data.isnull().sum())
This checks:
- First few rows
- Missing values (should be 0 across the board)
🧪 Step 4: Separate Features and Target
Let’s extract X (features) and y (target):
X = data.drop(columns='Price')
y = data['Price']
Boom. Done. Now it’s model time.
🔢 Step 5: Linear Regression (The Baseline)
Let’s try plain linear regression first — no penalties.
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
lin_model = LinearRegression()
neg_mse_scores = cross_val_score(lin_model, X, y, scoring='neg_mean_squared_error', cv=5)
mean_mse = np.mean(neg_mse_scores)
print(f"Linear Regression Mean MSE: {mean_mse}")
Output Example:
Linear Regression Mean MSE: -34.23
This will be our baseline for comparison.
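The score is negative because scikit-learn negates error metrics so that higher always means better for any scorer. To report the error in the target's own units (thousands of dollars), flip the sign back and take the square root:

# Convert the negated MSE from each fold back into an RMSE
rmse_scores = np.sqrt(-neg_mse_scores)
print(f"Linear Regression Mean RMSE: {rmse_scores.mean():.2f} (in $1000s)")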
🧗 Step 6: Ridge Regression with Grid Search
We’ll use GridSearchCV to find the best alpha (penalty strength).
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
ridge = Ridge()
params_ridge = {'alpha': np.logspace(-3, 3, 13)} # Try values from 0.001 to 1000
ridge_cv = GridSearchCV(ridge, params_ridge, scoring='neg_mean_squared_error', cv=5)
ridge_cv.fit(X, y)
print(f"Best Ridge Alpha: {ridge_cv.best_params_['alpha']}")
print(f"Best Ridge Score: {ridge_cv.best_score_}")
Output Example:
Best Ridge Alpha: 10.0
Best Ridge Score: -29.87
Ridge is doing better already — smaller error means tighter predictions!
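One caveat: Ridge and Lasso penalize coefficients by their size, so features on very different scales get penalized unevenly. A common refinement, sketched here with a scikit-learn Pipeline (variable names are just illustrative), is to standardize the features before applying the penalty. Note that the grid key becomes ridge__alpha inside a pipeline:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Scale features to zero mean / unit variance, then fit Ridge on the scaled data
ridge_pipe = Pipeline([('scaler', StandardScaler()), ('ridge', Ridge())])
params_scaled = {'ridge__alpha': np.logspace(-3, 3, 13)}
ridge_scaled_cv = GridSearchCV(ridge_pipe, params_scaled, scoring='neg_mean_squared_error', cv=5)
ridge_scaled_cv.fit(X, y)
print(f"Best Ridge Alpha (scaled features): {ridge_scaled_cv.best_params_['ridge__alpha']}")

Scaling can shift which alpha wins, so don't be surprised if the best value differs from the unscaled run.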
✂️ Step 7: Lasso Regression with Grid Search
Lasso is up next — let’s see if it can outshine Ridge.
from sklearn.linear_model import Lasso
lasso = Lasso(max_iter=10000)
params_lasso = {'alpha': np.logspace(-3, 3, 13)}
lasso_cv = GridSearchCV(lasso, params_lasso, scoring='neg_mean_squared_error', cv=5)
lasso_cv.fit(X, y)
print(f"Best Lasso Alpha: {lasso_cv.best_params_['alpha']}")
print(f"Best Lasso Score: {lasso_cv.best_score_}")
Output Example:
Best Lasso Alpha: 0.01
Best Lasso Score: -29.55
Sweet! Lasso not only shrinks, but might completely zero out some weak features.
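You can check which features Lasso actually dropped by inspecting the coefficients of the refit model. GridSearchCV refits the best model on the full dataset by default, so best_estimator_ is ready to use:

# Coefficients that are exactly zero correspond to eliminated features
best_lasso = lasso_cv.best_estimator_
coef = pd.Series(best_lasso.coef_, index=X.columns)
print("Features eliminated by Lasso:", list(coef[coef == 0].index))
print(coef.sort_values())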
📊 Step 8: Visualize Ridge vs. Lasso Performance
Time for a side-by-side comparison across alphas.
ridge_results = pd.DataFrame(ridge_cv.cv_results_)
lasso_results = pd.DataFrame(lasso_cv.cv_results_)
plt.figure(figsize=(10, 5))
# mean_test_score is negated MSE, so flip the sign back to plot the actual error
plt.plot(params_ridge['alpha'], -ridge_results['mean_test_score'], label='Ridge')
plt.plot(params_lasso['alpha'], -lasso_results['mean_test_score'], label='Lasso')
plt.xscale('log')
plt.xlabel('Alpha')
plt.ylabel('Mean CV MSE')
plt.title('Ridge vs Lasso Regression Performance on Boston Housing')
plt.legend()
plt.grid(True)
plt.show()


You’ll likely see:
- Ridge’s error climbing only gradually as alpha grows
- Lasso’s error staying flat for small alphas, then shooting up once the penalty is strong enough to zero out useful features (the optional sketch below adds fold-to-fold error bands to the same comparison)
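If you want to see how stable those scores are from fold to fold, cv_results_ also stores the per-fold spread. Here’s a small optional sketch (reusing ridge_results and lasso_results from above) that adds shaded bands of one standard deviation around each curve:

plt.figure(figsize=(10, 5))
for name, res in [('Ridge', ridge_results), ('Lasso', lasso_results)]:
    mean_mse = -res['mean_test_score']   # flip the negated scores back to MSE
    std_mse = res['std_test_score']      # spread across the 5 folds
    plt.plot(params_ridge['alpha'], mean_mse, label=name)
    plt.fill_between(params_ridge['alpha'], mean_mse - std_mse, mean_mse + std_mse, alpha=0.2)
plt.xscale('log')
plt.xlabel('Alpha')
plt.ylabel('Mean CV MSE')
plt.legend()
plt.show()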
🧾 Summary Table
| Model | Best Alpha | Best CV Score (neg. MSE) | Feature Elimination |
|---|---|---|---|
| Linear | N/A | -34.23 | ❌ |
| Ridge | 10.0 | -29.87 | ❌ |
| Lasso | 0.01 | -29.55 | ✅ |
✅ Final Thoughts
This project showed how regularization improves prediction and can even simplify your model.
- Use Ridge when all features might matter.
- Use Lasso when you want to shrink and prune unnecessary features.
- Always use cross-validation to tune hyperparameters!
📚 Learn More with Ossels AI
If you enjoyed this tutorial, you’ll love our other hands-on AI projects:
👉 How to Predict Your Salary Using Python and Machine Learning
👉 Build a Bitcoin Price Predictor with LSTM
👉 Ultimate Guide to Generative AI Tools in 2025
💬 Got Questions?
Drop your comments below or reach out via Ossels AI. We’d love to see what you’re building with Ridge and Lasso!