Ridge — L2 regularized regression
Ridge adds an L2 penalty to the OLS objective to shrink coefficients toward zero, which stabilizes estimates under multicollinearity or when there are many regressors (large p). Ridge does not set coefficients exactly to zero (no variable selection), but it can substantially reduce estimation variance.
When to use
Model specification
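Ridge minimizes the residual sum of squares plus an L2 penalty:
$$\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - x_i'\beta)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$
$\lambda \ge 0$ is the regularization parameter: $\lambda = 0$ ⇒ OLS; larger $\lambda$ ⇒ stronger shrinkage.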
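As a rough illustration of the objective above, the sketch below uses the standard closed-form ridge solution $(X'X + \lambda I)^{-1} X'y$ on synthetic data. The helper name `ridge_closed_form`, the simulated data, and the grid of penalty values are assumptions made for this example only; it assumes centered data with no intercept and is not how EcoLab computes the estimate.

```python
import numpy as np


def ridge_closed_form(X, y, lam):
    """Solve (X'X + lam * I) beta = X'y. Assumes X and y are centered (no intercept)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# Synthetic data (illustrative): two nearly collinear regressors plus three others
rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)
X = X - X.mean(axis=0)
y = X @ np.array([1.0, 1.0, 0.5, 0.0, -0.5]) + rng.normal(size=n)
y = y - y.mean()

# With a zero penalty the ridge solution coincides with OLS
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lam0 = ridge_closed_form(X, y, 0.0)

# A larger penalty gives a smaller coefficient norm (stronger shrinkage)
lams = (0.0, 1.0, 10.0, 100.0)
norms = [float(np.linalg.norm(ridge_closed_form(X, y, lam))) for lam in lams]
for lam, norm in zip(lams, norms):
    print(f"lambda = {lam:6.1f}  ||beta|| = {norm:.4f}")
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit inverse, which matters most when the regressors are nearly collinear.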
Choosing lambda & notes
- Choose $\lambda$ by cross-validation (CV).
- Standardize variables first, since the penalty is scale-sensitive.
- Ridge reduces variance but increases bias (bias-variance tradeoff).
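The notes above can be sketched in scikit-learn as follows, assuming a pipeline that standardizes inside each CV fold and a log-spaced grid for the penalty (called `alpha` in scikit-learn). The synthetic data, the grid, and the 5-fold split are choices made for this illustration, not EcoLab defaults.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative): 20 regressors measured on very different scales
rng = np.random.default_rng(42)
n, p = 120, 20
Z = rng.normal(size=(n, p))
scales = rng.uniform(0.1, 100.0, size=p)
X = Z * scales
y = Z @ rng.normal(size=p) + rng.normal(size=n)

# Standardization is a pipeline step, so it is re-fit on each training fold
pipeline = make_pipeline(StandardScaler(), Ridge())
alpha_grid = np.logspace(-3, 3, 13)

search = GridSearchCV(
    pipeline,
    param_grid={"ridge__alpha": alpha_grid},
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

best_alpha = search.best_params_["ridge__alpha"]
print(f"Best alpha: {best_alpha}")
print(f"CV mean squared error at best alpha: {-search.best_score_:.4f}")
```

If the selected value sits at either end of the grid, widening the grid is usually worth a second run.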
Running in EcoLab
- Modeling module → Regularized regression family → Ridge.
- Select $y$, the $X$ variables; enable standardization; choose $\lambda$ (or auto CV).
- Run and read the shrunk coefficients and the shrinkage path; export the replication code.
Replication code
The Stata, R, and Python versions follow in that order.
* ---- Ridge Regression ----
* Note: ridgereg is a community-contributed command
* Install: ssc install ridgereg
use "macro_data.dta", clear
ridgereg y x1-x20, model(orr)
# ---- Ridge Regression (alpha = 0) ----
library(glmnet)
# Load and prepare data (illustrative)
df <- read.csv("macro_data.csv")
X <- as.matrix(df[, paste0("x", 1:20)])
y <- df$y
# Ridge with cross-validation to choose lambda
cv_ridge <- cv.glmnet(X, y, alpha = 0)
plot(cv_ridge)
# Best lambda and coefficients
best_lambda <- cv_ridge$lambda.min
coef(cv_ridge, s = best_lambda)
# ---- Ridge Regression with cross-validation ----
from sklearn.linear_model import RidgeCV
from sklearn.preprocessing import StandardScaler
import pandas as pd
# Load data (illustrative)
df = pd.read_csv("macro_data.csv")
X = df[[f"x{i}" for i in range(1, 21)]]
y = df["y"]
# Standardize
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Ridge with CV over a grid of alphas (= lambda)
model = RidgeCV(alphas=[0.1, 1, 10]).fit(X_scaled, y)
print(f"Best alpha: {model.alpha_}")
print(f"Coefficients: {model.coef_}")
Limitations
- Does not select variables (all coefficients generally remain non-zero).
- Coefficients are harder to interpret due to shrinkage; typically used for prediction rather than causal inference.
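To make the first limitation concrete, here is a minimal sketch comparing Ridge with Lasso on synthetic data in which only two of ten regressors matter. The data-generating process and the penalty values (`alpha=10.0` for Ridge, `alpha=0.5` for Lasso) are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (illustrative): only the first two regressors have non-zero effects
rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:2] = [3.0, -2.0]
y = X @ true_beta + rng.normal(size=n)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# Count coefficients that are exactly zero under each penalty
n_zero_ridge = int(np.sum(ridge.coef_ == 0.0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0.0))
print(f"Exact zeros, Ridge: {n_zero_ridge} of {p}")
print(f"Exact zeros, Lasso: {n_zero_lasso} of {p}")
```

Ridge shrinks the irrelevant coefficients toward zero without removing them, so when a sparse model is the goal, Lasso or Elastic Net is the more natural choice.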
Video tutorial
Video Tutorial: Running Ridge regression in EcoLab
See also
- Lasso · Elastic Net · OLS · Catalog