GAM — Generalized Additive Models
Generalized Additive Models (GAMs) model the relationship between the response $y$ and each regressor with a smooth function (a spline) instead of a linear coefficient, without pre-specifying the functional form as NLS requires. GAMs sit between linear regression and fully nonparametric models.
When to use
Use GAM when you suspect a relationship is nonlinear but of unknown shape (e.g. a U-shaped effect of age/income). GAM produces interpretable smooth-function plots while keeping additivity.
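A quick way to check whether a linear fit is missing this kind of shape is to average its residuals within quantile bins of the regressor. The sketch below uses simulated data with a hypothetical `age` variable and a U-shaped effect; the data-generating process, the five bins, and all names are assumptions made for illustration, not part of EcoLab.

```python
import numpy as np

# Simulated data (assumption): U-shaped effect of age centred at 45.
rng = np.random.default_rng(42)
age = rng.uniform(20, 70, 1000)
y = 0.01 * (age - 45) ** 2 + rng.normal(scale=1.0, size=age.size)

# Straight-line fit; np.polyfit returns the highest power first.
slope, intercept = np.polyfit(age, y, 1)
resid = y - (intercept + slope * age)

# Mean residual within five quantile bins of age.
edges = np.quantile(age, np.linspace(0, 1, 6))
bin_id = np.clip(np.searchsorted(edges, age, side="right") - 1, 0, 4)
bin_means = np.array([resid[bin_id == b].mean() for b in range(5)])

print("mean residual by age quintile:", np.round(bin_means, 2))
```

Residual means that are positive in the outer bins and negative in the middle indicate curvature that the linear term cannot absorb, which is the situation where a smooth term is worth trying.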
Model specification
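$$g\big(E[y \mid x_1, \dots, x_p]\big) = \beta_0 + f_1(x_1) + f_2(x_2) + \dots + f_p(x_p)$$
where $g$ is a link function (identity/logit/log…) and each $f_j$ is a smooth function (spline) estimated from the data under a roughness penalty that guards against overfitting; the smoothing parameters are chosen by GCV or REML.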
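To make the roughness penalty and the GCV choice concrete, here is a minimal one-variable sketch in plain NumPy. It assumes an identity link, a truncated-linear spline basis with a ridge penalty on the knot coefficients, and a simple grid search over the smoothing parameter; `penalized_spline` and `gcv_score` are hypothetical helper names. Libraries such as mgcv use more refined bases and optimizers, so treat this as an illustration of the idea rather than of any package's internals.

```python
import numpy as np

def penalized_spline(x, y, lam, n_knots=20):
    """Fit y ~ f(x) with a truncated-linear basis and a ridge penalty.

    Returns the fitted values and the effective degrees of freedom
    (the trace of the hat matrix).
    """
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    basis = np.column_stack(
        [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    )
    # Penalize only the knot coefficients; intercept and slope stay free.
    penalty = np.diag([0.0, 0.0] + [1.0] * n_knots)
    gram = basis.T @ basis + lam * penalty
    hat = basis @ np.linalg.solve(gram, basis.T)
    return hat @ y, np.trace(hat)

def gcv_score(y, fitted, edf):
    """Generalized cross-validation score: n * RSS / (n - edf)^2."""
    n = y.size
    return n * np.sum((y - fitted) ** 2) / (n - edf) ** 2

# Simulated data (assumption): a sine signal plus noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 300))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

lambdas = 10.0 ** np.arange(-4, 4)
results = [(lam, *penalized_spline(x, y, lam)) for lam in lambdas]
scores = [gcv_score(y, fitted, edf) for _, fitted, edf in results]
best = int(np.argmin(scores))

for (lam, _, edf), score in zip(results, scores):
    print(f"lambda={lam:8.4f}  edf={edf:6.2f}  GCV={score:.4f}")
print("lambda chosen by GCV:", results[best][0])
```

A larger penalty lowers the effective degrees of freedom toward 2 (a straight line), while a smaller one lets the fit approach the full basis dimension; GCV picks a value in between by trading residual error against complexity.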
Running in EcoLab
- Modeling module → Non-linear & semi-parametric family → GAM.
- Select the response $y$ and the regressors, mark which regressors enter through a smooth function, and choose the link.
- Run; view the per-variable smooth-function plots and the effective degrees of freedom; export the replication code.
Replication code
- Stata
- R
- Python
* ── GAM estimation (semiparametric) ───────────────
* Stata does not have a built-in gam command;
* use npregress series or the community semipar command.
* Option 1: npregress series (Stata 16+) with a B-spline basis
npregress series y x1, bspline
* Option 2: semipar (community-contributed)
* ssc install semipar
semipar y x2 x3, nonpar(x1)
# ── GAM estimation ────────────────────────────────
library(mgcv)
# Smooth terms s() for nonlinear, linear term for x3
model_gam <- gam(
  y ~ s(x1) + s(x2) + x3,
  data = df,
  method = "REML"
)
# Summary: edf, significance of smooth terms
summary(model_gam)
# Smooth-function plots
plot(model_gam, pages = 1, residuals = TRUE)
# ── GAM estimation ────────────────────────────────
from pygam import LinearGAM, s, l
# s(i) = smooth spline for column i
# l(i) = linear term for column i
X = df[["x1", "x2", "x3"]].values
y = df["y"].values
gam = LinearGAM(s(0) + s(1) + l(2)).fit(X, y)
# Summary: effective degrees of freedom, p-values
# (summary() prints the table itself and returns None)
gam.summary()
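# Plot partial dependence for each term
import matplotlib.pyplot as plt
for i, term in enumerate(gam.terms):
    if term.isintercept:
        continue
    XX = gam.generate_X_grid(term=i)
    plt.figure()
    plt.plot(XX[:, term.feature], gam.partial_dependence(term=i, X=XX))
plt.show()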
Limitations
- Results are read from plots rather than from a single coefficient, which makes them harder to summarize in a concise causal statement.
- Assumes additivity (does not automatically capture interactions unless specified).
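The additivity limitation can be illustrated with a small simulation. The sketch below assumes a response driven purely by the product of two regressors and uses cubic polynomials as a stand-in for the smooth terms (an assumption made to keep the example NumPy-only); `poly_terms` and `r_squared` are hypothetical helpers.

```python
import numpy as np

# Simulated data (assumption): y depends only on the interaction x1 * x2.
rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(-1, 1, n)
x2 = rng.uniform(-1, 1, n)
y = x1 * x2 + rng.normal(scale=0.05, size=n)

def poly_terms(x, degree=3):
    """Cubic polynomial columns standing in for a smooth term f(x)."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])

def r_squared(design, y):
    """In-sample R^2 of a least-squares fit on the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid.var() / y.var()

# Additive model f1(x1) + f2(x2) versus the same model plus x1 * x2.
additive = np.column_stack([np.ones(n), poly_terms(x1), poly_terms(x2)])
with_interaction = np.column_stack([additive, x1 * x2])

r2_additive = r_squared(additive, y)
r2_interaction = r_squared(with_interaction, y)
print(f"additive only:    R^2 = {r2_additive:.3f}")
print(f"with interaction: R^2 = {r2_interaction:.3f}")
```

The purely additive fit explains almost none of the variation, while adding the explicit interaction column recovers it. In a GAM the analogous step is to specify the interaction yourself, for example with a smooth of two variables, since the additive structure will not find it on its own.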