
GAM — Generalized Additive Models

A generalized additive model (GAM) captures smooth nonlinear relationships between $Y$ and each regressor by replacing linear coefficients with smooth functions (splines). Unlike nonlinear least squares (NLS), it does not require the functional form to be specified in advance. GAM sits between linear regression and fully nonparametric models.

When to use

Use GAM when you suspect a relationship is nonlinear but of unknown shape (e.g. a U-shaped effect of age/income). GAM produces interpretable smooth-function plots while keeping additivity.


Model specification

$$g\big(E[Y_i]\big) = \beta_0 + f_1(X_{1i}) + f_2(X_{2i}) + \dots + f_k(X_{ki})$$

where $g(\cdot)$ is a link function (identity, logit, log, …) and each $f_j$ is a smooth function (spline) estimated from the data. A roughness penalty guards against overfitting, and the degree of smoothness is chosen by generalized cross-validation (GCV) or restricted maximum likelihood (REML).
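For intuition, the roughness penalty can be written out for the identity-link case. The criterion below is the textbook cubic-smoothing-spline form and is an assumption for illustration: the page does not state which penalty EcoLab uses. The symbols $n$ (sample size) and $\lambda_j \ge 0$ (one smoothing parameter per smooth term) are introduced here and do not appear above.

```latex
\min_{\beta_0,\, f_1, \dots, f_k} \;
\sum_{i=1}^{n} \Big( Y_i - \beta_0 - \sum_{j=1}^{k} f_j(X_{ji}) \Big)^2
\;+\; \sum_{j=1}^{k} \lambda_j \int \big( f_j''(x) \big)^2 \, dx
```

Under this form, a very large $\lambda_j$ forces $f_j$ toward a straight line, which recovers the linear-regression term, while $\lambda_j$ near zero lets $f_j$ follow the data closely. GCV or REML would then be the rule for picking each $\lambda_j$.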


Running in EcoLab

  1. Modeling module → Non-linear & semi-parametric family → GAM.
  2. Select $Y$ and the $X$ variables, mark which ones use a smooth function; choose the link.
  3. Run; view the per-variable smooth-function plots + effective degrees of freedom; export the replication code.

Replication code

* ── GAM estimation (semiparametric) ───────────────
* Stata does not have a built-in gam command;
* use npregress series or the community semipar command.

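* Option 1: npregress series with a B-spline basis
* (bspline is an option of npregress series and is its default basis)
npregress series y x1, bspline

* Option 2: semipar (community-contributed)
* x2 and x3 enter linearly; x1 enters through a nonparametric function
* ssc install semipar
semipar y x2 x3, nonpar(x1)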
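
The additive structure can also be spelled out by hand with built-in commands. The sketch below is illustrative and is not the code EcoLab exports. It simulates data with a U-shaped effect (all variable names and parameter values are hypothetical) and fits an unpenalized regression spline with `mkspline` and `regress`. This is a simpler stand-in for a penalized GAM: smoothness is fixed by the number of knots rather than chosen by GCV/REML. It assumes Stata 14 or later for the two-argument `runiform()`.

```stata
* Sketch only: simulated data, unpenalized regression splines
clear
set seed 12345
set obs 500

* x1 has a U-shaped effect on y; x2 has a linear effect
generate x1 = runiform(-2, 2)
generate x2 = rnormal()
generate y  = x1^2 + 0.5*x2 + rnormal()

* Restricted cubic spline basis for x1: 4 knots give 3 basis variables
* (s1_1 is x1 itself; s1_2 and s1_3 carry the nonlinear part)
mkspline s1_ = x1, cubic nknots(4)

* Additive model: smooth function of x1 plus a linear term in x2
regress y s1_* x2

* Joint test of the nonlinear spline terms
test s1_2 s1_3

* Estimated partial effect of x1 (up to the constant), then plot it
generate f1_hat = _b[s1_1]*s1_1 + _b[s1_2]*s1_2 + _b[s1_3]*s1_3
sort x1
twoway line f1_hat x1, ytitle("Estimated f1(x1)")
```

For a non-identity link, the same basis variables could be passed to `glm`, for example with `family(binomial) link(logit)` for a binary outcome.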

Limitations

  • Results are read from plots rather than a single coefficient, which makes them harder to summarize as one causal effect estimate.
  • Assumes additivity: interactions between regressors are captured only if they are specified explicitly.

Video tutorial

Video Tutorial: Guide to running GAM in EcoLab

See also