PSM — Propensity Score Matching

PSM (Propensity Score Matching) estimates the effect of an intervention from observational data by matching each treated unit to a control unit with a similar propensity score, that is, the estimated probability of participation given the observed covariates. The goal is to mimic a randomized experiment and reduce selection bias on observables.

Key assumption

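PSM relies on selection on observables, also called the conditional independence assumption (CIA): every factor that affects both participation and the outcome is observed and included in the covariates. If unobserved confounders exist, the PSM estimate remains biased. IV and DiD, by contrast, can address some forms of unobserved confounding.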
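
Stated a little more formally for the ATT, the assumption and its companion overlap condition can be written as below. The symbol $Y(0)$, the outcome a unit would have without treatment, is notation assumed here for illustration and is not used elsewhere on this page.

$$
Y(0) \perp \text{treat} \mid X, \qquad P(\text{treat}=1 \mid X) < 1
$$

The first condition says that, among units with the same $X$, participation carries no information about the untreated outcome. The second says that every treated unit has some chance of having been a control, so a comparable match can exist. If both hold given $X$, they also hold given the scalar score $P(\text{treat}=1 \mid X)$, which is why matching on the score can stand in for matching on every covariate.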

Workflow

The propensity score $p(X) = P(\text{treat}=1 \mid X)$ is estimated by Logit/Probit.
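
To make the estimation step concrete, here is a minimal pure-Python sketch of a logit propensity model fitted by gradient ascent. It is an illustration under stated assumptions, not the routine EcoLab runs: the function names (`sigmoid`, `fit_logit`, `propensity_scores`), the learning rate, the iteration count, and the simulated data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, d, lr=0.5, n_iter=1000):
    """Fit P(treat=1 | X) by gradient ascent on the mean log-likelihood.

    X is a list of covariate rows, d a list of 0/1 treatment indicators.
    Returns the coefficients with the intercept first.
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, di in zip(X, d):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            err = di - p  # score contribution of this unit
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    # Predicted probability of treatment for every row of X.
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            for row in X]


# Simulated data (an assumption for illustration): one covariate that
# raises the probability of treatment.
random.seed(0)
X = [[random.gauss(0, 1)] for _ in range(400)]
d = [1 if random.random() < sigmoid(0.5 + 1.0 * row[0]) else 0 for row in X]

beta = fit_logit(X, d)
ps = propensity_scores(X, beta)
```

In practice a library logit or probit routine would replace `fit_logit`; the point of the sketch is only that the score is a fitted probability per unit, which the matching step then compares across treated and control units.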

Running in EcoLab

Modeling module → Causal inference family → PSM.
Declare the treatment, outcome, and covariates; choose the matching algorithm.
Run; check balance and common support; read the ATT (average treatment effect on the treated); export the replication code.
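
The matching and ATT steps can be sketched in a few lines of Python. This is a simplified illustration, assuming one-to-one nearest-neighbor matching with replacement and a caliper on the raw score scale; the helper names (`match_nearest`, `att`) and the toy numbers are hypothetical, and EcoLab's own implementation may differ.

```python
def match_nearest(ps, d, caliper=0.05):
    """Match each treated unit to its nearest control on the propensity score.

    Matching is with replacement. A treated unit whose nearest control is
    farther away than the caliper is left unmatched, which enforces a form
    of common support. Returns {treated_index: control_index}.
    """
    controls = [i for i, di in enumerate(d) if di == 0]
    pairs = {}
    if not controls:
        return pairs
    for i, di in enumerate(d):
        if di != 1:
            continue
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:
            pairs[i] = j
    return pairs


def att(y, pairs):
    # Mean outcome difference between treated units and their matches.
    if not pairs:
        raise ValueError("no treated unit found a match within the caliper")
    diffs = [y[i] - y[j] for i, j in pairs.items()]
    return sum(diffs) / len(diffs)


# Toy data: propensity score, treatment flag, outcome.
ps = [0.30, 0.32, 0.60, 0.58, 0.90, 0.31, 0.61]
d = [1, 0, 1, 0, 1, 0, 0]
y = [5.0, 3.0, 7.0, 4.0, 9.0, 3.5, 4.5]

pairs = match_nearest(ps, d)
att_hat = att(y, pairs)
print(pairs)    # {0: 5, 2: 6}; unit 4 has no control within the caliper
print(att_hat)  # 2.0
```

The treated unit with score 0.90 is dropped because no control lies within 0.05 of it. The resulting ATT therefore describes only treated units inside the region of common support, which is worth stating when reporting results.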

Replication code

Stata, R, and Python versions follow, in that order.

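* ── PSM: nearest-neighbor matching (Stata) ────────
* Install: ssc install psmatch2
* psmatch2 estimates the score by probit unless the logit option is given.
* caliper(0.05) is on the propensity-score scale; common imposes common support.
psmatch2 treated x1 x2 x3, outcome(y) ///
    neighbor(1) caliper(0.05) common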

* ── Balance check ─────────────────────────────────
pstest x1 x2 x3, both graph

* ATT is reported in the psmatch2 output

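# ── PSM: nearest-neighbor matching (R) ────────────
library(MatchIt)

# The propensity score comes from a logistic regression (MatchIt's default).
# Note: MatchIt reads caliper in standard deviations of the score by default;
# add std.caliper = FALSE to use raw score units, as psmatch2 does.
m <- matchit(
  treated ~ x1 + x2 + x3,
  data    = df,
  method  = "nearest",
  caliper = 0.05
)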

# Balance diagnostics
summary(m)
plot(m, type = "jitter")

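# Estimate ATT on matched data, with standard errors clustered on matched pairs
library(lmtest)
library(sandwich)
matched_df <- match.data(m)
model_att  <- lm(y ~ treated, data = matched_df,
                 weights = weights)
coeftest(model_att, vcov. = vcovCL, cluster = ~subclass)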

# ── PSM: propensity score matching (Python) ───────
# Option 1: causalinference
from causalinference import CausalModel

cm = CausalModel(
    Y = df["y"].values,
    D = df["treated"].values,
    X = df[["x1", "x2", "x3"]].values
)
cm.est_propensity_s()   # Estimate propensity score
cm.trim_s()             # Trim non-overlapping region
# Nearest-neighbor matching with replacement. Note: this method matches on
# the covariates X; the propensity score is used for the trimming step.
cm.est_via_matching()
print(cm.estimates)     # Reports ATE, ATC, and ATT

# Option 2: DoWhy (Microsoft)
# import dowhy
# model = dowhy.CausalModel(...)

Limitations

Does not handle unobserved confounders.
Sensitive to the matching algorithm; balance must be checked carefully.
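
One common way to check balance is the standardized mean difference (SMD) of each covariate, computed before and after matching. The sketch below is illustrative only: the function name and the toy values are hypothetical, and the 0.1 threshold mentioned afterwards is a widely used rule of thumb rather than a formal test.

```python
import statistics


def standardized_mean_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation."""
    mean_diff = statistics.mean(x_treated) - statistics.mean(x_control)
    pooled_sd = ((statistics.variance(x_treated)
                  + statistics.variance(x_control)) / 2) ** 0.5
    return mean_diff / pooled_sd


# Toy covariate values for treated units, all controls, and matched controls.
x1_treated = [1.0, 2.0, 3.0]
x1_control_before = [2.0, 3.0, 4.0]
x1_control_after = [1.0, 2.0, 3.5]

smd_before = standardized_mean_difference(x1_treated, x1_control_before)
smd_after = standardized_mean_difference(x1_treated, x1_control_after)
print(smd_before)  # -1.0
print(abs(smd_after) < abs(smd_before))  # True
```

An absolute SMD below about 0.1 after matching is often read as adequate balance. Rerunning the check under a different matching algorithm or caliper shows how sensitive balance, and hence the ATT, is to those choices.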

Video tutorial

Video Tutorial: Guide to running PSM in EcoLab
