PSM — Propensity Score Matching
PSM (Propensity Score Matching) estimates the effect of an intervention from observational data by matching each treated unit with a control unit that has a similar propensity score, that is, a similar estimated probability of receiving the treatment given the observed covariates. The goal is to mimic a randomized experiment and reduce selection bias on observables.
Key assumption
Conditional independence (selection on observables): given the observed covariates, treatment assignment is independent of the potential outcomes. Matching also requires common support, meaning that treated and control units overlap in their propensity scores.
Workflow
The propensity score is estimated with a logit or probit model of the treatment indicator on the observed covariates.
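As a rough illustration of the estimation step, the sketch below fits a logit propensity model by plain gradient ascent using only the Python standard library. The synthetic data, the helper names `fit_logit` and `propensity`, and the step-size settings are assumptions made for this example. They are not part of EcoLab, and in practice a statistics package would do the fitting.

```python
import math
import random


def propensity(beta, xi):
    """Logit propensity score P(treated = 1 | x) for one unit."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, d, lr=0.5, n_iter=500):
    """Fit the logit coefficients by gradient ascent on the mean log-likelihood.

    beta[0] is the intercept; lr and n_iter are illustrative settings.
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            err = di - propensity(beta, xi)   # score contribution of one unit
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic data (assumption): two covariates, treatment more likely
# when x1 is high and x2 is low.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
d = [1 if rng.random() < 1 / (1 + math.exp(-(0.8 * x1 - 0.5 * x2))) else 0
     for x1, x2 in X]

beta = fit_logit(X, d)
scores = [propensity(beta, xi) for xi in X]   # one score per unit, in (0, 1)
print("coefficients:", [round(b, 2) for b in beta])
```

The resulting `scores` are what the matching step compares across treated and control units. A probit model would replace the logistic function with the normal CDF.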
Running in EcoLab
- Modeling module → Causal inference family → PSM.
- Declare the treatment, outcome, and covariates; choose the matching algorithm.
- Run the model; check covariate balance and common support; read the ATT (average treatment effect on the treated); export the replication code.
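The checks in the last step can be sketched in a few lines of Python. The example below is a simplified illustration, not EcoLab's implementation. It uses synthetic data with a true effect of 2.0, takes the true propensity score in place of an estimated one for brevity, and the helper names `on_common_support`, `match_nearest`, and `std_mean_diff` are hypothetical.

```python
import math
import random
from statistics import mean, variance


def on_common_support(scores, treated):
    """Indices of treated units whose score lies within the range of control scores."""
    control_scores = [s for s, t in zip(scores, treated) if t == 0]
    lo, hi = min(control_scores), max(control_scores)
    return [i for i, (s, t) in enumerate(zip(scores, treated))
            if t == 1 and lo <= s <= hi]


def match_nearest(scores, treated, treated_idx, caliper=0.05):
    """1-nearest-neighbor matching with replacement on the propensity score.

    Treated units with no control within the caliper are left unmatched.
    """
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i in treated_idx:
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
    return pairs


def std_mean_diff(x_t, x_c):
    """Standardized mean difference of one covariate between two groups."""
    pooled_sd = math.sqrt((variance(x_t) + variance(x_c)) / 2)
    return (mean(x_t) - mean(x_c)) / pooled_sd


# Synthetic data (assumption): x drives both treatment and outcome.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(1000)]
scores = [1 / (1 + math.exp(-xi)) for xi in x]          # true propensity score
treated = [1 if rng.random() < p else 0 for p in scores]
y = [2.0 * t + xi + rng.gauss(0, 0.5) for t, xi in zip(treated, x)]

support = on_common_support(scores, treated)
pairs = match_nearest(scores, treated, support)

# Balance before and after matching
x_t = [xi for xi, t in zip(x, treated) if t == 1]
x_c = [xi for xi, t in zip(x, treated) if t == 0]
smd_before = std_mean_diff(x_t, x_c)
smd_after = std_mean_diff([x[i] for i, _ in pairs], [x[j] for _, j in pairs])

# Naive difference in means versus the matched ATT
naive = (mean(yi for yi, t in zip(y, treated) if t == 1)
         - mean(yi for yi, t in zip(y, treated) if t == 0))
att = mean(y[i] - y[j] for i, j in pairs)

print(f"SMD before: {smd_before:.2f}, after: {smd_after:.2f}")
print(f"naive difference: {naive:.2f}, matched ATT: {att:.2f}")
```

In this setup the naive difference in means overstates the effect because treated units have higher `x`, while the matched ATT should land close to 2.0. A standardized mean difference near zero after matching is the usual sign that balance has improved.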
Replication code
The snippets follow in this order: Stata, R, Python.
* ── PSM (Stata): nearest-neighbor matching ────────
* Install: ssc install psmatch2
* psmatch2 estimates the score by probit unless the logit option is added
psmatch2 treated x1 x2 x3, outcome(y) ///
    neighbor(1) caliper(0.05) common
* ── Balance check ─────────────────────────────────
pstest x1 x2 x3, both graph
* ATT is reported in the psmatch2 output
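# ── PSM (R): nearest-neighbor matching ────────────
library(MatchIt)
m <- matchit(
  treated ~ x1 + x2 + x3,
  data = df,
  method = "nearest",
  caliper = 0.05,
  # Caliper in raw propensity-score units (MatchIt >= 4.0);
  # the default measures it in standard deviations of the score
  std.caliper = FALSE
)
# Balance diagnostics
summary(m)
plot(m, type = "jitter")
# Estimate ATT on matched data
library(lmtest)
library(sandwich)
matched_df <- match.data(m)
model_att <- lm(y ~ treated, data = matched_df,
                weights = weights)
# Cluster-robust standard errors by matched pair (subclass)
coeftest(model_att, vcov. = vcovCL, cluster = ~subclass)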
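# ── PSM (Python): propensity score estimation and matching ──
# Install: pip install causalinference
# Option 1: causalinference (assumes df is a pandas DataFrame)
from causalinference import CausalModel
cm = CausalModel(
    Y=df["y"].values,
    D=df["treated"].values,
    X=df[["x1", "x2", "x3"]].values,
)
cm.est_propensity_s()  # Estimate propensity score
cm.trim_s()            # Trim non-overlapping region
# Nearest-neighbor matching; note that this estimator matches on the
# covariates X rather than on the propensity score itself
cm.est_via_matching()
print(cm.estimates)    # Reports ATE, ATC, and ATT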
# Option 2: DoWhy (Microsoft)
# import dowhy
# model = dowhy.CausalModel(...)
Limitations
- Does not handle unobserved confounders: matching balances only the covariates included in the propensity score model.
- Sensitive to the matching algorithm; balance must be checked carefully.