Difference-in-Differences (DID)

DID is a causal inference method that estimates the effect of an intervention/policy by comparing the change over time between a treatment group and a control group. The idea: the difference of the before–after differences between the two groups is the policy effect, provided the two groups would have followed parallel trends absent the intervention.
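To make the arithmetic concrete, here is a minimal sketch using made-up group means. The numbers and variable names are illustrative assumptions, not EcoLab output.

```python
# Illustrative (made-up) mean outcomes for each group and period.
means = {
    ("treatment", "before"): 10.0,
    ("treatment", "after"): 14.0,
    ("control", "before"): 9.0,
    ("control", "after"): 11.0,
}


def before_after_change(group):
    """Change in the mean outcome from before to after for one group."""
    return means[(group, "after")] - means[(group, "before")]


change_treatment = before_after_change("treatment")  # 14.0 - 10.0
change_control = before_after_change("control")      # 11.0 - 9.0

# DID estimate: the treatment group's change minus the control group's change.
did_estimate = change_treatment - change_control
print(f"DID estimate: {did_estimate}")  # prints "DID estimate: 2.0"
```

The control group's change (2.0) stands in for what would have happened to the treatment group without the intervention, which is exactly where the parallel-trends assumption enters.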

In EcoLab, DID belongs to the Causal Inference group. See Estimation & Modeling.

When should you use DID?

A shock or policy affects one group from a given point in time, while another group remains unaffected.
You have data before and after the intervention for both groups (panel or repeated cross-sections).
The parallel trends assumption is plausible. It cannot be tested directly, but pre-intervention trends (pre-trends) provide a check.

Model specification

Two-group, two-period DID form:

$$
Y_{it} = \beta_0 + \beta_1 \, \text{Treat}_i + \beta_2 \, \text{Post}_t + \delta \, (\text{Treat}_i \times \text{Post}_t) + \varepsilon_{it}
$$

$\delta$ (the interaction coefficient) is the estimated policy effect (ATT).
Extension: TWFE (two-way fixed effects) with unit and time fixed effects for multiple groups/periods.
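Why the interaction coefficient equals the double difference can be read off the four conditional means implied by the two-group, two-period equation. This sketch assumes $E[\varepsilon_{it} \mid \text{Treat}_i, \text{Post}_t] = 0$:

```latex
\begin{aligned}
E[Y \mid \text{Treat}=0,\ \text{Post}=0] &= \beta_0 \\
E[Y \mid \text{Treat}=0,\ \text{Post}=1] &= \beta_0 + \beta_2 \\
E[Y \mid \text{Treat}=1,\ \text{Post}=0] &= \beta_0 + \beta_1 \\
E[Y \mid \text{Treat}=1,\ \text{Post}=1] &= \beta_0 + \beta_1 + \beta_2 + \delta \\[4pt]
\underbrace{(\beta_2 + \delta)}_{\text{change in treatment group}}
 - \underbrace{\beta_2}_{\text{change in control group}} &= \delta
\end{aligned}
```

Here $\beta_1$ absorbs the fixed level gap between the groups and $\beta_2$ absorbs the common time change, so only $\delta$ is left to capture the treatment group's extra change.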

Assumptions and tests

Parallel trends: check by comparing pre-intervention trends (pre-trends) via an event study.
No simultaneous shock affecting only one group.
SUTVA / no spillover between treatment and control groups.
With staggered treatment timing, traditional TWFE can be biased — consider modern estimators (Callaway–Sant'Anna, Sun–Abraham).
Inference: cluster standard errors by unit (or at the level at which treatment is assigned).
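The pre-trend check can be sketched by hand for the simplest case: two groups, a balanced panel, and no covariates. In that setting the event-study point estimate for a period is the treated-minus-control gap in that period, measured relative to the gap in a reference pre-period. The data below are made up and the names are hypothetical. The sketch gives point estimates only; a real analysis also needs clustered standard errors.

```python
from collections import defaultdict

# Made-up panel rows: (unit, period, treated_group, outcome).
# Treatment starts in period 3; periods 0-2 are pre-intervention.
TREATMENT_START = 3
REFERENCE_PERIOD = 2  # last pre-intervention period, normalised to zero
rows = [
    ("a", 0, 1, 5.0), ("a", 1, 1, 6.0), ("a", 2, 1, 7.0), ("a", 3, 1, 10.0), ("a", 4, 1, 11.0),
    ("b", 0, 1, 7.0), ("b", 1, 1, 8.0), ("b", 2, 1, 9.0), ("b", 3, 1, 12.0), ("b", 4, 1, 13.0),
    ("c", 0, 0, 3.0), ("c", 1, 0, 4.0), ("c", 2, 0, 5.0), ("c", 3, 0, 6.0), ("c", 4, 0, 7.0),
    ("d", 0, 0, 5.0), ("d", 1, 0, 6.0), ("d", 2, 0, 7.0), ("d", 3, 0, 8.0), ("d", 4, 0, 9.0),
]

# Mean outcome per (group, period) cell.
sums = defaultdict(float)
counts = defaultdict(int)
for _unit, period, treated, y in rows:
    sums[(treated, period)] += y
    counts[(treated, period)] += 1
cell_mean = {key: sums[key] / counts[key] for key in sums}

# Treated-minus-control gap in each period.
periods = sorted({period for _unit, period, _treated, _y in rows})
gap = {p: cell_mean[(1, p)] - cell_mean[(0, p)] for p in periods}

# Event-study point estimate: gap in period p relative to the reference gap.
event_study = {p: gap[p] - gap[REFERENCE_PERIOD] for p in periods}

for p in periods:
    label = "pre" if p < TREATMENT_START else "post"
    print(p, label, event_study[p])
```

In this constructed example the pre-period estimates are all zero and the post-period estimates are 2.0, which is the pattern one hopes to see: flat before the intervention, a shift afterwards.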

Running in EcoLab

Data Collection module: create the Treat, Post and interaction variables; prepare panel data.
Modeling module → Causal Inference group → DiD (or TWFE).
Declare the treatment variable, time, outcome variable and controls; choose clustered standard errors.
Run an event study to check pre-trends; read the $\delta$ coefficient (ATT) and the replication code.
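For readers preparing data outside EcoLab, the three variables from the first step could be built as in the sketch below. This is not EcoLab's own procedure; the column names, the policy year, and the set of treated provinces are all hypothetical.

```python
# Hypothetical inputs: names and values are assumptions for illustration.
POLICY_YEAR = 2020              # year T in which the policy starts
TREATED_PROVINCES = {"A", "B"}  # provinces covered by the policy

panel = [
    {"province": "A", "year": 2019, "revenue": 100.0},
    {"province": "A", "year": 2020, "revenue": 120.0},
    {"province": "C", "year": 2019, "revenue": 90.0},
    {"province": "C", "year": 2020, "revenue": 95.0},
]

for row in panel:
    # Treat: 1 for units that are ever covered by the policy (fixed over time).
    row["treat"] = int(row["province"] in TREATED_PROVINCES)
    # Post: 1 for periods from the policy year onwards (same for all units).
    row["post"] = int(row["year"] >= POLICY_YEAR)
    # Interaction: 1 only for treated units in post-policy periods.
    row["treat_post"] = row["treat"] * row["post"]

for row in panel:
    print(row["province"], row["year"], row["treat"], row["post"], row["treat_post"])
```

Note that Treat marks the group, not the treatment status in a given year, so it is 1 for province A in 2019 as well.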

Input / output example

Input (illustrative): a business-support policy applied from year T in some provinces; revenue is the outcome.

Output (format, illustrative figures — not real results):

Component	Coefficient	p-value / note
Treat × Post ($\delta$, ATT)	0.124***	0.003
Pre-trend (event study)	≈ 0	not significant; consistent with parallel trends

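Interpretation: if the outcome is measured in logs, the policy raised it by about 0.124 log points (roughly 12 to 13%) relative to the control group. If the outcome is in levels, read the coefficient as a change of 0.124 units instead. Insignificant pre-trends are consistent with the parallel-trends assumption but do not prove it.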
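The percentage reading depends on the outcome being in logs, which the example does not state, so treat this as an assumption. Under that assumption, a short sketch of the conversion from the illustrative coefficient to a percentage change:

```python
import math

# Illustrative coefficient from the table above, assumed to be in log points.
delta = 0.124

approx_pct = 100 * delta                 # quick approximation: coefficient x 100
exact_pct = 100 * (math.exp(delta) - 1)  # exact change implied by a log outcome

print(f"approximate: {approx_pct:.1f}%, exact: {exact_pct:.1f}%")
# prints "approximate: 12.4%, exact: 13.2%"
```

The approximation is close for small coefficients and drifts further from the exact value as the coefficient grows.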

Replication code

Examples in Stata, R, and Python follow, in that order.

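* ── Stata: DID, two-group, two-period ─────────────
* treated = treatment dummy, post = post-intervention dummy
* controls = placeholder for your list of control variables
* The coefficient on 1.treated#1.post is the ATT (δ)
regress y i.treated##i.post controls, vce(cluster id)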

* ── Modern DID for staggered treatment ────────────
* did_multiplegt: de Chaisemartin & D'Haultfoeuille estimator
* Install: ssc install did_multiplegt
did_multiplegt y id time treated, robust_dynamic ///
    placebo(3) dynamic(3) breps(100)

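# ── R: DID, two-group, two-period ─────────────────
library(did)
library(lmtest)
library(sandwich)

# Option 1: simple interaction regression
# (controls = placeholder for your control variables)
model_did <- lm(y ~ treated * post + controls, data = df)
# Cluster by unit; the treated:post coefficient is the ATT (δ)
coeftest(model_did, vcov = vcovCL(model_did, cluster = ~id))

# Option 2: Callaway & Sant'Anna (staggered DID)
# first_treat = period of first treatment (0 for never-treated units)
cs_did <- att_gt(
  yname  = "y",
  tname  = "time",
  idname = "id",
  gname  = "first_treat",
  data   = df
)
summary(cs_did)
ggdid(cs_did)   # Group-time ATT plot

# Aggregate to event time for the event study
es_did <- aggte(cs_did, type = "dynamic")
summary(es_did)
ggdid(es_did)   # Event study plot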

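# ── Python: DID, two-group, two-period ────────────
# df = pandas DataFrame with columns y, treated, post, id
# (no missing values, so the cluster groups align with the sample);
# controls = placeholder for your control variables
import statsmodels.formula.api as smf

model_did = smf.ols(
    "y ~ treated * post + controls",
    data=df
).fit(
    cov_type="cluster",
    cov_kwds={"groups": df["id"]}
)
print(model_did.summary())

# The coefficient on treated:post is the ATT (δ)
print(model_did.params["treated:post"])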

Limitations and notes

Results are invalid if parallel trends are violated — always report the event study.
Staggered DID with TWFE can suffer from "negative weighting"; use modern estimators when treatment timing differs.
The control group must be genuinely comparable; consider combining with PSM or synthetic control.
You need enough pre-intervention observations to test pre-trends.

Video tutorial

Video Tutorial: Guide to running DID in EcoLab
