Pooled OLS — Pooled panel regression

Pooled OLS stacks all observations of a panel ($N$ units × $T$ periods) into a single sample and estimates it by OLS as if the data were a cross-section, ignoring the panel structure. It is the baseline against which fixed effects (FE) and random effects (RE) models are compared.

Strong assumption

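Pooled OLS assumes there are no unit-specific effects, i.e. the intercept $\alpha_i$ is identical across units. If unobserved unit characteristics are correlated with $X$, Pooled OLS is biased and inconsistent ⇒ use FE. Even when the assumption holds, errors within a unit are usually correlated over time, so conventional standard errors are typically understated ⇒ use standard errors clustered by unit.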
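The bias described above can be illustrated with a small simulation. This is a minimal sketch using only the Python standard library, not EcoLab's implementation: the data-generating process, the function names (`ols_slope`, `demean_by_unit`, `simulate_panel`) and all parameter values are assumptions chosen for illustration. The unit effect $\alpha_i$ enters both $x$ and $y$, so the pooled slope picks up part of it, while the within (FE) slope removes it by demeaning each unit.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_unit(values, ids):
    """Subtract each unit's own mean (the within, or FE, transformation)."""
    totals, counts = {}, {}
    for v, i in zip(values, ids):
        totals[i] = totals.get(i, 0.0) + v
        counts[i] = counts.get(i, 0) + 1
    return [v - totals[i] / counts[i] for v, i in zip(values, ids)]


def simulate_panel(n_units=500, n_periods=5, beta=1.0, seed=0):
    """Hypothetical panel in which the unit effect alpha_i also drives x."""
    rng = random.Random(seed)
    ids, x, y = [], [], []
    for unit in range(n_units):
        alpha = rng.gauss(0.0, 1.0)             # unobserved unit effect
        for _ in range(n_periods):
            x_it = alpha + rng.gauss(0.0, 1.0)  # x is correlated with alpha
            y_it = beta * x_it + alpha + rng.gauss(0.0, 0.5)
            ids.append(unit)
            x.append(x_it)
            y.append(y_it)
    return ids, x, y


ids, x, y = simulate_panel()
pooled_slope = ols_slope(x, y)
within_slope = ols_slope(demean_by_unit(x, ids), demean_by_unit(y, ids))
print(f"true beta = 1.0, pooled OLS = {pooled_slope:.3f}, within (FE) = {within_slope:.3f}")
```

In this assumed design the pooled slope converges to $1 + \mathrm{Cov}(x, \alpha)/\mathrm{Var}(x) = 1.5$ rather than the true value of 1, whereas the within estimate stays close to 1.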

Model specification

$$
Y_{it} = \beta_0 + X_{it}\beta + \varepsilon_{it}
$$

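This is ordinary OLS applied to all $N \times T$ stacked observations, with the common intercept $\beta_0$ taking the place of unit-specific effects $\alpha_i$. Cluster the standard errors by unit.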
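To make "clustered by unit" concrete, the following sketch computes the pooled OLS slope for a single regressor together with two standard errors: a heteroskedasticity-robust one that treats every observation as independent, and a cluster-robust one that first sums the scores within each unit. It is a simplified illustration, not what any package does internally: the function name and the toy data are hypothetical, only one regressor is handled, and no finite-sample correction is applied, so software output will generally differ slightly.

```python
import math


def pooled_ols_one_regressor(x, y, ids):
    """Pooled OLS of y on an intercept and one regressor x.

    Returns (b0, b1, se_robust, se_cluster) for the slope b1.
    se_robust treats observations as independent (HC0);
    se_cluster allows arbitrary error correlation within each unit.
    No finite-sample correction is applied (assumption of this sketch).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    x_dev = [a - mean_x for a in x]
    sxx = sum(d * d for d in x_dev)
    b1 = sum(d * (b - mean_y) for d, b in zip(x_dev, y)) / sxx
    b0 = mean_y - b1 * mean_x
    resid = [b - b0 - b1 * a for a, b in zip(x, y)]

    # HC0: square the score of each observation, then sum
    meat_robust = sum((d * e) ** 2 for d, e in zip(x_dev, resid))

    # Clustered: sum the scores within each unit first, then square
    unit_scores = {}
    for d, e, i in zip(x_dev, resid, ids):
        unit_scores[i] = unit_scores.get(i, 0.0) + d * e
    meat_cluster = sum(s ** 2 for s in unit_scores.values())

    return b0, b1, math.sqrt(meat_robust) / sxx, math.sqrt(meat_cluster) / sxx


# Hypothetical toy panel: 3 units x 2 periods, x constant within each unit
ids = [1, 1, 2, 2, 3, 3]
x = [-1, -1, 0, 0, 1, 1]
y = [-2, -2, 1, 1, 1, 1]

b0, b1, se_robust, se_cluster = pooled_ols_one_regressor(x, y, ids)
print(f"b0={b0:.3f}, b1={b1:.3f}, se_robust={se_robust:.3f}, se_cluster={se_cluster:.3f}")
```

In this toy panel the residuals are identical within each unit, so the clustered variance is exactly $T = 2$ times the unclustered one. This is the sense in which ignoring within-unit correlation understates uncertainty.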

Running in EcoLab

Modeling module → Linear panel data family → Pooled OLS.
Declare the entity and time variables, $Y$, and $X$; choose clustered SE.
Run; compare with FE/RE via tests; export the replication code.
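One common way to compare Pooled OLS with FE is an F test of the joint significance of the unit effects; whether EcoLab uses exactly this test is not stated here, so treat the sketch below as an assumption. For a balanced panel with $K$ regressors, the statistic is $F = \frac{(RSS_{pooled} - RSS_{FE})/(N-1)}{RSS_{FE}/(NT - N - K)}$. The function names and toy data are hypothetical, only one regressor is handled, and the test in this form presumes homoskedastic, serially uncorrelated errors.

```python
def rss_simple(x, y):
    """Residual sum of squares from regressing y on an intercept and x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    return syy - sxy ** 2 / sxx


def demean_by_unit(values, ids):
    """Subtract each unit's own mean (the within transformation)."""
    totals, counts = {}, {}
    for v, i in zip(values, ids):
        totals[i] = totals.get(i, 0.0) + v
        counts[i] = counts.get(i, 0) + 1
    return [v - totals[i] / counts[i] for v, i in zip(values, ids)]


def poolability_f(x, y, ids, n_regressors=1):
    """F statistic for H0: all unit effects are equal (pooled OLS is adequate)."""
    n_obs = len(y)
    n_units = len(set(ids))
    rss_pooled = rss_simple(x, y)
    # The within regression has the same residuals as OLS with unit dummies
    rss_fe = rss_simple(demean_by_unit(x, ids), demean_by_unit(y, ids))
    df1 = n_units - 1
    df2 = n_obs - n_units - n_regressors
    f_stat = ((rss_pooled - rss_fe) / df1) / (rss_fe / df2)
    return f_stat, df1, df2


# Hypothetical toy panel: 3 units x 2 periods with very different unit levels
ids = [1, 1, 2, 2, 3, 3]
x = [0, 1, 0, 1, 0, 1]
y = [-1, 1, 4.5, 5.5, 8.5, 11.5]

f_stat, df1, df2 = poolability_f(x, y, ids)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

A large statistic relative to the $F(N-1,\ NT-N-K)$ critical value is evidence against pooling. Packaged versions of this test exist, for example `pFtest` in R's plm.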

Replication code

Stata
R
Python

* ---- Stata: Pooled OLS with clustered SE ----
use "panel_data.dta", clear

* Pooled OLS with clustered standard errors by entity
reg y x1 x2, vce(cluster id)

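# ---- R: Pooled OLS with clustered SE ----
library(plm)
library(lmtest)

# Load panel data (illustrative)
df <- read.csv("panel_data.csv")
pdata <- pdata.frame(df, index = c("id", "time"))

# Pooled OLS (summary() reports conventional standard errors)
model_pooled <- plm(y ~ x1 + x2, data = pdata, model = "pooling")
summary(model_pooled)

# Standard errors clustered by entity
coeftest(model_pooled, vcov = vcovHC(model_pooled, type = "HC1", cluster = "group"))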

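# ---- Python: Pooled OLS with clustered SE ----
import pandas as pd
import statsmodels.api as sm
from linearmodels.panel import PooledOLS

# Load panel data (illustrative); linearmodels expects an (entity, time) MultiIndex
df = pd.read_csv("panel_data.csv")
df = df.set_index(["id", "time"])

y = df["y"]
# PooledOLS does not add an intercept automatically, so add the constant for beta_0
X = sm.add_constant(df[["x1", "x2"]])

# Cluster standard errors by entity
results = PooledOLS(y, X).fit(cov_type="clustered", cluster_entity=True)
print(results)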

Video tutorial

Video Tutorial: Running Pooled OLS in EcoLab
