## Introduction

The main goal of this post is to illustrate the dangers of relying on traditional two-way fixed-effects (TWFE) regression models with leads and lags to conduct causal inference about treatment effect dynamics. In fact, we will see that even when treatment effect dynamics are homogeneous across different treatment groups/cohorts, some of these TWFE regression models can be severely biased.

In order to effectively communicate these dangers, let’s consider a stylized simulation example where covariates do not play any major role. This example draws from several discussions I (Pedro) have had with Andrew Baker. Let’s consider three setups. In the first one all units are eventually treated, and treatment effect dynamics are homogeneous across treatment groups. In the second, we have a group of units that remain untreated at the end of our sample, and treatment effect dynamics are homogeneous across treatment groups. The third one is like the second, but with heterogeneous treatment effect dynamics across treatment groups.

## Setup with all units being eventually treated and homogeneous treatment effect dynamics

Consider a staggered treatment adoption setup where 40 ‘states’ ($$state = \{1,2,\dots,40\}$$) are randomly assigned into 4 treatment groups depending on the treatment starting year (1986, 1992, 1998, and 2004). We denote the treatment starting period as $$g$$ with ($$g \in \{ 1986, 1992, 1998, 2004\}$$). The 1000 units (e.g., ‘counties’ or ‘firms’) $$i$$ in the sample are randomly assigned to one of the 40 states. Let $$G_i$$ indicates the group/cohort unit $$i$$ belongs to, i.e., $$G \subseteq \{ 1986, 1992, 1998, 2004\}$$.

The data generating process (DGP) for the outcome $$Y$$ considered here is $Y_{i,t} = (2010-g) + \alpha_i + \alpha_t + \tau_{i,t} + \epsilon_{i,t}$ where $$\alpha_i$$ are unit fixed effects drawn from $$\sim N(\mu_{state}, 1)$$ with state-specific mean $$\mu_{state} = state/5$$, $$\alpha_t$$ are time fixed effects (cohort specific parallel time-trends) generated as $\alpha_t = 0.1 * (t - g) + \epsilon^{time FE}_t,$ with $$\epsilon^{time FE}_t \sim N(0, 1)$$, $$\epsilon_{i,t} \sim N\left(0, \left(\frac{1}{2}\right)^2\right)$$ is an idiosyncratic error term, and $$\tau_{i,t}$$ are the unit-specific treatment effects at time $$t$$ generated as $\tau_{i,t} = \mu \times (t - g + 1)\times 1\{t \ge g \},$ where $$\mu$$ is the the instantaneous treatment effect; let’s set $$\mu=1$$.

From the above DGP, it is evident that the ATT at the time a unit is treated is equal to 1 for all groups, it is equal to 2 one period after treatment started for all groups, etc. Thus, the evolution of the treatment effects is homogeneous across treatment groups. This is an a priori strong restriction that should make things “easier” to estimate (but is by no means required!).

### Visualizing the DGP

Let’s start the discussion by first visualizing how the DGP looks like.

# Load libraries and set baseline parameters
library(tidyverse)
library(lfe)
library(fastDummies)
library(ggthemes)
library(did)
theme_set(theme_clean() + theme(plot.background = element_blank()))
#----------------------------------------------------------------------------
iseed  = 20201221
nrep <- 100
true_mu <- 1
set.seed(iseed)

#----------------------------------------------------------------------------
## Generate data - treated cohorts consist of 250 obs each, with the treatment effect still = true_mu on average
make_data <- function(nobs = 1000,
nstates = 40) {

# unit fixed effects (unobservd heterogeneity)
unit <- tibble(
unit = 1:nobs,
# generate state
state = sample(1:nstates, nobs, replace = TRUE),
unit_fe = rnorm(nobs, state/5, 1),
# generate instantaneous treatment effect
#mu = rnorm(nobs, true_mu, 0.2)
mu = true_mu
)

# year fixed effects (first part)
year <- tibble(
year = 1980:2010,
year_fe = rnorm(length(year), 0, 1)
)

# Put the states into treatment groups
treat_taus <- tibble(
# sample the states randomly
state = sample(1:nstates, nstates, replace = FALSE),
# place the randomly sampled states into four treatment groups G_g
cohort_year = sort(rep(c(1986, 1992, 1998, 2004), 10))
)

# make main dataset
# full interaction of unit X year
expand_grid(unit = 1:nobs, year = 1980:2010) %>%
left_join(., unit) %>%
left_join(., year) %>%
left_join(., treat_taus) %>%
# make error term and get treatment indicators and treatment effects
# Also get cohort specific trends (modify time FE)
mutate(error = rnorm(nobs*31, 0, 1),
treat = ifelse(year >= cohort_year, 1, 0),
tau = ifelse(treat == 1, mu, 0),
year_fe = year_fe + 0.1*(year - cohort_year)
) %>%
# calculate cumulative treatment effects
group_by(unit) %>%
mutate(tau_cum = cumsum(tau)) %>%
ungroup() %>%
# calculate the dep variable
mutate(dep_var = (2010 - cohort_year) + unit_fe + year_fe + tau_cum + error)

}
#----------------------------------------------------------------------------
# make data
data <- make_data()

# plot
plot1 <- data %>%
ggplot(aes(x = year, y = dep_var, group = unit)) +
geom_line(alpha = 1/8, color = "grey") +
geom_line(data = data %>%
group_by(cohort_year, year) %>%
summarize(dep_var = mean(dep_var)),
aes(x = year, y = dep_var, group = factor(cohort_year),
color = factor(cohort_year)),
size = 2) +
labs(x = "", y = "Value", color = "Treatment group   ") +
geom_vline(xintercept = 1986, color = '#E41A1C', size = 2) +
geom_vline(xintercept = 1992, color = '#377EB8', size = 2) +
geom_vline(xintercept = 1998, color = '#4DAF4A', size = 2) +
geom_vline(xintercept = 2004, color = '#984EA3', size = 2) +
scale_color_brewer(palette = 'Set1') +
theme(legend.position = 'bottom',
#legend.title = element_blank(),
axis.title = element_text(size = 14),
axis.text = element_text(size = 12))  +
ggtitle("One draw of the DGP with homogeneous effects across cohorts \n and with all groups being eventually treated")+
theme(plot.title = element_text(hjust = 0.5, size=12))

plot1