In another post I described difference-in-differences, the design that estimates a policy’s effect by comparing the change in a treated group to the change in an untreated one. Today I want to build on it, because there is a more powerful relative of that design, one I have leaned on often: the controlled interrupted time series, or CITS. It is best understood as difference-in-differences with the trends filled in.
Recall the limitation hiding inside basic difference-in-differences. In its simplest form it uses just two snapshots, before and after, for each group, comparing how much the treated group’s average moved against the comparison group’s. That works, but it is blind to everything between and around those two points. It cannot see whether the groups were already on different trajectories before the policy, which is exactly where the parallel trends assumption quietly fails.
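To make that blind spot concrete, here is a minimal sketch with made-up, noise-free numbers. The `outcome` helper, the slopes, and the snapshot times are all assumptions for illustration. The policy has no effect at all, yet the two-snapshot estimate comes out positive, because the treated group was already rising faster.

```python
# Hypothetical, noise-free outcomes: the policy does nothing,
# but the treated group was already on a steeper trajectory.
def outcome(intercept, slope, t):
    return intercept + slope * t

T_BEFORE, T_AFTER = 5, 10  # the two snapshots basic DiD looks at

treated_change = outcome(50.0, 2.0, T_AFTER) - outcome(50.0, 2.0, T_BEFORE)
comparison_change = outcome(50.0, 1.0, T_AFTER) - outcome(50.0, 1.0, T_BEFORE)

# Difference-in-differences: change in treated minus change in comparison.
did_estimate = treated_change - comparison_change
print(did_estimate)  # 5.0, although the true effect is 0
```

With only two points per group, nothing in the data can tell you that the 5-point gap is pre-existing momentum, not a policy effect.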
An interrupted time series fixes the blindness by using many measurements over time rather than two. You track the outcome across numerous points before the intervention, establish the pre-existing trend, and then test whether the series breaks at the moment of the intervention, in its level, its slope, or both. The interruption is the policy. The projected continuation of the old trend is the counterfactual.
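One common way to fit this is a segmented regression with a term for the baseline trend, a term for the level change, and a term for the slope change. The sketch below uses invented, noise-free monthly data so the known parameters are recovered exactly. The variable names, the 12/12 split, and the parameter values are assumptions, not taken from any real evaluation.

```python
import numpy as np

# Hypothetical monthly series: 12 points before, 12 after the intervention.
t = np.arange(24, dtype=float)
T0 = 12                                  # first post-intervention period (assumed)
post = (t >= T0).astype(float)           # 1 once the policy is in force
since = post * (t - T0)                  # periods elapsed since the intervention

# Noise-free outcome built from made-up parameters:
# baseline level 100, baseline slope 0.5, level change -8, slope change -0.3.
y = 100 + 0.5 * t - 8 * post - 0.3 * since

# Segmented regression: y = b0 + b1*t + b2*post + b3*since
X = np.column_stack([np.ones_like(t), t, post, since])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# Counterfactual: the pre-intervention trend projected forward.
counterfactual = b0 + b1 * t
gap_at_end = y[-1] - counterfactual[-1]  # level change plus accumulated slope change
```

Here `b2` is the break in level, `b3` is the break in slope, and `gap_at_end` is how far the observed series sits from the projected old trend in the final period.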
A single interrupted time series still has a weakness. Something else might have happened at the same moment as your intervention (a recession, a seasonal swing, a national change), and you could mistake its effect for your program’s. This is where the control comes in. A controlled interrupted time series runs two trend analyses side by side: one for the treated group and one for a comparison group that was not exposed. If the treated series breaks from its own trend at the intervention and the comparison series does not, you have strong evidence that the break was caused by the policy, not by something in the air that month.
Here is the relationship that matters. Basic difference-in-differences asks whether the treated group’s level jumped more than the comparison group’s between two points. CITS asks whether the treated group’s entire trajectory, its level and its slope, broke from its established trend by more than the comparison group’s did, using the full time series. In that sense, difference-in-differences is a simplified, two-period special case of the richer design. CITS earns its added power by relaxing the parallel trends assumption: instead of assuming the groups would have moved in parallel, it models each group’s actual trend and asks whether the treated one deviated. The assumption it keeps is milder: absent the policy, the treated group would have departed from its own trend only as much as the comparison group departed from its.
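As a rough sketch of that comparison, the single-series model can be extended with a group indicator interacted with every term, so each group gets its own baseline trend and its own break. Everything below is hypothetical: the data are noise-free and invented, a shock of -3 hits both groups at the intervention, and the policy adds a level change of -5 and a slope change of -0.4 to the treated group only.

```python
import numpy as np

t = np.arange(24, dtype=float)
T0 = 12                                  # assumed intervention period
post = (t >= T0).astype(float)
since = post * (t - T0)

# Both groups share a coincident shock (-3 in level) at T0.
# Only the treated group gets the policy (-5 level, -0.4 slope).
comparison = 80 + 0.2 * t - 3 * post
treated = 100 + 0.5 * t - 3 * post - 5 * post - 0.4 * since

def design(group):
    """Columns: 1, t, post, since, then the same four interacted with group."""
    g = np.full_like(t, group)
    return np.column_stack([np.ones_like(t), t, post, since,
                            g, g * t, g * post, g * since])

X = np.vstack([design(0.0), design(1.0)])
y = np.concatenate([comparison, treated])
coef = np.linalg.lstsq(X, y, rcond=None)[0]

# The group-by-break interactions isolate the policy from the shared shock.
level_effect, slope_effect = coef[6], coef[7]

# For contrast: a single-series fit on the treated group alone
# folds the shared shock into the level change.
naive_level = np.linalg.lstsq(design(0.0)[:, :4], treated, rcond=None)[0][2]
```

In this toy setup the controlled model recovers the policy's own level and slope changes, while the uncontrolled fit reports a level change that mixes the policy with the shock.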
That power is not free. CITS has heavier data requirements: you generally need several measurements before the intervention, at least four and ideally many more, to estimate the baseline trend, plus a good run afterward. Time series data also violate a standard regression assumption, independence of the errors, because consecutive observations are correlated. The analysis, usually a segmented regression, must therefore be adjusted for that autocorrelation or the uncertainty will be understated. And you still need a genuinely unaffected comparison series.
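A quick way to see whether autocorrelation is present is to fit the trend and look at the residuals. The sketch below simulates a series with AR(1) errors and computes two standard diagnostics: the lag-1 residual autocorrelation and the Durbin-Watson statistic. The seed, series length, and AR coefficient are arbitrary assumptions, and the series is simulated, not real.

```python
import numpy as np

rng = np.random.default_rng(0)     # fixed seed so the sketch is reproducible
n, rho = 200, 0.7                  # assumed series length and AR(1) coefficient

# Build AR(1) errors: each error carries over 70% of the previous one.
shocks = rng.normal(size=n)
errors = np.zeros(n)
for i in range(1, n):
    errors[i] = rho * errors[i - 1] + shocks[i]

t = np.arange(n, dtype=float)
y = 10 + 0.1 * t + errors

# Fit a plain linear trend by least squares and take the residuals.
X = np.column_stack([np.ones(n), t])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# Lag-1 autocorrelation of the residuals, and the Durbin-Watson statistic.
lag1 = np.sum(resid[1:] * resid[:-1]) / np.sum(resid ** 2)
durbin_watson = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
```

A Durbin-Watson value near 2 suggests little lag-1 autocorrelation, and values well below 2 point to positive autocorrelation. Common remedies include Newey-West (HAC) standard errors or a regression that models autoregressive errors directly. Which one fits depends on the data.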
For evaluators, the practical appeal is real. When a program launches at a known date and you have a decent history of outcome data for both an affected and an unaffected group, CITS lets you separate a real, sustained change from noise, pre-existing momentum, and coincident events more convincingly than a two-point comparison can. It is one of the strongest quasi-experimental designs available when randomizing is off the table.
So here is my question: When you have a clear intervention date and a run of historical data, do you reach for difference-in-differences or step up to a controlled interrupted time series, and what tips the decision for you?