Stephen T’s Blog Spot

A blog aimed at issues only data scientists, data analysts, statisticians, evaluators, and researchers care about.

Across several posts on causal inference (propensity-score matching, regression discontinuity, target trial emulation), I kept hitting the same wall. Each method could adjust for the confounders you had measured, but none could touch the ones you had not. Unmeasured confounding was the shared limit. There is one classic method built to climb that wall, and it relies on a clever trick. It is called instrumental variables.

The trick is to borrow a piece of randomness from the world. An instrument is a variable that nudges people toward or away from the treatment but has no other route to the outcome. If you can find one, the part of the treatment driven by the instrument behaves as though it were randomized, and you can use that slice of variation to estimate a causal effect even while unmeasured confounders lurk. The classic example is quarter of birth. Because of school-entry and compulsory-attendance rules, children born at different times of year once ended up with slightly different amounts of schooling, for reasons unrelated to ability or family. Joshua Angrist and Alan Krueger used that accident of timing to estimate the effect of schooling on earnings. The draft lottery is another: random numbers pushed some men into military service, letting researchers estimate the effect of service on later earnings.
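To make the trick concrete, here is a minimal simulation sketch in Python. Everything in it is invented for illustration: the coin-flip instrument `z`, the hidden confounder `u`, and all the coefficients are my assumptions, not numbers from the schooling or draft studies. It compares a naive regression slope with the simplest instrumental-variables estimate, the ratio of two covariances.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_effect = 2.0                      # assumed causal effect of d on y

u = rng.normal(size=n)                 # unmeasured confounder (never used in estimation)
z = rng.integers(0, 2, size=n)         # instrument: a coin flip, unrelated to u
d = 0.8 * z + u + rng.normal(size=n)   # treatment, pushed by both z and u
y = true_effect * d + 3.0 * u + rng.normal(size=n)  # z reaches y only through d

# Naive slope of y on d: contaminated by u, so it lands well above 2.
naive = np.cov(d, y)[0, 1] / np.var(d, ddof=1)

# IV (Wald) estimate: how much z moves y, divided by how much z moves d.
iv = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]

print(f"naive slope: {naive:.2f}")
print(f"IV estimate: {iv:.2f}")
```

The naive slope absorbs the confounder, while the instrument-driven slice of variation recovers something close to the true effect. That only works because the simulation builds in a valid instrument by construction, which real data never promises.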

The power comes entirely from the instrument being valid, and validity has three parts. First, relevance: the instrument has to actually move the treatment, and move it strongly. Second, the exclusion restriction: it must affect the outcome only through the treatment, with no side door. Third, exogeneity: it must be unrelated to the unmeasured confounders, as if randomly assigned. The first part you can check in the data. The other two, which carry the entire causal claim, you mostly cannot. They rest on an argument about how the world works, not on a statistical test.
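Relevance is the one condition the data can speak to, and the usual check is the first-stage F statistic: regress the treatment on the instrument and see how strongly it predicts. Below is a rough sketch for a single instrument, where F is just the squared t statistic on the instrument's coefficient. The function name `first_stage_f` and the simulated data are hypothetical, and the common rule of thumb that F should exceed about 10 is a convention, not a guarantee.

```python
import numpy as np

def first_stage_f(z, d):
    """First-stage F for one instrument: squared t statistic from OLS of d on z."""
    z = np.asarray(z, dtype=float)
    d = np.asarray(d, dtype=float)
    n = len(z)
    X = np.column_stack([np.ones(n), z])          # intercept + instrument
    beta, *_ = np.linalg.lstsq(X, d, rcond=None)  # OLS coefficients
    resid = d - X @ beta
    sigma2 = resid @ resid / (n - 2)              # residual variance
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)    # classical OLS covariance
    t_stat = beta[1] / np.sqrt(cov_beta[1, 1])
    return t_stat ** 2

rng = np.random.default_rng(0)
n = 5_000
z = rng.normal(size=n)
d_strong = 0.5 * z + rng.normal(size=n)    # instrument clearly moves the treatment
d_weak = 0.01 * z + rng.normal(size=n)     # instrument barely moves the treatment

f_strong = first_stage_f(z, d_strong)
f_weak = first_stage_f(z, d_weak)
print(f"strong instrument F: {f_strong:.1f}")
print(f"weak instrument F:   {f_weak:.1f}")
```

Note what this does not test. A large F says nothing about the exclusion restriction or exogeneity, so passing this check is necessary and nowhere near sufficient.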

This is the uncomfortable part. A variable can be a strong predictor of the treatment and still be a terrible instrument, because it reaches the outcome through some other path. When that happens, instrumental variables does not reduce bias. It can amplify it, and hand you a precise, confident, wrong answer. A weak instrument is its own hazard: if it barely moves the treatment, the estimate turns unstable and biased, exactly the problem later critics found hiding in some famous quarter-of-birth results. The method is powerful and fragile at once, and a bad instrument is worse than none.
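A small simulation shows how the amplification works. With one instrument, a direct path from the instrument to the outcome adds a bias of roughly the side-door effect divided by the first-stage effect, so a modest first stage magnifies even a small violation. The sketch below uses made-up numbers throughout: `simulate_iv`, the first-stage coefficient of 0.2, and the side-door coefficient of 0.1 are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_iv(side_door, first_stage=0.2, true_effect=1.0, n=500_000, seed=1):
    """Simulate data and return the IV (Wald) estimate of the effect of d on y."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                          # unmeasured confounder
    z = rng.normal(size=n)                          # instrument, independent of u
    d = first_stage * z + u + rng.normal(size=n)    # treatment
    # side_door is the direct effect of z on y that bypasses the treatment
    y = true_effect * d + side_door * z + u + rng.normal(size=n)
    return np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]

iv_clean = simulate_iv(side_door=0.0)   # exclusion restriction holds
iv_leaky = simulate_iv(side_door=0.1)   # small direct path from z to y

print(f"valid instrument: {iv_clean:.2f}")
print(f"leaky instrument: {iv_leaky:.2f}")
print(f"predicted bias:   {0.1 / 0.2:.2f}")
```

A direct effect of only 0.1 turns into a bias of about 0.5, half the size of the true effect, and the huge sample makes the wrong answer look precise. Shrinking the first stage further makes it worse.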

Even a strong, valid instrument answers a narrower question than most readers assume. Under the usual added assumption that the instrument never pushes anyone in the opposite direction, it estimates the effect only for the compliers, the people whose behavior the instrument actually changed. In the draft lottery, that means the men who served because they were drafted and would not have otherwise, not the volunteers, and not those who would have avoided service regardless. The result is a local average treatment effect. It need not generalize to everyone, and reporting it as the effect for everyone is a quiet overreach.
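Here is a sketch of that narrowing in a simulated lottery. The population shares, the effect sizes for each group, and the absence of anyone who does the opposite of what the lottery says are all assumptions I picked for illustration. The point is only that the instrumental-variables estimate tracks the complier effect, not the population average.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Hypothetical population: 30% compliers, 20% always-takers, 50% never-takers.
COMPLIER, ALWAYS, NEVER = 0, 1, 2
kind = rng.choice([COMPLIER, ALWAYS, NEVER], size=n, p=[0.3, 0.2, 0.5])

z = rng.integers(0, 2, size=n)   # randomized instrument, e.g. a lottery draw

# Treatment taken: always-takers take it, never-takers refuse, compliers follow z.
d = np.where(kind == ALWAYS, 1, np.where(kind == NEVER, 0, z))

# Assumed individual treatment effects, differing by group.
effect = np.where(kind == COMPLIER, 1.0, np.where(kind == ALWAYS, 4.0, 0.0))
y = rng.normal(size=n) + effect * d

# Wald estimator: difference in mean outcome over difference in mean take-up.
itt = y[z == 1].mean() - y[z == 0].mean()
take_up = d[z == 1].mean() - d[z == 0].mean()
late = itt / take_up

ate = effect.mean()   # population average effect, knowable only in a simulation

print(f"IV estimate (LATE):       {late:.2f}")
print(f"complier effect (truth):  {1.0:.2f}")
print(f"average effect, everyone: {ate:.2f}")
```

The estimate sits near the complier effect of 1.0 and says nothing about the always-takers, whose effect here is four times larger.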

So instrumental variables is the rare tool that genuinely targets unmeasured confounding, the wall the other designs could not climb. But it does not make the problem vanish. It relocates it. Instead of assuming you measured every confounder, you now must defend, by argument rather than data, that your instrument has no side door to the outcome and was as good as random. Often that is just as hard. The honesty lies in stating which assumption your causal claim leans on, and admitting it cannot be proven.

For those of us in evaluation, real instruments are rare and valuable: a lottery for an oversubscribed program, a policy cutoff, a randomized nudge to participate. When you have one, it can rescue a question that observational data alone could not answer. But resist the urge to christen any convenient variable an instrument. Ask the hard question out loud: could this thing affect the outcome through any path other than the program? If you cannot rule that out, you do not have an instrument. You have another confounder.

So here is my question: Have you ever found a genuinely credible instrument in your own work, and did its exclusion restriction survive a skeptical second look?
