Causal Forecasting in Panel Data: A Two-Way Synthetic Forecasting Approach
Dennis Shen
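AI summary: For the problem of forecasting the future outcomes of a target unit in panel data under an intervention it has not yet experienced, the paper proposes the Two-Way Synthetic Forecasting (TWSF) method, which combines synthetic controls with time-series extrapolation, establishes finite-sample error bounds and asymptotic normality, and illustrates the approach with a case study of NFL stadium openings.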
Details
Estimating causal effects in panel data is a central problem in policy evaluation. Existing methods largely address retrospective questions of the form: what would have happened to a target unit under a different intervention during the observed panel? In many applications, however, decision-makers face prospective questions: what will happen to a target unit under an intervention it has not yet experienced, beyond the observed panel? This article develops a framework for answering such causal forecasting questions by integrating the retrospective counterfactual logic of synthetic-controls-based approaches with the extrapolative structure of multivariate time-series forecasting. Building on the latent factor models that justify unit-side regressions in synthetic controls, we impose low-rank temporal structure on the latent time factors to identify prospective causal forecast estimands. We operationalize this strategy through the Two-Way Synthetic Forecasting estimator, or TWSF, which learns cross-unit relationships from pre-treatment outcomes and combines them with a time-series model learned from treated donor trajectories under the intervention of interest. Under suitable conditions, we establish finite-sample forecasting error bounds that imply pointwise consistency and introduce an orthogonalized correction that yields asymptotic normality and thus enables pointwise inference. We extend the framework to fixed multi-step forecasting horizons through both direct and recursive procedures, each of which inherits analogous pointwise guarantees. We corroborate the theory with simulation studies and illustrate the practical utility of TWSF by studying the public-health impact of opening NFL stadiums during the 2020 season.
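The two-way idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' TWSF estimator: it swaps in plain truncated least squares for the unit-side regression and a VAR(1) model for the time-series step, and it omits the orthogonalized correction and any inference. All function names, the simulated data, and the assumed dynamics (latent time factors following a linear recursion) are hypothetical choices made for illustration.

```python
import numpy as np


def fit_unit_weights(donors_pre, target_pre, rcond=1e-8):
    """Unit-side step: regress the target's pre-treatment outcomes on the
    donors' pre-treatment outcomes (rows are periods, columns are donors)."""
    w, *_ = np.linalg.lstsq(donors_pre, target_pre, rcond=rcond)
    return w


def fit_var1(donors_post, rcond=1e-8):
    """Time-side step: fit y_{t+1} ~ y_t @ B on the donors' treated
    trajectories. VAR(1) is an assumed stand-in for the time-series model."""
    B, *_ = np.linalg.lstsq(donors_post[:-1], donors_post[1:], rcond=rcond)
    return B


def forecast_target(donors_pre, target_pre, donors_post):
    """One-step-ahead forecast of the target's outcome under treatment."""
    w = fit_unit_weights(donors_pre, target_pre)
    B = fit_var1(donors_post)
    donors_next = donors_post[-1] @ B  # extrapolate donors one period ahead
    return float(donors_next @ w)      # map donors to the target unit


def simulate(seed=0, n_donors=8, t_pre=12, t_post=15):
    """Noiseless rank-2 factor model: outcome = <unit factor, time factor>.
    Treated time factors follow v_{t+1} = Phi @ v_t (an assumption)."""
    rng = np.random.default_rng(seed)
    U = rng.normal(size=(n_donors, 2))   # donor unit factors
    u0 = rng.normal(size=2)              # target unit factor
    V_pre = rng.normal(size=(t_pre, 2))  # time factors under control
    theta = 0.3
    Phi = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                           [np.sin(theta), np.cos(theta)]])
    V_post = np.empty((t_post + 1, 2))   # time factors under treatment
    V_post[0] = rng.normal(size=2)
    for t in range(t_post):
        V_post[t + 1] = Phi @ V_post[t]
    donors_pre = V_pre @ U.T             # observed: all units untreated
    target_pre = V_pre @ u0
    donors_post = V_post[:-1] @ U.T      # observed: donors under treatment
    truth = float(V_post[-1] @ u0)       # target under treatment, next period
    return donors_pre, target_pre, donors_post, truth


if __name__ == "__main__":
    donors_pre, target_pre, donors_post, truth = simulate()
    estimate = forecast_target(donors_pre, target_pre, donors_post)
    print(f"forecast={estimate:.6f}  truth={truth:.6f}")
```

In this noiseless simulation the forecast should match the simulated truth up to numerical error, because both regressions are exact under the assumed factor structure. With noisy outcomes, the finite-sample bounds and the orthogonalized correction developed in the paper would be needed to justify consistency and inference.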