Beyond Exchangeability: Distribution-Shift-Aware Integration of External Control Data in Randomized Trials
Jiawei Shan, Yiteng Tu, Guanbo Wang, Chao Ying, Jiwei Zhao
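AI summary: To address distribution shift between randomized trials and external control data, the paper proposes augmented estimators that balance the two populations through calibration equations, and develops an adaptive shrinkage estimator that preserves consistency while guaranteeing an efficiency advantage over the trial-only benchmark.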
Randomized controlled trials (RCTs) are the gold standard for evaluating causal effects but are often costly and difficult to scale; consequently, they are frequently augmented with auxiliary external controls in many applications. Prior approaches for borrowing such data typically rely on exchangeability, under which the external controls are readily usable for inference in the trial population. In practice, however, differences in eligibility criteria, standard of care, and data collection procedures may induce distribution shifts between the RCT and the external controls, rendering exchangeability implausible. In this paper, we propose a novel framework for integrating external controls by explicitly modeling these distribution shifts. We construct augmented estimators by adapting trial-only efficient influence functions through calibration equations that balance the trial and external populations, thereby fully exploiting the external control data even when exchangeability fails. We further develop an adaptive shrinkage estimator that preserves consistency while guaranteeing efficiency dominance over the trial-only benchmark. Synthetic experiments and a real data application demonstrate the practical advantages of the proposed approaches.
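To make the idea of a calibration equation that balances the trial and external populations concrete, the following is a minimal sketch of generic exponential-tilting (entropy-balancing style) weights. It is not the estimator proposed in the paper, whose exact form the abstract does not give. The function name `calibration_weights`, the choice to balance covariate means only, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def calibration_weights(x_ext, target_mean, n_iter=50, tol=1e-10):
    """Hypothetical helper: weights on external controls, proportional to
    exp(lam' x), chosen so the weighted covariate mean equals target_mean.

    Solves the calibration equation  sum_i w_i (x_i - target_mean) = 0
    by Newton's method on the convex dual (a log-sum-exp objective).
    """
    z = x_ext - target_mean              # covariates centered at the target
    lam = np.zeros(z.shape[1])

    def weights(lam):
        s = z @ lam
        w = np.exp(s - s.max())          # subtract the max for stability
        return w / w.sum()

    for _ in range(n_iter):
        w = weights(lam)
        grad = w @ z                     # weighted mean of centered covariates
        if np.max(np.abs(grad)) < tol:
            break
        # Weighted covariance of z is the Hessian of the dual objective.
        hess = (z * w[:, None]).T @ z - np.outer(grad, grad)
        lam = lam - np.linalg.solve(hess, grad)
    return weights(lam)


# Simulated illustration (assumed data): the external controls are shifted
# relative to the trial population in both covariates.
rng = np.random.default_rng(0)
x_trial = rng.normal(0.0, 1.0, size=(200, 2))
x_ext = rng.normal(0.3, 1.0, size=(500, 2))

w = calibration_weights(x_ext, x_trial.mean(axis=0))
balanced_mean = w @ x_ext                # matches the trial mean after weighting
```

After weighting, the external controls reproduce the trial covariate means, which is the balancing property a calibration equation enforces. Balancing means alone addresses covariate shift only; shifts in the outcome distribution given covariates, which the abstract also contemplates, would need additional modeling.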