Digitally enriching a screening population for pancreatic cancer using routine blood-based measures and clinical histories
Chris Varghese, Leo Y. Li-Han, Richa Bisht, Ellen Larson, Frank Lee, Ryan M. Carr, Tanios S. Bekaii-Saab, Shounak Majumder, John D. Halamka, Mark Truty, Ajit H. Goenka, Hojjat Salehinejad, Cornelius A. Thiels
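AI summary: Proposes a Transformer-based neural network with multi-head attention that uses longitudinal sequences of diagnosis codes and blood test values to predict pancreatic cancer risk, stratifying risk 1 to 3 years before diagnosis and laying the groundwork for population-level digital enrichment of screening.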
Earlier detection of pancreatic cancer is key to widening access to curative treatment and reducing cancer deaths; however, screening is not presently viable. Latent indicators of pathology are evident in an individual's disease and blood test trajectories and may predict the development of pancreatic cancer. We used longitudinal sequences of coded diagnoses and blood test values, accrued by patients throughout their clinical interactions, to train a custom Transformer-based neural network with a multi-head attention mechanism. The model predicts the risk of pancreatic cancer with a multi-year lead time and risk-stratifies populations for targeted screening. The cohort comprised 6,017 adults with pancreatic cancer and 177,081 controls (overall median age 75 years, 45% female), with a median of 12 years (interquartile range 6.9-16.2) of medical history before pancreatic cancer diagnosis. In external validation by leave-one-site-out, out-of-sample testing, predictions made 1, 2, and 3 years before diagnosis achieved a mean area under the receiver operating characteristic curve of 0.837 (95% confidence interval 0.827-0.848), 0.797 (95% confidence interval 0.782-0.813), and 0.760 (95% confidence interval 0.745-0.776), respectively. Estimated pancreatic cancer risks were well calibrated (calibration plot slope 1.08, intercept -0.077; Brier score 0.025), and a Bayesian update for the population prevalence of pancreatic cancer makes the estimated risk outputs transportable across settings. At testing, a screening threshold of >3.3% risk of pancreatic cancer within 1 year offered a diagnostic odds ratio of 18.2. Our work therefore lays the foundation for a first population-level digital enrichment tool to widen access to curative-intent management of pancreatic cancer.
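The abstract does not specify how the Bayesian prevalence update is computed. As a minimal sketch, assuming the standard prior-shift correction (rescale the predicted odds by the ratio of target to source prevalence odds), the idea could look like the following. The function names and the example numbers in the usage note are hypothetical and are not taken from the paper; the diagnostic odds ratio helper uses the usual definition from a 2x2 confusion table.

```python
def _check_probability(name, value):
    """Reject values outside the open interval (0, 1), where odds are undefined."""
    if not 0.0 < value < 1.0:
        raise ValueError(f"{name} must be strictly between 0 and 1, got {value}")


def adjust_risk_for_prevalence(risk, source_prevalence, target_prevalence):
    """Re-express a predicted risk for a population with a different prevalence.

    Assumption (not confirmed by the source): a standard prior-shift
    correction, i.e. multiply the predicted odds by the ratio of the
    target prevalence odds to the source (training) prevalence odds.
    """
    _check_probability("risk", risk)
    _check_probability("source_prevalence", source_prevalence)
    _check_probability("target_prevalence", target_prevalence)

    risk_odds = risk / (1.0 - risk)
    source_odds = source_prevalence / (1.0 - source_prevalence)
    target_odds = target_prevalence / (1.0 - target_prevalence)

    adjusted_odds = risk_odds * target_odds / source_odds
    return adjusted_odds / (1.0 + adjusted_odds)


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """Diagnostic odds ratio from confusion-table counts: (TP * TN) / (FP * FN)."""
    if fp == 0 or fn == 0:
        raise ValueError("diagnostic odds ratio is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)
```

For instance, `adjust_risk_for_prevalence(0.05, source_prevalence=0.03, target_prevalence=0.001)` would rescale a 5% risk estimated in a hypothetical case-enriched cohort to a hypothetical lower-prevalence screening population. Working on the odds scale keeps the model's ranking of patients unchanged and shifts only the absolute risk, which is why a correction of this kind affects calibration but not discrimination.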