ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction
Amaya Gallagher-Syed, Costantino Pitzalis, Myles J. Lewis, Michael R. Barnes, Gregory Slabaugh
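AI summary: This paper proposes the ProtoPathway framework, which unifies whole slide imaging and transcriptomics through encoders that produce biologically grounded representations, with the aim of improving the biological interpretability and computational efficiency of cancer survival prediction.
Details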
- Comments
- Currently under peer review
We introduce ProtoPathway, an interpretable-by-design multimodal framework for cancer survival prediction that unifies whole slide imaging and transcriptomics through encoders producing biologically grounded representations on both sides of the fusion. On the histopathology side, $K$ learnable morphological prototypes, trained end-to-end with the survival objective, serve as the slide representation itself: patches flow into prototype tokens via soft assignment, compressing variable-length patch sets into fixed task-adaptive tokens. On the genomic side, a bipartite graph neural network encodes gene expression within the Reactome pathway hierarchy, producing pathway embeddings that reflect both constituent genes and their broader biological context through bidirectional message passing over a shared gene--pathway graph. Cross-modal attention then operates over a compact prototype $\times$ pathway matrix in which prototypes query pathways, modeling the biological direction in which molecular programs give rise to tissue morphology. Because both axes carry stable task-learned identity, the attention matrix is itself an interpretability output, yielding native inference-time attribution across the full biological hierarchy, from genes through pathways and prototypes to spatial tissue maps. We evaluate on five TCGA cancer cohorts, demonstrating competitive or superior survival prediction with substantially improved biological interpretability and reduced computational cost, with interpretability claims validated through fold-stratified rank-based population-level analysis. Our source code, model weights, and Reactome pathways, together with a unified codebase reimplementing all multimodal survival baselines under identical preprocessing and evaluation, are available at: https://github.com/AmayaGS/ProtoPathway.
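The two core operations described in the abstract, soft assignment of patches to prototype tokens and cross-modal attention in which prototypes query pathways, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the dot-product similarity, the temperature parameter, the per-prototype normalisation, and the absence of learned query/key/value projections are all simplifying assumptions, and random arrays stand in for patch features, learned prototypes, and the pathway embeddings that the bipartite graph neural network would produce.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pool_patches_into_prototypes(patches, prototypes, temperature=1.0):
    """Compress N patch features into K prototype tokens (hypothetical helper).

    patches:    (N, d) patch features for one slide; N varies per slide.
    prototypes: (K, d) prototype vectors (learned in the real model).
    Returns (K, d) tokens and the (N, K) soft assignment matrix.
    """
    # Assumption: dot-product similarity between patches and prototypes.
    logits = patches @ prototypes.T / temperature        # (N, K)
    assign = softmax(logits, axis=1)                      # each patch spreads mass over K
    # Assumption: each token is the assignment-weighted mean of the patches.
    weights = assign / (assign.sum(axis=0, keepdims=True) + 1e-8)
    tokens = weights.T @ patches                          # (K, d)
    return tokens, assign


def prototype_to_pathway_attention(proto_tokens, pathway_emb):
    """Scaled dot-product attention with prototypes as queries (hypothetical helper).

    proto_tokens: (K, d) queries; pathway_emb: (P, d) keys and values.
    Returns fused (K, d) tokens and the (K, P) attention matrix.
    """
    d = proto_tokens.shape[1]
    scores = proto_tokens @ pathway_emb.T / np.sqrt(d)    # (K, P)
    attn = softmax(scores, axis=1)                        # each row sums to 1
    fused = attn @ pathway_emb                            # (K, d)
    return fused, attn


# Toy data standing in for real features (all sizes are arbitrary).
rng = np.random.default_rng(0)
N, K, P, d = 500, 8, 20, 16
patches = rng.normal(size=(N, d))
prototypes = rng.normal(size=(K, d))
pathway_emb = rng.normal(size=(P, d))

tokens, assign = pool_patches_into_prototypes(patches, prototypes)
fused, attn = prototype_to_pathway_attention(tokens, pathway_emb)
```

In this sketch `attn` has one row per prototype and one column per pathway, which mirrors the compact prototype $\times$ pathway matrix the abstract describes as an interpretability output, and `assign` links each prototype back to individual patches and so, in principle, to spatial locations on the slide.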