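arXivDaily: arXiv daily academic digest, updated Monday through Friday

Science and Medicine

Medical AI

Medical intelligence, clinical AI, medical imaging, pathology, diagnosis, and large models for healthcare.

Indexed for today/current date: 47. Signal sources: cs.CV, cs.LG, q-bio, eess.IV, eess.SP

1. Diagnostic assistance (3 papers)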

2606.19140 2026-06-18 cs.LG New submission 85%

ChronoSurv: A Clinical Pathway-Guided Graph Framework for Multimodal Survival Analysis

Hugo Miccinilli, Theo Di Piazza

Affiliations: Université Paris-Saclay, CentraleSupélec, MICS, France; University of Lyon, INSA Lyon, CREATIS, France

Topic match: Diagnostic assistance (multimodal survival analysis framework for head and neck cancer prediction)

AI summary: Proposes ChronoSurv, a directed-graph framework for multimodal survival analysis that models clinical trajectories through a hierarchical topology and heterogeneous message passing, achieving the best discriminative performance and reliable calibration on head and neck cancer datasets.

Comments: Accepted at MICCAI 2026. Submitted version due to embargo

Abstract

Accurate survival prediction is essential for personalized treatment planning in head and neck cancer, yet remains challenging due to the heterogeneous and high-dimensional nature of multimodal clinical data. While deep survival models have improved predictive performance over classical statistical approaches, existing methods typically rely on static fusion strategies or temporally agnostic modeling, limiting their ability to capture structured clinical workflows. In this work, we propose ChronoSurv, a heterogeneous hierarchical directed graph framework for multimodal survival analysis. ChronoSurv represents patient care as a progression-aware clinical trajectory using directed graphs aligned with key diagnostic steps. A hierarchical topology incorporates fine-grained, coarse, and global representations, further supporting flexible adaptation to missing modalities, while heterogeneous message passing models complex and asymmetric relationships across modalities and clinical steps. Experimental results on two public datasets demonstrate that ChronoSurv achieves state-of-the-art discriminative performance while maintaining statistically reliable calibration. Comprehensive ablation studies further confirm the contribution of each architectural component, highlighting the potential of trajectory-aware graph modeling for multimodal survival prediction.

2606.18571 2026-06-18 cs.LG cs.CL cs.SD eess.AS New submission 85%

Fair Cognitive Impairment Detection Through Unlearning

William Nguyen, Jiali Cheng, Hadi Amiri

Affiliations: University of Massachusetts Lowell, USA

Topic match: Diagnostic assistance (multimodal framework for fair detection of mild cognitive impairment)

AI summary: Proposes a multimodal framework that combines cross-modal fusion with gradient-reversal unlearning to reduce demographic bias in mild cognitive impairment detection, narrowing performance gaps on cross-lingual datasets.

Comments: Interspeech 2026

Abstract

Mild Cognitive Impairment (MCI) is a medical condition characterized by a noticeable decline in memory, language, or thinking abilities. MCI detection from spontaneous speech is promising for scalable screening. However, learned models often exploit demographic cues correlated with labels, resulting in a large performance gap across subgroups. We present a multimodal framework that combines (i) cross-model fusion between modalities (speech, text, and image), and (ii) unlearning using gradient reversal that discourages the shared embedding from encoding task-irrelevant demographic attributes. Evaluated on the multilingual benchmarks TAUKADIAL and PREPARE, our method outperforms the state-of-the-art multilingual and multimodal baseline in MCI classification while substantially reducing the performance gap across patient subgroups (sex and language). We further analyze transfer across datasets, showing that demographic unlearning helps learn more robust representations for MCI detection.
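
The gradient-reversal idea mentioned in this abstract can be written compactly. The following is the standard textbook formulation, not necessarily the exact objective used by the authors. The symbols are assumptions introduced here: $\theta_f$ for the shared encoder, $\theta_y$ for the MCI classifier, $\theta_d$ for the demographic classifier, $\mathcal{L}_y$ and $\mathcal{L}_d$ for their losses, $\eta$ for the learning rate, and $\lambda$ for the reversal strength.

```latex
\begin{aligned}
R_\lambda(z) &= z, \qquad \frac{\partial R_\lambda}{\partial z} = -\lambda I, \\
\theta_f &\leftarrow \theta_f - \eta\left(\frac{\partial \mathcal{L}_y}{\partial \theta_f} - \lambda\,\frac{\partial \mathcal{L}_d}{\partial \theta_f}\right), \\
\theta_y &\leftarrow \theta_y - \eta\,\frac{\partial \mathcal{L}_y}{\partial \theta_y}, \qquad
\theta_d \leftarrow \theta_d - \eta\,\frac{\partial \mathcal{L}_d}{\partial \theta_d}.
\end{aligned}
```

The reversal layer $R_\lambda$ is the identity in the forward pass and flips the gradient sign in the backward pass, so the demographic head still learns to predict the attribute while the shared embedding is pushed to make that prediction harder.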

2606.15973 2026-06-18 eess.SP New submission 85%

An auscultation location specific study on the relationship between expiratory-to-inspiratory acoustic patterns and spirometric airflow limitation across age and gender in asthmatic patients

Dheeraj Harish Kumar, Sanjana M C, Perumal Keerthi Priya, K V Nikhath Khanam, Uma Maheshwari Krishnaswamy, Prasanta Kumar Ghosh

Topic match: Diagnostic assistance (respiratory sound analysis to support asthma diagnosis; medical AI)

AI summary: Analyzes breath-sound spectra from 141 asthma patients and finds that the expiratory-to-inspiratory sound power ratio correlates significantly with FEV1/FVC in the 100-400 Hz range, with the correlation depending on auscultation site, age, and gender.

Abstract

Asthma causes expiratory airflow limitation and is clinically assessed using spirometry, which provides the FEV1/FVC ratio representing the proportion of air exhaled in the first second relative to total forced vital capacity. Prior studies suggest that respiratory sounds recorded at posterior sites (Left Lower, Left Upper, Right Upper, Right Lower) reflect regional airflow patterns. In this study, we investigate the relationship between the expiratory-to-inspiratory (E/I) spectral power ratio and FEV1/FVC in 141 participants aged 20-60 years using Spearman correlation across frequency subbands. The 100-200 Hz and 200-400 Hz bands showed significant correlations. Overall, lower posterior sites showed stronger associations; younger adults showed stronger correlations at the Left Lower site, whereas older adults showed stronger correlations at the Left Upper site. Gender-stratified analysis showed stronger Left Lower correlations in males and stronger Left Upper correlations in females.
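
For readers unfamiliar with the statistic used in this abstract, here is a minimal sketch of a Spearman rank correlation in plain Python. The function names and the six data points are hypothetical, chosen only to show the mechanics; they are not values from the study.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values for illustration only (not data from the study).
ei_power_ratio = [0.82, 1.10, 0.95, 1.40, 1.25, 0.70]
fev1_fvc = [0.81, 0.66, 0.74, 0.55, 0.62, 0.85]
rho = spearman(ei_power_ratio, fev1_fvc)
```

In a per-site, per-subband analysis such as the one described, a function like this would presumably be applied once for each combination of auscultation site and frequency band.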

2. Medical imaging (8 papers)

2606.18749 2026-06-18 cs.CV New submission 85%

Toward Training-Free Zero-Shot Anomaly Detection in 3D Medical Images: A Batch-Based Approach Using 2D Foundation Models

Tai Le-Gia

Affiliations: Chungnam National University

Topic match: Medical imaging (training-free zero-shot anomaly detection in 3D medical images)

AI summary: Proposes the CS3F framework, which uses 2D foundation models for zero-shot anomaly detection in 3D medical images: volumes are decomposed along multiple axes, encoded slice by slice, and scored by cross-subject similarity, with a coarse-to-fine tokenization strategy to reduce signal attenuation.

Abstract

Zero-shot anomaly detection (ZSAD) is attractive for medical imaging because clinical systems must handle heterogeneous acquisition protocols, changing patient populations, and pathologies for which annotated training data may be unavailable. Most existing zero-shot anomaly detection methods are designed for 2D images, and their direct extension to 3D medical volumes is limited by the scarcity of large-scale volumetric foundation models or by the difficulty of utilizing volumetric context. We propose CS3F, a training-free batch-based framework for ZSAD in 3D medical images using 2D foundation models. Each volume is decomposed along multiple anatomical axes and encoded slice-wise by a 2D vision transformer. These are then converted into localized volumetric tokens by pooling neighboring slice features. Anomaly scores are obtained from cross-subject mutual similarity: tokens that lack close analogues in other subjects are assigned higher anomaly scores. To reduce the attenuation of focal lesion signals caused by depth pooling, we introduce a coarse-to-fine tokenization strategy that enables fine-resolution volumetric scoring without exhaustive matching. CS3F is evaluated on brain MRI across metastases, glioma, and stroke, as well as validated on lung CT to test generalizability beyond atlas-aligned brain MRI. The results show that frozen 2D foundation models can support anomaly localization in 3D medical images, and that the benefit of fine tokenization depends strongly on lesion contrast and imaging modality.
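
The cross-subject scoring idea can be illustrated with a toy sketch. This is a simplified nearest-match score under stated assumptions, not the CS3F scoring rule itself: tokens are short lists of floats, similarity is cosine similarity, and a token's score is one minus its mean best match across the other subjects. All names and numbers are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def anomaly_scores(subjects):
    """subjects: list of per-subject token lists.
    A token's score is 1 minus the mean, over all other subjects,
    of its best cosine match in that subject."""
    scores = []
    for i, tokens in enumerate(subjects):
        subject_scores = []
        for token in tokens:
            best_matches = [
                max(cosine(token, other) for other in other_tokens)
                for j, other_tokens in enumerate(subjects)
                if j != i
            ]
            subject_scores.append(1.0 - sum(best_matches) / len(best_matches))
        scores.append(subject_scores)
    return scores

# Toy 2-D "tokens" for a batch of three subjects (hypothetical).
batch = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
    [[1.0, 0.1], [0.1, 1.0], [-1.0, -1.0]],  # last token has no close analogue elsewhere
]
scores = anomaly_scores(batch)
```

The last token of the third subject receives by far the highest score because no other subject contains a similar token, which is the behavior the abstract describes for tokens lacking close analogues.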

2606.18658 2026-06-18 cs.CV eess.IV New submission 85%

On-Manifold Variational Learning with Heat-Kernel Priors

Jiarui Xing, Tal Zeevi, Nian Wu, Jian Wang

Affiliations: Yale School of Medicine; University of Virginia; Harvard Medical School

Topic match: Medical imaging (highest accuracy on cardiac scar and brain MRI benchmarks)

AI summary: Proposes a manifold-anchored variational framework whose geometry-aware EM algorithm selects graph medoids on a heat-kernel-weighted latent graph as prototypes, keeping prototypes on-manifold, and uses Dirichlet energy regularization to keep the latent geometry smooth; it attains the highest accuracy and sharp prototypes on cardiac scar and brain MRI benchmarks.

Abstract

Learning unsupervised representations of medical imaging cohorts can reveal clinically meaningful prototypes without expert labels, which are often noisy and fail to capture true pathological heterogeneity. However, existing deep latent-variable models estimate Gaussian mixture priors via Euclidean averaging, producing prototypes that drift off the curved data manifold and degenerate as the number of sub-populations grows. We propose a manifold-anchored variational framework built on a geometry-aware Expectation-Maximization (EM) algorithm, whose M-step selects each sub-population prototype as the graph medoid with the highest diffusion centrality on a heat-kernel-weighted latent graph, ensuring that every prototype remains on-manifold. A Dirichlet energy regularizer enforces geometric smoothness of the latent space, and a per-sub-population uncertainty score enables label-free quality assessment. The manifold-anchored EM is a general-purpose geometric tool that extends standard EM and applies readily to other latent-variable models beyond this setting. On cardiac scar and brain MRI benchmarks, our framework attains the highest accuracy among all compared methods, produces the sharpest prototypes reported to date, and remains stable at large sub-population counts where all baselines degenerate.

2606.17412 2026-06-18 cs.CV cs.AI New submission 85%

Enhancing Pathological VLMs with Cross-scale Reasoning

Chi Phan, Tianyi Zhang, Qiaochu Xue, Yufeng Wu, Dan Hu, Zeyu Liu, Sudong Wang, Yueming Jin

Affiliations: Department of Electrical and Computer Engineering, National University of Singapore; PuzzleLogic Pte Ltd; Department of Pathology, Fujian Medical University Cancer Hospital & Fujian Cancer Hospital

Topic match: Medical imaging (cross-scale reasoning for pathology VLMs; medical image analysis)

AI summary: Introduces the first cross-scale training and evaluation paradigm, strengthening the cross-scale reasoning of pathology vision-language models through multi-magnification visual question answering, and builds the Scale-VQA benchmark and the ScaleReasoner-R1 model, which achieves the best performance.

Abstract

Pathological images are inherently multi-scale, requiring pathologists to integrate evidence from global tissue architecture at low magnification to cellular morphology at higher magnification for accurate diagnosis. While existing pathological datasets for vision-language models (VLMs) include various scales, they often lack an explicit cross-scale reasoning objective. This limitation prevents VLMs from capturing essential cross-scale representations and learning evidence-based reasoning. To bridge this gap, we introduce the first cross-scale training and evaluation paradigm that formulates pathology interpretation as multi-magnification reasoning. However, creating such a task reveals a critical challenge: multi-image visual question answering (VQA) is prone to text-only shortcuts, which allow models to guess answers using magnification-dependent artifacts rather than visual evidence. To address this, we propose a leakage-aware curation pipeline that combines adversarial text-only screening with constraint-guided question design. Using this pipeline, we construct Scale-VQA, a high-quality benchmark with 4,685 multiple-choice questions grounded in 2,537 pathology images across multiple magnification levels. Finally, we present ScaleReasoner-R1, a model trained via reinforcement learning to optimize performance on the cross-scale VQA task. ScaleReasoner-R1 achieves state-of-the-art performance on our cross-scale reasoning benchmark and generalizes to state-of-the-art performance on established single-scale benchmarks. The findings suggest that even limited cross-scale supervision can significantly improve pathological understanding. The code and demos will be open-sourced.

2606.19174 2026-06-18 cs.HC cs.AI New submission 80%

A Clinician-Centered Pipeline for Annotation and Evaluation in Ultrasound AI Studies

Fangyijie Wang, Jianjun Yu, Wentao Shi, Haixia Huang, Ran Shi, Guénolé Silvestre, Kathleen M. Curran

Affiliations: Research Ireland Centre for Research Training in Machine Learning; School of Medicine, University College Dublin, Dublin, Ireland; The Third People's Hospital of Zhenjiang City, Zhenjiang, China; Zhenjiang Maternal and Child Health Hospital, Zhenjiang, China; The Fifth People's Hospital of Zhenjiang City, Zhenjiang, China; School of Computer Science, University College Dublin, Dublin, Ireland

Topic match: Medical imaging (annotation and evaluation pipeline for ultrasound AI)

AI summary: Proposes a clinician-centered pipeline built on a central server and lightweight browser interfaces that supports remote annotation, blinded evaluation, and multi-rater participation, and validates its reproducibility and statistical agreement in a fetal ultrasound segmentation study.

Comments: Accepted to MIUA 2026

Abstract

Clinician-centered evaluation is critical for validating medical AI systems, especially in ultrasound imaging where quantitative metrics do not always capture clinical usability. Existing medical image platforms primarily focus on dataset labeling. They lack integrated support for blinded model comparison and reproducible evaluation workflows. We present a clinician-centered pipeline for remote annotation and evaluation in ultrasound AI studies. The proposed pipeline uses a centralized server and lightweight browser interfaces to enable clinicians to perform annotation, blinded ranking, and review without local dataset downloads. The pipeline also supports multi-rater participation, centralized result aggregation, and automated statistical analysis. We validate the pipeline in a fetal ultrasound segmentation study with six raters spanning expert, generalist, and non-expert experience levels. The system automatically generated Spearman correlation, Kendall's τ, and top-1 selection statistics. Results indicated moderate to strong agreement across experts and other groups. The blinded evaluation results showed a tendency for later active learning models to be preferred. These outcomes suggest that the pipeline can support clinician-centered annotation and reproducible human-AI evaluation studies in ultrasound imaging. The proposed pipeline is available on GitHub: https://github.com/13204942/SonoRate
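
As a rough illustration of the agreement statistics named in this abstract, the sketch below computes Kendall's τ between two raters and counts top-1 selections. It is not the SonoRate implementation; the rater names and rankings are hypothetical, and the τ variant shown (tau-a) assumes rankings without ties.

```python
from collections import Counter
from itertools import combinations

def kendall_tau(a, b):
    """Kendall's tau-a for two rankings of the same items (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical blinded rankings of four models (1 = best); not study data.
# Position k in each list is the rank a rater gave to model k.
rankings = {
    "expert": [4, 3, 2, 1],
    "generalist": [4, 2, 3, 1],
    "non_expert": [3, 4, 2, 1],
}
tau_expert_generalist = kendall_tau(rankings["expert"], rankings["generalist"])

# Top-1 selection: how often each model index was ranked first.
top1_counts = Counter(r.index(1) for r in rankings.values())
```

Here five of the six model pairs are ordered the same way by the expert and the generalist, and all three raters pick the last model as best.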

2606.18287 2026-06-18 cs.LG New submission 80%

Artemis: Anatomy-Resolved inTervention for Eliminating Multimodal NeuroImage confounderS

Siyuan Dai, Yang Du, Kun Zhao, Zhusuyi Chen, Heng Huang, Paul Thompson, Chao Shi, Haoteng Tang, Liang Zhan

Affiliations: University of Pittsburgh; University of Maryland; University of Southern California; Binghamton University; University of Texas Rio Grande Valley

Topic match: Medical imaging (proposes the Artemis framework to remove neuroimaging confounders and improve diagnostic performance)

AI summary: Proposes Artemis, which learns region-specific confounder representations through region-level causal intervention to remove the effect of demographic confounders on GNNs in multimodal fMRI and DTI neuroimaging, improving performance on three benchmarks.

Comments: 11 pages, 8 figures

Abstract

Multimodal neuroimaging, integrating functional connectivity from fMRI and structural connectivity from DTI, enables non-invasive analysis of brain networks using graph neural networks. However, demographic factors such as age and sex systematically confound the relationship between brain connectivity and clinical outcomes, causing GNNs to exploit spurious shortcuts rather than learning causally invariant representations. While recent causal GNN methods introduce causality at the graph-modeling level, their causal mechanisms remain domain-agnostic without accounting for the real-world confounders inherent in clinical neuroimaging data. Moreover, brain networks are constructed from atlas-based parcellations where each region exhibits distinct sensitivity to demographic factors, necessitating region-aware adjustment. We propose Artemis, a region-level causal framework that bridges this gap with causal intervention at each brain region independently by learning region-specific confounder representations with lightweight parameters. Our adjustment comprehensively utilizes the multimodal functional and structural features for graph reasoning as a plug-in module compatible with arbitrary GNN backbones. Experiments on three benchmarks, ADNI for disease diagnosis, OASIS for dementia staging, and HCP for sex classification, demonstrate consistent improvements over representative GNN-based baselines. Multiple supporting experiments further demonstrate statistical significance and neuroscientific interpretability.

2606.19270 2026-06-18 eess.IV cs.LG physics.med-ph New submission 80%

Beyond Algorithms: Conceptual Innovation in Medical Imaging AI

Mark A. Anastasio

Affiliations: Mallinckrodt Institute of Radiology and Department of Electrical & Systems Engineering, Washington University in St. Louis

Topic match: Medical imaging (discussion of conceptual innovation in medical imaging AI)

AI summary: Distinguishes algorithmic innovation from conceptual innovation, argues that current incentive structures over-reward algorithmic novelty while neglecting conceptual contributions, uses medical imaging AI examples to show how weak conceptual grounding leads to misaligned objectives and limited clinical impact, and offers recommendations for fostering conceptual innovation.

Abstract

Artificial intelligence has driven rapid progress in medical imaging research, producing increasingly sophisticated algorithms and steady improvements on benchmark tasks. However, this algorithm-centric trajectory has also revealed a growing imbalance: while computational methods advance rapidly, the conceptual foundations that define imaging tasks, evaluation metrics, and clinical meaning sometimes remain underexamined. In this Perspective, we distinguish algorithmic innovation, which focuses on improving computational implementations and performance within a fixed problem definition, from conceptual innovation, which reframes what problems are posed, how success is measured, and why an approach is clinically relevant. We argue that prevailing incentive structures, training pathways, and publication norms disproportionately reward algorithmic novelty, particularly for early-career researchers, while at times undervaluing conceptual contributions that are essential for scientific maturation and clinical translation. Through representative examples from medical imaging AI, we show how insufficient conceptual grounding can lead to misaligned objectives, fragile generalization, and limited real-world impact. We conclude with actionable recommendations for researchers, mentors, reviewers, and journals to better recognize, support, and integrate conceptual innovation alongside algorithmic advances.

2606.18887 2026-06-18 eess.IV physics.med-ph New submission 80%

Efficient Image Registration for Ultrasound Localization Microscopy by Obtaining Gradients via Integration Across Iterations

Jipeng Yan, Chang Liu, Hengchang Liu, Biao Huang, Meng-Xing Tang, Yingxiang Liu, Ying Tan

Topic match: Medical imaging (image registration for ultrasound localization microscopy)

AI summary: Proposes extremum seeking control (ESC) as a replacement for explicit gradient computation in ultrasound localization microscopy (ULM) image registration, cutting per-iteration computational cost by about 3.5-fold and reaching 219 µm resolution in ULM imaging of an ex vivo porcine heart.

Abstract

Tissue motion correction through image registration is essential for ultrasound localization microscopy (ULM). Parametric image registration is commonly formulated as an optimization problem where motion parameters are iteratively updated to maximize image similarity, and the optimization algorithms used typically rely on gradient information, the explicit evaluation of which can become computationally demanding. This work investigates Extremum Seeking Control (ESC) as an alternative to explicit derivative evaluation in image registration. By obtaining descent information via integrating the perturbed and demodulated image similarity metric across iterations, ESC avoids differentiation of the image similarity metric with respect to motion parameters in each iteration. The classical ESC, whose optimization behavior approximates that of classical gradient descent (GD), is first compared with GD for affine image registration using simulated ground-truth motions derived from a beating ex vivo porcine heart dataset. The results show that ESC achieves registration accuracy and convergence behavior comparable to GD while reducing per-iteration computational cost by approximately 3.5-fold. ESC is subsequently employed in a two-stage motion correction pipeline, where affine registration compensates for global tissue motion and B-spline registration corrects residual local deformation. The proposed method is applied to ULM imaging of a beating ex vivo porcine heart and achieves a spatial resolution of 219 µm, substantially below the half-wavelength diffraction limit of 321 µm associated with 2.4 MHz diverging-wave imaging. These results demonstrate that ESC provides an effective alternative to explicit derivative evaluation in ULM image registration, enabling accurate motion correction and high-quality super-resolution imaging.
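
The perturb, demodulate, and integrate loop described in this abstract can be sketched on a toy problem. This is a minimal classical extremum-seeking loop with one parameter, not the authors' registration code. The quadratic `similarity` function stands in for an image similarity metric, and every name and tuning value (`amplitude`, `omega`, `gain`, `washout`) is an assumption chosen for illustration.

```python
import math

def similarity(theta):
    """Hypothetical scalar similarity metric with its maximum at theta = 2.0."""
    return -(theta - 2.0) ** 2

def extremum_seeking(metric, theta0, steps=1500, amplitude=0.1,
                     omega=2 * math.pi / 10, gain=0.1, washout=0.2):
    theta_hat = theta0
    baseline = metric(theta0)  # slow running estimate of the metric level
    for k in range(steps):
        dither = math.sin(omega * k)
        # One metric evaluation per iteration, at the perturbed parameter.
        value = metric(theta_hat + amplitude * dither)
        # High-pass step: remove the slowly varying level of the metric.
        fluctuation = value - baseline
        baseline += washout * fluctuation
        # Demodulate with the dither and integrate across iterations.
        theta_hat += gain * dither * fluctuation
    return theta_hat

theta_final = extremum_seeking(similarity, theta0=0.0)
```

No derivative of `similarity` is ever computed. Averaged over a dither period, the product `dither * fluctuation` is proportional to the local slope of the metric, so the integrated estimate drifts toward the maximizer, which is the sense in which classical ESC approximates gradient ascent.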

2606.19169 2026-06-18 cs.GR cs.SY eess.SY New submission 70%

RespGeomLib: A Reproducible Parametric Engine for Generating Analysis-Ready Human Airway Lumen Geometry

Nichula Wasalathilaka, Parakrama Ekanayake, Roshan Godaliyadda

Topic match: Medical imaging (airway lumen geometry generation for medical simulation)

AI summary: Proposes RespGeomLib, a reproducible parametric engine driven by YAML specifications that generates seamless airway lumen surfaces through port-based assembly and implicit smooth blending while avoiding global voxelization; quantitatively it yields cleaner bifurcations at lower cost, and it supports morphometry-guided generation and CFD simulation.

Comments: Accepted for publication at 2026 IEEE Mercon

Abstract

CT-derived airway models support pulmonary morphometry and airflow simulation, but are often limited by distal scan resolution and the need for substantial cleanup near bifurcations. Procedural alternatives are reproducible, yet many rely on stitched tubular primitives that introduce non-smooth junctions and poorly defined open boundaries. We present RespGeomLib, a reproducible parametric engine for generating analysis-ready human airway lumen surfaces from compact YAML specifications. The framework combines port-based assembly with implicit smooth-min junction blending to produce seamless junctions, while avoiding full-tree voxelization through analytic segments and local implicit extraction around bifurcations. Quantitatively, RespGeomLib yields cleaner junctions than a Boolean/stitch baseline and is substantially faster and more memory-efficient than whole-tree global implicit extraction. We further demonstrate morphometry-guided tree generation, controlled synthetic airway variants, and CFD-ready export with stable airflow simulation. RespGeomLib targets biomedical workflows requiring reproducible morphometry, controlled synthetic variants, and simulation-ready lumen geometry. The code is publicly available at https://nichula01.github.io/Respgeomlib/
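
To make the smooth-min junction blending concrete, the sketch below blends two signed distance functions with a widely used polynomial smooth minimum. This is an illustration under assumptions, not the RespGeomLib API: the library's actual blend formula is not given in the abstract, spheres stand in for airway tube segments, and all names and dimensions are hypothetical.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(point, center) - radius

def smooth_min(a, b, k):
    """Polynomial smooth minimum; k > 0 sets the width of the blend region."""
    h = max(k - abs(a - b), 0.0) / k
    return min(a, b) - h * h * k * 0.25

def blended_lumen(point, k=0.5):
    # Two overlapping primitives standing in for a parent and a child branch.
    parent = sphere_sdf(point, (0.0, 0.0, 0.0), 1.0)
    child = sphere_sdf(point, (1.5, 0.0, 0.0), 0.8)
    return smooth_min(parent, child, k)

junction_value = blended_lumen((0.85, 0.0, 0.0))
```

Far from the junction the two distances differ by more than `k`, so the result equals the ordinary minimum. Near the junction the blend subtracts a smooth correction, which rounds the crease that a hard Boolean union would leave.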

3. Health monitoring (5 papers)

2606.18640 2026-06-18 cs.LG q-bio.QM New submission 85%

MetaboNet-Bench: A Multi-modal Benchmark for Glucose Forecasting in Type 1 Diabetes

Nathaniel Jeffries, Miriam Wolff, Sam Royston, Elizabeth Healey, Caleb Mayer, David Klonoff, Michael Snyder, Tao Wang

Affiliations: Department of Genetics, Stanford University School of Medicine; Replica Health; Boston Children’s Hospital, Harvard Medical School; Diabetes Research Institute, Mills-Peninsula Medical Center

Topic match: Health monitoring (multimodal benchmark for glucose forecasting in type 1 diabetes)

AI summary: Addressing the lack of a standardized evaluation benchmark for glucose forecasting algorithms in type 1 diabetes, proposes MetaboNet-Bench, a multimodal benchmark integrating glucose, insulin, and carbohydrate data, and compares several models to assess how multimodal data affects model performance.

Comments: main content in 10 pages with 5 figures; supplementary section with 11 more pages and 5 more figures

Abstract

Glucose forecasting algorithms are an important aspect of glycemic control management in type 1 diabetes. So far, the research community has developed numerous algorithms and models for forecasting. However, it is well recognized that the lack of standardized model performance evaluation benchmarks makes fair comparison difficult and hinders further innovation, and thus benchmark standardization is urgently needed. Furthermore, many published glucose forecasting algorithms are limited to CGM data alone, ignoring other multimodal signals such as insulin dosing and carbohydrate intake. Here, we introduce MetaboNet-Bench, a benchmark for multimodal glucose forecasting for patients with type 1 diabetes that provides an extensible open-source evaluation framework for comparison of glucose forecasting algorithms that leverage glucose, insulin, and carbohydrate data. We then demonstrate its utility by benchmarking several recently published glucose forecasting models and a custom multimodal time-series model, representing different model architectures. The results show that the benefit of adding data modalities is conditioned on the complexity of the model and that incorporating more clinical metrics helps identify meaningful gaps to fill for future research.

2511.06140 2026-06-18 q-bio.QM 80%

Non-invasive load measurement in the human tibia via spectral analysis of flexural waves

Ali Yawar, Daniel H. Aslan, Daniel E. Lieberman

Topic match: Health monitoring (non-invasive tibial load measurement for sports medicine)

AI summary: Proposes a non-invasive method for measuring tibial compressive force by analyzing the spectrum of flexural waves propagating in the tibia, exploiting the linear relationship between spectral peak location and compressive force, and demonstrates its potential for human locomotion and sports medicine.

Comments: 23 pages, 23 figures, 1 table. Manuscript revised for clarity and consistency

Journal ref: J. R. Soc. Interface (2026) 23 (239): 20251206

Abstract

Forces transmitted by bones are routinely studied in human biomechanics, but it is challenging to measure them non-invasively, especially outside of laboratory settings. We introduce a technique for non-invasive, in vivo measurement of tibial compressive force using flexural waves propagating in the tibia. Modelling the tibia as an axially compressed Euler-Bernoulli beam, we show that tibial flexural waves have load-dependent frequency spectra. Specifically, under physiological conditions, peak locations in the wave acceleration spectra vary linearly with the compressive force on the tibia and may be used as proxies for the compressive force. We test the validity of this technique using a proof-of-concept wearable system that generates flexural waves via a skin-mounted mechanical transducer and measures the spectra of these waves using a skin-mounted accelerometer. In agreement with beam theory, data from 9 participants demonstrate linear relationships between tibial compressive force and spectral peak location, with Pearson correlation coefficients r = 0.82-0.99 (mean r = 0.93) for medial-lateral swaying and r = 0.81-0.98 (mean r = 0.93) for walking trials. This flexural wave-based technique could give rise to a new class of wearable sensors for non-invasive physiological bone load monitoring and measurement, impacting research in human locomotion and sports medicine.
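
Why an axially compressed Euler-Bernoulli beam has load-dependent spectral peaks can be seen from a short textbook derivation. It is a sketch under an assumed boundary condition (a uniform beam pinned at both ends), which may differ from the model used in the paper. The symbols are introduced here: $w(x,t)$ is the transverse deflection, $EI$ the bending stiffness, $\rho A$ the mass per unit length, $L$ the length, $P$ the compressive force, and $\omega_{n,0}$ the $n$-th natural frequency at zero load.

```latex
\begin{aligned}
EI\,\frac{\partial^4 w}{\partial x^4} + P\,\frac{\partial^2 w}{\partial x^2} + \rho A\,\frac{\partial^2 w}{\partial t^2} &= 0, \\
w(x,t) = \sin\!\left(\frac{n\pi x}{L}\right) e^{i\omega_n t}
\;\Rightarrow\;
\omega_n^2 &= \frac{EI}{\rho A}\left(\frac{n\pi}{L}\right)^4 \left(1 - \frac{P}{n^2 P_{\mathrm{cr}}}\right),
\qquad P_{\mathrm{cr}} = \frac{\pi^2 EI}{L^2}, \\
\omega_n &\approx \omega_{n,0}\left(1 - \frac{P}{2 n^2 P_{\mathrm{cr}}}\right)
\quad \text{for } P \ll n^2 P_{\mathrm{cr}} .
\end{aligned}
```

In this idealized case the squared frequency is exactly linear in $P$, and the frequency itself is approximately linear when the load is small compared with the buckling load, consistent with the linear trend the abstract reports under physiological conditions.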

2412.01836 2026-06-18 q-bio.NC 80%

Eye dominance and testing order effects in the circularly-oriented macular pigment optical density measurements that rely on the perception of structured light-based stimuli

Mukhit Kulmaganbetov, Taranjit Singh, Dmitry Pushin, Pinki Chahal, David Cory, Davis Garrad, Connor Kapahi, Melanie Mungalsingh, Iman Salehi, Andrew Silva, Ben Thompson, Zhangting Wang, Dusan Sarenac

Topic match: Health monitoring (factors affecting macular pigment optical density measurement)

AI summary: Examines how eye dominance and testing order affect perception in macular pigment optical density measurements based on structured-light stimuli, finds no significant association between either factor and the measurements, and lays groundwork for future clinical applications.

Abstract

Psychophysical discrimination of structured light (SL) stimuli may be useful in screening for various macular disorders, including degenerative macular diseases. The circularly-oriented macular pigment optical density (coMPOD), calculated from the discrimination performance of SL-induced entoptic phenomena, may reveal a novel functional biomarker of macular health. In this study, we investigated the potential influence of eye dominance and testing order effects on SL-based stimulus perception, factors that potentially influence the sensitivity of screening tests based on SL technology. A total of 28 participants (aged 18-38 years) were selected for the study after undergoing a comprehensive eye examination. A psychophysical task was performed where various SL-based entoptic images with multiple azimuthal fringes rotating with a specific temporal frequency were projected onto the participants' retinas. By occluding the central areas of entoptic images, we measured the retinal eccentricity (R) of the perceivable area of the stimuli. The slope of the coMPOD profile (a-value) was calculated for each participant using a spatiotemporal sensitivity model that takes into account the perceptual threshold measurements of structured light stimuli with varying spatial densities and temporal frequencies. The Pearson correlation coefficient between eye dominance and testing order effects was r = 0.8 (p < 0.01). The Bland-Altman plots for both factors indicated zero bias. The results indicate repeatable measurements for both eyes, implying minimal impact from eye dominance and testing order on SL-based stimulus perception. The results provide a foundation for future studies exploring the clinical utility of SL tools in eye health.

2606.19102 2026-06-18 eess.SP New submission 55%

Decentralized Power Control for Over-the-Air Computation with Phase Noise

含相位噪声的空中计算去中心化功率控制

Martin Dahl, Erik G. Larsson

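Topic match Health monitoring: studies power control for over-the-air computation, potentially applicable to the medical Internet of Things

AI summary To address the problem that channel estimates in over-the-air computation are available only locally at the devices, the paper proposes a decentralized power control scheme based on truncated channel inversion. It gives an approximate closed-form solution and an exact numerical solver, proves that the mean-square error is independent of the number of receive antennas, and relates the mean-square error to the aggregate phase error.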

Comments SPAWC 2026

Details
AI Chinese abstract

相干空中计算(OAC)需要上行信道估计。当使用校准互易性进行信道估计时,估计值仅对设备本地可用。这对预编码和解码构成了挑战,因为无法集中协调。为此,我们使用截断信道反转(TCI),并提出了一个近似闭式解和一个精确数值求解器来优化TCI参数。重要的是,我们证明了所提出的TCI方案在均方误差(MSE)方面与接收天线数量无关。此外,我们的分析揭示了MSE与设备间预期聚合相位误差之间的明确联系,这有助于理解OAC的可扩展性。最后,与先前工作中使用全局可用无误差信道估计的参考方法进行的仿真比较表明,所提出的方法在某些条件下甚至优于这些参考方法的MSE。

English abstract

Estimation of uplink channels is required for coherent over-the-air computation (OAC). When channel estimation is done using calibrated reciprocity, the estimates are only available locally to the devices. This poses a challenge for precoding and decoding, which cannot be coordinated centrally. To this end, we use truncated channel inversion (TCI) and propose an approximate closed-form solution and an exact numerical solver to optimize the TCI parameters. Importantly, we prove that the mean-square error (MSE) of the proposed TCI scheme is independent of the number of receiver antennas. Furthermore, our analysis reveals a clear connection between the MSE and the expected aggregate phase error across devices, which gives insight into the scalability of OAC. Finally, simulations comparing against reference methods from prior work that assume globally available error-free channel estimates show that the proposed scheme comes close to these references, and even outperforms them in MSE under some conditions.
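The idea of truncated channel inversion with only local channel knowledge can be illustrated with a small simulation. This is a generic sketch, not the paper's scheme. It assumes a single receive antenna, Rayleigh fading, a Gaussian phase error on each device's local estimate, and a target defined as the sum over non-truncated devices. The names `tci_precoder`, `threshold` and `rho` are hypothetical.

```python
import cmath
import math
import random


def tci_precoder(h_est, threshold, rho):
    """Transmit coefficient of one device under truncated channel inversion."""
    if abs(h_est) < threshold:
        return 0j  # channel too weak: stay silent instead of spending excessive power
    return math.sqrt(rho) / h_est  # invert the locally estimated channel


def simulate_round(num_devices, threshold, rho, phase_std, noise_std, rng):
    """One over-the-air aggregation round; returns (estimate, target sum)."""
    received = 0j
    target = 0.0
    for _ in range(num_devices):
        s = rng.gauss(0.0, 1.0)  # local value to be summed over the air
        h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
        # Assumption: the local estimate differs from the true channel by a phase error only.
        h_est = h * cmath.exp(1j * rng.gauss(0.0, phase_std))
        b = tci_precoder(h_est, threshold, rho)
        received += h * b * s  # signals superimpose in the channel
        if b != 0:
            target += s
    received += complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
    estimate = (received / math.sqrt(rho)).real  # receiver rescales, keeps the real part
    return estimate, target


def empirical_mse(phase_std, trials=2000, seed=0):
    """Average squared error over many rounds for a given phase-error level."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        estimate, target = simulate_round(20, 0.3, 1.0, phase_std, 0.1, rng)
        total += (estimate - target) ** 2
    return total / trials


print("MSE without phase error:", empirical_mse(0.0))
print("MSE with phase error:   ", empirical_mse(0.5))
```

In this toy model each device's contribution is scaled by the cosine of its own phase error, so the squared error grows with the spread of the phase errors across devices. That is the kind of dependence on aggregate phase error the abstract describes, under the simplifying assumptions stated above.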

2606.18564 2026-06-18 cs.SD eess.SP New submission 55%

Reference-Based Recursive Least-Squares Mitigation of Real Interference in Stereo Audio Recordings

基于参考的递归最小二乘法在立体声音频录音中抑制真实干扰

Necati Kagan Erkek, Y. Ugur Ozcan

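Affiliations * Telecommunications Engineering, Department of Electronics, Information

Topic match Health monitoring: adaptive interference cancellation for audio recordings, with possible medical applications

AI summary For stereo audio contaminated by real train noise and environmental background, a multi-reference recursive least-squares (RLS) estimator performs adaptive interference cancellation: the interference component is estimated from the reference signals and subtracted, followed by a low-pass postfilter. The reference correlation is reduced by 30.6 to 34.1 dB.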

Comments 7 pages

Details
AI Chinese abstract

评估了基于参考的自适应干扰消除方法,用于受真实火车噪声和环境背景污染的立体声音频录音。观测信号被建模为干净的立体声节目受到由外部声源通过未知传播路径产生的加性干扰污染。第二个立体声录音,代表同一物理噪声源的另一个滤波观测,被用作多参考递归最小二乘(RLS)估计器的参考输入。估计的火车干扰分量从含噪音频中减去,随后经过有限冲激响应低通后置滤波器。在相同算法参数下处理了三个74.01秒、采样率为11.025 kHz的真实音频序列。由于没有干净的参考真值,性能通过无参考指标评估:波形行为、Welch谱估计、RMS变化以及与参考的残差归一化相关性。每个参考通道使用30个抽头、15个反因果抽头和遗忘因子0.999,最大参考相关性从处理前的0.386--0.832降低到处理后的0.011--0.016。相应的相关性比降低约30.6--34.1 dB,而输出RMS根据片段和立体声通道减少1.8--4.8 dB。结果表明,当存在相关参考录音时,真实火车干扰(包括环境声学效应)可以被显著衰减。

English abstract

Reference-based adaptive interference cancellation is evaluated for stereo audio recordings corrupted by real train noise and environmental background. The observed signal is modeled as a clean stereo program contaminated by an additive disturbance generated by an external acoustic source through unknown propagation paths. A second stereo recording, representing another filtered observation of the same physical noise source, is used as the reference input of a multi-reference recursive least-squares (RLS) estimator. The estimated train-interference component is subtracted from the noisy audio and followed by a finite-impulse-response low-pass postfilter. Three 74.01 s real audio sequences sampled at 11.025 kHz are processed under identical algorithmic parameters. Since clean ground truth is not available, performance is assessed with no-reference indicators: waveform behavior, Welch spectral estimates, RMS change, and residual normalized correlation with the reference. With 30 taps per reference channel, 15 anti-causal taps, and forgetting factor 0.999, the maximum reference correlation is reduced from 0.386--0.832 before processing to 0.011--0.016 after processing. The corresponding correlation-ratio reduction is approximately 30.6--34.1 dB, while the output RMS decreases by 1.8--4.8 dB depending on section and stereo channel. The results demonstrate that real train interference, including environmental acoustic effects, can be substantially attenuated when a correlated reference recording is available.
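The core of reference-based cancellation is an exponentially weighted RLS filter that predicts the interference from the reference and subtracts it. The sketch below is a simplified illustration: it uses one reference channel, causal taps only, and no postfilter, whereas the abstract describes a multi-reference estimator with anti-causal taps and a low-pass postfilter. The signals are synthetic, and the names `RLSCanceller` and `path` are hypothetical.

```python
import math
import random


class RLSCanceller:
    """Exponentially weighted recursive least-squares interference canceller."""

    def __init__(self, num_taps, forgetting=0.999, delta=100.0):
        self.w = [0.0] * num_taps  # filter weights
        self.lam = forgetting      # forgetting factor
        # Inverse correlation matrix estimate, initialised to delta * I.
        self.P = [[delta if i == j else 0.0 for j in range(num_taps)]
                  for i in range(num_taps)]

    def step(self, u, d):
        """u: reference regressor, d: noisy sample. Returns the cleaned sample."""
        n = len(u)
        Pu = [sum(self.P[i][j] * u[j] for j in range(n)) for i in range(n)]
        denom = self.lam + sum(u[i] * Pu[i] for i in range(n))
        gain = [v / denom for v in Pu]
        # A priori error: observation minus the predicted interference.
        err = d - sum(wi * ui for wi, ui in zip(self.w, u))
        self.w = [wi + gi * err for wi, gi in zip(self.w, gain)]
        # P <- (P - gain * u^T P) / lam, using the symmetry of P (u^T P = Pu^T).
        self.P = [[(self.P[i][j] - gain[i] * Pu[j]) / self.lam for j in range(n)]
                  for i in range(n)]
        return err


def normalized_corr(a, b):
    """Absolute zero-lag normalized correlation between two sequences."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return abs(num) / den


# Synthetic test signals (assumption): white-noise reference, short unknown path.
rng = random.Random(0)
num_samples, num_taps = 3000, 4
path = [0.6, -0.3, 0.1]
reference = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
program = [0.5 * math.sin(2 * math.pi * 0.01 * n) for n in range(num_samples)]
observed = []
for n in range(num_samples):
    interference = sum(path[k] * reference[n - k] for k in range(len(path)) if n - k >= 0)
    observed.append(program[n] + interference)

canceller = RLSCanceller(num_taps)
cleaned = []
for n in range(num_samples):
    u = [reference[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
    cleaned.append(canceller.step(u, observed[n]))

print("correlation with reference before:", normalized_corr(observed, reference))
print("correlation with reference after: ", normalized_corr(cleaned, reference))
```

The residual correlation with the reference serves here as a no-reference quality indicator in the same spirit as the abstract's evaluation. The sketch measures it at zero lag only, while the abstract reports the maximum reference correlation.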

4. Clinical Large Models 1 paper

2606.18518 2026-06-18 cs.LG cs.AI New submission 80%

PSyGenTAB: A Privacy-Preserving Framework for Synthetic Clinical Tabular Data Generation via Constrained Optimization

PSyGenTAB:通过约束优化生成合成临床表格数据的隐私保护框架

Arshia Ilaty, Hossein Shirazi, Manasi Chitale, Kedar Hegde, Dhanalakshmi Ramesh, Rashmi S. Manjunath, Amir Rahmani, Hajar Homayouni

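Affiliations * San Diego State University; University of California, Irvine

Topic match Clinical large models: generates synthetic clinical tabular data

AI summary Proposes the PSyGenTAB framework, which formulates synthetic healthcare data generation as a constrained optimization problem and embeds configurable privacy constraints through the Augmented Lagrangian Method, maximizing clinical data utility while enforcing privacy thresholds. Experiments show that models trained on the synthetic data perform comparably to models trained on real data.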

Comments 20 pages

Details
AI Chinese abstract

由于机构壁垒和严格的隐私法规(如HIPAA和GDPR),医疗AI的发展受到高质量临床数据获取限制。合成数据生成提供了一种潜在解决方案,但现有方法缺乏明确管理隐私-效用权衡的原则性机制,常常退化临床有意义的模式或面临患者重识别风险。我们提出PSyGenTAB,一个隐私保护生成框架,将合成医疗数据生成建模为使用增强拉格朗日方法求解的约束优化问题。通过将可配置的隐私约束直接嵌入模型训练,PSyGenTAB在最大化临床数据效用的同时强制执行最低隐私阈值。在多个临床驱动的基准测试中,PSyGenTAB保留了可靠健康AI所需的特征间临床关系和少数类诊断模式。使用“合成训练、真实测试”和“真实训练、合成测试”协议的下游评估表明,在合成数据上训练的模型达到了与真实患者记录训练模型相当的性能。隐私审计进一步证明了精确记录复制的减少和对成员推理攻击的强大抵抗力。这些结果确立了PSyGenTAB作为平衡合成医疗数据中隐私保护和临床效用的原则性框架,支持安全的跨机构AI开发。

English abstract

The development of medical AI is constrained by limited access to high-quality clinical data due to institutional silos and strict privacy regulations such as HIPAA and GDPR. Synthetic data generation offers a potential solution, but existing methods lack principled mechanisms to explicitly manage the privacy-utility trade-off, often degrading clinically meaningful patterns or risking patient re-identification. We present PSyGenTAB, a privacy-preserving generative framework that formulates synthetic healthcare data generation as a constrained optimization problem solved using the Augmented Lagrangian Method. By embedding configurable privacy constraints directly into model training, PSyGenTAB enforces minimum privacy thresholds while maximizing clinical data utility. Across multiple clinically motivated benchmarks, PSyGenTAB preserves inter-feature clinical relationships and minority-class diagnostic patterns essential for reliable health AI. Downstream evaluation using Train-on-Synthetic, Test-on-Real and Train-on-Real, Test-on-Synthetic protocols shows that models trained on synthetic data achieve performance comparable to those trained on real patient records. Privacy auditing further demonstrates reduced exact record reproduction and strong resilience to membership inference attacks. These results establish PSyGenTAB as a principled framework for balancing privacy protection and clinical utility in synthetic healthcare data, supporting secure cross-institutional AI development.
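The Augmented Lagrangian Method mentioned in the abstract alternates between minimizing a penalized objective and updating a multiplier for the constraint. The sketch below applies it to a toy two-variable problem. It is not the paper's objective: the quadratic stands in for a utility loss and the linear inequality for a privacy constraint, and the function names are hypothetical.

```python
def objective_grad(x):
    """Gradient of the toy loss f(x) = (x0 - 2)^2 + (x1 - 1)^2."""
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)]


def constraint(x):
    """Toy inequality constraint g(x) = x0 + x1 - 2 <= 0."""
    return x[0] + x[1] - 2.0


CONSTRAINT_GRAD = [1.0, 1.0]  # gradient of g, constant because g is linear


def augmented_lagrangian(rho=10.0, outer_iters=20, inner_iters=200):
    """Minimize f subject to g <= 0; returns (x, multiplier)."""
    x = [0.0, 0.0]
    mu = 0.0
    # Step size 1/L, where L = 2 + 2*rho bounds the curvature of the penalized loss.
    step = 1.0 / (2.0 + 2.0 * rho)
    for _ in range(outer_iters):
        # Inner loop: gradient descent on
        # f(x) + (max(0, mu + rho * g(x))^2 - mu^2) / (2 * rho).
        for _ in range(inner_iters):
            active = max(0.0, mu + rho * constraint(x))
            grad = objective_grad(x)
            x = [xi - step * (gi + active * ci)
                 for xi, gi, ci in zip(x, grad, CONSTRAINT_GRAD)]
        # Outer loop: multiplier update, clipped at zero for an inequality constraint.
        mu = max(0.0, mu + rho * constraint(x))
    return x, mu


solution, multiplier = augmented_lagrangian()
print("solution:", solution, "multiplier:", multiplier)
```

For this toy problem the unconstrained minimizer (2, 1) violates the constraint, so the method settles on the boundary point (1.5, 0.5) with a multiplier of 1. In a generative-model setting the inner loop would be ordinary model training on the penalized loss, which is one plausible reading of "embedding configurable privacy constraints directly into model training".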