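arXivDaily: arXiv daily academic digest, updated Monday through Friday

Science and Medicine

Medical AI

Medical intelligence, clinical AI, medical imaging, pathology, diagnosis, and large models for healthcare.

Entries included for today/current date: 31. Sources: cs.CV, cs.LG, q-bio, eess.IV, eess.SP
2606.19300 2026-06-18 cs.CV cs.LG New submission 95%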

Confidence is Not Reliability: Rethinking MC Dropout in Brain Tumour Segmentation

Xin Ci Wong, Duygu Sarikaya, Kieran Zucker, Marc De Kamps, Nishant Ravikumar

Affiliations: Centre for Doctoral Training in AI for Medical Diagnosis and Care; School of Computing, University of Leeds; School of Computer Science, University of Leeds; Leeds Cancer Centre, St James’s University Hospital, Leeds, UK

Topic match: Medical imaging. MC Dropout uncertainty estimation in brain tumour segmentation, with a focus on clinical safety.

AI summary: Using MC Dropout uncertainty estimation, the study finds that global uncertainty-error alignment (AUROC ≈ 0.97) can mask severe miscalibration in critical sub-regions such as enhancing tumour (ECE = 0.915), which indicates that sub-region calibration assessment is essential for clinical safety.

Comments: Accepted for MIUA2016

Abstract

Glioma segmentation in multiparametric MRI is a critical component of treatment planning. A segmentation model that fails silently on treatment-critical sub-regions represents a patient safety risk that overlap-based metrics such as Dice scores cannot expose. We ask whether voxel-level uncertainty estimation via Monte Carlo (MC) Dropout can reliably identify segmentation errors in clinically critical sub-regions, and whether calibration failure modes are detectable from standard reporting metrics alone. In an empirical two-model case study on 126 BraTS21 patients, we evaluate a high-performance pretrained SegResNet and a locally trained UNet with residual units (UNet-Res). MC dropout preserved segmentation accuracy ($|Δ\text{Dice}|$ $<0.01$) while achieving strong uncertainty-error alignment (AUROC for entropy (H) $\approx$0.97), indicating uncertainty correctly ranks erroneous voxels above correct ones. Entropy-based patient stratification identified a high-uncertainty subgroup with substantially lower segmentation performance (median whole-tumour Dice $0.835$ vs. $0.925$), supporting uncertainty as a practical triage signal. However, global alignment can mask important region-specific differences. Despite similar AUROC, UNet-Res exhibited near-zero enhancing tumour entropy ($0.054$) and Expected Calibration Error (ECE) of $0.915$, with a Dice of only $0.714$, indicating severely miscalibrated confidence on the most clinically critical sub-region, a failure mode invisible to standard Dice and AUROC reporting. These findings demonstrate that strong uncertainty-error alignment is necessary but insufficient for clinical safety: sub-region-specific calibration assessment must accompany AUROC evaluation when selecting models for clinical deployment.
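A minimal Python sketch of the two quantities this abstract contrasts: predictive entropy computed from MC Dropout passes, and Expected Calibration Error (ECE). The function names, the 10-bin ECE, and the toy numbers are illustrative assumptions, not the authors' evaluation code, and the paper may compute sub-region ECE differently.

```python
import math


def predictive_entropy(mc_probs):
    """Entropy (in nats) of the mean class distribution over MC Dropout passes.

    mc_probs: list of T probability vectors for one voxel, one per stochastic pass.
    """
    n_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / len(mc_probs) for c in range(n_classes)]
    return -sum(p * math.log(p) for p in mean if p > 0.0)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


# Toy voxels (invented): the model is almost certain everywhere, yet right only once in ten.
confidences = [0.99] * 10
correct = [True] + [False] * 9
overconfident_ece = expected_calibration_error(confidences, correct)

# Near-identical passes give near-zero entropy; disagreeing passes give high entropy.
low_entropy = predictive_entropy([[0.99, 0.01], [0.99, 0.01], [0.98, 0.02]])
high_entropy = predictive_entropy([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])
```

In this toy case the ECE is about 0.89, because confidence is 0.99 while accuracy is 0.1. Entropy stays low whenever the passes agree, whether or not they are right, which is the low-entropy, high-ECE pattern the abstract reports for enhancing tumour.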

2606.18707 2026-06-18 cs.CV New submission 95%

PEFT-MedSAM: Efficient Fine-Tuning of Medical Foundation Models for Explainable Skin Lesion Segmentation

Asad Channa, Abdullah Khan, Asghar Ali Chandio, Aamir Akbar, Shahzad Memon, Aqib Hussain, Ameer Hamza

Affiliations: Department of Computer Science, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Artificial Intelligence, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Computer Science, Sindh Madressatul Islam University, City Campus, Karachi; Department of Computer Science and Digital Technologies, School of Architecture, Computing and Engineering, University of East London

Topic match: Medical imaging. Proposes a fine-tuning method for medical image segmentation, applied to skin lesion segmentation.

AI summary: Proposes PEFT-MedSAM, a parameter-efficient fine-tuning method that freezes the pretrained encoders and trains only a lightweight decoder. It reaches a Dice coefficient of 0.9411 on ISIC 2018 and uses Grad-CAM explainability to strengthen clinical trustworthiness.

Abstract

Automated segmentation of skin lesions in dermoscopic images using deep learning models can help detect melanomas earlier than they would normally be detected. However, most available deep learning methods do not perform well. This paper presents a parameter-efficient fine-tuning method called PEFT-MedSAM for adapting the Medical Segment Anything Model (MedSAM) to automatically segment dermoscopic skin lesions. PEFT-MedSAM trains only the lightweight mask decoder while keeping the pre-trained image encoder and prompt encoder frozen. Experiments on the ISIC 2018 benchmark dataset show that PEFT-MedSAM obtains a Dice coefficient of 0.9411 and an intersection over union of 0.8918, compared with a fully trained U-Net baseline (0.8715 Dice coefficient) and zero-shot MedSAM inference (0.8997 Dice coefficient). External validation on the PH2 dataset shows a Dice coefficient of 0.9467 with a standard deviation of ±0.0310. Supporting evidence for these claims includes a p-value less than 0.0001 for Wilcoxon signed-rank tests comparing the two datasets and a bootstrap-estimated 95% confidence interval of [0.9364, 0.9447], which represents the estimated range of the average Dice coefficient under repeated testing. To increase clinical trustworthiness, we used Grad-CAM explainability along with a pointing-game-based evaluation methodology to evaluate the CNN baseline model on the validation set. The results showed an accuracy of 98.27% on the validation set of 519 images and confirmed that the model classified regions containing skin lesions.
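As a rough illustration of the metrics quoted in this abstract, the sketch below computes Dice and IoU for one binary mask pair and a percentile bootstrap interval for a mean Dice. The masks, the per-image scores, and the function names are invented for the example, and the paper's exact bootstrap procedure is not stated in the abstract.

```python
import random


def dice(pred, target):
    """Dice coefficient of two binary masks given as flat lists of 0/1."""
    inter = sum(p & t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 if denom == 0 else 2.0 * inter / denom


def iou(pred, target):
    """Intersection over union of two binary masks given as flat lists of 0/1."""
    inter = sum(p & t for p, t in zip(pred, target))
    union = sum(p | t for p, t in zip(pred, target))
    return 1.0 if union == 0 else inter / union


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-image scores."""
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(
        sum(rng.choice(scores) for _ in range(n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Invented toy mask pair and per-image Dice scores.
pred = [1, 1, 1, 0]
target = [1, 1, 0, 0]
scores = [0.91, 0.95, 0.89, 0.97, 0.93, 0.96, 0.90, 0.94]
ci_low, ci_high = bootstrap_ci(scores)
```

For a single mask pair, IoU equals Dice / (2 - Dice). That identity does not carry over to dataset averages, so a mean Dice and a mean IoU reported together need not satisfy it exactly.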

2606.18682 2026-06-18 cs.CV New submission 95%

Multi-Class Brain Tumor Classification Using Advanced Deep Learning Models: A Comparative Study

Asad Channa, Asghar Ali Chandio, Akhtar Hussain Jalbani, Mehwish Leghari, Shahzad Memon

Affiliations: Department of Computer Science, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Artificial Intelligence, Quaid-e-Awam University of Engineering, Sciences & Technology; The Faculty of Artificial Intelligence and Cyber Security, Universiti Teknikal Malaysia Melaka; Department of Data Science, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Computer Science and Digital Technologies, School of Architecture, Computing and Engineering, University of East London

Topic match: Medical imaging. Brain tumor MRI classification comparing CNN architectures.

AI summary: This study compares five CNN architectures (a custom model and four pretrained models) for multi-class brain tumor classification on approximately 10,000 MRI images. EfficientNetB0 performs best at 95% accuracy and markedly improves meningioma recall (89%).

Abstract

Despite recent advances in deep learning, accurately classifying brain tumors from MRI images remains challenging. In this research, we present a comprehensive evaluation of five convolutional neural network (CNN) architectures, a customized baseline model and four pre-trained models, for multi-class brain tumor classification using a clinically sourced dataset of approximately 10,000 MRI images. The five architectures, the customized CNN, VGG16, VGG19, DenseNet121, and EfficientNetB0, were all trained and tested within an identical experimental framework. Performance was measured by both overall accuracy and tumor-wise recall to assess the clinically relevant performance of each architecture. EfficientNetB0 had the best overall classification accuracy at 95%, compared with VGG16 (94.37%), VGG19 (92.29%), DenseNet121 (90.91%), and the customized CNN (78.00%). An especially important finding was the considerable improvement in detecting meningiomas: while simple CNNs detected meningiomas with a recall of approximately 20%, EfficientNetB0 detected them with a recall of 89%. Meningiomas are often difficult to detect because they can appear very subtle on MRI images. Additionally, the deeper VGG19 performed worse than the shallower VGG16, which indicates that in many cases the architectural efficiency of a CNN model may matter more than its depth when working with medical images. Overall, EfficientNetB0 appears to provide the best trade-off between classification accuracy, number of model parameters, and clinically meaningful performance.
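The abstract reports both overall accuracy and tumor-wise recall. The short sketch below shows how the two are read off a confusion matrix and why a reasonable accuracy can coexist with a weak class. The class names other than meningioma and all of the counts are invented for illustration and are not the paper's data.

```python
def per_class_recall(confusion, labels):
    """Recall per class: diagonal count divided by the row total.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    recalls = {}
    for i, name in enumerate(labels):
        row_total = sum(confusion[i])
        recalls[name] = confusion[i][i] / row_total if row_total else 0.0
    return recalls


def overall_accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct_count = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct_count / total


# Hypothetical four-class problem with 100 samples per class (invented counts).
labels = ["glioma", "meningioma", "pituitary", "no_tumor"]
confusion = [
    [95, 3, 1, 1],
    [40, 20, 20, 20],
    [2, 1, 96, 1],
    [1, 1, 1, 97],
]
recalls = per_class_recall(confusion, labels)
accuracy = overall_accuracy(confusion)
```

With these invented counts the overall accuracy is 0.77 while meningioma recall is only 0.2, which is why per-class recall is reported alongside accuracy.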

2606.18675 2026-06-18 cs.CV New submission 95%

BrainFusionNet: a deep learning and XAI model to understand local, global, and sequential features of MRI images for improved brain tumour detection

Md Taimur Ahad, Bo Song, Yan Li

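Affiliations: School of Mathematics, Physics and Computing, University of Southern Queensland; School of Engineering, University of Southern Queensland

Topic match: Medical imaging. Hybrid model for brain tumour detection combining CNN, ViT, and GRU.

AI summary: Proposes BrainFusionNet, a hybrid model that combines CNN, ViT, and GRU to extract spatial, contextual, and sequential features from MRI, and integrates SHAP, LIME, and GradCAM for explainability analysis. It reaches 98% accuracy on public datasets, outperforming SOTA CNNs.

Journal ref: Brain Inf. 13, 21 (2026)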

Abstract

The noise of Magnetic Resonance Imaging (MRI) poses challenges for Deep Learning (DL) when tumor boundaries are obscured and tumor location and appearance are complex. Therefore, we develop BrainFusionNet, which combines Convolutional Neural Networks (CNNs), Vision Transformers (ViT), and Gated Recurrent Units (GRUs) to extract spatial, contextual, and sequential features from MRI images for improved brain tumor classification. Furthermore, explainable AI methods such as SHAP, LIME, and GradCAM are integrated to visualise and highlight image regions that contribute to BrainFusionNet's decision-making process. The proposed BrainFusionNet model is evaluated on two publicly available MRI datasets. K-fold validation suggests 98% accuracy on both datasets. The model was compared with six state-of-the-art (SOTA) CNNs and transfer learning. Among the SOTA CNNs, DenseNet121 and VGG16 achieved the highest accuracy of 96%. The novelty of BrainFusionNet is that the hybrid model effectively extracts local and global features from MRI images, even in small-scale tumor regions and small tumor sizes. The model has a balanced sequential CNN architecture to capture low-level and deeper-layer features, and a customized ViT that captures local features, stabilizes gradient flow, and reduces the risk of vanishing gradients during MRI image training. The CNN and ViT outputs are fed into a GRU for final classification. Furthermore, we analyze pixel intensities to determine whether MRI image quality affects image classification. Our findings are very novel in image interpretation, as we found that the distribution of pixel intensities in MRI images affects DL performance.

2606.18609 2026-06-18 cs.CV New submission 95%

Hallucination Detection and Correction in Medical VLMs via Counter-Evidence Verification

Nan Zhou, Ke Zou, Meng Liu, Linchao He, Jiaqi Zhu, Yi Zhang, Hu Chen, Huazhu Fu

Affiliations: College of Computer Science, Sichuan University; Yong Loo Lin School of Medicine, National University of Singapore; Key Laboratory of Data Protection and Intelligent Management, Ministry of Education, Sichuan University; National Key Laboratory of Autonomous Intelligent Unmanned Systems, Beijing Institute of Technology; Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR)

Topic match: Medical imaging. Proposes the CoEV framework to detect and correct hallucinations in medical VLMs, with a focus on medical diagnosis.

AI summary: Proposes CoEV, a framework that detects and corrects hallucinations in medical VLMs through bidirectional verification between textual and visual evidence, without retraining, and markedly improves detection and correction performance on four datasets.

Comments: MICCAI 2026 Accept. Submission Version

Abstract

The reliability of vision-language models (VLMs) in medical diagnosis is challenged by trust-undermining hallucinations. Existing hallucination detection approaches mainly focus on identifying factual inconsistencies between generated text and reference data. While some studies analyze where models attend in images, they seldom verify whether such attention truly reflects the visual evidence supporting the generated text. To address this gap, we propose Counter-Evidence Verification (CoEV), a training-free plug-and-play framework that detects and corrects hallucinations through evidence-based factual consistency verification. CoEV performs bidirectional verification between textual assertions and visual evidence, testing whether each statement is supported by its corresponding evidence region, and assigns each statement to a four-quadrant diagnostic map capturing combinations of text factuality and visual grounding. CoEV detects hallucinated content and serves as a post hoc refinement tool, correcting hallucinations without retraining. Extensive experiments on four medical datasets show that CoEV combats hallucinations in VLMs. For hallucination detection, CoEV consistently outperforms existing methods, improving average PR-AUC and ROC-AUC by 3.0 and 3.9 absolute percentage points respectively, with notable gains of up to 18.5% in specific VQA scenarios. For hallucination correction, it improves Micro-F1 by up to 12.5%, reduces hallucination rates by over 11.9% on medical report generation, and also boosts medical VQA accuracy. These results show that CoEV enables reliable detection and correction of hallucinations, providing clinicians with dependable, evidence-based cues for diagnosis. Code will be released upon acceptance.
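The four-quadrant diagnostic map can be pictured with a small sketch. The abstract says only that the map captures combinations of text factuality and visual grounding, so the quadrant names, the two scalar scores, and the 0.5 thresholds below are hypothetical and are not CoEV's actual interface.

```python
from enum import Enum


class Quadrant(Enum):
    # Hypothetical quadrant names; the abstract does not name them.
    SUPPORTED = "factual text, grounded in the image"
    UNGROUNDED = "factual text, not grounded in the image"
    MISREAD = "non-factual text, grounded in the image"
    HALLUCINATED = "non-factual text, not grounded in the image"


def assign_quadrant(text_score, visual_score, text_thr=0.5, visual_thr=0.5):
    """Place one statement on the map from two assumed scores in [0, 1]."""
    factual = text_score >= text_thr
    grounded = visual_score >= visual_thr
    if factual and grounded:
        return Quadrant.SUPPORTED
    if factual:
        return Quadrant.UNGROUNDED
    if grounded:
        return Quadrant.MISREAD
    return Quadrant.HALLUCINATED


# Invented statements with invented (text factuality, visual grounding) scores.
statements = {
    "statement A": (0.92, 0.88),
    "statement B": (0.15, 0.10),
}
placement = {s: assign_quadrant(*scores) for s, scores in statements.items()}
```

A correction step could then act only on statements that land outside the supported quadrant. How CoEV scores statements and performs correction is not described in the abstract.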

2604.14837 2026-06-18 cs.CV 95%

Improved Multiscale Structural Mapping with Supervertex Vision Transformer for the Detection of Alzheimer's Disease Neurodegeneration

Geonwoo Baek, David H. Salat, Ikbeom Jang

Affiliations: Department of Computer Science & Engineering, Hankuk University of Foreign Studies, Seoul, Republic of Korea; Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA; Department of Radiology, Harvard Medical School, Boston, MA, USA; Neuroimaging Research for Veterans (NeRVe) Center, VA Boston Healthcare System, Boston, MA, USA

Topic match: Medical imaging. Uses MRI to detect Alzheimer's disease.

AI summary: This paper proposes MSSM+ together with SSVM and SV-ViT, using multiscale structural mapping and supervertex mapping to improve the accuracy of early Alzheimer's disease detection, and reports more pronounced group differences and better classification performance.

Comments: Submitted to Human Brain Mapping

Journal ref: Human Brain Mapping 47(8), e70548 (2026)

Abstract

Alzheimer's disease (AD) confirmation often relies on positron emission tomography (PET) or cerebrospinal fluid (CSF) analysis, which are costly and invasive. Consequently, structural MRI biomarkers such as cortical thickness (CT) are widely used for non-invasive AD screening. Multiscale structural mapping (MSSM) was recently proposed to integrate gray-white matter contrasts (GWCs) with CT from a single T1-weighted MRI (T1w) scan. Building on this framework, we propose MSSM+, together with surface supervertex mapping (SSVM) and a Supervertex Vision Transformer (SV-ViT). 3D T1w images from individuals with AD and cognitively normal (CN) controls were analyzed. MSSM+ extends MSSM by incorporating sulcal depth and cortical curvature at the vertex level. SSVM partitions the cortical surface into supervertices (surface patches) that effectively represent inter- and intra-regional spatial relationships. SV-ViT is a Vision Transformer architecture operating on these supervertices, enabling anatomically informed learning from surface mesh representations. Compared with MSSM, MSSM+ identified more spatially extensive and statistically significant group differences between AD and CN. In AD vs. CN classification, MSSM+ achieved a 3%p higher area under the precision-recall curve than MSSM. Vendor-specific analyses further demonstrated reduced signal variability and consistently improved classification performance across MR manufacturers relative to CT, GWCs, and MSSM. These findings suggest that MSSM+ combined with SV-ViT is a promising MRI-based imaging marker for AD detection prior to CSF/PET confirmation.

2602.02370 2026-06-18 cs.CV 95%

Uncertainty-Aware Image Classification In Biomedical Imaging Using Spectral-normalized Neural Gaussian Processes

Uma Meleti, Jeffrey J. Nirschl

Affiliations: Department of Pathology; Lab Medicine, University of Wisconsin-Madison

Topic match: Medical imaging. Biomedical image classification.

AI summary: Implements SNGP, which uses spectral normalization and a Gaussian process layer to improve single-model uncertainty estimation and out-of-distribution detection, and performs well on three biomedical classification tasks.

Comments: Published at the IEEE International Symposium on Biomedical Imaging (ISBI) 2026

Journal ref: Proc. 2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI), London, United Kingdom, Apr. 8-11, 2026, pp. [1-4], 2026

Abstract

Accurate histopathologic interpretation is key for clinical decision-making; however, current deep learning models for digital pathology are often overconfident and poorly calibrated in out-of-distribution (OOD) settings, which limits trust and clinical adoption. Safety-critical medical imaging workflows benefit from intrinsic uncertainty-aware properties that can accurately reject OOD input. We implement the Spectral-normalized Neural Gaussian Process (SNGP), a set of lightweight modifications that apply spectral normalization and replace the final dense layer with a Gaussian process layer to improve single-model uncertainty estimation and OOD detection. We evaluate SNGP vs. deterministic and Monte Carlo dropout on six datasets across three biomedical classification tasks: white blood cells, amyloid plaques, and colorectal histopathology. SNGP has comparable in-distribution performance while significantly improving uncertainty estimation and OOD detection. Thus, SNGP or related models offer a useful framework for uncertainty-aware classification in digital pathology, supporting safe deployment and building trust with pathologists.
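One of the two modifications named in this abstract, spectral normalization, can be sketched in a few lines: estimate a weight matrix's largest singular value by power iteration and rescale the matrix when that value exceeds a bound. This is a plain-Python illustration with invented weights and an assumed bound of 1.0. It is not the authors' implementation, and the Gaussian process output layer is not shown.

```python
import math


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def spectral_norm(matrix, n_iter=50):
    """Estimate the largest singular value by power iteration on W^T W.

    Assumes a non-zero matrix; a zero matrix would divide by zero below.
    """
    wt = transpose(matrix)
    v = [1.0] * len(matrix[0])
    for _ in range(n_iter):
        u = matvec(matrix, v)
        v = matvec(wt, u)
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = matvec(matrix, v)
    return math.sqrt(sum(x * x for x in u))


def spectral_normalize(matrix, bound=1.0):
    """Rescale W so that its spectral norm does not exceed `bound`."""
    sigma = spectral_norm(matrix)
    if sigma <= bound:
        return [row[:] for row in matrix]
    scale = bound / sigma
    return [[w * scale for w in row] for row in matrix]


# Invented 2x2 weights whose largest singular value is 3.
weights = [[3.0, 0.0], [0.0, 1.0]]
normalized = spectral_normalize(weights)
```

Bounding the spectral norm of each layer limits how much the network can stretch or collapse distances between inputs, which is the property a distance-aware output layer relies on.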

2606.18354 2026-06-18 eess.IV cs.LG New submission 90%

Structural MRI Synthesis for Alzheimer's Disease via Conditional Diffusion on Anatomical Masks

Muge Zhang, Muhammad Ali Khaliq, Jamal Alsakran, Byeong Kil Lee, Jeeho Ryoo

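Affiliations: Fairleigh Dickinson University; University of Colorado at Colorado Springs

Topic match: Medical imaging. Synthesis of structural MRI for Alzheimer's disease with a conditional diffusion model.

AI summary: To address the difficulty of capturing subtle anatomical changes when synthesizing structural MRI for Alzheimer's disease, this paper extends the Med-DDPM conditional diffusion model to generate 3D structural MRI conditioned on anatomical segmentation masks. Experiments show that models trained on synthetic data reach Dice scores comparable to those trained on real data, and that training on hybrid data improves performance markedly.

Journal ref: 2025 IEEE 8th International Conference on Multimedia Information Processing and Retrieval (MIPR)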

Abstract

Recent advances in generative machine learning models have significantly improved medical imaging, offering promising solutions for data augmentation, privacy preservation, and improved model generalization. However, synthesizing high-quality structural MRI data for Alzheimer's Disease (AD) remains challenging due to the subtle, region-specific, and progressive anatomical changes associated with neurodegeneration. In this paper, we extend the Med-DDPM conditional diffusion model -- originally designed for brain tumor synthesis -- to generate 3D structural MRIs specifically tailored to AD. We adopted Med-DDPM due to its established stability and structural fidelity compared to other generative models, which makes it particularly suitable for capturing the subtle anatomical changes characteristic of AD. Our approach conditions the diffusion process on anatomical segmentation masks derived from the ADNI dataset, incorporating key AD-relevant brain structures into the generation process. We systematically evaluate the quality and utility of the synthetic images by training segmentation models on real, synthetic, and hybrid (mixed) datasets. Experimental results demonstrate that segmentation models trained exclusively on synthetic data achieve comparable Dice scores (0.6532) to those trained on real data (0.6513), while exhibiting significantly enhanced recall. Notably, models trained on hybrid datasets (mixing real and synthetic images) outperform both real and synthetic-only baselines, achieving a Dice score of 0.7244. These findings underscore the successful use of conditional diffusion models for generating anatomically accurate, AD-specific synthetic MRIs, and highlight their potential for enhancing training data availability, improving diagnostic accuracy, and promoting research reproducibility in neuroimaging studies.

2606.19215 2026-06-18 cs.CV New submission 90%

GUMP-Net: An interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation

Liheng Wang, Yinghui Zhang, Licheng Zhang, Hailin Xu, Qiyong Cao, Chong Chen

Affiliations: State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Department of Orthopedics, The Fourth Medical Center of Chinese PLA General Hospital; National Clinical Research Center for Orthopedics, Sports Medicine and Rehabilitation; Department of Trauma and Orthopedics, People’s Hospital Peking University; Department of Orthopedics and Traumatology, Beijing Jishuitan Hospital, Capital Medical University

Topic match: Medical imaging. Pelvic segmentation, a medical image analysis task.

AI summary: Proposes GUMP-Net, which combines an improved geodesic active contour model with deep neural networks for multi-class pelvic segmentation, performs better with small training sets, and offers an interpretable geometric perspective.

Comments: 26 pages, 8 figures, 3 tables

Abstract

Pelvic segmentation is one of the most important and fundamental research problems in precise and intelligent diagnosis and treatment, as well as surgical planning and navigation, for pelvic fractures. By combining an improved geodesic active contour model with deep neural networks, we propose GUMP-Net, an interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation, in which three network modules together constitute the overall segmentation framework: the object detection module for automatic level set initialization, the edge detector module for learning an anatomy-aware edge detector function, and the iteration module for deep level set evolution. Leveraging the advantages of level set representation and deep learning, GUMP-Net shows more accurate, robust, and consistent segmentation performance than state-of-the-art methods, especially when training data are scarce. Extensive experiments on pelvic datasets demonstrate the rationality and effectiveness of the proposed algorithm. Further experiments on an ankle dataset indicate broader applications to other anatomies. The proposed algorithm not only provides an efficient segmentation method for complex fracture reduction, but also gives an interpretable geometric perspective for understanding deep learning segmentation.

2606.18886 2026-06-18 cs.CV New submission 90%

DINO-Med3D: Bridging Dimension and Domain Gaps in Volumetric Segmentation via Progressive Adaptation

Haoyu Hu, Xiyao Ma, Shiqi Liu, Linsen Zhang, Xiaoliang Xie, Xiaohu Zhou, Zeng-Guang Hou

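Affiliations: University of Chinese Academy of Sciences; Institute of Automation, Chinese Academy of Sciences

Topic match: Medical imaging. Adapts DINOv3 to 3D medical segmentation.

AI summary: Proposes DINO-Med3D, a two-stage progressive framework that adapts DINOv3 to 3D medical segmentation through a multi-slice embedding module, 3D adapters, and a parallel detail recovery stream, and outperforms existing methods on five datasets.

Comments: Accepted at MICCAI 2026. The camera-ready version and link will be made publicly available upon publication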

Abstract

Although DINOv3 has demonstrated remarkable semantic discrimination in natural imagery, its direct application to volumetric medical segmentation is hindered by inherent dimension and domain disparities. To resolve these issues, we propose DINO-Med3D, a two-stage progressive framework that repurposes the pre-trained DINOv3 encoder for 3D medical tasks. In the first stage, we mitigate the dimension gap by introducing a multi-slice embedding module that incorporates pseudo-3D context, while simultaneously employing a segmentation proxy task to adapt representations learned from natural scenes to the medical domain. In the second stage, we further enhance volumetric understanding by adding lightweight 3D adapters into the frozen backbone to enforce global inter-slice continuity. Finally, to compensate for the spatial information loss inherent in the embedding process, we design a parallel detail recovery stream to explicitly preserve high-frequency boundary cues. Extensive experiments on five public datasets demonstrate that our approach successfully adapts DINOv3 to the medical domain and significantly outperforms state-of-the-art baselines.

2606.18876 2026-06-18 cs.CV cs.LG New submission 90%

Test-Time Adaptation in Optical Coherence Tomography Using Trajectory-Aligned Time-Independent Flow

Veit Hucke, Thomas Pinetz, Gregor Reiter, Ursula Schmidt-Erfurth, Hrvoje Bogunović

Affiliations: Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria; Comprehensive Center for Artificial Intelligence in Medicine, Medical University of Vienna, Austria; Department of Ophthalmology and Optometry, Medical University of Vienna, Austria; Laboratory for Ophthalmic Image Analysis, Medical University of Vienna, Austria

Topic match: Medical imaging. OCT image quality adaptation for AMD segmentation, a core medical imaging topic.

AI summary: Proposes a flow-matching-based test-time adaptation method that uses histogram matching and removes time conditioning to generate high-quality surrogate images, reaching state-of-the-art performance in AMD segmentation.

Comments: Accepted in MICCAI

Abstract

Optical coherence tomography (OCT) is essential in ophthalmology, but inconsistent image quality especially in low-cost devices hinders automated analysis. To address this, we introduce a flow-matching-based test-time adaptation method that generates high-quality surrogate images from noisy inputs. Typically, domain gaps between test and training data cause pixel distribution mismatches during the denoising process. We overcome this by matching the test image's histogram to synthetic reference trajectories, successfully aligning the input with expected distributions. Additionally, we remove the network's time conditioning to account for slight deviations in real-world noise distributions. Our approach achieves state-of-the-art performance in segmenting critical biomarkers for two stages of Age-related Macular Degeneration (AMD). Code is available: https://github.com/Veit21/tta-flow.
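The histogram-matching step mentioned in this abstract can be illustrated with classic quantile-based matching on flat pixel lists. This sketch shows only the generic technique with invented pixel values. How the authors match against synthetic reference trajectories is not specified in the abstract and is not reproduced here.

```python
def match_histogram(source, reference):
    """Map each source value to the reference value at the same quantile rank.

    Both inputs are flat, non-empty lists of pixel intensities.
    """
    ref_sorted = sorted(reference)
    order = sorted(range(len(source)), key=lambda i: source[i])
    n_src, n_ref = len(source), len(ref_sorted)
    matched = [0] * n_src
    for rank, idx in enumerate(order):
        # Position of this rank within the sorted reference distribution.
        ref_pos = min(int(rank * n_ref / n_src), n_ref - 1)
        matched[idx] = ref_sorted[ref_pos]
    return matched


# Invented intensities: a dim test image and a brighter reference.
source = [10, 30, 20, 40]
reference = [100, 200, 150, 250]
matched = match_histogram(source, reference)
```

The output keeps the spatial ordering of the source (its darkest pixel stays darkest) while taking on the intensity distribution of the reference.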

2606.18872 2026-06-18 cs.CV New submission 90%

Bridging Single Distortion Artifacts and Multifactorial Clinical Quality: Few-shot Biparametric MRI Quality Assessment via Distortion-trained Prototypical Networks

Yuheng Tang, Alexander Ng, Wen Yan, Natasha Thorley, Pawel Rajwa, Yipei Wang, Aqua Asif, Clare Allen, Louise Dickinson, Francesco Giganti, Shonit Punwani, Daniel Alexander, Veeru Kasivisvanathan, Yipeng Hu

发表机构 * UCL Hawkes Institute(UCL Hawkes研究所) Department of Medical Physics and Biomedical Engineering(医学物理与生物医学工程系) University College London(伦敦大学学院) Division of Surgery and Interventional Science(外科与介入科学分会) Centre for Medical Imaging(医学成像中心) British Urology Researchers in Surgical Training (BURST)(英国泌尿外科手术培训研究人员(BURST)) Department of Radiology(放射科) University College London Hospitals NHS Foundation Trust(伦敦大学学院医院国家健康服务信托基金) Centre of Medical Imaging, Division of Medicine(医学成像中心,医学分会) Centre for Medical Image Computing(医学图像计算中心) Department of Computer Science(计算机科学系) Department of Urology(泌尿科)

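Topic match (medical imaging): prostate MRI quality assessment with a few-shot prototypical network.

AI summary: Proposes a few-shot biparametric prototypical network that is meta-trained on distortion labels and, through feature fusion and domain alignment, predicts PI-QUAL clinical quality scores from only five samples, addressing the scarcity of annotated clinical data.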

详情
AI中文摘要

临床前列腺多参数MRI高度依赖高质量扩散加权成像(DWI),但DWI读图常因几何失真(通常由直肠气体引起)而受损。通过PI-QUAL评分系统评估质量是新兴的临床标准,但该方法主观、耗时,且存在类别不平衡问题,其中低质量病例多样且相对稀少。以PRIME临床试验为例,6%的图像PI-QUAL评分低于4,87%的DWI问题源于失真,许多其他临床质量问题代表性不足。为解决这种标注临床数据的双重稀缺性,我们提出了一种用于自动图像质量评估(IQA)的少样本双参数原型网络。我们的框架利用双分支3D ResNet融合T2加权和DWI特征,提供解剖背景以区分真实形态与失真。为处理现实异质性,我们引入特征级线性调制(FiLM)和梯度反转层(GRL),以对齐基于不同b值的特征分布,同时抑制采集相关偏差。我们证明,仅基于相对客观、易于获取的失真标签进行元训练的模型,能够仅使用五个代表性样本有效适应预测复杂的多因素临床质量评分(如PI-QUAL)。在两个数据集上的实验结果表明,我们的方法在此具有挑战性的IQA任务中显著优于少样本学习基线,为临床工作流程中标准化前列腺MRI质量控制提供了实际可行且数据高效的解决方案。

英文摘要

Clinical prostate multi-parametric MRI relies heavily on high-quality diffusion-weighted imaging (DWI), yet reading DWI is frequently compromised by geometric distortion, often caused by rectal air. Assessing quality via the PI-QUAL scoring system is an emerging clinical standard, but it is subjective, time-consuming and suffers from a class imbalance where low-quality cases are diverse and relatively scarce. Using the PRIME clinical trial as an example, $6\%$ of images have PI-QUAL scores lower than 4, and $87\%$ of DWI issues are due to distortion. Many of the other clinical quality issues are under-represented. To address this common dual-scarcity of annotated clinical data, we propose a few-shot biparametric prototypical network for automated image quality assessment (IQA). Our framework utilizes a dual-branch 3D ResNet to fuse T2-weighted and DWI features, providing anatomical context to distinguish true morphology from distortion. To handle real-world heterogeneity, we introduce feature-wise linear modulation (FiLM) and a gradient reversal layer (GRL) to align feature distributions conditioned on varying b-values while suppressing acquisition-related biases. We demonstrate that a model meta-trained solely on comparatively objective, readily obtainable distortion labels can effectively adapt to predicting complex, multi-factorial clinical quality scores such as PI-QUAL using only five representative samples. Experimental results on two datasets show that our method significantly outperforms few-shot learning baselines for this challenging IQA task, offering a practically feasible and data-efficient solution for standardizing prostate MRI quality control in clinical workflows.
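The few-shot step can be sketched as generic prototypical classification: each class prototype is the mean of a handful of support embeddings, and a query takes the label of the nearest prototype. This is a minimal sketch under assumptions. The 2-D embeddings and the labels `adequate` and `inadequate` are hypothetical, and the paper's dual-branch 3D ResNet, FiLM and GRL components are not modelled.

```python
import math

def prototype(embeddings):
    """Class prototype = element-wise mean of the support embeddings."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]

def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Hypothetical five-shot support set per class (embeddings are made up).
support = {
    "adequate": [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0], [0.8, 0.2], [1.2, 0.0]],
    "inadequate": [[-1.0, 0.1], [-0.9, -0.2], [-1.1, 0.0], [-1.2, 0.1], [-0.8, 0.0]],
}
prototypes = {label: prototype(embs) for label, embs in support.items()}
prediction = classify([0.7, 0.3], prototypes)
```

Because only class means are stored, adapting to a new label set needs no gradient updates, which is what makes five samples per class workable.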

2606.18869 2026-06-18 cs.CV 新提交 90%

Learning to Distort: Weakly-Supervised Image Quality Transfer for Prostate DWI Correction

学习扭曲:用于前列腺DWI校正的弱监督图像质量迁移

YuCheng Tang, Wen Yan, Alexander Ng, Natasha Thorley, Pawel Rajwa, Yipei Wang, Aqua Asif, Clare Allen, Louise Dickinson, Francesco Giganti, David Atkinson, Shonit Punwani, Daniel Alexander, Shaheer Ullah Saeed, Veeru Kasivisvanathan, Yipeng Hu

发表机构 * UCL Hawkes Institute(UCL Hawkes研究所) Department of Medical Physics and Biomedical Engineering(医学物理与生物医学工程系) University College London(伦敦大学学院) Division of Surgery and Interventional Science(外科与介入科学分会) Centre for Medical Imaging(医学成像中心) British Urology Researchers in Surgical Training (BURST)(英国泌尿外科手术培训研究人员(BURST)) Department of Radiology(放射科) University College London Hospitals NHS Foundation Trust(伦敦大学学院医院国家健康服务信托基金) Centre for Medical Image Computing(医学图像计算中心) Department of Computer Science(计算机科学系) Department of Urology(泌尿科)

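Topic match (medical imaging): prostate DWI distortion correction via weakly-supervised image quality transfer.

AI summary: Proposes a weakly-supervised image quality transfer framework that uses image quality assessment signals to learn to generate realistic distortions from undistorted images, then trains a correction model on the synthesized pairs; it is evaluated against existing unpaired methods on PI-RADS and Gleason score classification.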

详情
AI中文摘要

单次激发平面回波前列腺弥散加权成像(DWI)常因几何失真而复杂化,影响从这些图像中获得可靠诊断的能力。开发自动化校正方法面临缺乏配对的失真和未失真临床扫描的挑战。本文首先提出一种新颖的弱监督图像质量迁移(IQT)框架,从无失真图像到失真图像,利用图像质量评估(IQA)信号监督迁移过程。与传统方法需要昂贵的体素级配对数据或采用无配对算法不同,我们的方法利用图像级质量标签(此处为失真与无失真)在预训练特征空间中建立潜在质量原型。认识到模拟真实失真比直接无配对校正更可靠,我们描述了一种弱监督原型流匹配算法,显式正则化生成轨迹朝向失真原型,产生模拟临床退化的真实磁敏感伪影。通过合成这些真实配对,我们能够训练第二个IQT模型进行正向失真校正。实验结果表明,我们生成的图像成功模拟了真实伪影的诊断干扰,从而产生更强大的失真校正IQT模型。除定性比较外,我们还通过评估临床下游任务性能(PI-RADS和Gleason评分分类),使用分布内和外部数据集,将我们的方法与现有无配对方法(如CycleGAN、UNIT-DDPM和OT-FM)作为正向或反向替代方案进行详尽的定量评估。

英文摘要

Single-shot echo-planar prostate diffusion-weighted imaging (DWI) is frequently complicated by geometric distortions, which impact the ability to derive reliable diagnoses from such images. Developing automated correction methods is challenged by the absence of paired distorted and undistorted clinical scans. In this paper, we first propose a novel weakly-supervised image quality transfer (IQT) framework from undistorted to distorted images that utilizes image quality assessment (IQA) signals to supervise the transfer process. Unlike traditional methods that require expensive, voxel-wise paired data or resort to developing unpaired algorithms, our approach utilizes image-level quality labels (here, distorted vs. undistorted) to establish latent quality prototypes within a pre-trained feature space. Recognizing that simulating realistic distortions is more reliable than direct unpaired correction, we describe a weakly-supervised prototype flow matching algorithm to explicitly regularize generative trajectories towards distorted prototypes, producing realistic susceptibility artifacts that mimic clinical degradations. By synthesizing these realistic pairs, we enable a second IQT model to be trained in the forward direction for distortion correction. Experimental results demonstrate that our generated images successfully mimic the diagnostic interference of real-world artifacts, which leads to more capable distortion correction IQT models. In addition to qualitative comparisons, we also conduct exhaustive quantitative evaluations that compare our approach with existing unpaired approaches (e.g., CycleGAN, UNIT-DDPM, and OT-FM) - as either forward or reverse alternatives - by assessing clinical downstream task performance in PI-RADS and Gleason score classification, using both in-distribution and external data sets.

2606.18860 2026-06-18 cs.CV cs.LG 新提交 90%

Quantification of Uncertainty with Adversarial Models in Medical Image Segmentation

医学图像分割中对抗模型的不确定性量化

Hana Jebril, Thomas Pinetz, Günter Klambauer, Hrvoje Bogunović

发表机构 * Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria(人工智能研究所、医学数据科学中心、维也纳医学大学,奥地利) Comprehensive Center for AI in Medicine, Medical University of Vienna, Austria(医学人工智能综合中心、维也纳医学大学,奥地利) ELLIS Unit Linz, LIT AI Lab and Institute for Machine Learning, Johannes Kepler University Linz, Austria(林茨ELLIS单位、LIT人工智能实验室和机器学习研究所、林茨约翰内斯·开普勒大学,奥地利) Institute for Machine Learning, Johannes Kepler University Linz, Austria(机器学习研究所、林茨约翰内斯·开普勒大学,奥地利) Clinical Research Center for Medical AI, Johannes Kepler University Linz, Austria(医学人工智能临床研究中心、林茨约翰内斯·开普勒大学,奥地利)

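Topic match (medical imaging): uncertainty quantification for medical image segmentation with a post-hoc framework.

AI summary: Proposes QUAM-SM, a post-hoc framework that uses targeted adversarial search to identify fragile pixels, quantifies uncertainty, and separates epistemic from aleatoric uncertainty; it outperforms existing methods on two public datasets.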

Comments Accepted at MICCAI 2026

详情
AI中文摘要

可靠的像素级不确定性量化具有通过实现高保真纵向监测和区分真实病理变化与伪影来改变临床工作流程的潜力。理想情况下,这些模型提供关键治疗计划和手术干预所需的稳定性。然而,标准深度学习模型常常遭受校准不良,产生过度自信的预测,掩盖了微妙病理边界处的潜在脆弱性。为了解决这个问题,我们提出了QUAM-SM,一种使用针对性对抗搜索来识别“对抗脆弱”像素的后处理框架。通过主动寻找暴露预测不稳定性的扰动,我们的方法突出了决策最容易被翻转的区域。重要的是,该框架将认知不确定性与偶然不确定性分离。在两个具有多个专家标注的公开数据集上的实验表明,QUAM-SM在可靠性和边界敏感性方面优于标准和最新的不确定性估计方法。代码可在以下网址获取:https://github.com/HanaJebril/quam_sm

英文摘要

Reliable pixel-level uncertainty quantification holds the potential to transform clinical workflows by enabling high-fidelity longitudinal monitoring and distinguishing true pathological changes from artifacts. Ideally, these models provide the stability required for critical treatment planning and surgical intervention. However, standard deep learning models often suffer from miscalibration, yielding overconfident predictions that mask underlying vulnerabilities at subtle pathological boundaries. To address this, we propose QUAM-SM, a post-hoc framework using targeted adversarial search to identify "adversarially fragile" pixels. By actively seeking perturbations that expose predictive instability, our method highlights regions where decisions are most vulnerable to being flipped. Importantly, the framework disentangles epistemic uncertainty from aleatoric uncertainty. Experiments on two public datasets with multiple expert annotations demonstrate that QUAM-SM outperforms both standard and recent uncertainty estimation approaches in terms of reliability and boundary sensitivity. Code is available at https://github.com/HanaJebril/quam_sm
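To make the epistemic/aleatoric distinction concrete, the sketch below applies the standard entropy decomposition to one pixel's foreground probabilities from several models. It is an assumption-laden illustration of the two quantities, not QUAM-SM's adversarial search, and `decompose` is a hypothetical helper.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli distribution with parameter p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def decompose(probs):
    """probs: foreground probabilities for one pixel from several models."""
    mean_p = sum(probs) / len(probs)
    total = binary_entropy(mean_p)                                   # predictive entropy
    aleatoric = sum(binary_entropy(p) for p in probs) / len(probs)  # expected entropy
    epistemic = total - aleatoric                                    # mutual information
    return total, aleatoric, epistemic

# Models agree that the pixel is ambiguous: uncertainty is aleatoric.
agree = decompose([0.5, 0.5, 0.5])
# Models are individually confident but disagree: uncertainty is epistemic.
disagree = decompose([0.05, 0.95])
```

Both pixels have the same total entropy, so only the split reveals whether more data or a better model could reduce the uncertainty.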

2606.18825 2026-06-18 cs.CV 新提交 90%

DreamReg: Belief-Driven World Model for 2D-3D Ultrasound Registration

DreamReg:基于信念驱动的世界模型用于2D-3D超声配准

Luoyao Kang, Yuelin Zhang, Jiwei Shan, Haifan Gong, Qingpeng Ding, Shing Shin Cheng

发表机构 * T Stone Robotics Institute, The Chinese University of Hong Kong(香港中文大学T Stone机器人研究所) Multi-scale Medical Robotics Center(多尺度医疗机器人中心) Perelman School of Medicine, University of Pennsylvania(宾夕法尼亚大学佩雷尔曼医学院)

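Topic match (medical imaging): 2D-3D ultrasound registration for surgical navigation.

AI summary: Proposes DreamReg, which models 2D-3D ultrasound registration as belief updating, uses a world model to simulate probe motions and integrate the imagined outcomes, and shows improved robustness and competitive real-time registration accuracy on the CAMUS and u-RegPro datasets.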

详情
AI中文摘要

超声(US)广泛应用于手术导航,但由于部分可观测性、散斑噪声以及依赖于动作的US采集,术中2D切片与术前3D体积之间的实时配准仍然具有挑战性。现有方法是一次性的或短视的,难以随时间收集证据或捕捉外科医生如何根据屏幕反馈调整探头运动。我们提出DreamReg,一个基于信念驱动的世界模型框架,将2D-3D配准形式化为对刚性变换的信念更新。DreamReg维护一个潜在信念状态,总结过去的观测和位姿信息,并在新切片到达时通过学习到的动态不断细化变换。在训练期间,DreamReg暴露于模拟临床扫描行为的探头运动轨迹,并通过将位姿细化条件于当前US观测来学习更新其信念。在推理期间,DreamReg通过内部想象来细化配准:它展开学习到的世界模型以模拟候选探头运动及其预测的观测,并整合这些想象的结果以收敛到准确的刚性变换。在CAMUS和u-RegPro数据集上的实验表明,与最先进方法相比,DreamReg在实时引导中具有改进的鲁棒性和有竞争力的配准精度。

英文摘要

Ultrasound (US) is widely used for surgical navigation, yet real-time registration between intraoperative 2D slices and preoperative 3D volumes remains challenging due to partial observability, speckle noise, and the action-dependent US acquisition. Existing methods are one-shot or short-horizon, making it hard for them to gather evidence over time or capture how surgeons adjust probe motion based on on-screen feedback. We propose DreamReg, a belief-driven world-model framework that formulates 2D-3D registration as belief updating over rigid transformations. DreamReg maintains a latent belief state that summarizes past observations and pose information, and continuously refines the transformation through learned dynamics as new slices arrive. During training, DreamReg is exposed to probe-motion trajectories that mimic clinical scanning behavior and learns to update its belief by conditioning pose refinement on the current US observation. During inference, DreamReg refines registration via internal imagination: it rolls out the learned world model to simulate candidate probe motions and their predicted observations, and integrates these imagined outcomes to converge to an accurate rigid transformation. Experiments on CAMUS and u-RegPro datasets demonstrate improved robustness and competitive registration accuracy for real-time guidance compared with state-of-the-art methods.

2606.18753 2026-06-18 cs.CV 新提交 90%

SMART: A Flexible, Interpretable, and Scalable Spatio-temporal Brain Atlas from High-Resolution Imaging Data

SMART:一种灵活、可解释且可扩展的高分辨率成像数据时空脑图谱

John Kalkhof, Boris Gutman, Emile d'Angremont, Daniel C. Alexander, Marco Lorenzi

发表机构 * Illinois Institute of Technology(伊利诺伊理工学院) Amsterdam University Medical Center(阿姆斯特丹大学医学中心) University College London(伦敦大学学院)

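Topic match (medical imaging): spatio-temporal brain atlas for modeling high-resolution 3D medical images.

AI summary: Proposes SMART, which learns a continuous disease-time atlas by decoupling global disease dynamics from patient-specific anatomical manifestation, enabling flexible, interpretable, and scalable modeling of spatio-temporal change in high-resolution 3D medical images.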

详情
AI中文摘要

我们介绍了SMART,一个从纵向高分辨率3D医学图像中学习灵活、可解释且可扩展的时空脑图谱的框架。现有的时空图谱构建方法依赖于黑盒生成模型,缺乏灵活性、限制可解释性,并且难以扩展到高维数据。SMART通过学习一个连续的疾病时间图谱来解决这些挑战,该图谱将全局群体级疾病动态与患者特定的解剖表现解耦。在解剖学启发先验的指导下,SMART通过区域特异性微分方程,沿着共享的疾病时间线建模可解释的全局区域进展轨迹。全局轨迹进一步通过由灵活且可扩展的多尺度神经细胞自动机参数化的密集微分同胚位移,个性化到个体解剖结构。在阿尔茨海默病的五个纵向MRI数据集(ADNI-1/GO/2、OASIS-3、AIBL;>1300名受试者)上评估,SMART产生了解剖学上有意义的疾病进展预测,并实现了最先进的预测准确性和比对抗性和扩散基线更好的时间一致性。我们的方法为高维医学图像时间序列中时空变化的灵活、可解释和可扩展建模建立了一个新范式。

英文摘要

We introduce SMART, a framework for learning a flexible, interpretable, and scalable spatio-temporal brain atlas from longitudinal high-resolution 3D medical images. Existing approaches to spatio-temporal atlas construction rely on black-box generative models that lack flexibility, limit interpretability, and struggle to scale to high-dimensional data. SMART addresses these challenges by learning a continuous disease-time atlas that decouples global group-wise disease dynamics from their patient-specific anatomical manifestation. Guided by anatomically inspired priors, SMART models interpretable global trajectories of regional progression along a shared disease timeline through region-specific differential equations. Global trajectories are further personalized to individual anatomies via dense diffeomorphic displacements parameterized by a flexible and scalable multi-scale Neural Cellular Automata. Evaluated on five longitudinal MRI datasets in Alzheimer's disease (ADNI-1/GO/2, OASIS-3, AIBL; > 1,300 subjects), SMART produces anatomically meaningful predictions of disease progression and achieves state-of-the-art forecasting accuracy and improved temporal consistency over adversarial and diffusion baselines. Our approach establishes a new paradigm for flexible, interpretable, and scalable modeling of spatio-temporal change in high-dimensional medical image time-series.

2606.18723 2026-06-18 cs.CV cs.LG 新提交 90%

Clinically Aligned Geometry Constraints for Robust IVUS Vessel Boundary Segmentation

临床对齐的几何约束用于鲁棒的IVUS血管边界分割

Yunshu Chen, Litao Yang, Giuseppe Di Giovanni, Jordan Tan, Deval Mehta, Andrew Lin, Derek Chew, Masasi Fujino, Julie Butters, Stephen Nicholls, Zongyuan Ge, Kyung Hoon Cho

发表机构 * AIM For Health Lab, Monash University(莫纳什大学AIM健康实验室) Department of Data Science and Artificial Intelligence, Faculty of IT, Monash University(莫纳什大学信息技术学院数据科学与人工智能系) Monash University Victorian Heart Institute(莫纳什大学维多利亚心脏研究所) School of Computing Technologies, RMIT University(皇家墨尔本理工大学计算技术学院) National Cerebral and Cardiovascular Center(国立循环器病研究中心) Department of Cardiology, Chonnam National University Hospital and Medical School(全南大学医院和医学院心脏病学系)

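Topic match (medical imaging): IVUS vessel boundary segmentation with a geometry-constrained network.

AI summary: Proposes GeoCat, a network with dual encoders and a differentiable geometry consistency loss that reduces boundary drift and topology errors in IVUS segmentation and improves the accuracy of clinical geometric measurements.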

Comments MICCAI2026 Accepted

详情
AI中文摘要

血管内超声(IVUS)管腔和外弹性膜(EEM)分割对于定量评估冠状动脉斑块负荷至关重要。管腔或EEM勾画的误差会直接传播到斑块面积、斑块负荷和几何测量中。然而,优先考虑重叠分数的标准方法常常遭受边界漂移和拓扑错误,导致临床测量不准确。我们提出GeoCat,一个几何一致性网络,使用双笛卡尔-极坐标编码器,结合跨域注意力和时间融合,处理5帧IVUS片段。可微的几何一致性损失直接监督临床相关描述符,包括直径、方向和横截面积。该模型在来自146名患者的12,242张标注帧上训练,这些帧使用两种商用IVUS系统采集。我们使用分割准确性和斑块相关临床指标评估性能,包括Dice/IoU、边界测量(95HD(mm)、ASSD)、拓扑违规率和临床几何误差(dmax/dmin、角度和面积)。在我们的数据集上,GeoCat实现了0.93的Dice,将95HD降低到0.14 mm,并将拓扑违规率降低到1.0%。重要的是,它显著提高了几何保真度,产生0.13-0.16 mm的直径误差和约8度的角度误差,支持可靠的斑块负荷量化。

英文摘要

Intravascular ultrasound (IVUS) lumen and external elastic membrane (EEM) segmentation is important for quantitative coronary plaque burden assessment. Errors in lumen or EEM delineation directly propagate to plaque area, plaque burden and geometric measurements. However, standard methods prioritising overlap scores often suffer from boundary drift and topology errors, leading to inaccurate clinical measurements. We present GeoCat, a geometry-consistent network that processes 5-frame IVUS clips using dual Cartesian-polar encoders with cross-domain attention and temporal fusion. A differentiable geometry consistency loss directly supervises clinically relevant descriptors including diameters, orientations, and cross-sectional areas. The model is trained on 12,242 annotated frames from 146 patients acquired with two commercial IVUS systems. We evaluate performance using both segmentation accuracy and plaque-relevant clinical metrics, including Dice/IoU, boundary measures (95HD in mm, ASSD), topology violation rate, and clinical geometry errors (dmax/dmin, angles, and areas). On our dataset, GeoCat achieves a Dice of 0.93, reduces 95HD to 0.14 mm, and lowers topology violations to 1.0%. Importantly, it significantly improves geometric fidelity, yielding diameter errors of 0.13-0.16 mm and angular errors of ~8 degrees, supporting reliable plaque burden quantification.
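How a delineation error propagates into a clinical measurement can be shown with a minimal sketch. It uses the standard Dice definition and the usual plaque burden formula, (EEM area - lumen area) / EEM area. The masks and areas are made-up toy values, and the helper names are hypothetical.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat 0/1 lists."""
    intersection = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2 * intersection / total if total else 1.0

def plaque_burden(lumen_area_mm2, eem_area_mm2):
    """Plaque burden (%) = (EEM area - lumen area) / EEM area * 100."""
    return 100.0 * (eem_area_mm2 - lumen_area_mm2) / eem_area_mm2

predicted = [1, 1, 1, 0, 0, 0]
annotated = [1, 1, 0, 0, 0, 0]
overlap = dice(predicted, annotated)

true_burden = plaque_burden(6.0, 15.0)
# A 0.5 mm^2 overestimate of the lumen shifts the burden by several points.
drifted_burden = plaque_burden(6.5, 15.0)
```

The second pair of values shows why an overlap score alone can hide clinically relevant error: a small boundary drift changes the derived burden directly.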

2606.15604 2026-06-18 eess.IV cs.CV 新提交 90%

Parameter-Efficient Adaptation of SAM 3 for Automated ITV Generation from 4DCT Images

基于参数高效微调SAM 3从4DCT图像自动生成内靶区

Changwoo Song

发表机构 * Oncosoft Inc.(Oncosoft公司) Department of Computer Science & Engineering, Chungnam National University(忠南大学计算机科学与工程系)

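Topic match (medical imaging): internal target volume generation by segmenting 4DCT images.

AI summary: Proposes a lightweight framework that fine-tunes SAM 3 parameter-efficiently with LoRA and combines hard negative mining with phase-coherent filtering; using only seven annotated volumes, it reaches median Dice scores of 0.968 on pulmonary structures and 0.910 on cardiac structures.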

Comments 10 pages, 4 figures, 2 tables

详情
AI中文摘要

四维计算机断层扫描(4DCT)捕获了胸部解剖结构的完整呼吸周期,然而当前的内靶区勾画流程孤立处理每个相位,丢弃了时间相干性,使轮廓易受相位特定伪影影响。我们提出一个轻量框架,通过低秩适应(LoRA)对Segment Anything Model 3(SAM 3)进行参数高效微调,仅使用七个标注的3D CT体数据,将其文本提示分割与医学领域对齐。此外,该框架结合了硬负样本挖掘策略,以改善低对比度胸部区域的边界判别。在推理时,通过相位相干时间滤波和空间连通性分析细化逐相位预测。由于呼吸运动是连续且周期性的,真实解剖结构出现在连续的相位块中,而瞬态伪影零星出现,因此被有效抑制。在肺部和心脏结构上的实验分别产生中位Dice分数0.968和0.910,95百分位Hausdorff距离分别为0.998 mm和2.931 mm。所提框架有效消除了未适应SAM 3零样本推理中固有的严重假阳性预测。仅用七个标注体数据,框架保留了超过95%的全数据准确率,且整个流水线可在单个消费级GPU上训练,展示了自适应放疗中可扩展、数据高效的解决方案。

英文摘要

Four-dimensional computed tomography (4DCT) captures the full respiratory cycle of thoracic anatomy, yet current Internal Target Volume contouring workflows process each phase in isolation, discarding temporal coherence and leaving contours vulnerable to phase-specific artifacts. We present a lightweight framework that applies parameter-efficient fine-tuning to the Segment Anything Model 3 (SAM 3) via low-rank adaptation (LoRA) to align its text-prompted segmentation with the medical domain using only seven annotated 3D CT volumes. Furthermore, the framework incorporates a hard negative mining strategy to improve boundary discrimination in low-contrast thoracic regions. At inference, phase-wise predictions are refined through phase-coherent temporal filtering and spatial connectivity analysis. Since respiratory motion is continuous and periodic, genuine anatomy appears in contiguous blocks of phases, whereas transient artifacts appear sporadically and are thus effectively suppressed. Experiments on pulmonary and cardiac structures yield median Dice scores of 0.968 and 0.910 with 95th-percentile Hausdorff distances of 0.998 mm and 2.931 mm, respectively. The proposed framework effectively eliminates the severe false-positive predictions inherent in the zero-shot inference of the unadapted SAM 3. With only seven annotated volumes, the framework retains over 95% of full-data accuracy, and the entire pipeline is trainable on a single consumer-grade GPU, demonstrating a scalable, data-efficient solution for adaptive radiotherapy.
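The phase-coherence argument (genuine anatomy occupies contiguous phases, artifacts appear sporadically) can be sketched for a single voxel as follows. This is a minimal sketch under assumptions. The cyclic run-length rule and the `min_run` threshold are illustrative choices, not the paper's stated filter.

```python
def filter_phases(present, min_run=3):
    """Keep detections that lie in a cyclic run of at least `min_run` phases."""
    n = len(present)
    if all(present):
        return list(present)
    kept = [False] * n
    # Start just after an absent phase so runs wrapping around the cycle stay whole.
    start = next(i for i in range(n) if not present[i]) + 1
    run = []
    for offset in range(n):
        idx = (start + offset) % n
        if present[idx]:
            run.append(idx)
        else:
            if len(run) >= min_run:
                for j in run:
                    kept[j] = True
            run = []
    return kept

# Ten respiratory phases for one voxel: a wrap-around block (8, 9, 0, 1)
# plus an isolated detection at phase 4.
detections = [True, True, False, False, True, False, False, False, True, True]
filtered = filter_phases(detections)
```

The isolated detection at phase 4 is dropped, while the block spanning the end and start of the breathing cycle is kept because the scan treats the phases as periodic.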

2511.12126 2026-06-18 eess.IV 90%

Volumetric Ultrasound via 3D Null Subtraction Imaging with Circular and Spiral Apertures

体积分层超声成像:基于圆形和螺旋孔径的3D空子减法成像

Bingze Dai, Xi Zhang, Wei-Ning Lee

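Topic match (medical imaging): 3D Null Subtraction Imaging for volumetric ultrasound.

AI summary: Proposes 3D Null Subtraction Imaging, which combines an efficient null-subtraction process with sparse aperture designs to balance image quality, frame rate, and hardware complexity in volumetric ultrasound; experiments show better azimuthal and elevational resolution and contrast than conventional DAS beamforming.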

Comments 10 pages,12 figures

Journal ref Ultrasonics, 2026: 108179

详情
AI中文摘要

体积分层超声成像面临图像质量、帧率和硬件复杂度之间的根本性权衡。本文介绍了一种非线性波束成形框架,即三维空子减法成像(3D NSI),通过结合计算高效的空子减法过程与针对矩阵阵列的多路复用感知稀疏孔径设计,解决这一权衡问题。我们评估了三种声学孔径配置:一个完全驱动的圆形孔径和两个费马螺旋稀疏孔径。为克服矩阵阵列在与低通道数超声系统多路复用时常见的通道共享限制,我们提出了一种螺旋“无重复”孔径,强制在发射-接收事件之间保持非重叠的元件集。该设计解决了多路复用冲突,并使仅使用1024个元件探头中的240个主动元件即可实现高达16倍的采集体积速率。在计算机模拟和组织仿生假体实验中,3D NSI在方位和仰角分辨率方面平均提高了36%,对比度比传统延迟求和(DAS)波束成形器提高了约20%。当与螺旋无重复孔径结合时,3D NSI框架实现了每秒超过1000个体积分层,计算负载仅为DAS的三倍以下,使其成为实时4D成像的实用解决方案。

英文摘要

Volumetric ultrasound imaging faces a fundamental trade-off among image quality, frame rate, and hardware complexity. This study introduces three-dimensional Null Subtraction Imaging (3D NSI), a nonlinear beamforming framework that addresses this trade-off by combining a computationally efficient null-subtraction process with multiplexing-aware sparse aperture designs on matrix arrays. We evaluate three apodization configurations: a fully addressed circular aperture and two Fermat's spiral sparse apertures. To overcome channel-sharing constraints common in matrix arrays multiplexed with low-channel-count ultrasound systems, we propose a spiral "no-reuse" apodization that enforces non-overlapping element sets across transmit-receive events. This design resolves multiplexing conflicts and enables up to a 16-fold increase in acquisition volume rate using only 240 active elements on a 1024-element probe. In computer simulations and tissue-mimicking phantom experiments, 3D NSI achieved an average improvement of 36% in azimuthal and elevational resolutions, along with an approximately 20% higher contrast ratio, compared to the conventional Delay-and-Sum (DAS) beamformer under matched transmit/receive configurations. When implemented with the spiral no-reuse aperture, the 3D NSI framework achieved over 1000 volumes per second with a computational load less than three times that of DAS, making it a practical solution for real-time 4D imaging.
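For readers unfamiliar with Fermat's spiral layouts, the sketch below generates continuous element positions using the golden-angle construction. It is an illustration under assumptions. The radius of 16 element pitches assumes a 32 x 32 layout for the 1024-element probe, and snapping positions to physical elements and the "no-reuse" constraint are not modelled.

```python
import math

GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))  # about 137.5 degrees

def fermat_spiral(n_elements, radius):
    """Return (x, y) positions of n_elements on a Fermat spiral filling a disc."""
    points = []
    for k in range(n_elements):
        # Square-root radial growth gives roughly uniform area density.
        r = radius * math.sqrt((k + 0.5) / n_elements)
        theta = k * GOLDEN_ANGLE
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

# 240 active elements inside a disc of radius 16 pitches (assumed geometry).
positions = fermat_spiral(240, radius=16.0)
```

The irregular but evenly dense sampling is what makes such apertures attractive when only a subset of elements can be driven at once.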

2510.13562 2026-06-18 physics.med-ph cs.CV cs.NA math.NA 90%

An efficient approach with theoretical guarantees to simultaneously reconstruct activity and attenuation sinogram for TOF-PET

一种具有理论保证的高效方法用于同时重建TOF-PET的活动和衰减正弦图

Liyang Hu, Chong Chen

发表机构 * State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China(数学科学国家重点实验室,数学与系统科学研究院,中国科学院,北京100190,中国) University of Chinese Academy of Sciences, Beijing 100190, China(中国科学院大学,北京100190,中国)

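Topic match (medical imaging): PET reconstruction, a core medical imaging method.

AI summary: Proposes a maximum-likelihood method that simultaneously reconstructs the activity and attenuation sinogram for TOF-PET, exploits the exponential form of the attenuation correction factors together with a total-activity constraint, proves well-posedness, and reports better accuracy and efficiency in numerical experiments.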

Comments 32 pages, 11 figures, 4 tables

Journal ref IEEE Transactions on Computational Imaging 2026

详情
AI中文摘要

在正电子发射断层扫描(PET)中,进行衰减校正对于获得体内定量准确的活动图(示踪剂分布)至关重要。通常,这基于从计算机断层扫描或磁共振成像获得的估计衰减图。然而,除了衰减校正因子的误差外,额外的扫描不仅会引入新的辐射剂量或增加扫描时间,还会由于两次连续扫描之间的各种运动导致严重的对齐问题。为了解决这些问题,基于最大似然估计,我们提出了一种新的数学模型,仅从时间飞越(TOF)-PET发射数据中同时重建活动和衰减正弦图。特别地,我们充分利用了衰减校正因子的唯一指数形式,并在所提出的模型中考虑了某些掩码区域的活动总量约束。此外,我们证明了其可解性,包括解的存在性、唯一性和稳定性。我们提出了一种交替更新算法来求解该模型,并分析了其收敛性。最后,使用各种TOF-PET发射数据的数值实验表明,所提出的方法在数值收敛性和抗噪性方面表现良好,并在精度和效率上优于一些最先进的方法,且具有自主衰减校正的能力。

英文摘要

In positron emission tomography (PET), it is indispensable to perform attenuation correction in order to obtain the quantitatively accurate activity map (tracer distribution) in the body. Generally, this is carried out based on the estimated attenuation map obtained from computed tomography or magnetic resonance imaging. However, apart from errors in the attenuation correction factors obtained, the additional scan not only brings in new radiation doses and/or increases the scanning time but also leads to severe misalignment induced by various motions during and between the two sequential scans. To address these issues, based on maximum likelihood estimation, we propose a new mathematical model for simultaneously reconstructing the activity and attenuation sinogram from the time-of-flight (TOF)-PET emission data only. Particularly, we make full use of the exclusively exponential form for the attenuation correction factors, and consider the constraint of a total amount of the activity in some mask region in the proposed model. Furthermore, we prove its well-posedness, including the existence, uniqueness and stability of the solution. We propose an alternating update algorithm to solve the model, and also analyze its convergence. Finally, numerical experiments with various TOF-PET emission data demonstrate that the proposed method converges numerically, is robust to noise, outperforms some state-of-the-art methods in terms of accuracy and efficiency, and is capable of autonomous attenuation correction.
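The structure of such a joint model can be written out in a hedged form. The notation is assumed for illustration and is not taken from the paper: $m_{i,t}$ denotes the counts on line of response $i$ in TOF bin $t$, $p_{i,t}$ the TOF activity sinogram, $\mu$ the attenuation map, and $a_i$ the attenuation correction factor. Scatter and randoms are omitted.

$$
a_i = \exp\!\Big(-\int_{L_i} \mu(x)\,\mathrm{d}x\Big), \qquad \mathbb{E}[m_{i,t}] = a_i\,p_{i,t},
$$

$$
\min_{a,\,p}\ \sum_{i,t}\Big(a_i\,p_{i,t} - m_{i,t}\log\big(a_i\,p_{i,t}\big)\Big) \quad \text{subject to} \quad 0 < a_i \le 1,\ \ p_{i,t} \ge 0 .
$$

The Poisson data term depends on $a$ and $p$ only through the product $a_i\,p_{i,t}$, so it cannot distinguish $(a, p)$ from $(c\,a, p/c)$. A constraint on the total activity in a mask region is one natural way to fix that scale, which is presumably the role it plays in the proposed model.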

2507.05647 2026-06-18 eess.IV cs.CV 90%

Diffusion-Based Limited-Angle CT Reconstruction under Noisy Conditions

基于扩散的噪声条件下有限角度CT重建

Jiaqi Guo, Santiago López-Tapia

发表机构 * Dept. of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA(电气与计算机工程系,西北大学,埃文斯顿,伊利诺伊州,美国)

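Topic match (medical imaging): CT reconstruction method applied directly to medical imaging.

AI summary: Proposes a diffusion-based limited-angle CT reconstruction method that completes missing angular views with a Mean-Reverting stochastic differential equation and adds a noise-aware rectification mechanism for robustness; experiments show strong performance across noise intensities and acquisition conditions.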

Comments Accepted at the 2025 IEEE International Conference on Image Processing (ICIP), Workshop

详情
AI中文摘要

有限角度计算机断层扫描(LACT)是一个具有挑战性的逆问题,其中缺失的角度投影导致不完整的sinogram和重建图像中的严重伪影。尽管最近的基于学习的方法已显示出有效性,但大多数方法假设理想、无噪声的测量,并未能解决测量噪声的影响。为了克服这一限制,我们将LACT视为sinogram修复任务,并提出基于扩散的框架,利用Mean-Reverting随机微分方程(MR-SDE)公式来完成缺失的角度视图。为了在现实噪声下提高鲁棒性,我们提出RNSD$^+$,一种新的噪声感知校正机制,该机制在推理时显式建模不确定性,从而实现可靠且稳健的重建。广泛的实验表明,我们的方法在数据一致性和感知质量上一致优于基线模型,并且在不同噪声强度和采集场景下具有良好的泛化能力。

英文摘要

Limited-Angle Computed Tomography (LACT) is a challenging inverse problem where missing angular projections lead to incomplete sinograms and severe artifacts in the reconstructed images. While recent learning-based methods have demonstrated effectiveness, most of them assume ideal, noise-free measurements and fail to address the impact of measurement noise. To overcome this limitation, we treat LACT as a sinogram inpainting task and propose a diffusion-based framework that completes missing angular views using a Mean-Reverting Stochastic Differential Equation (MR-SDE) formulation. To improve robustness under realistic noise, we propose RNSD$^+$, a novel noise-aware rectification mechanism that explicitly models inference-time uncertainty, enabling reliable and robust reconstruction. Extensive experiments demonstrate that our method consistently surpasses baseline models in data consistency and perceptual quality, and generalizes well across varying noise intensity and acquisition scenarios.
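A mean-reverting SDE of the kind referenced here is commonly written in the following form. The symbols are assumptions for illustration, and the paper's exact parameterization may differ.

$$
\mathrm{d}x = \theta_t\,(\mu - x)\,\mathrm{d}t + \sigma_t\,\mathrm{d}w
$$

Under this reading, $x(0)$ is the complete sinogram, $\mu$ is the limited-angle sinogram toward which the forward process reverts, $\theta_t$ sets the reversion speed, $\sigma_t$ the noise level, and $w$ is a Wiener process. Sinogram inpainting then corresponds to simulating the reverse-time process from a noisy version of $\mu$ back toward $x(0)$.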

2606.19182 2026-06-18 eess.IV 新提交 85%

Optimized Multi-Contrast Self-Supervised MRI Reconstruction using Learned k-space Partitioning

使用学习型k空间划分的优化多对比度自监督MRI重建

Brenden Kadota, Charles Millard, Mark Chiew

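Topic match (medical imaging): multi-contrast self-supervised MRI reconstruction.

AI summary: Proposes a multi-contrast self-supervised learning framework that learns the optimal k-space data partitioning end to end and improves MRI reconstruction quality without fully sampled data.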

详情
AI中文摘要

目的:深度学习在通过从欠采样数据重建高质量图像来加速MRI方面显示出前景。虽然最近的工作利用多对比度信息来提高重建性能,但这些方法依赖于监督学习,需要全采样k空间进行训练。一种方法,通过数据欠采样的自监督学习(SSDU),通过将k空间划分为两个集合,并在两者之间进行网络映射,从而能够直接在欠采样k空间上进行训练。在这项工作中,我们通过两项修改改进了MRI自监督重建。方法:我们提出了一个多对比度自监督学习框架,该框架联合训练多个欠采样对比度,无需全采样k空间数据作为参考。此外,我们以端到端的方式为每个对比度学习最优的自监督数据划分,进一步提高了重建质量。具体来说,我们学习一个最优的划分概率分布,对其进行采样以生成用于划分的掩码。结果:在两个公开可用的多对比度MRI数据集上的实验表明,与当前的单对比度自监督学习方法相比,我们提出的自监督多对比度学习划分方法提高了重建质量。我们还证明了学习k空间数据的划分进一步增强了重建的保真度。结论:多对比度重建与学习划分相结合,比单对比度自监督MRI重建提高了重建保真度。意义:与之前的自监督方法相比,我们的方法可以实现更高的图像保真度和/或加速MRI协议时间,并且无需全采样k空间进行训练。

英文摘要

Objective: Deep Learning has shown promise in accelerating MRI by reconstructing high-quality images from under-sampled data. While recent work has leveraged multi-contrast information to improve reconstruction performance, these methods rely on supervised learning, which requires fully sampled k-space for training. One method, self-supervised learning via data undersampling (SSDU), enables direct training on under-sampled k-space by partitioning it into two sets, with a network mapping between the two. In this work, we improve self-supervised MRI reconstruction with two modifications. Methods: We propose a multi-contrast self-supervised learning framework that jointly trains on multiple under-sampled contrasts without requiring fully sampled k-space data as a reference. Moreover, we learn an optimal self-supervised data partitioning for each contrast in an end-to-end manner, further enhancing reconstruction quality. Specifically, we learn an optimal partitioning probability distribution, which is sampled to generate a mask for partitioning. Results: Experiments on two publicly available multi-contrast MRI datasets demonstrate the improved reconstruction quality of our proposed self-supervised multi-contrast learned partitioning method compared to the current single-contrast self-supervised learning methods. We also demonstrate that learning the partitioning of k-space data further enhances the fidelity of reconstructions. Conclusion: Multi-contrast reconstruction combined with learned partitioning improves reconstruction fidelity over single-contrast self-supervised MRI reconstructions. Significance: Our method can facilitate higher image fidelity and/or accelerated MRI protocol times compared to previous self-supervised methods, and without requiring fully sampled k-space for training.
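The partitioning step described in the Methods can be sketched as sampling a mask from per-location probabilities and splitting the acquired k-space indices into two disjoint sets. This is a minimal sketch with assumed names. The uniform probability of 0.4 is a placeholder for values the paper learns end to end, and how gradients reach those probabilities is not shown.

```python
import random

def partition_kspace(acquired, probs, rng):
    """Split acquired k-space indices into two disjoint sets (network input, loss)."""
    input_set, loss_set = [], []
    for idx in acquired:
        # Each acquired sample goes to the loss set with its own probability.
        if rng.random() < probs[idx]:
            loss_set.append(idx)
        else:
            input_set.append(idx)
    return input_set, loss_set

rng = random.Random(0)
acquired = [0, 3, 4, 7, 8, 9, 12, 15]     # hypothetical sampled k-space lines
probs = {idx: 0.4 for idx in acquired}    # placeholder for learned probabilities
input_set, loss_set = partition_kspace(acquired, probs, rng)
```

The network reconstructs from `input_set` and is penalized on `loss_set`, so no fully sampled reference is needed; only locations that were actually acquired are ever assigned to either set.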

2606.18489 2026-06-18 eess.IV 新提交 85%

GHOST-CAT: An Efficient and Practical Network for Mesh Generation from 3D Echocardiography

GHOST-CAT: 一种高效实用的三维超声心动图网格生成网络

Edward Ferdian, Debbie Zhao, Alistair A. Young, Martyn P. Nash

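Topic match (medical imaging): left-ventricle mesh generation from 3D echocardiography.

AI summary: Proposes GHOST-CAT, a two-stage network combining CNNs, graph convolutions, and transformers that generates topologically consistent, temporally coherent left-ventricle meshes from 3D echocardiography; on a 100-case test set it reaches Dice coefficients of 0.87 (cavity) and 0.75 (myocardium), outperforming existing methods.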

详情
AI中文摘要

深度学习的最新进展显著加速了心脏成像工作流程,从分割到用于计算建模的网格生成。然而,由于3D超声心动图的低对比度噪声比、锥形视野以及对声影的敏感性,其分析面临独特挑战。在此,我们提出了一种专为3D超声心动图定制的高效实用网络。我们的方法由一个两阶段网络组成,结合了卷积神经网络、图卷积网络和Transformer,以创建准确的时间变化3D左心室网格,这些网格在整个心动周期中拓扑一致且时间连贯。我们的模型在100张3D超声图像的保留测试数据集上实现了比当前最先进方法更优越的网格重建精度,与心脏磁共振成像导出的参考分割相比,Dice系数为0.87±0.05(腔室)和0.75±0.07(心肌),平均±标准差表面距离为3.3±0.6毫米(心内膜)和3.5±0.5毫米(心外膜)。重建的网格能够自动计算常规临床指标,如体积、质量和应变,并支持生物物理数字孪生的高级应用。源代码公开共享于 https://github.com/EdwardFerdian/ghost-cat。

英文摘要

Recent advances in deep learning have significantly accelerated cardiac imaging workflows, from segmentation to the generation of meshes for computational modelling. Nevertheless, analysis of 3D echocardiograms presents unique challenges due to their low contrast-to-noise ratio, conical field of view, and susceptibility to acoustic shadowing. Here, we present an efficient and practical network tailored for 3D echocardiograms. Our method consists of a two-stage network that combines convolutional neural networks, graph convolutional networks, and transformers, to create accurate time-varying 3D meshes of the left ventricle that are topologically consistent and temporally coherent throughout the cardiac cycle. Our model achieved superior mesh reconstruction accuracy compared to current state-of-the-art methods on a held-out test dataset of 100 3D echo images, with a Dice coefficient of 0.87 ± 0.05 (cavity) and 0.75 ± 0.07 (myocardium), and mean ± SD surface distances of 3.3 ± 0.6 mm (endocardium) and 3.5 ± 0.5 mm (epicardium), against reference segmentations derived from cardiac magnetic resonance imaging. The reconstructed mesh enables automated calculation of routine clinical indices, such as volume, mass, and strain, and enables advanced applications with biophysical digital twins. Source code is openly shared at https://github.com/EdwardFerdian/ghost-cat.

2606.18749 2026-06-18 cs.CV 新提交 85%

Toward Training-Free Zero-Shot Anomaly Detection in 3D Medical Images: A Batch-Based Approach Using 2D Foundation Models

迈向3D医学图像的无训练零样本异常检测:基于批次的方法使用2D基础模型

Tai Le-Gia

发表机构 * Chungnam National University(忠南大学)

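Topic match (medical imaging): training-free zero-shot anomaly detection in 3D medical images.

AI summary: Proposes CS3F, a framework that uses 2D foundation models for zero-shot anomaly detection in 3D medical images by decomposing volumes along multiple axes, encoding them slice-wise, and computing anomaly scores from cross-subject similarity, with a coarse-to-fine tokenization strategy to reduce signal attenuation.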

详情
AI中文摘要

零样本异常检测(ZSAD)在医学成像中具有吸引力,因为临床系统必须处理异构采集协议、变化的患者群体以及可能缺乏标注训练数据的病理。大多数现有的零样本异常检测方法是为2D图像设计的,它们直接扩展到3D医学体积受到大规模体积基础模型稀缺或利用体积上下文困难的限制。我们提出CS3F,一个无训练的基于批次的框架,用于3D医学图像中的ZSAD,使用2D基础模型。每个体积沿多个解剖轴分解,并由2D视觉变换器逐切片编码。然后通过池化相邻切片特征将其转换为局部体积令牌。异常分数通过跨主体互相似性获得:在其他主体中缺乏相似令牌的令牌被赋予更高的异常分数。为了减少深度池化引起的病灶信号衰减,我们引入了一种粗到细的分词策略,无需穷举匹配即可实现细分辨率体积评分。CS3F在脑部MRI上针对转移瘤、胶质瘤和中风进行评估,并在肺部CT上验证其泛化能力,超越标准图谱对齐的脑部MRI。结果表明,冻结的2D基础模型可以支持3D医学图像中的异常定位,且细分词化的益处很大程度上取决于病灶对比度和成像模态。

英文摘要

Zero-shot anomaly detection (ZSAD) is attractive for medical imaging because clinical systems must handle heterogeneous acquisition protocols, changing patient populations, and pathologies for which annotated training data may be unavailable. Most existing zero-shot anomaly detection methods are designed for 2D images, and their direct extension to 3D medical volumes is limited by the scarcity of large-scale volumetric foundation models or by the difficulty of utilizing volumetric context. We propose CS3F, a training-free batch-based framework for ZSAD in 3D medical images using 2D foundation models. Each volume is decomposed along multiple anatomical axes and encoded slice-wise by a 2D vision transformer. These are then converted into localized volumetric tokens by pooling neighboring slice features. Anomaly scores are obtained from cross-subject mutual similarity: tokens that lack close analogues in other subjects are assigned higher anomaly scores. To reduce the attenuation of focal lesion signals caused by depth pooling, we introduce a coarse-to-fine tokenization strategy that enables fine-resolution volumetric scoring without exhaustive matching. CS3F is evaluated on brain MRI across metastases, glioma, and stroke, as well as validated on lung CT to test generalizability beyond atlas-aligned brain MRI. The results show that frozen 2D foundation models can support anomaly localization in 3D medical images, and that the benefit of fine tokenization depends strongly on lesion contrast and imaging modality.
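The cross-subject scoring idea (a token with no close analogue in other subjects is anomalous) can be sketched with a simplified nearest-neighbour rule. This is an assumption-based illustration. The 2-D token vectors are made up, `anomaly_scores` is a hypothetical helper, and CS3F's mutual-similarity formulation and coarse-to-fine tokenization are not reproduced.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def anomaly_scores(batch):
    """batch: {subject: [token vectors]}; score = 1 - best similarity to any other subject."""
    scores = {}
    for subject, tokens in batch.items():
        # Candidate matches come only from the other subjects in the batch.
        others = [t for s, toks in batch.items() if s != subject for t in toks]
        scores[subject] = [1.0 - max(cosine(tok, o) for o in others) for tok in tokens]
    return scores

batch = {
    "s1": [[1.0, 0.0], [0.9, 0.1]],
    "s2": [[1.0, 0.1], [0.0, 1.0]],   # second token has no close analogue elsewhere
    "s3": [[0.95, 0.05], [1.0, 0.0]],
}
scores = anomaly_scores(batch)
```

Excluding a subject's own tokens from the comparison matters: a lesion usually resembles its neighbouring tokens in the same scan, so only other subjects provide a reference for what is normal.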

2606.18658 2026-06-18 cs.CV eess.IV new submission 85%

On-Manifold Variational Learning with Heat-Kernel Priors

基于热核先验的流形变分学习

Jiarui Xing, Tal Zeevi, Nian Wu, Jian Wang

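Affiliations Yale School of Medicine; University of Virginia; Harvard Medical School

Topic match Medical imaging: attains the highest accuracy on cardiac scar and brain MRI benchmarks

AI summary Proposes a manifold-anchored variational framework whose geometry-aware EM algorithm selects graph medoids on a heat-kernel-weighted latent graph as prototypes, keeping them on-manifold, and uses Dirichlet energy regularization to keep the latent space geometrically smooth; it attains the highest accuracy and sharp prototypes on cardiac scar and brain MRI benchmarks.

Details
AI Chinese abstract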

学习医学影像队列的无监督表示可以揭示临床上有意义的原型,而无需专家标签,这些标签通常带有噪声且无法捕捉真实的病理异质性。然而,现有的深度潜变量模型通过欧几里得平均估计高斯混合先验,产生的原型会偏离弯曲的数据流形,并随着子种群数量的增加而退化。我们提出了一种流形锚定变分框架,基于几何感知的期望最大化(EM)算法,其M步骤选择每个子种群原型作为热核加权潜图上具有最高扩散中心性的图中心点,确保每个原型保持在流形上。Dirichlet能量正则化强制潜空间的几何平滑性,每个子种群的不确定性分数实现了无标签的质量评估。流形锚定EM是一种通用几何工具,扩展了标准EM,并易于应用于其他潜变量模型。在心脏瘢痕和脑MRI基准上,我们的框架在所有比较方法中取得了最高精度,产生了迄今为止最清晰的原型,并且在所有基线退化的较大子种群数量下保持稳定。

English abstract

Learning unsupervised representations of medical imaging cohorts can reveal clinically meaningful prototypes without expert labels, which are often noisy and fail to capture true pathological heterogeneity. However, existing deep latent-variable models estimate Gaussian mixture priors via Euclidean averaging, producing prototypes that drift off the curved data manifold and degenerate as the number of sub-populations grows. We propose a manifold-anchored variational framework built on a geometry-aware Expectation-Maximization (EM) algorithm, whose M-step selects each sub-population prototype as the graph medoid with the highest diffusion centrality on a heat-kernel-weighted latent graph, ensuring that every prototype remains on-manifold. A Dirichlet energy regularizer enforces geometric smoothness of the latent space, and a per-sub-population uncertainty score enables label-free quality assessment. The manifold-anchored EM is a general-purpose geometric tool that extends standard EM and applies readily to other latent-variable models beyond this setting. On cardiac scar and brain MRI benchmarks, our framework attains the highest accuracy among all compared methods, produces the sharpest prototypes reported to date, and remains stable at large sub-population counts where all baselines degenerate.
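
The contrast between Euclidean averaging and an on-manifold medoid can be shown on a toy example. The sketch below is an assumption-laden illustration, not the paper's M-step: it uses the row sum of a heat-kernel affinity matrix as a stand-in for diffusion centrality, ignores EM responsibilities, and the function name `heat_kernel_medoid` and the diffusion time `t` are hypothetical.

```python
import numpy as np


def heat_kernel_medoid(latents, t=0.5):
    """Return the index of the point with the largest heat-kernel row sum.

    The row sum is used here as a simple proxy for diffusion centrality
    (an assumption for illustration; the paper's definition may differ).
    """
    latents = np.asarray(latents, dtype=float)
    diff = latents[:, None, :] - latents[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    # Heat-kernel affinity between every pair of latent points.
    weights = np.exp(-sq_dist / (4.0 * t))
    centrality = weights.sum(axis=1)
    return int(np.argmax(centrality)), centrality


# Five latent points on a half circle, a toy curved "manifold".
angles = np.linspace(0.0, np.pi, 5)
arc = np.stack([np.cos(angles), np.sin(angles)], axis=1)

index, _ = heat_kernel_medoid(arc)
medoid = arc[index]               # an actual data point, so it lies on the arc
euclidean_mean = arc.mean(axis=0) # falls inside the circle, off the arc
```

Here the medoid is the middle point of the arc and has unit norm, whereas the Euclidean mean has norm of roughly 0.48 and therefore lies off the curve.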

2606.17412 2026-06-18 cs.CV cs.AI new submission 85%

Enhancing Pathological VLMs with Cross-scale Reasoning

增强病理视觉语言模型的跨尺度推理能力

Chi Phan, Tianyi Zhang, Qiaochu Xue, Yufeng Wu, Dan Hu, Zeyu Liu, Sudong Wang, Yueming Jin

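Affiliations Department of Electrical and Computer Engineering, National University of Singapore; PuzzleLogic Pte Ltd; Department of Pathology, Fujian Medical University Cancer Hospital & Fujian Cancer Hospital

Topic match Medical imaging: cross-scale reasoning for pathology VLMs, medical image analysis

AI summary Introduces the first cross-scale training and evaluation paradigm, which strengthens cross-scale reasoning in pathology vision-language models through multi-magnification visual question answering, and builds the high-quality benchmark Scale-VQA and the model ScaleReasoner-R1, which achieves state-of-the-art performance.

Details
AI Chinese abstract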

病理图像本质上是多尺度的,要求病理学家整合从低倍放大下的整体组织结构到高倍放大下的细胞形态的证据以进行准确诊断。虽然现有的视觉语言模型(VLM)病理数据集包含多种尺度,但它们通常缺乏明确的跨尺度推理目标。这一限制阻碍了VLM捕获关键的跨尺度表示和学习基于证据的推理。为弥补这一差距,我们引入了首个跨尺度训练和评估范式,将病理解释表述为多倍率推理。然而,创建这样的任务揭示了一个关键挑战:多图像视觉问答(VQA)容易受到仅文本捷径的影响,这使得模型能够利用与放大倍数相关的伪影而非视觉证据来猜测答案。为解决此问题,我们提出了一种泄漏感知的策展流程,结合了对抗性仅文本筛选和约束引导的问题设计。利用该流程,我们构建了Scale-VQA,一个高质量基准,包含4,685个多项选择题,基于2,537张跨多个放大级别的病理图像。最后,我们提出了ScaleReasoner-R1,一个通过强化学习训练的模型,以优化跨尺度VQA任务的性能。ScaleReasoner-R1在我们的跨尺度推理基准上达到了最先进的性能,并在已有的单尺度基准上泛化到最先进的性能。研究结果表明,即使是有限的跨尺度监督也能显著改善病理理解。代码和演示将开源。

English abstract

Pathological images are inherently multi-scale, requiring pathologists to integrate evidence from global tissue architecture at low magnification to cellular morphology at higher magnification for accurate diagnosis. While existing pathology datasets for vision-language models (VLMs) include various scales, they often lack an explicit cross-scale reasoning objective. This limitation prevents VLMs from capturing essential cross-scale representations and learning evidence-based reasoning. To bridge this gap, we introduce the first cross-scale training and evaluation paradigm that formulates pathology interpretation as multi-magnification reasoning. However, creating such a task reveals a critical challenge: multi-image visual question answering (VQA) is prone to text-only shortcuts, which allow models to guess answers using magnification-dependent artifacts rather than visual evidence. To address this, we propose a leakage-aware curation pipeline that combines adversarial text-only screening with constraint-guided question design. Using this pipeline, we construct Scale-VQA, a high-quality benchmark with 4,685 multiple-choice questions grounded in 2,537 pathology images across multiple magnification levels. Finally, we present ScaleReasoner-R1, a model trained via reinforcement learning to optimize performance on the cross-scale VQA task. ScaleReasoner-R1 achieves state-of-the-art performance on our cross-scale reasoning benchmark and generalizes to state-of-the-art performance on established single-scale benchmarks. Findings suggest that even limited cross-scale supervision can significantly improve pathological understanding. The code and demos will be open-sourced.

2606.19174 2026-06-18 cs.HC cs.AI new submission 80%

A Clinician-Centered Pipeline for Annotation and Evaluation in Ultrasound AI Studies

面向临床医生的超声AI研究注释与评估流程

Fangyijie Wang, Jianjun Yu, Wentao Shi, Haixia Huang, Ran Shi, Guénolé Silvestre, Kathleen M. Curran

Affiliations Research Ireland Centre for Research Training in Machine Learning; School of Medicine, University College Dublin, Dublin, Ireland; The Third People's Hospital of Zhenjiang City, Zhenjiang, China; Zhenjiang Maternal and Child Health Hospital, Zhenjiang, China; The Fifth People's Hospital of Zhenjiang City, Zhenjiang, China; School of Computer Science, University College Dublin, Dublin, Ireland

Topic match Medical imaging: annotation and evaluation pipeline for ultrasound AI, within medical imaging

AI summary Presents a clinician-centered pipeline built on a centralized server and lightweight browser interfaces that supports remote annotation, blinded evaluation, and multi-rater participation; it is validated in a fetal ultrasound segmentation study, where it supports reproducible evaluation and automated agreement statistics.

Comments Accepted to MIUA 2026

Details
AI Chinese abstract

临床医生中心的评估对于验证医学AI系统至关重要,尤其是在超声成像中,定量指标并不总能捕捉临床可用性。现有的医学图像平台主要关注数据集标注,缺乏对盲法模型比较和可重复评估工作流的集成支持。我们提出了一个面向临床医生的超声AI研究远程注释与评估流程。该流程使用中央服务器和轻量级浏览器界面,使临床医生无需下载本地数据集即可进行注释、盲法排序和审查。该流程还支持多评分者参与、集中结果聚合和自动统计分析。我们在一个胎儿超声分割研究中验证了该流程,涉及六名评分者,涵盖专家、全科医生和非专家经验水平。系统自动生成了Spearman相关性、Kendall's τ和top-1选择统计量。结果显示专家与其他组之间存在中等到强的一致性。盲法评估结果表明,后期主动学习模型更受青睐。这些结果表明,该流程可以支持超声成像中临床医生中心的注释和可重复的人机AI评估研究。该流程可在GitHub上获取。

English abstract

Clinician-centered evaluation is critical for validating medical AI systems, especially in ultrasound imaging where quantitative metrics do not always capture clinical usability. Existing medical image platforms primarily focus on dataset labeling. They lack integrated support for blinded model comparison and reproducible evaluation workflows. We present a clinician-centered pipeline for remote annotation and evaluation in ultrasound AI studies. The proposed pipeline uses a centralized server and lightweight browser interfaces to enable clinicians to perform annotation, blinded ranking, and review without local dataset downloads. The pipeline also supports multi-rater participation, centralized result aggregation, and automated statistical analysis. We validate the pipeline in a fetal ultrasound segmentation study with six raters spanning expert, generalist, and non-expert experience levels. The system automatically generated Spearman correlation, Kendall's $τ$, and top-1 selection statistics. Results indicated moderate to strong agreement across experts and other groups. The blinded evaluation results showed a tendency for later active learning models to be preferred. These outcomes suggest that the pipeline can support clinician-centered annotation and reproducible human-AI evaluation studies in ultrasound imaging. The proposed pipeline is available on GitHub (https://github.com/13204942/SonoRate).
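
The three agreement statistics named in this abstract can be sketched in plain Python. This is an illustrative reimplementation, not code from the SonoRate repository: the function names, the rater labels, and the rankings are hypothetical, and the formulas assume complete rankings without ties (Spearman's closed form and Kendall's tau-a).

```python
from itertools import combinations


def spearman_rho(rank_a, rank_b):
    """Spearman correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a: (concordant - discordant) / number of item pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


def top1_counts(rankings):
    """Count how often each model index is ranked first (rank 1)."""
    counts = {}
    for ranks in rankings.values():
        best = ranks.index(1)
        counts[best] = counts.get(best, 0) + 1
    return counts


# Hypothetical blinded rankings of four models (position = model index, 1 = best).
rankings = {
    "rater_expert": [4, 3, 2, 1],
    "rater_generalist": [4, 3, 1, 2],
    "rater_non_expert": [3, 4, 2, 1],
}
rho = spearman_rho(rankings["rater_expert"], rankings["rater_generalist"])
tau = kendall_tau(rankings["rater_expert"], rankings["rater_generalist"])
top1 = top1_counts(rankings)
```

With these made-up rankings, the expert and generalist disagree on a single pair of models, and the last model is picked first by two of the three raters.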

2606.18287 2026-06-18 cs.LG new submission 80%

Artemis: Anatomy-Resolved inTervention for Eliminating Multimodal NeuroImage confounderS

Artemis: 解剖分辨的干预方法用于消除多模态神经影像混杂因素

Siyuan Dai, Yang Du, Kun Zhao, Zhusuyi Chen, Heng Huang, Paul Thompson, Chao Shi, Haoteng Tang, Liang Zhan

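Affiliations University of Pittsburgh; University of Maryland; University of Southern California; Binghamton University; University of Texas Rio Grande Valley

Topic match Medical imaging: proposes the Artemis framework to remove neuroimaging confounders and improve diagnostic performance.

AI summary Proposes Artemis, a framework that applies region-level causal intervention to learn confounder representations specific to each brain region, removing the effect of demographic confounders on GNNs in multimodal fMRI and DTI neuroimaging, and improves performance on three benchmarks.

Comments 11 pages, 8 figures

Details
AI Chinese abstract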

多模态神经影像学整合了来自fMRI的功能连接和来自DTI的结构连接,使得使用图神经网络对脑网络进行无创分析成为可能。然而,年龄和性别等人口统计学因素系统地混淆了脑连接与临床结果之间的关系,导致GNN利用虚假捷径而非学习因果不变表示。尽管最近的因果GNN方法在图建模层面引入因果关系,但其因果机制仍然是领域无关的,没有考虑临床神经影像数据中固有的真实世界混杂因素。此外,脑网络是基于图谱分区构建的,每个区域对人口统计学因素表现出不同的敏感性,因此需要区域感知的调整。我们提出了Artemis,一个区域级因果框架,通过在每个脑区域独立进行因果干预,使用轻量级参数学习区域特定的混杂因素表示,从而弥合了这一差距。我们的调整综合利用多模态功能和结构特征进行图推理,作为一个与任意GNN骨干兼容的插件模块。在三个基准(用于疾病诊断的ADNI、用于痴呆分期的OASIS和用于性别分类的HCP)上的实验表明,与代表性的基于GNN的基线相比,该方法具有一致的改进。多项支持实验进一步证明了统计显著性和神经科学可解释性。

English abstract

Multimodal neuroimaging, integrating functional connectivity from fMRI and structural connectivity from DTI, enables non-invasive analysis of brain networks using graph neural networks. However, demographic factors such as age and sex systematically confound the relationship between brain connectivity and clinical outcomes, causing GNNs to exploit spurious shortcuts rather than learning causally invariant representations. While recent causal GNN methods introduce causality at the graph-modeling level, their causal mechanisms remain domain-agnostic without accounting for the real-world confounders inherent in clinical neuroimaging data. Moreover, brain networks are constructed from atlas-based parcellations where each region exhibits distinct sensitivity to demographic factors, necessitating region-aware adjustment. We propose Artemis, a region-level causal framework that bridges this gap with causal intervention at each brain region independently by learning region-specific confounder representations with lightweight parameters. Our adjustment uses both functional and structural multimodal features for graph reasoning and acts as a plug-in module compatible with arbitrary GNN backbones. Experiments on three benchmarks, ADNI for disease diagnosis, OASIS for dementia staging, and HCP for sex classification, demonstrate consistent improvements over representative GNN-based baselines. Multiple supporting experiments further demonstrate statistical significance and neuroscientific interpretability.

2606.19270 2026-06-18 eess.IV cs.LG physics.med-ph new submission 80%

Beyond Algorithms: Conceptual Innovation in Medical Imaging AI

超越算法:医学影像人工智能中的概念创新

Mark A. Anastasio

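Affiliations Mallinckrodt Institute of Radiology and Department of Electrical & Systems Engineering, Washington University in St. Louis

Topic match Medical imaging: discussion of conceptual innovation in medical imaging AI

AI summary This Perspective distinguishes algorithmic innovation from conceptual innovation, argues that current incentive structures over-reward algorithmic novelty while undervaluing conceptual contributions, uses medical imaging AI examples to show how weak conceptual grounding leads to misaligned objectives and limited clinical impact, and offers recommendations for fostering conceptual innovation.

Details
AI Chinese abstract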

人工智能推动了医学影像研究的快速发展,产生了日益复杂的算法,并在基准任务上稳步改进。然而,这种以算法为中心的发展轨迹也揭示了一个日益加剧的不平衡:虽然计算方法快速进步,但定义成像任务、评估指标和临床意义的概念基础有时仍未得到充分审视。在这篇观点文章中,我们区分了算法创新(专注于在固定问题定义内改进计算实现和性能)与概念创新(重新定义提出的问题、衡量成功的方式以及方法在临床上的相关性)。我们认为,当前的激励结构、培训路径和发表规范不成比例地奖励算法新颖性,尤其是对早期职业研究者而言,而有时低估了对科学成熟和临床转化至关重要的概念贡献。通过医学影像AI的代表性例子,我们展示了概念基础不足如何导致目标错位、泛化脆弱以及现实世界影响有限。最后,我们为研究者、导师、审稿人和期刊提出了可操作的建议,以更好地识别、支持和整合概念创新与算法进步。

English abstract

Artificial intelligence has driven rapid progress in medical imaging research, producing increasingly sophisticated algorithms and steady improvements on benchmark tasks. However, this algorithm-centric trajectory has also revealed a growing imbalance: while computational methods advance rapidly, the conceptual foundations that define imaging tasks, evaluation metrics, and clinical meaning sometimes remain underexamined. In this Perspective, we distinguish algorithmic innovation, which focuses on improving computational implementations and performance within a fixed problem definition, from conceptual innovation, which reframes what problems are posed, how success is measured, and why an approach is clinically relevant. We argue that prevailing incentive structures, training pathways, and publication norms disproportionately reward algorithmic novelty, particularly for early-career researchers, while at times undervaluing conceptual contributions that are essential for scientific maturation and clinical translation. Through representative examples from medical imaging AI, we show how insufficient conceptual grounding can lead to misaligned objectives, fragile generalization, and limited real-world impact. We conclude with actionable recommendations for researchers, mentors, reviewers, and journals to better recognize, support, and integrate conceptual innovation alongside algorithmic advances.

2606.18887 2026-06-18 eess.IV physics.med-ph new submission 80%

Efficient Image Registration for Ultrasound Localization Microscopy by Obtaining Gradients via Integration Across Iterations

通过跨迭代积分获取梯度的超声定位显微镜高效图像配准

Jipeng Yan, Chang Liu, Hengchang Liu, Biao Huang, Meng-Xing Tang, Yingxiang Liu, Ying Tan

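Topic match Medical imaging: image registration for ultrasound localization microscopy

AI summary Proposes Extremum Seeking Control (ESC) as a replacement for explicit gradient computation in image registration for ultrasound localization microscopy (ULM), cutting per-iteration computational cost by approximately 3.5-fold and reaching 219 μm resolution in ULM imaging of a beating ex vivo porcine heart.

Details
AI Chinese abstract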

通过图像配准进行组织运动校正对于超声定位显微镜(ULM)至关重要。参数化图像配准通常被表述为一个优化问题,其中运动参数被迭代更新以最大化图像相似度,所使用的优化算法通常依赖于梯度信息,而梯度的显式计算可能变得计算密集。本研究探讨了极值搜索控制(ESC)作为图像配准中显式导数计算的替代方案。通过跨迭代积分扰动和解调后的图像相似度度量来获取下降信息,ESC避免了每次迭代中图像相似度度量对运动参数的微分。经典的ESC(其优化行为近似于经典梯度下降(GD))首先与GD进行比较,用于仿射图像配准,使用从离体猪心跳动数据集中提取的模拟真实运动。结果表明,ESC实现了与GD相当的配准精度和收敛行为,同时每迭代计算成本降低了约3.5倍。随后,ESC被用于两阶段运动校正流程,其中仿射配准补偿全局组织运动,B样条配准校正残余局部变形。所提出的方法应用于离体跳动猪心的ULM成像,实现了219 μm的空间分辨率,显著低于与2.4 MHz发散波成像相关的半波长衍射极限321 μm。这些结果表明,ESC为ULM图像配准中的显式导数计算提供了一种有效的替代方案,能够实现精确的运动校正和高质量的超分辨率成像。

English abstract

Tissue motion correction through image registration is essential for ultrasound localization microscopy (ULM). Parametric image registration is commonly formulated as an optimization problem where motion parameters are iteratively updated to maximize image similarity, and the optimization algorithms used typically rely on gradient information, the explicit evaluation of which can become computationally demanding. This work investigates Extremum Seeking Control (ESC) as an alternative to explicit derivative evaluation in image registration. By obtaining descent information via integrating the perturbed and demodulated image similarity metric across iterations, ESC avoids differentiation of the image similarity metric with respect to motion parameters in each iteration. The classical ESC, whose optimization behavior approximates that of classical gradient descent (GD), is first compared with GD for affine image registration using simulated ground-truth motions derived from a beating ex vivo porcine heart dataset. The results show that ESC achieves registration accuracy and convergence behavior comparable to GD while reducing per-iteration computational cost by approximately 3.5-fold. ESC is subsequently employed in a two-stage motion correction pipeline, where affine registration compensates for global tissue motion and B-spline registration corrects residual local deformation. The proposed method is applied to ULM imaging of a beating ex vivo porcine heart and achieves a spatial resolution of 219 μm, substantially below the half-wavelength diffraction limit of 321 μm associated with 2.4 MHz diverging-wave imaging. These results demonstrate that ESC provides an effective alternative to explicit derivative evaluation in ULM image registration, enabling accurate motion correction and high-quality super-resolution imaging.
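
The perturb, demodulate, and integrate loop described in this abstract can be sketched on a one-parameter toy problem. This is a simplified illustration under stated assumptions, not the authors' registration code: it omits the washout (high-pass) filter commonly used in classical ESC, the quadratic `toy_similarity` stands in for an image similarity metric, and the function name and all parameter values are hypothetical.

```python
import math


def extremum_seeking_maximize(similarity, theta0, steps=2000,
                              amplitude=0.5, gain=0.02, period=10):
    """Maximise `similarity` without ever computing its derivative.

    Each iteration perturbs the estimate with a sinusoidal dither, measures
    the similarity once, multiplies the measurement by the same dither
    (demodulation), and accumulates the product into the estimate
    (integration across iterations).
    """
    theta_hat = theta0
    omega = 2.0 * math.pi / period
    for n in range(steps):
        dither = math.sin(omega * n)
        measured = similarity(theta_hat + amplitude * dither)
        # On average over one dither period this step is proportional to
        # the slope of `similarity`, so the estimate climbs toward a maximum.
        theta_hat += gain * dither * measured
    return theta_hat


def toy_similarity(shift):
    """Hypothetical similarity metric that peaks at a shift of 2.0."""
    return -(shift - 2.0) ** 2


estimate = extremum_seeking_maximize(toy_similarity, theta0=0.0)
```

Only one similarity evaluation is needed per iteration, which is the property the abstract contrasts with explicit gradient evaluation; the price in this sketch is a small residual ripple around the optimum set by the dither amplitude and gain.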