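arXivDaily: arXiv daily digest, updated Monday through Friday

Science and Medicine

Medical AI

Medical intelligence, clinical AI, medical imaging, pathology, diagnosis, and large models for healthcare.

Entries for the current date: 41. Sources: cs.CV, cs.LG, q-bio, eess.IV, eess.SP
2606.18749 2026-06-18 cs.CV New submission 85%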

Toward Training-Free Zero-Shot Anomaly Detection in 3D Medical Images: A Batch-Based Approach Using 2D Foundation Models

Tai Le-Gia

Affiliations: Chungnam National University

Topic match: Medical imaging. Training-free zero-shot anomaly detection in 3D medical images.

AI summary: Proposes CS3F, a framework that uses 2D foundation models for zero-shot anomaly detection in 3D medical images. Volumes are decomposed along multiple axes and encoded slice by slice, anomaly scores are computed from cross-subject similarity, and a coarse-to-fine tokenization strategy reduces signal attenuation.

Abstract

Zero-shot anomaly detection (ZSAD) is attractive for medical imaging because clinical systems must handle heterogeneous acquisition protocols, changing patient populations, and pathologies for which annotated training data may be unavailable. Most existing zero-shot anomaly detection methods are designed for 2D images, and their direct extension to 3D medical volumes is limited by the scarcity of large-scale volumetric foundation models or by the difficulty of utilizing volumetric context. We propose CS3F, a training-free batch-based framework for ZSAD in 3D medical images using 2D foundation models. Each volume is decomposed along multiple anatomical axes and encoded slice-wise by a 2D vision transformer. These are then converted into localized volumetric tokens by pooling neighboring slice features. Anomaly scores are obtained from cross-subject mutual similarity: tokens that lack close analogues in other subjects are assigned higher anomaly scores. To reduce the attenuation of focal lesion signals caused by depth pooling, we introduce a coarse-to-fine tokenization strategy that enables fine-resolution volumetric scoring without exhaustive matching. CS3F is evaluated on brain MRI across metastases, glioma, and stroke, as well as validated on lung CT to test generalizability beyond atlas-aligned brain MRI. The results show that frozen 2D foundation models can support anomaly localization in 3D medical images, and that the benefit of fine tokenization depends strongly on lesion contrast and imaging modality.
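
The cross-subject scoring idea in the abstract can be illustrated with a minimal sketch. It assumes each subject is already reduced to a list of token vectors, and it scores a token as one minus its best cosine match in each other subject, averaged over those subjects. The function names, the toy vectors, and this particular scoring rule are assumptions for illustration, not the CS3F implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_subject_scores(batch):
    """Score every token by how poorly it is matched in the other subjects.

    batch: list of subjects, each a list of token vectors (hypothetical layout).
    Returns one list of anomaly scores per subject; higher means more unusual.
    """
    scores = []
    for i, tokens in enumerate(batch):
        subject_scores = []
        for token in tokens:
            # Best match of this token inside each *other* subject.
            best = [
                max(cosine(token, other) for other in other_tokens)
                for j, other_tokens in enumerate(batch)
                if j != i
            ]
            subject_scores.append(1.0 - sum(best) / len(best))
        scores.append(subject_scores)
    return scores


# Toy batch: subject 0 carries one token with no close analogue elsewhere.
batch = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
scores = cross_subject_scores(batch)
```

In this toy batch only the third token of subject 0 receives a high score, because every other token has an exact counterpart in the other subjects.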

2606.18658 2026-06-18 cs.CV eess.IV New submission 85%

On-Manifold Variational Learning with Heat-Kernel Priors

Jiarui Xing, Tal Zeevi, Nian Wu, Jian Wang

Affiliations: Yale School of Medicine; University of Virginia; Harvard Medical School

Topic match: Medical imaging. Highest accuracy on cardiac scar and brain MRI benchmarks.

AI summary: Proposes a manifold-anchored variational framework in which a geometry-aware EM algorithm selects graph medoids on a heat-kernel-weighted latent graph as prototypes, keeping the prototypes on-manifold. A Dirichlet energy regularizer keeps the latent space geometrically smooth. The framework attains the highest accuracy and sharp prototypes on cardiac scar and brain MRI benchmarks.

Abstract

Learning unsupervised representations of medical imaging cohorts can reveal clinically meaningful prototypes without expert labels, which are often noisy and fail to capture true pathological heterogeneity. However, existing deep latent-variable models estimate Gaussian mixture priors via Euclidean averaging, producing prototypes that drift off the curved data manifold and degenerate as the number of sub-populations grows. We propose a manifold-anchored variational framework built on a geometry-aware Expectation-Maximization (EM) algorithm, whose M-step selects each sub-population prototype as the graph medoid with the highest diffusion centrality on a heat-kernel-weighted latent graph, ensuring that every prototype remains on-manifold. A Dirichlet energy regularizer enforces geometric smoothness of the latent space, and a per-sub-population uncertainty score enables label-free quality assessment. The manifold-anchored EM is a general-purpose geometric tool that extends standard EM and applies readily to other latent-variable models beyond this setting. On cardiac scar and brain MRI benchmarks, our framework attains the highest accuracy among all compared methods, produces the sharpest prototypes reported to date, and remains stable at large sub-population counts where all baselines degenerate.
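
A small sketch can show why a medoid stays on a curved manifold while a Euclidean mean drifts off it. The example below places points on a unit-circle arc, weights pairs with a Gaussian heat-kernel affinity exp(-d^2 / 4t), and picks the point with the largest summed affinity. That summed affinity is an assumed stand-in for the paper's diffusion centrality, and the function name and the value of `t` are hypothetical.

```python
import math


def heat_kernel_medoid(points, t=0.5):
    """Return the index of the point with the largest summed heat-kernel affinity.

    Affinity between p and q is exp(-||p - q||^2 / (4 t)); summing it over all
    other points is a simple centrality proxy (an assumption of this sketch).
    """
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centrality = [
        sum(math.exp(-sqdist(p, q) / (4.0 * t))
            for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return max(range(len(points)), key=centrality.__getitem__)


# Five samples on an arc of the unit circle (the "manifold").
angles = [0, 30, 60, 90, 120]
points = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in angles]

# Euclidean mean: falls inside the circle, off the manifold.
mean = (sum(p[0] for p in points) / len(points),
        sum(p[1] for p in points) / len(points))
mean_radius = math.hypot(*mean)

# Medoid: always one of the data points, so it stays on the circle.
medoid_index = heat_kernel_medoid(points)
medoid_radius = math.hypot(*points[medoid_index])
```

Here the mean lands at a radius of roughly 0.75, while the medoid is the middle sample and keeps radius 1.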

2606.17412 2026-06-18 cs.CV cs.AI New submission 85%

Enhancing Pathological VLMs with Cross-scale Reasoning

Chi Phan, Tianyi Zhang, Qiaochu Xue, Yufeng Wu, Dan Hu, Zeyu Liu, Sudong Wang, Yueming Jin

Affiliations: Department of Electrical and Computer Engineering, National University of Singapore; PuzzleLogic Pte Ltd; Department of Pathology, Fujian Medical University Cancer Hospital & Fujian Cancer Hospital

Topic match: Medical imaging. Cross-scale reasoning for pathology VLMs, medical image analysis.

AI summary: Proposes the first cross-scale training and evaluation paradigm, strengthening the cross-scale reasoning of pathology vision-language models through multi-magnification visual question answering, and builds the Scale-VQA benchmark and the ScaleReasoner-R1 model, which achieves state-of-the-art performance.

Abstract

Pathological images are inherently multi-scale, requiring pathologists to integrate evidence from global tissue architecture at low magnification to cellular morphology at higher magnification for accurate diagnosis. While existing pathological datasets for vision-language models (VLMs) include various scales, they often lack an explicit cross-scale reasoning objective. This limitation prevents VLMs from capturing essential cross-scale representations and learning evidence-based reasoning. To bridge this gap, we introduce the first cross-scale training and evaluation paradigm that formulates pathology interpretation as multi-magnification reasoning. However, creating such a task reveals a critical challenge: multi-image visual question answering (VQA) is prone to text-only shortcuts, which allow models to guess answers using magnification-dependent artifacts rather than visual evidence. To address this, we propose a leakage-aware curation pipeline that combines adversarial text-only screening with constraint-guided question design. Using this pipeline, we construct Scale-VQA, a high-quality benchmark with 4,685 multiple-choice questions grounded in 2,537 pathology images across multiple magnification levels. Finally, we present ScaleReasoner-R1, a model trained via reinforcement learning to optimize performance on the cross-scale VQA task. ScaleReasoner-R1 achieves state-of-the-art performance on our cross-scale reasoning benchmark and generalizes to state-of-the-art performance on established single-scale benchmarks. The findings suggest that even limited cross-scale supervision can significantly improve pathological understanding. The code and demos will be open-sourced.

2606.03827 2026-06-18 cs.CV cs.AI Updated version 85%

Conditional Latent Diffusion Model with Fourier-based Motion Modelling for Virtual Population Synthesis

Shaokun Lan, Haoran Dou, Jinghan Huang, Arezoo Zakeri, Fengming Lin, Zherui Zhou, Jinming Duan, Alejandro F. Frangi

Affiliations: Centre for Computational Imaging and Modelling in Medicine (CIMIM); University of Manchester; Christabel Pankhurst Institute; Department of Computer Science; Division of Informatics, Imaging & Data Sciences; Department of Electrical & Electronic Engineering; NIHR Manchester Biomedical Research Centre, Manchester Academic Health Sciences Centre, University of Manchester

Topic match: Medical imaging. Cardiac mesh sequence generation, a medical imaging application.

AI summary: Proposes 4D F-MeshLDM, which combines a convolutional mesh VAE, a truncated Fourier series motion parameterization, and a conditional diffusion prior to generate controllable 3D+t cardiac mesh sequences, outperforming baselines on UK Biobank data.

Comments: This work has been early accepted at the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2026

Abstract

In-silico trials of medical devices require the generation of virtual populations of anatomies. In cardiovascular applications, virtual anatomy is typically represented as a 3D+t mesh sampled from a generative model. However, most existing mesh generators focus on static anatomy, while sequence models often lack explicit periodicity. To this end, we propose 4D F-MeshLDM, a conditional generative framework comprising a convolutional mesh VAE to encode meshes, a structural latent space that parameterises motion using a truncated Fourier series, and a diffusion prior that learns the latent distribution over Fourier coefficient tokens. By conditioning the diffusion process on clinical covariates via affine modulation, we enable controllable synthesis. Sampling tokens and performing inverse Fourier synthesis yield cycle-consistent latent trajectories, which can be decoded into 3D+t cardiac mesh sequences. Experiments on 5,000 UK Biobank subjects demonstrate that 4D F-MeshLDM outperforms state-of-the-art baselines in anatomical fidelity and achieves near-zero cycle closure error. Furthermore, the generated cohorts accurately preserve clinical functional indices, highlighting the potential of our framework for reliable in-silico cardiac trials.
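
The link between a truncated Fourier series and cycle closure can be seen in a few lines. The sketch below evaluates one scalar latent coordinate from a mean term plus K harmonics over a normalized cardiac phase in [0, 1]. Because every harmonic has an integer frequency, the trajectory returns to its start at phase 1. The coefficient values and the function name are made up for illustration and are not taken from 4D F-MeshLDM.

```python
import math


def fourier_trajectory(mean, cos_coeffs, sin_coeffs, phase):
    """Evaluate a truncated Fourier series at a normalized phase in [0, 1].

    z(phase) = mean + sum_k a_k cos(2 pi k phase) + b_k sin(2 pi k phase)
    """
    value = mean
    for k, (a, b) in enumerate(zip(cos_coeffs, sin_coeffs), start=1):
        angle = 2.0 * math.pi * k * phase
        value += a * math.cos(angle) + b * math.sin(angle)
    return value


# Hypothetical coefficients for one latent dimension, two harmonics.
mean, cos_coeffs, sin_coeffs = 0.3, [0.5, -0.1], [0.2, 0.05]

# Sample 21 frames covering one full cycle, including both endpoints.
trajectory = [
    fourier_trajectory(mean, cos_coeffs, sin_coeffs, i / 20) for i in range(21)
]

# Periodicity is built in, so the first and last frames coincide
# up to floating-point error.
closure_error = abs(trajectory[-1] - trajectory[0])
```

A sequence model that predicts frames independently has no such guarantee, which is one plausible reading of the near-zero cycle closure error reported in the abstract.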

2504.01527 2026-06-18 cs.CV eess.IV Updated version 85%

Beyond Nearest Neighbor Interpolation in Data Augmentation

Olivier Rukundo

Affiliations: Department of Electronic and Computer Engineering, University of Limerick

Topic match: Medical imaging. Proposes an offline data augmentation pipeline that improves medical image segmentation performance.

AI summary: Proposes a modified geometric transformation function and a mean-based class filtering mechanism that avoid the annotation errors and low-pass filtering effects associated with nearest neighbor interpolation, and improves medical image segmentation performance through an offline data augmentation pipeline.

Comments: 10 pages, 11 figures, 14 tables

Abstract

Avoiding the risk of undefined categorical labels using nearest neighbor interpolation overlooks the risk of exacerbating pixel level annotation errors in augmented training data. Additionally, the inherent low pass filtering effects of interpolation algorithms exacerbate the risk of degrading high frequency structural details within annotated regions of interest. To avoid these risks, the author modified convolutional neural networks data transformation functions by incorporating a modified geometric transformation function, removing reliance on nearest neighbor interpolation, and integrating a mean-based class filtering mechanism to handle undefined categorical labels with alternative interpolation algorithms. The author also implemented an offline data augmentation pipeline to generate interpolation specific augmented training data, enabling quantitative assessment of interpolation specific low pass filtering effects on augmented training data. Experimental evaluation on three medical image segmentation datasets and the XBAT+ datasets demonstrated performance gains across multiple quantitative metrics.

2606.19174 2026-06-18 cs.HC cs.AI New submission 80%

A Clinician-Centered Pipeline for Annotation and Evaluation in Ultrasound AI Studies

Fangyijie Wang, Jianjun Yu, Wentao Shi, Haixia Huang, Ran Shi, Guénolé Silvestre, Kathleen M. Curran

Affiliations: Research Ireland Centre for Research Training in Machine Learning; School of Medicine, University College Dublin, Dublin, Ireland; The Third People's Hospital of Zhenjiang City, Zhenjiang, China; Zhenjiang Maternal and Child Health Hospital, Zhenjiang, China; The Fifth People's Hospital of Zhenjiang City, Zhenjiang, China; School of Computer Science, University College Dublin, Dublin, Ireland

Topic match: Medical imaging. Annotation and evaluation pipeline for ultrasound AI.

AI summary: Proposes a clinician-centered pipeline built on a central server and lightweight browser interfaces that supports remote annotation, blinded evaluation, and multi-rater participation, and validates its reproducibility and statistical agreement in a fetal ultrasound segmentation study.

Comments: Accepted to MIUA 2026

Abstract

Clinician-centered evaluation is critical for validating medical AI systems, especially in ultrasound imaging where quantitative metrics do not always capture clinical usability. Existing medical image platforms primarily focus on dataset labeling. They lack integrated support for blinded model comparison and reproducible evaluation workflows. We present a clinician-centered pipeline for remote annotation and evaluation in ultrasound AI studies. The proposed pipeline uses a centralized server and lightweight browser interfaces to enable clinicians to perform annotation, blinded ranking, and review without local dataset downloads. The pipeline also supports multi-rater participation, centralized result aggregation, and automated statistical analysis. We validate the pipeline in a fetal ultrasound segmentation study with six raters spanning expert, generalist, and non-expert experience levels. The system automatically generated Spearman correlation, Kendall's $\tau$, and top-1 selection statistics. Results indicated moderate to strong agreement across experts and other groups. The blinded evaluation results showed a tendency for later active learning models to be preferred. These outcomes suggest that the pipeline can support clinician-centered annotation and reproducible human-AI evaluation studies in ultrasound imaging. The proposed pipeline is available on GitHub: https://github.com/13204942/SonoRate
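
The two rank-agreement statistics named in the abstract can be computed directly from a pair of rankings. The sketch below assumes two raters each rank the same four models with no ties. The rater labels and rankings are invented, and the code is not taken from the SonoRate repository.

```python
from itertools import combinations


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        direction = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical rankings of four segmentation models (1 = most preferred).
expert = [1, 2, 3, 4]
generalist = [1, 3, 2, 4]

rho = spearman_rho(expert, generalist)
tau = kendall_tau(expert, generalist)
```

With one swapped pair out of six, the sketch gives a Spearman correlation of 0.8 and a Kendall's tau of about 0.67. Rankings with ties would need the tie-corrected variants instead.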

2606.18287 2026-06-18 cs.LG New submission 80%

Artemis: Anatomy-Resolved inTervention for Eliminating Multimodal NeuroImage confounderS

Siyuan Dai, Yang Du, Kun Zhao, Zhusuyi Chen, Heng Huang, Paul Thompson, Chao Shi, Haoteng Tang, Liang Zhan

Affiliations: University of Pittsburgh; University of Maryland; University of Southern California; Binghamton University; University of Texas Rio Grande Valley

Topic match: Medical imaging. Proposes the Artemis framework to remove neuroimaging confounders and improve diagnostic performance.

AI summary: Proposes Artemis, which learns region-specific confounder representations through region-level causal intervention to remove the effect of demographic confounders on GNNs in multimodal fMRI and DTI neuroimaging, improving performance on three benchmarks.

Comments: 11 pages, 8 figures

Abstract

Multimodal neuroimaging, integrating functional connectivity from fMRI and structural connectivity from DTI, enables non-invasive analysis of brain networks using graph neural networks. However, demographic factors such as age and sex systematically confound the relationship between brain connectivity and clinical outcomes, causing GNNs to exploit spurious shortcuts rather than learning causally invariant representations. While recent causal GNN methods introduce causality at the graph-modeling level, their causal mechanisms remain domain-agnostic without accounting for the real-world confounders inherent in clinical neuroimaging data. Moreover, brain networks are constructed from atlas-based parcellations where each region exhibits distinct sensitivity to demographic factors, necessitating region-aware adjustment. We propose Artemis, a region-level causal framework that bridges this gap by performing causal intervention at each brain region independently, learning region-specific confounder representations with lightweight parameters. Our adjustment uses both the functional and structural multimodal features for graph reasoning and acts as a plug-in module compatible with arbitrary GNN backbones. Experiments on three benchmarks, ADNI for disease diagnosis, OASIS for dementia staging, and HCP for sex classification, demonstrate consistent improvements over representative GNN-based baselines. Multiple supporting experiments further demonstrate statistical significance and neuroscientific interpretability.

2606.19270 2026-06-18 eess.IV cs.LG physics.med-ph New submission 80%

Beyond Algorithms: Conceptual Innovation in Medical Imaging AI

Mark A. Anastasio

Affiliations: Mallinckrodt Institute of Radiology and Department of Electrical & Systems Engineering, Washington University in St. Louis

Topic match: Medical imaging. Discussion of conceptual innovation in medical imaging AI.

AI summary: Distinguishes algorithmic innovation from conceptual innovation, argues that current incentive structures over-reward algorithmic novelty while neglecting conceptual contributions, uses medical imaging AI examples to show how weak conceptual grounding leads to misaligned objectives and limited clinical impact, and offers recommendations for fostering conceptual innovation.

Abstract

Artificial intelligence has driven rapid progress in medical imaging research, producing increasingly sophisticated algorithms and steady improvements on benchmark tasks. However, this algorithm-centric trajectory has also revealed a growing imbalance: while computational methods advance rapidly, the conceptual foundations that define imaging tasks, evaluation metrics, and clinical meaning sometimes remain underexamined. In this Perspective, we distinguish algorithmic innovation, which focuses on improving computational implementations and performance within a fixed problem definition, from conceptual innovation, which reframes what problems are posed, how success is measured, and why an approach is clinically relevant. We argue that prevailing incentive structures, training pathways, and publication norms disproportionately reward algorithmic novelty, particularly for early-career researchers, while at times undervaluing conceptual contributions that are essential for scientific maturation and clinical translation. Through representative examples from medical imaging AI, we show how insufficient conceptual grounding can lead to misaligned objectives, fragile generalization, and limited real-world impact. We conclude with actionable recommendations for researchers, mentors, reviewers, and journals to better recognize, support, and integrate conceptual innovation alongside algorithmic advances.

2606.18887 2026-06-18 eess.IV physics.med-ph New submission 80%

Efficient Image Registration for Ultrasound Localization Microscopy by Obtaining Gradients via Integration Across Iterations

Jipeng Yan, Chang Liu, Hengchang Liu, Biao Huang, Meng-Xing Tang, Yingxiang Liu, Ying Tan

Topic match: Medical imaging. Image registration for ultrasound localization microscopy.

AI summary: Proposes extremum seeking control (ESC) as a replacement for explicit gradient computation in image registration for ultrasound localization microscopy (ULM), reducing per-iteration computational cost by approximately 3.5-fold and reaching 219 μm resolution in ULM imaging of an ex vivo porcine heart.

Abstract

Tissue motion correction through image registration is essential for ultrasound localization microscopy (ULM). Parametric image registration is commonly formulated as an optimization problem where motion parameters are iteratively updated to maximize image similarity, and the optimization algorithms used typically rely on gradient information, the explicit evaluation of which can become computationally demanding. This work investigates Extremum Seeking Control (ESC) as an alternative to explicit derivative evaluation in image registration. By obtaining descent information via integrating the perturbed and demodulated image similarity metric across iterations, ESC avoids differentiation of the image similarity metric with respect to motion parameters in each iteration. The classical ESC, whose optimization behavior approximates that of classical gradient descent (GD), is first compared with GD for affine image registration using simulated ground-truth motions derived from a beating ex vivo porcine heart dataset. The results show that ESC achieves registration accuracy and convergence behavior comparable to GD while reducing per-iteration computational cost by approximately 3.5-fold. ESC is subsequently employed in a two-stage motion correction pipeline, where affine registration compensates for global tissue motion and B-spline registration corrects residual local deformation. The proposed method is applied to ULM imaging of a beating ex vivo porcine heart and achieves a spatial resolution of 219 μm, substantially below the half-wavelength diffraction limit of 321 μm associated with 2.4 MHz diverging-wave imaging. These results demonstrate that ESC provides an effective alternative to explicit derivative evaluation in ULM image registration, enabling accurate motion correction and high-quality super-resolution imaging.
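
The perturb, demodulate, and integrate loop described in the abstract can be sketched on a one-parameter toy problem. The example below minimizes a quadratic cost that stands in for a negative image similarity metric, whereas the paper maximizes similarity over motion parameters. The cost function, gain, dither amplitude, and period are all assumed values chosen for this sketch, not settings from the paper.

```python
import math


def cost(theta):
    """Toy objective with its minimum at theta = 2 (stand-in for -similarity)."""
    return (theta - 2.0) ** 2


def extremum_seeking(cost_fn, theta0, steps=3000, gain=0.01,
                     amplitude=0.5, period=20):
    """Classical discrete-time extremum seeking without explicit derivatives.

    Each iteration evaluates the cost once at a sinusoidally perturbed
    parameter, multiplies the measurement by the same sinusoid
    (demodulation), and accumulates the result into the estimate
    (integration across iterations). Averaged over one dither period,
    the accumulated term is proportional to the gradient.
    """
    theta_hat = theta0
    for k in range(steps):
        dither = math.sin(2.0 * math.pi * k / period)
        measured = cost_fn(theta_hat + amplitude * dither)
        theta_hat -= gain * dither * measured
    return theta_hat


estimate = extremum_seeking(cost, theta0=0.0)
```

The estimate settles near 2 with a small residual ripple from the dither. Only one cost evaluation is needed per iteration, which is consistent with the per-iteration saving the abstract reports, although the toy problem says nothing about the actual 3.5-fold figure.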

2606.19169 2026-06-18 cs.GR cs.SY eess.SY New submission 70%

RespGeomLib: A Reproducible Parametric Engine for Generating Analysis-Ready Human Airway Lumen Geometry

Nichula Wasalathilaka, Parakrama Ekanayake, Roshan Godaliyadda

Topic match: Medical imaging. Airway lumen geometry generation for medical simulation.

AI summary: Proposes RespGeomLib, a reproducible parametric engine driven by YAML specifications that generates seamless airway lumen surfaces through port-based assembly and implicit smooth blending while avoiding global voxelization. It yields quantitatively cleaner bifurcations, is more efficient, and supports morphometry-guided generation and CFD simulation.

Comments: Accepted for publication at 2026 IEEE Mercon

Abstract

CT-derived airway models support pulmonary morphometry and airflow simulation, but are often limited by distal scan resolution and the need for substantial cleanup near bifurcations. Procedural alternatives are reproducible, yet many rely on stitched tubular primitives that introduce non-smooth junctions and poorly defined open boundaries. We present RespGeomLib, a reproducible parametric engine for generating analysis-ready human airway lumen surfaces from compact YAML specifications. The framework combines port-based assembly with implicit smooth-min junction blending to produce seamless junctions, while avoiding full-tree voxelization through analytic segments and local implicit extraction around bifurcations. Quantitatively, RespGeomLib yields cleaner junctions than a Boolean/stitch baseline and is substantially faster and more memory-efficient than whole-tree global implicit extraction. We further demonstrate morphometry-guided tree generation, controlled synthetic airway variants, and CFD-ready export with stable airflow simulation. RespGeomLib targets biomedical workflows requiring reproducible morphometry, controlled synthetic variants, and simulation-ready lumen geometry. The code is publicly available at https://nichula01.github.io/Respgeomlib/
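
Smooth-min blending of implicit surfaces can be illustrated with signed distance functions. The sketch below uses the widely known polynomial smooth minimum and two overlapping spheres as stand-ins for two lumen segments meeting at a junction. The abstract does not state which smooth-min formula RespGeomLib uses, so the formula, the function names, and the blend radius `k` are assumptions.

```python
import math


def sphere_sdf(point, center, radius):
    """Signed distance from point to a sphere (negative inside)."""
    return math.dist(point, center) - radius


def smooth_min(a, b, k):
    """Polynomial smooth minimum with blend radius k.

    Equals min(a, b) when |a - b| >= k, and dips slightly below it
    otherwise, which rounds off the crease where two surfaces meet.
    """
    h = max(k - abs(a - b), 0.0) / k
    return min(a, b) - h * h * k * 0.25


def blended_junction(point, k=0.5):
    """Implicit field of two overlapping spheres joined without a crease."""
    d1 = sphere_sdf(point, (0.0, 0.0, 0.0), 1.0)
    d2 = sphere_sdf(point, (1.5, 0.0, 0.0), 1.0)
    return smooth_min(d1, d2, k)


# Midway between the two centers both distances are -0.25,
# so the blend lowers the field to -0.375 and widens the junction.
midpoint_value = blended_junction((0.75, 0.0, 0.0))

# Far from the junction the blend has no effect.
far_value = smooth_min(0.0, 10.0, 0.5)
```

A hard Boolean union would use `min(d1, d2)` directly and leave a sharp seam where the surfaces intersect, which is the kind of non-smooth junction the abstract attributes to stitched primitives.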

2602.21160 2026-06-18 stat.ML cs.LG stat.AP stat.ME Updated version 70%

Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions

Mame Diarra Toure, David A. Stephens

Affiliations: Department of Mathematics and Statistics

Topic match: Medical imaging. Method validated on selective prediction for diabetic retinopathy.

AI summary: Addresses the inability of epistemic uncertainty measures in safety-critical classification to distinguish between classes by decomposing mutual information into a per-class vector $C_k$, derived from a second-order Taylor expansion with a $1/\mu_k$ weighting that corrects boundary suppression, and validates it on diabetic retinopathy selective prediction, out-of-distribution detection, and a label-noise study.

Comments: 8 pages, 17 figures. Accepted at UAI 2026

Journal ref: Forty-Second Annual Conference on Uncertainty in Artificial Intelligence, 2026. https://openreview.net/forum?id=cxuWscJmAr

Abstract

In safety-critical classification, the cost of failure is often asymmetric, yet Bayesian deep learning summarises epistemic uncertainty with a single scalar, mutual information (MI), that cannot distinguish whether a model's ignorance involves a benign or safety-critical class. We decompose MI into a per-class vector $C_k(x)=\sigma_k^{2}/(2\mu_k)$, with $\mu_k=\mathbb{E}[p_k]$ and $\sigma_k^2=\mathrm{Var}[p_k]$ across posterior samples. The decomposition follows from a second-order Taylor expansion of the entropy; the $1/\mu_k$ weighting corrects boundary suppression and makes $C_k$ comparable across rare and common classes. By construction $\sum_k C_k \approx \mathrm{MI}$, and a companion skewness diagnostic flags inputs where the approximation degrades. After characterising the axiomatic properties of $C_k$, we validate it on three tasks: (i) selective prediction for diabetic retinopathy, where critical-class $C_k$ reduces selective risk by 34.7% over MI and 56.2% over variance baselines; (ii) out-of-distribution detection on clinical and image benchmarks, where $\sum_k C_k$ achieves the highest AUROC and the per-class view exposes asymmetric shifts invisible to MI; and (iii) a controlled label-noise study in which $\sum_k C_k$ shows less sensitivity to injected aleatoric noise than MI under end-to-end Bayesian training, while both metrics degrade under transfer learning. Across all tasks, the quality of the posterior approximation shapes uncertainty at least as strongly as the choice of metric, suggesting that how uncertainty is propagated through the network matters as much as how it is measured.
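
The per-class decomposition $C_k=\sigma_k^{2}/(2\mu_k)$ and its relation to mutual information can be checked numerically. The sketch below assumes three posterior samples of a three-class probability vector and uses the population variance across samples. The sample values and function names are invented for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def class_means(samples):
    """Posterior mean probability mu_k for each class."""
    n = len(samples)
    return [sum(s[k] for s in samples) / n for k in range(len(samples[0]))]


def per_class_contributions(samples):
    """C_k = sigma_k^2 / (2 mu_k), with variance taken across posterior samples."""
    n = len(samples)
    mu = class_means(samples)
    var = [sum((s[k] - mu[k]) ** 2 for s in samples) / n for k in range(len(mu))]
    return [v / (2.0 * m) for v, m in zip(var, mu)]


def mutual_information(samples):
    """MI = H(mean prediction) - mean of per-sample entropies."""
    mean_entropy = sum(entropy(s) for s in samples) / len(samples)
    return entropy(class_means(samples)) - mean_entropy


# Hypothetical posterior samples: the model disagrees with itself about
# classes 0 and 1 but is consistent about class 2.
samples = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
]

contributions = per_class_contributions(samples)
total = sum(contributions)
mi = mutual_information(samples)
```

For these samples the contributions sum to about 0.0214 against an MI of about 0.0222, so the second-order approximation is close. Classes 0 and 1 have the same variance, yet class 1 receives the larger $C_k$ because its mean probability is smaller, which is the effect of the $1/\mu_k$ weighting, and class 2 contributes essentially nothing.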