Medical AI - arXivDaily Topic

2606.15554 2026-06-18 cs.CV New submission 90%

RaLMPH: Reliability-aware Learning for Multi-Pathologist Harmonization in Whole-Slide Image Classification

Sungrae Hong, Jiwon Jeong, Soeun Cheon, Donghee Han, Sol Lee, Jisu Shin, Kyungeun Kim, Mun Yong Yi

Affiliations: Korea Advanced Institute of Science and Technology; Seegene Medical Foundation

Topic match: Pathology imaging (multi-pathologist annotation harmonization, pathology image analysis)

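AI summary: Proposes the RaLMPH framework, which uses a reliability field to model local neighborhood structure and expert uncertainty, harmonizing whole-slide image labels from multiple pathologists and improving multiple instance learning performance.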

Comments: Accepted by MICCAI 2026

Abstract

Multiple Instance Learning (MIL) is a standard paradigm for Whole-Slide Image (WSI) analysis and has achieved strong results in computational pathology. However, most MIL pipelines assume a single "gold" label per slide, which conflicts with clinical practice where substantial inter-pathologist variability is common. Existing multi-annotator learning and label-refinement methods typically estimate global annotator reliability or rely on single-instance assumptions, making them poorly suited to MIL and to localized diagnostic contexts where experts disagree. We propose RaLMPH (Reliability-aware Learning for Multi-Pathologist Harmonization), a MIL-based label reconciliation framework for WSIs annotated by multiple pathologists. RaLMPH introduces a reliability field that jointly models (i) local neighborhood structure in WSI feature space and (ii) expert uncertainty (entropy), enabling per-sample identification of trustworthy reference neighborhoods. Leveraging this field, RaLMPH performs sample-wise local annotator ranking to select reliable opinions per slide and applies an adaptive gating mechanism to fuse labels conditioned on local reliability. Experiments on a clinical WSI dataset with labels from six pathologists, as well as controlled simulated benchmarks, show that RaLMPH consistently outperforms existing approaches. Further analyses clarify how our reliability-aware mechanism improves label reconciliation and downstream MIL performance.
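
The abstract names three steps (a reliability field built from local neighborhoods and vote entropy, sample-wise annotator ranking, and reliability-conditioned gating) but gives no formulas. The sketch below is only one plausible reading of those steps on slide-level feature vectors, not the authors' method. Everything in it is an assumption made for illustration: the names `vote_distribution`, `vote_entropy`, and `harmonize`, the parameters `k` and `top_m`, the weight `exp(-distance) * (1 - normalized entropy)`, scoring annotators by agreement with neighbors' majority labels, the linear gate, and the toy data.

```python
import numpy as np

def vote_distribution(votes, n_classes):
    """Row-wise class frequencies of integer votes, shape (n_rows, n_votes)."""
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts / counts.sum(axis=1, keepdims=True)

def vote_entropy(votes, n_classes):
    """Shannon entropy of each slide's annotator vote distribution."""
    p = vote_distribution(votes, n_classes)
    safe = np.where(p > 0, p, 1.0)  # log(1) = 0, so empty classes contribute 0
    return -(p * np.log(safe)).sum(axis=1)

def harmonize(features, votes, n_classes, k=3, top_m=2):
    """Fuse multi-annotator slide labels into soft labels (illustrative only)."""
    n = votes.shape[0]
    # Assumed reliability ingredient 1: low vote entropy means a confident slide.
    confidence = 1.0 - vote_entropy(votes, n_classes) / np.log(n_classes)
    majority = vote_distribution(votes, n_classes).argmax(axis=1)
    # Assumed reliability ingredient 2: closeness in feature space.
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a slide is never its own neighbor

    fused = np.zeros((n, n_classes))
    for i in range(n):
        nbrs = np.argsort(dist[i])[:k]
        w = np.exp(-dist[i, nbrs]) * confidence[nbrs]  # values in [0, 1]
        # Local annotator ranking: weighted agreement with neighbors' majority.
        agree = votes[nbrs] == majority[nbrs, None]
        score = (w[:, None] * agree).sum(axis=0) / (w.sum() + 1e-12)
        top = np.argsort(-score, kind="stable")[:top_m]
        selected = vote_distribution(votes[i, top][None, :], n_classes)[0]
        everyone = vote_distribution(votes[i][None, :], n_classes)[0]
        # Gate: trust the locally ranked annotators more when the
        # neighborhood itself is reliable, else fall back to all votes.
        gate = w.mean()
        fused[i] = gate * selected + (1.0 - gate) * everyone
    return fused

# Toy data: slides 0-3 form one cluster, slides 4-5 another.
# Annotators 0 and 1 are consistent; annotators 2-4 each err on some slides.
features = np.array([[0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.0, 0.0],
                     [5.0, 5.0], [5.1, 5.0]])
votes = np.array([[0, 0, 1, 0, 0],
                  [0, 0, 0, 1, 0],
                  [0, 0, 0, 0, 1],
                  [0, 0, 1, 1, 1],   # plain majority vote gives class 1 here
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1]])
fused = harmonize(features, votes, n_classes=2)
naive = vote_distribution(votes, 2).argmax(axis=1)
```

On this toy input, slide 3 is the interesting case: a plain majority vote follows the three inconsistent annotators, while the fused soft label leans toward the class chosen by the two annotators who agree with the surrounding neighborhood. How strongly it leans depends entirely on the assumed weighting and gate, so the numbers carry no meaning beyond the illustration.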
