Normal Guidance is what Attention Needs
Ethan Harvey, Dennis Johan Loevlie, Michael C. Hughes
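AI summary: Proposes Normal Guidance, a regularization technique that lets attention-based multiple instance learning methods surpass existing methods at slice-level localization in 3D medical images while maintaining whole-scan classification performance.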
Details
We consider training classifiers for 3D medical images using only one binary label for the entire volume rather than a label for each 2D slice. In such weakly supervised settings, can we learn accurate classifiers for slice-level predictions? Attention-based multiple instance learning (MIL) can produce an attention score for every slice. Yet recent work demonstrates that a simple center-focused baseline that ignores image content can outperform attention-based and transformer-based MIL at slice-level classification of 3D brain scans. We show this baseline also outperforms existing MIL at slice-level classification of thoracic and abdominal CT scans. Motivated by this baseline, we propose Normal Guidance, a regularization technique that encourages the learned attention distribution to follow a bell-shaped curve. Across three medical imaging datasets totaling over 4 million 2D slices, we show that Normal Guidance enables attention-based and transformer-based MIL methods to deliver significantly better slice-level localization than the state of the art while remaining competitive at whole-scan classification.
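The abstract does not give the exact form of the regularizer, so the following is only a minimal sketch of one way a penalty could push slice attention toward a bell-shaped curve. Everything in it is an assumption made for illustration rather than the authors' formulation: the function names, the discretized Gaussian centered on the middle slice, the default width of one sixth of the scan depth, the direction of the KL divergence, and the `strength` weight.

```python
import math


def softmax(scores):
    """Turn raw per-slice attention scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normal_prior(num_slices, mean=None, std=None):
    """Discretized, renormalized Gaussian over slice indices.

    Assumption: centered on the middle slice, with a width of
    num_slices / 6 unless the caller says otherwise.
    """
    if mean is None:
        mean = (num_slices - 1) / 2.0
    if std is None:
        std = num_slices / 6.0
    weights = [math.exp(-0.5 * ((i - mean) / std) ** 2) for i in range(num_slices)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def normal_guidance_penalty(attention_scores, strength=1.0):
    """Hypothetical penalty: KL from a bell-shaped prior to the attention.

    The penalty is near zero when the attention already follows the prior
    and grows as the attention drifts away from it.
    """
    attention = softmax(attention_scores)
    prior = normal_prior(len(attention))
    return strength * kl_divergence(prior, attention)
```

In a training loop, a term like this would be added to the whole-scan classification loss, for example `loss = bce + normal_guidance_penalty(scores, strength=0.1)`, where `bce` and the value `0.1` are placeholders. Note that `normal_prior` on its own depends only on slice position, which is the sense in which a center-focused baseline ignores image content.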