A Multimodal 3D Foundation Model for Light Sheet Fluorescence Microscopy Enables Few-Shot Segmentation, Classification, and Deblurring
Adina Scheinfeld, Haotan Zhang, Shang Mu, Rudolf L. M. van Herten, Lucas Stoffl, Ali Erturk, Zhuhao Wu, Johannes C. Paetzold
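AI summary: Proposes a 3D foundation model trained by jointly optimizing masked reconstruction and image-text alignment. The model is pretrained on light sheet fluorescence microscopy data, and few-shot adaptation significantly reduces annotation cost while improving segmentation, classification, and deblurring performance.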
- Comments: 11 pages, 3 figures
Light sheet fluorescence microscopy (LSM) enables high-resolution, three-dimensional (3D) imaging of biological specimens, providing rich volumetric data for studying cellular organization, pathology, and vascular networks. However, the size, dimensionality, and annotation burden of LSM data make supervised deep learning approaches costly and difficult to scale. Additionally, despite the abundance of unannotated LSM volumes, foundation models for this modality remain underexplored due to computational challenges and the complexity of volumetric representation learning. In this work, we introduce a 3D foundation model for LSM data, pretrained on a large curated collection of 3D images spanning multiple organisms, stains, and imaging protocols. We learn transferable volumetric representations by jointly optimizing for masked reconstruction and image-text alignment. The pretrained backbone drastically reduces the annotation burden, enabling efficient, few-shot adaptation for varied downstream tasks. We evaluate this approach on downstream segmentation, classification, and deblurring. Our results demonstrate consistent improvements over baselines both (1) when measured using standard evaluation metrics and (2) when rigorously assessed by domain experts. These findings highlight the potential of foundation model pretraining to reduce annotation requirements while improving performance across diverse LSM analysis tasks. Pretrained model weights and code for pretraining and finetuning are publicly available: https://github.com/AdinaScheinfeld/lsm_fm_public_repo.git.
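The joint objective named in the abstract (masked reconstruction plus image-text alignment) can be illustrated with a minimal sketch. The abstract does not state the exact loss forms, so everything specific here is an assumption: mean squared error over masked voxels for reconstruction, a symmetric CLIP-style contrastive loss for alignment, a temperature of 0.07, and a weight `align_weight` that sums the two terms. All function names are hypothetical, and plain Python lists stand in for flattened volumes and embeddings; a practical version would use a tensor library.

```python
import math

def masked_reconstruction_loss(pred, target, mask):
    """MSE over masked voxels only (mask value 1 = voxel was hidden from the encoder)."""
    n_masked = sum(mask)
    if n_masked == 0:
        return 0.0
    squared = sum(m * (p - t) ** 2 for p, t, m in zip(pred, target, mask))
    return squared / n_masked

def _normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def _diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def alignment_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss: pair i of image/text embeddings should match."""
    img = [_normalize(v) for v in image_embs]
    txt = [_normalize(v) for v in text_embs]
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txt]
              for i in img]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(logits_t))

def joint_loss(pred, target, mask, image_embs, text_embs, align_weight=1.0):
    """Assumed combination: reconstruction term plus weighted alignment term."""
    recon = masked_reconstruction_loss(pred, target, mask)
    align = alignment_loss(image_embs, text_embs)
    return recon + align_weight * align
```

In this sketch, only the hidden voxels contribute to the reconstruction term, and the alignment term is small when each volume embedding is closest to its own text embedding. How the two terms are actually balanced in the released code is not stated in the abstract.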