Variational views for self-supervised learning in radio astronomy
Johnny Joseph Alphonse, Anna M. M. Scaife
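AI summary: This paper studies pre-training for radio galaxy morphology using a variational autoencoder. Augmenting a self-supervised learning model with generated views improves downstream classification performance and shows that generative and contrastive learning methods are complementary.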
Modern astronomical surveys are producing progressively larger and more complex datasets, making traditional supervised approaches that rely on extensive labelled catalogues increasingly difficult to apply. Consequently, pre-training with self-supervised learning (SSL), which offers a scalable route by extracting structure directly from unlabelled images, is becoming attractive for many downstream applications. In this work we consider the use of coupled self-supervised representation learning approaches for radio galaxy morphology pre-training. To account for variations in radio galaxy morphology that are more nuanced than those typically captured by the augmented views of view-based SSL algorithms, we use a pre-trained Variational Autoencoder (VAE) to generate views for training a larger view-based self-supervised model. To do this, a $β$-VAE was trained on the Radio Galaxy Zoo (RGZ) dataset, where moderate regularization ($β = 2.3$) was found to provide a good balance between reconstruction quality and disentanglement of generative factors such as source multiplicity and lobe asymmetry. An analysis of the $β$-VAE reveals that Fanaroff-Riley class identity manifests as a continuous transition across the latent space, rather than being associated with a single discrete dimension. $β$-VAE reconstructions were then incorporated as generative augmentations within a view-based SSL pipeline. Our experiments show that combining these generative views with standard image augmentations improves downstream classification performance, and we present ablation studies clarifying the relative contribution of each augmentation type. These results indicate that generative and contrastive approaches are complementary, and point toward disentanglement-aware self-supervised learning as a promising direction for future radio astronomy surveys.
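The role of $β$ in the training objective can be illustrated with a minimal sketch. It assumes a diagonal Gaussian posterior, a standard normal prior, and a summed squared-error reconstruction term; the abstract does not state these choices, and the function name `beta_vae_loss` is hypothetical rather than taken from the authors' code.

```python
import math


def beta_vae_loss(x, x_recon, mu, log_var, beta=2.3):
    """Per-sample beta-VAE objective: reconstruction error + beta * KL.

    x, x_recon : flat sequences of pixel values (input and reconstruction)
    mu, log_var: mean and log-variance of the diagonal Gaussian posterior
    beta       : KL weight; 2.3 is the value quoted in the abstract
    """
    # Reconstruction term: summed squared error (an assumption; a
    # Bernoulli/cross-entropy likelihood is another common choice).
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))

    # Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).
    kl = 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                   for m, lv in zip(mu, log_var))

    return recon + beta * kl, recon, kl


if __name__ == "__main__":
    # Toy example with a 1-pixel "image" and a 1-dimensional latent.
    total, recon, kl = beta_vae_loss([0.0], [1.0], mu=[1.0], log_var=[0.0])
    print(f"recon={recon:.3f} kl={kl:.3f} total={total:.3f}")
```

Setting `beta=1` recovers the standard VAE objective. Larger values weight the KL term more heavily, which is the trade-off between reconstruction quality and disentanglement that the abstract refers to.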