Improving Full Waveform Inversion in Large Model Era
Yinan Feng, Peng Jin, Yuzhe Guo, Yinpeng Chen, Youzuo Lin
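AI summary: Proposes coordinated scaling of model capacity, data diversity, and training strategy so that a billion-parameter model trained on simple synthetic data generalizes to complex geological structures, achieving state-of-the-art performance on benchmarks such as OpenFWI.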
Full Waveform Inversion (FWI) is a highly nonlinear and ill-posed problem that aims to recover subsurface velocity maps from surface-recorded seismic waveform data. Existing data-driven FWI methods typically use small models because available datasets are limited in volume, geological diversity, and spatial extent, which raises substantial concerns about overfitting. Although these methods perform well on synthetic datasets, they fail to generalize to more realistic geological structures. In this work, we show that a model trained entirely on simulated and relatively simple data can generalize remarkably well to challenging, unseen geological benchmarks. We provide a working recipe that tames a billion-parameter model for FWI through coordinated scaling across three axes: model capacity, data diversity, and training strategy. Our model achieves state-of-the-art performance on OpenFWI and significantly narrows the generalization gap in data-driven FWI. Across six challenging geophysical benchmarks (Marmousi, 2D SEG/EAGE Salt and Overthrust, 2004 BP, Sigsbee, and SEAM Phase I), it infers complex structures absent from the training set and delivers significant performance improvements (SSIM rises from 0.5844 to 0.7669). Overall, our results demonstrate that with an appropriate scaling strategy, large models trained on simple synthetic data can achieve substantial generalization to more complex and realistic geological structures.
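For readers unfamiliar with the SSIM metric quoted above, the following is a minimal sketch of how a structural similarity score between a predicted and a reference velocity map could be computed. It is an illustration only: the abstract does not state the SSIM configuration used, so this sketch assumes a single global window (no sliding Gaussian window), maps normalized to a known `data_range`, and the conventional stabilizing constants 0.01 and 0.03. The function name `global_ssim` and the toy maps are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def global_ssim(pred, target, data_range=1.0):
    """Single-window SSIM between two equally sized 2D maps (lists of rows).

    Hypothetical helper for illustration; not the paper's evaluation code.
    Assumes both maps are already scaled to span at most `data_range`.
    """
    # Flatten the row-major maps into 1D lists of samples.
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if not p or len(p) != len(t):
        raise ValueError("maps must be non-empty and the same size")

    # Conventional SSIM stabilizers (assumed values, not from the source).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    # Global means, population variances, and covariance.
    mu_p, mu_t = fmean(p), fmean(t)
    var_p = fmean((v - mu_p) ** 2 for v in p)
    var_t = fmean((v - mu_t) ** 2 for v in t)
    cov = fmean((a - mu_p) * (b - mu_t) for a, b in zip(p, t))

    numerator = (2 * mu_p * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return numerator / denominator


# Toy layered "velocity maps", normalized to [0, 1] (made-up values).
true_map = [[0.2] * 4, [0.5] * 4, [0.9] * 4]
smooth_map = [[0.3] * 4, [0.5] * 4, [0.8] * 4]

score_same = global_ssim(true_map, true_map)      # identical maps
score_smooth = global_ssim(true_map, smooth_map)  # lower-contrast prediction
```

Identical maps score 1 (up to rounding), and a prediction with weaker layer contrast scores lower. On this scale, the reported change from 0.5844 to 0.7669 is an absolute gain of 0.1825, or roughly 31% relative to the starting value. Windowed SSIM implementations, which average the score over local patches, will generally give different numbers from this global variant.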