arXivDaily: daily arXiv digest, updated Monday through Friday

Science and Medicine

AI for Science

AI for scientific discovery: proteins, molecules, drugs, materials, weather and climate, physics, and mathematics.

Indexed for today/current date: 5. Sources: cs.LG, q-bio, physics, cond-mat, math, stat.ML
2606.19093 2026-06-18 physics.ao-ph New submission 95%

AIFS-DOP: End-to-End Medium-Range Weather Prediction from Observations Alone with Machine Learning

Ewan Pinnington, Peter Lean, Mihai Alexe, Eulalie Boucher, Simon Lang, Patrick Laloyaux, Gert Mertes, Tomas Kral, Patricia de Rosnay, Matthew Chantry, Anthony McNally

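Topic match: Weather and Climate. A machine-learning medium-range weather forecasting model trained on observations only; falls under meteorology.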

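AI summary: Proposes AIFS-DOP, trained only on a 40-year dataset of gridded observations, with no NWP reanalysis or model data. Over a one-year period of forecasts across 2021/2022 it is competitive with ECMWF's IFS, the first time a data-driven model trained solely on observations has been competitive with IFS at medium range on several key headline scores.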

Comments 12 pages, 10 figures

Abstract

We introduce the Artificial Intelligence Forecasting System for Direct Observation Prediction (AIFS-DOP). AIFS-DOP is trained on a 40-year harmonized dataset of gridded observations, without using numerical weather prediction (NWP) reanalysis or model data. The resulting model is competitive with ECMWF's Integrated Forecasting System (IFS) when scored on a one year period of forecasts across 2021/2022. This progress on Direct Observation Prediction represents the first time that a data-driven model, trained solely on observations, is competitive with the IFS at medium ranges for several key upper-air and surface headline scores, when verified against observation data.

2606.19302 2026-06-18 physics.ao-ph cs.LG New submission 90%

Optimal scenario design for climate emulation

Christopher B. Womack, Shahine Bouabid, Andrei Sokolov, Popat Salunke, Glenn Flierl, Sebastian D. Eastham, Noelle E. Selin

Affiliations: Department of Aeronautics and Astronautics, Massachusetts Institute of Technology; Center for Sustainability Science and Strategy, Massachusetts Institute of Technology; Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology; Brahmal Vasudevan Institute for Sustainable Aviation, Department of Aeronautics, Imperial College London; Institute for Data, Systems, and Society, Massachusetts Institute of Technology

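Topic match: Weather and Climate. Optimizing training data to improve the generalization of climate emulators; falls under climate science.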

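AI summary: To address the limited generalization of climate emulators, the authors optimize the training scenarios using a differentiable Simple Climate Model (SCM). An emulator trained on the resulting small dataset outperforms one trained on the standard scenario set.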

Abstract

As deep learning for physical systems continues to grow in popularity, efforts to improve generalizability have primarily focused on designing architectures that embed physical constraints. However, for machine-learning surrogate climate models (emulators), we show that the low structural diversity in existing scenarios commonly used to generate training data places a ceiling on predictive skill. Here, we examine whether training datasets themselves can be optimized to improve generalization. We introduce a method to create datasets that produce emulators capable of generalizing to new, structurally different scenarios absent from the training data. We use a differentiable Simple Climate Model (SCM) to calculate the sensitivity of emulator loss to perturbations in the training data, iteratively updating the training data to maximize emulator skill. For an SCM, training on one scenario optimized in this fashion outperforms an emulator trained on six standard ScenarioMIP pathways. We achieve this higher predictive skill despite training on a smaller dataset, finding that our emulator successfully isolates distinct physical behaviors of different climate forcing agents (e.g., greenhouse gases vs. aerosols) without single-forcing runs. We then demonstrate that scenarios optimized using an SCM, when used to drive an intermediate-complexity climate model, produce a training dataset that yields a more skillful emulator than training on ScenarioMIP outputs. Our results suggest that, in the compute-constrained environment of running full-scale climate models, generating a small number of dynamically rich scenarios provides greater marginal value for emulation and characterizing system responses than expanding the suite of traditional emissions pathways.
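The data-optimization loop described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' code: `toy_scm` is a static two-forcing linear response standing in for a real SCM, the emulator is a two-weight ridge regression, the outer objective is the emulator's error on a single held-out scenario, and central finite differences replace the automatic differentiation that a differentiable SCM would provide.

```python
def toy_scm(ghg, aerosol):
    """Hypothetical stand-in for an SCM: GHG forcing warms, aerosol forcing cools."""
    return [0.8 * g - 0.5 * a for g, a in zip(ghg, aerosol)]


def fit_ridge(ghg, aerosol, temps, lam=0.05):
    """Fit temps ~ w_g * ghg + w_a * aerosol with a closed-form 2x2 ridge solve."""
    sgg = sum(g * g for g in ghg) + lam
    saa = sum(a * a for a in aerosol) + lam
    sga = sum(g * a for g, a in zip(ghg, aerosol))
    bg = sum(g * t for g, t in zip(ghg, temps))
    ba = sum(a * t for a, t in zip(aerosol, temps))
    det = sgg * saa - sga * sga  # positive because lam > 0
    return (saa * bg - sga * ba) / det, (sgg * ba - sga * bg) / det


def emulator_loss(train_g, train_a, test_g, test_a):
    """Train the emulator on one scenario and return its MSE on a held-out one."""
    w_g, w_a = fit_ridge(train_g, train_a, toy_scm(train_g, train_a))
    truth = toy_scm(test_g, test_a)
    preds = [w_g * g + w_a * a for g, a in zip(test_g, test_a)]
    return sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth)


def optimize_scenario(train_g, train_a, test_g, test_a, steps=50, lr=0.1, eps=1e-5):
    """Iteratively perturb the training scenario to reduce held-out emulator loss."""
    n = len(train_g)
    params = list(train_g) + list(train_a)

    def loss(p):
        return emulator_loss(p[:n], p[n:], test_g, test_a)

    current = loss(params)
    for _ in range(steps):
        # Sensitivity of the emulator loss to each training-data value.
        grad = []
        for i in range(len(params)):
            up, down = params[:], params[:]
            up[i] += eps
            down[i] -= eps
            grad.append((loss(up) - loss(down)) / (2 * eps))
        candidate = [p - lr * g for p, g in zip(params, grad)]
        candidate_loss = loss(candidate)
        if candidate_loss < current:
            params, current = candidate, candidate_loss  # accept improving steps only
        else:
            lr *= 0.5  # otherwise shrink the step and try again
    return params[:n], params[n:], current


n = 8
ghg = [0.5 * t for t in range(n)]
# Low structural diversity: aerosol forcing tracks GHG forcing almost exactly.
aerosol = [0.5 * g + 0.02 * (-1) ** t for t, g in enumerate(ghg)]
# Held-out scenario with a different structure: GHG rises while aerosols decline.
test_aerosol = [1.5 - 0.2 * t for t in range(n)]

before = emulator_loss(ghg, aerosol, ghg, test_aerosol)
opt_g, opt_a, after = optimize_scenario(ghg, aerosol, ghg, test_aerosol)
print(f"held-out MSE before: {before:.4f}, after: {after:.4f}")
```

In this toy the optimizer is unconstrained, so it is free to produce forcing paths that would be physically implausible; a practical version would presumably bound or regularize the scenario.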

2606.19026 2026-06-18 cs.LG cs.AI physics.ao-ph New submission 85%

A Hybrid LSTM-Vision Transformer Architecture for Predicting HRRR Forecast Errors

David Aaron Evans, Jay C. Rothenberger, Kara J. Sulia, Nick P. Bassill, Chris D. Thorncroft

Affiliations: Atmospheric Sciences Research Center, University at Albany, SUNY; University of Oklahoma; State Weather Risk Communication Center, University at Albany, SUNY

Topic match: Weather and Climate. A hybrid architecture for predicting HRRR forecast errors.

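AI summary: Proposes a hybrid LSTM-ViT framework that combines surface-observation time series with atmospheric profiles to predict HRRR forecast errors in precipitation, wind speed, and temperature. It improves on the baseline LSTM, with roughly a twofold gain in skill for precipitation forecast error.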

Comments This manuscript is a preprint and has been submitted for peer review to the Artificial Intelligence for the Earth Systems journal. The content is subject to change based on the outcome of the peer review process and should not be considered final or definitive. Copyright in this Work may be transferred without further notice

Abstract

Forecast errors in high-resolution numerical weather prediction (NWP) systems are often linked to unresolved planetary boundary layer (PBL) processes, convection, terrain-induced circulations, and other vertically structured atmospheric phenomena. Previous work demonstrated that Long Short-Term Memory (LSTM) networks can successfully predict forecast errors in the High-Resolution Rapid Refresh (HRRR) model using mesonet observations, but we believe performance degradation is linked to periods of complex vertical atmospheric evolution. To address this limitation, we develop a hybrid LSTM-Vision Transformer (LSTM-ViT) framework that combines temporal sequence learning from surface observations with atmospheric profiles from the New York State Mesonet profiler network. The LSTM-ViT framework is trained to predict HRRR hourly precipitation, 10 m wind speed, and 2 m temperature forecast errors at individual mesonet stations. Across all three predictors, incorporation of profiler-derived atmospheric structure improves forecast error prediction skill relative to the baseline LSTM architecture, with the largest gains occurring at shorter forecast lead times and during periods of enhanced PBL activity. Improvements are particularly pronounced for precipitation forecast error, where the LSTM-ViT framework achieves approximately a twofold increase in predictive skill relative to the baseline LSTM while better capturing convectively driven error evolution and reducing degradation associated with PBL processes. These results demonstrate that combining temporal sequence learning with vertically informed attention mechanisms provides a physically meaningful pathway for improving forecast error prediction in operational NWP systems. Our research offers forecasters enhanced guidance regarding model bias and forecast confidence.
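As a small illustration of the quantities this abstract refers to, the sketch below builds a forecast-error target and compares two error predictors. The abstract does not say which skill metric the authors use, so the MSE-based skill score here is an assumption, and all names and numbers are hypothetical.

```python
def forecast_errors(forecast, observed):
    """Prediction target: forecast minus observation at each valid time."""
    return [f - o for f, o in zip(forecast, observed)]


def mse(pred, target):
    """Mean squared error between predicted and actual forecast errors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def skill_vs_baseline(pred, baseline_pred, target):
    """Assumed skill score: 1 - MSE(model) / MSE(baseline); positive means better."""
    return 1.0 - mse(pred, target) / mse(baseline_pred, target)


# Hypothetical 2 m temperature values (degrees C) at one station.
hrrr_t2m = [21.0, 19.0, 18.0, 20.0]
station_t2m = [20.0, 20.0, 17.0, 20.0]
target = forecast_errors(hrrr_t2m, station_t2m)

# Hypothetical predictions of those errors from a baseline and a hybrid model.
lstm_pred = [0.0, -1.0, 0.0, 0.0]
hybrid_pred = [0.5, -1.0, 0.5, 0.0]
print(skill_vs_baseline(hybrid_pred, lstm_pred, target))
```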

2606.18857 2026-06-18 cs.LG physics.ao-ph New submission 80%

Investigating Inductive Biases for Machine Learning Emulation of Sudden Stratospheric Warmings in Idealised Isca Simulations

Oskar Bohn Lassen, Simon Driscoll, Stephen I. Thomson, Sebastian Schemm, Francisco C. Pereira

Affiliations: Technical University of Denmark; University of Cambridge; University of Exeter

Topic match: Weather and Climate. Machine-learning emulation of sudden stratospheric warmings.

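AI summary: Tests how the inductive biases of different architectures affect emulation of sudden stratospheric warming (SSW) dynamics. Explicit three-dimensional vertical coupling emerges as the key bias, but low forecast error does not guarantee physical consistency.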

Abstract

Machine-learning emulators are increasingly used for weather prediction and have the potential to extend skill on subseasonal-to-seasonal timescales by learning dynamically important sources of predictability. A key challenge is whether the models can exploit predictability anchors, such as stratospheric variability, that influence tropospheric circulation beyond short lead times. We test how architectural inductive bias affects emulation of sudden stratospheric warming (SSW) dynamics using paired idealised Isca simulations that differ only in an imposed wave-2 heating perturbation. Across convolutional, transformer, and graph-based architectures trained for one-step prediction, model differences are modest when the stratosphere is dynamically quiet but widen substantially when SSW-like variability is active. Our results identify explicit three-dimensional vertical coupling as a key inductive bias for machine-learning emulation of stratospheric dynamics. However, Eliassen-Palm flux diagnostics show that low forecast error does not guarantee physically faithful wave-mean-flow interaction, with coherent errors remaining in stratospheric wave-driving structure.
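For readers unfamiliar with the diagnostic named in this abstract, the quasi-geostrophic form of the Eliassen-Palm flux in spherical pressure coordinates is given below as background. Choosing this form is an assumption; the abstract does not state which formulation the authors use. Here a is Earth's radius, phi is latitude, p is pressure, f is the Coriolis parameter, theta is potential temperature, overbars denote zonal means, and primes denote deviations from them.

```latex
\begin{aligned}
F_{\phi} &= -a\cos\phi\,\overline{u'v'}, \\
F_{p} &= f\,a\cos\phi\,\frac{\overline{v'\theta'}}{\partial\bar{\theta}/\partial p}, \\
\nabla\cdot\mathbf{F} &= \frac{1}{a\cos\phi}\,\frac{\partial}{\partial\phi}\left(F_{\phi}\cos\phi\right) + \frac{\partial F_{p}}{\partial p}.
\end{aligned}
```

In this framework the flux divergence is the eddy forcing of the zonal-mean zonal wind, so errors in it point to errors in the wave driving of the mean flow even when pointwise forecast error is small.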

2606.18901 2026-06-18 physics.ao-ph physics.data-an New submission 80%

Multifractal Dynamics of Tropical Atlantic SST Indices: Nonlinear Scaling Structure and Episodic Statistical Association with ENSO Variability

Sebastián Jaroszewicz, Nahuel Mendez, Maria P. Beccar-Varela, Maria Cristina Mariani

Topic match: Weather and Climate. Multifractal analysis of tropical Atlantic SST indices; falls under climate science.

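AI summary: Applies MFDFA to three tropical Atlantic SST indices. TASI has a broader multifractal spectrum that includes a nonlinear contribution; a moving-window analysis shows significantly reduced multifractal width during strong El Niño events; lagged correlations indicate an episodic association with ENSO.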

Abstract

The Tropical Atlantic exhibits complex sea surface temperature (SST) variability driven by internal ocean-atmosphere interactions and remote climate forcing. We perform a comparative multifractal analysis of three SST indices, South Atlantic Tropical (SAT), Tropical Southern Atlantic (TSA), and the Tropical Atlantic SST Gradient Index (TASI), using weekly data from 1981 to 2025. Multifractal Detrended Fluctuation Analysis (MFDFA) reveals robust scale-dependent behavior in all indices. TASI displays a substantially broader multifractal spectrum (Δh ≈ 0.72) than SAT (0.27) and TSA (0.34). Surrogate-data tests show that multifractality in SAT and TSA is mainly explained by linear autocorrelations, whereas TASI contains an additional nonlinear contribution associated with phase correlations. To investigate temporal variability, we introduce a moving-window MFDFA framework that tracks the evolution of multifractal width. Significant reductions are observed during the major 1997-1998 and 2015-2016 El Niño events, indicating a suppression of multiscale variability under extreme Pacific forcing. Lagged correlation analysis reveals a significant negative association with the Oceanic Niño Index at delays of 15-18 months, consistent with known Atlantic-Pacific teleconnections. However, Granger causality and Transfer Entropy tests do not detect significant causal links, suggesting an episodic rather than persistent relationship. Lagged multifractal cross-correlation analysis further reveals scale-dependent inter-basin coupling. These results demonstrate that time-dependent multifractal measures provide a useful framework for characterizing nonlinear Atlantic variability and identify TASI as a dynamically distinct index whose scaling properties contain information not captured by regional SST indices alone.
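The MFDFA procedure named in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: detrending is linear (order 1), only forward non-overlapping segments are used, the width is taken as the range of generalized Hurst exponents h(q), and the scales, q values, and white-noise input are arbitrary choices.

```python
import math
import random


def detrended_variance(segment):
    """Mean squared residual after removing a least-squares straight line."""
    n = len(segment)
    t_mean = (n - 1) / 2.0
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(segment)) / n


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return num / sum((x - x_mean) ** 2 for x in xs)


def mfdfa(series, scales, q_values):
    """Return {q: h(q)}, the generalized Hurst exponents of the series."""
    mean = sum(series) / len(series)
    profile, running = [], 0.0
    for value in series:  # cumulative sum of the mean-removed series
        running += value - mean
        profile.append(running)
    # Detrended variance of every non-overlapping segment at every scale.
    variances = [
        [detrended_variance(profile[k * s:(k + 1) * s])
         for k in range(len(profile) // s)]
        for s in scales
    ]
    log_scales = [math.log(s) for s in scales]
    hurst = {}
    for q in q_values:
        log_fq = []
        for var in variances:
            if q == 0:  # logarithmic average replaces the q-th order mean
                log_fq.append(0.5 * sum(math.log(v) for v in var) / len(var))
            else:
                moment = sum(v ** (q / 2.0) for v in var) / len(var)
                log_fq.append(math.log(moment) / q)
        hurst[q] = fit_slope(log_scales, log_fq)  # F_q(s) ~ s^h(q)
    return hurst


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]  # synthetic test series
hurst = mfdfa(noise, scales=[16, 32, 64, 128, 256], q_values=[-3, -2, -1, 0, 1, 2, 3])
delta_h = max(hurst.values()) - min(hurst.values())
print(f"h(2) = {hurst[2]:.3f}, width = {delta_h:.3f}")
```

For uncorrelated noise h(2) should sit near 0.5 and the width should be small; a moving-window variant along the lines the abstract describes would apply `mfdfa` to successive sub-series and track the width over time.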