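arXivDaily daily academic digest: synced with the full arXiv feed, with AI summaries and translation, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related areas.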

2603.16011 2026-05-18 cs.SE cs.AI cs.CL

FormulaCode: Evaluating Agentic Optimization on Large Codebases

Atharva Sehgal, James Hou, Akanksha Sarkar, Ishaan Mantripragada, Swarat Chaudhuri, Jennifer J. Sun, Yisong Yue

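AI summary: This paper introduces FormulaCode, a benchmark for evaluating how well large language model (LLM) agents perform multi-objective optimization in real, large codebases. It is built from 957 performance bottlenecks mined from scientific Python repositories on GitHub, each paired with an expert-written patch and a large set of community-maintained performance tests, so that optimization can be evaluated under both correctness and performance constraints. Experiments show that current state-of-the-art LLM agents still struggle with large-scale, multi-objective optimization tasks.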

Comments Preprint version

2603.13864 2026-05-18 cs.CR cs.CV

Inevitable Encounters: Backdoor Attacks Involving Lossy Compression

Qian Li, Yunuo Chen, Yuntian Chen

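AI summary: This paper studies how lossy compression, which is unavoidable when data is stored and transmitted in practice, weakens backdoor attacks. Because trigger information embedded in an image can be lost during compression, the authors propose two poisoning strategies designed for lossy compression that keep the trigger recoverable after compression. Experiments show that both methods remain effective across a range of compression schemes, pointing to a new route for mounting backdoor attacks in practical settings.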

2603.04459 2026-05-18 cs.CR cs.AI cs.SE

Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks

Junjie Chu, Xinyue Shen, Ye Leng, Michael Backes, Yun Shen, Yang Zhang

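AI summary: This paper systematically evaluates the code quality and runnability of 31 LLM safety benchmarks, using 382 non-benchmark papers as a control group. Only 39% of benchmark repositories run without modification, only 16% provide flawless installation guides, and just 6% include ethical considerations. Benchmark adoption correlates with author prominence and code runnability rather than with code quality standards, which points to a possible bias in how the community selects benchmarks. Some benchmark repositories also expose harmful content without safeguards, so they can serve as attack resources, which undermines the reliability of safety evaluations.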

Comments 24 pages. 19 figures

Abstract

The rapid expansion of research in LLM safety presents challenges in tracking advancements, making benchmarks important evaluation infrastructures for identifying key trends and facilitating systematic comparisons. Yet no systematic assessment exists of their code quality and runnability, nor of what factors are associated with the community's adoption of certain benchmarks over others. To address this gap, we conduct a systematic measurement study of 31 LLM safety benchmarks (covering prompt injection, jailbreak, and hallucination) with 382 non-benchmark papers as a control group, combining automated static analysis, human runnability testing (220+ person-hours), and bibliometric analysis. We find that only 39% of benchmark repositories can run without modification, only 16% provide flawless installation guides, and a mere 6% include ethical considerations despite containing potentially harmful content. These deficiencies persist across the study period with no significant improvement. Analyzing adoption factors, we find that benchmark adoption correlates with author prominence and code runnability, but not with code quality standards such as Pylint score and maintainability, suggesting that the community's benchmark selection does not reward higher coding standards. Based on these results, we identify potential safety and reliability concerns. Some safety benchmark repositories openly expose harmful content, such as successful jailbreak responses, without any ethical warning or access control, effectively serving as unguarded attack resources. Furthermore, when benchmarks require ad-hoc modifications to run, downstream safety evaluations across different papers may not be comparable. We present case studies illustrating these concrete consequences and propose a targeted checklist to help benchmark contributors improve code quality, documentation, and ethical practices.

2602.14342 2026-05-18 math.ST cs.DS cs.LG math.PR stat.TH

High-accuracy log-concave sampling with stochastic queries

Fan Chen, Sinho Chewi, Constantinos Daskalakis, Alexander Rakhlin

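AI summary: This paper studies high-accuracy guarantees for log-concave sampling. It shows that stochastic gradients with sub-exponential tails suffice for iteration and query complexity that scales as $\mathrm{poly}\log(1/\delta)$. This contrasts with convex optimization, where stochastic gradients force $\mathrm{poly}(1/\delta)$ queries. The paper also gives an information-theoretic argument that light-tailed stochastic gradients are necessary for high-accuracy sampling, and derives improved complexity results for zeroth-order stochastic queries and for sampling from finite-sum potentials.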

2602.14092 2026-05-18 eess.SY cs.RO cs.SY

Simultaneous State Estimation and Online Model Learning in a Soft Robotic System

Jan-Hendrik Ewering, Max Bartholdt, Simon F. G. Ehlers, Niklas Wahlström, Thomas B. Schön, Thomas Seel

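AI summary: This paper addresses simultaneous state estimation and online model learning in a soft robotic system. Using grey-box system identification tools, the proposed method needs only a nominal constant-curvature robot model and force measurements at the robot base to estimate the robot's current pose and learn its bending-stiffness model at the same time. A marginalized particle filter combines the constant-curvature model with a Gaussian process model, which improves prediction accuracy and overall model quality; the approach is validated in experiments on a real soft robot.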

Comments 8 pages, 3 figures, 2 tables, contribution to the International Conference on Information Fusion 2026

2602.12292 2026-05-18 eess.SP cs.LG

A Gradient Boosted Mixed-Model Machine Learning Framework for Vessel Speed in the U.S. Arctic

Mauli Pant, Linda Fernandez, Indranil Sahoo

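AI summary: This paper examines how environmental and operational conditions affect vessel speed in the U.S. Arctic. Using Automatic Identification System (AIS) data from 2010 to 2019, it proposes a two-stage mixed machine learning framework that models the probability that speed is greater than zero and, separately, the speed conditional on movement. The method combines gradient-boosted decision trees with random effects, which captures nonlinear environmental responses and handles repeated observations. Distance to coast and water depth emerge as the main drivers of vessel speed, while wind and sea ice have comparatively small effects.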
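
The two-stage structure in this summary can be sketched as a two-part (hurdle) prediction: stage one gives the probability that a vessel is moving, stage two gives its speed if it is moving, and their product is the expected speed. The sketch below is illustrative only. The logistic link, the additive per-vessel random intercept, and all function names are assumptions, not details taken from the paper, and the boosted-tree outputs are stood in for by plain numbers.

```python
import math


def prob_moving(stage_one_score: float) -> float:
    """Stage 1 (assumed logistic link): probability that speed > 0."""
    return 1.0 / (1.0 + math.exp(-stage_one_score))


def conditional_speed(fixed_part: float, vessel_effect: float) -> float:
    """Stage 2 (assumed additive form): speed given that the vessel moves.

    fixed_part stands in for the boosted-tree prediction from covariates;
    vessel_effect is a hypothetical per-vessel random intercept that absorbs
    repeated observations of the same vessel.
    """
    return max(0.0, fixed_part + vessel_effect)


def expected_speed(stage_one_score: float, fixed_part: float,
                   vessel_effect: float) -> float:
    """Two-part combination: E[speed] = P(speed > 0) * E[speed | speed > 0]."""
    return prob_moving(stage_one_score) * conditional_speed(fixed_part, vessel_effect)


if __name__ == "__main__":
    # Hypothetical inputs: a neutral stage-one score and a 12-knot conditional speed.
    print(expected_speed(0.0, 10.0, 2.0))
```

Keeping the two stages separate lets stationary records (speed exactly zero) be modelled without distorting the fit for vessels that are under way.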

2602.06824 2026-05-18 math.OC cs.LG

RanSOM: Second-Order Momentum with Randomized Scaling for Constrained and Unconstrained Optimization

El Mahdi Chayti

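AI summary: This paper proposes RanSOM, a unified optimization framework for constrained and unconstrained problems that aims to remove the curvature-induced bias of standard momentum methods in the stochastic setting. It replaces the deterministic step size with one drawn at random from a specific distribution; combined with a Stein-type identity, this lets a single Hessian-vector product estimate the momentum bias accurately, avoiding extra sampling and reliance on smoothness assumptions. RanSOM is reported to attain the optimal $\mathcal{O}(\varepsilon^{-3})$ convergence rate under standard noise conditions and to perform well under heavy-tailed noise.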

2602.01568 2026-05-18 cs.GT cs.RO

Efficiently Solving Mixed-Hierarchy Games with Quasi-Policy Approximations

Hamzah Khan, Dong Ho Lee, Jingqi Li, Tianyu Qiu, Christian Ellis, Jesse Milzman, Wesley Suttle, David Fridovich-Keil

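AI summary: This paper studies multi-robot games with mixed hierarchy, in which some robots act as Stackelberg leaders over their own subtrees while robots on different branches interact through Nash equilibria. To avoid the higher-order derivatives that make such games hard to solve, the authors propose a quasi-policy approximation and solve the resulting approximate KKT system efficiently with an inexact Newton method, proving local exponential convergence under non-quadratic objectives and nonlinear constraints. The method is validated on hardware and in simulation, demonstrating real-time solution of complex mixed-hierarchy structures.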

2601.23030 2026-05-18 stat.ML cs.LG stat.ME

Neural Backward Filtering Forward Guiding

Gefan Yang, Frank van der Meulen, Stefan Sommer

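AI summary: This paper proposes Neural Backward Filtering Forward Guiding (NBFFG), a unified framework for inference in nonlinear continuous stochastic processes on trees, aimed at settings with sparse observations and complex topologies. It constructs an approximating linear-Gaussian process whose backward filter is available in closed form and uses it to guide generated paths toward high-likelihood regions, while a neural network residual captures the nonlinear discrepancy. This enables unbiased path subsampling and substantially lowers training complexity. Experiments show that NBFFG outperforms existing methods on synthetic datasets and on a high-dimensional phylogenetic analysis task.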

2601.21028 2026-05-18 cs.CY cs.AI cs.HC

"Unlimited Realm of Exploration and Experimentation": Methods and Motivations of AI-Generated Sexual Content Creators

Jaron Mink, Lucy Qin, Elissa M. Redmiles

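AI summary: This paper studies the motivations, methods, and content types of creators of AI-generated sexual content (AIG-SC), and finds diverse reasons for creating it, including sexual exploration, creative expression, and technical experimentation. Drawing on in-depth interviews with 28 creators, it discusses the technical, ethical, and social implications of AIG-SC and offers input for related policymaking.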

2601.00361 2026-05-18 cs.DS cs.LG

Deterministic Coreset for Lp Subspace

Rachit Chhaya, Anirban Dasgupta, Dan Feldman, Supratim Shit

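AI summary: This paper presents the first iterative algorithm that constructs an $\varepsilon$-coreset with a deterministic $\ell_p$ subspace embedding guarantee, for any $p \in [1, \infty)$ and $\varepsilon > 0$. The algorithm iteratively maintains a subset of rows whose loss, after appropriate scaling, stays within upper and lower bounds relative to the loss on the full dataset, which yields the deterministic guarantee. It removes the logarithmic factors in previous coreset sizes, matches the optimal theoretical bound, and can be used to approximately solve $\ell_p$ regression deterministically.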
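
For reference, the two-sided bound mentioned in this summary is usually stated as follows. This is the standard textbook form of an $\ell_p$ subspace embedding by a weighted row subset, written in assumed notation ($A$, $a_i$, $S$, $w_i$ are not taken from the paper, whose exact formulation may differ).

```latex
% A \in \mathbb{R}^{n \times d} has rows a_1, \dots, a_n.
% S \subseteq \{1, \dots, n\} is the selected row subset, w_i > 0 are its weights.
% (S, w) is an \varepsilon-coreset for the \ell_p subspace of A if
\[
  (1 - \varepsilon)\, \|Ax\|_p^p
  \;\le\; \sum_{i \in S} w_i \, \lvert a_i^{\top} x \rvert^{p}
  \;\le\; (1 + \varepsilon)\, \|Ax\|_p^p
  \qquad \text{for all } x \in \mathbb{R}^d ,
\]
% where \|Ax\|_p^p = \sum_{i=1}^{n} \lvert a_i^{\top} x \rvert^{p}.
```

A deterministic guarantee means the inequality holds for the constructed subset with certainty, rather than with high probability over random sampling.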

Comments The proofs of some claims are incomplete

2512.07946 2026-05-18 hep-th cs.LG

Conformal Defects in Neural Network Field Theories

Pietro Capuozzo, Brandon Robinson, Benjamin Suzzoni

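AI summary: This paper studies how to construct conformally invariant defects in neural network field theories (NN-FTs) and proposes a formal framework for introducing conformal defects in these theories. Two toy models of scalar field theories illustrate the approach, and the authors develop a neural-network interpretation of an expansion analogous to the defect operator product expansion, providing a new tool for work at the intersection of conformal field theory and deep learning.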

Comments 23 pages, 1 figure

2512.04745 2026-05-18 math.OC cs.AI cs.SY eess.SY nlin.AO

Neural Policy Composition from Free Energy Minimization

Francesca Rossi, Veronica Centorrino, Francesco Bullo, Giovanni Russo

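AI summary: This paper studies how neural policies can be composed by minimizing variational free energy, and proposes a normative framework that gives policy composition a principled, broadly applicable objective. From this framework the authors derive a continuous-time gradient flow whose trajectories provably converge to the optimal policy composition at an explicit rate, and show that the dynamics can be implemented by a soft-competitive recurrent circuit. In experiments on multi-agent collective behavior, human decision-making tasks, and hierarchical control, the model accounts for policy composition mechanisms, reproduces key behavioral signatures, and matches or outperforms existing models.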

2511.05297 2026-05-18 cs.SE cs.LG

Building Specialized Software-Assistant ChatBot with Graph-Based Retrieval-Augmented Generation

Mohammed Hilel, Yannis Karmim, Jean De Bodinat, Reda Sarehane, Antoine Gillon

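AI summary: This paper proposes a graph-based retrieval-augmented generation framework for building specialized software-assistant chatbots for enterprise software, addressing the tendency of conventional large language models to hallucinate when they lack an understanding of the software's structure. The method automatically converts enterprise web applications into state-action knowledge graphs, which help the language model produce more accurate, context-aware guidance. The paper also details the engineering pipeline for extracting and building the knowledge graph from the software interface, and shows the approach integrated into a production digital adoption platform.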

Comments Accepted at ICMLC 2026

2511.04484 2026-05-18 cs.DS cs.LG

Online Algorithms for Repeated Optimal Stopping: Balancing Baseline Guarantees and Regret

Tsubasa Harada, Yasushi Kawase, Hanna Sumita

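AI summary: This paper studies repeated optimal stopping with an unknown distribution, where the goal is to keep a strong performance guarantee in every round while achieving sublinear regret overall. The authors propose a general algorithmic framework that, under full feedback, satisfies both the per-round guarantee and sublinear regret with high probability, and that applies to classical settings such as prophet inequalities and the secretary problem. They also prove a regret lower bound in the i.i.d. model, showing that the proposed method is near-optimal.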

Comments 30 pages, Major revision with corrected results, new impossibility results, and revised exposition

2511.03606 2026-05-18 stat.ML cs.LG math.ST stat.TH

Vector-valued self-normalized concentration inequalities beyond sub-Gaussianity

Diego Martinez-Taboada, Tomas Gonzalez, Aaditya Ramdas

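AI summary: This paper studies concentration inequalities for vector-valued self-normalized processes beyond the sub-Gaussian setting, an area where theory has been lacking. The authors derive concentration bounds for light-tailed processes (for example, under Bennett- or Bernstein-type conditions), extending the reach of classical self-normalized analysis. The results apply to online linear regression and to kernelized linear bandits.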

2510.19315 2026-05-18 cs.FL cs.LG cs.LO

Transformers are Inherently Succinct

Pascal Bergsträßer, Ryan Cotterell, Anthony W. Lin

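AI summary: This paper studies how succinctly transformers can describe languages, treating succinctness as an important measure of their capability. The authors prove that fixed-precision transformers are more succinct than traditional formalisms such as linear temporal logic (LTL), recurrent neural networks (RNNs), and finite automata, in some cases exponentially or doubly exponentially so. They also give a corresponding upper bound: a transformer can be translated into an LTL formula with only an exponential blow-up, improving on the previous doubly exponential translation. As a consequence of this succinctness, basic verification problems for transformers, such as emptiness and equivalence, are computationally intractable; specifically, they are EXPSPACE-complete.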

2510.15714 2026-05-18 math.OC cs.LG

A Split-Client Approach to Second-Order Optimization

El Mahdi Chayti, Martin Jaggi

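AI summary: This paper proposes Split-Client, a framework that addresses the computational bottleneck of computing and factorizing the Hessian in second-order optimization. It splits the optimization into gradient and curvature computations that run in parallel, adapts automatically to the curvature delay, and matches the convergence rate of optimally tuned Lazy methods without manual tuning. The framework also gives a noise-adaptive rate under persistent curvature error and a faster rate under structured conditions, and shows substantial speedups in experiments on non-convex problems.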

2510.02734 2026-05-18 q-bio.BM cs.AI q-bio.GN

SAE-RNA: A Sparse Autoencoder Model for Interpreting RNA Language Model Representations

Taehan Kim, Sangdae Nam

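AI summary: This paper presents SAE-RNA, a sparse autoencoder model for interpreting the representations of RNA language models, and asks whether it can decompose their features into interpretable components. Built on RiNALMo, the method maps learned features to known biological features to analyze how an RNA language model organizes biological information internally. The study provides a feature-level comparison framework for identifying RNA classes and structural features, and discusses the applicability and limitations of sparse autoencoders for this task.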

Comments 12 pages, 7 figures. v2: Updated bibliography to improve reference accuracy and reflect updated publication venues. Refined claims for better alignment with results and added an Appendix

2509.16223 2026-05-18 eess.SP cs.CV

mRadNet: A Compact Radar Object Detector with MetaFormer

Huaiyu Chen, Fahed Hassanat, Robert Laganiere, Martin Bouchard

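AI summary: This paper proposes mRadNet, a compact radar object detection model designed for the lightweight, efficient models required by automotive embedded systems. The model follows a U-Net structure with MetaFormer blocks, uses separable convolutions and attention to extract local and global features, and adopts more efficient feature embedding and merging strategies to further cut computational cost. On the CRUW dataset, mRadNet outperforms existing methods in detection performance with the fewest parameters and the lowest computational cost.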

Comments 5 pages, 2 figures, to appear in Proc. of 34th European Signal Processing Conference (EUSIPCO 2026), Bruges, Belgium, Aug. 31 - Sept. 4, 2026. Code available at https://github.com/huaiyu-chen/mRadNet

2509.12266 2026-05-18 q-bio.GN cs.LG

Genome-Factory: A Library for Tuning, Deploying, and Interpreting Genomic Foundation Models

Weimin Wu, Xuefeng Song, Yibo Wen, Qinjie Lin, Zhihan Zhou, Jerry Yao-Chieh Hu, Zhong Wang, Han Liu

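AI summary: This paper introduces Genome-Factory, described as the first integrated Python library for tuning, deploying, and interpreting genomic foundation models. It simplifies genomic model development by unifying data collection, model tuning, inference, benchmarking, and interpretability analysis in one workflow. Its main contributions are automated data preprocessing, support for multiple tuning methods, embedding extraction and sequence generation, and a sparse-autoencoder-based biological interpreter, which together make genomic models more practical for real analyses.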

2509.01685 2026-05-18 stat.ML cs.LG math.OC stat.CO

Preconditioned Regularized Wasserstein Proximal Sampling

Hong Ye Tan, Stanley Osher, Wuchen Li

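AI summary: This paper studies sampling from a Gibbs distribution by evolving a finite set of particles, and proposes a preconditioned regularized Wasserstein proximal sampling method. The method approximates the score function with the numerically tractable score of the regularized Wasserstein proximal operator, and derives its kernel form from the Cole-Hopf transform of an anisotropic heat equation. Experiments show acceleration and improved stability on a range of log-concave and non-log-concave distributions, as well as on Bayesian image deconvolution and neural network training.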

2508.16114 2026-05-18 astro-ph.GA astro-ph.IM astro-ph.SR cs.LG

Neural-Network Chemical Emulator for First-Star Formation: Robust Iterative Predictions over a Wide Density Range

Sojun Ono, Kazuyuki Sugimura

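AI summary: This paper presents a neural-network chemical emulator for the thermal and chemical evolution during the formation of first-generation (Population III) stars. The emulator covers a density range spanning 21 orders of magnitude (10⁻³–10¹⁸ cm⁻³) and accurately tracks six primordial species. For robust and efficient predictions, the study introduces a timescale-based update method and trains separate deep operator networks on different density intervals, which greatly increases computational speed while preserving accuracy over many iterations.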

Comments 19 pages, 7 figures, Accepted for publication in ApJ

DOI: 10.3847/1538-4357/ae1ca9
Journal ref: ApJ, 996, 9 (2026)

Abstract

We present a neural-network emulator for the thermal and chemical evolution in Population III star formation. The emulator accurately reproduces the thermochemical evolution over a wide density range spanning 21 orders of magnitude (10$^{-3}$-10$^{18}$ cm$^{-3}$), tracking six primordial species: H, H$_2$, e$^{-}$, H$^{+}$, H$^{-}$, and H$_2^{+}$. To handle the broad dynamic range, we partition the density range into five subregions and train separate deep operator networks (DeepONets) in each region. When applied to randomly sampled thermochemical states, the emulator achieves relative errors below 10% in over 90% of cases for both temperature and chemical abundances (except for the rare species H$_2^{+}$). The emulator is roughly ten times faster on a CPU and more than 1000 times faster for batched predictions on a GPU, compared with conventional numerical integration. Furthermore, to ensure robust predictions under many iterations, we introduce a novel timescale-based update method, where a short-timestep update of each variable is computed by rescaling the predicted change over a longer timestep equal to its characteristic variation timescale. In one-zone collapse calculations, the results from the timescale-based method agree well with traditional numerical integration even with many iterations at a timestep as short as 10$^{-4}$ of the free-fall time. This proof-of-concept study suggests the potential for neural network-based chemical emulators to accelerate hydrodynamic simulations of star formation.
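
The timescale-based update described in this abstract can be illustrated with a minimal sketch. For each variable, the emulator is queried over a horizon equal to that variable's characteristic timescale, and the predicted change is then scaled down to the short step actually taken. The linear scaling by `dt / tau`, the function names, and the toy constant-rate "emulator" are all assumptions made for illustration; the abstract does not specify how the timescales are computed, so they are passed in as inputs here.

```python
from typing import Callable, List, Sequence

# Hypothetical emulator signature: (state, horizon) -> predicted state after horizon.
Emulator = Callable[[Sequence[float], float], List[float]]


def timescale_update(state: Sequence[float], timescales: Sequence[float],
                     dt: float, emulate: Emulator) -> List[float]:
    """Advance every variable by a short step dt.

    For variable i, the emulator predicts the state after the longer horizon
    tau_i; the predicted change in x_i is then rescaled linearly by dt / tau_i
    (assumed form of the rescaling).
    """
    new_state = []
    for i, (x_i, tau_i) in enumerate(zip(state, timescales)):
        predicted_i = emulate(state, tau_i)[i]
        new_state.append(x_i + (dt / tau_i) * (predicted_i - x_i))
    return new_state


def make_toy_emulator(rates: Sequence[float]) -> Emulator:
    """Toy stand-in for a trained network: each variable changes at a constant rate."""
    def emulate(state: Sequence[float], horizon: float) -> List[float]:
        return [x + r * horizon for x, r in zip(state, rates)]
    return emulate


if __name__ == "__main__":
    emulate = make_toy_emulator([0.5, -2.0])
    print(timescale_update([1.0, 10.0], [2.0, 5.0], 0.1, emulate))
```

The motivation given in the abstract is robustness under many iterations: the emulator is only ever asked about changes over a variable's own characteristic timescale, even when the step being taken is much shorter.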

2508.03810 2026-05-18 hep-th cs.LG

Viability of perturbative expansion for quantum field theories on neurons

Srimoyee Sen, Varun Vaidya

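AI summary: This paper examines whether perturbative computations for local quantum field theories are viable on neural network architectures with a finite number of neurons, using scalar $\phi^4$ theory in $d$-dimensional Euclidean space as the example. It finds that the perturbative series formed by the renormalized $O(1/N)$ corrections to the two- and four-point correlation functions is sensitive to the ultraviolet cutoff and converges poorly. The authors therefore propose a modification of the architecture and discuss how the theory parameters should scale with the number of neurons so that field-theory results can be extracted more accurately.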

Comments Published version

2506.14829 2026-05-18 cs.HC cs.AI cs.LG

The Hardness of Achieving Impact in AI for Social Impact Research: A Ground-Level View of Challenges & Opportunities

Aditya Majumdar, Wenbo Zhang, Kashvi Prawal, Amulya Yadav

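AI summary: This paper examines the main challenges and opportunities that AI for Social Impact (AI4SI) research faces in real-world deployment. Based on interviews with 26 AI4SI researchers, it analyzes structural, organizational, communication, and collaboration barriers to putting AI4SI work into practice, and distills workable collaboration strategies and practical lessons. The study offers practical guidance for AI researchers and institutions that want to achieve social impact.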

Comments To be published in FAccT'26

2506.00182 2026-05-18 stat.ML cs.IT cs.LG math.IT math.ST stat.TH

Overfitting has a limitation: a model-independent generalization gap bound based on Rényi entropy

Atsushi Suzuki, Jing Wang

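AI summary: This paper studies limits on the generalization of machine learning models and derives a model-independent upper bound on the generalization gap that depends only on the Rényi entropy of the data-generating distribution. It shows that even as model size grows without bound, the generalization gap stays small provided the amount of data is large enough relative to the Rényi entropy. The framework explains why injecting noise into the data degrades performance, and it extends the no-free-lunch theorem, highlighting the role of the data distribution's entropy in successful learning.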
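
As background for the quantity this summary relies on, the sketch below computes the Rényi entropy of order $\alpha$ for a discrete distribution, $H_\alpha(P) = \frac{1}{1-\alpha}\log\sum_i p_i^\alpha$, using the standard definition. Which order the paper's bound uses, and how the entropy enters the bound, is not stated in the summary, so the function name and the example distributions are purely illustrative.

```python
import math
from typing import Sequence


def renyi_entropy(probs: Sequence[float], alpha: float) -> float:
    """Rényi entropy of order alpha (in nats) of a discrete distribution."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probs must be non-negative and sum to 1")
    support = [p for p in probs if p > 0]
    if math.isclose(alpha, 1.0):
        # The limit alpha -> 1 recovers the Shannon entropy.
        return -sum(p * math.log(p) for p in support)
    if math.isinf(alpha):
        # The limit alpha -> infinity is the min-entropy.
        return -math.log(max(support))
    return math.log(sum(p ** alpha for p in support)) / (1.0 - alpha)


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    skewed = [0.5, 0.25, 0.25]
    print(renyi_entropy(uniform, 2.0))  # equals log(4) for every order
    print(renyi_entropy(skewed, 1.0), renyi_entropy(skewed, 2.0))
```

For a uniform distribution every order gives the same value; for non-uniform distributions the entropy is non-increasing in $\alpha$, which is why the choice of order matters when reading a bound stated in terms of it.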

2505.11708 2026-05-18 cs.CR cs.LG

Unveiling the Black Box: A Multi-Layer Framework for Explaining Reinforcement Learning-Based Cyber Agents

Diksha Goel, Kristen Moore, Jeff Wang, Minjune Kim, Thanh Thi Nguyen

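AI summary: As reinforcement learning (RL) is used more widely to simulate complex cyberattacks, the opacity of its decision-making has become a key obstacle to trust, debugging, and defensive preparation. This paper proposes a unified multi-layer explanation framework that exposes the decision logic of RL-based attack agents at the strategic (MDP) level and the tactical (policy) level. It models cyberattacks as a partially observable Markov decision process (POMDP) and analyzes how Q-values change over time to explain how attack behavior evolves. The framework is general across attack agents and environments, and provides interpretable behavioral insight for red-team simulation, policy debugging, threat modeling, and anticipatory defense.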

2504.13850 2026-05-18 cs.DC cs.LG

FedOptima: Optimizing Resource Utilization in Federated Learning

Zihan Zhang, Leon Wong, Blesson Varghese

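AI summary: This paper proposes FedOptima, a system that optimizes resource utilization in federated learning to address low utilization of server and device resources. Through asynchronous aggregation, auxiliary networks, and centralized task scheduling, it simultaneously reduces the idle time caused by task dependency and by stragglers among heterogeneous devices. Compared with four baselines, FedOptima achieves higher or comparable accuracy, trains 1.9x to 21.8x faster, and increases throughput by 1.1x to 2.0x.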

Comments Accepted for publication in Future Generation Computer Systems

DOI: 10.1016/j.future.2026.108551
Journal ref: Future Generation Computer Systems, Volume 183, October 2026, 108551

Abstract

Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices. However, FL systems have low resource utilization on servers and devices, limiting their practical use in the real world. This inefficiency primarily arises from two types of idle time: (i) task dependency between the server and devices, and (ii) stragglers among heterogeneous devices. This paper introduces FedOptima, a resource-optimized FL system designed to simultaneously minimize both types of idle time; existing systems do not eliminate or reduce both at the same time. FedOptima offloads the training of certain layers of a neural network from a device to a server using three innovations. First, devices operate independently of each other using asynchronous aggregation to eliminate straggler effects, and independently of the server by utilizing auxiliary networks to minimize idle time caused by task dependency. Second, the server performs centralized training using a task scheduler that ensures balanced contributions from all devices, improving model accuracy. Third, an efficient memory management mechanism on the server increases the scalability of the number of participating devices. Extensive experiments are conducted on multiple lab-based testbeds, evaluated on image classification and sentiment analysis tasks with CNNs and Transformers. Compared to four state-of-the-art offloading-based and asynchronous FL baselines, FedOptima (i) achieves higher or comparable accuracy, (ii) accelerates training by 1.9x to 21.8x, (iii) reduces server and device idle time by up to 93.9% and 81.8%, respectively, and (iv) increases throughput by 1.1x to 2.0x.

2501.13188 2026-05-18 cond-mat.stat-mech cs.LG nlin.AO q-bio.CB

Topological constraints on self-organisation in locally interacting systems

Francesco Sacco, Dalton A R Sakthivadivel, Michael Levin

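AI summary: This paper studies topological constraints on self-organization in locally interacting systems, asking what conditions are necessary for an ordered phase to form on planar graph structures. By analyzing how the free energy scales with domain-wall formation in three model systems (the Potts model, autoregressive models, and hierarchical networks), it shows how the combinatorics of interactions on a graph affects whether spontaneous order can emerge. The results offer a theoretical account of why multiscale biological systems can form complex patterns while foundation language models struggle with long sequences.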

Comments 11+3 pages, four figures, four tikzpictures. This version to appear in Philos Trans R Soc A

2412.12636 2026-05-18 cs.DC cs.AI cs.LG cs.PF

TrainMover: An Interruption-Resilient Runtime for ML Training

ChonLam Lao, Jiaqi Gao, Jiamin Cao, Zhipeng Zhang, Pengcheng Zhang, Jiangfei Duan, Zhilong Zheng, Yu Guan, Yichi Xu, Yong Li, Zhengping Qian, Aditya Akella, Minlan Yu, Ennan Zhai, Dennis Cai, Jingren Zhou

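AI summary: Large-scale machine learning training jobs are frequently interrupted by hardware and software failures or by management events, and existing approaches such as checkpoint-restart or runtime reconfiguration incur long downtime and degraded performance. This paper presents TrainMover, an interruption-resilient runtime for large language model training that leverages elasticity and standby machines to handle interruptions with minimal downtime and zero memory overhead. Its key techniques are two-phase, delta-based communication group setup, communication-free sandboxed warm-up, and a general standby design. Experiments show that at thousand-GPU scale TrainMover keeps interruption downtime at roughly 20 seconds and cuts GPU idle time by 55% relative to the best existing approach.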

Comments 14 pages body, 19 pages total