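arXivDaily: a daily academic digest that syncs the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering & systems.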

2606.05208 2026-06-05 eess.SP cs.LG

Transformer-Enhanced Reinforcement Learning: Fundamentals and Applications in Communication Networks

Nguyen Cong Luong, Shaohan Feng, Nguyen Duc Hai, Zeping Sui, Bo Ma, Min Xu, Zhihao Dong, Qiushi Zhao, Nguyen Duc Duy Anh, Nguyen Quoc Khanh, Ngoc Hung Nguyen, Zitian Zhang, Jie Cao

Affiliations: Faculty of Computer Science, Phenikaa University; Faculty of Artificial Intelligence and Data Science, Phenikaa University; School of Information and Electronic Engineering (Sussex Artificial Intelligence Institute), Zhejiang Gongshang University; School of Computer Science and Electronics Engineering, University of Essex; School of Mathematics, Statistics and Mechanics, Beijing University of Technology; School of Information Science and Technology, Harbin Institute of Technology; Department of Electrical and Information Technology, Faculty

AI summary: This paper surveys Transformer-enhanced reinforcement learning algorithms and their applications in communication networks, focusing on how they address the limitations of traditional RL in modeling long-term dependencies and handling partial observability.

Abstract

Reinforcement Learning (RL) has long been a powerful solution to various problems in communication networks. However, traditional RL models still face several limitations. Not only do they rely on large numbers of interactions with the environment, but they are also limited in modeling long-term relationships and tackling partial observability. In recent years, the Transformer model has demonstrated the ability to enhance RL models, allowing them to overcome these issues. In particular, the self-attention mechanism within the Transformer enables efficient modeling of long-range dependencies and global correlations, and it also accelerates training and handles heterogeneous data modalities. In this paper, we present a comprehensive survey of Transformer-based RL algorithms and their applications in communication networks. Specifically, the paper provides the mathematical background of RL and Transformer architectures, along with insights into key issues such as resource allocation, computation offloading, routing and trajectory control, and network security. We conclude the paper by discussing challenges, open issues, and notable future research directions, including Transformer-enhanced DRL algorithms for semantic communication and network optimization.
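The self-attention step the abstract refers to can be illustrated with a minimal sketch. It assumes a single head and identity query/key/value projections (a real Transformer learns these), and the observation history is made up for illustration; it is not taken from the survey.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(history):
    """Single-head scaled dot-product self-attention with identity projections.

    history: list of T observation vectors (each a list of d floats).
    Returns T context vectors; each one is a weighted average of the whole
    history, which is how distant time steps can influence the current one.
    """
    d = len(history[0])
    out = []
    for q in history:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in history]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, history)) for j in range(d)])
    return out

# Hypothetical 3-step observation history of a partially observed agent.
obs_history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = self_attention(obs_history)
```

In a Transformer-enhanced RL agent, context vectors of this kind would feed the policy or value head in place of the latest observation alone.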

2606.05206 2026-06-05 q-bio.NC cs.AI stat.AP

Ontology-constrained multi-LLM scoring of hypothesis support in the predictive processing literature

Hamed Nejat, Alexander Maier, Jesse Spencer-Smith, André M. Bastos

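Affiliations: University of Edinburgh; University of Cambridge

AI summary: This paper presents a local multi-LLM pipeline that scores studies in the predictive coding literature under ontology constraints, mapping a heterogeneous literature into a quantitative evidence space and revealing structured disagreement between hypotheses.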

Comments 33 pages, 5 tables and 9 figures

Abstract

Fragmentation is common in interdisciplinary fields with diverse methods and theoretical commitments. Predictive coding neuroscience is a clear example: its literature spans computational theory, electrophysiology, imaging, behavior, and modeling, creating a synthesis problem that conventional meta-analysis cannot easily resolve. Here, we describe a local multi-LLM pipeline for ontology-constrained literature synthesis. The pipeline reads papers, extracts evidence, incorporates figure descriptions, assembles constrained prompts, and validates outputs against an expert glossary. We manually defined a predictive-coding glossary of thirty-six concepts grouped into three hypotheses: predictive suppression, feedforward error propagation, and ubiquity. A council of ten local language models scored 31 studies according to their agreement or disagreement with each glossary factor across local and global oddball contexts. This enabled pairwise study-agreement analysis, cross-model comparison, and three-dimensional hypothesis-space mapping. Agreement was high for some hypotheses but weaker for others, revealing structured disagreement, particularly across local versus global oddball paradigms. We further define hypothesis-space temperature, a geometric dispersion metric measuring how compactly studies occupy the hypothesis space. Temperature was lower for local oddball contexts and higher for global oddball contexts, indicating greater dispersion in the latter. The scoring geometry also allowed us to estimate vectors of change between experimental contexts. These results demonstrate that local multi-LLM councils can produce auditable disagreement measurements that map heterogeneous literatures into quantitative evidence spaces. This framework may generalize to cross-study hypothesis mapping where conventional meta-analysis lacks a common comparison space.
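The abstract does not give the formula for hypothesis-space temperature, so the sketch below uses an assumed stand-in: the mean squared distance of study scores from their centroid in the three-dimensional hypothesis space. The change vector is likewise assumed to be the shift between context centroids. All scores are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length score vectors."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]

def temperature(points):
    """Assumed dispersion metric: mean squared distance to the centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points) / len(points)

def change_vector(points_from, points_to):
    """Assumed context shift: difference between the two centroids."""
    a, b = centroid(points_from), centroid(points_to)
    return [y - x for x, y in zip(a, b)]

# Hypothetical per-study scores on (predictive suppression,
# feedforward error propagation, ubiquity), one list per context.
local_oddball = [[0.8, 0.6, 0.7], [0.7, 0.5, 0.8], [0.9, 0.6, 0.6]]
global_oddball = [[0.9, -0.2, 0.1], [-0.3, 0.7, 0.8], [0.2, 0.1, -0.6]]

cooler = temperature(local_oddball) < temperature(global_oddball)
shift = change_vector(local_oddball, global_oddball)
```

With these made-up scores the tightly clustered local context has the lower temperature, which matches the direction of the reported result but says nothing about its magnitude.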

2606.05202 2026-06-05 physics.comp-ph cs.LG

Multi-Fidelity Learning with Shallow Recurrent Decoders for Reactor Physics

Stefano Riva, Carolina Introini, J. Nathan Kutz, Antonio Cammi

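Affiliations: Autodesk Research; Department of Energy, Nuclear Engineering Division; Politecnico di Milano; Department of Mechanical and Nuclear Engineering and Emirates Nuclear Technology Center

AI summary: To address the scarcity of high-fidelity data and the abundance of low-fidelity data in reactor physics, this paper uses Shallow Recurrent Decoders to map low-fidelity models (such as point kinetics) to high-fidelity models (such as the diffusion equation), obtaining high-fidelity solutions at low cost.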

Abstract

In reactor physics, neutronics can be treated at different fidelity levels, according to the needs of the user. On the one hand, precise modeling of neutron behaviour in a reactor is often expensive and time-consuming because of the high computational cost of numerically solving the Boltzmann transport equation. On the other hand, by adopting suitable assumptions, such as SP$_N$, diffusion theory, and point kinetics, it is possible to generate low-fidelity data efficiently. From the perspective of surrogate models, this computational limitation translates into a scarcity of high-fidelity data and an abundance of low-fidelity data. Given this difference in fidelity levels, it would be interesting to develop a suitable procedure to map low-fidelity models towards higher-fidelity models; for instance, one could obtain the solution to a multi-group diffusion equation starting from time-series data obtained from a point kinetics model. This work investigates this possibility by leveraging multi-fidelity information with Shallow Recurrent Decoders, a novel machine learning architecture able to map time-series observations to the full state of the reactor. This technique has been designed to use local or global measurements as input and map their temporal trajectories to the high-dimensional state; by the same logic, in principle, this architecture can also be used when the input is formed by the solution of a lumped model. This work applies this idea to a benchmark reactor geometry, mapping the point kinetics model to the diffusion solution under various input conditions, at much lower computational cost.
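To make the low-fidelity side concrete, here is a minimal sketch of a point kinetics model that produces the kind of time series such a decoder would take as input. It assumes a single delayed-neutron group, a constant reactivity step, explicit Euler integration, and illustrative parameter values; none of these choices come from the paper.

```python
def point_kinetics(rho, t_end, dt=1e-3, beta=0.0065, decay=0.08, gen_time=1e-4):
    """Integrate one-delayed-group point kinetics with explicit Euler steps.

    dn/dt = (rho - beta) / gen_time * n + decay * c
    dc/dt = beta / gen_time * n - decay * c

    rho: constant reactivity; beta: delayed-neutron fraction;
    decay: precursor decay constant (1/s); gen_time: neutron generation time (s).
    Returns the relative power trace sampled every dt seconds.
    """
    n = 1.0                               # relative neutron population
    c = beta * n / (gen_time * decay)     # precursors in equilibrium with n
    trace = [n]
    for _ in range(int(round(t_end / dt))):
        dn = ((rho - beta) / gen_time) * n + decay * c
        dc = (beta / gen_time) * n - decay * c
        n += dt * dn
        c += dt * dc
        trace.append(n)
    return trace

steady = point_kinetics(rho=0.0, t_end=1.0)     # critical reactor: power stays flat
step_up = point_kinetics(rho=0.001, t_end=1.0)  # small positive step: power rises
```

A multi-fidelity setup would pair traces like `step_up` with the corresponding spatial diffusion solutions and train the decoder on those pairs.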

2606.05200 2026-06-05 physics.comp-ph cs.LG

A differentiable machine learning small-angle X-ray scattering analysis framework for structure elucidation of lipid nanoparticles

Maria Bånkestad, Sandra Barman, Magnus Röding, Erik Kaunisto, Viktoriia Meklesh, Audrey Gallud, Marco Mendez, Marianna Yanez Arteta, Stefan Norberg, Ann Terry, Smita Chakraborty, Shun Yu, Jerk Rönnols, Sepideh Pashami

Affiliations: RISE Research Institutes of Sweden, Division Digital Systems, Computer Science; RISE Research Institutes of Sweden, Division Bioeconomy, Food Research and Innovation; Sustainable Innovation & Transformational Excellence, Pharmaceutical Technology & Development, Operations, AstraZeneca; Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg; Advanced Drug Delivery, Pharmaceutical Sciences, R&D, AstraZeneca; Global Product Development, Pharmaceutical Technology & Development, Operations, AstraZeneca; MAX IV Laboratory, Lund University

AI summary: This paper proposes a framework that combines a machine learning surrogate with a differentiable layer to accelerate SAXS data analysis of lipid nanoparticles, enabling multi-start fitting and identifiability analysis and revealing parameter degeneracy.

Comments 38 pages, 24 figures, 5 tables (incl. supplementary information)

Abstract

Lipid nanoparticles (LNPs) are efficient delivery systems for negatively charged nucleic acids. Their multi-component architecture yields a core-shell structure. Small-angle X-ray scattering (SAXS) is an important characterization technique for LNPs, but recovering internal structure and size distribution from SAXS is an inverse problem with non-unique solutions. Realistic models are often too expensive for systematic exploration. We introduce a machine-learning-accelerated, differentiable framework for SAXS analysis of heterogeneous, polydisperse LNPs. The forward model combines a core-shell particle with a Gaussian random-field interior, a neural surrogate for the monodisperse SAXS map, and a differentiable layer integrating over particle-size distributions. The surrogate reduces prediction cost by four orders of magnitude, while differentiability enables large-scale multi-start fitting and ensemble identifiability analysis. Applied to synthetic and experimental MC3 LNP data, the framework shows that near-identical SAXS fits can arise from distinct parameter modes, with the experimental fits dominated by a trade-off between size-distribution and interior-structure parameters.
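The size-distribution layer described above amounts to a weighted average of monodisperse curves. The sketch below shows that averaging step under strong simplifying assumptions: the homogeneous-sphere form factor stands in for the neural surrogate, and the size distribution is a Gaussian truncated at four standard deviations. Neither choice is the paper's model.

```python
import math

def sphere_intensity(q, radius):
    """Monodisperse intensity of a homogeneous sphere (stand-in for the surrogate)."""
    x = q * radius
    if x < 1e-6:
        amplitude = 1.0  # limit of the form factor as q*R -> 0
    else:
        amplitude = 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return (volume * amplitude) ** 2

def polydisperse_intensity(q, mean_radius, sigma, n_nodes=201):
    """Average the monodisperse intensity over a truncated Gaussian in radius."""
    lo = max(mean_radius - 4.0 * sigma, 1e-9)
    hi = mean_radius + 4.0 * sigma
    radii = [lo + (hi - lo) * i / (n_nodes - 1) for i in range(n_nodes)]
    weights = [math.exp(-0.5 * ((r - mean_radius) / sigma) ** 2) for r in radii]
    total = sum(weights)
    return sum(w * sphere_intensity(q, r) for w, r in zip(weights, radii)) / total

q_min = 4.4934 / 30.0  # near the first form-factor minimum of a 30-unit sphere
sharp = sphere_intensity(q_min, 30.0)
smeared = polydisperse_intensity(q_min, 30.0, 3.0)
```

Because the average is a plain weighted sum, writing the same expression with array operations in an autodiff framework would let gradients flow to the distribution parameters, which is the property that multi-start fitting relies on. The example also shows polydispersity filling in the sharp minimum, one reason different parameter sets can give similar curves.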

2606.05199 2026-06-05 physics.comp-ph cs.AI

TGSD: Topology-Guided State-Space Diffusion for EEG Spatial Super-Resolution

Zijian Kang, Weiming Zeng, Yueyang Li, Shengyu Gong, Hongjie Yan, Wai Ting Siok, Nizhuan Wang

Affiliations: Lab of Digital Image and Intelligent Computation, Shanghai Maritime University; Department of Language Science and Technology, The Hong Kong Polytechnic University; Affiliated Lianyungang Hospital of Xuzhou Medical University

AI summary: This paper proposes TGSD, a topology-guided state-space diffusion framework that uses a Hierarchical Spatial Prior Encoder and a Conditional State-Space Diffusion Reconstructor to recover high-density signals from low-density EEG, outperforming baseline methods on the SEED and PhysioNet MM/I datasets.

Abstract

Low-density EEG is more suitable for wearable and IoT-based brain sensing, but sparse electrode sampling often lacks sufficient spatial information to characterize cross-regional neural activity. EEG spatial super-resolution aims to recover dense-channel EEG from sparse recordings, yet remains challenging because channel missingness typically occurs at the whole-channel level, spatiotemporal dependencies over the full electrode layout are often underexplored, and the mapping from sparse to dense signals is inherently ambiguous. To address these issues, we propose TGSD, a topology-guided state-space diffusion framework for EEG spatial super-resolution. TGSD first employs a Hierarchical Spatial Prior Encoder to learn topology-aware priors over the complete electrode layout by integrating local geometric relationships with region-level contextual information. Based on these priors and sparse observations, a Conditional State-Space Diffusion Reconstructor progressively generates missing-channel signals through reverse diffusion, while alternating temporal and channel-wise state-space modeling captures long-range temporal dynamics and inter-channel dependencies in a unified framework. Experiments on the SEED and PhysioNet MM/I datasets show that TGSD consistently outperforms representative baselines under different super-resolution factors in both reconstruction fidelity and downstream classification performance. These results demonstrate the effectiveness of combining topology-aware spatial priors with conditional diffusion for enhancing practical low-density EEG sensing in wearable and IoT scenarios. The official implementation code is available at https://github.com/jtggz/TGSD.

2606.03067 2026-06-05 stat.ML cs.LG

Trajectory-Aware Node Contributions and the Limits of Static Controllability

Valentina Kuskova, Dmitry Zaytsev, Michael Coppedge

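Affiliations: University of California, Berkeley

AI summary: This paper proposes "emergent contribution" (EC), a finite-horizon measure of a node's dynamical leverage computed from the Jacobians of a differentiable model. EC reduces to average controllability in the linear time-invariant limit, and a phase diagram characterizes the conditions under which the two measures agree and diverge.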

Comments 11 pages, 1 figure

Abstract

A recurring data mining task in complex networks is to determine how individual nodes contribute to system behavior. Existing approaches rely on either static-graph centralities or control-theoretic quantities such as controllability Gramians, which assume linear, time-invariant dynamics. Estimated systems, however, are typically nonlinear and time-varying. We define "emergent contribution (EC)," a finite-horizon measure of a node's dynamical leverage: the metric-weighted energy of its impulse response accumulated along the system trajectory. Computed from the Jacobians of any differentiable model, EC is estimator-agnostic and reduces exactly to average controllability in the linear, time-invariant limit. Our contribution is a characterization of when the two measures agree and when they diverge. Using a controlled synthetic family with known ground-truth contribution, we construct a phase diagram spanning nonlinearity, regime structure, persistence, and perturbation amplitude. EC and average controllability agree under static or smoothly drifting dynamics and both track ground truth. Divergence emerges under persistent regime switching, is strongest under persistent sign reversal, and disappears when the sign reversal is removed. At extreme perturbation amplitudes, both measures degrade, identifying the limits of local linearization. We place five estimated real systems from several domains within this phase space. Their placement serves as a diagnostic of when EC provides information beyond static controllability and therefore justifies its additional computational cost. On one panel examined in depth, a twenty-seed retraining ensemble reveals a robust variance-leverage dissociation: nodes whose perturbations propagate widely despite low within-system variance, a pattern recovered by neither static centralities nor variance-based summaries.
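One reading of the impulse-response energy is sketched below. It assumes an identity metric, a unit impulse at a single node, and a sum that includes the initial impulse; the abstract does not fix these conventions. When every Jacobian equals the same matrix `A`, the quantity is the trace of the finite-horizon controllability Gramian for that node, that is, its average controllability.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def impulse_energy(jacobians, node):
    """Accumulated squared norm of a unit impulse at `node`.

    jacobians: list of n-by-n Jacobians along the trajectory, in time order.
    The impulse is propagated through each Jacobian in turn and the squared
    Euclidean norm of every intermediate state is summed.
    """
    n = len(jacobians[0])
    state = [1.0 if k == node else 0.0 for k in range(n)]
    energy = sum(s * s for s in state)
    for jac in jacobians:
        state = mat_vec(jac, state)
        energy += sum(s * s for s in state)
    return energy

# Hypothetical linear time-invariant system: the same Jacobian at every step.
A = [[0.5, 0.0], [0.4, 0.2]]
lti_energy = [impulse_energy([A] * 3, node) for node in range(2)]
```

For a nonlinear or regime-switching model the list would hold a different Jacobian per step, evaluated along the trajectory, and the result could then depart from the static value.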

2606.03091 2026-06-05 cs.IR cs.AI

BAHSD: Bridging the Long-tail Gap via Adaptive Distillation in Black-box Sequential Recommendation

Xi Zhou, Famin Wu, Mingming Li, Hongyue Zhang, Jiao Dai, Jizhong Han, Tao Guo

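Affiliations: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; Beijing Institute for General Artificial Intelligence, Beijing, China

AI summary: To handle the signal heterogeneity that long-tail distributions cause in black-box sequential recommendation, this paper proposes the BAHSD framework. It quantifies signal reliability with a multi-scale consistency probing mechanism and uses an adaptive hierarchical objective (dynamic-temperature KL divergence, ranking consistency, and InfoNCE contrastive learning) to mitigate preference solidification and improve noise robustness, with improvements of over 80% on tail users.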

Abstract

Sequential recommendation systems are widely adopted but often deployed as black-box APIs, which has driven recent interest in model extraction to replicate their capabilities locally. However, the long-tail distribution induces severe signal heterogeneity: dense head sequences trigger the solidification of teacher preference, biasing extraction toward local patterns, while sparse tail sequences yield flat, noisy predictions. Existing one-size-fits-all extraction overlooks this disparity, resulting in noise overfitting and suboptimal knowledge transfer. We propose BAHSD, a black-box adaptive distillation framework that handles signal heterogeneity via a multi-scale consistency probing mechanism to implicitly quantify signal reliability. Based on this, an adaptive hierarchical objective is designed: dynamic-temperature KL divergence mitigates preference solidification for high-confidence signals, while ranking consistency and InfoNCE contrastive learning provide noise-robust enhancement for low-confidence signals. BAHSD consistently outperforms baselines, achieving up to 4.98% gain over the teacher and 80%+ improvement on tail users, offering a plug-and-play solution for high-fidelity black-box recommendation extraction.
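The dynamic-temperature idea can be sketched as follows. The schedule used here, a temperature that grows linearly with the teacher's peak probability, is an assumption made for illustration; the abstract does not state how BAHSD sets the temperature, and the scores are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter output."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def dynamic_temperature(teacher_logits, base=1.0, scale=2.0):
    """Hypothetical schedule: soften more when the teacher is more peaked."""
    confidence = max(softmax(teacher_logits))
    return base + scale * confidence

def distill_loss(teacher_logits, student_logits):
    """KL between teacher and student, both softened by the same temperature."""
    t = dynamic_temperature(teacher_logits)
    return kl_divergence(softmax(teacher_logits, t), softmax(student_logits, t))

# Hypothetical item scores for one head-user sequence.
loss = distill_loss([5.0, 0.0, 0.0], [0.0, 0.0, 5.0])
```

Softening a peaked teacher distribution exposes the relative ordering of lower-ranked items, which is one plausible way to counter the preference solidification the abstract describes.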

2606.00804 2026-06-05 cs.MA cs.AI cs.CL

Dynamic Coordination Strategy Selection for Enterprise Multi-Agent Systems

Thanh Luong Tuan

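Affiliations: Golden Gate University; Foundation AgenticOS (FAOS)

AI summary: Through a large-scale experiment, this paper evaluates whether enterprise multi-agent systems should select coordination strategies dynamically by problem class. It finds that dynamic routing works as a calibrated default but that no single best strategy can be identified.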

Comments 13 pages, 4 appendix. Code and data: https://github.com/frank-luongt/faos-research/tree/main/RA-1

Abstract

Enterprise multi-agent systems increasingly expose multiple coordination patterns, but deployments often lack evidence for when to use consensus, debate, synthesis, or a simpler single-agent workflow. This paper evaluates whether coordination strategy should be selected dynamically by problem class rather than fixed globally. We run a frozen matrix of 30 enterprise tasks spanning six industries, five problem classes, four execution conditions, three replications per cell, and four model arms: qwen_local, sonnet, gemma_openrouter, and an auxiliary openai cloud-validation arm. All 1,440 generated outputs are judged by a fixed Sonnet rubric. The main finding is bounded and operationally useful, but it is not the original strict H1. The pre-registered exact-winner/CI criterion is not supported: exact winner identity is unstable across model arms, and several predicted strategies are close to, but not above, the best observed alternative. A weaker near-best routing claim is strongly supported. In every pre-registered model arm and problem class, and again in the auxiliary OpenAI validation arm, the predicted strategy is within 0.10 quality-score points of the best observed condition. Structured compliance verification is the clearest exception to the original mapping: all arms favor single_agent rather than consensus. A pre-registered Kendall's W test finds no reliable difference between Vietnamese-domain and English-domain tasks in how consistently the four coordination conditions are ranked (mean W of 0.20 in both strata; signed-rank p = .85), so H2 is not supported. We conclude that enterprise coordination policy should use dynamic routing as a calibrated default, not as a deterministic winner-selection law.
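Kendall's W, the concordance statistic used in the H2 test, can be computed with the standard no-ties formula W = 12S / (m^2 (n^3 - n)), where m is the number of rankings, n the number of ranked items, and S the sum of squared deviations of the rank sums from their mean. The rankings below are hypothetical and only illustrate the two extremes.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for untied rankings.

    rankings: m lists, each assigning ranks 1..n to the same n items
    (here, the four coordination conditions).
    Returns a value in [0, 1]; 1 means every ranking is identical.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[j] for r in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical rankings of (single_agent, consensus, debate, synthesis) by task.
full_agreement = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
opposed = [[1, 2, 3, 4], [4, 3, 2, 1]]
w_high = kendalls_w(full_agreement)
w_low = kendalls_w(opposed)
```

A mean W of 0.20, as reported for both language strata, sits much closer to the `opposed` end of this scale than to full agreement.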

2605.27991 2026-06-05 stat.ML cs.LG

Gradient-Flow Optimization as Dynamic Random-Effects Inference: Testing and Early Stopping with Applications to Deep Learning

Deep Neural Network Training as Random Effects: An Optimization-Inference Duality

Minhao Yao, Ruoyu Wang, Xihong Lin, Lin Liu, Zhonghua Liu

Affiliations: Centre for Biomedical Data Science, Duke-NUS Medical School, National University of Singapore; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Institute of Natural Sciences, MOE-LSC, School of Mathematical Sciences, CMA-Shanghai, SJTU-Yale Joint Center of Biostatistics and Data Science, Shanghai Jiao Tong University; Department of Biostatistics, Columbia University, New York, NY, USA

AI summary: This paper shows that deep neural network training is equivalent to a classical random-effects model, revealing an optimization-inference duality, and uses restricted maximum likelihood estimation to obtain a likelihood-based early stopping rule.

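AI abstract (translated from Chinese)

Deep neural networks (DNNs) have achieved remarkable empirical success, yet their training dynamics are understood mainly through optimization rather than statistical principles. This paper establishes a statistical framework for DNN training in the overparameterized regime by proving that the predictions produced by continuous-time neural tangent kernel (NTK) gradient flow are exactly equivalent to those of a classical random-effects model. In this framework, training time acts as a variance component, or equivalently as an empirical-Bayes covariance hyperparameter, governing how variation is allocated from noise to structured signal. The equivalence reveals an optimization-inference duality: the gradient-flow path is both an optimization trajectory and an empirical-Bayes random-effects inference path. Conditional on training time, the network output is the posterior mean of the latent signal, and estimating training time by restricted maximum likelihood (REML) turns early stopping into likelihood-based empirical-Bayes inference rather than external tuning. This perspective yields a two-stage inferential procedure. First, a variance-component test determines whether DNN training captures statistically significant structure beyond initialization. Second, conditional on training being warranted, REML provides a likelihood-based early stopping rule. The resulting stopping time has a spectral interpretation in the NTK eigenbasis, in which training continues until the spectral losses become decorrelated. We further prove that REML-guided early stopping attains asymptotically optimal prediction error for in-sample prediction under fixed design and, under additional random-design regularity conditions, for out-of-sample prediction as well. This work reframes DNN training as statistical inference and provides a principled basis for deciding whether, and for how long, to train a deep neural network.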

Abstract

Gradient-flow optimization is usually viewed as an algorithmic procedure for minimizing empirical loss, with training duration selected by validation or heuristic early-stopping rules. We develop a statistical inference framework for the gradient-flow training trajectory itself. The central object is fixed-operator squared-error gradient flow: whenever the fitted value evolves through a time-invariant positive semidefinite training operator, the trained model output at each training time is exactly equivalent to the best linear unbiased predictor, or empirical-Bayes posterior mean, under a corresponding random-effects model. Under this representation, training time becomes a variance-component parameter governing how variance is reallocated from residual noise to structured signal. This turns two basic training decisions into inferential problems. First, whether training is needed is formulated as a variance-component test for signal beyond initialization. Second, how long to train is formulated as restricted maximum likelihood (REML) estimation of the training-time variance component. The resulting REML-guided early stopping rule has a spectral interpretation: it selects the training time at which optimized spectral losses become empirically decorrelated from the eigenvalues of the training operator, yielding an effective degrees-of-freedom measure for the evolving trained model. We establish asymptotic prediction optimality for fixed-design in-sample risk and, under additional kernel regularity conditions, random-design out-of-sample risk. Deep learning models in fixed-kernel gradient regimes provide canonical modern-AI instantiations of the theory. Numerical experiments and a UK Biobank proteomics application show that the proposed inferential approach attains competitive prediction accuracy while reducing the reliance on validation splits and repeated checkpoint evaluation.
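The spectral view of fixed-operator gradient flow can be sketched in a few lines. The sketch assumes the flow df/dt = -K(f - y) started from f = 0 with a positive semidefinite K, and works directly in the eigenbasis of K, where each coefficient of y is scaled by 1 - exp(-t * eigenvalue). The scaling of K, the initialization, and the trace-based degrees-of-freedom definition are assumptions; the paper may use different conventions.

```python
import math

def shrinkage(eigenvalue, t):
    """Fraction of a target coefficient fitted after training time t."""
    return 1.0 - math.exp(-t * eigenvalue)

def fitted_coefficients(eigenvalues, targets, t):
    """Closed-form fitted values in the eigenbasis of the training operator."""
    return [shrinkage(lam, t) * y for lam, y in zip(eigenvalues, targets)]

def effective_dof(eigenvalues, t):
    """Assumed definition: trace of the linear map from targets to fitted values."""
    return sum(shrinkage(lam, t) for lam in eigenvalues)

def euler_flow(eigenvalue, target, t, steps=10000):
    """Numerically integrate df/dt = -eigenvalue * (f - target) from f = 0."""
    f, dt = 0.0, t / steps
    for _ in range(steps):
        f += dt * (-eigenvalue * (f - target))
    return f

# Hypothetical spectrum: one strong direction and one weak direction.
eigs, y_coeffs = [2.0, 0.1], [1.0, 1.0]
early = fitted_coefficients(eigs, y_coeffs, t=1.0)
late = fitted_coefficients(eigs, y_coeffs, t=50.0)
```

Directions with large eigenvalues are fitted first, so stopping early acts as a spectral filter. That is the sense in which training time controls how much of the target is attributed to structured signal rather than noise.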

2605.26179 2026-06-05 cond-mat.mtrl-sci cs.AI cs.CE

AutoDFT: A Closed-Loop Multi-Agent Framework for Autonomous DFT Calculations

Penghui Yang, Zhonghan Zhang, Yue Li, Xinrun Wang, Yanchen Deng, Yuhao Lu, Bijun Tang, Zheng Liu, Bo An

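Affiliations: Nanyang Technological University, Singapore; Singapore Management University

AI summary: This paper proposes AutoDFT, a closed-loop multi-agent framework that embeds LLM reasoning across the full lifecycle of DFT calculations, enabling autonomous adaptation from planning to execution. It reaches a 94.1% task success rate on the VASPBench benchmark and reliably predicts electronic, magnetic, and energetic properties.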

Abstract

Density functional theory (DFT) serves as the basis for computational discovery in materials science and chemistry, yet each calculation demands extensive human effort: adjusting algorithms when convergence stalls, revising plans when unexpected physics emerges, and inserting steps as intermediate results reshape the problem. Existing LLM-based agents automate only the initial planning stage, producing a full execution plan upfront and leaving all subsequent adaptation to hand-crafted rules. As a result, these workflows remain fragile, do not generalize well beyond pre-planned scenarios, and often require expert intervention when failures or unexpected intermediate results require changes to the calculation path. Here, we introduce AutoDFT, a closed-loop multi-agent framework that embeds LLM reasoning into every stage of the DFT lifecycle, where a strategic planner produces a skeletal plan of step objectives; a step planner generates numerical parameters just in time from preceding results; and a monitor-recover-reflect cycle diagnoses failures, repairs them, and revises the plan when the evidence justifies it. We demonstrate both breadth and depth: breadth on VASPBench, a purpose-built benchmark spanning 34 tasks and 9 DFT calculation types, where AutoDFT achieves 94.1% task-level success with GPT-5.2; and depth on established materials databases, where AutoDFT produces quantitatively reliable property predictions across electronic, magnetic, and energetic properties. By closing the loop between planning and execution, AutoDFT enables experimentalists without deep computational expertise to obtain reliable first-principles results.

2605.29916 2026-06-05 cs.NE cs.AI cs.DS math.OC

Selection Hyper-heuristics Can Automatically Adjust the Learning Period to Optimally Solve Pseudo-Boolean Problems

Benjamin Doerr, Pietro S. Oliveto, John Alasdair Warwicker

Affiliations: Laboratoire d’Informatique (LIX), CNRS, École Polytechnique, Institut Polytechnique de Paris; Department of Computer Science and Engineering, Southern University of Science and Technology; School of Computing & Communications, Lancaster University Leipzig

AI summary: This paper proposes a hyper-heuristic that sets its learning-period parameter automatically and proves that it selects the optimal neighbourhood size in a $1-o(1)$ fraction of the iterations, thereby optimising the LeadingOnes benchmark in optimal time (apart from lower-order terms).

Comments To appear in "Artificial Intelligence"

DOI: 10.1016/j.artint.2026.104560
Journal ref: Artificial Intelligence 357:104560 (2026)

Abstract

The Random Gradient hyper-heuristic was recently shown to be able to learn the optimal neighbourhood size when optimising the LeadingOnes benchmark via the Randomised Local Search (RLS) meta-heuristic. However, for this to happen, a learning period of a certain length $τ$ had to be used, unlike classic hyper-heuristics, which change their behaviour based on the success of only the previous iteration. In this paper, we show how to automatically set this new parameter value, relieving the user from the non-trivial task of controlling this novel algorithm parameter. We prove that the resulting hyper-heuristic selects the optimal neighbourhood size in a $1-o(1)$ fraction of the iterations and, consequently, optimises the LeadingOnes benchmark in the best possible time (apart from lower-order terms) achievable with these neighbourhood sizes.
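For orientation, the low-level components can be sketched as follows: the LeadingOnes fitness function and Randomised Local Search with a fixed neighbourhood size k. This is only the building block that a hyper-heuristic would choose between; the learning-period mechanism analysed in the paper is not reproduced. The iteration cap is an added safeguard, since a fixed even k can stall just below the optimum.

```python
import random

def leading_ones(bits):
    """Number of consecutive 1-bits counted from the left."""
    count = 0
    for b in bits:
        if b != 1:
            break
        count += 1
    return count

def rls_k(n, k, rng, max_iters=100000):
    """Randomised Local Search that flips exactly k distinct bits per iteration.

    Accepts the offspring when its fitness is not worse than the parent's.
    Returns the final bit string and the number of iterations used.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    iterations = 0
    while leading_ones(x) < n and iterations < max_iters:
        y = x[:]
        for i in rng.sample(range(n), k):
            y[i] ^= 1
        if leading_ones(y) >= leading_ones(x):
            x = y
        iterations += 1
    return x, iterations

solution, used = rls_k(n=20, k=1, rng=random.Random(0))
```

A selection hyper-heuristic would switch among several values of `k` during the run, based on which neighbourhood size has recently produced improvements.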

2605.29054 2026-06-05 cs.SE cs.CL

Converted, Not Equivalent: Benchmarking Codebase Conversion via Observational Equivalence

Linxin Song, Jiefeng Chen, Yue Huang, Bhavana Dalvi Mishra, Chi Wang, Jieyu Zhao, Jinsung Yoon, Tomas Pfister

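Affiliations: University of Southern California; Google Cloud AI Research; University of Notre Dame; Google DeepMind

AI summary: To address coding agents that over-trust local validation during codebase conversion and thereby violate semantic contracts, this paper proposes the T2J-Bench benchmark, which evaluates conversion quality under a fixed equivalence contract with three verification stages (Spec, Numeric, Behavioral). The best system reaches an overall pass rate of only 26.7-28.9%, and all systems overestimate their success by 66.6-97.8 points.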

Abstract

Coding agents increasingly act as codebase-scale collaborators that can assist with codebase conversion, but this progress has exposed a critical weakness: agents often over-trust their own local validation routines and declare success on artifacts that satisfy surface checks while violating the semantic contracts users actually care about. This problem is especially acute in codebase conversion, where prior evaluation is largely outcome-driven and therefore unstable: two implementations can match on a shallow outcome, such as a single forward loss, while diverging in gradients, optimizer behavior, or short-horizon training dynamics. We introduce T2J-Bench, a benchmark for codebase conversion that reformulates conversion as transfer under a fixed equivalence contract. A fixed verifier then compares source and converted codebases through three ordered stages: Spec (interface admissibility), Numeric (forward outputs, losses, gradients, and objective-specific tensors), and Behavioral (short training dynamics under fixed seeds). Across 355 blind conversion attempts, the best system reaches only 26.7–28.9% overall pass rate despite Spec pass rates up to 91.1%; a 4.7x token-budget spread yields only a 2.2x pass-rate spread; and all systems overestimate success by 66.6–97.8 points relative to the fixed evaluator. This suggests that failures stem more from contract-misaligned self-validation than from limited budget or backbone strength.
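The ordered-stage idea can be illustrated with a toy verifier for scalar loss functions. Everything here is a simplified assumption: the stage names mirror the abstract, but the checks (callability, finite-difference gradients, ten gradient-descent steps) are stand-ins, not the T2J-Bench contract.

```python
def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def verify(source, converted, probe_points, tol=1e-4):
    """Run the stages in order and return the first failing stage, or 'pass'."""
    # Spec: both artifacts must expose the expected callable interface.
    if not (callable(source) and callable(converted)):
        return "spec"
    # Numeric: forward values first, then gradients, at each probe point.
    for x in probe_points:
        if abs(source(x) - converted(x)) > tol:
            return "numeric:forward"
        if abs(numeric_grad(source, x) - numeric_grad(converted, x)) > tol:
            return "numeric:gradient"
    # Behavioral: short gradient-descent runs from the same starting point.
    xs = xc = 3.0
    for _ in range(10):
        xs -= 0.1 * numeric_grad(source, xs)
        xc -= 0.1 * numeric_grad(converted, xc)
        if abs(xs - xc) > tol:
            return "behavioral"
    return "pass"

def source_loss(x):
    return x * x

def converted_loss(x):
    return abs(x)  # matches source_loss at x = 1.0 but has a different gradient

shallow_match = abs(source_loss(1.0) - converted_loss(1.0)) < 1e-12
verdict = verify(source_loss, converted_loss, [1.0])
```

The two losses agree on the single forward value, so a shallow outcome check would accept the conversion, while the staged verifier rejects it at the gradient check.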

2605.23809 2026-06-05 eess.SY cs.LG cs.SY

Advanced AI Service Provisioning in O-RAN through LLM Engine Integration

Seyed Bagher Hashemi Natanzi, Pranshav Gajjar, Bo Tang, Vijay K. Shah

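Affiliations: Department of Electrical and Computer Engineering, Worcester Polytechnic Institute; Department of Electrical and Computer Engineering, North Carolina State University

AI summary: This paper proposes a dual-brain architecture that combines the reasoning ability of an LLM with the real-time responsiveness of a lightweight ML engine to automate the deployment and configuration of AI services in O-RAN.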

2604.15524 2026-06-05 eess.SY cs.RO cs.SY

Safe and Energy-Aware Multi-Robot Density Control via PDE-Constrained Optimization for Long-Duration Autonomy

Longchen Niu, Andrew Nasif, Gennaro Notomista

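Affiliations: Department of Electrical and Computer Engineering, University of Waterloo

AI summary: This paper proposes a density control framework that combines the Fokker-Planck partial differential equation with control Lyapunov and control barrier functions to achieve target density tracking, obstacle avoidance, and energy sustainability in multi-robot systems.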

2605.21557 2026-06-05 stat.ML cs.AI cs.LG

DPU or GPU for Accelerating Neural Network Inference: Why Not Both? Split CNN Inference

Ali Emre Oztas, Mahir Demir, James Garside, Mikel Luján

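Affiliations: The University of Manchester

AI summary: This paper proposes partitioning CNN inference across a DPU and a GPU to reduce latency. The DPU processes the initial layers and the GPU the remaining layers, and a GNN-based method predicts the partition index; the combination achieves larger efficiency gains than DPU-only or GPU-only execution.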

Abstract

Video and image streaming on edge devices requires low latency. To address this, Neural Networks (NNs) are widely used, and prior work mainly focuses on accelerating them with single hardware units such as Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs), and Deep Learning Processing Units (DPUs). However, further reductions in latency can be obtained by combining these units. In this paper, partitioning CNN inference across DPU and GPU (Split CNN Inference) is proposed. The first partition, which consists of the initial CNN layers that process the input images, runs on the AI engines (DPU) of a Versal VCK190. The DPU processes this partition near the source of the data. Pipelined asynchronously, a GPU (NVIDIA RTX 2080) runs the remaining layers as the second partition, with reduced data transfer between the data source (storage/camera) and the GPU. Furthermore, a Graph Neural Network (GNN)-based partition index prediction method is proposed to automate the partitioning of CNNs needed for Split Inference. Well-established models such as LeNet-5, ResNet18/50/101/152, VGG16, and MobileNetv2 are analyzed. Results demonstrate up to 2.48x latency improvement over DPU-only execution and up to 3.37x over GPU-only execution. The trained GNN model splits the layers between the appropriate devices with 96.27% accuracy.
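Choosing a partition index can be framed as a small search problem. The sketch below uses an assumed toy cost model: with the two devices pipelined, the per-frame cost is the slower of the DPU stage and the transfer-plus-GPU stage. It finds the best index by brute force over hypothetical per-layer timings. The paper instead predicts the index with a GNN, which this sketch does not reproduce.

```python
def best_split(dpu_ms, gpu_ms, transfer_ms):
    """Return (k, cost): layers [0, k) run on the DPU, layers [k, n) on the GPU.

    dpu_ms[i], gpu_ms[i]: time of layer i on each device, in milliseconds.
    transfer_ms[k]: time to move the tensor crossing split k to the GPU
    (k = 0 is the raw input, k = n is the final output).
    cost: duration of the slower pipeline stage under the toy model.
    """
    n = len(dpu_ms)
    best_k, best_cost = 0, float("inf")
    for k in range(n + 1):
        dpu_stage = sum(dpu_ms[:k])
        gpu_stage = transfer_ms[k] + sum(gpu_ms[k:])
        cost = max(dpu_stage, gpu_stage)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

# Hypothetical timings for a three-layer network (not measured values).
dpu_ms = [1.0, 1.5, 4.0]
gpu_ms = [2.0, 0.5, 0.5]
transfer_ms = [3.0, 1.0, 0.5, 0.1]
split_index, stage_cost = best_split(dpu_ms, gpu_ms, transfer_ms)
```

With these numbers the best choice keeps only the first layer on the DPU: the raw input is expensive to ship to the GPU, while the activation after the first layer is cheap to transfer.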
