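arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.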

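2606.10975 2026-06-10 cs.LG eess.SP math.OC New submission

Learning Doubly Sparse Explicitly Conditioned Transforms

Tudor Pistol

AI summary: Proposes a method for learning structured, explicitly conditioned transforms, formulated as the product of a fixed canonical matrix and an adaptive sparse component. It keeps the advantages of fast, stable analytical transforms while adding controllable adaptivity; experiments show state-of-the-art results on the doubly sparse transform learning problem.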

Comments 10 pages, 1 figure, 1 table. Accepted for publication in Procedia Computer Science (30th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems - KES 2026; Invited Session: Global and Constrained Optimization: Algorithms and Applications)

Abstract

Finding convenient spaces in which certain hypotheses regarding an assumed sparse structure of natural signals hold true has become a desirable result in recent research, its implications being reflected in areas such as data compression, noise reduction and feature extraction. While the extensively used analytical transforms, such as DFT or DCT, already provide efficient algorithms and robust sparse representations, they assume a fixed prior about the data, failing to accurately capture the specific structure of more restrictive classes of signals. To address this, the concept of a data-adaptive, learnt transform has been introduced in the literature, allowing for the reduction of a residual term in the transform domain. More recent studies have shown that the condition number serves as a good metric in this context, where the desired outcome alternates between a generalizing tendency and one that achieves minimal approximation error. Motivated by these considerations, we introduce the learning of a structured, explicitly conditioned transform formulated as the product of a fixed canonical matrix and a refining data-adaptive sparse component. This approach seeks to preserve the advantages of fast and stable analytical transforms, while introducing controllable adaptivity to the data. No references that concern this specific formulation have been identified so far, indicating its novelty. The proposed algorithm is motivated within the framework of inexact proximal methods, leveraging a newly derived closed-form projection operator. Empirical observations demonstrate state-of-the-art results on the doubly sparse transform learning problem and comparable performance with its dense variant at significantly lower computational costs and sometimes faster convergence and better avoidance of bad local minima.

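2606.10944 2026-06-10 cs.LG cs.DS math.ST stat.ME stat.ML stat.TH New submission

Express Language Modeling

Albert Gong, Annabelle Michael Carrell, Raaz Dwivedi, Lester Mackey

AI summary: Proposes Express, a tool that converts non-causal attention approximations into causal ones; combined with Thinformer, it achieves optimal guarantees for causal attention and speeds up four resource bottlenecks in language modeling.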

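2606.10841 2026-06-10 cs.RO cs.SY eess.SY math.OC New submission

Gradient based Bilevel for Inverse Optimal Control, a Riemannian approach

Ahmed-Manaf Dahmani, Vincent Bonnet, David Daney, François Charpillet

AI summary: Proposes a Riemannian inverse optimal control method that treats the set of optimal trajectories as a manifold; optimizing on that manifold avoids the violations of standard constraint qualifications and cuts computation time by about a factor of four.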

Comments 6 Pages, 4 Figures. To be published in a control journal

Abstract

Inverse Optimal Control (IOC) aims to recover the cost function that explains observed trajectories as solutions of an optimal control problem. Classical IOC formulations rely on bilevel optimization, which repeatedly solves a nested optimal control problem and quickly becomes computationally prohibitive for realistic systems. Recent projection-based approaches offer a promising alternative but suffer from numerical instability when solved with gradient-based methods due to violations of standard constraint qualifications. In this paper, we show that these difficulties stem from the geometric structure of the IOC feasible set. We demonstrate that the set of trajectories satisfying the optimality conditions naturally forms a manifold and reformulate IOC as an optimization problem on this manifold. Based on this insight, we propose a Riemannian Inverse Optimal Control (RIOC) method that projects observed trajectories onto the manifold of optimal solutions while preserving feasibility by construction. Experiments on real human arm trajectories show that the proposed method achieves comparable or better reconstruction accuracy than classical bilevel IOC while reducing computation time by about a factor of four. These results highlight the potential of geometric optimization methods to improve the scalability and reliability of IOC for robotics and human motion analysis.

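2606.10824 2026-06-10 cs.LG math.AT New submission

Encoding the Euler Characteristic Transform

Nello Blaser, Odin Hoff Gardaa, Lars M. Salbu, Elena Xinyi Wang, Bastian Rieck

AI summary: Proposes a continuous encoding that turns each Euler characteristic curve into a sequence of per-vertex net changes, which a small transformer maps to a feature vector; it improves classification accuracy on five of six datasets.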

Abstract

The Euler Characteristic Curve (ECC) records the Euler characteristic of a linearly embedded cell complex as a function of filtration height in a given direction, and the Euler Characteristic Transform (ECT) is the injective shape descriptor obtained by collecting ECCs over many directions. How the ECT is encoded for a neural network is itself an inductive bias, conventionally fixed by discretizing each ECC. We introduce a continuous encoding: for each direction and each vertex it records the net Euler-characteristic change attributed to that vertex, producing a per-direction token sequence that a small transformer maps to a feature vector. We separate the resulting pipeline into two stages on orthogonal axes: an ECC encoder that acts within each direction, mapping its curve to a fixed-length vector, and an ECT representation that acts across directions, aggregating the per-direction vectors into one. We study six ECT representation architectures spanning a range of inductive biases, from a structure-agnostic feedforward baseline to convolutional and complex-valued models that preserve equivariance under planar rotations. Across six classification benchmarks covering point clouds, graphs, cubical complexes, and meshes, the continuous encoding improves accuracy on five of six datasets, and control experiments attribute the gain to the tokenization itself rather than to the added transformer capacity. The representation architecture matters less than the encoding, and the payoff from its inductive biases depends on the encoding: a feedforward network performs best under continuous encoding but is less robust under discretization than convolutional architectures.
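
The per-vertex tokens described in the abstract can be sketched for a small simplicial complex. This is an illustration only, not the authors' implementation: the function names are hypothetical, and it assumes a lower-star rule in which each simplex is attributed to its highest vertex in the chosen direction.

```python
from itertools import accumulate

def vertex_tokens(points, simplices, direction):
    """Return (height, net Euler-characteristic change) per vertex, sorted by height."""
    heights = [sum(p * d for p, d in zip(pt, direction)) for pt in points]
    change = [0] * len(points)
    for simplex in simplices:
        # Assumption (lower-star rule): a simplex appears together with its highest vertex.
        top = max(simplex, key=lambda v: heights[v])
        # A simplex with k + 1 vertices contributes (-1)^k to the Euler characteristic.
        change[top] += (-1) ** (len(simplex) - 1)
    return sorted(zip(heights, change))

def ecc(tokens):
    """The running sum of the per-vertex changes is the Euler characteristic curve."""
    return list(accumulate(c for _, c in tokens))

# Filled triangle: three vertices, three edges, one face.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
simplices = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
tokens = vertex_tokens(points, simplices, direction=(1.0, 1.0))
curve = ecc(tokens)
```

Repeating the call over many directions would give one token sequence per direction, which is the input shape the abstract describes for the per-direction encoder.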

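2606.10806 2026-06-10 cs.AI math.FA New submission

Moonshine: An Autonomous Mathematical Research Agent Centered on Conjecture Generation

Xiaoyang Chen, Xiang Jiang

AI summary: Presents Moonshine, an autonomous agent that extracts structure from classical problems, distills new concepts, and generates mathematical conjectures. Using the Jacobian conjecture as an example, it formulates a Neural Jacobian Conjecture and proves a special case.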

Abstract

Moonshine is an autonomous agent whose central objective is to generate mathematical conjectures. Its core capability is to extract structure from classical problems, distill new concepts, and formulate conjectures of mathematical significance. Rather than treating the solution of a single proposition as its endpoint, Moonshine builds an extensible theoretical framework through conjecture generation, bridge building, and obstacle identification. This article uses Moonshine's exploration of the Jacobian conjecture as an example. It shows how the central logic of whether local nondegeneracy can force global injectivity is transferred to one-hidden-layer affine-ridge sigmoid networks. This leads to the formulation of the Neural Jacobian Conjecture (NJC): if such a network has strictly positive Jacobian determinant on the whole space, then it must be globally injective. By invoking GPT-5.5-pro and DeepSeek-V4-pro separately, Moonshine obtained independent complete proofs for the case $N=n+1$. In addition, with the assistance of ChatGPT through interactive use of its web interface with GPT-5.5-pro, a geometric-topological proof was developed. These results provide preliminary evidence for the plausibility of the conjecture. The general higher-width case $N\ge n+2$, however, remains unresolved and is left for further investigation. This work illustrates Moonshine's ability to autonomously generate meaningful mathematical problems and make rigorous progress on them.

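2606.10583 2026-06-10 cs.LG cs.AI math.OC New submission

NOVA: Symbolic Regression Discovery of Interpretable Car-Following and Lane-Change Models with Driver Heterogeneity

Ishak Abassi, Nassim Ali Bouazzouni, Farah Ibelaiden, Nadir Farhi

AI summary: Proposes NOVA, a symbolic regression framework that automatically discovers interpretable car-following and lane-change structures from raw trajectory data; it outperforms baselines on the NGSIM datasets and links the dominant nonlinear term to an established psychophysical theory.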

Abstract

We present NOVA, an autonomous symbolic regression framework that identifies interpretable car-following and lane-change structures from raw trajectory data with minimal behavioral priors. Applied to 4,765,788 active driving observations from the NGSIM I-80 and US-101 datasets, NOVA's deterministic Rust-powered search engine evaluates over 10,000 candidate algebraic structures and identifies a compact two-term acceleration model under a forward-shifted rolling-mean prediction target. Evaluated under two complementary preprocessing pipelines, NOVA achieves RMSE = 1.376 m/s² (R² = 15.57%) on the intent-forecasting benchmark, outperforming the best recalibrated symbolic-regression baseline (SR-LLM, PNAS 2025) by 0.135 m/s² in RMSE under an identical evaluation protocol. Across eight independent experiments, a single dominant nonlinear term emerges as a robust backbone of human car-following; a residual-guided extension further links the selected structure to an established psychophysical theory of collision avoidance. The discovered feature operators transfer zero-shot between freeway sites with under 3 pp R² loss. Extended to lane-change modelling within a multinomial logit framework, NOVA achieves 67.4% balanced accuracy under strict vehicle-ID holdout on 502 unseen drivers, surpassing existing lane-changing baselines by +29.8 percentage points on a three-class problem.

2606.10321 2026-06-10 cs.LG cs.AI cs.RO math.OC New submission

Baseline-Free Policy Optimization for Neural Combinatorial Optimization

Carlos S. Sepúlveda, Gonzalo A. Ruz

AI summary: Uses GRPO to remove the baseline dependence in neural combinatorial optimization, avoiding training collapse and reaching performance close to POMO on TSP and CVRP.

Abstract

Neural combinatorial optimization (NCO) trains autoregressive policies to solve routing problems. The standard training algorithm, REINFORCE with a rollout baseline, requires maintaining and periodically updating a frozen copy of the policy for variance reduction. This baseline introduces a structural vulnerability: on harder instances, a poor baseline produces noisy gradient estimates that can destabilize training. We evaluate Group Relative Policy Optimization (GRPO), an algorithm from large language model alignment that eliminates the baseline entirely by normalizing advantages within groups of sampled trajectories. In a controlled comparison of five RL algorithms on TSP and CVRP benchmarks within the RL4CO framework, we find that: (i) GRPO avoids the training collapse observed with REINFORCE on TSP-100, where performance degrades from cost 9.8 to 52.1 immediately after the warmup phase and does not recover under extended training; (ii) at matched gradient updates, GRPO achieves solution quality within 2% of POMO, a strong AM-based multi-start baseline, while requiring no external baseline; and (iii) P3O, a pairwise preference algorithm also from the alignment literature, is competitive on TSP but shows higher variability on CVRP. These results identify GRPO as a promising baseline-free alternative for NCO, particularly in settings where baseline-dependent training becomes fragile.
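
The group normalization that replaces the rollout baseline can be sketched in a few lines. This is a minimal illustration, not the paper's RL4CO code: the function name is hypothetical, and the use of the population standard deviation and the small `eps` stabilizer are assumptions.

```python
from statistics import mean, pstdev

def group_relative_advantages(costs, eps=1e-8):
    """Normalize rewards within one group of tours sampled for the same instance."""
    # Routing problems minimize cost, so the reward is taken as the negative tour cost.
    rewards = [-c for c in costs]
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; a sample std is also plausible
    # No frozen policy copy is needed: the group's own statistics act as the baseline.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four tours sampled by the current policy for one TSP instance (made-up costs).
advantages = group_relative_advantages([9.8, 10.4, 11.1, 9.9])
```

Each advantage would then weight the log-probability gradient of its trajectory; tours shorter than the group average get positive weight and longer ones get negative weight.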

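2606.10289 2026-06-10 cs.RO cs.NA math.NA New submission

Improved Representation of Matrix Lie Group Operations through Tensor Notation

Clark Taylor

AI summary: Introduces tensor and Einstein summation notation to simplify the representation of matrix Lie groups in Lie derivative computations, making gradient computations in estimation frameworks clearer.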

Comments 12 pages, 4 figures + graphical abstract, 1 algorithm, 4 tables

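2606.10085 2026-06-10 cs.LG eess.SP math.OC New submission

Structured Adaptive Tensor Prediction for Streaming Data

Zhen Qin, Yang Chen

AI summary: For streaming prediction of matrix-valued time series, proposes an adaptive tensor regression framework with Matrix-on-Matrix and Tensor-on-Matrix formulations and develops online SGD algorithms; the Tensor-on-Matrix model achieves lower steady-state error and better denoising, and recovery guarantees are established under low-dimensional structure.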

Abstract

Matrix-valued time series arise in a wide range of applications, such as spatio-temporal data from medical imaging and geophysics. Existing methods are mainly designed for static settings and lack adaptability to streaming and time-varying environments. Adaptive filtering techniques have also been largely limited to data with scalar or vector values, leaving adaptive forecasting for matrix-valued time series inadequately understood. To bridge these gaps, we develop an adaptive tensor regression framework that includes Matrix-on-Matrix (MoM) and Tensor-on-Matrix (ToM) formulations for streaming matrix-valued prediction. The two formulations differ in whether to directly model matrix-valued outputs or to exploit temporal structure via higher-order tensor representations. For the proposed tensor regression framework, we develop stochastic gradient descent (SGD) algorithms for online learning. We show that stacking multiple responses across time into higher-order tensors improves performance; in particular, the ToM achieves lower steady-state error and stronger denoising capability than MoM, motivating our focus on the ToM model. We further characterize the tracking behavior of SGD under time-varying dynamics. From a statistical perspective, we establish fixed-time recovery guarantees for ToM under general low-dimensional structures, including sparsity, low-rankness, and their joint sparse low-rank models.

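2606.10874 2026-06-10 cs.CV math.QA quant-ph New submission

Schmidt Decomposition-Based Methods for Efficient Quantum Image Encoding

Ana-Maria Pangeva, Yassine Ferhi, Alexander Geng, Andreas Weinmann, Desislava Ivanova, Ali Moghiseh

AI summary: To address the excessive circuit complexity of quantum image encoding on NISQ devices, proposes a low-rank approximation based on Schmidt decomposition that substantially reduces circuit depth and gate count while preserving image quality; the FRQI model achieves a 97% depth reduction with an MSE of only about 0.27.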

Abstract

In quantum image processing, a fundamental step is encoding classical image data into quantum states. This can be achieved using methods such as Flexible Representation of Quantum Images (FRQI), Quantum Probability Image Encoding (QPIE), and Novel Enhanced Quantum Representation (NEQR). However, on real quantum hardware, these encodings can quickly lead to circuits with many gates, large circuit depth, and high qubit usage, which is a problem for Noisy Intermediate-Scale Quantum (NISQ) devices. In this work, we investigate whether low-rank state approximation, formulated via Schmidt decomposition, can help reduce this complexity. The method keeps only the most significant parts of a quantum state's entanglement structure, making state preparation more efficient while preserving most of the image information. We compare the three encoding techniques in their original form and with low-rank approximation, evaluating metrics such as circuit depth, CNOT count, MSE, and visual quality of reconstructed images. The results reveal meaningful trade-offs between accuracy and resource efficiency, with the FRQI model achieving a 97 percent reduction in circuit depth while maintaining a near-perfect reconstruction (MSE of about 0.27). This demonstrates the potential of low-rank techniques for advancing practical quantum image processing on near-term hardware.
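
The low-rank state approximation can be illustrated classically: reshaping a bipartite state vector into a matrix and taking its SVD yields the Schmidt coefficients, and keeping the largest ones gives the truncated state. The sketch below is an assumption-laden toy, not the paper's pipeline: the function name, the 4x4 test image, and the amplitude-style normalization are all hypothetical choices, and no circuit is built.

```python
import numpy as np

def schmidt_truncate(state, dim_a, dim_b, rank):
    """Keep the `rank` largest Schmidt terms of a bipartite state and renormalize."""
    matrix = np.asarray(state, dtype=complex).reshape(dim_a, dim_b)
    # The singular values of the reshaped state are its Schmidt coefficients.
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vh[:rank, :]
    approx = approx.reshape(-1)
    return approx / np.linalg.norm(approx), s

# Toy 4x4 image stored as normalized amplitudes (assumption: amplitude-style encoding).
image = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0])
state = image.reshape(-1) / np.linalg.norm(image)
approx, schmidt_coeffs = schmidt_truncate(state, 4, 4, rank=1)
fidelity = abs(np.vdot(state, approx)) ** 2
```

This toy image is an outer product, so a single Schmidt term reproduces it exactly; a generic image would lose fidelity as `rank` decreases, which is the accuracy versus resource trade-off the abstract reports.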

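2606.09857 2026-06-10 cs.LG physics.comp-ph New submission

Uncertainty-aware Multi-fidelity Closure via Conditional Normalizing Flows

Jice Zeng, Shady E. Ahmed, David Barajas-Solano, Panos Stinis

AI summary: Proposes an uncertainty-aware multi-fidelity framework based on conditional normalizing flows that addresses the closure problem in reduced-order models by learning a probabilistic mapping from low-fidelity to high-fidelity coefficients; on a vortex merging problem, residual learning outperforms direct learning.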

Abstract

Reduced-order models (ROMs) provide an efficient surrogate for complex multiscale systems, but their predictive accuracy is often compromised by truncation errors and the inadequate representation of interactions between resolved and unresolved scales. The missing effect of truncated (unresolved) scales on ROM (resolved) scales is often denoted as the closure problem. In this work, we formulate ROM closure modeling as a multi-fidelity (MF) learning problem and propose an uncertainty-aware MF framework based on conditional normalizing flows to enhance ROM predictive accuracy. The proposed approach learns a probabilistic mapping from low-fidelity (LF) ROM coefficients to high-fidelity (HF) coefficients, thereby improving predictive fidelity while quantifying the uncertainty associated with the learned closure. Two correction strategies are investigated: direct learning, in which HF coefficients are predicted directly from LF inputs, and residual learning, which learns the discrepancy between LF and HF coefficients and uses it to recover the corrected HF solution. The framework is demonstrated on a vortex merging problem governed by the two-dimensional Navier-Stokes equations. Results show that both correction strategies improve ROM accuracy over the uncorrected ROM, with residual learning achieving consistently better performance than direct learning. Moreover, the two proposed deep generative model-based strategies provide uncertainty quantification for the corrected ROM coefficients, which is critical for assessing prediction confidence and supporting the reliable use of ROMs in practical applications.

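2606.09950 2026-06-10 cs.LG nucl-th physics.comp-ph physics.data-an New submission

Integrating Out, Twice: The Open-System Case That Neural-Network Ensemble Theory Is Missing

Jin Lei

AI summary: Shows that averaging a neural network over its parameters is equivalent to Gaussian marginalization and notes that ensemble theory covers only the closed-system case, with the open-system case missing. Borrowing from nuclear reaction theory, it describes open systems through a non-Hermitian effective generator, tests the idea on attention maps and other objects, finds a mostly negative result, and explains its structural cause.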

Abstract

Averaging a neural network over its random parameters and marginalizing a Gaussian sector are the same operation, the Schur complement of the eliminated block, and when that block is closed it returns a covariance and its inverse. That is all a network ensemble produces, the closed case. The open case is missing, and nuclear reaction theory has it worked out. Projecting a scattering problem onto a chosen set of channels, with the rest carrying probability irreversibly to a continuum, leaves a non-Hermitian effective generator that conserves and itemizes exactly what it loses: the nuclear optical model and its generalized optical theorem. I set the two cases side by side using only the moments of a distribution, the algebra of Gaussians, and block inversion, no field theory, and give the closed-case dictionary in full: the neural tangent kernel is the Fisher sensitivity kernel, the infinite-width Gaussian limit is the Gaussian-process emulator, and the lazy-to-feature transition is the validity boundary of a reduced-basis emulator. I then test the open export on a truncated attention map, a token-level transfer operator, and a sparse expert router, and report a mostly negative result. The conserved flux ledger ports wherever openness is genuinely present, but its distinctive content is absent, an artifact of the chosen partition, or pinned near a floor by the training objective, and the operationally useful uncertainty turns out to be epistemic, living in the closed half of the correspondence, not the open one. The negative has a structural reason this note makes precise: the open case needs an eliminated sector with a continuous spectrum and wave-like, not relaxational, dynamics, which mainstream learning's finite or dissipative objects do not supply. This is a note, not a result; its main finding is that negative one, and its value is the map that locates it.
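
The closed-case identity in the first sentence is standard Gaussian algebra and can be checked numerically: eliminating a block from a precision matrix leaves its Schur complement, whose inverse is the covariance of the retained block. The sketch below uses a random positive-definite matrix; the variable names and block sizes are illustrative assumptions, not taken from the note.

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.normal(size=(5, 5))
precision = m @ m.T + 5.0 * np.eye(5)  # symmetric positive-definite precision matrix
keep, drop = slice(0, 2), slice(2, 5)   # retained block and eliminated block

def marginal_precision(precision, keep, drop):
    """Schur complement of the eliminated block within the precision matrix."""
    laa = precision[keep, keep]
    lab = precision[keep, drop]
    lbb = precision[drop, drop]
    return laa - lab @ np.linalg.solve(lbb, lab.T)

schur = marginal_precision(precision, keep, drop)
# Marginalizing a Gaussian keeps the matching block of the full covariance matrix.
covariance_kept = np.linalg.inv(precision)[keep, keep]
agree = np.allclose(np.linalg.inv(schur), covariance_kept)
```

The check covers only the closed, Hermitian case; the open case discussed in the abstract involves a non-Hermitian effective generator and is not modeled here.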

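2606.11171 2026-06-10 cs.LG cond-mat.stat-mech cs.IT math.IT math.OC math.ST stat.TH New submission

Algorithmic and Minimax Complexities in Kernel Bandits

Yunbei Xu

AI summary: Using a unified MAIR framework, places GP-UCB and the MAMS algorithm in a common language, proposes a safeguarded master algorithm that combines their advantages, and shows that in overparameterized models algorithmic complexity can be more informative than class-wide minimax or DEC certificates.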

Abstract

Gaussian-process upper confidence bound (GP-UCB) and decision-estimation-coefficient (DEC) methods may appear, at first sight, to belong to different theories. This paper places the two viewpoints in a common algorithmic-information language for frequentist RKHS bandits. GP-UCB fixes an algorithmic, rather than true, Gaussian-process prior and exploits realized-trajectory complexity together with computational tractability, whereas MAMS optimizes a robust class-wide MAIR/DEC envelope. Through the unified MAIR framework and heterogeneous positive-semidefinite algorithmic priors, we generalize both the GP-UCB analysis and the MAMS algorithm, propose a safeguarded master that combines their advantages, and provide a kernel-bandit construction showing that algorithmic complexity can be more informative than class-wide minimax or DEC certificates in overparameterized models. The resulting message is that algorithmic information and class-wide minimax coefficients answer different questions and can lead to different gaps; kernel bandits provide a clean setting in which this distinction becomes mathematically visible.

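2606.10324 2026-06-10 cs.LG cond-mat.stat-mech stat.ML New submission

Rank Collapse, Fixed Points, and the Renormalization Group Structure of MLP Residual Networks

Parviz Haggi-Mani, Irina Rish

AI summary: Using MLP residual networks trained on masked prediction over synthetic Markov chains, gives the first quantitative evidence of selective rank collapse along network depth, corresponding to the integrating out of irrelevant degrees of freedom in the renormalization group, and finds that inter-layer kernel drift concentrates at a few transitions.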

Comments 16 pages, 9 figures

Abstract

The analogy between deep neural network forward passes and renormalization group (RG) flows has been repeatedly noted in the literature, but existing treatments remain qualitative: depth is described as a coarse-graining scale, attention is likened to a partition function, and representations are said to flow toward fixed points. No existing work has defined a measurable RG order parameter, tested it under controlled variation of the input distribution, or made quantitative predictions that are empirically verified. We study the simplest architecture for which the analogy is tractable: a pure MLP residual stack trained on masked token prediction over synthetic Markov chain sequences with known spectral properties. We report three findings. (i) The effective rank of the residual stream decreases monotonically with depth after training, consistent with progressive integration of irrelevant degrees of freedom. (ii) This rank collapse is selective: it occurs for chains with short correlation length approximately 1 but is absent for chains with long correlation length approximately 7, measured at the position level to control for mean-pooling artifacts. The network preserves exactly the degrees of freedom relevant to the prediction task, the content of the RG relevance criterion. (iii) Inter-layer kernel drift is concentrated at one or two specific transitions, with the remainder of the network near a fixed point, consistent with a discrete fixed-point plateau. Together these findings constitute the first quantitative, position-level evidence that MLP residual networks implement a selective coarse-graining procedure governed by the spectral structure of the input distribution.
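
The abstract does not state how effective rank is computed. One common choice, assumed here and not confirmed by the source, is the exponential of the Shannon entropy of the normalized singular values; the function name and the `eps` cutoff are hypothetical.

```python
import numpy as np

def effective_rank(activations, eps=1e-12):
    """exp(Shannon entropy) of the normalized singular values (assumed definition)."""
    s = np.linalg.svd(np.asarray(activations, dtype=float), compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero directions before taking logarithms
    return float(np.exp(-(p * np.log(p)).sum()))

# Rows play the role of positions, columns of residual-stream features (toy data).
full = effective_rank(np.eye(4))                      # four equally used directions
collapsed = effective_rank(np.outer(np.ones(4), np.arange(1.0, 5.0)))  # one direction
```

Evaluating such a quantity on the residual stream after each layer would give a depth profile of the kind the first finding describes.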

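2606.10868 2026-06-10 cs.LG astro-ph.IM New submission

When Do Autoregressive Sequence Models Forecast Physical Wavefields? A Controlled Study on Synthetic Seismograms

Waleed Esmail, Stuart Russell, Jana Klinge, Alexander Kappes, Christine Thomas

AI summary: Through controlled ablations on synthetic three-component seismograms, finds that multi-token prediction is the main factor stabilizing autoregressive wavefield rollout, and highlights the roles of a context-ratio threshold and of phase-aware losses.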

Comments 16 pages, 5 figures and 3 tables

Abstract

Long-horizon autoregressive forecasting of oscillatory physical signals, such as seismograms, gravitational-wave strain, and similar wavefields, is limited by error accumulation: as a causal model is fed its own outputs over hundreds of steps, small per-step errors compound into phase drift that pointwise metrics fail to detect. We ask when such rollout stays stable, using synthetic three-component seismograms as a physically structured testbed and the SeismoGPT autoregressive forecaster as the model under study. Through controlled, intra-architecture ablations evaluated on free-running rollout with paired significance tests, we isolate the contribution of each design choice. Multi-token prediction is the dominant stabilizer, accounting for almost the entire improvement over a single-token baseline (+0.040 median NCC); a horizon-embedding hybrid prediction head and a cross-horizon STFT-magnitude coherence loss each add a small but consistent further gain. Performance depends sharply on a context-ratio threshold near one, roughly the full P-S interval of observed signal, below which rollout generalization collapses. The dominant residual failure is a polarity inversion that a magnitude-based spectral loss cannot, by construction, penalize, identifying phase-aware objectives as the natural next step. We frame this as a controlled study of rollout stability on oscillatory wavefields, not a benchmark of forecasting architectures.
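
The point about polarity inversion follows from a simple property: negating a signal leaves every spectral magnitude unchanged. The sketch below demonstrates this with a single full-window FFT rather than the paper's STFT-based loss, and it assumes a zero-lag normalized cross-correlation for NCC; both simplifications and all function names are assumptions.

```python
import numpy as np

def ncc(a, b):
    """Zero-lag normalized cross-correlation (assumed definition of NCC)."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def magnitude_spectrum_loss(pred, target):
    """Mean squared difference of FFT magnitudes; phase is discarded."""
    diff = np.abs(np.fft.rfft(pred)) - np.abs(np.fft.rfft(target))
    return float(np.mean(diff ** 2))

# Synthetic decaying wavelet as a stand-in for one seismogram component.
t = np.linspace(0.0, 1.0, 256, endpoint=False)
target = np.sin(2 * np.pi * 5 * t) * np.exp(-3 * t)
flipped = -target  # polarity-inverted prediction

loss_flipped = magnitude_spectrum_loss(flipped, target)
ncc_flipped = ncc(flipped, target)
```

The magnitude loss cannot distinguish the inverted trace from a perfect prediction, while the correlation is strongly negative, which is why a phase-aware term would be needed to penalize this failure.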

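2606.07998 2026-06-10 cs.LG cs.AI Updated version

Enhancing AI Interpretability and Safety through Localised Architectures

Ian Seet, Jonas Bozenhard, Simon Ostermann

AI summary: To address the poor interpretability and high computational cost of large generative AI models, argues for localised machine learning architectures that trade lower bandwidth for higher per-node expressivity to improve interpretability and efficiency, and evaluates the suitability of several hardware implementations.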

Abstract

Recent advances in generative AI, especially powerful Large Language Models (LLMs) and Large Reasoning Models (LRMs), raise concerns over the interpretability, safety and sustainability of these large and opaque AI models. The power of such architectures is derived not only from the scalability of deep neural networks, but also massively parallel hardware such as GPU clusters. The diffuse nature of deep neural networks gives them great function-approximation capability when provided with sufficient training data but imposes a cost in interpretability and computational efficiency. Observing that localised machine learning (ML) models tend to be more interpretable and computationally efficient than deep neural networks on small datasets, we reason by analogy that similar advantages may apply to specific localised hardware ML architectures. We argue that localised architectures with lower bandwidth but higher expressivity per node have the potential to be fundamentally more interpretable than deep neural networks running on GPU clusters while remaining competitive for smaller datasets. We then evaluate the suitability of various hardware ML paradigms for implementing such localised architectures and evaluate their per-node expressivity, energy efficiency and practical maturity of the technology required.


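2606.06624 2026-06-10 cs.LG Updated version

Principles and Practice of Deep Representation Learning: or a Mathematical Theory of Memory

Sam Buchanan, Druv Pai, Peng Wang, Yi Ma

AI summary: Through the lens of representation learning, this book explains the design principles of modern neural network architectures using optimization and information theory, aiming to open the black box and improve interpretability, reliability, and controllability.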

Comments version 2; TeX source and supplementary material at https://ma-lab-berkeley.github.io/deep-representation-learning-book/

Abstract

In the current era of deep learning and especially generative models, there is significant investment in training very large deep neural networks. Thus far, such models have been "black boxes" that are difficult to understand in the sense that they have opaque internal mechanisms, leading to difficulties in interpretability, reliability, and control. Naturally, this lack of understanding has led to both hype and fear. This book is an attempt to "open the black box" and understand the mechanisms of large deep networks, through the perspective of representation learning, which is a major factor - arguably the single most important one - in the empirical power of deep learning models. A brief outline of this book is as follows. Chapter 1 will summarize the threads that underlie the whole text. Chapters 2, 3, 4, 5, and 6 will explain the design principles of modern neural network architectures through optimization and information theory, reducing the process of architecture development (long having been described as a sort of "alchemy") to undergraduate-level linear algebra and calculus exercises once the underlying principles are introduced. Chapters 7 and 8 will discuss applications of these principles to solve problems in more paradigmatic ways, obtaining new methods and models which are efficient, interpretable, and controllable by design, and yet no less - sometimes even more - powerful than the black-box models they resemble. Chapter 9 will discuss potential future directions for deep learning, the role of representation learning, as well as some open problems.


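2604.13717 2026-06-10 cs.CL Updated version

On Cost-Effective LLM-as-a-Judge Improvement Techniques

Ryan Lail, Luke Markham

AI summary: Studies four techniques, including ensemble scoring and task-specific criteria injection, for improving LLM judge accuracy, reaching up to 85.8% accuracy on RewardBench 2 at low cost.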

Comments Accepted at the ICML 2026 workshops "Statistical Frameworks for Uncertainty in Agentic Systems" and "Combining Theory and Benchmarks: Towards a Virtuous Cycle to Understand and Guarantee Foundation Model Performance". 13 pages, 9 figures

Abstract

Using a language model to score or rank candidate responses has become a scalable alternative to human evaluation in reinforcement learning from human feedback (RLHF) pipelines, benchmarking, and application layer evaluations. However, output reliability depends heavily on prompting and aggregation strategy. We present an empirical investigation of four drop-in techniques -- ensemble scoring, task-specific criteria injection, calibration context, and adaptive model escalation -- for improving LLM judge accuracy on RewardBench 2, with a unifying lens of noise control on the stochastic judge: ensembling as Monte Carlo averaging over per-call noise, criteria injection as between-response discrimination sharpening, and per-response score variance as an uncertainty signal. Ensemble scoring and task-specific criteria injection (the latter virtually cost free) together reach up to 85.8% accuracy, +13.5pp over baseline. Calibration context and adaptive model escalation also improve over baseline but are dominated by criteria + ensembling on the cost-accuracy Pareto frontier. Small models benefit disproportionately from ensembling, making high-accuracy LLM judges accessible at low cost. We show that these techniques generalise across model providers, evaluating on both OpenAI GPT and Anthropic Claude families.
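
The Monte Carlo reading of ensemble scoring can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' setup: `noisy_judge` is a hypothetical stand-in for one stochastic judge call, modeled as a true quality score plus Gaussian noise, and the quality values, noise level, and ensemble size are arbitrary choices.

```python
import random
import statistics


def noisy_judge(true_quality, rng, noise_sd=1.0):
    """Hypothetical stand-in for one stochastic LLM judge call:
    the response's true quality plus independent Gaussian noise."""
    return true_quality + rng.gauss(0.0, noise_sd)


def ensemble_score(true_quality, k, rng):
    """Average k independent judge calls; also return the per-response
    score variance, which can serve as an uncertainty signal."""
    scores = [noisy_judge(true_quality, rng) for _ in range(k)]
    return statistics.mean(scores), statistics.pvariance(scores)


def pairwise_accuracy(k, trials, rng, better=6.0, worse=5.0):
    """Fraction of trials in which the better response receives the
    higher ensemble score."""
    correct = 0
    for _ in range(trials):
        better_mean, _ = ensemble_score(better, k, rng)
        worse_mean, _ = ensemble_score(worse, k, rng)
        correct += better_mean > worse_mean
    return correct / trials


rng = random.Random(0)
acc_single = pairwise_accuracy(k=1, trials=2000, rng=rng)
acc_ensemble = pairwise_accuracy(k=8, trials=2000, rng=rng)
print(f"k=1: {acc_single:.3f}  k=8: {acc_ensemble:.3f}")
```

Under this noise model, averaging k calls shrinks the noise standard deviation by a factor of sqrt(k), which is why the ensemble ranks the pair correctly more often than a single call does.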


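2602.19393 2026-06-10 cs.LG Updated version

In Defense of Cosine Similarity: Normalization Eliminates the Gauge Freedom

Taha Bouhsine

AI summary: The paper proves that when embeddings are constrained to the unit sphere, the diagonal gauge-matrix ambiguity vanishes and cosine distance is monotonically equivalent to Euclidean distance, which resolves the arbitrariness of cosine similarity.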

Comments This was a blog post companion draft, it needs to be updated to fit as a preprint, will do later

Abstract

Steck, Ekanadham, and Kallus [arXiv:2403.05440] demonstrate that cosine similarity of learned embeddings from matrix factorization models can be rendered arbitrary by a diagonal ``gauge'' matrix $D$. Their result is correct and important for practitioners who compute cosine similarity on embeddings trained with dot-product objectives. However, we argue that their conclusion, cautioning against cosine similarity in general, conflates the pathology of an incompatible training objective with the geometric validity of cosine distance on the unit sphere. We prove that when embeddings are constrained to the unit sphere $\mathbb{S}^{d-1}$ (either during or after training with an appropriate objective), the $D$-matrix ambiguity vanishes identically, and cosine distance reduces to exactly half the squared Euclidean distance. This monotonic equivalence implies that cosine-based and Euclidean-based neighbor rankings are identical on normalized embeddings. The ``problem'' with cosine similarity is not cosine similarity, it is the failure to normalize.
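
The identity behind the claim follows from expanding the squared norm: for unit vectors u and v, ||u - v||^2 = ||u||^2 + ||v||^2 - 2 u·v = 2 - 2 cos(u, v), so 1 - cos(u, v) = ||u - v||^2 / 2. The sketch below checks this numerically on random unit vectors and confirms that both distances rank neighbors identically. The dimension, sample size, and helper names are illustrative assumptions, not taken from the paper.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(u, v):
    """1 - cosine similarity, computed without assuming unit norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def squared_euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


rng = random.Random(0)
dim, n = 8, 20  # arbitrary sizes chosen for the illustration
points = [normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(n)]
query = normalize([rng.gauss(0, 1) for _ in range(dim)])

# Largest deviation from the identity 1 - cos = 0.5 * ||u - v||^2.
max_gap = max(
    abs(cosine_distance(query, p) - 0.5 * squared_euclidean(query, p))
    for p in points
)

# Neighbor rankings under the two distances.
rank_cos = sorted(range(n), key=lambda i: cosine_distance(query, points[i]))
rank_euc = sorted(range(n), key=lambda i: squared_euclidean(query, points[i]))
print(max_gap, rank_cos == rank_euc)
```

Without the `normalize` step the identity no longer holds, which is the failure mode the abstract attributes to unnormalized embeddings.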


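2509.21925 2026-06-10 cs.LG cs.AI Updated version

Generation Properties of Stochastic Interpolation under Finite Training Set

Yunchen Li, Shaohui Lin, Zhou Yu

AI summary: Studies the theoretical properties of stochastic-interpolation generative models under a finite training set, derives closed-form expressions for the optimal velocity field and score function, characterizes the behavior of the deterministic and stochastic generative processes, and defines underfitting and overfitting.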

Comments We found proof errors affecting key theorems and wish to avoid misleading readers. We have submitted a substantially revised new paper, arXiv:2606.08554, retaining only two old theorems and adding five new ones

Abstract

This paper investigates the theoretical behavior of generative models under finite training populations. Within the stochastic interpolation generative framework, we derive closed-form expressions for the optimal velocity field and score function when only a finite number of training samples are available. We demonstrate that, under some regularity conditions, the deterministic generative process exactly recovers the training samples, while the stochastic generative process manifests as training samples with added Gaussian noise. Beyond the idealized setting, we consider model estimation errors and introduce formal definitions of underfitting and overfitting specific to generative models. Our theoretical analysis reveals that, in the presence of estimation errors, the stochastic generation process effectively produces convex combinations of training samples corrupted by a mixture of uniform and Gaussian noise. Experiments on generation tasks and downstream tasks such as classification support our theory.


2602.17547 2026-06-10 cs.AI cs.CL

KLong: Training LLM Agent for Extremely Long-horizon Tasks

Yue Liu

AI summary: KLong tackles extremely long-horizon tasks with trajectory-splitting SFT and progressive RL training; its 106B model surpasses Kimi K2 Thinking by 11.28% on PaperBench.

Comments We request standard withdrawal of this submission because significant errors were discovered in the data after submission, which affect the validity of the results. We may submit a corrected version later

Abstract

This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The principle is to first cold-start the model via trajectory-splitting SFT, then scale it via progressive RL training. Specifically, we first activate basic agentic abilities of a base model with a comprehensive SFT recipe. Then, we introduce Research-Factory, an automated pipeline that generates high-quality training data by collecting research papers and constructing evaluation rubrics. Using this pipeline, we build thousands of long-horizon trajectories distilled from Claude 4.5 Sonnet (Thinking). To train with these extremely long trajectories, we propose a new trajectory-splitting SFT, which preserves early context, progressively truncates later context, and maintains overlap between sub-trajectories. In addition, to further improve long-horizon task-solving capability, we propose a novel progressive RL, which schedules training into multiple stages with progressively extended timeouts. Experiments demonstrate the superiority and generalization of KLong, as shown in Figure 1. Notably, our proposed KLong (106B) surpasses Kimi K2 Thinking (1T) by 11.28% on PaperBench, and the performance improvement generalizes to other coding benchmarks like SWE-bench Verified and MLE-bench.


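2606.10280 2026-06-10 eess.IV cs.CV New submission

Overlapped Wavelet Diffusion for Low-Light Image Enhancement

Fen Peng, Taizo Suzuki, Seisuke Kyochi

AI summary: Proposes OWDiff, an overlapped wavelet diffusion framework that removes blocking artifacts through an overlapped wavelet transform and recovers detail with a low-frequency-guided high-frequency enhancement block, outperforming existing methods on the LOLv1 and LOLv2-real datasets.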

Comments Advance published in IEICE Transactions on Information and Systems. DOI: 10.1587/transinf.2026PCP0006. Code: https://github.com/FinnPeg/Overlapped-Wavelet-Diffusion

DOI: 10.1587/transinf.2026PCP0006
Journal ref: IEICE Transactions on Information and Systems, Advance online publication, 2026

Abstract

In this study, we propose an overlapped wavelet diffusion framework for Low-Light Image Enhancement (LLIE), which incorporates two complementary components to achieve blocking artifact-free and detail-preserving enhancement. Although recent diffusion-based LLIE methods have demonstrated remarkable performance compared with traditional approaches, DiffLL still suffers from blocking artifacts caused by the Haar Wavelet Transform (WT) and blurred edges or over-smoothed textures due to the limitations of its High-Frequency Restoration Module (HFRM). To overcome these issues, we introduce an Overlapped WT (OWT) that incorporates correlations across neighboring regions, thereby structurally preventing blocking artifacts. Furthermore, we integrate a low-frequency-guided High-Frequency Enhance Block (HFEBlock) to strengthen detail recovery, yielding sharper edges and more reliable textures. Extensive experiments on the LOLv1 and LOLv2-real datasets demonstrate that our framework, termed OWDiff, consistently outperforms existing LLIE methods both qualitatively and quantitatively, achieving superior visual quality while maintaining computational efficiency. OWDiff effectively addresses the structural limitations of the Haar WT and the HFRM, achieving an average PSNR gain of 0.58 dB, along with a 1.64% relative improvement in SSIM and a 5.9% relative reduction in LPIPS, compared to DiffLL across both the LOLv1 and LOLv2-real datasets.


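2606.09942 2026-06-10 cs.SE cs.AI New submission

Anomaly Detection and Root Cause Analysis for Microservice Systems

Luan Pham

AI summary: To address five limitations of anomaly detection and root cause analysis for microservice systems, the thesis proposes the methods BARO, EventADL, and TORAI, builds the RCAEval benchmark, and validates their effectiveness and robustness experimentally.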

Comments This is the pre-print of my PhD thesis, submitted to RMIT University

Abstract

Microservice systems are widely used to build cloud applications, yet their complexity makes failures inevitable, degrading user experience and causing economic loss. Automated anomaly detection and root cause analysis (RCA) are now active research areas, but existing techniques share five limitations. First, most treat anomaly detection and RCA separately, assuming anomalies are detected correctly, and falter when detection is imprecise due to noise or delay. Second, they focus on metrics, logs, and traces, leaving event data such as API calls and configuration changes underexplored. Third, many require a given service call graph and cannot diagnose without one. Fourth, the field lacks standardised datasets and evaluation frameworks, so methods are hard to compare fairly. Fifth, although causal inference-based RCA has become dominant, its effectiveness, efficiency, and robustness remain unclear. This thesis addresses these limitations through two groups of contributions. The first introduces methods that exploit observability data both independently and collectively. BARO is an end-to-end anomaly detection and RCA approach for metric data. EventADL is an end-to-end framework for event data. TORAI is a multimodal RCA framework that requires no service call graph. Extensive experiments on real microservice systems demonstrate their effectiveness and robustness. The second group delivers benchmarking datasets, an evaluation framework, and systematic evaluation efforts. RCAEval is a comprehensive benchmark providing ready-to-use datasets and reproducible baselines for future research. A systematic evaluation of existing RCA methods, especially causal inference-based approaches, offers insights that guide future directions. This thesis thereby advances automated anomaly detection and RCA for microservice failures, enabling future research on incident mitigation and remediation.


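2606.09930 2026-06-10 cs.PL cs.LG cs.SC New submission

Compile Once, Differentiate Everywhere: A Differentiable Meta-Circular Interpreter

Lucas Sheneman

AI summary: Presents a compiler that translates a subset of Scheme into differentiable computation graphs, enabling differentiable meta-circular interpretation (DMCI) with reverse-mode automatic differentiation of programs that use closures, recursion, and data structures, without recompilation.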

Abstract

The boundary between program execution and gradient-based optimization has long limited the use of code itself as a learnable scientific model. We present a compiler that translates a self-hosting subset of Scheme into differentiable computation graphs for autograd backends. Because the subset can compile its own evaluator, this yields differentiable meta-circular interpretation (DMCI): a compiled Scheme interpreter executes programs supplied as data, while reverse-mode autodiff propagates gradients to continuous constants embedded in those programs. The interpreter is compiled once, so new programs inherit differentiability without recompilation or custom gradient machinery, while retaining closures, recursion, and data structures. We prove that gradients through the compiled interpreter are correct almost everywhere and show that they match direct compilation to numerical precision across 171 recursive and higher-order program-seed pairs. We then use DMCI for program-and-parameter co-search, where a large language model proposes Scheme programs and exact gradients calibrate their continuous parameters through a single frozen interpreter. This enables OpenEvolve-style program search in which an outer loop proposes discrete program structures and DMCI supplies exact gradient-based calibration of each candidate's continuous parameters. On battery capacity-fade data, the search recovers a knee-like degradation structure and improves held-out extrapolation over hand-crafted baselines on the harder early-extrapolation split, matching them on the later split. On a high-dimensional El Nino inverse problem, DMCI optimizes an interpreted Kalman-filter likelihood where gradient-free search fails. These results extend symbolic regression and neurosymbolic search from closed-form expressions to executable, stateful programs, making model-generated code directly optimizable against data.


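2606.09858 2026-06-10 cs.IT cs.AI math.IT New submission

Support sufficiency as action-sufficient compression: a single-cycle rate-regret formulation

Mark Walsh

AI summary: The paper formalizes support sufficiency as action-sufficient compression, defines exact sufficiency through the quotient of support space by policy equivalence and approximate sufficiency through bounded expected policy regret, derives a rate-regret problem in the finite single-cycle setting, and distinguishes action adequacy from reconstruction fidelity, information-bottleneck prediction, and rational inattention.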

Comments 22 pages. Submitted to Journal of Mathematical Psychology. Formal single-cycle model of action-sufficient support compression and rate-regret sufficiency

Abstract

Robust decision-making requires compression. A system that forms a rich support state cannot usually preserve its full structure at the point of action. It must retain only those distinctions needed to act, verify, abstain, or defer under the current consequence geometry. This paper formalizes support sufficiency as action-sufficient compression. Let $H$ denote a full support state, $\mathcal{A}$ a finite action set, and $Z$ a consequence geometry specifying payoff structure. For fixed $Z$, the coarsest exactly action-sufficient compression is the quotient of support space by policy equivalence. Two support states may be merged exactly when they require the same optimal action. This clarifies why content-only and scalar-confidence-only arbitration fail whenever their induced partitions cross action boundaries. Approximate sufficiency is then defined by bounded expected policy regret. In the finite single-cycle setting, this yields a rate-regret problem with source $H$, reproduction alphabet $\mathcal{A}$, and distortion given by consequence-sensitive regret. The optimal stochastic action channel inherits the standard rate-distortion Gibbs form, applied here to support states with regret distortion. The contribution is interpretive: action adequacy is distinguished from reconstruction fidelity, information-bottleneck prediction, and rational inattention. Robust single-cycle arbitration does not require preserving all support, but it does require preserving the distinctions that consequence geometry makes action-relevant.
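
As a reading aid, the rate-regret problem described in the abstract can be written out in the usual rate-distortion notation. The utility symbol $U$, the regret bound $D$, and the multiplier $\beta$ are notation assumed for this sketch; the abstract itself fixes only $H$, $\mathcal{A}$, and $Z$.

```latex
% Regret of taking action a in support state h under consequence geometry Z
% (U is an assumed utility symbol, not taken from the paper).
d_Z(h, a) = \max_{a' \in \mathcal{A}} U(h, a'; Z) - U(h, a; Z) \ge 0

% Rate-regret problem: the cheapest action channel whose expected regret is at most D.
R(D) = \min_{q(a \mid h) \,:\, \mathbb{E}[d_Z(H, A)] \le D} I(H; A)

% Standard rate-distortion Gibbs form of a minimizer, with multiplier \beta \ge 0
% and action marginal q(a) = \sum_h p(h)\, q(a \mid h).
q(a \mid h) = \frac{q(a)\, e^{-\beta\, d_Z(h, a)}}
                   {\sum_{a' \in \mathcal{A}} q(a')\, e^{-\beta\, d_Z(h, a')}}
```

At $D = 0$ the channel may place mass only on zero-regret actions, which corresponds to the exact case in which support states requiring the same optimal action are merged.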


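2604.05013 2026-06-10 cs.SE cs.AI

Scaling Coding Agents via Atomic Skills

Yue Liu

AI summary: The paper proposes a new approach to improving coding agents through atomic skills: joint reinforcement learning over five fundamental skills improves generalization to complex software tasks.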

Comments We request standard withdrawal of this submission because significant errors were discovered in the data after submission, which affect the validity of the results. We may submit a corrected version later

Abstract

Current LLM coding agents are predominantly trained on composite benchmarks (e.g., bug fixing), which often leads to task-specific overfitting and limited generalization. To address this, we propose a novel scaling paradigm that shifts the focus from task-level optimization to atomic skill mastery. We first formalize five fundamental atomic skills, code localization, code editing, unit-test generation, issue reproduction, and code review, that serve as the basis vectors for complex software engineering tasks. Compared with composite coding tasks, these atomic skills are more generalizable and composable. Then, we scale coding agents by performing joint RL over atomic skills. In this manner, atomic skills are consistently improved without negative interference or trade-offs between them. Notably, we observe that improvements in these atomic skills generalize well to other unseen composite coding tasks, such as bug-fixing, code refactoring, machine learning engineering, and code security. The observation motivates a new scaling paradigm for coding agents by training with atomic skills. Extensive experiments demonstrate the effectiveness of our proposed paradigm. Notably, our joint RL improves average performance by 18.7% on 5 atomic skills and 5 composite tasks.


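2606.11053 2026-06-10 econ.TH New submission

Revealing information -- or not -- in a social network of traders

Patrick Allmis, Paolo Pin, Fernando Vega Redondo

AI summary: Using a microfounded model of asset trading based on Kyle (1985), the paper studies why informed traders may voluntarily share information and finds that information is only partially revealed in equilibrium, so prices do not fully reflect asset returns, which affects the distribution of social surplus.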

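2606.11047 2026-06-10 econ.EM New submission

Panel Data Estimation of Individual Demand in Markets with Many Consumers

Sarah Moon, Whitney K. Newey

AI summary: Studies how panel data can be used to estimate individual demand, using methods such as differencing to address the endogeneity bias caused by market pricing; the bias vanishes as the number of consumers in each market grows, and macroeconomic effects can be accommodated.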

Abstract

The purpose of this paper is to consider whether and how panel data can be used to estimate individual demand, as opposed to market-level demand, while accounting for simultaneity resulting from prices being determined in markets. We consider linear demand models and random coefficient demand models, together with linear supply models. We find that the bias of individual demand estimates obtained using familiar panel data methods, like differencing, disappears as the number of consumers in each market grows, as long as the time-varying, i.e. idiosyncratic, component of preferences is orthogonal to the unobserved, time-varying component of supply. This approximate control is assumed in many panel discrete choice models and is plausible in other models where idiosyncratic preferences represent random variation in preferences over time. Macroeconomic effects can be allowed for by including regressors characterizing time effects, such as trends and time period dummies, or fixed time effects.


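2606.10998 2026-06-10 econ.TH New submission

Consistent Probabilistic Social Choice Revisited

Florian Brandl, Felix Brandt

AI summary: Transfers the maximal lotteries result of Brandt et al. (2016), which was based on fractional preference profiles, to the standard finite-electorate model, and relaxes the continuity condition to real-valued probabilities.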

Comments 18 pages

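2606.10845 2026-06-10 econ.TH New submission

Iterative Elimination of Borda Losers: Axiomatizations of the Baldwin and Nanson Rules

Leo Goto, Satoshi Nakada

AI summary: The paper gives a unified axiomatic characterization of the Baldwin and Nanson voting rules, both of which recursively eliminate alternatives whose Borda score is lowest or below average, and contrasts it with Young's axiomatization of the Borda rule.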
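
The two elimination procedures named in the title can be sketched as follows. This is a minimal illustration under assumptions, not the paper's formal definitions: ballots are complete strict rankings, Baldwin removes every candidate tied for the lowest Borda score, Nanson removes candidates strictly below the average score (some statements of the rule also remove those exactly at the average), and candidates that remain tied are returned together.

```python
def borda_scores(ballots, remaining):
    """Borda scores restricted to the remaining candidates: on each ballot
    a candidate earns one point per remaining candidate ranked below it."""
    scores = {c: 0 for c in remaining}
    for ballot in ballots:
        ranked = [c for c in ballot if c in remaining]
        for position, c in enumerate(ranked):
            scores[c] += len(ranked) - 1 - position
    return scores


def eliminate(ballots, rule):
    """Run iterative Borda elimination and return the set of winners.

    rule="baldwin": drop the candidates with the lowest Borda score.
    rule="nanson":  drop the candidates strictly below the average score.
    """
    remaining = set(ballots[0])
    while len(remaining) > 1:
        scores = borda_scores(ballots, remaining)
        if rule == "baldwin":
            cutoff = min(scores.values())
            losers = {c for c, s in scores.items() if s == cutoff}
        elif rule == "nanson":
            average = sum(scores.values()) / len(scores)
            losers = {c for c, s in scores.items() if s < average}
        else:
            raise ValueError(f"unknown rule: {rule}")
        if not losers or losers == remaining:
            break  # every remaining candidate is tied
        remaining -= losers
    return remaining


# Seven voters, three candidates (rankings listed best first).
ballots = 3 * [["A", "B", "C"]] + 2 * [["B", "C", "A"]] + 2 * [["C", "A", "B"]]
print(eliminate(ballots, "baldwin"), eliminate(ballots, "nanson"))
```

In this profile the first-round Borda scores are A = 8, B = 7, C = 6, so both rules drop C first (B sits exactly at the average of 7 and survives the Nanson round), and A then beats B by 5 points to 2.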