arXivDaily: arXiv daily academic digest, updated Monday to Friday
2606.09940 2026-06-10 cs.LG cs.AI New submission

Interactions Between Crosscoder Features: A Compact Proofs Perspective

Dmitry Manning-Coe, Thomas Read, Anna Soligo, Oliver Clive-Griffin, Chun-Hei Yip, Rajashree Agrawal, Jason Gross

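AI summary: This paper formalizes interactions between crosscoder features from a compact-proofs perspective, proposes an interaction measure, and applies it to computational sparsity, semantic clustering, and sleeper-agent detection.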

Comments: Accepted at the NeurIPS 2025 Workshop on Mechanistic Interpretability

Abstract

Dictionary learning methods like Sparse Autoencoders (SAEs) and crosscoders attempt to explain a model by decomposing its activations into independent features. Interactions between features hence induce errors in the reconstruction. We formalize this intuition via compact proofs and make five contributions. First, we show how, in principle, a compact proof of model performance can be constructed using a crosscoder. Second, we show that an error term arising in this proof can naturally be interpreted as a measure of interaction between crosscoder features and provide an explicit expression for the interaction term in the Multi-Layer Perceptron (MLP) layers. We then provide three applications of this new interaction measure. In our third contribution we show that the interaction term itself can be used as a differentiable loss penalty. Applying this penalty, we can achieve "computationally sparse" crosscoders that retain 60% of MLP performance when only keeping a single feature at each datapoint and neuron, compared to 10% in standard crosscoders. We then show that clustering according to our interaction measure provides semantically meaningful feature clusters, and finally that sleeper agents have significant interactions. Code is available at https://github.com/chainik1125/crosscoders-feature-interactions/tree/arxiv.

2606.09937 2026-06-10 cs.LG cs.AI cs.CL New submission

RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference

Anirudh Sekar

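AI summary: Proposes the RKSC framework, which removes structural redundancy in multi-branch LLM reasoning through attention-similarity KV sharing, confidence-gated early exit, and reasoning-selective block cache management, achieving a mean 3.008x speedup with an error rate of only 0.37%.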

Comments: Accepted to the ICML 2026 Workshop on Statistical Frameworks for Uncertainty in Agentic Systems

Abstract

We introduce RKSC (Reasoning-Aware KV Cache Sharing), a training-free inference framework that eliminates two structural redundancies in multi-branch LLM reasoning pipelines. ASKS (Attention-Similarity KV Sharing) computes the prefix KV cache once and broadcasts it to all semantically similar branches via hidden-state cosine similarity, strictly generalising the token-exact prefix caching used by vLLM and SGLang. CGEE (Confidence-Gated Early Exit) applies two complementary exit mechanisms: (1) it skips the verification forward pass entirely when generation confidence is decisive across branches, and (2) it terminates the verification pass at an intermediate layer when per-layer entropy stabilises, using lightweight hooks on the transformer backbone. RSBCM (Reasoning-Selective Block Cache Manager) prevents unbounded cache growth via attention-weighted depth-priority eviction. Across five model families (7B-10B), four benchmarks, and 1,000 evaluated problems, RKSC achieves a mean speedup of 3.008x over the No-KV baseline (peak 3.990x), a 1.66x mean improvement over vLLM-equivalent prefix caching, with a CGEE-induced error rate of only 0.37% (6 errors out of 1,616 verify calls). No fine-tuning or architecture changes are required. Code is available at https://github.com/AnirudhSekar/RKSC.
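
The second CGEE mechanism above (ending the verification pass once per-layer entropy stabilises) could be sketched roughly as follows. The function names, the `tol` and `patience` parameters, and the assumption that a next-token logit vector is available at every layer are illustrative only; the abstract does not state the actual stabilisation criterion.

```python
import numpy as np

def softmax_entropy(logits):
    """Shannon entropy (nats) of the softmax distribution over one logit vector."""
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def exit_layer(per_layer_logits, tol=0.05, patience=2):
    """Return the index of the layer at which to stop (hypothetical rule).

    Stops once the entropy has changed by less than `tol` for `patience`
    consecutive layers; otherwise runs through the last layer.
    """
    stable = 0
    prev = None
    for i, logits in enumerate(per_layer_logits):
        h = softmax_entropy(logits)
        if prev is not None and abs(h - prev) < tol:
            stable += 1
            if stable >= patience:
                return i
        else:
            stable = 0
        prev = h
    return len(per_layer_logits) - 1
```

With a rule of this kind, the saving comes from the layers after the returned index that are never executed; the reported 0.37% error rate corresponds to 6 / 1,616 verify calls.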

2606.09936 2026-06-10 cs.LG cs.AI New submission

One Lens, Many Worlds: A Capability-Typed Interface for World-Model Interpretability

Bhavith Chandra Challagundla, Sanskar Pandey, Param Thakkar, Rishikesh Mallagundla, Yugandhar Reddy Gogireddy, Wenhao Lu, Hindol Roy Choudhury, Shravani Challagundla, Mohamed Deraz Nasr, Spursh Deshpande

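AI summary: Proposes WorldModelLens, which unifies interpretability analysis across different world models (such as PlaNet, IRIS, and I-JEPA) through a capability-typed adapter, avoiding repeated re-implementation.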

Abstract

World models are now built on substantially different computational substrates. Latent recurrent state-space models such as PlaNet and the Dreamer family compress observations into recurrent states; token-based models such as IRIS quantize observations into a learned codebook and predict autoregressively with a transformer; and joint-embedding predictive architectures such as I-JEPA predict in a learned latent space with no pixel decoder. The interpretability methods applied to these models, including probing, activation patching, sparse autoencoders, and surprise analysis, share a common set of primitives, yet they are re-implemented from scratch for each architecture because existing hook-and-cache tooling assumes a transformer language model with no notion of actions, environment steps, or imagined rollouts. We argue that this fragmentation reflects the tooling rather than the models, and that the shared structure of world models is captured by a small typed interface. We present WorldModelLens, an open-source interpretability substrate organized around a capability-typed adapter: every model implements four required methods (encode, transition, initial state, sample) and declares a set of optional heads (decode, reward, continue, actor, critic) through an explicit capability descriptor, so that reinforcement-learning and self-supervised world models are first-class without either imitating the other. A single hook and cache layer exposes time-indexed activations, imagination rollouts, and intervention replay over this interface, allowing each analysis to be written once.
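
To make the idea of a capability-typed adapter concrete, here is one way such an interface might look in Python. This is not the WorldModelLens API: the class names `Capabilities`, `WorldModelAdapter`, and `ToyCounterModel`, the `rollout` helper, and all signatures are hypothetical, chosen only to mirror the four required methods and the optional heads named in the abstract.

```python
from dataclasses import dataclass
from typing import Any, Protocol, runtime_checkable

@dataclass(frozen=True)
class Capabilities:
    """Explicit descriptor of the optional heads a model provides (hypothetical)."""
    decode: bool = False
    reward: bool = False
    cont: bool = False   # "continue" is a Python keyword, so it is abbreviated here
    actor: bool = False
    critic: bool = False

@runtime_checkable
class WorldModelAdapter(Protocol):
    """The four required methods plus a capability descriptor."""
    capabilities: Capabilities
    def encode(self, obs: Any) -> Any: ...
    def transition(self, state: Any, action: Any) -> Any: ...
    def initial_state(self) -> Any: ...
    def sample(self, state: Any) -> Any: ...

class ToyCounterModel:
    """Trivial model whose latent state is a running sum of actions."""
    capabilities = Capabilities(reward=True)
    def encode(self, obs): return float(obs)
    def transition(self, state, action): return state + action
    def initial_state(self): return 0.0
    def sample(self, state): return state
    def reward(self, state): return -abs(state)

def rollout(model, actions, cache=None):
    """Imagined rollout that optionally records time-indexed states in `cache`."""
    state = model.initial_state()
    states = [state]
    for t, action in enumerate(actions):
        state = model.transition(state, action)
        if cache is not None:
            cache[t] = state
        states.append(state)
    return states
```

An analysis written against `WorldModelAdapter` can check `model.capabilities.reward` before calling a reward head, so a model without that head (for example a self-supervised one) never has to fake it.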

2606.09934 2026-06-10 cs.LG cs.CR New submission

nCMD: Benign-Anchored Feature Selection for Imbalanced Network Intrusion Detection

Abu Fuad Ahmad, Istiaque Ahmed

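AI summary: Proposes benign-anchored Classwise Mean Deviation (nCMD), which selects features by scoring the deviation of attack-class distributions from the benign-class mean; it matches or exceeds traditional filter methods on four benchmark datasets and is especially suited to tight feature budgets and severe class imbalance.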

Comments: 6 pages, IEEE double columns

Abstract

Feature selection is critical for network intrusion detection systems (NIDS) operating under high-dimensional, highly imbalanced traffic, as found in operational and defense networks. Traditional filter methods rank features using global statistics computed symmetrically across classes and thus fail to capture the asymmetry of intrusion detection, where attacks are best characterized as deviations from dominant benign traffic. We propose benign-anchored Classwise Mean Deviation (nCMD), a lightweight and interpretable method that scores feature relevance based on the deviation of attack-class distributions from the benign-class mean, rather than a globally biased reference. This approach aligns feature selection with the operational semantics of NIDS at no additional computational cost. Across four benchmark datasets (CICIDS2017, CICDDoS2019, NSL-KDD, and UNSW-NB15), multiple feature budgets, and three downstream classifiers, nCMD matches or exceeds classical filter baselines in macro-averaged F1-score. It achieves the best result on three of the four datasets and under every classifier, with the strongest improvements observed under tight feature budgets and severe class imbalance. These results support benign-anchored ranking as a scalable and interpretable preprocessing component for resource-constrained NIDS.
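
A benign-anchored score can be illustrated with a few lines of NumPy. The abstract does not give the exact nCMD formula, so the scoring rule below (absolute deviation of each attack-class mean from the benign mean, scaled by the benign standard deviation and averaged over attack classes) is an assumption, as are the function names.

```python
import numpy as np

def benign_anchored_scores(X, y, benign_label=0, eps=1e-8):
    """Score each feature by how far attack-class means sit from the benign mean.

    Assumed rule: |mean_attack - mean_benign| / std_benign, averaged over
    attack classes. The published nCMD definition may differ.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    benign = X[y == benign_label]
    mu_b = benign.mean(axis=0)
    sd_b = benign.std(axis=0) + eps          # eps avoids division by zero
    attack_labels = [c for c in np.unique(y) if c != benign_label]
    devs = [np.abs(X[y == c].mean(axis=0) - mu_b) / sd_b for c in attack_labels]
    return np.mean(devs, axis=0)

def top_k_features(scores, k):
    """Indices of the k highest-scoring features (the feature budget)."""
    return np.argsort(-scores)[:k]
```

Because the reference point is the benign class alone, a rare attack class contributes as much to the score as a frequent one, which is the asymmetry the abstract argues global statistics miss.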

2606.09932 2026-06-10 cs.LG cs.AI New submission

When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff

Runze Liu, Jiashun Liu, Xu Wan, Yuqian Fu, Ling Pan

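AI summary: To address the limited RL improvement caused by excessive SFT, proposes Rejuvenation, which restores model plasticity through base-anchored model fusion and neuron reset, improving RL performance on math reasoning and agentic tasks.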

Abstract

Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) has become a standard pipeline for Large Language Model (LLM) post-training. SFT is expected to provide a useful behavioral prior for RL to further enhance model capabilities. However, checkpoints with excessive SFT often show limited improvement during RL. We attribute this failure to the loss of model plasticity: the reduced ability of an SFT-initialized policy to be effectively reshaped by subsequent RL. To better understand this phenomenon, we conduct a detailed analysis from multiple perspectives, including parameter changes, output spaces, and RL optimization dynamics. Our results show that models from excessive SFT tend to produce over-confident token distributions and exhibit sharp parameter landscapes, which make them harder to optimize in the RL stage. To enable a more robust SFT-to-RL handoff, we propose Rejuvenation, a simple yet effective method that restores plasticity while preserving useful SFT-acquired priors. Rejuvenation leverages base-anchored model fusion to reduce excessive SFT-induced drift, together with targeted neuron reset to mitigate model rigidity. Experimental results on both math reasoning tasks and agentic tasks demonstrate that our approach consistently improves RL performance on over-trained SFT models, while also enhancing generalization to out-of-distribution tasks.
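
The two ingredients named above can be pictured with a small sketch. Both functions are assumptions made for illustration: the abstract does not say that fusion is a linear interpolation, how the coefficient `lam` is chosen, which neurons are reset, or whether reset neurons return to base values or are re-initialised.

```python
import numpy as np

def base_anchored_fusion(theta_base, theta_sft, lam=0.5):
    """Pull SFT weights back toward the base model (assumed linear interpolation).

    lam = 1 keeps the SFT checkpoint unchanged; lam = 0 returns the base model.
    """
    return {k: theta_base[k] + lam * (theta_sft[k] - theta_base[k]) for k in theta_base}

def reset_neurons(weight, base_weight, neuron_idx):
    """Reset selected output neurons (rows) to their base-model values (assumed)."""
    out = weight.copy()
    out[neuron_idx] = base_weight[neuron_idx]
    return out
```

For example, fusing with `lam=0.25` keeps a quarter of the SFT-induced drift in every parameter, after which `reset_neurons` restores a chosen subset of rows entirely.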

2606.09929 2026-06-10 cs.LG cs.AI New submission

Between Amnesia and Chaos: A Memory Stability Expressivity Trilemma for Trainable Dissipative Oscillator Networks

Caleb Munigety

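AI summary: Studies trainable nonlinear oscillator networks and finds that memory horizon, gradient stability, and dynamical expressivity are all governed by damping, producing a trilemma in which the three cannot be maximized simultaneously; experiments test the theoretical bounds.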

Abstract

Physical reservoir computing harnesses nonlinear mechanical dynamics but, by convention, freezes the substrate and trains only a linear readout, presuming the substrate is not usefully trainable. We revisit that premise for networks of nonlinear oscillators whose mass, damping, and stiffness are learned end-to-end through a symplectic integrator. Our central result is a trilemma: memory horizon, gradient stability, and dynamical expressivity cannot be simultaneously maximized, because all three are governed by the damping. The backward gradient decays at a rate set by the damping, capping how far back credit can propagate, while forward sensitivities grow exponentially in the largest Lyapunov exponent, so usable gradients require damping above a stability floor. Since the Lyapunov exponent falls as damping rises while the memory ceiling falls as the horizon grows, stable training is confined to a band that contracts with horizon and closes at a critical point. We test every step on a twenty-oscillator network. A damping sweep finds the largest Lyapunov exponent monotone and crossing zero at a well-defined stability floor, confirming the theorem's key assumption. A compute-matched comparison of learned versus frozen substrate on delayed recall across nine horizons shows the learned substrate dominating at short horizons and the advantage closing and reversing near a horizon of eleven steps, the predicted signature of band closure; trained models settle near the stability floor, seeking the edge of chaos unprompted. The analytic ceiling overestimates the empirical crossover roughly fivefold, a gap between detectable and learnable gradient that we report rather than tune away. The contribution is a confirmed account of when training a physical substrate beats freezing it.
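
The role of damping as a decay rate can be seen in the simplest possible case. The sketch below is a single linear oscillator, not the paper's nonlinear network, integrated with semi-implicit Euler (which is symplectic when the damping is zero); the function name and all parameter values are illustrative assumptions.

```python
def simulate(mass, damping, stiffness, x0=1.0, v0=0.0, dt=0.01, steps=2000):
    """Integrate m x'' + c x' + k x = 0 and return the energy at every step."""
    x, v = x0, v0
    energies = []
    for _ in range(steps):
        v += dt * (-stiffness * x - damping * v) / mass  # velocity update first
        x += dt * v                                      # position uses the new velocity
        energies.append(0.5 * mass * v * v + 0.5 * stiffness * x * x)
    return energies
```

For an underdamped linear oscillator the amplitude envelope decays as exp(-c t / (2m)), so raising the damping `c` shortens how long an input remains visible in the state, while `c = 0` preserves energy up to integrator error.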

2606.09928 2026-06-10 cs.LG cs.AI New submission

Forward-Only Convolutional Neural Networks with Learnable Channel-Class Assignment

Mohammadnavid Ghader, Saeed Reza Kheradpisheh, Bahar Farahani, Mahmood Fazlali

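AI summary: Proposes a learnable channel-class assignment mechanism with entropy and orthogonality regularization, plus a loss-aware layer contribution strategy based on validation performance, enabling forward-only learning in residual CNNs; it achieves the best results among FF models on CIFAR-10/100 and Tiny-ImageNet and narrows the gap with backpropagation.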

Abstract

The Forward-Forward (FF) algorithm offers a biologically inspired alternative to backpropagation by replacing gradient-based credit assignment with local, forward-only objectives. While recent extensions have adapted FF to convolutional neural networks (CNNs), existing formulations rely on static channel-class partitions and struggle to perform effectively in complex tasks. In this work, we introduce a learnable channel-class assignment mechanism that enables adaptive, data-driven specialization of convolutional channels, supported by entropy and orthogonality regularization to promote learning performance. We further propose a loss-aware layer contribution strategy that adaptively weights intermediate-layer predictions based on their validation performance, enhancing the effectiveness of forward-only inference. Integrated into residual CNNs, the proposed method achieves consistently superior performance across CIFAR-10, CIFAR-100, and Tiny-ImageNet compared to existing similar forward-only methods. Notably, it establishes new state-of-the-art performance among FF-based models, substantially narrowing the gap with backpropagation. These findings demonstrate that introducing learnable channel specialization and layer contribution weighting significantly enhances the representational capacity of forward-only learning in deep CNNs.

2606.09927 2026-06-10 cs.LG cs.AI cs.CL New submission

Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization

Patrik Czakó, Gábor Kertész, Sándor Szénási

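AI summary: To address the difficulty of activation quantization in LLMs, proposes a quantile-robust scaling policy and gradient-based learning of channel scales, reducing selected-layer error by up to 18.5% under W4A4 quantization.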

Comments: 6 pages, 8 figures, 3 tables. Accepted to IEEE INES 2026 conference proceedings

Abstract

Post-training quantization (PTQ) is one of the most practical ways to reduce the serving cost of Large Language Models (LLMs), but activation quantization remains difficult because outlier-dominated channels lead to large quantization errors. This paper investigates whether part of this degradation is caused by over-migration in scaling-based equivalent transformations. We introduce a quantile-robust scaling policy for SmoothRot-style transforms by replacing max-based activation statistics with high quantiles, and we complement it with constrained gradient-based optimization of channel scales. On LLaMA-3.2-1B under W4A4 quantization, quantile-only policy search improves selected-layer error by 11.1% over the SmoothRot baseline, joint (alpha, q) search improves it by 12%, and training reaches 18.5%. Replaying the best selected-layer policy on all decoder-block down-projection layers reduces the corresponding full-layer mean error from 97.51 to 78.08 (19.9%). The results show that robust migration control and lightweight scale learning provide consistent gains over max-based fixed policies while preserving the equivalent-transform framework.
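
The effect of swapping a max statistic for a high quantile can be sketched with the familiar SmoothQuant-style per-channel scale s_j = a_j^alpha / w_j^(1 - alpha). Only the scaling half is shown; the rotation part of a SmoothRot-style transform and the gradient-based scale training are omitted, and the function names and the default `alpha` are assumptions.

```python
import numpy as np

def smoothing_scales(X, W, alpha=0.5, q=None, eps=1e-8):
    """Per-channel migration scales.

    X: (tokens, channels) activations; W: (channels, out) weights.
    q=None uses the max of |X| per channel; otherwise the q-th quantile.
    """
    abs_x = np.abs(X)
    act_stat = abs_x.max(axis=0) if q is None else np.quantile(abs_x, q, axis=0)
    w_stat = np.abs(W).max(axis=1)
    return (act_stat + eps) ** alpha / (w_stat + eps) ** (1 - alpha)

def apply_equivalent_transform(X, W, s):
    """Divide activations and multiply weights by s, leaving X @ W unchanged."""
    return X / s, W * s[:, None]
```

A single extreme activation drives the max-based scale for its channel far higher than the quantile-based one, which is the over-migration the abstract investigates; in both cases the product `X @ W` is preserved before quantization.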

2606.09926 2026-06-10 cs.LG cs.AI New submission

Sample Where You Struggle: Sharpening Base Model Reasoning via Entropy-Guided Power Sampling

Hong Guo, Nianhui Guo, Christoph Meinel, Haojin Yang

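AI summary: Proposes Entropy-Guided Power Sampling (EGPS), a training-free and verifier-free sampling method that uses token-level entropy from the forward pass to localize MCMC moves to high-entropy regions, reaching best or tied-best accuracy on multiple benchmarks with up to a 12.6x speedup.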

Abstract

Sampling from the sequence-level power distribution $p^\alpha$ elicits RL-level reasoning from base language models without any parameter updates, but the standard Metropolis-Hastings (MH), a Markov Chain Monte Carlo (MCMC) sampler, is both expensive and slow-mixing. We trace both to a structural mismatch: $p^\alpha$ mainly departs from $p$ at a sparse, spatially clustered set of high-entropy decision points, yet MH proposes resampling positions uniformly along the prefix, wasting compute on near-degenerate conditionals while under-mixing precisely where modes diverge. We propose Entropy-Guided Power Sampling (EGPS), a training-free and verifier-free sampler that re-derives its proposal from token-level entropy already in the forward pass. EGPS skips deterministic blocks, localizes each MCMC move to a high-entropy neighborhood, and applies Multiple-Try Metropolis at decision points, making sampling cost scale with entropy mass rather than sequence length. On Qwen2.5-Math-7B, EGPS reaches best or tied-best accuracy on all three benchmarks (MATH500 75.8%, HumanEval 62.2%, GPQA 42.4%) at up to a 12.6x wall-clock speedup over the MH baseline.
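
The contrast between uniform and entropy-guided position proposals can be sketched as follows. The function names, the `threshold` value, and the choice to make the proposal proportional to entropy are assumptions; the abstract only states that the proposal is derived from token-level entropy.

```python
import numpy as np

def token_entropies(probs):
    """Entropy (nats) of each next-token distribution; probs has shape (seq_len, vocab)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def resample_position_proposal(probs, threshold=0.5):
    """Distribution over positions at which to restart sampling (hypothetical rule).

    Positions with entropy below `threshold` get zero mass, the rest get mass
    proportional to their entropy. Falls back to uniform if nothing qualifies.
    """
    h = token_entropies(probs)
    w = np.where(h >= threshold, h, 0.0)
    if w.sum() == 0.0:
        return np.full(len(h), 1.0 / len(h))
    return w / w.sum()
```

In a full Metropolis-Hastings sampler a non-uniform proposal like this must also appear in the acceptance ratio so that the chain still targets $p^\alpha$; that correction is left out of the sketch.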

2606.09925 2026-06-10 cs.SD New submission

AudioProcessBench: Benchmark for Identifying Process Errors in Audio-Grounded Reasoning

Xiangyu Zhao, Junyu Yan, Yaling Shen, Zimu Wang, Yiwen Jiang, Stephanie Fong, Qingyang Xu, Jiahe Liu, Dominic Dwyer, Zongyuan Ge

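AI summary: Proposes the AudioProcessBench benchmark for evaluating how well audio-language models identify process errors in reasoning steps, covering three paradigms: step correctness, error-type detection, and chain-level aggregation.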

Abstract

Large audio-language models (LALMs) increasingly use explicit reasoning traces for complex audio understanding, yet the evaluation of reasoning quality remains underexplored. Although process-level benchmarks for process reward models (PRMs) have advanced reasoning evaluation in text and multi-modal domains, comparable evaluation for audio reasoning remains limited. In this paper, we present AudioProcessBench, a comprehensive benchmark for step-level process error identification in audio reasoning. AudioProcessBench contains diverse reasoning traces generated by 6 audio and omni language models. Each trace is segmented into discrete reasoning steps and annotated with binary step correctness and fine-grained error types. Our benchmark evaluates models under three complementary paradigms: (1) step correctness identification, (2) error-type-conditioned detection for diagnosing audio-specific verifier capacities, and (3) chain-level aggregation, where verifiers select or aggregate among multiple reasoning traces for the same question. This design enables a systematic analysis of whether current models can detect process errors, whether their weaknesses differ across audio-specific error types, and whether process verification translates into improved answer selection. AudioProcessBench provides a testbed for future research on audio reasoning verifiers, process reward models, and reliable omni-modal reasoning.

2606.09924 2026-06-10 cs.LG cs.AI New submission

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

Kohga Tanaka, Hiroaki Nishi

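AI summary: Proposes the Sigma-Branch framework, which restructures a pretrained dense network into a hierarchical binary tree of a shared backbone, hierarchical routers, and specialized leaves, initialized by activation clustering and then fine-tuned; only a single path executes at inference, cutting active parameters by 58-60% on tasks such as CIFAR-100 / ResNet-50 while staying within 1.72 percentage points of the dense baseline.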

Abstract

Deploying deep neural networks on memory-constrained edge accelerators is bottlenecked by per-inference off-chip weight transfer rather than computation: the dense network cannot be retained on-chip, and every parameter must be loaded for every input. Existing model compression reduces this transfer only at the cost of permanent capacity loss. We propose Sigma-Branch (SigmaB), a framework that restructures a pretrained dense network into a hierarchical binary tree composed of a shared backbone, hierarchical routers, and specialized leaves. Pretrained weights are distributed across the tree via activation-based spherical k-means clustering, which jointly initializes router weights and per-branch channel allocations; soft-routing fine-tuning then aligns each leaf with its routed input subset. At inference, the resulting network executes only a single root-to-leaf path, reducing the active-parameter footprint while storing the complete dense parameter set in memory. Across CIFAR-100 / ResNet-50, ImageNet-1K / ResNet-50, and ModelNet40 / PointNet++, SigmaB-Net reduces per-inference active parameters by 58-60% while remaining within 1.72 percentage points (pp) of the dense baseline Top-1. At comparable ImageNet-1K Top-1, the active-parameter reduction exceeds static structured pruning (FPGM, HRank) by 14-23 pp. The cross-modal evaluation, spanning 2D vision and 3D point-cloud backbones, substantiates a framework-level claim that decouples per-inference memory traffic from the total parameter count.
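
Single-path execution through a binary router tree might look like the sketch below. The heap-style node numbering, the sign-based routing rule, and the parameter-count helper (which ignores router parameters) are assumptions for illustration, not the SigmaB implementation.

```python
import numpy as np

def route_to_leaf(features, router_weights, depth):
    """Hard-route one input from the root to a leaf.

    Nodes are heap-indexed (root 0, children 2i+1 and 2i+2). At each level the
    sign of a linear router score picks the branch. Returns a leaf index in
    [0, 2**depth).
    """
    node = 0
    for _ in range(depth):
        go_right = float(features @ router_weights[node]) > 0.0
        node = 2 * node + (2 if go_right else 1)
    return node - (2 ** depth - 1)

def active_fraction(shared_params, leaf_params, n_leaves):
    """Fraction of all parameters touched by one inference (routers ignored)."""
    total = shared_params + n_leaves * leaf_params
    return (shared_params + leaf_params) / total
```

Only the shared backbone and the one selected leaf need to be loaded per input, which is why the active-parameter footprint shrinks even though the full parameter set stays in memory.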

2606.09923 2026-06-10 cs.LG cs.AI New submission

Conformal Prediction for Neural Operators: Distribution-Free Uncertainty Quantification in Physics Simulation

Michael Chin

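AI summary: Applies split conformal prediction to neural-operator physics simulation, yielding distribution-free prediction intervals with finite-sample coverage guarantees, and produces adaptive-width intervals through a normalized conformal prediction scheme.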

Comments: 13 pages, 7 tables, 7 figures. Full-scale experiments on NVIDIA V100

Abstract

Neural operators such as the Fourier Neural Operator (FNO) have emerged as powerful surrogates for solving partial differential equations (PDEs), achieving speedups of several orders of magnitude over traditional numerical solvers. However, deploying these models in safety-critical engineering applications -- such as thermal management of electronic components and battery systems -- requires not only accurate point predictions but also rigorous uncertainty guarantees. Existing uncertainty quantification (UQ) methods for neural operators, including Monte Carlo Dropout and Deep Ensembles, provide only relative uncertainty estimates without formal coverage guarantees. In this work, we propose the first application of split conformal prediction to neural operator-based physics simulation, providing distribution-free prediction intervals with finite-sample coverage guarantees. We further introduce a normalized conformal prediction scheme that leverages MC Dropout uncertainty to produce adaptive-width intervals, yielding tighter intervals in regions of low uncertainty and wider intervals where the model is less certain. Full-scale experiments (33.7M parameters, 800 training samples, 5 ensemble members, NVIDIA V100) on steady-state heat conduction benchmarks demonstrate that our method achieves 89.1% empirical coverage at the target level of alpha=0.1, while producing spatially adaptive prediction intervals that reflect the underlying physical uncertainty structure. We also provide an uncertainty decomposition framework that separates epistemic uncertainty (68% of total) from aleatoric uncertainty (32% of total), offering actionable guidance for data collection and model improvement. Our method is implemented in an open-source platform with REST API endpoints and interactive 3D visualization.
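
Split conformal prediction itself is a short, standard procedure, sketched here with NumPy. The function names are illustrative, and how the paper defines its scores over spatial fields is not stated in the abstract; the sketch treats each score as a scalar residual.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected quantile of calibration scores.

    Uses level ceil((n + 1) * (1 - alpha)) / n, capped at 1.
    """
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return float(np.quantile(scores, level, method="higher"))

def split_conformal_interval(pred, q):
    """Fixed-width interval; q comes from scores |y - pred| on a calibration set."""
    return pred - q, pred + q

def normalized_interval(pred, sigma, q_norm):
    """Adaptive-width interval; q_norm comes from scores |y - pred| / sigma,
    where sigma is an uncertainty estimate such as an MC Dropout std."""
    return pred - q_norm * sigma, pred + q_norm * sigma
```

With alpha = 0.1 the guarantee is marginal coverage of at least 90% under exchangeability, which is the target against which the reported 89.1% empirical coverage should be read.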

2606.09919 2026-06-10 cs.LG cs.AI cs.MA cs.RO New submission

Co-GLANCE: Uncertainty-Aware Active Perception for Heterogeneous Robot Teaming

Michal P. Podolinsky, Neel P. Bhatt, Pranay Samineni, Rohan Siva, Christian Ellis, Ufuk Topcu

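AI summary: Proposes the Co-GLANCE system, which distills a vision-language model for real-time occlusion segmentation and robot allocation and combines conformal prediction with selective abstention for statistically guaranteed uncertainty quantification that drives active perception; in real-world scenarios it improves occlusion segmentation and allocation accuracy by 25% and 36%, respectively, and cuts inference latency by 350x.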

Comments: Code, videos, and dataset available at https://co-glance.github.io/

Abstract

Perceptual uncertainty is a central challenge for heterogeneous robot teams operating in unstructured outdoor environments, where no single viewpoint affords reliable scene understanding. Perceptual uncertainty, arising from sources such as occlusions, manifests differently across robot viewpoints depending on scene structure. Detecting and resolving sources of perceptual uncertainty requires both scene-based contextual reasoning and capability-aware robot allocation. While vision-language models provide strong semantic priors for both, they are computationally prohibitive for onboard inference and lack calibrated uncertainty quantification. We introduce Co-GLANCE, a real-time onboard perception and decision-making system for uncertainty resolution in heterogeneous robot teams. Co-GLANCE distills the semantic reasoning capabilities of a vision-language model into an end-to-end model for occlusion segmentation and robot allocation, eliminating the need for cloud-based inference. To quantify perceptual uncertainty, Co-GLANCE combines conformal prediction with selective abstention to provide statistically valid coverage guarantees for segmentation, robot allocation, and detection outputs. These calibrated uncertainty estimates directly trigger active perception, dispatching the most appropriate robot to acquire informative viewpoints and resolve uncertainty. Across real-world scenarios, Co-GLANCE outperforms cloud-based vision-language model baselines in occlusion segmentation and robot allocation accuracy by 25% and 36%, respectively, while reducing per-frame inference latency by 350x. We also release an air-ground dataset for future research. Code, videos, and dataset are available at https://co-glance.github.io/.

2606.09917 2026-06-10 cs.LG New submission

SPDM: Geometry-Modulated State Space Modeling with Manifold Constraints for Time Series Forecasting

Xingsheng Chen, Siu-Ming Yiu

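AI summary: Proposes SPDM, a geometry-aware architecture that introduces symmetric positive definite manifold constraints into state-space models, modulating the selective scan through a manifold trajectory and a geometric gating mechanism to improve multivariate time series forecasting accuracy while keeping linear complexity.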

Abstract

Multivariate time series forecasting requires capturing the continuously evolving correlation structure among interacting variables. Existing state-space models process time series by scanning tokenized temporal or spatial sequences, discarding the evolutionary geometric structure. We address this limitation by introducing manifold constraints into state-space modeling: treating the cross-variable correlation structure as a continuous trajectory on the symmetric positive definite (SPD) manifold, whose Riemannian geometric features, tangent space linearity, and Fréchet mean centrality act as a principled geometric regularizer that guides and stabilizes the selective scanning dynamics of SSMs. We propose SPDM, a geometry-aware SSM architecture that realizes this principle through two cooperating mechanisms: a manifold trajectory path that projects dynamically evolving covariance matrices from the SPD manifold to a Euclidean tangent space, and a geometric gating scheme that directly modulates the SSM's internal selective parameters based on geometric signals derived from the manifold trajectory. The parameterization preserves the linear-time complexity of the Mamba parallel scan while embedding rich structural constraints, so the architecture retains both prediction accuracy and computational efficiency. Extensive experiments on eleven real-world benchmark datasets establish state-of-the-art forecasting performance, and further studies confirm that geometrically constrained state-space dynamics are the dominant architectural factor behind its performance gains.
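
Projecting an SPD matrix to a Euclidean tangent space is commonly done with the matrix logarithm, sketched below. Taking the identity as the reference point (the log-Euclidean choice) and the function names are assumptions; the abstract mentions the Fréchet mean, so the paper may well use a different base point.

```python
import numpy as np

def spd_log(C):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def tangent_vector(C):
    """Flatten log(C) into a Euclidean vector.

    Off-diagonal entries are scaled by sqrt(2) so that the vector norm equals
    the Frobenius norm of log(C).
    """
    L = spd_log(C)
    iu = np.triu_indices_from(L, k=1)
    return np.concatenate([np.diag(L), np.sqrt(2.0) * L[iu]])
```

Applied to a sliding-window covariance matrix at each time step, a map like this turns the trajectory on the manifold into an ordinary vector sequence from which gating signals could be computed.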

2606.09916 2026-06-10 cs.LG cs.AI 新提交

IntentKV: Cross-Turn Intent-Aware KV Cache Pruning for Agent Inference

IntentKV: 面向Agent推理的跨轮次意图感知KV缓存剪枝

Junjie Li, Jiong Lou, Jie Li

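AI summary: To address the KV cache becoming the serving bottleneck in multi-turn LLM agents, proposes IntentKV, which uses a session-level QueryMemory and a residual attention head for cross-turn, intent-aware KV pruning, substantially reducing peak request tokens and KV reads while preserving accuracy.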

详情
AI中文摘要

多轮LLM Agent将短查询扩展为包含工具调用、搜索结果和中间推理的长轨迹。在单条轨迹中,KV内存和KV读取带宽增长数个数量级,使得键值(KV)缓存(而非参数计算)成为长时Agent的主要服务瓶颈。我们提出IntentKV,一种学习型KV剪枝方法,保持基础LLM冻结。IntentKV维护一个会话级的跨轮次意图QueryMemory,通过记忆-注意力规则对实时历史token进行评分,并添加一个零初始化的残差注意力头,对当前查询的K向量进行交叉注意力。为了与前缀缓存保持可组合性,驱逐采用槽位映射重定向:被丢弃的位置路由到一个哨兵死槽,而存活的K/V行、RoPE相位和槽位标识保持不变。在严格的KV预算下,IntentKV与无剪枝的全缓存基线相比几乎没有精度下降:在8k KV预算下,Qwen3-8B的平均峰值请求token下降23.9%,Qwen2.5-14B下降30.7%。在Qwen2.5-14B上所有方法都能完成的100个最长BCP查询中,IntentKV-8k进一步将最坏情况下的峰值请求token从92.3k降至20.5k(减少77.8%),最坏情况下的原始KV读取从4.11亿降至3100万(减少92.6%)。

英文摘要

Multi-turn LLM agents fan short queries into long trajectories of tool calls, search results, and intermediate reasoning. Both KV memory and KV read bandwidth grow by orders of magnitude across a single trajectory, making the key-value (KV) cache, not parameter compute, the dominant serving bottleneck for long-horizon agents. We introduce IntentKV, learned KV pruning that keeps the base LLM frozen. IntentKV maintains a session-level QueryMemory of cross-turn intent, scores live history tokens with a memory-attention rule, and adds a zero-initialized residual head with cross-attention over current-query K-vectors. To stay composable with prefix caches, eviction is a slot-map redirection: dropped positions route to a sentinel dead slot while surviving K/V rows, RoPE phases, and slot identities stay in place. IntentKV matches the no-pruning full-cache baseline with almost no accuracy drop under tight KV budgets: at an 8k KV budget, mean peak request tokens drop 23.9% on Qwen3-8B and 30.7% on Qwen2.5-14B. On the 100 longest BCP queries that all methods complete on Qwen2.5-14B, IntentKV-8k further cuts worst-case peak request tokens from 92.3k to 20.5k, a 77.8% reduction, and worst-case raw KV reads from 411M to 31M, a 92.6% reduction.
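
The slot-map redirection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sentinel value `DEAD_SLOT`, the function name `evict`, and the plain list representation of the slot map are all assumptions, and the scores stand in for whatever the memory-attention rule produces.

```python
DEAD_SLOT = -1  # hypothetical sentinel slot that dropped positions point to


def evict(slot_map, scores, budget):
    """Redirect the lowest-scoring live positions to DEAD_SLOT.

    slot_map[p] is the cache slot of token position p. Surviving positions
    keep their slot unchanged, so cached K/V rows are never moved.
    """
    live = [p for p, slot in enumerate(slot_map) if slot != DEAD_SLOT]
    n_drop = max(0, len(live) - budget)
    # Lowest score is dropped first; ties go to the older position.
    to_drop = sorted(live, key=lambda p: (scores[p], p))[:n_drop]
    new_map = list(slot_map)
    for p in to_drop:
        new_map[p] = DEAD_SLOT
    return new_map


slot_map = [0, 1, 2, 3, 4]
scores = [0.9, 0.1, 0.5, 0.05, 0.7]
pruned = evict(slot_map, scores, budget=3)
```

Because only the map changes, a prefix cache that indexes rows by slot identity would still find the surviving rows where it left them, which is the composability property the abstract emphasizes.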

2606.09912 2026-06-10 cs.LG cs.AI 新提交

Mix, Don't Pick: Why Synthetic Corpus Composition Matters for Time Series Foundation Model Pretraining

混合而非挑选:为什么合成语料组合对时间序列基础模型预训练至关重要

Aaryan Nagpal, Debdeep Sanyal, Murari Mandal, Dhruv Kumar, Saurabh Deshpande

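AI summary: Addressing the difficulty of choosing synthetic data generators for time-series foundation model pretraining, proposes a simple equal-weight mixture of all generators that matches or beats the best individual generator and, combined with real data, yields the strongest pretraining corpus.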

Comments Accepted at the ICML 2026 Workshop on Foundation Models for Structured Data (FMSD), Seoul, South Korea

详情
AI中文摘要

为时间序列基础模型预训练选择错误的合成生成器代价高昂:在相同训练预算下,最佳和最差生成器产生的预测误差差距可达2倍,然而该领域尚无原则性的选择方法。问题因生成器排名在不同架构间不稳定而加剧:在11个生成器家族上,对从头训练的Chronos-T5-Mini和Moirai-Small进行评估,我们发现哪些生成器有用取决于模型架构。我们没有解决生成器选择问题,而是绕过了它:所有生成器的简单等权混合匹配或击败了两种架构的最佳单个生成器,并且将此混合与真实数据组合产生了整体最强的预训练语料。因此,合成预训练是一个语料组合问题,而非生成器选择问题,组合选择应针对每个模型家族进行验证,而非假设可迁移。

英文摘要

Choosing the wrong synthetic generator for time-series foundation model pretraining is costly: under identical training budgets, the best and worst generators produce up to a $2\times$ gap in forecasting error, yet the field has no principled way to make this choice. The problem is compounded by the fact that generator rankings are not stable across architectures: across 11 generator families evaluated on Chronos-T5-Mini and Moirai-Small trained from scratch, we find that which generators are useful depends on the model architecture. Rather than solving the generator selection problem, we sidestep it: a simple equal-weight mixture of all generators matches or beats the best individual generator for both architectures, and composing this mixture with real data yields the strongest pretraining corpora overall. Synthetic pretraining is therefore a corpus composition problem, not a generator selection problem, and composition choices should be validated per model family rather than assumed to transfer.
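
An equal-weight mixture simply draws each training series from a generator chosen uniformly at random. The sketch below assumes two toy generators (`sine_series`, `random_walk`) that are purely illustrative and are not among the paper's 11 generator families.

```python
import math
import random


def sine_series(rng, length=32):
    """Toy generator: a sine wave with a random phase."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [math.sin(0.3 * t + phase) for t in range(length)]


def random_walk(rng, length=32):
    """Toy generator: a Gaussian random walk."""
    x, out = 0.0, []
    for _ in range(length):
        x += rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def sample_mixture(generators, n_series, seed=0):
    """Build a corpus by picking a generator uniformly for every series."""
    rng = random.Random(seed)
    names = sorted(generators)
    corpus = []
    for _ in range(n_series):
        name = rng.choice(names)  # equal weight for every generator
        corpus.append((name, generators[name](rng)))
    return corpus


corpus = sample_mixture({"sine": sine_series, "random_walk": random_walk}, 100)
```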

2606.09907 2026-06-10 cs.LG cs.AI 新提交

LongMoE: Longitudinal Multimodal Learning via Trajectory-Aware Mixture-of-Experts

LongMoE:基于轨迹感知的混合专家模型的纵向多模态学习

Maxx Richard Rahman, Prakhar Kumar, Wolfgang Maass

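AI summary: Proposes the LongMoE framework, which jointly tackles modality missingness and longitudinal dynamics in clinical multimodal learning through context-aware imputation, attentional tokenization, trajectory-aware encoding, and sparse MoE routing; robustness is validated on ADNI and other datasets.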

详情
AI中文摘要

多模态临床学习对于整合包括影像、文本和个性化健康记录在内的多样化患者数据日益重要。然而,它面临两个基本挑战:i) 模态缺失,即在一次患者就诊中任意子集的模态不可用;ii) 纵向动态,即观察结果的诊断意义取决于患者随时间演变的疾病轨迹。现有方法孤立地处理这些挑战:缺失模态框架将每次就诊视为独立的静态快照并丢弃时间上下文,而纵向模型通常假设模态完全可用并在系统性模态不完整时性能下降。我们提出LongMoE(纵向混合专家模型),这是一个统一框架,用于联合解决这两个挑战。LongMoE结合了上下文感知插补模块和注意力标记化模块,后者捕获不规则就诊序列中的频域时间模式,以及用于建模疾病进展的轨迹感知编码器和用于患者特定专家选择的上下文条件稀疏MoE路由。在ADNI、OASIS-3和MIMIC-IV上的实验表明,LongMoE在缺失或弱共时模态下提高了鲁棒性,并在全模态设置中保持竞争力,为纵向感知的多模态临床学习奠定了坚实基础。

英文摘要

Multimodal clinical learning is increasingly important for integrating diverse patient data, including imaging, text, and personalised health records. However, it faces two fundamental challenges: (i) modality missingness, where arbitrary subsets of modalities are unavailable at a given patient visit, and (ii) longitudinal dynamics, where the diagnostic significance of an observation depends on the patient's evolving disease trajectory over time. Existing methods address these challenges in isolation: missing-modality frameworks treat each visit as an independent static snapshot and discard temporal context, while longitudinal models often assume complete modality availability and degrade under systematic modality incompleteness. We propose LongMoE (Longitudinal Mixture-of-Experts), a unified framework that jointly addresses both challenges. LongMoE combines a context-aware imputation module with an attentional tokenization module that captures frequency-domain temporal patterns across irregular visit sequences, a trajectory-aware encoder for modeling disease progression, and context-conditioned Sparse MoE routing for patient-specific expert selection. Experiments on ADNI, OASIS-3, and MIMIC-IV show that LongMoE improves robustness under missing or weak contemporaneous modalities and remains competitive in full-modality settings, establishing a strong foundation for longitudinally-aware multimodal clinical learning.

2606.09875 2026-06-10 cs.LG cs.AI stat.ML 新提交

Integrating Local and Global Entropy for Uncertainty Quantification in LLMs

集成局部和全局熵用于大语言模型的不确定性量化

Johanne Medina, Tianyi Zhou, Keivin Isufaj, Aristides Gionis, Sanjay Chawla

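AI summary: Proposes GLU, which quantifies LLM uncertainty by fusing hidden-state geometric entropy (global) with token-level entropy (local), capturing the confident-but-wrong failure mode without additional training.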

Comments 17 pages, 2 figures

详情
AI中文摘要

大语言模型会自信地产生幻觉,使得不确定性量化(UQ)对于可靠部署至关重要。现有方法主要依赖token级信号,而中间隐藏状态的几何结构未被充分利用。在本文中,我们将隐藏状态矩阵的几何复杂度作为LLM全局不确定性的度量,同时将token级不确定性估计视为局部度量。我们表明,隐藏状态几何熵(全局不确定性)和token级熵(局部不确定性)在统计上近似正交,捕捉了可靠性预测的不同失败模式。特别地,全局几何恢复了局部信号系统性遗漏的自信但错误的失败模式。基于此,我们提出了全局-局部不确定性(GLU),这是一种无监督、单次前向传播的分数,通过乘法门融合两种信号。在三个模型族和六个基准测试中,GLU匹配或优于所有无监督基线,同时仅需一次前向传播,且保持长度归一化和架构无关性。

英文摘要

Large language models hallucinate confidently, making uncertainty quantification (UQ) essential for reliable deployment. Existing methods rely predominantly on token-level signals, leaving the geometric structure of intermediate hidden states underused. In this paper, we take the geometric complexity of hidden-state matrices as a measure of the global uncertainty of LLMs, while treating token-level uncertainty estimation as a local metric. We show that hidden-state geometric entropy (global uncertainty) and token-level entropy (local uncertainty) are statistically near-orthogonal, capturing distinct failure regimes for reliability prediction. In particular, global geometry recovers the confident-but-wrong failure mode that local signals systematically miss. Building on this, we propose Global-Local Uncertainty (GLU), an unsupervised, single-pass score that fuses the two signals via a multiplicative gate. Across three model families and six benchmarks, GLU matches or outperforms all unsupervised baselines while requiring only a single forward pass and remaining length-normalized and architecture-agnostic.

2606.09873 2026-06-10 cs.LG cs.AI 新提交

Rotate2Think: Geometric Priming via Orthogonal Rotation to Improve Language Model Reasoning

Rotate2Think:通过正交旋转进行几何提示以提升语言模型推理能力

Aditya Sharma, Christopher J. Pal, Amal Zouaq

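AI summary: Finds that the input and thinking embeddings of reasoning models are highly conic with non-collinear mean directions, and proposes the training-free Rotate2Think, which estimates a rotation via orthogonal Procrustes analysis and injects a synthetic thinking vector, improving accuracy on mathematics, science, and code tasks in 30 of 32 configurations.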

详情
AI中文摘要

推理模型通过生成显式的中间推理轨迹再给出最终答案,在挑战性任务上取得了强劲表现。然而,推理过程中表示空间的内部结构仍知之甚少:模型的隐藏表示在思考时与输入提示的嵌入有何不同?这种结构能否被利用以在推理时激发更强的推理能力?我们表明,输入嵌入和思考嵌入(分别对提示和推理轨迹的最后一层隐藏状态进行均值池化)都表现出极高的锥度,所有向量紧密聚集在单一平均方向周围。关键的是,这些平均输入方向和思考方向是非共线的,思考嵌入在嵌入空间中占据了几何上不同的区域,这在许多不同模型和基准任务中均成立。这一观察启发我们将输入到思考的转换视为一个旋转问题,该问题可通过正交Procrustes分析得到闭式解。我们提出Rotate2Think,一种无需训练的方法,从少量正确求解的示例中估计该旋转,并在推理时将生成的合成思考向量注入思考分隔符之间,在推理轨迹开始时提供几何提示。在多个基准和模型家族上的评估表明,Rotate2Think在数学、科学和代码任务的32个模型-基准配置中改进了30个的准确率,并零样本泛化到MATH-Vision上的多模态推理。

英文摘要

Reasoning models achieve strong performance on challenging tasks by generating explicit intermediate reasoning traces before producing a final answer. Yet the internal structure of representation space when reasoning remains poorly understood: how do a model's hidden representations differ during thinking versus the embeddings of the input prompt, and can this structure be exploited to elicit stronger reasoning at inference time? We show that both input embeddings and thinking embeddings (mean-pooled last-layer hidden states over the prompt and reasoning trace, respectively) exhibit extremely high conicity, with all vectors clustering tightly around a single mean direction. Crucially, these mean input and thinking directions are non-collinear, with thinking embeddings occupying a geometrically distinct region of embedding space across many different models and benchmark tasks. This observation motivates casting the input-to-thinking transition as a rotation problem admitting a closed-form solution via orthogonal Procrustes analysis. We propose Rotate2Think, a training-free method that estimates this rotation from a small set of correctly solved examples and injects the resulting synthetic thinking vector between thinking delimiters at inference time, providing a geometric primer at the onset of the reasoning trace. Evaluated across multiple benchmarks and model families, Rotate2Think improves accuracy in 30 of 32 model-benchmark configurations across mathematics, science, and code tasks, and generalizes zero-shot to multimodal reasoning on MATH-Vision.
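
For reference, the closed-form solution of the orthogonal Procrustes problem mentioned above is the standard one below. The notation is an assumption for illustration: $X, Y \in \mathbb{R}^{n \times d}$ stack the input and thinking embeddings of $n$ correctly solved examples as rows; the abstract does not give the paper's own notation.

```latex
R^{\star} \;=\; \arg\min_{R^{\top} R = I} \;\lVert X R - Y \rVert_{F}^{2},
\qquad
X^{\top} Y = U \Sigma V^{\top} \;\;\Longrightarrow\;\; R^{\star} = U V^{\top}
```

The minimizer follows because minimizing the Frobenius error is equivalent to maximizing $\operatorname{tr}(R^{\top} X^{\top} Y)$, which is largest when $V^{\top} R^{\top} U = I$. Under this reading, a synthetic thinking vector for a new prompt embedding $x$ (a row vector) would be $x R^{\star}$.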

2606.09871 2026-06-10 cs.CV cs.AI cs.LG 新提交

SD-GRPO: Verifiable Segment Decomposition for Long-Form Vision-Language Generation

SD-GRPO:面向长格式视觉-语言生成的可验证片段分解

Hyunwoong Kim, Seongeun Lee, Hannah Yun, Junhyun Park, Jonggwon Park

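AI summary: Proposes SD-GRPO, which decomposes long-form outputs into segments and computes per-segment advantages to address GRPO's coarse-grained credit assignment in vision-language tasks; experiments show it outperforms baselines on several long-form generation tasks.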

详情
AI中文摘要

群体相对策略优化(GRPO)及其变体最初为大型语言模型(LLM)开发,最近被应用于多模态LLM并取得了强劲结果。然而,它们基于单一标量优势的粗粒度整体信用分配在视觉-语言(VL)任务中拟合不足,这些任务的输出通常是基于语义丰富图像的长格式响应。为解决这一限制,我们利用了一种单标量公式丢弃的结构化信号:长格式VL输出的自然分段。具体地,我们提出片段分解GRPO(SD-GRPO),它对整个rollout组中可验证的逐片段奖励进行z归一化,生成一个逐片段优势向量以替代单一标量。我们在三个设置中评估SD-GRPO,涵盖受控和真实世界的长格式VL生成,按片段间语义纠缠程度递增组织。在从DOCCI构建的受控多面板密集字幕任务中(片段语义独立),SD-GRPO始终优于GRPO基线,且片段数量越多增益越大。扩展到从MultiChartQA构建的受控多图表长格式VQA任务,我们从理论和经验上证明,rollout级奖励存在随输出长度增加而加剧的跨片段信用错误归因。在MMSci数据集上的真实世界科学图表字幕任务中(子图字幕共享图表上下文),混合整体和逐片段奖励进一步提升了两者性能,表明当片段语义纠缠时,仅逐片段归一化是不够的。最后,通过将SD-GRPO集成到Dr. GRPO中,我们确认它可以以最小的实现开销应用于任何GRPO框架,以增强长格式VL生成。

英文摘要

Group Relative Policy Optimization (GRPO) and its variants, originally developed for Large Language Models (LLMs), have recently been applied to Multimodal LLMs and produced strong results. However, their coarse-grained holistic credit assignment from a single scalar advantage underfits vision-language (VL) tasks, where outputs are often long-form responses grounded in semantically rich images. To address this limitation, we exploit a structured signal that single-scalar formulations discard: the natural segmentation of long-form VL outputs. Concretely, we propose Segment-Decomposed GRPO (SD-GRPO), which z-normalizes verifiable per-segment rewards across the rollout group, yielding a vector of per-segment advantages in place of a single scalar. We evaluate SD-GRPO across three settings spanning controlled and real-world long-form VL generation, organized by increasing semantic entanglement across segments. On a controlled multi-panel dense-captioning task constructed from DOCCI, where segments are semantically independent, SD-GRPO consistently outperforms the GRPO baseline, with larger gains at higher segment counts. Extending to a controlled multi-chart long-form VQA task constructed from MultiChartQA, we show both theoretically and empirically that rollout-level rewards suffer from cross-segment credit misattribution that scales with output length. On a real-world scientific figure captioning task on the MMSci dataset, where subfigure captions share context across the figure, blending holistic and per-segment rewards further improves on both, suggesting per-segment normalization alone is insufficient when segments are semantically entangled. Finally, by integrating SD-GRPO into Dr. GRPO, we confirm that it can be applied to any GRPO framework with minimal implementation overhead to enhance long-form VL generation.
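
The per-segment z-normalization can be illustrated with a few lines of Python. This is a sketch under assumptions: the function name, the use of the population standard deviation, and the small `eps` stabilizer are choices made here, not details taken from the paper.

```python
from statistics import mean, pstdev


def per_segment_advantages(rewards, eps=1e-8):
    """z-normalize each segment's reward across the rollout group.

    rewards[i][k] is the verifiable reward of segment k in rollout i.
    Returns advantages of the same shape: one value per segment instead
    of a single scalar per rollout.
    """
    n_segments = len(rewards[0])
    columns = [[row[k] for row in rewards] for k in range(n_segments)]
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        [(row[k] - stats[k][0]) / (stats[k][1] + eps) for k in range(n_segments)]
        for row in rewards
    ]


# Four rollouts, two segments each (1.0 = segment verified correct).
rewards = [[1.0, 0.0], [0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
advantages = per_segment_advantages(rewards)
```

In this toy group the first rollout gets a positive advantage on its first segment and a negative one on its second, whereas a single rollout-level scalar would assign both segments the same credit.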

2606.09869 2026-06-10 cs.LG cs.AI cs.CR 新提交

QSplitFL: Capability Aware Deep Q-Learning for Optimal Split Point Selection in Split Federated Learning

QSplitFL: 基于能力感知的深度Q学习在分割联邦学习中的最优分割点选择

Nazmus Shakib Shadin, Xinyue Zhang, Jingyi Wang, Miao Pan

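AI summary: Proposes the QSplitFL framework, which uses a Deep Q-Network to select the split point dynamically from client hardware metrics (CPU, memory, battery, network latency), addressing split federated learning on heterogeneous devices; a decayed loss-drop reward function and committee voting improve convergence speed and accuracy.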

Comments Accepted by ECML-PKDD 2026

详情
AI中文摘要

联邦学习(FL)与分割学习(SL)结合是一种隐私保护范式,能够在资源受限设备上训练深度神经网络(DNN),同时降低整体训练成本。然而,确定最优分割点(即模型被分割的层)仍然是一个关键挑战,尤其是当客户端具有异构硬件能力时。固定分割点可能使弱设备过载,增加通信和服务器负载,从而减慢收敛速度并降低稳定性。本文介绍了QSplitFL,一种新颖的基于能力感知的深度Q网络(DQN)框架,用于在基于分割学习的联邦学习(SFL)环境中选择最优分割点。与依赖高维模型权重表示的现有方法不同,QSplitFL采用直接从客户端硬件指标(包括CPU利用率、内存、电池电量和网络延迟)导出的轻量级状态表示。所提出的框架包含一个衰减损失下降奖励函数,优先考虑早期收敛,以及一个基于委员会的DQN架构,通过多数投票来减轻奖励黑客攻击。在MNIST、Fashion-MNIST、CIFAR-10和CIFAR-100数据集上,使用CNN、ResNet50、MobileNetV4和ConvNeXt架构进行的广泛实验表明,我们的方法在收敛速度和精度上优于现有方法,同时有效适应异构设备资源。源代码公开于 https://github.com/AIPO-Lab/QSplitFL。

英文摘要

Federated Learning (FL) combined with Split Learning (SL) is a privacy-preserving paradigm that enables training deep neural networks (DNNs) on resource-constrained devices while reducing overall training cost. However, determining the optimal split point, that is, the layer at which the model is divided, remains a critical challenge, especially when clients have heterogeneous hardware capabilities. Fixed split points can overload weak devices and increase the communication and server load, which slows convergence and reduces stability. This paper introduces QSplitFL, a novel capability-aware Deep Q-Network (DQN) framework for optimal split point selection in Split-learning-based Federated Learning (SFL) environments. Unlike existing approaches that rely on high-dimensional model weight representations, QSplitFL employs a lightweight state representation derived directly from client hardware metrics, including CPU utilization, memory, battery level, and network latency. The proposed framework incorporates a decayed loss-drop reward function that prioritizes early convergence, and a committee-based DQN architecture with majority voting to mitigate reward hacking. Extensive experiments on MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets using CNN, ResNet50, MobileNetV4, and ConvNeXt architectures demonstrate that our approach achieves better convergence and higher accuracy compared to existing methods, while effectively adapting to heterogeneous device resources. The source code is publicly available at https://github.com/AIPO-Lab/QSplitFL.
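
The two mechanisms named above, a decayed loss-drop reward and committee majority voting, might look roughly like the sketch below. The abstract gives no formulas, so the exponential decay, the `decay` value, and both function names are assumptions for illustration only.

```python
from collections import Counter


def decayed_loss_drop_reward(prev_loss, curr_loss, round_idx, decay=0.9):
    """Reward the loss drop, discounted so that early rounds count more."""
    return (decay ** round_idx) * (prev_loss - curr_loss)


def committee_vote(q_value_lists):
    """Each committee member votes for its greedy split point; majority wins."""
    votes = [max(range(len(q)), key=q.__getitem__) for q in q_value_lists]
    return Counter(votes).most_common(1)[0][0]


# The same loss drop earns less reward in round 5 than in round 0.
early = decayed_loss_drop_reward(2.0, 1.5, round_idx=0)
late = decayed_loss_drop_reward(2.0, 1.5, round_idx=5)

# Three hypothetical DQNs scoring three candidate split points.
split_point = committee_vote([[0.1, 0.9, 0.3], [0.2, 0.8, 0.1], [0.7, 0.2, 0.1]])
```

Voting over several networks means a single member that has learned to exploit the reward cannot dictate the chosen split point on its own.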

2606.09866 2026-06-10 cs.LG cs.AI 新提交

Two to Tango: Coupled Task-Reference Selection for Safe LLM Fine-tuning

双人探戈:面向安全LLM微调的耦合任务-参考选择

Xinrui Chen, Jianhao Zhang, Ou Wu, Di Gao

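AI summary: Proposes the DualSelect framework, which couples task and safety reference selection to preserve safety alignment during fine-tuning, improving the safety score by at least 5.10 points.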

详情
AI中文摘要

在下游数据上微调安全对齐的大型语言模型(LLMs)可以提高适应性,但可能会侵蚀已学习的安全行为。现有方法使用固定的安全示例、全局约束或单边任务过滤。我们的诊断表明,任务更新暴露了不同的安全约束,从而激发了联合选择相关参考和兼容任务样本的需求。我们提出DualSelect,一个耦合的任务和参考选择框架,它在过滤与诱导参考方向兼容的整个任务样本之前,刷新任务条件化的安全参考。在极小极大视角下,DualSelect通过熵正则化评分代理、惰性参考刷新和梯度校正,选择具有高保留损失和任务冲突的安全参考以及兼容的任务样本。在1B-8B LLMs上,DualSelect在不损失任务效用的情况下保持安全性;使用REDORCA评估器,它在安全平均值上比最强基线至少提高5.10分,并且在所有评估器中保持最高的安全平均值,且开销适中。这一观点扩展到以保留为中心的持续学习。

英文摘要

Fine-tuning safety-aligned large language models (LLMs) on downstream data improves adaptation but may erode learned safety behavior. Existing methods use fixed safety examples, global constraints, or one-sided task filtering. Our diagnostics show that task updates expose different safety constraints, motivating joint selection of relevant references and compatible task samples. We propose DualSelect, a coupled framework for task and reference selection that refreshes task-conditioned safety references before filtering whole task samples compatible with the induced reference direction. Under a minimax view, DualSelect selects safety references with high preservation loss and task conflict, together with compatible task samples, through entropy-regularized scoring surrogates, lazy reference refresh, and gradient correction. On 1B-8B LLMs, DualSelect preserves safety without losing task utility; using the REDORCA judge, it improves Safety Avg. over the strongest baseline by at least 5.10 points and remains highest in Safety Avg. across judges with moderate overhead. This view extends to retention-focused continual learning.

2606.09865 2026-06-10 cs.LG cs.CR cs.IR 新提交

LLM-as-a-Discriminator: When Synthetic Tables Still Look Real

LLM作为判别器:当合成表格看起来仍然真实

Manel Slokom, Malek Slokom, Thierno Kante

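AI summary: Proposes using LLMs to distinguish real from synthetic tabular data, tests different settings and models, and finds that LLM discrimination can serve as a practical privacy audit signal.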

详情
AI中文摘要

隐私和数据共享常常处于紧张状态。许多组织使用合成数据来降低隐私风险,同时仍能共享有用的数据。对于表格数据,审计隐私仍然困难。在许多情况下,即使是人类也很难判断一个表格是真实的还是合成的。在本文中,我们提出了一种基于LLM判别的方法。我们要求LLM将每个表格样本分类为真实或合成。我们测试了两种设置:C1仅包含表格,C2包含表格和分布元数据。我们使用LLaMA作为开放模型,Gemini作为参考模型。在我们的实验中,我们在两个公共数据集UCI Adult和ACS Census上运行了三种合成模型:CTGAN、TVAE和Gaussian Copula。我们收集了451个有效试验。我们的结果显示模型之间存在明显差异。在Adult上,LLaMA在报告单元格中达到DRS=0%,而Gemini对CTGAN和TVAE达到DRS=100%。在Census上,LLaMA预测大多数样本为合成,而Gemini在C1中保持高值,但在C2中对CTGAN和TVAE下降。我们还与分类器双样本检验(C2ST)和记录链接作为分布基线进行了比较,并与2名标注员和240次试验的人类试点进行了比较。我们的结果表明,当模型选择、每个提供者的报告和数据编码得到谨慎处理时,LLM判别是一种实用的隐私审计信号。为了可重复性,代码和实验脚本可在以下网址获得:https://github.com/SlokomManel/LLM-as-a-Discriminator。

英文摘要

Privacy and data sharing are often in tension. Many organizations use synthetic data to reduce privacy risk and still share useful data. For tabular data, auditing privacy remains hard. In many cases, even humans cannot easily tell if a table is real or synthetic. In this paper, we propose a method based on LLM discrimination. We ask an LLM to classify each table sample as REAL or SYNTHETIC. We test two settings: C1 with table only, and C2 with table plus distributional metadata. We use LLaMA as an open model and Gemini as a reference model. In our experiments, we run three synthesis models, CTGAN, TVAE, and Gaussian Copula, on two public datasets, UCI Adult and ACS Census. We collect 451 valid trials. Our results show clear differences between models. On Adult, LLaMA reaches DRS=0% in reported cells, while Gemini reaches DRS=100% for CTGAN and TVAE. On Census, LLaMA predicts SYNTHETIC for most samples, while Gemini stays high in C1 but drops for CTGAN and TVAE in C2. We also compare with a classifier two-sample test (C2ST) and record linkage as distributional baselines, and with a human pilot of 2 annotators and 240 trials. Our results show that LLM discrimination is a practical privacy audit signal when model choice, per provider reporting, and data encoding are handled with care. For reproducibility, code and experiment scripts are available at https://github.com/SlokomManel/LLM-as-a-Discriminator.

2606.09862 2026-06-10 cs.LG cs.AI 新提交

Blurry Window Attention

模糊窗口注意力

Axel Laborieux, Christos Sourmpis, Juan Gabriel Kostelec, Qinghai Guo

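AI summary: Proposes Blurry Window Attention (BLA), a bounded-memory control method that reconstructs a blurry KV history via Dirichlet-kernel interpolation; on synthetic tasks it is 8x more state-efficient than sliding window attention, and its performance improves as the state grows.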

详情
AI中文摘要

Transformer语言模型中的Softmax注意力操作在序列长度上具有二次复杂度,且状态大小以KV缓存形式增长,这成为长上下文场景中的瓶颈。为克服此限制,引入了具有线性复杂度和有限状态大小的替代架构,如状态空间模型(SSM)、线性注意力(LA)和有界记忆控制注意力(ABC)。尽管线性模型在语言困惑度上与Transformer相当,但在需要检索或回忆特定信息的任务中仍落后。本文提出模糊窗口注意力(BLA),一种受SSM启发的新型ABC方法。BLA存储一个频率窗口,通过使用Dirichlet核进行插值从中重建模糊的KV历史。根据Dirichlet核的分辨率,BLA可理解为滑动窗口注意力(SWA)的泛化,或门控槽注意力(GSA)的特例,其中衰减因子由Dirichlet核实现。我们详细描述了BLA的理论和高效实现。在多查询关联回忆(MQAR)合成任务上,我们表明BLA的状态效率比SWA高8倍,且与流行的线性注意力模型竞争;在RegBench合成任务中,在我们测试的线性模型中,只有BLA和SWA随着状态大小增长而提升性能。

英文摘要

The Softmax Attention operation in Transformer language models has quadratic complexity in the sequence length and a growing state size in the form of the KV cache, which becomes a bottleneck in long-context scenarios. To overcome this limitation, alternative architectures with linear complexity and finite state size have been introduced, such as State-Space Models (SSMs), Linear Attention (LA), and Attention with Bounded-memory Control (ABC). Though linear models achieve language perplexity similar to that of Transformers, they still lag behind in tasks that require retrieval or recall of specific information. In this work, we introduce Blurry Window Attention (BLA), a novel ABC method inspired by SSMs. BLA stores a frequency window from which a blurry KV history is reconstructed via interpolation using Dirichlet kernels. Depending on the resolution of the Dirichlet kernels, BLA can be understood as a generalization of Sliding Window Attention (SWA) or as a special case of Gated Slot Attention (GSA) in which the decay factor is implemented with Dirichlet kernels. We describe in detail the theory and efficient implementation of BLA. On the Multi-Query Associative Recall (MQAR) synthetic task, we show that the state efficiency of BLA is 8$\times$ better than that of SWA and is competitive with popular linear attention models; on the RegBench synthetic task, BLA and SWA are the only linear models we tested whose performance improves as the state size grows.

2606.09861 2026-06-10 cs.LG cs.AI 新提交

Time Series as Language: A Universal Tokenizer for General-Purpose Time Series Foundation Models

时间序列作为语言:通用时间序列基础模型的通用分词器

Yunhao Zhang, Ruiying Qi, Jiale Zheng, Jianfeng Zhang, Lujia Pan, Junchi Yan

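AI summary: Proposes UniTok, a universal tokenizer that converts time series into discrete tokens, and UniTok-FM, a foundation model pretrained on them with NTP, supporting zero-shot forecasting, prompt-boosted forecasting, and few-shot generation and classification without task-specific modifications.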

详情
AI中文摘要

虽然下一个令牌预测(NTP)统一了LLM的预训练,但其对无界、连续时间序列(TS)的适应仍然是一个开放问题。为了弥合这一差距,我们引入了UniTok,一个将TS转化为离散令牌的通用分词器,以及UniTok-FM,一个在这些令牌上通过NTP预训练的基础模型。UniTok-FM是一个通用基础模型,支持零样本和提示增强的预测,以及通过无训练上下文推理进行的少样本生成和分类——这是先前工作未能实现的能力。在技术上,UniTok是一个向量量化自编码器,结合了前缀归一化以实现尺度稳定、渐进分辨率因果架构用于编码和解码,以及结构保持重建损失用于训练。UniTok-FM采用现成的LLM架构,无需针对TS的特定修改。它不是在孤立的TS上预训练,而是在由多个具有相似模式的序列形成的上下文窗口上执行NTP,旨在捕捉它们的共享动态。在预测、生成和分类上的实验表明,单个统一的UniTok-FM始终优于统计和监督基线,与任务特定的基础模型性能相当,并且独特地实现了跨任务的无训练上下文推理。

英文摘要

While Next-Token Prediction (NTP) has unified LLM pretraining, its adaptation to unbounded, continuous time series (TS) remains open. To bridge the gap, we introduce UniTok, a universal tokenizer that transforms TS into discrete tokens, and UniTok-FM, a foundation model pretrained via NTP on these tokens. UniTok-FM is a general-purpose foundation model that supports zero-shot and prompt-boosted forecasting, as well as few-shot generation and classification via training-free in-context inference--a capability not achieved by prior works. Technically, UniTok is a vector-quantized autoencoder incorporating prefix normalization for scale stabilization, a progressive-resolution causal architecture for encoding and decoding, and a structure-preserving reconstruction loss for training. UniTok-FM adopts an off-the-shelf LLM architecture without TS-specific modifications. Instead of pretraining on isolated TS, it performs NTP on context windows formed by multiple series with similar patterns, aiming to capture their shared dynamics. Experiments on forecasting, generation, and classification show that a single unified UniTok-FM consistently outperforms statistical and supervised baselines, achieves competitive performance with task-specific foundation models, and uniquely enables training-free in-context inference across tasks.

2606.09860 2026-06-10 cs.LG cs.AI stat.AP stat.ML 新提交

Conformal Risk Prediction for Non-Alcoholic Fatty Liver Disease Using Gradient Boosting with Distribution-Free Coverages

基于梯度提升与无分布覆盖的非酒精性脂肪肝病共形风险预测

Xinze Zhang

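AI summary: Proposes Method, a machine-learning framework combining gradient-boosted decision trees with conformal prediction, giving distribution-free calibrated coverage for individual NAFLD risk prediction; on a multicenter Chinese cohort it reaches an AUROC of 0.912, outperforming several alternatives.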

详情
AI中文摘要

非酒精性脂肪肝病(NAFLD)影响全球约25%的成年人,带来显著的肝脏和心血管风险。然而,人群层面的筛查工具仍不充分。我们提出Method,一种用于NAFLD风险预测的机器学习框架,将梯度提升决策树与共形预测相结合,以在个体风险估计上产生校准的、无分布的覆盖保证。它集成了基于互信息的稳定性选择过程,通过自助重采样识别紧凑、临床可解释的特征子集,构建预测集,其边际覆盖可证明超过用户指定的置信水平。我们在中国广州的多中心队列(主要n=2,187;外部验证n=412)上评估了Method,使用了涵盖人口统计学、代谢生物标志物和生活方式因素的78个候选特征。Method内部AUROC为0.912,外部为0.891,优于深度神经网络、TabNet、支持向量机和逻辑回归。共形预测集在90%名义水平下达到91.3%的经验覆盖。从这些分数得出的三层风险分层将人群分为不同组别,高风险亚组的12个月进展率是低风险组的4.7倍。选定的特征——特别是腰围、ALT、GGT、甘油三酯、空腹血糖和BMI——与已建立的代谢风险因素一致,提供了生物学合理性。

英文摘要

Non-alcoholic fatty liver disease (NAFLD) affects roughly 25% of global adults, posing substantial hepatic and cardiovascular risks. Yet, population-level screening tools remain inadequate. We present Method, a machine-learning framework for NAFLD risk prediction coupling gradient-boosted decision trees with conformal prediction to yield calibrated, distribution-free coverage guarantees on individual risk estimates. It integrates a mutual-information-based stability selection procedure to identify a compact, clinically interpretable feature subset via bootstrap resampling, constructing prediction sets whose marginal coverage provably exceeds a user-specified confidence level. We evaluated Method on a multicenter cohort from Guangzhou, China (primary n=2,187; external validation n=412) using 78 candidate features across demographics, metabolic biomarkers, and lifestyle factors. Method achieves an AUROC of 0.912 internally and 0.891 externally, outperforming deep neural networks, TabNet, support vector machines, and logistic regression. Conformal prediction sets achieve 91.3% empirical coverage at the 90% nominal level. A three-tier risk stratification derived from these scores separates the population into distinct groups, with the high-risk subgroup showing a 12-month progression rate 4.7 times that of the low-risk tier. The selected features -- notably waist circumference, ALT, GGT, triglycerides, fasting glucose, and BMI -- align with established metabolic risk factors, providing biological plausibility.
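
The coverage guarantee described above is the kind provided by split conformal prediction. The sketch below shows the standard split conformal recipe for classification; it is not taken from the paper, and the calibration probabilities, the choice of nonconformity score (one minus the probability of the true class), and `alpha=0.25` are assumptions for illustration.

```python
import math


def conformal_threshold(cal_true_probs, alpha):
    """Split conformal quantile of the scores 1 - p(true class).

    cal_true_probs holds, for each held-out calibration case, the model's
    predicted probability of the class that turned out to be correct.
    """
    scores = sorted(1.0 - p for p in cal_true_probs)
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if k > n:
        return float("inf")  # too few calibration points: keep every label
    return scores[k - 1]


def prediction_set(probs, qhat):
    """All labels whose nonconformity score is within the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= qhat]


cal_true_probs = [0.9, 0.8, 0.95, 0.7, 0.6, 0.85, 0.55, 0.4, 0.75, 0.3]
qhat = conformal_threshold(cal_true_probs, alpha=0.25)

confident_case = prediction_set([0.7, 0.3], qhat)  # one label kept
ambiguous_case = prediction_set([0.5, 0.5], qhat)  # both labels kept
```

Under exchangeability of calibration and test cases, sets built this way contain the true label with probability at least $1 - \alpha$, regardless of the underlying model; an ambiguous case simply yields a larger set.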

2606.09856 2026-06-10 cs.CL cs.AI cs.LG stat.ML 新提交

Using Probabilistic Programs to Train Inductive Reasoning in Large Language Models

使用概率程序训练大型语言模型的归纳推理

Liyi Zhang, Akshay K. Jagadish, Brenden M. Lake, Thomas L. Griffiths

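AI summary: Proposes Program-based Posterior Training (PPT), which uses an LLM to generate scenarios as probabilistic programs, runs inference to produce distributional targets, and fine-tunes models to improve inductive reasoning accuracy, agreement with human judgments, and calibration.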

Comments 20 pages, 5 figures

详情
AI中文摘要

大型语言模型(LLM)的后训练推理通常专注于数学和编码等演绎任务,其中正确性可验证。然而,许多现实世界的推理问题是归纳性的:智能体必须从稀疏、模糊的观测中推断不确定的信念。使用标准微调方法进行归纳推理面临挑战,包括难以策划大规模、高质量标注数据集以及处理本质上是分布式的目标。在这项工作中,我们引入了一种称为基于程序的后验训练(PPT)的新方法来解决这些局限性:我们使用LLM生成多样化的开放世界场景作为概率程序,运行概率推理以产生查询的分布式目标响应,然后在这些概率软标签上进行微调。使用这种方法,我们在10,000个程序生成的场景上微调LLM,并在保留的模板、人工标注的判断和外部基准上进行评估。总体而言,PPT显著提高了保留归纳任务的估计准确性,增强了与人类判断的一致性,并迁移到估计和校准的外部基准。此外,原始校准的增益并未被事后温度缩放所涵盖,表明与输出重新缩放相比,模型更深入地内化了不确定性。这些结果表明,概率程序介导的微调是一种有前景的方法,用于后训练LLM以可靠地执行近似归纳推理。

英文摘要

Post-training Large Language Models (LLMs) for reasoning typically focuses on deductive tasks such as mathematics and coding where correctness is verifiable. Yet, many real-world reasoning problems are inductive: agents must infer uncertain beliefs from sparse, ambiguous observations. There are challenges to using standard fine-tuning methods for inductive reasoning, including difficulties in curating large-scale, high-quality labeled datasets and in handling targets that are inherently distributional. In this work, we introduce a novel approach, called Program-based Posterior Training (PPT), to address these limitations: we use an LLM to generate diverse open-world scenarios as probabilistic programs, run probabilistic inference to produce distributional target responses to queries, and then fine-tune on these probabilistic soft labels. Using this approach, we fine-tune LLMs on 10,000 programmatically generated scenarios and evaluate on held-out motifs, human-labeled judgments, and external benchmarks. Overall, PPT substantially improves estimation accuracy on held-out inductive tasks, increases alignment with human judgments, and transfers to external benchmarks for estimation and calibration. Additionally, the gains in raw calibration are not subsumed by post-hoc temperature scaling, showing that the models have more deeply internalized uncertainty compared to output rescaling. Together, these results suggest that probabilistic-program-mediated fine-tuning is a promising approach for post-training LLMs to reliably perform approximate inductive inference.
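
Fine-tuning on probabilistic soft labels typically means minimizing the cross-entropy between a target distribution and the model's predicted distribution. The sketch below shows that loss in plain Python; whether PPT uses exactly this objective is an assumption, and the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_cross_entropy(logits, target_probs):
    """Cross-entropy between a soft target distribution and the model output."""
    probs = softmax(logits)
    return -sum(q * math.log(p) for q, p in zip(target_probs, probs) if q > 0)


# Target posterior says the two answers are equally likely.
target = [0.5, 0.5]
uniform_loss = soft_label_cross_entropy([0.0, 0.0], target)    # model agrees
confident_loss = soft_label_cross_entropy([4.0, 0.0], target)  # model overconfident
```

With a soft target, an overconfident prediction is penalized even when its top answer is plausible, which is how such an objective can push a model toward calibrated uncertainty.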

2606.09854 2026-06-10 cs.CL cs.AI cs.CY cs.LG 新提交

Can Multi-Agent LLMs Identify Their Peers? Stylometric Fingerprinting in Role-Constrained Political Analysis

多智能体大语言模型能否识别其同类?角色约束政治分析中的笔迹风格指纹识别

Juergen Dietrich

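AI summary: Studies whether multi-agent LLMs in political analysis can identify model families from stylometric cues, introduces the SD-CV protocol, and reports that a T5 model reaches Macro F1 = 0.991 on a five-class attribution task, showing that prompt-level anonymization does not remove model identity signals.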

Comments 24 pages, 3 figures

详情
AI中文摘要

用于政治声明分析的多智能体大语言模型(LLM)管道容易受到同伴保护偏见的影响:模型倾向于保护同伴模型免于停用,并表现出依赖身份的评分扭曲。提示级匿名化被提出作为缓解措施,但先前的工作同时记录了在角色约束输出中笔迹风格指纹在匿名化后仍然存在——这引发了该缓解措施是否足够的问题。本文首次系统研究LLM是否能在匿名化条件下识别政治分析文本背后的模型家族。我们评估了三种分类器方法——LLM零样本和少样本(Claude Sonnet 4.6和Llama-3.3-70B)以及微调的T5-base模型——在一个涵盖四个商业LLM家族和一个开放世界“未知”类的五类归属任务上。我们引入了一种声明不相交的交叉验证协议(SD-CV;定义见第3.5节),该协议保证训练和验证数据之间没有内容重叠,并将其与运行不相交的基线(RD-CV)进行对比。T5在SD-CV下达到Macro F1 = 0.991(±0.008),在24个完全保留的声明上F1 = 0.978——尽管与RD-CV相比,训练-测试内容距离增加了2.1倍(0.767 vs. 0.366,p<0.001),但仍表现出稳健性,证明了真正的笔迹风格泛化能力。一项分数SD-CV分析确定了训练数据40%(约440篇文本)处的性能拐点。我们的研究结果证实,仅靠提示级匿名化无法消除模型身份信号,这对欧盟AI法案合规性(第13、14、26条)以及质量关键型多智能体部署中的计算机系统验证(CSV)具有直接影响。

英文摘要

Multi-agent large language model (LLM) pipelines for political statement analysis are vulnerable to peer-preservation bias: models tend to protect peer models from deactivation and show identity-dependent scoring distortions. Prompt-level anonymization was proposed as a mitigation, but prior work simultaneously documented that stylometric fingerprints survive anonymization in role-constrained outputs, raising the question of whether this mitigation is sufficient. This paper provides the first systematic investigation of whether LLMs can identify the model family behind political analysis texts under anonymization conditions. We evaluate three classifier approaches (LLM zero-shot and few-shot, using Claude Sonnet 4.6 and Llama-3.3-70B, and a fine-tuned T5-base model) on a five-class attribution task covering four commercial LLM families and an open-world 'unknown' class. We introduce a statement-disjoint cross-validation protocol (SD-CV; defined in Section 3.5) that guarantees no content overlap between training and validation data, and contrast it with a run-disjoint baseline (RD-CV). T5 achieves Macro F1 = 0.991 ($\pm$0.008) under SD-CV and F1 = 0.978 on 24 completely held-out statements. This performance is robust despite a 2.1x increase in train-test content distance versus RD-CV (0.767 vs. 0.366, p<0.001), demonstrating genuine stylometric generalization. A fractional SD-CV analysis identifies a performance knee at 40% of training data (~440 texts). Our findings confirm that prompt-level anonymization alone cannot neutralize model identity signals, with direct implications for EU AI Act compliance (Articles 13, 14, 26) and for computer system validation (CSV) in quality-critical multi-agent deployments.
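
A statement-disjoint split can be obtained by assigning folds at the level of source statements rather than individual texts. The paper defines SD-CV in its Section 3.5, which is not reproduced here, so the round-robin fold assignment and the function name below are assumptions that only illustrate the no-overlap property.

```python
def statement_disjoint_folds(statement_ids, n_folds):
    """Yield (train, validation) index lists with no statement in both.

    statement_ids[i] names the political statement that text i analyzes.
    Every text about a given statement lands in the same fold.
    """
    unique = sorted(set(statement_ids))
    fold_of = {s: i % n_folds for i, s in enumerate(unique)}
    folds = []
    for f in range(n_folds):
        val = [i for i, s in enumerate(statement_ids) if fold_of[s] == f]
        train = [i for i, s in enumerate(statement_ids) if fold_of[s] != f]
        folds.append((train, val))
    return folds


# Six texts written about four statements by different models.
ids = ["s1", "s1", "s2", "s3", "s2", "s4"]
folds = statement_disjoint_folds(ids, n_folds=2)
```

Because content never crosses the train/validation boundary, a classifier that scores well under such a split must be relying on style rather than memorized statement content.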

2606.09850 2026-06-10 cs.LG cs.CL 新提交

Mechanistic Analysis of Alignment Algorithms in Language Models

语言模型中对齐算法的机制分析

Aarush Sinha, Ishan Garg, Veeraraju Elluru, Arth Singh, Kushal Garg

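AI summary: Using layer-wise linear probes, sparse autoencoders, and crosscoders, systematically analyzes the internal mechanisms of six preference-optimization methods in language models, finding that different objectives induce different transformations of representation geometry and that behavioral alignment does not imply uniform internal restructuring.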

Comments Work in Progress

详情
AI中文摘要

后训练对齐算法主要作为黑箱进行评估,掩盖了它们如何重塑语言模型的内部计算。我们对三种开源模型家族的六种偏好优化方法(PPO、DPO、SimPO、ORPO、GRPO 和 KTO)进行了系统的机制分析。通过集成层间线性探针、稀疏自编码器和交叉编码器,我们定位了偏好表示并量化了对齐引起的潜在空间几何变换。我们发现偏好信号一致地集中在早期-中期或中期-后期层,但不同的目标函数导致定性的不同表示偏移。KTO 和 GRPO 通过建设性的特征共享和稀疏高显著性招募增强了线性可分离性。相反,DPO 和 ORPO 通过非建设性的几何旋转和特征衰减降低了可分离性,而 PPO 和 SimPO 基本保持了基线几何。这些变换表现出架构依赖的变异性,表明行为对齐并不意味着统一的内部重构。我们的发现将对齐确立为一种异质性干预,激励了安全性和可解释性的标准化特征级审计,并强调了需要机制感知的优化目标。

英文摘要

Post-training alignment algorithms are predominantly evaluated as black boxes, obscuring how they reshape language models' internal computations. We present a systematic mechanistic analysis of six preference-optimization methods (PPO, DPO, SimPO, ORPO, GRPO, and KTO) across three open-weight model families. By integrating layer-wise linear probing, Sparse Autoencoders, and crosscoders, we localize preference representations and quantify alignment-induced geometric transformations in latent space. We find that preference signals consistently concentrate in early--mid or mid--late layers, but different objectives induce qualitatively distinct representational shifts. KTO and GRPO enhance linear separability through constructive feature sharing and sparse, high-salience recruitment. In contrast, DPO and ORPO degrade separability via non-constructive geometric rotation and feature attenuation, while PPO and SimPO largely preserve baseline geometry. These transformations exhibit architecture-dependent variability, demonstrating that behavioral alignment does not imply uniform internal restructuring. Our findings establish alignment as a heterogeneous intervention, motivate standardized feature-level auditing for safety and interpretability, and highlight the need for mechanism-aware optimization objectives.

2606.11138 2026-06-10 cs.LG cs.NA math.NA 新提交

First-Order Trajectory Matching: Fast Ensemble Predictions of Chaotic, Turbulent, Stochastic Systems

一阶轨迹匹配:混沌、湍流、随机系统的快速集成预测

Shreya Jha, Timo Schorlepp, Nicholas Geissler, Jules Berman, Benjamin Peherstorfer

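AI summary: Proposes First-Order Trajectory Matching (FTM), which learns the first-order local transport of probability mass from trajectories of stochastic systems, enabling low-cost ensemble predictions that also capture trajectory quantities such as fluxes and circulations.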

详情
AI中文摘要

我们引入一阶轨迹匹配(FTM),这是一种替代建模方法,从随机系统的轨迹中学习概率质量的一阶局部输运。通过匹配轨迹的对称一阶运动,FTM学习概率流速度,其流动保持时间边缘以匹配集成平均值,同时捕获类似流的轨迹量,如通量、环流和跨势垒电流。FTM直接从轨迹学习流速度,避免了漂移、扩散和分数估计。我们的稳定性分析将离散化误差与采样方差分开,并表明当时间分辨率和样本量适当平衡时,单步无模拟的FTM损失是稳定的。在随机动力系统和PDE示例中,我们经验证明FTM以低确定性展开成本提供轨迹感知的集成预测。

英文摘要

We introduce First-Order Trajectory Matching (FTM), a surrogate-modeling method that learns the first-order local transport of probability mass from trajectories of stochastic systems. By matching the symmetric first-order motion of trajectories, FTM learns the probability current velocity, whose flow preserves time marginals to match ensemble averages, while also capturing current-like trajectory quantities such as fluxes, circulations, and barrier-crossing currents. FTM learns the current velocity directly from trajectories, avoiding drift, diffusion, and score estimation. Our stability analysis separates discretization error from sampling variance and shows that the one-step simulation-free FTM loss is stable when temporal resolution and sample size are properly balanced. Across stochastic dynamical systems and PDE examples, we empirically demonstrate that FTM provides trajectory-aware ensemble predictions at low, deterministic-rollout cost.