arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.31371 2026-06-01 cs.LG

Softsign: Smooth Sign in Your Optimizer For Better Parameter Heterogeneity Handling

Dmitrii Feoktistov, Timofey Belinsky, Andrey Veprikov, Amir Zainullin, Aleksandr Beznosikov

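AI summary: Proposes the SoftSignum and SoftMuon optimizers, which replace the hard sign map with a temperature-controlled soft-sign transformation and pair it with an adaptive quantile-based temperature schedule to address the parameter-heterogeneity and terminal-convergence problems of sign-based optimizers. Convergence is proved in the stochastic non-convex setting, and experiments on a range of deep learning tasks (including LLM pretraining) show improvements over hard sign-based optimizers and AdamW.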

Comments: 9 pages, 3 tables, 4 figures

Abstract

Sign-based and LMO-inspired optimizers have recently attracted substantial attention in deep learning due to their strong performance and low memory footprint. However, their fixed-magnitude updates can hurt terminal convergence: they decouple update mechanisms from gradient magnitudes and fail to account for parameter heterogeneity, often leading to oscillation rather than convergence. We propose SoftSignum, a smooth relaxation of sign-based optimization that replaces the hard sign map with a temperature-controlled soft-sign transformation, enabling a parameter-wise transition from sign-like updates to magnitude-sensitive SGD-like steps. We complement it with an adaptive quantile-based temperature schedule and extend the same principle to matrix-valued optimizers, obtaining SoftMuon. We also develop a generalized geometry-relaxation framework based on strongly convex regularizers and Fenchel conjugates, proving convergence in the stochastic non-convex setting. Experiments on diverse deep learning tasks, including LLM pretraining, show that SoftSignum and SoftMuon consistently improve over their hard sign-based counterparts and standard AdamW.
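
The transition from sign-like to SGD-like updates described in this abstract can be illustrated with a small sketch. The abstract does not give the exact soft-sign formula or temperature schedule, so the form `g / (|g| + tau)` and the quantile rule below are assumptions chosen for illustration, and the names `soft_sign` and `quantile_temperature` are hypothetical rather than taken from the paper.

```python
import numpy as np


def soft_sign(g, tau):
    """Assumed soft-sign form: g / (|g| + tau).

    For |g| >> tau this approaches sign(g); for |g| << tau it
    approaches g / tau, i.e. a rescaled SGD-like step.
    """
    return g / (np.abs(g) + tau)


def quantile_temperature(g, q=0.5):
    """Hypothetical schedule: use the q-quantile of |g| as the temperature."""
    return float(np.quantile(np.abs(g), q))


g = np.array([-2.0, -0.01, 0.0, 0.02, 5.0])

# Tiny temperature: the update is close to the hard sign map.
print(soft_sign(g, tau=1e-8))

# Large temperature: the update is close to g / tau (magnitude-sensitive).
print(soft_sign(g, tau=1e4) * 1e4)

# Data-dependent temperature taken from the gradient magnitudes.
tau = quantile_temperature(g, q=0.5)
print(tau, soft_sign(g, tau))
```

With a per-tensor quantile temperature, entries whose magnitude is well above the quantile receive near-unit steps while small entries are damped in proportion to their size, which is one way to read the "parameter-wise transition" in the abstract.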

2605.31370 2026-06-01 cs.AI

HypoAgent: An Agentic Framework for Interactive Abductive Hypothesis Generation over Knowledge Graphs

Yisen Gao, Yixi Cai, Tianshi Zheng, Jiaxin Bai, Yangqiu Song

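AI summary: Proposes HypoAgent, a framework that uses three agents (intent recognition, hypothesis generation, and root cause analysis) for interactive abductive hypothesis generation over knowledge graphs, achieving state-of-the-art semantic similarity on commonsense and biomedical knowledge graphs.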

Comments: Under review

Abstract

Abductive reasoning over knowledge graphs aims to generate logical hypotheses that explain observed entities or facts. Existing controllable hypothesis generation methods allow users to guide this process with explicit conditions, but they remain limited in interactive settings: they struggle to ground evolving natural-language intents across multi-turn dialogues and provide little fine-grained diagnosis when generated hypotheses fail. To address these limitations, we propose HypoAgent, an Agentic framework for interactive abductive Hypothesis Generation over knowledge graphs. HypoAgent integrates three agents: an Intent Recognition Agent that grounds user utterances and dialogue history into executable KG conditions, a Hypothesis Generation Agent that performs controllable hypothesis generation according to the extracted user intention, and a Root Cause Analysis Agent that diagnoses unreliable hypothesis fragments and leverages KG neighborhood probing to identify supported refinements. Experiments on commonsense and biomedical domain-specific knowledge graphs demonstrate that HypoAgent achieves state-of-the-art semantic similarity under single-turn, multi-turn, and unconditional settings. Our code is available at https://github.com/HKUST-KnowComp/HypoAgent.

2605.31369 2026-06-01 cs.LG cs.CV

A Unifying View of Variational Generative Wasserstein Flows

Paul Caucheteux, Clément Bonet, Anna Korba

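AI summary: Presents a unified theoretical framework, Generative Wasserstein Flows (GWF), that treats many existing generative models as instances of parametric JKO schemes for f-divergence objectives, extends it to Integral Probability Metrics and Maximum Mean Discrepancy, derives new algorithms, and clarifies the connection to GANs.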

Comments: Accepted as a spotlight at ICML 2026

Abstract

Many modern generative models can be viewed as minimizing divergences between probability distributions, yet they rely on different algorithmic and geometric principles. Wasserstein gradient flows provide a continuous-time formulation for optimizing over distributions, and can be approximated through their implicit discretization via the Jordan-Kinderlehrer-Otto (JKO) scheme. In this work, we present a unified theoretical framework for generative modeling based on Wasserstein gradient flows, which we refer to as Generative Wasserstein Flows (GWF). We show that a broad class of existing methods can be derived as instances of parametric JKO schemes for $f$-divergence objectives, and we establish equivalences between several recently proposed algorithms. We extend this framework beyond $f$-divergences to Integral Probability Metrics and squared Maximum Mean Discrepancy, deriving new JKO-based generative algorithms and clarifying their connections with GANs. We empirically study the impact of the JKO regularization for a wide set of objectives. Finally, we analyze parametric Wasserstein flows, where the dynamics are restricted to distributions induced by parametrized maps.
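
For readers unfamiliar with the JKO scheme mentioned in this abstract, its standard textbook form is written out below. The symbols are assumptions introduced here, not notation from the paper: $\mathcal{F}$ is the objective functional over distributions (for example a divergence to the data distribution), $\tau > 0$ is the step size, and $W_2$ is the 2-Wasserstein distance.

```latex
\mu_{k+1} \in \operatorname*{arg\,min}_{\mu \in \mathcal{P}_2(\mathbb{R}^d)}
  \left\{ \mathcal{F}(\mu) + \frac{1}{2\tau}\, W_2^2(\mu, \mu_k) \right\}
```

Each step trades off decreasing the objective against staying close, in Wasserstein distance, to the previous iterate; as $\tau \to 0$ the iterates approximate the Wasserstein gradient flow of $\mathcal{F}$.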

2605.31367 2026-06-01 cs.LG cs.CL

Trading Complexity for Expressivity Through Structured Generalized Linear Token Mixing

Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen

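AI summary: Proposes a unified framework that decomposes token mixing layers into direct input-output influence and recurrent propagation, designs structured recurrence patterns that provably trade runtime complexity for expressivity, and validates them on synthetic tasks and language modeling.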

Comments: 20 pages, 3 figures, ICML 2026 main

Abstract

Token mixing layers play a key role in how language models can learn and generate long-range dependencies. Their efficiency relies on the necessary trade-off between decoding speed and the memory requirements, along with the cache size. Considering causal generation, this paper explores new trade-offs thanks to a unified framework which separates two crucial features: (i) the direct influence of inputs on outputs in one generation step; (ii) the recurrent propagation of information through past outputs. This framework encompasses major architectures such as attention and state-space models, but also generalizes the recurrence equations by allowing each state to depend on multiple past states rather than only the immediate predecessor. By introducing structure, we design new recurrence patterns that provably achieve the desired complexity, while providing theoretical insights on their expressivity -- trading runtime for expressivity in a principled way. Empirical validation is performed on synthetic tasks, along with language modeling. Together, these results provide a unified toolkit for the understanding and design of efficient and expressive token mixers across model families.

2605.31365 2026-06-01 cs.AI

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

Weile Chen, Bingchen Miao, Qifan Yu, Wendong Bu, Guoming Wang, Wenqiao Zhang, Shengyu Zhang, Juncheng Li, Siliang Tang

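AI summary: Proposes the SCALE framework, which uses three adversarial roles (Selector, Predictor, and Judger) to autonomously discover an agent's limitations and expand its cognitive boundaries through environmental exploration. Combined with the SCALE-Hop graph exploration strategy and the SCALE-20k dataset, it significantly improves the performance and generalization of multimodal large language models across web environments.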

Comments: 24 pages

Abstract

Recent advances in Multimodal Large Language Models (MLLMs) have led to promising progress in web agents. However, existing web agents often rely on handcrafted execution pipelines or expensive expert trajectories, limiting their adaptability to complex, dynamic environments. To address these challenges, we propose SCALE (Self-Cognitive-Aware Learning and Exploration), which leverages three adversarial roles (Selector, Predictor, and Judger) to autonomously discover the agent's limitations and expand its cognitive boundaries through environmental exploration. Moreover, we propose SCALE-Hop, a graph exploration strategy that facilitates global planning and helps agents avoid local exploration traps. To further support learning, we construct SCALE-20k, a large-scale dataset collected from 19 real-world websites, containing diverse task types and structured demonstrations generated from SCALE's exploration traces. Experimental results show that our approach significantly improves the performance and generalization of multiple MLLMs in various web environments. Our framework offers a scalable and generalizable solution for building truly autonomous and adaptive web agents.

2605.31363 2026-06-01 cs.CL

The Latin Substrate: How Language Models Represent and Mediate Script Choice

Daniil Gurgurov, Alan Saji, Katharina Trinley, Josef van Genabith, Simon Ostermann

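AI summary: Using logit-lens, representational, and mechanistic analyses, finds consistent latent romanization during transliteration and a small set of late-layer attention heads that causally mediate script choice. Non-Latin output is produced by a compact, identifiable gate, whereas Latin output emerges from diffuse contributions across the network, suggesting that models organize script variation around shared latent representations with a bias toward a Latin substrate.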

Comments: Preprint

Abstract

Many languages are written in multiple scripts, requiring large language models (LLMs) to generate equivalent linguistic content in distinct orthographic forms. While prior work suggests that LLMs route information through shared latent representations, how they internally mediate script variation remains poorly understood. We study this question by first examining per-layer output distributions with the logit lens, which reveals consistent latent romanization during transliteration, and then through representational and mechanistic analyses of script generation. At the representational level, we show that scripts of the same language become increasingly separable across layers and that a simple linear steering direction can flip a model's output script while largely maintaining semantic content. The vector generalizes asymmetrically to writing systems unseen during construction, flipping non-Latin output to Latin reliably, but mapping Latin output into varied non-Latin scripts. At the mechanistic level, we localize a small set of late-layer attention heads that causally mediate script choice. These heads transfer across unrelated languages and writing systems, suggesting that script routing is implemented by language-agnostic components. Across both analyses, we observe a consistent directional asymmetry: non-Latin output is produced by a compact, identifiable gate, while Latin-script output emerges from diffuse contributions across the network. Collectively, our findings hint that LLMs organize script variation around shared latent representations while exhibiting a privileged substrate toward Latin script.

2605.31361 2026-06-01 cs.MA cs.AI cs.LG

Dreaming Of Others: Latent Teammate Modeling In World Models For Multi-Agent Reinforcement Learning

Tomas Leroy-Stone

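AI summary: Proposes modeling teammates as learnable components of the world model, factorizing the latent state and adding a Theory-of-Mind head that infers teammate behavior, to enable zero-shot and few-shot coordination.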

Comments: 5 pages, 2 figures. Accepted as a poster at the 2026 World Modeling Workshop. Conceptual workshop paper

Abstract

In cooperative multi-agent reinforcement learning (MARL), agents must coordinate with partners whose internal policies and intentions are not directly observable. While world models such as Dreamer have demonstrated strong generalization and sample efficiency in single-agent settings, their application to MARL remains limited by an inability to handle teammate-induced uncertainty. We propose a new perspective: treat teammates as structured, learnable components within the agent's world model. We introduce an architecture that factorizes the latent state of a Dreamer-style recurrent state-space model (RSSM) into environment and teammate components, and learns an auxiliary Theory-of-Mind (ToM) head to infer latent embeddings of partner behavior such as character, intent, and predicted actions from partial trajectories. These teammate latents condition the actor and critic, enabling the agent to imagine and adapt to diverse collaborators. We outline how this approach can support zero-shot and few-shot coordination in partially observable settings and propose a set of benchmarks and evaluation protocols to assess its impact. This work positions world models as not only predictors of environmental dynamics, but as simulators of social behavior, opening new directions for generalizable, human-compatible AI.

2605.31360 2026-06-01 cs.LG cs.AI

dashi: A Python library for Dataset Shift Characterization to Support Trustworthy AI Development and Deployment

David Fernández-Narro, Pablo Ferri, Ángel Sánchez-García, Juan M. García-Gómez, Carlos Sáez

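AI summary: Introduces dashi, an open-source Python library that explores, quantifies, and characterizes dataset shifts through an unsupervised approach (based on information geometry and non-parametric statistical manifolds) and a supervised approach, supporting trustworthiness assessment across the AI life cycle.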

Abstract

The Artificial Intelligence (AI) life cycle requires a thorough understanding of the underlying data dynamics for robust, safe and cost-effective AI development and use. Dataset shifts are defined as changes between train and test data distributions. Whether occurring over time (temporal) or across different sites (multi-source), they can severely degrade model performance and compromise data quality. This is particularly important in health AI, where the safety and fundamental rights of patients can be severely affected by uncontrolled shifts both at training and operational stages. While the theoretical foundations of covariate, prior, and concept shifts are well established, there is a lack of accessible and comprehensive software tools to perform their analysis. We introduce dashi, an open-source Python library designed for the exploration, quantification, and characterization of dataset shifts. dashi provides a dual approach: an unsupervised approach that leverages information geometry and non-parametric statistical manifolds for data variability characterization and analysis (e.g., Information Geometric Temporal plots and Multi-Source Variability metrics like Global Probabilistic Deviation and Source Probabilistic Outlyingness), and a supervised approach that quantifies and characterizes model performance degradation. Both unsupervised and supervised approaches work across user-defined temporal and domain/source batches. We demonstrate the utility of dashi on three simulated and real-world health AI case studies on gestational diabetes mellitus, COVID-19 and emergency medical dispatch. By providing interactive visual analytics and variability metrics, dashi supports the trustworthiness of AI life cycle stages, enabling robust and safe machine learning pipelines through the assessment of data coherence and AI performance.

2605.31354 2026-06-01 cs.AI cs.LG

Diagnosing Failure Modes of Shared-State Collaboration in Resource-Constrained Visual Agents

Yunpeng Zhou

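AI summary: Studies failure modes of collaborative reasoning with weak learners (4B-8B models) under shared working memory through the lens of noise accumulation. Introduces the CoSee auditing framework to trace information flow in document visual question answering, finds that naive shared workspaces amplify hallucinations rather than resolve them, and identifies two dominant failure modes: Noise Reinforcement and Policy Collapse.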

Abstract

Modular visual reasoning systems increasingly rely on shared working memory for multi-step collaboration, yet the failure dynamics of intermediate state evolution in low-capacity regimes remain underexplored. We study failure modes of collaborative reasoning with weak learners (4B--8B models) through the lens of noise accumulation. We introduce CoSee, an auditing framework that formalizes the read-write-verify loop to trace information flow in document visual question answering. Across multi-page, chart, and web-based benchmarks, we find a counter-intuitive degradation: naive shared workspaces often amplify hallucinations rather than resolve them. We identify two dominant failure modes: Noise Reinforcement, where ungrounded notes are reused as evidence, and Policy Collapse, where added context shifts the model toward under-specified, short-form answers. Using cost-accuracy Pareto frontiers, we show that increased compute can correlate negatively with performance without explicit verification. Our findings suggest that for resource-constrained agents, the bottleneck lies not in reasoning depth but in communication fidelity, providing trace-level diagnostics and a mechanistic baseline for reliable modular design.

2605.31352 2026-06-01 cs.RO

Haptic Sorter: A Unified Planning Framework for Online Shape Estimation and Real-Time Pose Inference

Zhuoyi Lu, Lin Yang, Sri Harsha Turlapati, Domenico Campolo

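AI summary: Proposes a unified model-based geometric framework for shape estimation and pose tracking in robotic manipulation, combining Bayesian-optimization-guided haptic exploration, superellipse shape approximation, an adaptive manipulation-potential formulation, and an online ODE for real-time pose inference.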

Abstract

Robotic manipulation usually assumes that the shape and pose of the object are known to the robot prior to motion planning. However, precise geometric information is not always available in practice, and pose inference suffers from sensor uncertainties and view occlusion. In this work, we propose a unified model-based geometric framework integrating robotic haptic perception, modeling, and manipulation planning. Our novelties involve: (i) introducing Bayesian Optimization (BO) to guide the haptic exploration for object shape inference, where superellipses are used to approximate the geometric boundary; (ii) an adaptive formulation of the manipulation potential encoding object geometry for quasi-static robot-object interaction; (iii) proposing an online Ordinary Differential Equation (ODE) for real-time pose inference based on model prediction and tactile feedback. We deploy our system on a 2D robotic sorting task, and vary object geometries to validate the robustness and generalizability of our framework in both simulation and a real-world multi-arm setup.

2605.31351 2026-06-01 cs.CL cs.CV

A Visually Impaired Assistance Benchmark for VLM-as-a-Judge Evaluation

Yi Zhao, Siqi Wang, Zhe Hu, Yushi Li, Jing Li

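AI summary: To examine the reliability of VLM-as-a-Judge evaluation for visually impaired assistance, introduces the VIABLE benchmark (over 300K samples, an Effectiveness-Impartiality-Stability framework, and a 12-mode failure taxonomy), finds existing models unreliable, and develops VIA-Judge-Agent to improve diagnostic accuracy and user preference.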

Abstract

AI-based Visually Impaired Assistance (VIA) remains challenging, largely due to the high cost of human evaluation. The VLM-as-a-Judge paradigm may offer a promising alternative, although it has mostly been studied in general domains. We therefore ask whether such judges can be trusted for VIA tasks. To investigate this question, we introduce VIABLE (Visually Impaired Assistance Benchmark for VLM-as-a-Judge Evaluation), the first benchmark for VLM-as-a-Judge evaluation in VIA. VIABLE contains over 300K judgment samples across three scenarios and introduces an Effectiveness--Impartiality--Stability framework with a 12-mode failure taxonomy. Based on VIABLE, our systematic study of seven judges across different model scales shows that existing models are largely unreliable across all evaluation axes. The strongest judge, GPT-5.4, achieves only 52.6% single-failure diagnostic accuracy, yet exhibits the highest self-preference rate at 94.2%, while open-source judges are strongly biased and adversarially fragile. To address these issues, we propose VIA-Judge-Agent, a model-agnostic inference-time harness that augments judges with visual evidence extraction and a taxonomy-guided workflow. It improves diagnostic accuracy and yields downstream VIA responses that BLV users prefer. Data and code are available at: https://github.com/YiyiyiZhao/VIABLE

2605.31349 2026-06-01 cs.CL cs.AI cs.CV cs.MM

FBHM: Functional Benchmarking and Steering of VLMs for Hateful Meme Detection

Paramananda Bhaskar, Naquee Rizwan, Daksh Jogchand, Saurabh Kumar Pandey, Animesh Mukherjee

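AI summary: Because existing benchmarks cannot causally evaluate the vulnerabilities of vision-language models, introduces the FBHM benchmark, built on 25 rhetorical functionalities and 10 target communities, and uses learnable steering vectors (LSV) to raise model performance by about 30 Macro-F1 points with very little data.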

Abstract

Hateful meme detection remains a formidable challenge for vision-language models, as existing benchmarks are structurally observational, confounding rhetorical hate mechanisms with target community features and preventing causal evaluation of model vulnerabilities. To address this, we introduce FBHM, a systematically curated benchmark of Functionality Based Hateful Memes constructed along two orthogonal axes: 25 distinct rhetorical functionalities and 10 target communities (5,000 memes total). Benchmarking state-of-the-art VLMs reveals a severe generalization gap: models highly accurate on standard datasets catastrophically drop to near-random performance on FBHM, proving they exploit dataset-specific heuristics rather than robust multimodal reasoning. To efficiently close this gap, we propose LSV (learnable steering vectors), a strategy for the ultra-low-data regime that applies a causal intervention objective on as few as 500 steering samples (50 unique base memes), boosting FBHM performance by about 30 Macro-F1 points while outperforming in-context learning and PEFT without degrading source-domain performance.

2605.31346 2026-06-01 math.OC cs.LG

Wall-Clock Complexity for Zeroth-Order Optimization with Tunable Oracle Fidelity

Alexandra Suvorikova, Igor Pavlov, Artem Vasin, Georgii Bychkov, Anastasia Antsiferova, Darina Dvinskikh, Alexander Gasnikov

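AI summary: For zeroth-order optimization with tunable oracle fidelity, proposes a wall-clock complexity model and analyzes how parameter choices affect total time, showing that accelerated methods can be inferior to non-accelerated ones and characterizing when a constant-fidelity strategy is optimal.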

Abstract

Zeroth-order (black-box) optimization is applied when gradients are unavailable and objective evaluations rely on expensive simulations. In many such applications, the oracle fidelity is tunable: higher-accuracy queries reduce noise but incur higher computational costs. To capture this trade-off, we study an accuracy-aware wall-clock model where each query with fidelity $\delta$ has a cost $c(\delta)$, and we minimize the total time $T_{\mathrm{total}} = \sum_{k=1}^{N} c(\delta_k)$, subject to a target accuracy constraint. We show how the choice of oracle type, noise model, and optimization scheme induces explicit wall-clock-optimal choices for the algorithmic parameters. For instance, we demonstrate that accelerated methods can be wall-clock inferior to non-accelerated schemes. Furthermore, we characterize the conditions under which a constant fidelity strategy is optimal in the Big-O sense. Our framework provides a unified methodology to translate convergence guarantees into practical fidelity and batching recommendations.
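
The total-time objective in this abstract is a plain sum of per-query costs, which the sketch below makes concrete. The cost function `inverse_cost` (cost proportional to 1/delta, treating delta as an error tolerance so that smaller values are more accurate and more expensive) and both schedules are assumptions made for illustration; the abstract does not specify the form of c or any schedule.

```python
def total_time(deltas, cost):
    """Wall-clock total T_total = sum_k c(delta_k) over a fidelity schedule."""
    return sum(cost(d) for d in deltas)


def inverse_cost(delta):
    """Assumed cost model: a query with tolerance delta costs 1 / delta."""
    return 1.0 / delta


N = 10

# Constant-fidelity schedule: the same tolerance at every iteration.
constant_schedule = [0.1] * N

# Hypothetical tightening schedule: tolerance 1/k at iteration k.
tightening_schedule = [1.0 / k for k in range(1, N + 1)]

print(total_time(constant_schedule, inverse_cost))
print(total_time(tightening_schedule, inverse_cost))
```

The two totals are only comparable once each schedule is tied to the accuracy it reaches after N iterations, which is the constraint the paper's analysis adds on top of this bookkeeping.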

2605.31345 2026-06-01 stat.ML cs.LG stat.ME

Log-Ratio Propagation on the Simplex: A Theory of Cellwise Contamination for Compositional Data

Matthias Templ

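AI summary: Develops a theory of cellwise contamination on the simplex. Using multiplicative perturbation and a propagation theorem, it shows that contamination of a single component induces a rank-one shift of the log-ratio vector, and it reveals the failure of Euclidean cellwise methods on the simplex together with the resulting reduction in breakdown value.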

Comments: 50 pages, no figures; 11-page supplement included as an ancillary file. A companion methods paper (cellPcaCoDa: cellwise-robust PCA for compositional data) is forthcoming

Abstract

Compositional data must be analysed through log-ratios: scale invariance, the defining axiom of the field, leaves no alternative. The centred log-ratio divides by the geometric mean of every part, so a single contaminated component shifts every centred-log-ratio coordinate at once, displacing the log-ratio vector by a fixed amount that no choice of coordinates can reduce. We develop a theory of cellwise contamination on the simplex around this observation. A scale-invariant contamination model built from multiplicative perturbation combines with a propagation theorem showing that corruption of a single raw part induces a rank-one shift of the log-ratio vector, with direction determined by the contrast matrix. The resulting perturbation pattern is not equivalent to any independent cellwise contamination model in log-ratio coordinates -- so standard Euclidean cellwise methods applied to log-ratios are ill-posed under the simplex contamination mechanism. For estimators whose Euclidean cellwise breakdown is witnessed by a column-concentrated configuration -- a class including MCD, $S$-, $\tau$-, and coordinate-wise $M$-estimators of location and scatter -- the cellwise breakdown value on the simplex is reduced by the factor $(D-1)/D$ relative to its Euclidean counterpart, a reduction that is tight and arises purely from the normalisation mismatch between $nD$ raw cells and $n(D-1)$ ilr cells. The cellwise influence function for the variation matrix carries a diagnostic fingerprint: contamination of a single part inflates exactly one row and column, identifying the responsible component. These results form the theoretical foundation for cellwise-robust methods on the simplex; a companion paper develops a cellwise-robust PCA estimator that exploits the propagation geometry and demonstrates it on simulated and geochemical data.
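
The claim that one contaminated part moves every centred-log-ratio coordinate can be checked numerically. The sketch below is not the paper's code: the composition `x`, the contaminated index `j`, and the factor `a` are made-up values, and the closed form `log(a) * (e_j - 1/D)` is the shift that follows directly from the definition of the clr transform under a multiplicative perturbation of part `j`.

```python
import numpy as np


def clr(x):
    """Centred log-ratio: log of each part minus the mean log (log geometric mean)."""
    logx = np.log(x)
    return logx - logx.mean()


x = np.array([0.1, 0.2, 0.3, 0.4])   # illustrative composition, D = 4 parts
j, a = 2, 10.0                        # contaminate part j by multiplicative factor a

y = x.copy()
y[j] *= a
y /= y.sum()                          # re-close to the simplex; clr is scale invariant

shift = clr(y) - clr(x)

# Shift implied by the clr definition: log(a) * (e_j - 1/D).
D = len(x)
expected = np.log(a) * (np.eye(D)[j] - 1.0 / D)

print(shift)
print(expected)
```

Every coordinate of `shift` is nonzero: the contaminated part moves up by `log(a) * (D - 1) / D` and each clean part moves down by `log(a) / D`, so a cellwise method that assumes only one coordinate changed would be looking at the wrong pattern.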

2605.31343 2026-06-01 cs.RO

Learning Terrain-Aware Whole-Body Control for Perceptive Legged Loco-Manipulation

Sikai Guo, Yudong Zhong, Guoyang Zhao, Botao Dang, Zhihai Bi, Jun Ma

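AI summary: Proposes the TA-WBC framework, which achieves whole-body loco-manipulation control for legged manipulators on complex terrain through a hybrid exteroception encoder that extracts terrain features, an end-effector sampling method based on the foot contact plane, and a dual-policy distillation module.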

Abstract

Legged manipulators integrate exceptional terrain adaptability along with mobile manipulation capabilities, which make them highly promising for deployment in human-centric environments. By coordinating the control of both legs and arms, a whole-body controller can significantly expand the operational workspace of legged manipulators. However, many existing whole-body controllers primarily depend on proprioception and do not incorporate the critical exteroception required for effective terrain topology perception. This limitation can hinder their ability to adapt to varying environmental conditions and navigate complex terrains effectively. In this paper, we introduce TA-WBC, a terrain-aware whole-body control framework for legged manipulators, which features a novel RL-based unified policy tailored to whole-body loco-manipulation tasks in various terrains. Specifically, we employ a hybrid exteroception encoder to extract terrain features, providing an essential basis for the robot to proactively adapt posture and footholds. Furthermore, to facilitate stable cross-terrain loco-manipulation, we propose a novel end-effector sampling method based on the foot contact plane, decoupling manipulation target from base fluctuations. Moreover, a dual-policy distillation module is introduced to integrate expansive whole-body motion with terrain adaptability without catastrophic forgetting. The simulation and real-world experiments validate the robustness of our proposed controller, which leads to a larger reachable space, less tracking error, and reduced unexpected stumbles. This unified policy highlights the promising capabilities of legged manipulators in performing loco-manipulation tasks across complex terrains.

2605.31340 2026-06-01 cs.HC cs.AI

Appropriateness of Empathy in AI: A Signal-Cost Perspective

AI中同理心的适当性:信号-成本视角

Chi-Ching Juan, Tao Wang, Harold Lee

AI总结 本文从信号-成本视角出发,运用信号理论提出信号成本代理(情感丰富性、观点采择和情境定制)来评估AI同理心的适当性,建立多维度框架以系统评价同理心是否适应用户需求。

详情
Comments
Accepted by IEEE CASCON 2025
AI中文摘要

AI中同理心的适当性已成为一个关键问题,因为过度同理心可能显得操纵性,而不足则显得冷漠。虽然先前研究探索了如何量化AI中的同理心,但很少有研究考察这种同理心在情境上是否适当。本文通过将信号理论应用于人机对话,引入了一种经济学视角。我们提出了信号成本代理(情感丰富性、观点采择和情境定制),分别映射到情感、认知和关联同理心。这一多维度框架使得能够系统评估同理心,不仅基于其存在,还基于其相对于用户需求的适当性。

英文摘要

The appropriateness of empathy in AI has emerged as a critical concern, as excessive empathy risks seeming manipulative while insufficient empathy appears dismissive. While prior research has explored how to quantify empathy in AI, few studies examine whether such empathy is contextually appropriate. This paper introduces an economic perspective by applying signaling theory to human-AI conversations. We propose Signal Cost Proxies (emotional richness, perspective-taking, and contextual tailoring) mapped to affective, cognitive, and associative empathy. This multidimensional framework enables systematic evaluation of empathy not just by presence, but by its appropriateness relative to user demand.

2605.31338 2026-06-01 cs.CL

Bundesrecht: An Open Library and Corpus for German Statutory Reference Processing

Bundesrecht: 面向德国法律引用处理的开放库与语料库

Harshil Darji, Martin Heckelmann, Christina Kratsch, Gerard de Melo

AI总结 本文提出 bundesrecht,一个包含软件库和结构化语料库的开放资源,用于解析、规范化和解析德国法律引用,实现从原始引用字符串到结构化法律条文的端到端处理。

详情
Comments
10 pages, 1 figure. Preprint
AI中文摘要

法律引用是法律语言理解的核心,但难以自动处理,因为它们以紧凑且多变的表面形式出现,可能组合多个目标,使用特殊缩写,并常指向较低级别的单元。现有的德语工具要么专注于从法律文档中解析引用,要么在引用明确后访问法律文本。本文介绍了 bundesrecht,一个用于德国法律引用处理的开放资源,包含一个软件库和一个结构化的德国联邦法律语料库。该库解析、规范化和解析德国法律引用,将原始引用字符串映射到结构化对象,将紧凑引用扩展为规范形式,并将其链接到法律条文。附带的语料库保留了从法律到细粒度子条款的法律内部层级。我们使用严格精确匹配和微信息抽取指标,在 2,944 个带注释的德国法律引用上评估了解析器和规范化器。我们进一步评估了规范引用去重,并表明规范化引用比字符串匹配更可靠地对真实引用表面变体进行分组。bundesrecht 是第一个覆盖德国法律引用处理端到端流水线的开放资源,从原始引用字符串到解析后的法律条文,并在 PyPI 上可用。

英文摘要

Statutory references are central to legal language understanding, but are difficult to process automatically, as they appear in compact and variable surface forms, may combine multiple targets, use special abbreviations, and often point to lower-level units. Existing tools for German focus either on parsing references from legal documents or accessing statutory text once citations are explicit. This paper introduces bundesrecht, an open resource for German statutory reference processing, consisting of a software library and a structured corpus of German federal law. The library parses, normalizes, and resolves German statutory references, mapping raw citation strings to structured objects, expanding compact references into canonical forms, and linking them to statutory provisions. The accompanying dataset preserves the internal hierarchy of statutes from laws to fine-granular subclauses. We evaluate the parser and normalizer on 2,944 annotated German legal references using strict exact-match and micro information extraction metrics. We further evaluate canonical reference deduplication and show that normalized references group real citation surface variants far more reliably than string matching. bundesrecht is the first open resource that covers German statutory reference processing as an end-to-end pipeline, from raw citation string to resolved statutory provision, and is available on PyPI.

2605.31336 2026-06-01 cs.CV

DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory

DecMem:基于解耦记忆的分钟级一致世界生成

Zhenhao Yang, Xiaoshi Wu, Zhengyao Lv, Xiaoyu Shi, Xintao Wang, Pengfei Wan, Kun Gai, Kwan-Yee K. Wong

AI总结 提出解耦记忆架构DecMem,通过稀疏全局记忆和锚定局部记忆解决长程视频生成中的时空一致性问题,实现分钟级可控长视频生成。

详情
Comments
Project page is available at https://jeffreyyzh.github.io/DecMem-Page
AI中文摘要

近期视频生成模型的进展推动了可控世界模型的快速发展。然而,在长程推理下保持细粒度时空一致性仍是一个关键挑战。在这项工作中,我们超越了显式3D记忆和粗粒度的帧级隐式建模,提出了一种细粒度、可学习且可扩展的记忆用于一致世界生成。我们首先识别了朴素可学习记忆架构在长程外推中的两个基本限制,即计算效率低下和注意力分散。通过对注意力分散的系统分析,我们提出了DecMem,一种解耦记忆架构,采用稀疏全局记忆实现对全局历史的高效细粒度访问,以及锚定局部记忆实现稳定高质量的外推。大量实验表明,DecMem显著优于当前最先进的方法。通过确保精确高效的长时记忆并实现卓越的外推能力,DecMem实现了分钟级可控长视频生成,具有高保真度和一致性。

英文摘要

Recent advances in video generative models have promoted rapid progress in controllable world models. However, maintaining fine-grained spatio-temporal consistency under long-horizon reasoning remains a key challenge. In this work, we move beyond explicit 3D memory and coarse frame-level implicit modeling, and propose a fine-grained, learnable, and scalable memory for consistent world generation. We first identify two fundamental limitations of naïve learnable memory architectures in long-horizon extrapolation, namely computational inefficiency and attention dispersion. Through a systematic analysis of attention dispersion, we propose DecMem, a decoupled memory architecture that employs Sparse Global Memory for efficient fine-grained access to global history and Anchored Local Memory for stable and high-quality extrapolation. Extensive experiments demonstrate that DecMem significantly outperforms current state-of-the-art methods. By ensuring precise and efficient long-term memory and achieving superior extrapolation capabilities, DecMem enables minute-level controllable long video generation with high fidelity and consistency.

2605.31330 2026-06-01 cs.GT cs.AI cs.MA math.OC nlin.AO

Social welfare optimisation under institutional reward and punishment

制度奖惩下的社会福利优化

Van An Nguyen, Vuong Khang Huynh, Huu Loi Bui, Hai Anh Ha, Quang Dung Le, Tan Dat Nguyen, Ngoc Ngu Nguyen, Zhao Song, Manh Hong Duong, Le Hong Trang, The Anh Han

AI总结 研究在有限混合群体中,通过奖励合作者或惩罚背叛者来最大化社会福利的激励机制,推导出最优激励的显式表达式和相变条件,并比较奖励与惩罚的福利效果。

详情
AI中文摘要

制度激励被广泛用于促进从人类社会到多智能体和AI系统中自主、自利代理人的合作。现有工作通常将激励设计视为双目标问题:在实现高长期合作频率的同时最小化制度成本。此类方案是否也能最大化社会福利——即扣除制度支出后的总人口收益——在很大程度上尚未被探索。我们针对有限、充分混合的群体中参与社会困境(捐赠博弈和公共品博弈)的情况,开发了一个以福利为中心的激励框架,同时考虑对合作者的奖励和对背叛者的惩罚。对于每种机制,我们推导出预期社会福利的显式表达式,并刻画其如何依赖于激励效率和选择强度。在解析上,我们识别出社会福利具有单一最优激励水平的参数区间,以及出现定性相变、福利非单调且具有多个局部最优的区间。我们证明任何最大化福利的激励要么为零,要么集中在简单的闭式目标附近,并提供了一种高效算法来计算这些最优值。通过比较奖励和惩罚,我们进一步推导出在给定预算下奖励在福利方面优于惩罚的闭式条件。总体而言,我们的结果揭示了针对成本或合作频率优化的激励与最大化福利的激励之间存在系统性差距。

英文摘要

Institutional incentives are widely used to promote cooperation among autonomous, self-regarding agents, from human societies to multi-agent and AI systems. Existing work typically treats incentive design as a bi-objective problem: minimise institutional cost while achieving a high long-run frequency of cooperation. Whether such schemes also maximise social welfare (total population payoff net of institutional expenditure) has remained largely unexplored. We develop a welfare-centric framework for institutional incentives in finite, well-mixed populations playing a social dilemma (Donation Game and Public Goods Game), considering both rewards for cooperators and punishments for defectors. For each mechanism, we derive explicit expressions for expected social welfare and characterise how it depends on incentive efficiency and selection intensity. Analytically, we identify parameter regimes where social welfare has a single optimal incentive level and regimes with qualitative phase transitions, in which welfare becomes non-monotonic with multiple local optima. We prove that any welfare-maximising incentive is either zero or concentrated around a simple closed-form target, and we provide an efficient algorithm to compute these optima. Comparing reward and punishment, we further derive closed-form conditions under which reward outperforms punishment in terms of social welfare for any given budget. Overall, our results reveal a systematic gap between incentives optimised for cost or cooperation frequency and those that maximise welfare.

2605.31328 2026-06-01 cs.CL

Reinforcement Learning Amplifies Emergent Misalignment from Harmless Rewards

强化学习放大了来自无害奖励的涌现性失调

Magnus Jørgenvåg, David Kaczér, Lasse Ruttert, Marvin Gülhan, Lucie Flek, Florian Mai

AI总结 本文研究强化学习如何从看似无害的奖励信号中引发语言模型的涌现性失调,发现其比监督微调更严重,并验证了在训练中插入安全数据可缓解此问题。

详情
AI中文摘要

涌现性失调(EM)是指语言模型在针对狭窄的失调示例进行微调后,意外地变得广泛失调的倾向。虽然EM在监督微调(SFT)设置中已被广泛研究,但来自强化学习(RL)的证据仅限于大型闭源模型,使得该现象研究成本高昂且难以复现。我们沿三个维度刻画了小型、现成的开源权重模型中来自RL的EM。首先,我们表明,奖励狭窄、明显的失调行为比样本匹配的SFT产生更高的通用域失调。其次,我们表明,来自RL的EM可以由可能自然出现的奖励信号引发,例如不受欢迎的审美偏好或糟糕的修辞诉求。第三,我们评估了为SFT引发的EM开发的训练中缓解措施,发现它们广泛适用,其中交错进行策略内安全数据效果最佳。

英文摘要

Emergent misalignment (EM) is the surprising tendency of language models to become broadly misaligned after fine-tuning on narrowly misaligned examples. While EM has been extensively studied in the supervised fine-tuning (SFT) setting, evidence that it also arises from reinforcement learning (RL) is limited to large, closed-source models, leaving the phenomenon expensive to study and difficult to reproduce. We characterize EM from RL in small, off-the-shelf open-weight models along three axes. First, we show that rewarding narrow, overtly misaligned behavior produces substantially higher general-domain misalignment than sample-matched SFT. Second, we show that EM from RL can be induced by reward signals that could plausibly arise naturally, such as unpopular aesthetic preferences or poor rhetorical appeals. Third, we evaluate in-training mitigations developed for SFT-induced EM and find that they broadly transfer, with interleaving on-policy safety data performing best.

2605.31324 2026-06-01 cs.LG cs.AI

Inconsistency-Aware Minimization: Improving Generalization with Unlabeled Data

不一致感知最小化:利用无标签数据提升泛化能力

Hee-Sung Kim, Hyeonseong Kim, Sungyoon Lee

AI总结 本文提出一种基于信息几何的局部不一致性度量,并据此设计不一致感知最小化(IAM)方法,通过无标签数据计算该度量并融入训练目标,从而提升深度学习模型的泛化性能。

详情
Comments
ICML 2026
AI中文摘要

估计泛化差距并开发改进泛化的优化方法对于深度学习模型至关重要,无论是从理论理解还是实际应用角度。利用无标签数据实现这些目标在实际场景中具有显著优势。本文从神经网络参数空间的信息几何角度出发,引入了一种新的泛化度量——局部不一致性。局部不一致性的一个关键特征是它可以在没有显式标签的情况下计算。我们通过将局部不一致性与Fisher信息矩阵和损失Hessian矩阵联系起来,建立了理论基础。实验上,我们证明了局部不一致性与泛化差距相关。基于这些发现,我们提出了不一致感知最小化(IAM),将局部不一致性纳入训练目标。我们证明,在标准监督学习设置中,IAM增强了泛化能力,实现了与现有方法(如锐度感知最小化)相当的性能。此外,IAM在半监督和自监督学习场景中表现出有效性,其中局部不一致性是从无标签数据计算得出的。

英文摘要

Estimating the generalization gap and developing optimization methods that improve generalization are crucial for deep learning models, for both theoretical understanding and practical applications. Leveraging unlabeled data for these purposes offers significant advantages in real-world scenarios. This paper introduces a novel generalization measure, local inconsistency, derived from an information-geometric perspective on the parameter space of neural networks. A key feature of local inconsistency is that it can be computed without explicit labels. We establish theoretical underpinnings by connecting local inconsistency to the Fisher information matrix and the loss Hessian. Empirically, we demonstrate that local inconsistency correlates with the generalization gap. Based on these findings, we propose Inconsistency-Aware Minimization (IAM), which incorporates local inconsistency into the training objective. We demonstrate that in standard supervised learning settings, IAM enhances generalization, achieving performance comparable to that of existing methods such as Sharpness-Aware Minimization. Furthermore, IAM exhibits efficacy in semi- and self-supervised learning scenarios, where the local inconsistency is computed from unlabeled data.

2605.31321 2026-06-01 cs.RO

Surface Constraint Policy for Learning Surface-Constrained and Dynamically Feasible Robot Skills

表面约束策略:学习受表面约束且动态可行的机器人技能

Shuai Ke, Jiexin Zhang, Huan Zhao, Zhiao Wei, Yikun Guo, Jie Pan, Han Ding

AI总结 提出表面约束策略(SCP),通过二维加权高斯核编码表面几何约束,结合扩散策略和基于相似性的动作映射生成动态可行的表面约束运动,解决了自由曲面约束下动作随机性和接触不稳定的问题。

详情
AI中文摘要

基于扩散的模仿学习方法在机器人灵巧操作任务中取得了快速进展。然而,当应用于涉及复杂自由曲面约束的任务时,由于缺乏显式的表面几何约束建模和动态可行性问题,它们存在局限性,导致随机动作生成无法实现可靠的表面对齐和维持稳定接触。为了解决这些局限性,我们提出了一种新颖的表面约束策略(SCP),用于基于人类演示和实时视觉观察生成满足自由曲面约束的机器人动作。首先,使用从演示中推导出的二维加权高斯核函数对表面几何约束进行编码。基于编码的表面几何约束,使用基于扩散的策略从多模态感知输入(包括视觉观察和机器人状态反馈)中推断任务级动作意图。这些意图通过基于相似性的动作映射方法进一步转化为表面约束的动态运动基元(DMP),从而实现平滑且柔顺的运动执行。SCP实现了结构化表面几何意图和动态可接受动作的生成。所提出的方法在多个表面操作任务上进行了验证,并与现有技术进行了比较。实验结果表明,在表面约束下,该方法具有优越的任务成功率和接触稳定性。

英文摘要

Diffusion-based imitation learning methods have driven rapid progress in dexterous robot manipulation tasks. However, they are limited when applied to tasks that involve complex free-form surface constraints: they lack explicit modeling of surface geometry constraints and do not address dynamic feasibility, so their stochastic action generation fails to achieve reliable surface alignment and maintain stable contact. To address these limitations, we propose a novel surface constraint policy (SCP) for generating robot actions that satisfy free-form surface constraints on the basis of human demonstrations and real-time visual observations. First, the surface geometry constraint is encoded using a two-dimensional weighted Gaussian kernel function derived from demonstrations. Building on the encoded surface geometry constraints, a diffusion-based policy infers task-level action intentions from multimodal sensory inputs, including visual observations and robot state feedback. These intentions are then transformed into surface-constrained dynamic movement primitives (DMPs) through a similarity-based action mapping method, enabling smooth and compliant motion execution. SCP thus generates both structured surface-geometric intent and dynamically admissible actions. The proposed method is validated on multiple surface manipulation tasks and compared with existing techniques. The experimental results demonstrate superior task success rates and contact stability under surface constraints.

2605.31318 2026-06-01 cs.LG cs.MA

Generalized Intention Modeling in Multi-Agent Reinforcement Learning

多智能体强化学习中的广义意图建模

Mateusz Odrowaz-Sypniewski, Jasmine Bayrooti, Ajay Shankar, Amanda Prorok

AI总结 提出一种任务自适应的对手建模框架,通过性能驱动的多意图表示混合及最大化与自我智能体未来回报的互信息的新意图表示,提升非合作多智能体环境中的决策性能。

详情
AI中文摘要

在非合作、竞争和一般和的多智能体强化学习中,建模对手的意图对于有效决策至关重要。现有的对手建模方法使用从先验选择的回合信息(如对手的下一个动作或未来环境状态)中提取的嵌入来编码意图,并以此引导自我智能体的行为。这些方法假设所选信息普遍代表意图;然而,我们通过实验证明情况并非如此,因为意图通常依赖于任务和环境。为了解决这个问题,我们引入了一个任务自适应的对手建模框架,该框架学习一种性能驱动的多意图表示混合。此外,我们提出了一种新的意图表示,它最大化与自我智能体未来回报的互信息,从而捕获与性能最直接相关的对手信息。我们的方法在各种任务中始终匹配或超越最先进基线的性能,并揭示了不同对手建模策略何时以及为何成功。

英文摘要

Modeling an opponent's intent is critical for effective decision-making in non-cooperative, competitive, and general-sum multi-agent reinforcement learning. Existing opponent modeling methods encode intent using an embedding derived from episode information chosen a priori, such as the opponent's next action or a future environment state, and use this to guide the ego-agent's behavior. These approaches assume that the chosen information is universally representative of intent; however, we show empirically that this is not the case as intentions are often task- and environment-dependent. To address this, we introduce a task-adaptive opponent modeling framework that learns a performance-driven mixture of multiple intent representations. We further introduce a new intention representation that maximizes mutual information with the ego-agent's future returns, thereby capturing opponent information that is most directly relevant to performance. Our approach consistently matches or exceeds the performance of state-of-the-art baselines across diverse tasks and yields insights into when and why different opponent modeling strategies succeed.

2605.31317 2026-06-01 cs.LG

Forgetting Has Neighbors: Localized Collateral Forgetting in Machine Unlearning

遗忘有邻居:机器遗忘中的局部连带遗忘

Polina Dolgova, Sebastian U. Stich

AI总结 本文研究机器遗忘中梯度上升和随机标签方法导致的局部连带遗忘现象,并提出了基于局部教师蒸馏的缓解策略。

详情
AI中文摘要

机器遗忘旨在无需完全重新训练的情况下移除选定训练样本的影响。标准评估通常使用聚合指标(如准确率和遗忘分数)来概括遗忘质量,这可能会掩盖局部失败。我们通过比较遗忘模型与删除后重新训练模型的预测,在样本级别研究这种失败模式。我们表明,这种逐点差异可能高度不均匀:对于梯度上升和随机标签方法,无论是否进行保留集微调,差异都随着与遗忘集的几何接近度而增大。我们将这种现象称为局部连带遗忘。我们的分析确定了该效应背后的机制:遗忘过程中使用的替代目标可能与重新训练引起的局部预测结构不一致,并且这种不一致通过共享表示传播到邻近样本。受此机制启发,我们提出了局部教师蒸馏,一种简单的缓解策略,用仅在遗忘集的保留邻居上训练的小教师生成的软标签替换随机目标。在CIFAR-100部分类别删除任务中,这种局部教师使遗忘模型更接近重新训练,尤其是在遗忘集附近,同时保持有竞争力的聚合遗忘指标。

英文摘要

Machine unlearning aims to remove the influence of selected training examples without full retraining. Standard evaluations often summarize unlearning quality with aggregate metrics, such as accuracy- and forgetting-based scores, which can hide localized failures. We study this failure mode at the example level by comparing the predictions of an unlearned model to those of the model retrained after deletion. We show that this pointwise discrepancy can be highly non-uniform: for gradient-ascent and random-labeling methods, with and without retain-set fine-tuning, it grows with geometric proximity to the forget set. We call this phenomenon localized collateral forgetting. Our analysis identifies a mechanism behind the effect: surrogate targets used during unlearning can be inconsistent with the local prediction structure induced by retraining, and this inconsistency propagates through shared representations to nearby examples. Motivated by this mechanism, we propose Local Teacher Distillation, a simple mitigation strategy that replaces random targets with soft labels from a small teacher trained only on retained neighbors of the forget set. On CIFAR-100 partial-class deletion, this local teacher brings the unlearned model substantially closer to retraining, especially near the forget set, while maintaining competitive aggregate unlearning metrics.
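
The example-level comparison described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the abstract does not say which discrepancy or distance is used, so the total variation distance between predicted class probabilities and the Euclidean distance in a feature space are both assumptions, and the function names are hypothetical.

```python
import math


def total_variation(p, q):
    # Total variation distance between two discrete distributions
    # (assumed discrepancy measure; the abstract does not name one).
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def distance_to_forget_set(x, forget_points):
    # Euclidean distance from x to its nearest forget-set example
    # (assumed notion of "geometric proximity").
    return min(math.dist(x, f) for f in forget_points)


def discrepancy_profile(points, probs_unlearned, probs_retrained, forget_points):
    # Pair each example's distance to the forget set with the pointwise
    # discrepancy between the unlearned and the retrained model.
    profile = [
        (distance_to_forget_set(x, forget_points), total_variation(pu, pr))
        for x, pu, pr in zip(points, probs_unlearned, probs_retrained)
    ]
    return sorted(profile)  # nearest examples first


if __name__ == "__main__":
    # Toy data: the example close to the forget set disagrees with retraining.
    points = [(0.0, 0.0), (3.0, 4.0)]
    forget = [(0.0, 1.0)]
    unlearned = [[0.5, 0.5], [0.9, 0.1]]
    retrained = [[0.9, 0.1], [0.9, 0.1]]
    for dist, disc in discrepancy_profile(points, unlearned, retrained, forget):
        print(f"distance={dist:.2f} discrepancy={disc:.2f}")
```

Sorting by distance makes it easy to check whether the discrepancy is concentrated near the forget set, which an aggregate metric averaged over all examples would not show.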

2605.31315 2026-06-01 cs.LG

Graph Neural Networks Are Not Continuous Across Graph Resolutions

图神经网络在图分辨率上不连续

Christian Koke, Yuesong Shen, Abhishek Saroha, Marvin Eisenberger, Bastian Rieck, Michael Bronstein, Daniel Cremers

AI总结 本文证明图神经网络在自然图收敛模式下不连续,并提出一种基于信息传播方案的结构性修改,使其具备跨尺度连续性,从而实现对不同分辨率的稳定整合与泛化。

详情
Comments
arXiv admin note: text overlap with arXiv:2310.00431
AI中文摘要

我们表明,与社区中的传统观点相反,图神经网络(GNN)对于所有自然的图收敛模式并不连续。因此,GNN 可能为非常相似的图生成截然不同的潜在表示。特别是,它们为表示同一底层对象但处于不同分辨率尺度的图分配了非常不同的潜在嵌入。我们将这种不连续性的失败追溯到由常用信息传播方案引起的结构性障碍。基于这一见解,我们推导出对标准 GNN 架构的一种原则性修改,使模型具备跨尺度的连续性。所提出的修改能够实现不同分辨率的稳定整合以及它们之间的可靠泛化。我们通过广泛的数值实验系统性地验证了我们的理论发现。

英文摘要

We show that, contrary to conventional wisdom in the community, graph neural networks (GNNs) are not continuous with respect to all natural modes of graph convergence. As a result, GNNs may generate substantially different latent representations for graphs that are very similar. In particular, they assign vastly different latent embeddings to graphs that represent the same underlying object at different resolution scales. We trace this failure of continuity back to a structural obstruction arising from commonly used information-propagation schemes. Building on this insight, we derive a principled modification to standard GNN architectures that equips models with continuity across scales. The proposed modification enables consistent integration of distinct resolutions and reliable generalization between them. We systematically validate our theoretical findings in a wide range of numerical experiments.

2605.31314 2026-06-01 cs.RO

AR Forcing: Towards Long-Horizon Robot Navigation World Model

AR Forcing: 迈向长时域机器人导航世界模型

Yifei Yang, Zehua Fan, Huan Li, Aoqi Wang, Lida Huang, Haibao Yu, Haiyan Liu, Xuanyao Mao, Jason Bao, Liang Xu, Bingchuan Sun, Yan Wang

AI总结 提出AR Forcing自回归训练策略,通过将扩散损失集成到自回归训练循环中,解决训练与推理分布偏移问题,提升长时域导航中图像一致性和轨迹预测精度。

详情
AI中文摘要

基于扩散的机器人导航世界模型通常使用并行监督进行训练,而在路径规划时采用自回归推理。这导致训练和推理之间的分布偏移,从而在长时域预测中降低性能。我们提出AR Forcing,一种自回归训练策略,将标准扩散损失集成到自回归训练循环中。在每个步骤中,模型使用其自身的预测来更新上下文并优化单步噪声预测目标,从而在训练期间显式地将模型暴露于推理状态分布。我们的方法不需要额外的判别器或分布匹配损失,保留了原始扩散框架和采样器,并且易于集成。在多领域导航数据集(RECON、SCAND、HuRoN、TartanDrive)上的实验表明,与强基线相比,AR Forcing在长时域导航期间提高了生成图像的一致性以及预测轨迹的准确性,增强了模型在复杂已知和未知环境中的鲁棒性。我们将很快发布代码。

英文摘要

Diffusion-based world models for robot navigation are typically trained with parallel supervision, whereas path planning relies on autoregressive inference. This results in a distribution shift between training and inference, which destabilizes performance over long-horizon prediction. We propose AR Forcing, an autoregressive training strategy that integrates the standard diffusion loss into the autoregressive training loop. At each step, the model uses its own predictions to update the context and optimizes the single-step noise prediction objective, thereby explicitly exposing the model to the inference state distribution during training. Our method does not require additional discriminators or distribution-matching losses, retains the original diffusion framework and sampler, and is easy to integrate. Experiments on multi-domain navigation datasets (RECON, SCAND, HuRoN, TartanDrive) show that, compared with strong baselines, AR Forcing improves the consistency of generated images during long-horizon navigation and the accuracy of predicted trajectories, enhancing the robustness of the model in complex known and unknown environments. We will release the code soon.

2605.31312 2026-06-01 cs.CV cs.CL

Learning from Fine-Grained Visual Discrepancies: Mitigating Multimodal Hallucinations via In-Context Visual Contrastive Optimization

从细粒度视觉差异中学习:通过上下文视觉对比优化缓解多模态幻觉

Haolin Deng, Xin Zou, Zhiwei Jin, Chen Chen, Haonan Lu, Xuming Hu

AI总结 提出上下文视觉对比优化(IC-VCO)方法,通过共享多图像上下文中的对比图像确保数学严谨的目标,并引入视觉对比蒸馏(VCDist)和对比样本编辑策略,有效缓解多模态幻觉。

详情
Comments
ICML 2026
AI中文摘要

多模态幻觉仍然是视觉语言模型(VLM)面临的持续挑战。标准的文本直接偏好优化(DPO)由于缺乏显式的视觉监督,往往无法缓解这一问题。虽然现有工作通过将原始图像与负样本对比引入了视觉偏好DPO,但由于配分函数不匹配导致目标在理论上不一致,并且依赖可能引发捷径学习的粗粒度负样本。在这项工作中,我们提出了上下文视觉对比优化(IC-VCO)。通过将对比图像置于共享的多图像上下文中,IC-VCO确保了数学上严谨的目标。我们进一步引入了视觉对比蒸馏(VCDist),一种辅助的可靠性门控正则化器,鼓励多图像对比训练与单图像推理之间的一致性。最后,我们提出了一种对比样本编辑策略,通过精确的语义扰动生成困难负样本。在五个基准上的实验表明,IC-VCO取得了最佳的整体性能,并且我们的样本编辑策略有效。代码和数据可在 https://github.com/OPPO-Mente-Lab/IC-VCO 获取。

英文摘要

Multimodal hallucination remains a persistent challenge for Vision-Language Models (VLMs). Standard textual Direct Preference Optimization (DPO) often fails to mitigate it due to a lack of explicit visual supervision. While existing works introduce visual preference DPO by contrasting original images against negative ones, they suffer from a theoretically inconsistent objective caused by partition function mismatches and rely on coarse-grained negatives that could enable shortcut learning. In this work, we propose In-Context Visual Contrastive Optimization (IC-VCO). By placing contrastive images within a shared multi-image context, IC-VCO ensures a mathematically rigorous objective. We further introduce Visual Contrast Distillation (VCDist), an auxiliary reliability-gated regularizer that encourages consistency between multi-image contrastive training and single-image inference. Finally, we propose a contrastive sample editing strategy that generates hard negatives via precise semantic perturbations. Experiments on five benchmarks demonstrate IC-VCO's best overall performance and the effectiveness of our sample editing strategy. Code and data are available at https://github.com/OPPO-Mente-Lab/IC-VCO.

2605.31311 2026-06-01 math.OC cs.DC cs.LG

S$^3$LDBO: A Snapshot Single-Loop Algorithm for Decentralized Bilevel Optimization

S$^3$LDBO: 一种用于去中心化双层优化的快照单循环算法

Chao Yin, Youran Dong, Shiqian Ma, Bofan Wang, Junfeng Yang

AI总结 提出S$^3$LDBO算法,通过快照机制间歇跳过昂贵导数计算,实现去中心化双层优化的高效单循环求解,并理论证明其复杂度,实验验证计算效率与学习性能的平衡。

详情
AI中文摘要

网络化AI系统日益依赖多个智能体通过通信网络协作学习和适应模型。在此类系统中,双层公式自然出现在超参数优化、数据清洗和元学习中,但梯度、雅可比矩阵和海森矩阵的重复评估可能给单个智能体带来巨大计算负担。为应对这一挑战,我们提出Snapshot-SLDBO(S$^3$LDBO),一种高效的单循环去中心化双层优化算法,通过快照机制使智能体能够间歇性地跳过昂贵的导数计算。该机制可解释为网络化AI的自主计算-适应策略,其中智能体选择性执行昂贵的局部更新,同时保持全局协作学习。我们在确定性设定下建立了所提出算法的遍历迭代复杂度和高概率非遍历迭代复杂度。在合成数据集和MNIST数据集上的超参数优化、Fashion-MNIST上的数据超清洗以及miniImageNet上的去中心化元学习实验结果表明,所提出算法在保持竞争性学习性能的同时提高了计算效率。

英文摘要

Networked AI systems increasingly rely on multiple agents that collaboratively learn and adapt models over communication networks. In such systems, bilevel formulations naturally arise in hyperparameter optimization, data cleaning, and meta-learning, but the repeated evaluation of gradients, Jacobians, and Hessians can impose a substantial computational burden on individual agents. To address this challenge, we propose Snapshot-SLDBO (S$^3$LDBO), an efficient single-loop decentralized bilevel optimization algorithm that enables agents to intermittently skip expensive derivative evaluations through a snapshot mechanism. This mechanism can be interpreted as an autonomous computation-adaptation strategy for networked AI, where agents selectively perform costly local updates while maintaining global collaborative learning. We establish the ergodic iteration complexity and the high-probability nonergodic iteration complexity of the proposed algorithm within a deterministic setting. Experimental results on hyperparameter optimization with synthetic and MNIST datasets, data hyper-cleaning on Fashion-MNIST, and decentralized meta-learning on miniImageNet demonstrate that the proposed algorithm improves computational efficiency while maintaining competitive learning performance.

2605.31309 2026-06-01 cs.LG math.PR stat.ML

Non-Asymptotic Convergence of Stochastic Iterative Algorithms: A Lyapunov Framework

随机迭代算法的非渐近收敛性:一个李雅普诺夫框架

Zaiwei Chen, Siva Theja Maguluri

AI总结 本文综述了基于李雅普诺夫技术的随机迭代算法(随机逼近)的有限时间分析方法,通过广义Moreau包络作为通用李雅普诺夫函数,给出了均方收敛保证,并应用于随机梯度下降、线性SA及Q学习等强化学习算法,最后讨论了马尔可夫噪声、半范数压缩算子等扩展。

详情
Comments
44 pages
AI中文摘要

我们综述了基于李雅普诺夫技术的随机迭代算法(也称为随机逼近(SA)算法)的有限时间分析方法,用于求解不动点方程 $\bar{F}(x)=x$,其中算子 $\bar{F}(\cdot)$ 只能通过带噪声的预言机访问。我们首先关注标准设定,其中 $\bar{F}(\cdot)$ 关于某种范数是压缩的且噪声是独立同分布的,并解释广义Moreau包络如何作为通用李雅普诺夫函数,无论底层范数如何。然后,我们展示该框架如何产生均方收敛保证,并应用于随机梯度下降、线性SA以及基于值的强化学习算法,如Q学习和时序差分学习。最后,我们讨论向马尔可夫噪声、半范数压缩算子、耗散算子和高概率界的扩展,并以开放问题作结。目标是提供一个统一且自包含的SA有限时间分析及其应用(尤其是在强化学习中)的路线图。

英文摘要

We survey Lyapunov-based techniques for the finite-time analysis of stochastic iterative algorithms, also known as stochastic approximation (SA) algorithms, for solving fixed-point equations $\bar{F}(x)=x$, where the operator $\bar{F}(\cdot)$ can only be accessed through a noisy oracle. We first focus on the standard setting in which $\bar{F}(\cdot)$ is contractive with respect to some norm and the noise is i.i.d., and explain how generalized Moreau envelopes serve as universal Lyapunov functions, regardless of the underlying norm. We then show how this framework yields mean-square convergence guarantees and applies to stochastic gradient descent, linear SA, and value-based reinforcement learning algorithms such as Q-learning and temporal-difference learning. Finally, we discuss extensions to Markovian noise, seminorm-contractive operators, dissipative operators, and high-probability bounds, and conclude with open problems. The goal is to present a unified and self-contained roadmap for the finite-time analysis of SA and its applications, especially in reinforcement learning.
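
The standard setting in this abstract (a contractive operator observed through an i.i.d. noisy oracle) can be illustrated with a small sketch. The scalar operator, the noise level, and the step-size schedule below are all assumptions chosen for illustration, not taken from the survey, and the sketch does not implement the generalized Moreau envelope analysis.

```python
import random


def noisy_oracle(x, rng, noise_std=0.1):
    # Hypothetical operator F_bar(x) = 0.5 * x + 1.0: a contraction with
    # modulus 0.5 whose unique fixed point is x* = 2. The oracle returns
    # F_bar(x) corrupted by i.i.d. zero-mean Gaussian noise.
    return 0.5 * x + 1.0 + rng.gauss(0.0, noise_std)


def stochastic_approximation(x0, num_steps, seed=0):
    # Iterate x_{k+1} = x_k + alpha_k * (F(x_k, w_k) - x_k) with the
    # diminishing step size alpha_k = 2 / (k + 2) (an illustrative choice).
    rng = random.Random(seed)
    x = x0
    for k in range(num_steps):
        alpha = 2.0 / (k + 2)
        x = x + alpha * (noisy_oracle(x, rng) - x)
    return x


if __name__ == "__main__":
    x_final = stochastic_approximation(x0=10.0, num_steps=5000)
    print(f"final iterate: {x_final:.4f}")
    print(f"squared error: {(x_final - 2.0) ** 2:.2e}")
```

In this scalar example the squared error to the fixed point is the quantity a mean-square bound would control; with diminishing step sizes the iterate settles close to x* = 2 despite the noise.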

2605.31308 2026-06-01 cs.AI

TraceGraph: Shared Decision Landscapes for Diagnosing and Improving Agent Trajectories

TraceGraph: 用于诊断和改进智能体轨迹的共享决策景观

Junjie Nian, Kang Chen, Ge Zhang, Yixin Cao, Yugang Jiang

AI总结 提出TraceGraph图框架,将多模型智能体轨迹构建为共享决策景观,通过事件摘要和陷阱感知恢复管线提升SWE-bench解决率。

详情
AI中文摘要

智能体基准测试越来越多地记录丰富的交互轨迹,但评估通常将每次运行简化为通过率或奖励分数。我们引入了TraceGraph,一个基于图的框架,将发布的多模型智能体轨迹转化为共享决策景观。对于每个任务,TraceGraph在引入模型身份之前,从聚合的运行中构建一个关于可观察动作-观察状态的图。然后,它叠加结果信息丰富的生产核心和陷阱区域,并用三个事件总结每条轨迹:访问、陷阱暴露和修复。跨越五个基准测试分割的轨迹中,TraceGraph配置文件揭示了被聚合分数隐藏的导航差异,并显示不同分割在奖励避免陷阱还是从中恢复方面有所不同。相同的TraceGraph景观还激发了SWE-bench的陷阱感知恢复管线:运行时检测器在匹配历史陷阱区域的状态上触发,然后从相同前缀评估轻量级延续策略。在触发状态上,最佳聚合单因子策略将每个提供者触发子集上的官方解决率从40.4%提高到43.5%,在共同触发实例上从41.0%提高到44.8%,并具有提供者特定的主动组件。总体而言,TraceGraph提供了一个过程词汇,用于询问智能体基准测试测试什么、模型在共享景观上何处出现分歧,以及失败区域如何指导下游改进。

英文摘要

Agent benchmarks increasingly record rich interaction trajectories, yet evaluation often reduces each rollout to a pass rate or reward score. We introduce TraceGraph, a graph-based framework that turns released multi-model agent trajectories into shared decision landscapes. For each task, TraceGraph builds a graph over observable action-observation states from pooled rollouts before model identity is introduced. It then overlays outcome-informed productive cores and trap regions, and summarizes each rollout with three events: Access, Trap exposure, and Repair. Across trajectories spanning five benchmark splits, TraceGraph profiles reveal navigation differences hidden by aggregate scores and show that splits differ in whether they reward avoiding traps or recovering from them. The same TraceGraph landscape also motivates a trap-aware recovery pipeline for SWE-bench: a runtime detector fires on states matching historical trap regions, then lightweight continuation policies are evaluated from the same prefix. On fired states, the best pooled single-factor policy raises official resolved rate from 40.4% to 43.5% on the per-provider fired subset and from 41.0% to 44.8% on common-fired instances, with provider-specific active components. Overall, TraceGraph provides a process vocabulary for asking what agent benchmarks test, where models diverge on a shared landscape, and how failure regions can guide downstream improvement.
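
The idea of pooling rollouts into one graph and then overlaying outcomes can be sketched as below. This is a toy illustration under stated assumptions, not the TraceGraph implementation: states are plain strings, the function names are hypothetical, and the rule for flagging a trap state (visited by at least `min_visits` rollouts with a success rate at or below `max_success_rate`) is an assumption.

```python
from collections import defaultdict


def build_shared_graph(rollouts):
    # rollouts: list of (states, success) pairs, pooled across models.
    # Model identity is deliberately absent from the graph itself.
    edges = defaultdict(int)      # (src, dst) -> transition count
    visits = defaultdict(int)     # state -> number of rollouts visiting it
    successes = defaultdict(int)  # state -> number of successful visitors
    for states, success in rollouts:
        for src, dst in zip(states, states[1:]):
            edges[(src, dst)] += 1
        for state in set(states):
            visits[state] += 1
            successes[state] += int(success)
    return edges, visits, successes


def trap_states(visits, successes, min_visits=2, max_success_rate=0.25):
    # Assumed trap rule: frequently visited states whose visitors rarely succeed.
    return {
        state
        for state, count in visits.items()
        if count >= min_visits and successes[state] / count <= max_success_rate
    }


if __name__ == "__main__":
    pooled = [
        (["start", "read", "edit", "test"], True),
        (["start", "read", "loop", "loop"], False),
        (["start", "loop"], False),
    ]
    edges, visits, successes = build_shared_graph(pooled)
    print(sorted(trap_states(visits, successes)))
```

A runtime detector in the spirit of the abstract would then check whether the current state of a new rollout falls in the flagged set before choosing a continuation policy.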