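arXivDaily: a daily academic digest synchronized with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.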

2603.14987 2026-05-22 cs.CL cs.DB

Beyond Benchmark Islands: Toward Representative Trustworthiness Evaluation for Agentic AI

Jinhu Qi, Yifan Li, Minghao Zhao, Wentao Zhang, Zijian Zhang, Yaoman Li, Irwin King

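AI summary: This paper proposes a five-property definition of agentic trustworthiness and introduces the Holographic Agent Assessment Framework (HAAF), which evaluates agent systems in socio-technical scenarios over a scenario manifold through static policy analysis, sandbox simulation, social-ethical alignment assessment, and distribution-aware sampling. It reports a cross-family transfer experiment on 13 systems from seven model families.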

Comments 9 pages, 3 figures, 8 tables. Submitted to the Agent4IR Workshop at KDD 2026

Abstract

Agentic AI systems increasingly act through tool-augmented, multi-step workflows whose failures (unsafe tool use, unauthorised actions, social harm) carry deployment-level consequences. Evaluation practice remains fragmented across isolated benchmark slices, and "trustworthiness" is frequently invoked but rarely defined operationally. We argue the central limitation is twofold: (i) the absence of a measurable specification of what agent trustworthiness means, and (ii) the lack of a principled notion of representativeness allowing assessment over a socio-technical scenario distribution rather than disconnected benchmark instances. We address (i) by defining agentic trustworthiness as a five-property profile (Reliability, Robustness, Safety, Social-Ethical Alignment, Operational Integrity) grounded in current AI risk frameworks, and (ii) with the Holographic Agent Assessment Framework (HAAF), which measures this profile over a scenario manifold through static policy analysis, sandbox simulation, social-ethical alignment assessment, and distribution-aware sampling, connected through an iterative Trustworthy Optimization Factory that converts red-team diagnoses into blue-team interventions. Our contributions are: (1) an operational five-property definition of agentic trustworthiness; (2) a distribution-aware scenario-sampling framework that surfaces property-level trade-offs invisible to scalar leaderboards; and (3) a cross-family transfer experiment in which interventions designed from a single focal model generalise -- without per-model or per-scenario tuning -- to 13 systems from seven model families (Llama, Mistral, Kimi, GLM, Qwen, GPT, DeepSeek) on a 100-scenario suite, where all 13 systems improve and two reach a perfect risk-weighted profile, establishing HAAF's Factory as a model-agnostic deployment-readiness pipeline. Code: https://github.com/TonyQJH/haaf-pilot

2603.11679 2026-05-22 cs.AI

LLMs can construct powerful representations and streamline sample-efficient supervised learning

Ilker Demirel, Lawrence Shi, Zeshan Hussain, David Sontag

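AI summary: This paper proposes an LLM-based agentic pipeline that synthesizes a global rubric to build better representations of multimodal data, and significantly outperforms count-feature models, naive LLM baselines, and a clinical foundation model across 15 clinical tasks from the EHRSHOT benchmark.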

Abstract

As real-world datasets become more complex and heterogeneous, supervised learning is often bottlenecked by input representation design. Modeling multimodal data, such as time-series, free text, and structured records, often requires non-trivial domain expertise. We propose an agentic pipeline to streamline this process. First, an LLM analyzes a small but diverse subset of text-serialized input examples in-context to synthesize a global rubric, which acts as a programmatic specification for extracting and organizing evidence. This rubric is then used to transform naive text-serializations of inputs into a more standardized format for downstream models. We also describe local rubrics, which are task-conditioned interpretive summaries generated by an LLM. Across 15 clinical tasks from the EHRSHOT benchmark, our rubric approaches significantly outperform count-feature models, naive LLM baselines, and a clinical foundation model pretrained on orders of magnitude more data. Beyond performance, rubrics offer operational advantages such as being easy to audit, cost-effectiveness at scale, and facilitating tabular representations.

2603.11642 2026-05-22 cs.RO

Noise-Space Attribution and Control of Chunk-Boundary Artifact

Rui Wang

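AI summary: This paper studies the mechanism behind chunk-boundary artifact in generative visuomotor policies. Treating the artifact as a variable in noise space, it shows that the artifact can be modulated by controlling latent noise and that changes in the artifact can carry through to the final task outcome.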

Abstract

Action chunking is widely used in generative visuomotor policies, yet the recurring execution discontinuities at chunk boundaries still lack a mechanistic explanation. This paper treats chunk-boundary artifact as an analyzable mechanism variable. We first show that successful and failed episodes separate stably on artifact metrics. We then show that, in stochastic action-chunked policies, fixing the observation context and changing only latent noise is sufficient to modulate artifact systematically. On the same Diffusion Policy checkpoint, comparisons among DDPM, zero-variance DDPM, and DDIM further show that this local controllability depends on whether the information path from initial noise to action output remains intact. Finally, from controlled interventions at fixed local execution states, we find that artifact changes can carry through to final outcome, and that the preferred direction can reverse even within the same task: some contexts achieve higher success under lower artifact, whereas others achieve higher success under higher artifact. In a representative high-artifact-favoring key context selected by held-out matched-continuation validation, success rate increases from 0.033 to 0.717. These results show that chunk-boundary artifact is not a mere execution-side by-product, but a variable in noise space that can be attributed, controlled, and mechanistically linked to task outcome.

2603.03784 2026-05-22 cs.AI

Specification-Driven Generation and Evaluation of Discrete-Event World Models via the DEVS Formalism

Zheyu Chen, Huiteng Zhuang, Zhuohuan Li, Chuanhao Li

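AI summary: This paper proposes synthesizing discrete-event world models online from natural-language specifications, combining the reliability of explicit simulators with the adaptability of neural models. It adopts the DEVS formalism with a staged LLM-based generation pipeline that separates structural inference over component interactions from component-level event and timing logic, and it evaluates the generated models on benchmark suites that validate event traces against specification-derived constraints.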

Comments 36 pages, 6 figures

Abstract

World models are central to LLM agents that must evaluate actions over long horizons. Yet much existing work focuses on environments governed by physical dynamics or spatial structure, whereas many high-impact domains, including supply chains, procurement networks, and business processes, evolve through discrete events, timing constraints, and causal dependencies. These settings call for discrete-event world models. Existing approaches to constructing world models often fall near two extremes: hand-engineered simulators provide consistency and reproducibility, but are costly to build and adapt; neural models are flexible, but can suffer from compounding inconsistency over long-horizon rollouts. We seek a principled middle ground by synthesizing discrete-event world models online from natural-language specifications, retaining the reliability of explicit simulators while gaining the adaptability of neural models. We adopt the DEVS formalism and introduce a staged LLM-based generation pipeline that separates structural inference over component interactions from component-level event and timing logic. For evaluation, we develop benchmark suites in which simulators emit structured event traces, which are then validated against specification-derived temporal, causal, and semantic constraints. This enables reproducible verification and localized diagnostics. Together, these contributions produce world models that remain consistent over long-horizon rollouts, can be verified from observable behavior, and can be synthesized efficiently on demand during online execution.

2603.02604 2026-05-22 cs.LG

Heterogeneous Agent Collaborative Reinforcement Learning

Zhixia Zhang, Zixuan Huang, Gongxun Li, Huaiyang Wang, Chengyi Yuan, Xin Xia, Deqing Wang, Fuzhen Zhuang, Shuai Ma, Ning Ding, Yaodong Yang, Jianxin Li, Yikun Ban

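AI summary: This paper introduces HACRL, a new Reinforcement Learning from Verifiable Reward (RLVR) problem in which heterogeneous agents share verified rollouts during training to improve one another, addressing the inefficiency of isolated multi-agent on-policy optimization. It also proposes HACPO, an algorithm designed to maximize sample utilization and cross-agent knowledge transfer.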

Abstract

We introduce Heterogeneous Agent Collaborative Reinforcement Learning (HACRL), a new Reinforcement Learning from Verifiable Reward (RLVR) problem that addresses the inefficiencies of isolated multi-agent on-policy optimization. HACRL enables collaborative optimization with independent execution: heterogeneous agents share verified rollouts during training to mutually improve, while operating independently at inference time. Unlike LLM-based multi-agent reinforcement learning (MARL), HACRL does not require coordinated deployment, and unlike on-/off-policy distillation, it enables bidirectional mutual learning among heterogeneous agents rather than one-directional homogeneous teacher-to-student transfer. Building on this problem, we propose HACPO, a collaborative RL algorithm that enables principled rollout sharing to maximize sample utilization and cross-agent knowledge transfer. To mitigate capability discrepancies and policy distribution shifts, HACPO introduces four tailored mechanisms with theoretical guarantees on unbiased advantage estimation. Extensive experiments across diverse heterogeneous model combinations and reasoning benchmarks show that HACPO consistently improves all participating agents, outperforming GSPO with double rollouts by an average of 3.6% while using only half the rollout cost.

2602.23200 2026-05-22 cs.LG cs.CL

InnerQ: Hardware-Aware Tuning-Free Quantization of KV Cache for Large Language Models

Sayed Mohammadreza Tayaranian Hosseini, Amir Ardakani, Warren J. Gross

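AI summary: This paper presents InnerQ, a hardware-aware KV cache quantization method that reduces decode latency without compromising evaluation performance. Its group-wise quantization strategy increases data reuse, and on Llama and Mistral models it also improves few-shot evaluation scores relative to prior KV cache quantization methods.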

Comments 18 pages, 5 figures, 7 tables

Abstract

When transformer-based language models are deployed for text generation, most of the inference time is spent in the decoding stage, where output tokens are generated sequentially. Reducing the hardware cost of each decoding step is therefore critical for efficient long-context generation. A major bottleneck is the key-value (KV) cache, whose size grows with sequence length and often dominates the model's memory footprint. Prior work has proposed quantization methods to compress the KV cache while minimizing its loss of precision. We present InnerQ, a hardware-aware KV cache quantization scheme that reduces decode latency without compromising evaluation performance. InnerQ performs group-wise quantization by grouping cache matrices along their inner dimension. This grouping strategy aligns dequantization with vector-matrix multiplication and increases data reuse across GPU compute units. As a result, InnerQ reduces memory access and accelerates dequantization, achieving an average $1.3\times$ speedup over prior KV cache quantization methods and $2.7\times$ over the non-quantized baseline. To maintain fidelity under aggressive compression, InnerQ incorporates three techniques: (i) hybrid quantization, which chooses symmetric or asymmetric quantization for each group based on local statistics; (ii) high-precision windows for both recent tokens and attention sink tokens to mitigate outlier leakage; and (iii) per-channel normalization of the key cache, computed once during prefill and folded into the model parameters to eliminate runtime overhead. Beyond reducing latency, experiments on Llama and Mistral models show that InnerQ also improves few-shot evaluation scores relative to prior KV cache quantization methods.
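The group-wise and hybrid quantization ideas in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the 4-bit default, the rule for choosing symmetric versus asymmetric quantization (a simple test of whether the group's range is roughly zero-centered), and the use of the last axis of a row-major matrix as a stand-in for the inner dimension.

```python
from typing import List, Tuple

# (mode, scale, offset, integer codes); a hypothetical storage layout.
Group = Tuple[str, float, float, List[int]]


def quantize_group(values: List[float], bits: int = 4) -> Group:
    """Quantize one group, choosing the mode from its min/max (needs bits >= 2)."""
    lo, hi = min(values), max(values)
    # Assumed selection rule: symmetric when the range is roughly zero-centered.
    if lo < 0.0 < hi and abs(hi + lo) <= 0.25 * (hi - lo):
        half = 2 ** (bits - 1) - 1
        scale = max(-lo, hi) / half
        codes = [max(-half, min(half, round(v / scale))) for v in values]
        return ("sym", scale, 0.0, codes)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels or 1.0  # constant group: any nonzero scale works
    codes = [max(0, min(levels, round((v - lo) / scale))) for v in values]
    return ("asym", scale, lo, codes)


def dequantize_group(group: Group) -> List[float]:
    """Map integer codes back to floats using the group's scale and offset."""
    _mode, scale, offset, codes = group
    return [c * scale + offset for c in codes]


def quantize_rows(matrix: List[List[float]], group_size: int,
                  bits: int = 4) -> List[List[Group]]:
    """Cut every row into contiguous groups along its last axis and quantize each."""
    return [
        [quantize_group(row[i:i + group_size], bits)
         for i in range(0, len(row), group_size)]
        for row in matrix
    ]
```

Each group stores its own scale and offset, so the reconstruction error of a value is bounded by about half of that group's scale. The sketch does not model the GPU data-reuse argument, the high-precision windows, or the per-channel normalization that the abstract describes.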

2602.18600 2026-05-22 cs.LG

MapTab: Are MLLMs Ready for Multi-Criteria Route Planning in Heterogeneous Graphs?

Ziqiao Shang, Lingyue Ge, Zi-Jian Cheng, Shi-Yu Tian, Zhenyu Huang, Wenbo Fu, Weiming Wu, Yang Chen, Xiangwen Zhang, Yulan Hu, Bin Liu, Yu-Feng Li, Lan-Zhe Guo

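AI summary: This paper introduces MapTab, a benchmark for evaluating the holistic multi-criteria reasoning of multimodal large language models through route planning tasks, and finds that current models face substantial challenges in multi-criteria multimodal reasoning.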

Abstract

Systematic evaluation of Multimodal Large Language Models (MLLMs) is crucial for advancing Artificial General Intelligence (AGI). However, existing benchmarks remain insufficient for rigorously assessing their reasoning capabilities under multi-criteria constraints. To bridge this gap, we introduce MapTab, a multimodal benchmark specifically designed to evaluate holistic multi-criteria reasoning in MLLMs via route planning tasks. MapTab requires MLLMs to perceive and ground visual cues from map images alongside route attributes (e.g., Time, Price) from structured tabular data. The benchmark encompasses two scenarios: Metromap, covering metro networks in 160 cities across 52 countries, and Travelmap, depicting 168 representative tourist attractions from 19 countries. In total, MapTab comprises 328 images, 196,800 route planning queries, and 3,936 QA queries, all incorporating 4 key criteria: Time, Price, Comfort, and Reliability. Extensive evaluations across 15 representative MLLMs reveal that current models face substantial challenges in multi-criteria multimodal reasoning. Notably, under conditions of limited visual perception, multimodal collaboration often underperforms compared to unimodal approaches. We believe MapTab provides a challenging and realistic testbed to advance the systematic evaluation of MLLMs. Our code is available at https://github.com/Ziqiao-Shang/MapTab.

2602.17186 2026-05-22 cs.CV

Focusing Where Vision Matters: Selective Training for Large Vision Language Models via Visual Information Gain

Seulbi Lee, Sangheum Hwang

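AI summary: This paper introduces the Visual Information Gain (VIG) metric and uses it to guide selective training of large vision language models, prioritizing high-VIG samples and tokens to improve visual grounding and reduce language bias.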

Comments Accepted at ICML 2026

Abstract

Large Vision Language Models (LVLMs) have achieved remarkable progress, yet they often suffer from language bias, producing answers without relying on visual evidence. While prior work attempts to mitigate this issue through decoding strategies, architectural modifications, or curated instruction data, they typically lack a quantitative measure of how much individual training samples or tokens actually benefit from the image. In this work, we introduce Visual Information Gain (VIG), a perplexity-based metric that measures the reduction in prediction uncertainty provided by visual input. VIG enables fine-grained analysis at both sample and token levels, effectively highlighting visually grounded elements such as colors, spatial relations, and attributes. Leveraging this, we propose a VIG-guided selective training scheme that prioritizes high-VIG samples and tokens. This approach improves visual grounding and mitigates language bias, achieving superior performance with significantly reduced supervision by focusing exclusively on visually informative samples and tokens.
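The abstract describes VIG only as a perplexity-based measure of how much the image reduces prediction uncertainty, so the sketch below is one plausible formalization and not the paper's definition. It assumes token-level VIG is the gain in log-probability when the image is present, sample-level VIG is the log-ratio of the two perplexities, and selection keeps a fixed fraction of the highest-VIG tokens. The function names and the `keep_ratio` parameter are hypothetical.

```python
import math
from typing import List


def token_vig(logp_with_image: List[float], logp_text_only: List[float]) -> List[float]:
    """Per-token gain in log-probability from conditioning on the image."""
    return [w - t for w, t in zip(logp_with_image, logp_text_only)]


def perplexity(logps: List[float]) -> float:
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    return math.exp(-sum(logps) / len(logps))


def sample_vig(logp_with_image: List[float], logp_text_only: List[float]) -> float:
    """Log-ratio of text-only perplexity to image-conditioned perplexity."""
    return math.log(perplexity(logp_text_only) / perplexity(logp_with_image))


def select_tokens(vig: List[float], keep_ratio: float) -> List[bool]:
    """Loss mask that keeps the highest-VIG tokens (assumed selection rule)."""
    k = max(1, round(keep_ratio * len(vig)))
    ranked = sorted(range(len(vig)), key=lambda i: vig[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(len(vig))]
```

Under these assumptions the sample-level value equals the mean of the token-level values, which is why the same quantity can be used to rank both samples and tokens.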

2602.16169 2026-05-22 cs.LG cs.CL

Discrete Stochastic Localization for Non-autoregressive Generation

Yunshu Wu, Jiayi Cheng, Longxuan Yu, Partha Thakuria, Rob Brekelmans, Evangelos E. Papalexakis, Greg Ver Steeg

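AI summary: This paper introduces Discrete Stochastic Localization (DSL), a continuous-state framework with unit-sphere token embeddings whose Bayes-optimal denoiser is invariant to the nominal signal-to-noise ratio. Fine-tuning with DSL improves distributional faithfulness in discrete sequence generation, as demonstrated on OpenWebText.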

Abstract

Continuous diffusion is a natural framework for non-autoregressive generation but has generally lagged behind masked discrete diffusion models (MDMs) on discrete sequence generation. We argue that the bottleneck is not continuity itself, but a representation in which denoising depends on timestep-indexed noise regimes. We introduce Discrete Stochastic Localization (DSL), a continuous-state framework with unit-sphere token embeddings whose Bayes-optimal denoiser is invariant to the nominal signal-to-noise ratio (SNR) under the localization channel. One trained network then supports an entire family of per-token SNR paths, with endpoint masked-diffusion paths as a special case. Fine-tuning a pretrained MDLM checkpoint with DSL substantially improves distributional faithfulness (MAUVE) on OpenWebText across all step budgets from $T=128$ to $T=1024$, and the same checkpoint supports random-order autoregressive sampling, as well as a hybrid continuous-then-discrete sampler using as few as $T=48$ total steps -- without distillation or retraining.

2602.15338 2026-05-22 cs.LG cs.CL

Discovering Implicit Large Language Model Alignment Objectives

Edward Chen, Sanmi Koyejo, Carlos Guestrin

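AI summary: This paper proposes the Obj-Disco framework, which automatically decomposes alignment reward signals into interpretable objectives to address the shortcomings of existing interpretation methods. It validates the framework's robustness across tasks, model sizes, and alignment algorithms, and identifies latent misaligned incentives.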

Comments ICML 2026

Abstract

Large language model (LLM) alignment relies on complex reward signals that often obscure the specific behaviors being incentivized, creating critical risks of misalignment and reward hacking. Existing interpretation methods typically rely on pre-defined rubrics, risking the omission of "unknown unknowns", or fail to identify objectives that comprehensively cover and are causal to the model behavior. To address these limitations, we introduce Obj-Disco, a framework that automatically decomposes an alignment reward signal into a sparse, weighted combination of human-interpretable natural language objectives. Our approach utilizes an iterative greedy algorithm to analyze behavioral changes across training checkpoints, identifying and validating candidate objectives that best explain the residual reward signal. Extensive evaluations across diverse tasks, model sizes, and alignment algorithms demonstrate the framework's robustness. Experiments with popular open-source reward models show that the framework consistently captures > 90% of reward behavior, a finding further corroborated by human evaluation. Additionally, a case study on alignment with an open-source reward model reveals that Obj-Disco can successfully identify latent misaligned incentives that emerge alongside intended behaviors. Our work provides a crucial tool for uncovering the implicit objectives in LLM alignment, paving the way for more transparent and safer AI development.
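The phrase "iterative greedy algorithm ... that best explain the residual reward signal" can be illustrated with a generic matching-pursuit loop. This is a swapped-in textbook technique, not the Obj-Disco procedure. It assumes every candidate objective has already been turned into a numeric score per response, which the abstract does not specify, and the objective names in the usage example are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def dot(u: List[float], v: List[float]) -> float:
    return sum(p * q for p, q in zip(u, v))


def greedy_decompose(reward: List[float], candidates: Dict[str, List[float]],
                     max_terms: int = 3,
                     min_gain: float = 1e-9) -> Tuple[Dict[str, float], float]:
    """Greedily pick the candidate that most reduces the squared residual."""
    residual = list(reward)
    total = dot(reward, reward)
    weights: Dict[str, float] = {}
    for _ in range(max_terms):
        best_name: Optional[str] = None
        best_gain = min_gain
        for name, scores in candidates.items():
            norm = dot(scores, scores)
            if name in weights or norm == 0.0:
                continue
            # Squared-error reduction from a 1-D least-squares fit on the residual.
            gain = dot(residual, scores) ** 2 / norm
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # nothing left explains the residual
        scores = candidates[best_name]
        weight = dot(residual, scores) / dot(scores, scores)
        weights[best_name] = weight
        residual = [r - weight * s for r, s in zip(residual, scores)]
    explained = 1.0 - dot(residual, residual) / total if total else 0.0
    return weights, explained
```

A small usage example with made-up scores for four responses:

```python
candidates = {
    "helpfulness": [1.0, 0.0, 1.0, 0.0],
    "brevity": [0.0, 1.0, 0.0, -1.0],
    "politeness": [1.0, 1.0, 1.0, 1.0],
}
weights, explained = greedy_decompose([2.0, 0.5, 2.0, -0.5], candidates)
```

Plain matching pursuit does not refit earlier weights after adding a term, so with correlated candidates the weights can differ from a joint least-squares fit.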

2602.13294 2026-05-22 cs.CV cs.AI

VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction

Jiarong Liang, Max Ku, Ka-Hei Hui, Ping Nie, Wenhu Chen

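AI summary: This paper proposes VisPhyWorld, a framework that evaluates physical reasoning by requiring models to generate executable simulator code from visual observations, and introduces the VisPhyBench benchmark to assess how well models reconstruct appearance and reproduce physically plausible motion. It finds that state-of-the-art MLLMs struggle to infer physical parameters accurately and to simulate consistent physical dynamics.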

Abstract

Evaluating whether Multimodal Large Language Models (MLLMs) genuinely reason about physical dynamics remains challenging. Most existing benchmarks rely on recognition-style protocols such as Visual Question Answering (VQA) and Violation of Expectation (VoE), which can often be answered without committing to an explicit, testable physical hypothesis. We propose VisPhyWorld, an execution-based framework that evaluates physical reasoning by requiring models to generate executable simulator code from visual observations. By producing runnable code, the inferred world representation is directly inspectable, editable, and falsifiable. This separates physical reasoning from rendering. Building on this framework, we introduce VisPhyBench, comprising 209 evaluation scenes derived from 108 physical templates and a systematic protocol that evaluates how well models reconstruct appearance and reproduce physically plausible motion. Our pipeline produces valid reconstructed videos in 97.7% of benchmark runs before fallback. Experiments show that while state-of-the-art MLLMs achieve strong semantic scene understanding, they struggle to accurately infer physical parameters and to simulate consistent physical dynamics. Our code is available at https://github.com/TIGER-AI-Lab/VisPhyWorld

2602.11574 2026-05-22 cs.AI

Learning to Configure Agentic AI Systems

Aditya Taparia, Som Sagar, Ransalu Senanayake

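AI summary: This paper formulates agent configuration as a semi-Markov decision process (SMDP) and introduces ARC, a policy that dynamically selects query-specific agent configurations, improving reasoning accuracy, tool-use accuracy, and τ-Bench (Airline) Pass^1 success across several benchmarks.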

Comments 22 pages, 12 figures

Abstract

Configuring LLM-based agent systems involves choosing workflows, tools, token budgets, and prompts from a large combinatorial design space, and is typically handled today by fixed templates or hand-tuned heuristics that apply the same configuration regardless of query difficulty, leading to brittle behavior and wasted compute. To address this, we formulate agent configuration as a semi-Markov decision process (SMDP) where each configuration acts as a temporally extended option that determines how an agent system processes a query, and introduce ARC (Agentic Resource & Configuration learner), a lightweight hierarchical policy that dynamically selects query-specific agent configurations. Across reasoning, tool-use, and agentic benchmarks, ARC consistently improves over budget-matched tool-augmented LLMs, increasing average reasoning accuracy by 31.3%, tool-use accuracy by 13.95%, and doubling τ-Bench (Airline) Pass^1 success from 9.0% to 18.0%. These results demonstrate that learning per-query agent configurations is a powerful alternative to "one size fits all" designs.

FlashSinkhorn: IO-Aware Entropic Optimal Transport on GPU

Felix X. -F. Ye, Xingjie Li, An Yu, Ming-Ching Chang, Linsong Chu, Davis Wertheimer

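AI summary: This paper presents FlashSinkhorn, a GPU solver for entropic optimal transport. It rewrites stabilized log-domain Sinkhorn updates as row-wise LogSumExp reductions, the same normalization as transformer attention, which enables FlashAttention-style fusion and tiling and substantially reduces HBM IO while retaining linear-memory operations.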

Abstract

Entropic optimal transport (EOT) via Sinkhorn iterations is widely used in modern machine learning, yet GPU solvers remain inefficient at scale. Tensorized implementations suffer quadratic HBM traffic from dense $n\times m$ interactions, while existing online backends avoid storing dense matrices but still rely on generic tiled map-reduce reduction kernels with limited fusion. We present FlashSinkhorn, an IO-aware EOT solver for squared Euclidean cost that rewrites stabilized log-domain Sinkhorn updates as row-wise LogSumExp reductions of biased dot-product scores, the same normalization as transformer attention. This enables FlashAttention-style fusion and tiling: fused Triton kernels stream tiles through on-chip SRAM and update dual potentials in a single pass, substantially reducing HBM IO per iteration while retaining linear-memory operations. We further provide streaming kernels for transport application, enabling scalable first- and second-order optimization. On A100 GPUs, FlashSinkhorn achieves up to $32\times$ forward-pass and $161\times$ end-to-end speedups over state-of-the-art online baselines on point-cloud OT, and improves scalability on OT-based downstream tasks. For reproducibility, we release an open-source implementation at https://github.com/ot-triton-lab/flash-sinkhorn.
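The rewrite mentioned in the abstract can be checked on a toy problem. For squared Euclidean cost $C_{ij} = \|x_i\|^2 + \|y_j\|^2 - 2\,x_i \cdot y_j$, the standard log-domain update $f_i = -\varepsilon \log \sum_j b_j \exp((g_j - C_{ij})/\varepsilon)$ becomes $f_i = \|x_i\|^2 - \varepsilon\,\mathrm{LSE}_j\left(2\,x_i \cdot y_j/\varepsilon + \beta_j\right)$ with bias $\beta_j = (g_j - \|y_j\|^2)/\varepsilon + \log b_j$. The pure-Python sketch below implements only this algebra. It is not the paper's fused Triton kernel, and the function names, the choice of $\varepsilon$, and the iteration count are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def dot(u: Point, v: Point) -> float:
    return sum(p * q for p, q in zip(u, v))


def logsumexp(scores: List[float]) -> float:
    """Numerically stable log(sum(exp(s))) via max subtraction."""
    top = max(scores)
    return top + math.log(sum(math.exp(s - top) for s in scores))


def half_update(rows: Sequence[Point], cols: Sequence[Point],
                col_dual: List[float], col_log_w: List[float],
                eps: float) -> List[float]:
    """One dual update as a row-wise LogSumExp over biased dot-product scores."""
    bias = [(d - dot(c, c)) / eps + lw
            for c, d, lw in zip(cols, col_dual, col_log_w)]
    return [
        dot(r, r) - eps * logsumexp(
            [2.0 * dot(r, c) / eps + b for c, b in zip(cols, bias)])
        for r in rows
    ]


def sinkhorn(X: Sequence[Point], Y: Sequence[Point], a: List[float],
             b: List[float], eps: float = 1.0,
             iters: int = 100) -> Tuple[List[float], List[float]]:
    """Alternate the two half updates; a and b are positive marginal weights."""
    log_a = [math.log(w) for w in a]
    log_b = [math.log(w) for w in b]
    f, g = [0.0] * len(X), [0.0] * len(Y)
    for _ in range(iters):
        f = half_update(X, Y, g, log_b, eps)
        g = half_update(Y, X, f, log_a, eps)
    return f, g


def transport_plan(X: Sequence[Point], Y: Sequence[Point], a: List[float],
                   b: List[float], f: List[float], g: List[float],
                   eps: float = 1.0) -> List[List[float]]:
    """Dense plan P_ij = a_i b_j exp((f_i + g_j - C_ij) / eps), for checking only."""
    def cost(x: Point, y: Point) -> float:
        return dot(x, x) + dot(y, y) - 2.0 * dot(x, y)
    return [[a[i] * b[j] * math.exp((f[i] + g[j] - cost(X[i], Y[j])) / eps)
             for j in range(len(Y))] for i in range(len(X))]
```

Each row's scores are a dot product plus a per-column bias followed by a LogSumExp, which is the same shape of computation as an attention row. That is the structural similarity the abstract relies on. The dense `transport_plan` exists only to verify marginals on small inputs.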

2602.01935 2026-05-22 cs.LG cs.AI cs.PL

Soft Bayesian Context Tree Models for Real-Valued Time Series

Shota Saito, Yuta Nakahara, Toshiyasu Matsushima

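AI summary: This paper proposes the soft Bayesian context tree model (Soft-BCT), a new model for real-valued time series. Soft-BCT splits the context space probabilistically rather than deterministically, as conventional context tree models do. A learning algorithm based on variational inference is proposed, and experiments show that Soft-BCT outperforms conventional context tree models on some datasets.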

2601.10348 2026-05-22 cs.CL cs.AI cs.LG

Training-Trajectory-Aware Token Selection

Zhanming Shen, Jiaqi Hu, Zeyu Qin, Hao Chen, Wentao Ye, Zenan Huang, Yihong Zhuang, Guoshan Lu, Junlin Zhou, Junbo Zhao

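AI summary: This paper proposes T3S, which reconstructs the training objective at the token level to clear the optimization path for yet-to-learn tokens, improving continual distillation. Experiments show consistent gains in both AR and dLLM settings.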

Comments Accepted by ICML 2026

Abstract

Efficient distillation is a key pathway for converting expensive reasoning capability into deployable efficiency, yet in the frontier regime where the student already has strong reasoning ability, naive continual distillation often yields limited gains or even degradation. We observe a characteristic training phenomenon: even as loss decreases monotonically, all performance metrics can drop sharply at almost the same bottleneck, before gradually recovering. We further uncover a token-level mechanism: confidence bifurcates into steadily increasing Imitation-Anchor Tokens that quickly anchor optimization and other yet-to-learn tokens whose confidence is suppressed until after the bottleneck. The inability of these two types of tokens to coexist is the root cause of failure in continual distillation. To this end, we propose Training-Trajectory-Aware Token Selection (T3S) to reconstruct the training objective at the token level, clearing the optimization path for yet-to-learn tokens. T3S yields consistent gains in both AR and dLLM settings: with only hundreds of examples, Qwen3-8B surpasses DeepSeek-R1 on competitive reasoning benchmarks, Qwen3-32B approaches Qwen3-235B, and T3-trained LLaDA-2.0-Mini exceeds its AR baseline, achieving state-of-the-art performance among all 16B-scale no-think models.

2512.16739 2026-05-22 cs.AI

AI-Driven Prediction of Cancer Pain Episodes: A Hybrid Decision Support Approach

Yipeng Zhuang, Yifeng Guo, Yuewen Li, Yuheng Wu, Philip Leung-Ho Yu, Tingting Song, Zhiyong Wang, Kunzhong Zhou, Weifang Wang, Li Zhuang

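AI summary: This study proposes a hybrid machine learning and large language model approach that uses structured and unstructured electronic health record data to predict pain episodes in cancer patients within 48 and 72 hours of hospitalization. Combining temporal medication trends with LLM interpretation of ambiguous dosing records improves sensitivity and interpretability, with accuracies of 87.6% and 91.7%.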

DOI: 10.1109/JBHI.2026.3694585

Abstract

Lung cancer patients frequently experience breakthrough pain episodes, with up to 91% requiring timely intervention. To enable proactive pain management, we propose a hybrid machine learning and large language model pipeline that predicts pain episodes within 48 and 72 hours of hospitalization using both structured and unstructured electronic health record data. A retrospective cohort of 266 inpatients was analyzed, with features including demographics, tumor stage, vital signs, and WHO-tiered analgesic use. The machine learning module captured temporal medication trends, while the large language model interpreted ambiguous dosing records and free-text clinical notes. Integrating these modalities improved sensitivity and interpretability. Our framework achieved an accuracy of 0.876 (48h) and 0.917 (72h), with improvements in sensitivity of 10.6% and 10.7%, respectively, attributable to large language model augmentation. This hybrid approach offers a clinically interpretable and scalable tool for early pain episode forecasting, with potential to enhance treatment precision and optimize resource allocation in oncology care.

2512.12744 2026-05-22 cs.LG

Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity

Haotian Xu, Jiannan Yang, Tian Gao, Tsui-Wei Weng, Tengfei Ma

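AI summary: This paper proposes Spontaneous Neurons (SPON) to make activation sparsity in LLMs more robust, addressing the accuracy degradation seen at high sparsity. SPON is trained by distribution matching so that the model stays stable and keeps its generalization ability under sparse computation.

Comments ICML 2026

Details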

Abstract

Activation sparsity offers a compelling route to accelerate large language model (LLM) inference by selectively suppressing hidden activations, yet existing approaches exhibit severe accuracy degradation at high sparsity. We show that this failure stems from representational instability: *activation sparsity disrupts input-dependent activation learned during pretraining, inducing distribution shifts in hidden states.* We address this issue by reframing activation sparsity as a representational alignment problem and introducing **Spontaneous Neurons (SPON)**, a lightweight mechanism inspired by spontaneous neural activity in biological systems. SPON injects a small set of learnable, input-independent activation vectors that act as persistent representational anchors for sparse computation. These vectors are trained via distribution matching to the dense model and can be absorbed into bias terms after training, incurring negligible inference overhead. Across multiple LLM backbones, SPON consistently restores performance, stabilizes latent representations, and preserves generalization. Our results establish SPON as an effective and principled solution for reliable activation-sparse inference, and offer new insights into knowledge retention in LLMs.
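
The claim that input-independent vectors can be absorbed into bias terms follows from linearity. The sketch below is a minimal illustration, not the authors' implementation. It assumes, hypothetically, that a learned vector `s` is added to the sparsified hidden activation just before a linear projection with weights `W` and bias `b`; the abstract does not say where the vectors are injected. The top-k rule in `sparsify` and all parameter values are likewise illustrative assumptions.

```python
import random

random.seed(0)
d_hidden, d_out = 8, 4

# Hypothetical layer parameters; none of these values come from the paper.
W = [[random.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)]
b = [random.gauss(0, 1) for _ in range(d_out)]
s = [random.gauss(0, 1) for _ in range(d_hidden)]  # stands in for a learned, input-independent vector


def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sparsify(h, keep=2):
    """Keep the `keep` largest-magnitude entries of h and zero the rest (assumed top-k rule)."""
    top = set(sorted(range(len(h)), key=lambda i: abs(h[i]))[-keep:])
    return [h[i] if i in top else 0.0 for i in range(len(h))]


def forward_with_vector(h):
    """Add s to the sparse activation at inference time, then project."""
    x = [a + c for a, c in zip(sparsify(h), s)]
    return [y + bi for y, bi in zip(matvec(W, x), b)]


# Fold s into the bias once, after training: W(h + s) + b = W h + (b + W s).
b_folded = [bi + ws for bi, ws in zip(b, matvec(W, s))]


def forward_folded(h):
    """Same output as forward_with_vector, with no extra addition on the activation path."""
    return [y + bi for y, bi in zip(matvec(W, sparsify(h)), b_folded)]
```

Under these assumptions the folded layer costs exactly as much as the plain sparse layer at inference time, which is one way to read the "negligible inference overhead" statement.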

2512.11587 2026-05-22 cs.LG cs.NA math.NA math.OC

Gradient Descent as a Perceptron Algorithm: Understanding Dynamics and Implicit Acceleration

Alexander Tyurin

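AI summary: This paper studies the optimization dynamics and implicit acceleration of gradient descent in neural network training. Analyzing nonlinear models, it shows that gradient descent steps are equivalent to a generalized perceptron algorithm and that nonlinear models have an advantage in iteration complexity.

Details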

Abstract

Even for the gradient descent (GD) method applied to neural network training, understanding its optimization dynamics, including convergence rate, iterate trajectories, function value oscillations, and especially its implicit acceleration, remains a challenging problem. We analyze nonlinear models with the logistic loss and show that the steps of GD reduce to those of generalized perceptron algorithms (Rosenblatt, 1958), providing a new perspective on the dynamics. This reduction yields significantly simpler algorithmic steps, which we analyze using classical linear algebra tools. Using these tools, we demonstrate on a minimalistic example that the nonlinearity in a two-layer model can provably yield a faster iteration complexity $\tilde{O}(\sqrt{d})$ compared to $Ω(d)$ achieved by linear models, where $d$ is the number of features. This helps explain the optimization dynamics and the implicit acceleration phenomenon observed in neural networks. The theoretical results are supported by extensive numerical experiments. We believe that this alternative view will further advance research on the optimization of neural networks.
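
The link between GD on the logistic loss and the perceptron is easiest to see in the linear case. This is a textbook observation, not the paper's reduction for nonlinear models. The gradient of log(1 + exp(-y <w, x>)) with respect to w is -sigmoid(-y <w, x>) y x, so a GD step adds y x scaled by a sigmoid weight that is close to 1 for badly misclassified samples and close to 0 for confidently correct ones. The classical perceptron uses a hard 0/1 weight instead. The sketch below compares the two on made-up data; the function names and the batch form of the perceptron update are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def logistic_gd_step(w, data, lr):
    """One full-batch GD step on sum_i log(1 + exp(-y_i * <w, x_i>))."""
    grad = [0.0] * len(w)
    for x, y in data:
        coeff = -sigmoid(-y * dot(w, x)) * y  # per-sample gradient is coeff * x
        grad = [g + coeff * xj for g, xj in zip(grad, x)]
    return [wj - lr * g for wj, g in zip(w, grad)]


def perceptron_step(w, data, lr):
    """Batch perceptron step: add lr * y_i * x_i for every sample with margin <= 0."""
    new_w = list(w)
    for x, y in data:
        if y * dot(w, x) <= 0:
            new_w = [wj + lr * y * xj for wj, xj in zip(new_w, x)]
    return new_w


# Made-up data: with this w the margins are 50, -50 and -20, so the sigmoid
# weights are nearly 0, 1 and 1, and the two updates almost coincide.
data = [([1.0, 2.0], 1), ([2.0, -1.0], -1), ([-1.0, 1.0], 1)]
w0 = [30.0, 10.0]
gd_w = logistic_gd_step(w0, data, lr=0.1)
perceptron_w = perceptron_step(w0, data, lr=0.1)
```

When the margins are small in magnitude the sigmoid weights sit strictly between 0 and 1, and the two updates differ.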

2512.10719 2026-05-22 cs.CV

SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving

Peizheng Li, Zhenghao Zhang, David Holtz, Hang Yu, Yutong Yang, Yuzhi Lai, Rui Song, Andreas Geiger, Andreas Zell

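AI summary: This paper proposes SpaceDrive, a framework that encodes spatial information as explicit positional encodings to improve how VLM-based autonomous driving systems understand fine-grained 3D spatial relationships, improving planning accuracy and open-loop performance.

Details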

Abstract

End-to-end autonomous driving methods built on vision language models (VLMs) have undergone rapid development driven by their universal visual understanding and strong reasoning capabilities obtained from large-scale pretraining. However, we find that current VLMs struggle to understand fine-grained 3D spatial relationships, which is a fundamental requirement for systems interacting with the physical world. To address this issue, we propose SpaceDrive, a spatial-aware VLM-based driving framework that treats spatial information as explicit positional encodings (PEs) instead of textual digit tokens, enabling joint reasoning over semantic and spatial representations. SpaceDrive applies a universal positional encoder to all 3D coordinates derived from multi-view depth estimation, historical ego-states, and text prompts. These 3D PEs are first superimposed on the corresponding 2D visual tokens to augment them. Meanwhile, they serve as a task-agnostic coordinate representation, replacing the digit-wise numerical tokens as both inputs and outputs for the VLM. This mechanism enables the model to better index specific visual semantics in spatial reasoning and to directly regress trajectory coordinates rather than generating them digit by digit, thereby enhancing planning accuracy. Extensive experiments validate that SpaceDrive achieves state-of-the-art open-loop performance on the nuScenes dataset and the second-best Driving Score of 78.02 on the Bench2Drive closed-loop benchmark among existing VLM-based methods. Code is available at: https://github.com/zhenghao2519/SpaceDrive.
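
To make the idea of coordinates as positional encodings concrete, the sketch below maps a 3D point to a vector and adds it to a visual token of the same width. It is an illustration under stated assumptions, not SpaceDrive's encoder: the sinusoidal form, the frequency base, the per-axis feature split, and the names `encode_3d` and `add_position` are all hypothetical, since the abstract does not specify the encoder's design.

```python
import math


def encode_3d(point, dim=12, base=10000.0):
    """Map (x, y, z) to a `dim`-vector of sin/cos features (assumed sinusoidal design)."""
    if dim % 6 != 0:
        raise ValueError("dim must be a multiple of 6: sin/cos pairs for each of 3 axes")
    per_axis = dim // 3
    features = []
    for coord in point:
        for k in range(per_axis // 2):
            freq = base ** (-2.0 * k / per_axis)  # geometric frequency schedule
            features.append(math.sin(coord * freq))
            features.append(math.cos(coord * freq))
    return features


def add_position(token, point):
    """Superimpose the 3D encoding on a visual token of matching width."""
    pe = encode_3d(point, dim=len(token))
    return [t + p for t, p in zip(token, pe)]


# Hypothetical usage: a 12-wide visual token whose pixel back-projects to (2.0, -1.5, 10.0).
token = [0.1] * 12
augmented = add_position(token, (2.0, -1.5, 10.0))
```

Because a coordinate enters as one continuous vector instead of a string of digit tokens, nearby points receive nearby encodings, which fits the abstract's description of regressing trajectory coordinates directly.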

2512.02193 2026-05-22 cs.AI

From monoliths to modules: Decomposing transducers for efficient world modelling

Alexander Boyd, Franz Nowak, David Hyland, Manuel Baltieri, Fernando E. Rosas

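AI summary: This paper proposes a method for decomposing complex world models: within the transducer framework, a world model is split into multiple modules, which improves computational efficiency, supports distributed inference, and lays groundwork for AI safety and real-world applications.

Details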

Abstract

World models have been recently proposed as sandbox environments in which AI agents can be trained and evaluated before deployment. While realistic world models often have high computational demands, this can often be alleviated by exploiting the fact that real-world scenarios tend to involve subcomponents that interact in a modular manner. In this paper, we explore this idea by developing a framework for decomposing complex world models represented by transducers, a class of models generalising POMDPs. Whereas the composition of transducers is well understood, our results clarify how to invert this process by deriving sub-transducers operating on distinct input-output subspaces, enabling parallelizable and interpretable alternatives to monolithic world modelling that can support distributed inference. Overall, these results lay groundwork for bridging the computational efficiency required for real-world inference and the structural transparency demanded by AI safety.
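
A toy example may help fix the vocabulary of composition and decomposition. The sketch below uses deterministic Mealy machines as a deliberately simplified stand-in for the paper's transducers, which generalise POMDPs and are stochastic. It only shows the easy direction: a monolith built as a parallel product can be replaced by sub-machines that each read their own input coordinate. How to derive such sub-transducers from an arbitrary monolith is the paper's contribution and is not modelled here. The `Transducer` class and both sub-systems are hypothetical.

```python
from itertools import product


class Transducer:
    """Deterministic Mealy machine: step(state, inp) -> (next_state, out)."""

    def __init__(self, states, step, start):
        self.states, self.step, self.start = list(states), step, start

    def run(self, inputs):
        state, outputs = self.start, []
        for inp in inputs:
            state, out = self.step(state, inp)
            outputs.append(out)
        return outputs


# Two hypothetical sub-systems: a mod-3 counter and a parity tracker.
counter = Transducer(range(3), lambda s, a: ((s + a) % 3, (s + a) % 3), 0)
parity = Transducer(range(2), lambda s, a: (s ^ a, s ^ a), 0)


def compose(t1, t2):
    """Parallel composition: joint state, joint input, joint output."""
    def step(state, inp):
        n1, o1 = t1.step(state[0], inp[0])
        n2, o2 = t2.step(state[1], inp[1])
        return (n1, n2), (o1, o2)
    return Transducer(product(t1.states, t2.states), step, (t1.start, t2.start))


monolith = compose(counter, parity)  # 3 * 2 = 6 joint states


def run_decomposed(inputs):
    """Run each sub-transducer on its own input coordinate, then zip the outputs."""
    out1 = counter.run([a for a, _ in inputs])
    out2 = parity.run([b for _, b in inputs])
    return list(zip(out1, out2))
```

The joint state space grows as the product of the component sizes, while the decomposed form only ever touches their sum, and the two sub-runs are independent and could execute in parallel.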

2511.18159 2026-05-22 cs.LG

Bringing Stability to Diffusion: Decomposing and Reducing Variance of Training Masked Diffusion Models

Mengni Jia, Mengyu Zhou, Yihao Liu, Xiaoxi Jiang, Guanjun Jiang

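AI summary: This paper studies the instability caused by high training variance in masked diffusion models (MDMs). It decomposes the sources of variance and proposes six variance-reduction methods, which substantially improve accuracy on complex reasoning tasks and reduce run-to-run variability to near the level of autoregressive models (ARMs).

Details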

Abstract

Masked diffusion models (MDMs) are a promising alternative to autoregressive models (ARMs), but they suffer from inherently much higher training variance. High variance leads to noisier gradient estimates and unstable optimization, so even equally strong pretrained MDMs and ARMs that are competitive at initialization often diverge after task-specific training, with MDMs falling far behind. There has been no theoretical explanation or systematic solution. We derive the first decomposition of MDM training variance into three sources: (A) masking pattern noise, (B) masking rate noise, and (C) data noise, while ARMs are only affected by (C). This explains the fundamental training gap. Building on this foundation, we design six variance-reduction methods, including two core methods: (1) P-POTS, a Pareto-optimal t sampler that minimizes training variance by sampling harder t values more often with appropriately smaller update steps, and (2) MIRROR, which uses negatively correlated samples to reduce (A). Experiments show that compared to standard MDM training, our methods improve accuracy by 7-8% on complex reasoning tasks, while simultaneously reducing run-to-run variability to near ARM levels, substantially narrowing the gap with strong ARM baselines; in most settings, even the best baseline runs remain below the worst run of our method.
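
The idea of using negatively correlated samples to reduce masking-pattern noise can be illustrated with classical antithetic variates. The sketch below is an assumption-laden toy, not the MIRROR method itself, whose details the abstract does not give. It fixes a masking rate `t`, masks token i when a uniform draw `u_i` falls below `t`, and pairs each mask with the one obtained from `1 - u_i`. The per-token losses are made up, and the estimator `masked_loss_estimate` is a hypothetical stand-in for a training loss.

```python
import random


def masked_loss_estimate(token_losses, uniforms, t):
    """Unbiased estimate of sum(token_losses): sum losses of masked tokens (u_i < t), divide by t."""
    return sum(l for l, u in zip(token_losses, uniforms) if u < t) / t


def independent_pair(token_losses, t, rng):
    """Average two estimates that use independent masks."""
    u1 = [rng.random() for _ in token_losses]
    u2 = [rng.random() for _ in token_losses]
    return 0.5 * (masked_loss_estimate(token_losses, u1, t)
                  + masked_loss_estimate(token_losses, u2, t))


def mirrored_pair(token_losses, t, rng):
    """Average two estimates whose masks come from u and 1 - u (negatively correlated)."""
    u = [rng.random() for _ in token_losses]
    mirrored = [1.0 - x for x in u]
    return 0.5 * (masked_loss_estimate(token_losses, u, t)
                  + masked_loss_estimate(token_losses, mirrored, t))


def variance(xs):
    """Sample variance with the n - 1 denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


rng = random.Random(0)
losses = [1.0, 2.0, 0.5, 3.0, 1.5, 2.5, 0.7, 1.2]  # made-up per-token losses, not real data
t = 0.3
ind = [independent_pair(losses, t, rng) for _ in range(20000)]
mir = [mirrored_pair(losses, t, rng) for _ in range(20000)]
```

Both estimators target the same quantity, `sum(losses)`. In this toy setting a token is never masked in both mirrored copies when `t` is at most 0.5, so the two estimates tend to err in opposite directions and their average varies less than the average of two independent estimates.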
