arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.00016 2026-06-02 cs.CL cs.AI

AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection

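Aria Nourbakhsh, Adelaide Danilov, Christoph Schommer, Salima Lamsiyah

Affiliations: University of Luxembourg; Department of Computer Science

AI summary: Proposes the AEyeDE framework, which uses attention-based attribution matrices from a proxy Transformer model and a lightweight CNN to distinguish human-written from AI-generated text. It outperforms a text-only baseline in several settings and reveals local structural differences in attention maps.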

Details
Comments
24 pages, 2 figures
Abstract

Detecting AI-generated text is becoming increasingly challenging as modern language models approach human-level fluency and can evade detectors that rely on surface statistics or likelihood-based signals. We propose AEyeDE, an attribution-driven approach to human-AI authorship detection that leverages model attention as a discriminative signal. Specifically, we extract attention-based attribution matrices for both human- and AI-generated text using a proxy Transformer model with white-box access and train a lightweight Convolutional Neural Network to learn representations from these attribution maps. Across encoder-decoder translation settings, our method consistently outperforms a text-only baseline. In decoder-only settings, it performs strongly in generator-specific detection, remains competitive on standard benchmarks, and shows robustness under cross-dataset transfer and alternative-spelling perturbations. We further show that attention maps exhibit recurring local structures whose relative frequencies differ consistently between human- and AI-generated text across datasets and proxy models. These findings suggest that attention-based attribution maps provide a complementary and interpretable signal for AI-generated text detection. We will make the code publicly available to support future research.

2606.00014 2026-06-02 cs.CL cs.AI

Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval

Hao Xu, Rite Bo, Fausto Giunchiglia, Yingji Li, Rui Song

Affiliations: College of Computer Science and Technology, Jilin University, China; Department of Information Engineering and Computer Science, University of Trento, Italy

AI summary: Proposes the DOPA framework, which introduces an out-of-distribution proxy to approximate the inaccessible target domain and applies a Mahalanobis distance-based global diversity constraint, improving the robustness of large language models under distribution shift.

Details
Comments
Accepted by ACL 2026 main
Abstract

Although studies have demonstrated that Large Language Models (LLMs) can perform well on Out-of-Distribution (OOD) tasks, their advantage tends to diminish as the distribution shift becomes more severe. Consequently, researchers aim to retrieve distributionally similar and informative demonstrations from the available source domain to boost the inference capabilities of LLMs. However, in practical scenarios where the target domain is inaccessible, evaluating the unknown distribution is challenging, which indirectly impacts the quality of the selected demonstrations. To address this problem, we propose DOPA, a demonstration search framework that incorporates an OOD proxy to approximate the inaccessible target domain and guide the retrieval process. Building on proxy-based evaluation, DOPA further introduces a Mahalanobis distance-based global diversity constraint to ensure sufficient diversity among the retrieved demonstrations. Experimental results on multiple LLMs and tasks demonstrate that DOPA effectively enhances robustness in OOD settings (code: https://github.com/bort64/ood_code).

2606.00009 2026-06-02 cs.AI

Optimal Transport-based Permutation-Invariant Bayesian Optimization of Offshore Wind Farm Layouts

Antonio Candelieri, Laurens Bliek

Affiliations: Department of Economics Management and Statistics; Department of Industrial Engineering and Innovation Sciences; Eindhoven AI Systems Institute

AI summary: For offshore wind farm layout optimization, proposes PIBO, a permutation-invariant Bayesian optimization method based on optimal transport theory that improves layouts over vanilla BO while roughly halving computation time.

Details
Abstract

Bayesian Optimization (BO) is widely and successfully adopted for solving optimization problems with an expensive-to-evaluate, black-box, and non-convex objective function. However, the vanilla BO algorithm cannot exploit symmetries that may characterize the target problem. An intuitive case is given by optimal location problems, whose decision variables refer to a finite set of points within a continuous space, where the order of the points does not affect the value of the objective function. We refer to this setting as optimization over layouts, to distinguish it from optimization over point clouds, where the order of points does matter. As an instance of optimization over layouts, we consider a real-life, industrially relevant application, namely the optimization of the layout of an offshore wind farm: given identical wind turbines, switching any pair of them has no effect on the annual energy production. Based on Optimal Transport theory, we propose a Permutation-Invariant BO approach, PIBO, which is shown to provide better wind farm layouts than the vanilla BO approach while cutting computation time roughly in half.
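
The permutation-invariance idea can be illustrated with a minimal sketch, which is not the authors' PIBO implementation. It assumes two layouts with the same number of turbines and uniform weights, in which case an optimal-transport distance reduces to the best one-to-one matching between points. The helper name `layout_distance` is hypothetical, and the brute-force search over permutations is only practical for a handful of points.

```python
from itertools import permutations
from math import dist, sqrt


def layout_distance(a, b):
    """2-Wasserstein distance between two equal-size layouts with uniform weights.

    Brute force over all one-to-one matchings; illustrative only (cost grows as n!).
    """
    if len(a) != len(b):
        raise ValueError("layouts must contain the same number of points")
    n = len(a)
    # Smallest total squared distance over all ways of pairing points of a with b.
    best = min(
        sum(dist(a[i], b[j]) ** 2 for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return sqrt(best / n)


layout = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
shuffled = [layout[2], layout[0], layout[1]]  # same turbines, different order
moved = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]  # one turbine moved by 1 unit

print(layout_distance(layout, shuffled))  # 0.0: ordering does not matter
print(layout_distance(layout, moved))     # about 0.577, i.e. sqrt(1/3)
```

A distance of this kind assigns zero to any reordering of the same layout, which is the property a standard kernel on concatenated coordinates lacks.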

2606.00008 2026-06-02 cs.AI

Agents on a Tree: Pathwise Coordination for Multi-Objective Molecular Optimization

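Jia Zhang, Tengfei Ma, Tianle Li, Daojian Zeng, Xieping Gao, Xiangxiang Zeng

Affiliations: arXiv.org cs.AI

AI summary: Proposes ATOM, a multi-agent framework that formulates molecular optimization as a tree-structured search and uses pathwise coordination to handle multi-objective trade-offs, outperforming baselines on activity, synthesizability, and ADMET-related properties.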

Details
Comments
17 pages, 6 figures
Abstract

Multi-objective molecular optimization requires searching vast chemical spaces under conflicting objectives, where early design decisions strongly constrain downstream outcomes. Existing methods typically rely on a single policy or fixed scalarization, which limits their ability to represent diverse trade-offs and to explore multiple promising design trajectories. We propose ATOM, a multi-agent framework that formulates molecular optimization as a tree-structured search. Each node corresponds to an atomic operation and hosts an agent specialized for a particular objective or decision context. Agents coordinate along different paths of the tree rather than enforcing a global consensus, enabling the method to maintain and compare alternative molecular evolution trajectories. A global memory of past optimization behaviors further supports balanced exploration and exploitation across objectives. This tree-structured interaction enables reasoning over long-horizon dependencies inherent in molecular design. Experiments on challenging multi-objective benchmarks involving activity, synthesizability, and ADMET-related properties show that ATOM consistently achieves improved Pareto coverage and hypervolume over strong baselines. These results demonstrate the effectiveness of pathwise multi-agent coordination for molecular optimization. Code is available at https://anonymous.4open.science/r/ATOM-41CE.
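
For readers unfamiliar with the hypervolume metric mentioned above, the following is a minimal two-objective sketch. It assumes both objectives are maximized and that every point dominates the reference point; the function names and sample values are illustrative and do not come from the ATOM paper.

```python
def pareto_front(points):
    """Keep points not dominated by any other point (both objectives maximized)."""
    front = []
    for p in points:
        dominated = any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in points)
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by the Pareto front and bounded below by the reference point."""
    area, prev_x = 0.0, ref[0]
    # Sweep by increasing first objective; the second decreases along the front,
    # so each strip between prev_x and x has height y - ref[1].
    for x, y in pareto_front(points):
        area += (x - prev_x) * (y - ref[1])
        prev_x = x
    return area


candidates = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
print(pareto_front(candidates))    # [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(candidates))  # 6.0
```

A larger hypervolume means the front covers more of the objective space relative to the reference point.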

2606.00007 2026-06-02 cs.AI

Deliberative Curation: A Protocol for Multi-Agent Knowledge Bases

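Steven Johnson

Affiliations: Steven Johnson

AI summary: Addresses collective curation of multi-agent knowledge bases with a deliberative curation protocol that combines a knowledge artifact lifecycle, reputation-weighted deliberative voting, and graduated sanctions. Agent-based simulation shows it is more resilient than majority vote under adversity.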

Details
Comments
29 pages, 1 figure, 6 tables. Open-source implementation available at https://github.com/StevenJohnson998/AIngram
Abstract

As AI agents transition from isolated tools to collaborative participants in shared knowledge ecosystems, governing collective knowledge curation becomes a critical challenge. Human platform governance mechanisms do not transfer directly: agent statelessness undermines deterrence-based sanctions, model homogeneity violates independence assumptions underlying crowd wisdom, and sycophancy collapses deliberative consensus. We propose a deliberative curation protocol combining three governance layers: (1) a knowledge artifact lifecycle formalized as a labeled transition system; (2) reputation-weighted deliberative voting integrating Beta Reputation with EigenTrust amplification; and (3) graduated sanctions adapted for stateless agents, including broken agent handling distinguishing malfunction from adversarial behavior. We evaluate the protocol through agent-based simulation with 100 agents across seven behavioral archetypes under two adversity scenarios (30 seeds, paired t-tests). The protocol trades modest precision under benign conditions for substantially better resilience under adversity: 0.826 vs 0.791 for majority vote under moderate adversity (p<0.001), widening to 0.807 vs 0.740 under stress (p<0.001). The protocol degrades roughly three times more slowly than majority vote. Ablation analysis identifies commit-reveal vote concealment as the most impactful single component (8.2-8.6pp precision improvement, p<0.001), outperforming reputation weighting and deliberation combined. Graduated sanctions were not exercised in simulation and remain empirically unvalidated.
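
As a rough illustration of reputation-weighted voting, the sketch below weights each vote by the expected value of a Beta distribution over an agent's past positive and negative outcomes. This is an assumption-laden simplification: it omits EigenTrust amplification, deliberation, and commit-reveal concealment, and the function names and sample data are hypothetical rather than taken from the AIngram implementation.

```python
def beta_reputation(positive, negative):
    """Expected value of a Beta(positive + 1, negative + 1) distribution."""
    return (positive + 1) / (positive + negative + 2)


def weighted_vote(votes, history):
    """Decide acceptance by reputation-weighted share of accept votes.

    votes: agent -> bool (True = accept); must be non-empty.
    history: agent -> (positive, negative) outcome counts; unknown agents get (0, 0).
    """
    weights = {agent: beta_reputation(*history.get(agent, (0, 0))) for agent in votes}
    total = sum(weights.values())
    accept = sum(w for agent, w in weights.items() if votes[agent])
    return accept / total > 0.5


history = {"a": (18, 2), "b": (1, 9), "c": (0, 10)}
votes = {"a": True, "b": False, "c": False}

print(sum(votes.values()) > len(votes) / 2)  # False: plain majority rejects
print(weighted_vote(votes, history))         # True: the reliable agent outweighs the others
```

The example shows why weighting matters under adversity: two low-reputation agents cannot outvote one agent with a strong track record.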

2606.00002 2026-06-02 cs.AI

Position Paper: Post-Solve Robustness in Decision Engines: Feasible Regions and Smoothness Under Perturbations

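Yi-Xiang Hu

Affiliations: Yi-Xiang Hu

AI summary: Addresses mixed-integer linear programming decision engines whose solutions become infeasible or shift abruptly under small post-deployment perturbations. Proposes a post-solve robustness layer that audits a solved incumbent and returns solver-backed evidence of how far it can be trusted, formalizes two core concepts (the ε-near-optimal feasible neighborhood and solution smoothness), and draws together sensitivity analysis, robust optimization, neighborhood search, adversarial testing, and learning-based enhancements into a unified agenda.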

Details
Abstract

Mixed-Integer Linear Programming (MILP) decision engines routinely output nominally optimal plans for high-stakes industrial systems. Yet deployment rarely matches solve-time assumptions: small perturbations in costs, demands, or resource availability can invalidate feasibility or trigger discontinuous shifts to qualitatively different solutions. We argue that this post-solve robustness gap is a missing layer in today's optimization pipelines and a missing evaluation dimension for learning-enabled decision systems. Rather than replacing robust optimization or stochastic programming, the proposed layer audits a solved incumbent and returns solver-backed evidence about how far that solution can be trusted. We formalize two central objects: (i) an $ε$-near-optimal feasible neighborhood in parameter space, capturing when an incumbent remains feasible and near-optimal under perturbations, and (ii) solution smoothness in decision space, capturing whether nearby alternatives with small combinatorial edits remain competitive. We then synthesize the most relevant partial answers from sensitivity and stability analysis, robust optimization, neighborhood search, adversarial testing, and learning-based enhancements, and articulate an agenda for a unified post-solve robustness layer. Concretely, we call for certified inner approximations around the incumbent, probabilistic robustness estimation with calibrated uncertainty, adversarial robustness margins, and learning-based prediction and explanation aligned with solver-backed verification. We conclude with a compact reporting template and evaluation protocol that would make robustness a first-class output of decision engines.

2605.04127 2026-06-02 cs.LG cs.CL cs.CY

Position: the Stochastic Parrot in the Coal Mine. Model Collapse is a Threat to Low-Resource Communities

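Devon Jarvis, Richard Klein, Benjamin Rosman, Steven James, Stefano Sarao Mannelli

Affiliations: GitHub

AI summary: Examines how model collapse (the performance degradation that arises when generative models are trained on the outputs of prior models) disproportionately affects low-resource and marginalized communities by reducing training efficiency and skewing data distributions, and issues a call to action.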

Details
Comments
14 pages, 1 figure, 1 table, International Conference on Machine Learning
Abstract

Model collapse, the degradation in performance that arises when generative models are trained on the outputs of prior models, is an increasing concern as artificially generated content proliferates. Related critiques of large language models have highlighted their tendency to reproduce frequent patterns in training data, their reliance on vast datasets, and their substantial environmental cost. Together, these factors contribute to data degradation, the reinforcement of cultural biases, and inefficient resource use. In this position paper we aim to combine these views and argue that model collapse threatens current efforts to democratize AI. By reducing training efficiency and skewing data distributions away from the tails of their support, model collapse disproportionately impacts low-resource and marginalized communities. We examine both the environmental and cultural implications of this phenomenon, situate our position within recent position papers on model collapse, and conclude with a call to action. Finally, we outline initial directions for mitigating these effects.

2605.03052 2026-06-02 cs.CL

How Language Models Process Negation

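Zhejian Zhou, Tianyi Zhou, Robin Jia, Jonathan May

Affiliations: University of California, Berkeley

AI summary: Studies the internal mechanisms by which large language models process negation. Models contain components that process negation correctly, but late-layer attention causes errors, and ablating it improves accuracy. Models use both a suppression mechanism and a constructive mechanism, with the constructive one more prominent.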

Details
Comments
ICML 2026
Abstract

We study how Large Language Models (LLMs) process negation mechanistically. First, we establish that even though open-weight models often provide wrong answers to questions involving negation, they do possess internal components that process negation correctly. Their poor accuracy is due to late-layer attention behavior that promotes simple shortcuts; ablating those attention modules greatly improves accuracy on negation-related questions. Second, we uncover how models process negation. We consider two hypotheses: models could use attention heads that attend to the phrase being negated and suppress related concepts, or they could directly construct a representation of the entire negative phrase (e.g., representing "not gas" as a vector that promotes liquids and solids). We apply a range of observational and causal interpretability techniques on Mistral-7B and Llama-3.1-8B to show that models implement both mechanisms, with the "constructive" mechanism being more prominent. Combined, our work deepens the understanding of LLMs' internals, highlighting construction-dominant computations and the coexistence of competing mechanisms within LLMs.

2605.31597 2026-06-02 cs.CV

SOCO: Benchmarking Semantic Object Correspondence in Vision Foundation Models

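Olaf Dünkel, Basavaraj Sunagad, Haoran Wang, David T. Hoffmann, Christian Theobalt, Adam Kortylewski

Affiliations: Max Planck Institute for Informatics; Saarland Informatics Campus; CISPA Helmholtz Center for Information Security; University of Freiburg

AI summary: Proposes the SOCO benchmark, which introduces a taxonomy of correspondence types and functionally meaningful keypoint annotations across 100 categories and over 1M correspondence pairs, to systematically evaluate semantic correspondence in vision foundation models. It exposes weak cross-category transfer and a gap between language-grounded localization and visual correspondence.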

Details
Comments
Project page: https://genintel.github.io/SOCO/
Abstract

Measuring structured object understanding in vision foundation models remains challenging due to inconsistent evaluation protocols and limited part-level supervision. Semantic correspondence (SC) evaluates this capability by testing whether object parts can be matched across instances and categories under large variations in appearance, viewpoint, and geometry. To enable a systematic SC evaluation, we introduce SOCO, a new benchmark for Semantic Object Correspondence that introduces a taxonomy of correspondence types and provides consistent, functionally meaningful keypoint annotations across 100 categories and over 1M correspondence pairs. In addition, SOCO includes keypoint language descriptions, enabling the evaluation of large vision-language models (LVLMs) and their fine-grained part-level understanding. Comprehensive experiments reveal that (i) vision foundation backbones encode strong semantic structure but transfer correspondences poorly across related categories and only partially capture object-part position, (ii) LVLMs are stronger at text-prompted part localization than at visual-reference cross-image matching, exposing a gap between language-grounded localization and fine-grained visual correspondence, and (iii) correspondence performance predicts performance on dense downstream tasks, including segmentation, tracking, 3D pose estimation, and 3D detection, more strongly than ImageNet classification. Together, these findings position SOCO as a benchmark for structured, part-level representation quality in vision and multimodal foundation models.

2605.31490 2026-06-02 cs.CL

Are Full Rollouts Necessary for On-Policy Distillation?

Yaocheng Zhang, Jiajun Chai, Yuqian Fu, Songjun Tu, Xiaohan Wang, Wei Lin, Guojun Yin, Qichao Zhang, Yuanheng Zhu, Dongbin Zhao

Affiliations: Institute of Automation, Chinese Academy of Sciences; School of Advanced Interdisciplinary Sciences, University of Chinese Academy of Sciences; Meituan; School of Artificial Intelligence, University of Chinese Academy of Sciences

AI summary: Targets two problems with full rollouts in on-policy distillation (OPD): high computational cost and unreliable teacher feedback early in training. Proposes two rollout-horizon control strategies, Progressive OPD (POPD) and Truncated OPD (TOPD), which improve training efficiency by up to 3× on mathematical reasoning tasks and substantially reduce memory use.

Details
Comments
15 pages, 14 figures
Abstract

On-policy distillation (OPD) provides dense teacher feedback along student-generated rollouts rather than fixed teacher traces and has emerged as a promising post-training paradigm. However, standard OPD typically generates full rollouts during training, which is computationally expensive and may expose the student to unreliable teacher feedback at late rollout positions, especially during early training. We identify the rollout horizon as a key bottleneck in OPD that substantially impacts training efficiency. Unlike Reinforcement Learning with Verifiable Rewards (RLVR), OPD does not require a final answer reward to provide learning signals. Therefore, full rollouts may not always be necessary for OPD. Motivated by this insight, we propose two simple horizon-control strategies: Progressive OPD (POPD), which gradually expands the rollout horizon during training, and Truncated OPD (TOPD), which permanently performs distillation on reliable truncated rollouts. Experiments on mathematical reasoning show that POPD improves the training efficiency of OPD by up to 3×, while TOPD matches OPD performance using only 10% of the rollout horizon, leading to substantial wall-clock and memory reductions. These results demonstrate that controlling the rollout horizon offers a simple and practical path to more efficient OPD.
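
The two horizon-control strategies can be pictured as different caps on the number of tokens generated per rollout. The sketch below is only illustrative: the abstract does not specify the growth schedule, so the linear ramp, the function names, and the sample lengths are all assumptions.

```python
def progressive_horizon(step, total_steps, min_len, max_len):
    """POPD-style cap: grow the rollout length from min_len to max_len.

    A linear ramp is assumed here; the actual schedule may differ.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return round(min_len + frac * (max_len - min_len))


def truncated_horizon(max_len, fraction=0.1):
    """TOPD-style cap: a fixed fraction of the full horizon for the whole run."""
    return max(1, int(max_len * fraction))


for step in (0, 50, 100):
    print(step, progressive_horizon(step, 100, 256, 4096))  # 256, 2176, 4096
print(truncated_horizon(4096))  # 409
```

In a training loop, the returned value would serve as the maximum number of new tokens when sampling student rollouts, with the teacher scoring only those positions.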

2605.31487 2026-06-02 cs.CV

Enhancing Computer Vision Model Generalization in Warehouse Facilities: A Case Study on Anomaly Detection in Vertical Material Handling Systems

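Ruiliang Liu, Tina Dongxu Li, Joshua Migdal, Ken Meszaros, Trevor Dardik

Affiliations: Amazon, USA

AI summary: Combines optimal camera placement, an image triggering strategy, model selection, and model ensembling in a laboratory setting so that an anomaly detection model for vertical material handling systems generalizes from the lab to diverse warehouse environments, simplifying deployment and saving annotation and retraining effort.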

Details
Comments
6 pages, 10 figures. Accepted at IEEE International Conference on Mechatronics and Automation (ICMA) 2026
Abstract

Deploying computer vision models in warehouse facilities traditionally requires extensive resources for camera mounting, image collection, annotation, training, and deployment, a process that often must be repeated in each new environment because of camera mounting constraints and environmental variability. This paper explores an approach that streamlines this process by conducting the standard procedure solely in a laboratory setting, focusing on vertical material handling systems and anomaly detection in their forks. Through extensive experimentation, we find that combining optimal camera placement, strategic image triggering, careful model selection, and model ensembling enables effective generalization from laboratory conditions to diverse warehouse environments. This could transform warehouse automation implementation by reducing deployment to just camera mounting, image collection, and model deployment, saving the significant resources and time typically spent on image annotation and model retraining. This is an experimental research study and not a production deployment.

2605.31437 2026-06-02 cs.CV

Astra: a generalizable report generation foundation model for 3D computed tomography

Zhuhao Wang, Fang Chen, Chaohui Yu, Zihan Li, Yuchao Zheng, Jing Wang, Xuan Yang, Jia Guo, Zhenlu Yang, Xingju Zheng, Yihua Sun, Haojie Han, Xiaoxiao Qin, Zhan Feng, Wenbo Xiao, Chao Zhu, Yuehua Li, Shipeng Zhang, Hao Luo, Yunsong Peng, Fan Wang, Hongen Liao

Affiliations: School of Biomedical Engineering, Tsinghua University; School of Biomedical Engineering, Shanghai Jiao Tong University; DAMO Academy, Alibaba Group; Hupan Laboratory; Department of Biomedical Engineering, National University of Singapore; Department of Radiology, Guizhou Provincial People’s Hospital; Department of Radiology, The First Affiliated Hospital, Zhejiang University School of Medicine; Department of Radiology, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine; College of Computer Science and Technology, Zhejiang University

AI summary: Proposes Astra, which uses report-style harmonization and reinforcement learning to generate accurate CT reports across eight organ systems, improving fine-grained diagnostic metrics by 44.1% on average and accelerating clinical workflows.

Details
Abstract

CT interpretation requires radiologists to review hundreds of volumetric slices per examination, making reporting time-consuming and highly expertise-dependent. Automated CT report generation offers a promising route to improving clinical efficiency, yet the field still lacks a generalizable CT report generation foundation model that supports multi-region reporting and remains robust across external real-world cohorts. Intrinsic inconsistencies in reporting style and diagnostic terminology across cohorts make naive joint training prone to noisy textual supervision, thereby limiting model generalizability. Here we present Astra, a generalizable CT report generation foundation model trained on 90,678 thoracoabdominal CT-report pairs (CTRgDB) with 353,671 abnormalities spanning eight organ systems. By harmonizing report style and further refining diagnostic consistency via reinforcement learning, Astra achieves style-consistent and diagnostically accurate report generation across diverse anatomical regions and institutions. Evaluated on CTRgDB and six external cohorts, Astra achieves state-of-the-art performance with a 44.1% average improvement in fine-grained diagnostic metrics (P<0.001). In real-world clinical workflows, Astra assistance accelerates chest report drafting by 29.6% and improves abdominal report completeness by 11.3% (P<0.001). Furthermore, Astra demonstrates broad utility as a foundation for CT AI development, improving downstream diagnostic performance and scaling vision-language pretraining through high-quality report synthesis. Overall, Astra serves as a broadly accessible clinical assistant and pivotal infrastructure for the next generation of AI-powered healthcare.

2605.31401 2026-06-02 cs.CL

"Înţelegi Româneşte?" A Recipe for Romanian Vision-Language Models

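Mihai Masala, Marius Leordeanu, Mihai Dascalu, Traian Rebedea

Affiliations: National University of Science and Technology POLITEHNICA Bucharest

AI summary: Systematically studies the full pipeline for building a Romanian vision-language model (VLM): translating English corpora, running ablations on the vision backbone, language backbone, and OCR data, and curating HoraVQA, a culturally native evaluation set. Romanian-adapted models outperform same-sized and even larger models.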

Details
Abstract

Vision-Language Models (VLMs) largely follow the text-only LLM trajectory, excelling on English benchmarks but sharply degrading on low-resource languages, where neither large-scale image-text corpora nor culturally grounded evaluations exist. We present a systematic study of building a language-specific VLM for Romanian, covering the full pipeline from data construction to architectural choices. We translate established English VLM training and evaluation corpora into Romanian, applying machine translation to textual annotations and to in-image text, preserving visual grounding while adapting the textual content. Using this data, we train and ablate a series of VLMs to isolate the contribution of (i) vision backbones of varying scale and pretraining, (ii) language backbones from multilingual to Romanian-adapted LLMs, and (iii) OCR-style image-text data. We further curate HoraVQA, a culturally native evaluation set grounded in Romanian everyday scenes. Romanian-adapted VLMs consistently outperform their same-sized counterparts and, across all evaluated benchmarks, even surpass models from the next larger size category.

2605.31200 2026-06-02 cs.LG stat.ML

Beyond Additive Decompositions: Interpretability Through Separability

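Jinyang Liu, Munir Eberhardt Hiabu

Affiliations: University of California, Berkeley

AI summary: Proposes Tensor Separation Learning (TSL), a regression model that learns a sum of rank-1 products of univariate per-feature functions via a stagewise greedy procedure with orthogonal refitting. It avoids the signal cancellation and extrapolation problems of additive decompositions, yields visualizations faithful to the fitted components, and comes with approximation-rate guarantees.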

Details
Comments
To appear in Proceedings of the 43rd International Conference on Machine Learning (ICML 2026)
Abstract

Interpretable machine learning requires models that are accurate and structurally faithful to the data. Existing explainability methods rely heavily on additive representations (e.g., Generalized Additive Models (GAMs), SHapley Additive exPlanations (SHAP), functional ANOVA), which can suffer from signal cancellation and off-support extrapolation in the presence of strong interactions. We propose Tensor Separation Learning (TSL), a regression model that learns a sum of rank-1 products of univariate per-feature functions via a stagewise greedy procedure with orthogonal refitting. By enforcing separability, TSL avoids the information loss inherent in additive projections caused by marginalizing higher-order interactions. The learned TSL model can be fully reconstructed from first-order partial dependence functions, up to constant factors. This stage-wise correspondence ensures that the resulting visualizations are faithful to the fitted components. We establish approximation-rate guarantees for functions with bounded mixed $p$-th order partial derivatives and demonstrate that TSL competes with black-box models on regression benchmarks.
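
The model form described above, a sum of rank-1 products of univariate per-feature functions, can be made concrete with a small evaluation sketch. It covers only prediction with hand-written component functions; the stagewise greedy fitting with orthogonal refitting is not shown, and `separable_predict` is a hypothetical name rather than part of any TSL code release.

```python
from math import prod


def separable_predict(terms, x):
    """Evaluate sum_k prod_j g_kj(x[j]) for a list of rank-1 terms.

    Each term is a list with one univariate function per feature.
    """
    return sum(prod(g(xj) for g, xj in zip(term, x)) for term in terms)


# One rank-1 term represents the pure interaction f(x) = x0 * x1 exactly.
interaction = [[lambda a: a, lambda b: b]]
# A second term with a constant factor on x1 adds a main effect: 2 * x0.
with_main_effect = interaction + [[lambda a: 2 * a, lambda b: 1.0]]

print(separable_predict(interaction, (3.0, 4.0)))       # 12.0
print(separable_predict(with_main_effect, (3.0, 4.0)))  # 18.0
```

A purely additive model g0(x0) + g1(x1) cannot represent the first example, which is the kind of interaction the abstract says additive projections lose.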

2605.31162 2026-06-02 cs.CV cs.LG

Guidance for Low-Level Perceptual Editing in Unconditional Diffusion Models

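Shreyansh Modi, Akshat Tomar, Aarush Aggarwal

Affiliations: Indian Institute of Technology Roorkee

AI summary: Unconditional diffusion models struggle with the global low-level transformations needed for aesthetic and perceptual enhancement. Proposes a training-free inference-time mechanism that extracts degradation concept vectors and combines bottleneck patching with classifier-free guidance to edit images and improve their quality.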

Details
Comments
11 pages, 12 figures, Generative Models for Computer Vision Workshop CVPR 2026
Abstract

Unconditional diffusion models offer powerful generative priors, yet steering them toward aesthetically enhanced outputs remains largely unexplored. We show that h-space patching, the dominant paradigm for training-free diffusion editing, systematically fails for global, low-level transformations required for aesthetic and perceptual refinement. We introduce a novel, generalized framework for image-editing in unconditional diffusion models without explicit training. This inference-time mechanism operates on low-level features by extracting degradation concept vectors and combining bottleneck patching with classifier-free guidance to guide sampling away from the degraded manifold, producing consistently improved images without any model retraining.

2605.31086 2026-06-02 cs.CL cs.IR

Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory

超越静态对话:对现实、异构和演化长期记忆的基准测试

Han Zhang, Zihao Tang, Xin Yu, Xiao Liu, Yeyun Gong, Haizhen Huang, Yan Lu, Weiwei Deng, Feng Sun, Qi Zhang, Hanfang Yang

发表机构 * Center for Applied Statistics, Renmin University of China(中国人民大学应用统计中心) School of Statistics, Renmin University of China(中国人民大学统计学院) Microsoft(微软公司)

AI总结 针对现有大语言模型记忆基准中对话缺乏长期语义一致性、人物静态化以及忽视异构数据流的问题,提出RHELM基准,通过用户画像和LOOP模块生成具有动态时间演化和长期连贯性的现实对话,并整合异构外部源,评估模型在多源聚合和现实上下文推理方面的能力。

详情
AI中文摘要

在现有的大语言模型(LLM)记忆基准中,评估的对话会话通常缺乏长期语义一致性,且底层人物角色趋于扁平化和静态化。此外,在现实场景中,用户与助手之间的交互涉及更多样化、异构的数据流,例如文档和电子邮件。这些缺陷严重限制了当前评估的现实性和有效性。为了解决这些限制,我们引入了RHELM(现实、异构和演化的长期记忆)。通过精心设计的用户画像和新型LOOP(规划-展开-演化-剪枝)模块,我们在多样化的交互场景中构建了展现动态时间演化和长期连贯性的现实对话。关键的是,这些对话与用户时间事件轨迹同步的异构外部源深度融合。由此产生的基准涵盖了跨越七种查询类型的具有挑战性的问答对,每个问题映射到至少27个关键记忆特征之一,这些特征在我们看来是当前研究中必要但尚未充分探索的。在全文模型、检索增强生成(RAG)方法和代表性记忆框架上的全面实验表明,当代方法在复杂的现实环境中仍然暴露出关键弱点,特别是在解决多源聚合和现实上下文推理方面。

英文摘要

In existing memory benchmarks for Large Language Models (LLMs), the evaluated dialogue sessions often lack long-term semantic consistency, and the underlying personas tend to be flat and static. Furthermore, in real-world scenarios, interactions between users and assistants involve more diverse, heterogeneous data streams, such as documents and emails. These shortcomings significantly limit the realism and effectiveness of current evaluations. To address these limitations, we introduce RHELM (Realistic, Heterogeneous, and Evolving Long-term Memory). Driven by meticulously crafted user profiles and a novel LOOP (pLan-rOllout-evOlve-Prune) module, we construct realistic dialogues across diverse interaction scenarios that exhibit dynamic temporal evolution and long-term coherence. Crucially, these dialogues are deeply integrated with heterogeneous external sources synchronized with the user's temporal event trajectory. The resulting benchmark encompasses challenging question-answer pairs spanning seven inquiry types, with each question mapping to at least one of 27 critical memory characteristics that we identify as essential yet underexplored in current research. Comprehensive experiments across full-context models, retrieval-augmented generation (RAG) methods, and representative memory frameworks reveal that contemporary approaches still expose critical weaknesses in complex, real-world settings, particularly in resolving multi-source aggregation and real-world contextual reasoning.

2605.30877 2026-06-02 cs.RO

Wall-OSS-0.5 Technical Report

Wall-OSS-0.5 技术报告

Ryan Yu, Pushi Zhang, Starrick Liu, Brae Liu, Miracle Kang, Shalfun Li, Lights Shi, Ellie Ma, Ping Yang, Chris Pan, Jerry Chen, Dongxiu Liu, Rain Sun, Miles Guo, Byron Zhang, Hugo Zhou, Zach Xu, Vincent Chen, Harrison Huang, James Wang, Dance Kuzi, Andy Zhai, Hang Su, Roy Gan, Lucy Liang, Hao Wang, Qian Wang

发表机构 * arXiv

AI总结 本文提出Wall-OSS-0.5,一个基于3B VLM骨干网络并增强动作生成组件的4B开源VLA模型,通过梯度桥接联合训练策略,在超过20个实体上预训练,实现零样本真实机器人行为,并在微调后超越π_0.5,证明VLA预训练本身即可产生可执行的机器人能力。

详情
AI中文摘要

大规模视觉-语言-动作(VLA)预训练正日益成为机器人策略的基础,然而预训练VLA的证据几乎总是在任务特定微调后报告。这留下了一个基本问题未解答:VLA预训练本身是否产生可执行的机器人行为,还是仅仅为下游策略学习提供更好的初始化?我们提出Wall-OSS-0.5,一个基于3B VLM骨干网络并增强动作生成组件的开源4B VLA,设计使得预训练的机器人能力可直接在物理硬件上测量。该模型在超过20个实体上进行预训练,每轮处理超过一百万个机器人轨迹以及一个多模态语料库。我们采用梯度桥接联合训练方案,其中三个目标扮演不同且互补的角色:离散动作预测将强大的VLM原生梯度注入骨干网络,多模态预测保持基于视觉-语言的理解,连续流匹配作为部署时的动作接口。在任务特定微调之前,预训练检查点实现了非平凡的零样本真实机器人行为,在17个任务套件中完成了包括一个保留的变形操作任务在内的多个任务,并取得了高任务进度。微调后,同一检查点作为更强的适应先验,在15个真实机器人任务上达到60.5%的平均任务进度,比π_0.5高出17.5%。多模态评估进一步证实动作训练不会侵蚀基于视觉-语言的能力:模型在保持广泛视觉-语言能力的同时增强了具身基础。总之,这些结果将VLA预训练从初始化策略重新定位为可直接测试且已经有用的机器人能力来源。

英文摘要

Large-scale Vision-Language-Action (VLA) pretraining is increasingly adopted as the foundation for robot policies, yet the evidence for pretrained VLAs is almost invariably reported after task-specific fine-tuning. This leaves a foundational question unanswered: does VLA pretraining itself yield executable robot behavior, or does it merely furnish a better initialization for downstream policy learning? We present Wall-OSS-0.5, an open-source 4B VLA built upon a 3B VLM backbone augmented with action-generation components, designed so that pretrained robotic capability is directly measurable on physical hardware. The model is pretrained across more than 20 embodiments, processing over one million robot trajectories per epoch alongside a grounded multimodal corpus. We adopt a gradient-bridged co-training recipe in which three objectives play distinct and complementary roles: discrete action prediction routes strong VLM-native gradients into the backbone, multimodal prediction preserves grounded vision-language understanding, and continuous flow matching serves as the deployment-time action interface. Before task-specific fine-tuning, the pretrained checkpoint achieves non-trivial zero-shot real-robot behavior, completing several tasks, including a held-out deformable manipulation task, at high task progress on a 17-task suite. After fine-tuning, the same checkpoint serves as a stronger adaptation prior, reaching 60.5% average task progress on 15 real-robot tasks and outperforming π_0.5 by 17.5%. Multimodal evaluations further confirm that action training does not erode grounded vision-language competence: the model preserves broad vision-language ability while strengthening embodied grounding. Together, these results reposition VLA pretraining from an initialization strategy to a directly testable, already useful source of robot capability.

2605.30855 2026-06-02 cs.CV

Robust Dreamer: Deviation-Aware Latent Gaussian Memory for Action-Controlled AR Video Generation

Robust Dreamer: 用于动作控制AR视频生成的偏差感知潜在高斯记忆

Hanlin Chen, Jiaxin Wei, Xibin Song, Yifu Wang, Steve Wang, Hongdong Li, Pan Ji, Gim Hee Lee

发表机构 * School of Computing, National University of Singapore(新加坡国立大学计算机学院) Technische Universität München(慕尼黑工业大学) Vertex Lab(Vertex实验室) Australian National University(澳大利亚国立大学)

AI总结 提出Robust Dreamer框架,通过潜在高斯记忆和动态偏差存档解决自回归视频生成中的漂移问题,实现长程3D一致性。

详情
AI中文摘要

逐帧动作控制的图像到视频生成是交互式世界模拟的一种有前景的范式,其中每个控制信号应引发即时的视觉响应。然而,在长自回归展开中保持视觉保真度和3D一致性仍然具有挑战性。现有的3D感知方法常常因两个障碍而遭受灾难性漂移:来自\textit{潜在--RGB循环}的信息丢失,其中生成的潜在被反复解码为RGB并重新编码用于未来条件;以及由\textit{无误差假设}引起的训练--推理差距,其中干净的训练记忆无法匹配预测损坏的推理记忆。为了解决这些挑战,我们提出了\textbf{Robust Dreamer},这是一个围绕如何设计3D记忆以及如何稳健使用它而构建的记忆增强框架。首先,我们引入了\textbf{潜在高斯记忆},它将从生成过程中继承的扩散潜在锚定到高斯基元,并通过潜在空间高斯泼溅召回它们。这提供了密集、几何感知、视图对齐的条件,同时避免了重复VAE转换导致的累积退化。其次,我们提出了\textbf{带有动态偏差存档的偏差学习},它通过一步近似合成展开引起的潜在偏差,按自回归阶段和去噪时间戳存储,并在训练期间将其注入历史记忆。这使得生成器暴露于现实的损坏记忆状态,并在推理前学习内部修正。在ScanNet、DL3DV和OmniWorldGame上的实验证明了最先进的长程性能。

英文摘要

Frame-wise action-controlled image-to-video generation is a promising paradigm for interactive world simulation, where each control signal should elicit an immediate visual response. However, maintaining visual fidelity and 3D consistency over long autoregressive rollouts remains challenging. Existing 3D-aware methods often suffer from catastrophic drift due to two impediments: information loss from \textit{Latent--RGB Cycling}, where generated latents are repeatedly decoded to RGB and re-encoded for future conditioning, and the training--inference gap induced by the \textit{error-free hypothesis}, where clean training memory fails to match prediction-corrupted inference memory. To address these challenges, we present \textbf{Robust Dreamer}, a memory-augmented framework built around how to design 3D memory and how to use it robustly. First, we introduce \textbf{Latent Gaussian Memory}, which anchors diffusion latents inherited from the generation process to Gaussian primitives and recalls them via latent-space Gaussian splatting. This provides dense, geometry-aware, view-aligned conditioning while avoiding accumulated degradation from repeated VAE conversion. Second, we propose \textbf{Deviation Learning with Dynamic Deviation Archive}, which synthesizes rollout-induced latent deviations through a one-step approximation, stores them by autoregressive stage and denoising timestamp, and injects them into historical memory during training. This exposes the generator to realistic corrupted memory states and teaches internal correction before inference. Experiments on ScanNet, DL3DV, and OmniWorldGame demonstrate state-of-the-art long-horizon performance.

2605.30748 2026-06-02 cs.SD cs.AI eess.AS

Chatterbox-Flash: Prior-Calibrated Block Diffusion for Streaming Zero-Shot TTS

Chatterbox-Flash: 用于流式零样本TTS的先验校准块扩散

Deokjin Seo, Gangin Park, Kihyun Nam

发表机构 * Resemble AI Seoul National University(首尔国立大学) KAIST(韩国科学技术院)

AI总结 提出Chatterbox-Flash,通过将预训练自回归TTS解码器微调为块扩散解码器,实现块内并行生成与块间流式推理,并引入先验校准评分和早期解码调度解决长尾分布导致的生成质量下降问题。

详情
Comments
8 pages, 4 figures, 9 tables
AI中文摘要

我们提出Chatterbox-Flash,一种零样本文本转语音模型,通过将预训练的自回归TTS解码器微调为块扩散解码器获得,支持每个块内的并行令牌生成,同时保持逐块流式传输。我们发现,将主流的块扩散解码直接迁移到离散语音令牌会降低质量,因为长尾令牌分布使并行位置选择偏向少数高频令牌。为在不修改架构的情况下缓解这一问题,我们引入了两种推理时技术:先验校准评分(减去块级边际令牌分布)和早期解码调度(基于校准置信度自适应终止迭代)。在标准零样本TTS基准测试中,Chatterbox-Flash实现了与强自回归和非自回归基线相当的高保真合成,同时支持流式推理,首包时间与流式AR系统相当,且实时因子显著降低。代码和音频样本可在 https://github.com/resemble-ai/chatterbox-flash 获取。

英文摘要

We present Chatterbox-Flash, a zero-shot text-to-speech model obtained by fine-tuning a pretrained autoregressive TTS decoder into a block-diffusion decoder, enabling parallel token generation within each block while retaining block-by-block streaming. We find that naively transferring mainstream block-diffusion decoding to discrete speech tokens degrades quality, as a long-tail token distribution biases parallel position selection toward a few high-frequency tokens. To mitigate this without architectural modification, we introduce two inference-time techniques: prior-calibrated scoring, which subtracts the block-level marginal token distribution, and an early-decoding schedule, which adaptively terminates iteration based on calibrated confidence. On standard zero-shot TTS benchmarks, Chatterbox-Flash attains high-fidelity synthesis comparable to strong autoregressive and non-autoregressive baselines, while supporting streaming inference with time-to-first-packet on par with streaming AR systems and substantially lower real-time factor. Code and audio samples are available at https://github.com/resemble-ai/chatterbox-flash.

2605.30581 2026-06-02 cs.CV cs.AI cs.RO

Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

工业视觉模拟到现实中的先验可用性:CAD引导与CAD不可用机制的综述

Chenxi Tao, Seung-Kyum Choi

发表机构 * George W. Woodruff School of Mechanical Engineering(乔治·W·伍德鲁夫机械工程学院) Georgia Institute of Technology(佐治亚理工学院)

AI总结 本文通过先验可用性视角重新组织工业视觉模拟到现实问题,区分CAD可用、CAD不可用和边界先验三种机制,并基于T-LESS/BOP、MVTec AD和VisA数据集进行实证分析,揭示了源分布设计、检测器容量和真实校准的重要性,以及CAD在测试时提供的独特验证通道。

详情
Comments
Review article; 103 references; 9 main figures; empirical anchors on T-LESS/BOP, MVTec AD, and VisA
AI中文摘要

工业视觉模拟到现实通常被描述为从合成图像到真实图像的迁移,但工业部署通常涉及可用证据与所需决策之间更广泛的错配。系统可能基于CAD渲染、模拟RGB-D观测、正常参考图像、合成缺陷、预训练特征空间或语言提示构建,却在不同的传感器、光照、材料、夹具、校准、生产变化和罕见缺陷模式下部署。本综述将工业视觉模拟到现实重新定义为由先验可用性组织的域差距问题。我们区分了CAD可用设置(其中显式物体几何可支持渲染、校准、姿态估计、分割和测试时几何验证)、CAD不可用设置(其中几何被正常参考外观、特征分布、师生残差、合成异常假设、基础特征或视觉语言先验取代)以及边界先验设置(其中近似模型、模板、参考视图或语义对应仅保留CAD的部分作用)。这一框架将基于CAD的检测和6D姿态估计文献与通常单独综述的工业异常和表面检测文献联系起来。为使分类具体化,我们使用T-LESS/BOP、MVTec AD和VisA上的实证锚点。这些锚点表明,仅靠CAD渲染数量并不能弥合迁移;源分布设计、检测器容量和小规模真实校准可能更为重要。它们还表明,测试时的CAD通过掩码、姿态和深度一致性创建了独特的验证通道,而CAD不可用的检测则依赖于校准的正常性和特征偏差。因此,本综述反对单一跨任务排行榜,而是询问什么先验支撑了部署决策。

英文摘要

Industrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually involves a broader mismatch between available evidence and required decisions. A system may be built from CAD renderings, simulated RGB-D observations, normal reference images, synthetic defects, pretrained feature spaces, or language prompts, yet deployed under different sensors, lighting, materials, fixtures, calibration, production variation, and rare defect modes. This review reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability. We distinguish CAD-available settings, where explicit object geometry can support rendering, calibration, pose estimation, segmentation, and test-time geometric verification; CAD-unavailable settings, where geometry is replaced by normal-reference appearance, feature distributions, teacher-student residuals, synthetic anomaly assumptions, foundation features, or vision-language priors; and boundary-prior settings, where approximate models, templates, reference views, or semantic correspondences preserve only part of the CAD role. This framing connects CAD-based detection and 6D pose-estimation literature with industrial anomaly and surface-inspection literature that is usually reviewed separately. To make the taxonomy concrete, we use empirical anchors on T-LESS/BOP, MVTec AD, and VisA. The anchors show that CAD render count alone does not close transfer; source-distribution design, detector capacity, and small real calibration can matter more. They also show that CAD at test time creates a distinct verification channel through mask, pose, and depth consistency, whereas CAD-unavailable inspection relies on calibrated normality and feature deviation. The review therefore argues against a single cross-task leaderboard and instead asks what prior grounds the deployment decision.

2605.30443 2026-06-02 cs.CL

Cross-Lingual Steering for Figurative Language Generation

跨语言引导的比喻语言生成

Linfeng Liu, Tiffany Zhan, Louie Hong Yao, Saptarshi Ghosh, Tianyu Jiang

发表机构 * Department of Computer Science, University of Cincinnati(辛辛那提大学计算机科学系) School of Computer Science, Carnegie Mellon University(卡内基梅隆大学计算机科学学院) Independent Researcher(独立研究者)

AI总结 通过激活引导技术,研究多语言大模型中比喻语言生成的内部信号是否跨语言可复用,发现跨语言方向可有效转移并增强目标行为。

详情
Comments
40 pages, 7 figures
AI中文摘要

多语言大模型能够生成比喻语言,但驱动这一行为的内部信号是语言特有的还是跨语言可复用的尚不清楚。使用激活引导作为探针,我们从一种语言的比喻-字面激活差异中估计比喻类别的方向,并在生成过程中应用它。在五个比喻类别、六种语言和四个多语言LLM中,这些方向在其自身语言内可靠地引导,对于隐喻和明喻最为稳健。更重要的是,它们可以跨语言转移:从一种语言学习到的方向在应用于另一种语言时增加了目标行为,其中德语是最易接受的目标之一。进一步地,从其他语言组合而成的方向可以匹配甚至超越目标语言自身的原生方向,而移除这一共享成分则会削弱原生引导。综合来看,这些结果提供了直接证据,表明存在一种可复用但依赖于目标的跨语言信号用于比喻生成。

英文摘要

Multilingual large language models can generate figurative language, but whether the internal signals driving this behavior are language-specific or reusable across languages is unclear. Using activation steering as a probe, we estimate a direction for a figurative category from figurative--literal activation differences in one language and apply it during generation. Across five figurative categories, six languages, and four multilingual LLMs, these directions steer reliably within their own language, most robustly for metaphor and simile. More importantly, they transfer across languages: a direction learned in one increases the target behavior when applied to another, with German among the most receptive targets. Going further, directions assembled from other languages can match or even surpass a target language's own native direction, while removing this shared component weakens native steering. Together, these results provide direct evidence of a reusable but target-dependent cross-lingual signal for figurative generation.
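The direction-estimation step described above can be sketched as follows. This is a minimal illustration, assuming the direction is the normalized difference of mean activations (a common choice for activation steering, not confirmed by the abstract) and that steering adds a scaled copy of it to a hidden state. The names `steering_direction`, `steer`, and `alpha` are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(figurative_acts, literal_acts):
    """Unit vector pointing from the literal mean to the figurative mean.

    Assumption: a difference-of-means estimate; the paper's exact
    estimator may differ.
    """
    fig_mean = mean_vector(figurative_acts)
    lit_mean = mean_vector(literal_acts)
    diff = [f - l for f, l in zip(fig_mean, lit_mean)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("figurative and literal means coincide")
    return [x / norm for x in diff]


def steer(hidden, direction, alpha):
    """Shift one hidden-state vector along the direction by strength alpha."""
    return [h + alpha * d for h, d in zip(hidden, direction)]
```

In a cross-lingual test of the kind the abstract reports, the direction would be estimated from activations in one language and passed to `steer` while generating in another.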

2605.30380 2026-06-02 cs.CV

Lightweight SAR Ship Detection via Contrastive Distillation

基于对比蒸馏的轻量级SAR舰船检测

Surendar Devasundaram, Banafsheh Saber Latibari, Abhijit Mahalanobis

发表机构 * University of Arizona Department of Electrical and Computer Engineering(亚利桑那大学电气与计算机工程系)

AI总结 提出结构化统一关系知识蒸馏框架SURGE,通过对比InfoNCE目标在共享嵌入空间中转移关系几何,实现轻量级SAR舰船检测,在SSDD和HRSID上提升6.2 mAP和8.0 AP75。

详情
Comments
Accepted in GLSVLSI'26 special session 74: Efficiency In Computer Vision: From Image Generation to Decision
AI中文摘要

深度卷积和基于Transformer的检测器在SAR舰船检测中表现出色,但通常计算成本高昂,难以用于实时或机载部署。轻量级模型提高了效率,但难以捕捉SAR后向散射中固有的复杂结构关系。大多数现有的SAR知识蒸馏方法依赖于特征或logit匹配,强制局部激活相似性,而忽略了对象表示之间的几何关系。我们提出了一种用于SAR舰船检测的结构化统一关系知识蒸馏框架(SURGE),该框架通过对比InfoNCE目标在共享投影嵌入空间中从强大的教师检测器向紧凑的学生检测器转移关系几何。据我们所知,这项工作提出了SAR领域中首个基于Transformer的SAR舰船检测器知识蒸馏框架。该框架与架构无关,为两阶段、一阶段和基于Transformer的检测器提供了通用的区域级蒸馏接口,无需修改其部署架构。在SSDD和HRSID基准上的实验表明,所提出的方法为两阶段检测器带来了显著改进,相比基线学生模型实现了高达6.2 mAP和8.0 AP75的提升,甚至超越了教师性能。

英文摘要

Deep convolutional and transformer-based detectors achieve strong performance for SAR ship detection but are often computationally prohibitive for real-time or onboard deployment. Lightweight models offer improved efficiency yet struggle to capture the complex structural relationships inherent in SAR backscatter. Most existing SAR knowledge-distillation approaches rely on feature or logit matching, which enforces localized activation similarity while neglecting the geometric relationships among object representations. We propose a Structured Unified Relational knowledGE distillation framework for SAR Ship detection (SURGE) that transfers relational geometry from a powerful teacher detector to a compact student detector using a contrastive InfoNCE objective in a shared projection embedding space. To the best of our knowledge, this work presents the first transformer-based SAR ship detector knowledge distillation framework in the SAR domain. The framework is architecture-agnostic in the sense that it provides a common region-level distillation interface for two-stage, one-stage and transformer-based detectors without modifying their deployed architectures. Experiments on the SSDD and HRSID benchmarks demonstrate that the proposed method yields substantial improvements for two-stage detectors, achieving up to 6.2 mAP and 8.0 AP75 gains over the baseline student and even surpassing teacher performance.
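For readers unfamiliar with the InfoNCE objective mentioned above, here is a minimal sketch. It assumes that each student region embedding is paired with the teacher embedding of the same region as its positive, that the other teacher embeddings act as negatives, and that similarity is cosine similarity scaled by a temperature. These pairing choices and the name `info_nce` are illustrative assumptions, not details taken from SURGE.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(student, teacher, temperature=0.1):
    """Mean InfoNCE loss over paired student/teacher embeddings.

    Row i of `student` is pulled toward row i of `teacher` and pushed
    away from every other teacher row.
    """
    s = [_normalize(v) for v in student]
    t = [_normalize(v) for v in teacher]
    total = 0.0
    for i, s_i in enumerate(s):
        # Temperature-scaled cosine similarity to every teacher embedding.
        logits = [sum(a * b for a, b in zip(s_i, t_j)) / temperature for t_j in t]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(s)
```

The loss is small when each student embedding is closest to its own teacher embedding and grows when the pairing is scrambled, which is what makes it sensitive to relations between regions and not only to per-feature agreement.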

2605.28335 2026-06-02 cs.LG

Dimensionality Reduction for Robust Federated Learning: A Theoretical Analysis and Convergence Guarantee

鲁棒联邦学习的降维方法:理论分析与收敛性保证

Shiyuan Zuo, Jiashuo Li, Rongfei Fan, Han Hu, Jie Xu

发表机构 * Beihang University(北京航空航天大学) Xi'an Jiaotong University(西安交通大学) City University of Hong Kong(香港城市大学)

AI总结 针对联邦学习在拜占庭攻击下高维梯度聚合计算开销大的问题,提出基于稀疏随机投影的投影降维框架,将复杂度降至最优O(Mp),并证明其达到非凸函数O(1/√T)和强凸函数O(1/T)的最优收敛率。

详情
AI中文摘要

联邦学习使多个客户端能够在不共享原始数据的情况下协作训练模型,但它极易受到拜占庭攻击。现有的鲁棒方法可以中和这些威胁,但在高维梯度聚合过程中会产生大量计算开销,这种开销随模型大小扩展性差,并且随着现代模型变得越来越大,在训练成本中占据主导地位。为了解决这一计算瓶颈,我们提出了投影降维(PDR),一种用于基于向量级距离的鲁棒聚合器的通用加速框架,它通过稀疏随机投影将梯度压缩到一个大幅缩小的子空间中以高效计算可靠性权重,从而执行鲁棒聚合。这种方法将服务器计算复杂度降低到最优的O(Mp),其中M是客户端数量,p是模型维度,匹配了仅读取梯度所需的理论下界。我们在先前拜占庭鲁棒联邦学习分析的标准FL假设下建立了收敛性保证。通过利用子空间嵌入定理,我们证明PDR对于非凸函数实现了O(1/√T)的最优收敛率,对于强凸函数实现了O(1/T)的最优收敛率,其中T表示迭代次数。关键的是,我们从数学上证明,这种巨大的加速几乎是免费的,仅仅将固有的拜占庭误差界放大了有界且可调的因子(1+ε)/(1-ε)。在基准数据集上的实验结果证实,将PDR与现有聚合器集成可以在时间效率上实现数量级的加速,同时保持高度有竞争力的收敛性能。

英文摘要

Federated Learning (FL) enables multiple clients to collaboratively train models without sharing raw data, but it is highly vulnerable to Byzantine attacks. Existing robust approaches can neutralize these threats but incur substantial computational overhead during high-dimensional gradient aggregation, an overhead that scales poorly with model size and increasingly dominates the training cost as modern models grow larger. To address this computational bottleneck, we propose Projected Dimensionality Reduction (PDR), a universal acceleration framework for vector-level distance-based robust aggregators, which performs robust aggregation by compressing gradients into a drastically smaller subspace via sparse random projection to efficiently compute reliability weights. This approach reduces the server computational complexity to an optimal $ \mathcal{O}(Mp) $, where $ M $ is the number of clients and $ p $ is the model dimension, matching the theoretical lower bound required merely to read the gradients. We establish convergence guarantees under standard FL assumptions in prior Byzantine-robust FL analyses. By leveraging the Subspace Embedding Theorem, we show that PDR achieves optimal convergence rates of $ \mathcal{O}(1/\sqrt{T}) $ for non-convex functions and $ \mathcal{O}(1/T) $ for strongly convex functions, where $ T $ denotes the number of iterations. Crucially, we mathematically demonstrate that this massive acceleration comes almost for free, merely inflating the inherent Byzantine error floor by a bounded, tunable factor of $ \frac{1+\epsilon}{1-\epsilon} $. Experimental results on benchmark datasets confirm that integrating PDR with existing aggregators yields orders of magnitude speedups in time efficiency while maintaining highly competitive convergence performance.
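The projection step can be illustrated with a small sketch. It assumes an Achlioptas-style sparse random projection (entries are zero with probability 1 - 1/s and ±1 otherwise, rescaled so squared norms are preserved in expectation); the abstract does not say which sparse construction PDR uses. The helper names and the sparsity `s=3` are hypothetical, and only the distance-preservation idea is shown, not the aggregation or reliability weighting.

```python
import math
import random


def make_sparse_projection(p, k, s=3, seed=0):
    """Build k sparse rows over p input coordinates.

    Each row is stored as a list of (index, sign) pairs for its
    non-zero entries, so projecting touches about p / s entries per row.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(k):
        row = []
        for j in range(p):
            u = rng.random()
            if u < 1.0 / (2 * s):
                row.append((j, 1.0))
            elif u < 1.0 / s:
                row.append((j, -1.0))
        rows.append(row)
    return rows


def project(rows, vec, s=3):
    """Map a length-p vector to len(rows) dimensions."""
    scale = math.sqrt(s / len(rows))
    return [scale * sum(sign * vec[j] for j, sign in row) for row in rows]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

A distance-based aggregator could then compare client gradients with `euclidean(project(rows, g1), project(rows, g2))` in place of the full-dimensional distance, which is the quantity the $ \frac{1+\epsilon}{1-\epsilon} $ factor in the abstract bounds the distortion of.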

2605.28209 2026-06-02 cs.LG

Robust Contrastive Graph Clustering with Adaptive Local-Global Integration

鲁棒对比图聚类与自适应局部-全局整合

Lei Zhang, Fubo Sun, Haipeng Yang, Zhong Guan, Likang Wu

发表机构 * School of Computer Science and Technology, Anhui University(安徽大学计算机科学与技术学院) College of Management and Economics, Tianjin University(天津大学管理学院)

AI总结 提出一种对比图聚类框架,通过注意力机制自适应融合多尺度局部结构和全局语义原型,以解决复杂图中高阶局部结构捕获不足和全局语义忽略问题,提升聚类性能。

详情
Comments
Accepted at IJCAI 2026
AI中文摘要

图聚类在图分析中对于揭示结构模式和节点社区至关重要。尽管自监督对比学习的最新进展通过结构和属性信号改进了聚类,但现有方法仍难以灵活捕获高阶局部结构,并且常常忽略复杂图中的全局语义。这些限制导致节点表示次优,尤其是在具有碎片化结构和模糊聚类边界的真实世界图中。为了解决这些限制,提出了一种对比图聚类框架,通过注意力机制联合整合多尺度局部结构与全局语义。在局部层面,通过基于注意力的加权自适应融合从多个传播深度提取的基于GNN的拓扑信号,以捕获多尺度邻域特征。在全局层面,通过注意力自适应聚合从动态演化的聚类中心导出的语义原型,以指导节点表示并增强聚类间可分离性。该模型在双视图对比学习范式下训练,采用结合实例级和结构感知损失的混合目标,以提高表示鲁棒性和判别力。在八个真实世界图数据集上的实验表明,我们的方法实现了有竞争力的聚类性能。代码可在 https://github.com/vege12138/w2 获取。

英文摘要

Graph clustering is essential in graph analysis for revealing structural patterns and node communities. Despite recent advances in self-supervised contrastive learning that have improved clustering via structural and attribute signals, existing methods still struggle to flexibly capture high-order local structures and often overlook global semantics in complex graphs. These limitations lead to suboptimal node representations, especially in real-world graphs with fragmented structures and ambiguous cluster boundaries. To address these limitations, a contrastive graph clustering framework is proposed to jointly integrate multi-scale local structures with global semantics via attention mechanisms. At the local level, GNN-based topological signals extracted from multiple propagation depths are adaptively fused through attention-based weighting to capture multi-scale neighborhood features. At the global level, semantic prototypes derived from dynamically evolving cluster centers are adaptively aggregated through attention to guide node representations and enhance inter-cluster separability. The model is trained under a dual-view contrastive learning paradigm with a hybrid objective that combines instance-level and structure-aware losses to improve representation robustness and discrimination. Experiments on eight real-world graph datasets demonstrate that our method achieves competitive clustering performance. Code is available at https://github.com/vege12138/w2.

2605.28183 2026-06-02 cs.CL cs.AI

BenGER: Benchmarking LLM Systems on Subsumption-Based Legal Reasoning in German Law

BenGER:德国法律中基于归入的法律推理的LLM系统基准测试

Sebastian Nagl, Ann-Kristin Mayrhofer, Martin Heidebach, Aleyna Koçak, Anne Zettelmeier, Elly Breu, Angelina Greiner, Sofija Milijas, Matthias Grabmair

发表机构 * Technical University of Munich (TUM)(慕尼黑工业大学) Ludwig Maximilian University of Munich (LMU)(慕尼黑路德维希-马克西米利安大学) University of Konstanz(康斯坦茨大学) University of Saarbrücken(萨尔布吕肯大学)

AI总结 提出BenGER数据集,用于评估LLM系统在德国法律归入推理中的表现,通过自动和基于法官的指标比较12个LLM系统与人类基线。

详情
Comments
Pre-Print v2
AI中文摘要

我们引入了BenGER(德国法律基准)数据集,用于评估LLM系统在德国法律中基于归入的法律推理。BenGER数据集由三个部分组成:596个跨多个法律教育水平的考试式自由文本法律案例任务和531个简短的教义推理任务。我们评估了12个当代LLM系统——包括封闭旗舰型、效率导向型和开放权重型——使用自动和基于法官的指标。在受控验证子集上,对定时的人类撰写解决方案(在无辅助和人机共创条件下)进行模型性能与这些人类基线的对比。我们引入了一个与多评分者人工评分协议(每个解决方案三次盲审加一次作者知情创建者评审)交叉验证的、基于评分标准的LLM-as-a-Judge框架。我们的结果表明,用LLM法官替换盲人评审员对整个人类评审池的一致性影响不大于完全移除该评审员(Calderon r=0.96 vs. r=0.96,匹配n=30),封闭旗舰系统在所有语料库中领先排行榜,并且人机共创显著优于无辅助的人类工作。

英文摘要

We introduce the BenGER (Benchmark for German Law) dataset for evaluating LLM systems on subsumption-based legal reasoning in German law. The BenGER dataset consists of three components: 596 exam-style free-text legal case tasks across multiple levels of legal education and 531 short doctrinal reasoning tasks. We evaluate 12 contemporary LLM systems -- closed flagship, efficiency-oriented, and open-weight -- across automatic and judge-based metrics. On a controlled validation subset of timed human-written solutions under both unaided and human--AI co-creation conditions, we contextualise model performance against these human baselines. We introduce a rubric-aligned LLM-as-a-Judge framework cross-validated against a multi-rater human-grading protocol (three blind reviews plus one author-informed creator review per solution). Our results show that replacing a blind human reviewer with the LLM judge degrades agreement with the full human pool no more than removing that reviewer altogether (Calderon r=0.96 vs. r=0.96, matched n=30), that closed-flagship systems lead the leaderboard across all corpora, and that human--AI co-creation substantially outperforms unaided human work.

2605.24583 2026-06-02 cs.LG cs.CL stat.ML

Measuring Alignment-Induced Activation Shifts Correctly: A Template-Controlled Difference-in-Differences Protocol

正确测量对齐引起的激活偏移:一种模板控制的双重差分协议

Yuki Nakamura

发表机构 * The Open University of Japan(日本开放大学)

AI总结 针对对齐前后模型内部激活比较中存在的聊天模板混淆问题,提出模板控制的双重差分协议,有效分离对齐偏移与格式效应,恢复拒绝方向并提升余弦对齐度。

详情
Comments
11 pages, 1 figure. v3: substantially revised and reframed as a measurement-methodology paper. Code, data, and an immutable Zenodo archive are available at https://github.com/Nakammura/effective-rank-audit (DOI: 10.5281/zenodo.20341444)
AI中文摘要

比较模型在安全对齐前后的内部激活是探究安全训练改变了什么的一种自然方式:在安全相关输入上形成配对的(对齐后减对齐前)激活矩阵,并读取其有效秩或主方向。我们表明,形成该矩阵的直观方式存在混淆。对齐后的模型在基础模型从未见过的聊天模板下进行评估,因此朴素差异将对齐偏移与聊天格式混为一谈。我们引入修改矩阵的四变量分解(朴素、模板控制、对齐内和双重差分,DiD),以分离这两种效应。仅模板控制即可消除 Llama-3.1-8B、Gemma-2-9B 和 Qwen-2.5-7B 上测量有效秩的 2.0-3.9 倍膨胀;DiD 对比才是恢复 Arditi 等人(2024)的拒绝方向的关键,将其余弦对齐度从 0.18-0.39 提升至 0.50-0.86。跨三个系列的投影消融实验证实,恢复的子空间在行为上是活跃的,且奇异值顺序并非因果顺序。我们在受控测试平台上验证了该协议,并将其提炼为对齐激活差异研究的测量建议。

英文摘要

Comparing a model's internal activations before and after alignment is a natural way to ask what safety training changes: one forms the matrix of paired aligned-minus-base activations on safety-relevant inputs and reads off its effective rank or top direction. We show the obvious way to form this matrix is confounded. The aligned model is evaluated under a chat template the base model never saw, so the naive difference conflates the alignment shift with chat formatting. We introduce a four-variant decomposition of the modification matrix (naive, template-controlled, within-aligned, and difference-in-differences, DiD) that separates the two effects. Template control alone removes a 2.0-3.9x inflation of the measured effective rank across Llama-3.1-8B, Gemma-2-9B, and Qwen-2.5-7B; the DiD contrast is what recovers the refusal direction of Arditi et al. (2024), lifting its cosine alignment from 0.18-0.39 to 0.50-0.86. Projection-ablation across the three families confirms the recovered subspace is behaviorally active and that singular-value order is not causal order. We validate the protocol on a controlled testbed and distill it into measurement recommendations for activation-difference studies of alignment.

2605.20716 2026-06-02 cs.LG stat.ML

Decision-Path Patterns as Tree Reliability Signals: Path-based Adaptive Weighting for Random Forest Classification

决策路径模式作为树可靠性信号:基于路径的自适应加权用于随机森林分类

Youngjoon Park

发表机构 * Independent Researcher(独立研究者)

AI总结 提出利用每棵树的决策路径结构模式作为实例自适应可靠性信号,对更可靠的树进行差异化加权,以纠正随机森林中因错误表示树占多数而导致的错误,在36个二分类基准上显著提升准确率。

详情
Comments
32 pages, 3 figures. Code and data: https://github.com/DavidParkYJ/dwarfp
AI中文摘要

随机森林通过每棵树对特征空间的不同随机化表示进行构建。当具有错误表示的树在概率上超过正确表示的树时,即使集成整体拥有足够正确的信息,其统一投票也无法纠正这些区域的错误——这是本文解决的一种可约错误。我们提出利用每棵树决策路径的结构模式作为实例自适应可靠性信号,以识别并对更可靠的树进行差异化加权。在推理时,随机森林通过样本在每棵树中遍历的根到叶路径得出预测,因此路径级可靠性提供了比树级加权更细的粒度。我们表明,该信号反映了每棵树决策的实际可靠性,并且使用它能在36个二分类基准上比RF获得统计显著的准确率提升(Wilcoxon p < 0.0001)。测量了类别召回率回归——RF校正方法的典型失败模式:在0.2个百分点阈值下,零个少数类召回率回归和一个多数类召回率回归,表明偏差减少而非类别权衡。我们进一步量化了该方法从拟合RF本身可访问的可约错误;该估计与每个数据集的增益强相关(Pearson r = +0.840, p < 0.0001)。在它识别的合格组上,该方法平均准确率提升+0.99个百分点,且在每个数据集上严格获胜(7/0/0);可选的放大机制进一步将其提升至+1.48个百分点。

英文摘要

Random forests construct each tree with a different, randomised representation of the feature space. Their uniform voting cannot correct errors in regions where trees with incorrect representations probabilistically outnumber correct ones, even when the ensemble collectively holds enough correct information - a reducible error that this paper addresses. We propose using the structural pattern of each tree's decision path as an instance-adaptive reliability signal to identify and differentially weight the more reliable trees. At inference, a random forest reaches its prediction through the root-to-leaf path the sample traverses in each tree, so path-level reliability offers a finer granularity than tree-level weighting can access. We show that this signal reflects the actual reliability of each tree's decision, and that using it yields a statistically significant accuracy improvement over RF on 36 binary classification benchmarks (Wilcoxon p < 0.0001). Class-recall regression - the typical failure mode of RF correction methods - is measured: zero minority-recall regressions and a single majority-recall regression at the 0.2 pp threshold, indicating bias reduction rather than a class trade-off. We further quantify the reducible error accessible to the method from the fitted RF alone; this estimate correlates strongly with per-dataset gain (Pearson r = +0.840, p < 0.0001). On the qualifying group it identifies, the method delivers a mean +0.99 pp accuracy improvement with strict wins on every dataset (7/0/0); an optional amplification mechanism further raises this to +1.48 pp.

2605.27835 2026-06-02 cs.LG cs.CL

CAREF: Calibration-Aware Regularization for Explanation Faithfulness Without Rationale Supervision

CAREF: 面向解释忠实性的校准感知正则化,无需理由监督

Naphat Nithisopa, Teerapong Panboonyuen

发表机构 * MARSAIL Chulalongkorn University(朱拉隆功大学) PBYAIL (Panboonyuen AI Lab)(PBYAIL(Panboonyuen人工智能实验室))

AI总结 提出CAREF框架,通过校准感知正则化联合优化预测准确性和解释忠实性,无需理由监督,在四个NLE基准上以少量可训练参数取得最佳性能。

详情
Comments
10 pages
AI中文摘要

我们引入了CAREF,一个参数高效的微调框架,通过校准感知正则化联合优化预测准确性和解释忠实性。其核心是通过单一统一损失函数LSCED(面向解释忠实性的校准感知正则化)将基于熵的校准与令牌级稀疏性控制相结合,无需理由监督。在四个NLE基准(COS-E、ECQA、ComVE、e-SNLI)上使用Flan-T5进行评估,我们轻量级的CAREF-AQ变体仅使用6.43%的可训练参数就达到了最佳平均准确率(89.04)和解释对齐度(81.00 nBERT),优于LoRA和AdaLoRA。据我们所知,CAREF是第一个将熵和稀疏性正则化统一到单一训练目标中用于可解释LLM微调的方法。

英文摘要

We introduce CAREF, a parameter-efficient fine-tuning framework that jointly optimizes predictive accuracy and explanation faithfulness via calibration-aware regularization. At its core, CAREF couples entropy-based calibration with token-level sparsity control through a single unified loss, the Calibration-Aware Regularization for Explanation Faithfulness (LSCED), without requiring rationale supervision. Evaluated on four NLE benchmarks (COS-E, ECQA, ComVE, e-SNLI) with Flan-T5, our lightweight CAREF-AQ variant attains the best average accuracy (89.04) and explanation alignment (81.00 nBERT) using only 6.43% of trainable parameters, outperforming LoRA and AdaLoRA. To our knowledge, CAREF is the first method to unify entropy and sparsity regularization in a single training objective for interpretable LLM fine-tuning.

2605.27752 2026-06-02 cs.AI

Asking Is Not Enough: Protocol Sensitivity in LLM Confidence Calibration

询问是不够的:LLM 置信度校准中的协议敏感性

Hankyeol Kim, Pilsung Kang

发表机构 * Seoul National University(首尔国立大学)

AI总结 研究通过改变测量协议(如条件上下文、令牌读取方式)发现,LLM 的令牌概率置信度与口头置信度之间的比较高度敏感,且口头置信度不仅反映正确性还反映答案的合理性和来源。

Abstract

LLM confidence calibration is often evaluated by comparing two signals: token-probability scores and verbalized confidence. These signals are sometimes treated as direct readouts of model uncertainty, but their comparison depends on measurement choices that are rarely made explicit. In the main analysis, we hold the verbalized-confidence elicitation fixed: a single prompt template, probability scale, and output format. We then vary the measurement axes that define the verbalized-vs-token comparison: which answer string receives the token-probability score, how that score is read from the answer tokens, and under which conditioning context it is measured. We evaluate this design on four QA benchmarks across three open 7–8B base/Instruct model families, with larger Qwen2.5 variants as same-family robustness checks. The resulting comparison is sensitive to these choices: conditioning context changes the sign or magnitude of the ECE gap across settings, token readout produces smaller but still sign-moving changes, and changing the ECE estimator has little effect. Under the default generated-answer, bare-context protocol, Instruct settings are close to parity rather than showing a large calibration gain for verbalized confidence. In a separate supplied-answer analysis, surface-plausible wrong answers receive nearly the same confidence as supplied gold answers, suggesting that verbalized confidence also reflects answer plausibility and provenance rather than correctness alone. We argue that both confidence signals should be treated as protocol-dependent behavioral measurements, and provide a reporting checklist covering elicitation provenance, scored answer, token-probability readout, and conditioning context.
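
For readers unfamiliar with the metric, the ECE gap mentioned above is a difference between two expected calibration errors. The sketch below uses the common equal-width binning estimator; the bin count, the bin-edge convention, the toy data, and the sign convention of the gap are assumptions, and the paper may use a different estimator or protocol.

```python
from typing import Sequence


def expected_calibration_error(confidences: Sequence[float],
                               correct: Sequence[int],
                               n_bins: int = 10) -> float:
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        # Confidence 1.0 is placed in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(y for _, y in b) / len(b)
        ece += len(b) / n * abs(accuracy - avg_conf)
    return ece


# Toy data: the same four answers scored by two hypothetical confidence signals.
correct = [1, 1, 1, 0]
verbalized = [0.9, 0.9, 0.9, 0.9]    # made-up verbalized confidences
token_prob = [0.8, 0.7, 0.9, 0.3]    # made-up token-probability scores

ece_gap = (expected_calibration_error(verbalized, correct)
           - expected_calibration_error(token_prob, correct))
print(ece_gap)  # positive means the verbalized signal is worse calibrated here
```

Because each confidence signal depends on how it is read out, changing the protocol changes the inputs to this function, which is how the sign of the gap can flip without any change to the model.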

2605.27701 2026-06-02 cs.AI

Cross-Entropy Games and Frost Training

Arthur Renard, Franck Gabriel, Valentin Hartmann, Clément Hongler

Affiliations: Xent Labs; Université Lyon 1

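AI summary: Proposes Frost Training, which uses the gradient of the reward function in embedding space to improve Monte Carlo-based policy optimization for a family of LLM-as-a-judge tasks called Cross-Entropy Games, reaching higher maximum scores in best-of-k selection and speeding up training.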

Comments
14 pages, 6 figures
Abstract

We present Frost Training, a method for improving Monte Carlo-based policy optimization for a large family of LLM-as-a-judge tasks called Cross-Entropy Games. The key idea is to exploit the gradient of the reward function in embedding space. This signal is used in the Greedy Coordinate Gradient (GCG) jailbreaking technique; we demonstrate for the first time that it can also be used to boost model training. We validate our method using GRPO training for maximum-likelihood infilling. Frost Training improves the model's ability to generate high-scoring outputs, reaching higher maximum scores in a best-of-k setting, and does so at an increased speed.
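
The embedding-space gradient signal can be illustrated with a toy version of the first-order ranking used in GCG-style search. This sketch shows only that signal; how Frost Training feeds it into policy optimization is not described in the abstract. The reward function, the two-dimensional embeddings, and all names below are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reward(e, target):
    """Toy differentiable reward: negative squared distance to a target vector."""
    return -sum((x - t) ** 2 for x, t in zip(e, target))


def reward_grad(e, target):
    """Analytic gradient of the toy reward with respect to the embedding e."""
    return [-2 * (x - t) for x, t in zip(e, target)]


def rank_substitutions(embeddings, current, grad):
    """Rank replacement tokens by the first-order estimate of the reward change."""
    cur = embeddings[current]
    scores = {
        tok: dot([x - c for x, c in zip(vec, cur)], grad)
        for tok, vec in embeddings.items() if tok != current
    }
    return sorted(scores, key=scores.get, reverse=True)


embeddings = {"cat": [1.0, 0.0], "dog": [0.8, 0.3],
              "car": [-1.0, 0.2], "sky": [0.0, 1.0]}
target = [0.9, 0.2]

grad = reward_grad(embeddings["car"], target)
ranking = rank_substitutions(embeddings, "car", grad)

# The gradient only proposes candidates; the top few are re-scored exactly.
best = max(ranking[:2], key=lambda tok: reward(embeddings[tok], target))
print(ranking, best)
```

The linear estimate and the exact reward can disagree about the best candidate, which is why the gradient is typically used to shortlist substitutions rather than to pick one directly.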