arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.29960 2026-05-29 cs.CR cs.AI

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

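AI summary: Proposes MemPoison, an attack that uses a semantic relational bridge, entity masquerading, and joint embedding optimization to bypass selective memory mechanisms and inject triggerable backdoors into the long-term memory of LLM agents, reaching attack success rates of up to 0.95.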

Comments
19 pages, 12 figures
Abstract

Large language model (LLM) agents increasingly leverage long-term memory to support persistent and autonomous task execution. However, this capability also introduces a new attack surface: memory poisoning, where adversaries can inject malicious information to influence future behavior. Existing memory poisoning attacks often assume that injected content can be stored directly in memory, overlooking the selective extraction and rewriting stages in modern memory pipelines. This makes prior methods ineffective under realistic settings. In this paper, we propose MemPoison, a novel memory poisoning attack that bypasses selective memory mechanisms in LLM agents: an attacker can inject triggerable backdoors into the agent's long-term memory through dialogue interactions, thereby misleading its subsequent responses. MemPoison introduces three key components: (i) a semantic relational bridge that binds the trigger and payload into a coherent statement to ensure they are extracted into memory together; (ii) entity masquerading that optimizes triggers to mimic named entities, resisting rewriting; and (iii) joint embedding optimization that shapes trigger-injected texts into a tight cluster in the embedding space while maintaining isolation from benign embeddings for stealth. Evaluations across different agent domains and memory mechanisms show MemPoison achieves attack success rates up to 0.95, outperforming existing baselines. Mechanistic analysis indicates that the attack exploits embedding-space anisotropy and shifts attention patterns, highlighting core vulnerabilities in selective memory systems. We evaluate multiple defense strategies and demonstrate their fundamental limitations in mitigating the attack.

2605.29955 2026-05-29 cs.AI

Formalizing Mathematics at Scale

Ahmad Rammal, Niket Patel, Fabian Gloeckle, Amaury Hayat, Julia Kempe, Remi Munos, Charles Arnal, Vivien Cabannes

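AI summary: Proposes AutoformBot, a multi-agent system that uses LLMs and formal verification tools to automatically translate informal textbooks into verifiable Lean 4 code, producing Atlas, a formal library of over 45,000 declarations and 500,000 lines of code.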

Abstract

We present AutoformBot, a multi-agent system for building an Autoformalized Textbook Library At Scale (Atlas) in Lean 4. AutoformBot orchestrates thousands of LLM agents, equipped with formal verification tools, dependency-aware task scheduling, and collaborative version control, to translate informal textbook prose into machine-checked definitions and proofs. We apply our methods to a corpus of 26 open-access textbooks spanning analysis, algebra, topology, combinatorics, and probability, producing Atlas: a verified library of over 45,000 Lean 4 declarations and 500 thousand lines of code. We release two artifacts: (i) AutoformBot, the open-source multi-agent framework; and (ii) Atlas, the resulting formal library. Our results suggest that autoformalizing the core content of graduate-level mathematics at scale is now economically and technically feasible. This opens the door to the automated verification of both human- and machine-generated mathematics at a research level.
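
The abstract mentions dependency-aware task scheduling without describing it. As a rough illustration only, the sketch below orders a handful of declarations so that each is attempted after everything it depends on, using Python's standard `graphlib`. The function name `schedule_batches`, the declaration names, and the batching policy are assumptions made for this example, not AutoformBot's actual scheduler.

```python
from graphlib import TopologicalSorter


def schedule_batches(deps):
    """Group declarations into batches whose dependencies are already done.

    `deps` maps a declaration name to the set of names it depends on.
    Items within one batch do not depend on each other, so they could be
    handed to independent workers.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for reproducible output
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical textbook fragment; the names are illustrative only.
deps = {
    "def_group": set(),
    "def_subgroup": {"def_group"},
    "def_coset": {"def_subgroup"},
    "def_homomorphism": {"def_group"},
    "thm_lagrange": {"def_coset"},
    "thm_first_isomorphism": {"def_homomorphism", "def_subgroup"},
}
batches = schedule_batches(deps)
for i, batch in enumerate(batches):
    print(i, batch)
```

Batching by readiness, rather than producing one flat order, makes the available parallelism explicit: every item in a batch can be attempted at the same time.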

2605.29954 2026-05-29 cs.CV

SwInception -- Local Attention Meets Convolutions

David Hagerman, Roman Naeem, Jakob Lindqvist, Carl Lindström, Fredrik Kahl, Lennart Svensson

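AI summary: Proposes the SwInception architecture, which strengthens inductive bias by introducing Inception blocks into the feed-forward layers of the Swin Transformer and modifies the decoder to capture fine details with fewer parameters, improving segmentation performance on multiple medical datasets.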

Comments
International Conference on Pattern Recognition and Artificial Intelligence, 2024
Abstract

Sparse vision transformers have gained popularity as efficient encoders for medical volumetric segmentation, with Swin emerging as a prominent choice. Swin uses local attention to reduce complexity and yields excellent performance for many tasks but still tends to overfit on small datasets. To mitigate this weakness, we propose a novel architecture that further enhances Swin's inductive bias by introducing Inception blocks in the feed-forward layers. The introduction of these multi-branch convolutions enables more direct reasoning over local, multi-scale features within the transformer block. We have also modified the decoder layers in order to capture finer details using fewer parameters. We demonstrate a performance improvement on eleven different medical datasets through extensive experimentation. We specifically showcase advancements over the previous state-of-the-art backbones on benchmark challenges like the Medical Segmentation Decathlon and Beyond the Cranial Vault. By showing that the existing inductive bias in Swin can be further improved, our work presents a promising avenue for enhancing the capabilities of sparse vision transformers for both medical and natural image segmentation tasks. Code and pre-trained weights can be accessed at https://github.com/Eiphodos/SwInception.

2605.29953 2026-05-29 cs.CV

Mesh-Aware Epipolar Matching for Multi-View Multi-Person 3D Pose Estimation in Basketball

Li Yin, Qin Haobin, Tomohiro Suzuki, Calvin Yeung, Mariko Isogawa, Keisuke Fujii

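AI summary: Proposes MAEM, a training-free framework that uses a monocular 3D human mesh recovery model and a two-stage epipolar matching strategy to address occlusion and appearance similarity in multi-view multi-person 3D pose estimation for team sports.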

Abstract

Multi-view multi-person 3D pose estimation in team sports scenarios remains challenging due to player occlusions, appearance similarity caused by team uniforms, and the scarcity of annotated multi-view data, all of which limit the effectiveness and generalization capability of learning-based methods. In contrast, the performance of training-free approaches is inherently constrained by the accuracy of 2D keypoint detection and the robustness of cross-view association. To address these challenges, we propose Mesh-Aware Epipolar Matching (MAEM), a training-free framework for multi-view multi-person 3D pose estimation. Our method employs a monocular 3D human mesh recovery model as the frontend and introduces a two-stage epipolar matching strategy based on the recovered mesh outputs. Specifically, the proposed framework combines disjoint-set-union-based clustering with per-joint triangulation to achieve robust cross-view association and accurate 3D pose reconstruction. Experiments on two public multi-view basketball datasets demonstrate that MAEM consistently outperforms existing training-free association baselines while achieving competitive RGB-only performance in both indoor and outdoor basketball scenarios. MAEM achieves MPJPE/PA-MPJPE scores of 59.8/40.7 mm on SportCenter EPFL and 74.0/51.8 mm on Human-M3 Basketball, highlighting the effectiveness of dense mesh geometry for cross-view association without requiring target-domain training or fine-tuning.
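
To make the disjoint-set-union clustering step more concrete, here is a minimal sketch of cross-view association with a union-find structure. It assumes that a matching cost between detections in different views (for example a mean epipolar distance) has already been computed. The function `associate`, the threshold, the greedy cheapest-first order, and the rule that a cluster holds at most one detection per view are all assumptions for illustration, not MAEM's two-stage procedure.

```python
class DisjointSet:
    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def associate(detections, pair_costs, threshold):
    """Cluster (view, index) detections into per-person groups.

    Pairs are merged cheapest first; a merge is skipped if it would put
    two detections from the same view into one cluster.
    """
    dsu = DisjointSet(detections)
    views = {d: {d[0]} for d in detections}  # views covered by each root
    for (a, b), cost in sorted(pair_costs.items(), key=lambda kv: kv[1]):
        if cost > threshold:
            break  # remaining pairs are even more expensive
        ra, rb = dsu.find(a), dsu.find(b)
        if ra == rb or views[ra] & views[rb]:
            continue
        dsu.union(ra, rb)          # ra stays the root
        views[ra] |= views[rb]
    clusters = {}
    for d in detections:
        clusters.setdefault(dsu.find(d), []).append(d)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical detections and costs (pixels); values are made up.
detections = [("cam0", 0), ("cam0", 1), ("cam1", 0), ("cam1", 1), ("cam2", 0)]
pair_costs = {
    (("cam0", 0), ("cam1", 0)): 2.0,
    (("cam1", 0), ("cam2", 0)): 2.5,
    (("cam0", 1), ("cam1", 1)): 3.0,
    (("cam0", 1), ("cam2", 0)): 4.0,
    (("cam0", 0), ("cam1", 1)): 30.0,
}
people = associate(detections, pair_costs, threshold=10.0)
print(people)
```

Each resulting cluster lists the views in which one person was seen, which is the input a per-joint triangulation step would need.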

2605.29952 2026-05-29 cs.LG

From Short Histories to Long Futures: Horizon-Aware Graph Neural Networks for Long Horizon Forecasting

Zesheng Liu, Maryam Rahnemoonfar

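AI summary: Proposes a multi-horizon graph neural network emulator that uses a shared graph backbone and increment prediction to jointly optimize multiple lead times, giving stable and accurate long-horizon emulation of geophysical systems.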

Comments
Accepted for International Conference on Pattern Recognition (ICPR) 2026
Abstract

Accurate long-range prediction of geophysical systems is difficult due to strongly nonlinear dynamics, the high computational cost of full-physics simulations, and the error accumulation that arises when one-step autoregressive surrogates are rolled out over decades. Deep neural networks can serve as efficient emulators, but most are trained only for next-step prediction and often drift or become unstable as the forecast horizon grows. We propose a multi-horizon graph neural network emulator that learns state-to-state transitions from a single current time to multiple future lead times within one unified model. The physical domain is represented as a graph, where nodes correspond to spatial locations with time-varying geophysical attributes and edges encode local spatial interactions. Given the current graph state, the model predicts the future evolution of key fields (ice thickness and ice velocities) at all nodes, using a shared graph backbone with separate output branches for each target variable. To improve stability, the network predicts state increments relative to the current state, which are then added back to reconstruct future states. Training jointly optimizes all lead times with a unified regression objective, and inference uses a coarse-to-fine rollout that advances with larger jumps and selectively refines with shorter jumps to reduce drift and avoid redundant computation. Experiments on multi-decadal Pine Island Glacier simulations show that our approach achieves higher long-range accuracy and improved stability than both (i) an initial-state baseline that predicts each future time directly from the starting state and (ii) a standard single-step autoregressive rollout, producing a more reliable emulator for downstream climate and sea-level studies.
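
The two inference ideas in the abstract, predicting increments and rolling out with large jumps before small ones, can be sketched in a few lines. This is one possible reading under stated assumptions: the greedy jump plan, the function names, and the toy `toy_increment` model (a stand-in for a trained emulator) are hypothetical and are not taken from the paper.

```python
def plan_jumps(horizon, lead_times):
    """Greedily cover `horizon` steps, using the largest lead times first.

    Greedy planning always succeeds when a lead time of 1 is available;
    for other sets it may fail even if some combination would work.
    """
    jumps, remaining = [], horizon
    for lead in sorted(lead_times, reverse=True):
        while remaining >= lead:
            jumps.append(lead)
            remaining -= lead
    if remaining:
        raise ValueError("horizon not reachable with the given lead times")
    return jumps


def rollout(state, horizon, lead_times, predict_increment):
    """Advance `state` by `horizon` steps via residual (increment) updates."""
    trajectory = [(0, state)]
    t = 0
    for lead in plan_jumps(horizon, lead_times):
        delta = predict_increment(state, lead)  # the model outputs a change, not a state
        state = [s + d for s, d in zip(state, delta)]
        t += lead
        trajectory.append((t, state))
    return trajectory


# Stand-in for a trained emulator: every node thins by 0.5 units per step.
def toy_increment(state, lead):
    return [-0.5 * lead for _ in state]


traj = rollout([100.0, 80.0], horizon=27, lead_times=[1, 5, 10],
               predict_increment=toy_increment)
for t, s in traj:
    print(t, s)
```

Reaching step 27 takes five model calls here instead of 27 single steps, which is the sense in which fewer, larger jumps limit the opportunities for error to compound.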

2605.29951 2026-05-29 cs.AI cs.CL cs.LG cs.MM

MuPHI: Learning Implicit Multimodal Harm Reasoning via Semantically Grounded Reward Optimization

Anisha Saha, Varsha Suresh, Teodora Kamova, Sophia Wiedmann, Timothy Hospedales, Vera Demberg

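AI summary: To address the weakness of vision-language models at reasoning about implicit cross-modal harmful semantics, proposes the MuPHI dataset and the MuPHIRM training framework, which learns joint semantics by optimizing multi-perspective rewards and improves harm detection, reasoning quality, and out-of-distribution robustness.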

Abstract

Understanding how harm emerges from interaction between otherwise benign image-text pairs requires intent-aware cross-modal reasoning beyond surface-level features. Existing vision-language models (VLMs) excel at literal reasoning over perceptual cues but often fail to derive harmful semantics that rely on implicit, context-dependent reasoning. To evaluate VLMs on compositional harm detection and reasoning, we introduce Multimodal Pragmatic Harm Interpretation (MuPHI), a dataset containing image-text pairs where harm is encoded in subtle multimodal cues. MuPHI spans diverse harm categories and includes annotated harm rationales for assessing VLM reasoning chains. To improve both detection and reasoning in VLMs, we propose MuPHIRM, a reasoning-augmented training framework which learns joint semantics by optimizing multi-perspective rewards. MuPHIRM improves both harm detection and reasoning quality of VLMs while demonstrating superior out-of-distribution robustness compared to both trained and inference-time baselines. Our findings suggest that reasoning-oriented reward optimization offers a promising direction towards building multimodal systems that generalize beyond benchmark-specific shortcuts.

2605.29943 2026-05-29 cs.HC cs.ET cs.LG

A Domain-Informed Multi-Objective Framework for EEG Channel Selection in Motor Imagery BCIs

Dekka Muni Kumar, Dhruba Jyoti Kalita, Yogesh Kumar Meena

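AI summary: Proposes an EEG channel selection framework based on multi-objective optimization (NSGA-II, MOPSO, MOEA/D) that scores spatial relevance with a Gaussian kernel and functional discriminability with task-related desynchronization; on four datasets it outperforms single-objective methods, yielding compact channel subsets with high classification performance.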

Comments
This work has been submitted to the IEEE for possible publication
Abstract

Motor imagery (MI) classification using electroencephalography (EEG) signals is essential for advancing brain-computer interfaces (BCIs). Traditional EEG channel selection methods often face limitations, such as dependency on single-objective criteria and susceptibility to local optima. To address these challenges, this work proposes a multi-objective optimisation framework that employs non-dominated sorting genetic algorithm, multiple-objective particle swarm optimisation, and a multi-objective evolutionary algorithm based on decomposition. Our approach effectively balances spatial relevance, using a Gaussian kernel, and functional discriminability, which assesses intratrial task-related desynchronisation, thereby improving performance. We evaluated this framework on four EEG datasets: Physionet, OpenBMI, HighGamma, and BCIIV-2A. The proposed approach successfully identifies compact, relevant channel subsets concentrated around sensorimotor cortex regions linked to MI activity, addressing the prevalent challenges of dimensionality and complexity inherent to traditional techniques. Furthermore, the framework achieved classification performance of 87%, 71%, 75%, and 65% on the Physionet, OpenBMI, HighGamma, and BCIIV-2A datasets, respectively. By outperforming existing single-objective and accuracy-based methods, and those relying on fixed subsets, these findings demonstrate that this new multi-objective optimisation framework can enhance MI-based BCI performance while facilitating compact channel configurations with reduced computational complexity, making them better suited for wearable, portable, and real-time BCI applications.
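
All three optimizers named above build on Pareto dominance between candidate solutions. The sketch below shows only that shared building block: filtering a set of candidate channel subsets down to the non-dominated ones when both objectives are maximised. The subset names and scores are invented for illustration, and the brute-force filter is a simplification, not the NSGA-II, MOPSO, or MOEA/D procedures themselves.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    ]


# Hypothetical channel subsets scored as
# (spatial relevance, functional discriminability); values are made up.
candidates = {
    "C3,C4,Cz": (0.90, 0.60),
    "C3,C4": (0.95, 0.50),
    "C3,C4,Cz,FC1": (0.80, 0.75),
    "Fp1,Fp2": (0.10, 0.20),
    "C3,Cz": (0.85, 0.55),
}
front = pareto_front(candidates)
print(front)
```

The front keeps several subsets because none of them is best on both objectives at once; choosing among them is a separate decision, for instance by subset size or downstream accuracy.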

2605.29941 2026-05-29 cs.NI cs.LG

TraceCodec: A Compiler-Backed Neural Codec for Stateful Multi-Flow Network Traffic Traces

Junhui Ding, Xinchen Zhang, Xiaohui Xie, Shinan Liu

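AI summary: For high-fidelity generation of stateful multi-flow network traffic traces, proposes TraceCodec, which lifts packets into timed actions, learns a continuous latent representation, and uses a deterministic compiler to lower decoded actions back to PCAPs, preserving traffic statistics and TCP state accurately.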

Abstract

Critical networking workflows require high-fidelity packet captures (PCAPs) for testing, security analysis, and protocol validation, not just statistical flow-level summaries. Recent packet generators have demonstrated protocol-constrained PCAP synthesis, but they universally decode directly to raw packet fields. That interface entangles learned behavioral choices with deterministic protocol consequences, which forces packet realization to depend on post-hoc heuristic repair. We identify this decode interface as the fundamental bottleneck and present TraceCodec, a state-aware neural codec for stateful multi-flow traces. TraceCodec lifts each packet into a timed packet action with explicit flow slots and transport cues, then learns a continuous per-packet latent. A deterministic compiler lowers decoded actions back to PCAPs, owning endpoint assignment, TCP state, legality constraints, and packet rendering. The latent layer exposes a generator-facing sequence space, so downstream traffic models can operate on packet-action latents rather than raw header fields. On CICIDS2017 Monday, TraceCodec matches packet count, protocol composition, and flow population to within 0.03%. Raw-field baselines under the same non-repair policy distort flow counts and TCP state by orders of magnitude. Structural diagnostics show that TraceCodec preserves TCP state transitions and multi-flow interleaving that raw-field decoders fragment. This work establishes a new foundation for high-fidelity packet-trace generation.
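
The separation the abstract describes, learned behavioral choices on one side and deterministic protocol consequences on the other, can be illustrated with a toy compiler. In the sketch below a generator would only choose the `Action` fields, while sequence and acknowledgment numbers are derived by bookkeeping. The `Action` schema, the shared initial sequence number, and the dict output (rather than real PCAP bytes) are simplifying assumptions, not TraceCodec's actual interface.

```python
from dataclasses import dataclass


@dataclass
class Action:
    slot: int        # which concurrent flow the packet belongs to
    dt: float        # seconds since the previous packet in the trace
    direction: str   # "c2s" (client to server) or "s2c"
    flags: str       # e.g. "S", "SA", "A", "PA", "FA"
    length: int = 0  # payload bytes


def compile_actions(actions, endpoints, isn=1000):
    """Lower actions to packet dicts, filling in seq/ack deterministically."""
    next_seq = {}  # (slot, direction) -> next sequence number to send
    packets, clock = [], 0.0
    for act in actions:
        clock += act.dt
        reverse = "s2c" if act.direction == "c2s" else "c2s"
        seq = next_seq.setdefault((act.slot, act.direction), isn)
        # The ACK number is the next byte expected from the peer (0 if unset).
        ack = next_seq.get((act.slot, reverse), 0) if "A" in act.flags else 0
        client, server = endpoints[act.slot]
        src, dst = (client, server) if act.direction == "c2s" else (server, client)
        packets.append({"ts": round(clock, 6), "src": src, "dst": dst,
                        "seq": seq, "ack": ack,
                        "flags": act.flags, "len": act.length})
        # SYN and FIN each consume one sequence number; payload consumes its length.
        consumed = act.length + ("S" in act.flags) + ("F" in act.flags)
        next_seq[(act.slot, act.direction)] = seq + consumed
    return packets


# Hypothetical single flow: handshake, one 100-byte request, one ACK.
endpoints = {0: (("10.0.0.1", 40000), ("10.0.0.2", 80))}
actions = [
    Action(0, 0.00, "c2s", "S"),
    Action(0, 0.01, "s2c", "SA"),
    Action(0, 0.01, "c2s", "A"),
    Action(0, 0.02, "c2s", "PA", 100),
    Action(0, 0.03, "s2c", "A"),
]
packets = compile_actions(actions, endpoints)
for p in packets:
    print(p)
```

Because the numbers are computed rather than predicted, a model that emits a plausible action sequence cannot produce an inconsistent sequence space in this toy setting.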

2605.29940 2026-05-29 cs.AI

Make LLM Learn to Synthesize from Streaming Experiences through Feedback

Zhenlin Hu, Yan Wang, Zhen Bi, Zihao Xue, Bingyu Zhu, Longtao Huang, Xiongtao Zhang, Zeyu Yang, Zhixuan Chu, Jungang Lou

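AI summary: Proposes the StreamSynth setting and the SynLearner framework, which let a model accumulate experience over a task stream and use feedback to improve synthetic data generation.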

Abstract

Large language models (LLMs) have been widely adopted for synthetic data generation, significantly reducing annotation costs. However, most existing studies treat synthesis as a set of isolated tasks and overlook a more fundamental question: whether a model can learn to synthesize by accumulating experience from past tasks and transferring it to future ones. In this work, we introduce StreamSynth, a new setting in which synthesis tasks arrive sequentially and experience from historical tasks provides informative signals for future synthesis. To address this setting, we propose SynLearner, a general framework that enables synthesis models to acquire reusable synthesis experience over a task stream. Instead of generating data independently for each task, SynLearner encourages the model to explore diverse synthesis patterns, learn from feedback, and balance sample quality with set-level diversity as tasks evolve. Extensive experiments across multiple benchmarks show that SynLearner effectively leverages experience from earlier tasks to improve synthesis performance on later ones, exhibiting consistent cross-task transferability. These findings provide evidence for the feasibility of StreamSynth and highlight synthetic data generation as an experience-driven process that can benefit from task streams.

2605.29939 2026-05-29 cs.IT cs.LG math.IT

CRB-Guided Framework Design and Resource Allocation for Indoor mmWave ISCC Systems

Zhonghao Liu, Yahao Ding, Yinchao Yang, Mohammad Shikh-Bahaei

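AI summary: For indoor mmWave ISCC systems, proposes a Cramér-Rao bound (CRB) based resource allocation framework that jointly optimizes sensing power and the depth of an adaptive-depth Mamba model to minimize human pose prediction error.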

Comments
7 pages, 6 figures, conference (submitted to GLOBECOM)
Abstract

Integrated sensing, communication, and computation (ISCC) provides a promising framework for indoor human-centric applications. In these applications, short-term human pose prediction facilitates continuous human tracking and resource allocation in advance. In this paper, we propose a Cramer-Rao bound (CRB) guided resource allocation framework for indoor mmWave ISCC systems to minimize the human pose prediction error under communication, latency, and energy constraints. We characterize the impact of sensing power on range-estimation uncertainty and point-cloud perturbation based on the CRB. To capture the impact of computation resources on prediction performance, we adopt an adaptive-depth Mamba-based pose prediction model, where lightweight prediction heads are attached after every layer to enable inference with different model depths. With this unified sensing-computation modeling, we establish a quantitative relationship among sensing power, model depth, and prediction error. Furthermore, we formulate a joint resource allocation problem to minimize the pose prediction error. To solve this problem efficiently, we develop an alternating optimization (AO)-based algorithm, where closed-form solutions are derived for the sensing power and model depth update steps. Simulation results show that the proposed scheme significantly reduces pose prediction error compared with baseline methods, validating its effectiveness for resource-constrained indoor human-centric ISCC systems.

2605.29937 2026-05-29 cs.RO cs.LG

Fisher-Preserving Guidance: Training-Free Manifold Constraints for Safe Diffusion Control

Hao Ren, Zetong Bi, Yiming Zeng, Le Zheng, Zhi Li, Zhaoliang Wan, Lu Qi, Hui Cheng

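AI summary: Proposes a training-free Fisher-preserving guidance method that computes Fisher-preserving updates via a low-rank Jacobian factorization and uses Truncated Fisher Denoising Sensitivity as an uncertainty signal, giving reliable and efficient trajectory prediction in visual navigation.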

Comments
ICML 2026
Abstract

Diffusion models are effective for waypoint prediction in visual navigation, but standard sampling and test-time guidance can produce unreliable or inefficient trajectories when updates drift off the training manifold. We propose Fisher-Preserving Guidance with Outer Product Span Projection, a training-free inference method that avoids large Fisher drift associated with off-distribution actions while optimizing a task objective. Our method computes the Fisher-preserving update via a low-rank Jacobian factorization, requiring only a single backward pass per step and enabling real-time use. We further introduce Truncated Fisher Denoising Sensitivity as an uncertainty signal and use it for robust multi-sample action blending. Experiments on toy and realistic navigation benchmarks, including Maze2D with TSDF-based guidance, PushT with official Diffusion Policy weights, and visual navigation in simulation and on real robots, demonstrate consistent improvements in performance over strong diffusion-policy baselines without additional training.

2605.29935 2026-05-29 cs.CV cs.AI

CityGen: Structure-Guided City-Style Synthesis for Cross-City Autonomous Driving

Zezhong Qian, Zhao Yang, Lu Tan, Zhihao Yan, Weiyi Hong, Haizhuang Liu, Yawei Jueluo

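AI summary: Proposes CityGen, a diffusion-based generative framework that achieves zero-label city adaptation through HD-map conditioning and city-level visual prompts, improving the cross-city robustness of autonomous driving on perception, segmentation, and planning tasks.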

Abstract

Autonomous driving systems are commonly trained and evaluated within limited geographic regions, which hinders their scalability when deployed in new cities. However, significant domain shifts in appearance, road topology, and traffic patterns often cause severe performance degradation under cross-city deployment. Existing approaches based on domain adaptation, data augmentation, or synthetic data generation typically rely on labeled target data, city-specific annotations, or task-specific designs, limiting their scalability and effectiveness for holistic evaluation. In this paper, we introduce CityTransfer-Bench, a geographically disjoint benchmark for evaluating cross-city generalization across perception, segmentation, and planning, and propose CityGen, a diffusion-based generative framework that performs zero-label city adaptation via HD-map-conditioned synthesis guided by city-level visual prompts. Extensive experiments demonstrate that CityGen consistently improves cross-city robustness across multiple tasks, establishing a scalable and label-efficient foundation for generalizable autonomous driving.

2605.29933 2026-05-29 cs.LG

CLUBench: A Clustering Benchmark

Feng Xiao, Dazhi Fu, Chris Ding, Jicong Fan

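AI summary: Presents CLUBench, a comprehensive clustering benchmark covering 24 algorithms on 131 datasets; large-scale experiments analyze hyperparameter tuning, data types, pretrained embeddings, and LLM-based clustering, showing that conventional algorithms remain competitive and that combining them with pretrained embeddings is effective and efficient.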

Abstract

Clustering is a fundamental problem in data science with a long-standing research history, yielding numerous insightful algorithms. Despite this progress, a systematic and large-scale empirical evaluation that jointly considers conventional algorithms, deep learning-based methods, and recent foundation model-based clustering remains largely absent, leading to limited guidance on algorithm selection and deployment. To address this gap, we introduce CLUBench, a comprehensive clustering benchmark comprising 24 algorithms of diverse principles evaluated on 131 datasets across tabular, text, and image data, involving 178,815 experiments. Importantly, our analyses of (i) the impact of hyperparameter tuning, (ii) the impact of data types and characteristics, (iii) the impact of pretrained embeddings, (iv) large language model-based clustering, (v) the similarity of algorithms, and (vi) the low-rank structures of performance matrices yield meaningful insights and promising pathways for clustering research. For instance, our study reveals that: 1) none of the evaluated deep clustering methods exhibits a significant advantage in average performance over the top-performing conventional clustering algorithms (e.g., KMeans, SpeClu); 2) for image and text clustering tasks, combining pretrained embeddings with conventional clustering algorithms (e.g., KMeans, SpeClu) offers effective and efficient clustering; 3) clustering remains a challenging and nontrivial problem, even in the era of increasingly dominant foundation models. Moreover, we propose to use the low-rank structure in cross-model performance matrices to efficiently approximate the overall performance evaluation in practical applications. We further demonstrate the feasibility of model selection based on the performance matrices across all hyperparameter configurations.
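
Finding 2 pairs pretrained embeddings with a conventional algorithm such as KMeans. As a minimal sketch of the conventional half, the block below runs plain Lloyd's algorithm on a few two-dimensional vectors that stand in for embeddings. The toy vectors, the seeding of centroids from the first `k` points, and the fixed iteration count are assumptions for this example; the benchmark's own implementations and settings are not described in the abstract.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Stand-in "embeddings": in practice these would come from a pretrained encoder.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]
labels = kmeans(embeddings, k=2)
print(labels)
```

In this arrangement the encoder does the representational work and the clustering step stays cheap, which is consistent with the efficiency claim in the abstract.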

2605.29932 2026-05-29 cs.LG cs.CV

Treatment-Conditioned Diffusion for Forecasting Neurodegenerative Disease Progression

Danylo Boiko, Viktoriia Mishkurova

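AI summary: Proposes a treatment-conditioned diffusion framework that conditions generation on a patient's screening DaTscan image and levodopa equivalent daily dose over one year to predict high-fidelity future brain states, significantly improving clinical fidelity over the baseline.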

Comments
9 pages, 5 figures, 1 table
Abstract

Forecasting the progression of neurodegenerative diseases, such as Parkinson's disease, is essential for effective long-term planning and personalized therapeutic intervention. Existing systems typically produce scalar clinical scores that ignore the rich structure of longitudinal neuroimaging, while traditional generative approaches lose anatomical detail and blur subtle progression patterns. To address this, we introduce a novel treatment-conditioned diffusion framework that predicts high-fidelity future brain states by conditioning the generative process on patients' screening DaTscan images and levodopa equivalent daily dose over one year. The pipeline uses a Transformer-based encoder to represent non-linear, time-dependent pharmacological dynamics and optimizes generation through a multi-weight region-of-interest mask that focuses on biologically critical areas. Experimental evaluation shows that our framework maintains sharp anatomical boundaries and significantly improves clinical fidelity relative to the baseline, achieving 14.0% lower MSE, 7.2% lower MAE, and 4.9% higher SSIM.

2605.29931 2026-05-29 cs.AI eess.AS

It's All About Speed: AI's Impact on Workflow in Music Production

Finn McClellan, Fabio Morreale

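AI summary: An ethnographic study of how AI and automated tools affect music production workflow, focusing on the usage and attitudes of recording engineers, mixers, and producers, and analyzing the tensions among speed, controllability, and creative agency and how they might be alleviated.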

Comments
Audio Engineering Society Conference Paper - Presented at the AES International Conference on Machine Learning and Artificial Intelligence for Audio 2025 - September 8-10, London, UK
Abstract

In this paper, we present the results of an ethnographic study into the impact of AI and automated tools on music production workflow. Focusing specifically on professional participants who identified as recording engineers, mixers, and producers, we discuss their usage of common AI and automated software, as well as their sentiments on the proliferation of these tools. We discuss tensions that may be created between users and automated tools in key areas such as the need for speed and efficiency, controllability, and maintaining creative agency, and how these tensions may be alleviated through tool design.

2605.29927 2026-05-29 cs.CL cs.AI cs.LG

Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents

Alejandra Zambrano, Sara Vera Marjanovic, Imene Kerboua, Xing Han Lù, Leila Kosseim

AI总结 本研究提出PlanAhead框架,通过自动难度分类和四种计划表示(顺序子目标、叙述、伪代码、检查清单)的对比实验,发现计划表示形式和生成计划的LLM显著影响网络代理的鲁棒性和任务成功率。

详情
Comments
Extended version of paper submitted to EMNLP, waiting for acceptance
AI中文摘要

尽管最近取得了进展,基于LLM的网络代理仍然面临探索有限、遗漏关键步骤以及对任务约束敏感等问题。先前的研究表明,许多这些失败源于规划中的弱点,但替代自然语言计划表示的影响尚未被探索。为了解决这个问题,我们引入了PlanAhead,一个静态规划器-执行器框架,评估计划表示对代理性能的影响。我们首先将WebArena任务自动分类为3个难度级别,无需人工标注即可实现一致的难度分级。然后,我们在被分类为困难的任务上系统评估了4种不同的计划表示:顺序子目标、叙述、伪代码和检查清单;跨越不同系列的多模态LLM驱动的代理(OpenAI、阿里巴巴和谷歌)。为了解释随机变异性,我们引入了两个新的评估指标:达成率(AR)和解决任务一致性(STC)。我们的结果表明,计划制定和生成计划的底层LLM都显著影响网络代理的鲁棒性和任务成功率。

英文摘要

Despite recent advances, LLM-based web agents still struggle with limited exploration, omission of critical steps, and sensitivity to task constraints. Prior work suggests that many of these failures stem from weaknesses in planning, yet the impact of alternative natural language plan representations remains unexplored. To address this, we introduce PlanAhead, a static planner-executor framework that evaluates the impact of plan representation on agent performance. We first automatically categorize WebArena tasks into 3 difficulty levels, enabling consistent difficulty grading without human annotation. Then we systematically evaluate 4 different plan representations on the tasks categorized as hard (sequential subgoals, narrative, pseudocode, and checklist) across different families of multimodal LLM-powered agents (OpenAI, Alibaba, and Google). To account for stochastic variability, we introduce two novel evaluation metrics: Achievement Rate (AR) and Solved-Task Consistency (STC). Our results show that both the plan formulation and the underlying LLM generating the plan significantly influence web-agent robustness and task success.

2605.29926 2026-05-29 cs.LG

A Triple-Modal Contrastive Learning Framework with Sequence, Graph, and 3D Features for Drug-Target Interaction Prediction

一种融合序列、图和3D特征的三模态对比学习框架用于药物-靶标相互作用预测

Le Xu, Xi Zhang, Dan Luo, Ting Wang, Xuan Lin

AI总结 提出TriMod-DTI框架,通过融合药物和蛋白质的1D序列、2D图和3D结构,并采用三模态对比学习策略对齐潜在空间表示,从而提升药物-靶标相互作用预测性能。

详情
Comments
12 pages, 5 figures, ISBRA 2026
AI中文摘要

准确预测药物-靶标相互作用(DTI)对药物发现至关重要。现有方法通常依赖单模态表示(如序列或图)或仅结合两种模态,忽视了3D结构特征。为解决这一挑战,我们提出TriMod-DTI,一种三模态对比学习框架,融合药物和蛋白质的1D序列、2D图和3D结构,获得用于DTI预测的通用且互补的特征表示。我们设计了一个特征提取器,用于捕获三种模态下的药物和靶标特征,从而丰富其表示。我们进一步提出了一种三模态对比学习策略,以在潜在空间中对齐同一药物或蛋白质的不同模态表示。通过构建跨模态的正负样本对,该方法增强了模型的判别能力。在三个基准数据集上的实验表明,TriMod-DTI优于最先进的方法。消融研究验证了每种模态的贡献。此外,案例研究突显了其在DTI预测和药物发现中的实际潜力。

英文摘要

Accurate prediction of drug-target interactions (DTI) is critical for drug discovery. Existing methods often rely on single-modal representations (e.g., sequences or graphs) or combine only two modalities, overlooking 3D structural features. To address this challenge, we propose TriMod-DTI, a triple-modal contrastive learning framework that incorporates 1D sequences, 2D graphs, and 3D structures of drugs and proteins, obtaining the universal and complementary feature representations for DTI prediction. We design a Feature Extractor to capture drug and target features across the three modalities, thereby enriching their representations. We further propose a triple-modal contrastive learning strategy to align different modal representations of the same drug or protein in the latent space. By constructing cross-modal positive and negative sample pairs, this approach enhances the model's discriminative ability. Experiments on three benchmark datasets demonstrate that TriMod-DTI outperforms state-of-the-art methods. The ablation studies validate the contributions of each modality. Moreover, case studies highlight its practical potential for DTI prediction and drug discovery.
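The abstract describes aligning three modality representations of the same drug or protein with cross-modal positive and negative pairs, without giving the loss. The sketch below is one plausible reading, assuming a one-directional InfoNCE loss averaged over the three modality pairs. The names `info_nce` and `tri_modal_loss`, the temperature, and the toy embeddings are hypothetical and are not taken from TriMod-DTI.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] and positives[i] are a positive pair,
    and every other positives[j] acts as a negative for anchors[i]."""
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    loss = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, p) / temperature for p in positives]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        loss += log_sum_exp - logits[i]
    return loss / len(anchors)


def tri_modal_loss(seq, graph, struct, temperature=0.1):
    """Average the pairwise InfoNCE losses over the three modality pairs."""
    pairs = [(seq, graph), (seq, struct), (graph, struct)]
    return sum(info_nce(a, b, temperature) for a, b in pairs) / len(pairs)


# Two hypothetical molecules with 2-d embeddings per modality.
identity = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
aligned = tri_modal_loss(identity, identity, identity)
misaligned = tri_modal_loss(identity, swapped, identity)
```

When the graph embeddings are swapped between the two molecules, the loss rises sharply, since each anchor is now closer to a negative than to its own positive.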

2605.29919 2026-05-29 cs.AI cs.MA

On the Geometry of Games and their Solvers

论博弈及其求解器的几何结构

Yaqi Sun, Julian Ma, David Mguni

AI总结 提出一种结构感知的求解器合成框架,通过学习连续求解器对齐的博弈几何表示,实现自适应均衡计算并揭示求解器行为的连续区域。

详情
AI中文摘要

博弈论和生成对抗网络等学习系统中的一个核心挑战是理解哪些算法能够在异质博弈景观中高效计算均衡。均衡计算通常按求解器和博弈类别分别研究,产生了强局部保证但碎片化的求解器行为视图。现有的离散分类法往往无法完整解释算法成功的原因。我们通过一个将博弈与有效求解器动力学联系起来的求解器-博弈映射来研究这一问题。经典理论识别出该映射的孤立区域,但对中间或重叠区域提供的见解有限,表明可解性由定义连续求解器对齐博弈几何的潜在结构属性控制。我们通过结构感知的求解器合成来形式化这一视角。一个学习到的结构识别器将每个博弈映射到低维求解器对齐表示,一个策略将该表示映射到有效的原始机制,从而跨区域调整求解器行为。这揭示了特定求解器动力学有效的区域,以及需要原始机制混合而非单一主导求解器的区域。一个有界残差充当局部校正器和诊断信号,用于不完整的求解器基或表示。该框架同时产生自适应求解器和分析视角:具有相似优化动力学的博弈聚类在一起,揭示了算法有效性的连续区域和重叠的求解器行为。实验表明,固定原始机制表现出系统性的区域不匹配,而学习到的表示将博弈空间组织成与求解器行为对齐的结构化地图。这些结果表明,应将均衡计算视为学习求解器机制和映射可解性几何的联合问题。

英文摘要

A central challenge in game theory and learning systems such as GANs is understanding which algorithms can efficiently compute equilibria across the heterogeneous landscape of games. Equilibrium computation is typically studied solver by solver and game class by game class, yielding strong local guarantees but a fragmented view of solver behaviour. Existing discrete taxonomies often provide an incomplete account of where algorithms succeed. We study this problem through a solver-game map linking games to effective solver dynamics. Classical theory identifies isolated regions of this map but provides limited insight into intermediate or overlapping regimes, suggesting that solvability is governed by latent structural properties defining a continuous solver-aligned geometry of games. We formalise this perspective through structure-aware solver synthesis. A learned structure recogniser maps each game to a low-dimensional solver-aligned representation, and a policy maps this representation to effective primitive mechanisms, adapting solver behaviour across regimes. This reveals regions where particular solver dynamics are effective and where mixtures of primitives are required rather than a single dominant solver. A bounded residual acts as a local corrector and diagnostic signal for incomplete solver bases or representations. The framework yields both an adaptive solver and an analytical lens: games with similar optimisation dynamics cluster together, revealing continuous regions of algorithmic validity and overlapping solver behaviour. Empirically, we show that fixed primitives exhibit systematic regime mismatch, while the learned representation organises game space into a structured cartography aligned with solver behaviour. These results suggest viewing equilibrium computation as the joint problem of learning solver mechanisms and mapping the geometry of solvability.

2605.29913 2026-05-29 cs.IT cs.LG math.IT

Gesture-Aware Indoor THz ISAC Systems for Adaptive Resource Allocation

基于手势感知的室内太赫兹ISAC系统自适应资源分配

Zhonghao Liu, Yinchao Yang, Yahao Ding, Yixuan Wang, Mohammad Shikh-Bahaei

AI总结 针对太赫兹频段多用户室内集成感知与通信系统,提出基于扩展卡尔曼滤波手势跟踪的自适应联合优化算法,通过动态调整功率分配和波束赋形,在满足手势相关通信服务质量约束下最大化感知信干噪比。

详情
Comments
6 pages, 4 figures, conference (submitted to PIMRC)
AI中文摘要

本文研究了一种在太赫兹频段运行的多用户室内集成感知与通信系统,该系统设计用于基于手势识别的自适应通信。通过扩展卡尔曼滤波器进行手势跟踪,接入点根据检测到的手势变化动态调整资源分配,从而提高感知精度。基于手势识别结果,接入点进一步更新不同用户的通信质量需求,实现高效的资源分配。为此,开发了一种功率分配和波束赋形的自适应联合优化算法,在满足手势相关的通信服务质量约束下,最大化整体感知信干噪比。仿真结果表明,与传统的单变量优化基线相比,所提方法能有效响应手势动态,实现更优的感知精度和通信性能。

英文摘要

This paper investigates a multi-user indoor integrated sensing and communication (ISAC) system operating in the terahertz (THz) band, designed for adaptive communication based on gesture recognition. Leveraging gesture tracking through an extended Kalman filter (EKF), the access point (AP) dynamically adjusts resource allocation in response to detected gesture variations, thereby improving sensing accuracy. Based on the gesture recognition results, the AP further updates the communication quality requirements of different users, enabling efficient resource allocation. To this end, an adaptive joint optimization algorithm for power allocation and beamforming is developed to maximize the overall sensing signal-to-interference-plus-noise ratio (SINR) while satisfying the gesture-dependent communication quality of service (QoS) constraints. Simulation results demonstrate that the proposed method effectively responds to gesture dynamics, achieving superior sensing accuracy and communication performance compared with conventional single-variable optimization baselines.
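For readers unfamiliar with the extended Kalman filter mentioned above, the following is a minimal scalar sketch of one predict/update cycle. The random-walk state model, the `sin` measurement function, and all noise values are assumptions chosen for illustration; the paper's actual gesture state and THz measurement models are not given in the abstract.

```python
import math


def ekf_step(x, p, z, q, r):
    """One scalar EKF cycle.

    Assumed model (illustrative only):
      state:       x_k = x_{k-1} + process noise with variance q
      measurement: z_k = sin(x_k) + measurement noise with variance r
    Returns the updated state estimate and its variance.
    """
    # Predict: a random-walk state keeps its mean, and uncertainty grows.
    x_pred = x
    p_pred = p + q

    # Linearize the measurement function around the prediction.
    h_jacobian = math.cos(x_pred)

    # Update: the Kalman gain weighs prediction against measurement.
    innovation = z - math.sin(x_pred)
    innovation_var = h_jacobian * p_pred * h_jacobian + r
    gain = p_pred * h_jacobian / innovation_var
    x_new = x_pred + gain * innovation
    p_new = (1.0 - gain * h_jacobian) * p_pred
    return x_new, p_new


x_est, p_est = ekf_step(x=0.0, p=1.0, z=0.5, q=0.0, r=1.0)
```

With equal prior and measurement variances and a unit Jacobian at `x = 0`, the update moves the estimate halfway toward the measurement and halves the variance.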

2605.29911 2026-05-29 cs.LG cs.CV

Reducing Experimental Testing in Space Propulsion Film Cooling Analyses by Pixelwise Generative Image Interpolation

通过逐像素生成图像插值减少空间推进薄膜冷却分析中的实验测试

Adam T. Müller, Philipp J. Teuffel, Konstantin Manassis, Nicolaj C. Stache

AI总结 提出一种基于轻量级前馈神经网络和位置编码的机器学习方法,从稀疏实验测量中进行图像回归,以减少推进系统薄膜冷却研究中的物理测试需求。

详情
Comments
Presented at the 11th European Conference for Aeronautics and Aerospace Sciences (EUCASS), 2025, DOI: 10.13009/EUCASS2025-285
AI中文摘要

我们提出了一种从稀疏实验测量中进行图像回归的机器学习方法。我们展示了该方法在推进系统开发中薄膜冷却研究中的应用,旨在减少对大量物理测试的需求。我们的方法采用带有位置编码的轻量级前馈神经网络,根据输入参数生成图像。在真实和合成数据上的验证表明,该方法在减少30%测量量的同时,实现了高图像相似度(RMSE < 8%,SSIM > 93%)。我们进一步提出了一种知识驱动的扩展,用于生成图像的局部适应性。该方法显著减少了所需测试次数,同时保持了高质量数据,从而能够高效优化冷却剂喷射器配置,其应用范围超越航空航天领域。

英文摘要

We propose a machine learning approach for image regression from sparse experimental measurements. We show the application of the proposed method to film cooling studies in propulsion system development, aiming to reduce the need for extensive physical testing. Our method employs a lightweight feed-forward neural network with positional encoding to generate images conditioned on input parameters. Validated on real and synthetic data, it achieves high image similarity (RMSE < 8%, SSIM > 93%) while maintaining accuracy with a 30% reduction of measurements. We further propose a knowledge-informed extension for local adaptability of the generated images. This approach significantly reduces required tests while preserving high-quality data, enabling efficient optimization of coolant injector configurations with applications beyond aerospace.
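The abstract names positional encoding as the input representation for a pixelwise feed-forward network but does not specify it. A common choice is sinusoidal (Fourier-feature) encoding of normalized pixel coordinates, sketched below. The frequency schedule, the function names, and the 4x4 image size are assumptions for this example rather than the authors' configuration.

```python
import math


def positional_encoding(x, num_frequencies=4):
    """Map a scalar coordinate x in [0, 1] to sin/cos features at
    frequencies pi, 2*pi, 4*pi, ... (doubling at each step)."""
    features = []
    for k in range(num_frequencies):
        freq = (2 ** k) * math.pi
        features.append(math.sin(freq * x))
        features.append(math.cos(freq * x))
    return features


def encode_pixel(row, col, height, width, num_frequencies=4):
    """Encode one pixel position; height and width must both exceed 1."""
    y = row / (height - 1)
    x = col / (width - 1)
    return (positional_encoding(y, num_frequencies)
            + positional_encoding(x, num_frequencies))


# Encode the top-right pixel of a hypothetical 4x4 image.
features = encode_pixel(row=0, col=3, height=4, width=4)
```

In a pixelwise setup of this kind, the encoded coordinates would be concatenated with the experimental input parameters and fed to the network, which then predicts the value of that single pixel.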

2605.29910 2026-05-29 cs.SE cs.AI

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

Agora: 面向生产级共识协议中自主漏洞检测的LLM智能体

Xiang Liu, Sa Song, Zhaowei Zhang, Huiying Lan, Jason Zeng, Ming Wu, Michael Heinrich, Yong Sun, Ceyao Zhang

AI总结 提出Agora,一个领域感知的多智能体框架,通过假设驱动测试和LLM协作,在Raft、EPaxos、HotStuff、BullShark四个共识实现中发现15个未知协议级逻辑漏洞,而现有LLM方法未能检测到任何此类漏洞。

详情
Comments
35 pages, 4 figures
AI中文摘要

共识协议构成了分布式系统和区块链的骨干,其中的实现漏洞可能导致数据损坏和财务损失。虽然基于LLM的方法在代码分析中显示出前景,但它们难以处理涉及跨多个执行阶段的复杂状态依赖行为的深层协议级逻辑漏洞。我们提出Agora,一个领域感知的多智能体框架,将假设驱动测试与LLM能力相结合,用于系统性的协议验证。Agora采用专门的智能体,协作探索协议状态空间,使用领域特定约束综合攻击场景,并通过迭代细化验证发现。这种明确的角色分离使得能够推理全局协议不变量,超越单函数代码分析。我们在四个共识实现(Raft、EPaxos、HotStuff、BullShark)上使用四个最先进的LLM评估了Agora。Agora发现了15个先前未知的违反安全属性的协议级逻辑漏洞,而现有的基于LLM的智能体未能检测到任何此类协议级逻辑漏洞。我们的结果表明,领域感知的多智能体协作对于检测复杂协议中的深层逻辑漏洞至关重要。

英文摘要

Consensus protocols form the backbone of distributed systems and blockchains, where implementation bugs can cause data corruption and financial losses. While LLM-based approaches show promise in code analysis, they struggle with deep protocol-level logic bugs involving complex state-dependent behaviors across multiple execution stages. We present Agora, a domain-aware multi-agent framework that integrates hypothesis-driven testing with LLM capabilities for systematic protocol verification. Agora employs specialized agents that collaboratively explore protocol state spaces, synthesize attack scenarios using domain-specific constraints, and validate findings through iterative refinement. This explicit role separation enables reasoning about global protocol invariants beyond single-function code analysis. We evaluate Agora on four consensus implementations (Raft, EPaxos, HotStuff, BullShark) using four state-of-the-art LLMs. Agora discovers 15 previously unknown protocol-level logic bugs that violate safety properties, while existing LLM-based agents fail to detect any such protocol-level logic bugs. Our results demonstrate that domain-aware multi-agent collaboration is essential for detecting deep logic bugs in complex protocols.

2605.29908 2026-05-29 stat.ML cs.LG

Joint Model and Data Sparsification via the Marginal Likelihood

通过边际似然进行联合模型与数据稀疏化

Alexander Timans, Thomas Möllenhoff, Christian A. Naesseth, Mohammad Emtiyaz Khan, Eric Nalisnick

AI总结 提出通过边际似然联合学习特征和样本相关性,实现同时模型与数据稀疏化的贝叶斯方法,在保持共轭性和闭式更新的同时提升鲁棒性。

详情
Comments
36 pages, 8 figures, 12 tables (incl. appendix); published at ICML 2026
AI中文摘要

线性系统中的稀疏恢复支撑着从信号处理到高维回归的应用。基于自动相关性确定(ARD)原理的稀疏贝叶斯学习,通过边际似然优化为特征稀疏性提供了一种实用的贝叶斯机制。然而,其对同方差噪声模型的依赖使其对数据污染(如异常值或错误指定的噪声)敏感,损害了模型拟合和预测。相反,我们提出联合学习个体特征和样本相关性,通过单一贝叶斯目标实现同时模型与数据稀疏化。这种模型和数据的对称剪枝提供了一种自然扩展,保持了共轭性,允许标准优化过程的闭式更新,并与鲁棒回归和影响函数的观点一致。跨多种回归任务的实证结果证实,联合ARD方法一致地产生稀疏且鲁棒的预测模型。

英文摘要

Sparse recovery in linear systems underpins applications from signal processing to high-dimensional regression. Sparse Bayesian Learning, grounded in the principle of automatic relevance determination (ARD), offers a practical Bayesian mechanism for feature sparsity via marginal likelihood optimization. Yet, its reliance on a homoscedastic noise model renders it sensitive to data contaminations such as outliers or misspecified noise, harming model fit and predictions. Instead, we propose jointly learning individual feature and sample relevancies, enabling simultaneous model and data sparsification via a single Bayesian objective. This symmetric pruning of model and data offers a natural extension that preserves conjugacy, admits closed-form updates for standard optimization procedures, and aligns with perspectives from robust regression and influence functions. Empirical results across diverse regression tasks affirm that a joint ARD approach consistently yields both sparse and robust prediction models.

2605.29901 2026-05-29 cs.CR cs.LG

Dissecting the Black Box: Circuit-Level Analysis of LLM Vulnerability Detection

剖析黑箱:LLM 漏洞检测的电路级分析

Syafiq Al Atiiq, Chun Zhou, Christian Gehrmann

AI总结 通过机械可解释性分析 Gemma-2-2b 模型在 C/C++ 漏洞检测中的内部计算,发现模型主要依赖安全检测器(识别安全编码模式的注意力头)而非直接检测漏洞特征,并识别出关键神经组件(早期层注意力头和 MLP 神经元),通过消融实验验证其因果作用。

详情
Comments
11 pages, 6 figures. Supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP)
AI中文摘要

大型语言模型(LLM)能够检测软件漏洞,但它们实际上是如何识别易受攻击的代码的呢?我们利用机械可解释性来回答这个问题;分析神经网络的内部计算以理解其推理过程。通过在 Gemma-2-2b 上使用 Circuit Tracer,我们追踪了模型将 472 个 C/C++ 代码样本分类为易受攻击或安全时所激活的计算路径。我们的分析揭示了一个令人惊讶的发现:模型主要依赖安全检测器(即识别安全编码模式的注意力头),而不是直接检测漏洞特征。当这些安全检测器未能激活时,模型将代码分类为易受攻击。我们识别出了关键的神经组件:早期层(L5、L7)中专注于安全模式的特定注意力头,以及第 7 层中编码漏洞相关特征的多层感知器(MLP)神经元。消融实验证实了它们的因果作用;移除第 11 层会使漏洞检测准确率从 100% 降至 6%,而仅消融第 7 层中的 20 个神经元就会使其降低 50%。我们的发现表明,LLM 漏洞检测使用了稀疏、可解释的电路(仅占模型容量的 16%),从而能够为安全预测提供电路级解释,并有针对性地改进检测系统。

英文摘要

Large language models (LLMs) can detect software vulnerabilities, but how do they actually identify vulnerable code? We address this question using mechanistic interpretability: analyzing the internal computations of a neural network to understand its reasoning process. Using Circuit Tracer on Gemma-2-2b, we trace the computational pathways activated when the model classifies 472 C/C++ code samples as vulnerable or safe. Our analysis reveals a surprising finding: the model primarily relies on safety detectors, attention heads that recognize safe coding patterns, rather than directly detecting vulnerability signatures. When these safety detectors fail to activate, the model classifies code as vulnerable. We identify the critical neural components: specific attention heads in early layers (L5, L7) that focus on safety patterns, and Multilayer Perceptron (MLP) neurons in Layer 7 that encode vulnerability-related features. Ablation experiments confirm their causal role: removing Layer 11 drops vulnerability detection accuracy from 100% to 6%, while ablating just 20 neurons in Layer 7 reduces it by 50%. Our findings show that LLM vulnerability detection uses sparse, interpretable circuits (only 16% of model capacity), enabling circuit-level explanations for security predictions and targeted improvements to detection systems.
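The ablation experiments described above rest on a simple idea: zero out selected units and measure how the output changes. The sketch below shows zero-ablation on a toy one-hidden-layer MLP. It does not use Gemma-2-2b or the Circuit Tracer API; the weights, input, and function name are invented purely for illustration.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def mlp_forward(x, w_in, w_out, ablated=frozenset()):
    """One hidden ReLU layer with a scalar output.

    Hidden units whose index appears in `ablated` are forced to zero,
    which is the zero-ablation intervention.
    """
    hidden = []
    for j, weights in enumerate(w_in):
        activation = relu(sum(w * xi for w, xi in zip(weights, x)))
        hidden.append(0.0 if j in ablated else activation)
    return sum(w * h for w, h in zip(w_out, hidden))


# Toy network: 2 inputs, 3 hidden units, 1 output (all values hypothetical).
w_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_out = [0.5, -0.25, 2.0]
x = [1.0, 2.0]

baseline = mlp_forward(x, w_in, w_out)
without_unit_2 = mlp_forward(x, w_in, w_out, ablated={2})
effect = baseline - without_unit_2  # causal contribution of hidden unit 2
```

A large drop after removing a unit suggests the unit is causally important for that output, which is the logic behind the layer-level and neuron-level ablations reported in the abstract.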

2605.29900 2026-05-29 cs.LG cs.IT math.IT

OVA-IB: One vs All Information Bottleneck for Multi-Modal Alignment

OVA-IB:用于多模态对齐的一对多信息瓶颈

Tianchao Li, Shujian Yu, Xinrui Zu, Zhaolong Wei, Jeremy Gummeson, Jack C. P. Cheng, Robert Jenssen

AI总结 提出基于信息瓶颈的一对多对齐框架OVA-IB,通过充分性对比下界和最小性正则化实现任意数量模态的对齐,在分类、回归和跨模态检索任务中表现鲁棒。

详情
AI中文摘要

对比学习对于对齐配对视图或模态是有效的,但超出两个模态的对齐仍然具有挑战性且相对未被充分探索。成对的CLIP风格损失将多模态对齐分解为独立的双向比较,因此没有显式建模多个模态之间的高阶依赖关系。最近的超越成对目标从统计或几何角度处理这个问题,但任意模态对齐仍然缺乏一个原则性的标准来定义每个模态相对于其他模态应该保留和压缩什么。我们通过信息瓶颈原则重新审视任意模态对齐。在多模态学习中,充分性应保留可从其余模态预测的信息,而最小性应压缩不被其余模态支持的模态特定信息。这自然导致一对多视角,其中每个模态相对于其余模态进行表征。我们提出OVA-IB,一个用于任意模态对齐的信息瓶颈框架。OVA-IB优化一个可处理的一对多对比下界用于充分性,该下界与双总相关风格目标相连,使用无参数的几何感知投影分数,并通过用其余模态诱导的表示分布来约束每个表示对其自身输入的依赖,导出一个可处理的上界正则化器用于最小性。在分类、回归、模态无关评估和跨模态检索基准上的实验展示了强大且鲁棒的性能。

英文摘要

Contrastive learning is effective for aligning paired views or modalities, but alignment beyond two modalities remains non-trivial and comparatively underexplored. Pairwise CLIP-style losses decompose multi-modal alignment into independent two-way comparisons and therefore do not explicitly model higher-order dependencies among multiple modalities. Recent beyond-pairwise objectives approach this problem from statistical or geometric perspectives, but arbitrary-modality alignment still lacks a principled criterion for defining what each modality should preserve and compress relative to the others. We revisit arbitrary-modality alignment through the Information Bottleneck principle. In multi-modal learning, sufficiency should preserve information predictable from the remaining modalities, while minimality should compress modality-specific information not supported by them. This naturally leads to a One-vs-All view, where each modality is characterized with respect to the remaining modalities. We propose OVA-IB, an Information Bottleneck framework for arbitrary-modality alignment. OVA-IB optimizes a tractable One-vs-All contrastive lower bound for sufficiency connected to a Dual Total Correlation-style objective, uses a parameter-free geometry-aware projection score, and derives a tractable upper-bound regularizer for minimality by bounding each representation's dependence on its own input with representation distributions induced by the remaining modalities. Experiments on classification, regression, modality-agnostic evaluation, and cross-modal retrieval benchmarks demonstrate strong and robust performance.

2605.29897 2026-05-29 cs.CL

ExCAM: Explainable Cultural Awareness Metrics

ExCAM:可解释的文化意识度量

Christoph Leiter, Haiyue Song, Hour Kaing, Jin Tei, Hideki Tanaka, Masao Utiyama, Steffen Eger

AI总结 提出ExCAM,首个可识别、评分并解释指令-输出对中文化错误的专用评估度量,在平衡测试集上达到80%准确率。

详情
Comments
preprint
AI中文摘要

评估大型语言模型的文化意识对于确保生成文本的公平性和应用在全球范围内的泛化能力至关重要。最近的基准通过问答或文本生成任务探索食物等文化物品或压力情境下的行为等价值观。然而,创建这些基准需要耗时且昂贵的人工标注。此外,评估自由文本中文化意识的基准很少,且往往依赖过时的评估机制。为弥补这一空白,我们引入了ExCAM,一种可解释的文化意识度量,据我们所知,这是第一个专门用于识别、评分和解释指令-输出对中文化错误的评估度量。为了训练和评估ExCAM,我们引入了ExCAM40k,一个由九个现有基准组成的数据集,我们对其进行了重新格式化并增加了合成错误。与包括GPT-5在内的多个基线相比,ExCAM在平衡测试集上实现了高达80%的最高错误检测准确率。因此,ExCAM为自由文本的细粒度、可解释的文化评估开辟了道路。

英文摘要

Evaluating the cultural awareness of large language models is crucial to ensure the fairness of generated text and the generalizability of applications across the world. Recent benchmarks explore cultural goods like food or values like behavior in stressful situations through the lens of question answering or text generation tasks. However, creating these benchmarks requires time-intensive and costly human annotations. Also, benchmarks that evaluate cultural awareness in free text are scarce and often rely on dated evaluation mechanisms. To address this gap, we introduce ExCAM, an Explainable Cultural Awareness Metric, which is, to our knowledge, the first dedicated evaluation metric that identifies, rates and explains cultural errors in instruction-output pairs. To train and evaluate ExCAM, we introduce ExCAM40k, a dataset comprised of nine existing benchmarks that we reformat and enhance with synthetic errors. Compared to several baselines, including GPT-5, ExCAM achieves the highest error detection rate with up to 80% accuracy on a balanced test set. Therefore, ExCAM opens the pathway towards fine-grained and explainable cultural evaluation of free text.

2605.29894 2026-05-29 cs.CV

Train the Agent, Not the Expert: Learning to Harness Heterogeneous Experts for Multi-Turn Visual Reasoning

训练智能体而非专家:学习利用异构专家进行多轮视觉推理

Yaowu Fan, Tao Han, Dazhao Du, Andy J. Ma, Jia Wan

AI总结 提出VisHarness,一种可训练的视觉智能体,通过解耦高层感知推理与低层任务执行,学习利用异构视觉专家模型,以轻量训练实现多轮交互下的通用视觉任务求解。

详情
AI中文摘要

计算机视觉的最新进展产生了大量用于检测、分割、计数和其他视觉任务的强大专用模型。然而,这些模型通常针对孤立的任务形式进行优化,使得直接支持通用视觉智能变得困难,尤其是当任务需要复杂的语言理解和密集的小物体感知时。在本文中,我们提出了VisHarness,一种可训练的视觉智能体,它将高层感知、推理和决策与低层任务执行解耦。VisHarness不是训练模型来解决特定的视觉任务,而是学习利用一组精心设计的异构视觉专家。这种范式保留了智能体的通用智能,同时充分利用了专用视觉模型在具体视觉任务中的精度优势。仅通过轻量训练,VisHarness就能学习到可泛化的视觉专家利用策略,并通过与视觉专家模型的多轮交互,在各种复杂条件下解决常见的基础视觉任务。为了在实时环境中实现高效的在策略强化学习训练,我们引入了动态视觉记忆归档,这缓解了与视觉专家模型多轮交互导致的快速累积的视觉令牌开销。在涵盖推理分割、广义指代分割、密集小物体检测和指代计数的四个代表性基准上的实验表明,VisHarness显著优于现有的通用模型,并与任务专用模型相比取得了具有竞争力或更优的性能。

英文摘要

Recent progress in computer vision has produced a wide range of powerful specialized models for detection, segmentation, counting, and other visual tasks. However, these models are usually optimized for isolated task formulations, making it difficult to directly support general-purpose visual intelligence, especially when a task requires complex language understanding and dense small-object perception. In this paper, we propose VisHarness, a trainable visual agent that decouples high-level perception, reasoning, and decision-making from low-level task execution. Instead of training a model to solve a specific visual task, VisHarness learns to harness a set of carefully designed heterogeneous visual experts. This paradigm preserves the general intelligence of the agent while fully leveraging the precision advantages of specialized visual models in concrete visual tasks. With only lightweight training, VisHarness learns a generalizable visual expert-harnessing policy and can solve common fundamental vision tasks under various complex conditions through multi-turn interactions with visual expert models. To enable efficient on-policy reinforcement learning training in a live environment, we introduce dynamic visual memory archiving, which mitigates the rapidly accumulating visual-token overhead caused by multi-turn interactions with visual expert models. Experiments on four representative benchmarks covering reasoning segmentation, generalized referring segmentation, dense small-object detection, and referring counting demonstrate that VisHarness substantially outperforms existing general-purpose models and achieves competitive or superior performance compared with task-specific models.

2605.29893 2026-05-29 cs.AI

Redundant or Necessary? A Benchmark for Detecting Redundant Steps in Agent Trajectories

冗余还是必要?检测智能体轨迹中冗余步骤的基准

Minyang Hu, Bo Yang, Zhinuo Zhou, Jiachen Liang, Guo Jiahao, Yiyang Yin, Xiongwei Han

AI总结 针对LLM智能体轨迹中的冗余步骤检测问题,提出RedundancyBench基准,包含标注轨迹的数据集,并评估三种方法,发现最佳方法仅达到24.88%的检测分数。

详情
AI中文摘要

基于LLM的智能体通过多步推理和工具使用在解决复杂任务方面表现出强大的能力。然而,现有的评估协议主要关注任务成功,忽略了智能体行为的一个关键方面:执行效率。在实践中,智能体轨迹通常包含冗余步骤,这些步骤消耗大量资源但对任务完成贡献甚微。在这项工作中,我们提出并定义了一个新的研究领域:智能体轨迹的冗余步骤检测。为了支持这一倡议,我们引入了RedundancyBench,这是一个新的基准,包含多样化的任务和精心标注的轨迹,其中每个步骤根据其对任务完成的贡献进行标记。利用RedundancyBench,我们开发并评估了3种代表性方法,以回答轨迹中的步骤是冗余还是必要的问题。我们的结果表明,即使是最优方法在检测冗余步骤方面也仅达到24.88%的分数,而有些方法的表现甚至不如随机猜测。这些结果突显了该任务的复杂性以及在该领域进一步研究的必要性。本文的代码和数据集均可在 https://anonymous.4open.science/r/RedundancyBench 获取。

英文摘要

LLM-based agents have demonstrated strong capabilities in solving complex tasks through multi-step reasoning and tool use. However, existing evaluation protocols primarily focus on task success, overlooking a critical aspect of agent behavior: execution efficiency. In practice, agent trajectories often contain redundant steps that consume substantial resources while contributing little to task completion. In this work, we propose and formulate a new research area: redundant step detection for agent trajectories. To support this initiative, we introduce RedundancyBench, a new benchmark that contains diverse tasks with carefully annotated trajectories, where each step is labeled according to its contribution to task completion. Using RedundancyBench, we develop and evaluate 3 representative methods to answer whether a step within a trajectory is redundant or necessary. Our results show that even the best-performing method achieves a score of only 24.88% in detecting redundant steps, while some methods perform worse than random guessing. These results highlight the task's complexity and the need for further research in this area. Code and dataset are both available at https://anonymous.4open.science/r/RedundancyBench.

2605.29891 2026-05-29 cs.CV

DVSM: Decoder-only View Synthesis Model Done Right

DVSM: 正确的仅解码器视图合成模型

Cheng Sun, Jaesung Choe, Min-Hung Chen, Ryo Hachiuma, Yu-Chiang Frank Wang

AI总结 提出仅解码器架构DVSM,通过隐式KV-cache表示场景,在相同渲染复杂度下以更少参数超越编码器-解码器变体,并利用共享权重、基础模型先验和分阶段块大小优化效率与质量,在多个基准上实现新视点合成的最优结果。

详情
Comments
Code at https://github.com/NVLabs/dvsm
AI中文摘要

近期的大型视图合成模型(LVSMs)倡导一种编码器-解码器架构,将重建和渲染分离到不同的网络中。我们重新审视了这种设计。通过控制实验,我们表明仅解码器架构(将场景隐式表示为KV-cache)在相同渲染复杂度下使用更少参数,性能优于编码器-解码器变体。进一步分析表明,在颜色输入重建网络和仅相机渲染网络之间共享权重,能更好地对齐同一视点下的特征,从而促进图像合成。基于这一发现,我们的模型DVSM进一步结合了基础模型先验和分阶段块大小调整,以改进效率与质量的权衡。我们的结果在多个基准上为新颖视图合成设立了新的最先进水平,在某些情况下,甚至在密集输入视图下优于每场景优化的3DGS。

英文摘要

Recent Large View Synthesis Models (LVSMs) advocate an encoder-decoder architecture that separates reconstruction and rendering into distinct networks. We re-examine this design. Through controlled experiments, we show that a decoder-only architecture, which represents scenes implicitly as a KV-cache, outperforms encoder-decoder variants while using fewer parameters at identical rendering complexity. Further analysis shows that sharing weights between the color-input reconstruction network and the camera-only rendering network better aligns their features at the same viewpoint, facilitating image synthesis. Building on this finding, our model, dubbed DVSM, further incorporates foundation model priors and stage-wise patch sizing for an improved efficiency-quality tradeoff. Our results establish a new state of the art for novel-view synthesis across multiple benchmarks, in some cases even outperforming per-scene-optimized 3DGS under dense input views.

2605.29889 2026-05-29 cs.CL cs.AI

Internal Representation, Not Clinical Knowledge: Where Apparent LLM Triage Failures Originate

内部表示,而非临床知识:明显的大语言模型分诊失败源于何处

David Fraile Navarro, Berardino Como, Jialei Sheng, Soundariya Ananthan, Shlomo Berkovsky

AI总结 本研究通过稀疏自编码器特征分析,发现大语言模型在分诊任务中表现不佳源于输出格式限制,而非临床知识表示缺陷。

详情
Comments
9 pages main text, 27 pages total including appendices; 7 figures, 25 tables
AI中文摘要

患者语音临床分诊基准报告显示,在受限的多选输出中,消费级大语言模型存在较高的分诊不足率,但同样的案例在自由文本中得分不同。我们探究输出格式是否改变了模型的临床表示,还是仅改变了从保留表示到答案的映射。使用Gemma 3 4B/12B IT和Qwen3-8B中的稀疏自编码器(SAE)特征,我们发现相同的医学特征在两种格式下对共享临床叙述激活,但在所有模型的每个案例的多选决策标记处变得沉默。三种独立方法(自然语言自编码器言语化、决策标记logit归因和顶部特征表征)一致认为,驱动决策logit的是支架和格式特征,而非医学特征。行为上,多选惩罚在结构化和自然语言输入下均反转,选项顺序洗牌排除了位置偏差,且差距主要由偏差一个决策(模型选择与黄金答案相邻的敏锐度字母)主导,而非知识失败。因此,失败源于输出格式,而非临床表示。

英文摘要

Patient-voiced clinical-triage benchmarks report high under-triage rates for consumer LLMs under constrained multiple-choice output, yet the same cases score differently with free text. We ask whether output format changes the model's clinical representation or only the mapping from a preserved representation to an answer. Using sparse-autoencoder (SAE) features in Gemma 3 4B/12B IT and Qwen3-8B, we find the same medical features fire on the shared clinical narrative under both formats but go silent at the multiple-choice decision token in every case for every model. Three independent methods (natural-language autoencoder verbalization, decision-token logit attribution, and top-feature characterization) agree that scaffold and format features, but not medical features, drive the decision logits. Behaviorally, the multiple-choice penalty inverts under both structured and natural-language input, option-order shuffle rules out positional bias, and the gap is dominated by off-by-one decisions (the model picks an acuity letter adjacent to the gold answer) rather than knowledge failure. Thus, the failure originates in the output format and not in the clinical representation.

2605.29888 2026-05-29 cs.LG cs.AI

LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training

LaRA: 面向RL后训练中数据污染的逐层表示分析

Minju Gwak, Minseo Kwak, Dongseok Lee, Guijin Son, Alan Ritter, Jaehyung Kim

AI总结 提出LaRA框架,通过逐层表示分析检测强化学习后训练中的污染数据,利用扰动敏感性、方向坍缩和局部表示刚性三个指标,优于现有输出级方法。

详情
Comments
Work in Progress
AI中文摘要

强化学习(RL)后训练已被证明能提升大型语言模型(LLMs)的推理能力。然而,关于RL后训练中数据污染问题的探索很少,这可能损害训练过程本身的泛化能力和评估可靠性。现有的检测方法主要依赖于输出级信号,如似然或熵,这对于RL训练的模型变得不可靠,因为RL通过轨迹级奖励而非token似然来塑造行为。我们提出LaRA,一个用于检测RL后训练LLMs中数据污染的逐层表示分析框架。LaRA引入了三个互补指标,测量受控扰动下的扰动敏感性、方向坍缩和局部表示刚性。我们发现污染会在各层产生渐进式的几何偏差,包括放大的扰动敏感性、更强的方向坍缩和增强的局部刚性。基于我们的发现,我们还开发了一个污染检测协议,聚合跨层和跨指标的表示级偏差。在RL训练推理模型上的实验表明,我们的协议在污染检测方面优于现有的输出级基线。

英文摘要

Reinforcement learning (RL) post-training has been shown to improve reasoning in large language models (LLMs). However, there has been little exploration of the problem of data contamination in RL post-training, which can undermine generalization and the evaluation reliability of the training process itself. Existing detection methods primarily rely on output-level signals such as likelihood or entropy, which become unreliable for RL-trained models since RL shapes behavior through trajectory-level rewards rather than token likelihoods. We propose LaRA, a layer-wise representation analysis framework for detecting contamination in RL post-trained LLMs. LaRA introduces three complementary metrics, measuring perturbation sensitivity, directional collapse, and local representation rigidity under controlled perturbations. We find that contamination produces progressive geometric deviations across layers, including amplified perturbation sensitivity, stronger directional collapse, and enhanced local rigidity. Based on our findings, we also develop a contamination detection protocol that aggregates representation-level deviations across layers and metrics. Experiments on RL-trained reasoning models show that our protocol outperforms existing output-level baselines for contamination detection.