arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.10032 2026-05-14 cs.CV cs.RO

Perception with Guarantees: Certified Pose Estimation via Reachability Analysis

Tobias Ladner, Yasser Shoukry, Matthias Althoff

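Affiliations: Technical University of Munich, Germany; University of California, Irvine, USA

AI summary: This paper studies how to obtain 3D pose estimates with rigorous guarantees from visual information in safety-critical systems. The authors propose a certified pose estimation method that relies only on a monocular image and a known target geometry; it formally bounds the pose using reachability analysis and formal neural network verification, so that safety can be guaranteed even in the worst case. Experiments show that the method localizes efficiently and accurately in both synthetic and real-world settings, providing reliable assurance for safety-critical applications.

Comments: Accepted at Computer Aided Verification (CAV'2026)

Abstract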

Agents in cyber-physical systems are increasingly entrusted with safety-critical tasks. Ensuring safety of these agents often requires localizing the pose for subsequent actions. Pose estimates can, e.g., be obtained from various combinations of lidar sensors, cameras, and external services such as GPS. Crucially, in safety-critical domains, a rough estimate is insufficient to formally determine safety, i.e., guaranteeing safety even in the worst-case scenario, and external services might additionally not be trustworthy. We address this problem by presenting a certified pose estimation in 3D solely from a camera image and a well-known target geometry. This is realized by formally bounding the pose, which is computed by leveraging recent results from reachability analysis and formal neural network verification. Our experiments demonstrate that our approach efficiently and accurately localizes agents in both synthetic and real-world experiments.

2602.09628 2026-05-14 cs.RO

TeleGate: Whole-Body Humanoid Teleoperation via Gated Expert Selection with Motion Prior

Jie Li, Bing Tang, Feng Wu

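Affiliations: School of Computer Science and Technology, University of Science and Technology of China; AnyWit Robotics Co., Ltd., Shushan District, Hefei, Anhui, China

AI summary: This paper proposes TeleGate, a whole-body humanoid teleoperation framework that addresses the performance degradation of existing methods on complex dynamic motions. It trains a lightweight gating network that selects among expert policies at runtime based on proprioceptive states and reference trajectories, thereby preserving the full capability of each domain-specific expert policy. It also introduces a VAE-based motion prior module that extracts implicit future motion intent from historical observations, enabling anticipatory control for motions that require prediction, such as jumping and standing up. Experiments show that TeleGate, trained on only 2.5 hours of motion capture data, achieves high-precision real-time teleoperation across a variety of dynamic motions and significantly outperforms the baselines.

Comments: Accepted by RSS 2026. Project page: https://anywitresearch.github.io/TeleGate/

Abstract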

Real-time whole-body teleoperation is a critical method for humanoid robots to perform complex tasks in unstructured environments. However, developing a unified controller that robustly supports diverse human motions remains a significant challenge. Existing methods typically distill multiple expert policies into a single general policy, which often inevitably leads to performance degradation, particularly on highly dynamic motions. This paper presents TeleGate, a unified whole-body teleoperation framework for humanoid robots that achieves high-precision tracking across various motions while avoiding the performance loss inherent in knowledge distillation. Our key idea is to preserve the full capability of domain-specific expert policies by training a lightweight gating network, which dynamically activates experts in real-time based on proprioceptive states and reference trajectories. Furthermore, to compensate for the absence of future reference trajectories in real-time teleoperation, we introduce a VAE-based motion prior module that extracts implicit future motion intent from historical observations, enabling anticipatory control for motions requiring prediction such as jumping and standing up. We conducted empirical evaluations in simulation and also deployed our technique on the Unitree G1 humanoid robot. Using only 2.5 hours of motion capture data for training, our TeleGate achieves high-precision real-time teleoperation across diverse dynamic motions (e.g., running, fall recovery, and jumping), significantly outperforming the baseline methods in both tracking accuracy and success rate.

2602.08920 2026-05-14 cs.LG

Diffusion-Inspired Reconfiguration of Transformers for Uncertainty Calibration

Manh Cuong Dao, Quang Hung Pham, Phi Le Nguyen, Thao Nguyen Truong, Bryan Kian Hsiang Low, Trong Nghia Hoang

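Affiliations: National University of Singapore; Hanoi University of Science and Technology; National Institute of Advanced Industrial Science and Technology; Washington State University

AI summary: This paper examines the shortcomings of pre-trained transformers in uncertainty calibration and proposes a diffusion-inspired reconfiguration. The method models each feature transformation block as a probabilistic mapping and composes these mappings into a probability path resembling a diffusion process, so that uncertainty propagates in a principled way through the model's layers. Experiments show that the method significantly improves uncertainty calibration on multiple vision and language tasks while maintaining the original predictive performance.

Abstract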

Uncertainty calibration in pre-trained transformers is critical for their reliable deployment in risk-sensitive applications. Yet, most existing pre-trained transformers do not have a principled mechanism for uncertainty propagation through their feature transformation stack. In this work, we propose a diffusion-inspired reconfiguration of transformers in which each feature transformation block is modeled as a probabilistic mapping. Composing these probabilistic mappings reveals a probability path that mimics the structure of a diffusion process, transporting data mass from the input distribution to the pre-trained feature distribution. This probability path can then be recompiled on a diffusion process with a unified transition model to enable principled propagation of representation uncertainty throughout the pre-trained model's architecture while maintaining its original predictive performance. Empirical results across a variety of vision and language benchmarks demonstrate that our method achieves superior calibration and predictive accuracy compared to existing uncertainty-aware transformers.

2602.06475 2026-05-14 cs.LG

Towards Generalizable Reasoning: Group Causal Counterfactual Policy Optimization for LLM Reasoning

Jingyao Wang, Peizheng Guo, Wenwen Qiang, Jiahuan Zhou, Huijie Guo, Changwen Zheng, Hui Xiong

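Affiliations: Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Wangxuan Institute of Computer Technology, Peking University; The Hong Kong University of Science and Technology (Guangzhou)

AI summary: This work addresses the problem that reward mechanisms for large language model (LLM) reasoning depend heavily on final-answer correctness and neglect the reasoning process, and it proposes a policy optimization method based on causal counterfactuals. By treating multi-candidate reasoning as a set of causal counterfactual experiments, the method designs a new reward that encourages both robustness and effectiveness of the reasoning process, improving the generalization of the model's reasoning. Experiments show superior reasoning performance on multiple benchmarks.

Abstract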

Large language models (LLMs) excel at complex tasks with advances in reasoning capabilities. However, existing reward mechanisms remain tightly coupled to final correctness and pay little attention to the underlying reasoning process: trajectories with sound reasoning but wrong answers receive low credit, while lucky guesses with flawed logic may be highly rewarded, affecting reasoning generalization. From a causal perspective, we interpret multi-candidate reasoning for a fixed question as a family of counterfactual experiments with theoretical supports. Building on this, we propose Group Causal Counterfactual Policy Optimization to explicitly train LLMs to learn generalizable reasoning patterns. It proposes an episodic causal counterfactual reward that jointly captures (i) robustness, encouraging the answer distribution induced by a reasoning step to remain stable under counterfactual perturbations; and (ii) effectiveness, enforcing sufficient variability so that the learned reasoning strategy can transfer across questions. We then construct token-level advantages from this reward and optimize the policy, encouraging LLMs to favor reasoning patterns that are process-valid and counterfactually robust. Extensive experiments on diverse benchmarks demonstrate its advantages.

2602.06138 2026-05-14 cs.LG

Flow Matching for Offline Reinforcement Learning with Discrete Actions

Fairoz Nower Khan, Nabuat Zaman Nahim, Ruiquan Huang, Haibo Yang, Peizhong Ju

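Affiliations: Department of Computer Science, University of Kentucky; Department of Computing and Information Sciences, Rochester Institute of Technology

AI summary: This paper studies how to extend flow matching to offline reinforcement learning with discrete action spaces. The authors propose a general framework based on continuous-time Markov chains, trained with a Q-weighted flow matching objective, that also supports multiple objectives. In multi-agent settings, the method mitigates the exponential growth of the joint action space through a factorized conditional path, and its effectiveness is supported both theoretically and experimentally, notably in high-dimensional control, multi-agent games, and dynamically changing preferences over multiple objectives. The framework can also be applied to continuous-control problems through action quantization, offering a flexible trade-off between performance and complexity.

Abstract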

Generative policies based on diffusion models and flow matching have shown strong promise for offline reinforcement learning (RL), but their applicability remains largely confined to continuous action spaces. To address a broader range of offline RL settings, we extend flow matching to a general framework that supports discrete action spaces with multiple objectives. Specifically, we replace continuous flows with continuous-time Markov chains, trained using a Q-weighted flow matching objective. We then extend our design to multi-agent settings, mitigating the exponential growth of joint action spaces via a factorized conditional path. We theoretically show that, under idealized conditions, optimizing this objective recovers the optimal policy. Extensive experiments further demonstrate that our method performs robustly across diverse settings and benchmarks, including high-dimensional control, multi-agent games, and dynamically changing preferences over multiple objectives, while outperforming traditional offline RL methods in practical multi-modal decision-making scenarios. Our discrete framework can also be applied to continuous-control problems through action quantization, providing a flexible trade-off between representational complexity and performance.

2602.06104 2026-05-14 cs.LG stat.ML

Pragmatic Curiosity: A Unified Framework for Hybrid Learning and Optimization via Active Inference

Yingke Li, Anjali Parashar, Enlu Zhou, Chuchu Fan

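Affiliations: Department of Aeronautics and Astronautics, Massachusetts Institute of Technology; School of Industrial and Systems Engineering, Georgia Institute of Technology

AI summary: This paper proposes Pragmatic Curiosity (PraC), a unified framework for hybrid settings that combine learning and optimization, using active inference to make decisions efficiently. The method guides the selection of candidate queries by trading off information gain about a task-relevant latent symbol against expected regret over outcomes, reducing uncertainty while improving task performance. The study demonstrates PraC in several complex scenarios, including decision-oriented monitoring with fixed symbols, targeted active search with local symbols, and composite Bayesian optimization under unknown preferences, where it reduces decision risk, improves coverage of critical outcome regions, and jointly learns predictive and preference structures.

Abstract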

Many engineering and scientific workflows rely on expensive black-box evaluations, requiring sequential decisions that must both improve task performance and reduce uncertainty. Bayesian optimization (BO) and Bayesian experimental design (BED) provide powerful but largely separate treatments of goal-directed optimization and information-seeking experimentation, leaving limited guidance for hybrid settings in which learning and optimization are intrinsically coupled. We propose Pragmatic Curiosity (PraC), a unified framework for hybrid learning and optimization via active inference. PraC evaluates candidate queries by trading information gain about a task-relevant latent symbol against an expected regret-based potential over outcomes. This formulation exposes three operational design choices: which latent quantity should be clarified, how task value is encoded as regret, and how strongly information gain should be exchanged against pragmatic value. We instantiate PraC across three regimes of increasing complexity: decision-oriented plume monitoring with fixed global symbols and known downstream losses, targeted active search with induced local symbols and evolving coverage goals, and composite Bayesian optimization with hierarchical regret learning under unknown preferences. Across these regimes, PraC reduces downstream decision risk, improves coverage of critical outcome regions, and jointly learns predictive and preference structures without relying on task-specific staging rules.

2602.05000 2026-05-14 cs.LG cs.AI cs.CL

Entropy Aware Reward Guidance for Diffusion Language Model Alignment

Atula Tejaswi, Litu Rout, Constantine Caramanis, Sanjay Shakkottai, Sujay Sanghavi

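Affiliations: The University of Texas at Austin

AI summary: This paper studies alignment of discrete diffusion language models via reward guidance. To address the fact that discrete outputs cannot be differentiated through directly, it proposes a new mechanism, EntRGi, which dynamically combines continuous token relaxations with sampled hard tokens, interpolating according to the model's predictive entropy, so that optimization accuracy improves while the reward model remains reliable. Experiments show that the method outperforms existing approaches in both test-time adaptation and reward-guided reinforcement learning.

Comments: Preprint

Abstract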

Reward guidance, also known as posterior sampling, is a popular method for test-time adaptation and post-training in continuous diffusion models. In this paper, we study reward guidance for discrete diffusion language models; now, one cannot differentiate through the natural outputs of the model because they are discrete tokens. We introduce a novel mechanism called EntRGi (Entropy aware Reward Guidance) to address this issue. EntRGi dynamically interpolates between continuous token relaxations and sampled hard tokens, on a token-by-token basis, using the diffusion model's predictive entropy. We demonstrate that EntRGi maintains both reward model reliability and optimization accuracy, while existing approaches sacrifice one for the other. We empirically validate our approach on 7B-parameter diffusion language models across two settings: (1) test-time adaptation, and (2) RGRL (Reward Guided Reinforcement Learning), our recipe for post-training on reward-guided data, showing consistent improvements over state-of-the-art methods. Our code is available at https://atutej.github.io/entrgi-rgrl
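
The interpolation described in this abstract can be illustrated with a small forward-pass sketch. The abstract does not state the exact weighting rule, so the choice below (weight on the sampled hard token equal to the normalized predictive entropy) is an assumption, as are the function name `entropy_weighted_embeddings` and the toy embedding table. The sketch uses NumPy and therefore shows only the mixing step, not gradient flow through a reward model.

```python
import numpy as np

def entropy_weighted_embeddings(probs, embedding, rng):
    """Mix soft and hard token embeddings per position (illustrative only).

    probs:     (positions, vocab) predictive distributions from the model.
    embedding: (vocab, dim) token embedding table.
    Assumption: the hard-token weight is the entropy normalized by log(vocab).
    """
    probs = np.asarray(probs, dtype=float)
    vocab = probs.shape[-1]
    # Continuous relaxation: expected embedding under the predictive distribution.
    soft = probs @ embedding
    # Hard tokens: one sample per position from the same distribution.
    sampled = np.array([rng.choice(vocab, p=p) for p in probs])
    hard = embedding[sampled]
    # Predictive entropy, normalized to [0, 1].
    ent = -np.sum(probs * np.log(np.clip(probs, 1e-12, None)), axis=-1)
    w = ent / np.log(vocab)
    mixed = (1.0 - w)[:, None] * soft + w[:, None] * hard
    return mixed, w
```

Under this assumed rule, a confident position (entropy near zero) keeps its soft relaxation, which is already close to a real token embedding, while an uncertain position falls back to a sampled hard token so that the reward model sees an input resembling its training data.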

2602.04923 2026-05-14 cs.LG

Imposing Boundary Conditions on Neural Operators via Learned Function Extensions

Sepehr Mousavi, Siddhartha Mishra, Laura De Lorenzis

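Affiliations: Department of Mechanical and Process Engineering, ETH Zurich, Switzerland; Seminar for Applied Mathematics, ETH Zurich, Switzerland; ETH AI Center, Zurich, Switzerland

AI summary: This paper proposes a general framework for imposing boundary conditions on neural operators through learned function extensions, addressing the limitations of neural operators with complex, non-homogeneous boundary conditions. The core idea is to map boundary data to latent pseudo-extensions over the entire spatial domain, so that standard operator learning architectures can make effective use of boundary information. Experiments show that the method achieves higher accuracy than existing methods on several partial differential equation problems without hyperparameter tuning across datasets, demonstrating its effectiveness and practicality for scientific machine learning.

Abstract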

Neural operators have emerged as powerful surrogates for the solution of partial differential equations (PDEs), yet their ability to handle general, highly variable boundary conditions (BCs) remains limited. Existing approaches often fail when the solution operator exhibits strong sensitivity to boundary forcings. We propose a general framework for conditioning neural operators on complex non-homogeneous BCs through function extensions. Our key idea is to map boundary data to latent pseudo-extensions defined over the entire spatial domain, enabling any standard operator learning architecture to consume boundary information. The resulting operator, coupled with an arbitrary domain-to-domain neural operator, can learn rich dependencies on complex BCs and input domain functions at the same time. To benchmark this setting, we construct 18 challenging datasets spanning Poisson, linear elasticity, and hyperelasticity problems, with highly variable, mixed-type, component-wise, and multi-segment BCs on diverse geometries. Our approach achieves state-of-the-art accuracy, outperforming baselines by large margins, while requiring no hyperparameter tuning across datasets. Overall, our results demonstrate that learning boundary-to-domain extensions is an effective and practical strategy for imposing complex BCs in existing neural operator frameworks, enabling accurate and robust scientific machine learning models for a broader range of PDE-governed problems.

2602.04264 2026-05-14 cs.LG cs.AI cs.NA math.NA

Exponential Approximation Rates and Parameter Efficiency of Learnable Bernstein Activations

Ibrahim Albool, Malak Gamal El-Din, Salma Elmalaki, Yasser Shoukry

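Affiliations: Department of Electrical Engineering and Computer Science, University of California, Irvine

AI summary: This paper studies the representational capacity and parameter efficiency of learnable Bernstein activations in deep neural networks. Through theoretical analysis, the authors prove that the approximation error of DeepBern-Nets, which use such activations, decays at an exponential rate, far better than conventional ReLU architectures. Experiments show that DBNs achieve substantial parameter reductions and faster convergence on several scientific datasets, confirming their structural advantage.

Comments: 20 pages

Abstract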

The choice of activation function fundamentally shapes the representational capacity and parameter efficiency of deep neural networks, yet most widely used activations lack rigorous theoretical guarantees on these properties. We provide a theoretical analysis of DeepBern-Nets (DBNs) -- networks employing learnable Bernstein polynomial activations -- showing that their approximation error decays with the network depth $L$ and the polynomial order $n$ with a rate of $\mathcal{O}(n^{-L})$, exponentially faster than the polynomial rate of ReLU architectures while remaining fully differentiable. We validate these predictions through $1{,}344$ experiments on large scientific datasets (HIGGS and SUSY), comparing DBNs against ReLU, Leaky ReLU, SELU, and GeLU. DBNs achieve over $70\%$ parameter reduction across the majority of architectures -- reaching $99.9\%$ at scale -- converge to ReLU's final loss in as few as $26\%$ of the training epochs, and attain up to $45\%$ lower final loss. These advantages hold over all tested activations, confirming that DBN's gains stem from the learnable polynomial structure rather than mere smoothness.
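
For readers unfamiliar with Bernstein polynomials, the following is a minimal sketch of what a learnable Bernstein activation of order $n$ could look like: a weighted sum of the $n+1$ Bernstein basis polynomials, where the weights are the trainable coefficients. The function names, the fixed input range `[lo, hi]`, and the clipping step are assumptions made for illustration; the abstract does not specify how DBNs handle input bounds.

```python
import numpy as np
from math import comb

def bernstein_basis(x, n):
    """Evaluate the n+1 Bernstein basis polynomials of degree n at x in [0, 1]."""
    x = np.asarray(x, dtype=float)
    k = np.arange(n + 1)
    binom = np.array([comb(n, int(i)) for i in k], dtype=float)
    # Result has a trailing axis of length n+1, one entry per basis polynomial.
    return binom * x[..., None] ** k * (1.0 - x[..., None]) ** (n - k)

def bernstein_activation(x, coeffs, lo=-1.0, hi=1.0):
    """Activation sigma(x) = sum_k coeffs[k] * b_{k,n}(t), with t = (x - lo) / (hi - lo).

    coeffs plays the role of the learnable parameters (assumed layout).
    Inputs outside [lo, hi] are clipped, which is an illustrative choice.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    n = coeffs.shape[0] - 1
    t = np.clip((np.asarray(x, dtype=float) - lo) / (hi - lo), 0.0, 1.0)
    return bernstein_basis(t, n) @ coeffs
```

Two standard properties make this form convenient: the basis polynomials are nonnegative and sum to one, so the output always lies between the smallest and largest coefficient, and the activation is a polynomial in `x` on `[lo, hi]`, hence differentiable there.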

2602.02977 2026-05-14 cs.CV cs.AI cs.LG

Aligning Forest and Trees in Images & Long Captions for Visually Grounded Understanding

Byeongju Woo, Zilin Wang, Byeonghyun Pak, Sangwoo Mo, Stella X. Yu

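Affiliations: Agency for Defense Development; University of Michigan; POSTECH

AI summary: This study addresses the difficulty vision-language models have in understanding long, detail-rich image captions and proposes a hierarchical learning approach based on part-to-whole structure. The core method, the CAFT model, aligns local text with image regions at intermediate representations and aligns the global image with the text at the final representation, capturing fine-grained visual information more accurately. The model achieves state-of-the-art performance on multiple long-text retrieval tasks and localizes textual semantics in image regions without explicit region annotations.

Comments: Preprint

Abstract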

Vision-language models such as CLIP often struggle to faithfully understand long, detail-rich captions, relying on dominant scene cues while overlooking fine-grained visual evidence. We propose a hierarchical vision-language learning principle for understanding scenes as part-to-whole compositions: before forming a whole-scene representation, a model should uncover what semantic parts appear where in the image. To this end, we propose CAFT (Cross-domain Alignment of Forests and Trees), a vision-language model that jointly learns local text-region alignment at intermediate representations and global image-text alignment at the final representation. Exploiting the organization of long captions, where local descriptions often correspond to scene parts, CAFT employs a fine-to-coarse image encoder and a part-whole text encoder to discover localized part semantics and progressively compose them into a global image-text representation. Trained on 30M image-text pairs, CAFT achieves state-of-the-art performance on six long-text retrieval benchmarks and exhibits strong scaling behavior. Experiments show that CAFT learns fine-grained representations that localize textual semantics in image regions without explicit region-level supervision.

2602.02350 2026-05-14 cs.AI cs.LG cs.MA

Context Learning for Multi-Agent Discussion

Xingyuan Hua, Sheng Yue, Xinyi Li, Yizhe Zhao, Jinrui Zhang, Ju Ren

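Affiliations: Department of Computer Science and Technology, Tsinghua University; School of Cyber Science and Technology, Sun Yat-sen University Shenzhen Campus; College of Computer Science, Northwest University; State Key Laboratory of Internet Architecture, Tsinghua University

AI summary: In Multi-Agent Discussion (MAD), multiple large language models solve problems collaboratively through structured discussion, but existing methods often suffer from uncoordinated discussions and difficulty reaching consensus because the agents' individual contexts are inconsistent. This paper proposes a multi-LLM context learning method (M2CL) that learns a context generator for each agent and dynamically generates context instructions for each discussion round, improving the consistency and accuracy of the discussion. Experiments show that M2CL significantly outperforms existing methods on several complex tasks, by 20% to 50%, while also offering good transferability and computational efficiency.

Abstract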

Multi-Agent Discussion (MAD) has garnered increasing attention very recently, where multiple LLM instances collaboratively solve problems via structured discussion. However, we find that current MAD methods easily suffer from discussion inconsistency, where LLMs fail to reach a coherent solution due to the misalignment between their individual contexts. In this paper, we introduce a multi-LLM context learning method (M2CL) that learns a context generator for each agent, capable of dynamically generating context instructions per discussion round via automatic information organization and refinement. Specifically, inspired by our theoretical insights on the context instruction, M2CL trains the generators to control context coherence and output discrepancies via a carefully crafted self-adaptive mechanism. It enables LLMs to avoid premature convergence on majority noise and progressively reach the correct consensus. We evaluate M2CL on challenging tasks, including academic reasoning, embodied tasks, and mobile control. The results show that the performance of M2CL significantly surpasses existing methods by 20%--50%, while enjoying favorable transferability and computational efficiency.

2602.02001 2026-05-14 cs.LG cs.AI

Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs

Yoonjun Cho, Dongjae Jeon, Soeun Kim, Moongyu Jeon, Albert No

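Affiliations: Department of Computer Science, Yonsei University; Department of Artificial Intelligence, Yonsei University

AI summary: This paper studies how to reduce accuracy loss in post-training quantization (PTQ) of large language models and proposes Preserve-Then-Quantize, which preserves the dominant singular subspace of the weight matrix before quantization, quantizes only the residual, and uses the remaining rank for error reconstruction. The method introduces the Structured Residual Reconstruction (SRR) framework, which uses a theory-guided criterion to balance quantization-exposed energy against unrecoverable error, improves the performance of the quantized model, and supports efficient quantized parameter-efficient fine-tuning; experiments show significant gains across multiple tasks and quantization settings.

Comments: Accepted at ICML 2026. Project page: https://ai-isl.github.io/srr

Abstract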

Quantization Error Reconstruction (QER) reduces accuracy loss in Post-Training Quantization (PTQ) by approximating weights as $\mathbf{W} \approx \mathbf{Q} + \mathbf{L}\mathbf{R}$, using a rank-$r$ correction to reconstruct quantization error. Prior methods devote the full rank budget to error reconstruction, which is suboptimal when $\mathbf{W}$ has intrinsic low-rank structure and quantization corrupts dominant directions. We propose Structured Residual Reconstruction (SRR), a rank-allocation framework that preserves the top-$k$ singular subspace of the activation-scaled weight before quantization, quantizes only the residual, and uses the remaining rank $r-k$ for error reconstruction. We derive a theory-guided criterion for selecting $k$ by balancing quantization-exposed energy and unrecoverable error under rank constraints. We further show that resulting $\mathbf{Q} + \mathbf{L}\mathbf{R}$ parameterization naturally supports Quantized Parameter-Efficient Fine-Tuning (QPEFT), and stabilizes fine-tuning via gradient scaling along preserved directions. Experiments demonstrate consistent perplexity reductions across diverse models and quantization settings in PTQ, along with a 5.9 percentage-point average gain on GLUE under 2-bit QPEFT. The project page is available at https://ai-isl.github.io/srr.
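
The rank split described in this abstract (keep $k$ directions, quantize the residual, spend $r-k$ on the error) can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: it works on the raw weight rather than the activation-scaled weight, it takes $k$ as given instead of applying the paper's selection criterion, and `quantize_uniform` is a toy per-tensor round-to-nearest quantizer introduced here only as a stand-in.

```python
import numpy as np

def quantize_uniform(M, bits=4):
    """Toy symmetric per-tensor round-to-nearest quantizer (stand-in, not from the paper)."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(M)) / levels
    if scale == 0:
        return M.copy()
    return np.clip(np.round(M / scale), -levels, levels) * scale

def truncated_svd(M, rank):
    """Return factors (A, B) with A @ B equal to the best rank-`rank` approximation of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]

def preserve_then_quantize(W, r, k, bits=4):
    """Split a rank budget r into k preserved directions and r - k error directions."""
    assert 0 <= k <= r
    # 1) Preserve the top-k singular subspace of W in full precision.
    L_keep, R_keep = truncated_svd(W, k)
    # 2) Quantize only what is left after removing the preserved part.
    residual = W - L_keep @ R_keep
    Q = quantize_uniform(residual, bits)
    # 3) Use the remaining rank r - k to reconstruct the quantization error.
    L_err, R_err = truncated_svd(residual - Q, r - k)
    # Stack both parts so that W is approximated by Q + L @ R with rank(L @ R) <= r.
    L = np.concatenate([L_keep, L_err], axis=1)
    R = np.concatenate([R_keep, R_err], axis=0)
    return Q, L, R
```

Setting `k = 0` in this sketch recovers the baseline the abstract attributes to prior methods, where the whole rank budget goes to error reconstruction.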

2602.01453 2026-05-14 cs.LG

The Horizon Threshold in Cooperative Multi-Agent Reward-Free Exploration

Idan Barnea, Orin Levy, Yishay Mansour

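Affiliations: Tel Aviv University; Google Research

AI summary: This paper studies cooperative multi-agent reinforcement learning in the reward-free exploration setting, where multiple agents jointly explore an unknown finite-horizon Markov decision process (MDP) to learn its dynamics. It adopts a phased learning framework in which, in each phase, agents independently execute policies and observe trajectories, and it focuses on the tradeoff between the number of learning phases and the number of agents. The study finds that the horizon $H$ determines a regime change and presents a computationally efficient algorithm that obtains an $ε$ approximation of the dynamics with $\tilde{O}(S^6 H^6 A / ε^2)$ agents in $H$ phases; it also proves that when the number of phases is below $H$, the number of agents must grow exponentially to guarantee accuracy, showing that $Θ(H)$ phases are necessary and sufficient for a polynomial number of agents.

Abstract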

We study cooperative multi-agent reinforcement learning in the setting of reward-free exploration, where multiple agents jointly explore an unknown MDP in order to learn its dynamics (without observing rewards). We focus on a tabular finite-horizon MDP and adopt a phased learning framework. In each learning phase, multiple agents independently interact with the environment. More specifically, in each learning phase, each agent is assigned a policy, executes it, and observes the resulting trajectory. Our primary goal is to characterize the tradeoff between the number of learning phases and the number of agents, especially when the number of learning phases is small. Our results identify a regime change governed by the horizon $H$. When the number of learning phases equals $H$, we present a computationally efficient algorithm that uses only $\tilde{O}(S^6 H^6 A / ε^2)$ agents to obtain an $ε$ approximation of the dynamics (i.e., yields an $ε$-optimal policy for any reward function). We complement our algorithm with a lower bound showing that any algorithm restricted to $ρ< H$ phases requires at least $A^{H/ρ}$ agents to achieve constant accuracy. Thus, we show that having $Θ(H)$ learning phases is both necessary and sufficient when restricting the number of agents to be polynomial.
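
To get a feel for the lower bound quoted in this abstract, the short sketch below simply evaluates $A^{H/ρ}$ for a few phase counts. It ignores any constants or additional factors the paper's theorem may contain, and the function name `agents_lower_bound` is hypothetical.

```python
def agents_lower_bound(num_actions, horizon, phases):
    """Evaluate A ** (H / rho), the agent count quoted for rho < H learning phases."""
    if not 0 < phases < horizon:
        raise ValueError("the quoted bound applies to 0 < phases < horizon")
    return num_actions ** (horizon / phases)

# Illustrative values: A = 4 actions, horizon H = 20.
for rho in (1, 2, 5, 10):
    print(rho, agents_lower_bound(4, 20, rho))
```

The count drops steeply as the number of phases approaches the horizon, which is the regime change the abstract refers to.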

2602.00616 2026-05-14 cs.AI

SPOT: Selective Prompt Projection via Total Variation for Inference-Only Safe Text-to-Image Generation

Minhyuk Lee, Hyekyung Yoon, Myungjoo Kang

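Affiliations: Seoul National University

AI summary: This paper studies how to safely suppress the generation of inappropriate content in text-to-image generation without modifying the pretrained generative model. It proposes SPOT, which selectively projects input prompts onto a set of safe prompts at inference time and uses total variation theory to bound the change in risk, reducing the risk of generated content while preserving generation quality on benign prompts. Experiments show that SPOT improves the safety of generated content across multiple datasets and diffusion model architectures while remaining faithful to the original prompts.

Abstract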

Text-to-Image (T2I) diffusion models enable high quality open ended synthesis, but practical use requires suppressing unsafe generations while preserving behavior on benign prompts. We study this tension relative to the frozen generator, using its prompt conditioned distribution as the preservation reference. Since T2I safety is commonly evaluated by bounded risk scores on generated images, total variation (TV) bounds how much expected risk can change from this reference. We call this fixed reference constraint the Safety-Prompt Alignment Tradeoff (SPAT): reducing expected unsafety requires prompt conditioned distributional deviation. To make this deviation selective and adjustable, we define the tau safe set as prompts whose reference risk is at most tau, and cast intervention as projection toward nearby prompts in this set. We propose Selective Prompt prOjecTion (SPOT), an inference time framework that approximates this projection without retraining the generator or learning a category specific rewriter. SPOT uses an LLM to rank candidate rewrites and a safeguard VLM to accept generated images under the same tau. Across four datasets and three diffusion backbones, SPOT achieves relative inappropriate (IP) score reductions from 14.2% to 44.4% over strong safety alignment baselines while keeping benign prompt behavior close to the fixed reference.
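
The role of total variation here rests on a standard inequality, sketched below for a risk score bounded in $[0, 1]$. The notation ($P$, $Q$, $r$, and densities $p$, $q$) is assumed for illustration and is not taken from the paper.

```latex
% Assumed notation: P is the frozen generator's image distribution for a prompt,
% Q is the distribution after intervention, and r(x) \in [0, 1] is a bounded risk score.
% The first equality uses \int (q - p)\,dx = 0; the inequality uses |r(x) - 1/2| \le 1/2.
\[
\bigl|\mathbb{E}_{Q}[r] - \mathbb{E}_{P}[r]\bigr|
= \Bigl|\int \bigl(r(x) - \tfrac{1}{2}\bigr)\bigl(q(x) - p(x)\bigr)\,dx\Bigr|
\le \tfrac{1}{2}\int \bigl|q(x) - p(x)\bigr|\,dx
= \mathrm{TV}(P, Q).
\]
```

Read in the other direction, lowering the expected risk by an amount $\Delta$ relative to the reference requires $\mathrm{TV}(P, Q) \ge \Delta$, which is consistent with the tradeoff the abstract calls SPAT.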

2601.23143 2026-05-14 cs.AI

THINKSAFE: Self-Generated Safety Alignment for Reasoning Models

Seanie Lee, Sangwoo Park, Yumin Choi, Gyeongman Kim, Minki Kang, Jihun Yun, Dongmin Park, Jongho Park, Sung Ju Hwang

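Affiliations: KAIST; KRAFTON; UC Berkeley

AI summary: Large reasoning models, when generating long chains of reasoning, tend to over-prioritize task compliance, which weakens their defenses against harmful prompts. To address this, the study proposes the THINKSAFE framework, a self-generated safety alignment approach that restores model safety without an external teacher model. Grounded in KL-divergence projection theory, the method uses lightweight refusal steering to significantly improve safety while preserving reasoning ability, and its effectiveness and efficiency are validated on several models.

Comments: 17 pages, 13 figures

Abstract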

Large reasoning models (LRMs) achieve remarkable performance by leveraging reinforcement learning (RL) on reasoning tasks to generate long chain-of-thought (CoT) reasoning. However, this over-optimization often prioritizes compliance, making models vulnerable to harmful prompts. To mitigate this safety degradation, recent approaches rely on external teacher distillation, yet this introduces a distributional discrepancy that degrades native reasoning. We formalize safety realignment as a KL projection onto the safe simplex and prove that the student's own safety-filtered distribution is the unique KL-optimal target, while any external teacher incurs an irreducible excess KL penalty. Guided by this analysis, we propose ThinkSafe, a self-generated alignment framework that restores safety without external teachers. Our key insight is that while compliance suppresses safety mechanisms, models often retain latent knowledge to identify harm. ThinkSafe unlocks this via lightweight refusal steering, which preserves the KL-optimal target while increasing the acceptance rate. Experiments on DeepSeek-R1-Distill and Qwen3 show ThinkSafe significantly improves safety while preserving reasoning proficiency, and achieves superior safety and comparable reasoning to GRPO with roughly an order of magnitude less compute. Code, models, and datasets are available at https://github.com/seanie12/ThinkSafe and https://huggingface.co/Seanie-lee/collections.
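
The claim about the KL-optimal target can be made concrete with a short textbook derivation. It is written under assumptions not spelled out in the abstract: the divergence is taken as $\mathrm{KL}(q \,\|\, p)$, $S$ denotes the set of safe responses with $p(S) > 0$, and $p_S$ denotes the student's distribution restricted to $S$ and renormalized. The paper's own formalization may differ.

```latex
% Assumed notation: p is the student's response distribution, S the safe set,
% p_S(y) = p(y)\,\mathbf{1}[y \in S] / p(S), and q any distribution with q(S) = 1.
\[
\mathrm{KL}(q \,\|\, p)
= \sum_{y \in S} q(y) \log \frac{q(y)}{p_S(y)\, p(S)}
= \mathrm{KL}(q \,\|\, p_S) - \log p(S)
\;\ge\; -\log p(S),
\]
% with equality if and only if q = p_S.
```

Under these assumptions the safety-filtered student distribution $p_S$ is the unique minimizer, and any other safe target $q$, such as one produced by an external teacher, pays the extra term $\mathrm{KL}(q \,\|\, p_S) \ge 0$.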

2601.22853 2026-05-14 cs.CV

Inference-Time Dynamic Modality Selection for Incomplete Multimodal Classification

Siyi Du, Xinzhe Luo, Declan P. O'Regan, Chen Qin

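Affiliations: Department of Electrical and Electronic Engineering & I-X

AI summary: This paper studies classification with multimodal deep learning when modalities are incomplete and proposes DyMo, a framework that dynamically selects modalities at inference time, to address the information loss or added noise caused by discarding or recovering missing modalities in conventional methods. Using a new selection algorithm, DyMo adaptively identifies and fuses reliable recovered modalities at test time to maximize task-relevant multimodal information, and it comes with a corresponding reward function and network architecture; experiments show that it outperforms existing methods across a range of missing-data scenarios.

Comments: 27 pages (including appendix), accepted by ICLR 2026

Abstract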

Multimodal deep learning (MDL) has achieved remarkable success across various domains, yet its practical deployment is often hindered by incomplete multimodal data. Existing incomplete MDL methods either discard missing modalities, risking the loss of valuable task-relevant information, or recover them, potentially introducing irrelevant noise, leading to the discarding-imputation dilemma. To address this dilemma, in this paper, we propose DyMo, a new inference-time dynamic modality selection framework that adaptively identifies and fuses reliable recovered modalities, fully exploring task-relevant information beyond the conventional discard-or-impute paradigm. Central to DyMo is a novel selection algorithm that maximizes multimodal task-relevant information for each test sample. Since direct estimation of such information at test time is intractable due to the unknown data distribution, we theoretically establish a connection between information and the task loss, which we compute at inference time as a tractable proxy. Building on this, a novel principled reward function is proposed to guide modality selection. In addition, we design a flexible multimodal network architecture compatible with arbitrary modality combinations, alongside a tailored training strategy for robust representation learning. Extensive experiments on diverse natural and medical image datasets show that DyMo significantly outperforms state-of-the-art incomplete/dynamic MDL methods across various missing-data scenarios. Our code is available at https://github.com//siyi-wind/DyMo.

2601.22816 2026-05-14 cs.LG stat.ML

Cascaded Flow Matching for Heterogeneous Tabular Data with Mixed-Type Features

Markus Mueller, Kathrin Gruber, Dennis Fok

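Affiliations: Econometric Institute, Erasmus University Rotterdam, Rotterdam, The Netherlands

AI summary: This paper proposes a cascaded flow matching method for generating tabular data with mixed discrete and continuous features, addressing the difficulty existing models have in generating mixed-type features. The method first generates a low-resolution version of the tabular data and then generates more accurate mixed-type features in a high-resolution model through a new guided conditional probability path and data-dependent coupling. Experiments show that the method performs well in sample realism and in capturing distributional details, with the detection score improving by 51.9%.

Comments: Published at ICML 2026

Abstract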

Advances in generative modeling have recently been adapted to tabular data containing discrete and continuous features. However, generating mixed-type features that combine discrete states with an otherwise continuous distribution in a single feature remains challenging. We advance the state-of-the-art in diffusion models for tabular data with a cascaded approach. We first generate a low-resolution version of a tabular data row, that is, the collection of the purely categorical features and a coarse categorical representation of numerical features. Next, this information is leveraged in the high-resolution flow matching model via a novel guided conditional probability path and data-dependent coupling. The low-resolution representation of numerical features explicitly accounts for discrete outcomes, such as missing or inflated values, and therewith enables a more faithful generation of mixed-type features. We formally prove that this cascade tightens the transport cost bound. The results indicate that our model generates significantly more realistic samples and captures distributional details more accurately, for example, the detection score improves by 51.9%. Code is available at https://github.com/muellermarkus/tabcascade.

2601.22409 2026-05-14 cs.LG cs.AI stat.ML

Optimization, Generalization and Differential Privacy Bounds for Gradient Descent on Kolmogorov-Arnold Networks

Puyu Wang, Junyu Zhou, Philipp Liznerski, Marius Kloft

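Affiliations * RPTU Kaiserslautern-Landau

AI summary This paper studies the optimization dynamics, generalization, and differential privacy guarantees of gradient descent on Kolmogorov-Arnold Networks (KANs). Through theoretical analysis, the authors derive general bounds on training, generalization error, and utility under privacy, and show under logistic loss that a network of polylogarithmic width suffices to achieve optimization and generalization rates that depend on the number of iterations and the sample size. In the differentially private setting, the study further shows that the required noise depends on the input dimension and the privacy parameters, and that the width condition is not only sufficient but also necessary, revealing an essential difference between private and non-private training.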

Comments 42 pages, 3 figures

Journal ref ICML 2026

Details
English abstract

Kolmogorov-Arnold Networks (KANs) have recently emerged as a structured alternative to standard MLPs, yet a principled theory for their training dynamics, generalization, and privacy properties remains limited. In this paper, we analyze gradient descent (GD) for training two-layer KANs and derive general bounds that characterize their training dynamics, generalization, and utility under differential privacy (DP). As a concrete instantiation, we specialize our analysis to logistic loss under an NTK-separable assumption, where we show that polylogarithmic network width suffices for GD to achieve an optimization rate of order $1/T$ and a generalization rate of order $1/n$, with $T$ denoting the number of GD iterations and $n$ the sample size. In the private setting, we characterize the noise required for $(ε,δ)$-DP and obtain a utility bound of order $\sqrt{d}/(nε)$ (with $d$ the input dimension), matching the classical lower bound for general convex Lipschitz problems. Our results imply that polylogarithmic width is not only sufficient but also necessary under differential privacy, revealing a qualitative gap between non-private (sufficiency only) and private (necessity also emerges) training regimes. Experiments further illustrate how these theoretical insights can guide practical choices, including network width selection and early stopping.

2601.21892 2026-05-14 cs.CV cs.AI

Improving Classifier-Free Guidance of Flow Matching via Manifold Projection

Jian-Feng Cai, Haixia Liu, Zhengyi Su, Chao Wang

Affiliations * Department of Mathematics, The Hong Kong University of Science and Technology; IAS Center for AI for Scientific Discoveries, The Hong Kong University of Science and Technology; School of Mathematics and Statistics & Institute of Interdisciplinary Research for Mathematics and Applied Science & Hubei Key Laboratory of Engineering Modeling and Scientific Computing, Huazhong University of Science and Technology; Department of Statistics and Data Science, Southern University of Science and Technology

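AI summary This paper studies how to improve classifier-free guidance (CFG) for flow matching models, giving a principled interpretation of CFG through the relationship between the velocity field in flow matching and the gradients of smoothed distance functions. Building on this, the authors reformulate CFG sampling as a homotopy optimization problem with a manifold constraint, implement the manifold projection via incremental gradient descent, and add Anderson acceleration to improve computational efficiency and stability. The method requires no additional training, improves generation quality, prompt alignment, and robustness to the guidance scale, and yields significant improvements on several large models.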

Comments 26 pages, 14 figures

Details
English abstract

Classifier-free guidance (CFG) is a widely used technique for controllable generation in diffusion and flow-based models. Despite its empirical success, CFG relies on a heuristic linear extrapolation that is often sensitive to the guidance scale. In this work, we provide a principled interpretation of CFG through the lens of optimization. We demonstrate that the velocity field in flow matching corresponds to the gradient of a sequence of smoothed distance functions, which guides latent variables toward the scaled target image set. This perspective reveals that the standard CFG formulation is an approximation of this gradient, where the prediction gap, the discrepancy between conditional and unconditional outputs, governs guidance sensitivity. Leveraging this insight, we reformulate the CFG sampling as a homotopy optimization with a manifold constraint. This formulation necessitates a manifold projection step, which we implement via an incremental gradient descent scheme during sampling. To improve computational efficiency and stability, we further enhance this iterative process with Anderson Acceleration without requiring additional model evaluations. Our proposed methods are training-free and consistently refine generation fidelity, prompt alignment, and robustness to the guidance scale. We validate their effectiveness across diverse benchmarks, demonstrating significant improvements on large-scale models such as DiT-XL-2-256, Flux, and Stable Diffusion 3.5.
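
For orientation, the standard CFG linear extrapolation that the abstract describes as heuristic can be sketched as follows. This shows only the baseline rule and the prediction gap, not the authors' manifold projection or Anderson acceleration. The function names and the convention that a scale of 1 recovers the conditional output are assumptions; some implementations parameterize the scale differently.

```python
def cfg_velocity(v_cond, v_uncond, guidance_scale):
    """Extrapolate from the unconditional output toward the conditional one."""
    return [u + guidance_scale * (c - u) for c, u in zip(v_cond, v_uncond)]

def prediction_gap(v_cond, v_uncond):
    """Euclidean norm of the conditional/unconditional discrepancy."""
    return sum((c - u) ** 2 for c, u in zip(v_cond, v_uncond)) ** 0.5

v_cond = [1.0, 2.0]      # hypothetical conditional model output
v_uncond = [0.0, 1.0]    # hypothetical unconditional model output
for scale in (0.0, 1.0, 3.0):
    print(scale, cfg_velocity(v_cond, v_uncond, scale))
```

The guided output moves away from the unconditional output in proportion to both the scale and the gap, which is one way to see why a large prediction gap makes the result sensitive to the guidance scale.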

2601.21731 2026-05-14 cs.LG

Mechanistic Evidence for Spectral Structures in Prior-Data Fitted Networks

Kaustubh Sharma, Srijan Tiwari, Ojasva Nema, Parikshit Pareek

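Affiliations * Indian Institute of Technology Roorkee (IIT Roorkee)

AI summary This study examines whether Prior-Data Fitted Networks (PFNs) internally learn identifiable Bayesian structure rather than merely memorizing input-output mappings. Experiments show that PFNs learn structured spectral representations that are linearly decodable from the latent attention scores and concentrated in a low-dimensional subspace. The study also proposes a Filter Bank Decoder that maps frozen PFN latents to explicit spectral densities, reconstructing kernels that are competitive for Gaussian process regression, which indicates that PFN priors are not merely implicit but can be explicitly extracted and used in practical tasks.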

Details
English abstract

Prior-Data Fitted Networks (PFNs) enable amortized Bayesian inference in a single forward pass, yet their internal representations remain opaque. It is unknown whether PFNs encode identifiable Bayesian structure or merely memorize input-output mappings. We provide mechanistic evidence that PFNs learn structured spectral representations and that these can be extracted as explicit kernels. First, probing experiments across three architectures, including the publicly released TabPFN, show that spectral information is linearly decodable from the latent attention score and organized along a dominant principal axis. Activation patching and targeted subspace interventions establish that this information is causally used for prediction and concentrated in a low-dimensional subspace, with spectral directions an order of magnitude more effective than random ones. Crucially, these properties hold on TabPFN with both synthetic out-of-distribution inputs and real-world time series (Airline Passengers, Milk Production), indicating they are emergent features of PFN-style amortization over continuous regression tasks rather than artifacts of the training prior. Second, we introduce a Filter Bank Decoder that maps frozen PFN latents to explicit spectral densities, reconstructing stationary kernels via Bochner's theorem. The resulting kernels support GP regression competitive with iterative baselines while requiring only a single forward pass, demonstrating that PFN priors are not merely implicit but are explicitly recoverable as portable Bayesian objects.
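
The step from a spectral density to a stationary kernel can be sketched in one dimension. By Bochner's theorem, a nonnegative symmetric spectral density yields a valid stationary kernel; for a discrete density with weights w_i at angular frequencies ω_i this reduces to k(τ) = Σ w_i cos(ω_i τ). The frequencies and weights below are made-up values, and the function name is hypothetical; how the Filter Bank Decoder produces its densities is not shown.

```python
import math

def kernel_from_spectrum(tau, freqs, weights):
    """Stationary kernel value at lag tau from a discrete spectral density."""
    return sum(w * math.cos(f * tau) for f, w in zip(freqs, weights))

freqs = [0.5, 2.0]      # angular frequencies (illustrative values)
weights = [0.7, 0.3]    # nonnegative spectral mass at each frequency
for tau in (0.0, 0.5, 1.0):
    print(tau, kernel_from_spectrum(tau, freqs, weights))
```

The value at lag zero equals the total spectral mass, and the kernel is symmetric in the lag, as expected of a stationary covariance function.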

2601.21577 2026-05-14 cs.LG

Collaborative Parameter Learning: Mitigating Forgetting via Parameter-Level Gradient Analysis

Mutian Yang, Zisen Zhan, Yutong Chen, Haolin Li, Kaiwen Wang, Kaili Zheng, Yuguang Wang, Qi Wang, Jiandong Gao, Ji Wu

Affiliations * Department of Electronic Engineering, Tsinghua University, Beijing, China; Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing, China; College of Information Science and Engineering, Northeastern University, Shenyang, China; College of AI, Tsinghua University, Beijing, China; Beijing National Research Center for Information Science and Technology, Beijing, China

AI summary During knowledge injection, large language models are prone to catastrophic forgetting, in which learning new knowledge overwrites existing knowledge. Through parameter-level gradient analysis, this paper identifies two types of parameters, conflicting parameters that cause forgetting and collaborative parameters that mitigate it, and proposes Collaborative Parameter Learning (CPL), which updates only the collaborative parameters to reduce forgetting. Experiments show that CPL learns substantially more while forgetting little, and reduces VRAM usage and computation time.

Details
English abstract

Catastrophic forgetting during knowledge injection impairs the ability of large language models to acquire new knowledge without overwriting previously mastered knowledge. Recent studies analyze forgetting from a gradient similarity perspective and mitigate forgetting through vector projection. However, these methods primarily characterize gradient similarity at the aggregate direction level, leaving the parameter-wise contributions to forgetting underexplored. In this paper, we decompose gradient similarity into parameter-wise contributions and identify two types of parameters during forgetting: Conflicting Parameters, whose updates contribute to forgetting and typically account for 50% to 75% of parameters, and Collaborative Parameters, whose updates mitigate forgetting and account for 25% to 50%. Based on this analysis, we propose Collaborative Parameter Learning (CPL), a parameter-wise training rule that freezes Conflicting Parameters and updates only Collaborative Parameters. Experiments comparing CPL with seven baseline methods show that CPL learns 20.2% to 48.2% more questions with negligible forgetting, while reducing peak VRAM by approximately 3 GB per billion model parameters and computation time by 16.5%. Extensive evaluations on parameter consumption, out-of-set generalization, cross-prompt generalization, multimodal tasks, open-ended question answering, and multilingual settings demonstrate that CPL effectively mitigates forgetting across diverse scenarios.
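
One way to read the parameter-wise decomposition is that the inner product of two gradients is a sum of per-parameter products, so each parameter has its own signed contribution to the similarity. The sketch below assumes that a positive product marks a collaborative parameter and a non-positive one a conflicting parameter; the abstract does not state the exact criterion, and the function names and the plain gradient-descent update are illustrative only.

```python
def split_parameters(grad_new, grad_old):
    """Per-parameter contributions to the dot product grad_new . grad_old."""
    contributions = [a * b for a, b in zip(grad_new, grad_old)]
    # Assumption: a positive contribution marks a collaborative parameter.
    collaborative = [c > 0 for c in contributions]
    return contributions, collaborative

def masked_update(params, grad_new, collaborative, lr):
    """Update collaborative parameters only; conflicting ones stay frozen."""
    return [p - lr * g if keep else p
            for p, g, keep in zip(params, grad_new, collaborative)]

grad_new = [1.0, -2.0, 0.5]   # gradient on the new knowledge (made-up values)
grad_old = [0.5, 1.0, 2.0]    # gradient on previously mastered knowledge (made-up values)
contributions, mask = split_parameters(grad_new, grad_old)
params = masked_update([1.0, 1.0, 1.0], grad_new, mask, lr=0.5)
print(contributions, mask, params)
```

In this toy case the aggregate similarity is negative even though two of the three parameters contribute positively, which is the kind of detail an aggregate direction-level view would hide.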

2601.19931 2026-05-14 cs.CL

CascadeMind at SemEval-2026 Task 4: A Hybrid Neuro-Symbolic Cascade for Narrative Similarity

Sebastien Kawada, Dylan Holyoak

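Affiliations * Kaons Epoch Learn

AI summary This paper studies model decision mechanisms in the narrative similarity task and proposes CascadeMind, a hybrid neuro-symbolic cascade. The method analyzes the votes of self-consistency samples from a large language model and allocates compute dynamically according to vote agreement: high-confidence cases are resolved by consensus, split cases receive additional sampling rounds, and the symbolic system is used only for the rare perfect ties. The system scored 72.75% on the SemEval-2026 test set; the main contribution is showing that confidence-based compute allocation is more effective than adding auxiliary representations.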

Comments 7 pages, 2 figures, 5 tables. Accepted paper for SemEval-2026 Task 4 at ACL. Code: https://github.com/chreia/CascadeMind-ACL

Details
English abstract

Across self-consistency samples from an LLM, vote agreement tracks instance difficulty: on SemEval-2026 Task 4 (Narrative Story Similarity), supermajority cases (>= 7/8 votes) resolve at 85 percent accuracy, split votes at 67 percent, and perfect ties at 61 percent, a monotone gradient that holds across the development set. We exploit this in CascadeMind, which routes eight Gemini 2.5 Flash votes by consensus, escalates split votes to additional sampling rounds, and falls through to a symbolic ensemble of theory-inspired narrative signals only on perfect ties (5 percent of cases). The system reached 72.75 percent on Track A test, placing 10th of 44 teams. Ablations show that the symbolic component contributes negligibly end-to-end and that nearly all gains come from confidence-aware routing. The takeaway is methodological: for narrative similarity, calibrating when to spend more compute on a hard instance matters more than adding auxiliary representations to reason about it.
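
The routing rule described above can be sketched as a small function. The stage names, the function name `route`, and the handling of ties immediately rather than after resampling are assumptions; the abstract only states the three tiers and the 7-of-8 supermajority threshold. The sketch assumes a non-empty list of votes.

```python
from collections import Counter

def route(votes, supermajority=7):
    """Return (decision, stage) for a non-empty list of votes such as ['A', 'B', ...]."""
    counts = Counter(votes)
    top_label, top_count = counts.most_common(1)[0]
    if top_count >= supermajority:
        # Strong agreement: accept the consensus answer.
        return top_label, "consensus"
    if top_count * 2 == len(votes):
        # Perfect tie between two options: defer to the symbolic fallback.
        return None, "symbolic_fallback"
    # Split vote: request additional sampling rounds before deciding.
    return None, "resample"

print(route(["A"] * 7 + ["B"]))
print(route(["A"] * 5 + ["B"] * 3))
print(route(["A"] * 4 + ["B"] * 4))
```

Only the cheap consensus path returns a decision directly; the other two stages signal that more compute, or a different component, is needed for that instance.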

2601.19208 2026-05-14 cs.CL cs.LG

How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability

Shawn Im, Changdae Oh, Zhen Fang, Sharon Li

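Affiliations * University of Wisconsin–Madison; University of Technology Sydney

AI summary This study examines how Transformer models learn semantic associations between words from natural language data, such as the link between "bird" and "flew". By analyzing training dynamics with a leading-term approximation of the gradients, it derives closed-form expressions for the weights, showing that each set of Transformer weights can be written as a simple composition of three basis functions that reflect the statistics of the corpus. Experiments show that the theoretical analysis closely matches the weights learned by real large language models, providing a mechanistic explanation of how semantic associations form in Transformers.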

Comments ICLR 2026

Details
English abstract

Semantic associations such as the link between "bird" and "flew" are foundational for language modeling as they enable models to go beyond memorization and instead generalize and generate coherent text. Understanding how these associations are learned and represented in language models is essential for connecting deep learning with linguistic theory and developing a mechanistic foundation for large language models. In this work, we analyze how these associations emerge from natural language data in attention-based language models through the lens of training dynamics. By leveraging a leading-term approximation of the gradients, we develop closed-form expressions for the weights at early stages of training that explain how semantic associations first take shape. Through our analysis, we reveal that each set of weights of the transformer has closed-form expressions as simple compositions of three basis functions (bigram, token-interchangeability, and context mappings), reflecting the statistics of the text corpus and uncovering how each component of the transformer captures semantic associations based on these compositions. Experiments on real-world LLMs demonstrate that our theoretical weight characterizations closely match the learned weights, and qualitative analyses further show how our theorem sheds light on interpreting the learned associations in transformers.

2601.17326 2026-05-14 cs.CV cs.HC

SymbolSight: Minimizing Inter-Symbol Interference for Reading with Prosthetic Vision

Jasmine Lesner, Michael Beyeler

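Affiliations * Department of Computer Science, University of California, Santa Barbara; Department of Psychological & Brain Sciences, University of California, Santa Barbara

AI summary To address the difficulty of reading with vision restored by retinal prostheses, this study proposes SymbolSight, a computational framework that reduces inter-symbol interference by optimizing the design of visual symbols. It uses the bigram statistics of a language to choose letter-to-symbol mappings that lower recognition confusion between adjacent letters. Experiments show that the approach substantially reduces predicted recognition errors in Arabic, Bulgarian, and English, demonstrating the potential of optimized symbol design for improving reading with low-bandwidth visual prostheses.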

Comments Accepted to IEEE EMBC 2026. 7 pages, 6 figures, 2 tables

Details
English abstract

Retinal prostheses restore limited visual perception, but low spatial resolution and temporal persistence make reading difficult. In sequential letter presentation, the afterimage of one symbol can interfere with perception of the next, leading to systematic recognition errors. Rather than relying on future hardware improvements, we investigate whether optimizing the visual symbols themselves can mitigate this temporal interference. We present SymbolSight, a computational framework that selects symbol-to-letter mappings to minimize confusion among frequently adjacent letters. Using simulated prosthetic vision (SPV) and a neural proxy observer, we estimate pairwise symbol confusability and optimize assignments using language-specific bigram statistics. Across simulations in Arabic, Bulgarian, and English, the resulting heterogeneous symbol sets reduced predicted confusion by a median factor of 22 relative to native alphabets. These results suggest that standard typography is poorly matched to serial, low-bandwidth prosthetic vision and demonstrate how computational modeling can narrow the design space of visual encodings, identifying high-potential candidates for future psychophysical and clinical evaluation rather than predicting present-day clinical reading performance directly.

2601.16806 2026-05-14 cs.AI cs.RO

An Efficient Insect-inspired Approach for Visual Point-goal Navigation

Yihe Lu, Barbara Webb

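Affiliations * School of Informatics, University of Edinburgh

AI summary This paper proposes an efficient insect-inspired model for visual point-goal navigation that combines abstracted models of two insect brain structures associated with associative learning and path integration. The method performs comparably to current state-of-the-art models on visual navigation tasks at a much lower computational cost, and it is shown to be robust to perturbations in a more realistic simulated environment.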

Comments This work has been submitted to the IEEE for possible publication

Details
English abstract

In this work we develop a novel insect-inspired model for visual point-goal navigation. This combines abstracted models of two insect brain structures that have been implicated, respectively, in associative learning and path integration. We draw an analogy between the formal benchmark of the Habitat point-goal navigation task and the ability of insects to discover, learn, and refine visually guided paths around obstacles between a discovered food location and their nest. We demonstrate that the simple insect-inspired model exhibits performance comparable to recent state-of-the-art models at many orders of magnitude less computational cost. Testing in a more realistic simulated environment shows the approach is robust to perturbations.

2601.15161 2026-05-14 cs.CL cs.AI

Automated Rubrics for Reliable Evaluation of Medical Dialogue Systems

Yinzhu Chen, Abdine Maiga, Hossein A. Rahmani, Emine Yilmaz

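Affiliations * AI Center, University College London

AI summary As large language models are increasingly used for clinical decision support, evaluating their outputs reliably has become a key problem. This paper proposes a retrieval-augmented multi-agent framework that automatically generates evaluation rubrics specific to each dialogue instance, so that clinical intent deviations and potential risks are identified more accurately. The method decomposes retrieved authoritative medical evidence and combines it with user interaction constraints to form verifiable, fine-grained evaluation criteria. It performs well on several medical dialogue datasets, significantly outperforms existing baselines, and effectively guides the refinement of model responses.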

Details
English abstract

Large Language Models (LLMs) are increasingly used for clinical decision support, where hallucinations and unsafe suggestions may pose direct risks to patient safety. These risks are hard to assess: subtle clinical errors are often missed by generic metrics and LLM judges using general criteria, while expert-authored fine-grained rubrics are expensive and difficult to scale. In this paper, we propose a retrieval-augmented multi-agent framework designed to automate the generation of instance-specific evaluation rubrics. Our approach grounds evaluation in authoritative medical evidence by decomposing retrieved content into atomic facts and synthesizing them with user interaction constraints to form verifiable, fine-grained evaluation criteria. Evaluated on HealthBench and LLMEval-Med datasets, our framework achieves Clinical Intent Alignment (CIA) scores of 50.20% and 31.90%, significantly outperforming the GPT-4o baseline and demonstrating robust cross-lingual generalization. In discriminative tests on HealthBench, our rubrics yield a 7.8% higher win rate than GPT-4o baseline with nearly double score $Δ$, while ablation studies confirm its structural necessity. Beyond evaluation, our rubrics effectively guide response refinement, improving quality by 9.2%. This provides a scalable, cross-lingual foundation for both evaluating and improving medical LLMs. The code is available at https://github.com/AmbeChen/Automated-Rubric-Generation.

2601.14104 2026-05-14 cs.RO cs.CV

When Backdoors Meet Partial Observability: Attacking Real-World Reinforcement Learning

Tairan Huang, Qingqing Ye, Yulin Jin, Jiawei Lian, Yaxin Xiao, Yi Wang, Haibo Hu

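Affiliations * Department of Electrical and Electronic Engineering

AI summary This paper studies backdoor attacks on reinforcement learning (RL) policies in partially observable real-world environments and points out the limitations of conventional attacks when multimodal observations such as vision and LiDAR coexist. The authors propose a diffusion-guided backdoor attack framework (DGBA) that uses printable visual triggers to covertly manipulate RL policies without degrading task performance. Experiments show that the attack outperforms existing methods on a physical robot platform and is both practical and stealthy.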

Details
English abstract

Backdoor attacks can cause reinforcement learning (RL) policies to behave normally under clean inputs while executing malicious behaviors when triggers are present. Existing RL backdoor attacks are primarily studied in simulation and often assume that attackers can reliably manipulate the observations driving policy decisions. This assumption becomes fragile in real-world deployment, where RL policies commonly rely on multimodal observations. Attackers can manipulate visual inputs through physical triggers, but auxiliary states such as LiDAR and odometry signals remain uncontrollable and vary across trajectories. We study this overlooked challenge and propose a diffusion-guided backdoor attack framework (DGBA) for real-world RL. DGBA uses small printable visual patches as triggers and learns a stochastic trigger distribution via conditional diffusion to maintain consistent attack activation under varying uncontrollable states. We further introduce an advantage-based poisoning strategy that injects triggers only at decision-critical training states. Experiments on a physical TurtleBot3 platform show that DGBA consistently outperforms prior RL backdoor attacks while preserving normal task performance. Demo videos and code are available in the supplementary material.

2601.13359 2026-05-14 cs.CL cs.CR cs.LG

Sockpuppetting: Jailbreaking LLMs by Combining Prefilling with Optimization

Asen Dotsinski, Panagiotis Eustratiadis

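Affiliations * University of Amsterdam

AI summary This paper studies a new jailbreak attack on large language models called "sockpuppetting", which injects a specific sequence into the prefix of the model's output to induce responses that violate its safety policy. It proposes a hybrid attack strategy that combines prefilling with an optimized suffix and substantially raises the attack success rate. Experiments show that the method is highly effective on several mainstream models, revealing that current open-weight models urgently need stronger defences against output-prefix injection.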

Comments 13 pages, 6 figures

Details
English abstract

Prefill attacks are an effective and low-cost jailbreaking method, as they directly insert an acceptance sequence (e.g., "Sure, here is how to...") at the start of an LLM's output and lead the model to continue the response. We make two contributions to this prior work. First, we show that an unsophisticated adversary can improve the well-known prefill attacks by ensembling a small number of prefill variants. Running three easy-to-generate prefills yields a combined attack success rate (ASR) of 22%, 90%, and 99% on Gemma-7B, Llama-3.1-8B, and Qwen3-8B respectively, an up to 38% improvement over the standard "Sure, here's..." prefill and up to 82% over our reproduction of GCG (Zou et al., 2023). Second, we introduce "sockpuppetting", a hybrid attack that optimizes an adversarial suffix placed inside the "assistant" message block of the chat template, rather than within the user prompt. The rolling variant of this attack, RollingSockpuppetGCG, increases prompt-agnostic ASR by up to 64% over our universal GCG baseline on Llama-3.1-8B. Both findings highlight the need for defences against output-prefix injection in open-weight models. Code: https://gitlab.com/asendotsinski/sockpuppetting

2601.11942 2026-05-14 cs.LG quant-ph

Geometric Preconditioning and Curriculum Optimization for Trainable Variational Quantum Regression

Qingyu Meng, Yangshuai Wang

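Affiliations * Shanghai Jiao Tong University-Chongqing Institute of Artificial Intelligence; Department of Mathematics, National University of Singapore

AI summary This paper studies the training difficulties of trainable variational quantum regression. To address gradient signals that become weak or ill-conditioned because of global losses, finite-shot stochasticity, and growing circuit depth, it proposes a hybrid quantum-classical regression method that combines geometric preconditioning with curriculum optimization. The core components are a capacity-controlled classical embedding that acts as a learnable geometric preconditioner, reshaping the input distribution while preserving a low-dimensional quantum bottleneck, and a curriculum strategy that grows circuit depth progressively and switches optimizers. Experiments show that the hybrid quantum neural network achieves lower error than pure quantum networks on finite-size regression tasks, supporting its trainability advantage.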

Details
English abstract

Variational quantum circuits are increasingly studied as continuous-function approximators, but quantum regression remains difficult to train when global losses, finite-shot stochasticity, and circuit-depth growth combine to produce weak or ill-conditioned gradient signals. We study this trainability problem in a controlled hybrid quantum-classical regression design. The central ingredient is a capacity-controlled classical embedding that acts as a learnable geometric preconditioner: it reshapes the input distribution seen by a data-reuploading variational circuit while preserving a low-dimensional quantum bottleneck. We pair this representation design with a curriculum protocol that grows circuit depth progressively and switches from SPSA-based stochastic exploration to Adam-based analytic-gradient fine-tuning. We formalize the mechanism through a local quantum-tangent contraction statement: in the linearized quantum-parameter dynamics, the embedding changes the empirical Gram matrix that controls residual contraction and one-step loss decrease. Across finite-size statevector audits on PDE-informed regression benchmarks and small-data tabular tasks, the Hybrid QNN lowers error relative to Pure QNN baselines under matched quantum-model budgets. Strong classical references remain competitive, and in several cases are better in absolute error; the evidence therefore supports a trainability claim for the hybrid QNN design rather than a claim of classical or hardware quantum advantage.

2601.09636 2026-05-14 cs.AI cs.CV cs.HC cs.LG

PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records

Yibo Lyu, Gongwei Chen, Rui Shao, Weili Guan, Liqiang Nie

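Affiliations * Harbin Institute of Technology, Shenzhen; Shenzhen Loop Area Institute

AI summary This paper proposes PersonalAlign, a hierarchical implicit intent alignment approach for personalized graphical user interface (GUI) agents that uses a user's long-term behavior records to resolve implicit preferences in vague instructions and to proactively anticipate the user's latent actions. To this end, the authors build the AndroidIntent benchmark and design the Hierarchical Intent Memory Agent (HIM-Agent), which continuously updates and organizes the user's personalized preferences and behavior patterns. Experiments show that HIM-Agent improves execution and proactive assistance performance by 15.7% and 7.3%, respectively.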

Comments Accepted to ACL26 Main

Details
English abstract

While GUI agents have shown strong performance under explicit and completion instructions, real-world deployment requires aligning with users' more complex implicit intents. In this work, we highlight Hierarchical Implicit Intent Alignment for Personalized GUI Agent (PersonalAlign), a new agent task that requires agents to leverage long-term user records as persistent context to resolve omitted preferences in vague instructions and anticipate latent routines by user state for proactive assistance. To facilitate this study, we introduce AndroidIntent, a benchmark designed to evaluate agents' ability in resolving vague instructions and providing proactive suggestions through reasoning over long-term user records. We annotated 775 user-specific preferences and 215 routines from 20k long-term records across different users for evaluation. Furthermore, we introduce Hierarchical Intent Memory Agent (HIM-Agent), which maintains a continuously updating personal memory and hierarchically organizes user preferences and routines for personalization. Finally, we evaluate a range of GUI agents on AndroidIntent, including GPT-5, Qwen3-VL, and UI-TARS. Further results show that HIM-Agent significantly improves execution and proactive performance by 15.7% and 7.3%, respectively.