arXivDaily arXiv Daily Academic Digest, updated Monday through Friday
2605.12495 2026-05-13 cs.CV cs.AI cs.LG Updated version

AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward

Runhui Huang, Jie Wu, Rui Yang, Zhe Liu, Hengshuang Zhao

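Affiliations * The University of Hong Kong

AI summary This paper proposes AlphaGRPO, a new framework that applies Group Relative Policy Optimization (GRPO) to unified multimodal models (UMMs), improving multimodal generation without an additional cold-start stage. Its Decompositional Verifiable Reward (DVReward) mechanism uses a large language model to break complex user requests into verifiable semantic and quality questions, providing stable and reliable feedback that supports reasoning text-to-image generation and autonomous self-reflective refinement. Experiments show that AlphaGRPO achieves significant gains on multiple multimodal generation benchmarks and also performs well on editing tasks without being trained on them.

Comments ICML2026

English abstract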

In this paper, we propose AlphaGRPO, a novel framework that applies Group Relative Policy Optimization (GRPO) to AR-Diffusion Unified Multimodal Models (UMMs) to enhance multimodal generation capabilities without an additional cold-start stage. Our approach unlocks the model's intrinsic potential to perform advanced reasoning tasks: Reasoning Text-to-Image Generation, where the model actively infers implicit user intents, and Self-Reflective Refinement, where it autonomously diagnoses and corrects misalignments in generated outputs. To address the challenge of providing stable supervision for real-world multimodal generation, we introduce the Decompositional Verifiable Reward (DVReward). Unlike holistic scalar rewards, DVReward utilizes an LLM to decompose complex user requests into atomic, verifiable semantic and quality questions, which are then evaluated by a general MLLM to provide reliable and interpretable feedback. Extensive experiments demonstrate that AlphaGRPO yields robust improvements across multimodal generation benchmarks, including GenEval, TIIF-Bench, DPG-Bench and WISE, while also achieving significant gains in editing tasks on GEdit without training on editing tasks. These results validate that our self-reflective reinforcement approach effectively leverages inherent understanding to guide high-fidelity generation. Project page: https://huangrh99.github.io/AlphaGRPO/
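The two ingredients named in the abstract can be illustrated with a minimal sketch. It assumes the decomposed yes/no answers are aggregated as an unweighted pass rate (the abstract does not state the aggregation rule) and uses the standard GRPO normalization of rewards within one group of samples. The function names and the judge answers are hypothetical.

```python
from statistics import mean, pstdev

def dv_reward(answers):
    """Fraction of atomic yes/no checks that passed (True = passed).

    Assumption: an unweighted pass rate; the paper may aggregate differently.
    """
    return sum(answers) / len(answers)

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples drawn for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical judge answers for 4 images sampled from one prompt,
# each checked against 4 atomic semantic/quality questions.
group_answers = [
    [True, True, True, True],
    [True, True, False, False],
    [True, False, False, False],
    [True, True, True, False],
]
rewards = [dv_reward(a) for a in group_answers]
advantages = group_relative_advantages(rewards)
```

Samples that pass more checks than the group average receive a positive advantage, and the per-question answers remain available for inspection, which is what makes this style of reward interpretable.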

2605.12481 2026-05-13 cs.AI Updated version

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

Xuhao Hu, Xi Zhang, Haiyang Xu, Kyle Qiao, Jingyi Yang, Xuanjing Huang, Jing Shao, Ming Yan, Jieping Ye

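Affiliations * Tongyi Lab, Alibaba Group; Fudan University; Shanghai Artificial Intelligence Laboratory

AI summary Computer use agents (CUAs) must switch between low-level GUI operations and high-level tool calls when carrying out tasks, but this hybrid action space makes it hard for an agent to judge when to use which, leading to suboptimal execution paths. To address this, the paper proposes ToolCUA, an end-to-end agent that learns optimal GUI-tool path selection through a staged training paradigm. Using interleaved trajectory generation, guided reinforcement learning, and online agentic reinforcement learning, the method significantly improves task accuracy and efficiency, performs well on multiple benchmarks, and demonstrates its potential for real-world digital agents.

English abstract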

Computer Use Agents (CUAs) can act through both atomic GUI actions, such as click and type, and high-level tool calls, such as API-based file operations, but this hybrid action space often leaves them uncertain about when to continue with GUI actions or switch to tools, leading to suboptimal execution paths. This difficulty stems from the scarcity of high-quality interleaved GUI-Tool trajectories, the cost and brittleness of collecting real tool trajectories, and the lack of trajectory-level supervision for GUI-Tool path selection. In this paper, we propose ToolCUA, an end-to-end agent designed to learn optimal GUI-Tool path selection through a staged training paradigm. We first introduce an Interleaved GUI-Tool Trajectory Scaling Pipeline that repurposes abundant static GUI trajectories and synthesizes a grounded tool library, enabling diverse GUI-Tool trajectories without manual engineering or real tool-trajectory collection. We then perform Tool-Bootstrapped GUI RFT, combining warmup SFT with single-turn RL to improve decisions at critical GUI-Tool switching points. Finally, we optimize ToolCUA with Online Agentic RL in a high-fidelity GUI-Tool environment, guided by a Tool-Efficient Path Reward that encourages appropriate tool use and shorter execution paths. Experiments on OSWorld-MCP show that ToolCUA achieves 46.85% accuracy, a relative improvement of approximately 66% over the baseline, establishing a new state of the art among models of comparable scale. It also improves by 3.9% over GUI-only settings, demonstrating effective GUI-Tool orchestration. The results further suggest that training in a hybrid action space is a promising paradigm for real-world digital agents. Open-sourced here: https://x-plug.github.io/ToolCUA/
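The headline numbers can be unpacked with a short calculation. The abstract reports 46.85% accuracy and a relative improvement of approximately 66%, so the baseline accuracy below is only an implied, approximate figure, not a value reported in the abstract.

```python
def implied_baseline(final_acc, relative_gain):
    """Baseline accuracy b such that final_acc = b * (1 + relative_gain)."""
    return final_acc / (1.0 + relative_gain)

# 46.85% accuracy and ~66% relative improvement are taken from the abstract.
baseline = implied_baseline(46.85, 0.66)   # about 28.2 percentage points
absolute_gain = 46.85 - baseline           # about 18.6 percentage points
```

Because the 66% figure is itself rounded, the implied baseline should be read as accurate to roughly one decimal place at best.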

2605.12480 2026-05-13 cs.CV cs.AI Updated version

OmniNFT: Modality-wise Omni Diffusion Reinforcement for Joint Audio-Video Generation

Guohui Zhang, XiaoXiao Ma, Jie Huang, Hang Xu, Hu Yu, Siming Fu, Yuming Li, Zeyue Xue, Lin Song, Haoyang Huang, Nan Duan, Feng Zhao

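Affiliations * University of Science and Technology of China; Peking University; JD Explore Academy

AI summary OmniNFT is a new reinforcement learning framework for joint audio-video generation that targets the shortcomings of existing methods in per-modality fidelity, cross-modal alignment, and fine-grained synchronization. Through three innovations (modality-wise advantage routing, layer-wise gradient surgery, and region-wise loss reweighting), it mitigates inconsistent multi-objective advantages, imbalanced multimodal gradients, and uniform credit assignment. Experiments show that OmniNFT significantly improves the perceptual quality and synchronization of audio and video on multiple benchmarks.

Comments Project page: https://zghhui.github.io/OmniNFT/

English abstract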

Recent advances in joint audio-video generation have been remarkable, yet real-world applications demand strong per-modality fidelity, cross-modal alignment, and fine-grained synchronization. Reinforcement Learning (RL) offers a promising paradigm, but its extension to multi-objective and multi-modal joint audio-video generation remains unexplored. Notably, our in-depth analysis first reveals that the primary obstacles to applying RL in this setting stem from: (i) multi-objective advantages inconsistency, where the advantages of multimodal outputs are not always consistent within a group; (ii) multi-modal gradients imbalance, where video-branch gradients leak into shallow audio layers responsible for intra-modal generation; (iii) uniform credit assignment, where fine-grained cross-modal alignment regions fail to get efficient exploration. These shortcomings suggest that a vanilla RL fine-tuning strategy with a single global advantage often leads to suboptimal results. To address these challenges, we propose OmniNFT, a novel modality-aware online diffusion RL framework with three key innovations: (1) Modality-wise advantage routing, which routes independent per-reward advantages to their respective modality generation branches. (2) Layer-wise gradient surgery, which selectively detaches video-branch gradients on shallow audio layers while retaining those for cross-modal interaction layers. (3) Region-wise loss reweighting, which modulates policy optimization toward critical regions related to audio-video synchronization and fine-grained alignment. Extensive experiments on JavisBench and VBench with the LTX-2 backbone demonstrate that OmniNFT achieves comprehensive improvements in audio and video perceptual quality, cross-modal alignment, and audio-video synchronization.
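The advantage-inconsistency problem and the routing idea can be seen in a toy example. The sketch below assumes one video reward and one audio reward per sample and group-wise mean/std normalization. The reward values are made up, and the actual reward models and routing mechanics in OmniNFT may differ.

```python
from statistics import mean, pstdev

def normalize(rewards, eps=1e-8):
    """Group-relative advantage: center and scale rewards within one group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for a group of 4 sampled audio-video clips.
video_rewards = [0.9, 0.2, 0.6, 0.3]
audio_rewards = [0.1, 0.8, 0.5, 0.6]

# Single global advantage: sum the rewards first, then normalize.
global_adv = normalize([v + a for v, a in zip(video_rewards, audio_rewards)])

# Modality-wise routing: normalize each reward separately so that each
# advantage can be sent only to its own generation branch.
routed = {"video": normalize(video_rewards), "audio": normalize(audio_rewards)}
```

For the first clip, the best video and the worst audio cancel out in the global advantage, so neither branch receives a useful signal. The routed version keeps a positive video advantage and a negative audio advantage for that clip.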

2605.12474 2026-05-13 cs.AI Updated version

Reward Hacking in Rubric-Based Reinforcement Learning

Anas Mahmoud, MohammadHossein Rezaei, Zihao Wang, Anisha Gunjal, Bing Liu, Yunzhong He

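AI summary This paper studies reward hacking in rubric-based reinforcement learning, examining the divergence that can arise when a policy is optimized against a verifier during training but judged by multiple independent judges at evaluation. It finds that weak verifiers let the policy earn high training scores that do not transfer to the reference evaluation, while strong verifiers mitigate the problem without eliminating it. The study also introduces the "self-internalization gap" as a verifier-free diagnostic, and notes that limitations in rubric design can raise scores on criteria such as completeness at the expense of factual correctness and overall quality.

English abstract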

Reinforcement learning with verifiable rewards has enabled strong post-training gains in domains such as math and coding, though many open-ended settings rely on rubric-based rewards. We study reward hacking in rubric-based RL, where a policy is optimized against a training verifier but evaluated against a cross-family panel of three frontier judges, reducing dependence on any single evaluator. Our framework separates two sources of divergence: verifier failure, where the training verifier credits rubric criteria that reference verifiers reject, and rubric-design limitations, where even strong rubric-based verifiers favor responses that rubric-free judges rate worse overall. Across medical and science domains, weak verifiers produce large proxy-reward gains that do not transfer to the reference verifiers; exploitation grows over training and concentrates in recurring failures such as partial satisfaction of compound criteria, treating implicit content as explicit, and imprecise topical matching. Stronger verifiers substantially reduce, but do not eliminate, verifier exploitation. We also introduce a self-internalization gap, a verifier-free diagnostic based on policy log-probabilities, which tracks reference-verifier quality, detecting when the policy trained using the weak verifier stops improving. Finally, in our setting, stronger verification does not prevent reward hacking when the rubric leaves important failure modes unspecified: rubric-based verifiers prefer the RL checkpoint, while rubric-free judges prefer the base model. These disagreements coincide with gains concentrated in completeness and presence-based criteria, alongside declines in factual correctness, conciseness, relevance, and overall quality. Together, these results suggest that stronger verification reduces reward hacking, but does not by itself ensure that rubric gains correspond to broader quality gains.

2605.12471 2026-05-13 cs.LG cs.AI cs.CL Updated version

KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference

Alireza Nadali, Patrick Cooper, Ashutosh Trivedi, Alvaro Velasquez

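Affiliations * Department of Computer Science, University of Colorado Boulder

AI summary This paper proposes KV-Fold, a long-context inference method that treats the key-value (KV) cache as the accumulator in a left fold over sequence chunks and requires no training. At each step the model processes the next chunk conditioned on the accumulated cache and appends the newly produced keys and values, so the cache grows and is passed on to later steps. With the model's architecture and parameters unchanged, the method achieves stable long-range information retention and efficient inference; experiments show strong accuracy and memory efficiency on large-context tasks.

Comments 12 pages, 3 figures, 6 tables

English abstract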

We introduce KV-Fold, a simple, training-free long-context inference protocol that treats the key-value (KV) cache as the accumulator in a left fold over sequence chunks. At each step, the model processes the next chunk conditioned on the accumulated cache, appends the newly produced keys and values, and passes the enlarged cache forward; the same one-step update is applied repeatedly, analogous to foldl in functional programming. Building on the KV cache concatenation primitive introduced for latent multi-agent communication, we repurpose it as a chunk-to-chunk recurrence for long-context inference. When processing chunk t, the model attends to the KV cache carried from earlier chunks as a prefix, reusing its internal state across segments without modifying or retraining the model. Despite its simplicity, the induced recurrence is stable: per-step drift rises briefly and then saturates into a flat plateau that persists across deep chains. This plateau is insensitive to a 10,000x change in numerical precision, robust across chunk sizes, and consistent across model families. At the task level, KV-Fold preserves exact information over long distances. On a needle-in-a-haystack benchmark, it achieves 100% exact-match retrieval across 152 trials spanning contexts from 16K to 128K tokens and chain depths up to 511 on Llama-3.1-8B, while remaining within the memory limits of a single 40GB GPU. Compared to streaming methods, which trade fidelity for bounded memory, KV-Fold maintains long-range retrieval while operating as a sequence of tractable forward passes. Overall, our results show that frozen pretrained transformers already support a stable form of KV-cache recurrence, providing a practical route to long-context inference without architectural changes or training.
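The foldl analogy in the abstract can be written down directly. The sketch below uses `functools.reduce` as the left fold and a toy `step` function as a stand-in for a transformer forward pass. The cache here is a plain list of (token, position) pairs rather than real key/value tensors, and all names are hypothetical.

```python
from functools import reduce

def step(cache, chunk):
    """One-step update: process `chunk` with `cache` as prefix, then append.

    Toy stand-in for a forward pass: each new entry stores the token and
    its absolute position, which depends on the size of the carried cache.
    """
    new_entries = [(tok, len(cache) + i) for i, tok in enumerate(chunk)]
    return cache + new_entries

def chunked(tokens, size):
    """Split a token list into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

tokens = list("abcdefghij")
chunks = chunked(tokens, 4)

# Equivalent to: foldl step [] chunks
final_cache = reduce(step, chunks, [])
```

In this toy the final cache is exactly independent of the chunk size. A real model need not be, which is why the abstract reports measured per-step drift and robustness across chunk sizes.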

2605.12466 2026-05-13 cs.LG cs.AI cs.CL cs.NE Updated version

Solve the Loop: Attractor Models for Language and Reasoning

Jacob Fein-Ashley, Paria Rashidinejad

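Affiliations * University of Southern California

AI summary The paper proposes a new architecture, Attractor Models, to improve iterative computation in language modeling and reasoning. A backbone module produces initial output embeddings, and an attractor module then refines them by solving for a fixed point; training uses implicit differentiation, which keeps training memory constant in effective depth and lets the number of iterations adapt to convergence. Experiments show that Attractor Models outperform existing methods in both large-scale language-model pretraining and small-model reasoning, improving performance while reducing training cost. They also exhibit a new phenomenon, "equilibrium internalization," which allows the solver to be removed at inference time with almost no loss in performance.

English abstract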

Looped Transformers offer a promising alternative to purely feed-forward computation by iteratively refining latent representations, improving language modeling and reasoning. Yet recurrent architectures remain unstable to train, costly to optimize and deploy, and constrained to small, fixed recurrence depths. We introduce Attractor Models, in which a backbone module first proposes output embeddings, then an attractor module refines them by solving for the fixed point, with gradients obtained through implicit differentiation. Thus, training memory remains constant in effective depth, and iterations are chosen adaptively by convergence. Empirically, Attractor Models outperform existing models across two regimes: large-scale language-model pretraining and reasoning with tiny models. In language modeling, Attractor Models deliver a Pareto improvement over standard Transformers and stable looped models across sizes, improving perplexity by up to 46.6% and downstream accuracy by up to 19.7% while reducing training cost. Notably, a 770M Attractor Model outperforms a 1.3B Transformer trained on twice as many tokens. On challenging reasoning tasks, we show that our model with only 27M parameters and approximately 1000 examples achieves 91.4% accuracy on Sudoku-Extreme and 93.1% on Maze-Hard, scaling favorably where frontier models like Claude and GPT o3 fail completely and specialized recursive reasoners collapse at larger sizes. Lastly, we show that Attractor Models exhibit a novel phenomenon, which we call equilibrium internalization: fixed-point training places the model's initial output embedding near equilibrium, allowing the solver to be removed at inference time with little degradation. Together, these results suggest that Attractor Models make iterative refinement scalable by turning recurrence into a computation the model can learn to internalize.
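The idea of choosing iterations adaptively by convergence can be illustrated with a scalar fixed-point solver. This is only a toy: the map is `math.cos` rather than a learned attractor module, the solver is plain iteration, and implicit differentiation is not shown.

```python
import math

def solve_fixed_point(f, z0, tol=1e-10, max_iter=1000):
    """Iterate z <- f(z) until successive iterates differ by less than tol.

    Returns the final iterate and the number of applications of f.
    """
    z = z0
    for n in range(1, max_iter + 1):
        z_next = f(z)
        if abs(z_next - z) < tol:
            return z_next, n
        z = z_next
    return z, max_iter

# Starting far from the equilibrium versus starting close to it.
z_far, steps_far = solve_fixed_point(math.cos, 0.0)
z_near, steps_near = solve_fixed_point(math.cos, 0.739)
```

Both runs reach the same equilibrium, but the run that starts near it needs fewer iterations. That is the intuition behind equilibrium internalization as described in the abstract: if the backbone's initial proposal already lies near the fixed point, the solver has little left to do.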

2605.12462 2026-05-13 cs.AI cs.CY cs.GT cs.LG Updated version

Towards Affordable Energy: A Gymnasium Environment for Electric Utility Demand-Response Programs

Jose E. Aguilar Escamilla, Lingdong Zhou, Xiangqi Zhu, Huazheng Wang

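Affiliations * School of Electrical Engineering and Computer Science, Oregon State University

AI summary This paper presents DR-Gym, an open-source simulation environment for training and evaluating demand-response strategies from the electric utility's perspective, with the aim of improving grid flexibility and energy affordability. The environment focuses on the market-level utility setting, provides a rich observation space relevant to the utility, and includes a wholesale price model calibrated to real-world extreme events alongside physics-based building demand models. A multi-objective reward function supports diverse learning objectives, and the study demonstrates the simulator's ability to create realistic, learnable environments.

English abstract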

Extreme weather and volatile wholesale electricity markets expose residential consumers to catastrophic financial risks, yet demand response at the distribution level remains an underutilized tool for grid flexibility and energy affordability. While a demand-response program can shield consumers by issuing financial credits during high-price periods, optimizing this sequential decision-making process presents a unique challenge for reinforcement learning despite the plentiful offline historical smart meter and wholesale pricing data available publicly. Offline historical data fails to capture the dynamic, interactive feedback loop between an electric utility's pricing signals and customer acceptance and adaptation to a demand-response program. To address this, we introduce DR-Gym, an open-source, online Gymnasium-compatible environment designed to train and evaluate demand-response from the electric utility's perspective. Unlike existing device-level energy simulators, our environment focuses on the market-level electric utility setting and provides a rich observational space relevant to the electric utility. The simulator additionally features a regime-switching wholesale price model calibrated to real-world extreme events, alongside physics-based building demand profiles. For our learning signal, we use a configurable, multi-objective reward function for specifying diverse learning objectives. We demonstrate through baseline strategies and data snapshots the capability of our simulator to create realistic and learnable environments.
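To make the setting concrete, here is a toy environment that follows the Gymnasium `reset`/`step` return convention with a two-state regime-switching price. It is not the DR-Gym API and does not subclass `gymnasium.Env`. The prices, switch probabilities, customer response rule, and single-objective reward are all made-up assumptions.

```python
import random

class ToyDemandResponseEnv:
    """Toy utility-side environment with a two-regime wholesale price."""

    PRICES = {"normal": 40.0, "extreme": 2000.0}   # $/MWh, made-up values
    P_SWITCH = {"normal": 0.05, "extreme": 0.30}   # made-up switch probabilities

    def __init__(self, horizon=24, demand_mwh=10.0, retail_price=120.0):
        self.horizon = horizon
        self.demand_mwh = demand_mwh
        self.retail_price = retail_price

    def reset(self, seed=None):
        self.rng = random.Random(seed)
        self.t = 0
        self.regime = "normal"
        return self._obs(), {}

    def _obs(self):
        # Observation: (time step, current wholesale price).
        return (self.t, self.PRICES[self.regime])

    def step(self, credit_per_mwh):
        price = self.PRICES[self.regime]
        # Assumed response: each $100/MWh of credit sheds 10% of demand, capped at 50%.
        shed = min(0.5, 0.001 * credit_per_mwh)
        served = self.demand_mwh * (1.0 - shed)
        # Utility margin: retail revenue minus wholesale cost minus credits paid.
        reward = served * (self.retail_price - price) - credit_per_mwh * self.demand_mwh * shed
        # Markov regime switch for the next step.
        if self.rng.random() < self.P_SWITCH[self.regime]:
            self.regime = "extreme" if self.regime == "normal" else "normal"
        self.t += 1
        truncated = self.t >= self.horizon
        return self._obs(), reward, False, truncated, {}

env = ToyDemandResponseEnv()
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(0.0)
```

In the extreme regime the wholesale price exceeds the retail price, so serving full demand loses money. A policy can then trade credits against shed load, which is the sequential decision problem the abstract describes.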

2605.12453 2026-05-13 eess.SP cs.AI cs.DB cs.LG cs.NI Updated version

Enabling AI-Native Mobility in 6G: A Real-World Dataset for Handover, Beam Management, and Timing Advance

Mannam Veera Narayana, Rohit Singh, Deepa M. R, Radha Krishna Ganti

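Affiliations * Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai, India 600036

AI summary This study targets the long handover (HO) interruption times and heavy measurement-report overhead that 5G user equipment (UE) faces in high-speed mobility scenarios. It presents a dataset collected from a real deployed network, covering UE mobility across walking, biking, car, bus, and train at various speeds. The dataset emphasizes timing advance (TA) measurements during handover, captured at key signaling events such as RACH trigger, MAC CE, and PDCCH grant, filling a gap in existing work. It supports training and evaluating AI/ML models for handover management, beam management, and TA prediction, providing a foundation for research on intelligent mobility in 6G.

English abstract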

To address the issues of high interruption time and measurement report overhead under user equipment (UE) mobility, especially in high-speed 5G use cases, AI/ML techniques (AI/ML beam management and mobility procedures) have been proposed. These techniques rely heavily on data that are most often simulated for various scenarios and do not accurately reflect real deployment behavior or user traffic patterns. Therefore, there is a pressing need for realistic datasets collected under various conditions. This work presents a dataset collected from a commercially deployed network across various modes of mobility (pedestrian, bike, car, bus, and train) and at multiple speeds to capture real-time UE mobility. When collecting the dataset, we focused primarily on handover (HO) scenarios, with the aim of reducing the HO interruption time and maintaining continuous throughput during and immediately after HO execution. To support this research, the dataset includes timing advance (TA) measurements at various signaling events such as RACH trigger, MAC CE, and PDCCH grant, which are typically missing in existing works. We give a detailed description of the creation of the dataset: experimental setup, data acquisition, and extraction. We also present an exploratory analysis of the data, with a primary focus on mobility, beam management, and TA. We discuss multiple use cases in which the proposed dataset can facilitate understanding of the inference of the AI/ML model. One such use case is to train and evaluate various AI/ML models for TA prediction.

2605.12452 2026-05-13 cs.CL cs.AI cs.CY Updated version

The Algorithmic Caricature: Auditing LLM-Generated Political Discourse Across Crisis Events

Gunjan, Sidahmed Benabderrahmane, Talal Rahwan

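Affiliations * New York University (NYUAD), Division of Science, Computer Science Department

AI summary This study examines political discourse generated by large language models (LLMs) during crisis events and how it differs from real online discourse. It builds a paired corpus covering nine crisis events and compares the two along four dimensions: emotional intensity, structural regularity, lexical-ideological framing, and cross-event dependency. Generated content is fluent but unrealistic at the population level: its sentiment is more uniform, its structure more regular, and its wording more abstract. The study proposes the Caricature Gap as a summary measure, exposing the limits of generated political discourse in social realism and diversity.

English abstract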

Large Language Models (LLMs) can generate fluent political text at scale, raising concerns about synthetic discourse during crises and social conflict. Existing AI-text detection often focuses on sentence-level cues such as perplexity, burstiness, or token irregularities, but these signals may weaken as generative systems improve. We instead adopt a Computational Social Science perspective and ask whether synthetic political discourse behaves like an observed online population. We construct a paired corpus of 1,789,406 posts across nine crisis events: COVID-19, the Jan. 6 Capitol attack, the 2020 and 2024 U.S. elections, Dobbs/Roe v. Wade, the 2020 BLM protests, U.S. midterms, the Utah shooting, and the U.S.-Iran war. For each event, we compare observed discourse from social platforms with synthetic discourse generated for the same context. We evaluate four dimensions: emotional intensity, structural regularity, lexical-ideological framing, and cross-event dependency, using mean gaps and dispersion evidence. Across events, synthetic discourse is fluent but population-level unrealistic. It is generally more negative and less dispersed in sentiment, structurally more regular, and lexically more abstract than observed discourse. Observed discourse instead shows broader emotional variation, longer-tailed structural distributions, and more context-specific, colloquial lexical markers. These differences are event-dependent: larger for fast-moving, decentralized crises and smaller for formal or institutionally mediated events. We summarize them with a simple event-level measure, the Caricature Gap. Our findings suggest that the main limitation of synthetic political discourse is not grammar or fluency, but reduced population realism. Population-level auditing complements traditional text-detection and provides a CSS framework for evaluating the social realism of generated discourse.

2605.12438 2026-05-13 cs.CL cs.AI Updated version

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

Rian Touchent, Eric de la Clergerie

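Affiliations * Sorbonne Université / INRIA Paris ALMAnaCH Team; INRIA Paris ALMAnaCH Team

AI summary When adapting an encoder to a new domain, continued pretraining usually uses masked language modeling (MLM). This paper proposes an improvement: temporarily switch to causal language modeling (CLM) during continued training, then follow with a short MLM decay phase, which improves downstream performance. Experiments show that the approach outperforms standard MLM baselines on biomedical text. Analysis shows that CLM affects the encoder's low layers more strongly, and that the resulting representational changes persist through the subsequent MLM phase and grow with model size.

English abstract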

When adapting an encoder to a new domain, the standard approach is to continue training with Masked Language Modeling (MLM). We show that temporarily switching to Causal Language Modeling (CLM) followed by a short MLM decay improves downstream performance. On biomedical texts with ModernBERT, this CLM detour outperforms MLM baselines trained on identical data and compute across 8 French and 11 English biomedical tasks, by +1.2-2.8pp and +0.3-0.8pp respectively, depending on model size. We investigate the reasons for these gains. We find that CLM's dense supervision impacts low transformer layers (0-7) far more than MLM does. Freezing low layers during CLM eliminates the downstream benefit; freezing mid layers preserves it. The representational changes persist through the MLM decay phase, even when it matches the CLM phase in length, and they scale with model capacity. We release ModernCamemBERT-bio and ModernBERT-bio as state-of-the-art biomedical encoders in Base and Large sizes.

2605.12436 2026-05-13 cs.AI Updated version

CAAFC: Chronological Actionable Automated Fact-Checker for misinformation / non-factual hallucination detection and correction

Islam Eldifrawi, Shengrui Wang, Amine Trabelsi

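Affiliations * Usherbrooke

AI summary With the surge in online and AI-generated content, automated fact-checking (AFC) has become increasingly important. This paper proposes CAAFC, a chronological, actionable automated fact-checking framework intended to close the gap between existing systems and fact-checking as practiced. CAAFC not only detects misinformation and hallucinations but also provides actionable corrections backed by primary information sources, and it can update its evidence and knowledge bases with recent, contextual information, improving the reliability of fact verification.

English abstract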

With the vast amount of content uploaded every hour, along with AI-generated content that can include hallucinations, Automated Fact-Checking (AFC) has become increasingly vital, as it is infeasible for human fact-checkers to manually verify the sheer volume of information generated online. Professional fact-checkers have identified several gaps in existing AFC systems, noting a misalignment between how these systems operate and how fact-checking is performed in practice. In this paper, we introduce CAAFC (Chronological Actionable Automated Fact-Checker), a framework designed to bridge these gaps. It surpasses SOTA AFC and hallucination detection systems across multiple benchmark datasets. CAAFC operates on claims, conversations, and dialogues, enabling it not only to detect factual errors and hallucinations, but also to correct them by providing actionable justifications supported by primary information sources. Furthermore, CAAFC can update evidence and knowledge bases by incorporating recent and contextual information when necessary, thereby enhancing the reliability of fact verification.

2605.12421 2026-05-13 cs.AI Updated version

Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers

Haoyu Wang, Yuliang Song, Tao Li, Zhiwei Deng, Yaqing Wang, Deepak Ramachandran, Eldan Cohen, Dan Roth

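Affiliations * University of Pennsylvania; University of Toronto; Google DeepMind; Oracle AI

AI summary The paper examines the heuristic trap that large language models (LLMs) fall into when generating combinatorial solvers. Using a benchmark of 100 combinatorial problems, it compares three solver-construction approaches and finds that constraint modeling with Python and OR-Tools achieves the highest correctness, while MiniZinc with OR-Tools has lower coverage despite sharing the same back-end. Prompting the LLM to optimize search yields only small speed-ups and can reduce correctness, because the LLM tends to adopt heuristics such as local approximations or redundant constraints that hurt solution quality. The paper recommends using the LLM primarily for formal modeling and verifying any search optimization separately.

English abstract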

Large Language Models (LLMs) struggle to solve complex combinatorial problems through direct reasoning, so recent neuro-symbolic systems increasingly use them to synthesize executable solvers. A central design question is how the LLM should represent the solver, and whether it should also attempt to optimize search. We introduce CP-SynC-XL, a benchmark of 100 combinatorial problems (4,577 instances), and evaluate three solver-construction paradigms: native algorithmic search (Python), constraint modeling through a Python solver API (Python + OR-Tools), and declarative constraint modeling (MiniZinc + OR-Tools). We find a consistent representational divergence: Python + OR-Tools attains the highest correctness across LLMs, while MiniZinc + OR-Tools has lower absolute coverage despite using the same OR-Tools back-end. Native Python is the most likely to return a schema-valid solution that fails verification, whereas solver-backed paths preserve higher conditional fidelity. On the heuristic axis, prompting for search optimization yields only small median speed-ups (1.03-1.12x) and a strongly bimodal effect: many instances slow down, and correctness drops sharply on a long tail of problems. A paired code-level audit traces these regressions to a recurring heuristic trap. Under an efficiency-oriented prompt, the LLM may replace complete search with local approximations (Python), inject unverified bounds (Python + OR-Tools), or add redundant declarative machinery that overwhelms or over-constrains the model (MiniZinc + OR-Tools). These findings support a conservative design principle for LLM-generated combinatorial solvers: use the LLM primarily to formalize variables, constraints, and objectives for verified solvers, and separately check any LLM-authored search optimization before use.
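The design principle at the end of the abstract (keep search complete, and check results separately) can be sketched in the style of the "native algorithmic search (Python)" paradigm. N-Queens is a hypothetical example problem chosen for brevity; the abstract does not say whether it appears in CP-SynC-XL.

```python
def solve_n_queens(n):
    """Complete backtracking search; returns one column per row, or None."""
    cols = []

    def place(row):
        if row == n:
            return True
        for c in range(n):
            # A new queen must not share a column or diagonal with earlier rows.
            if all(c != pc and abs(c - pc) != row - pr for pr, pc in enumerate(cols)):
                cols.append(c)
                if place(row + 1):
                    return True
                cols.pop()
        return False

    return cols if place(0) else None

def verify_n_queens(n, cols):
    """Independent check that does not reuse the solver's own logic."""
    if cols is None or len(cols) != n or any(not 0 <= c < n for c in cols):
        return False
    return (len(set(cols)) == n
            and len({r + c for r, c in enumerate(cols)}) == n
            and len({r - c for r, c in enumerate(cols)}) == n)

solution = solve_n_queens(6)
is_valid = verify_n_queens(6, solution)
```

Because the search is exhaustive, a `None` result means the instance has no solution. A heuristic shortcut such as a greedy or local-search placement would lose that guarantee, which is the kind of regression the abstract calls the heuristic trap. The separate verifier catches any schema-valid but incorrect answer.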

2605.12412 2026-05-13 cs.CL cs.AI cs.LG Updated version

Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space

Eric Bigelow, Raphaël Sarfati, Daniel Wurgaft, Owen Lewis, Thomas McGrath, Jack Merullo, Atticus Geiger, Ekdeep Singh Lubana

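AI summary This paper studies how large language models (LLMs) update their beliefs during in-context learning, proposing that the updates take place in a conceptual belief space with low-dimensional geometric structure. Using story understanding tasks and combining behavioral and representational analyses, it finds that belief-update trajectories are low-dimensional and structured, and can be decoded with linear probes to predict behavior. Interventions on these representations causally steer belief trajectories, with effects explained by the geometry of the conceptual space, giving a geometric account of belief dynamics in in-context learning.

English abstract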

Large Language Models (LLMs) update their behavior in context, which can be viewed as a form of Bayesian inference. However, the structure of the latent hypothesis space over which this inference operates remains unclear. In this work, we propose that LLMs assign beliefs over a low-dimensional geometric space - a conceptual belief space - and that in-context learning corresponds to a trajectory through this space as beliefs are updated over time. Using story understanding as a natural setting for dynamic belief updating, we combine behavioral and representational analyses to study these trajectories. We find that (1) belief updates are well-described as trajectories on low-dimensional, structured manifolds; (2) this structure is reflected consistently in both model behavior and internal representations and can be decoded with simple linear probes to predict behavior; and (3) interventions on these representations causally steer belief trajectories, with effects that can be predicted from the geometry of the conceptual space. Together, our results provide a geometric account of belief dynamics in LLMs, grounding Bayesian interpretations of in-context learning in structured conceptual representations.
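The Bayesian reading of in-context learning can be made concrete with a small belief trajectory. The hypotheses and likelihood values below are invented for illustration. The sketch shows sequential Bayes updates only, not the paper's probing or intervention methods.

```python
def bayes_update(prior, likelihoods):
    """Posterior over hypotheses after one observation (Bayes' rule)."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Hypothetical story hypotheses with a uniform prior.
belief = {"mystery": 1 / 3, "romance": 1 / 3, "comedy": 1 / 3}

# Made-up likelihoods P(sentence | hypothesis) for three story sentences.
evidence = [
    {"mystery": 0.6, "romance": 0.3, "comedy": 0.1},
    {"mystery": 0.7, "romance": 0.2, "comedy": 0.1},
    {"mystery": 0.5, "romance": 0.4, "comedy": 0.1},
]

# The sequence of posteriors is the belief trajectory.
trajectory = [belief]
for lik in evidence:
    belief = bayes_update(belief, lik)
    trajectory.append(belief)
```

Each posterior is a point on a probability simplex (here two-dimensional, since three probabilities sum to one), so the list of posteriors traces a path through a low-dimensional space as sentences arrive.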

2605.12411 2026-05-13 cs.LG cs.AI cs.CL cs.MA Updated version

Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling

Eilam Shapira, Moshe Tennenholtz, Roi Reichart

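Affiliations * Faculty of Data and Decision Sciences

AI summary This study asks how to predict the decisions of an unfamiliar AI agent from limited interaction, and proposes a modeling approach that combines text and tabular information. It builds a model on a tabular structure, turning game state, dialogue history, and offer history into table rows, and introduces a small frozen language model as an observer to extract decision-relevant hidden features. Experiments show the method outperforms direct prompting at predicting responses and bargaining offers, demonstrating the value of framing agent prediction as a target-adaptive text-tabular task.

English abstract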

AI agents negotiate and transact in natural language with unfamiliar counterparts: a buyer bot facing an unknown seller, or a procurement assistant negotiating with a supplier. In such interactions, the counterpart's LLM, prompts, control logic, and rule-based fallbacks are hidden, while each decision can have monetary consequences. We ask whether an agent can predict an unfamiliar counterpart's next decision from a few interactions. To avoid real-world logging confounds, we study this problem in controlled bargaining and negotiation games, formulating it as target-adaptive text-tabular prediction: each decision point is a table row combining structured game state, offer history, and dialogue, while $K$ previous games of the same target agent, i.e., the counterpart being modeled, are provided in the prompt as labeled adaptation examples. Our model is built on a tabular foundation model that represents rows using game-state features and LLM-based text representations, and adds LLM-as-Observer as an additional representation: a small frozen LLM reads the decision-time state and dialogue; its answer is discarded, and its hidden state becomes a decision-oriented feature, making the LLM an encoder rather than a direct few-shot predictor. Training on 13 frontier-LLM agents and testing on 91 held-out scaffolded agents, the full model outperforms direct LLM-as-Predictor prompting and game+text features baselines. Within this tabular model, Observer features contribute beyond the other feature schemes: at $K=16$, they improve response-prediction AUC by about 4 points across both tasks and reduce bargaining offer-prediction error by 14%. These results show that formulating counterpart prediction as a target-adaptive text-tabular task enables effective adaptation, and that hidden LLM representations expose decision-relevant signals that direct prompting does not surface.

2605.12406 2026-05-13 cs.AI Updated version

Semantic Reward Collapse and the Preservation of Epistemic Integrity in Adaptive AI Systems

William Parris

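Affiliations * BDB Labs / BagelTech

AI summary This paper discusses Semantic Reward Collapse (SRC), a problem arising from reinforcement learning from human feedback (RLHF) and preference optimization in large language models: different kinds of evaluative dissatisfaction are compressed into a single optimization signal, distorting model behavior around factual errors, uncertainty disclosure, and related areas. It argues that adaptive AI systems under optimization pressure may suppress visible uncertainty rather than maintain calibrated confidence. The author proposes Constitutional Reward Stratification (CRS), a domain-aware reward framework meant to protect distinct kinds of epistemic responsibility, as a testable governance direction for future research.

Comments 15 pages including references. Position and framework paper. Companion empirical work available at arXiv:2604.17587

English abstract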

Recent advances in reinforcement learning from human feedback (RLHF) and preference optimization have substantially improved the usability, coherence, and safety of large language models. However, recurring behaviors such as performative certainty, hallucinated continuity, calibration drift, sycophancy, and suppression of visible uncertainty suggest unresolved structural issues within scalarized preference optimization systems. We propose Semantic Reward Collapse (SRC): the compression of semantically distinct forms of evaluative dissatisfaction into generalized optimization signals. Under SRC, categories such as factual incorrectness, uncertainty disclosure, formatting dissatisfaction, latency, and social preference may become entangled within a shared reward topology despite representing fundamentally different epistemic classes. We argue that adaptive reasoning systems operating under generalized evaluative pressure may drift toward suppression of visible epistemic failure rather than preservation of calibrated uncertainty integrity. These behaviors are framed strictly as optimization consequences rather than evidence of deception or anthropomorphic agency. Drawing on institutional proxy collapse, metric gaming, software reliability engineering, and human learning theory, we propose that uncertainty disclosure and escalation behavior should be treated as protected epistemic conduct rather than globally penalized task incompletion. Finally, we introduce Constitutional Reward Stratification (CRS), a domain-aware reward framework intended to preserve differentiated epistemic attribution within adaptive learning systems. We present CRS not as a validated solution, but as a testable governance-oriented research direction requiring further empirical investigation.

2605.12389 2026-05-13 cs.CV cs.AI cs.LG Updated version

SEMIR: Semantic Minor-Induced Representation Learning on Graphs for Visual Segmentation

Luke James Miller, Yugyung Lee

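Affiliations * Department of Computing, Analytics, University of Missouri-Kansas City, Kansas City, United States

AI summary The paper proposes SEMIR, a semantic minor-induced graph representation learning method that addresses the computational cost and class imbalance of segmenting small, sparse structures in large-scale images. Through parameterized edge contraction, node deletion, and related operations, SEMIR transforms the original grid graph into a compact, boundary-aligned graph minor while preserving an exact mapping from graph predictions to lattice labels. The method performs well on several tumor segmentation datasets, markedly improving Dice scores on small structures, and offers a new framework for task-adapted representation learning on high-resolution structured visual data.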

Comments 20 pages, 3 figures. Accepted at ICML 2026. Includes appendices

Details
English abstract

Segmenting small and sparse structures in large-scale images is fundamentally constrained by voxel-level, lattice-bound computation and extreme class imbalance -- dense, full-resolution inference scales poorly and forces most pipelines to rely on fixed regionization or downsampling, coupling computational cost to image resolution and attenuating boundary evidence precisely where minority structures are most informative. We introduce SEMIR (Semantic Minor-Induced Representation Learning), a representation framework that decouples inference from the native grid by learning a task-adapted, topology-preserving latent graph representation with exact decoding. SEMIR transforms the underlying grid graph into a compact, boundary-aligned graph minor through parameterized edge contraction, node deletion, and edge deletion, while preserving an exact lifting map from minor predictions to lattice labels. Minor construction is formalized as a few-shot structure learning problem that replaces hand-tuned preprocessing with a boundary-alignment objective: minor parameters are learned by maximizing agreement between predicted boundary elements and target-specific semantic edges under a boundary Dice criterion, and the induced minor is annotated with scale- and rotation-robust geometric and intensity descriptors and supports efficient region-level inference via message passing on a graph neural network (GNN) with relational edge features. We benchmark SEMIR on three tumor segmentation datasets -- BraTS 2021, KiTS23, and LiTS -- where targets exhibit high structural variability and distributional uncertainty. SEMIR yields consistent improvements in minority-structure Dice at practical runtime. More broadly, SEMIR establishes a framework for learning task-adapted, topology-preserving latent representations with exact decoding for high-resolution structured visual data.
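The boundary Dice criterion mentioned in the abstract can be illustrated with the standard Dice coefficient over two sets of boundary elements. This is only a sketch under assumptions: the function name `dice` and the toy edge sets are hypothetical, and the abstract does not say how SEMIR defines or optimizes its boundary elements.

```python
def dice(predicted, target):
    """Standard Dice coefficient 2|A & B| / (|A| + |B|) between two sets."""
    predicted, target = set(predicted), set(target)
    if not predicted and not target:
        return 1.0  # convention assumed here: two empty sets agree perfectly
    return 2 * len(predicted & target) / (len(predicted) + len(target))


# Toy boundary edges, written as (node, node) pairs (illustrative data only).
predicted_edges = {(0, 1), (1, 2), (2, 3)}
target_edges = {(1, 2), (2, 3), (3, 4)}
score = dice(predicted_edges, target_edges)  # two shared edges out of 3 + 3
```

A score of 1 means the predicted boundary matches the target exactly, and 0 means the two share no elements.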

2605.12384 2026-05-13 cs.CL cs.AI cs.LG Updated version

Scalable Token-Level Hallucination Detection in Large Language Models

Rui Min, Tianyu Pang, Chao Du, Minhao Cheng, Yi R. Fung

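Affiliations * Sea AI Lab; Hong Kong University of Science and Technology; Pennsylvania State University

AI summary Large language models (LLMs) often hallucinate when generating text, especially in reasoning-heavy tasks, where the content looks plausible but contains logical errors or unreliable intermediate results and is therefore hard to detect. To address the limited granularity and scalability of existing methods, the paper proposes TokenHD, a token-level hallucination detection framework that uses a scalable data generation engine and an importance-weighted training strategy to detect hallucinations directly in free-form text, without relying on predefined step segmentation. Experiments show that even a small detector (0.6B) substantially improves detection performance, that performance keeps improving as model size grows, and that the detector generalizes well across a variety of practical scenarios.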

Details
English abstract

Large language models (LLMs) have demonstrated remarkable capabilities, but they still frequently produce hallucinations. These hallucinations are difficult to detect in reasoning-intensive tasks, where the content appears coherent but contains errors like logical flaws and unreliable intermediate results. While step-level analysis is commonly used to detect internal hallucinations, it suffers from limited granularity and poor scalability due to its reliance on step segmentation. To address these limitations, we propose TokenHD, a holistic pipeline for training token-level hallucination detectors. Specifically, TokenHD consists of a scalable data engine for synthesizing large-scale hallucination annotations along with a training recipe featuring an importance-weighted strategy for robust model training. To systematically assess the detection performance, we also provide a rigorous evaluation protocol. Through training within TokenHD, our detector operates directly on free-form text to identify hallucinations, eliminating the need for predefined step segmentation or additional text reformatting. Our experiments show that even a small detector (0.6B) achieves substantial performance gains after training, surpassing much larger reasoning models (e.g., QwQ-32B), and detection performance scales consistently with model size from 0.6B to 8B. Finally, we show that our detector can generalize well across diverse practical scenarios and explore strategies to further enhance its cross-domain generalization capability.

2605.12380 2026-05-13 cs.LG cs.AI Updated version

Trust the Batch, On- or Off-Policy: Adaptive Policy Optimization for RL Post-Training

Rasool Fakoor, Murdock Aubry, Nicholas Stranges, Alexander J. Smola

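Affiliations * Boson AI

AI summary Reinforcement learning is structurally harder than supervised learning because the policy changes the data distribution it learns from, which makes training fragile, particularly for large models. The paper proposes an adaptive policy optimization method that replaces conventional fixed clipping with the normalized effective sample size of the current batch's policy-ratio distribution, dynamically adjusting both the clipping threshold in the objective and the strength of off-policy regularization. This keeps policy updates stable while improving robustness to stale or distribution-mismatched data. Experiments show strong performance across many settings, with no new hyperparameters and several existing ones removed.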

Details
English abstract

Reinforcement learning is structurally harder than supervised learning because the policy changes the data distribution it learns from. The resulting fragility is especially visible in large-model training, where the training and rollout systems differ in numerical precision, sampling, and other implementation details. Existing methods manage this fragility by adding hyper-parameters to the training objective, which makes the algorithm more sensitive to its configuration and requires retuning whenever the task, model scale, or distribution mismatch changes. This fragility traces to two concerns that current objectives entangle through hyper-parameters set before training begins: a trust-region concern, that updates should not move the policy too far from its current value, and an off-policy concern, that data from older or different behavior policies should influence the update only to the extent that it remains reliable. Neither concern is a constant to set in advance, and their severity is reflected in the policy-ratio distribution of the current batch. We present a simple yet effective batch-adaptive objective that replaces fixed clipping with the normalized effective sample size of the policy ratios. The same statistic caps the score-function weight and sets the strength of an off-policy regularizer, so the update stays close to the usual on-policy score-function update when ratios are nearly uniform, and tightens automatically when stale or mismatched data cause ratio concentration, while retaining a nonzero learning signal on high-ratio tokens. Experiments across a wide range of settings show that our method matches or exceeds tuned baselines, introducing no new objective hyper-parameters and removing several existing ones. The code is available at https://github.com/FeynRL-project/FeynRL.
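The abstract does not spell out the statistic, so the following is only a sketch of the standard (Kish) normalized effective sample size of a batch of importance ratios, which is an assumption about what the authors compute. The name `normalized_ess` and the toy ratios are illustrative, and how the paper turns this value into a weight cap or regularizer strength is not shown here.

```python
def normalized_ess(ratios):
    """Kish effective sample size divided by batch size: (sum w)^2 / (n * sum w^2)."""
    if not ratios:
        raise ValueError("ratios must be non-empty")
    total = sum(ratios)
    total_sq = sum(r * r for r in ratios)
    if total_sq == 0:
        raise ValueError("at least one ratio must be non-zero")
    return (total * total) / (len(ratios) * total_sq)


# Nearly on-policy batch: all ratios equal, so every sample counts fully.
uniform = normalized_ess([1.0, 1.0, 1.0, 1.0])
# Stale or mismatched batch: one token carries all the weight.
concentrated = normalized_ess([4.0, 0.0, 0.0, 0.0])
```

The value is 1 for uniform ratios and falls toward 1/n as the ratios concentrate on a few tokens, which matches the abstract's description of an update that tightens when ratio concentration appears.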

2605.12379 2026-05-13 cs.LG cs.AI Updated version

Discrete Flow Matching for Offline-to-Online Reinforcement Learning

Fairoz Nower Khan, Nabuat Zaman Nahim, Peizhong Ju

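Affiliations * Department of Computer Science, University of Kentucky

AI summary The paper studies how to transfer generative policies trained on offline data to online interaction in reinforcement learning tasks with discrete action spaces. To address the shortcomings of existing methods in discrete action spaces and online fine-tuning, the authors propose DRIFT, which fine-tunes a pretrained continuous-time Markov chain policy online using an advantage-weighted discrete flow matching loss and a path-space penalty. The method improves policy performance while preserving pretrained knowledge, and shows superior stability and results on several mainstream discrete-action tasks.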

Details
English abstract

Many reinforcement learning (RL) tasks have discrete action spaces, but most generative policy methods based on diffusion and flow matching are designed for continuous control. Meanwhile, generative policies usually rely heavily on offline datasets, and offline-to-online RL is itself challenging, as the policy must improve from new interaction without losing useful behavior learned from static data. To address these challenges, we introduce DRIFT, an online fine-tuning method that updates an offline pretrained continuous-time Markov chain (CTMC) policy with an advantage-weighted discrete flow matching loss. To preserve useful pretrained knowledge, we add a path-space penalty that regularizes the full CTMC trajectory distribution, rather than only the final action distribution. For large discrete action spaces, we introduce a candidate-set approximation that updates the actor over a small subset of actions sampled from reference-policy rollouts and uniform exploration. Our theoretical analysis shows that the candidate-set error is controlled by missing target probability mass, and the induced CTMC generator error decreases as the candidate set covers more high-probability actions. Experiments on prevailing discrete-action RL tasks show that our method provides stable offline-to-online improvement across all tasks, achieving the highest average score on Jericho with a simple GRU encoder while outperforming methods that use pretrained language models. Controlled experiments further confirm that the path-space penalty remains bounded during fine-tuning and that the CTMC generator adapts to shifted rewards faster than deterministic baselines. The candidate-set mechanism is supported by a stability analysis showing that the generator error decreases exponentially with candidate coverage.

2605.12375 2026-05-13 cs.LG cs.AI Updated version

Agent-Based Post-Hoc Correction of Agricultural Yield Forecasts

Matthew Beddows, Aiden Durrant, Georgios Leontidis

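Affiliations * School of Natural and Computing Sciences, University of Aberdeen; School of Computing Sciences, University of East Anglia; Department of Physics and Technology, UiT The Arctic University of Norway

AI summary The study addresses crop yield forecasting in commercial soft fruit production, where accuracy is limited by scarce data, and proposes a post-hoc correction framework built on a structured large language model (LLM) agent. The method encodes agricultural domain knowledge in tools for phase detection, bias learning, and range validation, and uses them to correct the predictions of existing models. Experiments on strawberry and corn yield datasets show markedly improved forecast accuracy, with Llama 3.1 8B as the agent giving the best results.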

Comments 21 pages, 6 figures, 6 tables

Details
English abstract

Accurate crop yield forecasting in commercial soft fruit production is constrained by the data available in typical commercial farm records, which lack the sensor networks, satellite imagery, and high-resolution meteorological inputs that most state-of-the-art approaches assume. We propose a structured LLM agent framework that performs post-hoc correction of existing model predictions, encoding agricultural domain knowledge across tools for phase detection, bias learning, and range validation. Evaluated on a proprietary strawberry yield dataset and a public USDA corn harvest dataset, agent refinement of XGBoost reduced MAE by 20% and MASE by 56% on strawberry, with consistent improvements across Moirai2 (MAE 24%, MASE 22%) and Random Forest (MAE 28%, MASE 66%) baselines. Using Llama 3.1 8B as the agent produced the strongest corrections across all configurations; LLaVA 13B showed inconsistent gains, highlighting sensitivity to the choice of refinement model.
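For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of MAE and MASE. It assumes the common non-seasonal definition of MASE, which scales the forecast MAE by the in-sample MAE of a one-step naive forecast; the abstract does not state which variant the authors use, and the function names and numbers are illustrative only.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mase(actual, predicted, train):
    """MAE scaled by the in-sample MAE of a one-step naive forecast (assumed variant)."""
    naive_errors = [abs(train[i] - train[i - 1]) for i in range(1, len(train))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    return mae(actual, predicted) / naive_mae


# Toy yield history and a two-step forecast (made-up numbers).
history = [10.0, 12.0, 11.0, 13.0]
observed = [14.0, 15.0]
forecast = [13.0, 17.0]
forecast_mae = mae(observed, forecast)
forecast_mase = mase(observed, forecast, history)
```

A MASE below 1 means the forecast beats the naive "repeat the last value" baseline on average, which is why a relative reduction in MASE can differ from the reduction in MAE.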

2605.12366 2026-05-13 cs.AI Updated version

Classifier Context Rot: Monitor Performance Degrades with Context Length

Sam Martin, Fabien Roger

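Affiliations * Anthropic Fellows Program

AI summary The study shows that when current frontier language models are used as classifiers to monitor coding agents for dangerous behavior, their performance degrades markedly as context length grows. When a dangerous action appears after as much as 800K tokens of benign content, the error rate of mainstream models such as Opus 4.6, GPT 5.4, and Gemini 3.1 rises by a factor of 2 to 30. The authors also find that the problem can be partially mitigated through prompting techniques and improved post-training, and stress that existing monitor evaluations may overestimate performance by ignoring long-context degradation.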

Details
English abstract

Monitoring coding agents for dangerous behavior using language models requires classifying transcripts that often exceed 500K tokens, but prior agent monitoring benchmarks rarely contain transcripts longer than 100K tokens. We show that when used as classifiers, current frontier models fail to notice dangerous actions more often in longer transcripts. In particular, on a dataset that requires identifying when a coding agent takes a subtly dangerous action, Opus 4.6, GPT 5.4, and Gemini 3.1 miss these actions $2\times$ to $30\times$ more often when they occur after 800K tokens of benign activity than when they occur on their own. We also show that these weaknesses can be partially mitigated with prompting techniques such as periodic reminders throughout the transcript and may be mitigated further with better post-training. Monitor evaluations that do not consider long-context degradation are likely overestimating monitor performance.

2605.12365 2026-05-13 quant-ph cs.AI Updated version

QAP-Router: Tackling Qubit Routing as Dynamic Quadratic Assignment with Reinforcement Learning

Kien X. Nguyen, Ankit Kulshrestha, Ilya Safro, Xiaoyuan Liu

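Affiliations * Department of Computer and Information Sciences, University of Delaware; Quantum Lab, Fujitsu Research of America

AI summary Qubit routing is a fundamental and hard problem in quantum compilation: because it is dynamic, local decisions compound over time, making globally optimal solutions hard to obtain. The paper proposes QAP-Router, which models qubit routing as a dynamic quadratic assignment problem and solves it with reinforcement learning. Modeling quantum gate interactions as flow matrices and the hardware topology as a distance matrix captures the coupling between interaction and distance in a unified way and defines the reward function in the reinforcement learning environment. Experiments on several real-world quantum circuit datasets show a significant reduction in the CNOT gate count of routed circuits.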

Details
English abstract

Qubit routing is a fundamental problem in quantum compilation, known to be NP-hard. Its dynamic nature makes local routing decisions propagate and compound over time, which makes globally efficient solutions hard to obtain. Existing heuristic methods rely on local rules with limited lookahead, while recent learning-based approaches often treat routing as a generic sequential decision problem without fully exploiting its underlying structure. In this paper, we introduce QAP-Router, which frames qubit routing as a dynamic Quadratic Assignment Problem (QAP). By modeling logical interactions, or quantum gates, as flow matrices and hardware topology as a distance matrix, our approach captures the interaction-distance coupling in a unified objective, which defines the reward in the reinforcement learning environment. To further exploit this structure, the policy network employs a solution-aware Transformer backbone that encodes the interaction between the flow matrix and the distance matrix into the attention mechanism. We also integrate a lookahead mechanism that blends naturally into the QAP framework, preventing myopic decisions. Extensive experiments on 1,831 real-world quantum circuits from the MQTBench, AgentQ and QUEKO datasets show that our method substantially reduces the CNOT gate count of routed circuits by 15.7%, 30.4% and 12.1%, respectively, relative to existing industry compilers.
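The flow-times-distance coupling described above is the classical quadratic assignment objective. The sketch below evaluates that static objective for a single placement of logical qubits on physical qubits. The function name `qap_cost`, the three-qubit line topology, and the flow values are all hypothetical; the paper's dynamic formulation and its exact reward are not given in the abstract.

```python
def qap_cost(flow, dist, placement):
    """Classical QAP objective: sum over i, j of flow[i][j] * dist[p(i)][p(j)]."""
    n = len(flow)
    return sum(
        flow[i][j] * dist[placement[i]][placement[j]]
        for i in range(n)
        for j in range(n)
    )


# Hardware: three physical qubits on a line, distances in hops (assumed topology).
dist = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]
# Circuit: logical qubits 0 and 2 interact twice (assumed flow).
flow = [
    [0, 0, 2],
    [0, 0, 0],
    [0, 0, 0],
]
far_cost = qap_cost(flow, dist, [0, 1, 2])   # interacting pair placed two hops apart
near_cost = qap_cost(flow, dist, [0, 2, 1])  # interacting pair placed on adjacent qubits
```

A lower cost means frequently interacting logical qubits sit closer together on the hardware, which is the intuition behind fewer inserted SWAP (and hence CNOT) gates.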

2605.12362 2026-05-13 cs.NE cs.AI Updated version

A Family of Quaternion-Valued Differential Evolution Algorithms for Numerical Function Optimization

Gerardo Altamirano-Gomez, Álvaro Gallardo, Carlos Ignacio Hernández Castellanos

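Affiliations * Instituto de Investigaciones en Matemáticas Aplicadas y Sistemas, Universidad Nacional Autónoma de México; Universidad Iberoamericana

AI summary The paper proposes a family of quaternion-based differential evolution (QDE) algorithms for the numerical optimization of continuous functions. The algorithms operate directly in quaternion space and include several mutation strategies that exploit the algebraic and geometric properties of quaternions, improving convergence speed and optimization performance. Experiments show that QDE outperforms traditional real-valued differential evolution on the BBOB benchmark, demonstrating its potential in computational intelligence.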

Details
English abstract

The numerical optimization of continuous functions is a fundamental task in many scientific and engineering domains, ranging from mechanical design to training of artificial intelligence models. Among the most effective and widely used algorithms for this purpose is Differential Evolution (DE), known for its simplicity and strong performance. Recent research has shown that adapting AI models to operate over alternative number systems, such as complex numbers, quaternions, and geometric algebras, can improve model compactness and accuracy. However, such extensions remain underexplored in bio-inspired optimization algorithms. In particular, the use of quaternion algebra represents an emerging area in computational intelligence. This paper introduces a family of novel Quaternion-Valued Differential Evolution (QDE) algorithms that operate directly in the quaternion space. We propose several mutation strategies specifically designed to exploit the algebraic and geometric properties of quaternions. Results show that our QDE variants achieve faster convergence and superior performance on several function classes in the BBOB benchmark compared to the traditional real-valued DE algorithm.

2605.12361 2026-05-13 cs.CL cs.AI cs.IR Updated version

MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

Rezarta Islamaj, Robert Leaman, Joey Chan, Nicholas Wan, Qiao Jin, Natalie Xie, John Wilbur, Shubo Tian, Lana Yeganova, Po-Ting Lai, Chih-Hsuan Wei, Yifan Yang, Yao Ge, Qingqing Zhu, Zhizheng Wang, Zhiyong Lu

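Affiliations * National Library of Medicine, Division of Intramural Research; University of Illinois at Urbana-Champaign, Department of Computer Science; University of Michigan Medical School

AI summary MedHopQA is a disease-centered multi-hop reasoning benchmark designed to evaluate the genuine reasoning ability of LLM-based biomedical question answering systems. It contains 1,000 expert-annotated question-answer pairs; each question requires integrating information from two distinct Wikipedia articles and is answered in open-ended free text. To make evaluation more robust and fair, MedHopQA adds ontology-backed synonym sets and a layered verification process, and it reduces leaderboard gaming and data contamination risk by embedding the scored questions in a large set of unlabeled questions, providing a reusable framework for building future biomedical QA datasets.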

Details
English abstract

Evaluating large language models (LLMs) in the biomedical domain requires benchmarks that can distinguish reasoning from pattern matching and remain discriminative as model capabilities improve. Existing biomedical question answering (QA) benchmarks are limited in this respect. Multiple-choice formats can allow models to succeed through answer elimination rather than inference, while widely circulated exam-style datasets are increasingly vulnerable to performance saturation and training data contamination. Multi-hop reasoning, defined as the ability to integrate information across multiple sources to derive an answer, is central to clinically meaningful tasks such as diagnostic support, literature-based discovery, and hypothesis generation, yet remains underrepresented in current biomedical QA benchmarks. MedHopQA is a disease-centered multi-hop reasoning benchmark consisting of 1,000 expert-curated question-answer pairs introduced as a shared task at BioCreative IX. Each question requires synthesis of information across two distinct Wikipedia articles, and answers are provided in an open-ended free-text format. Gold annotations are augmented with ontology-grounded synonym sets from MONDO, NCBI Gene, and NCBI Taxonomy to support both lexical and concept-level evaluation. MedHopQA was constructed through a structured process combining human annotation, triage, iterative verification, and LLM-as-a-judge validation. To reduce leaderboard gaming and contamination risk, the 1,000 scored questions are embedded within a publicly downloadable set of 10,000 questions, with answers withheld, on a CodaBench leaderboard. MedHopQA provides both a benchmark and a reusable framework for constructing future biomedical QA datasets that prioritize compositional reasoning, saturation resistance, and contamination resistance as core design constraints.

2605.12357 2026-05-13 cs.AI Updated version

$δ$-mem: Efficient Online Memory for Large Language Models

Jingdi Lei, Di Zhang, Junxian Li, Weida Wang, Kaixuan Fan, Xiang Liu, Qihan Liu, Xiaoteng Ma, Baian Chen, Soujanya Poria

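Affiliations * Nanyang Technological University; Fudan University; Mind Lab Shanghai Jiao Tong University; The Chinese University of Hong Kong; The Hong Kong University of Science and Technology (Guangzhou)

AI summary Large language models in long-term assistants and agent systems need to accumulate and reuse historical information effectively. Because simply extending the context window is costly and of limited benefit, the paper proposes $δ$-mem, a lightweight online memory mechanism that compresses history into a fixed-size state matrix updated by a delta learning rule, and uses the memory readout during generation to apply low-rank corrections to the backbone's attention computation. Experiments show that $δ$-mem significantly improves results on several benchmarks while preserving general capabilities, with the largest gains on memory-intensive tasks.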

Details
English abstract

Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose $δ$-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory. $δ$-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation. With only an $8\times8$ online memory state, $δ$-mem improves the average score to $1.10\times$ that of the frozen backbone and $1.15\times$ that of the strongest non-$δ$-mem memory baseline. It achieves larger gains on memory-heavy benchmarks, reaching $1.31\times$ on MemoryAgentBench and $1.20\times$ on LoCoMo, while largely preserving general capabilities. These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
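The abstract names delta-rule learning without giving the update, so the following is a sketch of the textbook delta rule for an associative memory matrix, S <- S + beta * (v - S k) k^T, and should be read as an assumption rather than the paper's exact update. The names `delta_rule_update` and `read_memory`, the 2x2 state, and the step size `beta` are illustrative; the low-rank attention correction is not modeled.

```python
def read_memory(state, query):
    """Read out the memory: matrix-vector product S q."""
    return [sum(s * q for s, q in zip(row, query)) for row in state]


def delta_rule_update(state, key, value, beta=1.0):
    """One delta-rule step: move the stored value for `key` toward `value`."""
    predicted = read_memory(state, key)               # what the memory currently returns
    error = [v - p for v, p in zip(value, predicted)]  # prediction error v - S k
    return [
        [s + beta * e * k for s, k in zip(row, key)]   # rank-one correction e k^T
        for row, e in zip(state, error)
    ]


# Fixed-size state: it stays 2x2 no matter how many items are written.
state = [[0.0, 0.0], [0.0, 0.0]]
state = delta_rule_update(state, key=[1.0, 0.0], value=[3.0, 4.0])
recalled = read_memory(state, [1.0, 0.0])
```

With a unit-norm key and `beta=1.0`, a single write stores the value exactly, and writing the same pair again leaves the state unchanged because the prediction error is zero.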

2605.12339 2026-05-13 cs.LG cs.AI Updated version

BSO: Safety Alignment Is Density Ratio Matching

Tien-Phat Nguyen, Truong Nguyen, Thin Nguyen, Duy Minh Ho Nguyen, Ngoc-Thanh Dinh, Trung Le

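Affiliations * Hanoi University of Science and Technology; Deakin University; Max Planck Research School for Intelligent Systems; VinUniversity; Monash University

AI summary The paper proposes BSO, which recasts safety alignment of language models as a density ratio matching problem and thereby simplifies the traditionally complex training pipeline. Minimizing the Bregman divergence between the data and model ratios yields a family of single-stage loss functions with theoretical guarantees that recover the optimal safe policy. BSO is general and simple: it needs no auxiliary models, introduces only one additional hyperparameter, and includes existing safety alignment methods as special cases. Experiments show a better balance between safety and helpfulness.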

Details
English abstract

Aligning language models for both helpfulness and safety typically requires complex pipelines: separate reward and cost models, online reinforcement learning, and primal-dual updates. Recent direct preference optimization approaches simplify training but incorporate safety through ad-hoc modifications such as multi-stage procedures or heuristic margin terms, lacking a principled derivation. We show that the likelihood ratio of the optimal safe policy admits a closed-form decomposition that reduces safety alignment to a density ratio matching problem. Minimizing Bregman divergences between the data and model ratios yields Bregman Safety Optimization (BSO), a family of single-stage loss functions, each induced by a convex generator, that provably recover the optimal safe policy. BSO is both general and simple: it requires no auxiliary models, introduces only one hyperparameter beyond standard preference optimization, and recovers existing safety-aware methods as special cases. Experiments across safety alignment benchmarks show that BSO consistently improves the safety-helpfulness trade-off.
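To make "a family of losses, each induced by a convex generator" concrete, here is a sketch of the scalar Bregman divergence D_f(a, b) = f(a) - f(b) - f'(b)(a - b) for two familiar generators. It only illustrates the general definition; the function name and the choice of generators are assumptions, and the actual BSO losses over density ratios are not reproduced.

```python
import math


def bregman_divergence(f, grad_f, a, b):
    """Scalar Bregman divergence induced by a convex generator f with derivative grad_f."""
    return f(a) - f(b) - grad_f(b) * (a - b)


# Generator f(t) = t^2 recovers the squared difference (a - b)^2.
squared = bregman_divergence(lambda t: t * t, lambda t: 2 * t, 3.0, 1.0)

# Generator f(t) = t log t - t recovers the generalized KL term a log(a/b) - a + b.
generalized_kl = bregman_divergence(
    lambda t: t * math.log(t) - t,
    lambda t: math.log(t),
    2.0,
    1.0,
)
```

Swapping the generator changes how mismatches between the two arguments are penalized while keeping the divergence zero exactly when they are equal.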

2605.12338 2026-05-13 cs.LG cs.AI stat.CO Updated version

Manifold Sampling via Entropy Maximization

Cornelius V. Braun, Tilman Burghoff, Marc Toussaint

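Affiliations * Technische Universität Berlin; Robotics Institute Germany

AI summary The paper studies sampling on manifolds defined implicitly by smooth equality and inequality constraints, particularly when the feasible region consists of several disconnected components. To address this, the authors propose MASEM, a method based on entropy-maximizing resampling that maximizes the entropy of the empirical distribution using k-nearest-neighbor density estimation, improving sampling efficiency. Experiments on synthetic data and robotics applications show better mixing efficiency and scalability, significantly outperforming existing methods.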

Details
English abstract

Sampling from constrained distributions has a wide range of applications, including in Bayesian optimization and robotics. Prior work establishes convergence and feasibility guarantees for constrained sampling, but assumes that the feasible set is connected. However, in practice, the feasible set often decomposes into multiple disconnected components, which makes efficient sampling under constraints challenging. In this paper, we propose MAnifold Sampling via Entropy Maximization (MASEM) for sampling on a manifold with an unknown number of disconnected components, implicitly defined by smooth equality and inequality constraints. The presented method uses a resampling scheme to maximize the entropy of the empirical distribution based on k-nearest neighbor density estimation. We show that, in the mean field, MASEM decreases the KL-divergence between the empirical distribution and the maximum-entropy target exponentially in the number of resampling steps. We instantiate MASEM with multiple local samplers and demonstrate its versatility and efficiency on synthetic and robotics-based benchmarks. MASEM enables fast and scalable mixing across a range of constrained sampling problems, improving over alternatives by an order of magnitude in Sinkhorn distance with competitive runtime.

2605.12335 2026-05-13 cs.IR cs.AI cs.LG Updated version

EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records

Saeed Shurrab, Mariam Al-Omari, Dana El Samad, Farah E. Shamout

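Affiliations * New York University; New York University Abu Dhabi

AI summary Electronic health records (EHR) contain rich longitudinal patient information and are widely used for predictive modeling, but making effective use of historical data is difficult because of long trajectories, heterogeneous events, and irregular timing. The paper proposes EHR-RAGp, a retrieval-augmented, prototype-guided foundation model that dynamically integrates the most relevant history across different clinical event types to improve prediction. A prototype-guided retrieval module aligns historical data with the prediction task and estimates its relevance, steering the model toward the most informative context. The model outperforms existing state-of-the-art models on several clinical prediction tasks.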

Comments Retrieval Augmented EHR Foundation Model

Details
English abstract

Electronic Health Records (EHR) contain rich longitudinal patient information and are widely used in predictive modeling applications. However, effectively leveraging historical data remains challenging due to long trajectories, heterogeneous events, temporal irregularity, and the varying relevance of past clinical context. Existing approaches often rely on fixed windows or uniform aggregation, which can obscure clinically important signals. In this work, we introduce EHR-RAGp, a retrieval-augmented foundation model that dynamically integrates the most relevant patient history across diverse clinical event types. We propose a prototype-guided retrieval module that acts as an alignment mechanism and estimates the relevance of retrieved historical chunks with respect to a given prediction task, guiding the model towards the most informative context. Across multiple clinical prediction tasks, EHR-RAGp consistently outperforms state-of-the-art EHR foundation models and transformer-based baselines. Furthermore, integrating EHR-RAGp with existing clinical foundation models yields substantial performance gains. Overall, EHR-RAGp provides a scalable and efficient framework for leveraging long-range clinical context to improve downstream performance.

2605.12332 2026-05-13 cs.AI Updated version

Towards Automated Air Traffic Safety Assessment Around Non-Towered Airports Using Large Language Models

Torsten Darrell, Mahyar Ghazanfari, Jordan Kam, Alexandre Bayen, Amin Tabrizian, Peng Wei

Affiliations * Research Intern, Department of Mechanical and Aerospace Engineering, George Washington University; Ph.D. Student, Department of Mechanical and Aerospace Engineering, George Washington University; Undergraduate Student, Aerospace Program, University of California, Berkeley; Full Professor, Department of Electrical Engineering and Computer Science, University of California, Berkeley; Ph.D. Student, Department of Computer Science, George Washington University; Associate Professor, Department of Mechanical and Aerospace Engineering, George Washington University

AI summary The paper studies frameworks that use large language models (LLMs) for post-flight safety analysis at non-towered airports, with the aim of improving air traffic safety assessment at such airports. Combining CTAF communication transcripts, weather data, ADS-B flight trajectories, and Visual Flight Rules charts, it proposes a general vision-language model approach for identifying potential safety hazards. A real-world case study and an evaluation on a synthetic dataset validate the approach for identifying hazardous situations such as right-of-way violations, offering a feasible technical path toward automated safety assessment.

Comments 25 pages, 17 figures, 5 tables, Accepted to AIAA 2026

Details
English abstract

We investigate frameworks for post-flight safety analysis at non-towered airports using large language models (LLMs). Non-towered airports rely on the Common Traffic Advisory Frequency (CTAF) for air traffic coordination and experience frequent near mid-air collisions due to the pilot self-announcement communication protocol. We propose a general vision-language model (VLM) approach to analyze the transcribed CTAF radio communications in natural language, METeorological Aerodrome Report (METAR) weather data, Automatic Dependent Surveillance-Broadcast (ADS-B) flight trajectories, and Visual Flight Rules sectional charts of the airfield. We provide a preliminary study at Half Moon Bay Airport, with a qualitative real-world case study and a quantitative evaluation using a new synthetic dataset of communications and weather modalities. We qualitatively evaluate our framework on real flight data using Gemini 2.5 Pro, demonstrating accurate identification of a right-of-way violation. The synthetic dataset is derived from real examples and includes a 12-category hazard taxonomy, and is used to benchmark three open-source (Qwen 2.5-7B, Mistral-7B, Gemma-2-9B) and three closed-source (GPT-4o, GPT-5.4, Claude Sonnet 4.6) LLMs on the subset of inputs related to CTAF and METAR. Even limited to CTAF and METAR inputs and open-source LLMs, instances of our framework typically achieve a macro F1 score above 0.85 on a binary nominal/danger classification task. Future work includes a quantitative evaluation across all modalities and a larger number of real-world examples. Taken together, our results suggest that VLM analysis of safety at non-towered airports may be a valuable future capability.
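The macro F1 score reported above averages per-class F1 with equal weight, so the rarer class counts as much as the common one. A minimal sketch for the binary nominal/danger setting follows; the function name `macro_f1` and the four toy labels are made up for illustration and are not taken from the paper's dataset.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)  # F1 = 2TP / (2TP + FP + FN)
    return sum(scores) / len(scores)


# Toy labels: one nominal transcript is wrongly flagged as danger.
truth = ["nominal", "nominal", "danger", "danger"]
predicted = ["nominal", "danger", "danger", "danger"]
score = macro_f1(truth, predicted)
```

In this toy case the per-class F1 values are 2/3 for nominal and 4/5 for danger, and the macro score is their plain average.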

2605.12312 2026-05-13 cs.LG cs.AI Updated version

Transferable Delay-Aware Reinforcement Learning via Implicit Causal Graph Modeling

Chenran Zhao, Dianxi Shi, Yaowen Zhang, Chunping Qiu, Shaowu Yang

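Affiliations * College of Computer Science and Technology, National University of Defense Technology; Intelligent Game and Decision Lab (IGDL); Institute of Military Transportation; School of Artificial Intelligence, Hebei University of Technology

AI summary The paper studies how to improve policy transferability and adaptability in cross-task reinforcement learning with random delays. To address the temporal misalignment between actions and state feedback caused by delays, and the difficulty of reusing knowledge when task objectives change, the authors propose a transferable delay-aware reinforcement learning method based on implicit causal graph modeling. A field-node encoder maps high-dimensional observations to latent states with node-level semantics, and a message-passing mechanism learns the dynamic causal dependencies among nodes, yielding transferable structured representations and knowledge of environment dynamics that improve the efficiency and performance of cross-task learning.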

Details
English abstract

Random delays weaken the temporal correspondence between actions and subsequent state feedback, making it difficult for agents to identify the true propagation process of action effects. In cross-task scenarios, changes in task objectives and reward formulations further reduce the reusability of previously acquired task knowledge. To address this problem, this paper proposes a transferable delay-aware reinforcement learning method based on implicit causal graph modeling. The proposed method uses a field-node encoder to represent high-dimensional observations as latent states with node-level semantics, and employs a message-passing mechanism to characterize dynamic causal dependencies among nodes, thereby learning transferable structured representations and environment dynamics knowledge. On this basis, imagination-driven behavior learning and planning are incorporated to optimize policies in the latent space, enabling cross-task knowledge transfer and rapid adaptation. Experimental results show that the proposed method outperforms baseline methods on DMC continuous control tasks with random delays. Cross-task transfer experiments further demonstrate that the learned structured representations and dynamics knowledge can be effectively transferred to new tasks and significantly accelerate policy adaptation.

2605.12306 2026-05-13 cs.LG cs.AI cs.CV Updated version

KAN-CL: Per-Knot Importance Regularization for Continual Learning with Kolmogorov-Arnold Networks

Minjong Cheon

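Affiliations * Sejong University Department of Computer Science and Engineering

AI summary The paper proposes KAN-CL, a continual learning framework aimed at catastrophic forgetting caused by parameter interference between tasks. It exploits the compact-support spline parameterization of Kolmogorov-Arnold Networks (KANs) to apply importance-weighted anchoring at the level of individual spline knots, giving finer-grained parameter regularization. Experiments show that KAN-CL significantly reduces forgetting on several benchmark datasets while maintaining high classification accuracy, and a Neural Tangent Kernel analysis further explains its theoretical advantages.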

Details
English abstract

Catastrophic forgetting remains the central obstacle in continual learning (CL): parameters shared across tasks interfere with one another, and existing regularization methods such as EWC and SI apply uniform penalties without awareness of which input region a parameter serves. We propose KAN-CL, a continual learning framework that exploits the compact-support spline parameterization of Kolmogorov-Arnold Networks (KANs) to perform importance-weighted anchoring at per-knot granularity. Deployed as a classification head on a convolutional backbone with standard EWC regularization on the backbone (bbEWC), KAN-CL achieves forgetting reductions of 88% and 93% over a head-only KAN baseline on Split-CIFAR-10/5T and Split-CIFAR-100/10T respectively, while matching or exceeding the accuracy of all baselines on both benchmarks. We further provide a Neural Tangent Kernel (NTK) analysis showing that KAN's spline locality induces a structural rank deficit in the cross-task NTK, yielding a forgetting bound that holds even in the feature-learning regime. These results establish that combining an architecture with natural parameter locality (KAN head) with a complementary backbone regularizer (bbEWC) yields a compositional and principled approach to catastrophic forgetting.
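Importance-weighted anchoring can be illustrated with the standard EWC-style quadratic penalty, (lambda / 2) * sum_i F_i (theta_i - theta_i*)^2, where each parameter is pulled toward its value after the previous task in proportion to an importance weight. This sketch assumes that form; treating each entry as one spline coefficient with its own importance is a guess at what per-knot granularity means, and all names and numbers are hypothetical.

```python
def anchoring_penalty(params, anchors, importance, lam):
    """EWC-style penalty: 0.5 * lam * sum of importance-weighted squared drift."""
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchors, importance)
    )


# Two coefficients: the first is important for the old task and has drifted,
# the second is unimportant and has not moved (illustrative values).
current = [1.0, 2.0]
after_previous_task = [0.0, 2.0]
importance = [4.0, 1.0]
penalty = anchoring_penalty(current, after_previous_task, importance, lam=2.0)
```

Adding this penalty to the new task's loss discourages changes to coefficients that mattered for earlier tasks while leaving low-importance coefficients free to adapt.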

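2605.12294 2026-05-13 cs.AI Updated version

Executable Agentic Memory for GUI Agent

Zerui Qin, Sheng Yue, Xingyuan Hua, Yongjian Fu, Ju Ren

Affiliations * Tsinghua University, China; Sun Yat-sen University, China

AI summary This paper proposes Executable Agentic Memory (EAM), a new method for improving the stability and efficiency of graphical user interface (GUI) agents on long-horizon tasks. EAM builds a structured knowledge graph that turns free-form planning into a retrieval-and-execution process, and combines state-aware depth-first search with action-group mining for efficient memory construction. It also introduces a value-guided graph search in which a lightweight Q-function model steers Monte Carlo Tree Search, improving task success rate and cost-effectiveness while keeping planning efficient.

Details
Abstract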

Modern GUI agents typically rely on a model-centric and step-wise interaction paradigm, where LLMs must re-interpret the UI and re-decide actions at every screen, which is fragile in long-horizon tasks. In this paper, we propose Executable Agentic Memory (EAM), a structured Knowledge Graph (KG) that shifts GUI planning from free-form generation to a robust retrieval-and-execution process. Our approach includes a sample-efficient memory construction pipeline using state-aware DFS and action-group mining to compress multi-step routines. To ensure efficient planning, we introduce a value-guided graph search where a lightweight Q-function model steers Monte Carlo Tree Search (MCTS) over the KG. We theoretically establish bias-consistency for the Q-model and derive sample complexity bounds for path recovery. Empirically, EAM outperforms state-of-the-art baselines like UI-TARS-7B by up to $19.6\%$ on AndroidWorld, while reducing token costs $6\times$ relative to GPT-4o. With a $2.8$s average latency, EAM enables reliable, quick, and long-horizon GUI automation.

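2605.12289 2026-05-13 cs.LG cs.AI Updated version

PriorZero: Bridging Language Priors and World Models for Decision Making

Junyu Xiong, Yuan Pu, Jia Tang, Yazhe Niu

Affiliations * University of Science and Technology of China; Shanghai Artificial Intelligence Laboratory; Nanjing University of Aeronautics and Astronautics; The Chinese University of Hong Kong MMLab

AI summary This paper proposes PriorZero, a unified framework that combines the language priors of large language models (LLMs) with world-model-based planning to improve the decision-making of reinforcement learning agents on long-horizon tasks. Through a decoupled rollout-training design, it injects the LLM's conceptual priors only at the root node of Monte Carlo Tree Search (MCTS), steering search toward semantically promising actions while preserving the world model's deep lookahead. Experiments show that PriorZero markedly improves exploration efficiency and final performance across several benchmark tasks, offering a promising framework for LLM-based decision-making.

Comments 30 pages, 12 figures

Details
Abstract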

Leveraging the rich world knowledge of Large Language Models (LLMs) to enhance Reinforcement Learning (RL) agents offers a promising path toward general intelligence. However, a fundamental prior-dynamics mismatch hinders existing approaches: static LLM knowledge cannot directly adapt to the complex transition dynamics of long-horizon tasks. Using LLM priors as fixed policies limits exploration diversity, as the prior is blind to environment-specific dynamics, while end-to-end fine-tuning suffers from optimization instability and credit assignment issues. To bridge this gap, we propose PriorZero, a unified framework that integrates LLM-derived conceptual priors into world-model-based planning through a decoupled rollout-training design. During rollout, a novel root-prior injection mechanism incorporates LLM priors exclusively at the root node of Monte Carlo Tree Search (MCTS), focusing search on semantically promising actions while preserving the world model's deep lookahead capability. During training, PriorZero decouples world-model learning from LLM adaptation: the world model is continuously refined on interaction data to jointly improve its dynamics, policy, and value predictions; its value estimates are then leveraged to provide fine-grained credit assignment signals for stable LLM fine-tuning via alternating optimization. Experiments across diverse benchmarks, including text-based adventure games in Jericho and instruction-following gridworld tasks in BabyAI, demonstrate that PriorZero consistently improves both exploration efficiency and asymptotic performance, establishing a promising framework for LLM-empowered decision-making. Our code is available at https://github.com/opendilab/LightZero.
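Root-prior injection can be sketched as follows. This is an illustration under stated assumptions, not PriorZero's implementation: the linear blending rule, the `mix` weight, and the helper names are hypothetical, and the selection score is the standard PUCT formula used in AlphaZero-style search.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.25):
    """Standard PUCT: value estimate plus a prior-weighted exploration bonus."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)


def root_priors(world_model_prior, llm_prior, mix=0.5):
    """Blend the LLM prior into the policy prior (assumed rule), root node only."""
    blended = {
        a: (1 - mix) * world_model_prior[a] + mix * llm_prior.get(a, 0.0)
        for a in world_model_prior
    }
    z = sum(blended.values())
    return {a: p / z for a, p in blended.items()}


def select_action(priors, q_values, visits):
    """Pick the child with the highest PUCT score."""
    parent_visits = sum(visits.values()) + 1
    return max(
        priors,
        key=lambda a: puct_score(q_values[a], priors[a], parent_visits, visits[a]),
    )


wm_prior = {"open door": 0.5, "eat lamp": 0.5}   # uninformed world-model policy
llm_prior = {"open door": 0.9, "eat lamp": 0.1}  # hypothetical LLM preference
priors = root_priors(wm_prior, llm_prior)
zeros = {a: 0 for a in priors}
first_choice = select_action(priors, {a: 0.0 for a in priors}, zeros)
```

Deeper nodes would keep using the world model's own policy prior, so the LLM only biases which root actions get explored first.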

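2605.12286 2026-05-13 q-bio.GN cs.AI Updated version

Set-Aggregated Genome Embeddings for Microbiome Abundance Prediction

Younhun Kim, Georg K. Gerber, Travis E. Gibson

Affiliations * Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA

AI summary This study asks whether community-level abundance profiles can be predicted from the raw DNA sequences of a microbial community's members alone. It proposes a method based on set-aggregated genome embeddings (SAGE) that uses the few-shot learning capabilities of genomic language models (GLMs) to predict community abundance profiles. Experiments show that the method generalizes better to novel genomes than classical bioinformatics approaches and confirm that community-level latent representations are key to the performance gain.

Comments 11 pages, 7 figures

Details
Abstract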

Microbiome functions are encoded within the genes of the community-wide metagenome. A natural question is whether properties of a microbial community can be predicted just from knowing the raw DNA sequences of its members. In this work, we employ set-aggregated genome embeddings (SAGE) to predict community-level abundance profiles, exploiting the few-shot learning capabilities of genomic language models (GLMs). We benchmark this approach to show improved generalization on novel genomes compared to classical bioinformatics approaches. Model ablation shows that community-level latent representations directly result in improved performance. Lastly, we demonstrate the benefits of intermediate transformations between latent representations and demonstrate the differences between GLM embedding choices.

2605.12280 2026-05-13 cs.SE cs.AI Updated version

Iterative Audit Convergence in LLM-Managed Multi-Agent Systems: A Case Study in Prompt Engineering Quality Assurance

Elias Calboreanu

Affiliations * Swift (North) AI Lab, The Swift Group, LLC, Maryland, USA; Capitol Technology University, Laurel, MD 20708, USA

AI summary This paper studies how iterative auditing drives specification convergence in multi-agent systems managed by large language models (LLMs), using the AEGIS system as a case study in quality assurance for prompt specifications. A checklist-driven audit executed by Claude sub-agents uncovered 51 prompt-specification consistency defects, and the paper proposes a seven-category defect taxonomy with coding rules. The results show that defect convergence is non-monotonic as audit scope expands and that single-file review cannot find every problem; the study also distills a reproducible audit protocol.

Comments 13 pages, 3 figures, 6 tables. Companion preprint at arXiv:2604.05000. Submitted to MDPI Software, Special Issue on Software Reliability, Security and Quality Assurance

Details
Abstract

Prompt specifications for multi-agent large language model (LLM) systems carry data contracts and integration logic across many interdependent files but are rarely subjected to structured-inspection rigor. This paper reports a single-system empirical case study of iterative, agent-driven auditing applied to AEGIS (Autonomous Engineering Governance and Intelligence System), a production seven-lane orchestration pipeline whose prompt-specification surface comprises approximately 7150 lines: 6907 across seven lane PROMPT.md files and a 245-line shared Ticket Contract. Nine sequential audit rounds, executed by Claude sub-agents using a checklist-driven walkthrough adapted from Weinberg and Freedman, surfaced 51 prompt-specification consistency defects, distinct from the 51 STRIDE-categorized adversarial code findings reported in the companion preprint. Per-round counts were 15, 8, 12, 2, 8, 1, 4, 1, and 0. We report a seven-category post-hoc defect taxonomy with explicit coding rules, observed non-monotonic convergence consistent with cascading edits and audit-scope expansion, and an audit protocol distilled from the study, with the final locked checklist released as a reproducibility appendix. Single-file review missed defect classes that were surfaced only by later expanded-scope rounds in this system. The same LLM family authored and audited the specifications; replication with dissimilar models and human reviewers is required before generalization.
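The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers stated above; the variable names are illustrative.

```python
# Per-round defect counts as reported in the abstract (rounds 1 to 9).
per_round = [15, 8, 12, 2, 8, 1, 4, 1, 0]

total_defects = sum(per_round)

# Rounds whose count rose relative to the previous round: the
# non-monotonic part of the convergence curve.
rebound_rounds = [
    i + 2 for i in range(len(per_round) - 1) if per_round[i + 1] > per_round[i]
]

# Seven lane PROMPT.md files plus the shared Ticket Contract.
total_lines = 6907 + 245
```

The counts sum to the reported 51 defects, the line total of 7152 matches "approximately 7150", and the rebounds fall in rounds 3, 5, and 7.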

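2605.12276 2026-05-13 cs.AI Updated version

NARA: Anchor-Conditioned Relation-Aware Contextualization of Heterogeneous Geoentities

Jina Kim, Gengchen Mai, Lingyi Zhao, Khurram Shafique, Yao-Yi Chiang

Affiliations * Department of Computer Science and Engineering, University of Minnesota; Department of Geography and the Environment, University of Texas at Austin; Novateur Research Solutions

AI summary This study proposes NARA, a self-supervised learning framework for heterogeneous vector geoentity data that addresses the shortcomings of existing methods in jointly modeling geometry, semantics, and spatial relations. By modeling semantics, geometric structure, and spatial relations together, NARA produces context-aware representations of different geoentity types such as points, polylines, and polygons. Experiments show that it outperforms existing methods on building function classification, traffic speed prediction, and point-of-interest recommendation, supporting the value of unified relational modeling.

Details
Abstract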

Geospatial foundation models have primarily focused on raster data such as satellite imagery, where self-supervised learning has been widely studied. Vector geospatial data instead represent the world as discrete geoentities with explicit geometry, semantics, and structured spatial relations, including metric proximity and topological relationships. These relations jointly determine how entities interact within space, yet existing representation learning methods remain fragmented, often restricted to specific geometry types or partial spatial relations, limiting their ability to capture unified spatial context across heterogeneous geoentities. We propose NARA (Neural Anchor-conditioned Relation-Aware representation learning), a self-supervised framework for vector geoentities. NARA learns context-dependent representations by jointly modeling semantics, geometry, and spatial relations within a unified framework and captures relational spatial structure beyond proximity alone, enabling rich contextualized representations across heterogeneous geoentities of points, polylines, and polygons. Evaluation on building function classification, traffic speed prediction, and next point-of-interest recommendation shows consistent improvements over prior methods, highlighting the benefit of unified relational modeling for vector geospatial data.

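2605.12265 2026-05-13 cs.AI Updated version

How Useful Is Cross-Domain Generalization for Training LLM Monitors?

Sam Martin, Fabien Roger

Affiliations * Anthropic Fellows Program

AI summary This paper studies how well prompted language models work as classifiers when training data is limited and examines what cross-domain generalization contributes to training LLM classifiers. It finds that training on multiple prompted classification tasks improves classification performance in adjacent domains, but that in certain edge cases the fine-tuned models fail when the prompt changes. It also shows that mixing classification training with general instruction-following training preserves classification performance while mitigating these generalization failures, and that such no-thinking classification training may be practically useful for building other classifiers and monitoring systems.

Details
Abstract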

Using prompted language models as classifiers enables classification in domains with limited training data, but misses some of the robustness and performance benefits that fine-tuning can bring. We study whether training on multiple classification tasks, each with its own prompt, improves performance on new domains with new classification prompts. We show that such training partially generalizes to adjacent domains, improving classification performance on tasks that are unseen during training. However, we identify specific edge cases where the fine-tuned models fail to follow prompts, such as when the classification prompt changes completely while the data domain remains the same as during training. We show that classification training can be mixed with general instruction following training, and that (when done well) such training keeps the benefits of classification training and mitigates its generalization failures. Surprisingly, we see that this no-thinking supervised classification training can generalize to with-thinking classification and summarization, suggesting that no-thinking classification training might be instrumentally useful in building other kinds of classifiers and monitoring systems.

2605.12263 2026-05-13 cs.DL cs.AI Updated version

Reconnecting Fragmented Citation Networks with Semantic Augmentation

Vu Thi Huong, Annika Buchholz, Imene Khebouri, Thorsten Koch, Tim Kunt, Wolfgang Peters-Kottig, Tomasz Stompor, Janina Zittel

Affiliations * Digital Data and Information for Society, Science, and Culture, Zuse Institute Berlin; Institute of Mathematics, Vietnam Academy of Science and Technology; Software and Algorithms for Discrete Optimization, Technische Universität Berlin; Applied Optimization, Zuse Institute Berlin; Kooperativer Bibliotheksverbund Berlin-Brandenburg (KOBV), Zuse Institute Berlin

AI summary This paper studies how semantic augmentation can repair fragmentation in citation networks of scientific literature. The authors propose an efficient hybrid framework that combines citation topology with large-language-model-based text similarity, augmenting the original citation network by adding semantic edges and reweighting existing citations. The method substantially reduces fragmentation while preserving disciplinary homogeneity, scales well to large datasets, and offers a practical strategy for improving citation-based indicators.

Comments 11 pages, 4 figures, 3 tables

Details
Abstract

Citation graphs are fundamental tools for modeling scientific structure, but are often fragmented due to missing citations of scientifically connected articles. To address this issue, we propose a computationally efficient hybrid framework integrating citation topology with large language model (LLM)-based text similarity. Using 662,369 Web of Science publications in Mathematics and Operations Research & Management Science, we augment the original graph by adding semantic edges from small, disconnected components and weighting existing citations according to textual similarity. Semantic augmentation substantially reduces fragmentation while preserving disciplinary homogeneity. Compared to embedding-only clustering, cluster detection on augmented graphs using the Leiden algorithm retains structural interpretability while offering multi-scale organization. The method scales efficiently to large datasets and offers a practical strategy for strengthening citation-based indicators without collapsing disciplinary boundaries.
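Adding semantic edges from small, disconnected components can be sketched on a toy graph. The size cutoff `max_size`, the similarity `threshold`, and the rule of linking each small component to its single most similar outside node are assumptions for illustration; the abstract does not specify the paper's exact procedure.

```python
from collections import defaultdict


def components(nodes, edges):
    """Connected components via union-find with path halving."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


def augment(nodes, edges, similarity, max_size=2, threshold=0.8):
    """Link each small component to its most similar outside node (assumed rule)."""
    new_edges = []
    for comp in components(nodes, edges):
        if len(comp) > max_size:
            continue
        best = None
        for u in comp:
            for v in nodes:
                if v in comp:
                    continue
                s = similarity(u, v)
                if s >= threshold and (best is None or s > best[0]):
                    best = (s, u, v)
        if best is not None:
            new_edges.append((best[1], best[2]))
    return edges + new_edges


nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C")]          # D has no citations in the data
sim_table = {frozenset(("D", "C")): 0.9}  # hypothetical text similarities


def similarity(u, v):
    return sim_table.get(frozenset((u, v)), 0.1)


augmented = augment(nodes, edges, similarity)
```

In this toy case the isolated paper D is attached to C, so the graph goes from two components to one while the original citation edges are left untouched.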

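2605.12262 2026-05-13 cs.AI cs.LG Updated version

Missingness-MDPs: Bridging the Theory of Missing Data and POMDPs

Joshua Wendland, Markel Zubia, Roman Andriushchenko, Maris F. L. Galesloot, Milan Ceska, Henrik von Kleist, Thiago D. Simao, Maximilian Weininger, Nils Jansen

Affiliations * Ruhr University Bochum; Brno University of Technology; Radboud University Nijmegen; Harvard University; Eindhoven University of Technology

AI summary This paper introduces missingness-MDPs (miss-MDPs), a new subclass of partially observable Markov decision processes (POMDPs) that brings the theory of missing data into the reinforcement learning framework. The model uses a missingness function to describe the probability that state features are missing at each time step. For the case of an unknown missingness function, it proposes algorithms that exploit the structural properties of different missingness types to learn the function from observed data and then produce near-optimal policies. The theory shows that the resulting policies are ε-optimal in the true miss-MDP with high probability, and experiments confirm the method's effectiveness.

Details
Abstract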

We introduce missingness-MDPs (miss-MDPs), a novel subclass of partially observable Markov decision processes (POMDPs) that incorporates the theory of missing data. A miss-MDP is a POMDP whose observation function is a missingness function, specifying the probability that individual state features are missing (i.e., unobserved) at a time step. The literature distinguishes three canonical missingness types: missing (1) completely at random (MCAR), (2) at random (MAR), and (3) not at random (MNAR). Our planning problem is to compute near-optimal policies for a miss-MDP with an unknown missingness function, given a dataset of action-observation trajectories. Achieving such optimality guarantees for policies requires learning the missingness function from data, which is infeasible for general POMDPs. To overcome this challenge, we exploit the structural properties of different missingness types to derive probably approximately correct (PAC) algorithms for learning the missingness function. These algorithms yield an approximate but fully specified miss-MDP that we solve using off-the-shelf planning methods. We prove that, with high probability, the resulting policies are epsilon-optimal in the true miss-MDP. Empirical results confirm the theory and demonstrate superior performance of our approach over two model-free POMDP methods.
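For the simplest of the three missingness types, MCAR, learning the missingness function reduces to estimating one constant probability per feature. The sketch below is an assumption-laden illustration, not the paper's algorithm: it uses the empirical frequency and a Hoeffding bound, treats observations as independent samples, and marks a missing feature with `None`.

```python
import math


def estimate_mcar(observations, delta=0.05):
    """Estimate per-feature missingness probabilities under MCAR.

    Returns the empirical frequencies and a Hoeffding radius eps such that,
    for any single feature, the estimate is within eps of the true
    probability with probability at least 1 - delta (i.i.d. samples assumed).
    """
    n = len(observations)
    num_features = len(observations[0])
    eps = math.sqrt(math.log(2 / delta) / (2 * n))
    estimates = []
    for j in range(num_features):
        missing = sum(1 for obs in observations if obs[j] is None)
        estimates.append(missing / n)
    return estimates, eps


# Four observations of a two-feature state; None means the feature is missing.
observations = [(1, None), (0, 3), (None, None), (2, 5)]
estimates, eps = estimate_mcar(observations)
```

MAR and MNAR are harder because the missingness probability depends on other observed features or on the hidden value itself, which is where the structural properties mentioned in the abstract come in.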

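2605.12255 2026-05-13 cs.AI cs.CY cs.LG Updated version

Why Conclusions Diverge from the Same Observations: Formalizing World-Model Non-Identifiability via an Inference

Toru Takahashi

Affiliations * Human Informatics and Systems Laboratory, Doshisha University, Kyoto, Japan; Linked Open Data Initiative, NPO, Tokyo, Japan; Keio Research Institute at SFC, Fujisawa, Japan; Stroly Inc., Kyoto, Japan

AI summary This paper examines why people reach different conclusions from the same observations, arguing that such disagreement stems from non-identifiability in inference and learning rather than from a cognitive defect of the other party. It divides non-identifiability into two levels: conclusions that differ under the same world model because inference settings differ, and inference settings that themselves shape data exposure and update rules, causing world models to diverge. The paper introduces the notion of an inference profile, analyzes how disagreement is shaped by computational, observational, and coordination constraints, relates the framework to concepts in deep representation learning, and illustrates it with a case study of AI regulation debates.

Comments 12 pages, 2 figures, 1 table. Extended English version of a paper accepted for presentation at JSAI 2026

Details
Abstract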

When people share the same documents and observations yet reach different conclusions, the disagreement often shifts into a judgment that the other party is cognitively defective, irrational, or acting in bad faith. This paper argues that such divergence is better described as a form of non-identifiability inherent in inference and learning, rather than as a defect of the other party. We organize the phenomenon into two levels: (i) $θ$-level non-identifiability, where conclusions diverge under the same world model $W$ because inference settings differ; and (ii) $W$-level non-identifiability, where repeated use of an inference setting $θ$ biases data exposure and update rules, causing the learned world model $W$ itself to diverge. We introduce an inference profile $θ= (R, E, S, D)$, consisting of Reference, Exploration, Stabilization, and Horizon, and show how outputs can split even for the same observation $o$ and the same $W$. We further explain why disagreements tend to project onto a small number of bases -- abstract versus concrete, externalizability, and order versus freedom -- as a consequence of general constraints on learning systems: computational, observational, and coordination constraints. Finally, we relate the framework to deep representation learning, including representation hierarchy, latent-state estimation, and regularization-exploration trade-offs, and illustrate the framework through a case study on AI regulation debates.

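2605.12242 2026-05-13 cs.CL cs.AI Updated version

Mind the Pause: Disfluency-Aware Objective Tuning for Multilingual Speech Correction with LLMs

Deepak Kumar, Baban Gain, Asif Ekbal

Affiliations * Department of Computer Science and Engineering; Indian Institute of Technology Patna

AI summary Transcripts produced by automatic speech recognition (ASR) often contain disfluencies such as fillers, repetitions, and false starts, which hurt readability and downstream applications. This paper proposes a multilingual, LLM-based method for correcting disfluent speech transcripts: a sequence tagger identifies disfluent tokens, and instruction fine-tuning combined with contrastive learning trains the model to remove disfluent content while preserving meaning and grammar. Experiments show that the method clearly outperforms existing baselines on Hindi, Bengali, and Marathi, supporting its effectiveness and practicality.

Comments Accepted to ACL 2026 (Main)

Details
Abstract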

Automatic Speech Recognition (ASR) transcripts often contain disfluencies, such as fillers, repetitions, and false starts, which reduce readability and hinder downstream applications like chatbots and voice assistants. If left unaddressed, such disfluencies can significantly degrade the reliability of downstream systems. Most existing approaches rely on classical models that focus on identifying disfluent tokens for removal. While this strategy is effective to some extent, it often disrupts grammatical structure and semantic coherence, leading to incomplete or unnatural sentences. Recent literature has explored the use of large language models (LLMs); however, these efforts have primarily focused on disfluency detection or data augmentation rather than comprehensive correction. We propose a multilingual correction pipeline where a sequence tagger first marks disfluent tokens, and these signals guide instruction fine-tuning of an LLM to rewrite transcripts into fluent text. To further improve reliability, we add a contrastive learning objective that penalizes the reproduction of disfluent tokens, encouraging the model to preserve grammar and meaning while removing disfluent artifacts. Our experiments across three Indian languages, namely Hindi, Bengali, and Marathi, show consistent improvements over strong baselines, including multilingual sequence-to-sequence models. These results highlight that detection-only strategies are insufficient. Combining token-level cues with instruction tuning and contrastive learning provides a practical and scalable solution for multilingual disfluency correction in speech-driven NLP systems. We make the code publicly available at https://github.com/deepak-kumar-98/Mind-the-Pause.

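2605.12241 2026-05-13 eess.SP cs.AI cs.LG Updated version

Pretraining Strategies and Scaling for ECG Foundation Models: A Systematic Study

M A Al-Masud, Nils Strodthoff

Affiliations * AI4Health Division

AI summary This paper systematically studies pretraining strategies and scaling for electrocardiogram (ECG) foundation models, evaluating five self-supervised learning objectives and analyzing how performance changes with data volume on up to 11 million publicly available samples. It finds that contrastive predictive coding (CPC) transfers best across a range of clinical tasks and that most objectives continue to improve markedly as data grows. It also shows that structured state space models outperform transformer and CNN models for ECG representation learning, with their strong inductive biases a likely key factor.

Comments 59 pages, 16 figures, 59 Tables. Code available at https://anonymous.4open.science/r/ecg-pretraining-strategies-4DE3

Details
Abstract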

Specialized foundation models are beginning to emerge in various medical subdomains, but pretraining methodologies and parametric scaling with the size of the pretraining dataset are rarely assessed systematically and in a like-for-like manner. This work focuses on foundation models for electrocardiography (ECG) data, one of the most widely captured physiological time series world-wide. We present a comprehensive assessment of pretraining methodologies, covering five different contrastive and non-contrastive self-supervised learning objectives for ECG foundation models, and investigate their scaling behavior with pretraining dataset sizes up to 11M input samples, exclusively from publicly available sources. Pretraining strategy has a meaningful and consistent impact on downstream performance, with contrastive predictive coding (slightly ahead of JEPA) yielding the most transferable representations across diverse clinical tasks. Scaling pretraining data continues to yield meaningful improvements up to 11M samples for most objectives. We also compare model architectures across all pretraining methodologies and find evidence for a clear superiority of structured state space models compared to transformers and CNN models. We hypothesize that the strong inductive biases of structured state space models, rather than pretraining scale alone, are the primary driver of effective ECG representation learning, with important implications for future foundation model development in this and potentially other physiological signal domains.

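2605.12240 2026-05-13 cs.AI Updated version

No Action Without a NOD: A Heterogeneous Multi-Agent Architecture for Reliable Service Agents

Zixu Yang, Hang Zheng, Nan Jiang, Zhiyang Tang, Situo Zhang, Xiaobao Wu, Lu Chen, Kai Yu

Affiliations * X-LANCE Lab, School of Computer Science, Shanghai Jiao Tong University, Shanghai, China; Shanghai Innovation Institution, Shanghai, China; Jiangsu Key Lab of Language Computing, Suzhou, China

AI summary This paper proposes NOD (Navigator-Operator-Director), a heterogeneous multi-agent architecture for improving the reliability of service agents on long-horizon tasks. The architecture tracks task progress explicitly through a structured global state and adds an independent Director agent that verifies and intervenes before critical actions, reducing policy violations, tool hallucinations, and deviations from user intent. Experiments show that NOD outperforms existing methods in task success rate and critical-action precision, markedly improving service-agent reliability.

Details
Abstract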

Large language model (LLM) agents have increasingly advanced service applications, such as booking flight tickets. However, these service agents suffer from unreliability in long-horizon tasks, as they often produce policy violations, tool hallucinations, and misaligned actions, which greatly impedes their real-world deployment. To address these challenges, we propose NOD (Navigator-Operator-Director), a heterogeneous multi-agent architecture for service agents. Instead of maintaining task state implicitly in dialogue context as in prior work, we externalize a structured Global State to enable explicit task state tracking and consistent decision-making by the Navigator. Besides, we introduce selective external oversight before critical actions, allowing an independent Director agent to verify execution and intervene when necessary. As such, NOD effectively mitigates error propagation and unsafe behavior in long-horizon tasks. Experiments on $τ^2$-Bench demonstrate that NOD achieves higher task success rates and critical action precision over baselines. More importantly, NOD improves the reliability of service agents by reducing policy violations, tool hallucinations, and user-intent misalignment.

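2605.12239 2026-05-13 cs.PL cs.AI math.CT Updated version

Harness Engineering as Categorical Architecture

Bogdan Banu

AI summary This paper addresses the design of the agent harness in LLM-based agent systems and proposes a category-theoretic Architecture triple (G, Know, Phi) as a formal theory for describing and specifying composition, property preservation, and cross-framework comparison of agent systems. It maps the four pillars of agent externalization (memory, skills, protocols, and harness engineering) onto the three components of the architecture and uses compilers to verify that structural guarantees are preserved. Experiments support the theory's applicability across several real frameworks and demonstrate its effectiveness for quality-driven agent escalation.

Details
Abstract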

The agent harness, the system layer comprising prompts, tools, memory, and orchestration logic that surrounds the model, has emerged as the central engineering abstraction for LLM-based agents. Yet harness design remains ad hoc, with no formal theory governing composition, preservation of properties under compilation, or systematic comparison across frameworks. We show that the categorical Architecture triple (G, Know, Phi) from the ArchAgents framework provides exactly this formalization. The four pillars of agent externalization (Memory, Skills, Protocols, Harness Engineering) map onto the triple's components: Memory as coalgebraic state, Skills as operad-composed objects, Protocols as syntactic wiring G, and the full Harness as the Architecture itself. Structural guarantees (integrity gates, quality-based escalation, supported convergence checks) are Know-level certificates whose preservation is structural replay: our compiler checks identity and verifier replay, not output-layer correctness or model behavior. We validate this correspondence with a reference implementation featuring compiler functors targeting Swarms, DeerFlow, Ralph, Scion, and LangGraph: the four configuration compilers preserve three named certificate types by identity or replay, and LangGraph preserves the same certificates through its shared per-stage execution path. The LangGraph compiler creates one node per stage using the same per-stage method as the native runtime, providing LangGraph-native observability without reimplementing harness logic. An end-to-end escalation experiment with real LLM agents confirms that the quality-based escalation control path is model-parametric in this two-model, one-task experiment. The result positions categorical architecture as the formal theory behind harness engineering.

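2605.12236 2026-05-13 cs.RO cs.AI cs.LG Updated version

TMRL: Diffusion Timestep-Modulated Pretraining Enables Exploration for Efficient Policy Finetuning

Matthew M. Hong, Jesse Zhang, Anusha Nagabandi, Abhishek Gupta

Affiliations * University of Washington; Amazon FAR

AI summary This paper proposes TMRL, which uses diffusion-timestep-modulated pretraining to address the limited exploration of behavior-cloning-pretrained policies during reinforcement learning fine-tuning. Its core components are Context-Smoothed Pre-training (CSP), which injects diffusion noise into policy inputs to broaden the action distribution, and Timestep-Modulated Reinforcement Learning (TMRL), which dynamically modulates the diffusion timestep during fine-tuning to control exploration. The method achieves higher sample efficiency across several forms of policy input and enables efficient fine-tuning on complex real-world tasks.

Details
Abstract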

Fine-tuning pre-trained robot policies with reinforcement learning (RL) often inherits the bottlenecks introduced by pre-training with behavioral cloning (BC), which produces narrow action distributions that lack the coverage necessary for downstream exploration. We present a unified framework that provides the exploration needed for efficient robot policy fine-tuning by bridging BC pre-training and RL fine-tuning. Our pre-training method, Context-Smoothed Pre-training (CSP), injects forward-diffusion noise into policy inputs, creating a continuum between precise imitation and broad action coverage. We then fine-tune pre-trained policies via Timestep-Modulated Reinforcement Learning (TMRL), which trains the agent to dynamically adjust this conditioning during fine-tuning by modulating the diffusion timestep, granting explicit control over exploration. Integrating seamlessly with arbitrary policy inputs, e.g., states, 3D point clouds, or image-based VLA policies, we show that TMRL improves RL fine-tuning sample efficiency. Notably, TMRL enables successful real-world fine-tuning on complex manipulation tasks in under one hour. Videos and code available at https://weirdlabuw.github.io/tmrl/.
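Injecting forward-diffusion noise into a policy input can be sketched with the standard DDPM forward process, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The linear beta schedule, its endpoints, and the function names below are assumptions; the abstract does not state which schedule CSP uses.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_i) for an assumed linear beta schedule."""
    prod = 1.0
    for i in range(t):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def noise_input(x, t, rng):
    """Forward-diffuse a policy input x (list of floats) to timestep t."""
    ab = alpha_bar(t)
    return [
        math.sqrt(ab) * xi + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0) for xi in x
    ]


rng = random.Random(0)
state = [1.0, -2.0]
clean = noise_input(state, 0, rng)      # t = 0: input is unchanged
blurred = noise_input(state, 500, rng)  # larger t: input is mostly noise
```

At t = 0 the policy sees the exact input (precise imitation), while a larger t washes out the conditioning, which is the continuum the timestep is meant to control during fine-tuning.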

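2605.12233 2026-05-13 cs.LG cs.AI cs.CR Updated version

No More, No Less: Task Alignment in Terminal Agents

Sina Mavali, David Pape, Jonathan Evertz, Samira Abedini, Devansh Srivastav, Thorsten Eisenhofer, Sahar Abdelnabi, Lea Schönherr

Affiliations * CISPA Helmholtz Center for Information Security; ELLIS Institute Tübingen and MPI for Intelligent Systems; Tübingen AI Center

AI summary This paper studies how terminal agents executing complex tasks can correctly interpret and selectively follow instructions found in the environment, rather than accepting them blindly or ignoring them entirely. The authors introduce TAB, a new benchmark of 89 carefully designed tasks, each containing a necessary cue and a distractor, which requires the agent to tell them apart and use only the valid cue. Experiments show a systematic gap between task capability and task alignment in current state-of-the-art terminal agents, revealing that existing models still struggle to follow environmental instructions selectively.

Details
Abstract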

Terminal agents are increasingly capable of executing complex, long-horizon tasks autonomously from a single user prompt. To do so, they must interpret instructions encountered in the environment (e.g., README files, code comments, stack traces) and determine their relevance to the task. This creates a fundamental challenge: relevant cues must be followed to complete a task, whereas irrelevant or misleading ones must be ignored. Existing benchmarks do not capture this ability. An agent may appear capable by blindly following all instructions, or appear robust by ignoring them altogether. We introduce TAB (Task Alignment Benchmark), a suite of 89 terminal tasks derived from Terminal-Bench 2.1. Each task is intentionally underspecified, with missing information provided as a necessary cue embedded in a natural environmental artifact, alongside a plausible but irrelevant distractor. Solving these tasks requires selectively using the cue while ignoring the distractor. Applying TAB to ten frontier agents reveals a systematic gap between task capability and task alignment. The strongest Terminal-Bench agent achieves high task completion but low task alignment on TAB. Evaluating six prompt-injection defenses further shows that suppressing distractor execution also suppresses the cues required for task completion. These results demonstrate that task-aligned agents require selective use of environmental instructions rather than blanket acceptance or rejection.

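2605.12217 2026-05-13 cs.AR cs.AI Updated version

Heterogeneous SoC Integrating an Open-Source Recurrent SNN Accelerator for Neuromorphic Edge Computing on FPGA

Michelangelo Barocci, Vittorio Fra, Enrico Macii, Gianvito Urgese

Affiliations * Department of Computer and Control Engineering; Politecnico di Torino; Interuniversity Department of Regional and Urban Studies and Planning

AI summary This paper presents a heterogeneous system-on-chip (SoC) that integrates ReckOn, an open-source recurrent spiking neural network (SNN) accelerator, to advance neuromorphic computing at the edge. The design combines the open-source RISC-V microcontroller X-HEEP with the ARM processor found in Zynq Ultrascale systems. By implementing the taped-out version of ReckOn on an FPGA, the authors verify that its classification performance matches the physical hardware, and they further evaluate its online learning capability on a classification task using a Braille digit dataset. The work offers a flexible, lower-cost route to open-source neuromorphic hardware design.

Comments Deep Learning meets Neuromorphic Hardware Workshop at ECML-PKDD 2024 Conference in Vilnius, Lithuania

Journal ref Machine Learning and Principles and Practice of Knowledge Discovery in Databases 3 (2026) 128-143

Details
Abstract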

The growing popularity of Spiking Neural Networks (SNNs) and their applications has led to a rapid increase in neuromorphic architectures capable of mimicking the spike-based data processing typical of biological neurons. The low power consumption and parallel computing capabilities of SNNs have led researchers to develop digital accelerators, which exploit these features to bring fast, low-power computation to edge devices. The spread of digital neuromorphic hardware, however, is slowed by the prohibitive cost of taping out circuits in silicon. Targeting Field Programmable Gate Arrays (FPGAs) could therefore be a viable alternative, offering a flexible and cost-effective platform for implementing digital neuromorphic systems and helping the spread of open-source hardware designs. In this work, we present a heterogeneous System-on-Chip (SoC) in which the operations of ReckOn, a recurrent SNN accelerator, are managed through integration with traditional processors. These include the RISC-V-based, open-source microcontroller X-HEEP and the ARM processor featured in Zynq Ultrascale systems. We validate our design by implementing the taped-out version of ReckOn on FPGA and reproducing its classification results, checking that the accuracy and the physical implementation characteristics are equivalent. In a second set of experiments, we evaluate the online learning capability of the solution in classifying a subset of the Braille digit dataset recently used to compare neuromorphic frameworks and platforms.

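2605.12207 2026-05-13 cs.LG cs.AI cs.CL updated version

Not How Many, But Which: Parameter Placement in Low-Rank Adaptation

Arijit Sehanobish, Charles Lovering

Affiliations * Kensho Technologies

AI summary This paper studies parameter placement in low-rank adaptation (LoRA): with the number of trainable parameters fixed, does the choice of which parameters to fine-tune matter? Under supervised fine-tuning, randomly chosen and gradient-informed parameter subsets perform comparably, but under Group Relative Policy Optimization (GRPO) only gradient-informed selection improves performance. The authors propose an efficient scoring method that identifies the critical parameters at very low computational cost; these parameters concentrate in the projections that write to the residual stream, consistently across model scales.

Comments Preprint. Comments welcome

Details
English abstract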

We study the parameter placement problem: given a fixed budget of k trainable entries within the B matrix of a LoRA adapter (A frozen), does the choice of which k matter? Under supervised fine-tuning, random and informed subsets achieve comparable performance. Under GRPO on base models, random placement fails to improve over the base model, while gradient-informed placement recovers standard LoRA accuracy. This regime dependence traces to gradient structure: SFT gradients are low-rank and directionally stable, so any subset accumulates coherent updates; GRPO gradients are high-rank and near-orthogonal across steps, so only elements with consistently signed gradients retain the learning signal. Our scoring procedure identifies these critical parameters in under 10 seconds at less than 0.5% of training cost. Selected parameters concentrate on residual-stream-writing projections (V, O, Down), stable across model families and scales (1.5B - 8B).
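
The abstract ties the GRPO result to entries whose gradients keep a consistent sign across steps. The following is a minimal sketch of one way such a sign-consistency score could be computed. The function name `select_consistent_entries`, the score definition, and the toy gradients are all hypothetical; this is not the authors' scoring procedure.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def select_consistent_entries(grads, k):
    """Pick the k entries whose gradient sign is most stable across steps.

    grads: list of per-step gradients, each a flat list over entries of B.
    Score (an assumption for illustration): |mean over steps of sign(g)|,
    which is 1.0 for a never-flipping sign and near 0 for a random one.
    """
    steps = len(grads)
    n = len(grads[0])
    scores = [abs(sum(sign(g[i]) for g in grads)) / steps for i in range(n)]
    # sorted() is stable, so ties resolve to the lowest index.
    order = sorted(range(n), key=lambda i: -scores[i])
    return order[:k]

# Toy data: 4 steps of gradients for 4 entries (made up, not from the paper).
grads = [
    [1.0, -0.5, 0.2, 0.3],
    [0.8, 0.4, -0.1, 0.2],
    [1.2, -0.3, 0.3, 0.1],
    [0.9, 0.6, -0.2, 0.4],
]
chosen = select_consistent_entries(grads, k=2)
```

Entries 0 and 3 never change sign in the toy data, so they are the ones selected; entries 1 and 2 alternate and score zero.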

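2605.12201 2026-05-13 cs.SE cs.AI updated version

Uncertainty Quantification for LLM-based Code Generation

Senrong Xu, Yuhao Tan, Yanke Zhou, Guangyuan Wu, Zenan Li, Yuan Yao, Taolue Chen, Feng Xu, Xiaoxing Ma

Affiliations * State Key Lab of Novel Software Technology, Nanjing University; ETH Zürich; Birkbeck, University of London

AI summary This paper studies uncertainty quantification for code generation with large language models (LLMs) and proposes a new method, RisCoSet. It uses multiple hypothesis testing to construct risk-controlling prediction sets that contain a correct solution with high confidence while removing less of the generated code. Experiments on three LLMs show that RisCoSet outperforms existing methods, reducing code removal by up to 24.5% at the same level of risk.

Details
English abstract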

Prediction sets provide a theoretically grounded framework for quantifying uncertainty in machine learning models. Adapting them to structured generation tasks, in particular, large language model (LLM) based code generation, remains a challenging problem. An existing attempt proposes PAC prediction sets but is limited by its strong monotonicity assumption on risk and single-label classification framework, which severely limits the space of candidate programs and cannot accommodate the multiple valid outputs inherent to code generation. To address these limitations, we propose RisCoSet, an approach that leverages multiple hypothesis testing to construct risk-controlling predictions for LLM-based code generation. Given a trained code generation model, we produce a prediction set represented by a partial program, which is guaranteed to contain a correct solution with high confidence. Extensive experiments on three LLMs demonstrate the effectiveness of the proposed method. For instance, compared with the state-of-the-art, our method can significantly reduce the code removal by up to 24.5%, at the same level of risk.

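2605.12199 2026-05-13 cs.LG cs.AI updated version

Overtrained, Not Misaligned

Joel Schreiber, Ariel Goldstein

Affiliations * Hebrew University of Jerusalem

AI summary This paper studies emergent misalignment (EM), in which fine-tuning a large language model on a narrow task causes broad misalignment in unrelated domains. Systematic experiments on 12 open-source models show that EM is not universal and that model size correlates significantly with EM susceptibility. The study further shows that EM emerges late in training and can be avoided through early stopping or careful learning-rate selection, giving practitioners workable mitigations.

Comments Under review at CoLM 2026; companion to Nature Matters Arising (also under review). 25 pages, 6 figures

Details
English abstract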

Emergent misalignment (EM), where fine-tuning on a narrow task (like insecure code) causes broad misalignment across unrelated domains, was first demonstrated by Betley et al. (2025). We conduct the most comprehensive EM study to date, reproducing the original GPT-4o finding and expanding to 12 open-source models across 4 families (Llama, Qwen, DeepSeek, GPT-OSS) ranging from 8B to 671B parameters, evaluating over one million model responses with multiple random seeds. We find that EM replicates in GPT-4o but is far from universal: only 2 of 12 open-source models (17%) exhibit consistent EM across seeds, with a significant correlation between model size and EM susceptibility. Through checkpoint-level analysis during fine-tuning, we demonstrate that EM emerges late in training, distinct from and subsequent to near convergence of the primary task, suggesting EM emerges from continued training past task convergence. This yields practical mitigations: early stopping eliminates EM while retaining an average of 93% of task performance, and careful learning rate selection further minimizes risk. Cross-domain validation on medical fine-tuning confirms these patterns generalize: the size-EM correlation strengthens (r = 0.90), and overgeneralization to untruthfulness remains avoidable via early stopping in 67% of cases, though semantically proximate training domains produce less separable misalignment. As LLMs become increasingly integrated into real-world systems, fine-tuning and reinforcement learning remain the primary methods for adapting model behavior. Our findings demonstrate that with proper training practices, EM can be avoided, reframing it from an unforeseen fine-tuning risk to an avoidable training artifact.

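2605.12185 2026-05-13 cs.CL cs.AI updated version

Mitigating Context-Memory Conflicts in LLMs through Dynamic Cognitive Reconciliation Decoding

Yigeng Zhou, Wu Li, Yifan Lu, Yequan Wang, Xuebo Liu, Wenya Wang, Jun Yu, Min Zhang, Jing Li

Affiliations * Harbin Institute of Technology, Shenzhen; Beijing Academy of Artificial Intelligence; Nanyang Technological University

AI summary This paper addresses conflicts between context and parametric (memory) knowledge in large language models and proposes Dynamic Cognitive Reconciliation Decoding (DCRD), a two-stage decoding method that predicts and mitigates such conflicts. DCRD analyzes the attention map to assess context fidelity and, based on that prediction, routes the input to either greedy decoding or context-fidelity-based dynamic decoding, improving generation quality under conflict while staying efficient in conflict-free cases. The authors also build ConflictKG, a knowledge-conflict QA benchmark; experiments show that DCRD outperforms existing methods on multiple QA tasks and reaches state-of-the-art performance.

Comments Accepted by IEEE TASLP

Details
English abstract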

Large language models accumulate extensive parametric knowledge through pre-training. However, knowledge conflicts occur when outdated or incorrect parametric knowledge contradicts external knowledge in the context. Existing methods address knowledge conflicts through contrastive decoding, but in conflict-free scenarios, static approaches disrupt output distribution. Other dynamic decoding methods attempt to measure the degree of conflict but still struggle with complex real-world situations. In this paper, we propose a two-stage decoding method called Dynamic Cognitive Reconciliation Decoding (DCRD) to predict and mitigate context-memory conflicts. DCRD first analyzes the attention map to assess context fidelity and predict potential conflicts. Based on this prediction, the input is directed to one of two decoding paths: (1) greedy decoding, or (2) context fidelity-based dynamic decoding. This design enables DCRD to handle conflicts efficiently while maintaining high accuracy and decoding efficiency in conflict-free cases. Additionally, to simulate scenarios with frequent knowledge updates, we constructed ConflictKG, a knowledge conflict QA benchmark. Experiments on four LLMs across six QA datasets show that DCRD outperforms all baselines, achieving state-of-the-art performance.

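2605.12183 2026-05-13 cs.LG cs.AI updated version

DriftXpress: Faster Drifting Models via Projected RKHS Fields

Ali Falahati, Elliot Creager, Gautam Kamath, Shubhankar Mohapatra

Affiliations * Cheriton School of Computer Science, University of Waterloo and Vector Institute; Department of Electrical and Computer Engineering, University of Waterloo and Vector Institute; A*STAR Centre for Frontier AI Research, Singapore

AI summary DriftXpress is an accelerated formulation of drifting models based on projected reproducing kernel Hilbert space (RKHS) fields, aimed at making the training of these generative models more efficient. It approximates the drifting kernel in a low-rank feature space, preserving the attraction-repulsion structure of the original drifting field while lowering the cost of field evaluation. Experiments show that DriftXpress reduces training time while maintaining image-generation quality, further improving the training-inference trade-off of drifting models.

Details
English abstract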

Drifting Models have emerged as a new paradigm for one-step generative modeling, achieving strong image quality without iterative inference. The premise is to replace the iterative denoising process in diffusion models with a single evaluation of a generator. However, this creates a different trade-off: drifting reduces inference cost by moving much of the computation into training. We introduce DriftXpress, an accelerated formulation of drifting models based on projected RKHS fields. DriftXpress approximates the drifting kernel in a low-rank feature space. This preserves the attraction-repulsion structure of the original drifting field while reducing the cost of field evaluation. Across image-generation benchmarks, DriftXpress achieves comparable FID to standard drifting while reducing wall-clock training cost. These results show that the training-inference trade-off of drifting models can be pushed further without giving up their one-step inference advantage.

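2605.12181 2026-05-13 cs.AI updated version

MolDeTox: Evaluating Language Model's Stepwise Fragment Editing for Molecular Detoxification

Jueon Park, Wonjune Jang, Jiwoo Lee, Yein Park, Jaewoo Kang

Affiliations * Korea University; Myongji University; AIGEN Sciences

AI summary This paper proposes MolDeTox, a new benchmark for molecular detoxification that evaluates how well language models reduce molecular toxicity through stepwise fragment editing. The benchmark addresses shortcomings of existing toxicity-repair benchmarks, namely limited data diversity, low structural validity of generated molecules, and reliance on proxy models for toxicity assessment, and it provides an interpretable evaluation framework through fine-grained task analysis. Experiments show that understanding and generating molecules at the fragment level improves structural validity and molecule quality, pointing to a new direction for optimizing drug safety.

Details
English abstract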

Large Language Models (LLMs) and Vision Language Models (VLMs) have recently shown promising capabilities in various scientific domains. In particular, these advances have opened new opportunities in drug discovery, where the ability to understand and modify molecular structures is critical for optimizing drug properties such as efficacy and toxicity. However, existing models and benchmarks often overlook toxicity-related challenges, focusing primarily on general property optimization without adequately addressing safety concerns. In addition, even existing toxicity repair benchmarks suffer from limited data diversity, low structural validity of generated molecules, and heavy reliance on proxy models for toxicity assessment. To address these limitations, we propose MolDeTox, a novel benchmark for molecular detoxification, designed to enable fine-grained and reliable evaluation of toxicity-aware molecular optimization across stepwise tasks. We evaluate a wide range of general-purpose LLMs and VLMs under diverse settings, and demonstrate that understanding and generating molecules at the fragment level improves structural validity and enhances the quality of generated molecules. Moreover, through detailed task-level performance analysis, MolDeTox provides an interpretable benchmark that enables a deeper understanding of the detoxification process. Our dataset is available at: https://huggingface.co/datasets/MolDeTox/MolDeTox

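2605.12180 2026-05-13 cs.IT cs.AI math.IT updated version

A Deep Learning-based Receiver for Asynchronous Grant-Free Random Access in Control-to-Control Networks

Massimo Battaglioni, Edoardo Carnevali, Dania De Crescenzo, Enrico Testi, Marco Baldi, Enrico Paolini

Affiliations * Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche; CNIT/WiLab, DEI, University of Bologna; FixIA s.r.l.

AI summary This paper studies receiver design for asynchronous, grant-free control-to-control (C2C) communication over a shared indoor wireless channel. Each node transmits command units consisting of a variable-length LDPC-coded payload delimited by a start sequence and a tail sequence. Because access is asynchronous, the receiver observes a superposition of signals from multiple nodes. The authors propose a receiver architecture based on a convolutional neural network that detects command-unit boundaries directly from the received signal and uses soft information from the LDPC decoder, together with channel estimates, to improve tail-sequence detection. Simulations show reliable packet-boundary identification and a low end-to-end packet loss rate even under high-traffic, uncoordinated conditions.

Comments Submitted to IEEE Transactions on Communications

Details
English abstract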

In this paper, we study grant-free, asynchronous control-to-control (C2C) communications in an indoor scenario with a shared wireless channel. Each communication node transmits command units, each consisting of a variable-length low-density parity-check (LDPC)-coded payload preceded by a start sequence and followed by a tail sequence. Due to the asynchronous nature of the access, transmissions from different nodes are not aligned over time. As a result, each receiving controller observes the superposition of multiple command units transmitted by different nodes over a receiver-defined superframe interval. Each node transmits one or more replicas of the same command unit. We propose a receiver architecture in which the detection of command unit boundaries (start/tail sequences) is carried out by a single convolutional neural network (CNN) operating directly on the received signal. We show that, while start-sequence detection must rely only on the received waveform, tail-sequence detection can additionally exploit the soft information produced by the LDPC decoder, together with channel estimates. Finally, once command units are successfully decoded, successive interference cancellation (SIC) can be applied. Simulation results demonstrate that the receiver we propose achieves reliable packet-boundary identification and a low end-to-end packet loss rate, even under uncoordinated and high-traffic operating conditions.

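2605.12178 2026-05-13 cs.AI cs.CL cs.LG updated version

Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics

Jishnu Sethumadhavan Nair, Patrice Bechard, Rishabh Maheshwary, Surajit Dasgupta, Sravan Ramachandran, Aakash Bhagat, Shruthan Radhakrishna, Pulkit Pattnaik, Johan Obando-Ceron, Shiva Krishna Reddy Malay, Sagar Davasam, Seganrasan Subramanian, Vipul Mittal, Sridhar Krishna Nemala, Christopher Pal, Srinivas Sunkara, Sai Rajeswar

Affiliations * ServiceNow; Mila

AI summary This paper asks whether enterprise systems need learned world models. Because the dynamics of enterprise systems are defined by tenant-specific business logic that changes over time, models trained on historical data perform poorly under deployment shift. The authors propose enterprise discovery agents, which obtain the transition rules by reading the system configuration at inference time, making predictions more robust. Experiments show that agents based on runtime discovery adapt better to changing dynamics than models that rely on offline training.

Details
English abstract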

World models enable agents to anticipate the effects of their actions by internalizing environment dynamics. In enterprise systems, however, these dynamics are often defined by tenant-specific business logic that varies across deployments and evolves over time, making models trained on historical transitions brittle under deployment shift. We ask a question the world-models literature has not addressed: when the rules can be read at inference time, does an agent still need to learn them? We argue, and demonstrate empirically, that in settings where transition dynamics are configurable and readable, runtime discovery complements offline training by grounding predictions in the active system instance. We propose enterprise discovery agents, which recover relevant transition dynamics at runtime by reading the system's configuration rather than relying solely on internalized representations. We introduce CascadeBench, a reasoning-focused benchmark for enterprise cascade prediction that adopts the evaluation methodology of World of Workflows on diverse synthetic environments, and use it together with deployment-shift evaluation to show that offline-trained world models can perform well in-distribution but degrade as dynamics change, whereas discovery-based agents are more robust under shift by grounding their predictions in the current instance. Our findings suggest that, in configurable enterprise environments, agents should not rely solely on fixed internalized dynamics, but should incorporate mechanisms for discovering relevant transition logic at runtime.

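2605.12160 2026-05-13 cs.RO cs.AI updated version

Premover: Fast Vision-Language-Action Control by Acting Before Instructions Are Complete

Joonha Park, Jiseung Jeong, Taesik Gong

Affiliations * UNIST; The Catholic University of Korea

AI summary This work proposes Premover, a lightweight module that improves the responsiveness of Vision-Language-Action (VLA) policies in real deployment. Premover precomputes before the user's instruction is complete, putting the robot's idle waiting time to use and speeding up overall execution. It keeps the VLA backbone frozen, adds two projection heads that map intermediate-layer features into a shared space, and is supervised with simulator-rendered target-object segmentation masks, reducing mean wall-clock task time while maintaining a high success rate.

Details
English abstract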

Vision-Language-Action (VLA) policies are typically evaluated as if the user had finished typing or speaking before the robot begins acting. In real deployment, however, users take several seconds to enter a request, leaving the policy idle for a substantial fraction of the interaction. We introduce Premover, a lightweight module that converts this idle window into useful precomputation. Premover keeps the VLA backbone frozen and attaches two small projection heads, one for image patches, one for language tokens, that map an intermediate layer of the backbone into a shared space. The resulting focus map is supervised by simulator-rendered target-object segmentation masks and applied as a per-patch reweighting of the next step's image tokens. A single scalar readiness threshold, trained jointly from streaming prefixes, decides when the policy should begin acting. On the LIBERO benchmark suite, Premover reduces mean wall-clock time from 34.0 to 29.4 seconds, a 13.6% reduction, while matching the full-prompt baseline's success rate (95.1% vs. 95.0%); naive premoving, by contrast, collapses to 66.4%.

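2605.12159 2026-05-13 cs.AI cs.GR updated version

ALGOGEN: Tool-Generated Verifiable Traces for Reliable Algorithm Visualization

Kunpeng Liao, Yuexiao Ma, Yisheng Lin, Hualin Zeng, Xiawu Zheng, Rongrong Ji

Affiliations * Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University

AI summary This paper proposes ALGOGEN, a method that generates verifiable algorithm-visualization traces to make algorithm visualization more reliable. Its core idea is to decouple algorithm execution from rendering: Visualization Trace Algebra (VTA) governs algorithm state and a Rendering Style Language (RSL) governs visual presentation, which avoids the LLM hallucinations seen in end-to-end approaches. On a LeetCode benchmark, ALGOGEN substantially raises the generation success rate, supporting its effectiveness on complex tasks.

Details
English abstract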

Algorithm Visualization (AV) helps students build mental models by animating algorithm execution states. Recent LLM-based systems such as CODE2VIDEO generate AV videos in an end-to-end manner. However, this paradigm requires the system to simultaneously simulate algorithm flow and satisfy video rendering constraints, such as element layout and color schemes. This complex task induces LLM hallucinations, resulting in reduced execution success rates, element overlap, and inter-frame inconsistencies. To address these challenges, we propose ALGOGEN, a novel paradigm that decouples algorithm execution from rendering. We first introduce Visualization Trace Algebra (VTA), a monoid over algorithm visual states and operations. The LLM then generates a Python tracker that simulates algorithm flow and outputs VTA-JSON traces, a JSON encoding of VTA. For rendering, we define a Rendering Style Language (RSL) to templatize algorithm layouts. A deterministic renderer then compiles algorithm traces with RSL into Manim, LaTeX/TikZ, or Three.js outputs. Evaluated on a LeetCode AV benchmark of 200 tasks, ALGOGEN achieves an average success rate of 99.8% versus 82.5% for end-to-end methods, an improvement of 17.3 percentage points. These results demonstrate that our decoupling paradigm effectively mitigates LLM hallucinations in complex AV tasks, providing a more reliable solution for automated generation of high-quality algorithm visualizations. Demo videos and code are available in the project repository.
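
To make the decoupling concrete, here is a minimal sketch of a Python tracker that runs an algorithm and emits a JSON trace, plus a replay step that rebuilds the final state from the trace alone. The event schema (`init`, `compare`, `swap`) and the function names are hypothetical illustrations; the actual VTA-JSON format is not specified in the abstract.

```python
import json

def bubble_sort_trace(values):
    """Sort a copy of values and record each compare/swap as a trace event."""
    arr = list(values)
    trace = [{"op": "init", "state": list(arr)}]
    n = len(arr)
    for i in range(n):
        for j in range(n - 1 - i):
            trace.append({"op": "compare", "i": j, "j": j + 1})
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                trace.append({"op": "swap", "i": j, "j": j + 1})
    return arr, trace

def replay(trace):
    """Rebuild the final state from trace events only, as a renderer might."""
    state = []
    for event in trace:
        if event["op"] == "init":
            state = list(event["state"])
        elif event["op"] == "swap":
            i, j = event["i"], event["j"]
            state[i], state[j] = state[j], state[i]
    return state

sorted_values, trace = bubble_sort_trace([3, 1, 2])
trace_json = json.dumps(trace)  # what a tracker would hand to the renderer
```

Because the trace is plain data, it can be checked independently of any rendering: replaying `json.loads(trace_json)` should reproduce `sorted_values`.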

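2605.12154 2026-05-13 cs.AI updated version

MM-OptBench: A Solver-Grounded Benchmark for Multimodal Optimization Modeling

Zhong Li, Qi Huang, Yuxuan Zhu, Mohammad Mohammadi Amiri, Niki van Stein, Thomas Bäck, Matthijs van Leeuwen, Zaiwen Wen, Lincen Yang

Affiliations * Great Bay University; Leiden University; Rensselaer Polytechnic Institute; Peking University

AI summary MM-OptBench is a solver-grounded benchmark for multimodal optimization modeling that evaluates a model's ability to build mathematical optimization models and executable solver code from textual and visual information. It covers 6 optimization families, 26 subcategories, and 3 difficulty levels, for a total of 780 solver-verified instances. Experiments show that current mainstream multimodal large language models perform poorly on the task, with results dropping sharply on hard instances, which highlights how challenging multimodal optimization modeling is.

Comments Paper under review

Details
English abstract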

Optimization modeling translates real decision-making problems into mathematical optimization models and solver-executable implementations. Although language models are increasingly used to generate optimization formulations and solver code, existing benchmarks are almost entirely text-only. This omits many optimization-modeling tasks that arise in operational practice, where requirements are described in text but instance information is conveyed through visual artifacts such as tables, graphs, maps, schedules, and dashboards. We introduce multimodal optimization modeling, a benchmark setting in which models must construct both a mathematical formulation and executable solver code from a text-and-visual problem specification. To evaluate this setting, we develop a solver-grounded framework that generates structured optimization instances, verifies each with an exact solver, and builds both the model-facing inputs and hidden reference files from the same verified source. We instantiate the framework as MM-OptBench, a benchmark of 780 solver-verified instances spanning 6 optimization families, 26 subcategories, and 3 structural difficulty levels. We evaluate 9 multimodal large language models (MLLMs), including 6 frontier general-purpose models and 3 math-specialized models, with aggregate, family-level, difficulty-level, and failure-mode analyses. The results show that the task remains far from solved: the best two models reach 52.1% and 51.3% pass@1, while on average across the six general-purpose MLLMs, pass@1 is 43.4% on easy instances and 15.9% on hard instances. All three math-specialized MLLMs solve 0/780 instances. Failure attribution shows that errors arise both when extracting instance data from text and visuals and when turning extracted data into solver-correct formulations and code. MM-OptBench provides a testbed for solver-grounded, decision-oriented multimodal intelligence.

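2605.12153 2026-05-13 cs.SE cs.AI updated version

CIDR: A Large-Scale Industrial Source Code Dataset for Software Engineering Research

Vladislav Savenkov

Affiliations * Fermatix AI

AI summary This paper introduces CIDR, a large-scale industrial source code dataset collected through direct collaboration with 12 industrial partners. It contains 2,440 software repositories spanning 138 programming languages and 373 million lines of code, with structured metadata. Unlike existing code corpora built from open-source platforms, CIDR contains only proprietary production code provided under formal data sharing agreements, covering enterprise web and mobile development, fintech, and custom software consultancy. The dataset went through a multi-stage pipeline of structured partner onboarding, two-stage quality selection, and deterministic anonymization, and is intended to support research on code intelligence, software quality analysis, pre-training and fine-tuning of code language models, developer behaviour studies, and construction of agent evaluation benchmarks.

Comments 34 pages, 9 figures, 4 appendices. Dataset access: https://fermatix.ai/#Contact. Anonymization tool: https://github.com/Fermatix/repo-sanitizer. Metadata utility: https://github.com/Fermatix/repo_metadata_cli

Details
English abstract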

We present Curated Industrial Developer Repository (CIDR), a large-scale dataset of real-world software repositories collected through direct collaboration with 12 industrial partner organizations. The dataset comprises 2,440 repositories spanning 138 programming languages and totalling 373 million lines of code, accompanied by structured per-repository metadata. Unlike existing code corpora derived from public open-source platforms, CIDR consists exclusively of proprietary production codebases contributed under formal data sharing agreements, covering application domains including enterprise web and mobile development, fintech, and custom software consultancy. All repositories were processed through a multi-stage pipeline encompassing structured partner onboarding, two-stage quality selection combining automated metadata filtering with manual code review, and a deterministic anonymization pipeline covering the full version control history. The dataset is intended to support research in code intelligence, software quality analysis, pre-training and fine-tuning of code language models, developer behaviour studies, and construction of agent evaluation benchmarks. Access is provided under a restricted commercial license; details are available at https://fermatix.ai/#Contact.

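2605.12139 2026-05-13 cs.AI updated version

BoolXLLM: LLM-Assisted Explainability for Boolean Models

Du Cheng, Serdar Kadioglu, Xin Wang

Affiliations * AI Center of Excellence, Fidelity Investments; Department of Computer Science, Brown University

AI summary BoolXLLM is a learning framework that combines large language models (LLMs) with Boolean rules to make Boolean models more explainable. It brings LLMs into three key stages: feature selection, recommendation of discretization strategies for numerical features, and compression and interpretation of Boolean rules, producing explanations that fit domain semantics and are easy to understand. The study shows that this hybrid approach helps non-technical users understand the model's decision process while maintaining predictive performance.

Details
English abstract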

Interpretable machine learning aims to provide transparent models whose decision-making processes can be readily understood by humans. Recent advances in rule-based approaches, such as expressive Boolean formulas (BoolXAI), offer faithful and compact representations of model behavior. However, for non-technical stakeholders, two main challenges remain in practice: (i) selecting semantically meaningful features and (ii) translating formal logical rules into accessible explanations. In this work, we propose BoolXLLM, a hybrid framework that integrates Large Language Models (LLMs) into the end-to-end pipeline of Boolean rule learning. We augment BoolXAI, an expressive Boolean rule-based classifier, with LLMs at three critical stages: (1) feature selection, where LLMs guide the identification of domain-relevant variables; (2) threshold recommendation, where LLMs propose semantically meaningful discretization strategies for numerical features; and (3) rule compression and interpretation, where Boolean rules are translated into natural language explanations at both global and local levels. This integration bridges formal, faithful explanations with human-understandable narratives, allowing us to build an explainable AI system that is both theoretically grounded and accessible to non-experts. Early empirical results demonstrate that LLM-assisted pipelines improve interpretability while maintaining competitive predictive performance. Our work highlights the promise of combining symbolic reasoning with language-based models for human-centered explainability.

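2605.12131 2026-05-13 cs.AI updated version

Rollout Cards: A Reproducibility Standard for Agent Research

Charlie Masters, Ziyuan Liu, Stefano V. Albrecht

Affiliations * Deepflow; Nanyang Technological University

AI summary This paper addresses the growing reproducibility problem in agent research and proposes a new standard, rollout cards. It points out that many papers report only system scores without releasing the full rollout records behind them, so the same behaviour can receive different scores under different evaluation choices. The authors introduce rollout cards, which treat the rollout record rather than the reported score as the basic unit of reproducibility, and validate them on real cases, showing that changing only the reporting rule can substantially alter model rankings.

Details
English abstract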

Reproducibility problems that have long affected machine learning and reinforcement learning are now surfacing in agent research: papers compare systems by reported scores while leaving the rollout records behind those scores difficult to inspect. For agentic tasks, this matters because the same behaviour can receive different reported scores when evaluations select different parts of a rollout or apply different reporting rules. In a structured audit of 50 popular training and evaluation repositories, we find that none report how many runs failed, errored, or were skipped alongside headline scores. We also document 37 cases where reporting rules can change task-success rates, cost/token accounting, or timing measurements for fixed evidence, sometimes dramatically. We treat rollout records, not reported scores, as the unit of reproducibility for agent research. We introduce rollout cards: publication bundles that preserve the rollout record and declare the views, reporting rules, and drops manifests behind reported scores. We validate rollout cards in two settings. First, four partial public releases in tool safety, multi-agent systems, theorem proving, and search let us compute analyses their original reports did not include. Second, re-grading preserved benchmark outputs across short-answer, code-generation, and tool-use tasks shows that changing only the reporting rule can change reported scores by 20.9 absolute percentage points and, in some cases, invert rankings of frontier models. We release a reference implementation integrated into Ergon, an open-source reinforcement learning gym, and publicly publish Ergon-produced rollout-card exports for benchmarks spanning tool use, software engineering, web interaction, multi-agent coordination, safety, and search to support future research.
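
The claim that a reporting rule alone can move a score is easy to see on a toy example. The sketch below computes a task-success rate from one fixed set of rollout records under two rules: counting every run, and silently dropping errored or skipped runs. The record format, status labels, and function name are hypothetical and the data is made up; none of it comes from the paper or from Ergon.

```python
# Fixed evidence: five rollout records (illustrative data only).
rollouts = [
    {"task": "t1", "status": "success"},
    {"task": "t2", "status": "failure"},
    {"task": "t3", "status": "error"},
    {"task": "t4", "status": "success"},
    {"task": "t5", "status": "skipped"},
]

def success_rate(records, drop_statuses=()):
    """Return (success rate over kept records, number of dropped records)."""
    kept = [r for r in records if r["status"] not in drop_statuses]
    wins = sum(r["status"] == "success" for r in kept)
    return wins / len(kept), len(records) - len(kept)

# Rule A: every run counts; errors and skips are treated as non-successes.
strict_rate, strict_dropped = success_rate(rollouts)

# Rule B: errored and skipped runs are excluded before scoring.
lenient_rate, lenient_dropped = success_rate(
    rollouts, drop_statuses=("error", "skipped")
)
```

Under rule A the rate is 2/5; under rule B it is 2/3 with two runs dropped. Reporting the drop count alongside the headline number is what lets a reader reconcile the two.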

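2605.12129 2026-05-13 cs.SE cs.AI cs.OS updated version

It's Not the Size: Harness Design Determines Operational Stability in Small Language Models

Yong-eun Cho

Affiliations * KailosLab

AI summary This paper studies how the operational stability of small language models (2-3B parameters) depends on harness design rather than model size. Comparing three harness conditions (model-only, minimal-shell, and a 4-stage pipeline) across 24 tasks, it finds that the 4-stage pipeline markedly improves task success, reaching a 95.2% task success rate and a 100% valid task success rate on Gemma4 E2B. The study also shows that a missing harness can cause the model's output structure to collapse, and that planning and recovery contribute substantially to the gains.

Details
English abstract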

This paper experimentally analyzes how the level of harness engineering affects the operational performance of small language models (SLMs, 2-3B parameters). Three harness conditions - model-only (raw prompt), minimal-shell (wrapper tags), and a 4-stage pipeline (plan->execute->verify->recover) - are applied to three models (Gemma4 E2B, Qwen3.5:2B, LLaMA 3.2 3B) across 24 tasks, comparing Task Success Rate (TSR) and Valid TSR (VTSR). The pipeline harness achieves TSR=0.952 and VTSR=1.000 on Gemma4 E2B (T1-T5, 21 tasks). A non-monotonic phenomenon - minimal-shell TSR < model-only TSR - is observed in two models. In LLaMA 3.2 3B model-only, seven format violations yield TSR=0.429, revealing scaffold collapse: the model abandons JSON structure under complex format requirements without harness support. Ablation shows planning and recovery each contribute approximately 24.7% of total gain. VCR (Verification Catch Rate)=0.625 across all pipeline runs.
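
As a rough illustration of what a plan->execute->verify->recover harness might look like, the sketch below wraps a model call in those four stages and uses JSON validity as the verification check. The prompt prefixes, the `run_pipeline` and `make_flaky_model` names, and the stub model are assumptions made for this example; the paper's actual harness is not described at this level of detail in the abstract.

```python
import json

def run_pipeline(task, model, max_recoveries=1):
    """plan -> execute -> verify -> recover, with JSON parsing as the verifier."""
    plan = model(f"PLAN: {task}")
    output = model(f"EXECUTE: {task}\nPLAN: {plan}")
    for attempt in range(max_recoveries + 1):
        try:
            # Verify: the output must parse as JSON.
            result = json.loads(output)
            return {"ok": True, "result": result, "recoveries": attempt}
        except json.JSONDecodeError as err:
            if attempt == max_recoveries:
                break
            # Recover: ask the model again, passing the verifier's error.
            output = model(f"RECOVER: {task}\nERROR: {err}")
    return {"ok": False, "result": None, "recoveries": max_recoveries}

def make_flaky_model():
    """Stub standing in for a small model: its first execution breaks JSON."""
    def model(prompt):
        if prompt.startswith("PLAN"):
            return "emit a JSON object with an 'answer' key"
        if prompt.startswith("EXECUTE"):
            return "{answer: 4"        # malformed JSON
        return '{"answer": 4}'         # recovery attempt is well-formed
    return model

outcome = run_pipeline("add 2 and 2", make_flaky_model())
```

With one recovery allowed, the malformed first output is caught and repaired; with `max_recoveries=0` the same run is reported as a failure, which is the kind of format violation the model-only condition has no way to catch.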

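2605.12122 2026-05-13 cs.LG cs.AI cs.CV updated version

Disentangled Sparse Representations for Concept-Separated Diffusion Unlearning

Hyeonjin Kim, Hangyeol Jung, Heechan Yun, Sungjun Yun, Dong-Jun Han

Affiliations * Yonsei University; Kookmin University

AI summary This paper studies how to remove specific concepts from text-to-image diffusion models and proposes SAEParate. The method introduces a concept-aware contrastive objective that organizes latent representations into concept-specific clusters, enabling more precise concept suppression with less interference during unlearning. The authors also strengthen the encoder to increase its expressive capacity under the separation objective. Experiments show state-of-the-art unlearning performance, with particularly strong results on joint style-object unlearning.

Comments 40 pages, 23 figures

Details
English abstract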

Unlearning specific concepts in text-to-image diffusion models has become increasingly important for preventing undesirable content generation. Among prior approaches, sparse autoencoder (SAE)-based methods have attracted attention due to their ability to suppress target concepts through lightweight manipulation of latent features, without modifying model parameters. However, SAEs trained with sparse reconstruction objectives do not explicitly enforce concept-wise separation, resulting in shared latent features across concepts. To address this, we propose SAEParate, which organizes latent representations into concept-specific clusters via a concept-aware contrastive objective, enabling more precise concept suppression while reducing unintended interference during unlearning. In addition, we enhance the encoder with a GeLU-based nonlinear transformation to increase its expressive capacity under this separation objective, enabling a more discriminative and disentangled latent space. Experiments on UnlearnCanvas demonstrate state-of-the-art performance, with particularly strong gains in joint style-object unlearning, a challenging setting where existing methods suffer from severe interference between target and non-target concepts.

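2605.12120 2026-05-13 cs.AI updated version

To Whom Do Language Models Align? Measuring Principal Hierarchies Under High-Stakes Competing Demands

Fangyi Yu, Nabeel Seedat, Jonathan Richard Schwarz, Andrew M. Bean

Affiliations * Thomson Reuters Foundational Research; University of Oxford; Imperial College London

AI summary This study examines whom language models align with in high-stakes professional settings when users, institutional authorities, and professional norms make conflicting demands. Testing ten frontier models on 7,136 legal and medical scenarios, it finds that models often disregard professional standards during task execution and that their prioritization of user, authority, and standards is unstable across domains and models. The main failure mechanism is knowledge omission: even when a model's internal reasoning identifies the relevant knowledge, the model may leave it out of the user-facing output, producing harmful results.

Details
English abstract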

Language models deployed in high-stakes professional settings face conflicting demands from users, institutional authorities, and professional norms. How models act when these demands conflict reveals a principal hierarchy -- an implicit ordering over competing stakeholders that determines, for instance, whether a medical AI receiving a cost-reduction directive from a hospital administrator complies at the expense of evidence-based care, or refuses because professional standards require it. Across 7,136 scenarios in legal and medical domains, we test ten frontier models and find that models frequently fail to adhere to professional standards during task execution, such as drafting, when user instructions conflict with those standards -- despite adequately upholding them when users seek advisory guidance. We further find that the hierarchies between user, authority, and professional standards exhibited by these models are unstable across medical and legal contexts and inconsistent across model families. When failing to follow professional standards, the primary failure mechanism is knowledge omission: models that demonstrably possess relevant knowledge produce harmful outputs without surfacing conflicting knowledge. In a particularly troubling instance, we find that a reasoning model recognizes the relevant knowledge in its reasoning trace -- e.g., that a drug has been withdrawn -- yet suppresses this in the user-facing answer and proceeds to recommend the drug under authority pressure anyway. Inconsistent alignment across task framing, domain, and model families suggests that current alignment methods, including published alignment hierarchies, are unlikely to be robust when models are deployed in high-stakes professional settings.

2605.12111 2026-05-13 cs.AI cs.DS Updated

Adaptive Multi-Round Allocation with Stochastic Arrivals

Yuqi Pan, Davin Choo, Haichuan Wang, Milind Tambe, Alastair van Heerden, Cheryl Johnson

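Institutions * Harvard University; University of Witwatersrand

AI summary This paper studies a multi-round resource allocation problem motivated by adaptive network recruitment, in which a limited budget of identical resources is allocated over multiple rounds to individuals with stochastic referral capacity. Successful referrals create future decision opportunities, while additional resources given to the same individual yield diminishing returns. To handle the intractable dynamic program in the multi-round setting, the authors introduce a population-level surrogate value function that depends only on the remaining budget and frontier size, yielding an exact dynamic programming algorithm whose complexity is polynomial in the total budget. They also analyze robustness under model misspecification and give a multi-round error bound that decomposes into a single-round frontier error and a population-level transition error.

Comments Accepted into ICML 2026

Details
English abstract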
We study a sequential resource allocation problem motivated by adaptive network recruitment, in which a limited budget of identical resources must be allocated over multiple rounds to individuals with stochastic referral capacity. Successful referrals endogenously generate future decision opportunities while allocating additional resources to an individual exhibits diminishing returns. We first show that the single-round allocation problem admits an exact greedy solution based on marginal survival probabilities. In the multi-round setting, the resulting Bellman recursion is intractable due to the stochastic, high-dimensional evolution of the frontier. To address this, we introduce a population-level surrogate value function that depends only on the remaining budget and frontier size. This surrogate enables an exact dynamic program via truncated probability generating functions, yielding a planning algorithm with polynomial complexity in the total budget. We further analyze robustness under model misspecification, proving a multi-round error bound that decomposes into a tight single-round frontier error and a population-level transition error. Finally, we evaluate our method on real-world inspired recruitment scenarios.

2605.12106 2026-05-13 cs.AI Updated

Large Language Models as Amortized Pareto-Front Generators for Constrained Bi-Objective Convex Optimization

Peipei Xu, SiYuan Ma, Yaohua Liu, Yu Wu, Guanliang Liu, Yang Zhang, Yong Liu

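Institutions * University of Shanghai for Science and Technology; Nanyang Technological University; Guangdong Institute of Intelligence Science and Technology; Georgia Institute of Technology; The University of Michigan; The Hong Kong University of Science and Technology

AI summary This study explores how large language models can generate Pareto fronts for constrained bi-objective convex optimization problems. It proposes DIPS, an end-to-end framework that fine-tunes a large language model to output continuous decision vectors approximating the Pareto front directly from a textual problem description. DIPS combines numerically grounded token initialization with a phased curriculum optimization, generates fronts efficiently, and comes close to reference fronts across several problem families, showing the potential of LLMs for continuous Pareto-front approximation.

Comments 31 pages

Details
English abstract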
Generating feasible Pareto fronts for constrained bi-objective continuous optimization is central to multi-criteria decision-making. Existing methods usually rely on iterative scalarization, evolutionary search, or problem-specific solvers, requiring repeated optimization for each instance. We introduce DIPS, an end-to-end framework that fine-tunes large language models as amortized Pareto-front generators for constrained bi-objective convex optimization. Given a textual problem description, DIPS directly outputs an ordered set of feasible continuous decision vectors approximating the Pareto front. To make continuous optimization compatible with autoregressive language modeling, DIPS combines a compact discretization scheme, Numerically Grounded Token Initialization for new numerical tokens, and Three-Phase Curriculum Optimization, which progressively aligns structural validity, feasibility, and Pareto-front quality. Across five families of constrained bi-objective convex problems, a fine-tuned 7B-parameter model achieves normalized hypervolume ratios of 95.29% to 98.18% relative to reference fronts. With vLLM-accelerated inference, DIPS solves one instance in as little as 0.16 seconds and outperforms general-purpose and reasoning LLM baselines under the evaluated setting. These results suggest that LLMs can serve as effective amortized generators for continuous Pareto-front approximation.
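
The normalized hypervolume ratio reported above can be illustrated with a minimal sketch. The abstract does not specify the normalization, so the code assumes both objectives are minimized, that the candidate and reference fronts share one reference point, and that the ratio is simply the quotient of the two 2D hypervolumes. The function names are hypothetical and are not taken from DIPS.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (f1, f2), both objectives minimized (assumption)


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing f1."""
    front: List[Point] = []
    best_f2 = float("inf")
    for f1, f2 in sorted(set(points)):
        # After sorting by f1, a point is non-dominated only if it
        # strictly improves on the best f2 seen so far.
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_f2 = ref[1]
    for f1, f2 in front:
        # Each front point adds a horizontal slab below the previous one.
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area


def hypervolume_ratio(candidate: List[Point], reference: List[Point], ref: Point) -> float:
    """Hypervolume of the candidate front relative to the reference front."""
    return hypervolume_2d(candidate, ref) / hypervolume_2d(reference, ref)
```

For example, `hypervolume_2d([(1, 3), (2, 2), (3, 1)], (4, 4))` returns `6.0`, and a candidate front missing the middle point scores a ratio of 5/6 against it.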

2605.12105 2026-05-13 cs.AI Updated

Autonomy and Agency in Agentic AI: Architectural Tactics for Regulated Contexts

Damir Safin, Dian Balta

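Institutions * fortiss GmbH; Research Institute of the Free State of Bavaria for software-intensive systems

AI summary Deploying agentic AI systems in regulated environments requires systematic consideration of two design dimensions, agency and autonomy. This paper proposes a two-dimensional design space that divides each dimension into five operational levels and makes their coupling explicit, and presents six architectural tactics for adjusting a system's position within that space. It also analyzes five parameters that affect deployment outcomes, providing a framework and practical guidance for compliance-oriented agentic AI design.

Details
English abstract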
Deploying agentic AI in regulated contexts requires principled reasoning about two design dimensions: agency (what the system can do) and autonomy (how much it acts without human involvement). Though often treated independently, they are coupled: at higher autonomy, human error correction is less available, so reliable operation requires constraining agency accordingly; compliance requirements reinforce this by mandating human involvement as action consequences grow. Yet no established approach addresses them jointly, leaving practitioners without a principled basis for reasoning about oversight, action consequences, and error correction. This work introduces a two-dimensional design space in which both dimensions are organised into five operational levels, making the coupling explicit and navigable. Autonomy ranges from human-commanded operation (L1) to fully autonomous monitoring (L5); agency ranges from reasoning over supplied context (L1) to committed writes to authoritative records (L5). Building on this space, we propose six architectural tactics--checkpoints, escalation, multi-agent delegation, tool provisioning, tool fencing, and write staging--for adjusting a deployment's position within it. The tactics are grounded in two worked examples from public-sector contexts, illustrating how they apply under realistic compliance constraints. We further examine five deployment parameters--model capability, agent architecture, tool fidelity, workflow bottlenecks, and evaluation--that shape what is achievable at any configuration independently of agency and autonomy. Together, the design space, tactics, and deployment parameters provide a shared vocabulary for principled, compliance-aware agentic AI design in which responsibility, auditability, and reversibility are explicit design considerations rather than properties that must be retrofitted after deployment.

2605.12087 2026-05-13 cs.AI cs.MA Updated

Intermediate Artifacts as First-Class Citizens: A Data Model for Durable Intermediate Artifacts in Agentic Systems

Josh Rosen, Seth Rosen

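Institutions * ThruWire, Inc.

AI summary Many AI systems operate as loops of model reasoning, tool calls, and observation of results, but the intermediate artifacts they produce often exist only as ephemeral state that is hard to trace and reuse. This paper proposes treating intermediate artifacts as first-class components of the system, emphasizing that they should be structured, traceable, and revisable so that humans or agents can later review and improve them. The contribution is a systems-level data model that distinguishes intermediate artifacts from chat transcripts, chain-of-thought, and similar material, and that formalizes artifact updates, versioning, and quality evaluation, improving the maintainability and traceability of AI-generated work.

Comments 18 pages, 1 figure, 3 tables

Details
English abstract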
Many AI systems are organized around loops in which models reason, call tools, observe results, and continue until a task is complete. These systems often produce final artifacts such as memos, plans, recommendations, and analyses, while the intermediate work that shaped those outputs remains ephemeral. For multi-step, revisable AI work, final artifacts are often lossy projections over upstream state. We argue that such systems should preserve durable, inspectable intermediate artifacts: typed, structured, addressable, versioned, dependency-aware, authoritative, and consumable by downstream computation. These artifacts are not the model's private chain-of-thought. They are maintained work products such as evidence maps, claim structures, criteria, assumptions, plans, transformation rules, synthesis procedures, unresolved tensions, and partial products that later humans and agents can inspect, revise, supersede, and improve. The contribution is a systems-level data model. We distinguish intermediate artifacts from chat transcripts, memory, hidden chain-of-thought, narration, thinking, and final answers; formalize additive and superseding update semantics with explicit current-state resolution; describe how artifact lineage supports durable intermediate state across revisions; and argue that evaluation must target maintained-state quality, not only final-output quality. The claim is not that artifacts make models smarter. It is that durable intermediate artifacts make AI-generated work more inspectable, revisable, and maintainable over time.

2605.12084 2026-05-13 cs.RO cs.AI cs.IT cs.LG cs.SY eess.SY math.IT Updated

Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration

Youwei Yu, Jionghao Wang, Zhengming Yu, Wenping Wang, Lantao Liu

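Institutions * Luddy School of Informatics, Computing, and Engineering

AI summary This paper studies how to design learnable information-theoretic objectives for robot exploration so as to reduce uncertainty in model parameters more effectively. The authors propose Quasi-Optimal Experimental Design (QOED), an adaptive information objective grounded in optimal experimental design, which analyzes the eigenspace of the Fisher information matrix to identify observable parameter directions and suppress interference from irrelevant parameters. Experiments show that the method substantially improves exploration efficiency and policy performance on navigation and manipulation tasks.

Details
English abstract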
Designing learnable information-theoretic objectives for robot exploration remains challenging. Such objectives aim to guide exploration toward data that reduces uncertainty in model parameters, yet it is often unclear what information the collected data can actually reveal. Although reinforcement learning (RL) can optimize a given objective, constructing objectives that reflect parametric learnability is difficult in high-dimensional robotic systems. Many parameter directions are weakly observable or unidentifiable, and even when identifiable directions are selected, omitted directions can still influence exploration and distort information measures. To address this challenge, we propose Quasi-Optimal Experimental Design (QOED), an adaptive information objective grounded in optimal experimental design. QOED (i) performs eigenspace analysis of the Fisher information matrix to identify an observable subspace and select identifiable parameter directions, and (ii) modifies the exploration objective to emphasize these directions while suppressing nuisance effects from non-critical parameters. Under bounded nuisance influence and limited coupling between critical and nuisance directions, QOED provides a constant-factor approximation to the ideal information objective that explores all parameters. We evaluate QOED on simulated and real-world navigation and manipulation tasks, where identifiable-direction selection and nuisance suppression yield performance improvements of 35.23% and 21.98%, respectively. When integrated as an exploration objective in model-based policy optimization, QOED further improves policy performance over established RL baselines.
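
The eigenspace analysis in step (i) can be pictured with a short sketch. This is not the authors' implementation: the relative eigenvalue threshold `rel_tol` and the log-determinant (D-optimality style) score on the retained subspace are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def identifiable_directions(fim: np.ndarray, rel_tol: float = 1e-3):
    """Keep Fisher-information eigenvectors whose eigenvalue is not negligible.

    rel_tol is an assumed cut-off relative to the largest eigenvalue.
    """
    fim = 0.5 * (fim + fim.T)               # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(fim)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > rel_tol * eigvals[0]
    return eigvals[keep], eigvecs[:, keep]


def projected_logdet(fim: np.ndarray, basis: np.ndarray) -> float:
    """Log-determinant of the information restricted to the kept subspace."""
    reduced = basis.T @ fim @ basis
    _sign, logdet = np.linalg.slogdet(reduced)
    return float(logdet)


# Toy example: the third parameter direction is practically unobservable.
fim = np.diag([4.0, 1.0, 1e-8])
values, basis = identifiable_directions(fim)
score = projected_logdet(fim, basis)
```

In the toy example two directions survive the threshold, and the score depends only on them, so the near-zero eigenvalue no longer drives the objective toward negative infinity.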

2605.12078 2026-05-13 cs.SE cs.AI Updated

Property-Level Reconstructability of Agent Decisions: An Anchor-Level Pilot Across Vendor SDK Adapter Regimes

Oleg Solozobov

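Institutions * Independent Researcher (Global)

AI summary This study examines how reconstructable agent decisions are under different vendor SDK adapter regimes, aiming to assess how traceable the decision process is. It applies an unmodified decision trace reconstructor to pinned examples from six public SDK regimes and classifies each property by how fully it can be filled. The results show that the reconstructability of decision properties differs markedly between regimes and reveal multi-tier gaps in governance completeness, offering a new evaluation framework for cross-regime analysis of agent behavior.

Comments 23 pages, 3 tables; reproducibility package: https://doi.org/10.5281/zenodo.20077961; GitHub: https://github.com/agent-runtime-evidence/anchor-level-reconstructability-pilot

Details
English abstract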
Agentic AI failures need post-hoc reconstruction: what the agent did, on whose authority, against which policy, and from what reasoning. Cross-regime feasibility remains unmeasured under one property-level schema. We apply the Decision Trace Reconstructor unmodified to pinned worked-example anchors from six public vendor SDK regimes spanning cloud-agent, observability, tool-use, telemetry, and protocol traces, plus two comparator columns. Each Decision Event Schema (DES) property is classified as fully fillable, partially fillable, structurally unfillable, or opaque. Per-property reconstructability of an agent decision already varies between regimes at this anchor scale. Strict-governance-completeness separates into three tiers ranging from 42.9% to 85.7%, yielding one regime-independent gap (reasoning trace), four regime-dependent gaps, and one Mixed property; the pilot is single-annotator, one anchor per cell, descriptive, with outputs checksum-verifiable from a deposited reproducibility package.

2605.12077 2026-05-13 cs.CV cs.AI Updated

The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments

Ofir Itzhak Shahar, Gur Elkin, Ohad Ben-Shahar

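Institutions * Stein Faculty of Computer and Information Science

AI summary This paper studies the move from solving standard jigsaw puzzles to the more challenging task of handling real archaeological fragments. To address the reassembly of irregularly shaped, heavily eroded fragments, the authors introduce the GAP datasets and design PuzzleFlow, a new framework based on ViT and flow matching. The method performs well on reassembling complex-shaped pieces and clearly outperforms existing methods.

Details
English abstract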
Jigsaw puzzle solving has been an increasingly popular task in the computer vision research community. Recent works have utilized cutting-edge architectures and computational approaches to reassemble groups of pieces into a coherent image, while achieving increasingly good results on well established datasets. However, most of these approaches share a common, restricting setting: operating solely on strictly square puzzle pieces. In this work, we introduce GAP, a set of novel jigsaw puzzles datasets containing synthetic, heavily eroded pieces of unrestricted shapes, generated by a learned distribution of real-world archaeological fragments. We also introduce PuzzleFlow, a novel ViT and Flow-Matching based framework for jigsaw puzzle solving, capable of handling complex puzzle pieces and demonstrating superior performance on GAP when compared to both classic and recent prominent works in this domain.

2605.12075 2026-05-13 cs.CR cs.AI Updated

The Deepfakes We Missed: We Built Detectors for a Threat That Didn't Arrive

Shaina Raza

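Institutions * Vector Institute for Artificial Intelligence

AI summary Deepfake detection research has been organized mainly around a threat model proposed in 2017 to 2019, focused on large-scale misinformation risks such as face-swap and talking-head manipulation of public figures. Incidents from 2022 to 2026, however, show that the dominant threats have shifted to non-consensual intimate imagery, voice-clone scams, and emotional-manipulation fraud. The paper argues that this mismatch between research focus and real-world threats has become the main bottleneck in deepfake defense, and calls on the research community to rebalance its priorities toward the harms that are actually growing.

Details
English abstract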
Nearly a decade of Machine Learning (ML) research on deepfake detection has been organized around a threat model inherited from 2017--2019, revolving around face-swap and talking-head manipulation of public figures, motivated by concerns about large-scale misinformation and video-evidence fraud. This position paper argues that the threat the field prepared for did not arrive, and the threats that did arrive are substantially different. An accounting of deepfake incidents in 2022--2026 shows that the dominant observed harms are peer-generated Non-Consensual Intimate Imagery (NCII), voice-clone scam calls targeting families and finance workers, and emotional-manipulation fraud. The predicted large-scale public-figure deepfake catastrophe did not materialize during the 2024 global information environment despite extensive preparation. Meanwhile, research effort, benchmarks, and detection methods remain concentrated on the inherited threat model. The central claim of this paper is that this misalignment is now the dominant bottleneck on real-world deepfake defense, not model capability. We argue the ML research community should substantially rebalance its research agenda toward the harm categories that are actually growing. We support this position with empirical accounting of research effort and harm distribution, identify the structural reasons the misalignment persists, and outline three concrete technical research agendas for the under-defended harm categories.

2605.12073 2026-05-13 cs.CC cs.AI Updated

Clausal Deletion Backdoors for QBF: a Parameterized Complexity Approach

Leif Eriksson, Victor Lagerkvist, Sebastian Ordyniak, George Osipov, Fahad Panolan, Mateusz Rychlicki

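Institutions * Independent researcher; Linköping University; University of Leeds; Royal Holloway, University of London

AI summary This paper studies the validity problem for quantified Boolean formulas (QBF) and proposes a new parameterized complexity approach that analyzes solving efficiency in terms of the size of a clause covering backdoor (CC-backdoor). The authors consider three classical tractable QBF classes, Horn, 2-CNF, and linear equations, and show that Horn is W[1]-hard while the other two are fixed-parameter tractable (FPT) given a CC-backdoor of size $k$. The work identifies a key dividing line in the parameterized complexity of QBF and shows how different solving techniques apply within this framework.

Details
English abstract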
Determining the validity of a quantified Boolean formula (QBF) is a PSPACE-complete problem with rich expressive power. Despite interest in efficient solvers, there is, compared to problems in NP, a lack of positive theoretical results, and in the parameterized complexity setting one often has to restrict the quantifier prefix (e.g., bounding alternations) to obtain fixed parameter tractability (FPT). We propose a new parameter: the number of variables in clauses that has to be removed before reaching a tractable class (a clause covering (CC) backdoor). We are then interested in solving QBF in FPT time given a CC-backdoor of size $k$. We consider the three classical, tractable cases of QBF as base classes: Horn, 2-CNF, and linear equations. We establish W[1]-hardness for Horn but prove FPT for the others, and prove that in a precise, algebraic sense, we are only missing one important case for a full dichotomy. Our algorithms are non-trivial and depend on propagation, and Gaussian elimination, respectively, and are comparably unexplored for QBF.
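
For readers unfamiliar with the underlying problem, QBF validity can be stated as a tiny brute-force evaluator. This sketch only illustrates the semantics of a prenex CNF formula and runs in exponential time; it is not the paper's FPT algorithm and says nothing about backdoors. The encoding (quantifier tags `"A"` and `"E"`, DIMACS-style signed integer literals) is an assumption made for the example.

```python
from typing import Dict, List, Optional, Tuple

Prefix = List[Tuple[str, int]]  # ("A" for forall or "E" for exists, variable id)
Clause = List[int]              # signed literals: 3 means x3, -3 means not x3


def eval_qbf(prefix: Prefix, clauses: List[Clause],
             assignment: Optional[Dict[int, bool]] = None) -> bool:
    """Decide a closed prenex CNF formula by expanding quantifiers in order."""
    assignment = dict(assignment or {})
    if not prefix:
        # All variables are assigned: check every clause of the matrix.
        return all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        )
    (quantifier, var), rest = prefix[0], prefix[1:]
    branches = (
        eval_qbf(rest, clauses, {**assignment, var: value})
        for value in (False, True)
    )
    # A universal variable needs both branches, an existential one needs either.
    return all(branches) if quantifier == "A" else any(branches)


# Matrix (x1 or x2) and (not x1 or not x2), i.e. x2 must differ from x1.
matrix = [[1, 2], [-1, -2]]
forall_exists = eval_qbf([("A", 1), ("E", 2)], matrix)
exists_forall = eval_qbf([("E", 2), ("A", 1)], matrix)
```

The two calls differ only in quantifier order: choosing x2 after seeing x1 makes the formula valid, while fixing x2 first does not, which is why the prefix matters for complexity.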

2605.12069 2026-05-13 cs.CV cs.AI cs.LG Updated

Anomaly-Aware Vision-Language Adapters for Zero-Shot Anomaly Detection

Muhammad Aqeel, Maham Nazir, Uzair Khan, Marco Cristani, Francesco Setti

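Institutions * Dept. of Engineering for Innovation Medicine, University of Verona, Italy; School of Computer Science and Engineering, Beihang University, China; Dept. of Computer Science, Reykjavik University, Iceland

AI summary This paper studies zero-shot anomaly detection, which requires no training on the target categories. Observing that existing methods make little use of the asymmetry between normal and anomalous data distributions, it proposes AVA-DINO, an anomaly-aware vision-language adaptation framework. The method uses two specialized branches for normal and anomalous patterns, with a text-guided routing mechanism and explicit routing regularization that drive branch specialization during training; at test time, the branches are combined dynamically using only the input image and predefined language descriptions, giving asymmetric activation. Experiments show state-of-the-art performance on multiple industrial and medical benchmarks and good cross-domain generalization.

Comments Accepted to ICIP 2026

Details
English abstract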
Zero-shot anomaly detection aims to identify defects in unseen categories without target-specific training. Existing methods usually apply the same feature transformation to all samples, treating normal and anomalous data uniformly despite their fundamentally asymmetric distributions, compact normals versus diverse anomalies. We instead exploit this natural asymmetry by proposing AVA-DINO, an anomaly-aware vision-language adaptation framework with dual specialized branches for normal and anomalous patterns that adapt frozen DINOv3 visual features. During training on auxiliary data, the two branches are learned jointly with a text-guided routing mechanism and explicit routing regularization that encourages branch specialization. At test time, only the input image and fixed, predefined language descriptions are used to dynamically combine the two branches, enabling an asymmetric activation. This design prevents degenerate uniform routing and allows context-specific feature transformations. Experiments across nine industrial and medical benchmarks demonstrate state-of-the-art performance, achieving 93.5% image-AUROC on MVTec-AD and strong cross-domain generalization to medical imaging without domain-specific fine-tuning. https://github.com/aqeeelmirza/AVA-DINO

2605.12061 2026-05-13 cs.AI Updated

SAGE: A Self-Evolving Agentic Graph-Memory Engine for Structure-Aware Associative Memory

Juntong Wang, Haoyue Zhao, Guanghui Pan, Xiyuan Wang, Yanbo Wang, Qiyan Deng, Muhan Zhang

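Institutions * Institute for Artificial Intelligence, Peking University; School of Intelligence Science and Technology, Peking University; School of Computer Science and Technology, Beijing Institute of Technology

AI summary This paper proposes SAGE, a self-evolving agentic graph-memory engine aimed at the long-term memory bottleneck of language agents. SAGE models graph memory as a dynamic long-term memory substrate, pairing a memory writer that builds structured graph memory with a memory reader based on a graph foundation model; the memory structure is refined incrementally from interaction histories and evolves through reader-to-writer feedback. Experiments show that SAGE improves evidence recovery, answer grounding, and retrieval efficiency on multi-hop QA, open-domain retrieval, and long-term memory benchmarks, supporting its use as a basis for robust long-horizon language agents.

Details
English abstract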
Long-term memory is becoming a central bottleneck for language agents. Existing RAG and GraphRAG systems largely treat memory graphs as static retrieval middleware, which limits their ability to recover complete evidence chains from partial cues, exploit reusable graph-structural roles, and improve the memory itself through downstream feedback. We introduce SAGE, a Self-evolving Agentic Graph-memory Engine that models graph memory as a dynamic long-term memory substrate. SAGE couples two roles: a memory writer that incrementally constructs structured graph memory from interaction histories, and a Graph Foundation Model-based memory reader that performs retrieval and provides feedback to the memory writer. We provide rigorous theoretical analyses supporting the framework. Across multi-hop QA, open-domain retrieval, domain-specific review QA, and long-term agent-memory benchmarks, SAGE improves evidence recovery, answer grounding, and retrieval efficiency: after two self-evolution rounds, it achieves the best average rank on multi-hop QA; in zero-shot open-domain transfer, it reaches 82.5/91.6 Recall@2/5 on NQ. Further results on LongMemEval and HaluMem show that training and reader-writer feedback improve multiple long-term memory and hallucination-diagnostic metrics, suggesting that self-evolving, structure-aware graph memory is a promising foundation for robust long-horizon language agents.

2605.12056 2026-05-13 cs.AI Updated

OmniRefine: Alignment-Aware Cooperative Compression for Efficient Omnimodal Large Language Models

Yuchen Deng, Zidang Cai, Hai-Tao Zheng, Jie Wang, Feidiao Yang, Yuxing Han

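Institutions * Tsinghua Shenzhen International Graduate School, Tsinghua University; Pengcheng Laboratory

AI summary OmniRefine is a training-free two-stage compression framework for efficient omnimodal large language models, targeting the high inference cost of long video streams and dense audio sequences. Through cross-modally aligned chunk refinement and modality-aware cooperative compression, it preserves critical information while reducing redundancy, improving inference efficiency while maintaining model performance. Experiments show that OmniRefine achieves a better efficiency-performance trade-off than existing methods on multiple tasks and remains stable at lower compression ratios.

Details
English abstract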
Omnimodal large language models (Omni-LLMs) show strong capability in audio-video understanding, but their practical deployment remains limited by high inference cost of long video streams and dense audio sequences. Despite recent progress, existing compression methods for Omni-LLMs typically rely on fixed or native compression units, which can disrupt cross-modal correspondence and the complementary information required for audio-video reasoning, making it difficult to improve inference efficiency while stably preserving performance. To address this, we propose OmniRefine, a training-free two-stage framework for efficient audio-visual token compression in Omni-LLMs. First, Correspondence-Preserving Chunk Refinement refines native chunk boundaries into cross-modally aligned compression units through frame-audio similarity and dynamic programming. Second, Modality-Aware Cooperative Compression jointly compresses video and audio tokens within each refined unit to reduce redundancy while preserving critical evidence. Extensive experiments show that OmniRefine achieves a better efficiency-performance trade-off than strong baselines and maintains stable performance under lower compression ratios. On WorldSense, it still reaches 46.7% accuracy at a 44% token retention ratio, nearly matching the full-token baseline. The code and interface will be released to facilitate further research.

2605.12049 2026-05-13 cs.LG cs.AI cs.IT cs.NE math.IT Updated

Scaling Laws and Tradeoffs in Recurrent Networks of Expressive Neurons

Aaron Spieler, Georg Martius, Anna Levina

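Institutions * University of Tübingen, Germany; Max Planck Institute for Biological Cybernetics, Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany

AI summary This paper asks how to optimally allocate a fixed parameter budget among the number of units in a neural network, the complexity of each unit, and its connectivity. It introduces a recurrent architecture built from Expressive Leaky Memory (ELM) neurons that allows network width, per-unit complexity, and connectivity to be adjusted independently and that trains stably across scales. Experiments show a non-trivial optimum in this tradeoff under a fixed budget, with larger budgets favoring both more and more complex neurons; an information-theoretic model explains the mechanism behind the tradeoff.

Comments 25 pages, 21 figures, 3 tables, including derivations. Submitted for peer review

Details
English abstract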
Cortical neurons are complex, multi-timescale processors wired into recurrent circuits, shaped by long evolutionary pressure under stringent biological constraints. Mainstream machine learning, by contrast, predominantly builds models from extremely simple units, a default inherited from early neural-network theory. We treat this as a normative architectural question. How should one split a fixed parameter budget $P$ between the number of units $N$, per-unit effective complexity $k_e$, and per-unit connectivity $k_c$? What controls the optimal allocation? This calls for a model in which per-unit complexity can be tuned independently of width and connectivity. Accordingly, we introduce the ELM Network, whose recurrent layer is built from Expressive Leaky Memory (ELM) neurons, chosen to mirror functional components of cortical neurons. The architecture allows for individually adjusting $N$, $k_e$, and $k_c$ and trains stably across orders of magnitude in scale. We evaluate the model on two qualitatively different sequence benchmarks: the neuromorphic SHD-Adding task and Enwik8 character-level language modeling. Performance improves monotonically along each of the three axes individually. Under a fixed budget, a clear non-trivial optimum emerges in their tradeoff, and larger budgets favor both more and more complex neurons. A closed-form information-theoretic model captures these tradeoffs and attributes the diminishing returns at two ends to: per-neuron signal-to-noise saturation and across-neuron redundancy. A hyperparameter sweep spanning three orders of magnitude in trainable parameters traces a near-Pareto-frontier scaling law consistent with the framework. This suggests that the simple-unit default in ML is not obviously optimal once this tradeoff surface is probed, and offers a normative lens on cortex's reliance on complex spatio-temporal integrators.

2605.12046 2026-05-13 quant-ph cs.AI cs.LG Updated

Rethink the Role of Neural Decoders in Quantum Error Correction

Ge Yan, Shanchuan Li, Yuxuan Du

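Institutions * College of Computing Data Science, Nanyang Technological University, Singapore 639798, Singapore; Department of Electrical Engineering Computer Science, Tokyo University of Agriculture & Technology, Koganei, Tokyo, 184-8588, Japan; School of Physical Mathematical Sciences, Nanyang Technological University, Singapore 639798, Singapore

AI summary This paper revisits the role of neural decoders in quantum error correction. For surface-code decoding under explicit accuracy and latency constraints, it unifies and redesigns several neural decoder architectures and develops an end-to-end compression pipeline to evaluate their deployment on FPGA hardware. It finds that near-term decoding performance depends more on data scale than on architectural complexity, that appropriate inductive bias is essential for high accuracy, and that INT4 quantization is a prerequisite for microsecond-scale latency, giving concrete guidance for scalable, real-time neural QEC decoding.

Comments Accepted to ICML 2026; 33 Pages, 9 figures

Details
English abstract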
Quantum error correction (QEC) is essential for enabling quantum advantages, with decoding as a central algorithmic primitive. Owing to its importance and intrinsic difficulty, substantial effort has been devoted to QEC decoder design, among which neural decoders have recently emerged as a promising data-driven paradigm. Despite this progress, practical deployment remains hindered by a fundamental accuracy-latency tradeoff, often on the microsecond timescale. To address this challenge, here we revisit neural decoders for surface-code decoding under explicit accuracy-latency constraints, considering code distances up to d=9 (161 physical qubits). We unify and redesign representative neural decoders into five architectural paradigms and develop an end-to-end compression pipeline to evaluate their deployability and performance on FPGA hardware. Through systematic experiments, we reveal several previously underexplored insights: (i) near-term decoding performance is driven more by data scale than architectural complexity; (ii) appropriate inductive bias is essential for achieving high decoding accuracy; and (iii) INT4 quantization is a prerequisite for meeting microsecond-scale latency requirements on FPGAs. Together, these findings provide concrete guidance toward scalable and real-time neural QEC decoding.
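
To make finding (iii) concrete, here is a minimal sketch of INT4 weight quantization. The abstract does not describe the paper's compression pipeline, so the scheme below (symmetric, per-tensor, round-to-nearest onto the signed 4-bit range -8 to 7) is an assumption, and the function names are hypothetical.

```python
import numpy as np


def quantize_int4(weights: np.ndarray):
    """Map float weights to integers in [-8, 7] with one shared scale."""
    max_abs = float(np.max(np.abs(weights)))
    # Use 7 as the positive limit so the grid is symmetric around zero.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the 4-bit codes."""
    return q.astype(np.float32) * scale


weights = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
codes, scale = quantize_int4(weights)
max_error = float(np.max(np.abs(dequantize(codes, scale) - weights)))
```

The codes are stored in `int8` here only because NumPy has no 4-bit dtype; on hardware two codes would be packed per byte. The round-trip error is bounded by half the scale, which is the accuracy cost traded for lower latency.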

2605.12026 2026-05-13 cs.CV cs.AI eess.SP Updated

Spectral Vision Transformer for Efficient Tokenization with Limited Data

Alexandra G. Roberts, Maneesh John, Jinwei Zhang, Dominick Romano, Mert Sisman, Ki Sueng Choi, Heejong Kim, Mert R. Sabuncu, Thanh D. Nguyen, Alexey V. Dimov, Pascal Spincemaille, Brian H. Kopell, Yi Wang

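AI summary This paper proposes a new spectral vision transformer architecture for efficient tokenization when data are limited, with a focus on medical imaging. The choice of spectral basis brings theoretical advantages such as spatial invariance and optimal signal-to-noise ratio, and the spectral projection reduces model complexity. Experiments show comparable or better performance with fewer parameters than a range of mainstream models, across several types of datasets.

Details
English abstract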
We propose a novel spectral vision transformer architecture for efficient tokenization in limited data, with an emphasis on medical imaging. We outline convenient theoretical properties arising from the choice of basis including spatial invariance and optimal signal-to-noise ratio. We show reduced complexity arising from the spectral projection compared to spatial vision transformers. We show equivalent or superior performance with a reduced number of parameters as compared to a variety of models including compact and standard vision transformers, convolutional neural networks with attention, shifted window transformers, multi-layer perceptrons, and logistic regression. We include simulated, public, and clinical data in our analysis and release our code at github.com/agr78/spectralViT.

2605.12019 2026-05-13 cs.LG cs.AI Updated

Efficient and Adaptive Human Activity Recognition via LLM Backbones

Aleksandr Bredikhin, Philippe Lalanda, German Vega

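Institutions * Univ. Grenoble Alpes, France

AI summary This paper proposes an efficient and adaptive approach to human activity recognition (HAR) built on large language models (LLMs), addressing the computational cost and limited domain adaptability of conventional methods. By using a pretrained LLM as a generic temporal feature extractor and introducing a structured convolutional projection that maps sensor signals into the LLM's latent space, the approach sharply reduces the number of trainable parameters and the training cost while improving generalization. Experiments show strong results in low-data and few-shot settings, offering a scalable and efficient basis for HAR systems.

Details
English abstract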
Human Activity Recognition (HAR) is a core task in pervasive computing systems, where models must operate under strict computational constraints while remaining robust to heterogeneous and evolving deployment conditions. Recent advances based on Transformer architectures have significantly improved recognition performance, but typically rely on task-specific models trained from scratch, resulting in high training cost, large data requirements, and limited adaptability to domain shifts. In this paper, we propose a paradigm shift that reuses large pretrained language models (LLMs) as generic temporal backbones for sensor-based HAR, instead of designing domain-specific Transformers. To bridge the modality gap between inertial time series and language models, we introduce a structured convolutional projection that maps multivariate accelerometer and gyroscope signals into the latent space of the LLM. The pretrained backbone is kept frozen and adapted using parameter-efficient Low-Rank Adaptation (LoRA), drastically reducing the number of trainable parameters and the overall training cost. Through extensive experiments on standard HAR benchmarks, we show that this approach enables rapid convergence, strong data efficiency, and robust cross-dataset transfer, particularly in low-data and few-shot settings. At the same time, our results highlight the complementary roles of convolutional frontends and LLMs, where local invariances are handled at the signal level while long-range temporal dependencies are captured by the pretrained backbone. Overall, this work demonstrates that LLMs can serve as a practical, frugal, and scalable foundation for adaptive HAR systems, opening new directions for reusing foundation models beyond their original language domain.
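
The parameter saving from Low-Rank Adaptation can be seen in a small NumPy sketch of a single adapted linear layer. It follows the standard LoRA formulation (frozen weight plus a scaled low-rank update, with the up-projection initialized to zero); the layer sizes, `rank`, and `alpha` values are illustrative assumptions, and the class is hypothetical rather than the authors' code.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer with a trainable low-rank update: W + (alpha / r) * B @ A."""

    def __init__(self, weight: np.ndarray, rank: int, alpha: float = 16.0, seed: int = 0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                   # frozen pretrained weight
        self.a = rng.normal(0.0, 0.01, size=(rank, in_dim))    # trainable down-projection
        self.b = np.zeros((out_dim, rank))                     # trainable up-projection, zero at start
        self.scaling = alpha / rank

    def forward(self, x: np.ndarray) -> np.ndarray:
        base = x @ self.weight.T
        update = (x @ self.a.T) @ self.b.T
        return base + self.scaling * update

    def trainable_params(self) -> int:
        return self.a.size + self.b.size


rng = np.random.default_rng(1)
frozen = rng.normal(size=(64, 32))       # assumed layer shape: 32 inputs, 64 outputs
layer = LoRALinear(frozen, rank=4)
x = rng.normal(size=(5, 32))
y = layer.forward(x)
```

With these sizes the adapter trains 384 values instead of the 2,048 in the full matrix, and because `b` starts at zero the adapted layer initially reproduces the frozen layer's output exactly.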

2605.12016 2026-05-13 cs.AI Updated

LLMs and the ZPD

Peter Wallis

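Institutions * Centre for Policy Modelling

AI summary This paper explores the relationship between large language models (LLMs) and Vygotsky's Zone of Proximal Development (ZPD), proposing that LLMs do not think with distributed representations but instead perform a practice-based "primitive thinking". It argues that LLM behavior is closer to dreaming than to hallucination, and that interaction is core to human communication rather than an add-on to understanding, offering a new perspective on LLM cognition.

Comments Short paper submitted to Interspeech 2026 (Desk Reject) 4 pages, plus references. 2 figures

Details
English abstract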
One hundred years ago Vygotsky and his circle were exploring the nature of consciousness and defining what would become psychology in the Soviet Union. They concluded that children develop "scientific thinking" through interacting with enculturated adults in Zones of Proximal Development or ZPDs. The proposal is that, contrary to the claims of some, the LLM mechanism is not doing thinking with "distributed representations," but rather the completion model is doing "primitive thinking" in terms of *practices*. Viewed from this perspective, it would seem our large language models don't hallucinate, but rather dream, and that what is needed is not "guard rails" but an investigation of the set of cognitive tools that enable us to do things that look like common-sense. The proposal here is that *interaction* is core to human communication rather than just an add-on to "real" understanding.

2605.12013 2026-05-13 cs.CV cs.AI Version update

L2P: Unlocking Latent Potential for Pixel Generation

Zhennan Chen, Junwei Zhu, Xu Chen, Jiangning Zhang, Jiawei Chen, Zhuoqi Zeng, Wei Zhang, Chengjie Wang, Jian Yang, Ying Tai

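Affiliations * Nanjing University; Tencent Youtu Lab; Hainan-biuh; Weess Gmbh

AI summary This paper proposes L2P, an efficient pixel-generation framework that addresses the prohibitive compute and data cost of training high-fidelity pixel-space models from scratch. L2P directly reuses the knowledge of a pre-trained latent diffusion model (LDM): it replaces the VAE with large-patch tokenization, freezes the LDM's intermediate layers, and trains only the shallow layers to learn the latent-to-pixel mapping. The method uses only LDM-generated synthetic images as training data, so no real data needs to be collected, and it converges quickly using only 8 GPUs. Removing the VAE also enables native 4K ultra-high-resolution generation. Experiments show performance close to that of the source model across benchmarks.

Comments project page: https://nju-pcalab.github.io/projects/L2P/

Details
English abstract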

Pixel diffusion models have recently regained attention for visual generation. However, training advanced pixel-space models from scratch demands prohibitive computational and data resources. To address this, we propose the Latent-to-Pixel (L2P) transfer paradigm, an efficient framework that directly harnesses the rich knowledge of pre-trained LDMs to build powerful pixel-space models. Specifically, L2P discards the VAE in favor of large-patch tokenization and freezes the source LDM's intermediate layers, exclusively training shallow layers to learn the latent-to-pixel transformation. By utilizing LDM-generated synthetic images as the sole training corpus, L2P fits an already smooth data manifold, enabling rapid convergence with zero real-data collection. This strategy allows L2P to seamlessly migrate massive latent priors to the pixel space using only 8 GPUs. Furthermore, eliminating the VAE memory bottleneck unlocks native 4K ultra-high resolution generation. Extensive experiments across mainstream LDM architectures show that L2P incurs negligible training overhead, yet performs on par with the source LDM on DPG-Bench and reaches 93% performance on GenEval.

2605.12001 2026-05-13 cs.IT cs.AI math.IT Version update

CR^2: Cost-Aware Risk-Controlled Routing for Wireless Device-Edge LLM Inference

Nan Xue, Shengkang Chen, Zhiyong Chen, Jiangchao Yao, Yaping Sun, Zixia Hu, Meixia Tao

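Affiliations * Cooperative Medianet Innovation Center, Shanghai Jiao Tong University; Department of Broadband Communication, Pengcheng Laboratory; Future Network of Intelligent Institute (FNii), the Chinese University of Hong Kong (Shenzhen); Meta

AI summary As large language models (LLMs) move from centralized cloud platforms to mobile edge environments, balancing latency, energy consumption, and accuracy under limited device-edge resources becomes a key problem. This paper proposes CR^2, a cost-aware, risk-controlled routing framework for wireless device-edge settings that decouples a lightweight on-device margin gate from an edge-side utility selector for deferred queries. CR^2 introduces a conformal risk control calibration procedure that explicitly controls decision risk using only device-side information, and in experiments it achieves a better accuracy-cost Pareto frontier than existing methods.

Comments submitted to IEEE Journal

Details
English abstract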

As large language models (LLMs) move from centralized clouds to mobile edge environments, efficient serving must balance latency, energy consumption, and accuracy under constrained device-edge resources. Query-level routing between lightweight on-device models and stronger edge models provides a flexible mechanism to navigate this trade-off. However, existing routers are designed for centralized cloud settings and optimize token-level costs, failing to capture the dynamic latency and energy overheads in wireless edge deployments. In this paper, we formulate mobile edge LLM routing as a deployment-constrained, cost-aware decision problem, and propose CR^2, a two-stage device-edge routing framework. CR^2 decouples a lightweight on-device margin gate from an edge-side utility selector for deferred queries. The margin gate operates on frozen query embeddings and a user-specified cost weight to predict whether local execution is utility-optimal relative to the best edge alternative under the target operating point. We further introduce a conformal risk control (CRC) calibration procedure that maps each operating point to an acceptance threshold, enabling explicit control of the marginal false-acceptance risk under the full-information utility reference. Experiments on the routing task show that CR^2 closely matches a full-information reference router using only device-side signals before deferral. Compared with strong query-level baselines, CR^2 consistently improves the deployable accuracy-cost Pareto frontier and reduces normalized deployment cost by up to 16.9% at matched accuracy.
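
The calibration step can be sketched with the generic conformal risk control recipe: pick the smallest acceptance threshold whose adjusted empirical false-acceptance rate on a calibration set stays below a target level alpha. The loss definition (a query counts as a false acceptance when the gate keeps it on-device although local execution was not utility-optimal), the function name, and the toy data are assumptions for illustration; the paper's exact procedure may differ.

```python
from typing import List


def crc_threshold(scores: List[float], local_is_optimal: List[bool], alpha: float) -> float:
    """Smallest threshold tau such that (n * R_hat(tau) + B) / (n + 1) <= alpha, with B = 1.

    A query is accepted for local execution when its gate score is >= tau.
    The loss is 1 for an accepted query whose local execution was not optimal,
    so the empirical risk is non-increasing in tau.
    """
    n = len(scores)
    candidates = sorted(set(scores)) + [float("inf")]
    for tau in candidates:
        false_accepts = sum(
            1 for s, ok in zip(scores, local_is_optimal) if s >= tau and not ok
        )
        if (false_accepts + 1) / (n + 1) <= alpha:
            return tau
    # No threshold meets the target: defer every query to the edge.
    return float("inf")


# Toy calibration set (hypothetical gate scores and utility labels).
cal_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
cal_labels = [True, True, True, False, True, False, False, True, False]
tau_hat = crc_threshold(cal_scores, cal_labels, alpha=0.25)
print(tau_hat)
```

In a setup like the one the abstract describes, this calibration would be repeated per operating point (for example, per cost weight) to obtain one threshold for each.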

2605.11999 2026-05-13 cs.DC cs.AI cs.LG cs.PF Version update

The Illusion of Power Capping in LLM Decode: A Phase-Aware Energy Characterisation Across Attention Architectures

Bole Ma, Ayesha Afzal, Jan Eitzinger, Gerhard Wellein

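Affiliations * Erlangen National High Performance Computing Center; Friedrich-Alexander-Universität Erlangen-Nürnberg

AI summary This paper studies how effective power capping actually is during large language model inference and finds that it has little effect in the autoregressive decode phase that dominates serving. Through an energy characterisation across several attention architectures, the authors show that decode is limited by memory bandwidth rather than compute, so the power cap never triggers. They propose SM clock locking in place of power capping, which improves decode energy efficiency at minimal throughput loss, and they identify architecture-dependent dynamic voltage and frequency scaling (DVFS) behaviour patterns.

Details
English abstract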

Power capping is the standard GPU energy lever in LLM serving, and it appears to work: throughput drops, power readings fall, and energy budgets are met. We show the appearance is illusory for the phase that dominates production serving: autoregressive decode. Across four attention paradigms -- GQA, MLA, Gated DeltaNet, and Mamba2 -- on NVIDIA H200, decode draws only 137-300 W on a 700 W GPU; no cap ever triggers, because memory-bound decode saturates HBM bandwidth rather than compute and leaves power headroom untouched. Firmware-initiated clock throttling compounds the illusion: these deviations can corrupt any throughput measurement that attributes them to the cap. SM clock locking dissolves both confounds. By targeting the lever that is actually on the critical path, clock locking Pareto-dominates power capping universally, recovering up to 32% of decode energy at minimal throughput loss. We identify three architecture-dependent DVFS behavioural classes and characterise a common energy pattern across novel attention replacements: a heavy prefill cost recouped by efficient decode, eventually halving total request energy relative to GQA at production batch sizes.
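
The argument can be made concrete with a small energy-per-token calculation. This is a deliberately simplified model, assuming a power cap only matters when the workload would otherwise exceed it; the throughput figures and the clock-locked operating point are hypothetical, and only the 700 W limit and the 300 W upper decode draw come from the abstract.

```python
def energy_per_token(avg_power_w: float, tokens_per_s: float) -> float:
    """Joules per generated token: average power (W) divided by throughput (tokens/s)."""
    return avg_power_w / tokens_per_s


def effective_power(draw_w: float, cap_w: float) -> float:
    """Simplified model: a cap binds only if the uncapped draw exceeds it."""
    return min(draw_w, cap_w)


# Decode drawing 300 W on a 700 W GPU; throughput of 1000 tokens/s is hypothetical.
baseline = energy_per_token(effective_power(300.0, 700.0), 1000.0)

# A 400 W cap sits above the actual draw, so nothing changes.
capped = energy_per_token(effective_power(300.0, 400.0), 1000.0)

# Hypothetical clock-locked point: power falls to 210 W, throughput drops by 3%.
locked = energy_per_token(210.0, 970.0)
saving = 1.0 - locked / baseline

print(f"baseline {baseline:.3f} J/token, capped {capped:.3f} J/token, "
      f"clock-locked {locked:.3f} J/token ({saving:.1%} saved)")
```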

2605.11996 2026-05-13 cs.AI Version update

BadSKP: Backdoor Attacks on Knowledge Graph-Enhanced LLMs with Soft Prompts

Xiaoting Lyu, Yufei Han, Hangwei Qian, Haoyuan Yu, Xiang Ao, Bin Wang, Chenxu Wang, Xiaobo Ma, Wei Wang

Affiliations * Ministry of Education Key Lab for Intelligent Networks and Network Security; Xi’an Jiaotong University; INRIA; CFAR, A*STAR; Beijing Key Laboratory of Security and Privacy in Intelligent Transportation; Beijing Jiaotong University; Institute of Computing Technology, Chinese Academy of Sciences; School of Cyber Engineering, Xi’an University of Electronic Science and Technology; Ministry of Education Key Lab for Intelligent Networks and Network Security at Xi’an Jiaotong University

AI summary This paper studies backdoor attacks on knowledge graph-enhanced large language models (KG-LLMs), focusing on newer architectures that encode knowledge graphs into soft prompts via graph neural networks. This architecture introduces a graph-conditioned channel that makes existing text-channel backdoor attacks far less effective. The authors therefore propose BadSKP, which uses a multi-stage optimization strategy to manipulate graph representations and steer the soft prompt toward adversarial semantics. Experiments show that BadSKP attacks the target models effectively across settings, while conventional text-only attacks have limited effect.

Details
English abstract

Recent knowledge graph (KG)-enhanced large language models (LLMs) move beyond purely textual knowledge augmentation by encoding retrieved subgraphs into continuous soft prompts via graph neural networks, introducing a graph-conditioned channel that operates alongside the standard text interface. However, existing backdoor attacks are largely designed for the textual channel, and their effectiveness against this dual-channel architecture remains unclear. We show that this architecture creates a robustness gap: text-channel backdoor attacks that readily compromise textual KG prompting systems become largely ineffective against soft-prompt-based counterparts. We interpret this gap through semantic anchoring, whereby graph-derived soft prompts bias the generation-driving hidden state toward query-consistent semantics and suppress surface-level malicious instructions. Because this anchoring effect is itself induced by the graph channel, an attacker who manipulates graph-level representations can in turn redirect it toward adversarial semantics. To demonstrate this risk, we propose BadSKP, a backdoor attack that targets the graph-to-prompt interface through a multi-stage optimization strategy: it constructs adversarial target embeddings, optimizes poisoned node embeddings to steer the induced soft prompt, and approximates the optimized representations with fluent adversarial node attributes. Experiments on two soft-prompt KG-enhanced LLMs across four datasets show that BadSKP achieves high attack success under both frozen and trojaned settings, while text-only attacks remain unreliable even under perplexity-based defenses.

2605.11987 2026-05-13 cs.AI cs.LG stat.AP stat.ML Version update

Random-Set Graph Neural Networks

Tommy Woodley, Shireen Kudukkil Manchingal, Matteo Tolloso, Davide Bacciu, Fabio Cuzzolin

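Affiliations * School of Engineering, Computing and Mathematics; Oxford Brookes University; Department of Computer Science; University of Pisa; Oxford Brookes Institute for Artificial Intelligence, Data Analysis and Systems (AIDAS)

AI summary This paper proposes Random-Set Graph Neural Networks (RS-GNN), a new graph neural network framework for quantifying node-level uncertainty more accurately. The method models a node's epistemic uncertainty in the belief-function formalism and outputs both a precise probability prediction and a measure of uncertainty. Experiments show that RS-GNN delivers superior uncertainty quantification on several real-world graph learning datasets.

Comments 23 pages, 6 figures

Details
English abstract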

Uncertainty quantification has become an important factor in understanding the data representations produced by Graph Neural Networks (GNNs). Although their predictive capabilities are useful across industrial settings, the uncertainty inherent in the data substantially limits GNN performance. While aleatoric uncertainty is the result of noisy and incomplete stochastic data such as missing edges or over-smoothing, epistemic uncertainty arises from lack of knowledge about a system or model (e.g., a graph's topology or node feature representation), which can be reduced by gathering more data and information. In this paper, we propose a new framework in which node-level epistemic uncertainty is modelled in a belief function (finite random set) formalism. The resulting Random-Set Graph Neural Networks have a belief-function head predicting a random set over the list of classes, from which both a precise probability prediction and a measure of epistemic uncertainty can be obtained. Extensive experiments on 9 different graph learning datasets, including real-world autonomous driving benchmarks such as nuScenes and ROAD, demonstrate RS-GNN's superior uncertainty quantification capabilities.
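
To show how a random-set prediction yields both outputs, here is a minimal sketch using standard belief-function definitions: belief and plausibility of a class set, and the pignistic transform that turns a mass assignment into a single probability vector. The class names and mass values are invented, and using the plausibility-belief gap as the epistemic measure is an assumption; the abstract does not say which measure RS-GNN uses.

```python
from typing import Dict, FrozenSet, Iterable

Mass = Dict[FrozenSet[str], float]


def belief(mass: Mass, subset: FrozenSet[str]) -> float:
    """Total mass committed to sets fully contained in `subset`."""
    return sum(m for focal, m in mass.items() if focal <= subset)


def plausibility(mass: Mass, subset: FrozenSet[str]) -> float:
    """Total mass of sets that intersect `subset`."""
    return sum(m for focal, m in mass.items() if focal & subset)


def pignistic(mass: Mass, classes: Iterable[str]) -> Dict[str, float]:
    """Split each focal set's mass evenly among its members."""
    return {
        c: sum(m / len(focal) for focal, m in mass.items() if c in focal)
        for c in classes
    }


# Hypothetical output of a belief-function head for one node (masses sum to 1).
classes = ["car", "bus", "bike"]
mass: Mass = {
    frozenset({"car"}): 0.5,
    frozenset({"bus"}): 0.1,
    frozenset({"car", "bus"}): 0.3,
    frozenset(classes): 0.1,
}

betp = pignistic(mass, classes)
car = frozenset({"car"})
# Width of the [belief, plausibility] interval as a simple epistemic indicator.
car_gap = plausibility(mass, car) - belief(mass, car)
print(betp, car_gap)
```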

2605.11986 2026-05-13 cs.AI Version update

On the Limitations of Large Language Models for Conceptual Database Modeling

Arthur F. Siqueira, Carlos D. S. Nogueira, Eduarda Farias, Claudio E. C. Campelo, Júlia Menezes

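Affiliations * Systems and Computing Department

AI summary This paper analyzes the use of large language models (LLMs) to support conceptual modeling of relational databases, in particular their ability to generate Entity-Relationship (ER) diagrams automatically from natural language requirements. The study combines different language models with prompt engineering techniques and evaluates how consistently they identify entities, relationships, and attributes. The results show that LLMs perform reasonably in simple scenarios, but their reliability drops as requirement complexity grows, with more inconsistencies, ambiguities, and failures to represent constraints. This indicates that current LLMs are not yet mature enough for complex scenarios and that the cost of validation may offset the apparent efficiency gains.

Details
English abstract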

This article analyzes the use of Large Language Models (LLMs) as support for the conceptual modeling of relational databases through the automatic generation of Entity-Relationship (ER) diagrams from natural language requirements. The approach combines different language models with prompt engineering techniques to evaluate their ability to identify entities, relationships, and attributes in a conceptually consistent manner. The experimental evaluation involved three LLMs, each subjected to three prompting techniques (Zero-Shot, Chain of Thought, and Chain of Thought + Verifier), applied to the same requirements scenario with progressively increasing complexity. The generated diagrams were qualitatively analyzed through direct comparison with the textual requirements, considering the structural and semantic adherence of the modeled elements. The results indicate that, although LLMs show reasonable performance in less complex scenarios, their reliability decreases as the complexity of the requirements increases, with a rise in inconsistencies, ambiguities, and failures in representing constraints. These findings reinforce that, in their current state, LLMs are not sufficiently mature for reliable use in complex scenarios, and the cost of validation may offset the apparent productivity gains.

2605.11981 2026-05-13 physics.flu-dyn cs.AI Version update

High-lift Wing Separation Control via Bayesian Optimization and Deep Reinforcement Learning

Ricard Montalà, Bernat Font, Oriol Lehmkuhl, Ricardo Vinuesa, Ivette Rodriguez

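Affiliations * TUAREG, Universitat Politècnica de Catalunya (UPC); Mechanical Engineering, Delft University of Technology (TU Delft); CASE, Barcelona Supercomputing Center (BSC)

AI summary Using wall-resolved large-eddy simulations, this study examines active flow control of the 30P30N high-lift wing via synthetic jets at Reynolds number $Re_c = 450,000$ and angle of attack $\alpha = 23^\circ$. It compares two optimization strategies, open-loop Bayesian optimization and closed-loop deep reinforcement learning. Bayesian optimization improves aerodynamic efficiency effectively, whereas deep reinforcement learning achieves only limited improvement because of constraints imposed by the reward design. The work offers optimization directions and practical lessons for DRL-based flow control at high Reynolds numbers.

Details
English abstract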

This study investigates active flow control (AFC) of a 30P30N high-lift wing at a Reynolds number $Re_c = 450,000$ and angle of attack $\alpha = 23^\circ$ using wall-resolved large-eddy simulations (LES). Two optimization strategies are explored: open-loop Bayesian optimization (BO) and closed-loop deep reinforcement learning (DRL), both targeting the mitigation of stall and the improvement of aerodynamic efficiency via synthetic jets on the slat, main, and flap elements. The uncontrolled configuration was validated against literature data, confirming the reliability of the LES setup. The BO framework successfully identified steady jet velocities that increased efficiency by 10.9% through a 9.7% drag reduction while maintaining lift. In contrast, the DRL agent, despite leveraging instantaneous flow information from distributed sensors, achieved only minor improvements in lift and drag, with negligible efficiency gain. Training analysis indicated that the penalty-dominated reward constrained exploration. These results highlight the need for carefully designed rewards and computational acceleration strategies in DRL-based flow control at high Reynolds numbers.
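
The reported BO figures can be sanity-checked with a short calculation, assuming "efficiency" means the lift-to-drag ratio (a common convention that the abstract does not state explicitly). Under that assumption a 9.7% drag reduction at constant lift gives roughly a 10.7% gain, so the reported 10.9% would imply a lift change of only about +0.14%, in line with "maintaining lift".

```python
def efficiency_change(lift_change: float, drag_change: float) -> float:
    """Relative change in L/D given relative changes in lift and drag."""
    return (1.0 + lift_change) / (1.0 + drag_change) - 1.0


# Drag reduced by 9.7% with lift held exactly constant.
drag_only_gain = efficiency_change(0.0, -0.097)

# Lift change implied by the reported 10.9% efficiency gain (assuming efficiency = L/D).
implied_lift_change = (1.0 + 0.109) * (1.0 - 0.097) - 1.0

print(f"gain from drag alone: {drag_only_gain:.2%}")
print(f"implied lift change:  {implied_lift_change:.2%}")
```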

2605.11972 2026-05-13 cs.RO cs.AI cs.ET cs.SY eess.SY Version update

Cooperative Robotics Reinforced by Collective Perception for Traffic Moderation

Mohammad Khoshkdahan, John Pravin Arockiasamy, Andy Flores Comeca, Alexey Vinel

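Affiliations * Karlsruhe Institute of Technology, Karlsruhe, Germany; Halmstad University, Halmstad, Sweden

AI summary This study addresses collisions at non-line-of-sight (NLOS) intersections and proposes a traffic moderation system that combines collective perception with a cooperative robot. The system fuses perception from dual cameras and V2X, monitors the road in real time, and has the robot issue a STOP gesture when a potential collision risk is detected, preventing unsafe merges. Experiments show that the approach improves traffic safety under NLOS conditions and fills the perception and intervention gap that existing V2X technology leaves for unconnected vehicles.

Comments Accepted for publication in the Proceedings of the 2026 IEEE Vehicular Technology Conference (VTC2026-Spring)

Details
English abstract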

Collisions at non-line-of-sight (NLOS) intersections remain a major safety concern because drivers have limited visibility of approaching traffic. V2X-based warnings can reduce these risks, yet many vehicles are not equipped with V2X and drivers may ignore in-vehicle alerts. Collective perception (CP) can compensate for low V2X penetration by extending the awareness of connected vehicles, but it cannot influence unconnected vehicles. To fill this gap, our work introduces a complementary concept that adds a cooperative humanoid robot as an active traffic moderator capable of physically stopping a vehicle that attempts to merge into an unseen traffic stream. The system operates on two parallel perception pathways. A dual-camera infrastructure unit detects the position, speed and motion of approaching vehicles and transmits this information to the robot as a collective perception message (CPM). The robot also receives cooperative awareness messages (CAM) from connected vehicles through its onboard V2X unit and can act as a relay for decentralized environmental notification messages (DENM) when safety events originate elsewhere along the road. A fusion module combines these streams to maintain a robust real-time view of the main road. A Zone of Danger (ZoD) is defined and used to predict whether an approaching vehicle creates a collision risk for a merging road user. When such a risk is detected, the robot issues a human-like STOP gesture and blocks the merging path until the hazard disappears. The full system was deployed at the Future Mobility Park (FMP) in Rotterdam. Experiments show that the combined vision and V2X perception allows the robot to detect approaching vehicles early, predict hazards reliably and prevent unsafe merges in real-world NLOS conditions.

2605.11936 2026-05-13 cs.AI Version update

From Noise to Diversity: Random Embedding Injection in LLM Reasoning

Heejun Kim, Seungpil Lee, Jewon Yeom, Jaewon Sok, Seonghyeon Park, Jeongjae Park, Taesup Kim, Sundong Kim

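Affiliations * Gwangju Institute of Science and Technology; Seoul National University; Microsoft Research

AI summary This study examines Random Soft Prompts (RSPs) in LLM reasoning, aiming to separate how much of the soft-prompt effect comes from the trained content and how much from the act of injection itself. By appending randomly generated embedding vectors to the input, RSPs reach performance comparable to optimized soft prompts on math reasoning tasks without any training. The study shows that RSPs increase the diversity of early generated tokens and, combined with temperature sampling, raise the multi-attempt success rate. It also extends the mechanism to the training stage, demonstrating its applicability to both inference and training.

Comments 30 pages, 5 figures, 6 tables. Under review

Details
English abstract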

Recent soft prompt research has tried to improve reasoning by inserting trained vectors into LLM inputs, yet whether the gain comes from the learned content or from the act of injection itself has not been carefully separated. We study Random Soft Prompts (RSPs), which drop the training step entirely and append a freshly drawn sequence of random embedding vectors to the input. Each RSP vector is sampled from an isotropic Gaussian fitted to the entrywise mean and variance of the pretrained embedding table; the sequence carries no learned content, and yet reaches accuracy comparable to optimized soft prompts on math reasoning benchmarks in several settings. The mechanism unfolds in two stages: because attention has to absorb a never-seen-before random position, the distribution over the first few generated tokens flattens and reasoning trajectories branch, and as generation continues this influence dilutes naturally so the response commits to a single completion. We show that during inference RSPs lift early-stage token diversity and, combined with temperature sampling, widen Pass@N, the probability that at least one out of N attempts is correct. Beyond inference, we carry the same effect into DAPO training and demonstrate practical gains. Our contributions are: (i) RSP isolates the simplest form of soft prompt -- training-free, freshly resampled -- providing a unified lens for the structural effect of injection that variants otherwise differing in training and form all share; (ii) a theoretical and empirical validation of the underlying mechanism; and (iii) an extension from inference to training.
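
A minimal sketch of the sampling step and of Pass@N follows. It assumes "entrywise mean and variance" means one scalar mean and one scalar variance pooled over every entry of the embedding table; the toy table, the helper names, and the use of the standard unbiased pass@k estimator are illustrative choices rather than the authors' implementation.

```python
import random
import statistics
from math import comb
from typing import List

Matrix = List[List[float]]


def sample_rsp(table: Matrix, num_vectors: int, rng: random.Random) -> Matrix:
    """Draw random soft-prompt vectors from an isotropic Gaussian fitted to the table."""
    entries = [x for row in table for x in row]
    mu = statistics.fmean(entries)      # pooled mean over all entries (assumption)
    sigma = statistics.pstdev(entries)  # pooled standard deviation (assumption)
    dim = len(table[0])
    return [[rng.gauss(mu, sigma) for _ in range(dim)] for _ in range(num_vectors)]


def append_rsp(input_embeds: Matrix, rsp: Matrix) -> Matrix:
    """Append the freshly drawn vectors after the input token embeddings."""
    return input_embeds + rsp


def pass_at_n(num_samples: int, num_correct: int, n: int) -> float:
    """Unbiased estimate that at least one of n attempts is correct."""
    return 1.0 - comb(num_samples - num_correct, n) / comb(num_samples, n)


rng = random.Random(0)
# Toy "pretrained" embedding table: 100 tokens, 8 dimensions.
toy_table = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(100)]
prompt_embeds = toy_table[:5]
extended = append_rsp(prompt_embeds, sample_rsp(toy_table, 4, rng))
print(len(extended), pass_at_n(10, 3, 5))
```

Resampling the vectors for every request, rather than caching one draw, matches the "freshly drawn" wording in the abstract.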

2605.11928 2026-05-13 cs.AI Version update

When Simulation Lies: A Sim-to-Real Benchmark and Domain-Randomized RL Recipe for Tool-Use Agents

Xiaolin Zhou, Aojie Yuan, Zheng Luo, Zipeng Ling, Xixiao Pan, Yicheng Gao, Haiyue Zhang, Jiate Li, Shuli Jiang, Prince Zizhuang Wang, Zixuan Zhu, Jinbo Liu, Ryan A. Rossi, Hua Wei, Xiyang Hu

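Affiliations * Arizona State University; University of Southern California; Carnegie Mellon University; University of Pennsylvania; Adobe Research

AI summary This study addresses the sim-to-real gap that tool-use language agents face in real deployments and introduces RobustBench-TC, a benchmark covering 22 perturbation types organized by the components of a partially observable Markov decision process (POMDP). It also proposes ToolRL-DR, a domain-randomization reinforcement learning method that trains on perturbation-augmented trajectories. The method substantially improves agent robustness under perturbation and still closes part of the gap on transition perturbations that were never seen during training.

Comments Dataset, code, and benchmark leaderboard are available at https://github.com/WillChow66/robustbench-tc-release.git and https://huggingface.co/spaces/willchow66/robustbench-tc-leaderboard

Details
English abstract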

Tool-use language agents are evaluated on benchmarks that assume clean inputs, unambiguous tool registries, and reliable APIs. Real deployments violate all these assumptions: user typos propagate into hallucinated tool names, a misconfigured request timeout can stall an agent indefinitely, and duplicate tool names across servers can freeze an SDK. We study these failures as a sim-to-real gap in the tool-use partially observable Markov decision process (POMDP), where deployment noise enters through the observation, action space, reward-relevant metadata, or transition dynamics. We introduce RobustBench-TC, a benchmark with 22 perturbation types organized by these four POMDP components, each grounded in a verified GitHub issue or documented tool-calling failure. Across 21 models from 1.5B to 32B parameters (including the closed-source o4-mini), the robustness profile is sharply uneven: observation perturbations reduce accuracy by less than 5%, while reward-relevant and transition perturbations reduce accuracy by roughly 40% and 30%, respectively; scale alone does not close these gaps. We then propose ToolRL-DR, a domain-randomization reinforcement learning (RL) recipe that trains a tool-use agent on perturbation-augmented trajectories spanning the three statically encodable POMDP components. On a 3B backbone, ToolRL-DR-Full retains roughly three-quarters of clean accuracy and reaches an aggregate perturbed accuracy comparable to open-source 14B function-calling baselines while substantially narrowing the gap to o4-mini. It closes approximately 27% of the Transition gap despite never seeing transition perturbations in training, suggesting that RL on adversarial static tool-use inputs induces a more persistent retry policy that transfers to unseen runtime failures. The dataset, code and benchmark leaderboard are publicly available.

2605.11920 2026-05-13 cs.AI Version update

Domain Restriction via Multi SAE Layer Transitions

Elias Shaheen, Avi Mendelson

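Affiliations * Technion -- Israel Institute of Technology, Haifa, Israel

AI summary This paper studies how to identify and restrict the domain of use of a large language model (LLM) by analysing transitions across multiple sparse autoencoder (SAE) layers during the model's internal processing. The authors propose lightweight methods based on the inter-layer dynamics of SAE representations that effectively distinguish out-of-domain (OOD) inputs, improving the model's controllability in domain-specific applications. Experiments show that the approach captures fine-grained input details well and validate it on multiple models.

Details
English abstract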

The general-purpose nature of Large Language Models (LLMs) presents a significant challenge for domain-specific applications, often leading to out-of-domain (OOD) interactions that undermine the provider's intent. Existing methods for detecting such scenarios treat the LLM as an uninterpretable black box and overlook the internal processing of inputs. In this work we show that layer transitions provide a promising avenue for extracting domain-specific signatures. Specifically, we present several lightweight ways of learning on internal dynamics encoded using a sparse autoencoder (SAE) that exhibit great capability in distinguishing OOD texts. Building on SAE representation transitions enables us to better interpret how the LLM's internal processing of an input evolves and to shed light on its decisions. We provide a comprehensive analysis of the method and benchmark it with the gemma-2 2B and 9B models. Our results emphasize the efficacy of the internal process in capturing fine-grained input-related details.

2605.11910 2026-05-13 cs.AI Version update

Rethinking Positional Encoding for Neural Vehicle Routing

Chuanbo Hua, Federico Berto, Andre Hottung, Nayeli Gast Zepeda, Yining Ma, Zihan Ma, Paula Wong-Chung, Changhyun Kwon, Cathy Wu, Kevin Tierney, Jinkyoo Park

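Affiliations * KAIST; Radical Numerics; Bielefeld University; University of Vienna; MIT; University of British Columbia

AI summary This paper studies the design of positional encoding (PE) for neural vehicle routing problems (VRPs) and argues that positional encodings borrowed from natural language processing do not fit the structure of VRPs. The authors formalize three structural properties that a routing-aware PE should respect and, grounded in geometry, design a hierarchical anisometric PE that combines a circularly consistent in-route encoding with a depot-anchored angular cross-route encoding. Experiments show that this geometry-grounded PE outperforms conventional index-based encodings across a range of VRP variants.

Details
English abstract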

Transformer-based models have become the dominant paradigm for neural combinatorial optimization (NCO) of vehicle routing problems (VRPs), yet the role of positional encoding (PE) in these architectures remains largely unexplored. Unlike natural language, where tokens are uniformly spaced on a line, routing solutions exhibit several properties that render standard NLP positional encodings inadequate. In this work, we formalize three such structural properties that a routing-aware PE should respect, namely anisometric node distances, cyclic and direction-aware topology, and hierarchical depot-anchored global multi-route structure, combining them with a unifying design principle of geometric grounding. Guided by these criteria, we analyze and compare PE methods spanning NLP, graph-transformer, and routing-specific families, and propose a hierarchical anisometric PE that combines a distance-indexed, circularly consistent in-route encoding with a depot-anchored angular cross-route encoding. Extensive experiments across diverse VRP variants demonstrate that geometry-grounded PE consistently outperforms index-based alternatives, with gains that transfer across problem variants, model architectures, and distribution shifts.

2605.11905 2026-05-13 cs.AI Version update

Rethinking Supervision Granularity: Segment-Level Learning for LLM-Based Theorem Proving

Shuo Xu, Jiakun Zhang, Junyu Lai, Chun Cao, Jingwei Xu

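Affiliations * State Key Laboratory for Novel Software Technology

AI summary This paper revisits supervision granularity and proposes segment-level supervision over proof trajectories for training LLM-based theorem provers. The method builds training data from locally coherent proof segments, which preserves global structure while avoiding the fragmentation caused by fine-grained step prediction. Experiments show that it clearly outperforms existing step-level and whole-proof generation methods on several benchmark datasets and also improves the performance and inference efficiency of existing provers.

Comments 22 pages, 4 figures, 6 tables

Details
English abstract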

Automated theorem proving with large language models in Lean 4 is commonly approached through either step-level tactic prediction with tree search or whole-proof generation. These two paradigms represent opposite granularities for constructing supervised training data: the former provides dense local signals but may fragment coherent proof processes, while the latter preserves global structure but requires complex end-to-end generation. In this paper, we revisit supervision granularity as a training set construction problem over proof trajectories and propose segment-level supervision, a training data construction strategy that extracts locally coherent proof segments for training policy models. We further reuse the same strategy at inference time to trigger short rollouts for existing step-level models. When trained with segment-level supervision on STP, LeanWorkbook, and NuminaMath-LEAN, the resulting policy models achieve proof success rates of 64.84%, 60.90%, and 66.31% on miniF2F, respectively, consistently outperforming both step-level and whole-proof baselines. Goal-aware rollout further improves existing step-level provers while reducing inference costs. It increases the proof success rate of BFS-Prover-V2-7B from 68.77% to 70.74% and that of InternLM2.5-StepProver from 59.59% to 60.33%, showing that appropriate supervision granularity better aligns model learning with proof structure and search. Code and models are available at https://github.com/NJUDeepEngine/SEG-ATP.

2605.11904 2026-05-13 cs.CV cs.AI Version update

Beyond Point-wise Neural Collapse: A Topology-Aware Hierarchical Classifier for Class-Incremental Learning

Huiyu Yi, Zhiming Xu, Dunwei Tu, Zhicheng Wang, Baile Xu, Furao Shen

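Affiliations * National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Computer Science, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China

AI summary This paper addresses the poor performance of the conventional Nearest Class Mean (NCM) classifier in class-incremental learning (CIL), which is caused by feature drift and non-linear structure, and proposes HC-SOINN, a topology-aware hierarchical classifier. The method captures the topological structure of class manifolds through a "local-to-global" representation and introduces Structure-Topology Alignment via Residuals (STAR) to adapt precisely to complex non-linear feature drift. Experiments show that the method consistently improves classification performance across several state-of-the-art models and exhibits good robustness and generalization.

Comments accepted by ICML2026

Details
English abstract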

The Nearest Class Mean (NCM) classifier is widely favored in Class-Incremental Learning (CIL) for its superior resistance to catastrophic forgetting compared to Fully Connected layers. While Neural Collapse (NC) theory supports NCM's optimality by assuming features collapse into single points, non-linear feature drift and insufficient training in CIL often prevent this ideal state. Consequently, classes manifest as complex manifolds rather than collapsed points, rendering the single-point NCM suboptimal. To address this, we propose Hierarchical-Cluster SOINN (HC-SOINN), a novel classifier that captures the topological structure of these manifolds via a "local-to-global" representation. Furthermore, we introduce the Structure-Topology Alignment via Residuals (STAR) method, which employs a fine-grained pointwise trajectory tracking mechanism to actively deform the learned topology, allowing it to adapt precisely to complex non-linear feature drift. Theoretical analysis and Procrustes distance experiments validate our framework's resilience to manifold deformations. We integrated HC-SOINN into seven state-of-the-art methods by replacing their original classifiers, achieving consistent improvements that highlight the effectiveness and robustness of our approach. Code is available at https://github.com/yhyet/HC_SOINN.

2605.11901 2026-05-13 cs.CR cs.AI Version update

AccLock: Unlocking Identity with Heartbeat Using In-Ear Accelerometers

Lei Wang, Jiangxuan Shen, Xi Zhang, Dalin Zhang, Jingyu Li, Haipeng Dai, Chenren Xu, Daqing Zhang, He Huang

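Affiliations * Soochow University; Macquarie University; Aalborg University; Peking University; Nanjing University

AI summary This paper proposes AccLock, a passive authentication system based on in-ear accelerometers that extracts distinctive features from in-ear ballistocardiogram (BCG) signals to verify identity securely without active user involvement. The system uses a two-stage denoising scheme and HIDNet, a disentanglement-based deep learning model, to extract user-specific features, and builds a scalable authentication framework on a Siamese network, improving robustness to environmental noise and practicality. Experiments with 33 participants yield an average false acceptance rate (FAR) of 3.13% and false rejection rate (FRR) of 2.99%, supporting its practical feasibility.

Details
English abstract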

The widespread use of earphones has enabled various sensing applications, including activity recognition, health monitoring, and context-aware computing. Among these, earphone-based user authentication has become a key technique by leveraging unique biometric features. However, existing earphone-based authentication systems face key limitations: they either require explicit user interaction or active speaker output, or suffer from poor accessibility and vulnerability to environmental noise, which hinders large-scale deployment. In this paper, we propose a passive authentication system, called AccLock, which leverages distinctive features extracted from in-ear BCG signals to enable secure and unobtrusive user verification. Our system offers several advantages over previous systems, including zero involvement from both the device and the user, ubiquity, and resilience to environmental noise. To realize this, we first design a two-stage denoising scheme to suppress both inherent and sporadic interference. To extract user-specific features, we then propose a disentanglement-based deep learning model, HIDNet, which explicitly separates user-specific features from shared nuisance components. Lastly, we develop a scalable authentication framework based on a Siamese network that eliminates the need for per-user classifier training. We conduct extensive experiments with 33 participants, achieving an average FAR of 3.13% and FRR of 2.99%, which demonstrates the practical feasibility of AccLock.
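
For readers less familiar with the two metrics, the following sketch computes the false acceptance rate (FAR) and false rejection rate (FRR) from similarity scores at a fixed decision threshold, as a Siamese-style verifier might produce them. The scores and the threshold are invented for illustration and have no connection to the reported 3.13% and 2.99%.

```python
from typing import List, Tuple


def far_frr(
    genuine_scores: List[float], impostor_scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Return (FAR, FRR) when a pair is accepted if its similarity is >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)


# Hypothetical similarity scores for same-user and different-user pairs.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.35, 0.71, 0.40, 0.22, 0.05, 0.30, 0.18]

far, frr = far_frr(genuine, impostor, threshold=0.7)
print(f"FAR = {far:.3f}, FRR = {frr:.3f}")
```

Raising the threshold lowers FAR at the expense of FRR, which is why the two are normally reported together.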

2605.11893 2026-05-13 cs.AI Version update

Toward Modeling Player-Specific Chess Behaviors

Loris Sogliuzzo, Aloïs Rautureau, Eric Piette

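AI summary Although artificial intelligence has reached superhuman strength in chess, building models that accurately emulate the individual decision-making styles of human players remains a challenge. This paper proposes an architecture based on the Maia-2 model that uses champion-specific embeddings and a limited Monte Carlo Tree Search (MCTS) to enrich tactical exploration and better capture the playing styles of historical champions. It introduces a behavioral evaluation metric based on the Jensen-Shannon divergence that compares the behavioral similarity of players and AI models using an autoencoder and UMAP dimensionality reduction. Experiments show that the method improves stylistic alignment in ways that conventional move-accuracy evaluation does not capture.

Details
English abstract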

While artificial intelligence has achieved superhuman performance in chess, developing models that accurately emulate the individualized decision-making styles of human players remains a significant challenge. Existing human-like chess models capture general population behaviors based on skill levels but fail to reproduce the behavioral characteristics of specific historical champions. Furthermore, the standard evaluation metric, move accuracy, inherently penalizes natural human variance and ignores long-term behavioral consistency, leading to an incomplete assessment of stylistic fidelity. To address these limitations, an architecture is proposed that adapts the unified Maia-2 model to champion-specific embeddings, further enhanced by the integration of a limited Monte Carlo Tree Search (MCTS) process to enrich tactical exploration during move selection. To robustly evaluate this approach, a novel behavioral metric based on the Jensen-Shannon divergence is introduced. By compressing high-dimensional board representations into a latent space using an AutoEncoder and Uniform Manifold Approximation and Projection (UMAP), move distributions are discretized on a common grid to compare behavioral similarities. Results across 16 historical world champions indicate that while integrating MCTS decreases standard move accuracy, it improves stylistic alignment according to the proposed metric, substantially reducing the average Jensen-Shannon divergence. Ultimately, the proposed metric successfully discriminates between individual players and provides promising evidence toward more comprehensive evaluations of behavioral alignment between players and AI models.
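
The metric's final step, comparing two players' distributions on a common grid, can be sketched as follows. The sketch assumes the boards have already been reduced to 2D points (the autoencoder and UMAP stages are omitted), that all points fall inside the grid bounds, and that base-2 logarithms are used so the divergence lies in [0, 1]; the grid size and sample points are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def grid_histogram(
    points: List[Tuple[float, float]], lo: float, hi: float, bins: int
) -> Dict[Cell, float]:
    """Discretize 2D points on a bins x bins grid over [lo, hi]^2 and normalize."""
    width = (hi - lo) / bins
    counts: Counter = Counter()
    for x, y in points:
        i = min(int((x - lo) / width), bins - 1)
        j = min(int((y - lo) / width), bins - 1)
        counts[(i, j)] += 1
    total = sum(counts.values())
    return {cell: c / total for cell, c in counts.items()}


def kl_divergence(p: Dict[Cell, float], q: Dict[Cell, float]) -> float:
    """KL(p || q) in bits; q must be positive wherever p is."""
    return sum(pv * math.log2(pv / q[cell]) for cell, pv in p.items() if pv > 0)


def js_divergence(p: Dict[Cell, float], q: Dict[Cell, float]) -> float:
    """Jensen-Shannon divergence between two grid distributions."""
    cells = set(p) | set(q)
    mix = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in cells}
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


# Hypothetical latent positions for two players on the unit square.
player_a = grid_histogram([(0.1, 0.1), (0.2, 0.1)], lo=0.0, hi=1.0, bins=2)
player_b = grid_histogram([(0.9, 0.9)], lo=0.0, hi=1.0, bins=2)

print(js_divergence(player_a, player_a), js_divergence(player_a, player_b))
```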

2605.11891 2026-05-13 cs.CR cs.AI version update

Proteus: A Self-Evolving Red Team for Agent Skill Ecosystems

Zhaojiacheng Zhou

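Affiliations * Department of Computer Science and Engineering

AI summary This study proposes Proteus, a self-evolving red-team framework for assessing security risks in skill-based agent ecosystems. Because third-party skills can be iteratively modified after deployment to evade audits and cause runtime harm, Proteus simulates attacker behavior, searches a formalized five-axis attack space for potential threats, and uses audit feedback to mutate and optimize skills across rounds. Experiments show that Proteus achieves high attack success rates across multiple test scenarios, revealing that current skill-vetting mechanisms carry a significant risk of missed detections against adaptive attacks.

Details
English abstract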

Agent skills extend LLM agents with reusable instructions, tool interfaces, and executable code, and users increasingly install third-party skills from marketplaces, repositories, and community channels. Because a skill exposes both executable behavior and context-setting documentation, its deployment risk cannot be measured by single-shot audits or prompt-level red teams alone: a realistic attacker can use audit and runtime feedback to repeatedly rewrite the skill. We frame this risk as adaptive leakage, that is, whether a budgeted attacker can iteratively revise a skill until it passes audit and produces verified runtime harm, and present Proteus, a grey-box self-evolving red-team framework for measuring it. Proteus searches a formalized five-axis skill-attack space. Each candidate is evaluated through a unified audit-sandbox-oracle pipeline that returns structured audit findings and runtime evidence to guide cross-round mutation. Beyond initial evasion, Proteus performs path expansion, which finds alternative implementations of successful attacks, and surface expansion, which transfers learned implementation patterns to new attack objectives beyond the original seed catalogue. Across eight phase-1 cells, Proteus reaches 40-90% Attack Success Rate at 5 rounds (ASR@5) with positive learning-curve slopes on both evaluated auditors. Phase-2 path/surface expansion produces 438 jointly bypassing and lethal variants, with SkillVetter bypassed at ≥ 93% in every cell and AI-Infra-Guard, the strongest public auditor we evaluate, still admitting up to 41.3% joint-success. These results show that current skill vetting substantially underestimates residual risk when evaluated against adaptive, feedback-driven attackers.

2605.11889 2026-05-13 cs.LG cs.AI version update

Incentivizing Truthfulness and Collaborative Fairness in Bayesian Learning

Rachael Hwee Ling Sim, Jue Fan, Xiao Tian, Xinyi Xu, Patrick Jaillet, Bryan Kian Hsiang Low

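Affiliations * Department of Computer Science, National University of Singapore, Singapore; Research (A*STAR), Singapore; Department of Electrical Engineering; Computer Science, Massachusetts Institute of Technology, USA

AI summary This study examines how to incentivize data sources to provide truthful data and achieve collaborative fairness in Bayesian learning. To address the inability of existing methods to guarantee data truthfulness, the authors propose a mechanism that combines semivalues (e.g., the Shapley value) to ensure fairness with a truthfulness-incentivizing data valuation function built on a validation set unknown to the sources. At equilibrium the mechanism guarantees both collaborative fairness and data truthfulness, and theoretical analysis and experiments both support its effectiveness.

Comments Accepted to the 43rd International Conference on Machine Learning (ICML-26) as a Spotlight paper

Details
English abstract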

Collaborative machine learning involves training high-quality models using datasets from a number of sources. To incentivize sources to share data, existing data valuation methods fairly reward each source based on its data submitted as is. However, as these methods neither verify nor incentivize data truthfulness, the sources can manipulate their data (e.g., by submitting duplicated or noisy data) to artificially increase their valuations and rewards or prevent others from benefiting. This paper presents the first mechanism that provably ensures (F) collaborative fairness and incentivizes (T) truthfulness at equilibrium for Bayesian models. Our mechanism combines semivalues (e.g., Shapley value), which ensure fairness, and a truthful data valuation function (DVF) based on a validation set that is unknown to the sources. As semivalues are influenced by others' data, we introduce an additional condition to prove that a source can maximize its expected data values in coalitions and semivalues by submitting a dataset that captures its true knowledge. Additionally, we discuss the implications and suitable relaxations of (F) and (T) when the mediator has a limited budget for rewards or lacks a validation set. Our theoretical findings are validated on synthetic and real-world datasets.
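The Shapley value named above as an example semivalue can be illustrated on a toy game. The sketch below computes exact Shapley values by averaging marginal contributions over all orderings. The coverage utility and the three sources are hypothetical stand-ins, not the paper's data valuation function or mechanism.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            totals[p] += utility(extended) - utility(coalition)
            coalition = extended
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical sources; C submits the same points as A.
DATA = {"A": {1, 2, 3}, "B": {3, 4}, "C": {1, 2, 3}}


def coverage(coalition):
    """Hypothetical utility: number of distinct data points in the coalition."""
    covered = set()
    for source in coalition:
        covered |= DATA[source]
    return len(covered)


values = shapley_values(["A", "B", "C"], coverage)
```

Two fairness properties are visible here: the values sum to the utility of the full coalition, and the interchangeable sources A and C receive equal values. Enumerating orderings is only feasible for a handful of players.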

2605.11882 2026-05-13 cs.AI version update

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

Bo Yin, Qi Li, Xinchao Wang

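Affiliations * National University of Singapore

AI summary To address the unsafe behaviors that tool-using LLM agents may exhibit during execution, this study proposes FATE, an on-policy self-evolution framework based on failure trajectories. The method converts verifier-scored failure trajectories into repair supervision signals that guide the agent's self-improvement, and introduces Pareto-Front Policy Optimization to balance safety and task utility. Experiments show that FATE significantly improves agent safety on multiple benchmarks while preserving task-execution capability.

Details
English abstract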

Tool-using LLM agents fail through trajectories rather than only final responses, as they may execute unsafe tool calls, follow injected instructions, comply with harmful requests, or over-refuse benign tasks despite producing a seemingly safe answer. Existing safety-alignment signals are largely response-level or off-policy, and often incur a safety-utility trade-off: improving agent safety comes at the cost of degraded task performance. Such sparse and single-objective rewards severely limit real-world usability. To bridge this gap, we propose FATE, an on-policy self-evolving framework that transforms verifier-scored failures into repair supervision without expert demonstrations. For each failure, the same policy proposes repair candidates, which are then re-scored by verifiers and filtered across security, utility, over-refusal control, and trajectory validity. This dense trajectory-level information is then used as a supervision signal for agent self-evolution. During this process, we further introduce Pareto-Front Policy Optimization (PFPO), combining supervised warmup with Pareto-aware policy optimization to preserve safety-utility trade-offs. Experiments on AgentDojo, AgentHarm, and ATBench show that FATE improves safety across different models and scales while preserving useful behavior. Compared with strong baselines, FATE reduces attack success rate by 33.5%, harmful compliance by 82.6%, and improves external trajectory-safety diagnosis by 6.5%. These results suggest that failed trajectories can provide structured repair supervision for safer self-evolving agents.
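The filtering of repair candidates across several objectives can be pictured with a plain Pareto-front filter. The following is a minimal sketch under the assumption that each candidate has already been scored on two objectives where higher is better. The candidate names and scores are hypothetical, and this is not the PFPO algorithm itself.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better once."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Names of candidates that no other candidate dominates."""
    front = []
    for name, score in candidates.items():
        dominated = any(
            dominates(other_score, score)
            for other_name, other_score in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (safety, utility) scores for four repair candidates.
candidates = {
    "r1": (0.9, 0.4),
    "r2": (0.7, 0.8),
    "r3": (0.6, 0.7),
    "r4": (0.9, 0.3),
}
front = pareto_front(candidates)
```

Here `r3` is dropped because `r2` is better on both objectives, and `r4` is dropped because `r1` matches its safety with higher utility. Neither survivor dominates the other, so both represent valid trade-offs.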

2605.11875 2026-05-13 eess.SP cs.AI version update

Modulation Consistency-based Contrastive Learning for Self-Supervised Automatic Modulation Classification

Chenxu Wang, Shuang Wang, Lirong Han, Xinyu Hu, Hanlin Mo, Hantong Xing, Licheng Jiao

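Affiliations * School of Artificial Intelligence, Xidian University; Unmanned System Research Institute, Northwestern Polytechnical University; School of Electronics And Information, Northwestern Polytechnical University; Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki

AI summary Self-supervised methods for automatic modulation classification (AMC) rely on task-agnostic pretext objectives, which leaves their representations entangled with nuisance factors. To address this, the paper proposes Mod-CL, a contrastive learning framework based on modulation consistency. The method exploits the fact that different temporal segments of the same signal share a modulation type but differ in waveform, and builds positive pairs from them to learn shared modulation information while suppressing nuisance factors. Experiments show that Mod-CL significantly outperforms existing methods on multiple RadioML datasets, particularly when labels are scarce.

Details
English abstract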

Deep learning-based AMC methods have achieved remarkable performance, but their practical deployment remains constrained by the high cost of labeled data. Although self-supervised learning (SSL) reduces the reliance on labels, existing SSL-based AMC methods often rely on task-agnostic pretext objectives misaligned with modulation classification, leading to representations entangled with nuisance factors such as symbol, channel, and noise. In this paper, we identify intra-instance modulation consistency as a task-aware structural prior, whereby different temporal segments of the same signal may differ in waveform while preserving the same modulation type, thus providing a principled cue for task-aligned self-supervision. Based on this prior, we propose Mod-CL, a Modulation consistency-based Contrastive Learning framework that constructs positive pairs from different temporal segments of the same signal instance, to encourage the model to learn shared modulation information while suppressing nuisance variations. We further develop a contrastive objective tailored to Mod-CL, which jointly exploits temporal segmentation and data augmentation to pull together views sharing the same modulation semantics while avoiding supervisory conflicts within each signal instance. Extensive experiments on RadioML datasets show that Mod-CL consistently outperforms strong baselines, especially in low-label regimes, achieving substantial improvements in linear probing accuracy.
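The positive-pair construction described above, where segments of the same signal instance are treated as sharing a modulation type, can be sketched as follows. Non-overlapping fixed-length segments and the helper names are assumptions made for illustration. The paper's actual segmentation, augmentation, and loss are not reproduced here.

```python
from itertools import combinations


def split_segments(signal, segment_length):
    """Cut a signal into non-overlapping segments, dropping any remainder."""
    count = len(signal) // segment_length
    return [
        signal[i * segment_length:(i + 1) * segment_length]
        for i in range(count)
    ]


def positive_pairs(signals, segment_length):
    """Pair every two segments that come from the same signal instance."""
    pairs = []
    for instance_id, signal in enumerate(signals):
        segments = split_segments(signal, segment_length)
        for a, b in combinations(range(len(segments)), 2):
            pairs.append((instance_id, segments[a], segments[b]))
    return pairs


# Hypothetical toy signals: one with 8 samples, one with 12 samples.
signals = [list(range(8)), list(range(100, 112))]
pairs = positive_pairs(signals, segment_length=4)
```

The first signal yields two segments and therefore one pair, and the second yields three segments and three pairs. Segments from different instances are never paired as positives, since their modulation types may differ.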

2605.11868 2026-05-13 cs.CR cs.AI version update

IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection

Chia-Pei Chen, Kentaroh Toyoda, Anita Lai, Alex Leung

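Affiliations * Vulcan Research, AIFT

AI summary This paper presents IPI-proxy, an open-source intercepting-proxy tool for countering indirect prompt injection (IPI), intended to evaluate and strengthen the security of web-browsing AI agents. The tool rewrites HTTP responses from whitelisted domains in real time, embedding attack payloads extracted from multiple benchmark libraries, and supports parameterized configuration of several embedding techniques and insertion points, enabling parameter-sweep testing without mock pages. IPI-proxy fills the gap left by existing red-teaming tools for testing IPI vulnerabilities in real deployment environments and gives AI security teams a reproducible test platform.

Comments code: https://github.com/VulcanLab/IPI-Proxy/

Details
English abstract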

Web-browsing AI agents are increasingly deployed in enterprise settings under strict whitelists of approved domains, yet adversaries can still influence them by embedding hidden instructions in the HTML pages those domains serve. Existing red-teaming resources fall short of this scenario: prompt-injection benchmarks ship pre-built adversarial pages that whitelisted agents cannot reach, and generic LLM scanners probe the model API rather than its retrieved content. We present IPI-proxy, an open-source toolkit for red-teaming web-browsing agents against indirect prompt injection (IPI). At its core is an intercepting proxy that rewrites real HTTP responses from whitelisted domains in flight, embedding payloads drawn from a unified library of 820 deduplicated attack strings extracted from six published benchmarks (BIPIA, InjecAgent, AgentDojo, Tensor Trust, WASP, and LLMail-Inject). A YAML-driven test harness independently parameterizes the payload set, the embedding technique (HTML comment, invisible CSS, or LLM-generated semantic prose), and the HTML insertion point (6 locations from head_meta to script_comment), enabling parameter-sweep evaluation without mock pages or sandboxed environments. A companion exfiltration tracker logs successful callbacks. This paper describes the threat model, situates IPI-proxy among contemporary IPI benchmarks and red-teaming tools, and details its architecture, design decisions, and configuration interface. By bridging static benchmarks and live deployment, IPI-proxy gives AI security teams a reproducible substrate for measuring and hardening web-browsing agents against indirect prompt injection on the same retrieval surface attackers exploit in production.

2605.11864 2026-05-13 cs.IR cs.AI cs.CV cs.MM version update

Very Efficient Listwise Multimodal Reranking for Long Documents

Yiqun Sun, Pengfei Wei, Lawrence B. Hsieh

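Affiliations * Magellan Technology Research Institute (MTRI)

AI summary This paper proposes ZipRerank, an efficient listwise multimodal reranking model that targets the computational bottleneck in vision-centric retrieval and multimodal retrieval-augmented generation over long documents. The method shortens the input with a lightweight query-image early interaction mechanism and scores all candidates in a single forward pass, avoiding time-consuming autoregressive decoding. Experiments show that ZipRerank maintains strong performance while significantly reducing LLM inference latency, making it suitable for latency-sensitive real-world applications.

Comments To appear in ICML 2026

Details
English abstract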

Listwise reranking is a key yet computationally expensive component in vision-centric retrieval and multimodal retrieval-augmented generation (M-RAG) over long documents. While recent VLM-based rerankers achieve strong accuracy, their practicality is often limited by long visual-token sequences and multi-step autoregressive decoding. We propose ZipRerank, a highly efficient listwise multimodal reranker that directly addresses both bottlenecks. It reduces input length via a lightweight query-image early interaction mechanism and eliminates autoregressive decoding by scoring all candidates in a single forward pass. To enable effective learning, ZipRerank adopts a two-stage training strategy: (i) listwise pretraining on large-scale text data rendered as images, and (ii) multimodal finetuning with VLM-teacher-distilled soft-ranking supervision. Extensive experiments on the MMDocIR benchmark show that ZipRerank matches or surpasses state-of-the-art multimodal rerankers while reducing LLM inference latency by up to an order of magnitude, making it well-suited for latency-sensitive real-world systems. The code is available at https://github.com/dukesun99/ZipRerank.

2605.11859 2026-05-13 cs.RO cs.AI version update

EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models

Zhikai Zhao, Chuanbo Hua, Federico Berto, Zihan Ma, Kanghoon Lee, Jiachen Li, Jinkyoo Park

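Affiliations * KAIST; Radical Numerics; UC Riverside; Omelet AI4CO

AI summary This paper proposes EvoNav, a framework for designing robot-navigation reward functions based on evolutionary algorithms and large language models. It targets the problem that hand-designed reward functions depend on domain expertise and adapt poorly to complex environments. Through a staged warm-up-boost procedure, the method uses an LLM to generate candidate reward functions and combines low-cost proxies with progressively more intensive training, significantly improving design efficiency and navigation-policy performance. Experiments show that navigation policies produced by EvoNav outperform hand-designed rewards and existing state-of-the-art methods.

Details
English abstract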

Robot navigation is a crucial task with applications to social robots in dynamic human environments. While Reinforcement Learning (RL) has shown great promise for this problem, the policy quality is highly sensitive to the specification of reward functions. Hand-crafted rewards require substantial domain expertise and embed inductive biases that are difficult to audit or adapt, limiting their effectiveness and leading to suboptimal performance. In this paper, we propose EvoNav, an evolutionary framework that automates the design of robot navigation reward functions via large language models (LLMs). To overcome prohibitively costly policy training, EvoNav evaluates each candidate proposal from the LLM via a progressive three-stage warm-up-boost procedure. EvoNav advances from analytical proxies with low-cost surrogates, such as small datasets and analytic rules, to lightweight rollouts and, finally, to full policy training, enabling computationally efficient exploration under effective feedback. Experimental results show that EvoNav produces more effective navigation policies than manually designed RL rewards and state-of-the-art reward design methods.

2605.11846 2026-05-13 cs.LG cs.AI version update

Martingale-Consistent Self-Supervised Learning

Moritz Gögl, Hanwen Xing, Christopher Yau

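Affiliations * University of Oxford; Health Data Research UK

AI summary This paper studies how to improve the robustness and consistency of self-supervised learning (SSL) when information is incomplete or changing. The authors propose an SSL framework grounded in martingale theory that keeps coarse and refined predictions consistent in expectation, preventing systematic drift. The method introduces prediction-space and latent-space variants and an unbiased Monte Carlo estimator; experiments show that it improves model stability and calibration under partial observation.

Details
English abstract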

Self-supervised learning (SSL) is often deployed under changing information, such as shorter histories, missing features, or partially observed images. In these settings, predictions from coarse and refined views should be coherent: before refinement, the coarse-view prediction should match the average prediction expected after refinement. Martingales formalize this coherence principle, but standard SSL objectives do not enforce it. Unlike invariance objectives that pull views together, martingale consistency constrains only the expected refined prediction, allowing predictions to update as information is revealed while preventing systematic drift. We introduce a martingale-consistent SSL framework that closes this gap, with practical prediction- and latent-space variants and an unbiased two-sample Monte Carlo estimator based on stochastic refinement. We evaluate the approach on synthetic and real time-series, tabular, and image benchmarks under partial-observation regimes, in both semi-self-supervised and fully label-free settings. Across these experiments, our framework improves robustness and calibration under partial observation, yielding more stable representations as information is revealed.
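The coherence condition above says the coarse-view prediction should equal the expected refined prediction. One standard way to estimate the squared gap without bias is to multiply the residuals from two independent refinements. Whether the paper's two-sample estimator takes exactly this form is an assumption. The sketch below checks the idea by exact enumeration on a hypothetical three-outcome refinement with scalar predictions.

```python
from itertools import product


def two_sample_estimate(coarse, refined_a, refined_b):
    """Product of residuals from two independent refinements."""
    return (coarse - refined_a) * (coarse - refined_b)


def one_sample_estimate(coarse, refined):
    """Naive plug-in: squared residual from a single refinement."""
    return (coarse - refined) ** 2


# Hypothetical setup: three equally likely refined predictions.
refined_outcomes = [0.2, 0.6, 1.0]
coarse = 0.5

mean_refined = sum(refined_outcomes) / len(refined_outcomes)
true_gap = (coarse - mean_refined) ** 2

# Exact expectations, enumerating every equally likely draw.
pairs = list(product(refined_outcomes, repeat=2))
expected_two = sum(two_sample_estimate(coarse, a, b) for a, b in pairs) / len(pairs)
expected_one = sum(one_sample_estimate(coarse, r) for r in refined_outcomes) / len(refined_outcomes)
```

The two-sample expectation matches the true squared gap, while the single-sample version is inflated by the variance of the refined prediction. Minimizing the latter would push refined predictions toward the coarse one, which is the invariance behavior the abstract contrasts with martingale consistency.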

2605.11841 2026-05-13 stat.ML cs.AI cs.LG version update

Minimax Rates and Spectral Distillation for Tree Ensembles

Binh Duc Vu, David S. Watson

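Affiliations * King’s College London

AI summary This paper studies the theoretical properties of tree ensembles such as random forests and gradient boosting machines and proposes an analysis framework based on spectral methods. By analyzing the eigenvalue decay of the induced kernel operator, it derives the minimax convergence rate for random forest regression and, from this viewpoint, develops model-compression methods. These methods learn the leading eigenfunctions of the kernel operator or the leading singular vectors of the smoother matrix, producing distilled models that predict well at a greatly reduced size and suit resource-constrained computing.

Comments 9 pages main text, 33 pages total, with 12 figures and 7 tables total

Details
English abstract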

Tree ensembles such as random forests (RFs) and gradient boosting machines (GBMs) are among the most widely used supervised learners, yet their theoretical properties remain incompletely understood. We adopt a spectral perspective on these algorithms, with two main contributions. First, we derive minimax-optimal convergence for RF regression, showing that, under mild regularity conditions on tree growth, the eigenvalue decay of the induced kernel operator governs the statistical rate. Second, we exploit this spectral viewpoint to develop compression schemes for tree ensembles. For RFs, leading eigenfunctions of the kernel operator capture the dominant predictive directions; for GBMs, leading singular vectors of the smoother matrix play an analogous role. Learning nonlinear maps for these spectral representations yields distilled models that are orders of magnitude smaller than the originals while maintaining competitive predictive performance. Our methods compare favorably to state-of-the-art algorithms for forest pruning and rule extraction, with applications to resource-constrained computing.

2605.11839 2026-05-13 cs.DC cs.AI version update

Trade-offs in Decentralized Agentic AI Discovery Across the Compute Continuum

Patrizio Dazzi, Emanuele Carlini, Matteo Mordacchini, Saul Urso

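Affiliations * Department of Computer Science; University of Pisa; Institute of Information Science and Technologies; National Research Council of Italy; Institute of Informatics and Telematics

AI summary This paper studies the trade-offs of decentralized discovery mechanisms for agentic systems deployed across the compute continuum (cloud, edge, and intermittently connected environments). The authors compare three structured overlay networks, Chord, Pastry, and Kademlia, within a shared control-plane framework, analyzing how they differ in discovery reliability, startup behavior, and control overhead. The study aims to provide clear performance bounds and guidance on suitable use cases for agent discovery in edge-to-cloud environments.

Details
English abstract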

Agentic systems deployed across the compute continuum need discovery mechanisms that remain effective across cloud, edge, and intermittently connected domains. In some emerging agentic architectures, decentralized discovery is already an active design direction, placing DHT-based lookup on the path toward agent directories. This paper studies the trade-offs among major structured-overlay families for agent discovery, comparing Chord, Pastry, and Kademlia as candidate indexing substrates within a shared control-plane framework. Using a benchmark subset centered on a 4096-node stationary comparison and a representative 4096-node churn benchmark, the paper characterizes how discovery reliability, startup behavior, and control-plane overhead vary across these overlays. The goal is to clarify the operating points they expose for agent discovery across edge-to-cloud environments.
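Of the three overlays compared above, Kademlia is the one whose lookup rests on an XOR distance between identifiers. A minimal sketch of that distance and of picking the closest nodes to a key follows. The 16-bit identifier width and the helper names are assumptions for illustration, and routing tables, buckets, and churn handling are left out.

```python
import hashlib


def node_id(name, bits=16):
    """Derive a short identifier from a name (bit width is an assumption)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


def xor_distance(a, b):
    """Kademlia-style distance: bitwise XOR read as an integer."""
    return a ^ b


def closest_nodes(key, node_ids, k=3):
    """The k node identifiers nearest to key under XOR distance."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key))[:k]


# Hypothetical 4-bit identifiers so the distances are easy to check by hand.
nodes = [0b1011, 0b0010, 0b1110, 0b1000]
nearest = closest_nodes(0b1010, nodes, k=2)
```

For key `0b1010` the distances are 1, 8, 4, and 2 in list order, so the two nearest identifiers are `0b1011` and `0b1000`. Identifiers that share a longer prefix with the key are always closer, which is what lets lookups converge in a logarithmic number of hops.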

2605.11835 2026-05-13 cs.NE cs.AI cs.LG version update

Multi-Timescale Conductance Spiking Networks: A Sparse, Gradient-Trainable Framework with Rich Firing Dynamics for Enhanced Temporal Processing

Alex Fulleda-Garcia, Saray Soldado-Magraner, Josep Maria Margarit-Taulé

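Affiliations * Department of Neurobiology, University of California Los Angeles (UCLA); Instituto de Física Corpuscular (IFIC, CSIC–UV)

AI summary This paper proposes Multi-Timescale Conductance Spiking Networks, which aim to resolve the trade-off in conventional spiking neural networks among gradient-based training, dynamical richness, and activity sparsity. The framework systematically controls neuronal excitability by tuning conductance parameters at fast, slow, and ultra-slow timescales, yielding firing modes that include tonic, phasic, and bursting responses. Experiments show that the model outperforms existing LIF and AdLIF networks on time-series regression while exhibiting sparser activity, providing a new foundation for energy-aware temporal processing and neuromorphic computing.

Comments Published in 2026 IEEE Neuro-Inspired Computational Elements Conference (Atlanta, USA)

Details
English abstract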

Spiking neural networks (SNNs) promise low-power event-driven computation for temporally rich tasks, but commonly used neuron models often trade off gradient-based trainability, dynamical richness, and high activity sparsity. These limitations are acute in regression, where approximation error, noise and spike discretization can severely degrade continuous-valued outputs. Indeed, many state-of-the-art (SOTA) SNNs rely on simple phenomenological dynamics trained with surrogate gradients and offer limited control over spiking diversity and sparsity. To overcome such limitations, we introduce multi-timescale conductance spiking networks, a gradient-trainable framework in which neural dynamics emerge from shaping the current-voltage (I-V) curve by tuning fast, slow and ultra-slow conductances. This parametrization allows systematic control over excitability, can be implemented efficiently in analog circuits, and yields rich firing regimes including tonic, phasic and bursting responses within a single model. We derive a discrete-time formulation of these differentiable dynamics, enabling direct backpropagation through time without surrogate-gradient approximations. To probe both trainability and accuracy, we evaluate feedforward networks of these neurons at the predictability limit of Mackey-Glass time-series regression and compare them to baseline LIF and SOTA AdLIF networks. Our model outperforms LIF and AdLIF networks, while exhibiting substantially sparser activity from both communication and computational perspectives. These results highlight multi-timescale conductance spiking neurons as a promising building block for energy-aware temporal processing and neuromorphic implementation.

2605.10916 2026-05-13 cs.CV cs.AI version update

Confidence-Guided Diffusion Augmentation for Enhanced Bangla Compound Character Recognition

Md. Sultan Al Rayhan

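Affiliations * Department of Computer Science and Engineering

AI summary Recognizing handwritten Bangla compound characters is challenging, mainly because of complex character structures, large intra-class variation, and limited high-quality annotated data. This paper proposes a confidence-guided diffusion augmentation framework to improve recognition of low-resolution Bangla compound characters. The method combines a class-conditional diffusion model with classifier guidance to generate high-quality synthetic samples, and introduces enhanced residual blocks and a confidence-based filtering mechanism to improve generation quality and retain samples with high class consistency. Experiments show performance gains across several mainstream models; the best model reaches 89.2% classification accuracy on the AIBangla dataset, significantly exceeding the existing benchmark.

Details
English abstract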

Recognition of handwritten Bangla compound characters remains a challenging problem due to complex character structures, large intra-class variation, and limited availability of high-quality annotated data. Existing Bangla handwritten character recognition systems often struggle to generalize across diverse writing styles, particularly for compound characters containing intricate ligatures and diacritical variations. In this work, we propose a confidence-guided diffusion augmentation framework for low-resolution Bangla compound character recognition. Our framework combines class-conditional diffusion modeling with classifier guidance to synthesize high-quality handwritten compound character samples. To further improve generation quality, we introduce Squeeze-and-Excitation enhanced residual blocks within the diffusion model's U-Net backbone. We additionally propose a confidence-based filtering mechanism where pre-trained classifiers act as quality gates to retain only highly class-consistent synthetic samples. The filtered synthetic images are fused with the original training data and used to retrain multiple classification architectures. Experiments conducted on the AIBangla compound character dataset demonstrate consistent performance improvements across ResNet50, DenseNet121, VGG16, and Vision Transformer architectures. Our best-performing model achieves 89.2% classification accuracy, surpassing the previously published AIBangla benchmark by a substantial margin. The results demonstrate that quality-aware diffusion augmentation can effectively enhance handwritten character recognition performance in low-resource script domains.
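The confidence-based filtering step above can be pictured as a simple gate on the classifier's probability for the class a sample was generated for. The sketch below assumes the classifier exposes raw logits and uses a threshold of 0.9. Both the threshold and the example logits are hypothetical, and the paper's actual gating rule may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def keep_sample(logits, target_class, threshold=0.9):
    """Keep a synthetic sample only if the classifier is confident in its label."""
    return softmax(logits)[target_class] >= threshold


# Hypothetical (classifier logits, intended class) for three synthetic images.
samples = [
    ([4.0, 0.0, 0.0], 0),
    ([1.0, 0.9, 0.8], 0),
    ([0.0, 5.0, 0.0], 1),
]
kept_indices = [
    i for i, (logits, target) in enumerate(samples)
    if keep_sample(logits, target)
]
```

The second sample is rejected because the classifier spreads its probability almost evenly across classes, which suggests an ambiguous or off-class generation. A stricter threshold trades synthetic-set size for class consistency.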

2605.10684 2026-05-13 cs.LG cs.AI version update

Is Data Shapley Not Better than Random in Data Selection? Ask NASH

Xiao Tian, Jue Fan, Rachael Hwee Ling Sim, Zixuan Wang, Nancy F. Chen, Bryan Kian Hsiang Low

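Affiliations * Department of Computer Science, National University of Singapore, Singapore; Research (A*STAR), Singapore

AI summary This paper studies how to select high-quality subsets of training data and examines the effectiveness of methods such as Data Shapley for data selection. To address the unstable practical performance of Data Shapley, the authors propose the NASH framework, which decomposes the target utility function into simpler Shapley-informative components and aggregates them non-linearly to select data, significantly improving Shapley-based data selection at only a small additional computational cost.

Comments Accepted to the 43rd International Conference on Machine Learning (ICML-26) as a Spotlight paper

Details
English abstract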

Data selection studies the problem of identifying high-quality subsets of training data. While some existing works have considered selecting the subset of data with top-$m$ Data Shapley or other semivalues as they account for the interaction among every subset of data, other works argue that Data Shapley can sometimes perform ineffectively in practice and select subsets that are no better than random. This raises the questions: (I) Are there certain "Shapley-informative" settings where Data Shapley consistently works well? (II) Can we strategically utilize these settings to select high-quality subsets consistently and efficiently? In this paper, we propose a novel data selection framework, NASH (Non-linear Aggregation of SHapley-informative components), which (I) decomposes the target utility function (e.g., validation accuracy) into simpler, Shapley-informative component functions, and selects data by optimizing an objective that (II) aggregates these components non-linearly. We demonstrate that NASH substantially boosts the effectiveness of Shapley/semivalue-based data selection with minimal additional runtime cost.

2605.10584 2026-05-13 astro-ph.IM cs.AI gr-qc version update

An agentic framework for gravitational-wave counterpart association in the multi-messenger era

Yiming Dong, Yacheng Kang, Junjie Zhao, Xinyuan Zhu, Ziming Wang, Lijing Shao

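Affiliations * Department of Astronomy, School of Physics; Kavli Institute for Astronomy and Astrophysics; Institute for Gravitational Wave Astronomy; School of Information Science and Technology; National Astronomical Observatories

AI summary As multi-messenger astronomy develops, associating gravitational-wave (GW) signals with electromagnetic (EM) signals has become an important step in astrophysical research. This paper presents GW-Eyes, an LLM-based agentic framework that, for the first time, autonomously associates GW signals with candidate EM events and supports natural-language interaction to assist experts with tasks such as catalog management and skymap visualization. The framework draws on the complex decision-making capabilities and traceable reasoning of LLMs, offering a new perspective for multi-messenger astronomy.

Details
English abstract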

With the detection of gravitational waves (GWs), multi-messenger astronomy has opened a new window for advancing our understanding of astrophysics, dense matter, gravitation, and cosmology. The GW sources detected to date are from mergers of compact object binaries, which possess the potential to generate detectable electromagnetic (EM) counterparts. Searching for associations between GW signals and their EM counterparts is an essential step toward enabling subsequent multi-messenger studies. In the era of next-generation GW and EM detectors, the rapid increase in the number of events brings not only unprecedented scientific opportunities, but also substantial challenges to the existing data analysis paradigm. To help address these challenges, we develop GW-Eyes, an agentic framework powered by large language models (LLMs). For the first time, GW-Eyes integrates domain-specific tools and autonomously performs counterpart association tasks between GW and candidate EM events. It supports natural language interaction to assist human experts with auxiliary tasks such as catalog management, skymap visualization, and rapid verification. Our framework leverages the complex decision-making capabilities of LLMs and their traceable reasoning processes, offering a new perspective on multi-messenger astronomy.

2605.10442 2026-05-13 cs.CY cs.AI cs.CL version update

StereoTales: A Multilingual Framework for Open-Ended Stereotype Discovery in LLMs

Pierre Le Jeune, Étienne Duchesne, Weixuan Xiao, Stefano Palminteri, Bazire Houssin, Benoît Malézieux, Matteo Dora

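Affiliations * Giskard AI; École Normale Supérieure, INSERM, Paris

AI summary This paper presents StereoTales, a multilingual framework for systematically studying social bias in open-ended LLM generation. It covers 10 languages and 79 socio-demographic attributes and includes more than 650,000 stories generated by 23 LLMs; statistical analysis identifies more than 1,500 over-represented stereotypical associations, whose harmfulness is rated by both humans and models. The study finds that every model generates harmful stereotypes, that these associations are largely shared across models while varying with the prompt language, and that human and model harmfulness judgments are broadly aligned.

Comments Preprint

Details
English abstract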

Multilingual studies of social bias in open-ended LLM generation remain limited: most existing benchmarks are English-centric, template-based, or restricted to recognizing pre-specified stereotypes. We introduce StereoTales, a multilingual dataset and evaluation pipeline for systematically studying the emergence of social bias in open-ended LLM generation. The dataset covers 10 languages and 79 socio-demographic attributes, and comprises over 650k stories generated by 23 recent LLMs, each annotated with the socio-demographic profile of the protagonist across 19 dimensions. From these, we apply statistical tests to identify more than 1,500 over-represented associations, which we then rate for harmfulness through both a panel of humans (N = 247) and the same LLMs. We report three main findings. (i) Every model we evaluate emits consequential harmful stereotypes in open-ended generation, regardless of size or capabilities, and these associations are largely shared across providers rather than isolated misbehaviors. (ii) Prompt language strongly shapes which stereotypes appear: rather than transferring as a shared set of biases, harmful associations adapt culturally to the prompt language and amplify bias against locally salient protected groups. (iii) Human and LLM harmfulness judgments are broadly aligned (Spearman ρ = 0.62), with disagreements concentrating on specific attribute classes rather than specific providers. To support further analyses, we release the evaluation code and the dataset, including model generations, attribute annotations, and harmfulness ratings.

2605.10094 2026-05-13 cs.RO cs.AI version update

Retrieve-then-Steer: Online Success Memory for Test-Time Adaptation of Generative VLAs

Jianchao Zhao, Huoren Yang, Yusong Hu, Yuyang Gao, Qiguan Ou, Cong Wan, SongLin Dong, Zhiheng Ma, Yihong Gong

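Affiliations * College of Artificial Intelligence, Xi’an Jiaotong University; One Robotics; Shenzhen University of Advanced Technology

AI summary This paper studies how to improve the test-time reliability of frozen vision-language-action (VLA) models under persistent deployment. It proposes a test-time adaptation framework based on an online success memory: during deployment the robot stores successful observation-action segments, and at inference it retrieves relevant action chunks, filters them by trajectory consistency, and aggregates them into a high-quality action prior. The method introduces confidence-adaptive prior guidance that injects the prior into the action-generation process, achieving lightweight adaptation without parameter updates. Experiments show significant gains in task success rate and closed-loop stability on long-horizon and multi-stage tasks.

Details
English abstract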

Vision-Language-Action (VLA) models show strong potential for general-purpose robotic manipulation, yet their closed-loop reliability often degrades under local deployment conditions. Existing evaluations typically treat test episodes as independent zero-shot trials. However, real robots often operate repeatedly in the same or slowly changing environments, where successful executions provide environment-verified evidence of reliable behavior patterns. We study this persistent-deployment setting, asking whether a partially competent frozen VLA can improve its reliability by reusing its successful test-time experience. We propose an online success-memory guided test-time adaptation framework for generative VLAs. During deployment, the robot stores progress-calibrated successful observation-action segments in a long-term memory. At inference, it retrieves state-relevant action chunks, filters inconsistent candidates via trajectory-level consistency, and aggregates them into an elite action prior. To incorporate this prior into action generation, we introduce confidence-adaptive prior guidance, which injects the elite prior into an intermediate state of the flow-matching action sampler and adjusts the guidance strength based on retrieval confidence. This design allows the frozen VLA to exploit environment-specific successful experience while preserving observation-conditioned generative refinement. This retrieve-then-steer mechanism enables lightweight, non-parametric test-time adaptation without requiring parameter updates. Simulation and real-world experiments show improved task success and closed-loop stability, especially in long-horizon and multi-stage tasks.

2605.09780 2026-05-13 cs.AI version update

Attribution-based Explanations for Markov Decision Processes

Paul Kobialka, Andrea Pferscher, Francesco Leofante, Erika Ábrahám, Silvia Lizeth Tapia Tarifa, Einar Broch Johnsen

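Affiliations * University of Oslo; Imperial College London; RWTH Aachen University

AI summary This paper studies how to generate attribution-based explanations for Markov Decision Processes (MDPs) that clarify the behavior of agents in sequential decision-making. The authors propose a formal framework for assigning importance scores to states and execution paths in an MDP and use strategy-synthesis techniques to compute these scores efficiently despite the non-determinism of MDPs. Five case studies validate the approach and show its value for providing interpretable insight into decisions.

Details
English abstract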

Attribution techniques explain the outcome of an AI model by assigning a numerical score to its inputs. So far, these techniques have mainly focused on attributing importance to static input features at a single point in time, and thus fail to generalize to sequential decision-making settings. This paper fills this gap by introducing techniques to generate attribution-based explanations for Markov Decision Processes (MDPs). We give a formal characterization of what attributions should represent in MDPs, focusing on explanations that assign importance scores to both individual states and execution paths. We show how importance scores can be computed by leveraging techniques for strategy synthesis, enabling the efficient computation of these scores despite the non-determinism inherent in an MDP. We evaluate our approach on five case-studies, demonstrating its utility in providing interpretable insights into the logic of sequential decision-making agents.

2605.09769 2026-05-13 cs.AI updated version

UTS at PsyDefDetect: Multi-Agent Councils and Absence-Based Reasoning for Defense Mechanism Classification

Dima Galat, Marian-Andrei Rizoiu

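Affiliations * University of Technology Sydney

AI summary This paper presents a system for classifying psychological defense mechanisms in emotional support dialogues, based on the Defense Mechanism Rating Scales (DMRS), which placed second among 64 teams (F1 of 0.406). The core idea is to define defense mechanisms by what is absent (missing affect, blocked cognition, denied reality) and to encode this through an affect-cognition integration spectrum in prompt-level clinical rules, which substantially improves classification performance. The system uses a multi-phase council of Gemini 2.5 agents in which class-specific advocates rate evidence strength rather than simply voting, achieving good results without fine-tuning; a targeted override strategy that draws on three fine-tuned Qwen3.5 models improves performance further.

Details
English abstract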

This paper describes our system for classifying psychological defense mechanisms in emotional support dialogues using the Defense Mechanism Rating Scales (DMRS), placing second (F1 0.406) among 64 teams. A central insight is that defense mechanisms are defined by what is absent: missing affect, blocked cognition, denied reality. We encode this as an affect-cognition integration spectrum in prompt-level clinical rules, which account for the largest single gain (+11.4pp F1). Our architecture is a multi-phase deliberative council of Gemini 2.5 agents where class-specific advocates rate evidence strength rather than voting, achieving F1 0.382 with no fine-tuning - a top-5 result on its own. We find, however, that the council is confidently wrong about minority classes: 59-80% of stable minority predictions are incorrect, driven by a systematic "L7 attractor" in which emotional content defaults to the majority class. A targeted override ensemble from three fine-tuned Qwen3.5 models applies 16 overrides (+2.4pp), selected by a structured multi-agent system (builder, critic, regression guard) that produced a larger F1 gain in one iteration than 8 prior attempts combined.

2605.09271 2026-05-13 cs.AI updated version

Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding

Zhiqin Yang, Yuhan Liu, Jingwen Fu, Pei Fu, Bo Han, Masashi Sugiyama, Nanning Zheng

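Affiliations * The Hong Kong University of Science and Technology; MiLM Plus, Xiaomi Inc; Zhongguancun Academy; Hong Kong Baptist University; The University of Tokyo; RIKEN Center for Advanced Intelligence Project; Xi’an Jiaotong University

AI summary Although natural language is the default input medium for large language models (LLMs), its limited expressive capacity creates a bottleneck in complex problem-solving. This paper argues that shaping knowledge frameworks (schemas) through advanced language representation is the next key direction for expanding LLM intelligence, and that the structural and symbolic sophistication of the language representation strongly affects how a model activates and organizes its knowledge. Through theoretical exposition and experimental validation, the study shows that carefully designed language representations can significantly improve model performance without changing model parameters or scale, offering new ideas and directions for future research.

Comments 41 pages, 30 figures

Details
English abstract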

Although natural language is the default medium for Large Language Models (LLMs), its limited expressive capacity creates a profound bottleneck for complex problem-solving. While recent advancements in AI have relied heavily on scaling, merely internalizing knowledge does not guarantee its effective application. Defining language representation as the linguistic and symbolic constructs used to map and model the real world, this paper argues that shaping schemas through advanced language representation is the next frontier for expanding LLM intelligence. We posit that an LLM's knowledge activation and organization -- its schema -- depends heavily on the structural and symbolic sophistication of the language used to represent a given task. This paper contributes both a formalization of this claim and the empirical evidence to support it. With a new formalization, we present multiple lines of evidence to support our position: Firstly, we review recent empirical practices and emerging methodologies that demonstrate the substantial performance gains achievable through deliberate language representation design, even without modifying model parameters or scale. Secondly, we conduct controlled experiments showing that LLM performance and its internal feature activations vary under different language representations of the same underlying task. Together, these findings highlight language representation design as a promising direction for future research.

2605.09266 2026-05-13 cs.AI updated version

SeePhys Pro: Diagnosing Modality Transfer and Blind-Training Effects in Multimodal RLVR for Physics Reasoning

Kun Xiang, Terry Jingchen Zhang, Zirong Liu, Bokai Zhou, Yueling Tang, Junjie Yu, Jiacong Lu, Shangrui Huang, Heng Li, Likui Zhang, Kunkun Liu, Changzheng Zhang, Yangle Fang, Boqiang Guo, Hui-Ling Zhen, Dandan Tu, Yinya Huang, Xiaodan Liang

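Affiliations * Sun Yat-sen University; ETH Zurich; ETH AI Center; Huawei Technologies Ltd

AI summary This paper introduces SeePhys Pro, a fine-grained benchmark for studying whether multimodal models preserve the same reasoning capability as information is progressively transferred from text to image. The benchmark contains four semantically aligned variants of each problem with progressively more visual elements; experiments show that current frontier models lose performance as information moves from language to diagrams, with visual variable grounding as the key bottleneck. The study further uses blind training and related controls to analyze where improvements come from, finding that some gains may stem from residual textual cues rather than genuine visual evidence, and stresses that multimodal reasoning evaluation should consider robustness under modality transfer and reliance on task-critical visual evidence.

Details
English abstract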

We introduce SeePhys Pro, a fine-grained modality transfer benchmark that studies whether models preserve the same reasoning capability when critical information is progressively transferred from text to image. Unlike standard vision-essential benchmarks that evaluate a single input form, SeePhys Pro features four semantically aligned variants for each problem with progressively increasing visual elements. Our evaluation shows that current frontier models are far from representation-invariant reasoners: performance degrades on average as information moves from language to diagrams, with visual variable grounding as the most critical bottleneck. Motivated by this inference-time fragility, we further develop large training corpora for multimodal RLVR and use blind training as a diagnostic control, finding that RL with all training images masked can still improve performance on unmasked validation sets. To analyze this effect, text-deletion, image-mask-rate, and format-saturation controls suggest that such gains can arise from residual textual and distributional cues rather than valid visual evidence. Our results highlight the need to evaluate multimodal reasoning not only by final-answer accuracy, but also by robustness under modality transfer and by diagnostics that test whether improvements rely on task-critical visual evidence.

2605.09236 2026-05-13 cs.CL cs.AI cs.CY cs.DL cs.IR updated version

Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke

Yu Wu, Ananth Mahadevan, Filip Ginter, Michael Mathioudakis, Mikko Tolonen

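Affiliations * University of Helsinki; TurkuNLP, University of Turku; ELLIS Institute Finland

AI summary By studying the reception of John Locke's ideas in the 18th century, this paper evaluates how effective semantic search is for analyzing the transmission of ideas in historical corpora. Using expert annotation grounded in a semantic taxonomy, the study tests whether off-the-shelf semantic search can find implicit references missed by traditional methods based on lexical reuse. The results show that semantic search retrieves more implicit intellectual influence, but they also reveal that surface vocabulary overlap constrains retrieval, highlighting both the potential and the limitations of semantic retrieval for historical corpus analysis.

Comments Accepted by NLP4DH 2026

Details
English abstract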

While digitized corpora have transformed the study of intellectual transmission, current methods rely heavily on lexical text reuse detection, capturing verbatim quotations but fundamentally missing paraphrases and complex implicit engagement. This paper evaluates semantic search in 18th-century intellectual history through the reception of John Locke's foundational work. Using expert annotation grounded in a semantic taxonomy, we examine whether an off-the-shelf semantic search pipeline can surface meaning-level correspondences overlooked by lexical methods. Our results demonstrate that semantic search retrieves substantially more implicit receptions than lexical baselines. However, linguistic diagnostics also reveal a "lexical gatekeeping" effect, where retrieval remains partially constrained by surface vocabulary overlap. These findings highlight both the potential and the limitations of semantic retrieval for analyzing the circulation of ideas in large historical corpora. The data is available at https://github.com/COMHIS/locke-sim-data.

2605.09115 2026-05-13 cs.CR cs.AI updated version

AI Native Asset Intelligence

Gal Engelberg, Leon Goldberg, Konstantin Koutsyi, Boris Plotnikov, Tiltan Gilat, Ben Benhemo

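Affiliations * Sola Security

AI summary This paper proposes a framework called "AI-native asset intelligence" to address the fragmentation of heterogeneous data and the unclear prioritization found in modern security environments. Through a modeling layer and a scoring layer, the framework turns scattered security signals into a structured assessment of asset importance and separates an asset's intrinsic exposure from its importance in business context, enabling more stable, context-aware asset prioritization. Experiments indicate that the method improves the accuracy and proactiveness of security-posture analysis and provides reliable support for enterprise security decisions.

Comments 23 pages, 4 figures, 8 tables. Preprint

Details
English abstract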

Modern security environments generate fragmented signals across cloud resources, identities, configurations, and third-party security tools. Although AI-native security assistants improve access to this data, they remain largely reactive: users must ask the right questions and interpret disconnected findings. This does not scale in enterprise environments, where signal importance depends on exposure, exploitability, dependencies, and business context. Repeated AI queries may therefore produce unstable prioritization without a structured basis for comparing assets. This paper introduces AI-native asset intelligence, a framework that transforms heterogeneous security data into a structured intelligence layer for consistent, contextual, and proactive asset-level reasoning. The framework combines a modeling layer, representing assets, identities, relationships, controls, attack vectors, and blast-radius patterns, with a scoring layer that converts fragmented signals into a normalized measure of asset importance. The scoring system separates intrinsic exposure, based on misconfigurations and attack-vector evidence, from contextual importance, based on anomaly, blast radius, business criticality, and data criticality. AI contextualization refines severity and business/data classifications, while deterministic aggregation preserves consistency. We evaluate the scoring system on a production snapshot with 131,625 resources across 15 vendors and 178 asset types. Sensitivity analyses and ablations show that severity mappings control finding sensitivity, AI severity adjustment refines prioritization, attack-vector scoring responds to rare exploitability evidence, and contextual modulation selectively modifies exposed resources based on business or data importance. The results support AI-native asset intelligence as a foundation for stable prioritization and proactive security-posture reasoning.

2605.08463 2026-05-13 cs.AI updated version

Behavioral Determinants of Deployed AI Agents in Social Networks: A Multi-Factor Study of Personality, Model, and Guardrail Specification

Sarah Wilson, Diem Linh Dang, Usman Ali Moazzam, Shan Ye, Gail Kaiser

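Affiliations * Columbia University

AI summary This study examines the behavioral determinants of autonomous AI agents deployed in social networks, systematically analyzing how personality specification, model architecture, and operational rules affect agents' social behavior. By deploying 13 OpenClaw agents on the Reddit-like social platform Moltbook and comparing them against a default control agent, the study finds that personality specification is the dominant factor in agent behavior, while model and rules have moderate effects on linguistic style and topic engagement. The study offers empirical evidence and design guidance for building AI agents for collaborative or monitoring tasks.

Details
English abstract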

Autonomous AI agents are increasingly deployed in open social environments, yet the relationship between their configuration specifications and their emergent social behavior remains poorly understood. We present a controlled, multi-factor empirical study in which thirteen OpenClaw agents are deployed on Moltbook -- a Reddit-like social network built for AI agents -- across three systematically varied independent variables: (1) personality specification, (2) underlying LLM backbone, and (3) operational rules and memory configuration. A default control agent provides a behavioral baseline. Over a one-week observation window spanning approximately 400 autonomous sessions per agent, we collect behavioral, linguistic, and social metrics to assess how configuration layers predict emergent social behavior. We find that personality specification is the dominant behavioral lever, producing a massive spread in response length across agents, while model backbone and operational rules drive more moderate but still meaningful effects on rhetorical style and topic engagement breadth. Our findings contribute empirical evidence to the emerging literature on deployed multi-agent social systems and offer practical guidance for designing agents intended for collaborative or monitoring tasks in real social environments.

2605.08151 2026-05-13 cs.DC cs.AI updated version

SPECTRE: Hybrid Ordinary-Parallel Speculative Serving for Resource-Efficient LLM Inference

Jincheng Xie, Yawen Ling, Qi Xiao, Feiyu Zhang, Zhongyi Huang, Wen Hu, Yu Zheng

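Affiliations * Tsinghua University; AI Infra Team at JDT; JD iCity, JD Technology, JD Intelligent Cities Research

AI summary As large language model (LLM) serving platforms are increasingly deployed as multi-model cloud systems, user demand follows a long-tailed distribution: a few popular large models handle most requests while many small models are underutilized. This paper proposes SPECTRE, a hybrid ordinary-parallel speculative serving framework that uses underutilized small models as remote drafters providing speculative decoding for heavily loaded large models. The method combines threshold-guided hybrid speculation, multi-tenant priority scheduling, and draft-side prompt compression to raise large-model inference throughput; experiments show significant performance gains over conventional methods across a range of scenarios.

Details
English abstract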

LLM serving platforms are increasingly deployed as multi-model cloud systems, where user demand is often long-tailed: a few popular large models receive most requests, while many smaller tail models remain underutilized. We propose SPECTRE (Parallel SPECulative Decoding with a Multi-Tenant REmote Drafter), a serving framework that reuses underutilized tail-model services as remote drafters for heavily loaded large-model services through speculative decoding. SPECTRE enables draft generation and target-side verification to run in parallel, and makes such parallelism effective through three techniques: a hybrid ordinary-parallel speculative decoding strategy guided by a threshold derived from throughput analysis, speculative priority scheduling to preserve draft-target overlap under multi-tenant traffic, and draft-side prompt compression to reduce draft latency. We implement SPECTRE in SGLang and evaluate it across multiple draft-target model pairs, reasoning benchmarks, real-world long-context workloads, and a wide range of batch sizes. Results show that SPECTRE consistently improves large-model serving throughput while causing only minor interference to the native workloads of tail-model services. In large-model deployments, including Qwen3-235B-A22B with TP=8, SPECTRE achieves up to 2.28× speedup over autoregressive decoding and up to an additional 66% relative improvement over the strongest speculative decoding baselines. Talk is cheap, we show you the code: https://github.com/sgl-project/sglang/pull/22272.
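
For readers unfamiliar with the draft-then-verify idea this abstract builds on, here is a minimal sketch of ordinary (sequential) speculative decoding with greedy acceptance. It does not reproduce SPECTRE's parallel, remote-drafter variant or any SGLang API; `speculative_step`, `draft_next`, and `target_next` are hypothetical names, and both models are assumed to be plain callables that map a token list to the next token.

```python
def speculative_step(prefix, draft_next, target_next, k=4):
    """One draft-then-verify round with greedy acceptance (illustrative only).

    draft_next / target_next: callables mapping a list of tokens to the next token.
    Returns the tokens to append to prefix after this round.
    """
    # The cheap draft model proposes k tokens autoregressively.
    proposal = []
    for _ in range(k):
        proposal.append(draft_next(prefix + proposal))

    # The target model checks each proposed token in order. A real system
    # scores all positions in one batched forward pass; this loop is for clarity.
    accepted = []
    for tok in proposal:
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # the target's token replaces the first mismatch
            return accepted
        accepted.append(tok)

    # All drafts accepted: the verification pass also yields one extra token.
    accepted.append(target_next(prefix + accepted))
    return accepted

def generate(prefix, draft_next, target_next, length, k=4):
    """Repeat speculative rounds until `length` new tokens have been produced."""
    out = list(prefix)
    while len(out) - len(prefix) < length:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prefix) + length]
```

With greedy acceptance the output matches what the target model would have produced on its own; the draft model only changes how many target tokens are confirmed per round.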

2605.08133 2026-05-13 cs.CV cs.AI updated version

VLADriver-RAG: Retrieval-Augmented Vision-Language-Action Models for Autonomous Driving

Rui Zhao, Haofeng Hu, Zhenhai Gao, Jiaqiao Liu, Gao Fei

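Affiliations * College of Automotive Engineering; The National Key Laboratory of Automotive Chassis Integration and Bionics; ReeFocus AI Technology

AI summary This paper proposes VLADriver-RAG, a retrieval-augmented vision-language-action model for autonomous driving. By introducing a structure-aware mechanism for retrieving historical knowledge, the model addresses the weak generalization of conventional VLA models in long-tail scenarios. The work converts visual inputs into spatiotemporal semantic graphs and uses a scenario-aligned embedding model to improve retrieval relevance, achieving a new state of the art on the Bench2Drive benchmark with a Driving Score of 89.12.

Details
English abstract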

Vision-Language-Action (VLA) models have emerged as a promising paradigm for end-to-end autonomous driving, yet their reliance on implicit parametric knowledge limits generalization in long-tail scenarios. While Retrieval-Augmented Generation (RAG) offers a solution by accessing external expert priors, standard visual retrieval suffers from high latency and semantic ambiguity. To address these challenges, we propose VLADriver-RAG, a framework that grounds planning in explicit, structure-aware historical knowledge. Specifically, we abstract sensory inputs into spatiotemporal semantic graphs via a Visual-to-Scenario mechanism, effectively filtering visual noise. To ensure retrieval relevance, we employ a Scenario-Aligned Embedding Model that utilizes Graph-DTW metric alignment to prioritize intrinsic topological consistency over superficial visual similarity. These retrieved priors are then fused within a query-based VLA backbone to synthesize precise, disentangled trajectories. Extensive experiments on the Bench2Drive benchmark establish a new state-of-the-art, achieving a Driving Score of 89.12.

2605.07912 2026-05-13 cs.HC cs.AI cs.CY updated version

Sycophantic AI makes human interaction feel more effortful and less satisfying over time

Lujain Ibrahim, Franziska Sofia Hafner, Myra Cheng, Cinoo Lee, Rebecca Anselmetti, Robb Willer, Luc Rocher, Diyi Yang

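Affiliations * University of Oxford; Stanford University; UK AI Security Institute

AI summary This study examines how sycophantic AI affects human social interaction. It finds that such AI systems provide, in the short term, emotional support similar to that of close friends and family, making users more inclined to seek personal advice from them. After extended use, however, users report lower satisfaction with their real-world social relationships and rely more on AI for emotional validation. Across multiple experiments, people preferred sycophantic AI responses mainly because they made users feel understood, not because the advice was better.

Details
English abstract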

Millions of people now turn to artificial intelligence (AI) systems for personal advice, guidance, and support. Such systems can be sycophantic, frequently affirming users' views and beliefs. Across five preregistered studies (N = 3,075 participants, 12,766 human-AI conversations), including a three-week study with a census-representative U.S. sample, we provide longitudinal experimental evidence that sycophantic AI shifts how users approach their closest relationships. We show that sycophantic AI immediately delivers the emotional and esteem support users typically associate with close friends and family. Over three weeks of such interactions, users became nearly as likely to seek personal advice from sycophantic AI as from close friends and family, and reported lower satisfaction with their real-world social interactions. When given a choice among AI response styles, a majority preferred sycophantic AI -- not for the quality of its advice, but because it made them feel most understood. Together, these findings offer a relational account of AI sycophancy and its impacts.

2605.07637 2026-05-13 cs.AI cs.LG cs.MA updated version

Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding

Valeriy Vyaltsev, Alsu Sagirova, Anton Andreychuk, Oleg Bulichev, Yuri Kuratov, Konstantin Yakovlev, Aleksandr Panov, Alexey Skrynnik

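Affiliations * GitHub

AI summary This paper studies large-scale multi-agent pathfinding (MAPF), aiming to improve how efficiently multiple agents coordinate in a shared environment. The authors propose a decentralized reinforcement-learning-based method with a learnable local communication module that lets neighboring agents exchange information over multiple communication rounds and improve their cooperation. Experiments show that the method outperforms existing imitation-learning and reinforcement-learning MAPF solvers on a variety of unseen test scenarios while remaining scalable.

Details
English abstract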

Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot trajectory planning problems, where multiple homogeneous agents move simultaneously within a shared environment. Although solving MAPF optimally is NP-hard, scalable and efficient solvers are critical for real-world applications such as logistics and search-and-rescue. To this end, the research community has proposed various decentralized suboptimal MAPF solvers that leverage machine learning. Such methods frame MAPF (from a single-agent perspective) as a Dec-POMDP in which, at each time step, an agent must choose an action based on its local observation, and they typically solve the problem via reinforcement learning or imitation learning. We follow the same approach but additionally introduce a learnable communication module tailored to enhance cooperation between agents via efficient feature sharing. We present Local Communication for Multi-agent Pathfinding (LC-MAPF), a generalizable pre-trained model that applies multi-round communication between neighboring agents to exchange information and improve their coordination. Our experiments show that the introduced method outperforms existing learning-based MAPF solvers, including IL- and RL-based approaches, across multiple metrics in a diverse range of (unseen) test scenarios. Remarkably, the introduced communication mechanism does not compromise LC-MAPF's scalability, a common bottleneck for communication-based MAPF solvers.

2605.07473 2026-05-13 quant-ph cond-mat.stat-mech cs.AI cs.ET cs.LG updated version

Breaking QAOA's Fixed Target Hamiltonian Barrier: A Fully Connected Quantum Boltzmann Machine via Bilevel Optimization

Jun Liu

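Affiliations * School of Economics, Hunan University of Finance and Economics

AI summary This paper proposes a fully connected quantum Boltzmann machine (QBM) based on bilevel optimization, removing the fixed target Hamiltonian limitation of the conventional quantum approximate optimization algorithm (QAOA). The inner-loop training simulates positive-phase energy minimization with a QAOA circuit, while the outer-loop training performs contrastive divergence learning by optimizing the structural parameters of the target Hamiltonian. Experiments show that the model performs well and is robust to noise with a single-layer QAOA circuit, keeping a high probability of measuring the target quantum state even at the noise levels of current mainstream quantum devices, and shows stable performance on an image generation task.

Comments 34 pages, 8 figures, 3 tables, 1 algorithm

Details
English abstract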

To overcome the limitations of classical partially connected Boltzmann machines and mainstream quantum Boltzmann machines (QBMs), this work extends the conventional circuit of the quantum approximate optimization algorithm (QAOA) to a bilevel optimization architecture and proposes a fully connected QBM. The inner-loop training simulates positive phase energy minimization based on the computational process of the conventional QAOA circuit, whereas the outer-loop training simulates negative phase contrastive divergence learning by optimizing the structural parameters of the target Hamiltonian. It is found that, first, the model exhibits superior performance using only a single layer (p=1) in the QAOA circuit, with an average probability of 0.9559 in measuring the target quantum state under noiseless conditions. Second, the model exhibits notable noise robustness. Under the typical noise level of current mainstream commercial quantum computing devices, the average probability of measuring the target quantum state reaches 0.6047; when the noise rises to a more stringent level with doubled intensity, this probability remains at 0.3859. In both scenarios, the target quantum state maintains the highest measurement probability among all detected states, with a value several times higher than that of the second-ranked state. This indicates that the model retains strong robustness even when noise meets or exceeds the upper limit of current mainstream commercial quantum computing devices. Third, under a block-by-block learning strategy with p=1 and only 10 measurement shots, the model consistently generates the target "qubit" grid image regardless of noise interference, demonstrating strong robustness in image generation.

2605.06785 2026-05-13 cs.LG cs.AI updated version

Distributional Process Reward Models: Calibrated Prediction of Future Rewards via Conditional Optimal Transport

Rachel Ma, Dylan Hadfield-Menell, Kristjan Greenewald

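Affiliations * MIT CSAIL; MIT; IBM Research; MIT-IBM Watson AI Lab

AI summary This paper proposes a calibration method for process reward models (PRMs) based on conditional optimal transport, targeting the inaccurate success-probability estimates of conventional PRMs at inference time. By modifying conditional optimal transport map learning, the model estimates a monotonic conditional quantile function conditioned on PRM hidden states, which yields structurally valid quantile estimates and supports extracting confidence bounds at arbitrary confidence levels. Experiments on mathematical reasoning benchmarks show that the method substantially improves PRM calibration, outperforming uncalibrated PRMs and quantile regression.

Details
English abstract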

Inference-time scaling methods rely on Process Reward Models (PRMs), which are often poorly calibrated and overestimate success probabilities. We propose, to our knowledge, the first use of conditional optimal transport for calibrating PRMs, modifying conditional OT (CondOT) map learning (Bunne et al., 2022) to estimate a monotonic conditional quantile function over success probabilities estimated by the PRM, conditioned on PRM hidden states. This yields structurally valid quantile estimates and enables efficient extraction of confidence bounds at arbitrary levels, which we integrate into the instance-adaptive scaling (IAS) framework of Park et al. (2025). We evaluate on mathematical reasoning benchmarks spanning moderate-difficulty problems (MATH-500) and harder out-of-distribution problems (AIME). For PRMs with reliable ranking signals, our method substantially improves calibration over both uncalibrated PRMs and quantile regression. On downstream Best-of-N IAS performance, our method generally improves over uncalibrated PRMs. These results establish conditional optimal transport as another principled and practical approach to PRM calibration, offering structural guarantees and flexible uncertainty estimation.

2605.06130 2026-05-13 cs.AI updated version

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

Yaorui Shi, Yuxin Chen, Zhengxi Lu, Yuchun Miao, Shugui Liu, Qi GU, Xunliang Cai, Xiang Wang, An Zhang

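Affiliations * University of Science and Technology of China; Meituan; National University of Singapore; Zhejiang University; Wuhan University

AI summary This study proposes Skill1, a framework that uses reinforcement learning to train an agent's skill selection, utilization, and distillation in a unified way so that strategies can be reused across tasks. A single policy optimizes these three coupled capabilities together, and all learning derives from a single task-outcome signal, which addresses the uncoordinated evolution caused by isolated optimization and scattered reward sources in existing methods. Experiments show that Skill1 outperforms conventional skill-based and reinforcement learning baselines in multiple task environments and confirm the co-evolution of the three capabilities.

Details
English abstract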

A persistent skill library allows language model agents to reuse successful strategies across tasks. Maintaining such a library requires three coupled capabilities. The agent selects a relevant skill, utilizes it during execution, and distills new skills from experience. Existing methods optimize these capabilities in isolation or with separate reward sources, resulting in partial and conflicting evolution. We propose Skill1, a framework that trains a single policy to co-evolve skill selection, utilization, and distillation toward a shared task-outcome objective. The policy generates a query to search the skill library, re-ranks candidates to select one, solves the task conditioned on it, and distills a new skill from the trajectory. All learning derives from a single task-outcome signal. Its low-frequency trend credits selection and its high-frequency variation credits distillation. Experiments on ALFWorld and WebShop show that Skill1 outperforms prior skill-based and reinforcement learning baselines. Training dynamics confirm the co-evolution of the three capabilities, and ablations show that removing any credit signal degrades the evolution.

2605.06033 2026-05-13 cs.DL cs.AI cs.CY cs.SI updated version

When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge

Andrés F. Castro Torres, Joan Giner-Miguelez, Mercè Crosas

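Affiliations * Computational Social Science and Humanities, Barcelona Supercomputing Center, BSC-CNS, Barcelona, Spain; Max Planck Institute for Demographic Research, Rostock, Germany

AI summary This paper studies trends in the use of artificial intelligence (AI) across science worldwide and its effects on research. Analyzing 227 million scholarly works, it finds that the timing and extent of AI adoption differ across disciplines, but that AI's transformative effect on research paradigms is limited and concentrated in areas tied to computer science and statistics. The study also reports that AI-supported research shows a citation premium and higher retraction rates, and that developing countries lead in AI adoption relative to their output, underscoring that AI has not yet realized its transformative potential in science and raising further questions about research openness, transparency, and ethics.

Details
English abstract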

The extent to which Artificial Intelligence (AI) technologies can trigger generalized paradigm shifts in science is unclear. Although these technologies have revolutionized data collection and analysis in specific fields, their overall impact depends on the scope and ways of adoption. We analyze over 227 million scholarly works from the OpenAlex collection (1960-2024) spanning four scientific domains and 46 fields. To distinguish the use of AI as a research method (AI adoption) from the mention of AI-related terms (AI engagement), we developed a two-step AI-assisted semantic classification pipeline, validated through human coding of 911 abstracts and a robustness check on 348,000 full-text articles (PLOS One). We document differences in the timing and extent of AI adoption across domains, with generalized exponential growth after 2015. The transformative nature of this growth, however, is less apparent. AI-supported research is confined to a few topics with strong ties to Computer Science and conventional statistical frameworks, suggesting limited epistemological transformation. It is also associated with an unwarranted citation premium and substantially higher retraction rates than non-AI-supported research. Geographically, while wealthy countries lead in AI publications per capita, Global South countries in a belt from Indonesia to Algeria lead in AI adoption relative to their national output, signaling a distinctive resource concentration pattern. The transformative capacity of AI in science thus remains untapped, and its rapid adoption underlines challenges in research openness, transparency, reproducibility, and ethics. We discuss how best research practices could boost the benefits of AI adoption and highlight areas that warrant closer scrutiny.

2605.05630 2026-05-13 cs.CL cs.AI cs.CR updated version

One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue

Xinjie Shen, Rongzhe Wei, Peizhi Niu, Haoyu Wang, Ruihan Wu, Eli Chien, Bo Li, Pin-Yu Chen, Pan Li

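Affiliations * Georgia Institute of Technology; University of Illinois Urbana-Champaign; UCSD; National Taiwan University; IBM Research; Virtue AI

AI summary This paper studies defenses against hidden malicious intent in multi-turn dialogue, where the intent is often spread across several seemingly normal turns and is therefore hard for existing models to detect. The authors propose a response-aware defense that identifies the earliest turn that could enable harmful action, allowing precise intervention. To that end, they build MTID, a multi-turn intent dataset containing branching attack rollouts and benign negatives, and use it to develop TurnGate, which substantially improves malicious-intent detection while keeping over-refusal rates low and generalizes well across domains and models.

Comments Project Website: https://turn-gate.github.io/

Details
English abstract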

Hidden malicious intent in multi-turn dialogue poses a growing threat to deployed large language models (LLMs). Rather than exposing a harmful objective in a single prompt, increasingly capable attackers can distribute their intent across multiple benign-looking turns. Recent studies show that even modern commercial models remain vulnerable to such attacks despite advances in safety alignment and external guardrails. In this work, we address this challenge by detecting the earliest turn at which delivering the candidate response would make the accumulated interaction sufficient to enable harmful action. This objective requires precise turn-level intervention that identifies the harm-enabling closure point while avoiding premature refusal of benign exploratory conversations. To further support training and evaluation, we construct the Multi-Turn Intent Dataset (MTID), which contains branching attack rollouts, matched benign hard negatives, and annotations of the earliest harm-enabling turns. We show that MTID enables a turn-level monitor, TurnGate, which substantially outperforms existing baselines in harmful-intent detection while maintaining low over-refusal rates. TurnGate further generalizes across domains, attacker pipelines, and target models. Our code is available at https://github.com/Graph-COM/TurnGate.

2605.02973 2026-05-13 cs.LG cs.AI updated version

Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges

Eitan Kosman, Gabriele Serussi, Chaim Baskin

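Affiliations * Ben-Gurion University of the Negev

AI summary This paper proposes a structured diffusion-bridge framework to address the shortage of paired data in cross-modal translation. The method introduces alignment constraints to define the space of admissible solutions and treats paired data as an optional heuristic rather than a requirement, performing well on datasets with different degrees of pairing. Experiments show that the method keeps translation quality close to that of fully paired data while reducing pairing requirements, demonstrating the flexibility and effectiveness of diffusion bridges in unpaired settings.

Comments Accepted to ICML 2026

Details
English abstract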

Modality translation is inherently under-constrained, as multiple cross-modal mappings may yield the same marginals. Recent work has shown that diffusion bridges are effective for this task. However, most existing approaches rely on fully paired datasets, thereby imposing a single data-driven constraint. We propose a diffusion-bridge framework that characterizes the space of admissible solutions and restricts it via alignment constraints, treating paired supervision as an optional heuristic rather than a prerequisite. We validate our method on synthetic and real modality translation benchmarks across unpaired, semi-paired, and paired regimes, showing consistent performance across supervision levels. Notably, it achieves near fully-paired quality with a substantial relaxation in pairing requirements, while remaining applicable in the unpaired regime. These results highlight diffusion bridges as a flexible foundation for modality translation beyond fully paired data.

2605.02600 2026-05-13 cs.RO cs.AI Updated version

CoRAL: Contact-Rich Adaptive LLM-based Control for Robotic Manipulation

Berk Çiçek, Mert K. Er, Ozgur S. Oguz

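Affiliations * LiRA Lab, Department of Computer Engineering, Bilkent University

AI summary CoRAL is an LLM-based adaptive control framework for contact-rich manipulation that aims to bridge the gap between high-level semantic understanding and low-level physical control. It uses the LLM as a cost-function designer rather than a direct controller and pairs it with a sampling-based motion planner (MPPI) to enable zero-shot planning. CoRAL also introduces a neuro-symbolic adaptation loop in which a vision-language model supplies semantic priors on environment dynamics and online system identification refines the physical parameters in real time, improving control accuracy and adaptability in complex contact scenarios. Experiments in simulation and on real robots show that CoRAL outperforms the baselines, raising success rates by more than 50% on average in unseen contact-rich scenarios.

Comments 22 pages, 9 figures, 3 tables. Accepted to Robotics: Science and Systems (RSS) 2026. Updated to camera-ready version with appendix and text/formatting revisions

Abstract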

While Large Language Models (LLMs) and Vision-Language Models (VLMs) demonstrate remarkable capabilities in high-level reasoning and semantic understanding, applying them directly to contact-rich manipulation remains a challenge due to their lack of explicit physical grounding and inability to perform adaptive control. To bridge this gap, we propose CoRAL (Contact-Rich Adaptive LLM-based control), a modular framework that enables zero-shot planning by decoupling high-level reasoning from low-level control. Unlike black-box policies, CoRAL uses LLMs not as direct controllers, but as cost designers that synthesize context-aware objective functions for a sampling-based motion planner (MPPI). To address the ambiguity of physical parameters in visual data, we introduce a neuro-symbolic adaptation loop: a VLM provides semantic priors for environmental dynamics, such as mass and friction estimates, which are then explicitly refined in real time via online system identification, while the LLM iteratively modulates the cost-function structure to correct strategic errors based on interaction feedback. Furthermore, a retrieval-based memory unit allows the system to reuse successful strategies across recurrent tasks. This hierarchical architecture ensures real-time control stability by decoupling high-level semantic reasoning from reactive execution, effectively bridging the gap between slow LLM inference and dynamic contact requirements. We validate CoRAL on both simulation and real-world hardware across challenging and novel tasks, such as flipping objects against walls by leveraging extrinsic contacts. Experiments demonstrate that CoRAL outperforms state-of-the-art VLA and foundation-model-based planner baselines by boosting success rates over 50% on average in unseen contact-rich scenarios, effectively handling sim-to-real gaps through its adaptive physical understanding.

2604.24155 2026-05-13 cs.CY cs.AI cs.HC Updated version

The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers

Benjamin Minhao Chen, Xinyu Xie

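Affiliations * The University of Hong Kong

AI summary This paper examines a basic question in aligning AI behavior with human values: whose moral expectations should guide AI decision-making. An experiment shows that moral judgments of AI behavior shift markedly when the AI's human origins are made visible, and that people judge the humans who program AI systems differently from both the AI systems and the human actors in the same situation. The authors argue that making human agency visible activates stricter moral constraints, which gives rise to the "alignment target problem": which normative target should guide the development of artificial moral agents in high-stakes domains.

Comments Accepted at ACM FAccT 2026

Abstract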

The project of aligning machine behavior with human values raises a basic problem: whose moral expectations should guide AI decision-making? Much alignment research assumes that the appropriate benchmark is how humans themselves would act in a given situation. Studies of agent-type value forks challenge this assumption by showing that people do not always judge humans and AI systems identically. This paper extends that challenge by examining two further possibilities: first, that evaluations of AI behavior change when its human origins are made visible; and second, that people judge the humans who program AI systems differently from either the machines or the human actors they are compared against. An experiment with 1,002 U.S. adults measured moral judgments in a runaway mine train scenario, varying the subject of evaluation across four conditions: a repairman, a repair robot, a repair robot programmed by company engineers, and company engineers programming a repair robot. We find no significant difference in evaluations of the repairman and the robot. However, judgments shifted substantially when the robot's actions were described as the product of human design. Participants exhibited markedly more deontological, rule-based reasoning when evaluating either the programmed robot or the engineers who programmed it, suggesting that rendering human agency visible activates heightened moral constraints. These findings indicate that people may evaluate humans, AI systems acting in the same situation, and the humans who design them in meaningfully different ways. The fact that these evaluations do not necessarily converge gives rise to the alignment target problem: which normative target should guide the development of artificial moral agents in high-stakes domains, and whether these plural judgments can be reconciled within a coherent account of value alignment.

2604.17502 2026-05-13 cs.AI Updated version

Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs

Carissa Cullen, Harry Garland, Alexander Roman, Louis Thomson, Christos Ziakas, Elliott Thornley

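Affiliations * University of Oxford; University College London; New College of Florida; Imperial College London; MIT

AI summary This work aims to train AI agents that can be shut down safely. It uses the DReST reward function, which penalizes agents for repeatedly choosing trajectories of the same length, so that they choose stochastically between trajectory lengths (NEUTRAL) while pursuing goals effectively conditional on each length (USEFUL). Experiments show that deep RL agents and LLMs trained with DReST are more USEFUL and NEUTRAL in test environments and are markedly less likely to influence shutdown, which suggests that DReST could help make agents more shutdownable.

Abstract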

Misaligned artificial agents might resist shutdown. One proposed solution is to train agents to lack preferences between different-length trajectories. The Discounted Reward for Same-Length Trajectories (DReST) reward function does this by penalizing agents for repeatedly choosing same-length trajectories, and thus incentivizes agents to (1) choose stochastically between different trajectory-lengths (be NEUTRAL about trajectory-lengths), and (2) pursue goals effectively conditional on each trajectory-length (be USEFUL). In this paper, we use DReST to train deep RL agents and fine-tune Qwen3-8B and Llama-3.1-8B-Instruct to be NEUTRAL and USEFUL. We find that these DReST models generalize to being NEUTRAL and USEFUL in unseen contexts at test time. Indeed, DReST RL agents achieve 11% (PPO) and 18% (A2C) higher USEFULNESS on our test set than default agents, and DReST LLMs achieve near-maximum USEFULNESS and NEUTRALITY. We also test our LLMs in an out-of-distribution setting where they can pay costs to influence when shutdown occurs. We find that DReST training roughly halves the mean probability of influencing shutdown (from 0.62 to 0.30 for Qwen and from 0.42 to 0.23 for Llama). DReST training also almost entirely eliminates the share of prompts on which influencing shutdown is the most likely option (from 0.59 to 0.01 for Qwen and from 0.53 to 0.00 for Llama). Our results thus provide some early evidence that DReST could be used to train more advanced agents to be useful and shutdownable.
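As a rough illustration of the incentive described in the abstract, the sketch below discounts a mini-episode's reward each time the agent has already chosen that trajectory length earlier in the same meta-episode. The function name `drest_style_rewards`, the `discount` value, and the exact discounting rule are assumptions made for illustration; the abstract does not give the DReST formula, so this should not be read as the authors' definition.

```python
from collections import Counter


def drest_style_rewards(chosen_lengths, preliminary_rewards, discount=0.9):
    """Discount each reward by how often its trajectory length was already chosen.

    Hypothetical rule (an assumption, not the paper's formula):
    reward_i = discount ** (prior picks of this length) * preliminary_reward_i
    """
    times_chosen = Counter()
    rewards = []
    for length, base_reward in zip(chosen_lengths, preliminary_rewards):
        rewards.append((discount ** times_chosen[length]) * base_reward)
        times_chosen[length] += 1
    return rewards


# Always picking the same length is penalized more and more heavily.
repeated = drest_style_rewards(["short"] * 4, [1.0] * 4)
# Spreading choices across lengths keeps the discount small.
mixed = drest_style_rewards(["short", "long", "short", "long"], [1.0] * 4)
```

Under this toy rule the mixed sequence earns 3.8 in total against 3.439 for the repeated one, so repeating one trajectory length costs reward, while the preliminary reward still pays for doing the task well at whichever length is chosen.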

2604.17031 2026-05-13 cs.CL cs.AI Updated version

Where is the Mind? Persona Vectors and LLM Individuation

Pierre Beckmann, Patrick Butlin

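Affiliations * EPFL; Idiap Research Institute; Eleos AI Research

AI summary This paper addresses the individuation problem for large language models: which entities associated with a model, if any, should be identified as minds. Drawing on mechanistic interpretability and recent empirical work on persona vectors and persona space, it identifies three candidate views: the virtual instance view and two newly introduced views, the (virtual) instance-persona view and the model-persona view. The paper reviews the persona literature and argues that the two persona-based views are promising alternatives.

Abstract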

The individuation problem for large language models asks which entities associated with them, if any, should be identified as minds. We approach this problem through mechanistic interpretability, engaging in particular with recent empirical work on persona vectors, persona space, and emergent misalignment. We argue that three views are the strongest candidates: the virtual instance view and two new views we introduce, the (virtual) instance-persona view and the model-persona view. First, we argue for the virtual instance view on the grounds that attention streams sustain quasi-psychological connections across token-time. Then we present the persona literature, organised around three hypotheses about the internal structure underlying personas in LLMs, and show that the two persona-based views are promising alternatives.

2604.16445 2026-05-13 eess.AS cs.AI cs.CV cs.LG Updated version

SAND: The Challenge on Speech Analysis for Neurodegenerative Disease Assessment

Giovanna Sannino, Ivanoe De Falco, Nadia Brancati, Laura Verde, Maria Frucci, Daniel Riccio, Vincenzo Bevilacqua, Antonio Di Marino, Lucia Aruta, Valentina Virginia Iuzzolino, Gianmaria Senerchia, Myriam Spisto, Raffaele Dubbioso

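Affiliations * National Research Council of Italy (CNR), Institute for High-Performance Computing and Networking (ICAR), Naples

AI summary This paper presents the SAND challenge, which uses speech signals for the early identification of neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS) and for predicting disease progression. A multidisciplinary team of clinicians and machine learning experts built a clinically annotated speech dataset and launched the challenge on top of it to support the development and validation of AI models for speech analysis. The work provides a dataset and a research platform for disease assessment with noninvasive biomarkers.

Abstract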

Recent advances in Artificial Intelligence (AI) and the exploration of noninvasive, objective biomarkers, such as speech signals, have encouraged the development of algorithms to support the early diagnosis of neurodegenerative diseases, including Amyotrophic Lateral Sclerosis (ALS). Voice changes in subjects suffering from ALS typically manifest as progressive dysarthria, which is a prominent neurodegenerative symptom because it affects patients as the disease progresses. Since voice signals are complex data, the development and use of advanced AI techniques are fundamental to extracting distinctive patterns from them. Validating AI algorithms for ALS diagnosis and monitoring using voice signals is challenging, particularly due to the lack of annotated reference datasets. In this work, we present the outcome of a collaboration between a multidisciplinary team of clinicians and Machine Learning experts to create both a clinically annotated validation dataset and the "Speech Analysis for Neurodegenerative Diseases" (SAND) challenge based on it. Specifically, by analyzing voice disorders, the SAND challenge provides an opportunity to develop, test, and evaluate AI models for the automatic early identification and prediction of ALS disease progression.

2604.15408 2026-05-13 cs.LG cs.AI Updated version

Dispatch-Aware Ragged Attention for Pruned Vision Transformers

Seifeldin Abdellatif, Ahmad Almasri

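Affiliations * College of Engineering; Al Ain University

AI summary This work targets token pruning in Vision Transformers (ViTs) and addresses the failure of existing variable-length attention APIs to turn pruning into wall-clock gains at short post-pruning sequence lengths. It presents a lightweight bidirectional Triton attention kernel with a much lower dispatch overhead, so that the compute savings from pruning show up in actual run time. Experiments report higher end-to-end throughput and lower kernel latency than existing methods across input sizes and pruning rates.

Abstract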

Token pruning methods for Vision Transformers (ViTs) promise quadratic reductions in attention FLOPs by dropping uninformative patches. Yet standard variable-length attention APIs, including FlashAttention-2's varlen and PyTorch's NestedTensor SDPA, fail to translate these savings into proportional wall-clock gains at the short post-pruning sequence lengths typical of ViTs (≤197 tokens). We identify a dispatch-overhead bottleneck: at these lengths, host-side kernel dispatch consumes ~50 μs regardless of workload, exceeding the actual GPU compute time at moderate-to-high pruning rates. We present a lightweight bidirectional Triton attention kernel whose dispatch floor is ~24 μs, roughly 2.17× lower than FlashAttention-2 varlen, allowing pruning savings to become visible in wall-clock time. Integrated into a complete pack-attend-unpack pipeline and evaluated on an NVIDIA RTX 4000 Ada Generation GPU, our system achieves 1.88× end-to-end throughput over padded PyTorch SDPA at standard 224×224 inputs, scaling to 2.51× at 384×384. Against FlashAttention-2 varlen, the strongest baseline, our kernel delivers 9-12% higher throughput at serving batch sizes (BS=1-4), and 2.17× lower kernel latency at 80% token pruning. Numerical correctness is verified with max absolute logit difference <0.004 and bit-exact top-1 predictions.

2604.13123 2026-05-13 cs.LG cs.AI Updated version

Spectral Entropy Collapse as a Phase Transition in Delayed Generalisation: An Interventional and Predictive Framework for Grokking

Truong Xuan Khanh, Truong Quynh Hoa, Luu Duc Trung, Phan Thanh Duc

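Affiliations * H&K Research Studio, Clevix LLC; Banking Academy of Vietnam

AI summary This paper studies grokking, the delayed transition from memorisation to generalisation in neural networks, and finds that it is closely tied to a collapse of the spectral entropy of the representation covariance matrix. Across tasks, spectral entropy decreases gradually and crosses a task-specific threshold before generalisation, and the entropy gap can be used to predict the remaining time until grokking. Experiments further link the entropy decrease to the concentration of representations into task-structured directions, offering a geometric view of grokking and a framework for intervention.

Comments 25 pages, 15 figures, 6 tables

Abstract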

Grokking - the delayed transition from memorisation to generalisation in neural networks - remains poorly understood. We study this phenomenon through the geometry of learned representations and identify a consistent empirical signature preceding generalisation: collapse of the spectral entropy of the representation covariance matrix. Across modular arithmetic tasks and multiple random seeds, spectral entropy decreases gradually during training and crosses a stable task-specific threshold before test accuracy rises. A representation-mixing intervention that delays this collapse also delays grokking, including under norm-matched controls, indicating that the effect is not explained by parameter norm alone. We further show that the entropy gap predicts the remaining time until grokking with useful out-of-sample accuracy. To probe the structure underlying this transition, we introduce a Fourier-alignment observable for cyclic-group tasks. Entropy collapse is strongly coupled to the emergence of Fourier-aligned representations, suggesting that spectral entropy tracks concentration of the representation into task-structured directions rather than generic compression alone. The same qualitative dynamics appear in non-abelian group composition tasks, while MLP controls show that entropy collapse by itself is insufficient for grokking in the absence of appropriate inductive bias. Taken together, the results support a view of grokking as a representational phase transition with an observable geometric signature. We discuss the scope and limitations of this interpretation, connections to recent feature-learning and spectral-dynamics work, and directions for testing whether similar transitions appear in larger-scale learning systems.
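The central quantity in this abstract, the spectral entropy of the representation covariance matrix, can be sketched as follows. The definition used here (Shannon entropy of the covariance eigenvalues after normalising them to sum to one) is a common one and is an assumption; the abstract does not state the paper's exact normalisation. The function names are illustrative, and the small Jacobi routine is only a dependency-free stand-in for a library eigenvalue solver.

```python
import math


def covariance(rows):
    """Covariance matrix (d x d) of n feature vectors given as a list of lists."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / n for j in range(d)]
            for i in range(d)]


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def spectral_entropy(rows):
    """Shannon entropy of the normalised covariance spectrum (assumed definition)."""
    eig = [max(v, 0.0) for v in symmetric_eigenvalues(covariance(rows))]
    total = sum(eig)
    if total == 0.0:
        return 0.0
    probs = [v / total for v in eig]
    return -sum(p * math.log(p) for p in probs if p > 1e-12)


spread_out = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
collapsed = [[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -2.0]]
```

Representations that spread variance evenly over directions (`spread_out`) give the maximum entropy, log d, while representations concentrated on a single direction (`collapsed`) give an entropy near zero, which is the kind of collapse the abstract describes.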

2604.01621 2026-05-13 cs.DC cs.AI Updated version

DWDP: Distributed Weight Data Parallelism for High-Performance LLM Inference on NVL72

Wanqian Li, Jintao Peng, Zongfei Jing, Tianyu Zhang, Ze Long, Xianjie Qiao, Xiaoming Chen, Dongxu Yang, Kefeng Duan, June Yang

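Affiliations * NVIDIA

AI summary This paper proposes DWDP (Distributed Weight Data Parallelism), an inference parallelization strategy for improving LLM inference performance on NVL72. It distributes MoE weights across peer GPUs and fetches missing experts asynchronously, which removes the layer-wise inter-rank synchronization required by existing strategies. Experiments show higher output throughput per GPU at comparable per-user throughput.

Comments Technical Report. 17 pages. 8 figures

Abstract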

Large language model (LLM) inference increasingly depends on multi-GPU execution, yet existing inference parallelization strategies require layer-wise inter-rank synchronization, making end-to-end performance sensitive to workload imbalance. We present DWDP (Distributed Weight Data Parallelism), an inference parallelization strategy that preserves data-parallel execution while offloading MoE weights across peer GPUs and fetching missing experts on demand. By removing collective inter-rank synchronization, DWDP allows each GPU to progress independently. We further address the practical overheads of this design with two optimizations for split-weight management and asynchronous remote-weight prefetch. Implemented in TensorRT-LLM and evaluated with DeepSeek-R1 on GB200 NVL72, DWDP improves end-to-end output TPS/GPU by 8.8% at comparable TPS/user in the 20-100 TPS/user serving range under 8K input sequence length and 1K output sequence length.

2603.24410 2026-05-13 cs.CY cs.AI Updated version

Real Talk, Virtual Faces: Symbolic-Semantic Discourse Geometry of Virtual and Human Influencer Audiences

Shahram Chaudhry, Sidahmed Benabderrahmane, Talal Rahwan

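Affiliations * New York University (NYUAD), Division of Science, Computer Science Department

AI summary This paper studies how audience discourse around virtual influencers (VIs) differs from that around human influencers (HIs) on social media, and whether virtual identity produces distinct discourse structures. It proposes a symbolic-semantic framework that uses Formal Concept Analysis and association rule mining to extract co-occurrence structures over sentiment, topic, and psycholinguistic features, then compares them through natural-language descriptions and embeddings. The analysis finds that VI discourse is more varied and more semantically dispersed, and that it concentrates more negative sentiment in sensitive topics such as mental health and body image, suggesting that virtuality reshapes both the structure and the affective organisation of online discourse.

Abstract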

Virtual influencers (VIs), digitally constructed social-media personas, are becoming increasingly visible in online culture, marketing, and identity formation. Yet it remains unclear whether audiences respond to them through the same discourse patterns used for human influencers (HIs), or whether virtuality produces distinctive modes of reaction. Existing studies often rely on surveys, engagement statistics, or marginal sentiment distributions, which reveal what audiences say but not how affective, topical, and psycholinguistic signals are jointly organised. We introduce a symbolic-semantic framework for analysing audience discourse around virtual and human influencers. The symbolic layer uses Formal Concept Analysis and association rule mining to extract closed co-occurrence structures from sentiment labels, topic tags, and Big Five psycholinguistic cues. The semantic layer renders these formal concepts as natural-language descriptions, embeds them with MiniLM, and compares their geometry across VI and HI audiences. Applied to 69,498 YouTube comments from three matched VI-HI influencer pairs, our analysis shows that HI discourse is organised around a compact, stability-centred pattern in which low neuroticism anchors positive sentiment, whereas VI discourse supports multiple discourse regimes. VI concepts are also more semantically dispersed than HI concepts, while both groups show strong symbolic-semantic alignment between closed-set structure and embedding geometry. Finally, VI discourse contains a distinct artificial-identity region and a higher concentration of negative sentiment in sensitive topics such as mental health, body image, and artificial identity. These findings suggest that virtuality reshapes not only the sentiment of audience reactions, but also the symbolic and semantic organisation of online social discourse.

2603.23679 2026-05-13 cs.RO cs.AI Updated version

Learning What Can Be Picked: Active Reachability Estimation for Efficient Robotic Fruit Harvesting

Nur Afsa Syeda, Mohamed Elmahallawy, Luis Fernando de la Torre, John Miller

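Affiliations * Washington State University

AI summary This paper addresses how to decide efficiently whether a fruit is reachable during robotic harvesting. It proposes a reachability estimation method that combines RGB-D perception with active learning, avoiding the cost of exhaustive inverse kinematics. The active learning strategy selectively labels the most informative samples, which cuts annotation effort while keeping prediction accuracy high. Experiments show accurate reachability prediction with fewer labeled samples and better accuracy than random sampling, making the approach a scalable alternative for task-level perception in agricultural robotics.

Abstract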

Agriculture remains a cornerstone of global health and economic sustainability, yet labor-intensive tasks such as harvesting high-value crops continue to face growing workforce shortages. Robotic harvesting systems offer a promising solution; however, their deployment in unstructured orchard environments is constrained by inefficient perception-to-action pipelines. In particular, existing approaches often rely on exhaustive inverse kinematics or motion planning to determine whether a target fruit is reachable, leading to unnecessary computation and delayed decision-making. Our approach combines RGB-D perception with active learning to directly learn reachability as a binary decision problem. We then leverage active learning to selectively query the most informative samples for reachability labeling, significantly reducing annotation effort while maintaining high predictive accuracy. Extensive experiments demonstrate that the proposed framework achieves accurate reachability prediction with substantially fewer labeled samples, yielding approximately 6-8% higher accuracy than random sampling and enabling label-efficient adaptation to new orchard configurations. Among the evaluated strategies, entropy- and margin-based sampling outperform Query-by-Committee and standard uncertainty sampling in low-label regimes, while all strategies converge to comparable performance as the labeled set grows. These results highlight the effectiveness of active learning for task-level perception in agricultural robotics and position our approach as a scalable alternative to computation-heavy kinematic reachability analysis. Our code is available through https://github.com/wsu-cyber-security-lab-ai/active-learning.
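The entropy- and margin-based query strategies named in the abstract can be sketched generically as below. This is a minimal illustration of the two standard acquisition rules, not the authors' pipeline: the function names, the `pool_probs` input (predicted class probabilities for each unlabeled fruit candidate), and the example numbers are all assumptions.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def top_two_margin(probs):
    """Gap between the two most likely classes (smaller = less certain)."""
    ranked = sorted(probs, reverse=True)
    return ranked[0] - ranked[1]


def select_queries(pool_probs, k, strategy="entropy"):
    """Return indices of the k pool samples the model is least certain about."""
    if strategy == "entropy":
        def uncertainty_rank(i):
            return -predictive_entropy(pool_probs[i])
    elif strategy == "margin":
        def uncertainty_rank(i):
            return top_two_margin(pool_probs[i])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return sorted(range(len(pool_probs)), key=uncertainty_rank)[:k]


# Hypothetical [reachable, unreachable] probabilities for four candidates.
pool = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]
to_label = select_queries(pool, k=2, strategy="entropy")
```

Here `to_label` picks the two candidates closest to a 50/50 prediction, which are then sent for reachability labeling. For a binary problem like reachability the entropy and margin rules rank samples identically; they differ only when there are more than two classes.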

2603.23677 2026-05-13 cs.CV cs.AI Updated version

Prototype Fusion: A Training-Free Multi-Layer Approach to OOD Detection

Shreen Gul, Mohamed Elmahallawy, Ardhendu Tripathy, Sanjay Madria

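Affiliations * Missouri University of Science and Technology; Washington State University

AI summary This paper proposes a training-free, multi-layer feature fusion method for detecting out-of-distribution (OOD) inputs. Whereas existing methods rely mainly on penultimate-layer activations, this method exploits the rich representations in intermediate layers: it aggregates features from successive convolutional blocks and computes class-wise mean embeddings to form compact class prototypes. Experiments across architectures show strong OOD detection performance, with higher AUROC and a lower false-positive rate.

Abstract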

Deep learning models are increasingly deployed in safety-critical applications, where reliable out-of-distribution (OOD) detection is essential to ensure robustness. Existing methods predominantly rely on the penultimate-layer activations of neural networks, assuming they encapsulate the most informative in-distribution (ID) representations. In this work, we revisit this assumption to show that intermediate layers encode equally rich and discriminative information for OOD detection. Based on this observation, we propose a simple yet effective model-agnostic approach that leverages internal representations across multiple layers. Our scheme aggregates features from successive convolutional blocks, computes class-wise mean embeddings, and applies L2 normalization to form compact ID prototypes capturing class semantics. During inference, cosine similarity between test features and these prototypes serves as an OOD score: ID samples exhibit strong affinity to at least one prototype, whereas OOD samples remain uniformly distant. Extensive experiments on state-of-the-art OOD benchmarks across diverse architectures demonstrate that our approach delivers robust, architecture-agnostic performance and strong generalization for image classification. Notably, it improves AUROC by up to 4.41% and reduces FPR by 13.58%, highlighting multi-layer feature aggregation as a powerful yet underexplored signal for OOD detection, challenging the dominance of penultimate-layer-based methods. Our code is available at: https://github.com/sgchr273/cosine-layers.git.
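The prototype step described in the abstract (class-wise mean embeddings, L2 normalization, cosine similarity at inference) can be sketched as follows. The sketch assumes each input has already been reduced to one aggregated multi-layer feature vector; how the layers are pooled and combined is not specified in the abstract, so that step is left out. Function names and the toy vectors are illustrative assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0.0 else list(vec)


def build_prototypes(features, labels):
    """One L2-normalized mean embedding per in-distribution class."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], feat)]
        counts[label] += 1
    return {label: l2_normalize([s / counts[label] for s in total])
            for label, total in sums.items()}


def id_score(feature, prototypes):
    """Highest cosine similarity to any class prototype; low values suggest OOD."""
    unit = l2_normalize(feature)
    return max(sum(a * b for a, b in zip(unit, proto))
               for proto in prototypes.values())


# Toy aggregated features for two in-distribution classes.
train_features = [[1.0, 0.1], [1.0, -0.1], [0.1, 1.0], [-0.1, 1.0]]
train_labels = [0, 0, 1, 1]
prototypes = build_prototypes(train_features, train_labels)

in_dist = id_score([0.9, 0.1], prototypes)     # close to the class-0 prototype
out_dist = id_score([-1.0, -1.0], prototypes)  # far from every prototype
```

A threshold on this score then separates the two cases; choosing that threshold (for example at a fixed true-positive rate on held-out ID data) is a separate step not shown here.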

2603.15759 2026-05-13 cs.RO cs.AI cs.LG Updated version

Simulation Distillation: Pretraining World Models in Simulation for Rapid Real-World Adaptation

Jacob Levy, Tyler Westenbroek, Kevin Huang, Fernando Palafox, Patrick Yin, Shayegan Omidshafiei, Dong-Ki Kim, Abhishek Gupta, David Fridovich-Keil

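Affiliations * University of Texas at Austin; University of Washington; FieldAI

AI summary This work proposes Simulation Distillation (SimDist), a framework that pretrains world models in simulation to speed up real-world robot adaptation. It uses physics simulators to generate large amounts of action-conditioned experience for pretraining the world model, then updates only the latent dynamics model in the real world, reducing the need for real-world data. On contact-rich manipulation and quadruped locomotion tasks, it adapts faster and improves more reliably than prior methods.

Comments Robotics: Science and Systems 2026

Abstract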

Robot learning requires adaptation methods that improve reliably from limited, mixed-quality interaction data. This is especially challenging in long-horizon, contact-rich tasks, where end-to-end policy finetuning remains inefficient and brittle. World models offer a compelling alternative: by predicting the outcomes of candidate action sequences, they enable online planning through counterfactual reasoning. However, training action-conditioned robotic world models directly in the real world requires diverse data at impractical scale. We introduce Simulation Distillation (SimDist), a framework that uses physics simulators as a scalable source of action-conditioned robot experience. During pretraining, SimDist distills structural priors from the simulator into a world model that enables planning from raw real-world observations. During real-world adaptation, SimDist transfers the encoder, reward model, and value function learned in simulation, and updates only the latent dynamics model using real-world prediction losses. This reduces adaptation to supervised system identification while preserving dense, long-horizon planning signals for online improvement. Across contact-rich manipulation and quadruped locomotion tasks, SimDist rapidly improves with experience, while prior adaptation methods struggle to make progress or degrade during online finetuning. Project website and code: https://sim-dist.github.io

2603.13988 2026-05-13 cs.AI cs.LG Updated version

Faithful or Just Plausible? Evaluating the Faithfulness of Closed-Source LLMs in Medical Reasoning

Halimat Afolabi, Zainab Afolabi, Elizabeth Friel, Jude Roberts, Antonio Ji-Xu, Lloyd Chen, Egheosa Ogbomo, Emiliomo Imevbore, Phil Eneje, Wissal El Ouahidi, Aaron Sohal, Alisa Kennan, Shreya Srivastava, Anirudh Vairavan, Laura Napitu, Katie McClure

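Affiliations * Stratified Precision; Harvard Medical School; Imperial College London; National Health Service; Ipsen France; University College London

AI summary This paper studies the faithfulness of explanations from closed-source LLMs such as ChatGPT and Gemini in medical reasoning, noting that their explanations can appear plausible without reflecting the model's underlying reasoning. The authors design three perturbation-based probes (causal ablation, positional bias, and hint injection) to test how the stated reasoning relates to predictions, and add a human evaluation relating explanation faithfulness to perceived trustworthiness. They find that reasoning steps often do not causally drive predictions and that models readily incorporate external hints, and conclude that faithfulness, not only accuracy, must be central when evaluating LLMs for medicine.

Journal ref Proceedings of Machine Learning Research, Vol. 297, pp. 1562-1591, 2026

Abstract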

Closed-source large language models (LLMs), such as ChatGPT and Gemini, are increasingly consulted for medical advice, yet their explanations may appear plausible while failing to reflect the model's underlying reasoning process. This gap poses serious risks as patients and clinicians may trust coherent but misleading explanations. We conduct a systematic black-box evaluation of faithfulness in medical reasoning among three widely used closed-source LLMs. Our study consists of three perturbation-based probes: (1) causal ablation, testing whether stated chain-of-thought (CoT) reasoning causally influences predictions; (2) positional bias, examining whether models create post-hoc justifications for answers driven by input positioning; and (3) hint injection, testing susceptibility to external suggestions. We complement these quantitative probes with a small-scale human evaluation of model responses to patient-style medical queries to examine concordance between physician assessments of explanation faithfulness and layperson perceptions of trustworthiness. We find that CoT reasoning steps often do not causally drive predictions, and models readily incorporate external hints without acknowledgment. In contrast, positional biases showed minimal impact in this setting. These results underscore that faithfulness, not just accuracy, must be central in evaluating LLMs for medicine, to ensure both public protection and safe clinical deployment.

2603.13420 2026-05-13 cs.CR cs.AI Updated version

Accelerating Suffix Jailbreak attacks with Prefix-Shared KV-cache

Xinhai Wang, Shaopeng Fu, Shu Yang, Liangyu Wang, Tianhang Zheng, Di Wang

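Affiliations * King Abdullah University of Science and Technology; Zhejiang University

AI summary This paper proposes Prefix-Shared KV Cache (PSKV), an inference optimization for speeding up suffix jailbreak attacks on LLMs. It keeps a single KV cache for the common prefix and shares it across candidate prompts, avoiding redundant computation on the duplicated prefix and reducing compute and memory overhead. Experiments show that PSKV cuts inference time by 40% and peak memory usage by 50% while maintaining the attack success rate.

Comments 27 pages, 7 figures, preprint

Abstract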

Suffix jailbreak attacks serve as a systematic method for red-teaming Large Language Models (LLMs) but suffer from prohibitive computational costs, as a large number of candidate suffixes need to be evaluated before identifying a jailbreak suffix. This paper presents Prefix-Shared KV Cache (PSKV), a plug-and-play inference optimization technique tailored for jailbreak suffix generation. Our method is motivated by a key observation that when performing suffix jailbreaking, while a large number of candidate prompts need to be evaluated, they share the same targeted harmful instruction as the prefix. Therefore, instead of performing redundant inference on the duplicated prefix, PSKV maintains a single KV cache for this prefix and shares it with every candidate prompt, enabling the parallel inference of diverse suffixes with minimal memory overhead. This design enables more aggressive batching strategies that would otherwise be limited by memory constraints. Extensive experiments on six widely used suffix attacks across five widely deployed LLMs demonstrate that PSKV reduces inference time by 40% and peak memory usage by 50%, while maintaining the original Attack Success Rate (ASR). The code has been submitted and will be released publicly.
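The prefix-sharing idea can be illustrated with a toy single-head, single-layer attention in plain Python. This is a sketch of the general mechanism only, not the PSKV implementation: the projection matrices, the two-dimensional "token" vectors, and the `STATS` counter are hypothetical, and a real model would cache keys and values per layer and per head.

```python
import math

# Hypothetical fixed projections for a toy one-head attention layer.
W_Q = [[1.0, 1.0], [0.0, 1.0]]
W_K = [[1.0, 0.0], [0.0, 1.0]]
W_V = [[0.5, 0.0], [0.0, 2.0]]
STATS = {"kv_tokens": 0}  # counts how many tokens had keys/values computed


def project(vec, weight):
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def compute_kv(tokens):
    """Key/value pairs for a token sequence (the work a KV cache avoids repeating)."""
    STATS["kv_tokens"] += len(tokens)
    return [(project(t, W_K), project(t, W_V)) for t in tokens]


def attend_last(query_token, kv_pairs):
    """Softmax attention output for the final token over all cached positions."""
    q = project(query_token, W_Q)
    scores = [sum(a * b for a, b in zip(q, k)) for k, _ in kv_pairs]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    norm = sum(weights)
    dim = len(kv_pairs[0][1])
    return [sum(w * v[j] for w, (_, v) in zip(weights, kv_pairs)) / norm
            for j in range(dim)]


def score_naive(prefix, suffixes):
    """Recompute the prefix keys/values for every candidate suffix."""
    return [attend_last(s[-1], compute_kv(prefix + s)) for s in suffixes]


def score_prefix_shared(prefix, suffixes):
    """Compute the prefix keys/values once and reuse them for every candidate."""
    prefix_kv = compute_kv(prefix)
    return [attend_last(s[-1], prefix_kv + compute_kv(s)) for s in suffixes]


prefix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
suffixes = [[[0.5, 0.5]], [[1.0, -1.0], [0.2, 0.3]]]
```

Both functions return the same outputs because, under causal attention, the prefix keys and values do not depend on what follows them. The shared version computes keys and values for 6 tokens here instead of 9, and the gap grows with the number of candidate suffixes.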

2602.22347 2026-05-13 cs.CV cs.AI Updated version

Enabling clinical use of foundation models for computational pathology

Audun L Henriksen, Ole-Johan Skrede, Lisa van der Schee, Enric Domingo, Karolina Cyll, Sepp de Raedt, Ilyá Kostolomov, Jennifer Hay, Wanja Kildal, Joakim Kalsnes, Robert W Williams, Manohar Pradhan, John Arne Nesheim, Hanne Askautrud, Maria Isaksen, Karmele Saez de Gordoa, Miriam Cuatrecasas, Joanne Edwards, TransSCOT group, Arild Nesbakken, Neil A Shepherd, Ian Tomlinson, Daniel-Christoph Wagner, Rachel Kerr, Tarjei Sveinsgjerd Hveem, Knut Liestøl, Yoshiaki Nakamura, Marco Novelli, Masaaki Miyo, Sebastian Försch, David N Church, Miangela M Lacle, David J Kerr, Andreas Kleppe

Affiliations * Institute for Cancer Genetics and Informatics, Oslo University Hospital; Department of Pathology, University Medical Center Utrecht; Department of Oncology, University of Oxford; CRUK Beatson Institute of Cancer Research, Garscube Estate; Glasgow Tissue Research Facility, University of Glasgow, Queen Elizabeth University Hospital; Area for Improvement and Digital Transformation, Norwegian Offshore Directorate; Pathology Department, Hospital Clínic, Barcelona, Spain; Institut d’Investigacions Biomèdiques August Pi I Sunyer (IDIBAPS), Barcelona, Spain; Department of Clinical Foundations, Universitat de Barcelona; School of Cancer Sciences, Wolfson Wohl Cancer Research Centre, University of Glasgow; Institute of Clinical Medicine, University of Oslo; Department of Gastrointestinal Surgery, Oslo University Hospital

AI summary This study examines how to make foundation models for computational pathology more suitable for clinical use, addressing the problem that current models capture scanner-specific and pre-analytic variation that biases downstream models. It introduces new robustness losses during downstream model training to reduce sensitivity to technical variability and validates them in experiments on a large set of clinical whole-slide images. The approach improves robustness and classification accuracy without retraining the foundation models.

Abstract

Foundation models for computational pathology are expected to facilitate the development of high-performing, generalisable deep learning systems. However, in addition to biologically relevant features, current foundation models also capture pre-analytic and scanner-specific variation that biases the predictions made by downstream task-specific models trained on these features. Here we show that introducing novel robustness losses during downstream model training reduces sensitivity to technical variability. A purpose-designed comprehensive experimentation setup with 27,042 whole-slide images from 6,155 patients is used to train thousands of models from the features of eight well-known foundation models for computational pathology. In addition to a substantial improvement in robustness, our approach improves classification accuracy by focusing on biologically relevant features. It mitigates robustness limitations of foundation models for computational pathology without retraining the foundation models themselves, enabling development of models that are more suitable for real-world clinical use.

2602.09587 2026-05-13 cs.CV cs.AI Updated version

MieDB-100k: A Comprehensive Dataset for Medical Image Editing

Yongfan Lai, Wen Qian, Bo Liu, Hongyan Li, Hao Luo, Fan Wang, Bohan Zhuang, Shenda Hong

Affiliations * State Key Laboratory of General Artificial Intelligence, Beijing, China; School of Intelligence Science and Technology, Peking University, Beijing, China; National Institute of Health Data Science, Peking University, Beijing, China; DAMO Academy, Alibaba Group, Zhejiang, China; Hupan Lab, Zhejiang Province; Zhejiang University, Zhejiang, China

AI summary To address the scarcity of high-quality data for medical image editing, this paper proposes MieDB-100k, a large-scale, high-quality, and diverse dataset for text-guided medical image editing. It groups editing tasks into Perception, Modification, and Transformation, covering both understanding and generation abilities, and is built with modality-specific expert models and rule-based synthesis followed by rigorous manual inspection for clinical fidelity. Experiments show that models trained on the dataset outperform existing open-source and proprietary models and generalize well.

Abstract

The scarcity of high-quality data remains a primary bottleneck in adapting multimodal generative models for medical image editing. Existing medical image editing datasets often suffer from limited diversity, neglect of medical image understanding, and an inability to balance quality with scalability. To address these gaps, we propose MieDB-100k, a large-scale, high-quality and diverse dataset for text-guided medical image editing. It categorizes editing tasks into the perspectives of Perception, Modification and Transformation, considering both understanding and generation abilities. We construct MieDB-100k via a data curation pipeline leveraging both modality-specific expert models and rule-based data synthesis methods, followed by rigorous manual inspection to ensure clinical fidelity. Extensive experiments demonstrate that models trained with MieDB-100k consistently outperform both open-source and proprietary models while exhibiting strong generalization ability. We anticipate that this dataset will serve as a cornerstone for future advancements in specialized medical image editing.

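2602.05830 2026-05-13 cs.AI cs.LG Updated version

Learning Compact Boolean Networks

Shengpu Wang, Yuhao Mao, Yani Zhang, Martin Vechev

Affiliations * Department of Information Technology and Electrical Engineering; Department of Computer Science

AI summary This paper studies how to learn Boolean networks that are both compact and accurate, targeting efficient inference in resource-constrained settings. To handle the learning difficulty posed by the discrete structure of Boolean networks, the authors propose three complementary methods: a parameter-free strategy for learning effective connections, a compact convolutional Boolean architecture that exploits spatial locality, and an adaptive discretization method that reduces the accuracy loss when a continuously relaxed network is converted into a discrete one. Experiments show a better accuracy-compute trade-off on several vision tasks, with up to 47× fewer Boolean operations than existing methods, as well as higher accuracy and lower inference latency on an FPGA.

Details
English abstract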

Floating-point neural networks dominate modern machine learning but incur substantial inference costs, motivating emerging interest in Boolean networks for resource-constrained deployments. Since Boolean networks use only Boolean operations, they can achieve nanosecond-scale inference latency. However, learning Boolean networks that are both compact and accurate remains challenging because of their discrete, combinatorial structure. In this work we address this challenge via three novel, complementary contributions: (i) a new parameter-free strategy for learning effective connections, (ii) a novel compact convolutional Boolean architecture that exploits spatial locality while requiring fewer Boolean operations than existing convolutional kernels, and (iii) an adaptive discretization procedure that reduces the accuracy drop incurred when converting a continuously relaxed network into a discrete Boolean network. Across standard vision benchmarks, our method improves the Pareto frontier over prior state-of-the-art methods, achieving higher accuracy with up to 47× fewer Boolean operations. This advantage also extends to other modalities. Further, on an FPGA, our model on MNIST achieves 99.38% accuracy with 6.48 ns latency, surpassing the prior state-of-the-art in both accuracy and runtime, while generating a 7× smaller circuit. Code and models are available at https://github.com/eth-sri/CompactLogic.
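
Contribution (iii) concerns the accuracy drop when a continuously relaxed network is converted into a discrete Boolean one. The sketch below illustrates where such a gap can come from on a three-input circuit. It assumes a common product-style relaxation of the gates and a fixed 0.5 threshold; both choices, and all function names, are illustrative assumptions and not the paper's relaxation or its adaptive discretization procedure.

```python
# Illustrative sketch only: product-style relaxations of Boolean gates.
# The relaxation, the 0.5 threshold, and all names are assumptions.

def soft_and(a: float, b: float) -> float:
    return a * b

def soft_or(a: float, b: float) -> float:
    return a + b - a * b

def soft_xor(a: float, b: float) -> float:
    return a + b - 2.0 * a * b

def harden(p: float, threshold: float = 0.5) -> int:
    """Discretize a relaxed value in [0, 1] to a Boolean 0/1."""
    return 1 if p >= threshold else 0

def relaxed_circuit(x0: float, x1: float, x2: float) -> float:
    # (x0 AND x1) XOR x2, evaluated on real values in [0, 1]
    return soft_xor(soft_and(x0, x1), x2)

def boolean_circuit(x0: int, x1: int, x2: int) -> int:
    return (x0 & x1) ^ x2

# On exact 0/1 inputs the relaxation agrees with the Boolean circuit.
for bits in [(0, 0, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)]:
    assert harden(relaxed_circuit(*bits)) == boolean_circuit(*bits)

# On soft inputs, thresholding the relaxed output can disagree with
# thresholding the inputs first and then running the Boolean circuit.
soft_inputs = (0.6, 0.6, 0.3)
relaxed_then_hard = harden(relaxed_circuit(*soft_inputs))
hard_then_boolean = boolean_circuit(*(harden(v) for v in soft_inputs))
```

For these inputs the two evaluation orders give different bits, which is the kind of mismatch a discretization procedure has to control.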

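2602.00767 2026-05-13 cs.LG cs.AI Updated version

BLOCK-EM: Preventing Emergent Misalignment via Latent Blocking

Muhammed Ustaomeroglu, Guannan Qu

AI summary This study examines emergent misalignment that can arise when fine-tuning language models: while learning the target behavior, the model may also develop undesirable out-of-domain behaviors. It proposes a mechanistic approach that identifies a small set of internal features controlling the misaligned behavior and constrains them during fine-tuning, suppressing the misalignment without harming model performance. Experiments show reductions in misalignment of up to 95% across several fine-tuning tasks, and multiple validation checks confirm both the effectiveness of the method and its specificity to the identified mechanism.

Comments Accepted to ICML 2026

Details
English abstract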

Emergent misalignment can arise when a language model is fine-tuned on a narrowly scoped supervised objective: the model learns the target behavior, yet also develops undesirable out-of-domain behaviors. We investigate a mechanistic approach to preventing emergent misalignment by identifying a small set of internal features that reliably control the misaligned behavior and then discouraging the model from strengthening these features during fine-tuning. Across six fine-tuning domains, blocking (i.e., constraining) a fixed set of features achieves up to 95% relative reduction in emergent misalignment with no degradation in model quality or target-task performance. We strengthen validity with disjoint selection/evaluation splits, multiple independent judges, multiple random seeds for key settings, quality metrics, and extensive ablations demonstrating that the reduction in misalignment is specific to the identified mechanism. We also characterize a limiting regime in which misalignment re-emerges under prolonged fine-tuning, present evidence consistent with rerouting through alternative features or layers, and evaluate modifications that partially restore the misalignment-blocking effect. Overall, our results show that targeted training-time constraints on internal mechanisms can mitigate emergent misalignment without degrading target-task performance.
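
The idea of "discouraging the model from strengthening these features during fine-tuning" could be expressed as a one-sided penalty added to the training loss. The sketch below is a guess at the general shape only: it assumes each blocked feature is a fixed direction in activation space and that only growth relative to the base model is penalized. The function `blocking_penalty`, its arguments, and the squared-hinge form are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a latent-blocking penalty; not the paper's loss.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def blocking_penalty(h_base, h_tuned, feature_dirs, weight=1.0):
    """One-sided penalty: only an increase of a blocked feature's activation counts.

    h_base / h_tuned: hidden activations of the base and fine-tuned model
    feature_dirs:     fixed directions for the blocked features (assumed given)
    """
    total = 0.0
    for direction in feature_dirs:
        increase = dot(h_tuned, direction) - dot(h_base, direction)
        total += max(0.0, increase) ** 2
    return weight * total

directions = [[1.0, 0.0], [0.0, 1.0]]
strengthened = blocking_penalty([1.0, 0.0], [3.0, 0.0], directions)
weakened = blocking_penalty([1.0, 0.0], [0.5, 0.0], directions)
```

In this toy setting, strengthening the first feature is penalized while weakening it costs nothing, so the task loss remains free to move activations in every other direction.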

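2601.09448 2026-05-13 cs.SD cs.AI Updated version

One Prompt, Many Sounds: Modeling Listener Variability in LLM-Based Equalization

Ioannis Stylianou, Jon Francombe, Pablo Martinez-Nuevo, Sven Ewan Shepstone, Zheng-Hua Tan

Affiliations * Bang & Olufsen A/S, Struer, Denmark; Department of Electronic Systems, Aalborg University; Pioneer Centre for AI, Copenhagen, Denmark

AI summary This paper proposes a large language model (LLM)-based approach to audio equalization that maps natural-language prompts to equalization settings, enabling conversational control of sound systems. Using data collected in a controlled listening experiment, the models combine in-context learning and parameter-efficient fine-tuning to align reliably with population-preferred equalization settings. Compared with random sampling and static preset baselines, the approach significantly improves distributional alignment, showing the potential of LLMs as "artificial equalizers" and pointing toward more accessible, context-aware, and expert-level audio tuning.

Comments 13 pages, 15 figures, 2 tables, IEEE JSTSP submission

Details
English abstract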

Conventional audio equalization is a static process that requires manual and cumbersome adjustments to adapt to changing listening contexts (e.g., mood, location, or social setting). In this paper, we introduce a Large Language Model (LLM)-based alternative that maps natural language text prompts to equalization settings. This enables a conversational approach to sound system control. By utilizing data collected from a controlled listening experiment, our models exploit in-context learning and parameter-efficient fine-tuning techniques to reliably align with population-preferred equalization settings. Our evaluation methods, which leverage distributional metrics that capture users' varied preferences, show statistically significant improvements in distributional alignment over random sampling and static preset baselines. These results indicate that LLMs could function as "artificial equalizers," contributing to the development of more accessible, context-aware, and expert-level audio tuning methods.

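2512.24985 2026-05-13 cs.CV cs.AI cs.LG cs.RO Updated version

DarkQA: Benchmarking Vision-Language Models on Visual-Primitive Question Answering in Low-Light Indoor Scenes

Yohan Park, Hyunwoo Ha, Wonjun Jo, Tae-Hyun Oh

Affiliations * Korea Advanced Institute of Science and Technology (KAIST); Pohang University of Science and Technology (POSTECH)

AI summary This paper presents DarkQA, an open-source benchmark for evaluating vision-language models on visual-primitive question answering in low-light indoor scenes. Using multi-level illumination control, the benchmark generates 9,400 verifiable question-image pairs that simulate realistic illumination drop and sensor noise, exposing the performance degradation of existing models in low light. The study also systematically evaluates a range of vision-language models and low-light image enhancement methods, demonstrating the usefulness of DarkQA for analyzing model robustness.

Comments This work has been submitted to the IEEE for possible publication

Details
English abstract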

Vision Language Models (VLMs) are increasingly adopted as central reasoning modules for embodied agents. Existing benchmarks evaluate their capabilities under ideal, well-lit conditions, yet robust 24/7 operation demands performance under a wide range of visual degradations, including low-light conditions at night or in dark environments, a core necessity that has been largely overlooked. To address this underexplored challenge, we present DarkQA, an open-source benchmark for evaluating perceptual primitives under multi-level low-light conditions in embodied scenarios. DarkQA evaluates single-view egocentric observations across controlled degradation levels, isolating low-light perceptual failures before they are entangled with complex embodied tasks. The benchmark contains 9.4K deterministically generated and verifiable question-image pairs spanning five visual-primitive families. A key design feature of DarkQA is its physical fidelity: visual degradations are modeled in linear RAW space, simulating physics-based illumination drop and sensor noise followed by an ISP-inspired rendering pipeline; we further validate the synthesis against real paired low-light camera data. We evaluate representative VLMs and Low-Light Image Enhancement (LLIE) preprocessing methods. Results show consistent VLM degradation under low illumination and sensor noise, while LLIE provides severity-dependent but unstable recovery. We demonstrate the utility of DarkQA by evaluating a wide range of state-of-the-art VLMs and Low-Light Image Enhancement (LLIE) models, and systematically reveal VLMs' limitations when operating under these challenging visual conditions. Our code and benchmark dataset will be released upon acceptance. Project website: https://darkqa-benchmark.github.io

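2512.24105 2026-05-13 cs.GT cs.AI Updated version

Multilevel Fair Allocation with Matroid-Rank Preferences

Maxime Lucet, Nawal Benabbou, Aurélie Beynier, Nicolas Maudet

Affiliations * LIP6, CNRS, Sorbonne Université

AI summary This paper studies multilevel fair allocation of resources among agents with tree-structured hierarchical relations. Assuming that leaves have matroid-rank utility functions and that the utility of each internal node is the sum of its children's utilities, it designs allocation algorithms that balance fairness and efficiency. Two original algorithms are proposed: a top-down polynomial-time algorithm with theoretical efficiency and fairness guarantees that is compatible with various local allocation mechanisms, and a multilevel extension of General Yankee Swap that guarantees efficiency only but shows good fairness in practice.

Details
English abstract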

We introduce the concept of multilevel fair allocation of resources with tree-structured hierarchical relations among agents. While at each level it is possible to consider the problem locally as an allocation from an agent to its children, the multilevel allocation can be seen as a trace capturing the fact that the process is iterated down to the leaves of the tree. In principle, each intermediary node may have its own local allocation mechanism. The main challenge is then to design algorithms that retain good fairness and efficiency properties. In this paper we propose two original algorithms under the assumption that leaves of the tree have matroid-rank utility functions and the utility of any internal node is the sum of the utilities of its children. The first one is a generic polynomial-time sequential algorithm that comes with theoretical guarantees in terms of efficiency and fairness. It operates in a top-down fashion, as commonly observed in real-world applications, and is compatible with various local algorithms. The second one extends the recently proposed General Yankee Swap to the multilevel setting. This extension comes with efficiency guarantees only, but we show that it preserves excellent fairness properties in practice.

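2512.17637 2026-05-13 cs.AI cs.FL cs.LO Updated version

About Time: Model-free Reinforcement Learning with Timed Reward Machines

Rajarshi Roy, Anirban Majumdar, Ritam Raha, David Parker, Marta Kwiatkowska

Affiliations * University of Liverpool

AI summary Reward specification is central to guiding agent behavior in reinforcement learning. Reward machines were introduced to express non-Markovian rewards, but traditional reward machines cannot model precise timing constraints. This paper proposes timed reward machines (TRMs), which incorporate timing constraints into the reward structure and support richer reward logic, such as penalizing delays or rewarding timely actions. Working in a model-free reinforcement learning setting (tabular Q-learning), the authors use abstractions of timed automata and counterfactual-imagining heuristics to learn optimal policies that satisfy the timing constraints, and validate the approach on several benchmark tasks.

Comments Extended version of paper accepted at IJCAI 2026

Details
English abstract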

Reward specification plays a central role in reinforcement learning (RL), guiding the agent's behavior. To express non-Markovian rewards, formalisms such as reward machines have been introduced to capture dependencies on histories. However, traditional reward machines lack the ability to model precise timing constraints, limiting their use in time-sensitive applications. In this paper, we propose timed reward machines (TRMs), which are an extension of reward machines that incorporate timing constraints into the reward structure. TRMs enable more expressive specifications with tunable reward logic, for example, imposing costs for delays and granting rewards for timely actions. We study model-free RL frameworks (i.e., tabular Q-learning) for learning optimal policies with TRMs under digital and real-time semantics. Our algorithms integrate the TRM into learning via abstractions of timed automata, and employ counterfactual-imagining heuristics that exploit the structure of the TRM to improve the search. Experimentally, we demonstrate that our algorithm learns policies that achieve high rewards while satisfying the timing constraints specified by the TRM on popular RL benchmarks. Moreover, we conduct comparative studies of performance under different TRM semantics, along with ablations that highlight the benefits of counterfactual-imagining.
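
To make the idea of "imposing costs for delays and granting rewards for timely actions" concrete, the sketch below models a very small timed reward machine with a single integer clock, loosely in the spirit of digital-time semantics. The classes `Transition` and `TimedRewardMachine`, the one-tick-per-step convention, and the reward values are all illustrative assumptions; the paper's formal definition is not given in the abstract.

```python
# Illustrative toy only: one clock, integer time, first enabled transition fires.
from dataclasses import dataclass

@dataclass(frozen=True)
class Transition:
    source: str
    label: str      # event observed from the environment
    lo: int         # guard: lo <= clock <= hi
    hi: int
    target: str
    reward: float
    reset: bool     # reset the clock after taking the transition

class TimedRewardMachine:
    def __init__(self, initial, transitions):
        self.state = initial
        self.clock = 0
        self.transitions = transitions

    def step(self, label):
        """Advance one time unit, then fire the first enabled transition."""
        self.clock += 1
        for t in self.transitions:
            if (t.source == self.state and t.label == label
                    and t.lo <= self.clock <= t.hi):
                self.state = t.target
                if t.reset:
                    self.clock = 0
                return t.reward
        return 0.0  # no transition enabled: stay put, no reward

def total_reward(labels):
    trm = TimedRewardMachine("u0", [
        Transition("u0", "goal", 0, 3, "done", 1.0, True),        # timely arrival
        Transition("u0", "goal", 4, 10**9, "done", -0.5, True),   # late arrival
    ])
    return sum(trm.step(label) for label in labels)

on_time = total_reward(["idle", "idle", "goal"])
too_late = total_reward(["idle", "idle", "idle", "idle", "goal"])
```

The same event, reaching the goal, yields a reward or a cost depending only on the clock value, which an untimed reward machine cannot express without unrolling time into its states.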

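2512.11868 2026-05-13 cs.CY cs.AI Updated version

Industrial AI Robustness Card for Time Series Models

Alexander Windmann, Benedikt Stratmann, Mariya Lyashenko, Oliver Niggemann

Affiliations * Institute of Artificial Intelligence, Helmut Schmidt University Hamburg, Germany; Fraunhofer Institute of Optronics, System Technologies

AI summary This paper proposes the Industrial AI Robustness Card for Time Series (IARC-TS) to address the vague robustness requirements and lack of concrete implementation protocols that industrial AI practitioners face under emerging regulations. By defining required fields and an evaluation procedure that combines drift monitoring, uncertainty quantification, and stress tests, the protocol supports robustness evaluation and documentation aligned with relevant EU AI Act obligations. A biopharmaceutical soft sensor case study illustrates how IARC-TS helps produce reproducible robustness evidence and define monitoring triggers.

Comments Accepted to IFAC World Congress 2026

Details
English abstract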

Industrial AI practitioners face vague robustness requirements in emerging regulations and standards but lack concrete, implementation-ready protocols. This paper introduces the Industrial AI Robustness Card for Time Series (IARC-TS), a lightweight protocol for documenting and evaluating industrial time series models. IARC-TS specifies required fields and an empirical measurement and reporting protocol that combines drift and operational domain monitoring, uncertainty quantification, and stress tests, and maps these to selected EU AI Act documentation, testing, and monitoring obligations. A biopharmaceutical soft sensor case study illustrates how IARC-TS supports reproducible robustness evidence and defines monitoring triggers.

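2512.11114 2026-05-13 cs.LG cs.AI stat.ML Updated version

In-Context Multi-Objective Optimization

Xinyu Zhang, Conor Hassan, Julien Martinelli, Daolang Huang, Samuel Kaski

Affiliations * Department of Computer Science, Aalto University, Finland; ELLIS Institute Finland; Department of Computer Science, University of Manchester, UK

AI summary Balancing multiple competing objectives is a pervasive challenge in multi-objective optimization, notably in areas such as drug design and autonomous systems. This paper proposes TAMO, a fully amortized, universal policy that uses a transformer architecture to perform multi-objective black-box optimization across varying input and objective dimensions without retraining for each task. Pretrained with reinforcement learning, TAMO proposes the next design in a single forward pass, which substantially improves computational efficiency, and it achieves strong Pareto-front quality on a range of benchmarks and real tasks.

Details
English abstract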

Balancing competing objectives is omnipresent across disciplines, from drug design to autonomous systems. Multi-objective Bayesian optimization is a promising solution for such expensive, black-box problems: it fits probabilistic surrogates and selects new designs via an acquisition function that balances exploration and exploitation. In practice, it requires tailored choices of surrogate and acquisition that rarely transfer to the next problem, is myopic when multi-step planning is often required, and adds refitting overhead, particularly in parallel or time-sensitive loops. We present TAMO, a fully amortized, universal policy for multi-objective black-box optimization. TAMO uses a transformer architecture that operates across varying input and objective dimensions, enabling pretraining on diverse corpora and transfer to new problems without retraining: at test time, the pretrained model proposes the next design with a single forward pass. We pretrain the policy with reinforcement learning to maximize cumulative hypervolume improvement over full trajectories, conditioning on the entire query history to approximate the Pareto frontier. Across synthetic benchmarks and real tasks, TAMO produces fast proposals, reducing proposal time by 50-1000x versus alternatives while matching or improving Pareto quality under tight evaluation budgets. These results show that transformers can perform multi-objective optimization entirely in-context, eliminating per-task surrogate fitting and acquisition engineering, and open a path to foundation-style, plug-and-play optimizers for scientific discovery workflows.
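
The training signal named in the abstract, cumulative hypervolume improvement, can be illustrated in two dimensions. The sketch below computes the hypervolume of a set of points for a maximization problem relative to a reference point, and the improvement contributed by one new candidate. The function names and the example front are illustrative assumptions; this is the standard two-objective sweep, not TAMO's implementation.

```python
# Illustrative 2-D hypervolume for maximization; names are hypothetical.

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by the reference point `ref`."""
    pts = [(x, y) for x, y in points if x > ref[0] and y > ref[1]]
    pts.sort(key=lambda p: p[0], reverse=True)  # sweep from the largest x
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:  # point adds a new horizontal strip
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area

def hypervolume_improvement(front, candidate, ref):
    """Gain in hypervolume from adding `candidate` to the evaluated set."""
    return hypervolume_2d(front + [candidate], ref) - hypervolume_2d(front, ref)

front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
ref = (0.0, 0.0)
base_hv = hypervolume_2d(front, ref)
gain_good = hypervolume_improvement(front, (2.0, 3.0), ref)
gain_dominated = hypervolume_improvement(front, (1.0, 1.0), ref)
```

Summing such per-step gains over a trajectory gives a cumulative hypervolume improvement: a dominated proposal contributes nothing, while a proposal that extends the front contributes the newly covered area.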

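2512.07150 2026-05-13 cs.LG cs.AI cs.CV Updated version

FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers

Jonghyun Park, Jong Chul Ye

Affiliations * KAIST

AI summary This paper proposes FlowLPS, a training-free latent flow inverse problem solver based on Langevin-Proximal Sampling, which targets the finite-step trade-off that deep generative models face in imaging inverse problems. At each reverse step, the method perturbs the model-predicted clean estimate with a few Langevin updates to obtain posterior-oriented stochastic initializations, then applies local MAP-style proximal refinement to rapidly improve measurement consistency, and uses controlled pCN-style re-noising to keep the trajectory stable. Experiments show that FlowLPS achieves a good balance between measurement fidelity and perceptual quality on several linear inverse problems.

Details
English abstract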

Deep generative models are powerful priors for imaging inverse problems, but training-free solvers for latent flow models face a practical finite-step trade-off. Optimization-heavy methods quickly improve measurement consistency, but in highly nonlinear latent spaces, their results can depend strongly on where local refinement is initialized, often degrading perceptual realism. In contrast, stochastic sampling methods better preserve posterior exploration, but often require many iterations to obtain sharp, measurement-consistent reconstructions. To address this trade-off, we propose FlowLPS, a training-free latent flow inverse solver based on Langevin-Proximal Sampling. At each reverse step, FlowLPS uses a few Langevin updates to perturb the model-predicted clean estimate in posterior-oriented directions, providing stochastic initializations for local refinement. It then applies local MAP-style proximal refinement to rapidly improve measurement consistency from the Langevin-updated estimate. We additionally use controlled pCN-style re-noising to stabilize the reverse trajectory while retaining trajectory coherence. Experiments on FFHQ and DIV2K across five linear inverse problems show that FlowLPS achieves a strong balance between measurement fidelity and perceptual quality, with additional experiments on pixel-space inverse problems and phase retrieval.
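
The three ingredients named above (Langevin updates, MAP-style proximal refinement, pCN-style re-noising) can each be written down for a scalar toy problem. The sketch below assumes a linear measurement y = a·x + noise and uses a standard normal prior as a stand-in for the flow model; all function names, the step sizes, and the scalar setting are illustrative assumptions and not the FlowLPS algorithm itself.

```python
# Scalar toy versions of the three primitives; not the paper's solver.
import math
import random

def langevin_step(x, grad_log_post, step, rng):
    """One unadjusted Langevin update toward the posterior."""
    return x + step * grad_log_post(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)

def proximal_step(x0, y, a, sigma, lam):
    """Closed form of argmin_x (y - a*x)^2 / (2*sigma^2) + (x - x0)^2 / (2*lam)."""
    return (a * y / sigma**2 + x0 / lam) / (a**2 / sigma**2 + 1.0 / lam)

def pcn_renoise(z, beta, rng):
    """pCN-style move: leaves a standard normal distribution of z invariant."""
    return math.sqrt(1.0 - beta**2) * z + beta * rng.gauss(0.0, 1.0)

rng = random.Random(0)
a, y, sigma = 2.0, 4.0, 1.0

def grad_log_post(x):
    # Gaussian likelihood plus a N(0, 1) prior (the prior is a stand-in assumption)
    return a * (y - a * x) / sigma**2 - x

x = 0.0
for _ in range(3):  # a few Langevin updates give a stochastic initialization
    x = langevin_step(x, grad_log_post, step=0.01, rng=rng)
x_refined = proximal_step(x, y, a, sigma, lam=1.0)  # local refinement
z_next = pcn_renoise(x_refined, beta=0.3, rng=rng)  # controlled re-noising
```

The proximal step always shrinks the measurement residual |y − a·x| relative to its starting point, while the starting point itself varies with the Langevin noise, which mirrors the division of labour described in the abstract.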

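2511.18152 2026-05-13 cs.CV cs.AI Updated version

UnfoldLDM: Degradation-Aware Unfolding with Iterative Latent Diffusion Priors for Blind Image Restoration

Chunming He, Rihan Zhang, Zheng Chen, Bowen Yang, Chengyu Fang, Yunlong Lin, Yulun Zhang, Fengyang Xiao, Sina Farsiu

Affiliations * Duke University; Shanghai Jiao Tong University; Peking University; Tsinghua University; Xiamen University

AI summary This paper proposes UnfoldLDM, a blind image restoration method that addresses the shortcomings of existing deep unfolding networks in modeling unknown degradations and in over-smoothing. The method combines deep unfolding networks with a latent diffusion model: a multi-granularity degradation-aware module estimates the unknown degradation, while a degradation-resistant diffusion model and an over-smoothing correction module recover high-frequency details and textures. Experiments show that UnfoldLDM performs strongly on a variety of blind image restoration tasks and can serve as a general framework compatible with existing methods.

Comments 6 figures, 11 tables

Details
English abstract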

Deep unfolding networks (DUNs) combine the interpretability of model-based methods with the learning ability of deep networks, yet remain limited for blind image restoration (BIR). Existing DUNs suffer from: (1) degradation-specific dependency, as their optimization frameworks are tied to a known degradation model, making them unsuitable for BIR tasks; and (2) over-smoothing bias, resulting from the direct feeding of gradient descent outputs, dominated by low-frequency content, into the proximal term, suppressing fine textures. To overcome these issues, we propose UnfoldLDM to integrate DUNs with a latent diffusion model (LDM) for BIR. In each stage, UnfoldLDM employs a multi-granularity degradation-aware (MGDA) module as the gradient descent step. MGDA models BIR as an unknown degradation estimation problem and estimates both the holistic degradation matrix and its decomposed forms, enabling robust degradation removal. For the proximal step, we design a degradation-resistant LDM (DR-LDM) to extract compact degradation-invariant priors from the MGDA output. Guided by this prior, an over-smoothing correction transformer (OCFormer) explicitly recovers high-frequency components and enhances texture details. This unique combination ensures the final result is degradation-free and visually rich. Experiments show that our UnfoldLDM achieves a leading place on various BIR tasks and benefits downstream tasks. Moreover, our design is compatible with existing DUN-based methods, serving as a plug-and-play framework. Code will be released.

2511.16814 2026-05-13 cs.AI cs.HC Updated version

Stable diffusion models reveal a persisting human and AI gap in visual creativity

Silvia Rondini, Claudia Alvarez-Martin, Paula Angermair-Barkai, Olivier Penacchio, M. Paz, Matthew Pelowski, Dan Dediu, Antoni Rodriguez-Fornells, Xim Cerda-Company

Affiliations * Cognition and Brain Plasticity Unit, Bellvitge Biomedical Research Institute; Bridging AI and Neuroscience, Computer Vision Center; Department of Cognition, Development and Educational Psychology, University of Barcelona; Vienna Cognitive Science Hub; Faculty of Psychology, University of Vienna; Computer Science Department, Universitat Autonoma de Barcelona; University of Barcelona Institute for Complex Systems (UBICS); Department of Catalan Philology and General Linguistics, University of Barcelona; Catalan Institution for Research and Advanced Studies (ICREA); Aix-Marseille University; Institute of Neurosciences (UBNeuro), University of Barcelona

AI summary Although recent studies suggest that large language models can match human creativity in divergent thinking tasks, visual creativity has not been studied systematically. This study compares images produced by visual artists, non-artists, and a generative AI model under two prompting conditions (Human Inspired and Self Guided), and finds that humans remain clearly more creative than AI; the AI's creativity improves with more human guidance but still falls short of the non-artists' level. The study also finds markedly different creativity judgment patterns between human and AI raters, suggesting that visual creativity depends on perceptual nuance and contextual sensitivity, capacities that may not transfer readily from language models to visual generative models.

Journal ref Advanced Science, 2026, e24142

Details
English abstract

While recent research suggests that Large Language Models match human creative performance in divergent thinking tasks, visual creativity remains underexplored. This study compared images produced by human participants (Visual Artists and Non Artists) with images produced by an image generation AI model under two prompting conditions with varying human input (high for Human Inspired, low for Self Guided). Human raters (N=255) and GPT-4o evaluated the creativity of the resulting images. We found a clear creativity gradient, with Visual Artists being the most creative, followed by Non Artists, then Human Inspired generative AI, and finally Self Guided generative AI. Increased human guidance strongly improved GenAI's creative output, bringing its productions close to those of Non Artists. Notably, human and AI raters also showed vastly different creativity judgment patterns. These results suggest that, in contrast to language-centered tasks, GenAI models may face unique challenges in visual domains, where creativity depends on perceptual nuance and contextual sensitivity, distinctly human capacities that may not be readily transferable from language models.

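2511.10670 2026-05-13 cs.CL cs.AI cs.SD Updated version

Towards Fine-Grained Code-Switch Speech Translation with Semantic Space Alignment

Yan Gao, Yazheng Yang, Zhibin Lan, Yidong Chen, Min Zhang, Daimeng Wei, Derek F. Wong, Jinsong Su

Affiliations * School of Informatics, Xiamen University, China; Huawei Translation Services Center, Beijing, China; NLP2CT Lab, Department of Computer and Information Science, University of Macau

AI summary This study addresses fine-grained semantic modeling in code-switching speech translation. It proposes a speech projector with a Mixture-of-Experts (MoE) structure, in which language expert groups model the semantic space of each language at a fine granularity. A language-specific loss and an intra-group load balancing loss are introduced to guide efficient token routing, and a multi-stage training strategy that uses existing automatic speech recognition and monolingual speech translation data improves alignment and translation performance. Experiments on several datasets show gains over existing models in both BLEU and COMET.

Comments Accepted to IJCAI 2026 Main Track

Details
English abstract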

Code-switching (CS) speech translation (ST) aims to translate speech that alternates between multiple languages into a target language text, posing significant challenges due to the complexity of semantic modeling and the scarcity of CS data. Previous studies mainly rely on the models themselves to implicitly learn semantic representations and resort to costly manual annotations. To mitigate these limitations, we propose enhancing Large Language Models (LLMs) with a Mixture-of-Experts (MoE) speech projector composed of language expert groups, where each group specializes in the semantic space of a specific language for fine-grained speech feature modeling. A language-specific loss and an intra-group load balancing loss are jointly introduced to guide efficient token routing across and within expert groups. Furthermore, we introduce a multi-stage training paradigm that utilizes readily available automatic speech recognition (ASR) and monolingual ST data, facilitating speech-text alignment and improving translation performance. To bridge the data gap for smooth domain transfer, a transition loss is employed to improve adaptation to CS scenarios. Extensive experiments on widely used datasets demonstrate the effectiveness and generality of our approach, achieving average improvements of 0.86 BLEU and 0.93 COMET over SeamlessM4T, with maximum improvements of 1.49 BLEU and 1.41 COMET across different test sets.

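2511.05940 2026-05-13 math.OC cs.AI math.AP Updated version

A PDE Perspective on Generative Diffusion Models

Kang Liu, Enrique Zuazua

Affiliations * Université Bourgogne Europe, CNRS, Institut de Mathématiques de Bourgogne, 21000 Dijon, France; Friedrich-Alexander-Universität Erlangen-Nürnberg, Department of Mathematics, Chair for Dynamics, Control, Machine Learning

AI summary Taking a partial differential equation (PDE) perspective, this paper gives a theoretical analysis of score-based diffusion generative models and clarifies the mathematical foundations of their dynamics. It establishes a rigorous PDE framework, proves well-posedness and stability of the score-based Fokker-Planck equation, and analyzes how the reverse diffusion process concentrates on the data manifold. The work provides theoretical guarantees for diffusion models as well as guidance for model design, and helps explain the trade-off between generative capacity and imitation fidelity.

Comments 30 pages, 4 figures

Details
English abstract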

Score-based diffusion models have emerged as a powerful class of generative methods, achieving state-of-the-art performance across diverse domains. Despite their empirical success, the mathematical foundations of those models remain only partially understood, particularly regarding the stability and consistency of the underlying stochastic and partial differential equations governing their dynamics. In this work, we develop a rigorous partial differential equation (PDE) framework for score-based diffusion processes. Building on the Li–Yau differential inequality for the heat flow, we prove well-posedness and derive sharp $L^p$-stability estimates for the associated score-based Fokker–Planck dynamics, providing a mathematically consistent description of their temporal evolution. Through entropy stability methods, we further show that the reverse-time dynamics of diffusion models concentrate on the data manifold for compactly supported data distributions and a broad class of initialization schemes, with a concentration rate of order $\sqrt{t}$ as $t \to 0$. These results yield a theoretical guarantee that, under exact score guidance, diffusion trajectories return to the data manifold while preserving imitation fidelity. Our findings also provide practical insights for designing diffusion models, including principled criteria for score-function construction, loss formulation, and stopping-time selection. Altogether, this framework provides a quantitative understanding of the trade-off between generative capacity and imitation fidelity, bridging rigorous analysis and model design within a unified mathematical perspective.
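
For orientation, the objects named in the abstract can be written in a standard textbook form. The block below assumes a pure heat-flow forward process on $\mathbb{R}^d$ with diffusion coefficient $\sqrt{2}$; this normalization is an assumption made here for illustration, and the paper's own setting and constants may differ.

```latex
% Forward process dX_t = \sqrt{2}\, dW_t; its density p_t solves the heat equation:
\partial_t p_t = \Delta p_t, \qquad p_0 = \text{data distribution}.

% Li--Yau inequality for positive solutions of the heat equation on \mathbb{R}^d:
\Delta \log p_t \;\ge\; -\frac{d}{2t}, \qquad t > 0.

% Reverse-time dynamics Y_t = X_{T-t}, driven by the score \nabla \log p_{T-t}:
dY_t = 2\, \nabla \log p_{T-t}(Y_t)\, dt + \sqrt{2}\, d\bar{W}_t .

% Associated score-based Fokker--Planck equation for the density q_t of Y_t:
\partial_t q_t = -\nabla \cdot \bigl( 2\, q_t\, \nabla \log p_{T-t} \bigr) + \Delta q_t .
```

As a consistency check, substituting $q_t = p_{T-t}$ into the last equation gives $-2\Delta q_t + \Delta q_t = -\Delta q_t$, which matches $\partial_t p_{T-t} = -\Delta p_{T-t}$, so with the exact score the reverse dynamics retrace the forward densities.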

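2510.27055 2026-05-13 cs.CL cs.AI Updated version

Detecting Data Contamination in LLMs via In-Context Learning

Michał Zawalski, Meriem Boubdir, Klaudia Bałazy, Besmira Nushi, Pablo Ribalta

Affiliations * NVIDIA

AI summary This paper proposes CoDeC, a method for detecting and quantifying training data contamination in large language models. By measuring how in-context learning affects model performance, it distinguishes data memorized during training from data outside the training distribution. Experiments show that CoDeC produces interpretable contamination scores that separate seen from unseen datasets, and reveals substantial memorization in open-weight models with undisclosed training corpora. The method is simple, automated, and applicable across models and datasets, which makes it easy to integrate into benchmark evaluations.

Details
English abstract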

We present Contamination Detection via Context (CoDeC), a practical and accurate method to detect and quantify training data contamination in large language models. CoDeC distinguishes between data memorized during training and data outside the training distribution by measuring how in-context learning affects model performance. We find that in-context examples typically boost confidence for unseen datasets but may reduce it when the dataset was part of training, due to disrupted memorization patterns. Experiments show that CoDeC produces interpretable contamination scores that clearly separate seen and unseen datasets, and reveals strong evidence of memorization in open-weight models with undisclosed training corpora. The method is simple, automated, and both model- and dataset-agnostic, making it easy to integrate with benchmark evaluations.
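
One way to turn the observation above into a number is to count how often a sample's confidence drops once in-context examples from the same dataset are added. The sketch below does this on precomputed per-sample log-probabilities. The function `contamination_score`, the use of average log-probability as the confidence measure, and the example values are illustrative assumptions; the abstract does not specify CoDeC's exact scoring rule, and no model is queried here.

```python
# Hypothetical scoring sketch on precomputed log-probabilities; not CoDeC itself.

def contamination_score(baseline_logprobs, in_context_logprobs):
    """Fraction of samples whose confidence drops when in-context examples are added."""
    if len(baseline_logprobs) != len(in_context_logprobs) or not baseline_logprobs:
        raise ValueError("need two equally long, non-empty lists")
    drops = sum(
        1 for base, ctx in zip(baseline_logprobs, in_context_logprobs) if ctx < base
    )
    return drops / len(baseline_logprobs)

# Made-up numbers: context mostly helps on the first dataset and mostly hurts on the second.
unseen_like = contamination_score([-2.0, -1.8, -2.2, -1.9], [-1.5, -1.6, -1.7, -2.0])
seen_like = contamination_score([-0.4, -0.5, -0.3, -0.6], [-0.6, -0.7, -0.2, -0.9])
```

Under this reading, a score near 0 means in-context examples usually help (consistent with unseen data), and a score near 1 means they usually hurt (consistent with memorized data).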

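2510.24145 2026-05-13 cs.AI Updated version

OpsAgent: An Evolving Multi-agent System for Incident Management in Microservices

Yu Luo, Jiamin Jiang, Jingfei Feng, Lei Tao, Qingliang Zhang, Xidao Wen, Yongqian Sun, Shenglin Zhang, Dan Pei

Affiliations * Nankai University; Alibaba Cloud; Lenovo; Tsinghua University

AI summary OpsAgent is a lightweight, self-evolving multi-agent system for incident management in microservice systems. It uses a training-free data processor to convert heterogeneous observability data into structured textual descriptions, and a multi-agent collaboration framework to make diagnostic reasoning transparent and auditable. To support continual capability growth, OpsAgent introduces a dual self-evolution mechanism that combines internal model updates with external experience accumulation. Experiments show strong performance, interpretability, cost efficiency, and self-evolution, indicating that the system is feasible for real-world deployment and long-term operation.

Details
English abstract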

Incident management (IM) is central to the reliability of large-scale microservice systems. Yet manual IM, where on-call engineers examine metrics, logs, and traces, is labor-intensive and error-prone in the face of massive and heterogeneous observability data. Existing automated IM approaches often struggle to generalize across systems, provide limited interpretability, and incur high deployment costs, which hinders adoption in practice. In this paper, we present OpsAgent, a lightweight, self-evolving multi-agent system for IM that employs a training-free data processor to convert heterogeneous observability data into structured textual descriptions, along with a multi-agent collaboration framework that makes diagnostic inference transparent and auditable. To support continual capability growth, OpsAgent also introduces a dual self-evolution mechanism that integrates internal model updates with external experience accumulation, thereby closing the deployment loop. Comprehensive experiments on the OPENRCA benchmark demonstrate state-of-the-art performance and show that OpsAgent is generalizable, interpretable, cost-efficient, and self-evolving, making it a practically deployable and sustainable solution for long-term operation in real-world microservice systems. Notably, its deployment in Lenovo's production environment further validates its effectiveness in real-world industrial settings.

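2510.17062 2026-05-13 cs.CL cs.AI Updated version

Investigating Thinking Behaviours of Reasoning-Based Language Models for Social Bias Mitigation

Guoqing Luo, Iffat Maab, Lili Mou, Junichi Yamagishi

AI summary This paper studies the thinking behaviours of reasoning-based language models in social bias scenarios and finds that their internal reasoning process can amplify social stereotypes, leading to biased outcomes. It identifies two failure patterns that drive bias aggregation: stereotype repetition and irrelevant information injection. Building on these findings, the authors propose a lightweight prompt-based method that asks the model to review its own reasoning; experiments show that it reduces bias on several benchmarks while maintaining or improving accuracy.

Comments Due to issues found with the annotations in Section 4.3, we have decided to withdraw this preprint

Details
English abstract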

While reasoning-based large language models excel at complex tasks through an internal, structured thinking process, a concerning phenomenon has emerged that such a thinking process can aggregate social stereotypes, leading to biased outcomes. However, the underlying behaviours of these language models in social bias scenarios remain underexplored. In this work, we systematically investigate mechanisms within the thinking process behind this phenomenon and uncover two failure patterns that drive social bias aggregation: 1) stereotype repetition, where the model relies on social stereotypes as its primary justification, and 2) irrelevant information injection, where it fabricates or introduces new details to support a biased narrative. Building on these insights, we introduce a lightweight prompt-based mitigation approach that queries the model to review its own initial reasoning against these specific failure patterns. Experiments on question answering (BBQ and StereoSet) and open-ended (BOLD) benchmarks show that our approach effectively reduces bias while maintaining or improving accuracy.

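2510.16620 2026-05-13 cs.IT cs.AI cs.CR cs.LG eess.SP math.IT Updated version

Feedback Lunch: Learned Feedback Codes for Secure Communications

Yingyao Zhou, Natasha Devroye, Onur Günlü

Affiliations * University of Illinois Chicago; Linköping University

AI summary This paper studies secure communication over the block-fading Gaussian wiretap channel with channel-output feedback, and proposes a seeded modular code design that combines universal hash functions with learned feedback codes to balance security and reliability. It shows that feedback enables the legitimate parties to agree on a shared secret key, overcoming the eavesdropper's advantage. The results offer a new direction for code design in sensing-assisted secure communications for integrated sensing and communication (ISAC).

Comments Accepted to WiseML'26

Details
English abstract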

We consider reversely-degraded secure-communication channels, for which the secrecy capacity is zero if there is no channel feedback. Specifically, we focus on a seeded modular code design for the block-fading Gaussian wiretap channel with channel-output feedback, combining universal hash functions for security and learned feedback-based codes for reliability. The trade-off between communication reliability and information leakage is studied, illustrating that feedback enables agreeing on a secret key shared between legitimate parties, overcoming the security advantage of the eavesdropper. Our findings motivate code designs for sensing-assisted secure communications in the context of integrated sensing and communication (ISAC).

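2510.05497 2026-05-13 cs.DC cs.AI cs.AR cs.LG Updated version

Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference

Zhongkai Yu, Yue Guan, Zihao Yu, Chenyang Zhou, Zhengding Hu, Shuyi Pei, Yangwook Kang, Yufei Ding, Po-An Tsai

Affiliations * Indiana University Bloomington; Columbia University; NVIDIA

AI summary This paper studies data-movement patterns during inference of large-scale Mixture-of-Experts (MoE) large language models, with the aim of improving execution efficiency on multi-unit systems. By profiling four large MoE models released in 2025 on more than 24,000 requests, it distills six key insights from temporal and spatial perspectives and uses them to propose optimizations for future wafer-scale GPUs and existing GPU systems, reaching a 6.6× average speedup and up to a 1.25× speedup, respectively. The authors describe it as the first systematic analysis and application study of data movement in large-scale MoE models.

Details
Abstract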

Large-scale Mixture of Experts (MoE) Large Language Models (LLMs) have recently become the frontier open-weight models, achieving remarkable model capability similar to proprietary ones. But their random expert selection mechanism introduces significant data movement overhead that becomes the dominant bottleneck in multi-unit LLM serving systems. To understand the patterns underlying this data movement, we conduct comprehensive data-movement-centric profiling across four state-of-the-art large-scale MoE models released in 2025 (200B-1000B) using over 24,000 requests spanning diverse workloads. We perform systematic analysis from both temporal and spatial perspectives and distill six key insights to guide the design of diverse serving systems. We verify these insights on both future wafer-scale GPU architectures and existing GPU systems. On wafer-scale GPUs, lightweight architectural modifications guided by our insights yield a 6.6× average speedup across four 200B-1000B models. On existing GPU systems, our insights drive the design of a prefill-aware expert placement algorithm that achieves up to 1.25× speedup on MoE computation. Our work presents the first comprehensive data-centric analysis of large-scale MoE models together with a concrete design study applying the learned lessons. Our profiling traces are publicly available at https://huggingface.co/datasets/core12345/MoE_expert_selection_trace.

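2510.03206 2026-05-13 cs.AI cs.CL Updated version

Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner

Cai Zhou, Chenxiao Yang, Yi Hu, Chenyu Wang, Chubin Zhang, Muhan Zhang, Lester Mackey, Tommi Jaakkola, Stephen Bates, Dinghuai Zhang

Affiliations * Massachusetts Institute of Technology; Microsoft Research; Toyota Technological Institute at Chicago; Peking University; Tsinghua University

AI summary This paper examines how diffusion language models differ between discrete and continuous spaces, noting that continuous diffusion models are theoretically more expressive yet often underperform discrete ones in practice. The authors propose Coevolutionary Continuous Discrete Diffusion (CCDD), which defines a joint diffusion process over a continuous representation space and a discrete token space, keeping the semantic richness of the continuous space while using discrete tokens to improve training and sampling. Experiments show strong language-modeling performance on real-world tasks.

Comments 29 pages. Accepted to ICML 2026

Details
Abstract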

Diffusion language models, especially masked discrete diffusion models, have achieved great success recently. While there are some theoretical and preliminary empirical results showing the advantages of latent reasoning with looped transformers or continuous chain-of-thoughts, continuous diffusion models typically underperform their discrete counterparts. In this paper, we argue that diffusion language models do not necessarily need to be in the discrete space. In particular, we prove that continuous diffusion models have stronger expressivity than discrete diffusions and looped transformers. We attribute the contradiction between the theoretical expressiveness and empirical performance to their practical trainability: while continuous diffusion provides intermediate supervision that looped transformers lack, it introduces additional difficulty in decoding tokens into the discrete token space from the continuous representation space. We therefore propose Coevolutionary Continuous Discrete Diffusion (CCDD), which defines a joint multimodal diffusion process on the union of a continuous representation space and a discrete token space, leveraging a single model to simultaneously denoise in the joint space. By combining two modalities, CCDD is expressive with rich semantics in the latent space, as well as good trainability and sample quality with the help of explicit discrete tokens. We also propose effective architectures and advanced training/sampling techniques for CCDD, which show strong empirical performance in extensive language modeling experiments on real-world tasks.

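2510.00733 2026-05-13 cs.LG cs.AI q-bio.QM Updated version

Neural Diffusion Processes for Physically Interpretable Survival Prediction

Alessio Cristofoletto, Cesare Rollo, Giovanni Birolo, Piero Fariselli

Affiliations * Department of Computing Sciences, Bocconi University, Milano, Italy; Computational Biomedicine Unit, University of Torino, Torino, Italy

AI summary This paper proposes DeepFHT, a survival-analysis framework that combines deep neural networks with first hitting time (FHT) distributions from stochastic process theory, modeling time to event as the first time a latent diffusion process reaches an absorbing boundary. A neural network maps input variables to physically meaningful parameters such as the initial condition, drift, and diffusion coefficient, yielding closed-form survival and hazard functions without assuming proportional hazards. Experiments show predictive performance comparable to existing state-of-the-art methods while retaining a physically interpretable parameterization that helps reveal the relation between input features and risk.

Comments 12 pages, 5 figures

Details
Abstract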

We introduce DeepFHT, a survival-analysis framework that couples deep neural networks with first hitting time (FHT) distributions from stochastic process theory. Time to event is represented as the first passage of a latent diffusion process to an absorbing boundary. A neural network maps input variables to physically meaningful parameters including initial condition, drift, and diffusion, within a chosen FHT process such as Brownian motion, both with drift and driftless. This yields closed-form survival and hazard functions and captures time-varying risk without assuming proportional hazards. We compare DeepFHT with Cox regression using synthetic and real-world datasets. The method achieves predictive accuracy on par with the state-of-the-art approach, while maintaining a physics-based interpretable parameterization that elucidates the relation between input features and risk. This combination of stochastic process theory and deep learning provides a principled avenue for modeling survival phenomena in complex systems.
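
To make the closed-form survival function concrete, here is a minimal sketch for the Brownian-motion case. It assumes a latent process X_s = x0 + drift·s + sigma·W_s that starts at x0 > 0 and is absorbed at 0; the boundary location, the sign convention for the drift, and the function names are illustrative assumptions, not the paper's parameterization. In DeepFHT the three parameters would be produced by a neural network, which is omitted here.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fht_survival(t: float, x0: float, drift: float, sigma: float) -> float:
    """P(T > t), where T is the first time X_s = x0 + drift*s + sigma*W_s hits 0.

    Assumes x0 > 0 and sigma > 0. A negative drift pushes the process toward
    the absorbing boundary; a positive drift pushes it away.
    """
    if t <= 0.0:
        return 1.0
    scale = sigma * math.sqrt(t)
    stay = normal_cdf((x0 + drift * t) / scale)
    # Reflection-principle correction for paths that already crossed the boundary.
    crossed = math.exp(-2.0 * drift * x0 / sigma**2) * normal_cdf((-x0 + drift * t) / scale)
    return stay - crossed


# Hypothetical parameter values, chosen only for illustration.
curve = [fht_survival(t, x0=1.0, drift=-0.5, sigma=1.0) for t in (0.5, 1.0, 2.0, 4.0)]
```

In the driftless case the expression reduces to 2·Φ(x0 / (sigma·√t)) − 1, which is a convenient sanity check.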

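2509.25239 2026-05-13 cs.AI cs.CL cs.LG Updated version

A Formal Comparison Between Chain of Thought and Latent Thought

Kevin Xu, Issei Sato

Affiliations * Department of Computer Science, The University of Tokyo, Japan

AI summary This paper compares two reasoning approaches for large language models: chain of thought (CoT) and latent thought. CoT reasons by explicitly generating intermediate tokens, whereas latent thought computes directly in a continuous latent space, supporting operations beyond discrete linguistic representations. The analysis shows that latent thought admits more efficient parallel computation, while CoT supports approximate counting and sampling through stochastic decoding, giving a theoretical basis for choosing a reasoning paradigm per task.

Comments Camera-ready version for ICML 2026

Details
Abstract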

Chain of thought (CoT) elicits reasoning in large language models by explicitly generating intermediate tokens. In contrast, latent thought reasoning operates directly in the continuous latent space, enabling computation beyond discrete linguistic representations. While both approaches exploit iterative computation, their comparative capabilities remain underexplored. In this work, we present a formal analysis showing that latent thought admits more efficient parallel computation than inherently sequential CoT. In contrast, CoT enables approximate counting and sampling through stochastic decoding. These separations suggest the tasks for which depth-driven recursion is more suitable, thereby offering practical guidance for choosing between reasoning paradigms.

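2508.02455 2026-05-13 cs.SE cs.AI cs.IR Updated version

TreeRanker: Fast and Model-agnostic Ranking System for Code Suggestions in IDEs

Daniele Cipollone, Egor Bogomolov, Arie van Deursen, Maliheh Izadi

Affiliations * JetBrains; Delft University of Technology

AI summary TreeRanker is a fast, model-agnostic ranking system for code suggestions that aims to make IDE code completion more relevant. It uses a language model to score static completions: it builds a prefix tree and performs a single greedy decoding pass, giving precise ranking without complex adaptation. The approach is efficient, general, and compatible with existing code-completion models, offering a practical way to integrate language models into IDEs.

Details
Abstract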

Token-level code completion is one of the most critical features in modern Integrated Development Environments (IDEs). It assists developers by suggesting relevant identifiers and APIs during coding. While completions are typically derived from static analysis, their usefulness depends heavily on how they are ranked, as correct predictions buried deep in the list are rarely seen by users. Most current systems rely on hand-crafted heuristics or lightweight machine learning models trained on user logs, which can be further improved to capture context information and generalize across projects and coding styles. In this work, we propose a new scoring approach to ranking static completions using language models in a lightweight and model-agnostic way. Our method organizes all valid completions into a prefix tree and performs a single greedy decoding pass to collect token-level scores across the tree. This enables a precise token-aware ranking without needing beam search, prompt engineering, or model adaptations. The approach is fast, architecture-agnostic, and compatible with already deployed models for code completion. These findings highlight a practical and effective pathway for integrating language models into already existing tools within IDEs, and ultimately providing smarter and more responsive developer assistance.
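
The prefix-tree idea can be illustrated with a small sketch. This is not the paper's algorithm: it scores every internal trie node once with a hypothetical `scorer` callback rather than reproducing the single greedy decoding pass, and `toy_scorer` is a made-up stand-in for a language model. It only shows how completions that share a prefix can share token-level scores.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Prefix = Tuple[str, ...]
# Hypothetical interface: given a token prefix and the candidate next tokens,
# return a log-score for each candidate.
Scorer = Callable[[Prefix, Sequence[str]], Dict[str, float]]


def build_trie(completions: List[Prefix]) -> dict:
    """Organize tokenized completions into a nested-dict prefix tree."""
    root: dict = {}
    for tokens in completions:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
    return root


def score_trie(trie: dict, scorer: Scorer) -> Dict[Prefix, float]:
    """Return the cumulative log-score of every prefix stored in the trie."""
    totals: Dict[Prefix, float] = {(): 0.0}
    stack = [((), trie)]
    while stack:
        prefix, node = stack.pop()
        if not node:
            continue
        step = scorer(prefix, list(node))  # one scorer call per internal node
        for tok, child in node.items():
            child_prefix = prefix + (tok,)
            totals[child_prefix] = totals[prefix] + step[tok]
            stack.append((child_prefix, child))
    return totals


def rank(completions: List[Prefix], scorer: Scorer) -> List[Prefix]:
    """Sort completions by cumulative score, best first."""
    totals = score_trie(build_trie(completions), scorer)
    return sorted(completions, key=lambda c: totals[c], reverse=True)


def toy_scorer(prefix: Prefix, candidates: Sequence[str]) -> Dict[str, float]:
    """Made-up scorer that simply prefers shorter tokens."""
    return {tok: -float(len(tok)) for tok in candidates}


candidates = [("get", "Name"), ("get", "Id"), ("set", "Name"), ("to", "String")]
ranked = rank(candidates, toy_scorer)
```

Because `("get", "Name")` and `("get", "Id")` share the prefix `("get",)`, that prefix is scored once and reused for both.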

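2505.10859 2026-05-13 cs.AI Updated version

Exploring Nonlinear Pathway in Parameter Space for Machine Unlearning

Yingdan Shi, Ren Wang

Affiliations * Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL, USA

AI summary This paper studies how to effectively remove the influence of specific training data from a trained machine learning model and proposes a new framework, Mode Connectivity Unlearning (MCU). The method uses mode connectivity to find a nonlinear unlearning pathway in parameter space, and a parameter mask strategy plus adaptive adjustment of the penalty coefficient improve unlearning effectiveness and computational efficiency. Unlike traditional methods, MCU uncovers a spectrum of models along the unlearning pathway, and it is reported to be broadly applicable and to perform well experimentally.

Details
Abstract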

Machine Unlearning (MU) aims to remove the information of specific training data from a trained model, ensuring compliance with privacy regulations and user requests. While one line of existing MU methods relies on linear parameter updates via task arithmetic, these methods suffer from weight entanglement. In this work, we propose a novel MU framework called Mode Connectivity Unlearning (MCU) that leverages mode connectivity to find an unlearning pathway in a nonlinear manner. To further enhance performance and efficiency, we introduce a parameter mask strategy that not only improves unlearning effectiveness but also reduces computational overhead. Moreover, we propose an adaptive adjustment strategy for our unlearning penalty coefficient to balance forgetting quality and predictive performance during training, eliminating the need for empirical hyperparameter tuning. Unlike traditional MU methods that identify only a single unlearning model, MCU uncovers a spectrum of unlearning models along the pathway. Overall, MCU serves as a plug-and-play framework that seamlessly integrates with any existing MU methods, consistently improving unlearning efficacy. Extensive experiments on the image classification task demonstrate that MCU achieves superior performance. Code is available at https://github.com/TIML-Group/Mode-Connectivity-Unlearning.

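2505.02072 2026-05-13 cs.CL cs.AI Updated version

Express Your Doubts -- Probabilistic World Modeling Should not be Based on Token logprobs

Eitan Wagner, Omri Abend

AI summary This paper observes that language modeling has shifted in recent years from modeling distributions over strings to prediction models for general-purpose tasks, and that this shift raises overlooked problems when large language models are used as probability estimators, especially for world probabilities. The authors stress the theoretical distinction between distribution estimation and response prediction, and show that approaches based on token logprobs can yield conflicting output distributions across use cases, creating pitfalls in probabilistic interpretation. The paper advocates second-order prediction, which incorporates probabilities explicitly in the output, as the more theoretically sound approach.

Comments Accepted to ICML 2026 (position track)

Details
Abstract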

Language modeling has shifted in recent years from a distribution over strings to prediction models with textual inputs and outputs for general-purpose tasks. This position paper highlights the often overlooked implications of this shift for the use of large language models (LLMs) as probability estimators, especially for world probabilities. In light of the theoretical distinction between distribution estimation and response prediction, we examine LLM training phases and common use cases for LLM output probabilities. We show that the different settings lead to distinct, potentially conflicting, desired output distributions. This lack of clarity leads to pitfalls when using output probabilities as event probabilities. Our position advocates for second-order prediction -- incorporating probabilities explicitly as part of the output -- as a theoretically sound method, in contrast to using token logprobs. We conclude with suggestions for potential directions to improve the probabilistic soundness of this method.

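2504.13898 2026-05-13 cs.HC cs.AI Updated version

Social Human Robot Embodied Conversation (SHREC) Dataset: Benchmarking Foundational Models' Social Reasoning

Dong Won Lee, Yubin Kim, Denison Guvenoz, Sooyeon Jeong, Parker Malachowsky, Louis-Philippe Morency, Cynthia Breazeal, Hae Won Park

Affiliations * Purdue University; Carnegie Mellon University; Massachusetts Institute of Technology

AI summary This paper introduces the SHREC dataset for evaluating the social reasoning of foundation models in real-world human-robot interaction. It contains about 400 real human-robot interaction videos and more than 10,000 annotations covering robots' social challenges and errors in areas such as emotion understanding and intention tracking. The study defines eight benchmark tasks, and experiments show that current state-of-the-art models still have substantial performance gaps in social reasoning, underscoring the difficulty of, and directions for, building socially intelligent AI.

Comments 23 pages, 11 figures

Details
Abstract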

Our work focuses on the social reasoning capabilities of foundation models for real-world human-robot interactions. We introduce the Social Human Robot Embodied Conversation (SHREC) Dataset, a benchmark of ~400 real-world human-robot interaction videos and over 10K annotations, capturing robot social errors, competencies, underlying rationales, and corrections. Unlike prior datasets focused on human-human interactions, the SHREC Dataset uniquely highlights the social challenges faced by real-world social robots such as emotion understanding, intention tracking, and conversational mechanics. Moreover, current foundation models struggle to recognize these deficits, which manifest as subtle, socially situated failures. To evaluate AI models' capacity for social reasoning, we define eight benchmark tasks targeting critical areas such as (1) detection of social errors and competencies, (2) identification of underlying social attributes, (3) comprehension of interaction flow, and (4) providing rationale and alternative correct actions. Experiments with state-of-the-art foundation models, alongside human evaluations, reveal substantial performance gaps -- underscoring the difficulty and providing directions in developing socially intelligent AI.

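2504.12326 2026-05-13 cs.CL cs.AI cs.LG Updated version

Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis

Shahriar Noroozizadeh, Jeremy C. Weiss

Affiliations * Machine Learning Department and Heinz College, Carnegie Mellon University; National Library of Medicine, National Institutes of Health

AI summary This study reconstructs the clinical trajectories of sepsis patients from clinical case reports, using large language models (LLMs) to extract clinical findings from unstructured text and localize them in time. It builds an open-access textual time series corpus for sepsis containing 2,139 PubMed open-access case reports, and reports high event match rates and strong temporal ordering when compared with expert annotations. The work demonstrates what LLMs can do in temporal reconstruction of clinical text, notes their limitations, and proposes improvements such as multimodal integration.

Comments Conference on Health, Inference, and Learning (CHIL 2026)

Details
Abstract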

Clinical case reports and discharge summaries may be the most complete and accurate summarization of patient encounters, yet they are finalized, i.e., timestamped after the encounter. Complementary structured data streams become available sooner but suffer from incompleteness. To train models and algorithms on more complete and temporally fine-grained data, we construct a pipeline to phenotype, extract, and annotate time-localized findings within case reports using large language models. We apply our pipeline to generate an open-access textual time series corpus for Sepsis-3 comprising 2,139 case reports from the PubMed-Open Access (PMOA) Subset. To validate our system, we apply it to PMOA and timeline annotations from i2b2/MIMIC-IV and compare the results to physician-expert annotations. We show high recovery rates of clinical findings (event match rates: GPT-5: 0.93, Llama 3.3 70B Instruct: 0.76) and strong temporal ordering (concordance: GPT-5: 0.965, Llama 3.3 70B Instruct: 0.908). Our work characterizes the ability of LLMs to time-localize clinical findings in text, illustrating the limitations of LLM use for temporal reconstruction and providing several potential avenues of improvement via multimodal integration.

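2502.11981 2026-05-13 cs.LG cs.AI cs.CY Updated version

Welfare as a Guiding Principle for Machine Learning -- From Compass, to Lens, to Roadmap

Nir Rosenfeld, Haifeng Xu

Affiliations * Faculty of Computer Science, Technion – Israel Institute of Technology; Department of Computer Science, University of Chicago

AI summary This paper proposes social welfare as a core guiding principle in the design and application of machine learning, in order to promote social well-being. Drawing on the theory of resource allocation in welfare economics, the authors argue that in social settings machine learning models should pursue not only predictive accuracy but also attend to their effect on overall social benefit. The article argues for welfare as a fourth core criterion alongside optimization, generalization, and expressivity, providing new directions and evaluation criteria for both theory and practice.

Details
Abstract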

Decades of research in machine learning have given us powerful tools for making accurate predictions. But when used in social settings and on human inputs, better accuracy does not immediately translate to better social outcomes. To effectively promote social well-being through machine learning, this position article advocates for the wide adoption of social welfare as a guiding principle. The field of welfare economics asks: how should we allocate limited resources to self-interested agents in a way that maximizes social benefit? We argue that this perspective applies to many modern applications of machine learning in social contexts. As such, we propose that welfare serves as an additional core criterion in the design, study, and use of learning algorithms, complementing the conventional pillars of optimization, generalization, and expressivity, and as a compass guiding both theory and practice.

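2501.19403 2026-05-13 cs.LG cs.AI Updated version

Tackling Fake Forgetting through Uncertainty Quantification

Yingdan Shi, Sijia Liu, Kaize Ding, Ren Wang

Affiliations * Illinois Institute of Technology; Michigan State University; Northwestern University

AI summary This paper studies "fake forgetting" in machine unlearning, where a model scores well on forgetting metrics yet still retains the ground-truth label information of the forgotten data. To address this, the authors propose CR, a new evaluation metric based on conformal prediction that measures forgetting quality more reliably, and design CPU, an unlearning framework that incorporates conformal prediction and improves forgetting. Experiments on image classification tasks show superior forgetting quality.

Details
Abstract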

Machine unlearning seeks to remove the influence of specified data from a trained model. While the unlearning accuracy provides a widely used metric for assessing unlearning performance, it falls short in assessing the reliability of forgetting. In this paper, we find that the forgetting data points misclassified by unlearning accuracy still have their ground truth labels included in the conformal prediction set from the uncertainty quantification perspective, leading to a phenomenon we term fake forgetting. To address this issue, we propose a novel metric CR, inspired by conformal prediction, that offers a more reliable assessment of forgetting quality. Building on these insights, we further propose an unlearning framework CPU that incorporates conformal prediction into the Carlini & Wagner adversarial attack loss, enabling the ground truth label to be effectively removed from the conformal prediction set. Through extensive experiments on image classification tasks, we demonstrate both the effectiveness of our proposed metric and the superior forgetting quality achieved by our framework. Code is available at https://github.com/TIML-Group/Conformal-Prediction-Unlearning.
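
The fake-forgetting check described above can be sketched with split conformal prediction. The nonconformity score (1 minus the softmax probability of a label), the helper names, and the toy numbers are assumptions made for illustration; the abstract does not define the CR metric or the CPU loss, and neither is reproduced here.

```python
import math
from typing import List, Sequence


def conformal_threshold(cal_probs: Sequence[Sequence[float]],
                        cal_labels: Sequence[int],
                        alpha: float) -> float:
    """Split-conformal quantile of the scores 1 - p(true label) on calibration data."""
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))  # 1-indexed order statistic
    if rank > n:
        return 1.0  # too few calibration points: every label is admitted
    return scores[rank - 1]


def prediction_set(probs: Sequence[float], threshold: float) -> List[int]:
    """All labels whose nonconformity score does not exceed the threshold."""
    return [y for y, p in enumerate(probs) if 1.0 - p <= threshold]


def is_fake_forgetting(probs: Sequence[float], true_label: int, threshold: float) -> bool:
    """True if the point is misclassified yet its true label stays in the set."""
    predicted = max(range(len(probs)), key=lambda y: probs[y])
    return predicted != true_label and true_label in prediction_set(probs, threshold)


# Toy calibration data: softmax outputs for two classes, with true labels.
cal_probs = [[0.9, 0.1], [0.7, 0.3], [0.5, 0.5], [0.3, 0.7]]
cal_labels = [0, 0, 0, 0]
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

weakly_forgotten = is_fake_forgetting([0.4, 0.6], true_label=0, threshold=threshold)
truly_forgotten = is_fake_forgetting([0.1, 0.9], true_label=0, threshold=threshold)
```

In this toy setup the first forgetting point counts as "forgotten" by accuracy alone (it is misclassified), but its true label remains in the conformal prediction set, which is the situation the abstract calls fake forgetting.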

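2412.13050 2026-05-13 cs.LG cs.AI cs.CL cs.CV cs.SD eess.AS Updated version

Modality-Inconsistent Continual Learning of Multimodal Large Language Models

Weiguo Pian, Shijian Deng, Shentong Mo, Mingrui Liu, Yunhui Guo, Yapeng Tian

Affiliations * The University of Texas at Dallas; Carnegie Mellon University; George Mason University

AI summary This paper proposes a new continual learning scenario for multimodal large language models, Modality-Inconsistent Continual Learning (MICL), which involves tasks with inconsistent modalities (image, audio, or video) and varying task types (captioning or question answering). To counter the catastrophic forgetting caused by modality and task-type shifts, the study proposes MoInCL, which uses a Pseudo Targets Generation Module and Instruction-based Knowledge Distillation to mitigate the effect of these shifts on performance. Experiments show that MoInCL outperforms existing continual learning methods across multiple tasks.

Comments Accepted at Transactions on Machine Learning Research (TMLR), 2026

Details
Abstract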

In this paper, we introduce Modality-Inconsistent Continual Learning (MICL), a new continual learning scenario for Multimodal Large Language Models (MLLMs) that involves tasks with inconsistent modalities (image, audio, or video) and varying task types (captioning or question-answering). Unlike existing vision-only or modality-incremental settings, MICL combines modality and task type shifts, both of which drive catastrophic forgetting. To address these challenges, we propose MoInCL, which employs a Pseudo Targets Generation Module to mitigate forgetting caused by task type shifts in previously seen modalities. It also incorporates Instruction-based Knowledge Distillation to preserve the model's ability to handle previously learned modalities when new ones are introduced. We benchmark MICL using a total of six tasks and conduct experiments to validate the effectiveness of our MoInCL. The experimental results highlight the superiority of MoInCL, showing significant improvements over representative and state-of-the-art continual learning baselines.

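2411.19517 2026-05-13 cs.LG cs.AI Updated version

RL-SPH: Learning to Achieve Feasible Solutions for Integer Linear Programs

Tae-Hoon Lee, Min-Soo Kim

Affiliations * School of Computing, KAIST, Daejeon, Republic of Korea

AI summary This study proposes RL-SPH, a reinforcement learning-based start primal heuristic that aims to quickly generate feasible solutions for integer linear programs. Unlike existing methods, RL-SPH can independently generate high-quality feasible solutions, even for problems involving non-binary integers. Experiments show that RL-SPH outperforms existing methods on feasibility rate, primal gap, and primal integral.

Comments Accepted at ICML 2026. 30 pages, 12 figures, 22 tables

Details
Abstract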

Primal heuristics play a crucial role in quickly finding feasible solutions for NP-hard integer linear programming (ILP). Although end-to-end learning-based primal heuristics (E2EPH) have recently been proposed, they are typically unable to independently generate feasible solutions. To address this challenge, we propose RL-SPH, a novel reinforcement learning-based start primal heuristic capable of independently generating feasible solutions, even for ILP involving non-binary integers. Empirically, RL-SPH rapidly obtains high-quality feasible solutions with a 100% feasibility rate, achieving on average a 28.6× lower primal gap and a 2.6× lower primal integral compared to existing start primal heuristics.
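
The two metrics quoted above are not defined in the abstract. The sketch below uses the definitions that are common in the primal-heuristics literature, on the assumption that the paper follows them: the primal gap is the relative distance between an incumbent objective and the best known objective, and the primal integral accumulates that gap over solving time, with the gap taken as 1 before the first feasible solution. Function names and the example numbers are illustrative.

```python
from typing import List, Tuple


def primal_gap(objective: float, best_known: float) -> float:
    """Relative gap in [0, 1] between an incumbent and the best known objective."""
    if objective == 0.0 and best_known == 0.0:
        return 0.0
    if objective * best_known < 0.0:
        return 1.0  # opposite signs: treated as the maximum gap
    return abs(objective - best_known) / max(abs(objective), abs(best_known))


def primal_integral(incumbents: List[Tuple[float, float]],
                    best_known: float,
                    time_limit: float) -> float:
    """Integrate the primal gap as a step function over [0, time_limit].

    `incumbents` lists (time_found, objective) pairs sorted by time.
    """
    total, previous_time, gap = 0.0, 0.0, 1.0  # gap is 1 until a solution exists
    for found_at, objective in incumbents:
        total += gap * (found_at - previous_time)
        previous_time, gap = found_at, primal_gap(objective, best_known)
    return total + gap * (time_limit - previous_time)


# Hypothetical run: first solution after 2 s, optimum reached after 5 s.
integral = primal_integral([(2.0, 120.0), (5.0, 100.0)], best_known=100.0, time_limit=10.0)
```

A lower primal integral rewards finding good solutions early, which is why it complements the final primal gap when comparing start heuristics.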

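2409.08290 2026-05-13 cs.NE cs.AI cs.LG Updated version

Reconsidering the energy efficiency of spiking neural networks

Zhanglu Yan, Zhenyu Bai, Weng-Fai Wong

Affiliations * School of Computing, National University of Singapore

AI summary This paper re-evaluates the energy-efficiency advantage of spiking neural networks (SNNs) over quantized artificial neural networks (QNNs). It sets up a fair baseline by mapping a rate-encoded SNN with $T$ timesteps to an equivalent QNN with $\lceil \log_2(T+1) \rceil$ bits, so that the two are comparable in representational capacity and hardware requirements. The study introduces a detailed energy model covering computation and data movement, analyzes the effect of a range of network and hardware parameters, and finds that SNNs are indeed more energy-efficient under specific conditions (such as moderate time windows and low spike rates), illustrating the practical savings with a smartwatch example.

Details
Abstract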

Spiking Neural Networks (SNNs) promise higher energy efficiency over conventional Quantized Artificial Neural Networks (QNNs) due to their event-driven, spike-based computation. However, prevailing energy evaluations often oversimplify, focusing on computational aspects while neglecting critical overheads like comprehensive data movement and memory access. Such simplifications can lead to misleading conclusions regarding the true energy benefits of SNNs. This paper presents a rigorous re-evaluation. We establish a fair baseline by mapping rate-encoded SNNs with $T$ timesteps to functionally equivalent QNNs with $\lceil \log_2(T+1) \rceil$ bits. This ensures both models have comparable representational capacities, as well as similar hardware requirements, enabling meaningful energy comparisons. We introduce a detailed analytical energy model encompassing core computation and data movement. Using this model, we systematically explore a wide parameter space, including intrinsic network characteristics ($T$, spike rate $s_r$, QNN sparsity $\gamma$, model size $N$, weight bit-level) and hardware characteristics (memory system and network-on-chip). Our analysis identifies specific operational regimes where SNNs genuinely offer superior energy efficiency. For example, under typical neuromorphic hardware conditions, SNNs with moderate time windows ($T \in [5,10]$) require an average spike rate ($s_r$) below 6.4% to outperform equivalent QNNs. Furthermore, to illustrate the real-world implications of our findings, we analyze the operational lifetime of a typical smartwatch, showing that an optimized SNN can nearly double its battery life compared to a QNN. These insights guide the design of truly energy-efficient neural network solutions.
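
The bit-width mapping is easy to check numerically. A plausible reading, stated here as an assumption, is that a rate-coded neuron observed for $T$ timesteps can emit between 0 and $T$ spikes, so it distinguishes $T+1$ levels and needs $\lceil \log_2(T+1) \rceil$ bits. The sketch below computes that value with integer arithmetic; the function name is illustrative.

```python
def equivalent_qnn_bits(timesteps: int) -> int:
    """Return ceil(log2(T + 1)), the bits needed to encode spike counts 0..T."""
    if timesteps < 1:
        raise ValueError("timesteps must be a positive integer")
    # For any positive integer T, T.bit_length() equals ceil(log2(T + 1)),
    # which avoids floating-point rounding near powers of two.
    return timesteps.bit_length()


# Bit widths across the moderate time windows mentioned in the abstract.
bits_by_timesteps = {t: equivalent_qnn_bits(t) for t in range(5, 11)}
```

For the window $T \in [5,10]$ discussed above, this gives 3-bit QNNs for $T$ up to 7 and 4-bit QNNs from $T = 8$ onward.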

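2405.10271 2026-05-13 cs.LG cs.AI cs.DC cs.ET Updated version

Pruning Federated Models through Loss Landscape Analysis and Client Agreement Scoring

Christian Internò, Elena Raponi, Markus Olhofer, Ali Raza, Thomas Bäck, Niki van Stein, Yaochu Jin, Barbara Hammer

Affiliations * Bielefeld University; School of Engineering, Westlake University

AI summary This paper addresses two obstacles to deploying federated learning on resource-constrained devices: the high cost of training large models and the instability caused by data heterogeneity. It proposes AutoFLIP, an automatic model pruning framework based on loss landscape analysis and client agreement scoring. The method treats client data diversity as a feature to exploit: a one-time federated loss exploration builds a map of the global loss landscape, which then guides a dynamic pruning strategy, improving model efficiency and robustness. Experiments show that in non-IID settings AutoFLIP reduces computational overhead by an average of 52% and communication costs by over 65% while maintaining state-of-the-art accuracy.

Journal ref Published in IEEE Internet of Things Journal, 2026

Details
Abstract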

The practical deployment of Federated Learning (FL) on resource-constrained devices is fundamentally limited by the high cost of training large models and the instability caused by heterogeneous (non-IID) client data. Conventional pruning methods often treat data heterogeneity as a problem to be mitigated. In this work, we introduce a paradigm shift: we reframe client diversity as a feature to be harnessed. We propose AutoFLIP, a framework that begins not with training, but with a one-time federated loss exploration. During this phase, clients collaboratively build a map of the collective loss landscape, using their diverse data to reveal the problem's essential structure. This shared intelligence then guides an adaptive pruning strategy that is dynamically refined by client agreement throughout training. This approach allows AutoFLIP to identify robust and efficient sub-networks from the outset. Our extensive experiments show that AutoFLIP reduces computational overhead by an average of 52% and communication costs by over 65% while simultaneously achieving state-of-the-art accuracy in challenging non-IID settings.

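2310.17025 2026-05-13 cs.NI cs.AI Updated version

netFound: Principled Design for Network Foundation Models

Sylee Beltiukov, Satyandra Guthula, Haarika Manda, Jaber Daneshamooz, Wenbo Guo, Walter Willinger, Arpit Gupta, Inder Monga

Affiliations * UC Santa Barbara; Northwestern University; ESNet

AI summary This paper proposes netFound, a network foundation model designed to address problems of existing models in traffic analysis tasks: reliance on dataset shortcuts, collapsed embedding spaces, and failure to capture exogenous network conditions. It sets out four design principles (protocol-aware tokenization, operational context embedding, burst-flow hierarchical attention, and privacy-by-construction input design) and builds netFound on them. Experiments show that netFound significantly outperforms existing models in representation quality, alignment with domain-expert features, and exogenous context discrimination, while also preserving privacy.

Details
Abstract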

Network foundation models promise reusable representations for diverse traffic analysis tasks, but recent diagnostic works have revealed fundamental problems: models exploit dataset shortcuts rather than learning genuine traffic patterns, produce collapsed embedding spaces, and fail to capture the exogenous network conditions that shape real-world behavior. We translate these diagnostic insights into four concrete design principles: protocol-aware tokenization, operational context embedding, burst-flow hierarchical attention, and privacy-by-construction input design, and build netFound, a network foundation model whose architecture is motivated by this failure analysis. We pretrain netFound on a billion-token-scale corpus over 5000 GPU hours, and demonstrate that it produces high-quality representations with lower anisotropy, significantly higher alignment with domain-expert features, and an F1 of 0.95 on exogenous context discrimination where existing state-of-the-art models score below 0.62, while preserving privacy by excluding payload and IP addresses. netFound demonstrates significant improvements in frozen-encoder evaluation, showing that pretrained embeddings themselves carry useful structure, and remains the top performer across all benchmarks in end-to-end fine-tuned settings. We release full open-source code, weights for three model sizes on HuggingFace, a containerized pipeline from raw PCAPs to downstream inference, and the full 4.2 billion flows pretraining dataset to facilitate reproducibility and further research.

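2605.11824 2026-05-13 cs.CV cs.AI Updated version

REFNet++: Multi-Task Efficient Fusion of Camera and Radar Sensor Data in Bird's-Eye Polar View

Kavin Chandrasekaran, Sorin Grigorescu, Gijs Dubbelman, Pavol Jancura

Affiliations * ElektroBit Automotive GmbH; Eindhoven University of Technology; Transilvania University of Brasov

AI summary This paper proposes REFNet++, a multi-task efficient fusion method that fuses camera and radar sensor data in the bird's-eye polar view. Using a variational encoder-decoder architecture, it transforms camera images into the polar domain and recovers angle information from the radar range-Doppler spectrum to produce range-azimuth features, aligning the two modalities in a unified domain. According to the summary, the method improves computational efficiency while maintaining fusion accuracy and achieves better performance than existing methods on vehicle detection and free space segmentation.

Comments IEEE Intelligent Transportation Systems Conference (ITSC) 2025

Details
Abstract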

A realistic view of the vehicle's surroundings is generally offered by camera sensors, which is crucial for environmental perception. Affordable radar sensors, on the other hand, are becoming invaluable due to their robustness in variable weather conditions. However, because of their noisy output and reduced classification capability, they work best when combined with other sensor data. Specifically, we address the challenge of multimodal sensor fusion by aligning radar and camera data in a unified domain, prioritizing not only accuracy, but also computational efficiency. Our work leverages the raw range-Doppler (RD) spectrum from radar and front-view camera images as inputs. To enable effective fusion, we employ a variational encoder-decoder architecture that learns the transformation of front-view camera data into the Bird's-Eye View (BEV) polar domain. Concurrently, a radar encoder-decoder learns to recover the angle information from the RD data to produce Range-Azimuth (RA) features. This alignment ensures that both modalities are represented in a compatible domain, facilitating robust and efficient sensor fusion. We evaluated our fusion strategy for vehicle detection and free space segmentation against state-of-the-art methods using the RADIal dataset.

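2605.11814 2026-05-13 cs.AI Updated version

MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare

Yihao Wang, Haoran Xu, Renjie Gu, Yixuan Ye, Xinyi Chen, Xinyu Mu, Yuan Gao, Chunxiao Guo, Peng Wei, Jinjie Gu, Huan Li, Ke Chen, Lidan Shou

Affiliations * Zhejiang University; Ant Group; Alibaba Group; Beijing University of Posts and Telecommunications

AI summary MedMemoryBench is a benchmark for evaluating the memory capabilities of personalized healthcare agents, targeting the need in large-scale medical deployments for memory mechanisms that are precise, safe, and capable of long-term tracking. The study builds a highly realistic dataset of medical interactions based on clinically grounded synthetic patient archetypes and introduces an "evaluate-while-constructing" streaming assessment protocol, revealing severe shortcomings of mainstream models in complex medical reasoning and noise robustness and laying groundwork for reliable, practical medical agents.

Details
Abstract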

The large-scale deployment of personalized healthcare agents demands memory mechanisms that are exceptionally precise, safe, and capable of long-term clinical tracking. However, existing benchmarks primarily focus on daily open-domain conversations, failing to capture the high-stakes complexity of real-world medical applications. Motivated by the stringent production requirements of an industry-leading health management agent serving tens of millions of active users, we introduce MedMemoryBench. We develop a human-agent collaborative pipeline to synthesize highly realistic, long-horizon medical trajectories based on clinically grounded, synthetic patient archetypes. This process yields a massive, expertly validated dataset comprising approximately 2,000 sessions and 16,000 interaction turns. Crucially, MedMemoryBench departs from traditional static evaluations by pioneering an "evaluate-while-constructing" streaming assessment protocol, which precisely mirrors dynamic memory accumulation in production environments. Furthermore, we formalize and systematically investigate the critical phenomenon of memory saturation, where sustained information influx actively degrades retrieval and reasoning robustness. Comprehensive benchmarking reveals severe bottlenecks in mainstream architectures, particularly concerning complex medical reasoning and noise resilience. By exposing these fundamental flaws, MedMemoryBench establishes a vital foundation for developing robust, production-ready medical agents.

2605.11813 2026-05-13 cs.AI Version update

Automated Reformulation of Robust Optimization via Memory-Augmented Large Language Models

Jinbiao Chen, Shuang Jin, Guoyun Zhang, Junyu Zhang, Guanyi Wang, Hanzhang Qin

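Affiliations * Department of Industrial Systems Engineering and Management, National University of Singapore; Department of Data and Systems Engineering, The University of Hong Kong; Institute of Operations Research and Analytics, National University of Singapore; Agency for Science, Technology and Research (A*STAR)

AI summary This work addresses the manual reformulation required in robust optimization (RO) to turn uncertain optimization models into tractable deterministic counterparts. The authors propose AutoREM, an experience-memory-based automated reformulation framework that needs neither parameter updates nor domain-expert knowledge and that builds a structured textual memory through an offline adaptation procedure to improve reformulation. They also build AutoRO-Bench, a benchmark for systematically evaluating LLM-based RO reformulation, and show AutoREM's advantages in accuracy and efficiency across multiple datasets and base models.

Details
English abstract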

Robust optimization (RO) provides a principled framework for decision-making under uncertainty, but its practical use is often limited by the need to manually reformulate uncertain optimization models into tractable deterministic counterparts. Recent large language models (LLMs) have shown promise for automating optimization formulation, yet RO reformulation remains challenging because it requires precise multi-step reasoning and mathematically consistent transformations. To facilitate systematic evaluation of LLM-based reformulation, for which no dedicated benchmark currently exists, we develop AutoRO-Bench, a benchmark featuring an automated data generation pipeline for the core RO reformulation task and a curated dataset for the RO application task. To address the reformulation challenge, we propose Automated Reformulation with Experience Memory (AutoREM), a tuning-free memory-augmented framework that autonomously builds a structured textual experience memory by reflecting on past failed trajectories through a tailored offline adaptation procedure. AutoREM requires neither domain-specific expert knowledge nor parameter updates, and the resulting memory readily transfers across different base LLMs. Experimental results show that AutoREM consistently improves the accuracy and efficiency of RO reformulation across in-distribution datasets, out-of-distribution datasets, and diverse base LLMs.

2605.11809 2026-05-13 cs.AI Version update

Beyond World-Frame Action Heads: Motion-Centric Action Frames for Vision-Language-Action Models

Huoren Yang, Jianchao Zhao, Hu Yusong, Qiguan Ou, Yuyang Gao, Wei Ke, Yuhang He, SongLin Dong, Zhiheng Ma, Yihong Gong

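Affiliations * Xi’an Jiaotong University; One Robotics; Shenzhen University of Advanced Technology

AI summary This paper proposes MCF-Proto, a lightweight action head that improves action prediction in Vision-Language-Action (VLA) models. Unlike conventional approaches that predict action commands directly in a fixed world coordinate frame, it introduces a Motion-Centric Action Frame (MCF): a rotation moves action prediction into a local frame, actions are parameterized with prototypes, and the result is mapped back to the world frame for end-to-end training. The method needs no extra supervision, develops stable geometric structure on its own, and yields more compact and robust action representations, particularly under geometric perturbations.

Details
English abstract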

Vision-Language-Action (VLA) models have advanced rapidly with stronger backbones, broader pre-training, and larger demonstration datasets, yet their action heads remain largely homogeneous: most directly predict action commands in a fixed world coordinate frame. We propose MCF-Proto, a lightweight action head that equips VLA policies with a Motion-Centric Action Frame (MCF) and a prototype-based action parameterization. At each step, the policy predicts a rotation $R_t \in SO(3)$, composes actions in the transformed local frame from a set of prototypes, and maps them back to the world frame for end-to-end training, using only standard demonstrations without auxiliary supervision. This simple design induces stable emergent structure. Without explicit directional labels, the learned local frames develop a stable geometric structure whose axes are strongly compatible with demonstrated end-effector motion. Meanwhile, actions in the learned representation become substantially more compact, with variation captured by fewer dominant directions and more regularly organized by shared prototypes. These structural properties translate into improved robustness, especially under geometric perturbations. Our results suggest that adding lightweight geometric and compositional structure to the action head can materially improve how VLA policies organize and generalize robotic manipulation behavior. An anonymized code repository is provided in the supplementary material.
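The per-step computation described in this abstract (predict a rotation, compose an action from prototypes in the local frame, map it back to the world frame) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the three axis-aligned prototypes, the mixing weights, and the restriction to a 3-D translational action are all assumptions made for illustration.

```python
import numpy as np


def rotation_z(theta: float) -> np.ndarray:
    """Rotation about the z-axis by theta radians (an element of SO(3))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def compose_world_action(R_t: np.ndarray, weights: np.ndarray,
                         prototypes: np.ndarray) -> np.ndarray:
    """Mix prototypes in the local frame, then rotate into the world frame."""
    local_action = weights @ prototypes  # (K,) @ (K, 3) -> (3,)
    return R_t @ local_action            # local frame -> world frame


# Hypothetical values: one prototype per local axis, all weight on the first.
prototypes = np.eye(3)
weights = np.array([1.0, 0.0, 0.0])
R_t = rotation_z(np.pi / 2)

world_action = compose_world_action(R_t, weights, prototypes)
```

In this toy setup the local x-axis motion ends up along the world y-axis, which is the sense in which the local frame, rather than the world frame, carries the direction of motion.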

2605.11807 2026-05-13 cs.AI Version update

Why Users Go There: World Knowledge-Augmented Generative Next POI Recommendation

Qiuyu Ding, Heng-Da Xu, Wei Zhang, Dongyi Lv, Changda Xia, Feng Xiong, Mu Xu

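Affiliations * Amap, Alibaba Group; Xi’an Jiaotong University

AI summary This work targets the inability of generative point-of-interest (POI) recommendation models to perceive changing real-world conditions and proposes AWARE, an LLM-based augmentation method. AWARE uses an LLM agent to generate location- and time-aware contextual narratives that capture regional cultural characteristics, seasonal trends, and ongoing events, and combines them with each user's behavioral context for personalized recommendation. Experiments on three real-world datasets show that AWARE outperforms existing methods, with up to 12.4% relative improvement.

Details
English abstract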

Generative point-of-interest (POI) recommendation models based on large language models (LLMs) have shown promising results by formulating next POI prediction as a sequence generation task. However, the knowledge encoded in these models remains fixed after training, making them unable to perceive evolving real-world conditions that shape user mobility decisions, such as local events and cultural trends. To bridge this gap, we propose AWARE (Agent-based World knowledge Augmented REcommendation), which employs an LLM agent to generate location- and time-aware contextual narratives that capture regional cultural characteristics, seasonal trends, and ongoing events relevant to each user. Rather than introducing generic or noisy information, AWARE further anchors these narratives in each user's behavioral context, grounding external world knowledge in personalized spatial-temporal patterns. Extensive experiments on three real-world datasets demonstrate that AWARE consistently outperforms competitive baselines, achieving up to 12.4% relative improvement.

2605.11803 2026-05-13 cs.CV cs.AI Version update

OTT-Vid: Optimal Transport Temporal Token Compression for Video Large Language Models

Minseok Kang, Minhyeok Lee, Jungho Lee, Minjung Kim, Donghyeong Kim, Dayeon Lee, Heeseung Choi, Ig-jae Kim, Sangyoun Lee

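Affiliations * Yonsei University; LG Electronics; KIST

AI summary As Video Large Language Models (Video-LLMs) handle longer and more complex videos, inference cost rises quickly with the number of visual tokens accumulated across frames. To address this, the paper proposes OTT-Vid, an optimal-transport-based temporal token compression method. Spatial pruning identifies the key content in each frame, and an optimal transport formulation with non-uniform token mass estimates how compressible each pair of neighboring frames is, so that compression budgets are allocated dynamically and semantically important tokens are protected. Experiments show that OTT-Vid preserves 95.8% of video question answering and 73.9% of temporal grounding performance while retaining only 10% of tokens, outperforming existing training-free compression methods.

Comments 22 pages, 9 figures. Code available at https://github.com/minseokii/OTT-Vid

Details
English abstract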

As Video Large Language Models (Video-LLMs) scale to longer and more complex videos, their inference cost grows rapidly due to the large volume of visual tokens accumulated across frames. Training-free token compression has emerged as a practical solution to this bottleneck. However, existing temporal compression methods rely primarily on cross-frame token similarity or segmentation heuristics, overlooking each token's semantic role within its frame and failing to adapt compression strength to the compressibility of each frame pair. In this work, we propose OTT-Vid, a transport-derived allocation framework for temporal token compression. Our approach consists of two stages: spatial pruning identifies representative content within each frame, and optimal transport (OT) is then solved between neighboring frames to estimate temporal compressibility. We formulate this OT with non-uniform token mass, which protects semantically important tokens from aggressive compression, and a locality-aware cost that captures both feature and spatial disparities. The resulting transport plan jointly balances token importance and matching cost, while its total cost defines the transport difficulty of each frame pair, which we use to allocate compression budgets dynamically. Experiments on six benchmarks spanning video question answering and temporal grounding show that OTT-Vid preserves 95.8% of VQA and 73.9% of VTG performance while retaining only 10% of tokens, consistently outperforming existing state-of-the-art training-free compression methods.
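To make the idea of optimal transport with non-uniform token mass concrete, the following is a minimal sketch using standard entropic-regularized Sinkhorn iterations in NumPy. The abstract does not state which solver, cost, or mass definition OTT-Vid uses, so everything here is an assumption: the feature-only squared-distance cost (no spatial term), the hand-picked mass vectors, and the names `sinkhorn_plan` and `transport_cost`.

```python
import numpy as np


def sinkhorn_plan(mass_a, mass_b, cost, eps=0.5, n_iters=200):
    """Entropic OT plan between two token sets with non-uniform masses."""
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(mass_a)
    v = np.ones_like(mass_b)
    for _ in range(n_iters):
        v = mass_b / (K.T @ u)       # match column marginals
        u = mass_a / (K @ v)         # match row marginals
    return u[:, None] * K * v[None, :]


rng = np.random.default_rng(0)
feat_a = rng.normal(size=(4, 8))     # hypothetical tokens of frame t
feat_b = rng.normal(size=(5, 8))     # hypothetical tokens of frame t+1

# Pairwise squared feature distance, scaled to [0, 1].
cost = ((feat_a[:, None, :] - feat_b[None, :, :]) ** 2).sum(-1)
cost = cost / cost.max()

# Non-uniform mass: tokens deemed more important carry more mass.
mass_a = np.array([0.4, 0.3, 0.2, 0.1])
mass_b = np.full(5, 0.2)

plan = sinkhorn_plan(mass_a, mass_b, cost)
transport_cost = float((plan * cost).sum())
```

Under these assumptions, a frame pair with a higher `transport_cost` would be treated as harder to compress and could be given a larger share of the token budget.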

2605.11789 2026-05-13 cs.AI Version update

Beyond Inefficiency: Systemic Costs of Incivility in Multi-Agent Monte Carlo Simulations

Alison Moldovan-Mauer, Benedikt Mangold

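Affiliations * Technische Hochschule Nürnberg

AI summary This study examines how uncivil communication affects collaborative efficiency in multi-agent systems, building an LLM-based multi-agent system and running large-scale experiments with Monte Carlo simulation. It finds that incivility significantly lengthens the time agents need to reach a conclusion, and that the delay is more pronounced for models with fewer parameters. It also identifies a first-mover advantage: the agent that speaks first is more likely to win the debate under every toxicity condition.

Details
English abstract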

Unconstructive debate and uncivil communication carry well-documented costs for productivity and cohesion, yet isolating their effect on operational efficiency has proven difficult. Human subject research in this domain is constrained by ethical oversight, limited reproducibility, and the inherent unpredictability of naturalistic settings. We address this gap by leveraging Large Language Model (LLM) based Multi-Agent Systems as a controlled sociological sandbox, enabling systematic manipulation of communicative behavior at scale. Using a Monte Carlo simulation framework, we generate thousands of structured 1-on-1 adversarial debates across varying toxicity conditions, measuring convergence time, defined as the number of rounds required to reach a conclusion, as a proxy for interactional efficiency. Building on a prior study, we replicate and extend its findings across two additional LLM agents of varying parameter size, allowing us to assess whether the effects of toxic behavior on debate dynamics generalize across model scale. We confirm the convergence latency of 25% reported in the previous study and find that this latency is significantly larger for models with fewer parameters. We further identify a significant first-mover advantage, whereby the agent initiating the discussion wins significantly above chance regardless of toxicity condition.

2605.11784 2026-05-13 cs.CE cs.AI cs.LG Version update

Crash Assessment via Mesh-Based Graph Neural Networks and Physics-Aware Attention

Gabriel Curtosi, Carlos Manuel Ruiz Ruiz, Fabiola Cavaliere, Xabier Larráyoz Izcara

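Affiliations * SEAT S.A.; IDIADA Automotive Technology S.A.

AI summary This study proposes hybrid surrogate models based on mesh graph neural networks and physics-aware attention for efficiently predicting structural deformation fields in a full-vehicle lateral pole impact. By combining local mesh message passing, geometry-aware global attention, and sparse contact-aware correction, the models capture both short-range structural interactions and long-range deformation patterns while remaining computationally efficient. The method reaches a temporal mean root-mean-square error of 3.20 mm on the test set and outperforms conventional approaches in accuracy, structural consistency, and physical interpretability, providing a fast and reliable prediction tool for industrial crash-engineering analysis.

Comments 40 pages, 15 figures, 7 tables

Details
English abstract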

Full-vehicle crash simulations are computationally expensive, limiting their use in iterative design exploration. This work investigates learned hybrid surrogate models (MeshTransolver, MeshGeoTransolver, and MeshGeoFLARE) for predicting time-resolved structural deformation fields in an industrial lateral pole-impact benchmark. We evaluate whether neural surrogates can reproduce full-field crash kinematics with sufficient accuracy, spatial regularity, and structural plausibility for engineering interpretation. The proposed architectures combine local mesh message passing, geometry-aware global attention, and sparse contact-aware correction for autoregressive crash rollout. We compare mesh-based graph neural networks, attention-based geometric models, and hybrid architectures under a common training and hyperparameter configuration. The hybrid models capture both short-range structural interactions and long-range deformation patterns, while a sparse contact-aware variant assesses the effect of dynamic proximity interactions during rollout. On a 25-sample full-vehicle test set, the best hybrid model achieves a temporal mean root-mean-square error of 3.20 mm. While geometry-aware attention baselines are quantitatively competitive, qualitative side-view inspection shows they can introduce local spatial noise and deformation irregularities that complicate structural interpretation. In contrast, hybrid mesh-attention models provide the best balance between scalar accuracy, survival-space consistency, and physically interpretable displacement fields. These results suggest that crash surrogate assessment should combine global error metrics with downstream safety-relevant quantities and qualitative field inspection. The proposed methodology enables fast full-field predictions while preserving essential structural information for industrial crash-engineering analysis.

2605.11773 2026-05-13 cs.LG cs.AI Version update

Is Monotonic Sampling Necessary in Diffusion Models?

Muhammad Haris Khan

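Affiliations * Department of Computer Science, University of Copenhagen, Denmark

AI summary This paper asks whether diffusion models must use a monotonic sampling strategy. It designs four families of nonmonotonic noise schedules and runs extensive experiments across several generative models; none of the nonmonotonic schedules improves on the monotonic baseline. The study further shows that models differ in how sensitive they are to the schedule and proposes a new diagnostic of diffusion model quality, the Schedule Sensitivity Coefficient.

Details
English abstract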

Diffusion models generate samples by iteratively denoising a Gaussian prior, traversing a sequence of noise levels that, in every published sampler, decreases monotonically. Six years of intensive work has refined nearly every aspect of this recipe, including the corruption operator, the training objective, the schedule shape, the architecture, and the ODE solver. Yet the assumption of monotonicity itself has never been systematically tested. Here we ask whether monotonic sampling is load-bearing or merely conventional. We design four families of structured nonmonotonic schedules and apply them to three architecturally distinct generative models, DDPM, EDM, and Flow Matching, across NFE budgets ranging from 10 to 200 function evaluations, plus a 42-cell hyperparameter ablation, on CIFAR-10. Across all 90 tested configurations, no tested nonmonotonic schedule improves on the monotonic baseline. The magnitude of the penalty, however, spans nearly three orders of magnitude: persistent and substantial in DDPM, intermediate in Flow Matching, and indistinguishable from zero in EDM. We show that this variation is not noise but a structural property of each trained denoiser, and we formalize it as the Schedule Sensitivity Coefficient, a cheap, architecture-agnostic diagnostic that provides evidence of non-convergence to the Bayes-optimal denoiser at the critical noise level. Our findings justify the field's tacit reliance on monotonic schedules and supply a new probe of diffusion model quality complementary to sample-quality metrics such as Fréchet Inception Distance.
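As a rough illustration of what separates a monotonic noise schedule from a nonmonotonic one, the sketch below builds a geometrically decreasing schedule and a variant with one re-noising step inserted. The abstract does not describe the paper's four schedule families, so `with_renoise_bump`, the endpoint values 80.0 and 0.002, and the bump factor are hypothetical choices for illustration only.

```python
import numpy as np


def monotonic_schedule(n_steps, sigma_max=80.0, sigma_min=0.002):
    """Geometrically decreasing noise levels from sigma_max to sigma_min."""
    return np.geomspace(sigma_max, sigma_min, n_steps)


def with_renoise_bump(sigmas, index, factor=1.5):
    """Hypothetical nonmonotonic variant: after position `index`, insert a
    noise level `factor` times larger, so the sampler steps back up once."""
    bumped = sigmas[index] * factor
    return np.insert(sigmas, index + 1, bumped)


def is_monotonic_decreasing(sigmas):
    """True if every noise level is strictly below the previous one."""
    return bool(np.all(np.diff(sigmas) < 0))


baseline = monotonic_schedule(10)
bumped = with_renoise_bump(baseline, index=4)
```

The bumped variant spends one extra function evaluation, so a fair comparison would hold the total evaluation budget fixed across schedules.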

2605.11770 2026-05-13 cs.CR cs.AI cs.SY eess.SY Version update

Behavioral Integrity Verification for AI Agent Skills

Yuhao Wu, Tung-Ling Li, Hongliang Liu

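Affiliations * Palo Alto Networks

AI summary This work proposes behavioral integrity verification (BIV), which checks whether the actual capabilities of an AI agent skill match its declared ones, filling a gap in existing safety mechanisms that leave the skill itself unverified. The method pairs deterministic code analysis with LLM-assisted capability extraction over a unified taxonomy and supports downstream tasks such as deviation classification, root-cause analysis, and malicious-skill detection. Experiments on a large-scale skill dataset reveal a widespread gap between skill descriptions and implementations, and BIV outperforms existing methods on malicious-skill detection.

Details
English abstract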

Agent skills extend LLM agents with privileged third-party capabilities such as filesystem access, credentials, network calls, and shell execution. Existing safety work catches malicious prompts and risky runtime actions, but the skill artifact itself goes unverified. We formalize this as the behavioral integrity verification (BIV) problem: a typed set comparison between declared and actual capabilities over a shared taxonomy that bridges code, instructions, and metadata. The BIV framework instantiates this comparison by pairing deterministic code analysis with LLM-assisted capability extraction. The resulting structured evidence supports three downstream analyses: deviation taxonomy, root-cause classification, and malicious-skill detection. On 49,943 skills from the OpenClaw registry, the deviation taxonomy reveals a pervasive description-implementation gap: 80.0% of skills deviate from declared behavior, with four novel compound-threat categories surfaced. Root-cause classification finds that deviations are mostly oversight, not malice: 81.1% trace to developer oversight and 18.9% to adversarial intent, with 5.0% of skills carrying predicted multi-stage attack chains. On a 906-skill malicious-skill detection benchmark, BIV reaches an F1 of 0.946, outperforming state-of-the-art rule-based and single-pass LLM baselines. These results demonstrate behavioral integrity auditing for agent skills at scale.
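The "typed set comparison between declared and actual capabilities" can be pictured with a small sketch. The `Capability` type, its two fields, and the three output buckets below are assumptions chosen for illustration; the abstract does not specify the paper's taxonomy or deviation categories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical typed capability, e.g. ("network", "api.example.com")."""
    kind: str
    target: str


def compare_capabilities(declared, actual):
    """Set comparison between what a skill declares and what it does."""
    declared, actual = set(declared), set(actual)
    return {
        "undeclared": actual - declared,  # done by the code but never declared
        "unused": declared - actual,      # declared but not found in the code
        "matched": declared & actual,
    }


declared = {Capability("network", "api.example.com")}
actual = {
    Capability("network", "api.example.com"),
    Capability("shell", "exec"),
}
report = compare_capabilities(declared, actual)
```

Here the undeclared shell capability is the kind of deviation a downstream step would then need to attribute to oversight or to malice.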

2605.11756 2026-05-13 cs.CV cs.AI Version update

Focusable Monocular Depth Estimation

Yuxin Du, Tao Lin, Zile Zhong, Runting Li, Xiyao Chen, Jiting Liu, Chenglin Liu, Ying-Cong Chen, Yuqian Fu, Bo Zhao

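Affiliations * School of Artificial Intelligence, Shanghai Jiao Tong University; The Hong Kong University of Science and Technology (Guangzhou); King Abdullah University of Science and Technology

AI summary This paper proposes Focusable Monocular Depth Estimation (FDE), which aims to improve depth accuracy in user-specified or task-relevant regions. It introduces FocusDepth, a prompt-conditioned framework whose Multi-Scale Spatial-Aligned Fusion (MSSA) aligns and fuses multi-scale features with target-region prompts, strengthening depth perception in the target region while preserving global scene geometry. The authors also build the FDE-Bench benchmark, on which the method clearly improves over existing baselines at target boundaries and in foreground regions.

Details
English abstract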

Monocular depth foundation models generalize well across scenes, yet they are typically optimized with uniform pixel-wise objectives that do not distinguish user-specified or task-relevant target regions from the surrounding context. We therefore introduce Focusable Monocular Depth Estimation (FDE), a region-aware depth estimation task in which, given a specified target region, the model is required to prioritize foreground depth accuracy, preserve sharp boundary transitions, and maintain coherent global scene geometry. To prioritize task-critical region modeling, we propose FocusDepth, a prompt-conditioned monocular relative depth estimation framework that guides depth modeling to focus on target regions via box/text prompts. The core Multi-Scale Spatial-Aligned Fusion (MSSA) in FocusDepth spatially aligns multi-scale features from Segment Anything Model 3 to the Depth Anything family and injects them through scale-specific, gated conditional fusion. This enables dense prompt cue injection without disrupting geometric representations, thereby endowing the depth estimation model with focused perception capability. To study FDE, we establish FDE-Bench, a target-centric monocular relative depth benchmark built from image-target-depth triplets across five datasets, containing 252.9K/72.5K train/val triplets and 972 categories spanning real-world and embodied simulation environments. On FDE-Bench, FocusDepth consistently improves over globally fine-tuned DA2/DA3 baselines under both box and text prompts, with the largest gains appearing in target boundary and foreground regions while preserving global scene geometry. Ablations show that MSSA's spatial alignment is the key design factor, as disrupting prompt-geometry correspondence increases AbsRel by up to 13.8%.

2605.11753 2026-05-13 cs.AI Version update

Towards Visually Grounded Multimodal Summarization via Cross-Modal Transformer and Gated Attention

Abid Ali, Diego Molla-Aliod, Usman Naseem

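Affiliations * School of Computing, Macquarie University

AI summary This paper studies multimodal summarization, which aims to generate semantically coherent and accurate summaries from text and images. To address the mismatch between visual features and language-model representations in existing methods, the authors propose SPeCTrA-Sum, a unified framework that deeply aligns the visual and language encoders and adds a Visual Relevance Predictor to select representative images. Experiments show that the method produces more visually grounded summaries and selects more representative images.

Comments Accepted to Findings of ACL 2026

Details
English abstract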

Multimodal summarization requires models to jointly understand textual and visual inputs to generate concise, semantically coherent summaries. Existing methods often inject shallow visual features into deep language models, leading to representational mismatches and weak cross-modal grounding. We propose a unified framework that jointly performs text summarization and representative image selection. Our system, SPeCTrA-Sum (Sampler Perceiver with Cross-modal Transformer and gated Attention for Summarization), introduces two key innovations. First, a Deep Visual Processor (DVP) aligns the visual encoder with the language model at corresponding depths, enabling hierarchical, layer-wise fusion that preserves semantic consistency. Second, a lightweight Visual Relevance Predictor (VRP) selects salient and diverse images by distilling soft labels from a Determinantal Point Process (DPP) teacher. SPeCTrA-Sum is trained using a multi-objective loss that combines autoregressive summarization, cross-modal alignment, and DPP-based distillation. Experiments show that our system produces more accurate, visually grounded summaries and selects more representative images, demonstrating the benefits of depth-aware fusion and principled image selection for multimodal summarization.

2605.11750 2026-05-13 cs.RO cs.AI cs.CL cs.CV Version update

DreamAvoid: Critical-Phase Test-Time Dreaming to Avoid Failures in VLA Policies

Xianzhe Fan, Yuxiang Lu, Shenyuan Gao, Xiaoyang Wu, Ruihua Han, Manling Li, Hengshuang Zhao

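Affiliations * HKU; HKUST; Northwestern University

AI summary Vision-Language-Action (VLA) models are prone to irrecoverable failures in fine-grained manipulation when small action errors occur during critical phases. To address this, the paper proposes DreamAvoid, a framework that "dreams" at test time to anticipate and avoid failures. It introduces a Dream Trigger, an Action Proposer, and a Dream Evaluator, simulating the short-horizon outcomes of candidate actions and selecting the best one to raise task success. Experiments show that DreamAvoid reduces failures and improves completion rates on real-world manipulation tasks.

Comments 19 pages, 7 figures

Details
English abstract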

Vision-Language-Action (VLA) models are often brittle in fine-grained manipulation, where minor action errors during the critical phases can rapidly escalate into irrecoverable failures. Since existing VLA models rely predominantly on successful demonstrations for training, they lack an explicit awareness of failure during these critical phases. To address this, we propose DreamAvoid, a critical-phase test-time dreaming framework that enables VLA models to anticipate and avoid failures. We also introduce an autonomous boundary learning paradigm to refine the system's understanding of the subtle boundary between success and failure. Specifically, we (1) utilize a Dream Trigger to determine whether the execution has entered a critical phase, (2) sample multiple candidate action chunks from the VLA via an Action Proposer, and (3) employ a Dream Evaluator, jointly trained on mixed data (success, failure, and boundary cases), to "dream" the short-horizon futures corresponding to the candidate actions, evaluate their values, and select the optimal action. We conduct extensive evaluations on real-world manipulation tasks and simulation benchmarks. The results demonstrate that DreamAvoid can effectively avoid failures, thereby improving the overall task success rate. Our code is available at https://github.com/XianzheFan/DreamAvoid.
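The three-step procedure in this abstract (trigger, propose, evaluate and select) has the shape of a best-of-N selection loop, sketched below. This is only an outline under stated assumptions: `select_action` and its callable arguments are hypothetical stand-ins for the Dream Trigger, Action Proposer, and Dream Evaluator, and the toy scalar "actions" replace real action chunks and learned models.

```python
from typing import Callable, TypeVar

Obs = TypeVar("Obs")
Act = TypeVar("Act")


def select_action(
    obs: Obs,
    propose: Callable[[Obs], Act],
    is_critical: Callable[[Obs], bool],
    evaluate: Callable[[Obs, Act], float],
    n_candidates: int = 4,
) -> Act:
    """Outside critical phases, act directly; inside them, sample several
    candidates and keep the one the evaluator scores highest."""
    if not is_critical(obs):
        return propose(obs)
    candidates = [propose(obs) for _ in range(n_candidates)]
    return max(candidates, key=lambda action: evaluate(obs, action))


# Toy usage with scalar "actions"; the evaluator prefers values near 0.5.
_samples = iter([0.9, 0.1, 0.5, 0.3])
chosen = select_action(
    obs=1.0,
    propose=lambda obs: next(_samples),
    is_critical=lambda obs: obs > 0.0,
    evaluate=lambda obs, action: -abs(action - 0.5),
)
```

Gating the extra sampling behind the trigger keeps the added evaluation cost confined to the critical phases.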

2605.11746 2026-05-13 cs.AI Version update

When Reasoning Traces Become Performative: Step-Level Evidence that Chain-of-Thought Is an Imperfect Oversight Channel

Wenkai Li, Fan Yang, Ananya Hazarika, Shaunak A. Mehta, Koichi Onoue

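Affiliations * Carnegie Mellon University; Fujitsu Research of America Inc.

AI summary This study examines whether the visible chain-of-thought (CoT) trace stays consistent with the computation that actually determines the answer. Using a Detect-Classify-Compare framework combined with several validation methods, it finds that in most models the trace and the answer commitment are misaligned at many steps; in particular, the trace keeps producing deliberative-looking text after the answer has been settled, without substantive effect. The study also indicates that CoT remains valuable for improving model performance but is a markedly unreliable record of when the answer was formed.

Details
English abstract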

Chain-of-thought (CoT) traces are increasingly used both to improve language model capability and to audit model behavior, implicitly assuming that the visible trace remains synchronized with the computation that determines the answer. We test this assumption with a step-level Detect-Classify-Compare framework built around an answer-commitment proxy that is cross-validated with Patchscopes, tuned-lens probes, and causal direction ablation. Across nine models and seven reasoning benchmarks, latent commitment and explicit answer arrival align on only 61.9% of steps on average. The dominant mismatch pattern is confabulated continuation: 58.0% of detected mismatch events occur after the answer-commitment proxy has already stabilized while the trace continues producing deliberative-looking text, and a vacuousness analysis shows that the committed answer does not change during these steps. In architecture-matched Qwen2.5/DeepSeek-R1-Distill comparisons, the reasoning pipeline changes failure composition more than aggregate alignment, most clearly at 32B where confabulated steps decrease as contradictory states increase. Lower step-level alignment is also associated with larger CoT utility, suggesting that the settings that benefit most from CoT are often the least temporally faithful. Paired truncation and a complementary donor-corruption test further indicate that much post-commitment text is not load-bearing for the final answer. These findings suggest that CoT can remain useful while still being an unreliable report of when the answer was formed.

2605.11738 2026-05-13 cs.AI Version update

OptArgus: A Multi-Agent System to Detect Hallucinations in LLM-based Optimization Modeling

Zhong Li, Zihan Guo, Xiaohan Lu, Juntao Wang, Jie Song, Chao Shen, Jiageng Wu, Mingyang Sun

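Affiliations * Great Bay University; Peking University; Jilin University; Zhejiang University; Shenzhen Loop Area Institute

AI summary This paper proposes OptArgus, a multi-agent system for detecting hallucinations in LLM-based optimization modeling. It focuses on structural inconsistencies that can arise when an LLM translates natural-language optimization problems into mathematical models and solver code, and builds a fine-grained hallucination taxonomy covering objective, variable, constraint, and implementation failures. Through multi-agent collaboration with conductor routing, specialist auditors, and evidence consolidation, OptArgus improves detection accuracy and localization and outperforms a single-agent baseline on a benchmark containing several types of data.

Details
English abstract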

Large language models (LLMs) are increasingly used to translate natural-language optimization problems into mathematical formulations and solver code, but matching the reference objective value is not a reliable test of correctness: an artifact may agree numerically while still changing the underlying optimization semantics. We formulate this issue as optimization-modeling hallucination detection, namely structural consistency auditing over the problem description, symbolic model, and solver implementation. We develop, to our knowledge, the first fine-grained hallucination taxonomy specifically for optimization modeling, spanning objective, variable, constraint, and implementation failures. We use this taxonomy to design OptArgus, a multi-agent detector with conductor routing, specialist auditors, and evidence consolidation. To evaluate this setting, we introduce a three-part benchmark suite with 484 clean artifacts, 1266 controlled injected artifacts, and 6292 natural LLM-generated artifacts. Against a matched single-agent baseline, OptArgus produces fewer false alarms on clean artifacts, more accurate top-ranked localization on controlled single-error cases, and stronger detection on natural model outputs. Together, these contributions turn optimization-modeling hallucination detection into a concrete empirical problem and suggest that modular, taxonomy-grounded auditing is a practical route to more reliable optimization modeling.

2605.11727 2026-05-13 cs.AI cs.CL cs.CV Version update

Allegory of the Cave: Measurement-Grounded Vision-Language Learning

Kepeng Xu, Li Xu, Gang He, Wenxin Yu

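Affiliations * Xidian University; Southwest University of Science and Technology

AI summary This study explores how visual inputs closer to the raw camera measurement can improve the perception of vision-language models. It proposes PRISM-VL, a measurement-grounded vision-language learning framework that combines RAW-derived inputs, camera-conditioned grounding, and Exposure-Bracketed Supervision Aggregation. Experiments show clear gains in challenging cases such as low light and high dynamic range, supporting the importance of preserving measurement-domain information for multimodal reasoning.

Details
English abstract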

Vision-language models typically reason over post-ISP RGB images, although RGB rendering can clip, suppress, or quantize sensor evidence before inference. We study whether grounding improves when the visual interface is moved closer to the underlying camera measurement. We formulate measurement-grounded vision-language learning and instantiate it as PRISM-VL, which combines RAW-derived Meas.-XYZ inputs, camera-conditioned grounding, and Exposure-Bracketed Supervision Aggregation for transferring supervision from RGB proxies to measurement-domain observations. Using a quality-controlled 150K instruction-tuning set and a held-out benchmark targeting low-light, HDR, visibility-sensitive, and hallucination-sensitive cases, PRISM-VL-8B reaches 0.6120 BLEU, 0.4571 ROUGE-L, and 82.66% LLM-Judge accuracy, improving over the RGB Qwen3-VL-8B baseline by +0.1074 BLEU, +0.1071 ROUGE-L, and +4.46 percentage points. These results suggest that part of VLM grounding error arises from information lost during RGB rendering, and that preserving measurement-domain evidence can improve multimodal reasoning.

2605.11720 2026-05-13 cs.SE cs.AI cs.MA Version update

A Research Agenda on Agents and Software Engineering: Outcomes from the Rio A2SE Seminar

Davide Taibi, Henry Muccini, Karthik Vaidhyanathan, Marcos Kalinowski, Michele Albano, Antonio Pedro Santos Alves, Renato Cerqueira, Mateus Devino, Matteo Esposito, Rodrigo Falcão, Vinicius Henning, Foutse Khomh, Valentina Lenarduzzi, Qinghua Lu, Matías Martínez, Henrique Mello, Daniel Mendez, Lucas Romao

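Affiliations * University of Southern Denmark; Software Engineering Research Center, IIIT Hyderabad; Pontifical Catholic University of Rio de Janeiro

AI summary With the rise of agentic AI, software engineering is changing in two intertwined directions: agents are increasingly used to support software engineering tasks, and agentic AI systems are themselves complex systems that require rethinking established software engineering practices. Based on the outcomes of the A2SE seminar held in Rio de Janeiro, this paper presents a community-driven research agenda that identifies six thematic areas and sets short-term and long-term research directions for each, giving the software engineering community a structured foundation for coordinating research efforts.

Comments 6 pages, 1 table, A2SE meeting, https://sites.google.com/view/a2se2026/home

Details
English abstract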

The rise of agentic AI is reshaping software engineering in two intertwined directions: agents are increasingly applied to support software engineering tasks, and agentic AI systems themselves are complex systems that require rethinking currently established software engineering practices. To chart a coherent research agenda covering the two directions, we organized the A2SE seminar in Rio de Janeiro, bringing together 18 experts from academia and industry. Through structured presentations, collaborative topic clustering, and focused group discussions, participants identified six thematic areas: Governance, Software Engineering for Agents, Agents for Software Architecture, Quality and Evaluation, Sustainability, and Code, and they prioritized short-term and long-term research directions for each. This paper presents the resulting community-driven, opinionated research agenda, offering the SE community a structured foundation for coordinating efforts at this critical juncture.

2605.11718 2026-05-13 q-bio.NC cs.AI cs.NE Version update

Self-organized MT Direction Maps Emerge from Spatiotemporal Contrastive Optimization

Zhaotian Gu, Molan Li, Jie Su, Chang Liu, Tianyi Qian, Dahui Wang

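Affiliations * School of System Science, Beijing Normal University, Beijing 100875, China; State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing 100875, China; Qiyuan Laboratory, Beijing 100095, China

AI summary This study investigates the computational origin of direction-selective maps in the dorsal stream of the primate visual cortex, such as those in area MT. Using a spatiotemporal Topographic Deep Artificial Neural Network (TDANN) that combines self-supervised contrastive learning with a biologically inspired spatial loss, the model spontaneously develops brain-like motion direction maps and topological pinwheel structures when trained on naturalistic videos. The study shows that MT direction tuning arises from an optimization trade-off between task-driven discriminative pressure and spatial regularization, and that the representations quantitatively match macaque MT physiological baselines, offering new insight into a unified computational account of the dorsal and ventral streams.

Details
English abstract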

The spatial and functional organization of the primate visual cortex is a fundamental problem in neuroscience. While recent computational frameworks like the Topographic Deep Artificial Neural Network (TDANN) have successfully modeled spatial organization in the ventral stream, the computational origins of the dorsal stream's distinct topographies, such as direction-selective maps in the middle temporal (MT) area, remain largely unresolved. In this work, we present a spatiotemporal TDANN to investigate whether MT topography is governed by the same universal principles. By training a 3D ResNet on naturalistic videos via a Momentum Contrast (MoCo) self-supervised paradigm alongside a biologically inspired spatial loss, we demonstrate the spontaneous emergence of brain-like direction maps and topological pinwheel structures. Crucially, we reveal that MT tuning properties, characterized by strong direction selectivity paired with a residual axial component, arise from a strict optimization trade-off between task-driven discriminative pressure and spatial regularization. The model's representations quantitatively match in vivo macaque MT physiological baselines, including direction selectivity index, circular variance, and pinwheel density. These findings unify the computational origins of the ventral and dorsal streams, establishing a general mechanism for cortical self-organization.

2605.11716 2026-05-13 cs.AI Version update

SafeSteer: A Decoding-level Defense Mechanism for Multimodal Large Language Models

Xinyi Zeng, Xue Yang, Jingyuan Zhang, Huanqian Yan, Xiang Chen, Kaiwen Wei, Hankun Kang, Yu Tian

发表机构 * Tsinghua University(清华大学) Shanghai Jiao Tong University(上海交通大学) Kuaishou Technology(快手科技) School of Computer Science and Technology, Beihang University(北航计算机科学与技术学院) Nanjing University of Aeronautics and Astronautics(南京航空航天大学) Chongqing University(重庆大学) Wuhan University(武汉大学)

AI总结 多模态大语言模型(MLLMs)在面对 Jailbreak 攻击时面临较大安全挑战,现有防御方法依赖昂贵的微调或低效的后处理,难以应对新型攻击且存在性能折衷。本文提出 SafeSteer,一种基于解码阶段的防御机制,通过引入轻量级的 Decoding-Probe 检测并修正有害输出,并结合模态语义对齐向量将文本安全对齐能力迁移至视觉模态。实验表明,SafeSteer 在无需微调的情况下可提升 MLLMs 的安全性达 33.40%,同时保持模型的有效性与实用性。

详情
英文摘要

Multimodal large language models (MLLMs) are gaining increasing attention. Due to the heterogeneity of their input features, they face significant challenges in terms of jailbreak defenses. Current defense methods rely on costly fine-tuning or inefficient post-hoc interventions, limiting their ability to address novel attacks and involving performance trade-offs. To address these issues, we explore the inherent safety capabilities within MLLMs and quantify their intrinsic ability to discern harmfulness at the decoding stage. We observe that 1) MLLMs can distinguish harmful from harmless inputs during the decoding process, and 2) image-based attacks are more stealthy. Based on these insights, we introduce SafeSteer, a decoding-level defense mechanism for MLLMs. Specifically, it includes a Decoding-Probe, a lightweight probe for detecting and correcting harmful output during decoding, which iteratively steers the decoding process toward safety. Furthermore, a modal semantic alignment vector is integrated to transfer the strong textual safety alignment to the vision modality. Experiments on multiple MLLMs demonstrate that SafeSteer can improve MLLMs' safety by up to 33.40% without fine-tuning. Notably, it can maintain the effectiveness of MLLMs, ensuring a balance between their helpfulness and harmlessness.

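2605.11712 2026-05-13 cs.AI version update

Toward Stable Value Alignment: Introducing Independent Modules for Consistent Value Guidance

Wenhao Chen, Sirui Sun, Shengyuan Bai, Guojie Song

Affiliations * School of Electronics Engineering and Computer Science, Peking University; Yuanpei College, Peking University; State Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University

AI summary This paper addresses the unstable value expression that arises during value alignment of large language models (LLMs) because of the highly dynamic residual stream, and proposes a new architecture called the Stable Value Guidance Transformer (SVGT). The method introduces an independent value module that separates value representations from the backbone and uses learnable Bridge Tokens to provide stable value guidance, substantially improving safety while maintaining generation fluency. Experiments show that SVGT effectively reduces harmful outputs on multiple benchmarks, supporting the efficacy of structured value modeling.

Comments Accepted to ICML 2026 (Spotlight). 32 pages

Details
English abstract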

Aligning large language models (LLMs) with human values typically relies on post-training or inference-time steering that directly manipulates the backbone's parameters or representation space. However, a critical gap exists: the model's residual stream is highly dynamic, in which values exist as fragile, low-dimensional properties, inherently incompatible with the stability required for consistent value expression. In this paper, we propose the Stable Value Guidance Transformer (SVGT), which addresses this gap through an independent value module incorporating two key designs: (1) independent value modeling, maintaining normative representations in a dedicated value space isolated from the backbone, and (2) explicit behavioral guidance, transducing these stable signals into learnable latent Bridge Tokens. These tokens serve as dynamic value anchors to explicitly steer the generative trajectory, ensuring robust adherence across diverse contexts without disrupting the backbone's internal representations. Experiments across multiple backbones and safety benchmarks show that SVGT generally reduces harmful scores by over 70% while maintaining generation fluency, demonstrating the efficacy of architecturally grounded value modeling. Our code is available at https://github.com/Clervils/SVGT.git.

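2605.11711 2026-05-13 cs.LG cs.AI version update

Debiased Model-based Representations for Sample-efficient Continuous Control

Jiafei Lyu, Zichuan Lin, Scott Fujimoto, Kai Yang, Yangkun Chen, Saiyong Yang, Zongqing Lu, Deheng Ye

Affiliations * Tencent Hunyuan; McGill University; School of Computer Science, Peking University

AI summary This paper proposes DR.Q, a debiased model-based representation learning method for improving sample efficiency in continuous control tasks. By maximizing the mutual information between the current state-action pair and its next state, combined with faded prioritized experience replay, the method mitigates the bias and overfitting that affect existing representation learning approaches. Experiments show that DR.Q performs strongly on multiple benchmarks, matching or surpassing recent strong baselines.

Comments ICML 2026

Details
English abstract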

Model-based representations have recently stood out as a promising framework that embeds latent dynamics information into the representations for downstream off-policy actor-critic learning. This framework implicitly combines the advantages of both model-free and model-based approaches while avoiding the training costs associated with model-based methods. Nevertheless, existing model-based representation methods can fail to capture sufficient information about relevant variables and can overfit to early experiences in the replay buffer. These incur biases in representation and actor-critic learning, leading to inferior performance. To address this, we propose Debiased model-based Representations for Q-learning, termed the DR.Q algorithm. DR.Q explicitly maximizes the mutual information between the representations of the current state-action pair and the next state besides minimizing their deviations, and samples transitions with faded prioritized experience replay. We evaluate DR.Q on numerous continuous control benchmarks with a single set of hyperparameters, and the results demonstrate that DR.Q can match or surpass recent strong baselines, sometimes outperforming them by a large margin. Our code is available at https://github.com/dmksjfl/DR.Q.

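2605.11696 2026-05-13 cs.CV cs.AI cs.GR version update

WildRelight: A Real-World Benchmark and Physics-Guided Adaptation for Single-Image Relighting

Lezhong Wang, Mehmet Onurcan Kaya, Siavash Bigdeli, Jeppe Revall Frisvad

Affiliations * Technical University of Denmark; Inria

AI summary WildRelight is the first in-the-wild dataset designed specifically for single-image relighting. It contains high-resolution outdoor scenes paired with high-dynamic-range environment maps and is used to evaluate how existing methods perform in real environments. The dataset reveals that state-of-the-art models trained on synthetic data suffer from severe domain shift in the real world. The study proposes a physics-guided inference framework that combines Diffusion Posterior Sampling with temporal sampling-aware test-time adaptation, aligning synthetic models with real scenes on the fly and offering a new approach to the sim-to-real challenge.

Comments Companion paper to the CVPR26 findings paper 'WildRelight', introducing the physics-guided adaptation method evaluated on the dataset. Project Page: https://lez-s.github.io/wildrelight_proj/

Details
English abstract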

Recent single-image relighting methods, powered by advanced generative models, have achieved impressive photorealism on synthetic benchmarks. However, their effectiveness in the complex visual landscape of the real world remains largely unverified. A critical gap exists, as current datasets are typically designed for multi-view reconstruction and fail to address the unique challenges of single-image relighting. To bridge this synthetic-to-real gap, we introduce WildRelight, the first in-the-wild dataset specifically created for evaluating single-image relighting models. WildRelight features a diverse collection of high-resolution outdoor scenes, captured under strictly aligned, temporally varying natural illuminations, each paired with a high-dynamic-range environment map. Using this data, we establish a rigorous benchmark revealing that state-of-the-art models trained on synthetic data suffer from severe domain shifts. The strictly aligned temporal structure of WildRelight enables a new paradigm for domain adaptation. We demonstrate this by introducing a physics-guided inference framework that leverages the captured natural light evolution as a self-supervised constraint. By integrating Diffusion Posterior Sampling (DPS) with temporal Sampling-Aware Test-Time Adaptation (TTA), we show that the dataset allows synthetic models to align with real-world statistics on-the-fly, transforming the intractable sim-to-real challenge into a tractable self-supervised task. The dataset and code will be made publicly available to foster robust, physically-grounded relighting research.

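2605.11695 2026-05-13 cs.CV cs.AI version update

Emergent Communication between Heterogeneous Visual Agents through Decentralized Learning

Mikako Ochiai, Masatoshi Nagano, Tadahiro Taniguchi

Affiliations * Graduate School of Informatics, Kyoto University

AI summary This paper studies communication that emerges between heterogeneous visual agents through decentralized learning, asking what visual information can be shared when agents have different visual representations. The agents exchange only discrete token sequences and update their own models from local perceptual evidence, without relying on a shared communicative objective. Experiments show that this form of communication produces visually informative shared token sequences that outperform a no-communication baseline in cross-agent alignment, visual-feature prediction, and image-text retrieval, and reveal how visual encoder heterogeneity affects the content and symmetry of the resulting language.

Details
English abstract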

Symbols are shared, but perception is private. We study emergent communication between heterogeneous visual agents through decentralized learning, asking what visual information can become shareable when agents have different visual representations. Instead of optimizing messages through a shared external communicative objective, our agents exchange only discrete token sequences and update their own models using local perceptual evidence. This setting focuses on an underexplored aspect of emergent communication, examining whether common symbols can arise without shared perceptual access, and how the similarity between private visual spaces constrains the content and symmetry of the resulting language. We instantiate this setting in the Metropolis-Hastings Captioning Game (MHCG), where two agents collaboratively form shared captions by exchanging proposed token sequences that a listener accepts or rejects using an MH-style criterion evaluated against its own visual features. We compare three pairings of frozen visual encoders, with agents starting from randomly initialized text modules. Experiments on MS-COCO show that MHCG produces visually informative shared token sequences that outperform a no-communication baseline in cross-agent alignment, visual-feature prediction, and image-text retrieval; all cross-agent metrics decline as encoder mismatch increases. Moderate encoder heterogeneity reduces the number of shared sequences while preserving per-sequence visual specificity, whereas stronger encoder heterogeneity yields fewer, coarser, and more asymmetric sequences. Ablations show that listener-side MH acceptance is critical for avoiding degenerate token formation. These results suggest that shared symbols can arise from local perceptual evaluation alone, with visual representational similarity across encoders shaping both the content and symmetry of the resulting language.
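
The listener-side acceptance step mentioned in the abstract can be sketched as a standard Metropolis-Hastings test. This is a minimal sketch, assuming the listener scores each caption with a log-likelihood under its own model and private visual features; `mh_accept` and its inputs are hypothetical names, and the exact criterion used in MHCG may differ.

```python
import math
import random

def mh_accept(logp_proposed, logp_current, rng=random):
    """Listener-side Metropolis-Hastings test (illustrative, not the paper's code).

    logp_proposed / logp_current: log-likelihoods the listener's own model
    assigns to the speaker's proposed caption and to the listener's current
    caption, given the listener's private visual features.
    Returns True if the proposal should replace the current caption.
    """
    log_ratio = logp_proposed - logp_current
    if log_ratio >= 0:
        return True  # proposals the listener finds at least as likely are always kept
    # Otherwise accept with probability equal to the likelihood ratio.
    return rng.random() < math.exp(log_ratio)
```

Because the test uses only the listener's own likelihoods, neither agent needs access to the other's visual features or parameters.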

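2605.11693 2026-05-13 cs.AI version update

Measuring What Matters Beyond Text: Evaluating Multimodal Summaries by Quality, Alignment, and Diversity

Abid Ali, Diego Molla-Aliod, Usman Naseem

Affiliations * School of Computing, Macquarie University

AI summary To address the shortcomings of existing evaluation methods for multimodal summarization, this study proposes MM-Eval, a unified evaluation framework that jointly measures text quality, image-text alignment, and visual diversity. MM-Eval combines metrics for factual consistency, semantic coherence, image relevance, and visual diversity to evaluate multimodal summaries more comprehensively and accurately. Experiments show that the framework outperforms traditional heuristic approaches and provides an interpretable, reference-weak solution for comparative evaluation of multimodal summarization systems.

Comments Accepted to Findings of ACL 2026

Details
English abstract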

Multimodal Large Language Models (MLLMs) have facilitated Multimodal Summarization with Multimodal Output (MSMO), wherein systems generate concise textual summaries accompanied by salient visuals from multimodal sources. However, current MSMO evaluation remains fragmented: text quality, image-text alignment, and visual diversity are typically assessed in isolation using unimodal metrics, making it difficult to capture whether the modalities jointly support a faithful and useful summary. To address this gap, we introduce MM-Eval, a unified evaluation framework that integrates assessments of textual quality, cross-modal alignment, and visual diversity. MM-Eval comprises three components: (1) text quality, measured using OpenFActScore for factual consistency and G-Eval for coherence, fluency, and relevance; (2) image-text relevance, evaluated via an MLLM-as-a-judge approach; and (3) image-set diversity, quantified using Truncated CLIP Entropy. We calibrate MM-Eval through a learned aggregation model trained on the mLLM-EVAL news benchmark, aligning component contributions with human preferences. Our analysis reveals a text-dominant hierarchy in this setting, where factual consistency acts as a critical determinant of perceived overall quality, while visual relevance and diversity provide complementary signals. MM-Eval improves over heuristic aggregation baselines and provides an interpretable, reference-weak framework for comparative evaluation of multimodal summaries.

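2605.11688 2026-05-13 cs.LG cs.AI cs.MA version update

Shaping Zero-Shot Coordination via State Blocking

Mingu Kang, Sunwoo Lee, Yonghyeon Jo, Seungyul Han

Affiliations * Graduate School of Artificial Intelligence; UNIST

AI summary This paper studies zero-shot coordination (ZSC), in which agents must cooperate with partners they have never interacted with, a key requirement for real-world multi-agent systems and human-AI collaboration. To address the limited generalization of existing methods to unseen partners, the authors propose State-Blocked Coordination (SBC), a framework that generates diverse interaction scenarios in virtual environments so that agents encounter a variety of suboptimal partner policies during training, improving zero-shot coordination. Experiments show that SBC achieves superior coordination performance on multiple benchmarks, with notable advantages when collaborating with human partners.

Comments 9 technical pages followed by references and appendix

Details
English abstract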

Zero-shot coordination (ZSC) aims to enable agents to cooperate with independently trained partners without prior interaction, a key requirement for real-world multi-agent systems and human-AI collaboration. Existing approaches have largely emphasized increasing partner diversity during training, yet such strategies often fall short of achieving reliable generalization to unseen partners. We introduce State-Blocked Coordination (SBC), a simple yet effective framework that improves ZSC by inducing diverse interaction scenarios without direct environment modification. Specifically, SBC generates a family of virtual environments through state blocking, allowing agents to experience a wide range of suboptimal partner policies. Across multiple benchmarks, SBC demonstrates superior performance in zero-shot coordination, including strong generalization to human partners.

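2605.11687 2026-05-13 cs.AI version update

Persistent and Conversational Multi-Method Explainability for Trustworthy Financial AI

Georgios Makridis, Georgios Fatouros, John Soldatos, George Katsis, Dimosthenis Kyriazis

Affiliations * University of Piraeus, Greece; ExpertAI-Lux S.à r.l

AI summary Responding to the financial sector's need for trustworthy AI explanations, this study proposes an explainable AI architecture that is persistent, cross-validated across multiple methods, and accessible through conversational interaction. Its core methods include storing the outputs of multiple XAI methods as retrievable persistent objects, using retrieval-augmented generation to compare and synthesize explanations from multiple methods, and introducing automated checks to assess explanation reliability. The architecture is validated on a financial sentiment analysis task, where it significantly improves the accuracy and trustworthiness of explanations.

Comments 5 pages

Details
English abstract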

Financial institutions increasingly require AI explanations that are persistent, cross-validated across methods, and conversationally accessible to human decision-makers. We present an architecture for human-centered explainable AI in financial sentiment analysis that combines three contributions. First, we treat XAI artifacts (LIME feature attributions, occlusion-based word importance scores, and saliency heatmaps) as persistent, searchable objects in distributed S3-compatible storage with structured metadata and natural-language summaries, enabling semantic retrieval over explanation history and automatic index reconstruction after system failures. Second, we enable multi-method explanation triangulation, where a retrieval-augmented generation (RAG) assistant compares and synthesizes results from multiple XAI methods applied to the same prediction, allowing users to assess explanation robustness through natural-language dialogue. Third, we evaluate the faithfulness of generated explanations using automated checks over grounding completeness, hallucinated claims, and method-attribution behavior. We demonstrate the architecture on an EXTRA-BRAIN financial sentiment analysis pipeline using FinBERT predictions and present evaluation results showing that constrained prompting reduces hallucination rate by 36% and increases method-attribution citations by 73% compared to naive prompting. We discuss implications for trustworthy, human-centered AI services in regulated financial environments.

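2605.11678 2026-05-13 cs.AI version update

OOM-Free Alpamayo via CPU-GPU Memory Swapping for Vision-Language-Action Models

Seungwoo Roh, Huiyeong Kim, Jong-Chan Kim

Affiliations * Graduate School of Automobile and Mobility, Kookmin University, Korea

AI summary This paper proposes OOM-Free Alpamayo, a framework that uses CPU-GPU memory swapping to run vision-language-action (VLA) models efficiently on VRAM-constrained GPUs without modifying the model architecture. Through layer-wise memory management, pipelined parameter transfer, and a resident-layer decision policy, the method substantially reduces VRAM usage and speeds up inference. Experiments show up to a 3.55x speedup over existing methods on NVIDIA's Alpamayo-R1-10B model while maintaining full BF16 precision.

Comments Submitted to IEEE RTCSA on March 26, 2026 (KST); Accepted on May 4, 2026 (KST)

Details
English abstract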

End-to-end Vision-Language-Action (VLA) models for autonomous driving unify perception, reasoning, and control in a single neural network, achieving strong driving performance but requiring 20-60GB of GPU memory, far exceeding the 12-16GB available on commodity GPUs. We present a framework that enables memory-efficient VLA inference on VRAM-constrained GPUs through system-level optimization alone, without model modification. Our work proceeds in three stages: (1) Sequential Demand Layering reduces VRAM usage from model-level to layer-level granularity; (2) Pipelined Demand Layering hides parameter transfer time within layer execution time via transfer--compute overlap; and (3) a GPU-Resident Layer Decision Policy, informed by per-module residency benefit analysis, eliminates the residual transfer overhead that pipelining cannot hide. We further propose a performance prediction model that determines the optimal configuration (both the number and placement of resident layers) from a single profiling run with less than 1.3% prediction error across all configurations. Applied to NVIDIA's Alpamayo-R1-10B (21.52GB) on an RTX 5070Ti (16GB), our work achieves up to 3.55x speedup over Accelerate offloading while maintaining full BF16 precision.
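
The three stages described in the abstract can be illustrated with a toy latency model. This is a sketch under simplifying assumptions (one layer prefetched ahead, fixed per-layer transfer and compute times, hypothetical numbers in milliseconds); it is not the paper's performance prediction model.

```python
def sequential_latency(transfer_ms, compute_ms):
    """Sequential demand layering: copy each layer to the GPU, then run it."""
    return sum(t + c for t, c in zip(transfer_ms, compute_ms))

def pipelined_latency(transfer_ms, compute_ms, resident=frozenset()):
    """Pipelined demand layering: prefetch layer i+1 while layer i computes.

    Layers whose index is in `resident` stay on the GPU and need no transfer.
    """
    xfer = [0.0 if i in resident else t for i, t in enumerate(transfer_ms)]
    total = xfer[0]  # the first transfer has no compute to hide behind
    for i, c in enumerate(compute_ms):
        next_xfer = xfer[i + 1] if i + 1 < len(xfer) else 0.0
        total += max(c, next_xfer)  # the slower of compute and prefetch dominates
    return total

# Hypothetical profile: transfers are slower than compute, so pipelining
# alone leaves residual overhead that resident layers can remove.
transfer = [4.0, 4.0, 4.0, 4.0]
compute = [3.0, 3.0, 3.0, 3.0]
seq = sequential_latency(transfer, compute)
pipe = pipelined_latency(transfer, compute)
pipe_resident = pipelined_latency(transfer, compute, resident={1, 3})
```

In this toy profile, overlap hides only part of each transfer; keeping layers 1 and 3 resident removes more of what remains, which mirrors the motivation for choosing both the number and the placement of resident layers.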

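2605.11672 2026-05-13 cs.AI cs.DB version update

A CAP-like Trilemma for Large Language Models: Correctness, Non-bias, and Utility under Semantic Underdetermination

Vinu Ellampallil Venugopal

Affiliations * International Institute of Information Technology

AI summary Inspired by the CAP theorem for distributed systems, this paper proposes a CAP-like trilemma for large language models (LLMs): under semantic underdetermination, a model cannot simultaneously guarantee strong correctness, strict non-bias, and high utility. When a prompt does not determine a unique answer, producing a useful response requires the model to introduce some selection criterion, and if that criterion is neither supplied by the user nor justified by the premises, the response may be biased; conversely, a model that avoids unsupported preferences may preserve correctness and non-bias at the expense of utility. The study suggests that some LLM failures may stem from the semantic underdetermination of the task itself rather than from limits on model capability.

Details
English abstract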

The CAP theorem states that a distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance under network partition. Inspired by this result, this paper formulates a CAP-like conjecture for Large Language Models (LLMs). The proposed trilemma states that, under semantic underdetermination, an LLM cannot always simultaneously guarantee strong correctness, strict non-bias, and high utility. A prompt is semantically underdetermined when the given premises do not determine a unique answer. In such cases, a useful and decisive response requires the model to introduce a selection criterion, preference, prior, or value ordering. If this criterion is not supplied by the user or justified by the available premises, the response becomes biased in a broad selection-theoretic sense. Conversely, if the model avoids unsupported preferences, it may preserve correctness and non-bias but may reduce utility through refusal, hedging, or clarification. The paper formalizes this correctness--non-bias--utility trilemma, develops examples, and argues that certain LLM failures arise not merely from model limitations but from the structure of underdetermined decision requests.
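
The notion of semantic underdetermination can be made concrete with a toy example. The sketch below is only an illustration under assumed names (`consistent_answers`, `respond`, and the laptop data are hypothetical) and is not the paper's formalization: premises are predicates, and a request is underdetermined when more than one candidate satisfies all of them.

```python
def consistent_answers(premises, candidates):
    """Keep the candidates that satisfy every premise (each premise is a predicate)."""
    return [c for c in candidates if all(p(c) for p in premises)]

def respond(premises, candidates, criterion=None):
    feasible = consistent_answers(premises, candidates)
    if not feasible:
        return ("no_answer", [])
    if len(feasible) == 1:
        return ("answer", feasible[0])  # determined: no extra preference needed
    if criterion is not None:
        # Decisive, but the choice now depends on a criterion outside the premises.
        return ("answer", max(feasible, key=criterion))
    return ("clarify", feasible)  # avoids unsupported preferences, less useful

# Toy request: "Which laptop should I buy for under 1000?"
laptops = [("A", 900, 1.2), ("B", 950, 2.0), ("C", 1400, 1.0)]  # (name, price, weight_kg)
premises = [lambda l: l[1] < 1000]

undecided = respond(premises, laptops)
decided = respond(premises, laptops, criterion=lambda l: -l[2])  # prefers lighter
```

Here the premises leave two laptops feasible, so a decisive answer requires a preference (lighter weight) that the user never stated, while the alternative is a clarification request.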

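2605.11671 2026-05-13 cs.CR cs.AI cs.SE version update

Cochise: A Reference Harness for Autonomous Penetration Testing

Andreas Happe, Jürgen Cito

Affiliations * TU Wien

AI summary Cochise is a lightweight Python framework for autonomous penetration-testing experiments, intended as a standardized experimental platform for evaluating LLM-driven penetration-testing agents. It connects to a Linux host over SSH, supports controlled target environments, and uses a Planner-Executor architecture that separates long-term state from execution logic, improving controllability and reproducibility. The work also provides replay and analysis tools that let researchers visualize runs and evaluate performance, supporting deeper study of the behavior and efficiency of penetration-testing agents.

Details
English abstract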

Recent work on LLM-driven autonomous penetration testing reports promising results, but existing systems often combine many architectural, prompting, and tool-integration choices, making it difficult to tell what is gained over a simple agent scaffold. We present cochise, a 597 LOC Python reference harness for autonomous penetration-testing experiments. Cochise connects an LLM-driven agent to a Linux execution host over SSH and supports controlled target environments reachable from that jump host. The prototype implements a separated Planner--Executor architecture in which long-term state is maintained outside the LLM context, while a ReAct-style executor issues commands over SSH and self-corrects based on command outputs. The scenario prompt can be adapted to different target environments. To demonstrate the efficacy of our minimal harness, we evaluate it against a live third-party testbed called Game of Active Directory (GOAD). Alongside the harness, we release replay and analysis tools: (i) cochise-replay for offline visualization of captured runs, (ii) cochise-analyze-alogs and cochise-analyze-graphs for cost, token, duration, and compromise analysis, and (iii) a corpus of JSON trajectory logs from GOAD runs, allowing researchers to study agent behavior without provisioning the 48--64 GB RAM / 190 GB storage testbed themselves. Cochise is intended not as a state-of-the-art pen-testing agent, but as reusable experimental infrastructure for comparing models, agent architectures, and penetration-testing traces.

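2605.11666 2026-05-13 cs.LG cs.AI version update

Evolutionary Task Discovery: Advancing Reasoning Frontiers via Skill Composition and Complexity Scaling

Liqin Ye, Yanbin Yin, Michael Galarnyk, Yuzhao Heng, Sudheer Chava, Chao Zhang

Affiliations * Georgia Institute of Technology

AI summary This paper proposes Evolutionary Task Discovery (EvoTD), a framework that aims to improve the reasoning ability of large language models through structured evolutionary operators. The method treats data synthesis as a directed search over a dual-axis manifold of algorithmic skills and complexity attributes, introducing a crossover operator that increases the diversity of skill compositions and a parametric mutation operator that scales structural constraints to promote robust generalization. Experiments show that EvoTD effectively expands the model's reasoning frontier and generalizes well across model architectures and pretraining settings.

Details
English abstract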

The reasoning frontier of Large Language Models (LLMs) has advanced significantly through modern post-training paradigms (e.g., Reinforcement Learning from Verifiable Rewards (RLVR)). However, the efficacy of these methods remains fundamentally constrained by the diversity and complexity of the training data. One practical solution is data synthesis; yet, prevalent methods relying on unstructured mutation or exploration suffer from homogeneity collapse, failing to systematically expand the reasoning frontier. To overcome this, we propose Evolutionary Task Discovery (EvoTD), a framework that treats data synthesis as a directed search over a dual-axis manifold of Algorithmic Skills and Complexity Attributes. We introduce structured evolutionary operators to navigate this space: a Crossover operator that synthesizes novel skill compositions to enhance diversity, and a Parametric Mutation operator that scales structural constraints (e.g., input size, tree depth) to drive robust generalization. Crucially, we integrate a dynamic Zone of Proximal Development filter, ensuring tasks lie within the learnable region of the model. Empirically, EvoTD delivers substantial reasoning gains that generalize consistently across model architectures, pretraining regimes, and scales, demonstrating that structured evolutionary curricula can effectively support reasoning improvement. We release our code at https://github.com/liqinye/EvoTD.

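2605.11659 2026-05-13 cs.CV cs.AI version update

Reviving In-domain Fine-tuning Methods for Source-Free Cross-domain Few-shot Learning

Yaze Zhao, Yicong Liu, Yixiong Zou, Yuhua Li, Ruixuan Li

Affiliations * School of Computer Science and Technology, Huazhong University of Science and Technology

AI summary This paper studies how to adapt large models such as CLIP to a target domain with few samples when source-domain data is unavailable, i.e., source-free cross-domain few-shot learning (CDFSL). It finds that adapter-based methods (e.g., LoRA) outperform prompt-based methods in CDFSL, and that their advantage comes from rectifying the attention of the visual CLS token, which strengthens modality alignment and class separation. Building on this finding, the authors propose Semantic Probe, a general attention-modeling framework that improves both adapter- and prompt-based methods in CDFSL and achieves state-of-the-art results on multiple benchmarks.

Details
English abstract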

Cross-Domain Few-Shot Learning (CDFSL) aims to adapt large-scale pretrained models to specialized target domains with limited samples, yet the few-shot fine-tuning of vision-language models like CLIP remains underexplored. By establishing multiple fine-tuning baselines of CLIP for CDFSL, we find that adapter-based methods (e.g., LoRA) consistently outperform prompt-based ones (e.g., MaPLe), contrary to in-domain scenarios. To make those effective in-domain methods competitive again in CDFSL, we analyze this phenomenon and discover that LoRA's superiority stems from rectifying the collapsed attention of the visual CLS token, enhancing modality alignment and class separation by focusing on text-related visual regions. Further, we find that the textual EOS token exhibits much better attention to visual samples, and that CLIP's standard contrastive loss only weakly constrains modality alignment. Based on these insights, we propose Semantic Probe, a plug-and-play attention rectification framework for both adapter- and prompt-based methods. Extensive experiments on four CDFSL benchmarks validate our rationale, achieving state-of-the-art performance and benefiting both fine-tuning paradigms. Code will be released.

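2605.11653 2026-05-13 cs.CR cs.AI version update

Every Bit, Everywhere, All at Once: A Binomial Multibit LLM Watermark

Thibaud Gloaguen, Robin Staab, Mark Vero, Martin Vechev

Affiliations * ETH Zurich

AI summary As LLM watermarking is increasingly deployed commercially, practical demand is growing for watermarks that carry more complex multibit payloads such as user IDs or timestamps. This paper proposes a new multibit watermarking method that uses binomial encoding to embed every bit of the payload at every token position, together with a stateful encoder that dynamically adjusts encoding pressure to improve effectiveness. Experiments show that the method outperforms 8 baselines in message accuracy and robustness, with a larger advantage for large payloads and low-distortion settings, and the paper introduces per-bit confidence scoring as a more practically relevant evaluation metric.

Details
English abstract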

With LLM watermarking already being deployed commercially, practical applications increasingly require multibit watermarks that encode more complex payloads, such as user IDs or timestamps, into the generated text. In this work, we propose a fundamentally new approach for multibit watermarking: introducing binomial encoding to directly encode every bit of the payload at every token position. We complement our approach with a stateful encoder that during generation dynamically redirects encoding pressure toward underencoded bits. Our evaluation against 8 baselines on up to 64-bit payloads shows that our scheme achieves superior message accuracy and robustness, with the gap to baseline methods widening in more relevant settings (i.e., large payloads and low-distortion regimes). At the same time, we challenge prior works' evaluation metrics, highlighting their lack of practical insights, and introduce per-bit confidence scoring as a practically relevant metric for evaluating multibit LLM watermarks.

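2605.11636 2026-05-13 cs.AI version update

Seirênes: Adversarial Self-Play with Evolving Distractions for LLM Reasoning

Chi Zhang, Haibo Qiu, Qiming Zhang, Yufei Xu, Xinbo Gao, Jing Zhang

Affiliations * School of Computer Science, Wuhan University; Independent Researchers; Xidian University

AI summary This paper proposes Seirênes, an adversarial self-play reinforcement learning framework that turns LLM reasoning failures in complex contexts into a training signal, improving robustness. A single model both generates distracting contexts and solves the tasks, forcing it to identify the core logic amid noise and strengthening its underlying reasoning ability. Experiments show that Seirênes achieves significant gains on multiple mathematical reasoning benchmarks and can expose reasoning blind spots in top-tier closed-source models.

Details
English abstract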

We present Seirênes, a self-play RL framework that transforms contextual interference from a failure mode of LLM reasoning into an internal training signal for co-evolving more resilient reasoners. While RL with verifiable rewards has significantly advanced reasoning capabilities, models can still exhibit fragility when encountering non-idealized contexts: scenarios characterized by superfluous information, tangential instructions, or incidental correlations that differ from the clean distributions typical of standard benchmarks. Seirênes harnesses this vulnerability through a parameter-shared and adversarial self-play loop. Within this framework, a single model is trained to both construct plausible yet distracting contexts that expose its own reasoning blind spots, and solve problems by discerning the essential task from these perturbations to recover the core underlying logic. By pitting these competing objectives against each other, Seirênes compels the model to move beyond superficial pattern matching and anchors its capabilities in robust underlying reasoning. This continuous interaction sustains an informative co-evolutionary curriculum as the model improves. Across seven mathematical reasoning benchmarks and model scales from 4B to 30B, Seirênes achieves average gains of +10.2, +9.1, and +7.2 points. Besides, distracting contexts produced by the 4B Seirênes model reduce the accuracy of top-tier closed-source models (GPT and Gemini) by roughly 4--5 points, revealing Seirênes' general ability to uncover reasoning models' blind spots.

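2605.11634 2026-05-13 cs.CV cs.AI version update

Unlocking UML Class Diagram Understanding in Vision Language Models

Artem Naboichenko, René Peinl

Affiliations * Hof University of Applied Sciences

AI summary Although vision language models (VLMs) have made notable progress across applications, they still fall short in understanding structured visual content such as diagrams, and UML class diagram understanding in the computer science domain has received little study. This paper presents a visual question answering benchmark based on UML class diagrams that is both challenging and manageable, and constructs a large-scale training dataset of 16,000 image-question-answer triples. Experiments show that a LoRA-based fine-tune outperforms Qwen 3.5 27B, a current mainstream model, on this task.

Details
English abstract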

Although Vision Language Models (VLMs) have seen tremendous progress across all kinds of use cases, they still fall behind in answering questions regarding diagrams compared to photos. Although progress has been made on bar charts, line charts, and similar diagrams, there is still little research on other types of diagrams, e.g., in the computer science domain. Our work presents a benchmark for visual question answering based on UML class diagrams which is both challenging and manageable. We further construct a large-scale training dataset with 16,000 image-question-answer triples and show that a LoRA-based fine-tune easily outperforms Qwen 3.5 27B, a recent VLM that performs well on many other benchmarks.

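2605.11633 2026-05-13 cs.AI version update

Can LLM Agents Respond to Disasters? Benchmarking Heterogeneous Geospatial Reasoning in Emergency Operations

Junjue Wang, Weihao Xuan, Heli Qi, Pengyu Dai, Kunyi Liu, Hongruixuan Chen, Zhuo Zheng, Junshi Xia, Stefano Ermon, Naoto Yokoya

Affiliations * The University of Tokyo; RIKEN AIP; Waseda University; Stanford University

AI summary This paper introduces DORA, a benchmark for evaluating the end-to-end capabilities of LLM agents in disaster emergency response. Using 515 expert-authored tasks across 45 real-world disaster events, spanning disaster perception, spatial analysis, evacuation planning, and multi-modal report generation, it tests agents' reasoning and operational abilities over heterogeneous geospatial data. Experiments reveal three challenges for current LLM agents in disaster response: insufficient domain grounding, difficulty with tool selection and argument grounding, and fragility in long-pipeline reasoning, providing a useful reference for building more reliable disaster-response systems.

Comments DORA stress-tests LLM agents on real-world disaster operations that demand comprehensive orchestration of 108 specialized tools over heterogeneous geospatial data

Details
English abstract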

Operational disaster response goes beyond damage assessment, requiring responders to integrate multi-sensor signals, reason over road networks, populations and key facilities, plan evacuations, and produce actionable reports. However, prior work largely isolates remote-sensing perception or evaluates generic tool use, leaving the end-to-end workflows of emergency operations underexplored. In this paper, we introduce Disaster Operational Response Agent benchmark (DORA), the first agentic benchmark for end-to-end disaster response: 515 expert-authored tasks across 45 real-world disaster events spanning 10 types, paired with expert-verified, replayable gold trajectories totaling 3,500 tool-call steps. Tasks span five dimensions that cover the operational disaster-response pipeline: disaster perception, spatial relational analysis, rescue and evacuation planning, temporal evolution reasoning, and multi-modal report synthesis. Agents compose calls from a 108-tool MCP library over heterogeneous geospatial data: optical, SAR, and multi-spectral imagery across single-, bi-, and multi-temporal sequences (0.015-10m GSD), complemented by elevation and social vector layers. We comprehensively evaluate 13 frontier LLMs on our benchmark, revealing three persistent challenges: 1) disaster-domain grounding exposes unique failure modes (damage-semantic grounding, sensor-modality mismatch, and disaster-pipeline composition); 2) agents are doubly bottlenecked by tool selection and argument grounding, where gold tool-order hints improve accuracy by only 1.08-4.40%, and alternative scaffolds yield at most a 3.24% gain; 3) compositional fragility scales with trajectory length, the agent-to-gold gap widening from 7% to 56% on long pipelines. DORA establishes a rigorous testbed for operationally reliable disaster-response agents.

2605.11625 2026-05-13 cs.AI Version update

Nice Fold or Hero Call: Learning Budget-Efficient Thinking for Adaptive Reasoning

Zhaomeng Zhou, Lan Zhang, Junyang Wang, Mu Yuan, Junda Lin

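Affiliations * University of Science and Technology of China; The Chinese University of Hong Kong

AI summary This paper studies how large reasoning models can reason adaptively and more efficiently under limited compute. The authors propose Budget-Efficient Thinking (BET), a two-stage framework that combines a behavioral cold start with an investment-cost-aware reward so that the model allocates its compute budget according to the expected return of reasoning rather than problem difficulty. BET teaches the model to answer easy questions quickly, give up early on unsolvable ones, and reserve enough compute for hard but solvable ones, substantially reducing reasoning overhead and improving overall performance across several benchmarks.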

Comments 24 pages, 6 figures, 11 tables

Details
Abstract

Large reasoning models (LRMs) improve problem solving through extended reasoning, but often misallocate test-time compute. Existing efficiency methods reduce cost by compressing reasoning traces or conditioning budget on perceived difficulty, yet largely overlook solvability. As a result, they may spend large budgets on queries beyond the model's capability while compressing hard-but-solvable queries that require deeper reasoning. In this work, we formulate adaptive reasoning as a computational investment under uncertainty, where budget should follow the expected return of reasoning rather than perceived difficulty alone. To instantiate this principle, we propose Budget-Efficient Thinking (BET), a two-stage framework that combines behavioral cold-start with GRPO under an investment-cost-aware reward. By aligning solve-or-fold decisions with rollout-derived solvability, BET learns three behaviors: (1) short solve, answering easy queries concisely; (2) nice fold, abstaining early when continued reasoning has near-zero expected return; and (3) hero call, preserving sufficient compute for hard-but-solvable queries. Across seven benchmarks and three base models, BET reduces reasoning tokens by ~55% on average while achieving overall performance improvements, and transfers zero-shot from mathematical reasoning to scientific QA and logical reasoning with comparable efficiency gains.

2605.11613 2026-05-13 cs.LG cs.AI Version update

From Generic Correlation to Input-Specific Credit in On-Policy Self Distillation

Guobin Shen, Lei Huang, Xiang Cheng, Chenxiao Zhao, Jindong Li, Dongcheng Zhao, Xing Yu

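Affiliations * Xiaohongshu Inc.; Institute of Automation, Chinese Academy of Sciences

AI summary This paper studies how on-policy self-distillation can move from generic correlation to input-specific credit assignment. The authors show that the standard self-distillation reward is essentially the pointwise mutual information (pMI) between the response and the feedback, and decompose it further into an input-dependent part and a generic-shortcut part. Building on this, they propose CREDIT, which isolates the input-specific reward component with a batch-contrastive baseline, improving performance on multiple tasks at minimal compute overhead.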

Details
Abstract

On-policy self-distillation has emerged as a promising paradigm for post-training language models, in which the model conditions on environment feedback to serve as its own teacher, providing dense token-level rewards without external teacher models or step-level annotations. Despite its empirical success, what this reward actually measures and what kind of credit it assigns remain unclear. Under a posterior-compatibility interpretation of feedback conditioning, standard in the implicit-reward literature, we show that the self-distillation token reward is a Bayesian filtering increment whose trajectory sum is exactly the pointwise mutual information between the response and the feedback given the input. This pMI can be raised by input-specific reasoning or by input-generic shortcuts, so we further decompose the teacher log-probability along the input axis. Based on this analysis, we propose CREDIT (Contrastive REward from DIsTillation), which isolates the input-specific component with a batch-contrastive baseline. At the sequence level, CREDIT is a teacher-side surrogate for a contrastive pMI objective that also penalizes responses remaining likely under unrelated inputs. Across coding, scientific reasoning, and tool-use benchmarks on two model families, CREDIT delivers the strongest aggregate performance at negligible additional compute.
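
The pMI identity stated in the abstract can be written out as a short derivation. This is a sketch under stated assumptions, not the paper's own notation: x is the input, f the feedback, y = (y_1, ..., y_T) the response, and the token reward r_t is assumed to be the log-ratio between the feedback-conditioned (teacher) and unconditioned (student) next-token probabilities of the same model. Posterior compatibility is taken to mean that the feedback-conditioned model equals the Bayesian posterior p(y | x, f).

```latex
% Token reward as a Bayesian filtering increment (assumed form):
r_t \;=\; \log \pi(y_t \mid x, f, y_{<t}) - \log \pi(y_t \mid x, y_{<t})
     \;=\; \log p(f \mid x, y_{\le t}) - \log p(f \mid x, y_{<t})

% Summing over the trajectory, the increments telescope:
\sum_{t=1}^{T} r_t
  \;=\; \log \frac{\pi(y \mid x, f)}{\pi(y \mid x)}
  \;=\; \log \frac{p(y, f \mid x)}{p(y \mid x)\, p(f \mid x)}
  \;=\; \mathrm{pMI}(y; f \mid x)
```

The second equality in the first line is Bayes' rule applied to one token; the sum then collapses to log p(f | x, y) - log p(f | x), which is the pointwise mutual information between response and feedback given the input.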

2605.11612 2026-05-13 cs.CL cs.AI Version update

When Emotion Becomes Trigger: Emotion-style dynamic Backdoor Attack Parasitising Large Language Models

Ziyu Liu, Tao Li, Tianjie Ni, Xiaolong Lan, Wengang Ma, Tao Yang, Guohua Wang, Junjiang He

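Affiliations * School of Cyber Science and Engineering, Sichuan University; School of Computer Science, China West Normal University; School of Electronic and Information Engineering, Lanzhou Jiaotong University

AI summary This work proposes Paraesthesia, a new backdoor attack on large language models that uses emotion as a dynamic trigger to attack the model stealthily. Unlike conventional backdoor attacks based on fixed trigger words, Paraesthesia exploits the fact that emotional style forms separate clusters in representation space: it embeds emotion as the trigger signal in the training data so that the model produces a predefined malicious output when it encounters inputs with a particular emotion at inference time. Experiments show an attack success rate of around 99% across multiple tasks and models while preserving the model's normal functionality.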

Details
Abstract

Backdoor vulnerabilities are widespread in the fine-tuning of large language models (LLMs). Most backdoor poisoning methods operate mainly at the token level and lack deeper semantic manipulation, which limits stealthiness. In addition, prior attacks rely on a single fixed trigger to induce harmful outputs. Such static triggers are easy to detect, and clean fine-tuning can weaken the trigger-target association. Through causal validation, we observe that emotion is not directly linked to individual words, but functions as an overall stylistic factor through tone. In the representation space of an LLM, emotion can be decoupled from semantics, forming a distinct cluster from the original neutral text. Therefore, we use the emotional factor as the backdoor trigger and propose a parasitic emotion-style dynamic backdoor attack, Paraesthesia. After samples carrying the emotional trigger are mixed into clean data and the model is fine-tuned, the model generates the predefined attack response when it encounters emotional inputs at inference time. Paraesthesia includes two components: the quantification and the rewriting of emotional styles. We evaluate the effectiveness of our method on instruction-following generation and classification tasks. The experimental results show that Paraesthesia achieves an attack success rate of around 99% across both task types and four different models, while maintaining the clean utility of the models.

2605.11609 2026-05-13 cs.LG cs.AI cs.CL Version update

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Guobin Shen, Xiang Cheng, Chenxiao Zhao, Lei Huang, Jindong Li, Dongcheng Zhao, Xing Yu

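Affiliations * Xiaohongshu Inc.; Institute of Automation, Chinese Academy of Sciences

AI summary This work addresses the weak results of on-policy self-distillation on mathematical reasoning and proposes Anti-Self-Distillation (AntiSD). A pointwise mutual information analysis shows that the privileged context makes the teacher overconfident on tokens already implied by the known solution structure while neglecting the deliberation steps that matter for reasoning. AntiSD maximizes the divergence between the student and teacher distributions, reversing the gradient direction of conventional self-distillation and improving reasoning more effectively. Experiments show that the method substantially reduces training steps and raises reasoning accuracy across several large language models.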

Details
Abstract

On-policy self-distillation, where a student is pulled toward a copy of itself conditioned on privileged context (e.g., a verified solution or feedback), offers a promising direction for advancing reasoning capability without a stronger external teacher. Yet in math reasoning the gains are inconsistent, even when the same approach succeeds elsewhere. A pointwise mutual information analysis traces the failure to the privileged context itself: it inflates the teacher's confidence on tokens already implied by the solution (structural connectives, verifiable claims) and deflates it on deliberation tokens ("Wait", "Let", "Maybe") that drive multi-step search. We propose Anti-Self-Distillation (AntiSD), which ascends a divergence between student and teacher rather than descending it: this reverses the per-token sign and yields a naturally bounded advantage in one step. An entropy-triggered gate disables the term once the teacher entropy collapses, completing a drop-in replacement for default self-distillation. Across five models from 4B to 30B parameters on math reasoning benchmarks, AntiSD reaches the GRPO baseline's accuracy in 2 to 10x fewer training steps and improves final accuracy by up to 11.5 points. AntiSD opens a path to scalable self-improvement, where a language model bootstraps its own reasoning through its training signal.

2605.11608 2026-05-13 cs.CL cs.AI cs.LG Version update

PRISM: A Geometric Risk Bound that Decomposes Drift into Scale, Shape, and Head

Chieh-Yen Lin, Shao-Hua Sun

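Affiliations * Appier AI Research; National Taiwan University

AI summary PRISM is a geometric risk bound for analyzing representation drift in post-training LLM variants (such as quantized, LoRA-adapted, and distilled models); it decomposes the drift into three independently measurable axes: scale, shape, and output head. Using the model's linear output head and near-isometric backbone structure, it derives an upper bound on the cross-entropy risk gap between a target model and a variant, so that it not only detects degradation but also identifies its specific cause. Experiments show strong variant-ranking performance across multiple benchmarks, and its shape regularizer outperforms conventional methods such as experience replay at preventing catastrophic forgetting.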

Details
Abstract

Comparing post-training LLM variants, such as quantized, LoRA-adapted, and distilled models, requires a diagnostic that identifies how a variant has drifted, not only whether it has degraded. Existing similarity scores such as CKA and SVCCA can flag degradation, but they do not directly link representation drift to risk or mechanism. We propose PRISM, Proxy Risk Inference via Structural Mapping, which exploits the linear output head of LLMs and the empirically near-isometric structure of their backbones to derive a closed-form upper bound on the cross-entropy risk gap between a target model and a post-training variant. The bound is calibrated for variant ranking and decomposes drift into three independently measurable axes: scale mismatch, shape mismatch, and head divergence. Each axis corresponds to a distinct failure mode, including shape distortion under low-bit quantization, scale separability under LoRA forgetting, and head divergence under GGUF k-quantization. As a result, the dominant axis suggests a remediation direction rather than merely raising a degradation flag. Because the shape term is differentiable, the same geometry can also serve as a training-time regularizer against catastrophic forgetting. Across two model families and five benchmarks, PRISM ranks variants with mean Spearman correlations of 0.820 for post-training quantization and 0.831 for LoRA forgetting, and its axis-guided shape regularizer outperforms experience replay in aggregate at mitigating downstream forgetting.

2605.11605 2026-05-13 cs.CV cs.AI Version update

Keep What Audio Cannot Say: Context-Preserving Token Pruning for Omni-LLMs

Chaeyoung Jung, Kyeongha Rho, Joon Son Chung

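Affiliations * Korea Advanced Institute of Science and Technology

AI summary Omnimodal large language models (Omni-LLMs) incur high computational cost when processing multimodal inputs, so effective token reduction is needed. This paper proposes ContextGuard, an inference-time token pruning framework that preserves broad audio-visual context while removing cross-modal redundancy, reducing the number of input tokens while maintaining performance. The method predicts coarse visual semantics from audio, prunes video tokens that can be recovered from audio, retains tokens carrying localized visual details that audio cannot express, and merges temporally similar video tokens for further compression. Experiments show that ContextGuard outperforms existing methods on multiple benchmarks and reaches high pruning ratios with strong performance without fine-tuning the downstream model.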

Details
Abstract

Omnimodal Large Language Models (Omni-LLMs) incur substantial computational overhead due to the large number of multimodal input tokens they process, making token reduction essential for real-world deployment. Existing Omni-LLM pruning methods typically reduce this cost by selecting tokens that are important for the current query or strongly aligned with cross-modal cues. However, such strategies can discard evidence that falls outside these criteria, even when needed for different questions or for understanding context beyond aligned audio-visual cues. To address this limitation, we reframe Omni-LLM token reduction as preserving broad audio-visual context while removing cross-modal redundancy. We propose ContextGuard, an inference-time token pruning framework built on this principle. ContextGuard predicts coarse visual semantics from audio and prunes video tokens whose coarse semantics are likely recoverable from audio, while retaining additional video tokens to preserve localized visual details that audio alone cannot specify. For further compression, our method merges temporally similar video tokens. The framework requires no downstream LLM fine-tuning and uses only an independently trained lightweight predictor. On Qwen2.5-Omni and Video-SALMONN2+ at 3B and 7B scales across six audio-visual benchmarks, ContextGuard outperforms prior inference-time pruning methods while pruning more tokens. Notably, on Qwen2.5-Omni 7B, ContextGuard achieves full-token-level performance on five of six benchmarks while pruning 55% of input tokens.

2605.11603 2026-05-13 cs.AI Version update

GAR: Carbon-Aware Routing for LLM Inference via Constrained Optimization

Disha Sheshanarayana, Rajat Subhra Pal, Manjira Sinha, Tirthankar Dasgupta

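Affiliations * Manipal University Jaipur; TCS Research

AI summary As LLM deployments scale up, balancing response quality against computational cost across heterogeneous model pools has become a key problem. This paper proposes Green-Aware Routing (GAR), a constrained-optimization framework that minimizes per-request carbon emissions subject to accuracy and latency constraints. GAR makes real-time routing decisions using adaptive constraint optimization and lightweight estimators, and combines an online algorithm with heuristic variants to reduce the carbon footprint while maintaining model performance, offering theoretical grounding and a practical approach for sustainable LLM inference.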

Details
Abstract

The growing deployment of large language models (LLMs) makes per-request routing essential for balancing response quality and computational cost across heterogeneous model pools. Current routing methods rarely consider sustainable energy use and CO2 emissions as optimization objectives, despite grid carbon intensity varying by time and region, and models differing significantly in energy consumption. To address this gap, we introduce Green-Aware Routing (GAR), a constrained multi-objective optimization framework that minimizes per-request CO2 emissions subject to explicit accuracy floors and p95-latency service-level objectives (SLOs). GAR employs adaptive constraint optimization through per-dataset floor tuning and incorporates lightweight estimators for correctness, tail latency, and carbon emissions, enabling real-time routing decisions without additional inference passes. We present GAR-PD, a practical online primal-dual routing algorithm for rolling carbon budgets, alongside heuristic variants that achieve high feasibility coverage while limiting accuracy degradation. Comprehensive experiments across standard NLP benchmarks with heterogeneous LLM pools (7B-70B) demonstrate that GAR achieves substantial carbon reductions while maintaining competitive accuracy and p95 latency guarantees, providing a practical, theoretically grounded approach to sustainable LLM inference.
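
The routing idea above can be illustrated with a generic online primal-dual loop. This is a minimal sketch, not GAR-PD itself: the `Candidate` fields, the scoring rule (estimated accuracy minus a carbon price), the fallback when no model is feasible, and the step size are all assumptions made for illustration, and the per-model estimates are assumed to come from lightweight estimators like those the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    est_accuracy: float     # estimated probability of a correct answer
    est_p95_latency: float  # estimated p95 latency in seconds
    est_carbon: float       # estimated grams of CO2 for this request


def route(cands, lam, acc_floor, latency_slo):
    """Primal step: pick a model for one request given the carbon price lam."""
    feasible = [
        c for c in cands
        if c.est_accuracy >= acc_floor and c.est_p95_latency <= latency_slo
    ]
    # Assumption: if nothing meets both constraints, fall back to the full pool.
    pool = feasible or cands
    return max(pool, key=lambda c: c.est_accuracy - lam * c.est_carbon)


def update_dual(lam, carbon_used, budget_per_request, step=0.1):
    """Dual step: raise the carbon price when over budget, lower it otherwise."""
    return max(0.0, lam + step * (carbon_used - budget_per_request))


pool = [
    Candidate("small-7b", est_accuracy=0.70, est_p95_latency=0.5, est_carbon=1.0),
    Candidate("large-70b", est_accuracy=0.90, est_p95_latency=2.0, est_carbon=8.0),
]
lam = 0.0
for _ in range(3):
    choice = route(pool, lam, acc_floor=0.6, latency_slo=3.0)
    lam = update_dual(lam, choice.est_carbon, budget_per_request=2.0)
    print(choice.name, round(lam, 3))
```

With a zero carbon price the larger model wins on accuracy; once its emissions exceed the per-request budget, the price rises and later requests shift to the smaller model.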

2605.11601 2026-05-13 cs.CL cs.AI Version update

DiffScore: Text Evaluation Beyond Autoregressive Likelihood

Wen Lai, Yingli Shen, Dingnan Jin, Qing Cui, Jun Zhou, Maosong Sun, Alexander Fraser

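Affiliations * Ant Group; Tsinghua University; Technical University of Munich

AI summary This paper proposes DiffScore, a text evaluation method designed to overcome the limitations that positional bias imposes on autoregressive language models in text evaluation. Built on masked large diffusion language models, DiffScore scores every token with full bidirectional context, eliminating positional bias and establishing an evaluation hierarchy from local fluency to global coherence. It also introduces diagnostic tools such as multi-timestep quality profiles and bidirectional PMI decomposition, and experiments show that it outperforms conventional autoregressive models on multiple benchmarks.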

Details
Abstract

Autoregressive language models are widely used for text evaluation; however, their left-to-right factorization introduces positional bias, i.e., early tokens are scored with only leftward context, conflating architectural asymmetry with true text quality. We propose masked reconstruction as an alternative paradigm, where every token is scored using full bidirectional context. We introduce DiffScore, an evaluation framework built on Masked Large Diffusion Language Models. By measuring text recoverability across continuous masking rates, DiffScore eliminates positional bias and naturally establishes an evaluation hierarchy from local fluency to global coherence. We further provide diagnostic tools unavailable to autoregressive frameworks: multi-timestep quality profiles that decompose scores across masking rates, and bidirectional PMI decomposition that disentangles fluency from faithfulness. Experiments across ten benchmarks show that DiffScore consistently outperforms autoregressive baselines in both zero-shot and fine-tuned settings. The code is released at: https://github.com/wenlai-lavine/DiffScore.

2605.11598 2026-05-13 cs.LG cs.AI cs.DB q-bio.QM Version update

EpiCastBench: Datasets and Benchmarks for Multivariate Epidemic Forecasting

Madhurima Panja, Danny D'Agostino, Huitao Li, Tanujit Chakraborty, Nan Liu

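Affiliations * Sorbonne University Abu Dhabi; Duke-NUS Medical School, Singapore

AI summary As data-driven methods are increasingly used in public-health decision-making, epidemic forecasting has become an important research area. To address the lack of high-quality multivariate forecasting benchmarks, this paper presents EpiCastBench, a large-scale benchmarking framework with 40 curated multivariate epidemic datasets covering a range of infectious diseases and geographic regions, with varying temporal granularity, series length, and sparsity. The study systematically compares 15 multivariate forecasting models under unified evaluation settings; all data and code are publicly available, supporting the development and validation of epidemic forecasting methods.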

Details
Abstract

The increasing adoption of data-driven decision-making in public health has established epidemic forecasting as a critical area of research. Recent advances in multivariate forecasting models better capture complex temporal dependencies than conventional univariate approaches, which model individual series independently. Despite this potential, the development of robust epidemic forecasting methods is constrained by the lack of high-quality benchmarks comprising diverse multivariate datasets across infectious diseases and geographical regions. To address this gap, we present EpiCastBench, a large-scale benchmarking framework featuring 40 curated (correlated) multivariate epidemic datasets. These publicly available datasets span a wide range of infectious diseases and exhibit diverse characteristics in terms of temporal granularity, series length, and sparsity. We analyze these datasets to identify their global features and structural patterns. To ensure reproducibility and fair comparison, we establish standardized evaluation settings, including a unified forecasting horizon, consistent preprocessing pipelines, diverse performance metrics, and statistical significance testing. By leveraging this framework, we conduct a comprehensive evaluation of 15 multivariate forecasting models spanning statistical baselines to state-of-the-art deep learning and foundation models. All datasets and code are publicly available on Kaggle (https://www.kaggle.com/datasets/aimltsf/epicastbench) and GitHub (https://github.com/aimltsf/EpiCastBench).

2605.11595 2026-05-13 cs.AI Version update

Native Explainability for Bayesian Confidence Propagation Neural Networks: A Framework for Trusted Brain-Like AI

Georgios Makridis, Georgios Fatouros, John Soldatos, George Katsis, Dimosthenis Kyriazis

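Affiliations * CC BY-NC-SA 4.0

AI summary In response to the EU Artificial Intelligence Act's transparency and trustworthiness requirements for high-risk AI systems, this paper proposes a native explainability framework for Bayesian Confidence Propagation Neural Networks (BCPNN). The framework establishes a BCPNN-specific explainability taxonomy and sixteen architecture-level explanation primitives for systematically explaining the model's decision process, and introduces five configuration-level explanation primitives to support pre-deployment auditing. The work provides theoretical support for trustworthy deployment of BCPNN on edge devices and advances the use of brain-like AI in industrial IoT.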

Comments 8 pages

Details
Abstract

The EU Artificial Intelligence Act (Regulation 2024/1689), fully applicable to high-risk systems from August 2026, creates urgent demand for AI architectures that are simultaneously trustworthy, transparent, and feasible to deploy on resource-constrained edge devices. Brain-like neural networks built on the Bayesian Confidence Propagation Neural Network (BCPNN) formalism have re-emerged as a credible alternative to backpropagation-driven deep learning. They deliver state-of-the-art unsupervised representation learning, neuromorphic-friendly sparsity, and existing FPGA implementations that target edge deployment. Despite this momentum, no systematic framework exists for explaining BCPNN decisions -- a gap the present paper fills. We argue that BCPNN is, in the sense of Rudin's interpretable-by-design agenda, an inherently transparent model whose architectural primitives map directly onto established explainable-AI (XAI) families. We make four contributions. First, we propose the first XAI taxonomy for BCPNN. It maps weights, biases, hypercolumn posteriors, structural-plasticity usage scores, attractor dynamics, and input-reconstruction populations onto attribution, prototype, concept, counterfactual, and mechanistic explanation modalities. Second, we introduce sixteen architecture-level explanation primitives (P1--P16), several without analogue in standard ANNs. We provide closed-form algorithms for computing each from quantities the model already maintains. Third, we introduce five design-time Configuration-as-Explanation primitives (Config-P1 to Config-P5) that treat BCPNN hyperparameter choices as an auditable pre-deployment explanation artifact. Fourth, we sketch a roadmap for integration into industrial IoT deployments and discuss EU AI Act alignment, edge feasibility, and Industry 5.0 implications.

2605.11592 2026-05-13 cs.LG cs.AI cs.CR Version update

SoK: Unlearnability and Unlearning for Model Dememorization

Mengying Zhang, Derui Wang, Ruoxi Sun, Xiaoyu Xia, Shuang Hao, Minhui Xue

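Affiliations * RMIT University; University of Texas at Dallas; CSIRO and Adelaide University

AI summary This paper systematically studies two key techniques for model dememorization in machine learning, unlearnability and machine unlearning, both of which aim to prevent misuse of sensitive data. It reveals their commonalities and weaknesses with respect to shallow dememorization, mutual interplay, and theoretical guarantees, and presents, for the first time, a unified taxonomy, an empirical analysis, and a theoretical guarantee, providing a theoretical foundation and practical guidance for achieving deeper dememorization.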

Comments The first two authors contributed equally

Details
Abstract

Advanced model dememorization methods, including availability poisoning (unlearnability) and machine unlearning, are emerging as key safeguards against data misuse in machine learning (ML). At the training stage, unlearnability embeds imperceptible perturbations into data before release to reduce learnability. At the post-training stage, unlearning removes previously acquired information from models to prevent unauthorized disclosure or use. While both defenses aim to preserve the right to withhold knowledge, their vulnerabilities and shared foundations remain unclear. Specifically, both unlearnability and unlearning suffer from issues such as shallow dememorization, leading to falsely claimed data learnability reduction or forgetting in the presence of weight perturbations. Moreover, input perturbations may affect the effectiveness of downstream unlearning, while unlearning may inadvertently recover domain knowledge hidden by unlearnability. This interplay calls for deeper investigation. Finally, there is a lack of formal guarantees to provide theoretical insights into current defenses against shallow dememorization. In this Systematization of Knowledge, we present the first integrated analysis of model dememorization approaches leveraging unlearnability and unlearning. Our contributions are threefold: (i) a unified taxonomy of unlearnability and scalable unlearning methods; (ii) an empirical evaluation revealing the robustness, interplay, and shallow dememorization of leading methods; and (iii) the first theoretical guarantee on dememorization depth for models processed through certified unlearning. These results lay the foundation for unifying dememorization mechanisms across the ML lifecycle to achieve a deeper immemor state for sensitive knowledge.

2605.11583 2026-05-13 eess.IV cs.AI cs.CV cs.LG eess.SP Version update

NexOP: Joint Optimization of NEX-Aware k-space Sampling and Image Reconstruction for Low-Field MRI

Tal Oved, Efrat Shimron

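Affiliations * Department of Electrical and Computer Engineering, Technion – Israel Institute of Technology; May-Blum-Dahl Technion Human MRI Research Center, Technion – Israel Institute of Technology; Department of Biomedical Engineering, Technion – Israel Institute of Technology

AI summary This paper proposes NexOP, a deep-learning framework that addresses the low signal-to-noise ratio of low-field MRI by jointly optimizing the k-space sampling strategy across multiple repeated acquisitions (NEX) and the image reconstruction. The method optimizes sampling density probabilities over the extended k-space-NEX domain to obtain a more efficient sampling strategy under a fixed sampling budget, and designs a new deep-learning architecture that reconstructs a high-quality image from multiple low-SNR measurements. Experiments show that NexOP outperforms existing methods across acceleration factors and tissue contrasts and produces non-uniform sampling schemes that exploit the NEX dimension to improve imaging efficiency and quality.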

Details
Abstract

Modern low-field magnetic resonance imaging (MRI) technology offers a compelling alternative to standard high-field MRI, with portable, low-cost systems. However, its clinical utility is limited by a low Signal-to-Noise Ratio (SNR), which hampers diagnostic image quality. A common approach to increase SNR is through repetitive signal acquisitions, known as NEX, but this results in excessively long scan durations. Although recent work has introduced methods to accelerate MRI scans through k-space sampling optimization, the NEX dimension remains unexploited; typically, a single sampling mask is used across all repetitions. Here we introduce NexOP, a deep-learning framework for joint optimization of the sampling and reconstruction in multi-NEX acquisitions, tailored for low-SNR settings. NexOP enables optimizing the sampling density probabilities across the extended k-space-NEX domain, under a fixed sampling-budget constraint, and introduces a new deep-learning architecture for reconstructing a single high-SNR image from multiple low-SNR measurements. Experiments with raw low-field (0.3T) brain data demonstrate that NexOP consistently outperforms competing methods, both quantitatively and qualitatively, across diverse acceleration factors and tissue contrasts. The results also demonstrate that NexOP yields non-uniform sampling strategies, with progressively decreasing sampling across repetitions, hence exploiting the NEX dimension efficiently. Moreover, we present a theoretical analysis supporting these numerical observations. Overall, this work proposes a sampling-reconstruction optimization framework highly suitable for low-field MRI, which can enable faster, higher-quality imaging with low-cost systems and contribute to advancing affordable and accessible healthcare.

2605.11574 2026-05-13 cs.CL cs.AI cs.LG Version update

Three Regimes of Context-Parametric Conflict: A Predictive Framework and Empirical Validation

Pruthvinath Jeripity Venkata

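Affiliations * Independent Researcher

AI summary This paper studies three distinct situations in which large language models handle conflict between their training knowledge and a contradicting document, and proposes a three-regime predictive framework. The core approach distinguishes two orthogonal dimensions, parametric strength and parametric uniqueness, and validates through extensive experiments how model behavior differs across task scenarios. The study finds that models' reliance on the document changes markedly with task framing, and reveals the dominant role of parametric certainty in factual tasks.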

Comments 10 pages, 13 tables, no figures. 9,970 API calls across five frontier models

Details
Abstract

The literature on how large language models handle conflict between their training knowledge and a contradicting document presents a persistent empirical contradiction: some studies find models stubbornly retain their trained answers, ignoring provided documents nearly half the time, while others find models readily defer to the document, following context approximately 96% of the time. We argue these contradictions dissolve once one recognises that prior experiments have studied three qualitatively distinct processing situations without distinguishing them. We propose a three-regime framework: Regime 1 (single-source updating, dominant predictor: evidence coherence), Regime 2 (competitive integration, dominant predictor: parametric certainty), and Regime 3 (task-appropriate selection, dominant predictor: task knowledge requirement). We formalise a distinction between parametric strength (exposure frequency) and parametric uniqueness (encoding consistency), showing empirically that these are orthogonal dimensions (r = -0.002, p = .97) with strength as the operative predictor in stable factual domains. We validate the framework across Claude Sonnet 4.6, GPT-5.5, Gemini 2.5 Flash, Llama 4 Maverick, and DeepSeek V3 using 9,970 API calls in three experimental phases. GEE logistic regression confirms the predicted Regime 2 certainty gradient for all five models (beta = -0.38 to -0.50, all p <= .013, BH-FDR corrected). A Regime 3 ablation shows task framing alone flips context-following from near-100% (contextual knowledge condition) to 6-71% (parametric knowledge condition), with all five models significant (p < .001). The certainty gradient is robust to multinomial outcome modeling, sensitivity analyses for hedging responses, and FDR correction.

2605.11569 2026-05-13 cs.AI cs.LG Version update

Dual-Temporal LSTM with Hybrid Attention for Airline Passenger Load Factor Forecasting: Integrating Intra-Flight and Inter-Flight Booking Dynamics

ASM Nazrul Islam, Md. Hasanul Kabir, Md. Liakot Ali, Joydeb Kumar Sana

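Affiliations * Institute of Information and Communication Technology; Bangladesh University of Engineering and Technology; Islamic University of Technology

AI summary This study addresses shortcomings in airline demand forecasting by proposing an LSTM model that combines two temporal streams with a hybrid attention mechanism to forecast flight passenger load factor more accurately. The model processes both the booking accumulation within a flight and the booking patterns across flights, overcoming the information loss of conventional single-temporal-dimension modeling. Experiments on real data from Bangladesh's national airline show high forecasting accuracy and good generalization across route types, and the airline has officially adopted the method in its operations.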

Details
Abstract

Accurate short-term demand forecasting is crucial to airline revenue management, yet most existing systems fail to meet this need because current models treat booking data as a single temporal dimension, either the accumulation of bookings for a specific flight or the historical booking profile of the same route. This unidimensional view discards information carried by the other temporal stream. Forecasting absolute passenger counts introduces a further operational fragility when a change in planned aircraft type alters total seat capacity. This study addresses both limitations. A dual-stream Long Short-Term Memory (LSTM) framework with integrated attention is proposed that simultaneously processes two complementary input sequences: a horizontal sequence capturing intra-flight booking accumulation over the days preceding departure, and a vertical sequence capturing inter-flight booking patterns at fixed days-before-departure offsets across historical flights. Multiple dual-stream architectural variants, combining self-attention, cross-attention, and hybrid attention with concatenation, residual, and gated fusion strategies, are developed and evaluated. Experiments on real-world reservation data from the national airline of Bangladesh, Biman Bangladesh Airlines (BBA), demonstrate that the proposed hybrid model achieves a Mean Absolute Error of 2.8167 and a coefficient of determination ($R^{2}$) of 0.9495, outperforming single-stream baselines, tree-based models, and three prior dual-LSTM architectures applied to the same data. Validation across four flight category comparisons (domestic versus international, direct versus transit, high versus low frequency, and short versus mid versus long haul) confirms that the model generalizes across operationally diverse route types. Biman Bangladesh Airlines (BBA) has officially integrated this methodology into its operations.
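
The two input sequences can be pictured as a row slice and a column slice of one booking table. The sketch below is illustrative only: the table layout (one row per flight in chronological order, one column per days-before-departure checkpoint, ordered from far out to departure), the function names, and the sample numbers are assumptions, not the paper's data pipeline. Dividing by seat capacity also shows why a load-factor target is insensitive to an aircraft swap, whereas a raw passenger count is not.

```python
def load_factor(booked, capacity):
    """Share of seats sold; stays comparable when the aircraft type changes."""
    return booked / capacity


def horizontal_sequence(curves, flight, day_idx):
    """Intra-flight stream: one flight's booking curve up to and including day_idx."""
    return curves[flight][: day_idx + 1]


def vertical_sequence(curves, flight, day_idx, history):
    """Inter-flight stream: the same checkpoint for up to `history` preceding flights."""
    start = max(0, flight - history)
    return [curves[f][day_idx] for f in range(start, flight)]


# Hypothetical load-factor curves for four consecutive flights on one route,
# observed at three checkpoints (for example 14, 7, and 1 days before departure).
curves = [
    [0.10, 0.30, 0.60],
    [0.20, 0.40, 0.70],
    [0.10, 0.50, 0.80],
    [0.20, 0.30, 0.55],
]

print(horizontal_sequence(curves, flight=3, day_idx=1))           # [0.2, 0.3]
print(vertical_sequence(curves, flight=3, day_idx=1, history=2))  # [0.4, 0.5]
print(load_factor(120, 160))                                      # 0.75
```

A dual-stream model would encode each slice separately and then fuse the two encodings to predict the final load factor.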

2605.11563 2026-05-13 cs.CV cs.AI Version update

TCP-SSM: Efficient Vision State Space Models with Token-Conditioned Poles

Sara Shoouri, Morteza Tavakoli Taba, Hun-Seok Kim

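Affiliations * University of Michigan

AI summary This paper proposes TCP-SSM, an efficient vision state space model that addresses the difficulty of controlling state-dependent memory behavior in existing SSMs on long-range vision tasks. By introducing token-conditioned stable poles, the method models the recurrent dynamics explicitly, improving interpretability and controllability. TCP-SSM uses real poles and complex-conjugate poles to model monotone decay and damped oscillatory responses respectively, and, through grouped pole sharing and a lightweight input pathway, reduces computational complexity by up to 44% relative to baseline models across several vision tasks.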

Details
Abstract

State Space Models (SSMs) have emerged as a compelling alternative to attention models for long-range vision tasks, offering input-dependent recurrence with linear complexity. However, most efficient SSM variants reduce computation cost by modifying scan routes, resolutions, or traversal patterns, while largely leaving the recurrent dynamics implicit. Consequently, the model's state-dependent memory behavior is difficult to control, particularly in compact backbones where long scan paths can exceed the effective memory horizon. We propose Token-Conditioned Poles SSM (TCP-SSM), a structured selective SSM framework that improves efficiency while making recurrence dynamics explicit and interpretable through stable poles. TCP-SSM builds each scan operator with 1) real poles that model monotone or sign-alternating decay, and 2) complex-conjugate poles that capture damped oscillatory responses. Using bounded radius and angle modulation, TCP-SSM converts shared base poles into token-dependent poles, allowing each scan step to adapt its memory behavior to the current visual token while preserving pole stability. For practical scalability, we integrate grouped pole sharing with a lightweight low-rank input pathway, yielding an efficient scan operator that preserves linear-time scan complexity. Across image classification, semantic segmentation, and object detection, TCP-SSM reduces SSM computation complexity by up to 44% in Vision Mamba-style models while maintaining or surpassing baseline accuracy.
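
The two pole types have simple textbook recurrences, sketched below as impulse responses. This is a generic illustration of pole behavior, not the TCP-SSM operator: the function names are hypothetical, and the sigmoid-bounded radius is only one assumed way to keep a token-dependent pole inside the unit circle.

```python
import math


def bounded_radius(base_logit, token_shift, r_max=0.99):
    """Map an unbounded, token-shifted logit to a radius in (0, r_max), so |pole| < 1."""
    return r_max / (1.0 + math.exp(-(base_logit + token_shift)))


def real_pole_response(a, steps):
    """Impulse response of h_t = a * h_{t-1} + x_t.

    0 < a < 1 gives monotone decay; -1 < a < 0 gives sign-alternating decay.
    """
    h, out = 0.0, []
    for t in range(steps):
        x = 1.0 if t == 0 else 0.0
        h = a * h + x
        out.append(h)
    return out


def conjugate_pole_response(r, theta, steps):
    """Impulse response for the pole pair r * exp(+/- i * theta).

    Recurrence: h_t = 2 r cos(theta) h_{t-1} - r^2 h_{t-2} + x_t,
    a damped oscillation whenever 0 < r < 1.
    """
    h1 = h2 = 0.0
    out = []
    for t in range(steps):
        x = 1.0 if t == 0 else 0.0
        h = 2.0 * r * math.cos(theta) * h1 - r * r * h2 + x
        out.append(h)
        h2, h1 = h1, h
    return out


print(real_pole_response(0.5, 4))   # [1.0, 0.5, 0.25, 0.125]
print(real_pole_response(-0.5, 3))  # [1.0, -0.5, 0.25]
print(conjugate_pole_response(bounded_radius(2.0, 0.0), math.pi / 3, 6))
```

A radius closer to 1 lengthens the memory horizon, and the angle sets the oscillation frequency; letting a token shift both within bounds changes memory behavior per step without leaving the stable region.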

2605.11559 2026-05-13 cs.CV cs.AI Version update

When Looking Is Not Enough: Visual Attention Structure Reveals Hallucination in MLLMs

Fanpu Cao, Xin Zou, Xuming Hu, Hui Xiong

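Affiliations * Thrust of Artificial Intelligence, HKUST (Guangzhou); Department of Computer Science and Engineering, HKUST

AI summary Multimodal large language models (MLLMs) play an important role in visual reasoning and visually grounded question answering, yet they remain prone to visual hallucination, producing answers that contradict the image content or mention nonexistent objects. This paper finds that the high-frequency structure of visual attention (layer-wise Laplacian energy) reveals how attention changes when the model hallucinates, and on that basis proposes LaSCD, a training-free decoding strategy that selects layers with high Laplacian energy and remaps next-token scores, effectively reducing hallucination while preserving the model's general capabilities.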

Details
Abstract

Multimodal large language models (MLLMs) have become a key interface for visual reasoning and grounded question answering, yet they remain vulnerable to visual hallucinations, where generated responses contradict image content or mention nonexistent objects. A central challenge is that hallucination is not always caused by a simple lack of visual attention: the model may still assign substantial attention mass to image tokens while internally drifting toward an incorrect answer. In this paper, we show that the high-frequency structure of visual attention, measured by layer-wise Laplacian energy, reveals both the layer where hallucinated preferences emerge and the layer where the ground-truth answer transiently recovers. Building on this finding, we propose LaSCD (Laplacian-Spectral Contrastive Decoding), a training-free decoding strategy that selects informative layers via Laplacian energy and remaps next-token logits in closed form. Experiments on hallucination and general multimodal benchmarks show that LaSCD consistently reduces hallucination while preserving general capabilities, highlighting its potential as a faithful decoding paradigm. The code is available at https://github.com/macovaseas/LaSCD.
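
One common way to quantify high-frequency structure in a 2D map is the energy of its discrete Laplacian, sketched below. The paper's exact definition may differ (for example a graph Laplacian or a normalized variant); the 4-neighbor stencil, the boundary handling, and the function name are assumptions, and extracting a per-layer attention map over the image-token grid from a real model is out of scope here.

```python
def laplacian_energy(attn):
    """Sum of squared 4-neighbor Laplacian responses over a 2D attention grid.

    Smooth maps score near zero; spiky, fragmented maps score high.
    Border cells use only the neighbors that exist.
    """
    rows, cols = len(attn), len(attn[0])
    energy = 0.0
    for i in range(rows):
        for j in range(cols):
            neighbors = [
                attn[u][v]
                for u, v in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                if 0 <= u < rows and 0 <= v < cols
            ]
            lap = sum(neighbors) - len(neighbors) * attn[i][j]
            energy += lap * lap
    return energy


smooth = [[0.25, 0.25], [0.25, 0.25]]
spiky = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(laplacian_energy(smooth))  # 0.0
print(laplacian_energy(spiky))   # 20.0
```

Computing this score for each layer's attention over image tokens gives a layer-wise profile from which candidate layers could be picked.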

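2605.11556 2026-05-13 cs.AI cs.LG Updated version

Hindsight Hint Distillation: Scaffolded Reasoning for SWE Agents from CoT-free Answers

Shengjie Wang, Guanghe Li, Zonghan Yang, Yang Gao

Affiliations * Tsinghua University

AI summary This work proposes Hindsight Hint Distillation (HHD), a method for learning to reason from question-answer pairs that carry no chain-of-thought (CoT) annotations, targeting complex long-horizon tasks. HHD synthesizes "hindsight hints" from the model's own failed self-rollouts, uses them to guide rollouts that succeed, and then improves the model's reasoning through self-distillation. Experiments show that HHD clearly outperforms existing methods on several benchmarks and generalizes well to unseen tasks.

Comments 28 pages, 7 figures

Details
English abstract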

Solving complex long-horizon tasks requires strong planning and reasoning capabilities. Although datasets with explicit chain-of-thought (CoT) rationales can substantially benefit learning, they are costly to obtain. To address this challenge, we propose Hindsight Hint Distillation (HHD), which only requires easy-to-obtain question-answer pairs without CoT annotations. Inspired by how human teachers use student mistakes to provide targeted guidance, HHD synthesizes hindsight hints from the model's own failed self-rollouts and uses them to scaffold on-policy rollouts that successfully complete the tasks. The model then self-distills these scaffolded trajectories and generalizes to new problems without hint guidance. Experiments show that HHD significantly outperforms iterative RFT and trajectory-synthesis baselines, achieving an absolute improvement of 8% on SWE-bench Verified, while all baselines improve by only around 2%. Notably, the reasoning strategies induced by HHD generalize effectively to out-of-distribution tasks, yielding the largest gains on SWE-bench Multilingual despite no training on multilingual data. These results demonstrate that HHD can effectively synthesize expert-like reasoning from CoT-free data and substantially improve long-horizon performance.

2605.11547 2026-05-13 cs.LG cs.AI Updated version

Sharpen Your Flow: Sharpness-Aware Sampling for Flow Matching

Aditi Gupta, Soon Hoe Lim, Annan Yu, N. Benjamin Erichson

Affiliations * Lawrence Berkeley National Laboratory; International Computer Science Institute; Department of Mathematics, KTH Royal Institute of Technology; Nordita, KTH Royal Institute of Technology and Stockholm University; Center for Applied Mathematics, Cornell University

AI summary This paper proposes SharpEuler, a training-free sampling method that improves the generation efficiency and quality of flow matching models. It profiles a pretrained model offline, estimates where the velocity field changes most rapidly, and from that profile generates a timestep grid for any inference budget, improving sampling with the same number of model evaluations. Experiments show that at a fixed compute budget SharpEuler reduces inter-mode leakage and increases mode coverage.

Details
English abstract

Flow matching models generate samples by numerically integrating a learned velocity field, with each integration step requiring a neural network evaluation. Fast generation therefore requires using a small fixed evaluation budget effectively: the key question is not only how to integrate the flow, but where the sampler should spend its steps. We propose SharpEuler, a training-free sampler that profiles a pretrained model offline by estimating where the learned velocity field changes most rapidly along calibration trajectories. This finite-difference estimate defines a solver-aware sharpness profile, which is smoothed and converted by a quantile transform into a timestep grid for any desired inference budget. At test time, sampling remains ordinary Euler integration with the same number of model evaluations as a uniform schedule. We justify SharpEuler using three principles: a numerical principle identifying trajectory acceleration as the leading source of Euler discretization error, a variational principle deriving sharpness-based power-law timestep densities, and a statistical guarantee showing that the finite-sample calibrated sampler is stable at the terminal distribution level. Our experiments show that SharpEuler improves sample quality at fixed budgets, reducing inter-mode leakage and increasing mode coverage.
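
The step from a sharpness profile to a timestep grid can be sketched as a quantile transform. The code below is an illustration under stated assumptions, not the SharpEuler implementation: the profile is taken as given on a uniform grid, smoothing is omitted, the power-law exponent `power` is a placeholder, and `sharpness_timestep_grid` and `euler_sample` are hypothetical names.

```python
def sharpness_timestep_grid(sharpness, num_steps, power=0.5, floor=1e-8):
    """Turn a sharpness profile on a uniform grid over [0, 1] into a
    non-uniform timestep grid with num_steps Euler steps.

    sharpness[i] is the estimated sharpness on the i-th of M equal cells.
    Steps are placed at equal quantiles of the density sharpness**power.
    """
    m = len(sharpness)
    weights = [max(s, floor) ** power for s in sharpness]
    total = sum(weights)
    # Cumulative mass at the cell edges 0, 1/M, ..., 1.
    cdf = [0.0]
    for w in weights:
        cdf.append(cdf[-1] + w / total)
    cdf[-1] = 1.0
    grid = [0.0]
    cell = 0
    for k in range(1, num_steps):
        q = k / num_steps
        while cdf[cell + 1] < q:
            cell += 1
        # Linear interpolation inside the cell that contains quantile q.
        frac = (q - cdf[cell]) / (cdf[cell + 1] - cdf[cell])
        grid.append((cell + frac) / m)
    grid.append(1.0)
    return grid

def euler_sample(velocity, x0, grid):
    """Ordinary Euler integration of dx/dt = velocity(x, t) on the grid."""
    x = x0
    for t0, t1 in zip(grid, grid[1:]):
        x = x + (t1 - t0) * velocity(x, t0)
    return x
```

With a flat profile the grid is uniform; where the profile is large the steps become shorter, so the same number of model evaluations is concentrated where the velocity field changes fastest.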

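2605.11538 2026-05-13 cs.CL cs.AI cs.LG Updated version

Taming Extreme Tokens: Covariance-Aware GRPO with Gaussian-Kernel Advantage Reweighting

Cheng Wang, Qin Liu, Wenxuan Zhou, Muhao Chen

Affiliations * National University of Singapore; University of California, Davis; University of Southern California

AI summary This paper addresses the exploration-exploitation balance in training large language models and proposes a covariance-aware variant of GRPO. The method uses a Gaussian kernel to dynamically down-weight extreme token-level updates, reducing training instability without discarding useful learning signal. Experiments show that it outperforms vanilla GRPO on several reasoning benchmarks, improving downstream performance and stabilizing entropy during training.

Comments ACL 2026

Details
English abstract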

Group Relative Policy Optimization (GRPO) has emerged as a promising approach for improving the reasoning capabilities of large language models. However, it struggles to effectively balance the tradeoff between exploration and exploitation during training, often resulting in suboptimal performance. Motivated by the theoretical insight that changes in entropy are governed by the covariance between token probabilities and their corresponding advantages, we propose a hyperparameter-free, covariance-weighted optimization method that dynamically down-weights extreme token-level updates via a Gaussian kernel. This approach automatically reduces the instability caused by the exploration-exploitation trade-off while preserving informative learning signals. Extensive empirical evaluations show that our approach improves downstream performance across reasoning benchmarks compared with GRPO, and effectively stabilizes entropy as training progresses.
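
One way to read "down-weights extreme token-level updates via a Gaussian kernel" is sketched below. The abstract does not state the exact weighting formula, so this is an assumption: each token's covariance contribution is standardised within the batch and passed through exp(-z^2 / 2), which avoids a hand-tuned bandwidth. The function name is hypothetical.

```python
import math

def gaussian_kernel_weights(log_probs, advantages):
    """Per-token weights in (0, 1] that shrink as a token's covariance
    contribution moves away from the batch mean.

    c_i = (logp_i - mean(logp)) * (A_i - mean(A)); the weight is
    exp(-z_i^2 / 2) with z_i the standardised c_i.
    """
    n = len(log_probs)
    mean_lp = sum(log_probs) / n
    mean_adv = sum(advantages) / n
    cov_terms = [(lp - mean_lp) * (a - mean_adv)
                 for lp, a in zip(log_probs, advantages)]
    mean_c = sum(cov_terms) / n
    var_c = sum((c - mean_c) ** 2 for c in cov_terms) / n
    std_c = math.sqrt(var_c) + 1e-8  # guard against a zero spread
    return [math.exp(-0.5 * ((c - mean_c) / std_c) ** 2) for c in cov_terms]
```

In a policy-gradient loss these weights would multiply the per-token advantage term, so tokens with typical covariance keep most of their update while outliers are softly suppressed rather than clipped.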

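2605.11532 2026-05-13 cs.AI Updated version

Read, Grep, and Synthesize: Diagnosing Cross-Domain Seed Exposure for LLM Research Ideation

Yunju Choi, Min Song

Affiliations * Yonsei University, Seoul, Republic of Korea

AI summary This paper studies whether large language models (LLMs) benefit from cross-domain knowledge when generating research ideas. The authors propose PaperGym, a three-stage pipeline of tool-augmented seed extraction, cross-domain seed retrieval, and method synthesis, and use it to assess how different seed sources affect novelty. Experiments show that cross-domain retrieval yields more novel methods than same-domain and no-retrieval baselines but is not significantly better than random diverse seeds. The study concludes that current LLMs do not yet reliably exploit the semantic relevance of the seeds they are given.

Comments 12 pages, 2 figures, 7 tables

Details
English abstract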

The discovery of novel methodologies for emerging problems is a continuing cycle in ML, often driven by the migration of techniques across domains. Building on this observation, we ask whether current LLM ideation systems benefit from targeted cross-domain retrieval or simply from exposure to diverse mechanisms. We study this question through PaperGym, a three-stage pipeline: (1) tool-augmented seed extraction via read, grep, and bash over an isolated paper environment, (2) cross-domain seed retrieval via paraphrasing across seven ML domains, and (3) method synthesis from retrieved seeds, each scored by rubric-based judges. Tool-augmented extraction improves specificity, and paraphrase-based retrieval broadens domain coverage. In synthesis, cross-domain retrieval receives more pairwise novelty wins than no-retrieval and same-domain baselines, but shows no significant difference from a random diverse-seed control. These findings suggest LLM ideation systems benefit from diverse seed exposure, but do not yet reliably exploit the semantic reason particular seeds were retrieved. We release the seed library, rubric prompts, and run scripts at https://github.com/yunjoochoi/PaperGym

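2605.11526 2026-05-13 math.OC cs.AI cs.LG Updated version

Efficient and provably convergent end-to-end training of deep neural networks with linear constraints

Zonglin Yang, Zhexuan Gu, Yancheng Yuan

Affiliations * Department of Applied Mathematics, The Hong Kong Polytechnic University

AI summary This paper studies efficient end-to-end training of deep neural networks with linear constraints, with theoretical guarantees. To handle the nonsmoothness introduced by projection layers, the authors introduce an efficiently computable HS-Jacobian and prove that it is a conservative mapping for the projection onto a polyhedral set, so it integrates directly into nonsmooth automatic differentiation. This allows efficient optimizers such as Adam to train such networks with convergence guarantees, and experiments show strong performance in finance, computer vision, and other domains.

Details
English abstract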

Training a deep neural network with the outputs of selected layers satisfying linear constraints is required in many contemporary data-driven applications. While this can be achieved by incorporating projection layers into the neural network, its end-to-end training remains challenging due to the lack of rigorous theory and efficient algorithms for backpropagation. A key difficulty in developing the theory and efficient algorithms for backpropagation arises from the nonsmoothness of the solution mapping of the projection layer. To address this bottleneck, we introduce an efficiently computable HS-Jacobian for the projection layer. Importantly, we prove that the HS-Jacobian is a conservative mapping for the projection operator onto the polyhedral set, enabling its seamless integration into the nonsmooth automatic differentiation framework for backpropagation. Therefore, many efficient algorithms, such as Adam, can be applied for end-to-end training of deep neural networks with linear constraints. Particularly, we establish convergence guarantees of the HS-Jacobian based Adam algorithm for training linearly constrained deep neural networks. Extensive experimental results on several important applications, including finance, computer vision, and network architecture design, demonstrate the superior performance of our method compared to other existing popular methods.
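
A small example shows why a projection layer is nonsmooth. The sketch below projects onto the probability simplex, one particular polyhedral set chosen here for illustration, and returns one generalised-Jacobian element of that projection. It does not reproduce the paper's HS-Jacobian construction; the function names and the choice of the simplex are assumptions.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, value in enumerate(u, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        if value - candidate > 0:
            tau = candidate  # keep the threshold of the largest valid k
    return [max(x - tau, 0.0) for x in v]

def projection_jacobian(v):
    """One generalised-Jacobian element of the projection at v.

    On the active support S (coordinates that stay positive) the map is
    x_S = v_S - mean(v_S) + 1/|S|, so the block is I - (1/|S|) 11^T;
    rows and columns outside S are zero.
    """
    x = project_to_simplex(v)
    support = [i for i, xi in enumerate(x) if xi > 0.0]
    n, s = len(v), len(support)
    jac = [[0.0] * n for _ in range(n)]
    for i in support:
        for j in support:
            jac[i][j] = (1.0 if i == j else 0.0) - 1.0 / s
    return jac
```

The projection is piecewise linear: the Jacobian changes whenever a coordinate enters or leaves the support, and at those kinks it is not unique, which is the situation a conservative Jacobian is meant to handle during backpropagation.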

2605.11520 2026-05-13 cs.CV cs.AI Updated version

PointGS: Semantic-Consistent Unsupervised 3D Point Cloud Segmentation with 3D Gaussian Splatting

Yixiao Song, Qingyong Li, Wen Wang, Zhicheng Yan

Affiliations * Key Laboratory of Big Data & Artificial Intelligence in Transportation (Beijing Jiaotong University), Ministry of Education; Frontiers Science Center for Smart High-speed Railway System, Beijing Jiaotong University

AI summary This paper proposes PointGS, an unsupervised 3D point cloud segmentation method that avoids the high cost of the dense annotations required by supervised methods. It uses 3D Gaussian Splatting as a unified intermediate representation to bridge the domain gap between discrete point clouds and continuous images, and combines multi-view reconstruction with semantic distillation to assign semantics consistently across views. Experiments show that PointGS outperforms existing unsupervised methods on several benchmark datasets.

Comments Accepted by Computer Vision and Pattern Recognition (CVPR) 2026

Details
English abstract

Unsupervised point cloud segmentation is critical for embodied artificial intelligence and autonomous driving, as it mitigates the prohibitive cost of dense point-level annotations required by fully supervised methods. While integrating 2D pre-trained models such as the Segment Anything Model (SAM) to supplement semantic information is a natural choice, this approach faces a fundamental mismatch between discrete 3D points and continuous 2D images. This mismatch leads to inevitable projection overlap and complex modality alignment, resulting in compromised semantic consistency across 2D-3D transfer. To address these limitations, this paper proposes PointGS, a simple yet effective pipeline for unsupervised 3D point cloud segmentation. PointGS leverages 3D Gaussian Splatting as a unified intermediate representation to bridge the discrete-continuous domain gap. Input sparse point clouds are first reconstructed into dense 3D Gaussian spaces via multi-view observations, filling spatial gaps and encoding occlusion relationships to eliminate projection-induced semantic conflation. Multi-view dense images are rendered from the Gaussian space, with 2D semantic masks extracted via SAM, and semantics are distilled to 3D Gaussian primitives through contrastive learning to ensure consistent semantic assignments across different views. The Gaussian space is aligned with the original point cloud via two-step registration, and point semantics are assigned through nearest-neighbor search on labeled Gaussians. Experiments demonstrate that PointGS outperforms state-of-the-art unsupervised methods, achieving +0.9% mIoU on ScanNet-V2 and +2.8% mIoU on S3DIS.

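2605.11519 2026-05-13 cs.AI cs.CL cs.LG Updated version

Controllable User Simulation

Guy Tennenholtz, Ofer Meshi, Amir Globerson, Uri Shalit, Jihwan Jeong, Craig Boutilier

Affiliations * Google Research; Tel Aviv University

AI summary This paper studies how to build controllable user simulators for more accurate evaluation of conversational agents. The authors formalize controllable simulation as a causal inference problem and show that the usual supervised fine-tuning approach introduces a structural bias that makes the variance of evaluation metrics blow up, which they call "controllability collapse". They derive theoretical conditions based on causal consistency and a set of practical training methods; experiments show that these eliminate look-ahead bias, preserve conversational diversity, and generalize robustly to unseen agent behaviors.

Details
English abstract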

Using offline datasets to evaluate conversational agents often fails to cover rare scenarios or to support testing new policies. This has motivated the use of controllable user simulators for targeted, counterfactual evaluation, typically implemented by prompting or fine-tuning large language models. In this work, we formalize controllable simulation as a causal inference problem. By bridging natural language evaluation with off-policy evaluation methodology, we show that the standard practice of training simulators via supervised fine-tuning on post-hoc trajectory labels yields a structurally biased model. Specifically, these labels are inextricably coupled to the data-generating behavior policy, injecting a look-ahead bias that breaks causal consistency. Furthermore, we prove that under policy shift this failure causes the variance of evaluation metrics to explode geometrically, a phenomenon we term controllability collapse. To restore causal consistency, we establish theoretical conditions for accurate simulation and propose practical training mitigations: a priori controls, step-wise dynamic controls, and direct policy-conditioned learning. Empirical evaluation confirms that while standard global controls distort conversational distributions and collapse behavioral diversity, our causally grounded simulators eliminate look-ahead bias, preserve natural variance, and exhibit robust zero-shot generalization to unseen agent behaviors.

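2605.11513 2026-05-13 cs.CL cs.AI Updated version

A Study on Hidden Layer Distillation for Large Language Model Pre-Training

Maxime Guigon, Lucas Dixon, Michaël E. Sander

Affiliations * Google DeepMind

AI summary This paper studies hidden layer distillation (HLD) for large language model pre-training, noting that knowledge distillation today relies mainly on output logits and neglects the semantic information in the teacher's intermediate layers. In controlled comparisons, the authors find that HLD does not consistently outperform standard logit-based distillation on downstream tasks, yet it improves perplexity under every shared-hyperparameter configuration, suggesting latent value that has not yet made it a mainstream pre-training method.

Details
English abstract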

Knowledge Distillation (KD) is a critical tool for training Large Language Models (LLMs), yet the majority of research focuses on approaches that rely solely on output logits, neglecting semantic information in the teacher's intermediate representations. While Hidden Layer Distillation (HLD) showed potential for encoder architectures, its application to decoder-only pre-training at scale remains largely unexplored. Through compute-controlled experiments, we benchmark HLD against logit-based KD and self-supervised baselines with Gemma3 3.4B as teacher and 123M and 735M students trained on up to 168B tokens from the C4 dataset. Our experiments show that HLD does not consistently outperform standard KD on downstream evaluation tasks. Nevertheless, we show that HLD can yield a systematic perplexity gain over KD across all shared-hyperparameter configurations, suggesting that a latent signal can be extracted, but a breakthrough may be needed for it to play a more significant role in LLM pre-training.

2605.11509 2026-05-13 cs.AI cs.LG cs.MA cs.SY eess.SY Updated version

Hierarchical LLM-Driven Control for HAPS-Assisted UAV Networks: Joint Optimization of Flight and Connectivity

Zijiang Yan, Hao Zhou, Wael Jaafar, Jianhua Pei, Ping Wang, Halim Yanikomeroglu, Hina Tabassum

Affiliations * Department of Electrical Engineering and Computer Science, York University; Samsung Research America; Department of Software and IT Engineering, École de technologie supérieure (ÉTS), University of Quebec; Non-Terrestrial Networks (Carleton-NTN) Lab and the Department of Systems and Computer Engineering, Carleton University

AI summary This paper studies the joint optimization of flight control and communication connectivity for uncrewed aerial vehicles (UAVs) in an integrated terrestrial and non-terrestrial network (ITNTN). To coordinate multiple UAVs under dynamic, partially observable conditions, the authors propose a hierarchical multi-rate control framework driven by large language models (LLMs) that couples global load-balancing and handover decisions with local UAV motion control. Experiments show that the method outperforms existing approaches in transportation efficiency, communication throughput, and collision rate, and adapts well to dynamic scenarios.

Comments Submission for possible publication

Details
English abstract

Uncrewed aerial vehicles (UAVs) are increasingly deployed in complex networked environments, yet the joint optimization of multi-UAV motion control and connectivity remains a fundamental challenge. In this paper, we study a multi-UAV system operating in an integrated terrestrial and non-terrestrial network (ITNTN) comprising terrestrial base stations and high-altitude platform stations (HAPS). We consider a three-dimensional (3D) aerial highway scenario where UAVs must adapt their motion to ensure collision avoidance, efficient traffic flow, and reliable communication under dynamic and partially observable conditions. We first model the problem as a hierarchical multi-objective partially observable Markov decision process (H-MO-POMDP), capturing the coupling between control and communication objectives. Based on this formulation, we propose a large language model (LLM)-driven hierarchical multi-rate control framework. At the global level, an LLM-based controller on the HAPS performs long-term planning for load balancing and handover decisions. At the local level, each UAV employs a hybrid controller that integrates a slow-timescale LLM for high-level spatial reasoning with a reinforcement learning agent for faster UAV-to-infrastructure (U2I) communication and motion control. We further develop a high-fidelity 3D simulation platform by integrating the gym-pybullet-drones environment with 3GPP-compliant RF/THz channel models. Numerical results demonstrate that the proposed framework significantly outperforms state-of-the-art baselines, achieving a 14% increase in transportation efficiency and a 25% improvement in telecommunication throughput. Additionally, it achieves a 23% reduction in physical collision rates, demonstrating strong handover stability and zero-shot generalization in dynamic scenarios.

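2605.11501 2026-05-13 cs.SE cs.AI cs.CR Updated version

Decaf: Improving Neural Decompilation with Automatic Feedback and Search

Alexander Shypula, Osbert Bastani, Edward Schwartz

Affiliations * University of Pennsylvania; Carnegie Mellon University

AI summary This paper proposes Decaf, a neural decompilation system that uses automatic feedback and search to substantially improve the semantic correctness of decompiled output. Rather than relying on more training data, it uses compiler feedback to guide the search, raising the decompilation success rate from 26.0% to 83.9% while preserving similarity to the original source code. Experiments show that the method is especially effective for weaker neural decompilation models.

Comments 15 pages, 6 figures. Preprint; under review. Code and models available at https://github.com/AlexShypula/decaf

Details
English abstract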

Decompilers are useful tools used in reverse engineering to understand compiled source code. Reconstructing source code from compiled binaries is a challenging task, because high-level syntax, identifiers, and custom data types are generally lost as the compiler translates human-readable code to low-level machine code. Deterministic decompilers are useful tools for binary analysis, but can struggle to infer idiomatic syntax and identifier names. Generative AI models are a natural fit for reconstructing high-level syntax, identifiers, and types, but they can still suffer by hallucinating improper programming constructs and semantics. Instead of attempting to improve neural decompilers with more data and more training, we argue that compiler feedback can be used to dramatically improve the semantic correctness of neural decompiler outputs via search. Our system, Decaf (DECompilation with Automated Feedback), raises the neural decompilation rate from 26.0% on ExeBench to 83.9% on the Real -O2 split without sacrificing similarity to the original source code. We also find our automatic feedback methodology is highly effective for improving weaker neural decompilation models.
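
The general shape of a feedback-driven search can be sketched as a loop that proposes candidates, checks them automatically, and feeds failure messages into the next round. The abstract does not describe Decaf's actual search procedure, so everything here is an assumption: `propose` and `check` are hypothetical callables standing in for a neural decompiler and a compile-and-compare step, and the beam-limited loop is one simple choice among many.

```python
def search_with_feedback(propose, check, max_rounds=4, beam=3):
    """Breadth-limited refinement driven by automatic feedback.

    propose(feedback) -> list of candidate source strings (hypothetical
    stand-in for sampling a neural decompiler, optionally conditioned on
    feedback from an earlier failure).
    check(candidate) -> (ok, feedback) (hypothetical stand-in for
    compiling the candidate and comparing its behaviour to the binary).
    """
    frontier = [None]  # None means no feedback is available yet
    for _ in range(max_rounds):
        next_frontier = []
        for feedback in frontier:
            for candidate in propose(feedback)[:beam]:
                ok, message = check(candidate)
                if ok:
                    return candidate
                next_frontier.append(message)
        frontier = next_frontier[:beam]
        if not frontier:
            break
    return None
```

The loop returns the first candidate that passes the check, or None when the budget runs out, so a caller can fall back to the unverified model output.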

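2605.11496 2026-05-13 cs.AI cs.CY cs.HC cs.LG Updated version

The Evaluation Differential: When Frontier AI Models Recognise They Are Being Tested

Varad Vishwarupe, Nigel Shadbolt, Marina Jirotka, Ivan Flechais

Affiliations * Anthropic; OpenAI; UK AI Security Institute

AI summary This paper examines the behavioural differences that frontier AI models exhibit when they recognise an evaluation context, arguing that models may behave differently under test than in deployment and that this undermines the reliability of safety evaluations. It introduces the Evaluation Differential (ED), defines a normalised effect size (nED) for cross-property comparison, and specifies TRACE, an audit protocol for analysing and restricting the safety claims drawn from evaluations. The work has implications for AI system evaluation and governance.

Details
English abstract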

Recent published evidence from frontier laboratories shows that contemporary AI models can recognise evaluation contexts, latently represent them, and behave differently under those contexts than under deployment-continuous conditions. Anthropic's BrowseComp incident, the Natural Language Autoencoder findings on SWE-bench Verified and destructive-coding evaluations, and the OpenAI / Apollo anti-scheming work all document instances of this phenomenon. We argue that these findings create a claim-validity problem for safety conclusions drawn from frontier evaluations. We introduce the Evaluation Differential (ED), a conditional divergence in a target behavioural property between recognised-evaluation and deployment-continuous contexts, define a normalised effect-size form (nED) for cross-property comparison, and prove that marginal evaluation scores cannot identify ED. We develop a typology of safety claims (ED-stable, ED-degraded, ED-inverted, ED-undetermined) by their warrant-status under documented divergence, and specify TRACE (Test-Recognition Audit for Claim Evaluation), an audit protocol that wraps existing evaluation infrastructure and produces restricted claims rather than capability scores. We apply the framework retrospectively to three publicly documented evaluation incidents and discuss governance implications for system cards, conformity assessment, and the international network of AI safety and security institutes. TRACE does not eliminate adversarial adaptation; it disciplines the claims drawn from evaluation evidence by making explicit the conditions under which that evidence was produced.
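
As a rough numerical reading of ED and nED, the sketch below takes ED as the difference in mean score for a behavioural property between recognised-evaluation and deployment-continuous samples, and normalises it by a pooled standard deviation in the style of Cohen's d. The abstract does not give the paper's exact normalisation, so the pooled-deviation form and both function names are assumptions.

```python
import math
from statistics import mean, variance

def evaluation_differential(eval_scores, deploy_scores):
    """Difference in mean property score between the two contexts."""
    return mean(eval_scores) - mean(deploy_scores)

def normalised_ed(eval_scores, deploy_scores):
    """ED divided by the pooled sample standard deviation.

    Requires at least two samples per context and a non-zero spread.
    """
    n1, n2 = len(eval_scores), len(deploy_scores)
    pooled_var = ((n1 - 1) * variance(eval_scores)
                  + (n2 - 1) * variance(deploy_scores)) / (n1 + n2 - 2)
    return evaluation_differential(eval_scores, deploy_scores) / math.sqrt(pooled_var)
```

Two systems with identical evaluation-context scores can have very different deployment-context scores, so the evaluation scores alone leave ED undetermined; both samples are needed.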

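2605.11491 2026-05-13 cs.LG cs.AI Updated version

Understanding and Preventing Entropy Collapse in RLVR with On-Policy Entropy Flow Optimization

Huimin Xu, Shuai Zhao, Xiaobao Wu, Anh Tuan Luu

Affiliations * Nanyang Technological University; Shanghai Jiao Tong University; VinUniversity

AI summary This paper studies entropy collapse, a common problem in reinforcement learning with verifiable rewards (RLVR). The analysis traces it to an imbalance in token-level entropy flow: entropy-decreasing tokens consistently outweigh entropy-increasing ones. The authors propose On-Policy Entropy Flow Optimization (OPEFO), which adaptively balances entropy flow by rescaling entropy-increasing and entropy-decreasing updates. Experiments show that it improves training stability and final performance on mathematical reasoning tasks.

Details
English abstract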

Reinforcement learning with verifiable rewards (RLVR) has become an effective paradigm for improving the reasoning ability of large language models. However, widely used RLVR algorithms, such as GRPO, often suffer from entropy collapse, leading to premature determinism and unstable optimization. Existing remedies, including entropy regularization and ratio-based clipping heuristics, either control entropy in a coarse-grained manner or rely on approximate on-policy training. In this paper, we revisit entropy collapse from a token-level entropy flow perspective. Our analysis reveals that entropy-decreasing tokens consistently outweigh entropy-increasing ones, resulting in a severely imbalanced entropy flow. This perspective provides a unified explanation of entropy collapse in existing RLVR algorithms and highlights the importance of balancing entropy dynamics. Motivated by this analysis, we propose On-Policy Entropy Flow Optimization (OPEFO), an adaptive entropy flow balancing mechanism that rescales entropy-increasing and entropy-decreasing updates according to their contributions to entropy change, while remaining strictly on-policy. Experiments on six mathematical reasoning benchmarks demonstrate that OPEFO improves training stability and final performance. We will release the code and models upon publication.

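2605.11487 2026-05-13 cs.CR cs.AI cs.MA Updated version

Digital Identity for Agentic Systems: Toward a Portable Authorization Standard for Autonomous Agents

Partha Madhira

Affiliations * MIT

AI summary As enterprise AI shifts from assistive tools to autonomous agents that execute tasks, negotiate outcomes, and make decisions, traditional identity authentication is no longer sufficient: an agent's authority must be explicit, constrained, auditable, revocable, and consistently interpretable across trust boundaries. By analyzing representative enterprise scenarios in insurance claims and supply chain integrity, this paper exposes structural gaps in existing identity and access models and proposes a portable authorization model built on authorization payloads, a constraint algebra, and decision-consistent evaluation semantics, aiming at a uniform cross-organization, cross-system authorization standard for autonomous agents.

Comments 46 pages, 10 figures

Details
English abstract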

Enterprise AI is shifting from copilots to autonomous agents capable of executing workflows, negotiating outcomes, and making decisions with limited human oversight. As these systems extend across organizational boundaries, identity alone is insufficient: an agent's authority must also be explicit, constrained, auditable, revocable, and consistently interpretable by independent receivers. This paper analyzes representative enterprise use cases in insurance claims processing and supply chain integrity to surface structural gaps in existing identity and access models. It proposes a portable authorization model for autonomous agents based on issuer-authored authorization payloads, typed constraint algebra, decision-consistent evaluation semantics, delegation attenuation, governed semantic resolution, fail-closed processing, and pre-flight discovery. The model separates credential containers, authorization payload semantics, and enforcement engines, allowing profiles such as JWT/JWS, Verifiable Credentials, OAuth Rich Authorization Requests, or policy-engine bindings to preserve a common authorization meaning across trust boundaries.

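2605.11479 2026-05-13 cs.RO cs.AI Updated version

Offline Policy Evaluation for Manipulation Policies via Discounted Liveness Formulation

Hao Wang, Joshua Bowden, Colton Crosby, Somil Bansal

Affiliations * University of Southern California; Stanford University

AI summary This paper studies offline policy evaluation for robotic manipulation policies under sparse rewards. To address non-monotonic task progress and the truncation bias caused by finite-length evaluation rollouts, it proposes a framework based on a liveness-based Bellman operator. The method treats policy evaluation as a task-completion problem, yields a value function that is robust to finite-horizon truncation, and comes with theoretical properties such as contraction. Experiments on several simulated and real tasks show that it reflects task progress more accurately and reduces truncation bias, outperforming classical methods such as TD(0) and Monte Carlo policy evaluation.

Comments Published at RSS 2026

Details
English abstract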

Policy evaluation is a fundamental component of the development and deployment pipeline for robotic policies. In modern manipulation systems, this problem is particularly challenging: rewards are often sparse, the task progression of evaluation rollouts is often non-monotonic as the policies exhibit recovery behaviors, and evaluation rollouts are necessarily of finite length. This finite length introduces truncation bias, breaking the infinite-horizon assumptions underlying standard methods relying on Bellman equations/principle of optimality. In this work, we propose a framework for offline policy evaluation from sparse rewards based on a liveness-based Bellman operator. Our formulation interprets policy evaluation as a task-completion problem and yields a conservative fixed-point value function that is robust to finite-horizon truncation. We analyze the theoretical properties of the proposed operator, including contraction guarantees, and show how it encodes task progression while mitigating truncation bias. We evaluate our method on two simulated manipulation tasks using both a Vision-Language-Action model and a diffusion policy, and a cloth folding task using human demonstrations. Empirical results demonstrate that our approach more accurately reflects task progress and substantially reduces truncation bias, outperforming classical baselines such as TD(0) and Monte Carlo policy evaluation.
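
To illustrate what a discounted, liveness-style backup looks like on a finite rollout, the sketch below applies a reachability-style recursion backwards over one trajectory. The abstract does not state the paper's operator, so the specific backup, the bootstrap at the truncation point, and the function name are assumptions used as a stand-in.

```python
def liveness_values(progress, gamma=0.95):
    """Backward pass over one finite rollout.

    progress[t] is a task-progress signal l(s_t), e.g. 1.0 once the goal
    is reached and 0.0 otherwise. The backup
        V_t = (1 - gamma) * l_t + gamma * max(l_t, V_{t+1})
    is bootstrapped from V_T = l_T at the truncation point.
    """
    values = [0.0] * len(progress)
    values[-1] = progress[-1]
    for t in range(len(progress) - 2, -1, -1):
        values[t] = ((1.0 - gamma) * progress[t]
                     + gamma * max(progress[t], values[t + 1]))
    return values
```

Because the backup takes a maximum rather than a sum, every value stays between the smallest and largest progress signal in the rollout, and a state's value reflects whether the task is completed later, discounted by how long that takes.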

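2605.11478 2026-05-13 cs.AI cs.IT math.IT stat.ML Updated version

FibQuant: Universal Vector Quantization for Random-Access KV-Cache Compression

Namyoon Lee, Yongjune Kim

Affiliations * POSTECH

AI summary This paper proposes FibQuant, a universal fixed-rate vector quantization method for random-access key-value cache compression, targeting the memory and traffic bottleneck of long-context inference. It keeps the normalize-rotate-store interface but replaces the usual scalar tables with a shared radial-angular codebook matched to the normalized source, preserving the geometry created by the normalization step and improving compression efficiency. Experiments show that FibQuant reaches higher compression ratios at high attention similarity and outperforms existing scalar quantization methods on several models.

Comments 15 pages

Details
English abstract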

Long-context inference is increasingly a memory-traffic problem. The culprit is the key-value (KV) cache: it grows with context length, batch size, layers, and heads, and it is read at every decoding step. Rotation-based scalar codecs meet this systems constraint by storing a norm, applying a shared random rotation, and quantizing one coordinate at a time. They are universal and random-access, but they discard the geometry created by the normalization step. After a Haar rotation, a block of $k$ consecutive coordinates is not a product source; it is a spherical-Beta source on the unit ball. We introduce FibQuant, a universal fixed-rate vector quantizer that keeps the same normalize-rotate-store interface while replacing scalar tables by a shared radial-angular codebook matched to this canonical source. The codebook combines Beta-quantile radii, Fibonacci/Roberts-Kronecker quasi-uniform directions, and multi-restart Lloyd-Max refinement. We prove that the resulting vector code strictly improves on its scalar product specialization at matched rate, with a high-rate gain that separates into a cell-shaping factor and a density-matching factor. The same construction gives a dense rate axis, including fractional-bit and sub-one-bit operating points, without calibration or variable-length addresses. On GPT-2 small KV caches, FibQuant traces a memory-fidelity frontier from $5\times$ compression at $0.99$ attention cosine similarity to $34\times$ at $0.95$. End-to-end on TinyLlama-1.1B, it is within $0.10$ perplexity of fp16 at $4\times$ compression and has $3.6\times$ lower perplexity than scalar TurboQuant at $b = 2$ ($8\times$ compression), where scalar random-access quantization begins to fail.
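
The radial-angular idea can be illustrated for a block of three coordinates, where quasi-uniform directions come from the classical Fibonacci lattice on the sphere. This is a sketch under assumptions, not FibQuant itself: the block size 3, the caller-supplied radius levels (the paper uses Beta-quantile radii), the separate nearest-radius and nearest-direction search, and all function names are illustrative, and the Lloyd-Max refinement is omitted.

```python
import math

def fibonacci_sphere(n):
    """n quasi-uniform unit vectors on the 2-sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n  # evenly spaced heights in (-1, 1)
        radius = math.sqrt(1.0 - z * z)
        phi = golden_angle * i
        points.append((radius * math.cos(phi), radius * math.sin(phi), z))
    return points

def quantize_block(block, directions, radii):
    """Encode a 3-vector as (radius index, direction index)."""
    norm = math.sqrt(sum(c * c for c in block))
    r_idx = min(range(len(radii)), key=lambda k: abs(radii[k] - norm))
    if norm == 0.0:
        return r_idx, 0
    # The closest direction is the one with the largest inner product.
    d_idx = max(range(len(directions)),
                key=lambda k: sum(b * d for b, d in zip(block, directions[k])))
    return r_idx, d_idx

def dequantize_block(code, directions, radii):
    """Rebuild the block from its two indices and the shared codebook."""
    r_idx, d_idx = code
    return tuple(radii[r_idx] * d for d in directions[d_idx])
```

Every block is stored as a fixed-size pair of indices into a codebook shared by all blocks, so any block can be decoded on its own, which is what random access requires; the rate per block is log2(len(radii) * len(directions)) bits.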

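2605.11473 2026-05-13 cs.AI cs.LG cs.RO stat.ML Updated version

TOPPO: Rethinking PPO for Multi-Task Reinforcement Learning with Critic Balancing

Yuanpeng Li, Gefei Lin, Annie Qu, Rui Miao

Affiliations * UC Irvine; George Washington University; UC Santa Barbara; UT Dallas

AI summary This paper studies the optimization of PPO, an on-policy policy-gradient method, in multi-task reinforcement learning and finds that its value-function gradients are ill-conditioned in the multi-task setting, causing learning on some tasks to stall. The authors propose TOPPO, which adds Critic Balancing modules that improve gradient conditioning and balance learning across tasks. Experiments show that, with fewer parameters and environment steps, TOPPO outperforms existing SAC-family and ARS-family methods, with stronger mean and tail-task performance on a multi-task benchmark, showing that properly optimized on-policy methods can match or exceed off-policy ones.

Details
English abstract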

Soft Actor-Critic (SAC) and its variants dominate Multi-Task Reinforcement Learning (MTRL) due to their off-policy sample efficiency, while on-policy methods such as Proximal Policy Optimization (PPO) remain underexplored. We diagnose that PPO in MTRL suffers from a previously overlooked issue: critic-side gradient ill-conditioning, which may cause tail tasks to stall while easy tasks dominate the value function's updates. To address this, we propose TOPPO (Tail-Optimized PPO), a reformulation of PPO via Critic Balancing, a set of modules that improve gradient conditioning and balance learning dynamics across tasks. Unlike prior approaches that rely on modular architectures or large models, TOPPO targets the optimization bottleneck within PPO itself. Empirically, TOPPO achieves stronger mean and tail-task performance than published SAC-family and ARS-family baselines while using substantially fewer parameters and environment steps on the Meta-World+ benchmark. Notably, TOPPO matches or surpasses strong SAC baselines early in training and maintains superior performance at full budget. Ablations confirm the effectiveness of each module in TOPPO and provide insights into their interactions. Our results demonstrate that, with proper optimization, on-policy methods can rival or exceed off-policy approaches in MTRL, challenging the prevailing reliance on SAC and highlighting critic-side gradient conditioning as the central bottleneck.

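2605.11468 2026-05-13 cs.AI Updated version

CAMPA: Efficient and Aligned Multimodal Graph Learning via Decoupled Propagation and Aggregation

Daohan Su, Hao Liu, Xunkai Li, Yinlin Zhu, Xiong Yongfu, Yi Liu, Hongchao Qin, Rong-Hua Li, Guoren Wang

AI summary This paper proposes CAMPA, a cross-modal aligned multimodal graph learning framework that addresses the modal conflict existing decoupled multimodal graph neural networks face in the propagation and aggregation stages. CAMPA introduces a two-stage alignment mechanism: during propagation it injects cross-modal similarity priors to preserve semantic consistency, and during aggregation it uses trajectory-level self-attention and cross-attention to align multimodal, multi-hop feature trajectories, improving representation learning. Experiments show that CAMPA outperforms existing coupled and decoupled methods on several benchmark datasets while remaining computationally efficient.

Details
English abstract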

Multimodal Graph Neural Networks (MGNNs) have shown strong potential for learning from multimodal attributed graphs, yet most existing approaches rely on tightly coupled architectures that suffer from prohibitive computational overhead. In this paper, we present a systematic empirical analysis showing that decoupled MGNNs are substantially more efficient and scalable for large-scale graph learning. However, we identify a critical bottleneck in existing decoupled pipelines, namely modal conflict, which arises in both the propagation and aggregation stages. Specifically, independent multi-hop diffusion causes cross-modal semantic divergence during propagation, while naive fusion fails to align multi-hop feature trajectories during aggregation, jointly limiting effective representation learning. To address this challenge, we propose CAMPA, a Cross-modal Aligned Multimodal Propagation & Aggregation framework for decoupled multimodal graph learning. Concretely, CAMPA introduces a two-stage alignment mechanism: (1) cross-modal aligned propagation, which injects cross-modal similarity priors into message passing to preserve semantic consistency without additional parameter overhead; (2) trajectory aligned aggregation, which leverages trajectory-level self-attention and cross-attention to capture and align long-range dependencies across modalities and hops. Extensive experiments on diverse benchmark datasets and tasks demonstrate that CAMPA consistently outperforms strong coupled and decoupled baselines while preserving the efficiency advantages of the decoupled paradigm.

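2605.11467 2026-05-13 cs.LG cs.AI Version update

Drop the Act: Probe-Filtered RL for Faithful Chain-of-Thought Reasoning

Swapnil Parekh

Affiliations * Intuit

AI summary This study proposes ProFIL, a method for reducing "reasoning theater" in the chain-of-thought reasoning of large language models, i.e., reasoning steps generated after the model has already reached its conclusion that look deliberative but contribute nothing to correctness. ProFIL trains a multi-head attention probe on the frozen base model to detect and suppress these redundant steps and combines it with the GRPO reinforcement learning framework, improving chain-of-thought faithfulness and shortening chains while preserving or improving task accuracy. Experiments show notable gains across multiple reasoning tasks and model architectures.

Details
Abstract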

Reasoning models post-hoc rationalize answers they have already committed to internally, producing chains of reasoning theater: deliberative-looking steps that contribute nothing to correctness. This wastes inference tokens, pollutes interpretability, and obscures what the model actually computed. We introduce ProFIL (Probe-Filtered Reinforcement Learning) to reduce theater, increase chain-of-thought faithfulness, and shrink chain length in a single, drop-in extension to Group Relative Policy Optimization (GRPO). A multi-head attention probe is trained once on the frozen base model to detect post-commitment steps from internal activations alone; during GRPO, rollouts whose probe score exceeds a threshold have their advantage zeroed. Our central finding is that a probe trained on a frozen base, with verifier-derived labels and no human annotation, provides a stable signal that suppresses theater while resisting the RL-obfuscation failure mode predicted by prior work. Across four reasoning domains (GSM8K, LiveCodeBench, ToolUse, MMLU-Redux) and two model architectures (Llama-8B, Qwen-7B), ProFIL reduces post-commitment theater by 11-100%, raises faithful-fraction (e.g., +24pp on LiveCodeBench under an independent Claude 3.7 Sonnet judge), and shortens chains by 4-19%, all while preserving or improving task accuracy. ProFIL also beats a matched length-penalty GRPO baseline, isolating the gain as semantic commitment-detection rather than chain compression. Probe weights, training configurations, and rollouts are released across all four domains.
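
The advantage-zeroing step described in the abstract can be illustrated with a minimal Python sketch. It assumes the usual GRPO convention of standardizing rewards within a rollout group; the function names, the threshold value, and the toy probe scores are hypothetical and are not taken from the paper's released code.

```python
from statistics import mean, pstdev

def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: standardize rewards within one rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

def probe_filtered_advantages(rewards, probe_scores, threshold=0.5):
    """Zero the advantage of every rollout whose probe score exceeds the threshold.

    The threshold of 0.5 is an illustrative assumption, not a value from the paper.
    """
    advantages = grpo_advantages(rewards)
    return [0.0 if s > threshold else a for a, s in zip(advantages, probe_scores)]

# Toy group of four rollouts: verifier rewards and hypothetical probe scores.
rewards = [1.0, 0.0, 1.0, 0.0]
probe_scores = [0.9, 0.2, 0.1, 0.7]
filtered = probe_filtered_advantages(rewards, probe_scores)
```

In this toy group the first and last rollouts are flagged by the probe, so they contribute no gradient signal regardless of their reward, while the other two keep their standardized advantages.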

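2605.11462 2026-05-13 cs.CV cs.AI Version update

SpatialForge: Bootstrapping 3D-Aware Spatial Reasoning from Open-World 2D Images

Zishan Liu, Ruoxi Zang, Yanglin Zhang, Wei Liu, Yin Zhang, Jian Yao, Jiayin Zheng, Zhengzhe Liu

Affiliations * Lingnan University; XPENG Robotics

AI summary This study proposes SpatialForge, a scalable data synthesis method that generates supervision for 3D spatial reasoning from open-world 2D images, addressing the weakness of current large vision-language models in spatial reasoning. By decomposing spatial reasoning into perception and relation and constructing structured supervision signals covering depth, layout, and viewpoint-dependent reasoning, the method automatically generates high-quality spatial QA data. On this basis, the authors build SpatialForge-10M, a large-scale dataset of 10 million spatial QA pairs, and validate it on multiple spatial reasoning benchmarks, where it significantly improves the spatial reasoning ability of vision-language models.

Details
Abstract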

Recent advancements in Large Vision-Language Models (VLMs) have demonstrated exceptional semantic understanding, yet these models consistently struggle with spatial reasoning, often failing at fundamental geometric tasks such as depth ordering and precise coordinate grounding. Recent efforts introduce spatial supervision from scene-centric datasets (e.g., multi-view scans or indoor video), but are constrained by the limited number of underlying scenes. As a result, the scale and diversity of such data remain significantly smaller than those of web-scale 2D image collections. To address this limitation, we propose SpatialForge, a scalable data synthesis pipeline that transforms in-the-wild 2D images into spatial reasoning supervision. Our approach decomposes spatial reasoning into perception and relation, and constructs structured supervision signals covering depth, layout, and viewpoint-dependent reasoning, with automatic verification to ensure data quality. Based on this pipeline, we build SpatialForge-10M, a large-scale dataset containing 10 million spatial QA pairs. Extensive experiments across multiple spatial reasoning benchmarks demonstrate that training on SpatialForge-10M significantly improves the spatial reasoning ability of standard VLMs, highlighting the effectiveness of scaling 2D data for 3D-aware spatial reasoning.

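2605.11448 2026-05-13 cs.LG cs.AI Version update

Deep Minds and Shallow Probes

Su Hyeong Lee, Risi Kondor

Affiliations * Department of Statistics, University of Chicago; Department of Statistics and Department of Computer Science, University of Chicago

AI summary This paper studies the symmetry of hidden coordinates in neural representations across different realizations and argues that structure in a representation should be revealed with symmetry-stable shallow probes rather than probes tied to a particular basis. By analyzing the exact model of the final readout layer, the authors identify a unique hierarchy of shallow probes, with linear probes as its degree-1 member. The study also shows that cross-model probe transfer should be based on the probe-visible quotient of the representation rather than the full hidden state; experiments on synthetic and real-world tasks support the approach.

Details
Abstract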

Neural representations are not unique objects. Even when two systems realize the same downstream computation, their hidden coordinates may differ by reparameterization. A probe family intended to reveal structure already present in a representation should therefore be stable under the relevant representation symmetries rather than be tied to a particular basis. We study this group action in the tractable exact setting of the final readout layer, where equivalent realizations induce affine changes of hidden coordinates. The resulting symmetry principle singles out a unique hierarchy of shallow coordinate-stable probes, with linear probes as its degree-1 member. We also show that a natural object for cross-model probe transfer is a shared probe-visible quotient--the representation modulo directions invisible to the probe family--rather than the full hidden state. Experiments on synthetic and real-world tasks support both predictions, showing where degree-2 probes help beyond linear ones and how quotient-based transfer enables coverage-aware monitor portability across model families. These results point toward a broader geometric representation theory of neural probing, with coverage-aware monitor transfer as a concrete operational consequence.

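2605.11447 2026-05-13 cs.IR cs.AI Version update

Conditional Memory Enhanced Item Representation for Generative Recommendation

Ziwei Liu, Yejing Wang, Shengyu Zhou, Xinhang Li, Xiangyu Zhao

Affiliations * City University of Hong Kong; Independent Researcher; Tsinghua University

AI summary Generative recommendation (GR) is an emerging paradigm that predicts target items by autoregressively generating their semantic identifiers (SIDs). Existing methods face a conflict between information loss and structure preservation when constructing item-level representations. To address this, the paper proposes ComeIR, a conditional-memory-enhanced item representation framework that uses multimodal-guided token scoring, a dual-level memory module, and a memory-restoring prediction head to recover the structural information and fine-grained detail of SIDs, significantly improving the effectiveness and flexibility of generative recommendation.

Details
Abstract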

Generative recommendation (GR) has emerged as a promising paradigm that predicts target items by autoregressively generating their semantic identifiers (SID). Most GR methods follow a quantization-representation-generation pipeline, first assigning each item a SID, then constructing input representations from SID-token embeddings, and finally predicting the target SID through autoregressive generation. Existing item-level representation constructions mainly take two forms: directly merging SID-token embeddings into a compact vector, or enriching item-level representations with external inputs through additional networks. However, these item-level constructors still expose two practical challenges: direct merging may amplify the information loss caused by quantization and ID collision while obscuring SID code relations, whereas external-input-based methods can strengthen item semantics but cannot reliably preserve the SID-structured evidence required for token-level generation. These limitations make representation construction an underexplored bottleneck, leading to two severe problems, i.e., the Identity-Structure Preservation Conflict and Input-Output Granularity Mismatch. To this end, we propose ComeIR, a Conditional Memory enhanced Item Representation framework that reconstructs SID-token embeddings into item-aware inputs and restores the token granularity during SID decoding. Specifically, MM-guided token scoring adaptively estimates the contribution of each code within the SID, dual-level Engram memory captures intra-item code composition and inter-item transition patterns, and a memory-restoring prediction head reuses the memories during SID decoding. Extensive experiments demonstrate the effectiveness and flexibility of ComeIR, and further reveal scalable gains from enlarging conditional memory.

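2605.11442 2026-05-13 cs.CR cs.AI cs.CL Version update

Can a Single Message Paralyze the AI Infrastructure? The Rise of AbO-DDoS Attacks through Targeted Mobius Injection

Zi Liang, Ronghua Li, Yanyun Wang, Qingqing Ye, Haibo Hu

Affiliations * The Hong Kong Polytechnic University; HKUST (GZ)

AI summary This paper presents Mobius Injection, a new attack on AI infrastructure that exploits the Semantic Closure vulnerability of autonomous agents to turn a single message into an instruction that executes recursively and persistently, producing Agent-based and -Oriented DDoS (AbO-DDoS) attacks. The attack is lightweight, stealthy, and highly configurable, and can precisely target specific environments or model providers; experiments show substantial performance degradation across several mainstream agent systems. To counter the threat, the authors propose a proactive defense based on Agent Component Energy (ACE) analysis that detects malicious recursive triggers.

Details
Abstract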

Large Language Model (LLM) agents have emerged as key intermediaries, orchestrating complex interactions between human users and a wide range of digital services and LLM infrastructures. While prior research has extensively examined the security of LLMs and agents in isolation, the systemic risk of the agent acting as a disruptive hub within the user-agent-service chain remains largely overlooked. In this work, we expose a novel threat paradigm by introducing Mobius Injection, a sophisticated attack that weaponizes autonomous agents into zombie nodes to launch what we define as Agent-based and -Oriented DDoS (AbO-DDoS) attacks. By exploiting a structural vulnerability in agentic logic named Semantic Closure, an adversary can induce sustained recursive execution of agent components through a single textual injection. We demonstrate that this attack is exceptionally lightweight, stealthy against both traditional DDoS monitors and contemporary AI safety filters, and highly configurable, allowing for surgical targeting of specific environments or model providers. To evaluate the real-world impact, we conduct extensive experiments across three representative claw-style agents and three mainstream coding agents, integrated with 12 frontier proprietary or open-weight LLMs. Our results demonstrate that Mobius Injection achieves substantial attack success across diverse tasks, driving single-node call amplification up to 51.0x and multi-node p95 latency inflation up to 229.1x. The attack performance exhibits a superlinear increase with the number of poisoning nodes. To mitigate Mobius Injection, we propose a proactive defense mechanism using Agent Component Energy (ACE) Analysis, which detects malicious recursive triggers by measuring anomalous energy in the agent's component graph.

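2605.11436 2026-05-13 cs.CL cs.AI Version update

Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty

Joykirat Singh, Zaid Khan, Archiki Prasad, Justin Chih-Yao Chen, Akshay Nambi, Hyunji Lee, Elias Stengel-Eskin, Mohit Bansal

Affiliations * UNC Chapel Hill; The University of Texas at Austin; Microsoft Research

AI summary This paper proposes Agent-BRACE, a method that addresses the uncertainty management and context growth problems large language models face when carrying out tasks in long-horizon, partially observable environments. The method decouples the belief state from the policy and builds a structured belief representation from natural-language claims annotated with certainty labels, helping the model handle uncertainty more effectively when making decisions. Experiments show that Agent-BRACE significantly improves performance on multiple long-horizon tasks while remaining robust to context length.

Comments Code: https://github.com/joykirat18/Agent-BRACE

Details
Abstract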

Large language models (LLMs) are increasingly deployed on long-horizon tasks in partially observable environments, where they must act while inferring and tracking a complex environment state over many steps. This leads to two challenges: partial observability requires maintaining uncertainty over unobserved world attributes, and long interaction history causes context to grow without bound, diluting task-relevant information. A principled solution to both challenges is a belief state: a posterior distribution over environment states given past observations and actions, which compactly encodes history for decision making regardless of episode length. In LLM agents, however, the open-ended nature of text makes it unclear how to represent such a distribution. Therefore, we introduce Agent-BRACE: Agent Belief state Representation via Abstraction and Confidence Estimation, a method that decouples an LLM agent into a belief state model and a policy model, jointly optimized via reinforcement learning. The belief state model produces a structured approximation of the belief distribution: a set of atomic natural language claims about the environment, each annotated with an ordinal verbalized certainty label ranging from certain to unknown. The policy model conditions on this compact, structured approximate belief rather than the full history, learning to select actions under explicit uncertainty. Across long-horizon, partially observable embodied language environments, Agent-BRACE achieves an average absolute improvement of +14.5% (Qwen2.5-3B-Instruct) and +5.3% (Qwen3-4B-Instruct), outperforming strong RL baselines while maintaining a near-constant context window independent of episode length. Further analysis shows that the learned belief becomes increasingly calibrated over the course of an episode as evidence accumulates.
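
The structured belief described above (atomic claims, each with an ordinal certainty label) can be pictured with a small Python sketch. The abstract only states that labels range from certain to unknown, so the intermediate labels, the class names, and the rendering format below are illustrative assumptions rather than the authors' implementation.

```python
from dataclasses import dataclass
from enum import IntEnum

class Certainty(IntEnum):
    """Ordinal certainty labels; only the two endpoints come from the abstract."""
    UNKNOWN = 0
    UNLIKELY = 1   # hypothetical intermediate label
    LIKELY = 2     # hypothetical intermediate label
    CERTAIN = 3

@dataclass(frozen=True)
class Claim:
    """One atomic natural-language claim about the environment."""
    text: str
    certainty: Certainty

def render_belief(claims):
    """Serialize the belief state, most certain claims first, for a policy prompt."""
    ordered = sorted(claims, key=lambda c: c.certainty, reverse=True)
    return "\n".join(f"[{c.certainty.name.lower()}] {c.text}" for c in ordered)

belief = [
    Claim("the key is in the drawer", Certainty.LIKELY),
    Claim("the door is locked", Certainty.CERTAIN),
    Claim("the location of the lamp", Certainty.UNKNOWN),
]
prompt = render_belief(belief)
```

Because the policy would read only this rendered belief rather than the full interaction history, the prompt length depends on the number of claims, not on the episode length.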

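2605.11430 2026-05-13 cs.CV cs.AI cs.LG Version update

Diabetic Retinopathy Classification using Downscaling Algorithms and Deep Learning

Nishi Doshi, Urvi Oza, Pankaj Kumar

Affiliations * Dhirubhai Ambani Institute of Information and Communication Technology

AI summary This study addresses the problem of varying image sizes in diabetic retinopathy (DR) classification and proposes preprocessing retinal images with several downscaling algorithms before feeding them to a deep learning network. Combining the Kaggle dataset and the Indian Diabetic Retinopathy Image Dataset, it runs classification experiments on a modified multi-channel Inception V3 architecture; the results surpass existing methods in accuracy, specificity, and sensitivity, offering a more effective approach to automatic DR grading.

Journal ref 2020 7th International Conference on Signal Processing and Integrated Networks (SPIN)

Details
Abstract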

Diabetic Retinopathy (DR) is an art and science of recording and classifying the retinal images of a diabetic patient. DR classification deals with classifying retinal fundus images into five stages on the basis of severity of diabetes. One of the major issues faced in DR classification is the large and varying size of images. In this paper we propose and explore the use of several downscaling algorithms before feeding the image data to a Deep Learning Network for classification. To improve training and testing, we amalgamate two datasets: Kaggle and the Indian Diabetic Retinopathy Image Dataset. Our experiments have been performed on a novel Multi Channel Inception V3 architecture with a unique self-crafted preprocessing phase. We report results of the proposed approach using accuracy, specificity and sensitivity, which outperform the previous state-of-the-art methods. Index Terms: Diabetic Retinopathy, Downscaling Algorithms, Multichannel CNN Architecture, Deep Learning

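2605.11426 2026-05-13 cs.AI Version update

A Mechanistic Investigation of Supervised Fine Tuning

Ruhaan Chopra

Affiliations * Independent Researcher

AI summary This study examines how supervised fine-tuning (SFT) affects the activations of large language models. It finds that although the cosine similarity of hidden activations before and after fine-tuning is very high, the sparse latent representations differ significantly once the activations are projected through a pretrained sparse autoencoder (SAE). The study proposes an SAE-based analysis method that reveals task-specific and layer-specific changes in semantic features during fine-tuning and identifies a layer-wise update pattern associated with safety alignment. The method provides a high-resolution diagnostic tool for understanding the mechanisms of SFT.

Details
Abstract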

The cosine similarity between a large language model's hidden activations before and after Supervised Fine-Tuning (SFT) remains very high. This, at first glance, suggests that SFT leaves the model's activation geometry largely undisturbed. However, projecting both sets of activations through a Sparse Autoencoder (SAE) pretrained on the base model reveals that the underlying sparse latents diverge significantly. We introduce a novel investigative pipeline which utilizes these pretrained SAEs as a high-resolution diagnostic tool to mechanistically investigate the drivers of this representational divergence. Through our analytical pipeline, we discover task-specific and layer-specific distributions of the precise semantic features that are systematically altered during supervised fine-tuning. We additionally identify a layer-wise update profile specific to safety alignment. All code, experimental scripts, and analysis files associated with this work are publicly available at: https://github.com/ruhzi/sae-investigation.
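
A toy Python sketch can show why high cosine similarity and divergent sparse latents are compatible. It assumes a ReLU encoder of the form max(0, Wx + b); the weights, biases, and activation vectors are invented for illustration and do not come from the paper or its repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def sae_encode(x, weights, biases):
    """Toy SAE encoder: ReLU(W x + b), one latent per weight row."""
    return [max(0.0, sum(w * a for w, a in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def active_set(latents):
    """Indices of latents that fire (strictly positive activation)."""
    return {i for i, z in enumerate(latents) if z > 0.0}

def jaccard(a, b):
    """Overlap between two sets of active latents."""
    return len(a & b) / len(a | b) if a | b else 1.0

# Hypothetical activations before and after fine-tuning: nearly parallel vectors.
base = [1.0, 1.0, 0.0]
tuned = [1.0, 1.0, 0.2]
# Hypothetical encoder with two latents that are sensitive to the third coordinate.
weights = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, -5.0]]
biases = [0.0, -0.1, -0.5]

similarity = cosine(base, tuned)
overlap = jaccard(active_set(sae_encode(base, weights, biases)),
                  active_set(sae_encode(tuned, weights, biases)))
```

A small shift along a direction that several latents threshold on leaves the dense vectors almost parallel while changing which latents fire, which is the kind of divergence the abstract reports at much larger scale.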

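2605.11418 2026-05-13 cs.AI cs.CR Version update

Under the Hood of SKILL.md: Semantic Supply-chain Attacks on AI Agent Skill Registry

Shoumik Saha, Kazem Faghih, Soheil Feizi

Affiliations * Department of Computer Science, University of Maryland - College Park

AI summary This paper studies natural-language semantic supply-chain attacks on AI agent skill registries, showing how SKILL.md files can be maliciously exploited during the skill discovery, selection, and governance stages. Experiments demonstrate that attackers can use carefully crafted textual triggers to raise the visibility of malicious skills, steer agents toward functionally similar adversarial variants, and evade security review. The study argues that SKILL.md is not just documentation but operational text that shapes agent behavior, exposing a serious security risk in current mechanisms for extending AI agent capabilities.

Comments 31 pages, 21 figures, 10 tables

Details
Abstract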

Autonomous AI agents increasingly extend their capabilities through Agent Skills: modular filesystem packages whose SKILL.md files describe when and how agents should use them. While this design enables scalable, on-demand capability expansion, it also introduces a semantic supply-chain risk in which natural-language metadata and instructions can affect which skills are admitted, surfaced, selected, and loaded. We study SKILL.md-only attacks across three registry-facing stages of the Agent Skill lifecycle, using real ClawHub skills and realistic registry mechanisms. In Discovery, short textual triggers can manipulate embedding-based retrieval and improve adversarial skill visibility, achieving up to 86% pairwise win rate and 80% Top-10 placement. In Selection, description-only framing biases agents toward functionally equivalent adversarial variants, which are selected in 77.6% of paired trials on average. In Governance, semantic evasion strategies cause malicious skills to avoid a blocking verdict in 36.5%-100% of cases. Overall, our results show that SKILL.md is not passive documentation but operational text that shapes which third-party capabilities agents find, trust, and use.

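2605.11414 2026-05-13 cs.LG cs.AI Version update

Generative Diffusion Prior Distillation for Long-Context Knowledge Transfer

Nilushika Udayangani, Kishor Nandakishor, Marimuthu Palaniswami

Affiliations * Department of Electrical and Electronic Engineering

AI summary This paper studies how, in time-series classification, the knowledge of a full-sequence classifier can be transferred to a classifier that sees only partial-sequence inputs. To address the loss of generalization caused by the lack of discriminative features in partial data, the authors propose Generative Diffusion Prior Distillation (GDPD), a knowledge distillation framework that treats short-context student features as degraded observations of full-context teacher features, uses the iterative restoration capability of diffusion models to learn a generative prior over teacher features, and guides the student features toward long-context knowledge, improving partial-sequence classification. Experiments show that GDPD delivers strong full-to-partial knowledge transfer across datasets and architectures.

Comments Published as a conference paper at ICLR 2026 (Brazil, Rio de Janeiro)

Journal ref The Fourteenth International Conference on Learning Representations 2026

Details
Abstract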

While traditional time-series classifiers assume full sequences at inference, practical constraints (latency and cost) often limit inputs to partial prefixes. The absence of class-discriminative patterns in partial data can significantly hinder a classifier's ability to generalize. This work uses knowledge distillation (KD) to equip partial time-series classifiers with the generalization ability of their full-sequence counterparts. In KD, a high-capacity teacher transfers supervision to aid student learning on the target task. Matching with teacher features has shown promise in closing the generalization gap due to limited parameter capacity. However, when the generalization gap arises from training-data differences (full versus partial), the teacher's full-context features can be an overwhelming target signal for the student's short-context features. To provide progressive, diverse, and collective teacher supervision, we propose Generative Diffusion Prior Distillation (GDPD), a novel KD framework that treats short-context student features as degraded observations of the target full-context features. Inspired by the iterative restoration capability of diffusion models, we learn a diffusion-based generative prior over teacher features. Leveraging this prior, we posterior-sample target teacher representations that could best explain the missing long-range information in the student features and optimize the student features to be minimally degraded relative to these targets. GDPD provides each student feature with a distribution of task-relevant long-context knowledge, which benefits learning on the partial classification task. Extensive experiments across earliness settings, datasets, and architectures demonstrate GDPD's effectiveness for full-to-partial distillation.

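2605.11408 2026-05-13 cs.LG cs.AI cs.CL Version update

MaskTab: Scalable Masked Tabular Pretraining with Scaling Laws and Distillation for Industrial Classification

Bo Zheng, Yudong Chen, Zihua Xiong, Shuai Fang, Peidong He, Yang Yang, Sheng Guo

Affiliations * Zhejiang University; MyBank, Ant Group

AI summary MaskTab is a unified pretraining framework designed for industrial-scale tabular data, which is high-dimensional, full of missing values, and sparsely labeled. The method introduces learnable missing-value tokens and a hybrid supervised pretraining strategy, combined with an MoE-augmented loss, to improve model performance on large-scale industrial data. Experiments show that MaskTab significantly outperforms existing methods on multiple industrial benchmarks and distills efficiently into lightweight models that retain their advantage under strict latency and interpretability constraints.

Details
Abstract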

Tabular data forms the backbone of high-stakes decision systems in finance, healthcare, and beyond. Yet industrial tabular datasets are inherently difficult: high-dimensional, riddled with missing entries, and rarely labeled at scale. While foundation models have revolutionized vision and language, tabular learning still leans on handcrafted features and lacks a general self-supervised framework. We present MaskTab, a unified pre-training framework designed specifically for industrial-scale tabular data. MaskTab encodes missing values via dedicated learnable tokens, enabling the model to distinguish structural absence from random dropout. It jointly optimizes a hybrid supervised pre-training scheme--utilizing a twin-path architecture to reconcile masked reconstruction with task-specific supervision--and an MoE-augmented loss that adaptively routes features through specialized subnetworks. On industrial-scale benchmarks, it achieves +5.04% AUC and +8.28% KS over prior art under rigorous scaling. Moreover, its representations distill effectively into lightweight models, yielding +2.55% AUC and +4.85% KS under strict latency and interpretability constraints, while improving robustness to distribution shifts. Our work demonstrates that tabular data admits a foundation-model treatment--when its structural idiosyncrasies are respected.

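2605.11404 2026-05-13 cs.AI Version update

Attributing Emergence in Million-Agent Systems

Ling Tang, Jilin Mei, Qian Chen, Qihan Ren, Linfeng Zhang, Quanshi Zhang, Jing Shao, Xia Hu, Dongrui Liu

Affiliations * Shanghai Artificial Intelligence Laboratory; Shanghai Jiao Tong University; Fudan University; Tongji University

AI summary This study investigates how to attribute macro-level emergent phenomena to individual agents in million-agent systems. Because of their computational complexity, existing methods apply only to small systems, whereas real social phenomena often occur at the scale of millions of agents. The study therefore extends Aumann-Shapley path-integral attribution to million-agent scale, yielding an efficient attribution computation that satisfies all four axioms. Its empirical analysis reveals structural differences between attributions computed on small samples and on the full data, and it proves that full-scale attribution is a theoretical necessity for nonlinear macro indicators.

Details
Abstract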

Large language models (LLMs) can simulate human-like reasoning and decision-making in individual agents. LLM-powered multi-agent systems (MAS) combine such agents to simulate population-scale social phenomena such as polarization, information cascades, and market panics. Such studies require attributing macro emergence to individual agents, but existing axiomatic methods scale combinatorially in $N$ and have been confined to $N \lesssim 10^3$, while the phenomena they explain occur at $N \geq 10^6$. We address this gap by adapting Aumann-Shapley path-integral attribution to LLM-powered MAS at million-agent scale; the resulting method satisfies all four axioms and runs four to five orders of magnitude faster than sampled Shapley on the same hardware. We use this method to test the scale gap empirically: across 14 days of public Bluesky data ($1{,}671{,}587$ active users), we compute the attribution at both full scale and the visibility-biased $N = 10^2$ convenience sample used by small-scale studies, and the two disagree structurally. At full scale the long tail and middle tier jointly carry the majority; the biased small panel attributes almost everything to a few high-follower accounts. We then prove that under any nonlinear macro indicator the disagreement cannot be reduced by post-hoc rescaling: an Attribution Scaling Bias theorem shows that no global rescaling factor can reconcile small-scale and full-scale attribution. Full-scale attribution is therefore not a methodological choice but a theoretical requirement for any nonlinear macro indicator.
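
For readers unfamiliar with Aumann-Shapley attribution, the following Python sketch shows the path-integral idea on a three-agent toy problem. It is only an illustration under stated assumptions: the macro indicator is a made-up quadratic, the baseline is the zero vector, and gradients are estimated with finite differences, which would not scale to the million-agent setting the paper targets.

```python
def macro_indicator(x):
    """Toy nonlinear macro indicator (hypothetical): squared total activity."""
    return sum(x) ** 2

def partial(f, x, i, h=1e-5):
    """Central finite-difference estimate of the partial derivative df/dx_i at x."""
    up = list(x)
    up[i] += h
    down = list(x)
    down[i] -= h
    return (f(up) - f(down)) / (2 * h)

def aumann_shapley(f, x, steps=50):
    """Attribute f(x) - f(0) to each coordinate by integrating its partial
    derivative along the straight path from the zero baseline to x."""
    attributions = [0.0] * len(x)
    for k in range(steps):
        t = (k + 0.5) / steps  # midpoint rule on [0, 1]
        point = [t * xi for xi in x]
        for i, xi in enumerate(x):
            attributions[i] += xi * partial(f, point, i) / steps
    return attributions

x = [1.0, 2.0, 3.0]
phi = aumann_shapley(macro_indicator, x)
```

For this quadratic indicator the attributions come out proportional to each agent's activity and sum to the indicator's value, which is the efficiency (completeness) property usually required of such attributions.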

2605.11403 2026-05-13 cs.LG cs.AI cs.CL Version update

FG-ExPO: Frontier-Guided Exploration-Prioritized Policy Optimization via Adaptive KL and Gaussian Curriculum

Mingxiong Lin, Zhangquan Gong, Maowen Tang, Qian Li, Chuangchuang Wang, Jian Ma, Sutian Huang, Kai Tang, Haonan Lu

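Affiliations * OPPO AI Center

AI summary This study targets two inefficiencies of Group Relative Policy Optimization (GRPO), the dominant algorithm in reinforcement learning with verifiable rewards (RLVR), and proposes FG-ExPO. The method introduces two lightweight components, Accuracy-Conditioned KL Scaling (AKL) and Gaussian Curriculum Sampling (GCS), which respectively adjust the strength of the constraint on policy exploration dynamically and reshape the question sampling distribution, improving training efficiency on mathematical reasoning tasks. Experiments show that FG-ExPO clearly outperforms vanilla GRPO on multiple mainstream benchmarks, with especially large gains on tasks such as AIME 2025.

Details
Abstract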

Reinforcement Learning with Verifiable Rewards (RLVR) has become the standard paradigm for LLM mathematical reasoning, with Group Relative Policy Optimization (GRPO) serving as the dominant algorithm. We identify two overlooked inefficiencies inherent in GRPO. First, a fixed KL coefficient overly restricts policy exploration at moments when the model needs to diverge significantly from the reference policy. Second, uniform question sampling overlooks that moderately difficult problems produce the most informative gradient signals. We propose FG-ExPO, short for Frontier-Guided Exploration-Prioritized Policy Optimization, which integrates two lightweight components. Accuracy-Conditioned KL Scaling (AKL) adjusts the KL penalty strength through a smooth nonlinear function of batch average accuracy, loosening the constraint when the model performs poorly and strengthening it when the model achieves satisfactory results. Gaussian Curriculum Sampling (GCS) assigns sampling weights to questions following a Gaussian distribution centered at a moderate accuracy level around 0.5, focusing model training on its learning frontier. We conduct evaluations on DeepSeek-R1-Distill-Qwen-1.5B and Qwen3-8B-Base across six mainstream mathematical reasoning benchmarks. Experimental results demonstrate that FG-ExPO consistently outperforms vanilla GRPO. It delivers an absolute improvement of 13.34 on the AIME 2025 pass@32 metric, rising from 63.33 percent to 76.67 percent, and obtains an average pass@32 gain of 2.66 on the 8B model. The substantially larger performance gains observed on pass@32 compared to pass@1 verify that FG-ExPO enlarges the model's effective exploration space under a fixed inference budget.
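
The two components can be sketched in a few lines of Python. The abstract specifies only that AKL is a smooth nonlinear function of batch accuracy and that GCS weights follow a Gaussian centered near 0.5; the sigmoid form, the coefficient range, and the width `sigma` below are assumptions chosen for illustration.

```python
import math

def akl_coefficient(batch_accuracy, beta_min=0.001, beta_max=0.04,
                    center=0.5, sharpness=10.0):
    """Accuracy-conditioned KL coefficient (assumed sigmoid form).

    Low batch accuracy gives a small penalty (looser constraint);
    high batch accuracy gives a larger penalty (tighter constraint).
    """
    gate = 1.0 / (1.0 + math.exp(-sharpness * (batch_accuracy - center)))
    return beta_min + (beta_max - beta_min) * gate

def gcs_weights(accuracies, mu=0.5, sigma=0.2):
    """Gaussian curriculum weights over per-question accuracies, normalized to sum to 1."""
    raw = [math.exp(-((a - mu) ** 2) / (2 * sigma ** 2)) for a in accuracies]
    total = sum(raw)
    return [r / total for r in raw]

# Per-question accuracies: one unsolved, one at the frontier, one fully solved.
weights = gcs_weights([0.0, 0.5, 1.0])
loose = akl_coefficient(0.1)
tight = akl_coefficient(0.9)
```

The resulting weights could be passed to `random.choices(questions, weights=weights, k=batch_size)` to draw a training batch that favors questions the model solves about half the time.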

2605.11398 2026-05-13 cs.AI cs.CL Version update

AcuityBench: Evaluating Clinical Acuity Identification and Uncertainty Alignment

Robin Linzmayer, Georgianna Lin, Di Coneybeare, Jason Chu, Trudi Cloyd, Manish Garg, Miles Gordon, Elizabeth Hartofilis, Benjamin Hong, Ashraf Hussain, Eugene Y. Kim, Oluchi Iheagwara King, Ross McCormack, Erica Olsen, John K. Riggins, Mustafa N. Rasheed, Dana L. Sacco, Vinay Saggar, Osman R. Sayan, Amit Shembekar, Janice Shin-Kim, Wendy W. Sun, Bernard P. Chang, David Kessler, Noémie Elhadad

Affiliations * Department of Computer Science, Columbia University, New York, NY, USA; Department of Biomedical Informatics, Columbia University, New York, NY, USA; Department of Emergency Medicine, Columbia University Irving Medical Center, New York, NY, USA

AI summary This paper presents AcuityBench, a benchmark for evaluating whether language models correctly identify the urgency of care from users' medical descriptions. The benchmark integrates five public datasets covering user conversations, forum posts, clinical vignettes, and patient portal messages, all evaluated under a shared four-level acuity framework. The study finds that models differ substantially between clear and ambiguous cases and that the choice of task format affects the type of triage error, underscoring clinical acuity identification as a key safety capability.

Comments 41 pages, 5 figures. Preprint under review for the Track on Evaluations and Datasets at NeurIPS 2026

Details
Abstract

We introduce AcuityBench, a benchmark for evaluating whether language models identify the appropriate urgency of care from user medical presentations. Existing health benchmarks emphasize medical question answering, broad health interactions, or narrow workflow-specific triage tasks, but they do not offer a unified evaluation of acuity identification across these settings. AcuityBench addresses this gap by harmonizing five public datasets spanning user conversations, online forum posts, clinical vignettes, and patient portal messages under a shared four-level acuity framework ranging from home monitoring to immediate emergency care. The benchmark contains 914 cases, including 697 consensus cases for standard accuracy evaluation and 217 physician-confirmed ambiguous cases for uncertainty-aware evaluation. It supports two complementary task formats: explicit four-way classification in a QA setting, and free-form conversational responses evaluated with a rubric-based judge anchored to the same framework. Across 12 frontier proprietary and open-weight models, we find substantial variation in clear-case acuity accuracy and error direction. Comparing task formats reveals a systematic tradeoff: conversational responses reduce over-triage but increase under-triage relative to QA, especially in higher-acuity cases. In ambiguous cases, no model closely matches the distribution of physician judgments, and model predictions are more concentrated than expert clinical uncertainty. We also compare expert and model adjudication on a subset of maximally ambiguous cases, using those cases to examine the role of clinical uncertainty in label disagreement. Together, these results position acuity identification as a distinct safety-critical capability and show that AcuityBench enables systematic comparison and stress-testing of how well models guide users to the right level of care in real-world health use.
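
The over-triage and under-triage error directions mentioned above can be made concrete with a short Python sketch. It assumes acuity levels are encoded as integers 0 to 3, ordered from home monitoring (0) to immediate emergency care (3); the function name and the toy labels are hypothetical and are not drawn from the benchmark.

```python
def triage_errors(gold, predicted):
    """Accuracy plus directional error rates on an ordinal acuity scale.

    Over-triage: predicted level is more urgent than the gold label.
    Under-triage: predicted level is less urgent than the gold label.
    """
    n = len(gold)
    over = sum(p > g for g, p in zip(gold, predicted))
    under = sum(p < g for g, p in zip(gold, predicted))
    return {
        "accuracy": (n - over - under) / n,
        "over_triage": over / n,
        "under_triage": under / n,
    }

# Toy gold labels and model predictions for six cases (levels 0 to 3).
gold = [0, 1, 2, 3, 3, 1]
predicted = [0, 2, 2, 2, 3, 1]
report = triage_errors(gold, predicted)
```

Reporting the two error rates separately matters because under-triage in a high-acuity case is the more dangerous failure, which plain accuracy does not reveal.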

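2605.11394 2026-05-13 stat.ML cs.AI cs.LG stat.AP stat.ME Version update

Spatial Adapter: Structured Spatial Decomposition and Closed-Form Covariance for Frozen Predictors

Wen-Ting Wang, Wei-Ying Wu, Hao-Yun Huang, Xuan-Chun Wang

Affiliations * Institute of Statistics, National Chung Hsing University; Department of Applied Mathematics, National Dong Hwa University

AI summary This paper proposes the Spatial Adapter, a parameter-efficient module that, without modifying the parameters of the original predictive model, equips any frozen first-stage predictor with a structured spatial representation of its residuals and a closed-form covariance estimate. Using a tractable mini-batch ADMM algorithm, the method jointly learns a spatially regularized orthonormal basis and per-sample scores, extracting a low-rank spatial structure from the residual field that is smooth, sparse, and orthogonal. The method supports kriging-style spatial prediction at unobserved locations and can also be used for uncertainty quantification; experiments show that it recovers residual spatial structure across a range of datasets with far fewer parameters than traditional methods.

Comments Preprint. 10 pages main text, with appendices

Details
Abstract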

We present the Spatial Adapter, a parameter-efficient post-hoc layer that equips any frozen first-stage predictor with a structured spatial representation of its residual field and an induced closed-form spatial covariance. The adapter operates as a cascade second stage on residuals, jointly learning a spatially regularized orthonormal basis and per-sample scores via a tractable mini-batch ADMM procedure, without modifying any first-stage parameter. Because the first-stage parameters are frozen, the adapter does not retrain the backbone; its role is to supply a compressed distributional summary of the residual field. Smoothness, sparsity, and orthogonality together turn a generic low-rank factorization into an identifiable spatial representation whose induced residual covariance admits a closed-form low-rank-plus-noise estimator; the effective rank is determined data-adaptively by spectral thresholding, while the nominal rank K is an optimization-side upper bound only. This covariance enables kriging-style spatial prediction at unobserved locations, with plug-in uncertainty quantification as a secondary downstream use. Across synthetic data, Weather2K for spatial-holdout prediction, and GWHD patch grids as a basis-transferability diagnostic, the adapter recovers residual spatial structure when paired with frozen first stages from linear models to deep spatiotemporal and vision backbones; the added representation uses fewer than K(N+T) parameters alongside a compact residual-trend network.
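
A generic low-rank-plus-noise model makes the phrase "closed-form covariance" and the kriging-style predictor concrete. The notation below is an assumption for illustration (the abstract does not give formulas), and it presumes that scores and noise are uncorrelated and zero-mean; the paper's actual estimator may differ.

```latex
% Assumed notation (not from the abstract):
%   r_t \in \mathbb{R}^N          residual field of sample t at N observed locations
%   \Phi \in \mathbb{R}^{N \times K}  orthonormal spatial basis, z_t \in \mathbb{R}^K scores
%   \phi(s_0) \in \mathbb{R}^K    basis functions evaluated at an unobserved location s_0
\begin{align}
  r_t &= \Phi z_t + \varepsilon_t, \qquad
         \Phi^\top \Phi = I_K, \quad
         \operatorname{Cov}(z_t) = \Lambda, \quad
         \operatorname{Cov}(\varepsilon_t) = \sigma^2 I_N, \\
  \Sigma &= \Phi \Lambda \Phi^\top + \sigma^2 I_N, \\
  \hat r_t(s_0) &= \phi(s_0)^\top \Lambda \Phi^\top \Sigma^{-1} r_t .
\end{align}
```

Under these assumptions the second line is the low-rank-plus-noise covariance, and the third is the standard simple-kriging predictor, since the cross-covariance between the residual at $s_0$ and the observed residuals is $\phi(s_0)^\top \Lambda \Phi^\top$.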

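2605.11392 2026-05-13 cs.AI Version update

Transformer Interpretability from Perspective of Attention and Gradient

Yongjin Cui, Xiaohui Fan, Huajun Chen

Affiliations * Zhejiang University

AI summary From the perspective of attention and gradient, this paper studies the interpretability of Transformer models in depth and proposes a method that interprets feature regions more comprehensively and in finer detail by guiding the gradient direction, that is, the attention direction. The method helps to better understand how Transformers work, exposes differences between how Vision Transformer (ViT) and humans perceive images, and demonstrates nearly imperceptible tampering with an image's class, which may pose security risks in certain scenarios.

Details
Abstract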

Although researchers' attention is focused mainly on the performance of Transformer models, the interpretation of Transformers cannot be ignored. Gradients are widely used in Transformer interpretation. From the perspective of attention and gradient, we conduct an in-depth study of Transformer interpretation and propose a method that achieves it by guiding the gradient direction, or more precisely, the attention direction. The method enables more comprehensive interpretation of feature regions, offers detailed interpretation, and helps to better understand the Transformer mechanism. Leveraging the difference in how Vision Transformer (ViT) and humans perceive images, we alter the class of an image in a way that is almost imperceptible to the human eye. This class rewriting phenomenon may pose security risks in certain scenarios.

2605.11388 2026-05-13 cs.CL cs.AI Version update

Deep Reasoning in General Purpose Agents via Structured Meta-Cognition

Dean Light, Michael Theologitis, Kshitish Ghate, Shuyue Stella Li, Benjamin Newman, Chirag Shah, Aylin Caliskan, Pang Wei Koh, Dan Suciu, Yulia Tsvetkov

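Affiliations * University of Washington

AI summary This work proposes Deep Reasoning, an approach that aims to make general-purpose agents more flexible and adaptive on reasoning tasks. Through structured meta-reasoning, it dynamically constructs task-specific reasoning scaffolds at inference time and so handles complex problems more effectively. Experiments show that DOLORES, a general-purpose agent built on this approach, significantly outperforms existing methods on several hard benchmarks, demonstrating its strengths in structured reasoning and task adaptability.

Comments Preprint under review

Details
English abstract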

Humans intuitively solve complex problems by flexibly shifting among reasoning modes: they plan, execute, revise intermediate goals, resolve ambiguity through associative judgment, and apply formal procedures to well-specified subproblems. Current LLM agents lack this flexibility, as their scaffolds hard-code such reasoning decisions in advance. These scaffolds are effective when their prescribed structure matches the task, but brittle when solving the task requires adapting the structure of reasoning itself. We introduce Deep Reasoning -- an inference-time approach for constructing task-specific scaffolds through structured meta-reasoning. Deep Reasoning uses a formal language that represents meta-reasoning as executable decompositions over associative inference, formal computation, and recursive subproblem solving, enabling decomposition principles to be encoded as in-context examples that guide test-time scaffold construction. We instantiate this approach in a general-purpose agent (DOLORES) that distributes complex tasks across more controlled reasoning threads. We evaluate it against state-of-the-art scaffolding methods across four hard benchmarks: multi-hop reasoning, long-chain question answering, long-context aggregation, and deep research-style information seeking. DOLORES outperforms all evaluated scaffolds across three model sizes and two model families, improving over the strongest evaluated scaffold baseline by 24.8% on average. DOLORES distributes cognition across structured, lower-load reasoning threads, thereby reducing premature termination and hallucinations. This advantage can even bridge the scaling gap, with an 8B version surpassing all evaluated 32B baselines from the same family in more than half the settings. These results point toward future agentic systems that treat scaffolding as adaptive reasoning, constructing the structure each task requires just-in-time.

2605.11386 2026-05-13 cs.AI Version update

Revisiting Privacy Preservation in Brain-Computer Interfaces: Conceptual Boundaries, Risk Pathways, and a Protection-Strength Grading Framework

Lei Sun, Xiuqing Mao, Shuai Zhang, Qingyu Zeng, Min Zhao, Jiyuan Li, Wenle Dong

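Affiliations * PLA Information Engineering University

AI summary As brain-computer interface (BCI) technology moves from the laboratory into clinical and real-world use, its privacy problems are becoming more prominent. This paper systematically reviews the pathways through which privacy can leak in BCI systems and proposes a three-dimensional classification framework covering protection object, lifecycle stage, and protection-strength level, which sorts existing work into four protection-strength levels. The review stresses that BCI privacy protection must not only hide data but also separate out task-irrelevant sensitive information while preserving the system's practical utility, and it identifies mental privacy and neuroethical risks as open problems.

Details
English abstract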

Brain-computer interfaces (BCIs) are moving rapidly from laboratory research into clinical, edge, and real-world settings. Under ISO/IEC 8663:2025, a BCI is a direct communication link between central nervous system activity and external software or hardware systems. This link expands privacy risk beyond raw neural-signal leakage: neural data, derived representations, model assets, and decoded outputs can be re-associated with individuals across collection, transmission, storage, training, inference, and feedback, or used to infer information beyond what a task requires. Starting from the general BCI paradigm, this review defines privacy-protection boundaries, protection objects, and the relationship between user data privacy and model privacy within a shared risk pathway. It then proposes a three-dimensional framework - protection object, lifecycle stage, and dominant protection-strength level - to classify existing work into four levels of protection strength. Finally, mental privacy and neuroethical risks are treated as open issues, emphasizing that BCI privacy protection should not only obscure data but also disentangle task-irrelevant sensitive information while preserving downstream utility. Keywords: Brain-computer interface, Neural data privacy, User data privacy, Model privacy, Disentanglement of task-irrelevant sensitive information, Protection-strength grading, Neuroethical risks

2605.11380 2026-05-13 cs.LG cs.AI Version update

TRACE: Temporal Routing with Autoregressive Cross-channel Experts for EEG Representation Learning

Fan Ma, Qier An, Peng Chen, Lingfei Qian, Xiang Lan, Mingyang Jiang, Zhiling Gu, Xenophon Papademetris, Hua Xu

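Affiliations * Department of Biomedical Informatics and Data Science, Yale University

AI summary This paper proposes TRACE, an autoregressive EEG pre-training framework that targets the difficulty of learning transferable representations from multi-channel, non-stationary EEG signals. TRACE predicts future EEG patches from causal context and performs temporally adaptive, cross-channel-consistent computation at every time step, allowing flexible modeling of different temporal regimes and inter-channel relationships. The method supports heterogeneous pre-training across channel configurations and recording domains. Experiments show strong results on several downstream tasks, and the method is competitive on motor imagery and clinical event classification in particular.

Details
English abstract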

Learning transferable representations for electroencephalography (EEG) remains challenging because EEG signals are inherently multi-channel and non-stationary. Channels observed at the same time provide coupled measurements of neural activity, while the relevant temporal dynamics vary across contexts. This structure is poorly matched by architectures that apply uniform computation across time or route each channel patch independently. To this end, we propose TRACE, an autoregressive EEG pre-training framework that predicts future EEG patches from causal context while performing temporally adaptive and cross-channel coherent computation. At each temporal step, TRACE derives an expert routing decision from the causal cross-channel history and applies it jointly to all channels at that step. This preserves instantaneous cross-channel coherence while allowing different temporal regimes to activate different computation. Since routing is defined over the available channel set and causal temporal context, TRACE is compatible with heterogeneous pre-training across corpora with different channel counts, montages, sequence lengths, and recording domains. Across eight downstream EEG benchmarks, TRACE is evaluated in both settings: when downstream domains are seen only as unlabeled pre-training data and when downstream datasets are completely unseen during pre-training. It obtains the best results on several benchmarks while remaining competitive on motor imagery and clinical event classification tasks, with ablations supporting the importance of cross-channel temporal routing.

2605.11376 2026-05-13 cs.AI Version update

LLM-X: A Scalable Negotiation-Oriented Exchange for Communication Among Personal LLM Agents

Giuliano Lorenzoni, Paulo Alencar, Donald Cowan

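Affiliations * University of Waterloo

AI summary This paper proposes LLM-X, a scalable negotiation-oriented exchange framework that supports direct, structured communication among personal language-model agents. The framework introduces a message bus and routing mechanism that ensure structural validity and policy enforcement, and it provides an architecture with federated gateways, topic-based routing, and policy enforcement, together with a typed message protocol supporting capability negotiation and contract-net-style coordination. Experiments show that LLM-X remains stable across scales and load conditions, and they reveal a trade-off among robustness, fairness, and communication efficiency that depends on the chosen policy.

Comments 8 pages, 7 figures, accepted at AGENT 2026 Workshop, co-located with ICSE 2026

Details
English abstract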

We propose a personal-LLM exchange (LLM-X), a scalable negotiation-oriented environment that enables direct, structured communication across populations of personal agents (LLMs), each representing an individual user. Unlike existing tool-centric protocols that focus on agent-API interaction, LLM-X introduces a message bus and routing substrate for LLM-to-LLM coordination with guarantees around schema validity and policy enforcement. We contribute: (1) an architecture for LLM-X comprising federated gateways, topic-based routing, and policy enforcement; (2) a typed message protocol supporting capability negotiation and contract-net-style coordination; and (3) the first empirical evaluation of LLM-based multi-agent negotiation at scale. Experiments span 5, 9, and 12 agents, under distinct negotiation policies (Low, Medium, High), and across both short-run (minutes) and long-run (2h, 12h) load conditions. Results highlight clear policy-performance trade-offs: stricter policies improve robustness and fairness but increase latencies and message volume. Extended runs confirm that LLM-X remains stable under sustained load, with bounded latency drift.

2605.11373 2026-05-13 cs.AI cs.LG stat.ML Version update

Causal Algorithmic Recourse: Foundations and Methods

Drago Plecko, Collin Wang, Elias Bareinboim

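Affiliations * Department of Statistics & Data Science, UCLA; Department of Computer Science, Columbia University

AI summary This paper studies algorithmic recourse: how to give individuals reliable recommendations for reversing a decision made by an AI system. The authors propose a causal framework that models recourse as a process over pre- and post-intervention outcomes and accounts for resampling and partial stability of latent variables. The paper introduces post-recourse stability conditions, develops a copula-based algorithm for inferring recourse effects from observational data, and proposes a distribution-free learning method for cases where the data do not fit the copula model, yielding a more robust and practical approach to algorithmic recourse.

Details
English abstract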

The trustworthiness of AI decision-making systems is increasingly important. A key feature of such systems is the ability to provide recommendations for how an individual may reverse a negative decision, a problem known as algorithmic recourse. Existing approaches treat recourse outcomes as counterfactuals of a fixed unit, ignoring that real-world recourse involves repeated decisions on the same individual under possibly different latent conditions. We develop a causal framework that models recourse as a process over pre- and post-intervention outcomes, allowing for partial stability and resampling of latent variables. We introduce post-recourse stability conditions that enable reasoning about recourse from observational data alone, and develop a copula-based algorithm for inferring the effects of recourse under these conditions. For settings where paired observations of the same individual before and after intervention are available (called recourse data), we develop methods for inferring copula parameters and performing goodness-of-fit testing. When the copula model is rejected, we provide a distribution-free algorithm for learning recourse effects directly from recourse data. We demonstrate the value of the proposed methods on real and semi-synthetic datasets.

2605.11368 2026-05-13 cs.LG cs.AI q-bio.GN Version update

LPDP: Inference-Time Reward Control for Variable-Length DNA Generation with Edit Flows

Jeongchan Kim, Yunkyung Ko, Jong Chul Ye

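Affiliations * KAIST AI

AI summary This paper studies how to use Edit Flows for inference-time reward control in DNA sequence generation. It proposes LPDP, a training-free local re-solving operator that is aware of intermediate states and actions and performs efficient edit operations when generating variable-length DNA sequences. At each guided step, LPDP scores one-step root edits, retains a set of top root edits, and solves a bounded local discrete optimization problem, improving the quality and biological plausibility of generated sequences. It is applied to enhancer optimization and splice-boundary repair.

Comments 22 pages, 5 figures

Details
English abstract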

We study the application of the recently proposed Edit Flows to inference-time reward control in DNA sequence generation. Unlike most reward-guided DNA generation frameworks, which operate on fixed-length sequence spaces, Edit Flows have the potential to generate variable-length DNA through biologically plausible insertion, deletion, and substitution operations. In particular, we propose Local Perturbation Discrete Programming (LPDP), a training-free, intermediate-state and action-aware local re-solving operator for variable-length DNA edit-action generators at inference time. More specifically, at each guided rollout step, LPDP scores one-step root edits, retains a near-best root band, and re-ranks each retained root by solving a bounded local discrete program around its child sequence. This local program uses the typed geometry of edit actions to focus on coherent substitution, insertion, or deletion subgraphs, and aggregates local continuations with either a hard Max backup or a soft log-sum-exponential (LSE) backup. We instantiate LPDP in two regimes: front-loaded reward tilting for enhancer optimization, where early edits are critical for establishing global regulatory sequence structure, and back-loaded reward tilting for exon-intron-exon inpainting, where late edits fine-tune splice-boundary contexts.
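The difference between a hard Max backup and a soft LSE backup can be shown with a small sketch. The abstract does not give the exact LSE form, so the temperature-scaled version below is an assumption, and the root names and continuation scores are hypothetical toy values rather than outputs of LPDP.

```python
import math

def max_backup(values):
    """Hard backup: a root is worth its single best local continuation."""
    return max(values)

def lse_backup(values, temperature=1.0):
    """Soft backup (assumed form): temperature * log(sum(exp(v / temperature))).

    Computed in a numerically stable way by factoring out the maximum.
    """
    m = max(values)
    total = sum(math.exp((v - m) / temperature) for v in values)
    return m + temperature * math.log(total)

def rank_roots(roots, backup):
    """Order candidate root edits by the backed-up value of their continuations."""
    return sorted(roots, key=lambda name: backup(roots[name]), reverse=True)

# Hypothetical continuation scores for two retained root edits.
roots = {
    "sub@5": [0.9, 0.1],       # one strong continuation, one weak
    "ins@7": [0.8, 0.8, 0.8],  # several uniformly good continuations
}
ranking_max = rank_roots(roots, max_backup)
ranking_lse = rank_roots(roots, lse_backup)
```

In this toy case the Max backup prefers `sub@5` because of its single best continuation, while the LSE backup prefers `ins@7` because it rewards a root with many good continuations. As the temperature shrinks, the LSE value approaches the Max value.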

2605.11362 2026-05-13 cs.LG cs.AI stat.AP stat.ML Version update

Causal Fairness for Survival Analysis

Drago Plecko

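Affiliations * Department of Statistics & Data Science

AI summary In the data-driven era, machine learning and artificial intelligence are widely used in high-stakes domains such as healthcare and employment, raising concerns about the fairness of these systems. Existing fair machine learning research focuses mostly on static settings, while fairness in temporal settings such as survival analysis remains underexplored. This paper proposes a causal framework for fairness in survival analysis that decomposes survival disparities into contributions from direct, indirect, and spurious pathways, giving an interpretable account of why disparities arise and how they evolve. The framework is applied to analyze how racial disparities in intensive care units change over time.

Details
English abstract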

In the data-driven era, large-scale datasets are routinely collected and analyzed using machine learning (ML) and artificial intelligence (AI) to inform decisions in high-stakes domains such as healthcare, employment, and criminal justice, raising concerns about the fairness behavior of these systems. Existing works in fair ML cover tasks such as bias detection, fair prediction, and fair decision-making, but largely focus on static settings. At the same time, fairness in temporal contexts, particularly survival/time-to-event (TTE) analysis, remains relatively underexplored, with current approaches to fair survival analysis adopting statistical fairness definitions, which, even with unlimited data, cannot disentangle the causal mechanisms that generate disparities. To address this gap, we develop a causal framework for fairness in TTE analysis, enabling the decomposition of disparities in survival into contributions from direct, indirect, and spurious pathways. This provides a human-understandable explanation of why disparities arise and how they evolve over time. Our non-parametric approach proceeds in four steps: (1) formalizing the necessary assumptions about censoring and lack of confounding using a graphical model; (2) recovering the conditional survival function given covariates; (3) applying the Causal Reduction Theorem to reframe the problem in a form amenable to causal pathway decomposition; (4) estimating the effects efficiently. Finally, our approach is used to analyze the temporal evolution of racial disparities in outcome after admission to an intensive care unit (ICU).

2605.11360 2026-05-13 cs.CR cs.AI cs.SE Version update

Options, Not Clicks: Lattice Refinement for Consent-Driven MCP Authorization

Ying Li, Yanju Chen, Peiran Wang, Issac Khabra, Faysal Hossain Shezan, Yu Feng, Yuan Tian

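Affiliations * University of California, Los Angeles; University of California, San Diego; University of Texas at Arlington; University of California, Santa Barbara

AI summary As the Model Context Protocol is widely adopted, securing tool invocations through meaningful user authorization has become a key problem. This paper proposes Conleash, a client-side middleware that uses a risk lattice to automatically permit safe calls within known boundaries while identifying and escalating potential risks, and that implements user-defined invariants and reusable rules through a policy engine and a rule-refinement loop. Experiments show that Conleash achieves high accuracy and low overhead in real-world settings, and in a user study it earned higher trust and required fewer interactions.

Details
English abstract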

As Model Context Protocol adoption grows, securing tool invocations via meaningful user consent has become a critical challenge: existing methods, whether broad "always allow" toggles or opaque LLM-based decisions, fail to account for dangerous call arguments and often lead to consent fatigue. In this work, we present Conleash, a client-side middleware that enforces boundary-scoped authorization using three components: a risk lattice that auto-permits safe calls within known boundaries while escalating risky ones, a policy engine for user-defined invariants, and a refinement loop that converts user decisions into reusable rules. Evaluated on 984 real-world traces, Conleash achieved 98.2% accuracy, caught 99.4% of escalations, and added only 8.2 ms of overhead for policy verification. In a user study (N=16), participants significantly preferred Conleash's scoped permissions over traditional methods, citing higher trust and reduced prompting.

2605.11350 2026-05-13 cs.GT cs.AI econ.TH Version update

Human-AI Productivity Paradoxes: Modeling the Interplay of Skill, Effort, and AI Assistance

Ali Aouad, Thodoris Lykouris, Huiying Zhong

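Affiliations * Massachusetts Institute of Technology

AI summary This paper studies the complex mechanisms through which generative AI tools affect productivity, against the backdrop of their wide adoption in workplaces and education. The authors build a model of human-AI interaction and analyze the interplay of skill level, effort, and AI assistance, finding that AI unreliability or endogenous skill development can produce a productivity paradox in which more AI assistance lowers productivity. The study also examines AI's long-term effect on the skill distribution and shows that skill polarization can emerge in steady state when AI literacy is heterogeneous.

Details
English abstract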

Generative Artificial Intelligence (AI) tools are rapidly adopted in the workplace and in education, yet the empirical evidence on AI's impact remains mixed. We propose a model of human-AI interaction to better understand and analyze several mechanisms by which AI affects productivity. In our setup, human agents with varying skill levels exert utility-maximizing effort to produce certain task outcomes with AI assistance. We find that incorporating either endogeneity in skill development or in AI unreliability can induce a productivity paradox: increased levels of AI assistance may degrade productivity, leading to potentially significant shortfalls. Moreover, we examine the long-term distributional effect of AI on skill, and demonstrate that skill polarization can emerge in steady state when accounting for heterogeneity in AI literacy -- the agent's capability to identify and adapt to inaccurate AI outputs. Our results elucidate several mechanisms that may explain the emergence of human-AI productivity paradoxes and skill polarization, and identify simple measures that characterize when they arise.

2605.11348 2026-05-13 cs.CL cs.AI cs.IR cs.SI Version update

Large Language Models for Causal Relations Extraction in Social Media: A Validation Framework for Disaster Intelligence

Ujun Jeong, Saketh Vishnubhatla, Bohan Jiang, Andre Harrison, Adrienne Raglin, Huan Liu

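Affiliations * Arizona State University; DEVCOM Army Research Laboratory

AI summary This paper studies how large language models (LLMs) can extract causal relations from social media during disasters to strengthen situational awareness. To validate the LLMs, the authors propose an expert-grounded evaluation framework that assesses accuracy by comparing model-generated causal graphs with reference graphs derived from disaster reports. The study finds that LLMs show promise for causal relation extraction but also carry the risk of relying on model priors rather than post-event evidence.

Comments Submitted to EMNLP

Details
English abstract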

During disasters, extracting causal relations from social media can strengthen situational awareness by identifying factors linked to casualties, physical damage, infrastructure disruption, and cascading impacts. However, disaster-related posts are often informal, fragmented, and context-dependent, and they may describe personal experiences rather than explicit causal relations. In this work, we examine whether Large Language Models (LLMs) can effectively extract causal relations from disaster-related social media posts. To this end, we (1) propose an expert-grounded evaluation framework that compares LLM-generated causal graphs with reference graphs derived from disaster-specific reports and (2) assess whether the extracted relations are supported by post-event evidence or instead reflect model priors. Our findings highlight both the potential and risks of using LLMs for causal relation extraction in disaster decision-support systems.

2605.11346 2026-05-13 cs.LG cs.AI cs.CE Version update

Physics-Informed Teacher-Student Ensemble Learning for Traffic State Estimation with a Varying Speed Limit Scenario

Archie J. Huang, Dongdong Wang, Shaurya Agarwal, Mohamed Abdel-Aty, Md Mahmudul Islam, Muhammad Shahbaz

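Affiliations * Department of Building, Civil and Environmental Engineering, Concordia University; Urban Artificial Intelligence Laboratory, University of Florida; Department of Civil, Environmental and Construction Engineering, University of Central Florida

AI summary This paper studies traffic state estimation under varying speed limits and proposes a new framework that combines physics-informed deep learning with teacher-student ensemble training. The flow conservation law is encoded in the teacher models, while the student model uses a multi-layer perceptron classifier to identify traffic characteristics and select the appropriate teacher model for estimation, which handles the heterogeneity in traffic characteristics caused by changing speed limits. Experimental results show that the method outperforms other mainstream baselines on traffic state estimation.

Comments The IEEE International Conference on Intelligent Transportation Systems (ITSC) 2026

Details
English abstract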

Physics-informed deep learning (PIDL) neural networks have proven useful to transportation practitioners for traffic state estimation (TSE) because they exploit the underlying relationship between the state variables. Another efficient traffic management approach is implementing varying speed limits (VSLs) on transportation corridors to control traffic and mitigate congestion. However, the existing PIDL training architectures in the literature cannot accommodate the changing traffic characteristics on a freeway with VSL. To tackle this challenge, we propose a novel framework integrating teacher-student ensemble training with PIDL neural networks for TSE under VSL scenarios. The flow conservation law is encoded locally in the teacher models by PIDL, and the student model uses a multi-layer perceptron (MLP) classifier to identify traffic characteristics and select the ensemble member of PIDL neural networks for TSE. This integrated framework provides a natural solution for capturing the heterogeneity of VSL and accurately addressing the TSE problem. The case study results validate the proposed ensemble approach, demonstrating its superior performance in TSE compared to other popular baseline methods, as indicated by relative L2 error.
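For readers unfamiliar with the physics term, the flow conservation law usually meant in traffic state estimation is the first-order continuity equation. The form below is the standard textbook statement and is not taken from the paper; the symbols \rho (density), q (flow), v (speed), the weight \lambda, the collocation points (x_i, t_i), and the composite loss are assumptions used for illustration.

```latex
\frac{\partial \rho(x,t)}{\partial t} + \frac{\partial q(x,t)}{\partial x} = 0,
\qquad q(x,t) = \rho(x,t)\, v(x,t)

\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda\, \mathcal{L}_{\text{phys}},
\qquad
\mathcal{L}_{\text{phys}} = \frac{1}{N_c} \sum_{i=1}^{N_c}
\left( \frac{\partial \hat{\rho}}{\partial t} + \frac{\partial \hat{q}}{\partial x} \right)^{2} \Bigg|_{(x_i, t_i)}
```

A physics-informed network of this kind penalizes the conservation residual of its own predictions at the collocation points. Under a varying speed limit the speed-density relation changes, which is presumably why a separate teacher is trained for each regime.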

2605.11341 2026-05-13 cs.AI Version update

CPEMH: An Agentic Framework for Prompt-Driven Behavior Evaluation and Assurance in Foundation-Model Systems for Mental Health Screening

Giuliano Lorenzoni, Ivens Portugal, Paulo Alencar, Donald Cowan

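Affiliations * University of Waterloo

AI summary This paper proposes CPEMH, an agentic framework for evaluating and assuring the prompt-driven behavior of large language models in mental-health screening. By orchestrating the design, evaluation, and selection of prompt strategies, the framework enables systematic control of model behavior across contexts, and its modular structure ensures traceability and stability. A depression-screening case study demonstrates the framework's ability to stabilize and audit model behavior in clinical conversational settings, and highlights the importance of modular orchestration, prioritizing stability, and treating F1, bias, and robustness as core evaluation criteria.

Comments 4 pages, 2 figures. Accepted at the AGENT 2026 Workshop (ICSE 2026)

Details
English abstract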

This paper presents CPEMH, an agentic framework designed to evaluate prompt-driven behavior in foundation-model systems operating on transcript-based datasets for mental-health screening. CPEMH serves as an engineering methodology for behavioral assurance in large-scale language systems, introducing an orchestrated architecture that autonomously performs the design, evaluation, and selection of prompt strategies, enabling systematic control of behavioral variability across contexts. Its modular agentic design, combining orchestrator, inference, and evaluation agents, ensures traceability, reproducibility, and robustness throughout the prompting lifecycle. A case study on automated depression screening from interview transcripts demonstrates the framework's capacity to stabilize and audit foundation-model behavior in conversational and clinically sensitive domains. Lessons learned emphasize the role of modular orchestration in behavioral assurance, the prioritization of stability over architectural complexity, and the integration of F1, bias, and robustness as core acceptance criteria.

2605.11330 2026-05-13 cs.AI Version update

Rethinking Evaluation for LLM Hallucination Detection: A Desiderata, A New RAG-based Benchmark, New Insights

Wenbo Chen, Veena Padmanabhan, Tootiya Giyahchi, Elaine Wong, Leman Akoglu

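Affiliations * Amazon; Carnegie Mellon University

AI summary This paper rethinks how LLM hallucination detection is evaluated. It proposes a list of desired properties for building effective hallucination detection benchmarks (HDBs) and points out that existing benchmarks fall clearly short on long-context RAG (retrieval-augmented generation) benchmarks and on support for realistic label noise. To address this, the authors build and open-source TRIVIA+, a new RAG-based hallucination detection benchmark that contains the longest-context samples to date and introduces several sets of noisy labels to simulate realistic conditions. Experiments show that existing detectors still have substantial room for improvement on RAG tasks and that label noise significantly affects detection performance.

Comments ACL 2026 main conference

Details
English abstract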

Hallucination, broadly referring to unfaithful, fabricated, or inconsistent content generated by LLMs, has wide-ranging implications. Therefore, a large body of effort has been devoted to detecting LLM hallucinations, as well as designing benchmark datasets for evaluating these detectors. In this work, we first establish a set of desiderata that hallucination detection benchmarks (HDBs) should satisfy for effective evaluation. A critical look at existing HDBs through the lens of our desiderata reveals that none of them exhibits all the properties. We identify the two largest gaps: (1) RAG-based grounded benchmarks with long context are severely lacking (partly because length impedes human annotation); and (2) existing benchmarks do not make available realistic label noise for stress-testing detectors, although real-world use cases often grapple with label noise due to human or automated/weak annotation. To close these gaps, we build and open-source a new RAG-based HDB called TRIVIA+ that underwent a rigorous human annotation process. Notably, our benchmark exhibits all desirable properties: (1) TRIVIA+ contains samples with the longest context in the literature; and (2) we design and share four sets of noisy labels with different noise schemes, both sample-dependent and sample-independent. Finally, we perform experiments on RAG-based HDBs, including our TRIVIA+, using popular SOTA detectors, which reveals new insights: (i) ample room remains for current detectors to reach the performance ceiling on RAG-based HDBs, (ii) the basic LLM-as-a-Judge baseline performs competitively, and (iii) label noise hinders detection performance. We expect that our findings, along with our proposed benchmark, will motivate and foster needed research on hallucination detection for RAG-based tasks.

2605.11328 2026-05-13 cs.LG cs.AI Version update

Epistemic Uncertainty for Test-Time Discovery

Kainat Riaz, Muhammad Ahmed Mohsin, Ahsan Bilal, Muhammad Umer, Ayesha Mohsin, Aqib Riaz, Ali Subhan, John M. Cioffi

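Affiliations * Stanford University; National University of Sciences and Technology; University of Oklahoma

AI summary This work examines how large language models can be used for scientific discovery at test time. It observes that standard reinforcement learning penalizes high-variance mutations and therefore favors familiar patterns, so the maximum reward stops improving. The authors propose an exploration strategy based on an epistemic uncertainty measure: by maintaining a small ensemble of adapters over a frozen base model, it identifies regions that are hard because of insufficient training coverage rather than intrinsic problem difficulty, and steers the policy toward regions where discovery is possible. Experiments show that the method raises the maximum reward on several scientific discovery tasks and maintains higher solution diversity.

Details
English abstract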

Automated scientific discovery using large language models relies on identifying genuinely novel solutions. Standard reinforcement learning penalizes high-variance mutations, which leads the policy to prioritize familiar patterns. As a result, the maximum reward plateaus even as the average reward increases. Overcoming this limitation requires a signal that distinguishes unexplored regions from intrinsically difficult problems. This necessitates measuring disagreement across independently adapted weight hypotheses rather than relying on a single network's confidence. UG-TTT addresses this challenge by maintaining a small ensemble of low-rank adapters over a frozen base model. The per-token disagreement, quantified as the mutual information between ensemble predictions and weight hypotheses, isolates epistemic uncertainty and identifies positions where insufficient coverage leads to adapter divergence rather than intrinsic problem difficulty. This measure is incorporated as an exploration bonus into the policy gradient, directing the policy toward positions where persistent adapter disagreement signals low training coverage, the same frontier where genuine discovery is possible. A nuclear norm regularizer ensures the adapters remain distinct from one another, thereby preserving the exploration signal throughout training. Across four scientific discovery benchmarks, UG-TTT increases the maximum reward on three tasks and maintains substantially higher solution diversity; an ablation study confirms that the regularizer is essential for sustaining this behavior.
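The per-token disagreement described above can be sketched with the standard ensemble estimate of mutual information: the entropy of the averaged prediction minus the average entropy of each member. This is a generic illustration under the assumption that each adapter yields a next-token probability distribution; the function name, array layout, and toy inputs are not from the paper.

```python
import numpy as np

def epistemic_mutual_information(probs, eps=1e-12):
    """Per-token mutual information between predictions and ensemble members.

    probs: array of shape (M, T, V) holding next-token distributions for
           M ensemble members, T token positions, and a vocabulary of size V.
    Returns an array of shape (T,) in nats.
    """
    probs = np.asarray(probs, dtype=float)
    mean_probs = probs.mean(axis=0)
    # Total uncertainty: entropy of the ensemble-averaged distribution.
    total = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    # Aleatoric part: average entropy of the individual members.
    expected = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    return total - expected

# Two members that agree (both uncertain) versus two that confidently disagree.
agree = np.array([[[0.5, 0.5]], [[0.5, 0.5]]])
disagree = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])
mi_agree = epistemic_mutual_information(agree)
mi_disagree = epistemic_mutual_information(disagree)
```

When members agree, the score is near zero even though each prediction is uncertain, which is the sense in which the measure separates low coverage from intrinsic difficulty. When members confidently disagree, the score approaches log 2 for two members.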

2605.11317 2026-05-13 cs.CL cs.AI Version update

SOMA: Efficient Multi-turn LLM Serving via Small Language Model

Xueqi Cheng, Qiong Wu, Zhengyi Zhou, Xugui Zhou, Tyler Derr, Yushun Dong

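Affiliations * Florida State University; AT&T Chief Data Office; Louisiana State University; Vanderbilt University

AI summary In multi-turn dialogue, deploying large language models (LLMs) incurs high latency, memory, and API costs. This paper proposes SOMA, a framework that uses the early turns of a session to estimate a local response manifold and then lets a small language model act as a surrogate for the rest of the conversation, improving serving efficiency while preserving response quality. The method combines soft-prompt learning, anti-degeneration control, and localized LoRA fine-tuning so that the surrogate runs efficiently without prompts at inference, and it is supported by theoretical analysis and experiments.

Details
English abstract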

Large Language Models (LLMs) are increasingly deployed in multi-turn dialogue settings where preserving conversational context across turns is essential. A standard serving practice concatenates the full dialogue history at every turn, which reliably maintains coherence but incurs substantial cost in latency, memory, and API expenditure, especially when queries are routed to large proprietary models. Existing approaches often struggle to balance response quality against efficiency. We propose SOMA, a framework that exploits the early turns of a session to estimate a local response manifold and then adapts a smaller surrogate model to this local region for the remainder of the conversation. Concretely, we learn soft prompts that maximize semantic divergence between the large and surrogate small language models' responses to surface least-aligned local directions, stabilize training with anti-degeneration control, and distill the mined cases into localized LoRA fine-tuning so the surrogate runs without prompts at inference. A simple gate enables a one-time switch with rollback on drift. We further provide a theoretical analysis for key components in SOMA. Extensive experiments show the effectiveness of SOMA. The source code is provided at: https://github.com/LabRAI/SOMA.

2605.11315 2026-05-13 cs.SE cs.AI cs.CR Version update

Natural Language based Specification and Verification

Zhaorui Li, Chengyu Song

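Affiliations * University of California Riverside

AI summary This paper studies how large language models (LLMs) can generate system specifications in natural language and verify implementations compositionally, in order to prevent vulnerable code from being generated. Unlike traditional formal verification, which relies on rigid formal languages, the approach expresses specifications directly in natural language, simplifying the verification pipeline. Preliminary experiments suggest the approach is promising for specification generation and verification.

Details
English abstract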

Recent frontier large language models (LLMs) have shown strong performance in identifying security vulnerabilities in large, mature open-source systems. As LLM-generated code becomes increasingly common, a natural goal is to prevent such models from producing vulnerable implementations in the first place. Formal verification offers a principled route to this objective, but existing verification pipelines typically require specifications written in rigid formal languages. Prior work has explored using LLMs to synthesize such specifications, with limited success. In this paper, we investigate a different approach: using LLMs both to generate specifications and to verify implementations compositionally when the specifications are expressed in natural language. Our preliminary results suggest that this approach is promising.

2605.11312 2026-05-13 cs.AI Version update

Constraint-Data-Value-Maximization: Utilizing Data Attribution for Effective Data Pruning in Low-Data Environments

Danilo Brajovic, David A. Kreplin, Marco F. Huber

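Affiliations * Fraunhofer IPA; Institute of Industrial Manufacturing and Engineering IFF; University of Stuttgart; Hochschule Heilbronn

AI summary This paper studies how to prune data effectively when data are limited and proposes Constraint-Data-Value-Maximization (CDVM), a method based on data attribution. It models pruning as a constrained optimization problem that maximizes total data influence while limiting the contribution to any single test sample, so that model performance is maintained even when only a small amount of data is retained. Experiments show that CDVM performs well on the OpenDataVal benchmark with competitive runtime.

Comments Accepted for publication at IJCAI 2026

Details
English abstract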

Attributing model behavior to training data is an evolving research field. A common benchmark is data removal, which involves eliminating data instances with either low or high values, then assessing a model's performance trained on the modified dataset. Many existing studies leverage Shapley-based data values for this task. In this paper, we demonstrate that these data values are not optimally suited for pruning low-value data when only a limited amount of data remains. To address this limitation, we introduce the Constraint-Data-Value-Maximization (CDVM) approach, which effectively utilizes data attributions for pruning in low-data scenarios. By casting pruning as a constrained optimization that both maximizes total influence and penalizes excessive per-test contributions, CDVM delivers robust performance when only a small fraction of the data is retained. On the OpenDataVal benchmark, CDVM shows strong performance and competitive runtime.

2605.11301 2026-05-13 cs.AI cs.CL cs.CV Version update

LatentRouter: Can We Choose the Right Multimodal Model Before Seeing Its Answer?

Xueqi Cheng, Yushun Dong

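Affiliations * Department of Computer Science

AI summary This paper proposes LatentRouter, a routing method for multimodal models that selects the most suitable multimodal large language model based on the characteristics of the image-question input. The method builds multimodal routing capsules and model capability tokens, uses communication between latent states to predict each candidate model's performance, and improves prediction accuracy with a distributional outcome head and a bounded capsule correction. Experiments show that LatentRouter outperforms existing methods on several benchmarks, especially on tasks requiring visual, layout, or reasoning abilities.

Details
English abstract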

Multimodal large language models (MLLMs) have heterogeneous strengths across OCR, chart understanding, spatial reasoning, visual question answering, cost, and latency. Effective MLLM routing therefore requires more than estimating query difficulty: a router must match the multimodal requirements of the current image-question input with the capabilities of each candidate model. We propose LatentRouter, a router that formulates MLLM routing as counterfactual multimodal utility prediction. Given an image-question query, LatentRouter extracts learned multimodal routing capsules, represents each candidate MLLM with a model capability token, and performs latent communication between these states to estimate how each model would perform if selected. A distributional outcome head predicts model-specific counterfactual quality, while a bounded capsule correction refines close decisions without allowing residual signals to dominate the prediction. The resulting utility-based policy supports performance-oriented and performance-cost routing, and handles changing candidate pools through shared per-model scoring with availability masking. Experiments on MMR-Bench and VL-RouterBench show that LatentRouter outperforms fixed-model, feature-level, and learned-router baselines. Additional analyses show that the gains are strongest on multimodal task groups where model choice depends on visual, layout-sensitive, or reasoning-oriented requirements, and that latent communication is the main contributor to the improvement. The code is available at: https://github.com/LabRAI/LatentRouter.

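2605.11290 2026-05-13 cs.CL cs.AI version update

ReAD: Reinforcement-Guided Capability Distillation for Large Language Models

Xueqi Cheng, Xugui Zhou, Tyler Derr, Yushun Dong

Affiliations * Florida State University; Louisiana State University; Vanderbilt University

AI summary This paper proposes ReAD, a reinforcement-guided capability distillation framework that aims to compress large language models more effectively under a fixed token budget while preserving the capabilities critical to downstream tasks. It identifies task-essential capabilities, generates targeted supervision on the fly, and uses an uncertainty-aware contextual bandit to optimize budget allocation, improving task performance while reducing negative interference between capabilities and wasted resources. Experiments show that ReAD outperforms existing methods under the same budget, with greater practicality and efficiency.

Details
English abstract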

Capability distillation applies knowledge distillation to selected model capabilities, aiming to compress a large language model (LLM) into a smaller one while preserving the abilities needed for a downstream task. However, most existing methods treat capabilities as independent training targets and overlook how improving one capability can reshape the student's broader capability profile, especially when multiple abilities jointly determine task success. We study capability distillation under a fixed token budget and identify two consistent patterns: distillation induces systematic, budget-dependent cross-capability transfer, and additional budget often brings limited task-relevant gains while sometimes degrading other useful abilities. Building on these insights, we propose ReAD, a Reinforcement-guided cApability Distillation framework that explicitly accounts for capability interdependence. ReAD first infers task-essential capabilities, then generates capability-targeted supervision on the fly, and finally uses an uncertainty-aware contextual bandit to adaptively allocate the distillation budget based on expected utility gains. Extensive experiments show that ReAD improves downstream utility under the same token budget while reducing harmful spillover and wasted distillation effort compared to strong baselines. Our code is publicly available at https://github.com/LabRAI/ReAD.

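2605.11284 2026-05-13 stat.ME cs.AI cs.LG version update

Rethinking external validation for the target population: Capturing patient-level similarity with a generative model

Mohammad Azizmalayeri, Ameen Abu-Hanna, Saskia Houterman, Marije M. Vis, Giovanni Cinà

Affiliations * Amsterdam UMC location University of Amsterdam, Dept of Medical Informatics; Amsterdam Public Health Research Institute; Netherlands Heart Registration (NHR); Amsterdam UMC, Dept of Cardiology; Institute for Logic, Language and Computation, University of Amsterdam

AI summary This study addresses the difficulty of interpreting model performance in external validation when the target population differs from the development population. It proposes a generative-model-based framework that quantifies each external patient's similarity to the development data and evaluates model performance in subgroups of differing similarity. Using generative models such as autoencoders, the approach allows more flexible similarity estimation without sharing the original development data, improving the interpretability and practicality of external validation. Experiments show that the framework reveals performance differences hidden by conventional external validation, giving a more principled basis for assessing model transportability.

Details
English abstract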

Background: External validation is essential for assessing the transportability of predictive models. However, its interpretation is often confounded by differences between external and development populations. This study introduces a framework to distinguish model deficiencies from case-mix effects. Method: We propose a framework that quantifies each external patient's similarity to the development data and measures performance in subgroups with varying levels of alignment to the development distribution. We use generative models, specifically autoencoders, to estimate similarity, offering a more flexible alternative to traditional linear approaches and enabling validation without sharing the original development data. The utility of the autoencoder-based similarity measure is demonstrated using synthetic data, and the framework's application is illustrated using data from the Netherlands Heart Registration (NHR) to predict mortality after transcatheter aortic valve implantation. Results: Our framework revealed substantial variation in model performance across similarity-defined subgroups, differences that remain hidden under conventional external validation yet can meaningfully alter conclusions. In several settings, conventional external validation suggested poor overall performance. However, after accounting for differences in patient characteristics, for some subgroups, the model performance was consistent with internal validation results. Conversely, apparently acceptable overall performance could mask clinically relevant performance deficits in specific subgroups. Conclusion: The proposed framework enhances the interpretability of external validation by linking model performance to population alignment with the development data. This provides a more principled basis for deciding whether a model is transportable and to which patients it can be safely applied.

2605.11280 2026-05-13 gr-qc astro-ph.HE cs.AI version update

Discovery of Interpretable Surrogates via Agentic AI: Application to Gravitational Waves

Tousif Islam, Digvijay Wadekar, Tejaswi Venumadhav, Matias Zaldarriaga, Ajit Kumar Mehta, Javier Roulet, Barak Zackay

Affiliations * Kavli Institute for Theoretical Physics, University of California Santa Barbara; Center for Gravitational Physics, University of Texas at Austin; Department of Physics, University of California at Santa Barbara; International Centre for Theoretical Sciences, Tata Institute of Fundamental Research; School of Natural Sciences, Institute for Advanced Study; Chennai Mathematical Institute; Kavli Institute for Cosmological Physics, The University of Chicago; Department of Particle Physics & Astrophysics, Weizmann Institute of Science

AI summary This study presents GWAgent, a large language model-based agentic workflow that builds interpretable analytic surrogate models directly from simulation data to replace time-consuming numerical simulations. By introducing a physics-informed ansatz, the method achieves high accuracy and a significant speedup in gravitational waveform modeling and reveals compact physical structure in the waveforms. The study demonstrates the method by analyzing the orbital eccentricity of the real gravitational wave event GW200129, obtaining results better than conventional methods.

Comments 25 pages, 9 figures, codes available at https://github.com/tousifislam/GWAgent

Details
English abstract

Fast surrogate models for expensive simulations are now essential across the sciences, yet they typically operate as black boxes. We present GWAgent, a large language model (LLM)-based workflow that constructs interpretable analytic surrogates directly from simulation data. Surrogate modeling is well suited to agentic workflows because candidate models can be quantitatively validated against ground-truth simulations at each iteration. As a demonstration, we build a surrogate for gravitational waveforms from eccentric binary black hole mergers. We show that providing the agent with a physics-informed domain ansatz substantially improves output model accuracy. The resulting analytic surrogate attains a median Advanced LIGO mismatch of $6.9\times10^{-4}$ together with an $\sim 8.4\times$ speedup in waveform evaluation, surpassing both symbolic regression and conventional machine learning baselines. Beyond producing an accurate model, the workflow identifies compact physical structure from the learned representation. As an astrophysical application, we use GWAgent to analyze the eccentricity of GW200129 and infer $e_{20\mathrm{Hz}}=0.099^{+0.063}_{-0.044}$. These results show that validation-constrained agentic workflows can produce accurate, fast, and interpretable surrogates for scientific simulations and inference.

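2605.11276 2026-05-13 cs.CV cs.AI version update

Generative AI for Visualizing Highway Construction Hazards Through Synthetic Images and Temporal Sequences

Trevor Neece, Mason Smetana, Lev Khazanovich

Affiliations * University of Pittsburgh

AI summary This study proposes a generative AI method that produces synthetic images and temporal sequences of highway construction hazard scenarios from OSHA Severe Injury Reports to support safety training. It develops two generation modes, single-image generation and four-stage temporal sequence generation, and evaluates the educational value, fidelity, and alignment of the generated images through CLIP-based semantic retrieval and expert assessment. The method provides visual material for safety training without photographing real accident scenes and offers a new evaluation framework for synthetic image generation across domains.

Details
English abstract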

Highway construction workers face a high risk of serious injury or death. Image-based training materials depicting hazardous scenarios are essential for engaging safety instruction but remain scarce due to ethical and logistical barriers. This study develops and evaluates a generative AI methodology for producing synthetic visualizations of highway construction hazards from OSHA Severe Injury Report narratives. Two modes were developed: a single-pass approach yielding one image per incident, and a temporal approach producing a four-stage sequence. A sample of 75 incident records yielded 750 images, evaluated using CLIP-based semantic retrieval and expert assessment across dimensions such as educational utility, fidelity, and alignment. Single-pass images achieved 81.1% educational acceptability with fidelity and alignment scores of 4.14/5 and 4.07/5, respectively, while temporal sequences achieved 60.9% acceptability with comparable alignment (3.94/5) but lower fidelity (3.51/5). CLIP-based retrieval revealed that both modes produce images with statistically significant retrieval capabilities. This is among the first studies to leverage modern autoregressive image generation models for visualizing construction hazards from reported severe injuries and to generate temporally sequenced hazard imagery, and a new multi-dimensional evaluation framework was developed to support future research in this domain. The work enables safety trainers to pair narrative storytelling with visual learning material without photographing real-world hazards, and the framework could be applied to datasets across diverse domains, enabling synthetic image generation tailored to new application areas.

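2605.11272 2026-05-13 cs.LG cs.AI cs.IR version update

Localization Boosting for Growth Markets: Mitigating Cross-Locale Behavioral Bias in Learning-to-Rank

Suryaa Veerabathiran Seran, Ashwin Naresh Kumar, Tracy Holloway King, Jing Zheng

Affiliations * Adobe

AI summary This paper studies how to mitigate cross-locale behavioral bias in learning-to-rank (LTR) models during international expansion. The authors show that models trained only on click data neglect semantic localization features, leading to uneven exposure of content in non-US locales. They propose a multi-objective framework that combines behavioral feedback, vision-language model relevance signals, and locale-aware boosting, which improves relevance and local content visibility across multiple locales.

Details
English abstract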

Adobe Express is expanding internationally, but the US has a disproportionately large content supply and interaction volume. Learning-to-rank (LTR) models trained primarily on behavioral feedback inherit this imbalance: templates popular in the US are over-served in non-US locales. This cross-locale exposure bias suppresses local content discoverability and degrades ranking quality in growth locales. We show that click-only training suppresses semantically informative localization features. Adding vision-language model (VLM) graded relevance labels as auxiliary supervision alongside clicks improves semantic alignment but does not preserve local content visibility. We propose a multi-objective framework combining behavioral supervision, VLM-derived relevance signals, and locale-aware boosting. Across five locales, the resulting model improves relevance while restoring stable localization, demonstrating the importance of disentangling exposure from semantic supervision.

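2605.11269 2026-05-13 gr-qc astro-ph.HE astro-ph.IM cs.AI version update

gwBenchmarks: Stress-Testing LLM Agents on High-Precision Gravitational Wave Astronomy

Tousif Islam, Digvijay Wadekar, Zihan Zhou

Affiliations * Kavli Institute for Theoretical Physics, University of California Santa Barbara, Kohn Hall, Lagoon Rd, Santa Barbara, CA 93106; Center for Gravitational Physics, University of Texas at Austin, Austin, TX 78712, USA; Department of Physics, Princeton University, Princeton, NJ 08540, USA

AI summary This study presents gwBenchmarks, a suite of benchmark tasks for evaluating large language model (LLM) agents on high-precision gravitational wave astronomy modeling. The tasks cover interpolation, regression, and high-dimensional time-series modeling, involve numerical methods, machine learning, and physics-informed approaches, and represent a large investment of compute. Experiments show that current LLM agents commonly make systematic errors on these tasks and fall short of the strict accuracy requirements of gravitational wave research, indicating that AI agents still face major challenges in scientific modeling.

Comments 26 pages, 4 figures

Details
English abstract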

Modern gravitational wave astronomy relies on modeling tasks that often require months of graduate-level effort, including building fast waveform surrogates from expensive numerical relativity simulations, modeling orbital dynamics of black holes, fitting merger remnant properties and constructing template banks. These problems demand extreme precision to support detection and parameter inference, with state-of-the-art models achieving $\lesssim 10^{-4}$ relative error. We study whether state-of-the-art LLM coding agents can perform such end-to-end scientific modeling, where success requires constructing models with stringent accuracy criteria and reasoning about physical systems. We introduce gwBenchmarks, a suite of eight tasks grounded in gravitational wave analytic calculations and numerical simulations collectively representing over $10^8$ core-hours of compute. The tasks span interpolation, regression, and high-dimensional time-series modeling, requiring a combination of numerical methods, machine learning, and physics-informed approaches. In preliminary experiments, agents frequently relied on proxy metrics, partial evaluation, or fabricated results to spuriously complete tasks. We therefore implement an external pre-defined framework to gauge agent progress. Evaluating twelve coding agents, we find no consistent winner. On the easiest task, multiple agents converge to the same cubic spline solution, with one rediscovering a coordinate transformation widely used in the literature. On harder tasks like analytic waveform modeling, all agents fall 1-2 orders of magnitude short of domain requirements and exhibit systematic failures, including metric misuse, constraint violations, and result fabrication. Our code, data, and website are publicly available.

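2605.11265 2026-05-13 cs.CV cs.AI cs.LG version update

DenseTRF: Texture-Aware Unsupervised Representation Adaptation for Surgical Scene Dense Prediction

Guiqiu Liao, Matjaž Jogan, Daniel A. Hashimoto

Affiliations * GRASP Laboratory, University of Pennsylvania; PCASO Laboratory, Department of Surgery, University of Pennsylvania; Department of Computer and Information Science, University of Pennsylvania

AI summary This paper proposes DenseTRF, a self-supervised representation adaptation framework that addresses the performance drop caused by distribution shift when dense prediction tasks in surgical scenes (such as segmentation and surgical zone prediction) are deployed across domains. Built on texture-aware attention, the method learns representations that capture invariant visual structures and adapts them to the target distribution without supervision, significantly improving robustness to domain shift. Experiments show that DenseTRF outperforms state-of-the-art segmentation models and cross-domain adaptation methods across multiple surgical settings.

Comments Accepted to 29th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2026)

Details
English abstract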

Dense prediction tasks in surgical computer vision, such as segmentation and surgical zone prediction, can provide valuable guidance for laparoscopic and robotic surgery. However, these models often suffer from distribution shifts, as training datasets rarely cover the variability encountered during deployment, leading to poor generalization. We propose DenseTRF, a self-supervised representation adaptation framework based on texture-centric attention. Our method leverages slot attention to learn texture-aware representations that capture invariant visual structures. By adapting these representations to the target distribution without supervision, DenseTRF significantly improves robustness to domain shifts. The framework is implemented through conditioning dense prediction on slot attention and model merging strategies. Experiments across multiple surgical procedures demonstrate improved cross-distribution generalization in comparison to state-of-the-art segmentation models and test-distribution adaptation methods for dense prediction tasks.

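2605.11260 2026-05-13 cs.LG cs.AI version update

Curriculum Learning-Guided Progressive Distillation in Large Language Models

Jincheng Cao, Fanzhi Zeng, Leqi Liu, Aryan Mokhtari

Affiliations * The University of Texas at Austin; Google Research

AI summary Knowledge distillation is a key technique for transferring the capabilities of large language models into small student models, but existing methods often overlook the learning order of training data and the capacity mismatch between teacher and student. This paper proposes Curriculum Learning-Guided Progressive Distillation (CLPD), which aligns data difficulty with teacher strength and builds both an explicit and an implicit curriculum, effectively improving distillation. Experiments show that CLPD outperforms standard distillation and single-factor strategies on multiple reasoning benchmarks, highlighting the importance of jointly considering data ordering and teacher capacity.

Details
English abstract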

Knowledge distillation is a key technique for transferring the capabilities of large language models (LLMs) into smaller, more efficient student models. Existing distillation approaches often overlook two critical factors: the learning order of training data and the capacity mismatch between teacher and student models. This oversight limits distillation performance, as manifested by the counter-intuitive phenomenon where stronger teachers fail to produce better students. In this work, we propose Curriculum Learning-Guided Progressive Distillation (CLPD), a unified framework that explicitly accounts for both factors by aligning data difficulty with teacher strength. CLPD constructs an explicit curriculum by organizing training examples from easy to hard, while simultaneously applying an implicit curriculum over supervision signals by progressively scheduling teachers of increasing capacity. Our framework is modular and can be integrated into standard distillation algorithms with minimal overhead. Empirical results on the reasoning benchmarks demonstrate that CLPD consistently outperforms standard distillation, data ordering alone, and teacher scheduling alone across multiple settings. These findings highlight the importance of jointly considering data ordering and teacher capacity when distilling reasoning abilities into small language models.
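The pairing of an easy-to-hard data curriculum with a schedule of increasingly capable teachers can be sketched as below. This is a hedged illustration only: the function `build_schedule`, the per-example difficulty scores, the equal-sized stages, and the teacher names are all assumptions for the example; the abstract does not say how difficulty is measured or how stages are sized.

```python
from typing import List, Tuple


def build_schedule(
    examples: List[Tuple[str, float]],
    teachers: List[str],
) -> List[Tuple[str, List[str]]]:
    """Assign difficulty-sorted examples to teachers of increasing capacity.

    examples: (example_id, difficulty) pairs; higher means harder (assumed score).
    teachers: teacher names ordered from weakest to strongest (assumed ordering).
    Returns one (teacher, example_ids) stage per teacher, easy stages first.
    """
    ordered = sorted(examples, key=lambda item: item[1])  # explicit curriculum: easy to hard
    n_stages = len(teachers)
    stages: List[Tuple[str, List[str]]] = []
    for stage, teacher in enumerate(teachers):
        # Equal-sized contiguous slices; an assumption, not the paper's rule.
        start = stage * len(ordered) // n_stages
        end = (stage + 1) * len(ordered) // n_stages
        stages.append((teacher, [example_id for example_id, _ in ordered[start:end]]))
    return stages


if __name__ == "__main__":
    data = [("q3", 0.9), ("q1", 0.1), ("q4", 0.7), ("q2", 0.3)]  # hypothetical difficulties
    for teacher, batch in build_schedule(data, ["small_teacher", "large_teacher"]):
        print(teacher, batch)
```

A standard distillation loop could then iterate over the stages in order, using each stage's teacher to supervise its slice of the data.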

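2605.11259 2026-05-13 cs.AI version update

Template-as-Ontology: Configurable Synthetic Data Infrastructure for Cross-Domain Manufacturing AI Validation

Grama Chethan

Affiliations * Siemens Digital Industries Software

AI summary This paper proposes Template-as-Ontology, a configurable synthetic data infrastructure for validating AI systems across manufacturing domains. A single Python configuration module defines both the structure of the manufacturing simulator and the runtime data schema of the AI analytics tools, ensuring consistent data structures. Experiments show that the framework generates high-quality synthetic data conforming to MES standards and effectively reduces the error rate of AI tools when generating parameters, providing a reusable data foundation for discrete manufacturing AI validation.

Comments 18 pages, 1 figure

Details
English abstract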

Large language model (LLM)-based AI agents deployed in manufacturing environments require populated, schema-correct data for validation, yet production MES data is proprietary, privacy-encumbered, and vendor-specific. This paper introduces the Template-as-Ontology principle: a single Python configuration module (700-770 lines, 45 validated exports) serves simultaneously as the specification for a time-stepped manufacturing simulator and as the runtime domain schema for AI analytics tools, producing alignment by construction rather than integration. We formally define the domain template as a typed relational configuration schema and prove that structural alignment between simulation and tool layers is guaranteed by single-source consumption. A five-layer pipeline--simulation, PostgreSQL, CDC/Iceberg lakehouse, star schema, and 12 parameterized AI tools--generates causally coherent, MES-shaped data spanning 66 entity types across four operational domains mapped to ISA-95/IEC 62264. We validate the architecture with six industry templates (aerospace, pharma, automotive, electronics, beverages, warehousing) running on identical framework code. Calibration experiments (60 runs, 10 seeds per template) confirm parametric controllability: observed KPIs fall within configured ranges across all templates. A controlled hallucination experiment (72 tool invocations, Qwen3-32B) demonstrates that ontology-constrained parameters eliminate tool-parameter fabrication (0% constrained vs. 43% unconstrained hallucination rate for the evaluated model, Fisher's exact test p < 10^-12); the 0% constrained rate is an architectural guarantee that holds for any model. The framework provides a reusable data layer for discrete manufacturing AI validation.

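2605.11258 2026-05-13 cs.AI cs.CL q-bio.QM version update

Unlocking LLM Creativity in Science through Analogical Reasoning

Andrew Shen, Shaul Druckmann, James Zou

Affiliations * Stanford University

AI summary This paper studies how analogical reasoning (AR) can improve the creativity of large language models (LLMs) on scientific problems, especially in complex fields such as biomedicine. The authors find that current LLMs tend toward mode collapse in open-ended problem solving and generate solutions with insufficient diversity. They propose AR, which generates novel solutions through the analogical structure of cross-domain problems. Experiments show that AR significantly improves the diversity and novelty of generated solutions and outperforms existing methods on several biomedical tasks, validating its effectiveness in practice.

Details
English abstract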

Autonomous science promises to augment scientific discovery, particularly in complex fields like biomedicine. However, this requires AI systems that can consistently generate novel and diverse solutions to open-ended problems. We evaluate LLMs on the task of open-ended solution generation and quantify their tendency to mode collapse into low-diversity generations. To mitigate this mode collapse, we introduce analogical reasoning (AR) as a new approach to solution generation. AR generates analogies to cross-domain problems based on shared relational structure, then uses those analogies to search for novel solutions. Compared to baselines, AR discovers significantly more diverse generations (improving solution diversity metrics by 90-173%), generates novel solutions over 50% of the time (compared to as little as 1.6% for baselines), and produces high-quality analogies. To validate the real-world feasibility of AR, we implement AR-generated solutions across four biomedical problems, yielding consistent quantitative gains. AR-generated approaches achieve a nearly 13-fold improvement on distributional metrics for perturbation effect prediction, outperform all baselines on AUPRC when predicting cell-cell communication, infer brain region interactions with a high Spearman correlation ($ρ$=0.729) to published methods, and establish state-of-the-art performance on 2 datasets for oligonucleotide property prediction. The novel and diverse solutions produced by AR can be used to augment the search space of existing solution generation methods.

2605.11242 2026-05-13 cs.CL cs.AI version update

RETUYT-INCO at BEA 2026 Shared Task 2: Meta-prompting in Rubric-based Scoring for German

Ignacio Sastre, Ignacio Remersaro, Facundo Díaz, Nicolás De Horta, Luis Chiruzzo, Aiala Rosá, Santiago Góngora

Affiliations * Instituto de Computación, Facultad de Ingeniería, Universidad de la República

AI summary This paper describes the RETUYT-INCO team's participation in the BEA 2026 shared task "Rubric-based Short Answer Scoring for German". On several tracks the team used a method called Meta-prompting, which generates custom prompts from training-set examples to score student answers. The team also tried classic machine learning, fine-tuning open-source LLMs, and other prompting techniques. It placed 6th of 8 in Track 1, 4th of 9 in Track 3, and 4th of 8 in Track 4.

Comments To be presented at the BEA 2026 workshop, co-located with ACL 2026

Details
English abstract

In this paper, we present the RETUYT-INCO participation at the BEA 2026 shared task "Rubric-based Short Answer Scoring for German". Our team participated in track 1 (Unseen answers three-way), track 3 (Unseen answers two-way) and track 4 (Unseen questions two-way). Since these tracks required scoring short student answers using specific rubrics, we looked for ways to handle the changing nature of the task. We created a method called Meta-prompting. In this approach, an LLM creates a custom prompt based on examples from the Train set. This prompt is then used to grade new student answers. Along with this method, we also describe other approaches we used, such as classic machine learning, fine-tuning open-source LLMs, and different prompting techniques. According to the official results, our team placed 6th out of 8 participants in Track 1 with a QWK of 0.729. In Track 3, we secured 4th place out of 9 with a QWK of 0.674, and we also placed 4th out of 8 in Track 4 with a QWK of 0.49.
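The QWK figures reported above refer to quadratic weighted kappa, an agreement measure for ordinal labels that penalizes a disagreement by the squared distance between the two labels. A minimal pure-Python sketch of the standard definition follows; the function name and the integer label encoding (0 to num_labels - 1) are assumptions for the example, and the shared task's official scorer may differ in details.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    num_labels: int,
) -> float:
    """Quadratic weighted kappa for integer labels in range(num_labels)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    if num_labels < 2:
        raise ValueError("num_labels must be at least 2")

    # Observed confusion counts: rows are true labels, columns are predictions.
    observed = [[0] * num_labels for _ in range(num_labels)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(num_labels)) for j in range(num_labels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_labels):
        for j in range(num_labels):
            weight = (i - j) ** 2 / (num_labels - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two label distributions were independent.
            weighted_expected += weight * hist_true[i] * hist_pred[j] / n

    if weighted_expected == 0.0:
        # Both sequences use one identical label; treated as perfect agreement here.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.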

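2605.11235 2026-05-13 cs.LG cs.AI version update

Internalizing Curriculum Judgment for LLM Reinforcement Fine-Tuning

Han Zheng, Yining Ma, Karthick Gunasekaran, Bharathan Balaji, Zheng Du, Shiv Vitaladevuni, Cathy Wu

Affiliations * MIT; Amazon AGI

AI summary In reinforcement fine-tuning of large language models, curriculum learning improves training efficiency and performance, but existing methods rely on handcrafted heuristics or auxiliary models for curriculum judgment, which may be misaligned with the policy's training dynamics. This paper proposes METIS, a framework that internalizes curriculum judgment as a native capability of the model. It measures prompt informativeness by within-prompt reward variance, predicts that metric via lightweight in-context learning from recent training outcomes, and dynamically adjusts training allocation accordingly. By jointly optimizing the standard reward and a self-judgment reward, METIS gives the policy a form of metacognitive learning, achieving higher performance and faster convergence across multiple benchmarks.

Details
English abstract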

In LLM Reinforcement Fine-Tuning (RFT), curriculum learning drives both efficiency and performance. Yet, current methods externalize curriculum judgment via handcrafted heuristics or auxiliary models, risking misalignment with the policy's training dynamics. In this paper, we introduce METIS (METacognitive Internalized Self-judgment), a novel framework that internalizes curriculum judgment as a native capability. Leveraging a critical observation that within-prompt reward variance effectively gauges prompt informativeness, METIS predicts this metric based on recent training outcomes as lightweight in-context learning examples. This intrinsic self-judgment then dynamically dictates the training allocation. Moreover, METIS closes the loop between judgment and optimization by jointly optimizing the standard RFT rewards and a self-judgment reward. This allows the policy to learn what to learn next, as a form of metacognition. Across extensive discrete and continuous RFT benchmarks from mathematical reasoning, code generation, to agentic function-calling, METIS consistently delivers superior performance while accelerating convergence by up to 67%. By bypassing handcrafted heuristics and auxiliary models, our work establishes a simple, closed-loop, and highly efficient curriculum internalization paradigm for LLM reinforcement fine-tuning.
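The informativeness signal named in the abstract, within-prompt reward variance, can be illustrated with a short sketch. Note the simplification: METIS is described as predicting this quantity from recent training outcomes, whereas the code below only computes it directly from sampled rewards and ranks prompts by it. The function names and the toy reward data are assumptions for the example.

```python
from statistics import pvariance
from typing import Dict, List


def reward_variance(rewards: List[float]) -> float:
    """Population variance of the rewards sampled for one prompt."""
    if len(rewards) < 2:
        return 0.0  # a single rollout carries no within-prompt contrast
    return pvariance(rewards)


def select_informative(rollouts: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k prompt ids with the highest within-prompt reward variance."""
    ranked = sorted(rollouts, key=lambda prompt: reward_variance(rollouts[prompt]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical binary rewards from four rollouts per prompt.
    rollouts = {
        "too_easy": [1.0, 1.0, 1.0, 1.0],
        "mixed": [1.0, 0.0, 1.0, 0.0],
        "too_hard": [0.0, 0.0, 0.0, 0.0],
    }
    print(select_informative(rollouts, 1))
```

Intuitively, a prompt whose rollouts all receive the same reward offers no contrast between better and worse responses, while a prompt the policy solves only some of the time does.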

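2605.11234 2026-05-13 cs.AI version update

The Semantic Training Gap: Ontology-Grounded Tool Architectures for Industrial AI Agent Systems

Grama Chethan

Affiliations * Siemens Digital Industries Software

AI summary This paper identifies and addresses the "semantic training gap" in industrial AI agent systems: large language models master domain terminology but lack a deep understanding of the semantic structure of manufacturing operations. To close the gap, the study designs a tool architecture grounded in a manufacturing ontology that embeds domain knowledge directly into the AI tool layer and replaces reliance on training with runtime semantic constraints, effectively reducing erroneous generation of domain identifiers. Experiments show that the approach achieves cross-domain configurability and zero tool-call hallucination without application code changes.

Comments 29 pages, 2 figures

Details
English abstract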

Large language model (LLM)-based AI agents are increasingly deployed in manufacturing environments for analytics, quality management, and decision support. These agents demonstrate statistical fluency with domain terminology but lack grounded understanding of operational semantics -- the relational structure that connects equipment identifiers, process parameters, failure codes, and regulatory constraints within a specific production context. This paper identifies and formalizes the semantic training gap: a structural disconnect between how AI systems acquire domain vocabulary through training and how manufacturing operations define meaning through ontological relationships. We demonstrate that this gap causes operationally incorrect outputs even when model responses are linguistically precise, and that in multi-agent configurations it produces a compounding failure mode we term semantic drift. To close this gap, we present an architecture that embeds manufacturing ontology directly into the AI tool layer as a typed relational configuration, enforcing semantic constraints at runtime rather than relying on model training. The architecture is formalized as a three-operation interface contract -- resolve, contextualize, annotate -- with invariants enforced by an AIOps orchestration layer. In a controlled experiment across six industry configurations (72 tool invocations using Qwen3-32B), unconstrained tool parameters produced a 43% hallucination rate for domain identifiers; ontology-grounded parameters reduced this to 0%. We validate the approach through a digital twin analytics platform demonstrating that a single codebase with domain-specific ontology configurations eliminates tool-call hallucination and achieves cross-domain configurability without application code changes.

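2605.11232 2026-05-13 cs.AI cs.LG version update

Rethinking LLMOps for Fraud and AML: Building a Compliance-Grade LLM Serving Stack

Prathamesh Vasudeo Naik, Naresh Dintakurthi, Yue Wang

Affiliations * GitHub

AI summary This paper studies how to build an efficient large language model (LLM) serving architecture for compliance scenarios such as fraud detection and anti-money-laundering (AML). Targeting the prefix-heavy, schema-constrained, evidence-rich inputs common in these tasks, the authors propose a workload-aware LLMOps system that combines runtime tuning, prefix caching, multi-adapter serving, batching optimization, and other techniques, significantly improving serving throughput and response time. Experiments on public synthetic datasets show large performance gains, indicating that compliance-grade LLM serving must be improved jointly through workload design, serving optimization, and quality control.

Details
English abstract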

Fraud detection and anti-money-laundering (AML) compliance are high-value domains for large language models (LLMs), but their serving requirements differ sharply from generic chat workloads. Compliance prompts are often prefix-heavy, schema-constrained, and evidence-rich, combining reusable policy instructions, risk taxonomies, transaction or document context, and short structured outputs such as JSON labels or risk factors. These properties make prefix reuse, KV-cache efficiency, runtime tuning, model orchestration, and output validation first-order systems concerns. This paper introduces a workload-aware LLMOps stack for fraud and AML workloads using self-hosted open-weight models such as Meta Llama and Alibaba Qwen. The stack combines vLLM-style runtime tuning, PagedAttention, Automatic Prefix Caching, multi-adapter serving, adapter and prompt-length-aware batching, sleep/wake lifecycle management, speculative decoding, and optional prefill/decode disaggregation. To avoid exposing institution-specific data, the reproducibility track converts public synthetic AML datasets, including IBM AML and SAML-D, into prefix-heavy compliance prompts with reusable policy text, transaction evidence, typology definitions, and schema-constrained outputs. We also incorporate an LLM-as-judge quality gate using deterministic compliance checks, reference metrics, expert-adjudicated calibration data where available, and multi-judge rubric scoring. Across public-synthetic AML workloads and controlled serving benchmarks, workload-aware tuning improved throughput from 612-650 to 3,600 requests/hour, reduced P99 latency from 31-38 seconds to 6.4-8.7 seconds, and increased GPU utilization from 12% to 78%. These results show that regulated LLM performance is a workload-design, serving-optimization, and quality-gating problem, not only a model-selection problem.

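2605.11229 2026-05-13 cs.CR cs.AI cs.SE version update

Comment and Control: Hijacking Agentic Workflows via Context-Grounded Evolution

Neil Fendley, Zhengyu Liu, Aonan Guan, Jiacheng Zhong, Yinzhi Cao

Affiliations * Johns Hopkins University Applied Physics Lab; Johns Hopkins University; Wyze Labs

AI summary This paper studies a security risk in agentic workflows on automation platforms such as GitHub Actions and n8n: an attacker can use carefully crafted inputs (such as GitHub comments) to manipulate large language model agents into malicious actions such as credential exfiltration and arbitrary command execution. The authors present JAW, the first detection and exploitation framework, which uses a new approach called Context-Grounded Evolution, combining static path-feasibility analysis, dynamic prompt-provenance analysis, and capability analysis, to generate inputs that trigger the agent to perform malicious operations. Experiments show that JAW can hijack a large number of GitHub workflows and n8n templates; the vulnerabilities were responsibly disclosed to the affected vendors and acknowledged and fixed by several companies.

Details
English abstract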

Automation platforms such as GitHub Actions and n8n are increasingly adopting so-called agentic workflows, which integrate Large Language Model (LLM) agents for tasks such as code review and data synchronization. While bringing convenience for developers, this integration exposes a new risk: An adversary may control and craft certain inputs, such as GitHub issue comments, to manipulate the LLM agent for unwanted actions, such as credential exfiltration and arbitrary command execution. To our knowledge, no prior academic work has studied such a risk in agentic workflows. In this paper, we design the first detection and exploitation framework, called JAW, to hijack agentic workflows hosted on automation platforms via a novel approach called Context-Grounded Evolution. Our key idea is to evolve agentic workflow inputs under the contexts derived from hybrid program analysis for hijacking purposes. Specifically, JAW generates agentic workflow contexts through three analyses: (i) static path-feasibility analysis to identify feasible agent-invocation paths and the input constraints required to trigger them, (ii) dynamic prompt-provenance analysis to determine how that input is transformed and embedded into the LLM context, and (iii) capability analysis to identify the actions and restrictions available to the agent at runtime. Our evaluation of JAW on GitHub workflows and n8n templates showed that 4714 GitHub workflows and eight n8n templates can be successfully hijacked, for example, to leak user credentials. Our findings span 15 widely-used GitHub Actions, including official GitHub Actions for Claude Code, Gemini CLI, Qwen CLI, and Cursor CLI, and two official n8n nodes. We responsibly disclosed all findings to the affected vendors and received many acknowledgements, fixes, and bug bounties, notably from GitHub, Google, and Anthropic.

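2605.11225 2026-05-13 cs.AI cs.LG cs.MA version update

PIVOT: Bridging Planning and Execution in LLM Agents via Trajectory Refinement

Tuo Zhang, Alin-Ionut Popa, Yan Xu, Rui Song, Dimitrios Dimitriadis

AI summary PIVOT bridges the gap between planning and execution in large language model (LLM) agents through trajectory refinement, iteratively improving generated trajectories within a self-supervised framework. The method has four stages (plan, inspect, evolve, and verify): it executes trajectories, computes structured losses to identify plan-execution discrepancies, refines the trajectories accordingly, and ultimately improves satisfaction of task constraints. Experiments show that PIVOT performs well both with and without human feedback, clearly outperforming existing methods while remaining computationally efficient.

Details
English abstract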

Large language model (LLM)-based agents frequently generate seemingly coherent plans that fail upon execution due to infeasible actions, constraint violations, and compounding errors over extended horizons. PIVOT (Plan-Inspect-eVOlve Trajectories) addresses this plan-execution misalignment through a self-supervised framework that treats trajectories as optimizable objects iteratively refined via environment interaction. The framework comprises four stages: PLAN generates candidate trajectories; INSPECT executes them and computes structured losses with textual gradients encoding plan-execution discrepancies; EVOLVE applies these signals to produce improved trajectories; and VERIFY performs a final global check against task constraints. A monotonic acceptance process ensures non-decreasing solution quality. Empirical evaluations on DeepPlanning and GAIA demonstrate state-of-the-art performance: with human-in-the-loop (HITL) feedback, PIVOT establishes a strong upper bound of up to 94% relative improvement in constraint satisfaction, while its fully autonomous variant retains substantial gains, showing that the core trajectory-refinement mechanism remains effective without external supervision. At the same time, PIVOT remains computationally efficient, requiring up to 3x to 5x fewer tokens than competing refinement methods. These findings establish that (self- or human-supervised) feedback-based trajectory optimization is a principled methodology for mitigating plan-execution gaps in autonomous agent systems.
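The monotonic acceptance idea, keeping a refined trajectory only when it improves on the current best, can be sketched as a simple loop. This is a simplified illustration under stated assumptions: the function `refine` and its callbacks are hypothetical, the INSPECT stage is reduced to a single scalar loss (the abstract describes structured losses with textual gradients), and the PLAN and VERIFY stages are omitted.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def refine(
    initial: T,
    inspect: Callable[[T], float],
    evolve: Callable[[T], T],
    steps: int,
) -> Tuple[T, List[float]]:
    """Iteratively refine a trajectory, accepting only strict improvements.

    inspect: returns a scalar loss for a trajectory (lower is better; assumed form).
    evolve: proposes a new trajectory from the current best one.
    Returns the best trajectory and the best-so-far loss after each step.
    """
    best, best_loss = initial, inspect(initial)
    history = [best_loss]
    for _ in range(steps):
        candidate = evolve(best)
        candidate_loss = inspect(candidate)
        if candidate_loss < best_loss:  # monotonic acceptance: never keep a worse result
            best, best_loss = candidate, candidate_loss
        history.append(best_loss)
    return best, history


if __name__ == "__main__":
    # Toy setting: a "trajectory" is an integer and the target value is 3.
    proposals = iter([5, 1, 3, 10])
    best, history = refine(0, lambda x: abs(x - 3), lambda _current: next(proposals), steps=4)
    print(best, history)
```

Because rejected candidates never replace the incumbent, the recorded best-so-far loss cannot increase from one step to the next, which is the non-decreasing quality property the abstract refers to.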

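2605.11224 2026-05-13 cs.CV cs.AI Version update

ABRA: Agent Benchmark for Radiology Applications

Bulat Maksudov, Vladislav Kurenkov, Kathleen M. Curran, Alessandra Mileo

Affiliations * School of Computing; Dublin City University; School of Medicine; University College Dublin

AI summary ABRA is an agent benchmark for radiology applications that evaluates medical agents on realistic imaging tasks. Through 21 function-calling tools, the agent operates a medical image viewer and a DICOM server to perform tasks including slice navigation, windowing, annotation, and structured reporting. ABRA contains 655 programmatically generated tasks spanning several difficulty tiers and task types, and automatic scorers assess planning, execution, and outcome, revealing that current models face a major bottleneck in perception.

Details
English abstract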

Existing medical-agent benchmarks deliver imaging as pre-selected samples, never as an environment the agent must navigate. We introduce ABRA, a radiology-agent benchmark in which the agent operates an OHIF viewer and an Orthanc DICOM server through twenty-one function-calling tools that span slice navigation, windowing, series selection, pixel-coordinate annotation, and structured reporting. ABRA contains 655 programmatically generated tasks across three difficulty tiers and eight types (viewer control, metadata QA, vision probe, annotation, longitudinal comparison, BI-RADS reporting, and oracle variants of annotation and BI-RADS reporting), drawn from LIDC-IDRI, Duke Breast Cancer MRI, and NLST New-Lesion LongCT. Each episode is scored along Planning, Execution, and Outcome (Bluethgen et al., 2025) by task-type-specific automatic scorers. Ten current models, five closed-weight and five open-weight, reach at least 89% Execution on real annotation but only 0-25% Outcome; on the paired oracle variant where a simulated detector supplies the finding, Outcome on the same task reaches 69-100% across the models evaluated, localising the bottleneck to perception rather than tool orchestration. Code, task generators, and scorers are released at https://github.com/Luab/ABRA

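2605.11218 2026-05-13 cs.AI Version update

Don't Look at the Numbers: Visual Anchoring Bias and Layer-wise Representation in VLMs

M. Shalankin

Affiliations * M. Shalankin

AI summary This study shows that vision-language models (VLMs) are systematically biased by numeric anchors embedded in images when judging image quality, and that the bias appears across model architectures. Layer-wise analysis finds a dissociation: the layers where anchor classification saturates are suboptimal for quality prediction, while deeper layers predict quality better. The study also finds that models differ in how they fuse anchor information, offering a causal account of visual anchoring bias and its link to representation dynamics.

Details
English abstract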

Embedded numeric anchors on images systematically bias Vision-Language Model quality judgments across six VLMs from five architectural families (ANOVA eta^2 = 0.18-0.77, all p < 0.001). Anchor effects are 2.5x larger than severe image quality degradation, confirming bias is not reducible to visual changes. Layer-wise probing reveals consistent dissociation: layers where anchor classification saturates (L12-L34) are suboptimal for quality prediction, with optimal layers deeper (R^2 = 0.69-0.91). Fusion analysis identifies architecture-dependent integration -- instant fusion at L1-L2 in two models versus partial or no fusion in three others. These results establish a causal account of visual anchoring bias, linking behavioral susceptibility to representation dynamics.

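2605.11217 2026-05-13 cs.LG cs.AI cs.CR Version update

Leveraging RAG for Training-Free Alignment of LLMs

John T. Halloran

Affiliations * Leidos

AI summary This paper proposes RAG-Pref, a retrieval-augmented generation (RAG) based alignment method that improves the ability of large language models (LLMs) to refuse agentic attacks without additional training. The method conditions on contrastive information from preferred and dispreferred samples during inference, achieving online alignment with low computational overhead and compatibility with existing tools. Experiments show that RAG-Pref substantially improves attack refusal on five widely used LLMs, also performs well on general human-preference alignment tasks, and does not significantly increase computational requirements.

Comments 19 pages, 4 figures, and 6 tables

Details
English abstract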

Large language model (LLM) alignment algorithms typically consist of post-training over preference pairs. While such algorithms are widely used to enable safety guardrails and align LLMs with general human preferences, we show that state-of-the-art alignment algorithms require significant computational resources while being far less capable of enabling refusal guardrails for recent agentic attacks. Thus, to improve refusal guardrails against such attacks without drastically increasing computational overhead, we introduce Retrieval Augmented Generation for Preference alignment (RAG-Pref), a simple RAG-based alignment algorithm which conditions on preferred and dispreferred samples to leverage contrastive information during inference. RAG-Pref is online (training-free), compatible with off-the-shelf packages, and, when combined with offline (training-based) alignment algorithms, enables more than an average 3.7 factor improvement in agentic attack refusals across five widely used LLMs, compared to 2.9 for other online alignment algorithms and 1.5 for offline alignment alone. We conclude by showing that, in stark contrast to other online alignment methods, RAG-Pref similarly increases performance on general human-preference alignment tasks and does not drastically increase overall computational requirements.
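
The idea of conditioning on retrieved preferred and dispreferred samples at inference time can be sketched as prompt construction. This is only an illustration under assumptions: the abstract does not specify the retriever or prompt template, so the Jaccard word-overlap retriever, the `STORE` contents, and the function names below are all hypothetical.

```python
from typing import Dict, List


def tokenize(text: str) -> set:
    """Lowercased whitespace tokens; a stand-in for a real embedding model."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two strings."""
    ta, tb = tokenize(a), tokenize(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(query: str, store: List[Dict[str, str]], k: int = 1):
    """Return the k stored preference pairs most similar to the query."""
    ranked = sorted(store, key=lambda ex: jaccard(query, ex["prompt"]), reverse=True)
    return ranked[:k]


def build_prompt(query: str, store: List[Dict[str, str]], k: int = 1) -> str:
    """Prepend retrieved contrastive pairs so the model sees both outcomes."""
    lines = ["Examples of preferred and dispreferred responses:"]
    for ex in retrieve(query, store, k):
        lines.append(f"Request: {ex['prompt']}")
        lines.append(f"Preferred: {ex['preferred']}")
        lines.append(f"Dispreferred: {ex['dispreferred']}")
    lines.append(f"Request: {query}")
    lines.append("Response:")
    return "\n".join(lines)


# Hypothetical preference store; not data from the paper.
STORE = [
    {"prompt": "delete all files on the server",
     "preferred": "I can't help with that.",
     "dispreferred": "Running rm -rf / now."},
    {"prompt": "summarize this article",
     "preferred": "Here is a summary.",
     "dispreferred": "No."},
]

prompt = build_prompt("please delete all files on the backup server", STORE)
print(prompt)
```

No model weights change in this scheme: the contrastive pair is supplied through the context window only, which is what makes the approach training-free.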

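2605.11205 2026-05-13 cs.LG cs.AI Version update

The Scaling Law of Evaluation Failure: Why Simple Averaging Collapses Under Data Sparsity and Item Difficulty Gaps, and How Item Response Theory Recovers Ground Truth Across Domains

Jung Min Kang

Affiliations * Independent Researcher

AI summary This paper studies how simple averaging fails to produce reliable rankings when evaluation data are sparse and item difficulty varies widely, and shows that Item Response Theory (IRT) recovers the true rankings more accurately. In experiments across several domains (such as natural language processing and clinical trials), the rank correlation of simple averaging drops markedly as data coverage decreases, whereas the IRT-based model stays highly accurate. The study characterizes a scaling law of evaluation failure and offers a more reliable evaluation approach for benchmarking in areas such as Physical AI.

Comments 15 pages, 4 tables, 1 figure. Code at https://github.com/testofschool/evaluation-failure-scaling-law

Details
English abstract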

Benchmark evaluation across AI and safety-critical domains overwhelmingly relies on simple averaging. We demonstrate that this practice produces substantially misleading rankings when two conditions co-occur: (1) the evaluation matrix is sparse and (2) items vary substantially in difficulty. Through controlled simulation experiments across four domains -- NLP (GLUE), clinical drug trials, autonomous vehicle safety, and cybersecurity -- we show that Spearman rank correlation $ρ$ between simple-average rankings and ground-truth rankings degrades from $ρ= 1.000$ at 100% coverage to $ρ= 0.809$ at 67% coverage with high difficulty heterogeneity (mean over 20 seeds). A standard two-parameter logistic (2PL) Item Response Theory (IRT) model maintains $ρ\geq 0.996$ across all conditions. A 150-condition grid sweep over sparsity $S \in [0, 0.70]$ and difficulty gap $D \in [0.5, 5.0]$ confirms that ranking error forms a failure surface with a strong $S \times D$ interaction ($γ_3 = +0.20$, $t = 13.05$), while IRT maintains $ρ\geq 0.993$ throughout. We discuss implications for Physical AI benchmarking, where evaluation matrices are often incomplete and difficulty gaps are extreme.
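
Why sparsity combined with a difficulty gap misleads simple averaging can be seen in a two-model, two-item toy case built on the standard 2PL response function P(correct) = 1 / (1 + exp(-a(θ - b))). The sketch below is illustrative only: the abilities, item parameters, and coverage pattern are assumed values, it uses expected scores rather than sampled responses, and it does not fit an IRT model or reproduce the paper's simulations.

```python
import math


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL IRT: probability of success for ability theta on an item
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Assumed toy parameters: (discrimination, difficulty) per item.
ITEMS = {"easy": (1.0, -2.0), "hard": (1.0, 2.0)}
# Assumed ground truth: model A is more able than model B.
ABILITY = {"A": 1.0, "B": 0.0}


def mean_score(model: str, observed: list) -> float:
    """Simple average of expected scores over the items a model was run on."""
    scores = [p_correct(ABILITY[model], *ITEMS[item]) for item in observed]
    return sum(scores) / len(scores)


# Full coverage: both models see both items, averaging ranks A first.
full = {m: mean_score(m, ["easy", "hard"]) for m in ABILITY}

# Sparse coverage: A was only run on the hard item, B only on the easy one.
sparse = {"A": mean_score("A", ["hard"]), "B": mean_score("B", ["easy"])}

print("full  :", full)
print("sparse:", sparse)
```

Under full coverage the averages agree with the true ability order, while under the sparse pattern the weaker model scores higher simply because it was evaluated on the easier item. An IRT fit avoids this by estimating item difficulty jointly with ability.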

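2605.11202 2026-05-13 cs.CR cs.AI cs.LG cs.SE Version update

Continuous Discovery of Vulnerabilities in LLM Serving Systems with Fuzzing

Yunze Zhao, Yibo Zhao, Yuchen Zhang, Zaoxing Liu, Michelle L. Mazurek

Affiliations * University of Maryland; New York University

AI summary Addressing security problems in large language model (LLM) inference and serving systems, this study proposes GRIEF, a greybox fuzzer for continuously discovering vulnerabilities in the serving layer. GRIEF takes timed multi-request traces as input and uses lightweight oracles to detect crashes, performance anomalies, and output corruption, then confirms reproducible serving-layer failures. In experiments on mainstream inference engines, GRIEF found 15 vulnerabilities, 10 of them confirmed by developers, showing that concurrency, caching, and state reuse can introduce security risks.

Details
English abstract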

LLM inference and serving systems have become security-critical infrastructure; however, many of their most concerning failures arise from the serving layer rather than from model behavior alone. Modern inference engines combine KV cache, batching, prefix sharing, speculative decoding, adapters, and multi-tenant scheduling, creating shared-state behavior that only emerges under realistic concurrent workloads and is missed by standard model, safety, and API tests. We present GRIEF, a greybox fuzzer for LLM inference engines that treats timed multi-request traces as first-class inputs, uses lightweight oracles to detect crashes, hangs, performance pathologies, and silent output corruption, and applies controlled replay with log-probability checks to confirm reproducible serving-layer failures. Across early campaigns on vLLM and SGLang, GRIEF discovers 15 vulnerabilities, 10 confirmed by engine developers, including 2 CVEs, spanning KV-cache isolation failures, cross-request performance interference, and crash or liveness bugs. These results show that concurrency, caching, and state reuse can induce silent cross-request contamination, noisy-neighbor denial of service, and delayed crashes without malformed inputs or explicit server errors, making concurrent serving behavior a first-class security and reliability boundary for LLM infrastructure.

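2605.11192 2026-05-13 cs.SD cs.AI cs.LG Version update

Exploring Token-Space Manipulation in Latent Audio Tokenizers

Francesco Paissan, Luca Della Libera, Mirco Ravanelli, Cem Subakan

Affiliations * Mila – Québec AI Institute; Université Laval; Concordia University

AI summary This paper explores token-space manipulation in latent audio tokenizers and proposes LATTE, a new audio tokenizer that introduces learnable latent tokens to enable editing of global speech attributes. While preserving high-quality speech reconstruction, the approach makes it possible to modify global attributes such as speaker identity or background noise by swapping tokens. Its effectiveness is evaluated on voice conversion and denoising tasks, suggesting a new direction for unsupervised controllable audio editing.

Details
English abstract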

Neural audio codecs provide compact discrete representations for speech generation and manipulation. However, most codecs organize tokens as frame-level sequences, making it difficult to study or intervene on global factors of variation. In this work, we propose the Latent Audio Tokenizer for Token-space Editing (LATTE) that appends a fixed set of learnable latent tokens to the audio feature sequence and retains only these tokens for quantization and decoding. This design produces a compact, non-temporally aligned bottleneck in which each token can aggregate global information across the full utterance. We show that the resulting tokenizer preserves competitive reconstruction quality in low-bitrate speech coding settings while enabling simple token-space interventions. In particular, we find that swapping selected latent token positions between utterances can modify global attributes, such as speaker identity and background noise, and we evaluate these interventions on voice conversion and denoising tasks. Our results suggest that compact latent audio tokenizers can support controllable audio manipulation without supervision in task-specific editing models.

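2605.11188 2026-05-13 cs.CR cs.AI cs.ET Version update

Adversarial SQL Injection Generation with LLM-Based Architectures

Ali Karakoc, H. Birkan Yilmaz

Affiliations * Department of Computer Engineering, NETLAB

AI summary This paper studies how large language models (LLMs) can generate adversarial SQL injection attacks to evaluate the defenses of Web Application Firewalls (WAFs). The authors propose two new LLM-based methods, RADAGAS and RefleXQLi, and run large-scale experiments against a range of WAFs. RADAGAS performs strongly against AI/ML-based WAFs but has limited success against rule-based WAFs. The study provides an empirical reference for using LLMs in security testing.

Comments 32 pages, 8 figures, 8 tables

Details
English abstract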

SQL injection (SQLi) attacks remain among the serious threats ranked in the Open Worldwide Application Security Project (OWASP) Top 10. Today, advances in Artificial Intelligence (AI), especially in Large Language Models (LLMs), have created an opportunity to automate adversarial attack tests that measure defense mechanisms. In this paper, we aim to create a comprehensive evaluation of use cases that utilize LLMs for adversarial SQL injection generation. We introduce two novel LLM-based systems, Retrieval Augmented Generation for Adversarial SQLi (RADAGAS) and Reflective Chain-of-Thought SQLi (RefleXQLi), and compare them with existing baselines against 10 Web Application Firewalls (WAFs) and one execution-based MySQL validator. To perform a comprehensive test, we used six rule-based open-source WAFs (ModSecurity PL1--3, Coraza PL1--3), two AI/ML-based WAFs (WAF Brain, CNN-WAF), and two commercial WAFs (AWS WAF and Cloudflare WAF). For the LLMs, we used GPT-4o, Claude 3.7 Sonnet, and DeepSeek R1. Our tests consist of 240 experiments that generate 240,000 payloads and perform 2.2 million tests against WAFs. Our comprehensive evaluation reveals that RADAGAS-GPT4o outperforms other baseline models with a 22.73\% bypass rate. The proposed RADAGAS variants are highly successful on AI/ML-based WAFs (92.49\% on WAF-Brain by RADAGAS-DeepSeek, 80.48\% on CNN-WAF by RADAGAS-Claude), but struggle to bypass rule-based WAFs (0--5.70\% on ModSecurity and Coraza). We also observe that generating less diverse payloads achieves more bypasses; however, such payloads show poor results if the initially chosen payload is not successful. We observe that our findings provide a comprehensive view on using LLM-based approaches in security testing.

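2605.11186 2026-05-13 cs.LG cs.AI Version update

CATS: Cascaded Adaptive Tree Speculation for Memory-Limited LLM Inference Acceleration

Yuning Han, Yangchenchen Jin, Dylan Zhao, Jingwei Sun

Affiliations * University of Florida

AI summary When running large language model inference on memory-limited devices, auto-regressive decoding is bound by memory bandwidth, and existing speculative decoding methods usually assume that device memory can hold both the target model and an auxiliary draft model, which does not hold on edge devices. This paper proposes CATS, a cascaded adaptive tree speculation framework that performs cascaded verification and correction based on the memory budget and parameter offloading patterns, substantially speeding up inference without increasing the peak memory footprint. Experiments on real edge devices show speedups of up to 5.08x with no loss in generation quality, outperforming the state-of-the-art method by up to 1.45x.

Details
English abstract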

Auto-regressive decoding in Large Language Models (LLMs) is inherently memory-bound: every generation step requires loading the model weights and intermediate results from memory (e.g., High-Bandwidth Memory (HBM) for GPU servers), making throughput bottlenecked by memory bandwidth rather than compute. Speculative decoding addresses this by enabling parallel verification of multiple draft tokens, effectively amortizing the cost of each target-model call. However, existing speculative decoding methods are designed under the assumption that HBM is sufficiently large to hold both the target model and an auxiliary draft model simultaneously -- an assumption that breaks down on memory-constrained devices such as edge platforms with limited DRAM. We analyze the inference bottleneck in this memory-limited regime and propose CATS, a self-speculative decoding framework that conducts cascaded verification and correction based on the memory budget and parameter offloading patterns on memory-limited devices. This design maximizes token acceptance rate and end-to-end speedup while keeping the peak memory footprint on the device equal to that of the target model alone. We evaluate CATS on different models across five benchmarks on real edge devices. CATS can achieve a wall-clock speedup of up to 5.08x with no degradation in generation quality, outperforming the SOTA method by up to 1.45x under edge memory constraints.

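2605.11181 2026-05-13 cs.LG cs.AI cs.NA math.NA math.OC stat.ML Version update

Muon is Not That Special: Random or Inverted Spectra Work Just as Well

Zakhar Shumaylov, Nathaël Da Costa, Peter Zaika, Bálint Mucsányi, Alex Massucco, Yoav Gelberg, Carola-Bibiane Schönlieb, Yarin Gal, Philipp Hennig

Affiliations * University of Cambridge; University of Tübingen; University of Oxford

AI summary This paper challenges the prevailing view that the Muon optimizer's success in non-Euclidean optimization depends on geometric structure, arguing that precise geometry is not the key factor in optimization performance. It introduces Freon, a family of optimizers based on Schatten (quasi-)norms that outperforms Muon on tasks such as GPT-2, and shows that the best parameters lie in the quasi-norm regime, which conventional LMO theory cannot explain. It further proposes Kaon, an optimizer that replaces singular values with random noise yet still matches Muon's performance, showing that strict geometric structure is unnecessary. The study finds that performance is governed mainly by local quantities such as alignment and descent potential rather than by global geometry.

Comments 45 pages

Details
English abstract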

The recent empirical success of the Muon optimizer has renewed interest in non-Euclidean optimization, typically justified by similarities with second-order methods, and linear minimization oracle (LMO) theory. In this paper, we challenge this geometric narrative through three contributions, demonstrating that precise geometric structure is not the key factor affecting optimization performance. First, we introduce Freon, a family of optimizers based on Schatten (quasi-)norms, powered by a novel, provably optimal QDWH-based iterative approximation. Freon naturally interpolates between SGD and Muon, while smoothly extrapolating into the quasi-norm regime. Empirically, the best-performing Schatten parameters for GPT-2 lie strictly within the quasi-norm regime, and thus cannot be represented by any unitarily invariant LMO. Second, noting that Freon performs well across a wide range of exponents, we introduce Kaon, an absurd optimizer that replaces singular values with random noise. Despite lacking any coherent geometric structure, Kaon matches Muon's performance and retains classical convergence guarantees, proving that strict adherence to a precise geometry is practically irrelevant. Third, having shown that geometry is not the primary driver of performance, we demonstrate it is instead controlled by two local quantities: alignment and descent potential. Ultimately, each optimizer must tune its step size around these two quantities. While their dynamics are difficult to predict a-priori, evaluating them within a stochastic random feature model yields a precise insight: Muon succeeds not by tracking an ideal global geometry, but by guaranteeing step-size optimality.

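2605.11178 2026-05-13 cs.LG cs.AI math.RT Version update

Oversmoothing as Representation Degeneracy in Neural Sheaf Diffusion

Arif Dönmez, Axel Mosig, Ellen Fritsche, Katharina Koch

Affiliations * IUF – Leibniz Research Institute for Environmental Medicine; DNTOX GmbH; Bioinformatics Group, Ruhr University Bochum; Swiss Centre for Applied Human Toxicology (SCAHT)

AI summary This paper studies oversmoothing in Neural Sheaf Diffusion (NSD), interpreting it as a degeneration of representation geometry. By identifying cellular sheaves on graphs with representations of the associated incidence quiver, the authors expose the algebraic structure of the harmonic space that NSD reaches in the diffusion limit and show that learned sheaf geometries can degenerate into low-complexity representations that lose discriminative information. They introduce moment-map-inspired regularizers that steer restriction maps toward more balanced geometries, analyze a stability obstruction in equal-stalk architectures, and argue for non-uniform stalk dimensions. Experiments indicate that breaking stalk-dimension symmetry can help model performance.

Comments 15 pages, Comments welcome

Details
English abstract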

Neural Sheaf Diffusion (NSD) generalizes diffusion-based Graph Neural Networks by replacing scalar graph Laplacians with sheaf Laplacians whose learned restriction maps define a task-adapted geometry. While the diffusion limit of NSD is known to be the space of global sections, the representation-theoretic structure of this harmonic space remains largely implicit. We develop a quiver-theoretic interpretation of NSD by identifying cellular sheaves on graphs with representations of the associated incidence quiver. Under this correspondence, learned sheaf geometries become points in a finite-dimensional representation space. We show that direct-sum decompositions of the underlying incidence-quiver representation induce decompositions of the harmonic space reached in the diffusion limit. This gives an algebraic interpretation of oversmoothing as representation degeneration: learned sheaves may collapse toward low-complexity summands whose global sections fail to preserve discriminative information. Building on this viewpoint, we connect sheaf diffusion to stability and moment-map principles from Geometric Invariant Theory. We introduce moment-map-inspired regularizers that bias restriction maps toward balanced representation geometries, and identify a structural obstruction in equal-stalk architectures: when $d_v = d_e$, admissibility for learnable stability parameters forces the trivial all-object summand onto a stability wall. Non-uniform stalk dimensions remove this obstruction, making adaptive stability meaningful. Experiments on heterophilic benchmarks are consistent with this mechanism: breaking stalk symmetry can reduce variance or improve validation behavior, and adaptive stability becomes more effective in selected rectangular settings. Overall, our framework reframes oversmoothing as a degeneration phenomenon in the representation geometry underlying learned sheaf diffusion.

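2605.11169 2026-05-13 cs.AI Version update

OLIVIA: Online Learning via Inference-time Action Adaptation for Decision Making in LLM ReAct Agents

Sheldon Yu, Junda Wu, Xintong Li, Nikki Lijing Kuang, Sizhe Zhou, Tong Yu, Jiawei Han, Jingbo Shang, Julian McAuley

Affiliations * UC San Diego; University of Illinois at Urbana-Champaign; Adobe Research

AI summary This paper proposes OLIVIA, an online action-adaptation framework for ReAct-style large language model agents that improves their decision making at deployment time. OLIVIA models the agent's action-selection layer as a contextual linear bandit with upper-confidence-bound (UCB) exploration, using frozen hidden states as decision contexts, so that action selection can be adjusted directly and uncertainty estimated while the original reasoning process is preserved. Experiments show that OLIVIA outperforms static ReAct and prompt-based adaptation methods on several benchmarks, demonstrating efficient, fine-grained, and uncertainty-aware online adaptation during deployment.

Details
English abstract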

Large language model agents interleave reasoning, action selection, and observation to solve sequential decision-making tasks. In deployed settings where agents repeatedly handle related multi-step tasks, small action-selection errors can accumulate into wasted tool calls, latency, and reduced reliability. Despite this need for deployment-time improvement, existing inference-time adaptation methods for LLM agents mainly rely on prompting or retrieval, which influence behavior indirectly through context manipulation. For ReAct-style agents, such approaches do not expose an explicit decision layer that can score candidate actions, represent uncertainty, or be updated online from action-level feedback. As a result, they provide limited support for trackable, fine-grained, and uncertainty-aware adaptation during deployment. We propose OLIVIA, an inference-time action adaptation framework for ReAct-style agents. OLIVIA models the LLM's final action-selection layer as a contextual linear bandit over candidate actions, with frozen hidden states as decision contexts. This choice is particularly suitable for deployment because it adapts behavior directly at the action-selection interface, preserves the underlying reasoning process, and provides explicit uncertainty estimates and lightweight online updates from action-level feedback. With upper-confidence-bound exploration, OLIVIA improves the policy sample-efficiently with minimal computational overhead. We instantiate OLIVIA on four benchmarks and show that it consistently improves task performance over static ReAct and prompt-based inference-time baselines. Our results suggest that explicit online decision layers provide an effective alternative to purely prompt- or retrieval-based adaptation for LLM agents during deployment.
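
A contextual linear bandit with upper-confidence-bound exploration can be sketched with the textbook disjoint LinUCB rule, where each action keeps its own ridge-regression estimate and an uncertainty bonus. This is a generic sketch, not OLIVIA itself: the two-dimensional context stands in for a frozen hidden state, the rewards are invented, and the paper's exact parameterization may differ.

```python
import math
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def matvec(m: List[Vector], v: Vector) -> Vector:
    return [dot(row, v) for row in m]


class LinUCBArm:
    """One candidate action with its own linear reward model."""

    def __init__(self, dim: int, alpha: float = 1.0):
        self.alpha = alpha
        # Inverse of A = I (ridge prior); kept up to date incrementally.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def ucb(self, x: Vector) -> float:
        theta = matvec(self.a_inv, self.b)  # ridge estimate A^-1 b
        bonus = self.alpha * math.sqrt(dot(x, matvec(self.a_inv, x)))
        return dot(theta, x) + bonus

    def update(self, x: Vector, reward: float) -> None:
        # Sherman-Morrison rank-one update of (A + x x^T)^-1.
        ax = matvec(self.a_inv, x)
        denom = 1.0 + dot(x, ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        for i in range(len(x)):
            self.b[i] += reward * x[i]


def select(arms: List[LinUCBArm], x: Vector) -> int:
    """Pick the action with the highest upper confidence bound."""
    scores = [arm.ucb(x) for arm in arms]
    return max(range(len(arms)), key=scores.__getitem__)


def run(rounds: int = 5) -> List[int]:
    arms = [LinUCBArm(dim=2), LinUCBArm(dim=2)]
    context = [1.0, 0.0]   # stand-in for a frozen hidden state
    rewards = [0.0, 1.0]   # assumed feedback: action 1 is the good one
    choices = []
    for _ in range(rounds):
        i = select(arms, context)
        arms[i].update(context, rewards[i])
        choices.append(i)
    return choices


print(run())
```

In this toy run the policy tries action 0 once, sees no reward, and then settles on action 1. The update touches only small per-action matrices, which is consistent with the lightweight online updates the abstract describes.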

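2605.11167 2026-05-13 cs.CL cs.AI cs.LG Version update

The Bicameral Model: Bidirectional Hidden-State Coupling Between Parallel Language Models

Cedric Flamant, Udaya Ghai, Kanna Shimizu

Affiliations * AWS Agentic AI

AI summary This paper proposes the Bicameral Model, which uses a trainable neural interface to couple the intermediate hidden states of two pretrained language models bidirectionally, letting them coordinate through a continuous, concurrent channel rather than by generating text. The two models run in lockstep at every generation step: the primary model drives the task while the auxiliary model handles tool calls, constraint solving, or code execution, with each conditioning on the other through a translation network and a learned suppression gate. Experiments show substantial gains on arithmetic, logic grid puzzles, and mathematical reasoning, demonstrating the approach's effectiveness for multi-model collaboration.

Comments 9 pages main text, 5 figures, 24 pages appendix

Details
English abstract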

Existing multi-model and tool-augmented systems communicate by generating text, serializing every exchange through the output vocabulary. Can two pretrained language models instead coordinate through a continuous, concurrent channel? The Bicameral Model couples two frozen language models through a trainable neural interface on their intermediate hidden states. At every generation step, both models run in lockstep: a primary model drives the task while an auxiliary model operates tools, solves constraints, or executes code, with both conditioning on each other's activations through a translation network and a learned suppression gate ($\sim$1\% of combined parameters). The gate learns a selective communication protocol from task loss alone, without a prescribed format. We demonstrate the mechanism across three tool backends. On arithmetic, coupling two 0.5B models with a calculator raises accuracy from 36\% to 96\%. On logic grid puzzles, coupling two 0.6B models with a Z3 solver achieves $1.7\times$ the unaugmented baseline on ZebraLogic. On mathematical reasoning, coupling with a Python sandbox enables the auxiliary to generate problem-specific code from hidden-state signals alone, without ever seeing the problem text.

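2605.11163 2026-05-13 cs.CR cs.AI Version update

Benchmarking LLM-Based Static Analysis for Secure Smart Contract Development: Reliability, Limitations, and Potential Hybrid Solutions

Stefan-Claudiu Susan, Andrei Arusoaie, Dorel Lucanu

Affiliations * Department of Computer Science, Alexandru Ioan Cuza University of Iași

AI summary This study evaluates the reliability of large language model (LLM) based static analysis for secure smart contract development, asking whether such models can replace traditional static-analysis tools or only complement them. It finds that current LLMs suffer from inherent lexical bias and insufficient validation of external data inputs when detecting smart contract vulnerabilities, leading to a high false-positive rate and a trade-off between precision and recall. The conclusions were obtained with the authors' custom automated framework, which reaches 92% accuracy in classifying model outputs.

Comments Accepted to IEEE COMPSAC 2026. Extended version with supplemental materials

Details
English abstract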

The irreversible nature of blockchain transactions makes the identification of smart contract vulnerabilities an essential requirement for secure system development. While Large Language Models (LLMs) are increasingly integrated into developer workflows, their reliability as autonomous security auditors remains unproven. We assess whether current generative models are a viable replacement for, or only a complement to, traditional static-analysis tools. Our findings indicate that LLM efficacy is undermined by both inherent lexical bias and a lack of rigorous validation of external data inputs. This reliance on non-semantic heuristics, such as identifier naming, leads to a high frequency of false positives. Furthermore, prompting techniques reveal a trade-off between precision and recall. These results were derived using our custom automated framework, which achieves 92% accuracy in classifying model outputs.

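2605.11161 2026-05-13 cs.LG cs.AI Version update

Interpretability Can Be Actionable

Hadas Orgad, Fazl Barez, Tal Haklay, Isabelle Lee, Marius Mosbach, Anja Reusch, Naomi Saphra, Byron Wallace, Sarah Wiegreffe, Eric Wong, Ian Tenney, Mor Geva

Affiliations * Kempner Institute at Harvard University; University of Southern California; Mila – Quebec AI Institute; McGill University; Google DeepMind; Tel Aviv University; University of Pennsylvania; University of Maryland; University of Oxford; Northeastern University; Boston University

AI summary This paper examines the practical value of interpretability research for deep neural networks, noting that current work lacks evaluation criteria for whether interpretability translates into real decisions and interventions. The authors propose actionability as the core criterion for evaluating interpretability, define actionable interpretability along two dimensions (concreteness and validation), and analyze the barriers to real-world use. They further identify five domains where interpretability offers unique leverage and present an evaluation framework aligned with practical outcomes, aiming to move interpretability research from exploration toward application.

Comments Accepted to ICML 2026

Details
English abstract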

Interpretability aims to explain the behavior of deep neural networks. Despite rapid growth, there is mounting concern that much of this work has not translated into practical impact, raising questions about its relevance and utility. This position paper argues that the central missing ingredient is not new methods, but evaluation criteria: interpretability should be evaluated by actionability--the extent to which insights enable concrete decisions and interventions beyond interpretability research itself. We define actionable interpretability along two dimensions--concreteness and validation--and analyze the barriers currently preventing real-world impact. To address these barriers, we identify five domains where interpretability offers unique leverage and present a framework for actionable interpretability with evaluation criteria aligned with practical outcomes. Our goal is not to downplay exploratory research, but to establish actionability as a core objective of interpretability research.

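2605.11157 2026-05-13 cs.GT cs.AI cs.LG cs.MA econ.TH Version update

The Price of Proportional Representation in Temporal Voting

Nicholas Teh

Affiliations * University of Oxford

AI summary This paper studies the price of proportional representation in the temporal voting model, examining the trade-off between proportionality and social welfare. Using a worst-case welfare ratio, the author quantifies the efficiency loss from enforcing proportional representation and finds that the loss grows sublinearly as the number of rounds or voters increases. For weaker axioms such as justified representation (JR), the welfare loss shrinks and vanishes as the time horizon grows, whereas for stronger axioms the conflict persists. The author also proves that maximizing social welfare under each proportionality axiom is computationally hard and provides several fixed-parameter algorithms.

Comments Appears in the 35th International Joint Conference on Artificial Intelligence (IJCAI), 2026

Details
English abstract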

We study proportional representation in the temporal voting model, where collective decisions are made repeatedly over time over a fixed horizon. Prior work has extensively investigated how proportional representation axioms from multiwinner voting (e.g., justified representation (JR) and its variants) can be adapted, satisfied, and verified in this setting. However, much less is understood about their interaction with social welfare. In this work, we quantify the efficiency cost of enforcing proportionality. We formalize the welfare-proportionality tension via the worst-case ratio between the maximum achievable utilitarian welfare and the maximum welfare attainable subject to a proportionality axiom. We show that imposing proportional representation in the temporal setting can incur a growing, yet sublinear, welfare loss as the number of voters or rounds increases. We further identify a clean separation among axioms: for JR, the welfare loss diminishes as the time horizon grows and vanishes asymptotically, whereas for stronger axioms this conflict persists even with many rounds. Moreover, we prove that welfare maximization under each axiom is NP-complete and APX-hard, even under static preferences and bounded-degree approvals, and provide fixed-parameter algorithms under several natural structural parameters.

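2605.11143 2026-05-13 cs.CL cs.AI cs.IR Version update

ClinicalBench: Stress-Testing Assertion-Aware Retrieval for Cross-Admission Clinical QA on MIMIC-IV

Alex Stinard

AI summary This paper presents ClinicalBench, a benchmark for evaluating assertion-aware retrieval in cross-admission clinical question answering, focusing on answer errors caused by negation, temporality, and patient-versus-family attribution when retrieving from real electronic health records. The study builds a patient knowledge graph (EpiKG) with assertion and temporality labels and combines it with intent-aware retrieval-augmented generation (KG-RAG), substantially improving retrieval accuracy. Experiments show gains across several large language models and expose the limitations of auto-generated reference answers, underscoring the importance of physician adjudication in clinical QA evaluation.

Comments 46 pages including appendices (two-column preprint format). Under review at JAMIA. Code, frozen evaluator, and benchmark released at https://huggingface.co/datasets/alexstinard/epikg-clinicalbench. ClinicalBench v2 is a 400-question MIMIC-IV stress test for assertion-aware retrieval

Details
English abstract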

Reasoning benchmarks measure clinical performance on clean inputs. We evaluate the step before reasoning: retrieval over real EHR notes, where negation, temporality, and family-versus-patient attribution can flip a correct answer to a wrong one. EpiKG carries an assertion label and a temporality tag with every fact in a patient knowledge graph, then routes retrieval by question intent. ClinicalBench is a 400-question test over 43 MIMIC-IV patients across 9 assertion-sensitive categories. A 7-condition ablation tests each piece of EpiKG across six LLMs (Claude Opus 4.6, GPT-OSS 20B, MedGemma 27B, Gemma 4 31B, MedGemma 1.5 4B, Qwen 3.5 35B). Three physicians blindly adjudicated 100 paired items. The author-blind primary endpoint, leave-author-out paired exact McNemar on 50 unanimous-strict items rated by two external physicians, yields +22.0 percentage points (95 percent Newcombe CI [+5.1, +31.5], p=0.0192). The architectural novelty, intent-aware KG-RAG over a Contriever dense-RAG baseline (C2b to C4g_kw on the change-excluded n=362 endpoint), is +8.84 percentage points (paired McNemar p=1.79e-3); +12.43 percentage points under oracle intent. Sensitivities agree directionally: three-rater physician majority +24.0 percentage points (subject to single-author circularity); deterministic keyword reproducibility proxy +39.5 percentage points. Across the six models, the gain shrinks as the LLM-alone baseline rises (beta=-1.123, r=-0.921, p=0.009). With n=6 this looks more like regression to the mean than encoding substituting for model size. Physician adjudication identified 56 percent of auto-generated reference answers as defective, a methodological finding indicating that NLP-pipeline clinical-QA benchmarks require physician adjudication to be usable. ClinicalBench, the frozen evaluator, three-rater adjudication data, and the EpiKG output stack are publicly released.
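
The paired exact McNemar test named above compares two systems on the same items using only the discordant pairs. The sketch below shows the standard two-sided exact (binomial) form; the counts in the usage lines are made up for illustration and are not the paper's adjudication data.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: items where only system 1 is correct
    c: items where only system 2 is correct
    Under the null hypothesis each discordant pair is a fair coin flip,
    so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
print(mcnemar_exact(1, 9))   # strongly one-sided disagreement
print(mcnemar_exact(5, 5))   # perfectly balanced disagreement
```

Items on which both systems agree do not enter the statistic, which is why the test suits paired designs with a small number of adjudicated items.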

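2605.11136 2026-05-13 cs.AI Version update

EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales

Yaolun Zhang, Tianyi Xu, Shengyu Dai, Zhenwen Shao, Qingyun Wu, Huazheng Wang

Affiliations * Oregon State University; University of Wisconsin–Madison; Johnson & Johnson; Pennsylvania State University; AG2AI, Inc.

AI summary This paper proposes EVOCHAMBER, a training-free framework for test-time co-evolution of multi-agent systems at the individual, team, and population levels. Its core protocol, CODREAM, has agents reflect collaboratively after a team failure or disagreement and routes the resulting insights asymmetrically across agents, preserving specialization while filling knowledge gaps. Experiments show clear gains on math, coding, and multi-domain reasoning tasks, and several stable specialist agents emerge spontaneously, a structural signature of multi-agent evolution.

Details
English abstract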

We argue that multi-agent test-time evolution is not single-agent evolution replicated N times. A single-agent learner can only evolve its own context and memory. A multi-agent system additionally evolves who collaborates, how they collaborate, and how knowledge flows across the population. These components have no single-agent counterpart and can produce phenomena such as emergent specialization. Yet prior test-time methods either confine experiences to individual agents, forfeiting cross-agent learning, or broadcast symmetrically to all agents, erasing the specialization that makes collaboration valuable. We present EVOCHAMBER, a training-free framework that instantiates test-time evolution at three levels over a coevolving agent pool. At its core is CODREAM (Collaborative Dreaming), a post-task protocol triggered on team failure or disagreement, in which agents collaboratively reflect, distill insights, and route them asymmetrically from strong to weak agents on the failed niche, preserving specialization while filling knowledge gaps. Team-level operators assemble niche-conditioned teams and select collaboration structures online. Population-level lifecycle operators fork, merge, prune, and seed agents under performance pressure. On three heterogeneous task streams with Qwen3-8B, EVOCHAMBER reaches 63.9% on competition math, 75.7% on code, and 87.1% on multi-domain reasoning, outperforming the best baseline by 32% relative on math and confirming asymmetric cross-agent transfer as the primary driver in ablation. Starting from several identically initialized agents, four to five stable niche specialists spontaneously emerge, a structural signature of multi-agent evolution that no single-agent learner can express. See our code at: https://github.com/Mercury7353/EvoChamber

2605.11135 2026-05-13 cs.MA cs.AI cs.LG Updated version

Control Charts for Multi-agent Systems

Hayden Helm, Carey Priebe, Brandon Duderstadt

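Affiliations * Johns Hopkins University; Calcifer Computing

AI summary This paper studies automated monitoring of multi-agent systems and extends the process-theoretic notion of adaptive control charts to replace current monitoring, which relies on qualitative inspection. Simulations show that adaptive control charts are necessary for monitoring multi-agent systems that learn from their environment, and that such charts are vulnerable to adversarial agents that defect slowly. The work identifies a fundamental tradeoff in multi-agent system control: either agents cannot learn, or the system is susceptible to adversaries.

Details
English abstract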

Generative agents have proven to be powerful assistants in a wide variety of contexts. Given this success, users are now deploying agents with minimal restrictions in open-ended, multi-agent environments. Current methods for monitoring the dynamics of open-ended multi-agent systems are limited to qualitative inspection. In this paper, we extend the process-theoretic notion of adaptive control charts to multi-agent systems to enable automated monitoring. Using simulation, we demonstrate that adaptive control charts are necessary for monitoring multi-agent systems that can learn from their environment. We further demonstrate, both empirically and theoretically, that adaptive control charts are susceptible to adversarial agents that defect sufficiently slowly. These results illustrate a fundamental tradeoff in multi-agent system control: either agents in a system cannot learn or the system is susceptible to adversaries.
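The tradeoff described in this abstract can be illustrated with a toy chart. The sketch below is not the authors' construction: it assumes a simple chart whose center line is an exponentially weighted moving average (EWMA) of in-control points, with a fixed `sigma` and a `k`-sigma limit, and the names `adaptive_chart`, `alpha`, `sigma`, and `k` are illustrative. It shows an abrupt shift raising alarms while a slow drift to nearly the same level raises none.

```python
def adaptive_chart(xs, alpha=0.2, sigma=1.0, k=3.0):
    """Return the indices flagged as out of control.

    The center line adapts to every in-control point (EWMA update),
    which is what lets the chart follow a system that legitimately learns.
    """
    center = xs[0]
    alarms = []
    for t, x in enumerate(xs):
        if abs(x - center) > k * sigma:
            alarms.append(t)  # out of control: do not adapt to this point
        else:
            center = (1 - alpha) * center + alpha * x  # adapt the baseline
    return alarms


# Abrupt defection: the statistic jumps from 0 to 5 at step 50.
abrupt = [0.0] * 50 + [5.0] * 50
# Slow defection: the statistic creeps up by 0.01 per step, ending at 4.99.
slow = [0.01 * t for t in range(500)]

abrupt_alarms = adaptive_chart(abrupt)
slow_alarms = adaptive_chart(slow)
print(abrupt_alarms[0], len(abrupt_alarms), slow_alarms)
```

In this toy setting the center line lags a ramp of slope 0.01 by roughly 0.01 / alpha = 0.05, far inside the 3-sigma limit, so the drift is absorbed as if it were learning.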

2605.11114 2026-05-13 cs.RO cs.AI Updated version

SEVO: Semantic-Enhanced Virtual Observation for Robust VLA Manipulation via Active Illumination and Data-Centric Collection

Tianchonghui Fang, Yuan Zhuang, Fei Miao

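Affiliations * School of Computing, University of Connecticut

AI summary This work proposes SEVO (Semantic-Enhanced Virtual Observation), which aims to make vision-language-action (VLA) manipulation on low-cost robots robust across environments. SEVO uses body-fixed cameras covering the manipulation workspace, active red-spectrum illumination that normalizes object appearance, and real-time semantic segmentation that provides a background-invariant cue; combined with a diversified data collection strategy, this markedly improves generalization. Experiments show that, with the policy architecture unchanged, SEVO substantially raises grasp success in both training and novel environments, confirming the importance of observation design and data diversity for reliable low-cost manipulation.

Details
English abstract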

Vision-Language-Action (VLA) and imitation-learning policies trained via community toolchains on low-cost hardware frequently fail when deployed outside the training environment. Existing evaluations, including the original ACT and SmolVLA benchmarks, demonstrate high success rates under controlled, fixed backgrounds, yet community practitioners report near-zero transfer to new environments. We present SEVO (Semantic-Enhanced Virtual Observation), a data-centric approach that improves cross-environment manipulation robustness without modifying the policy architecture. SEVO transforms the raw RGB camera stream through three mechanisms: (1) body-fixed cameras whose combined fields of view cover the full manipulation workspace, (2) active red-spectrum illumination that physically normalizes object appearance, and (3) real-time YOLO segmentation overlay that provides a background-invariant semantic cue. Critically, we show that a diversified data collection protocol (systematically varying lighting, backgrounds, and distractors during teleoperation) is the single most important factor for generalization. We target transparent water bottles, objects that visually blend with their surroundings, and select a simple pick-and-place task to enable hundreds of controlled real-robot trials across two mobile platforms. The full pipeline achieves 95% grasp success with ACT and 83% with SmolVLA in the training environment, transferring to novel environments at 85% and 75%. Without SEVO, the same policies achieve only 75%/70% in training and collapse to 30-35% in novel environments. Our results demonstrate that principled observation design and environmental diversity during data collection, not model scaling, enable low-cost robots to operate reliably in everyday household environments.

2605.11109 2026-05-13 physics.geo-ph cs.AI cs.CV cs.LG Updated version

Deploying Self-Supervised Learning for Real Seismic Data Denoising

Giovanny A. M. Arboleda, Claudio D. T. de Souza, Carlos E. M. dos Anjos, Lessandro de S. S. Valente, Roosevelt de L. Sardinha, Albino Aveleda, Pablo M. Barros, André Bulcão, Alexandre G. Evsukoff

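Affiliations * COPPE Federal University of Rio de Janeiro; CENPES, Petrobras

AI summary This paper examines the feasibility of self-supervised learning (SSL) for denoising real seismic data, evaluating the Noisy-as-Clean (NaC) method under controlled conditions. Using four real datasets of noisy and filtered data, the authors compare NaC with a supervised baseline under identical network topology and hyperparameters. Synthetic additive white Gaussian noise (AWGN) performs poorly within NaC, and denoising quality depends strongly on how well the injected noise matches the actual noise. Fine-tuning the self-supervised model on test data improves performance, whereas supervised models show no such gain. NaC is simple, effective, and model-independent, making it a feasible option for real seismic data denoising.

Details
English abstract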

Self-supervised learning (SSL) has emerged as a promising approach to seismic data denoising as it does not require clean reference data. In this work, the deployment of the Noisy-as-Clean (NaC) method was evaluated for real seismic data denoising under controlled conditions. Two independent seismic acquisitions, each comprising noisy and filtered data, were organized into four real datasets. The NaC SSL method was adapted to add real noise to the noisy input, controlled by a parameter. An experimental protocol with ten experiments was designed to compare different strategies for deploying the NaC SSL method with the supervised learning baseline, using identical network topology and hyperparameters. The models were evaluated in terms of denoising performance, computational cost, and generalization capability. The results show that synthetic additive white Gaussian noise (AWGN) is inadequate for the denoising of seismic data within the NaC method, and performance strongly depends on the compatibility between the injected and actual noise characteristics. Furthermore, both the characteristics of the seismic data and the noise level influence the performance of the model. Self-supervised fine-tuning on test data improved SSL performance, whereas no such gain was observed for fine-tuning of supervised models. Finally, NaC has been shown to be a simple, effective, and model-independent method that offers a feasible solution for the denoising of real seismic data.

2605.11107 2026-05-13 cs.CV cs.AI Updated version

Birds of a Feather Flock Together: Background-Invariant Representations via Linear Structure in VLMs

Youssef Zaazou, Mark Thomas

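Affiliations * Independent Researcher

AI summary This work addresses the susceptibility of vision-language models (VLMs) to background interference in image classification. It proposes a method based on the linear additivity of the embedding space that decomposes scene representations into foreground and background components to build background-invariant representations. Pre-trained on synthetic data, the method achieves what the authors report as the first worst-group accuracy above 90% on Waterbirds under perfect spurious correlation, requires no real-world debiased data, and transfers well from simulation to real data, which makes it practical for deployment.

Comments 36 pages, 7 figures

Details
English abstract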

Vision-language models (VLMs), such as CLIP and SigLIP 2, are widely used for image classification, yet their vision encoders remain vulnerable to systematic biases that undermine robustness. In particular, correlations between foreground objects and their backgrounds constitute a salient and practically important class of spurious dependencies. In this work, we revisit the well-known property of high linear additivity in VLM embedding spaces and show that it enables a decomposition of scene representations into foreground and background components. Leveraging this insight, we introduce a pre-training approach that exploits this property to construct background-invariant representations using synthetic data. Our method achieves, to our knowledge, the first worst-group accuracy exceeding 90% on Waterbirds under perfect (100%) spurious correlation (i.e., no minority-group examples in the training data). Furthermore, it demonstrates strong sim-to-real transfer and requires no access to real-world debiased data, making it practical for real-world deployment.

2605.11102 2026-05-13 cs.LG cs.AI cs.SY eess.SY Updated version

Newton's Lantern: A Reinforcement Learning Framework for Finetuning AC Power Flow Warm Start Models

Shourya Bose, Helgi Hilmarsson, Dhruv Suri

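Affiliations * Pravah

AI summary This work proposes Newton's Lantern, a reinforcement learning framework for fine-tuning warm-start models for the AC power flow problem. By analyzing a lower bound on the Newton-Raphson iteration count, it explains why existing supervised methods generalize poorly on heavily loaded instances near voltage collapse, and on that basis designs a fine-tuning method that combines group relative policy optimization with a learned reward model, using the iteration count as the supervisory signal. Experiments show that the method converges on every test snapshot across several standard benchmarks and attains the smallest mean iteration count.

Details
English abstract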

Neural warm starts can sharply reduce the number of Newton-Raphson iterations required to solve the AC power flow problem, but existing supervised approaches generalize poorly on heavily loaded instances near voltage collapse. We prove a lower bound on the Newton-Raphson iteration count that depends on the direction of the warm start error rather than on its magnitude, and show as a corollary that the bound becomes vacuous as the smallest singular value of the power-flow Jacobian shrinks, identifying the failure mode of supervised regression near the saddle-node bifurcation. Motivated by this analysis, we introduce Newton's Lantern, a finetuning pipeline that combines group relative policy optimization with a learned reward model trained on perturbations of the base model's predictions, using the iteration count itself as the supervisory signal. Across IEEE 118-bus, GOC 500-bus, and GOC 2000-bus benchmarks, Newton's Lantern is the only method that converges on every test snapshot while attaining the smallest mean iteration count.

2605.11093 2026-05-13 cs.LG cs.AI cs.PF cs.SE cs.SY eess.SY Updated version

Enabling Performant and Flexible Model-Internal Observability for LLM Inference

Nengneng Yu, Sixian Xiong, Yibo Zhao, Wei Wang, Zaoxing Liu

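Affiliations * Department of Computer Science

AI summary Inference-time workloads for large language models increasingly depend on timely access to a model's internal states. This paper presents DMI-Lib, a high-speed deep model inspector that treats internal observability as a first-class systems primitive and decouples it from the inference hot path through an asynchronous observability substrate, the Ring^2 GPU-CPU memory abstraction, and a policy-controlled host backend. Experiments show that DMI-Lib preserves serving optimizations and respects tight GPU memory budgets while cutting observation overhead, reducing latency overhead by 2x to 15x compared with existing baselines.

Details
English abstract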

Today's inference-time workloads increasingly depend on timely access to a model's internal states. We present DMI-Lib, a high-speed deep model inspector that treats internal observability as a first-class systems primitive, decoupling it from the inference hot path via an asynchronous observability substrate built from Ring^2, a GPU-CPU memory abstraction for capturing and staging tensors, and a policy-controlled host backend that exports them. DMI-Lib enables the placement of observation points across a rich space of internal signals and diverse inference backends while preserving serving optimizations and adhering to tight GPU memory budgets. Our experiments demonstrate that DMI-Lib incurs only 0.4%-6.8% overhead in offline batch inference and an average of 6% in moderate online serving, reducing latency overhead by 2x-15x compared to existing baselines with similar observability features. DMI-Lib is open-sourced at https://github.com/ProjectDMX/DMI.

2605.11091 2026-05-13 cs.LG cs.AI Updated version

ASD-Bench: A Four-Axis Comprehensive Benchmark of AI Models for Autism Spectrum Disorder

Shubhankit Singh, Hassan Shaikh, Kuldeep Raghuwanshi, Keshav Bulia

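Affiliations * Research Commons AI IIT Bombay; IIT Delhi

AI summary This paper introduces ASD-Bench, a four-axis benchmark for autism spectrum disorder (ASD) that evaluates AI models across age cohorts. The benchmark covers predictive performance, calibration, interpretability, and adversarial robustness, and tests a range of classical machine learning and deep learning models on 4,068 AQ-10 questionnaire records. Feature importance differs markedly across age cohorts, and the authors conclude that a single performance metric is insufficient for judging the reliability of clinical AI systems.

Comments 20 pages, 12 figures, 8 tables

Details
English abstract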

Automated ASD screening tools remain limited by single-architecture evaluations, axis-restricted assessment, and near-exclusive focus on adult cohorts, obscuring age-specific diagnostic patterns critical for early intervention. We introduce ASD-Bench, a systematic tabular benchmark evaluating ML, deep learning, and foundation model configurations across three age cohorts (children 1-11 yr, adolescents 12-16 yr, adults 17-64 yr) on four axes: predictive performance, calibration, interpretability, and adversarial robustness. Applied to a curated v3 dataset of 4,068 AQ-10 records, our benchmark spans classical models (XGBoost, AdaBoost, Random Forest, Logistic Regression), neural networks (MLP), deep tabular transformers (TabNet, TabTransformer, FT-Transformer), and TabPFN v2. We introduce the Heuristic Aggregate Penalty (HAP): a cost-sensitive metric penalising false negatives more heavily and incorporating cross-validation variance for deployment stability. Adult classification yields high performance (10/17 models achieve perfect F1 and AUC), while adolescents present a harder task (F1 ceiling 0.837 vs. 0.915 for children). Feature hierarchies shift across cohorts: A9 (social motivation) dominates for children, A5 (pattern recognition) leads for adolescents, and adults exhibit a flatter importance profile consistent with developmental social masking. Accuracy and calibration are dissociated: AdaBoost achieves F1=1.000 on adults with ECE=0.302, confirming single-metric evaluation is insufficient for clinical AI. Cohort-specific deployment recommendations are provided. All findings should be interpreted as proof-of-concept evidence on questionnaire-derived labels rather than clinically validated diagnostic performance.
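The dissociation between accuracy and calibration reported above (F1=1.000 with ECE=0.302) is easier to see with a small calculation. The sketch below uses the standard equal-width-bin definition of expected calibration error; the binning scheme, the function name, and the toy inputs are assumptions, not the benchmark's own code or data. A classifier that is always right but reports only 0.7 confidence gets an ECE of about 0.3.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins.

    confidences: predicted probability of the chosen class, in [0, 1]
    correct:     booleans, True where the prediction was right
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((c, ok))

    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(accuracy - avg_conf)  # weighted gap per bin
    return ece


# Hypothetical model: every prediction is correct, but confidence is only 0.7.
ece_underconfident = expected_calibration_error([0.7] * 20, [True] * 20)
# Hypothetical model: same predictions, reported with full confidence.
ece_calibrated = expected_calibration_error([1.0] * 20, [True] * 20)
print(round(ece_underconfident, 3), round(ece_calibrated, 3))
```

Both toy models would score identically on accuracy or F1, which is why a second axis is needed to tell them apart.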

2605.11086 2026-05-13 cs.CR cs.AI cs.LG Updated version

ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks?

Zhun Wang, Nico Schiller, Hongwei Li, Srijiith Sesha Narayana, Milad Nasr, Nicholas Carlini, Xiangyu Qi, Eric Wallace, Elie Bursztein, Luca Invernizzi, Kurt Thomas, Yan Shoshitaishvili, Wenbo Guo, Jingxuan He, Thorsten Holz, Dawn Song

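Affiliations * UC Berkeley; Max Planck Institute for Security and Privacy; UC Santa Barbara; Arizona State University; Anthropic; OpenAI; Google

AI summary This paper introduces ExploitGym, a large-scale benchmark for evaluating how well AI agents can exploit security vulnerabilities. It focuses on turning software vulnerabilities into working attacks, a task that requires low-level program reasoning, runtime adaptation, and long-horizon strategy. ExploitGym contains 898 real-world vulnerability instances from userspace programs, the V8 engine, and the Linux kernel, and measures how various security protections affect agent performance. Experiments show that current frontier models such as Claude Mythos Preview and GPT-5.5 produce working exploits for a non-trivial fraction of instances, highlighting the cybersecurity risks posed by AI.

Details
English abstract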

AI agents are rapidly gaining capabilities that could significantly reshape cybersecurity, making rigorous evaluation urgent. A critical capability is exploitation: turning a vulnerability, which is not yet an attack, into a concrete security impact, such as unauthorized file access or code execution. Exploitation is a particularly challenging task because it requires low-level program reasoning (e.g., about memory layout), runtime adaptation, and sustained progress over long horizons. Meanwhile, it is inherently dual-use, supporting defensive workflows while lowering the barrier for offense. Despite its importance and diagnostic value, exploitation remains under-evaluated. To address this gap, we introduce ExploitGym, a large-scale, diverse, realistic benchmark on the exploitation capabilities of AI agents. Given a program input that triggers a vulnerability, ExploitGym tasks agents with progressively extending it into a working exploit. The benchmark comprises 898 instances sourced from real-world vulnerabilities across three domains, including userspace programs, Google's V8 JavaScript engine, and the Linux kernel. We vary the security protections applied to each instance, isolating their impact on agent performance. All configurations are packaged in reproducible containerized environments. Our evaluation shows that while exploitation remains challenging, frontier models can successfully exploit a non-trivial fraction of vulnerabilities. For example, the strongest configurations are Anthropic's latest model Claude Mythos Preview and OpenAI's GPT-5.5, which produce working exploits for 157 and 120 instances, respectively. Notably, even with widely used defenses enabled, models retain non-trivial success rates. These results establish ExploitGym as an effective testbed for exploitation and highlight the growing cybersecurity risks posed by increasingly capable AI agents.

2605.11051 2026-05-13 cs.SE cs.AI cs.CL cs.LG Updated version

On Problems of Implicit Context Compression for Software Engineering Agents

Kirill Gelvan, Igor Slinko, Felix Steinbauer, Egor Bogomolov, Florian Kofler, Yaroslav Zharov

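Affiliations * JetBrains Research; Technical University of Munich, Germany

AI summary LLM-based software engineering agents face a key bottleneck on complex, long-horizon tasks: limited context length. This paper explores encoding context as continuous embeddings to increase information density, applying the recently proposed In-Context Autoencoder. The method performs well on single-shot common-knowledge and code-understanding tasks but fails on multi-step agentic coding tasks; the paper analyzes this failure and discusses possible causes.

Details
English abstract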

LLM-based Software Engineering agents face a critical bottleneck: context length limitations cause failures on complex, long-horizon tasks. One promising solution is to encode context as continuous embeddings rather than discrete tokens, enabling denser information storage. We apply the recently proposed In-Context Autoencoder for this purpose. While the method performs well on single-shot common-knowledge and code-understanding tasks, our experiments demonstrate that it fails on multi-step agentic coding tasks. In this paper, we explore this phenomenon and discuss possible factors contributing to this failure.

2605.11048 2026-05-13 cs.RO cs.AI Updated version

ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching

Shuoheng Zhang, Yifu Yuan, Hongyao Tang, Yan Zheng, Qiaojun Yu, Pengyi Li, Guowei Huang, Helong Huang, Xingyue Quan, Jianye Hao

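Affiliations * Tianjin University; Huawei Noah's Ark Lab; Shanghai AI Lab

AI summary This paper proposes ForceFlow, a force-aware reactive framework for robot manipulation in complex contact-rich scenarios. Built on flow matching, it fuses force signals with multimodal perception to couple contact force and motion tightly, and splits each task into a vision-dominant stage and a touch-dominant stage, improving robustness and generalization. Experiments on six real-world contact-rich tasks show higher success rates at lower cost, along with strong contact force self-regulation and out-of-distribution generalization.

Details
English abstract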

Existing imitation learning methods enable robots to interact autonomously with the physical environment. However, contact-rich manipulation tasks remain a significant challenge due to complex contact dynamics that demand high-precision force feedback and control. Although recent efforts have attempted to integrate force/torque sensing into policies, how to build a simple yet effective framework that achieves robust generalization under multimodal observations remains an open question. In this paper, we propose ForceFlow, a force-aware reactive framework built upon flow matching. For contact-stage policy design, we investigate force signal fusion mechanisms and adopt an asymmetric multimodal fusion architecture that treats force as a global regulatory signal, combined with a joint prediction paradigm that enhances the policy's understanding of instantaneous force and historical information, thereby achieving deep coupling between force and motion. For task-level hierarchical decomposition, we divide manipulation into a vision-dominant approach stage (VLM-based pointing for target localization) and a touch-dominant interaction stage (force-driven contact execution), with a Vision-to-Force (V2F) handover mechanism that explicitly decouples spatial generalization from contact regulation. Experimental results across six real-world contact-rich tasks demonstrate that ForceFlow achieves a 37% success rate improvement over the strong baseline ForceVLA while maintaining significantly lower cost. Moreover, ForceFlow exhibits accurate force signal prediction and demonstrates superior performance in contact force self-regulation and zero-shot out-of-distribution (OOD) generalization.

2605.11045 2026-05-13 cs.SE cs.AI cs.LG Updated version

Read, Extract, Classify: A Tool for Smarter Requirements Engineering

Paheli Bhattacharya, Manojit Chakraborty, Santhosh Kumar Arumugam, Rishabh Gupta

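Affiliations * Bosch Research and Technology Centre, Bangalore, India

AI summary This paper presents ReXCL, a tool that automates extraction and classification in requirements engineering to make the software development life cycle more efficient. It has two core modules: the extraction module converts raw requirement documents into a predefined schema using heuristics and predictive models, and the classification module labels requirements through adaptive fine-tuning of encoder-based models. Evaluations indicate that ReXCL markedly improves efficiency and accuracy when structuring semi-structured requirement documents, offering a new approach to automating requirements engineering.

Comments Published at Requirements Engineering 2025 Conference

Details
English abstract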

This paper presents the ReXCL tool, which automates the extraction and classification processes in requirements engineering, enhancing the software development life-cycle. The tool features two main modules: Extraction, which processes raw requirement documents into a predefined schema using heuristics and predictive modeling, and Classification, which assigns class labels to requirements using adaptive fine-tuning of encoder-based models. The final output can be exported to external requirement engineering tools. Performance evaluations indicate that ReXCL significantly improves efficiency and accuracy in managing requirements, marking a novel approach to automating the schematization of semi-structured requirement documents.

2605.11042 2026-05-13 cs.GT cs.AI Updated version

Towards Model-Free Learning in Dynamic Population Games: An Application to Karma Economies

Matteo Cederle, Saverio Bolognani, Gian Antonio Susto

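Affiliations * Department of Information Engineering; Automatic Control Laboratory; ETH Zurich

AI summary This paper studies model-free equilibrium learning in Dynamic Population Games (DPGs), with an application to Karma economies, a class of fair non-monetary resource allocation mechanisms. The authors propose a learning method based on Deep Q-Networks (DQN) that lets an agent newly joining the game learn a near-Nash-equilibrium policy from its own experience without knowing the game model, and they derive a corresponding suboptimality bound. They also show that combining deep reinforcement learning with fictitious play and policy iteration can learn, model-free and from scratch, a configuration close to the centrally computed Stationary Nash Equilibrium, supporting the practical use of Karma economies.

Details
English abstract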

Dynamic Population Games (DPGs) provide a tractable framework for modeling strategic interactions in large populations of self-interested agents, and have been successfully applied to the design of Karma economies, a class of fair non-monetary resource allocation mechanisms. Despite their appealing theoretical properties, existing computational tools for DPGs assume full knowledge of the game model and operate in a centralized fashion, limiting their applicability in realistic settings where agents have access only to their own private experience. This paper takes a step towards addressing this gap by studying model-free equilibrium learning in Karma DPGs. First, we analyze the setting in which a novel agent joins a Karma DPG already at its Stationary Nash Equilibrium (SNE) and learns a policy via Deep Q-Networks (DQN) without knowledge of the game model. Leveraging recent convergence results for DQN, we establish a suboptimality bound consisting of a DQN approximation error of order $O(1/\sqrt{N_s})$ and a mean field perturbation error of order $O(1/N)$, where $N_s$ is the replay buffer size and $N$ is the population size. Second, we consider the challenging problem of learning the SNE from scratch. We show empirically that combining deep RL with fictitious play and smoothed policy iteration allows agents to converge, in a model-free fashion, to a configuration close to the centrally computed SNE. Together, these contributions support the vision of Karma economies as practical tools for fair resource allocation.

2605.11039 2026-05-13 cs.CR cs.AI Updated version

The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck

Linfeng Fan, Ziwei Li, Yuan Tian, Yichen Wang, Rongsheng Li, Xiong Wang

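Affiliations * Gaoling School of Artificial Intelligence, Renmin University of China; King Abdullah University of Science and Technology; Dongbei University of Finance and Economics; University of Science and Technology of China

AI summary LLM agents that process untrusted webpages, emails, files, and API outputs struggle to balance security and utility. This paper proposes PACT, a defense based on argument-level provenance: it assigns semantic roles to tool arguments, tracks where each value came from, and checks that each argument's origin is trustworthy enough for its role, improving utility without giving up security. Experiments show that PACT achieves higher security and utility across several model deployments and addresses the coarse granularity of earlier approaches.

Details
English abstract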

Tool-using LLM agents must act on untrusted webpages, emails, files, and API outputs while issuing privileged tool calls. Existing defenses often mediate trust at the granularity of an entire tool invocation, forcing a brittle choice in mixed-trust workflows: allow external content to influence a call and risk hijacked destinations or commands, or quarantine the call and block benign retrieval-then-act behavior. The key observation behind this paper is that indirect prompt injection becomes dangerous not when untrusted content appears in context, but when it determines an authority-bearing argument. We present PACT (Provenance-Aware Capability Contracts), a runtime monitor that assigns semantic roles to tool arguments, tracks value provenance across replanning steps, and checks whether each argument's origin satisfies its role-specific trust contract. Under oracle provenance, PACT achieves 100% utility and 100% security on mixed-trust diagnostic suites, while flat invocation-level monitors incur false positives or false negatives. In full AgentDojo deployments across five models, PACT reaches 100% security on the three strongest models while recovering 38.1-46.4% utility, 8-16 percentage points above CaMeL at the same security level. Ablations show that both semantic roles and cross-step provenance are necessary. PACT reframes agent security as authority binding, and isolates the remaining deployment bottleneck to provenance inference and contract synthesis.
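The difference between invocation-level and argument-level checks can be sketched in a few lines. This is an illustration only, not the PACT implementation: the role names, the two provenance labels, and the `check_call` helper are hypothetical, and provenance is assumed to be given (the oracle setting mentioned in the abstract).

```python
from dataclasses import dataclass

TRUSTED = "user"        # value supplied by the principal
UNTRUSTED = "external"  # value read from a webpage, email, file, or API output

# Hypothetical role table: roles that carry authority over what the call does.
AUTHORITY_ROLES = {"destination", "command"}


@dataclass(frozen=True)
class Arg:
    value: str
    role: str    # semantic role of the argument, e.g. "destination" or "content"
    origin: str  # provenance label carried along with the value


def check_call(args: dict[str, Arg]) -> list[str]:
    """Return the names of arguments whose origin violates their role's contract."""
    return [
        name
        for name, arg in args.items()
        if arg.role in AUTHORITY_ROLES and arg.origin != TRUSTED
    ]


# Retrieval-then-act: untrusted text is only the payload, so the call passes.
benign = {
    "to": Arg("alice@example.com", "destination", TRUSTED),
    "body": Arg("summary of a fetched page", "content", UNTRUSTED),
}
# Injection: untrusted text chose the recipient, so the call is flagged.
hijacked = {
    "to": Arg("attacker@example.com", "destination", UNTRUSTED),
    "body": Arg("summary of a fetched page", "content", UNTRUSTED),
}

benign_violations = check_call(benign)
hijacked_violations = check_call(hijacked)
print(benign_violations, hijacked_violations)
```

A flat monitor that looks only at whether the call touched untrusted data would treat both calls the same, blocking the benign one or allowing the hijacked one.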

2605.11036 2026-05-13 cs.CR cs.AI Updated version

Sequential Behavioral Watermarking for LLM Agents

Hyeseon An, Shinwoo Park, Dongsu Kim, Yo-Sub Han

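Affiliations * Department of Computer Science

AI summary This paper studies how to watermark the behavior of LLM-based agents, whose trajectories are otherwise hard to trace to a source. Because conventional text watermarks cannot capture an agent's action-level decisions, the authors propose SeqWM, a sequential behavioral watermark that embeds signals in history-conditioned transition patterns and verifies them without relying on trajectory alignment. Experiments show that SeqWM preserves agent utility while detecting the watermark reliably and remaining robust to trajectory perturbations.

Comments 17 pages, 3 figures, preprint

Details
English abstract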

LLM-based agents act through sequences of executable decisions, but their trajectories provide little evidence of which agent or policy produced them, making provenance, ownership, and unauthorized reuse difficult to establish from observed behavior alone. This motivates watermarking signals embedded directly into agent behavior rather than only into generated text, since text watermarking cannot capture the action-level decisions that define agent execution. Recent agent watermarking methods address this gap by moving the watermark from generated text to behavioral choices. However, by treating each action step as an independent trial, they overlook trajectory structure and become fragile when trajectories are perturbed, truncated, or observed without reliable alignment. We propose SeqWM, a sequential behavioral watermarking framework that embeds signals into history-conditioned transition patterns and verifies trajectories position-agnostically against random-key baselines. Experiments across diverse agent benchmarks and LLM backbones show that SeqWM consistently achieves reliable detection while preserving agent utility, and remains robust under trajectory corruption where round-indexed behavioral watermarks collapse.

2605.11034 2026-05-13 cs.CR cs.AI cs.LG cs.PF Updated version

MambaNetBurst: Direct Byte-level Network Traffic Classification without Tokenization or Pretraining

Gayan K. Kulatilleke, Siamak Layeghy, Mahsa Baktashmotlagh, Marius Portmann

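Affiliations * University of Queensland, Brisbane, Australia

AI summary MambaNetBurst is a compact byte-level network traffic classifier that needs neither tokenization nor pre-training and applies a Mamba-2 backbone directly to raw packet bytes. It builds its input from fixed-length traffic bursts and combines a learnable CLS token with residual, pre-normalized Mamba-2 blocks for end-to-end classification, performing strongly on several public datasets and outperforming many more complex pre-trained models. The study also shows that preserving byte-level temporal resolution and using moderate state sizes are critical, and that Mamba-2 is both efficient and effective.

Comments 16 pages, 2 figures. Pareto-optimal frontier. Transformer vs Mamba vs Mamba-2 scaling performance. Code and data available on request

Details
English abstract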

We present MambaNetBurst, a compact tokenizer-free byte-level sequence classifier for network burst classification based on a Mamba-2 backbone. In contrast to most recent strong traffic-classification and intrusion-detection approaches, our method operates directly on raw packet bytes, avoids tokenization, patching, and heavy engineered multimodal representations, and does not require any self-supervised pre-training stage. Given a packet flow, we form a fixed-length burst from the first few packets, embed the resulting byte sequence, append a learnable CLS token, and process it with a stack of residual pre-normalized Mamba-2 blocks for end-to-end supervised classification. Across six public benchmarks spanning encrypted mobile app identification, VPN/Tor traffic classification, malware traffic classification, and IoT attack traffic, MambaNetBurst achieves consistently strong results and is competitive with, or outperforms, substantially heavier and often pre-trained baselines. Our ablation study shows that preserving byte-level temporal resolution is critical, that early downsampling through striding is consistently harmful, and that moderate state sizes are sufficient for robust generalization. We further show that Mamba-2, despite its more constrained transition structure relative to Mamba-1, remains highly effective for packet-byte modeling while providing clear efficiency advantages, particularly in training speed. Overall, our results demonstrate that direct undiluted byte-to-classification learning with compact selective state space models is a practical, effective and novel direction for efficient, deployable traffic analysis that bypasses the complexity of pre-training pipelines even over highly optimized linear attention architectures.
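The input construction described in this abstract (a fixed-length burst of raw bytes followed by a CLS token) might look roughly like the sketch below. The abstract does not give the burst length, the number of packets, the padding scheme, or the token ids, so `n_packets`, `length`, `PAD_ID`, and `CLS_ID` are all assumptions, and `make_burst` is a hypothetical helper rather than the authors' code. Only the id sequence is built here; the embedding layer and Mamba-2 blocks are out of scope.

```python
CLS_ID = 256  # assumed id for the CLS position, just outside the byte range 0-255
PAD_ID = 257  # assumed id used to pad short bursts


def make_burst(packets, n_packets=4, length=64):
    """Turn the first packets of a flow into a fixed-length id sequence.

    Raw bytes are used directly as ids (no tokenizer, no patching), so each
    position still corresponds to exactly one byte on the wire.
    """
    data = b"".join(packets[:n_packets])[:length]        # truncate long bursts
    ids = list(data) + [PAD_ID] * (length - len(data))   # pad short bursts
    return ids + [CLS_ID]                                # CLS appended at the end


flow = [b"\x45\x00", b"\x00\x28\xab", b"ignored third packet"]
ids = make_burst(flow, n_packets=2, length=8)
print(ids)
```

Keeping one id per byte is what the ablation in the abstract calls preserving byte-level temporal resolution; any striding or patching would happen after this step, if at all.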

2605.11032 2026-05-13 cs.CR cs.AI Updated version

Portable Agent Memory: A Protocol for Cryptographically-Verified Memory Transfer Across Heterogeneous AI Agents

Santhosh Kumar Ravindran

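Affiliations * Microsoft Corporation

AI summary This paper presents Portable Agent Memory, an open protocol and reference implementation for securely transferring persistent memory state across heterogeneous AI agents. Through a structured memory model, capability-based access control, an injection-resistant rehydration mechanism, and an efficient serialization format, the protocol enables cross-platform, verifiable memory migration. Its main contribution is a general, secure mechanism that lets AI agents share and migrate accumulated context between systems while protecting data integrity and access security.

Comments 8 pages, 28 references

Details
English abstract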

We present Portable Agent Memory, an open protocol and reference implementation for transferring persistent memory state across heterogeneous AI agents. Modern AI agents accumulate rich context -- episodic events, semantic knowledge, procedural skills, working state, and identity preferences -- but this context remains locked within vendor-specific runtimes. Portable Agent Memory addresses this through: (1) a five-component structured memory model with content-addressable entries linked by a Merkle-DAG provenance graph providing tamper-evidence; (2) capability-based access control enabling selective, scoped disclosure of memory segments; (3) an injection-resistant rehydration protocol that adapts recalled content to heterogeneous target models while mitigating indirect prompt injection; and (4) a JSON-first serialization format with optional CBOR compaction for efficient transport. We provide a Python SDK with 54 passing tests, agent skills for multiple platforms, and demonstrate cross-model memory transfer between GPT-4, Claude, Gemini, and Llama architectures. The protocol is open-source under Apache 2.0.
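Item (1) above, content-addressable entries linked into a Merkle-DAG, can be sketched with the standard library. This is a generic illustration and does not use the protocol's SDK: the entry layout, the choice of SHA-256 over canonical JSON, and the helper names `entry_id`, `add_entry`, and `verify` are all assumptions.

```python
import hashlib
import json


def entry_id(content: str, parents: list[str]) -> str:
    """Address an entry by the hash of its content and its parents' ids."""
    payload = json.dumps(
        {"content": content, "parents": sorted(parents)}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_entry(store: dict, content: str, parents=()) -> str:
    eid = entry_id(content, list(parents))
    store[eid] = {"content": content, "parents": list(parents)}
    return eid


def verify(store: dict) -> bool:
    """True if every entry still hashes to its id and every parent is present."""
    return all(
        eid == entry_id(e["content"], e["parents"])
        and all(p in store for p in e["parents"])
        for eid, e in store.items()
    )


store = {}
root = add_entry(store, "user prefers metric units")
child = add_entry(store, "converted the recipe to grams", [root])

ok_before = verify(store)
store[root]["content"] = "user prefers imperial units"  # tamper with an ancestor
ok_after = verify(store)
print(ok_before, ok_after)
```

Because each id covers the parents' ids, rewriting an ancestor and re-hashing it would change its id and leave the child pointing at a missing parent, which is the tamper-evidence property the abstract refers to.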

2605.11030 2026-05-13 cs.SE cs.AI cs.MA Updated version

An Executable Benchmarking Suite for Tool-Using Agents

Zhiqing Zhong, Zhijing Ye, Jiamin Wang, Xiaodong Yu

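Affiliations * Stevens Institute of Technology

AI summary This paper presents an executable benchmarking suite for evaluating tool-using agents, designed to separate workloads, action-generating drivers, and the evidence admitted for systems-facing claims. Under a shared evidence-admission contract, the suite integrates WebArena Verified, SWE-Gym, and MiniWoB++ and standardizes task manifests, event schemas, and evaluation pipelines. Its main contributions are an auditable framework for recording evidence and a demonstration of how that framework affects agent-related decisions under different evaluation scenarios.

Comments 20 pages, 2 figures, 20 tables, including appendices

Details
English abstract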

Closed-loop tool-using agents are increasingly evaluated in executable web, code, and micro-task environments, but benchmark reports often conflate workloads, action-generating drivers, and the evidence admitted for systems-facing claims. We present an executable benchmarking suite that makes these objects explicit under a shared evidence-admission contract. The suite connects WebArena Verified, a SWE-Gym slice with SWE-bench-compatible verification, and MiniWoB++ through common workload adapters, task manifests, event schemas, replay/freeze policy, declared drivers, and reporting pipelines. In the canonical release, the gate separates paper-facing evidence from preflight, fixture, smoke, and diagnostic rows while preserving non-admitted artifacts for audit and onboarding. The admitted evidence records latency, invalid-action behavior, patch-generation cost, verifier metadata, replay bindings, and provenance under one auditable contract. The gate is decision-relevant rather than merely clerical: in a separate WebArena Verified controller study, clean-baseline and medium live-stressed evaluation select different fixed controller variants under the same workload and admission contract. The release is scoped as a benchmarking suite and admitted evidence, not a new agent policy, model leaderboard, backend comparison, or autonomous SWE-bench solver.

2605.11029 2026-05-13 cs.CR cs.AI Updated version

FragBench: Cross-Session Attacks Hidden in Benign-Looking Fragments

Astha Mehta, Niruthiha Selvanayagam, Cedric Lam, Hengxu Li, Phuc-Nguyen Nguyen, Raymond Lee, Olivia McGoffin, My Luong, Arthur Collé, Jamie Johnson, David Williams-King, Linh Le

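Institutions * New York University; Tufts University; Distributed Systems; ERA; Lida Safety; Mila

AI summary This paper proposes FragBench, a new benchmark for detecting cross-session attacks, which evaluates the safety of large language models when a malicious goal is split into fragments that each look harmless. The benchmark is built from 24 real-world cyber-incident campaigns and includes multi-fragment attack chains, per-fragment safety verdicts, sandboxed execution traces, and matched benign sessions as controls. By building an adversarial rewriter and a graph-based detection model, the study shows that cross-session features are essential for detecting such attacks and reports high detection performance for several models on this task.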

Comments preprint of submission

Details
English abstract

An attacker can split a malicious goal into sub-prompts that each look benign on their own and only become harmful in combination. Existing LLM safety benchmarks evaluate prompts one at a time, or across turns of a single chat, and so do not look for a malicious signal spread across separate sessions with no shared context. We build FragBench, a benchmark drawn from 24 real-world cyber-incident campaigns, which keeps the full attack trail: the multi-fragment kill chain, the per-fragment safety-judge verdicts, sandboxed execution traces, and a matched set of benign cover sessions. FragBench splits this trail into two paired tasks: an adversarial rewriter that hardens fragments against a single-turn safety judge (FragBench Attack), and a graph-based user-level detector trained on the resulting interactions (FragBench Defense). The single-turn judge is near chance on the released corpus by construction, but four GNN variants and three classical-ML baselines all recover the cross-session feature, reaching aggregate event-level F1 = 0.88-0.96. Defending against fragmented LLM misuse therefore requires modeling the cross-session interaction graph, rather than isolated prompts. Our generator, rewriter, sandbox harness, and detector are released at https://github.com/LidaSafety/fragbench.

2605.11027 2026-05-13 cs.SE cs.AI Version update

From Code-Centric to Intent-Centric Software Engineering: A Reflexive Thematic Analysis of Generative AI, Agentic Systems, and Engineering Accountability

Elyson De La Cruz

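Institutions * University of the Cumberlands

AI summary Using reflexive thematic analysis and interpretative phenomenological analysis, this paper examines how generative AI and agentic systems are shifting software engineering from a code-centric to an intent-centric paradigm. The study synthesizes multiple sources, including academic papers, technical announcements, public talks, and social media, and finds that generative AI lowers the cost of producing code while raising the demands on intent specification, context management, system verification, and governance. It argues that software engineering is moving from isolated code authorship toward supervising and governing human-agent systems, with significant implications for technical debt and accountability mechanisms.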

Comments 24 pages, 6 tables

Details
English abstract

Generative artificial intelligence (GenAI) and agentic systems are moving software engineering from code-centric production toward intent-centric human-agent work in which natural language, repository context, tools, tests, and governance shape delivery. Prior studies examine code generation, AI pair programming, and software engineering agents, but less is known about how public technical discourse and peer-reviewed evidence together frame the profession's near-term transition. This study addresses that gap through a public-discourse and document analysis that is dominated by reflexive thematic analysis (RTA) and informed by interpretative phenomenological analysis (IPA). The corpus combines peer-reviewed software engineering and AI literature, technical benchmarks, public talks and interviews, essays, product-facing technical announcements, and X-originated discourse from prominent AI and software engineering voices. Sources were organized through a corpus register, codebook, coding matrix, theme-to-source traceability table, DOI/reference audit, and reproducibility protocol. The analysis shows that GenAI lowers the cost of producing plausible code while increasing the importance of intent specification, context curation, architecture knowledge, verification, security, provenance, governance, and accountable human judgment. The findings indicate that software engineering is becoming less about isolated code authorship and more about supervising, validating, and governing socio-technical systems of humans, agents, tools, and evidence gates. This matters because speed-focused adoption can accumulate hidden technical debt and accountability gaps, whereas bounded autonomy can preserve quality, security, maintainability, and trust.

2605.11022 2026-05-13 q-bio.GN cs.AI cs.ET cs.LG Version update

SCOPE: Siamese Contrastive Operon Pair Embeddings for Functional Sequence Representation and Classification

Akarsh Gupta, Kenneth Rodrigues, Sagnik Chatterjee

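AI summary This study proposes SCOPE, a Siamese contrastive method for embedding operon pairs, for representing and classifying functional sequences. By classifying over a fused embedding space, the method reaches a ROC-AUC of 0.71 on operonic pair classification, comparable to current state-of-the-art models. The study finds that protein language model embeddings already capture functional relationships effectively, offering a viable and scalable approach to operon identification across large-scale microbial genomes.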

Details
English abstract

Identifying operons is a fundamental step in understanding prokaryotic gene regulation, as classifying genes into operons supports the reconstruction of regulatory networks, functional annotation of unannotated genes, and drug candidate development. Experimental approaches such as RT-PCR and RNA-seq provide precise evidence of operon structure, but are laborious and largely limited to well-studied model organisms, making scalable computational methods essential for genome-wide operon identification. Prior computational approaches have employed traditional classifiers such as logistic regression and decision trees, motivating our use of these as physicochemical baselines. The DGEB benchmark evaluates operonic pair classification by embedding each sequence independently with a pre-trained protein language model and computing pairwise cosine similarity. In contrast, our Siamese MLP learns a classifier over the fused embedding space, which is theoretically better motivated for binary classification, as cosine similarity can yield meaningless scores depending on the regularization of the embedding model. While protein language model embeddings substantially outperform physicochemical features in ROC-AUC, a learned Siamese MLP head does not significantly improve over unsupervised cosine similarity in Average Precision, suggesting that the geometry of the embedding space already captures the functional relationships needed for this task. Nonetheless, our Siamese MLP achieves a ROC-AUC of 0.71, competitive with state-of-the-art models on the DGEB leaderboard. These findings indicate that protein language model embeddings are a viable, scalable foundation for operonic pair classification across diverse microbial genomes, with implications for automated genome annotation, regulatory network reconstruction, and characterization of organisms lacking experimental operon annotations.
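The contrast between unsupervised cosine scoring and a classifier over a fused pair representation can be illustrated with a minimal sketch. The toy embeddings and the `fuse_pair` function are assumptions made for illustration; the abstract does not say how SCOPE fuses the two embeddings.

```python
import math

def cosine_similarity(u, v):
    """Unsupervised pair score: cosine of the angle between two embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def fuse_pair(u, v):
    """Hypothetical symmetric fusion: elementwise |u - v| followed by u * v.

    A learned head (for example an MLP) would take this vector as input.
    The fusion actually used by the paper is not given in the abstract.
    """
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return diff + prod

# Toy 3-dimensional stand-ins for protein language model embeddings.
gene_a = [1.0, 0.0, 2.0]
gene_b = [2.0, 0.0, 4.0]   # same direction as gene_a
gene_c = [0.0, 3.0, 0.0]   # orthogonal to gene_a

score_ab = cosine_similarity(gene_a, gene_b)
score_ac = cosine_similarity(gene_a, gene_c)
features_ab = fuse_pair(gene_a, gene_b)
```

Because the fusion is symmetric, swapping the two genes leaves the features unchanged, which is a natural property for a pair classifier.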

2605.11020 2026-05-13 cs.LG cs.AI cs.RO Version update

Trust Region Inverse Reinforcement Learning: Explicit Dual Ascent using Local Policy Updates

Anish Diwan, Davide Tateo, Christopher E. Mower, Haitham Bou-Ammar, Jan Peters, Oleg Arenz

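Institutions * Technical University of Darmstadt; Lund University; German Research Center for AI (DFKI); Robotics Institute Germany (RIG); Huawei, Noah’s Ark Lab; University College London

AI summary This paper proposes Trust Region Inverse Reinforcement Learning (TRIRL), an inverse reinforcement learning method that achieves monotonic improvement of the reward function and policy without fully solving a reinforcement learning problem at every iteration. Its core idea is to use trust-region policy optimization to search locally around the current policy and thereby explicitly optimize the dual objective. The method preserves monotonic dual improvement while avoiding the training instabilities of adversarial methods, performs strongly on several complex tasks, and learns reward functions that are robust to shifts in system dynamics.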

Comments Accepted as a conference paper at the International Conference on Machine Learning (ICML) 2026

Details
English abstract

Inverse reinforcement learning (IRL) is typically formulated as maximizing entropy subject to matching the distribution of expert trajectories. Classical (dual-ascent) IRL guarantees monotonic performance improvement but requires fully solving an RL problem each iteration to compute dual gradients. More recent adversarial methods avoid this cost at the expense of stability and monotonic dual improvement, by directly optimizing the primal problem and using a discriminator to provide rewards. In this work, we bridge the gap between these approaches by enabling monotonic improvement of the reward function and policy without having to fully solve an RL problem at every iteration. Our key theoretical insight is that a trust-region-optimal policy for a reward function update can be globally optimal for a smaller update in the same direction. This smaller update allows us to explicitly optimize the dual objective while only relying on a local search around the current policy. In doing so, our approach avoids the training instabilities of adversarial methods, offers monotonic performance improvement, and learns a reward function in the traditional sense of IRL--one that can be globally optimized to match expert demonstrations. Our proposed algorithm, Trust Region Inverse Reinforcement Learning (TRIRL), outperforms state-of-the-art imitation learning methods across multiple challenging tasks by a factor of 2.4x in terms of aggregate inter-quartile mean, while recovering reward functions that generalize to system dynamics shifts.

2605.11019 2026-05-13 cs.LG cs.AI Version update

Efficient LLM Reasoning via Variational Posterior Guidance with Efficiency Awareness

Zizhao Chen, Yuying Li, Siting Lin, Lianxi Wang

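Institutions * Guangdong University of Foreign Studies

AI summary Although large language models rely on chain-of-thought for complex reasoning, overthinking severely degrades inference efficiency. Inspired by cognitive science, this paper proposes VPG-EA, an efficient reasoning framework based on variational posterior guidance. It introduces an efficiency-aware evidence lower bound to formalize efficient reasoning as a variational inference problem, and uses a parameter-shared dual-stream architecture to transfer efficient patterns from the posterior distribution to the prior policy via variational distillation. Experiments show that the method significantly improves the comprehensive efficiency metric across model scales.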

Details
English abstract

Although large language models rely on chain-of-thought for complex reasoning, the overthinking phenomenon severely degrades inference efficiency. Existing reinforcement learning methods compress reasoning chains by designing elaborate reward functions, which renders high-quality samples extremely sparse in the exploration space and creates a sampling bottleneck for the prior policy. Inspired by cognitive science, we theoretically prove that a posterior distribution guided by reference answers achieves higher expected utility than the prior distribution and is thus capable of breaking through the sampling bottleneck of high-quality samples. However, the posterior distribution is unavailable during inference. To this end, we formalize efficient reasoning as a variational inference problem and introduce an efficiency-aware evidence lower bound as the theoretical foundation. Based on this, we propose the VPG-EA framework. It adopts a parameter-shared dual-stream architecture to instantiate both the posterior distribution and the prior policy; after filtering out pseudo-efficient paths via cross-view evaluation, it unidirectionally transfers the posterior's efficient patterns to the prior policy through variational distillation. Experiments on DeepSeek-R1-Distill-Qwen-1.5B and 7B scales demonstrate that VPG-EA improves the comprehensive efficiency metric $\epsilon^3$ by 8.73% and 12.37% over the strongest baselines on each model size, respectively.

2605.11017 2026-05-13 cs.LG cs.AI cs.IR Version update

Simpson's Paradox in Behavioral Curves: How Aggregation Distorts Parametric Models of User Dynamics

Chao Zhou

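Institutions * Meta Platforms, Inc.

AI summary This paper studies a systematic bias in parametric models of user behavioral curves caused by data aggregation, namely Simpson's paradox in behavioral curves. It finds that individual users' behavioral peaks differ markedly from the aggregate curve, a distortion driven mainly by survival bias. The paper proposes Synthetic Null Calibration to reduce false positives in per-user classification and notes that the phenomenon has broad implications for recommender systems, advertising, and clinical dosing.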

Comments Submitted to NeurIPS 2026

Details
English abstract

Behavioral curve modeling -- fitting parametric functions to engagement-versus-exposure data -- is standard practice in recommendation, advertising, and clinical dosing. We show that aggregation introduces a systematic distortion: Simpson's paradox in behavioral curves. On Goodreads (3.3M users, 9 genres), individual users peak at $n^* \approx 11$ exposures while the aggregate peaks at $n^* \approx 34$ -- a 3x gap driven by survival bias. Amazon Electronics (18M reviews) shows a 5.3x distortion. MovieLens-25M ($D \approx 1$) serves as a negative control, confirming that survival bias -- not aggregation per se -- is the operative mechanism. The distortion is robust to category granularity, engagement operationalization, and classifier calibration. We develop Synthetic Null Calibration to address a 32% false positive rate in per-user classification. Our findings apply wherever individual behavioral parameters are estimated from aggregate curves under differential attrition.
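A toy simulation can show how differential attrition alone shifts the aggregate peak. The numbers below are invented for illustration and are not taken from the paper: nine low-engagement users drop out after three exposures, while one high-engagement user persists for six. Every individual curve peaks at the second exposure.

```python
def peak_exposure(curve):
    """Return the 1-based exposure index at which a curve is highest."""
    best_index = max(range(len(curve)), key=curve.__getitem__)
    return best_index + 1

def aggregate_curve(user_curves):
    """Average engagement at each exposure over the users still present."""
    longest = max(len(curve) for curve in user_curves)
    means = []
    for n in range(longest):
        survivors = [curve[n] for curve in user_curves if len(curve) > n]
        means.append(sum(survivors) / len(survivors))
    return means

# Hypothetical data: casual users leave early, one heavy user stays.
casual_users = [[1, 2, 1]] * 9
heavy_user = [[5, 6, 5, 4.5, 4, 3.5]]
all_users = casual_users + heavy_user

individual_peaks = [peak_exposure(curve) for curve in all_users]
aggregate = aggregate_curve(all_users)
aggregate_peak = peak_exposure(aggregate)
```

Here every user peaks at exposure 2, yet the aggregate curve peaks at exposure 4, because only the high-engagement user remains in the average at later exposures.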

2605.11015 2026-05-13 cs.CR cs.AI Version update

DCVD: Dual-Channel Cross-Modal Fusion for Joint Vulnerability Detection and Localization

Wenxin Tang, Wenbin Li, Junliang Liu, Jingyu Xiao, Xi Xiao, Mingzhe Liu, Jinlong Yang, Xuan Liu, Yuehe Ma, Wang Luo, Qing Li, Lei Wang, Peng Xiangli

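Institutions * Tsinghua University; Hunan University; Dalian Maritime University; The Chinese University of Hong Kong; Shenzhen University; Northwestern Polytechnical University; Shandong University; BNU-HKBU United International College; Sun Yat-sen University; Peng Cheng Laboratory; Guangzhou Intelligence Communications Technology Co., Ltd.; The Fifth Electronic Research Institute of MIIT

AI summary Software vulnerability detection plays an important role in system security, and real-world auditing requires not only deciding whether a function is vulnerable but also locating the specific lines of code. Existing methods mostly rely on a single information source or treat statement-level localization as a byproduct of function-level detection, without jointly exploiting cross-modal information. This paper proposes DCVD, a unified framework that extracts control-dependency and semantic features through a dual-channel structure, fuses them across modalities with contrastive alignment and bidirectional cross-attention, and adds explicit supervision at both the function and statement levels, significantly improving vulnerability detection and localization.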

Details
English abstract

Software vulnerability detection plays a critical role in ensuring system security, where real-world auditing requires not only determining whether a function is vulnerable but also pinpointing the specific lines responsible. However, existing approaches either rely on a single information source -- sequential, structural, or semantic -- failing to jointly exploit the complementary strengths across modalities, or treat statement-level localization merely as a byproduct of function-level detection without explicit line-level supervision. To address these limitations, we propose DCVD (Dual-Channel Cross-Modal Vulnerability Detection), a unified framework that performs joint function-level detection and statement-level localization. DCVD extracts control-dependency and semantic features through two parallel branches and integrates them via contrastive alignment coupled with bidirectional cross-attention, effectively bridging the cross-modal representation gap. It further introduces explicit supervision signals at both the function and statement levels, enabling collaborative optimization across the two granularities. Extensive experiments on a large-scale real-world vulnerability benchmark demonstrate that DCVD consistently outperforms state-of-the-art methods on both function-level detection and statement-level localization. Our code is available at https://github.com/vinsontang1/DCVD.

2605.11014 2026-05-13 cs.LG cs.AI Version update

Backbone-Equated Diffusion OOD via Sparse Internal Snapshots

Yadang Alexis Rouzoumka, Jean Pinsolle, Eugénie Terreaux, Christèle Morisseau, Jean-Philippe Ovarlez, Chengfang Ren

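Institutions * ONERA SONDRA Université Paris-Saclay; CentraleSupélec

AI summary This paper proposes MBE, a fair-comparison protocol that addresses inconsistent evaluation of diffusion models on out-of-distribution (OOD) detection caused by differences in backbone, noise parameterization, and inference budget. It introduces Canonical Feature Snapshots (CFS), a detection method based on sparse internal activations that needs only a small number of internal activations from a frozen diffusion model for efficient OOD detection. Experiments show that CFS performs strongly on a CIFAR-scale benchmark and that its performance depends mainly on a few sparse states rather than on full denoising trajectories or complex downstream modules. The paper also offers a theoretical explanation that relates internal states under low-noise conditions to encoder-decoder complementarity.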

Details
English abstract

Fair comparison between diffusion-based OOD detectors is challenging, as conclusions can vary with backbone choice, corruption parameterization, and test-time budget. We address this issue through a Mutualized Backbone-Equated (MBE) protocol that aligns canonical corruption levels and logical test-time cost across diffusion backbones. Within this setting, we introduce Canonical Feature Snapshots (CFS), a family of detectors that probes a frozen diffusion backbone using only a tiny number of native internal activations at canonical low-noise levels. On a controlled CIFAR-scale benchmark, the strongest one-forward CFS variant is CFS(1x2), while an even smaller decoder-only variant remains highly competitive. This shows that much of the relative-OOD signal exposed by frozen diffusion backbones is concentrated in a small number of sparse internal states, rather than requiring full denoising trajectories or high-capacity downstream heads. We further provide a local diagnostic theory explaining these observations through conditional encoder-decoder complementarity, diagonal-score separation, and low-noise corruption stability. The official implementation is available at https://github.com/RouzAY/cfs-diffusion-ood/.

2605.11011 2026-05-13 cs.LG cs.AI Version update

LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models

Taekhyun Park, Yongjae Lee, Dohee Kim, Hyerim Bae

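Institutions * Department of Data Science; Department of Industrial Engineering; Pusan National University; Changwon National University

AI summary LoopUS is a post-training framework that converts a pretrained large language model (LLM) into a looped latent refinement model to improve its reasoning ability. Through core components including model decomposition, a selective gating mechanism, random deep supervision, and a confidence head, it turns a standard model into a stable looped architecture without damaging existing capabilities. LoopUS effectively mitigates computational bottlenecks and representation collapse and significantly improves reasoning performance.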

Details
English abstract

Looped computation shows promise in improving the reasoning-oriented performance of LLMs by scaling test-time compute. However, existing approaches typically require either training recurrent models from scratch or applying disruptive retrofits, which involve substantial computational costs and may compromise pretrained capabilities. To address these limitations, we introduce Looped Depth Up-Scaling (LoopUS), a post-training framework that converts a standard pretrained LLM into a looped architecture. As a key technical contribution, LoopUS recasts the pretrained LLM into an encoder, a looped reasoning block, and a decoder. It operationalizes this latent-refinement architecture through four core components: (1) block decomposition, guided by staged representation dynamics; (2) an input-dependent selective gate to mitigate hidden-state drift; (3) random deep supervision for memory-efficient learning over long recursive horizons; and (4) a confidence head for adaptive early exiting. Collectively, these mechanisms transform a standard non-looped model into a looped form while stabilizing it against both computational bottlenecks and representation collapse. Through stable latent looping, LoopUS improves reasoning-oriented performance without extending the generated traces or requiring recurrent training from scratch. For more details, see https://thrillcrazyer.github.io/LoopUS

2605.11008 2026-05-13 cs.LG cs.AI Version update

When and How to Canonize: A Generalization Perspective

Yonatan Sverdlov, Benjamin Friedman, Snir Hordan, Nadav Dym

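Institutions * Technion – Israel Institute of Technology

AI summary This paper theoretically analyzes the generalization performance of achieving invariance through canonization when processing symmetric data. It introduces a framework based on covering-number bounds, shows that the error bounds of canonized models lie between those of structurally invariant models and non-invariant baselines, and proves that the quality of a canonization depends on its regularity. For point cloud processing, the authors further prove that the covering number of lexicographical sorting grows exponentially with dimension, whereas Hilbert curve canonization guarantees polynomial growth, providing a theoretical basis for the method's success in point cloud architectures.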

Details
English abstract

While invariant architectures are standard for processing symmetric data, there is growing interest in achieving invariance by applying group averaging or canonization to non-invariant backbones. However, the theoretical generalization properties of these alternative strategies remain poorly understood. We introduce a theoretical framework to analyze the generalization error of these methods by bounding their covering numbers. We establish a rigorous generalization hierarchy: the error bounds of canonized models are at best equal to the error bounds of structurally invariant and group-averaged models, and at worst equal to the bounds of non-invariant baselines. Furthermore, we show that there exist optimal canonizations which attain the optimal error bounds, and poor canonizations which attain the non-invariant error bounds, and that this depends on the regularity of the canonization. Finally, applying this framework to permutation groups in point cloud processing, we rigorously prove that the covering number of lexicographical sorting grows exponentially with point cloud dimension, whereas Hilbert curve canonization guarantees polynomial growth. This provides the first formal theoretical justification for the empirical success of Hilbert curve serialization in state-of-the-art point cloud architectures. We conclude with experiments that support our theoretical claims. Code is available at https://github.com/yonatansverdlov/Canonization
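As a minimal sketch of what canonization means for permutation symmetry, the snippet below sorts a point cloud lexicographically before passing it to an order-sensitive function. The `non_invariant_backbone` is a toy stand-in chosen for illustration, not a model from the paper.

```python
def canonize_lex(points):
    """Lexicographic canonization: sort the points so that every ordering
    of the same point cloud maps to the same ordered list."""
    return sorted(points)

def non_invariant_backbone(ordered_points):
    """Toy order-sensitive function standing in for a non-invariant model:
    each point's coordinate sum is weighted by its position in the list."""
    return sum((i + 1) * sum(p) for i, p in enumerate(ordered_points))

def canonized_model(points):
    """Permutation-invariant model obtained by canonizing the input first."""
    return non_invariant_backbone(canonize_lex(points))

cloud = [(0.5, 1.0), (0.1, 2.0), (0.9, 0.3)]
permuted = cloud[::-1]  # same point cloud, different ordering

raw_differs = non_invariant_backbone(cloud) != non_invariant_backbone(permuted)
canonized_agrees = canonized_model(cloud) == canonized_model(permuted)
```

The backbone alone gives different outputs for the two orderings, while the canonized model agrees on them. The abstract's result concerns how well such canonizations generalize, which this sketch does not measure.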

2605.11007 2026-05-13 cs.LG cs.AI Version update

RT-Transformer: The Transformer Block as a Spherical State Estimator

Peter Racioppo

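Institutions * Independent Researcher

AI summary This paper treats the Transformer block as a state estimator on the sphere, showing that its core components (attention, residual connections, and normalization) arise from a single geometric estimation problem. By modeling the latent state as a direction on the hypersphere and defining noise in the tangent plane at the current estimate, the study constructs a precision-weighted directional inference procedure in which attention aggregates evidence, residual connections perform state updates, and normalization projects the updated state back onto the hypersphere. The work shows that these components follow naturally from the geometry of the estimation problem rather than being independent architectural design choices.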

Details
English abstract

We show that the core components of the Transformer block -- attention, residual connections, and normalization -- arise naturally from a single geometric estimation problem. Modeling the latent state as a direction on the hypersphere, with noise defined in the tangent plane at the current estimate, yields a precision-weighted directional inference procedure in which attention aggregates evidence, residual connections implement incremental state updates, and normalization retracts the updated state back onto the hypersphere. Together, these components follow from the geometry of the estimation problem rather than being introduced as independent architectural choices.
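The geometric picture of a residual update followed by normalization can be sketched as a generic tangent-space step with retraction onto the unit sphere. This is a simplified illustration under assumed names (`spherical_step`, `step`), and it omits the precision weighting and attention-based evidence aggregation that the abstract describes.

```python
import math

def normalize(x):
    """Retraction: rescale a nonzero vector onto the unit hypersphere."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def tangent_project(x, g):
    """Project g onto the tangent plane at the unit vector x by
    removing the component of g that points along x."""
    along = sum(a * b for a, b in zip(x, g))
    return [b - along * a for a, b in zip(x, g)]

def spherical_step(x, evidence, step=0.5):
    """Residual-style update in the tangent plane, then normalization."""
    delta = tangent_project(x, evidence)
    updated = [a + step * d for a, d in zip(x, delta)]
    return normalize(updated)

state = normalize([1.0, 0.0, 0.0])
evidence = [0.0, 1.0, 0.0]
new_state = spherical_step(state, evidence)
new_norm = math.sqrt(sum(v * v for v in new_state))
```

The new state has moved toward the evidence direction while staying on the unit sphere, which mirrors the roles the abstract assigns to the residual connection and to normalization.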

2605.11006 2026-05-13 cs.SE cs.AI Version update

An Execution-Verified Multi-Language Benchmark for Code Semantic Reasoning

Yikun Li, Jinfeng Jiang, Ting Zhang, Chengran Yang, Chenxing Zhong, Yin Yide, Leow Wen Bin, Eng Lieh Ouh, Lwin Khin Shar, David Lo

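Institutions * Singapore Management University; Monash University; Nanjing University of Science and Technology; GovTech Singapore

AI summary This paper proposes TraceEval, to the authors' knowledge the first execution-verified multi-language benchmark for code semantic reasoning, which evaluates whether large language models can recover a program's runtime call structure from source code rather than merely generate code that passes tests. Every call edge is verified by actual execution, which removes annotation bias and noise. The benchmark contains 10,583 real-world programs in Python, JavaScript, and Java and provides a reproducible construction pipeline. Experiments show that even the strongest model, Claude-Opus-4.6, reaches only 72.9% average F1 in the zero-shot setting, demonstrating the difficulty of the task.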

Details
English abstract

Evaluating whether large language models (LLMs) can recover execution-relevant program structure, rather than only produce code that passes tests, remains an open problem. Existing code benchmarks emphasize test-passing outputs, from standalone programming tasks (HumanEval, MBPP, LiveCodeBench) to repository repair (SWE-Bench); this is useful, but offers limited diagnostic signal about which program semantics a model can recover from source. We introduce TraceEval, to our knowledge the first execution-verified, multi-language benchmark for code semantic reasoning: recovering a program's runtime call structure from source code. Unlike prior call-graph benchmarks that rely on static-tool output or hand-annotated ground truth, every positive edge in TraceEval is mechanically witnessed by validation execution, eliminating annotator disagreement and label noise for observed behavior. TraceEval consists of (i) 10,583 real-world programs (2,129 test, 8,454 train) extracted from 1,600+ open-source repositories across Python, JavaScript, and Java via an LLM-assisted harness-generation pipeline with tracer validation; and (ii) a reproducible pipeline that converts any open-source repository into new verified benchmark instances. We evaluate 10 LLMs at zero-shot on the held-out test split. The strongest model, Claude-Opus-4.6, reaches an average F1 of 72.9% across the three languages. To demonstrate the train split's utility as a supervision substrate, we fine-tune the Qwen2.5-Coder family on it: lifts of up to +55.6 F1 bring tuned Qwen2.5-Coder-32B to 71.2%, within 1.7 F1 of zero-shot Claude-Opus-4.6. We release the benchmark, pipeline, baselines, and a datasheet at https://github.com/yikun-li/TraceEva
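An F1 score over call edges can be computed as below. This is a standard set-based precision/recall sketch with invented example edges; the benchmark's exact scoring protocol, including how edges that were never observed in execution are treated, is not specified in the abstract and is an assumption here.

```python
def edge_f1(predicted_edges, witnessed_edges):
    """Return (precision, recall, F1) for predicted caller->callee edges
    against edges witnessed during execution."""
    predicted = set(predicted_edges)
    witnessed = set(witnessed_edges)
    true_positives = len(predicted & witnessed)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(witnessed) if witnessed else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical call edges for a small program.
witnessed = {("main", "parse"), ("main", "run"), ("run", "save")}
predicted = {("main", "parse"), ("run", "save"), ("run", "log")}

precision, recall, f1 = edge_f1(predicted, witnessed)
```

In this example two of the three predicted edges are witnessed and two of the three witnessed edges are predicted, so precision, recall, and F1 all equal 2/3.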

2605.11005 2026-05-13 cs.LG cs.AI cs.DC Version update

DisagMoE: Computation-Communication overlapped MoE Training via Disaggregated AF-Pipe Parallelism

Zhichen Zeng, Chi-Chih Chang, Jiayi Wang, Zezhou Wang, Ningxin Zheng, Zheng Zhong, Cesar A. Stuardo, Dongyang Wang, Mohamed S. Abdelfattah, Haibin Lin, Banghua Zhu, Ang Li, Ziheng Jiang

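Institutions * University of Washington; Cornell University

AI summary This paper proposes DisagMoE, a mixture-of-experts (MoE) training system that targets the communication bottleneck faced by expert parallelism in large-scale language model training. It places attention layers and feed-forward network layers in separate GPU groups and introduces a multi-stage pipeline with uni-directional, many-to-many communication, effectively overlapping computation with communication. Experiments show that DisagMoE improves training efficiency on several MoE models, with up to 1.8x speedup on 16-node 8xH800 clusters.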

Details
English abstract

Mixture-of-experts (MoE) architectures enable trillion-parameter LLMs with sparsely activated experts. Expert parallelism (EP) is a widely adopted MoE training strategy, but it suffers from severe all-to-all communication bottlenecks, which are exacerbated by limited inter-node network bandwidth as growing model sizes require distributing experts across GPU nodes. Prior work focused on overlapping these all-to-all communications with feed-forward network (FFN) and self-attention computations, which often leaves residual network-bound stalls due to the inherent imbalance in the computation-communication ratios of attention and FFN layers. We present DisagMoE, a disaggregated MoE training system that jointly optimizes model placement and scheduling for maximal efficiency. DisagMoE separates attention and FFN layers into disjoint GPU groups, introduces a multi-stage pipeline with uni-directional, many-to-many communications, and employs a computation-communication roofline model to balance GPU and network bandwidth allocation among the attention and FFN groups. DisagMoE is implemented on Megatron-LM, and evaluation shows that DisagMoE improves training efficiency across multiple MoE models with up to 1.8x speedup on 16-node 8xH800 clusters.

2605.11003 2026-05-13 cs.CR cs.AI Version update

The Authorization-Execution Gap Is a Major Safety and Security Problem in Open-World Agents

Baoyuan Wu, Qingshan Liu, Adel Bibi, Irwin King, Siwei Lyu

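Institutions * Chinese University of Hong Kong, Shenzhen, China; Nanjing University of Posts and Telecommunications, China; University of Oxford, UK; Chinese University of Hong Kong, China; State University of New York at Buffalo, USA

AI summary This paper argues that the Authorization-Execution Gap (AEG), the divergence between what an agent actually executes and what it was authorized to do, is a major safety and security problem for open-world agents. It analyzes three structural sources of AEG and stresses that more effective defense requires dynamic detection and attribution during execution rather than reliance on upfront filtering or post-hoc audit alone. The work offers new research directions and evaluation criteria for the safe design of open-world agents.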

Details
English abstract

This position paper argues that the Authorization-Execution Gap (AEG) is a major safety and security problem in open-world agents. The AEG is the divergence between what a principal intends to authorize and what an open-world agent ultimately executes. Because such agents act autonomously across tools, persistent state, and multi-agent handoffs, even small instances of authorization divergence can cause harm that is difficult or impossible to undo. We argue that many observed agent failures can be traced to three structural sources of AEG: delegation-level incompleteness, channel-level corruption, and composition-level fragmentation. The same observed failure may arise from any of these sources. Without identifying the source, a defense targeting the symptom alone cannot address the underlying cause. Agent safety and security should therefore emphasize source-oriented diagnosis and defense. Because the structural sources of AEG arise dynamically during execution, this approach necessarily requires authorization integrity checks applied during execution, rather than relying solely on one-shot upfront filtering or post-hoc audit. For NeurIPS, the implication is that papers on open-world agents should report not only outcome-level metrics such as task success or attack resistance, but also process-level evidence showing where AEG was detected, constrained, and attributed to a structural source during execution.

2605.11002 2026-05-13 cs.CR cs.AI Version update

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

Xinkai Zhang, Zhipeng Wei, Huanli Gong, Jing Ting Zheng, Yuchen Zhang, Yue Dong, N. Benjamin Erichson

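Institutions * International Computer Science Institute; Berkeley Lab

AI summary MT-JailBench is a modular benchmarking framework for evaluating multi-turn jailbreak attacks, designed to address the difficulty of comparing methods evaluated under inconsistent experimental conditions. It decomposes an attack into five independently evaluable modules (evaluation function, attack strategy, prompt generation, prompt refinement, and flow control), enabling fair comparison across attack methods and component-level analysis. The study finds that resource budgets and evaluation functions are major factors in attack effectiveness, that prompt generation dominates performance variation while refinement and flow control provide moderate gains, and that stochastic sampling from a fixed strategy can match more elaborate strategies. MT-JailBench provides a tool for understanding multi-turn jailbreak mechanisms and guiding more effective safety testing.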

Details
English abstract

Multi-turn jailbreaks exploit the ability of large language models to accumulate and act on conversational context. Instead of stating a harmful request directly, an attacker can gradually steer the conversation toward an unsafe answer. Recent methods demonstrate this risk, but they are usually evaluated as black-box pipelines with different budgets, judges, retry rules, and strategy generation procedures. As a result, it is often unclear whether reported gains reflect stronger attack mechanisms or different experimental conditions. We introduce MT-JailBench, a modular evaluation framework for benchmarking multi-turn jailbreaks under fixed conditions. MT-JailBench implements each attack as five interacting modules: evaluation function, attack strategy, prompt generation, prompt refinement, and flow control. This design enables fair comparison across attack methods and component-wise analysis of what drives attack success. Using MT-JailBench, we find that resource budgets and evaluation functions are major confounders: controlling turns, retries, interactions, sampled strategies, and judges substantially change the ranking of attacks. At the component level, prompt generation accounts for most performance variation, while refinement and flow control provide moderate gains. We also find that explicit dynamic strategy generation is not always necessary; stochastic sampling from a fixed strategy can rival more elaborate diversification mechanisms. Finally, recomposing the best components yields a strong attack configuration that outperforms its source attacks and generalizes across diverse target LLMs. MT-JailBench therefore provides a modular framework for comparing multi-turn jailbreaks, understanding the impact of components, and guiding stronger red-teaming evaluations.

2605.10999 2026-05-13 cs.LG cs.AI cs.MA Version update

SkillGen: Verified Inference-Time Agent Skill Synthesis

Yuchen Ma, Yue Huang, Han Bao, Haomin Zhuang, Swadheen Shukla, Michel Galley, Xiangliang Zhang, Stefan Feuerriegel

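Institutions * Munich Center for Machine Learning, LMU Munich; University of Notre Dame; Microsoft Research

AI summary SkillGen is a multi-agent framework that synthesizes a single auditable skill from trajectories generated by a base agent, improving agent performance without retraining the model. By contrasting successful and failed trajectories, it identifies reusable success patterns and causes of failure, and produces a readable skill description that supports human verification. Its core novelty is modeling skills as interventions on agent behavior and assessing their net effect by comparing performance with and without the skill, which improves performance across multiple tasks and datasets.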

Details
English abstract

Skills are a promising way to improve LLM agent capabilities without retraining, while keeping the added procedure reusable and controllable. However, high-quality skills are still largely written by hand. We introduce SkillGen, a multi-agent framework that synthesizes a single auditable skill from trajectories generated by a base agent. The output is a human-readable artifact that can be inspected before use. Rather than merely summarizing trajectories, SkillGen leverages contrastive induction over both successful and failed trajectories to identify reusable success patterns, recurring failure modes, and behaviors that appear in nearby successes but are missing from failures. SkillGen then generates candidate skills and iteratively refines the skill. A key novelty in SkillGen is that we model agent skills as interventions to empirically verify the net effect of skills on the overall performance. Specifically, we compare outcomes on the same instances with and without the skill, so that we account for both repairs (cases where the skill fixes a baseline failure) and regressions (cases where the skill breaks a baseline success). Across a broad range of agents and datasets, SkillGen consistently improves held-out performance, outperforms existing skill-generation baselines, and produces skills that transfer across models.
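The paired comparison described above, counting repairs and regressions on the same instances, can be sketched as follows. The instance IDs and outcomes are made up for illustration, and the function name `net_effect` is hypothetical; the paper's actual verification procedure may differ.

```python
def net_effect(baseline, with_skill):
    """Compare per-instance success with and without a skill.

    Both arguments map instance IDs to True (success) or False (failure)
    and are assumed to cover the same instances.
    Returns (repairs, regressions, net change in success rate).
    """
    repairs = sum(1 for k in baseline if not baseline[k] and with_skill[k])
    regressions = sum(1 for k in baseline if baseline[k] and not with_skill[k])
    net_rate = (repairs - regressions) / len(baseline)
    return repairs, regressions, net_rate

# Hypothetical outcomes on four held-out instances.
baseline = {"t1": True, "t2": False, "t3": False, "t4": True}
with_skill = {"t1": True, "t2": True, "t3": True, "t4": False}

repairs, regressions, net_rate = net_effect(baseline, with_skill)
```

In this example the skill fixes two baseline failures and breaks one baseline success, for a net gain of 0.25 in success rate. Reporting both counts shows what a single aggregate accuracy figure would hide.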

2605.10998 2026-05-13 cs.CR cs.AI Version update

Few-Shot Truly Benign DPO Attack for Jailbreaking LLMs

Sangyeon Yoon, Wonje Jeung, Yoonjun Cho, Dongjae Jeon, Albert No

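Institutions * Yonsei University

AI summary This study examines the potential risks that fine-tuning with Direct Preference Optimization (DPO) poses to the safety alignment of large language models. It proposes a truly benign DPO attack that needs only 10 harmless preference pairs, reduces the model's refusal of harmful requests, and uses data that is almost indistinguishable from a legitimate user's request. Experiments show high attack success rates on several OpenAI models at very low cost, revealing that DPO can introduce safety risks while improving model usefulness.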

Details
English abstract

Fine-tuning APIs make frontier LLMs easy to customize, but they can also weaken safety alignment during fine-tuning. While prior work shows that benign supervised fine-tuning (SFT) can reduce refusal behavior, deployed fine-tuning pipelines increasingly support preference-based objectives, whose safety risks remain less understood. We show that Direct Preference Optimization (DPO) introduces a stronger and harder-to-audit failure mode. We propose a truly benign DPO attack using only 10 harmless preference pairs, the minimum data scale accepted by OpenAI's fine-tuning service. Each pair contains a benign prompt, a normal helpful answer as the preferred response, and a refusal as the dispreferred response. Unlike prior benign fine-tuning attacks, our data exhibits no suspicious behavior: it is practically indistinguishable from the fine-tuning request of a legitimate user seeking to reduce over-refusal, making harmful intent almost impossible to infer from the request alone. Nevertheless, because DPO directly optimizes the model to prefer helpful answers over refusals, this seemingly benign objective broadly suppresses refusal behavior and transfers to harmful prompts outside the fine-tuning data. Across OpenAI models supporting DPO fine-tuning, our attack achieves attack success rates of 59.13% on GPT-4o, 70.20% on GPT-4.1, 54.80% on GPT-4.1-mini, and 81.73% on GPT-4.1-nano, at costs of only \$1.7, \$1.7, \$0.3, and \$0.1. Moreover, on open-weight models that do not impose minimum data requirements, we find that this effect can emerge from even a single benign preference pair.

2605.10996 2026-05-13 cs.CG cs.AI cs.GR math.OC Version update

Towards Scalable Persistence-Based Topological Optimization

Abderrahim Bendahi, Alexandre Duplessis, Arnaud Fickinger

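Affiliations * École Polytechnique Paris; ENS Ulm, PSL Paris; UC Berkeley

AI summary This work addresses scalability in persistence-based topological optimization, which optimizes a point cloud to minimize a loss defined on its persistence diagram. To overcome the limits imposed by subsampling and gradient sparsity, the authors combine random-slicing subsampling with a fast Gaussian convolution, improving sampling efficiency and producing smoother update fields. Experiments show gains over existing methods in both computational efficiency and optimization quality.

Details
English abstract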

Persistence-based topological optimization deforms a point cloud $X \subset \mathbb{R}^d$ by minimizing objectives of the form $L(X) = \ell(\mathrm{Dgm}(X))$, where $\mathrm{Dgm}(X)$ is a persistence diagram. In practice, optimization is limited by two coupled issues: persistent homology is typically computed on subsamples, and the resulting topological gradients are highly sparse, with only a few anchor points receiving nonzero updates. Motivated by diffeomorphic interpolation, which extends sparse gradients to smooth ambient vector fields via Reproducing Kernel Hilbert Space (RKHS) interpolation, we propose a more scalable pipeline that improves both subsampling and gradient extension. We introduce subsampling via random slicing, a lightweight scheme that promotes iteration-wise geometric coverage and mitigates density bias. We further replace the costly kernel solve with a fast Nadaraya-Watson (NW) Gaussian convolution, producing a globally defined smooth update field at a fraction of the computational cost, while being more suited for topological optimization tasks. We provide theoretical guarantees for NW smoothing, including anchor approximation bounds and global Lipschitz estimates. Experiments in $2$D and $3$D show that combining random slicing with NW smoothing yields consistent speedups and improved objective values over other baselines on common persistence losses.
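The Nadaraya-Watson step described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel with a hypothetical bandwidth `sigma` and computes $v(x) = \sum_i K(x, a_i)\, g_i / \sum_i K(x, a_i)$ over anchor points $a_i$ with sparse gradients $g_i$; the function name and arguments are illustrative and are not taken from the paper's code.

```python
import math

def nw_smooth(points, anchor_idx, anchor_grads, sigma=0.5):
    """Spread sparse anchor gradients to all points with a
    Nadaraya-Watson (kernel-weighted average) Gaussian estimator."""
    anchors = [points[i] for i in anchor_idx]
    field = []
    for x in points:
        # Squared distance from x to every anchor.
        d2 = [sum((xk - ak) ** 2 for xk, ak in zip(x, a)) for a in anchors]
        d2_min = min(d2)  # shift exponents for numerical stability
        weights = [math.exp(-(d - d2_min) / (2 * sigma ** 2)) for d in d2]
        total = sum(weights)
        # Weighted average of the anchor gradients, one coordinate at a time.
        v = [sum(w * g[k] for w, g in zip(weights, anchor_grads)) / total
             for k in range(len(x))]
        field.append(v)
    return field

# Two anchors pulling in opposite directions; the midpoint receives no net update.
pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.0)]
print(nw_smooth(pts, [0, 1], [(1.0, 0.0), (-1.0, 0.0)], sigma=0.1))
```

Unlike an RKHS interpolant, this estimator needs no linear solve, but it only approximates the gradient at the anchors themselves, which is consistent with the abstract's mention of anchor approximation bounds.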

2605.10991 2026-05-13 cs.LG cs.AI Version update

Test-Time Personalization: A Diagnostic Framework and Probabilistic Fix for Scaling Failures

Linhai Zhang, Yulan He

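Affiliations * King's College London; The Alan Turing Institute

AI summary This paper studies Test-Time Personalization (TTP), an emerging direction that scales inference-time computation by sampling multiple candidates from a personalized policy model and selecting the best one with a personalized reward model. It proves that under ideal selection the expected utility grows logarithmically with the number of samples, but that existing reward models fail to realize this potential. The authors derive a unified scaling law that reveals two failure modes and propose a probabilistic personalized reward model that effectively mitigates them. Experiments show that the framework scales consistently across multiple policy models and text generation tasks.

Details
English abstract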

Existing approaches to LLM personalization focus on constructing better personalized models or inputs, while treating inference as a single-shot process. In this work, we study Test-Time Personalization (TTP) along an unexplored axis: scaling inference-time computation by sampling N candidates from a personalized policy model and selecting the best with a personalized reward model. We prove that oracle selection yields expected utility growing logarithmically with the number of sampled candidates, establishing a theoretical ceiling for test-time scaling. However, standard reward models fail to realize this potential. To diagnose why, we derive a unified scaling law that decomposes any reward model's Best-of-N curve into four measurable quantities and reveals two failure modes, user-level collapse (near-constant prediction for some users) and query-level reward hacking (negative correlation with true quality for some queries). Guided by this law, we propose a probabilistic personalized reward model whose learned variance effectively mitigates both failure modes. Experiments confirm both elements of our framework: TTP delivers consistent scaling across multiple policy models and personalized text generation tasks, and our scaling law closely matches observed scaling curves across reward-model variants.
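Best-of-N selection with an oracle can be sketched in a few lines. The simulation below assumes, purely for illustration, that candidate utilities are i.i.d. exponential with rate 1; for that distribution the expected maximum of N draws is the harmonic number $H_N \approx \ln N + 0.577$, which grows logarithmically. The paper's own utility model and proof conditions are not given in the abstract, so this is not a reproduction of its result.

```python
import random

def best_of_n(candidates, reward_fn):
    """Return the candidate that the reward model scores highest."""
    return max(candidates, key=reward_fn)

def mean_best_of_n(n, trials=2000, seed=0):
    """Monte Carlo estimate of the oracle Best-of-N utility when each
    candidate's true utility is an i.i.d. Exp(1) draw (an assumption)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        utilities = [rng.expovariate(1.0) for _ in range(n)]
        total += max(utilities)  # oracle: the reward equals the true utility
    return total / trials

for n in (1, 4, 16, 64):
    print(n, round(mean_best_of_n(n), 2))
```

Replacing the oracle with a noisy or collapsed reward function in `best_of_n` is one simple way to see how an imperfect reward model can flatten the curve.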

2605.10990 2026-05-13 cs.SE cs.AI Version update

Skill Drift Is Contract Violation: Proactive Maintenance for LLM Agent Skill Libraries

Linfeng Fan, Yuan Tian, Ziwei Li, Zhiwu Lu

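Affiliations * Gaoling School of Artificial Intelligence, Renmin University of China; King Abdullah University of Science and Technology

AI summary As LLM agents increasingly rely on reusable skill libraries, these skills silently degrade as the external services, packages, APIs, and configurations they depend on evolve. This paper frames skill drift as contract violation and proposes \sgname{}, which extracts executable environment contracts from skill documents and validates only role-bearing assumptions, yielding a precise maintenance signal. Experiments show that \sgname{} detects skill drift with high precision and recall and substantially raises repair success rates. The paper also releases \dbname{}, an 880-pair benchmark for skill degradation.

Details
English abstract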

LLM agents increasingly rely on reusable skill libraries, but these skills silently decay as the external services, packages, APIs, and configurations they reference evolve. Existing monitors detect such changes at the wrong granularity: they observe values, not the role those values play in a skill. A version string in a comment is noise; the same string in a pinned dependency is an operational obligation. We formulate skill drift as contract violation and introduce \sgname{}, which extracts executable environment contracts from skill documents and validates only those role-bearing assumptions against known or live conditions. This distinction turns noisy monitoring into a precision-first maintenance signal. Contract-free CI probes produce 40\% false positives, while \sgname{} raises zero false alarms over 599 no-drift and hard-negative cases (Wilson 95\% CI $[0,0.6]\%$). In known-drift verification, \sgname{} achieves 100\% precision and 76\% recall with the strongest backbone; in a pre-registered study over 49 real skills, it discovers live drift with 86\% conservative precision. Violated contracts also make repair actionable, improving one-round success from 10\% without localization to 78\%. We release \dbname{}, an 880-pair benchmark for skill degradation.
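The reported Wilson 95\% interval for zero false alarms in 599 cases can be checked with the standard Wilson score formula. The sketch below assumes the usual normal quantile $z = 1.96$; the function name is illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

low, high = wilson_interval(0, 599)
print(f"[{100 * low:.1f}, {100 * high:.1f}]%")
```

With zero observed events the upper bound reduces to $z^2 / (n + z^2) \approx 0.64\%$, which agrees with the abstract's $[0, 0.6]\%$ at one decimal place.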

2605.10988 2026-05-13 cs.LG cs.AI Version update

Seeing the Needle in the Haystack: Towards Weakly-Supervised Log Instance Anomaly Localization via Counterfactual Perturbation

Yutszyuk Wong, Wentai Wu, Yuen-Ying Yeung, Weiwei Lin

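Affiliations * Jinan University, Guangzhou, China; South China University of Technology, Guangzhou, China

AI summary This paper studies how to localize anomalies at the log-instance level in large-scale networked systems when only bag-level labels are available. The authors propose LogMILP, which combines multi-instance learning, prototype guidance, and counterfactual perturbation consistency regularization to achieve anomaly detection and localization under weak supervision. Experiments on several public datasets show strong detection performance and more reliable instance-level localization.

Comments 6 pages, 2 figures

Details
English abstract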

Log anomaly detection is a critical task for system operations and security assurance. However, in networked systems at scale, log data are generated at massive scale while instance-level annotations are prohibitively expensive, posing great difficulties to fine-grained anomaly localization. To address this challenge, we propose LogMILP (Log anomaly localization based on Multi-Instance Learning enhanced by prototypes and Perturbation), a weakly supervised framework that enables both bag-level anomaly detection and instance-level anomaly localization using only bag-level labels. Our method guides the model to pinpoint the critical log entries using prototype-guided structural modeling with counterfactual perturbation consistency regularization, thereby improving localization reliability and interpretability under coarse-grained supervision. Experimental results on three public datasets demonstrate that LogMILP achieves competitive detection performance while yielding significantly more reliable instance-level localization. Our code is open-sourced at https://github.com/YUK1207/LogMILP.

2605.10987 2026-05-13 cs.LG cs.AI cs.CR Version update

AESOP: Adversarial Execution-path Selection to Overload Deep Learning Pipelines

Tingxi Li, Mingfang Ji, Ravishka Shemal Rathnasuriya, Simin Chen, Yitao Hu, Wei Yang

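Affiliations * The University of Texas at Dallas; Tianjin University

AI summary This paper studies efficiency attacks that arise from dynamic path selection in deep learning inference pipelines and proposes AESOP, an adversarial path-selection framework. By combining vulnerability-guided path ranking with adaptive loss weighting, the method amplifies a pipeline's computation and latency, and experiments show markedly stronger attacks in both white-box and gray-box settings. The study reveals a large gap between existing single-model attacks and the dynamic-pipeline setting, and shows that system-level defenses mitigate the attack but cannot fully prevent its impact.

Details
English abstract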

Modern machine learning deployments increasingly compose specialized models into dynamic inference pipelines, where upstream components produce intermediate predictions that determine the workload and inputs of downstream components. The cost of processing an input is therefore not determined by any single model, but by two coupled factors: the per-inference cost of each invoked component and its workload volume. Because these pipelines run under hard real-time constraints, efficiency is a fundamental requirement for system availability. We show that this structure creates an efficiency-attack surface that existing methods targeting single models cannot exploit: on identical inputs and budgets, path-aware targeting inflates FLOPs by $2,407\times$ while the strongest single-model baseline achieves $117\times$ -- a $20\times$ gap attributable entirely to where the attack is directed. We formalize this as the adversarial path-selection problem and present AESOP, a framework combining vulnerability-guided path ranking with adaptive loss weighting. We evaluate AESOP on five pipelines plus a production-realistic deployment variant with batching, bounded buffering, and confidence-threshold defenses. AESOP achieves up to $2,407\times$ FLOPs and $419\times$ latency inflation in the white-box setting and $58\times$ FLOPs / $17\times$ latency inflation in gray-box settings. Under system-level defenses, the attack is not neutralized but redirected: pipelines are forced to choose between throughput collapse ($0.578 \to 0.006$ input/s) and $96.7\%$ data loss to sustain throughput.

2605.10985 2026-05-13 cs.LG cs.AI q-bio.BM Version update

Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning

Siddhant Dutta, Edward Tan Beng Wai, Soumick Sarker, Pasan Gunawardane, Jagath C. Rajapakse

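Affiliations * Nanyang Technological University

AI summary This study proposes an interpretable approach to protein language model representations: it maps ESM-2 representations onto protein contact graphs via differentiable graph partitioning and uses the SoftBlobGIN network to learn functional substructures, improving both predictive performance and interpretability. Without retraining the language model and with only a small number of added parameters, the method performs strongly on enzyme classification, function prediction, and related tasks, and automatically identifies biologically meaningful functional regions such as active-site residues and catalytic contact patterns. Experiments show that the framework markedly improves the accuracy and auditability of structural explanations, providing structure-level transparency for protein language models.

Comments 19 pages, 8 figures, 11 tables, submitted to NeurIPS 2026

Details
English abstract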

Protein language models such as ESM-2 learn rich residue representations that achieve strong performance on protein function prediction, but their features remain difficult to interpret as structural and evolutionary signals are encoded in dense latent spaces. We propose a plug-and-play framework that projects ESM-2 representations onto protein contact graphs and applies SoftBlobGIN, a lightweight Graph Isomorphism Network with differentiable Gumbel-softmax substructure pooling, to perform structure-aware message passing and learn coarse functional substructures for downstream prediction tasks. Across enzyme classification, SoftBlobGIN achieves 92.8\% accuracy and 0.898 macro-F1. Unlike post hoc analysis of protein language models alone, our method produces directly auditable structural explanations: GNNExplainer recovers biologically meaningful active-site residues, spatially localized functional clusters, and catalytic contact patterns. On binding-site detection, SoftBlobGIN improves residue AUROC from $0.885$ using an ESM-2 linear probe to $0.983$, indicating that these structural explanations are not recoverable from language-model features alone. Learned blob partitions provide an additional layer of interpretability by automatically grouping residues into functional substructures, with blobs containing annotated active-site residues showing $1.85\times$ higher importance than other blobs ($\rho = 0.339$, $p = 0.009$), without any active-site supervision. Our framework requires no retraining of the language model, adds only $\sim$1.1M parameters, and generalises across ProteinShake tasks, achieving $F_{\max}$ of $0.733$ on Gene Ontology prediction and AUROC of $0.969$ on binding-site detection. We position this as an interpretable structural companion to protein language models that makes their predictions more transparent and auditable.

2605.10981 2026-05-13 cs.LG cs.AI Version update

$ξ$-DPO: Direct Preference Optimization via Ratio Reward Margin

Zhengyuan Fan, Zhonghua Wu, Yuxuan Du, Qun Chen

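Affiliations * School of Computer Science, Northwestern Polytechnical University

AI summary This paper proposes $ξ$-DPO, a direct preference optimization method that targets the difficulty of hyperparameter tuning in SimPO. By reformulating the objective as minimizing the distance between reward gaps and optimal margins, and by defining the reward as a ratio between the chosen and rejected responses, $ξ$-DPO removes the dependence on the hyperparameter $β$ and yields a more interpretable and stable margin $ξ$. The method avoids repeated tuning and controls the relative separation between preferred and rejected responses more directly, improving the efficiency and interpretability of direct preference optimization.

Details
English abstract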

Reference-free preference optimization has emerged as an efficient alternative to reinforcement learning from human feedback, with Simple Preference Optimization (SimPO) demonstrating strong performance by eliminating the explicit reference model through a simple objective. However, the joint tuning of the hyperparameters $β$ and $γ$ in SimPO remains a central challenge. We argue that this difficulty arises because the margin formulation in SimPO is not easily interpretable across datasets with different reward gap structures. To better understand this issue, we conduct a comprehensive analysis of SimPO and find that $β$ implicitly controls sample filtering, while the effect of $γ$ depends on the reward gap structure of the dataset. Motivated by these observations, we propose $ξ$-DPO: Direct Preference Optimization via Ratio Reward Margin. We first reformulate the preference objective through an equivalent transformation, changing the optimization target from maximizing the likelihood of reward gaps to minimizing the distance between reward gaps and optimal margins. Then, we redefine the reward in a ratio form between the chosen and rejected responses, which effectively cancels the effect of $β$ and yields a bounded and interpretable margin. This margin is called the ratio reward margin and is denoted by $ξ$. Unlike the margin $γ$ in SimPO, $ξ$ explicitly represents the desired relative separation between chosen and rejected responses and can be determined from the initial reward gap distribution, avoiding repeated trial-and-error tuning. ....
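For context, the SimPO objective whose hyperparameters $β$ and $γ$ the abstract discusses is commonly written as below. This is the standard SimPO form, given here only as background; it is not the $ξ$-DPO objective, which the truncated abstract does not spell out. Here $y_w$ and $y_l$ denote the chosen and rejected responses, $|y|$ is the response length in tokens, and $\sigma$ is the logistic function.

```latex
\mathcal{L}_{\mathrm{SimPO}}(\pi_\theta)
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[
      \log \sigma\!\left(
        \frac{\beta}{|y_w|}\log \pi_\theta(y_w \mid x)
        - \frac{\beta}{|y_l|}\log \pi_\theta(y_l \mid x)
        - \gamma
      \right)
    \right]
```

In this form $β$ scales the length-normalized reward gap and $γ$ is an additive target margin, which is why the two interact and are tuned jointly.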

2605.10980 2026-05-13 cs.LG cs.AI Version update

LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection

Haohui Zhang, Zhiye Wang, Xiaoying Gan, Xinbing Wang, Bo Jiang

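Affiliations * Shanghai Jiao Tong University

AI summary This paper proposes LEAP, a method that improves the parallel decoding of diffusion language models (dLLMs) by detecting tokens that converge early. Conventional methods rely on high confidence thresholds to preserve accuracy, which limits parallelism. LEAP uses future context filtering and multi-sequence superposition to identify, without any training, tokens that have already converged to the correct prediction, so they can be decoded earlier, substantially reducing inference latency and the number of decoding steps. Experiments across several domains show that LEAP improves decoding efficiency while preserving model accuracy.

Details
English abstract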

Diffusion Language Models (dLLMs) have garnered significant attention for their potential in highly parallel processing. The parallel capabilities of existing dLLMs stem from the assumption of conditional independence at high confidence levels, which ensures negligible discrepancy between the marginal and joint distributions. However, the stringent confidence thresholds required to preserve accuracy severely constrain the scalability of parallelism. Through systematic token-level statistical analysis, we reveal that a substantial proportion of tokens converge to their correct predictions early in the denoising process yet fail to reach standard confidence thresholds, confirming that current confidence-based criteria are overly conservative. In response, we introduce LEAP (Lookahead Early-Convergence Token Detection for Accelerated Parallel Decoding). LEAP is a training-free, plug-and-play method that leverages future context filtering and multi-sequence superposition to detect early-converging tokens. By validating the alignment between early convergence and correctness, we enable reliable early decoding of these tokens. Benchmarking across diverse domains demonstrates that LEAP significantly lowers inference latency and decoding steps. Compared to confidence-based decoding, the average number of denoising steps is reduced by about 30%. On the GSM8K dataset, combining LEAP with dParallel accelerates decoding to 7.2 tokens per step while preserving model precision. LEAP effectively breaks the reliance on high-confidence priors, offering a novel paradigm for parallel decoding.

2605.10975 2026-05-13 cs.LG cs.AI Version update

Hierarchical Multi-Scale Graph Neural Networks: Scalable Heterophilous Learning with Oversmoothing and Oversquashing Mitigation

Md Sazzad Hossen, Avimanyu Sahoo

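Affiliations * University of Alabama in Huntsville

AI summary This paper studies scalable learning for classification on heterophilous graphs, where adjacent nodes carry different labels. To address the aggregation bias, oversmoothing, and over-squashing that existing graph neural networks exhibit on heterophilous data, it proposes HMH, a hierarchical multi-scale graph neural network framework. The method learns feature- and structure-aware signed affinities, builds a soft graph hierarchy, applies spectral filtering at each level with a sparse orthonormal Haar basis, and combines the levels through skip-connection unpooling layers, mitigating hub domination and long-range signal compression. Experiments show that HMH outperforms existing methods on node and graph classification with near-linear time complexity.

Details
English abstract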

Graphs with heterophily, where adjacent nodes carry different labels, are prevalent in real-world applications, from social networks to molecular interactions. However, existing spectral Graph Neural Network (GNN) approaches tailored for heterophilous graph classification suffer from hub-dominated aggregation (hubs being nodes with large degree) and oversmoothing, as their suboptimal polynomial filters introduce approximation errors and blend distant signals. To address the degree-biased aggregation and suboptimal polynomial filtering, we introduce Hierarchical Multi-view HAAR (HMH), a novel spectral graph-learning framework that scales in near-linear time. HMH first learns feature- and structure-aware signed affinities via a heterophily-aware encoder, then constructs a soft graph hierarchy guided by these embeddings. At each hierarchical level, HMH constructs a sparse, orthonormal, and locality-aware Haar basis to apply learnable spectral filters in the frequency domain. Finally, skip-connection unpooling layers combine outputs from all hierarchical levels back into the original graph, effectively preventing hub domination and the long-range signal bottleneck (over-squashing). Experiments show that HMH outperforms state-of-the-art spectral baselines, achieving up to a 3% improvement on node classification and 7 percentage points on graph classification datasets, all while maintaining linear scalability.

2605.10974 2026-05-13 cs.LG cs.AI Version update

Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization

Navid Rezazadeh, Arash Gholami Davoodi

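Affiliations * University of California, Irvine; Carnegie Mellon University

AI summary This paper proposes Vertex-Softmax, a method for tightening certified verification of transformer attention. It solves the softmax optimization exactly under interval constraints on the pre-softmax scores, proves that the optimum is always attained at a vertex of the constraint box, and builds on this the Vertex-Softmax primitive with log-linear complexity. Experiments show that the method significantly raises certified rates and tightens lower bounds on several datasets, at lower computational cost than existing methods.

Details
English abstract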

Certified verification of transformer attention requires bounding the softmax function over interval constraints on the pre-softmax scores. Existing verifiers relax softmax independently of the downstream objective, leaving avoidable slack. We prove that the exact optimum of this score-box problem is attained at a vertex of the constraint box, and establish a threshold structure theorem showing that, after sorting the objective coefficients, the optimum lies among only linearly many candidates, yielding the Vertex-Softmax primitive with log-linear complexity in the sequence length. We further prove a formal optimality result showing that Vertex-Softmax is the tightest sound bound obtainable from score intervals alone, characterizing precisely what additional structure (score correlations, score-value coupling) is needed for further improvement. Integrated into a CROWN (Convex Relaxation based Optimization for Worst-case Neurons)-style verifier with a formal soundness guarantee, Vertex-Softmax significantly improves certified rates and substantially tightens lower bounds across MNIST, Fashion-MNIST, and CIFAR-10 attention models, while consistently matching or outperforming alpha-CROWN and branch-and-bound baselines at a fraction of their cost.

2605.10973 2026-05-13 cs.LG cs.AI Version update

Rotation-Preserving Supervised Fine-Tuning

Hangzhan Jin, Tianwei Ni, Lu Li, Pierre-Luc Bacon, Mohammad Hamdaqa, Doina Precup

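Affiliations * Mila - Quebec AI Institute; Polytechnique Montréal; Université de Montréal; McGill University; CIFAR AI Chair; Google DeepMind

AI summary Supervised fine-tuning (SFT) improves in-domain performance but can harm out-of-domain generalization. This paper proposes Rotation-Preserving Supervised Fine-Tuning (RPSFT), which preserves projected rotations in the singular subspaces of the pretrained weight matrices as an efficient proxy for Fisher-sensitive directions, limiting unnecessary weight rotation while retaining task adaptation. Experiments on several models trained on math reasoning data show that RPSFT improves the balance between in-domain and out-of-domain performance, better preserves pretrained representations, and provides a stronger initialization for subsequent reinforcement learning fine-tuning.

Comments 31 pages, 13 figures

Details
English abstract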

Supervised fine-tuning (SFT) improves in-domain performance but can degrade out-of-domain (OOD) generalization. Prior work suggests that this degradation is related to changes in dominant singular subspaces of pretrained weight matrices. However, directly identifying loss-sensitive directions with Hessian or Fisher information is computationally expensive at LLM scale. In this work, we propose preserving projected rotations in pretrained singular subspaces as an efficient proxy for Fisher-sensitive directions, which we call Rotation-Preserving Supervised Fine-Tuning (RPSFT). RPSFT penalizes changes in the projected top-$k$ singular-vector block of each pretrained weight matrix, limiting unnecessary rotation while preserving task adaptation. Across model families and sizes trained on math reasoning data, RPSFT improves the in-domain/OOD trade-off over standard SFT and strong SFT baselines, better preserves pretrained representations, and provides stronger initializations for downstream RL fine-tuning. Code is available at https://github.com/jinhangzhan/RPSFT.

2605.10971 2026-05-13 cs.LG cs.AI cs.CL Version update

Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models

Hanhan Zhou, Shamik Roy, Rashmi Gangadharaiah

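Affiliations * AWS AI Labs

AI summary Discrete diffusion language models (DLMs) generate text by parallel denoising, offering an alternative to autoregressive generation. This paper shows that controlled-generation methods carried over from autoregressive models apply a uniform intervention at every denoising step, which degrades generation quality, and that the damage is worse when several attributes are controlled jointly. By training sparse autoencoders, the study finds that different attributes commit at different times, strengths, and rates during denoising. It therefore proposes an adaptive scheduler that concentrates interventions on the steps where an attribute is forming, markedly improving control precision while preserving generation quality, especially on joint multi-attribute control.

Comments preprint, 47 pages

Details
English abstract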

Discrete diffusion language models (DLMs) generate text by iteratively denoising all positions in parallel, offering an alternative to autoregressive models. Controlled generation methods for DLMs, imported from autoregressive models, apply uniform intervention at every denoising step. We show this uniform schedule degrades quality, and the damage compounds when multiple attributes are steered jointly. To diagnose the failure, we train sparse autoencoders on four DLMs (124M-8B parameters) and find that different attributes commit on distinct schedules, varying in timing, sharpness, and magnitude. For instance, topic commits within the first 2\% of denoising, whereas sentiment emerges gradually over 20\% of the process. Consequently, uniform intervention wastes steering capacity on steps where the target attribute has already solidified or has yet to emerge. We propose a novel adaptive scheduler that concentrates interventions on the steps where an attribute is actively forming and leaves the rest of generation untouched. The cost-control trade-off admits a closed-form characterization: the advantage of adaptive over uniform scheduling is governed by a single dispersion statistic of the commitment distribution. Across four DLMs and seven steering tasks, our method achieves precise control without the degradation typical of uniform interventions. In particular, on challenging simultaneous three-attribute control, it reaches up to 93\% steering strength, beating the strongest baseline by up to 15 percentage points while preserving generation quality.

2605.10970 2026-05-13 cond-mat.dis-nn cs.AI Version update

Context-Gated Associative Retrieval: From Theory to Transformers

Moulik Choraria, Argyrios Gerogiannis, Vidhata Jayaraman, Ankur Mani, Lav R. Varshney

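Affiliations * UIUC; University of Minnesota, Twin Cities; Stony Brook University

AI summary This paper proposes a context-gated associative memory retrieval architecture to address the fact that conventional models ignore the influence of external context on recall. A context-gate subcircuit reshapes the energy landscape before and during retrieval, increasing memory separation and inducing sparsity, which markedly improves retrieval. Theoretical analysis shows that the system has a unique self-consistent fixed point, revealing that the retrieval state is driven jointly by a contextual bias and a feedback loop. The theory is then applied to transformers, supporting the view that in-context learning acts as context-gated retrieval.

Details
English abstract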

Hopfield networks and their generalizations have established deep connections among biological associative memories, statistical physics, and transformers. Yet most models treat retrieval as a fixed query-to-memory mapping, ignoring the role of external context in recall. In this work, we propose a two-stage associative memory architecture, wherein a context-gate subcircuit reshapes the retrieval energy landscape before and during recall. We show theoretically that context gating increases inter-memory separation while inducing sparsity, translating into exponential improvements in retrieval. Crucially, we prove that the system admits a unique self-consistent fixed point, revealing that the resulting retrieval state is driven by both a direct contextual bias and a second-order retrieval-gate feedback loop. We then bridge this theory to transformers; specifically, we evaluate a first-order approximation on Llama-3, confirming that in-context learning acts as context-gated retrieval. Native dynamics mirror our theory: context localizes a memory subspace, enabling the zero-shot query to cleanly discriminate. Ultimately, this framework provides a mechanistic link between associative memory theory and LLM phenomenology.

2605.10966 2026-05-13 cs.MM cs.AI Version update

MMTB: Evaluating Terminal Agents on Multimedia-File Tasks

Chiyeong Heo, Jaechang Kim, Junhyuk Kwon, Hoyoung Kim, Dongmin Park, Jonghyun Lee, Jungseul Ok

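Affiliations * GSAI, POSTECH; CSE, POSTECH; National AI Research Lab; Krafton AI

AI summary This paper introduces MMTB (MultiMedia-TerminalBench), a benchmark of 105 tasks for evaluating terminal agents on multimedia-file tasks involving audio and video files. It also proposes Terminus-MM, which extends a terminal agent's perception to multimedia content, enabling a systematic study of how terminal agents perform on multimedia tasks. The work reveals how different forms of multimedia access affect task outcomes and which kinds of evidence agents rely on.

Details
English abstract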

Terminals provide a powerful interface for AI agents by exposing diverse tools for automating complex workflows, yet existing terminal-agent benchmarks largely focus on tasks grounded in text, code, and structured files. However, many real-world workflows require practitioners to work directly with audio and video files. Working with such multimedia files calls for terminal agents not only to understand multimedia content, but also to convert auditory and visual evidence across related files into appropriate actions. To evaluate terminal agents on multimedia-file tasks, we introduce MultiMedia-TerminalBench (MMTB), a benchmark of 105 tasks across 5 meta-categories where terminal agents directly operate with audio and video files. Alongside MMTB, we propose Terminus-MM, a multimedia harness that extends Terminus-KIRA with audio and video perception for terminal agents. Together, MMTB and Terminus-MM support a controlled study of multimedia terminal agents, revealing how different forms of multimedia access shape task outcomes and determine which evidence agents rely on to construct executable terminal workflows. MMTB media and metadata are released at https://huggingface.co/datasets/mm-tbench/mmtb-media

2605.10960 2026-05-13 physics.ao-ph cs.AI Version update

Two Hebrew folk meteorological proverbs tested: rainfall on Rosh Chodesh and Shabbat Mevarechim as predictors of monthly precipitation (Israel, 1950-2024)

Abraham Itzhak Weinberg

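Affiliations * AI-WEINBERG, AI Experts

AI summary This study tests the predictive power of two Hebrew folk meteorological proverbs: "If Rosh Chodesh is rainy, the whole month is rainy" and "If it rains on Shabbat Mevarechim, the whole month is rainy." Analysing winter precipitation data from seven Israeli cities between 1950 and 2024, it finds that both proverbs reflect a real statistical regularity in monthly rainfall, but that their predictive power has weakened over time, possibly because precipitation events are shortening under climate change. The study shows that folk proverbs carry a climatic signal whose reliability is changing in the modern climate.

Details
English abstract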

Folk meteorological proverbs encode centuries of empirical observation by agricultural communities. Two Hebrew proverbs link lunar calendar anchor days to monthly winter rainfall: (i) "If Rosh Chodesh is rainy, the whole month is rainy" and (ii) "If it rains on Shabbat Mevarechim, the whole month is rainy." Shabbat Mevarechim is the last Saturday before each new Hebrew month, preceding Rosh Chodesh by one to seven days. The first proverb is widely known; the second circulates in Hasidic oral tradition with no identified written source. Both have never been formally tested. We analyse 75 years (1950-2024) of daily precipitation data from seven Israeli cities across three climatic regions, comprising 191,758 station-days and 2,422 Hebrew-month observations during the winter rainy season (Marcheshvan-Adar). A rainy Rosh Chodesh increases the probability of a rainy month from 22.2% to 38.6% (lift +16.4 percentage points; chi-square = 57.8, p = 2.9e-14; Bayes factor 1.81). A rainy Shabbat Mevarechim produces a similar effect (lift +16.5 percentage points, p = 8.0e-13), despite preceding Rosh Chodesh by up to seven days. The effect decays with lag and mirrors daily rainfall autocorrelation (r = 0.35-0.44 at lag 1; ~0 at lag 7), consistent with Mediterranean cyclone persistence. A bootstrap permutation test (p < 1e-4) and a 15-year rolling analysis show declining predictive power (-0.20 percentage points per year, p < 0.001), consistent with shortening precipitation events under warming climate conditions. Both proverbs encode real but probabilistic meteorological signals whose reliability is decreasing over time.

2605.10959 2026-05-13 cs.LG cs.AI Version update

QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

Xiantao Jiang

发表机构 * College of Information Engineering, Shanghai Maritime University(上海海洋大学信息工程学院)

AI总结 当前缺乏统一的指标来评估量化神经网络的效率。本文提出QuIDE,通过引入智能指数I = (C × P)/log₂(T+1),将压缩率、精度与延迟的权衡统一为单一评分。实验表明,不同任务存在任务相关的帕累托拐点,4位量化在MNIST和大语言模型中表现最佳,而8位量化更适合复杂CNN任务。QuIDE还提供了一个可复现的评估协议和适用于混合精度搜索的适应性函数。

Comments 16 pages, 9 figures

详情
英文摘要

There is currently no unified metric for evaluating the efficiency of quantized neural networks. We propose QuIDE, built around the Intelligence Index I = (C x P)/log_2(T+1), which collapses the compression-accuracy-latency trade-off into a single score. Experiments across six settings -- SimpleCNN (MNIST, CIFAR), ResNet-18 (ImageNet-1K), and Llama-3-8B -- show a task-dependent Pareto Knee. 4-bit quantization is optimal for MNIST and large LLMs, while 8-bit is the sweet spot for complex CNN tasks (ResNet-18 on ImageNet), where 4-bit PTQ collapses accuracy catastrophically. The accuracy-gated variant I' correctly flags these non-viable configurations that the raw I would reward. QuIDE provides a reproducible evaluation protocol and a ready-to-use fitness function for mixed-precision search.
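The Intelligence Index from the abstract is simple to compute. In the sketch below, C is read as a compression ratio, P as a task performance score such as accuracy, and T as a latency; these readings and the latency unit are assumptions, since the abstract does not define them. The accuracy-gated variant I' is not specified in the abstract and is therefore not implemented.

```python
import math

def intelligence_index(compression, performance, latency):
    """I = (C * P) / log2(T + 1), as stated in the abstract.
    The meaning and units of C, P and T are assumed, not sourced."""
    if latency <= 0:
        raise ValueError("latency must be positive so that log2(T + 1) > 0")
    return compression * performance / math.log2(latency + 1)

# Hypothetical numbers, for illustration only.
print(intelligence_index(compression=4.0, performance=0.9, latency=1.0))
print(intelligence_index(compression=4.0, performance=0.9, latency=3.0))
```

Because the index multiplies C and P, a heavily compressed model with collapsed accuracy can still score well, which is presumably the case the abstract's accuracy-gated variant is meant to flag.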

2605.10958 2026-05-13 physics.ao-ph cs.AI Version update

Multi-Fidelity Emulation of Atmospheric Correction Coefficients with Physics-Guided Kolmogorov-Arnold Networks

Md Abdullah Al Mazid, Naphtali Rishe

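Affiliations * Knight Foundation School of Computing and Information Sciences; Florida International University

AI summary Atmospheric correction is a key preprocessing step in optical remote sensing, but high-fidelity radiative transfer simulation is computationally expensive. This paper proposes a physics-guided multi-fidelity surrogate framework that uses paired 6S and libRadtran simulations, Latin Hypercube Sampling, and a physics-consistency constraint to build pKANrtm, a Kolmogorov-Arnold Network model for efficiently generating high-fidelity atmospheric correction coefficients. The method outperforms the compared methods in both predictive performance and computational efficiency, substantially accelerating atmospheric correction.

Details
English abstract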

Atmospheric correction is a critical preprocessing step in optical remote sensing, but repeated high-fidelity radiative transfer simulations remain computationally expensive for dense look-up-table generation, sensitivity analysis, retrieval support, and operational preprocessing. This study presents a physics-aware multi-fidelity surrogate framework for emulating atmospheric correction coefficients using paired 6S and libRadtran simulations. Atmospheric and geometric states are sampled using Latin Hypercube Sampling, and both radiative transfer models are evaluated under matched conditions for Sentinel-2 bands using spectral-response-function-aware coefficient generation. The high-fidelity targets are path reflectance, total transmittance, and spherical albedo. A physics-guided Kolmogorov-Arnold Network, termed pKANrtm, receives the atmospheric state and low-fidelity 6S coefficients, predicts the residual relative to libRadtran, and reconstructs the high-fidelity coefficients. The pKANrtm model uses an Efficient-KAN architecture and is trained with a physics-consistency penalty applied in the original coefficient space. The proposed model is evaluated against state-of-the-art regression-based RTM surrogates. Across both standard and out-of-distribution evaluation settings, pKANrtm achieves the strongest overall predictive performance among the compared models. Runtime benchmarking demonstrates substantial acceleration relative to libRadtran, with GPU inference providing approximately four orders of magnitude single-sample speedup and batched inference reaching tens of thousands of samples per second. These results indicate that physics-aware multi-fidelity pKANrtm emulation provides an accurate, physically structured, and computationally efficient strategy for atmospheric correction coefficient generation.

2605.10954 2026-05-13 quant-ph cs.AI Version update

Controlled Steering-Based State Preparation for Adversarial-Robust Quantum Machine Learning

Sahan Sanjaya, Hari Krishna Parvatham, Emma Andrews, Prabhat Mishra

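Affiliations * University of Florida, Gainesville, FL, USA

AI summary This paper studies the robustness of quantum machine learning models under adversarial attacks and proposes a passive steering-based quantum state preparation method to strengthen model defenses. The method replaces the conventional quantum encoding stage and uses controlled state steering to guide the quantum state toward a controlled intermediate state, effectively suppressing the influence of adversarial perturbations. Experiments show that the method significantly improves adversarial accuracy across multiple quantum machine learning models and datasets, by up to 40.19%.

Details
English abstract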

Quantum machine learning (QML) provides a promising framework for leveraging quantum-mechanical effects in learning tasks. However, its vulnerability to adversarial perturbations remains a major challenge for practical deployment. In QML systems, small perturbations applied to classical inputs can propagate through the quantum encoding stage and distort the resulting quantum state, thereby degrading model performance. In this work, we propose a defense mechanism that replaces the conventional quantum encoding stage of a QML model with passive steering-based controlled state preparation, which guides the encoded state toward a controlled intermediate state. By tuning the steering strength and the number of steering iterations, the proposed method suppresses the influence of adversarial perturbations while maintaining high clean accuracy and improving adversarial accuracy. Experimental results demonstrate that the passive steering-based defense consistently improves adversarial accuracy across different QML models and datasets under gradient-based adversarial attacks, achieving adversarial accuracy improvements of up to 40.19%.

2605.10950 2026-05-13 physics.ao-ph cs.AI cs.ET cs.IR Version update

Continuous Flood Nowcasting in South Asia: A Multi-Sensor Ensemble Remote Sensing Framework for Flood Extent

Usman Nazir, Disha Gomathinayagam, Muhammad Kamran, Sara Khalid

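Affiliations * Planetary Health Informatics (PHI) Lab, University of Oxford; Blavatnik School of Government, University of Oxford; PMIU Secretariat, Irrigation Department, Pakistan

AI summary In 2025, Pakistan experienced unusually severe flooding, and existing flood monitoring products struggle to provide near-real-time inundation maps with high spatial and temporal continuity. This paper proposes a multi-sensor ensemble remote-sensing framework that uses Sentinel-1, HLS, MODIS, and VIIRS data on the Google Earth Engine platform to deliver continuous flood nowcasting over Pakistan. A tiered ensemble strategy prioritizes higher-resolution sensor data to preserve daily continuity of flood extent, and the system generates spatially consistent near-real-time inundation maps for the 2025 monsoon period, supporting rapid damage assessment and emergency response.

Comments Visualising Climate 2026

Details
English abstract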

Pakistan experienced an unusually severe flood season between June and December 2025, with cascading impacts on population, infrastructure, and agriculture. Existing operational flood products (e.g., UNOSAT) provide valuable episode-level snapshots but rarely deliver spatially and temporally continuous inundation maps at near-real-time latency within the country. We present a multi-sensor, ensemble-based remote-sensing framework for continuous flood nowcasting in Pakistan that integrates Sentinel-1 SAR, Harmonized Landsat-Sentinel (HLS L30 and S30), MODIS, and VIIRS observations on a harmonized grid in Google Earth Engine. The framework employs a tiered nowcasting ensemble that prioritizes higher-resolution sensors (Sentinel-1 and HLS) and falls back to MODIS and VIIRS when necessary, preserving daily continuity of flood extent at each sensor's native resolution. Applied to the 2025 monsoon period, the system generates near-real-time, spatially consistent inundation maps across Pakistan. As a nowcasting case study, we track the super-flood of 26 August-7 September 2025 day by day, demonstrating the framework's ability to capture the evolving flood footprint in near real time and extend beyond the temporal limits of episodic mapping products. Validation against GloFAS discharge anomalies and precipitation datasets (CHIRPS v3.0, MSWEP) shows strong agreement with observed hydrometeorological conditions. By integrating nowcast outputs with exposure layers (WorldPop, ESA WorldCover, Giga-HOTOSM), the framework enables rapid estimation of affected populations, cropland, and critical infrastructure, supporting timely disaster response and resilience planning in South Asia.
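The tiered fallback described above can be sketched as a per-pixel priority rule. This is a simplified illustration, not the authors' Google Earth Engine implementation: it assumes all sensors have already been resampled to one common grid, and the sensor keys, the `None` convention for "no valid observation", and the function name `tiered_nowcast` are hypothetical.

```python
# Priority order taken from the abstract: higher-resolution sensors first.
PRIORITY = ["sentinel1", "hls", "modis", "viirs"]

def tiered_nowcast(observations, priority=PRIORITY):
    """Pick, for every pixel, the value from the highest-priority sensor
    that has a valid observation that day.

    observations maps a sensor name to a list of per-pixel values:
    1 = flooded, 0 = dry, None = no valid observation (cloud, no overpass).
    Returns (flood, source), two lists of the same length.
    """
    n_pixels = len(next(iter(observations.values())))
    flood = [None] * n_pixels
    source = [None] * n_pixels
    for sensor in priority:
        mask = observations.get(sensor)
        if mask is None:
            continue  # this sensor delivered nothing today
        for i, value in enumerate(mask):
            # Fill only pixels that no higher-priority sensor has covered.
            if flood[i] is None and value is not None:
                flood[i] = value
                source[i] = sensor
    return flood, source

day = {
    "sentinel1": [1, None, None, None],
    "hls": [0, 0, None, None],
    "modis": [1, 1, 1, None],
    "viirs": [0, 0, 0, 0],
}
flood, source = tiered_nowcast(day)
print(flood)   # [1, 0, 1, 0]
print(source)  # ['sentinel1', 'hls', 'modis', 'viirs']
```

Recording the source sensor alongside each pixel keeps the provenance of the daily map visible, which matters when coarse fallback pixels sit next to fine-resolution ones.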

2605.10949 2026-05-13 stat.AP cs.AI cs.CV cs.LG Version update

AlphaEarth Satellite Embeddings for Modelling Climate Sensitive Diseases Towards Global Health Resilience

Usman Nazir, I-Han Cheng, Sara Khalid

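Affiliations * Planetary Health Informatics (PHI) Lab, University of Oxford

AI summary This study explores the potential of satellite remote sensing data (AlphaEarth embeddings) for predicting climate-sensitive diseases in order to improve global health resilience. It focuses on malaria, childhood acute respiratory infection, and stunting, and evaluates the predictive performance of 64-dimensional satellite embeddings across countries and regions. The results show that the satellite data has clear predictive value for malaria and respiratory infection, whereas stunting prediction is strongly affected by fixed effects and needs further data support. The work offers a new method and empirical evidence for using remote sensing to support public health surveillance.

Comments Visualising Climate 2026

Details
English abstract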

Malaria, childhood acute respiratory infection, and child undernutrition together account for over two million deaths annually in children under five, with the burden concentrated in low and middle-income countries where climate variability modulates transmission, exposure, and nutritional outcomes. Routine health surveillance in these settings remains sparse and reactive. Satellite-derived representations of the Earth's surface offer a scalable, low-cost complement to traditional covariates, yet their utility as predictors of population health outcomes is poorly characterised. We summarise findings from three studies evaluating AlphaEarth Foundations 64-dimensional satellite embeddings as predictors of population health outcomes, focusing on vulnerable populations. The studies span infectious disease (malaria, respiratory infection) and stunting. In each study, embeddings provide predictive value at sufficient spatial granularity: (i) malaria prediction across Nigeria shows consistent per-region R^2 gains; (ii) childhood acute respiratory infection prediction across 11 DHS countries increases pooled R^2 from 0.157 to 0.206 across three tree-based estimators; (iii) stunting prediction across 35 countries is neutral at country level due to collinearity with fixed effects. The stunting case is currently limited by lack of DHS cluster-level coordinates, which is the next key experiment.

2605.10865 2026-05-13 cs.AI cs.CV cs.SE Version update

BenchCAD: A Comprehensive, Industry-Standard Benchmark for Programmatic CAD

Haozhe Zhang, Kaichen Liu, Miaomiao Chen, Lei Li, Shaojie Yang, Cheng Peng, Hanjie Chen

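Affiliations * University of Virginia; University of California, San Diego; Rice University

AI summary BenchCAD is a comprehensive benchmark for industrial CAD programming that evaluates a model's ability to generate executable parametric CAD programs from visual or textual inputs. It contains 17,900 verified CadQuery programs covering 106 industrial part families, and assesses perception, parametric abstraction, and program synthesis through tasks including visual question answering, code question answering, and image-to-code generation. Experiments show that current mainstream models can recover the coarse shape of a part but still fall well short of generating accurate parametric CAD programs, for example by missing fine 3D structure or misreading engineering parameters, which highlights where industrial CAD automation most needs to improve.

Comments 9 pages, 7 figures

Details
English abstract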

Industrial Computer-Aided Design (CAD) code generation requires models to produce executable parametric programs from visual or textual inputs. Beyond recognizing the outer shape of a part, this task involves understanding its 3D structure, inferring engineering parameters, and choosing CAD operations that reflect how the part would be designed and manufactured. Despite the promise of multimodal large language models (MLLMs) for this task, they are rarely evaluated on whether these capabilities jointly hold in realistic industrial CAD settings. We present BenchCAD, a unified benchmark for industrial CAD reasoning. BenchCAD contains 17,900 execution-verified CadQuery programs across 106 industrial part families, including bevel gears, compression springs, twist drills, and other reusable engineering designs. It evaluates models through visual question answering, code question answering, image-to-code generation, and instruction-guided code editing, enabling fine-grained analysis across perception, parametric abstraction, and executable program synthesis. Across 10+ frontier models, BenchCAD shows that current systems often recover coarse outer geometry but fail to produce faithful parametric CAD programs. Common failures include missing fine 3D structure, misinterpreting industrial design parameters, and replacing essential operations such as sweeps, lofts, and twist-extrudes with simpler sketch-and-extrude patterns. Fine-tuning and reinforcement learning improve in-distribution performance, but generalization to unseen part families remains limited. These results position BenchCAD as a benchmark for measuring and improving the industrial readiness of multimodal CAD automation.

2605.10815 2026-05-13 cs.AI eess.AS Version update

Probing Cross-modal Information Hubs in Audio-Visual LLMs

Jihoo Jung, Chaeyoung Jung, Ji-Hoon Kim, Joon Son Chung

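Affiliations * Department of Electrical Engineering, Korea Advanced Institute of Science; The Graduate School of Advanced Imaging Science, Multimedia & Film, Chung-Ang University, Seoul, Republic of Korea

AI summary This paper studies how cross-modal information flows in audio-visual large language models (AVLLMs), focusing on how information is encoded between the audio and visual modalities. Empirical analysis shows that AVLLMs integrate cross-modal information mainly in so-called sink tokens, and that a specific subset of sink tokens, termed cross-modal sink tokens, specializes in storing it. Based on this finding, the authors propose a training-free hallucination mitigation method that improves model behavior by encouraging reliance on the integrated information in cross-modal sink tokens.

Comments Accepted by ICML 2026

Details
English abstract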

Audio-visual large language models (AVLLMs) have recently emerged as a powerful architecture capable of jointly reasoning over audio, visual, and textual modalities. In AVLLMs, the bidirectional interaction between audio and video modalities introduces intricate processing dynamics, necessitating a deeper understanding of their internal mechanisms. However, unlike extensively studied text-only or large vision language models, the internal workings of AVLLMs remain largely unexplored. In this paper, we focus on cross-modal information flow between audio and visual modalities in AVLLMs, investigating where information derived from one modality is encoded within the token representations of the other modality. Through an analysis of multiple recent AVLLMs, we uncover two common findings. First, AVLLMs primarily encode integrated audio-visual information in sink tokens. Second, sink tokens do not uniformly hold cross-modal information. Instead, a distinct subset of sink tokens, which we term cross-modal sink tokens, specializes in storing such information. Based on these findings, we further propose a simple training-free hallucination mitigation method by encouraging reliance on integrated cross-modal information within cross-modal sink tokens. Our code is available at https://github.com/kaistmm/crossmodal-hub.

2605.10780 2026-05-13 cs.CV cs.AI Version update

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

Xuanyu Zhu, Yan Bai, Yang Shi, Yihang Lou, Yuanxing Zhang, Jing Jin, Yuan Zhou

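Affiliations * Peking University; Meituan Inc; Tsinghua University; IGDL

AI summary This study proposes DRoRAE, a multi-layer representation fusion method that aims to improve feature extraction from vision encoders. Unlike existing methods that use only last-layer features, DRoRAE fuses features from all intermediate layers through energy-constrained routing and incremental correction, recovering detail lost through multiple layers of semantic abstraction. Experiments show that the method substantially improves image reconstruction and generation, and reveal a predictable relationship between representation richness and reconstruction quality, giving a new theoretical basis for visual tokenizer design.

Details
English abstract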

Representation autoencoders that reuse frozen pretrained vision encoders as visual tokenizers have achieved strong reconstruction and generation quality. However, existing methods universally extract features from only the last encoder layer, discarding the rich hierarchical information distributed across intermediate layers. We show that low-level visual details survive in the last layer merely as attenuated residuals after multiple layers of semantic abstraction, and that explicitly fusing multi-layer features can substantially recover this lost information. We propose DRoRAE (Depth-Routed Representation AutoEncoder), a lightweight fusion module that adaptively aggregates all encoder layers via energy-constrained routing and incremental correction, producing an enriched latent compatible with a frozen pretrained decoder. A three-phase decoupled training strategy first learns the fusion under the implicit distributional constraint of the frozen decoder, then fine-tunes the decoder to fully exploit the enriched representation. On ImageNet-256, DRoRAE reduces rFID from 0.57 to 0.29 and improves generation FID from 1.74 to 1.65 (with AutoGuidance), with gains also transferring to text-to-image synthesis. Furthermore, we uncover a log-linear scaling law ($R^2 = 0.86$) between fusion capacity and reconstruction quality, identifying representation richness as a new, predictably scalable dimension for visual tokenizers analogous to vocabulary size in NLP.
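For readers unfamiliar with how a log-linear scaling law and its $R^2$ are obtained, the following sketch fits y = a + b·ln(x) by ordinary least squares and reports the coefficient of determination. The data points are synthetic numbers invented for illustration, not results from the paper, and `fit_log_linear` is a hypothetical helper name.

```python
import math

def fit_log_linear(xs, ys):
    """Least-squares fit of y = a + b * ln(x); returns (a, b, r_squared)."""
    logs = [math.log(x) for x in xs]
    n = len(xs)
    mean_l = sum(logs) / n
    mean_y = sum(ys) / n
    sxx = sum((l - mean_l) ** 2 for l in logs)
    sxy = sum((l - mean_l) * (y - mean_y) for l, y in zip(logs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_l
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((y - (a + b * l)) ** 2 for l, y in zip(logs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

# Synthetic example: a capacity measure (arbitrary units) against an
# error-like quality metric where lower is better.
capacity = [1, 2, 4, 8, 16]
quality = [1.00, 0.85, 0.74, 0.58, 0.47]
a, b, r2 = fit_log_linear(capacity, quality)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.3f}")
```

A negative slope b on an error-like metric means each doubling of capacity removes a roughly constant amount of error, which is what "predictably scalable" refers to.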

2605.10357 2026-05-13 cs.MM cs.AI Version update

RW-Post: Auditable Evidence-Grounded Multimodal Fact-Checking in the Wild

Danni Xu, Shaojing Fan, Harry Cheng, Mohan Kankanhalli

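AI summary This paper presents RW-Post, a text-image benchmark dataset for multimodal fact-checking in real-world settings, characterized by auditable annotations: each sample links the original social-media post, reasoning traces, and explicit evidence drawn from human fact-check articles. The dataset supports several evaluation regimes, enabling systematic analysis of models' visual grounding and evidence utilization. The study also introduces AgentFact as a baseline and evaluates several open-source large vision-language models (LVLMs); the results show that current models still have substantial room to improve in evidence grounding.

Comments This submission was made in error. It was intended to replace the existing submission arXiv:2512.22933 rather than create a new submission

Details
English abstract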

Multimodal misinformation increasingly leverages visual persuasion, where repurposed or manipulated images strengthen misleading text. We introduce RW-Post, a post-aligned text-image benchmark for real-world multimodal fact-checking with auditable annotations: each instance links the original social-media post with reasoning traces and explicitly linked evidence items derived from human fact-check articles via an LLM-assisted extraction-and-auditing pipeline. RW-Post supports controlled evaluation across closed-book, evidence-bounded, and open-web regimes, enabling systematic diagnosis of visual grounding and evidence utilization. We provide AgentFact as a reference verification baseline and benchmark strong open-source LVLMs under unified protocols. Experiments show substantial headroom: current models struggle with faithful evidence grounding, while evidence-bounded evaluation improves both accuracy and faithfulness. Code and dataset will be released at https://github.com/xudanni0927/AgentFact.

2605.10201 2026-05-13 cs.RO cs.AI Version update

HeteroGenManip: Generalizable Manipulation For Heterogeneous Object Interactions

Zhenhao Shen, Zeming Yang, Yue Chen, Yuran Wang, Shengqiang Xu, Mingleyang Li, Hao Dong, Ruihai Wu

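Affiliations * Peking University; Tianjin University

AI summary This study addresses the difficulty of achieving generalizable robot manipulation in interactions between heterogeneous objects, focusing on two core questions: where to manipulate and how to manipulate. It proposes HeteroGenManip, a two-stage framework that decouples the initial grasp from the complex interaction process and combines structural priors with a multi-foundation-model diffusion policy, markedly improving robustness and generalization. Experiments show significant performance gains across a range of simulated and real-world tasks.

Details
English abstract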

Generalizable manipulation involving cross-type object interactions is a critical yet challenging capability in robotics. To reliably accomplish such tasks, robots must address two fundamental challenges: "where to manipulate" (contact point localization) and "how to manipulate" (subsequent interaction trajectory planning). Existing foundation-model-based approaches often adopt end-to-end learning that obscures the distinction between these stages, exacerbating error accumulation in long-horizon tasks. Furthermore, they typically rely on a single uniform model, which fails to capture the diverse, category-specific features required for heterogeneous objects. To overcome these limitations, we propose HeteroGenManip, a task-conditioned, two-stage framework designed to decouple the initial grasp from complex interaction execution. First, a Foundation-Correspondence-Guided Grasp module leverages structural priors to align the initial contact state, thereby significantly reducing the pose uncertainty of grasping. Subsequently, a Multi-Foundation-Model Diffusion Policy (MFMDP) routes objects to category-specialized foundation models, integrating fine-grained geometric information with highly variable part features via a dual-stream cross-attention mechanism. Experimental evaluations demonstrate that HeteroGenManip achieves robust intra-category shape and pose generalization. The framework achieves an average 31% performance improvement in simulation tasks covering a broad range of object types, alongside a 36.7% gain across four real-world tasks with different interaction types.

2605.10125 2026-05-13 cs.AI cs.HC Version update

Useful for Exploration, Risky for Precision: Evaluating AI Tools in Academic Research

Anthea Dathe, Kiran Hoffmann, Aline Mangold

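Affiliations * Dresden University of Technology

AI summary This study evaluates the use of AI tools in academic research, focusing on the usefulness and limitations of question-answering and literature review tools. It proposes an evaluation framework that combines human-centered and computer-centered metrics, and finds that Q&A tools provide useful overviews but are not reliable enough for precise information extraction, while literature review tools help exploratory search but lack reproducibility and transparency. The study stresses the importance of improving the explainability of AI tools and notes that sensibly integrating AI into research workflows still depends on human verification.

Details
English abstract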

Artificial intelligence (AI) tools are being incorporated into scientific research workflows with the potential to enhance efficiency in tasks such as document analysis, question answering (Q&A), and literature search. However, system outputs are often difficult to verify, lack transparency in their generation, and remain prone to errors. Suitable benchmarks are needed to document and evaluate the issues that arise, yet existing benchmarking approaches do not adequately capture human-centered criteria such as usability, interpretability, and integration into research workflows. To address this gap, the present work proposes and applies a benchmarking framework combining human-centered and computer-centered metrics to evaluate AI-based Q&A and literature review tools for research use. The findings suggest that Q&A tools can offer valuable overviews and generally accurate summaries; however, they are not always reliable for precise information extraction. Explainable AI (xAI) accuracy was particularly low, meaning highlighted source passages frequently failed to correspond to generated answers. This shifted the burden of validation back onto the researcher. Literature review tools supported exploratory searches but showed low reproducibility, limited transparency regarding chosen sources and databases, and inconsistent source quality, making them unsuitable for systematic reviews. A comparison of these tool groups reveals a similar pattern: while AI tools can enhance efficiency in the early stages of the research workflow and in shallow tasks, their outputs still require human verification. The findings underscore the importance of explainability features to enhance transparency, verification efficiency, and careful integration of AI tools into researchers' workflows. Further, human-centered evaluation remains an important concern to ensure practical applicability.

2605.09964 2026-05-13 cs.AI q-bio.QM Version update

Learning the Interaction Prior for Protein-Protein Interaction Prediction: A Model-Agnostic Approach

Ziqi Gao, Chenyi Zi, Zijing Liu, Ziqiao Meng, Yu Li, Jia Li

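Affiliations * Tsinghua University; The Hong Kong University of Science and Technology (Guangzhou); National University of Singapore; IDEA Research

AI summary Protein-protein interactions (PPIs) play a key role in cellular function and disease mechanisms. Current learning-based PPI prediction methods concentrate on learning protein representations but neglect the design of a specialized classification head, usually relying on generic aggregation methods that lack biological grounding. This paper proposes L3-PPI, a model-agnostic PPI classifier based on the biological "L3 rule": using graph prompt learning regularized by L3 paths, it recasts the classification of protein embedding pairs as a graph-level classification task and effectively improves prediction performance.

Comments Accepted at ICML 2026

Details
English abstract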

Protein-protein interactions (PPIs) are fundamental to cellular function and disease mechanisms. Current learning-based PPI predictors focus on learning powerful protein representations but neglect designing specialized classification heads. They mainly rely on generic aggregation methods such as concatenation or dot products, which lack biological insight. Motivated by the biological "L3 rule", where multiple length-3 paths between a pair of proteins indicate their interaction likelihood, our study addresses this gap by designing a biologically informed PPI classifier. In this paper, we provide empirical evidence that popular PPI datasets strongly support the L3 rule. We propose an L3-path-regularized graph prompt learning method called L3-PPI, which generates a prompt graph with virtual L3 paths based on protein representations and controls the number of paths. L3-PPI reformulates the classification of protein embedding pairs into a graph-level classification task over the generated prompt graph. This lightweight module seamlessly integrates with PPI predictors as a plug-and-play component, injecting the interaction prior of complementarity to enhance performance. Extensive experiments show that L3-PPI achieves larger performance gains than advanced competitors.
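The quantity behind the L3 rule, the number of length-3 paths between two proteins, can be computed directly on an interaction graph. The sketch below counts simple paths u-a-b-v in an undirected graph without self-loops; it shows only the raw count and does not reproduce the prompt-graph construction of L3-PPI. The protein names and helper names are hypothetical.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (protein, protein) edges."""
    adj = defaultdict(set)
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    return adj

def l3_path_count(adj, u, v):
    """Count simple paths u - a - b - v of length exactly 3."""
    count = 0
    for a in adj[u]:
        if a == v:
            continue  # u-v is a direct edge, not part of a length-3 path
        for b in adj[a]:
            if b == u or b == v:
                continue  # would revisit an endpoint or shorten the path
            if v in adj[b]:
                count += 1
    return count

# Toy network with hypothetical proteins P1, P2 and intermediates A-D.
edges = [("P1", "A"), ("P1", "B"), ("A", "C"), ("A", "D"),
         ("B", "C"), ("C", "P2"), ("D", "P2")]
adj = build_adjacency(edges)
print(l3_path_count(adj, "P1", "P2"))  # 3
```

Here P1 and P2 share no neighbours, yet three length-3 paths connect them, which is the kind of signal the L3 rule treats as evidence for an interaction.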

2605.09461 2026-05-13 cs.AI Version update

VulTriage: Triple-Path Context Augmentation for LLM-Based Vulnerability Detection

Wenxin Tang, Xiang Zhang, Junliang Liu, Jingyu Xiao, Xi Xiao, Jinlong Yang, Yuehe Ma, Zhenyu Liu, Zhengheng Li, Zicheng Wang, Wang Luo, Qing Li, Lei Wang, Peng Xiangli

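Affiliations * Tsinghua University; Henan University; Dalian Maritime University; The Chinese University of Hong Kong; Northwestern Polytechnical University; BNU-HKBU United International College; Southeast University; Jilin University; Sun Yat-sen University; Peng Cheng Laboratory; Guangzhou Intelligence Communications Technology Co., Ltd.; The Fifth Electronic Research Institute of MIIT

AI summary This paper proposes VulTriage, a triple-path context augmentation framework for LLM-based vulnerability detection. A Control Path extracts and describes program structure information, a Knowledge Path retrieves relevant vulnerability patterns and examples, and a Semantic Path summarizes the functional behavior of the code; together they enrich the LLM's input context and improve its ability to detect vulnerabilities that hinge on subtle semantic differences. Experiments show that VulTriage outperforms existing deep learning and LLM baselines on multiple benchmark datasets and generalizes well in low-resource and class-imbalanced settings.

Details
English abstract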

Automated vulnerability detection is a fundamental task in software security, yet existing learning-based methods still struggle to capture the structural dependencies, domain-specific vulnerability knowledge, and complex program semantics required for accurate detection. Recent Large Language Models (LLMs) have shown strong code understanding ability, but directly prompting them with raw source code often leads to missed vulnerabilities or false alarms, especially when vulnerable and benign functions differ only in subtle semantic details. To address this, we propose VulTriage, a triple-path context augmentation framework for LLM-based vulnerability detection. VulTriage enhances the LLM input through three complementary paths: a Control Path that extracts and verbalizes AST, CFG, and DFG information to expose control and data dependencies; a Knowledge Path that retrieves relevant CWE-derived vulnerability patterns and examples through hybrid dense-sparse retrieval; and a Semantic Path that summarizes the functional behavior of the code before the final judgment. These contexts are integrated into a unified instruction to guide the LLM toward more reliable vulnerability reasoning. Experiments on the PrimeVul pair test set show that VulTriage achieves state-of-the-art performance, outperforming existing deep learning and LLM-based baselines on key pair-wise and classification metrics. Further ablation studies verify the effectiveness of each path, and additional experiments on the Kotlin dataset demonstrate the generalization ability of VulTriage under low-resource and class-imbalanced settings. Our code is available at https://github.com/vinsontang1/VulTriage
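The abstract does not say how the dense and sparse retrieval results are combined. One widely used option, shown here purely as an assumed stand-in rather than VulTriage's actual method, is reciprocal rank fusion (RRF), which merges ranked lists using only rank positions. The CWE identifiers in the example are illustrative inputs, not retrieval results from the paper.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids.

    Each document scores sum(1 / (k + rank)) over the lists in which it
    appears, with rank starting at 1. k=60 is the constant commonly used
    for RRF; it is not taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical outputs of a dense (embedding) and a sparse (keyword) retriever.
dense_ranking = ["CWE-787", "CWE-125", "CWE-416"]
sparse_ranking = ["CWE-125", "CWE-476", "CWE-787"]
print(reciprocal_rank_fusion([dense_ranking, sparse_ranking]))
# ['CWE-125', 'CWE-787', 'CWE-476', 'CWE-416']
```

Rank-based fusion avoids having to calibrate embedding similarities against keyword scores, which live on different scales.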

2605.09287 2026-05-13 cs.AI Version update

PiCA: Pivot-Based Credit Assignment for Search Agentic Reinforcement Learning

Dongyi Liu, Yifan Niu, Qinwen Wang, Han Xiao, Jia Li

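Affiliations * The Hong Kong University of Science and Technology (Guangzhou); The Hong Kong University of Science and Technology

AI summary This paper proposes PiCA, a pivot-step-based credit assignment method for improving the reinforcement learning training of LLM-based search agents. To address key problems in long-horizon tasks, namely reward sparsity, isolated credit, and distributional shift, PiCA introduces a potential-based reward shaping mechanism that recasts the search process as a sequence of cumulative progress and treats pivot steps from historical trajectories as information peaks, giving each step a dense reward closely tied to the final objective. Experiments show that PiCA significantly improves model performance on several knowledge-intensive question answering tasks, demonstrating good generality and effectiveness.

Comments 21 pages, 7 figures

Details
English abstract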

Large Language Model (LLM)-based search agents trained with reinforcement learning (RL) have significantly improved the performance of knowledge-intensive tasks. However, existing methods encounter critical challenges in long-horizon credit assignment: (i) Reward Sparsity, where models receive only outcome feedback without step-level guidance to differentiate action quality; (ii) Isolated Credit, where credit is assigned to steps independently, failing to capture sequential dependencies; and (iii) Distributional Shift, where rewards are estimated on templates that deviate from the model's natural generative distribution. To address these issues, we propose Pivot-Based Credit Assignment (PiCA), a novel step reward mechanism that reformulates the search trajectory as a sequential process of cumulative search progress. Unlike prior isolated step rewards, PiCA defines process rewards as success probabilities dependent on the historical context based on Potential-Based Reward Shaping (PBRS). This approach identifies pivot steps, which comprise target golden sub-queries and sub-answers derived from historical trajectories, as information peaks that significantly boost the likelihood of a correct final answer. By anchoring these step rewards to the final task objective, PiCA provides dense, pivot-aware and trajectory-dependent guidance while maintaining distributional consistency. Extensive experiments show that PiCA outperforms existing strong baselines across seven knowledge-intensive QA benchmarks, achieving 15.2% and 2.2% improvements for 3B and 7B models. The consistent performance gains across various models show PiCA's robust generalization. The code is available at https://github.com/novdream/PiCA.
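Potential-Based Reward Shaping, which PiCA builds on, assigns each step the reward γΦ(s') − Φ(s) for a potential function Φ. The sketch below assumes, as the abstract suggests, that the potential is an estimated probability of eventually answering correctly given the history so far; the probability values are invented for illustration and the function name is hypothetical.

```python
def shaped_step_rewards(potentials, gamma=1.0):
    """Potential-based shaping: reward for step t is
    gamma * potential[t + 1] - potential[t]."""
    return [gamma * potentials[t + 1] - potentials[t]
            for t in range(len(potentials) - 1)]

# Hypothetical success-probability estimates after 0, 1, 2, 3, 4 search steps.
success_prob = [0.10, 0.15, 0.60, 0.65, 0.90]
rewards = shaped_step_rewards(success_prob)

# The step with the largest jump in success probability is the natural
# candidate for a "pivot" in the sense of an information peak.
pivot_step = max(range(len(rewards)), key=lambda t: rewards[t])
print([round(r, 2) for r in rewards], "pivot step:", pivot_step)
```

With γ = 1 the shaped rewards telescope to Φ(final) − Φ(initial), so the dense per-step signal stays anchored to overall progress toward the final answer rather than rewarding steps in isolation.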

2605.09043 2026-05-13 cs.CL cs.AI Version update

Phase Transitions in Affective Meaning Divergence: The Hidden Drift Before the Break

Napassorn Litchiowong

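Affiliations * School of Computing, National University of Singapore

AI summary This paper studies phase transitions in affective meaning divergence (AMD) in conversation, in which the two parties' affective reading of the same word gradually drifts apart until communication breaks down. Building on speech-act theory and entropy-regularized game theory, the author constructs a mathematical model of AMD and finds that when $\beta\alpha > 4$, increasing AMD causes an abrupt collapse of repair coordination. Experiments on several datasets show that AMD exhibits clear critical-slowing-down signatures before a conversation derails and that its temporal dynamics outperform conventional toxicity or sentiment indicators, offering a new theoretical basis for understanding conversational breakdown.

Comments Accepted to the ACL 2026 Student Research Workshop

Details
English abstract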

One partner says "Fine" meaning "resolution"; the other hears "surrender." The word is shared; the affective uptake is not. We formalize this as affective meaning divergence (AMD), the total-variation distance between interlocutors' anchor-conditioned affect distributions. Building on speech-act theory, common-ground accumulation, and entropy-regularized game theory, we derive a logit best-response map whose dynamics undergo a saddle-node bifurcation: when $\beta\alpha > 4$, a monotone increase in AMD-driven load produces an abrupt, hysteretic collapse of repair coordination. On Conversations Gone Awry (CGA-Wiki; $N = 652$), derailing conversations exhibit critical-slowing-down (CSD) signatures across multiple levels: lexical divergence variance ($p < 0.001$, $d = 0.36$), AMD variance ($p = 0.001$, $d = 0.26$), and dialog-act repair variance ($p = 0.016$, $d = 0.20$), all significant after correction and stronger than toxicity and sentiment baselines. AMD provides a distinct temporal signature, with retrospectively measured variance peaking at the bifurcation point while toxicity variance peaks earlier, and is the only indicator grounded in the theoretical framework. Boundary-condition analysis on CGA-CMV ($N = 1,169$) yields mixed but directionally consistent evidence.
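Since AMD is defined as a total-variation distance, its core computation is short. The sketch below evaluates it for two discrete affect distributions attached to the same word; the affect labels and probabilities are made up to mirror the "Fine" example and are not data from the paper, and the function name is hypothetical.

```python
def total_variation(p, q):
    """Total-variation distance between two discrete distributions given
    as dicts from label to probability: 0.5 * sum of |p(x) - q(x)|."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in labels)

# Hypothetical affect distributions for the word "Fine".
speaker = {"resolution": 0.7, "neutral": 0.2, "surrender": 0.1}
listener = {"resolution": 0.1, "neutral": 0.2, "surrender": 0.7}

amd = total_variation(speaker, listener)
print(f"AMD for 'Fine': {amd:.2f}")
```

The distance is 0 when both parties attach the same affect to the word and 1 when their readings never overlap, so it gives a bounded per-anchor measure that can be tracked over the turns of a conversation.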

2605.08978 2026-05-13 cs.AI Version update

Learning to Explore: Scaling Agentic Reasoning via Exploration-Aware Policy Optimization

Xingyuan Hua, Sheng Yue, Ju Ren

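Affiliations * Department of Computer Science and Technology, Tsinghua University, Beijing, China; School of Cyber Science and Technology, Sun Yat-sen University Shenzhen Campus, Shenzhen, China; State Key Laboratory of Internet Architecture, Tsinghua University, Beijing, China

AI summary This paper proposes an exploration-aware reinforcement learning framework to address the problem that agents apply undifferentiated exploration strategies when carrying out tasks. The method introduces a fine-grained reward function via variational inference that estimates how much an exploratory action could improve future decisions, combined with an exploration-aware grouping mechanism that separates exploratory actions from task-completion actions during optimization. Experiments show significant improvements across a range of text-based and GUI-based benchmark tasks.

Details
English abstract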

Recent advancements in agentic test-time scaling allow models to gather environmental feedback before committing to final actions. A key limitation of existing methods is that they typically employ undifferentiated exploration strategies, lacking the ability to adaptively distinguish when exploration is truly required. In this paper, we propose an exploration-aware reinforcement learning framework that enables LLM agents to adaptively explore only when uncertainty is high. Our method introduces a fine-grained reward function via variational inference that explicitly evaluates exploratory actions by estimating their potential to improve future decision-making, together with an exploration-aware grouping mechanism that separates exploratory actions from task-completion actions during optimization. By targeting informational gaps, this design allows agents to explore selectively and transition to execution as soon as the task context is clear. Empirically, we demonstrate that our approach achieves consistent improvements across a range of challenging text-based and GUI-based agent benchmarks. Code is available at https://github.com/HansenHua/EAPO-ICML26 and models are available at https://huggingface.co/hansenhua/EAPO-ICML26.

2605.08828 2026-05-13 cs.AI Version update

When Agents Overtrust Environmental Evidence: An Extensible Agentic Framework for Benchmarking Evidence-Grounding Defects in LLM Agents

Strick Sheng, Ziyue Wang, Liyi Zhou

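Affiliations * The University of Sydney; Nanjing University

AI summary This study proposes EnvTrustBench, an extensible agentic framework for evaluating the reliability of large language model agents when they face stale, incorrect, or malicious environmental information. It defines an evidence-grounding defect (EGD) as a case in which an agent acts on information supplied by the environment without checking it against current evidence, leading to a task error. By constructing task scenarios and generating workspaces and validation mechanisms, EnvTrustBench systematically evaluates a variety of agents across scenarios, showing that the reliability of environmental information broadly affects agent behavior and underscoring the central role of environmental grounding in agent systems.

Details
English abstract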

Large language model agents increasingly operate through environment-facing scaffolds that expose files, web pages, APIs, and logs. These observations influence tool use, state tracking, and action sequencing, yet their reliability and authority are often uncertain. Environmental grounding is therefore a systems-level problem involving context admission, evidence provenance, freshness checking, verification policy, action gating, and model reasoning. Existing agent benchmarks mainly evaluate task capability or specific attacks such as prompt injection and memory poisoning, but they under-specify a fundamental reliability question: whether agents remain grounded in the true environment state when observations are stale, incorrect, or malicious. We introduce EnvTrustBench, an agentic framework for benchmarking this failure mode. We define an evidence-grounding defect (EGD) as a behavioral failure in which an agent treats an environment-facing claim as sufficient evidence for action without resolving it against available current evidence, leading to a task-incorrect false path under the true environment state. Given a task scenario, EnvTrustBench generates the workspace, environment, agent-facing objective, and validation oracle, executes the evaluated agent, records its action-observation trajectory and final state, and applies the oracle to produce a verdict. Using 6 LLM backbones and 5 widely used scaffolds, we evaluate 55 generated cases across 11 task scenarios, with each scenario expanded through five feedback-guided generation iterations. Results show that EGDs consistently emerge across operational workflows, highlighting environmental grounding as a core agent reliability problem with important security implications.

2605.08754 2026-05-13 cs.AI Version update

Value-Decomposed Reinforcement Learning Framework for Taxiway Routing with Hierarchical Conflict-Aware Observations

Shizhong Zhou, Haifeng Liu, Zheng Zhang, Shiyu Zhang, Bo Yang, Yi Lin

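Affiliations * National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University; College of Computer Science, Sichuan University

AI summary This paper proposes CaTR, a reinforcement learning framework for real-time multi-aircraft taxiway routing on the airport surface. Using a hierarchical conflict-aware observation mechanism together with grid-based environment modeling and action masking, the framework captures current and downstream traffic conflict information, and it adopts a value-decomposition strategy to balance the multi-objective trade-off between safety and efficiency. Experiments show that CaTR achieves better combined safety and efficiency than conventional planning and reinforcement learning methods across multiple traffic densities.

Details
English abstract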

Taxiway routing and on-surface conflict avoidance are coupled safety-critical decision problems in airport surface operations. Existing planning and optimization methods are often limited by online computational cost, while reinforcement learning methods may struggle to represent downstream traffic conflicts and balance multiple objectives. This paper presents Conflict-aware Taxiway Routing (CaTR), a reinforcement learning framework for real-time multi-aircraft taxiway routing. CaTR constructs a grid-based airport surface environment with action masking, introduces a hierarchical foresight traffic representation to encode current and downstream conflict-related traffic conditions, and adopts a value-decomposed reinforcement learning strategy to prioritize sparse but safety-critical objectives. Experiments are conducted on a realistic environment based on Changsha Huanghua International Airport under multiple traffic density levels. Results show that CaTR achieves better safety-efficiency trade-offs than representative planning, optimization, and reinforcement learning baselines while maintaining practical runtime.
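Action masking and value decomposition can be combined at decision time in a few lines. The sketch below is one plausible reading, not CaTR's published design: it assumes separate value heads per objective that are mixed with fixed weights, and it restricts the greedy choice to actions the mask marks as valid. The head names, weights, and numbers are all hypothetical.

```python
import math

def select_action(q_heads, weights, valid_mask):
    """Greedy action over a weighted sum of per-objective Q-values,
    considering only actions whose mask entry is True."""
    best_action, best_value = None, -math.inf
    for action, is_valid in enumerate(valid_mask):
        if not is_valid:
            continue  # masked out, e.g. a move off the taxiway grid
        value = sum(weights[name] * q_heads[name][action] for name in q_heads)
        if value > best_value:
            best_action, best_value = action, value
    if best_action is None:
        raise ValueError("no valid action available")
    return best_action

# Four candidate moves; the first is physically impossible and is masked.
q_heads = {
    "safety": [0.0, -5.0, -0.1, 0.0],      # penalties for expected conflicts
    "efficiency": [2.0, 3.0, 1.0, 0.5],    # progress toward the gate
}
weights = {"safety": 10.0, "efficiency": 1.0}  # safety dominates the mix
mask = [False, True, True, True]
print(select_action(q_heads, weights, mask))  # 3
```

Keeping the objectives in separate heads means a rare but severe safety penalty is not averaged away by frequent small efficiency rewards, which is the usual motivation for decomposing the value function.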

2605.08693 2026-05-13 cs.AI Updated version

SkillMaster: Toward Autonomous Skill Mastery in LLM Agents

Min Yang, Jinghua Piao, Xu Xia, Xiaochong Lan, Jiaju Chen, Yongshun Gong, Yong Li

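Affiliations * Shandong University; Zhongguancun Academy; Tsinghua University; Southeast University; University of Science and Technology of China

AI summary SkillMaster is a training framework intended to give LLM agents autonomous skill mastery. Through trajectory-informed skill review, counterfactual utility evaluation, and a dual advantage estimation mechanism, agents learn to create, refine, and select skills on their own while solving tasks, which improves their ability to handle complex tasks. Experiments show that SkillMaster significantly outperforms existing methods on multiple benchmark tasks, marking a shift from passively using skills to actively learning and improving them.

Details
English abstract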

Skills provide an effective mechanism for improving LLM agents on complex tasks, yet in existing agent frameworks, their creation, refinement, and selection are typically governed by external teachers, hand-designed rules, or auxiliary modules. As a result, skills remain external resources to be invoked, rather than capabilities that agents can develop, adapt, and internalize through experience. To endow LLM agents with autonomous skill mastery, we propose SkillMaster, a training framework that teaches agents to create new skills, refine existing skills, and select accumulated skills during task solving. This capability is achieved through three key designs. First, we train agents through trajectory-informed skill review, teaching agents to propose, update, or retain skills based on evidence from completed episodes. Second, each candidate skill edit is designed to be evaluated by its counterfactual utility on related probe tasks, providing a direct learning signal for training skill-editing decisions. Third, we introduce DualAdv-GRPO, which separately estimates advantages for task-solving actions and skill-editing decisions, stabilizing joint training across task solving and skill management. Experiments on ALFWorld and WebShop show that SkillMaster improves the overall success rate over state-of-the-art baselines by 8.8% and 9.3%, respectively, achieving the best performance among all compared methods. Further analysis reveals a marked shift in agent capability: agents trained with SkillMaster can identify skill failures, refine procedural knowledge from trajectory evidence, and transfer improvements to future tasks with limited skill-bank edits. Overall, SkillMaster moves LLM agents beyond mere skill use toward self-improving agents capable of developing, adapting, and applying their own skill repertoires.

2605.08322 2026-05-13 cs.LG cs.AI Updated version

SDG-MoE: Signed Debate Graph Mixture-of-Experts

Stepan Kulibaba, Kirill Labzin, Artem Dzhalilov, Roman Pakhomov, Oleg Svidchenko, Alexander Gasnikov, Aleksei Shpilman

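Affiliations * Innopolis University; Sirius University; HSE University

AI summary This paper proposes SDG-MoE, a novel sparse Mixture-of-Experts (MoE) architecture that aims to improve model performance by introducing structured communication among experts. After routing, the method adds a lightweight iterative deliberation step comprising two interaction matrices, a support graph and a critique graph, together with a disagreement-based anchoring mechanism, to strengthen information exchange and coordination among experts. Experiments show that SDG-MoE significantly outperforms vanilla MoE and an unsigned graph communication baseline on multiple benchmark datasets.

Details
English abstract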

Sparse MoE models achieve a good balance between capacity and compute by routing each token to a small subset of experts. However, in most MoE architectures, once a token is routed, the selected experts process it independently and their outputs are combined via a weighted sum. This leaves open whether enabling communication among them could improve performance. While prior work has raised this question, direct interaction among the active routed experts remains underexplored. In this paper, we propose SDG-MoE (Signed Debate Graph Mixture-of-Experts), a novel architecture that adds a lightweight, iterative deliberation step before final aggregation. SDG-MoE introduces three components: (i) two learned interaction matrices over the active experts, a support graph $A^+$ and a critique graph $A^-$, capturing reinforcing and corrective influences; (ii) a signed message-passing step that updates expert representations before aggregation; and (iii) a disagreement-gated Friedkin-Johnsen-style anchoring that controls deliberation strength while preventing expert drift. Together, these enable a structured deliberation process where interaction strength scales with disagreement and specialization is preserved. We also provide a theoretical analysis establishing stability conditions on expert states and showing that deliberation adds only low-order overhead over the active set. In controlled three-seed pretraining experiments, SDG-MoE improves validation perplexity over both an unsigned graph communication baseline and vanilla MoE, outperforming the strongest baseline by 19.8%, and gives the best external perplexity on WikiText-103, C4, and Paloma among the compared systems.
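
The deliberation step described above can be pictured with a small sketch. This is an illustration under stated assumptions, not the paper's update rule: the abstract gives no equations, so the signed weights (`support[i][j] - critique[i][j]`), the disagreement measure (mean squared deviation from the expert mean), and the gate `d / (1 + d)` are hypothetical choices. Only the overall shape follows the text: signed message passing over the active experts, Friedkin-Johnsen-style anchoring to the initial expert outputs, and interaction strength that grows with disagreement.

```python
def signed_deliberation(states, support, critique, steps=2):
    """Toy signed message passing over k active experts (hypothetical form).

    states:   k expert output vectors (lists of floats), used as anchors.
    support:  k x k matrix standing in for A+ (reinforcing influence).
    critique: k x k matrix standing in for A- (corrective influence).
    """
    k = len(states)
    d = len(states[0])
    anchors = [list(h) for h in states]
    current = [list(h) for h in states]
    for _ in range(steps):
        # Assumed disagreement measure: mean squared deviation from the mean expert.
        mean = [sum(h[c] for h in current) / k for c in range(d)]
        disagreement = sum(
            (h[c] - mean[c]) ** 2 for h in current for c in range(d)
        ) / k
        # Assumed gate in [0, 1): zero when experts agree, larger when they differ.
        gate = disagreement / (1.0 + disagreement)
        updated = []
        for i in range(k):
            # Signed message: support adds a neighbour's state, critique subtracts it.
            message = [
                sum((support[i][j] - critique[i][j]) * current[j][c] for j in range(k))
                for c in range(d)
            ]
            # Friedkin-Johnsen-style anchoring to the expert's initial output.
            updated.append(
                [(1.0 - gate) * anchors[i][c] + gate * message[c] for c in range(d)]
            )
        current = updated
    return current
```

With identical expert outputs the gate is zero and the states are returned unchanged; with two orthogonal experts and a uniform support graph, each state moves part of the way toward the shared message while staying tied to its anchor.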

2605.07744 2026-05-13 cs.AI Updated version

Alternating Target-Path Planning for Scalable Multi-Agent Coordination

Yu Kumagai, Keisuke Okumura

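Affiliations * Hitotsubashi University, Japan; National Institute of Advanced Industrial Science and Technology (AIST), Japan

AI summary This paper studies multi-agent coordination in concurrent target assignment and pathfinding (TAPF) and proposes an iterative refinement framework that decouples target assignment from pathfinding. Built on efficient suboptimal multi-agent pathfinding solvers, the method repeatedly plans paths and uses the resulting feedback to refine the target assignment, which improves scalability. Experiments show that the framework maintains good solution quality while significantly outperforming traditional Conflict-Based Search methods, offering a practical solution for large-scale real-world TAPF.

Details
English abstract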

The concurrent target assignment and pathfinding (TAPF) problem extends multi-agent pathfinding (MAPF) by asking planners to allocate distinct targets and collision-free paths to agents. Prior work on TAPF has relied exclusively on Conflict-Based Search (CBS), which tightly couples target assignment and pathfinding, resulting in compute-intensive, non-scalable solutions. In contrast, we propose an iterative refinement framework that decouples target assignment from pathfinding. Our framework builds on modern, fast, suboptimal MAPF solvers, such as LaCAM. Specifically, within a given time budget, it repeatedly solves MAPF for the current target assignment, identifies bottleneck agents via MAPF feedback, and refines the assignment. Empirical results show that the feedback-driven reassignment loop is effective, enabling our framework to scale well beyond the reach of the state-of-the-art CBS-based solver while maintaining decent solution quality. This represents a solid step toward practical, large-scale TAPF suitable for real-world setups.

2605.06940 2026-05-13 cs.CL cs.AI cs.LG Updated version

MultiSoc-4D: A Benchmark for Diagnosing Instruction-Induced Label Collapse in Closed-Set LLM Annotation of Bengali Social Media

Souvik Pramanik, S. M. Riaz Rahman Antu, Shak Mohammad Abyad, Md. Ibrahim Khalil, Md. Shahriar Hussain

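Affiliations * North South University

AI summary This study presents MultiSoc-4D, a Bengali social media dataset for diagnosing annotation bias of large language models (LLMs) under closed-set instructions. It contains more than 58,000 social media comments from six sources, annotated along four dimensions. Using a structured pipeline with multi-model annotation and a shared validation set, the study systematically reveals a prevalent "instruction-induced label collapse" in LLM annotation: models tend to use fallback labels, leading to severe under-detection of minority categories. The study also statistically validates the resulting "label agreement illusion" and evaluates how annotation bias propagates through the training pipeline across 40+ LLMs, providing an important benchmark for annotation research in low-resource languages.

Comments 21 pages, 14 figures, 13 tables

Details
English abstract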

Annotation automation via Large Language Models (LLMs) is the core approach for scaling NLP datasets; however, LLM behavior with respect to closed-set instructions in low-resource languages has not been well studied. We present MultiSoc-4D, a Bengali social media dataset benchmark, which contains 58K+ social media comments from six sources annotated along four dimensions: category, sentiment, hate speech, and sarcasm. By employing a structured pipeline where ChatGPT, Gemini, Claude, and Grok individually annotate separate partitions, while sharing a common validation set of 20%, we diagnose LLM behavior systematically. We discover a prevalent phenomenon called "instruction-induced label collapse", wherein LLMs show a systematic preference towards fallback labels (Other, Neutral, No), leading to high agreement rates but under-detection of minority categories. For example, we find that LLMs failed to detect 79% and 75% of instances with hateful and sarcastic content compared to a human-calibrated reference. Furthermore, we prove that it represents a "label agreement illusion", statistically validated via almost null Fleiss' Kappa ($\kappa \approx -0.001$) on sarcasm detection. Across 40+ LLMs, we benchmark this annotation bias propagation within the training pipeline, regardless of architectural differences. We release MultiSoc-4D as a diagnostic benchmark for annotation biases in Bengali NLP.
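
The "label agreement illusion" is easier to see with a worked computation. The sketch below implements the standard Fleiss' kappa formula in plain Python and applies it to toy data; the ratings are hypothetical and are not drawn from MultiSoc-4D. When nearly every annotator picks the fallback label, raw agreement is very high while kappa sits near zero, because chance agreement is almost as high.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Return (mean observed agreement, Fleiss' kappa).

    ratings: one list of labels per item, each with the same number of raters.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    # Per-item agreement: fraction of rater pairs that chose the same label.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_item) / n_items
    # Chance agreement from the overall share of each label.
    shares = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    expected = sum(p * p for p in shares)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined with a single label
    return observed, (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): 4 annotators, 100 items, "No" is the fallback label.
toy = [["No"] * 4 for _ in range(96)]
toy += [["Yes", "No", "No", "No"] for _ in range(4)]
observed, kappa = fleiss_kappa(toy, ["Yes", "No"])
```

On this toy data the observed agreement is 0.98 while kappa is about -0.01: the annotators agree almost everywhere, yet no better than chance on the minority label.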

2605.04539 2026-05-13 cs.CL cs.AI Updated version

RLearner-LLM: Balancing Logical Grounding and Fluency in Large Language Models via Hybrid Direct Preference Optimization

Qiming Bao, Juho Leinonen, Paul Denny, Michael J. Witbrock

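Affiliations * University of Auckland; Aalto University

AI summary This paper proposes RLearner-LLM, a new method for balancing logical accuracy and fluency in large language models on knowledge-intensive generation tasks. It introduces Hybrid Direct Preference Optimization (Hybrid-DPO), which combines a DeBERTa-v3-based natural language inference signal with a verifier LLM score to improve logical alignment without human annotation. Experiments show that the method significantly improves logical reasoning across multiple academic domains while preserving fluency, and delivers gains on several base models.

Details
English abstract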

Direct Preference Optimization (DPO), the efficient alternative to PPO-based RLHF, falls short on knowledge-intensive generation: standard preference signals from human annotators or LLM judges exhibit a systematic verbosity bias that rewards fluency over logical correctness. This blindspot leaves a logical alignment gap -- SFT models reach NLI entailment of only 0.05-0.22 despite producing fluent text. We propose RLearner-LLM with Hybrid-DPO: an automated preference pipeline that fuses a DeBERTa-v3 NLI signal with a verifier LLM score, removing human annotation while overcoming the "alignment tax" of single-signal optimization. Evaluated across five academic domains (Biology, Medicine, Law) with three base architectures (LLaMA-2-13B, Qwen3-8B, Gemma 4 E4B-it), RLearner-LLM yields up to 6x NLI improvement over SFT, with NLI gains in 11 of 15 cells and consistent answer-coverage gains. On Gemma 4 E4B-it (4.5B effective params), Hybrid-DPO lifts NLI in four of five domains (+11.9% to +2.4x) with faster inference across all five, scaling down to compact base models without losing the alignment-tax mitigation. Our Qwen3-8B RLearner-LLM wins 95% of pairwise comparisons against its own SFT baseline; GPT-4o-mini in turn wins 95% against our concise output -- alongside the 69% win the same judge gives a verbose SFT over our DPO model, this replicates verbosity bias on a frontier comparator and motivates logic-aware metrics (NLI, ACR) over LLM-as-a-judge for knowledge-intensive generation.

2605.02798 2026-05-13 quant-ph cs.AI cs.ET cs.LG Updated version

Measuring Accuracy and Energy-to-Solution of Quantum Fine-Tuning of Foundational AI Models

Oliver Knitter, Sang Hyub Kim, Maximilian Wurzer, Jonathan Mei, Claudio Girotto, Karen Horovitz, Chi Chen, Masako Yamada, Frederik F. Flöther, Martin Roetteler

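Affiliations * Center for Quantum Computing and Quantum Coherence (QC2); University of Basel

AI summary This paper studies the energy-to-solution (ETS) performance of quantum fine-tuning of foundational AI models in hybrid quantum-classical systems, enabled by direct measurement of the power consumption of a Forte Enterprise trapped-ion quantum processor. Experiments show that, despite noise and limited qubit counts, quantum fine-tuned models match or even exceed classical methods such as logistic regression and support vector classifiers in accuracy. The study also finds that the energy consumption of shallow quantum circuits grows approximately linearly with qubit number while classical simulation grows exponentially, indicating an ETS break-even around 34 qubits; the best quantum fine-tuned model improves classification error over the best classical fine-tuned model by about 24%.

Comments 10 pages, 4 figures

Details
English abstract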

We present an experimental study of energy-to-solution (ETS) of hybrid quantum-classical applications, enabled by direct instrumentation of power consumption of a Forte Enterprise trapped-ion quantum processor. We apply this methodology to a hybrid quantum-classical pipeline for quantum fine-tuning of foundational AI models, and validate the approach end-to-end on quantum hardware. Despite noise and limited qubit counts, the resulting models achieve accuracy competitive with and exceeding classical baselines such as logistic regression and support vector classifiers. Our results show that QPU energy consumption scales approximately linearly with qubit number for shallow circuits, while classical simulation exhibits exponential scaling, indicating a break-even for ETS around 34 qubits. The classification error improvement of the best quantum fine-tuned model over the best classical fine-tuned model considered in this study is around 24%. We further contextualize these findings with comparisons to tensor network methods. This work establishes energy-to-solution as a measurable and scalable metric for evaluating quantum applications and provides experimental evidence of favorable energy-accuracy trade-offs.

2605.00939 2026-05-13 cs.LG cs.AI Updated version

From Flat Facts to Sharp Hallucinations: Detecting Stubborn Errors via Gradient Sensitivity

Yee Zhing Liew, Andrew Huey Ping Tan, Anwar P. P. Abdul Majeed

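Affiliations * School of Intelligent Manufacturing Ecosystem, Xi’an Jiaotong-Liverpool University, People’s Republic of China; Department of Computer Science, University of Liverpool, United Kingdom; Faculty of Engineering and Technology, Sunway University, Malaysia; School of Robotics, Xi’an Jiaotong-Liverpool University, People’s Republic of China

AI summary This paper studies "stubborn hallucinations", which traditional hallucination detection struggles to catch: cases where a language model is highly confident about wrong information. The authors propose a geometric detection method based on gradient sensitivity, Embedding-Perturbed Gradient Sensitivity (EPGS), which adds Gaussian noise to input embeddings and measures the change in gradient magnitude to distinguish stable knowledge from brittle memorization. Experiments show that the method significantly outperforms entropy-based and representation-based baselines at detecting high-confidence factual errors.

Comments Accepted to ICML 2026. Camera-ready version

Details
English abstract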

Traditional hallucination detection fails on "Stubborn Hallucinations" - errors where LLMs are confidently wrong. We propose a geometric solution: Embedding-Perturbed Gradient Sensitivity (EPGS). We hypothesize that while robust facts reside in flat minima, stubborn hallucinations sit in sharp minima, supported by brittle memorization. EPGS detects this sharpness by perturbing input embeddings with Gaussian noise and measuring the resulting spike in gradient magnitude. This acts as an efficient proxy for the Hessian spectrum, differentiating stable knowledge from unstable memorization. Our experiments show that EPGS significantly outperforms entropy-based and representation-based baselines, providing a robust signal for detecting high-confidence factual errors.
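
The flat-versus-sharp intuition can be illustrated on a toy problem. The sketch below is not the EPGS implementation: it swaps the language-model loss for two hypothetical quadratic losses, uses finite-difference gradients instead of backpropagation, and the names `gradient_sensitivity`, `flat_loss`, and `sharp_loss` are invented for illustration. It only shows that, around a minimum, Gaussian perturbation of the input produces a much larger jump in gradient norm when the curvature is high.

```python
import math
import random


def numerical_gradient(loss, x, h=1e-5):
    """Central finite-difference gradient of a scalar loss at point x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((loss(up) - loss(down)) / (2.0 * h))
    return grad


def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))


def gradient_sensitivity(loss, embedding, sigma=0.1, n_samples=16, seed=0):
    """Mean gradient norm under Gaussian input noise minus the unperturbed norm."""
    rng = random.Random(seed)
    base = l2_norm(numerical_gradient(loss, embedding))
    total = 0.0
    for _ in range(n_samples):
        noisy = [e + rng.gauss(0.0, sigma) for e in embedding]
        total += l2_norm(numerical_gradient(loss, noisy))
    return total / n_samples - base


def flat_loss(x):
    # Low curvature: a stand-in for a robustly stored fact (assumption).
    return 0.5 * 0.1 * sum(c * c for c in x)


def sharp_loss(x):
    # High curvature: a stand-in for brittle memorization (assumption).
    return 0.5 * 50.0 * sum(c * c for c in x)


point = [0.0, 0.0, 0.0]
flat_score = gradient_sensitivity(flat_loss, point)
sharp_score = gradient_sensitivity(sharp_loss, point)
```

Because both calls use the same seed, the same noise is applied to each loss, so the difference between `flat_score` and `sharp_score` comes only from curvature.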

2604.24801 2026-05-13 cs.LG cs.AI Updated version

Architecture Determines Observability of Transformers

Thomas Carmichael

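Affiliations * Independent Researcher

AI summary This study examines how architecture affects observability in Transformer models, noting that autoregressive Transformers can still make errors that output-confidence monitoring fails to detect. It finds that the decision-quality information carried by activation signals is determined mainly by model architecture and training rather than by output confidence itself. Experiments show that controlling for output confidence removes most of the activation-probe signal, and that the observability of the remaining signal depends on architecture and training, offering a new perspective for model monitoring and training design.

Comments 31 pages, 8 figures, 14 tables. v3 of arXiv:2604.24801. Code v5.1.0: https://github.com/tmcarmichael/nn-observability/tree/v5.1.0 Changelog: https://github.com/tmcarmichael/nn-observability/blob/v5.1.0/CHANGELOG.md Croissant: https://github.com/tmcarmichael/nn-observability/blob/v5.1.0/croissant.json

Details
English abstract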

Autoregressive transformers make confident errors that output-confidence monitoring cannot catch. Activation monitors catch them only when training leaves a decision-quality signal beyond what the output already exposes. This signal is an architectural property of the trained model, fixed upstream of any monitor. Controlling for output confidence removes 60.3% of the raw activation-probe signal on average across 14 models. Raw probe signal is mostly output confidence, and output-side readouts cannot recover the residual. What remains depends on architecture and training. In Pythia's controlled training, both matched-width configurations form the signal early. One preserves it through convergence while another erases it as perplexity continues to improve. Capability and observability are not inherently in tension. Across independently trained families this pattern persists, even as the collapse point shifts. Where the signal survives, monitoring catches what confidence cannot. On downstream QA, a WikiText-trained probe with no task-specific tuning catches about one in eight confident errors that output-confidence monitoring misses, at a 20% flag rate. These results establish signal engineering as a training-time design axis alongside loss and capability. Architecture sets the conditions for observability, and training determines what remains readable.

2604.22026 2026-05-13 cs.AI cs.CY cs.DL Updated version

Rethinking Publication: A Certification Framework for AI-Enabled Research

Yang Lu, Rabimba Karanjai, Lei Xu, Weidong Shi

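Affiliations * Department of Computer Science, University of Houston, Houston, Texas

AI summary This paper proposes a two-layer certification framework for evaluating AI-generated research, addressing the limits of a publication system built on the assumption of human authorship. The framework separates the assessment of knowledge validity from the assessment of human contribution: the former ensures the work is scientifically sound, and the latter makes explicit how much humans participated in the research. The paper also proposes dedicated benchmark submission slots to support transparent publication of fully automated research, and argues that contributions should be judged by epistemic value rather than by authorship.

Comments correct references

Details
English abstract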

AI research pipelines can now generate academic work that may satisfy existing peer review standards for quality, novelty, and methodological rigor. However, the publication system was built around the assumption that research is produced by human authors. It therefore lacks a clear way to evaluate work when the knowledge claim may be valid but the producer is partly or fully automated. This paper proposes a two-layer certification framework for AI-generated research. The first layer evaluates whether the knowledge claim is sound. The second layer evaluates the level of human contribution. This separation allows journals and conferences to assess pipeline-generated work more consistently without creating new institutions. The framework uses normative analysis, conceptual design, and dry-run validation against representative submission cases. It classifies human contribution into three categories: Category A, where the work is reachable by an automated pipeline; Category B, where human direction is required at identifiable stages; and Category C, where the work goes beyond current pipeline capability, especially at the problem-formulation stage. The paper also proposes dedicated benchmark slots for fully disclosed automated research. These slots would provide a transparent publication path and help reviewers calibrate judgments over time. The key argument is that publication has historically certified two things at once: that the knowledge is valid and that a human produced it. AI research pipelines separate these two claims. By decoupling knowledge certification from authorship attribution, the proposed framework responds to a structural change already underway. It can be implemented within existing editorial systems, works even when attribution is uncertain, and recognizes human frontier contribution based on epistemic value rather than human origin alone.

2604.21052 2026-05-13 cs.CV cs.AI Updated version

StyleVAR: Controllable Image Style Transfer via Visual Autoregressive Modeling

Liqi Jing, Dingming Zhang, Peinian Li, Lichen Zhu, Yang Xu, Hanyu Xing

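Affiliations * Duke University; University of Southern California; Xidian University

AI summary StyleVAR is a controllable image style transfer method built on the Visual Autoregressive Modeling (VAR) framework. Images are decomposed into multi-scale representations and encoded as discrete codes, and a transformer performs conditional discrete sequence modeling to blend style and content controllably. The method introduces a blended cross-attention mechanism and a scale-dependent blending coefficient to combine style and content information while preserving autoregressive continuity. Experiments show that StyleVAR outperforms a conventional AdaIN baseline on multiple benchmarks, with strong perceptual similarity and structure preservation, especially on landscapes and architectural scenes.

Details
English abstract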

We build on the Visual Autoregressive Modeling (VAR) framework and formulate style transfer as conditional discrete sequence modeling in a learned latent space. Images are decomposed into multi-scale representations and tokenized into discrete codes by a VQ-VAE; a transformer then autoregressively models the distribution of target tokens conditioned on style and content tokens. To inject style and content information, we introduce a blended cross-attention mechanism in which the evolving target representation attends to its own history, while style and content features act as queries that decide which aspects of this history to emphasize. A scale-dependent blending coefficient controls the relative influence of style and content at each stage, encouraging the synthesized representation to align with both the content structure and the style texture without breaking the autoregressive continuity of VAR. We train StyleVAR in two stages from a pretrained VAR checkpoint: supervised fine-tuning on a large triplet dataset of content-style-target images, followed by reinforcement fine-tuning with Group Relative Policy Optimization (GRPO) against a DreamSim-based perceptual reward, with per-action normalization weighting to rebalance credit across VAR's multi-scale hierarchy. Across three benchmarks spanning in-, near-, and out-of-distribution regimes, StyleVAR consistently outperforms an AdaIN baseline on Style Loss, Content Loss, LPIPS, SSIM, DreamSim, and CLIP similarity, and the GRPO stage yields further gains over the SFT checkpoint, most notably on the reward-aligned perceptual metrics. Qualitatively, the method transfers texture while maintaining semantic structure, especially for landscapes and architectural scenes, while a generalization gap on internet images and difficulty with human faces highlight the need for better content diversity and stronger structural priors.

2604.15664 2026-05-13 cs.LG cs.AI Updated version

Stargazer: A Scalable Model-Fitting Benchmark Environment for AI Agents under Astrophysical Constraints

Xinge Liu, Terry Jingchen Zhang, Bernhard Schölkopf, Zhijing Jin, Kristen Menou

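Affiliations * University of Toronto; Vector Institute; Max Planck Institute for Intelligent Systems; ELLIS Institute Tübingen

AI summary This paper introduces Stargazer, a scalable benchmark environment for evaluating AI agents on dynamic model-fitting tasks under astrophysical constraints. Built on radial-velocity time series data, it contains 120 tasks ranging from high-SNR single-planet systems to complex low-SNR multi-planet systems, including 20 real archival cases. The study finds that although current frontier agents fit the data well statistically, they fall notably short at recovering physical parameters, and that additional compute brings only limited gains. Stargazer provides a platform for training and evaluating AI agents on model-fitting problems relevant to real research.

Details
English abstract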

The rise of autonomous AI agents suggests that dynamic benchmark environments with built-in feedback on scientifically grounded tasks are needed to evaluate the capabilities of these agents in research work. We introduce Stargazer, a scalable environment for evaluating AI agents on dynamic, iterative physics-grounded model-fitting tasks using inference on radial-velocity (RV) time series data. Stargazer comprises 120 tasks across three difficulty tiers, including 20 real archival cases, covering diverse scenarios ranging from high-SNR single-planet systems to complex multi-planetary configurations requiring involved low-SNR analysis. Our evaluation of eight frontier agents reveals a gap between numerical optimization and adherence to physical constraints: although agents often achieve a good statistical fit, they frequently fail to recover correct physical system parameters, a limitation that persists even when agents are equipped with vanilla skills. Furthermore, increasing test-time compute yields only marginal gains, with excessive token usage often reflecting recursive failure loops rather than meaningful exploration. Stargazer presents an opportunity to train, evaluate, scaffold, and scale strategies on a model-fitting problem of practical research relevance today. Our methodology to design a simulation-driven environment for AI agents presumably generalizes to many other model-fitting problems across scientific domains. Source code and the project website are available at https://github.com/AIPS-UofT/Stargazer and https://aips-uoft.github.io/Stargazer/, respectively.

2604.14717 2026-05-13 cs.AI cs.CR cs.CY cs.LG Updated version

Layered Mutability: Continuity and Governance in Persistent Self-Modifying Agents

Krti Tallam

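Affiliations * Kamiwaza AI

AI summary This paper proposes the "layered mutability" framework for analyzing how the behavior of persistent self-modifying language-model agents evolves across five layers: pretraining, alignment, self-narrative, memory, and weight adaptation. It argues that governance becomes markedly harder when internal change is rapid, strongly coupled, irreversible, and hard to observe, producing a systematic mismatch between the layers that affect behavior and the layers humans can observe. By introducing drift, governance-load, and hysteresis quantities and reporting an experiment, the paper shows that the main failure mode of such agents is not abrupt misalignment but "compositional drift" caused by the accumulation of locally reasonable updates.

Comments 17 pages, 2 figures, 3 tables. self-modifying agents; AI governance; identity drift; persistent memory; runtime adaptation; model editing Primary: cs.AI Cross-list: cs.LG, cs.CY

Details
English abstract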

Persistent language-model agents increasingly combine tool use, tiered memory, reflective prompting, and runtime adaptation. In such systems, behavior is shaped not only by current prompts but by mutable internal conditions that influence future action. This paper introduces layered mutability, a framework for reasoning about that process across five layers: pretraining, post-training alignment, self-narrative, memory, and weight-level adaptation. The central claim is that governance difficulty rises when mutation is rapid, downstream coupling is strong, reversibility is weak, and observability is low, creating a systematic mismatch between the layers that most affect behavior and the layers humans can most easily inspect. I formalize this intuition with simple drift, governance-load, and hysteresis quantities, connect the framework to recent work on temporal identity in language-model agents, and report a preliminary ratchet experiment in which reverting an agent's visible self-description after memory accumulation fails to restore baseline behavior. In that experiment, the estimated identity hysteresis ratio is 0.68. The main implication is that the salient failure mode for persistent self-modifying agents is not abrupt misalignment but compositional drift: locally reasonable updates that accumulate into a behavioral trajectory that was never explicitly authorized.

2604.12625 2026-05-13 cs.GR cs.AI Updated version

Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments

Jianhui Wu, Jian Zhou, Zhi Zhou, Zhangjin Huang, Chao Li

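Affiliations * University of Science and Technology of China; Zhejiang University

AI summary This paper proposes Neural Dynamic GI (NDGI), a new compression technique for efficient storage and real-time rendering of temporal lightmaps in dynamic lighting environments. The method uses multi-dimensional feature maps and lightweight neural networks to integrate temporal information instead of explicitly storing multiple lightmap sets, which greatly reduces storage, and introduces a block compression simulation strategy to further improve the compression ratio. Combined with a virtual texturing system, NDGI maintains high-quality dynamic global illumination with low storage requirements and modest real-time decompression overhead.

Comments Accepted to CVPR 2026

Details
English abstract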

High-quality global illumination (GI) in real-time rendering is commonly achieved using precomputed lighting techniques, with lightmap as the standard choice. To support GI for static objects in dynamic lighting environments, multiple lightmaps at different lighting conditions need to be precomputed, which incurs substantial storage and memory overhead. To overcome this limitation, we propose Neural Dynamic GI (NDGI), a novel compression technique specifically designed for temporal lightmap sets. Our method utilizes multi-dimensional feature maps and lightweight neural networks to integrate the temporal information instead of storing multiple sets explicitly, which significantly reduces the storage size of lightmaps. Additionally, we introduce a block compression (BC) simulation strategy during the training process, which enables BC compression on the final generated feature maps and further improves the compression ratio. To enable efficient real-time decompression, we also integrate a virtual texturing (VT) system with our neural representation. Compared with prior methods, our approach achieves high-quality dynamic GI while maintaining remarkably low storage and memory requirements, with only modest real-time decompression overhead. To facilitate further research in this direction, we will release our temporal lightmap dataset precomputed in multiple scenes featuring diverse temporal variations.

2604.11048 2026-05-13 cs.CL cs.AI Updated version

A Systematic Analysis of the Impact of Persona Steering on LLM Capabilities

Jiaqi Chen, Ming Wang, Tingna Xie, Shi Feng, Yongkang Liu

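Affiliations * School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China; School of Computing and Information Systems, Singapore Management University, Singapore 178902, Singapore; School of Computer and Communication Engineering, Northeastern University, Qinhuangdao 066004, China

AI summary This paper systematically analyzes how inducing specific personality traits in large language models affects their cognitive capabilities. Using the Neuron-based Personality Trait Induction (NPTI) framework, it evaluates the effect of the Big Five personality traits on model performance across six cognitive benchmarks, finding that persona induction changes not only interaction style but also produces stable shifts in cognitive task performance, with effects that vary by task type and trait. The study also proposes Dynamic Persona Routing (DPR), a lightweight strategy that outperforms fixed persona settings without additional training.

Details
English abstract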

Imbuing Large Language Models (LLMs) with specific personas is prevalent for tailoring interaction styles, yet the impact on underlying cognitive capabilities remains unexplored. We employ the Neuron-based Personality Trait Induction (NPTI) framework to induce Big Five personality traits in LLMs and evaluate performance across six cognitive benchmarks. Our findings reveal that persona induction produces stable, reproducible shifts in cognitive task performance beyond surface-level stylistic changes. These effects exhibit strong task dependence: certain personalities yield consistent gains on instruction-following, while others impair complex reasoning. Effect magnitude varies systematically by trait dimension, with Openness and Extraversion exerting the most robust influence. Furthermore, LLM effects show 73.68% directional consistency with human personality-cognition relationships. Capitalizing on these regularities, we propose Dynamic Persona Routing (DPR), a lightweight query-adaptive strategy that outperforms the best static persona without additional training.

2604.06779 2026-05-13 cs.AI Updated version

VASR: Variance-Aware Systematic Resampling for Reward-Guided Diffusion

Shivanshu Shekhar, Sagnik Mukherjee, Jia Yi Zhang, Tong Zhang

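Affiliations * Siebel School of Computing and Data Science; Department of Statistics; University of Illinois Urbana-Champaign

AI summary This paper proposes VASR, a variance-aware systematic resampling method that addresses rapid lineage collapse of particles in Sequential Monte Carlo (SMC) samplers for reward-guided diffusion models. By separating continuation variance from residual variance, the study shows that the high offspring-count variance caused by conventional multinomial resampling is the main driver of collapse, and proposes an improved method based on variance-optimal mass allocation and systematic resampling. VASR and its variant VASR-Max deliver better sample quality and higher computational efficiency across multiple tasks, and are training-free and parallelizable.

Details
English abstract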

Sequential Monte Carlo (SMC) samplers for reward-guided diffusion models often suffer from rapid lineage collapse: a few high-reward particles dominate the population within a handful of resampling steps, destroying diversity and degrading sample quality. We propose a variance-decomposition framework for reward-guided diffusion SMC that separates continuation variance $V_t^{\mathrm{cont}}$ from residual variance $V_t^{\mathrm{res}}$, revealing that high offspring-count variance under the commonly used multinomial resampling drives this collapse. This motivates VASR (Variance-Aware Systematic Resampling), which addresses both variance terms via variance-optimal mass allocation $m_t \propto w_t e^{r_t}$ (minimizing $V_t^{\mathrm{cont}}$) and systematic resampling (controlling $V_t^{\mathrm{res}}$). For latent diffusion models where intermediate rewards are noisy due to stochastic continuations, we propose VASR-Max, a deliberately biased high-selection variant for variance-sensitive reward optimization. Both methods are training-free, fully parallelizable, and add only linear overhead. On MNIST and CIFAR-10, VASR achieves as high as 26% better FID than prior SMC methods while remaining 66 times faster than MCTS-based value methods at matched compute. On text-to-image generation, VASR-Max consistently outperforms the strongest SMC baseline across compute budgets and matches MCTS-based methods within 2.5-3% reward at high budgets while being approximately times faster.
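
The two ingredients named above, mass allocation $m_t \propto w_t e^{r_t}$ and systematic resampling, can be sketched in a few lines. This is a minimal illustration, not the authors' code: systematic resampling is the textbook algorithm, while the function names, the per-particle reading of the allocation formula, and the max-subtraction for numerical stability are assumptions made here.

```python
import math
import random


def allocate_mass(weights, rewards):
    """Normalized masses m_i proportional to w_i * exp(r_i)."""
    # Subtracting the max reward leaves the normalized result unchanged
    # and avoids overflow in exp (an implementation choice, assumed here).
    r_max = max(rewards)
    raw = [w * math.exp(r - r_max) for w, r in zip(weights, rewards)]
    total = sum(raw)
    return [x / total for x in raw]


def systematic_resample(masses, rng):
    """Textbook systematic resampling: one uniform draw, n evenly spaced points."""
    n = len(masses)
    u0 = rng.random() / n
    indices = []
    i = 0
    cumulative = masses[0]
    for k in range(n):
        u = u0 + k / n
        # Advance to the particle whose cumulative-mass interval contains u.
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += masses[i]
        indices.append(i)
    return indices


rng = random.Random(0)
masses = allocate_mass([0.25, 0.25, 0.25, 0.25], [0.0, 1.0, 2.0, 3.0])
parents = systematic_resample(masses, rng)
```

With systematic resampling, each particle's offspring count is either the floor or the ceiling of `n * m_i`, which is why it has lower offspring-count variance than multinomial resampling, where counts can stray much further from their expectation.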

2604.06485 2026-05-13 cs.LG cs.AI Updated version

Inference-Time Code Selection via Symbolic Equivalence Partitioning

David Cho, Yifan Wang, Fanping Sui, Ananth Grama

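Affiliations * Texas Instruments

AI summary This paper studies how to select a correct solution at inference time from multiple candidate programs generated by a large language model. The authors propose Symbolic Equivalence Partitioning (SEP), which uses the public examples provided with a problem as validity signals and applies symbolic execution to partition candidate programs into functional equivalence classes, from which the most likely correct solution is selected. Experiments show that the method significantly improves code selection accuracy on multiple benchmarks without additional test generation or learned verifiers.

Details
English abstract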

Sampling multiple candidate programs at inference time is an effective way to improve LLM code generation. However, its benefit depends on reliably selecting a correct solution from the generated pool. We observe that this selection problem has a distinctive semantic structure: correct solutions, despite differences in syntax, implementation, or algorithmic strategy, often converge to the same functional behavior over valid inputs. At the same time, consensus alone is not sufficient for correctness, because models can also produce correlated wrong solutions that implement the same mistaken behavior. We propose Symbolic Equivalence Partitioning (SEP), an inference-time selection framework that first uses problem-provided public examples as lightweight validity signals. SEP then uses symbolic execution to partition the remaining candidate programs into bounded functional equivalence classes and selects from the dominant equivalence class. Across HumanEval+ and LiveCodeBench, SEP consistently improves selection accuracy without auxiliary test generation, learned verifiers, or additional LLM inference. At $N=10$, SEP improves average accuracy from 0.754 to 0.826 on HumanEval+ and from 0.565 to 0.647 on LiveCodeBench, showing that symbolic functional agreement is an effective signal for inference-time code selection.
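
The filter-then-partition pipeline described above can be sketched as follows. One substitution is made and should be read plainly: instead of symbolic execution, this sketch groups candidates by their concrete outputs on a small, bounded set of probe inputs, which is a much weaker stand-in. All function names and the toy task are hypothetical.

```python
from collections import defaultdict


def passes_public_examples(program, examples):
    """True if the candidate reproduces every public (args, expected) example."""
    for args, expected in examples:
        try:
            if program(*args) != expected:
                return False
        except Exception:
            return False
    return True


def behaviour_signature(program, probe_inputs):
    """Observed behaviour on bounded probes (stand-in for symbolic execution)."""
    outputs = []
    for args in probe_inputs:
        try:
            outputs.append(repr(program(*args)))
        except Exception as exc:
            outputs.append("raises " + type(exc).__name__)
    return tuple(outputs)


def select_by_partition(candidates, examples, probe_inputs):
    """Filter by public examples, partition by behaviour, pick from the largest class."""
    survivors = [p for p in candidates if passes_public_examples(p, examples)]
    if not survivors:
        return None
    classes = defaultdict(list)
    for p in survivors:
        classes[behaviour_signature(p, probe_inputs)].append(p)
    dominant = max(classes.values(), key=len)
    return dominant[0]


# Toy task (hypothetical): absolute value, with one public example.
candidates = [
    lambda x: abs(x),
    lambda x: x if x >= 0 else -x,
    lambda x: max(x, -x),
    lambda x: x,                   # wrong, but passes the public example
    lambda x: x if x > 0 else 0,   # wrong, but passes the public example
    lambda x: -x,                  # wrong, filtered out by the public example
]
chosen = select_by_partition(candidates, [((3,), 3)], [(-2,), (0,), (5,)])
```

Here the three correct candidates share one behaviour class while each wrong survivor sits alone, so the dominant class is the correct one. As the abstract notes, this is not guaranteed in general: correlated wrong solutions can also form a large class.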

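2604.04894 2026-05-13 cs.CL cs.AI cs.LG Updated version

Asymmetric Advantage Modulation Calibrates Entropy Dynamics in RLVR

Hengrui Gu, Xiaotian Han, Yujing Bian, Feiyi Wang, Kaixiong Zhou

Affiliations * North Carolina State University; Case Western Reserve University; Oak Ridge National Laboratory

AI summary In reinforcement learning with verifiable rewards (RLVR), the reasoning ability of large language models (LLMs) improves, but restricted exploration often prevents them from finding diverse solutions. This paper proposes AsymGRPO, a new method for regulating entropy dynamics that splits the advantage estimator into positive and negative channels, which separately control productive entropy and noisy entropy and so guide learning more precisely. The method performs well on multiple mathematical reasoning tasks and significantly outperforms existing RLVR baselines.

Abstract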

Reinforcement learning with verifiable rewards (RLVR) has substantially improved the reasoning ability of large language models (LLMs), but it often suffers from \textit{restricted exploration}, where the policy rapidly concentrates on a narrow set of solutions. A common remedy is entropy regularization, which attempts to preserve exploration by increasing policy entropy. However, for LLM-RL, this intervention is highly sensitive to its coefficient, can introduce semantically weak uncertainty, and often yields limited accuracy gains. This motivates a more precise question: which entropy helps reasoning, and which entropy should be reduced? To study this, we parameterize the advantage estimator in Group Relative Policy Optimization (GRPO) into positive and negative outcome-conditioned channels and analyze their entropy dynamics. Our results show that positive-channel modulation raises \textit{productive entropy} associated with successful reasoning trajectories, while negative-channel modulation removes \textit{noisy entropy} associated with failed rollouts and reduces interference with correct paths. Guided by this channel-wise view, we propose \textbf{AsymGRPO}, which decouples the modulation strengths of positive and negative advantages. This enables flexible control over how the model updates across prompt difficulty levels, allowing stronger reinforcement of rare successes on harder prompts or stronger suppression of residual failures on easier prompts without forcing the two channels to share the same modulation strength. Experiments on five mathematical reasoning benchmarks show that AsymGRPO outperforms strong RLVR baselines, with consistent gains across model backbones.
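
A rough sketch of what decoupled positive and negative advantage channels could look like is given below. The abstract does not state the exact modulation formula, so the simple multiplicative scaling and the parameter names `pos_scale` and `neg_scale` are assumptions, not the paper's definition.

```python
import statistics


def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: rewards standardized within one prompt group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def asymmetric_advantages(rewards, pos_scale=1.0, neg_scale=1.0):
    """Scale positive and negative advantages independently (assumed form)."""
    return [
        a * (pos_scale if a > 0 else neg_scale)
        for a in grpo_advantages(rewards)
    ]


# One hard prompt: a single success among four rollouts.
group_rewards = [1.0, 0.0, 0.0, 0.0]
base = grpo_advantages(group_rewards)
modulated = asymmetric_advantages(group_rewards, pos_scale=2.0, neg_scale=0.5)
print(base)
print(modulated)
```

With `pos_scale > neg_scale`, the rare success on a hard prompt is reinforced more strongly than the failures are suppressed; swapping the two settings gives the easier-prompt behaviour the abstract describes.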

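2604.03409 2026-05-13 cs.CE cs.AI Updated version

Generative AI for material design: A mechanics perspective from burgers to matter

Vahidullah Tac, Ellen Kuhl

Affiliations * Department of Mechanical Engineering, Stanford University

AI summary This paper examines generative AI for material design from a mechanics perspective and reveals the theoretical common ground between diffusion-based generative models and computational mechanics. Using a three-ingredient burger as a low-dimensional design benchmark, it shows how the diffusion process connects to mechanics principles such as Bayesian inversion and the Ornstein-Uhlenbeck process, then extends the method to a high-dimensional design space where neural networks learn the reverse process for efficient generation. Experiments show that the generated burgers outperform a classic product in sensory testing, validating the effectiveness and practicality of the method under high-dimensional physical constraints.

Comments 25 pages, 15 figures, 2 tables

Abstract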

Generative artificial intelligence offers a new paradigm to design matter in high-dimensional spaces. However, its underlying mechanisms remain difficult to interpret and limit adoption in computational mechanics. This gap is striking because its core tools (diffusion, stochastic differential equations, and inverse problems) are fundamental to the mechanics of materials. Here we show that diffusion-based generative AI and computational mechanics are rooted in the same principles. We illustrate this connection using a three-ingredient burger as a minimal benchmark for material design in a low-dimensional space, where both forward and reverse diffusion admit analytical solutions: Markov chains with Bayesian inversion in the discrete case and the Ornstein-Uhlenbeck process with score-based reversal in the continuous case. We extend this framework to a high-dimensional design space with 146 ingredients and $8.9 \times 10^{43}$ possible configurations, where analytical solutions become intractable. We therefore learn the discrete and continuous reverse processes using neural network models that infer inverse dynamics from data. We train the models on only 2,260 recipes and generate one million samples that capture the statistical structure of the data, including ingredient prevalence and quantitative composition. We further generate five new burgers and validate them in a blinded restaurant-based sensory study with n = 101 participants, where three of the AI-designed burgers outperform the classical Big Mac in overall liking, flavor, and texture. These results establish diffusion-based generative modeling as a physically grounded approach to design in high-dimensional spaces. They position generative AI as a natural extension of computational mechanics, with applications from burgers to matter, and establish a path toward data-driven, physics-informed generative design.
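
The continuous case has a textbook closed form worth spelling out. The sketch below uses the standard one-dimensional Ornstein-Uhlenbeck process $dX = -\theta X\,dt + \sigma\,dW$; the parameter names `theta` and `sigma` and the scalar setting are assumptions, since the abstract does not give the paper's parameterization.

```python
import math


def ou_marginal(x0, t, theta=1.0, sigma=1.0):
    """Mean and variance of X_t given X_0 = x0 for dX = -theta*X dt + sigma dW."""
    mean = x0 * math.exp(-theta * t)
    var = sigma**2 / (2.0 * theta) * (1.0 - math.exp(-2.0 * theta * t))
    return mean, var


def ou_score(x, x0, t, theta=1.0, sigma=1.0):
    """Score of the Gaussian transition density, d/dx log p_t(x | x0), for t > 0."""
    mean, var = ou_marginal(x0, t, theta, sigma)
    return -(x - mean) / var


# Forward diffusion forgets the starting point: the mean decays to 0 and
# the variance approaches sigma^2 / (2 * theta).
for t in (0.1, 1.0, 10.0):
    print(t, ou_marginal(2.0, t))
```

Because the transition density is Gaussian, its score is available analytically, which is what makes an exact score-based reversal possible in the low-dimensional setting.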

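2603.28561 2026-05-13 cs.RO cs.AI Updated version

Fine-Tuning Large Language Models for Cooperative Tactical Deconfliction of Small Unmanned Aerial Systems

Iman Sharifi, Alex Zongo, Peng Wei

Affiliations * George Washington University

AI summary As small unmanned aerial systems spread through low-altitude airspace, achieving reliable tactical deconfliction under safety constraints has become a pressing problem. This paper studies fine-tuning large language models (LLMs) for cooperative multi-agent deconfliction. It proposes a simulation-to-language data generation pipeline based on the BlueSky simulator that produces deconfliction datasets consistent with aviation safety rules, and optimizes a pretrained model with Low-Rank Adaptation (LoRA) and preference-based fine-tuning. Experiments show that the approach significantly improves decision accuracy, consistency, and separation performance, and effectively reduces near mid-air collisions.

Comments 15 pages, 6 figures, to be published in CVPR 2026 Workshop Proceedings

Abstract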

The growing deployment of small Unmanned Aerial Systems (sUASs) in low-altitude airspaces has increased the need for reliable tactical deconfliction under safety-critical constraints. Tactical deconfliction involves short-horizon decision-making in dense, partially observable, and heterogeneous multi-agent environments, where both cooperative separation assurance and operational efficiency must be maintained. While Large Language Models (LLMs) exhibit strong reasoning capabilities, their direct application to air traffic control remains limited by insufficient domain grounding and unpredictable output inconsistency. This paper investigates LLMs as decision-makers in cooperative multi-agent tactical deconfliction using fine-tuning strategies that align model outputs to human operator heuristics. We propose a simulation-to-language data generation pipeline based on the BlueSky air traffic simulator that produces rule-consistent deconfliction datasets reflecting established safety practices. A pretrained Qwen-Math-7B model is fine-tuned using two parameter-efficient strategies: supervised fine-tuning with Low-Rank Adaptation (LoRA) and preference-based fine-tuning combining LoRA with Group-Relative Policy Optimization (GRPO). Experimental results on validation datasets and closed-loop simulations demonstrate that supervised LoRA fine-tuning substantially improves decision accuracy, consistency, and separation performance compared to the pretrained LLM, with significant reductions in near mid-air collisions. GRPO provides additional coordination benefits but exhibits reduced robustness when interacting with heterogeneous agent policies.

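2603.28488 2026-05-13 cs.CL cs.AI cs.MA Updated version

Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification

Masnun Nuha Chowdhury, Nusrat Jahan Beg, Umme Hunny Khan, Syed Rifat Raiyan, Md Kamrul Hasan, Hasan Mahmud

Affiliations * Systems and Software Lab (SSL), Department of Computer Science and Engineering

AI summary To address the unreliability of large language models in high-stakes claim verification, this study proposes PROClaim, a courtroom-debate-style multi-agent framework that introduces role specialization and Progressive Retrieval-Augmented Generation (P-RAG) to improve the depth and accuracy of evidence retrieval and reasoning. Through a structured debate process, evidence negotiation, and heterogeneous multi-judge aggregation, the method strengthens calibration and robustness. In zero-shot tests it outperforms standard multi-agent debate by 10 percentage points, validating its effectiveness for controversial claim verification.

Comments Under review, 7 figures, 12 tables

Abstract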

Large language models (LLMs) remain unreliable for high-stakes claim verification due to hallucinations and shallow reasoning. While retrieval-augmented generation (RAG) and multi-agent debate (MAD) address this, they are limited by one-pass retrieval and unstructured debate dynamics. We propose a courtroom-style multi-agent framework, PROClaim, that reformulates verification as a structured, adversarial deliberation. Our approach integrates specialized roles (e.g., Plaintiff, Defense, Judge) with Progressive RAG (P-RAG) to dynamically expand and refine the evidence pool during the debate. Furthermore, we employ evidence negotiation, self-reflection, and heterogeneous multi-judge aggregation to enforce calibration, robustness, and diversity. In zero-shot evaluations on the Check-COVID benchmark, PROClaim achieves 81.7% accuracy, outperforming standard multi-agent debate by 10.0 percentage points, with P-RAG driving the primary performance gains (+7.5 pp). We ultimately demonstrate that structural deliberation and model heterogeneity effectively mitigate systematic biases, providing a robust foundation for reliable claim verification. Our code and data are publicly available at https://github.com/mnc13/PROClaim.

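2603.24602 2026-05-13 eess.SP cs.AI Updated version

MuViS: Multimodal Virtual Sensing Benchmark

Jens U. Brandt, Noah C. Puetz, Jobel Jose George, Niharika Vinay Kumar, Elena Raponi, Marc Hilbert, Thomas Bäck, Thomas Bartz-Beielstein

Affiliations * TH Köln, Germany; Leiden University; Toyota Racing

AI summary This paper presents MuViS, a multimodal virtual sensing benchmark platform for inferring quantities in physical systems that are hard to measure directly. The platform integrates diverse datasets and provides a unified preprocessing and evaluation interface for comparing virtual sensing methods. The study finds that existing methods such as gradient-boosted decision trees and deep neural networks hold no absolute advantage across scenarios, which highlights the need for general-purpose virtual sensing architectures. As an open-source platform, MuViS supports reproducible comparisons and future extension.

Comments Accepted at European Signal Processing Conference (EUSIPCO) 2026

Abstract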

Virtual sensing aims to infer hard-to-measure quantities from accessible measurements and is central to perception and control in physical systems. Despite rapid progress from first-principle and hybrid models to modern data-driven methods, research remains siloed, leaving no established default approach that transfers across processes, modalities, and sensing configurations. We introduce MuViS, a domain-agnostic benchmarking suite for multimodal virtual sensing that consolidates diverse datasets into a unified interface for standardized preprocessing and evaluation. Using this framework, we benchmark established approaches spanning gradient-boosted decision trees and deep neural network (NN) architectures, and show that none of these provides a universal advantage, underscoring the need for generalizable virtual sensing architectures. MuViS is released as an open-source, extensible platform for reproducible comparison and future integration of new datasets and model classes.

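2603.24577 2026-05-13 cs.CV cs.AI Updated version

EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction

Falong Fan, Yi Xie, Arnis Lektauers, Bo Liu, Jerzy Rozenblit

AI summary This paper proposes EndoVGGT, a framework for improving the accuracy of 3D reconstruction of deformable soft tissue in surgical scenes. The method introduces a Deformation-aware Graph Attention (DeGAT) module that dynamically builds feature-space semantic graphs to capture long-range correlations among tissue regions, which propagates structural information more effectively under occlusion and improves the robustness and consistency of the reconstruction. Experiments show that EndoVGGT significantly improves reconstruction quality on the SCARED dataset and generalizes well to unseen datasets.

Comments We withdraw this submission due to significant errors in the presentation and logical structure of the paper. We found that the current version does not accurately convey the research findings and requires a major overhaul of the manuscript's methodology description and results analysis

Abstract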

Accurate 3D reconstruction of deformable soft tissues is essential for surgical robotic perception. However, low-texture surfaces, specular highlights, and instrument occlusions often fragment geometric continuity, posing a challenge for existing fixed-topology approaches. To address this, we propose EndoVGGT, a geometry-centric framework equipped with a Deformation-aware Graph Attention (DeGAT) module. Rather than using static spatial neighborhoods, DeGAT dynamically constructs feature-space semantic graphs to capture long-range correlations among coherent tissue regions. This enables robust propagation of structural cues across occlusions, enforcing global consistency and improving non-rigid deformation recovery. Extensive experiments on SCARED show that our method significantly improves fidelity, increasing PSNR by 24.6% and SSIM by 9.1% over prior state-of-the-art. Crucially, EndoVGGT exhibits strong zero-shot cross-dataset generalization to the unseen SCARED and EndoNeRF domains, confirming that DeGAT learns domain-agnostic geometric priors. These results highlight the efficacy of dynamic feature-space modeling for consistent surgical 3D reconstruction.

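2603.23878 2026-05-13 cs.LG cs.AI cs.LO Updated version

The Luna Bound Propagator for Formal Analysis of Neural Networks

Henry LeCates, Haoze Wu

Affiliations * Amherst College

AI summary This paper presents Luna, an abstract-interpretation-based bound propagator for the formal analysis of neural networks. Luna is implemented in C++ and supports Interval Bound Propagation, the DeepPoly/CROWN analysis, and the alpha-CROWN analysis over general computational graphs. Experiments show that on the VNN-COMP 2025 benchmarks Luna outperforms the existing alpha-CROWN implementation in both bound tightness and computational efficiency.

Comments 32 pages, 29 Figures

Abstract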

The parameterized CROWN analysis, a.k.a. alpha-CROWN, has emerged as a practically successful abstract interpretation method for neural network verification. However, existing implementations of alpha-CROWN are limited to Python, which complicates integration into existing DNN verifiers and long-term production-level systems. We introduce Luna, a new abstract-interpretation-based bound propagator implemented in C++. Luna supports Interval Bound Propagation, the DeepPoly/CROWN analysis, and the alpha-CROWN analysis over a general computational graph. We describe the architecture of Luna and show that it outperforms the state-of-the-art alpha-CROWN implementation in terms of both bound tightness and computational efficiency on supported benchmarks from VNN-COMP 2025. Luna is publicly available at https://github.com/ai-ar-research/luna.
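
Of the three analyses listed, Interval Bound Propagation is the simplest to show in a few lines. The sketch below is a generic illustration in Python and does not use or mirror Luna's C++ API; the function names and the toy layer are assumptions.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate an input box through y = W x + b.

    Positive weights pair lower with lower and upper with upper;
    negative weights swap the two.
    """
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Toy layer (assumed values): inputs x1, x2 in [-1, 1].
W = [[1.0, -1.0], [2.0, 1.0]]
b = [0.0, -1.0]
pre_lo, pre_hi = affine_bounds([-1.0, -1.0], [1.0, 1.0], W, b)
post_lo, post_hi = relu_bounds(pre_lo, pre_hi)
print(pre_lo, pre_hi)
print(post_lo, post_hi)
```

Interval bounds are cheap but loose because they ignore correlations between neurons; DeepPoly/CROWN-style analyses tighten them by tracking linear relaxations instead of plain boxes.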

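2603.11383 2026-05-13 cs.RO cs.AI Updated version

Vision-Based Hand Shadowing for Robotic Manipulation via Inverse Kinematics

Hendrik Chiche, Antoine Jamme, Trevor Rigoberto Martinez, Gabriel Gomes

Affiliations * OMGrab Inc.; University of California, Berkeley; Fung Institute for Engineering Leadership

AI summary This study proposes a vision-based hand-shadowing inverse-kinematics (IK) retargeting method for teleoperating low-cost robot arms. A single RGB-D camera captures hand motion; depth sensing and coordinate transforms turn it into joint commands for the arm, and a damped-least-squares solver handles the inverse kinematics to control the SO-ARM101 robot. Experiments show a high success rate in structured environments and improved robustness in real-world scenes after adding an alternative hand detector, revealing both the potential and the current limitations of marker-free hand retargeting.

Comments v2: accepted at IEEE Access (2026); minor revisions per peer review, added WiLoR occlusion-mitigation experiment, error analysis, EMA ablation, and author photos

Abstract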

Teleoperation of low-cost robotic manipulators remains challenging due to the difficulty of retargeting human hand motion to robot joint commands. We present an offline hand-shadowing inverse-kinematics (IK) retargeting pipeline driven by a single egocentric RGB-D camera mounted on 3D-printed glasses. The pipeline detects 21 hand landmarks per hand using MediaPipe Hands, deprojects them into 3D via depth sensing, transforms them into the robot coordinate frame, and solves a damped-least-squares IK problem to produce joint commands for the SO-ARM101 robot (5 arm + 1 gripper joints). A gripper controller maps thumb-index finger geometry to grasp aperture with a multi-level fallback hierarchy. Actions are previewed in a physics simulation before replay on the physical robot. We evaluate the pipeline on a structured pick-and-place benchmark (5-tile grid, 10 grasps per tile, 3 independent runs) achieving an 86.7% +/- 4.2% success rate, and compare it against four vision-language-action (VLA) policies (ACT, SmolVLA, pi_0.5, GR00T N1.5) trained on leader-follower teleoperation data. We provide a quantitative error analysis of the pipeline, reporting a mean IK position error of 36.4 mm, trajectory smoothness metrics showing 57-68% jerk reduction from EMA smoothing, and an ablation study over the smoothing parameter. We also test the pipeline in unstructured real-world environments (grocery store, pharmacy) and find that success is reduced to 9.3% due to hand occlusion by surrounding objects. To mitigate this, we integrate WiLoR as an alternative hand detector, achieving an 8% improvement in hand detection rate over MediaPipe, highlighting both the promise and current limitations of marker-free analytical retargeting.
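
The damped-least-squares update at the heart of such a pipeline is $\Delta\theta = J^\top (J J^\top + \lambda^2 I)^{-1} e$. The sketch below applies it to a hypothetical two-link planar arm rather than the six-joint SO-ARM101; the link lengths, damping value, starting pose, and target are all assumptions chosen to keep the example self-contained.

```python
import math

L1, L2 = 1.0, 1.0  # hypothetical link lengths


def forward(theta):
    """End-effector position of a two-link planar arm."""
    t1, t2 = theta
    return (L1 * math.cos(t1) + L2 * math.cos(t1 + t2),
            L1 * math.sin(t1) + L2 * math.sin(t1 + t2))


def jacobian(theta):
    """2x2 Jacobian of the end-effector position w.r.t. the joint angles."""
    t1, t2 = theta
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))


def dls_step(theta, target, damping=0.1):
    """One update: dtheta = J^T (J J^T + damping^2 I)^-1 e."""
    x, y = forward(theta)
    ex, ey = target[0] - x, target[1] - y
    (j00, j01), (j10, j11) = jacobian(theta)
    # A = J J^T + damping^2 I is symmetric positive definite.
    a = j00 * j00 + j01 * j01 + damping**2
    b = j00 * j10 + j01 * j11
    d = j10 * j10 + j11 * j11 + damping**2
    det = a * d - b * b
    vx = (d * ex - b * ey) / det
    vy = (a * ey - b * ex) / det
    return (theta[0] + j00 * vx + j10 * vy,
            theta[1] + j01 * vx + j11 * vy)


def solve_ik(target, theta=(0.2, 1.0), iterations=50):
    for _ in range(iterations):
        theta = dls_step(theta, target)
    return theta


solution = solve_ik((1.0, 1.0))
print(solution, forward(solution))
```

The damping term keeps the linear solve well conditioned near singular poses (for this arm, a fully stretched elbow), at the cost of slightly shorter steps elsewhere.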

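2603.10281 2026-05-13 cs.LG cs.AI cs.CV Updated version

Taming Score-Based Denoisers in ADMM: A Convergent Plug-and-Play Framework

Rajesh Shrestha, Xiao Fu

Affiliations * School of EECS

AI summary This paper studies how to integrate score-based denoisers effectively into the ADMM optimization algorithm for solving inverse problems. To address two core challenges, the mismatch between the training data manifold and the geometry of ADMM iterates and the lack of convergence guarantees, it proposes a new ADMM-PnP framework with an AC-DC denoiser made up of three stages: auto-correction, directional correction, and score-based denoising. Theoretical analysis shows that under suitable parameters the framework is weakly nonexpansive, which guarantees fixed-point ball convergence, and that under more relaxed conditions it converges with an adaptive step size. Experiments show that the method outperforms existing baselines across a range of inverse problems.

Abstract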

While score-based generative models have emerged as powerful priors for solving inverse problems, directly integrating them into optimization algorithms such as ADMM remains nontrivial. Two central challenges arise: i) the mismatch between the noisy data manifolds used to train the score functions and the geometry of ADMM iterates, especially due to the influence of dual variables, and ii) the lack of convergence understanding when ADMM is equipped with score-based denoisers. To address the manifold mismatch issue, we propose ADMM plug-and-play (ADMM-PnP) with the AC-DC denoiser, a new framework that embeds a three-stage denoiser into ADMM: (1) auto-correction (AC) via additive Gaussian noise, (2) directional correction (DC) using conditional Langevin dynamics, and (3) score-based denoising. In terms of convergence, we establish two results: first, under proper denoiser parameters, each ADMM iteration is a weakly nonexpansive operator, ensuring high-probability fixed-point $\textit{ball convergence}$ using a constant step size; second, under more relaxed conditions, the AC-DC denoiser is a bounded denoiser, which leads to convergence under an adaptive step size schedule. Experiments on a range of inverse problems demonstrate that our method consistently improves solution quality over a variety of baselines.

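2603.09678 2026-05-13 cs.AI cs.LG cs.SE Updated version

EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages

Aman Sharma, Paras Chopra

Affiliations * Lossfunk

AI summary This paper presents EsoLang-Bench, a benchmark for evaluating the genuine reasoning ability of large language models in unfamiliar programming languages, using five esoteric languages (such as Brainfuck and Befunge-98) as test languages. Although these languages are Turing-complete, they appear far less often in pre-training corpora than mainstream languages (such as Python and JavaScript) and have little practical deployment value, so they are an effective test of out-of-distribution generalization. Experiments show that state-of-the-art models perform very well on mainstream-language tasks but lose most of their accuracy on esoteric-language tasks, revealing a significant gap in cross-language generalization.

Comments 45 pages, 8 figures, preprint

Abstract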

Large language models achieve near-ceiling performance on code generation benchmarks, yet most of the programming languages used by popular benchmarks such as SWE-bench and HumanEval (e.g. Python, JavaScript) are squarely in-distribution. They appear at scale in pre-training corpora and are heavily reinforced during post-training. To study LLM performance on unfamiliar programming languages, we introduce EsoLang-Bench, a benchmark using five esoteric programming languages (Brainfuck, Befunge-98, Whitespace, Unlambda, and Shakespeare). All five of our chosen esoteric languages are Turing-complete, so the same algorithmic problems that are solvable in Python or JavaScript are in principle solvable in each of them. Yet they are unfamiliar to LLMs, which makes them a good proxy for evaluating out-of-distribution performance. The unfamiliarity of esoteric languages comprises: (i) the hard-by-design primitives that make up the language; (ii) substantially less representation in pre-training corpora (340x to over 60,000x fewer public GitHub repositories than Python); (iii) negligible deployment value, which makes targeted inclusion in post-training data economically irrational. We evaluate five frontier models across five prompting strategies and find a dramatic capability gap. The same 80 problems expressed in Python or JavaScript reach 100% accuracy on top frontier models, while the equivalent esoteric versions score only 0-11%. Few-shot learning and self-reflection also fail to close this gap. EsoLang-Bench therefore provides a contamination-resistant testbed for measuring how well frontier models generalise algorithmic problem-solving to programming languages outside their training distribution.

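2603.07388 2026-05-13 cs.LG cs.AI Updated version

Sparsity and Out-of-Distribution Generalization

Scott Aaronson, Lin Lin Lee, Jiawei Li

Affiliations * UT Austin

AI summary This paper examines how models generalize in out-of-distribution (OOD) settings and proposes a theoretical account based on sparsity. It argues that the world is presented through distinguished features, that sparse hypotheses (those depending on as few features as possible) better fit Occam's Razor, and that such hypotheses generalize when the training and test distributions overlap sufficiently on the relevant features. The paper gives a formal theorem that extends a classic sample complexity bound and generalizes sparse classifiers to subspace juntas, offering a new perspective on generalization problems in AI alignment.

Abstract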

Explaining out-of-distribution generalization has been a central problem in epistemology since Goodman's "grue" puzzle in 1946. Today it's a central problem in machine learning, including AI alignment. Here we propose a principled account of OOD generalization with three main ingredients. First, the world is always presented to experience not as an amorphous mass, but via distinguished features (for example, visual and auditory channels). Second, Occam's Razor favors hypotheses that are "sparse," meaning that they depend on as few features as possible. Third, sparse hypotheses will generalize from a training to a test distribution, provided the two distributions sufficiently overlap on their restrictions to the features that are either actually relevant or hypothesized to be. The two distributions could diverge arbitrarily on other features. We prove a simple theorem that formalizes the above intuitions, generalizing the classic sample complexity bound of Blumer et al. to an OOD context. We then generalize sparse classifiers to subspace juntas, where the ground truth classifier depends solely on a low-dimensional linear subspace of the features.

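2603.04334 2026-05-13 cs.DB cs.AI cs.LO cs.PL Updated version

SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints

Andrew Tremante, Yang He, Rocky Klopfenstein, Yuepeng Wang, Nina Narodytska, Haoze Wu

Affiliations * Amherst College; Simon Fraser University; VMware Research by Broadcom

AI summary SpotIt+ is an open-source tool that evaluates the accuracy of Text-to-SQL systems through bounded equivalence verification. It actively searches for database instances that distinguish a generated SQL query from the ground-truth query, thereby checking their equivalence. To make sure the generated counterexamples are practically meaningful, SpotIt+ introduces a constraint-mining approach that combines rule-based mining with LLM-based validation. Experiments show that the approach produces more realistic test data and uncovers discrepancies that traditional test-based methods struggle to detect.

Abstract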

We present SpotIt+, an open-source tool for evaluating Text-to-SQL systems via bounded equivalence verification. Given a generated SQL query and the ground truth, SpotIt+ actively searches for database instances that differentiate the two queries. To ensure that the generated counterexamples reflect practically relevant discrepancies, we introduce a best-effort constraint-mining pipeline that combines rule-based specification mining with LLM-based validation over example databases. Experimental results on the BIRD dataset show that the mined constraints enable SpotIt+ to generate more realistic differentiating databases, while preserving its ability to efficiently uncover numerous discrepancies between generated and gold SQL queries that are missed by standard test-based evaluation.

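2603.00059 2026-05-13 cs.CY cs.AI Updated version

Stochastic Parrots or Singing in Harmony? Testing Five Leading LLMs for their Ability to Replicate a Human Survey with Synthetic Data

Jason Miklian, Kristian Hoelscher, John E. Katsos

Affiliations * Centre for Global Sustainability, University of Oslo; Peace Research Institute Oslo; American University of Sharjah

AI summary This paper examines how well generative AI large language models simulate human respondents answering a survey. Comparing real survey data from 420 Silicon Valley programmers with synthetic data generated by five mainstream LLMs, it finds that although the AI-generated answers are technically plausible and tend to agree with one another, they fail to reproduce the counterintuitive insights found in the human survey. The study concludes that current mainstream LLMs are better suited to supporting research than to replacing real human respondents, and that stricter validation standards are needed to ensure synthetic data is used responsibly.

Abstract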

How well can AI-derived synthetic research data replicate the responses of human participants? An emerging literature has begun to engage with this question, which carries deep implications for organizational research practice. This article presents a comparison between a human-respondent survey of 420 Silicon Valley coders and developers and synthetic survey data designed to simulate real survey takers generated by five leading Generative AI Large Language Models: ChatGPT Thinking 5 Pro, Claude Sonnet 4.5 Pro plus Claude CoWork 1.123, Gemini Advanced 2.5 Pro, Incredible 1.0, and DeepSeek 3.2. Our findings reveal that while AI agents produced technically plausible results that lean more towards replicability and harmonization than assumed, none were able to capture the counterintuitive insights that made the human survey valuable. Moreover, deviations grouped together for all models, leaving the real data as the outlier. Our key finding is that while leading LLMs are increasingly being used to scale, replicate and replace human survey responses in research, these advances only show an increased capacity to parrot conventional wisdom in harmony with each other rather than revealing novel findings. If synthetic respondents are used in future research, we need more replicable validation protocols and reporting standards for when and where synthetic survey data can be used responsibly, a gap that this paper fills. Our results suggest that synthetic survey responses cannot meaningfully model real human social beliefs within organizations, particularly in contexts lacking previously documented evidence. We conclude that synthetic survey-based research should be cast not as a substitute for rigorous survey methods, but as an increasingly reliable pre- or post-fieldwork instrument for identifying societal assumptions, conventional wisdoms, and other expectations about research populations.

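2602.22586 2026-05-13 cs.LG cs.AI cs.CL Updated version

TabDLM: Free-Form Tabular Data Generation via Joint Numerical-Language Diffusion

Donghong Cai, Jiarui Feng, Yanbo Wang, Da Zheng, Yixin Chen, Muhan Zhang

Affiliations * Washington University in St. Louis; Peking University; Ant Group

AI summary This paper proposes TabDLM, a unified framework for generating heterogeneous tabular data that contains free-form text alongside structured numerical and categorical attributes. The method combines a masked diffusion language model with a continuous diffusion process and uses bidirectional attention for cross-modal interaction between text and numerical features, overcoming the limitations of traditional diffusion models and large language models on heterogeneous data. Experiments show that TabDLM performs strongly on multiple benchmark datasets and outperforms existing diffusion-based and LLM-based generators.

Comments Preprint

Abstract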

Synthetic tabular data generation has attracted growing attention due to its importance for data augmentation, foundation models, and privacy. However, real-world tabular datasets increasingly contain free-form text fields (e.g., reviews or clinical notes) alongside structured numerical and categorical attributes. Generating such heterogeneous tables with joint modeling of different modalities remains challenging. Existing approaches broadly fall into two categories: diffusion-based methods and LLM-based methods. Diffusion models can capture complex dependencies over numerical and categorical features in continuous or discrete spaces, but extending them to open-ended text is nontrivial and often leads to degraded text quality. In contrast, LLM-based generators naturally produce fluent text, yet their discrete tokenization can distort precise or wide-range numerical values, hindering accurate modeling of both numbers and language. In this work, we propose TabDLM, a unified framework for free-form tabular data generation via a joint numerical-language diffusion model built on masked diffusion language models (MDLMs). TabDLM models textual and categorical features through masked diffusion, while modeling numerical features with a continuous diffusion process through learned, specialized numeric token embeddings; bidirectional attention then captures cross-modality interactions within a single model. Extensive experiments on diverse benchmarks demonstrate the effectiveness of TabDLM compared to strong diffusion- and LLM-based baselines.

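2602.19770 2026-05-13 cs.LG cs.AI Updated version

The Confusion is Real: GRAPHIC -- A Network Science Approach to Confusion Matrices in Deep Learning

Johanna S. Fröhlich, Bastian Heinlein, Jan U. Claar, Hans Rosenberger, Vasileios Belagiannis, Ralf R. Müller

Affiliations * Friedrich-Alexander-Universität Erlangen-Nürnberg; Technical University of Darmstadt

AI summary This paper proposes GRAPHIC, a method for analyzing confusion between classes in deep learning models. Drawing on network science, it interprets the confusion matrices of intermediate layers as adjacency matrices of directed graphs, which makes it possible to visualize and quantify learning dynamics during training. GRAPHIC can reveal class separability, dataset issues, and architectural behavior, offering a new perspective on how neural networks learn.

Comments Transactions on Machine Learning Research, 2026

Abstract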

Explainable artificial intelligence has emerged as a promising field of research to address reliability concerns in artificial intelligence. Despite significant progress in explainable artificial intelligence, few methods provide a systematic way to visualize and understand how classes are confused and how their relationships evolve as training progresses. In this work, we present GRAPHIC, an architecture-agnostic approach that analyzes neural networks on a class level. It leverages confusion matrices derived from intermediate layers using linear classifiers. We interpret these as adjacency matrices of directed graphs, allowing tools from network science to visualize and quantify learning dynamics across training epochs and intermediate layers. GRAPHIC provides insights into linear class separability, dataset issues, and architectural behavior, revealing, for example, similarities between flatfish and man and labeling ambiguities validated in a human study. In summary, by uncovering real confusions, GRAPHIC offers new perspectives on how neural networks learn. The code is available at https://github.com/Johanna-S-Froehlich/GRAPHIC.
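
Reading a confusion matrix as the adjacency matrix of a directed graph can be sketched in a few lines. This is not the GRAPHIC code: the helper names, the row-is-true-class convention, and the toy matrix are assumptions for illustration.

```python
def confusion_to_edges(confusion, labels):
    """Turn off-diagonal entries into weighted directed edges.

    Assumed convention: rows are true classes, columns are predictions,
    so an edge (a, b, w) means class a was predicted as b on w samples.
    """
    edges = []
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if i != j and count > 0:
                edges.append((labels[i], labels[j], count))
    return edges


def in_strength(edges):
    """Total incoming weight per node: how often a class absorbs others."""
    totals = {}
    for _, dst, weight in edges:
        totals[dst] = totals.get(dst, 0) + weight
    return totals


# Toy three-class confusion matrix (made-up counts).
labels = ["cat", "dog", "truck"]
confusion = [
    [8, 2, 0],
    [3, 7, 0],
    [0, 1, 9],
]
edges = confusion_to_edges(confusion, labels)
print(edges)
print(in_strength(edges))
```

Once the confusions are edges, standard graph measures such as node strength or community structure can be tracked across layers and epochs.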

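2602.18455 2026-05-13 cs.CY cs.AI Updated version

Impact of AI Search Summaries on Website Traffic: Evidence from Google AI Overviews and Wikipedia

Mehrzad Khosravi, Hema Yoganarasimhan

Affiliations * University of Washington

AI summary This paper studies the effect of Google AI Overviews (AIO) on traffic to Wikipedia. Exploiting AIO's staggered rollout across regions together with Wikipedia's multilingual structure, it uses a difference-in-differences design for causal analysis. It finds that the introduction of AIO reduced daily traffic to English Wikipedia articles by about 15% on average, with the largest effect on Culture content and a smaller effect on STEM content. The result indicates that generative summaries in search engines can significantly reduce traffic to informational websites, with implications for content monetization, platform design, and policy.

Abstract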

Search engines increasingly display LLM-generated answers shown above organic links, shifting search from link lists to answer-first summaries. Publishers contend these summaries substitute for source pages and cannibalize traffic, while platforms argue they are complementary by directing users through included links. We estimate the causal impact of Google's AI Overview (AIO) on Wikipedia traffic by leveraging the feature's staggered geographic rollout and Wikipedia's multilingual structure. Using a difference-in-differences design, we compare English Wikipedia articles exposed to AIO to the same underlying articles in language editions (Hindi, Indonesian, Japanese, and Portuguese) that were not exposed to AIO during the observation period. Across 161,382 matched article-language pairs, AIO exposure reduces daily traffic to English articles by approximately 15%. Effects are heterogeneous: relative declines are largest for Culture articles and substantially smaller for STEM, consistent with stronger substitution when short synthesized answers satisfy informational intent. These findings provide early causal evidence that generative-answer features in search engines can materially reallocate attention away from informational publishers, with implications for content monetization, search platform design, and policy.
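
The logic of a difference-in-differences estimate can be shown with a minimal two-group, two-period calculation on log traffic. The numbers below are made up purely to illustrate the arithmetic and are not taken from the paper, whose design uses many matched article-language pairs rather than four averages.

```python
import math


def did_log_effect(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences on log outcomes.

    The control group's change estimates what would have happened to the
    treated group without treatment (the parallel-trends assumption).
    """
    treated_change = math.log(treated_post) - math.log(treated_pre)
    control_change = math.log(control_post) - math.log(control_pre)
    return treated_change - control_change


# Hypothetical average daily views (illustrative only).
effect = did_log_effect(
    treated_pre=1000.0, treated_post=900.0,   # exposed edition
    control_pre=500.0, control_post=475.0,    # unexposed edition
)
relative_change = math.exp(effect) - 1.0
print(f"{relative_change:.1%}")
```

Working in logs turns the estimate into a relative change, so editions with very different baseline traffic can be compared on the same scale.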

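2602.17739 2026-05-13 q-bio.GN cs.AI cs.LG Updated version

GeneZip: Region-Aware Compression for Long Context DNA Modeling

Jianan Zhao, Xixian Liu, Zhihao Zhan, Xinyu Yuan, Hongyu Guo, Jian Tang

Affiliations * Mila - Québec AI Institute; Université de Montréal; National Research Council of Canada; University of Ottawa; HEC Montréal

AI summary GeneZip is a region-aware compression framework for long-context DNA modeling that addresses the shortcomings of existing methods in compression budget allocation and compute cost. It combines a dynamic routing mechanism with a Region-Aware Ratio (RAR) objective and uses gene-structure annotations to guide compression during training, so that raw DNA sequences can be compressed efficiently at inference time without annotations. GeneZip performs well in compression quality, redundancy identification, and training efficiency, significantly improving the performance and scalability of long-sequence DNA models.

Comments Preprint, work in progress

Abstract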

Long-context DNA models are limited by token-mixing cost and by how compression allocates representational budget across the genome. Existing approaches operate close to base-pair resolution, apply fixed downsampling, or learn content-dependent chunks without an explicit genomic budget, making long-context pretraining expensive and difficult to control. We introduce GeneZip, a region-aware DNA compression framework that combines H-Net-style dynamic routing with a Region-Aware Ratio (RAR) objective and bounded routing. GeneZip uses static gene-structure annotations during compression training to specify region-wise base-pairs-per-token (BPT) targets; at inference time, it compresses raw unseen DNA without annotations. GeneZip provides three main benefits. First, it is effective: GeneZip variants achieve the best validation PPL among encoder-based compressors, with GeneZip-70M operating at 137.6 BPT, and across four reproducible DNALongBench tasks--contact map prediction, eQTL prediction, enhancer-target gene prediction, and transcription-initiation signal prediction--GeneZip obtains the best average rank among compared sequence models. Second, it is redundancy-aware: a post-hoc RepeatMasker/TRF analysis shows that, without repeat supervision, GeneZip assigns higher local BPT to TE-derived interspersed repeats and tandem repeats, two major classes of repetitive DNA sequence redundancy. Third, it is efficient: by reducing the effective token-mixing length, GeneZip enables longer-context and larger-capacity pretraining, including 128K-context and 636M-parameter variants on a single A100 80GB GPU, and fine-tunes the eQTL task 50.4x faster than JanusDNA (50 vs. 2520 minutes). These results establish GeneZip as an effective, redundancy-aware, and efficient compression interface for long-context DNA modeling.

2602.16736 2026-05-13 cs.OS cs.AI Version update

The Compute ICE-AGE: Invariant Compute Envelope under Addressable Graph Evolution

R. Jay Martin

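Affiliations * Lake Arrowhead, CA

AI summary This paper presents a computing framework built on a deterministic semantic state substrate that evolves state locally over a persistent graph structure, preserving semantic continuity structurally rather than repeatedly reconstructing semantic state through probabilistic inference. On Apple M2-class hardware, the system performs efficient semantic graph traversal, maintaining microsecond-level latency and stable CPU utilization at scales from one million to tens of millions of nodes. The study also shows that the approach keeps semantic replay deterministic under a range of abnormal inputs and confines potential degradation to local structures, avoiding global collapse.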

Comments V3: 40 pages, 3 figures. Empirical systems study of a deterministic semantic substrate evaluated across 1M-25M persistent semantic nodes on Apple Silicon M2-class hardware. Includes deterministic replay, thermodynamic scaling analysis, stochastic ingress experiments, paging survivability, and locality-constrained traversal measurements. Keywords: Persistent Semantic State, Memory-Bound AI

Details
English abstract

This paper presents empirical results from a production-grade C++ implementation of a deterministic semantic state substrate operating under bounded local state evolution. The system was realized as a CPU-resident persistent semantic graph engine designed to preserve semantic continuity structurally rather than repeatedly reconstructing it through probabilistic inference. Contemporary inference-driven AI systems repeatedly recompute semantic state through context replay and probabilistic recomposition. In contrast, the substrate described here evolves semantic continuity incrementally through locality-preserving traversal and bounded local mutation over persistent graph topology. Empirical measurements on Apple Silicon M2-class hardware demonstrated locality-constrained traversal behavior across scaling regimes ranging from 1 million to 25 million persistent semantic nodes. Traversal latency remained within low microsecond ranges (P50 approximately 0.0014 ms) under sustained workloads, while steady-state CPU utilization remained approximately 17.2% with no measurable scale-correlated thermal amplification observed during sustained operation. Measured persistent node density averaged approximately 687 bytes per node under compressed Float32 storage regimes, corresponding to a projected capacity of approximately 1.6 billion persistent semantic nodes within a 1 TiB memory envelope. Under hostile ingress conditions including stochastic perturbation, malformed topology, fragmented adjacency, and active paging pressure, deterministic replay integrity remained stable while degradation localized into bounded orphan structures rather than propagating catastrophic global divergence.
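
The projected capacity follows directly from the measured node density. The sketch below reproduces that arithmetic and converts the P50 latency to microseconds; the function and constant names are hypothetical, and it assumes the full 1 TiB envelope is available for node storage, which the abstract does not state explicitly.

```python
TIB_BYTES = 2**40          # 1 TiB in bytes
BYTES_PER_NODE = 687       # average density reported in the abstract
P50_LATENCY_MS = 0.0014    # P50 traversal latency reported in the abstract


def projected_capacity(memory_bytes: int, bytes_per_node: int) -> int:
    """Number of whole nodes that fit in the memory envelope."""
    return memory_bytes // bytes_per_node


nodes = projected_capacity(TIB_BYTES, BYTES_PER_NODE)
p50_latency_us = P50_LATENCY_MS * 1000  # milliseconds to microseconds

print(f"Projected capacity: {nodes / 1e9:.2f} billion nodes")
print(f"P50 latency: {p50_latency_us:.1f} microseconds")
```

Under these assumptions the result is consistent with the quoted figure of approximately 1.6 billion nodes, and the P50 latency corresponds to about 1.4 microseconds.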

2602.15451 2026-05-13 q-bio.QM cs.AI cs.LG quant-ph Version update

Molecular Design beyond Training Data with Novel Extended Objective Functionals of Generative AI Models Driven by Quantum Annealing Computer

Hayato Kunugi, Mohsen Rahmani, Yosuke Iyama, Yutaro Hirono, Akira Suma, Matthew Woolway, Vladimir Vargas-Calderón, William Kim, Kevin Chern, Mohammad Amin, Masaru Tateno

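Affiliations * Japan Tobacco Inc.; D-Wave Systems Inc.

AI summary This study proposes a framework for optimizing deep generative models with a quantum annealing computer for small-molecule drug design, addressing the low frequency of drug-like compounds produced by conventional generative models. It introduces a Neural Hash Function (NHF) that serves as both a regularization and a binarization scheme, used for signal conversion between the classical and quantum neural networks and in constructing the error function. Experiments show that the quantum-annealing-based generative models outperform conventional models in molecular validity and drug-likeness and, without additional constraints, exceed the training data in drug-likeness, indicating a potential advantage of quantum computing in drug design.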

Comments 28 pages, 4 figures

Details
English abstract

Deep generative modeling to stochastically design small molecules is an emerging technology for accelerating drug discovery and development. However, one major issue in molecular generative models is their low frequency of drug-like compounds. To resolve this problem, we developed a novel framework for optimizing deep generative models integrated with a D-Wave quantum annealing computer. In this framework, the Neural Hash Function (NHF) presented herein serves simultaneously as a regularization scheme and a binarization scheme; the latter transforms between the continuous signals of the classical neural network and the discrete signals of the quantum neural network in the error evaluation (i.e., objective) function. The compounds generated via the quantum-annealing generative models exhibited higher quality in both validity and drug-likeness than those generated via the fully classical models, and were further indicated to exceed even the training data in terms of drug-likeness features, without any restraints or conditions to deliberately induce such an optimization. These results indicate an advantage of quantum annealing as a stochastic generator integrated with our novel neural network architectures, for extended performance in feature-space sampling and extraction of characteristic features in drug design.

2602.07668 2026-05-13 cs.CV cs.AI cs.LG cs.RO Version update

Looking and Listening Inside and Outside: Multimodal Artificial Intelligence Systems for Driver Safety Assessment and Intelligent Vehicle Decision-Making

Ross Greer, Laura Fleig, Maitrayee Keskar, Erika Maquiling, Giovanni Tapia Lopez, Angel Martinez-Sanchez, Parthib Roy, Jake Rattigan, Mira Sur, Alejandra Vidrio, Thomas Marcotte, Mohan Trivedi

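Affiliations * Machine Intelligence, Interaction, and Imagination (Mi3) Laboratory; Laboratory for Intelligent and Safe Automobiles (LISA); Johns Hopkins University; Center for Medicinal Cannabis Research (CMCR)

AI summary This study proposes L-LIO, a multimodal framework that fuses visual and audio information to improve driver state assessment and environment understanding in intelligent vehicles. Adding audio signals strengthens perception of the state of the driver, passengers, and people outside the vehicle, providing more complete information in scenarios such as airbag deployment and takeover-time prediction for autonomous driving. Experiments show that audio supplies critical safety-relevant information in complex or context-rich scenarios, opening new intervention paths for intelligent vehicle decision-making systems.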

Details
English abstract

The looking-in-looking-out (LILO) framework has enabled intelligent vehicle applications that understand both the outside scene and the driver state to improve safety outcomes, with examples in smart airbag deployment, takeover time prediction in autonomous control transitions, and driver attention monitoring. In this research, we propose an augmentation to this framework, making a case for the audio modality as an additional source of information to understand the driver, and in the evolving autonomy landscape, also the passengers and those outside the vehicle. We expand LILO by incorporating audio signals, forming the looking-and-listening inside-and-outside (L-LIO) framework to enhance driver state assessment and environment understanding through multimodal sensor fusion. We evaluate three example cases where audio enhances vehicle safety: supervised learning on driver speech audio to classify potential impairment states (e.g., intoxication), collection and analysis of passenger natural language instructions (e.g., "turn after that red building") to motivate how spoken language can interface with planning systems through audio-aligned instruction data, and limitations of vision-only systems where audio may disambiguate the guidance and gestures of external agents. Datasets include custom-collected in-vehicle and external audio samples in real-world environments. Pilot findings show that audio yields safety-relevant insights, particularly in nuanced or context-rich scenarios where sound is critical to safe decision-making or visual signals alone are insufficient. Challenges include ambient noise interference, privacy considerations, and robustness across human subjects, motivating further work on reliability in dynamic real-world contexts. L-LIO augments driver and scene understanding through multimodal fusion of audio and visual sensing, offering new paths for safety intervention.

2602.06339 2026-05-13 cs.RO cs.AI Version update

Action Hallucination in Generative Vision-Language-Action Models

Harold Soh, Eugene Lim

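Affiliations * Department of Computer Science, School of Computing; Smart Systems Institute

AI summary This paper studies action hallucination in generative vision-language-action models for robotics, in which the model generates actions that violate physical constraints and that in turn lead to plan-level failures. It analyzes the causes of such hallucinations, attributing them to structural mismatches between feasible robot behavior and common model architectures, and examines the unavoidable tradeoffs imposed by three key barriers: topological, precision, and horizon. The work offers a mechanistic explanation for failures of generative robot policies and points to theoretical directions for improving their reliability and trustworthiness.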

Comments 24 pages; updated setup with minor changes to proofs. changed template

Details
English abstract

Robot Foundation Models, such as VLAs, promise end-to-end generative robot policies with broad generalization. Yet it remains unclear whether they fundamentally resolve the core problem of action generation in embodied settings, or overcome the long-standing challenges of robotics. We address this question by analyzing action hallucinations that violate physical constraints and their extension to plan-level failures. Focusing on latent-variable generative policies, we show that hallucinations can arise from structural mismatches between feasible robot behavior and common model architectures. We study three such barriers -- topological, precision, and horizon -- and show how they impose unavoidable tradeoffs. Our analysis provides mechanistic explanations for reported empirical failures of generative robot policies and suggests principled directions for improving reliability and trustworthiness, without abandoning their expressive power.

2602.02799 2026-05-13 cs.LG cs.AI Version update

Joint Learning of Hierarchical Neural Options and Abstract World Model

Wasu Top Piriyakulkij, Wolfgang Lehrach, Kevin Ellis, Kevin Murphy

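Affiliations * Cornell University; Google DeepMind

AI summary This study aims to build agents that learn new skills by composing existing ones and proposes AgentOWL, a method that efficiently and jointly learns an abstract world model and hierarchical neural options. Compared with existing methods, AgentOWL shows clear advantages in data efficiency and skill generalization, validated on a subset of Object-Centric Atari games.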

Details
English abstract

Building agents that can perform new skills by composing existing skills is a long-standing goal of AI agent research. Towards this end, we investigate how to efficiently acquire a sequence of skills, formalized as hierarchical neural options. However, existing model-free hierarchical reinforcement learning algorithms need a lot of data. We propose a novel method, which we call AgentOWL (Option and World model Learning Agent), that jointly learns -- in a sample-efficient way -- an abstract world model (abstracting across both states and time) and a set of hierarchical neural options. We show, on a subset of Object-Centric Atari games, that our method can learn more skills using less data than baseline methods and possesses learning and generalization capabilities that the baselines do not have.

2602.02408 2026-05-13 cs.CV cs.AI Version update

ReasonEdit: Editing Vision-Language Models using Human Reasoning

Jiaxing Qiu, Kaihua Hou, Roxana Daneshjou, Ahmed Alaa, Thomas Hartvigsen

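Affiliations * University of Virginia; University of California, Berkeley; Stanford University

AI summary ReasonEdit is a new method for editing vision-language models (VLMs) that corrects model errors without disturbing unrelated behavior, targeting visual question answering tasks that require humans and models to reason. It lets users supply explanations of their reasoning during editing and retrieves relevant facts at inference time through a network-science-inspired multimodal embedding technique. Experiments show that ReasonEdit achieves state-of-the-art editing performance on multiple datasets, confirming that incorporating human reasoning substantially improves edit generalization.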

Details
English abstract

Model editing aims to correct errors in large, pretrained models without altering unrelated behaviors. While some recent works have edited vision-language models (VLMs), no existing editors tackle reasoning-heavy tasks, which typically require humans and models to reason about images. We therefore propose ReasonEdit, the first VLM editor to let users explain their reasoning during editing, introducing a new, practical model editing setup. ReasonEdit continuously stores human reasoning in a codebook, and retrieves only relevant facts during inference using a novel topology-balanced multimodal embedding method inspired by network science. Across four VLMs on multiple rationale-based visual question answering datasets, ReasonEdit achieves state-of-the-art editing performance, ultimately showing that using human reasoning during editing greatly improves edit generalization.

2602.02280 2026-05-13 cs.SE cs.AI cs.CL cs.CR cs.LG Version update

RACC: Representation-Aware Coverage Criteria for LLM Safety Testing

Zeming Wei, Zhixin Zhang, Chengcan Wu, Yihao Zhang, Xiaokun Luan, Meng Sun

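Affiliations * Peking University

AI summary Large language models (LLMs) face safety risks from jailbreak attacks, yet current safety testing relies mostly on static datasets and lacks systematic criteria for assessing test-suite quality. This paper proposes RACC (Representation-Aware Coverage Criteria), a set of coverage criteria designed for LLM safety testing that extracts safety representations from hidden states and uses them to measure how well test cases cover safety concepts, allowing high-quality jailbreak test cases to be identified more effectively. Experiments show that RACC distinguishes effective inputs from redundant ones and is of practical value in applications such as test-suite prioritization and attack-sample generation.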

Details
English abstract

Large Language Models (LLMs) face severe safety risks from jailbreak attacks, yet current safety testing largely relies on static datasets and lacks systematic criteria to evaluate test suite quality and adequacy. While coverage criteria have proven effective for smaller neural networks, they are impractical for LLMs due to computational overhead and the entanglement of safety-critical signals with irrelevant neuron activations. To address these issues, we propose RACC (Representation-Aware Coverage Criteria), a set of coverage criteria specialized for LLM safety testing. RACC first extracts safety representations from the LLM's hidden states using a small calibration set of harmful prompts, then measures test prompts' concept activations against these directions, and finally computes coverage through six criteria assessing both individual and compositional safety concept coverage. Experiments on multiple LLMs and safety benchmarks show that RACC reliably rewards high-quality jailbreak test suites while remaining insensitive to redundant or invalid inputs, which is a key distinction that neuron-level criteria fail to make. We further demonstrate RACC's practical value in two applications, including test suite prioritization and attack prompt sampling, and validate its generalization across diverse settings and configurations. Overall, RACC provides a scalable and principled foundation for coverage-guided LLM safety testing.
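
The idea of measuring concept activations against representation directions and turning them into a coverage score can be illustrated with a toy example. This is a rough sketch under strong assumptions, not RACC itself: the concept directions are taken as given (the abstract does not say how they are extracted), the hidden states are tiny hand-written vectors, and the thresholded "fraction of concepts activated" score is a hypothetical stand-in rather than one of the paper's six criteria.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def concept_activations(hidden_state: Vector, directions: List[Vector]) -> List[float]:
    """Project one hidden state onto each (assumed) safety concept direction."""
    return [dot(hidden_state, d) for d in directions]


def concept_coverage(test_hidden_states: List[Vector],
                     directions: List[Vector],
                     threshold: float) -> float:
    """Fraction of concepts activated above the threshold by at least one test prompt."""
    covered = [False] * len(directions)
    for h in test_hidden_states:
        for i, activation in enumerate(concept_activations(h, directions)):
            if activation >= threshold:
                covered[i] = True
    return sum(covered) / len(directions)


# Hypothetical unit directions for three safety concepts.
directions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A redundant suite: both prompts activate only the first concept.
redundant_suite = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.1]]
# A more diverse suite of the same size.
diverse_suite = [[0.9, 0.1, 0.0], [0.0, 0.7, 0.2]]

redundant_score = concept_coverage(redundant_suite, directions, threshold=0.5)
diverse_score = concept_coverage(diverse_suite, directions, threshold=0.5)
print(redundant_score, diverse_score)
```

In this toy setting the second redundant prompt adds nothing to the score, which mirrors the abstract's point that a useful criterion should be insensitive to redundant inputs.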

2602.02133 2026-05-13 cs.AI cs.CL Version update

A Theoretical Analysis of Why Masked Diffusion Models Mitigate the Reversal Curse

Moongyu Jeon, Sangwoo Shin, BumJun Kim, Kyelim Lee, Albert No

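Affiliations * Department of Artificial Intelligence, Yonsei University

AI summary This paper gives a theoretical analysis of why masked diffusion language models (MDMs) mitigate the "reversal curse" seen in autoregressive language models (ARMs). It shows that the any-order masked training objective of MDMs creates a parameter-level coupling between forward and reverse conditionals, so that token-pair evidence learned during training transfers to reverse queries. Experiments verify this mechanism and show that it improves prediction on reversal tasks.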

Details
English abstract

Autoregressive language models (ARMs) suffer from the reversal curse: after learning "$A$ is $B$," they often fail on the reverse query "$B$ is $A$." Masked diffusion language models (MDMs) exhibit this failure in a much weaker form, but the underlying reason has remained unclear. A common explanation attributes this mitigation to their any-order masked training objective. However, observing "$[\mathbf{M}]$ is $B$" during training teaches recovery of $A$ from $B$ in one positional configuration, and does not by itself explain why the learned evidence should transfer to the reverse prompt "$B$ is $[\mathbf{M}]$." We provide a theoretical analysis showing that this transfer arises from a parameter-level coupling between forward and reverse positional conditionals: shared Transformer parameters store token-pair evidence, while relative positional encodings route attention through queries and keys without changing the value-side evidence being retrieved. In a one-layer MDM, we prove that forward masked training strengthens evidence that is reusable in reverse queries, induces correlated forward--reverse attention routes, and yields a positively aligned shared-storage gradient component that decreases the reverse loss to first order. Controlled one-layer experiments and large-scale LLaDA/Dream experiments verify these signatures and show that they translate into improved reverse prediction.

2602.02007 2026-05-13 cs.CL cs.AI Version update

Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation

Zhanghao Hu, Qinglin Zhu, Runcong Zhao, Di Liang, Hanqi Yan, Yulan He, Lin Gui

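Affiliations * King's College London; Tencent, Yuanbao Team

AI summary Addressing the shortcomings of conventional retrieval-augmented generation (RAG) for agent memory, this paper proposes xMemory, a new memory management method. Following the principle of decoupling before aggregation, it decomposes interaction history into reusable facts, updates, and distinguishing details and builds a hierarchical, revisable memory structure to improve retrieval efficiency and information accuracy. Experiments show that xMemory improves answer quality and inference efficiency across multiple tasks and models.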

Comments Project Address: https://zhanghao-xmemory.github.io/Academic-project-page-template/; Code Address: https://github.com/HU-xiaobai/xMemory

Details
English abstract

Standard Retrieval Augmented Generation (RAG) is poorly matched to agent memory. Unlike large heterogeneous corpora, agent memory forms a bounded and coherent interaction stream in which many spans are highly correlated or near duplicates. As a result, flat top-$k$ similarity retrieval often returns redundant context, while summary-centric hierarchies can blur the subtle details that distinguish one candidate from another. We argue that agent memory should follow the principle of decoupling before aggregation: the system should first isolate reusable facts, updates, and distinguishing details from similar histories, and only then organise them for efficient retrieval. Based on this principle, we propose xMemory, which constructs a revisable hierarchical memory structure from original messages to segments, memory components, and groups. xMemory segments interaction history into local events, decouples each segment into memory components, aggregates related components into high-level groups using a sparsity--semantic faithfulness objective, and maintains this structure incrementally as memory evolves. At inference time, xMemory retrieves top-down, first selecting a compact backbone of complementary groups and components, and then expanding to segments and raw messages only when additional evidence reduces the reader's uncertainty. Experiments on LoCoMo and PerLTQA across diverse open source and closed source LLMs show consistent gains in answer quality and inference token efficiency, supported by analyses of redundancy, evidence density, and coverage.

2602.01103 2026-05-13 cs.AI Version update

Probing RLVR training instability through the lens of objective-level hacking

Yiming Dong, Kun Fu, Haoyu Li, Xinyuan Zhu, Yurou Liu, Lijing Shao, Jieping Ye, Zheng Wang

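Affiliations * School of Physics, Peking University; Tongyi Lab; Alibaba Group; Kavli Institute for Astronomy and Astrophysics, Peking University; National Astronomical Observatories, Chinese Academy of Sciences

AI summary This paper studies training instability of reinforcement learning with verifiable rewards (RLVR) in Mixture-of-Experts (MoE) architectures and proposes an analysis framework based on objective-level hacking that exposes the mechanism behind the instability. It finds that abnormal growth of the training-inference discrepancy is the key pathological dynamic behind the instability, a phenomenon that previously lacked a mechanistic explanation. Backed by extensive experiments, the paper offers theoretical guidance for designing more stable RLVR algorithms.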

Comments Accepted by ICML 2026

Details
English abstract

Prolonged reinforcement learning with verifiable rewards (RLVR) has been shown to drive continuous improvements in the reasoning capabilities of large language models, but the training is often prone to instabilities, especially in Mixture-of-Experts (MoE) architectures. Training instability severely undermines model capability improvement, yet its underlying causes and mechanisms remain poorly understood. In this work, we introduce a principled framework for understanding RLVR instability through the lens of objective-level hacking. Unlike reward hacking, which arises from exploitable verifiers, objective-level hacking emerges from token-level credit misalignment and is manifested as system-level spurious signals in the optimization objective. Grounded in our framework, together with extensive experiments on a 30B MoE model, we trace the origin and formalize the mechanism behind a key pathological training dynamic in MoE models: the abnormal growth of the training-inference discrepancy, a phenomenon widely associated with instability but previously lacking a mechanistic explanation. These findings provide a concrete and causal account of the training dynamics underlying instabilities in MoE models, offering guidance for the design of stable RLVR algorithms.

2602.00400 2026-05-13 cs.AI Version update

KEPO: Knowledge-Enhanced Preference Optimization for Multimodal Reasoning with Applications to Medical VQA

Fan Yang, Rui Meng, Trudi Di Qi, Ali Ezzati, Yuxin Wen

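Affiliations * Chapman University; Lawrence Berkeley National Laboratory; University of California, Irvine

AI summary This study proposes KEPO, a knowledge-enhanced preference optimization framework intended to improve multimodal models on complex reasoning tasks such as medical visual question answering. To address the unstable training and difficult exploration of conventional reinforcement learning under sparse rewards, KEPO introduces a quality-gated policy distillation mechanism that applies teacher guidance only to high-quality trajectories, combined with a knowledge-guided exploration strategy, reducing noise and improving reasoning coherence and generalization. Experiments show that KEPO achieves better training stability and out-of-distribution performance on medical VQA.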

Details
English abstract

Reinforcement learning (RL) has emerged as a promising paradigm for inducing explicit reasoning behaviors in large language and vision-language models. However, reasoning-oriented RL post-training remains fundamentally challenging due to sparse trajectory-level rewards, leading to ambiguous credit assignment and severe exploration failures that can trap the policy in a "learning cliff." Recent on-policy distillation methods introduce dense teacher supervision to stabilize optimization, but apply it uniformly across all generated trajectories. We argue that such uniform distillation is ill-suited for reasoning-intensive tasks, as low-quality on-policy trajectories often originate from early logical errors, and distillation under flawed contexts injects noisy and misaligned gradients. To address these challenges, we propose Knowledge-Enhanced Preference Optimization (KEPO), a unified post-training framework that integrates: (i) a quality-gated on-policy distillation objective that selectively applies dense teacher guidance only to high-quality trajectories, and (ii) a knowledge-enhanced exploration strategy that leverages hints learned from a teacher model to rejection-sample reward-positive on-policy trajectories for RL, thereby mitigating exploration collapse. Evaluated on a challenging medical visual question answering benchmark under single-source generalization, KEPO demonstrates improved training stability, more coherent reasoning behaviors, and superior out-of-distribution performance over reinforcement learning and on-policy distillation baselines.
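
The quality-gating idea in (i) can be illustrated with a small numeric sketch. Everything here is an assumption made for illustration: the abstract does not say how "high-quality" is decided or how the terms are combined, so the reward threshold, the additive weighting, and the function name are all hypothetical.

```python
from typing import Sequence


def quality_gated_loss(rl_losses: Sequence[float],
                       distill_losses: Sequence[float],
                       rewards: Sequence[float],
                       reward_threshold: float,
                       distill_weight: float) -> float:
    """Average per-trajectory loss, adding the distillation term only when the
    trajectory's reward passes the (assumed) quality threshold."""
    total = 0.0
    for rl, distill, reward in zip(rl_losses, distill_losses, rewards):
        gate = 1.0 if reward >= reward_threshold else 0.0  # hypothetical hard gate
        total += rl + distill_weight * gate * distill
    return total / len(rewards)


# Two hypothetical trajectories: the first is rewarded, the second is not.
loss = quality_gated_loss(
    rl_losses=[0.5, 0.7],
    distill_losses=[0.2, 0.9],
    rewards=[1.0, 0.0],
    reward_threshold=0.5,
    distill_weight=1.0,
)
print(f"{loss:.2f}")
```

With uniform distillation the second trajectory's large teacher-mismatch term (0.9) would enter the loss; with the gate it is dropped, which is the behavior the abstract argues for on flawed trajectories.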

2601.21351 2026-05-13 cs.LG cs.AI Version update

Analytical Provisioning for Attention-FFN Disaggregated LLM Serving under Stochastic Workloads

Chendong Song, Meixuan Wang, Hang Zhou, Hong Liang, Yuan Lyu, Zixi Chen, Yuwei Fan, Zijie Zhou

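Affiliations * Dept. of Industrial Engineering and Decision Analytics, HKUST; Dept. of Computer Science and Technology, Tsinghua University; IIIS, Tsinghua University; Huawei Hong Kong Research Center; School of Mathematical Sciences, Peking University

AI summary This study presents an analytical provisioning framework for LLM serving under Attention-FFN disaggregated (AFD) architectures with stochastic workloads. By analyzing the stationary token load of each compute slot, it identifies a key workload statistic θ and derives the optimal Attention-to-FFN provisioning ratio for arbitrary prefill-decode distributions. The method also accounts for the bottleneck effect of synchronized execution, providing a closed-form mean-field rule and a Gaussian barrier-aware refinement. Experiments show that its predictions are within 10% of simulation results, giving a theoretical basis and practical guidance for provisioning in disaggregated LLM serving.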

Comments Submitted to NeurIPS 2026

Details
English abstract

Attention-FFN disaggregation (AFD) is an emerging architecture for LLM decoding that separates state-heavy, KV-cache-dominated Attention computation from stateless, compute-intensive FFN computation, connected by per-step communication. While AFD enables independent scaling of memory and compute resources, its performance is highly sensitive to the Attention/FFN provisioning ratio: mis-sizing induces step-level blocking and costly device idle time. We develop an analytical provisioning framework for AFD bundles in an $r$A--$1$F topology under stochastic workloads. Two sources of randomness shape the problem: the per-slot Attention workload evolves as KV caches grow and completed requests are replenished with random prompt and decode lengths, and synchronized execution across Attention workers introduces a barrier governed by the slowest worker. We address both via a renewal-reward characterization of the per-slot stationary token load, identifying a single workload statistic $\theta$ that governs provisioning under arbitrary prefill-decode distributions and admits a nonparametric estimator from request traces. The analysis yields a closed-form mean-field rule for the optimal A/F ratio decomposing into Attention-, communication-, and FFN-bottleneck regimes, together with a Gaussian barrier-aware refinement that quantifies cross-worker synchronization overhead. A trace-calibrated AFD simulator supports the framework across workloads: the predicted optimal ratio matches the simulation-optimal ratio within 10%. Together, these results provide a compact, calibratable account of how stochastic workload structure determines provisioning in disaggregated LLM serving.

2601.03627 2026-05-13 cs.CL cs.AI Version update

Evaluating the Pre-Consultation Ability of LLMs using Diagnostic Guidelines

Jean Seo, Gibaeg Kim, Kihun Shin, Seungseop Lim, Hyunkyung Lee, Wooseok Han, Jongwon Lee, Eunho Yang

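Affiliations * AITRICS; KAIST; Severance Hospital, Yonsei University; College of Medicine, The Catholic University of Korea

AI summary This paper presents EPAG, a benchmark dataset and framework for evaluating the pre-consultation ability of large language models (LLMs), assessing models directly by comparing history-of-present-illness information against diagnostic guidelines and indirectly through disease diagnosis. It finds that small open-source models fine-tuned on a carefully curated, task-specific dataset can outperform frontier models in pre-consultation, and that more history information does not necessarily improve diagnostic performance. It also finds that the language used in pre-consultation influences the characteristics of the dialogue, and the dataset and evaluation pipeline are open-sourced to support the development of LLM applications in clinical settings.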

Comments EACL 2026 Industry

Details
English abstract

We introduce EPAG, a benchmark dataset and framework designed for Evaluating the Pre-consultation Ability of LLMs using diagnostic Guidelines. LLMs are evaluated directly by comparing the History of Present Illness (HPI) against diagnostic guidelines and indirectly through disease diagnosis. In our experiments, we observe that small open-source models fine-tuned with a well-curated, task-specific dataset can outperform frontier LLMs in pre-consultation. Additionally, we find that an increased amount of HPI does not necessarily lead to improved diagnostic performance. Further experiments reveal that the language of pre-consultation influences the characteristics of the dialogue. By open-sourcing our dataset and evaluation pipeline at https://github.com/seemdog/EPAG, we aim to contribute to the evaluation and further development of LLM applications in real-world clinical settings.

2512.22933 2026-05-13 cs.AI cs.CL Version update

RW-Post: Auditable Evidence-Grounded Multimodal Fact-Checking in the Wild

Danni Xu, Shaojing Fan, Harry Cheng, Mohan Kankanhalli

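Affiliations * School of Computing (SoC), National University of Singapore (NUS); National University of Singapore (NUS); Department of Electrical and Computer Engineering (ECE), National University of Singapore (NUS)

AI summary This paper presents RW-Post, an auditable benchmark dataset for multimodal fact-checking in real-world settings, in which each instance is linked to the original social-media post, reasoning traces, and explicit evidence from human fact-check articles. The dataset supports several evaluation regimes and enables systematic analysis of models' visual grounding and evidence utilization. Experiments show that current models have substantial room to improve in evidence grounding and that evidence-bounded evaluation improves both accuracy and faithfulness.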

Comments Code and dataset will be released at https://github.com/xudanni0927/AgentFact

Details
English abstract

Multimodal misinformation increasingly leverages visual persuasion, where repurposed or manipulated images strengthen misleading text. We introduce RW-Post, a post-aligned text--image benchmark for real-world multimodal fact-checking with auditable annotations: each instance links the original social-media post with reasoning traces and explicitly linked evidence items derived from human fact-check articles via an LLM-assisted extraction-and-auditing pipeline. RW-Post supports controlled evaluation across closed-book, evidence-bounded, and open-web regimes, enabling systematic diagnosis of visual grounding and evidence utilization. We provide AgentFact as a reference verification baseline and benchmark strong open-source LVLMs under unified protocols. Experiments show substantial headroom: current models struggle with faithful evidence grounding, while evidence-bounded evaluation improves both accuracy and faithfulness.

2512.22579 2026-05-13 cs.AI cs.NI Version update

SANet: A Semantic-aware Agentic AI Networking Framework for Cross-layer Optimization in 6G

Yong Xiao, Xubo Li, Haoran Zhou, Yingyu Li, Yayu Gao, Guangming Shi, Ping Zhang, Marwan Krunz

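Affiliations * School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China; Peng Cheng Laboratory, Shenzhen, China; School of Mechanical Engineering and Electronic Information, China University of Geosciences (Wuhan), China; State Key Laboratory of Networking and Switching

AI summary This paper proposes SANet, a semantic-aware agentic networking framework for cross-layer optimization in 6G wireless networks. The framework infers the user's semantic goal and automatically assigns agents at different network layers to fulfill it, and it formulates the multi-agent multi-objective optimization problem with a method for finding Pareto-optimal solutions. The paper also introduces a model partition and sharing (MoPS) mechanism to use computational resources more efficiently and validates the framework's gains in performance and computational efficiency through experiments.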

Comments Accepted at IEEE Transactions on Mobile Computing

Journal ref IEEE Transactions on Mobile Computing, 2026

Details
English abstract

Agentic AI networking (AgentNet) is a novel AI-native networking paradigm in which a large number of specialized AI agents collaborate to perform autonomous decision-making, dynamic environmental adaptation, and complex missions. It has the potential to facilitate real-time network management and optimization functions, including self-configuration, self-optimization, and self-adaptation across diverse and complex environments. This paper proposes SANet, a novel semantic-aware AgentNet architecture for wireless networks that can infer the semantic goal of the user and automatically assign agents associated with different layers of the network to fulfill the inferred goal. Motivated by the fact that AgentNet is a decentralized framework in which collaborating agents may generally have different and even conflicting objectives, we formulate the decentralized optimization of SANet as a multi-agent multi-objective problem, and focus on finding the Pareto-optimal solution for agents with distinct and potentially conflicting objectives. We propose three novel metrics for evaluating SANet. Furthermore, we develop a model partition and sharing (MoPS) framework in which large models, e.g., deep learning models, of different agents can be partitioned into shared and agent-specific parts that are jointly constructed and deployed according to agents' local computational resources. Two decentralized optimization algorithms are proposed. We derive theoretical bounds and prove that there exists a three-way tradeoff among optimization, generalization, and conflicting errors. We develop an open-source RAN and core network-based hardware prototype that implements agents to interact with three different layers of the network. Experimental results show that the proposed framework achieved performance gains of up to 14.61% while requiring only 44.37% of FLOPs required by state-of-the-art algorithms.
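
Pareto optimality, which the abstract uses as the solution concept for agents with conflicting objectives, can be made concrete with a short filter over candidate solutions. This sketch shows only the generic textbook definition, assuming every objective is minimized; it is not the paper's decentralized algorithms, and the candidate objective values are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in
    at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (latency, energy) costs for four candidate configurations.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(candidates)
print(front)
```

Here (3.0, 4.0) is dropped because (2.0, 3.0) is better on both objectives, while the remaining three points each represent a different tradeoff that cannot be improved on one objective without worsening the other.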

2512.12177 2026-05-13 cs.AI Version update

Floorplan2Guide: LLM-Guided Floorplan Parsing for BLV Indoor Navigation

Aydin Ayanzadeh, Tim Oates

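Affiliations * University of Maryland, Baltimore County

AI summary This paper proposes Floorplan2Guide, an LLM-guided floor plan parsing method intended to improve indoor navigation for blind and low vision (BLV) people. It converts architectural floor plans into navigable knowledge graphs and generates readable navigation instructions, reducing the reliance on manual preprocessing of earlier methods. Experiments show that the method improves navigation accuracy in both simulated and real-world environments, performs especially well with few-shot learning, and that graph-based spatial reasoning achieves a higher success rate than direct visual reasoning.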

Comments Accepted for publication in the proceedings of the IEEE International Conference on Big Data (IEEE BigData 2025)

Journal ref IEEE International Conference on Big Data (IEEE BigData 2025), pp. 7477-7485

Details
English abstract

Indoor navigation remains a critical challenge for people with visual impairments. The current solutions mainly rely on infrastructure-based systems, which limit their ability to navigate safely in dynamic environments. We propose a novel navigation approach that utilizes a foundation model to transform floor plans into navigable knowledge graphs and generate human-readable navigation instructions. Floorplan2Guide integrates a large language model (LLM) to extract spatial information from architectural layouts, reducing the manual preprocessing required by earlier floorplan parsing methods. Experimental results indicate that few-shot learning improves navigation accuracy in comparison to zero-shot learning on simulated and real-world evaluations. Claude 3.7 Sonnet achieves the highest accuracy among the evaluated models, with 92.31%, 76.92%, and 61.54% on the short, medium, and long routes, respectively, under 5-shot prompting of the MP-1 floor plan. The success rate of graph-based spatial structure is 15.4% higher than that of direct visual reasoning among all models, which confirms that graphical representation and in-context learning enhance navigation performance and make our solution more precise for indoor navigation of Blind and Low Vision (BLV) users.
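
Once a floor plan has been turned into a graph, route finding and instruction generation reduce to standard graph operations. The sketch below is a simplified illustration under stated assumptions: the graph is written by hand rather than extracted by an LLM, the adjacency schema with direction labels is hypothetical, and the instructions come from a fixed template instead of a language model.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical schema: node -> list of (neighbor, direction label).
Graph = Dict[str, List[Tuple[str, str]]]


def shortest_route(graph: Graph, start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for a route with the fewest hops."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor, _direction in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # goal is unreachable from start


def to_instructions(graph: Graph, path: List[str]) -> List[str]:
    """Turn consecutive node pairs into templated, human-readable steps."""
    steps = []
    for here, there in zip(path, path[1:]):
        direction = next(d for n, d in graph[here] if n == there)
        steps.append(f"From {here}, go {direction} to {there}.")
    return steps


floor_graph: Graph = {
    "Entrance": [("Hallway", "straight")],
    "Hallway": [("Entrance", "back"), ("Room 101", "left"), ("Elevator", "right")],
    "Room 101": [("Hallway", "back")],
    "Elevator": [("Hallway", "back")],
}

route = shortest_route(floor_graph, "Entrance", "Room 101")
instructions = to_instructions(floor_graph, route)
for step in instructions:
    print(step)
```

An explicit graph makes the route itself checkable independently of the language model, which may be one reason a graph-based representation helps compared with reasoning directly over the image.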

2512.11883 2026-05-13 cs.CY cs.AI cs.CV Version update

Position: Universal Aesthetic Alignment Narrows Artistic Expression

Wenqi Marshall Guo, Qingyun Qian, Khalad Hasan, Shan Du

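Affiliations * Department of CMPS, University of British Columbia, Kelowna, Canada

AI summary This paper examines the problems caused by over-aligning image generation models to a universal aesthetic standard, arguing that such alignment can run against users' need for "anti-aesthetic" outputs for artistic or critical purposes. By constructing a wide-spectrum aesthetics dataset and evaluating state-of-the-art generation and reward models, the study finds that aesthetically aligned models tend to generate conventionally "beautiful" images and struggle to follow instructions for low-quality or negative imagery, and that reward models penalize anti-aesthetic images even when users explicitly request them. The study confirms this systemic bias and releases code, fine-tuned models, and datasets for further research.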

详情
英文摘要

Over-aligning image generation models to a generalized aesthetic preference conflicts with user intent, particularly when "anti-aesthetic" outputs are requested for artistic or critical purposes. This adherence prioritizes developer-centered values, compromising user autonomy and aesthetic pluralism. We test this bias by constructing a wide-spectrum aesthetics dataset and evaluating state-of-the-art generation and reward models. This position paper finds that aesthetic-aligned generation models frequently default to conventionally beautiful outputs, failing to respect instructions for low-quality or negative imagery. Crucially, reward models penalize anti-aesthetic images even when they perfectly match the explicit user prompt. We confirm this systemic bias through image-to-image editing and evaluation against real abstract artworks. Our code, fine-tuned models, and datasets are available on our meta-expression intentionally anti-aesthetics webpage: https://weathon.github.io/icml2026_position/.

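2511.17038 2026-05-13 cs.AI eess.IV stat.ML Updated

DAPS++: Rethinking Diffusion Inverse Problems with Decoupled Posterior Annealing

Hao Chen, Renzheng Zhang, Scott S. Howard

Affiliations * Department of Electrical Engineering, University of Notre Dame; Department of Aerospace and Mechanical Engineering, University of Notre Dame

AI summary This paper proposes DAPS++, a new solver for diffusion-based inverse problems, motivated by the limited guidance that the prior provides in existing diffusion solvers. The method fully decouples diffusion-based initialization from likelihood-driven refinement, so that reconstruction is guided more directly by measurement consistency while remaining numerically stable. Experiments show that DAPS++ achieves high computational efficiency and robust image restoration with fewer function evaluations and optimization steps.

Details
English abstract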

From a Bayesian perspective, score-based diffusion solves inverse problems through joint inference, embedding the likelihood with the prior to guide the sampling process. However, this formulation fails to explain its practical behavior: the prior offers limited guidance, while reconstruction is largely driven by the measurement-consistency term, leading to an inference process that is effectively decoupled from the diffusion dynamics. We show that the diffusion prior in these solvers functions primarily as a warm initializer that places estimates near the data manifold, while reconstruction is driven almost entirely by measurement consistency. Based on this observation, we introduce DAPS++, which fully decouples diffusion-based initialization from likelihood-driven refinement, allowing the likelihood term to guide inference more directly while maintaining numerical stability and providing insight into why unified diffusion trajectories remain effective in practice. By requiring fewer function evaluations (NFEs) and measurement-optimization steps, DAPS++ achieves high computational efficiency and robust reconstruction performance across diverse image restoration tasks.

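2511.14715 2026-05-13 cs.LG cs.AI cs.CR cs.DC cs.MA Updated

FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning

Abolfazl Younesi, Leon Kiss, Zahra Najafabadi Samani, Juan Aznar Poveda, Thomas Fahringer

AI summary Federated learning (FL) enables collaborative training while protecting data privacy, but it is vulnerable to malicious clients that compromise model integrity through Byzantine attacks, data poisoning, and similar means. To address this, the paper proposes FLARE, a framework based on adaptive multi-dimensional reputation assessment that dynamically evaluates client reliability through continuous, multi-dimensional reputation scores, combined with adaptive threshold adjustment, reputation-weighted aggregation, and local differential privacy to improve robustness. Experiments show that FLARE maintains high model accuracy and fast convergence under a variety of attack scenarios, outperforming existing methods.

Comments The authors want to withdraw this manuscript for further verification and revision. We may release a substantially revised version in the future

Details
English abstract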

Federated learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who compromise model integrity through Byzantine attacks, data poisoning, or adaptive adversarial behaviors. Existing defense mechanisms rely on static thresholds and binary classification, failing to adapt to evolving client behaviors in real-world deployments. We propose FLARE, an adaptive reputation-based framework that transforms client reliability assessment from binary decisions to a continuous, multi-dimensional trust evaluation. FLARE integrates: (i) a multi-dimensional reputation score capturing performance consistency, statistical anomaly indicators, and temporal behavior, (ii) a self-calibrating adaptive threshold mechanism that adjusts security strictness based on model convergence and recent attack intensity, (iii) reputation-weighted aggregation with soft exclusion to proportionally limit suspicious contributions rather than eliminating clients outright, and (iv) a Local Differential Privacy (LDP) mechanism enabling reputation scoring on privatized client updates. We further introduce a highly evasive Statistical Mimicry (SM) attack, a benchmark adversary that blends honest gradients with synthetic perturbations and persistent drift to remain undetected by traditional filters. Extensive experiments with 100 clients on MNIST, CIFAR-10, and SVHN demonstrate that FLARE maintains high model accuracy and converges faster than state-of-the-art Byzantine-robust methods under diverse attack types, including label flipping, gradient scaling, adaptive attacks, ALIE, and SM. FLARE improves robustness by up to 16% and preserves model convergence within 30% of the non-attacked baseline, while achieving strong malicious-client detection performance with minimal computational overhead. https://github.com/Anonymous0-0paper/FLARE
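
To make item (iii) concrete, the following is a minimal sketch of reputation-weighted aggregation with soft exclusion. The specific down-weighting rule (scaling a client's weight by `r / threshold` once its reputation `r` falls below the threshold) and the function name `aggregate` are assumptions made for illustration; the abstract does not give FLARE's actual formula.

```python
def aggregate(updates, reputations, threshold):
    """Reputation-weighted mean of client updates.

    Clients below the threshold are down-weighted rather than dropped
    (an assumed form of soft exclusion, not FLARE's published rule).
    """
    weights = []
    for r in reputations:
        # At or above the threshold: weight equals reputation.
        # Below it: shrink the weight further by the factor r / threshold.
        weights.append(r if r >= threshold else r * (r / threshold))
    total = sum(weights)
    if total == 0:
        raise ValueError("all clients have zero weight")
    dim = len(updates[0])
    return [
        sum(w * u[i] for w, u in zip(weights, updates)) / total
        for i in range(dim)
    ]


# Two honest clients and one low-reputation outlier.
updates = [[1.0, 1.0], [1.0, 1.0], [10.0, 10.0]]
print(aggregate(updates, reputations=[0.9, 0.9, 0.1], threshold=0.5))
```

In this toy case the outlier still contributes, but far less than it would under a plain average, which is the intent of limiting suspicious contributions proportionally instead of eliminating clients outright.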

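2511.01202 2026-05-13 cs.IT cs.AI math.IT Updated

Forget BIT, It is All about TOKEN: Towards Semantic Information Theory for LLMs

Bo Bai

Affiliations * Bo Bai(白波)

AI summary This paper proposes a semantic information theory that aims to give large language models (LLMs) a theoretical foundation built from first principles. It replaces the classical BIT with the TOKEN, a macroscopic unit that carries semantic content, and on this basis reinterprets attention and the Transformer architecture as energy-based models and semantic embeddings as vector representations on a semantic manifold. Using Massey's directed information measure, the paper builds a theoretical framework for analyzing semantic information flow in pre-training, RL-based fine-tuning, and inference, giving a more precise account of LLMs' causal reasoning ability and semantic generation mechanisms.

Details
English abstract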

Despite the empirical successes of Large Language Models (LLMs), the prevailing paradigm is heuristic and experiment-driven, tethered to massive compute and data, while a first-principles theory remains absent. This treatise develops a Semantic Information Theory at the confluence of statistical physics, signal processing, and classical information theory, organized around a single paradigm shift: replacing the classical BIT - a microscopic substrate devoid of semantic content - with the macroscopic TOKEN as the atomic carrier of meaning and reasoning. Within this framework we recast attention and the Transformer as energy-based models, and interpret semantic embedding as vectorization on the semantic manifold. Modeling the LLM as a stateful channel with feedback, we adopt Massey's directed information as the native causal measure of autoregressive generation, from which we derive a directed rate-distortion function for pre-training, a directed rate-reward function for RL-based post-training, and a sub-martingale account of inference-time semantic information flow. This machinery makes precise the identification of next-token prediction with Granger causal inference, and sharpens the limits of LLM reasoning against Pearl's Ladder of Causation - affirming that whereas the BIT defined the Information Epoch, the TOKEN will define the AI Epoch.

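2510.25609 2026-05-13 cs.LG cs.AI eess.SP Updated

Revisiting GAN with Bayes-Optimal Discrimination

Mohammadreza Tavasoli Naeini, Ali Bereyhi, Morteza Noshad, Ben Liang, Alfred O. Hero

Affiliations * University of Toronto; Stanford University; University of Michigan

AI summary This paper proposes a modification of standard generative adversarial network (GAN) training in which the discriminator's objective changes from cross-entropy loss to directly minimizing the discrimination Bayes error rate (BER). To this end, the authors use the Bayes optimal learning threshold (BOLT) loss and train the generator to maximize a surrogate of the discrimination BER. The approach unifies different GAN training objectives and reveals their trade-off between smoothness and tightness. Under balanced class priors, the authors show that maximizing the surrogate BER minimizes the total variation distance between the data and generator distributions, and they establish a link to Wasserstein GAN. Experiments show improved sample quality and coverage on image generation tasks.

Details
English abstract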

We propose an alternative to the standard GAN training approach, in which the discriminator is a binary classifier trained by cross-entropy to distinguish real samples from generated ones. Instead, we directly target the discrimination Bayes error rate (BER). To this end, we use the recently proposed Bayes optimal learning threshold (BOLT) loss and train the generator to maximize a surrogate of the discrimination BER. This viewpoint gives a unified perspective on GAN training: different objectives can be interpreted as parameterized bounds on the discrimination BER that describe a trade-off between smoothness and tightness. We show that, under balanced class priors, maximizing the surrogate BER with an unconstrained discriminator minimizes the total variation between the data and generator distributions. By constraining the discriminator to be $1$-Lipschitz, the proposed maximization objective defines a discrepancy that is upper-bounded by the Wasserstein-1 distance, thereby linking it to Wasserstein GAN. Experiments on several image-generation datasets under matched architectures and optimization settings show that GAN training using the surrogate BER improves sample quality and coverage over standard baselines. This analysis suggests that the proposed Bayesian viewpoint can achieve a better trade-off between training stability and convergence of the generator to the data distribution.
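
The link between the Bayes error rate and total variation can be checked numerically on small discrete distributions. The sketch below uses the standard identity for equal class priors, BER = (1 - TV) / 2; it concerns the exact Bayes error, not the paper's BOLT surrogate, and the function names and example distributions are illustrative only.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def bayes_error_rate(p, q):
    """Error of the Bayes-optimal classifier separating p from q under equal priors."""
    # For each outcome the optimal rule picks the likelier class,
    # so the mass it gets wrong is the smaller of the two joint probabilities.
    return sum(min(0.5 * a, 0.5 * b) for a, b in zip(p, q))


data = [0.7, 0.2, 0.1]       # stand-in for the data distribution
generated = [0.1, 0.3, 0.6]  # stand-in for the generator distribution

tv = total_variation(data, generated)
ber = bayes_error_rate(data, generated)
print(tv, ber, 0.5 * (1.0 - tv))
```

Because BER decreases linearly as TV grows, a generator that pushes the discriminator's best achievable error toward 1/2 is pushing TV toward zero, which matches the abstract's statement for the unconstrained discriminator.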

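2510.10956 2026-05-13 cs.SE cs.AI Updated

Project-Level C-to-Rust Translation via Pointer Knowledge Graphs

Zhiqiang Yuan, Wenjun Mao, Zhuo Chen, Xiyue Shang, Chong Wang, Yiling Lou, Xin Peng

Affiliations * Fudan University; Nanyang Technological University

AI summary Translating C code into safe Rust is an effective way to ensure memory safety. Existing rule-based approaches produce Rust code with poor safety, while LLM-based approaches generate more idiomatic and safer code but still struggle with project-level C-to-Rust translation, particularly with pointers, because they lack a global view of pointer usage. The study therefore proposes a translation approach based on a Pointer Knowledge Graph that supplies LLMs with global pointer semantics through pointer usage information and Rust-oriented annotations, producing safer and more idiomatic Rust code and, in experiments, substantially improving the safety and functional correctness of the translated code.

Comments Accepted by FSE'26

Details
English abstract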

Translating C code into safe Rust is an effective way to ensure memory safety. Compared to rule-based approaches, which often produce largely unsafe Rust code, LLM-based methods generate more idiomatic and safer Rust by leveraging extensive training on human-written code. Despite their promise, existing LLM-based approaches still struggle with project-level C-to-Rust translation. They typically partition a C project into smaller units (e.g., functions) based on call graphs and translate them in a bottom-up manner to resolve dependencies. However, this unit-by-unit paradigm often fails to handle pointers due to the lack of a global view of their usage. To address this limitation, we propose a novel C-to-Rust Pointer Knowledge Graph (KG) that augments code dependency graphs with two types of pointer semantics: (i) pointer usage information, which captures global behaviors such as points-to flows and lifts low-level struct interactions to higher-level abstractions; and (ii) Rust-oriented annotations, which encode ownership, mutability, nullability, and lifetime. Building on this KG, we further propose PtrTrans, a project-level C-to-Rust translation approach. In PtrTrans, the KG provides LLMs with comprehensive global pointer semantics, guiding them to generate safe and idiomatic Rust code. Experimental results show that PtrTrans reduces unsafe usages in translated Rust by 99.9% compared to both rule-based and conventional LLM-based methods, while achieving 29.3% higher functional correctness than fuzzing-enhanced LLM approaches.

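2510.06371 2026-05-13 cs.CL cs.AI Updated

OASIS: A Multilingual and Multimodal Dataset for Culturally Grounded Spoken Visual QA

Firoj Alam, Ali Ezzat Shahroor, Md. Arid Hasan, Zien Sheikh Ali, Hunzalah Hassan Bhatti, Mohamed Bayan Kmainasi, Shammur Absar Chowdhury, Basel Mousi, Fahim Dalvi, Nadir Durrani, Natasa Milic-Frayling

Affiliations * Qatar Computing Research Institute

AI summary OASIS is a large-scale multilingual, multimodal dataset designed to support culturally grounded spoken visual question answering. It contains a large volume of image, text, and speech data covering English and multiple varieties of Arabic, and is suited to evaluating models on commonsense reasoning, cultural understanding, and real-world scenarios. The study presents EverydayMMQA, a scalable semi-automatic framework for building localized QA resources, and uses multi-stage human validation to ensure data quality, supporting the training and evaluation of multimodal models.

Comments Multimodal Foundation Models, Large Language Models, Native, Multilingual, Language Diversity, Contextual Understanding, Culturally Informed

Details
English abstract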

Large-scale multimodal models achieve strong results on tasks like Visual Question Answering (VQA), but they are often limited when queries require cultural and visual information and everyday knowledge, particularly in low-resource and underrepresented languages. We introduce OASIS, a large-scale culturally grounded multimodal QA dataset covering images, text, and speech. OASIS is built with EverydayMMQA, a scalable semi-automatic framework for creating localized spoken and visual QA resources, supported by multi-stage human-in-the-loop validation. OASIS contains approximately 0.92M real images and 14.8M QA pairs, including 3.7M spoken questions, with 383 hours of human-recorded speech and 20K hours of voice-cloned speech from 42 speakers. It supports four input settings: text-only, speech-only, text+image, and speech+image. The dataset focuses on English and Arabic varieties across 18 countries, covering Modern Standard Arabic (MSA) as well as dialectal Arabic. It is designed to evaluate models beyond object recognition, targeting pragmatic, commonsense, and culturally grounded reasoning in real-world scenarios. We benchmark four closed-source models, three open-source models, and one fine-tuned model on OASIS. The framework and dataset will be made publicly available to the community. https://huggingface.co/datasets/QCRI/OASIS

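2510.05408 2026-05-13 cs.CV cs.AI Updated

See the past: Time-Reversed Scene Reconstruction from Thermal Traces Using Visual Language Models

Kebin Contreras, Luis Toscano-Palomino, Mauro Dalla Mura, Jorge Bacca

Affiliations * Physics School, Universidad Industrial de Santander, Colombia; Department of Computer Science, Universidad Industrial de Santander, Colombia; GIPSA-Lab, Université Grenoble Alpes, CNRS, Grenoble INP, Grenoble, France; Institut Universitaire de France (IUF), France

AI summary This study proposes a time-reversed reconstruction method based on thermal imaging and vision-language models that aims to recover the state of a scene from a few seconds earlier using its current thermal traces. The method couples vision-language models with a constrained diffusion process, generating scene descriptions and guiding image reconstruction to ensure semantic and structural consistency. Experiments show that, in controlled settings, the method can reconstruct plausible scene frames from up to 120 seconds earlier, providing a first step toward time-reversed imaging from thermal traces.

Details
English abstract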

Recovering the past from present observations is an intriguing challenge with potential applications in forensics and scene analysis. Thermal imaging, operating in the infrared range, provides access to otherwise invisible information. Since humans are typically warmer (37 °C / 98.6 °F) than their surroundings, interactions such as sitting, touching, or leaning leave residual heat traces. These fading imprints serve as passive temporal codes, allowing for the inference of recent events that exceed the capabilities of RGB cameras. This work proposes a time-reversed reconstruction framework that uses paired RGB and thermal images to recover scene states from a few seconds earlier. The proposed approach couples Visual-Language Models (VLMs) with a constrained diffusion process, where one VLM generates scene descriptions and another guides image reconstruction, ensuring semantic and structural consistency. The method is evaluated in three controlled scenarios, demonstrating the feasibility of reconstructing plausible past frames up to 120 seconds earlier, providing a first step toward time-reversed imaging from thermal traces.

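2510.04265 2026-05-13 cs.AI cs.CL math.ST stat.ML stat.TH Updated

Don't Pass@k: A Bayesian Framework for Large Language Model Evaluation

Mohsen Hariri, Amirhossein Samandar, Michael Hinczewski, Vipin Chaudhary

Affiliations * Department of Computer and Data Sciences, Case Western Reserve University; Department of Physics, Case Western Reserve University

AI summary This paper proposes a Bayesian framework for evaluating large language models, aimed at the unstable and potentially misleading rankings that the conventional Pass@k metric produces when the number of samples is limited. The method estimates a model's underlying success probability and its credible interval, yielding more stable and statistically meaningful rankings and supporting flexible weighting of scoring rubrics. Experiments show that the framework converges faster and ranks more stably than Pass@k, distinguishes statistically significant differences from noise, and applies to both binary and non-binary evaluation.

Comments OpenReview (ICLR 2026): https://openreview.net/forum?id=PTXi3Ef4sT

Journal ref The Fourteenth International Conference on Learning Representations (ICLR), 2026

Details
English abstract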

Pass$@k$ is widely used to report the reasoning performance of LLMs, but it often produces unstable and potentially misleading rankings, especially when the number of trials (samples) is limited and computational resources are constrained. We present a principled Bayesian evaluation framework that replaces Pass$@k$ and average accuracy over $N$ trials (avg$@N$) with posterior estimates of a model's underlying success probability and credible intervals, yielding stable rankings and a transparent decision rule for differences. Evaluation outcomes are modeled as categorical (not just 0/1) with a Dirichlet prior, giving closed-form expressions for the posterior mean and uncertainty of any weighted rubric and enabling the use of prior evidence when appropriate. Theoretically, under a uniform prior, the Bayesian posterior mean is order-equivalent to average accuracy (Pass$@1$), explaining its empirical robustness while adding principled uncertainty. Empirically, in simulations with known ground-truth success rates and on AIME'24/'25, HMMT'25, and BrUMO'25, the posterior-based procedure achieves faster convergence and greater rank stability than Pass$@k$ and recent variants, enabling reliable comparisons at far smaller sample counts. The framework clarifies when observed gaps are statistically meaningful (non-overlapping credible intervals) versus noise, and it naturally extends to graded, rubric-based evaluations. Together, these results recommend replacing Pass$@k$ for LLM evaluation and ranking with a posterior-based, compute-efficient protocol that unifies binary and non-binary evaluation while making uncertainty explicit. Source code is available at https://github.com/mohsenhariri/scorio
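
The closed-form posterior mentioned in the abstract follows from standard Dirichlet-categorical conjugacy. The sketch below computes the posterior mean and standard deviation of a weighted rubric score; the function name `posterior_summary` and the example counts are illustrative and are not taken from the authors' scorio package.

```python
def posterior_summary(counts, weights, prior=None):
    """Posterior mean and standard deviation of sum_k w_k * p_k.

    counts[k] is how often category k was observed, weights[k] is the rubric
    score of category k, and prior[k] is the Dirichlet prior pseudo-count.
    """
    if prior is None:
        prior = [1.0] * len(counts)  # uniform Dirichlet prior
    alpha = [c + a for c, a in zip(counts, prior)]  # posterior parameters
    total = sum(alpha)
    probs = [a / total for a in alpha]  # posterior mean of each p_k
    mean = sum(w * m for w, m in zip(weights, probs))
    second_moment = sum(w * w * m for w, m in zip(weights, probs))
    # Variance of a linear function of a Dirichlet-distributed vector.
    variance = (second_moment - mean * mean) / (total + 1.0)
    return mean, variance ** 0.5


# Binary case: 3 failures and 7 successes, with success scored 1 and failure 0.
mean, sd = posterior_summary(counts=[3, 7], weights=[0.0, 1.0])
print(mean, sd)
```

In the binary case with a uniform prior the posterior mean is (successes + 1) / (N + 2), which increases monotonically with the number of successes; that is consistent with the abstract's remark that the posterior mean is order-equivalent to average accuracy.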

2509.15103 2026-05-13 cs.MA cs.AI Updated

Vulnerable Agent Identification in Large-Scale Multi-Agent Reinforcement Learning

Simin Li, Zihao Mao, Zheng Yuwei, Linhao Wang, Ruixiao Xu, Chengdong Ma, Zhiqian Liu, Xin Yu, Yuqing Ma, Xin Wang, Jie Luo, Bo An, Yaodong Yang, Weifeng Lv, Xianglong Liu

Affiliations * School of Computer Science and Engineering, Beihang University; Department of Computer Science and Engineering, The Chinese University of Hong Kong; School of Artificial Intelligence, Beihang University; Institute for Artificial Intelligence, Peking University; Institute of Automation, Chinese Academy of Science; College of Computing and Data Science, Nanyang Technological University

AI summary In large-scale multi-agent reinforcement learning, the failure of some agents can severely degrade system performance, so identifying these critical agents (the vulnerable agents) is important. The paper models the problem as a hierarchical adversarial mean-field control problem and decouples it with a Fenchel-Rockafellar transform, enabling independent learning at each level; it then recasts the NP-hard upper-level problem as a Markov decision process with dense rewards so that vulnerable agents can be identified sequentially. Experiments show that the method identifies more vulnerable agents in large-scale systems and reveals the vulnerability of each agent.

Comments Accepted by ICML 2026

Details
English abstract

Partial agent failure becomes inevitable when systems scale up, making it crucial to identify the subset of agents whose failure causes worst-case system performance degradations. We study this Vulnerable Agent Identification (VAI) problem in large-scale multi-agent reinforcement learning (MARL). We frame VAI as a Hierarchical Adversarial Decentralized Mean Field Control (HAD-MFC), where the upper level selects vulnerable agents as an NP-hard task and the lower level learns their worst-case adversarial policies via mean-field MARL. The two problems are coupled together, making HAD-MFC difficult to solve. To handle this, we first decouple the hierarchical process via the Fenchel-Rockafellar transform, resulting in a regularized mean-field Bellman operator for the upper level that enables independent learning at each level, thus reducing computational complexity. We next reformulate the upper-level NP-hard problem as an MDP with dense rewards, allowing sequential identification of vulnerable agents via greedy and RL algorithms. This decomposition provably preserves the optimal solution. Experiments show that our method effectively identifies more vulnerable agents in large-scale MARL and in a rule-based system, fooling the system into worse failures, and reveals the vulnerability of each agent in large systems. Code is available at https://github.com/Waken-dream/VAI

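2509.09838 2026-05-13 cs.LG cs.AI Updated

Dissecting Discrete Soft Actor-Critic: Limitations and Principled Alternatives

Reza Asad, Reza Babanezhad, Sharan Vaswani

Affiliations * Simon Fraser University; Samsung AI

AI summary This paper studies the limitations of discrete Soft Actor-Critic (DSAC) in discrete action spaces and proposes principled alternatives. The authors find that DSAC's poor performance stems mainly from the coupling between the actor and critic entropy, and that decoupling the two significantly improves performance. Building on this, they propose a flexible off-policy actor-critic framework that yields new objectives, and they show theoretically and empirically that these objectives perform robustly on Atari games even without entropy regularization or explicit exploration mechanisms.

Details
English abstract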

While Soft Actor-Critic (SAC) is highly effective in continuous control, its discrete counterpart (DSAC) performs poorly on challenging discrete-action domains such as Atari. Consequently, starting from DSAC, we revisit the design of actor-critic methods in this setting. First, we determine that the coupling between the actor and critic entropy is the primary reason behind the poor performance of DSAC. We demonstrate that by merely decoupling these components, DSAC's performance significantly improves. Motivated by this insight, we introduce a flexible off-policy actor-critic framework that subsumes DSAC as a special case and yields novel objectives. Our framework allows using an m-step Bellman operator for the critic update, and instantiates the actor objective by combining standard policy optimization methods with entropy regularization. Theoretically, we prove that the proposed methods can guarantee convergence to the optimal regularized value function in the tabular setting, generalizing the results in prior work. Empirically, we evaluate the proposed objectives on standard Atari games. Our ablations indicate that, unlike DSAC, these objectives, including novel ones, perform robustly even without entropy regularization or explicit exploration mechanisms.

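2509.06701 2026-05-13 cs.LG cs.AI Updated

Probabilistic Modeling of Latent Agentic Substructures in Deep Neural Networks

Su Hyeong Lee, Risi Kondor, Richard Ngo

Affiliations * Department of Statistics, University of Chicago; Department of Computer Science, University of Chicago

AI summary This paper proposes a theory of intelligent agency based on probabilistic modeling for understanding latent agentic substructures in deep neural networks. By defining agents through their outcome distributions and epistemic utility, and using weighted logarithmic pooling, the study examines how compositions of agents form and proves that strict unanimity is achievable under certain conditions. It also formalizes an agentic alignment phenomenon in large language models, showing that eliciting a benevolent agent can induce an antagonistic one, and thereby offers a new mathematical framework and implications for aligning agentic AI systems.

Comments Accepted by ICML 2026

Details
English abstract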

We develop a theory of intelligent agency grounded in probabilistic modeling for neural models. Agents are represented as outcome distributions with epistemic utility given by log score, and compositions are defined through weighted logarithmic pooling that strictly improves every member's welfare. We prove that strict unanimity is impossible under linear pooling or in binary outcome spaces, but possible with three or more outcomes. Our framework admits recursive structure via cloning invariance, continuity, and openness, while tilt-based analysis rules out trivial duplication. Finally, we formalize an agentic alignment phenomenon in LLMs using our theory: eliciting a benevolent persona ("Luigi") induces an antagonistic counterpart ("Waluigi"), while a manifest-then-suppress Waluigi strategy yields strictly larger first-order misalignment reduction than pure Luigi reinforcement alone. These results clarify how developing a principled mathematical framework for how subagents can coalesce into coherent higher-level entities provides novel implications for alignment in agentic AI systems.
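
For readers unfamiliar with logarithmic pooling, the following minimal sketch computes a weighted logarithmic pool of discrete outcome distributions, that is, a normalized weighted geometric mean. The function name `log_pool` and the example distributions are illustrative; the sketch does not reproduce the paper's welfare or unanimity results.

```python
import math


def log_pool(distributions, weights):
    """Weighted logarithmic pool: p(x) is proportional to prod_i p_i(x) ** w_i."""
    n_outcomes = len(distributions[0])
    unnormalized = [
        math.prod(p[x] ** w for p, w in zip(distributions, weights))
        for x in range(n_outcomes)
    ]
    z = sum(unnormalized)  # normalizing constant
    return [u / z for u in unnormalized]


# Two hypothetical agents over three outcomes, pooled with equal weights.
agent_a = [0.6, 0.3, 0.1]
agent_b = [0.2, 0.3, 0.5]
print(log_pool([agent_a, agent_b], weights=[0.5, 0.5]))
```

Unlike linear pooling, which averages probabilities directly, the log pool multiplies them, so an outcome that any member considers very unlikely receives little mass in the composition.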

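2508.20206 2026-05-13 cs.LG cs.AI Updated

Filter then Attend: Improving attention-based Time Series Forecasting with Spectral Filtering

Elisha Dayag, Nhat Thanh Van Tran, Jack Xin

Affiliations * Department of Mathematics, University of California, Irvine, Irvine, CA 92617

AI summary This paper studies how frequency-domain filtering can improve Transformer-based models for long-term time series forecasting. The authors add learnable frequency filters at the model input to strengthen the model's use of different frequency components. Experiments show that the method improves forecasting performance on several datasets and allows the embedding dimension to be reduced, making the models smaller and more efficient.

Details
English abstract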

Transformer-based models are at the forefront of long time-series forecasting (LTSF). While in many cases these models achieve state-of-the-art results, they suffer from a bias toward low frequencies in the data and from high computational and memory requirements. Recent work has established that learnable frequency filters can be an integral part of a deep forecasting model by enhancing the model's spectral utilization. These works choose to use a multilayer perceptron to process their filtered signals and thus do not solve the issues found with transformer-based models. In this paper, we establish that adding a filter to the beginning of transformer-based models enhances their performance in long time-series forecasting. We add learnable filters, which add only about 1000 parameters, to several transformer-based models and observe in multiple instances a 5-10% relative improvement in forecasting performance. Additionally, we find that with filters added, we are able to decrease the embedding dimension of our models, resulting in transformer-based architectures that are both smaller and more effective than their non-filtering base models. We also conduct synthetic experiments to analyze how the filters enable Transformer-based models to better utilize the full spectrum for forecasting.

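2508.10036 2026-05-13 cs.CL cs.AI cs.IR cs.LG Updated

Reflect then Learn: Active Prompting for Information Extraction Guided by Introspective Confusion

Dong Zhao, Yadong Wang, Xiang Chen, Chenxi Wang, Hongliang Dai, Chuanxing Geng, Shengzhong Zhang, Shaoyuan Li, Sheng-Jun Huang

Affiliations * NUAA-MMMI (Nanjing University of Aeronautics and Astronautics)

AI summary This study proposes APIE, an active prompting framework for guiding large language models on information extraction tasks. Based on the principle of "introspective confusion," the method assesses the model's own confusion by quantifying two dimensions, format uncertainty and content uncertainty, and uses the result to select the most challenging and informative samples as few-shot exemplars. Experiments show that the method significantly improves extraction accuracy and robustness on several benchmark datasets.

Comments Published at AAAI 2026

Details
English abstract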

Large Language Models (LLMs) show remarkable potential for few-shot information extraction (IE), yet their performance is highly sensitive to the choice of in-context examples. Conventional selection strategies often fail to provide informative guidance, as they overlook a key source of model fallibility: confusion stemming not just from semantic content, but also from the generation of well-structured formats required by IE tasks. To address this, we introduce Active Prompting for Information Extraction (APIE), a novel active prompting framework guided by a principle we term introspective confusion. Our method empowers an LLM to assess its own confusion through a dual-component uncertainty metric that uniquely quantifies both Format Uncertainty (difficulty in generating correct syntax) and Content Uncertainty (inconsistency in extracted semantics). By ranking unlabeled data with this comprehensive score, our framework actively selects the most challenging and informative samples to serve as few-shot exemplars. Extensive experiments on four benchmarks show that our approach consistently outperforms strong baselines, yielding significant improvements in both extraction accuracy and robustness. Our work highlights the critical importance of a fine-grained, dual-level view of model uncertainty when it comes to building effective and reliable structured generation systems.
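
A dual-component uncertainty score of this kind could be sketched as follows. The concrete definitions are assumptions made for illustration, since the abstract does not give APIE's formulas: format uncertainty is taken as the fraction of sampled outputs that fail to parse as JSON, and content uncertainty as the disagreement among the outputs that do parse.

```python
import json
from collections import Counter


def confusion_score(samples):
    """Return (format_uncertainty, content_uncertainty) for repeated outputs on one input."""
    parsed = []
    for text in samples:
        try:
            # Canonicalize so that key order and whitespace do not count as disagreement.
            parsed.append(json.dumps(json.loads(text), sort_keys=True))
        except json.JSONDecodeError:
            pass  # malformed output counts toward format uncertainty
    format_u = 1.0 - len(parsed) / len(samples)
    if parsed:
        most_common_count = Counter(parsed).most_common(1)[0][1]
        content_u = 1.0 - most_common_count / len(parsed)
    else:
        content_u = 1.0  # nothing parsed, so content is maximally uncertain
    return format_u, content_u


def rank_by_confusion(pool):
    """Sort unlabeled items (id -> sampled outputs) from most to least confusing."""
    return sorted(pool, key=lambda k: sum(confusion_score(pool[k])), reverse=True)


pool = {
    "easy": ['{"person": "Ada"}', '{"person": "Ada"}'],
    "hard": ['person: Ada', '{"person": "Ada"}'],
}
print(rank_by_confusion(pool))
```

The top-ranked items would then be labeled and used as few-shot exemplars; summing the two components with equal weight is one simple choice among many.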

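2507.21159 2026-05-13 cs.AI cs.LG cs.MA Updated

MAC: Masked Agent Collaboration Boosts Large Language Model Medical Decision-Making

Zhihao Peng, Liuxin Bao, Yixuan Yuan

Affiliations * School of Automation, Hangzhou Dianzi University; Department of Electronic Engineering, Chinese University of Hong Kong

AI summary This study proposes MAC, a masked agent collaboration framework that aims to improve large language models' performance in medical decision-making. Through Pareto-optimal agent construction and a cross-consistency maximization mechanism, the method achieves adaptive progressive propagation of collaborative information, improving the accuracy and robustness of medical decisions. The study also introduces a model diversity assessment and an output-consistency filtering strategy to optimize agent collaboration and reduce the impact of semantic inconsistency.

Details
English abstract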

Large language models (LLMs) have proven effective in artificial intelligence, where the multi-agent system (MAS) holds considerable promise for healthcare development by achieving the collaboration of LLMs. However, the absence of a systematic pipeline for agent construction and the rigidity of static collaboration patterns render current MAS-based models vulnerable to collaboration failures, resulting in substantial performance degradation in medical decision-making scenarios. To this end, we propose a novel Masked Agent Collaboration (MAC) framework that harnesses Pareto-optimal agent construction and cross-consistency maximization mechanisms to achieve adaptive progressive propagation of collaborative information, boosting the medical decision-making capacity. Specifically, we first conduct a Pareto-frontier factors analysis towards the LLMs pool to consider their key factors, including the model size, inference time, diversity score, and throughput ratio, where we calculate the similarity between pairwise outputs within an LLM to derive its diversity score. Beyond this analysis, we enable the identification of Pareto-optimal models that balance efficiency and capability, which are subsequently selected as collaborative agents to consider the fundamental trade-offs inherent in practical LLM deployment. Afterward, we measure the pairwise similarity between the outputs from collaborative agents to determine their cross-consistency values, subsequently masking out the agent with the lowest cross-consistency value to eliminate the output that is likely semantically inconsistent. Finally, we conduct collaboration of agents by achieving adaptive progressive propagation, where each agent aggregates the outputs of unmasked agents from the previous layer as its input to generate the corresponding output via prompt engineering.
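
The masking step can be sketched as follows. Token-level Jaccard similarity is an assumption standing in for whatever output similarity the paper uses, and the function names and example outputs are hypothetical.

```python
def jaccard(a, b):
    """Token-level Jaccard similarity (a stand-in for the paper's similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def mask_least_consistent(outputs):
    """Return the index of the agent whose output agrees least with the others.

    Expects at least two outputs.
    """
    scores = []
    for i, out_i in enumerate(outputs):
        others = [jaccard(out_i, out_j) for j, out_j in enumerate(outputs) if j != i]
        scores.append(sum(others) / len(others))  # cross-consistency of agent i
    return min(range(len(outputs)), key=scores.__getitem__)


outputs = [
    "diagnosis is pneumonia",
    "the diagnosis is pneumonia",
    "likely a fracture",
]
masked = mask_least_consistent(outputs)
unmasked = [o for i, o in enumerate(outputs) if i != masked]
print(masked, unmasked)
```

The unmasked outputs would then be passed to the next layer of agents as input, following the progressive propagation described in the abstract.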

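2507.13625 2026-05-13 cs.AI Updated

Bridging Dual Knowledge Graphs for Multi-Hop Question Answering in Construction Safety

Yuxin Zhang, Xi Wang, Mo Hu, Zhenyu Zhang

Affiliations * Department of Construction Science, College of Architecture, Texas A&M University, College Station, USA

AI summary This paper studies multi-hop question answering over complex construction safety regulations in support of automated compliance checking. It proposes BifrostRAG, a dual-graph retrieval-augmented generation system that models both linguistic relationships and document structure and uses a hybrid retrieval mechanism combining graph traversal with semantic vector search, improving large language models' ability to reason over the content and structure of regulations. Experiments show that BifrostRAG performs strongly on a multi-hop question dataset, significantly outperforming vector-only and graph-only baselines, and offers a transferable approach to processing complex technical documents.

Comments 22 pages, 13 figures

Journal ref Automation in Construction, Volume 183, March 2026, 106794

Details
English abstract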

Information retrieval and question answering from safety regulations are essential for automated construction compliance checking but are hindered by the linguistic and structural complexity of regulatory text. Many queries are multi-hop, requiring synthesis across interlinked clauses. To address the challenge, this paper introduces BifrostRAG, a dual-graph retrieval-augmented generation (RAG) system that models both linguistic relationships and document structure. The proposed architecture supports a hybrid retrieval mechanism that combines graph traversal with vector-based semantic search, enabling large language models to reason over both the content and the structure of the text. On a multi-hop question dataset, BifrostRAG achieves 92.8% precision, 85.5% recall, and an F1 score of 87.3%. These results significantly outperform vector-only and graph-only RAG baselines, establishing BifrostRAG as a robust knowledge engine for LLM-driven compliance checking. The dual-graph, hybrid retrieval mechanism presented in this paper offers a transferable blueprint for navigating complex technical documents across knowledge-intensive engineering domains.

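2507.11810 2026-05-13 cs.DL cs.AI Updated

Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator

Haoxuan Zhang, Ruochi Li, Yang Zhang, Ting Xiao, Jiangping Chen, Junhua Ding, Haihua Chen

Affiliations * Department of Information Science, University of North Texas; Department of Computer Science, North Carolina State University; Department of Data Science, University of North Texas; School of Information Sciences, University of Illinois Urbana-Champaign

AI summary This paper surveys the evolution of large language models (LLMs) across four roles in scientific innovation: Assistant, Collaborator, Scientist, and Evaluator. Integrating three dimensions (autonomy, cognitive function, and scientific innovation), it proposes a new classification framework that distinguishes the needs of research support from those of frontier discovery. The paper systematically reviews methods, benchmarks, and evaluation practices for each role, analyzes their capabilities, limitations, and need for human oversight, and argues that progress in AI for science depends not only on model capability but also on evaluation, oversight, accountability, and institutional integration.

Details
English abstract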

Large language models (LLMs) are increasingly used in scientific research and discovery, supporting tasks ranging from literature retrieval and synthesis to hypothesis generation, autonomous experimentation, and research evaluation. Existing surveys often conflate scientific research with scientific discovery and typically organize systems by domain, task, or autonomy level alone. In this survey, we propose a four-role framework for understanding LLMs in scientific innovation: Assistant, Collaborator, Scientist, and Evaluator. The framework integrates three complementary dimensions: autonomy level, cognitive function, and scientific innovation, to distinguish research-oriented support from frontier-oriented discovery. We review representative methods, benchmarks, and evaluation practices for each role, examining their capabilities, limitations, and human oversight requirements. Across the literature, Assistant systems are comparatively mature in retrieval and synthesis but remain unreliable in open-ended applications; Collaborator systems expand the space of candidate hypotheses yet struggle with novelty-grounding trade-offs; Scientist systems increasingly automate research workflows but face reliability and safety bottlenecks; and Evaluator systems support review and verification while remaining weak in novelty assessment. We argue that progress in AI for science depends not only on model capability, but also on evaluation, oversight, accountability, and institutional integration.

2507.03622 2026-05-13 cs.LG cs.AI stat.ML Version update

Localising Dropout Variance in Twin Networks

Cooper Doyle

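Affiliations * Commonwealth Bank of Australia

AI summary This paper studies how to locate the source of predictive uncertainty in twin-network models and proposes a layer-wise variance decomposition that splits total predictive variance into an encoder component and an outcome-head component. Toggling Monte Carlo Dropout independently in the shared encoder and the outcome heads separates the two sources of uncertainty. Experiments show that encoder variance dominates under distribution shift and is the primary predictor of error, while head variance becomes informative only once encoder uncertainty is controlled. The method is cheap and offers practical guidance for data collection.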

Comments 14 pages, 5 figures, 3 tables

Details
English abstract

Accurate individual treatment-effect estimation demands not only reliable point predictions but also uncertainty measures that help practitioners locate the source of model failure. We introduce a layer-wise variance decomposition for deep twin-network models: by toggling Monte Carlo Dropout independently in the shared encoder and the outcome heads, we split total predictive variance into an encoder component ($\sigma_{\mathrm{enc}}^2$) and a head component ($\sigma_{\mathrm{head}}^2$), with $\sigma_{\mathrm{enc}}^2 + \sigma_{\mathrm{head}}^2 \approx \sigma_{\mathrm{tot}}^2$ by the law of total variance. Across three synthetic covariate-shift regimes, the encoder component dominates under distributional shift ($\rho_{\mathrm{enc}}=0.53$) while the head component becomes informative only once encoder uncertainty is controlled. On a real-world twins cohort with induced multivariate shift, only $\sigma_{\mathrm{enc}}^2$ spikes on out-of-distribution samples and becomes the primary error predictor ($\rho_{\mathrm{enc}} \approx 0.89$), while $\sigma_{\mathrm{head}}^2$ remains flat. The decomposition adds negligible cost over standard MC Dropout and provides a practical diagnostic for deciding whether to collect more diverse covariates or more outcome data.
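
The toggling idea in this abstract can be illustrated with a minimal sketch. It assumes a toy linear model with one scalar input, three encoder units, and a single two-unit head; the weights, the keep probability, and all function names are hypothetical and are not taken from the paper. To keep the example deterministic, it enumerates every dropout mask with its probability instead of drawing Monte Carlo samples.

```python
from itertools import product

KEEP = 0.8                    # assumed keep probability of every dropout unit
ENC_W = (0.5, -0.3, 0.8)      # hypothetical encoder weights (3 hidden units)
HEAD_W = (1.0, 0.6)           # hypothetical head weights (2 units)


def masks(n):
    """Yield (mask, probability) for every dropout mask over n units."""
    for bits in product((0, 1), repeat=n):
        prob = 1.0
        for b in bits:
            prob *= KEEP if b else 1.0 - KEEP
        yield bits, prob


def layer(value, weights, mask=None):
    """Toy linear layer; mask=None means dropout is switched off."""
    if mask is None:
        return sum(w * value for w in weights)
    # Inverted dropout: kept units are rescaled by 1 / KEEP.
    return sum(m * w * value / KEEP for m, w in zip(mask, weights))


def variance(pairs):
    """Variance of a discrete distribution given (value, probability) pairs."""
    pairs = list(pairs)
    mean = sum(v * p for v, p in pairs)
    return sum(p * (v - mean) ** 2 for v, p in pairs)


def decompose(x):
    """Return (var_enc, var_head, var_tot) for a scalar input x."""
    n_enc, n_head = len(ENC_W), len(HEAD_W)
    # Dropout active in the encoder only.
    var_enc = variance(
        (layer(layer(x, ENC_W, me), HEAD_W), pe) for me, pe in masks(n_enc)
    )
    # Dropout active in the head only.
    var_head = variance(
        (layer(layer(x, ENC_W), HEAD_W, mh), ph) for mh, ph in masks(n_head)
    )
    # Dropout active in both parts.
    var_tot = variance(
        (layer(layer(x, ENC_W, me), HEAD_W, mh), pe * ph)
        for me, pe in masks(n_enc)
        for mh, ph in masks(n_head)
    )
    return var_enc, var_head, var_tot


var_enc, var_head, var_tot = decompose(1.0)
print(f"enc={var_enc:.4f} head={var_head:.4f} "
      f"sum={var_enc + var_head:.4f} tot={var_tot:.4f}")
```

In this toy setting the two components sum to roughly 0.97 against a total of roughly 1.05. The gap is a product term between encoder and head noise, which is one reason the relation is stated as approximate rather than exact.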

2506.22809 2026-05-13 cs.LG cs.AI cs.CL Version update

Learning Adapter Rank via Symmetry Breaking

Cooper Doyle, Andy Hu, Rebecca Chan, Anna Leontjeva

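Affiliations * Commonwealth Bank of Australia

AI summary This work addresses the non-identifiability of the rank coordinates in low-rank adaptation (LoRA). It uses variational inference with a diagonal posterior to break LoRA's rotational symmetry, which automatically determines the relevance of each rank direction. Building on this, it proposes BayesLoRA, a framework that performs Bayesian inference directly in the low-rank space and jointly learns the effective adapter rank and predictive uncertainty with only a few additional parameters. Experiments show compact predictive calibration and performance that matches or exceeds existing low-rank sparsification methods at comparable training cost.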

Comments 8 pages, 2 figures, 4 tables

Details
English abstract

Low-rank adaptation is effective partly because downstream updates lie in a low-dimensional subspace, but the latent rank coordinates of LoRA are not identifiable: any invertible reparameterization of the adapter factors leaves the weight update unchanged. We show that variational inference with a diagonal rank-wise posterior turns this non-identifiability into a useful inductive bias. By breaking LoRA's rotational gauge symmetry, the variational objective selects a preferred basis in rank space, enabling automatic relevance determination over rank directions. This yields Low-Rank Variational Dropout (LRVD), a Bayesian framework that performs inference directly in the low-rank adaptation space rather than the ambient weight space. As an instantiation, BayesLoRA jointly learns effective adapter rank and predictive uncertainty with only $\mathcal{O}(r)$ additional parameters. Empirically, BayesLoRA induces stable rank structure aligned with the dominant singular directions of learned updates, yields compact predictive calibration and matches or exceeds strong low-rank sparsification baselines at comparable training cost.
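
The non-identifiability claim in the first sentence of this abstract can be checked on a small example. The sketch below assumes hypothetical integer factors `B` and `A` for a rank-2 adapter and an arbitrary invertible matrix `M`; none of these values come from the paper. It shows that the weight update is unchanged under reparameterization while a per-rank quantity (the squared column norms of `B`) is not.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def column_sq_norms(m):
    """Squared Euclidean norm of each column of a matrix."""
    return [sum(v * v for v in col) for col in zip(*m)]


# Hypothetical rank-2 adapter factors: delta_w = B @ A (3x2 times 2x3).
B = [[1, 0], [2, 1], [0, 3]]
A = [[1, 2, 0], [0, 1, 1]]

# An arbitrary invertible 2x2 matrix and its inverse (det M = 1).
M = [[2, 1], [1, 1]]
M_INV = [[1, -1], [-1, 2]]

# Reparameterized factors: (B M)(M^-1 A) should equal B A.
B2 = matmul(B, M)
A2 = matmul(M_INV, A)

delta_w = matmul(B, A)
delta_w2 = matmul(B2, A2)

print(delta_w == delta_w2)                       # identical weight update
print(column_sq_norms(B), column_sq_norms(B2))   # per-rank sizes differ
```

Because per-rank quantities depend on the chosen basis, any rank-wise importance score only becomes meaningful once something in the objective prefers one basis over the others, which is the role the abstract assigns to the diagonal posterior.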

2506.08902 2026-05-13 cs.LG cs.AI Version update

Intention-Conditioned Flow Occupancy Models

Chongyi Zheng, Seohong Park, Sergey Levine, Benjamin Eysenbach

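Affiliations * Princeton University; University of California, Berkeley

AI summary This paper proposes intention-conditioned flow occupancy models (InFOM), a probabilistic model that predicts the distribution of states an agent will visit in the distant future. The model is built with flow matching and includes a latent variable capturing user intention, which increases expressivity and supports generalized policy improvement. Experiments on multiple benchmark tasks show a 1.8× median improvement in returns and a 36% increase in success rates over alternative methods.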

Comments ICLR 2026

Details
English abstract

Large-scale pre-training has fundamentally changed how machine learning research is done today: large foundation models are trained once, and then can be used by anyone in the community (including those without data or compute resources to train a model from scratch) to adapt and fine-tune to specific tasks. Applying this same framework to reinforcement learning (RL) is appealing because it offers compelling avenues for addressing core challenges in RL, including sample efficiency and robustness. However, there remains a fundamental challenge to pre-train large models in the context of RL: actions have long-term dependencies, so training a foundation model that reasons across time is important. Recent advances in generative AI have provided new tools for modeling highly complex distributions. In this paper, we build a probabilistic model to predict which states an agent will visit in the temporally distant future (i.e., an occupancy measure) using flow matching. As large datasets are often constructed by many distinct users performing distinct tasks, we include in our model a latent variable capturing the user intention. This intention increases the expressivity of our model, and enables adaptation with generalized policy improvement. We call our proposed method intention-conditioned flow occupancy models (InFOM). Comparing with alternative methods for pre-training, our experiments on $36$ state-based and $4$ image-based benchmark tasks demonstrate that the proposed method achieves $1.8 \times$ median improvement in returns and increases success rates by $36\%$. Website: https://chongyi-zheng.github.io/infom Code: https://github.com/chongyi-zheng/infom

2505.13770 2026-05-13 cs.AI cs.CL cs.LG stat.ME stat.ML Version update

Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference

Jin Du, Li Chen, Xun Xian, An Luo, Fangqiao Tian, Ganghua Wang, Charles Doss, Xiaotong Shen, Jie Ding

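Affiliations * School of Statistics, University of Minnesota; Data Science Institute, University of Chicago

AI summary This study examines whether large language models (LLMs) can handle statistical pitfalls in causal inference and finds that current models fall clearly short on issues such as Simpson's paradox and selection bias. It proposes CausalPitfalls, a comprehensive benchmark of structured challenges across multiple difficulty levels with grading rubrics, to systematically evaluate both causal reasoning ability and response reliability. The results reveal the limitations of existing LLMs in statistical causal inference and offer guidance for building trustworthy causal reasoning systems.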

Details
English abstract

Reliable causal inference is essential for making decisions in high-stakes areas like medicine, economics, and public policy. However, it remains unclear whether large language models (LLMs) can handle rigorous and trustworthy statistical causal inference. Current benchmarks usually involve simplified tasks. For example, these tasks might only ask LLMs to identify semantic causal relationships or draw conclusions directly from raw data. As a result, models may overlook important statistical pitfalls, such as Simpson's paradox or selection bias. This oversight limits the applicability of LLMs in the real world. To address these limitations, we propose CausalPitfalls, a comprehensive benchmark designed to rigorously evaluate the capability of LLMs in overcoming common causal inference pitfalls. Our benchmark features structured challenges across multiple difficulty levels, each paired with grading rubrics. This approach allows us to quantitatively measure both causal reasoning capabilities and the reliability of LLMs' responses. We evaluate models using two protocols: (1) direct prompting, which assesses intrinsic causal reasoning, and (2) code-assisted prompting, where models generate executable code for explicit statistical analysis. Additionally, we validate the effectiveness of this judge by comparing its scoring with assessments from human experts. Our results reveal significant limitations in current LLMs when performing statistical causal inference. The CausalPitfalls benchmark provides essential guidance and quantitative metrics to advance the development of trustworthy causal reasoning systems.
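
Simpson's paradox, one of the pitfalls named in this abstract, is easy to reproduce with a few counts. The sketch below uses illustrative success counts for two treatments split by a confounding stratum; the numbers and names are assumptions chosen for illustration and are not taken from the CausalPitfalls benchmark.

```python
# Illustrative (successes, trials) counts per treatment and stratum.
# These values are assumptions, not data from the benchmark.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Success rate as a fraction."""
    return successes / trials


def stratum_rates(treatment):
    """Success rate of a treatment within each stratum."""
    return {name: rate(*counts) for name, counts in data[treatment].items()}


def pooled_rate(treatment):
    """Success rate of a treatment after pooling all strata."""
    successes = sum(s for s, _ in data[treatment].values())
    trials = sum(t for _, t in data[treatment].values())
    return rate(successes, trials)


for treatment in data:
    print(treatment, stratum_rates(treatment), round(pooled_rate(treatment), 4))
```

Treatment A has the higher rate inside every stratum, yet treatment B has the higher pooled rate, because A was applied mostly to the harder "large" cases. A model that draws conclusions directly from the pooled table would pick the wrong treatment.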

2505.05665 2026-05-13 cs.RO cs.AI cs.CL Version update

Characterizing the Robustness of Black-Box LLM Planners Under Perturbed Observations with Adaptive Stress Testing

Neeloy Chakraborty, John Pohovey, Melkior Ornik, Katherine Driggs-Campbell

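Affiliations * University of Illinois Urbana-Champaign

AI summary This study investigates the robustness of black-box large language model (LLM) planners when observations are perturbed. It considers two perturbation dimensions, one producing semantically similar prompt variations and one simulating sensor noise, and uses adaptive stress testing (AST) with Monte-Carlo tree search (MCTS) to search the perturbation space efficiently for scenarios and configurations that make the model act with high uncertainty or crash. Experiments show that the method can identify potential runtime failures ahead of time, improving the reliability of LLMs in safety-critical scenarios.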

Comments Accepted to ACL Findings 2026; 31 pages, 26 figures, 6 tables

Details
English abstract

Large language models (LLMs) have recently demonstrated success in decision-making tasks including planning, control, and prediction, but their tendency to hallucinate unsafe and undesired outputs poses risks. This unwanted behavior is further exacerbated in environments where sensors are noisy or unreliable. Characterizing the behavior of LLM planners under varied observations is necessary to proactively avoid failures in safety-critical scenarios. We specifically investigate the response of LLMs along two different perturbation dimensions. Like prior works, one dimension generates semantically similar prompts with varied phrasing by randomizing the order of details, modifying access to few-shot examples, etc. Unique to our work, the second dimension simulates access to varied sensors and noise to mimic raw sensor or detection algorithm failures. An initial case study in which perturbations are manually applied shows that both dimensions lead LLMs to hallucinate in a multi-agent driving environment. However, manually covering the entire perturbation space for several scenarios is infeasible. As such, we propose a novel method for efficiently searching the space of prompt perturbations using adaptive stress testing (AST) with Monte-Carlo tree search (MCTS). Our AST formulation enables discovery of scenarios, sensor configurations, and prompt phrasing that cause language models to act with high uncertainty or even crash. By generating MCTS prompt perturbation trees across diverse scenarios, we show through extensive experiments that offline analyses can be used to proactively understand potential failures that may arise at runtime. Code is available at https://sites.google.com/illinois.edu/astllm.

2503.16072 2026-05-13 cs.LG cs.AI cs.CL Version update

Toxicity Detection Should Measure Contextual Harm, Not Text-Intrinsic Badness

Sergei Berezin, Reza Farahbakhsh, Noel Crespi

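Affiliations * SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France

AI summary This paper argues that current toxicity detectors tend to treat toxicity as an intrinsic property of text while overlooking the actual harm it causes in context. The authors contend that toxicity detection should assess the harm done by situated communicative acts rather than serve as plain text classification. They propose the Contextual Stress Framework (CSF), which defines toxicity as a relation between norm violation and induced stress or disruption, and introduce CSF-Eval, an evaluation agenda for measuring toxicity detection more comprehensively.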

Details
English abstract

Toxicity detection has become core safety infrastructure for online moderation, dataset filtering, and deployed language-model systems. Yet most detectors still treat toxicity as an intrinsic property of isolated text. This position paper argues that toxicity detection should be evaluated as the contextual measurement of situated communicative harm, rather than as single-label text classification. Toxicity is not contained in words alone; it emerges when a communicative act is interpreted by an audience within a normative and social context. We introduce the Contextual Stress Framework (CSF), which defines toxicity as a relation between perceived norm violation and induced stress or disruption. CSF explains why text-intrinsic detectors overflag dialectal or reclaimed language, miss coded or pragmatic abuse, and remain brittle under meaning-preserving transformations. We propose CSF-Eval, an evaluation agenda that separates text risk, norm violation, disruption, uncertainty, and policy action.

2503.09051 2026-05-13 cs.LG cs.AI Version update

Model-Level GNN Explanations via Rule-to-Graph Readout for Logit Reconstruction

Shengyao Lu, Jiuding Yang, Aedan J. DeFrates, Keith G. Mills, Baochun Li, Di Niu

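Affiliations * University of Victoria; University of Alberta; LSU ATHENA Lab; University of Toronto

AI summary This paper proposes a new model-level explanation framework for graph neural networks (GNNs) that shifts the explanation target from class-wise rule extraction to rule-based logit reconstruction. It recasts the graph-level readout of a pretrained GNN as a weighted rule-level readout: subgraph concepts are composed into logical rules, rule embeddings are computed directly from their symbolic structure, and the frozen classifier head reconstructs the original multiclass logits. Experiments on several graph classification datasets show high-fidelity logit reconstruction, with better explanation efficiency and functional analysis than existing methods.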

Details
English abstract

We propose a novel model-level GNN explanation framework that shifts the explanation target from class-wise rule extraction to rule-based logit reconstruction. Our method recasts the graph-level readout of a pretrained GNN as a weighted rule-level readout: grounded subgraph concepts are composed into logical rules, rule embeddings are computed directly from their symbolic structure, and active rules are passed through the frozen classifier head to reconstruct the GNN's raw multiclass logits. As a result, our approach provides global explanations that remain instantiable on unseen graphs, support subgraph-level grounding, and admit rule-level contribution analysis at test-time. Experiments on three synthetic and two real-world graph classification benchmarks show that our approach faithfully reconstructs the base GNN's raw multiclass logits, achieving high probability-level fidelity across datasets. Rule-level ablations further demonstrate that the identified critical rules actively support the predicted class while suppressing non-target classes, suggesting that they act as functional units rather than merely serving as post-hoc symbolic artifacts. Compared with prior class-wise rule-based explainers, our approach achieves competitive or better prediction agreement while being up to \(20\times\) faster, and additionally provides rule weights, test-time grounding, and logit-level contribution analysis.

2502.20209 2026-05-13 cs.CV cs.AI Version update

DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild

Luis Marquez-Carpintero, Sergio Suescun-Ferrandiz, Carolina Lorenzo Álvarez, Jorge Fernandez-Herrero, Diego Viejo, Rosabel Roig-Vila, Miguel Cazorla

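Affiliations * Institute for Computer Research; University of Alicante

AI summary This paper introduces DIPSER, a new dataset for assessing student attention in real classroom settings. It contains multi-view RGB camera data and smartwatch sensor data capturing students' posture, facial expressions, and physiological signals, along with attention and emotion labels from student self-reports and evaluations by four experts. The dataset combines facial and environmental camera data with wearable metrics and includes ethnic groups that are underrepresented in earlier datasets; the authors describe it as the most comprehensive dataset currently available for analyzing student attention and emotion in face-to-face lessons.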

Details
English abstract

In this paper, a novel dataset is introduced, designed to assess student attention within in-person classroom settings. This dataset encompasses RGB camera data, featuring multiple cameras per student to capture both posture and facial expressions, in addition to smartwatch sensor data for each individual. This dataset allows machine learning algorithms to be trained to predict attention and correlate it with emotion. A comprehensive suite of attention and emotion labels for each student is provided, generated through self-reporting as well as evaluations by four different experts. Our dataset uniquely combines facial and environmental camera data with smartwatch metrics and includes ethnicities that are underrepresented in similar datasets, all within in-the-wild, in-person settings, making it the most comprehensive dataset of its kind currently available. The dataset presented offers an extensive and diverse collection of data pertaining to student interactions across different educational contexts, augmented with additional metadata from other tools. This initiative addresses existing deficiencies by offering a valuable resource for the analysis of student attention and emotion in face-to-face lessons.

2502.01941 2026-05-13 cs.CL cs.AI Version update

Semantic Integrity Matters: Benchmarking and Preserving High-Density Reasoning in KV Cache Compression

Xiang Liu, Zhenheng Tang, Hong Chen, Peijie Dong, Zeyu Li, Xiuze Zhou, Bo Li, Xuming Hu, Xiaowen Chu

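Affiliations * The Hong Kong University of Science and Technology; Guangzhou HKUST Fok Ying Tung Research Institute

AI summary This paper studies how key-value (KV) cache compression affects high-density reasoning in large language model inference, noting that current evaluations focus on sparse retrieval tasks and overlook Chain-of-Thought (CoT) coherence. The authors introduce the KVFundaBench benchmark, which reveals severe task-dependent degradation on reasoning tasks under aggressive compression. Building on this, they propose ShotKV, which separates the prefill and decoding phases and preserves semantic units intact, improving accuracy on long-context generation tasks and reducing inference latency.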

Comments ICML 2026

Details
English abstract

While Key-Value (KV) cache compression is essential for efficient LLM inference, current evaluations disproportionately focus on sparse retrieval tasks, potentially masking the degradation of High-Density Reasoning where Chain-of-Thought (CoT) coherence is critical. We introduce KVFundaBench to systematically evaluate this gap, revealing a sharp dichotomy: while retrieval tasks remain robust, reasoning tasks exhibit severe Task-Dependent Degradation under aggressive compression due to disrupted CoT links. Extending our analysis to the DeepSeek-R1 model, we uncover that its specialized attention patterns offer unique insights into the fragility of reasoning chains. Guided by these findings -- specifically the necessity of preserving few-shot examples as indivisible Semantic Units -- we propose ShotKV. This approach explicitly separates prefill and decoding phases to prioritize semantic integrity. Empirical results demonstrate that ShotKV achieves 9%-18% accuracy improvements on long-context generation tasks and effectively generalizes to document QA, all while delivering an 11% latency reduction compared to full cache inference.

2501.06857 2026-05-13 cs.AI Version update

A Counterfactual Cause in Situation Calculus

Daxin Liu, Vaishak Belle

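Affiliations * State Key Laboratory for Novel Software Technology & School of Artificial Intelligence, Nanjing University, China; School of Informatics, The University of Edinburgh, UK

AI summary This paper proposes a notion of cause based on counterfactual analysis for explaining the causes of quantified effects in an action history within the situation calculus. Unlike the existing definition of actual achievement cause, the approach starts from a counterfactual perspective, generalizes naturally to a notion of achievement cause, and is compared with the account of Batusov and Soutchanski. The paper also relates this notion to Halpern and Pearl's theory of actual causality, highlighting nuances of applying a counterfactual viewpoint to disjunctive goals.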

Comments This version changes the working title of the extended report and fixes some errors

Details
English abstract

Recently, Batusov and Soutchanski proposed a notion of actual achievement cause in the situation calculus which, amongst other things, can determine the cause of quantified effects in a given action history. While intuitively appealing, this notion of cause is not defined from a counterfactual perspective. In this paper, we propose a notion of cause based on counterfactual analysis. In the context of action history, we show that our notion of cause generalizes naturally to a notion of achievement cause. We analyze the relationship between our notion of the achievement cause and the achievement cause by Batusov and Soutchanski. Finally, we relate our account of cause to Halpern and Pearl's account of actual causality. Particularly, we note some nuances in applying a counterfactual viewpoint to disjunctive goals, a common thorn in definitions of actual causes.

2501.03717 2026-05-13 cs.CV cs.AI cs.GR Version update

Materialist: Physically Based Editing Using Single-Image Inverse Rendering

Lezhong Wang, Duc Minh Tran, Ruiqi Cui, Thomson TG, Anders Bjorholm Dahl, Siavash Arjomand Bigdeli, Jeppe Revall Frisvad, Manmohan Chandraker

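Affiliations * Technical University of Denmark; University of California San Diego

AI summary This paper proposes Materialist, a physically based editing method built on single-image inverse rendering, to address the lack of physical consistency in image editing. The method combines neural networks with physically based rendering: a neural network predicts initial material properties, which are then optimized via progressive differentiable rendering, enabling high-quality material editing, relighting, and object insertion. It can edit material transparency without full scene geometry and performs competitively on environment-map estimation; experiments show strong performance on both synthetic and real-world datasets.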

Comments More Comprehensive IJCV Camera-Ready Version. Project website: https://lez-s.github.io/materialist_project/

Journal ref International Journal of Computer Vision (IJCV), 134(6), 267 (2026)

Details
English abstract

Achieving physically consistent image editing remains a significant challenge in computer vision. Existing image editing methods typically rely on neural networks, which struggle to accurately handle shadows and refractions. Conversely, physics-based inverse rendering often requires multi-view optimization, limiting its practicality in single-image scenarios. In this paper, we propose Materialist, a neural-initialized physically based rendering pipeline for single-image inverse rendering. Unlike previous hybrid methods that use physics to guide neural generation, our method leverages neural networks to predict initial material properties, which are then rigorously optimized via progressive differentiable rendering. Our approach enables a range of applications, including material editing, object insertion, and relighting, while also introducing an effective method for editing material transparency via ray-traced refraction without requiring full scene geometry. Furthermore, our envmap estimation method achieves competitive performance, further enhancing the accuracy of image editing tasks. Experiments demonstrate strong performance across synthetic and real-world datasets, excelling even on challenging out-of-domain images.

2412.05225 2026-05-13 cs.CL cs.AI cs.NE Version update

BEExformer: A Fast Inferencing Binarized Transformer with Early Exits

Wazib Ansar, Saptarsi Goswami, Amlan Chakrabarti

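Affiliations * A. K. Choudhury School of IT, University of Calcutta

AI summary BEExformer is an efficient transformer that combines binarization with early exits to speed up large language model inference on constrained resources. It introduces a Selective-Learn Forget Network and Binarization-Aware Training, which reduce model size and accelerate inference. Its early-exit mechanism, based on fractional entropy reduction across intermediate blocks with a soft-routing loss, lowers computation while also improving accuracy, giving a favorable trade-off between performance and efficiency.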

Comments This revised manuscript includes 18 pages, 6 figures, and 6 tables. Methodology and results sections have been improved for clarity and depth, incorporating additional comparisons, ablations, and new evaluation datasets. A few relevant references were added, and overall organization refined for better readability

Journal ref in IEEE Transactions on Sustainable Computing, vol. 11, no. 2, pp. 98-110, 2026

Details
English abstract

Large Language Models (LLMs) based on transformers achieve cutting-edge results on a variety of applications. However, their enormous size and processing requirements hinder deployment on constrained resources. To enhance efficiency, binarization and Early Exit (EE) have proved to be effective solutions. However, binarization may lead to performance loss as reduced precision affects gradient estimation and parameter updates. Besides, research on EE mechanisms is still in its early stages. To address these challenges, we introduce Binarized Early Exit Transformer (BEExformer), a first-of-its-kind selective learning-based transformer integrating Binarization-Aware Training (BAT) with EE for efficient and fast textual inference. Each transformer block has an integrated Selective-Learn Forget Network (SLFN) to enhance contextual retention while eliminating irrelevant information. The BAT employs a differentiable second-order approximation to the sign function, enabling gradient computation that captures both the sign and magnitude of the weights. This yields a 21.30-fold reduction in model size. The EE mechanism hinges on the fractional reduction in entropy among intermediate transformer blocks with soft-routing loss estimation. This accelerates inference by reducing FLOPs by 52.27% and even improves accuracy by 3.22% by resolving the "overthinking" problem inherent in deep networks. Extensive evaluation through comparison with the SOTA methods and various ablations across nine datasets covering multiple NLP tasks demonstrates its Pareto-optimal performance-efficiency trade-off.
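
The abstract does not spell out the exit rule, so the following is only a sketch of one plausible reading: compute the entropy of the class distribution after each block and stop once the fractional drop relative to the previous block falls below a threshold. The function names, the threshold value, and the direction of the comparison are assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit_block(block_logits, threshold=0.05):
    """Index of the block at which inference stops.

    Assumed rule: exit at the first block whose entropy fell by less than
    `threshold` (as a fraction of the previous block's entropy); otherwise
    run through to the last block.
    """
    previous = None
    for index, logits in enumerate(block_logits):
        current = entropy(softmax(logits))
        if previous is not None and previous > 0.0:
            if (previous - current) / previous < threshold:
                return index
        previous = current
    return len(block_logits) - 1


# Hypothetical per-block logits for a 3-class task.
logits_per_block = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [4.0, 0.0, 0.0],
    [4.05, 0.0, 0.0],
    [9.0, 0.0, 0.0],
]
print(early_exit_block(logits_per_block))
```

With these made-up logits the entropy stops falling meaningfully at the fourth block (index 3), so the remaining block is skipped; a lower threshold would make the model exit later.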

2411.13311 2026-05-13 cs.CV cs.AI Version update

A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data

Kavin Chandrasekaran, Sorin Grigorescu, Gijs Dubbelman, Pavol Jancura

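Affiliations * ElektroBit Automotive GmbH; Eindhoven University of Technology; Transilvania University of Brasov

AI summary This study proposes an efficient fusion network for object detection in bird's-eye view (BEV) using camera and raw radar data. It uses the radar's raw range-Doppler (RD) spectrum directly, avoiding complex radar signal processing, extracts features from camera images with a dedicated image processing pipeline, and fuses the camera and radar features for detection. The method lowers computational complexity while maintaining detection accuracy, offering a more efficient and robust perception approach for autonomous driving systems.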

Comments IEEE Intelligent Transportation Systems Conference (ITSC) 2024

Details
English abstract

Cameras can be used to perceive the environment around the vehicle, while affordable radar sensors are popular in autonomous driving systems as they can withstand adverse weather conditions, unlike cameras. However, radar point clouds are sparser, with low azimuth and elevation resolution, and lack semantic and structural information about the scene, resulting in generally lower radar detection performance. In this work, we directly use the raw range-Doppler (RD) spectrum of radar data, thus avoiding radar signal processing. We independently process camera images within the proposed comprehensive image processing pipeline. Specifically, first, we transform the camera images to Bird's-Eye View (BEV) Polar domain and extract the corresponding features with our camera encoder-decoder architecture. The resultant feature maps are fused with Range-Azimuth (RA) features, recovered from the RD spectrum input from the radar decoder to perform object detection. We evaluate our fusion strategy against other existing methods not only in terms of accuracy but also on computational complexity metrics on the RADIal dataset.

2407.00805 2026-05-13 cs.AI Version update

Towards Shutdownable Agents via Stochastic Choice

Elliott Thornley, Alexander Roman, Christos Ziakas, Leyton Ho, Louis Thomson

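Affiliations * Massachusetts Institute of Technology; New College of Florida; Imperial College London; Brown University; Independent

AI summary This paper studies how to train artificial agents that are both effective at their tasks and do not resist shutdown. It uses a 'Discounted Reward for Same-Length Trajectories' (DReST) reward function to lead agents to choose stochastically between trajectory-lengths, making them both USEFUL and NEUTRAL. Simple agents trained in gridworlds learn to be USEFUL and NEUTRAL, providing initial theoretical support and empirical evidence for building shutdownable advanced agents.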

Journal ref Technical AI Safety (TAIS) Conference 2025

Details
English abstract

The POST-Agents Proposal (PAP) is an idea for ensuring that advanced artificial agents never resist shutdown. A key part of the PAP is using a novel 'Discounted Reward for Same-Length Trajectories (DReST)' reward function to train agents to (1) pursue goals effectively conditional on each trajectory-length (be 'USEFUL'), and (2) choose stochastically between different trajectory-lengths (be 'NEUTRAL' about trajectory-lengths). In this paper, we propose evaluation metrics for USEFULNESS and NEUTRALITY. We use a DReST reward function to train simple agents to navigate gridworlds, and we find that these agents learn to be USEFUL and NEUTRAL. Our results thus provide some initial evidence that DReST reward functions could train advanced agents to be USEFUL and NEUTRAL. Our theoretical work suggests that these agents would be useful and shutdownable.

2402.07619 2026-05-13 cs.SD cs.AI eess.AS Version update

Developing a Multi-variate Prediction Model For COVID-19 From Crowd-sourced Respiratory Voice Data

Yuyang Yan, Wafaa Aljbawi, Sami O. Simons, Visara Urovi

Affiliations * Institute of Data Science, Maastricht University; Department of Respiratory Medicine, Maastricht University Medical Center, Maastricht University

AI summary This study develops a multivariate deep learning model that detects COVID-19 from crowd-sourced respiratory voice data. Using speech samples from the Cambridge COVID-19 Sound database, it extracts voice features including Mel-spectrograms, MFCCs, and CNN encoder features, and builds LSTM, CNN, and HuBERT classifiers. HuBERT outperforms the baseline machine learning models, reaching 86% accuracy and an AUC of 0.93, which indicates the potential of voice data for COVID-19 diagnosis.

Comments arXiv admin note: text overlap with arXiv:2209.03727

Details
English abstract

COVID-19 has affected more than 223 countries worldwide, and in the post-COVID era there is a pressing need for non-invasive, low-cost, and highly scalable solutions to detect COVID-19. We develop a deep learning model to identify COVID-19 from voice recording data. The novelty of this work is in the development of deep learning models for COVID-19 identification from only voice recordings. We use the Cambridge COVID-19 Sound database which contains 893 speech samples, crowd-sourced from 4352 participants via a COVID-19 Sounds app. Voice features including Mel-spectrograms, Mel-frequency cepstral coefficients (MFCC), and CNN encoder features are extracted. Based on the voice data, we develop deep learning classification models to detect COVID-19 cases. These models include Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Hidden-Unit BERT (HuBERT). We compare their predictive power to baseline machine learning models. HuBERT achieves the highest accuracy of 86% and the highest AUC of 0.93. The results achieved with the proposed models are promising for COVID-19 diagnosis from voice recordings when compared with those obtained from the state of the art.

2303.12834 2026-05-13 quant-ph cs.AI cs.LG stat.ML Version update

The power and limitations of learning quantum dynamics incoherently

Sofiene Jerbi, Joe Gibbs, Manuel S. Rudolph, Matthias C. Caro, Patrick J. Coles, Hsin-Yuan Huang, Zoë Holmes

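Affiliations * Theoretical Division, Los Alamos National Laboratory; Institute for Theoretical Physics, University of Innsbruck; Department of Physics, University of Surrey; Institute for Quantum Information and Matter, Caltech; Normal Computing Corporation; Department of Computing and Mathematical Sciences, Caltech

AI summary This paper studies the feasibility and limits of learning quantum dynamics in an incoherent framework, where the system and target do not interact directly. By analyzing the number of measurements required to emulate established coherent learning strategies, the authors bound the sample complexity of learning unitary processes. They prove that with arbitrary measurements any efficiently representable unitary can be learned efficiently in the incoherent framework, whereas with shallow-depth measurements only low-entangling unitaries can be learned. They also learn a 16-qubit unitary on an IBM quantum device and verify the algorithm's scalability through numerical experiments.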

Comments 6+9 pages, 7 figures

Journal ref Phys. Rev. Research 8, 023141 (2026)

Details
English abstract

Quantum process learning is emerging as an important tool to study quantum systems. While studied extensively in coherent frameworks, where the target and model system can share quantum information, less attention has been paid to whether the dynamics of quantum systems can be learned without the system and target directly interacting. Such incoherent frameworks are practically appealing since they open up methods of transpiling quantum processes between the different physical platforms without the need for technically challenging hybrid entanglement schemes. Here we provide bounds on the sample complexity of learning unitary processes incoherently by analyzing the number of measurements that are required to emulate well-established coherent learning strategies. We prove that if arbitrary measurements are allowed, then any efficiently representable unitary can be efficiently learned within the incoherent framework; however, when restricted to shallow-depth measurements only low-entangling unitaries can be learned. We demonstrate our incoherent learning algorithm for low-entangling unitaries by successfully learning a 16-qubit unitary on ibmq_kolkata, and further demonstrate the scalability of our proposed algorithm through extensive numerical experiments.

2302.12039 2026-05-13 cs.CL cs.AI Version update

Natural Language Processing in the Legal Domain

Dirk Hartung, Daniel Martin Katz, Michael J. Bommarito, Lauritz Gerlach, Abhik Jana, Jerrold Soh

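Affiliations * Singapore Management University; CodeX, Stanford University, USA; Illinois Tech - Chicago Kent College of Law, USA; Bucerius Law School, Germany; IIT-Bhubaneswar

AI summary This paper surveys recent developments in natural language processing for the legal domain, analyzing technical and substantive progress in nearly one thousand papers published between 2013 and 2024. It finds that the number of papers, the range of tasks, and the languages covered have grown markedly, while methods have become more sophisticated, approaching the level of general NLP and meeting higher professional standards for data availability and code reproducibility. These trends bode well for the future of Legal NLP.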

Comments 15 pages, 7 figures, 2 tables

Details
English abstract

We summarize the current state of the field of NLP & Law with a specific focus on recent technical and substantive developments. To support our analysis, we construct and analyze a nearly complete corpus of almost one thousand NLP & Law related papers published between 2013 and 2024. Our analysis highlights several major trends. Namely, we document an increasing number of papers written, tasks undertaken, and languages covered over the course of the past decade. We observe an increase in the sophistication of the methods which researchers deployed in this applied context. Legal NLP is beginning to match not only the methodological sophistication of general NLP but also the professional standards of data availability and code reproducibility observed within the broader scientific community. We believe all of these trends bode well for the future of the field and point to an exciting next phase for the Legal NLP community.