Abstract

Decision-making from offline datasets typically warm-starts a policy or score model from fixed offline data and then refines it with limited online interaction. Offline data reduces uncertainty, but it does not remove the need for exploration; it changes what remains to be explored. We formalise this residual uncertainty by the conditional mutual information $I(\chi;\tau_{1:T}\mid\mathcal{D}_N)$ between a learning target $\chi$ and the online trajectories after conditioning on the offline dataset. This view leads naturally to information-directed sampling (IDS), a family parameterised by $\eta\ge 0$ that selects actions by trading off instantaneous regret against information gain. We prove a generic offline-to-online Bayesian regret bound for IDS through a ratio certificate: any information-ratio bound satisfied by a reference Thompson-sampling policy over the same randomised policy class is inherited by IDS. In a known-dynamics Bayesian linear-reward model, the conditional mutual information has a log-determinant form, and vanilla IDS ($\eta=0$) satisfies $\widetilde O\!\left(Hd\min\left\{\sqrt T,\,T\sqrt{C^\dagger_{\beta,\mathrm{IDS}_0}(N,T)/N}\right\}\right)$, where the coverage coefficient is tied to the visitation distribution induced by vanilla IDS itself. We also identify a warm-start regime with a dominated but informative probe in which vanilla IDS selects the probe while Thompson sampling never does, giving a constant-factor Bayesian regret separation. Controlled bandit experiments and D4RL offline-to-online RL experiments validate this mechanism: IDS is most beneficial when offline data is informative but leaves biased or low-probability residual uncertainty that targeted online actions can resolve, a regime shared by offline RL, offline black-box optimization, and Bayesian optimization.
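
The regret-versus-information trade-off described above can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes independent Gaussian arms rather than the linear-reward model, uses a crude plug-in regret estimate, and picks a single arm deterministically. The names `posterior_var`, `info_gain`, and `ids_choice` are hypothetical, and the $\eta$-parameterised family is not reproduced.

```python
import math

def posterior_var(prior_var, noise_var, n_offline):
    """Posterior variance of a Gaussian arm mean after n_offline noisy pulls."""
    return 1.0 / (1.0 / prior_var + n_offline / noise_var)

def info_gain(post_var, noise_var):
    """Mutual information (in nats) between the arm mean and one more noisy pull."""
    return 0.5 * math.log(1.0 + post_var / noise_var)

def ids_choice(means, variances, noise_var=1.0):
    """Return the arm minimising (regret estimate)^2 / information gain.

    The regret estimate is a crude plug-in (an assumption of this sketch):
    the largest mean-plus-one-standard-deviation value minus the arm's mean.
    """
    optimistic_best = max(m + math.sqrt(v) for m, v in zip(means, variances))
    scores = []
    for m, v in zip(means, variances):
        gap = optimistic_best - m
        gain = info_gain(v, noise_var)
        scores.append(gap * gap / gain if gain > 0 else math.inf)
    return min(range(len(means)), key=scores.__getitem__)

# Arm 0 is well covered offline (99 pulls); arm 1 has no offline data.
variances = [posterior_var(1.0, 1.0, 99), posterior_var(1.0, 1.0, 0)]
chosen = ids_choice([1.0, 0.9], variances)
```

In this toy setting a greedy rule would pull arm 0 (higher posterior mean), whereas the ratio rule prefers arm 1 because the offline data left most of the residual uncertainty there.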

2605.29402 2026-05-29 cs.CV cs.AI

Semantic and Visual Evidence for Efficient Long-Video Reasoning: A Solution for the HD-EPIC VQA Challenge

Yinsong Xu, Wei Jing, Liuxin Zhang, Wanjun Lv, Hui Li

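AI summary: Proposes a unified framework that decouples long-video reasoning into semantic evidence (coarse-to-fine extraction of global procedural structure) and visual evidence (object-centric fine-grained grounding), with query-conditioned evidence retrieval and integration, achieving competitive performance in the HD-EPIC VQA Challenge.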

Abstract

Understanding long-form egocentric videos remains challenging for multimodal large language models (MLLMs) due to limited context length and insufficient grounding of fine-grained visual details. The recently proposed HD-EPIC benchmark highlights these limitations: even strong long-context models achieve relatively low performance across diverse video question answering tasks. In this paper, we propose a unified framework that decouples long-video reasoning into two complementary forms of evidence: semantic evidence and visual evidence. Semantic evidence captures global procedural structure through a coarse-to-fine extraction pipeline, while object-centric visual evidence preserves fine-grained grounding through bounding boxes and visual embeddings. During inference, we formulate reasoning as a query-conditioned evidence retrieval and integration process, dynamically selecting relevant information from both sources. Our approach achieves competitive performance in the HD-EPIC-VQA Challenge across multiple task categories. More broadly, our results demonstrate that explicitly structuring, retrieving, and integrating semantic and visual evidence is critical for effective long-video understanding with MLLMs.
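
As a rough picture of what query-conditioned retrieval from two evidence sources might look like, the sketch below ranks entries in a semantic store and a visual store by cosine similarity to a query embedding and keeps the top entries from each. The store layout, the toy embeddings, and the function names are assumptions made for illustration; the abstract does not specify the actual retrieval mechanism.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve(query_vec, semantic, visual, k=1):
    """Return the top-k entries from each store, tagged with their source."""
    picked = []
    for source, store in (("semantic", semantic), ("visual", visual)):
        ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        picked.extend((source, text) for text, _ in ranked[:k])
    return picked

# Hypothetical evidence entries: (description, embedding).
semantic = [("step: chop onion", [1.0, 0.0]), ("step: wash pan", [0.0, 1.0])]
visual = [("bbox: knife", [0.9, 0.1]), ("bbox: sponge", [0.1, 0.9])]
evidence = retrieve([1.0, 0.0], semantic, visual)
```

The selected entries would then be passed to the MLLM together with the question; how the two kinds of evidence are integrated is left open here.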

2605.29401 2026-05-29 cs.LG

Aligned but Fragile: Enhancing LLM Safety Robustness via Zeroth-Order Optimization

Zhihao Liu, Yifan Wu, Jian Lou, Di Wang, Yuxi Zhou, Yuke Hu

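AI summary: To address the finding that safety alignment in large language models is easily weakened by lightweight post-alignment operations (such as parameter noise, activation noise, or quantization), proposes a hybrid framework based on zeroth-order optimization that performs standard first-order safety alignment followed by zeroth-order refinement to improve robustness, and uses the perturbation-based evaluations to estimate layer-wise robustness sensitivity so that updates focus efficiently on critical layers.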

Abstract

Safety alignment for large language models (LLMs) aims to reduce harmful or unsafe behavior while preserving general utility. However, recent findings reveal that alignment effects can be fragile: lightweight post-alignment manipulations, such as parameter noise, activation noise, or quantization, can easily weaken the intended safety behavior. Prior efforts to improve robustness have primarily focused on data curation, modified alignment objectives, and safety-critical parameter identification, leaving the role of the optimizer itself largely unexplored. In this paper, we are the first to study the robustness of safety alignment from the perspective of the base optimizer. This optimizer-centric view naturally points to zeroth-order optimization, which provides a robustness-oriented signal by evaluating safety alignment under perturbations. Based on this insight, we propose a hybrid framework that first performs standard first-order safety alignment and then applies zeroth-order refinement to improve robustness. Both theoretically and empirically, we show that only a few zeroth-order refinement steps can enhance robustness while preserving safety alignment. We further improve the efficiency of zeroth-order refinement by exploiting its inherent perturbation-based evaluations to estimate layer-wise robustness sensitivity, enabling the refinement process to concentrate updates on robustness-critical layers with modest training overhead.
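
To make the idea of a zeroth-order refinement step concrete, here is a minimal sketch of the standard two-point estimator, which evaluates the loss at two perturbed parameter settings and never calls a gradient. The quadratic `toy_loss` merely stands in for a safety objective, and the step size, smoothing radius, and function names are assumptions rather than the paper's settings.

```python
import math
import random

def zo_gradient(loss, theta, mu, rng):
    """Two-point zeroth-order gradient estimate along one random unit direction."""
    u = [rng.gauss(0.0, 1.0) for _ in theta]
    norm = math.sqrt(sum(x * x for x in u))
    u = [x / norm for x in u]
    plus = loss([t + mu * d for t, d in zip(theta, u)])
    minus = loss([t - mu * d for t, d in zip(theta, u)])
    slope = (plus - minus) / (2.0 * mu)  # directional derivative estimate
    return [slope * d for d in u]

def zo_refine(loss, theta, steps=100, lr=0.1, mu=1e-3, seed=0):
    """Run a few zeroth-order descent steps; only loss evaluations are used."""
    rng = random.Random(seed)
    for _ in range(steps):
        grad = zo_gradient(loss, theta, mu, rng)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def toy_loss(theta):
    """Stand-in objective (a quadratic), not a safety loss."""
    return sum(t * t for t in theta)

start = [1.0, -2.0, 0.5]
refined = zo_refine(toy_loss, start)
```

Because each step evaluates the objective at perturbed parameters, the same evaluations probe how the loss behaves under parameter noise, which is the robustness-oriented signal the abstract refers to.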

2605.29394 2026-05-29 cs.AI

EvoMD-LLM: Learning the Language of Species Evolution in Reactive Molecular Dynamics

Zhichen Tang, Zhengzheng Dang, Yulin Chen, Jixin Wu, Haiwen Li, Yanming Wang

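AI summary: Proposes the EvoMD-LLM framework, which discretizes reactive molecular dynamics trajectories into symbolic temporal sequences and uses a temporal scaffolding mechanism so that autoregressive large language models learn the evolution of species composition; it outperforms baseline models on multiple temporal prediction tasks and can generate interpretable predictions.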

Comments 17 pages, ACL Findings

Abstract

While large language models (LLMs) excel at static scientific reasoning, they struggle to model the temporal structure of dynamic physical processes. We present EvoMD-LLM (Evolutionary Molecular Dynamics Large Language Model), a framework that reformulates species-level molecular dynamics as a symbolic temporal language modeling problem. Reactive MD trajectories are discretized into sequences of molecular events, where each token represents a chemical species augmented with its persistence duration, enabling standard autoregressive LLMs to learn compositional evolution over time through efficient fine-tuning. A key component of EvoMD-LLM is temporal scaffolding, which treats event duration as an explicit linguistic token and serves as a structured inductive bias, significantly reducing invalid or hallucinated molecular outputs compared to conventional sequence modeling approaches. We evaluate EvoMD-LLM on multiple temporal prediction tasks, achieving up to 66.14% accuracy and consistently outperforming sequential neural networks and language-based baselines. Beyond quantitative improvements, we qualitatively observe that the model is capable of generating interpretations for its own predictions by incorporating relevant chemical knowledge, even though it was not explicitly supervised with paired trajectory-explanation data. These results demonstrate that symbolic temporal language modeling provides an effective framework for grounding LLMs in dynamic physical simulations.
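
One simple way to picture the discretization step is run-length encoding: consecutive frames with the same species label collapse into a single event carrying its persistence duration. The sketch below assumes per-frame species labels and a hypothetical `species|duration` token format; the actual tokenization used by EvoMD-LLM is not given in the abstract.

```python
from itertools import groupby

def to_event_tokens(frames, dt=1):
    """Collapse runs of identical species labels into '<species>|<duration>' tokens.

    frames: per-frame species labels in time order.
    dt: time covered by one frame (arbitrary units, assumed constant).
    """
    return [f"{species}|{sum(1 for _ in run) * dt}" for species, run in groupby(frames)]

# Hypothetical per-frame labels for one tracked species.
tokens = to_event_tokens(["H2O", "H2O", "OH", "OH", "OH", "H2O"])
```

Keeping the duration as an explicit part of each token is one way to expose event timing to an autoregressive model, in the spirit of the temporal scaffolding described above.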

2605.29390 2026-05-29 cs.CV

Orthogonal Negative Guidance in Attention Feature Space for Text-to-Image Generation

Jungmin Ko, Jungwon Park, Jimyeong Kim, Changin Choi, Wonseok Lee, Wonjong Rhee

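AI summary: Proposes an orthogonal negative guidance method in attention feature space that orthogonalizes negative-prompt attention features against positive-prompt features and subtracts only the orthogonal component, suppressing unwanted concepts without training while preserving image quality and prompt alignment.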

Comments Preprint

Abstract

Text-to-image (T2I) models have become increasingly capable of generating high-quality images. Yet, enforcing the explicit absence of a specified object or attribute remains a fundamentally challenging problem. Existing approaches, including prompt negation, post-hoc editing, and negative guidance, remain insufficient for explicit concept suppression, often failing to remove the target concept or degrading overall image quality. To this end, we propose Orthogonal Negative Guidance in attention feature space, a training-free method that operates in the attention output space of MM-DiT-based T2I transformers. Our method orthogonalizes negative-prompt attention features with respect to positive-prompt features and subtracts only the orthogonal component, suppressing unwanted concepts while preserving desired semantics. Experiments on FLUX-dev and FLUX-schnell show that our method achieves favorable trade-offs between concept suppression, prompt alignment, and image quality. In human evaluation, our method outperforms the second-best baseline by 18.78%. We further show that our method supports multi-concept suppression and adjustable concept suppression.
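
The orthogonalize-then-subtract step can be written in a few lines of vector arithmetic. The sketch below works on plain Python lists for a single feature vector; applying it to the positive features, the `scale` parameter, and the function names are assumptions, since the abstract does not say where or with what weight the orthogonal component is subtracted.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthogonal_negative_guidance(pos, neg, scale=1.0):
    """Subtract from pos only the part of neg that is orthogonal to pos.

    pos must be non-zero; scale is a hypothetical guidance strength.
    """
    coeff = dot(neg, pos) / dot(pos, pos)  # projection coefficient of neg onto pos
    neg_orth = [n - coeff * p for n, p in zip(neg, pos)]
    return [p - scale * o for p, o in zip(pos, neg_orth)]

guided = orthogonal_negative_guidance([1.0, 0.0], [1.0, 1.0], scale=0.5)
```

Because the subtracted vector has no component along `pos`, the projection of the result onto the positive direction is unchanged, which is the intuition behind suppressing the unwanted concept while preserving the desired semantics.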

2605.29387 2026-05-29 cs.LG cs.AI stat.ML

MiraBench: Evaluating Action-Conditioned Reliability in Robotic World Models

Tianzhuo Yang, Zihan Shen, Zirui Mi, Zhaoyi Zhang, Jiayi Zhou, Jiaming Ji, Juntao Dai, Jiawei Chen, Boyuan Chen, Yaodong Yang

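AI summary: Proposes the MiraBench benchmark, which evaluates the action-conditioned reliability of robotic world models at three levels (physics adherence, action-following fidelity, and optimism bias detection), and finds that visual fidelity does not reflect action fidelity, that scaling up models does not guarantee better action following, and that optimism bias is pervasive.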

Abstract

Action-conditioned world models are increasingly used as scalable simulators for robot learning, yet current evaluations provide limited evidence that their predictions are reliable under the actions they condition on. Existing benchmarks largely emphasize visual fidelity, leaving unclear whether predicted futures are physically plausible, faithful to commanded actions, and calibrated to failure when actions should not succeed. We introduce MiraBench, a hierarchical benchmark that defines action-conditioned reliability as a core evaluation target for robotic world models. MiraBench decomposes this target into three progressively demanding levels: Physics Adherence, which evaluates reference-free physical consistency; Action-Following Fidelity, which measures whether predictions respect task-relevant action inputs; and Optimism Bias Detection, which probes the tendency to predict successful outcomes under failure-inducing actions. To support this evaluation, we curate a human-annotated corpus with over 16,000 judgments across tasks, failure categories, and leading world models. We evaluate 12 representative model configurations spanning vector-conditioned robotic world models, text-conditioned generative world models, open-weight systems, closed-source systems, and multiple model scales. Across this broad model landscape, MiraBench reveals three central findings: visual fidelity is a poor proxy for action fidelity; increasing model scale does not reliably improve action following; and optimism bias is pervasive across current systems. By shifting evaluation from appearance to action-conditioned reliability, MiraBench provides a diagnostic foundation for assessing and improving robotic world models as faithful simulators.
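
One plausible way to turn the optimism-bias probe into a number is to count how often a model predicts success when the commanded action should fail. The sketch below assumes each human judgment reduces to a pair of booleans; the function name and this particular rate are illustrative and are not MiraBench's published metric.

```python
def optimism_bias_rate(judgments):
    """Fraction of failure-inducing rollouts where the model predicted success.

    judgments: iterable of (action_should_fail, predicted_success) booleans.
    Returns 0.0 when no failure-inducing rollouts are present.
    """
    failing = [predicted for should_fail, predicted in judgments if should_fail]
    if not failing:
        return 0.0
    return sum(failing) / len(failing)

# Hypothetical judgments: three failure-inducing actions, two predicted as successes.
rate = optimism_bias_rate([(True, True), (True, False), (False, True), (True, True)])
```

A rate near zero would indicate predictions that are calibrated to failure, while a rate near one would indicate the pervasive optimism the abstract reports.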

2605.29358 2026-05-29 cs.AI

A Study of Question-Answering Datasets for LLM Safety Evaluation: Focusing on Illegal Activities

Kenji Imamura, Masao Ideuchi, Atsushi Fujita

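AI summary: This paper manually analyzes the AnswerCarefully dataset and proposes additional information, a method for creating question-answer examples, and evaluation guidelines for assessing LLM safety with respect to illegal activities.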

Comments 10 pages, 1 figure

2605.29339 2026-05-29 cs.CV

DMC-CF: Dynamic Multimodal CounterFactual QA benchmark for Causal Reasoning

Junzhe Zhang, Huixuan Zhang, Guirong Wang, Xingyao Zhang, Pei Liu, Lin Qu, Hu Wei, Xiaojun Wan

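AI summary: To address existing causal reasoning datasets being limited in scale or built from non-realistic data, proposes DMC-CF-Static, a large-scale multimodal causal counterfactual reasoning benchmark based on real-world videos, and uses a Dynamic Graph Intervention framework to build the dynamic evaluation benchmark DMC-CF-Dynamic; experiments show that the causal reasoning capabilities of current multimodal large models in real-world scenarios still require substantial improvement.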

Abstract

With the rapid advancement of multimodal large language models (MLLMs), models have demonstrated increasingly powerful multimodal capabilities. However, whether MLLMs trained through statistical learning can truly understand the causal relationships underlying the real world remains a key research question. In recent years, numerous multimodal causal reasoning datasets have been proposed. Nevertheless, these datasets are either limited in scale or constructed from synthetic images and videos, cartoon-based content, or other non-realistic multimodal sources. To address these limitations, we collect real-world videos and construct DMC-CF-Static, a large-scale benchmark for multimodal causal counterfactual reasoning. Furthermore, to mitigate issues such as data contamination in traditional static evaluation, we represent causal events using causal graphs and propose the Dynamic Graph Intervention (DGI) framework to build the dynamic evaluation benchmark DMC-CF-Dynamic from DMC-CF-Static. Experimental results on the overall DMC-CF, which includes both static and dynamic evaluation benchmarks, demonstrate that the multimodal causal reasoning capabilities of current multimodal large language models in real-world scenarios still require substantial improvement.
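
To illustrate how a causal graph could support counterfactual questions, the sketch below stores events as a directed graph and collects every event downstream of an intervened node, that is, the events whose outcome may change under the intervention. The graph encoding, the example events, and the function name are assumptions; the DGI framework's actual intervention rules are not described in the abstract.

```python
from collections import deque

def affected_events(graph, intervened):
    """Events reachable from the intervened node via cause -> effect edges."""
    seen, queue = set(), deque([intervened])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Hypothetical causal graph: each key causes the events in its list.
graph = {"rain": ["wet road"], "wet road": ["car skids"], "horn": ["dog barks"]}
changed = affected_events(graph, "rain")
```

A dynamic benchmark could, under this reading, generate fresh counterfactual questions by varying which node is intervened on, so that answers cannot simply be memorized from a static test set.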
