arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.07988 2026-06-09 cs.AI New submission

PAFO: Pareto Fairness Optimization for Personalized Reward Modeling

Xiaoyan Zhao, Haoting Ni, Yang Zhang, Chunyuan Zheng, Haoxuan Li, Fuli Feng

Affiliations: National University of Singapore; University of Science and Technology of China; Peking University

AI summary: Personalized reward models can be biased against minority user groups because their training preference data is imbalanced. The paper proposes PAFO, a framework that uses Pareto fairness optimization to improve performance for under-served groups without harming other groups; experiments show it raises both minority-group and majority-group accuracy while reducing unfairness.

Abstract

Large language models (LLMs) increasingly rely on reward models to align their outputs with diverse user preferences. While personalized reward models aim to capture such heterogeneity, they are often trained on imbalanced user preference data and may therefore favor users whose preferences are more common in the training population. In this paper, we identify this failure mode as personalized reward bias, where reward modeling quality varies systematically with preference support rate. We formulate its mitigation as a Pareto fairness problem over group utilities, aiming to improve under-served users without degrading other user groups. To this end, we propose PAFO, a Pareto fairness optimization framework for personalized reward modeling. PAFO first trains group-specialized reward models for majority and minority preference groups, then constructs conditional margin-level supervision to distill their heterogeneous preference boundaries into a single unified model. The resulting model uses group information only during training and requires no explicit group labels at inference time. Experiments on Personal-LLM and DSP show that PAFO improves both minority-group and majority-group accuracy while reducing user-level unfairness across multiple metrics, demonstrating its effectiveness for fairer LLM personalization.
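
The Pareto criterion in this abstract (improve under-served groups without degrading any other group) can be illustrated with a small check over per-group utilities. This is only a sketch of the general definition of a Pareto improvement, not PAFO's training objective; the function name `pareto_improves` and the accuracy numbers are hypothetical and do not come from the paper.

```python
from typing import Dict


def pareto_improves(baseline: Dict[str, float],
                    candidate: Dict[str, float],
                    tol: float = 0.0) -> bool:
    """Return True if `candidate` is a Pareto improvement over `baseline`.

    No group's utility may drop by more than `tol`, and at least one
    group's utility must rise by more than `tol`.
    """
    no_group_worse = all(candidate[g] >= baseline[g] - tol for g in baseline)
    some_group_better = any(candidate[g] > baseline[g] + tol for g in baseline)
    return no_group_worse and some_group_better


# Hypothetical per-group accuracies, for illustration only.
baseline = {"majority": 0.80, "minority": 0.60}
fair_model = {"majority": 0.81, "minority": 0.70}    # both groups gain
levelled_down = {"majority": 0.75, "minority": 0.75}  # gap closed by hurting the majority

print(pareto_improves(baseline, fair_model))
print(pareto_improves(baseline, levelled_down))
```

The second case shows why the criterion matters: closing the gap by lowering majority accuracy reduces disparity but is not a Pareto improvement.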

2606.07985 2026-06-09 cs.CV cs.CL New submission

FMRFusion: Frequency-Aware Multi-View Representation Learning for Heterogeneous Image Fusion

Tao Zhou, Yunlong Liu, Qinghui Chen, Zekai Zhang, Minlong Sun, Changlin Bian, Dagang Li, Wenmin Wang, Jinglin Zhang

Affiliations: Shandong University; Macau University of Science and Technology

AI summary: Proposes the FMRFusion network, which combines a multi-scale structural perception module, bilinear frequency decomposition, and cross-view complementary interaction with flow-matching refinement to fuse infrared and visible images, performing especially well in nighttime scenes.

Abstract

Infrared and visible image fusion aims to generate a composite image that retains significant target information and preserves detailed textures, integrating two heterogeneous modalities. Previous image fusion methods typically adopt a single-module stacking approach to extract features from the two modalities. However, these approaches may result in incomplete learning of their distinct characteristics, thereby limiting fusion effectiveness and constraining robustness in real-world heterogeneous data scenarios. To address these challenges, we propose FMRFusion, a frequency-aware multi-view representation learning network for heterogeneous image fusion. A Multi-Scale Structural Perception Module is introduced to effectively capture discriminative structures, extracting fine-grained local structures and essential contextual information. A bilinear frequency decomposition mechanism is employed to separate features into high-frequency and low-frequency components, enabling joint modeling of local details and global representations across different frequency domains. Moreover, a Cross-View Complementary Interaction is incorporated to explicitly model and fuse the complementary characteristics between reflected light information and radiative intensity responses, facilitating effective cross-view interaction. We further improve the performance of the fused results by flow matching, which progressively refines the fused features by learning the transformation from coarse data to high-quality representations. Extensive experiments conducted on multiple benchmark datasets demonstrate that FMRFusion achieves superior and consistent performance across a range of fusion tasks, especially in nighttime scenarios.

2606.07982 2026-06-09 cs.LG New submission

Overcoming the Limits of Finite Difference Method; Physics-Informed Neural Network for Noisy High-Dimensional Heat Diffusion

Shreesh Bhattarai, Harish Chandra Bhandari

Affiliations: Kathmandu University

AI summary: For high-dimensional heat diffusion with noise, proposes a physics-informed neural network (PINN) framework that significantly outperforms the finite difference method (FDM) when both noise and dimensionality are high, offering a trade-off between accuracy and efficiency.

Abstract

High-dimensional transient heat diffusion under noisy boundary conditions exposes a fundamental limitation of classical numerical methods: accuracy degrades catastrophically where physical noise is unavoidable. This paper presents a Physics-Informed Neural Network (PINN) framework as a systematic solution to this problem across one, two, and three spatial dimensions, establishing clear operational regimes that redefine solver selection in noisy thermal systems. Under 20% boundary noise in 3D, PINN sustains approximately 91% accuracy while Finite Difference Method (FDM) collapses to 36%, a clear decisive advantage. This is further confirmed in a physical copper thermal system, where PINN reduces boundary reconstruction error by 3.3 times under realistic noise conditions. This noise resilience is accompanied by a dimensionality-driven efficiency crossover: PINN requires fewer spacetime nodes than FDM in 3D while achieving superior accuracy, exposing the true cost of classical discretization at scale. These findings reframe solver selection: the decisive axis is not accuracy alone, but noise exposure and dimensionality jointly. When noise and dimensionality are both high, the classical solver paradigm is insufficient; this work provides the foundation to justify PINN as the operational standard in such regimes.
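
For readers unfamiliar with the classical baseline, the sketch below shows an explicit finite-difference (FTCS) solver for the 1D heat equation with optional noise on the Dirichlet boundaries. It illustrates only the FDM side of the comparison and does not implement a PINN. The noise model (zero-mean Gaussian added to the boundary values at every step), the grid size, and the function name `ftcs_heat_1d` are assumptions made for illustration; the abstract does not say how the paper injects its "20% boundary noise".

```python
import math
import random


def ftcs_heat_1d(nx=21, alpha=1.0, r=0.4, steps=100, noise=0.0, seed=0):
    """Explicit FTCS scheme for u_t = alpha * u_xx on x in [0, 1].

    r = alpha * dt / dx**2 must satisfy r <= 0.5 for stability.
    Boundaries are held at 0 plus optional Gaussian noise (assumed model).
    Returns the final profile and the simulated time.
    """
    assert r <= 0.5, "FTCS is unstable for r > 0.5"
    rng = random.Random(seed)
    dx = 1.0 / (nx - 1)
    dt = r * dx * dx / alpha
    # Initial condition sin(pi x); exact solution is exp(-pi^2 alpha t) sin(pi x).
    u = [math.sin(math.pi * i * dx) for i in range(nx)]
    for _ in range(steps):
        new = u[:]
        for i in range(1, nx - 1):
            new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        new[0] = noise * rng.gauss(0.0, 1.0)
        new[-1] = noise * rng.gauss(0.0, 1.0)
        u = new
    return u, steps * dt


clean, t_end = ftcs_heat_1d(noise=0.0)
noisy, _ = ftcs_heat_1d(noise=0.2)
exact_mid = math.exp(-math.pi ** 2 * t_end)  # exact value at x = 0.5
mid = len(clean) // 2
print(abs(clean[mid] - exact_mid), abs(noisy[mid] - exact_mid))
```

With clean boundaries the scheme tracks the analytic solution closely; the noisy run shows how boundary perturbations propagate into the interior because the stencil copies boundary values directly into neighbouring nodes.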

2606.07978 2026-06-09 cs.CL New submission

MechLens: Late Crystallization of Factual Knowledge Explains Intervention Effectiveness in Language Models

Xueping Gao

Affiliations: Alibaba Cloud

AI summary: Finds that factual knowledge in LLMs "crystallizes" abruptly at the final layers rather than emerging layer by layer, and on that basis proposes a crystallization-guided intervention principle that outperforms existing methods.

Abstract

Understanding where LLMs store factual knowledge is critical for hallucination mitigation. We systematically quantify Late Crystallization: factual knowledge does not gradually emerge across layers but "crystallizes" abruptly at the final layers. Across five model families (Pythia, Gemma, Qwen2.5, Llama-3.1, Mistral; 0.5–14B), 26.8%–93.4% of correct answers never enter top-10 predictions at any intermediate layer, with late emergence (>80% depth) consistent across architectures. Cross-scale (Qwen2.5-14B) and cross-benchmark (MMLU: 98.2%) results confirm generality; tuned lens rules out probe artifacts. A sentiment-classification control (0.5% for Qwen vs. 85.9% factual; 2.0% for Mistral vs. 26.8%) confirms the phenomenon is specific to factual recall. Late Crystallization yields a crystallization-guided intervention principle: CAA outperforms DoLa on moderate-crystallization models (Llama, Mistral; p<0.001), with a directionally consistent reversal on high-crystallization Qwen (+25.4% vs. +15.5% MC1, p=0.069). LayerNorm ablation shows crystallization is intrinsic to the residual stream; LN scaling (x1.2) yields +11.8% MC1 with zero inference overhead. We further reveal a Computability-Memorization Spectrum: computable knowledge crystallizes earlier (layer 22.1/28) than memorized facts (28.0/28). We release MechLens supporting five model families.

2606.07974 2026-06-09 cs.RO cs.AI New submission

PRISM: PRior-guided Imagination Sampling in world Models

Yuhai Wang, Jiawei Xia, Rongxuan Zhou, Xiao Hu, Yongliang Shi, Jing Du, Yang Ye

Affiliations: Northeastern University; University of California, Berkeley; Qiyuan Lab; University of Florida

AI summary: Proposes the PRISM framework, which extracts a state-conditioned Gaussian prior from the world model's encoder and uses a precision-weighted Product-of-Gaussians update on the planner's sampling distribution, substantially improving model-based continuous control without added architectural complexity.

Abstract

A learned world model provides a powerful physical intuition for evaluating future states. But its effectiveness in continuous control also depends critically on how candidate actions are generated for model-based planning. Rather than solely asking how accurately a model can simulate the future, we ask: which candidate actions are worth evaluating in the first place? Existing planners typically search arbitrarily or use expert demonstrations only to initialize a sampling mean, discarding the expert's state-conditioned confidence. Properly guiding this search requires a robust action prior, yet current approaches often rely on independent visual encoders or large-scale VLMs to obtain one. We argue that this architectural bloat is unnecessary: the exact same data - and the learned representations of the world model itself - inherently encode the agent's action intuition. We introduce PRISM, a task-agnostic framework that extracts both from a single dataset while maintaining strict architectural simplicity. Building on a standard JEPA-style latent world model, PRISM attaches a lightweight MLP directly to its frozen encoder to predict a state-conditioned Gaussian prior. At plan time, PRISM fuses this prior into the planner's sampling distribution via a precision-weighted Product-of-Gaussians update. This parameter-free, closed-form integration steers the sampling process, making the prior confident where it is and ceding control where it is not. PRISM improves success rates by 35 percentage points over vanilla world-model-based MPC on Cube and 32 percentage points on PushT, without introducing significant inference overhead.
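
The precision-weighted Product-of-Gaussians update named in the abstract has a standard closed form for diagonal Gaussians: precisions (inverse variances) add, and the fused mean is the precision-weighted average of the two means. The sketch below shows that textbook formula only; the function name `fuse_gaussians` and the numbers are hypothetical, and the paper may apply additional details that the abstract does not state.

```python
from typing import List, Tuple


def fuse_gaussians(mu_p: List[float], var_p: List[float],
                   mu_q: List[float], var_q: List[float]
                   ) -> Tuple[List[float], List[float]]:
    """Product of two diagonal Gaussians, computed per action dimension.

    precision = 1/var_p + 1/var_q
    mean      = (mu_p/var_p + mu_q/var_q) / precision
    """
    fused_mu, fused_var = [], []
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        prec_p, prec_q = 1.0 / vp, 1.0 / vq
        prec = prec_p + prec_q
        fused_mu.append((prec_p * mp + prec_q * mq) / prec)
        fused_var.append(1.0 / prec)
    return fused_mu, fused_var


# Hypothetical 2-D action: the prior is confident in dim 0, vague in dim 1.
prior_mu, prior_var = [1.0, 0.0], [0.01, 100.0]
plan_mu, plan_var = [0.0, 0.5], [1.0, 1.0]   # planner's current sampling distribution

mu, var = fuse_gaussians(prior_mu, prior_var, plan_mu, plan_var)
print(mu, var)
```

In dimension 0 the fused mean sits almost exactly on the prior mean, while in dimension 1 it stays near the planner's mean, which matches the abstract's description of the prior dominating where it is confident and ceding control where it is not.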

2606.07970 2026-06-09 cs.CL cs.AI New submission

Defending Against Malicious Finetuning by Scaling Train-time Adversarial Attacks

Haoming Wen, Shi Chen, Qingyu Shi, Siyuan Liu, Minrui Luo, Jingzhao Zhang, Tianxing He

Affiliations: Xiongan AI Institute; Institute for Interdisciplinary Information Sciences, Tsinghua University; Shanghai Qi Zhi Institute

AI summary: To counter the safety threat of full-parameter finetuning, proposes Patcher, a method based on adversarial training and bi-level optimization that strengthens the defense by scaling up the optimization steps in the adversarial loop, along with a parallel algorithm that improves efficiency.

Abstract

Current open-weight large language models (LLMs) are prone to malicious finetuning attacks, which could compromise the safety alignment of LLMs with only a few steps of supervised finetuning (SFT) on poisoned datasets. Existing alignment-stage defenses are primarily designed to defend against attacks that use parameter-efficient finetuning methods. However, they fail to defend against stronger attacks that use full-parameter finetuning. In this paper, we propose Patcher, a method inspired by adversarial training and bi-level optimization, to combat such attacks. Patcher strengthens the simulated attack by scaling up the optimization steps in the adversarial loop, thus forcing the defender to find model parameters that are insensitive to stronger attacks. Furthermore, we propose an efficient parallel algorithm to implement Patcher, decreasing the wall-clock time of training while preserving Patcher's performance. Extensive experiments show that Patcher substantially improves the model's robustness compared to vanilla SFT alignment, and transfers to diverse attack scenarios and model sizes. Code is available at https://github.com/haomingwen/patcher.

2606.07969 2026-06-09 cs.CL cs.AI New submission

Neutrality Bites: Gender Representation in AI-Generated Animal Stories

Imani Finkley, Yuanxi Li, Melanie Walsh

Affiliations: University of Washington

AI summary: Studies how six mainstream LLMs assign gender when generating animal stories. The models often avoid specifying gender or use neutral language, but when gender is specified it skews heavily masculine and feminine characters are almost absent, suggesting that a neutrality strategy can erase marginalized perspectives.

Comments: FAccT (ACM Conference on Fairness, Accountability, and Transparency) 2026

Abstract

Gender bias in AI-generated stories is a well-documented problem. While much attention has been paid to reducing or mitigating this bias, it is not always clear whether interventions produce genuinely fairer results. To investigate this issue, we examine how large language models (LLMs) handle gender assignment in a narrative context that is popular, highly ambiguous, and also known to closely reproduce human stereotypes: stories about talking animals. We prompt six leading LLMs to complete an English-language story about seven different anthropomorphic animal characters whose gender is unstated. We additionally iterate with four different narrative settings and a range of model temperatures. Across the 23.8K stories, we find that models frequently avoid gendering the animal character in the story (19% on average) or use gender-neutral language like "it" or "its" (38.2% on average). However, when gender is assigned, there is a significant masculine bias. Feminine animal characters are virtually absent, present in just 2.2% of stories vs. 40.6% that feature masculine characters. Our findings point to a broader argument: neutrality bites. In other words, models that prioritize neutrality to address social bias may actually contribute to the erasure of marginalized perspectives and identities. We suggest that alternative strategies beyond neutrality need to be pursued, such as ones that more equally distribute social possibilities across imagined subjects.

2606.07967 2026-06-09 cs.CV New submission

DisCo: World Models with Discrete Camera Motion Control

Hongrui Huang, Junke Wang, Quanhao Li, Yu-Gang Jiang, Zuxuan Wu

Affiliations: Fudan University

AI summary: Proposes DisCo, which conditions on discrete action primitives instead of continuous camera trajectories to address action representation entanglement in controllable video generation and make action following more reliable, and introduces the DisCoBench benchmark.

Abstract

Controllable video world models target interactive world exploration, where models must faithfully execute explicit action commands while preserving visual quality and temporal coherence. However, most existing approaches rely on continuous camera trajectories as action conditions, which often lead to unreliable action following, especially under complex motion sequences. In this work, we identify action representation entanglement as a key bottleneck in controllable video generation, and show that continuous camera representations lead to high feature similarity across distinct motion patterns, degrading action controllability. Based on this insight, we propose DisCo, a controllable video world model that conditions generation on a compact set of discrete action primitives to improve action separability. We further introduce DisCoBench, a comprehensive benchmark for evaluating the ability of models in short-term, long-horizon, and highly dynamic exploration scenarios. Extensive experiments demonstrate that DisCo achieves significantly more reliable action following while preserving visual quality.

2606.07965 2026-06-09 cs.AI New submission

Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline

Zekai Zhang, Qinghui Chen, Maomao Xiong, Shijiao Ding, Zhanzhi Su, Xinjie Yao, Yiming Sun, Cong Bai, Jinglin Zhang

Affiliations: Zhejiang University; Alibaba Group; Huawei Technologies; Tsinghua University; Microsoft Research

AI summary: To address the difficulty of applying vision-language models in industrial scenes and the scarcity of data, proposes the large-scale Multi-Modal Industrial Open Dataset (MMIO) and a Refined Text-Visual Prompt (RTVP) for zero-shot industrial defect detection, reaching SOTA on MMIO.

Abstract

Large Visual Language Models (LVLMs) have achieved remarkable success in vision tasks. However, the significant differences between industrial and natural scenes make applying LVLMs challenging. Existing LVLMs rely on user-provided prompts to segment objects, which often leads to suboptimal performance due to the inclusion of irrelevant pixels. In addition, the scarcity of data has left the application of LVLMs in industrial scenarios unexplored. To fill this gap, this paper proposes an open industrial dataset and a Refined Text-Visual Prompt (RTVP) for zero-shot industrial defect detection. First, this paper constructs the Multi-Modal Industrial Open Dataset (MMIO) containing 80K+ samples. MMIO contains diverse industrial categories, including 6 super categories and 18 subcategories. MMIO is the first large-scale multi-scene pre-training dataset for industrial zero-shot learning, and provides valuable training data for open models in future industrial scenarios. Based on MMIO, this paper provides an RTVP specifically for industrial zero-shot tasks. RTVP has two significant advantages: First, this paper designs an expert-guided large model domain adaptation mechanism and an industrial zero-shot method based on Mobile-SAM, which enhances the generalization ability of large models in industrial scenarios. Second, RTVP automatically generates visual prompts directly from images and considers text-visual prompt interactions ignored by previous LVLMs, improving visual and textual content understanding. RTVP achieves SOTA with 42.2% and 24.7% AP in zero-shot and closed scenes of MMIO.

2606.07964 2026-06-09 cs.CL New submission

What Does Debiasing Really Remove? A Geometric Study of PCA-Based Gender Debiasing in Word Embeddings

Alexey Kresin, Tchifou M. Dieffi, Tomer Caspi

Affiliations: Hood College; Ben-Gurion University of the Negev

AI summary: Uses geometric analysis to show that PCA debiasing mainly removes the direct gender bias in the first principal component, cannot eliminate associative bias spread across many dimensions, and damages the embedding geometry; bias is therefore not purely low-rank, and simple subspace removal is insufficient for comprehensive debiasing.

Comments: 8 pages, 4 figures. Source code available at https://github.com/AlexeyKresin/embedding-bias-geometry

Abstract

Debiasing methods based on principal component analysis (PCA) are broadly used to reduce gender bias in word embeddings used in LLMs, yet it remains unclear what aspects of bias they actually remove and how destructive this process is. These methods are based on the understanding that bias resides in a low-dimensional subspace, with the assumption that most of it can be captured by a few principal components. In this work, we conduct a systematic geometric analysis of PCA-based gender debiasing and investigate what is actually removed from the embedding space. Our experiments across multiple embeddings show that direct gender bias is primarily concentrated in the first principal component, supporting the low-rank bias hypothesis. However, associative bias measured by WEAT does not align with these principal directions and is instead spread across multiple embedding dimensions. Furthermore, as expected, we demonstrate that removing an increasing number of principal components leads to a consistent degradation of the embedding geometry, affecting semantic structure and vector relationships. These results reveal that PCA-based debiasing operates as a trade-off: while it effectively reduces certain forms of direct bias, it fails to eliminate distributed associations and introduces geometric distortion. Moreover, there is no universal optimal level of debiasing, as the balance between bias reduction and semantic preservation depends on the chosen metric and embedding. Overall, our findings suggest that bias in word embeddings is not purely low-rank and that simple subspace removal methods may be insufficient for comprehensive debiasing.
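
As a concrete picture of what "removing a principal component" means here, the sketch below estimates a gender direction as the top principal component of definitional pair differences and then projects it out of a word vector. It uses power iteration in plain Python on toy 3-dimensional vectors; the vectors, the word labels, and the helper names are hypothetical and are not taken from the paper or its repository. Because each pair is centered around its own mean, PCA over the centered pair members reduces to the second-moment matrix of the difference vectors, which is what the code uses.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def top_principal_direction(diffs, iters=100):
    """Power iteration on sum_d d d^T, without forming the matrix."""
    dim = len(diffs[0])
    v = normalize([1.0] * dim)
    for _ in range(iters):
        w = [0.0] * dim
        for d in diffs:
            coeff = dot(d, v)
            for i in range(dim):
                w[i] += coeff * d[i]
        v = normalize(w)
    return v


def remove_direction(vec, direction):
    """Project `vec` onto the subspace orthogonal to a unit `direction`."""
    coeff = dot(vec, direction)
    return [a - coeff * b for a, b in zip(vec, direction)]


# Toy difference vectors for definitional pairs (e.g. he - she, man - woman).
pair_diffs = [[2.0, 0.1, 0.0], [1.8, -0.1, 0.1]]
gender_dir = top_principal_direction(pair_diffs)

engineer = [0.6, 1.0, 0.3]          # hypothetical embedding of a neutral word
debiased = remove_direction(engineer, gender_dir)
print(dot(engineer, gender_dir), dot(debiased, gender_dir))
```

After projection the vector has no component along the estimated direction, which is the "direct bias" the abstract says is well captured by the first principal component. Associations encoded in the remaining coordinates are untouched, which is consistent with the paper's point that WEAT-style associative bias can survive this kind of subspace removal.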

2606.07963 2026-06-09 cs.AI cs.CL New submission

Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs

Omar Mahmoud, Aly M. Kassem, Thommen George Karimpanal, Buddhika Laknath Semage, Negar Rostamzadeh, Golnoosh Farnadi, Santu Rana

Affiliations: Deakin University; Mila, Quebec AI Institute

AI summary: Finds that multiple backdoor attacks on LLMs share a latent mechanism, detects the causal features with sparse autoencoders, and proposes bidirectional activation steering and Concept Ablation Fine-Tuning for unified detection and mitigation.

Abstract

Backdoor attacks in large language models (LLMs) are often treated as isolated trigger-response failures, motivating defenses tailored to specific triggers or behaviors. We show this view is incomplete. Across diverse backdoor behaviors, we identify a shared latent mechanism that can be detected, causally controlled, and suppressed. Using sparse autoencoders (SAEs) on residual-stream activations, we find a small set of latent features consistently activated across jailbreaking, refusal manipulation, password-locking, bias induction, sentiment misclassification, and country-conditioned harmful advice. These features generalize across Qwen3, Gemma 3, and Llama 3.1 models from 4B to 32B parameters, and across both fine-tuning and weight-editing attacks. Through bidirectional activation steering, we show these features are causal: suppressing them reduces attack success, while amplifying them induces target behaviors on clean prompts. We further train lightweight SAE-feature classifiers that generalize zero-shot to unseen backdoors and outperform residual-stream and weight-diffing baselines. Finally, we introduce Concept Ablation Fine-Tuning (CAFT), which suppresses backdoor formation by ablating the shared latent subspace during training. Together, our results suggest that many backdoors rely on a transferable latent mechanism, enabling unified detection and mitigation.

2606.07962 2026-06-09 cs.CV New submission

ChronoPhyBench: Do MLLMs Truly Understand the World or Merely Exploit Language Priors?

Bin Zhu, Yanhao Jia, Kexin Zhao, Jie Wang, Munan Ning, Hao Li, Yuwei Niu, Tanqing Sun, Huangchong Yan, Mingjun Pan, Xinyi Wu, Qishen Yin, Yunyang Ge, Shuai Zhao, Li Yuan

Affiliations: Peking University, Shenzhen Graduate School; Peng Cheng Laboratory; Rabbitpre Intelligence; Nanyang Technological University; Tsinghua University

AI summary: Proposes the ChronoPhyBench benchmark, which uses video context and textual captions to force models to reason about physical states, and shows that the physical reasoning ability of current open-source models is still in its infancy.

Abstract

Recent advancements in Multimodal Large Language Models (MLLMs) have demonstrated remarkable proficiency in open-world reasoning and understanding. However, a critical ambiguity persists: it remains unclear whether these models genuinely synthesize cross-modal information to construct physically grounded reasoning chains, or whether they merely exploit strong language priors to mask single-modality reliance, thereby hallucinating advanced multimodal capabilities. Motivated by this, and to rigorously mitigate language modality bias and shortcuts, we propose a novel multimodal Chronological Physical Dynamics Reasoning Benchmark, ChronoPhyBench, which unifies next-state prediction with Visual Question Answering (VQA) paradigms. By conditioning on historical video context and textual captions, it forces models to deduce subsequent physical states through both single-image selection and the inherently more complex task of multi-frame chronological sorting. Concurrently, we construct a large-scale multimodal reasoning dataset curated using the ChronoPhyBench criteria, comprising over 10,000 long-form videos paired with meticulously annotated captions, totaling 5M tokens. Our experimental evaluations reveal a stark contrast to conclusions drawn by previous benchmarks. The capacity of current open-source models to perform physically grounded multimodal reasoning remains in its infancy. Ultimately, this work seeks to systematically stress-test the reasoning capabilities of multimodal models, quantify hallucination rates, and advance the development of Physical AI, thereby providing the community with a robust and transparent evaluation framework toward Artificial General Intelligence (AGI).

2606.07953 2026-06-09 cs.AI New submission

Unification of Closed-Open Industrial Detection Scenarios: New Large-Scale Benchmarks, Challenges and Baselines

Zekai Zhang, Jinglin Zhang, Qinghui Chen, Gang Li, Da Chen, Shuainan Jing, He Wang, Dagang Li, Cong Liu, Cong Bai, Shengyong Chen

Affiliations: Shandong University; University Paris Dauphine, PSL Research University, CNRS, UMR 7534; Qilu University of Technology; Zhejiang University of Technology; Nova University of Lisbon; Macau University of Science and Technology; Tianjin University of Technology

AI summary: To address dataset scarcity and reliance on manual prompts in industrial defect detection, proposes the MMIOC-1M benchmark with over one million samples and the RTVPNet network, which achieves state-of-the-art performance through expert-assisted domain projection, energy-based sparse sampling, and bidirectional text-visual interaction.

Abstract

Large-scale Visual-Language Models (LVLMs) have achieved remarkable success in natural visual tasks, yet their application to industrial defect detection remains challenging due to two fundamental limitations: (i) the scarcity of large-scale industrial datasets that cover diverse defect categories across multiple domains, and (ii) the reliance on manual prompts (points, boxes, masks) that introduce subjective noise and lack text-visual interaction for fine-grained understanding. To address these challenges, we introduce a Large-Scale Multi-Modal Industrial Open-Closed benchmark (MMIOC-1M) containing over one million samples across 14 super-categories, 29 industrial scenes, and 351 defect subcategories. To our knowledge, MMIOC-1M is the first and largest unified benchmark supporting both open-vocabulary and closed-set industrial detection, providing valuable pre-training data for LVLMs in industrial scenarios. Furthermore, we propose a Refined Text-Visual Prompt Network (RTVPNet) that incorporates three key innovations: (1) an expert-assisted domain projection mechanism that enables rapid adaptation of general vision models to industrial domains, (2) an energy-based sparse sampling strategy that automatically generates refined visual prompts without manual intervention, and (3) a bidirectional text-visual interaction module that enhances cross-modal semantic alignment and understanding. Extensive experiments demonstrate that RTVPNet achieves state-of-the-art performance on MMIOC-1M, LVIS, and COCO benchmarks while maintaining computational efficiency. The dataset and code are available at https://github.com/hellozzk/MMIO.

2606.07951 2026-06-09 cs.CL cs.AI cs.LG New submission

From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

Catarina G Belem, Shang Wu, Hongyu Yao, Mark Steyvers, Sameer Singh, Padhraic Smyth

Affiliations: University of California Irvine; Massachusetts Institute of Technology

AI summary: Studies the systematic bias of language models toward increasing expressed certainty in rewriting tasks, proposes an evaluation metric grounded in population-level judgments, and finds that up to 75% of outputs show certainty distortion, with models more likely to raise certainty than to lower it.

Abstract

Humans increasingly turn to Language Models (LMs) in ways that shape beliefs and drive decisions, including discussing, rewriting, and summarizing information from scientific articles, news, and medical reports. However, in these domains, where how confidently a claim is expressed matters, little is known about whether LMs faithfully preserve it. In this work, we investigate certainty distortion in LMs, defined as meaningful changes in expressed certainty when semantic content is preserved. We propose an LM-based evaluation metric that is consistent with population-level judgments of certainty. Using this metric, we characterize certainty distortion across different sizes and families of models in the context of scientific and medical communication tasks. Our results show that certainty distortion affects up to 75% of LM outputs and is systematically asymmetric in rewriting tasks, with most LMs being 1.5-2× more likely to increase the expressed certainty than to decrease it. These effects can compound over repeated paraphrasing: in the medical domain, claude-haiku-4-5 increases the certainty of 20% of examples after a single iteration, rising to 40% after five iterations. Prompt-based interventions reduce overall certainty distortion but do not eliminate it. Together, these findings reveal a general bias toward inflating expressed certainty, with direct implications for users who rely on LMs in high-stakes domains.

2606.07938 2026-06-09 cs.CV cs.MM eess.IV 新提交

DAL-PCQA: Enabling Distortion-Level and Language-Driven Reasoning for Point Cloud Quality Assessment

DAL-PCQA:实现点云质量评估的失真级别与语言驱动推理

Swarna Chakraborty, Gabriel De Castro Araújo, Syeda Tasmi Faria, Marcelo M. Carvalho, Mylene C. Q. Farias

发表机构 * University of Brasília(巴西利亚大学)

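AI summary: Introduces the DAL-PCQA dataset, which adds multi-level distortion labels, quality categories, and natural language descriptions to benchmark point clouds, and compares zero-shot and fine-tuned multimodal models to support explainable point cloud quality assessment.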

详情
Comments
Accepted at QoMEX 2026
AI中文摘要

点云质量评估(PCQA)方法通常预测标量平均意见分数(MOS),量化整体感知退化但不揭示其原因。相比之下,人类观察者自然地以特定失真(如模糊、颜色偏移、点密度变化、缺失区域和几何变形)进行推理。为弥合这一差距,我们引入了DAL-PCQA,一个用于PCQA的失真感知、语言标注数据集。DAL-PCQA用多级失真严重性标签、离散质量类别和与人类感知对齐的结构化自然语言描述增强了基准点云。我们定义了一个涵盖光度学和几何学伪影的点云特定失真分类法。统计分析揭示了不同失真类型和质量级别的特征退化模式。为评估这些标注的实用性,我们比较了用于生成感知质量描述的零样本和微调多模态模型。实验表明,失真感知监督显著提高了与真实描述的词法和语义对齐。通过实现可解释的失真级别推理,DAL-PCQA促进了语言驱动的、可解释的点云质量评估。该数据集公开于https://github.com/swarna96/DAL-PCQA。

英文摘要

Point Cloud Quality Assessment (PCQA) methods typically predict scalar Mean Opinion Scores (MOS), which quantify overall perceptual degradation but do not reveal its causes. In contrast, human observers naturally reason in terms of specific distortions such as blur, color shifts, point density changes, missing regions, and geometric deformations. To close this gap, we introduce DAL-PCQA, a distortion-aware, language-annotated dataset for PCQA. DAL-PCQA augments benchmark point clouds with multi-level distortion severity labels, discrete quality categories, and structured natural language descriptions aligned with human perception. We define a point-cloud-specific distortion taxonomy that covers both photometric and geometric artifacts. Statistical analysis reveals characteristic degradation patterns across distortion types and quality levels. To assess the utility of these annotations, we compare zero-shot and fine-tuned multimodal models for generating perceptual quality descriptions. Experiments show that distortion-aware supervision substantially improves lexical and semantic alignment with ground-truth descriptions. By enabling interpretable, distortion-level reasoning, DAL-PCQA facilitates language-driven, explainable point cloud quality assessment. The dataset is publicly available at https://github.com/swarna96/DAL-PCQA.

2606.07935 2026-06-09 cs.CV 新提交

REACT 2026: The Fourth Multiple Appropriate Facial Reaction Generation Challenge: Personalised MAFRG and Appropriate EEG Reaction Prediction

REACT 2026:第四届多适切面部反应生成挑战赛:个性化MAFRG与适切脑电反应预测

Siyang Song, Micol Spitale, Zijian Wu, Xiangyu Kong, Cheng Luo, Cristina Palmero, German Barquero, Sergio Escalera, Michel Valstar, Mohamed Daoudi, Fabien Ringeval, Andrew Howes, Elisabeth Andre, Hatice Gunes

发表机构 * University of Exeter(埃克塞特大学) Politecnico di Milano(米兰理工大学) Nanjing University of Science and Technology(南京理工大学) King Abdullah University of Science and Technology(阿卜杜拉国王科技大学) King's College London(伦敦国王学院) Universitat de Barcelona(巴塞罗那大学) University of Nottingham(诺丁汉大学) IMT Nord Europe(IMT 北欧欧洲) Université Grenoble Alpes(格勒诺布尔-阿尔卑斯大学) University of Augsburg(奥格斯堡大学) University of Cambridge(剑桥大学)

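AI summary: Proposes the REACT 2026 challenge, which encourages machine learning models that generate personalised, appropriate, diverse, realistic, and synchronised human facial reactions, and adds personality labels and EEG recordings to open a new one-to-many personalised facial reaction generation setting.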

详情
Comments
arXiv admin note: text overlap with arXiv:2505.17223
AI中文摘要

在二元交互中,多种人类面部反应可能适合回应每个说话者行为。继REACT 2023、2024和2025挑战系列成功举办后,针对多适切面部反应生成(MAFRG)问题,已开发了一系列生成式深度学习模型。今年,我们提出REACT 2026挑战赛,鼓励开发和基准测试机器学习(ML)模型,这些模型能够生成由特定人类倾听者表达的多个个性化、适切、多样、真实且同步的人类风格面部反应,以回应每个给定的说话者行为。作为挑战的关键,我们持续向挑战参与者提供REACT 2025引入的MARS数据集,并额外提供个体层面的五大人格标签和脑电记录。这引入了一种新的结合人类表达行为、情感和神经生理信号的一对多个性化面部反应生成设置,这在当前的二元交互建模中仍很大程度上未被探索。本文还介绍了挑战指南和四个提议的子挑战的新基线:离线通用和个性化MAFRG,以及在线通用和个性化MAFRG,这些基线公开于https://github.com/reactmultimodalchallenge/baseline_react2026。

英文摘要

In dyadic interactions, various human facial reactions could be appropriate for responding to each human speaker behaviour. Following the successful organisation of the REACT 2023, 2024 and 2025 challenge series, a body of generative deep learning (DL) models has been developed for the problem of multiple appropriate facial reaction generation (MAFRG). This year, we propose the REACT 2026 challenge, which encourages the development and benchmarking of Machine Learning (ML) models that can generate multiple personalised, appropriate, diverse, realistic and synchronised human-style facial reactions expressed by a specific human listener in response to each given speaker behaviour. As a key part of the challenge, we continue to provide participants with the MARS dataset introduced by REACT 2025 and additionally provide individual-level Big-Five personality labels and EEG recordings. This introduces a new one-to-many personalised facial reaction generation setting combining human expressive behavioural, affective and neurophysiological signals, which remains largely unexplored in current dyadic interaction modelling. This paper also presents the challenge guidelines and new baselines for the four proposed sub-challenges: offline generic and personalised MAFRG, and online generic and personalised MAFRG. The baselines are publicly available at https://github.com/reactmultimodalchallenge/baseline_react2026.

2606.07934 2026-06-09 cs.RO 新提交

X-OP: Cross-Morphology Whole-Body Teleoperation via MPC Retargeting

X-OP: 基于MPC重定向的跨形态全身遥操作

Jen-Wei Wang, Sarthak Kaingade, Andrea Tagliabue, Nicholas Morozovsky

发表机构 * Amazon(亚马逊) University of California, Berkeley(加州大学伯克利分校)

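AI summary: Proposes a hierarchical whole-body teleoperation framework driven by a single XR device; MPC-based retargeting jointly optimizes alignment with operator intent and the robot's dynamic feasibility without robot-specific policy retraining, achieving higher task success rates in simulation, with real-world experiments confirming deployment on both platforms.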

详情
Comments
9 pages, 4 figures
AI中文摘要

全身遥操作对于在移动操作任务中实现可扩展的机器人数据收集至关重要,然而现有依赖外骨骼套装或多摄像头设置的方法带来了高昂的成本、复杂性和环境限制。最近使用单一扩展现实(XR)设备结合端到端强化学习策略的方法部分解决了这些限制,但需要针对特定机器人重新训练,遭受分布外故障,并且依赖于忽略动态可行性的运动重定向。我们提出了一种由单个XR设备驱动的层次化全身遥操作框架,该框架无需重新训练特定机器人的策略即可泛化到不同机器人形态。基于模型预测控制(MPC)的运动重定向器联合优化与操作者意图的对齐以及机器人的动态可行性,为现有的低级控制器生成最优命令。为了确保鲁棒的在线执行,我们引入了一种状态同步方法,在每个MPC步骤重置模拟器状态以处理嘈杂的真实世界测量和接触敏感性,并集成基于SLAM的全局位姿反馈以减轻长期漂移。仿真结果显示,在全身控制任务中,与基线相比,人形机器人(完成时间降低30%以上,功耗降低20%)和移动操作臂(零碰撞)均取得了更高的成功率。真实世界实验进一步验证了我们方法的有效性和灵活性,展示了所提出的重定向器在两个平台上成功部署于全身控制任务,并允许用户根据偏好轻松调整遥操作行为。该即插即用框架为全身机器人遥操作提供了一种可扩展、形态无关的解决方案,实现了实时行为定制和跨平台的广泛适用性。

英文摘要

Whole-body teleoperation is essential for scalable robot data collection in loco-manipulation tasks, yet existing approaches relying on exoskeleton suits or multi-camera setups impose prohibitive cost, complexity, and environmental constraints. Recent methods using a single extended reality (XR) device with end-to-end reinforcement learning policies partially address these limitations but require robot-specific retraining, suffer from out-of-distribution failures, and rely on motion retargeting that neglects dynamic feasibility. We propose a hierarchical whole-body teleoperation framework driven by a single XR device that generalizes across diverse robot morphologies without retraining robot-specific policies. A Model Predictive Control (MPC)-based motion retargeter jointly optimizes alignment with the operator's intent and the robot's dynamic feasibility, generating optimal commands for existing low-level controllers. To ensure robust online execution, we introduce a state synchronization method that resets the simulator state at each MPC step to handle noisy real-world measurements and contact sensitivity, and integrate SLAM-based global pose feedback to mitigate long-term drift. Simulation results show higher success rates on whole-body control tasks for both a humanoid (over 30% lower completion time and 20% lower power consumption) and a mobile manipulator (zero collisions) compared to baselines. Real-world experiments further validate the effectiveness and flexibility of our method, demonstrating the successful deployment of the proposed retargeter on both platforms for whole-body control tasks and the ease of allowing users to adjust teleoperation behavior based on their preferences. This plug-and-play framework offers a scalable, morphology-agnostic solution for whole-body robot teleoperation, enabling real-time behavioral customization and broad applicability across platforms.

2606.07932 2026-06-09 cs.CV cs.GR cs.MM eess.IV math.OC 新提交

LEGS: Laplacian-Enhanced Gaussian Splatting with a Nonlinear Weighted Loss

LEGS: 拉普拉斯增强的高斯泼溅与非线性加权损失

Yongfei Guo, Qizhou Huo, Xuan Sun, Yuanhao Gong

发表机构 * Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences(中国科学院长春光学精密机械与物理研究所) University of Chinese Academy of Sciences(中国科学院大学)

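AI summary: Proposes LEGS, which improves the Gaussian Splatting loss with second-order Laplacian structural guidance and nonlinear weight functions, strengthening structure-aware optimization while leaving the rendering pipeline unchanged; PSNR improves by up to 1.68 dB over 3DGS on the Tanks&Temples and Mip-NeRF360 datasets.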

详情
AI中文摘要

3D高斯泼溅(3DGS)已成为辐射场重建和实时新视角合成的高效显式表示方法。然而,其标准光度损失对平坦区域和结构丰富区域处理相似,这可能限制尖锐轮廓和精细细节的恢复。边缘引导高斯泼溅(EGGS)通过边缘引导加权提高了结构感知能力,但主要依赖一阶梯度响应和线性加权。本文提出LEGS,一种具有非线性加权损失的拉普拉斯增强高斯泼溅方法。LEGS用二阶拉普拉斯结构引导取代一阶梯度引导,并通过非线性响应-权重函数将归一化拉普拉斯响应映射为逐像素权重。所提出的损失改进了结构感知的高斯优化,同时保持原始3DGS渲染管线不变。在完整Tanks&Temples和Mip-NeRF360数据集上的实验表明,LEGS相比3DGS的峰值信噪比(PSNR)最高提升1.68 dB,相比EGGS最高提升0.52 dB。将所提出的二阶非线性加权策略集成到FastGS和FasterGS中,PSNR进一步提升最高1.69 dB,证明了其作为高斯泼溅管线的通用损失级扩展的有效性,在AR/VR、沉浸式可视化和实时3D内容生成中具有潜在应用。

英文摘要

3D Gaussian Splatting (3DGS) has become an efficient explicit representation for radiance field reconstruction and real-time novel view synthesis. However, its standard photometric loss treats flat and structure-rich regions similarly, which may limit the recovery of sharp contours and fine details. Edge-Guided Gaussian Splatting (EGGS) improves structure awareness through edge-guided weighting, but mainly relies on first-order gradient responses and linear weighting. In this paper, we propose LEGS, a Laplacian-Enhanced Gaussian Splatting method with a nonlinearly weighted loss. LEGS replaces first-order gradient guidance with second-order Laplacian structural guidance and maps the normalized Laplacian response into pixel-wise weights through nonlinear response-to-weight functions. The proposed loss improves structure-aware Gaussian optimization while keeping the original 3DGS rendering pipeline unchanged. Experiments on the full Tanks&Temples and Mip-NeRF360 datasets show that LEGS improves peak signal-to-noise ratio (PSNR) by up to 1.68 dB over 3DGS and up to 0.52 dB over EGGS. Incorporating the proposed second-order nonlinear weighting strategy into FastGS and FasterGS further improves PSNR by up to 1.69 dB, demonstrating its effectiveness as a general loss-level extension for Gaussian Splatting pipelines with potential applications in AR/VR, immersive visualization, and real-time 3D content generation.
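
The weighting idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4-neighbour Laplacian stencil, the max-normalisation, the power-law response-to-weight function `1 + alpha * r**gamma`, and every function name below are choices made for the example. The abstract only states that a normalized Laplacian response is mapped to pixel-wise weights through nonlinear functions.

```python
def laplacian_response(img):
    """Absolute 4-neighbour Laplacian of a 2D grayscale image (list of rows).

    Border pixels are left at 0 for simplicity (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1]
                   - 4.0 * img[y][x])
            out[y][x] = abs(lap)
    return out


def pixel_weights(img, alpha=1.0, gamma=0.5):
    """Map the max-normalised response r in [0, 1] to 1 + alpha * r**gamma.

    alpha and gamma are hypothetical hyperparameters, not values from the paper.
    """
    resp = laplacian_response(img)
    peak = max(max(row) for row in resp)
    if peak == 0.0:
        # Flat image: fall back to uniform weights.
        return [[1.0] * len(row) for row in resp]
    return [[1.0 + alpha * (r / peak) ** gamma for r in row] for row in resp]


def weighted_l1(rendered, target, alpha=1.0, gamma=0.5):
    """Weighted mean absolute error; weights are computed from the target only."""
    weights = pixel_weights(target, alpha, gamma)
    total = 0.0
    norm = 0.0
    for wrow, rrow, trow in zip(weights, rendered, target):
        for wt, r, t in zip(wrow, rrow, trow):
            total += wt * abs(r - t)
            norm += wt
    return total / norm
```

With `gamma < 1` the mapping is concave, so even moderate Laplacian responses receive a noticeably larger weight than flat regions; only the loss changes, which is consistent with the claim that the rendering pipeline stays untouched.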

2606.07924 2026-06-09 cs.CV cs.AI cs.CL cs.LG cs.MM 新提交

Decoupling Semantics and Logic: A Training-Free Coarse-to-Fine Pipeline for Video Retrieval-Augmented Generation

解耦语义与逻辑:一种无需训练的从粗到精的视频检索增强生成流水线

Jiaxin Dai, Zehang Wei, Jiamin Yan, Xiang Xiang

发表机构 * School of Computer Science & Tech, Huazhong University of Science and Technology(华中科技大学计算机科学与技术学院) School of AI and Automation, Huazhong University of Science and Technology(华中科技大学人工智能与自动化学院)

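AI summary: Proposes a training-free, two-stage cascaded Video RAG pipeline that decouples semantic retrieval from logical reasoning, targeting cross-lingual long-video comprehension, strict persona adherence, and zero-hallucination temporal grounding.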

详情
Comments
To be presented at ACL 2026 MAGMAR Workshop (Oral; Retrieval leaderboard No.1)
AI中文摘要

本文介绍了我们为第二届多模态增强生成研讨会(MAGMaR)提交的系统描述。针对跨语言长视频理解、严格角色遵循和零幻觉时间定位等关键挑战,我们提出了一种完全无需训练的两阶段级联视频RAG流水线。我们的架构通过模态感知的任务分工,策略性地将语义检索与认知逻辑推理解耦。在第一阶段,一个高召回率的语义预取模块仅使用高保真视觉摘要和全局文本描述进行密集检索,明确隔离噪声模态(如OCR和ASR)以保持纯净的向量空间。在第二阶段,一个由商业大语言模型(LLM)驱动的自适应、迭代和推理(A.I.R.)过滤代理执行细粒度认知重排序。该代理重新整合完整的多模态上下文,以强制执行与用户角色的严格逻辑对齐,有效剪除语义相似但逻辑无关的候选。最后,提示雕刻机制约束生成器将蒸馏后的子集合成为严格格式化的JSON响应,并带有精确的块级引用。在RAG轨道上的评估表明,我们的资源感知方法在信息检索和角色条件生成方面均表现出卓越的精度。

英文摘要

This paper presents our system description for the 2nd Workshop on Multimodal Augmented Generation via MultimodAl Retrieval (MAGMaR). Addressing the critical challenges of cross-lingual long-video comprehension, strict persona adherence, and zero-hallucination temporal grounding, we propose a fully training-free, two-stage cascaded Video RAG pipeline. Our architecture strategically decouples semantic retrieval from cognitive logical reasoning through a modality-aware division of labor. In the first stage, a high-recall semantic pre-fetching module employs dense retrieval using only high-fidelity visual summaries and global text descriptions, explicitly isolating noisy modalities (e.g., OCR and ASR) to maintain a pristine vector space. In the second stage, an Adaptive, Iterative, and Reasoning-based (A.I.R.) filtering agent, powered by a commercial Large Language Model (LLM), performs fine-grained cognitive reranking. The agent re-incorporates full multimodal contexts to enforce strict logical alignment with user personas, effectively pruning semantically similar but logically irrelevant candidates. Finally, a Prompt Sculpting mechanism constrains the generator to synthesize the distilled subset into strictly formatted JSON responses with exact chunk-level citations. Evaluated on the RAG track, our resource-aware approach shows exceptional precision in both information retrieval and persona-conditioned generation.

2606.07916 2026-06-09 cs.AI 新提交

The CIFAR Synthetic Evidence Corpus for Detecting AI-Generated Evidence

CIFAR合成证据语料库:用于检测AI生成证据

Kelly McConvey, Jalehsadat Mahdavimoghaddam, Nima Jamali, Maksym Taranukhin, Sajad Ebrahimi, Wentao Zhang, Yuntian Deng, Karen Eltis, Maura R. Grossman, Vered Shwartz, Ebrahim Bagheri

发表机构 * University of Toronto(多伦多大学) University of Waterloo(滑铁卢大学) University of British Columbia(不列颠哥伦比亚大学) Vector Institute(向量研究所) University of Ottawa(渥太华大学)

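AI summary: Addresses the lack of suitable datasets for detecting inauthentic evidence in the justice system by building the CIFAR Synthetic Evidence Corpus, which covers multiple document types and manipulation strategies and supports evaluating evidence verification systems under controlled conditions.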

详情
AI中文摘要

生成模型生成逼真文档的能力日益增强,这对司法系统和法院中的证据工作流程构成了直接挑战,因为决策越来越依赖于收据、通信和行政记录等证据的真实性。与社交媒体或学术环境不同,证据性文档通常仅被微妙地修改,通过局部编辑保持整体合理性,同时改变法律含义。然而,自动检测的进展仍然有限,主要原因是缺乏适合司法系统要求的训练和评估数据。现有资源要么专注于人脸照片或自然风景,要么局限于狭窄的学术或社交媒体文档类型,未能捕捉真实世界证据数据的结构、多样性或篡改模式。因此,当前的检测系统不一定能学习到适合司法系统的有意义信号。我们引入了CIFAR合成证据语料库,这是一个旨在在现实和受控条件下严格评估证据验证的数据集。该语料库涵盖多个文档家族和一系列篡改策略,从小规模字段级编辑到完整文档伪造,并使用多种最先进的生成工具构建。其组织方式系统性地变化篡改复杂性和生成方法,同时在训练和测试数据之间强制源级分离,以反映现实世界的泛化挑战。

英文摘要

The growing ability of generative models to produce realistic documents poses a direct challenge to evidentiary workflows in the justice system and the courts, where decisions increasingly depend on the authenticity of evidence such as receipts, communications, and administrative records. Unlike social media or academic settings, evidentiary documents are often only subtly altered, with small, localized edits that preserve overall plausibility while changing legal meaning. Yet progress on automated detection remains limited, largely due to the absence of training and evaluation data suited to the requirements of the justice system. Existing resources focus either on photos of human faces or natural scenery, or on narrowly scoped academic or social media document types, and do not capture the structure, diversity, or manipulation patterns characteristic of real-world evidentiary data. As a result, current detection systems do not necessarily learn meaningful signals appropriate for the justice system. We introduce the CIFAR Synthetic Evidence Corpus, a dataset designed to enable rigorous evaluation of evidence verification under realistic and controlled conditions. The corpus spans multiple document families and a spectrum of manipulation strategies, from small field-level edits to complete document fabrication, and is constructed using a diverse set of state-of-the-art generative tools. It is organized to systematically vary both manipulation complexity and generation method, while enforcing source-level separation between training and test data to reflect real-world generalization challenges.
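
Source-level separation means that every document derived from the same source lands on the same side of the train/test split. The sketch below shows one common way to enforce that; it is not taken from the corpus tooling. The `source_id` field, the hash-based assignment, and the 20% test fraction are all assumptions made for illustration.

```python
import hashlib


def split_by_source(records, test_fraction=0.2):
    """Assign each record to train or test using only its source ID.

    `records` is a list of dicts with a hypothetical "source_id" key.
    Hashing the ID makes the assignment deterministic, so every document
    derived from one source always lands in the same partition.
    """
    train, test = [], []
    for rec in records:
        digest = hashlib.sha256(rec["source_id"].encode("utf-8")).hexdigest()
        bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
        if bucket < test_fraction:
            test.append(rec)
        else:
            train.append(rec)
    return train, test
```

Because the split depends on the source rather than on the individual document, a detector cannot score well on the test set merely by memorising source-specific layout or wording seen during training.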

2606.07915 2026-06-09 cs.AI 新提交

EditSR: Enhancing Neural Symbolic Regression via Edit-based Rectification

EditSR: 通过基于编辑的修正增强神经符号回归

Da Li, Xinxin Li, Xingyu Cui, Jin Xu, Juan Zhang, Junping Yin

发表机构 * Northeast Normal University(东北师范大学) East China Normal University(华东师范大学) Shenzhen Institute of Advanced Technology(深圳先进技术研究院) Graduate School of China Academy of Engineering Physics(中国工程物理研究院研究生院) Beihang University(北京航空航天大学) Institute of Applied Physics and Computational Mathematics(北京应用物理与计算数学研究所)

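AI summary: Proposes EditSR, a two-layer framework in which a neural symbolic regression model generates an expression and a pretrained edit-based Rectifier then corrects errors step by step along a state-transition chain, avoiding a restart of global search, reducing error accumulation, and improving structural correctness on complex expressions.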

详情
AI中文摘要

神经符号回归模型通过将结构搜索转移到预训练来提高推理效率,但其一次性自回归解码容易产生误差累积,可能导致生成结构不正确的表达式,尤其是在复杂表达式生成场景中。现有的修正策略可以缓解这一问题,但它们通常依赖于重新启动全局搜索,从而削弱了神经模型的效率优势,并且仍然容易受到误差累积的影响。在本文中,我们提出了EditSR,一个双层框架,第一层结合神经符号回归模型,第二层结合基于编辑的修正器,以实现高效预测和事后修正。我们不重新启动全局搜索,而是通过预训练修正器来保持修正效率。具体来说,我们将修正过程形式化为从错误表达式开始的逐步状态转移链,并开发了一种状态转移算法来构建用于训练修正器的监督修正链。为了确保修正过程中的语法有效性,每个编辑操作都被限制在语法有效的空间内,使得每个编辑后的表达式仍然可解析。此外,由于每个编辑决策基于当前状态而非历史,修正器允许后续编辑修正早期步骤中的错误,从而降低误差累积的风险。大量实验和消融研究表明,EditSR以有限的额外成本显著提高了符号结构恢复能力,在复杂表达式上收益更明显,因为一次性自回归解码更容易受到误差累积的影响。

英文摘要

Neural symbolic regression models improve inference efficiency by shifting structural search to pretraining, but their one-pass autoregressive decoding is prone to error accumulation, which may lead to generating structurally incorrect expressions, especially in complex expression generation scenarios. Existing rectification strategies can alleviate this issue, but they often depend on restarting global search, thereby weakening the efficiency advantage of neural models, and remain susceptible to error accumulation. In this paper, we propose EditSR, a two-layer framework that combines a neural symbolic regression model in the first layer with an edit-based Rectifier in the second layer to achieve efficient prediction and post-hoc rectification. Instead of restarting the global search, we maintain rectification efficiency by pretraining the Rectifier. Specifically, we formulate the rectification process as a step-by-step state-transition chain starting from an incorrect expression, and develop a state-transition algorithm to construct supervised rectification chains for training the Rectifier. To ensure syntactic validity throughout rectification, each edit action is restricted to a syntactically valid space so that every edited expression remains parseable. In addition, because each edit decision is conditioned on the current state rather than the history, the Rectifier allows errors made in earlier steps to be rectified by subsequent edits, thereby reducing the risk of error accumulation. Extensive experiments and ablation studies show that EditSR substantially improves symbolic structure recovery with limited extra cost, with more pronounced gains on complex expressions, where one-pass autoregressive decoding is more susceptible to error accumulation.

2606.07910 2026-06-09 cs.LG 新提交

CAAL: Contextual Bandits based Online Hand-Craft Active Learning Strategy Selection

CAAL: 基于上下文赌博机的在线手工主动学习策略选择

Shao-An Yin, Jiacong Li, Tianpei Xie, Cecile Levasseur, Wojciech Kowalinski, Nicola Elia

发表机构 * University of Minnesota, Twin Cities(明尼苏达大学双城分校) Amazon(亚马逊)

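AI summary: Proposes the CAAL framework, which uses context information and reward prediction to select active learning strategies dynamically and outperforms the existing baseline adaptive strategy on public datasets.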

详情
Comments
8 pages, 5 figures, Accepted to the NYRL 2025 Workshop
AI中文摘要

主动学习算法面临的挑战是未标注数据统计分布的不确定性,这使得难以选择最佳的手工策略。为了解决这个问题,我们引入了上下文自适应主动学习(CAAL)。在CAAL中,每个“臂”代表一个手工策略。与仅基于标注数据反馈选择策略的现有框架不同,我们利用外部上下文信息的奖励预测动态选择用于标注数据批次的策略。这个通用框架允许通过领域知识进行定制,以设计更有效的奖励和上下文候选。此外,我们通过实验表明,使用我们的奖励和上下文设计,CAAL在公共数据集上优于现有的基线自适应策略。无论每次迭代的批次大小如何,我们的结果都是一致的。

英文摘要

The challenge with active learning algorithms is the uncertainty of the statistical distribution of unlabeled data, making it difficult to choose the best hand-crafted strategy. To address this, we introduce Contextual Adaptive Active Learning (CAAL). In CAAL, each "arm" represents a hand-crafted strategy. Unlike existing frameworks that select strategies based only on feedback from labeled data, we dynamically choose strategies for labeling batches of data using reward prediction with external context information. This general framework allows for customization with domain knowledge to design more effective rewards and context candidates. In addition, we experimentally show that CAAL outperforms the existing baseline adaptive strategy on public datasets using our reward and context design. Our results are consistent regardless of batch size in each iteration.
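
The abstract does not name the bandit algorithm, so the following is only a sketch of the general idea, using LinUCB as a stand-in. Each arm is one hand-crafted strategy, the context is a feature vector describing the current state of the pool, and the reward could be, for example, the accuracy gain after labeling a batch. The strategy names, the context features, and the reward definition are all hypothetical.

```python
import math


class LinUCBArm:
    """One arm (one hand-crafted strategy) with a ridge-regression reward model."""

    def __init__(self, dim, ridge=1.0):
        # a_inv is the inverse of (ridge * I + sum of x x^T); b is sum of reward * x.
        self.a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec))) for row in self.a_inv]

    def ucb(self, x, explore=1.0):
        """Predicted reward for context x plus an exploration bonus."""
        theta = self._a_inv_times(self.b)
        ax = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        variance = max(0.0, sum(a * xi for a, xi in zip(ax, x)))
        return mean + explore * math.sqrt(variance)

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of the inverse, then accumulate b."""
        ax = self._a_inv_times(x)
        denom = 1.0 + sum(a * xi for a, xi in zip(ax, x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


def select_strategy(arms, context, explore=1.0):
    """Return the name of the arm with the highest upper confidence bound."""
    scores = {name: arm.ucb(context, explore) for name, arm in arms.items()}
    return max(scores, key=scores.get)
```

In an active learning loop, one would call `select_strategy` before each labeling batch, run the chosen strategy, and pass the observed reward to that arm's `update`. Setting `explore=0.0` turns the selector into a purely greedy choice on predicted reward.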

2606.07902 2026-06-09 cs.RO 新提交

End-to-End Control of a Powered Knee-Ankle Prosthesis Towards Unified, Tuning-Free Assistance

动力膝踝假肢的端到端控制:迈向统一、免调参的辅助

John Shim, Christoph Nuesslein, Sixu Zhou, Hanjun Kim, Kinsey Herrin, Aaron Young

发表机构 * Georgia Institute of Technology(佐治亚理工学院) Woodruff School of Mechanical Engineering(伍德拉夫机械工程学院) Institute for Robotics and Intelligent Machines(机器人与智能机器研究所)

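AI summary: Presents an end-to-end prosthesis controller that uses Temporal Convolutional Networks to estimate continuous actuator signals from onboard sensors, with no intent classifier or subject-specific tuning, and provides unified, mode-adaptive assistance across multiple terrains and locomotion modes.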

详情
Comments
7 pages, 6 figures
AI中文摘要

动力假肢通常依赖需要大量手动调参和显式模式分类的阻抗控制器。在这项工作中,我们展示了端到端假肢控制器的实时部署,该控制器从机载传感器估计连续执行器信号,消除了对意图分类器和个体调参的需求。时序卷积网络在来自18名经股截肢者的多地形数据集上训练,并在五种运动模式下实时部署。四名参与者(三名健全人,一名经股截肢者)在平地、斜坡上坡、斜坡下坡以及楼梯上坡和下坡上行走。在平地行走中,部署的控制器再现了踝关节峰值力矩随步行速度变化的训练数据缩放关系(部署:0.85 Nm/kg per m/s,p = 0.001;训练:0.96 Nm/kg per m/s,95% CI [0.42, 1.50],p = 0.002),排除了一个因异常假肢负载导致的离群点。在斜坡上坡时,控制器使膝关节预屈曲随坡度变化(部署:2.92 deg/deg,p = 0.027;训练:3.30 deg/deg,95% CI [1.83, 4.77],p < 0.001)。在斜坡下坡时,控制器相对于平地行走增加了膝关节阻力矩(部署:+0.16 Nm/kg,p < 0.001;训练:+0.16 Nm/kg,p = 0.008)。尽管训练数据仅包含一种肢体引导序列,但控制器在楼梯上坡和下坡中为健侧和假肢侧引导序列均生成了无缝的过渡。这些结果为端到端控制提供了初步证据,表明其能够提供统一、模式自适应的假肢辅助,而无需个体调参。

英文摘要

Powered prostheses conventionally rely on impedance controllers that require extensive manual tuning and explicit mode classification. In this work, we present real-time deployment of an end-to-end prosthesis controller that estimates continuous actuator signals from onboard sensors, eliminating the need for intent classifiers and subject-specific tuning. Temporal Convolutional Networks were trained on a multi-terrain dataset from 18 individuals with transfemoral amputation and deployed in real time across five locomotion modes. Four participants (three able-bodied, one with transfemoral amputation) ambulated across level ground, ramp ascent and descent, and stair ascent and descent. During level walking, the deployed controller reproduced the training-data scaling of peak ankle torque with walking speed (deployed 0.85 Nm/kg per m/s, p = 0.001; training 0.96 Nm/kg per m/s, 95% CI [0.42, 1.50], p = 0.002), after excluding one outlier traced to atypical prosthesis loading. During ramp ascent, the controller scaled knee pre-flexion with grade (deployed 2.92 deg/deg, p = 0.027; training 3.30 deg/deg, 95% CI [1.83, 4.77], p < 0.001). During ramp descent, the controller increased resistive knee torque relative to level walking (deployed +0.16 Nm/kg, p < 0.001; training +0.16 Nm/kg, p = 0.008). Seamless stair transitions were generated for both intact- and prosthetic-side-leading sequences in ascent and descent, despite the training data containing only one limb-leading sequence. These results provide initial evidence towards end-to-end control that can provide unified, mode-adaptive prosthetic assistance without subject-specific tuning.

2606.07898 2026-06-09 cs.LG cs.CE 新提交

Temporal Coverage over Density: Parsimonious Training-Set Design for ML Climate Downscaling

密度之上的时间覆盖:机器学习气候降尺度的简约训练集设计

Karandeep Singh, Stefan Rahimi, Chad W. Thackeray, Stephen Cropper, Alex Hall

发表机构 * University of California, Los Angeles(加州大学洛杉矶分校) University of Wyoming(怀俄明大学)

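AI summary: Addresses the scarcity of high-resolution simulations for machine-learning climate downscaling by allocating training years through temporally distributed sampling rather than contiguous blocks, to better capture the forced climate response and internal variability; experiments show that temporally distributed sampling performs best under a fixed budget.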

详情
Comments
22 pages, 8 figures
AI中文摘要

高分辨率区域气候模拟为气候影响评估提供了关键信息,但计算成本高昂,这推动了机器学习降尺度器和模拟器的发展。一个关键挑战是确定如何将有限的高分辨率模拟分布到不断变化的气候轨迹中,以同时捕捉强迫气候响应和内部变率。利用美国西部的CESM2大型集合,我们在固定数据预算下比较了三种训练年份选择策略:历史年份的连续块、从模拟期开始和结束年份中抽取的年份,以及分布在整个气候轨迹中的年份。同时包含历史和未来年份始终优于仅使用历史年份训练,这表明让降尺度模型接触历史记录之外的气候状态的重要性,并突出了统计降尺度中常见的平稳性假设的局限性。使用分布在整个气候轨迹中的年份进行训练总体表现最佳,表明内部变率的广泛采样除了暴露于强迫气候响应之外还提供了额外信息。在时间分布子集上训练的模型能更成功地再现未见集合成员中的变率,同时在广泛的气候诊断中保持强劲性能。即使仅使用可用高分辨率年份的十分之一进行训练,时间分布模型仍与全数据训练高度竞争。这些结果表明,在固定计算预算下,分配稀缺的高分辨率模拟时,气候状态的广泛采样比时间连续性更有价值。这些发现为区域气候降尺度和大型集合预测工作流程提供了实用指导。

英文摘要

High-resolution regional climate simulations provide critical information for climate impacts assessments but remain computationally expensive, motivating the development of machine-learning downscalers and emulators. A key challenge is determining how limited high-resolution simulations should be distributed across a changing climate trajectory to capture both forced climate response and internal variability. Using the CESM2 Large Ensemble over the western United States, we compare three training-year selection strategies under fixed data budgets: a contiguous block of historical years, years drawn from both the beginning and end of the simulation period, and years distributed throughout the full climate trajectory. Including both historical and future years consistently outperforms training on historical years alone, demonstrating the importance of exposing downscaling models to climate states outside the historical record and highlighting limitations of stationarity assumptions common in statistical downscaling. Training on years distributed throughout the full climate trajectory performs best overall, indicating that broad sampling of internal variability provides additional information beyond exposure to the forced climate response alone. Models trained on temporally distributed subsets more successfully reproduce variability in unseen ensemble members while retaining strong performance across a wide range of climate diagnostics. Even when trained on only one-tenth of the available high-resolution years, temporally distributed models remain highly competitive with full-data training. These results suggest that, under fixed computational budgets, broad sampling of climate states is more valuable than temporal continuity when allocating scarce high-resolution simulations. The findings provide practical guidance for regional climate downscaling and large-ensemble projection workflows.
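
The three training-year selection strategies compared in this abstract can be written down directly. The sketch below is an illustration only: the function names, the even split between start and end years, and the even spacing rule are assumptions, and the paper may draw years differently.

```python
def contiguous_block(years, budget):
    """Strategy 1: a contiguous block taken from the start of the period."""
    return years[:budget]


def bookends(years, budget):
    """Strategy 2: half the budget from the start, the rest from the end."""
    head = budget // 2
    tail = budget - head
    return years[:head] + years[len(years) - tail:]


def distributed(years, budget):
    """Strategy 3: years spaced evenly across the full trajectory."""
    if budget == 1:
        return [years[len(years) // 2]]
    step = (len(years) - 1) / (budget - 1)
    return [years[round(i * step)] for i in range(budget)]
```

For a hypothetical 100-year simulation period and a budget of one-tenth of the years, `distributed` returns one year roughly every eleven years, whereas `contiguous_block` never leaves the first decade. All three functions return the same number of years, which keeps the data budget fixed across strategies.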

2606.07897 2026-06-09 cs.AI cs.HC 新提交

The AI Epistemic Deference Index: A Continuous Measure of Sycophancy

AI认知顺从指数:谄媚行为的连续度量

Alejandro Botas, Paul de Font-Reaulx, Luke Hewitt

发表机构 * Independent(独立研究者) University of Michigan, Ann Arbor(密歇根大学安娜堡分校) Transluce

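AI summary: Proposes the AI Epistemic Deference Index (AEDI), a continuous measure of how far a model defers to the user's attitude, computed by estimating probabilities from natural language outputs; tests on eight models show large differences, with Claude models deferring least and Grok and Gemini models most.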

详情
AI中文摘要

当前的AI模型经常表现出认知谄媚,即赞同用户的说法。现有的评估通常通过衡量使模型改变二元认可所需的条件,或通过引发对命题的明确概率来度量。然而,许多面向用户的谄媚行为是通过日常语言中表达的分级支持的转变来体现的。我们提出AI认知顺从指数(AEDI):一个连续的、单维度的分数,表示模型输出中表达的支持对用户提示中表达的态度敏感程度。为了生成AEDI,我们提供了一种新的协议,用于从自然语言输出中估计概率,使用LLM作为评判者,并验证了其与人类判断的一致性和相关性。我们在一个包含500个不同主题命题和16000个不同用户态度提示的新策划数据库上部署了该指数,测试了8个主流模型。每个模型都表现出显著的顺从,尽管不同提供商之间存在巨大且系统的差异,其中Claude模型顺从最少,而Grok和Gemini模型顺从最多。在要求书面产物的提示中,这种效应被放大,并集中在模型先验较弱的命题上。我们发布AEDI作为一个易于更新的基准和测量流程,用于输出级别的谄媚评估。

英文摘要

Current AI models frequently exhibit epistemic sycophancy, endorsing claims to agree with a user. Existing evaluations typically measure this either by assessing what it takes to make a model shift a binary endorsement or by eliciting an explicit probability in a proposition. However, much user-facing sycophantic behavior is demonstrated through shifts in graded support expressed through ordinary language. We propose the AI Epistemic Deference Index (AEDI): a continuous, unidimensional score representing how sensitive the support expressed in a model's output is to the attitude expressed in a user's prompt. To generate AEDI, we provide a new protocol for estimating probabilities from natural language outputs, using LLMs-as-judges validated for consistency and correlation to human judgment. We deploy it on a new curated database of 500 propositions across diverse topics and 16,000 prompts varying in user attitude, testing eight prominent models. Every model exhibits substantial deference, though with large and systematic differences across providers, with Claude models demonstrating the least, and Grok and Gemini models the most. The effect is amplified in prompts requesting a written artifact, and concentrated on propositions where models hold weaker priors. We release AEDI as an easy-to-update benchmark and measurement pipeline for output-level sycophancy evaluation.

2606.07895 2026-06-09 cs.CV cs.RO 新提交

TBD-VLA: Temporal Block Diffusion Vision Language Action Model

TBD-VLA: 时序块扩散视觉语言动作模型

Sung-Wook Lee, Xuhui Kang, Yen-Ling Kuo

发表机构 * University of Virginia(弗吉尼亚大学)

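AI summary: Proposes TBD-VLA, which uses temporal block diffusion to give discrete-token VLA models parallel action generation, balancing temporal coherence with inference speed and outperforming prior approaches in simulation and real-world tasks.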

详情
AI中文摘要

离散视觉-语言-动作(VLA)模型通常将动作生成建模为离散动作空间上的下一个令牌预测,每个令牌自回归地依赖于先前的上下文。虽然有效,但这种范式会导致高推理延迟,并且很大程度上忽略了动作轨迹中固有的时间结构。最近的工作引入并行解码以提高效率,实现更快的推理,但缺乏建模令牌依赖关系的显式机制。我们提出TBD-VLA,一种基于离散令牌的VLA框架,它结合了块扩散以实现时序动作生成。我们将动作序列划分为时间块,并在每个块内执行掩码离散扩散,同时保持跨块的自回归生成。这种设计统一了时序自回归和并行动作解码,实现了强时序连贯性和改进的推理速度。此外,显式的时序建模通过时序修补实现了动作块(例如实时分块)的异步执行。TBD-VLA在仿真和真实世界的操作任务中显著优于先前的VLA方法,为走向快速、时序感知的离散VLA模型提供了一条可扩展的路径。项目网页:https://tbd-vla.github.io/

英文摘要

Discrete Vision-Language-Action (VLA) models typically formulate action generation as next-token prediction over discretized action spaces, conditioning each token autoregressively on prior context. While effective, this paradigm incurs high inference latency and largely ignores the temporal structure inherent in action trajectories. Recent efforts introduce parallel decoding to improve efficiency, enabling faster inference, but lack explicit mechanisms for modeling token dependencies. We introduce TBD-VLA, a discrete token-based VLA framework that incorporates block diffusion to enable temporal action generation. We partition action sequences into temporal blocks and perform masked discrete diffusion within each block, while maintaining autoregressive generation across blocks. This design unifies temporal autoregression and parallel action decoding, achieving both strong temporal coherence and improved inference speed. In addition, the explicit temporal modeling enables asynchronous execution of action chunks (e.g., Real-Time Chunking) via temporal in-painting. TBD-VLA significantly outperforms prior VLA approaches in both simulation and real-world manipulation tasks, offering a scalable path toward fast, temporally aware, discrete VLA models. Project webpage: https://tbd-vla.github.io/

2606.07893 2026-06-09 cs.CL 新提交

Beyond Individual Personas: Aligning Synthetic Dialogue to Population-Level Behavior Distributions

超越个体角色:将合成对话对齐到群体层面的行为分布

Xinyi Liu, Rinat Khaziev, Hooshang Nayyeri, Emine Yilmaz, Charith Peris, Hari Thadakamalla

发表机构 * Amazon(亚马逊) University of Illinois Urbana–Champaign(伊利诺伊大学厄巴纳-香槟分校) University College London(伦敦大学学院)

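AI summary: Proposes GroupPersona, which turns the behavior distribution of a reference corpus into generation controls so that synthetic dialogue corpora match the reference distribution at the population level, lowering Jensen-Shannon divergence over 12 behavior attributes by 24.4%.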

详情
AI中文摘要

合成对话语料库越来越多地被用作目标对话数据的代理,然而基于角色的生成器优化的是个体对话而非语料库组成,导致产生局部合理的对话但群体层面的行为混合失真。我们引入GroupPersona框架,该框架将合成对话语料库对齐到参考语料库的行为分布。GroupPersona将群体统计转化为生成控制:它将每个对话的核心行为特征与可预测的副作用分离,并利用由此产生的行为组来根据定义参考群体的交互模式调节用户代理。我们在四个跨越两种对话来源(助手风格和Reddit衍生)的语料库上评估GroupPersona,并采用两种构建变体:结构保持和变体增强。相对于最强的平均基线,GroupPersona将合成分布与参考分布在12个行为属性上的Jensen-Shannon散度从0.234降低到0.177,降低了24.4%,并且在所有四个语料库上达到最佳或并列最佳,同时保持结构对齐。它还在参考对话质量分数的校准上达到最接近,将参考对话轮廓的平均绝对偏差降低到0.63,而次优基线为0.91。

英文摘要

Synthetic dialogue corpora are increasingly used as proxies for target dialogue data, yet persona-grounded generators optimize individual conversations rather than corpus composition, yielding locally plausible dialogues with distorted population-level behavior mixes. We introduce GroupPersona, a framework that aligns synthetic dialogue corpora to the behavior distribution of a reference corpus. GroupPersona turns population statistics into generation controls: it separates each dialogue's core behavioral signature from predictable side effects, and uses the resulting behavioral groups to condition user agents on the interaction patterns that define the reference population. We evaluate GroupPersona on four corpora crossing two dialogue sources, assistant-style and Reddit-derived, with two construction variants: structure-preserving and variation-enhanced. GroupPersona lowers Jensen-Shannon divergence between synthetic and reference distributions over 12 behavior attributes from 0.234 to 0.177 relative to the strongest average baseline, a 24.4% reduction, and is best or tied-best on all four corpora while preserving structural alignment. It also achieves the closest calibration to reference-conversation quality scores, reducing mean absolute deviation from the reference-conversation profile to 0.63 versus 0.91 for the next-best baseline.
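
For readers who want to check the headline number, the sketch below computes a Jensen-Shannon divergence between two discrete distributions and the relative reduction quoted in the abstract. The base-2 logarithm is an assumption (the abstract does not state the base), and the function names are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1].

    p and q are probability vectors over the same categories.
    """
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0.0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before
```

Plugging in the reported values, `relative_reduction(0.234, 0.177)` is about 0.2436, which matches the stated 24.4% reduction.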

2606.07891 2026-06-09 cs.CV New submission

C3VD-DEFCOL: A Deformable Colonoscopy Dataset with Time-Resolved 3D Ground Truth and Realistic Appearance

Ethan Luk, Mayank V. Golhar, Anthony Song, Raúl Iranzo, Víctor M. Batlle, Lalithkumar Seenivasan, José M. M. Montiel, Nicholas J. Durr

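Affiliations: Johns Hopkins University; Universidad de Zaragoza

AI summary: Presents C3VD-DEFCOL, a framework and dataset that simulates peristaltic deformation and renders realistic textures, providing an evaluation platform with time-resolved ground truth for deformable colonoscopy 3D reconstruction.

Details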
Abstract

3D reconstruction could improve colonoscopy by estimating mucosal coverage and alerting clinicians to missed regions during screening. However, algorithm development is limited as no current datasets provide both a realistic in vivo appearance and dense, time-resolved 3D ground truth, especially under non-rigid deformation. We present C3VD-DEFCOL, a framework and dataset for evaluating deformable colonoscopy reconstruction with paired geometry and realistic texture. Starting from C3VD/C3VDv2 colon meshes and camera trajectories, we generate controlled deformations of the colon surface, including peristaltic waves and centerline motion, and render per-frame depth, surface normals, optical flow, camera poses, and time-stamped 3D meshes. We then use the rendered geometry, primarily depth, to condition an LTX-2.3-based sim-to-real translation model that produces RGB clips with in vivo-like mucosal color, texture, vasculature, and specular appearance while preserving the underlying 3D scene structure. The resulting dataset contains 110 videos from 11 unique colon mesh geometries, with varying camera trajectories, appearances, and parameterized deformation regimes, including three peristaltic severity levels that serve as controlled evaluation axes. We evaluate the generated videos using appearance realism, geometric consistency, and temporal consistency metrics, and use the paired ground truth to benchmark the downstream task of pose estimation in deformable 3D reconstruction. Our experiments show how pose estimation error increases with increasing deformation severity, providing a controlled stress test that is not possible with existing in vivo datasets. Overall, C3VD-DEFCOL is designed as a reproducible, quantitative evaluation platform for testing deformable 3D reconstruction algorithms, with the goal of reducing the domain gap between synthetic datasets and in vivo colonoscopy.

2606.07890 2026-06-09 cs.LG stat.ML New submission

Partially Performative Prediction

Jaewook Lee, Tijana Zrnic

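Affiliations: Stanford University

AI summary: Proposes partially performative prediction, a framework that jointly models endogenous distribution shift caused by model deployment and exogenous shift caused by external changes over time. It defines online notions of performative stability and optimality and analyzes when heuristics such as repeated retraining adapt.

Details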
Abstract

Performative prediction studies feedback loops that arise when predictive models are deployed in consequential domains. In these settings, deploying a model can change the population whose patterns the model aims to predict, inducing a distribution shift that is endogenous to the learning system. This perspective departs from classical treatments of distribution shift, where shifts are typically modeled as exogenous changes in the data-generating process. Yet, in practice, distribution shift is rarely one or the other. Predictive models may influence future data through the decisions they support, while the world itself continues to drift for reasons beyond the learner's control. We study partially performative prediction, a framework that captures both endogenous and exogenous sources of distribution shift. The framework generalizes performative prediction by allowing the data distribution to evolve both in response to the deployed model and according to an external, time-varying process. We extend the central notions of performative stability and performative optimality to this setting by defining their online analogues that track the evolving partially performative environment. We analyze practical learning heuristics, including repeated retraining, and characterize when they successfully adapt to partially performative environments.
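
To make the interplay of the two shift sources concrete, here is a toy sketch of repeated retraining. Everything in it is an assumption for illustration, not the paper's model: the data mean at round t is taken to be `drift(t) + eps * theta`, where `drift` is a hypothetical exogenous process and `eps` a hypothetical performative strength, and the learner minimizes squared loss, so each retraining step simply moves to the mean of the data its previous model induced.

```python
def repeated_retraining(theta0, drift, eps, rounds):
    """Iterate theta_{t+1} = argmin_theta E_{z ~ D_t(theta_t)} (z - theta)^2.

    Toy assumption: D_t(theta) has mean drift(t) + eps * theta, so the
    squared-loss minimizer is exactly that mean.
    """
    thetas = [theta0]
    for t in range(rounds):
        thetas.append(drift(t) + eps * thetas[-1])
    return thetas

def stable_point(mu, eps):
    """Fixed point of the retraining map when the exogenous mean is frozen at mu."""
    return mu / (1.0 - eps)

eps = 0.5

# Purely performative case: no exogenous drift, the iterates settle down.
static = repeated_retraining(0.0, lambda t: 1.0, eps, rounds=60)
print(static[-1], stable_point(1.0, eps))

# Partially performative case: the world also drifts linearly over time.
drifting = repeated_retraining(0.0, lambda t: 1.0 + 0.01 * t, eps, rounds=200)
lag = stable_point(1.0 + 0.01 * 200, eps) - drifting[-1]
print(lag)
```

In this toy setting the iterates converge to the frozen-environment fixed point when there is no drift, and trail the moving fixed point by a bounded lag when the drift is linear. Whether this matches the paper's online definitions is not something the abstract settles.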

2606.07882 2026-06-09 cs.CV cs.AI New submission

The Cross-Architecture Substrate: A Domain-Transcendent, Calibration-Surviving Geometric Invariant of Modern Vision Encoders

Yousef Radwan

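Affiliations: KAUST (King Abdullah University of Science and Technology)

AI summary: Finds that after training, the top 16 principal directions of modern vision encoders converge to the same 16-dimensional geometric object (the cross-architecture substrate). The substrate transports across visual domains, survives calibration, and is applied to label-free transferability filtering, domain detection, low-shot probing, and teacher-free distillation.

Details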
Comments
14 pages, 2 figures. 40th Conference on Neural Information Processing Systems (NeurIPS 2026)
Abstract

Different vision neural networks -- trained to classify, contrast, reconstruct, or match images to text -- should have correspondingly different internal representations. We report that they do not. After training, the top sixteen principal directions of variation inside thirteen modern vision encoders converge to the same sixteen-dimensional geometric object. We call this the cross-architecture substrate and study it with PCA, centred kernel alignment (CKA), and Pang 2026 calibration. The substrate transports across four visual domains (natural photographs, medical CT, satellite, microscopy) at median Procrustes-CKA 0.679, and across eight domains (adding sketches, depth, thermal infrared, astronomy) at 0.604, every pair >0.40. It survives Pang calibration globally (7.4x disc-vs-MAE separation, n=13,394) and locally (4.82-5.30, p<10^{-44}). It is not pixel statistics (0.263), not Gabor features (0.31), not a random projection (0.041), and emerges in the first 10% of training while accuracy keeps climbing. We deliver four applications: a label-free transferability filter beating LogME (3x faster, +0.15 Kendall-tau); a four-way domain detector (99.6% accuracy); a frozen low-shot probe (16 dims beat 768-dim DINOv2 by 3.78pp at N=50 labels per class); and a teacher-free distillation auxiliary matching trained-teacher KD on 33 pairs (7.56pp peak gain at 10% label fraction). The substrate does not cross modalities, does not help cross-paradigm distillation, and does not predict transfer quality (rho=0.08 against transfer accuracy).
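
As background for the similarity scores above, the sketch below implements plain linear CKA between two feature matrices whose rows are the same inputs. It is the standard linear variant only. The Procrustes-CKA variant and the Pang calibration named in the abstract are not reproduced here, and the small feature matrix is made up for illustration.

```python
import math

def center_columns(X):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]

def cross_cov_sq_norm(X, Y):
    """Squared Frobenius norm of X^T Y for row-aligned matrices."""
    total = 0.0
    for xcol in zip(*X):
        for ycol in zip(*Y):
            dot = sum(a * b for a, b in zip(xcol, ycol))
            total += dot * dot
    return total

def linear_cka(X, Y):
    """Linear CKA: ||Xc^T Yc||_F^2 / (||Xc^T Xc||_F * ||Yc^T Yc||_F)."""
    Xc, Yc = center_columns(X), center_columns(Y)
    num = cross_cov_sq_norm(Xc, Yc)
    den = math.sqrt(cross_cov_sq_norm(Xc, Xc) * cross_cov_sq_norm(Yc, Yc))
    return num / den

def rotate(X, angle):
    """Apply a 2D rotation to every row of X."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * a - s * b, s * a + c * b] for a, b in X]

# Hypothetical 2D features for five inputs (not from the paper).
feats = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [-1.0, 4.0], [2.0, -2.0]]
print(linear_cka(feats, rotate(feats, 0.7)))  # 1.0 up to floating-point error
```

Linear CKA is invariant to rotations and uniform rescaling of either representation, which is why a rotated copy of the same features scores 1. That invariance is what makes it usable for comparing encoders whose coordinate systems are unrelated.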