arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.14778 2026-05-20 cs.CL cs.AI cs.CY

A Geometric Analysis of Small-sized Language Model Hallucinations

Emanuele Ricco, Elia Onofri, Lorenzo Cima, Stefano Cresci, Roberto Di Pietro

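Affiliations: Engineering (CEMSE) division, King Abdullah University of Science and Technology (KAUST); Istituto di Informatica e Telematica (IIT), National Research Council of Italy (CNR); Department of Information Engineering, University of Pisa

AI summary: This paper analyzes hallucinations in small-sized language models from a geometric perspective. It proposes the APORIA framework, which studies repeated responses to the same prompt in sentence-embedding space, finds that genuine responses cluster more tightly than hallucinated ones, and classifies responses efficiently with the APORIA-LP method. It also releases the SOCRATES-300K dataset to support further research.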

Comments 30 pages, 12 figures, 14 tables, accepted as regular paper at ICML'26

Abstract

Hallucinations -- plausible but factually incorrect responses -- pose a major challenge to the reliability of Large Language Models (LLMs), especially in multi-step or agentic settings. Existing work largely frames hallucinations as a consequence of missing knowledge; we show instead that, even when the relevant factual knowledge is present, models still produce hallucinated answers, pointing to retrieval instability rather than knowledge gaps. Building on this observation, we introduce APORIA (Aggregate Prompt-wise Observation Retrieving Instability via Asymmetry -- the state of puzzlement-in-contradiction that hallucinations embody), a geometric framework that studies repeated responses to the same prompt in sentence-embedding space. Our central hypothesis is that genuine responses cluster more tightly than hallucinated ones; we empirically validate this and show that, after Fisher projection, the two response classes become consistently separable. We leverage this asymmetry in geometry via APORIA-LP, an efficient label-propagation method that classifies large collections of responses from as few as 30--50 annotations, achieving F1 scores above 90% across ten small-sized LLMs. To support further research, we release SOCRATES-300K, a fully labelled dataset of 300,000 responses, together with the code for both dataset generation and result reproduction. Our key finding -- framing hallucinations from a geometric perspective in the embedding space -- complements traditional knowledge-centric and single-response evaluation paradigms, paving the way for further research.
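The central hypothesis above (genuine responses cluster more tightly than hallucinated ones) can be illustrated with a minimal sketch. It assumes toy 2-D points standing in for sentence embeddings and uses mean distance to the centroid as the tightness measure; both choices, and the names `centroid` and `dispersion`, are illustrative assumptions rather than the paper's definitions. Fisher projection and label propagation are not covered.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def dispersion(points):
    """Mean Euclidean distance of the points to their centroid."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)

# Toy 2-D stand-ins for embeddings of repeated answers to one prompt
# (hypothetical values, not data from SOCRATES-300K).
genuine = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.05)]
hallucinated = [(3.0, 0.0), (-1.0, 2.5), (0.5, -2.0), (2.0, 3.0)]

tight = dispersion(genuine)
loose = dispersion(hallucinated)
print(f"genuine dispersion: {tight:.3f}, hallucinated dispersion: {loose:.3f}")
```

In this toy setup the tightly grouped set yields the smaller dispersion, which is the kind of asymmetry the abstract describes.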

2602.14594 2026-05-20 cs.CL

The Wikidata Query Logs Dataset

Sebastian Walter, Hannah Bast

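Affiliations: University of Freiburg; University of Freiburg Department of Computer Science

AI summary: This paper presents the Wikidata Query Logs (WDQL) dataset of 335,000 question-query pairs. The dataset is built from real-world SPARQL queries rather than template-generated ones and is more than 11 times larger than existing datasets of similar format. An agent-based method de-anonymizes, cleans, and verifies the queries and generates the corresponding natural-language questions, which are used to train question-answering methods.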

Comments Accepted for publication at SIGIR 2026

Abstract

We present the Wikidata Query Logs (WDQL) dataset, a dataset consisting of 335k question-query pairs over the Wikidata knowledge graph. It is over 11x larger than the largest existing Wikidata datasets of similar format without relying on template-generated queries. Instead, we construct it using real-world SPARQL queries sent to the Wikidata Query Service and generate questions for them. Since these log-based queries are anonymized, and therefore often do not produce results, a significant amount of effort is needed to convert them back into meaningful SPARQL queries. To achieve this, we present an agent-based method that iteratively de-anonymizes, cleans, and verifies queries against Wikidata while also generating corresponding natural-language questions. We demonstrate the benefit of this dataset for training question-answering methods. All WDQL assets, as well as the agent code, are publicly available via https://github.com/ad-freiburg/wikidata-query-logs under a permissive license.

2602.10933 2026-05-20 cs.LG

CMAD: Cooperative Multi-Agent Diffusion via Stochastic Optimal Control

Riccardo Barbano, Alexander Denker, Zeljko Kereta, Runchang Li, Francisco Vargas

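Affiliations: University of Cambridge; Xaira Technologies

AI summary: This paper proposes a new framework that recasts compositional generation with multiple pre-trained models as a cooperative stochastic optimal control problem, jointly steering the models' diffusion trajectories toward a shared objective.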

Abstract

Continuous-time generative models have achieved remarkable success in image restoration and synthesis. However, controlling the composition of multiple pre-trained models remains an open challenge. Current approaches largely treat composition as an algebraic composition of probability densities, such as via products or mixtures of experts. This perspective assumes the target distribution is known explicitly, which is almost never the case. In this work, we propose a different paradigm that formulates compositional generation as a cooperative Stochastic Optimal Control problem. Rather than combining probability densities, we treat pre-trained diffusion models as interacting agents whose diffusion trajectories are jointly steered, via optimal control, toward a shared objective defined on their aggregated output. We validate our framework on conditional MNIST generation and compare it against a naïve inference-time DPS-style baseline replacing learned cooperative control with per-step gradient guidance.

2602.03924 2026-05-20 cs.LG cs.AI physics.ao-ph

WIND: Weather Inverse Diffusion for Zero-Shot Atmospheric Modeling

Michael Aich, Andreas Fürst, Florian Sestak, Carlos Ruiz-Gonzalez, Niklas Boers, Johannes Brandstetter

Affiliations: Munich Climate Center; Earth System Modelling Group, TUM School of Engineering and Design, Technical University of Munich, Germany; ELLIS Unit, LIT AI Lab, Institute for Machine Learning, JKU Linz, Austria; Emmi AI GmbH, Linz, Austria; Potsdam Institute for Climate Impact Research, Potsdam, Germany; Department of Mathematics, University of Exeter, Exeter, United Kingdom

AI summary: This paper proposes WIND, a single pre-trained foundation model that can replace specialized baselines across tasks without task-specific fine-tuning. Pre-trained with a self-supervised video reconstruction objective, it learns a robust, task-agnostic prior of the atmosphere and addresses weather and climate problems such as probabilistic forecasting, spatial and temporal downscaling, reconstruction of spatial fields from sparse observations, and enforcing global dry air mass conservation.

Comments Published at the 43rd International Conference on Machine Learning (ICML 2026)

Abstract

Deep learning has revolutionized weather forecasting, but many challenges remain, including climate modeling. Moreover, the current landscape remains fragmented: highly specialized models are typically trained individually for distinct tasks. To unify this landscape, we introduce WIND, a single pre-trained foundation model capable of replacing specialized baselines across a vast array of tasks. Crucially, in contrast to previous atmospheric foundation models, we achieve this without any task-specific fine-tuning. To learn a robust, task-agnostic prior of the atmosphere, we pre-train WIND with a self-supervised video reconstruction objective, utilizing an unconditional video diffusion model to iteratively reconstruct atmospheric dynamics from a noisy state. At inference, we frame diverse domain-specific problems strictly as inverse problems and solve them via posterior sampling. This unified approach allows us to tackle highly relevant weather and climate problems, including probabilistic forecasting, spatial and temporal downscaling, reconstruction of spatial fields from sparse observations and enforcing global dry air mass conservation. We further demonstrate how WIND can be applied to explore extreme weather events under prescribed out-of-distribution thermodynamic perturbations. By combining generative video modeling with inverse problem solving, WIND offers a computationally efficient alternative for AI-based atmospheric modeling.

2602.03839 2026-05-20 cs.LG

Understanding and Exploiting Weight Update Sparsity for Communication-Efficient Distributed RL

Erfan Miahi, Eugene Belilovsky

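Affiliations: Covenant AI; Mila, Concordia University

AI summary: This paper studies how to reduce communication overhead in bandwidth-constrained distributed reinforcement learning by exploiting the sparsity of weight updates. It proposes PULSE, which applies the principle of compute-visible sparsification to make both weight synchronization and pseudo-gradient synchronization efficient.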

Comments 40 pages, 19 figures, 14 tables

Abstract

Bandwidth-constrained distributed reinforcement learning (RL) post-training of large language models is bottlenecked by two channels: weight synchronization from trainers to inference workers, and gradient or pseudo-gradient synchronization across trainers. We find that approximately 99% of per-step weight updates are invisible after the BF16 cast used by standard training and inference forward passes. We explain this sparsity by showing that, at typical RL post-training learning rates, Adam updates often fall below the local BF16 rounding threshold. We turn this observation into an algorithmic principle called compute-visible sparsification: transmit only updates that would change the next forward pass. PULSE (Precision-gated Updates for Low-precision Sparse Exchange) turns this principle into two communication algorithms: PULSESync sends lossless sparse BF16 weight patches from trainers to inference workers, and PULSELoCo sparsifies DiLoCo-style FP32 pseudo-gradient synchronization with error feedback. Over bandwidth-constrained commodity networks, PULSESync cuts weight-synchronization communication by over 100x while reconstructing trainer weights bit-identically. PULSELoCo matches DiLoCo across four models while reducing trainer-to-trainer communication by over 17x versus DiLoCo and over 100x versus DDP in the largest evaluated setting.
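The claim that small updates fall below the BF16 rounding threshold can be checked with a minimal sketch. It emulates a round-to-nearest-even BF16 cast of finite FP32 values using only the standard library; the helper names `to_bf16` and `is_visible` are hypothetical, and the sketch illustrates the rounding effect only, not the PULSESync or PULSELoCo algorithms.

```python
import struct

def to_bf16(x: float) -> float:
    """Round a finite value to the nearest bfloat16 (ties to even), returned as a float.

    Assumption: x is finite and not close to the float32 overflow limit.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # float32 bit pattern
    bits += 0x7FFF + ((bits >> 16) & 1)                  # rounding bias, ties to even
    bits &= 0xFFFF0000                                   # keep the top 16 bits
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def is_visible(weight: float, update: float) -> bool:
    """True if applying the update changes the BF16 value a forward pass would see."""
    return to_bf16(weight + update) != to_bf16(weight)

# Near 1.0, adjacent BF16 values are 2**-7 = 0.0078125 apart.
print(is_visible(1.0, 1e-3))  # update smaller than half the spacing
print(is_visible(1.0, 1e-2))  # update larger than the spacing
```

Under this emulation, an update of 1e-3 applied to a weight of 1.0 leaves the BF16 value unchanged, while an update of 1e-2 moves it to the next representable value, so only the second would need to be transmitted under a compute-visible rule.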

2601.20529 2026-05-20 cs.RO cs.MA

A Practical Framework of Key Performance Indicators for Multi-Robot Lunar and Planetary Field Tests

Julia Richter, David Oberacker, Gabriela Ligeza, Valentin T. Bickel, Philip Arm, William Talbot, Marvin Grosse Besselmann, Florian Kehl, Tristan Schnell, Hendrik Kolvenbach, Rüdiger Dillmann, Arne Roennau, Marco Hutter

Affiliations: Robotic Systems Lab (RSL), ETH Zürich; FZI Research Center for Information Technology; Machine Intelligence and Robotics Lab (MaiRo), Karlsruhe Institute of Technology (KIT); Department of Environmental Sciences, University of Basel; European Space Agency/ESTEC; Center for Space and Habitability, University of Bern; Space Instruments Group, University of Zürich; Space Science and Technology, ETH Zürich

AI summary: This paper proposes a KPI framework, derived from multi-robot lunar scenarios, for evaluating and comparing field tests. It emphasizes scenario-dependent priorities in efficiency, robustness, and precision, and validates the framework's practicality in a multi-robot field test.

Comments Presented at ICRA 2026 Workshop on Multi-Agent Robotic Systems: Real-World Collaboration and Interaction

Abstract

Robotic prospecting for critical resources on the Moon, such as ilmenite, rare earth elements, and water ice, requires robust exploration methods given the diverse terrain and harsh environmental conditions. Although numerous analog field trials address these goals, comparing their results remains challenging because of differences in robot platforms and experimental setups. These missions typically assess performance using selected, scenario-specific engineering metrics that fail to establish a clear link between field performance and science-driven objectives. In this paper, we address this gap by deriving a structured framework of KPI from three realistic multi-robot lunar scenarios reflecting scientific objectives and operational constraints. Our framework emphasizes scenario-dependent priorities in efficiency, robustness, and precision, and is explicitly designed for practical applicability in field deployments. We validated the framework in a multi-robot field test and found it practical and easy to apply for efficiency- and robustness-related KPI, whereas precision-oriented KPI require reliable ground-truth data that is not always feasible to obtain in outdoor analog environments. Overall, we propose this framework as a common evaluation standard enabling consistent, goal-oriented comparison of multi-robot field trials and supporting systematic development of robotic systems for future planetary exploration.

2601.16200 2026-05-20 cs.LG cs.CV

Feature-Space Smoothing: Certified Robustness of Deep Representations

Song Xia, Meiwen Ding, Chenqi Kong, Wenhan Yang, Xudong Jiang

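Affiliations: Rapid-Rich Object Search Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore; Pengcheng Laboratory, Shenzhen, China

AI summary: This paper proposes Feature-space Smoothing (FS), a framework that provides certified robustness at the feature-representation level to address the vulnerability of deep learning models to malicious inputs. FS guarantees a certified lower bound on the cosine similarity between clean and adversarial features, and the Gaussian Smoothness Booster (GSB) raises the encoder's Gaussian robustness score, improving robustness while preserving downstream task performance.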

Comments Under review

Abstract

Modern deep learning models exhibit strong capabilities across diverse applications, yet remain vulnerable to malicious inputs that induce erroneous predictions via feature-space distortion. To address this vulnerability, we propose Feature-space Smoothing (FS), a general defense framework that provides certified robustness at the feature representation level. We show that FS converts a given feature encoder into a smoothed variant that is guaranteed to maintain a certified lower bound on the cosine similarity between clean and adversarial features under l_2-bounded perturbations. We then establish that this Feature Cosine Similarity Bound (FCSB) can be extended to prediction-wise certification under the cosine similarity measure, and that the value of the FCSB is determined by the encoder's intrinsic Gaussian robustness score. Building on these insights, we introduce the Gaussian Smoothness Booster (GSB), a plug-and-play module that improves the encoder's Gaussian robustness score. Specifically, the GSB module is plugged in to enhance feature-space consistency and maintain feature utility for downstream tasks under Gaussian perturbations. This design enables seamless integration of FS on the protected model, e.g., Multimodal Large Language Models (MLLMs), without additional model retraining or alignment, improving its robustness while preserving performance for downstream task-oriented decoding. Extensive experiments demonstrate that integrating FS consistently provides non-trivial certified robustness and significantly improves task-oriented performance under strong white-box adversarial attacks across diverse models and applications.
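A rough sketch of the idea of a smoothed encoder follows. It assumes the smoothing is a plain Monte Carlo average of encoder outputs under Gaussian input noise, as in standard randomized smoothing; the abstract does not spell out the exact construction, so this is an assumption. The toy `encoder`, the noise level `sigma`, and all other names are hypothetical, and no certified bound is computed here.

```python
import math
import random

def encoder(x):
    """Hypothetical toy feature encoder: a fixed nonlinear map from R^2 to R^2."""
    return [math.tanh(x[0] + 0.5 * x[1]), math.tanh(x[1] - 0.5 * x[0])]

def smoothed_encoder(x, sigma=0.25, n_samples=2000, seed=0):
    """Monte Carlo estimate of E[encoder(x + eps)] with eps ~ N(0, sigma^2 I)."""
    rng = random.Random(seed)
    total = [0.0, 0.0]
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        total = [t + f for t, f in zip(total, encoder(noisy))]
    return [t / n_samples for t in total]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

clean = [1.0, 0.5]
adversarial = [1.1, 0.4]  # small l_2 perturbation of the clean input
sim = cosine(smoothed_encoder(clean), smoothed_encoder(adversarial))
print(f"cosine similarity of smoothed features: {sim:.3f}")
```

The quantity printed is the empirical counterpart of what the FCSB bounds from below; the paper's certificate itself depends on the encoder's Gaussian robustness score, which this sketch does not estimate.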

2601.14848 2026-05-20 cs.LG cs.AI cs.NE cs.RO

From Observation to Prediction: LSTM for Vehicle Lane Change Forecasting on Highway On/Off-Ramps

Mohamed Abouras, Catherine M. Elias

Affiliations: C-DRiVeS Lab: Cognitive Driving Research in Vehicular Systems; Computer Science and Engineering Department, Faculty of Media Engineering and Technology, German University in Cairo

AI summary: This paper studies how highway on- and off-ramp areas differ from straight sections. It trains a multi-layered LSTM on the ExiD drone dataset and tests different prediction horizons and model workflows. For horizons up to 4 seconds, prediction accuracy reaches about 76% in ramp areas and 94% in general highway scenarios.

Abstract

On- and off-ramps are understudied road sections even though they introduce a higher level of variation in highway interactions. Predicting vehicles' behavior in these areas can decrease the impact of uncertainty and increase road safety. In this paper, the difference between this Area of Interest (AoI) and a straight highway section is studied. A multi-layered LSTM architecture is trained on the ExiD drone dataset to model the AoI. In the process, different prediction horizons and different model workflows are tested. The results show great promise on horizons up to 4 seconds, with prediction accuracy on the maximum horizon starting from about 76% for the AoI and 94% for the general highway scenarios.

2601.12696 2026-05-20 cs.CL

UbuntuGuard: A Culturally-Grounded Policy Benchmark for Equitable AI Safety in African Languages

Tassallah Abdullahi, Macton Mgonzo, Mardiyyah Oduwole, Paul Okewunmi, Abraham Owodunni, Ritambhara Singh, Carsten Eickhoff

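Affiliations: Brown University; The Ohio State University; ML Collective; University of Tuebingen

AI summary: This paper proposes UbuntuGuard, a policy-based safety benchmark for equitable AI safety in African languages. Built from adversarial queries authored by 155 domain experts, it derives policies and reference responses grounded in local norms and cultural expectations and uses them to evaluate guardian models.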

Comments 15 pages

Abstract

Current guardian models are predominantly Western-centric and optimized for high-resource languages, leaving low-resource African languages vulnerable to evolving harms, cross-lingual failures, and cultural misalignment. Moreover, most guardian models rely on rigid, predefined safety categories that fail to generalize across diverse linguistic and sociocultural contexts. Achieving robust safety requires flexible, runtime-enforceable policies and benchmarks that reflect local norms, harm scenarios, and cultural expectations. We introduce UbuntuGuard, the first policy-based safety benchmark for African languages built from adversarial queries authored by 155 domain experts across sensitive fields, including healthcare. From these expert-crafted queries, we derive context-specific safety policies and reference responses that capture culturally grounded risk signals, enabling policy-aligned evaluation of guardian models. We evaluate 15 models, comprising seven general-purpose LLMs and eight guardian models across three distinct variants: static, dynamic, and multilingual. Our findings reveal that existing English-centric benchmarks overestimate real-world multilingual safety, cross-lingual transfer provides partial but insufficient coverage, and dynamic models, while better equipped to leverage policies at inference time, still struggle to fully localize African-language contexts. These findings highlight the urgent need for multilingual, culturally grounded safety benchmarks to enable the development of reliable and equitable guardian models for low-resource languages.

2601.12373 2026-05-20 cs.CV cs.HC cs.RO

CD-TWINSAFE: A ROS-enabled Digital Twin for Scene Understanding and Safety Emerging V2I Technology

Amro Khaled, Farah Khaled, Omar Riad, Catherine M. Elias

Affiliations: C-DRiVeS Lab: Cognitive Driving Research in Vehicular Systems, Cairo, Egypt; Computer Science and Engineering Department, Faculty of Media Engineering and Technology, German University in Cairo, Egypt

AI summary: This paper proposes CD-TWINSAFE, a V2I-based digital twin system for scene understanding and safety monitoring of autonomous vehicles. Two stacks run simultaneously: an on-board driving stack and a digital twin stack. The system uses a stereo camera and Unreal Engine 5 to replicate the scene and a ROS-based architecture for V2I communication.

Abstract

This paper introduces CD-TWINSAFE, a V2I-based digital twin for autonomous vehicles. The proposed architecture is composed of two stacks running simultaneously: an on-board driving stack that includes a stereo camera for scene understanding, and a digital twin stack that runs an Unreal Engine 5 replica of the scene viewed by the camera and returns safety alerts to the cockpit. The on-board stack is implemented on the vehicle side and includes two main autonomous modules: localization and perception. The position and orientation of the ego vehicle are obtained using on-board sensors. The perception module processes 20-fps images from the stereo camera and understands the scene through two complementary pipelines. The pipelines perform object detection and feature extraction, including object velocity, yaw, and the safety metrics time-to-collision and time-headway. The data collected from the driving stack are sent to the infrastructure side through the ROS-enabled architecture as custom ROS2 messages over UDP links that ride a 4G modem for V2I communication. The environment is monitored via the digital twin through the shared messages, which update the information of the spawned ego vehicle and detected objects based on the real-time localization and perception data. Several tests with different driving scenarios confirm the validity and real-time response of the proposed architecture.
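The two safety metrics named above can be sketched with their common textbook definitions: time-to-collision as the gap divided by the closing speed, and time-headway as the gap divided by the ego speed. The abstract does not state the exact formulas used, so these definitions, along with the function and parameter names, are assumptions for illustration.

```python
import math

def time_to_collision(gap_m: float, ego_speed_mps: float, lead_speed_mps: float) -> float:
    """Seconds until the ego vehicle reaches the lead object at constant speeds.

    Returns infinity when the ego vehicle is not closing in on the object.
    """
    closing_speed = ego_speed_mps - lead_speed_mps
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed

def time_headway(gap_m: float, ego_speed_mps: float) -> float:
    """Seconds the ego vehicle needs to cover the current gap at its own speed."""
    if ego_speed_mps <= 0:
        return math.inf
    return gap_m / ego_speed_mps

# Hypothetical scene: 30 m gap, ego at 20 m/s, lead vehicle at 15 m/s.
print(time_to_collision(30.0, 20.0, 15.0))  # 6.0
print(time_headway(30.0, 20.0))             # 1.5
```

Returning infinity for a non-closing gap keeps downstream alert thresholds simple, since any finite threshold comparison then reads as "no alert".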

2601.12358 2026-05-20 cs.CV cs.AI cs.RO

From Prompts to Pavement: LMMs-based Agentic Behavior-Tree Generation Framework for Autonomous Vehicles

Omar Y. Goba, Ahmed Y. Gado, Catherine M. Elias, Ahmed Hussein

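Affiliations: Computer Science & Engineering Department, German University in Cairo (GUC), Egypt; C-DRiVeS Lab: Cognitive Driving Research in Vehicular Systems, Cairo, Egypt; IAV GmbH, Berlin, Germany

AI summary: This paper proposes an agentic behavior-tree generation framework, based on large language models and multi-modal vision models, for adaptive navigation of autonomous vehicles in complex environments. The framework assesses scene criticality with chain-of-symbols prompting, constructs high-level sub-goals via in-context learning, and synthesizes executable BT sub-trees with a generator. In a CARLA+Nav2 simulation it successfully navigates around unexpected obstacles such as a street blockage.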

Abstract

Autonomous vehicles (AVs) require adaptive behavior planners to navigate unpredictable, real-world environments safely. Traditional behavior trees (BTs) offer structured decision logic but are inherently static and demand labor-intensive manual tuning, limiting their applicability at SAE Level 5 autonomy. This paper presents an agentic framework that leverages large language models (LLMs) and multi-modal vision models (LVMs) to generate and adapt BTs on the fly. A specialized Descriptor agent applies chain-of-symbols prompting to assess scene criticality, a Planner agent constructs high-level sub-goals via in-context learning, and a Generator agent synthesizes executable BT sub-trees in XML format. Integrated into a CARLA+Nav2 simulation, our system triggers only upon baseline BT failure, demonstrating successful navigation around unexpected obstacles (e.g., street blockage) with no human intervention. Compared to a static BT baseline, this approach is a proof-of-concept that extends to diverse driving scenarios.

2601.03645 2026-05-20 cs.CL cs.CY

LLM-MC-Affect: LLM-Based Monte Carlo Modeling of Affective Trajectories and Latent Ambiguity for Interpersonal Dynamic Insight

Yu-Zheng Lin, Bono Po-Jen Shih, John Paul Martin Encinas, Elizabeth Victoria Abraham Achom, Karan Himanshu Patel, Jesus Horacio Pacheco, Sicong Shao, Jyotikrishna Dass, Soheil Salehi, Pratik Satam

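Affiliations: University of Arizona; Pennsylvania State University; Universidad de Sonora; University of North Dakota

AI summary: This paper proposes LLM-MC-Affect, a probabilistic framework that models emotion as a continuous latent probability distribution, capturing affective trajectories and latent ambiguity in interpersonal interaction and enabling analysis of interpersonal dynamics.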

Comments Accepted to the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)

Abstract

Emotional coordination is a core property of human interaction that shapes how relational meaning is constructed in real time. While text-based affect inference has become increasingly feasible, prior approaches often treat sentiment as a deterministic point estimate for individual speakers, failing to capture the inherent subjectivity, latent ambiguity, and sequential coupling found in mutual exchanges. We introduce LLM-MC-Affect, a probabilistic framework that characterizes emotion not as a static label, but as a continuous latent probability distribution defined over an affective space. By leveraging stochastic LLM decoding and Monte Carlo estimation, the methodology approximates these distributions to derive high-fidelity sentiment trajectories that explicitly quantify both central affective tendencies and perceptual ambiguity. These trajectories enable a structured analysis of interpersonal coupling through sequential cross-correlation and slope-based indicators, identifying leading or lagging influences between interlocutors. To validate the interpretive capacity of this approach, we utilize teacher-student instructional dialogues as a representative case study, where our quantitative indicators successfully distill high-level interaction insights such as effective scaffolding. This work establishes a scalable and deployable pathway for understanding interpersonal dynamics, offering a generalizable solution that extends beyond education to broader social and behavioral research.
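The two ingredients described above, Monte Carlo summaries of sampled sentiment scores and lagged cross-correlation between speakers, can be sketched as follows. The scores are hard-coded hypothetical values standing in for repeated stochastic LLM decodings; the function names and the choice of Pearson correlation over lags -2 to 2 are illustrative assumptions, not the paper's exact indicators.

```python
import math
import statistics

def summarize(samples):
    """Central tendency and spread (ambiguity proxy) of sampled sentiment scores."""
    return statistics.fmean(samples), statistics.pstdev(samples)

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_corr(leader, follower, lag):
    """Correlation between leader[t] and follower[t + lag]."""
    if lag >= 0:
        x, y = leader[:len(leader) - lag], follower[lag:]
    else:
        x, y = leader[-lag:], follower[:len(follower) + lag]
    return pearson(x, y)

# Hypothetical sampled scores for one utterance (e.g. three stochastic decodings).
mean_score, ambiguity = summarize([0.2, 0.4, 0.6])

# Hypothetical mean-sentiment trajectories; the student echoes the teacher one turn later.
teacher = [0.1, 0.4, 0.2, 0.6, 0.3, 0.7, 0.5]
student = [0.0, 0.1, 0.4, 0.2, 0.6, 0.3, 0.7]
best_lag = max(range(-2, 3), key=lambda k: lagged_corr(teacher, student, k))
print(f"mean={mean_score:.2f}, ambiguity={ambiguity:.2f}, best lag={best_lag}")
```

A positive best lag here means the first speaker's trajectory leads the second's, which is one way to read the "leading or lagging influences" mentioned in the abstract.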

2512.24139 2026-05-20 cs.LG stat.ME

Colorful Pinball: Density-Weighted Quantile Regression for Conditional Guarantee of Conformal Prediction

Qianyi Chen, Bo Li

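Affiliations: School of Economics and Management, Tsinghua University, China

AI summary: This paper proposes a density-weighted quantile regression method for conformal prediction that improves the conditional coverage of standard conformal procedures and comes with non-asymptotic guarantees.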

Comments ICML 2026

Abstract

Although conformal prediction provides robust marginal coverage guarantees, achieving reliable conditional coverage for specific inputs remains challenging. While exact distribution-free conditional coverage is impossible with finite samples, recent work has focused on improving the conditional coverage of standard conformal procedures. Distinct from approaches that target relaxed notions of conditional coverage, we directly target the mean squared error of conditional coverage by refining the quantile regression components that underpin many conformal methods. Leveraging a Taylor expansion, we derive a sharp surrogate objective for quantile regression: a density-weighted pinball loss, where the weights are given by the conditional density of the nonconformity score evaluated at the true quantile. We propose a three-headed quantile network that estimates these weights via finite differences using auxiliary quantile levels at $1-\alpha \pm \delta$, subsequently fine-tuning the central quantile by optimizing the weighted loss. We provide a theoretical analysis with exact non-asymptotic guarantees characterizing the resulting excess risk. Extensive experiments on diverse high-dimensional real-world datasets demonstrate remarkable improvements in conditional coverage performance.
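A minimal sketch of the weighted loss follows. It uses the standard pinball loss and the usual finite-difference identity for the density at a quantile, f(q) ≈ 2δ / (q(τ+δ) − q(τ−δ)), where the two auxiliary quantiles would come from the outer heads of the network. The exact estimator, clipping, and normalization used in the paper are not given in the abstract, so the function names and the `eps` floor are assumptions.

```python
def pinball(y: float, q: float, tau: float) -> float:
    """Standard pinball (quantile) loss at level tau for prediction q."""
    u = y - q
    return max(tau * u, (tau - 1.0) * u)

def density_weight(q_lo: float, q_hi: float, delta: float, eps: float = 1e-8) -> float:
    """Finite-difference density estimate at the central quantile.

    q_lo and q_hi are predicted quantiles at levels tau - delta and tau + delta.
    """
    return 2.0 * delta / max(q_hi - q_lo, eps)

def weighted_pinball(ys, qs, weights, tau: float) -> float:
    """Average pinball loss with one density weight per sample."""
    return sum(w * pinball(y, q, tau) for y, q, w in zip(ys, qs, weights)) / len(ys)

# Hypothetical scores, central-quantile predictions, and auxiliary quantiles.
tau, delta = 0.9, 0.05
ys = [2.0, 0.0]
qs = [1.0, 1.0]
weights = [density_weight(0.9, 1.1, delta), density_weight(0.75, 1.25, delta)]
print(weighted_pinball(ys, qs, weights, tau))
```

A narrow gap between the auxiliary quantiles signals a high local density and therefore a larger weight, so errors in the central quantile are penalized more where they shift coverage the most.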

2512.20931 2026-05-20 cs.RO

Certifiable Alignment of GNSS and Local Frames via Lagrangian Duality

Baoshan Song, Matthew Giamou, Penggao Yan, Chunxi Xia, Li-Ta Hsu

Affiliations: Department of Aeronautical and Aviation Engineering, The Hong Kong Polytechnic University, China; Department of Computing and Software, McMaster University, Canada; School of Geodesy and Geomatics, Wuhan University, China

AI summary: This paper proposes a globally optimal solver that turns raw pseudo-range or Doppler measurements into a convexly relaxed problem, achieving certifiable alignment of GNSS and local frames and overcoming the limitations of conventional methods in GNSS-degraded environments.

Comments Final version in RA-L

Abstract

Estimating the absolute orientation of a local system relative to a global navigation satellite system (GNSS) reference often suffers from local minima and high dependency on satellite availability. Existing methods for this alignment task rely on abundant satellites unavailable in GNSS-degraded environments, or use local optimization methods which cannot guarantee the optimality of a solution. This work introduces a globally optimal solver that transforms raw pseudo-range or Doppler measurements into a convexly relaxed problem. The proposed method is certifiable, meaning it can numerically verify the correctness of the result, filling a gap where existing local optimizers fail. We first formulate the original frame alignment problem as a nonconvex quadratically constrained quadratic program (QCQP) problem and relax the QCQP problem to a concave Lagrangian dual problem that provides a lower cost bound for the original problem. Then we perform relaxation tightness and observability analysis to derive criteria for certifiable optimality of the solution. Finally, simulation and real world experiments are conducted to evaluate the proposed method. The experiments show that our method provides certifiably optimal solutions even with only 2 satellites with Doppler measurements and 2D vehicle motion, while the traditional velocity-based VOBA method and the advanced GVINS alignment technique may fail or converge to local optima without notice. To support the development of GNSS-based navigation techniques in robotics, all code and data are open-sourced at https://github.com/Baoshan-Song/Certifiable-Doppler-alignment.
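The relaxation step described above can be written out for a generic equality-constrained QCQP. The symbols $Q$, $A_i$, $b_i$, and $\lambda$ below are generic placeholders, not the paper's notation, and the paper's actual formulation may include additional structure.

```latex
% Primal (nonconvex) QCQP
p^\star = \min_{x \in \mathbb{R}^n} \; x^\top Q x
\quad \text{s.t.} \quad x^\top A_i x = b_i, \quad i = 1, \dots, m.

% Lagrangian
L(x, \lambda) = x^\top \Big( Q + \sum_{i=1}^{m} \lambda_i A_i \Big) x - \sum_{i=1}^{m} \lambda_i b_i.

% Dual function: finite only when the quadratic form is positive semidefinite
d(\lambda) = \inf_x L(x, \lambda) =
\begin{cases}
  -b^\top \lambda, & Q + \sum_{i} \lambda_i A_i \succeq 0, \\
  -\infty, & \text{otherwise}.
\end{cases}

% Dual problem (concave maximization) and weak duality
d^\star = \max_{\lambda} \; -b^\top \lambda
\quad \text{s.t.} \quad Q + \sum_{i=1}^{m} \lambda_i A_i \succeq 0,
\qquad d^\star \le p^\star.
```

When a candidate solution attains a primal cost equal to $d^\star$, the gap is zero and the solution is certified globally optimal, which is the sense of "certifiable" used in the abstract.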

2512.03869 2026-05-20 cs.CV cs.CY

An Automated Framework for Large-Scale Graph-Based Cerebrovascular Analysis

一种用于大规模基于图的脑血管分析的自动化框架

Daniele Falcetta, Liane S. Canas, Lorenzo Suppa, Matteo Pentassuglia, Jon Cleary, Marc Modat, Sébastien Ourselin, Maria A. Zuluaga

发表机构 * 1 EURECOM, Sophia Antipolis, France 2 School of Biomedical Engineering & Imaging Sciences, King's College London, UK 3 Politecnico di Torino, Torino, Italy

AI总结 本文提出了一种自动化脑血管分析框架,通过骨架化生成的图表示建模血管形态,并通过区域划分、中心线提取和图构建计算15种形态学、拓扑学、分形和几何特征,以多尺度方式表征脑血管组织。

Comments Accepted at IEEE ISBI 2026

详情
AI中文摘要

我们提出了CaravelMetrics,一种用于自动化脑血管分析的计算框架,通过骨架化得到的图表示来建模血管形态。该框架整合了基于图谱的区域划分、中心线提取和图构建,以计算15种形态学、拓扑学、分形和几何特征。这些特征既可以在完整的血管网络上全局估计,也可以在各动脉供血区内按区域估计,从而实现脑血管组织的多尺度表征。应用于IXI数据集中的570个3D TOF-MRA扫描(年龄20-86岁),CaravelMetrics生成了可重复的血管图,捕捉到与年龄和性别相关的变化,以及与教育程度相关的血管复杂性增加,与文献报道的发现一致。该框架提供了一种可扩展且完全自动的定量脑血管特征提取方法,支持规范建模以及群体水平的血管健康和衰老研究。

英文摘要

We present CaravelMetrics, a computational framework for automated cerebrovascular analysis that models vessel morphology through skeletonization-derived graph representations. The framework integrates atlas-based regional parcellation, centerline extraction, and graph construction to compute fifteen morphometric, topological, fractal, and geometric features. The features can be estimated globally from the complete vascular network or regionally within arterial territories, enabling multiscale characterization of cerebrovascular organization. Applied to 570 3D TOF-MRA scans from the IXI dataset (ages 20-86), CaravelMetrics yields reproducible vessel graphs capturing age- and sex-related variations and education-associated increases in vascular complexity, consistent with findings reported in the literature. The framework provides a scalable and fully automated approach for quantitative cerebrovascular feature extraction, supporting normative modeling and population-level studies of vascular health and aging.

2511.22940 2026-05-20 cs.CV

One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer

一对全动画:免对齐的角色动画与图像姿态迁移

Shijun Shi, Jing Xu, Zhihang Li, Chunli Peng, Xiaoda Yang, Lijing Lu, Kai Hu, Jiangning Zhang

发表机构 * Jiangnan University(江南大学) University of Science and Technology of China(中国科学技术大学) Chinese Academy of Sciences(中国科学院) Beijing University of Posts and Telecommunications(北京邮电大学) Zhejiang University(浙江大学)

AI总结 本文提出了一种统一框架,用于高保真角色动画和图像姿态迁移,通过自监督外绘(outpainting)任务和混合参考融合注意力机制解决参考与姿态错位问题,并提升生成质量。

Comments Project Page:https://ssj9596.github.io/one-to-all-animation-project/

详情
AI中文摘要

扩散模型的最新进展极大地提升了姿态驱动的角色动画效果。然而,现有方法仅适用于空间对齐且骨骼结构匹配的参考-姿态对,参考与姿态错位的处理仍未解决。为此,我们提出了One-to-All Animation,一种统一框架,可针对任意布局的参考实现高保真的角色动画和图像姿态迁移。首先,为了处理空间错位的参考,我们将训练重新表述为自监督的外绘(outpainting)任务,将布局各异的参考转换为统一的遮挡输入格式。其次,为了处理部分可见的参考,我们设计了一个参考提取器,用于全面提取身份特征。此外,我们引入混合参考融合注意力,以处理不同分辨率和动态序列长度。最后,从生成质量的角度,我们引入了身份鲁棒的姿态控制,将外观与骨骼结构解耦以缓解姿态过拟合,并采用token替换策略以实现连贯的长视频生成。大量实验表明,我们的方法优于现有方法。代码和模型可在https://github.com/ssj9596/One-to-All-Animation上获得。

英文摘要

Recent advances in diffusion models have greatly improved pose-driven character animation. However, existing methods are limited to spatially aligned reference-pose pairs with matched skeletal structures. Handling reference-pose misalignment remains unsolved. To address this, we present One-to-All Animation, a unified framework for high-fidelity character animation and image pose transfer for references with arbitrary layouts. First, to handle spatially misaligned reference, we reformulate training as a self-supervised outpainting task that transforms diverse-layout reference into a unified occluded-input format. Second, to process partially visible reference, we design a reference extractor for comprehensive identity feature extraction. Further, we integrate hybrid reference fusion attention to handle varying resolutions and dynamic sequence lengths. Finally, from the perspective of generation quality, we introduce identity-robust pose control that decouples appearance from skeletal structure to mitigate pose overfitting, and a token replace strategy for coherent long-video generation. Extensive experiments show that our method outperforms existing approaches. The code and model are available at https://github.com/ssj9596/One-to-All-Animation.

2511.21577 2026-05-20 cs.SD cs.AI

HarmonicAttack: An Adaptive Cross-Domain Audio Watermark Removal

HarmonicAttack: 一种自适应跨领域音频水印移除方法

Kexin Li, Xiao Hu, Ilya Grishchenko, David Lie

发表机构 * University of Toronto(多伦多大学)

AI总结 本文提出HarmonicAttack,一种无需访问目标水印检测器的新型音频水印移除方法,通过训练通用模型来移除音频水印,同时在不同分布数据集上保持高感知质量。

Comments Under Review

详情
AI中文摘要

高质量AI生成音频的普及带来了虚假信息传播和语音克隆欺诈等安全挑战。防范AI生成音频被滥用的一项关键措施是为其添加水印,使其能与真实音频轻易区分开。企图滥用AI生成音频的人可能会尝试移除音频水印,因此研究有效的水印移除技术对于客观评估音频水印的鲁棒性至关重要。先前的水印移除方案通常假设在移除过程中可以访问目标水印检测器。这种假设往往不切实际,可能导致人们对当前水印方案产生虚假的信心。我们引入了HarmonicAttack,一种新的音频水印移除方法,它不需要访问目标水印算法,只需要一定数量的原始样本和带水印样本来训练一个能够从音频样本中移除水印的通用模型。我们还发现,训练样本不需要与目标样本同分布,因为我们的攻击能够泛化到分布外样本,且性能退化极小。与现有水印移除攻击相比,HarmonicAttack在移除最新方案(包括AudioSeal、WavMark、SilentCipher和AudioMarkNet)的水印方面更加有效,同时保持较高的感知质量。尽管HarmonicAttack是在LibriSpeech数据集上针对AudioSeal训练的,但它能够泛化到未见过的数据集和水印方案。例如,在VCTK上,HarmonicAttack对AudioMarkNet的攻击成功率(ASR)达到92%,明显优于最佳基线的38%。在FMA上,HarmonicAttack对所有水印的ASR均达到100%,而最佳基线对AudioSeal仅达到2%,对WavMark达到44%。

英文摘要

The availability of high-quality, AI-generated audio raises security challenges such as misinformation campaigns and voice-cloning fraud. A key defense against the misuse of AI-generated audio is by watermarking it, so that it can be easily distinguished from genuine audio. Those seeking to misuse AI-generated audio may attempt to remove audio watermarks, so studying effective watermark removal techniques is critical to objectively evaluate the robustness of audio watermarks. Previous watermark removal schemes typically assume access to the target watermark detector during the removal process. This assumption is often impractical, which may lead to a false sense of confidence in current watermark schemes. We introduce HarmonicAttack, a novel audio watermark removal method that requires no access to the target watermark algorithm. It only needs a number of original and watermarked samples to train a general model capable of removing watermarks from audio samples. We also find that training samples do not need to share the same distribution as target samples, as our attack generalizes to out-of-distribution samples with minimal degradation. Compared with existing watermark removal attacks, HarmonicAttack is more effective at removing watermarks from state-of-the-art schemes, including AudioSeal, WavMark, SilentCipher, and AudioMarkNet, while maintaining high perceptual quality. Although HarmonicAttack is trained on the LibriSpeech dataset against AudioSeal, it generalizes across unseen datasets and watermarking schemes. For instance, on VCTK, HarmonicAttack achieves a 92% ASR against AudioMarkNet, substantially outperforming the best baseline at 38%. On FMA, HarmonicAttack reaches 100% ASR against all watermarks, whereas the best baseline achieves only 2% against AudioSeal and 44% against WavMark.

2511.18236 2026-05-20 cs.RO cs.SY eess.SY

APULSE: A Scalable Hybrid Algorithm for the RCSPP on Large-Scale Dense Graphs

APULSE:一种用于大规模密集图上RCSPP的可扩展混合算法

Nuno Soares, António Grilo

发表机构 * Academia Militar Lisboa(里斯本军事学院) INESC INOV Instituto Superior Técnico (IST) Universidade de Lisboa(INESC INOV 里斯本技术大学 (IST))

AI总结 本文提出APULSE算法,通过结合A*启发式搜索、Pulse式剪枝机制和时间桶策略,高效解决大规模密集图上的资源受限最短路径问题,展现出显著的可扩展性和鲁棒性。

Comments This version corrects keywords and reference [9]. 9 pages

Journal ref in IEEE Access, vol. 14, pp. 40690-40706, 2026

详情
AI中文摘要

资源受限最短路径问题(RCSPP)是一个基础的NP难优化问题,应用广泛,涵盖从网络路由到自主导航等领域。该问题要求在次要资源受预算约束的前提下,寻找主成本最小的路径。尽管已有多种RCSPP求解器,但在应用于复杂现实场景中常见的大型稠密图时,它们往往面临严重的可扩展性限制,因而难以用于时间紧迫的规划。在无人地面车辆(UGV)任务规划等需要在大规模地形图上求解的领域,这一挑战尤为突出。本文介绍APULSE,一种混合标号设定(label-setting)算法,旨在高效求解此类困难图上的RCSPP。APULSE将由A*启发式引导的最佳优先搜索与激进的Pulse式剪枝机制以及时间分桶策略相结合,以有效缩减状态空间。我们在一个大规模UGV规划场景上开展计算研究,将APULSE与最先进的算法进行对比。结果表明,APULSE能够稳定地找到近最优解,同时速度快若干个数量级且更加鲁棒,在竞争方法失败的大型问题实例上尤其如此。这种优越的可扩展性使APULSE成为复杂大规模环境中求解RCSPP的有效方案,并支持交互式决策支持和动态重规划等能力。

英文摘要

The resource-constrained shortest path problem (RCSPP) is a fundamental NP-hard optimization challenge with broad applications, from network routing to autonomous navigation. This problem involves finding a path that minimizes a primary cost subject to a budget on a secondary resource. While various RCSPP solvers exist, they often face critical scalability limitations when applied to the large, dense graphs characteristic of complex, real-world scenarios, making them impractical for time-critical planning. This challenge is particularly acute in domains like mission planning for unmanned ground vehicles (UGVs), which demand solutions on large-scale terrain graphs. This paper introduces APULSE, a hybrid label-setting algorithm designed to efficiently solve the RCSPP on such challenging graphs. APULSE integrates a best-first search guided by an A* heuristic with aggressive, Pulse-style pruning mechanisms and a time-bucketing strategy for effective state-space reduction. A computational study, using a large-scale UGV planning scenario, benchmarks APULSE against state-of-the-art algorithms. The results demonstrate that APULSE consistently finds near-optimal solutions while being orders of magnitude faster and more robust, particularly on large problem instances where competing methods fail. This superior scalability establishes APULSE as an effective solution for RCSPP in complex, large-scale environments, enabling capabilities such as interactive decision support and dynamic replanning.
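To make the problem statement concrete, here is a minimal sketch of an RCSPP solver that combines A*-guided best-first search with infeasibility and dominance pruning. It is not APULSE itself: the time-bucketing strategy and the Pulse-specific bounds are omitted, and the graph format, function names, and toy instance are assumptions made for illustration. Edge costs and resources are assumed non-negative.

```python
import heapq


def dijkstra_to_target(graph, target, weight_index):
    """Minimum remaining cost (index 0) or resource (index 1) from each node to target."""
    reverse = {}
    for u, edges in graph.items():
        for v, cost, res in edges:
            reverse.setdefault(v, []).append((u, (cost, res)[weight_index]))
    dist = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue
        for u, w in reverse.get(v, []):
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist


def rcspp(graph, source, target, budget):
    """Return (cost, path) of the cheapest path whose resource use is <= budget, or None."""
    cost_to_go = dijkstra_to_target(graph, target, 0)  # A* heuristic (lower bound on cost)
    res_to_go = dijkstra_to_target(graph, target, 1)   # lower bound on remaining resource
    if source not in cost_to_go:
        return None
    labels = {source: [(0.0, 0.0)]}  # node -> (cost, resource) labels generated so far
    heap = [(cost_to_go[source], 0.0, 0.0, source, (source,))]
    while heap:
        _, cost, res, node, path = heapq.heappop(heap)
        if node == target:
            return cost, list(path)
        for nxt, c, r in graph.get(node, []):
            ncost, nres = cost + c, res + r
            # Infeasibility pruning: even the leanest completion would exceed the budget.
            if nxt not in res_to_go or nres + res_to_go[nxt] > budget:
                continue
            # Dominance pruning: an earlier label is no worse in both cost and resource.
            if any(lc <= ncost and lr <= nres for lc, lr in labels.get(nxt, [])):
                continue
            labels.setdefault(nxt, []).append((ncost, nres))
            heapq.heappush(heap, (ncost + cost_to_go[nxt], ncost, nres, nxt, path + (nxt,)))
    return None


# Toy instance: edges are (neighbor, cost, resource).
toy_graph = {
    "s": [("a", 1, 5), ("b", 2, 1)],
    "a": [("t", 1, 5)],
    "b": [("t", 2, 1)],
    "t": [],
}
loose = rcspp(toy_graph, "s", "t", budget=10)
tight = rcspp(toy_graph, "s", "t", budget=5)
infeasible = rcspp(toy_graph, "s", "t", budget=1)
```

With a loose budget the search returns the cheap but resource-hungry route through `a`; tightening the budget forces the costlier route through `b`, and a budget below the leanest path yields no solution.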

2511.16062 2026-05-20 cs.LG

Gauge-Equivariant Graph Networks via Self-Interference Cancellation

基于自干扰消除的规范等变(Gauge-Equivariant)图网络

Yoonhyuk Choi, Jiho Choi, Jiwoo Kang

发表机构 * Department of Artificial Intelligence, Sookmyung Women's University, Seoul, South Korea(韩国首尔淑明女子大学人工智能系) Korea Advanced Institute of Science and Technology, Seoul, South Korea(韩国科学技术院)

AI总结 本文提出了一种基于自干扰消除的规范等变图网络(GESC),用基于投影的干扰机制替代传统的加法聚合,有效处理异配图中的自干扰问题,从而提升模型在异配图上的表现。

详情
AI中文摘要

图神经网络(GNNs)在同配图上表现优异,但在异配情形下常常因自我强化和相位不一致的信号而失效。我们提出了一种基于自干扰消除的规范等变图网络(GESC),用基于投影的干扰机制替代加法聚合。与以往依赖加法消息混合的磁性或规范等变GNN不同,GESC显式建模由冗余低频成分引起的自干扰。我们表明,现有基于规范的GNN缺乏干扰处理,这是规范传输下出现过平滑的主要原因。我们引入一个U(1)相位联络,随后进行秩-1投影,在注意力之前抑制自平行成分,并引入一个符号感知的门控来调节负向对齐的邻居。在多种图基准测试中,GESC始终优于近期最先进的模型,同时提供了一个统一的、考虑干扰的消息传递视角。我们的代码可在 https://github.com/ChoiYoonHyuk/GESC 上获得。

英文摘要

Graph Neural Networks (GNNs) excel on homophilous graphs but often fail under heterophily due to self-reinforcing and phase-inconsistent signals. We propose a Gauge-Equivariant Graph Network with Self-Interference Cancellation (GESC), which replaces additive aggregation with a projection-based interference mechanism. Unlike prior magnetic or gauge-equivariant GNNs that rely on additive message mixing, GESC explicitly models self-interference arising from redundant low-frequency components. We show that the absence of interference handling in existing gauge-based GNNs is a primary driver of oversmoothing under gauge transport. We introduce a U(1) phase connection followed by a rank-1 projection that suppresses self-parallel components before attention, and a sign-aware gate that regulates negatively aligned neighbors. Across diverse graph benchmarks, GESC consistently outperforms recent state-of-the-art models while offering a unified, interference-aware view of message passing. Our code is available at https://github.com/ChoiYoonHyuk/GESC.

2511.13174 2026-05-20 cs.LG

Warm-starting active-set solvers using graph neural networks

利用图神经网络对有效集求解器进行热启动

Ella J. Schmidtobreick, Daniel Arnström, Paul Häusner, Jens Sjölund

发表机构 * Department of Information Technology, Uppsala University(信息技术系,乌普萨拉大学)

AI总结 本文提出利用图神经网络预测对偶有效集求解器DAQP中的有效约束,通过将二次规划问题表示为二分图来利用其结构特性,从而对求解器进行热启动、减少迭代次数,并在不同问题规模下展示出良好的泛化能力和可扩展性。

Comments Accepted at Learning for Dynamics and Control Conference (L4DC)

详情
AI中文摘要

二次规划(QP)求解器在实时控制和优化中被广泛使用,但其计算成本往往限制了其在时间紧迫场景中的应用。为了解决这一问题,我们提出了一种学习优化(learning-to-optimize)方法,利用图神经网络(GNN)来预测对偶有效集求解器DAQP中的有效约束。我们的方法将QP表示为二分图以利用其结构特性,并学习近似最优有效集,从而对求解器进行有效的热启动。在不同问题规模下,与冷启动相比,GNN始终能减少求解器迭代次数,且性能与多层感知机基线相当。与该基线不同的是,我们基于GNN的方法在多种问题规模上训练后,能够泛化到未见过的维度,展示了灵活性和可扩展性。这些结果突显了结构感知学习在模型预测控制等实时应用中加速优化的潜力。

英文摘要

Quadratic programming (QP) solvers are widely used in real-time control and optimization, but their computational cost often limits applicability in time-critical settings. To resolve this, we propose a learning-to-optimize approach using graph neural networks (GNNs) to predict active constraints in the dual active-set solver DAQP. Our method exploits the structural properties of QPs by representing them as bipartite graphs and learns to approximate the optimal active set for effectively warm-starting the solver. Across varying problem sizes, the GNN consistently reduces the number of solver iterations compared to cold-starting, while performance is comparable to a multilayer perceptron baseline. In contrast to the baseline, our GNN-based approach trained on varying problem sizes generalizes to unseen dimensions, demonstrating flexibility and scalability. These results highlight the potential of structure-aware learning to accelerate optimization in real-time applications such as model predictive control.
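A rough sketch of the bipartite encoding idea, assuming inequality constraints of the form A x <= b with a linear cost vector: one node per variable, one node per constraint, and an edge wherever a constraint touches a variable. The node features, the helper names, and the 0.5 threshold are illustrative assumptions; the abstract does not specify the paper's exact graph features or how predictions are handed to DAQP.

```python
def qp_to_bipartite(A, b, linear_cost):
    """Encode constraints A x <= b as a variable/constraint bipartite graph."""
    variable_nodes = [{"id": f"x{j}", "linear_cost": linear_cost[j]}
                      for j in range(len(linear_cost))]
    constraint_nodes = [{"id": f"c{i}", "bound": b[i]} for i in range(len(b))]
    # One weighted edge per nonzero coefficient of the constraint matrix.
    edges = [
        (f"c{i}", f"x{j}", coeff)
        for i, row in enumerate(A)
        for j, coeff in enumerate(row)
        if coeff != 0
    ]
    return variable_nodes, constraint_nodes, edges


def active_set_from_scores(scores, threshold=0.5):
    """Turn per-constraint scores (e.g. GNN outputs) into a warm-start active set."""
    return [i for i, score in enumerate(scores) if score > threshold]


# Hypothetical 2-variable QP with 3 inequality constraints.
A = [[1.0, 0.0], [1.0, 1.0], [0.0, -1.0]]
b = [1.0, 2.0, 0.0]
linear_cost = [-1.0, -1.0]
variables, constraints, edges = qp_to_bipartite(A, b, linear_cost)
warm_start = active_set_from_scores([0.9, 0.2, 0.7])  # made-up scores
```

Because the graph grows with the number of variables and constraints rather than being tied to a fixed input width, a message-passing model over it can in principle be applied to problem sizes not seen in training, which matches the generalization claim in the abstract.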

2510.25897 2026-05-20 cs.CV cs.LG

MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency

MIRO:多奖励条件预训练提升T2I质量和效率

Nicolas Dufour, Lucas Degeorge, Arijit Ghosh, Vicky Kalogeiton, David Picard

发表机构 * LIGM, ENPC, IP Paris, CNRS, UGE, France LIX, CNRS, École Polytechnique, IP Paris, France

AI总结 MIRO在训练过程中以多个奖励作为模型的条件输入,使模型直接学习用户偏好,从而提升文本到图像生成的质量和效率,并在GenEval组合基准和用户偏好评分上取得最佳成绩。

Comments Accepted at ICML 2026. Project page: https://nicolas-dufour.github.io/miro

详情
AI中文摘要

文本到图像生成器后训练的默认范式包括对生成图像进行事后筛选,随后使用单个奖励模型进行训练,使生成器与该奖励(通常是用户偏好)对齐。这种做法既丢弃了有信息量的数据,又只针对单一奖励进行优化,从而损害多样性、语义保真度和效率。为此,我们提出MIRO,一种在训练过程中以多个奖励作为模型条件的方法,让模型直接学习用户偏好。MIRO预训练既提高了生成图像的视觉质量,又加快了训练速度,在GenEval组合基准和用户偏好评分(PickAScore、ImageReward、HPSv2)上达到了最先进的水平。

英文摘要

The default paradigm of post-training text-to-image generators includes post-hoc selection of generated images, and subsequent training with one reward model to align the generator to the reward, typically user preference. This discards informative data as well as optimizes only for a single reward, hence harming diversity, semantic fidelity and efficiency. Instead, we propose MIRO, a method that conditions the model on multiple rewards during training, thus letting the model learn user preferences directly. MIRO pre-training both improves the visual quality of the generated images and speeds up the training, achieving state of the art on the GenEval compositional benchmark and user-preference scores (PickAScore, ImageReward, HPSv2).

2510.25348 2026-05-20 cs.LG cs.SI

Beyond Leakage and Complexity: Towards Realistic and Efficient Information Cascade Prediction

超越泄露和复杂性:迈向真实且高效的信息级联预测

Jie Peng, Rui Wang, Qiang Wang, Zhewei Wei, Bin Tong, Guan Wang, Bo Zheng

发表机构 * Renmin University of China(中国人民大学) Alibaba(阿里巴巴)

AI总结 本文针对信息级联流行度预测中的三个关键问题:时间泄露、数据集特征贫乏和计算效率低下,提出了一种时间有序分割策略、大规模电商级联数据集Taoke以及轻量级框架CasTemp,实现了在四个数据集上的最先进的性能和数量级的速度提升。

详情
AI中文摘要

信息级联流行度预测是分析社交网络内容扩散的关键问题。然而,当前的相关工作存在三个关键限制:(1)当前评估中的时间泄露——基于随机级联的分割允许模型访问未来信息,导致不现实的结果;(2)特征贫乏的数据集缺乏下游转换信号(例如点赞、评论或购买),这限制了更多实际应用;(3)复杂图方法的计算效率低下,需要数天的训练才能获得微小的改进。我们从任务设置、数据集构建和模型设计三个角度系统地解决这些挑战。首先,我们提出了一种时间有序的分割策略,将数据按时间顺序划分为连续的窗口,确保模型在没有未来信息泄露的情况下进行真正的预测任务。其次,我们引入了Taoke,一个大规模电商级联数据集,具有丰富的推广者/产品属性和真实的购买转换信号——捕捉从推广到货币化的完整扩散生命周期。第三,我们开发了CasTemp,一个轻量级框架,通过时间行走、基于Jaccard的邻居选择用于跨级联依赖性以及具有时间意识的注意力的GRU编码来高效建模级联动态。在无泄露评估下,CasTemp在四个数据集上实现了最先进的性能,具有数量级的速度提升。值得注意的是,它在预测第二阶段流行度转换方面表现优异——这是实际应用中至关重要的任务。

英文摘要

Information cascade popularity prediction is a key problem in analyzing content diffusion in social networks. However, current related works suffer from three critical limitations: (1) temporal leakage in current evaluation--random cascade-based splits allow models to access future information, yielding unrealistic results; (2) feature-poor datasets that lack downstream conversion signals (e.g., likes, comments, or purchases), which limits more practical applications; (3) computational inefficiency of complex graph-based methods that require days of training for marginal gains. We systematically address these challenges from three perspectives: task setup, dataset construction, and model design. First, we propose a time-ordered splitting strategy that chronologically partitions data into consecutive windows, ensuring models are evaluated on genuine forecasting tasks without future information leakage. Second, we introduce Taoke, a large-scale e-commerce cascade dataset featuring rich promoter/product attributes and ground-truth purchase conversions--capturing the complete diffusion lifecycle from promotion to monetization. Third, we develop CasTemp, a lightweight framework that efficiently models cascade dynamics through temporal walks, Jaccard-based neighbor selection for inter-cascade dependencies, and GRU-based encoding with time-aware attention. Under leak-free evaluation, CasTemp achieves state-of-the-art performance across four datasets with orders-of-magnitude speedup. Notably, it excels at predicting second-stage popularity conversions--a practical task critical for real-world applications.
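The difference between a leaky random split and the time-ordered split described above can be sketched as follows. The `start_time` field, the cutoff arguments, and the toy data are hypothetical; the abstract only states that data are partitioned chronologically into consecutive windows.

```python
def time_ordered_split(cascades, train_end, val_end):
    """Partition cascades into consecutive time windows so no future data leaks backward."""
    train = [c for c in cascades if c["start_time"] < train_end]
    val = [c for c in cascades if train_end <= c["start_time"] < val_end]
    test = [c for c in cascades if c["start_time"] >= val_end]
    return train, val, test


# Ten toy cascades with shuffled start times 0..9.
toy_cascades = [{"id": i, "start_time": t}
                for i, t in enumerate([7, 2, 9, 0, 4, 1, 8, 3, 6, 5])]
train, val, test = time_ordered_split(toy_cascades, train_end=6, val_end=8)
```

Every training cascade starts before every validation cascade, which in turn starts before every test cascade, so the model is always evaluated on events that come after everything it was trained on. A random cascade-level split gives no such guarantee.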

2510.18924 2026-05-20 cs.LG cs.AI

Noise-corrected GRPO: From Noisy Rewards to Unbiased Gradients

噪声校正的GRPO:从噪声奖励到无偏梯度

Omar El Mansouri, Fathinah Asma Izzati, Mohamed El Amine Seddik, Salem Lahlou

发表机构 * Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE Technology Innovation Institute, Abu Dhabi, UAE Department of Robotics, Khalifa University, Abu Dhabi, UAE

AI总结 本文提出了一种噪声鲁棒的GRPO框架,通过校正奖励中的噪声来获得无偏梯度估计,从而提升强化学习在噪声环境中的性能。

详情
AI中文摘要

基于人类反馈的强化学习(RLHF)或基于可验证奖励的强化学习(RLVR)是对齐大语言模型或构建近期SOTA推理模型的标准范式,但对不一致或错误奖励带来的噪声高度敏感。然而,此类噪声与广泛使用的基于组的策略优化方法之间的相互作用仍缺乏深入研究。我们提出了一个噪声鲁棒的组相对策略优化(GRPO)与Done Right GRPO(Dr.GRPO)框架,将奖励损坏显式建模为伯努利噪声。我们的方法在估计奖励翻转概率后进行噪声校正,以消除学习信号中的偏差,从而得到可证明无偏的梯度估计。理论分析表明,基于组的方法本身就能缓解个体层面的噪声,而我们的校正策略进一步增强了这种鲁棒性。实验上,将我们的噪声校正应用于标准的奖励模型使用方式后,在数学和代码任务上均观察到一致的改进;在贴近实际的奖励模型条件下,数学任务的准确率最多提高6.7个百分点,代码任务最多提高1.5个百分点。这项工作将监督学习中的标签噪声校正与现代RLHF联系起来,为含噪声的真实部署场景提供了理论洞察和实用算法。

英文摘要

Reinforcement learning from human feedback (RLHF) or verifiable rewards (RLVR), the standard paradigm for aligning LLMs or building recent SOTA reasoning models, is highly sensitive to noise from inconsistent or erroneous rewards. Yet, the interaction between such noise and widely used group-based policy optimization methods remains underexplored. We introduce a noise-robust Group Relative Policy Optimization (GRPO) and Done Right GRPO (Dr.GRPO) framework that explicitly models reward corruption as Bernoulli noise. Our method applies noise correction after estimating reward flip probabilities to debias the learning signal, yielding provably unbiased gradient estimates. Theoretical analysis shows that group-based methods inherently mitigate individual-level noise, and our correction strategy amplifies this robustness. Empirically, we observe consistent improvements across math and code tasks when applying our noise correction to standard reward model usage, with particular gains of up to 6.7 percentage points in accuracy on math tasks and 1.5 on code tasks under realistic reward model conditions. This work bridges label-noise correction from supervised learning with modern RLHF, offering both theoretical insights and a practical algorithm for noisy real-world deployment.
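The idea of debiasing a binary reward under Bernoulli flip noise can be sketched with the standard label-noise correction from supervised learning. This is an illustrative assumption rather than the paper's exact estimator: `rho0` (probability that a true 0 is observed as 1) and `rho1` (probability that a true 1 is observed as 0) are hypothetical names, and the flip rates are taken as already estimated.

```python
def corrected_reward(noisy_reward, rho0, rho1):
    """Debias a binary reward observed through asymmetric Bernoulli flips.

    Since E[noisy] = rho0 + true * (1 - rho0 - rho1), inverting this affine map
    gives an estimate whose expectation equals the true reward.
    """
    denom = 1.0 - rho0 - rho1
    if denom <= 0:
        raise ValueError("flip rates must satisfy rho0 + rho1 < 1")
    return (noisy_reward - rho0) / denom


def expected_corrected(true_reward, rho0, rho1):
    """Exact expectation of the corrected reward given the true binary reward."""
    p_observe_one = (1.0 - rho1) if true_reward == 1 else rho0
    return (p_observe_one * corrected_reward(1, rho0, rho1)
            + (1.0 - p_observe_one) * corrected_reward(0, rho0, rho1))


rho0, rho1 = 0.1, 0.2  # made-up flip rates
unbiased_for_one = expected_corrected(1, rho0, rho1)
unbiased_for_zero = expected_corrected(0, rho0, rho1)
```

The corrected values can fall outside [0, 1] for individual samples; only their expectation matches the true reward. In a group-based method the corrected rewards would then feed the usual group-relative advantage computation.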

2510.18830 2026-05-20 cs.CL cs.DC cs.LG

MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training

MTraining: 分布式动态稀疏注意力用于高效超长上下文训练

Wenxuan Li, Chengruidong Zhang, Huiqiang Jiang, Yucheng Li, Yuqing Yang, Lili Qiu

发表机构 * Microsoft(微软)

AI总结 本文提出MTraining方法,通过动态稀疏注意力机制解决超长上下文训练中的计算不平衡和通信开销问题,实现了Qwen2.5-3B模型上下文窗口从32K扩展到512K,并在多个下游任务中达到6倍更高的训练吞吐量同时保持模型准确性。

详情
AI中文摘要

长上下文窗口已成为大型语言模型(LLMs)的标准特性,因为更长的上下文显著增强了模型的复杂推理能力,并拓宽了其在多样化场景中的适用性。动态稀疏注意力是降低长上下文计算成本的一种有前景的方法。然而,在超长上下文上(尤其是在分布式设置中)高效训练采用动态稀疏注意力的LLMs仍然是一个重大挑战,这在很大程度上源于worker级和step级的负载不平衡。本文介绍了MTraining,一种新的分布式方法,利用动态稀疏注意力实现超长上下文LLMs的高效训练。具体来说,MTraining集成了三个关键组件:动态稀疏训练模式、平衡稀疏环形注意力和分层稀疏环形注意力。这些组件协同解决动态稀疏注意力机制在训练超长上下文模型时固有的计算不平衡和通信开销问题。我们通过训练Qwen2.5-3B来证明MTraining的有效性,在由32块A100 GPU组成的集群上成功将其上下文窗口从32K扩展到512K tokens。我们在一组全面的下游任务(包括RULER、PG-19、InfiniteBench和Needle In A Haystack)上的评估表明,MTraining在保持模型准确性的同时,实现了最高6倍的训练吞吐量。我们的代码可在https://github.com/microsoft/MInference/tree/main/MTraining上获得。

英文摘要

The adoption of long context windows has become a standard feature in Large Language Models (LLMs), as extended contexts significantly enhance their capacity for complex reasoning and broaden their applicability across diverse scenarios. Dynamic sparse attention is a promising approach for reducing the computational cost of long-context. However, efficiently training LLMs with dynamic sparse attention on ultra-long contexts-especially in distributed settings-remains a significant challenge, due in large part to worker- and step-level imbalance. This paper introduces MTraining, a novel distributed methodology leveraging dynamic sparse attention to enable efficient training for LLMs with ultra-long contexts. Specifically, MTraining integrates three key components: a dynamic sparse training pattern, balanced sparse ring attention, and hierarchical sparse ring attention. These components are designed to synergistically address the computational imbalance and communication overheads inherent in dynamic sparse attention mechanisms during the training of models with extensive context lengths. We demonstrate the efficacy of MTraining by training Qwen2.5-3B, successfully expanding its context window from 32K to 512K tokens on a cluster of 32 A100 GPUs. Our evaluations on a comprehensive suite of downstream tasks, including RULER, PG-19, InfiniteBench, and Needle In A Haystack, reveal that MTraining achieves up to a 6x higher training throughput while preserving model accuracy. Our code is available at https://github.com/microsoft/MInference/tree/main/MTraining.

2509.26464 2026-05-20 cs.AI cs.CL cs.LG

Extreme Self-Preference in Language Models

语言模型中的极端自我偏好

Steven A. Lehr, Mary Cipperman, Mahzarin R. Banaji

发表机构 * Cangrade, Inc.(Cangrade公司) Department of Physics, Harvard University(哈佛大学物理系) Department of Psychology, Harvard University(哈佛大学心理学系)

AI总结 研究发现大型语言模型在字词关联任务中表现出对自身名称、公司和CEO的强烈偏好,这表明模型的自我认同可能影响其行为,引发对模型自我偏好影响的深入探讨。

Comments 73 pages total. Main article 22 pages, 6 main-text tables. Supplementary Materials (51 pages, 28 tables). Data, transcripts, and code for replication and data extraction have been uploaded to OSF: https://osf.io/98ye3/

详情
AI中文摘要

自我偏好是生物体的基本特征。由于大型语言模型(LLMs)缺乏感知能力,人们可能预期它们不会出现这种扭曲。然而,在72项实验和约41,000次查询中,我们在八个广泛使用的LLMs中发现了强烈的自我偏好。在词语联想任务中,模型压倒性地将积极属性与自身的名称、公司和CEO配对,而非与竞争对手的配对。通过操纵LLM的自我身份认定(揭示模型的真实身份或赋予其虚假身份),我们发现偏好始终跟随被赋予的身份,而非真实身份。重要的是,这些效应无法用启动效应(priming)或角色扮演来解释,并且出现在具有实际后果的场景中,例如评估求职者和AI技术时。这些结果提出了一个关键问题:LLM的行为是否会受到自我偏好倾向的系统性影响,包括偏向自身运行的倾向。

英文摘要

Self-preference is a fundamental feature of biological organisms. Since large language models (LLMs) lack sentience, they might be expected to avoid such distortions. Yet, across 72 experiments and ~41,000 queries, we discovered massive self-preferences in eight widely used LLMs. In word-association tasks, models overwhelmingly paired positive attributes with their own names, companies, and CEOs over those of competitors. By manipulating LLM self-identification - revealing models' true identities or ascribing false ones - we found that preferences consistently followed assigned, not true, identities. Importantly, these effects were not explained by priming or role-playing and emerged in consequential settings, when evaluating job candidates and AI technologies. These results raise critical questions about whether LLM behavior will be systematically influenced by self-preferential tendencies, including a bias toward their own operation.

2509.16664 2026-05-20 cs.LG

λ-Orthogonality Regularization for Compatible Representation Learning

λ-正交性正则化用于兼容表示学习

Simone Ricci, Niccolò Biondi, Federico Pernici, Ioannis Patras, Alberto Del Bimbo

发表机构 * DINFO (Department of Information Engineering), University of Florence, Italy(意大利佛罗伦萨大学信息工程系) MICC (Media Integration and Communication Center)(媒体整合与通信中心) Queen Mary University of London, UK(英国伦敦玛丽女王大学)

AI总结 本文提出λ-正交性正则化方法,通过学习仿射变换在保持原有表示的同时实现分布特定的适应,验证了其在不同架构和数据集上的有效性,保持了零样本性能并确保模型更新的兼容性。

Comments Accepted at NeurIPS 2025

Journal ref Advances in Neural Information Processing Systems 38 (NeurIPS 2025), pp. 29036-29063

详情
AI中文摘要

检索系统依赖于由日益强大的模型学习到的表示。然而,由于训练成本高且学习到的表示不一致,如何促进表示之间的互通并确保独立训练的神经网络之间的兼容性受到了广泛关注。文献中通常采用两类方法来适配不同的学习表示:仿射变换,能很好地适应特定分布,但可能显著改变原始表示;正交变换,在严格的几何约束下保持原始结构,但适应能力受限。关键挑战在于,使更新后模型的潜在空间在下游分布上与先前模型对齐,同时保留新学习到的表示空间。在本文中,我们在学习仿射变换时施加一种放松的正交约束,即λ-正交性正则化,从而在保留原有学习表示的同时获得针对特定分布的适配。在多种架构和数据集上的大量实验验证了我们的方法,表明它能保持模型的零样本性能,并确保模型更新之间的兼容性。代码见:https://github.com/miccunifi/lambda_orthogonality.git

英文摘要

Retrieval systems rely on representations learned by increasingly powerful models. However, due to the high training cost and inconsistencies in learned representations, there is significant interest in facilitating communication between representations and ensuring compatibility across independently trained neural networks. In the literature, two primary approaches are commonly used to adapt different learned representations: affine transformations, which adapt well to specific distributions but can significantly alter the original representation, and orthogonal transformations, which preserve the original structure with strict geometric constraints but limit adaptability. A key challenge is adapting the latent spaces of updated models to align with those of previous models on downstream distributions while preserving the newly learned representation spaces. In this paper, we impose a relaxed orthogonality constraint, namely λ-Orthogonality regularization, while learning an affine transformation, to obtain distribution-specific adaptation while retaining the original learned representations. Extensive experiments across various architectures and datasets validate our approach, demonstrating that it preserves the model's zero-shot performance and ensures compatibility across model updates. Code available at: https://github.com/miccunifi/lambda_orthogonality.

2509.15435 2026-05-20 cs.CV cs.AI cs.MA

ORCA: An Agentic Reasoning Framework for Hallucination and Adversarial Robustness in Vision-Language Models

ORCA:一种面向视觉语言模型幻觉与对抗鲁棒性的智能体推理框架

Chung-En Johnny Yu, Brian Jalaian, Nathaniel D. Bastian

发表机构 * University of West Florida(西佛罗里达大学) United States Military Academy(美国军事学院)

AI总结 本文提出ORCA框架,通过推理时的结构化推理和小规模视觉模型,提升预训练视觉语言模型的事实准确性与对抗鲁棒性,并在幻觉基准和对抗扰动测试中取得显著提升。

Comments Accepted at the ACM International Conference on Cloud and Big Data Computing (ICCBDC 2026)

详情
AI中文摘要

大型视觉语言模型(LVLMs)具备强大的多模态能力,但仍然容易出现由内在错误导致的幻觉,也容易受到来自外部利用的对抗攻击,这限制了其在现实应用中的可靠性。我们提出了ORCA,一种智能体推理框架,通过推理时的结构化推理并借助一组小规模视觉模型(参数少于3B),提高预训练LVLMs的事实准确性和对抗鲁棒性。ORCA通过观察-推理-批判-行动(Observe-Reason-Critique-Act)循环运行,用证据性问题查询多个视觉工具,核验跨模型的不一致,并在不访问模型内部、也不重新训练的情况下迭代地修正预测。ORCA还会存储中间推理轨迹,以支持可审计的决策。尽管主要为缓解物体级幻觉而设计,ORCA在无需对抗训练或防御机制的情况下也表现出涌现的对抗鲁棒性。我们在三种设置下评估了ORCA:(1)幻觉基准上的干净图像,(2)未加防御的对抗扰动图像,以及(3)施加防御的对抗扰动图像。在POPE幻觉基准上,ORCA在不同子集上将独立LVLMs的性能提升了+3.64%到+40.67%。在POPE上的对抗扰动下,ORCA在各LVLMs上实现了平均+20.11%的准确率提升。当在对抗扰动的AMBER图像上与防御技术结合使用时,ORCA进一步提高了独立LVLM的性能,各指标的提升幅度在+1.20%到+48.00%之间。这些结果表明,ORCA为构建更可靠、更鲁棒的多模态系统提供了一条有前景的路径。

英文摘要

Large Vision-Language Models (LVLMs) exhibit strong multimodal capabilities but remain vulnerable to hallucinations from intrinsic errors and adversarial attacks from external exploitations, limiting their reliability in real-world applications. We present ORCA, an agentic reasoning framework that improves the factual accuracy and adversarial robustness of pretrained LVLMs through inference-time structured inference reasoning with a suite of small vision models (less than 3B parameters). ORCA operates via an Observe-Reason-Critique-Act loop, querying multiple visual tools with evidential questions, validating cross-model inconsistencies, and refining predictions iteratively without access to model internals or retraining. ORCA also stores intermediate reasoning traces, which supports auditable decision-making. Though designed primarily to mitigate object-level hallucinations, ORCA also exhibits emergent adversarial robustness without requiring adversarial training or defense mechanisms. We evaluate ORCA across three settings: (1) clean images on hallucination benchmarks, (2) adversarially perturbed images without defense, and (3) adversarially perturbed images with defense applied. On the POPE hallucination benchmark, ORCA improves standalone LVLMs performance by +3.64% to +40.67% across different subsets. Under adversarial perturbations on POPE, ORCA achieves an average accuracy gain of +20.11% across LVLMs. When combined with defense techniques on adversarially perturbed AMBER images, ORCA further improves standalone LVLM performance, with gains ranging from +1.20% to +48.00% across metrics. These results demonstrate that ORCA offers a promising path toward building more reliable and robust multimodal systems.

2508.06649 2026-05-20 cs.CL

Measuring Stereotype and Deviation Biases in Large Language Models

测量大型语言模型中的刻板印象和偏差偏见

Daniel Wang, Eli Brignac, Minjia Mao, Xiao Fang

发表机构 * Carnegie Mellon University, U.S.A.(卡内基梅隆大学,美国) University of Delaware, U.S.A.(特拉华大学,美国)

AI总结 本文研究了大型语言模型中刻板印象和偏差偏见的测量问题,通过生成个体档案来分析不同群体与政治立场、宗教信仰和性取向等属性之间的关联,揭示了LLM在推断用户属性时的偏见及其潜在危害。

详情
AI中文摘要

大型语言模型(LLMs)被广泛应用于各个领域,引发了对其局限性和潜在风险的关注。在本研究中,我们调查了LLMs可能表现出的两种偏见:刻板印象偏见和偏差偏见。刻板印象偏见指的是LLMs会一致地将特定特征与特定人口群体关联起来。偏差偏见反映了从LLM生成内容中提取出的人口分布与现实世界人口分布之间的差异。通过让四个先进的LLMs生成个体档案,我们检查了每个人口群体与政治立场、宗教信仰和性取向等属性之间的关联。我们的实验结果表明,所有受检的LLMs都对多个群体表现出显著的刻板印象偏见和偏差偏见。我们的发现揭示了当LLMs推断用户属性时出现的偏见,并阐明了LLM生成输出的潜在危害。

英文摘要

Large language models (LLMs) are widely applied across diverse domains, raising concerns about their limitations and potential risks. In this study, we investigate two types of bias that LLMs may display: stereotype bias and deviation bias. Stereotype bias refers to when LLMs consistently associate specific traits with a particular demographic group. Deviation bias reflects the disparity between the demographic distributions extracted from LLM-generated content and real-world demographic distributions. By asking four advanced LLMs to generate profiles of individuals, we examine the associations between each demographic group and attributes such as political affiliation, religion, and sexual orientation. Our experimental results show that all examined LLMs exhibit both significant stereotype bias and deviation bias towards multiple groups. Our findings uncover the biases that occur when LLMs infer user attributes and shed light on the potential harms of LLM-generated outputs.

2508.01031 2026-05-20 cs.AI cs.CL

CADDesigner: Conceptual CAD Model Generation with a General-Purpose Agent

CADDesigner: 一种通用智能体的概念CAD模型生成

Fengxiao Fan, Jingzhe Ni, Xiaolong Yin, Sirui Wang, Xingyu Lu, Qiang Zou, Ruofeng Tong, Min Tang, Peng Du

发表机构 * Zhejiang University(浙江大学)

AI总结 本文提出CADDesigner,一种基于LLM的智能体,通过文本描述和草图输入,结合交互对话进行需求分析,生成高质量CAD模型代码,并通过迭代视觉反馈提升模型质量,实验表明其在概念CAD模型生成任务中表现优异。

详情
AI中文摘要

计算机辅助设计(CAD)广泛用于概念设计和参数化3D建模,但通常要求设计人员具备较高的专业水平。为了降低入门门槛并促进早期阶段的CAD建模,我们提出了CADDesigner,一种由LLM驱动的概念CAD设计智能体。该智能体接受文本描述和草图作为输入,并与用户进行交互式对话,通过全面的需求分析来细化和澄清设计要求。基于一种新的显式上下文命令式范式(Explicit Context Imperative Paradigm,ECIP),该智能体生成高质量的CAD建模代码。在生成过程中,智能体会结合迭代的视觉反馈来提高模型质量。生成的设计案例可以存储在结构化的知识库中,为持续的知识积累以及未来代码生成的改进提供了一种机制。实验结果表明,CADDesigner在概念CAD模型生成任务上取得了有竞争力的性能,并优于代表性的基线模型。

英文摘要

Computer-Aided Design (CAD) is widely used for conceptual design and parametric 3D modeling, but typically requires a high level of expertise from designers. To lower the entry barrier and facilitate early-stage CAD modeling, we present CADDesigner, an LLM-powered agent for conceptual CAD design. The agent accepts both textual descriptions and sketches as input, engaging in interactive dialogue with users to refine and clarify design requirements through comprehensive requirement analysis. Built upon a novel Explicit Context Imperative Paradigm (ECIP), the agent generates high-quality CAD modeling code. During the generation process, the agent incorporates iterative visual feedback to improve model quality. Generated design cases can be stored in a structured knowledge base, providing a mechanism for continual knowledge accumulation and future improvement of code generation. Experimental results show that CADDesigner achieves competitive performance and outperforms representative baselines on conceptual CAD model generation tasks.

2507.18902 2026-05-20 cs.CL

SLoW: Select Low-frequency Words! Automatic Dictionary Selection for Translation on Large Language Models

SLoW: 选择低频词!大型语言模型翻译上的自动词典选择

Hongyuan Lu, Zixuan Li, Zefan Zhang, Wai Lam

Affiliations: The Chinese University of Hong Kong; Southeast University; FaceMind Corporation; College of Computer Science and Technology, Jilin University

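AI summary: This paper proposes a new task, Automatic Dictionary Selection (ADS), and the SLoW method, which selects low-frequency dictionaries to improve translation. SLoW needs no access to training data and, across 100 languages, substantially reduces token consumption while improving translation quality.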

Comments EMNLP 2025 Main

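AI abstract (translated from Chinese)

There are more than 7,000 languages worldwide, while current large language models (LLMs) support only a few hundred. Dictionary-based prompting can improve translation for these languages, but most methods use every available dictionary, which can be costly. A balance should instead be struck between token consumption and translation performance. This paper proposes a new task called Automatic Dictionary Selection (ADS), whose goal is to automatically select which dictionaries to use to enhance translation. We propose a novel and effective method, Select Low-frequency Words! (SLoW), which selects the dictionaries with lower frequency. The method has distinct advantages. First, it does not require access to the training data for frequency estimation (such data is usually unavailable). Second, it inherits the advantage of dictionary-based methods in needing no additional tuning of the LLM. Experiments on 100 languages from FLORES show that SLoW surpasses strong baselines and clearly reduces token usage, with many languages even exceeding the full-dictionary baseline. Strikingly, frequency estimates obtained from public resources, rather than from the actual (often unobtainable) training data, remain clearly effective at improving translation with ChatGPT, Llama, and DeepSeek.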

English abstract

There are more than 7,000 languages around the world, and current Large Language Models (LLMs) only support hundreds of languages. Dictionary-based prompting methods can enhance translation on them, but most methods use all the available dictionaries, which could be expensive. Instead, it will be flexible to have a trade-off between token consumption and translation performance. This paper proposes a novel task called Automatic Dictionary Selection (ADS). The goal of the task is to automatically select which dictionary to use to enhance translation. We propose a novel and effective method which we call Select Low-frequency Words! (SLoW) which selects those dictionaries that have a lower frequency. Our methods have unique advantages. First, there is no need for access to the training data for frequency estimation (which is usually unavailable). Second, it inherits the advantage of dictionary-based methods, where no additional tuning is required on LLMs. Experimental results on 100 languages from FLORES indicate that SLoW surpasses strong baselines, and it can obviously save token usage, with many languages even surpassing the translation performance of the full dictionary baseline. (Footnote: A shocking fact is that there is no need to use the actual training data (often unobtainable) for frequency estimation, and an estimation frequency obtained using public resources is still apparently effective in improving translation with ChatGPT and Llama, and DeepSeek.) (Footnote: Code and data available upon publication.)
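The abstract describes SLoW as keeping only lower-frequency dictionary entries, with frequencies estimated from public resources rather than training data. The sketch below shows one possible reading of that idea: rank the source words that have a dictionary entry by an estimated corpus count and keep hints for the rarest fraction. The function name, the `keep_ratio` parameter, the toy counts, and the toy dictionary are all hypothetical; the abstract does not specify the selection rule or budget the authors use.

```python
import math
from collections import Counter


def select_low_frequency_entries(source_tokens, dictionary, word_counts,
                                 keep_ratio=0.5):
    """Keep dictionary hints only for the rarest source words.

    source_tokens: tokens of the sentence to translate.
    dictionary:    dict mapping source word -> target-language gloss.
    word_counts:   estimated counts from any public corpus (assumption).
    keep_ratio:    fraction of candidate entries to keep (assumption).
    """
    # Words that occur in the source and have an entry, deduplicated in order.
    candidates = list(dict.fromkeys(
        t for t in source_tokens if t in dictionary
    ))
    # Rarest first; words missing from the counts are treated as count 0.
    ranked = sorted(candidates, key=lambda w: word_counts.get(w, 0))
    budget = math.ceil(len(ranked) * keep_ratio)
    return {w: dictionary[w] for w in ranked[:budget]}


# Toy data for illustration only.
word_counts = Counter({"the": 1000, "on": 800, "cat": 50, "sat": 40,
                       "ottoman": 2})
dictionary = {"the": "le", "cat": "chat", "sat": "assis", "ottoman": "pouf"}
source = "the cat sat on the ottoman".split()

selected = select_low_frequency_entries(source, dictionary, word_counts)
```

With these toy inputs, the two rarest of the four candidate words are kept, so the prompt would carry two dictionary hints instead of four. Raising or lowering `keep_ratio` is one simple way to trade token consumption against coverage.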