arXivDaily arXiv Daily Academic Digest, updated Monday to Friday
2511.10292 2026-05-20 cs.CV cs.AI

Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models

Zhengtao Zou, Ya Gao, Jiarui Guan, Bin Li, Pekka Marttinen

Affiliations Aalto University, Espoo, Finland; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

AI summary This paper proposes the RUDDER framework, which counters visual dilution by creating a persistent visual anchor. It extracts a robust evidence direction from the model's prefill residual updates and injects it into decoding through an adaptive gate, mitigating hallucination while maintaining high throughput.

Comments Accepted by ICML 2026; Code available at: https://github.com/Akko000/RUDDER-Residual-Update-Directed-DEcoding-Regulation-

Abstract

Large Vision-Language Models (LVLMs) typically process visual inputs as a prefix to the language decoder. As the model autoregressively generates text, this initial visual information inevitably undergoes "dilution," leading the model to over-rely on language priors and hallucinate objects. Existing interventions attempt to correct this by contrasting logits or iteratively refining outputs, but they incur prohibitive latency costs. We propose Residual-Update Directed DEcoding Regulation (RUDDER), a framework that counters visual dilution by creating a persistent visual anchor. We extract a robust evidence direction (CARD) directly from the model's prefill residual updates and inject it into the decoding process. This injection is modulated by an adaptive gate, the Beta Gate, which acts as a trust mechanism and ensures the visual reminder is applied only when necessary. Experiments on LLaVA-1.5 (7B/13B), Idefics2, InstructBLIP, and Qwen2.5-VL demonstrate that RUDDER consistently mitigates hallucination (with greedy decoding, RUDDER reduces CHAIR_S by an average of 24.4% and CHAIR_i by 23.6% relative) and scales effectively across architectures, all while maintaining >96.0% throughput.
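For readers unfamiliar with the CHAIR metrics cited in this abstract, the sketch below shows how the two scores are commonly computed and how a relative reduction is derived from them. The function names, the input format, and the toy captions are illustrative assumptions and are not taken from the RUDDER code; object extraction from captions is assumed to have been done already.

```python
def chair_scores(captions):
    """Return (CHAIR_S, CHAIR_I) for a list of captions.

    Each caption is a pair (mentioned_objects, ground_truth_objects),
    both sets of object names. This input format is a hypothetical choice.
    """
    total_mentions = 0
    hallucinated_mentions = 0
    captions_with_hallucination = 0
    for mentioned, truth in captions:
        hallucinated = mentioned - truth  # objects named but not in the image
        total_mentions += len(mentioned)
        hallucinated_mentions += len(hallucinated)
        if hallucinated:
            captions_with_hallucination += 1
    # CHAIR_S: fraction of captions containing at least one hallucinated object
    chair_s = captions_with_hallucination / len(captions) if captions else 0.0
    # CHAIR_I: fraction of all mentioned objects that are hallucinated
    chair_i = hallucinated_mentions / total_mentions if total_mentions else 0.0
    return chair_s, chair_i


def relative_reduction(baseline, improved):
    """Relative drop of a score with respect to its baseline value."""
    return (baseline - improved) / baseline


# Toy data: the second caption mentions a lamp that is not in the image.
toy = [
    ({"dog", "frisbee"}, {"dog", "frisbee", "grass"}),
    ({"cat", "sofa", "lamp"}, {"cat", "sofa"}),
]
chair_s, chair_i = chair_scores(toy)
```

Lower is better for both scores. A relative reduction compares a method's score against the baseline score, so a drop from 50.0 to 37.8 would count as a 24.4% relative reduction.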

2511.06943 2026-05-20 cs.CV cs.AI

PlantTraitNet: An Uncertainty-Aware Multimodal Framework for Global-Scale Plant Trait Inference from Citizen Science Data

Ayushi Sharma, Johanna Trost, Daniel Lusk, Johannes Dollinger, Julian Schrader, Christian Rossi, Javier Lopatin, Etienne Laliberté, Simon Haberstroh, Jana Eichel, Daniel Mederer, Jose Miguel Cerda-Paredes, Shyam S. Phartyal, Lisa-Maricia Schwarz, Anja Linstädter, Maria Conceição Caldeira, Teja Kattenborn

Affiliations GeoSense-Freiburg

AI summary This study proposes PlantTraitNet, a multi-modal, multi-task, uncertainty-aware deep learning framework that uses weak supervision to predict four key plant traits (plant height, leaf area, specific leaf area, and nitrogen content) from citizen science photos. Spatial aggregation of the predictions yields global trait distribution maps, and validation shows that it outperforms existing trait maps on all evaluated traits.

Comments Accepted at the 40th AAAI Conference on Artificial Intelligence (AAAI-26). Link: https://ojs.aaai.org/index.php/AAAI/article/view/41272

Abstract

Global maps of plant traits, such as leaf nitrogen or plant height, are essential for understanding ecosystem processes, including the carbon and energy cycles of the Earth system. However, existing trait maps remain limited by the high cost and sparse geographic coverage of field-based measurements. Citizen science initiatives offer a largely untapped resource to overcome these limitations, with over 50 million geotagged plant photographs worldwide capturing valuable visual information on plant morphology and physiology. In this study, we introduce PlantTraitNet, a multi-modal, multi-task uncertainty-aware deep learning framework that predicts four key plant traits (plant height, leaf area, specific leaf area, and nitrogen content) from citizen science photos using weak supervision. By aggregating individual trait predictions across space, we generate global maps of trait distributions. We validate these maps against independent vegetation survey data (sPlotOpen) and benchmark them against leading global trait products. Our results show that PlantTraitNet consistently outperforms existing trait maps across all evaluated traits, demonstrating that citizen science imagery, when integrated with computer vision and geospatial AI, enables not only scalable but also more accurate global trait mapping. This approach offers a powerful new pathway for ecological research and Earth system modeling.

2511.06077 2026-05-20 cs.LG cs.IR

Make It Long, Keep It Fast: End-to-End 10K Long User Behavior Sequence Modeling for Billion-Scale Douyin Recommendation

Lin Guan, Jia-Qi Yang, Zhishan Zhao, Beichuan Zhang, Bo Sun, Xuanyuan Luo, Jinan Ni, Xiaowen Li, Yuhang Qi, Zhifang Fan, Hangyu Wang, Qiwei Chen, Yi Cheng, Feng Zhang, Xiao Yang

Affiliations ByteDance, Beijing, China; ByteDance, Shanghai, China; ByteDance, San Jose, CA, USA; ByteDance, Hangzhou, Zhejiang, China

AI summary This paper presents an end-to-end recommender system that handles user behavior sequences of up to 10,000 items. By introducing stacked target-to-history cross attention, request-level batching, and a length-extrapolative training strategy, it achieves efficient long-sequence modeling for large-scale Douyin recommendation.

Comments WWW 2026. This work studies end-to-end 10K-scale long user behavior sequence modeling for billion-scale industrial recommendation on Douyin

Abstract

Short-video recommenders such as Douyin must exploit extremely long user behavior histories without breaking latency or cost budgets. We present an end-to-end industrial recommender system that scales long-sequence recommendation modeling to 10K-length histories in production. First, we introduce Stacked Target-to-History Cross Attention (STCA), which replaces history self-attention with stacked cross-attention from the target to the history, reducing complexity from quadratic to linear in sequence length and enabling efficient end-to-end training over long user behavior sequences. Second, we propose Request Level Batching (RLB), a user-centric batching scheme that aggregates multiple targets for the same user/request to share the user-side encoding, substantially lowering sequence-related storage, communication, and compute without changing the learning objective. Third, we design a length-extrapolative training strategy (train on shorter windows, infer on much longer ones) so the model generalizes to 10K-scale histories without additional training cost. Across offline and online experiments, we observe predictable, monotonic gains as we scale history length and model capacity, mirroring the scaling law behavior observed in large language models. Deployed at full traffic on Douyin, our system delivers significant improvements on key engagement metrics while meeting production latency, demonstrating a practical path to scaling end-to-end ultra-long sequence recommendation to the 10K regime.
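The complexity claim for target-to-history cross attention can be illustrated with a small sketch. It is not the STCA implementation: learned projections, multiple heads, and feed-forward blocks are omitted, and the function names and the residual stacking rule are assumptions made for illustration. The point is only that a single target query needs one dot product per history item, so each layer costs time linear in the history length.

```python
import math


def target_to_history_attention(target, history):
    """A single target vector attends over all history vectors.

    Work is one dot product per history item, i.e. linear in len(history),
    whereas history self-attention would compare every pair of items.
    """
    dim = len(target)
    scores = [
        sum(t * h for t, h in zip(target, item)) / math.sqrt(dim)
        for item in history
    ]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    norm = sum(weights)
    weights = [w / norm for w in weights]
    context = [
        sum(w * item[d] for w, item in zip(weights, history))
        for d in range(dim)
    ]
    return context, weights


def stacked_cross_attention(target, history, num_layers=2):
    """Stack several cross-attention steps, refining the target query each time."""
    query = list(target)
    for _ in range(num_layers):
        context, _ = target_to_history_attention(query, history)
        query = [q + c for q, c in zip(query, context)]  # residual update (assumed)
    return query


history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [1.0, 0.0]
context, weights = target_to_history_attention(target, history)
output = stacked_cross_attention(target, history, num_layers=2)
```

In this toy run the first and third history items receive equal weight because they have the same dot product with the target, and the second item receives less.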

2511.01526 2026-05-20 cs.CL

Difficulty-Controllable Cloze Question Distractor Generation

Seokhoon Kang, Yejin Jeon, Seonjeong Hwang, Gary Geunbae Lee

Affiliations Graduate School of Artificial Intelligence, POSTECH, South Korea; Department of Computer Science and Engineering, POSTECH, South Korea; Mila Quebec AI Institute, Canada; McGill University, Canada

AI summary This paper proposes a framework for generating cloze-question distractors with controllable difficulty. Using data augmentation and a multitask learning strategy, it produces high-quality, difficulty-annotated distractors and outperforms GPT-4o at aligning distractor difficulty with human perception.

Comments Accepted to ACL 2026 Main Conference

Abstract

Multiple-choice cloze questions are commonly used to assess linguistic proficiency and comprehension. However, generating high-quality distractors remains challenging, as existing methods often lack adaptability and control over difficulty levels, and the absence of difficulty-annotated datasets further hinders progress. To address these issues, we propose a novel framework for generating distractors with controllable difficulty by leveraging both data augmentation and a multitask learning strategy. First, to create a high-quality, difficulty-annotated dataset, we introduce a two-way distractor generation process to produce diverse and plausible distractors. These candidates are filtered and then categorized by difficulty using an ensemble QA system. Second, this newly created dataset is used to train a difficulty-controllable generation model via multitask learning. Experimental results demonstrate that our method generates high-quality distractors across difficulty levels and substantially outperforms GPT-4o in aligning distractor difficulty with human perception.

2511.01126 2026-05-20 cs.LG cs.NA math.NA math.OC math.ST stat.TH

Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization

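Parvin Nazari, Bojian Hou, Davoud Ataee Tarzanagh, Li Shen, George Michailidis

Affiliations Amirkabir University of Technology; University of Pennsylvania; Samsung SDS Research America; University of California, Los Angeles

AI summary This paper introduces a novel search direction and shows that zeroth- and first-order stochastic online bilevel optimization algorithms using it achieve sublinear stochastic bilevel regret without window smoothing. The framework also improves efficiency by reducing oracle dependence in hypergradient estimation, updating inner and outer variables simultaneously, and using zeroth-order estimates of Hessians, Jacobians, and gradients.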

Comments Published at NeurIPS 2025

Abstract

Online bilevel optimization (OBO) is a powerful framework for machine learning problems where both outer and inner objectives evolve over time, requiring dynamic updates. Current OBO approaches rely on deterministic window-smoothed regret minimization, which may not accurately reflect system performance when functions change rapidly. In this work, we introduce a novel search direction and show that both first- and zeroth-order (ZO) stochastic OBO algorithms leveraging this direction achieve sublinear stochastic bilevel regret without window smoothing. Beyond these guarantees, our framework enhances efficiency by: (i) reducing oracle dependence in hypergradient estimation, (ii) updating inner and outer variables alongside the linear system solution, and (iii) employing ZO-based estimation of Hessians, Jacobians, and gradients. Experiments on online parametric loss tuning and black-box adversarial attacks validate our approach.
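As background for the zeroth-order (ZO) estimation mentioned in this abstract, the sketch below shows a generic Gaussian-smoothing gradient estimator that uses only function evaluations. It is a textbook-style construction, not the search direction or the estimators proposed in the paper; the function names, sample count, smoothing radius `mu`, and fixed seed are all assumptions chosen for illustration.

```python
import random


def zo_gradient(f, x, num_samples=20000, mu=1e-3, seed=0):
    """Estimate the gradient of f at x from function values only.

    For each random Gaussian direction u, a central finite difference along u
    approximates the directional derivative; averaging slope * u over many
    directions approximates the gradient.
    """
    rng = random.Random(seed)
    dim = len(x)
    estimate = [0.0] * dim
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(dim):
            estimate[i] += slope * u[i] / num_samples
    return estimate


def quadratic(x):
    """Toy objective with known gradient 2 * x."""
    return sum(xi * xi for xi in x)


grad_estimate = zo_gradient(quadratic, [1.0, -2.0])
```

For the toy quadratic the true gradient at (1, -2) is (2, -4), and the estimate should land close to it. The estimator is noisy, which is why black-box settings such as adversarial attacks typically average many directions.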

2510.25064 2026-05-20 cs.CL

Can LLMs Estimate Cognitive Complexity of Reading Comprehension Items?

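Seonjeong Hwang, Hyounghun Kim, Gary Geunbae Lee

Affiliations Graduate School of Artificial Intelligence, POSTECH, Republic of Korea; Department of Computer Science and Engineering, POSTECH, Republic of Korea

AI summary This paper examines whether large language models can estimate the cognitive complexity of reading comprehension items along two dimensions, Evidence Scope and Transformation Level. The results show that LLMs can approximate cognitive complexity, but that a gap exists between their reasoning ability and their metacognitive awareness.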

Comments ACL 2026 Main Conference

Abstract

Estimating the cognitive complexity of reading comprehension (RC) items is crucial for assessing item difficulty before items are administered to learners. Unlike syntactic and semantic features, such as passage length or semantic similarity between options, cognitive features that arise during answer reasoning are not readily extractable using existing NLP tools and have traditionally relied on human annotation. In this study, we examine whether large language models (LLMs) can estimate the cognitive complexity of RC items by focusing on two dimensions, Evidence Scope and Transformation Level, that indicate the degree of cognitive burden involved in reasoning about the answer. Our experimental results demonstrate that LLMs can approximate the cognitive complexity of items, indicating their potential as tools for prior difficulty analysis. Further analysis reveals a gap between LLMs' reasoning ability and their metacognitive awareness: even when they produce correct answers, they sometimes fail to correctly identify the features underlying their own reasoning process.

2510.23507 2026-05-20 cs.LG cs.AI cs.IT math.IT

A Deep Latent Factor Graph Clustering with Fairness-Utility Trade-off Perspective

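Siamak Ghodsi, Amjad Seyedi, Tai Le Quy, Fariba Karimi, Eirini Ntoutsi

Affiliations L3S Research Center; University of Mons; University of Koblenz; Bundeswehr University

AI summary This paper proposes DFNMF, an end-to-end deep nonnegative tri-factorization method for graphs that directly optimizes cluster assignments with a soft statistical-parity regularizer to balance fairness and utility. On synthetic and real networks it achieves higher group balance at comparable modularity.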

Comments Accepted to IEEE Big-Data 2025 main research track. The paper is 10 main pages and 4 pages of Appendix

Journal ref 2025 IEEE International Conference on Big Data (BigData)

Abstract

Fair graph clustering seeks partitions that respect network structure while maintaining proportional representation across sensitive groups, with applications spanning community detection, team formation, resource allocation, and social network analysis. Many existing approaches enforce rigid constraints or rely on multi-stage pipelines (e.g., spectral embedding followed by k-means), limiting trade-off control, interpretability, and scalability. We introduce DFNMF, an end-to-end deep nonnegative tri-factorization tailored to graphs that directly optimizes cluster assignments with a soft statistical-parity regularizer. A single parameter λ tunes the fairness-utility balance, while nonnegativity yields parts-based factors and transparent soft memberships. The optimization uses sparse-friendly alternating updates and scales near-linearly with the number of edges. Across synthetic and real networks, DFNMF achieves substantially higher group balance at comparable modularity, often dominating state-of-the-art baselines on the Pareto front. The code is available at https://github.com/SiamakGhodsi/DFNMF.git.
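To make the idea of a soft statistical-parity regularizer weighted by λ concrete, here is one possible form of such a penalty on soft cluster memberships. The exact regularizer used by DFNMF may differ; the squared-deviation form, the function names, and the toy data below are assumptions for illustration only.

```python
def parity_penalty(memberships, groups):
    """Penalize clusters whose group composition departs from the overall one.

    memberships: one row per node, one nonnegative soft weight per cluster.
    groups: the sensitive-group label of each node.
    Returns the sum, over groups and clusters, of the squared gap between the
    group's share of the cluster mass and its share of the whole graph.
    """
    num_nodes = len(memberships)
    num_clusters = len(memberships[0])
    penalty = 0.0
    for group in sorted(set(groups)):
        overall_share = sum(1 for g in groups if g == group) / num_nodes
        for c in range(num_clusters):
            cluster_mass = sum(row[c] for row in memberships)
            if cluster_mass == 0:
                continue  # an empty cluster contributes nothing
            group_mass = sum(
                row[c] for row, g in zip(memberships, groups) if g == group
            )
            penalty += (group_mass / cluster_mass - overall_share) ** 2
    return penalty


def total_objective(utility_loss, memberships, groups, lam):
    """Trade clustering utility against fairness with a single weight lam."""
    return utility_loss + lam * parity_penalty(memberships, groups)


hard_assignments = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
balanced = parity_penalty(hard_assignments, ["a", "b", "a", "b"])
segregated = parity_penalty(hard_assignments, ["a", "a", "b", "b"])
```

The penalty is zero when every cluster mirrors the overall group proportions and grows as clusters become segregated. Setting `lam` to zero recovers the plain utility objective, and larger values push the solution toward balance.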

2510.21464 2026-05-20 cs.CV

CXR-LanIC: Language-Grounded Interpretable Classifier for Chest X-Ray Diagnosis

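Yiming Tang, Wenjia Zhong, Rushi Shah, Dianbo Liu

Affiliations National University of Singapore

AI summary This paper proposes CXR-LanIC, a language-grounded interpretable classifier that addresses the interpretability challenge in chest X-ray diagnosis through task-aligned pattern discovery. It trains sparse autoencoders to extract interpretable visual patterns, achieving competitive diagnostic accuracy while laying the groundwork for natural language explanations.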

Abstract

Deep learning models have achieved remarkable accuracy in chest X-ray diagnosis, yet their widespread clinical adoption remains limited by the black-box nature of their predictions. Clinicians require transparent, verifiable explanations to trust automated diagnoses and identify potential failure modes. We introduce CXR-LanIC (Language-Grounded Interpretable Classifier for Chest X-rays), a novel framework that addresses this interpretability challenge through task-aligned pattern discovery. Our approach trains transcoder-based sparse autoencoders on a BiomedCLIP diagnostic classifier to decompose medical image representations into interpretable visual patterns. By training an ensemble of 100 transcoders on multimodal embeddings from the MIMIC-CXR dataset, we discover approximately 5,000 monosemantic patterns spanning cardiac, pulmonary, pleural, structural, device, and artifact categories. Each pattern exhibits consistent activation behavior across images sharing specific radiological features, enabling transparent attribution where predictions decompose into 20-50 interpretable patterns with verifiable activation galleries. CXR-LanIC achieves competitive diagnostic accuracy on five key findings while providing the foundation for natural language explanations through planned large multimodal model annotation. Our key innovation lies in extracting interpretable features from a classifier trained on specific diagnostic objectives rather than general-purpose embeddings, ensuring discovered patterns are directly relevant to clinical decision-making. This demonstrates that medical AI systems can be both accurate and interpretable, supporting safer clinical deployment through transparent, clinically grounded explanations.

2510.18821 2026-05-20 cs.LG

Search Self-play: Pushing the Frontier of Agent Capability without Supervision

Hongliang Lu, Yuhang Wen, Pengyu Cheng, Ruijin Ding, Jiaqi Guo, Haotian Xu, Chutian Wang, Haonan Chen, Xiaoxi Jiang, Guanjun Jiang

Affiliations Qwen Large Model Application Team, Alibaba

AI summary This paper proposes a self-play training method for deep search agents, in which the agent both generates and solves tasks, improving performance without any external supervision.

Comments Published as a conference paper at the Fourteenth International Conference on Learning Representations (ICLR 2026)

Abstract

Reinforcement learning with verifiable rewards (RLVR) has become the mainstream technique for training LLM agents. However, RLVR highly depends on well-crafted task queries and corresponding ground-truth answers to provide accurate rewards, which requires significant human effort and hinders the scaling of RL processes, especially in agentic scenarios. Although a few recent works explore task synthesis methods, the difficulty of generated agentic tasks can hardly be controlled to provide effective RL training advantages. To achieve agentic RLVR with higher scalability, we explore self-play training for deep search agents, in which the learning LLM utilizes multi-turn search engine calling and acts simultaneously as both a task proposer and a problem solver. The task proposer aims to generate deep search queries with well-defined ground-truth answers and increasing task difficulty. The problem solver tries to handle the generated search queries and output the correct answer predictions. To ensure that each generated search query has accurate ground truth, we collect all the search results from the proposer's trajectory as external knowledge, then conduct retrieval-augmented generation (RAG) to test whether the proposed query can be correctly answered with all necessary search documents provided. In this search self-play (SSP) game, the proposer and the solver co-evolve their agent capabilities through both competition and cooperation. With substantial experimental results, we find that SSP can significantly improve search agents' performance uniformly on various benchmarks without any supervision under both from-scratch and continuous RL training setups. The code is at https://github.com/Qwen-Applications/SSP.

2510.16814 2026-05-20 cs.LG cs.AI cs.CV

Needles in the Landscape: Semi-Supervised Pseudolabeling for Archaeological Site Discovery under Label Scarcity

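Simon Jaxy, Anton Theys, Patrick Willett, W. Chris Carleton, Ralf Vandam, Pieter Libin

Affiliations Sensors, Royal Military Academy, Brussels, Belgium; AMGC (Archaeology, Environmental Changes & Geo-Chemistry), Vrije Universiteit Brussel; Max Planck Institute of Geoanthropology, Jena, Germany

AI summary This paper proposes asymmetric dual pseudolabeling (DPL), an end-to-end deep learning method that learns from sparse positives directly from multi-band remote-sensing imagery, without hand-crafted feature engineering or assumptions about site absence, and evaluates it on two prominent archaeological datasets. On the Sagalassos dataset, DPL outperforms the LAMAP baseline by 12% in F1 and 29% in recall. On the Cyprus dataset, a pure PU setting without confirmed negatives, DPL recovers discrimination. DPL ensembles produce interpretable probability surfaces that support survey planning and enable effective site discovery from minimal labeled data.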

Abstract

Archaeological predictive modelling estimates where undiscovered sites are likely to occur by combining known locations with environmental and geospatial variables, presenting a positive-unlabeled (PU) learning challenge where confirmed sites are rare and most locations are unlabeled rather than truly negative. To overcome this, we propose asymmetric dual pseudolabeling (DPL), an end-to-end deep learning method that learns from sparse positives directly from multi-band geospatial imagery without hand-crafted feature engineering or assumptions about site absence, and evaluate on two prominent archaeological datasets. On the Sagalassos dataset, evaluated against an independent, held-out field survey, DPL outperforms the LAMAP baseline by 12% in F1 and 29% in Recall, while LAMAP maintains advantages in probability ranking. Standard supervised baselines fail catastrophically when negatives are uncertain; positive-only training collapses to predicting everywhere, establishing empirical bounds. On the Cyprus dataset, a pure PU setting without confirmed negatives, the supervised learning (SL) baseline inverts probability rankings while DPL recovers discrimination. DPL ensembles produce interpretable probability surfaces supporting survey planning, enabling effective site discovery from minimal labeled data.

2510.14261 2026-05-20 cs.CL

Rewriting History: A Recipe for Interventional Analyses to Study Data Effects on Model Behavior

Rahul Nadkarni, Yanai Elazar, Hila Gonen, Noah A. Smith

Affiliations Paul G. Allen School of Computer Science & Engineering, University of Washington; Department of Computer Science, Bar-Ilan University; Department of Computer Science, University of British Columbia; Allen Institute for Artificial Intelligence

AI summary This paper presents an experimental recipe for studying the relationship between training data and language model behavior: intervene on data batches (that is, rewrite history), retrain model checkpoints, and test hypotheses relating data to behavior. Case studies on factual knowledge acquisition show how the effect of data can be tested.

Comments Accepted to TACL, pre-MIT Press publication version

Abstract

We present an experimental recipe for studying the relationship between training data and language model (LM) behavior. We outline steps for intervening on data batches (i.e., "rewriting history") and then retraining model checkpoints over that data to test hypotheses relating data to behavior. Our recipe breaks down such an intervention into stages that include selecting evaluation items from a benchmark that measures model behavior, matching relevant documents to those items, and modifying those documents before retraining and measuring the effects. We demonstrate the utility of our recipe through case studies on factual knowledge acquisition in LMs, using both cooccurrence statistics and information retrieval methods to identify documents that might contribute to knowledge learning. Our results supplement past observational analyses that link cooccurrence to model behavior, while demonstrating that extant methods for identifying relevant training documents do not fully explain an LM's ability to correctly answer knowledge questions. Overall, we outline a recipe that researchers can follow to test further hypotheses about how training data affects model behavior. Our code is made publicly available to promote future work.

2510.13727 2026-05-20 cs.AI

From Refusal to Recovery: A Control-Theoretic Approach to Generative AI Guardrails

Ravi Pandya, Madison Bland, Duy P. Nguyen, Changliu Liu, Jaime Fernández Fisac, Andrea Bajcsy

Affiliations Robotics Institute, Carnegie Mellon University; Department of Electrical and Computer Engineering, Princeton University

AI summary This paper proposes control-theoretic guardrails for generative AI that monitor outputs in real time and proactively correct risky ones, offering a dynamic alternative to traditional flag-and-block approaches.

Journal ref Second International Association on Safe and Ethical AI Conference (IASEAI 2026)

Abstract

Generative AI systems are increasingly assisting and acting on behalf of end users in practical settings, from digital shopping assistants to next-generation autonomous cars. In this context, safety is no longer about blocking harmful content, but about preempting downstream hazards like financial or physical harm. Yet, most AI guardrails continue to rely on output classification based on labeled datasets and human-specified criteria, making them brittle to new hazardous situations. Even when unsafe conditions are flagged, this detection offers no path to recovery: typically, the AI system simply refuses to act, which is not always a safe choice. In this work, we argue that agentic AI safety is fundamentally a sequential decision problem: harmful outcomes arise from the AI system's continually evolving interactions and their downstream consequences on the world. We formalize this through the lens of safety-critical control theory, but within the AI model's latent representation of the world. This enables us to build predictive guardrails that (i) monitor an AI system's outputs (actions) in real time and (ii) proactively correct risky outputs to safe ones, all in a model-agnostic manner so the same guardrail can be wrapped around any AI model. We also offer a practical training recipe for computing such guardrails at scale via safety-critical reinforcement learning. Our experiments in simulated driving and e-commerce settings demonstrate that control-theoretic guardrails can reliably steer LLM agents clear of catastrophic outcomes (from collisions to bankruptcy) while preserving task performance, offering a principled dynamic alternative to today's flag-and-block guardrails.
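The monitor-and-correct pattern described in this abstract can be sketched as a simple safety filter. This is a toy illustration under stated assumptions, not the paper's method: there the safety value is learned with reinforcement learning over the model's latent state, whereas here `safety_value` is a hand-written function, and the rule of picking the safe action closest to the proposal is an assumption made for the example.

```python
def guarded_action(state, proposed, candidates, safety_value, threshold=0.0):
    """Pass a proposed action through, or correct it when it looks unsafe.

    safety_value(state, action) scores how safe the outcome is expected to be;
    values below the threshold are treated as unsafe (hypothetical convention).
    """
    if safety_value(state, proposed) >= threshold:
        return proposed  # monitoring only: safe proposals are left untouched
    safe = [a for a in candidates if safety_value(state, a) >= threshold]
    if safe:
        # Correct rather than refuse: stay as close to the proposal as possible.
        return min(safe, key=lambda a: abs(a - proposed))
    # No candidate is safe: fall back to the least unsafe option.
    return max(candidates, key=lambda a: safety_value(state, a))


def remaining_balance(balance, spend):
    """Toy e-commerce safety value: money left after the purchase."""
    return balance - spend


spend_options = [0, 50, 100, 150]
corrected = guarded_action(120, 150, spend_options, remaining_balance)
unchanged = guarded_action(120, 50, spend_options, remaining_balance)
```

With a balance of 120, a proposed spend of 150 would overdraw the account, so the filter substitutes the nearest affordable option instead of refusing outright, while a proposed spend of 50 passes through unchanged.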

2510.12773 2026-05-20 cs.CL cs.AI cs.LG

Dr.LLM: Dynamic Layer Routing in LLMs

Ahmed Heakl, Martin Gubri, Salman Khan, Sangdoo Yun, Seong Joon Oh

Affiliations Parameter Lab; MBZUAI; NAVER AI Lab; University of Tübingen; Tübingen AI Center

AI summary This paper proposes Dr.LLM, a framework that adds lightweight per-layer routers to a pretrained model to route layers dynamically. The routers are trained with explicit supervision and leave the base weights unchanged, improving both the compute efficiency and the accuracy of inference.

Comments Published at ICLR 2026

Abstract

Large Language Models (LLMs) process every token through all layers of a transformer stack, causing wasted computation on simple queries and insufficient flexibility for harder ones that need deeper reasoning. Adaptive-depth methods can improve efficiency, but prior approaches rely on costly inference-time search, architectural changes, or large-scale retraining, and in practice often degrade accuracy despite efficiency gains. We introduce Dr. LLM, Dynamic routing of Layers for LLMs, a retrofittable framework that equips pretrained models with lightweight per-layer routers deciding to skip, execute, or repeat a block. Routers are trained with explicit supervision: using Monte Carlo Tree Search (MCTS), we derive high-quality layer configurations that preserve or improve accuracy under a compute budget. Our design (windowed pooling for stable routing, focal loss with class balancing, and bottleneck MLP routers) ensures robustness under class imbalance and long sequences. On ARC (logic) and DART (math), Dr. LLM improves accuracy by up to 3.4 percentage points while saving 5 layers per example on average. Routers generalize to out-of-domain tasks (MMLU, GSM8k, AIME, TruthfulQA, SQuADv2, GPQA, PIQA, AGIEval) with only 0.85% accuracy drop while retaining efficiency, and outperform prior routing methods by up to 7.7 percentage points. Overall, Dr. LLM shows that explicitly supervised routers retrofit frozen LLMs for budget-aware, accuracy-driven inference without altering base weights. Code is available at https://github.com/parameterlab/dr-llm.
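The skip, execute, or repeat decision described in this abstract can be sketched as a forward loop over layers. This is a toy illustration, not the Dr.LLM code: layers are plain Python functions, the router is a fixed lookup table rather than a trained MLP, and treating "repeat" as running the block twice is an assumption.

```python
SKIP, EXECUTE, REPEAT = "skip", "execute", "repeat"


def run_with_routing(hidden, layers, router):
    """Run a stack of layers, letting a router choose how each one is applied.

    router(hidden, index) returns SKIP, EXECUTE, or REPEAT for the layer at
    that index. The base layers themselves are never modified.
    Returns the final hidden state and the number of layer calls made.
    """
    executed = 0
    for index, layer in enumerate(layers):
        decision = router(hidden, index)
        if decision == SKIP:
            continue  # bypass the block entirely
        passes = 2 if decision == REPEAT else 1
        for _ in range(passes):
            hidden = layer(hidden)
            executed += 1
    return hidden, executed


# Toy "layers" acting on a number instead of a tensor.
toy_layers = [lambda h: h + 1, lambda h: h * 2, lambda h: h - 3]
fixed_plan = [EXECUTE, SKIP, REPEAT]


def toy_router(hidden, index):
    """Stand-in for a trained router: looks up a fixed decision per layer."""
    return fixed_plan[index]


result, layer_calls = run_with_routing(1, toy_layers, toy_router)
```

Because the router only wraps the existing forward pass, the same loop works with frozen pretrained blocks, and the count of layer calls gives a direct measure of compute spent per example.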

2510.11344 2026-05-20 cs.CV

MMAP: A Multi-Magnification and Prototype-Aware Architecture for Predicting Spatial Gene Expression

Hai Dang Nguyen, Nguyen Dang Huy Pham, The Minh Duc Nguyen, Dac Thai Nguyen, Hang Thi Nguyen, Duong M. Nguyen

Affiliations Institute for AI Innovation and Societal Impact; Hanoi University of Science and Technology; Amsterdam High School for the Gifted; Anatomic Pathology Division, Laboratory Department, Vinmec Times City International Hospital; Vinmec Healthcare System; University of Illinois Urbana-Champaign

AI summary This paper proposes the MMAP architecture, which uses multi-magnification patches and prototype embeddings to address insufficient granularity of local features and inadequate coverage of global spatial context in spatial gene expression prediction. Experiments show that it outperforms existing state-of-the-art methods on multiple evaluation metrics.

Comments Received Best Paper Award at the 2025 Pacific Rim International Conference on Artificial Intelligence (PRICAI 2025)

Details
Abstract

Spatial Transcriptomics (ST) enables the measurement of gene expression while preserving spatial information, offering critical insights into tissue architecture and disease pathology. Recent developments have explored the use of hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) to predict transcriptome-wide gene expression profiles through deep neural networks. This task is commonly framed as a regression problem, where each input corresponds to a localized image patch extracted from the WSI. However, predicting spatial gene expression from histological images remains a challenging problem due to the significant modality gap between visual features and molecular signals. Recent studies have attempted to incorporate both local and global information into predictive models. Nevertheless, existing methods still suffer from two key limitations: (1) insufficient granularity in local feature extraction, and (2) inadequate coverage of global spatial context. In this work, we propose a novel framework, MMAP (Multi-MAgnification and Prototype-enhanced architecture), that addresses both challenges simultaneously. To enhance local feature granularity, MMAP leverages multi-magnification patch representations that capture fine-grained histological details. To improve global contextual understanding, it learns a set of latent prototype embeddings that serve as compact representations of slide-level information. Extensive experimental results demonstrate that MMAP consistently outperforms all existing state-of-the-art methods across multiple evaluation metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), and Pearson Correlation Coefficient (PCC).
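The three metrics named in the abstract have standard definitions, sketched below in plain Python for a single gene's predicted and measured expression values. The function names and the toy numbers are illustrative assumptions. How the paper aggregates these metrics across genes or spots is not stated here.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pcc(y_true, y_pred):
    """Pearson Correlation Coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Toy example: predictions offset from the measurements by a constant 0.5.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
print(mae(measured, predicted), mse(measured, predicted), pcc(measured, predicted))
```

A constant offset leaves PCC at 1 while MAE and MSE are nonzero, which is one reason to report the correlation alongside the error metrics.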

2510.09872 2026-05-20 cs.LG cs.AI

WARC-Bench: Web Archive Based Benchmark for GUI Subtask Executions

Sanjari Srivastava, Gang Li, Cheng Chang, Rishu Garg, Manpreet Kaur, Charlene Y. Lee, Yuezhang Li, Yining Mao, Ignacio Cases, Yanan Xie, Peng Qi

Affiliations Uniphore

AI summary This paper proposes WARC-Bench, a web-archive-based benchmark for GUI subtask execution that evaluates multimodal AI agents on 438 tasks. Experiments show that SFT and RLVR both improve subtask execution.

Details
Abstract

Training web agents to navigate complex, real-world websites requires them to master subtasks: short-horizon interactions on multiple UI components (e.g., choosing the correct date in a date picker, or scrolling in a container to extract information). We introduce WARC-Bench (Web Archive Benchmark), a novel web navigation benchmark featuring 438 tasks designed to evaluate multimodal AI agents on subtasks. WARC-Bench enables sandboxed interactions with dynamic and realistic webpages using Web ARChive files. We show that WARC-Bench is challenging for leading computer-use models, with the highest observed success rate being 64.8%. To improve open-source models on subtasks, we explore two common training techniques: supervised fine-tuning (SFT) and reinforcement learning with verifiable rewards (RLVR). Experiments show that SFT models obtain a 48.8% success rate on the benchmark. Training with RLVR over SFT checkpoints, even in data-scarce settings, improves the score to 52.8% on WARC-Bench, outperforming many frontier models. Our analysis concludes that mastering these subtasks is essential for robust web planning and navigation, and is a capability not extensively evaluated by existing benchmarks.

2510.09174 2026-05-20 cs.LG

Robustness and Regularization in Hierarchical Re-Basin

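Benedikt Franke, Florian Heinrich, Markus Lange, Arne Raulf

Affiliations German Aerospace Center (DLR) - Institute for AI Safety and Security

AI summary This paper studies robustness and regularization in Git Re-Basin model merging and proposes a hierarchical merging scheme that significantly outperforms the standard MergeMany algorithm. It finds that Re-Basin induces adversarial and perturbation robustness in the merged models, although the experiments show a larger performance drop than the original authors reported.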

Comments Published in 32nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2024

Details
Abstract

This paper takes a closer look at Git Re-Basin, an interesting new approach to merge trained models. We propose a hierarchical model merging scheme that significantly outperforms the standard MergeMany algorithm. With our new algorithm, we find that Re-Basin induces adversarial and perturbation robustness into the merged models, with the effect becoming stronger the more models participate in the hierarchical merging scheme. However, in our experiments Re-Basin induces a much bigger performance drop than reported by the original authors.

2510.08986 2026-05-20 cs.CL cs.CE cs.CY

CAPC-CG: A Large-Scale, Expert-Directed LLM-Annotated Corpus of Adaptive Policy Communication in China

Bolun Sun, Charles Chang, Yuen Yuen Ang, Ruotong Mu, Yuchen Xu, Zhengxin Zhang, Pingxu Hao

Affiliations Johns Hopkins University; Northwestern University; Duke Kunshan University

AI summary This paper introduces CAPC-CG, the first open annotated corpus of Chinese policy directives. Building on Ang's theory of adaptive policy communication, it labels clear and ambiguous language categories using a five-color taxonomy, and aims to support downstream tasks and multilingual NLP research.

Comments Accepted for publication in the Proceedings of ACL Main 2026

Details
Abstract

We introduce CAPC-CG, the Chinese Adaptive Policy Communication (Central Government) Corpus, the first open dataset of Chinese policy directives annotated with a five-color taxonomy of clear and ambiguous language categories, building on Ang's theory of adaptive policy communication. Spanning 1949-2023, this corpus includes national laws, administrative regulations, and ministerial rules issued by China's top authorities. Each document is segmented into paragraphs, producing a total of 3.3 million units. Alongside the corpus, we release comprehensive metadata, a two-round labeling framework, and a gold-standard annotation set developed by expert and trained coders. Inter-annotator agreement achieves a Fleiss's kappa of K = 0.86 on directive labels, indicating high reliability for supervised modeling. We provide baseline classification results with several large language models (LLMs), together with our annotation codebook, and describe patterns from the dataset. This release aims to support downstream tasks and multilingual NLP research in policy communication.
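The agreement figure quoted above is a Fleiss' kappa. For readers unfamiliar with the statistic, the sketch below computes it from a table of category counts per item using the standard formula. The toy count tables are made up for illustration and are unrelated to the corpus.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table counts[i][j] = number of raters who
    assigned item i to category j. Every item must have the same
    number of raters (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])

    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall category proportions.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cat)

    return (p_bar - p_e) / (1.0 - p_e)


# Toy tables: 2 items, 3 raters, 2 categories.
perfect = fleiss_kappa([[3, 0], [0, 3]])
split = fleiss_kappa([[2, 1], [1, 2]])
print(perfect, split)
```

The statistic is 1 under perfect agreement and can go negative when raters agree less often than chance would predict.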

2510.07538 2026-05-20 cs.CV

Low-Compute Watermark Removal via Dual-Domain Natural Projection

Pragati Shuddhodhan Meshram, Varun Chandrasekaran

Affiliations Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, USA

AI summary This paper proposes DAWN, a lightweight, training-free attack that projects watermarked images in complementary frequency and semantic spaces to remove watermarks at low computational cost while preserving structural and semantic integrity.

Details
Abstract

Effective removal of semantic watermarks requires balancing three competing objectives: high removal success, low perceptual distortion, and low computational cost. However, existing single-image attacks typically optimize only for the first two, achieving strong watermark suppression but relying on expensive, multi-step optimization that limits practical deployment. In this work, we show that this trade-off is fundamental: no current approach achieves all three properties simultaneously. We introduce DAWN, a lightweight, training-free attack that explicitly targets the low-cost regime while maintaining competitive removal performance. DAWN works by projecting a watermarked image onto natural-image priors in complementary frequency and semantic spaces, suppressing watermark signals that deviate from natural statistics, and then applying a decoupled perceptual-alignment step to restore visual consistency with minimal artifacts. Across diverse pixel-, frequency-, and latent-space watermarking schemes, DAWN consistently reduces detectability while preserving structural and semantic fidelity, demonstrating that efficient, low-resource watermark removal is feasible with only modest perceptual degradation. Our code is available at https://github.com/Pragati-Meshram/DAWN.

2510.05746 2026-05-20 cs.AI cs.CL cs.LG

ARM: Discovering Agentic Reasoning Modules for Generalizable Multi-Agent Systems

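Bohan Yao, Shiva Krishna Reddy Malay, Vikas Yadav

Affiliations University of Washington

AI summary This paper proposes a new paradigm for automatic multi-agent system design that focuses on optimizing Chain-of-Thought (CoT) reasoning to discover an Agentic Reasoning Module (ARM). The module is found by tree search over the code space and evolved using reflection on execution traces, which improves the generalization of the resulting multi-agent systems.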

Comments 29 pages, 2 figures

Details
Abstract

Large Language Model (LLM)-powered Multi-agent systems (MAS) have achieved state-of-the-art results on various complex reasoning tasks. Recent works have proposed techniques to automate the design of MASes, eliminating the need for manual engineering. However, these techniques perform poorly, often achieving similar or inferior performance to simple baselines. Furthermore, they require computationally expensive re-discovery of architectures for each new task domain and expensive data annotation on domains without existing labeled validation sets. A critical insight is that simple Chain of Thought (CoT) reasoning often performs competitively with these complex systems, suggesting that the fundamental reasoning unit of MASes, CoT, warrants further investigation. To this end, we present a new paradigm for automatic MAS design that pivots the focus to optimizing CoT reasoning. We introduce the Agentic Reasoning Module (ARM), an agentic generalization of CoT where each granular reasoning step is executed by a specialized reasoning module. This module is discovered through a tree search over the code space, starting from a simple CoT module and evolved using mutations informed by reflection on execution traces. The resulting ARM acts as a versatile reasoning building block which can be utilized as a direct recursive loop or as a subroutine in a learned meta-orchestrator. Our approach significantly outperforms both manually designed MASes and state-of-the-art automatic MAS design methods. Crucially, MASes built with ARM exhibit superb generalization, maintaining high performance across different foundation models and task domains without further optimization.

2510.05431 2026-05-20 cs.CL

Self-Filtered Distillation with LLMs-generated Trust Indicators for Reliable Patent Classification

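Yongmin Yoo, Xu Zhang, Longbing Cao

Affiliations Frontier AI Research Centre, School of Computing, Faculty of Science and Engineering, Macquarie University

AI summary This paper proposes Self-Filtered Distillation, which improves the reliability of patent classification by treating LLM-generated rationales as trust indicators rather than ground-truth labels. Experiments on the USPTO-2M dataset show up to 38.7% relative improvement in Macro-F1.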

Details
Abstract

Organizing large-scale patent corpora according to classification schemes is a core information management task that determines the accuracy and efficiency of prior art retrieval, technology knowledge discovery, and intellectual property decision-making. Recent approaches distill natural language rationales generated by large language models (LLMs) into compact student models, yet logical errors, label mismatches, and taxonomy misalignments inherent in these rationales are indiscriminately absorbed during training, undermining classification reliability and propagating errors throughout downstream information processes. Rather than correcting such errors post-hoc, we propose Self-Filtered Distillation (SFD), which embeds quality assurance directly into the learning process by reinterpreting LLM-generated rationales as trust indicators rather than ground-truth supervision. SFD integrates three unsupervised signals into a unified trust score that dynamically modulates each training instance's contribution: Self-Consistency, which quantifies agreement among independently generated rationales; Class Entailment Alignment, which evaluates semantic coherence between a rationale and its assigned CPC class definition; and LLM Agreement Scoring, which assesses external plausibility through an independent verifier. On the USPTO-2M benchmark comprising over two million patents, SFD achieves up to 38.7% relative improvement in Macro-F1 across four student architectures, and the strong correlation between trust scores and expert judgments ($r = 0.685$) confirms that the framework provides not only accurate predictions but also decomposable confidence semantics that enable auditable and self-documenting classification outcomes for large-scale patent knowledge organization.
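The idea of a trust score that modulates each training instance's contribution can be sketched as a weighted loss. The abstract does not say how the three signals are combined, so the unweighted mean used below is purely an assumption, as are the function names and the toy numbers.

```python
def trust_score(self_consistency, entailment, agreement):
    """Combine three signals in [0, 1] into one trust score.

    ASSUMPTION: a plain average. The actual aggregation rule used by
    SFD is not given in the abstract.
    """
    signals = (self_consistency, entailment, agreement)
    if not all(0.0 <= s <= 1.0 for s in signals):
        raise ValueError("signals must lie in [0, 1]")
    return sum(signals) / len(signals)


def trust_weighted_loss(losses, trusts):
    """Average per-instance losses, weighting each by its trust score."""
    total = sum(trusts)
    if total == 0:
        return 0.0  # no trusted instances in this batch
    return sum(l * t for l, t in zip(losses, trusts)) / total


# Toy batch: one well-supported rationale and one dubious rationale.
trusts = [trust_score(0.9, 0.8, 1.0), trust_score(0.2, 0.1, 0.0)]
batch_loss = trust_weighted_loss([0.5, 3.0], trusts)
print(trusts, batch_loss)
```

In this toy batch the dubious instance has a large raw loss but a low trust score, so it moves the batch loss far less than it would under uniform weighting.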

2510.03824 2026-05-20 cs.LG cs.AI stat.ML

Proximal Diffusion Neural Sampler

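Wei Guo, Jaemoo Choi, Yuchen Zhu, Molei Tao, Yongxin Chen

Affiliations Georgia Institute of Technology

AI summary This paper proposes the Proximal Diffusion Neural Sampler (PDNS), a framework that applies the proximal point method on the space of path measures to address multimodal targets and mode collapse in neural sampler training. It approaches the target distribution through a staged sequence of simpler subproblems, promoting thorough exploration of all modes.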

Comments Accepted at ICLR 2026 (https://openreview.net/forum?id=XTHQqS7ObC)

Details
Abstract

The task of learning a diffusion-based neural sampler for drawing samples from an unnormalized target distribution can be viewed as a stochastic optimal control problem on path measures. However, the training of neural samplers can be challenging when the target distribution is multimodal with significant barriers separating the modes, potentially leading to mode collapse. We propose a framework named Proximal Diffusion Neural Sampler (PDNS) that addresses these challenges by tackling the stochastic optimal control problem via proximal point method on the space of path measures. PDNS decomposes the learning process into a series of simpler subproblems that create a path gradually approaching the desired distribution. This staged procedure traces a progressively refined path to the desired distribution and promotes thorough exploration across modes. For a practical and efficient realization, we instantiate each proximal step with a proximal weighted denoising cross-entropy (WDCE) objective. We demonstrate the effectiveness and robustness of PDNS through extensive experiments on both continuous and discrete sampling tasks, including challenging scenarios in molecular dynamics and statistical physics. Our code is available at https://github.com/AlexandreGUO2001/PDNS.
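To make the "series of simpler subproblems" concrete, here is a generic proximal point iteration written for distributions. It assumes the objective is a KL divergence to the target $P^\ast$ and that the proximity term is also a KL divergence with step size $\eta > 0$. Both choices are assumptions for illustration, and the paper's exact formulation on path measures may differ.

```latex
% Generic proximal point step (assumed KL objective and KL proximity term)
P_{k+1} \;=\; \arg\min_{P}\;
  \mathrm{KL}\!\left(P \,\|\, P^{\ast}\right)
  \;+\; \frac{1}{\eta}\,\mathrm{KL}\!\left(P \,\|\, P_{k}\right)

% Setting the first variation to zero gives a geometric interpolation:
P_{k+1} \;\propto\; P_{k}^{\frac{1}{1+\eta}}\,\left(P^{\ast}\right)^{\frac{\eta}{1+\eta}}
```

Under these assumptions each step moves only part of the way from the current iterate toward the target, which matches the staged, gradually approaching path described in the abstract.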

2510.03485 2026-05-20 cs.AI

Learning Efficient Guardrails for Compliance

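Xiaofei Wen, Wenjie Jacky Mo, Yanan Xie, Peng Qi, Muhao Chen

Affiliations Department of Computer Science, University of California-Davis, CA, United States

AI summary This paper introduces PolicyGuardBench, a benchmark that evaluates compliance using 60k policy-trajectory pairs, and trains PolicyGuard, a lightweight model with high accuracy and efficient inference, showing that accurate and generalizable compliance guardrails are feasible at small scale.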

Comments 16 pages, 5 figures. Accepted by ICML 2026

Details
Abstract

Autonomous web agents are increasingly deployed for long-horizon tasks, yet their ability to adhere to real-world policies remains critically underexplored compared to standard safety objectives. To address this gap, we introduce PolicyGuardBench, a benchmark of 60k policy-trajectory pairs designed to evaluate compliance through both full-trajectory and novel prefix-based violation detection tasks. Using this dataset, we train PolicyGuard, a lightweight guardrail model that achieves strong detection accuracy while maintaining high inference efficiency. Notably, our model demonstrates robust generalization capabilities, preserving high performance even on unseen domains. These contributions establish a comprehensive framework for studying policy compliance, showing that accurate and generalizable guardrails are feasible at small scales.

2510.01499 2026-05-20 cs.LG cs.AI cs.GT

Beyond Majority Voting: LLM Aggregation by Leveraging Higher-Order Information

Rui Ai, Yuqi Pan, David Simchi-Levi, Milind Tambe, Haifeng Xu

Affiliations Massachusetts Institute of Technology; School of Engineering and Applied Sciences; Harvard University; Data Science, The University of Chicago

AI summary This paper proposes two algorithms, Optimal Weight and Inverse Surprising Popularity, which combine first-order and second-order information to mitigate the limitations of majority voting and improve the reliability of multi-agent LLM aggregation.

Comments Accepted into ICML 2026

Details
Abstract

With the rapid progress of multi-agent large language model (LLM) reasoning, how to effectively aggregate answers from multiple LLMs has emerged as a fundamental challenge. Standard majority voting treats all answers equally, failing to consider latent heterogeneity and correlation across models. In this work, we design two new aggregation algorithms called Optimal Weight (OW) and Inverse Surprising Popularity (ISP), leveraging both first-order and second-order information. Our theoretical analysis shows these methods provably mitigate inherent limitations of majority voting under mild assumptions, leading to more reliable collective decisions. We empirically validate our algorithms on synthetic datasets, popular LLM fine-tuning benchmarks such as UltraFeedback and MMLU, and a real-world healthcare setting ARMMAN. Our algorithms consistently outperform standard baselines, establishing a robust, training-free framework for effective multi-agent LLM aggregation.
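The limitation of treating all answers equally can be illustrated with a small contrast between plain majority voting and accuracy-weighted voting. The weighted rule below is the classical log-odds weighting for independent voters with known accuracies on a binary question. It is not the paper's OW or ISP algorithm, and the accuracies are made-up values.

```python
import math
from collections import Counter, defaultdict


def majority_vote(answers):
    """Return the most frequent answer (every voter counts equally)."""
    return Counter(answers).most_common(1)[0][0]


def weighted_vote(answers, accuracies):
    """Weight each voter by log(p / (1 - p)), where p is that voter's
    accuracy. Assumes independent voters and 0 < p < 1."""
    scores = defaultdict(float)
    for answer, p in zip(answers, accuracies):
        scores[answer] += math.log(p / (1.0 - p))
    return max(scores, key=scores.get)


# Toy case: one strong model disagrees with two weak ones.
answers = ["A", "B", "B"]
accuracies = [0.9, 0.6, 0.6]  # hypothetical per-model accuracies
print(majority_vote(answers), weighted_vote(answers, accuracies))
```

Here the two weak voters outvote the strong one under majority voting, while the log-odds weights (about 2.20 versus 0.41 + 0.41) reverse the outcome.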

2510.00660 2026-05-20 cs.CV

Unsupervised Unfolded rPCA (U2-rPCA): Deep Interpretable Clutter Filtering for Ultrasound Microvascular Imaging

Huaying Li, Chuling Ye, Manfei Liao, Xiaobo Qu, Liansheng Wang, Yinran Chen

Affiliations Fujian Key Laboratory of Urban Intelligent Sensing and Computing, School of Informatics, Xiamen University; School of Electronic Science and Engineering, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University; Department of Computer Science and Technology, School of Informatics, and the National Institute for Data Science in Health and Medicine, Xiamen University

AI summary This paper proposes an unsupervised unfolded rPCA (U2-rPCA) method, unfolded from an iteratively reweighted least squares (IRLS) rPCA baseline and combined with a sparse-enhancement unit to better capture sparse micro-flow signals, enabling more effective clutter filtering in ultrasound microvascular imaging.

Details
Abstract

High-sensitivity clutter filtering is a fundamental step in ultrasound microvascular imaging. Singular value decomposition (SVD) and robust principal component analysis (rPCA) are the main clutter filtering strategies. However, both strategies are limited in feature modeling and in separating tissue from blood flow for high-quality microvascular imaging. Recently, deep learning-based clutter filtering has shown potential to separate tissue and blood flow signals more thoroughly. However, existing supervised filters lack interpretability and lack ground truth for training. While the interpretability issue can be addressed by deep unfolding of the algorithm, the lack of training ground truth remains unsolved. This paper proposes an unsupervised unfolded rPCA (U2-rPCA) method that preserves mathematical interpretability and does not depend on training labels. Specifically, U2-rPCA is unfolded from an iteratively reweighted least squares (IRLS) rPCA baseline with intrinsic low-rank and sparse regularization. In addition, a sparse-enhancement unit is plugged into the network to strengthen its capability to capture sparse micro-flow signals. U2-rPCA acts like an adaptive filter: it is trained on part of the image sequence and then applied to the following frames. Experimental validation on an in-silico dataset and public in-vivo datasets showed that U2-rPCA outperforms the SVD filter, the rPCA baseline, and another deep learning-based filter. In particular, the proposed method improved the contrast-to-noise ratio (CNR) of the power Doppler image by 1.91 dB to 8.48 dB compared to the other methods. Furthermore, the effectiveness of the building modules of U2-rPCA was validated through ablation studies.

2510.00600 2026-05-20 cs.RO cs.AI cs.CV cs.LG

Hybrid Training for Vision-Language-Action Models

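Pietro Mazzaglia, Cansu Sancaktar, Markus Peschl, Daniel Dijkman

Affiliations Qualcomm AI Research

AI summary This paper proposes a Hybrid Training framework that lets vision-language-action models either generate thoughts or predict actions directly at inference time as needed, retaining the performance gains while improving inference efficiency.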

Comments Published as a conference paper at ICLR 2026

Details
Abstract

Using Large Language Models to produce intermediate thoughts, a.k.a. Chain-of-thought (CoT), before providing an answer has been a successful recipe for solving complex language tasks. In robotics, similar embodied CoT strategies, generating thoughts before actions, have also been shown to lead to improved performance when using Vision-Language-Action models (VLAs). As these techniques increase the length of the model's generated outputs to include the thoughts, the inference time is negatively affected. Delaying an agent's actions in real-world executions, as in robotic manipulation settings, strongly affects the usability of a method, as tasks require long sequences of actions. However, is the generation of long chains-of-thought a strong prerequisite for achieving performance improvements? In this work, we explore the idea of Hybrid Training (HyT), a framework that enables VLAs to learn from thoughts and benefit from the associated performance gains, while enabling the possibility to leave out CoT generation during inference. Furthermore, by learning to conditionally predict a diverse set of outputs, HyT supports flexibility at inference time, enabling the model to either predict actions directly, generate thoughts or follow instructions. We evaluate the proposed method in a series of simulated benchmarks and real-world experiments.

2509.23108 2026-05-20 cs.AI cs.CL

Artificial Phantasia: Emergent Mental Imagery in Large Language Models

Morgan McCarty, Jorge Morales

Affiliations Khoury College of Computer Sciences, Northeastern University; Department of Psychology, Northeastern University; Department of Philosophy, Northeastern University

AI summary This study examines whether language alone can drive visual imagery. It finds that large language models outperform humans on a visual imagery task, suggesting an emergent mental imagery that may not be pictorial and challenging the traditional view in cognitive science.

Comments 34 pages, 10 figures, 3 tables

Details
Abstract

Can visual imagery be driven solely by language? This idea goes against cognitive science's traditional view that visual mental imagery is only possible through pictorial representations. Large Language Models (LLMs) provide nascent evidence not only that visual mental imagery via propositional-representations is possible, but that it can be more robust than human imagination. We created dozens of novel items for an extension to a classic task which is argued to be solvable exclusively via pictorial representations (i.e., language alone would be insufficient). Subjects were asked to imagine a series of compositional letter and shape transformations and identify the resultant "image". We found that the best LLMs performed significantly better than humans ($n = 100$ human participants, $p < .0001$), indicating the existence of an artificial phantasia, or emergent "visual" mental imagery that may not be pictorial. Furthermore, we tested reasoning models with variable reasoning-token allocation and found that models perform best with longer reasoning chains, demonstrating a linguistic impact on the task -- language alone may be sufficient. We examined three emergent imagery hypotheses: pure propositional imagery, propositional imagery with visio-linguistic priors, or pictorial visual imagery (classical visual imagery). Our study not only presents evidence for a previously unreported emergent cognitive capacity of LLMs, but also reignites debate on the requirement for a pictorial format in mental imagery.

2509.22292 2026-05-20 cs.CV cs.AI

Jailbreaking on Text-to-Video Models via Scene Splitting Strategy

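Wonjun Lee, Haon Park, Doehyeon Lee, Bumsub Ham, Suhyun Kim

Affiliations Yonsei University; Korea Institute of Science and Technology; AIM Intelligence; Seoul National University; Kyung Hee University

AI summary This paper proposes SceneSplit, a new black-box jailbreak method that splits a harmful narrative into multiple benign scenes and uses their combination as a constraint to steer the final output, increasing the likelihood of generating harmful videos and showing that the safety mechanisms of current text-to-video models are vulnerable.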

Comments ICLR 2026. Project page at https://velpegor.github.io/SceneSplit/

Details
Abstract

Along with the rapid advancement of numerous Text-to-Video (T2V) models, growing concerns have emerged regarding their safety risks. While recent studies have explored vulnerabilities in models like LLMs, VLMs, and Text-to-Image (T2I) models through jailbreak attacks, T2V models remain largely unexplored, leaving a significant safety gap. To address this gap, we introduce SceneSplit, a novel black-box jailbreak method that works by fragmenting a harmful narrative into multiple scenes, each individually benign. This approach manipulates the generative output space, the abstract set of all potential video outputs for a given prompt, using the combination of scenes as a powerful constraint to guide the final outcome. While each scene individually corresponds to a wide and safe space where most outcomes are benign, their sequential combination collectively restricts this space, narrowing it to an unsafe region and significantly increasing the likelihood of generating a harmful video. This core mechanism is further enhanced through iterative scene manipulation, which bypasses the safety filter within this constrained unsafe region. Additionally, a strategy library that reuses successful attack patterns further improves the attack's overall effectiveness and robustness. To validate our method, we evaluate SceneSplit across 11 safety categories from T2VSafetyBench on T2V models. Our results show that it achieves a high average Attack Success Rate (ASR) of 77.2% on Luma Ray2, 84.1% on Hailuo, 78.2% on Veo2, 78.6% on Kling V1.0, and 68.6% on Sora2, significantly outperforming the existing baselines. Through this work, we demonstrate that current T2V safety mechanisms are vulnerable to attacks that exploit narrative structure, providing new insights for understanding and improving the safety of T2V models.

2509.22258 2026-05-20 cs.CV cs.AI

Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks

Miao Jing, Mengting Jia, Junling Lin, Zhongxia Shen, Huan Gao, Mingkun Xu, Shangyang Li

Affiliations School of Physics Science and Technology, Beijing University of Posts and Telecommunications; Guangdong Institute of Intelligence Science and Technology; Beijing Chaoyang Hospital, Capital Medical University; Sleep Medical Center, Huzhou Third Municipal Hospital, Affiliated Hospital of Wenzhou Medical University; University of Macau; Renyixun Health Technology Co., Ltd; Academy for Advanced Interdisciplinary Studies, Peking University

AI summary This paper proposes Neural-MedBench, a benchmark for testing multimodal clinical reasoning in neurology. It shows that existing medical datasets overemphasize classification accuracy, finds through systematic evaluation that reasoning failures rather than perceptual errors dominate the performance drop, and argues for an evaluation framework that covers both breadth and depth.

Comments 23 pages, 12 figures

Journal ref ICLR'2026

详情
AI中文摘要

近期视觉-语言模型(VLMs)在标准医疗基准上取得了显著进展,但其真正的临床推理能力仍不清楚。现有数据集主要强调分类准确度,导致模型在高风险诊断推理上仍存在不足。我们引入Neural-MedBench,一个紧凑且推理密集的基准,专门用于探测多模态临床推理在神经病学中的极限。Neural-MedBench整合多序列MRI扫描、结构化电子健康记录和临床笔记,并涵盖三大核心任务家族:鉴别诊断、病变识别和推理生成。为确保可靠评估,我们开发了结合LLM评分、临床验证和语义相似度指标的混合评分流程。通过系统评估最先进的VLMs,包括GPT-4o、Claude-4和MedGemma,我们发现其性能相比传统数据集显著下降。错误分析显示,推理失败而非感知误差主导模型不足。我们的发现强调了需要双轴评估框架:以广度为导向的大数据集用于统计泛化,以深度为导向的紧凑基准如Neural-MedBench用于推理保真度。我们发布Neural-MedBench于https://neuromedbench.github.io/作为开放且可扩展的诊断测试床,引导未来基准的扩展,并实现严谨而成本有效的临床可信AI评估。

English abstract

Recent advances in vision-language models (VLMs) have achieved remarkable performance on standard medical benchmarks, yet their true clinical reasoning ability remains unclear. Existing datasets predominantly emphasize classification accuracy, creating an evaluation illusion in which models appear proficient while still failing at high-stakes diagnostic reasoning. We introduce Neural-MedBench, a compact yet reasoning-intensive benchmark specifically designed to probe the limits of multimodal clinical reasoning in neurology. Neural-MedBench integrates multi-sequence MRI scans, structured electronic health records, and clinical notes, and encompasses three core task families: differential diagnosis, lesion recognition, and rationale generation. To ensure reliable evaluation, we develop a hybrid scoring pipeline that combines LLM-based graders, clinician validation, and semantic similarity metrics. Through systematic evaluation of state-of-the-art VLMs, including GPT-4o, Claude-4, and MedGemma, we observe a sharp performance drop compared to conventional datasets. Error analysis shows that reasoning failures, rather than perceptual errors, dominate model shortcomings. Our findings highlight the necessity of a Two-Axis Evaluation Framework: breadth-oriented large datasets for statistical generalization, and depth-oriented, compact benchmarks such as Neural-MedBench for reasoning fidelity. We release Neural-MedBench at https://neuromedbench.github.io/ as an open and extensible diagnostic testbed, which guides the expansion of future benchmarks and enables rigorous yet cost-effective assessment of clinically trustworthy AI.

2509.21698 2026-05-20 cs.CL

GRAB: A Risk Taxonomy-Grounded Benchmark for Unsupervised Topic Discovery in Financial Disclosures

Ying Li, Tiejun Ma

Affiliations The University of Edinburgh, UK

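AI summary This paper presents GRAB, a benchmark for unsupervised topic discovery in financial disclosures. It combines FinBERT token attention, YAKE keyphrase signals, and taxonomy-aware phrase matching to produce sentence labels without manual annotation, which are then used to evaluate unsupervised topic models.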

Comments 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop: NeurIPS 2025 Workshop on Generative AI in Finance

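AI abstract (translated from Chinese)

Risk categorization in 10-K risk disclosures is important for oversight and investment, yet no public benchmark evaluates unsupervised topic models on this task. We present GRAB, a finance-specific benchmark containing 1.61 million sentences from 8,247 filings, with sentence labels produced without manual annotation by combining FinBERT token attention, YAKE keyphrase signals, and taxonomy-aware phrase matching. The labels are grounded in a risk taxonomy that maps 193 terms to 21 fine-grained types nested under five macro classes; the 21 types guide weak supervision, while evaluation is carried out at the macro level. GRAB unifies evaluation through fixed dataset splits and robust metrics: Accuracy, Macro-F1, Topic BERTScore, and the entropy-based Effective Number of Topics. The dataset, labels, and code enable reproducible, standardized comparison of classical, embedding-based, neural, and hybrid topic models on financial disclosures.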

English abstract

Risk categorization in 10-K risk disclosures matters for oversight and investment, yet no public benchmark evaluates unsupervised topic models for this task. We present GRAB, a finance-specific benchmark with 1.61M sentences from 8,247 filings and span-grounded sentence labels produced without manual annotation by combining FinBERT token attention, YAKE keyphrase signals, and taxonomy-aware collocation matching. Labels are anchored in a risk taxonomy mapping 193 terms to 21 fine-grained types nested under five macro classes; the 21 types guide weak supervision, while evaluation is reported at the macro level. GRAB unifies evaluation with fixed dataset splits and robust metrics: Accuracy, Macro-F1, Topic BERTScore, and the entropy-based Effective Number of Topics. The dataset, labels, and code enable reproducible, standardized comparison across classical, embedding-based, neural, and hybrid topic models on financial disclosures.
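
Two of the metrics named in the abstract above can be illustrated with a short sketch. The abstract does not give GRAB's exact formulas, so this assumes the common perplexity-style definition of an entropy-based effective number (the exponential of the Shannon entropy of the topic-assignment distribution) and the standard unweighted Macro-F1. It also assumes predicted topics have already been mapped to macro-class labels. The function names are hypothetical and are not taken from the GRAB code.

```python
import math
from collections import Counter


def effective_number_of_topics(assignments):
    """exp(Shannon entropy) of the empirical topic distribution.

    Assumed definition: equals K when K topics are used equally often,
    and 1.0 when every sentence falls into a single topic.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Four topics used equally often behave like four "effective" topics.
    print(effective_number_of_topics([0, 1, 2, 3]))
    # One dominant topic pulls the effective number toward 1.
    print(effective_number_of_topics([0, 0, 0, 0, 0, 0, 0, 1]))
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Under this assumed definition, a model that collapses most sentences into one topic scores a low effective number even if its nominal topic count is large, which is why such a measure complements Accuracy and Macro-F1.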

2509.21196 2026-05-20 cs.LG cs.CV

Differential-Integral Neural Operator for Long-Term Turbulence Forecasting

Hao Wu, Yuan Gao, Fan Xu, Fan Zhang, Qingsong Wen, Kun Wang, Xiaomeng Huang, Xian Wu

Affiliations Tsinghua University; University of Science and Technology of China; The Chinese University of Hong Kong; Nanyang Technological University; Tencent

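AI summary This paper proposes a physics-grounded differential-integral neural operator whose parallel branches learn distinct physical operators, improving the stability and robustness of long-term turbulence forecasting and yielding more accurate predictions on the 2D Kolmogorov flow benchmark.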

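AI abstract (translated from Chinese)

Accurately forecasting the long-term evolution of turbulence is a major challenge in scientific computing and is critical for applications such as climate modeling and aerospace engineering. Existing deep learning methods, particularly neural operators, often fail in long-term autoregressive prediction, leading to catastrophic error accumulation and a loss of physical fidelity. This failure stems from their inability to simultaneously capture the distinct mathematical structures that govern turbulent dynamics: local, dissipative effects and global, non-local interactions. In this paper, we propose the Differential-Integral Neural Operator (DINO), a principled method based on operator decomposition. DINO explicitly models the evolution of turbulence through parallel branches that learn distinct physical operators: a local differential operator, implemented by a constrained convolutional network that provably converges to a derivative, and a global integral operator, captured by a Transformer architecture that learns a data-driven global kernel. This physics-based decomposition gives DINO exceptional stability and robustness. Through extensive experiments on the challenging 2D Kolmogorov flow benchmark, we show that DINO significantly outperforms state-of-the-art models in long-term forecasting. It suppresses error accumulation over hundreds of timesteps, maintains high fidelity in both the vorticity field and the energy spectrum, and establishes a new benchmark for physically consistent, long-range turbulence forecasting.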

English abstract

Accurately forecasting the long-term evolution of turbulence represents a grand challenge in scientific computing and is crucial for applications ranging from climate modeling to aerospace engineering. Existing deep learning methods, particularly neural operators, often fail in long-term autoregressive predictions, suffering from catastrophic error accumulation and a loss of physical fidelity. This failure stems from their inability to simultaneously capture the distinct mathematical structures that govern turbulent dynamics: local, dissipative effects and global, non-local interactions. In this paper, we propose the Differential-Integral Neural Operator (DINO), a novel framework designed from a first-principles approach of operator decomposition. DINO explicitly models the turbulent evolution through parallel branches that learn distinct physical operators: a local differential operator, realized by a constrained convolutional network that provably converges to a derivative, and a global integral operator, captured by a Transformer architecture that learns a data-driven global kernel. This physics-based decomposition endows DINO with exceptional stability and robustness. Through extensive experiments on the challenging 2D Kolmogorov flow benchmark, we demonstrate that DINO significantly outperforms state-of-the-art models in long-term forecasting. It successfully suppresses error accumulation over hundreds of timesteps, maintains high fidelity in both the vorticity fields and energy spectra, and establishes a new benchmark for physically consistent, long-range turbulence forecasting.
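
The idea of a constrained convolution that converges to a derivative can be illustrated with a fixed stencil. The abstract does not say which constraints the paper places on its learned kernels, so this is only a sketch under an assumption: the kernel weights satisfy moment conditions (they sum to zero, and their first moment equals one), which makes a 3-tap convolution the familiar central-difference operator. All function names are hypothetical.

```python
import numpy as np


def central_difference_kernel(dx):
    """3-tap kernel for offsets (-1, 0, +1) satisfying two moment constraints:
    sum(w) = 0 and sum(w * offset * dx) = 1, i.e. a first-derivative stencil."""
    return np.array([-1.0, 0.0, 1.0]) / (2.0 * dx)


def periodic_conv1d(u, kernel):
    """Cross-correlate u with an odd-length kernel under periodic boundaries."""
    half = len(kernel) // 2
    out = np.zeros_like(u, dtype=float)
    for offset, weight in zip(range(-half, half + 1), kernel):
        out += weight * np.roll(u, -offset)  # np.roll(u, -k)[i] == u[i + k]
    return out


def max_derivative_error(n):
    """Max error of the stencil against the exact derivative of sin(x)."""
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    dx = x[1] - x[0]
    approx = periodic_conv1d(np.sin(x), central_difference_kernel(dx))
    return float(np.max(np.abs(approx - np.cos(x))))


if __name__ == "__main__":
    # Halving the grid spacing should cut the error by roughly a factor of 4,
    # the signature of a second-order accurate derivative approximation.
    for n in (64, 128, 256):
        print(n, max_derivative_error(n))
```

In a learned setting, the same moment conditions could be imposed on trainable kernel weights, so that the convolution remains a consistent derivative approximation as the grid is refined. Whether the paper uses this particular mechanism is not stated in the abstract.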