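arXivDaily daily academic digest: synchronized with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.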

2605.30237 2026-06-02 cs.IR cs.CL cs.LG

GRASP: Plan-Guided Graph Retrieval with Adaptive Fusion and Reranking on Semi-Structured Knowledge Bases

Yicheng Tao, Yiqun Wang, Xiangchen Song, Xin Luo, Kai Liu, Jie Liu

Affiliations: Department of Electrical Engineering and Computer Science, University of Michigan; Department of Computational Medicine and Bioinformatics, University of Michigan

AI summary: Proposes the GRASP framework, which achieves state-of-the-art retrieval performance on semi-structured knowledge bases through three stages: plan-guided graph retrieval, conditional fusion, and reranking.

2605.30169 2026-06-02 cs.CY cs.AI cs.MA

Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms

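Botao Amber Hu, Helena Rong, Max Van Kleek

Affiliations: University of Oxford; New York University Shanghai

AI summary: This paper argues that language model agents, being ontologically dissociative (swappable modules, fluid identity), cannot satisfy the identity persistence, behavioral predictability, and sanction sensitivity that reputation mechanisms require, and therefore proposes a shift toward observability-based, ex ante, constitutive, protocol-based behavioral constraints.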

Comments: Accepted by FAccT 2026

DOI: 10.1145/3805689.3806748

Abstract

As autonomous language model agents proliferate, forming an emerging agentic web with real-world consequences, what credibility signals can you use to decide whether to trust an unfamiliar agent in the wild and delegate to it? A natural governance intuition is to extend human identity verification and reputation mechanisms, from "Know Your Customer" and credit scores to "Know Your Agent" regimes. However, we argue that this analogy is fundamentally incomplete. Reputation mechanisms function both as social signals and as corrective feedback that sustain an equilibrium of trustworthy behavior, presuming a persistent identity associated with behavioral continuity, sanction sensitivity, and costly non-fungibility. Yet language model agents are ontologically dissociative: they are essentially an assemblage of mutable modules (foundation models, system prompts, tool-access policies, external memory, and, in some cases, a multi-agent system as a whole), any of which may change agent behavior, with a fluid persona that is also vulnerable to adversarial attack and may not internalize sanctions. Drawing on dissociative identity disorder jurisprudence, we contend that this dissociativity leaves agents without grounding for identifiability, predictability, credibility, and rehabilitability, the very properties that reputation mechanisms aim to sustain, thereby collapsing trust. We argue that identity-based, ex post, regulative, sanction-based governance, such as reputation, is structurally inapplicable to dissociative agents, and we suggest a shift to observability-based, ex ante, constitutive, protocol-based behavioral harnesses.

2506.02075 2026-06-02 stat.ME cs.LG

Position: Stop Chasing the C-index when Evaluating Survival Analysis Models

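Christian Marius Lillelund, Shi-ang Qi, Russell Greiner, Christian Fischer Pedersen

Affiliations: University of Copenhagen

AI summary: This paper critically examines evaluation practice in survival analysis, argues that concordance metrics such as the C-index are overused and misaligned with modeling objectives, proposes a double-helix ladder framework to keep evaluation metrics aligned with model assumptions, and shows experimentally that misalignment leads to misleading comparisons.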

Comments: ICML 2026 Position Paper Track (Spotlight)

Abstract

The current state of evaluation in survival analysis is plagued by the persistent use of evaluation metrics in ways that are misaligned with the stated modeling objective. In addition, many such evaluations are based on censoring assumptions that are left implicit or unjustified. This means that the reported performance can be misleading and may fail to answer the scientific or modeling question the evaluation was intended to address. In this position paper, we critically examine evaluation practices in survival analysis and highlight how censoring makes evaluation fundamentally different from standard regression or classification. We place particular focus on concordance-based measures, such as the C-index, which we show are heavily overused in the literature. To help identify appropriate metrics, we propose a set of key desiderata and introduce a double-helix ladder, in which valid evaluation requires alignment between metric and modeling assumptions. Through controlled experiments, we show that violations of this alignment can lead to misleading model comparisons. We conclude by providing practical guidance on how to evaluate a survival model.
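
To make the concordance discussion concrete, here is a minimal sketch of Harrell's C-index for right-censored data. It is a textbook formulation, not code from the paper; the function name and the decision to skip pairs with tied times are assumptions made for brevity. The point it illustrates is that the score depends only on how subjects are ranked, so two models with the same ordering but very different predicted values are indistinguishable under it.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times:       observed time (event or censoring) per subject
    events:      1 if the event was observed, 0 if the subject was censored
    risk_scores: higher score means the model expects an earlier event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the shorter time is an observed
        # event; pairs with tied times are skipped (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Calling it with `times = [2, 4, 6, 8]` and `events = [1, 0, 1, 1]` gives the same value for the scores `[0.9, 0.5, 0.4, 0.1]` and `[100, 3, 2, 1]`, because both rank the subjects identically. Any difference in calibration between the two models is invisible to the metric.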

2605.29287 2026-06-02 cs.IR cs.CV

UniNote: A Unified Embedding Model for Multimodal Representation and Ranking

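Jinghan Zhao, Wenwei Jin, Anqi Li, Jintao Tong, Luya Mo, Jiawei Li, Bin Li, Yao Hu

Affiliations: Xiaohongshu, Beijing, China; Shanghai Jiao Tong University; Huazhong University of Science and Technology; Beijing Institute of Technology

AI summary: Proposes UniNote, a unified embedding model whose two-stage training (contrastive SFT followed by reinforcement learning) addresses three problems in industrial Item-to-Item retrieval: balancing global representation against local retrieval, the inefficiency of decoupled pipelines, and the precision-latency trade-off. Deployment at Xiaohongshu significantly improved retrieval quality and cost efficiency.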

Comments: Accepted by KDD Ads Track 2026

Abstract

Item-to-Item (I2I) retrieval is a fundamental part of modern content platforms, supporting critical industrial workflows from recommendation engines to content auditing. While multimodal embedding methods have advanced general retrieval, they often falter in I2I scenarios due to the challenges of balancing global content representation with fine-grained local retrieval, the systemic inefficiency of decoupled embedding-and-ranking pipelines, and the inherent trade-offs between model precision and serving latency. To solve these issues, we propose UniNote, a unified embedding model designed for industrial I2I retrieval. Tailored retrieval strategies are introduced to support representation learning over complex, multimodal content at varying granularities. To operationalize these strategies, UniNote employs a two-stage training paradigm: the first stage leverages contrastive SFT to establish robust base embeddings, while the second stage refines ranking quality through a reinforcement learning (RL) process that aligns the model with content relevance. Our results show that UniNote achieves state-of-the-art performance across diverse I2I tasks. Deployed at Xiaohongshu and integrated with Matryoshka Representation Learning (MRL), UniNote achieved significant improvements in retrieval quality and cost efficiency in large-scale applications.
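
The abstract does not say how MRL is used at serving time. The sketch below shows only the generic inference-side idea behind Matryoshka-style embeddings: keep a prefix of the vector and renormalize it, trading some accuracy for smaller indexes. The function names and the toy vectors are hypothetical, and nothing here reflects UniNote's actual pipeline.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` coordinates and rescale to unit length."""
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        raise ValueError("prefix has zero norm")
    return [x / norm for x in prefix]


def cosine(u, v):
    """Cosine similarity of two vectors that are already unit-normalized."""
    return sum(a * b for a, b in zip(u, v))


def similarity_at_dim(query, item, dim):
    """Similarity computed on the first `dim` dimensions only."""
    return cosine(truncate_and_normalize(query, dim),
                  truncate_and_normalize(item, dim))
```

With such a scheme, a cheap low-dimensional prefix could serve candidate recall while the full vector is reserved for a final scoring pass. That split is a common design choice, not something the abstract confirms.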

2605.29107 2026-06-02 cs.CR cs.AI

GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

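Ojas Nimase, Zhe Chen, Gengpei Qi, Yue Zhao, Xiyang Hu

Affiliations: University of Southern California; Arizona State University

AI summary: Proposes the GEO-Bench benchmark, which evaluates ranking-manipulation attacks in generative engine optimization under a unified protocol and compares the effectiveness and stealth of black-box prompt attacks, white-box gradient attacks, and white-hat strategies.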

Abstract

Large language models (LLMs) increasingly rank products, documents, and recommendations for user queries, which makes manipulating these rankings a growing concern for fairness and information integrity. Research on generative engine optimization (GEO) has produced many manipulation methods, but each is evaluated on its own dataset with its own metrics, so their relative strength and detectability stay unclear. We present GEO-Bench, a benchmark that evaluates GEO ranking-manipulation attacks under one protocol. It unifies black-box prompt-based attacks (TAP, Zero-Shot), white-box gradient-based attacks (STS, RAF, StealthRank), and ten white-hat C-SEO strategies. We score every method on five datasets against a fixed open-weight ranker (Llama-3.1-8B-Instruct), using metrics for both effectiveness (NRG, Success@α, Promote@α) and stealth (keyword violation rate, perplexity ratio). Our evaluation shows that effectiveness and stealth trade off across adversarial attacks, that black-box content rewriting matches or exceeds gradient-based attacks on rank promotion while producing more fluent text and can evade both keyword- and perplexity-based detection on some domains, and that the access model does not predict attack strength. By standardizing datasets, attack implementations, and metrics, GEO-Bench enables the first direct comparison across these attack paradigms and supports the development of detection methods.

2605.28952 2026-06-02 cs.CR cs.DS cs.IT cs.LG math.IT math.ST stat.TH

Optimal Rates for Differentially Private Hypothesis Testing with E-values

Ben Jacobsen, Tomas Gonzalez, Gavin Brown, Kassem Fawaz, Aaditya Ramdas

Affiliations: University of Wisconsin-Madison; Carnegie Mellon University

AI summary: Studies the maximum e-power achievable when testing hypotheses with e-values under an ε-differential privacy constraint, and gives the optimal rate together with a matching algorithm.

Comments: Corrected typos; updated references; generalized Proposition 3.1

Abstract

E-values have attracted considerable interest in recent years as flexible tools for enabling anytime-valid and adaptive data analysis. Hypothesis testing is at the core of many of these applications, which can often involve private or sensitive data. In this work, we answer a simple but important question: given two distributions $\mathbb{P}$ and $\mathbb{Q}$, what is the maximum achievable e-power when testing $X\sim \mathbb{P}^n$ against $X\sim\mathbb{Q}^n$ with e-values that satisfy $\varepsilon$-differential privacy? We characterize the optimal rate for this problem and provide an algorithm which matches it exactly. In the sequential setting, when observations arrive one-by-one and the analyst chooses when to halt, we give matching upper and lower bounds on the stopping times of any private e-process. Numerical experiments confirm the practicality of our algorithms, which require less data than the recently proposed DP-SPRT across a range of sequential testing problems and privacy levels.
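
For readers new to e-values, the following sketch shows a standard non-private sequential test between two Bernoulli hypotheses. It deliberately omits the differential-privacy mechanism that is the paper's contribution; the function names and parameters are illustrative assumptions. Each likelihood ratio has expectation 1 under the null, so the running product is an e-process, and by Ville's inequality rejecting once it reaches 1/alpha controls the type-I error at level alpha for any stopping rule.

```python
def bernoulli_lr(x, p_null, p_alt):
    """Likelihood ratio q(x) / p(x) for one Bernoulli observation."""
    numerator = p_alt if x == 1 else 1.0 - p_alt
    denominator = p_null if x == 1 else 1.0 - p_null
    return numerator / denominator


def run_e_process(observations, p_null, p_alt, alpha=0.05):
    """Multiply per-observation e-values; stop when wealth >= 1 / alpha.

    Returns (stopping_time, wealth), where stopping_time is None if the
    threshold is never reached on the supplied data.
    """
    wealth = 1.0
    for t, x in enumerate(observations, start=1):
        wealth *= bernoulli_lr(x, p_null, p_alt)
        if wealth >= 1.0 / alpha:
            return t, wealth
    return None, wealth
```

The stopping time returned here is the quantity the abstract bounds in the sequential setting. A private variant would need to add calibrated noise, which generally delays stopping.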

2605.25889 2026-06-02 cs.CR cs.LG

Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models

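Jianwei Tai

Affiliations: Jianwei Tai

AI summary: This paper proves an information-theoretic trade-off between capability and robustness in vision-language-action models: their sum is bounded by the task entropy plus the adversarial channel capacity. Experiments validate the bound.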

Abstract

Vision-Language-Action (VLA) models reach high success rates on clean inputs but collapse under small adversarial perturbations: a $16/255$ PGD attack drops OpenVLA-7B's LIBERO success from $95\%$ to under $5\%$. Whether this trade-off has a theoretical floor was open. We prove that it does. For any VLA policy, capability $I(A^\star; A_\pi)$ and robustness $I(A_\pi; \tilde{A}_\pi) - I(A_\pi; \delta)$ sum to at most $H(A^\star) + I(X; \tilde{X})$, the task entropy plus adversarial channel capacity. The proof reduces to two applications of the Data Processing Inequality. The pixel-level bound is loose by $\sim 10^3$ nats and serves as a ceiling guarantee; an encoder-specific corollary tightens it by over an order of magnitude, into a regime where realized capability already consumes $5$--$9\%$ of the budget. We validate the main theorem with zero violations across $308$ cells: $252$ closed-form Gaussian-VLA, $48$ OpenVLA-7B + LIBERO + PGD ($4$ suites $\times$ $4$ values of $\epsilon$ $\times$ $3$ seeds), $4$ Square-Attack, and $4$ multi-step ($T = 10$). A complementary measurability inequality $\mathrm{Rob}_{\text{disc}} \le \mathrm{Cap}_{\text{disc}}$ further holds across $144$ cross-architecture cells spanning OpenVLA, OpenVLA-OFT (continuous-$L_1$), and SmolVLA (flow-matching). The same construction yields three label-free diagnostics: a pre-flight encoder ceiling, a defense-forensics probe that localizes input-side vs. language-model intervention, and a head-agnostic robustness ratio comparable across discrete-token, $L_1$-regression, and flow-matching policies. Together these provide the cross-setting axis that defense and architecture comparisons currently lack.

2602.07666 2026-06-02 cs.CR cs.AI

SoK: DARPA's AI Cyber Challenge (AIxCC): Competition Design, Architectures, and Lessons Learned

Cen Zhang, Younggi Park, Fabian Fleischer, Yu-Fu Fu, Jiho Kim, Dongkwan Kim, Youngjoon Kim, Qingxiao Xu, Andrew Chin, Ze Sheng, Hanqing Zhao, Michael Pelican, David J. Musliner, Jeff Huang, Jon Silliman, Mikel Mcdaniel, Jefferson Casavant, Isaac Goldthwaite, Nicholas Vidovich, Matthew Lehman, Taesoo Kim

Affiliations: Georgia Institute of Technology; Texas A&M University; Smart Information Flow Technologies (SIFT); Kudu Dynamics; Microsoft

AI summary: This paper systematically analyzes DARPA's AI Cyber Challenge (AIxCC), examining its competition design and the architectural approaches of the finalist systems, and summarizes the factors that drove performance, the technical advances, and directions for future research.

Comments: Camera-ready version; Systematization of Knowledge and post-competition analysis of DARPA AIxCC (2023-2025)

Journal ref: USENIX Security 2026

Abstract

DARPA's AI Cyber Challenge (AIxCC, 2023--2025) is the largest competition to date for building fully autonomous cyber reasoning systems (CRSs) that leverage recent advances in AI -- particularly large language models (LLMs) -- to discover and remediate vulnerabilities in real-world open-source software. This paper presents the first systematic analysis of AIxCC. Drawing on design documents, source code, execution traces, and discussions with organizers and competing teams, we examine the competition's structure and key design decisions, characterize the architectural approaches of finalist CRSs, and analyze competition results beyond the final scoreboard. Our analysis reveals the factors that truly drove CRS performance, identifies genuine technical advances achieved by teams, and exposes limitations that remain open for future research. We conclude with lessons for organizing future competitions and broader insights toward deploying autonomous CRSs in practice.

2505.11158 2026-06-02 eess.IV cs.CV

Diffusion Models for Hyperspectral Image Analysis: A Comprehensive Review

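Xing Hu, Xiangcheng Liu, Qianqian Duan, Lian Zhang, Huiliang Shang, Linhua Jiang, Haima Yang, Dawei Zhang

Affiliations: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology; School of Electronics and Electrical Engineering, Shanghai University of Engineering Science; Medical Artificial Intelligence Lab, The First Hospital of Hebei Medical University, Hebei Medical University; Hangzhou Institute of Technology, Xidian University

AI summary: This paper systematically reviews recent advances in diffusion models (including denoising diffusion probabilistic models and generative frameworks based on stochastic differential equations) for hyperspectral image processing. It categorizes existing methods, highlights their strengths on high-dimensional data, compares their performance with conventional approaches, gives particular attention to key applications such as change detection and post-disaster anomaly identification, discusses limitations such as computational cost and training stability, and outlines future research directions.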

Comments: Published in Neural Networks

DOI: 10.1016/j.neunet.2026.109109
Journal ref: Neural Networks (2026) 109109

Abstract

Hyperspectral image (HSI) analysis plays a critical role in remote sensing, agriculture, and environmental monitoring. However, traditional methods often struggle to handle the high dimensionality, spectral redundancy, and noise inherent in HSI data, limiting their accuracy and scalability. Recently, diffusion models, including denoising diffusion probabilistic models and other generative frameworks based on stochastic differential equations, have shown strong potential in capturing complex spectral-spatial structures and generating high-fidelity HSI data. These models offer effective solutions for tasks such as noise suppression, data augmentation, classification, and anomaly detection. This review presents a systematic summary of recent advances in diffusion models for HSI processing. We categorize existing methods, highlight their strengths in handling high-dimensional data, and compare their performance with conventional approaches. Special attention is given to critical applications such as change detection and post-disaster anomaly identification. The review also discusses current limitations, such as computational cost and training stability, and outlines potential research directions. Our main contributions can be summarized as follows: we provide a systematic taxonomy of diffusion-based HSI methods, examine their applications across major remote sensing tasks, and offer perspectives on potential directions for future research. With these efforts, this review seeks to support the community in harnessing deep learning models to achieve more effective and efficient hyperspectral image analysis.

2605.24248 2026-06-02 cs.CR cs.AI cs.SE

Attested Tool-Server Admission: A Security Extension to the Model Context Protocol

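Alfredo Metere

Affiliations: Enclawed LLC

AI summary: To address MCP's lack of a trust mechanism, proposes the mcp-attested extension, which provides secure server admission and tool-boundary control through offline-signed clearance assertions, a deny-by-default tool allowlist, and tiered enforcement with audit logging.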

Abstract

The Model Context Protocol (MCP) standardizes how a large-language-model (LLM) agent and an external tool server exchange messages, but not trust: a host reads a server's self-declared tool list and dispatches calls, with no notion of which servers it may use, at what sensitivity, or which of a server's tools are in bounds. This work grew out of a concrete need -- letting the Enclawed agent use Google's externally-operated MCP servers (Gmail, Calendar, Drive) safely, admitting the server and bounding the tools it may drive, without changing MCP or Enclawed's own tool application-programming interface (API). The mechanism we built, mcp-attested (shipped in both the open enclawed-oss distribution and the enclaved flavor), generalizes: the gap that makes an unmediated third-party connection unsafe for one user makes a regulated deployment impossible to accredit. We close it with three additive mechanisms: (1) a small, offline-signed clearance assertion a server publishes at a well-known Uniform Resource Identifier (URI) and a host verifies against a pinned trust root before any tool dispatch; (2) a deny-by-default per-server tool allowlist, so admitting a server is not trusting its every tool; and (3) a flavor-gated enforcement mode that turns the checks from warnings into hard denials, with every decision written to a tamper-evident audit log. We give the wire format, the verification algorithm, a security analysis, and an LLM-driven adversarial evaluation; we then state the design in normative Request-for-Comments (RFC 2119) form -- schema, verification rules, error registry, well-known registration, and machine-checkable conformance vectors -- so it can be adopted as an MCP addendum rather than reinvented. An unextended host ignores the well-known document and behaves exactly as today.
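
Mechanisms (2) and (3) can be sketched roughly as follows. This is not the mcp-attested wire format or API: every name below (`ServerPolicy`, `AuditLog`, `decide`, the example server and tool names) is hypothetical. Signature verification of the clearance assertion is left out entirely, and the hash chain is just one simple way to make a log tamper-evident.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerPolicy:
    admitted: bool             # did the server pass admission checks?
    allowed_tools: frozenset   # tools explicitly allowed for this server


class AuditLog:
    """Append-only log in which each entry commits to the previous hash."""

    def __init__(self):
        self.entries = []
        self._last_hash = "0" * 64

    def append(self, record):
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((self._last_hash + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev": self._last_hash, "hash": digest})
        self._last_hash = digest

    def verify(self):
        prev = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def decide(policies, log, server, tool, enforce):
    """Deny by default: allow only admitted servers and listed tools."""
    policy = policies.get(server)
    allowed = bool(policy and policy.admitted and tool in policy.allowed_tools)
    # In warn-only mode a failed check is reported but not blocked.
    decision = "allow" if allowed else ("deny" if enforce else "warn")
    log.append({"server": server, "tool": tool, "decision": decision})
    return decision
```

In this sketch, admitting a server grants nothing by itself: a tool that is missing from `allowed_tools` is refused even on an admitted server, which mirrors the abstract's point that admitting a server is not trusting its every tool.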

2602.11210 2026-06-02 cs.SE cs.AI cs.LG

SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents

Danlong Yuan, Wei Wu, Enhan Zhao, Zhengren Wang, Xueliang Zhao, Huishuai Zhang, Dongyan Zhao

Affiliations: University of Science and Technology of China

AI summary: Proposes SWE-MiniSandbox, a lightweight container-free method that uses kernel-level isolation and pre-caching to reduce disk usage and preparation time, enabling scalable reinforcement learning training.

Abstract

Reinforcement learning (RL) has become a key paradigm for training software engineering (SWE) agents, but existing pipelines typically rely on per-task containers for isolation. At scale, pre-built container images incur substantial storage overhead, slow environment setup, and require container-management privileges. We propose SWE-MiniSandbox, a lightweight, container-free method that enables scalable RL training of SWE agents without sacrificing isolation. Instead of relying on per-instance containers, SWE-MiniSandbox executes each task in an isolated workspace backed by kernel-level mechanisms, substantially reducing system overhead. It leverages lightweight environment pre-caching techniques to eliminate the need for bulky container images. As a result, our approach lowers disk usage to approximately 5% of that required by container-based pipelines and reduces environment preparation time to about 25% of the container baseline. Empirical results demonstrate that SWE-MiniSandbox achieves evaluation performance comparable to standard container-based pipelines. By removing the dependency on heavy container infrastructure, SWE-MiniSandbox offers a practical and accessible foundation for scaling RL-based SWE agents, particularly in resource-constrained research environments.

2605.15229 2026-06-02 cs.SE cs.AI

PBT-Bench: Benchmarking AI Agents on Property-Based Testing

Lucas Jing, Xinqi Wang, Liao Zhang, Simon S. Du

Affiliations: Tsinghua University; University of Washington; Beneficial AI Foundation

AI summary: Proposes the PBT-Bench benchmark of 100 property-based testing problems, which evaluates an AI agent's ability to derive semantic invariants from documentation and to construct input-generation strategies.

Abstract

Existing code benchmarks measure whether an agent can produce any test that reproduces a known bug, or whether it can produce a patch that fixes a described issue. Neither isolates the distinct skill of property-based testing: deriving a semantic invariant from documentation, and then constructing an input-generation strategy precise enough to make a random search reveal the violation. We introduce PBT-Bench, a benchmark of 100 curated property-based testing problems across 40 real Python libraries. Each problem injects one or more semantic bugs (365 in total, mean 3.65 per problem) designed so that default-strategy random inputs almost never trigger them; the agent must read the library's documentation, identify the relevant invariant, and specify a Hypothesis @given strategy that concentrates mass in the trigger region. Bugs are stratified across three difficulty levels (L1-L3) spanning single-constraint boundary bugs to stateful, cross-function protocol violations. We evaluate eight contemporary LLMs under two prompting regimes (open-ended baseline vs. explicit Hypothesis scaffolding) for three independent runs per configuration. Bug recall under the PBT-guided prompt ranges from 42.1% to 83.4% across models; under the open-ended baseline, from 31.4% to 76.7%. Hypothesis scaffolding lifts mid-capability models by over 20 percentage points, but yields smaller gains for the strongest models, with two exceptions showing degradation, suggesting the structured prompt can interfere with certain model behaviours rather than complementing them. The hardest bugs prove model-specific: different architectures fail on different problems, leaving persistent gaps that no single model closes. We release the benchmark, harness, and full evaluation corpus to support downstream work on documentation-grounded semantic reasoning.
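
The idea of a strategy that "concentrates mass in the trigger region" can be illustrated with a toy example. To stay dependency-free, the sketch uses only the standard library rather than Hypothesis, and it is not taken from the benchmark: the buggy function, the invariant, and both strategies are invented for illustration. The injected bug fires only when `lo == hi`, which independent uniform draws almost never produce.

```python
import random


def buggy_clamp(x, lo, hi):
    """Clamp x into [lo, hi]; deliberately buggy when lo == hi."""
    if lo < hi:
        return max(lo, min(x, hi))
    return x  # injected bug: the degenerate interval is not enforced


def clamp_property_holds(x, lo, hi):
    """Documented invariant: the result always lies inside [lo, hi]."""
    return lo <= buggy_clamp(x, lo, hi) <= hi


def default_strategy(rng):
    """Independent uniform draws; lo == hi occurs with negligible probability."""
    lo, hi = sorted((rng.randint(-10**6, 10**6), rng.randint(-10**6, 10**6)))
    return rng.randint(-10**6, 10**6), lo, hi


def targeted_strategy(rng):
    """Concentrates probability mass on narrow and degenerate intervals."""
    lo = rng.randint(-10**6, 10**6)
    hi = lo + rng.choice([0, 1, 2])
    return rng.randint(-10**6, 10**6), lo, hi


def find_counterexample(strategy, trials, seed=0):
    """Return the first (x, lo, hi) that violates the invariant, else None."""
    rng = random.Random(seed)
    for _ in range(trials)    :
        x, lo, hi = strategy(rng)
        if not clamp_property_holds(x, lo, hi):
            return x, lo, hi
    return None
```

Passing `targeted_strategy` to `find_counterexample` surfaces the degenerate-interval bug within a handful of trials, whereas `default_strategy` would need on the order of millions of draws. That gap is the skill the benchmark is described as isolating.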

2605.19847 2026-06-02 cs.CR cs.IR cs.LG

Auditing Privacy in Multi-Tenant RAG under Account Collusion

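Florian A. D. Burnat

Affiliations: University of Bath

AI summary: Addresses the amplified privacy leakage caused by collusion among accounts that share an index in multi-tenant RAG, and proposes a verifiable audit protocol that attests noise-then-select retrieval and reports the privacy loss for coalitions up to a declared cap.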

Abstract

Multi-tenant RAG services often treat the account as the privacy boundary: each account receives an $(\varepsilon_{\text{acc}},\delta_{\text{acc}})$-DP retrieval guarantee against the tenant index. We show that this framing understates leakage under same-index account collusion. For Gaussian noise-then-select retrieval, $k$ coordinated same-tenant accounts compose to joint leakage $\Theta(\sqrt{k}\,\varepsilon_{\text{acc}})$, not $\varepsilon_{\text{acc}}$; we give a matching membership-inference attack and validate the predicted $\sqrt{k}$ AUC trend in scalar, top-$K$, trained-embedder, and production-scale HNSW settings. We then give a verifier-runnable audit protocol that attests noise-then-select retrieval and reports $(\textsf{PASS},\varepsilon_{\text{audit}})$ for coalitions up to a declared cap $k_{\max}$, without disclosing the index or changing the retrieval decision rule. The claim is retrieval-channel only: generation-channel leakage and adversarially robust coalition-size estimation are complementary audit predicates.
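
A back-of-the-envelope way to see where a $\sqrt{k}$ trend can come from, under assumptions that are ours rather than the paper's: suppose each account observes the same underlying score plus independent unit-variance Gaussian noise, and membership shifts the mean by `mu_single`. Averaging $k$ such observations shrinks the noise standard deviation by $\sqrt{k}$, and the AUC for separating $N(0,1)$ from $N(\mu,1)$ is $\Phi(\mu/\sqrt{2})$. The function names below are hypothetical.

```python
import math


def gaussian_auc(mu):
    """AUC for distinguishing N(0, 1) from N(mu, 1): Phi(mu / sqrt(2)).

    Uses Phi(z) = 0.5 * (1 + erf(z / sqrt(2))), so with z = mu / sqrt(2)
    the argument of erf becomes mu / 2.
    """
    return 0.5 * (1.0 + math.erf(mu / 2.0))


def coalition_auc(mu_single, k):
    """AUC when k accounts pool independent noisy views of one score.

    Pooling k observations is equivalent to a single observation whose
    effective separation is sqrt(k) * mu_single.
    """
    return gaussian_auc(math.sqrt(k) * mu_single)
```

Under these assumptions a coalition of four accounts with per-account separation 0.5 attacks as effectively as a single account with separation 1.0. Correlated noise across accounts would break the calculation.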

2605.18694 2026-06-02 math.OC cs.LG stat.ML

Can Adaptive Gradient Methods Converge under Heavy-Tailed Noise? A Case Study of AdaGrad

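Zijian Liu

Affiliations: Zijian Liu

AI summary: This paper studies the convergence of AdaGrad under heavy-tailed gradient noise. It gives the first proof of a convergence rate for non-convex optimization when the tail index p satisfies 4/3 < p ≤ 2, without prior knowledge of p, and provides an algorithm-dependent lower bound.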

Comments: ICML 2026. v2: simplification of the proof

Abstract

Many tasks in modern machine learning are observed to involve heavy-tailed gradient noise during the optimization process. To manage this realistic and challenging setting, new mechanisms, such as gradient clipping and gradient normalization, have been introduced to ensure the convergence of first-order algorithms. However, adaptive gradient methods, a famous class of modern optimizers that includes popular $\mathtt{Adam}$ and $\mathtt{AdamW}$, often perform well even without any extra operations mentioned above. It is therefore natural to ask whether adaptive gradient methods can converge under heavy-tailed noise without any algorithmic changes. In this work, we take the first step toward answering this question by investigating a special case, $\mathtt{AdaGrad}$, the origin of adaptive gradient methods. We provide the first provable convergence rate for $\mathtt{AdaGrad}$ in non-convex optimization when the tail index $p$ satisfies $4/3<p\leq2$. Notably, this result is achieved without requiring any prior knowledge of $p$ and is hence adaptive to the tail index. In addition, we develop an algorithm-dependent lower bound, suggesting that the existing minimax rate for heavy-tailed optimization is not attainable by $\mathtt{AdaGrad}$. Lastly, we consider $\mathtt{AdaGrad}\text{-}\mathtt{Norm}$, a popular variant of $\mathtt{AdaGrad}$ in theoretical studies, and show an improved rate that holds for any $1<p\leq2$ under an extra mild assumption.
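
As a reminder of what the update looks like, here is a minimal sketch of AdaGrad-Norm in its commonly stated form: a single scalar step size scaled by the root of the accumulated squared gradient norms, with no clipping or normalization step. The parameter names (`eta`, `b0`) and default values are assumptions, and the paper's exact variant and assumptions may differ.

```python
import math


def adagrad_norm(grad_fn, x0, eta=1.0, b0=1e-8, steps=100):
    """Run AdaGrad-Norm from x0 and return the final iterate.

    grad_fn: maps a list of floats to a (possibly stochastic) gradient
    eta:     base step size
    b0:      small initial value of the accumulator's square root
    """
    x = list(x0)
    accum = b0 ** 2  # running sum of squared gradient norms
    for _ in range(steps):
        g = grad_fn(x)
        accum += sum(gi * gi for gi in g)
        step = eta / math.sqrt(accum)  # one shared step size for all coordinates
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x
```

Heavy-tailed noise could be injected through `grad_fn` to probe the setting the abstract describes. Coordinate-wise AdaGrad differs only in keeping one accumulator per coordinate instead of a single scalar.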

2605.14791 2026-06-02 astro-ph.IM astro-ph.CO cs.AI

Beyond AI as Assistants: Toward Autonomous Discovery in Cosmology

Licong Xu, Thomas Borrett

Affiliations: Institute of Astronomy, University of Cambridge; Kavli Institute for Cosmology, University of Cambridge; Cavendish Astrophysics, University of Cambridge

AI summary: This paper proposes two complementary agentic systems, CMBEvolve and CosmoEvolve, that pursue autonomous scientific discovery in cosmology through LLM-guided code evolution with tree search and through a virtual multi-agent research lab. It reports preliminary results on weak-lensing anomaly detection and on analysis of ACT DR6 data.

Comments 4 pages, 2 figures, Contribution to the 2026 Cosmology session of the 60th Rencontres de Moriond

2605.13430 2026-06-02 stat.ME cs.AI cs.LG

Towards a holistic understanding of Selection Bias for Causal Effect Identification

走向因果效应识别中选择偏差的整体理解

Yiwen Qiu, Filip Kovačević, Shimeng Huang, Peter Spirtes, Francesco Locatello

发表机构 * Carnegie Mellon University（卡内基梅隆大学）； University of California, Berkeley（加州大学伯克利分校）

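AI summary: The paper studies observational data subject to selection bias, uses weak assumptions to characterize the propensity score and selection probability, gives necessary and sufficient conditions for identifiability of the average treatment effect, and extends existing graphical identification criteria.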

Comments 9 pages for the main text, ICML 2026

详情

AI中文摘要

选择偏差在观测研究中普遍存在。例如，大规模生物库数据可能表现出“健康志愿者偏差”，即受访者比他们所要代表的人群更健康、社会经济地位更高。从这样的子人群中恢复因果效应是因果推断中的一个重要问题，因为从选定人群估计平均处理效应（ATE）可能导致对整个群体的ATE估计严重偏倚。本文研究了选择偏差下ATE的可识别性。我们利用概率类的弱假设刻画倾向得分和选择概率，给出了ATE可识别性的充要条件。与以往工作相比，我们的结果扩展了现有的图形可识别性准则，并在存在选择偏差的情况下，以严格更弱的条件提供了对因果效应识别更全面的理解。

英文摘要

Selection bias is pervasive in observational studies. For example, large-scale biobank data can exhibit "healthy volunteer bias" when respondents are healthier and of higher socio-economic status than the population they are meant to represent. Recovering causal effects from such sub-populations is an important problem in causal inference, as estimating average treatment effects (ATE) from selected populations can result in a severely biased estimate of the ATE for the whole population. In this paper, we investigate the identifiability of the ATE under selection bias. We provide necessary and sufficient conditions for ATE identifiability, leveraging weak assumptions on probability classes to characterize propensity score and selection probability. Compared to previous works, our results extend existing graphical identifiability criteria and offer a more comprehensive understanding of causal effect identification with strictly weaker conditions in the presence of selection bias.
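
The "healthy volunteer" problem can be illustrated with exact arithmetic on a two-stratum toy population. This is only a sketch of why the ATE in a selected sample can differ from the population ATE; the strata, effect sizes, and selection probabilities are invented for illustration, and the re-weighting step assumes the selection probabilities are known, which is not the setting the paper characterizes.

```python
# Hypothetical population: two equally sized strata with different treatment
# effects and very different chances of being selected into the study.
STRATA = [
    {"name": "less healthy", "share": 0.5, "effect": 1.0, "p_select": 0.1},
    {"name": "healthier", "share": 0.5, "effect": 3.0, "p_select": 0.9},
]


def population_ate(strata):
    """ATE over the whole population: stratum effects weighted by share."""
    return sum(s["share"] * s["effect"] for s in strata)


def selected_ate(strata):
    """ATE among selected units: strata are weighted by P(stratum | selected)."""
    mass = [s["share"] * s["p_select"] for s in strata]
    return sum(m * s["effect"] for m, s in zip(mass, strata)) / sum(mass)


def reweighted_ate(strata):
    """Undo the selection by weighting selected units with 1 / P(selected | stratum)."""
    mass = [s["share"] * s["p_select"] for s in strata]
    weights = [m / s["p_select"] for m, s in zip(mass, strata)]
    return sum(w * s["effect"] for w, s in zip(weights, strata)) / sum(weights)


if __name__ == "__main__":
    print("population ATE:", population_ate(STRATA))
    print("selected-sample ATE:", selected_ate(STRATA))
    print("re-weighted ATE:", reweighted_ate(STRATA))
```

Because the healthier stratum is over-represented among selected units, the selected-sample ATE is pulled toward that stratum's effect; re-weighting recovers the population value only because the selection mechanism is fully known in this toy setup.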

2605.12768 2026-06-02 stat.ML cs.LG

ISOMORPH: A Supply Chain Digital Twin for Simulation, Dataset Generation, and Forecasting Benchmarks

ISOMORPH：用于仿真、数据集生成和预测基准的供应链数字孪生

Zhizhen Zhang, Hyemin Gu, Benjamin J. Zhang, Daniel Elenius, Michael Tyrrell, Theo J. Bourdais, Houman Owhadi, Markos A. Katsoulakis, Tuhin Sahai

发表机构 * University of Massachusetts Amherst（马萨诸塞大学阿默斯特分校）； University of North Carolina（北卡罗来纳大学）； SRI International（SRI国际）； California Institute of Technology（加州理工学院）

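AI summary: The paper presents ISOMORPH, the first public digital twin of a multi-echelon logistics network, which uses configurable parameters and a modular topology to generate datasets exhibiting dynamics such as the bullwhip effect, and evaluates the zero-shot forecasting performance of foundation models.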

详情

AI中文摘要

开放的时间序列预测（TSF）基准涵盖零售、能源、天气和交通，但供应链物流仍未得到充分服务。我们引入了ISOMORPH，这是第一个具有可解释、用户可配置参数以及模块化拓扑、需求和控制规则的多级物流网络的公开数字孪生。该模拟器在离散时间上推进一个有向路由图：需求从库存中满足或记录为积压，并触发整个网络的补货。状态跟踪库存、未结订单、在途货物以及平滑的需求估计，在可处理的状态空间上产生马尔可夫动力学。发布的数据以经验一致的程度再现了牛鞭效应，同时三个守恒定律为模拟器扩展提供了验证工具。我们发布了两个目录规模（C=50和C=200）、六种场景扫描和20种拉丁超立方体扰动的数据集。这些数据集展示了固定TSF基准中基本缺失的动态特性，包括方差放大、级联瓶颈、制度转换以及通过共享宏观冲击的跨通道耦合。对四个基础模型（Chronos、Moirai、TimesFM和Lag-Llama）的零样本评估在低至中等预测范围上产生了超过公开GIFT-Eval参考的MASE值，支持将其纳入现有基准套件。相同的模型通过需求侧参数的拉丁超立方体扰动提供预测置信带，实现了标准TSF数据集上不可用的前向不确定性量化（UQ），并证明基础模型可以作为基于数字孪生的UQ的快速替代。代码（MIT）：https://github.com/tuhinsahai/ISOMORPH。交互演示：https://huggingface.co/spaces/HyeminGu/ISOMORPH-demo。

英文摘要

Open time-series forecasting (TSF) benchmarks cover retail, energy, weather, and traffic, but supply-chain logistics remains underserved. We introduce ISOMORPH, the first public digital twin of a multi-echelon logistics network with interpretable, user-configurable parameters and modular topology, demand, and control rules. The simulator advances a directed routing graph in discrete time: demand is served from inventory or recorded as backlog and triggers replenishment throughout the network. The state tracks inventory, outstanding orders, in-transit shipments, and a smoothed demand estimate, yielding Markovian dynamics on a tractable state space. The released data reproduces the bullwhip effect at empirically consistent magnitudes, while three conservation laws provide verification tools for simulator extensions. We release datasets at two catalogue scales ($C=50$ and $C=200$), six scenario sweeps, and 20 Latin-hypercube perturbations. These datasets exhibit dynamics largely absent from fixed TSF benchmarks, including variance amplification, cascading bottlenecks, regime shifts, and cross-channel coupling through shared macro shocks. Zero-shot evaluation of four foundation models (Chronos, Moirai, TimesFM, and Lag-Llama) yields MASE values exceeding public GIFT-Eval references at low-to-moderate horizons, supporting incorporation into existing benchmark suites. The same models provide forecast confidence bands through Latin-hypercube perturbations of demand-side parameters, enabling forward uncertainty quantification (UQ) unavailable on standard TSF datasets and demonstrating that foundation models can serve as fast surrogates for digital-twin-based UQ. Code (MIT): https://github.com/tuhinsahai/ISOMORPH. Interactive demo: https://huggingface.co/spaces/HyeminGu/ISOMORPH-demo.
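
The state update described in the abstract (serve demand from inventory or record backlog, track in-transit shipments and a smoothed demand estimate, trigger replenishment) can be sketched for a single node as follows. This is not the ISOMORPH implementation: the class name, the order-up-to rule, and all parameter values are assumptions, and the conservation check at the end is one plausible example of the kind of invariant the abstract mentions.

```python
from collections import deque


class Node:
    """One stocking location with a fixed replenishment lead time (lead_time >= 1)."""

    def __init__(self, inventory=50.0, lead_time=2, alpha=0.3, forecast=10.0):
        self.inventory = inventory
        self.backlog = 0.0
        self.in_transit = deque([0.0] * lead_time)  # shipments not yet arrived
        self.lead_time = lead_time
        self.alpha = alpha        # exponential smoothing factor
        self.forecast = forecast  # smoothed demand estimate

    def position(self):
        """Inventory position: on hand, minus backlog, plus in-transit."""
        return self.inventory - self.backlog + sum(self.in_transit)

    def step(self, demand):
        # 1. Receive the shipment that has completed its lead time.
        self.inventory += self.in_transit.popleft()
        # 2. Serve demand (and old backlog) from inventory; record the rest as backlog.
        owed = demand + self.backlog
        shipped = min(self.inventory, owed)
        self.inventory -= shipped
        self.backlog = owed - shipped
        # 3. Update the smoothed demand estimate.
        self.forecast = self.alpha * demand + (1 - self.alpha) * self.forecast
        # 4. Order-up-to replenishment (assumed rule, unlimited upstream supply).
        target = self.forecast * (self.lead_time + 1)
        order = max(0.0, target - self.position())
        self.in_transit.append(order)
        return order


if __name__ == "__main__":
    demands = [10, 12, 8, 15, 9, 11, 20, 5, 10, 10]
    node = Node()
    start = node.position()
    orders = [node.step(d) for d in demands]
    # Conservation: position changes only through orders placed and demand received.
    drift = node.position() - (start + sum(orders) - sum(demands))
    print("orders:", orders)
    print("conservation residual:", drift)
```

Comparing the variance of `orders` with the variance of `demands` across a chain of such nodes is the usual way to measure the bullwhip effect; no particular magnitude is claimed for this toy.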

2603.29002 2026-06-02 cs.DC cs.AI

Understand and Accelerate Memory Processing Pipeline for Large Language Model Inference

理解并加速大型语言模型推理的内存处理流水线

Zifan He, Rui Ma, Yizhou Sun, Jason Cong

发表机构 * GitHub

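AI summary: The paper unifies optimizations such as sparse attention, retrieval-augmented generation, and compressed contextual memory into a four-step memory processing pipeline, identifies a 22%-97% memory processing overhead, and proposes accelerating the pipeline on a heterogeneous GPU-FPGA system, achieving up to 2.2x speedup and up to 4.7x better energy efficiency.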

Comments Accepted by ICML 2026. Code: https://github.com/OswaldHe/HeteroLLM

详情

AI中文摘要

现代大型语言模型（LLMs）越来越依赖于高效的长上下文处理和生成机制，包括稀疏注意力、检索增强生成（RAG）和压缩上下文内存，以支持复杂推理。我们表明这些优化可以统一为一个四步内存处理流水线：准备内存、计算相关性、检索和应用到推理。通过系统分析，我们识别出LLM推理中22%-97%的内存处理开销及其计算特征的强异构性。受此洞察启发，我们认为异构系统非常适合加速内存处理，从而加速端到端推理。我们在GPU-FPGA系统上展示了这种方法，将稀疏、不规则和内存受限的操作卸载到FPGA，同时将计算密集型操作保留在GPU上。在AMD MI210 GPU和Alveo U55C FPGA上评估，我们的系统在多种LLM推理优化中比GPU基线快高达2.2倍，能耗降低高达4.7倍（在NVIDIA A100上结果类似）。这些结果确立了异构系统作为高效LLM内存处理的实用方向，并为未来异构硬件设计提供参考。

英文摘要

Modern large language models (LLMs) increasingly depend on efficient long-context processing and generation mechanisms, including sparse attention, retrieval-augmented generation (RAG), and compressed contextual memory, to support complex reasoning. We show that these optimizations can be unified into a four-step memory processing pipeline: Prepare Memory, Compute Relevancy, Retrieval, and Apply to Inference. Through systematic profiling, we identify a 22%-97% memory processing overhead in LLM inference and strong heterogeneity in its computational characteristics. Motivated by this insight, we argue that heterogeneous systems are well-suited to accelerate memory processing and thus end-to-end inference. We demonstrate this approach on a GPU-FPGA system by offloading sparse, irregular, and memory-bound operations to FPGAs while retaining compute-intensive operations on GPUs. Evaluated on an AMD MI210 GPU and an Alveo U55C FPGA, our system is up to $2.2\times$ faster and uses up to $4.7\times$ less energy across multiple LLM inference optimizations than the GPU baseline (similar results hold on NVIDIA A100). These results establish heterogeneous systems as a practical direction for efficient LLM memory processing and inform future heterogeneous hardware design.
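
As a rough illustration of the four-step pipeline named in the abstract, the sketch below instantiates it as top-k sparse attention over a small key/value memory. The function names mirror the four steps but are otherwise hypothetical, and nothing here reflects the paper's GPU-FPGA partitioning; it only shows how the steps compose.

```python
import math


def prepare_memory(keys, values):
    """Step 1, Prepare Memory: store key/value pairs (a real system might index or compress)."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return {"keys": keys, "values": values}


def compute_relevancy(memory, query):
    """Step 2, Compute Relevancy: dot-product score between the query and every key."""
    return [sum(q * k for q, k in zip(query, key)) for key in memory["keys"]]


def retrieve(scores, top_k):
    """Step 3, Retrieval: indices of the top_k highest-scoring entries."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


def apply_to_inference(memory, scores, selected):
    """Step 4, Apply to Inference: softmax-weighted sum over the selected values only."""
    peak = max(scores[i] for i in selected)
    weights = [math.exp(scores[i] - peak) for i in selected]  # numerically stable softmax
    total = sum(weights)
    dim = len(memory["values"][0])
    return [
        sum(w * memory["values"][i][d] for w, i in zip(weights, selected)) / total
        for d in range(dim)
    ]


def sparse_attention(keys, values, query, top_k):
    memory = prepare_memory(keys, values)
    scores = compute_relevancy(memory, query)
    selected = retrieve(scores, top_k)
    return apply_to_inference(memory, scores, selected)


if __name__ == "__main__":
    keys = [[1, 0], [0, 1], [1, 1]]
    values = [[1.0], [2.0], [3.0]]
    print(sparse_attention(keys, values, query=[0, 10], top_k=2))
```

Steps 2 and 3 are the irregular, memory-bound part of this toy, while step 4 is dense arithmetic; that split is the kind of heterogeneity the abstract argues for exploiting.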

2605.00696 2026-06-02 stat.ML cs.CL cs.LG

Adaptive Querying with AI Persona Priors

基于AI人格先验的自适应查询

Kaizheng Wang, Yuhang Wu, Assaf Zeevi

发表机构 * Department of Industrial Engineering and Operations Research and Data Science Institute, Columbia University（工业工程与运筹学系及数据科学研究所，哥伦比亚大学）； Decision, Risk, and Operations Division, Columbia Business School（决策、风险与运营部门，哥伦比亚商学院）

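AI summary: The paper proposes a latent variable model induced by AI personas, which uses response distributions generated by a large language model to enable efficient Bayesian design for learning user-dependent quantities under a limited query budget.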

Comments ICML 2026

详情

AI中文摘要

我们研究在严格查询预算内，通过自适应查询学习用户相关的感兴趣量（如对保留项目的响应和心理测量指标）的问题。经典的贝叶斯设计和计算机化自适应测试通常依赖于限制性的参数假设或昂贵的后验近似，限制了它们在异质性、高维和冷启动场景中的应用。我们引入了一种人格诱导的潜变量模型，通过有限字典中的AI人格成员身份来表示用户状态，每种人格由大语言模型产生的响应分布提供。这产生了具有闭式后验更新和高效有限混合预测的表达性先验，从而实现了可扩展的贝叶斯设计用于顺序项目选择。在合成数据和WorldValuesBench上的实验表明，基于人格的后验提供了准确的概率预测和可解释的自适应启发流程。

英文摘要

We study adaptive querying for learning user-dependent quantities of interest, such as responses to held-out items and psychometric indicators, within tight query budgets. Classical Bayesian design and computerized adaptive testing typically rely on restrictive parametric assumptions or expensive posterior approximations, limiting their use in heterogeneous, high-dimensional, and cold-start settings. We introduce a persona-induced latent variable model that represents a user's state through membership in a finite dictionary of AI personas, each offering response distributions produced by a large language model. This yields expressive priors with closed-form posterior updates and efficient finite-mixture predictions, enabling scalable Bayesian design for sequential item selection. Experiments on synthetic data and WorldValuesBench demonstrate that persona-based posteriors deliver accurate probabilistic predictions and an interpretable adaptive elicitation pipeline.
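
The closed-form update over a finite persona dictionary can be sketched in a few lines. The sketch assumes each persona is a table of categorical response distributions per item (hard-coded here, whereas the abstract has them produced by a large language model) and uses expected information gain about the persona as the item-selection criterion, which is one common Bayesian design objective and not necessarily the paper's.

```python
import math


def posterior_update(belief, personas, item, response):
    """Bayes rule over personas: belief_k * P_k(response | item), renormalized.

    Assumes the observed response has nonzero probability under at least one persona.
    """
    unnorm = [b * persona[item][response] for b, persona in zip(belief, personas)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def predictive(belief, personas, item):
    """Finite-mixture prediction of the response distribution for an item."""
    return {
        r: sum(b * persona[item][r] for b, persona in zip(belief, personas))
        for r in personas[0][item]
    }


def entropy(dist):
    return -sum(p * math.log(p) for p in dist if p > 0)


def expected_information_gain(belief, personas, item):
    """Expected drop in entropy of the persona belief after asking the item."""
    gain = entropy(belief)
    for response, prob in predictive(belief, personas, item).items():
        if prob > 0:
            gain -= prob * entropy(posterior_update(belief, personas, item, response))
    return gain


def select_item(belief, personas, candidates):
    """Greedy sequential design: ask the most informative remaining item."""
    return max(candidates, key=lambda it: expected_information_gain(belief, personas, it))


# Hypothetical two-persona dictionary with two candidate items.
PERSONAS = [
    {"a": {"yes": 0.9, "no": 0.1}, "b": {"yes": 0.5, "no": 0.5}},
    {"a": {"yes": 0.1, "no": 0.9}, "b": {"yes": 0.5, "no": 0.5}},
]

if __name__ == "__main__":
    belief = [0.5, 0.5]
    item = select_item(belief, PERSONAS, ["b", "a"])
    print("ask:", item)
    print("posterior after 'yes':", posterior_update(belief, PERSONAS, item, "yes"))
```

Item "b" cannot separate the two personas, so a design rule of this kind should prefer item "a"; with a larger dictionary the same update costs one multiplication per persona.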

2604.26977 2026-06-02 cs.LO cs.AI

Defeasible Conditional Obligation in a Two-tiered Preference-based Semantics (Extended Version)

基于双层偏好语义的可废止条件义务（扩展版）

Xavier Parent

发表机构 * Technische Universität Wien (TU Wien)（维也纳技术大学）

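AI summary: The paper proposes a two-tiered preference-based semantic framework that combines a nonmonotonic inference mechanism with two orderings (ideality and normality) to model defeasible conditional obligation, and connects it to constrained input/output logic.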

Comments 13 pages. Extended version of a paper presented at KR 2026

2604.26197 2026-06-02 cs.IR cs.LG

Hierarchical Long-Term Semantic Memory for LinkedIn's Hiring Agent

面向LinkedIn招聘代理的分层长期语义记忆

Zhentao Xu, Shangjin Zhang, Emir Poyraz, Yvonne Li, Ye Jin, Xie Lu, Xiaoyang Gu, Karthik Ramgopal, Praveen Kumar Bodigutla, Xiaofeng Wang

发表机构 * LinkedIn Corporation（LinkedIn公司）

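AI summary: The paper proposes the Hierarchical Long-Term Semantic Memory (HLTM) framework, which builds schema-aligned memory trees to provide scalable ingestion of semantic knowledge, privacy-aware storage, low-latency retrieval, and transparent provenance; in LinkedIn's Hiring Assistant it improves answer correctness by more than 5% and retrieval F1 by more than 10%.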

Comments Accepted to the Applied Data Science (ADS) track at the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

详情

DOI: 10.1145/3770855.3818432

AI中文摘要

大型语言模型（LLM）代理越来越多地应用于实际产品中，其中个性化和上下文感知的用户交互至关重要。实现此类能力的核心是代理的长期语义记忆系统，该系统从嘈杂的纵向行为数据中提取隐式和显式信号，以结构化形式存储，并支持低延迟检索。构建工业级LLM代理长期记忆面临五大挑战：可扩展性、低延迟检索、隐私约束、适应性和可观测性。我们提出了分层长期语义记忆（HLTM）框架，该框架将文本数据组织成模式对齐的记忆树，在多个粒度级别捕获语义知识，从而实现可扩展的摄入、隐私感知存储、低延迟检索和透明溯源；HLTM还进一步融入了适应机制以泛化到不同用例。在LinkedIn招聘助手上的广泛评估表明，HLTM使答案正确率提升超过5%，检索F1提升超过10%，同时显著推进了查询与索引延迟之间的帕累托前沿。HLTM已全面部署在LinkedIn招聘助手中，用于支持生产招聘工作流中的核心个性化功能。

英文摘要

Large Language Model (LLM) agents are increasingly used in real-world products, where personalized and context-aware user interactions are essential. A central enabler of such capabilities is the agent's long-term semantic memory system, which extracts implicit and explicit signals from noisy longitudinal behavioral data, stores them in a structured form, and supports low-latency retrieval. Building industrial-grade long-term memory for LLM agents raises five challenges: scalability, low-latency retrieval, privacy constraints, adaptability, and observability. We introduce the Hierarchical Long-Term Semantic Memory (HLTM) framework, which organizes textual data into a schema-aligned memory tree that captures semantic knowledge at multiple levels of granularity, enabling scalable ingestion, privacy-aware storage, low-latency retrieval, and transparent provenance; HLTM further incorporates an adaptation mechanism to generalize across diverse use cases. Extensive evaluations on LinkedIn's Hiring Assistant show that HLTM improves answer correctness by more than 5% and retrieval F1 by more than 10%, while significantly advancing the Pareto frontier between query and indexing latency. HLTM has been fully deployed in LinkedIn's Hiring Assistant to power core personalization features in production hiring workflows.

2604.25191 2026-06-02 cs.AR cs.AI cs.LG

How Can Reinforcement Learning Achieve Expert-level Placement?

强化学习如何实现专家级布局？

Ruo-Tong Chen, Ke Xue, Chengrui Gao, Yunqi Shi, Tian Xu, Peng Xie, Siyuan Xu, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou

发表机构 * State Key Laboratory of Novel Software Technology, Nanjing University, China（南京大学新型软件技术国家重点实验室）； School of Artificial Intelligence, Nanjing University, China（南京大学人工智能学院）； Huawei Noah’s Ark Lab, China（华为诺亚实验室）

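AI summary: To address reinforcement learning's difficulty in reaching expert quality in chip placement due to poorly designed rewards, the paper proposes learning a reward model directly from expert placements by inferring expert trajectories and training an implicit reward model, which learns efficiently from a single design and generalizes to unseen cases.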

Comments DAC 2026

2604.23658 2026-06-02 cs.AR cs.AI cs.LG

FlowPlace: Flow Matching for Chip Placement

FlowPlace: 用于芯片布局的流匹配

Peng Xie, Ke Xue, Yunqi Shi, Ruo-Tong Chen, Chengrui Gao, Siyuan Xu, Chenjian Ding, Mingxuan Yuan, Chao Qian

发表机构 * State Key Laboratory of Novel Software Technology, Nanjing University, China（南京大学新型软件技术国家重点实验室）； School of Artificial Intelligence, Nanjing University, China（南京大学人工智能学院）； Huawei Noah’s Ark Lab, China（华为诺亚实验室）

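AI summary: The paper proposes FlowPlace, which produces overlap-free placements through mask-guided synthetic data generation, efficient flow-based training with flexible prior injection, and hard-constraint sampling, and reports better PPA metrics, 10-50x higher sampling efficiency, and zero overlap on the OpenROAD and ICCAD 2015 benchmarks.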

Comments DAC 2026

2602.02689 2026-06-02 cs.CR cs.AI cs.LG

Eidolon: A Post-Quantum Signature Scheme Based on k-Colorability in the Age of Graph Neural Networks

Eidolon: 图神经网络时代基于k-可着色性的后量子签名方案

Asmaa Cherkaoui, Ramon Flores, Delaram Kahrobaei, Richard Wilson

发表机构 * Laboratory of Mathematical Analysis, Algebra and Applications (LAM2A), Faculty of Sciences Ain Chock (FSAC), University Hassan II, Casablanca, Morocco（摩洛哥卡萨布兰卡哈桑二世大学艾因肖克理学院数学分析、代数与应用实验室）； Department of Geometry and Topology, Faculty of Mathematics, University of Seville, Seville, Spain（西班牙塞维利亚大学数学学院几何与拓扑系）； Departments of Computer Science and Mathematics, Queens College, City University of New York, USA（美国纽约市立大学皇后学院计算机科学系与数学系）； PhD Program in Mathematics, and Initiative for the Theoretical Sciences, Graduate Center, City University of New York, USA（美国纽约市立大学研究生中心数学博士项目及理论科学倡议）； Department of Computer Science and Engineering, Tandon School of Engineering, New York University, USA（美国纽约大学坦登工程学院计算机科学与工程系）； Department of Computer Science, University of York, United Kingdom（英国约克大学计算机科学系）

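AI summary: The paper proposes Eidolon, a post-quantum signature scheme based on the NP-complete k-colorability problem, built by generalizing the Goldreich-Micali-Wigderson zero-knowledge protocol, applying the Fiat-Shamir transform and Merkle-tree compression, and generating hard instances by planting a coloring; experiments indicate resistance to the classical solvers and graph neural network attacks considered.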

Comments 20 pages, 4 figures

详情

DOI: 10.1007/978-3-032-27574-5_3
Journal ref: Proceedings of WAIFI 2026, Lecture Notes in Computer Science (LNCS), Vol. 16611, Springer, 2026

AI中文摘要

我们提出Eidolon，一种基于NP完全问题k-可着色性的后量子签名方案。我们的构造将Goldreich-Micali-Wigderson零知识协议推广到任意k >= 3，应用Fiat-Shamir变换，并使用Merkle树承诺将签名从O(tn)压缩到O(t log n)。我们通过植入着色法生成困难实例，同时旨在保留随机图的统计特征。我们对此类方案进行了针对经典求解器（ILP、DSatur）和定制图神经网络（GNN）攻击者的实证安全分析。实验表明，对于n >= 60，两种方法均无法恢复与植入解匹配的有效着色，表明精心设计的k-着色实例能够抵抗所考虑的传统和基于学习的密码分析方法。这些实验表明，构造的实例能够抵抗我们评估中考虑的攻击。

英文摘要

We propose Eidolon, a post-quantum signature scheme grounded on the NP-complete k-colorability problem. Our construction generalizes the Goldreich-Micali-Wigderson zero-knowledge protocol to arbitrary k >= 3, applies the Fiat-Shamir transform, and uses Merkle-tree commitments to compress signatures from O(tn) to O(t log n). We generate hard instances by planting a coloring while aiming to preserve the statistical profile of random graphs. We present an empirical security analysis of such a scheme against both classical solvers (ILP, DSatur) and a custom graph neural network (GNN) attacker. Experiments show that for n >= 60, neither approach is able to recover a valid coloring matching the planted solution, suggesting that well-engineered k-coloring instances can resist the considered classical and learning-based cryptanalytic approaches. These experiments indicate that the constructed instances resist the attacks considered in our evaluation.
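
The compression from O(tn) to O(t log n) relies on the fact that a Merkle tree lets a signer publish one root and then open individual leaves with a logarithmic-length path. The sketch below shows only that generic mechanism with SHA-256; it is not the Eidolon construction. The leaf contents, the domain-separation bytes, and the padding rule (duplicating the last node, a simplification that production designs avoid) are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    level = [h(b"\x00" + leaf) for leaf in leaves]  # 0x00 marks a leaf hash
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # pad odd levels by repeating the last node
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def open_leaf(levels, index):
    """Authentication path: one sibling hash per level below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify(root, leaf, index, path):
    """Recompute the root from a leaf and its path, then compare."""
    node = h(b"\x00" + leaf)
    for sibling in path:
        if index % 2 == 0:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
        index //= 2
    return node == root


if __name__ == "__main__":
    # Hypothetical leaves standing in for per-vertex color commitments.
    leaves = [bytes([i]) for i in range(5)]
    levels = build_levels(leaves)
    root = levels[-1][0]
    path = open_leaf(levels, 4)
    print(len(path), verify(root, leaves[4], 4, path))
```

In a GMW-style round only the two endpoints of the challenged edge need to be opened, so a signature would carry a root plus two such paths per round instead of all n commitments.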

2604.21511 2026-06-02 cs.IR cs.CL

From Tokens to Concepts: Leveraging SAE for SPLADE

从词元到概念：利用稀疏自编码器增强SPLADE

Yuxuan Zong, Mathias Vast, Basile Van Cooten, Laure Soulier, Benjamin Piwowarski

发表机构 * Sorbonne Université, CNRS, ISIR, Paris, France（法国巴黎，索邦大学、法国国家科学研究中心、智能系统与机器人研究所）

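AI summary: To address issues such as polysemy and synonymy that arise from SPLADE's reliance on the backbone vocabulary, the paper proposes replacing the vocabulary with a semantic concept space learned by sparse autoencoders, achieving comparable performance with improved efficiency.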

Comments 11 pages, 3 figures, 9 tables. To appear at SIGIR 2026

2604.20861 2026-06-02 cs.IR cs.AI

Deep Interest Mining for Intent-Enriched Semantic IDs in Multimodal Generative Recommendation

面向多模态生成式推荐中意图增强语义ID的深度兴趣挖掘

Yangchen Zeng, Jinze Wang

发表机构 * Amazon（亚马逊）

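AI summary: The paper proposes the DeepInterestGR framework, which enriches pre-quantization item representations with visual cues and intent descriptors and combines them with relevance-gated semantic rewards to improve semantic-ID-based generative recommendation.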

详情

AI中文摘要

语义ID（SID）为生成式推荐提供了离散物品词汇表，但其质量取决于量化前保留了哪些物品证据。在产品推荐中，表面元数据常缺失潜在使用意图，视觉证据可能仅在文本中弱反映，下游策略学习对生成的SID是否对应语义有用的物品提供稀疏反馈。我们引入DeepInterestGR，一个用于生成式推荐的意图增强SID框架。在SID量化前，CMSA通过两条互补证据路径丰富物品表示：面向推荐的VLM描述和投影图像嵌入。然后DCIM使用LLM挖掘物品侧意图描述符，即由产品内容隐含的潜在使用动机，而非个性化用户状态。在构建的SID上进行策略训练时，QARM在标准SID奖励之上添加相关性门控语义质量奖励，仅当生成的SID解码为目标物品时应用该奖励。因此，语义质量不能奖励流畅但无关的物品预测。在三个Amazon产品评论类别（Beauty、Sports和Instruments）上的实验表明，DeepInterestGR优于有竞争力的生成式和基于RL的基线，在最强每度量基线上NDCG@5相对提升高达15.1%，NDCG@10提升13.9%。组件消融、CMSA分支分析、奖励变体和SID级案例研究支持一个有界声明：用视觉线索和物品侧意图描述符丰富量化前物品证据，结合相关性门控语义奖励，在评估设置下改进了基于SID的生成式推荐。

英文摘要

Semantic IDs (SIDs) provide the discrete item vocabulary used by generative recommendation, but their quality depends on what item evidence is preserved before quantization. In product recommendation, surface metadata often misses latent usage intent, visual evidence may be only weakly reflected in text, and downstream policy learning provides sparse feedback about whether a generated SID corresponds to a semantically useful item. We introduce DeepInterestGR, an intent-enriched SID framework for generative recommendation. Before SID quantization, CMSA enriches item representations through two complementary evidence paths: recommendation-oriented VLM captions and projected image embeddings. DCIM then uses an LLM to mine item-side intent descriptors, i.e., latent usage motivations implied by product content rather than personalized user states. During policy training over the constructed SIDs, QARM adds a relevance-gated semantic-quality bonus on top of standard SID rewards, applying the bonus only when the generated SID decodes to the target item. Thus, semantic quality cannot reward a fluent but irrelevant item prediction. Experiments on three Amazon Product Review categories (Beauty, Sports, and Instruments) show that DeepInterestGR improves over competitive generative and RL-based baselines, with relative gains of up to 15.1% in NDCG@5 and 13.9% in NDCG@10 over the strongest per-metric baseline. Component ablations, CMSA branch analyses, reward variants, and SID-level case studies support a bounded claim: enriching pre-quantization item evidence with visual cues and item-side intent descriptors, together with relevance-gated semantic rewards, improves SID-based generative recommendation under the evaluated settings.
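
The gating idea in the abstract (the semantic-quality bonus is applied only when the generated SID decodes to the target item) can be written down as a small reward function. Everything concrete below is an assumption for illustration: the prefix-match base reward stands in for the unspecified "standard SID rewards", SIDs are modeled as tuples of codes with a one-to-one SID-to-item mapping, and `bonus_weight` is a hypothetical parameter.

```python
def prefix_match_reward(generated_sid, target_sid):
    """Hypothetical base reward: fraction of leading SID codes that match the target."""
    matched = 0
    for g, t in zip(generated_sid, target_sid):
        if g != t:
            break
        matched += 1
    return matched / len(target_sid)


def gated_reward(generated_sid, target_sid, semantic_quality, bonus_weight=0.5):
    """Base reward plus a semantic-quality bonus that is gated on an exact hit.

    semantic_quality is assumed to be a score in [0, 1] from some external judge.
    """
    base = prefix_match_reward(generated_sid, target_sid)
    hit = tuple(generated_sid) == tuple(target_sid)  # decodes to the target item
    bonus = bonus_weight * semantic_quality if hit else 0.0
    return base + bonus


if __name__ == "__main__":
    target = (3, 1, 4)
    print(gated_reward((3, 1, 4), target, semantic_quality=0.8))   # hit: bonus applied
    print(gated_reward((3, 1, 5), target, semantic_quality=0.99))  # miss: bonus withheld
```

The second call shows the point of the gate: a near miss with a very high semantic-quality score still receives only its base reward.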

2602.08580 2026-06-02 q-bio.TO cs.CV

retinalysis-vascx: An explainable software toolbox for the extraction of retinal vascular biomarkers

retinalysis-vascx: 一个用于提取视网膜血管生物标志物的可解释软件工具箱

Jose D. Vargas Quiros, Michael J. Beyeler, Sofia Ortin Vela, EyeNED Reading Center, Sven Bergmann, Caroline C. W. Klave, Bart Liefers, VascX Research Consortium

发表机构 * Department of Ophthalmology, Erasmus University Medical Center（伊拉斯姆斯大学医学中心眼科系）； Department of Epidemiology, Erasmus University Medical Center（伊拉斯姆斯大学医学中心流行病学系）； Department of Ophthalmology, Radboud University Medical Center（拉德堡德大学医学中心眼科系）； Institute of Molecular and Clinical Ophthalmology, University of Basel（巴塞尔大学分子与临床眼科研究所）； Dept. of Computational Biology, University of Lausanne（洛桑大学计算生物学系）； Swiss Institute of Bioinformatics, Lausanne, Switzerland（瑞士生物信息学研究所，洛桑，瑞士）； Dept. of Integrative Biomedical Sciences, University of Cape Town（开普敦大学整合生物医学科学系）

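AI summary: The paper presents VascX, an open-source Python toolbox that extracts retinal vascular biomarkers from color fundus images, including vascular density, central retinal equivalents, and tortuosity, and assesses their robustness through reproducibility and sensitivity analyses.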

详情

AI中文摘要

从彩色眼底图像（CFI）中自动提取视网膜血管生物标志物对于大规模视网膜血管研究至关重要。我们提出VascX，一个开源的Python工具箱，可从CFI动静脉分割中提取生物标志物。VascX从血管分割掩膜开始，提取其骨架，构建无向和有向血管图，并将血管段解析为更长的血管。导出一组全面的生物标志物，包括血管密度、中央视网膜等效值（CRE）和迂曲度。空间局部化的生物标志物可相对于中央凹和视盘放置的网格进行计算。VascX通过GitHub和PyPI发布，附有全面的文档和示例。我们对同一眼睛在不同设备上重复成像的测试-重测再现性分析表明，大多数VascX生物标志物具有中等至良好的一致性（ICC > 0.5），不同生物标志物的稳健性水平存在重要差异。我们对生物标志物对图像扰动和启发式参数值的敏感性分析支持这些差异，并进一步表征了VascX生物标志物。最终，VascX提供了一个可解释且易于修改的特征提取工具箱，补充了分割以产生可靠的视网膜血管生物标志物。我们基于图的生物标志物计算阶段支持可重复、区域感知的测量，适用于大规模临床和流行病学研究。通过支持轻松提取现有生物标志物和快速实验新生物标志物，VascX支持眼组学研究。其稳健性和计算效率便于在大型数据库中可扩展部署，而开源分发降低了眼科研究人员和临床医生的采用门槛。

英文摘要

Automatic extraction of retinal vascular biomarkers from color fundus images (CFI) is crucial for large-scale studies of the retinal vasculature. We present VascX, an open-source Python toolbox that extracts biomarkers from CFI artery-vein segmentations. VascX starts from vessel segmentation masks, extracts their skeletons, builds undirected and directed vessel graphs, and resolves vessel segments into longer vessels. A comprehensive set of biomarkers is derived, including vascular density, central retinal equivalents (CREs), and tortuosity. Spatially localized biomarkers may be calculated over grids placed relative to the fovea and optic disc. VascX is released via GitHub and PyPI with comprehensive documentation and examples. Our test-retest reproducibility analysis on repeat imaging of the same eye by different devices shows that most VascX biomarkers have moderate to excellent agreement (ICC > 0.5), with important differences in the level of robustness of different biomarkers. Our analyses of biomarker sensitivity to image perturbations and heuristic parameter values support these differences and further characterize VascX biomarkers. Ultimately, VascX provides an explainable and easily modifiable feature-extraction toolbox that complements segmentation to produce reliable retinal vascular biomarkers. Our graph-based biomarker computation stages support reproducible, region-aware measurements suited for large-scale clinical and epidemiological research. By enabling easy extraction of existing biomarkers and rapid experimentation with new ones, VascX supports oculomics research. Its robustness and computational efficiency facilitate scalable deployment in large databases, while open-source distribution lowers barriers to adoption for ophthalmic researchers and clinicians.
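
Of the biomarkers listed, tortuosity is the easiest to illustrate. A common definition is the ratio of a vessel's arc length to the straight-line distance between its endpoints; the sketch below computes that ratio for a vessel given as an ordered list of centerline points. It does not use the VascX API, and VascX's own tortuosity definitions may differ; the function names are hypothetical.

```python
import math


def arc_length(points):
    """Sum of segment lengths along an ordered polyline of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def tortuosity(points):
    """Arc-to-chord ratio: 1.0 for a straight vessel, larger for winding ones."""
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("endpoints coincide; arc-to-chord ratio is undefined")
    return arc_length(points) / chord


if __name__ == "__main__":
    straight = [(0, 0), (1, 0), (2, 0)]
    bent = [(0, 0), (1, 0), (1, 1)]
    print(tortuosity(straight), tortuosity(bent))
```

Because the measure depends on how the centerline is sampled and where a vessel is considered to start and end, it is sensitive to the skeletonization and segment-merging steps the abstract describes.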

2510.11423 2026-06-02 cs.SI cs.CL

Beyond the Crowd: LLM-Augmented Community Notes for Governing Health Misinformation

超越众包：基于LLM增强的社区笔记治理健康错误信息

Jiaying Wu, Zihang Fu, Haonan Wang, Fanxiao Li, Jiafeng Guo, Preslav Nakov, Min-Yen Kan

发表机构 * National University of Singapore（新加坡国立大学）； Yunnan University（云南大学）； State Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences（中国科学院人工智能安全国家重点实验室，计算技术研究所）； University of Chinese Academy of Sciences（中国科学院大学）； Mohamed bin Zayed University of Artificial Intelligence（穆罕默德·本·扎耶德人工智能大学）

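AI summary: To address the high latency of Community Notes on X in governing health misinformation, the paper proposes the CrowdNotes+ framework, which uses LLMs to augment note generation and evaluation and significantly improves correctness, helpfulness, and evidence utility.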

Comments ACL 2026

详情

AI中文摘要

社区笔记是X（原Twitter）上众包错误信息治理系统，允许用户标记误导性帖子、附加上下文笔记并评价笔记的有用性。然而，我们对30.8K条健康相关笔记的实证分析显示，笔记获得有用性状态的中位延迟高达17.6小时。为了在现实世界错误信息激增期间提高响应速度，我们提出CrowdNotes+，一个统一的基于LLM的框架，通过增强社区笔记实现更快、更可靠的健康错误信息治理。CrowdNotes+集成了两种模式：（1）基于证据的笔记增强和（2）效用引导的笔记自动化，并由相关性、正确性和有用性的三级层次评估支持。我们通过HealthNotes（一个包含1.2K条健康笔记并标注有用性的基准数据集）和微调的有用性判断器实例化该框架。我们的分析首先揭示了当前众包治理的一个关键漏洞：投票者经常将风格流畅性与事实准确性混淆。通过我们的层次评估解决这一问题，对15个代表性LLM的实验表明，CrowdNotes+在笔记正确性、有用性和证据效用方面显著优于人类贡献者。

英文摘要

Community Notes, the crowd-sourced misinformation governance system on X (formerly Twitter), allows users to flag misleading posts, attach contextual notes, and rate the notes' helpfulness. However, our empirical analysis of 30.8K health-related notes reveals substantial latency, with a median delay of 17.6 hours before notes receive a helpfulness status. To improve responsiveness during real-world misinformation surges, we propose CrowdNotes+, a unified LLM-based framework that augments Community Notes for faster and more reliable health misinformation governance. CrowdNotes+ integrates two modes: (1) evidence-grounded note augmentation and (2) utility-guided note automation, supported by a hierarchical three-stage evaluation of relevance, correctness, and helpfulness. We instantiate the framework with HealthNotes, a benchmark of 1.2K health notes annotated for helpfulness, and a fine-tuned helpfulness judge. Our analysis first uncovers a key loophole in current crowd-sourced governance: voters frequently conflate stylistic fluency with factual accuracy. Addressing this via our hierarchical evaluation, experiments across 15 representative LLMs demonstrate that CrowdNotes+ significantly outperforms human contributors in note correctness, helpfulness, and evidence utility.

2601.07177 2026-06-02 cs.CR cs.AI

Safe-FedLLM: Delving into the Safety of Federated Large Language Models

Safe-FedLLM：深入探究联邦大语言模型的安全性

Mingxiang Tao, Yu Tian, Wenxuan Tu, Yue Yang, Xue Yang, Xiangyan Tang

发表机构 * Hainan University（海南大学）； Tsinghua University（清华大学）； Shanghai Jiao Tong University（上海交通大学）

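AI summary: The paper proposes Safe-FedLLM, a probe-based defense framework that uses lightweight classifiers at three levels (step, client, and shadow) to distinguish malicious from benign LoRA updates, improving the robustness of federated large language models against malicious clients.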

详情

AI中文摘要

联邦学习解决了大语言模型训练中的隐私和数据孤岛问题。大多数先前工作侧重于提高联邦学习对大语言模型的效率。然而，开放联邦环境中的安全性，特别是针对恶意客户端的防御，仍未被充分探索。为了研究联邦大语言模型的安全性，我们进行了一项初步研究，从LoRA更新的角度分析潜在的攻击面和防御特性。我们发现联邦大语言模型的两个关键特性：1）大语言模型在联邦学习中容易受到恶意客户端的攻击，以及2）LoRA更新表现出不同的行为模式，可以通过轻量级分类器有效区分。基于这些特性，我们提出了Safe-FedLLM，一种基于探针的联邦大语言模型防御框架，该框架在三个层面构建防御：步骤级、客户端级和阴影级。Safe-FedLLM的核心概念是对每个客户端的本地LoRA更新进行基于探针的区分，将其视为高维行为特征，并使用轻量级分类器判断其是否为恶意。大量实验表明，Safe-FedLLM有效提高了联邦大语言模型对恶意客户端的鲁棒性，同时保持了对良性数据的竞争性能。值得注意的是，我们的方法在不显著影响训练速度的情况下有效抑制了恶意数据的影响，即使在恶意客户端比例较高的情况下也保持有效。

英文摘要

Federated learning (FL) addresses privacy and data-silo issues in the training of large language models (LLMs). Most prior work focuses on improving the efficiency of federated learning for LLMs (FedLLM). However, security in open federated environments, particularly defenses against malicious clients, remains underexplored. To investigate the security of FedLLM, we conduct a preliminary study to analyze potential attack surfaces and defensive characteristics from the perspective of LoRA updates. We find two key properties of FedLLM: 1) LLMs are vulnerable to attacks from malicious clients in FL, and 2) LoRA updates exhibit distinct behavioral patterns that can be effectively distinguished by lightweight classifiers. Based on these properties, we propose Safe-FedLLM, a probe-based defense framework for FedLLM, which constructs defenses across three levels: Step-Level, Client-Level, and Shadow-Level. The core concept of Safe-FedLLM is to perform probe-based discrimination on each client's local LoRA updates, treating them as high-dimensional behavioral features and using a lightweight classifier to determine whether they are malicious. Extensive experiments demonstrate that Safe-FedLLM effectively improves FedLLM's robustness against malicious clients while maintaining competitive performance on benign data. Notably, our method effectively suppresses the impact of malicious data without significantly affecting training speed, and remains effective even under high malicious client ratios.

2504.08278 2026-06-02 math.OC cs.RO cs.SY eess.SY

Line-Search Filter Differential Dynamic Programming for Optimal Control with Nonlinear Equality Constraints

带非线性等式约束最优控制的线搜索滤波微分动态规划

Ming Xu, Stephen Gould, Iman Shames

发表机构 * School of Computer and Communication Sciences, EPFL（洛桑联邦理工学院计算机与通信科学学院）； School of Computing, Australian National University（澳大利亚国立大学计算学院）； Department of Electrical and Electronic Engineering, University of Melbourne（墨尔本大学电子与电气工程系）

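AI summary: The paper proposes the FilterDDP algorithm, which handles nonlinear equality constraints with a line search and a step filter, proves local quadratic convergence, and validates the method on contact-implicit trajectory optimization problems in robotics.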

Comments Accepted for publication in the IEEE International Conference on Robotics and Automation (ICRA) 2026. Revised version with more exposition in methodology and updated results with improved implementation

详情

AI中文摘要

我们提出FilterDDP，一种用于求解带非线性等式约束的离散时间最优控制问题的微分动态规划算法。与基于评价函数（merit function）或增广拉格朗日类算法的先前方法不同，FilterDDP使用步长滤波器结合线搜索来处理等式约束。我们确定了步长滤波器准则的两个重要设计选择，这些选择带来了鲁棒的数值性能：1）在步长接受准则中使用拉格朗日函数而非代价函数；2）在反向传播中扰动值函数Hessian矩阵。这两个选择都有严格的理论依据，特别是对于2），我们给出了局部二次收敛的形式化证明。除了提供处理同时含等式和不等式约束的最优控制问题的原始-对偶内点扩展外，我们还在机器人学中出现的三个接触隐式轨迹优化问题上验证了FilterDDP。

英文摘要

We present FilterDDP, a differential dynamic programming algorithm for solving discrete-time optimal control problems (OCPs) with nonlinear equality constraints. Unlike prior methods based on merit functions or the augmented Lagrangian class of algorithms, FilterDDP uses a step filter in conjunction with a line search to handle equality constraints. We identify two important design choices for the step filter criteria which lead to robust numerical performance: 1) we use the Lagrangian instead of the cost in the step acceptance criterion, and 2) in the backward pass, we perturb the value function Hessian. Both choices are rigorously justified, the second in particular by a formal proof of local quadratic convergence. In addition to providing a primal-dual interior point extension for handling OCPs with both equality and inequality constraints, we validate FilterDDP on three contact-implicit trajectory optimisation problems which arise in robotics.
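
A step filter combined with a backtracking line search can be sketched generically as follows. This follows the standard filter idea from nonlinear programming (accept a trial step if it sufficiently improves either the constraint violation or the objective relative to every stored pair) and is not the FilterDDP implementation; the function names, the margin `gamma`, and the toy `evaluate` callback are assumptions. In FilterDDP the objective entry would be the Lagrangian, as the abstract states.

```python
def is_acceptable(filter_entries, violation, objective, gamma=1e-5):
    """A trial point passes if, against every stored (violation, objective) pair,
    it sufficiently reduces the violation or sufficiently reduces the objective."""
    return all(
        violation <= (1.0 - gamma) * v or objective <= o - gamma * v
        for v, o in filter_entries
    )


def filter_line_search(evaluate, filter_entries, alpha=1.0, shrink=0.5, min_alpha=1e-8):
    """Backtrack on the step size until the filter accepts the trial point.

    evaluate(alpha) must return (constraint_violation, objective) for that step size.
    Returns (alpha, violation, objective), or None if no step size is accepted.
    """
    while alpha >= min_alpha:
        violation, objective = evaluate(alpha)
        if is_acceptable(filter_entries, violation, objective):
            return alpha, violation, objective
        alpha *= shrink
    return None


if __name__ == "__main__":
    current_filter = [(1.0, 5.0)]  # (constraint violation, objective) of the current iterate

    # Hypothetical problem: long steps make both measures worse, short steps help.
    def evaluate(alpha):
        return (2.0, 6.0) if alpha > 0.3 else (0.5, 5.5)

    print(filter_line_search(evaluate, current_filter))
```

Unlike a merit function, the filter needs no penalty parameter that trades violation against objective, which is the practical appeal of this acceptance rule.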
