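arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering and systems.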

2606.01249 2026-06-04 cs.LG cs.CL

Trust Region On-Policy Distillation

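Xingrun Xing, Haoqing Wang, Boyan Gao, Ziheng Li, Yehui Tang

Institutions: Samsung Research; University of Oxford; Peking University

AI summary: Proposes Trust Region On-Policy Distillation (TrOPD), which uses credit assignment strategies and trust-region learning to address the training instability caused by teacher-student distribution mismatch, and outperforms existing methods on mathematical reasoning, code generation, and general-domain benchmarks.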

Abstract

On-Policy Distillation (OPD) is a fundamental technique for efficient post-training of large language models (LLMs), with broad applications in agent learning, multi-task enhancement, and model compression. However, OPD training becomes unstable when the teacher and student distributions differ substantially, as teacher supervision on student-generated tokens may yield unreliable policy gradients and even cause optimization failure. This work addresses reliable on-policy token-level supervision through credit assignment strategies, and proposes Trust Region On-Policy Distillation, TrOPD. It features the following characteristics: 1) Trust-Region On-Policy Learning: TrOPD performs OPD only in regions where the teacher provides reliable supervision, mitigating the optimization difficulty of the K1 reverse-KL estimator under distribution mismatch. 2) Outlier Estimation: For outlier regions, we explore gradient clipping, masking, and forward-KL estimation to reduce the adverse effects of unreliable supervision. 3) Off-Policy Guidance: The student continues generation from teacher prefixes and uses forward KL to imitate off-policy guidance, encouraging on-policy exploration toward reliable regions. Experiments show that TrOPD consistently outperforms SoTA OPD baselines, including OPD, EOPD, and REOPOLD, across mathematical reasoning, code generation, and general-domain benchmarks.
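The trust-region idea in the abstract can be illustrated with a minimal sketch. It assumes a token counts as reliable when the absolute student-teacher log-ratio stays below a threshold `delta`; that rule, the threshold value, and all function names are hypothetical, since the abstract does not specify TrOPD's actual criterion. The K1 estimator used here is the standard single-sample estimate log q(x) - log p(x) of KL(q || p) for x drawn from the student q.

```python
import math


def k1_reverse_kl(student_logps, teacher_logps):
    """Per-token K1 estimate of KL(student || teacher) on student-sampled tokens."""
    return [s - t for s, t in zip(student_logps, teacher_logps)]


def trust_region_mask(k1_values, delta):
    """Hypothetical reliability rule: keep a token only if |log-ratio| <= delta."""
    return [abs(k) <= delta for k in k1_values]


def masked_mean(values, mask):
    """Average only the values whose mask entry is True (0.0 if none are kept)."""
    kept = [v for v, m in zip(values, mask) if m]
    return sum(kept) / len(kept) if kept else 0.0


# Illustrative numbers only: log-probabilities of three student-sampled tokens.
student_logps = [math.log(0.5), math.log(0.4), math.log(0.6)]
teacher_logps = [math.log(0.4), math.log(0.5), math.log(0.0006)]

k1 = k1_reverse_kl(student_logps, teacher_logps)
mask = trust_region_mask(k1, delta=2.0)  # delta is an assumed threshold
loss = masked_mean(k1, mask)
```

In this toy case the third token, where the teacher assigns near-zero probability to the student's choice, is masked out instead of dominating the loss.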

2606.01212 2026-06-04 cs.CL cs.AI cs.CR cs.IR

DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

Yuyang Gong, Miaokun Chen, Jiawei Liu, Zhuo Chen, Guoxiu He, Wei Lu, XiaoFeng Wang, Xiaozhong Liu

Institutions: Wuhan University; East China Normal University; Nanyang Technological University; Worcester Polytechnic Institute

AI summary: Proposes DiscourseFlip, an agentic, graph-guided attack that exploits coordinated influence across a semantic query network to maximize discourse-level opinion deviation under a limited budget; experiments demonstrate its effectiveness and stealth and expose the shortcomings of existing defenses.

Abstract

Retrieval-Augmented Generation (RAG) systems are widely deployed and increasingly influential, but their reliance on external corpora exposes new security risks from poisoned retrieval content. Existing RAG attacks largely focus on individual queries or narrow topic-local query sets, which limits their practical reach and offers limited camouflage in real-world settings. In this paper, we introduce discourse-level opinion manipulation, a new threat model in which coordinated influence across a semantic query network induces opinion shifts over a holistic, multi-topic query space. We formalize this threat in a black-box setting and propose DiscourseFlip, an agentic, graph-guided attack that dynamically allocates a limited poisoning budget to maximize discourse-level opinion deviation. Extensive experiments demonstrate that DiscourseFlip consistently induces targeted opinion shifts across the contextualized query network and significantly outperforms existing baselines in terms of coverage and effectiveness. User studies further confirm that DiscourseFlip is effective while remaining well camouflaged from user detection. Moreover, systematic analyses show that existing mitigation strategies are ineffective against discourse-level manipulation, underscoring the urgent need for more robust and adaptive defenses to address discourse-level vulnerabilities.

2606.01138 2026-06-04 cs.CR cs.AI cs.DC

memorywire: A Vendor-Neutral Wire Format for Agent Memory Operations

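Thamilvendhan Munirathinam

Institutions: Independent Researcher

AI summary: Proposes memorywire, a vendor-neutral wire format based on JSON-Schema 2020-12 that supports five memory operations and four memory types, with performance and compatibility validated through a reference implementation and benchmarks.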

Comments: v2: title corrected from pre-launch name "AMP" to "memorywire"; abstract clarifies recall@5 = 1.000 is on the 42 gold-id queries (50 total; 8 no-match probes excluded). 17 pages, 1 figure, 6 tables. Code: github.com/mthamil107/memorywire. Companion to arXiv:2604.18248 (Prompt Injection Detection)

Abstract

Agent-memory frameworks -- mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor -- each ship their own SDK, storage layout, and operational vocabulary. There is no shared wire format: every integration is bespoke, every migration rebuilds memory from scratch, and no framework ships a governance surface that lets a human review writes before they enter long-term storage. We present memorywire, a JSON-Schema 2020-12 wire format for five memory operations (remember, recall, forget, merge, expire) over four memory types (semantic, episodic, procedural, emotional), with a MemoryStore interface, a fan-out router, and an optional HITL governance channel. We describe an open-source reference implementation with five backend adapters (sqlite-vec, mem0, Letta, Cognee, pgvector); a microbenchmark on a 100-fact / 50-query labelled corpus (42 with non-empty gold ids + 8 no-match probes) achieving recall@5 = 1.000 on the 42 gold-id queries with ingest p50 = 37.8 ms and recall p50 = 40.6 ms; an adversarial-fusion experiment showing Reciprocal Rank Fusion holds recall@5 = 1.000 across a 1-of-N rank-0 injection sweep (K in {0, 5, ..., 50}) where max fusion collapses to 0.500 with 80% leak at K >= 5; and a 16-scenario cross-adapter conformance suite passing 68 of 80 cells with zero failures. The contribution is not a new algorithm; it is a packaging of established components (RRF, FSMs, STM/LTM consolidation, diff-and-approve workflows) into a vendor-neutral protocol with an empirically validated reference, positioned to compose with the Model Context Protocol rather than compete with it.
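The contrast between Reciprocal Rank Fusion and max fusion under a rank-0 injection can be sketched as follows. This is a generic illustration, not the memorywire implementation: the function names, the constant k = 60 (a commonly used RRF default), and the toy scores are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each list adds 1 / (k + rank) per item, rank starting at 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for position, doc_id in enumerate(ranking):  # position 0 is the top hit
            scores[doc_id] += 1.0 / (k + position + 1)
    # Sort by descending fused score, breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def max_fuse(scored_lists):
    """Max fusion: each item keeps the highest raw score any backend gave it."""
    best = {}
    for scored in scored_lists:
        for doc_id, score in scored.items():
            best[doc_id] = max(score, best.get(doc_id, float("-inf")))
    return sorted(best, key=lambda d: (-best[d], d))


# Two honest backends agree; a third (hypothetical) backend injects "x" at rank 0.
honest = {"a": 0.9, "b": 0.8, "c": 0.7}
poisoned = {"x": 0.99, "a": 0.5, "b": 0.4}


def to_ranking(scored):
    """Turn a score dictionary into a ranked list of ids, best first."""
    return sorted(scored, key=lambda d: -scored[d])


rrf_order = rrf_fuse([to_ranking(honest), to_ranking(honest), to_ranking(poisoned)])
max_order = max_fuse([honest, honest, poisoned])
```

Because RRF sums rank contributions across backends, a single poisoned list cannot lift the injected item above items that several lists agree on, whereas max fusion lets one inflated score win outright.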

2606.01023 2026-06-04 cs.CV cs.AI

Data Collection for Training Quality-Control AI in Carpet Manufacturing

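Akbar Erkinov

Institutions: Independent Researcher

AI summary: Addresses slow, subjective, and inconsistent visual inspection in carpet production by proposing an in-line machine-vision system design that detects defects in real time using synchronized line-scan cameras with combined illumination, systematically collects labelled data to keep training quality-control models, and quantifies the quality improvement through the DMAIC method.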

Comments: 10 pages, 3 figures

Abstract

Visual inspection remains the dominant quality-control practice in woven and tufted carpet production, yet it is slow, subjective, and inconsistent at the line speeds and widths of modern looms. We present a design proposal for an in-line machine-vision system whose primary purpose is twofold: to inspect the carpet web in real time and, equally importantly, to systematically collect and label images of defect patterns so that increasingly capable quality-control models can be trained over the life of the installation. The proposal is grounded in a concrete industrial setting: a Six Sigma (DMAIC) project at a woven-carpet production facility that anticipated a production bottleneck following the installation of additional weaving machines, with a substantial baseline defect rate and significant financial exposure associated with quality failures. We describe an imaging subsystem based on synchronized line-scan cameras with combined bright-field and grazing illumination, derive the resolution and throughput requirements needed to resolve fine structural defects across a multi-metre web, and define a carpet-specific defect taxonomy. We then lay out a staged modelling strategy that begins with unsupervised anomaly detection trained on defect-free material, following the paradigm exemplified by the carpet category of the MVTec Anomaly Detection benchmark, and matures through a human-in-the-loop annotation flywheel into supervised detection and segmentation models. Finally, we connect detection performance to the DMAIC objectives, showing how reductions in escaped defects translate into improved process quality and process sigma levels. The contribution is an end-to-end, deployable blueprint that treats data collection as a first-class engineering objective rather than an afterthought.
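The link between escaped defects and process sigma level can be made concrete with the conventional Six Sigma conversion. The sketch below assumes the customary 1.5-sigma long-term shift and uses made-up defect counts; the function name and all numbers are illustrative and are not taken from the paper.

```python
from statistics import NormalDist


def process_sigma(defects, opportunities, shift=1.5):
    """Convert a defect count into a process sigma level.

    Uses the conventional relation sigma = z(yield) + shift, where yield is the
    defect-free fraction. Raises StatisticsError when defects == 0, because the
    inverse normal CDF is undefined at a yield of exactly 1.0.
    """
    dpmo = 1_000_000 * defects / opportunities  # defects per million opportunities
    yield_fraction = 1.0 - dpmo / 1_000_000
    return NormalDist().inv_cdf(yield_fraction) + shift


# Hypothetical before/after figures: escaped defects per 100,000 inspected units.
baseline = process_sigma(defects=5_000, opportunities=100_000)
improved = process_sigma(defects=500, opportunities=100_000)
```

Under this convention, 66,807 defects per million opportunities corresponds to roughly 3 sigma and 3.4 per million to roughly 6 sigma, so any reduction in escaped defects moves the process up the scale.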

2606.00747 2026-06-04 cs.CV cs.AI

SkyShield: Occupancy as a Safety Interface for Low-Altitude UAV Autonomy

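Jie Gao, Jie Ma, Kaihui Lin, Kai Ye, Miaohui Zhang, Pingyang Dai, Liujuan Cao

Institutions: Xiamen University; Jiangxi Academy of Sciences

AI summary: Targets 3D spatial understanding for low-altitude UAV autonomy, introducing SkyShield, the first front-view monocular semantic occupancy benchmark; KAR-mIoU, a dynamics-aware metric; and SkyOcc, a geometry-first baseline, which together treat occupancy as a safety interface.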

Abstract

For low-altitude Unmanned Aerial Vehicle (UAV) autonomy, 3D spatial understanding is not merely a perception objective, but the safety interface between human instructions and physical flight. In human-scale urban airspace below 20 meters, thin geometry, occlusions, vegetation, and urban clutter define whether an aerial agent can safely enter the space ahead. However, existing UAV datasets mainly provide 2D annotations or 3D boxes, while driving-oriented occupancy benchmarks assume stable ground-level sensor rigs. Both miss the defining regime of low-altitude flight: a front-facing monocular camera observing occupied and free space from a moving aerial body with frame-wise changing 6-DoF pose and camera extrinsics. To bridge this gap, we introduce SkyShield, to the best of our knowledge the first front-view monocular semantic occupancy benchmark for urban UAV flight below 20 meters. Built on CARLA, SkyShield contains 36K front-view UAV samples across diverse urban scenes and weather conditions, pairing each image with frame-wise 6-DoF UAV pose, frame-wise dynamic camera geometry, UAV states, and front-frustum semantic occupancy labels. We further propose KAR-mIoU, a UAV-centric and dynamics-aware metric that re-weights voxel-level evaluation by kinematic reachability and time-to-collision, revealing safety-critical risks hidden by conventional mIoU. To tackle this challenging new setting, we provide SkyOcc, a geometry-first monocular baseline that integrates frame-wise UAV attitude into projection, fuses temporal occupancy features, and applies safety-prior optimization to preserve sparse collision-critical structures. Together, SkyShield, KAR-mIoU, and SkyOcc establish occupancy as a safety interface for low-altitude aerial autonomy. Code and dataset will be released publicly.
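The general idea of re-weighting voxel-level evaluation can be sketched with a weighted mIoU. This is only a generic illustration: the abstract does not give the KAR-mIoU formula, so the per-voxel weights below are hypothetical stand-ins for whatever kinematic reachability and time-to-collision would produce, and the function name is an assumption.

```python
def weighted_miou(pred, gt, weights, num_classes):
    """Mean IoU where every voxel contributes its weight instead of a count of 1."""
    ious = []
    for c in range(num_classes):
        inter = sum(w for p, g, w in zip(pred, gt, weights) if p == c and g == c)
        union = sum(w for p, g, w in zip(pred, gt, weights) if p == c or g == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four voxels, two classes (0 = free, 1 = occupied); voxel 2 is misclassified.
pred = [0, 1, 1, 0]
gt = [0, 1, 0, 0]

uniform = weighted_miou(pred, gt, [1, 1, 1, 1], num_classes=2)
# Hypothetical safety weighting: the misclassified voxel lies on the flight path.
safety = weighted_miou(pred, gt, [1, 1, 10, 1], num_classes=2)
```

With uniform weights the single error costs little, while up-weighting the voxel on the flight path makes the same error pull the score down sharply, which is the kind of effect a plain mIoU would hide.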

2606.00732 2026-06-04 cs.AI cs.LG

SHARP: Sleep-based Hierarchical Accelerated Replay for Long Range Non-Stationary Temporal Pattern Recognition

Jayanta Dey, Shikhar Srivastava, Itamar Lerner, Christopher Kanan, Dhireesha Kudithipudi

Institutions: Department of Computer Engineering, University of Texas at San Antonio, USA; Department of Computer Science, University of Rochester, USA; Department of Psychology, University of Texas at San Antonio, USA

AI summary: Proposes SHARP, a framework that decomposes temporal learning into a memory module and a pattern-recognition module and adds offline sleep phases that replay temporally structured memories in accelerated form, enabling efficient learning of long-range non-stationary sequence patterns.

Abstract

Learning long-range non-stationary temporal patterns remains a core challenge for modern sequence models, particularly in strict streaming settings. In these settings, data arrive sequentially and must be processed in a single pass without simultaneously revisiting past observations. Standard architectures, including recurrent neural networks and transformers, are constrained in long-range credit assignment by either the truncated backpropagation-through-time horizon or an explicit input window length. To address these limitations, we propose SHARP (Sleep-based Hierarchical Accelerated Replay), a framework that decomposes temporal learning into two complementary components: a memory module that accumulates a structured history of past inputs, and a pattern-recognition module that operates over this memory. This separation enables resource- and compute-efficient adaptation to non-stationary dynamics by eliminating the need for backpropagation through time across many steps for long-range credit assignment. Inspired by the accelerated replay observed in rodents during slow-wave sleep, SHARP incorporates offline (sleep) phases in which temporally structured memory traces are replayed in an accelerated form and integrated into higher-level memory representations, improving long-range context retention. Through controlled simulations and ablation studies, we characterize the key properties of the proposed framework. On benchmark datasets such as text8 and PG-19, we demonstrate that SHARP improves over recurrent baselines by retaining next-token predictive performance on previously seen data while continuing to learn from the current stream and generalizing to future unseen data. These gains are enabled by its hierarchical structure, which yields an exponentially increasing effective temporal context with only linear-time computational cost.

2606.00356 2026-06-04 cs.CL

How Far Do Auto-Interpretation Labels Generalize: A Controlled Study Across Languages, Scripts, and Rewordings

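Sripad Karne

Institutions: Columbia University

AI summary: Uses Serbian digraphia as a controlled experiment to test whether natural-language labels for sparse autoencoder features truly generalize across languages and scripts, finding that the labels show substantial gaps in matching semantic content and that these gaps widen with network depth.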

Abstract

Sparse autoencoder (SAE) features are increasingly used to interpret language models, with auto-generated natural-language labels serving as the primary interface for understanding what each feature represents. We ask whether these labels generalize: does a feature labeled for a concept actually track that concept across languages and scripts? Using Serbian digraphia as a controlled testbed--the same language written in both Latin and Cyrillic via deterministic transliteration--we first find that SAE feature sets activated by the same content in different languages, scripts, and wordings share substantial overlap (mean Jaccard 0.39 vs. 0.13 random baseline, peaking at 0.57), suggesting genuine cross-lingual semantic features. We then test whether auto-interpretation labels keep pace. They often do not: features whose labels describe semantic content miss the same meaning in Serbian up to 4x more often than within English, and miss Serbian Cyrillic more than Serbian Latin--two scripts that are deterministic transliterations of each other--suggesting the failures align with how well each form is represented in training. The gap grows with network depth, yet the labels give no indication that they fail. These results suggest that auto-interpretation labels may reflect a feature's behavior on well-represented inputs rather than the concept itself.
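The overlap statistic quoted in the abstract is the Jaccard index between sets of active features. A minimal sketch follows; the feature ids are invented for illustration, and how the paper selects "active" features is not stated in the abstract.

```python
def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| between two collections of feature ids."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical ids of SAE features that fire on the same sentence in two scripts.
latin_features = {3, 17, 42, 101, 256}
cyrillic_features = {17, 42, 101, 300}

overlap = jaccard(latin_features, cyrillic_features)  # 3 shared ids out of 6 distinct
```

A value of 0.5 here means half of all distinct features seen in either script fire in both; the abstract's 0.13 random baseline is the reference point for judging whether such overlap is meaningful.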

2606.00260 2026-06-04 cs.CV cs.LG

LastAct: Trajectory-Guided Latest-Activity Localization for Real-Time Smart-Home Activity Recognition

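Zishuai Liu, Ruili Fang, Jin Lu, Fei Dou

Institutions: School of Computing, University of Georgia

AI summary: Proposes LastAct, a framework that uses trajectory image sequences and a boundary localizer to address boundary contamination in sliding windows, enabling real-time smart-home activity recognition.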

Abstract

Human Activity Recognition (HAR) from ambient sensors enables smart-home applications such as health monitoring and assisted living. In realistic deployments, however, sensor events arrive as a continuous stream and activity boundaries are unknown. Sliding-window inference therefore produces many windows that straddle transitions and contain mixed activities, creating boundary contamination that violates the pre-segmented instance assumption used by most benchmarks and models. Moreover, many pipelines under-use spatial context by treating sensor IDs as independent tokens. We present LastAct, a trajectory-centric framework for streaming smart-home HAR that targets the most recent activity under mixed windows while explicitly modeling spatial structure. LastAct projects sensor events onto the home floorplan to form a layout-aligned trajectory image sequence that preserves spatial continuity. A lightweight gate identifies contaminated windows, and a boundary localizer estimates the last transition to enable boundary-guided masking that emphasizes post-boundary evidence and suppresses stale context. For efficiency, we reuse a precomputed layout-aligned template cache to avoid repeated rendering. Empirically, across four public smart-home datasets under near-realistic mixed-activity protocols, LastAct achieves competitive or superior performance on pure windows and yields substantial Macro-F1 gains on cross/mixed windows, demonstrating improved robustness under near-realistic sliding-window regimes.

2606.00012 2026-06-04 cs.CL cs.AI

DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset

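Shannan Liu, Peifeng Li, Yaxin Fan, Qiaoming Zhu

Institutions: School of Computer Science and Technology, Soochow University

AI summary: Noting that prior work is limited to text or two-party dialogue, constructs DraDDP, the first publicly available English multimodal multi-party dialogue discourse parsing dataset, built from American TV dramas, and verifies the value of multimodal information for capturing dialogue structures and relation types.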

Journal ref: Findings of the Association for Computational Linguistics (ACL 2026)

Abstract

Multi-party dialogue discourse parsing aims to identify dependency structures and relation types between utterances in conversations. Previous studies are mostly limited to the textual modality or two-party dialogue and do not cover multimodal, multi-party settings. In this paper, we construct DraDDP, the first publicly available English multimodal dataset for multi-party dialogue discourse parsing, based on American TV dramas. DraDDP contains 495 dialogue segments with 6,374 utterances and 9.1 hours of parallel video content, covering rich multi-party interaction scenarios. Moreover, we establish comprehensive benchmarks by evaluating this task on DraDDP and conducting in-depth analysis of the impact of different modalities. Experimental results demonstrate the value of multimodal information in capturing dialogue structures and relation types. We will publicly release the dataset, annotation guidelines, and code to promote future research in multimodal dialogue understanding.

2605.31604 2026-06-04 cs.CV

Representation Forcing for Bottleneck-Free Unified Multimodal Models

Yuqing Wang, Zhijie Lin, Ceyuan Yang, Yang Zhao, Fei Xiao, Hao He, Qi Zhao, Zihan Ding, Fuyun Wang, Shuai Wang, Youliang Zhang, Haoqi Fan, Xihui Liu

Institutions: University of Hong Kong; ByteDance Seed; The Chinese University of Hong Kong; Nanjing University; Tsinghua University

AI summary: Proposes Representation Forcing (RF), in which the decoder autoregressively predicts visual representations as intermediate tokens that then guide pixel diffusion within the same backbone, removing unified multimodal models' dependence on a pretrained VAE and yielding a bottleneck-free end-to-end model.

Comments: Project page: https://yuqingwang1029.github.io/RepresentationForcing

Abstract

Unified multimodal models (UMMs) aim to handle perception and generation in a single model. Yet existing UMMs still rely on a frozen, separately pretrained VAE for image generation, imposing a structural bottleneck. Naively removing it introduces a quality gap, as the model must learn both high-level structure and low-level details from raw pixels. In this paper, we propose Representation Forcing (RF), a technique that closes this gap by making representation prediction a native capability of the model. Concretely, RF forces the decoder to autoregressively predict visual representations as intermediate tokens before pixels; these tokens then stay in context to guide pixel diffusion within the same backbone. By turning representations from perception outputs into generation targets, RF eliminates the need for any external generative latent space. We find that RF benefits both understanding and generation. On image generation, our pixel-space model with RF matches state-of-the-art VAE-based unified models. On image understanding, pixel-space RF generally outperforms its VAE-based variant. Together, these results offer an effective step toward end-to-end, bottleneck-free UMMs.

2605.31483 2026-06-04 cs.CL cs.AI

BenHalluEval: A Multi-Task Hallucination Evaluation Framework for Large Language Models on Bengali

Shefayat E Shams Adib, Ahmed Alfey Sani, Ekramul Alam Esham, Ajwad Abrar, Ishmam Tashdeed, Md Taukir Azam Chowdhury

Institutions: Department of Computer Science and Engineering, Islamic University of Technology; Department of Computer Science and Engineering, University of California

AI summary: Addresses the lack of hallucination evaluation for large language models in Bengali by proposing the BenHalluEval framework, which covers four tasks and 12,000 constructed hallucinated candidates, and BenHalluScore, a dual-track calibration metric that reveals substantial differences in hallucination calibration across models.

Comments: Preprint. Under review

Abstract

Despite Bengali being the sixth most spoken language in the world, no prior work has systematically evaluated hallucination in large language models (LLMs) for Bengali. We introduce BenHalluEval, a fine-grained hallucination evaluation framework for Bengali covering four tasks: Generative Question Answering (GQA), Bangla-English Code-Mixed QA, Summarization, and Reasoning. We construct 12,000 hallucinated candidates using GPT-5.4 across twelve task-specific hallucination types, drawn from three existing Bengali datasets, and evaluate seven LLMs spanning reasoning-oriented, multilingual, and Bengali-centric categories under a dual-track protocol that independently measures false-positive rate on ground-truth instances (Track A) and hallucination detection rate on hallucinated candidates (Track B). To jointly penalise both failure modes and prevent inflated scores from uniform response bias, we propose BenHalluScore, a dual-track calibration metric that ranges from 7.72% to 55.42% across models and tasks, revealing substantial variation in hallucination calibration. Chain-of-thought prompting, applied as a mitigation strategy, shifts response distributions without consistently improving hallucination discrimination. BenHalluEval establishes the first dedicated hallucination benchmark for Bengali and highlights the inadequacy of single-track and prompting-only evaluation approaches for low-resource language settings. The dataset and code are available at https://anonymous.4open.science/r/BanglaHalluEval-EB77.

2605.30995 2026-06-04 cs.CY cs.CL

Traceable by Design: An LLM Pipeline and Dashboard for EU Regulatory Consultation Analysis

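Thales Bertaglia, Haoyang Gui, Catalina Goanta, Gerasimos Spanakis

Institutions: Utrecht University; Maastricht University

AI summary: Presents an end-to-end LLM-based pipeline and interactive dashboard that extracts structured topics from regulatory consultation submissions with verbatim grounding, full traceability, and transparency, validated on the EU Digital Fairness Act.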

Comments: This research has been supported by funding from the ERC Starting Grant HUMANads (ERC-2021-StG No 101041824)

Abstract

Public consultations generate large volumes of data in the form of stakeholder submissions that are practically unfeasible to analyse manually. We present an end-to-end LLM-based pipeline and interactive dashboard for structured topic extraction from regulatory consultation submissions, demonstrated on the European Commission's Digital Fairness Act (DFA) public call for evidence as a case study. The system processes raw PDF attachments and web-form responses, extracts topic annotations, and grounds every extraction in a verbatim quote from the source text. Applied to 4,322 DFA submissions, the pipeline produced 15,368 topic annotations supported by 20,951 verbatim evidence quotes. Three principles govern the proposed design: verbatim grounding, full traceability, and transparency by design. The dashboard exposes the full extraction dataset through five analytical views, from dataset-level topic overviews to individual paragraph drill-downs, with every result traceable to its source. Beyond the predefined DFA topic categories, the pipeline generated certain stakeholder concerns, such as Age Verification, Payment Processor Censorship, and Digital Ownership, that a fixed-taxonomy approach would have missed. The pipeline is domain-generic; adapting it to a new consultation requires only a prompt update and a new dataset. A live demo is available at https://dfa-dashboard.thalesbertaglia.com/. The code and processed data are publicly available at https://github.com/thalesbertaglia/dfa-dashboard.
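Verbatim grounding, as described in the abstract, implies that each extracted quote can be checked mechanically against its source text. The sketch below shows one way such a check might look; the whitespace normalization, the annotation structure, and the function names are assumptions rather than the pipeline's actual code.

```python
def normalize(text):
    """Collapse runs of whitespace so PDF line breaks do not defeat matching."""
    return " ".join(text.split())


def is_verbatim(quote, source):
    """True when the quote occurs word for word in the source, modulo whitespace."""
    return normalize(quote) in normalize(source)


def ungrounded(annotations, source):
    """Return the annotations whose quote cannot be found in the source text."""
    return [a for a in annotations if not is_verbatim(a["quote"], source)]


# Hypothetical submission text and model output.
source = "Age verification should be\nproportionate.  Platforms must not store ID scans."
annotations = [
    {"topic": "Age Verification", "quote": "Age verification should be proportionate."},
    {"topic": "Digital Ownership", "quote": "Users own their purchases."},
]

flagged = ungrounded(annotations, source)
```

Any annotation that ends up in `flagged` would need to be dropped or re-extracted, which is what makes every surviving result traceable to its source passage.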

2605.30457 2026-06-04 eess.AS cs.CL

Extracting accent features in spoken Brazilian Portuguese without sociolinguistic labels

Pedro H. L. Leite, Pedro Benevenuto Valadares, Luiz W. P. Biscainho

Institutions: PEE/COPPE, UFRJ; Faculdade de Engenharia Elétrica e Computação (FEEC), UNICAMP; DEL/Poli & PEE/COPPE, UFRJ

AI summary: Addresses the lack of labels for Brazilian Portuguese accent classification by proposing a new workflow that uses only acoustic labels and extracts features by isolating regional accent landmarks and using a phoneme-based forced aligner, outperforming generic architectures on accent-related tasks.

2605.28210 2026-06-04 cs.AI cs.CY cs.HC q-bio.NC

The Illusion of Opting in AI-Mediated Consequential Decisions

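Eugene Yu Ji

Institutions: arXiv.org; GitHub

AI summary: Drawing on Ullmann-Margalit's concept of opting, argues that current AI systems create an "illusion of opting", in which seemingly meaningful consequential choices actually weaken a person's genuine capacity to choose, and proposes protecting and cultivating meta-capacity through three normative imperatives: existential honesty, ecological rationality, and counterfactual reparation.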

Comments: 11 pages, 1 figure, 2 tables

Abstract

Drawing on Ullmann-Margalit's concept of opting (transformative, irrevocable, and shadowed by foreclosed alternatives), we show that current AI systems raise a profound ethical problem that existing AI ethics has not fully captured: the illusion of opting, in which persons and groups encounter the deceptive appearance of meaningful consequential choice while the agency needed to become genuinely capable of choosing is weakened. Against approaches that treat AI primarily as an optimizer of already given ends, we argue that AI systems should be evaluated by whether they protect and cultivate meta-capacity against the illusion of opting: the socially and institutionally scaffolded agentive capacity through which means and ends can be formed, contested, revised, and owned. This reframing is especially urgent for disadvantaged populations, who are least able to absorb the costs of the illusion of opting when AI-mediated pathways misdirect behavior and action. We propose three normative imperatives for AI-mediated consequential decisions: existential honesty, which acknowledges the limits of prediction; ecological rationality, which situates guidance within heterogeneous lived ecologies; and counterfactual reparation, which acknowledges and repairs foreclosed alternatives when AI-mediated decision-making pathways fail.

2605.24358 2026-06-04 cs.LG cs.AI

Treatment Effect Estimation with Differentiated Networked Effect on Graph Data

图数据上具有差异化网络效应的处理效应估计

Xiaofeng Lin, Han Bao, Hisashi Kashima

发表机构 * Kyoto University（京都大学）； The Institute of Statistical Mathematics（统计数学研究所）； Tohoku University（东北大学）； RIKEN AIP（理化学研究所AIP）

AI总结针对图数据中个体处理效应估计受邻居干扰且存在差异化网络效应的问题，提出一种结合部分注意力机制和消息放大器的干扰建模方法，以捕获邻居重要性和规模差异，提升估计精度。

Comments: Accepted by the research track of the KDD 2026 conference

AI中文摘要

从观测图数据中估计个体处理效应（ITE）对于商业和医学等领域的决策至关重要。由于干扰的存在，该任务具有挑战性，因为个体结果可能受到其邻居的处理和协变量的影响。现有方法尝试对这种干扰进行建模以实现准确的ITE估计。然而，一个关键问题常常被忽视：差异化网络效应（DNE），即由具有不同重要性和规模的邻居组成的局部网络所产生的影响。捕获DNE至关重要；否则，由于对干扰的错误刻画，我们将得到不精确的ITE估计，从而导致错误的决策。为了解决这一挑战，我们提出了一种新颖的干扰建模机制，该机制结合了两个部分注意力机制和一个消息放大器。部分注意力机制自动估计不同邻居在干扰中的重要性，而消息放大器根据邻居的规模调整干扰建模机制的结果，所有这些使得模型能够捕获DNE。在三个真实世界图上的实验表明，我们的方法在从图数据估计ITE方面优于现有方法，这证实了显式捕获DNE的重要性。

英文摘要

Estimating individual treatment effect (ITE) from observational graph data is crucial for decision-making in fields such as commerce and medicine. This task is challenging due to interference, where individual outcomes can be influenced by the treatments and covariates of their neighbors. Existing methods attempt to model such interference for accurate ITE estimation. However, a critical issue is often overlooked: differentiated networked effect (DNE), an effect caused by local networks consisting of neighbors with varying importance and scales. Capturing DNE is vital; otherwise, we will end up with imprecise ITE estimation due to an erroneous characterization of interference, which can result in misguided decisions. To address this challenge, we propose a novel interference modeling mechanism that incorporates two partial attention mechanisms and a message amplifier. The partial attention mechanisms automatically estimate the importance of different neighbors in contributing to interference, while the message amplifier adjusts the results of the interference modeling mechanism based on the scale of neighbors, all of which enables the model to capture DNE. Experiments on three real-world graphs demonstrate that our methods outperform existing approaches for ITE estimation from graph data, which corroborates the importance of explicitly capturing DNE.

2605.27488 2026-06-04 cs.CR cs.AI

Grimlock: Guarding High-Agency Systems with eBPF and Attested Channels

Grimlock: 使用eBPF和认证通道保护高代理系统

Qiancheng Wu, Wenhui Zhang, Gan Fang, Sheng Mao, Biao Gao, David Levitsky, Shawna Murphy Butterworth, Rob Cameron

发表机构 * Roblox

AI总结针对代理系统中用户编排代码带来的安全挑战，提出Grimlock代理守卫，通过eBPF强制流量拦截和TLS 1.3通道绑定认证，实现透明、可审计、作用域绑定的代理间通信。

Comments: Vision paper presented at the 1st Workshop on Operating Systems Design for AI Agents (AgenticOS '26), co-located with ASPLOS 2026

AI中文摘要

代理系统越来越多地运行用户编写的编排代码，这些代码调用工具、生成子任务并在机器和云之间委派工作。虽然这种高代理效率很高，但它带来了安全问题：身份、授权、来源和委派通常被推入应用程序代码，在那里它们变得难以一致地执行和审计。我们提出Grimlock，一种代理守卫，通过将信任执行移动到沙箱子系统中，同时保持代理代码不变，来恢复关注点分离。Grimlock使用eBPF强制流量拦截来确保沙箱通信通过守卫，并将其与绑定到标准TLS 1.3通道绑定的握手后认证相结合。通道建立后，守卫授权通信并生成短暂的、通道绑定的作用域令牌，这些令牌捕获最小权限委派。在接收端，目标守卫重新验证身份、作用域和通道绑定，终止TLS，并仅在策略检查成功后向目标沙箱释放明文。kTLS为受保护的通信提供了高效的数据平面。因此，Grimlock提供了一条路径，使用通用Linux原语，无需更改用户层编排代码，即可在异构多云环境中实现透明、可审计、作用域绑定的代理间通信。

英文摘要

Agentic systems increasingly run user-authored orchestration code that invokes tools, spawns subtasks, and delegates work across machines and clouds. Although this high agency is productive, it creates a security problem: identity, authorization, provenance, and delegation are often pushed into application code, where they become difficult to enforce consistently and difficult to audit. We present Grimlock, an Agent Guard that restores separation of concerns by moving trust enforcement into the sandbox substrate while leaving agent code unchanged. Grimlock uses eBPF-enforced traffic interception to ensure that sandbox communication passes through a guard, and combines it with post-handshake attestation bound to standard TLS 1.3 channel bindings. After a channel is established, the guard authorizes communication and mints short-lived, channel-bound scope tokens that capture least-privilege delegation. At the receiving side, the destination guard re-validates identity, scope, and channel binding, terminates TLS, and releases plaintext to the destination sandbox only after policy checks succeed. kTLS provides an efficient dataplane for protected communication. As a result, Grimlock offers a path toward transparent, auditable, and scope-bound agent-to-agent communication across heterogeneous multi-cloud environments, using commodity Linux primitives and without requiring changes to user-layer orchestration code.
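
The idea of a short-lived, channel-bound scope token can be illustrated with a minimal Python sketch. This is not Grimlock's implementation: the function names `mint_scope_token` and `validate_scope_token`, the shared HMAC key between guards, and the JSON claim layout are all hypothetical choices made for illustration. The `channel_binding` bytes are assumed to come from the TLS layer (for TLS 1.3, for example, exported keying material) and are simply passed in here.

```python
import base64
import hashlib
import hmac
import json
import time


def mint_scope_token(key, channel_binding, scope, ttl_seconds=60, now=None):
    """Mint a short-lived token tied to one channel (hypothetical format)."""
    issued = time.time() if now is None else now
    claims = {
        "scope": scope,
        "exp": issued + ttl_seconds,
        # Store a hash of the channel binding so the token is useless elsewhere.
        "cb": hashlib.sha256(channel_binding).hexdigest(),
    }
    payload = json.dumps(claims, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(tag).decode())


def validate_scope_token(key, token, channel_binding, required_scope, now=None):
    """Re-check integrity, expiry, scope, and channel binding on the receiver."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return False
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return (
        claims["exp"] > current
        and claims["scope"] == required_scope
        and hmac.compare_digest(
            claims["cb"], hashlib.sha256(channel_binding).hexdigest()
        )
    )
```

A token minted for one channel fails validation when replayed on a channel with different binding bytes, after its expiry time, or against a different scope. A production design would more plausibly rely on asymmetric signatures and attested identities than on a shared key.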

2605.26814 2026-06-04 cond-mat.str-el cs.LG physics.comp-ph

Neural Autoregressive Control Variates for the Quantum Monte Carlo Sign Problem

量子蒙特卡洛符号问题的神经自回归控制变量

Bei Qiao, Lei Wang

发表机构 * Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China（北京凝聚态物理国家实验室和物理研究所，中国科学院，北京100190，中国）； University of Chinese Academy of Sciences, Beijing 100049, China（中国科学院大学，北京100049，中国）

AI总结通过训练一对自回归模型构造零均值控制变量，有效缓解量子蒙特卡洛模拟中的符号问题，在三角晶格海森堡反铁磁体上实现平均符号标准误差最多降低一个数量级，能量估计误差降低三到五倍。

Comments: 19 pages, 9 figures

AI中文摘要

我们训练一对自回归模型来构造零均值控制变量，以缓解量子蒙特卡洛模拟中的符号问题。这两个自回归网络被限制在严格不相交支撑的正负符号扇区内，并且每个网络在其扇区内精确归一化。因此，它们的差在结构上具有零均值，提供了一个无偏的辅助可观测量，其与符号估计量的相关性控制方差减少。我们在随机级数展开框架内实现该方法，通过开发增量环拓扑更新将其扩展到受挫晶格。符号遍历采样通过扭转通道实现，这是非二分晶格上唯一的符号改变机制。我们将控制变量实现为自回归变换器，并带有序列结束奇偶掩码以强制精确的符号扇区分辨率，同时将增量环计数变化和累积受挫奇偶性作为拓扑特征纳入。在三角晶格海森堡反铁磁体上，我们在小$N$极限下对该方法进行基准测试。控制变量将平均符号的标准误差最多降低了一个数量级，并将能量估计量的标准误差降低了三到五倍，即使在平均符号低于$10^{-3}$时仍然有效。这项工作奠定了框架并提供了原理验证，表明自回归控制变量可以有效缓解符号问题。扩展到更大系统并采用物理信息架构是未来工作的主题。

英文摘要

We train a pair of autoregressive models to construct zero-mean control variates to mitigate the sign problem in quantum Monte Carlo simulations. The two autoregressive networks are confined to the positive- and negative-sign sectors with strictly disjoint support, and each is exactly normalized over its sector. Their difference is therefore structurally zero-mean, providing an unbiased auxiliary observable whose correlation with the sign estimator controls the variance reduction. We implement the method within the stochastic series expansion framework, which we extend to frustrated lattices by developing an incremental loop-topology update. Sign-ergodic sampling is achieved through a twist channel, which is the unique sign-changing mechanism on non-bipartite lattices. We implement the control variates as autoregressive transformers with an end-of-sequence parity mask that enforces exact sign-sector resolution, while the incremental loop-count change and cumulative frustration parity are incorporated as topological features. On the triangular-lattice Heisenberg antiferromagnet, we benchmark the method in the small-$N$ limit. The control variate reduces the standard error of the average sign by up to an order of magnitude and that of the energy estimator by a factor of three to five, remaining effective even when the average sign drops below $10^{-3}$. This work lays out the framework and provides a proof-of-principle demonstration that autoregressive control variates can effectively mitigate the sign problem. Scaling to larger systems with physics-informed architectures is the subject of future work.
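
The variance-reduction principle behind a zero-mean control variate can be shown with a toy Python sketch. It does not reproduce the paper's autoregressive networks or any quantum Monte Carlo sampling: the observable `f`, the control variate `g`, and the function name `control_variate_estimate` are assumptions chosen for illustration. The only property carried over from the abstract is that `g` has mean zero by construction and is correlated with the quantity being estimated.

```python
import random
import statistics


def control_variate_estimate(f_samples, g_samples):
    """Estimate E[f] using g, a control variate whose true mean is zero.

    Returns (estimate, standard error). The coefficient c is fitted on the
    same samples, which introduces a small O(1/n) bias in this toy version.
    """
    n = len(f_samples)
    mean_f = statistics.fmean(f_samples)
    mean_g = statistics.fmean(g_samples)
    cov_fg = sum(
        (f - mean_f) * (g - mean_g) for f, g in zip(f_samples, g_samples)
    ) / (n - 1)
    # Variance-minimizing coefficient c* = Cov(f, g) / Var(g).
    c = cov_fg / statistics.variance(g_samples)
    corrected = [f - c * g for f, g in zip(f_samples, g_samples)]
    return statistics.fmean(corrected), statistics.stdev(corrected) / n ** 0.5


# Toy data: f has a small mean (0.05) buried in noise that g largely explains.
rng = random.Random(0)
g = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
f = [0.05 + 0.9 * x + 0.3 * rng.gauss(0.0, 1.0) for x in g]

plain_mean = statistics.fmean(f)
plain_err = statistics.stdev(f) / len(f) ** 0.5
cv_mean, cv_err = control_variate_estimate(f, g)
print(f"plain estimate:        {plain_mean:.4f} +/- {plain_err:.4f}")
print(f"with control variate:  {cv_mean:.4f} +/- {cv_err:.4f}")
```

The stronger the correlation between `g` and `f`, the larger the reduction in standard error, which matches the abstract's remark that the correlation with the sign estimator controls the variance reduction.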

2605.30120 2026-06-04 cs.IR cs.AI cs.LG

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

不再需要K-means：用于高效多向量检索的单阶段稀疏编码

Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng, Stefanie Jegelka, Chenyu You

发表机构 * University of California, Berkeley（加州大学伯克利分校）； Stanford University（斯坦福大学）

AI总结针对多向量检索中K-means聚类导致的索引延迟和语义损失问题，提出单阶段稀疏检索（SSR），利用稀疏自编码器将词元嵌入投影为高维稀疏表示，结合倒排索引实现高效检索，在BEIR基准上索引时间减少15倍、检索延迟减半且性能提升。

Comments: Accepted by ICML2026

AI中文摘要

以ColBERT为代表的多向量检索（MVR）模型通过保留细粒度的词元级交互，在检索准确性上树立了新标杆。然而，这种粒度带来了存储和检索效率的瓶颈：为了管理十亿级词元向量的巨大内存占用和计算开销，最先进的系统被迫依赖激进的降维和复杂的聚类（例如K-means）。这种妥协引入了两个关键限制：大规模语料库聚类的过度索引延迟以及压缩固有的语义信息损失。在本文中，我们提出了单阶段稀疏检索（SSR），这是一种范式转变，用高效的稀疏编码取代了昂贵的聚类。我们不将特征压缩为低维稠密向量，而是利用稀疏自编码器（SAE）将词元嵌入投影到高维但高度稀疏的表示中。这种转换使我们能够完全绕过向量聚类，并利用倒排索引实现精确、高吞吐量的检索。在BEIR基准上的大量实验表明，SSR实现了“三连胜”的改进：与ColBERTv2相比，索引时间减少了15倍，检索延迟减半，同时检索性能优于领先的基线方法。

英文摘要

Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However, this granularity imposes prohibitive storage and retrieval efficiency bottlenecks: to manage the immense memory footprint and computational overhead of billion-scale token vectors, state-of-the-art systems are forced to rely on aggressive dimension reduction and complex clustering (e.g., K-means). This compromise introduces two critical limitations: excessive indexing latency when clustering large-scale corpora and semantic information loss inherent to compression. In this paper, we propose Single-stage Sparse Retrieval (SSR), a paradigm shift that replaces expensive clustering with efficient sparse coding. Instead of compressing features into low-dimensional dense vectors, we use a Sparse Autoencoder (SAE) to project token embeddings into a high-dimensional but highly sparse representation. This transformation enables us to bypass vector clustering entirely and leverage inverted indexing for precise, high-throughput retrieval. Extensive experiments on the BEIR benchmark demonstrate that SSR achieves a "trifecta" of improvements: it reduces indexing time by 15x compared to ColBERTv2, halves retrieval latency, and simultaneously improves retrieval performance over leading baselines.
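
How sparse token codes let an inverted index stand in for clustering can be sketched in a few lines of Python. This is an illustration under assumptions, not the SSR system: the SAE encoder is replaced by hand-written activation vectors, `top_k_sparsify`, `build_inverted_index`, and `search` are hypothetical names, and the ColBERT-style MaxSim scoring used in `search` is assumed rather than taken from the abstract.

```python
from collections import defaultdict


def top_k_sparsify(activations, k):
    """Keep the k largest positive activations as a {dimension: value} code."""
    ranked = sorted(enumerate(activations), key=lambda p: p[1], reverse=True)[:k]
    return {dim: val for dim, val in ranked if val > 0.0}


def build_inverted_index(docs):
    """docs: {doc_id: [sparse token code, ...]} -> {dim: [(doc_id, token_idx, value)]}."""
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for token_idx, code in enumerate(tokens):
            for dim, val in code.items():
                index[dim].append((doc_id, token_idx, val))
    return index


def search(index, query_tokens):
    """Score documents by summing, over query tokens, the best-matching doc token."""
    totals = defaultdict(float)
    for q_code in query_tokens:
        # Sparse dot products, touching only postings of the query's active dims.
        dots = defaultdict(float)
        for dim, q_val in q_code.items():
            for doc_id, token_idx, d_val in index.get(dim, ()):
                dots[(doc_id, token_idx)] += q_val * d_val
        best = defaultdict(float)
        for (doc_id, _), score in dots.items():
            best[doc_id] = max(best[doc_id], score)
        for doc_id, score in best.items():
            totals[doc_id] += score
    return sorted(totals.items(), key=lambda p: p[1], reverse=True)


docs = {
    "doc_a": [top_k_sparsify([0.9, 0.0, 0.1, 0.0], k=2)],
    "doc_b": [top_k_sparsify([0.0, 0.8, 0.0, 0.2], k=2)],
}
index = build_inverted_index(docs)
print(search(index, [top_k_sparsify([1.0, 0.0, 0.0, 0.0], k=1)]))
```

Because each code has only a few active dimensions, a query touches only the posting lists for those dimensions, and indexing requires no clustering pass over the corpus.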

2605.30021 2026-06-04 cs.CL

Recovering Diversity Without Losing Alignment: A DPO Recipe for Post-Trained LLMs

在不损失对齐的情况下恢复多样性：面向后训练大语言模型的DPO配方

Vinay Samuel, Yapei Chang, Mohit Iyyer

发表机构 * University of Maryland, College Park（马里兰大学 College Park 分校）

AI总结提出REDIPO数据构建流程，通过离线DPO从基础模型生成中恢复多样性答案，同时保持指令模型的对齐性能。

Comments: Under Review. 26 pages, 3 figures, 16 tables

AI中文摘要

许多开放式指令有多个有效答案，用户可以从看到这些答案中受益，但后训练往往将LLM的输出空间缩小到一小部分规范响应。我们引入REDIPO，一种离线DPO数据构建流程，用于恢复不同的有效答案模式，同时保留指令模型的对齐优势。对于每个提示，REDIPO从基础模型和指令模型中采样响应，用指令模型重写基础模型响应，过滤候选以确保安全和指令遵循质量，并构建偏好对，在具有相似指令遵循奖励的候选者中偏向边际多样的响应。在Qwen3-4B、OLMo-3-7B和LLaMA-3.1-8B上，相对于指令检查点，REDIPO将NoveltyBench distinct_k分别提高了134%、33%和44%，而DivPO在同一模型上将多样性改变了0%、-6%和-4%。这些增益在很大程度上保持了MTBench、IFEval和Arena-Hard的性能，并降低了直接类别HarmBench攻击成功率。消融实验表明，边际多样性对选择和基础响应重写驱动了多样性增益，而过滤和质量边界配对有助于保持对齐。总体而言，我们的结果表明，通过精心构建的偏好数据，可以重新引入基础模型生成中的多样化有效答案，同时保留后训练的对齐优势。我们在https://github.com/vsamuel2003/ReDiPO发布代码和数据。

英文摘要

Many open-ended instructions have multiple valid answers that users can benefit from seeing, but post-training often narrows an LLM's output space toward a small set of canonical responses. We introduce REDIPO, an offline DPO data-construction pipeline for recovering distinct valid answer modes while preserving the alignment benefits of the instruct model. For each prompt, REDIPO samples responses from both base and instruct models, rewrites base-model responses with the instruct model, filters candidates for safety and instruction-following quality, and builds preference pairs that favor marginally diverse responses among candidates with similar instruction-following reward. Across Qwen3-4B, OLMo-3-7B, and LLaMA-3.1-8B, REDIPO improves NoveltyBench distinct_k by 134%, 33%, and 44% relative to the instruct checkpoints, while DivPO changes diversity by 0%, -6%, and -4% on the same models. These gains largely maintain MTBench, IFEval, and Arena-Hard performance, and reduce direct-category HarmBench attack success rate. Ablations show that marginal-diversity pair selection and base-response rewriting drive the diversity gains, while filtering and quality-bounded pairing help maintain alignment. Overall, our results show that diverse valid answers from base-model generations can be reintroduced through carefully constructed preference data while retaining the alignment benefits of post-training. We release our code and data at https://github.com/vsamuel2003/ReDiPO.
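
The pairing step described above (prefer the more diverse response among candidates with similar reward) can be sketched in Python. Everything here is a hypothetical stand-in for illustration: the token-overlap `jaccard` similarity, the `marginal_diversity` proxy, the `reward_margin` threshold, and the function names are assumptions, not the REDIPO pipeline's actual metrics or code.

```python
from itertools import combinations


def jaccard(a, b):
    """Token-set overlap between two responses (hypothetical similarity proxy)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def marginal_diversity(i, texts):
    """1 minus the highest similarity to any other candidate (needs >= 2 texts)."""
    return 1.0 - max(jaccard(texts[i], t) for j, t in enumerate(texts) if j != i)


def build_pairs(candidates, reward_margin=0.05):
    """candidates: list of (text, reward). Returns (chosen, rejected) text pairs."""
    texts = [text for text, _ in candidates]
    diversity = [marginal_diversity(i, texts) for i in range(len(texts))]
    pairs = []
    for i, j in combinations(range(len(candidates)), 2):
        # Quality-bounded pairing: only compare candidates with similar reward.
        if abs(candidates[i][1] - candidates[j][1]) > reward_margin:
            continue
        if diversity[i] == diversity[j]:
            continue
        chosen, rejected = (i, j) if diversity[i] > diversity[j] else (j, i)
        pairs.append((texts[chosen], texts[rejected]))
    return pairs


candidates = [
    ("a red apple", 0.80),
    ("a red apple pie", 0.82),
    ("green tea from japan", 0.81),
    ("bad answer", 0.20),
]
for chosen, rejected in build_pairs(candidates):
    print(f"chosen={chosen!r}  rejected={rejected!r}")
```

Bounding the reward gap keeps the preference signal about diversity rather than quality, which is one way to read the abstract's finding that quality-bounded pairing helps maintain alignment.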

2605.29928 2026-06-04 cs.HC cs.AI

Label Over Logic? How Source Cues Bias Human Fallacy Judgments More Than LLMs

标签胜过逻辑？来源线索如何使人类的谬误判断比LLM更易产生偏差

Mahjabin Nahar, Nafis Irtiza Tripto, Aiping Xiong, Ting-Hao 'Kenneth' Huang, Dongwon Lee

发表机构 * The Pennsylvania State University（宾夕法尼亚州立大学）

AI总结通过在线实验和LLM对比，发现人类在评估逻辑谬误时显著受到内容源标签（如人类、AI等）的影响，而LLM评估相对稳定，表明源标签偏差主要是人类的弱点。

AI中文摘要

随着AI生成和AI辅助内容充斥在线空间，附加在这些内容上的源标签可能会扭曲人类的推理判断，对审核、评估和决策产生下游影响。LLM是否也存在这种脆弱性，或者能提供更不受源影响的评估，仍然是一个悬而未决的问题，直接影响人机协作。我们使用逻辑谬误作为受控环境来隔离源标签对推理质量的影响，独立于领域知识。我们进行了一项在线研究（N=505），参与者被分配到不同的源条件（人类、AI、人类辅助AI、AI辅助人类或无披露），并评估包含逻辑谬误的评论，将其判断与LLM（GPT-5.2、Gemini 2.5 Flash、Claude Sonnet 4.5）在相同源条件下的评估进行比较。人类评估者显著更容易受到标记为人类或人类辅助AI的谬误的影响，并在这些条件下给予更高的信任和评估评分。LLM评估在不同源标签下相对稳定，但不同模型表现各异。无论是否存在谬误，人类和LLM在所有条件下的置信水平都同样高。我们的发现表明，推理评估中的源标签偏差主要是人类的弱点，并突显了在日益AI中介的环境中人类与LLM协作的潜力。

英文摘要

As AI-generated and AI-assisted content floods online spaces, source labels attached to such content can distort human reasoning judgments, with downstream consequences for moderation, evaluation, and decision-making. Whether LLMs share this vulnerability, or offer more source-agnostic evaluation, remains an open question with direct implications for human-AI collaboration. We examine this issue using logical fallacies as a controlled setting to isolate source-label effects on reasoning quality, independent of domain knowledge. We conduct an online study (N=505) where participants are assigned to a source condition (human, AI, human with AI assistance, AI with human assistance, or no disclosure) and evaluate comments containing logical fallacies, comparing their judgments with those of LLMs (GPT-5.2, Gemini 2.5 Flash, Claude Sonnet 4.5), which were evaluated across the same source conditions. Human evaluators were significantly more susceptible to fallacies labeled as written by a human or by a human with AI assistance and assigned higher trust and evaluation ratings in these conditions. LLM evaluations remained comparatively stable across source labels, though performance varied across models. Confidence levels were similarly high across conditions for both humans and LLMs, regardless of fallacy presence. Our findings indicate that source-label bias in reasoning evaluation is primarily a human vulnerability and highlight the potential of human-LLM collaboration in increasingly AI-mediated environments.

2605.29861 2026-06-04 cs.CL cs.AI

Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

迈向可验证的多模态深度研究：用于交错报告生成的多智能体框架

Chenghao Zhang, Guanting Dong, Yufan Liu, Tong Zhao, Xiaoxi Li, Zhicheng Dou

发表机构 * Gaoling School of Artificial Intelligence, Renmin University of China（中国人民大学高瓴人工智能学院）

AI总结提出多智能体框架Ptah，通过规划、研究和写作阶段生成交错文本与视觉证据的多模态报告，并引入验证器确保事实准确性和跨模态一致性。

Comments: In progress

AI中文摘要

大型语言模型（LLMs）已将自主智能体从深度搜索（检索简洁的事实答案）推进到深度研究（将分散的证据综合成长篇报告）。然而，由于缺乏确定性真实值的开放式合成以及需要将文本论证与视觉证据交错，可验证的多模态深度研究仍然具有挑战性。我们提出Ptah，一个用于交错报告生成的多智能体框架。Ptah通过规划、研究和写作阶段编排从用户查询到渲染网页报告的完整生命周期，其中专门智能体构建视觉感知计划、收集基于声明的证据、在视觉工作记忆中维护与源对齐的图像，并通过声明式多模态工具使用撰写报告。验证智能体作为框架的接受函数，在整个工作流中强制执行事实依据、引用保真度和跨模态一致性。我们进一步引入PtahEval，一个评估协议，通过图像级和呈现级评估增强现有基准。在深度研究基准上的实验表明，Ptah生成的面向人类的多模态报告比强基线更可靠、视觉信息更丰富且更实用。我们的代码发布于https://github.com/SnowNation101/Ptah

英文摘要

Large Language Models (LLMs) have advanced autonomous agents from deep search, which retrieves concise factual answers, to deep research, which synthesizes scattered evidence into long-form reports. However, verifiable multimodal deep research remains challenging due to open-ended synthesis without deterministic ground truth and the need to interleave textual arguments with visual evidence. We propose Ptah, a multi-agent harness for interleaved report generation. Ptah orchestrates the lifecycle from user query to rendered web report through planning, research, and writing stages, where specialized agents construct visual-aware plans, collect claim-grounded evidence, maintain source-aligned images in a Visual Working Memory, and compose reports through declarative multimodal tool use. A verifier agent serves as the harness's acceptance function, enforcing factual grounding, citation fidelity, and cross-modal consistency throughout the workflow. We further introduce PtahEval, an evaluation protocol that augments existing benchmarks with image-level and presentation-level assessments. Experiments on deep research benchmarks show that Ptah produces more reliable, visually informative, and usable human-facing multimodal reports than strong baselines. Our code is released at https://github.com/SnowNation101/Ptah

2605.29584 2026-06-04 cs.CL

GAPD: Gold-Action Policy Distillation for Agentic Reinforcement Learning in Knowledge Base Question Answering

GAPD：面向知识库问答中智能体强化学习的金动作策略蒸馏

Xin Sun, Jianan Xie, Zhongqi Chen, Qiang Liu, Shu Wu, Bowen Song, Weiqiang Wang, Zilei Wang, Liang Wang

发表机构 * University of Science and Technology of China（中国科学技术大学）； NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences（中国科学院自动化研究所）； ShanghaiTech University（上海科技大学）； Ant Group（蚂蚁集团）

AI总结提出GAPD框架，通过中间锚点匹配将金动作序列与在线策略对齐，为基于强化学习的知识库问答提供密集的令牌级指导，在多个基准上取得最优结果。

AI中文摘要

强化学习（RL）天然适用于智能体知识库问答（KBQA），其中模型必须发出可执行动作、观察知识库反馈并最终返回答案。然而，当前基于RL的KBQA系统主要优化来自最终答案的稀疏奖励，导致中间动作错误监督不足。这对于逻辑形式标注的KBQA基准尤其受限：金逻辑形式可转换为可执行动作序列，但现有流水线主要将其用于热启动数据构建，而非用于在线策略RL更新。我们提出GAPD，一种训练时的金动作策略蒸馏框架，为基于结果的RL添加密集的令牌级指导。为了将金动作与在线学生策略对齐，GAPD使用中间锚点匹配：它将学生探索和金执行期间达到的中间实体视为状态锚点，并通过这些探索的实体集将学生状态与金状态匹配。基于对齐后的金动作的当前策略作为停止梯度的教师，其令牌分布被蒸馏回普通学生策略的生成动作令牌跨度上。GAPD在WebQSP、GrailQA和GraphQ上持续超越当前最先进水平。

英文摘要

Reinforcement learning (RL) is a natural fit for agentic knowledge base question answering (KBQA), where a model must issue executable actions, observe knowledge-base feedback, and eventually return an answer. However, current RL-based KBQA systems mainly optimize sparse rewards from the final answer, leaving intermediate action errors weakly supervised. This is especially limiting for logical-form annotated KBQA benchmarks: gold logical forms can be converted into executable action sequences, but existing pipelines use them mainly for warm-start data construction rather than for on-policy RL updates. We propose GAPD, a training-time Gold-Action Policy Distillation framework that adds dense token-level guidance to outcome-based RL. To align gold actions with on-policy student rollouts, GAPD uses MID-ANCHOR MATCHING: it treats the intermediate entities reached during student exploration and gold execution as state anchors, and matches student states to gold states through these explored entity sets. The current policy conditioned on this aligned gold action serves as a stop-gradient teacher, whose token distribution is distilled back to the ordinary student policy over generated action-token spans. GAPD consistently surpasses the current state of the art on WebQSP, GrailQA, and GraphQ.

2511.05924 2026-06-04 cs.LG

DiScoFormer: Plug-In Density and Score Estimation with Transformers

DiScoFormer: 基于Transformer的即插即用密度与得分估计

Vasily Ilin, Peter Sushko, Ranjay Krishna

AI总结提出DiScoFormer，一种可一次训练、任意推理的等变Transformer，通过自注意力机制实现跨分布和样本规模的密度与得分估计，证明其泛化核密度估计并优于KDE。

Comments: Accepted in ICML 2026 (oral)

AI中文摘要

从样本中估计概率密度及其得分仍然是生成建模、贝叶斯推断和动力学理论中的核心问题。现有方法分为两类：经典核密度估计（KDE）可泛化到不同分布，但受维度灾难影响；现代神经得分模型精度高，但需为每个目标分布重新训练。我们提出DiScoFormer（密度与得分Transformer），一种“一次训练，任意推理”的等变Transformer，将独立同分布样本映射到密度值和得分向量，可泛化到不同分布和样本规模。理论上，我们证明自注意力可以恢复归一化KDE，从而建立其作为核方法函数泛化的地位；实验上，单个注意力头学习多尺度、类核的行为。该模型在密度估计上收敛更快、精度高于KDE，并为得分去偏KDE、Fisher信息计算和Fokker-Planck型偏微分方程提供高保真即插即用得分预言机。

英文摘要

Estimating probability density and its score from samples remains a core problem in generative modeling, Bayesian inference, and kinetic theory. Existing methods are bifurcated: classical kernel density estimators (KDE) generalize across distributions but suffer from the curse of dimensionality, while modern neural score models achieve high precision but require retraining for every target distribution. We introduce DiScoFormer (Density and Score Transformer), a "train-once, infer-anywhere" equivariant Transformer that maps i.i.d. samples to both density values and score vectors, generalizing across distributions and sample sizes. Analytically, we prove that self-attention can recover normalized KDE, establishing it as a functional generalization of kernel methods; empirically, individual attention heads learn multi-scale, kernel-like behaviors. The model converges faster and achieves higher precision than KDE for density estimation, and provides a high-fidelity plug-in score oracle for score-debiased KDE, Fisher information computation, and Fokker-Planck-type PDEs.
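
The link between softmax attention and a normalized Gaussian KDE can be checked numerically in one dimension. The sketch below is a standard identity, not the paper's proof or architecture: for bandwidth h, the KDE score equals a softmax-weighted average of the samples minus x, divided by h^2, and the softmax logits -(x - x_i)^2 / (2h^2) differ from a query-key product plus a per-key bias only by a term that is constant across keys. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kde_score_direct(x, samples, h):
    """d/dx log p(x) for a 1-D Gaussian KDE, computed from the definition."""
    kernels = [math.exp(-((x - xi) ** 2) / (2 * h * h)) for xi in samples]
    numer = sum(k * (xi - x) / (h * h) for k, xi in zip(kernels, samples))
    return numer / sum(kernels)


def kde_score_attention(x, samples, h):
    """The same quantity written as a single softmax-attention readout."""
    # logit_i = query * key_i + bias_i, with query = x / h^2, key_i = x_i,
    # bias_i = -x_i^2 / (2 h^2). The dropped -x^2 / (2 h^2) term is the same
    # for every key, so it cancels inside the softmax.
    logits = [(x / (h * h)) * xi - xi * xi / (2 * h * h) for xi in samples]
    weights = softmax(logits)
    attended = sum(w * xi for w, xi in zip(weights, samples))  # values = samples
    return (attended - x) / (h * h)


samples = [-1.0, 0.2, 0.5, 2.0]
print(kde_score_direct(0.3, samples, h=0.7))
print(kde_score_attention(0.3, samples, h=0.7))
```

The two functions agree up to floating-point error, which gives some intuition for why an attention layer can act as a kernel estimator with a learned, possibly multi-scale, bandwidth.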

2509.23694 2026-06-04 cs.AI cs.CL cs.CR

SafeSearch: Automated Red-Teaming of LLM-Based Search Agents

SafeSearch: 基于LLM的搜索代理的自动化红队测试

Jianshuo Dong, Sheng Guo, Hao Wang, Xun Chen, Zhuotao Liu, Tianwei Zhang, Ke Xu, Minlie Huang, Han Qiu

发表机构 * University of Science and Technology of China（中国科学技术大学）

AI总结提出SafeSearch自动化红队框架，系统评估基于LLM的搜索代理在五个风险类别中的安全性，发现GPT-4.1-mini在搜索工作流中攻击成功率高达90.5%，且常见防御措施效果有限。

Comments: Accepted by ICML 2026

AI中文摘要

搜索代理将LLM连接到互联网，使其能够访问更广泛和更新的信息。然而，这也引入了一个新的威胁面：不可靠的搜索结果可能误导代理产生不安全的输出。现实世界的事件和我们的两个野外观察表明，此类失败在实践中可能发生。为了系统地研究这一威胁，我们提出了SafeSearch，一个可扩展、成本效益高且轻量级的自动化红队框架，支持搜索代理的沙盒安全评估。利用该框架，我们生成了涵盖五个风险类别（例如，错误信息和提示注入）的300个测试用例，并评估了三个搜索代理框架在17个代表性LLM上的表现。我们的结果揭示了基于LLM的搜索代理存在重大漏洞，在搜索工作流设置中，GPT-4.1-mini的最高攻击成功率（ASR）达到90.5%。此外，我们发现常见的防御措施（如提醒提示）提供的保护有限。总体而言，SafeSearch提供了一种实用的方法来衡量和提高基于LLM的搜索代理的安全性。

英文摘要

Search agents connect LLMs to the Internet, enabling them to access broader and more up-to-date information. However, this also introduces a new threat surface: unreliable search results can mislead agents into producing unsafe outputs. Real-world incidents and our two in-the-wild observations show that such failures can occur in practice. To study this threat systematically, we propose SafeSearch, an automated red-teaming framework that is scalable, cost-efficient, and lightweight, enabling sandboxed safety evaluation of search agents. Using this, we generate 300 test cases spanning five risk categories (e.g., misinformation and prompt injection) and evaluate three search agent scaffolds across 17 representative LLMs. Our results reveal substantial vulnerabilities in LLM-based search agents, with the highest ASR reaching 90.5% for GPT-4.1-mini in a search-workflow setting. Moreover, we find that common defenses, such as reminder prompting, offer limited protection. Overall, SafeSearch provides a practical way to measure and improve the safety of LLM-based search agents.

2504.12988 2026-06-04 cs.LG stat.ML

Why Ask One When You Can Ask $k$? Learning-to-Defer to the Top-$k$ Experts

为何只问一个专家？学习将任务推迟到Top-$k$专家

Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

发表机构 * School of Computing, National University of Singapore（新加坡国立大学计算机学院）； Fédération ENAC, ISAE-SUPAERO, ONERA, Université de Toulouse（ENAC联合会、ISAE-SUPAERO、ONERA、图卢兹大学）； Agency for Science, Technology and Research, Institute for Infocomm Research（科技研究局、信息通信研究所）

AI总结提出Top-$k$学习推迟框架，通过将查询分配给最优的$k$个专家，实现多专家协作，并开发了与$k$无关的替代损失函数，在准确性和成本之间取得更优权衡。

AI中文摘要

现有的学习推迟（L2D）框架仅限于单专家推迟，迫使每个查询仅依赖一个专家，无法利用集体专业知识。我们首次提出了Top-$k$学习推迟框架，将查询分配给成本效益最高的$k$个实体。我们的公式统一并严格推广了先前的方法，包括单阶段和两阶段机制、选择性预测以及经典级联。特别地，它将通常的Top-1推迟规则作为特例，同时当$k>1$时能够与多个专家进行原则性协作。我们进一步提出了Top-$k(x)$学习推迟，这是一种自适应变体，根据输入难度、专家质量和咨询成本学习每个查询的最佳专家数量。为了实现实际学习，我们开发了一种新颖的替代损失函数，该函数在单阶段设置中是贝叶斯一致且$\mathcal{H}_h$一致的，在两阶段设置中是$(\mathcal{H}_r,\mathcal{H}_g)$一致的。关键是，该替代损失与$k$无关，允许一次性学习单个策略并灵活地部署到不同的$k$值。在两个机制上的实验表明，Top-$k$和Top-$k(x)$在准确性和成本之间实现了更优的权衡，为L2D中的多专家推迟开辟了新方向。

英文摘要

Existing Learning-to-Defer (L2D) frameworks are limited to single-expert deferral, forcing each query to rely on only one expert and preventing the use of collective expertise. We introduce the first framework for Top-$k$ Learning-to-Defer, which allocates queries to the $k$ most cost-effective entities. Our formulation unifies and strictly generalizes prior approaches, including the one-stage and two-stage regimes, selective prediction, and classical cascades. In particular, it recovers the usual Top-1 deferral rule as a special case while enabling principled collaboration with multiple experts when $k>1$. We further propose Top-$k(x)$ Learning-to-Defer, an adaptive variant that learns the optimal number of experts per query based on input difficulty, expert quality, and consultation cost. To enable practical learning, we develop a novel surrogate loss that is Bayes-consistent, $\mathcal{H}_h$-consistent in the one-stage setting, and $(\mathcal{H}_r,\mathcal{H}_g)$-consistent in the two-stage setting. Crucially, this surrogate is independent of $k$, allowing a single policy to be learned once and deployed flexibly across $k$. Experiments across both regimes show that Top-$k$ and Top-$k(x)$ deliver superior accuracy-cost trade-offs, opening a new direction for multi-expert deferral in L2D.
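
The deployment-time consequence of a surrogate that is independent of $k$ can be illustrated with a small Python sketch: one set of learned per-entity scores is reused for any $k$. The scores, entity names, and the function `top_k_defer` are hypothetical, and the sketch covers only the selection step, not the surrogate loss or the adaptive Top-$k(x)$ rule.

```python
import heapq


def top_k_defer(scores, k):
    """Return the k entities with the highest learned scores, best first.

    scores: {entity_name: score}, where a higher score is assumed to mean
    more cost-effective for the current query.
    """
    return heapq.nlargest(k, scores, key=scores.get)


# Hypothetical scores produced once by a single learned policy for one query.
scores = {"model": 0.40, "expert_a": 0.90, "expert_b": 0.70, "expert_c": 0.10}

print(top_k_defer(scores, k=1))  # the usual Top-1 deferral rule
print(top_k_defer(scores, k=2))  # consult two entities with the same scores
```

Setting `k=1` reduces to ordinary single-expert deferral, while larger `k` widens the consulted set without retraining, which is the flexibility the abstract attributes to the $k$-independent surrogate.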

2410.15761 2026-06-04 cs.CL cs.LG stat.ML

Optimal Query Allocation in Extractive QA with LLMs: A Learning-to-Defer Framework with Theoretical Guarantees

基于LLM的抽取式问答中的最优查询分配：一个具有理论保证的学习-推迟框架

Yannis Montreuil, Shu Heng Yeo, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

发表机构 * School of Computing, National University of Singapore（新加坡国立大学计算机学院）； Fédération ENAC ISAE-SUPAERO ONERA, Université de Toulouse, France（法国图卢兹大学ENAC ISAE-SUPAERO ONERA联合体）； Institute for Infocomm Research (A*STAR), Singapore（新加坡信息与通信研究院（A*STAR））； IPAL, IRL 2955, Singapore（新加坡IPAL实验室）

AI总结提出一个学习-推迟框架，通过将查询分配给专门专家，在保证高置信度预测的同时优化计算效率，并在SQuADv1、SQuADv2和TriviaQA上验证了其提高答案可靠性和降低计算开销的效果。

2605.29280 2026-06-04 cs.LG cs.AI cs.IR

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

LoopFM：从基础模型的历史表示中学习用于推荐

Shali Jiang, Hua Zheng, Boyang Liu, Laming Chen, Kenny Lov, Chuanqi Xu, Lisang Ding, Qinghai Zhou, Can Cui, Xiaolong Liu, Xiaoyi Liu, Yasmine Badr, Xin Xu, Jiyan Yang, Ellie Dingqiao Wen, Gerard Jonathan Mugisha Akkerhuis, Chenxiao Guan, Rong Jin, Ruichao Qiu, Xian Chen, Shifu Xu, Zhehui Zhou, Ping Chen, Rui Yang, Haicheng Chen, Xiangge Meng, Song Zhou, Dharak Kharod, Shuyu Xu, Qiang Jin, Qiao Yang, Wankun Zhu, Qin Huang, Yuzhen Huang, Darren Liu, Parish Aggarwal, Hui Zhou, Erzhuo Wang, Shuo Chang, Xiaorui Gan, Wenlin Chen, Santanu Kolay, Huayu Li

发表机构 * Meta

AI总结针对知识蒸馏中传递标量导致转移率下降的问题，提出LoopFM框架，通过将基础模型的中间嵌入作为输入特征传递给下游垂直模型，实现高带宽知识转移，并在理论和实验中证明其有效性。

Comments: Shali Jiang, Hua Zheng, Boyang Liu contributed equally to this work

AI中文摘要

知识蒸馏（KD）将大型基础模型（FM）的单个标量预测传递给紧凑的垂直模型（VM），但由于单个标量无法传达较大FM学习的丰富中间知识，导致转移率（VM捕获的FM改进比例）下降。为了解决这一瓶颈，我们提出了LoopFM（从FM的历史表示中学习），该框架通过将FM中间嵌入结构化为下游VM的输入特征（例如，用户历史序列）来打开高带宽传输通道，无需在服务时进行实时FM推理，也无需FM和VM之间的架构耦合。我们为LoopFM提供了理论框架，包括增益分解和转移率分析。在三个公开基准上，LoopFM展示了强大的AUC改进（例如，在淘宝广告上提高6%以上）以及与KD互补的知识转移能力。在工业规模系统（数十亿样本、万亿参数FM）上，LoopFM在KD基础上将知识转移率大约翻倍，在Y1H1中实现了+0.5%的转化改进，在Y1H2中分别从两次单独发布实现了+1.03%和+1.22%的转化改进。

英文摘要

Knowledge distillation (KD) transfers a single scalar prediction from a large foundation model (FM) to compact vertical models (VMs) and suffers from a diminishing transfer ratio (the fraction of FM improvement captured by the VM), as a single scalar cannot convey the rich intermediate knowledge that larger FMs learn. To address this bottleneck, we propose LoopFM (Learning frOm HistOrical RePresentations of FM), a framework that opens a high-bandwidth transfer channel by structuring FM intermediate embeddings as input features (e.g., user history sequence) for downstream VMs, without requiring real-time FM inference at serving time or architectural coupling between FM and VM. We provide a theoretical framework for LoopFM with a gain decomposition and transfer-ratio analysis. On three public benchmarks, LoopFM demonstrates strong AUC improvements (e.g., 6%+ on TaobaoAd) and complementary knowledge transfer capability with KD. On industrial-scale systems (billions of examples, trillion-parameter FMs), LoopFM approximately doubles the knowledge transfer ratio on top of KD, delivering a +0.5% conversion improvement in the first half after its initial launch, and +1.03% and +1.22% conversion improvement from two individual launches in the subsequent half.

2605.29076 2026-06-04 cs.CL cs.AI cs.LG

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

结构化提示优化结合强化学习实现复杂文本的全局与局部可解释性

Tianyang Zhou, Wenbo Chen, Pierre Jinghong Liang, Leman Akoglu

发表机构 * Carnegie Mellon University（卡内基梅隆大学）； Amazon（亚马逊）

AI总结提出eXTC框架，通过结构化提示优化、基于SOP的推理蒸馏和强化学习扩展，在分类性能和解释质量上显著优于现有范式。

AI中文摘要

LLMs在文本分类上取得了进展，但现有范式面临权衡：监督（仅标签）微调可扩展，但对复杂文本推理有限且缺乏模型透明度；离散提示优化提供可读指令，但性能和可扩展性不佳。我们引入eXTC（可解释文本分类器），包含三个渐进阶段：（1）通过新的结构化提示优化算法学习自然语言的标准操作程序（SOP或规则手册）；（2）从大型教师LLM到紧凑LM的基于SOP的推理蒸馏；（3）通过强化学习扩展超出初始SOP的推理能力。该设计使eXTC能够（i）通过紧凑LM实现快速推理，（ii）提供推理时的局部推理轨迹，以及其学习领域规则的全局模块化解释，同时（iii）在分类性能和解释质量上显著优于现有范式，并逐步提升。

英文摘要

LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete prompt optimization offers human-readable instructions but struggles with performance and scalability. We introduce eXTC (eXplainable Text Classifier) with three progressive stages: (1) learning a Standard Operating Procedure (SOP, or rulebook) in natural language via a new Structured Prompt Optimization algorithm; (2) SOP-grounded reasoning distillation from a large teacher LLM into a compact LM; and (3) expanding reasoning capabilities beyond the initial SOP via reinforcement learning. This design enables eXTC to provide (i) fast inference via a compact LM, with (ii) inference-time local reasoning traces, alongside a global, modular explanation of its learned domain rules, while (iii) significantly outperforming existing paradigms across diverse benchmarks in both classification performance and explanation quality, with stage-by-stage gains.

2605.28829 2026-06-04 cs.CL cs.AI cs.CY

Aryabhata 2: Scaling Reinforcement Learning for Advanced STEM Reasoning

Aryabhata 2：扩展强化学习以提升高级STEM推理能力

Ritvik Rastogi, Vishal Singh, Tejas Chaudhari, Sandeep Varma

发表机构 * PhysicsWallah

AI总结本文提出Aryabhata 2，一个通过强化学习后训练在竞争性STEM考试中提升推理能力的语言模型，在JEE、NEET等基准上超越基础模型且输出token减少高达64%。

AI中文摘要

竞争性STEM考试（如JEE和NEET）需要多步符号推理、精确数值计算以及物理、化学和数学的深层概念理解。近期的大语言模型在常见推理基准上表现强劲，但仍难以大规模部署，因为数百万学生的疑问需要领域特定且结构一致的问题求解。我们提出了Aryabhata 2，一个专注于竞争性STEM考试推理的语言模型，通过强化学习后训练进行优化。利用PhysicsWallah的内部题库，我们构建了高质量的训练课程，并通过可验证奖励的强化学习对GPT-OSS-20B进行后训练。训练结合了延长强化学习与通过逐步增大的rollout组大小拓宽探索。我们在竞争性考试基准（包括JEE Main、JEE Advanced和NEET）以及分布外推理数据集（如AIME、HMMT、MMLU-Pro、MMLU-Redux 2.0和GPQA）上评估了Aryabhata 2。结果表明，Aryabhata 2在竞争性STEM推理上优于其基础模型GPT-OSS-20B，同时所需输出token大幅减少（最多减少64%）。

英文摘要

Competitive STEM examinations such as JEE and NEET require multi-step symbolic reasoning, precise numerical computation, and deep conceptual understanding across physics, chemistry, and mathematics. Recent large language models perform strongly on common reasoning benchmarks, yet they remain difficult to deploy at scale, where millions of student doubts demand domain-specific, consistently structured problem solving. We introduce Aryabhata 2, a reasoning-focused language model for competitive STEM examinations, trained via reinforcement-learning post-training. Using PhysicsWallah's internal question banks, we construct a high-quality training curriculum and post-train GPT-OSS-20B through reinforcement learning with verifiable rewards. Training combines prolonged reinforcement learning with broadened exploration via progressively larger rollout group sizes. We evaluate Aryabhata 2 on competitive examination benchmarks, including JEE Main, JEE Advanced, and NEET, as well as out-of-distribution reasoning datasets such as AIME, HMMT, MMLU-Pro, MMLU-Redux 2.0, and GPQA. Results show that Aryabhata 2 outperforms its base model GPT-OSS-20B on competitive STEM reasoning while requiring substantially fewer output tokens (up to 64% fewer).

2605.25402 2026-06-04 cs.CV cs.AI

Anatomy-Anchored Self-Supervision: Distilling Vision Foundation Models for Invariant Ultrasound Representation

Chunzheng Zhu, Yijun Wang, Jianxin Lin, Feng Wang, Hongwei Wang, Lei Zhao, Shengli Li, Kenli Li

Affiliations: Hunan University; Shenzhen Maternity and Child Healthcare Hospital

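AI summary: Proposes ANAUS, an anatomy-anchored self-supervision framework for ultrasound. A learnable latent prompt engine and domain adaptation enable annotation-free anatomy segmentation, and a dual-policy self-supervised learning scheme (semantics-aware anatomy-separating alignment and contextual core-region prediction) strengthens representation learning. The method outperforms existing approaches on six public datasets.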

Comments: MICCAI 2026 Accepted Paper; Anatomy-Anchored Ultrasound Self-Supervision

Abstract

The self-supervised pre-training paradigm has gained increasing prominence for learning transferable representations in medical imaging, yet existing methods for ultrasound (US) images operate at the image or frame level, overlooking the anatomical context needed for clinically aligned representation learning. In this work, we propose an anatomy-anchored ultrasound self-supervision framework, ANAUS, that shifts representation learning from generic visual regions to clinically meaningful anatomical structures. Utilizing a learnable latent prompt engine alongside a one-time domain adaptation on existing public image-mask pairs, we empower the LP-SAM module to achieve annotation-free anatomy delineation at scale. Building upon this anatomical grounding, we propose a dual-policy self-supervised learning paradigm consisting of inter-view semantics-aware anatomy-separating alignment and contextual core-region prediction to enhance representation learning. Specifically, the former enforces feature invariance within identical anatomical regions while promoting discriminability across distinct structures; the latter compels the model to reconstruct corrupted regions, thereby capturing fine-grained structural details. Extensive evaluations on six public datasets demonstrate that ANAUS consistently outperforms current state-of-the-art methods while maintaining the computational efficiency essential for clinical deployment. Code is available at https://github.com/zhcz328/ANAUS.
