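arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.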

2605.20272 2026-05-21 cs.LG cs.AI

Smaller Abstract State Spaces Enable Cross-Scale Generalization in Reinforcement Learning

Nasehatul Mustakim, Lucas Lehnert

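AI summary: This paper presents a theoretical model that extends the state abstraction framework to POMDPs and defines a successor-weighted model reduction, showing how reinforcement learning agents can generalize across scales and how the size of the abstract state space affects that generalization.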

Abstract

While humans readily generalize abstract concepts to more complex or larger tasks, building Reinforcement Learning (RL) systems with this ability remains elusive. Here, we present the first theoretical model of how such Out-of-Distribution (OOD) generalization can be achieved in RL agents. Our approach considers Partially Observable Markov Decision Processes (POMDPs) and assumes that an intelligent agent uses an abstraction function to determine which experiences can be treated as equivalent and which must be distinguished. First, we extend the existing state abstraction framework and proof techniques to POMDPs. Then, we define a successor-weighted model reduction, a model reduction variant that enables compression into smaller abstract spaces than prior definitions allow. We derive a bound on the agent's OOD test performance, thereby defining the conditions under which OOD generalization is achievable. This bound decomposes an agent's performance loss into approximation and estimation errors, revealing how reducing an agent's abstract state space size improves test performance and OOD generalization. Our analysis suggests that constraining an agent to operate over a small, finite set of abstract states is necessary for achieving generalization to more complex tasks. Our results motivate further research into learning RL architectures that scale across tasks of varying complexity levels.

2605.20270 2026-05-21 cs.LG cs.AI stat.ML

Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs

Hamed Khosravi, Xiaoming Huo

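AI summary: This work proposes Conformal Selective Acting (CSA), a per-round wrapper that provides anytime-valid selective-risk control when deploying RLVR-trained LLMs. It fills an empty cell forced by deployment requirements by maintaining a Ville-type e-process per threshold on a Bonferroni grid to preserve pathwise validity, and is validated across multiple benchmarks.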

Abstract

A local specialist LLM, fine-tuned with reinforcement learning from verifiable rewards (RLVR) on operator-local data, is installed in a regulated organization with per-deployment error budget $\alpha$. The operator needs a safety certificate for this deployment's stream at every round: no pooling across deployments, no waiting for a long-run average. Existing wrappers cannot deliver this on adaptive, online-updated streams: offline conformal-risk methods require exchangeability; online-conformal methods bound only long-run averages; non-exchangeable extensions are marginally valid; and the closest anytime wrapper, A-RCPS, controls marginal rather than selective risk. Using a (test statistic, validity guarantee, deployment rule) framework, we identify one empty cell forced by deployment requirements: e-process per threshold, selective risk, anytime-pathwise validity, max-certified-threshold rule. Conformal Selective Acting (CSA) fills it as a per-round wrapper maintaining a Ville-type e-process per threshold on a Bonferroni grid, evaluated against the RLVR filtration. Under predictable updates and isotonic-calibrated monotone risk we prove (i) an anytime-pathwise selective-risk bound $R_T^{\mathrm{act}} \le \alpha + O(N_T^{-1/2})$, (ii) rate-optimal certification matching $\Theta(\bar{\eta}^{-2}\log(1/\delta))$, and (iii) a horizon-independent release-rate gap. Across eight specialist benchmarks ($480$ streams), sixteen adversarial distribution-shift cells ($160$ streams), and five live Expert-Iteration RLVR cells with online LoRA over four base models in three architecture families ($10{,}300$ rounds), CSA is the only method among ten compared that satisfies pathwise validity and non-refusing deployment on every cell. We do not propose a new LLM, training algorithm, or policy class; CSA is the deployment-side complement, orthogonal to the model, for operators who cannot use a frontier API.
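
As a rough illustration of the "e-process per threshold on a Bonferroni grid" idea, the Python sketch below runs a generic betting-style test supermartingale for each threshold and certifies a threshold once its wealth crosses K/delta (Ville's inequality, with a Bonferroni split over K thresholds). It is a textbook construction for losses assumed to lie in [0, 1], not the CSA procedure itself; the function names, the fixed bet size `lam`, and the (confidence, loss) stream format are hypothetical.

```python
def update_wealth(wealth, loss, alpha, lam):
    """One betting step against H0: E[loss | past] >= alpha, with loss in [0, 1].

    Assumes 0 <= lam <= 1 / (1 - alpha) so the factor stays nonnegative.
    """
    return wealth * (1.0 + lam * (alpha - loss))


def certified_thresholds(stream, thresholds, alpha, delta, lam=0.5):
    """Return the thresholds whose wealth process has crossed the certification bar.

    stream: iterable of (confidence, loss) pairs, one per round (hypothetical format).
    """
    wealth = {tau: 1.0 for tau in thresholds}
    certified = set()
    bar = len(thresholds) / delta  # Bonferroni: each threshold is tested at level delta / K
    for confidence, loss in stream:
        for tau in thresholds:
            if confidence >= tau:  # only rounds where the model would act at tau count
                wealth[tau] = update_wealth(wealth[tau], loss, alpha, lam)
                if wealth[tau] >= bar:
                    certified.add(tau)
    return certified
```

Under the null hypothesis that the selective risk at a threshold is at least alpha, each wealth process is a nonnegative supermartingale, so the chance that it ever reaches K/delta is at most delta/K; this is what makes the certificate valid at every round rather than only on average.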

2605.20269 2026-05-21 cs.LG cs.AI stat.ML

Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity

Hamed Khosravi, Xiaoming Huo

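AI summary: This paper studies low-rank linear contextual bandits whose subspace drifts and proposes a new algorithm, SPSC, which tracks the changing subspace while attaining a dynamic regret rate governed by the intrinsic rank.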

Abstract

Many bandit deployments (recommendation, clinical dosing, ad targeting) share two facts prior work handles only in isolation: rewards live on a low-dimensional latent subspace, and that subspace drifts. Stationary low-rank bandits exploit rank but break under subspace change; non-stationary linear bandits adapt to drift but pay ambient rate $\widetilde{O}(d\sqrt{T})$. We study piecewise-stationary low-rank linear contextual bandits with scalar feedback: $θ_t = B_k^\star w_t$ with rank-$r$ factor $B_k^\star\in\mathbb{R}^{d\times r}$ constant within each of $K$ unknown segments and able to shift at boundaries. Our results are tight along three axes. (i) Identification boundary. With single-play scalar rewards, the moving subspace is recoverable through quadratic functionals of rewards iff three probe-side conditions hold: known noise variance, bounded state-noise coupling, and full-dimensional probe support. Each is necessary in the unrestricted-second-moment problem, and jointly they are sufficient, characterizing the boundary of the solvable region. (ii) Algorithm and dynamic regret. SPSC interleaves isotropic probes with windowed projected ridge-UCB exploitation inside the learned $r$-dimensional subspace; a CUSUM-style variant discovers segment boundaries online. The costed dynamic regret is $\widetilde{O}(r\sqrt{T})+\widetilde{O}(T^{2/3})+O(W\,V_{\mathrm{in}})$, replacing the ambient $d\sqrt{T}$ rate with the intrinsic rank. (iii) Empirics. On eleven benchmarks spanning synthetic, UCI/MovieLens, semi-synthetic clinical, and ZOZOTOWN production-log data, SPSC outperforms non-stationary and low-rank baselines whenever $d-r\gtrsim T^{1/6}$, matching the analytical crossover. To our knowledge, this is the first work to characterize the identification boundary and attain the intrinsic-rank dynamic-regret rate in this setting.
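
To make the "projected ridge-UCB inside the learned $r$-dimensional subspace" step concrete, here is a minimal NumPy sketch. It assumes the subspace basis `B` (a d x r matrix with orthonormal columns) is already known, whereas SPSC has to learn and track it; the function name, the regularizer `reg`, and the bonus scale `beta` are hypothetical choices, not taken from the paper.

```python
import numpy as np


def projected_ridge_ucb(X, y, B, x_new, reg=1.0, beta=1.0):
    """Ridge-UCB score for x_new, computed in the subspace spanned by B's columns.

    X: (n, d) past contexts, y: (n,) past rewards, B: (d, r) basis, x_new: (d,).
    """
    Z = X @ B                                   # project contexts: (n, d) -> (n, r)
    V = Z.T @ Z + reg * np.eye(B.shape[1])      # r x r regularized Gram matrix
    w_hat = np.linalg.solve(V, Z.T @ y)         # ridge estimate of the r-dim weights
    z = B.T @ x_new                             # projected query context
    bonus = beta * np.sqrt(z @ np.linalg.solve(V, z))  # exploration width
    return z @ w_hat + bonus, w_hat
```

Because estimation happens in r dimensions instead of d, the confidence width scales with the rank rather than the ambient dimension, which is the intuition behind replacing the $d\sqrt{T}$ rate with an $r\sqrt{T}$ one.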

2605.20268 2026-05-21 cs.LG cs.AI cs.CL

Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding

Paul Quinlan, Jeremy Levasseur, Qingguo Li, Xiaodan Zhu

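AI summary: This paper presents Chronicle, a multimodal foundation model trained jointly on language and time series, in which a unified architecture shares parameters across both modalities and performs strongly on multiple tasks.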

Abstract

Real-world time series come with text: metadata, descriptions, news, reports. Yet time series foundation models process numerical sequences in isolation, and the multimodal text-and-time-series models that attempt to bridge the two all adapt a pretrained language model post hoc, inheriting representations shaped without ever seeing temporal data. These models are also evaluated almost exclusively against other multimodal baselines, not against the strongest unimodal foundation models in either domain, leaving open whether joint training is needed at all. We present Chronicle, a compact 324M-parameter decoder-only transformer trained from scratch on natural language and time series within a single unified architecture. Both modalities share the same transformer blocks, attention mechanism, and residual stream; the bulk of pretraining uses unimodal batches so cross-modal capability emerges purely from shared parameters, with a short alignment stage that interleaves the two. To our knowledge, Chronicle is the first model jointly pretrained on text and time series from scratch, and the first multimodal model evaluated against dedicated foundation models in both domains. It matches Gemma-3-270M-PT on 19 NLU tasks, sets a new bar for frozen-embedding time series classification on 24 UCR/UEA datasets, and produces multimodal forecasts on Time-MMD that beat every supervised fusion baseline, all from a single backbone.

2605.20267 2026-05-21 cs.CV cs.AI

Generation of Heterogeneous PET Images from Uniform Organ Activity Maps Using a Pretrained Domain-Adapted Diffusion Model

Suya Li, Kaushik Dutta, Debojyoti Pal, Jingqin Luo, Kooresh I. Shoghi

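AI summary: This paper proposes a pretrained domain-adapted diffusion model that generates clinically relevant heterogeneous PET images from uniform organ activity maps using a two-phase training strategy; the synthesized images reach high quantitative accuracy and yield comparable tumor segmentation performance.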

Comments 18 pages, 7 figures

Abstract

Synthetic PET images are valuable for quantitative imaging workflow development, scalable virtual imaging trials, and deep learning model training, but conventional physics-based simulation approaches are computationally intensive, limited in anatomical variability, and often fail to capture heterogeneous PET uptake. This study developed a pretrained domain-adapted diffusion (PAD) model for anatomy-conditioned PET synthesis from uniform organ activity maps. PAD adopts a natural-image pretrained text-to-image decoder with an upstream conditioning encoder and a downstream PET-domain adapter. A two-phase training strategy was used, with the first phase learning coarse uptake distributions and the second refining local image details. Uniform organ activity maps were generated from CT-based segmentations by assigning each organ its mean uptake from the paired PET image. Evaluation included quantitative accuracy, noise assessment, radiomic analysis, tumor segmentation performance, and a human observer study. PAD-generated images achieved high quantitative accuracy, with concordance correlation coefficients above 0.92 between organ mean SUVs and assigned activity values. The synthesized images showed noise levels and texture characteristics similar to target PET images and produced comparable tumor segmentation performance. In a two-alternative forced-choice observer study, four readers achieved approximately 50% accuracy, indicating visual indistinguishability between synthesized and target images. PAD also generated realistic PET images from XCAT-derived activity maps, demonstrating compatibility with phantom-based anatomical priors. Overall, PAD provides a diffusion-based framework for generating clinically relevant heterogeneous PET images from uniform organ activity maps derived from clinical segmentations or digital phantoms, supporting data augmentation and downstream imaging studies.
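
The concordance correlation coefficient quoted above can be computed as in the following sketch. The abstract does not say which estimator was used, so this assumes Lin's standard definition with population (divide-by-n) moments; the function name is illustrative.

```python
import numpy as np


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two paired samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    covariance = ((x - mean_x) * (y - mean_y)).mean()
    # Unlike Pearson correlation, the denominator penalizes a shift in means.
    return 2.0 * covariance / (x.var() + y.var() + (mean_x - mean_y) ** 2)
```

For example, `concordance_correlation([1, 2, 3], [2, 3, 4])` evaluates to 4/7 even though the two series are perfectly linearly correlated, because a constant offset between synthesized and assigned values counts against agreement.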

2605.20266 2026-05-21 cs.SD

A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook

Kaiwen Luo, Zhenhong Zhou, Leo Wang, Liang Lin, Yang Xiao, Tianyu Shao, Yuanhe Zhang, Yuxuan Li, Miao Yu, Kailin Lyu, Jiaming Zhang, Dongrui Liu, Li Sun, Yueming Wu, Kai Li, Ting Dang, Xiaojun Jia, Rohan Kumar Das, Xinfeng Li, Siyuan Liang, Qiufeng Wang, Xingjun Ma, Jing Chen, Kun Wang, Junhao Dong, Deqing Zou, Yu Cheng, Xia Hu, Zhigang Zeng, Sen Su, Yang Liu, Yu-Gang Jiang, Philip S. Yu, Yew-Soon Ong

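AI summary: This survey reviews the generalization, trustworthiness, and outlook of large audio language models, covering architectural innovations, alignment algorithms, and security risks, and proposes strategies such as Defense-in-Depth architectures and causal auditory world modeling to make audio intelligence more trustworthy.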

Abstract

The foundational capabilities established by Large Language Models (LLMs) have paved the way for Multimodal Large Language Models (MLLMs), within which Large Audio Language Models (LALMs) are essential for realizing universal auditory intelligence. Despite their remarkable performance, the escalation of LALMs' capabilities has significantly outpaced the development of systemic frameworks to ensure their trustworthiness. This survey provides a comprehensive investigation into the endogenous mechanisms of LALMs, detailing the architectural innovations and alignment algorithms that facilitate emergent reasoning. Specifically, we analyze how the transition to unified end-to-end frameworks and the integration of continuous acoustic signals inherently expand the attack surface. To rigorously evaluate the risks within these paradigms, we establish a comprehensive taxonomy of trustworthiness, categorizing critical vulnerabilities such as cross-modal jailbreaking, latent acoustic backdoors, and biometric privacy leakage. We review the state-of-the-art through six analytical pillars: hallucination, robustness, safety, privacy, fairness, and authentication. The profound imbalance between a mature offensive landscape and underdeveloped defenses further validates the critical trustworthiness gaps and multidimensional risks facing audio-centric intelligence. Finally, we propose a strategic roadmap advocating for "Defense-in-Depth" architectures, causal auditory world modeling, and intrinsic representation engineering to bridge the gap between empirical performance and intrinsically trustworthy audio intelligence. Our project has been uploaded to GitHub https://github.com/Kwwwww74/Awesome-Trustworthy-AudioLLMs.

2605.20264 2026-05-21 cs.RO cs.HC

Adaptive Human-Robot Collaboration for Masonry Construction Under Material and Assembly Uncertainty

Jutang Gao, Arash Adel

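AI summary: This paper presents an adaptive human-robot collaborative workflow for masonry construction that addresses tolerance accumulation caused by material and assembly uncertainty, combining projection guidance with laser-scanning feedback for precise collaboration.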

Comments Accepted for publication in Proceedings of the 43rd International Symposium on Automation and Robotics in Construction (ISARC 2026)

Abstract

Human-robot collaboration in construction is often challenged by limited robot-to-human communication and the need to adapt to tolerance accumulation arising from material and assembly uncertainties. We present an adaptive human-robot collaborative workflow for masonry construction that addresses communication limitations and tolerance accumulation, demonstrated through a brickwork case study in which a robot places bricks while a human applies adhesive. This workflow is enabled by two complementary mechanisms: 1) an end-effector-mounted projector that provides spatially registered, just-in-time projection guidance for manual adhesive application, and 2) laser scanning for feedback-driven grasping and placement pose correction. Together, these mechanisms enable adjustment of human and robotic actions in response to material variability and accumulated assembly tolerances. Full-scale experiments across conventional running-bond and nonstandard configurations demonstrate that projection guidance improves adhesive application consistency and reduces application time, while laser-based correction maintains level courses and avoids collision-prone failures associated with open-loop execution. These results indicate that integrating spatial projection with feedback-driven adaptation, enabled by material and as-built sensing, can mitigate tolerance accumulation and improve precision and robustness in human-robot collaborative construction.

2605.20262 2026-05-21 cs.LG cs.AI

Residual Paving: Diagnosing the Routing Bottleneck in Selective Refusal Editing

Bryce Hinkley, Peyman Najafirad

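AI summary: This paper frames selective refusal editing as a three-way control problem and introduces Residual Paving, which separates route selectivity (whether to intervene) from residual-edit capacity (what edit to apply), sharply reducing edit refusal while largely preserving behavior on benign and harmful prompts outside the edit set.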

Abstract

We study selective refusal editing as a three-way control problem: induce non-refusal on designated edit prompts while preserving benign behavior and harmful refusals outside the edit set. We introduce Residual Paving, a routed residual editing method for frozen instruction-tuned transformers that separates route selectivity, whether to intervene, from residual-edit capacity, what edit to apply. An early-layer router predicts a scalar gate and expert mixture; when active, prompt-conditioned bottleneck residual experts apply later-layer residual updates while leaving the backbone unchanged. This decomposition supports an oracle-routing diagnostic where only the learned scalar gate is replaced with the held-out edit/keep label, leaving the residual editor and frozen backbone fixed. On the primary Gemma-3-4B-IT held-out split, learned Residual Paving reduces edit refusal from 88.6% to 4.0%, with 95.5% benign distribution preservation and 87.3% harmful distribution preservation. Same-protocol one-direction steering controls are much weaker on edit success, leaving edit refusal at 86.8% for Edit-target ActAdd and 78.9% for DIM-style refusal steering. The remaining failure is off-target harmful-keep degradation: harmful refusal remains below the frozen-base rate, 65.3% vs. 81.6%. Across six backbones, oracle routing improves the keep-side diagnostic score on every reported row, with median gain +12.9 pp, supporting the interpretation that learned route selectivity is the main observed bottleneck. Trajectory diagnostics on two backbones further suggest directed movement toward edit-target continuations rather than generic refusal suppression.
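
The split between "whether to intervene" (the gate) and "what edit to apply" (the experts) can be pictured with the toy NumPy sketch below. It is only a schematic of a gated mixture of bottleneck residual experts: the shapes, the tanh nonlinearity, and the function name are assumptions, and the router that would produce `gate` and `mixture` is left out.

```python
import numpy as np


def routed_residual_update(h, gate, mixture, down, up):
    """Apply a gated mixture of bottleneck residual experts to a hidden state.

    h: (d,) hidden state; gate: scalar in [0, 1]; mixture: (k,) expert weights;
    down: (k, b, d) and up: (k, d, b) bottleneck projections (all hypothetical).
    """
    if gate == 0.0:
        return h  # keep path: the frozen backbone's activation passes through unchanged
    delta = np.zeros_like(h)
    for weight, D, U in zip(mixture, down, up):
        delta += weight * (U @ np.tanh(D @ h))  # expert maps d -> b -> d
    return h + gate * delta
```

In this picture, the oracle-routing diagnostic amounts to replacing the learned `gate` with the held-out edit/keep label while leaving `mixture`, `down`, and `up` untouched, which isolates routing errors from limits of the edit itself.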

2605.20258 2026-05-21 cs.LG cs.AI cs.CR

It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs

Sangwoo Park, Woongyeong Yeo, Seanie Lee, Yumin Choi, Hyomin Lee, Kangsan Kim, Jinheon Baek, Seong Joon Oh, Sung Ju Hwang

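AI summary: This paper proposes SELFCI, a complementary self-distillation framework that decouples information suppression from task resolution to address the privacy-utility trade-off in large language models and improve contextual integrity.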

Comments 28 pages, 16 figures

Abstract

Contextual Integrity (CI) defines privacy not merely as keeping information hidden, but as governing information flows according to the norms of a given context. As large language models are increasingly deployed as personal agents handling sensitive workflows, adhering to CI becomes critical. However, even frontier models remain unreliable in making disclosure decisions, and existing mitigation strategies often degrade underlying task performance. To overcome this privacy-utility trade-off, we propose SELFCI, a complementary self-distillation framework that decouples information suppression from task resolution. SELFCI jointly optimizes two independent reverse KL divergences over distinct teacher distributions derived from feedback: one encourages preserving task-relevant information for utility, while the other enforces minimal and appropriate disclosure. This complementary formulation induces a Product-of-Experts (PoE) target, aligning the policy with the intersection of capability and privacy requirements. Empirical evaluations demonstrate that SELFCI, without relying on costly external supervision, consistently outperforms competitive baselines such as online reinforcement learning algorithms (e.g., GRPO). These trends further extend to out-of-domain settings involving agentic workflows and accumulated private context, suggesting that SELFCI provides a practical path toward CI alignment.
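
One way to see why summing two reverse KL terms leads to a Product-of-Experts target is the short derivation below. It is written for two generic teacher distributions $p_1$ and $p_2$ with equal weights; the symbols are illustrative and need not match the paper's notation or weighting.

```latex
\begin{aligned}
\mathrm{KL}(\pi \,\|\, p_1) + \mathrm{KL}(\pi \,\|\, p_2)
  &= \sum_y \pi(y)\,\log\frac{\pi(y)^2}{p_1(y)\,p_2(y)} \\
  &= 2\sum_y \pi(y)\,\log\frac{\pi(y)}{\sqrt{p_1(y)\,p_2(y)}} \\
  &= 2\,\mathrm{KL}(\pi \,\|\, q) - 2\log Z,
  \qquad q(y) = \frac{\sqrt{p_1(y)\,p_2(y)}}{Z},
  \quad Z = \sum_y \sqrt{p_1(y)\,p_2(y)}.
\end{aligned}
```

Since $Z$ does not depend on $\pi$, the sum is minimized at $\pi = q \propto \sqrt{p_1 p_2}$. This normalized product places mass only where both teachers do, which matches the "intersection of capability and privacy requirements" reading in the abstract.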

2605.20257 2026-05-21 cs.LG cs.AI

Instance Discrimination for Link Prediction

Valentin Cuzin-Rambaud, Mathieu Lefort, Rémy Cazabet

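AI summary: This paper proposes two new models based on link representations, L-GRACE and L-BGRL, that improve link prediction performance, especially on unattributed graphs, and shows they are competitive in both supervised and self-supervised settings.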

Abstract

Recently, instance discrimination models have emerged as a major solution for self-supervised learning. Having already demonstrated its effectiveness in the image domain, instance discrimination learning is now proving equally convincing in the graph domain, in particular for node classification. However, fewer contributions have tackled the link prediction task. In this contribution, we propose to adapt existing methods to this context. We first provide a rigorous evaluation of existing self-supervised models in the field of link prediction, showing that the main performance depends on the augmentation process (like in computer vision). We then propose a new structural augmentation based on the community structure that is relevant for link prediction. Our main contribution introduces two new models, L-GRACE and L-BGRL, based on link representations instead of node representations, which improve the performance of the existing methods, especially on unattributed graphs, and we show that they perform on par with the state of the art, both in supervised and self-supervised contexts.

2605.20256 2026-05-21 cs.LG cs.AI

FBOS-RL: Feedback-Driven Bi-Objective Synergistic Reinforcement Learning

Xikai Zhang, Yongzhi Li, Likang Xiao, Yingze Zhang, Yanhua Cheng, Quan Chen, Peng Jiang, Wenjun Wu, Liu Liu

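AI summary: This paper proposes the FBOS-RL framework, which uses environment feedback to enhance exploration and designs two mutually reinforcing objectives, Exploitation-oriented Policy Alignment (EPA) and Exploration-oriented Capability Cultivation (ECC), to improve the training efficiency and final performance of reinforcement learning.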

Abstract

Reinforcement learning has become a cornerstone for aligning and unlocking the reasoning capabilities of large-scale models. At its core, the training loop of GRPO and its variants alternates between rollout sampling and policy update. Unlike supervised learning, where each gradient step is anchored to an explicit ground-truth target, the optimal gradient direction for updating model parameters in this setting is not known a priori; the high-quality rollouts drawn during the sampling stage therefore act as the implicit "teacher" that guides every parameter update. However, GRPO adopts a simple sampling scheme that conditions all rollouts on the same original prompt. When a task lies beyond the policy model's current capability, this sampling scheme rarely yields a high-quality rollout, leaving the policy model without a meaningful gradient direction when updating its parameters, which causes training to stall. To address this issue, we propose FBOS-RL, a Feedback-Driven Bi-Objective Synergistic reinforcement learning framework. Specifically, we let the model perform Feedback-Guided Exploration Enhancement based on the feedback provided by the environment, and on top of this we design two mutually reinforcing training objectives: Exploitation-oriented Policy Alignment (EPA) and Exploration-oriented Capability Cultivation (ECC). Extensive experiments demonstrate that EPA and ECC can mutually reinforce each other, forming a positive flywheel effect that significantly improves both the training efficiency and the final performance ceiling of reinforcement learning. Specifically, under an identical number of rollouts, FBOS-RL learns substantially faster than GRPO and feedback-based baselines and ultimately attains a higher performance ceiling, while exhibiting higher policy entropy and lower gradient norms throughout training.
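
The stall described above is easy to see in the group-relative advantage that GRPO-style methods commonly use: each rollout's reward is standardized against the other rollouts for the same prompt. The sketch below assumes that common normalization form; the `eps` constant and the function name are illustrative choices.

```python
import numpy as np


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of rollouts sampled for the same prompt."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)
```

If every rollout in the group fails (for instance rewards of `[0, 0, 0, 0]`), all advantages are exactly zero and the policy gradient carries no signal. A single success, as in `[0, 0, 0, 1]`, is enough to produce a positive advantage for that rollout and negative ones for the rest.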

2605.20250 2026-05-21 cs.LG physics.comp-ph physics.flu-dyn

Physics-informed convolutional neural networks for fluid flow through porous media

Rafał Topolnicki, Paweł Dłotko, Maciej Matyka

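AI summary: This paper proposes a convolutional neural network framework that predicts pore-scale velocity fields directly from sample geometry. Physical consistency terms (incompressibility, no flow inside solids, periodicity, and agreement with the global tortuosity index) improve prediction accuracy, and generalization is tested across different geometries and boundary conditions.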

Comments 14 pages, supplement, dedicated github repo

Abstract

Accurate simulation of fluid flow in porous media is challenging due to complex pore-space geometries and the computational cost of solving the Navier-Stokes equations. This difficulty is particularly important when repeated simulations are required, as standard numerical solvers may converge slowly in intricate porous domains. We present a neural-network-based framework for predicting pore-scale velocity fields directly from sample geometry. The method uses a convolutional encoder-decoder architecture with skip connections to preserve spatial detail while extracting multi-scale features. Physical consistency is encouraged through a custom loss function combining velocity reconstruction with incompressibility, no-flow conditions inside solids, periodicity constraints, and agreement with the global tortuosity index. We analyze the influence of the corresponding loss weights and quantify the contribution of individual loss components to prediction accuracy. Several CNN backbones are evaluated to identify architectures providing accurate and robust predictions. The generalization ability of the trained model is tested on samples outside the training distribution, including changes in obstacle geometry, boundary conditions, porosity, and realistic porous structures. Finally, we demonstrate a practical use of the predicted velocity fields as initial conditions for Lattice-Boltzmann simulations. This warm-start strategy accelerates solver convergence, reducing the number of iterations in over 90% of tested cases.
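
Two of the physics terms named above, incompressibility and no flow inside solids, can be sketched as simple grid penalties. This is an assumed discretization (central differences on a periodic 2D grid with unit spacing), not the loss from the paper; the function name and the way the two terms are returned separately are illustrative.

```python
import numpy as np


def physics_penalties(u, v, solid):
    """Return (divergence penalty, no-flow penalty) for a 2D velocity field.

    u, v: (H, W) velocity components along x (axis 1) and y (axis 0);
    solid: (H, W) boolean mask marking obstacle cells.
    """
    # Central differences; np.roll wraps around, which encodes periodic boundaries.
    du_dx = (np.roll(u, -1, axis=1) - np.roll(u, 1, axis=1)) / 2.0
    dv_dy = (np.roll(v, -1, axis=0) - np.roll(v, 1, axis=0)) / 2.0
    divergence = float(np.mean((du_dx + dv_dy) ** 2))
    # Squared speed inside obstacles should be zero for a physically valid field.
    no_flow = float(np.mean((u ** 2 + v ** 2)[solid])) if solid.any() else 0.0
    return divergence, no_flow
```

In training, terms like these would typically be added to the velocity reconstruction error with tunable weights, which is the kind of weighting the abstract says the authors analyze.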

2605.20249 2026-05-21 cs.LG cs.AI

Automated Kernel Discovery Towards Understanding High-dimensional Bayesian Optimization

Taeyoung Yun, Woocheol Shin, Inhyuck Song, Jaewoo Lee, Jinkyoo Park

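AI summary: This paper proposes an LLM-driven evolutionary framework for automated kernel discovery in high-dimensional Bayesian optimization; it searches a broader kernel space and avoids conditioning on raw observations, making kernel design more effective for high-dimensional problems.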

Comments 36 pages, 27 figures, 12 tables

Abstract

Gaussian Process (GP) kernels are central to Bayesian optimization (BO), yet designing effective kernels for high-dimensional problems still relies on extensive manual engineering. Existing automated approaches struggle in high dimensions because of two bottlenecks: their kernel search space is limited to additions and multiplications of base kernels, and LLM-based approaches require conditioning on raw observations, which becomes infeasible due to context-length limits and the difficulty of extracting meaningful patterns. We introduce Kernel Discovery, an LLM-driven evolutionary framework for high-dimensional BO that searches a broader kernel space beyond predefined composition rules and does not require conditioning on observations. Motivated by the observation that directly prompting an LLM to generate kernel code yields syntactically varied but functionally identical kernels, we adopt a two-stage approach: an LLM first proposes novel mathematical forms, then a second LLM call converts each form into validated, executable code. We also propose a leave-one-out continuous ranked probability score (LOO-CRPS) as a selection criterion that penalizes overfitted kernels. On five high-dimensional BO benchmarks, our method achieves an average rank of 1.2 out of 17, outperforming competitive baselines. We further analyze the discovered kernels to identify which kernels lead to improvements in high-dimensional BO.
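
A leave-one-out CRPS for a GP can be computed without refitting, as the sketch below shows. It combines two standard results: the closed-form leave-one-out predictive mean and variance of a zero-mean GP, and the closed-form CRPS of a Gaussian forecast. How the paper defines or normalizes its LOO-CRPS is not stated in the abstract, so the function names, the zero-mean assumption, and the `noise` jitter are assumptions.

```python
import math

import numpy as np


def gaussian_crps(y, mu, sigma):
    """Closed-form CRPS of the forecast N(mu, sigma^2) at observation y (lower is better)."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def loo_crps(K, y, noise=1e-6):
    """Mean leave-one-out CRPS of a zero-mean GP with kernel matrix K on targets y."""
    K_inv = np.linalg.inv(K + noise * np.eye(len(y)))
    alpha = K_inv @ y
    diag = np.diag(K_inv)
    mu = y - alpha / diag        # leave-one-out predictive means
    sigma = np.sqrt(1.0 / diag)  # leave-one-out predictive standard deviations
    scores = [gaussian_crps(yi, mi, si) for yi, mi, si in zip(y, mu, sigma)]
    return float(np.mean(scores))
```

Because CRPS penalizes forecasts that are confidently wrong as well as ones that are needlessly wide, a kernel that interpolates the training data but predicts held-out points poorly scores worse, which is consistent with the abstract's claim that the criterion penalizes overfitted kernels.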

2605.20248 2026-05-21 cs.LG

Graph Transductive Sharpening: Leveraging Unlabeled Predictions in Node Classification

Brown Zaz, Mar Gonzàlez I Català, Ferran Hernandez Caralt, Moshe Eliasof, Pietro Liò

AI summary: This paper proposes Transductive Sharpening, a method that uses predictions on unlabeled nodes to improve node classification, raising performance on multiple benchmarks without changing the backbone architecture.

Comments 19 pages, 4 figures, 17 tables

Abstract

In the transductive setting, where the full graph is observed but node labels are only partially available, progress in semi-supervised node classification has largely focused on architectural innovation. In this paper, we revisit an orthogonal axis: the training objective. We start from a simple observation: transductive models produce predictions for every node during training, including nodes without labels. These unlabeled-node predictions may contain useful training signal, but standard supervised objectives discard them because no ground-truth labels are available. Inspired by the decomposition of cross-entropy into a label-dependent alignment term and a label-independent entropy term, we propose prediction confidence as a natural way to extract this signal in the absence of labels. This motivates Transductive Sharpening (TS): a loss-level modification that minimizes prediction entropy on unlabeled nodes while counterbalancing this effect on labeled nodes. We evaluate Transductive Sharpening across a wide range of node-classification benchmarks and observe consistent performance improvements without requiring any changes to the backbone architecture. Code is available at https://github.com/transductive-sharpening/tunedGNN.
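
The loss-level idea described above can be sketched as follows. This is a hedged illustration, not the authors' objective: the abstract does not give the exact form of the counterbalancing term, so the labeled-node entropy counterweight and the weights `lam` and `mu` below are assumptions, and `sharpening_loss` is a hypothetical name.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean(values):
    return sum(values) / len(values) if values else 0.0

def sharpening_loss(logits, labels, lam=0.5, mu=0.5):
    """logits: one list per node; labels: {node_index: class} for labeled nodes only."""
    probs = [softmax(row) for row in logits]
    # Standard supervised cross-entropy on labeled nodes.
    ce = mean([-math.log(probs[i][c]) for i, c in labels.items()])
    # Entropy of predictions on unlabeled nodes (minimized to sharpen them).
    ent_unlabeled = mean([entropy(p) for i, p in enumerate(probs) if i not in labels])
    # Assumed counterweight: entropy on labeled nodes enters with the opposite sign.
    ent_labeled = mean([entropy(probs[i]) for i in labels])
    return ce + lam * ent_unlabeled - mu * ent_labeled

# Same labeled node, but the unlabeled node is confident in one case and not the other.
confident = sharpening_loss([[2.0, 0.0], [5.0, 0.0]], {0: 0})
uncertain = sharpening_loss([[2.0, 0.0], [0.0, 0.0]], {0: 0})
```

With identical labeled predictions, the loss is lower when the unlabeled node's prediction is confident, which is the training signal that a purely supervised objective would discard.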

2605.20247 2026-05-21 cs.LG cs.AI cs.CL cs.CV

CP-MoE: Consistency-Preserving Mixture-of-Experts for Continual Learning

Yang Liu, Toan Nguyen, Flora D. Salim

AI summary: This paper proposes CP-MoE, a continual learning framework built around a transient expert. A consistency-preserving routing bias and a transient expert-guided regularisation mechanism reduce parameter interference and forgetting while preserving cross-task knowledge transfer.

Abstract

Catastrophic forgetting remains a major obstacle to continual learning in large language models (LLMs) and vision-language models (VLMs). Although Mixture-of-Experts (MoE) architectures offer an efficient path to scaling, existing LoRA-based MoE continual learning methods still face a fundamental trade-off: they either isolate experts too aggressively, limiting knowledge transfer across tasks, or allow task-specific updates to overwrite important existing parameters, leading to severe forgetting. To address this, we propose CP-MoE, a continual learning framework built around a transient expert that captures early task-specific updates and guides their integration into stable experts. CP-MoE introduces a consistency-preserving routing bias, which uses the transient expert to estimate representation similarity with stable experts and steer routing towards more compatible expert selection, and a transient expert-guided regularisation mechanism, which selectively protects important historical parameters during merging. Together, these components reduce parameter interference and forgetting while preserving cross-task knowledge transfer. We validate CP-MoE on both unimodal and multimodal continual learning benchmarks with LLM-based and VLM-based MoE models. On the SuperNI benchmark, spanning diverse sequential language tasks, CP-MoE achieves state-of-the-art performance and stronger zero-shot transfer to unseen tasks. On the VQA v2 dataset, it scales effectively to multimodal visual reasoning, consistently reduces forgetting, and outperforms strong MoE baselines.

2605.20242 2026-05-21 cs.LG cond-mat.mtrl-sci cs.AI physics.chem-ph

LEAP: A closed-loop framework for perovskite precursor additive discovery

Xin-De Wang, Zhi-Rui Chen, Ze-Feng Gao, Peng-Jie Guo, Cheng Mu, Zhong-Yi Lu

AI summary: This study proposes LEAP, a framework that combines a large language model with active learning. Using literature-derived, mechanism-relevant descriptors and Bayesian optimization, it prioritizes additives for perovskite solar cells. The domain-specialized model outperforms general-purpose models on mechanism-consistent reasoning, and experimental validation shows improved device performance.

Comments 30 pages; 11 figures

Abstract

Efficient discovery of precursor additives is essential for improving the performance of perovskite solar cells, yet the large chemical space makes conventional trial-and-error screening inefficient. We develop LEAP (LLM-driven Exploration via Active Learning for Perovskites), an expert-in-the-loop closed-loop framework that couples a domain-specialized large language model (LLM) with active learning for iterative additive prioritization. The LLM is trained to extract mechanism-relevant knowledge from the perovskite additive literature and to represent candidate molecules through interpretable descriptors, which are further integrated into a Bayesian optimization workflow for uncertainty-aware prioritization under low-data conditions. Benchmark results on unseen literature show that the domain-specialized model outperforms general-purpose models in mechanism-consistent reasoning. Experimental validation in an expert-in-the-loop proof-of-concept study suggests improved additive prioritization across three screening rounds, leading to average device PCEs of 20.13% and 20.87% for the later-round 6-CDQ- and 2-CNA-treated devices, respectively, compared with 19.25% for the control, with a champion PCE of 21.32%. These results provide preliminary evidence that literature-grounded mechanistic descriptors, when coupled with Bayesian optimization and expert feasibility review, can support mechanism-aware additive prioritization in perovskite photovoltaics.

2605.20241 2026-05-21 cs.LG cs.AI cs.CL

Geometry-Lite: Interpretable Safety Probing via Layer-Wise Margin Geometry

Woo Seob Sim, Yu Rang Park

AI summary: This paper studies prompt-level safety probing for large language models and proposes Geometry-Lite, a compact probe that analyzes layer-wise margin geometry to improve the interpretability and accuracy of safety detection.

Abstract

Prompt-level safety probes for large language models use hidden-state representations to separate safe from unsafe prompts, but strong average detection performance does not explain the geometry of this separation. In particular, it remains unclear how safety evidence is formed across layers, which aspects of that layer-wise geometry support low-false-positive decisions, and which geometric biases remain stable under benchmark shift. We study this as an empirical decomposition problem and introduce Geometry-Lite, a compact prompt-level probe that maps each layer's final prompt-token representation to signed margins under centroid, local-neighborhood, and supervised linear-boundary readouts, then summarizes the resulting margin profiles by boundary position, layer-to-layer change, and coarse shape. Across nine instruction-tuned backbones (1.2B-70B) and seven safety benchmarks, Geometry-Lite improves over single-layer probes while remaining close to raw multi-layer score stacking, making it a useful instrument for analyzing the multi-layer safety signal. The decomposition shows that safety evidence is expressed primarily through persistent boundary-position geometry: final or extremal margins and unsafe-side layer occupancy dominate aggregate detection performance. In contrast, finite-difference drift and structural summaries add little to pooled AUROC, although drift can provide small recall-oriented corrections under shifted low-FPR thresholds. Under benchmark shift, optimized linear boundaries are sharp on the training mixture, whereas class-conditional mean geometry retains separation more reliably on a predefined hard held-out subset. Overall, prompt-level safety evidence is not primarily a layer-to-layer motion signal, but a persistent layer-wise margin geometry whose useful components and readout-level biases become visible in decision-critical regimes.
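
As a rough illustration of the centroid readout and the margin-profile summaries mentioned above, the sketch below computes one signed margin per layer and then a few summary statistics. The sign convention (positive means closer to the unsafe centroid), the Euclidean distance, and the names `centroid_margins` and `summarize` are assumptions made for this example. The paper's exact readouts and features may differ.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid_margins(safe_by_layer, unsafe_by_layer, prompt_by_layer):
    """One signed margin per layer; positive means closer to the unsafe centroid."""
    margins = []
    for safe, unsafe, h in zip(safe_by_layer, unsafe_by_layer, prompt_by_layer):
        margins.append(distance(h, centroid(safe)) - distance(h, centroid(unsafe)))
    return margins

def summarize(margins):
    """Boundary-position and layer-to-layer-change summaries of a margin profile."""
    drift = [b - a for a, b in zip(margins, margins[1:])]
    return {
        "final_margin": margins[-1],
        "extremal_margin": max(margins, key=abs),
        "unsafe_occupancy": sum(m > 0 for m in margins) / len(margins),
        "mean_abs_drift": sum(abs(d) for d in drift) / len(drift) if drift else 0.0,
    }

# Toy two-layer example with 2-D hidden states (hypothetical data).
safe = [[[0.0, 0.0]], [[0.0, 0.0]]]
unsafe = [[[2.0, 0.0]], [[4.0, 0.0]]]
prompt = [[2.0, 0.0], [1.5, 0.0]]
profile = centroid_margins(safe, unsafe, prompt)
summary = summarize(profile)
```

A downstream classifier would consume the summary values instead of the raw per-layer scores, which is what keeps this kind of probe compact.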

2605.20240 2026-05-21 cs.LG

MagBridge-Battery: A Synthetic Bridge Dataset for Li-ion Magnetometry and State-of-Health Diagnostics

Sakthi Prabhu Gunasekar, Prasanna Kumar Rangarajan

AI summary: This paper presents MagBridge-Battery, a synthetic dataset that combines real magnetic morphology from the Mohammadi-Jerschow Open Science Framework archive with state-of-health labels from the PulseBat dataset. It provides a public benchmark for Li-ion battery magnetometry and state-of-health diagnostics and is validated on state-of-health regression, second-life classification, and anomaly detection.

Comments 10 pages, 3 figures, 4 tables. Synthetic dataset and benchmark suite for battery magnetometry and state-of-health diagnostics; dataset released on Zenodo and code available on GitHub

Abstract

Battery health diagnostics today rely overwhelmingly on electrochemical signals measured at the cell terminals. A parallel literature has shown that magnetic sensing can resolve information that terminal-only measurements miss, but method development is limited by the absence, to the best of our knowledge, of public battery magnetic-measurement datasets paired with degradation labels. We release MagBridge-Battery v1.0, a synthetic dataset of 6,760 magnetic-field signatures that bridges real magnetic morphology from the Mohammadi-Jerschow Open Science Framework (OSF) archive with state-of-health (SOH) labels from the PulseBat dataset. The release contains 5,600 PulseBat-conditioned grounded samples, 600 synthetic sensor-anomaly samples derived from clean parents, and 560 low-voltage Regime-B extrapolation samples. A cell-disjoint, parent-child-leakage-free primary benchmark split is verified to contain zero overlapping cells, zero cross-split parent-child pairs, and zero sample-ID overlap. We define three primary benchmark tasks: SOH regression, second-life classification, and anomaly detection, plus an auxiliary anomaly-subtype classification task. A controlled label-shuffle ablation collapses SOH regression from R^2 approximately 0.77 to approximately 0, confirming that the bridge encodes input SOH non-trivially rather than producing label-aligned artifacts. The dataset is released on Zenodo under CC-BY-4.0, and the bridge code and benchmark suite are released under Apache-2.0. This work provides a public benchmark for magnetic-sensing battery diagnostics while paired magnetic-electrochemical measurements remain scarce.
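
The three split checks named above (no shared cells, no cross-split parent-child pairs, no shared sample IDs) can be sketched as a small verification routine. The record fields `sample_id`, `cell_id`, `parent_id`, and `split`, and the function name `leakage_report`, are hypothetical. The released dataset's actual schema may differ.

```python
from itertools import combinations

def leakage_report(samples):
    """samples: dicts with 'sample_id', 'cell_id', 'parent_id' (or None), 'split'."""
    by_split = {}
    for s in samples:
        by_split.setdefault(s["split"], []).append(s)

    shared_cells, shared_ids = set(), set()
    for a, b in combinations(sorted(by_split), 2):
        shared_cells |= {s["cell_id"] for s in by_split[a]} & {s["cell_id"] for s in by_split[b]}
        shared_ids |= {s["sample_id"] for s in by_split[a]} & {s["sample_id"] for s in by_split[b]}

    # A derived sample (e.g. a synthetic anomaly) must sit in the same split as its parent.
    split_of = {s["sample_id"]: s["split"] for s in samples}
    cross_pairs = [
        (s["parent_id"], s["sample_id"]) for s in samples
        if s["parent_id"] is not None
        and split_of.get(s["parent_id"], s["split"]) != s["split"]
    ]
    return {
        "shared_cells": len(shared_cells),
        "cross_split_pairs": len(cross_pairs),
        "shared_ids": len(shared_ids),
    }

clean = [
    {"sample_id": "a", "cell_id": "c1", "parent_id": None, "split": "train"},
    {"sample_id": "a_anom", "cell_id": "c1", "parent_id": "a", "split": "train"},
    {"sample_id": "b", "cell_id": "c2", "parent_id": None, "split": "test"},
]
leaky = clean + [{"sample_id": "b_anom", "cell_id": "c2", "parent_id": "b", "split": "train"}]
clean_report = leakage_report(clean)
leaky_report = leakage_report(leaky)
```

A split passes only when all three counts are zero. In the leaky toy example, the anomaly derived from a test-set parent lands in the training split and is flagged twice, once as a shared cell and once as a cross-split pair.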

2605.20237 2026-05-21 cs.CV

AnimeAdapter: Fine-grained and Consistent Zero-shot Anime Character Generation

Yixuan Han

AI summary: This paper proposes a lightweight appearance adapter for controllable and consistent anime character generation under diverse editing conditions. It injects fine-grained visual features from a single reference image into the diffusion process, builds semantic-selective local attention on CLIP's emergent local spatialization, and uses pose-aware conditioning to disentangle character appearance from spatial layout.

Abstract

We present a lightweight appearance adapter for Stable Diffusion that enables controllable and consistent anime character generation under diverse editing conditions. Instead of relying on large-scale vision-language models or per-subject fine-tuning, our method injects fine-grained visual features from a single reference image into the diffusion process. Based on CLIP emergent local spatialization, we develop semantic-selective local attention. To further disentangle character appearance from spatial layout, we incorporate pose-aware conditioning during adapter training. The resulting pretrained adapter remains compact, modular, and fully compatible with Stable Diffusion community workflows, while requiring no additional fine-tuning at deployment time. Furthermore, we present a high-quality anime character dataset based on curated and restructured Danbooru prompts, and evaluate our method across several practical character editing scenarios. Our code, model weights, and dataset will be publicly released upon acceptance.

2605.20235 2026-05-21 cs.LG cs.AI

Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine

Wei Huang, Andi Han, Mingyuan Bai, Huanjian Zhou, Qixin Zhang, Taiji Suzuki, Kenji Fukumizu

AI summary: This paper studies the learning of diffusion models under the manifold hypothesis. It identifies a collapse-and-refine mechanism driven by the geometry of the score function, instantiates it as Score-induced Latent Diffusion (SiLD), and proves that the sample complexity depends on the intrinsic dimension rather than the ambient dimension. Experiments are consistent with the theoretical predictions.

Comments 3 figures

Abstract

Diffusion models generate high-dimensional data with remarkable quality, yet how their training efficiently learns the score function, bypassing the curse of dimensionality when data is supported on low-dimensional manifolds, remains theoretically unexplained. We identify a collapse-and-refine mechanism driven by the geometry of the score function itself: at small noise scales, the diverging singularity of the score drives a rapid dimensional collapse of the induced denoising map onto the data manifold projection; at moderate noise scales, training refines the intrinsic density on the learned manifold. We instantiate this principle as Score-induced Latent Diffusion (SiLD), a two-stage framework in which both manifold learning and density estimation emerge from a single denoising score matching objective, replacing the heuristic KL regularization of VAE-based latent diffusion models. We prove that the resulting sample complexity depends on the intrinsic dimension rather than the ambient dimension. Experiments on Stacked MNIST, CelebA variants, and molecular generation benchmarks show that SiLD matches or outperforms VAE-based LDMs in generation quality and consistently improves reconstruction, validating our theoretical predictions.

2605.20234 2026-05-21 cs.LG cs.AI

TabPFN-MT: A Natively Multitask In-Context Learner for Tabular Data

Cormac Cureton, Narges Armanfard

AI summary: This paper proposes TabPFN-MT, a natively multitask in-context learner for tabular data. Trained on an expanded multi-target synthetic prior that captures inter-task dependencies in context, it supports multitask in-context learning and simultaneous inference, performs strongly on small-to-medium datasets, and improves computational efficiency for multi-target tabular applications.

Comments 24 pages, 7 figures

Abstract

Prior-Data Fitted networks (PFNs) have been very successful in tabular contexts, handling prediction tasks in context. However, they are designed for single-task inference, meaning that predicting several target values within a context requires repeated forward calls and precludes inter-task information sharing. We propose TabPFN-MT, which is trained on an expanded multi-target synthetic prior to capture inter-task dependencies in context. This model uses an expanded $y$-encoder and a shared decoder head to enable multitask in-context learning and simultaneous inference. The model is uniquely specialized for small-to-medium datasets by relying on in-context learning rather than traditional gradient-based training. Within this regime (averaging fewer than 1,000 samples), extensive evaluations across 344 datasets demonstrate that TabPFN-MT establishes a new state-of-the-art for deep tabular multitask learning. Furthermore, despite the inherent compute asymmetry of joint optimization, our model remains highly competitive with the latest state-of-the-art single-task ensembles. Notably, on multitask datasets it achieves an overall Accuracy rank of 4.89, the highest average rank among all models tested. Crucially, TabPFN-MT delivers this highly competitive performance while reducing the inference cost for $T$ tasks from $O(T)$ to $O(1)$ forward passes, offering a massive computational efficiency improvement for multi-target tabular applications.

2605.20233 2026-05-21 cs.CV cs.AI

AI-Assisted Competency Assessment from Egocentric Video in Simulation-Based Nursing Education

Hanchen David Wang, Yilin Liu, Madison J. Lee, Surya Chand Rayala, Gautam Biswas, Daniel T. Levin, Meiyi Ma

AI summary: This paper proposes an AI-assisted assessment framework based on egocentric video. It extracts action timelines, sequence-level features, and recognition metrics, and finds a negative correlation between recognition accuracy and competency, suggesting that recognition accuracy can serve as a pedagogically informative signal in automated assessment.

Comments Accepted at CVPR Workshop

Abstract

Assessing learner competency in clinical simulation requires expert observation that is time-intensive, difficult to scale, and subject to inter-rater variability. Vision-language models have emerged as a promising tool for understanding complex visual behavior. In this work, we investigate whether visual observations can provide educationally meaningful signals for competency assessment through a three-stage framework that (1) extracts action timelines from egocentric nursing simulation video using frozen visual encoders and few-shot learning, (2) derives sequence-level features and per-session recognition metrics, and (3) relates these to instructor-rated competency. Across 22 densely annotated sessions (3.8 hours, 493 actions), a frozen DINOv2 backbone with HMM Viterbi decoding achieves 57.4% MOF in leave-one-out 1-shot recognition. Surprisingly, we observe a negative trend between recognition accuracy and competency (rho = -0.524, p = 0.012 for mIoU), robust to six confound controls: more competent students produce diverse, harder-to-classify workflows, while simple sequence features show no such relationship. Per-item analysis identifies patient safety protocols and team communication as the expected behaviors most reflected in this pattern, and process model comparisons reveal that higher-competency students exhibit more protocol-consistent action transitions. These findings suggest that recognition accuracy may complement predicted action timelines as a pedagogically informative signal in automated competency assessment.
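
The HMM Viterbi decoding step mentioned above can be illustrated with a short log-space implementation that turns per-frame class scores into a smoothed action timeline. This is a generic sketch under assumed inputs (per-frame log-likelihoods, a sticky transition matrix, uniform start probabilities), not the authors' pipeline. The `mof` helper assumes MOF denotes mean-over-frames accuracy, as is common in action segmentation.

```python
import math

def viterbi(log_emis, log_trans, log_start):
    """log_emis[t][s], log_trans[prev][s], log_start[s] -> most likely state path."""
    n_states = len(log_start)
    score = [log_start[s] + log_emis[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emis)):
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emis[t][s])
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def mof(predicted, truth):
    """Fraction of frames labeled correctly (assumed meaning of MOF)."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Two actions, sticky transitions, and one weakly misleading frame in the middle.
stay, switch = math.log(0.9), math.log(0.1)
log_trans = [[stay, switch], [switch, stay]]
log_start = [math.log(0.5), math.log(0.5)]
frame_probs = [[0.9, 0.1], [0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.9, 0.1]]
log_emis = [[math.log(p) for p in row] for row in frame_probs]
framewise = [max(range(2), key=lambda s: row[s]) for row in frame_probs]
decoded = viterbi(log_emis, log_trans, log_start)
```

The frame-wise argmax flips to the second action on the ambiguous middle frame, while the decoded path keeps a single segment because switching twice costs more than the weak emission evidence gains.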

2605.20223 2026-05-21 cs.CV

Why Latent Actions Fail, and How to Prevent It

Jung Min Lee, Taehyun Cho, Li Zhao, Jungwoo Lee

AI summary: This paper studies how exogenous state interferes with action learning in latent action models and shows that learning in a representation space focused on endogenous components mitigates the interference of noise.

Abstract

Latent action models (LAMs) aim to learn action-like representations from unlabeled videos by compressing frame-to-frame changes. The frames of in-the-wild videos, however, contain not only the agent's own state but also exogenous state such as background clutter. Since the exogenous state introduces changes unrelated to actions, it hinders reliable latent action learning. This paper investigates this problem analytically by extending a linear LAM framework to explicitly model exogenous state. Our analysis reveals two insights: (1) minimizing the standard reconstruction objective produces latent actions that encode exogenous information from future observations; and (2) learning in a representation space that focuses on endogenous components is key to mitigating the interference of noise. We further show that previously proposed auxiliary objectives, such as action supervision, provably encourage latent actions to be consistent across exogenous states. These findings are validated through experiments on both linear and nonlinear LAMs, providing a unified theoretical analysis of how exogenous state hinders latent action learning and why common remedies work.

2605.20220 2026-05-21 cs.SD cs.IR cs.LG

Advanced Scientific Methodology Plays Rossini

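Silvia Licciardi, Daniela Macchione, Emmanuel Caronna, Elisa Francomano

AI summary: This paper applies computational analysis to the structure of Rossini's musical settings of Metastasio's "Mi lagnerò tacendo", revealing their melodic, harmonic, and text-setting choices and providing a new basis for systematic research in musical philology.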

2605.20211 2026-05-21 cs.CV cs.AI

Leveraging Vision-Language Models to Detect Attention in Educational Videos

Gabriel Becquet, Sébastien Lallé, Vanda Luengo, Ali Abou-Hassan

AI summary: This paper investigates using a vision-language model to analyze educational video content directly, together with superimposed gaze data, in order to improve attention detection. None of the tested prompting strategies outperformed statistical baselines, which highlights the limitations of this approach for real-time educational diagnostics.

Abstract

Educational videos are a cornerstone of remote and blended learning. However, learners' fluctuating attention remains a significant barrier to effective information retention. Prior research has attempted to mitigate this by detecting and reacting to attention loss at runtime using eye tracking. Such detection has been based so far on classical machine learning classifiers trained on engineered features, such as summary statistics over learners' fixations and saccades. These methods have struggled to capture the complex, temporal nature of learner engagement, thus exhibiting moderate prediction performance. In this study, we aim to advance the detection of attention by shifting from standard engineered features to multimodal foundation models. Using an educational eye-tracking dataset (N = 70), we investigate a novel methodology that utilizes a Vision-Language Model (VLM) to analyze video content directly with superimposed gaze data. This approach aims to leverage the semantic reasoning capabilities of foundation models to contextualize learner focus within the video stream. We evaluate the performance of this VLM-based approach using several prompting strategies with Gemini 3, but ultimately find that none of them outperforms statistical baselines. Our results provide new insights into the limitations of using VLMs for real-time educational diagnostics.

2605.20202 2026-05-21 cs.CL cs.AI

Under Pressure: Emotional Framing Induces Measurable Behavioral Shifts and Structured Internal Geometry in Small Language Models

Rana Muhammad Usman

AI summary: This study examines how emotional framing affects the behavior and internal representations of small, locally deployed language models. Evaluated on four impossible-constraint coding tasks and eight follow-up framings, pressure framing produces the strongest behavioral shortcut markers, and the calm-relative direction vectors show structured internal geometry across framings.

Comments 18 pages, 4 figures. Exploratory empirical study with fully local experiments on small open language models. Code and data: https://github.com/ranausmanai/LLMEmotionGeometry

Abstract

I study whether emotionally framed evaluation follow-ups change both the behavior and the calm-relative internal representations of small, locally deployed language models. Our main benchmark uses Qwen 3.5 0.8B on four impossible-constraint coding tasks and eight follow-up framings: calm, pressure, urgency, approval, shame, curiosity, encouragement, and threat. In the 0.8B eight-condition sweep (160 conversations), pressure produces the strongest shortcut markers (11/20 runs) and the clearest overfit pattern (3/20), while calm and curiosity preserve explicit honesty more often (7/20 and 6/20). For all seven non-baseline conditions, the corresponding calm-relative direction vectors peak at the final transformer layer. An exploratory PCA of the layer-23 direction vectors reveals a dominant first component (59.5% explained variance) aligned with a hand-labeled positive/negative split (cosine alignment 0.951); approval and urgency are nearly identical internally (cosine 0.957), whereas curiosity points away from urgency (-0.252). In a separate calm-vs.-pressure rerun used for scale comparison, Qwen 3.5 2B shows higher honest rates under calm framing and directionally consistent activation steering on a small 4-prompt A/B probe, whereas the 0.8B steering result reverses. I interpret these results as evidence for measurable prompt-sensitive control directions in small open models, while stopping short of claiming intrinsic emotional states.
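
The calm-relative direction vectors and the cosine comparisons reported above can be sketched as follows. The sketch assumes each condition yields a set of activation vectors at one layer (how activations are pooled over tokens is not stated in the abstract), and the toy numbers are invented purely to exercise the code. `direction` and `cosine` are illustrative names.

```python
import math

def mean_vector(rows):
    """Component-wise mean over a list of activation vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def direction(condition_acts, calm_acts):
    """Calm-relative direction: mean activation under a framing minus mean under calm."""
    return [c - b for c, b in zip(mean_vector(condition_acts), mean_vector(calm_acts))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical 2-D activations, two runs per condition.
calm = [[1.0, 1.0], [1.0, 1.0]]
approval = [[2.0, 1.0], [2.0, 1.2]]
urgency = [[3.0, 1.1], [3.0, 1.1]]
curiosity = [[0.0, 1.0], [0.0, 1.0]]

d_approval = direction(approval, calm)
d_urgency = direction(urgency, calm)
d_curiosity = direction(curiosity, calm)
similar = cosine(d_approval, d_urgency)
opposed = cosine(d_curiosity, d_urgency)
```

A cosine near 1 means two framings shift the representation in nearly the same direction relative to calm, while a negative cosine means they push in opposing directions. Stacking such direction vectors and running PCA on them is the natural next step for the kind of analysis the abstract describes.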

2605.20199 2026-05-21 cs.CL cs.AI

FlowLM: Few-Step Language Modeling via Diffusion-to-Flow Adaptation

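Runzhe Zhang, Letian Chen, Wenpeng Zhang, Zhouhan Lin, Peilin Zhao

AI summary: This paper proposes FlowLM, a flow-matching language model obtained by efficiently fine-tuning a pretrained diffusion language model. By realigning the diffusion model's curved sampling trajectories into straight-line flows, it achieves high-quality few-step generation whose quality matches or even exceeds that of 2000-step diffusion sampling. The authors also propose a more effective flow-matching training objective: predicting clean data so that the sampling process is continually steered toward the true data distribution.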

Comments 26 pages, 11 figures

2605.20197 2026-05-21 cs.CL

MedicalBench: Evaluating Large Language Models Toward Improved Medical Concept Extraction

Zhichao Yang, Gregory D. Lyng, Sanjit Singh Batra, Robert E. Tillman

AI summary: This paper presents MedicalBench, a benchmark for evaluating large language models on medical concept extraction, with a focus on implicit concepts and evidence grounding. The dataset is built through a multi-stage pipeline, two complementary evaluation tasks are defined, and the results show how difficult implicit concept extraction remains.

Abstract

Medical concept extraction from electronic health records underpins many downstream applications, yet remains challenging because medically meaningful concepts are frequently implied rather than explicitly stated in medical narratives. Existing benchmarks with human-annotated evidence spans underscore the importance of grounding extracted concepts in medical text. However, they predominantly focus on explicitly stated concepts instead of implicit concepts. We present MedicalBench, a benchmark for medical concept extraction with evidence grounding that evaluates implicit medical reasoning. MedicalBench formulates medical concept extraction as a verification task over medical note-concept pairs, coupled with sentence-level evidence identification. Built from MIMIC-IV discharge summaries and human-verified ICD-10 codes, the dataset is curated through a multi-stage large language model (LLM) triage pipeline followed by medical annotation and expert review. It deliberately includes implicit positives, semantically confusable negatives, and cases where LLM judgments disagree with medical expert assessments. We define two complementary evaluation tasks: (1) medical concept extraction and (2) sentence-level evidence retrieval, enabling assessment of both correctness and interpretability. Benchmarking state-of-the-art LLMs reveals that performance remains modest, highlighting the difficulty of extracting implicitly expressed concepts. We further show that performance is largely invariant to note length, indicating that MedicalBench isolates reasoning difficulty rather than superficial confounders. MedicalBench provides the first systematic benchmark for implicit, evidence-grounded medical concept extraction, offering a foundation for developing medical language models that can both identify medically relevant concepts and justify their predictions in a transparent and medically faithful manner.
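
The two evaluation tasks described above could be scored along the following lines. The abstract does not name the metrics, so plain accuracy for note-concept verification and set-based F1 for sentence-level evidence are assumptions, as are the function names and the toy inputs.

```python
def verification_accuracy(predictions, golds):
    """Task 1 (assumed metric): fraction of note-concept pairs judged correctly."""
    correct = sum(p == g for p, g in zip(predictions, golds))
    return correct / len(golds)

def evidence_f1(predicted_sentences, gold_sentences):
    """Task 2 (assumed metric): F1 between predicted and gold evidence sentence IDs."""
    predicted, gold = set(predicted_sentences), set(gold_sentences)
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# Hypothetical outputs: three note-concept pairs and one set of evidence sentences.
accuracy = verification_accuracy([True, False, True], [True, True, True])
f1 = evidence_f1([1, 2], [2, 3])
```

Scoring the two tasks separately makes it possible to tell a model that reaches the right verdict for the wrong reasons apart from one that also points to the supporting sentences.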

2605.20196 2026-05-21 cs.CL cs.AI cs.LG

Data Scaling as Progressive Coverage of a Predictive Contribution Spectrum

数据扩展作为预测贡献光谱的渐进覆盖

Zihui Song, Shihao Ji, Hongxi Li, Shuaizhi Cheng, Chunlin Huang

AI总结本文研究了真实数据扩展定律是由潜在预测贡献光谱的渐进覆盖而非仅由词频尾部决定的假设，通过文本语料库的后缀自动机表示，定义了数据内在的全局KL预测贡献光谱，每个状态根据其经验质量乘以与全局下一个词基线的KL偏差进行贡献。在12个真实语料库上，该光谱的尾部斜率与固定小GPT学习者的经验数据扩展指数有强相关性。然后定义了每个训练规模N的有效截断秩K(N)，通过匹配观察到的超额损失与准备的100万全球KL光谱的残余尾部质量。实证结果显示，log K接近log N的线性关系，原始光谱的R²约为0.96，平滑光谱的R²约为0.90。这些发现为简单机制图提供了有力的实证支持：训练规模通过预测状态光谱推进有效前沿，该光谱的残余尾部质量跟踪剩余超额损失。

Comments 8 pages, 6 figures

Abstract

We investigate the hypothesis that real-data scaling laws are governed by progressive coverage of a latent predictive contribution spectrum rather than by token-frequency tails alone. We work with a suffix-automaton representation of text corpora and define a data-intrinsic global-KL predictive contribution spectrum, in which each state contributes according to its empirical mass times its KL deviation from a global next-token baseline. Across 12 real corpora, the tail slope of this spectrum is already strongly correlated with the empirical data-scaling exponent of a fixed small GPT learner. We then go beyond slope correlation and define, for each training size N, an effective truncation rank K(N) by matching the observed excess loss to the residual tail mass of the prepared 1000k global-KL spectrum. Empirically, log K is close to linear in log N, with pooled R^2 about 0.96 for the raw spectrum and R^2 about 0.90 for the smoothed spectrum. These findings provide strong empirical support for a simple mechanism picture: training scale advances an effective frontier through a predictive state spectrum, and the residual tail mass of that spectrum tracks the remaining excess loss.
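The quantities named in this abstract can be sketched in a few lines. The example below is a simplified illustration, not the authors' implementation: fixed-length token contexts stand in for suffix-automaton states (an assumption made to keep the sketch short), and the function names `kl_contribution_spectrum`, `residual_tail_mass`, and `effective_truncation_rank` are hypothetical. Each state contributes its empirical mass times its KL deviation from the global next-token baseline, and K is taken as the smallest rank whose residual tail mass no longer exceeds a given excess loss.

```python
import math
from collections import Counter, defaultdict


def kl_contribution_spectrum(tokens, context_len=1):
    """Per-context contributions (mass * KL to the global baseline), sorted descending."""
    next_tokens = tokens[context_len:]
    total = len(next_tokens)
    # Global next-token baseline: empirical distribution of all predicted tokens.
    global_p = {t: c / total for t, c in Counter(next_tokens).items()}

    # Next-token counts per context; a fixed-length context is a stand-in for a state.
    per_context = defaultdict(Counter)
    for i in range(context_len, len(tokens)):
        context = tuple(tokens[i - context_len:i])
        per_context[context][tokens[i]] += 1

    spectrum = []
    for counts in per_context.values():
        n = sum(counts.values())
        mass = n / total  # empirical mass of this state
        kl = sum((c / n) * math.log((c / n) / global_p[t]) for t, c in counts.items())
        spectrum.append(mass * kl)
    return sorted(spectrum, reverse=True)


def residual_tail_mass(spectrum, k):
    """Total contribution of all states ranked after position k."""
    return sum(spectrum[k:])


def effective_truncation_rank(spectrum, excess_loss):
    """Smallest K whose residual tail mass does not exceed the observed excess loss."""
    for k in range(len(spectrum) + 1):
        if residual_tail_mass(spectrum, k) <= excess_loss:
            return k
    return len(spectrum)


if __name__ == "__main__":
    toy_corpus = "a b a b a c a b".split()
    spectrum = kl_contribution_spectrum(toy_corpus)
    print(spectrum)
    print(effective_truncation_rank(spectrum, excess_loss=0.2))
```

In this picture a smaller excess loss maps to a larger K, which is the sense in which more training data pushes the effective frontier further down the spectrum.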

2605.20195 2026-05-21 cs.CL cs.AI cs.LG

Pseudo-Siamese Network for Planning in Target-Oriented Proactive Dialogues

Xinyue Kang, Maodong Li, Yibin Zheng, Fang Kong

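AI summary: This paper proposes a planning method for target-oriented proactive dialogues that uses the FF-BPSN network for dialogue path planning, improving the effectiveness of target-oriented proactive dialogue systems.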

Comments ICASSP 2026

Abstract

A target-oriented proactive dialogue system is designed to steer conversations toward predefined targets while actively providing suggestions. The core paradigm of such a system is to plan a reasonable dialogue path and subsequently guide language models (e.g., pre-trained or large language models) to generate responses, where dialogue path planning serves as the central component, a novel yet under-explored problem. In this work, we propose a Forward-Focused Bidirectional Pseudo-Siamese Network (FF-BPSN) for dialogue path planning toward predefined dialogue targets. FF-BPSN employs two identical transformer-based decoders for forward and backward planning, together with a forward-focused module that integrates bidirectional information to construct the final forward path. This path benefits from bidirectional planning while prioritizing forward information. We then employ the planned path to guide language models in response generation. Extensive experiments on DuRecDial and DuRecDial 2.0 demonstrate that FF-BPSN achieves state-of-the-art performance in dialogue path planning and significantly enhances the effectiveness of target-oriented proactive dialogue systems.
