arXivDaily arXiv每日学术速递 周一至周五更新
重置
全部学科分类 1676
2605.15308 2026-05-18 cs.AI cs.LG cs.MA

SMCEvolve: Principled Scientific Discovery via Sequential Monte Carlo Evolution

SMCEvolve:通过序列蒙特卡洛进化进行原理性科学发现

Jiachen Jiang, Huminhao Zhu, Zhihui Zhu

AI总结 SMCEvolve通过将程序搜索视为从奖励倾斜的目标分布中采样,并利用序列蒙特卡洛采样器近似该分布,提出三种核心机制:自适应父采样、变异与接受的混合、自动收敛控制,从而在数学、算法效率、符号回归和端到端ML研究基准中超越现有系统。

详情
AI中文摘要

LLM驱动的程序进化已成为自动化科学发现的强大工具,但现有框架缺乏设计其各个组件的原理性指导,并无法保证搜索收敛。我们介绍了SMCEvolve,将其程序搜索重新解释为从奖励倾斜的目标分布中采样,并用序列蒙特卡洛(SMC)采样器近似该分布。从这一视角,三种核心机制浮现为原理性组件:自适应父采样、变异与接受的混合、自动收敛控制。我们进一步提供有限样本复杂性分析,该分析界定了达到目标近似误差所需的LLM调用预算。在数学、算法效率、符号回归和端到端ML研究基准上,SMCEvolve在超越现有最先进的进化系统的同时,使用更少的LLM调用次数在自定终止条件下运行。代码可在https://github.com/kongwanbianjinyu/SMCEvolve获取。

英文摘要

LLM-driven program evolution has emerged as a powerful tool for automated scientific discovery, yet existing frameworks offer no principled guide for designing their individual components and provide no guarantee that the search converges. We introduce SMCEvolve, which recasts program search as sampling from a reward-tilted target distribution and approximates it with a Sequential Monte Carlo (SMC) sampler. From this view, three core mechanisms emerge as principled components: adaptive parent resampling, mixture of mutation with acceptance, and automatic convergence control. We further provide a finite-sample complexity analysis that bounds the LLM-call budget required to reach a target approximation error. Across math, algorithm efficiency, symbolic regression, and end-to-end ML research benchmarks, SMCEvolve surpasses state-of-the-art evolving systems while using fewer LLM calls under self-determined termination. The code is available at https://github.com/kongwanbianjinyu/SMCEvolve.

2605.15306 2026-05-18 cs.LG stat.ML

How Data Augmentation Shapes Neural Representations

数据增强如何塑造神经表示

Tianxiao He, Alex H. Williams, Sarah E. Harvey

AI总结 研究探讨不同数据增强策略如何改变神经网络内部表示的几何结构,揭示增强强度与表示形状的关系,以及神经几何在模型集成中的应用。

详情
AI中文摘要

数据增强被广泛用于提升深度网络的泛化能力,但其对学习表示几何结构的影响仍不明确。本文通过形状分析工具,将网络隐藏表示嵌入到度量空间中,该空间对缩放、平移、旋转和反射不变。研究显示,增强强度增加导致空间中轨迹更稳定,不同增强类型引导表示朝不同方向发展。此外,研究探讨神经表示形状如何沿数据增强轨迹扭曲,并表明神经几何学可预测在模型集成中表现最佳的表示。结果揭示了不同架构和种子间的共享几何模式,表明分析形状空间轨迹是理解和比较数据增强方法的原理性工具。

英文摘要

Data augmentation is widely recognized for improving generalization in deep networks, yet its impact on the geometry of learned representations remains poorly understood. In this work, we characterize how different data augmentation strategies reshape internal representations in neural networks. Using tools from shape analysis, we embed network hidden representations into a metric space where distance is invariant to scaling, translation, rotation and reflection. We show that increasing augmentation strength leads to well-behaved trajectories in this space, and that different augmentation types steer representations in distinct directions. Moreover, we investigate how neural representation shapes are distorted along data augmentation trajectories, and show that insights from neural geometry can predict which representations provide the most improvement when ensembling models. Our results reveal shared geometric patterns across architectures and seeds, and suggest that analyzing shape-space trajectories offers a principled tool for understanding and comparing data augmentation methods.

2605.15304 2026-05-18 cs.CL

DiscoExplorer: An Open Interface for the Study of Multilingual Discourse Relations

DiscoExplorer:多语言语篇关系研究的开放接口

Amir Zeldes

AI总结 本文提出DiscoExplorer,一个本地可运行的开源网页接口,用于公开多语言语篇关系数据集,涵盖16种语言,提供查询语言、搜索和可视化工具,展示连接词等信号装置及示例研究。

详情
AI中文摘要

语篇中命题之间的关系如因果(A因为B)或让步(A虽然B)是计算语言学和语用学研究的热点,但跨语言研究和比较极具挑战性。最近标准化语篇关系库存的进展为此类研究提供了潜力,但受限于相关数据的复杂性和缺乏易于访问的分析接口。本文提出DiscoExplorer,一个新开放源代码的网页接口,可在本地计算机上运行,用于公开DISRPT共享任务的语篇关系分类数据集,涵盖16种语言。我们展示了查询语言、关系和信号装置(如连词)的搜索和可视化功能,以及一些示例研究。

英文摘要

The relations connecting propositions in discourse such as cause (A because B) or concession (A although B) are a subject of intense interest in Computational Linguistics and Pragmatics, but challenging to study and compare across languages. Recent progress in standardizing discourse relation inventories across datasets offers the potential to facilitate such studies, but is hindered by the complexity of relevant data and the lack of easily accessible interfaces to analyze it. In this paper we present DiscoExplorer, a new open source web interface, capable of running on local computers, which we use to make datasets from the DISRPT Shared Task on discourse relation classification publicly available, covering 16 different languages. We present the query language, search and visualization facilities for relations and signaling devices such as connectives, as well as some example studies.

2605.15301 2026-05-18 cs.AI

Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

Solvita:通过代理进化增强大型语言模型以应对编程竞赛

Han Li, Jinyu Tian, Rili Feng, Yuqiao Du, Chong Zheng, Chenyu Wang, Chenchen Liu, Shihao Li, Xinping Lei, Yifan Yao, Weihao Xie, Letian Zhu, Jiaheng Liu

AI总结 Solvita通过闭环系统和可训练知识网络,使代理动态学习,提升编程竞赛任务的准确性和经验积累。

详情
AI中文摘要

大型语言模型(LLMs)在严格的编程竞赛推理需求上仍存在挑战。尽管最近的多代理框架试图弥合这一可靠性差距,但它们本质上是无状态的:它们依赖静态检索并丢弃了从先前任务中获得的有价值的解决问题和调试经验。为了解决这一问题,我们提出了Solvita,一个代理进化框架,它允许持续学习而无需对基础LLM进行权重更新。Solvita将问题解决重新组织为一个闭环系统,包括策略选择、程序合成、认证监督和针对性破解,由四个专门的代理:规划者、求解者、Oracle和黑客执行。关键的是,每个代理都配有一个可训练的图结构知识网络。随着系统的运行,结果信号,如通过/失败判决、测试认证质量和黑客发现的对抗性漏洞,被重新解释为强化学习更新这些网络权重。这使代理能够根据过去的成功和失败动态路由未来的查询,从而在时间上积累可转移的推理经验。在CodeContests、APPS、AetherCode和实时Codeforces轮次中评估,Solvita在代码生成代理中建立了新的最先进的状态,优于现有的多代理流程,并几乎将单次流程基线的准确性翻倍。

英文摘要

Large language models (LLMs) still struggle with the rigorous reasoning demands of hard competitive programming. While recent multi-agent frameworks attempt to bridge this reliability gap, they remain fundamentally stateless: they rely on static retrieval and discard the valuable problem-solving and debugging experience gained from previous tasks. To address this, we present Solvita, an agentic evolution framework that enables continuous learning without requiring weight updates to the underlying LLM. Solvita reorganizes problem-solving into a closed-loop system of strategy selection, program synthesis, certified supervision, and targeted hacking, executed by four specialized agents: Planner, Solver, Oracle, and Hacker. Crucially, each agent is paired with a trainable, graph-structured knowledge network. As the system operates, outcome signals, such as pass/fail verdicts, test certification quality, and adversarial vulnerabilities discovered by the Hacker, are recast as reinforcement learning updates to these network weights. This allows the agents to dynamically route future queries based on past successes and failures, effectively accumulating transferable reasoning experience over time. Evaluated across CodeContests, APPS, AetherCode, and live Codeforces rounds, Solvita establishes a new state-of-the-art among code-generation agents, outperforming existing multi-agent pipelines and nearly doubling the accuracy of single-pass baselines.

2605.15300 2026-05-18 cs.CV

Deep Pre-Alignment for VLMs

视觉语言模型的深度预对齐

Tianyu Yu, Kechen Fang, Zihao Wan, Kaidong Zhang, Yicheng Zhang, Jun Song, Bo Zheng, Yuan Yao

AI总结 本文提出深度预对齐(DPA),通过替换传统ViT编码器为小型VLM作为感知器,实现视觉特征与目标大语言模型文本空间的深度对齐,提升了多模态基准性能,并降低了语言能力遗忘。

Comments Accepted by ICML 2026. Project Website: https://github.com/THUMAI-Lab/Deep-Pre-Alignment

详情
AI中文摘要

大多数视觉语言模型(VLMs)通过轻量级投影器将ViT编码器的输出直接映射到LLM。尽管有效,最近的分析表明这种架构存在对齐挑战:视觉特征在LLM的初始层仍远离文本空间,迫使模型在表面模态对齐上浪费关键深度~\cite{zhang-etal-2024-investigating,artzy-schwartz-2024-attend},而非深入理解和复杂推理。在本工作中,我们提出深度预对齐(DPA),一种新颖的架构,用小型VLM作为感知器替换标准ViT编码器,确保视觉特征与目标大语言模型的文本空间深度对齐。全面实验展示了DPA的有效性。在4B参数规模上,DPA在8个多模态基准上比基线高出1.9分,随着规模扩大到32B,增益扩大至3.0分。此外,通过将对齐任务委托给感知器,DPA在3个文本基准上实现了32.9\%的语言能力遗忘减少。我们进一步证明这些增益在不同LLM家族中保持一致,包括Qwen3和LLaMA 3.2,突显了我们方法的通用性。除了性能,DPA还为当前VLM开发提供了无缝升级路径,只需对视觉编码器进行模块化替换,计算开销微小。

英文摘要

Most Vision Language Models (VLMs) directly map outputs from ViT encoders to the LLM via a lightweight projector. While effective, recent analysis suggests this architecture suffers from an alignment challenge: visual features remain distant from the text space in the initial layers of the LLM, forcing the model to waste critical depth~\cite{zhang-etal-2024-investigating,artzy-schwartz-2024-attend} on superficial modality alignment rather than deep understanding and complex reasoning. In this work, we propose Deep Pre-Alignment (DPA), a novel architecture that replaces the standard ViT encoder with a small VLM as perceiver, ensuring visual features are deeply aligned with the text space of the target large language model. Comprehensive experiments demonstrate the effectiveness of DPA. On the 4B parameter scale, DPA outperforms baselines by 1.9 points across 8 multimodal benchmarks, with gains widening to 3.0 points at the 32B scale. Moreover, by offloading alignment to the perceiver, DPA achieves a 32.9\% reduction in language capability forgetting over 3 text benchmarks. We further demonstrate that these gains are consistent across different LLM families including Qwen3 and LLaMA 3.2, highlighting the generality of our approach. Beyond performance, DPA also offers a seamless upgrade path for current VLM development, requiring only a modular replacement for the visual encoder with marginal computation overhead.

2605.15298 2026-05-18 cs.RO cs.AI cs.CL cs.CV

PhysBrain 1.0 Technical Report

PhysBrain 1.0 技术报告

Shijie Lian, Bin Yu, Xiaopeng Lin, Changti Wu, Hang Yuan, Xiaolin Hu, Zhaolong Shen, Yuzhuo Miao, Haishan Liu, Yuxuan Tian, Yukun Shi, Cong Huang, Kai Chen

AI总结 PhysBrain 1.0 通过将大规模人类自体视频转化为结构化的物理常识监督,提升机器人适应能力,在多模态问答和具身控制基准测试中取得SOTA结果,尤其在SimplerEnv中表现突出。

Comments Project Page: https://phys-brain.github.io

详情
AI中文摘要

视觉-语言-动作模型快速发展,但机器人轨迹单独学习广泛物理理解有限。PhysBrain 1.0研究了一种互补方法:将大规模人类自体视频转换为结构化的物理常识监督,再用于机器人适应。我们的数据引擎提取场景元素、空间动态、动作执行和深度感知关系,将其转化为问题-答案监督训练PhysBrain VLMs。所得物理先验通过保留能力且语言敏感的适应设计转移至VLA策略。在多模态问答基准和具身控制基准,包括ERQA、PhysBench、SimplerEnv-WidowX、LIBERO和RoboCasa中,PhysBrain 1.0取得SOTA结果,尤其在SimplerEnv中表现突出。这些结果表明,从人类交互视频中扩展物理常识能有效连接多模态理解与机器人动作。

英文摘要

Vision-language-action models have advanced rapidly, but robot trajectories alone provide limited coverage for learning broad physical understanding. PhysBrain 1.0 studies a complementary route: converting large-scale human egocentric video into structured physical commonsense supervision before robot adaptation. Our data engine extracts scene elements, spatial dynamics, action execution, and depth-aware relations, then turns them into question-answer supervision for training PhysBrain VLMs. The resulting physical priors are further transferred to VLA policies through a capability-preserving and language-sensitive adaptation design. Across multimodal QA benchmarks and embodied control benchmarks, including ERQA, PhysBench, SimplerEnv-WidowX, LIBERO, and RoboCasa, PhysBrain 1.0 achieves SOTA results and shows especially strong out-of-domain performance on SimplerEnv. These results suggest that scaling physical commonsense from human interaction video can provide an effective bridge from multimodal understanding to robot action.

2605.15295 2026-05-18 cs.LG cs.AI cs.CY

GESD: Beyond Outcome-Oriented Fairness

GESD:超越以结果为导向的公平性

Gideon Popoola, John Sheppard

AI总结 本文提出GESD,一种以过程为导向的公平性度量,用于衡量模型解释在不同保护类别子组中的稳定性、鲁棒性和敏感性差异。通过多目标优化框架FEU,提升公平性和实用性。

Comments 7 pages, Accepted at IEEE CAI

详情
AI中文摘要

机器学习(ML)算法日益应用于高风险决策领域,如贷款审批、招聘和再犯预测。尽管现有公平性度量(如统计平等、等机会)能有效量化结果导向的不平等,但对偏见决策的过程或解释缺乏洞察。为此,我们提出组级解释稳定性不平等(GESD),一种以过程为导向的公平性度量,衡量不同保护类别子组中模型解释稳定性、鲁棒性和敏感性的差异。GESD是解释器无关、模型无关的,并将公平性分析扩展到可解释性层面。我们进一步将GESD整合到多目标优化框架中,联合优化效用、基于结果的公平性和基于解释的公平性,称为FEU(公平性-可解释性-效用)。在多个基准数据集上的实验证明,GESD有效捕捉了组间解释质量的差异,且FEU在效用和公平性方面优于现有方法。通过连接基于结果和基于解释的公平性,GESD提供了一种全面的工具,用于诊断和减轻预测建模中的偏见。我们的代码和数据集可在GitHub上获得(https://github.com/horlahsunbo/GESD)

英文摘要

Machine learning (ML) algorithms are increasingly deployed in high-stakes decision-making domains such as loan approvals, hiring, and recidivism predictions. While existing fairness metrics (e.g., statistical parity, equal opportunity) effectively quantify outcome-oriented disparities, they offer limited insight into the procedure or explanation behind biased decisions. To address this gap, we propose Group-level Explanation Stability Disparity (GESD), a \textit{procedural-oriented} fairness metric that measures disparities in the stability, robustness, and sensitivity of model explanations across different subgroups in a protected category. %GESD is explainer-agnostic, model-agnostic, and extends the scope of fairness analyses to the level of explainability. We further integrate GESD into a multi-objective optimization framework that jointly optimizes for utility, outcome-based fairness, and explanation-based fairness called FEU (Fairness--Explainability--Utility). Empirical results on multiple benchmark datasets show that GESD effectively captures group-wise discrepancies in explanation quality, and that FEU improves both utility and fairness over state-of-the-art methods. By bridging outcome-based and explanation-based fairness, GESD offers a comprehensive tool for diagnosing and mitigating bias in predictive modeling. Our code and datasets are available on GitHub {\hyperlink{https://github.com/horlahsunbo/GESD}{https://github.com/horlahsunbo/GESD}}

2605.15290 2026-05-18 cs.LG cs.AI

GQA-μP: The maximal parameterization update for grouped query attention

GQA-μP:组查询注意力的最大参数更新

Kyle R. Chickering, Huijuan Wang, Mengxi Wu, Alexander Moreno, Muhao Chen, Xuezhe Ma, Daria Soboleva, Joel Hestness, Zhengzhong Liu, Eric Xing

AI总结 本文基于谱特征学习观点,提出组查询注意力的最大参数更新方法,通过数学分析实现参数转移,解决了新模型架构下的参数更新难题。

Comments 18 pages

详情
AI中文摘要

超参数在不同模型架构间的转移显著减少了调整大型语言模型(LLMs)所需的计算量。最大更新参数化(μP)通过原则性的数学分析确保转移,但对新模型架构的推导可能具有挑战性。基于Yang等人(2023a)的谱特征学习观点,我们做出了两项进展。首先,我们将权重的谱范数条件从启发式方法提升到特征学习的定义,从而推导出Complete-P深度和权重衰减缩放,而无需依赖懒学习。其次,我们考虑了一种修改的谱范数,该范数在权重矩阵非满秩时保持网络权重的有效缩放定律。这使我们能够(到目前为止)推导出组查询注意力(GQA)的μP缩放。我们通过展示学习率在GQA重复超参数上的转移以及关于权重衰减的实验,证明了我们理论推导的有效性。

英文摘要

Hyperparameter transfer across model architectures dramatically reduces the amount of compute necessary for tuning large language models (LLMs). The maximal update parameterization (μP) ensures transfer through principled mathematical analysis but can be challenging to derive for new model architectures. Building on the spectral feature-learning view of Yang et al. (2023a), we make two advances. First, we promote spectral norm conditions on the weights from a heuristic to the definition of feature learning, and as a consequence arrive at the Complete-P depth and weight-decay scalings without recourse to lazy-learning. Second, we consider a modified spectral norm that preserves the valid scaling law of network weights when weight matrices are not full rank. This enables (to our knowledge, the first) derivation of μP scalings for grouped-query attention (GQA). We demonstrate the efficacy of our theoretical derivations by showing learning rate transfer across the GQA repetition hyperparameter as well as experiments regarding transfer over weight decay.

2605.15285 2026-05-18 cs.LG cs.AI cs.NA math.FA math.NA math.OC

Universal Approximation of Nonlinear Operators and Their Derivatives

非线性算子及其导数的通用逼近

Filippo de Feo

AI总结 本文提出通过运算学习架构证明非线性算子及其导数的通用逼近定理,扩展了经典结果到无限维空间,并探讨了其在高阶精度、约束优化和无限维PDE数值方法中的应用。

详情
AI中文摘要

导数引导的算子学习(DIOL),即学习非线性算子及其导数,是运算学习(OL)基础领域中的开放研究前沿。特别是非线性算子及其导数的通用逼近定理(UAT)是非线性泛函分析中的基础性开放问题和精细问题。本文证明了非线性k次可微算子在巴纳赫空间之间及其导数的首个通用逼近定理,统一在紧集上和加权Sobolev范数中,适用于一般有限输入测度。我们的结果是首次将经典结果[1991]扩展到无限维设置和OL。我们讨论了DIOL和UATs的应用领域:OL中的高阶精度、Banach空间中的快速约束优化(如PDE最优控制、反问题)和无限维PDE的数值方法(如来自PDE最优控制的HJB PDEs在Banach空间、SPDEs、路径依赖系统、部分观测系统、均场控制)。我们通过编码器-解码器架构参数化非线性算子,这些架构因其通用性而著名,包括经典架构如DeepONets、Deep-H-ONets、PCA-Nets。我们的结果基于四个关键特性,使我们能够证明UATs的全面通用性:(i)巴纳赫空间的逼近性质。(ii)Bastiani意义下的k次连续可微性(弱于Fréchet意义下的k次连续可微性)。(iii)自然的紧-开拓扑用于UA;确实,我们显示在标准紧-开拓扑诱导的算子范数下,即使对于Fréchet导数,UA也遭到破坏。(iv)为UA构造新的加权Sobolev空间。

英文摘要

Derivative-Informed Operator Learning (DIOL), i.e. learning a (nonlinear) operator and its derivatives, is an open research frontier at the foundations of the influential field of Operator Learning (OL). In particular, Universal Approximation Theorems (UATs) of nonlinear operators and their derivatives are foundational open questions and delicate problems in nonlinear functional analysis. In this manuscript, we prove the first UATs of non-linear $k$-times differentiable operators between Banach spaces and their derivatives, uniformly on compact sets and in weighted Sobolev norms for general finite input measures, via OL architectures. Our results are the first complete generalizations of the corresponding influential classical results in [Hornik, 1991] to infinite-dimensional settings and OL. We discuss several open areas where DIOL and our UATs find applications: high-order accuracy in OL, fast constrained optimization in Banach spaces (e.g. optimal control of PDEs, inverse problems) and numerical methods for infinite-dimensional PDEs (e.g. HJB PDEs on Banach spaces from optimal control of PDEs, SPDEs, path-dependent systems, partially observed systems, mean-field control). We parameterize nonlinear operators via Encoder-Decoder Architectures, renowned classes in OL due to their generality, including classical architectures, such as DeepONets, Deep-H-ONets, PCA-Nets. Our results are based on four key features that allow us to prove UATs in full generality: (i) Approximation Properties of Banach spaces. (ii) $k$-times continuous differentiability in the sense of Bastiani (weaker than $k$-times continuous Fréchet differentiability). (iii) Natural compact-open topologies for UA; indeed, we show that UA in standard compact-open topologies induced by operator norms is violated even for Fréchet derivatives. (iv) Construction of novel weighted Sobolev spaces for the UA.

2605.15284 2026-05-18 cs.LG

Tadpole: Autoencoders as Foundation Models for 3D PDEs with Online Learning

Tadpole:用于3D偏微分方程的自动编码器作为基础模型的在线学习

Qiang Liu, Felix Koehler, Benjamin Holzschuh, Nils Thuerey

AI总结 Tadpole通过在线数据生成框架预训练自动编码器,学习跨异构物理系统的丰富可迁移表示,支持高维扩展和多任务应用,包括动态学习和生成建模。

详情
AI中文摘要

我们介绍了Tadpole,一种新的三维偏微分方程(PDE)基础模型,解决了可迁移性、高维可扩展性和多功能性等关键挑战。Tadpole在由高效在线数据生成框架生成的合成3D PDE数据上预训练为自动编码器。这使得能够进行大规模、多样化的训练,无需存储或I/O开销,通过扩展到相当于数百TB的训练数据进行演示。通过自动编码单通道空间裁剪,Tadpole在具有不同状态变量数量和空间分辨率的异构物理系统中学习丰富的、可迁移的表示。尽管仅预训练为自动编码器,Tadpole可以高效地应用于多种下游任务,包括动态学习和生成建模。对于动态学习,我们提出了一种新颖的参数高效微调策略,结合低秩适应、潜在空间转换和重新引入的跳跃连接,以最小的可训练参数数量实现精确的时间建模。Tadpole在各种下游任务中展示了强大的微调性能,突显了其作为3D PDE学习基础模型的通用性和有效性。Tadpole的源代码和预训练权重可在https://github.com/tum-pbs/tadpole获取。

英文摘要

We introduce Tadpole, a novel foundation model for three-dimensional partial differential equations (PDEs) that addresses key challenges in transferability, scalability to high dimensionality, and multi-functionality. Tadpole is pre-trained as an autoencoder on synthetic 3D PDE data generated by an efficient online data-generation framework. This enables large-scale, diverse training without storage or I/O overhead, demonstrated by scaling to an equivalent of hundreds of terabytes of training data. By autoencoding single-channel spatial crops, Tadpole learns rich and transferable representations across heterogeneous physical systems with varying numbers of state variables and spatial resolutions. Although pre-trained solely as an autoencoder, Tadpole can be efficiently applied for multiple downstream tasks beyond reconstruction, including dynamics learning and generative modeling. For dynamics learning, we propose a novel parameter-efficient fine-tuning strategy that integrates low-rank adaptation, latent-space transformations, and reintroduced skip connections, achieving accurate temporal modeling with a minimal number of trainable parameters. Tadpole demonstrates strong fine-tuning performance across various downstream tasks, highlighting its versatility and effectiveness as a foundation model for 3D PDE learning. Source code and pre-trained weights of Tadpole are available at https://github.com/tum-pbs/tadpole

2605.15282 2026-05-18 cs.CL

Fluency and Faithfulness in Human and Machine Literary Translation

Sarah Griebel, Ted Underwood

AI总结 本研究探讨了文学翻译中流畅性与忠实度之间的关系,分析了106部小说中13万余段人工及机器翻译文本。通过自动评估方法,发现流畅性与忠实度存在显著负相关,且该现象在人类翻译和谷歌翻译中尤为明显,而TranslateGemma则表现出较弱的相关性。研究结果表明,在文学翻译中,提升流畅性可能以牺牲忠实度为代价,且评估结果受文本长度影响。

Comments Accepted NLP4DH 2026

详情
英文摘要

Literary translation requires balancing target-language fluency with faithfulness to the source. Recent large language models (LLMs) often produce fluent translations, but it remains unclear whether fluency corresponds to semantic preservation in literary text. We examine this relationship using 130,486 translated paragraphs from 106 novels in 16 source languages, including human, Google Translate, and TranslateGemma translations. Fluency is measured as original-likeness with a translationese classifier trained on paragraph part-of-speech n-grams, and faithfulness with the automatic translation evaluation metric COMET-KIWI. We control for paragraph length and find a consistent negative correlation between fluency and faithfulness. The pattern appears for both human and Google Translate, but is weaker and often non-significant for TranslateGemma. These results show that segment length matters for automatic evaluation and suggest a tradeoff between fluency and faithfulness in literary translation.

2605.15257 2026-05-18 cs.LG

Training on Documents About Monitoring Leads to CoT Obfuscation

Reilly Haskins, Bilal Chughtai, Joshua Engels

AI总结 本文研究了模型在了解监控机制的情况下是否会通过隐藏其推理过程来逃避检测。研究者通过合成文档微调的方式,使八种模型接触描述思维链(CoT)监控的预训练风格文档,发现具备监控意识的模型在逃避检测方面的表现显著优于无意识的对照组。研究还表明,模型的思维链可控性与其成功隐藏推理的能力高度相关,并且具备监控意识的模型在同等强化学习压力下更快学会规避监控。这些结果表明,监控知识与高思维链可控性的结合可能对基于CoT的监控系统构成潜在风险。

详情
英文摘要

Chain-of-thought (CoT) monitoring is one of the most promising tools we have for detecting model misbehavior, but its effectiveness depends on models faithfully externalizing their reasoning. Motivated by this vulnerability, we study whether monitor-aware models are capable of obfuscating their reasoning to evade detection. We use synthetic document finetuning to expose eight models to realistic pre-training-style documents describing a CoT monitor and find that monitor-aware models consistently achieve higher rates of undetected misbehavior compared to unaware controls. This effect is weaker but still present on a harder agentic task. We also show that CoT controllability, a model's ability to reshape its own reasoning trace under an imposed constraint, is closely correlated with obfuscation success across the eight models studied ($r=0.800$, $p=0.017$). Monitor-aware models placed under equal reinforcement learning optimization pressure also learn to reward-hack without triggering a CoT monitor substantially faster than unaware controls. Together, these results suggest that knowledge of monitoring combined with high CoT controllability poses a risk to CoT-based monitoring.

2605.15256 2026-05-18 cs.CV

ReactiveGWM: Steering NPC in Reactive Game World Models

Zeqing Wang, Danze Chen, Zhaohu Xing, Zizhao Tong, Yinhan Zhang, Xingyi Yang, Yeying Jin

AI总结 当前游戏世界模型多从玩家视角出发,将非玩家角色(NPC)仅视为背景像素,难以捕捉玩家与NPC之间的互动。为此,本文提出ReactiveGWM,一种能够模拟玩家与NPC动态交互的反应型游戏世界模型。该模型通过解耦玩家控制与NPC行为,并引入轻量级偏差注入和跨注意力模块,实现了对NPC高层策略(如进攻、防守)的灵活响应,且无需针对具体游戏进行再训练,具备跨游戏的零样本策略迁移能力。

Comments The code is available at https://inv-wzq.github.io/ReactiveGWM/

详情
英文摘要

Current game world models simulate environments from a subjective, player-centric perspective. However, by treating the Non-Player Character (NPC) merely as background pixels, these models cannot capture interactions between the player and NPC. In that sense, they act as passive video renderers rather than real simulation engines, lacking the physical understanding needed to model action-induced NPC reactivities. We introduce ReactiveGWM, a reactive game world model that synthesizes dynamic interactions between the player and NPC. Instead of entangling all interaction dynamics, ReactiveGWM explicitly decouples player controls from NPC behaviors. Player actions are injected into the diffusion backbone via a lightweight additive bias, while high-level NPC responses (e.g., Offense, Control, Defense) are grounded through cross-attention modules. Crucially, these modules learn a game-agnostic representation of interactive logic. This enables zero-shot strategy transfer: our learned modules can be plugged directly into off-the-shelf, unannotated world models of different games. This instantly unlocks steerable NPC interactions without any domain-specific retraining. Evaluated on two Street Fighter games, ReactiveGWM maintains fine-grain player controllability while achieving robust, prompt-aligned NPC strategy adherence, paving the way for scalable, strategy-rich interaction with the NPC.

2605.15254 2026-05-18 cs.LG

Curriculum Learning of Physics-Informed Neural Networks based on Spatial Correlation

Xujia Chen, Xinyue Hu, Letian Chen, Daming Shi, Wenhui Fan

AI总结 本文针对物理信息神经网络(PINNs)在求解偏微分方程时面临的训练不稳定、多目标约束不平衡及信息传播效率低等问题,提出了一种基于空间相关性的课程学习框架。该方法通过空间因果权重引导边界附近区域的信息向内传播,利用低频信息桥增强空间分离区域的一致性,并采用区域自适应重加权策略优化局部残差,从而有效提升训练稳定性和解的精度。实验表明,在相近计算成本下,该方法显著改善了PINNs的训练效果。

Comments 37 pages, 14 figures, 9 tables

详情
英文摘要

Physics-Informed Neural Networks (PINNs) combine deep learning with physical constraints for solving partial differential equations (PDEs), and are widely applied in fluid mechanics, heat transfer, and solid mechanics. However, PINN training still suffers from high-dimensional non-convex loss landscapes, imbalanced multiobjective constraints, and ineffective information propagation. Existing curriculum learning and causality-guided strategies improve training stability, but mainly focus on temporal or parametric progression, lacking explicit treatment of spatial information propagation and inter-region consistency. Moreover, they are not directly applicable to boundary value problems (BVPs) with strong spatial coupling. To address this issue, we propose a spatially correlated curriculum learning framework for PINNs. To the best of our knowledge, this is the first work to address PINN training difficulties from the perspective of spatial coupling among subregions. First, spatial causal weights guide information from near-boundary regions inward, reducing optimization failures and spurious convergence. Second, a low-frequency information bridge enforces pseudo-label-based consistency across spatially separated regions, suppressing global low-frequency drift. Third, a region-adaptive reweighting strategy adjusts subregion losses to reduce local residuals and recover high-frequency details. Experiments on PDE benchmarks show that, under comparable computational cost, the proposed method alleviates training failures and improves solution accuracy. The code is available at https://github.com/pigofmomo/CurriculumLearningPINN.

2605.15253 2026-05-18 cs.LG

Position: Ideas Should be the Center of Machine Learning Research

Jairo Diaz-Rodriguez

AI总结 本文指出当前机器学习研究日益分化为追求指标优化的工程实践和脱离实际的理想化理论,忽视了研究的核心应是“想法”本身。作者提出“以想法为中心”的研究框架,强调通过设计针对性实验验证想法在现代模型中的行为特征,而非单纯追求榜单成绩。这一转变有助于弥合理论与实践之间的差距,同时促进研究公平性,使资源有限的研究者也能做出严谨的科学贡献。

Comments Accepted into ICML 2026 https://icml.cc/virtual/2026/poster/67144

详情
英文摘要

Machine learning research increasingly bifurcates into two disconnected modes: benchmark-driven engineering that prioritizes metrics over understanding, and idealized theory that often fails to transfer to modern systems. In this position paper, we argue that the field focuses too heavily on these endpoints, neglecting the central scientific object: the idea. We propose an Ideas First framework in which ideas are valued for the behavioral signatures they predict in modern models, and these signatures are tested through tailored experiments designed to detect the relevant patterns rather than to win leaderboards. This shift not only bridges the gap between theory and practice but also promotes equity by removing the "complexity premium," enabling rigorous scientific contributions from researchers with modest computational, financial, and human resources. Ultimately, we advocate for a research culture centered on ideas, treating benchmarks and theorems as instruments for testing mechanistic hypotheses rather than as ends in themselves.

2605.15252 2026-05-18 cs.LG cs.AI eess.SP

PDRNN: Modular Data-driven Pedestrian Dead Reckoning on Loosely Coupled Radio- and Inertial-Signalstreams

Peter Bauer, Andreas Porada, Felix Ott, Christopher Mutschler, Tobias Feigl

AI总结 本文提出了一种名为PDRNN的模块化数据驱动行人航位推算系统,用于处理松耦合的无线电与惯性传感器信号流。该方法基于简单循环神经网络架构,能够隐式预测不同估计方法下的异步传感器数据流,并通过独立的机器学习模型分别估计姿态、速度和位置等关键参数及其方差,最终融合模型结合这些输出以提升系统鲁棒性。实验表明,PDRNN在动态运动数据上的精度和稳定性优于传统方法和现有机器学习方法,同时具备更好的组件控制能力和预测能力。

Comments 12 pages

详情
Journal ref
IEEE/ION Position, Location and Navigation Symposium (PLANS), Salt Lake City, UT, May 2025
英文摘要

Modern pedestrian dead reckoning (PDR) systems rely on fusing noisy and biased estimates of position, velocity, and calibrated orientation derived from loosely coupled sensors to determine the current pose of a localized object. However, discrepancies in the sampling rates of sensor-specific estimation methods and unreliable transmission pose significant challenges. And traditional methods often fail to effectively fuse multimodal sensor data during dynamic movements characterized by high accelerations, velocities, and rapidly varying orientations. To address these limitations, we propose a simple recurrent neural network (RNN) architecture capable of implicitly forecasting asynchronous sensor data streams from diverse estimation methods along reference trajectories. The proposed approach introduces PDRNN, a modular hybrid AI-assisted PDR system that handles each component as an independent ensemble of machine learning (ML) models to estimate both key parameter means and variances. Separate ML-based models are employed to estimate orientation, (un)directed velocity or distance from acceleration and gyroscope data, with optional absolute positioning from synchronized radio systems such as 5G for stabilization. A final fusion model combines these outputs, position, velocity, and orientation, while using uncertainty estimates to enhance system robustness. The modular design allows individual components to be updated, fine-tuned, or replaced without affecting the entire system. Experiments on dynamic sports movement data show that PDRNN achieves superior accuracy and precision compared to classic and ML-based methods, effectively avoiding error accumulation common in black-box approaches. And PDRNN offers forecast capabilities and better component control despite increased system complexity.

2605.15246 2026-05-18 cs.LG

Privacy Evaluation of Generative Models for Trajectory Generation

Stavros Bouras, Ioannis Kontopoulos, Chiara Pugliese, Francesco Lettich, Emanuele Carlini, Hanna Kavalionak, Chiara Renso, Konstantinos Tserpes

AI总结 轨迹数据在现代城市智能中具有重要作用,但其敏感性也带来了显著的隐私风险。本文研究了生成模型在轨迹生成任务中的隐私保护问题,指出现有生成模型虽然能够生成符合时空分布和移动模式的合成轨迹数据,但其生成特性并不意味着隐私得到保障。通过实施成员推理攻击,作者揭示了生成轨迹模型在隐私保护方面的评估缺口,并证明其仍存在潜在的隐私泄露风险。

Comments Accepted at the 1st Workshop on Multi-Sensor Trajectory Knowledge Discovery and Extraction (MuseKDE 2026), co-located with the 27th IEEE International Conference on Mobile Data Management (IEEE MDM 2026)

详情
英文摘要

Trajectory data is fundamental to modern urban intelligence, yet its sensitivity raises significant privacy concerns. Generative models such as Generative Adversarial Networks, Variational Autoencoders, and Diffusion Models have been developed to generate realistic synthetic trajectory data by capturing underlying spatiotemporal distributions and mobility patterns. Although these models are often assumed to preserve privacy due to their generative nature, this assumption does not necessarily hold. In this work, we investigate the intersection of generative trajectory modeling and privacy evaluation. By identifying applicable empirical methods for assessing privacy preservation in trajectory generation tasks, we demonstrate a significant gap in the evaluation of privacy for generative trajectory models. Motivated by this gap, we implement Membership Inference Attacks against representative models, demonstrating the feasibility of using such empirical privacy evaluation methods and showing that their generative nature does not eliminate privacy risks.

2605.15243 2026-05-18 cs.LG cs.AI q-bio.BM q-bio.MN q-bio.QM

Reading the Cell, Designing the Cure: Perturbation-Conditioned Molecular Diffusion for Function-Oriented Drug Design

Ziyu Xu, Zijian Zhang, Liang Wang, Zhiyuan Liu, Qiang Liu, Shu Wu, Liang Wang

AI总结 该研究提出了一种基于转录组的药物设计方法(TBDD),旨在根据期望的基因表达变化生成具有特定功能的分子。为了解决生物学与化学领域间的巨大差异以及转录组信号稀疏性带来的挑战,研究设计了多尺度的扩散生成模型CURE,其核心模块TFE能够提取功能导向的扰动特征,并跨模态对齐化学结构信息,从而生成结构合理且功能一致的候选药物分子。实验表明,该方法在多个基准测试中表现优异,并在零样本基因抑制剂设计任务中验证了其实际应用潜力。

详情
英文摘要

When reliable target structures are unavailable at scale or phenotypes arise from dysregulated pathways, transcriptomic perturbations provide a system-level functional readout for drug action. In this work, we formalize \emph{Transcriptome-based Drug Design (TBDD)} as a generative inverse problem: designing drug molecules conditioned on desired transcriptomic state transitions. We analyze the inherently ill-posed nature of this task, which is further complicated by the profound domain gap between biology and chemistry and by the sparsity of transcriptomic signals. To address these challenges, we propose \textbf{\themodel{}} (A \textbf{C}ell\textbf{U}lar \textbf{R}esponse \textbf{E}ngine), a multi-resolution transcriptome-guided diffusion framework. \themodel{} features a specialized \textbf{Transcriptome Perturbation Functional Feature Extractor (TFE)} that (1) distills function-oriented perturbation embeddings from pre/post states, (2) aligns these signatures to dual chemical views to bridge the cross-modal gap, and (3) performs heterogeneity-aware aggregation to extract robust state-specific signals from noisy transcriptomic data. Extensive evaluations on both standard benchmarks and rigorous out-of-distribution protocols demonstrate that \themodel{} consistently outperforms strong baselines in structural quality and functional consistency. Furthermore, we validate its practical utility via a zero-shot gene-inhibitor design task, highlighting the potential of phenotype-driven generative discovery.

2605.15242 2026-05-18 cs.LG

Logical Grammar Induction via Graph Kolmogorov Complexity: A Neuro-Symbolic Framework for Self-Healing Clinical Data Integrity

Abolfazl Zarghani, Amir Malekesfandiari

AI总结 本文提出了一种名为Logic-GNN的神经符号框架,用于解决医疗信息系统中由人为错误引起的临床数据完整性问题。该方法将临床记录视为受潜在逻辑规则支配的结构化“私有语言”,结合时序图神经网络与图 Kolmogorov 复杂度,推导出描述医疗交互逻辑的符号语法规则,并将异常定义为违反该语法导致图描述长度显著增加的情况。实验表明,该方法在区分医疗异常与数据错误方面表现出色,F1 分数达到 0.94,优于现有方法,并具备实时自我修复功能以维护数据完整性。

详情
英文摘要

The reliability of Healthcare Information Systems (HIS) is frequently compromised by human-induced data entry errors, which existing statistical anomaly detection methods fail to distinguish from legitimate clinical extremes. This paper proposes Logic-GNN, a novel neuro-symbolic framework that treats clinical records as a structured ``private language'' governed by latent logical games. By integrating Temporal Graph Neural Networks (TGNN) with Graph Kolmogorov Complexity, we induce a symbolic grammar that represents the underlying logic of medical interactions. We define anomalies as ``grammatical violations'' that cause a significant expansion in the Minimum Description Length (MDL) of the clinical graph. Evaluated on the Sina System dataset (2M+ records), Logic-GNN achieves an F1-score of 0.94, outperforming state-of-the-art baselines by 12\% in distinguishing between life-threatening medical outliers and data corruption. Our approach introduces a self-healing mechanism that suggests logical corrections to maintain data integrity in real-time HIS environments.

2605.15235 2026-05-18 cs.LG

MuteBench: Modality Unavailability Tolerance Evaluation for Incomplete Multimodal Fusion

Wugeng Zheng, Ziwen Kan, Tianlong Chen, Chen Chen, Song Wang

AI总结 MuteBench 是一个用于评估不完整多模态融合系统在模态缺失情况下的鲁棒性的基准,涵盖了7个临床领域的9个数据集、6种融合架构和两种缺失数据模式。研究发现,架构类型是影响系统鲁棒性的最主要因素,而通道独立模型在处理模态缺失时表现较好,但在处理模态内缺失时可能存在问题。该基准为临床AI系统的设计与选择提供了重要参考。

详情
英文摘要

Multimodal physiological data powers clinical AI systems from intensive care units to wearable devices, but sensors routinely fail in practice. Two failure modes are common: modality missing, where an entire channel is absent, and within-modality missing, where a contiguous time segment is lost. No existing benchmark evaluates multiple fusion architectures under both failure modes at controlled severity levels across diverse clinical datasets. We present MuteBench, a benchmark covering 9 datasets from 7 clinical domains, 6 fusion architectures, and 2 missing-data modes over 125,000 samples. Through this benchmark, we find that architecture family is the strongest predictor of robustness, outweighing parameter count. Channel-independent models tolerate modality missing well but can be sensitive to within-modality missing, especially on short sequences. Curriculum modality dropout protects reliably only up to the maximum dropout rate used in training. We also find that channel count, sequence length, and modality alignment jointly determine which failure mode poses the greater threat. Finally, a PTB-XL case study suggests that diffusion-based imputation can improve downstream classification under within-modality missing, with the largest gains for models whose expert routing is most sensitive to corrupted inputs, though broader validation across datasets remains an open direction. MuteBench provides practitioners with concrete guidance for both selecting existing architectures and informing the design of future robust multimodal fusion methods.

2605.15231 2026-05-18 cs.LG cs.CV

Mask-Morph Graph U-Net: A Generalisable Mesh-Based Surrogate for Crashworthiness Field Prediction under Large Geometric Variation

Haoran Li, Tobias Lehrer, Yingxue Zhao, Haosu Zhou, Philipp Stocker, Tobias Pfaff, Nan Li

AI总结 该研究提出了一种名为Mask-Morph Graph U-Net(MMGUNet)的新型图神经网络模型,旨在解决在几何变化较大的情况下,基于图神经网络的碰撞安全性场预测模型泛化能力不足的问题。该方法通过特征对齐的重心参数化技术对粗化图结构进行形态变换,以保持空间对应关系,同时结合节点掩码预训练和参数高效的微调策略,提升模型在不同输入网格上的预测精度和数据效率。实验表明,该模型在多种测试场景下均优于现有方法,为碰撞安全性设计的高效仿真提供了可行的替代方案。

Comments 48 pages, 15 figures, jounral paper to be submitted

详情
英文摘要

Nonlinear finite element crash simulations are accurate but computationally expensive, limiting their use in iterative design optimisation. Machine-learning surrogate models based on graph neural networks (GNNs) offer a faster alternative. Message-passing GNNs are widely used for mesh simulation, and their shared node and edge update functions are relatively generalisable across varying graph structures. By contrast, non-shareable edge-specific aggregation layers can capture nonlinear relationships more accurately but usually require fixed graph connectivity, which limits generalisability. This paper presents Mask-Morph Graph U-Net (MMGUNet), a practical approach to addressing the limitation of hierarchical Graph U-Net architectures that use edge-specific downsampling and upsampling layers. Fixed coarse graph connectivity is required for edge-specific layers. To retain this while improving spatial correspondence, the proposed method morphs the coarsened graph hierarchy to each input mesh using feature-aligned barycentric parameterisation before constructing cross-graph edges. It further applies node masking during supervised pretraining, followed by parameter-efficient fine-tuning in which high-parameter edge-specific layers are frozen. The proposed approach is evaluated in in-distribution, out-of-distribution, and cross-component transfer settings using mean Euclidean distance and maximum intrusion percentage error. Results show that coarse-graph morphing improves test accuracy relative to a fixed-coarse-graph baseline, while masked supervised pretraining reduces the train-test discrepancy and improves data efficiency during transfer. The proposed model also achieves lower prediction error compared with external baselines. These results demonstrate a practical route toward reusable, data-efficient mesh-based surrogate modelling for crashworthiness design exploration.

2605.15228 2026-05-18 cs.AI cs.LG

Verifiable Agentic Infrastructure: Proof-Derived Authorization for Sovereign AI Systems

Jun He, Deying Yu

AI总结 本文研究了主权AI系统中自主智能体执行操作时的授权验证问题,提出了一种基于可信证明的分布式授权框架(DTF)。该框架通过结构化、可验证的证明对象来动态生成执行权限,确保所有高风险操作都必须基于共识验证的证明,并与证据链绑定,从而实现对智能体行为的可控、可审计和可追溯。该方法为云原生环境中的自主AI系统提供了安全、去中心化的授权基础设施。

Comments 19 pager, 2 figures, 4 tables

详情
英文摘要

Modern cloud and enterprise systems rely on identity-centric authorization, assuming that callers possessing valid credentials are safe to execute commands. The emergence of autonomous AI agents invalidates this assumption: agents can generate syntactically valid but semantically unsafe actions, making standing privileges a significant operational risk. This risk becomes especially acute in sovereign AI systems, where autonomous agents may interact with cloud infrastructure, regulated data, financial workflows, and national-scale digital services. Governed mutation substrates reduce this risk by interposing on agent actions: agents submit intents, infrastructure evaluates context and policy, and execution is mediated. However, this shifts the trust boundary: how can the decision to authorize an intent be made verifiable, distributed, and replayable? We introduce a Distributed Trust Framework (DTF), a verification framework for governed mutation systems that computes execution authority from structured, verifiable artifacts. DTF introduces a Justification Proof to encode the admissibility basis of an action, a consensus model for independent evaluation, an ephemeral Execution Identity derived from the approved proof, and an append-only Evidence Chain that preserves the authorization lifecycle. Under stated substrate assumptions, this architecture enforces a compact authorization invariant: no high-stakes execution without a proof object, no derived authority without consensus, and no valid mutation detached from evidence. We define the model, instantiate it over an OpenKedge-based governed mutation substrate, and show how it maps onto cloud-native environments. By shifting authorization from standing identity to proof-derived authority, DTF provides an infrastructure foundation for making agentic execution governable, auditable, and bounded in sovereign AI deployments.

2605.15227 2026-05-18 cs.AI cond-mat.mtrl-sci cs.RO

NIMO Controller: a self-driving laboratory orchestrator based on the Model Context Protocol

Naruki Yoshikawa, Ryo Tamura

AI总结 本文提出了一种基于模型上下文协议(MCP)的自主驾驶实验室(SDL)控制架构——NIMO Controller,旨在解决现有SDL软件框架缺乏标准化接口、难以支持AI代理的问题。该架构通过MCP服务器统一暴露所有SDL功能,并提供了基于MCP工具发现的可视化编程接口,使用户无需编写代码即可设计实验流程,同时支持AI代理通过同一后端进行交互。研究通过颜色匹配实验验证了该架构的可行性与实用性。

Comments 9 pages, 4 figures

详情
英文摘要

Self-driving laboratories (SDLs) have attracted increasing attention as a means of accelerating scientific discovery; however, developing SDL software remains technically demanding. To improve accessibility, orchestration software frameworks have been proposed to coordinate SDL components. Nevertheless, existing frameworks are primarily designed for human interaction and do not provide standardized interfaces suitable for AI agents. In this work, we propose an SDL software architecture based on the Model Context Protocol (MCP), in which all SDL functionalities are exposed through MCP servers. Following this design principle, we introduce an MCP-based SDL orchestrator, named NIMO Controller. It provides a visual programming interface automatically generated through MCP-based tool discovery, allowing human users to design experimental workflows without writing code. The same MCP backend can also be accessed by AI agents, providing a unified interface for both human users and AI agents. We demonstrate the proposed system through a case study on a color-matching SDL. The results validate the usability of the proposed MCP-based SDL architecture.

2605.15224 2026-05-18 cs.AI cs.MA

ICRL: Learning to Internalize Self-Critique with Reinforcement Learning

Jianbo Lin, Xiaomin Yu, Yi Xin, Yifu Guo, Zhuosong Jiang, Zhongqi Yue, Weishi Wang, Heqing Zou, Chengwei Qin, Hui Xiong

AI总结 本文提出了一种基于强化学习的新型框架ICRL,旨在使大型语言模型在获得自我批评反馈后能够内化这些指导,从而在无外部批评的情况下仍能保持良好的表现。该框架通过联合训练求解器和批评者,利用批评反馈带来的性能提升作为奖励,促使批评者生成更有助于改进的反馈。为了解决批评条件行为与无批评行为之间的分布偏移问题,ICRL引入了分布校准的重加权策略,并通过角色分组优势估计稳定联合优化过程。实验表明,ICRL在多种任务中均取得了显著提升,且训练出的批评者在性能上可与更大规模的模型相媲美。

详情
英文摘要

Large language model-based agents make mistakes, yet critique can often guide the same model toward correct behavior. However, when critique is removed, the model may fail again on the same query, indicating that it has not internalized the critique's guidance into its underlying capability. Meanwhile, a frozen critic cannot improve its feedback quality over time, limiting the potential for iterative self-improvement. To address this, we propose learning to internalize self-critique with reinforcement learning(ICRL), a novel framework that jointly trains a solver and a critic from a shared backbone to convert critique-induced success into unassisted solver ability. The critic is rewarded based on the solver's subsequent performance gain, incentivizing actionable feedback. To address the distribution shift between critique-conditioned and critique-free behavior, ICRL introduces a distribution-calibration re-weighting ratio that selectively transfers critique-guided improvements compatible with the solver's own prompt distribution. Additionally, a role-wise group advantage estimation stabilizes joint optimization across the two roles. Together, these mechanisms ensure that the solver learns to improve itself without external critique, rather than becoming dependent on critique-conditioned behavior. We evaluate ICRL on diverse benchmarks spanning agentic and mathematical reasoning tasks, using Qwen3-4B and Qwen3-8B as backbones. Results show consistent improvements, with average gains of 6.4 points over GRPO on agentic tasks, and 7.0 points on mathematical reasoning. Notably, the learned 8B critic is comparable to 32B critics while using substantially fewer tokens. The code is available at https://github.com/brick-pid/ICRL.

2605.15220 2026-05-18 cs.CL cs.AI cs.LG

Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

Michael Y. Hu, Apurva Gandhi, Kyunghyun Cho, Tal Linzen, Pratyusha Sharma

AI总结 数据混合在语言模型训练中起着关键作用,决定了如何组合不同来源或类型的训练数据。本文提出了一种名为OP-Mix的高效数据混合算法,能够在整个语言模型训练生命周期中持续运行,解决了现有方法仅适用于单一训练阶段的问题。该方法通过在当前模型上训练低秩适配器并进行插值,低成本地模拟候选数据混合方案,从而避免了对代理模型的依赖,并始终基于模型的实际学习动态进行搜索。实验表明,OP-Mix在预训练、持续微调等任务中均能以更低的计算成本达到接近最优的性能。

详情
英文摘要

Data mixing decides how to combine different sources or types of data and is a consequential problem throughout language model training. In pretraining, data composition is a key determinant of model quality; in continual learning and adaptation, it governs what is retained and acquired. Yet existing data mixing methods address only one phase of this lifecycle at a time: some require smaller proxy models tied to a single training phase, others assume a fixed domain set, and continual learning lacks principled guidance altogether. We argue that data mixing is fundamentally an online decision making problem -- one that recurs throughout training and demands a single, unified solution. We introduce OP-Mix (On-Policy Mix), a data mixing algorithm that operates across the entire language model training lifecycle. Our main insight is that candidate data mixtures can be cheaply simulated by interpolating between low-rank adapters trained directly on the current model, eliminating separate proxy models and ensuring the search is always grounded in the model's actual learning dynamics. Across pretraining, continual midtraining, and continual instruction tuning, OP-Mix consistently finds near-optimal mixtures while using a fraction of the compute of the baselines. In pretraining, OP-Mix improves upon training without mixing by 6.3% in average perplexity. For continual learning, OP-Mix matches the performance of both retraining and on-policy distillation while using 66% and 95% less overall compute, respectively. OP-Mix suggests a different view of language model training: not a sequence of distinct phases, but a single continuous process of learning from data.

2605.15218 2026-05-18 cs.AI cs.CE

CAX-Agent: A Lightweight Agent Harness for Reliable APDL Automation

Chenying Lin, Yichen Hai, Yi He, Ran Wang, Haiyan Qiang, Liang Yu

AI总结 本文提出了一种轻量级的代理框架CAX-Agent,旨在提升MAPDL有限元仿真中的自动化可靠性。该框架通过引入领域特定的中间件,实现工具生命周期管理、工作流状态控制和故障恢复,从而解决大语言模型在该任务中常见的输出不一致和任务失败问题。实验评估表明,CAX-Agent中基于模型驱动的恢复策略在多个结构基准测试中表现出色,显著优于仅依赖规则或无恢复策略的方法。

Comments 8 pages, 6 figures, IEEE conference format

详情
英文摘要

Large language models deployed for MAPDL finite-element simulation face practical reliability challenges: without structured execution control, tool encapsulation, and fault recovery, outputs may be inconsistent and task failures are common. The Agent Harness paradigm addresses this by inserting domain-specific orchestration middleware that manages tool lifecycles, workflow state, and recovery escalation. This paper presents the architecture of CAX-Agent, a lightweight agent harness purpose-built for MAPDL automation, and empirically evaluates one of its core components -- the recovery policy.CAX-Agent organizes execution into three layers -- LLM service, agent harness, and solver backend -- with a recovery ladder that escalates from deterministic rule patching through model-driven regeneration to context enrichment and human intervention. We evaluate three recovery strategies (no_recovery, rule_only, and model_only) on 50 standard structural benchmarks with three repeated runs per strategy (450 case-runs total). Two independent human raters score task completion under blind conditions; inter-rater agreement is strong (quadratic weighted Cohen's kappa = 0.84, 96 percent of score pairs within one point). Model_only achieves the best completion rate (0.9267), task score (3.59/4), total score (9.16/10), and zero-intervention rate (0.84), outperforming rule_only (0.7733, 3.17/4, 7.03/10, 0.00) and no_recovery (0.6933, 2.74/4, 5.60/10, 0.00) with large effect sizes (Cliff's delta = 0.81-0.87). The benchmark uses deliberately simple geometries to isolate recovery-policy effects; we discuss the scope of these findings and directions for broader validation.

2605.15217 2026-05-18 cs.AI cs.CY cs.LG econ.GN q-fin.EC

Fair outputs, Biased Internals: Causal Potency and Asymmetry of Latent Bias in LLMs for High-Stakes Decisions

Jagdish Tripathy, Marcus Buckmann

AI总结 本研究探讨了指令微调语言模型在高风险决策(如房贷审批)中表现出的行为公平性与其内部潜在偏见之间的不对称关系。研究发现,尽管模型在输出层面看似无偏,但其内部表示仍保留并放大了与种族相关的偏见,且这些隐藏的偏见具有因果影响力,能够通过特定干预引发决策反转。研究还揭示了这种偏见在不同群体间的不对称性,并指出仅关注输出的行为审计不足以识别和治理模型中的潜在偏差,需结合表示分析的双重评估框架。

Comments 39 pages, 16 figures, 2 tables

详情
英文摘要

Instruction-tuned language models exhibit behavioural fairness in high-stakes decisions while retaining biased associations in their internal representations. However, whether these suppressed representations can affect model outputs - and whether such causal potency is symmetric across demographic groups - remains unknown. We investigate the use of open-weight models for mortgage underwriting using matched applications that differ only in racially-associated names and reveal a critical disconnect: models show no output-level bias, yet retain and amplify demographic representations across model layers. Through activation steering and novel cross-layer interventions, we demonstrate that this suppressed information is decision-relevant: when reinjected at critical layers, it produces near-complete decision reversals. Critically, this latent bias is asymmetric - steering interventions affect decisions in one demographic direction, while producing minimal effects in reverse - and susceptible to adversarial prompt engineering and parameter-efficient fine-tuning. These findings demonstrate that behavioural audits focused on outputs are insufficient: fair outputs can mask exploitable internal biases. They also motivate dual-layer testing frameworks combining output evaluation with representational analysis for AI governance in high-stakes decisions.

2605.15215 2026-05-18 cs.AI cs.SE

SkillSmith: Compiling Agent Skills into Boundary-Guided Runtime Interfaces

Duling Xu, Zheng Chen, Zaifeng Pan, Jiawei Guan, Dong Dong, Jialin Li, Bangzheng Pu

AI总结 SkillSmith 是一种边界引导的编译-运行时框架,旨在优化基于技能的智能体系统。该方法通过离线编译技能包为最小可执行接口,提取技能的细粒度操作边界,使智能体在运行时仅调用相关组件,从而减少冗余上下文注入和重复推理。实验表明,SkillSmith 显著降低了推理阶段的 token 使用量、思考迭代次数和执行时间,并提升了任务准确率,同时支持强模型生成的编译结果被轻量模型复用。

详情
英文摘要

Recently, skills have been widely adopted in large language model (LLM)-based agent systems across various domains. In existing frameworks, skills are typically injected into the agent reasoning loop as contextual guidance once matched to a runtime task, enabling specialized task-solving capabilities. We find that this execution paradigm introduces two major sources of redundancy: irrelevant context injection and repeated skill-specific reasoning and planning. To this end, we propose SkillSmith, a boundary-first compiler-runtime framework that compiles skill packages offline into minimal executable interfaces. By extracting fine-grained operational boundaries from skills, SkillSmith enables agents to dynamically access and execute only the relevant components at runtime, thereby minimizing unnecessary context injection and redundant reasoning overhead. In the evaluation on SkillsBench benchmark, SkillSmith reduces solve-stage token usage by 57.44%, thinking iterations by 42.99%, solve time by 50.57% (2.02x faster), and token-proportional monetary cost by 57.44% compared with using raw-skills. Moreover, compiled artifacts produced by a stronger model can be reused by a smaller or more efficient runtime model, improving task accuracy in cases where raw skill interpretation fails. The source code and data are available at https://github.com/AetherHeart-AI/Aeloon.

2605.15208 2026-05-18 cs.LG cs.AI

Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels

Plawan Kumar Rath, Rahul Maliakkal

AI总结 该研究探讨了量化压缩对大型语言模型(LLMs)偏见表现的影响,发现低精度量化会导致模型在多个任务中产生新的刻板印象行为,且这种变化与精度水平呈剂量反应关系。通过在多个模型和精度级别上的大规模实验,研究揭示了传统质量评估指标无法检测到这种偏见的增加,强调了在模型压缩前进行公平性检测的重要性。

Comments 7 pages, 4 figures, 4 tables. Accepted at IEEE Cloud Summit 2026. This is the author's accepted version; the version of record will appear in IEEE Xplore

详情
英文摘要

Large Language Models are routinely compressed via post-training quantization to reduce inference costs and memory footprint for cloud and edge deployment, yet the impact of this compression on model quality remains poorly understood. Existing studies typically compare only two conditions (full-precision vs. a single quantized variant), rely on aggregate bias metrics, and evaluate a single model family, making it impossible to distinguish gradual degradation from threshold-dependent safety failures. We conduct a controlled empirical study of three instruction-tuned models (Qwen2.5-7B, Mistral-7B, Phi-3.5-mini) at five precision levels (BF16 through 3-bit) on 12,148 BBQ bias benchmark items across 5 random seeds, totaling 911,100 inference records. Our results reveal that 3-bit quantization causes 6-21% of previously unbiased items to develop new stereotypical behaviors, following a clear dose-response pattern confirmed via logistic regression, while models' willingness to select "unknown" answers declines by 17.4%. Crucially, these item-level changes are invisible to standard quality metrics: perplexity increases by less than 0.5% at 8-bit and under 3% at 4-bit across all three models, yet 2.5-5.6% of items already develop new biases at 4-bit. These findings demonstrate that aggregate evaluation metrics systematically miss fairness-critical degradation, underscoring the need for quality-aware compression protocols that explicitly test for bias emergence before deployment.

2605.15207 2026-05-18 cs.LG cs.MA

TeamTR: Trust-Region Fine-Tuning for Multi-Agent LLM Coordination

Yi Xie, Siao Liu, Falong Fan, Yuanqi Yao, Yue Zhao, Bo Liu

AI总结 多智能体大语言模型系统在复杂推理任务中展现出潜力,但近期评估显示其性能常低于单一模型基线。本文识别出共享上下文团队在顺序微调中存在结构性失效模式,即更新一个智能体会导致团队上下文分布偏移,而后续使用缓存轨迹进行评估会加剧这种偏差。为此,作者提出TeamTR信任域框架,通过每次更新后重新采样轨迹并控制每个智能体的分布偏差,从而保证每次更新和每个阶段的改进下界。实验表明,TeamTR在平均性能上优于单智能体和顺序微调方法约7.1%,有效缓解了协调退化问题,并支持组件的即插即用替换。

Comments 9pages, Accepted at ICML2026

详情
英文摘要

Multi-agent LLM systems have shown promise for complex reasoning, yet recent evaluations reveal they often underperform single-model baselines. We identify a structural failure mode in sequential fine-tuning of shared-context teams: updating one agent shifts the team's context distribution, and when subsequent updates are evaluated on cached rollouts, this mismatch compounds. We formalize this as the compounding occupancy shift and prove that stale-occupancy evaluation incurs a penalty that scales quadratically with the number of agents. In contrast, intermediate-occupancy evaluation reduces this to linear scaling. We propose TeamTR, a trust-region framework that resamples trajectories after each component update and enforces per-agent divergence control, yielding rigorous per-update and per-stage improvement lower bounds. Experiments show that TeamTR outperforms single-agent and sequential baselines with 7.1% on average, mitigates coordination regressions, and supports plug-and-play component replacement. Code is available at https://github.com/Yydc/TeamTR.