arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.28969 2026-05-29 cs.CL cs.AI cs.HC

Beyond Recall: Behavioral Specification as an Interpretive Layer for AI Personalization

Aarik Gulaya

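AI summary Proposes the Behavioral Specification as an interpretive layer that compresses user data into interpretive patterns, markedly improving how accurately an AI agent represents user intent, reducing model hedging, and outperforming the raw corpus and commercial memory systems on interpretation-required questions.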

Comments 134 pages, 4 figures. Code, data, judge prompts, and reproduction instructions: github.com/agulaya24/beyond-recall

English abstract

If an AI agent makes decisions on a person's behalf, those decisions must align with its user. We introduce representational accuracy to measure how faithfully a system captures a person's interpretation. An interpretive layer is operationalized as a Behavioral Specification. Our reference implementation aggressively compresses a person's data into interpretive patterns, served as context to a language model. We evaluate the Specification on a prototype benchmark of held-out behavioral predictions scored by a calibrated 5-judge LLM panel. We test it independently and in composition with a range of context conditions: full raw corpus, full extracted facts, and four commercial memory systems (Mem0, Letta, Supermemory, Zep). Across 14 public-domain autobiographical corpora, the Specification lifts representational accuracy in aggregate and nearly eliminates model hedging. It recovers most of what the raw corpus delivers, at ~25x less context cost. The Specification lifts subjects toward a common predictive level regardless of pretraining baseline; the lift in absolute points is therefore largest where the baseline is lowest, suggesting the population of relevance is anyone not adequately represented in pretraining. Lift is greatest on interpretation-required questions, where providing an interpretive layer enables model behavior that extracted facts or raw corpus do not. Conversely, on recall-required questions, this layer can interfere rather than help. We conclude that representational accuracy is distinct from recall and that human-AI alignment is dependent on how accurately the user is represented. Representational accuracy makes that alignment testable.

2605.28966 2026-05-29 cs.CL cs.HC

The Trust Paradox: How CS Researchers Engage LLM Leaderboards

Pouya Sadeghi, Anamaria Crisan, Jimmy Lin

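AI summary Through semi-structured interviews, reveals a widespread paradox of "pragmatic skepticism" toward LLM leaderboards among computer science researchers and derives design recommendations from the findings.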

English abstract

Large language model (LLM) leaderboards rank AI models using standardized benchmarks and have become highly visible across computer science, despite known limitations in their reliability and robustness. Yet how they shape researchers' actual practice remains empirically uncharted. We address this gap through semi-structured interviews with eight researchers across four computer science subfields, analyzed using reflexive thematic analysis. We find a near-universal paradox of pragmatic skepticism: while participants expressed deep distrust of leaderboard rankings, they continued to use them as rough decision-making aids. Peer networks, not leaderboards, emerged as the primary model selection mechanism, and arena-based (human-voting) leaderboards were consistently preferred over static benchmark leaderboards. Leaderboard influence varied sharply across subfields, revealing that disciplinary culture, not individual attitudes, mediates engagement; for instance, NLP researchers faced state-of-the-art comparison pressure while HCI and Systems/Privacy researchers reported none. Across these differences, however, participants converged on cost transparency as the most demanded missing feature (seven of eight). We translate these findings into concrete design recommendations that align evaluation infrastructure with how researchers actually use it, such as task-specific score breakdowns, cost integration, and voter-demographic disclosure.

2605.28965 2026-05-29 cs.AI

Frontier LLM-based agents can overcome the ontology curation bottleneck for natural phenotypes

James P. Balhoff, Hilmar Lapp

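AI summary Uses frontier large language models as agentic curators that perform phenotype annotation in a self-contained workspace using ontologies and an annotation guide; their performance falls within the range of variability among human curators and substantially exceeds that of a traditional NLP tool.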

Comments 7 pages, 2 figures

English abstract

Linking free-text phenotype descriptions to ontology terms, typically referred to as phenotype annotation, is essential for the cross-study integration of comparative morphological data. This labor-intensive process has relied heavily on highly trained human experts, which makes it challenging to scale and thus a key bottleneck. Dahdul et al. (2018) established a Gold Standard (GS) of Entity-Quality (EQ) annotations across seven phylogenetic studies and used it to evaluate three human curators and the Semantic CharaParser NLP tool with ontology-based semantic similarity metrics; they reported that machine-human consistency was significantly lower than inter-curator (human-human) consistency. Here we revisit that benchmark with five frontier hosted LLMs from Anthropic and OpenAI, each operating as an "agentic curator" within a self-contained workspace that supplies the source publication PDF, the same annotation guide used by the original human curators, the four project ontologies (UBERON, PATO, BSPO, GO), and a validation script. Evaluated against the same Gold Standard, every agent fell within the range of inter-curator variability of the three trained human biocurators of the original study; the best-performing agents approached but did not reach the best-performing human curator. Agents substantially outperformed Semantic CharaParser on all four metrics.

2605.28962 2026-05-29 cs.CV

Resolving Endpoint Underfitting in Diffusion Bridges via Noise Alignment

Yurong Gao, Zicheng Zhang, Congying Han, Tiande Guo, Xinmin Qiu

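AI summary To address the underfitting that diffusion bridge models exhibit near the target endpoint, proposes the Noise-Aligned Diffusion Bridge (NADB), which resolves the noise mismatch with a mean network and a noise-aligned mapping, and validates it on image restoration and translation tasks.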

Comments Accepted by CVPR2026

English abstract

Diffusion bridge models offer a powerful framework for connecting two data distributions, such as in image restoration and translation. Many existing methods learn this bridge by mimicking the score-matching formulation of standard diffusion models. In this work, we find that this approach leads to an anomalous underfitting phenomenon near the target endpoint, as the process approaches the target distribution ($t \to 0$). This underfitting, characterized by significant drift in the predicted variance and direction, results from an excessively large discrepancy in noise levels between the network's input and its regression target. To resolve this issue, we propose the Noise-Aligned Diffusion Bridge (NADB). Our approach reformulates the diffusion bridge by first employing a mean network to provide a cleaner conditional target, and then introducing a novel, noise-aligned mapping relationship. This new formulation resolves the noise mismatch and corrects the underfitting near the target endpoint. Experimental validation across multiple image restoration and image translation tasks demonstrates the effectiveness of our approach. Code is available at https://github.com/gyr02/NADB.

2605.28920 2026-05-29 cs.LG cs.AI stat.ML

Conf-Gen: Conformal Uncertainty Quantification for Generative Models

Gabriel Loaiza-Ganem, Kevin Zhang, Wei Cui, Marc T. Law, Kin Kwan Leung

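AI summary Proposes Conf-Gen, a framework that adapts conformal risk control to generative tasks, unifying and extending applications of conformal prediction to generative models such as large language models, and provides formal guarantees in new domains including image generation, conversational AI, and AI agents.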

Comments ICML 2026

English abstract

Conformal prediction (CP) and its extension, conformal risk control (CRC), are established frameworks for quantifying uncertainty in supervised machine learning through formal guarantees. However, recent breakthroughs in artificial intelligence (AI) have been driven by unsupervised generative models, such as large language models (LLMs) and image generators, which are not directly compatible with CP or CRC. In this work we introduce conformal generation (Conf-Gen), a general framework adapting CRC to generative tasks while relaxing its theoretical assumptions. Conf-Gen unifies and generalizes previous attempts to apply CP to LLMs, and extends conformal methodology to entirely new domains. We demonstrate the flexibility of Conf-Gen through some novel applications, including obtaining conformal guarantees on: image generators producing non-memorized images, conversational AI systems having asked enough clarifying questions, and the output of AI agents being correct.
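
For orientation, conformal risk control as it is usually stated picks the smallest threshold whose inflated empirical risk on a calibration set stays at or below the target level. The sketch below illustrates only that generic calibration step and is not the Conf-Gen procedure; the function name `crc_threshold`, the toy scores, and the 0/1 loss are assumptions made for the example.

```python
import numpy as np

def crc_threshold(losses, lambdas, alpha, bound=1.0):
    """Pick the smallest threshold whose adjusted calibration risk is <= alpha.

    losses:  array of shape (n_calibration, n_thresholds); losses[i, j] is the
             loss of calibration example i at threshold lambdas[j]. The loss is
             assumed non-increasing in the threshold and bounded by `bound`.
    lambdas: candidate thresholds sorted in ascending order.
    """
    losses = np.asarray(losses, dtype=float)
    n = losses.shape[0]
    empirical_risk = losses.mean(axis=0)
    # Finite-sample inflation used by conformal risk control.
    adjusted = (n / (n + 1)) * empirical_risk + bound / (n + 1)
    feasible = np.nonzero(adjusted <= alpha)[0]
    if feasible.size == 0:
        return lambdas[-1]  # fall back to the most conservative threshold
    return lambdas[feasible[0]]

# Toy calibration set (made-up integer scores): the loss is 1 when a score
# exceeds the threshold, so it can only shrink as the threshold grows.
scores = np.arange(1, 100)
lambdas = np.arange(0, 101)
losses = (scores[:, None] > lambdas[None, :]).astype(float)
chosen = crc_threshold(losses, lambdas, alpha=0.1)
```

In a generative setting the loss would be task-specific (for example, whether an output is judged incorrect at a given filtering threshold); the abstract does not say how Conf-Gen defines it.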

2605.28919 2026-05-29 cs.LG cs.AI cs.CL

CosmicFish-HRM: Adaptive Reasoning via Hierarchical Recurrent Mechanisms in Compact Language Models

Venkat Akhil Lakkapragada

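AI summary Proposes CosmicFish-HRM, a compact language model that uses a hierarchical reasoning module to allocate reasoning depth dynamically, achieving adaptive reasoning while keeping the parameter count small.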

Comments 17 pages, 4 figures. Exploratory study of adaptive reasoning depth in compact autoregressive language models. Code available at https://github.com/MistyozAI/CosmicFish-HRM

English abstract

Large language models have achieved strong reasoning capabilities, though often at the cost of massive parameter counts and expensive inference. In this work, we explore a different direction: adaptive reasoning depth in compact language models. We present CosmicFish-HRM, a compact language model built around a Hierarchical Reasoning Module (HRM) that dynamically allocates computational effort during inference. Instead of applying fixed computation to every input, the model iterates through high-level and low-level reasoning cycles and learns when to halt based on input complexity. CosmicFish-HRM combines this adaptive reasoning core with modern transformer components including Grouped Query Attention, RoPE, and SwiGLU activations. While the additional reasoning infrastructure introduces overhead at small scale, we hypothesize that this tradeoff becomes increasingly favorable as model size grows and the relative cost of the HRM core diminishes. Our results show that the model learns non-uniform reasoning behavior, allocating different numbers of reasoning steps across tasks and inputs. These findings suggest that adaptive reasoning depth may offer a promising alternative to relying solely on parameter scale for reasoning capability.

2605.28913 2026-05-29 cs.CL

Reasoning that Travels: Dissecting How Chain-of-Thought Transfers Across Models

Xinyuan Cheng, Beiduo Chen, Philipp Mondorf, Barbara Plank

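AI summary Using a provider-receiver framework, studies how chain-of-thought traces generated by large reasoning models affect other models' answers; finds that the transfer effect of full traces varies by benchmark, that partial trace prefixes can guide receivers' continued reasoning, and that answer agreement can serve as a label-free signal for stopping provider reasoning early.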

Comments 20 pages, 17 figures

English abstract

Large reasoning models (LRMs) often generate extensive chain-of-thought (CoT) traces before producing a final answer. As explicit textual artifacts, these traces can be passed to other models to solve the same task, enabling cross-model reasoning transfer. Yet successful transfer alone does not reveal how the provided CoT contributes to another model's answer. We study this question with a controlled provider-receiver framework, where a provider generates a reasoning trace and a receiver solves the same problem from increasingly longer trace prefixes. We compare force-answer, where the receiver answers directly from the prefix, with free-generation, where it may continue reasoning before answering. Across models and benchmarks, full traces often transfer successfully, but prefix trajectories reveal distinct mechanisms. In force-answer mode, AIME transfer is largely driven by explicit answer availability. MMLU-Pro instead reflects a larger role for receiver competence, while ZebraLogic depends on partial structured-answer information rather than complete-answer leakage alone. In free-generation mode, partial CoTs improve performance across benchmarks, indicating that prefixes can guide continued reasoning. Finally, answer agreement among receivers provides a gold-free signal for stopping provider reasoning early. Overall, cross-model CoT transfer is not a single phenomenon: it can reflect answer extraction, reasoning scaffolding, or receiver-dependent competence.
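
The prefix sweep and the agreement-based stopping signal can be pictured with a small sketch. Everything here is hypothetical: the receivers are plain callables standing in for force-answer model calls, and the step size and agreement threshold are illustrative choices, not values from the paper.

```python
from collections import Counter

def trace_prefixes(trace_tokens, step):
    """Yield increasingly longer prefixes of a reasoning trace, ending with the full trace."""
    for end in range(step, len(trace_tokens) + step, step):
        yield trace_tokens[:end]

def agreement(answers):
    """Return the majority answer and the share of receivers that gave it."""
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)

def stop_early(trace_tokens, receivers, step=4, threshold=1.0):
    """Return (answer, prefix_length) at the first prefix where receivers agree.

    No gold label is consulted: the only signal is agreement among receivers.
    If agreement never reaches the threshold, the last majority answer is
    returned together with the full trace length.
    """
    answer = None
    for prefix in trace_prefixes(trace_tokens, step):
        answer, share = agreement([receiver(prefix) for receiver in receivers])
        if share >= threshold:
            return answer, len(prefix)
    return answer, len(trace_tokens)

# Toy receivers: they disagree until the trace prefix contains the token "42".
receiver_a = lambda prefix: "42" if "42" in prefix else "a"
receiver_b = lambda prefix: "42" if "42" in prefix else "b"
trace = ["x"] * 6 + ["42"] + ["y"] * 3
result = stop_early(trace, [receiver_a, receiver_b], step=4)
```

In this toy run the receivers first agree on the 8-token prefix, so the provider could have stopped two tokens before the end of its 10-token trace.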

2605.28909 2026-05-29 cs.LG

Sequential Physics-Constrained Neural Operator Forward Modeling for the $\textit{Norne}$ Reservoir System

Clement Etienam, Juntao Yang, Oleg Ovcharenko, Nick Luiken, Tsubasa Onishi, Nefeli Moridis, Issam Said

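AI summary For the Norne reservoir benchmark, proposes sequential surrogate models based on the Fourier Neural Operator (FNO) and its physics-informed variant (PINO); through theoretical analysis (function-space formulation, covariate shift quantification, physics-constrained spectral stability, and K-step TBPTT gradient analysis) and experimental validation, achieves high-accuracy prediction of oil, gas, pressure, and water over the full 3298-day horizon, with a speedup of about 10^4x on a single GPU.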

Comments 22 pages, 2 figures, 2 tables. Code available at https://github.com/clementetienam/physicsnemo/tree/801a85bc08aa9caa0d54027a145b88c68e5e5f36/examples/reservoir_simulation/norne

English abstract

We develop a comprehensive mathematical and computational framework for sequential surrogate modeling of three-phase black-oil reservoir dynamics using neural operators, with particular emphasis on Fourier Neural Operators (FNO) and their physics-informed variant (PINO). The application focus is the Norne benchmark reservoir, defined on a heterogeneous $46\times112\times22$ grid ($N=113,344$ cells), with a production history spanning $T=30$ timesteps covering 3298 days. Our theoretical contributions are organized around four interlocking problems: (1) functional-analytic formulation in a product-Sobolev-space setting, including well-posedness of the implicit timestep map and sharp local Lipschitz estimates; (2) covariate shift quantification, proving that the Wasserstein-2 distance grows as $W_2 \leq \varepsilon(L^n-1)/(L-1)$, with exponential population-risk discrepancy for $L>1$; (3) physics-constrained spectral stability, showing PINO training with $\lambda_R \geq \lambda^*_R$ reduces the learned Jacobian spectral radius to $\rho_F + C\lambda_R^{-1/2}$, yielding uniform-in-time rollout error $|\delta_n| \leq \varepsilon/(1-\rho)$; and (4) $K$-step TBPTT gradient analysis, deriving geometric bias decay $O(\rho^K)$, optimal window $K^* = O(\log(T/\sigma^2))$, and Adam convergence $O(1/\sqrt{t}) + O(\rho^{K^*})$. Empirical validation confirms all theoretical predictions: autoregressive PINO surrogates sustain $R^2>0.99$ (oil), $R^2>0.90$ (gas), $R^2\approx 0.80$ (pressure), and monotonically improving $R^2$ (water) across the full 3298-day horizon, trained on eight NVIDIA B200 GPUs in under one hour. A 1000-member ensemble runs in under one minute on a single B200 GPU, giving a ${\sim}10^4\times$ wall-clock speedup over the OPM finite-volume simulator.
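
The geometric form of the bound in (2) is what a one-step error recursion would produce. As a sketch, assume (the abstract does not state this) that each surrogate step adds at most $\varepsilon$ of error and that the step map is $L$-Lipschitz, and write $e_n$ for the accumulated discrepancy after $n$ steps with $e_0 = 0$:

```latex
\begin{aligned}
e_n &\le L\, e_{n-1} + \varepsilon \\
    &\le \varepsilon \sum_{k=0}^{n-1} L^{k}
     = \varepsilon\, \frac{L^{n} - 1}{L - 1}, \qquad L \neq 1 .
\end{aligned}
```

For $L > 1$ the right-hand side grows exponentially in $n$. If the factor is instead a contraction $\rho < 1$, the same sum is bounded by $\varepsilon/(1-\rho)$ for every $n$, which matches the uniform-in-time form quoted in (3).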

2605.28902 2026-05-29 cs.AI

Orthogonal Concept Erasure for Diffusion Models

Yuhao Sun, Lingyun Yu, Haoxiang Xu, Fengyuan Miao, Zhuoer Xu, Hongtao Xie

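AI summary Proposes Orthogonal Concept Erasure (OCE), which takes a geometric perspective and uses multiplicative parameter updates to achieve precise concept erasure while preserving generative capacity, and supports multi-concept erasure.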

Comments Accepted by ICML 2026 Oral

English abstract

Concept erasure has emerged as a promising approach to mitigate undesired or unsafe content in diffusion models, yet existing methods still face significant limitations. While training-based methods are effective, their high computational cost limits scalability. Editing-based methods are more efficient and deployment-friendly, yet they struggle to simultaneously achieve precise concept erasure and preserve overall generative capacity. We identify this core limitation of the editing-based methods as reliance on additive parameter updates. Our empirical analysis reveals that concept semantics primarily depend on neuron direction rather than neuron magnitude, while overall generative capacity relies on the angular geometry of neurons. As additive updates inherently entangle direction, magnitude, and angular geometry, they inevitably introduce unintended interference between concept erasure and overall generation performance. To address this, we propose Orthogonal Concept Erasure (OCE), which reformulates editing-based erasure as multiplicative parameter updates from a geometric perspective. Specifically, OCE applies layer-wise orthogonal transformations derived from a closed-form solution to the parameters, enabling precise concept erasure while preserving the neuron magnitude and angular geometry. Furthermore, to address conflicting constraints in multi-concept erasure, OCE introduces a subspace-level objective with structured subspace manipulation, yielding a more effective and scalable erasure. Extensive experiments on single- and multi-concept erasure demonstrate that OCE outperforms existing methods in concept erasure and non-target preservation, erasing up to 100 concepts in 4.3 s. Code: https://github.com/HansSunY/OCE.
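
The geometric argument rests on a standard linear-algebra fact: multiplying a weight matrix by an orthogonal matrix leaves every neuron's norm and every pairwise angle unchanged, whereas an additive perturbation generally changes both. The NumPy sketch below checks this on random data. It is not the OCE closed-form solution; treating neurons as the columns of the matrix and the helper names are assumptions for illustration.

```python
import numpy as np

def random_orthogonal(dim, seed=0):
    """Draw an orthogonal matrix from the QR factorization of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def neuron_geometry(weights):
    """Return per-neuron norms and the pairwise cosine matrix (neurons = columns)."""
    norms = np.linalg.norm(weights, axis=0)
    unit = weights / norms
    return norms, unit.T @ unit

rng = np.random.default_rng(1)
weights = rng.standard_normal((8, 5))                   # 5 toy neurons in 8 dimensions
rotated = random_orthogonal(8) @ weights                # multiplicative (orthogonal) update
shifted = weights + 0.1 * rng.standard_normal((8, 5))   # additive update

norms_0, cos_0 = neuron_geometry(weights)
norms_r, cos_r = neuron_geometry(rotated)
norms_s, cos_s = neuron_geometry(shifted)
```

Here `norms_r` and `cos_r` match the originals up to floating-point error, while `norms_s` and `cos_s` do not; only the neuron directions move under the orthogonal update.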

2605.28900 2026-05-29 cs.LG

Spectral Guidance for Flexible and Efficient Control of Diffusion Models

Gabriel Moreira, Manuel Marques, João Paulo Costeira, Chenyan Xiong

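AI summary Proposes the Spectral Guidance framework, which learns the intrinsic geometry of the generative process through the singular functions of a conditional expectation operator, enabling stable, high-fidelity control without retraining or backpropagation; on CIFAR-10 it improves conditional accuracy by 37 percentage points with 4x faster sampling.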

Comments ICML 2026

English abstract

We introduce Spectral Guidance, a framework for controlling diffusion models by leveraging the intrinsic geometry of the generative process. As data is progressively corrupted by noise, only a small number of features remain informative for control. We characterize them as the singular functions of a conditional expectation operator and show that they can be learned via a self-supervised objective. Once recovered, this basis enables the projection of arbitrary guidance signals, such as labels, CLIP embeddings, or masks, directly onto the sampling trajectory. This approach allows for stable, high-fidelity control without retraining or denoiser backpropagation during sampling. Empirically, we improve conditional accuracy on CIFAR-10 by 37 percentage points over the strongest training-free baseline while offering $4\times$ faster sampling. Moreover, the same representations that support label and CLIP guidance also enable spatial control, such as mask-based guidance, without auxiliary models. Finally, our framework reveals a phase transition in the generative process, pinpointing the optimal time window for effective guidance.

2605.28897 2026-05-29 cs.AI cs.MA

Review Arcade: On the Human Alignment and Gameability of LLM Reviews

Hans Ole Hatzel, Sebastian Steindl, Jan Strich

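AI summary Experimentally evaluates how well LLM-generated paper reviews align with human reviews and finds that authors can iteratively revise papers according to LLM reviews to raise scores (a significant increase for up to 35% of papers), revealing the gameability of LLM reviews.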

Comments Under Review EMNLP 26

English abstract

LLM-generated reviews for scientific papers are gaining considerable traction and are even being officially piloted by major conferences. We have to assume not only that reviewers are using LLM assistance, but also that authors use LLMs to revise their papers before submitting. In this work, we perform empirical experiments on papers from the 2025 ACL Rolling Review (ARR) to evaluate LLM reviews from both the author and the reviewer perspective. First, we identify a limited alignment of LLM reviews with human ones. In the best-case scenario, the alignment is reasonable. However, we also find that LLM-human alignment varies substantially across prompts and models. Finally, we investigate the scenario in which the author uses an iterative draft-revise workflow to improve the submission according to the LLM review. We find that this "gaming" of LLM reviews can be effective in specific scenarios, leading to a statistically significant increase of overall scores for up to 35% of papers. We publish our code: https://github.com/uhh-hcds/reviewarcade.

2605.28896 2026-05-29 cs.LG

Feature Geometry of LoRA Adapters: A Sparse Autoencoder Analysis of Representational Divergence in Fine-Tuned Language Models

Prasanth K K

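AI summary Uses sparse autoencoders to analyze the changes in representational geometry induced by LoRA fine-tuning, finding that LoRA feature dictionaries align only weakly with pretrained features and that adapter-specific SAEs reconstruct delta activations more effectively.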

English abstract

Low-Rank Adaptation (LoRA) has emerged as a widely adopted approach for adapting large language models, yet the internal representational changes induced by LoRA fine-tuning remain insufficiently understood. In this work, we investigate the geometry of LoRA-induced representations using Sparse Autoencoders (SAEs). We introduce a delta activation framework that isolates the adapter-specific contribution to the residual stream. Using Gemma-2-9B with LoRA ranks 4, 8, 16, and 32, we train adapter-specific SAEs across multiple transformer layers and compare their learned feature spaces with pretrained SAE dictionaries. We evaluate representational alignment using cosine similarity between decoder directions, principal-angle analysis of feature subspaces, and Centered Kernel Alignment (CKA) between activation representations. Across layers and ranks, we consistently observe comparatively weak geometric alignment between LoRA-induced feature dictionaries and pretrained SAE features. Adapter-specific SAEs also reconstruct delta activations more effectively than pretrained SAEs, suggesting that LoRA updates occupy partially distinct representational structure within the residual stream. Additionally, feature density increases with rank and depth, while geometric divergence remains relatively stable across ranks. These findings provide empirical evidence that LoRA fine-tuning can induce feature structures that are not fully captured by pretrained interpretability dictionaries, with implications for mechanistic interpretability, adaptation analysis, and safety auditing of fine-tuned language models.
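
Two of the alignment measures named above have compact standard definitions. The sketch below shows linear CKA between two activation matrices and the best cosine match between decoder directions. It is a generic NumPy illustration under assumed conventions (samples in rows for activations, one feature direction per row for decoders), not the paper's evaluation code; the abstract does not say which CKA kernel is used.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_samples, n_features)."""
    x = x - x.mean(axis=0, keepdims=True)   # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

def max_cosine_similarity(decoder_a, decoder_b):
    """For each feature direction (row) in decoder_a, the best cosine match in decoder_b."""
    a = decoder_a / np.linalg.norm(decoder_a, axis=1, keepdims=True)
    b = decoder_b / np.linalg.norm(decoder_b, axis=1, keepdims=True)
    return (a @ b.T).max(axis=1)

rng = np.random.default_rng(0)
activations = rng.standard_normal((50, 6))
unrelated = rng.standard_normal((50, 6))
same_score = linear_cka(activations, 3.0 * activations)   # invariant to rescaling
other_score = linear_cka(activations, unrelated)          # lower for unrelated data
```

Low values of both measures between an adapter-specific dictionary and a pretrained one would correspond to the weak geometric alignment the abstract reports.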

2605.28889 2026-05-29 cs.LG cs.AI

Context Distillation as Latent Memory Management

Ziyang Zheng, Zeju Li, Xiangyu Wen, Jianyuan Zhong, Junhua Huang, Lei Chen, Mingxuan Yuan, Qiang Xu

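AI summary Treats context distillation as a latent memory management problem: independent LoRA adapters form a modular memory bank, and a self-gating mechanism decides whether to activate latent memory, improving retrieval robustness and efficiency.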

English abstract

Context distillation compresses contextual information into model parameters, yet existing methods often ignore how multiple distilled latent memories should be stored, retrieved, and safely activated in non-oracle settings. We formulate context distillation as a latent memory management problem. We distill each context into an independent LoRA adapter, forming a modular memory bank that enables explicit memory selection. Given a query, our framework retrieves candidate memories, routes the query to the most suitable adapter, and uses a Self-Gating mechanism to decide whether latent memory should be activated. To improve efficiency, we further introduce cache sharing to reduce management overhead during inference. Experiments show that our method substantially outperforms baselines with retrieval, while Self-Gating improves robustness by deactivating unnecessary latent memories.

2605.28883 2026-05-29 cs.AI cs.RO

Ultra-Reduced-Impact-Encased-Logging (URIEL): propose a new method for selective sustainable logging and post-harvest silvicultural treatment in tropical forest using airborne robotics systems

Daniel Albiero, Gelton Fernando de Morais, Daniela Han, Flávio Roberto de Freitas Gonçalves, Artur Vitório Andrade Santos, Wesllen Lins de Araújo, Alessandra Maia Freire, Cláudio Kiyoshi Umezu, Mateus Peressin, Francesco Toscano, Admilson Írio Ribeiro, Alfeu J. Sguarezi Filho, Américo Ferraz Dias Neto, Angel Pontin Garcia

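AI summary Proposes the URIEL method, which combines heli-logging, robotics, AI, and drone-based post-harvest silvicultural treatment to achieve high economic viability with almost no collateral damage while maintaining ecosystem services.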

Comments 196 pages, 40 figures. A revolutionary technology to help protect tropical forests. It was developed, scaled, detailed, calculated, and simulated in an advanced computational environment, with economic and social viability. "E pur si muove"

English abstract

Tropical forests worldwide are under intense deforestation pressure driven by economic and political interests, and scientific evidence suggests this deforestation contributes to climate change. This paper proposes a novel logging method for tropical forests, Ultra-Reduced-Impact-Encased-Logging (URIEL). The method is based on heli-logging techniques combined with intensive use of robotics and AI, integrated with post-harvest silvicultural treatments performed by drones. The concept of appropriate equipment for this method was developed, dimensions were determined, details were completed in a digital proof of concept, and an effective digital simulation and economic feasibility analysis were carried out for various helicopter-timber-distance combinations. The results demonstrated that the URIEL method has high economic viability and makes it possible to virtually eliminate collateral damage to forests while maintaining ecosystem services. The main conclusion of this paper is that, despite the satisfactory scientific and technological results, the feasibility of the URIEL method depends on the integration of the stakeholders intrinsic to the context: high-tech industry, governments, certified logging companies, and native populations.

2605.28880 2026-05-29 cs.LG physics.data-an stat.ME

Towards Continuous-time Causal Foundation Models

Dennis Thumm, Ruben Wiedemann, Ying Chen

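AI summary Proposes a continuity criterion, invariance of the trajectory law to the observation schedule, realizes a continuous-time causal prior through fine-grid integration with decoupled observation, and shows on linear and nonlinear priors that it outperforms discrete approaches.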

Comments ICML 2026 2nd Workshop on Foundation Models for Structured Data (FMSD)

English abstract

Extending discrete-time causal Prior-data Fitted Networks for time series to continuous time invites writing the mechanism as a stochastic differential equation (SDE). But if the SDE is integrated once per observation gap, the trajectory law depends on when it is observed, and the prior remains a discrete-time Markov model in SDE clothing. We propose a precise continuity criterion, trajectory-law invariance to the observation schedule, together with a three-tier taxonomy (discrete; naive observation-grid integration; fine-grid integration with decoupled observation) and a construction realising the top tier on a random DAG with OU or small-MLP nonlinear drifts, irregular observation schedules, and hard / soft / time-varying interventions. A $2 \times 2$ encoder $\times$ integrator ablation, run independently on a linear and a nonlinear prior, finds that fine-grid integration beats naive integration on 8/8 cells (sign-consistency $p < 1/256$), with the gap growing as the evaluation grid refines; the encoder axis is null with fine integration but favors the time-aware encoder under naive integration. We release the prior and a preliminary zero-shot protocol on pharmacokinetic and physical-system data.
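
The schedule dependence described above can be seen on a single Ornstein-Uhlenbeck process, $dX = -\theta X\,dt + \sigma\,dW$. Because each Euler-Maruyama update is linear-Gaussian, its mean and variance can be propagated exactly without sampling. The sketch below does that for two observation schedules that end at the same time; the parameter values and function names are illustrative assumptions, not the paper's construction.

```python
import math

def euler_maruyama_moments(x0, theta, sigma, gaps, substeps):
    """Mean and variance at the final time of an Euler-Maruyama OU simulation.

    gaps:     observation gaps; each gap is integrated with `substeps` equal steps.
    substeps: 1 reproduces "once per observation gap"; a large value is a fine grid.
    """
    mean, var = x0, 0.0
    for gap in gaps:
        h = gap / substeps
        for _ in range(substeps):
            # Update x <- (1 - theta*h) * x + sigma * sqrt(h) * z, with z ~ N(0, 1).
            mean = (1.0 - theta * h) * mean
            var = (1.0 - theta * h) ** 2 * var + sigma ** 2 * h
    return mean, var

def exact_ou_moments(x0, theta, sigma, t):
    """Closed-form mean and variance of the OU process at time t."""
    mean = x0 * math.exp(-theta * t)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * theta * t)) / (2.0 * theta)
    return mean, var

exact = exact_ou_moments(1.0, 1.0, 1.0, 1.0)
naive_one_gap = euler_maruyama_moments(1.0, 1.0, 1.0, [1.0], substeps=1)
naive_two_gaps = euler_maruyama_moments(1.0, 1.0, 1.0, [0.5, 0.5], substeps=1)
fine_one_gap = euler_maruyama_moments(1.0, 1.0, 1.0, [1.0], substeps=1000)
fine_two_gaps = euler_maruyama_moments(1.0, 1.0, 1.0, [0.5, 0.5], substeps=1000)
```

With one step per gap, the law at $t = 1$ changes when the observation schedule changes; with a fine grid, both schedules land close to the closed-form moments, which is the invariance the continuity criterion asks for.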

2605.28874 2026-05-29 cs.CL

From Data to Insights: Exploring Program-of-Thoughts Prompting for Chart Summarization

从数据到洞察:探索程序化思维提示在图表摘要中的应用

Yutong Qu, Wei Zhang

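AI summary This paper proposes a zero-shot method that uses Python programs as intermediaries and a Program-of-Thoughts strategy to drive lightweight vision-language models for chart summarization, and introduces a chart-to-dictionary auxiliary task to improve flexibility and performance.

Comments 22 pages, 9 figures

AI Chinese abstract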

图表通过结构化视觉表示在传达数值数据洞察方面起着关键作用。然而,语义视觉理解和数值推理需求阻碍了图表的准确描述,使得图表摘要成为一项具有挑战性的任务。尽管视觉语言模型(VLM)最近取得了进展,但现有方法缺乏验证统计事实正确性的稳健机制,且计算负担重。为解决这一问题,本文探索了一种使用零样本学习的策略,通过Python程序作为中介,激励轻量级VLM执行计算推理,从而为图表理解导出有效的摘要统计量。具体而言,我们引入了一种新颖的图表到字典辅助任务,与传统的图表到表格方法相比,提供了更灵活的表示,特别适合与程序化思维(PoT)策略集成。实验结果表明,我们的策略在语义和事实指标上与现有图表摘要方法性能相当。代码可在 https://anonymous.4open.science/r/ZeroShot-PoT-C2T-5A6B 获取。

English abstract

Charts play a critical role in conveying numerical data insights through structured visual representations. However, the combined demands of semantic visual understanding and numerical reasoning hinder accurate chart description, making chart summarization a challenging task. Despite recent advances in visual language models (VLMs), existing approaches lack robust mechanisms for verifying the correctness of statistical facts and are computationally heavy. To address this gap, this paper explores a zero-shot strategy that encourages lightweight VLMs to perform computational reasoning, using Python programs as intermediaries to derive valid summary statistics for chart understanding. Specifically, we introduce a novel chart-to-dictionary auxiliary task, which offers a more flexible representation than traditional chart-to-table methods and is particularly well suited to integration with the Program-of-Thoughts (PoT) strategy. Experimental results demonstrate that our strategy performs on par with existing chart summarization methods across semantic and factual metrics. Code is available at https://anonymous.4open.science/r/ZeroShot-PoT-C2T-5A6B.

2605.28873 2026-05-29 cs.LG

Pre-Registering the Detectable Effect: A Paired-MDE Budget for 4-bit Quantization Benchmarks, with a Pilot Audit

预注册可检测效应:面向4位量化基准的配对MDE预算,附带一项试点审计

Zexin Zhuang, Yanhang Li, Zhichao Fan

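AI summary This paper proposes a paired minimum detectable effect (MDE) bound for assessing the reliability of quantization benchmarks and validates its effectiveness through a pilot audit.

AI Chinese abstract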

这是一篇带有非配对试点审计的规划方法说明。我们将经典的配对二项样本量计算(Miettinen, 1968)应用于量化基准,给出了在配对项目数$m$和FP16-NF4不一致率$ρ_d$下的保守最小可检测效应(MDE)边界$δ^{*} \le (z_{1-α/2}+z_{1-β})\sqrt{ρ_d/m}$。该边界将“我的量化声明有多可靠?”转化为基准设计者在运行前可以承诺的一行预算。我们在四个模型和四个基准($n=100$的$k=5$次分割)上展示了该边界,并添加了一项并行的MMLU提示模板研究,以将边界的量化噪声尺度与提示噪声尺度进行比较。假设$ρ_d=0.10$(一个未测量的规划值),所有观察到的NF4-FP16差异均低于隐含的MDE,且大多数跨分割标准差落在二项参考$\sqrt{p(1-p)/n}$的$\pm 1.5$个百分点内,因此在$n=100$子样本上报告为“基准不可靠性”的大部分方差是二项抽样噪声。唯一的边界单元格(OPT-WinoGrande,$|Δ|=3.2$个百分点)在$ρ_d=0.10$时低于隐含MDE,但在$ρ_d=0.05$时高于它,说明了该边界明确的规划权衡。在MMLU上,提示模板范围2-10个百分点达到或超过了最大的观察量化差异(3.2个百分点),因此未先固定提示模板的量化审计会将模板方差吸收到其噪声基底中。我们用一个五行预注册模板补充了该边界。

English abstract

This is a planning-method note with an unpaired pilot audit. We adapt the classical paired-binary sample-size calculation (Miettinen, 1968) to quantization benchmarks, giving a conservative minimum detectable effect (MDE) bound $δ^{*} \le (z_{1-α/2}+z_{1-β})\sqrt{ρ_d/m}$ in the paired item count $m$ and the FP16-NF4 disagreement rate $ρ_d$. The bound turns "how reliable is my quantization claim?" into a one-line budget a benchmark designer can commit to before running. We illustrate the bound on four models and four benchmarks ($k=5$ splits of $n=100$), and add a parallel MMLU prompt-template study to put the bound's quantization-noise scale alongside the prompt-noise scale. Assuming $ρ_d=0.10$ (an unmeasured planning value), all observed NF4-FP16 deltas fall below the implied MDE, and most cross-split SDs lie within $\pm 1.5$ pp of the binomial reference $\sqrt{p(1-p)/n}$, so much of the variance reported as "benchmark unreliability" on $n=100$ subsamples is binomial sampling noise. The single borderline cell (OPT-WinoGrande, $|Δ|=3.2$ pp) is below the implied MDE at $ρ_d=0.10$ but above it at $ρ_d=0.05$, illustrating the planning trade-off the bound makes explicit. On MMLU, prompt-template ranges of 2-10 pp meet or exceed the largest observed quantization delta (3.2 pp), so a quantization audit that does not first fix the prompt template absorbs template variance into its noise floor. We complement the bound with a five-line pre-registration template.
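The bound can be evaluated in a few lines. The sketch below is a worked example of the stated formula, not the authors' code: the function name `paired_mde` is hypothetical, and the values `alpha=0.05`, `power=0.80`, and `m=500` (five splits of 100 items pooled) are assumptions, since the abstract does not state them.

```python
from math import sqrt
from statistics import NormalDist

def paired_mde(rho_d, m, alpha=0.05, power=0.80):
    """Conservative paired MDE bound (z_{1-alpha/2} + z_{1-beta}) * sqrt(rho_d / m),
    where power = 1 - beta, rho_d is the disagreement rate between the two
    conditions, and m is the number of paired items."""
    z = NormalDist()  # standard normal
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sqrt(rho_d / m)

# Assumed planning values: m = 500 paired items, two disagreement rates.
for rho_d in (0.10, 0.05):
    bound_pp = 100 * paired_mde(rho_d, m=500)
    print(f"rho_d={rho_d:.2f}: MDE bound = {bound_pp:.2f} pp")
```

Under these assumed settings the bound is about 3.96 pp at a disagreement rate of 0.10 and about 2.80 pp at 0.05, so a 3.2 pp delta falls between the two, the same qualitative pattern the abstract describes for the borderline cell. The paper's actual settings may differ.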

2605.28870 2026-05-29 cs.LG cs.AI

Representation Alignment Rests on Linear Structure

表示对齐依赖于线性结构

Kiril Bangachev, Guy Bresler, Yury Polyanskiy

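AI summary Studies the Platonic Representation Hypothesis through a three-part statistical framework of signal, bias, and noise. It proposes that alignment arises from linear relationships between objects and attributes, and supports the framework with evidence: linear features extracted by sparse autoencoders, bias reduced by centering and normalization, and noise caused by data scarcity.

AI Chinese abstract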

我们通过表示的三部分统计框架研究柏拉图表示假说(PRH):信号、偏差和噪声。1) 信号: 我们提出柏拉图对齐源于对象与属性之间的普遍关系,这种关系根据线性表示假说(LRH)在线性上编码。我们通过稀疏自编码器提取线性对象-属性特征,并展示这些稀疏表示通常比其稠密对应物表现出更强的跨模态对齐,从而提供证据表明LRH有助于解释PRH。2) 偏差: 由于使用的不同架构和训练过程,模型具有不同的隐式偏差。我们表明这种差异可以部分缓解。中心化和归一化一致地改善跨模型对齐。3) 噪声: 有限样本训练导致表示中的噪声。我们通过揭示词频与对齐之间在LLM和文本嵌入模型中的强且一致的正相关,提供证据表明表示噪声由数据稀缺驱动。综合信号、偏差和噪声,我们提出一个统计模型,该模型细化线性表示假说,并解释与现代AI架构中出现的表示对齐相关的进一步现象。

English abstract

We investigate the Platonic Representation Hypothesis (PRH) through a tripartite statistical framework of representations: signal, bias, and noise. 1) Signal: We propose that Platonic alignment arises from the universal relationship between objects and attributes, which is encoded linearly in representations according to the Linear Representation Hypothesis (LRH). We provide evidence that LRH helps explain PRH by extracting linear object-attribute features with sparse autoencoders and showing that these sparse representations often exhibit stronger cross-modal alignment than their dense counterparts. 2) Bias: Models have different implicit biases due to the diverse architectures and training procedures used. We show that this difference can be partially mitigated. Centering and normalization consistently improve cross-model alignment. 3) Noise: Finite-sample training leads to noise in representations. We provide evidence that representational noise is driven by data scarcity by revealing a strong and consistent positive correlation between word frequency and alignment in LLMs and text embedding models. Synthesizing signal, bias, and noise, we propose a statistical model that refines the Linear Representation Hypothesis and explains further phenomena related to the alignment of representations emerging from diverse modern AI architectures.

2605.28869 2026-05-29 cs.LG cs.AI

Balancing Multimodal Learning through Label Space Reshaping

通过标签空间重塑平衡多模态学习

Xiaoyu Ma, Weijie Zhang, Yuanhao Gao, Han Miao, Yongjian Deng, Hao Chen

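AI summary To address modality imbalance in multimodal learning, proposes BMLR, a method based on label space reshaping that improves multimodal performance by equalizing mapping difficulty across modalities.

Comments In process

AI Chinese abstract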

多模态学习常受模态不平衡问题困扰,其中收敛较快的模态主导优化,而其他模态训练不足。现有方法通常通过加强弱模态或调整优化梯度来缓解此问题。然而,这些策略主要补偿优化速率差异,往往以牺牲强模态的优化能力为代价,而未从模态层面分析这些差异如何产生。基于理论洞察和实证观察,我们认为学习速度的差异源于模态特定特征空间与共享标签空间之间映射难度的不同。为解决此问题,我们提出了平衡多模态标签重塑(BMLR),这是首个从标签侧设计促进多模态平衡的方法。BMLR重塑跨模态标签空间以均衡各模态的映射难度,从而促进模态交互并为每个模态注入更丰富的类间信息。跨多种架构的大量实验表明,BMLR持续提升多模态性能,并与多种模型设计表现出强兼容性。源代码即将发布。

English abstract

Multimodal learning often suffers from modality imbalance, where modalities that converge faster dominate optimization while others remain undertrained. Existing approaches typically mitigate this issue by strengthening the weak modality or adjusting optimization gradients. However, such strategies mainly compensate for optimization rate discrepancies, often at the expense of the strong modality's optimization capacity, without analyzing how these discrepancies arise at the modality level. Based on theoretical insights and empirical observations, we argue that the discrepancy in learning pace arises from differences in mapping difficulty between each modality-specific feature space and the shared label space. To address this issue, we propose Balanced Multimodal Label Reshaping (BMLR), the first method that promotes multimodal balance through label-side design. BMLR reshapes the cross-modal label space to equalize mapping difficulty across modalities, thereby facilitating modality interaction and injecting richer inter-class information into each modality. Extensive experiments across multiple architectures demonstrate that BMLR consistently improves multimodal performance and exhibits strong compatibility with diverse model designs. The source code will be released soon.

2605.28868 2026-05-29 cs.LG cs.AI

TaxDistill: Improving Metagenomic Taxonomic Annotation via Distilled Genomic Foundation Models

TaxDistill:通过蒸馏基因组基础模型改进宏基因组分类注释

Rongye Ye, Lun Li, Zheng Luo, Yiran Zhan, Shuhui Song

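AI summary Proposes TaxDistill, a knowledge distillation framework that uses the 500M-parameter genomic foundation model GenomeOcean as a teacher network to generate soft labels, mitigating the label noise introduced by the initial retrieval tools and improving metagenomic sequence classification.

Comments The manuscript contains 14 pages, 7 figures, and 3 tables

AI Chinese abstract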

宏基因组分类注释旨在识别环境样本中DNA片段的微生物起源。依赖序列相似性的传统方法通常受到高微生物多样性和参考数据库不完整性的限制,这推动了诸如Taxometer等学习方法的发展,这些方法通过事后校正来学习更具信息量的宏基因组序列表示。然而,这些方法通常依赖于训练期间从相似性搜索工具获得的标签,这不可避免地引入了噪声,从而损害表示学习并降低分类性能。为了解决这个问题,我们提出了TaxDistill,一种用于宏基因组分类的知识蒸馏框架。我们引入GenomeOcean,一个500M参数的基因组基础模型,作为教师网络来提取深层语义特征并基于置信度生成软标签。通过将这些软标签信息蒸馏到轻量级学生网络中,TaxDistill有效减少了初始检索工具引入的标签噪声。在七个不同的CAMI2数据集上的全面实验表明,TaxDistill在大多数场景下优于现有基线。例如,在胃肠道数据集上,它将MMseqs2的F1分数从0.763提高到0.941,优于Taxometer基线。总体而言,TaxDistill为复杂宏基因组分析中的标签校正提供了一种可靠的方法。

English abstract

Metagenomic taxonomic annotation aims to identify the microbial origins of DNA fragments in environmental samples. Traditional methods that rely on sequence similarity are often constrained by the high microbial diversity and the incompleteness of reference databases, which has motivated the development of learning approaches such as Taxometer that perform post hoc correction to learn more informative metagenomic sequence representations. However, these methods typically rely on labels derived from similarity search tools during training, which inevitably introduces noise that can impair representation learning and degrade classification performance. To address this issue, we propose TaxDistill, a knowledge distillation framework for metagenomic classification. We introduce GenomeOcean, a 500M parameter genomic foundation model, as the teacher network to extract deep semantic features and generate soft labels based on confidence. By distilling this soft label information into a lightweight student network, TaxDistill effectively reduces the label noise introduced by initial retrieval tools. Comprehensive experiments on seven diverse CAMI2 datasets demonstrate that TaxDistill outperforms existing baselines in most scenarios. For instance, on the Gastrointestinal dataset, it improves the F1 score of MMseqs2 from 0.763 to 0.941, outperforming the Taxometer baseline. Overall, TaxDistill provides a reliable method for label correction in complex metagenomic analysis.

2605.28867 2026-05-29 cs.LG cs.AI

PrismFlow: Residual Dynamics for Flow Matching in Time-Series Generation

PrismFlow: 时间序列生成中流匹配的残差动力学

Junru Zhang, Lang Feng, Jinbo Wang, Xu Guo, Yucheng Wang, Han Yu, Min Wu, Yabo Dong, Duanqing Xu

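AI summary Proposes PrismFlow, which learns residual corrections within flow matching through Koopman-inspired dynamical experts and a confidence-aware winner-take-all objective, addressing the spectral distortion and poor mode coverage caused by a single global vector-field estimator in standard flow matching, and achieves state-of-the-art performance in time-series generation.

AI Chinese abstract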

生成高质量时间序列数据具有挑战性,因为现实世界的信号通常表现出多模态模式和多尺度动力学,包括振荡和高频变化。流匹配(FM)为扩散模型提供了一种高效的替代方案,但实际实现通常依赖于单个有限容量的全局向量场估计器。在这种异质的时间分布中,不同的状态可能通过邻近的流状态,同时需要不相容的条件速度。使用标准$\ell_2$速度匹配目标训练的单一估计器可能学习到局部传输场的过度平滑近似。这种估计器级别的平滑会减弱分支特定的动力学,导致频谱失真和较差的模式覆盖。为了解决这个问题,我们提出了PrismFlow,一种新的具有Koopman启发动力学专家的FM方法。每个专家在一个潜在空间中学习残差修正,其中局部非线性时间演化可以通过线性变换近似。我们进一步提出了一种置信度感知的胜者全得(WTA)目标,该目标仅更新与每个样本最对齐的专家,同时屏蔽其他专家的梯度,鼓励模式特定专业化。在采样过程中,选定的专家向全局传输场添加残差动力学修正,在保持FM稳定性的同时恢复细粒度和高频时间结构。在各种基准测试中,PrismFlow有效缓解了标准FM中的频谱收缩,并实现了最先进的性能,Context-FID提升了15.6%,判别分数提升了38.6%,同时在低数据设置下保持鲁棒性,并有效用于预测和插补。

English abstract

Generating high-quality time-series data is challenging because real-world signals often exhibit multimodal patterns and multiscale dynamics, including oscillations and high-frequency variations. Flow Matching (FM) offers an efficient alternative to diffusion models, but practical implementations typically rely on a single finite-capacity global vector-field estimator. In such heterogeneous temporal distributions, distinct regimes may pass through nearby flow states while requiring incompatible conditional velocities. A monolithic estimator trained with the standard $\ell_2$ velocity-matching objective may therefore learn an overly smoothed approximation of the local transport field. This estimator-level smoothing can attenuate branch-specific dynamics, leading to spectral distortion and poor mode coverage. To address this, we propose PrismFlow, a new FM method with Koopman-inspired dynamical experts. Each expert learns residual corrections in a latent space where local nonlinear temporal evolution can be approximated by linear transitions. We further propose a confidence-aware Winner-Take-All (WTA) objective that updates only the expert best aligned with each sample while masking gradients to the others, encouraging mode-specific specialization. During sampling, the selected expert adds a residual dynamical correction to the global transport field, preserving FM stability while recovering fine-grained and high-frequency temporal structures. Across various benchmarks, PrismFlow effectively mitigates the spectral contraction in standard FM and achieves state-of-the-art performance, with a 15.6% gain in Context-FID and a 38.6% improvement in Discriminative Score, while remaining robust in low-data settings and effective for forecasting and imputation.
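The winner-take-all idea, updating only the expert that best matches each sample, can be sketched generically. The code below is a plain WTA selection over squared velocity errors and is not PrismFlow itself: the confidence-aware weighting, the latent Koopman-style experts, and the residual form are not modeled, and the function names and toy values are hypothetical.

```python
def squared_error(pred, target):
    """Squared L2 distance between a predicted and a target velocity."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def winner_take_all(expert_preds, target):
    """Plain WTA for one sample.
    expert_preds: one velocity prediction per expert.
    Returns (winner index, loss, gradient mask). Only the winner has mask 1,
    so only its error contributes to the loss and would receive gradient."""
    errors = [squared_error(pred, target) for pred in expert_preds]
    winner = min(range(len(errors)), key=errors.__getitem__)
    mask = [1.0 if k == winner else 0.0 for k in range(len(errors))]
    loss = sum(m * e for m, e in zip(mask, errors))
    return winner, loss, mask

# Toy sample: three experts predicting a 2-D velocity.
preds = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
target = [1.0, 0.0]
winner, loss, mask = winner_take_all(preds, target)
print(winner, loss, mask)
```

In this toy case the second expert (index 1) is closest to the target, so it alone is selected and the other two are masked out. In an autodiff framework the same mask would be applied to per-expert losses before backpropagation.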

2605.28866 2026-05-29 cs.LG cs.AI

Continuity and Ordinality Matter: Constraining Time Series Tokens for Effective Time Series Analysis with Large Language Models

连续性与序数性至关重要:利用大语言模型进行有效时间序列分析的时间序列令牌约束

Musheng Li, Ziying Zhang, Cheng jin, Yuantao Gu

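AI summary To address token-based time series LLMs overlooking continuity and ordinality, proposes the COM strategy, which applies geometric constraints in the initialization and training stages and improves performance and generalization on multiple time series analysis benchmarks.

AI Chinese abstract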

基于令牌的时间序列大语言模型(TS-LLMs)已成为时间序列分析与推理的一个有前景的方向。然而,先前的研究在很大程度上忽略了时间序列令牌固有的连续性和序数性,这严重限制了模型性能。在本文中,我们认为在时间序列令牌嵌入中保留这些属性对于基于令牌的TS-LLMs的有效性至关重要。为此,我们提出了COM(连续性与序数性至关重要),这是一种连续性和序数性感知策略,将几何约束整合到初始化阶段和训练阶段。在多个时间序列分析基准上的实证结果表明,COM持续提升了基于令牌的TS-LLMs的性能,取得了有竞争力的结果和强大的泛化能力。代码可在 https://anonymous.4open.science/r/COM 获取。

English abstract

Token-based time series large language models (TS-LLMs) have emerged as a promising direction for time series analysis and reasoning. However, prior studies largely overlook the inherent continuity and ordinality of time series tokens, which substantially limits model performance. In this paper, we argue that preserving these properties in time series token embeddings is crucial for the effectiveness of token-based TS-LLMs. To this end, we propose COM (Continuity and Ordinality Matter), a continuity- and ordinality-aware strategy that integrates geometric constraints into both the initialization and training stages. Empirical results on multiple time series analysis benchmarks demonstrate that COM consistently improves the performance of token-based TS-LLMs, achieving competitive results and strong generalizability. Code is available at https://anonymous.4open.science/r/COM.

2605.28865 2026-05-29 cs.LG cs.AI

Emergent Semantic Representations in World Models through Physical Interaction without Linguistic Supervision

无需语言监督的物理交互中世界模型中的涌现语义表征

Jiayi Fang

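AI summary Trains a VAE world model on physical exploration without linguistic supervision and finds that its latent space spontaneously forms semantic structure aligned with physical geometry, with prediction performance and semantic alignment improving together, supporting physical geometry as the organizing principle of world model representations.

Comments 10 pages, 3 figures

AI Chinese abstract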

世界模型从物理探索中学习到什么,没有任何语言监督?我们认为答案由单一原则组织:物理世界的几何结构。在随机具身探索上训练基于VAE的世界模型,我们发现其潜在空间发展出反映物理几何的空间语义结构——方向准确率0.677±0.029对比随机初始化编码器的0.547,位置RSA 0.192±0.047对比随机编码器的0.029(提升6.6倍),表明训练诱导了超越CNN归纳偏置的真正结构组织。在20个时间检查点上,预测性能和语义对齐共同提升(Spearman r=-0.61, p=0.004),与共享驱动者解释一致。我们通过双重敲除确认:标准KL正则化(beta=0.1)迫使编码器远离几何结构,预测性能和语义对齐同时崩溃至接近随机水平(第50,000步),完全符合共享驱动者预测。将beta降至0.001可恢复几何访问并同时恢复两种能力。这些发现确立了物理世界几何作为世界模型表征的组织原则,对设计语义基础的具身智能体具有直接意义。

English abstract

What does a world model learn from physical exploration, without any linguistic supervision? We argue the answer is organized by a single principle: the geometric structure of the physical world. Training a VAE-based world model on random embodied exploration, we find that its latent space develops spatial semantic structure that mirrors physical geometry -- direction accuracy 0.677 ± 0.029 versus 0.547 for a randomly initialized encoder, and position RSA 0.192 ± 0.047 versus 0.029 for random encoders (6.6x improvement), showing that training induces genuine structural organization beyond CNN inductive bias. Across 20 temporal checkpoints, prediction performance and semantic alignment co-improve (Spearman r=-0.61, p=0.004), consistent with the shared-driver account. We confirm this through a double knockout: standard KL regularization (beta=0.1) forces the encoder away from geometric structure, and both prediction performance and semantic alignment collapse simultaneously to near-chance by step 50,000 -- exactly as the shared-driver account predicts. Reducing beta to 0.001 restores geometric access and recovers both capabilities together. These findings establish physical world geometry as the organizing principle of world model representations, with direct implications for the design of semantically grounded embodied agents.

2605.28864 2026-05-29 cs.AI cs.CL

The Cognitive Categorical Transformer: Category-Theoretic Inductive Biases for Language Modeling

认知范畴变换器:用于语言建模的范畴论归纳偏置

Al Kari

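AI summary Proposes the Cognitive Categorical Transformer (CCT), which adds components grounded in category theory and cognitive science and reaches 21.27 validation perplexity on WikiText-103 with 306M parameters, 2.92 PPL (12% relative) below the GPT-2 Small baseline; ablations attribute 84% of the improvement to simplicial message passing.

AI Chinese abstract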

认知范畴变换器(CCT)是一个306M参数的架构,它通过源自范畴论和认知科学的认知启发组件增强了预训练的GPT-2 Small骨干网络。在WikiText-103上采用匹配步数协议(215,000优化器步数、匹配数据、匹配优化器和调度)下,CCT达到21.27验证困惑度,而相同微调的GPT-2 Small基线为24.19。因此,该架构在领域内微调本身之外贡献了2.92 PPL(12%相对)的降低。一个从头开始重训练的消融实验,在整个七阶段激活调度中保持GT-Full单纯复形消息传递绕过,达到23.72 PPL,将84%的架构改进(2.45 of 2.92 PPL)归因于GT-Full。我们首次提供了消融验证的证据,表明单纯复形消息传递在WikiText-103上以306M参数规模改善了语言模型困惑度。已发表的GPT-2 Large在WikiText-103上以比GPT-2 Small多6.2倍的参数达到22.05零样本困惑度;本文将这一数字视为外部已发表参考,而非架构基准。关于一致性风格的范畴先验(层平滑、伴随往返、曲率正则化)的三个负面结果,以及GT-Full和PrecisionWeightedPP的联合结构先验结果,共同支持了一个经验模式,称为*结构/一致性区分*,其中添加新拓扑的范畴先验改善了语言建模,而强制执行一致性恒等式的范畴先验则没有。

English abstract

The Cognitive Categorical Transformer (CCT) is a 306M-parameter architecture that augments a pretrained GPT-2 Small backbone with cognitively grounded components that draw on category theory and cognitive science. Under a matched-step protocol (215,000 optimizer steps, matched data, matched optimizer and schedule) on WikiText-103, CCT reaches 21.27 validation perplexity, compared with 24.19 for an identically fine-tuned GPT-2 Small baseline. The architecture therefore contributes a 2.92 PPL (12% relative) reduction beyond what in-domain fine-tuning alone provides. A retrain-from-scratch ablation that holds GT-Full simplicial message passing bypassed across the entire seven-phase activation schedule reaches 23.72 PPL, localizing 84% of the architectural improvement (2.45 of 2.92 PPL) to GT-Full. We present the first ablation-validated evidence that simplicial message passing improves language-model perplexity at the 306M-parameter scale on WikiText-103. Published GPT-2 Large reaches 22.05 zero-shot PPL on WikiText-103 with 6.2x more parameters than GPT-2 Small; this paper treats that number as an external published reference, not as the architectural benchmark. Three negative results on consistency-style categorical priors (sheaf smoothing, adjunction round-trip, curvature regularization) and the joint structural-prior result for GT-Full and PrecisionWeightedPP together support an empirical pattern termed the *structure/consistency distinction*, in which categorical priors that add new topology improve language modeling and those that enforce a consistency identity do not.
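The headline percentages follow directly from the three perplexity figures quoted in the abstract. The short check below uses only those reported numbers; the variable names are illustrative.

```python
# Perplexities as reported in the abstract (WikiText-103 validation).
baseline_ppl = 24.19  # identically fine-tuned GPT-2 Small
cct_ppl = 21.27       # full CCT
ablated_ppl = 23.72   # CCT retrained with GT-Full bypassed

total_gain = baseline_ppl - cct_ppl           # reduction from the architecture
relative_gain = total_gain / baseline_ppl     # as a fraction of the baseline
gt_full_gain = ablated_ppl - cct_ppl          # part lost when GT-Full is bypassed
gt_full_share = gt_full_gain / total_gain     # share attributed to GT-Full

print(f"total gain: {total_gain:.2f} PPL ({100 * relative_gain:.0f}% relative)")
print(f"GT-Full: {gt_full_gain:.2f} PPL ({100 * gt_full_share:.0f}% of the gain)")
```

The attribution to GT-Full is the gap between the ablated model and the full model (23.72 - 21.27), divided by the gap between the baseline and the full model.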

2605.28863 2026-05-29 cs.LG cs.AI

Self-Play Reinforcement Learning under Imperfect Information in Big 2

大二(Big 2)中不完全信息下的自我对弈强化学习

Aalok Patwa

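AI summary This paper presents a self-play reinforcement learning framework that compares policy-gradient and value-approximation methods in Big 2, a four-player imperfect-information card game; it finds that PPO outperforms the other methods and shows the effectiveness of moderate entropy regularization and current-policy self-play.

Comments 11 pages

AI Chinese abstract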

不完全信息多人游戏测试智能体在隐藏信息、稀疏奖励和非平稳对手下的行动能力。我们在Big 2(一个四人不完全信息纸牌游戏)中研究这些挑战。我们为Big 2开发了一个自我对弈强化学习框架,能够对策略梯度和值近似智能体进行受控比较。在共同的环境、输入表示、训练预算和评估协议下,PPO在对抗随机、贪婪和启发式Big 2对手时优于蒙特卡洛Q近似、SARSA和Q学习。我们进一步发现,适度的熵正则化通过防止策略变得过于确定性来改进PPO,并且当前策略自我对弈比检查点自我对弈或固定对手训练提供了更强的有限预算课程。这些结果共同表明,Big 2是研究不完全信息、多人交互、延迟奖励和可变动作集下深度强化学习的一个有用的受控环境。

English abstract

Imperfect-information multiplayer games test whether agents can act under hidden information, sparse rewards, and non-stationary opponents. We study these challenges in Big 2, a four-player imperfect-information card game. We develop a self-play RL framework for Big 2 that enables controlled comparisons between policy-gradient and value-approximating agents. Under a common environment, input representation, training budget, and evaluation protocol, PPO outperforms Monte Carlo Q approximation, SARSA, and Q-learning against random, greedy, and heuristic Big 2 opponents. We further find that moderate entropy regularization improves PPO by preventing the policy from becoming overly deterministic, and that current-policy self-play provides a stronger finite-budget curriculum than checkpoint self-play or fixed-opponent training. Together, these results show that Big 2 is a useful controlled setting for studying deep RL under imperfect information, multiplayer interaction, delayed rewards, and variable action sets.

2605.28862 2026-05-29 cs.LG q-bio.QM

Molecular Lead Optimization via Agentic Tool Planning

通过智能体工具规划进行分子先导优化

Lingxiao Li, Haobo Zhang, Ruohao Fan, Bin Chen, Jiayu Zhou

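AI summary Proposes TRACE, a trajectory-aware LLM-reasoning agent that models lead optimization as a sequential decision problem and uses tool selection for forward-looking molecular optimization under structural constraints, outperforming baselines on ADMET tasks.

Comments 12 pages

AI Chinese abstract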

药物发现是一个漫长且资源密集的过程,由多个阶段组成。其中,先导优化在将早期命中化合物转化为可行的候选药物中起着关键作用。这一阶段需要通过细微的结构修饰来改善ADMET相关性质,同时保留负责与疾病靶点结合亲和力的关键分子子结构。人工智能的最新进展在加速药物发现的各个方面显示出前景;然而,大多数现有的先导优化方法依赖于一步式分子优化,未能考虑序列设计决策的长期后果。为了解决这一限制,我们提出了TRACE,一种用于分子先导优化的轨迹感知、LLM推理智能体,它将工具选择形式化为一个关于动作轨迹的序列决策问题。给定一个先导分子和一个优化目标,TRACE在分子优化工具上做出轨迹感知的决策,从而在结构约束下实现前瞻性优化。在多个ADMET优化任务上的实验表明,与基线模型相比,我们的智能体实现了更高的优化成功率、更大的性质改进和更高的有效性,同时保持了分子相似性。

English abstract

Drug discovery is a lengthy and resource-intensive process composed of multiple stages. Among these stages, lead optimization plays a critical role in transforming early hit compounds into viable drug candidates. This stage requires improving ADMET-related properties through subtle structural refinement while preserving key molecular substructures responsible for binding affinity to disease targets. Recent advances in artificial intelligence have shown promise in accelerating various aspects of drug discovery; however, most existing approaches to lead optimization rely on one-step molecular optimization, which fails to account for the long-term consequences of sequential design decisions. To address this limitation, we propose TRACE, a trajectory-aware, LLM-reasoning agent for molecular lead optimization that formulates tool selection as a sequential decision-making problem over action trajectories. Given a lead molecule and an optimization objective, TRACE makes trajectory-aware decisions over molecular optimization tools, enabling forward-looking refinement under structural constraints. Experiments on multiple ADMET optimization tasks show that our agent achieves higher optimization success, larger property improvements, and higher validity, while preserving molecular similarity compared to baseline models.

2605.28855 2026-05-29 cs.AI

Behavior-Aware Auxiliary Corrections for Off-Policy Temporal-Difference Prediction

行为感知的辅助校正用于离策略时序差分预测

Xingguo Chen, Zhiang He, Yuchen Shen, Shangdong Yang, Chao Li, Guang Yang, Wenhao Wang

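AI summary To address the instability of off-policy temporal-difference learning, proposes behavior-aware auxiliary covariance corrections (BA-TDC/BA-TDRC), which replace the auxiliary matrix with the behavior Bellman matrix and add regularization, improving performance while preserving the fixed point and convergence.

AI Chinese abstract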

在函数近似下,离策略采样中的时序差分学习可能不稳定。TDC通过辅助协方差校正稳定离策略TD,而TDRC在单时间尺度递归中进一步正则化该校正。本文研究在线性预测设置中行为感知的辅助协方差几何替换,这是理解值函数近似特征空间动力学的标准局部模型。我们首先将TDC辅助矩阵(C)替换为行为贝尔曼矩阵(A_μ),得到BA-TDC,然后正则化同一行为感知方程得到BA-TDRC。这种两步构造将行为感知几何的贡献与正则化的贡献分离。线性分析还为神经网络值近似中出现的辅助几何设计问题提供了一个可处理模型,其中特征协方差和时间转移矩阵共同塑造最后一层校正动力学。我们给出了有限状态均值系统公式,证明了在实例化均值系统的Hurwitz稳定性条件下的不动点保持和几乎必然收敛,并通过精确线性误差递归的谱半径比较了确定性均值速率。在二状态反例、Baird反例、随机游走和Boyan链上的实验表明,行为感知替换本身在某些任务上非常有益,但正则化对于在更困难设置下实现稳健性能是必要的。

English abstract

Temporal-difference learning with function approximation can be unstable under off-policy sampling. TDC stabilizes off-policy TD through an auxiliary covariance correction, and TDRC further regularizes this correction in a single-timescale recursion. This paper studies a behavior-aware replacement of the auxiliary covariance geometry in the linear prediction setting, which is the standard local model for understanding the feature-space dynamics of value-function approximation. We first replace the TDC auxiliary matrix (C) by the behavior Bellman matrix (A_μ), yielding BA-TDC, and then regularize the same behavior-aware equation to obtain BA-TDRC. This two-step construction separates the contribution of behavior-aware geometry from the contribution of regularization. The linear analysis also provides a tractable model for an auxiliary-geometry design question that arises in neural-network value approximation, where feature covariances and temporal transition matrices jointly shape the last-layer correction dynamics. We give a finite-state mean-system formulation, prove fixed-point preservation and almost-sure convergence under a Hurwitz stability condition on the instantiated mean system, and compare deterministic mean rates through the spectral radius of the exact linear error recursion. Experiments on the two-state counterexample, Baird's counterexample, Random Walk, and Boyan Chain show that the behavior-aware replacement can be highly beneficial by itself on some tasks, but that regularization is necessary for robust performance across harder settings.
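For orientation, the baseline that the paper modifies can be written as a single update. The sketch below is one common form of an importance-weighted TDC step with linear features; it does not implement BA-TDC or BA-TDRC, whose replacement of the auxiliary matrix by A_μ is not specified in enough detail in the abstract. Variants in the literature differ in where the importance ratio `rho` is placed, and all names here are illustrative.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def tdc_step(theta, w, phi, phi_next, reward, rho, gamma, alpha, beta):
    """One TDC update with linear value estimate v(s) = theta . phi(s).
    theta: value weights, w: auxiliary weights,
    rho: importance ratio pi(a|s) / mu(a|s),
    alpha, beta: step sizes for theta and w."""
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)  # TD error
    correction = dot(w, phi)  # auxiliary estimate of the expected TD error
    new_theta = [
        t + alpha * rho * (delta * p - gamma * correction * pn)
        for t, p, pn in zip(theta, phi, phi_next)
    ]
    new_w = [wi + beta * rho * (delta - correction) * p for wi, p in zip(w, phi)]
    return new_theta, new_w

# Toy transition with two features; with w = 0 the correction term vanishes
# and the theta update reduces to an ordinary TD(0) step.
theta, w = tdc_step(
    theta=[0.0, 0.0], w=[0.0, 0.0],
    phi=[1.0, 0.0], phi_next=[0.0, 1.0],
    reward=1.0, rho=1.0, gamma=0.9, alpha=0.1, beta=0.05,
)
print(theta, w)
```

The auxiliary weights `w` are what the abstract calls the auxiliary covariance correction: in plain TDC they solve a least-squares problem shaped by the feature covariance, and this is the geometry the paper proposes to replace.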

2605.28854 2026-05-29 cs.CL cs.LG q-bio.NC

Large language models reorganize representational geometry during in-context learning

大型语言模型在上下文学习中重组表征几何结构

Hua-Dong Xiong, Li Ji-An, Robert C. Wilson, Kwonjoon Lee, Xue-Xin Wei

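AI summary Studies the reorganization of representational geometry in large language models during in-context learning, finding that performance correlates with the representational structure of the task and that a prototype-like algorithm dynamically reshapes representations to increase separability.

AI Chinese abstract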

大型语言模型(LLMs)表现出显著的灵活性:它们可以从上下文示例中适应新任务,而无需任何参数更新,这种能力被称为上下文学习(ICL)。先前关于合成任务的研究表明,ICL可以实现特定算法,展示了架构能力,并且机制分析已经识别出支持这种行为的关键回路。然而,由于上下文计算——无论其算法形式如何——依赖于高维表征空间中的变换,该空间的几何结构如何塑造ICL的有效性仍不清楚。受神经科学中将分类视为神经表征解缠的观点启发,我们假设ICL依赖于任务相关表征的成功在线解缠。为了验证这一想法,我们研究了LLMs如何对上下文示例进行分类,这些示例的标签由模型自身具有已知结构的内部表征定义。我们表明,ICL性能与底层分类任务的表征结构系统性相关,并且成功的ICL伴随着几何重组,增加了在线可分性。我们进一步发现,LLM的行为可以通过一种原型类算法很好地描述,该算法在重塑表征以支持分类的同时整合证据。这些发现为预训练LLMs中的ICL提供了几何解释,将表征几何结构确立为ICL的机制约束,并量化了预训练表征所能提供的与上下文学习所能利用之间的差距。

English abstract

Large language models (LLMs) exhibit remarkable flexibility: they can adapt to novel tasks from in-context examples without any parameter updates, a capability known as in-context learning (ICL). Prior work on synthetic tasks has shown that ICL can implement specific algorithms, demonstrating architectural competence, and mechanistic analyses have identified key circuits that support this behavior. However, because in-context computation -- regardless of its algorithmic form -- relies on transformations in high-dimensional representation space, it remains unclear how the geometry of that space shapes ICL effectiveness. Motivated by the neuroscience view of classification as the untangling of neural representations, we hypothesize that ICL depends on the successful online untangling of task-relevant representations. To test this idea, we study how LLMs classify in-context examples whose labels are defined by the model's own internal representations with known structure. We show that ICL performance correlates systematically with the representational structure of the underlying classification task and that successful ICL is accompanied by geometric reorganization that increases online separability. We further find that LLM behavior is well described by a prototype-like algorithm that integrates evidence while reshaping representations to support classification. These findings offer a geometric account of ICL in pretrained LLMs, establish representational geometry as a mechanistic constraint on ICL, and quantify the gap between what pretrained representations afford and what in-context learning can exploit.

2605.28849 2026-05-29 cs.AI

Behavior-Induced Mirror-Prox Temporal-Difference Learning for Faster Off-Policy Prediction

行为诱导的镜像-近似时间差分学习用于更快的离策略预测

Xingguo Chen, Yuchen Shen, Shangdong Yang, Chao Li, Guang Yang, Wenhao Wang

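AI summary Proposes a behavior-induced Mirror-Prox temporal-difference method (STHTD-MP) that improves the geometry of off-policy prediction by replacing the covariance metric with the symmetric part of the behavior-policy Bellman matrix, and proves its convergence and a smaller mean contraction factor.

AI Chinese abstract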

梯度时间差分方法通过线性函数逼近提供稳定的离策略预测,但其实际性能强烈受辅助变量度量诱导的几何结构影响。现有的镜像-近似TD方法通常使用特征协方差度量,而混合TD方法表明行为策略转移信息可以提供更具信息性的更新几何结构。本文提出一种行为诱导的镜像-近似时间差分方法,称为STHTD-MP,它将原始-对偶鞍点公式中的协方差度量替换为行为策略Bellman矩阵的对称部分。该方法对原始变量和辅助变量保持单一学习率,并对得到的混合鞍点算子应用镜像-近似预测-校正步骤。我们在标准随机逼近假设下对固定策略线性预测提供了形式化的收敛分析:行为诱导度量正定,联合均值系统Hurwitz稳定,有界性通过Lyapunov论证得到,随机递归通过ODE方法收敛。我们进一步推导了投影预言遍历间隙界,并基于确定性镜像-近似误差矩阵的谱半径与GTD2-MP进行了精确的均值算子比较。分析表明,当行为诱导度量改善鞍点几何结构时,STHTD-MP可以比GTD2-MP具有更小的平均收缩因子。在二状态、随机游走和Boyan Chain基准上的精确数值均值算子分析支持了这一条件,而Baird的反例被识别为一个奇异边界情况,其中严格假设不成立。

English abstract

Gradient temporal-difference methods provide stable off-policy prediction with linear function approximation, but their practical performance is strongly affected by the geometry induced by the auxiliary-variable metric. Existing Mirror-Prox TD methods typically use the feature covariance metric, whereas hybrid TD methods suggest that behavior-policy transition information can provide a more informative update geometry. This paper proposes a behavior-induced Mirror-Prox temporal-difference method, called STHTD-MP, which replaces the covariance metric in the primal-dual saddle-point formulation with the symmetric part of the behavior-policy Bellman matrix. The method keeps a single learning rate for the primal and auxiliary variables and applies a Mirror-Prox prediction-correction step to the resulting hybrid saddle-point operator. We provide a formal convergence analysis for fixed-policy linear prediction under standard stochastic approximation assumptions: the behavior-induced metric is positive definite, the joint mean system is Hurwitz, boundedness follows from a Lyapunov argument, and the stochastic recursion converges by the ODE method. We further derive projected-oracle ergodic gap bounds and an exact mean-operator comparison with GTD2-MP based on the spectral radius of the deterministic Mirror-Prox error matrix. The analysis shows that STHTD-MP can have a smaller mean contraction factor than GTD2-MP when the behavior-induced metric improves the saddle-point geometry. Exact numerical mean-operator analysis on two-state, Random Walk, and Boyan Chain benchmarks supports this condition, while Baird's counterexample is identified as a singular boundary case where the strict assumptions fail.

2605.28848 2026-05-29 cs.CL cs.AI

GPF-LiveNews: A Streaming Evaluation Protocol for Group-Conditioned Framing in Large Language Models

GPF-LiveNews: 大型语言模型中群体条件框架的流式评估协议

Mohd Ariful Haque, Fahad Rahman, Kishor Datta Gupta, Roy George

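AI summary Proposes GPF-LiveNews, a streaming evaluation protocol that combines live news anchors with identity labels to detect audience-dependent semantic sensitivity and sentiment disparity in LLM outputs, for auditing group-conditioned framing.

AI Chinese abstract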

部署的语言模型在非静态环境中进行评估:模型版本、检索层、安全系统和真实世界输入都随时间变化。静态偏差基准仍然有用,但它们无法显示模型如何针对不同提示受众构建新出现事件的框架。我们引入了GPF-LIVENEWS,这是一个流式评估协议和基准快照,用于审计开放端LLM输出中的群体条件框架。该协议扩展了来自BBC/路透社的最新新闻锚点,涵盖42个身份标签和七个提示族,然后使用语义敏感性和情感差异信号评估响应束。在12次监控运行和23个托管模型的试点中,政策/行动提示产生了最强的语义运动,而情感变化在维度和提示族之间较为平坦。发布的工件包括文章元数据、提示模板、实例化提示、模型输出元数据、评分表、文档和复现脚本。我们将所有评分解释为用于人工审查的观察窗口审计信号,而非永久性的公平性排名或有害偏差的直接证据。

English abstract

Deployed language models are evaluated in a non-stationary environment: model versions, retrieval layers, safety systems, and real-world inputs all change over time. Static bias benchmarks remain useful, but they do not show how models frame newly emerging events for different prompted audiences. We introduce GPF-LIVENEWS, a streaming evaluation protocol and benchmark snapshot for auditing group-conditioned framing in open-ended LLM outputs. The protocol expands fresh BBC/Reuters news anchors across 42 identity labels and seven prompt families, then evaluates response bundles using semantic-sensitivity and sentiment-disparity signals. In a pilot over 12 monitoring runs and 23 hosted models, Policy/Action prompts produce the strongest semantic movement, while sentiment variation is flatter across dimensions and prompt families. The released artifact includes article metadata, prompt templates, instantiated prompts, model-output metadata, score tables, documentation, and reproduction scripts. We interpret all scores as observed-window audit signals for human review, not as permanent fairness rankings or direct proof of harmful bias.