arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.23087 2026-05-25 cs.LG

The Implicit Bias of Depth: From Neural Collapse to Softmax Codes

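Connall Garrod, Jonathan P. Keating, Christos Thrampoulidis

AI summary: This study examines how the implicit bias of gradient descent in deep neural networks affects neural collapse (NC). By analyzing the unregularized deep unconstrained feature model (UFM), it finds that depth itself introduces an implicit low-rank bias that makes the network favor low-rank feature representations, which are related to optimal solutions in the form of softmax codes. The study also shows how depth affects training dynamics and the region in which training converges to NC, and notes that increasing network width may push training toward higher-rank solutions, offering a new theoretical perspective on the implicit bias of deep models.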

Comments 46 pages, 11 figures, accepted at the International Conference on Machine Learning 2026

Abstract

Neural collapse (NC) describes the structured geometry that emerges in the features and weights of trained classifiers. Recent theory suggests NC can be suboptimal in deep architectures, attributing this to an explicit low-rank bias from L2 regularization. We study the deep unconstrained feature model (UFM), equivalent to a deep linear network with orthogonal inputs, trained without regularization, to isolate how gradient descent and depth alone shape NC. We show that depth induces an implicit low-rank bias: low-rank matrices propagate norm more efficiently through successive multiplications, promoting low-rank alternatives to NC. These alternatives, we argue, correspond to softmax codes: max-margin solutions previously found in width-bottlenecked networks. Analyzing training dynamics under spectral initialization, we identify an early-time repulsion among singular values that drives low-rank emergence, and characterize how depth shrinks NC's basin of attraction. Finally, we show that some effects act in the opposite direction: for randomly initialized networks, increasing width biases training toward higher-rank solutions. Our results provide the first asymptotic and dynamic characterization of implicit bias in deep UFMs trained with unregularized multiclass cross-entropy.

2605.23081 2026-05-25 cs.LG

ThriftAttention: Selective Mixed Precision for Long-Context FP4 Attention

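Joe Sharratt

AI summary: The quadratic computational cost of attention is a key challenge in long-context tasks. To address it, ThriftAttention proposes a selective mixed-precision method that markedly improves model quality in long-context settings while retaining FP4 inference efficiency. Using a staged strategy, the method computes a small number of important query-key block pairs in FP16 and the remaining blocks in FP4, then merges the results via online softmax, recovering 89.1% of the FP4-to-FP16 performance gap with only 5% of blocks in FP16.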

Abstract

Efficient attention algorithms are critical to mitigate the quadratic cost of attention in long-context workloads. Prior work utilises block-scaled quantisation techniques on Blackwell GPUs to move attention computation to 4-bit precision to accelerate inference. However, these techniques result in significant quality degradation in long-context settings. We show that the output impact of quantisation error is highly non-uniform and increases with the importance of each query-key interaction, concentrating functionally relevant error in a small number of attention blocks that contain the most important tokens. We propose ThriftAttention, a low-bit attention variant that delivers near-FP16 long-context quality at FP4 inference efficiency. This approach proceeds in two stages. First, a heuristic rapidly selects a small number of important query-key block pairs for FP16 precision. Second, the selected blocks are computed in FP16 and the remaining blocks in FP4, with both paths merged via online softmax into a single output. We demonstrate across long-context benchmarks and model families that by computing only 5% of query-key blocks in FP16, ThriftAttention recovers on average 89.1% of the FP4-to-FP16 performance gap. We show ThriftAttention's advantage grows with sequence length, mitigating the systematic FP4 quality degradation observed at longer contexts. The code is available at https://github.com/joesharratt1229/ThriftAttention.
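
Illustrative sketch (not taken from the paper or its repository): the abstract says the FP16 and FP4 paths are "merged via online softmax into a single output". The snippet below shows only that merging step for one query, using the standard running-max rescaling. Both paths are computed in ordinary Python floats, so no quantisation or block selection is modelled, and the function names `partial_attention`, `merge_partials`, and `finalize` are hypothetical.

```python
import math

def partial_attention(scores, values):
    """Softmax state for one subset of keys: (max score, sum of exps, unnormalised output)."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    dim = len(values[0])
    acc = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return m, sum(weights), acc

def merge_partials(a, b):
    """Combine two partial states by rescaling both to a shared maximum."""
    m_a, l_a, acc_a = a
    m_b, l_b, acc_b = b
    m = max(m_a, m_b)
    scale_a, scale_b = math.exp(m_a - m), math.exp(m_b - m)
    l = l_a * scale_a + l_b * scale_b
    acc = [x * scale_a + y * scale_b for x, y in zip(acc_a, acc_b)]
    return m, l, acc

def finalize(state):
    """Normalise the accumulated output by the softmax denominator."""
    _, l, acc = state
    return [x / l for x in acc]

# One query, four keys: keys 0-1 stand in for the "selected" path, keys 2-3 for the rest.
scores = [2.0, -1.0, 0.5, 3.0]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
full = finalize(partial_attention(scores, values))
merged = finalize(merge_partials(partial_attention(scores[:2], values[:2]),
                                 partial_attention(scores[2:], values[2:])))
```

When both paths are exact, `merged` agrees with `full` up to floating-point rounding, which is why the two precisions can be computed separately and combined afterwards.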

2605.23078 2026-05-25 cs.LG cs.CL

GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs

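Jianing Deng, Song Wang, Dongwei Wang, Zijie Liu, Tianlong Chen, Huanrui Yang, Jingtong Hu

AI summary: Mixture-of-Experts large language models (MoE-LLMs) perform well but incur high memory overhead because of their many expert parameters. To address this, the paper proposes GEMQ, a global expert-level mixed-precision quantization method that captures model-wide expert importance through a global linear-programming formulation and combines it with efficient router fine-tuning to adapt to the quantized experts, achieving a better accuracy-memory trade-off. Experiments show that GEMQ significantly reduces memory usage and accelerates inference while preserving accuracy.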

Comments ICML 2026

Abstract

Mixture-of-Experts Large Language Models (MoE-LLMs) achieve strong performance but incur substantial memory overhead due to massive expert parameters. Mixed-precision quantization mitigates this cost by allocating expert-wise bit-widths based on their importance, approaching the accuracy-memory Pareto frontier and enabling extreme low-bit quantization. However, existing methods rely on layer-wise importance estimation and overlook router shifts induced by quantization, resulting in suboptimal allocation and routing. In this work, we propose Global Expert-level Mixed-precision Quantization (GEMQ) to overcome these limitations via (1) a global linear-programming formulation that captures model-wide expert importance based on quantization error analysis, and (2) efficient router fine-tuning to adapt routing to quantized experts. These components are integrated into a progressive quantization framework that iteratively refines importance estimation and allocation. Experiments demonstrate that GEMQ significantly reduces memory and accelerates inference with minimal accuracy degradation. Source code is available at https://github.com/jndeng/GEMQ .
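
Illustrative sketch (not the authors' formulation): the abstract describes a global allocation of expert-wise bit-widths under a memory constraint. The toy below states the same kind of problem, choosing one bit-width per expert to minimise total estimated error within a bit budget, but solves it by brute-force enumeration rather than a linear program. The error table, expert sizes, and the name `allocate_bits` are made-up assumptions.

```python
from itertools import product

def allocate_bits(errors, sizes, budget_bits):
    """Pick one bit-width per expert, minimising summed error within the budget.

    errors[e][b] is a (hypothetical) estimated error of expert e at bit-width b;
    sizes[e] is its parameter count. Returns (total_error, bit_widths) or None.
    """
    bit_options = sorted(errors[0])
    best = None
    for choice in product(bit_options, repeat=len(errors)):
        memory = sum(b * n for b, n in zip(choice, sizes))
        if memory > budget_bits:
            continue  # allocation does not fit in memory
        total_error = sum(errors[e][b] for e, b in enumerate(choice))
        if best is None or total_error < best[0]:
            best = (total_error, choice)
    return best

errors = [{2: 9.0, 4: 1.0}, {2: 2.0, 4: 0.5}, {2: 8.0, 4: 1.5}]
sizes = [100, 100, 100]
best = allocate_bits(errors, sizes, budget_bits=1000)
```

With this toy table the budget allows only two experts at 4 bits, and the search keeps 4 bits for the two experts that lose the most at 2 bits. Enumeration is exponential in the number of experts, so it is only a stand-in for a real solver.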

2605.23074 2026-05-25 cs.AI

PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning

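Lingyu Jiang, Zirui Li, Shuo Xing, Peiran Li, Tsubasa Takahashi, Dengzhe Hou, Zhengzhong Tu, Kazunori Yamada, Fangzhou Lin

AI summary: As large language models are increasingly applied to reasoning tasks, controlling their reasoning paths efficiently has become a key problem. The paper proposes PathCal, a training-free decoding controller that calibrates reasoning paths by distinguishing between types of reflection markers and intervening only at locally uncertain states. Experiments show that PathCal improves the efficiency-performance trade-off on multiple reasoning benchmarks, reducing generation length without sacrificing accuracy.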

Comments 21 pages, 5 figures, 7 tables

Abstract

The emergence of Large Reasoning Language Models (LRMs) has paved the way for tackling complex reasoning tasks through test-time scaling by generating long-form Chain-of-Thought (CoT) trajectories during inference. Meanwhile, these trajectories often contain explicit reflection markers such as "wait", "but", and "alternatively", signaling hesitation, revision, and the consideration of alternative explorations, respectively. Recent studies on test-time control leverage such markers as lightweight handles for steering reasoning, typically treating them as a single coarse-grained category rather than distinguishing their distinct functional roles. In this paper, we conduct type-wise suppression and fixed-prefix intervention, revealing that reflection markers differ not only in their functional roles but also in when they exert the greatest influence. Specifically, different marker classes affect accuracy and generation length in distinct ways, and marker choices are most consequential before the model settles into a stable reasoning trajectory. Motivated by these findings, we introduce PathCal, a novel training-free decoding controller that calibrates reasoning paths by distinguishing marker types and intervening only at locally uncertain states. At each decoding step, PathCal utilizes the distribution over reflection markers to estimate local competition between maintaining the current reasoning trajectory and initiating a competing branch, and softly rebalances marker logits when competing-branch evidence becomes excessive. Experiments across six reasoning benchmarks demonstrate that PathCal achieves a better efficiency-performance trade-off, improving or preserving accuracy while reducing generation length, without relying on external verifiers or additional sampling.
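
Illustrative sketch (an assumption, not PathCal's actual rule): the abstract describes estimating competition between marker groups from the distribution over reflection markers and softly rebalancing marker logits when branch evidence is excessive. One simple way to express that idea is below. The grouping of markers, the `threshold` and `strength` parameters, and the linear penalty are all hypothetical.

```python
import math

def rebalance_marker_logits(logits, branch_markers, keep_markers,
                            threshold=0.6, strength=2.0):
    """Lower branch-marker logits when their share of marker probability is too high.

    logits: dict token -> logit for the current decoding step.
    Returns (adjusted logits, branch share among marker tokens).
    """
    branch, keep = set(branch_markers), set(keep_markers)
    markers = branch | keep
    m = max(logits[t] for t in markers)
    exp = {t: math.exp(logits[t] - m) for t in markers}
    branch_share = sum(exp[t] for t in branch) / sum(exp.values())
    adjusted = dict(logits)
    excess = branch_share - threshold
    if excess > 0:
        for t in branch:
            adjusted[t] -= strength * excess  # soft penalty, not a hard ban
    return adjusted, branch_share

step_logits = {"wait": 3.0, "alternatively": 2.0, "so": 1.0, "the": 0.5}
adjusted, share = rebalance_marker_logits(
    step_logits, branch_markers={"wait", "alternatively"}, keep_markers={"so"})
```

Non-marker tokens such as "the" are left untouched, and nothing changes when the branch share stays below the threshold, which mirrors the "intervene only at locally uncertain states" idea in a very reduced form.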

2605.23071 2026-05-25 cs.CL

The Efficiency Frontier: A Unified Framework for Cost-Performance Optimization in LLM Context Management

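Binqi Shen, Lier Jin, Hanyu Cai, Lan Hu, Yuting Xin

AI summary: As demand for long-context processing in large language models grows, expanding context windows brings significant computational and financial costs. The paper proposes The Efficiency Frontier, a unified framework for cost-performance optimization in context management that models context strategy selection as a deployment-aware optimization problem, jointly considering task performance, token cost, and preprocessing reuse. The framework reveals the operating conditions under which retrieval and preprocessing strategies are each appropriate, and experiments show clear gains in reducing token usage and cost.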

Abstract

Large language models (LLMs) increasingly rely on long-context processing, but expanding context windows introduces substantial computational and financial costs. Existing context reduction approaches, including retrieval and memory compression methods, are typically evaluated using performance and efficiency metrics independently, limiting systematic comparison and deployment-aware decision-making. This paper introduces The Efficiency Frontier, a unified framework for cost-performance optimization in LLM context management. The framework models context strategy selection as a deployment-aware optimization problem that jointly accounts for task performance, token cost, and preprocessing reuse through amortized cost modeling. Unlike existing evaluations that compare methods in isolation, the proposed framework enables decision-oriented analysis of when different context management strategies become preferable under varying operational conditions. Evaluated on 5,000 HotpotQA instances, the framework reveals distinct operational regimes and transition boundaries between retrieval-based and preprocessing-based strategies. Results show that deployment-aware optimization reduces effective token usage by approximately 25% at comparable performance ($F1 \approx 0.78$), while amortized memory compression achieves over 50% lower token cost relative to full-context prompting in higher-performance settings. Overall, the proposed framework provides a principled and practical foundation for evaluating and deploying scalable, efficient, and sustainable LLM systems.
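
Illustrative sketch (assumed cost model, not the paper's): the abstract mentions "amortized cost modeling" of preprocessing reuse. A minimal version spreads a one-time preprocessing cost over the number of queries that reuse it and compares the result with a baseline per-query cost. The function names and token counts are hypothetical.

```python
def amortized_cost(preprocess_tokens, per_query_tokens, reuse_count):
    """Average tokens per query when a one-time preprocessing cost is shared."""
    return preprocess_tokens / reuse_count + per_query_tokens

def break_even_reuse(preprocess_tokens, per_query_tokens, baseline_tokens):
    """Number of reuses at which preprocessing matches the baseline cost.

    Returns None when the preprocessed strategy never becomes cheaper.
    """
    saving = baseline_tokens - per_query_tokens
    if saving <= 0:
        return None
    return preprocess_tokens / saving

# Made-up numbers: 6000 tokens to build a compressed memory, then 1000 tokens
# per query, versus 4000 tokens per query for full-context prompting.
cost_at_4 = amortized_cost(6000, 1000, reuse_count=4)
threshold = break_even_reuse(6000, 1000, baseline_tokens=4000)
```

Under these made-up numbers the preprocessed strategy breaks even after two reuses, which is the kind of transition boundary a deployment-aware comparison would look for.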

2605.23070 2026-05-25 cs.CV

Flow Mismatching: Unsupervised Anomaly Detection via Velocity Discrepancies in Flow Matching Models

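Shengzhe Chen, Mehrdad Moradi, Kamran Paynabar, Hao Yan

AI summary: The paper proposes Flow Mismatching, an unsupervised anomaly detection method that avoids the reconstruction-based paradigm and instead uses velocity discrepancies in flow matching models to detect anomalies. The method identifies anomalous regions by analyzing the disagreement between the model-predicted velocity and the geometric path velocity along affine paths from Gaussian noise to the target image. Experiments show that it outperforms recent state-of-the-art reconstruction-based and flow matching-based methods on several benchmark datasets.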

Abstract

We propose Flow Mismatching, an unsupervised anomaly detection method that deliberately avoids reconstruction-based paradigms. Instead, we treat flow matching as geometric dynamics and leverage a key insight: anomalies occur at places where the learned normal flow disagrees with the geometric path toward a test image. Given a flow matching model trained only on normal images, we probe its learned velocity field along affine paths from Gaussian noise to a target image. Along each path, we compare the model-predicted velocity, which follows normal generative dynamics, with the geometric velocity toward the target, which includes any anomalous content. Anomalies induce strong local disagreement between these velocities. Aggregating the mismatch over different time steps and multiple paths yields pixel-wise heatmaps and image-level scores without test-time optimization, feature memories, or additional calibration. Our analysis shows that the population mismatch decomposes into an irreducible denoising term and a Fisher-divergence term between the test-path and normal-path score functions, which identifies the score-gap component that drives anomaly separation and explains the effectiveness of robust path aggregation. Extensive experiments on MVTec-AD and VisA demonstrate superior performance compared with SOTA reconstruction-based and recent flow matching-based approaches.
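
Illustrative sketch (toy setting, not the authors' code): the abstract compares the model-predicted velocity with the geometric velocity along affine paths from Gaussian noise to the test image and aggregates the mismatch over time steps and paths. Below, the "image" is a flat list of pixels and the trained network is replaced by a closed-form velocity field for a degenerate normal class consisting of one fixed image. The plain mean used for aggregation and all names are assumptions.

```python
import random

def toy_normal_velocity(x_t, t, normal_image):
    """Stand-in for a trained model: velocity that transports x_t to normal_image at t = 1."""
    return [(mu - p) / (1.0 - t) for mu, p in zip(normal_image, x_t)]

def mismatch_heatmap(target, velocity_model, times, num_paths=4, seed=0):
    """Mean squared gap between predicted and geometric velocity, per pixel."""
    rng = random.Random(seed)
    heat = [0.0] * len(target)
    for _ in range(num_paths):
        noise = [rng.gauss(0.0, 1.0) for _ in target]
        geometric = [x - z for z, x in zip(noise, target)]  # velocity of the affine path
        for t in times:
            x_t = [(1 - t) * z + t * x for z, x in zip(noise, target)]
            predicted = velocity_model(x_t, t)
            for i, (v, g) in enumerate(zip(predicted, geometric)):
                heat[i] += (v - g) ** 2
    n = num_paths * len(times)
    return [h / n for h in heat]

normal = [0.5, 0.5, 0.5, 0.5]
test_image = [0.5, 0.5, 0.9, 0.5]  # pixel 2 carries the "anomaly"
heat = mismatch_heatmap(test_image,
                        lambda x_t, t: toy_normal_velocity(x_t, t, normal),
                        times=[0.2, 0.5, 0.8])
```

In this toy case the two velocities agree wherever the test image matches the normal image, so the heatmap is essentially zero except at the altered pixel.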

2605.23069 2026-05-25 cs.CL

DFKI-MLT at SemEval-2026 TASK 7: Steering Multilingual Models Towards Cultural Knowledge

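Yusser Al Ghussin, Daniil Gurgurov, Yasser Hamidullah, Josef van Genabith, Cristina España-Bonet, Simon Ostermann

AI summary: To address the limited cultural knowledge of multilingual large language models, the study proposes an activation-steering approach that extracts language vectors from the parallel FLORES corpus and uses them to adapt multilingual models at inference time. The team participated in both the multiple-choice and short-answer tracks of SemEval-2026 Task 7, reaching 86.96% accuracy and 7th place in the multiple-choice track. The analysis shows that the effect of activation steering varies across languages and layers, suggesting that prompt design and activation steering should be optimized together for culturally aware tasks.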

Comments Accepted to The 20th International Workshop on Semantic Evaluation at ACL 2026

Abstract

Large language models (LLMs) are increasingly used across diverse linguistic and cultural contexts, yet their cultural knowledge remains uneven across regions and languages. We present the DFKI-MLT system for SemEval-2026 Task 7 on cultural awareness, where we apply activation steering to multilingual LLMs using language vectors extracted from parallel FLORES data. Our method performs inference-time adaptation by adding language-specific steering vectors to the residual stream at a selected transformer layer, without any parameter updates. We participated in both the short-answer (SAQ) and multiple-choice (MCQ) tracks; however, only our MCQ submission received an official score. In the official MCQ track, we achieved 86.96% accuracy, ranking 7th out of 17 teams. To better understand system behavior, we conduct post-hoc analyses on the shared-task MCQ and SAQ settings. These analyses show that activation steering yields modest and heterogeneous improvements on cultural reasoning: gains are strongly layer-sensitive, vary substantially across language-region pairs, with some configurations even degrading performance, and interact with prompt formulation, comparing generic and culturally conditioned prompts. Our findings suggest that prompt design and activation steering should be jointly optimized for culturally aware multilingual inference.
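
Illustrative sketch (assumptions throughout): the abstract says language-specific steering vectors are added to the residual stream at a selected layer without parameter updates. The snippet shows that addition on plain lists. Computing the vector as a difference of mean activations between two languages is a common choice assumed here, not something the abstract states, and `language_vector`, `steer`, and `alpha` are hypothetical names.

```python
def mean_vector(activations):
    """Column-wise mean of a list of activation vectors."""
    n = len(activations)
    return [sum(col) / n for col in zip(*activations)]

def language_vector(target_acts, source_acts):
    """Assumed recipe: mean target-language activation minus mean source-language activation."""
    t, s = mean_vector(target_acts), mean_vector(source_acts)
    return [a - b for a, b in zip(t, s)]

def steer(residual_by_layer, layer, vector, alpha=1.0):
    """Return a copy of the per-layer residual states with alpha * vector added at one layer."""
    steered = [h[:] for h in residual_by_layer]
    steered[layer] = [h + alpha * v for h, v in zip(steered[layer], vector)]
    return steered

vec = language_vector([[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [2.0, 2.0]])
states = steer([[0.0, 0.0], [1.0, 1.0]], layer=1, vector=vec, alpha=0.5)
```

In a real transformer the shifted state would then flow through all later layers during the forward pass; the sketch only shows where the vector enters.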

2605.23068 2026-05-25 cs.CV

RoboSurg-VQA: A Multimodal Benchmark for Surgical Segmentation-Aware Visual Question Answering

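Chengyi Zhang, Zi Ye, Ziyang Wang

AI summary: The paper presents RoboSurg-VQA, a multimodal benchmark for evaluating segmentation-aware visual question answering in surgical scenes. Built from public surgical segmentation datasets, the benchmark pairs each image frame with a set of clinically oriented questions covering procedural context, anatomical structures, imaging modality, and the visibility of surgical instruments, and uses closed answer sets for consistent evaluation. Candidate answers are generated via constrained prompting and then audited by humans to improve plausibility and label consistency, with the aim of advancing more reliable visual understanding in robot-assisted surgery.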

Abstract

Reliable visual understanding in robot-assisted and minimally invasive surgery (RMIS/MIS) demands more than accurate masks: in clinical practice, clinicians pose language-like questions about procedural context, visibility, artefacts, and the presence of anatomical structures and surgical instruments, often under degraded views caused by occlusion, smoke, bleeding, and specular highlights. We present RoboSurg-VQA, a segmentation-aware visual question answering (VQA) benchmark built by repurposing public surgical segmentation datasets under a shared schema. Each frame is paired with a fixed set of clinically motivated questions spanning procedure context, anatomy (including region), imaging modality/view, surgical artefacts, image quality, and basic visibility and spatial attributes, with closed answer sets to enable consistent evaluation. To scale annotation, we generate candidate answers via constrained prompting with automatic validity and consistency checks, followed by human auditing to improve plausibility and label consistency. We report benchmark statistics, sanity baselines, and common evaluation challenges under challenging surgical conditions. The code will be available on https://github.com/ziyangwang007/Robosurg-VQA.

2605.23067 2026-05-25 cs.CL

What Training Data Teaches RL Memory Agents: An Empirical Study of Curriculum Effects in Memory-Augmented QA

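Xinjie He, Zhiyuan Lin, Su Liu, Jialun Wu, Qiyang Xie, Weikai Zhou, Shuai Xiao

AI summary: The study investigates how training data affects what reinforcement-learning memory agents learn in question answering. Holding the model architecture, algorithm, and hyperparameters fixed and varying only the composition of the training curriculum, it analyzes how different data mixtures affect performance. Experiments show that curriculum composition has a pronounced effect at the level of individual question types, that the mixed curriculum performs best overall, and that domain-specific training data can improve targeted skills. The study also reports two practical lessons for training in a single-GPU setting.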

Comments 14 pages, 2 figures, 11 tables. Code, checkpoints, and evaluation artifacts available at https://github.com/EvaxHe/rl-memory-curriculum

Abstract

Reinforcement learning (RL) has emerged as a viable recipe for training LLM agents to reason over external memory banks in multi-session dialogue. Existing work trains exclusively on a single benchmark, leaving open how the composition of training data shapes the skills a memory agent acquires. We present a controlled empirical study that holds architecture, RL algorithm, and all hyperparameters fixed and varies only the training curriculum across three conditions: in-domain (LoCoMo), mixed-benchmark (LoCoMo + LongMemEval), and out-of-domain (LongMemEval only). Across two benchmarks and ten question types, curriculum composition acts as a fine-grained lever on specialization rather than a uniform scaling factor on performance. The mixed curriculum yields the strongest overall F1 on both evaluation sets. Training on a narrow out-of-domain set transfers a targeted skill - temporal reasoning - despite weak aggregate performance. Per-type differences substantially exceed aggregate differences, indicating that single-number benchmark comparisons systematically underreport curriculum effects. We further report two practical lessons from adapting GRPO to a single-GPU regime: cross-benchmark mixing requires filtering format-specific noise from memory banks to preserve training signal, and binary exact-match reward produces no learning signal at the small group sizes (G = 4) required on one GPU, motivating continuous reward functions in this regime.
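
Illustrative sketch (one plausible reading, not the paper's analysis): the abstract reports that a binary exact-match reward gives no learning signal at group size G = 4. With the usual group-relative advantage (reward minus group mean, divided by group standard deviation), a group whose rewards are all identical yields zero advantages, and small groups with binary rewards are often identical. The success probability used below is made up.

```python
def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (r - mean) / (std + eps) within one sampled group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

def prob_no_signal(p_correct, group_size):
    """Chance that a binary-reward group is all-correct or all-wrong (zero advantages)."""
    return p_correct ** group_size + (1 - p_correct) ** group_size

binary = group_advantages([0, 0, 0, 0])               # every sample failed exact match
continuous = group_advantages([0.1, 0.4, 0.2, 0.3])   # e.g. a graded score per sample
dead_fraction = prob_no_signal(0.1, group_size=4)
```

Assuming a 10% exact-match rate, roughly two thirds of G = 4 groups would carry no gradient, while a graded reward almost always separates the samples.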

2605.23065 2026-05-25 cs.CV cs.AI cs.LG

Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering

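Yury Belousov, Brian Pulfer, Vitaliy Kinakh, Slava Voloshynovskiy

AI summary: The study proposes a lightweight input transformation based on multi-level Floyd-Steinberg dithering to improve the robustness of vision foundation models under adversarial attack. The method introduces controlled noise into the image, disrupting adversarial perturbations while preserving semantic content, and applies to a range of downstream tasks and model architectures. Experiments show strong performance across multiple attack settings with only small degradation on clean inputs, outperforming existing denoising baselines.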

Comments Paper accepted at the IEEE International Conference on Image Processing (ICIP 2026)

Abstract

Vision foundation models are widely used as frozen backbones across many downstream tasks, making them a single point of failure under adversarial attack. We study multi-level Floyd-Steinberg error-diffusion dithering as a lightweight, model-agnostic input transformation that disrupts adversarial perturbations while preserving semantic content. Unlike prior work, which was limited to binary dithering, grayscale CIFAR-10, and a single small model trained from scratch, we evaluate across six tasks (classification, segmentation, depth estimation, retrieval, captioning, visual question answering), two model families (DINOv2, PaliGemma), and three attacks of increasing strength (PGD, MI-FGSM, SIA), as well as an adaptive attacker using a straight-through estimator. Our results show that Floyd-Steinberg dithering at intermediate quantization levels, especially when combined with post-processing blur, exceeds or matches all tested baselines, including diffusion-based denoising, with substantially less degradation on clean inputs.
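
Illustrative sketch (textbook algorithm, not the authors' pipeline): multi-level Floyd-Steinberg dithering quantises each pixel to the nearest of a fixed number of levels and diffuses the quantisation error to unprocessed neighbours with weights 7/16, 3/16, 5/16, and 1/16. The version below handles a single-channel image stored as nested lists with values in [0, 1]. The function name and the choice of grayscale input are assumptions, and no post-processing blur is included.

```python
def floyd_steinberg(image, levels=4):
    """Dither a grayscale image (rows of floats in [0, 1]) to `levels` evenly spaced values."""
    height, width = len(image), len(image[0])
    work = [row[:] for row in image]  # running copy that accumulates diffused error
    out = [[0.0] * width for _ in range(height)]
    step = levels - 1
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            # Nearest quantisation level, clamped to the valid range.
            new = min(max(round(old * step), 0), step) / step
            out[y][x] = new
            err = old - new
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out

dithered = floyd_steinberg([[0.2, 0.7, 0.4], [0.9, 0.1, 0.6]], levels=4)
```

With `levels=2` this reduces to classic binary dithering; intermediate values of `levels` correspond to the multi-level setting the abstract refers to.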

2605.23064 2026-05-25 cs.CV cs.LG

Millimeter-wave Imaging for Anthropometric Body Measurement

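Miriam Senne, Benjamin D. Killeen, Christoph Baur, Nassir Navab, Azade Farshad

AI summary: The study proposes a contactless method for anthropometric body measurement based on millimeter-wave radar, aiming to overcome the shortcomings of conventional measurement tools in privacy, efficiency, and applicability. Using an optimization framework, the method recovers 3D human shape from millimeter-wave point clouds and extracts a comprehensive set of body measurements. Its core contribution is a vertex-weighting strategy that, combined with a parametric body model (SMPL), provides robust surface alignment and noise suppression, enabling a fast, privacy-preserving measurement workflow that requires neither undressing nor cameras and supports clinical risk assessment across patient populations.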

Abstract

Body shape and circumferences are clinically informative biomarkers for risk stratification, including measures such as waist-to-hip ratio and limb and trunk girths, yet conventional tools such as manual tape measures and optical scanners often require undressing and sustained poses. These demands slow workflows, compromise dignity, and exclude many older adults and people with limited mobility. To make measurement fast and contactless, we leverage millimeter-wave (mmWave) radar, which preserves privacy and operates through typical clothing, enabling quick full-body acquisition. In this work, we present a new optimization-based framework to recover 3D human shape and extract a comprehensive set of anthropometric measurements from volumetric mmWave data. Our method introduces a weighted registration pipeline that fits a parametric body model (SMPL) directly to the noisy mmWave point cloud. The core of our contribution is a vertex-weighting strategy that modulates a Chamfer energy function for reliable surface alignment and noise elimination. We further stabilize the fit by incorporating a foot-ground plane constraint and pose priors, optimizing directly for the SMPL parameters. Together, these components enable a fast, privacy-preserving workflow that delivers high-fidelity body shape and measurements through clothing without cameras or disrobing and with minimal cooperation, supporting frequent risk-oriented assessments in clinics and care facilities for patients of all ages and mobility levels.
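
Illustrative sketch (assumed form of the energy, not the paper's): the abstract describes a vertex-weighting strategy that modulates a Chamfer energy between the body model and the mmWave point cloud. A minimal weighted Chamfer energy is shown below with brute-force nearest neighbours. Applying the weights only to the vertex-to-point term, and the name `weighted_chamfer`, are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def weighted_chamfer(vertices, weights, points):
    """Weighted vertex-to-point term plus unweighted point-to-vertex term."""
    vertex_term = sum(
        w * min(sq_dist(v, p) for p in points) for v, w in zip(vertices, weights)
    ) / sum(weights)
    point_term = sum(min(sq_dist(p, v) for v in vertices) for p in points) / len(points)
    return vertex_term + point_term

vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # second vertex has no radar return nearby
points = [(0.0, 0.0, 0.0)]
uniform = weighted_chamfer(vertices, [1.0, 1.0], points)
downweighted = weighted_chamfer(vertices, [1.0, 0.1], points)
```

Lowering the weight of a vertex that is poorly observed reduces its pull on the fit, which is one way a weighting scheme could make the alignment less sensitive to missing or noisy returns.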

2605.23061 2026-05-25 cs.LG cs.AI math.OC stat.ML

Anytime Training with Schedule-Free Spectral Optimization

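Anuj Apte, Pranav Deshpande, Niraj Kumar, Shouvanik Chakrabarti, Junhyung Lyle Kim

AI summary: The paper proposes SF-NorMuon, a schedule-free spectral optimizer that addresses the reliance of conventional neural network training on fixed learning-rate schedules. Without a preset training horizon, the method matches or exceeds carefully tuned AdamW on large-scale language models. The paper also proves a stationarity guarantee for schedule-free spectral dynamics and identifies weight decay at the fast iterate as essential for long-horizon training stability, offering a more practical optimizer for continual learning without a preset horizon.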

Abstract

Standard neural network training relies on learning-rate schedules tied to a fixed horizon, leading to strong path dependence and costly re-tuning as data availability changes. Schedule-Free (SF) methods address this by removing explicit schedules, yet SF-AdamW, the current state-of-the-art anytime optimizer, consistently underperforms well-tuned AdamW baselines. We propose SF-NorMuon, a schedule-free spectral optimizer that closes this gap: with a single hyperparameter configuration, SF-NorMuon matches or exceeds tuned AdamW on 125M and 772M parameter language models across $1$--$8\times$ Chinchilla horizons. On the theoretical side, we prove a stationarity guarantee for schedule-free spectral dynamics and identify weight decay at the fast iterate as essential for long-horizon stability. SF-NorMuon enables practitioners to obtain high-quality checkpoints at any point during training without committing to a horizon in advance. By closing the performance gap with tuned baselines, SF-NorMuon makes horizon-free optimization more practical, taking a step towards truly open-ended, continual learning.

2605.23057 2026-05-25 cs.LG cs.CL cs.PF

ModeSwitch-LLM: A Lightweight Phase-Aware Controller for Cross-Mode LLM Inference on a Single GPU

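Aman Sunesh, Ali Alshehhi, Hivansh Dhakne

AI summary: ModeSwitch-LLM is a lightweight request-boundary controller that improves the efficiency of large language model inference on a single GPU by routing each request to a suitable fixed inference mode. Using cheap workload-level features, it selects dynamically among modes such as FP16, quantized modes, and speculative decoding rather than relying on a single static configuration. Experiments show that the controller significantly reduces latency and energy use while preserving inference quality, and that learned routers do not clearly outperform the rule-based controller, which adds less routing overhead and violates quality, energy, or memory constraints less often.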

Comments 10 pages main text, 11 pages including references, 5 figures, 3 tables. Preprint

Abstract

ModeSwitch-LLM is a lightweight request-boundary controller for improving single-GPU large language model inference efficiency by routing each request to an appropriate fixed inference mode. Instead of relying on one static serving configuration, the system selects among FP16, quantized modes, speculative decoding, and hybrid modes such as GPTQ plus prefix caching and INT8 plus continuous batching using cheap workload-level features. We evaluate ModeSwitch-LLM on Meta-Llama-3.1-8B-Instruct served on a single NVIDIA A100 GPU. On deployment-style synthetic workloads, the online controller achieves a 2.10x mean latency speedup over FP16 and a 0.48x mean energy ratio, corresponding to 51.7% lower energy per token. On automatic benchmarks used as a quality gate, accuracy remains close to FP16 with a mean delta of +0.17 percentage points. We also evaluate lightweight learned routers, but find that they do not clearly outperform the rule-based controller because they add routing overhead and more often select modes that violate quality, energy, or memory constraints. These results show that simple request-aware routing can recover substantial efficiency from existing inference modes without retraining the model or changing its architecture.
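
Illustrative sketch (hypothetical rules, not the paper's controller): the abstract describes routing each request to a fixed inference mode using cheap workload-level features. A rule-based router of that general shape might look like the following. The feature set, thresholds, rule order, and mode labels are invented for illustration; only the mode families come from the abstract.

```python
def choose_mode(prompt_tokens, expected_output_tokens, shared_prefix, concurrent_requests):
    """Pick an inference mode per request from simple workload features (made-up thresholds)."""
    if shared_prefix and prompt_tokens >= 1024:
        return "gptq+prefix_cache"         # long prompts that repeat a known prefix
    if concurrent_requests >= 8:
        return "int8+continuous_batching"  # many requests in flight
    if expected_output_tokens >= 256:
        return "speculative_decoding"      # decode-heavy requests
    if prompt_tokens >= 2048:
        return "quantized"                 # prefill-heavy requests
    return "fp16"                          # default when no rule fires

mode_a = choose_mode(4000, 64, shared_prefix=True, concurrent_requests=1)
mode_b = choose_mode(200, 512, shared_prefix=False, concurrent_requests=2)
mode_c = choose_mode(100, 32, shared_prefix=False, concurrent_requests=1)
```

Because the decision is a handful of comparisons made once per request, the routing overhead is negligible, which matches the abstract's point that learned routers add overhead without a clear benefit.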

2605.23054 2026-05-25 cs.CL cs.AI cs.LG

Model Collapse as Cultural Evolution

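Dongxin Guo, Jikun Wu, Siu Ming Yiu

AI summary: The paper studies model collapse, the gradual decline in output quality that occurs when large language models (LLMs) are trained on their own outputs. Drawing on iterated learning theory from cultural evolution, the authors derive five testable predictions and verify them in multilingual experiments, finding that compositional structure follows a non-monotonic trajectory under unfiltered self-training and is sustained only by task-grounded filtering. The study provides a linguistic account of model collapse and concrete principles for designing self-training pipelines.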

Comments Accepted at CoNLL 2026. 18 pages, 3 figures, 2 tables

Abstract

Model collapse, the progressive degradation of LLMs trained on their own outputs, has been characterized statistically but lacks a linguistic explanation for which structures degrade, in what order, and why. We show that iterated learning theory from cultural evolution fills this gap. We derive five falsifiable predictions, distinguish those uniquely discriminative for the theory from confirmatory ones, and test them by self-training LLaMA-2-7B and Mistral-7B over 10 generations in English, German, and Turkish. The critical discriminative finding: compositionality follows a non-monotonic trajectory (initially rising, then falling) under unfiltered self-training. This signature persists with maximally regular seed data (ruling out noise removal) and is sustained only by task-grounded filtering, not random filtering, providing the first LLM-scale evidence for the compression-communication tradeoff. All predictions are confirmed with large effect sizes (Hedges' $g > 1.6$; $\mathrm{BF}_{10} > 100$), and LLM regularization gradients closely match human behavioral data ($R^2 = 0.94$). These results reframe model collapse as a cultural transmission phenomenon and yield concrete principles for self-training pipeline design.

2605.23052 2026-05-25 cs.CL cs.AI

DreamerNLplus: Interpretable Modeling of Mental Health Dynamics from Social Media Timelines using Hybrid Rule-Based and RAG Methods

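Maryia Zhyrko, Daisy Monika Lal, Erik van Mulligen, Lifeng Han

AI summary: The paper presents DreamerNLplus, a hybrid framework for modeling mental health dynamics from social media timelines, submitted to the CLPsych 2026 shared task. The approach combines rule-based and retrieval-augmented generation (RAG) techniques across psychological state modeling, temporal change detection, and sequence-level summarization, and achieves strong results on several subtasks. The study identifies key challenges in modeling mental health dynamics, such as the mismatch between classification and regression performance and the difficulty of modeling temporal transitions, pointing to important directions for future work.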

Comments Accepted by CLPsych2026. CLPsych 2026 will be held at ACL in San Diego July 4th, 2026

Details
AI Chinese abstract

我们提出DreamerNLplus,一个用于在CLPsych 2026共享任务中从社交媒体时间线建模心理健康动态的混合框架。我们的系统处理三个任务:心理状态建模、时间变化检测和序列级总结。对于任务1,我们结合基于LLM的数据增强、DeBERTa分类和随机森林回归进行结构化状态预测。对于任务2,我们使用本地部署的Llama 3.1模型进行少样本提示,利用短期时间上下文检测切换和升级事件。对于任务3.1,我们探索了确定性基于规则的总结流水线和基于LLM的少样本方法,官方排名第二。我们的基于RAG的方法在任务3.2中取得了强劲性能,在改善任务中排名第一,在恶化任务中排名第三,展示了其捕捉时间线上反复出现的心理变化模式的能力。我们的分析揭示了关键挑战,包括分类与回归性能之间的不匹配、时间转换建模的困难,以及基于语义和基于相似性的评估指标之间的不一致。这些发现凸显了建模心理健康动态的复杂性,并推动了未来关于统一评估框架的工作。我们在https://github.com/4dpicture/CLPsych2026分享我们的代码和提示。

English abstract

We present DreamerNLplus, a hybrid framework for modeling mental health dynamics from social media timelines in the CLPsych 2026 shared task. Our system addresses three tasks: psychological state modeling, temporal change detection, and sequence-level summarization. For Task 1, we combine LLM-based data augmentation, DeBERTa classification, and Random Forest regression for structured state prediction. For Task 2, we use few-shot prompting with a locally deployed Llama 3.1 model to detect Switch and Escalation events using short-term temporal context. For Task 3.1, we explore both a deterministic rule-based summarization pipeline and a few-shot LLM-based approach, ranking 2nd officially. Our RAG-based method achieves strong performance in Task 3.2, ranking 1st for Improvement and 3rd for Deterioration, demonstrating its ability to capture recurrent psychological change patterns across timelines. Our analysis reveals key challenges, including the mismatch between classification and regression performance, the difficulty of modeling temporal transitions, and the disagreement between semantic and similarity-based evaluation metrics. These findings highlight the complexity of modeling mental health dynamics and motivate future work on unified evaluation frameworks. We share our code and prompts at https://github.com/4dpicture/CLPsych2026

2605.23045 2026-05-25 cs.CV cs.AI cs.LG

The TIME Machine: On The Power of Motion for Efficient Perception

时间机器:论运动在高效感知中的力量

Mantas Skackauskas, Xinyue Hao, Laura Sevilla-Lara

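AI summary This paper proposes a video representation learning method with motion as its central modality, aimed at the limitations of existing video models in temporal understanding and training cost. Motion is represented as point tracks, and a masked autoencoder is trained in a self-supervised manner so that the model learns more efficient, fine-grained video representations. The method does not rely on language annotations, substantially reduces the training data required, and performs on par with current state-of-the-art models on a range of tasks, pointing toward more efficient and temporally aware video models.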

Details
AI Chinese abstract

近年来,视频表示学习取得了巨大进展。这受到多种因素的推动,包括训练规模以及通过语言对比训练的视觉模型的成功。虽然这些因素推动了视频模型的能力边界,但它们也引入了自身的局限性:首先,扩展视频模型可能达到高昂的成本;其次,从语言学习限制了可学习概念的范围,仅限于字幕中的概念。因此,视频模型在时间理解方面仍然存在困难。在本文中,我们提出了一种新颖的方法,将运动作为视频表示的核心模态。具体而言,给定视频中以点轨迹形式存在的运动,我们使用掩码自编码器来掩码部分轨迹,并训练自编码器重建缺失的轨迹。这使我们能够以自监督方式学习表示。我们表明,使用运动来表示视频实际上解决了视频技术的两个核心局限性。首先,它使我们能够大幅减少训练数据的规模,因为运动本质上与外观无关,因此需要更少的样本就能很好地泛化。其次,运动使我们能够绕过依赖语言的训练范式,学习更细粒度的概念。结果是一种嵌入,我们称之为TIME(时间感知运动嵌入),这是一种仅使用合成运动数据训练的表示。我们在零样本方式下对广泛的任务测试了这种嵌入。我们观察到,无需额外技巧,其性能与使用多达4个数量级更少训练数据的最先进模型相当。这为迈向更有时序感知且更具可扩展性的视频模型新范式奠定了基础。

English abstract

Video representation learning has seen tremendous progress in recent years. This has been driven by many factors, including the scale of training and the success of visual models trained contrastively with language. While these factors have pushed the boundaries of what video models can do, they also introduce their own set of limitations: first, scaling video models can reach prohibitive costs, and second, learning from language restricts the range of concepts that can be learned to those in captions. As a result, video models still struggle with temporal understanding. In this paper, we propose a novel approach that uses motion as the central modality for video representation. In particular, given the motion in a video in the form of point-tracks, we use a masked-autoencoder to mask some of the tracks and train the autoencoder to reconstruct the missing tracks. This allows us to learn a representation in a self-supervised manner. We show that using motion to represent videos actually addresses both of the core limitations of video technology. First, it allows us to massively reduce the scale of training data, as motion is inherently appearance-independent and hence needs fewer examples to generalize well. Second, motion allows us to bypass the language-dependent training paradigm, learning better fine-grained concepts. The result is an embedding that we call TIME (Temporally Informed Motion Embedding), a representation trained exclusively on synthetic motion data. We test this embedding on a wide set of tasks in a zero-shot manner. We observe that without bells and whistles, performance is on par with state-of-the-art models using up to 4 orders of magnitude less training data. This is a stepping stone towards a new paradigm of video models that are both more temporally aware and more scalable.

2605.23043 2026-05-25 cs.CL stat.ML

HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation

HawkesLLM:智能体文本模拟中的语义不确定性传播

Zewei Deng, Tinghan Ye, Liyan Xie

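AI summary This paper proposes HawkesLLM, a framework addressing the accumulation of semantic uncertainty over time in agentic text-simulation systems. It separates temporal influence modeling from text generation: a multivariate Hawkes process models activation relationships among nodes, and a language model generates new content from the compact memory selected by the temporal model. In a GDELT news-cascade case study, HawkesLLM improves late-stage semantic alignment under a limited prompt-memory budget.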

Comments 10 pages, 4 figures, Accepted at the ICML 2026 Workshop on Statistical Frameworks for Uncertainty in Agentic Systems

Details
AI Chinese abstract

智能体文本模拟系统按顺序生成文本,每个项目成为后续步骤的可能上下文。这使得不确定性具有路径依赖性:早期的模糊性可能影响后续输出。本文通过HawkesLLM框架研究这一问题,该框架将时间影响建模与文本生成分离。我们将级联表示为一个网络,其节点是文本生成智能体。多变量Hawkes过程模拟这些节点随时间激活的方式,以及哪些早期节点输出应影响后续提示。然后,语言模型根据该时间模型选择的紧凑记忆编写每个新事件。我们在一个保留的全球事件、语言和语调数据库(GDELT)新闻级联案例研究中评估该框架。诊断跟踪与局部保留参考的语义对齐,并区分局部漂移和全局漂移。在此设置下,HawkesLLM在紧凑的提示记忆预算下改善了后期语义对齐。

English abstract

Agentic text-simulation systems write in sequence, with each item becoming possible context for later steps. That makes uncertainty path-dependent: an early ambiguity can affect later outputs. This paper studies this problem with HawkesLLM, a framework that separates temporal influence modeling from text generation. We represent the cascade as a network whose nodes are text-generating agents. A multivariate Hawkes process models how these nodes activate over time and which earlier node outputs should influence later prompts. A language model then writes each new event from the compact memory selected by this temporal model. We evaluate the framework on a held-out Global Database of Events, Language, and Tone (GDELT) news-cascade case study. The diagnostics track semantic alignment with local held-out references and separate local drift from global drift. In this setting, HawkesLLM improves late-stage semantic alignment under a compact prompt-memory budget.
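To make the temporal component concrete, the sketch below evaluates a multivariate Hawkes intensity and uses each past event's contribution to pick a compact memory. The exponential kernel, the parameter values, and the `select_memory` rule are all assumptions made for illustration; the abstract does not specify the paper's kernel or selection procedure.

```python
import math

def hawkes_intensity(i, t, events, mu, alpha, beta):
    """Intensity of node i at time t; events is a list of (time, node) pairs."""
    rate = mu[i]
    for t_k, j in events:
        if t_k < t:
            # Excitation that an earlier event on node j adds to node i
            # (exponential kernel, an assumption of this sketch).
            rate += alpha[i][j] * beta * math.exp(-beta * (t - t_k))
    return rate

def select_memory(i, t, events, alpha, beta, budget):
    """Indices of the `budget` earlier events contributing most to node i (hypothetical rule)."""
    scored = [
        (alpha[i][j] * beta * math.exp(-beta * (t - t_k)), idx)
        for idx, (t_k, j) in enumerate(events)
        if t_k < t
    ]
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:budget]]

# Hypothetical two-node cascade with made-up parameters.
mu = [0.1, 0.2]
alpha = [[0.0, 0.5], [0.3, 0.0]]  # alpha[i][j]: influence of node j on node i
events = [(0.0, 1), (1.0, 0), (1.5, 1)]
rate = hawkes_intensity(0, 2.0, events, mu, alpha, beta=1.0)
memory = select_memory(0, 2.0, events, alpha, beta=1.0, budget=1)
```

In this toy setup the most recent event on node 1 dominates the intensity of node 0, so it is the one kept when the budget allows a single item.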

2605.23040 2026-05-25 cs.LG

Steered Generation via Gradient-Based Optimization on Sparse Query Features

基于稀疏查询特征的梯度优化引导生成

Sumanta Bhattacharyya, Pedram Rooshenas

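AI summary This paper studies how gradient-based optimization of sparse query features can precisely steer large language model generation. The authors propose Prototype-Based Sparse Steering, which decomposes attention query activations with sparse autoencoders and, during inference, uses gradient-based optimization to align them with class prototypes of target behaviors. Experiments in a controlled environment and in an educational domain indicate that the method supports unified control over both logical planning and stylistic nuance.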

Details
AI Chinese abstract

潜在引导利用大型语言模型的内部表示来指导生成,但对密集状态的干预可能纠缠不同的语义特征。在本文中,我们研究注意力查询激活作为精确控制的高保真位点,假设操纵注意力机制本身比一般状态干预提供更清晰的引导能力。我们引入了基于原型的稀疏引导框架,该框架将稀疏自编码器专门应用于查询激活,将其分解为可解释的特征,然后在推理过程中应用基于梯度的优化,使稀疏表示与目标行为的类原型对齐。为了验证这一架构见解,我们首先在文本化网格世界(一个用于可验证规划约束的受控环境)中分析该机制。我们证明,优化稀疏查询特征能够有效导航刚性规划需求(即安全路径与短路径),确认了该方法满足客观规则的能力。然后,我们通过在高维教育领域训练SAE来展示该框架的通用性,其中该框架引导反馈的认知复杂性(即布鲁姆分类法)。我们的实验表明,稀疏查询表示为逻辑规划和风格细节的统一、可解释控制提供了必要的解缠。

English abstract

Latent steering exploits internal representations of Large Language Models (LLMs) to guide generation, yet interventions on dense states can entangle distinct semantic features. In this paper, we investigate attention query activations as a high-fidelity site for precise control, hypothesizing that manipulating the attention mechanism itself offers sharper steerability than general state interventions. We introduce Prototype-Based Sparse Steering, a framework that applies Sparse Autoencoders (SAEs) specifically to query activations, to decompose them into interpretable features, then apply gradient-based optimization during inference to align the sparse representation with class prototypes of target behaviors. To validate this architectural insight, we first analyze the mechanism in Textualized Gridworld, a controlled environment for verifiable planning constraints. We demonstrate that optimizing sparse query features enables effective navigation of rigid planning requirements (i.e., safe vs. short paths), confirming the method's ability to satisfy objective rules. We then demonstrate the framework's versatility by training SAEs on a high-dimensional educational domain, where the framework steers the cognitive complexity of feedback (i.e., Bloom's Taxonomy). Our experiments establish that sparse query representations provide the necessary disentanglement for unified, interpretable control over both logical planning and stylistic nuance.
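The idea of aligning a sparse representation with a class prototype by gradient steps can be illustrated with a toy objective. The squared-distance loss, the non-negativity clip, and the function name below are assumptions of this sketch; the abstract does not state the actual objective or how the steered features are written back into the model.

```python
def steer_toward_prototype(z, prototype, lr=0.1, steps=50):
    """Gradient descent on 0.5 * ||z - prototype||^2, clipped to stay non-negative."""
    z = list(z)
    for _ in range(steps):
        # Gradient of the squared-distance loss with respect to z.
        grad = [zi - pi for zi, pi in zip(z, prototype)]
        # Step, then clip at zero so activations stay SAE-like (non-negative).
        z = [max(0.0, zi - lr * gi) for zi, gi in zip(z, grad)]
    return z

# Made-up sparse code and prototype with three features.
steered = steer_toward_prototype([0.0, 2.0, 0.0], [1.0, 0.0, 0.0])
```

Features that are zero in both the code and the prototype stay at zero, so the update preserves sparsity on inactive features.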

2605.23039 2026-05-25 cs.CL cs.AI cs.LG

Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs

语言模型知道不该说什么吗?大语言模型中统计预占的因果证据

Dongxin Guo, Jikun Wu, Siu Ming Yiu

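AI summary This study examines how language models acquire negative linguistic knowledge through distributional competition, proposing statistical preemption as the key mechanism. Across four experiments, it finds that language model surprisal on unconventional constructions correlates strongly with human acceptability judgments and that this pattern is driven by competing-form frequency rather than overall verb frequency. It also shows that preemption sensitivity scales as a power law with model size and uses a controlled fine-tuning experiment to establish the causal effect of competing-form frequency on preemption behavior, providing computational support for Construction Grammar.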

Comments Accepted at CoNLL 2026. 21 pages (9 main body + appendices and references); 4 figures, 14 tables

Details
AI Chinese abstract

学习者在没有负面证据的情况下如何获得关于不可接受性的知识?构式语法提出了统计预占:接触常规形式(例如,“donated the books to the library”)会预占结构上可能但未经验证的替代形式(“*donated the library the books”)。我们提出了一项计算研究,首次在单一收敛设计中直接分离了大语言模型中的统计预占与竞争性固化假说。通过跨越120个英语动词-构式配对(与格、使役、方位格)的四个实验,我们表明:(1)大语言模型的惊讶度模式与人类可接受性判断强相关(r = 0.79),并在三个独立的行为数据集上得到验证;(2)这些模式由竞争形式频率驱动,而非整体动词频率,通过非循环偏相关得到确认;(3)预占敏感度随模型规模呈幂律增长;(4)一项受控微调干预因果地表明,操纵竞争形式频率会按预测方向改变预占行为,反向控制排除了频率敏感性混淆。这些结果提供了汇聚证据,表明神经语言模型通过分布竞争(构式语法所提出的核心机制)习得负面语言知识。

English abstract

How do learners acquire knowledge of what is unacceptable without negative evidence? Construction Grammar proposes statistical preemption: exposure to a conventional form (e.g., "donated the books to the library") preempts structurally possible but unattested alternatives ("*donated the library the books"). We present a computational study that, for the first time, directly dissociates statistical preemption from the competing entrenchment hypothesis in large language models within a single converging design. Across four experiments spanning 120 English verb-construction pairings (dative, causative, locative), we show that (1) LLM surprisal patterns correlate strongly with human acceptability judgments ($r = 0.79$), validated against three independent behavioral datasets; (2) these patterns are driven by competing-form frequency rather than overall verb frequency, confirmed by non-circular partial correlations; (3) preemption sensitivity scales as a power law with model size; and (4) a controlled fine-tuning intervention causally demonstrates that manipulating competing-form frequencies shifts preemption behavior in the predicted direction, with reverse-direction controls ruling out frequency-sensitivity confounds. These results provide converging evidence that neural language models acquire negative linguistic knowledge through distributional competition, the core mechanism posited by Construction Grammar.
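Surprisal is the negative log probability a model assigns to a form, so a preemption effect shows up as a gap between the surprisal of the unattested form and that of its conventional competitor. The sketch below uses invented per-token probabilities; the `preemption_gap` helper is a hypothetical name, and the paper's exact measure may differ.

```python
import math

def surprisal_bits(token_probs):
    """Total surprisal of a sequence: the sum of -log2 p over its tokens."""
    return sum(-math.log2(p) for p in token_probs)

def preemption_gap(conventional_probs, unattested_probs):
    """Positive when the unattested form is more surprising than the conventional one."""
    return surprisal_bits(unattested_probs) - surprisal_bits(conventional_probs)

# Made-up per-token probabilities standing in for language-model outputs.
conventional = [0.5, 0.25, 0.5]   # e.g. "donated the books to the library"
unattested = [0.5, 0.125, 0.25]   # e.g. "*donated the library the books"
gap = preemption_gap(conventional, unattested)
```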

2605.23037 2026-05-25 cs.LG physics.flu-dyn

Open Multimodal Datasets and Open-Source Software for Data-Driven Modeling of Multiphase Transport and Thermal Systems

用于多相输运和热系统数据驱动建模的开放多模态数据集与开源软件

Christy Dunlap, Hari Pandey, Stephen Pierson, Daniel Curl, Braden Stevens, Mohammad Ishraq Hossain, Annapurna Parjuli, Chinmaya Joshi, Han Hu

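AI summary This paper introduces a set of open multimodal datasets and open-source software tools developed by the NED3 Laboratory to advance data-driven modeling of multiphase transport and thermal-fluid systems. It proposes a spatial-plus-temporal dimensionality framework (S+TD) for systematically organizing measured or simulated data of different dimensionality, and provides public datasets covering boiling images, thermography, high-speed video, and more. It also describes several companion software tools, such as SeqReg for sequence regression, which supports applications including nonintrusive heat-flux estimation, offering a reproducible open-source platform for AI modeling in thermal-fluid research.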

Comments 23 pages, 7 figures

Details
AI Chinese abstract

数据驱动建模正成为多相输运、电子冷却、声学诊断和热流体数字孪生的核心,但进展受到数据集碎片化和原始仪器文件难以解码、重用或基准测试的限制。本文介绍了由纳米能源与数据驱动发现(NED3)实验室开发的开放多模态数据集和开源软件包生态系统,用于可复现的AI赋能热流体研究。我们提出了一个空间加时间维度框架,记为S+TD,用于按测量或模拟场的维度对数据集进行分类,包括0+0D点值、0+1D时间序列、1+0D剖面、2+0D图像、2+1D视频、3+0D体积场以及多模态组合。我们整理了公开的NED3数据集,涵盖沸腾图像、声学和热测量、高速视频、红外热成像、热阻测量、CFD生成场、设计文件和声发射数据。我们还描述了配套的软件包,包括BubbleID、SeqReg、CFDTwin、IRISApp、decode-wfs、AELab和FlowLab,这些软件支持计算机视觉、序列回归、代理建模、红外分析、波形解码、声发射分析和多模态诊断。特别强调了SeqReg,这是一个用于0+1D、1+1D和2+1D数据的通用序列回归库,应用包括非侵入式热通量估计。最后,我们讨论了未来社区努力构建可互操作的热流体数据库和精选的AI/ML工具库,以连接数据集、元数据、解码器、基线、基准和物理可解释模型。

English abstract

Data-driven modeling is becoming central to multiphase transport, electronics cooling, acoustic diagnostics, and thermal-fluid digital twins, but progress is limited by fragmented datasets and raw instrument files that are difficult to decode, reuse, or benchmark. This paper presents an open ecosystem of multimodal datasets and open-source software packages developed by the Nano Energy and Data-Driven Discovery (NED3) Laboratory for reproducible AI-enabled thermal-fluid research. We introduce a spatial-plus-temporal dimensionality framework, denoted S+TD, to classify datasets by the dimensionality of measured or simulated fields, including 0+0D point values, 0+1D time series, 1+0D profiles, 2+0D images, 2+1D videos, 3+0D volumetric fields, and multimodal combinations. We organize public NED3 datasets spanning boiling images, acoustic and thermal measurements, high-speed videos, infrared thermography, thermal-resistance measurements, CFD-generated fields, design files, and acoustic-emission data. We also describe complementary software packages, including BubbleID, SeqReg, CFDTwin, IRISApp, decode-wfs, AELab, and FlowLab, which support computer vision, sequence regression, surrogate modeling, infrared analysis, waveform decoding, acoustic-emission analysis, and multimodal diagnostics. Particular emphasis is placed on SeqReg, a general sequence-regression library for 0+1D, 1+1D, and 2+1D data, with applications such as nonintrusive heat-flux estimation. Finally, we discuss future community efforts to build interoperable thermal-fluid databanks and curated AI/ML tool libraries that connect datasets, metadata, decoders, baselines, benchmarks, and physically interpretable models.
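The S+TD labels listed above follow a simple pattern: the number of spatial dimensions, a plus sign, and 1 or 0 depending on whether the data vary in time. A minimal sketch of that naming scheme follows; the `std_label` helper is hypothetical and not part of any NED3 package.

```python
def std_label(spatial_dims, has_time):
    """Format an S+TD label from a spatial dimension count and a time flag."""
    if spatial_dims < 0:
        raise ValueError("spatial_dims must be non-negative")
    return f"{spatial_dims}+{1 if has_time else 0}D"

# Categories taken from the abstract.
examples = {
    "point value": std_label(0, False),
    "time series": std_label(0, True),
    "profile": std_label(1, False),
    "image": std_label(2, False),
    "video": std_label(2, True),
    "volumetric field": std_label(3, False),
}
```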

2605.23036 2026-05-25 cs.CL

Multilingual Steering by Design: Multilingual Sparse Autoencoders and Principled Layer Selection

通过设计实现多语言引导:多语言稀疏自编码器与原则性层选择

Yusser Al Ghussin, Daniil Gurgurov, Tanja Baeumel, Josef van Genabith, Patrick Schramowski, Simon Ostermann

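AI summary This study addresses the unreliability of sparse autoencoder (SAE)-based language control in multilingual large language models and proposes a steering-by-design approach. Training SAEs on multilingual data strengthens cross-lingual representations, and an a priori layer-selection rule based on the intersection of multilingual alignment and language separability predicts effective intervention depths without layer-by-layer search. Experiments on translation and cross-lingual summarization show a better balance between language identification accuracy and generation quality, giving a principled and predictive account of multilingual SAE steering.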

Comments Accepted to TrustNLP Workshop at ACL 2026

Details
AI Chinese abstract

稀疏自编码器(SAEs)能够实现大型语言模型(LLMs)中的特征级可解释性机制和激活引导,但基于SAE的语言控制在多语言环境中仍然不可靠:大多数SAE仅在英语数据上训练,且引导层的选择是启发式的。我们通过推进基于SAE的多语言语言引导的原则性、机制性解释来解决这些限制。首先,我们展示了在多语言数据上训练SAE能够持续增强跨语言表示,并在不同层和模型家族中产生更可靠、质量保持的语言控制。其次,我们引入了一种基于多语言对齐与语言可分离性交集的先验引导层选择规则,该规则无需穷举逐层搜索即可预测有效的干预深度。我们在LLaMA-3.1-8B和Gemma-2-9B上,使用SpBLEU、ROUGE-L、COMET和LaSE评估了我们的方法,涉及机器翻译和跨语言摘要(CrossSumm)。结果表明,多语言SAE结合交集选择的层稳定了语言识别准确率与生成质量之间的权衡,为多语言SAE引导提供了原则性、预测性的表示级解释。

English abstract

Sparse autoencoders (SAEs) enable feature-level mechanistic interpretability and activation steering in large language models (LLMs), but SAE-based language control remains unreliable in multilingual settings: most SAEs are trained on English-only data, and steering layers are chosen heuristically. We address these limitations by advancing a principled, mechanistic account of multilingual language steering with SAEs. First, we show that training SAEs on multilingual data consistently strengthens cross-lingual representations and yields more reliable, quality-preserving language control across layers and model families. Second, we introduce an a priori steering layer-selection rule based on the intersection of multilingual alignment and language separability, which predicts effective intervention depths without exhaustive layerwise search. We evaluate our approach on LLaMA-3.1-8B and Gemma-2-9B across machine translation and cross-lingual summarization (CrossSumm), using SpBLEU, ROUGE-L, COMET, and LaSE. Our results show that multilingual SAEs combined with intersection-selected layers stabilize the trade-off between language identification accuracy and generation quality, providing a principled, predictive, representation-level account of multilingual SAE steering.

2605.23035 2026-05-25 cs.CL cs.AI q-bio.NC

Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography

稀疏自编码器将大脑-LLM对齐映射到皮层语义拓扑

Dongxin Guo, Jikun Wu, Siu Ming Yiu

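AI summary This study examines the correspondence between intermediate layers of large language models (LLMs) and human brain responses to language, and uses sparse autoencoders (SAEs) to explain it mechanistically. Combining SAEs with neural encoding models, the authors decompose GPT-2 XL and Llama-3.1-8B into 16K-32K interpretable features per layer and show that semantic features dominate in predicting brain encoding performance. They further show that SAE-derived semantic features reproduce the semantic topography of the cortex and that the results generalize across several languages.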

Comments Accepted at CoNLL 2026. 20 pages (9 main + 1 limitations/acknowledgments + 3 references + 7 appendix), 5 figures, 20 tables

Details
AI Chinese abstract

大型语言模型(LLM)的中间层最能预测人脑对语言的反应,这是计算神经语言学中最稳健的发现之一,但其机制原因仍未得到解释。我们通过将可解释性机制中的稀疏自编码器(SAE)与神经编码模型相结合来填补这一空白,将GPT-2 XL和Llama-3.1-8B分解为每层16K-32K个可解释特征。一个人工验证的分类法(κ≥0.74)显示,仅语义特征就恢复了94%的峰值编码性能(r=0.285),显著超过了方差匹配的基线(p<0.001,d=1.31)。除了这种总体主导性之外,我们还测试了一个新颖的皮层拓扑预测:从三个独立神经科学项目先验导出的五个语义子类别应映射到不同的大脑区域。一个正式的收敛测试证实了这种对齐(Spearman ρ=0.72,p<0.001;超几何p=0.007),表明SAE发现的特征以先前方法无法达到的粒度重现了已知的皮层语义组织。SAE特征进一步预测了超出词汇控制的人类阅读时间(ΔlogLik=38.4,p<0.001),并且一项探索性的预测误差分析提供了初步证据,表明大脑还编码了意外的语义内容。结果在英语、中文和法语中具有普适性。

English abstract

Intermediate layers of large language models (LLMs) best predict human brain responses to language, one of the most robust findings in computational neurolinguistics, yet why remains mechanistically unexplained. We address this gap by bridging sparse autoencoders (SAEs) from mechanistic interpretability with neural encoding models, decomposing GPT-2 XL and Llama-3.1-8B into 16K-32K interpretable features per layer. A human-validated taxonomy ($\kappa \geq 0.74$) reveals that semantic features alone recover 94% of peak encoding performance ($r=0.285$), substantially exceeding variance-matched baselines ($p<0.001$, $d=1.31$). Beyond this aggregate dominance, we test a novel cortical topography prediction: five semantic subcategories derived a priori from three independent neuroscience programs should map onto distinct brain regions. A formal convergence test confirms this alignment (Spearman $\rho=0.72$, $p<0.001$; hypergeometric $p=0.007$), demonstrating that SAE-discovered features recapitulate known cortical semantic organization at a granularity inaccessible to prior methods. SAE features further predict human reading times beyond lexical controls ($\Delta\mathrm{logLik}=38.4$, $p<0.001$), and an exploratory prediction-error analysis provides preliminary evidence that the brain additionally encodes unexpected semantic content. Results generalize across English, Chinese, and French.

2605.23033 2026-05-25 cs.LG cs.AI

Uncovering the Latent Potential of Deep Intermediate Representations

揭示深度中间表示的潜在能力

Arnesh Batra, Arush Gumber, Aniket Khandelwal, Jashn Khemani, Anubha Gupta

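AI summary This paper studies the latent value of intermediate representations in deep neural networks, showing that task-relevant information is distributed non-monotonically across layers and cannot be recovered by naive aggregation. The authors propose LOES, a spectral layer-selection method, and GeoReg, a geometric regularization loss, to identify task-discriminative subspaces and stabilize representation geometry. Experiments show that the method outperforms baselines across a range of models and data regimes, with gains that grow with model depth, and that it reveals how semantic factors are distributed across layers, supporting cross-lingual and cross-modal interpretability analyses.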

Comments Accepted to ICML2026 as a Spotlight

Details
AI Chinese abstract

在海量数据上预训练的基础模型学习到随深度演化的表示,形成具有不同语义内容和几何结构的嵌入层次。与仅使用最后一层或浅层混合的普遍做法相反,我们表明任务相关信息在层间非单调分布,且无法通过简单聚合恢复。通过跨多种模态的几何与实证研究,我们表明有效迁移依赖于识别哪些层编码任务判别结构以及它们的嵌入如何几何组织。我们提出层最优嵌入选择(LOES),一种构造性谱方法,通过在正交性和各向同性约束下最小化残差误差来识别任务判别子空间。为了将微调与此选择原则对齐,我们进一步提出几何正则化损失(GeoReg),它在微调期间对类流形施加单纯形结构并稳定表示几何。在广泛的架构、深度、模态和数据规模下,LOES 持续优于标准基线,且随着模型深度增加收益增长。除了准确性,我们的方法揭示了语义因素如何在层间分布,从而实现了跨语言和跨模态的可解释性分析。总之,我们的结果提供了强有力的证据,表明逐层嵌入几何不是偶然的,而是深度模型表示和迁移知识的核心。

English abstract

Foundation models pretrained on huge amounts of data learn representations that evolve across depth, forming a hierarchy of embeddings with distinct semantic content and geometric structure. Contrary to the widespread practice of using only the final layer or shallow mixtures, we show that task-relevant information is distributed non-monotonically across layers and cannot be recovered by naïve aggregation. Through a geometric and empirical study across multiple modalities, we show that effective transfer depends on identifying which layers encode task-discriminative structure and how their embeddings are geometrically organized. We introduce Layer-wise Optimal Embedding Selection (LOES), a constructive spectral method that identifies task-discriminative subspaces by minimizing residual error under orthogonality and isotropy constraints. To align fine-tuning with this selection principle, we further propose Geometric Regularization Loss (GeoReg), which enforces a simplicial structure on class manifolds and stabilizes representation geometry during fine-tuning. Across a wide range of architectures, depths, modalities, and data regimes, LOES consistently outperforms standard baselines, with gains that grow as model depth increases. Beyond accuracy, our method reveals how semantic factors are distributed across layers, thereby enabling cross-lingual and cross-modal interpretability analyses. Together, our results provide strong evidence that layerwise embedding geometry is not incidental but central to how deep models represent and transfer knowledge.

2605.23032 2026-05-25 cs.CL cs.AI q-bio.NC

Brain-LLM Alignment Tracks Training Data, Not Typology

大脑-大语言模型对齐追踪训练数据,而非语言类型学

Dongxin Guo, Jikun Wu, Siu Ming Yiu

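AI summary This study asks whether alignment between brains and large language models (LLMs) generalizes across languages, and finds that the alignment pattern is determined mainly by which language dominates the model's training data rather than by properties of English itself. Comparing fMRI data in several languages against LLMs with different dominant languages, it finds that a Chinese-dominant model aligns best with Chinese brains and worst with English brains. Typological distance, gradient differences in syntax-associated brain regions, and tokenization fertility also significantly affect alignment, indicating that the previously observed "English advantage" stems mainly from training data composition rather than from language structure.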

Comments Accepted to CoNLL 2026. 9 pages main content + 4 pages references + 6 pages appendix; 4 figures, 13 tables

Details
AI Chinese abstract

大脑-大语言模型对齐在英语中已得到充分证实,然而大脑的语言网络在神经解剖学上跨语言具有普遍性。这种对齐是否也能跨语言泛化,以及什么因素决定了其变化?我们使用来自英语、中文和法语(《小王子》语料库)112名参与者的fMRI数据,以及涵盖英语主导、中文主导和多语言架构的七种大语言模型进行了测试。我们的核心发现是,训练语言主导性(而非英语的固有属性)驱动了对齐模式:一个中文主导模型(Baichuan2-7B),其架构与LLaMA-2-7B匹配,完全逆转了梯度,与中文大脑对齐最佳,与英语对齐最差。除训练主导性外,形式类型学距离独立地与对齐退化共变,与句法相关的大脑区域(IFG)显示出比词汇语义区域(PTL)陡峭2.3倍的类型学梯度,而分词丰度解释了跨语言最优编码层转移的约60%。这些结果表明,大脑-大语言模型对齐中明显的“英语优势”是训练数据组成的假象,而剩余的变化反映了集中在句法处理中的真实类型学结构。

English abstract

Brain-LLM alignment is well established in English, yet the brain's language network is neuroanatomically universal across languages. Does alignment also generalize cross-linguistically, and what governs the variation? We test this using fMRI data from 112 participants across English, Chinese, and French (the Le Petit Prince corpus) and seven LLMs spanning English-dominant, Chinese-dominant, and multilingual architectures. Our central finding is that training-language dominance, not an inherent property of English, drives the alignment pattern: a Chinese-dominant model (Baichuan2-7B), architecture-matched to LLaMA-2-7B, reverses the gradient entirely, aligning best with Chinese brains and worst with English. Beyond training dominance, formal typological distance independently covaries with alignment degradation, syntax-associated brain regions (IFG) show $2.3\times$ steeper typological gradients than lexico-semantic regions (PTL), and tokenization fertility accounts for $\sim$60% of a cross-linguistic shift in optimal encoding layer. These results reveal that the apparent "English advantage" in brain-LLM alignment is an artifact of training data composition, while the remaining variation reflects genuine typological structure concentrated in syntactic processing.

2605.23028 2026-05-25 cs.LG cs.CL cs.CV

RADAR: Relative Angular Divergence Across Representations

RADAR: 表示间的相对角度散度

Xavier Cadet, Mateusz Nowak, Peter Chin

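AI summary This paper proposes RADAR, a metric for estimating how well foundation models transfer across domains. Grounded in geometry, it analyzes angular alignment of layer-wise representations and changes in distance along layer-to-layer displacement trajectories, and compares the distributions of within-domain and cross-domain dynamics to estimate transferability between domains. Experiments show that RADAR is competitive on tasks in several modalities, with stronger predictive power when domain transitions are smooth or cleanly separated, and that its effectiveness depends on the geometry of the model's internal representation space.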

Comments 27 pages; 8 figures; 10 tables

Details
AI Chinese abstract

机器学习方法依赖于数据。然而,由于可用性限制、成本或需要领域专业知识,收集合适的数据可能具有挑战性。用额外来源扩展数据集是对有限数据的常见回应,但这种做法并不总能提高下游性能,有时甚至会导致性能下降,即负迁移。我们提出RADAR,一种简单、基于几何的度量,用于估计基础模型中的跨域迁移性。RADAR通过测量沿层间位移轨迹的角度对齐和距离的相对变化,并比较域内和跨域动态的经验分布,来分析表示的逐层演化。我们假设域迁移性与这些轨迹分布之间的散度有关。我们在多种模态上评估该度量,包括使用文本嵌入模型的跨语言情感分类和使用基础视觉模型的跨域图像分类。在多种设置下,RADAR在几个视觉和文本基准上相对于现有迁移性度量提供了有竞争力的预测性能,特别是在域过渡平滑或清晰分离时。我们的消融实验进一步表明,迁移性估计的有效性取决于模型内部表示空间的几何结构,不同模态偏好不同的拓扑形式。

English abstract

Machine learning methods rely on data. However, gathering suitable data can be challenging due to availability constraints, cost, or the need for domain expertise. Expanding datasets with additional sources is a common response to limited data, yet this practice does not always improve downstream performance and can sometimes lead to a loss of performance, known as negative transfer. We propose RADAR, a simple, geometrically grounded metric for estimating cross-domain transferability in foundation models. RADAR analyzes the layer-wise evolution of representations by measuring angular alignments and relative changes in distance along layer-to-layer displacement trajectories, and by comparing empirical distributions of within-domain and cross-domain dynamics. We hypothesize that domain transferability is related to the divergence between these trajectory distributions. We evaluate the metric across multiple modalities, including cross-lingual sentiment classification with text embedding models and cross-domain image classification with foundation vision models. Across several settings, RADAR provides competitive predictive performance relative to existing transferability metrics on several vision and text benchmarks, with particularly strong results when domain transitions are smooth or cleanly separated. Our ablations further suggest that the effectiveness of transferability estimation depends on the geometry of the model's internal representation space, with different modalities favoring different topological formulations.
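The two quantities named above, angular alignment and relative change in distance along layer-to-layer displacements, can be computed for a single input as follows. This is only one plausible reading of the abstract: the exact definitions used by RADAR, and the divergence it applies to the resulting distributions, are not given there, and all function names are hypothetical.

```python
import math

def displacements(layers):
    """Layer-to-layer displacement vectors for one input's representations."""
    return [[b - a for a, b in zip(lo, hi)] for lo, hi in zip(layers, layers[1:])]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def trajectory_features(layers):
    """(cosine, relative length change) for each pair of consecutive displacements."""
    d = displacements(layers)
    return [
        (cosine(u, v), (norm(v) - norm(u)) / norm(u))
        for u, v in zip(d, d[1:])
    ]

# Toy 2-D representations at three layers, invented for illustration.
layers = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
features = trajectory_features(layers)
```

The sketch assumes no displacement has zero length; a real implementation would need to guard that case before dividing.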

2605.23027 2026-05-25 cs.RO

PIMbot: A Self-Adaptive Attack Framework for Adversarial Manipulation of Multi-Robot Reinforcement Learning

PIMbot:一种用于多机器人强化学习对抗性操纵的自适应攻击框架

Zexin Li, Ziliang Zhang, Hyoseung Kim, Cong Liu

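AI summary This paper proposes PIMbot, a self-adaptive attack framework for adversarially manipulating cooperative behavior in multi-robot reinforcement learning. The framework intervenes in multi-robot cooperative environments through two complementary levers, incentive manipulation of the reward channel and policy manipulation of the agent's own actions, and uses an adaptive multi-objective controller to balance them online. The work introduces a new manipulation approach targeting the unique reward functions used in multi-agent reinforcement learning social dilemmas, and experiments show that PIMbot exposes critical vulnerabilities in multi-robot cooperative tasks both in simulation and on a real embedded system.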

Comments Extension version of IROS'23

Details
AI Chinese abstract

最近的研究证明了强化学习在多机器人有效协作中的潜力,特别是在机器人面临自身利益与集体利益权衡的社会困境中。然而,沟通不畅和对抗性机器人等环境因素可能影响合作,因此探索如何操纵多机器人通信以实现不同结果至关重要。本文提出了PIMbot,一个通过两种互补杠杆操纵结果的框架:(i) 奖励通道的激励操纵和(ii) 智能体自身动作的策略操纵。一个自适应多目标控制器在线平衡这些杠杆。我们的工作引入了一种新颖的方法来操纵最近的多智能体强化学习社会困境,这些困境利用独特的奖励函数进行激励。通过利用我们提出的PIMbot机制,机器人能够有效地操纵社会困境环境。全面的实验结果证明了我们提出的方法在Gazebo模拟的多机器人环境中的有效性。此外,在NVIDIA Jetson Orin Nano上的真实嵌入式设备案例研究量化了系统成本,并验证了PIMbot在超越仿真的现实自主嵌入式系统场景中的有效性。这些结果共同将PIMbot定位为一个严格的压力测试工具,暴露了多机器人协作任务中的关键漏洞。

English abstract

Recent research has demonstrated the potential of reinforcement learning in effective multi-robot collaboration, particularly in social dilemmas where robots face a trade-off between self-interest and collective benefits. However, environmental factors such as miscommunication and adversarial robots can impact cooperation, making it crucial to explore how multi-robot communication can be manipulated to achieve different outcomes. This paper presents PIMbot, a framework that manipulates outcomes via two complementary levers: (i) incentive manipulation of the reward channel and (ii) policy manipulation of an agent's own actions. An adaptive multi-objective controller balances these levers in an online manner. Our work introduces a novel approach to manipulation in recent multi-agent RL social dilemmas that utilize a unique reward function for incentivization. By utilizing our proposed PIMbot mechanisms, a robot is able to manipulate the social dilemma environment effectively. Comprehensive experimental results demonstrate the effectiveness of our proposed methods in the Gazebo-simulated multi-robot environment. Moreover, a real embedded device case study on NVIDIA Jetson Orin Nano quantifies system cost and validates PIMbot's effectiveness on realistic autonomous embedded systems scenarios beyond simulation. Together, these results position PIMbot as a rigorous stress-test tool exposing critical vulnerabilities in multi-robot cooperative tasks.

2605.23025 2026-05-25 cs.LG

World Machine: Towards Generative World Modeling for Time-Series

世界机器:面向时间序列的生成式世界建模

Elton Cardoso do Nascimento, Alexandre da Silva Simões, Esther Luna Colombini, Ricardo Ribeiro Gudwin, Paula Dornhofer Paro Costa

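AI summary This paper proposes World Machine, a generative world-modeling architecture for time-series data, aimed at predictive understanding and controllable simulation of environments. The architecture is transformer-based and introduces latent states, allowing it to adapt to different amounts of observed data and context, with better computational and memory efficiency than conventional transformers. Experiments on the synthetic dataset Toy1D validate the approach's feasibility and show its distinctive advantages over conventional transformers and the contribution of each training component.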

Details
AI Chinese abstract

世界模型代表了生成式AI的一种范式转变,以结构化和可泛化的方式追求对环境的预测性理解和可控模拟。我们提出了World Machine,一种用于时间序列的生成式世界建模架构。它是一种基于Transformer的架构,具有潜在状态,能够适应不同数量的观测数据和上下文。这相比传统Transformer有所改进,传统Transformer的计算和内存成本随上下文呈二次方增长。在提出的合成数据集Toy1D上的实验验证了该方法的可行性,展示了传统Transformer不具备的能力,并突出了训练协议中每个组件的贡献。

English abstract

World models represent a paradigm shift in generative AI, pursuing predictive understanding and controllable simulation of environments in a structured and generalizable way. We present World Machine, a generative world-modeling architecture for time series. It is a transformer-based architecture with latent states that enables adaptation to different amounts of observed data and contexts. This shows an improvement over traditional transformers, which have a computational and memory cost that scales quadratically with the context. Experiments on a proposed synthetic dataset, Toy1D, validate the approach's feasibility, demonstrate capabilities not found in conventional transformers, and highlight the contributions of each component of the training protocol.

2605.23024 2026-05-25 cs.AI cs.CC cs.CL cs.LG

The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems

确定性视界:作为可信AI系统设计规范的不可行性结果

Dongxin Guo

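AI summary This thesis examines the boundaries that fundamental limits of computation impose on the design of trustworthy AI systems, and proposes turning impossibility theorems into system design rules. Its central result proves that the reasoning depth of large language models has an architecture-determined ceiling, the Deterministic Horizon, which is unaffected by the amount of training data, adapter rank, or loss function and can be computed in advance from the number of layers and the embedding width. The thesis also applies this reasoning across several AI subfields, forming a catalogue of sixteen design specifications that offers a theoretical basis and design guidance for building more reliable AI systems.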

Comments PhD thesis, Department of Computer Science, The University of Hong Kong, 2026. 271 pages, 18 figures, 15 tables, 5 algorithms

Details
AI Chinese abstract

大型语言模型现在编写软件、起草法律文件并生成临床笔记,但从图灵、阿罗到没有免费午餐定理的基本极限,塑造了计算的能力。本文将这些不可行性结果从奇闻转化为设计规则。其旗舰结果证明了仅由架构设定的准确率上限:超过关键推理深度后,无论适配器秩、样本大小或损失函数如何,训练都无法改变它。该确定性视界在部署前可从层数和嵌入宽度计算,在十二种Transformer架构中测量值介于19到31之间,而在最优长度轨迹上微调可恢复不到4个百分点。其机制是残差流的容量不变性,信息论转换得出超过视界后准确率超指数衰减。一个针对模幂的无条件电路复杂度下界(对抗常数深度素数模电路)补充了这一结果。同样的论证重新应用于多个子领域:任何错误指定模型下的偏好学习在样本复杂度上出现不连续跳跃;多阶段检索流水线至少需要与阶段数一样多的独立指标;标准诚实拍卖对于具有提示相关估值的智能体失效;神经推理的零知识验证为每个非线性激活支付110到190倍的测量开销。这些共同构成了一个包含16条规范的目录,每条规范配对一个可计算边界、一个量化违反成本和一个建设性设计规则:两个组合已被证明,一个配对是诚实障碍,四个保持开放。本文为可信AI可能需要的生成式研究计划提供了不可行性规范方法论。AI的每一个基本极限也是一个设计规则。

English abstract

Large language models now write software, draft legal documents, and produce clinical notes, yet fundamental limits, from Turing and Arrow to the No Free Lunch theorems, shape what computation can do. This thesis turns such impossibility results from curiosities into design rules. Its flagship result proves an accuracy ceiling set by architecture alone: past a critical reasoning depth, no amount of training moves it, at any adapter rank, sample size, or loss function. Computable before deployment from layer count and embedding width, this Deterministic Horizon is measured between nineteen and thirty-one across twelve transformer architectures, and fine-tuning on optimal-length traces recovers under four percentage points. The mechanism is a capacity invariant of the residual stream, and an information-theoretic conversion yields super-exponential accuracy decay past the horizon. An unconditional circuit-complexity lower bound for modular exponentiation against constant-depth prime-modulus circuits complements this result. The same argument recasts across subfields: preference learning under any misspecified model jumps discontinuously in sample complexity; multi-stage retrieval pipelines require at least as many independent metrics as stages; standard truthful auctions fail for agents with prompt-dependent valuations; and zero-knowledge verification of neural inference pays a measured overhead of one hundred ten to one hundred ninety times per non-linear activation. Together these form a catalogue of sixteen specifications, each pairing a computable boundary, a quantified violation cost, and a constructive design rule: two compositions are proved, one pairing is an honest obstruction, and four remain open. The impossibility-specification methodology is offered for the generative research programme that trustworthy AI may need. Every fundamental limit of AI is also a design rule.

2605.23019 2026-05-25 cs.LG

PACE: Two-Timescale Self-Evolution for Small Language Model Agents

PACE:小型语言模型代理的双时间尺度自我进化

Chen Ling, Pei Chen, Albert Guan, Jiaming Qu, Shayan Ali Akbar, Madhu Gopinathan, Erwin Cornejo

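AI summary This paper studies whether a frozen small language model (SLM) can serve as an effective self-evolving agent under resource constraints. The authors propose PACE, a two-timescale framework that coordinates low-risk prompt refinement with higher-risk control-logic updates, enabling reliable self-evolution without updating model weights or relying on frontier models. In the reported experiments, PACE outperforms the baselines on every benchmark, with relative gains of up to 9.2% over vanilla SLM agents, and improves success on complex tasks such as multi-turn tool use.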

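Details
AI-translated abstract

Deploying language-model agents in production typically takes substantial compute and human effort to tune prompts, parsers, validators, and other components of the agent pipeline. Self-evolution is a promising alternative, but most existing frameworks assume access to a frontier model that can reliably diagnose failures, propose revisions, and judge its own updates. We study whether frozen small language models (SLMs) can act as effective self-evolving agents under resource constraints. We propose PACE (Prompt And Control Logic Evolution), a two-timescale framework that coordinates low-risk prompt refinement with higher-risk control-logic updates. PACE evolves prompts under fixed control logic until prompt-level gains saturate, and only then considers constrained control-logic updates, which are accepted through held-out validation. Across three frozen SLM backbones of 4B to 14B parameters and four controlled benchmarks, PACE achieves the best performance on all 12 backbone-benchmark combinations, with relative improvements of up to 9.2% over vanilla SLM agents and up to 5.4% over the stronger single-mode evolution baseline. A tau-bench case study further shows that PACE improves multi-turn tool-use success over both vanilla and prompt-only evolution. These results suggest that reliable self-evolution of SLM agents is possible without updating model weights or relying on frontier-model teachers, and that the key benefit lies not in any single final solver pattern but in the autonomous, validated discovery of task-appropriate inference strategies.

English abstract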

Deploying language-model agents in production often requires substantial compute and human effort to tune prompts, parsers, validators, and other components of the agent pipeline. Self-evolution offers a promising alternative, but most existing frameworks assume access to frontier models that can reliably diagnose failures, propose revisions, and judge their own updates. We study whether frozen small language models (SLMs) can serve as effective self-evolving agents under resource constraints. We propose PACE (Prompt And Control Logic Evolution), a two-timescale framework that coordinates low-risk prompt refinement with higher-risk control-logic updates. PACE evolves prompts under fixed control logic until prompt-level gains saturate, then considers constrained control-logic updates that are accepted through held-out validation. Across three frozen SLM backbones ranging from 4B to 14B parameters and four controlled benchmarks, PACE achieves the best performance on all 12 backbone-benchmark combinations, with relative improvements of up to 9.2% over vanilla SLM agents and up to 5.4% over the stronger single-mode evolution baseline. A tau-bench case study further shows that PACE improves multi-turn tool-use success over vanilla and prompt-only evolution. These results suggest that reliable SLM agent self-evolution is possible without updating model weights or relying on frontier-model teachers, and that the key benefit is not any single final solver pattern but autonomous, validated discovery of task-appropriate inference strategies.
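
The two-timescale schedule described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the patience-based saturation test, and the toy scoring functions are all assumptions made for illustration. It shows only the control flow, in which prompts are refined under fixed control logic until gains stall, and a control-logic change is then accepted only if it improves a held-out score.

```python
from typing import Callable, Tuple


def two_timescale_evolve(
    prompt: int,
    logic: int,
    propose_prompt: Callable[[int, int], int],
    propose_logic: Callable[[int, int], int],
    train_score: Callable[[int, int], float],
    heldout_score: Callable[[int, int], float],
    patience: int = 3,       # hypothetical saturation criterion
    outer_rounds: int = 5,   # hypothetical budget for the slow timescale
) -> Tuple[int, int]:
    """Alternate fast prompt refinement with slow, validated logic updates."""
    best = train_score(prompt, logic)
    for _ in range(outer_rounds):
        # Fast timescale: refine the prompt while the control logic is fixed.
        stale = 0
        while stale < patience:
            candidate = propose_prompt(prompt, logic)
            score = train_score(candidate, logic)
            if score > best:
                prompt, best, stale = candidate, score, 0
            else:
                stale += 1  # no gain; count toward saturation
        # Slow timescale: consider one control-logic update and accept it
        # only if it strictly improves the held-out score.
        candidate_logic = propose_logic(prompt, logic)
        if heldout_score(prompt, candidate_logic) > heldout_score(prompt, logic):
            logic = candidate_logic
            best = train_score(prompt, logic)
    return prompt, logic


# Toy stand-ins (purely illustrative): the "prompt" and "logic" are integers.
def toy_score(prompt: int, logic: int) -> float:
    return -abs(prompt - 10) + logic


def toy_propose_prompt(prompt: int, logic: int) -> int:
    return prompt + 1


def toy_propose_logic(prompt: int, logic: int) -> int:
    return logic + 1 if logic < 2 else logic


result = two_timescale_evolve(
    0, 0, toy_propose_prompt, toy_propose_logic, toy_score, toy_score
)
print(result)  # (10, 2)
```

In the toy run the prompt climbs to its optimum first, and the logic is then updated twice before a proposal fails the held-out check. In a real agent the proposals would come from the frozen SLM and the scores from task rollouts, which this sketch does not model.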

2605.23017 2026-05-25 cs.LG cs.GT

Smoothed Elicitation Complexity for Approximate $\Gamma$-calibration of Discrete Classification Tasks

离散分类任务的近似 $\Gamma$ 校准的平滑引发复杂度

Jessica Finocchiaro, Victor Ganson, Drona Khurana

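AI summary This paper studies approximate $\Gamma$-calibration for discrete classification tasks. To address the prohibitive complexity of calibrating multiclass classifiers, it uses Lipschitz continuous properties as an intermediate representation, which lowers calibration complexity. By constructing Lipschitz properties for strongly orderable discrete properties, the authors give what they describe as the first approximate calibration results for discrete properties, together with algorithms for designing these Lipschitz properties.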

Comments Working paper

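Details
AI-translated abstract

Calibration is a prominent way to assess the trustworthiness of machine learning models. In the binary-outcome setting, a probabilistic predictor is calibrated if, conditioned on its prediction, outcomes are realized according to the predicted distribution. Directly extending binary calibration to probabilistic multiclass classifiers causes an exponential blowup in complexity, because the space of predictions grows exponentially in the number of classes $n$. As a remedy, Noarov and Roth (2023) propose multiclass calibration in which predictions are properties of the outcome distribution, so that complexity scales with the dimension $d$ of the property, called its elicitation complexity, rather than with the number of classes $n$. Prior work on approximate property calibration is mostly limited to continuous scalar properties, even though many properties of interest, such as the mode or rankings, are discrete. We characterize approximate property calibration for strongly orderable discrete properties by using Lipschitz continuous properties as an intermediary. To our knowledge, this is the first approximate calibration result for discrete properties. Along the way, we characterize the Lipschitz elicitation complexity of strongly orderable discrete properties by constructing algorithms that design these Lipschitz properties, and we prove that they can be post-processed to recover the original discrete property.

English abstract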

One prominent method of evaluating machine learning model trustworthiness is the notion of calibration. In the binary outcome setting, a probabilistic predictor is calibrated if outcomes are realized according to a model's distributional prediction, conditioned on this prediction. Straightforward extensions of binary calibration definitions to probabilistic multiclass classifiers suffer from an exponential complexity blowup as the space of predictions grows exponentially in the number of classes $n$. As a remedy, Noarov and Roth (2023) propose multiclass calibration with predictions that are properties of the outcome distribution, reducing complexity from growing in the number of classes $n$ to the dimension $d$ of the property, called its elicitation complexity. Previous work on approximate property calibration is generally limited to continuous scalar properties, despite many relevant properties of interest being discrete, like the mode or rankings. We characterize the approximate property calibration of discrete properties which are strongly orderable by using Lipschitz continuous properties as an intermediary. This work is the first to our knowledge to provide approximate calibration results for discrete properties. Along the way, we characterize the Lipschitz elicitation complexity of strongly orderable discrete properties by constructing algorithms for designing these Lipschitz properties, which we prove can be post-processed to obtain the original discrete property.
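
As a rough illustration of the binary calibration condition and of why full-distribution calibration becomes expensive, the sketch below may help. It is not taken from the paper: the error metric (a weighted gap between each predicted value and the observed outcome frequency at that value) and the grid discretization of the probability simplex are assumptions chosen for simplicity, and the function names are hypothetical.

```python
from collections import defaultdict
from math import comb
from typing import Sequence


def binary_calibration_error(preds: Sequence[float], outcomes: Sequence[int]) -> float:
    """Weighted average of |observed frequency - predicted value| per prediction.

    A predictor is perfectly calibrated on this sample when the value is 0:
    among all points that received prediction v, a fraction v had outcome 1.
    """
    groups = defaultdict(list)
    for p, y in zip(preds, outcomes):
        groups[p].append(y)
    total = len(preds)
    return sum(
        len(ys) / total * abs(sum(ys) / len(ys) - p) for p, ys in groups.items()
    )


def simplex_grid_size(n_classes: int, resolution: int) -> int:
    """Number of probability vectors over n_classes with entries in {0, 1/m, ..., 1}.

    This is the stars-and-bars count C(m + n - 1, n - 1) for m = resolution,
    i.e. how many distinct predictions one would condition on (assumed grid).
    """
    return comb(resolution + n_classes - 1, n_classes - 1)


print(binary_calibration_error([0.5, 0.5, 1.0, 1.0, 0.0], [1, 0, 1, 1, 0]))  # 0.0
print(simplex_grid_size(2, 10))   # 11
print(simplex_grid_size(10, 10))  # 92378
```

With ten classes at this resolution, there are 92378 possible distributional predictions to condition on, whereas a discrete property such as the mode takes only ten values. That gap is the motivation for calibrating properties of the outcome distribution instead of the full distribution.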