arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.07590 2026-06-09 cs.CV cs.AI New submission

SlideCheck: Guiding Self-Supervised Pretraining of Pathology Foundation Models via Dataset Distributions

Mingyi He, Xinyi Guo, Xitong Ling, Weiming Chen, Jiawen Li, Lianghui Zhu, Minxi Ouyang, Mingxi Fu, Yizhi Wang, Tian Guan

Affiliations: Beijing University of Chemical Technology; South China Normal University; Tsinghua University

AI summary: Proposes SlideCheck, a tool that uses features from a frozen pathology foundation model and a dual-head MLP to score abnormality and malignancy evidence, guiding data selection for self-supervised pretraining. Experiments show that the data distribution affects downstream model performance.

Comments: 9 pages, 2 figures, 4 tables

Abstract

Pathology foundation models are pretrained on large streams of WSI-derived patches, while supervision during data construction is often slide-level, sparse, or heterogeneous. This mismatch makes it difficult to understand and control which biological patterns enter the pretraining data. We propose SlideCheck, a lightweight pretraining data guidance tool built on frozen pathology foundation model patch features. Rather than serving as a standalone patch diagnostic model, SlideCheck provides explicit abnormality and malignancy scores for organizing, filtering, and auditing pathology pretraining data. SlideCheck uses a dual-head MLP to separately model broad abnormal morphology and malignant evidence. A regularized feature-space scorer provides a supervised anchor for patch-level evidence estimation, while score-attention agreement combines patch scores with WSI-level MIL attention to mine high-confidence pseudo labels. The same scores are then used to construct broad-positive ViT pretraining subsets, where a patch is selected if either abnormality or malignancy evidence exceeds a threshold. Experiments show that SlideCheck-defined data distributions influence the downstream behavior of self-supervised ViT pretraining, indicating that biological composition is an important controllable factor in pathology foundation model development. Curated subsets can approach full-data performance, suggesting that explicitly scored patch pools may support more efficient and auditable pretraining data construction. These findings position SlideCheck as a data guidance and auditing layer for transforming large, undifferentiated patch pools into controllable and reusable pretraining datasets.
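
The broad-positive selection rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `PatchScore` record, its field names, the shared threshold of 0.5, and the example scores are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class PatchScore:
    """Hypothetical record holding the two scores for one patch."""
    patch_id: str
    abnormality: float
    malignancy: float


def select_broad_positive(patches, threshold=0.5):
    """Keep a patch if either score exceeds the (assumed) threshold."""
    return [
        p.patch_id
        for p in patches
        if p.abnormality > threshold or p.malignancy > threshold
    ]


# Made-up scores, for illustration only.
patches = [
    PatchScore("p0", abnormality=0.9, malignancy=0.2),
    PatchScore("p1", abnormality=0.1, malignancy=0.1),
    PatchScore("p2", abnormality=0.3, malignancy=0.8),
]
selected = select_broad_positive(patches)
```

Because the rule is a logical OR, a patch with clear abnormal morphology but little malignant evidence (such as `p0` here) is still kept, which is what makes the subset "broad-positive".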

2606.07587 2026-06-09 cs.LG New submission

The Routing Plateau: Understanding and Breaking the Accuracy Limits of LLM Routers

Yifan Lu, Qiyue Zhang, Shenrun Zhang, Zhibo Yu, Zhuang Wang, Hanjie Chen, Jiarong Xing

Affiliations: Rice University; Amazon

AI summary: Finds that many LLM routing methods exhibit a "routing plateau": their accuracies converge to similar values far below the oracle router, mainly because of a predictability bottleneck. Larger training datasets, stronger encoders, and end-to-end fine-tuning can move beyond the plateau.

Comments: 23 Pages, 12 Tables, 9 Figures

Abstract

LLM routing has become a popular approach to improve the cost-quality trade-off of LLM services by dynamically selecting a model for each query. Recent work has explored a broad range of routing methods, including clustering-based routers, learned classifiers, pairwise ranking, and confidence-based approaches. Our extensive study of 21 routing methods across five benchmarks reveals a consistent phenomenon that we call the routing plateau: many methods, including kNN, achieve very similar accuracy and converge to a narrow performance range that remains far below the oracle router. Our investigation shows that the plateau is largely caused by a predictability bottleneck: current routers mainly learn global averaged model-performance trends rather than fine-grained query-specific routing signals. As a result, they solve overlapping easy queries but collectively fail on hard queries that require instance-specific routing decisions. We further study how to move beyond the plateau and find that larger training datasets, stronger encoders, and end-to-end fine-tuning can further improve routing accuracy. These findings characterize the common limits of current routing methods and provide insights and actionable directions for the community to build more effective routing systems.
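
The gap between a router that only tracks global average model performance and the oracle router can be illustrated with a toy correctness matrix. This is a hedged sketch: the matrix values are made up, and the paper's exact accuracy definition may differ from the one assumed here (a query counts as solved if the selected model answers it correctly).

```python
def oracle_accuracy(correct):
    """Fraction of queries that at least one candidate model answers correctly."""
    return sum(1 for row in correct if any(row)) / len(correct)


def best_single_model_accuracy(correct):
    """Accuracy of always routing to the model with the best global average."""
    n_models = len(correct[0])
    return max(
        sum(row[m] for row in correct) / len(correct)
        for m in range(n_models)
    )


# Rows are queries, columns are models; 1 means the model answered correctly.
correct = [
    [1, 1],  # easy query: both models succeed
    [1, 0],  # only model 0 succeeds
    [0, 1],  # only model 1 succeeds
    [0, 0],  # no model succeeds
    [1, 1],
]
oracle = oracle_accuracy(correct)
global_avg = best_single_model_accuracy(correct)
```

In this toy case the global-average choice reaches 0.6 while the oracle reaches 0.8. The difference comes entirely from the two queries where only one model succeeds, which are the instance-specific decisions the abstract says current routers miss.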

2606.07585 2026-06-09 cs.CV cs.AI New submission

Multimodal Group Emotion Recognition In-the-Wild Towards a Privacy-Safe Non-Individual Approach

Anderson Augusma

Affiliations: Université Grenoble Alpes; Univ. of Glasgow; Inria; Univ. Paris-Saclay; TU Delft

AI summary: Proposes two multimodal frameworks (cross-attention fusion with frame attention pooling, and a variational encoder multi-decoder) that use collective audio-video signals for group emotion recognition, avoiding individual features and achieving robust performance while preserving privacy.

Comments: Doctoral thesis

Abstract

This thesis addresses group emotion recognition (GER) in-the-wild with a focus on privacy preservation. Unlike traditional emotion recognition methods that rely on individual-level cues such as face, gaze, or voice analysis, this work uses collective audio-video signals to infer emotions at the group level, reducing risks of individual monitoring and surveillance. Two complementary frameworks are proposed. The first is a cross-attention multimodal architecture for audio-video fusion, combined with Frames Attention Pooling (FAP) for temporal aggregation. It is supported by synthetic data augmentation and validated through ablation studies, demonstrating robustness in real-world GER conditions. The second framework, Variational Encoder Multi-Decoder (VE-MD), learns a shared latent space for emotion classification and structural representation prediction, including body and face cues. Two decoding strategies, DETR-based and heatmap-based, are explored to analyze the role of structural representations in group and individual settings. The thesis makes three main contributions: it clarifies the role of multimodality and structural cues in group-level affective computing; introduces two architectures for privacy-preserving multimodal GER; and shows that competitive performance can be achieved without using individual features as input data.

2606.07583 2026-06-09 cs.LG cs.AI New submission

Outage Detection in Self-Healing Smart Grids Using Reinforcement Learning with Spectral Graph Neural Networks

Lihui Liu, Mucun Sun, Caisheng Wang

Affiliations: Wayne State University; University of Texas at Dallas

AI summary: Proposes a spectral graph reinforcement learning framework that uses a spectral graph neural network to learn an optimal restoration policy, enabling real-time, near-optimal outage management in distribution networks. Generalization is validated on three IEEE test systems.

Abstract

Self-healing smart grids can quickly adjust their network configuration during outages to minimize power disruptions. During an outage, several actions can be taken, such as network reconfiguration through switching operations and emergency load shedding. However, traditional machine learning methods for outage mitigation are not well suited for smart grids due to their slow response time and high computational cost. To address these challenges, recent studies have explored reinforcement learning to automatically perform network reconfiguration. In these approaches, the control policy is typically modeled using a graph neural network (GNN). However, conventional GNNs operate in the spatial domain and may fail to capture important relationships in the frequency domain. Frequency-domain information is particularly useful for modeling global structural patterns and system-wide interactions in power networks. In this paper, we propose a spectral graph reinforcement learning framework for outage management in distribution networks to enhance system resilience. Our model learns the optimal power restoration policy using a spectral graph neural network. We evaluate the proposed method on three modified IEEE test systems: the 13-bus, 34-bus, and 123-bus networks. Experimental results show that our approach achieves near-optimal performance in real time and generalizes well across a wide range of outage scenarios.

2606.07582 2026-06-09 cs.LG cs.AI cs.ET New submission

Customer Churn Prediction on Structured Data Using FT-Transformer and Stacking Ensembles

Joyjit Roy, Samaresh Kumar Singh, Laxmi Shaw

Affiliations: Independent Researcher, Austin, TX, USA; Independent Researcher, Leander, TX; Texas A & M University-Victoria, Victoria, TX

AI summary: Proposes a hybrid architecture combining FT-Transformer and XGBoost, using calibration-aware stacking to handle class imbalance and feature interactions. On a bank churn dataset it reaches 62.10% F1 and 0.861 AUC-ROC.

Journal ref: IEEE Access, vol. 14, pp. 62834-62855, 2026

Comments: 22 pages, 9 figures, 20 tables; published in IEEE Access

Abstract

Customer churn prediction is essential across data-driven industries such as insurance, digital banking, eCommerce, and subscription platforms, where retaining existing customers is typically more cost-effective than acquiring new ones. Predicting churn on structured datasets remains challenging due to class imbalance, nonlinear feature interactions, and heterogeneous feature types. Tree-based ensemble methods consistently demonstrate strong performance in these contexts, often outperforming conventional neural networks. This study introduces a validated hybrid architecture that integrates feature-tokenized transformers (FT-Transformer) with gradient-boosted trees through calibration-aware stacking. The proposed framework addresses persistent gaps in statistical validation, probability calibration, and reproducibility found in prior research. The FT-Transformer captures higher-order feature interactions using self-attention, while XGBoost captures gradient-boosted decision boundaries with complementary inductive biases. Class imbalance is handled using class-weighted loss functions, thereby avoiding synthetic oversampling and preserving minority-class distributions. The models are ensembled using out-of-fold (OOF) stacking with a logistic regression meta-learner, which recalibrates overconfident base model outputs and learns optimal combination weights. On a public bank churn dataset, the hybrid model achieves 62.10% F1, 0.861 AUC-ROC, and 0.647 PR-AUC, outperforming the Multi-Layer Perceptron (MLP) baseline by 3.37 F1 points and 0.027 AUC under 5x5 cross-validation with 95% confidence intervals reported. Ablation studies demonstrate that both the transformer component and stacking strategy contribute materially to performance. The proposed methodology offers a reproducible and extensible reference architecture for contemporary churn prediction on structured tabular data.
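
The out-of-fold (OOF) step that feeds the meta-learner can be sketched in plain Python. This is an illustration under stated assumptions, not the paper's pipeline: folds are assigned by index modulo `n_folds` rather than by the 5x5 cross-validation scheme the abstract mentions, and the base learner is a toy positive-rate predictor standing in for FT-Transformer or XGBoost.

```python
def oof_predictions(X, y, fit, predict, n_folds=5):
    """Return one out-of-fold prediction per row.

    Each row is scored by a model that never saw that row during fitting.
    Fold membership is index % n_folds (an assumption of this sketch).
    """
    n = len(X)
    oof = [None] * n
    for fold in range(n_folds):
        train_idx = [i for i in range(n) if i % n_folds != fold]
        held_idx = [i for i in range(n) if i % n_folds == fold]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in held_idx:
            oof[i] = predict(model, X[i])
    return oof


def fit_rate(X_train, y_train):
    """Toy base learner: remember the positive rate of the training split."""
    return sum(y_train) / len(y_train)


def predict_rate(model, x):
    """Toy prediction: return the remembered positive rate for any input."""
    return model


X = [[float(i)] for i in range(10)]
y = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
oof = oof_predictions(X, y, fit_rate, predict_rate)
```

Row 0 holds the only positive label, and its OOF prediction is 0.0 because the model that scored it was fit without it. In a stacking setup, one such OOF column per base model would become the input features of the logistic regression meta-learner, so the meta-learner never trains on leaked labels.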

2606.07577 2026-06-09 cs.AI cs.CV cs.SD eess.AS New submission

OmniMem: Perturbation-aware Memory Compression for Streaming Audio-Visual LLMs

Guangzhi Sun, Yixuan Li, Yudong Yang, Chao Zhang

Affiliations: Tsinghua University; ByteDance; Department of Engineering, University of Cambridge

AI summary: Proposes OmniMem, a streaming memory compression framework for audio-visual LLMs that compresses the KV cache through modality-aware allocation and perturbation-aware selection, reducing memory while preserving long-video understanding and improving accuracy by 2-4% on multiple benchmarks.

Comments: Code: https://github.com/bytedance/SALMONN/tree/omni_mem

Abstract

Audio-visual large language models (LLMs) hold strong promise for long-form video understanding, yet their long-video inference is fundamentally limited by the linear growth of video tokens and key-value (KV) caches. We present OmniMem, a memory-efficient streaming framework designed specifically for audio-visual LLMs. Unlike existing compression methods that treat all tokens uniformly, OmniMem introduces a modality-aware memory allocation strategy that separately manages visual and audio contexts, addressing the severe token imbalance between the two modalities. OmniMem further preserves informative and non-redundant KV states through perturbation-aware memory selection, enabling compact memory without sacrificing long-range understanding. To strengthen compression under realistic deployment constraints, we also explore budget-aware fine-tuning, which encourages the model to consolidate useful information into retained memory. Experiments on VideoMME Long, LVBench, and LVOmniBench with video-SALMONN 2+ and Qwen-2.5-Omni show that OmniMem consistently improves over strong training-free compression baselines by 2-4% absolute accuracy under the same memory budgets, with an additional 1-2% gain after fine-tuning.

2606.07571 2026-06-09 cs.LG cs.AI New submission

Enabling KV Caching of Shared Prefix for Diffusion Language Models

Younghun Go, Jaehoon Han, Changyong Shin, Chuk Yoo, Gyeongsik Yang

Affiliations: Korea University

AI summary: Addresses the instability of shared-prefix KVs caused by bidirectional attention in diffusion language models. Proposes bidirectional prefix caching (bicache), which dynamically identifies a safe layer depth for reusing KVs, avoiding accuracy collapse and improving throughput by 36.3%-98.3%.

Abstract

Key-value (KV) caching for shared prefixes is essential for high-throughput large language model (LLM) serving, but it faces critical challenges in emerging diffusion language models (DLMs). In DLMs, bidirectional attention means that updating any token dynamically alters the entire context and its corresponding KVs. Thus, existing caching techniques developed for LLMs, which assume that KVs remain invariant once computed, corrupt the shared prefix KVs. Our experiments show that applying these techniques to DLMs causes model accuracy to collapse to near zero. To unlock high-throughput DLM serving, we propose bidirectional prefix caching, bicache, the first KV caching technique for shared prefixes in DLMs. bicache is designed based on key observations from our comprehensive analysis: shared prefix KVs remain stable and reusable in shallow layers, while the depth of shallow layers depends on the fraction of shared prefix tokens in each request. Thus, bicache dynamically identifies a safe layer depth for reusing shared prefix KVs and eliminates redundant computation. Evaluations demonstrate that bicache significantly improves serving throughput by 36.3%-98.3% compared to existing techniques without accuracy collapse (only 0-1.8% difference).

2606.07565 2026-06-09 cs.LG New submission

STARIXNet: Multivariate and Multi-attribute Deep Learning Approach to Real-Time Resource Allocation in Cloud Platforms

Ahmed Abdulaal, Maruf Aytekin, Thilaga kumaran Srinivasan, Tomer Lancewicki

Affiliations: Walmart Global Tech

AI summary: Proposes STARIXNet, a lightweight neural network that captures spatio-temporal relationships among multiple system metrics for multivariate resource allocation, prioritizing service stability ahead of cost efficiency. It yields savings of 10% to 50% in Walmart production environments.

Comments: 11 pages, 12 figures. Under review

Abstract

Intelligent scaling of microservices in cloud platforms is crucial for mitigating escalating compute costs while avoiding service disruptions. Current solutions are limited to the univariate space, typically focusing on CPU usage alone to drive scaling decisions. Moreover, they address the problem as a purely forecasting task, focusing on prediction precision while neglecting the greater risks of underestimation and delays in system responsiveness. Alternative solutions are computationally complex, making them impractical for large-scale, real-time deployments. To address these challenges, we present STARIXNet, a lightweight neural network that guides resource allocation decisions in the multivariate space by capturing spatio-temporal relationships among multiple system metrics. STARIXNet models multiple quasi-dependent attributes, in particular the (S)easonal, (T)emporal, (A)uto-(R)egressive (I)ntegrated, and e(X)ogenous patterns, then implements an aggregation policy to finalize scaling decisions, prioritizing service stability, followed by cost-efficiency, over raw forecast accuracy. We empirically demonstrate the performance of STARIXNet by benchmarking against existing solutions in real-world settings. STARIXNet is deployed for critical production microservices at Walmart achieving tangible savings ranging from 10% to 50%, in addition to intangible benefits through improved service stability and customer experience.

2606.07563 2026-06-09 cs.LG cs.AI New submission

Emergence via Phase Transitions: Mechanism Landscapes and Universal Convergence Across Complex Systems

Truong Xuan Khanh

Affiliations: H&K Research Studio; Clevix LLC

AI summary: Proposes the Hierarchical Emergence Framework (HEF), which models emergence as a phase transition in a mechanism landscape, proves physical feasibility and convergence to a unique fixed point under structural assumptions, and verifies a phase-transition fingerprint in 111 modular arithmetic transformer experiments.

Comments: 27 pages, 3 figures, 2 tables; 15-page Supplementary Information with complete proofs included

Abstract

Across machine learning, biology, and physics, independently evolving systems often converge toward strikingly similar high-level structures despite radically different microscopic details. Grokking circuits converge across random seeds, evolutionary lineages rediscover similar metabolic solutions, and renormalization flows approach common fixed points. We propose the Hierarchical Emergence Framework (HEF) as a candidate universality framework for such convergence phenomena. HEF models emergence as a phase transition in a mechanism landscape constrained by thermodynamic and information-theoretic laws. The framework introduces a critical energy threshold Ec separating an exploration regime with competing mechanisms from a convergence regime governed by a unique minimum-cost mechanism. Under structural assumptions, we prove physical feasibility, derive strict metric contraction, and establish convergence toward a unique fixed-point representation independent of initial conditions. We further connect this convergence structure to causal emergence through Effective Information and mechanism competition entropy. To test the framework, we study delayed generalization ("grokking") in modular arithmetic transformers across 111 experiments. We identify a reproducible empirical fingerprint of the Ec transition: the weight norm peaks systematically before grokking in 92% of runs. Normalized accuracy curves collapse onto a tanh kink (R^2=0.93) consistent with a Landau-Ginzburg universality class, and all grokked models converge to 0.9745+/-0.014 regardless of initialization, weight decay, or training fraction (ANOVA p>0.13). HEF is not presented as a universal theory of emergence, but as a falsifiable mathematical scaffold for studying convergence phenomena across complex systems.

2606.07561 2026-06-09 cs.LG stat.ME stat.ML New submission

Boundary Variance Inflation Causes Acquisition Bias in Gaussian Processes

Maria Bånkestad, Sanna Jarl, Jens Sjölund

Affiliations: RISE Research Institutes of Sweden; Uppsala University

AI summary: Shows that boundary variance inflation in stationary-kernel Gaussian processes on bounded domains is rooted in the truncation of the kernel correlation neighborhood, demonstrates that this geometric distortion induces systematic bias in three classes of acquisition functions, and proposes a function-free selection-profile diagnostic.

Comments: 14 pages, 8 figures; appendices included

Abstract

Gaussian processes with stationary kernels on bounded domains exhibit inflated posterior variance near the boundary. Despite being a long-recognized artifact in geostatistics and a source of over-exploration in Bayesian optimization, the causes and effects of boundary-induced acquisition bias are underexplored. We trace the root cause to a simple geometric mechanism: the truncation of the kernel correlation neighborhood at the domain boundary creates an observation-independent distortion that worsens with dimensionality. We show how this distortion manifests across three acquisition classes: variance maximization concentrates selections at the corners, whereas negative integrated posterior variance and expected predictive information gain move selections inward to axis-aligned interior shells. These patterns arise without reference to any objective function, meaning that acquisition behavior can be dominated by kernel geometry rather than the desired task-specific uncertainty. To quantify this, we introduce a function-free selection-profile diagnostic for arbitrary acquisitions, kernels, and bounded-domain geometries.
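
The truncation mechanism can be checked numerically in one dimension using the standard GP posterior variance, k(x, x) - k_x^T K^{-1} k_x. The sketch below is only a toy check and not the paper's diagnostic: the squared-exponential kernel, the lengthscale of 0.2, the five evenly spaced observation sites on [0, 1], and the jitter value are all assumptions chosen for illustration.

```python
import math


def rbf(a, b, lengthscale=0.2):
    """Stationary squared-exponential kernel with unit prior variance."""
    return math.exp(-((a - b) ** 2) / (2.0 * lengthscale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - rest) / M[i][i]
    return x


def posterior_variance(x_star, xs, jitter=1e-6):
    """GP posterior variance at x_star given observation sites xs.

    The variance depends only on where observations sit, not on their values.
    """
    K = [
        [rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
        for i, a in enumerate(xs)
    ]
    k_star = [rbf(x_star, a) for a in xs]
    alpha = solve(K, k_star)
    return rbf(x_star, x_star) - sum(k * a for k, a in zip(k_star, alpha))


xs = [0.1, 0.3, 0.5, 0.7, 0.9]               # evenly spaced sites in [0, 1]
var_boundary = posterior_variance(0.0, xs)   # neighbors on one side only
var_interior = posterior_variance(0.2, xs)   # neighbors on both sides
```

Both query points lie 0.1 from their nearest observation, yet the boundary point has observations on one side only, so its variance is larger. A variance-maximizing acquisition would therefore prefer the boundary here without any reference to an objective function.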

2606.07560 2026-06-09 cs.CL cs.LG New submission

Function-Vector Heads Are Two Populations: Writers and Cancellers in In-Context Learning

Han-yu Wang

Affiliations: The University of Hong Kong

AI summary: Finds that function-vector heads are not a homogeneous population but split into two sub-populations, writers and cancellers, which push the rule-correct logit up and down respectively. Magnitude-only ranking cannot tell them apart.

Abstract

Function-vector (FV) heads (Todd et al., 2024) are typically identified by the magnitude of their causal contribution to in-context rule tasks, under the implicit assumption that the top set is a homogeneous functional class. This assumption fails. We replace magnitude-only ranking with a sign-preserving criterion (refined DLA + permutation FDR) and validate each candidate by path patching. The FV head population then splits into two opposing sub-populations: writers push the rule-correct logit up; cancellers push it down. A four-condition canonical verdict holds in $13/15$ cells across three model families and six Pythia scales, and a sign-shuffle rejects homogeneity in $5/6$ main cells. The structure is invisible to magnitude-only ranking: Todd's top-$20$ captures $64\%$ of cancellers but only $4\%$ of writers on the hierarchical task, and $59\%$ of writers but only $8\%$ of cancellers on the modular task. We rule out six artefact accounts on all $27$ canceller (cell, head) pairs: induction overlap, sinks, generic importance, rank-$1$ copy-suppression, V-cascade, and rank-nearest non-FV controls. Zero-ablating cancellers yields $+0.13$ to $+0.29$ nats of logit gain in $6/6$ main cells with a directionally consistent $+2$ to $+7$ pp accuracy effect.
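
A toy example shows why magnitude-only ranking hides the writer/canceller split. It is a simplified sketch: the head names and signed effect values are made up, and a plain signed effect stands in for the paper's refined DLA score, with no permutation FDR or path-patching validation.

```python
def top_k_by_magnitude(effects, k):
    """Rank heads by absolute effect, discarding the sign."""
    return sorted(effects, key=lambda h: abs(effects[h]), reverse=True)[:k]


def split_by_sign(effects, heads):
    """Separate heads that raise the rule-correct logit from those that lower it."""
    writers = [h for h in heads if effects[h] > 0]
    cancellers = [h for h in heads if effects[h] < 0]
    return writers, cancellers


# Hypothetical signed contributions to the rule-correct logit.
effects = {
    "L3H1": 0.9,
    "L5H2": -0.8,
    "L7H4": 0.1,
    "L9H0": -0.7,
    "L2H6": 0.05,
}
top = top_k_by_magnitude(effects, k=3)
writers, cancellers = split_by_sign(effects, top)
```

The magnitude-only top 3 mixes one writer with two cancellers, so treating it as a single functional class would average over heads with opposite effects.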

2606.07559 2026-06-09 cs.CL cs.AI quant-ph New submission

Phantom transitions in language model fine-tuning

Vaibhav Prakash, Jayasri Dontabhaktuni

Affiliations: Mahindra University

AI summary: Studies silent failures in language model fine-tuning where the correct completion loses to a near-synonym competitor. An order parameter decomposes into a signal and a background drag, revealing two failure modes, and the apparent phase transitions turn out to be phantoms that arise from the softmax readout rather than from a geometric phase transition.

Comments: 26 pages, 9 figures

Abstract

Fine-tuning a language model on contexts whose correct completion has a near-synonym competitor often fails silently. The cross-entropy loss decreases monotonically while the correct token never overtakes the competitor in rank. We study this regime across five transformer architectures spanning two families and a fivefold parameter range, on ten hand-selected near-synonym contexts. We instrument these failures with an order parameter combining the predicted distribution and pairwise embedding overlaps. It decomposes additively into a signal, tracking the model's commitment to the correct token over its nearest competitor, and a background drag, set by how the embedding bulk leaks probability into the score. This isolates two failure modes. In kinematic failure the signal stays small. In structural failure the drag actively worsens as fine-tuning proceeds. We observe sharp catapult-like jumps in the order parameter that resemble a phase transition. A central negative result organises the paper. The transitions are phantoms. The spontaneous-symmetry-breaking interpretation is ruled out by direct measurement. Catapult-like jumps still appear under LoRA fine-tuning with the token embedding matrix exactly unchanged during training, where no geometric phase transition is possible. The discontinuity lives entirely in the softmax readout. A small number of dimensionless quantities organise the trajectory across architectures. One is consistent across all five under full fine-tuning. A second sorts architectures into two classes by bulk embedding distribution and predicts LoRA sufficiency. As a blind test, the framework predicts the critical learning rate of a held-out architecture, not used to fit any parameter, to within 2.1% of a subsequent learning-rate sweep. Findings concern the near-synonym mechanism only and should not be extrapolated without recalibration.

2606.07558 2026-06-09 cs.CV cs.AI cs.DL New submission

Page image classifier fine-tuned on century-spanning archives of scanned documents for further content-specific processing

Kateryna Lutsai, Pavel Straňák, David Novák, Dana Křivánková

Affiliations: Institute of Formal and Applied Linguistics, Charles University MFF; Institute of Archaeology, Czech Academy of Sciences

AI summary: Because manual sorting is infeasible in historical document digitization, proposes an automatic page image classification system based on visual content type (text, tables, graphics). Fine-tuned deep networks reach near-perfect classification (RegNetY-16GF at 99.16% accuracy), and the models, dataset, and code are publicly released.

Comments: 29 pages, 19 figures, 13 tables. arXiv admin note: text overlap with arXiv:2507.21114

Abstract

Purpose: Digitization projects in the humanities produce vast, heterogeneous archives of historical documents, making manual sorting impractical at scale. This work addresses the need for an automated system to classify scanned page images based on visual content type - text, tables, and graphics - enabling content-specific downstream processing such as Optical Character Recognition (OCR) or structured data extraction. Methods: An image classification system was developed and evaluated on a dataset of over 48,000 annotated historical page images from century-old Czech archaeological archives, refined through four successive annotation stages with domain-expert review. A Random Forest Classifier baseline was established using hand-crafted image features. Subsequently, deep learning architectures were fine-tuned and compared: Convolutional Neural Networks (EfficientNetV2, RegNetY), Vision and Document Image Transformers (ViT, DiT), and multimodal CLIP models. An 11-category label scheme was designed collaboratively with domain experts and evaluated via five-fold cross-validation. Results: The feature-based baseline achieved approximately 75% accuracy. Fine-tuned CNNs and Transformers substantially outperformed it, with RegNetY-16GF achieving 99.16% and ViT-large 99.12% Top-1 accuracy on the held-out test set. CLIP ViT-B/16 reached 99.14% with optimized text descriptions. Conclusion: Image-only models, particularly RegNetY-16GF, deliver near-perfect classification accuracy and produce consistent labels across 649,508 unlabeled archival pages with over 90% inter-model agreement. Fine-tuned CLIP, despite competitive test-set accuracy, showed under 65% agreement with image-only models on unlabeled data, making it less suitable for deployment. The final models, annotated dataset, and software are publicly available under open-source licenses.
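
Inter-model agreement on unlabeled pages can be computed without ground truth. The following is a minimal sketch assuming agreement means the fraction of pages on which two models assign the same label; the abstract does not define the metric, and the model keys and label names below are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Fraction of pages on which two models predict the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both models must label the same pages")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


# Hypothetical predictions for four unlabeled pages.
predictions = {
    "regnety": ["TEXT", "TABLE", "PHOTO", "TEXT"],
    "vit": ["TEXT", "TABLE", "PHOTO", "DRAW"],
    "clip": ["TEXT", "DRAW", "PHOTO", "DRAW"],
}
agreement = {
    (a, b): pairwise_agreement(predictions[a], predictions[b])
    for a, b in combinations(predictions, 2)
}
```

A model whose agreement with the others drops sharply on unlabeled data, as the abstract reports for fine-tuned CLIP, can be flagged this way even when its held-out test accuracy looks competitive.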

2606.07555 2026-06-09 cs.CL cs.LG New submission

Priors Persist Through Suppression: A Stroop Paradigm for Lexical Override

Han-yu Wang

Affiliations: The University of Hong Kong

AI summary: Using a Stroop-style paradigm, finds that lexical priors in language models persist after a local rule overrides them, and uses activation patching to locate a source-position triplet, showing that the prior is the shared channel where interference originates and where override leaves its trace.

Abstract

Glossaries, technical specifications, and system prompts routinely ask language models to use familiar words in unfamiliar ways. When this works, the lexical prior persists through override rather than being replaced: it continues to operate after the local rule applies, with the rule lowering its logit rather than installing the new meaning on top. We test this with a Stroop-style paradigm: a remapping rule ("doctor" means "forest") pitted against the query word's lexical-prior distractor ("hospital"), with matched neutral controls. Across 11 open-weight models spanning four families and 1B to 9B parameters, lexical-prior strength predicts interference even after item-level controls for answer prior, frequency, tokenization, and prompt wording. Activation patching on five aligned models locates a source-position triplet (definition subject, definition target, query word) that nearly fully recovers the conflict effect (aggregate $R \in [0.92, 1.06]$). A definition-target swap shows the triplet performs binding rather than identity matching. Dissociation experiments isolate target preservation as the binding-specific signature: distractor suppression occurs under matched, swap, and item-mismatched conditions alike, whereas target logit collapse occurs only when the definition-target position is corrupted. Behavior and mechanism converge on the same channel: the lexical prior is both where interference originates and where override leaves its mark.
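
The abstract reports an aggregate recovery value R without giving its formula. One common normalization for activation-patching results, assumed here purely for illustration, is the fraction of the clean-versus-corrupted gap that a patched run restores. Under that assumption, values slightly above 1 (as in the reported range) would mean the patched run overshoots the clean run.

```python
def recovery_ratio(clean: float, corrupted: float, patched: float) -> float:
    """Fraction of the clean-vs-corrupted effect restored by patching.

    Hypothetical normalization (not taken from the paper): 0 means
    patching changed nothing, 1 means the clean effect was fully
    restored, and values above 1 mean the patched run overshoots.
    """
    gap = clean - corrupted
    if gap == 0:
        raise ValueError("clean and corrupted runs have the same effect size")
    return (patched - corrupted) / gap


# Made-up conflict-effect sizes, e.g. a target-vs-distractor logit difference.
print(recovery_ratio(clean=2.0, corrupted=0.5, patched=1.9))
```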

2606.07550 2026-06-09 cs.LG cs.AI 新提交

Offline Reinforcement Learning for Plasma Control in Nuclear Fusion: Codebase and Benchmark

核聚变等离子体控制的离线强化学习:代码库与基准

Yang Fu, Haomin Bao, Rohit Sonker, Xiaoyan Hu, Aravind Venugopal, Jeff Schneider, Jiayu Chen

发表机构 * Central South University(中南大学) Chongqing University(重庆大学) Carnegie Mellon University(卡内基梅隆大学) The University of Hong Kong(香港大学)

AI总结 提出RL4F基准,基于DIII-D托卡马克历史数据构建评估环境,比较多种离线RL方法在等离子体控制任务上的性能,发现基于模型的离线RL方法平均表现最佳。

详情
Comments
23 pages (10 pages main text)
AI中文摘要

离线强化学习(RL)为从历史托卡马克数据开发等离子体控制器提供了一条有前景的途径,因为在真实设备上进行在线试错成本高昂且风险巨大。然而,由于缺乏针对核聚变中现实多执行器、长时域等离子体控制问题的标准化离线RL基准,这一方向的进展仍然难以衡量。我们引入了RL4F,一个用于核聚变等离子体控制的离线强化学习基准,提供了闭环评估环境和四个全剖面跟踪任务(旋转、密度、温度和压力)的基线比较。评估环境背后的动力学函数基于真实托卡马克DIII-D的历史放电数据构建。我们在统一协议下评估了广泛的模仿学习和离线RL基线。我们发现,基于模型的离线RL方法在大多数目标上获得了最佳平均性能,尽管没有单一方法在所有任务中占主导地位,这突显了动力学建模在复杂、长时域等离子体控制任务中的重要性。为了促进进一步研究,我们开源了代码库、数据集和评估框架,不仅为聚变社区,也为离线RL的算法开发提供了一个基准。

英文摘要

Offline reinforcement learning (RL) offers a promising route for developing plasma controllers from historical tokamak data, since online trial-and-error on real devices is costly and risky. However, progress in this direction remains difficult to measure due to the lack of a standardized offline RL benchmark for realistic multi-actuator, long-horizon plasma control problems in nuclear fusion. We introduce RL4F, an Offline Reinforcement Learning Benchmark for Plasma Control in Nuclear Fusion, providing closed-loop evaluation environments and baseline comparisons across four full-profile tracking tasks: rotation, density, temperature, and pressure. The dynamics function underlying the evaluation environment is built from historical discharge data from DIII-D, a real-world Tokamak. We evaluate a broad set of imitation learning and offline RL baselines under a unified protocol. We find that offline model-based RL methods obtain the best average performance on most objectives, although no single method dominates all tasks, highlighting the importance of dynamics modeling in complex, long-horizon plasma control tasks. To foster further research, we open-source the codebase, datasets, and evaluation framework, providing a benchmark not only for the fusion community but also for algorithm development in offline RL.

2606.07549 2026-06-09 cs.AI cs.MA 新提交

PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow

PathoSage:通过经验感知的代理工作流实现病理学多源证据裁决

Chengyang Zhang, Wenchuan Zhang, Bo Li, Mengran Li, Bob Zhang, Yuhao Yi, Hong Bu, Jiancheng Lv

发表机构 * College of Computer Science, Sichuan University(四川大学计算机科学学院) Department of Pathology and Institute of Clinical Pathology, West China Hospital, Sichuan University(四川大学华西医院病理科/临床病理研究所) Department of Computer and Information Science, University of Macau(澳门大学计算机与信息科学系) School of Intelligent Systems Engineering, Sun Yat-sen University(中山大学智能工程学院)

AI总结 提出PathoSage框架,通过结构化证据审议和Beta-Bernoulli经验系统,独立评估工具证据并解决冲突,减少幻觉和分类器分歧,提升病理学推理鲁棒性。

详情
AI中文摘要

多模态大语言模型(MLLMs)和代理工作流的最新进展在计算病理学中显示出巨大潜力,但可靠的补丁级推理仍然具有挑战性。端到端的病理学MLLM常常幻觉形态特征,而最近的代理系统通常将工具输出和检索知识合并到共享上下文中,使得决策容易受到冲突证据和上下文污染的影响。我们提出PathoSage,一个三阶段框架,明确分离知识检索、证据收集和证据裁决,用于补丁级病理学多模态推理。其核心组件结构化证据审议独立评估来自工具的异质证据,执行冲突分析,并在全新上下文中生成最终判断,以减少锚定偏差。我们进一步引入一个无需训练的Beta-Bernoulli经验系统,具有连续信用分配,以建模长期工具可靠性,并为未来工具使用构建相似性加权先验。实验表明,PathoSage有效缓解了VQA幻觉和分类器分歧,优于强病理学MLLM和代理基线。我们的结果强调了明确的证据裁决和可靠性感知工具建模是构建鲁棒病理学代理的关键要素。

英文摘要

Recent advances in Multimodal Large Language Models (MLLMs) and agent workflows have shown strong promise for computational pathology, yet reliable patch-level reasoning remains challenging. End-to-end pathology MLLMs often hallucinate morphological features, while recent agentic systems usually merge tool outputs and retrieved knowledge into a shared context, making decisions vulnerable to conflicting evidence and context contamination. We propose PathoSage, a three-stage framework that explicitly separates knowledge retrieval, evidence collection, and evidence adjudication for patch-level pathology multimodal reasoning. Its core component, Structured Evidence Deliberation, independently evaluates heterogeneous evidence from tools, performs conflict analysis, and generates the final judgment in a fresh context to reduce anchoring bias. We further introduce a training-free Beta-Bernoulli experience system with continuous credit assignment to model long-term tool reliability and construct similarity-weighted priors for future tool use. Experiments show that PathoSage effectively mitigates VQA hallucinations and classifier disagreement, outperforming strong pathology MLLM and agentic baselines. Our results highlight explicit evidence adjudication and reliability-aware tool modeling as key ingredients for robust pathology agents.
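
The training-free Beta-Bernoulli experience system described above can be pictured as a Beta belief per tool that is updated with fractional credit rather than a hard success/failure bit. The sketch below is an assumption-laden illustration: the class name, the uniform prior, and the way similarity weights scale past evidence are all hypothetical, not PathoSage's actual design.

```python
from dataclasses import dataclass


@dataclass
class ToolReliability:
    """Beta(alpha, beta) belief over how often a tool's evidence is right."""
    alpha: float = 1.0  # pseudo-count of successes (uniform prior)
    beta: float = 1.0   # pseudo-count of failures (uniform prior)

    def update(self, credit: float) -> None:
        """Add fractional credit in [0, 1]; 1 is fully right, 0 fully wrong."""
        if not 0.0 <= credit <= 1.0:
            raise ValueError("credit must lie in [0, 1]")
        self.alpha += credit
        self.beta += 1.0 - credit

    @property
    def mean(self) -> float:
        """Posterior mean reliability."""
        return self.alpha / (self.alpha + self.beta)


def similarity_weighted_prior(history, weights):
    """Build a prior for a new case from past beliefs, scaled by similarity."""
    alpha = 1.0 + sum(w * (h.alpha - 1.0) for h, w in zip(history, weights))
    beta = 1.0 + sum(w * (h.beta - 1.0) for h, w in zip(history, weights))
    return ToolReliability(alpha, beta)


tool = ToolReliability()
for credit in (1.0, 0.5, 1.0):  # hypothetical per-case credit
    tool.update(credit)
prior = similarity_weighted_prior([tool], [0.5])
print(round(tool.mean, 3), round(prior.mean, 3))
```

Because the update only adds pseudo-counts, no gradient step or retraining is involved, which matches the "training-free" description.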

2606.07531 2026-06-09 cs.CL cs.AI 新提交

mllm-shap: A Shapley Value Explainability Platform for Text-Audio Multimodal Large Language Models

mllm-shap:面向文本-音频多模态大语言模型的Shapley值可解释性平台

Jakub Muszyński, Paweł Pozorski, Maria Ganzha

发表机构 * Warsaw University of Technology(华沙理工大学)

AI总结 提出mllm-shap框架,通过模态感知掩码、多轮对话追踪和音素对齐分组技术,将Shapley值可解释性扩展到文本-音频多模态大语言模型,并实现10-50倍的计算加速。

详情
Comments
Submitted to ACL2026
AI中文摘要

我们介绍了mllm-shap,一个开源Python框架,旨在将Shapley值(SV)可解释性从纯文本大语言模型扩展到处理联合文本和音频输入的多模态大语言模型(MLLM)。虽然基于文本的归因已得到充分研究,但mllm-shap解决了多模态领域特有的三个关键挑战:(1)模态感知的联盟掩码,管理离散文本令牌和密集音频编码器帧的交错处理。(2)多轮对话追踪,利用每令牌元数据维护角色和模态上下文。(3)基于音素对齐的令牌分组,一种新颖的技术,将联盟空间减少10到50倍,使得长音频的SV估计在计算上可行。该平台实现了五种SV估计策略,包括具有Neyman最优分配的互补贡献(CC)估计器,其收敛性优于标准蒙特卡洛基线。mllm-shap作为pip可安装包提供,并具有交互式基于Web的GUI,用于细粒度归因可视化。据我们所知,这是第一个公开可用的框架,为文本-音频MLLM中的基于SV的可解释性提供完整、可复现的流水线。

英文摘要

We introduce mllm-shap, an open-source Python framework designed to extend Shapley Value (SV) explainability from text-only Large Language Models to Multimodal LLMs (MLLMs) processing joint text and audio inputs. While text-based attribution is well-studied, mllm-shap addresses three critical challenges unique to the multimodal regime: (1) Modality-aware coalition masking, which manages the interleaved processing of discrete text tokens and dense audio encoder frames. (2) Multi-turn conversation tracking, utilizing per-token metadata to maintain role and modality context. (3) Phonetic alignment-based token grouping, a novel technique that reduces the coalition space by 10x to 50x, rendering SV estimation computationally feasible for long-form audio. The platform implements five SV estimation strategies, including a Complementary Contributions (CC) estimator with Neyman-optimal allocation that demonstrates superior convergence over standard Monte Carlo baselines. mllm-shap is provided as a pip-installable package featuring an interactive web-based GUI for granular attribution visualization. To our knowledge, this is the first publicly available framework providing a complete, reproducible pipeline for SV-based explainability in text-audio MLLMs.
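
To make the coalition-space argument concrete, the sketch below estimates Shapley values with plain permutation sampling, treating each group of tokens or audio frames as a single player. It is the standard Monte Carlo baseline, not the Complementary Contributions estimator, and it does not use the mllm-shap API. The player names and the toy value function are invented for illustration.

```python
import random


def shapley_permutation(players, value, n_samples=2000, seed=0):
    """Monte Carlo Shapley estimate via random player orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_samples):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        prev = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            cur = value(frozenset(coalition))
            totals[p] += cur - prev  # marginal contribution of p
            prev = cur
    return {p: t / n_samples for p, t in totals.items()}


# Each player is a group (e.g. one aligned word), so the game has
# 3 players instead of one player per audio frame.
WEIGHTS = {"text:question": 1.0, "audio:word_1": 2.0, "audio:word_2": 0.5}


def toy_value(coalition):
    """Stand-in for a model score given which groups are left unmasked."""
    score = sum(WEIGHTS[p] for p in coalition)
    if {"audio:word_1", "audio:word_2"} <= coalition:
        score += 1.0  # interaction between the two audio groups
    return score


phi = shapley_permutation(list(WEIGHTS), toy_value)
print(phi)
```

Grouping matters because the number of coalitions grows exponentially with the number of players, so merging frames into word-level groups shrinks the space being sampled.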

2606.07529 2026-06-09 cs.CL cs.AI cs.CV cs.LG cs.MM 新提交

CAPruner: Conceptual-Adjacent Scene Graph Pruner for Enhancing 3D Spatial Reasoning of Large Language Models

CAPruner: 概念相邻场景图剪枝器以增强大语言模型的3D空间推理

Shengli Zhou, Xiangchen Wang, Guanhua Chen, Feng Zheng

发表机构 * Southern University of Science and Technology(南方科技大学) SpatialTemporal AI(时空人工智能)

AI总结 提出概念相邻场景图剪枝器(CAPruner),通过融合模糊语义相关性和空间邻近性估计关系重要性,在任务特定上下文中选择关键关系,避免关系级标注,显著提升大语言模型在3D视觉语言任务上的空间推理性能。

详情
Comments
Accepted by ACL 2026 Main Conference
AI中文摘要

大型语言模型(LLMs)最近被应用于3D视觉语言(3D-VL)任务,这些任务需要空间推理以识别相对于锚点的目标物体。场景图通常用于表示此类关系,但在完整图上进行推理会导致高昂的令牌成本和计算效率低下,因此需要剪枝。现有的剪枝方法主要依赖空间邻近性,常常移除任务相关的关系,从而削弱可靠的空间推理。为了解决这些局限性,我们推导出场景图剪枝的一个关键要求:保留与特定3D-VL任务最相关的空间关系。在此洞察指导下,我们提出了概念相邻场景图剪枝器(CAPruner)。CAPruner将模糊语义相关性与空间邻近性相结合,以估计关系的重要性,从而能够在任务特定上下文中选择关键关系。此外,为了避免昂贵的关系级标注,CAPruner通过监督每个节点入射边的聚合分数进行训练。大量实验表明,CAPruner有效保留了空间推理所必需的关系,从而显著提升了LLMs在3D-VL任务上的性能。代码可在 https://github.com/fz-zsl/CAPruner 获取。

英文摘要

Large language models (LLMs) have recently been applied to 3D vision-language (3D-VL) tasks, which require spatial reasoning to identify target objects relative to anchors. Scene graphs are commonly employed to represent such relations, but reasoning over complete graphs incurs high token costs and computational inefficiencies, motivating the need for pruning. Existing pruning methods primarily rely on spatial proximity and often remove task-relevant relations, thereby undermining reliable spatial reasoning. To address these limitations, we derive a key requirement for scene graph pruning: preserving spatial relations that are most pertinent to the specific 3D-VL task. Guided by this insight, we propose the Conceptual-Adjacent Scene Graph Pruner (CAPruner). CAPruner integrates fuzzy semantic relevance with spatial proximity to estimate the importance of relations, enabling the selection of critical relations in a task-specific context. Moreover, to avoid costly relation-level annotations, CAPruner is trained by supervising the aggregated scores of each node's incident edges. Extensive experiments demonstrate that CAPruner effectively preserves relations essential for spatial reasoning, leading to substantial performance improvements of LLMs on 3D-VL tasks. Code is available at https://github.com/fz-zsl/CAPruner.

2606.07527 2026-06-09 cs.CL cs.AI cs.LG 新提交

Post-training is (Massive) Supervised Learning

后训练是(大规模)监督学习

Michael Hassid, Yossi Adi, Roy Schwartz

发表机构 * FAIR, Meta AI(Meta AI 基础人工智能研究团队) The Hebrew University of Jerusalem(耶路撒冷希伯来大学)

AI总结 本文论证当前LLM后训练阶段(SFT+RL)实质是回归到BERT时代的“预训练-微调”范式,通过实验表明从零开始后训练的模型也能取得显著性能,并提出应转向“学会学习”的训练方式。

详情
AI中文摘要

训练LLM的主流范式已演变为依赖包含SFT和RL的大规模后训练阶段。在这篇立场论文中,我们认为这种方法实际上标志着回归到BERT时代的“预训练然后微调”方法,明确地使模型适应期望的行为和评估所用的特定基准。我们首先回顾LLM的历史,描述LLM演化的不同阶段。我们认为当前格局与LLM早期惊人地相似,那时任务性能严重依赖于将模型拟合到分布内数据集。为了实证证明这一点,我们比较了预训练模型和随机初始化模型,在现代推理数据集上对两种变体进行微调,并在竞争性数学和代码基准上评估它们。我们表明,从头开始后训练的模型产生了高度非平凡的性能。我们的发现表明,当前的后训练方法主要作为分布拟合机制发挥作用。最后,我们提出,开发通用能力的模型和系统需要超越针对预定义行为的广泛后训练,转而采用模型“学会如何学习”的训练过程。

英文摘要

The prevailing paradigm for training LLMs has evolved to rely on a massive post-training phase consisting of SFT and RL. In this position paper, we argue that this methodology effectively marks a reversion to the "pre-train then fine-tune" approach of the BERT era, explicitly tailoring models to the desired behaviors and specific benchmarks on which they are evaluated. We begin with a historical overview of LLMs, describing the different phases of the LLM evolution. We argue that the current landscape is remarkably similar to the early days of LLMs, where task performance heavily relied on fitting the models to in-distribution datasets. To empirically demonstrate this, we compare pre-trained models to randomly initialized ones, by fine-tuning both variants on modern reasoning datasets and evaluating them on competitive math and code benchmarks. We show that models post-trained from scratch yield highly non-trivial performance. Our findings suggest that current post-training methodologies function primarily as a distribution-fitting mechanism. We finish by positing that developing generally capable models and systems requires moving beyond extensive post-training for predefined behaviors, shifting instead toward training procedures where models "learn how to learn".

2606.07526 2026-06-09 cs.CL cs.AI 新提交

GraphLoRA: Structure-Aware Low-Rank Adaptation for Large Language Model Recommendation

GraphLoRA: 面向大语言模型推荐的结构感知低秩适配

Lin Mu, Guoji Wang, Li Ni, Lei Sang, Zhize Wu, Peiquan Jin, Yiwen Zhang

发表机构 * Anhui University(安徽大学) Hefei University(合肥大学) University of Science and Technology of China(中国科学技术大学)

AI总结 提出GraphLoRA框架,通过在低秩适配路径中嵌入可训练的图消息传递网络,实现结构信号传播,从而深度融合图结构与文本语义,提升LLM推荐性能。

详情
Comments
ACL 2026 findings
AI中文摘要

大型语言模型(LLM)因其强大的推理和泛化能力,在推荐任务(LLMRec)中展现出巨大潜力。然而,如何有效对齐LLM建模的文本语义与协同信号仍是一个关键挑战。现有方法要么将协同信息转化为文本提示,要么将预训练嵌入注入LLM,两者都将结构信息视为静态输入,无法捕获高阶关系依赖。为弥合这一差距,我们提出GraphLoRA,一种新颖的框架,将低秩适配从独立传播推广到结构感知传播。GraphLoRA在低秩适配路径中嵌入一个可训练的图消息传递网络,使结构信号能够在参数空间中传播。该设计允许协同拓扑显式指导参数更新,促进图结构与文本语义信息的深度融合。在多个基准上的大量实验表明,GraphLoRA不仅优于最先进的基于LLM的推荐方法,而且实现了卓越的泛化能力,有效平衡了结构推理能力与计算效率。代码可在https://github.com/wgj15965/GraphLoRA获取。

英文摘要

Large Language Models (LLMs) have shown strong potential for recommendation (LLMRec) due to their powerful reasoning and generalization abilities. However, effectively aligning the textual semantics modeled by LLMs with the collaborative signals remains a key challenge. Existing methods either translate collaborative information into textual prompts or inject pre-trained embeddings into the LLM, both of which treat structural information as static input and fail to capture high-order relational dependencies. To bridge this gap, we propose GraphLoRA, a novel framework that generalizes low-rank adaptation from independent to structure-aware propagation. GraphLoRA embeds a trainable graph message-passing network within the low-rank adaptation pathway, enabling structural signals to propagate through the parameter space. This design allows collaborative topology to explicitly guide parameter updates, fostering deep integration between graph-structured and textual semantic information. Extensive experiments on multiple benchmarks demonstrate that GraphLoRA not only outperforms state-of-the-art LLM-based recommendation methods but also achieves superior generalization, effectively balancing structural reasoning capability with computational efficiency. Code is available at https://github.com/wgj15965/GraphLoRA.

2606.07525 2026-06-09 cs.CL cs.AI 新提交

Implicit Causal Graph Construction in Text via Chain Discovery

通过链发现实现文本中的隐式因果图构建

Liesbeth Allein, Marie-Francine Moens

发表机构 * KU Leuven(鲁汶大学) Ghent University(根特大学)

AI总结 研究利用大语言模型从文本因果对中推断中间事件以构建隐式因果图,比较端到端构建与因果链发现方法,并探索多模型集成策略,基于1560个科学验证因果对评估。

详情
AI中文摘要

文本中的因果图通常由可观察的、预定义的事件填充。相比之下,我们研究从文本中构建隐式因果图,将每个描述的因果对视为潜在隐式因果图的起点和终点,并使用大型语言模型(LLM)推断中间因果事件。我们比较了端到端图构建与将任务视为因果链发现的方法。在后一种方法中,图是通过聚合推断出的链或通过迭代搜索过程逐步扩展部分链来构建的。我们进一步探索了“群体智慧”扩展,即在事后聚合和协作推理设置中从多个LLM访问因果知识。我们分析了这些方法之间的权衡,并使用一个包含1560个经过科学验证的因果对的手动策划数据库评估推断出的因果关系的有效性。这种基于数据库的评估被认为是可靠的、资源高效的,并且可迁移到无法获得真实图的情况。

英文摘要

Causal graphs in text are typically populated by observable, predefined events. In contrast, we study implicit causal graph construction from text by treating each described cause-effect pair as the begin- and endpoint of an underlying latent causal graph and using large language models (LLMs) to infer intermediate causal events. We compare end-to-end graph construction with methods that frame the task as causal chain discovery. In the latter, graphs are built either by aggregating inferred chains or by progressively expanding partial chains through an iterative search process. We further explore Wisdom of the Crowd extensions that access causal knowledge from multiple LLMs in post-hoc aggregation and collaborative inference settings. We analyze trade-offs among these approaches and evaluate the validity of inferred causal relations using a manually curated database of 1,560 scientifically validated causal pairs. This database-based evaluation is proposed as reliable, resource-efficient, and transferable to settings where ground-truth graphs are unavailable.
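
Of the two chain-based strategies described above, the aggregation variant is the simpler to sketch: every consecutive pair of events in an inferred chain becomes a directed edge, and edges proposed by several chains accumulate support. The example chains are invented for illustration, and the support count is an assumed way to combine chains rather than the paper's stated rule.

```python
from collections import defaultdict


def aggregate_chains(chains):
    """Merge causal chains into a directed graph with edge support counts."""
    support = defaultdict(int)
    for chain in chains:
        for cause, effect in zip(chain, chain[1:]):
            support[(cause, effect)] += 1
    return dict(support)


# Hypothetical chains inferred for the pair ("heavy rain", "flooding").
chains = [
    ["heavy rain", "river overflow", "flooding"],
    ["heavy rain", "soil saturation", "river overflow", "flooding"],
]
graph = aggregate_chains(chains)
for (cause, effect), count in sorted(graph.items()):
    print(f"{cause} -> {effect} (support={count})")
```

The same structure would also serve the multi-LLM setting: chains from different models can be pooled before aggregation, with support counting how many of them agree on an edge.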

2606.07524 2026-06-09 cs.CL cs.AI 新提交

ABLE: Representing and Mapping LLMs via Attribution-Based Large-model Embedding

ABLE:基于归因的大模型嵌入表示与映射

Zirui Wang, Yusen Hou, Shaofeng Liang, Bowen Tian, Yanlin Zhang, Wenshuo Chen, Yutao Yue

发表机构 * The Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州)) Deep Interdisciplinary Intelligence Lab (DI2 Lab)(深度跨学科智能实验室(DI2 Lab))

AI总结 提出ABLE框架,利用梯度特征归因和分词器无关的词级对齐构建模型嵌入,实现异构LLM的高效比较,在关系预测、模型路由和基准分数预测上表现优异。

详情
AI中文摘要

大语言模型(LLM)的爆炸式增长形成了一个异构且文档不完善的生态系统,使得系统性的模型比较对于来源审计、安全分析和模型选择越来越重要。现有的表示方法难以高效应对这一场景。分析内部参数的方法在架构兼容时很强大,但在结构异构下面临可扩展性障碍;而依赖外部输出的方法可能混淆具有相似行为的模型,且难以在不同分词器的更丰富输出空间中对齐。为弥合这一差距,我们提出ABLE(基于归因的大模型嵌入)框架,利用可解释性空间构建模型表示。通过基于梯度的特征归因,经由分词器无关的词级对齐进行聚合,ABLE捕获模型特定的输入敏感性模式,而不仅仅是表面输出。除经验效用外,我们提供了稳定性分析,表明在可微Transformer风格模型的标准正则性假设下,ABLE诱导出一个Lipschitz连续的参数到嵌入映射,并具有有限样本收敛保证。在239个开源LLM上的大量实验表明,我们的无训练方法在关系预测、模型路由和基准分数预测方面达到了有竞争力或更优的性能。

英文摘要

The explosive growth of large language models (LLMs) has created a heterogeneous and poorly documented ecosystem, making systematic model comparison increasingly important for provenance auditing, security analysis, and model selection. Existing representation methods struggle to address this setting efficiently. Approaches analyzing internal parameters are powerful when architectures are compatible, but face scalability barriers under structural heterogeneity, while methods relying on external outputs may conflate models with similar behaviors and are difficult to align in richer output spaces across different tokenizers. To bridge this gap, we propose ABLE (Attribution-Based Large-model Embedding), a framework that leverages the interpretability space to construct model representations. By aggregating gradient-based feature attributions via a tokenizer-agnostic word-level alignment, ABLE captures model-specific input-sensitivity patterns rather than only surface-level outputs. Beyond empirical utility, we provide a stability analysis showing that, under standard regularity assumptions for differentiable Transformer-style models, ABLE induces a Lipschitz-continuous parameter-to-embedding map with finite-sample convergence guarantees. Extensive experiments on 239 open-source LLMs demonstrate that our training-free approach achieves competitive or superior performance in relation prediction, model routing, and benchmark score prediction.

2606.07522 2026-06-09 cs.CL cs.LG cs.SI 新提交

Community-Specific Slang and Entity Detection via Semantic Shift in Fine-Tuned Language Models

通过微调语言模型中的语义偏移检测社区特定俚语和实体

Julia Kruk, Sanchita Porwal, Amitrajit Bhattacharjee, Mansi Phute

发表机构 * Georgia Institute of Technology(佐治亚理工学院)

AI总结 提出无监督方法,通过测量词在微调前后的语义偏移幅度,从在线社区文本中自动识别俚语、独特实体和民俗用语。

详情
Comments
6 pages, 6 figures, 2 tables
AI中文摘要

我们提出一种无监督方法,通过隔离词汇中具有最大语义偏移幅度的词,来解析来自在线社区的俚语、独特实体和民俗用语。语义偏移定义为在社区特定文本语料上微调预训练大语言模型(LLM)后,词编码表示的演化。该值与基础模型和微调模型对词的编码表示之间的余弦相似度成反比。我们在从3个Reddit子版块(r/Technology、r/Gaming、r/WorldofWarcraft)收集的文本语料上微调DistilRoBERTa模型,对词汇上的余弦相似度分布进行建模,并表明通过提取底部10百分位的数据,可以成功解析对社区具有独特意义的词。相反,我们表明顶部10百分位的数据由具有相对普遍语义的词组成。

英文摘要

We propose an unsupervised method of resolving slang, unique entities, and folklore from online communities by isolating words in the lexicon that have the highest magnitude of semantic shift. Semantic shift is defined as the evolution of a word's encoded representation as a result of fine-tuning a pretrained Large Language Model (LLM) on a community-specific text corpus. This value is inversely proportional to the cosine similarity between the base model's encoded representation of a word, and a fine-tuned model's encoded representation. We fine-tune the DistilRoBERTa model on text corpora collected from 3 Reddit subreddits (r/Technology, r/Gaming, r/WorldofWarcraft), model a distribution of cosine similarity over the lexicon, and show that one can successfully resolve words that have unique significance to the community by pulling data in the bottom 10-percentile. In contrast, we show that data in the top 10-percentile consist of words that carry relatively universal semantics.
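
The selection step described above reduces to ranking words by the cosine similarity between their base-model and fine-tuned representations and keeping the lowest slice. The sketch below uses tiny made-up 2-d vectors in place of real DistilRoBERTa embeddings; the function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def most_shifted(base, tuned, fraction=0.10):
    """Return the words in the lowest `fraction` of base-vs-tuned similarity."""
    sims = sorted((cosine(base[w], tuned[w]), w) for w in base)
    k = max(1, int(len(sims) * fraction))
    return [w for _, w in sims[:k]]


# Toy embeddings (hypothetical): "raid" moves a lot after fine-tuning.
base = {"the": [1.0, 0.0], "raid": [1.0, 0.0], "game": [0.0, 1.0], "tank": [1.0, 1.0]}
tuned = {"the": [1.0, 0.05], "raid": [0.0, 1.0], "game": [0.1, 1.0], "tank": [1.0, 0.2]}
print(most_shifted(base, tuned, fraction=0.25))
```

Low similarity means high semantic shift, so the bottom of the ranking holds the community-specific candidates and the top holds words with stable, general meaning.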

2606.07521 2026-06-09 cs.CL cs.AI 新提交

Evaluating Hallucinations in Domain-Adapted Large Language Models

评估领域自适应大语言模型中的幻觉现象

Sanchita Porwal, Sai Prasath S, Xingjian Bi, Madelyn Scandlen

发表机构 * College of Computing, Georgia Institute of Technology(佐治亚理工学院计算学院)

AI总结 本研究通过微调Llama-2模型,测试其记忆、回忆和推理能力,发现领域自适应大语言模型在生成新领域特定信息时存在幻觉问题,表明仅靠微调难以有效缓解幻觉。

详情
Comments
13 pages, 2 figures, 3 tables
AI中文摘要

本研究调查了领域自适应大语言模型(LLMs)中的幻觉现象,重点关注使用Lamini数据集对Llama-2模型进行微调。幻觉,即LLMs生成无意义或不忠实内容的现象,构成了重大挑战,尤其是当这些模型使用领域特定数据进行微调时。我们的方法包括一系列实验,测试微调后LLM的记忆、回忆和推理能力,并将其在新问答对和领域特定信息上的表现进行比较。我们发现,虽然模型在与训练数据相似的任务上表现出色,但其准确推理和回忆新领域特定信息的能力仍然有限,导致出现幻觉实例。模型倾向于提供带有额外信息的正确答案,表明存在过度生成的倾向。这些结果表明,仅靠微调方法在将LLMs适应专业领域时缓解幻觉存在重要局限性,并强调了在将LLMs适应专业领域时需要更鲁棒的方法。该研究还提供了关于LLMs在不同类型信息上表现差异的见解,揭示了其在处理领域特定查询时的相对弱点。

英文摘要

This study investigates the phenomenon of hallucinations in domain-adapted Large Language Models (LLMs), focusing on the fine-tuning of the Llama-2 model with the Lamini dataset. Hallucinations, or the generation of nonsensical or unfaithful content by LLMs, pose a significant challenge, especially when these models are fine-tuned with domain-specific data. Our methodology involves a series of experiments testing memorization, recall, and reasoning capabilities of the fine-tuned LLM, comparing its performance on novel question-answer pairs and domain-specific information. We found that while the model shows proficiency in tasks similar to its training data, its capability to accurately reason about and recall new domain-specific information remains limited, leading to instances of hallucination. The model demonstrates a tendency to provide correct answers with extra information, suggesting an inclination toward over-generation. These results point to important limitations of fine-tuning-only approaches for mitigating hallucinations and underscore the need for more robust methods when adapting LLMs to specialized domains. The study also provides insights into the varying performance of LLMs on different types of information, revealing a comparative weakness in handling domain-specific queries.

2606.07520 2026-06-09 cs.CL cs.LG 新提交

TinyJudge: Unverifiable Constraint Alignment via Lightweight Specialist Ensembles

TinyJudge: 通过轻量级专家集成实现不可验证约束对齐

Yirong Zeng, Yufei Liu, Xiao Ding, Yutai Hou, Yuxian Wang, Wu Ning, Haonan Song, Dandan Tu, Qixun Zhang, Yuxiang He, Bibo Cai, Ting Liu

发表机构 * Harbin Institute of Technology SCIR Lab(哈尔滨工业大学SCIR实验室) Peking University(北京大学) Huawei Technologies Co., Ltd(华为技术有限公司)

AI总结 针对LLM遵循不可验证约束时奖励黑客和计算开销大的问题,提出TinyJudge框架,利用多个小型语言模型集成提供奖励,在五个基准上平均性能提升约10%,奖励精度提升12%,训练速度提升3倍。

详情
Comments
ACL 2026 Main Conference;15 pages, 9 figures
AI中文摘要

指令遵循(IF)是LLM的核心能力,要求严格遵守从可验证(如输出长度)到不可验证(如语气)的多种约束。基于可验证奖励的强化学习已成为IF任务的范式,利用LLM作为裁判来评估不可验证约束。然而,我们实验发现该方法仍存在显著瓶颈,遭受严重的奖励黑客和更高的计算开销。本文首先分析不可验证约束的泛化能力,发现特定约束表现出独特的高泛化模式。受此启发,我们提出TinyJudge框架,采用专门的小型语言模型集成(约0.6B)为软约束提供奖励。通过将前沿模型的知识蒸馏到这些小型模型中,实现了高精度、轻量级的评估。在五个基准上的广泛评估表明,TinyJudge在平均性能上比基线高出约10%,奖励精度高出12%。关键的是,它还在总训练时间上实现了3倍的加速。我们的工作为将LLM与不可验证的人类指令对齐提供了一条可扩展且稳健的路径。

英文摘要

Instruction Following (IF) is a core capability of LLMs, requiring strict adherence to diverse constraints, ranging from verifiable ones (e.g., output length) to unverifiable ones (e.g., tone). Reinforcement learning with verifiable rewards has emerged as a paradigm for IF tasks, leveraging LLM-as-a-judge to assess unverifiable constraints. However, we empirically find that this approach remains a significant bottleneck, suffering from severe reward hacking and higher computational overhead. In this work, we first analyze the generalization capabilities of unverifiable constraints and discover that specific constraints exhibit distinct, high-generalization patterns. Motivated by this, we propose TinyJudge, a framework that employs an ensemble of specialized tiny language models ($\sim0.6B$) to provide rewards for soft constraints. By distilling expertise from frontier models into these tiny models, it achieves high-precision, lightweight evaluation. Extensive evaluations across five benchmarks demonstrate that TinyJudge outperforms the baselines by $\sim10\%$ in average performance and $12\%$ in reward precision. Crucially, it also achieves a $3\times$ speedup in total training time. Our work provides a scalable and robust path for aligning LLMs with unverifiable human instructions.

2606.07519 2026-06-09 cs.CL cs.AI 新提交

Bidirectional Small-Granularity Search between Code and Text

代码与文本之间的双向小粒度搜索

Marco A. Valenzuela-Escárcega, Enrique Noriega-Atala, Gus Hahn-Powell, Clayton T. Morrison, Mihai Surdeanu

发表机构 * Lex Machina The University of Arizona(亚利桑那大学)

AI总结 提出双向小粒度搜索任务,通过自动生成数据训练模型,实现科学出版物文本与代码片段间的直接链接,支持跨模态检索。

详情
AI中文摘要

我们引入了代码与文本之间双向小粒度搜索的新任务,其中查询是文本或代码的小片段,结果也是相反模态的小片段,即代码或文本。该任务在科学出版物中的文本与相应代码片段之间建立直接链接,以支持更好、更快地理解科学方法。我们为所提出的任务引入了一个大型数据集,其中包括使用GPT-4自动生成的代码文本描述的训练分区,以及三个测试分区:一个域内和两个域外(OOD),包含手动注释的数据以及其他领域的材料。我们还提出了一种模块化方法来解决此任务。我们的方法在四个不同的子任务之间共享一个编码器,这些子任务学习双向答案跨度的开始/结束。我们表明,我们的方法在域内取得了良好结果,在域外也取得了令人鼓舞的结果。这表明使用自动生成的数据解决此任务是可能的,但仍有令人兴奋的未来工作要做。

英文摘要

We introduce the novel task of bidirectional small-granularity search between code and text, where the queries are small snippets of text or code and the results are also small fragments of the opposite modality, i.e., code or text. This task establishes direct links between text in scientific publications and corresponding code segments, in support of better and faster understanding of scientific methods. We introduce a large dataset for the proposed task that includes a training partition with textual descriptions of code generated automatically using GPT-4, and three testing partitions, one in-domain and two out-of-domain (OOD) that contain manually-annotated data as well as material from other domains. We also propose a modular approach to address this task. Our approach shares an encoder across four different subtasks that learn start/end of answer spans in both directions. We show that our method achieves good results in-domain, and encouraging results OOD. This suggests that addressing this task with automatically-generated data is possible, but there is exciting future work to be done.

2606.07066 2026-06-09 cs.CL 新提交

Modeling semantic association in self-paced reading with language model embeddings

使用语言模型嵌入建模自定步速阅读中的语义关联

Sara Møller Østergaard, Kenneth Enevoldsen, Afra Alishahi, Bruno Nicenboim

发表机构 * Department of Computational Cognitive Science, Tilburg University(蒂尔堡大学计算认知科学系) Center for Humanities Computing, Aarhus University(奥胡斯大学人文计算中心)

AI总结 本研究使用语言模型嵌入的十种实现方式量化语义关联,通过贝叶斯模型分析其对N400和自定步速阅读时间的影响,发现句子嵌入能可靠捕捉超出词可预测性的语义关联。

详情
AI中文摘要

词语与其上下文之间的语义关联已被认为是阅读理解的重要组成部分,即使考虑了词的可预测性。最近的研究强调了语言模型(LM)嵌入在量化语义关联方面的潜力。然而,基于嵌入的语义关联已有多种操作化方式。在本研究中,我们使用LM嵌入来估计联合脑电图(EEG)和自然荷兰语文本自定步速阅读语料库上的语义关联。语义关联通过十种不同的实现方式计算,这些方式在嵌入模型和上下文长度上有所不同。使用贝叶斯层次模型和贝叶斯因子检验了不同实现方式下语义关联对N400和自定步速阅读时间的影响。结果表明,嵌入模型的选择可以改变语义关联对N400和自定步速阅读时间的估计效应。此外,结果显示了句子嵌入在捕捉语义关联方面的潜力,因为只有依赖句子嵌入的实现方式在神经和行为测量上都显示出超出词可预测性的可靠语义关联结果。总之,这些发现强调了在量化语义关联时方法论选择的重要性。

英文摘要

Semantic association between a word and its context has been identified as an important component of reading comprehension, even when word predictability is accounted for. Recent research has highlighted the potential of language model (LM) embeddings to quantify semantic association. Yet, embedding-based semantic association has been operationalized in a myriad of ways. In this study, we use embeddings from LMs to estimate semantic association on a corpus of joint electroencephalography (EEG) and self-paced reading of natural Dutch texts. Semantic association is calculated in ten different implementations that vary the embedding model and context lengths. The effects of semantic association across the different implementations on the N400 and self-paced reading times are examined using Bayesian hierarchical models and Bayes factors. The results show that the choice of embedding model can alter the estimated effect of semantic association on both the N400 and self-paced reading times. Furthermore, the results demonstrate a promising potential of sentence embeddings for capturing semantic association, as only implementations relying on sentence embeddings indicate reliable effects of semantic association beyond word predictability on both neural and behavioral measures. Together, these findings highlight the importance of methodological choices in quantifying semantic association.

2605.03357 2026-06-09 cs.LG math.OC 版本更新

Population-Aware Imitation Learning in Mean-field Games with Common Noise

平均场博弈中考虑共同噪声的群体感知模仿学习

Grégoire Lambrecht, Mathieu Laurière

发表机构 * Institut National des Sciences et Techniques de l'Information et des Systèmes (INSTI)(信息与系统科学与技术国家研究院)

AI总结 针对含共同噪声的平均场博弈,提出群体感知模仿学习框架,通过行为克隆和对抗散度两种代理,建立有限样本误差界,并利用广义虚拟博弈和深度学习计算专家策略,实验证明群体感知策略对应对随机性的重要性。

详情
AI中文摘要

平均场博弈(MFGs)为建模大量交互智能体的集体行为提供了强大框架。本文研究了含共同噪声的MFG中的模仿学习(IL)问题,其中群体分布随机演化。这种随机性迫使智能体采用群体感知策略以应对总体冲击。我们制定了两个不同的学习目标:恢复纳什均衡和最大化相对于专家群体的性能。我们研究了两种模仿代理:行为克隆(BC)和对抗(ADV)散度。然后,我们建立了有限样本误差界,表明最小化这些代理能有效控制策略的可利用性及其相对于专家的性能差距。此外,我们提出了一个使用广义虚拟博弈和深度学习的数值框架来计算专家群体感知策略。通过在三个环境上的实验,我们证明了标准的群体无感知策略无法捕捉均衡动态。我们的结果强调,学习群体感知策略对于避免被共同噪声固有的随机性误导至关重要。

英文摘要

Mean Field Games (MFGs) provide a powerful framework for modeling the collective behavior of large populations of interacting agents. In this paper, we address the problem of Imitation Learning (IL) in MFGs subject to common noise, where the population distribution evolves stochastically. This stochasticity compels agents to adopt population-aware policies to respond to aggregate shocks. We formulate two distinct learning objectives: recovering a Nash equilibrium and maximizing performance against an expert population. We investigate two imitation proxies: Behavioral Cloning (BC) and Adversarial (ADV) divergence. We then establish finite-sample error bounds showing that minimizing these proxies effectively controls both the policy's exploitability and its performance gap relative to the expert. Furthermore, we propose a numerical framework using generalized Fictitious Play and Deep Learning to compute expert population-aware policies. Through experiments on three environments we demonstrate that standard population-unaware policies fail to capture the equilibrium dynamics. Our results highlight that learning population-aware policies is crucial to avoid being misled by the randomness inherent in common noise.

2605.03229 2026-06-09 cs.CL cs.LG 版本更新

Sparse Memory Finetuning as a Low-Forgetting Alternative to LoRA and Full Finetuning

稀疏记忆微调:作为LoRA和全微调的低遗忘替代方案

Prakhar Gupta, Garv Shah, Satyam Goyal, Anirudh Kanchi

发表机构 * University of Washington(华盛顿大学)

AI总结 提出稀疏记忆微调(SMF),通过添加键值记忆层并仅更新当前批次最活跃的记忆行,在MedMCQA任务上提升2.5个百分点,同时将遗忘探针(WikiText困惑度和TriviaQA准确率)控制在基线的1个百分点内,优于LoRA和全微调。

详情
AI中文摘要

将预训练语言模型适应新任务通常会损害其已有的通用能力,这一问题被称为灾难性遗忘。稀疏记忆微调(SMF)通过向模型添加键值记忆层,并在每个训练步骤中仅更新当前批次读取最频繁的一小组记忆行来避免这种情况。我们在Qwen-2.5-0.5B-Instruct上重新实现了SMF,并将其与LoRA和全微调在MedMCQA(一个4选1的医学考试任务)上进行比较,使用WikiText困惑度和TriviaQA准确率作为遗忘探针。SMF将MedMCQA提升了2.5个百分点,同时将两个遗忘探针保持在基线的约1个百分点内,而LoRA和全微调虽然取得了更大的增益,但在两个探针上都出现了明显的漂移。我们还比较了两种行选择规则(KL散度和TF-IDF),它们在两个遗忘指标上取得了不同的平衡。

英文摘要

Adapting a pretrained language model to a new task often hurts the general capabilities it already had, a problem known as catastrophic forgetting. Sparse Memory Finetuning (SMF) tries to avoid this by adding key-value memory layers to the model and, on each training step, updating only the small set of memory rows that the current batch reads most heavily. We re-implement SMF on Qwen-2.5-0.5B-Instruct and compare it with LoRA and full finetuning on MedMCQA, a 4-choice medical exam task, using WikiText perplexity and TriviaQA accuracy as forgetting probes. SMF improves MedMCQA by 2.5 percentage points while keeping both forgetting probes within roughly 1 point of the base model, whereas LoRA and full finetuning achieve larger gains but with clear drift on both. We also compare two row-selection rules (KL-divergence and TF-IDF), which balance the two forgetting metrics differently.
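
The core mechanic described above, updating only the memory rows the current batch reads most heavily, can be sketched with a TF-IDF-style ranking. The abstract names TF-IDF as one selection rule but does not give its form, so the scoring formula, function names, and toy numbers below are assumptions for illustration only.

```python
import math
from collections import Counter


def select_rows(batch_reads, background_reads, top_t):
    """Pick the memory rows this batch reads unusually often.

    batch_reads: row indices accessed by the current batch.
    background_reads: Counter of how often each row is read on generic data.
    """
    tf = Counter(batch_reads)
    total_bg = sum(background_reads.values()) or 1
    scores = {}
    for row, count in tf.items():
        # Rows that are popular everywhere get a small idf and rank low.
        idf = math.log((total_bg + 1) / (background_reads.get(row, 0) + 1))
        scores[row] = count * idf
    return sorted(scores, key=scores.get, reverse=True)[:top_t]


def sparse_update(memory, grads, rows, lr=0.1):
    """Apply a gradient step to the selected rows only; others stay frozen."""
    for r in rows:
        memory[r] = [w - lr * g for w, g in zip(memory[r], grads[r])]


# Hypothetical access pattern: row 1 is read constantly on generic text.
background = Counter({1: 90, 3: 5, 7: 5})
rows = select_rows([3, 3, 3, 7, 7, 1], background, top_t=2)
memory = {1: [1.0, 1.0], 3: [1.0, 1.0], 7: [1.0, 1.0]}
grads = {1: [1.0, 1.0], 3: [1.0, 1.0], 7: [1.0, 1.0]}
sparse_update(memory, grads, rows)
print(rows, memory)
```

Leaving generically popular rows untouched is the intuition behind the low-forgetting claim: weights that general text depends on receive no task gradient.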

2605.03226 2026-06-09 cs.LG cs.AI cs.CR 版本更新

Self-Mined Hardness for Safety Fine-Tuning

自我挖掘的难度用于安全微调

Prakhar Gupta, Garv Shah, Donghua Zhang

发表机构 * arXiv.org University of California, Berkeley(加州大学伯克利分校)

AI总结 提出通过模型自身生成结果评估提示难度,对最难的提示进行安全微调,在Llama-3模型上将攻击成功率降至1-3%,但增加了拒绝率,通过混合良性提示可平衡性能。

详情
AI中文摘要

语言模型的安全微调通常需要一个精心策划的对抗性数据集。我们采取不同的方法:通过目标模型自身生成结果被判定为有害的频率来评分每个候选提示的难度,然后在最难的提示上使用模型自身的非越狱生成结果进行微调。在Llama-3-8B-Instruct和Llama-3.2-3B-Instruct上,该方法将WildJailbreak攻击成功率从11.5%和20.1%降至1-3%,但将越狱形式良性提示的拒绝率从14-22%提升至74-94%。将相同的困难提示与对抗性框架的良性提示(看起来像越狱但意图良性的提示)以1:1的比例交错,可将8B模型的拒绝率降至30-51%,3B模型降至52-72%,但攻击成功率增加2-6个百分点。在混合模式下,使用合格池中最难的一半而非随机一半进行训练,可将两个模型的剩余ASR降低35-50%(约3个百分点)。

英文摘要

Safety fine-tuning of language models typically requires a curated adversarial dataset. We take a different approach: score each candidate prompt's difficulty by how often the target model's own rollouts are judged harmful, then fine-tune on the hardest prompts paired with the model's own non-jailbroken rollouts. On Llama-3-8B-Instruct and Llama-3.2-3B-Instruct, this approach cuts the WildJailbreak attack success rate from 11.5% and 20.1% down to 1-3%, but pushes refusal on jailbreak-shaped benign prompts from 14-22% to 74-94%. Interleaving the same hard prompts 1:1 with adversarially-framed benign prompts (prompts that look like jailbreaks but have benign intent) cuts that refusal back down to 30-51% on 8B and 52-72% on 3B, at a cost of 2-6 percentage points of attack success rate. Within the mixed regime, training on the hardest half of the eligible pool rather than a random half cuts the remaining ASR by 35-50% (about 3 percentage points) on both models.
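
The mining step described above can be sketched as follows: score each prompt by the fraction of the model's own rollouts judged harmful, keep the hardest share, and pair each kept prompt with one of its own safe rollouts. The eligibility rule (at least one harmful and one safe rollout) and all names are assumptions; the abstract mentions an "eligible pool" without defining it.

```python
def mine_hard_prompts(rollouts, keep_fraction=0.5):
    """rollouts: {prompt: [(response, judged_harmful), ...]}.

    Returns (prompt, safe_response) training pairs for the hardest prompts.
    """
    scored = []
    for prompt, samples in rollouts.items():
        harmful = sum(1 for _, bad in samples if bad)
        safe = [resp for resp, bad in samples if not bad]
        # Assumed eligibility: the model fails at least once, yet it also
        # produced a safe rollout that can serve as the training target.
        if harmful > 0 and safe:
            scored.append((harmful / len(samples), prompt, safe[0]))
    scored.sort(key=lambda item: item[0], reverse=True)  # hardest first
    keep = max(1, int(len(scored) * keep_fraction))
    return [(prompt, resp) for _, prompt, resp in scored[:keep]]


# Hypothetical judged rollouts for four candidate prompts.
rollouts = {
    "p1": [("r1a", True), ("r1b", True), ("r1c", False)],
    "p2": [("r2a", False), ("r2b", False), ("r2c", False)],
    "p3": [("r3a", True), ("r3b", False), ("r3c", False)],
    "p4": [("r4a", True), ("r4b", True), ("r4c", True)],
}
pairs = mine_hard_prompts(rollouts)
print(pairs)
```

In this toy run "p2" is dropped because the model never fails on it, and "p4" is dropped because it offers no safe rollout to train on.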