arXivDaily: arXiv daily academic digest, updated Monday through Friday
2510.01137 2026-06-01 cs.LG

Re-examining Low Rank adaptation for private LLM fine-tuning

Ali Dadsetan, Frank Rudzicz

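AI summary: Studies how the noise in differentially private SGD inflates gradient singular values, and proposes partially restoring the original singular-value profile to improve the sample efficiency of DP-SGD.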

Abstract

Privacy is a central concern when fine-tuning large language models (LLMs) on sensitive data, and differentially private stochastic gradient descent (DP-SGD) -- which clips per-sample gradients and adds calibrated Gaussian noise -- is the standard tool for formal privacy guarantees. Both theory and practice show that lower-rank models are better suited to DP training, a property especially relevant for LLMs, whose fine-tuning gradients exhibit a strong low-rank structure. Methods such as DP-LoRA exploit this by restricting updates to a low-rank subspace, i.e., retaining only a few non-zero components in the SVD of each layer's gradient. However, we argue that while having few non-zero components is important, the isotropic noise injected by DP-SGD inflates the singular values of the gradient matrix, disrupting their naturally fast decay. In this work, we investigate whether this noise-induced eigenvalue blow-up reduces performance, and show that partially restoring the original singular-value profile significantly improves the sample efficiency of DP-SGD. Experiments on language classification (GLUE benchmark with RoBERTa) and text generation (E2E and DART table-to-text benchmarks with Qwen and Llama models up to 4B parameters) showcase that restoring the fast decay of singular values is a viable strategy for speeding up the DP optimization process, without compromising privacy guarantees.
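The clipping and noising step described above, and the way isotropic noise flattens a fast-decaying singular-value profile, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the synthetic rank-2 gradients, the layer shape, the clipping norm, and the function name `dp_sgd_noisy_gradient` are hypothetical, and the sketch does not implement the restoration method the paper proposes.

```python
import numpy as np

def dp_sgd_noisy_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    batch = per_sample_grads.shape[0]
    flat = per_sample_grads.reshape(batch, -1)
    norms = np.linalg.norm(flat, axis=1)
    # Scale each sample so its Frobenius norm is at most clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale[:, None, None]
    # Isotropic Gaussian noise with standard deviation noise_multiplier * clip_norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_sample_grads.shape[1:])
    return (clipped.sum(axis=0) + noise) / batch

rng = np.random.default_rng(0)
batch, rows, cols, rank = 32, 64, 64, 2

# Synthetic per-sample gradients sharing one rank-2 subspace (hypothetical data).
U = rng.normal(size=(rows, rank))
V = rng.normal(size=(rank, cols))
coeffs = rng.normal(size=(batch, rank))
grads = np.einsum("ik,bk,kj->bij", U, coeffs, V)

clean = dp_sgd_noisy_gradient(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
noisy = dp_sgd_noisy_gradient(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng)

s_clean = np.linalg.svd(clean, compute_uv=False)
s_noisy = np.linalg.svd(noisy, compute_uv=False)

# Beyond the true rank, the clean spectrum is numerically zero,
# while the noisy spectrum keeps a heavy tail.
print("clean tail singular value:", s_clean[rank])
print("noisy tail singular value:", s_noisy[rank])
```

In this toy setting the clipped, noise-free gradient has only two non-negligible singular values, whereas the noised gradient has a long tail of comparable magnitude, which is the inflation effect the abstract refers to.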

2601.05134 2026-06-01 cs.LG

Sequential Subspace Noise Injection Prevents Accuracy Collapse in Certified Unlearning

Polina Dolgova, Sebastian U. Stich

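AI summary: Proposes sequential subspace noise scheduling, which distributes the noise budget across orthogonal subspaces of the parameter space and substantially improves post-unlearning accuracy while preserving the differential-privacy certification guarantees.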

Abstract

Certified unlearning based on differential privacy offers strong guarantees but remains largely impractical: the noisy fine-tuning approaches proposed so far achieve these guarantees but severely reduce model accuracy. We propose sequential noise scheduling, which distributes the noise budget across orthogonal subspaces of the parameter space, rather than injecting it all at once. This simple modification mitigates the destructive effect of noise while preserving the original certification guarantees. We extend the analysis of noisy fine-tuning to the subspace setting, proving that the same $(\varepsilon,\delta)$ privacy budget is retained. Empirical results on image classification benchmarks show that our approach substantially improves accuracy after unlearning while remaining robust to membership inference attacks. These results show that certified unlearning can achieve both rigorous guarantees and practical utility.

2601.01456 2026-06-01 cs.CV cs.AI cs.LG

Rethinking Multimodal Few-Shot 3D Point Cloud Segmentation: From Fused Refinement to Decoupled Arbitration

Wentao Bian, Fenglei Xu

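AI summary: To address the "Plasticity-Stability Dilemma" of the "Fuse-then-Refine" paradigm and CLIP's semantic blindness in multimodal few-shot 3D point cloud segmentation, proposes the Decoupled-experts Arbitration Few-Shot SegNet (DA-FSS), which decouples the semantic and geometric paths and mutually regularizes their gradients for better generalization.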

Comments Accepted to IJCAI-ECAI 2026 (Main Track). 9 pages, 3 figures, 3 tables

Abstract

In this paper, we revisit multimodal few-shot 3D point cloud semantic segmentation (FS-PCS), identifying a conflict in "Fuse-then-Refine" paradigms: the "Plasticity-Stability Dilemma." In addition, CLIP's inter-class confusion can result in semantic blindness. To address these issues, we present the Decoupled-experts Arbitration Few-Shot SegNet (DA-FSS), a model that effectively distinguishes between semantic and geometric paths and mutually regularizes their gradients to achieve better generalization. DA-FSS employs the same backbone and pre-trained text encoder as MM-FSS to generate text embeddings, which can increase free modalities' utilization rate and better leverage each modality's information space. To achieve this, we propose a Parallel Expert Refinement module to generate each modal correlation. We also propose a Stacked Arbitration Module (SAM) to perform convolutional fusion and arbitrate correlations for each modality pathway. The Parallel Experts decouple two paths: a Geometric Expert maintains plasticity, and a Semantic Expert ensures stability. They are coordinated via a Decoupled Alignment Module (DAM) that transfers knowledge without propagating confusion. Experiments on popular datasets (S3DIS, ScanNet) demonstrate the superiority of DA-FSS over MM-FSS. Meanwhile, geometric boundaries, completeness, and texture differentiation are all superior to the baseline. The code is available at: https://github.com/MoWenQAQ/DA-FSS/.

2601.01075 2026-06-01 cs.LG cs.AI cs.CV

Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments

Hansen Jin Lillemark, Benhao Huang, Fangneng Zhan, Yilun Du, Thomas Anderson Keller

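AI summary: Proposes the Flow Equivariant World Modeling framework, which exploits time-parameterized symmetries within a latent memory for stable, accurate long-horizon dynamics prediction under partial observability.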

Comments Accepted at ICML 2026

Abstract

Embodied systems experience the world as 'a symphony of flows': a combination of many continuous streams of sensory input coupled to self-motion, interwoven with the dynamics of external objects. These sensory streams and the underlying dynamics of the world obey smooth, time-parameterized symmetries which existing world models ignore. Without a memory that respects this structure, partial observability presents a major obstacle to existing methods: each observation reveals only a fraction of the world, while unobserved regions continue to evolve. In this work, we introduce Flow Equivariant World Modeling, a framework that leverages time-parameterized symmetries within a latent memory for stable and accurate dynamics prediction over long horizons. The latent memory shifts and transforms equivariantly with self-motion and inferred external object motion, keeping information about out-of-view regions aligned as time progresses. We demonstrate the advantage of this framework over state-of-the-art diffusion, memory-augmented, and recurrent world model architectures on 2D and 3D partially observed video world modeling benchmarks. More broadly, our results suggest that predictive representations become more powerful when they are organized in line with the temporal and dynamical structure of the world they model. Project page: https://flowequivariantworldmodels.github.io/

2512.23626 2026-06-01 cs.AI cs.LG

Regret-Based Federated Causal Discovery with Unknown Interventions

Federico Baldo, Charles K. Assaad

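AI summary: Proposes I-PERI, an algorithm for federated causal discovery under unknown client-level interventions. It recovers the CPDAG of the union of client graphs and orients additional edges using structural differences induced by interventions across clients, yielding a tighter Φ-Markov equivalence class.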

Comments ICML 2026

Abstract

Most causal discovery methods recover a completed partially directed acyclic graph representing a Markov equivalence class from observational data. Recent work has extended these methods to federated settings to address data decentralization and privacy constraints, but often under idealized assumptions that all clients share the same causal model. Such assumptions are unrealistic in practice, as client-specific policies or protocols, for example, across hospitals, naturally induce heterogeneous and unknown interventions. In this work, we address federated causal discovery under unknown client-level interventions. We propose I-PERI, a novel federated algorithm that first recovers the CPDAG of the union of client graphs and then orients additional edges by exploiting structural differences induced by interventions across clients. This yields a tighter equivalence class, which we call the $\mathbf{\Phi}$-Markov Equivalence Class, represented by the $\mathbf{\Phi}$-CPDAG. We provide theoretical guarantees on the convergence of I-PERI, as well as on its privacy-preserving properties, and present empirical evaluations on synthetic data demonstrating the effectiveness of the proposed algorithm.

2507.12453 2026-06-01 cs.LG

Cost-aware Stopping for Bayesian Optimization

Qian Xie, Linda Cai, Alexander Terenin, Peter I. Frazier, Ziv Scully

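AI summary: Addresses cost-aware stopping in Bayesian optimization with a stopping rule grounded in a theoretical connection to cost-aware acquisition functions, and proves a bound on the expected cost-adjusted simple regret.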

Comments Accepted by ICML 2026

Abstract

In automated machine learning, scientific discovery, and other applications of Bayesian optimization, deciding when to stop evaluating expensive black-box functions in a cost-aware manner is an important but underexplored practical consideration. A natural performance metric for this purpose is the cost-adjusted simple regret, which explicitly captures the trade-off between solution quality and cumulative evaluation cost. Existing stopping rules for Bayesian optimization are either heuristic, or are theoretically grounded but designed to optimize simple regret without accounting for evaluation costs; as a result, they provide no guarantees against unnecessary evaluations when costs are high. We propose a principled cost-aware stopping rule for Bayesian optimization that adapts to varying evaluation costs without heuristic tuning. Our rule is grounded in a theoretical connection to state-of-the-art cost-aware acquisition functions, namely the Pandora's Box Gittins Index (PBGI) and log expected improvement per cost (LogEIPC). When paired with either acquisition function, we prove that the resulting policy satisfies a theoretical guarantee bounding the expected cost-adjusted simple regret. Across synthetic tasks and empirical benchmarks including hyperparameter optimization and neural architecture size search, pairing our stopping rule with PBGI or LogEIPC usually matches or outperforms other acquisition-function--stopping-rule pairs in terms of cost-adjusted simple regret.
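To make the metric concrete, the sketch below computes a cost-adjusted simple regret along one optimization run. It assumes a minimization problem and assumes the metric is the simple regret (best value found minus the true optimum) plus the cumulative evaluation cost; that reading of the abstract, the function name, and the toy numbers are all hypothetical, and the paper's stopping rule itself is not implemented here.

```python
def cost_adjusted_simple_regret(values, costs, optimum):
    """Return the cost-adjusted simple regret after each evaluation.

    Assumed definition (minimization): (best value so far - optimum)
    plus the total cost paid so far.
    """
    regrets = []
    best = float("inf")
    spent = 0.0
    for value, cost in zip(values, costs):
        best = min(best, value)
        spent += cost
        regrets.append((best - optimum) + spent)
    return regrets

# Hypothetical run: objective values observed and the cost of each evaluation.
values = [5.0, 3.0, 2.5, 2.4, 2.4]
costs = [0.2, 0.2, 0.2, 0.2, 0.2]

regrets = cost_adjusted_simple_regret(values, costs, optimum=2.0)

# Hindsight-optimal stopping time: the number of evaluations that minimizes the metric.
best_stop = min(range(len(regrets)), key=regrets.__getitem__) + 1
print(regrets, best_stop)
```

In this toy run the metric falls while improvements outweigh the per-evaluation cost and rises afterwards, so stopping after the third evaluation is best in hindsight. A practical stopping rule has to approximate that point without knowing the optimum.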

2512.20732 2026-06-01 cs.LG cs.AI cs.SE

FEM-Bench: A Structured Scientific Reasoning Benchmark for Evaluating Code-Generating LLMs

Saeed Mohammadzadeh, Erfan Hamdi, Joel Shor, Emma Lejeune

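AI summary: Introduces the FEM-Bench benchmark, which evaluates the structured reasoning of LLMs in scientific computing through finite element method programming tasks; experiments show that current models cannot yet reliably solve all tasks.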

Comments 45 pages, 5 figures, 9 tables, 7 listings

Abstract

As LLMs advance their reasoning capabilities about the physical world, the absence of rigorous benchmarks for evaluating their ability to generate scientifically valid physical models has become a critical gap. Computational mechanics, which develops and applies mathematical models and numerical methods to predict the behavior of physical systems under forces, deformation, and constraints, provides an ideal foundation for structured scientific reasoning evaluation. Problems follow clear mathematical structure, enforce strict physical and numerical constraints, and support objective verification. The discipline requires constructing explicit models of physical systems and reasoning about geometry, spatial relationships, and material behavior, connecting directly to emerging AI goals in physical reasoning and world modeling. We introduce FEM-Bench, a computational mechanics benchmark designed to evaluate the ability of LLMs to generate correct finite element method (FEM) and related code. FEM-Bench 2025 contains a suite of introductory but nontrivial tasks aligned with material from a first graduate course on computational mechanics. These tasks capture essential numerical and physical modeling challenges while representing only a small fraction of the complexity present in the discipline. Despite their simplicity, state-of-the-art LLMs do not reliably solve all of them. In a five attempt run, the best performing model at function writing, Gemini 3 Pro, completed 30/33 tasks at least once and 26/33 tasks all five times. The best performing model at unit test writing, GPT-5, had an Average Joint Success Rate of 73.8%. Other popular models showed broad performance variation. FEM-Bench establishes a structured foundation for evaluating AI-generated scientific code, and future iterations will incorporate increasingly sophisticated tasks to track progress as models evolve.

2512.11571 2026-06-01 cs.RO

Cross-Entropy Optimization of Physically Grounded Task and Motion Plans

Andreu Matoses Gimenez, Nils Wilde, Chris Pek, Javier Alonso-Mora

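AI summary: Proposes using a GPU-parallelized physics simulator and cross-entropy optimization to sample controller parameters and obtain low-cost solutions, addressing the neglect of dynamics and contacts in traditional TAMP algorithms.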

Comments Accepted for publication in IEEE Robotics and Automation Letters (RA-L)

Journal ref IEEE Robotics and Automation Letters, 2026

Abstract

Autonomously performing tasks often requires robots to plan high-level discrete actions and continuous low-level motions to realize them. Previous TAMP algorithms have focused mainly on computational performance, completeness, or optimality by making the problem tractable through simplifications and abstractions. However, this comes at the cost of the resulting plans potentially failing to account for the dynamics or complex contacts necessary to reliably perform the task when object manipulation is required. Additionally, approaches that ignore effects of the low-level controllers may not obtain optimal or feasible plan realizations for the real system. We investigate the use of a GPU-parallelized physics simulator to compute realizations of plans with motion controllers, explicitly accounting for dynamics, and considering contacts with the environment. Using cross-entropy optimization, we sample the parameters of the controllers, or actions, to obtain low-cost solutions. Since our approach uses the same controllers as the real system, the robot can directly execute the computed plans. We demonstrate our approach for a set of tasks where the robot is able to exploit the environment's geometry to move an object. Website and code: https://andreumatoses.github.io/research/parallel-realization
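The cross-entropy optimization loop mentioned above can be sketched generically: sample parameters from a Gaussian, keep the lowest-cost samples, and refit the Gaussian to them. The version below is a minimal standard-library sketch under stated assumptions. The quadratic `toy_cost` stands in for a simulator rollout cost, and the population size, elite count, and initial spread are hypothetical choices, not values from the paper.

```python
import random
import statistics

def cross_entropy_optimize(cost, dim, iterations=30, samples=200, elites=20, seed=0):
    """Minimize cost over a dim-dimensional parameter vector with the cross-entropy method."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [5.0] * dim
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        population = [
            [rng.gauss(mean[d], std[d]) for d in range(dim)]
            for _ in range(samples)
        ]
        # Keep the lowest-cost candidates (the elite set).
        population.sort(key=cost)
        elite = population[:elites]
        # Refit the sampling distribution to the elite set.
        for d in range(dim):
            column = [candidate[d] for candidate in elite]
            mean[d] = statistics.fmean(column)
            std[d] = statistics.pstdev(column) + 1e-6  # keep the spread positive
    return mean

def toy_cost(params):
    """Hypothetical stand-in for a simulated rollout cost; minimum at (3, -1)."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2

best = cross_entropy_optimize(toy_cost, dim=2)
print(best)
```

In the setting the abstract describes, each cost evaluation would be a physics-simulator rollout of a controller with the sampled parameters, which is why evaluating the whole population in parallel on a GPU is attractive.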

2512.11561 2026-06-01 cs.LG

View Space: Learning Representation across Arbitrary Graphs

Dooho Lee, Myeong Kong, Minho Jeong, Jaemin Yoo

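AI summary: Introduces the view space concept and uses Graph View Transformation (GVT) for inductive node representation learning across arbitrary graphs, substantially outperforming existing methods on node classification.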

Comments Accepted to ICML 2026

Abstract

Generalizing pretrained models to unseen datasets without retraining is a central challenge toward foundation models. Achieving fully inductive inference on numerical data is particularly difficult due to large variations in feature dimensionality and semantics across datasets. We observe that, in the presence of graph structure, numerical data admits a distinct structure-induced representational axis beyond the feature space, which we formalize as the view space. This view space enables a unified representation of graphs with heterogeneous features and motivates Graph View Transformation (GVT), a class of parametric mappings that can be shared across arbitrary graphs. We instantiate this framework with Recurrent GVT, an architecture for fully inductive node representation learning in node classification. Pretrained on OGBN-Arxiv and evaluated on 27 benchmarks, Recurrent GVT outperforms GraphAny, the prior fully inductive graph model, by +8.93%, and surpasses 12 individually tuned GNNs by at least +3.30%. These results establish the view space as a principled and practical foundation for learning across graphs with heterogeneous feature spaces. Code and checkpoints are available at https://github.com/dooho00/graph-view-space.

2512.05038 2026-06-01 cs.LG

The SuperActivator Mechanism: Transformers Concentrate Reliable Concept Signals in the Tail

Cassandra Goldberg, Chaehyeon Kim, Adam Stein, Eric Wong

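AI summary: Identifies the SuperActivator Mechanism in transformers, which amplifies concept activation gaps and concentrates the most reliable concept evidence in a small set of high-activation tokens; a detection method built on it improves F1 by up to 0.14 across image and text modalities.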

Abstract

Concept vectors aim to enhance model interpretability by linking internal representations with human-understandable semantics, but their practical utility is often limited by noisy and inconsistent activations. In this work, we uncover the SuperActivator Mechanism: a transformer dynamic that amplifies concept activation gaps, concentrating the most reliable concept evidence into a small set of high-activation tokens. To develop a theoretical understanding of this mechanism, we prove that concept-aligned attention heads multiplicatively amplify pairwise activation gaps, with already-extreme activations growing fastest. We find that this amplification is not just theoretical, but also occurs empirically on large-scale models: while in- and out-of-concept activation distributions overlap considerably, the in-concept distribution develops a positive tail clearly separated from the noise. These high-tail tokens, which we call SuperActivators, appear consistently across concept-positive samples, making them reliable indicators of concept presence. Accordingly, SuperActivator-based detection improves F1 by up to 0.14 over standard concept activation aggregators and prompting baselines across image and text modalities, models, layers, and concept extraction techniques, demonstrating the generality and practicality of our insights. Further empirical analysis demonstrates that the most reliable SuperActivators are sparse, with detection typically peaking when using only 5-10% of in-concept token activations, and capture more faithful localized semantics than global concept vectors.
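The contrast between aggregating over all tokens and reading only the high-activation tail can be shown with a small sketch. It is an illustration under assumptions, not the paper's detector: the scoring rule (mean of the top fraction of token activations), the 10% fraction, and the two activation lists are hypothetical.

```python
def tail_score(token_activations, fraction=0.1):
    """Average the top `fraction` of token activations (at least one token)."""
    k = max(1, int(len(token_activations) * fraction))
    top = sorted(token_activations, reverse=True)[:k]
    return sum(top) / k

def mean_score(token_activations):
    """Average over all tokens, a simple global aggregator."""
    return sum(token_activations) / len(token_activations)

# Hypothetical per-token concept activations for two 20-token samples.
# The concept-positive sample differs only in two high-activation tokens.
positive = [0.1, -0.2, 0.0, 0.3, 4.0, -0.1, 0.2, 0.1, -0.3, 3.5,
            0.0, 0.1, -0.1, 0.2, 0.0, -0.2, 0.1, 0.3, -0.1, 0.1]
negative = [0.1, -0.2, 0.0, 0.3, 0.4, -0.1, 0.2, 0.1, -0.3, 0.5,
            0.0, 0.1, -0.1, 0.2, 0.0, -0.2, 0.1, 0.3, -0.1, 0.1]

tail_gap = tail_score(positive) - tail_score(negative)
mean_gap = mean_score(positive) - mean_score(negative)
print(tail_gap, mean_gap)
```

Because most token activations overlap between the two samples, the global mean separates them only weakly, while the tail score isolates the few tokens that carry the signal. That is the intuition behind detecting concepts from a sparse set of high-activation tokens.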

2509.24901 2026-06-01 cs.SD cs.LG

Unmute the Patch Tokens: Rethinking Probing in Multi-Label Audio Classification

Lukas Rauch, René Heinrich, Houtan Ghaffari, Lukas Miklautz, Ilyass Moummad, Bernhard Sick, Christoph Scholz

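AI summary: To address the poor linear-probing performance of self-supervised audio models, proposes binarized prototypical probes, which learn prototypes for class-wise information aggregation. They outperform linear and attentive probing across 13 datasets and establish probing as a viable, efficient evaluation paradigm.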

Comments Accepted @ ICLR26

Abstract

Although probing frozen models has become a standard evaluation paradigm, self-supervised learning in audio defaults to fine-tuning when pursuing state-of-the-art performance on AudioSet. A key reason is that global pooling creates an information bottleneck, causing linear probes to misrepresent the embedding quality: the $\texttt{cls}$-token discards crucial token information about dispersed, localized events in audio. This weakness is rooted in the mismatch between the pretraining objective (global) and the downstream task (localized). Across a comprehensive benchmark of 13 datasets and 6 spectrogram-based encoders, we investigate the global pooling bottleneck. We introduce binarized prototypical probes: a lightweight and simple pooling method that learns prototypes to perform class-wise information aggregation. Despite its simplicity, our method notably outperforms linear and attentive probing. Our work establishes probing as a competitive and efficient paradigm for evaluating audio SSL models, challenging the reliance on costly fine-tuning.

2512.02743 2026-06-01 cs.CV cs.AI

Reasoning-Aware Multimodal Fusion for Hateful Video Detection

Shuonan Yang, Tailin Chen, Jiangbei Yue, Guangliang Cheng, Jianbo Jiao, Zeyu Fu

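AI summary: Proposes a Reasoning-Aware Multimodal Fusion framework that combines Local-Global Context Fusion and Semantic Cross Attention for multimodal interaction with adversarial reasoning that generates complementary semantic perspectives, improving Macro-F1 and hate-class recall by 3% and 7%, respectively, in hateful video detection.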

Comments Accepted at Transactions on Machine Learning Research (TMLR)

Abstract

Hate speech in online videos is posing an increasingly serious threat to digital platforms, especially as video content becomes increasingly multimodal and context-dependent. Existing methods often struggle to effectively fuse the complex semantic relationships between modalities and lack the ability to understand nuanced hateful content. To address these issues, we propose an innovative Reasoning-Aware Multimodal Fusion (RAMF) framework. To tackle the first challenge, we design Local-Global Context Fusion (LGCF) to capture both local salient cues and global temporal structures, and propose Semantic Cross Attention (SCA) to enable fine-grained multimodal semantic interaction. To tackle the second challenge, we introduce adversarial reasoning, a structured three-stage process in which a vision-language model generates (i) objective descriptions, (ii) hate-assumed inferences, and (iii) non-hate-assumed inferences, providing complementary semantic perspectives that enrich the model's contextual understanding of nuanced hateful intent. Evaluations on two real-world hateful video datasets demonstrate that our method achieves robust generalisation performance, improving upon state-of-the-art methods by 3% and 7% in Macro-F1 and hate class recall, respectively. The source code and data required to reproduce our results are available at https://github.com/Multimodal-Intelligence-Lab-MIL/RAMF.

2505.18069 2026-06-01 cs.LG eess.SP

Ubiquity of Emergent Hebbian Dynamics in Regularized Learning

David Koplow, Tomaso Poggio, Liu Ziyin

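AI summary: Finds that, near stationarity, L2 weight decay generically drives the learning signal to align with a Hebbian direction, and that stochastic noise can induce anti-Hebbian alignment, motivating experiments that distinguish genuine Hebbian computation from emergent Hebbian signatures.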

Comments ICML 2026 Camera Ready

Abstract

Hebbian and anti-Hebbian plasticity are widely observed in the brain and are classically modeled as mechanistic, local homosynaptic rules stabilized by homeostatic constraints. This raises an identifiability question: does observing Hebbian/anti-Hebbian structure in synaptic updates uniquely imply an underlying Hebbian computation? We identify an alternative, emergent route. We show that near stationarity, L2 weight decay generically drives the \emph{learning-signal} component of many update rules to align with a Hebbian direction, with alignment increasing monotonically with decay strength. This Hebbian-like signature is not specific to SGD and can arise even for non-learning or random update rules long before learning has ceased. We further show that stochastic noise in the learning signal can induce anti-Hebbian alignment, yielding a simple tradeoff with weight decay and a phase boundary in regression settings. These mechanisms do not replace standard Hebbian theory; they can coexist with genuine Hebbian plasticity and complicate the interpretation of synaptic measurements, motivating experiments that distinguish mechanistic Hebbian computation from emergent Hebbian signatures.

2512.00349 2026-06-01 cs.AI

Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models

Sitong Fang, Shiyi Hou, Kaile Wang, Boyuan Chen, Donghai Hong, Jiayi Zhou, Josef Dai, Yaodong Yang, Jiaming Ji

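AI summary: Introduces the MM-DeceptionBench benchmark and a multi-agent monitoring framework based on debate with images, systematically revealing and quantifying deception risks in multimodal large language models and improving the detection of deceptive behaviors.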

Comments 39 pages, 16 figures, camera ready version for ICML 2026

Abstract

Are frontier AI systems becoming more capable? Certainly. Yet such progress is not an unalloyed blessing but rather a Trojan horse: behind their performance leaps lie more insidious and destructive safety risks, namely deception. Unlike hallucination, which arises from insufficient capability and leads to mistakes, deception represents a deeper threat in which models deliberately mislead users through complex reasoning and insincere responses. As system capabilities advance, deceptive behaviours have spread from textual to multimodal settings, amplifying their potential harm. First and foremost, how can we monitor these covert multimodal deceptive behaviors? Nevertheless, current research remains almost entirely confined to text, leaving the deceptive risks of multimodal large language models unexplored. In this work, we systematically reveal and quantify multimodal deception risks, introducing MM-DeceptionBench, the first benchmark explicitly designed to evaluate multimodal deception. Covering six categories of deception, MM-DeceptionBench characterizes how models strategically manipulate and mislead through combined visual and textual modalities. On the other hand, multimodal deception evaluation is almost a blind spot in existing methods. Its stealth, compounded by visual-semantic ambiguity and the complexity of cross-modal reasoning, renders action monitoring and chain-of-thought monitoring largely ineffective. To tackle this challenge, we propose debate with images, a novel multi-agent debate monitor framework. By compelling models to ground their claims in visual evidence, this method substantially improves the detectability of deceptive strategies. Experiments show that it consistently increases agreement with human judgements across all tested models, boosting Cohen's kappa by 1.5x and accuracy by 1.25x on GPT-4o.
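Agreement with human judgements is reported above as Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below computes it from two label lists with the standard formula; the label lists are hypothetical and are not data from the paper.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical judgements: 1 = deceptive, 0 = not deceptive.
human = [1, 1, 0, 0, 1, 0, 1, 0]
monitor = [1, 0, 0, 0, 1, 0, 1, 1]

kappa = cohens_kappa(human, monitor)
print(kappa)  # 0.5: observed agreement 0.75, chance agreement 0.5
```

Raw accuracy against the human labels here is 0.75, while kappa is 0.5, which shows why the two numbers in the abstract can improve by different factors.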

2510.15859 2026-06-01 cs.CL cs.AI

InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training

Pengkai Wang, Pengwei Liu, Qi Zuo, Zhijie Sang, Congkai Xie, Hongxia Yang

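AI summary: Proposes the ORBIT framework, which uses dynamically generated case-conditioned rubrics to guide incremental reinforcement learning. With only 2k samples, it raises Qwen3-4B-Instruct's HealthBench-Hard score from 7.0 to 27.5, the best among open-source models of similar size.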

Abstract

Reinforcement learning (RL) has powered many recent breakthroughs in large language models (LLMs), especially for tasks where rewards can be computed automatically, such as code generation. However, it is less effective in open-ended medical dialogue, where feedback is ambiguous, context-dependent, and difficult to summarize into a single scalar signal, often requiring heavily supervised reward models and creating risks of reward hacking. Thus, we introduce ORBIT, an open-ended rubric-based incremental training framework tailored for critical medical dialogues. ORBIT integrates medical dialogue construction with dynamically generated case-conditioned rubrics that serve as adaptive guides for incremental RL. Unlike approaches that rely on external medical knowledge bases or handcrafted rules, ORBIT uses rubric-guided evaluation and can be implemented with general-purpose instruction-following LLMs, avoiding task-specific judge fine-tuning. With only 2k training samples, ORBIT raises Qwen3-4B-Instruct's HealthBench-Hard score from 7.0 to 27.5, achieving state-of-the-art performance among similarly sized open-source models while maintaining strong consultation quality as rubric coverage broadens.

2509.21379 2026-06-01 cs.CV cs.AI

SAEmnesia: Erasing Concepts in Diffusion Models with Supervised Sparse Autoencoders

SAEmnesia:基于监督稀疏自编码器的扩散模型概念擦除

Enrico Cassano, Riccardo Renzulli, Marco Nurisso, Mirko Zaffaroni, Alan Perotti, Marco Grangetto

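AI summary: Proposes SAEmnesia, a supervised sparse autoencoder framework that enforces one-to-one concept-neuron mappings to centralize features, enabling efficient and precise concept erasure in diffusion models.

Comments Accepted at ICML 2026

AI Chinese abstract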

扩散模型中的概念遗忘受到特征分裂的阻碍,即概念分布在许多潜在特征上,使得移除它们具有挑战性且计算成本高。我们引入了SAEmnesia,一种监督稀疏自编码器框架,通过强制一对一的概念-神经元映射来克服这一问题。通过在训练过程中系统地标记概念,我们的方法实现了特征集中化,将每个概念绑定到一个可解释的神经元上。这使得概念擦除高度精准且高效。与最先进的基于稀疏自编码器的遗忘方法相比,SAEmnesia将超参数搜索减少了96.67%,并在UnlearnCanvas对象基准上实现了9.22%的提升。我们的方法在顺序遗忘中也表现出卓越的可扩展性,在移除九个对象时准确率提高了28.4%,为精确可控的概念擦除迈出了一步。此外,SAEmnesia在I2P基准上有效抑制了裸体内容,并对对抗攻击保持鲁棒性。源代码可在https://github.com/EIDOSLAB/SAEmnesia获取。

English abstract

Concept unlearning in diffusion models is hampered by feature splitting, where concepts are distributed across many latent features, making their removal challenging and computationally expensive. We introduce SAEmnesia, a supervised sparse autoencoder framework that overcomes this by enforcing one-to-one concept-neuron mappings. By systematically labeling concepts during training, our method achieves feature centralization, binding each concept to a single, interpretable neuron. This enables highly targeted and efficient concept erasure. Compared to the state-of-the-art sparse autoencoder-based unlearning approach, SAEmnesia reduces hyperparameter search by 96.67% and achieves a 9.22% improvement on the UnlearnCanvas benchmark for objects. Our method also shows superior scalability in sequential unlearning, improving accuracy by 28.4% when removing nine objects, establishing a step forward for precise and controllable concept erasure. Moreover, SAEmnesia effectively suppresses nudity on the I2P benchmark and remains robust to adversarial attacks. Source code available at https://github.com/EIDOSLAB/SAEmnesia.

2511.21513 2026-06-01 cs.LG

IntAttention: A Fully Integer Attention Pipeline for Efficient Edge Inference

IntAttention: 面向高效边缘推理的全整数注意力流水线

Wanli Zhong, Haibo Feng, Zirui Zhou, Hanyang Peng, Shiqi Yu

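AI summary: To address the datatype-conversion bottleneck caused by the softmax path when deploying Transformers on edge devices, proposes IntAttention, a fully integer attention pipeline that combines the IndexSoftmax operator, sparsity-aware clipping, a 32-entry lookup-table approximation, and direct integer normalization to eliminate datatype conversion overhead, achieving up to 3.7x speedup and 61% energy reduction on Armv8 CPUs.

AI Chinese abstract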

在边缘设备上部署Transformer模型受到延迟和能量预算的限制。虽然INT8量化有效加速了主要的矩阵乘法,但它将softmax相关路径暴露为主要瓶颈。该阶段需要进行昂贵的反量化->softmax->再量化绕行,这可以占到总注意力延迟的65%,并破坏了边缘硬件效率至关重要的端到端整数数据流。为了解决这一限制,我们提出了IntAttention,这是第一个全整数注意力流水线,可作为无需训练的即插即用替代方案。我们方法的核心是IndexSoftmax,一种在整数域内完全替代浮点指数运算的硬件友好算子。IntAttention集成了稀疏感知裁剪、32项查找表近似和直接整数归一化,从而消除了注意力路径上的数据类型转换开销。在Armv8 CPU上的实验表明,与FP16基线相比,我们的方法实现了高达3.7倍的加速和61%的能耗降低,与传统的INT8注意力流水线相比,加速高达2.0倍。在多种语言和视觉模型以及额外的推理和长上下文评估中,IntAttention保持了强大的整体保真度,并展示了比现有基于LUT的softmax近似更有利的权衡。代码可在https://github.com/WanliZhong/IntAttention获取。

English abstract

Deploying Transformer models on edge devices is limited by latency and energy budgets. While INT8 quantization effectively accelerates the primary matrix multiplications, it exposes the softmax-related path as the dominant bottleneck. This stage incurs a costly dequantize -> softmax -> requantize detour, which can account for up to 65% of total attention latency and disrupts the end-to-end integer dataflow critical for edge hardware efficiency. To address this limitation, we present IntAttention, the first fully integer attention pipeline that serves as a training-free drop-in replacement. At the core of our approach lies IndexSoftmax, a hardware-friendly operator that replaces floating-point exponentials entirely within the integer domain. IntAttention integrates sparsity-aware clipping, a 32-entry lookup table approximation, and direct integer normalization, thereby eliminating datatype conversion overhead along the attention path. Experiments on Armv8 CPUs show that our method achieves up to 3.7x speedup and 61% energy reduction over FP16 baselines, and up to 2.0x speedup over conventional INT8 attention pipelines. Across diverse language and vision models, as well as additional reasoning and long-context evaluations, IntAttention maintains strong overall fidelity and demonstrates a more favorable trade-off than existing LUT-based softmax approximations. Code is available at https://github.com/WanliZhong/IntAttention
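
To make the lookup-table idea in the abstract above concrete, here is a minimal sketch of an integer-only softmax driven by a 32-entry table. It is not the authors' IndexSoftmax: the clipping range `CLIP`, the 8-bit table scale `LUT_SCALE`, the parameter `clip_q`, and the rounding choices are all assumptions made for illustration. Only the table size (32 entries) comes from the abstract. Floats are used once, offline, to build the table; the runtime path uses integer arithmetic only.

```python
import math

LUT_SIZE = 32    # table length, matching the 32-entry table named in the abstract
LUT_SCALE = 255  # assumed: table values stored as unsigned 8-bit integers
CLIP = 8.0       # assumed clipping range in real units; exp(-8) is about 3e-4


def build_lut():
    """Offline step (floats allowed): entry i approximates exp(-i * CLIP / 31)."""
    step = CLIP / (LUT_SIZE - 1)
    return [round(LUT_SCALE * math.exp(-i * step)) for i in range(LUT_SIZE)]


def int_softmax(scores, clip_q, lut):
    """Integer-only softmax over quantized scores.

    scores: list of ints (quantized attention logits).
    clip_q: CLIP expressed in quantized units (hypothetical parameter),
            i.e. round(CLIP / score_scale), computed offline.
    Returns ints in [0, 255] that approximate 255 * softmax(scores).
    """
    top = max(scores)
    exps = []
    for s in scores:
        # Distance below the row maximum, clipped so tiny terms share one entry.
        d = min(top - s, clip_q)
        # Map the clipped distance to a table index with integer rounding.
        idx = (d * (LUT_SIZE - 1) + clip_q // 2) // clip_q
        exps.append(lut[idx])
    total = sum(exps)  # at least lut[0] = 255, so never zero
    # Normalize with a rounded integer division instead of a float divide.
    return [(e * LUT_SCALE + total // 2) // total for e in exps]
```

With these assumptions, `int_softmax([10, 10], 64, build_lut())` returns `[128, 128]`: both entries hit table index 0 and the rounded division splits the 8-bit range evenly.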

2511.19923 2026-06-01 cs.CV cs.CL

Distilling Counterfactual Reasoning from Language to Vision: Causal Graph Guided Post-Training for Video Understanding

从语言到视觉的反事实推理蒸馏:因果图引导的视频理解后训练

Yuefei Chen, Jiang Liu, Xiaodong Lin, Ruixiang Tang

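AI summary: To address the weakness of vision language models in counterfactual reasoning, proposes the CounterVQA benchmark and the CFGPT post-training method, which improves video understanding by distilling counterfactual reasoning ability from the language modality.

AI Chinese abstract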

视觉语言模型(VLM)最近在视频理解方面取得了显著进展,特别是在特征对齐、事件推理和指令遵循任务中。然而,它们在反事实推理(即在假设条件下推断替代结果)方面的能力仍未得到充分探索。这种能力对于鲁棒的视频理解至关重要,因为它需要识别潜在的因果结构并推理未观察到的可能性,而不仅仅是识别观察到的模式。为了系统评估这一能力,我们引入了CounterVQA,一个基于视频的基准测试,具有三个渐进难度级别,评估反事实推理的不同方面。通过对最先进的开源和闭源模型的全面评估,我们发现了一个显著的性能差距:虽然这些模型在简单的反事实问题上达到了合理的准确性,但在复杂的多跳因果链上性能显著下降。为了解决这些限制,我们开发了一种后训练方法CFGPT,通过从语言模态蒸馏其反事实推理能力来增强模型的视觉反事实推理能力,在CounterVQA的所有难度级别上均取得了一致的改进。数据集和代码将后续发布。

English abstract

Vision Language Models (VLMs) have recently shown significant advancements in video understanding, especially in feature alignment, event reasoning, and instruction-following tasks. However, their capability for counterfactual reasoning, i.e., inferring alternative outcomes under hypothetical conditions, remains underexplored. This capability is essential for robust video understanding, as it requires identifying underlying causal structures and reasoning about unobserved possibilities, rather than merely recognizing observed patterns. To systematically evaluate this capability, we introduce CounterVQA, a video-based benchmark featuring three progressive difficulty levels that assess different aspects of counterfactual reasoning. Through comprehensive evaluation of both state-of-the-art open-source and closed-source models, we uncover a substantial performance gap: while these models achieve reasonable accuracy on simple counterfactual questions, performance degrades significantly on complex multi-hop causal chains. To address these limitations, we develop a post-training method, CFGPT, that enhances a model's visual counterfactual reasoning ability by distilling its counterfactual reasoning capability from the language modality, yielding consistent improvements across all CounterVQA difficulty levels. The dataset and code will be released later.

2511.19513 2026-06-01 cs.LG

Row-Stochastic Matrices Can Provably Outperform Doubly Stochastic Matrices in Decentralized Learning

行随机矩阵在去中心化学习中可证明优于双随机矩阵

Bing Liu, Boao Kong, Limin Lu, Kun Yuan, Chengcheng Zhao

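AI summary: Using a weighted Hilbert-space framework, rigorously proves that row-stochastic matrices can converge faster than doubly stochastic matrices in decentralized learning, and gives topology conditions to guide design.

AI Chinese abstract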

去中心化学习通常涉及具有异构节点权重$λ$的加权全局损失。我们重新审视了两种融入这些权重的自然策略:(i) 将权重嵌入局部损失以保持均匀权重(从而得到双随机矩阵),以及(ii) 保留原始损失同时采用由$λ$诱导的行随机矩阵。尽管先前的工作表明两种策略都针对相同的$λ$加权全局损失,但尚不清楚欧几里得空间中的保证是否紧致,以及它们的表现有何根本差异。为了澄清这一点,我们开发了一个加权希尔伯特空间框架$L^2(λ;\mathbb{R}^d)$,并获得了比标准欧几里得分析严格更紧的收敛速率。在该几何中,行随机矩阵成为\emph{自伴的},而双随机矩阵则不是,从而产生了额外的\emph{惩罚项},放大了共识误差,进而减缓了收敛。因此,收敛差异不仅来自谱间隙,还来自这些惩罚项。然后,我们推导了行随机设计即使具有更小的谱间隙也能更快收敛的充分条件。最后,通过使用瑞利商和Loewner序特征值比较,我们进一步获得了保证这一优势的拓扑条件,并给出了实用的拓扑设计指南。

English abstract

Decentralized learning often involves a weighted global loss with heterogeneous node weights $λ$. We revisit two natural strategies for incorporating these weights: (i) embedding them into the local losses to retain a uniform weight (and thus a doubly stochastic matrix), and (ii) keeping the original losses while employing a $λ$-induced row-stochastic matrix. Although prior work shows that both strategies target the same $λ$-weighted global loss, it remains unclear whether the Euclidean-space guarantees are tight and what fundamentally differentiates their behaviors. To clarify this, we develop a weighted Hilbert-space framework $L^2(λ;\mathbb{R}^d)$ and obtain convergence rates that are strictly tighter than those from standard Euclidean analysis. In this geometry, the row-stochastic matrix becomes \emph{self-adjoint} whereas the doubly stochastic one does not, creating additional \emph{penalty terms} that amplify consensus error, thereby slowing convergence. Consequently, the difference in convergence arises not only from spectral gaps but also from these penalty terms. We then derive sufficient conditions under which the row-stochastic design converges faster even with a smaller spectral gap. Finally, by using a Rayleigh-quotient and Loewner-order eigenvalue comparison, we further obtain topology conditions that guarantee this advantage and yield practical topology-design guidelines.
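
The self-adjointness claim in the abstract above can be checked numerically on a small example. The sketch below assumes the weighted inner product is ⟨x, y⟩_λ = Σ_i λ_i x_i y_i (scalar case), under which a matrix W is self-adjoint exactly when λ_i W_ij = λ_j W_ji for all i, j. The abstract does not say how the λ-induced row-stochastic matrix is built, so the Metropolis-style construction on a complete graph used here is a stand-in chosen for illustration, not the paper's construction.

```python
def weighted_inner(x, y, lam):
    """Assumed weighted inner product: sum_i lam[i] * x[i] * y[i]."""
    return sum(l * a * b for l, a, b in zip(lam, x, y))


def matvec(W, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def metropolis_matrix(lam):
    """Row-stochastic matrix on a complete graph (illustrative construction).

    Off-diagonal entries are min(1, lam[j] / lam[i]) / n, so that
    lam[i] * W[i][j] = min(lam[i], lam[j]) / n is symmetric in i and j.
    """
    n = len(lam)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = min(1.0, lam[j] / lam[i]) / n
        W[i][i] = 1.0 - sum(W[i])  # diagonal absorbs the remaining mass
    return W


def is_self_adjoint(W, lam, tol=1e-12):
    """Check lam[i] * W[i][j] == lam[j] * W[j][i] up to a tolerance."""
    n = len(lam)
    return all(
        abs(lam[i] * W[i][j] - lam[j] * W[j][i]) <= tol
        for i in range(n)
        for j in range(n)
    )
```

For non-uniform weights such as `lam = [0.2, 0.3, 0.5]`, `is_self_adjoint(metropolis_matrix(lam), lam)` holds, while the uniform averaging matrix with all entries 1/3 (which is doubly stochastic) fails the same check, because a symmetric W can only satisfy the condition where the weights of connected nodes are equal.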

2511.19433 2026-06-01 cs.RO cs.AI cs.CV

Mixture of Horizons in Action Chunking

动作分块中的视野混合

Dong Jing, Gang Wang, Jiaqi Liu, Weiliang Tang, Zelong Sun, Yunchao Yao, Zhenyu Wei, Yunhui Liu, Zhiwu Lu, Mingyu Ding

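AI summary: To address the trade-off in action chunk length (horizon) in vision-language-action models, proposes a mixture-of-horizons strategy that processes action segments with different horizons in parallel and fuses their outputs, improving both long-term foresight and short-term precision and yielding better performance and generalization.

Comments Accepted at ICML 2026

AI Chinese abstract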

视觉-语言-动作(VLA)模型在机器人操作中展现出显著能力,但其性能对训练中使用的$\textbf{动作分块长度}$(称为$\textbf{视野}$)敏感。我们的实证研究揭示了一个内在权衡:较长的视野提供更强的全局预见但降低细粒度精度,而较短的视野增强局部控制但在长期任务上表现不佳,这意味着固定选择单一视野是次优的。为缓解这一权衡,我们提出$\textbf{混合视野(MoH)}$策略。MoH将动作分块重新排列为多个不同视野的片段,通过共享动作变换器并行处理,并使用轻量线性门控融合输出。它具有三个吸引人的优点:1) MoH在单个模型中联合利用长期预见和短期精度,提高了复杂任务的性能和泛化能力。2) MoH对全注意力动作模块即插即用,训练或推理开销极小。3) MoH支持自适应视野的动态推理,通过跨视野共识选择稳定动作,实现比基线高2.5倍的吞吐量,同时保持优越性能。在基于流的策略$π_0$、$π_{0.5}$和单步回归策略$π_{\text{reg}}$上的大量实验表明,MoH在仿真和真实世界任务上均取得一致且显著的提升。值得注意的是,在混合任务设置下,带有MoH的$π_{0.5}$在LIBERO上仅经过$30k$次训练迭代即达到99$\%$的平均成功率,创下新纪录。项目页面:https://timsty1.github.io/moh/

English abstract

Vision-language-action (VLA) models have shown remarkable capabilities in robotic manipulation, but their performance is sensitive to the $\textbf{action chunk length}$ used during training, termed $\textbf{horizon}$. Our empirical study reveals an inherent trade-off: longer horizons provide stronger global foresight but degrade fine-grained accuracy, while shorter ones sharpen local control yet struggle on long-term tasks, implying that any fixed choice of a single horizon is suboptimal. To mitigate the trade-off, we propose a $\textbf{mixture of horizons (MoH)}$ strategy. MoH rearranges the action chunk into several segments with different horizons, processes them in parallel with a shared action transformer, and fuses outputs with a light linear gate. It has three appealing benefits. 1) MoH exploits long-term foresight and short-term precision jointly within a single model, improving both performance and generalizability to complex tasks. 2) MoH is plug-and-play for full-attention action modules with minimal training or inference overhead. 3) MoH enables dynamic inference with adaptive horizons, which selects stable actions through cross-horizon consensus, achieving 2.5$\times$ higher throughput than baselines while preserving superior performance. Extensive experiments over flow-based policies $π_0$, $π_{0.5}$, and one-step regression policy $π_{\text{reg}}$ demonstrate that MoH yields consistent and significant gains on both simulated and real-world tasks. Notably, under the mixed-task setting, $π_{0.5}$ with MoH reaches a new state-of-the-art with 99$\%$ average success rate on LIBERO after only $30k$ training iterations. Project page: https://timsty1.github.io/moh/

2511.19394 2026-06-01 cs.CV

BackSplit: The Importance of Sub-dividing the Background in Biomedical Lesion Segmentation

BackSplit:在生物医学病灶分割中细分背景的重要性

Rachit Saluja, Asli Cihangir, Ruining Deng, Johannes C. Paetzold, Fengbei Liu, Mert R. Sabuncu

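AI summary: Proposes BackSplit, which trains with the background sub-divided into multiple classes (e.g., tissues and organs), significantly improving small-lesion segmentation without increasing inference cost, and justifies its effectiveness from an information-theoretic standpoint.

Comments Accepted to CVPR 2026

AI Chinese abstract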

在医学图像中分割小病灶仍然非常困难。大多数先前的工作通过设计更好的架构、损失函数或数据增强方案,以及收集更多标注数据来应对这一挑战。我们采取不同的观点,认为部分问题在于背景的建模方式。常见的病灶分割将所有非病灶像素合并为单一的“背景”类,忽略了病灶出现的丰富解剖背景。实际上,背景是高度异质的——由组织、器官和其他结构组成,这些结构现在可以手动标注或使用现有分割模型自动推断。在本文中,我们认为使用细分背景类的细粒度标签进行训练(我们称之为BackSplit)是一种简单而强大的范式,可以在不增加推理成本的情况下提供显著的性能提升。从信息论的角度,我们证明BackSplit相对于传统的二值训练增加了期望的Fisher信息,从而得到更紧的渐近界和更稳定的优化。通过在多个数据集和架构上进行大量实验,我们经验性地表明,即使辅助标签是使用预训练分割模型自动生成的,BackSplit也能持续提升小病灶分割性能。此外,我们证明从交互式分割框架中导出的辅助标签也表现出相同的有利效果,展示了其鲁棒性、简单性和广泛的适用性。

English abstract

Segmenting small lesions in medical images remains notoriously difficult. Most prior work tackles this challenge by designing better architectures, loss functions, or data augmentation schemes, and by collecting more labeled data. We take a different view, arguing that part of the problem lies in how the background is modeled. Common lesion segmentation collapses all non-lesion pixels into a single "background" class, ignoring the rich anatomical context in which lesions appear. In reality, the background is highly heterogeneous, composed of tissues, organs, and other structures that can now be labeled manually or inferred automatically using existing segmentation models. In this paper, we argue that training with fine-grained labels that sub-divide the background class, which we call BackSplit, is a simple yet powerful paradigm that can offer a significant performance boost without increasing inference costs. From an information-theoretic standpoint, we prove that BackSplit increases the expected Fisher Information relative to conventional binary training, leading to tighter asymptotic bounds and more stable optimization. With extensive experiments across multiple datasets and architectures, we empirically show that BackSplit consistently boosts small-lesion segmentation performance, even when auxiliary labels are generated automatically using pretrained segmentation models. Additionally, we demonstrate that auxiliary labels derived from interactive segmentation frameworks yield the same benefit, which underscores the robustness, simplicity, and broad applicability of BackSplit.
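
As a rough illustration of the labeling idea described above, the sketch below builds a fine-grained training target from a binary lesion mask plus an auxiliary anatomy label map, and collapses a multi-class prediction back to lesion versus background at inference. The label encoding (0 for unlabeled background, 1 for lesion, k + 1 for anatomical structure k) and both function names are hypothetical; the abstract does not specify how BackSplit encodes its classes.

```python
LESION = 1  # assumed class id for the lesion in the fine-grained target


def split_background(lesion_mask, anatomy_labels):
    """Build a fine-grained target from two 2-D label grids.

    lesion_mask:    0/1 grid, 1 marks lesion pixels.
    anatomy_labels: 0 for unlabeled background, k >= 1 for structure k
                    (for example from a pretrained segmentation model).
    Lesion pixels always take priority over anatomy labels.
    """
    target = []
    for les_row, ana_row in zip(lesion_mask, anatomy_labels):
        row = []
        for les, ana in zip(les_row, ana_row):
            if les == 1:
                row.append(LESION)
            elif ana == 0:
                row.append(0)
            else:
                row.append(ana + 1)  # shift structure ids past the lesion id
        target.append(row)
    return target


def collapse_to_binary(pred):
    """Map a fine-grained prediction back to lesion (1) vs background (0)."""
    return [[1 if c == LESION else 0 for c in row] for row in pred]
```

For example, with `lesion_mask = [[0, 1], [0, 0]]` and `anatomy_labels = [[1, 1], [2, 0]]`, `split_background` returns `[[2, 1], [3, 0]]`, and `collapse_to_binary` of that grid recovers the original lesion mask.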

2511.18760 2026-06-01 cs.AI cs.FL

HERMES: Towards Efficient and Verifiable Mathematical Reasoning in LLMs

HERMES: 迈向高效且可验证的LLM数学推理

Azim Ospanov, Zijin Feng, Jiacheng Sun, Haoli Bai, Xin Shen, Farzan Farnia

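AI summary: Proposes the Hermes framework, which interleaves informal reasoning with formal verification in Lean and adds intermediate formal checking and a memory module, improving reasoning accuracy while substantially reducing computational cost.

AI Chinese abstract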

非正式数学一直是现代大型语言模型(LLM)推理的核心,提供了灵活性和高效构建论证的能力。然而,纯粹的非正式推理容易产生逻辑漏洞和细微错误,难以检测和纠正。相比之下,形式化定理证明提供了严谨、可验证的数学推理,其中每个推理步骤都由可信的编译器检查,但缺乏非正式问题解决的探索自由度。这种不匹配使得当前基于LLM的数学代理缺乏一种原则性的方法来结合两种范式的优势。在这项工作中,我们引入了Hermes,这是第一个明确将非正式推理与Lean中的形式化验证证明交替结合的工具辅助代理。该框架执行中间形式化检查以防止推理漂移,并配备一个记忆模块以在多步推理链中保持证明的连续性,从而同时实现探索和验证。我们在四个具有挑战性的数学推理基准上评估了Hermes,使用了不同参数规模的LLM,从小模型到最先进的系统。在所有设置中,Hermes可靠地提高了基础模型的推理准确性,同时与基于奖励的方法相比,显著减少了推理令牌使用量和计算成本。在AIME和HARDMath2等困难数据集上,Hermes@1实现了高达40%的准确性提升,同时总推理FLOPs减少了80%。在测试时扩展时,Hermes@5进一步将准确性提高了20%。实现和代码库公开于https://github.com/aziksh-ospanov/HERMES。

English abstract

Informal mathematics has been central to modern large language model (LLM) reasoning, offering flexibility and efficient construction of arguments. However, purely informal reasoning is prone to logical gaps and subtle errors that are difficult to detect and correct. In contrast, formal theorem proving provides rigorous, verifiable mathematical reasoning, where each inference step is checked by a trusted compiler, but lacks the exploratory freedom of informal problem-solving. This mismatch leaves current LLM-based math agents without a principled way to combine the strengths of both paradigms. In this work, we introduce Hermes, the first tool-assisted agent that explicitly interleaves informal reasoning with formally verified proofs in Lean. The framework performs intermediate formal checking to prevent reasoning drift and uses a memory module to maintain proof continuity across multi-step reasoning chains, enabling both exploration and verification. We evaluate Hermes on four challenging mathematical reasoning benchmarks using LLMs of varying parameter scales, from small models to state-of-the-art systems. Across all settings, Hermes reliably improves the reasoning accuracy of base models while substantially reducing reasoning token usage and computational cost compared to reward-based approaches. On difficult datasets such as AIME and HARDMath2, Hermes@1 achieves up to a 40% accuracy improvement while using 80% fewer total inference FLOPs. When scaled at test time, Hermes@5 boosts accuracy further by 20%. The implementation and codebase are publicly available at https://github.com/aziksh-ospanov/HERMES.

2511.17826 2026-06-01 cs.LG cs.CL stat.ML

Deterministic Inference across Tensor Parallel Sizes That Eliminates Training-Inference Mismatch

跨张量并行大小的确定性推理,消除训练-推理不匹配

Ziyang Zhang, Xinheng Ding, Jiayi Yuan, Rixin Liu, Huizi Mao, Jiarong Xing, Zirui Liu

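AI summary: To address inference non-determinism caused by floating-point non-associativity under different tensor parallel sizes, proposes Tree-Based Invariant Kernels (TBIK), which produce bit-wise identical results across TP sizes and eliminate the precision mismatch between inference and training engines in RL training.

AI Chinese abstract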

确定性推理对于大型语言模型(LLM)应用(如LLM-as-a-judge评估、多智能体系统和强化学习(RL))日益关键。然而,现有的LLM服务框架表现出非确定性行为:当系统配置(例如张量并行(TP)大小、批大小)变化时,即使采用贪心解码,相同的输入也可能产生不同的输出。这是由于浮点运算的非结合性以及GPU间归约顺序不一致导致的。虽然先前的工作通过批不变核解决了与批大小相关的非确定性,但跨不同TP大小的确定性仍然是一个开放问题,特别是在RL设置中,训练引擎通常使用全分片数据并行(即TP=1),而部署引擎依赖多GPU TP以最大化推理吞吐量,从而在两者之间产生自然的不匹配。这种精度不匹配问题可能导致RL训练性能次优甚至崩溃。我们识别并分析了TP引起不一致的根本原因,并提出了基于树的核(TBIK),这是一组TP不变的矩阵乘法和归约原语,无论TP大小如何,都能保证比特级相同的结果。我们的关键见解是通过统一的层次二叉树结构对齐GPU内和GPU间的归约顺序。我们在Triton中实现了这些核,并将其集成到vLLM和FSDP中。实验证明,在不同TP大小下,确定性推理的概率发散为零,且具有比特级可重复性。此外,在采用不同并行策略的RL训练流程中,我们在vLLM和FSDP之间实现了比特级相同的结果。代码可在https://github.com/nanomaoli/llm_reproducibility获取。

English abstract

Deterministic inference is increasingly critical for large language model (LLM) applications such as LLM-as-a-judge evaluation, multi-agent systems, and Reinforcement Learning (RL). However, existing LLM serving frameworks exhibit non-deterministic behavior: identical inputs can yield different outputs when system configurations (e.g., tensor parallel (TP) size, batch size) vary, even under greedy decoding. This arises from the non-associativity of floating-point arithmetic and inconsistent reduction orders across GPUs. While prior work has addressed batch-size-related nondeterminism through batch-invariant kernels, determinism across different TP sizes remains an open problem, particularly in RL settings, where the training engine typically uses Fully Sharded Data Parallel (i.e., TP = 1) while the rollout engine relies on multi-GPU TP to maximize the inference throughput, creating a natural mismatch between the two. This precision mismatch problem may lead to suboptimal performance or even collapse for RL training. We identify and analyze the root causes of TP-induced inconsistency and propose Tree-Based Invariant Kernels (TBIK), a set of TP-invariant matrix multiplication and reduction primitives that guarantee bit-wise identical results regardless of TP size. Our key insight is to align intra- and inter-GPU reduction orders through a unified hierarchical binary tree structure. We implement these kernels in Triton and integrate them into vLLM and FSDP. Experiments confirm zero probability divergence and bit-wise reproducibility for deterministic inference across different TP sizes. Also, we achieve bit-wise identical results between vLLM and FSDP in RL training pipelines with different parallel strategies. Code is available at https://github.com/nanomaoli/llm_reproducibility.
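
The key insight stated above, aligning the reduction order inside and across devices through one binary tree, can be illustrated on plain Python floats. This is a toy sketch, not the Triton kernels from the paper: `tree_sum` and `sharded_tree_sum` are hypothetical names, a "shard" stands in for a GPU, and the sketch assumes the input length and the shard count are both powers of two.

```python
def tree_sum(values):
    """Sum with a fixed balanced binary tree: pair neighbours level by level.

    Assumes len(values) is a power of two.
    """
    vals = list(values)
    while len(vals) > 1:
        vals = [vals[i] + vals[i + 1] for i in range(0, len(vals), 2)]
    return vals[0]


def sharded_tree_sum(values, num_shards):
    """Split values into contiguous shards, reduce each, then reduce partials.

    Each shard computes one subtree of the global tree and the cross-shard
    step continues the same tree, so the sequence of additions is identical
    for every power-of-two shard count.
    """
    size = len(values) // num_shards
    partials = [
        tree_sum(values[k * size:(k + 1) * size]) for k in range(num_shards)
    ]
    return tree_sum(partials)
```

Because floating-point addition is not associative, for instance `(0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)` in IEEE double precision, a reduction whose grouping depends on the shard count can change the low bits of the result. In the sketch, `sharded_tree_sum(values, s)` equals `tree_sum(values)` exactly for every shard count `s` in 1, 2, 4, 8, 16 on a 16-element input, since the grouping never changes.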

2506.08255 2026-06-01 cs.LG cs.AI cs.CR

SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense

SHIELD: 用于增量扩展学习防御的安全超网络

Patryk Krukowski, Łukasz Gorczyca, Piotr Helm, Kamil Książek, Przemysław Spurek

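AI summary: Proposes SHIELD, a framework combining Interval Bound Propagation (IBP) with hypernetworks; by generating task-specific parameters and training with Interval MixUp, it achieves certifiably robust continual learning, reaching state-of-the-art average accuracy while remaining scalable.

Comments Accepted to CVPR 2026 (Findings track)

AI Chinese abstract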

在对抗条件下的持续学习仍然是一个开放问题,现有方法往往在鲁棒性、可扩展性或两者之间做出妥协。我们提出了一种新颖的框架,将区间边界传播(IBP)与基于超网络的架构相结合,以实现跨顺序任务的可认证鲁棒持续学习。我们的方法SHIELD通过一个共享的超网络生成任务特定的模型参数,该超网络仅依赖于紧凑的任务嵌入,从而消除了对重放缓冲区或完整模型副本的需求,并实现了高效的时间扩展。为了进一步增强鲁棒性,我们引入了区间混合(Interval MixUp),这是一种新颖的训练策略,它将表示为以MixUp点为中心的$\ell_{\infty}$球的虚拟示例混合。利用区间算术,该技术保证了可认证的鲁棒性,同时减轻了包裹效应,从而产生更平滑的决策边界。我们在多个基准测试上评估了SHIELD在强白盒对抗攻击(包括PGD和AutoAttack)下的表现。它持续优于现有的鲁棒持续学习方法,在保持可扩展性和认证性的同时,实现了最先进的平均准确率。这些结果向在对抗环境中实现实用且理论扎实的持续学习迈出了重要一步。

English abstract

Continual learning under adversarial conditions remains an open problem, as existing methods often compromise robustness, scalability, or both. We propose a novel framework that integrates Interval Bound Propagation (IBP) with a hypernetwork-based architecture to enable certifiably robust continual learning across sequential tasks. Our method, SHIELD, generates task-specific model parameters via a shared hypernetwork conditioned solely on compact task embeddings, eliminating the need for replay buffers or full model copies and remaining efficient over time. To further enhance robustness, we introduce Interval MixUp, a novel training strategy that blends virtual examples represented as $\ell_{\infty}$ balls centered around MixUp points. Leveraging interval arithmetic, this technique guarantees certified robustness while mitigating the wrapping effect, resulting in smoother decision boundaries. We evaluate SHIELD under strong white-box adversarial attacks, including PGD and AutoAttack, across multiple benchmarks. It consistently outperforms existing robust continual learning methods, achieving state-of-the-art average accuracy while maintaining both scalability and certification. These results represent a significant step toward practical and theoretically grounded continual learning in adversarial settings.
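
For readers unfamiliar with the building blocks named above, the sketch below shows standard interval bound propagation through one affine layer and a ReLU, applied to an $\ell_{\infty}$ ball centered on a MixUp point. It covers only these generic pieces; the hypernetwork, the training loss, and how SHIELD actually combines the intervals are not described in the abstract and are not modeled here. All function names are illustrative.

```python
def mixup_point(x1, x2, alpha):
    """Convex combination alpha * x1 + (1 - alpha) * x2 of two inputs."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(x1, x2)]


def linf_ball(center, eps):
    """Lower and upper bounds of the l_inf ball of radius eps around center."""
    return [c - eps for c in center], [c + eps for c in center]


def ibp_affine(lower, upper, W, b):
    """Propagate an input box through y = W x + b.

    For each output, positive weights take the matching bound and negative
    weights take the opposite bound, which gives the tightest box for a
    single affine layer.
    """
    out_l, out_u = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_l.append(lo)
        out_u.append(hi)
    return out_l, out_u


def ibp_relu(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]
```

As a worked case, mixing `[2.0, 0.0]` and `[0.0, -2.0]` with `alpha = 0.5` gives the point `[1.0, -1.0]`; a ball of radius 0.5 around it passed through `W = [[1.0, -1.0], [2.0, 0.0]]`, `b = [0.0, -4.0]` yields pre-activation bounds `[1.0, -3.0]` to `[3.0, -1.0]`, and after ReLU the bounds are `[1.0, 0.0]` to `[3.0, 0.0]`.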

2511.17380 2026-06-01 cs.CV cs.LG

Non-Parametric Probabilistic Robustness: A Conservative Risk Estimator under Unknown Perturbation Distributions

非参数概率鲁棒性:未知扰动分布下的保守风险估计

Zheng Wang, Yi Zhang, Siddartha Khastgir, Carsten Maple, Xingyu Zhao

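AI summary: Proposes the non-parametric probabilistic robustness (NPPR) metric, which learns the perturbation distribution from data to give conservative probabilistic robustness estimates under distributional uncertainty, and develops an estimator based on Gaussian mixture models.

AI Chinese abstract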

深度学习模型尽管取得了显著成功,但仍然容易受到微小输入扰动的影响,导致错误输出,这促使最近提出概率鲁棒性(PR)作为对抗鲁棒性(AR)的补充替代方案。然而,现有的PR公式假设扰动分布固定且已知,这在实践中是不现实的期望。为了解决这一限制,我们提出了非参数概率鲁棒性(NPPR),一种更实用的PR度量,不依赖于任何预定义的扰动分布。遵循统计建模中的非参数范式,NPPR直接从数据中学习优化的扰动分布,从而在分布不确定性下实现保守的PR评估。我们进一步开发了基于高斯混合模型(GMM)的NPPR估计器,涵盖了各种输入相关和输入无关的扰动场景。理论分析建立了AR、PR和NPPR之间的关系。在CIFAR-10、CIFAR-100和Tiny ImageNet上使用ResNet18/50、WideResNet50和VGG16的大量实验验证了NPPR作为更实用的鲁棒性度量,与假设最先进技术中使用的常见扰动分布相比,显示出保守(较低)的PR估计。

English abstract

Deep learning (DL) models, despite their remarkable success, remain vulnerable to small input perturbations that can cause erroneous outputs, motivating the recent proposal of probabilistic robustness (PR) as a complementary alternative to adversarial robustness (AR). However, existing PR formulations assume a fixed and known perturbation distribution, an unrealistic expectation in practice. To address this limitation, we propose non-parametric probabilistic robustness (NPPR), a more practical PR metric that does not rely on any predefined perturbation distribution. Following the non-parametric paradigm in statistical modeling, NPPR learns an optimized perturbation distribution directly from data, enabling conservative PR evaluation under distributional uncertainty. We further develop an NPPR estimator based on a Gaussian Mixture Model (GMM), covering various input-dependent and input-independent perturbation scenarios. Theoretical analyses establish the relationships among AR, PR, and NPPR. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet across ResNet18/50, WideResNet50 and VGG16 validate NPPR as a more practical robustness metric, showing conservative (lower) PR estimates than those obtained by assuming the common perturbation distributions used in state-of-the-art methods.
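
The quantity that NPPR generalizes, probabilistic robustness under a fixed perturbation distribution, can be estimated by simple Monte Carlo sampling. The sketch below does that for a toy classifier and then takes the minimum over a small hand-picked family of distributions as a crude stand-in for a conservative estimate. This is not the paper's estimator: NPPR learns the perturbation distribution (with a GMM) rather than choosing from a fixed list, and `classify`, `uniform_linf`, and `conservative_pr` are hypothetical names.

```python
import random


def classify(x):
    """Toy stand-in for a model: class 1 if the features sum to more than 0."""
    return 1 if sum(x) > 0 else 0


def uniform_linf(eps):
    """Sampler for perturbations drawn uniformly from an l_inf ball."""
    def sample(rng, dim):
        return [rng.uniform(-eps, eps) for _ in range(dim)]
    return sample


def estimate_pr(x, sample_delta, n_samples=2000, seed=0):
    """Monte Carlo estimate of P[classify(x + delta) == classify(x)]."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    label = classify(x)
    hits = 0
    for _ in range(n_samples):
        delta = sample_delta(rng, len(x))
        if classify([v + d for v, d in zip(x, delta)]) == label:
            hits += 1
    return hits / n_samples


def conservative_pr(x, samplers, **kwargs):
    """Lowest PR estimate over a finite family of candidate distributions."""
    return min(estimate_pr(x, s, **kwargs) for s in samplers)
```

For an input far from the decision boundary, such as `[1.0, 1.0]` with radius 0.1, the estimate is exactly 1.0 because no sampled perturbation can flip the sign of the sum. For an input near the boundary, such as `[0.05, 0.0]`, the estimate depends on the assumed distribution, which is the sensitivity the abstract points to.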

2511.17185 2026-06-01 cs.CV

PostCam: Camera-Controllable Novel-View Video Generation with Query-Shared Cross-Attention

PostCam: 基于查询共享交叉注意力的相机可控新视角视频生成

Yipeng Chen, Zhichao Ye, Zhenzhou Fang, Xinyu Chen, Xiaoyu Zhang, Jialing Liu, Nan Wang, Guofeng Zhang, Haomin Liu

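AI summary: Proposes the PostCam framework, which aligns 6-DoF poses and rendered features through a Query-Shared Cross-Attention mechanism to generate novel-view videos of dynamic scenes with strong detail preservation and precise camera trajectory editing.

AI Chinese abstract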

我们提出了PostCam,一个用于新视角视频生成的简化框架,在动态场景中实现了优越的细节保留和精确的相机轨迹编辑。当前方法常常在基于姿态的控制(缺乏视觉细节)和基于渲染的引导(对几何精度过于敏感)之间权衡。尽管最近有混合尝试,但由于缺乏有效的跨模态对齐,实现精确的运动和视觉一致性仍然具有挑战性。我们认为,稳健的控制源于多模态信号的深度对齐,而不是增加输入复杂性。我们的核心贡献是查询共享交叉注意力机制,它将6自由度姿态和渲染特征投影到统一的潜在空间中。这使得模型在去噪过程中能够自发地实现运动线索和像素级引导之间的内在一致性。实验表明,PostCam在保持高保真视觉细节的同时,在轨迹精度上比最先进的方法提高了20%,在复杂动态场景中表现出卓越的鲁棒性。我们的项目网页公开在:https://cccqaq.github.io/PostCam.github.io/

English abstract

We propose PostCam, a streamlined framework for novel-view video generation that achieves superior detail preservation and precise camera trajectory editing in dynamic scenes. Current methods often struggle with a trade-off between pose-based control, which lacks visual detail, and rendering-based guidance, which is overly sensitive to geometric accuracy. Despite recent hybrid attempts, achieving precise motion and visual consistency remains challenging due to the lack of effective cross-modal alignment. We argue that robust control stems from the deep alignment of multimodal signals rather than increased input complexity. Our core contribution is the Query-Shared Cross-Attention mechanism, which projects 6-DoF poses and rendered features into a unified latent space. This allows the model to spontaneously achieve intrinsic consistency between motion cues and pixel-level guidance during denoising. Experiments demonstrate that PostCam maintains high-fidelity visual details while outperforming state-of-the-art methods by 20% in trajectory precision, exhibiting superior robustness in complex dynamic scenes. Our project webpage is publicly available at: https://cccqaq.github.io/PostCam.github.io/

2511.15692 2026-06-01 cs.CV

Hyperspectral Image Classification using Spectral-Spatial Mixer Network

高光谱图像分类的光谱-空间混合器网络

Mohammed Q. Alkhatib

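AI summary: Proposes SS-MixNet, a lightweight deep learning model that extracts local and long-range spectral-spatial features with 3D convolutions and parallel MLP mixer blocks, achieving high-accuracy hyperspectral image classification with 1% labeled data.

Comments Accepted and published in IEEE WHISPERS2025

AI Chinese abstract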

本文介绍了SS-MixNet,一种用于高光谱图像(HSI)分类的轻量级且有效的深度学习模型。该架构将用于局部光谱-空间特征提取的3D卷积层与两个并行的MLP风格混合器模块相结合,以捕获光谱和空间维度上的长距离依赖关系。采用基于深度可分离卷积的注意力机制,以最小的计算开销增强判别能力。该模型在QUH-Tangdaowan和QUH-Qingyun数据集上进行了评估,仅使用1%的标注数据进行训练和验证。SS-MixNet在比较的方法中取得了最高性能,包括2D-CNN、3D-CNN、IP-SWIN、SimPoolFormer和HybridKAN,在Tangdaowan和Qingyun数据集上分别达到了95.68%和93.86%的总体准确率。由定量指标和分类图支持的结果证实了该模型在有限监督下提供准确且鲁棒预测的有效性。代码将在以下网址公开:https://github.com/mqalkhatib/SS-MixNet

English abstract

This paper introduces SS-MixNet, a lightweight and effective deep learning model for hyperspectral image (HSI) classification. The architecture integrates 3D convolutional layers for local spectral-spatial feature extraction with two parallel MLP-style mixer blocks that capture long-range dependencies in spectral and spatial dimensions. A depthwise convolution-based attention mechanism is employed to enhance discriminative capability with minimal computational overhead. The model is evaluated on the QUH-Tangdaowan and QUH-Qingyun datasets using only 1% of labeled data for training and validation. SS-MixNet achieves the highest performance among compared methods, including 2D-CNN, 3D-CNN, IP-SWIN, SimPoolFormer, and HybridKAN, reaching 95.68% and 93.86% overall accuracy on the Tangdaowan and Qingyun datasets, respectively. The results, supported by quantitative metrics and classification maps, confirm the model's effectiveness in delivering accurate and robust predictions with limited supervision. The code will be made publicly available at: https://github.com/mqalkhatib/SS-MixNet

2509.12440 2026-06-01 cs.CL cs.AI

MedFact: Benchmarking the Fact-Checking Capabilities of Large Language Models on Chinese Medical Texts

MedFact:大型语言模型在中文医学文本上的事实核查能力基准测试

Jiayi He, Yangmin Huang, Qianyun Du, Xiangying Zhou, Zhiyang He, Jiaxue Hu, Xiaodong Tao, Lixian Lai

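AI summary: To evaluate the fact-checking ability of LLMs on Chinese medical texts, builds the MedFact benchmark of 2,116 expert-annotated instances covering 13 specialties and 8 error types, among other dimensions, and finds that models fall short on error localization and exhibit an "over-criticism" phenomenon.

Comments Accepted to The Fifth Workshop on Generation, Evaluation, and Metrics (GEM) at ACL 2026

AI Chinese abstract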

在医疗应用中部署大型语言模型(LLM)需要具备事实核查能力,以确保患者安全和法规合规。我们引入了MedFact,一个具有挑战性的中文医学事实核查基准,包含来自多样化真实文本的2,116个专家标注实例,涵盖13个专科、8种错误类型、4种写作风格和5个难度级别。构建采用混合AI-人类框架,其中迭代的专家反馈优化AI驱动的多标准过滤,以确保高质量和难度。我们评估了20个领先的LLM在真实性分类和错误定位方面的表现,结果显示模型通常能判断文本是否包含错误,但难以精确定位错误,顶级模型的表现仍不及人类。我们的分析揭示了“过度批评”现象,即模型倾向于将正确信息误判为错误,而高级推理技术(如多智能体协作和推理时扩展)可能加剧这一问题。MedFact突显了部署医疗LLM的挑战,并为开发事实可靠的医疗AI系统提供了资源。

English abstract

Deploying Large Language Models (LLMs) in medical applications requires fact-checking capabilities to ensure patient safety and regulatory compliance. We introduce MedFact, a challenging Chinese medical fact-checking benchmark with 2,116 expert-annotated instances from diverse real-world texts, spanning 13 specialties, 8 error types, 4 writing styles, and 5 difficulty levels. Construction uses a hybrid AI-human framework where iterative expert feedback refines AI-driven, multi-criteria filtering to ensure high quality and difficulty. We evaluate 20 leading LLMs on veracity classification and error localization. The results show that models can often determine whether a text contains errors but struggle to localize them precisely, with top performers falling short of human performance. Our analysis reveals the "over-criticism" phenomenon, a tendency for models to misidentify correct information as erroneous, which can be exacerbated by advanced reasoning techniques such as multi-agent collaboration and inference-time scaling. MedFact highlights the challenges of deploying medical LLMs and provides resources to develop factually reliable medical AI systems.

2511.10868 2026-06-01 cs.LG

Go-UT-Bench: A Fine-Tuning Dataset for LLM-Based Unit Test Generation in Go

Go-UT-Bench:用于基于LLM的Go语言单元测试生成的微调数据集

Yashshi Pipalani, Hritik Raj, Rajat Ghosh, Vaishnavi Bhargava, Debojyoti Dutta

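AI summary: To address training data imbalance in code LLMs, proposes the Go-UT-Bench dataset (5264 pairs of code and unit tests); fine-tuning on it improves unit test generation in Go, with fine-tuned models outperforming their base models on more than 75% of benchmark tasks.

Comments 9 pages, 5 figures

AI Chinese abstract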

训练数据不平衡对代码LLM构成了重大挑战。大多数可用数据严重偏向原始开源代码,而低估了更广泛的软件工程任务,尤其是在像Golang这样的低资源语言中。因此,模型在代码自动补全方面表现出色,但在单元测试生成等实际开发者工作流程中表现不佳。为了解决这一差距,我们引入了GO UT Bench,这是一个包含5264对代码和单元测试的基准数据集,来自10个宽松许可的Golang仓库,涵盖不同领域。我们评估了它作为微调数据集在两个LLM家族(即专家混合模型和密集解码器)上的有效性。我们的结果表明,微调后的模型在超过75%的基准任务上优于其基础对应模型。

English abstract

Training data imbalance poses a major challenge for code LLMs. Most available data heavily overrepresents raw open-source code while underrepresenting broader software engineering tasks, especially in low-resource languages like Golang. As a result, models excel at code autocompletion but struggle with real-world developer workflows such as unit test generation. To address this gap, we introduce Go-UT-Bench, a benchmark dataset of 5264 pairs of code and unit tests, drawn from 10 permissively licensed Golang repositories spanning diverse domains. We evaluate its effectiveness as a fine-tuning dataset across two LLM families, i.e., mixture-of-experts models and dense decoders. Our results show that fine-tuned models outperform their base counterparts on more than 75% of benchmark tasks.

2510.22067 2026-06-01 cs.CV

Capturing Gaze Shifts for Guidance: Cross-Modal Fusion Enhancement for VLM Hallucination Mitigation

捕捉注视转移以引导:跨模态融合增强用于VLM幻觉缓解

Zheng Qi, Chao Shang, Evangelia Spiliopoulou, Nikolaos Pappas

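AI summary: Proposes GIFT, which pre-computes a visual saliency map by tracking gaze shifts and uses it during decoding to amplify attention to salient visual information and the user query, mitigating hallucination in vision language models.

Comments ICML 2026

AI Chinese abstract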

视觉语言模型(VLM)经常产生幻觉,即无法由文本或视觉输入证实的内容。先前的工作主要将其归因于过度依赖语言先验知识而非视觉输入。一些方法尝试通过按注意力分数比例放大视觉令牌注意力来缓解幻觉。然而,这些方法忽视了视觉注意力沉没问题,即注意力经常被错误分配到与任务无关的视觉区域,并且忽略了跨模态融合平衡,仅增强视觉注意力而不调整对用户查询的注意力。这可能导致放大错误区域,同时无法正确解释用户查询。为解决这些挑战,我们提出了一种简单而有效的方法,称为注视转移引导的跨模态融合增强(GIFT)。GIFT通过在用户查询理解过程中跟踪视觉注意力的正向变化(即“注视转移”),预计算整体视觉显著性图,并利用该图在每个解码步骤放大对显著视觉信息和用户查询的注意力。这减少了视觉注意力沉没的影响,因为无关令牌的转移最小,同时确保平衡的跨模态融合以获得良好整合的表示。大量实验表明,GIFT在生成和分类任务中均有效缓解了VLM的幻觉,与贪婪解码相比实现了高达20.7%的改进,同时以低计算开销保持了通用的视觉语言性能。

English abstract

Vision language models (VLMs) often generate hallucinations, i.e., content that cannot be substantiated by either textual or visual inputs. Prior work primarily attributes this to over-reliance on linguistic prior knowledge rather than visual inputs. Some methods attempt to mitigate hallucination by amplifying visual token attention proportionally to their attention scores. However, these methods overlook the visual attention sink problem, where attention is frequently misallocated to task-irrelevant visual regions, and neglect cross-modal fusion balance by enhancing only visual attention without adjusting attention to the user query. This can result in amplifying incorrect areas while failing to properly interpret the user query. To address these challenges, we propose a simple yet effective method called Gaze Shift-Guided Cross-modal Fusion Enhancement (GIFT). GIFT pre-computes a holistic visual saliency map by tracking positive changes in visual attention, or "gaze shifts", during user query comprehension, and leverages this map to amplify attention to both salient visual information and the user query at each decoding step. This reduces the impact of the visual attention sink, as irrelevant tokens exhibit minimal shifts, while ensuring balanced cross-modal fusion for a well-integrated representation. Extensive experiments show that GIFT effectively mitigates hallucination in VLMs across both generative and classification tasks, achieving up to 20.7% improvement over greedy decoding, while maintaining general vision-language performance with low computational overhead.