arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.01712 2026-06-03 cs.LG Updated version

CoAction: Cross-task Correlation-aware Pareto Set Learning

Xinyue Chen, Yingxuan Liang, Yiqin Huang, Chikai Shang, Hai-Lin Liu, Fangqing Gu

Affiliations Guangdong University of Technology; Xiamen University

AI summary Proposes the CoAction framework, which uses a task-aware Transformer to handle multiple multi-objective optimization problems simultaneously, captures inter-task correlations through self-attention, and improves the Hypervolume, Range, and Sparsity metrics.

Comments Accepted by ICIC 2026 (Oral)

Abstract

Pareto set learning (PSL) is an emerging paradigm in multi-objective optimization that trains neural networks to map preference vectors to Pareto optimal solutions. However, existing PSL methods primarily focus on solving a single multi-objective optimization problem at a time. This limitation not only increases computational costs in multi-objective multitask optimization scenarios by requiring a separate model for each task, but also fails to exploit inter-task correlations. To address this, we propose a Cross-tAsk correlation-aware Pareto Set Learning (CoAction) framework, which leverages a task-aware Transformer to handle multiple tasks simultaneously. Specifically, by assigning task-specific embedding vectors to individual tasks, the model effectively distinguishes between tasks while facilitating knowledge sharing among them. We utilize a Transformer encoder as the backbone architecture to leverage its self-attention mechanism for capturing complex task dependencies. The proposed approach is evaluated on comprehensive multitask test suites covering both benchmark problems and real-world applications, demonstrating effectiveness and competitive performance in Hypervolume, Range, and Sparsity.
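
Hypervolume, one of the three metrics named in this abstract, has a simple exact form for two objectives. The sketch below is a generic illustration for minimization problems and is not the authors' evaluation code; the function name `hypervolume_2d`, the sample front, and the reference point are all assumptions made for the example.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimized).

    `points` is an iterable of (f1, f2) pairs; `ref` is a reference point that
    should be worse than every point of interest in both objectives.
    """
    # Only points that strictly improve on the reference point contribute.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:
            # A new non-dominated point adds one rectangular strip.
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


# Hypothetical front with one dominated point, (3, 3), which adds nothing.
front = [(1, 3), (2, 2), (3, 1), (3, 3)]
area = hypervolume_2d(front, ref=(4, 4))
```

A larger hypervolume indicates a front that is closer to the ideal point and better spread, which is why it is commonly reported alongside range and sparsity measures.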

2606.03940 2026-06-03 eess.IV cs.CV cs.LG cs.RO Updated version

SEAOTTER: Sensor Embedded Autoencoding with One-Time Transcode for Efficient Reconstruction

Dan Jacobellis, Neeraja J. Yadwadkar

Affiliations Department of Electrical and Computer Engineering, The University of Texas at Austin

AI summary Proposes the SEAOTTER framework, which combines a sensor-embedded autoencoder with a learnable JPEG transcode; at a 200:1 compression ratio it encodes 7 times faster and decodes 3.5 times faster than AVIF and raises ImageNet top-1 accuracy by 8%, while remaining compatible with JPEG.

Abstract

In robotics systems, vast amounts of visual data are easily captured at high resolution using low-cost, low-power hardware. Yet, limited bandwidth and on-device compute resources prevent full utilization when transmitted via conventional codecs like JPEG/MPEG. Newer codecs, like AV1/AVIF, improve the rate-distortion trade-off, but demand far more resources for encoding, impractical without custom ASICs. Recent asymmetric autoencoders deliver high quality under extreme power and bandwidth constraints, but add prohibitive decoding cost and use bespoke formats that ignore decades of infrastructure built around standards like JPEG. To address these limitations, we introduce a compression framework for cloud robotics based on a Sensor Embedded Autoencoder paired with a One-Time Transcode for Efficient Reconstruction (SEAOTTER). Because the sensor, cloud, and consumer stages face very different power and bandwidth budgets, SEAOTTER combines the compactness of a learned latent with the broad usability of a standard JPEG file. Since naive transcoding degrades performance, we propose a learnable JPEG color and quantization transform that enables increased accuracy for global, dense, and vision-language-based perception. Using SEAOTTER, we train both general-purpose and task-aware transcoding pipelines for a pre-trained, frozen encoder. At a compression ratio of 200:1 and compared to AVIF, we observe 7 times faster encoding, 3.5 times faster decoding, and +8% ImageNet top-1 accuracy, while retaining compatibility with JPEG infrastructure. Our code is available at https://github.com/UT-SysML/seaotter .

2606.02661 2026-06-03 eess.IV cs.AI cs.LG Updated version

Learning to Refine: Spectral-Decoupled Iterative Refinement Framework for Precipitation Nowcasting

Yunlong Zhou, Chen Zhao, Danyang Peng, Fanfan Ji, Xiao-Tong Yuan

Affiliations National University of Singapore

AI summary Proposes the Spectral-Decoupled Iterative Refinement (SDIR) framework, which uses a dual-path design (SFG-Former and FR-Refiner) and a physically consistent power spectral density loss to perform progressive frequency-decoupled refinement for precipitation nowcasting within a deterministic framework, eliminating blurring and hallucinations and outperforming existing methods in spatial accuracy with spectral fidelity competitive with diffusion-based methods.

Comments 21 pages, 10 figures, accepted at ICML 2026

Abstract

Accurate precipitation nowcasting is vital for disaster mitigation, but deep learning methods face a key trade-off: regression models produce over-smoothed, spectrally decaying predictions that blur convective details and violate turbulence power laws; diffusion models generate realistic yet unanchored hallucinations lacking physical grounding. We propose Spectral-Decoupled Iterative Refinement (SDIR), a deterministic framework that reformulates nowcasting as progressive frequency-decoupled refinement. SDIR first extracts a stable low-frequency synoptic skeleton, then iteratively refines high-frequency textures under physical constraints, eliminating both blurring and hallucinations. It features a dual-path design: the Synoptic Frequency-Guided Former (SFG-Former) with Scale-Adaptive Transformers for global structure, and the Fourier Residual Refiner (FR-Refiner) with Scale-Conditioned Fourier Neural Operators for fine residuals. A Physically Consistent Power Spectral Density (PCPSD) loss with dynamic masking enforces a turbulence-consistent spectral distribution. Experiments on three benchmarks show SDIR significantly outperforms SOTA methods in spatial accuracy while achieving spectral fidelity competitive with diffusion-based methods, enabling reliable high-resolution operational nowcasting. Code link: https://github.com/RuntimeWarning/SDIR.

2606.02642 2026-06-03 eess.AS cs.AI cs.CV cs.LG cs.MM cs.SD Updated version

SVHalluc: Benchmarking Speech-Vision Hallucination in Audio-Visual Large Language Models

Chenshuang Zhang, Kyeong Seon Kim, Chengxin Liu, Tae-Hyun Oh

Affiliations KAIST

AI summary Addresses speech-vision hallucination in audio-visual large language models by proposing the SVHalluc benchmark, which evaluates along semantic and temporal dimensions how well models align speech content with visual signals, and finds that existing models have limited cross-modal understanding.

Comments Accepted at CVPR 2026

Abstract

Despite the success of audio-visual large language models (LLMs), they can produce plausible but ungrounded outputs, termed hallucination. Existing benchmarks focus on environmental sounds (e.g., dog barking) to indicate event occurrence. In contrast, human speech carries fundamentally different, rich semantics and temporal structures, yet it remains unexplored whether current models can accurately align speech content with corresponding visual signals. In this work, we show that speech content can induce hallucinations in audio-visual LLMs. To systematically study this, we introduce SVHalluc, the first comprehensive benchmark for evaluating speech-vision hallucination in audio-visual LLMs. Our benchmark diagnoses speech-vision hallucinations from two critical and complementary aspects: semantic and temporal. Experimental results demonstrate that state-of-the-art open-source audio-visual LLMs struggle to align speech content with corresponding visual signals, with near-random accuracy on multiple tasks. In contrast, Gemini 2.5 Pro significantly outperforms the open-source models. Our analysis suggests that their failures stem from limited ability in cross-modality understanding, despite strong performance in single-modality perception. Our work uncovers a new and fundamental limitation of current audio-visual LLMs and highlights the need for speech-grounded video comprehension. Project page: https://chenshuang-zhang.github.io/projects/svhalluc/.

2606.02631 2026-06-03 eess.AS cs.AI cs.CV cs.LG cs.SD Updated version

Wavelet as Tokenizer: Preliminary Results on a Shared Wavelet Token Schema for Natural Signals

Shenghao Ding

Affiliations Yet Another AI

AI summary Studies whether audio, images, and video can share a unified wavelet token schema; using a continuous-token model built on a Haar DWT/IDWT frontend, it demonstrates the feasibility of a unified token schema on several datasets and analyzes the effects of latent capacity and metadata.

Comments 12 pages, 3 figures

Abstract

This paper studies whether audio, images, and video can share a common wavelet token schema rather than relying on separate modality-specific latent grids. It introduces a preliminary continuous-token model built around a one-level Haar DWT/IDWT frontend, a shared coefficient-token layout, optional structural metadata, lightweight modality value adapters, and a shared token-wise encoder-decoder trunk. On Speech Commands, EuroSAT RGB, and DAVIS 2017 data, a dense shared model reaches 39.92 dB audio, 29.37 dB image, and 23.93 dB video PSNR. A matched-rate sweep under continuous latent scalar budgets indicates that the visual gains are not explained solely by latent capacity, while also showing that additive metadata embeddings are not a universal source of improvement. Finally, fixed-rate energy selection provides a strong non-parametric baseline: energy_global improves average PSNR over uniform selection by 16.73 dB for audio, 16.90 dB for images, and 15.86 dB for video under compressed keep ratios. Masked sparse training reaches 34.45 dB video PSNR with 50% of dense tokens. The results support a unified wavelet token schema and sparse token interface, while stopping short of establishing a universal discrete vocabulary.
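
To make the comparison between energy-based and uniform coefficient selection concrete, here is a minimal NumPy sketch of a one-level Haar DWT/IDWT on a 1-D signal, followed by both selection rules at a fixed keep ratio. It is only an illustration of the general idea: the function names, the synthetic signal, and the choice to rank approximation and detail coefficients together are assumptions, and the paper's `energy_global` rule and token layout may differ.

```python
import numpy as np


def haar_dwt(x):
    """One-level orthonormal Haar DWT of an even-length 1-D signal."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed even/odd samples."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def keep_top_energy(coeffs, keep_ratio):
    """Zero everything except the largest-magnitude fraction of coefficients."""
    k = max(1, int(round(keep_ratio * coeffs.size)))
    idx = np.argsort(np.abs(coeffs))[-k:]
    out = np.zeros_like(coeffs)
    out[idx] = coeffs[idx]
    return out


def keep_uniform(coeffs, keep_ratio):
    """Keep evenly spaced coefficients regardless of their magnitude."""
    k = max(1, int(round(keep_ratio * coeffs.size)))
    idx = np.linspace(0, coeffs.size - 1, k).round().astype(int)
    out = np.zeros_like(coeffs)
    out[idx] = coeffs[idx]
    return out


def psnr(ref, rec, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals with the given peak value."""
    mse = np.mean((ref - rec) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)


def reconstruct(signal, selector, keep_ratio):
    """Transform, sparsify with `selector`, and invert."""
    approx, detail = haar_dwt(signal)
    kept = selector(np.concatenate([approx, detail]), keep_ratio)
    n = len(approx)
    return haar_idwt(kept[:n], kept[n:])


# Synthetic test signal (an assumption, not one of the paper's datasets).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
signal = 0.5 + 0.4 * np.sin(2 * np.pi * 4 * t) + 0.02 * rng.standard_normal(256)
psnr_energy = psnr(signal, reconstruct(signal, keep_top_energy, 0.25))
psnr_uniform = psnr(signal, reconstruct(signal, keep_uniform, 0.25))
```

Because this Haar transform is orthonormal, the reconstruction error equals the energy of the dropped coefficients, so keeping the largest-magnitude coefficients can never do worse than any other selection of the same size. That is one plausible reason a non-parametric energy rule makes a strong baseline.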

2606.03878 2026-06-03 stat.ML cs.LG Updated version

Privacy-Robust Incrementality Measurement for Advertising Systems under Signal Loss

Prashant Shekhar, Caroline Howard

Affiliations Department of Mathematics, Embry-Riddle Aeronautical University

AI summary To address signal loss caused by privacy-preserving reporting systems, proposes a robust causal decision framework that projects the observation-compatible experimental worlds onto the incrementality functional and gives a sharp decision frontier, yielding certified, rejected, or unresolved incrementality decisions.

Abstract

Advertising platforms use randomized lift tests to measure incrementality, but privacy-preserving reporting systems degrade the observed signal through match-rate loss, linkability loss, attribution-window loss, aggregation-threshold suppression, randomized reporting noise, and segment-heterogeneous signal loss. This paper formulates privacy-constrained advertising measurement as a robust causal decision problem under these forms of signal loss. Given a randomized experiment and an ambiguity set for privacy-induced degradation, the framework projects the observation-compatible fiber of clean/unfiltered experimental worlds onto the incrementality functional and returns certified, rejected, and unresolved decisions. The main result gives a sharp decision frontier. Reports outside the frontier support uniformly valid certification or rejection, whereas reports inside it contain too little information for any method to uniformly distinguish above-threshold incrementality from non-incrementality. Supporting results give finite-sample certification, sample-complexity guarantees, a minimax lower bound showing that signal loss reduces effective information, and a reporting-granularity tradeoff. On 2.0M Criteo Uplift rows and the 64K-row Hillstrom email experiment, clean conversion lift is positive in both datasets, with lifts 0.00112 and 0.00495, respectively. Population certification survives mild degradation in Criteo and severe degradation in Hillstrom, while all considered finite-sample stress settings in both datasets remain unresolved after simultaneous uncertainty and reporting noise are included. Overall, the research contributes a decision-theoretic layer for privacy-aware incrementality measurement whose output is the strongest causal claim justified by degraded ads signals.
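
For orientation, the sketch below shows the clean (undegraded) version of the quantity being measured: conversion lift in a randomized test, with a three-way decision against a threshold. It is a deliberately simplified illustration using a normal-approximation interval; it does not model the privacy-induced ambiguity set or the decision frontier described in the abstract, and the function name, counts, and threshold are hypothetical.

```python
import math


def lift_decision(conv_treated, n_treated, conv_control, n_control,
                  threshold=0.0, z=1.96):
    """Absolute conversion lift with a normal-approximation interval.

    Returns (lift, (lower, upper), decision), where decision is
    "certified" if the whole interval lies above `threshold`,
    "rejected" if it lies entirely below, and "unresolved" otherwise.
    """
    rate_t = conv_treated / n_treated
    rate_c = conv_control / n_control
    lift = rate_t - rate_c
    # Standard error of a difference of two independent proportions.
    se = math.sqrt(rate_t * (1 - rate_t) / n_treated
                   + rate_c * (1 - rate_c) / n_control)
    lower, upper = lift - z * se, lift + z * se
    if lower > threshold:
        decision = "certified"
    elif upper < threshold:
        decision = "rejected"
    else:
        decision = "unresolved"
    return lift, (lower, upper), decision


# Hypothetical counts: the same lift is resolved only with enough data.
large = lift_decision(600, 100_000, 500, 100_000)
small = lift_decision(60, 10_000, 50, 10_000)
```

Even in this idealized setting a small sample leaves the decision unresolved; signal loss of the kinds listed in the abstract can only widen the set of experimental worlds compatible with a report.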

2606.03820 2026-06-03 stat.ML cs.LG Updated version

A Quantitative Approximation Framework for Flow Distillation in Diffusion Models

Weiguo Gao, Ming Li, Lei Shi, Hanfei Zhou

Affiliations School of Mathematical Sciences, Fudan University; Shanghai Key Laboratory of Contemporary Applied Mathematics, Fudan University

AI summary Proposes a quantitative approximation framework for flow distillation in diffusion models that views few-step sampling as error propagation under compositions of learned flow maps; theory and experiments show that a stability-balanced non-uniform time grid substantially reduces end-to-end relative MSE.

Abstract

We develop a quantitative approximation framework for diffusion distillation, viewing few-step sampling as error propagation under compositions of learned flow maps. Focusing on trajectory distillation for the probability-flow ODE, we show that local approximation errors can be strongly amplified in low-noise multimodal regimes, where the underlying dynamics become stiff. In an analytically tractable Gaussian-mixture Ornstein-Uhlenbeck setting, we separate two core difficulties: approximating the time-dependent score field and controlling the dynamical amplification governed by the time-integrated Jacobian bound of the probability-flow ODE. On the approximation side, we prove constructive $L^p(p_t)$ guarantees showing that ReLU-ReQU networks approximate the Gaussian-mixture score uniformly over time, with depth and width scaling polylogarithmically in the target accuracy and explicitly with the mixture geometry. On the stability side, we derive an explicit bound $L(t)$ for the spatial Lipschitz constant of the probability-flow velocity and convert it into a flow map stability estimate governed by $\int_s^t L(u)\,du$, making late-time amplification in stiff regimes computable. Building on these estimates, we prove that deep residual compositions efficiently approximate the long-horizon transport, with global error controlled by the stability amplification factor, and identify a Lipschitz-mismatch regime in which one-step distillation is structurally unfavorable. The resulting theory yields a stability-balanced non-uniform time grid obtained by uniform partitioning in the cumulative stability coordinate. Experiments support the prediction and reduce end-to-end relative MSE by up to 51.9% with 8 segments compared with uniform grids.
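
The phrase "uniform partitioning in the cumulative stability coordinate" can be read as follows: integrate the Lipschitz bound over time, split the total into equal parts, and map the split points back to time. The sketch below implements that reading with NumPy. The bound used here, L(t) = 1/t on [0.01, 1], is a hypothetical stand-in chosen because it grows in the low-noise regime; the paper's explicit bound and time convention are not reproduced.

```python
import numpy as np


def stability_balanced_grid(lipschitz, t_start, t_end, num_segments,
                            resolution=10_000):
    """Time grid whose segments carry equal shares of the integral of L(t).

    `lipschitz` must be a positive, vectorized function of time.
    Returns `num_segments + 1` increasing grid points from t_start to t_end.
    """
    t = np.linspace(t_start, t_end, resolution)
    values = lipschitz(t)
    # Cumulative integral of L by the trapezoidal rule.
    increments = 0.5 * (values[1:] + values[:-1]) * np.diff(t)
    cumulative = np.concatenate([[0.0], np.cumsum(increments)])
    # Equal steps in the cumulative coordinate, mapped back to time.
    targets = np.linspace(0.0, cumulative[-1], num_segments + 1)
    return np.interp(targets, cumulative, t)


# Hypothetical stiffness profile that blows up as t approaches 0.
grid = stability_balanced_grid(lambda t: 1.0 / t, 0.01, 1.0, num_segments=8)
uniform = np.linspace(0.01, 1.0, 9)
```

With this profile the grid is approximately geometric, placing short segments where L(t) is large and long ones where the dynamics are mild, in contrast to the evenly spaced `uniform` grid.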

2606.03736 2026-06-03 stat.ML cs.LG Updated version

Resource-Constrained Adaptive Inference for Sequential Pricing

Ruicheng Ao, Jiashuo Jiang, David Simchi-Levi

Affiliations Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02139; Department of Industrial Engineering and Decision Analytics, Hong Kong University of Science and Technology, Hong Kong; Department of Civil and Environmental Engineering and Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139

AI summary To address resource constraints that make fixed-price inference infeasible, proposes a target-aware pricing controller that certifies feasible target bands and logs continuous local densities, yielding studentized intervals based on localized debiasing, and analyzes the regret-information accounting.

Abstract

Resource-constrained pricing controllers can make fixed-price inference impossible: the controller's resource state may remove the target price neighborhood from the feasible set, even when every realized action has a known positive density. We formalize this support-exclusion failure through a local non-identification result and a realized information clock. We then design a target-aware pricing controller that certifies feasible target bands and logs continuous local densities. Localized debiasing gives studentized intervals whose width is governed by this clock. The resulting regret-information accounting, stated up to pilot re-solving error, shows that cheap exploration can be insufficient for inference: polynomial target mass gives polynomial rates, while a pure $1/t$ target branch does not yield shrinking fixed-target intervals without additional local movement. Experiments show calibration in certified bands and diagnostic abstention when the resource state collapses target support.

2606.03600 2026-06-03 stat.ML cs.LG Updated version

Set-Preserving Calibration from Conformal P-Values to E-Values

Nabil Alami, Jad Zakharia, Souhaib Ben Taieb

Affiliations ETH Zurich

AI summary To address the limitations of p-to-e conversion in conformal prediction, proposes a set-preserving P2E calibrator that converts conformal p-values into e-values without changing the prediction set, attaining the desired coverage and improved efficiency in cross-conformal prediction and conformal aggregation.

Abstract

Standard conformal prediction (CP) procedures are typically formulated in terms of p-values, but reliance on p-values alone limits flexibility, for example, when combining dependent evidence across models or data splits. Recent work has explored e-value formulations for conformal inference, yet a direct connection between p- and e-value formulations in CP has been missing, especially regarding their statistical efficiency. We first identify limitations of classical p-to-e calibrators in the CP setting, showing that they are not set-preserving and can lead to overly conservative prediction sets. To address this, we propose a novel P2E calibrator that converts conformal p-values into e-values without altering the prediction set induced by the original conformal p-value. We establish both theoretically and empirically that our calibrator can yield significant efficiency gains over existing p-to-e calibrators. This e-value formulation enables principled use of recent advances in e-value merging and randomization, and we demonstrate its impact in two applications: cross-conformal prediction (CCP), whose variants typically provide only approximate $1-2\alpha$ coverage, and conformal aggregation (CA). In both cases, our e-value-based methods satisfy the desired $1-\alpha$ coverage guarantee while improving efficiency over standard baselines. More broadly, our approach expands the flexibility of CP and opens new directions for efficient, distribution-free uncertainty quantification.
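
The claim that classical calibrators are not set-preserving can be checked with a small example. The sketch below computes a split-conformal p-value and applies the well-known power calibrator e = kappa * p^(kappa - 1) with 0 < kappa < 1, then compares the prediction set {y : p(y) > alpha} with the e-value set {y : e(y) < 1/alpha}. The paper's own P2E calibrator is not shown; the function names, candidate labels, and p-values here are illustrative assumptions.

```python
import numpy as np


def conformal_p_value(cal_scores, test_score):
    """Split-conformal p-value from calibration nonconformity scores."""
    cal_scores = np.asarray(cal_scores, dtype=float)
    return (1 + np.sum(cal_scores >= test_score)) / (len(cal_scores) + 1)


def power_calibrator(p, kappa=0.5):
    """Classical p-to-e calibrator; integrates to 1 over p in [0, 1]."""
    return kappa * p ** (kappa - 1.0)


def prediction_set_from_p(p_values, alpha):
    """Labels whose conformal p-value exceeds alpha."""
    return {y for y, p in p_values.items() if p > alpha}


def prediction_set_from_e(p_values, alpha, kappa=0.5):
    """Labels whose calibrated e-value stays below 1 / alpha."""
    return {y for y, p in p_values.items()
            if power_calibrator(p, kappa) < 1.0 / alpha}


# Hypothetical conformal p-values for four candidate labels.
candidate_p = {"a": 0.50, "b": 0.08, "c": 0.02, "d": 0.002}
alpha = 0.1
set_p = prediction_set_from_p(candidate_p, alpha)
set_e = prediction_set_from_e(candidate_p, alpha)
```

With kappa = 0.5 the e-value threshold is equivalent to p > alpha^2 / 4 = 0.0025, so `set_e` contains labels that the p-value rule at alpha = 0.1 excludes. The prediction set grows, which is the conservativeness the abstract refers to.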

2606.03574 2026-06-03 stat.ML cs.LG Updated version

Few-Shot Prediction for Pulsar Noise with Long Short-Term Memory Network

Qingye Tang, Dechao An, Haoran Peng, Yuqi Ouyang

Affiliations College of Computer Science, Sichuan University; College of Physics, Sichuan University

AI summary To address the scarcity of pulsar timing data, proposes an LSTM network optimized with model-agnostic meta-learning that adapts quickly to new frequency domains from only a few real timing residuals, with hyperparameters tuned automatically by particle swarm optimization; on the IPTA dataset it achieves accurate predictions while fine-tuning on only 10% of the target-domain timing residuals.

Abstract

This work proposes a novel solution to predict pulsar timing residuals with limited data, addressing the critical challenge of data scarcity across spin-frequency subgroups of millisecond pulsars in PTA datasets. The proposed solution applies a Long Short-Term Memory (LSTM) network optimized using the model-agnostic meta-learning algorithm, enabling rapid adaptation to a new frequency domain by fine-tuning the LSTM network with only a few ground-truth timing residuals. A particle swarm optimization algorithm is also used for automatic hyperparameter optimization, leading to improved prediction accuracy. Our solution, evaluated on the second data release of the International Pulsar Timing Array (IPTA), demonstrates robust generalization with accurate predictions in three metrics across high-frequency test frequency domains, while requiring only 10% of the timing residuals from these domains for model fine-tuning. Furthermore, our lightweight structure requires only 16.86 MB of CPU memory and 18 milliseconds for single-step residual prediction. All these characteristics make our solution highly suitable for real-world applications where effective, real-time predictions of pulsar timing residuals are essential, particularly in resource-constrained environments with limited computational power, memory, or energy availability.

2606.03553 2026-06-03 stat.ML cs.LG math.OC Updated version

A Robust Optimization Approach to Sparse Principal Component Analysis

David Vävinggren, Francis Bach, André M. H. Teixeira, Dave Zachariah, Antônio H. Ribeiro

Affiliations Uppsala University, Sweden; PSL Research University / INRIA, France; Science for Life Laboratory, Sweden

AI summary Proposes AdvPCA, which uses robust optimization to obtain sparsity by introducing worst-case latent-space perturbations into the reconstruction objective, and derives a closed-form reduction and an iterative algorithm.

Abstract

While principal component analysis (PCA) is a fundamental tool for dimensionality reduction, its dense representations make it ill-suited for high-dimensional data. Existing methods address this by promoting sparsity through explicit $\ell_1$-penalties, but these are not obvious to tune due to the unsupervised nature of the task. In contrast, we propose Adversarial PCA (AdvPCA), which leverages robust optimization to achieve sparsity by optimizing the reconstruction objective against bounded, worst-case latent space perturbations. We show that this formulation admits a closed-form reduction, leading to a practical iterative algorithm that alternates between adversarial linear regression-style updates for the sparse encoder and orthogonal updates for the decoder. By theoretically characterizing the solution, we derive a data-adaptive parameterization that allows the algorithm to perform effectively out of the box. We validate these claims through numerical experiments on synthetic and real-world genomics data.

2606.03292 2026-06-03 stat.ML cs.LG Updated version

Combining Statistical Features and Deep Encodings for Rehearsal-Based Class-Incremental Time Series Classification

Pablo García-Santaclara, Bruno Fernández-Castro, Rebeca Pilar Díaz-Redondo

Affiliations atlanTTic – ICLAB, Universidade de Vigo; Centro Tecnolóxico de Telecomunicacións de Galicia (GRADIANT); Universidade de Vigo

AI summary Proposes a dual-stream feature extraction pipeline (combining deep temporal embeddings from a pre-trained, frozen foundation model with statistical features) for class-incremental continual learning on multivariate time series, achieving competitive average accuracy and low forgetting on five benchmark datasets.

Abstract

Many systems used in real-world environments require adding new categories and incorporating new information without forgetting what was previously learnt by the classification model. This is known as class-incremental continual learning, and in the case of multivariate time series it is further complicated by the temporal structure of the data. In this paper, we present a novel approach to class-incremental continual learning for the classification of multivariate time series data, based on a dual-stream feature extraction pipeline that combines deep temporal embedding features generated by a pre-trained, frozen foundation model with statistical features. Evaluated on five benchmark datasets, the proposed system achieves competitive average accuracy across all datasets while maintaining low forgetting rates across all experimental configurations.

2606.03245 2026-06-03 stat.ML cs.LG Updated version

Hierarchies of Calibration: Classification meets Regression

Johannes Resin, Lu Yang, Tilmann Gneiting

Affiliations Goethe University Frankfurt; University of Minnesota; Heidelberg Institute for Theoretical Studies; Karlsruhe Institute of Technology

AI summary Reviews, extends, and bridges notions of calibration for classification and regression tasks, focusing on the hierarchical relations among them, and introduces modal calibration for nominal outcomes, distinguishing full, partial, and average calibration in that setting.

Abstract

Concepts of calibration formalize the compatibility between probabilistic predictions and the respective outcomes. In a nutshell, the outcomes ought to be indistinguishable from random draws from the predictive distributions. In this paper, we review, extend, and bridge notions of calibration that have been proposed for classification and regression tasks. Particular emphasis is given to hierarchical relations between the various notions, as they apply to general real-valued data, continuous outcomes, count data, nominal classes, and binary outcomes. To highlight a number of contributions, we introduce the notion of modal calibration for nominal outcomes, we distinguish full, partial, and average calibration in this setting, and we show that double probability integral transform (PIT) calibration is logically independent of previously proposed concepts of calibration for discrete outcomes. Furthermore, we generalize extant results on concepts of calibration that are expressed in terms of properties or functionals of the predictive distributions, such as means, quantiles, or event probabilities. Throughout the paper, we illustrate the concepts and their hierarchical relations in worked examples, and we provide algorithmic tools that support the construction of instructive examples and counterexamples.
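
The idea that outcomes should look like random draws from the predictive distributions has a standard diagnostic for continuous outcomes: the probability integral transform (PIT), which is uniform on [0, 1] when the forecast is calibrated in this sense. The sketch below illustrates ordinary PIT values for Gaussian forecasts only; it does not implement the double PIT or the discrete-outcome notions discussed in the paper, and the simulated data and function names are assumptions.

```python
import math

import numpy as np


def normal_cdf(x, mean=0.0, std=1.0):
    """CDF of a normal distribution, evaluated with math.erf."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def pit_values(outcomes, forecast_std):
    """PIT of each outcome under a N(0, forecast_std^2) forecast."""
    return np.array([normal_cdf(y, 0.0, forecast_std) for y in outcomes])


# Simulated outcomes from N(0, 1), scored under two forecasters.
rng = np.random.default_rng(0)
outcomes = rng.standard_normal(20_000)
pit_calibrated = pit_values(outcomes, forecast_std=1.0)     # correct spread
pit_overdispersed = pit_values(outcomes, forecast_std=2.0)  # too wide

# A uniform variable on [0, 1] has variance 1/12.
var_calibrated = pit_calibrated.var()
var_overdispersed = pit_overdispersed.var()
```

The calibrated forecaster yields PIT values with variance close to 1/12, while the overdispersed one concentrates them around 0.5 and so shows a smaller variance. A PIT histogram would make the same point visually.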

2606.03217 2026-06-03 stat.ML cond-mat.dis-nn cs.LG Updated version

An Asymptotic Theory of Chain-of-Thought in In-Context Learning

Kaito Takanami, Cengiz Pehlevan

Affiliations Department of Physics, Graduate School of Science, The University of Tokyo; John A. Paulson School of Engineering and Applied Sciences, Harvard University; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University; Center for Brain Science, Harvard University

AI summary Using high-dimensional random matrix theory, derives an exact formula for the generalization error of chain-of-thought in in-context learning for linear regression, revealing phase transitions governed by reasoning depth, pretraining data amount, and context length.

Abstract

Chain-of-thought (CoT) reasoning has become a widely used mechanism for eliciting multi-step reasoning in large language models by generating intermediate reasoning steps at inference time. Yet the scaling behavior of generalization with CoT depth remains poorly understood. To address this question, we study a theoretically solvable model of CoT for in-context weight prediction in linear regression, where test-time reasoning is represented as an iterative refinement of the weight-parameter estimate. Using tools from random matrix theory under high-dimensional asymptotics, we derive an exact formula for the generalization error as a function of reasoning depth, pretraining data amount, and context length. Our analysis reveals a sharp phase transition separating exponential and polynomial improvement, saturation, and overthinking, and characterizes how the optimal reasoning depth scales. We further show that deeper reasoning is most effective with sufficiently rich pretraining and in-context information, whereas limited pretraining or context makes longer reasoning prone to error amplification or saturation. We also validate these predictions through experiments on fully learned linear attention and softmax attention models. Our results provide a unified theoretical account of how test-time CoT depth affects generalization.
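
One simple way to picture "reasoning as iterative refinement of the weight estimate" is to treat each reasoning step as one gradient step on the in-context least-squares loss. The sketch below does exactly that in a noiseless setting. It is an assumed toy model for intuition only: the paper's solvable model, its learned update, and the effects of pretraining are not reproduced, and every name and constant here is hypothetical.

```python
import numpy as np


def refine_weights(X, y, depth, step_size=None):
    """Return the weight estimate after each of `depth` refinement steps.

    Each step is one gradient-descent update on the mean squared error of
    the in-context examples (X, y), starting from the zero vector.
    """
    n, d = X.shape
    covariance = X.T @ X / n
    if step_size is None:
        # 1 / (largest eigenvalue) keeps every error component from growing.
        step_size = 1.0 / np.linalg.eigvalsh(covariance).max()
    w = np.zeros(d)
    estimates = []
    for _ in range(depth):
        w = w + step_size * X.T @ (y - X @ w) / n
        estimates.append(w.copy())
    return estimates


# Noiseless in-context task with more examples than dimensions.
rng = np.random.default_rng(0)
n, d = 64, 8
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true
errors = [float(np.sum((w - w_true) ** 2))
          for w in refine_weights(X, y, depth=20)]
```

In this clean setting the error shrinks with every extra step. The regimes the abstract describes, such as saturation and overthinking, arise when context or pretraining is limited, which this toy example does not capture.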

2606.03112 2026-06-03 stat.AP cs.LG Updated version

Trans GAN-WT: A Feature Extraction and Interactive Learning-Based Anomaly Detection Model for Wind Turbine Time Series Data

Jingzhe Kang

AI summary Proposes TransGAN-WT, an anomaly detection model that fuses a Transformer with a generative adversarial network; by amplifying the reconstruction error, extracting multimodal features autoregressively, and learning interactions among temporal features, it reaches an average F1 score of 96.10% and a false positive rate of only 0.06% on real wind turbine datasets.

Abstract

With the increasing scale and number of wind farms, the daily operation and maintenance costs of wind turbines are rising. To reduce these costs and enhance the reliability of wind turbine and system operation data before catastrophic failures occur, it is crucial to monitor the operating status of the equipment and detect failures at an early stage. Using working-condition data to assess anomalies in the operating status of wind turbines, and thereby monitor that status, is of great practical significance. However, existing anomaly detection methods can neither perform effective relational modeling in data filled with a large amount of redundant information nor make reasonable use of valuable anomaly data. For this reason, this paper proposes an anomaly detection model that fuses a Transformer and a generative adversarial network. First, it reduces the missed-detection rate for minor-deviation anomalies by amplifying the reconstruction error. Second, it uses autoregressive inference to extract multimodal features, enhancing the stability and generalization ability of training. Finally, a temporal feature extraction module is constructed to promote interactive learning between features at different time scales and effectively reduce temporal redundancy. Experiments on real wind turbine generator (WTG) datasets show that TransGAN-WT achieves an average F1 score of 96.10% across multiple wind turbine datasets, which is 5.84% and 2.89% higher than several other state-of-the-art baseline methods. It also achieves a false positive rate (FPR) of 0.06%, and a Wilcoxon signed-rank test confirms a statistically significant improvement over the state-of-the-art baseline methods, effectively ensuring the stable operation of wind turbines.

2606.03018 2026-06-03 stat.ME cs.LG math.ST stat.ML stat.TH version update

A Fast Screening Approach for High-dimensional Outcomes and High-dimensional Predictors

高维结果与高维预测变量的快速筛选方法

Hongju Park, Zhenyao Ye, Shuo Chen

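AI summary Proposes the Graph Independence Dual Screening (GIDS) framework, which reduces the dimensionality of response variables and predictors simultaneously to address the computational burden and poor interpretability of high-dimensional cross-modal analysis.

Comments 38 pages, 2 figures

Details
AI Chinese abstract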

由于超高维度和复杂依赖结构伴随高水平噪声,对多模态高维数据间的交互建模本质上具有挑战性。筛选方法能有效降低维度,但大多数现有方法仅缩减预测变量空间而保留所有结果变量。在交叉模态分析中,不同结果变量通常选择不同的预测变量子集,因此并集仍然很大且响应维度不变,限制了筛选的实际效益。这导致沉重的计算负担和较差的可解释性。为解决这些局限,我们提出一个新的筛选框架——图独立双筛选(GIDS),它同时降低响应变量和预测变量的维度。我们设计了计算高效的算法,促进后续选择过程,提高准确性和可扩展性,并建立了支持性的理论结果。广泛的模拟研究表明,GIDS优于仅筛选预测变量的现有方法。为展示其实用性,我们将GIDS应用于阿尔茨海默病神经影像学倡议(ADNI)数据集,分析全基因组865,353个DNA甲基化与49,386个转录组变量之间的交互。GIDS将特征空间缩减至约9,000个CpG位点和2,000个转录本,揭示了块状交互结构:具有强关联的CpG位点簇和基因转录本簇。这些发现不仅提高了计算可处理性,还产生了可解释的生物学见解,突显了阿尔茨海默病背后的协调调控机制。

English abstract

Modeling interactions among multimodal, high-dimensional data is intrinsically challenging because of ultra-high dimensionality and complex dependence structures with high noise levels. Screening methods are effective for reducing dimensionality, but most existing approaches shrink only the predictor space while retaining all outcomes. In cross-modal analyses, different outcomes often select different predictor subsets, so the union remains large and the response dimension is unchanged, limiting the practical benefit of screening. This gives rise to heavy computational burdens and poor interpretability. To address these limitations, we propose a new screening framework, Graph Independence Dual Screening (GIDS), which simultaneously reduces the dimensionality of response variables and predictors. We design computationally efficient algorithms that facilitate downstream selection procedures, improving accuracy and scalability, and establish supporting theoretical results. Extensive simulation studies demonstrate that GIDS outperforms existing methods that screen only predictors. To illustrate its utility, we applied GIDS to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, analyzing interactions between 865,353 genome-wide DNA methylation variables and 49,386 transcriptomic variables. GIDS reduced the feature space to approximately 9,000 CpGs and 2,000 transcripts, uncovering blockwise interaction structures: clusters of CpG sites and gene transcripts with strong associations. These findings not only improve computational tractability but also yield interpretable biological insights, highlighting coordinated regulatory mechanisms underlying Alzheimer's disease.

2606.02909 2026-06-03 stat.ML cs.LG version update

Scalable Derivative Gaussian Processes via Exact Gradient Reduction

可扩展的导数高斯过程通过精确梯度约简

Hyunseok Seung, Matthias Katzfuss

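Affiliations * Department of Statistics, University of Wisconsin–Madison

AI summary Proposes TERA, which uses exact gradient reduction to cut the computational cost of derivative Gaussian processes from O(n^3 d^3) to O(d m^2 + m^6) per target, enabling scalable inference in high-dimensional spaces.

Details
AI Chinese abstract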

梯度观测可以显著改善高斯过程(GP)代理,特别是在函数评估昂贵的高维设置中。然而,对n个函数值和n个完整梯度(d维)进行精确推理的计算复杂度与联合状态大小呈三次方关系,导致难以处理的O(n^3 d^3)计算瓶颈。我们提出TERA,一种基于目标特定精确梯度约简的高度可扩展导数GP方法。我们证明,对于平稳核,与连接目标和条件点的方向正交的梯度分量在条件上独立于目标函数值;因此,一旦指定了大小为m的条件集,精确条件密度完全由至多m^2个方向导数刻画。通过将这些约简的、无维度的条件作为Vecchia近似中的局部因子,TERA有效地将n和d从稠密矩阵求逆中解耦。这将每个目标的评估成本降低到O(d m^2 + m^6)时间和O(d m^2 + m^4)内存,同时保持底层导数GP模型在数学上不变。实验评估表明,TERA实现了最先进的预测精度,同时比标准导数GP快数个数量级。关键的是,计算时间和峰值GPU内存相对于d基本保持平稳,从而在高维空间中实现高度可扩展的推理。

English abstract

Gradient observations can substantially improve Gaussian process (GP) surrogates, particularly in high-dimensional settings where function evaluations are expensive. However, exact inference with $n$ function values and $n$ full gradients in $d$ dimensions scales cubically in the joint state size, imposing an intractable $\mathcal{O}(n^3 d^3)$ computational bottleneck. We introduce TERA, a highly scalable derivative GP method based on target-specific exact gradient reduction. We prove that for stationary kernels, the gradient components orthogonal to the directions connecting the target and conditioning points are conditionally independent of the target function value; consequently, the exact conditional density is fully characterized by at most $m^2$ directional derivatives once a conditioning set of size $m$ is specified. By using these reduced, dimension-free conditionals as local factors in a Vecchia approximation, TERA effectively decouples $n$ and $d$ from the dense matrix inversion. This reduces the per-target evaluation cost to $\mathcal{O}(dm^2 + m^6)$ time and $\mathcal{O}(dm^2 + m^4)$ memory, leaving the underlying derivative GP model mathematically unchanged. Empirical evaluations demonstrate that TERA achieves state-of-the-art predictive accuracy while operating orders of magnitude faster than standard derivative GPs. Crucially, both computation time and peak GPU memory remain essentially flat with respect to $d$, enabling highly scalable inference in high-dimensional spaces.
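To give a rough sense of the scaling gap stated in the abstract, the sketch below evaluates the two leading-order cost expressions, $n^3 d^3$ and $d m^2 + m^6$, for illustrative sizes. The function names and the sample values of `n`, `d`, and `m` are assumptions made for this example and do not come from the paper; constants hidden by the big-O notation are ignored, so the numbers are indicative only.

```python
def exact_derivative_gp_cost(n: int, d: int) -> int:
    """Leading-order cost of exact inference: cubic in the joint state size n*d."""
    return (n * d) ** 3


def reduced_per_target_cost(d: int, m: int) -> int:
    """Leading-order per-target cost quoted in the abstract: d*m^2 + m^6."""
    return d * m**2 + m**6


if __name__ == "__main__":
    n, d, m = 1000, 100, 10  # illustrative sizes, not taken from the paper
    exact = exact_derivative_gp_cost(n, d)
    per_target = reduced_per_target_cost(d, m)
    print(f"exact: {exact:.3e}, per target: {per_target:.3e}")
```

Note that the reduced cost is per target, so a full pass over many targets multiplies it by the number of targets; the dimension `d` still enters only linearly.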

2606.02740 2026-06-03 stat.ML cs.LG version update

ScoreStop: Gradient-based early stopping using functional score tests

ScoreStop: 基于梯度的早期停止方法使用函数得分检验

Oliver J. Hines, Christian L. Hines

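Affiliations * University of California, Berkeley

AI summary Proposes ScoreStop, which uses a functional score test at each iteration to test whether the current predictor is the population risk minimizer, giving a gradient-based early-stopping rule for gradient boosted decision trees that avoids overfitting.

Comments Presented at the International Conference on Machine Learning 2026 Workshop on Hypothesis Testing

Details
AI Chinese abstract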

梯度提升决策树需要停止规则以避免过拟合。标准规则监控验证损失,如果损失在固定的耐心期内没有改善则停止。然而,耐心参数没有可解释的尺度,验证损失可能带有噪声或由用户指定的梯度隐式定义。我们提出ScoreStop,一种基于梯度的早期停止规则,将每次迭代的停止决策视为检验当前预测器是否为总体风险最小化器的原假设。我们使用在验证数据上计算的函数得分检验,其统计量在更新方向上具有尺度不变性,并且在原假设下具有已知的渐近分布。由于我们的检验使用梯度而非损失值,相同的构造适用于隐式损失(如LambdaRank)和通过影响函数的数据依赖损失(如Cox回归)。在合成实验和真实数据基准测试中,我们展示了ScoreStop与基于损失的方法相比具有竞争力。

English abstract

Gradient boosted decision trees require a stopping rule to avoid overfitting. The standard rule monitors a validation loss and stops if the loss fails to improve for a fixed patience period. However, the patience parameter has no interpretable scale, and validation losses can be noisy or implicitly defined by a user-specified gradient. We propose ScoreStop, a gradient-based early-stopping rule that casts the stopping decision at each iteration as a test of the null hypothesis that the current predictor is the population risk minimizer. We use a functional score test, computed on validation data, whose statistic is scale-invariant in the update direction and has a known asymptotic distribution under the null. Because our test uses gradients rather than loss values, the same construction applies to implicit losses such as LambdaRank, and to data-dependent losses such as Cox regression via influence functions. In synthetic experiments and real-data benchmarks, we show that ScoreStop is competitive with loss-based methods.
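For reference, the standard patience rule that the abstract contrasts with can be sketched as follows. This is a minimal illustration of the loss-based baseline only, not of ScoreStop itself; the function name `patience_stop_iteration` and the convention that an iteration counts as an improvement only when it strictly beats the running minimum are assumptions of this sketch.

```python
from typing import Optional, Sequence


def patience_stop_iteration(val_losses: Sequence[float], patience: int) -> Optional[int]:
    """Return the iteration at which a patience rule stops, or None if it never does.

    Training stops at the first iteration where the validation loss has failed
    to improve on its running minimum for `patience` consecutive iterations.
    """
    best = float("inf")
    since_improvement = 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            # New running minimum: reset the counter.
            best = loss
            since_improvement = 0
        else:
            since_improvement += 1
            if since_improvement >= patience:
                return i
    return None
```

The stopping point depends on `patience`, which is measured in iterations rather than on any statistical scale; this is the interpretability concern the abstract raises.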

2606.02664 2026-06-03 stat.ML cs.LG version update

State-Coupled Volatility in Latent Dynamical Systems: Recovery Under Partial Observation

潜变量动力系统中的状态耦合波动性:部分观测下的恢复

Imani Beckett

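Affiliations * The Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego

AI summary Proposes a state-coupled stochastic volatility framework that uses a particle expectation-maximization algorithm to estimate, under partial observation, how latent process variance depends on displacement from equilibrium, and validates recovery and detection performance in simulation.

Comments 40 pages, 16 figures

Details
AI Chinese abstract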

潜状态空间模型广泛用于研究部分观测的动力系统,但大多数公式假设过程变异性与潜状态位置无关。然而,在许多生物、行为和生理系统中,变异性可能系统地依赖于潜在动力状态,产生恒定方差模型无法捕捉的结构化随机性。我们引入了一个状态耦合随机波动框架,其中潜过程方差取决于与潜平衡点的位移。为了在部分观测下估计这种关系,我们开发了一种粒子期望最大化程序,结合了引导粒子滤波和反向轨迹平滑。该模型包含一个耦合参数 $\gamma$,用于量化潜状态位置与过程变异性之间的关联强度。一个大规模仿真基准评估了在不同耦合强度、观测噪声水平、轨迹长度和持续性机制下的恢复和检测性能。与基于观测状态的异方差代理相比,所提出的框架一致地减少了恢复偏差,在强耦合下改进最大。恢复性能随着潜持续性的增加而提高,而检测性能在广泛条件下保持竞争力,并随着观测噪声的增加而变得更加有利。综合来看,结果表明当明确建模潜状态结构时,可以在部分观测下识别和估计状态耦合波动性。该框架为研究状态依赖变异性以及评估结构化随机性是否提供超出平均状态轨迹所包含的系统动力学信息提供了实用的方法论基础。

English abstract

Latent state-space models are widely used to study partially observed dynamical systems, yet most formulations assume that process variability is independent of latent-state position. In many biological, behavioral, and physiological systems, however, variability may depend systematically on the underlying dynamical state, producing structured stochasticity that is not captured by constant-variance models. We introduce a state-coupled stochastic volatility framework in which latent process variance depends on displacement from a latent equilibrium. To estimate this relationship under partial observation, we develop a particle expectation-maximization procedure combining bootstrap particle filtering and backward trajectory smoothing. The model includes a coupling parameter, $\gamma$, that quantifies the strength of association between latent-state position and process variability. A large-scale simulation benchmark evaluated recovery and detection performance across varying coupling strengths, observation noise levels, trajectory lengths, and persistence regimes. The proposed framework consistently reduced recovery bias relative to an observed-state heteroskedastic proxy, with the largest improvements occurring under strong coupling. Recovery performance improved with increasing latent persistence, while detection performance remained competitive across a broad range of conditions and became increasingly advantageous as observation noise increased. Taken together, the results demonstrate that state-coupled volatility can be identified and estimated under partial observation when latent-state structure is explicitly modeled. The framework provides a practical methodological foundation for studying state-dependent variability and evaluating whether structured stochasticity contributes information about system dynamics beyond that contained in mean-state trajectories alone.

2606.02645 2026-06-03 stat.ML cs.AI cs.LG version update

Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics

目标更新可能稳定线性Q学习:周期性和软动态

Donghwan Lee

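Affiliations * School of Electrical Engineering, KAIST

AI summary Using exact switched linear system dynamics and joint spectral radius analysis, the paper proves that, under specific spectral and step-size conditions, periodic hard target updates and soft target updates guarantee convergence of linear Q-learning to the exact projected Q-Bellman solution.

Details
AI Chinese abstract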

Q学习中的周期性目标更新和actor-critic方法中的软目标更新是经验上公认的稳定机制,但其精确的理论解释仍不完整。本文针对线性函数逼近的Q学习(线性Q学习),利用Bellman最大值引起的精确切换线性系统(SLS)动力学以及由此产生的切换矩阵族的联合谱半径(JSR),对这些机制进行了严格而精确的分析。尽管线性Q学习通常可能无法收敛,但我们证明,在明确的谱和步长条件下,周期性硬目标更新和软目标更新可以保证收敛到精确的投影Q-Bellman解。主要分析针对确定性线性Q学习进行,其中目标更新机制最为透明。一旦为均值递归建立了相应的JSR证书,随机强化学习设置可以通过将确定性模式替换为采样随机模式并添加相应的随机噪声分析来处理。

English abstract

Periodic target updates in Q-learning and soft target updates in actor-critic methods are empirically well-established stabilization mechanisms, but their precise theoretical explanation is still incomplete. This paper gives a rigorous and exact analysis of these mechanisms for Q-learning with linear function approximation (linear Q-learning) using the exact switched linear system (SLS) dynamics induced by the Bellman maximum and the joint spectral radius (JSR) of the resulting switching matrix families. Although linear Q-learning can fail to converge in general, we prove that, under explicit spectral and step-size conditions, periodic hard target updates and soft target updates can guarantee convergence to the exact projected Q-Bellman solution. The main analysis is carried out for deterministic linear Q-learning, where the target-update mechanism is most transparent. Once the corresponding JSR certificate is established for the mean recursion, the stochastic reinforcement-learning setting can be treated by replacing deterministic modes with sampled stochastic modes and adding the corresponding stochastic-noise analysis.
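The two target-update mechanisms named in the abstract are commonly written as the rules below. This is a minimal sketch of the usual textbook forms on plain weight vectors, assuming a mixing rate `tau` and an update `period`; the function names are hypothetical, and the paper's exact parameterization may differ.

```python
from typing import List


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Soft (Polyak) update: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]


def periodic_hard_update(
    target: List[float], online: List[float], step: int, period: int
) -> List[float]:
    """Copy the online weights into the target every `period` steps; otherwise keep the target."""
    return list(online) if step % period == 0 else list(target)
```

With `tau = 1` the soft rule copies the online weights at every step, and with `period = 1` the hard rule does the same, so both families include the no-target-network case as a limit.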

2606.02632 2026-06-03 stat.ML cs.AI cs.CY cs.LG econ.EM stat.AP version update

Position: Prioritize Identifying Structure, Not Complex Models, for Scientific Discovery

立场:优先识别结构,而非复杂模型,以促进科学发现

Tyler H. McCormick

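Affiliations * GitHub

AI summary Argues that modern machine learning is generically underdetermined in high-dimensional proxy regimes and proposes concrete standards for "mechanistic ML" so that LLM-centered workflows genuinely support science rather than merely simulate it.

Comments Will appear as a position paper in ICML

Details
AI Chinese abstract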

现代机器学习(ML)和人工智能(AI)模型,特别是大型语言模型(LLMs),越来越多地被用于从观测数据中生成科学假设和机制解释。这篇立场论文认为,在现代ML擅长的高维代理机制中,机制性学习通常是欠定的:许多不相容的机制在数据支撑上诱导出本质上相同的观测关系,因此预测成功和连贯的解释并不足以作为机制发现的证据。这种欠定性在大型语言模型(LLMs)中变得尤为危险,因为它们倾向于将大量等价的解释类压缩成一个流畅的叙述。本文提出了“机制性机器学习”的具体标准,并论证如果以LLM为中心的工作流要支持科学而非仅仅模拟科学,这些标准是必要的。

English abstract

Modern Machine Learning (ML) and Artificial Intelligence (AI) models, especially large language models (LLMs), are increasingly used to generate scientific hypotheses and mechanistic explanations from observational data. This position paper argues that in the high-dimensional proxy regimes where modern ML excels, mechanistic learning is generically underdetermined: many incompatible mechanisms induce essentially the same observational relationships on the support of the data, so predictive success and coherent explanations are insufficient evidence of mechanism discovery. This underdetermination becomes uniquely hazardous with LLMs, which tend to collapse large equivalence classes of explanations into a single fluent narrative. This paper proposes concrete standards for "mechanistic ML" and argues that these norms are necessary if LLM-centered workflows are to support science rather than merely simulate it.

2606.03184 2026-06-03 q-fin.CP cs.LG q-fin.ST version update

FinStressTS: A Parametric Synthetic Benchmark for Time-Series Forecasting in Finance

FinStressTS: 金融时间序列预测的参数化合成基准

Jiaze Sun, Kelvin J. L. Koa, Ruiyang Ni, Yize Liu, Haonan Chen, Ke-Wei Huang

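Affiliations * National University of Singapore; Asian Institute of Digital Finance; Nanyang Technological University

AI summary To address weak signals and complex mechanisms in financial forecasting, proposes the FinStressTS synthetic benchmark, which uses 30 diagnostic environments to systematically evaluate 15 models on point and probabilistic forecasting, revealing how model performance depends on the data-generating mechanism.

Comments KDD 2026 (Oral)

Details
AI Chinese abstract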

金融预测因信噪比低、潜在因子、重尾、机制转换和跳跃而困难。真实世界基准提供的故障归因有限:研究人员可以观察到表现不佳,但往往无法隔离原因,因为机制不可观察且纠缠。真实金融数据仅揭示一条实现路径,使得评估尾部风险校准或数据效率变得困难。我们引入FinStressTS,一个机制感知的合成基准,将模型行为与受控的结构原因联系起来。FinStressTS包含围绕六个机制族(波动率聚类、多尺度持续性、重尾冲击、机制转换、自激跳跃和零膨胀过程)的30个诊断环境。我们评估两个任务:点预测(使用五种设置下的NMAE)和概率预测(在已知数据生成机制下使用CRPS)。我们对15个模型进行基准测试,从经典方法(HAR、VAR)到Transformer预测器(PatchTST、iTransformer)和深度概率架构(DeepAR、TSFlow),并使用学习曲线衡量样本效率。我们的评估揭示了三个见解。首先,性能依赖于机制:自回归和线性模型在多个波动率、尾部和跳跃驱动的环境中具有很强的竞争力,并且通常优于基于Transformer的模型。其次,分布对齐很重要:诸如DeepAR之类的参数化概率模型在平稳设置中校准良好,而灵活模型在分布变为多模态或稀疏时可能有所帮助。第三,神经网络模型通常需要更多数据才能匹配简单基线,主要在学习潜在机制或复杂分布时获得更大收益。FinStressTS提供了一个用于诊断故障模式和推进风险感知预测的开放框架。

English abstract

Financial forecasting is difficult due to low signal-to-noise ratios, latent factors, heavy tails, regime shifts, and jumps. Real-world benchmarks offer limited failure attribution: researchers can observe underperformance, but often cannot isolate why because mechanisms are unobservable and entangled. Real financial data reveal only one realized path, making it difficult to assess tail-risk calibration or data efficiency. We introduce FinStressTS, a mechanism-aware synthetic benchmark that links model behavior to controlled structural causes. FinStressTS comprises 30 diagnostic environments around six mechanism families: volatility clustering, multi-scale persistence, heavy-tailed shocks, regime switching, self-exciting jumps, and zero-inflated processes. We evaluate two tasks: point forecasting, using NMAE across five settings, and probabilistic forecasting, using CRPS under known data-generating mechanisms. We benchmark 15 models, from classical methods (HAR, VAR) to Transformer forecasters (PatchTST, iTransformer) and deep probabilistic architectures (DeepAR, TSFlow), and use learning curves to measure sample efficiency. Our evaluation reveals three insights. First, performance is mechanism-dependent: autoregressive and linear models are highly competitive, and often outperform Transformer-based models, in several volatility-, tail-, and jump-driven environments. Second, distributional alignment matters: parametric probabilistic models such as DeepAR calibrate well in stationary settings, while flexible models can help when distributions become multimodal or sparse. Third, neural models often require more data to match simple baselines, with larger gains mainly when learning latent regimes or complex distributions. FinStressTS provides an open framework for diagnosing failure modes and advancing risk-aware forecasting.
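The probabilistic task above is scored with CRPS. As a hedged illustration, the sketch below computes the standard sample-based CRPS estimate, $\mathbb{E}|X - y| - \tfrac{1}{2}\mathbb{E}|X - X'|$, from forecast draws. The function name `crps_from_samples` is an assumption of this example, and the benchmark's own implementation (for instance, a closed form under the known data-generating mechanism) may differ.

```python
from typing import Sequence


def crps_from_samples(samples: Sequence[float], y: float) -> float:
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    n = len(samples)
    # Average distance between forecast draws and the realized value.
    term_obs = sum(abs(x - y) for x in samples) / n
    # Average distance between pairs of forecast draws (forecast spread).
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread
```

For a single-draw (point) forecast the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.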

2606.02629 2026-06-03 q-bio.QM cs.AI cs.LG version update

Enhancing Protein-Protein Interaction Prediction with Hierarchical Motif-based Multimodal Protein Embedding

基于层次化基序的多模态蛋白质嵌入增强蛋白质-蛋白质相互作用预测

Zaifei Yang, Samuel Ping-Man Choi, James Kwok

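Affiliations * National University of Singapore; University of California, Los Angeles

AI summary Proposes MMM-PPI, which integrates sequence, structure, and function information through hierarchical motif-based multimodal encoding at three scales (micro, meso, macro) to improve protein-protein interaction prediction.

Details
AI Chinese abstract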

蛋白质-蛋白质相互作用(PPIs)对许多生物过程至关重要。然而,现有的PPI预测方法存在两个主要局限性:它们忽略了蛋白质的层次组织,特别是关键调控PPIs的中观尺度基序,并且未能有效整合序列、结构和功能模态。为了解决这些局限性,我们提出了MMM-PPI,一种基于层次化基序的多模态蛋白质编码器用于PPI预测,该编码器以自底向上的多模态方式在三个尺度上构建PPI嵌入。在微观尺度上,我们编码三种模态的残基特征;在中观尺度上,一种新颖的多模态基序编码器将残基聚合成空间感知的基序嵌入;在宏观尺度上,一种多模态蛋白质编码器通过联合建模基序重要性和模态间相关性将基序整合为蛋白质嵌入。预训练的编码器可直接用于大规模PPI预测。在多个PPI数据集上的大量实验表明,MMM-PPI优于最先进的多标签PPI预测模型,特别是在具有挑战性的数据划分和有限数据场景下。代码见此链接。

English abstract

Protein-protein interactions (PPIs) are essential for many biological processes. However, existing PPI prediction approaches suffer from two major limitations: they overlook the hierarchical organization of proteins, particularly meso-scale motifs that critically regulate PPIs, and they fail to effectively integrate sequence, structure, and function modalities. To address these limitations, we propose MMM-PPI, a Hierarchical Motif-based Multi-Modal protein Encoder for PPI Prediction that constructs PPI embeddings in a bottom-up multi-modal manner across three scales. At the micro-scale, we encode residue features from three modalities; at the meso-scale, a novel multimodal motif encoder aggregates residues into spatially-informed motif embeddings; at the macro-scale, a multimodal protein encoder integrates motifs into protein embeddings by jointly modeling motif importance and inter-modal correlations. The pre-trained encoder can be used off-the-shelf for large-scale PPI prediction. Extensive experiments on multiple PPI datasets show that MMM-PPI outperforms state-of-the-art multi-label PPI prediction models, particularly under challenging data partitions and limited data scenarios. Code is available at https://github.com/yzf-code/MMM-PPI.

2606.02625 2026-06-03 q-bio.QM cs.AI cs.LG version update

DXA-Derived Skeletal Phenotypes and Hip Fracture Risk: A Backdoor-Adjusted Causal Analysis

DXA衍生的骨骼表型与髋部骨折风险:后门调整因果分析

Zixin Shi, Chen Zhao, Meiling Zhou, Kevin A. Maupin, Joyce H. Keyak, Nancy E. Lane, Kuan-Jui Su, Hui Shen, Hong-Wen Deng, Kui Zhang, Weihua Zhou

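AI summary Uses backdoor-adjusted average treatment effects to compare DXA-derived hip skeletal phenotypes in relation to fracture risk, and evaluates whether phenotypes ranked by effect size improve risk stratification.

Comments 35 pages; main manuscript includes 4 figures and 3 tables; supplementary material includes 13 figures and 3 tables

Details
AI Chinese abstract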

目的:通过预设的混杂因素调整,比较双能X射线吸收测定法(DXA)衍生的髋部骨骼表型与髋部骨折风险的关系,并评估按后门调整的平均处理效应(ATEs)排序的表型是否能改善风险分层。方法:我们分析了21,098名英国生物样本库参与者,他们具有关联的健康记录、髋部DXA衍生的骨骼测量值和预设协变量。评估了涵盖髋部相关区域的骨矿物质含量(BMC)、骨矿物质密度(BMD)和T评分的16种表型。混杂因素选择由预设的有向无环图(DAG)指导。后门调整的ATEs以每标准差(SD)增加的绝对风险差尺度估计。评估了股骨总BMD的效应异质性,并使用临床变量与按ATE大小排序的表型组合评估下游预测。结果:在21,098名参与者中,115人发生髋部骨折。所有16种表型均显示每SD增加的后门调整ATEs为负值。最大的ATEs出现在股骨总BMC和股骨总BMD,每个的风险差为-0.0047,对应于每1,000名参与者中每SD较高的表型值减少约4.7例髋部骨折。股骨总BMD的条件效应在年龄较大和BMI较低的参与者中更强。在预测中,临床变量加上按ATE排序的前11个表型达到了比FRAX(含股骨颈BMD)更高的AUC(0.842 vs. 0.709),具有更高的敏感性(0.748 vs. 0.443)和相似的特异性(0.793 vs. 0.777)。结论:DXA衍生的髋部骨骼表型在其后门调整的ATEs上存在差异。表型水平的因果评估可能有助于识别用于风险分层的信息性DXA测量值。

English abstract

Purpose: To compare dual-energy X-ray absorptiometry (DXA)-derived hip skeletal phenotypes in relation to hip fracture risk using prespecified confounder adjustment and to assess whether phenotypes ranked by their backdoor-adjusted average treatment effects (ATEs) improve risk stratification. Methods: We analyzed 21,098 UK Biobank participants with linked health records, hip DXA-derived skeletal measures, and prespecified covariates. Sixteen phenotypes spanning bone mineral content (BMC), bone mineral density (BMD), and T-score across hip-related regions were evaluated. Confounder selection was guided by a prespecified directed acyclic graph (DAG). Backdoor-adjusted ATEs were estimated on the absolute risk-difference scale per standard deviation (SD) increase. Effect heterogeneity was evaluated for total femur BMD, and downstream prediction was assessed using clinical variables combined with phenotypes ranked by ATE magnitude. Results: Among 21,098 participants, 115 had hip fractures. All 16 phenotypes showed negative backdoor-adjusted ATEs per SD increase. The largest ATEs were observed for total femur BMC and total femur BMD, each with a risk difference of -0.0047, corresponding to approximately 4.7 fewer hip fractures per 1,000 participants per SD higher phenotype value. Conditional effects of total femur BMD were stronger among older participants and those with lower BMI. In prediction, clinical variables plus the top 11 ATE-ranked phenotypes achieved higher AUC than FRAX with femoral neck BMD (0.842 vs. 0.709), with higher sensitivity (0.748 vs. 0.443) and similar specificity (0.793 vs. 0.777). Conclusion: DXA-derived hip skeletal phenotypes differed in their backdoor-adjusted ATEs. Phenotype-level causal evaluation may help identify informative DXA measures for risk stratification.
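The conversion from a risk difference to "fewer fractures per 1,000 participants" in the Results is simple arithmetic, reproduced in the sketch below. The helper name `fewer_events_per_1000` is hypothetical, and the crude incidence is computed here only from the counts quoted in the abstract (115 fractures among 21,098 participants); it is not a figure reported by the authors.

```python
def fewer_events_per_1000(risk_difference: float) -> float:
    """Convert an absolute risk difference into events avoided per 1,000 people."""
    return -risk_difference * 1000.0


if __name__ == "__main__":
    # Risk difference per SD increase quoted for total femur BMC and BMD.
    print(fewer_events_per_1000(-0.0047))
    # Crude hip-fracture incidence implied by the quoted counts.
    print(115 / 21098)
```

Reading the two numbers side by side helps with scale: the per-SD risk difference is of the same order as the crude incidence in the cohort.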

2606.02624 2026-06-03 q-bio.QM cs.AI cs.LG version update

TadA-Bench: A Million-Variant Benchmark for Future-Round Discovery Toward Agentic Protein Engineering

TadA-Bench:面向智能蛋白质工程的未来轮次发现的百万变异基准

Jin Gao, Juntu Zhao, Zirui Zeng, Jiaqi Shen, Junhao Shi, Dukun Zhao, Yuming Lu, Dequan Wang

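Affiliations * Tsinghua University

AI summary TadA-Bench is a million-variant wet-lab replay benchmark built from 31 rounds of TadA directed evolution. It defines a fixed-data replay task to evaluate how well models rank variants from unseen future rounds, introduces Seq2Graph to unify labels, and finds that evolutionary coverage matters more than local data density.

Comments Accepted at the 43rd International Conference on Machine Learning (ICML 2026). Data: https://huggingface.co/datasets/JinGao/TadABench-1M . Code: https://github.com/shiyegao/TadABench-1M

Details
AI Chinese abstract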

人工智能用于科学发现正进入智能体时代,蛋白质工程系统应优先考虑未来的湿实验,而不仅仅是拟合静态测量。我们引入了TadA-Bench,这是一个来自31轮TadA定向进化的百万变异湿实验回放基准,用于面向智能蛋白质工程的未来轮次发现。TadA-Bench保留了实验的时间顺序,并定义了一个固定数据回放任务:给定早期的实验轮次,模型对仅出现在后期轮次中的变异进行排序。它提供了对齐的DNA、RNA和蛋白质视图,并使用Seq2Graph(一种基于图的标签统一流程)来将嘈杂的富集测量结果协调为一致的跨轮次活性标签。随机分割控制显示强插值能力,但未来轮次排序和有限预算候选选择则弱得多。控制分析表明,进化覆盖度比局部数据密度更具信息性,将TadA-Bench定位为面向智能蛋白质工程的未来轮次发现的可重复湿实验回放基底;数据和代码已在Hugging Face和GitHub上发布。

English abstract

AI for scientific discovery is entering an agentic era, where protein-engineering systems are expected to prioritize future wet-lab experiments rather than merely fit static measurements. We introduce TadA-Bench, a million-variant wet-lab replay benchmark from 31 TadA directed-evolution rounds for future-round discovery toward agentic protein engineering. TadA-Bench preserves the campaign chronology and defines a fixed-data replay task: given earlier experimental rounds, models rank variants that appear only in later rounds. It provides aligned DNA, RNA, and protein views, and uses Seq2Graph, a graph-based label-unification pipeline, to reconcile noisy enrichment measurements into consistent cross-round activity labels. Random-split controls show strong interpolation, but future-round ranking and finite-budget candidate selection are much weaker. Controlled analyses suggest that evolutionary coverage is more informative than local data density, positioning TadA-Bench as a reproducible wet-lab replay substrate for future-round discovery toward agentic protein engineering; the data and code are released on Hugging Face and GitHub.

2606.03990 2026-06-03 cs.LG cs.CL cs.CV version update

Neuron Populations Exhibit Divergent Selectivity with Scale

神经元群体随规模表现出分化的选择性

Amil Dravid, Yasaman Bahri, Alexei A. Efros, Yossi Gandelsman

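Affiliations * UC Berkeley; TTIC

AI summary By analyzing the distribution and properties of Rosetta Neurons in models of different scales, finds that their number grows as a sublinear power law and their selectivity increases with scale, while non-Rosetta neurons remain less selective; proposes an analytical model balancing feature utility against neuron capacity to explain this polarization.

Comments Project page and code: https://avdravid.github.io/rosetta-neuron-scaling/

Details
AI Chinese abstract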

我们研究神经网络中的神经元群体是否随规模可预测地演化,将缩放定律扩展到损失等宏观可观测指标之外。为探究此问题,我们研究了Rosetta神经元——一类先前被表征的、其激活模式在独立训练的模型中相似的神经元(Dravid et al., 2023)。在分别对高达30B参数的语言模型和高达5B参数的视觉模型的分析中,我们观察到Rosetta神经元群体遵循模型规模的次线性幂律,绝对数量增长但占总神经元数的比例缩小。我们进一步观察到神经元极化效应:Rosetta神经元随规模变得更具选择性且日益单语义化,与不断增长但仍保持低选择性的非Rosetta群体分离。一个平衡特征效用与有限神经元容量的分析模型解释了次线性幂律缩放和这种极化效应。最后,我们发现Rosetta神经元随规模变得更加领域专业化,并通过一个针对持续预训练的目标数据过滤案例研究展示了其选择性。我们的结果指向一个可解释的、共享的神经元层面结构的缩放定律,将模型大小与神经元通用性、选择性和专业化的系统性变化联系起来。

English abstract

We investigate whether neuron populations within neural networks evolve predictably with scale, extending scaling laws beyond macroscopic observables such as loss. To probe this question, we study Rosetta Neurons, a previously characterized class of neurons whose activation patterns are similar across independently trained models (Dravid et al., 2023). In separate analyses of language models up to 30B parameters and vision models up to 5B parameters, we observe that the population of Rosetta Neurons follows a sublinear power law in model size, growing in absolute number but occupying a shrinking fraction of the total neuron count. We further observe a Neuron Polarization Effect: Rosetta Neurons become more selective and increasingly monosemantic with scale, separating from a growing non-Rosetta population that remains less selective. An analytical model balancing feature utility against limited neuron capacity explains the sublinear power-law scaling and this polarization effect. Finally, we find that Rosetta Neurons become more domain-specialized with scale and illustrate their selectivity through a targeted data-filtering case study for continued pretraining. Our results point to a scaling law for interpretable, shared neuron-level structure, linking model size to systematic changes in neuron universality, selectivity, and specialization.
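A sublinear power law of the kind described above has the form $N = c \cdot P^{\alpha}$ with $0 < \alpha < 1$, which is a straight line of slope $\alpha$ in log-log coordinates. The sketch below fits such a line by ordinary least squares. The function name `fit_power_law` and the synthetic data are assumptions of this example; the counts are generated from a known law and are not values from the paper.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(sizes: Sequence[float], counts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log(count) = log(c) + alpha * log(size); returns (c, alpha)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(k) for k in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    c = math.exp(mean_y - alpha * mean_x)
    return c, alpha


if __name__ == "__main__":
    # Synthetic counts generated from count = 2 * size**0.5 (hypothetical, not paper data).
    sizes = [1e6, 1e7, 1e8, 1e9]
    counts = [2.0 * s**0.5 for s in sizes]
    c, alpha = fit_power_law(sizes, counts)
    print(f"c = {c:.3f}, alpha = {alpha:.3f}")
```

If the total neuron count grew linearly in the same size variable, an exponent below 1 would make the count rise in absolute terms while its share of the total falls like $P^{\alpha - 1}$, matching the qualitative pattern in the abstract.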

2606.03980 2026-06-03 cs.LG cs.CL version update

Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill

Skill-RM: 通过智能体技能统一异构评估标准

Tao Chen, Gangwei Jiang, Pengyu Cheng, Siyuan Huang, Yihao Liu, Jingwei Ni, Jiaqi Guo, Mengyu Zhou, Kai Tang, Junling Liu, Qinliang Su, Xiaoxi Jiang, Guanjun Jiang

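Affiliations * Qwen Large Model Application Team, Alibaba; Sun Yat-sen University; The Chinese University of Hong Kong; Peking University; ETH Zürich; University of Zurich

AI summary Proposes the Skill-RM framework, which recasts reward modeling as the execution of a reusable reward-evaluation skill that dynamically selects and aggregates evidence to unify heterogeneous evaluation criteria, outperforming traditional approaches on reward benchmarks and downstream tasks.

Details
AI Chinese abstract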

奖励模型(RMs)为LLM后训练提供关键反馈信号,特别是在强化微调(RFT)和强化学习(RL)流程中。然而,当前的奖励评估依赖于异构标准,如基于规则的验证器、真实参考、程序化检查表和复杂评分标准,而统一整合所有类型证据的机制尚未被探索。为此,我们提出技能奖励模型(Skill-RM),一个统一框架,将奖励建模重构为可重用的奖励评估技能的执行。通过将奖励计算视为结构化的智能体任务,Skill-RM提供一致的接口来编排异构资源,动态选择和聚合针对每个输入特定要求定制的证据。这种方法使奖励模型能够超越静态评估,确保跨不同任务的一致性和透明度。在奖励基准和下游应用(包括最佳N选择和强化学习)上的大量实验表明,Skill-RM始终优于传统的评判基线。我们的发现表明,Skill-RM不仅为奖励建模提供了统一解决方案,而且通过战略性和动态的证据编排实现了卓越性能。代码见此链接。

English abstract

Reward models (RMs) provide critical feedback signals for LLM post-training, notably in reinforced fine-tuning (RFT) and reinforcement learning (RL) pipelines. However, current reward evaluation relies on heterogeneous criteria such as rule-based verifiers, ground-truth references, procedural checklists, and complex rubrics, where a unified mechanism to integrate all types of evidence remains unexplored. To this end, we propose Skill Reward Model (Skill-RM), a unified framework that reformulates reward modeling as the execution of a reusable Reward-Evaluation Skill. By treating reward computation as a structured agentic task, Skill-RM provides a consistent interface to orchestrate heterogeneous resources, dynamically selecting and aggregating evidence tailored to the specific requirements of each input. This approach enables the reward model to move beyond static evaluation, ensuring consistency and transparency across diverse tasks. Extensive experiments on reward benchmarks and downstream applications, including best-of-N selection and reinforcement learning, demonstrate that Skill-RM consistently outperforms traditional judge baselines. Our findings suggest that Skill-RM not only provides a unified solution for reward modeling but also achieves superior performance through the strategic and dynamic orchestration of evidence. The code is at https://github.com/Qwen-Applications/Skill-RM.

2606.03979 2026-06-03 cs.LG cs.AI version update

Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

语言模型需要睡眠:学习自我修改和巩固记忆

Ali Behrouz, Farnoosh Hashemi, Vahab Mirrokni

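Affiliations * Google; Cornell University

AI summary Inspired by human learning, proposes a "Sleep" paradigm with two stages, memory consolidation (Knowledge Seeding) and Dreaming (self-improvement), so that models learn continually, convert short-term memories into long-term knowledge, and improve themselves.

Comments A version of this work has been publicly available from September 2025 on OpenReview

Details
AI Chinese abstract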

过去几十年见证了机器学习算法设计的重大进步,从早期针对特定任务的浅层模型研究到更通用的深度大语言模型(LLMs)。尽管在需要即时预测或上下文学习的任务中显示出有希望的结果,现有模型缺乏持续学习并有效将其时间上下文知识转移到长期参数的能力。受人类学习过程的启发,我们引入了一种“睡眠”范式,允许模型持续学习,通过重放将其短期脆弱记忆蒸馏为稳定的长期知识,并通过“梦境”过程递归地自我改进。更详细地说,睡眠包括两个阶段:(1)记忆巩固:一个向上的蒸馏过程,称为知识播种,其中较小自我的记忆被蒸馏到更大的网络中,以在保留知识的同时提供更多容量。作为概念验证,我们提出了一种新的广义蒸馏过程用于知识播种(即在线策略蒸馏与基于强化学习的模仿学习的结合);(2)梦境:一个自我改进阶段,其中模型使用强化学习生成合成数据的课程,以排练新知识并在没有人类监督的情况下完善现有能力。我们在长视野、持续学习、知识整合和少样本泛化任务上的实验支持了睡眠阶段的重要性。

English abstract

The past few decades have witnessed significant advances in the design of machine learning algorithms, from early studies on task-specific shallow models to more general deep Large Language Models (LLMs). Despite showing promising results in tasks that require instant prediction or in-context learning, existing models lack the ability to continually learn and effectively transfer their temporal in-context knowledge to their long-term parameters. Inspired by the human learning process, we introduce a "Sleep" paradigm that allows models to continually learn, distill their short-term fragile memories into stable long-term knowledge with replay, and recursively improve themselves through a "Dreaming" process. In more detail, sleep consists of two stages: (1) Memory Consolidation: an upward distillation process, called Knowledge Seeding, where the memories of a smaller self are distilled into a larger network to provide more capacity while preserving the knowledge. As a proof of concept, we present a new Generalized Distillation process for Knowledge Seeding (i.e., the combination of on-policy distillation with Reinforcement Learning (RL)-based imitation learning); (2) Dreaming: a self-improvement phase, where the model uses RL to generate a curriculum of synthetic data to rehearse new knowledge and refine existing capabilities without human supervision. Our experiments on long-horizon, continual learning, knowledge incorporation, and few-shot generalization tasks support the importance of the sleep stage.

2606.03976 2026-06-03 cs.CV cs.AI cs.LG q-bio.NC version update

Formalizing the Binding Problem

形式化绑定问题

Lianghuan Huang, Yihao Li, Saeed Salehi, Yingshan Chang, Ansh Soni, Konrad P. Kording

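AI summary Formalizes the binding problem with an information-theoretic approach, proposes a probing method to measure binding information in model representations, and runs experiments on Vision Transformers, arguing that binding is a key ingredient of strong visual recognition and reasoning.

Comments Accepted to ICML 2026

Details
AI Chinese abstract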

世界表征,可以说,包含关于特征的信息(例如,某物是蓝色的,某物是圆形的),但也包含关于哪些特征属于同一对象的信息(例如,圆形是蓝色的),我们称之为绑定信息。任何具有理解包含多个对象场景能力的系统都必须解决绑定问题:它需要知道哪些特征属于一起。然而,尽管有研究表明视觉Transformer(ViT)知道哪些补丁属于一起,但目前尚不清楚当前的深度学习模型是否学会展示绑定信息,即针对特征的信息。我们可能认为绑定信息并不多,毕竟将特征错误归因于错误对象是基于ViT架构的常见失败,尤其是在对象共享特征的场景中。本文用信息论方法形式化绑定问题,并引入一种探测方法来测量模型表示中的绑定信息。我们在ViT上进行实验,测量来自架构不同组件(如图像摘要标记[CLS]或空间标记)的绑定信息。我们使用具有不同绑定挑战的数据集,例如特征共享、遮挡和自然特征,同时比较多个预训练ViT的性能。总体而言,我们的研究证明了绑定是强视觉识别和推理的关键要素。

English abstract

Representations of the world, arguably, contain information about features (e.g. something is blue, something is a circle) but also information about which features are part of the same object (e.g. the circle is blue), which we call binding information. Any system with the ability to understand scenes with multiple objects must be able to solve the binding problem: it needs to know which features belong together. However, despite work showing that Vision Transformers (ViTs) know which patches belong together, it is not known whether current deep learning models learn binding information for features. One might expect that there is not much binding information: after all, misattributing features to the wrong objects is a common failure of ViT-based architectures, especially in scenes with objects sharing features. Here we formalize the binding problem with an information-theoretic approach, and introduce a probing method to measure binding information in model representations. We perform experiments on ViTs, measuring binding from different components of the architecture, such as the image summary token [CLS] or the spatial tokens. We use datasets with different binding challenges, such as feature sharing, occlusion, and natural features, while comparing the performance of several pre-trained ViTs. Overall, our research demonstrates binding as a key ingredient of strong visual recognition and reasoning.

2606.03962 2026-06-03 cs.LG cs.AI 版本更新

Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning

利用奖励不确定性在强化学习中诱导多样化行为

Anthony GX-Chen, Ankit Anand, Gheorghe Comanici, Zaheer Abbas, Eser Aygün, David Smalling, Shibl Mourad, Doina Precup, André Barreto, Mark Rowland

发表机构 * New York University(纽约大学) Google DeepMind(谷歌DeepMind)

AI总结 针对传统强化学习缺乏多样性的问题,提出将奖励函数替换为奖励分布,通过非线性集合目标自然产生可控的多样化行为,并推导出梯度估计器,实验证明其鲁棒性和理论优势。

Comments Core contributors: Anthony GX-Chen, Ankit Anand, Gheorghe Comanici, André Barreto, Mark Rowland

详情
AI中文摘要

经典强化学习通常寻求最大化标量奖励期望和的确定性策略。然而,现代应用如语言模型微调或科学发现需要多样性。现有的补救措施如熵正则化或多样性奖励通常需要脆弱的权衡,以性能换取随机性,或依赖可能使策略排名错位的启发式指标。我们认为,多样性更自然地理解为对奖励不确定性的理性响应。当奖励函数不完全已知时——例如模糊偏好或不完美的奖励模型——承诺单一行动可能是次优的。基于此,我们提出对强化学习目标进行根本性重新表述,将标量奖励替换为奖励函数上的分布,并对行动集合应用非线性目标。结果是一个框架,其中校准的行为多样性自然出现,通过奖励函数分布保持可控,且无需牺牲期望奖励即可获得。聚焦于上下文赌博机设置,我们为该目标推导出原则性的梯度估计器,并证明我们的公式自然泛化了原始策略梯度以及最近发展的行动集方法。我们的实证结果表明,该框架为传统问题表述无法诱导所需行为广度的复杂强化学习任务提供了鲁棒且理论基础的替代方案。

英文摘要

Classical reinforcement learning (RL) typically seeks a deterministic policy that maximizes the expected sum of a scalar reward. Yet, modern applications such as language model fine-tuning or scientific discovery demand diversity. Existing remedies such as entropy regularization or diversity bonuses often require fragile trade-offs that sacrifice performance for stochasticity or rely on heuristic metrics that can misalign policy rankings. We argue that diversity is more naturally understood as the rational response to uncertainty in the reward. When the reward function is not perfectly known--as is the case with ambiguous preferences or imperfect reward models--committing to a single action can be sub-optimal. Building on this, we propose a fundamental reformulation of the RL objective by replacing the scalar reward with a distribution over reward functions, and applying a non-linear objective over sets of actions. The result is a framework in which calibrated behavioural diversity emerges naturally, remains controllable through the reward function distribution, and is obtained without sacrificing expected reward. Focusing on the contextual bandit setting, we derive a principled gradient estimator for this objective and prove that our formulation naturally generalizes both vanilla policy gradient and more recently developed action-set approaches. Our empirical results demonstrate that this framework offers a robust and theoretically grounded alternative for complex RL tasks where the traditional formulation of the problem fails to induce the desired breadth of agent behaviour.
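To make the argument concrete, here is a toy sketch of how a non-linear objective over a set of actions can reward diversity when the reward function is uncertain. The "best action in the set" objective, the two candidate reward functions, and all numbers are assumptions chosen for illustration; the abstract does not specify the paper's actual objective.

```python
# Toy illustration (assumed objective, not the paper's): expected best-of-set
# reward under a distribution over reward functions.

# Two equally likely candidate reward functions over three actions (hypothetical numbers).
reward_functions = [
    {"a": 1.0, "b": 0.0, "c": 0.6},
    {"a": 0.0, "b": 1.0, "c": 0.6},
]
probs = [0.5, 0.5]


def expected_reward(action):
    """Expected reward of a single action under reward uncertainty."""
    return sum(p * r[action] for p, r in zip(probs, reward_functions))


def expected_best_of_set(actions):
    """Expected value of the best action in the set (non-linear in the set)."""
    return sum(p * max(r[a] for a in actions) for p, r in zip(probs, reward_functions))


repeated = expected_best_of_set(["c", "c"])  # commit to the single best-on-average action
diverse = expected_best_of_set(["a", "b"])   # cover both reward hypotheses
```

In this toy setting, action `c` has the highest expected reward on its own, yet the diverse set `{a, b}` scores higher under the set objective than repeating `c`, which is the kind of effect the abstract attributes to reward uncertainty.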

2606.03954 2026-06-03 cs.CV cs.LG cs.RO 版本更新

VLESA: Vision-Language Embodied Safety Agent for Human Activity Monitoring

VLESA: 用于人类活动监测的视觉语言具身安全智能体

Hanjiang Hu, Yiyuan Pan, Jiaxing Li, Xusheng Luo, Alexander Robey, Na Li, Yebin Wang, Changliu Liu

发表机构 * Carnegie Mellon University(卡内基梅隆大学) Mitsubishi Electric Research Laboratories(三菱电机研究实验室) Harvard University(哈佛大学)

AI总结 提出VLESA框架,通过自我中心视频监测人类活动,利用GRPO训练的目标条件安全Q过滤器进行实时安全干预,在ASIMOV-2.0基准上实现更高干预精度。

Comments 18 pages, 5 tables, 5 figures

详情
AI中文摘要

随着AI系统越来越多地协助人类完成物理任务,确保安全变得至关重要:物理动作会带来即时且不可逆转的后果,而数字错误则不会。我们引入了视觉语言具身安全智能体(VLESA),这是一个从自我中心视频监测人类活动,并在预测到危险动作时触发实时安全干预的框架。VLESA处理意图依赖的安全问题,其中相同的动作可能根据上下文而安全或危险。我们引入了一个将自我中心帧与目标条件安全注释配对的数据集,使得能够通过GRPO训练一个目标条件安全Q过滤器,该过滤器在不重新训练的情况下根据推断的意图评估动作。在此基础上,提出了一个意图-动作预测智能体,用于从视频中联合推断目标并预测未来动作。在ASIMOV-2.0基准上,VLESA在精确的真值帧处实现了比基线更高的干预准确率,而通过目标条件约束解码,GRPO训练的Q过滤器将动作安全性提高了超过41个百分点。代码可在 https://github.com/HanjiangHu/VLESA 获取。

英文摘要

As AI systems increasingly assist humans in physical tasks, ensuring safety becomes paramount -- physical actions carry immediate and irreversible consequences that digital errors do not. We introduce the Vision-Language Embodied Safety Agent (VLESA), a framework that monitors human activities from egocentric video and triggers real-time safety interventions when dangerous actions are predicted. VLESA addresses intent-dependent safety where identical actions can be safe or dangerous depending on context. A dataset pairing egocentric frames with goal-conditioned safety annotations is introduced, enabling a goal-conditioned safety Q-filter trained via GRPO that evaluates actions with respect to inferred intent without retraining. On top of that, an intent-action prediction agent is proposed to jointly infer goals and predict future actions from video. On the ASIMOV-2.0 benchmark, VLESA achieves higher intervention accuracy at the exact ground-truth frame compared to baselines, while the GRPO-trained Q-filter improves action safety by over 41 percentage points through goal-conditioned constrained decoding. Code is available at https://github.com/HanjiangHu/VLESA.

2606.03946 2026-06-03 cs.DB cs.LG cs.LO 版本更新

MLSkip: Data Skipping for ML Filters via Lightweight Metadata

MLSkip: 通过轻量级元数据实现ML过滤器的数据跳过

Mihail Stoian, Mark Gerarts, Pascal Ginter, Andreas Zimmerer, Jan Van den Bussche, Andreas Kipf

发表机构 * University of Technology Nuremberg(纽伦堡工业大学) Hasselt University(哈塞尔特大学) Technical University of Munich(慕尼黑工业大学)

AI总结 针对ML过滤器无法应用传统数据跳过技术的问题,提出利用Parquet默认的min-max元数据以及增强的二维凸包元数据结构,实现高效的谓词剪枝,平均剪枝效果达38.31%。

详情
AI中文摘要

数据库厂商最近发布了可用于过滤器谓词的AI函数。由于这些函数通常依赖于昂贵且黑盒的ML模型,它们带来了新的数据管理挑战。具体而言,针对整数和字符串数据的传统数据跳过技术无法适用于这种新型过滤器。实际上,目前还没有已知的机制用于剪枝不合格的行组,例如从blob存储读取文件时。在这项工作中,我们首次研究了ML过滤器的数据跳过技术。我们论证了Parquet默认的min-max元数据足以实现剪枝。为此,我们联系了两条研究路线:(i) 最近提出的ML模型查询语言和(ii) 神经网络验证。我们在ReLU架构上的初步结果表明,在TPC-H和TPC-DS表上,选择性低于0.1%的过滤器的平均剪枝效果为27.4%。最后,受空间连接研究的启发,我们提出了一种增强的元数据结构:一个有大小限制的二维凸包,验证工具可以更好地利用它,将剪枝效果提高到38.31%,同时每个行组和列对最多占用45字节。我们观察到在DuckDB中相对于PyTorch的端到端加速比为1.07倍。

英文摘要

Database vendors recently released AI functions that can be used in filter predicates. As such functions often rely on costly, black-box ML models, they unveil new data management challenges. Concretely, traditional data skipping techniques for integer and string data fail to be applicable to the new filter type. Indeed, there is no known mechanism for pruning non-qualifying row groups, e.g., when reading files from blob storage. In this work, we initiate the study of data skipping techniques for ML filters. We make the case that Parquet's default min-max metadata is enough to enable pruning. To this end, we draw connections to two lines of research: (i) the recently proposed query language for ML models and (ii) neural network verification. Our preliminary results on ReLU architectures show that on tables from TPC-H and TPC-DS, the average pruning effectiveness for filters of selectivity below 0.1% amounts to 27.4%. Finally, inspired by research on spatial joins, we propose an enhanced metadata structure: a size-bounded 2D convex hull that verification tools can make better use of, increasing the pruning effectiveness to 38.31%, while occupying at most 45 bytes per row group and column pair. We observe an end-to-end speedup of 1.07$\times$ over PyTorch in DuckDB.
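The link between min-max metadata and neural network verification can be sketched with interval bound propagation, one standard verification technique: push the per-column `[min, max]` box of a row group through a ReLU network and skip the row group if the output upper bound cannot reach the filter threshold. This is a minimal sketch under assumptions; the function names, the `score >= threshold` filter form, and the tiny network are hypothetical, and the abstract does not say which verification tool the authors use.

```python
import numpy as np


def interval_forward(weights, biases, lo, hi):
    """Propagate an axis-aligned box [lo, hi] through a ReLU MLP (interval bound propagation)."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lo = W_pos @ lo + W_neg @ hi + b
        new_hi = W_pos @ hi + W_neg @ lo + b
        if i < len(weights) - 1:  # ReLU on hidden layers only
            new_lo, new_hi = np.maximum(new_lo, 0.0), np.maximum(new_hi, 0.0)
        lo, hi = new_lo, new_hi
    return lo, hi


def can_skip_row_group(weights, biases, col_min, col_max, threshold):
    """A row group is safely skippable if the score upper bound stays below the threshold."""
    lo = np.asarray(col_min, dtype=float)
    hi = np.asarray(col_max, dtype=float)
    _, out_hi = interval_forward(weights, biases, lo, hi)
    return bool(out_hi[0] < threshold)


# Hypothetical 2-input, 2-hidden-unit, 1-output ReLU network.
weights = [np.array([[1.0, -1.0], [0.5, 0.5]]), np.array([[1.0, 1.0]])]
biases = [np.zeros(2), np.zeros(1)]

skip_small = can_skip_row_group(weights, biases, [0.0, 0.0], [1.0, 1.0], threshold=5.0)
skip_large = can_skip_row_group(weights, biases, [0.0, 0.0], [10.0, 10.0], threshold=5.0)
```

The bound is sound but loose: a row group is pruned only when no point in its box can satisfy the filter, so a failed check means "must read", not "contains matches". Tighter metadata, such as the convex hull the abstract proposes, would shrink the region the verifier has to cover.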

2606.03939 2026-06-03 cs.LG cs.AI cs.PF 版本更新

FlashbackCL: Mitigating Temporal Forgetting in Federated Learning

FlashbackCL:缓解联邦学习中的时间遗忘

Mubarak A. Ojewale, Adriana E. Chis, Jorge M. Cortes-Mendoza, Bernardo Pulido-Gaytan, Horacio Gonzalez-Velez

发表机构 * Cloud Competency Centre, National College of Ireland, Dublin, Ireland(云竞争力中心,爱尔兰国家学院,都柏林,爱尔兰)

AI总结 针对联邦学习中客户端数据分布随时间漂移导致的时间遗忘问题,提出FlashbackCL方法,通过时间衰减标签计数、类别平衡水库采样重放和服务器端主动核心集筛选,在CIFAR-10上相对Flashback提升6.9%-10.0%,时间遗忘减少68%。

详情
AI中文摘要

基础模型和边缘模型的联邦学习(FL)越来越多地部署在客户端数据分布随时间漂移的场景中,然而现有的遗忘缓解方法假设每个客户端的分布是平稳的。Flashback是近期最强的针对跨客户端(空间)遗忘的FL方法,它使用单调累积的每类标签计数作为知识代理;该代理在时间分布漂移下会失准,并将全局模型锚定在过时的类别平衡上。我们通过一个与协议级波动隔离的每阶段指标形式化定义了FL中的时间遗忘,并提出了Flashback Continual Learning(FlashbackCL),它是Flashback的即插即用扩展,包含:(i) 时间衰减的标签计数;(ii) 具有类别平衡水库采样(CBRS)的设备感知重放缓冲区;(iii) 在公共蒸馏集上的服务器端主动核心集筛选。结果表明,在具有50个客户端和三种受控时间漂移模式的CIFAR-10上,FlashbackCL相对于Flashback实现了6.9%至10.0%的相对改进,同时将时间遗忘减少了高达68%。一项5变体消融实验表明CBRS重放是关键组件。FlashbackCL在平稳CIFAR-100上也比Flashback提高了3.5个百分点,表明类别平衡重放同样正则化了空间异质性和时间漂移。

英文摘要

Federated Learning (FL) of foundation and edge models increasingly targets deployments where client data distributions drift over time, yet existing forgetting-mitigation methods assume each client's distribution is stationary. Flashback, the strongest recent FL method against cross-client (spatial) forgetting, uses monotonically accumulating per-class label counts as a knowledge proxy; this proxy becomes miscalibrated under temporal distribution shift and anchors the global model to an outdated class balance. We formalise temporal forgetting in FL with a per-phase metric isolated from protocol-level fluctuations and propose Flashback Continual Learning (FlashbackCL), a drop-in extension of Flashback with (i) temporally-decayed label counts; (ii) a device-aware replay buffer with Class-Balanced Reservoir Sampling (CBRS); and (iii) server-side active coreset curation on the public distillation set. The results show that FlashbackCL achieves a 6.9% to 10.0% relative improvement over Flashback on CIFAR-10 with 50 clients and three controlled temporal shift modes, while simultaneously reducing temporal forgetting by up to 68%. A 5-variant ablation identifies CBRS replay as the critical component. FlashbackCL also improves on Flashback by 3.5 points on stationary CIFAR-100, suggesting that class-balanced replay regularises spatial heterogeneity as well as temporal shift.
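The difference between monotonically accumulating and temporally-decayed label counts can be illustrated with a small simulation. The exponential decay rule, the `decay` value, and the drift scenario below are assumptions for illustration; the abstract does not give the exact decay schedule FlashbackCL uses.

```python
import numpy as np


def update_label_counts(counts, round_labels, num_classes, decay=0.5):
    """Exponentially decay old per-class counts, then add this round's counts.

    decay=1.0 recovers plain cumulative counting.
    """
    new = np.bincount(round_labels, minlength=num_classes).astype(float)
    return decay * np.asarray(counts, dtype=float) + new


counts_cumulative = np.zeros(3)
counts_decayed = np.zeros(3)

# Hypothetical drift: a client sees only class 0 for five rounds, then only class 2.
rounds = [[0] * 10] * 5 + [[2] * 10] * 5
for labels in rounds:
    counts_cumulative = update_label_counts(counts_cumulative, labels, 3, decay=1.0)
    counts_decayed = update_label_counts(counts_decayed, labels, 3, decay=0.5)
```

After the drift, the cumulative counts still report classes 0 and 2 as equally well represented, while the decayed counts are dominated by class 2, which matches the client's current data.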

2606.03936 2026-06-03 cs.LG physics.geo-ph 版本更新

Correcting Neural Operator Spectral Bias via Diffusion Posterior Sampling with Sparse Observations

通过稀疏观测的扩散后验采样校正神经算子谱偏差

Niccolò Perrone, Fanny Lehmann, Stefania Fresca, Filippo Gatti

发表机构 * Université Paris-Saclay, CentraleSupélec, CNRS, ENS Paris-Saclay(巴黎-萨克雷大学,中央理工-高等电力学院,国家科学研究中心,巴黎-萨克雷高等师范学院) Laboratoire de Mécanique Paris-Saclay UMR 9026(巴黎-萨克雷力学实验室 UMR 9026) Politecnico di Milano(米兰理工大学) ETH AI Center(苏黎世联邦理工学院人工智能中心) Department of Mechanical Engineering University of Washington(华盛顿大学机械工程系)

AI总结 提出FreqNO-DPS方法,利用扩散后验采样结合谱形状引导分数,校正神经算子在稀疏观测下的高频衰减谱偏差,实现近零谱偏差。

详情
AI中文摘要

神经算子代理(NO)比数值求解器快数个数量级地近似PDE解,但受谱偏差影响:高频内容被系统性地衰减,限制了在细尺度结构重要时的可靠性。通常也可获得场的稀疏传感器测量,提供点精度而无谱失真,但仅覆盖域的一小部分。我们通过将NO预测视为扩散后验采样框架中的辅助观测来解决这一问题。我们的方法FreqNO-DPS(https://github.com/niccoloperrone/FreqNO-DPS)将无条件的基于分数的扩散先验(在高保真模拟上训练)与扩散后验采样(DPS)相结合,以稀疏观测为条件并由冻结的神经算子引导。朴素集成会重新引入代理的谱偏差;我们通过一个闭式、谱形状的引导分数来解决这一问题,该分数根据代理的频率相关精度加权,且无需去噪器反向传播。一个无分布分析在频率-扩散-时间平面上界定了近似误差,并表明引导的频率依赖性无论分布假设如何都得以保持。在3D弹性波场预测中,传感器覆盖率为5%和2%时,该方法在所有频带上达到近零谱偏差,而代理和仅传感器DPS均显示出系统性的高频衰减。各向同性引导(自然基线)提高了点精度,但几乎完整地将偏差带入后验,证实了频率依赖性校准是必要的,而不仅仅是有益的。该框架仅需配对的代理/参考数据,且除了残差的近似谱对角性外,不利用任何问题特定结构,可通过我们提供的相干性诊断对新代理进行验证。

英文摘要

Neural operator surrogates (NO) approximate PDE solutions orders of magnitude faster than numerical solvers, but suffer from spectral bias: high-frequency content is systematically attenuated, limiting reliability where fine-scale structure matters. Sparse sensor measurements of the field are often available too, offering pointwise accuracy without spectral distortion but covering only a small fraction of the domain. We address this by treating NO predictions as auxiliary observations in a diffusion posterior sampling framework. Our method, FreqNO-DPS (https://github.com/niccoloperrone/FreqNO-DPS), combines an unconditional score-based diffusion prior, trained on high-fidelity simulations, with diffusion posterior sampling (DPS) conditioned on sparse observations and guided by a frozen neural operator. Naive integration reintroduces the surrogate's spectral bias; we resolve this with a closed-form, spectrally shaped guidance score that weights the surrogate by its frequency-dependent accuracy and needs no denoiser backpropagation. A distribution-free analysis bounds the approximation error across the frequency-diffusion-time plane and shows the guidance's frequency dependence is preserved regardless of distributional assumptions. On 3D elastic wavefield prediction at 5% and 2% sensor coverage, the method reaches near-zero spectral bias across all bands, where both the surrogate and sensor-only DPS show systematic high-frequency attenuation. Isotropic guidance, the natural baseline, improves pointwise accuracy but carries the bias into the posterior nearly intact, confirming that frequency-dependent calibration is essential, not merely beneficial. The framework needs only paired surrogate/reference data and exploits no problem-specific structure beyond the residual's approximate spectral diagonality, verifiable for new surrogates via the coherence diagnostic we provide.

2606.03935 2026-06-03 cs.NE cs.LG 版本更新

Quadratic integrate-and-fire neurons exhibit less fragmented loss landscapes and outperform leaky integrate-and-fire neurons in spike-based gradient descent

二次整合-放电神经元表现出更少的碎片化损失景观,并在基于脉冲的梯度下降中优于漏电整合-放电神经元

Carlo Wenig, Raoul-Martin Memmesheimer, Christian Klos

发表机构 * University of Bonn(波恩大学) University of Tübingen(图宾根大学) University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校)

AI总结 通过对比LIF和QIF神经元在Spiking Heidelberg Digits数据集上的表现,发现QIF神经元具有更平滑的损失景观和梯度,从而在脉冲神经网络训练中表现更优。

Comments 9 pages, 5 figures (main part)

详情
AI中文摘要

训练脉冲神经网络对于模拟生物神经网络以及神经形态计算至关重要。然而,对于广泛使用的漏电整合-放电(LIF)神经元,任意小的参数变化都可能引起脉冲的(消失)出现,从而破坏后续活动,导致在精确的基于脉冲的梯度下降过程中出现不稳定的神经表征和永久沉默的神经元。最近的研究表明,包括二次整合-放电(QIF)神经元在内的一类神经元模型避免了这些不连续性,并实现了连续甚至平滑的基于脉冲的梯度下降。然而,尚不清楚这些优势是否能转化为实际应用。在这里,我们通过在流行的Spiking Heidelberg Digits数据集上对LIF和QIF神经元网络进行受控比较,证明了它们确实如此。具体来说,第一步,我们进行了彻底的超参数搜索以优化两种模型,揭示了QIF神经元的明显性能优势。第二步,我们可视化了损失和梯度景观。与它们较差的性能一致,我们发现LIF神经元的损失景观(不连续)显得更加碎片化,相关梯度更加不稳定。对单个样本景观的分析表明,这些特征源于脉冲时间顺序的变化,这常常导致破坏性的脉冲(消失)出现。总体而言,我们的结果主张在梯度下降训练中用具有连续脉冲动态的神经元模型(如QIF神经元)替代LIF神经元。

英文摘要

The ability to train spiking neural networks is essential for modeling biological neural networks as well as for neuromorphic computing. However, for the extensively used leaky integrate-and-fire (LIF) neurons, arbitrarily small parameter changes can induce spike (dis)appearances that disrupt subsequent activity, leading to unstable neural representations and permanently silent neurons during exact spike-based gradient descent. Recent work shows that a class of neuron models, which includes the quadratic integrate-and-fire (QIF) neuron, avoids these discontinuities and enables continuous and even smooth spike-based gradient descent. However, it remains unclear whether these advantages translate into practice. Here, we demonstrate that they do so via a controlled comparison between networks of LIF and QIF neurons on the popular Spiking Heidelberg Digits dataset. Specifically, in a first step, we perform a thorough hyperparameter search to optimize both models, revealing a clear performance advantage of QIF neurons. In a second step, we visualize the loss and gradient landscapes. Consistent with their inferior performance, we find that the loss landscapes of LIF neurons, which are discontinuous, appear more fragmented and the related gradients more erratic. An analysis of the landscapes of single samples indicates that these features arise from changes in the temporal order of spikes, which often cause disruptive spike (dis)appearances. Overall, our results advocate replacing LIF neurons with neuron models exhibiting continuous spiking dynamics, such as QIF neurons, for gradient descent training.

2606.03928 2026-06-03 cs.LG cs.CL 版本更新

Value-Aware Stochastic KV Cache Eviction for Reasoning Models

面向推理模型的价值感知随机KV缓存淘汰

Ting-Yun Chang, Harvey Yiyun Fu, Deqing Fu, Chenghao Yang, Jesse Thomason, Robin Jia

发表机构 * University of Southern California(南加州大学) University of Chicago(芝加哥大学)

AI总结 针对推理模型长输出导致的KV缓存瓶颈,提出价值感知随机淘汰方法VaSE,通过保护大幅度值状态和引入随机性,在4倍压缩下比最强淘汰方法准确率提升超4%。

Comments Codes: https://github.com/terarachang/VaSE

详情
AI中文摘要

推理模型通过扩展思维链提高了准确性,但其长输出造成了内存和计算瓶颈。KV缓存淘汰方法通过从缓存中淘汰不重要的键值对来降低这一成本,但它们的准确性往往不如基于选择的稀疏注意力替代方案,后者保留了完整的KV缓存。我们识别出对KV缓存淘汰准确性至关重要的关键因素。首先,一小部分值状态具有异常大的幅度,淘汰它们会导致灾难性失败,模型进入重复推理循环。其次,在淘汰过程中引入随机性通过增加缓存多样性提高了准确性。基于这些发现,我们提出了价值感知随机KV缓存淘汰(VaSE),这是一种无需训练的方法,保护大幅度值状态并促进多样化的淘汰决策。在六个推理任务上,使用VaSE进行4倍KV缓存压缩的Qwen3模型在相同稀疏度下比最先进的选择方法获得了更高的平均准确率,同时比最强的淘汰方法高出超过4%。总体而言,VaSE弥合了效率与准确性之间的差距,支持FlashAttention2,并为推理模型实现了静态内存占用。

英文摘要

Reasoning models improve accuracy through extended chains of thought, but their long outputs create a memory and compute bottleneck. KV cache eviction methods reduce this cost by evicting unimportant key-value pairs from the cache, yet they often yield worse accuracy than selection-based sparse attention alternatives, which keep the full KV cache. We identify key factors crucial to KV cache eviction accuracy. First, a small fraction of value states have abnormally large magnitudes, and evicting them causes catastrophic failure where models enter repetitive reasoning loops. Second, introducing stochasticity during eviction improves accuracy by increasing cache diversity. Based on these findings, we propose Value-aware Stochastic KV Cache Eviction (VaSE), a training-free recipe that protects large-magnitude value states and promotes diverse eviction decisions. Across six reasoning tasks, Qwen3 models using VaSE with 4x KV cache compression yield higher average accuracies than the SOTA selection method at the same sparsity, while outperforming the strongest eviction method by more than 4%. Overall, VaSE bridges the gap between efficiency and accuracy, supporting FlashAttention2 and enabling a static memory footprint for reasoning models.
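The two ingredients named in the abstract, protecting large-magnitude value states and randomizing the eviction decision, can be sketched as follows. Everything concrete here is an assumption made for illustration: the function name `select_kept_tokens`, the L2-norm criterion, the `protect_frac` parameter, and sampling the rest in proportion to an importance score. The actual VaSE recipe is in the authors' repository.

```python
import numpy as np


def select_kept_tokens(values, scores, budget, protect_frac=0.05, rng=None):
    """Return sorted indices of cache entries to keep.

    values: (T, d) value states; scores: (T,) strictly positive importance scores.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = values.shape[0]
    if budget >= T:
        return np.arange(T)
    norms = np.linalg.norm(values, axis=1)
    n_protect = min(budget, max(1, int(round(protect_frac * T))))
    protected = np.argsort(-norms)[:n_protect]  # always keep the largest-norm values
    rest = np.setdiff1d(np.arange(T), protected)
    p = scores[rest] / scores[rest].sum()
    # Stochastic step: sample the remaining slots instead of taking a hard top-k.
    sampled = rng.choice(rest, size=budget - n_protect, replace=False, p=p)
    return np.sort(np.concatenate([protected, sampled]))


rng = np.random.default_rng(0)
values = rng.normal(size=(40, 8))
values[7] *= 50.0                 # one outlier value state
scores = rng.random(40) + 0.1     # stand-in importance scores
kept = select_kept_tokens(values, scores, budget=10, rng=rng)  # 4x compression
```

With a budget of 10 out of 40 entries, the outlier at index 7 is always retained, while the other retained entries vary from call to call.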

2606.03927 2026-06-03 cs.LG cs.AI 版本更新

FFR: Forward-Forward Learning for Regression

FFR:前向-前向学习用于回归

Xinyang Liu, Xuanyu Liang, Shiqi Ding, Boyang Li, Zhiqiang Que, Jiayang Li, Guosheng Hu

发表机构 * University of Bristol(布里斯托大学) University College London(伦敦大学学院) University of Cambridge(剑桥大学)

AI总结 提出FFR框架,通过序数竞争 goodness 函数、分层阶梯架构和层次化预测将前向-前向算法扩展到回归任务,在多个数据集上恢复BP 98.6%的精度并显著降低内存和时间开销。

详情
AI中文摘要

前向-前向(FF)算法通过纯局部、逐层优化训练神经网络,提供了反向传播(BP)的计算高效且生物合理的替代方案。然而,FF本质上是为通过对比正负样本对进行分类而设计的,将其扩展到回归面临根本性挑战:连续目标空间缺乏用于对比学习的自然“对立面”,且标准 goodness 函数不携带关于目标幅度或顺序的信息。我们提出FFR(前向-前向回归),据我们所知,这是第一个将FF扩展到现实世界回归并展示在多样化真实数据集上具有竞争力的性能的框架。FFR引入了三项关键创新:(1)序数竞争 goodness 函数,通过距离感知序数监督下分区神经元组之间的竞争学习取代对比对;(2)分层阶梯架构,其中浅层学习粗序数判别,深层细化到细粒度回归,并通过多尺度特征聚合实现层间协作;(3)带不确定性估计的层次化预测,其中多尺度预测器联合提供鲁棒预测和预测置信度作为免费午餐。大量实验结果表明,FFR在五个真实世界回归基准上平均恢复了BP 98.6%的精度,同时将峰值训练内存降低到深度8时BP的27%和深度32时BP的8%,每次迭代时间约为BP的72%,并且显著优于所有无BP的竞争对手。

英文摘要

The Forward-Forward (FF) algorithm offers a computationally efficient and biologically plausible alternative to backpropagation (BP) by training neural networks through purely local, layer-wise optimization. However, FF is inherently designed for classification via contrastive positive-negative sample pairs, and extending it to regression poses fundamental challenges: continuous target spaces lack natural "opposites" for contrastive learning, and the standard goodness function carries no information about target magnitude or ordering. We propose FFR (Forward-Forward for Regression), to our knowledge the first framework to extend FF to real-world regression and demonstrate competitive performance across diverse real-world datasets. FFR introduces three key innovations: (1) an ordinal competitive goodness function that replaces contrastive pairs with competitive learning between partitioned neuron groups under distance-aware ordinal supervision; (2) a stratified ladder architecture where shallow layers learn coarse ordinal discrimination and deeper layers refine into fine-grained regression, with multi-scale feature aggregation for inter-layer collaboration; and (3) hierarchical prediction with uncertainty estimation, where multi-scale predictors jointly provide robust predictions and prediction confidence as a free lunch. Extensive experimental results show FFR recovers on average 98.6% of BP's accuracy across five real-world regression benchmarks while reducing peak training memory to only 27% of BP's at depth 8 and 8% at depth 32, with per-iteration time around 72% of BP's, and substantially outperforms all BP-free competitors.

2606.03926 2026-06-03 cs.HC cs.LG 版本更新

DiffUNet^2: Bidirectional Prediction, Probabilistic Generation and Collaborative Visual Discovery for Scientific Data

DiffUNet^2: 科学数据的双向预测、概率生成与协同视觉发现

Mengdi Chu, Jiaxin Yang, Angus G. Forbes, Nathan Debardeleben, Earl Lawrence, Ayan Biswas, Han-Wei Shen

发表机构 * Ohio State University(俄亥俄州立大学) NVIDIA(英伟达) Los Alamos National Laboratory(洛斯阿拉莫斯国家实验室)

AI总结 提出基于扩散模型的条件生成框架DiffUNet^2,实现时间序列的双向任意步预测与概率分布捕获,并结合交互式可视化支持科学探索。

Comments 12 pages, 20 figures

详情
AI中文摘要

对科学现象进行时间演化建模对于分析和推理至关重要,然而大多数机器学习方法仅提供确定性的前向预测,忽略了多种可能的结果,且很少支持反向推理,限制了它们在科学工作流中的实用性。我们提出了一个将基于扩散的生成建模与交互式视觉分析相结合的框架,用于科学探索。我们引入了DiffUNet^2,一种条件扩散模型,能够实现跨时间的双向、任意到任意生成,并捕获系统可能演化的分布。基于该模型,我们的交互式系统支持分支时间线探索、用户引导的状态编辑和概率空间导航,使科学家能够主动探索替代假设,而非被动观察预测。我们在5个不同科学领域的数据集上评估了该模型,验证了其预测准确性和概率空间集成质量。与领域专家合作,我们证明了该方法在支持实际科学时间数据分析工作流中的有效性。通过集成建模与视觉交互,我们的方法使科学家能够交互式地探索系统动力学,将生成模型转化为假设驱动的科学分析工具。

英文摘要

Modeling temporal evolution is important to analyzing and reasoning about scientific phenomena, yet most machine learning methods provide deterministic forward predictions that overlook multiple plausible outcomes and rarely support backward reasoning, limiting their usefulness in practical scientific workflows. We present a framework that integrates diffusion-based generative modeling with interactive visual analytics for scientific exploration. We introduce DiffUNet^2, a conditional diffusion model that enables bidirectional, any-to-any generation across time and captures distributions of plausible system evolutions. Built upon the model, our interactive system supports branching timeline exploration, user-guided state editing, and probability-space navigation, enabling scientists to actively explore alternative hypotheses rather than passively observe predictions. We evaluate the model on 5 datasets across different scientific domains to validate its predictive accuracy and probability-space ensemble quality. In collaboration with domain experts, we demonstrate the effectiveness of our approach in supporting practical scientific temporal data analysis workflows. By integrating modeling and visual interaction, our approach enables scientists to interactively explore system dynamics, transforming generative models into tools for hypothesis-driven scientific analysis.

2606.03923 2026-06-03 cs.LG 版本更新

Contrastive Neural Algorithmic Reasoning for Graph Coloring

对比神经算法推理用于图着色

Thien Le, Tianyu Zhao, Melanie Weber

发表机构 * Harvard University SEAS(哈佛大学SEAS) Harvard University T.H. Chan School of Public Health(哈佛大学T.H. Chan公共卫生学院)

AI总结 提出对比学习框架学习可迁移的着色几何结构,通过图神经网络编码器实现低冲突着色,并推广到不同规模的图。

Comments 52 pages, 5 figures, 45 tables

详情
AI中文摘要

图着色旨在用尽可能少的颜色为图的节点分配颜色,使得相邻节点颜色不同。这里,我们研究近似$k$-着色,目标是用最多$k$种颜色同时最小化单色边的数量。该问题是图论的核心问题,并在调度和资源分配等领域有应用。最近的无监督GNN方法直接优化每个实例,阻碍了跨图大小和分布的泛化。我们转而提出一个对比学习框架,学习可迁移的着色几何结构,其中同色节点的嵌入对齐,而相邻节点的表示被推向不同方向。我们分析了有界大小图上的总体目标。对于单位范数嵌入,我们证明其最优解具有线原型结构:同色节点的表示坍缩到共享的一维子空间,边连接正交子空间。该几何结构在有监督设置中产生平稳条件,并在平衡着色假设下通过投影次梯度动力学保持。在非归一化变体中,梯度下降具有由商图硬间隔问题控制的最大间隔偏差。在合成和真实世界图上的实验表明,对比GNN编码器有效泛化并产生低冲突着色,与贪心方法匹配甚至有时改进。

英文摘要

Graph coloring seeks to assign colors to a graph's nodes so that adjacent nodes receive different colors, using as few colors as possible. Here, we study approximate $k$-coloring, where the goal is to use at most $k$ colors while minimizing the number of monochromatic edges. This problem is central to graph theory and has applications in areas such as scheduling and resource allocation. Recent unsupervised GNN approaches optimize each instance directly, precluding generalization across graph sizes and distributions. We instead propose a contrastive learning framework that learns transferable coloring geometry where the embeddings of same-color nodes align, while adjacent nodes' representations are pushed toward distinct directions. We analyze the resulting population objective over bounded-size graphs. For unit-norm embeddings, we show that its optima have a line-prototype structure: representations of nodes of the same color collapse to a shared one-dimensional subspace, and edges connect orthogonal subspaces. This geometry yields stationarity conditions in the supervised setting and is preserved by projected subgradient dynamics under a balanced-coloring assumption. In an unnormalized variant, gradient descent has a max-margin bias governed by a quotient-graph hard-margin problem. Experiments on synthetic and real-world graphs show that contrastive GNN encoders generalize effectively and produce low-conflict colorings, matching and sometimes improving on greedy approaches.
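The approximate $k$-coloring objective and the line-prototype structure can be illustrated together: count monochromatic edges for a candidate coloring, and read colors off embeddings by assigning each node to the closest one-dimensional subspace. The decoding rule (largest absolute cosine to a prototype direction) and the hand-written embeddings are assumptions for illustration, not the paper's decoding procedure.

```python
import numpy as np


def monochromatic_edges(edges, colors):
    """Number of edges whose endpoints share a color (the conflict count)."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


def decode_colors(embeddings, prototypes):
    """Assign each node to the prototype line with the largest |cosine|.

    The absolute value makes the rule sign-invariant, since x and -x span the same line.
    """
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return np.argmax(np.abs(E @ P.T), axis=1)


# A 4-cycle is 2-colorable; the embeddings below are hypothetical.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
embeddings = np.array([[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0], [0.0, -1.0]])
prototypes = np.eye(2)

colors = decode_colors(embeddings, prototypes)
conflicts = monochromatic_edges(edges, colors)
conflicts_trivial = monochromatic_edges(edges, [0, 0, 0, 0])
```

Nodes 0 and 2 lie on the same line with opposite signs and still receive the same color, while every edge joins orthogonal lines, so the decoded coloring has no conflicts.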

2606.03919 2026-06-03 cs.SI cs.CY cs.DL cs.LG physics.soc-ph 版本更新

Forecasting Conceptual Diffusion in Science: The Case of Quantum Computing

预测科学中的概念扩散:以量子计算为例

Thomas Maillart, Thibaut Chataing, David Dosu, Paul Bagourd, Julian Jang-Jaccard, Alain Mermoud

发表机构 * Geneva School of Economics and Management, University of Geneva(日内瓦经济管理学院,日内瓦大学) Faculty of Medicine, University of Geneva(日内瓦大学医学院) Open Quantum Institute, CERN(开放量子研究所,欧洲核子研究中心) armasuisse Science + Technology(armasuisse 科学与技术)

AI总结 通过构建时间分辨的概念共现网络并训练LightGBM模型,研究量子计算领域概念的内生巩固与外生扩散的可预测性,发现外生扩散和熵具有强可预测性(R²高达0.78),而内生巩固在量子计算中几乎不可预测,但在神经植入领域显著上升(R²=0.83),表明概念扩散受语义和引用环境中的稳定结构规律支配。

Comments 19 pages, 5 figures, 6 tables. Code and manuscript sources: https://github.com/wazaahhh/breakthroughs-diffusion . An earlier version was presented at the Global Tech Mining Conference (GTM) 2026 (submission #117)

详情
AI中文摘要

理解和预测科学变化需要能够区分科学概念的内生巩固和外生扩散的模型。利用OpenAlex中量子计算概念子树,我们构建了一个时间分辨的概念共现网络,并追踪每个概念对的上游引用谱系和下游扩散。我们在分布和多样性感知特征上训练LightGBM模型,以预测四个结果:内生巩固、外生扩散、它们的比率以及扩散熵。在控制科学整体出版增长后,内生巩固在主要的量子计算基准中基本不可预测。相比之下,外生扩散和熵具有很强的可预测性(R²高达0.78),并且由上游异质性、引用广度和分布离散度驱动,如SHAP分析所示;在机器人、先进材料和神经植入上的重复验证证实,外生扩散仍然是跨领域排名最高的目标(测试R²约0.60-0.87),而内生可预测性在神经植入中显著上升(测试R²=0.83),表明量子计算的不对称性并非普遍适用。案例研究表明,尖锐的熵增加与新概念前沿的开启同时发生,而熵崩溃则标志着技术趋同或范式更替。这些结果表明,概念扩散受嵌入语义和引用环境中的稳定结构规律支配。通过识别跨领域采纳的早期基于多样性的信号,该方法为快速发展的研究领域中的预期科学计量学、技术预见和创新导向政策分析提供了可扩展的基础。

英文摘要

Understanding and anticipating scientific change requires models that distinguish between endogenous consolidation and exogenous diffusion of scientific concepts. Using the quantum computing subtree of concepts in OpenAlex, we construct a temporally resolved concept co-occurrence network and track each concept pair through its upstream citation lineage and downstream diffusion. We train LightGBM models on distributional and diversity-aware features to predict four outcomes: endogenous reinforcement, exogenous diffusion, their ratio, and diffusion entropy. After controlling for overall publication growth of the scientific body, endogenous reinforcement proves largely unpredictable in the primary quantum-computing benchmark. In contrast, exogenous diffusion and entropy are strongly predictable ($R^2$ up to $0.78$) and are driven by upstream heterogeneity, citation breadth, and distributional dispersion, as shown by SHAP analyses; replications on robotics, advanced materials, and neuro implants confirm that exogenous diffusion remains the top-ranked target across fields ($R^2_{\text{test}} \sim 0.60$-$0.87$), while endogenous predictability rises markedly in neuro implants ($R^2_{\text{test}} = 0.83$), indicating that the quantum-computing asymmetry does not generalise uniformly. Case studies reveal that sharp entropy increases coincide with the opening of new conceptual frontiers, while entropy collapses signal technological convergence or paradigm displacement. These results demonstrate that conceptual diffusion is governed by stable structural regularities embedded in semantic and citation environments. By identifying early diversity-based signals of cross-domain uptake, the approach provides a scalable foundation for anticipatory scientometrics, technology foresight, and innovation-oriented policy analysis in rapidly evolving research fields.
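One plausible reading of "diffusion entropy" is the Shannon entropy of the distribution of downstream citing works over fields; the sketch below computes it under that assumption. The definition, the function name, and the field labels are illustrative only, since the abstract does not define the quantity.

```python
import math
from collections import Counter


def diffusion_entropy(fields):
    """Shannon entropy (in bits) of the distribution of citing works over fields."""
    counts = Counter(fields)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical downstream citations of one concept pair.
concentrated = diffusion_entropy(["physics"] * 4)
spread = diffusion_entropy(["physics", "chemistry", "cs", "biology"])
```

Under this definition, uptake confined to a single field gives zero entropy and uniform uptake across four fields gives two bits, so a sharp rise corresponds to a concept spreading into new areas.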

2606.03904 2026-06-03 cs.LG cs.CV 版本更新

MAdam: Metric-Aware Multi-Objective Adam

MAdam: 度量感知的多目标Adam

Fengbei Liu, Rachit Saluja, Sunwoo Kwak, Ruibo Wang, Ruining Deng, Heejong Kim, Johannes C. Paetzold, Mert R. Sabuncu

发表机构 * Cornell Tech(康奈尔科技) Weill Cornell Medicine(韦尔医学院) Delft University of Technology(代尔夫特理工大学)

AI总结 提出MAdam,通过偏好条件曲率预处理多目标优化中的协调方向,解决Adam与求解器之间的权重失配和几何失配问题,在多任务学习、帕累托前沿恢复等任务中一致提升性能。

详情
AI中文摘要

多目标优化是许多机器学习问题的基础,然而跨损失平衡、梯度平衡和基于帕累托的求解器家族几乎都将它们协调后的方向交给Adam处理。我们表明这种耦合在求解器的意图和优化器的执行之间引入了两个系统性差距。第一个是权重失配:Adam的二阶矩分母将时变偏好向量与梯度统计量纠缠在一起,将偏好边缘化为历史平均值,并将不同的帕累托权衡压缩为近乎均匀的混合。第二个是几何失配:Adam的自适应度量扭曲了多目标优化求解器假设的欧几里得几何,将对齐的目标转化为明显的冲突。为了共同解决这两个问题,我们引入了MAdam(度量感知的多目标Adam),这是一个即插即用的包装器,不改变求解器和优化器。MAdam通过标量化目标的偏好条件曲率对协调方向进行预处理;在此白化输入上,Adam的二阶矩退化为单位矩阵,因此实际更新由偏好条件度量主导。在多任务学习、帕累托前沿恢复、物理信息神经网络和医学成像中,MAdam在每个求解器家族上都一致优于Adam。

英文摘要

Multi-objective optimization (MOO) underlies many machine learning problems, yet MOO solvers across the loss-balancing, gradient-balancing, and Pareto-based families almost universally hand their reconciled directions to Adam (Kingma & Ba, 2015). We show this coupling introduces two systematic gaps between the solver's intent and the optimizer's execution. The first is a weighting mismatch: Adam's second-moment denominator entangles the time-varying preference vector with gradient statistics, marginalizing the preference into a history average and collapsing distinct Pareto trade-offs toward a near-uniform mixture. The second is a geometric mismatch: Adam's adaptive metric distorts the Euclidean geometry MOO solvers assume, turning aligned objectives into apparent conflicts. To resolve both jointly, we introduce MAdam (Metric-Aware Multi-Objective Adam), a drop-in wrapper that leaves both solver and optimizer unchanged. MAdam preconditions the reconciled direction by the preference-conditioned curvature of the scalarized objective; on this whitened input, Adam's second moment collapses to identity, so the realized update is governed by the preference-conditioned metric. Across multi-task learning, Pareto-front recovery, physics-informed neural networks, and medical imaging, MAdam consistently improves over Adam for every solver family.
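A simplified view of the weighting mismatch: because Adam divides each coordinate by the square root of its own second moment, rescaling a coordinate's gradient barely changes its step. The sketch below shows this for a single first step with two objectives acting on disjoint parameters; the setup and the function name are assumptions for illustration and leave out the time-varying preferences the abstract analyzes.

```python
import numpy as np


def adam_first_step(grad, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """First bias-corrected Adam update for a given gradient."""
    m = (1 - beta1) * grad
    v = (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1)
    v_hat = v / (1 - beta2)
    return -lr * m_hat / (np.sqrt(v_hat) + eps)


# Hypothetical setup: two objectives whose gradients touch disjoint parameters.
g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 1.0])

step_balanced = adam_first_step(0.5 * g1 + 0.5 * g2)  # preference (0.5, 0.5)
step_skewed = adam_first_step(0.9 * g1 + 0.1 * g2)    # preference (0.9, 0.1)
sgd_skewed = -1e-3 * (0.9 * g1 + 0.1 * g2)            # plain SGD keeps the 9:1 ratio
```

Both preference vectors produce essentially the same Adam step, whereas plain gradient descent preserves the intended 9:1 ratio between the two coordinates.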

2606.03888 2026-06-03 cs.CV cs.LG 版本更新

CoralBay: A Self-Supervised CT Foundation Model

CoralBay: 一种自监督CT基础模型

Ioannis Gatopoulos, Nicolas Känzig, Sebastian Otálora, Fei Tang

发表机构 * kaiko.ai(Kaiko AI)

AI总结 提出CoralBay框架,通过分层3D Swin骨干网络和自蒸馏学习多尺度特征,实现CT体积数据的自监督预训练,有效提升下游放射学任务性能。

详情
AI中文摘要

自监督学习已在2D自然图像上实现了大规模预训练,产生了跨任务有效迁移的通用视觉表示。然而,许多医学成像模态(如CT扫描)本质上是三维的,在结构和语义上与自然图像根本不同。体积模态捕捉空间连续性、器官解剖和基于强度的组织特性(如亨氏单位),这些无法通过2D预训练充分建模。为弥补这一差距,我们引入了CoralBay,一种自蒸馏框架,通过使用分层3D Swin骨干网络并将自蒸馏应用于拼接的多尺度特征,扩展了DINO,实现了数据高效的自监督学习,编码了全局语义和细粒度局部结构的丰富空间表示。因此,CoralBay有效迁移到广泛的下游放射学任务,在多样化的解剖目标上展现出强大且一致的性能。此外,我们通过引入一个公开、可复现的3D放射学排行榜,为开源eva框架做出贡献,该排行榜统一了多个数据集,并建立了评估体积表示学习方法的标准化基准。

英文摘要

Self-supervised learning has enabled large-scale pre-training on 2D natural images, producing general-purpose visual representations that transfer effectively across tasks. However, many medical imaging modalities, such as CT scans, are inherently three-dimensional and differ fundamentally from natural images in both structure and semantics. Volumetric modalities capture spatial continuity, organ anatomy, and intensity-based tissue properties (e.g., Hounsfield Units), which are not adequately modeled by 2D pre-training. To bridge this gap, we introduce CoralBay, a self-distillation framework that extends DINO by using a hierarchical 3D Swin backbone and applying self-distillation to concatenated multi-scale features, enabling data-efficient self-supervised learning of rich spatial representations that encode both global semantics and fine-grained local structure. As a result, CoralBay transfers effectively to a wide range of downstream radiological tasks, demonstrating strong and consistent performance across diverse anatomical targets. In addition, we contribute to the open-source eva framework by introducing a public, reproducible 3D radiology leaderboard that unifies multiple datasets and establishes a standardized benchmark for evaluating volumetric representation learning methods.

2606.03885 2026-06-03 cs.LG 版本更新

Attribution via Distributional Paths for Information Revelation

通过分布路径进行信息揭示的归因

Kieran A. Murphy, Shameen Shrestha

发表机构 * New Jersey Institute of Technology(新泽西理工学院)

AI总结 提出Reveal-IG方法,将路径归因从输入空间提升到结构化探针分布空间,通过逐步揭示信息并归因模型期望输出的变化,保留完整性并避免输入空间路径伪影。

Comments Code: https://github.com/murphyka/Reveal-IG

详情
AI中文摘要

特征归因方法通过为输入特征分配重要性分数来解释预测。基于路径的方法(如积分梯度)特别有吸引力,因为它们满足完整性:归因总和等于模型输出在参考状态和输入之间的变化。然而,大多数路径方法在输入空间中定义轨迹,沿着所选路径通过逐点扰动输入来解释模型。输入空间路径积分模型在每个经过点的原始响应,无法控制特征被查询的分辨率;轨迹中靠近基线的早期部分与输入本身对解释的贡献相同。在这里,我们将路径归因从输入空间提升到围绕感兴趣示例的结构化探针分布空间,并将我们的方法称为Reveal-IG。Reveal-IG不是遍历原始输入值,而是逐步揭示关于输入的信息,并归因模型期望输出沿此分布路径的变化。结果是一个路径归因框架,它保留了对期望模型响应的完整性,并自然地适应多尺度图像探针和表格数据中的特征级不确定性。综合诊断表明,Reveal-IG避免了影响输入空间方法的路径伪影,并且在ImageNet分类和表格回归中,它产生稳定的、有符号的归因,在使用归因符号的指标上领先,同时在其余指标上保持竞争力。

英文摘要

Feature attribution methods explain predictions by assigning importance scores to input features. Path-based methods such as Integrated Gradients are especially appealing because they satisfy \textit{completeness}: attributions sum to the change in model output between a reference state and the input. Yet most path methods define this trajectory in input space, explaining a model through pointwise perturbed inputs along a chosen path. An input-space path integrates the model's raw response at each point it passes through, with no control over the resolution at which a feature is queried; the early, baseline-adjacent part of the trajectory contributes to the explanation on equal footing with the input itself. Here, we lift path attribution from input space to a space of structured probe distributions around the example of interest, and call our method Reveal-IG. Rather than traversing raw input values, Reveal-IG progressively reveals information about the input and attributes changes in the model's expected output along this distributional path. The result is a path-attribution framework that retains completeness with respect to the expected model response, and naturally accommodates multiscale image probes and feature-wise uncertainty in tabular data. Synthetic diagnostics show that Reveal-IG avoids path artifacts that affect input-space methods, and across ImageNet classification and tabular regression it produces stable, signed attributions -- leading on metrics that use attribution sign while remaining competitive on the rest.
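
To make the completeness property concrete, the following minimal sketch computes standard input-space Integrated Gradients (not Reveal-IG) on a toy function and checks that the attributions sum to the change in output between baseline and input. The toy model, its analytic gradient, the zero baseline, and the midpoint Riemann sum are all illustrative assumptions, not details from the paper.

```python
import math

def model(x):
    # Hypothetical toy model: f(x) = x0 * x1 + sin(x2)
    return x[0] * x[1] + math.sin(x[2])

def grad(x):
    # Analytic gradient of the toy model above
    return [x[1], x[0], math.cos(x[2])]

def integrated_gradients(x, baseline, steps=1000):
    # Midpoint Riemann sum of the gradient along the straight line baseline -> x
    attributions = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad(point)
        for i in range(len(x)):
            attributions[i] += g[i] * (x[i] - baseline[i]) / steps
    return attributions

x = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]
attr = integrated_gradients(x, baseline)
# Completeness: sum of attributions should match f(x) - f(baseline)
gap = abs(sum(attr) - (model(x) - model(baseline)))
```

Every point on this straight-line path is weighted equally, including those near the baseline, which is the behavior the abstract contrasts with a distributional path.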

2606.03883 2026-06-03 cs.AI cs.LG 版本更新

Reasoning Structure of Large Language Models

大型语言模型的推理结构

Frédéric Berdoz, Luca A. Lanzendörfer, Fabian Farestam, Roger Wattenhofer

AI总结 针对大型推理模型评估中隐藏不同推理结构的问题,提出基于逻辑谜题的基准测试和将非结构化轨迹转化为可验证推理图的方法,并定义推理效率指标,以量化分析推理拓扑结构。

Comments Accepted at ICML 2026 and presented at the ICLR 2026 workshop on LLM reasoning

详情
AI中文摘要

大型推理模型(LRMs)通常使用最终答案准确率或token数量等指标进行评估。然而,这些指标上的相同分数可能隐藏着根本不同的推理结构。为了解决这一局限性,我们引入了一个可扩展的逻辑谜题LRM基准测试,以及一个将非结构化轨迹转化为包含声明和依赖关系的可验证推理图的流程。这将推理转化为一个结构化的、可测量的对象,其拓扑结构可以定量分析。在此基础上,我们定义了一个推理效率指标,用于量化模型逻辑流的集中程度。我们对开源推理模型的分析表明,结构度量能够区分token数量和准确率所混淆的行为,为诊断失败模式和比较推理如何随谜题难度扩展提供了实用工具。

英文摘要

Large reasoning models (LRMs) are often evaluated using metrics such as final-answer accuracy or token count. However, identical scores on these metrics can hide fundamentally different reasoning structures. To address this limitation, we introduce a scalable LRM benchmark of logic puzzles and a pipeline that converts unstructured traces into verifiable reasoning graphs of claims and dependencies. This turns reasoning into a structured, measurable object whose topology can be quantitatively analyzed. Building on this, we define a reasoning efficiency metric that quantifies how concentrated the model's logical flow is. Our analysis on open-source reasoning models shows that structural measurements separate behaviors that token count and accuracy conflate, providing a practical tool for diagnosing failure modes and comparing how reasoning scales with puzzle difficulty.

2606.03871 2026-06-03 cs.CV cs.CL cs.LG 版本更新

Visual Instruction Tuning Aligns Modalities through Abstraction

视觉指令调优通过抽象对齐模态

Luis Palacios, Lorenzo Basile, Diego Doimo, Alberto Cazzaniga

发表机构 * Area Science Park, Trieste, Italy(特里埃斯特Area Science Park)

AI总结 通过探针分析和因果干预,发现视觉指令调优将视觉特征直接嵌入LLM的中间语义层,绕过早期单模态处理层,并通过扩展和强化现有抽象阶段对齐视觉与文本表示。

详情
AI中文摘要

视觉指令调优有效地使预训练的大语言模型(LLM)能够同时处理图像信息和文本。然而,视觉特征如何嵌入到LLM骨干网络的逐层抽象层次中仍不清楚。通过一系列不同的视觉-语言架构,我们表明指令调优主要充当桥梁,将视觉特征直接嵌入到LLM的中间语义层,绕过了用于单模态处理的早期层。通过探针分析和因果干预,我们表明这些中间层是视觉-语言处理的语义核心,并在广泛的多模态基准测试中发挥关键作用。此外,通过比较语义等价的视觉和文本表示的几何结构,我们发现微调扩展并强化了现有的抽象阶段,使视觉特征与已有的文本特征对齐。最后,我们通过将微调限制在中间层来确认这种局部对齐的功能作用:该策略在视觉中心基准测试中保持了全微调的性能,同时减少了训练时间。我们的结果表明,多模态集成是一种局部现象,由LLM内部抽象引擎的重新利用驱动。

英文摘要

Visual instruction tuning effectively adapts a pre-trained Large Language Model (LLM) to process image information alongside text. Yet, it remains unclear how visual features are embedded into the layer-wise hierarchy of abstractions of the LLM backbone. Across a diverse set of vision-language architectures, we show that instruction tuning primarily serves as a bridge, embedding visual features directly into the intermediate semantic layers of the LLM, bypassing the early layers devoted to unimodal processing. With probing analyses and causal interventions, we show that these intermediate layers are the semantic core of vision-language processing and play a critical role in the performance on a broad set of multimodal benchmarks. In addition, by comparing the geometry of semantically equivalent visual and textual representations, we find that fine-tuning extends and strengthens the existing abstraction phase, aligning visual features with pre-existing textual ones. Finally, we confirm the functional role of this localized alignment by restricting fine-tuning to intermediate layers alone: this strategy preserves the performance of full fine-tuning on vision-centric benchmarks while reducing training time. Our results suggest that multimodal integration is a localized phenomenon driven by the repurposing of the internal abstraction engine of the LLM.

2606.03864 2026-06-03 cs.SI cs.CY cs.DL cs.LG physics.soc-ph 版本更新

Explainable Forecasting of Scientific Breakthroughs from Concept Network Dynamics

基于概念网络动力学的科学突破可解释预测

Thomas Maillart, Thibaut Chataing, Ntorina Antoni, David Dosu, Paul Bagourd, Julian Jang-Jaccard, Alain Mermoud

发表机构 * Geneva School of Economics and Management, University of Geneva, Geneva, Switzerland(日内瓦经济管理学院,日内瓦大学,瑞士日内瓦) Faculty of Medicine, University of Geneva, Geneva, Switzerland(日内瓦大学医学学院,瑞士日内瓦) TU Eindhoven, The Netherlands(埃因霍温理工大学,荷兰) Open Quantum Institute, CERN, Geneva, Switzerland(开放量子研究所,欧洲核子研究中心,瑞士日内瓦) armasuisse Science + Technology, Switzerland(armasuisse 科学与技术,瑞士)

AI总结 提出一种可解释的机器学习方法,通过建模OpenAlex概念网络的演化,预测科学突破的结构前兆(研究概念之间联系的出现和增强),并利用59个特征的两阶段LightGBM模型同时预测概念对的形成和未来权重,在四个技术/生物医学领域取得优于现有方法的ROC-AUC(0.954-0.967)和可解释性。

Comments 18 pages, 10 figures, 4 tables. An earlier version was presented at Global Tech Mining Conference 2026. Code and data: https://github.com/wazaahhh/breakthroughs-forecasting

详情
AI中文摘要

我们介绍了一种可解释的机器学习方法,通过建模OpenAlex概念网络随时间演化的方式,预测科学突破的结构前兆——研究概念之间联系的出现和增强。利用59个语义和拓扑特征,一个两阶段LightGBM模型联合预测概念对的形成及其未来权重,增加了一个回归阶段,将预期强度量化到先前的链接存在预测之上。与现有技术相比,该方法同时提高了准确性和可解释性:在四个技术和生物医学领域的比较验证中,无需重新调整即可在所有时间范围内获得[0.954, 0.967]的ROC-AUC,超过了先前模型约0.90的水平,而每个预测都基于结构化的、可审计的特征,而非不透明的嵌入。分类性能高(AUC约0.95),回归保持稳定(一到五年内RMSLE为0.45至0.6)。特征归因表明,结构因素——特别是Adamic-Adar相似性和基于度的Hadamard度量——持续驱动准确性,表明与突破相关的重组出现在紧密连接的子网络中。两个专家锚定的案例——量子退火和AI赋能的量子架构——显示模型浮现出与专家预期一致的技术融合。然后,我们概述了一个三层决策架构——检测、专家翻译、机构整合——将这些预测转化为基于证据的研究战略和政策,以开放数据和可解释特征为基础。

英文摘要

We introduce an explainable machine-learning approach that forecasts the structural precursors of scientific breakthroughs -- the emergence and intensification of links between research concepts -- by modelling how OpenAlex concept networks evolve over time. Using 59 semantic and topological features, a two-stage LightGBM model jointly predicts the formation and the future weight of concept pairs, adding a regression stage that quantifies expected intensity to prior link-existence forecasts. Relative to the state of the art, the approach improves accuracy and explainability at once: comparative validation across four technology and biomedical domains yields ROC-AUC in [0.954, 0.967] at all horizons without re-tuning, exceeding the roughly 0.90 of prior models, while every forecast rests on structural, auditable features rather than opaque embeddings. Classification performance is high (AUC about 0.95) and regression remains stable (RMSLE 0.45 to 0.6 over one to five years). Feature attribution shows that structural factors -- particularly Adamic-Adar similarity and degree-based Hadamard measures -- consistently drive accuracy, suggesting that breakthrough-relevant recombinations emerge in tightly connected sub-networks. Two expert-anchored cases, quantum annealing and AI-enabled quantum architectures, show the model surfacing technological convergence consistent with expert expectations. We then outline a three-layer decision architecture -- detection, expert translation, institutional integration -- that turns these forecasts into evidence-based research strategy and policy, anchored in open data and explainable features.
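
As an illustration of one structural feature named above, here is a minimal sketch of the Adamic-Adar index for a candidate concept pair: the sum of 1/log(degree) over the pair's common neighbors. The toy graph and concept names are hypothetical, and this is the textbook link-prediction definition rather than the authors' feature pipeline.

```python
import math

def adamic_adar(adj, u, v):
    # Sum 1 / log(degree) over the common neighbors of u and v.
    # Common neighbors with degree 1 are skipped to avoid dividing by log(1) = 0.
    common = adj[u] & adj[v]
    return sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1)

# Hypothetical undirected concept graph stored as an adjacency map of sets
adj = {
    "quantum": {"annealing", "optimization", "hardware"},
    "ml": {"optimization", "hardware"},
    "annealing": {"quantum"},
    "optimization": {"quantum", "ml"},
    "hardware": {"quantum", "ml"},
}
score = adamic_adar(adj, "quantum", "ml")
```

Shared neighbors with few other links contribute more to the score, so a high value indicates that two concepts are tied together through specialized intermediaries.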

2606.03851 2026-06-03 cs.LG 版本更新

Two-Action Apple Tasting with Switching Costs

带有切换成本的双动作苹果品尝问题

Tommaso Cesari, Roberto Colomboni

发表机构 * School of Electrical Engineering and Computer Science, University of Ottawa(渥太华大学电气工程与计算机科学学院) School of Mathematics, University of Bristol(布里斯托尔大学数学学院)

AI总结 研究对抗性对手下带有切换成本的双动作苹果品尝问题,通过揭示动作和盲动作的权衡,证明了最优遗憾为Θ(√T)。

详情
AI中文摘要

我们研究带有切换成本的双动作苹果品尝问题,对手是 oblivious 的。在等价的归一化形式中,每一轮学习者在揭示动作和盲动作之间选择:揭示动作获得奖励 $0$ 并揭示盲动作的隐藏值 $x_t\in[-1,1]$;盲动作获得奖励 $x_t$ 但不揭示任何信息。每当学习者切换动作时支付一个单位,遗憾相对于事后最佳固定动作来衡量。带有切换成本的通用反馈图算法对该问题给出 $\widetilde O(T^{2/3})$ 的遗憾保证。双动作苹果品尝图是切换成本分类中缺失的 $\Omega(T^{2/3})$ 障碍的自然候选:这样的下界将传递到一大类仍未分类的反馈图。我们证明这个障碍不存在:该问题的 oblivious 极小极大期望遗憾满足 \[ \frac{1}{2\sqrt3}\cdot\sqrt T \le R_T^\star \le 2\sqrt{3}\cdot \sqrt{T}. \]

英文摘要

We study the two-action apple-tasting problem with switching costs against an oblivious adversary. In an equivalent normalized formulation, at each round the learner chooses between a revealing action and a blind action: the revealing action gives reward $0$ and reveals the hidden value $x_t\in[-1,1]$ of the blind action; the blind action gives reward $x_t$ but reveals nothing. The learner pays one unit whenever they switch actions, and regret is measured against the best fixed action in hindsight. General feedback-graph algorithms with switching costs give $\widetilde O(T^{2/3})$ regret guarantees for this problem. The two-action apple-tasting graph was the natural candidate for the missing $\Omega(T^{2/3})$ obstruction in the switching-cost classification: such a lower bound would have transferred to a large family of still-unclassified feedback graphs. We prove that this obstruction is not there: the oblivious minimax expected regret for this problem satisfies \[ \frac{1}{2\sqrt3}\cdot\sqrt T \le R_T^\star \le 2\sqrt{3}\cdot \sqrt{T}. \]
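
The normalized protocol can be sketched as a small simulator that scores a fixed action sequence against the best fixed action in hindsight. This is only an illustration of the feedback and cost model described above, not the paper's algorithm; the function names are hypothetical, and it assumes the first round incurs no switching cost.

```python
REVEAL, BLIND = 0, 1

def play(actions, xs):
    # Returns (total reward net of switching costs, per-round observations)
    total, observed, prev = 0.0, [], None
    for a, x in zip(actions, xs):
        if prev is not None and a != prev:
            total -= 1.0            # one unit paid per switch
        if a == BLIND:
            total += x              # reward x_t, nothing revealed
            observed.append(None)
        else:
            observed.append(x)      # reward 0, x_t revealed
        prev = a
    return total, observed

def regret(actions, xs):
    # Always-reveal earns 0 and always-blind earns sum(xs); neither ever switches
    best_fixed = max(0.0, sum(xs))
    total, _ = play(actions, xs)
    return best_fixed - total

xs = [0.5, -1.0, 0.25, 1.0]
actions = [REVEAL, REVEAL, BLIND, BLIND]
r = regret(actions, xs)
```

In this example the learner observes the first two values, switches once, and then collects the last two rewards blindly, so its regret combines the missed reward and the single switching cost.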

2606.03846 2026-06-03 cs.CL cs.AI cs.LG 版本更新

Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models

聚类自评估:一种简单而有效的大型语言模型不确定性量化方法

Qi Cao, Takeshi Kojima, Andrew Gambardella, Helinyi Peng, Yutaka Matsuo, Yusuke Iwasawa

发表机构 * The University of Tokyo(东京大学)

AI总结 提出一种基于语义聚类和多项选择概率的简单自评估方法,用于大型语言模型的不确定性量化,在多个模型和数据集上优于基线方法。

Comments Findings of ACL 2026

详情
AI中文摘要

大型语言模型(LLM)在各种任务中表现出色,但常常生成看似合理实则事实错误的回答。这一问题因缺乏明确的不确定性估计而加剧,使用户难以判断模型输出的可靠性。现有的不确定性量化方法通常依赖间接信号,如生成样本的熵。这些信号难以解释,且未充分利用模型评估自身不确定性的能力。我们提出一种简单而有效的自评估方法用于LLM的不确定性量化。我们的方法将生成样本分组为语义不同的聚类,将其转化为结构化多项选择题的答案选项,并使用LLM分配给每个选项的概率作为置信度估计。在多个模型和数据集上的实验表明,我们的方法始终优于基线方法。值得注意的是,仅需两个额外样本即可达到竞争性能,证明了其有效性和效率。

英文摘要

Large language models (LLMs) demonstrate remarkable performance across diverse tasks, but they often generate responses that appear plausible while being factually incorrect. This problem is compounded by the lack of explicit uncertainty estimates, which makes it difficult for users to judge the reliability of model outputs. Existing uncertainty quantification methods typically rely on indirect signals, such as entropy across sampled generations. These signals can be difficult to interpret and do not fully leverage the model's ability to assess its own uncertainty. We propose a simple yet effective self-assessment method for uncertainty quantification in LLMs. Our approach groups sampled generations into semantically distinct clusters, converts them into answer options in a structured multiple-choice question, and uses the probability assigned by the LLM to each option as a confidence estimate. Experiments across multiple models and datasets show that our method consistently outperforms baseline approaches. Notably, it achieves competitive performance with as few as two additional samples, demonstrating both its effectiveness and efficiency.

2606.03843 2026-06-03 cs.LG cs.AI 版本更新

Re-Evaluating Continual Learning with Few-Shot Adaptation

重新评估带少样本适应的持续学习

Amogh Inamdar, Matthew So, Vici Milenia, Richard Zemel

发表机构 * Department of Computer Science(计算机科学系)

AI总结 本文提出用少样本评估替代零样本评估来更全面衡量持续学习系统的稳定性和可塑性,并通过新指标“每样本可塑性”发现元学习未来任务序列能诱导学习到学习行为。

Comments 21 pages, 16 figures

详情
AI中文摘要

持续学习方法旨在最大化在任务序列上训练的机器学习模型的稳定性和可塑性。稳定性的标准度量(即遗忘)是模型在先前学习任务上的零样本性能,而可塑性则是在最近学习任务上的性能。然而,零样本评估并未完全衡量模型或方法保留已学信息或快速适应新信息的能力,因为它需要在多个任务上完美回忆。在本文中,我们提出少样本评估作为对持续学习系统稳定性和可塑性的更全面评估。我们对持续图像分类的任务序列进行了细粒度评估,发现这一范式为流行持续学习策略的性能提供了新颖的见解。通过使用新指标——每样本可塑性——进行少样本评估,我们展示了通过元学习未来任务的短序列向持续学习方法添加“前瞻性”会在任务序列上诱导学习到学习的行为。

英文摘要

Continual learning methods aim to maximize the stability and plasticity of machine learning models that are trained on a sequence of tasks. The standard measure of stability (i.e., forgetting) is the 0-shot performance of a model on previously learned tasks, and plasticity, the performance on the most recently learned task. However, 0-shot evaluation does not fully measure a model or method's ability to retain learned information or adapt quickly to new information, as it requires perfect recall across multiple tasks. In this paper, we propose few-shot evaluation as a more comprehensive assessment of the stability and plasticity of a continual learning system. We conduct a fine-grained assessment on task sequences for continual image classification and find that this paradigm produces novel insights into the performance of popular continual learning strategies. Through few-shot evaluation with a novel metric -- per-shot plasticity -- we show that adding 'foresight' to continual learning methods via the meta-learning of a short sequence of future tasks induces learning-to-learn behavior over the task sequence.

2606.03839 2026-06-03 cs.LG 版本更新

Text-attributed Graph Condensation via Text Selection and Attribute Matching

通过文本选择和属性匹配的文本属性图压缩

Haowei Han, Yuxiang Wang, Guojia Wan, Hao Wang, Shanshan Feng, Hao Huang, Jiawei Jiang, Xiao Yan

发表机构 * School of Computer Science Wuhan University(武汉大学计算机学院) Institute for Math & AI Wuhan University(武汉大学数学与人工智能研究院)

AI总结 提出TAGSAM方法,通过子图文本选择和属性相似性匹配压缩文本属性图,在保持训练精度的同时显著降低空间和时间消耗。

详情
AI中文摘要

文本属性图(TAG)是一种重要的图结构数据,其中每个节点都有文本描述。TAG模型通常联合训练图神经网络(GNN)和语言模型,导致高空间和时间消耗,尤其是在大型数据集上。为了缓解这一问题,我们提出了TAGSAM,一种在保持训练精度的同时压缩TAG的压缩方法。TAGSAM有两个关键设计,即子图文本选择和属性相似性匹配,分别压缩TAG的文本描述和图拓扑。对于文本,子图文本选择通过最大化互信息从多个相关文本描述中选择并合并代表性文本块。对于图拓扑,基于匹配训练轨迹(MTT)的流行压缩方法存在高方差,阻碍了精度。我们的属性相似性匹配通过对齐稳定的相似性矩阵来缓解这一问题。我们评估了TAGSAM与六个最先进的基线方法,结果显示其优越性能。在相同压缩大小下,TAGSAM在精度上平均比最佳基线提高4.9%。此外,即使将TAG压缩到仅1%的大小,它仍能保持有竞争力的训练精度。我们的代码可在以下网址获取:https://github.com/SundayVHan/TAGSAM

英文摘要

Text-Attributed Graph (TAG) is an important type of graph structured data, where each node has a text description. TAG models usually train a Graph Neural Network (GNN) and language model jointly, which leads to high space and time consumption, especially on large datasets. To mitigate this, we propose TAGSAM, a condensation method that compresses TAGs while preserving training accuracy. TAGSAM comes with two key designs, i.e., subgraph text Selection and Attribute similarity Matching, which compress the text description and graph topology of TAG, respectively. For the texts, subgraph text selection selects and merges representative text chunks from multiple related text descriptions by maximizing mutual information. For the graph topology, popular condensation methods based on Matching Training Trajectories (MTT) suffer from high variance, which hinders accuracy. Our attribute similarity matching mitigates this issue by aligning stable similarity matrices. We evaluate TAGSAM against six state-of-the-art baselines, where it showcases superior performance. For the same compressed size, TAGSAM improves upon the best-performing baseline by an average of 4.9% in accuracy. Furthermore, it maintains competitive training accuracy even when the TAG is condensed to just 1% size. Our code is available at https://github.com/SundayVHan/TAGSAM

2606.03831 2026-06-03 cs.LG stat.ML 版本更新

Online Learning with Gradient-Variation Interval Regret

基于梯度变化的区间遗憾在线学习

Yan-Feng Xie, Shuche Wang, Peng Zhao, Zhi-Hua Zhou

发表机构 * State Key Laboratory for Novel Software Technology and the School of Artificial Intelligence, Nanjing University(新型软件技术国家重点实验室和人工智能学院,南京大学) Institute of Operations Research and Analytics, National University of Singapore(运筹与分析研究所,新加坡国立大学)

AI总结 本文提出首个基于梯度变化量实现区间遗憾界的在线学习算法,采用两层在线集成结构,自适应多种问题相关量并达到极小极大最优率,同时引入Lipschitz和平滑性无关的变体。

详情
AI中文摘要

本文研究使用区间遗憾度量的非平稳在线学习,该度量要求在线算法在每个时间区间内表现良好。我们提出了第一个在线学习算法,其区间遗憾界随梯度变化缩放,梯度变化是衡量在线函数梯度累积变化的基本度量,与多种问题相关量有关,并与随机优化等问题紧密相连。我们的方法采用简单高效的两层在线集成结构,实现了强大的理论保证。具体来说,它享有同时自适应多种问题相关量的遗憾界,同时在最坏情况下保持极小极大最优率。此外,认识到超参数调优的挑战,我们引入了一种Lipschitz和平滑性无关的变体,自动适应这些可能未知的常数。这主要得益于一种新颖的Lipschitz自适应元算法,该算法可能具有独立的意义。除了区间遗憾,我们的方法还产生了更广泛的影响:它为区间动态遗憾(一种更强的度量,与任何区间上的变化比较器竞争)提供了通用的界,并首次为随机扩展对抗优化提供了分段刻画。理论发现通过实验得到验证。

英文摘要

This paper investigates non-stationary online learning using the metric of interval regret, which requires an online algorithm to perform well over every time interval. We propose the first online learning algorithm that achieves an interval regret bound scaling with gradient variation, a fundamental measure of the cumulative change in online function gradients, which relates to various problem-dependent quantities and is closely connected to stochastic optimization and other problems. Our method employs a simple and efficient two-layer online ensemble structure that achieves strong theoretical guarantees. Specifically, it enjoys a regret bound that simultaneously adapts to various problem-dependent quantities while also preserving the minimax-optimal rate in the worst case. Moreover, recognizing the challenge of hyperparameter tuning, we introduce a Lipschitz- and smoothness-agnostic variant that automatically adapts to these potentially unknown constants. This is primarily enabled by a novel Lipschitz-adaptive meta algorithm, which may be of independent interest. Beyond interval regret, our method also yields broader implications: it provides versatile bounds for interval dynamic regret, a stronger measure that competes with changing comparators over any interval, and yields the first piecewise characterization for stochastic extended adversarial optimization. Theoretical findings are validated by experiments.
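
For orientation, gradient variation is commonly defined in the online convex optimization literature as shown below, together with a natural restriction to an interval. The notation is assumed here for illustration and may differ from the paper's: $f_t$ denotes the online function at round $t$, $\mathcal{X}$ the feasible domain, and $I=[r,s]$ a time interval.

```latex
V_T = \sum_{t=2}^{T} \sup_{x \in \mathcal{X}} \left\| \nabla f_t(x) - \nabla f_{t-1}(x) \right\|_2^2,
\qquad
V_I = \sum_{t=r+1}^{s} \sup_{x \in \mathcal{X}} \left\| \nabla f_t(x) - \nabla f_{t-1}(x) \right\|_2^2 .
```

Under this reading, an interval regret bound that scales with gradient variation would depend on a quantity like $V_I$, which can be much smaller than the interval length when consecutive gradients change slowly.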

2606.03825 2026-06-03 cs.LG cs.CL 版本更新

Dynamic Short Convolutions Improve Transformers

动态短卷积改进Transformer

Oliver Sieberling, Bharat Runwal, Rameswar Panda, Yoon Kim

发表机构 * Massachusetts Institute of Technology(麻省理工学院) MIT-IBM Watson AI Lab(MIT-IBM沃森人工智能实验室)

AI总结 本文提出动态短卷积作为新的神经网络原语,通过输入依赖的滤波器增强Transformer,在语言建模中相比标准Transformer和静态短卷积变体持续提升性能,并带来计算优势。

详情
AI中文摘要

Transformer已成为大型语言模型的主导架构,主要得益于注意力、前馈层、残差连接和归一化的可扩展性和灵活性。本文引入动态短卷积作为改进Transformer的额外神经网络原语。与静态短卷积不同,动态卷积使用输入依赖的滤波器,在保持卷积局部性偏差的同时增加表达能力。动机实验表明,在具有挑战性的关联回忆任务中,对键、查询和值表示应用动态短卷积相比静态卷积变体提升了性能。在从150M到2B参数的语言建模实验中,动态卷积持续优于标准Transformer和用静态短卷积增强的Transformer。拟合缩放定律表明,当动态卷积应用于键、查询和值向量时,相对于计算匹配的Transformer具有1.33倍的计算优势,而在每个线性层后添加动态卷积时优势达到1.60倍。动态卷积还在线性RNN(Mamba-2/Gated DeltaNet)和混合专家架构上带来了改进。我们通过自定义Triton内核使这些增益变得实用,实现了高效的训练和可管理的端到端减速。这些结果表明,动态短卷积是一种可扩展、硬件高效且富有表现力的原语,可用于推进基于Transformer的语言模型。

英文摘要

Transformers have become the dominant architecture for large language models, largely due to the scalability and flexibility of attention, feed-forward layers, residual connections, and normalization. This paper introduces dynamic short convolutions as an additional neural network primitive for improving Transformers. Unlike static short convolutions, dynamic convolutions use input-dependent filters, which preserves the locality bias of convolution while increasing expressivity. Motivating experiments show that applying dynamic short convolutions to key, query, and value representations improves performance on challenging associative recall tasks compared with static convolutional variants. Across language-modeling experiments ranging from 150M to 2B parameters, dynamic convolutions consistently outperform standard Transformers and Transformers augmented with static short convolutions. Fitting scaling laws indicates a 1.33$\times$ compute advantage over compute-matched Transformers when dynamic convolutions are applied to the key, query, and value vectors, and a 1.60$\times$ advantage when adding dynamic convolutions after every linear layer. Dynamic convolutions also offer improvements on linear RNNs (Mamba-2/Gated DeltaNet) and mixture-of-experts architectures. We make these gains practical with custom Triton kernels that enable efficient training with a manageable end-to-end slowdown. These results suggest that dynamic short convolutions are a scalable, hardware-efficient, and expressive primitive for advancing Transformer-based language models.
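
The difference between static and dynamic short convolutions can be sketched as follows. This is an illustrative assumption about one simple parameterization (a linear projection of the current token produces the kernel taps, shared across channels); the abstract does not specify the paper's actual filter generator, and the function names are hypothetical.

```python
def static_short_conv(x, kernel):
    # Causal convolution: y[t] = sum_j kernel[j] * x[t - j], same kernel at every position
    out = []
    for t in range(len(x)):
        y = [0.0] * len(x[0])
        for j, w in enumerate(kernel):
            if t - j >= 0:
                for c in range(len(y)):
                    y[c] += w * x[t - j][c]
        out.append(y)
    return out

def dynamic_short_conv(x, proj):
    # proj has shape (K, d); the kernel at position t is proj @ x[t],
    # so the filter depends on the input (assumed parameterization)
    out = []
    for t in range(len(x)):
        kernel = [sum(p * v for p, v in zip(row, x[t])) for row in proj]
        y = [0.0] * len(x[0])
        for j, w in enumerate(kernel):
            if t - j >= 0:
                for c in range(len(y)):
                    y[c] += w * x[t - j][c]
        out.append(y)
    return out

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 positions, d = 2 channels
proj = [[1.0, 0.0], [0.0, 1.0]]            # K = 2 taps
y_dyn = dynamic_short_conv(x, proj)
y_sta = static_short_conv(x, [1.0, 0.0])
```

Both variants only look back K - 1 positions, which keeps the locality bias; the dynamic version lets each position choose how to weight that short window.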

2606.03821 2026-06-03 cs.LG 版本更新

Finding Needles in the Haystack: Transductive Active Labeling in Ecology

在干草堆中寻找针:生态学中的转导式主动标注

Rupa Kurinchi-Vendhan, Sara Beery

发表机构 * Massachusetts Institute of Technology(麻省理工学院)

AI总结 本文提出转导式主动标注方法,通过发现稀有类样本解决生态数据长尾分布问题,并设计混合停止准则提升稀有类恢复。

详情
AI中文摘要

主动学习现在已成为标注生态数据的标准做法,使生态学家能够快速处理大量野外数据以理解和监测自然环境。当前的做法归纳性地评估主动学习,在保留的测试集上估计预测性能。我们认为这种评估与大多数生态任务不一致,这些任务的目标是尽可能高效地转导式地标注整个数据池。我们证明,忽略人在回路中会低估继续标注的重要性,特别是对于长尾中的类别,这些类别可能具有不成比例的生态重要性(稀有物种、不常见行为等)。我们的分析表明,对于这个长尾,转导式目标将重要性从预测转移到发现:真正的挑战变成了在干草堆中寻找针,即嵌入在潜在几何中丰富类别密集区域内的稀有类别样本,我们通过一种新的采样难度度量来量化这一点。最后,为了将这些见解转化为实际的生态工作流程,我们提出了一种受生态稀疏曲线启发的保守混合停止准则,并表明将预测性能与发现标准相结合可以减少长尾池上的过早停止,当发现(而非分类)是限制因素时,改善稀有类别的恢复。

英文摘要

Active learning is now standard practice in labeling ecological data, enabling ecologists to quickly process large volumes of field data to understand and monitor natural environments. Current practices evaluate active learning inductively, estimating predictive performance on a held-out test set. We argue that this evaluation is misaligned with most ecological tasks, where the goal is to transductively label an entire pool of data as efficiently as possible. We demonstrate that ignoring the human-in-the-loop underestimates the importance of continuing to label, particularly for classes in the long tail which may be of disproportionate ecological importance (rare species, uncommon behaviors, etc.). Our analysis shows that, for this long tail, the transductive objective shifts importance from prediction to discovery: the true challenge becomes finding "needles in the haystack," examples of rare classes that are embedded within dense regions of abundant classes in the latent geometry, which we quantify with a novel metric of sampling difficulty. Finally, to translate these insights to practical ecological workflows, we propose a conservative hybrid stopping criterion inspired by ecological rarefaction curves, and show that combining predictive performance with discovery criteria reduces premature stopping on long-tailed pools, improving rare-class recovery when discovery, not classification, is the limiting factor.

2606.03819 2026-06-03 cs.LG 版本更新

TreeFlash: Parallel AR-Approximation for Faster Speculative Decoding

TreeFlash: 用于更快推测解码的并行AR近似

Peer Rheinboldt, Frédéric Berdoz, Roger Wattenhofer

发表机构 * ETH Zurich(苏黎世联邦理工学院)

AI总结 提出TreeFlash,通过MLP层近似自回归分布,在保持O(1)解码时间复杂度的同时,提升树形推测解码的块效率和加速比。

详情
AI中文摘要

用于推测解码的一次性块起草器在单次前向传播中生成完整草稿,通过消除顺序令牌生成实现高吞吐量。然而,它们仅基于前缀上下文预测每个草稿令牌,而不依赖于先前生成的令牌。这种非自回归条件导致随着草稿深度增加,起草器的分布偏离验证器的真实自回归分布。在基于树的起草中,这个问题更加严重,因为不同的分支被迫共享后续令牌的相同边际分布。我们提出TreeFlash,通过引入一个以起草器隐藏状态和前一个令牌为条件的MLP层来近似自回归分布,从而解决这一问题。TreeFlash通过采用两阶段近似机制,保留了一次性起草器的O(1)解码时间复杂度。TreeFlash在各种任务和模型上实现了最先进的性能,与边际树起草相比,块效率提高了12%,加速比提高了9%。

英文摘要

One-shot block drafters for speculative decoding generate the full draft in a single forward pass, achieving strong throughput by eliminating sequential token generation. However, they predict each draft token conditioned only on the prefix context, with no dependence on previously drafted tokens. This non-autoregressive conditioning causes the drafter's distribution to diverge from the verifier's true autoregressive distribution as draft depth grows. This problem becomes more severe in tree-based drafting, where distinct branches are forced to share the same marginal distribution for subsequent tokens. We propose TreeFlash, which addresses this by incorporating an MLP layer conditioned on the drafter's hidden state and the previous token to approximate an autoregressive distribution. TreeFlash retains the $\mathcal{O}(1)$ decoding time complexity of one-shot drafters by employing a two-stage approximation mechanism. TreeFlash achieves state-of-the-art performance across a variety of tasks and models, improving over marginal tree drafting with $12\%$ higher block efficiency and $9\%$ higher speedup.

2606.03811 2026-06-03 cs.CR cs.AI cs.LG 版本更新

AI Agents Enable Adaptive Computer Worms

AI代理实现自适应计算机蠕虫

Jonas Guan, Tom Blanchard, Hanna Foerster, Hengrui Jia, Gabriel Huang, Nicolas Papernot

发表机构 * University of Toronto(多伦多大学) Vector Institute(向量研究所) University of Cambridge(剑桥大学) ServiceNow

AI总结 本研究展示了AI代理能够生成针对每个目标的定制攻击策略,利用被感染机器上的大语言模型自我维持并传播,形成自持的AI驱动网络威胁。

详情
AI中文摘要

计算机蠕虫是一种通过在网络中从一台机器复制到另一台机器来传播的恶意软件。传统蠕虫(如WannaCry)利用预定的漏洞,修补这些漏洞即可阻止其传播。本文表明,人工智能(AI)代理实现了一种根本性的新威胁:一种能够针对每个遭遇的目标生成定制攻击策略的蠕虫。该蠕虫寄生性地利用被感染的机器运行开放权重的大语言模型(LLM)以维持其推理能力,或扩展其攻击范围。在部署于Linux、Windows和物联网(IoT)设备的机器网络上,该蠕虫通过利用常见的现实企业网络漏洞进行传播。由于蠕虫由窃取的计算资源驱动,攻击者每次新感染所需的边际成本为零。这在攻击者和防御者之间造成了不稳定的经济不对称。此外,由于蠕虫不需要商业AI平台,集中式安全控制(如服务拒绝或速率限制)在结构上无关紧要。我们的结果表明,自持的AI驱动网络威胁不再是理论上的。我们必须为自主的生成式对手做好准备:这些恶意软件系统无需人类操作员即可传播,其定义不是固定的利用代码,而是实时推理目标、适应观察并合成攻击逻辑的能力。

英文摘要

A computer worm is malware that spreads on a network by replicating itself from one machine to another. Traditional worms, like WannaCry, exploited predetermined vulnerabilities, and their spread can be halted by patching those vulnerabilities. Here we show that artificial intelligence (AI) agents enable a fundamentally new threat: a worm that generates tailored attack strategies to each target it encounters. The worm parasitically uses compromised machines to run open-weight large language models (LLMs) to sustain its reasoning, or extend its reach for further attacks. Deployed on a network of machines spanning Linux, Windows, and IoT (Internet of Things) devices, the worm propagated by exploiting common, real-world corporate network vulnerabilities. Since the worm is powered by stolen compute, the attacker's marginal cost per new infection is zero. This creates a destabilizing economic asymmetry between attackers and defenders. Moreover, because the worm requires no commercial AI platform, centralized safety controls, such as service refusals or rate limiting, are structurally irrelevant. Our results demonstrate that self-sustaining AI-driven cyber-threats are no longer theoretical. We must prepare for autonomous generative adversaries: malware systems that propagate without human operators and are defined not by fixed exploit code, but by the capacity to reason about targets, adapt to observations, and synthesize attack logic in real time.

2606.03808 2026-06-03 cs.LG cs.AI cs.CR 版本更新

PURGE: Projected Unlearning via Retain-Guided Erasure

PURGE: 通过保留引导擦除的投影遗忘

Vedant Jawandhia, Daksh Ahuja, Ghufran Alam Siddiqui, Prashant Trivedi, Yash Sinha, Pratik Narang

发表机构 * BITS Pilani, Pilani Campus, India(印度博拉理工学院皮拉尼校区)

AI总结 提出一种基于持续学习与机器遗忘对偶性的遗忘算法PURGE,利用梯度投影约束保留损失,并通过多层表示擦除和保留混淆目标实现隐私与效用的平衡。

Comments 13 pages, 10 figures, 6 tables

详情
AI中文摘要

我们提出PURGE,一种基于简单但未被充分利用的观察构建的机器遗忘算法:持续学习(CL)和机器遗忘(MU)本质上是对偶问题。CL试图在不遗忘旧任务的情况下学习新任务;MU试图在不损害保留性能的情况下擦除特定数据,代表了相同基本张力在相反方向上的体现。PURGE通过调整A-GEM(Chaudhry等人,2019)的梯度投影来利用这种对偶性,使得每个遗忘步骤都受到约束,不会增加保留集损失。在此基础上,它执行多层表示擦除,将中间层中遗忘集的激活推向保留分布,以从隐藏表示中移除信息,而不仅仅是在输出层抑制信息。一个关键的设计选择是保留混淆目标:不是将遗忘输出推向均匀分布(我们发现这很容易被成员推断攻击检测到),而是将目标设定为模型在保留数据上的自然混淆模式。这使得遗忘模型难以与从头重新训练的模型区分。两个自调节停止标准(保留损失预算和遗忘准确率目标)让算法自行决定何时停止,无需手动调整训练轮数。在五个数据集(CIFAR-10、MNIST、SVHN、STL10、PathMNIST)上的22个类别级遗忘任务实验中,PURGE始终将保留准确率保持在96%以上,同时实现接近0.5(理想值)的MIA AUROC,在隐私-效用前沿上优于梯度上升、KL均匀分布以及多个已发表的基线方法。

英文摘要

We propose PURGE, a machine unlearning algorithm built on a simple but under-exploited observation: continual learning (CL) and machine unlearning (MU) are fundamentally dual problems. CL tries to learn new tasks without forgetting old ones; MU tries to erase specific data without hurting retained performance, representing the same underlying tension in opposite directions. PURGE leverages this duality by adapting gradient projection from A-GEM (Chaudhry et al., 2019) so that every unlearning step is constrained to not increase the retain-set loss. On top of this, it performs multi-layer representation erasure, pushing forget-set activations in intermediate layers towards the retain distribution to remove information from hidden representations rather than just suppressing it at the output. A key design choice is the retain-confusion target: rather than pushing forget outputs toward the uniform distribution, which we found to be surprisingly easy for membership inference attacks to detect, we instead target the model's natural confusion pattern on retain data. This makes the unlearned model hard to distinguish from one retrained from scratch. Two self-regulating stopping criteria (a retain-loss budget and a forget-accuracy target) let the algorithm decide on its own when to stop, removing the need for manual epoch tuning. In experiments on five datasets (CIFAR-10, MNIST, SVHN, STL10, PathMNIST) across 22 class-level forgetting tasks, PURGE consistently keeps retain accuracy above 96% while achieving MIA AUROC close to 0.5 (the ideal), outperforming gradient ascent, KL-uniform, and several published baselines on the privacy-utility frontier.
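
The gradient-projection constraint can be illustrated with a minimal A-GEM-style sketch: if a descent step along the unlearning gradient would raise the retain-set loss to first order, the conflicting component is projected out. The function names and toy vectors are hypothetical, and how PURGE actually forms its unlearning and retain gradients is not specified in the abstract.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(g_forget, g_retain):
    # A descent step theta -= lr * g changes the retain loss by about -lr * <g, g_retain>.
    # A negative inner product means the step would increase the retain loss,
    # so remove the component of g_forget along g_retain (A-GEM-style projection).
    inner = dot(g_forget, g_retain)
    if inner >= 0:
        return list(g_forget)
    scale = inner / dot(g_retain, g_retain)
    return [gf - scale * gr for gf, gr in zip(g_forget, g_retain)]

g_forget = [1.0, -2.0]   # hypothetical gradient of the unlearning objective
g_retain = [0.0, 1.0]    # hypothetical gradient of the retain-set loss
g_proj = project(g_forget, g_retain)
```

After projection the update is orthogonal to the retain gradient, so to first order the retain loss is left unchanged while the step still makes progress on the unlearning objective.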

2606.03804 2026-06-03 cs.LG 版本更新

Easy-to-Use Shielding for Reinforcement Learning

易于使用的强化学习屏蔽技术

Stefan Pranger, Bettina Könighofer

AI总结 提出tempestpy库,将形式化屏蔽合成集成到Gymnasium API中,降低强化学习安全探索的门槛,并扩展了随机多人博弈的屏蔽算法。

详情
AI中文摘要

安全探索是强化学习中的一个关键挑战,旨在防止智能体在探索环境时做出有害决策。屏蔽是一种利用环境模型形式的领域知识来决定动作安全性的技术。尽管已经成熟,但由于缺乏将形式化屏蔽合成与标准强化学习框架连接起来的可访问端到端基础设施,屏蔽在强化学习中的应用有限。应用屏蔽通常需要形式化方法的专业知识和大量的工程工作,使其脱离典型的强化学习工作流程。我们通过将屏蔽合成工具Tempest扩展为安全强化学习的实用后端来解决这一问题。我们的核心贡献是tempestpy,一个Python库,它将基于Tempest的屏蔽合成直接集成到Gymnasium API中,使得屏蔽可以在现有的强化学习管道中合成和部署。这降低了屏蔽的入门门槛,将形式化安全探索方法转化为强化学习实践者可用的组件。我们还扩展了Tempest的算法支持,以计算随机多人博弈的可靠屏蔽,保留了形式化安全保证。我们端到端地展示了最终的工作流程,并在多个环境中评估了有屏蔽和无屏蔽的强化学习。为了便于建模,我们为MiniGrid提供了符号模型,并引入了MiniGridSafe,这是一个游乐场环境集合,旨在使屏蔽易于访问且实验透明。MiniGridSafe通过具有概率转换和额外智能体的安全导向场景扩展了MiniGrid,使得在简单直观的设置中研究具有挑战性的安全方面成为可能。

英文摘要

Safe exploration is a key challenge in Reinforcement Learning (RL) that aims to prevent agents from making harmful decisions while exploring their environment. Shielding is one such technique that assumes domain knowledge in the form of an environment model to decide upon action safety. Although well-established, shielding has seen limited adoption in RL due to the lack of accessible end-to-end infrastructure connecting formal shield synthesis with standard RL frameworks. Applying shielding typically requires expertise in formal methods and substantial engineering effort, keeping it outside the typical RL workflow. We address this by extending our shield synthesis tool Tempest into a practical backend for safe RL. Our core contribution is tempestpy, a Python library that integrates Tempest-based shield synthesis directly into the Gymnasium API, allowing shields to be synthesized and deployed within existing RL pipelines. This lowers the barrier to entry for shielding and turns formal safe-exploration methods into a usable component for RL practitioners. We also extend Tempest's algorithmic support to compute sound shields for stochastic multiplayer games, preserving formal safety guarantees. We demonstrate the resulting workflow end to end and evaluate shielded and unshielded RL across multiple environments. To facilitate modeling, we provide symbolic models for MiniGrid and introduce MiniGridSafe, a collection of playground environments designed to make shielding easily accessible and experimentally transparent. MiniGridSafe extends MiniGrid with safety-oriented scenarios featuring probabilistic transitions and additional agents, enabling the study of challenging safety aspects in a simple and intuitive setting.

2606.03800 2026-06-03 cs.LG cs.AI 版本更新

Trading Human Curation for Synthetic Augmentation in RLVR

在RLVR中用合成增强替代人工策展

Akshansh, Leonardo Rosa Rodrigues, Michael Korostelev, Youssef Hassan, Mark E. Whiting

发表机构 * Pareto AI

AI总结 研究通过预指定、门控过滤的增强任务替代人工策展任务,在RLVR中实现成本效益权衡,并保持泛化性能。

Comments 21 pages, 5 main-text figures, 4 appendix figures. Preprint

详情
AI中文摘要

高质量训练任务的供应是基于可验证奖励的强化学习(RLVR)在智能体语言模型上的核心瓶颈。每个任务需要一个沙盒环境、一个提示和一个手工编写的奖励函数,只有通过质量标准的任务才能产生有用的训练信号。达到这一质量标准的人工策展在有效RL训练所需的任务数量上无法经济地扩展,而自动生成的任务变体与人工编写任务之间的替代率尚未确定。我们研究在RLVR期间,使用预指定、门控过滤的增强(augmentations)作为额外人工策展的替代品。我们形式化了增强任务与人工任务之间的成本调整权衡率 $\rho_{\text{cost}}$,通过在不同增强比例的训练语料库上进行受控消融实验来测量它,并描述了增强管道的端到端经济学。用增强内容替代额外的人工编写任务,在涵盖代码、指令遵循、推理和多轮智能体函数调用的十个基准测试套件上保持了聚合的留出泛化能力。在合理的 $c_{\text{human}}/c_{\text{aug}}$ 范围内,门控合成与人工RLVR任务之间的成本调整权衡率 $\rho_{\text{cost}}$ 保持在 $[1.4\times, 11.6\times]$ 之间。

英文摘要

The supply of high-quality training tasks is a central bottleneck for reinforcement learning from verifiable rewards (RLVR) on agentic language models. Each task requires a sandboxed setup, a prompt, and a hand-authored reward function, and only tasks that pass a quality bar produce useful training signal. Hand-curation at this quality bar does not scale economically to the task counts effective RL training requires, and the substitution rate between automatically generated task variants and human-authored ones is not yet established. We investigate using pre-specified, gate-filtered augmentations of a small hand-authored base as a substitute for additional human curation during RLVR. We formalize the cost-adjusted trade rate $\rho_{\text{cost}}$ between augmented and human-authored tasks, measure it through a controlled ablation across training corpora with varying augmentation share, and characterize the end-to-end economics of the augmentation pipeline. Substituting augmented content for additional human-authored tasks retains aggregate held-out generalization on a ten-benchmark suite spanning code, instruction following, reasoning, and multi-turn agentic function-calling. The cost-adjusted trade rate $\rho_{\text{cost}}$ between gated synthetic and human-authored RLVR tasks stays in $[1.4\times, 11.6\times]$ across the plausible $c_{\text{human}}/c_{\text{aug}}$ range.

2606.03794 2026-06-03 cs.LG eess.SP 版本更新

Limit Analysis of Graph Neural Networks with Wireless Conflict Graphs

基于无线冲突图的图神经网络极限分析

Romina Garcia Camargo, Zhiyang Wang, Alejandro Ribeiro

发表机构 * Department of Electrical and Systems Engineering, University of Pennsylvania(宾夕法尼亚大学电气与系统工程系) Halıcıoğlu Data Science Institute, UCSD(加州大学圣地亚哥分校Halıcıoğlu数据科学研究所)

AI总结 针对稀疏随机几何图上的图神经网络,通过分析其与确定性网格图的接近性,建立了跨尺度迁移性的理论界限,并在链路调度问题中验证了学习策略的优越性。

详情
AI中文摘要

图神经网络(GNN)已成为一种利用通信网络底层图结构进行无线资源分配的强大工具。其可迁移性使得在小规模图上训练的模型能够推广到大规模部署,且性能下降很小,这对于当前不断增长的网络而言是一个理想特性。无线网络是稀疏的,单个节点只与少量其他用户相连。本文建立了基于稀疏随机几何图(RGG)的图神经网络可迁移性的理论结果。特别地,我们关注用于建模链路间干扰的RGG冲突图。我们的方法考虑了RGG与确定性网格图(DGG)之间的接近性,以建立模型跨尺度迁移时性能损失的界限。我们通过链路调度问题验证了理论发现,表明学习策略在规模上始终优于现有基准。最后,我们考察了理论假设对经验性能的影响。

英文摘要

Graph Neural Networks (GNNs) have emerged as a powerful tool for wireless resource allocation that leverages the underlying graph structure of communication networks. Their transferability property enables models trained on small-scale graphs to generalize to large-scale deployments with little performance deterioration, a desirable property for currently growing networks. Wireless networks operate in a sparse regime, where a single node is connected to only a small number of other users. This work establishes theoretical results for transferability of GNNs over graphs derived from sparse Random Geometric Graphs (RGGs). In particular, we focus on conflict graphs of RGGs used to model interference among links. Our approach considers the closeness between RGGs and Deterministic Grid Graphs (DGGs) to establish bounds on the performance loss when a model is transferred across scales. We validate our theoretical findings through the problem of link scheduling, demonstrating that our learned policies consistently outperform existing benchmarks at scale. Finally, we examine the impact of our theoretical assumptions on empirical performance.

2606.03792 2026-06-03 cs.CV cs.LG 版本更新

Training-Free Multi-Concept LoRA Composition with Prompt-Aware Weighting

免训练的多概念LoRA组合与提示感知加权

Georgios Tsoumplekas, Stella Bounareli, Vasileios Argyriou

发表机构 * Department of Networks and Digital Media, Kingston University London, UK(伦敦金斯顿大学网络与数字媒体系,英国)

AI总结 提出一种免训练的提示感知加权策略,通过优化组合多个LoRA模块的输出实现多概念定制,提升图像质量和概念保真度。

Comments Accepted at IEEE FG 2026

详情
AI中文摘要

低秩适应(LoRA)通过将预训练扩散模型适应到特定视觉概念和风格,成功实现了文本到图像生成中的个性化。然而,将此类模型扩展到多概念定制仍然具有挑战性。简单组合多个LoRA权重或其输出通常会导致概念间的干扰,从而降低视觉质量并减少对单个概念参考图像的保真度。本文提出了一种简单而有效的多概念定制方法,通过最优组合多个LoRA模块的输出。我们利用生成过程中每个概念的相对重要性(从其对应的提示标记推断),并引入了两种方法:W-Switch和W-Composite,它们采用提示感知的重要性加权策略,其中每个LoRA根据其触发词在目标提示中的语义影响进行加权。此外,我们通过提出一种新的基于图像的相似性评估框架来扩展现有的定量评估指标,该框架通过比较真实世界参考图像和从生成图像中自动分割的概念区域来评估图像保真度和身份保持。我们在ComposLoRA测试平台上评估了我们的方法,并在视觉质量、身份保持和组合性方面展示了相对于现有最先进方法的一致改进。定性评估,包括基于大语言模型(LLM)的评估和用户研究,进一步验证了所提出方法的有效性,并与新引入的基于图像的定量指标一致。我们的代码可在 https://github.com/GeorgeTsoumplekas/Prompt-Aware-Multi-LoRA-Composition 获取。

英文摘要

Low-Rank Adaptation (LoRA) successfully enables personalization in text-to-image generation by adapting pre-trained diffusion models to specific visual concepts and styles. However, extending such models to multi-concept customization remains challenging. Naively combining multiple LoRA weights or their outputs often leads to interference among concepts, resulting in degraded visual quality and reduced fidelity to the reference images of individual concepts. This paper proposes a simple yet effective approach for multi-concept customization by optimally combining the outputs of multiple LoRA modules. We leverage the relative importance of each concept during generation, as inferred from its corresponding prompt tokens, and introduce two methods, W-Switch and W-Composite, that employ a prompt-aware importance weighting strategy in which each LoRA is weighted according to the semantic influence of its trigger words in the target prompt. In addition, we extend existing quantitative evaluation metrics by proposing a new image-based similarity evaluation framework that assesses image fidelity and identity preservation through comparisons between real-world reference images and automatically segmented concept regions from generated images. We evaluate our approach on the ComposLoRA testbed and demonstrate consistent improvements over existing state-of-the-art methods in terms of visual quality, identity preservation, and compositionality. Qualitative evaluations, including a Large Language Model (LLM) based assessment and a user study, further validate the effectiveness of the proposed methods and align with the newly introduced quantitative image-based metrics. Our code is available at https://github.com/GeorgeTsoumplekas/Prompt-Aware-Multi-LoRA-Composition.

2606.03780 2026-06-03 cs.CL cs.LG 版本更新

Expert-Aware Causal Tracing of Factual Recall in Sparse MoE Language Models

专家感知的稀疏MoE语言模型中事实回忆的因果追踪

Yuetian Lu, Ali Modarressi, Yihong Liu, Hinrich Schütze

发表机构 * Center for Information and Language Processing (CIS)(信息与语言处理中心) Ubiquitous Knowledge Processing Lab (UKP)(无所不在的知识处理实验室) Munich Center for Machine Learning (MCML)(慕尼黑机器学习中心)

AI总结 针对稀疏混合专家语言模型,提出专家感知的因果追踪方法,通过干预专家级更新定位事实回忆的关键专家,发现专家级定位依赖于模型和协议。

Comments Preprint

详情
AI中文摘要

事实回忆的因果追踪主要在密集Transformer语言模型中进行研究,其中干预将信息流定位到层或前馈模块。稀疏混合专家(MoE)语言模型引入了一个更尖锐的问题:当事实预测由路由的MoE块中介时,哪些路由的专家贡献起作用?我们为稀疏MoE语言模型制定了专家感知的因果追踪。使用CounterFact事实,我们首先通过向主题词嵌入添加噪声来破坏模型的事实偏好,然后测试干净的MoE块输出或干净的专家级更新是否恢复了真实与虚假logit对比。对于Qwen3-30B-A3B-Base,层扫描选择并验证了第44层,专家级追踪识别出L44E069作为在干净运行中反复选择的专家,其保留的补丁优于其他活跃的同一层专家补丁。对于Mixtral-8x7B-v0.1,层级追踪验证了中层信号,但该信号并未定位到选定的单个专家;相反,联盟检查通过路由的多专家更新恢复了它。这些结果表明,MoE事实追踪可以做到专家感知,同时也表明专家级定位是依赖于模型和协议的,而非普遍适用。

英文摘要

Causal tracing of factual recall has been studied predominantly in dense transformer language models, where interventions localize information flow to layers or feed-forward modules. Sparse mixture-of-experts (MoE) language models introduce a sharper question: when a factual prediction is mediated by a routed MoE block, which routed expert contributions matter? We formulate expert-aware causal tracing for sparse MoE language models. Using CounterFact facts, we first corrupt the model's factual preference by adding noise to subject-token embeddings, and then test whether clean MoE-block outputs or clean expert-level updates restore the true-vs-foil logit contrast. For Qwen3-30B-A3B-Base, a layer sweep selects and validates layer 44, and expert-level tracing identifies L44E069 as an expert repeatedly selected in the clean run whose held-out patch outperforms other active same-layer expert patches. For Mixtral-8x7B-v0.1, layer-level tracing validates a mid-layer signal, but the signal is not localized to the selected singleton expert; a coalition check instead recovers it with routed multi-expert updates. These results suggest that MoE factual tracing can be made expert-aware, while also showing that expert-level localization is model- and protocol-dependent rather than universal.

2606.03762 2026-06-03 cs.LG cs.AI 版本更新

Tool-Aware Optimization with Entropy Guidance for Efficient Agentic Reinforcement Learning

基于熵引导的工具感知优化用于高效智能体强化学习

Hongye Cao, Nuo Yan, Haoyuan Deng, Ziwei Wang, Tianpei Yang, Jing Huo, Yuyao Zhang, Yang Gao

发表机构 * National Key Laboratory for Novel Software Technology, Nanjing University(南京大学新型软件技术国家重点实验室) Nanyang Technological University(南洋理工大学) China Mobile NineVerse Artificial Intelligence Technology (Beijing) Co., Ltd.(中国移动九章人工智能技术(北京)有限公司) Institute of Artificial Intelligence, NineVerse(九章人工智能研究院)

AI总结 提出TAO-RL框架,通过工具感知轨迹过滤和熵引导探索解决智能体强化学习中工具使用导致的训练不稳定问题,在7个推理基准上优于现有方法。

详情
AI中文摘要

智能体强化学习(RL)使大型语言模型(LLMs)具备工具使用能力,从而显著提升复杂任务的推理性能。然而,整合外部工具常常导致训练不稳定:过度依赖工具会引发输入分布偏移,而过于保守的工具使用则限制了有效探索。为解决这一问题,我们提出统一框架TAO-RL,将工具感知轨迹过滤与熵引导探索相结合,以实现高效策略优化。具体而言,在数据层面,TAO-RL根据两个标准过滤轨迹:丢弃所有工具调用均执行失败的轨迹,以及移除所有轨迹全部正确或全部错误的轨迹,因为这两种情况都会产生退化的优势估计,无法提供有区分度的学习信号。这种联合过滤保留了既具备工具能力又包含信息量的数据,建立了高质量的训练分布。在算法层面,我们引入工具感知的熵引导奖励,重塑工具调用后token的优势函数,鼓励策略在关键决策点探索更多样化的推理路径。这两个组成部分相互增强:轨迹过滤建立了干净且信息丰富的训练基础,而熵引导探索则在关键工具交互节点驱动更强的推理行为。在3种模型规模下的7个具有挑战性的推理基准上的大量实验表明,TAO-RL优于现有方法。

英文摘要

Agentic reinforcement learning (RL) equips large language models (LLMs) with tool-use capabilities that substantially improve reasoning on complex tasks. However, integrating external tools often destabilizes training: over-reliance on tools can induce input distribution shift, while overly conservative tool use limits effective exploration. To address this issue, we propose a unified framework TAO-RL that couples tool-aware trajectory filtering with entropy-guided exploration for efficient policy optimization. Specifically, at the data level, TAO-RL filters rollout trajectories along two criteria: discarding those where all tool invocations fail to execute, and removing those where all rollouts are either correct or incorrect, as both cases yield degenerate advantage estimates that contribute no discriminative learning signal. This joint filtering retains data that are both tool-capable and informative, establishing a high-quality training distribution. At the algorithmic level, we introduce a tool-aware entropy-guided bonus that reshapes the advantage function at post-tool-call tokens, encouraging the policy to explore more diverse reasoning paths at critical decision points. These two components are mutually reinforcing: trajectory filtering establishes a clean and informative training foundation, while entropy-guided exploration drives stronger reasoning behaviors at critical tool-interaction junctures. Extensive experiments on 7 challenging reasoning benchmarks across 3 model scales demonstrate the superiority of TAO-RL over existing methods.
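
The data-level filtering described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `Rollout` record, its field names, and the order of the two checks (tool-failure filtering first, then the all-correct/all-incorrect check on what remains) are hypothetical. The intuition for the second check is that, under a group-relative baseline, a group whose rollouts all receive the same outcome yields zero advantage for every rollout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rollout:
    # Hypothetical record; field names are illustrative, not from the paper.
    tool_calls_ok: List[bool]  # execution status of each tool invocation
    correct: bool              # verifiable outcome of the final answer


def all_tools_failed(r: Rollout) -> bool:
    """True when the rollout called tools and every call failed to execute."""
    return len(r.tool_calls_ok) > 0 and not any(r.tool_calls_ok)


def filter_groups(groups: List[List[Rollout]]) -> List[List[Rollout]]:
    """Keep only groups (rollouts sampled for one prompt) that carry signal."""
    kept = []
    for group in groups:
        # Criterion 1: discard rollouts whose tool invocations all failed.
        usable = [r for r in group if not all_tools_failed(r)]
        # Criterion 2: drop groups that are uniformly correct or incorrect.
        if len({r.correct for r in usable}) == 2:
            kept.append(usable)
    return kept


groups = [
    [Rollout([True], True), Rollout([False, False], False), Rollout([True], False)],
    [Rollout([True], True), Rollout([True], True)],      # all correct
    [Rollout([False], False), Rollout([True], True)],    # uniform after criterion 1
]
kept = filter_groups(groups)
```

In this toy input only the first group survives, with its all-tools-failed rollout removed. Whether the paper applies the two criteria in this order, or treats tool-free rollouts this way, is not stated in the abstract.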

2606.03756 2026-06-03 cs.RO cs.LG 版本更新

Neural Navigation Functions for Zero-Shot Generalizable Motion Planning

神经导航函数用于零样本泛化运动规划

Benjamin D. Shaffer, Pei-An Hsieh, Brooks Kinch, Nathaniel Trask, M. Ani Hsieh

发表机构 * University of Pennsylvania, United States(宾夕法尼亚大学,美国) Department of Mechanical Engineering and Applied Mechanics(机械工程与应用力学系) Department of Electrical and Systems Engineering(电气与系统工程系)

AI总结 提出神经导航函数(Neural-NF),通过将数据驱动适应嵌入结构化椭圆规划器,实现跨未见环境几何的零样本迁移,并保证无碰撞、单调下降和全局最小值。

Comments 17 pages, 10 figures

详情
AI中文摘要

我们引入了神经导航函数(Neural-NF),一种学习到的反应式导航函数,能够跨未见环境几何进行零样本迁移。Neural-NF将数据驱动适应置于结构化椭圆规划器中,其中导航目标被学习,而规划器结构通过构造得以保留。具体来说,内在的拉普拉斯派生特征被映射到局部PDE系数,求解得到的边值问题在每个目标域上产生全局一致的值函数。对于每个可接受的学习模型,所得策略无碰撞,提供单调下降,并通过构造在目标处具有全局最小值。这为任何参数设置提供了线性可解的最优控制解释。实验上,Neural-NF在多样几何上实现了强大的零样本迁移,并比直接预测值函数的学习规划器性能提升高达5倍。

英文摘要

We introduce Neural Navigation Functions (Neural-NF), a learned reactive navigation function capable of zero-shot transfer across unseen environment geometries. Neural-NF places data-driven adaptation within a structured elliptic planner, where the navigation objective is learned while planner structure is preserved by construction. Specifically, intrinsic Laplacian-derived features are mapped to local PDE coefficients, and solving the resulting boundary value problem produces a globally consistent value function on each target domain. For every admissible learned model, the resulting policy is collision-free, provides monotonic descent, and has a global minimum at the goal by construction. This admits a linearly-solvable optimal-control interpretation for any parameter setting. Empirically, Neural-NF achieves strong zero-shot transfer across diverse geometries and outperforms learned planners that directly predict the value function by up to $5\times$.

2606.03731 2026-06-03 cs.LG stat.ML 版本更新

Conformal Language Modeling via Posterior Sampling

通过后验采样的共形语言建模

Nicolas Emmenegger, Theo X. Olausson, Armando Solar-Lezama, Chara Podimata

发表机构 * Massachusetts Institute of Technology(麻省理工学院)

AI总结 提出通过近似LLM后验采样(条件为校准的高分区域)来替代事后过滤,实现目标风险控制并提高下游效用。

详情
AI中文摘要

大型语言模型仍然受到幻觉的困扰。最近的工作试图使用基于共形预测的统计技术来抑制其普遍性,取得了理论和实证上的成功。然而,这些方法以事后方式运作,将采样过程本身视为原子操作,然后通过外科手术式地修改样本来移除幻觉声明。这种过滤与生成之间的脱节可能导致样本不连贯、不一致,或者仅仅在模型本身下不太可能。此外,事后手术无法将概率质量转移到更有用和更有帮助的响应上。为了解决这些问题,我们提出从LLM后验的近似中采样,其中条件事件对应于一个校准的高分区域。我们开发了一种针对条件序列生成场景的校准程序,该程序能有效识别该区域并实现目标风险控制。在实证中,我们将我们的方法应用于以开放式的传记生成和数学问题解决为重点的案例研究;与先前的工作相比,我们获得了相同的统计保证,且下游效用更高。

英文摘要

Large Language Models remain plagued by hallucinations. Recent work has sought to tame their prevalence using statistical techniques based on conformal prediction, with both theoretical and empirical success. However, these methods operate in a post-hoc fashion, treating the sampling procedure itself as atomic and then surgically altering samples to remove hallucinated claims. This disconnect between filtering and generation can result in samples that are incoherent, inconsistent, or simply unlikely under the model itself. Moreover, post-hoc surgery is unable to shift probability mass towards more useful and helpful responses. To address these issues, we propose to instead sample from approximations to an LLM posterior, where the conditioning event corresponds to a calibrated, high-scoring region. We develop a calibration procedure tailored to the setting of conditional sequential generation that effectively identifies this region and achieves target risk control. Empirically, we apply our method to case studies focused on open-ended biography generation and mathematical problem solving; compared to prior work, we obtain the same statistical guarantees, with higher downstream utility.

2606.03723 2026-06-03 cs.LG 版本更新

Compress then Merge: From Multiple LoRAs into One Low-Rank Adapter

先压缩后合并:从多个LoRA到一个低秩适配器

Zhengbao He, Ruiqi Ding, Zhehao Huang, Ruikai Yang, Tao Li, Xiaolin Huang

发表机构 * Institute of Image Processing and Pattern Recognition, School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai, China(图像处理与模式识别研究所,自动化与智能感知学院,上海交通大学,上海,中国) Shanghai Key Laboratory of Flexible Medical Robotics, Tongren Hospital, Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China(柔性医疗机器人上海市重点实验室,同仁医院,医疗机器人研究所,上海交通大学,上海,中国)

AI总结 针对多LoRA合并时全参数合并破坏低秩结构的问题,提出先压缩后合并(CtM)方法,通过共享子空间投影保证输出严格秩r,性能优于现有单LoRA基线。

Comments Accepted to ICML 2026. Code: https://github.com/ZhengbaoHe/compress-then-merge

详情
AI中文摘要

低秩适配(LoRA)实现了基础模型的参数高效特化,但任务特定适配器的激增将能力分散到多个适配器中,使复用和部署复杂化。我们研究将$T$个LoRA合并为单个秩-$r$ LoRA的问题,从而保留低秩结构的优势。现有的先合并后压缩流水线将秩约束视为事后考虑:它们在完整参数空间中合并适配器,然后通过截断SVD将合并结果压缩到秩$r$。然而,全参数合并可能破坏低秩结构,使得后续压缩难以恢复有效的秩-$r$ LoRA。我们提出先压缩后合并(CtM),一种反向流水线,在合并前强制秩-$r$瓶颈:CtM仅使用LoRA权重计算共享的$r$维子空间以捕获跨适配器的公共结构,将每个适配器投影到共享子空间以获得$r\times r$坐标,然后在此缩减空间中应用标准合并规则。CtM通过构造保证秩-$r$ LoRA,避免了事后截断,并在由拼接的LoRA因子张成的核心空间中实现高效计算。跨多个模型和任务的实验表明,CtM持续优于现有的单LoRA输出基线,同时缩小了与全参数合并方法的性能差距。

英文摘要

Low-rank adaptation (LoRA) enables parameter-efficient specialization of foundation models, but the proliferation of task-specific adapters fragments capabilities across many adapters, complicating reuse and deployment. We study the problem of merging $T$ LoRAs into a single rank-$r$ LoRA, thereby preserving the benefits of low-rank structure. Existing Merge-then-Compress pipelines treat the rank constraint as an afterthought: they merge adapters in the full parameter space, then compress the merged result to rank $r$ via truncated SVD. However, full-parameter merging may destroy the low-rank structure, making it difficult for subsequent compression to recover an effective rank-$r$ LoRA. We propose Compress-then-Merge (CtM), a reversed pipeline that enforces the rank-$r$ bottleneck before merging: CtM computes shared $r$-dimensional subspaces using only the LoRA weights to capture cross-adapter common structure, projects each adapter into the shared subspaces to obtain $r\times r$ coordinates, and then applies standard merging rules in this reduced space. CtM guarantees a rank-$r$ LoRA by construction, avoiding post-hoc truncation, and enables efficient computation in the core space spanned by concatenated LoRA factors. Experiments across multiple models and tasks show that CtM consistently outperforms existing single-LoRA-output baselines while narrowing the performance gap to full-parameter merging methods.
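
The three steps named in the abstract (shared subspaces, $r\times r$ coordinates, merging in the reduced space) can be sketched in NumPy. This is a sketch under stated assumptions, not the authors' algorithm: taking the shared subspaces as the top-$r$ singular vectors of the concatenated LoRA factors, and using a plain average as the merging rule, are illustrative choices, and `compress_then_merge` is a hypothetical name. Each adapter is assumed to be $\Delta W_t = B_t A_t$ with $B_t \in \mathbb{R}^{d_{\text{out}}\times r}$ and $A_t \in \mathbb{R}^{r\times d_{\text{in}}}$.

```python
import numpy as np


def compress_then_merge(Bs, As, r):
    """Merge LoRAs (B_t @ A_t) into one rank-r LoRA via shared subspaces.

    Assumed construction: subspaces come from SVDs of the concatenated
    factors; the merging rule in the reduced space is a simple mean.
    """
    # Shared r-dim column space from [B_1 ... B_T] (d_out x T*r).
    U, _, _ = np.linalg.svd(np.concatenate(Bs, axis=1), full_matrices=False)
    # Shared r-dim row space from [A_1; ...; A_T] (T*r x d_in).
    _, _, Vt = np.linalg.svd(np.concatenate(As, axis=0), full_matrices=False)
    U, Vt = U[:, :r], Vt[:r, :]
    # r x r coordinates of each adapter: U^T (B_t A_t) V.
    cores = [(U.T @ B) @ (A @ Vt.T) for B, A in zip(Bs, As)]
    core = np.mean(cores, axis=0)  # merge in the reduced space
    # Merged factors; their product has rank at most r by construction.
    return U @ core, Vt


rng = np.random.default_rng(0)
d_out, d_in, r, T = 16, 12, 2, 3
Bs = [rng.standard_normal((d_out, r)) for _ in range(T)]
As = [rng.standard_normal((r, d_in)) for _ in range(T)]
B_m, A_m = compress_then_merge(Bs, As, r)

# Sanity check: merging T copies of one adapter should reproduce it.
B_same, A_same = compress_then_merge([Bs[0]] * T, [As[0]] * T, r)
```

Because the output is the product of a $d_{\text{out}}\times r$ and an $r\times d_{\text{in}}$ factor, no post-hoc truncation is needed, which is the property the abstract emphasizes. Other merging rules could replace the mean on `cores` without changing the rank guarantee.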

2606.03712 2026-06-03 cs.LG 版本更新

When Graph Tokens Sink: A Mechanistic Analysis of Graph Language Models

当图标记沉没:图语言模型的机制分析

Ding Zhang, Runtao Zhou, Wenqing Zheng, Rizal Fathony, Bayan Bruss, Chirag Agarwal

发表机构 * University of Virginia(弗吉尼亚大学) Capital One

AI总结 本文通过分析图语言模型中图标记的内部行为,发现激活层面的显著性与图信息利用之间存在解耦,揭示了现有图标记构建、放置和对齐机制的局限性。

详情
AI中文摘要

图语言模型(GLMs)已成为将大型语言模型(LLMs)适应图学习任务的一个有前景的方向。通过将图拓扑和节点信息转换为图标记,GLMs允许LLMs联合处理结构化图输入和文本指令。然而,LLMs如何内部解释这些图标记以及图标记是否作为图结构的有意义载体仍不清楚。在这项工作中,我们通过代表性GLM架构中的图标记行为分析了LLMs如何处理图信息。发现:我们发现GLMs中图标记的内部显著性与图信息利用并不等价。图沉没标记一致地表现为激活层面的异常值:它们可以通过一小部分隐藏状态维度上的巨大激活值来识别,并且偏向于早期的图标记位置。然而,这种激活层面的显著性并不意味着这些标记是图信息的主要载体。与语言和视觉-语言模型中的经典注意力沉没不同,图沉没标记不一定从查询标记中吸引最大的注意力权重。通过剪枝、重新定位和交换干预,我们表明图沉没标记对于下游预测并不是最重要的语义或结构标记。含义:这些结果共同表明,在当前的GLMs将图结构映射到LLM标记空间后,产生的图标记表示并不会自然地形成完全可用的拓扑感知内部表示;相反,它们在激活层面的显著性和图语义效用之间表现出解耦。这种解耦指出了现有图标记构建、放置和对齐机制的局限性。

英文摘要

Graph Language Models (GLMs) have become a promising direction for adapting Large Language Models (LLMs) to graph learning tasks. By transforming graph topology and node information into graph tokens, GLMs allow LLMs to jointly process structured graph inputs and textual instructions. Yet, it remains unclear how LLMs internally interpret these graph tokens and whether graph tokens act as meaningful carriers of graph structure. In this work, we analyze how LLMs process graph information through graph-token behavior in representative GLM architectures. Findings. We find that the internal saliency of graph tokens in GLMs is not equivalent to graph information utilization. Graph sink tokens consistently emerge as activation-level outliers: they can be identified by massive activation values along a small set of hidden-state dimensions and are biased toward early graph-token positions. However, this activation-level saliency does not imply that these tokens are the main carriers of graph information. Unlike classical attention sinks in language and vision-language models, graph sink tokens do not necessarily attract the largest attention weights from query tokens. Through pruning, repositioning, and swapping interventions, we show that graph sink tokens are not the most important semantic or structural tokens for downstream prediction. Implications. Together, these results suggest that after current GLMs map graph structure into the LLM token space, the resulting graph-token representations do not naturally form a fully usable topology-aware internal representation; instead, they exhibit a decoupling between activation-level saliency and graph-semantic utility. This decoupling points to limitations in existing graph-token construction, placement, and alignment mechanisms.

2606.03698 2026-06-03 cs.LG 版本更新

Multi$^2$: Hierarchical Multi-Agent Decision-Making with LLM-Based Agents in Interactive Environments

Multi$^2$:基于LLM智能体在交互环境中的分层多智能体决策

Sangeun Park, Minhae Kwon

发表机构 * KAIST(韩国科学技术院)

AI总结 提出Multi$^2$分层多智能体决策框架,通过高层智能体(System 1)使用监督微调生成子目标,低层智能体(System 2)使用离线到在线强化学习执行原子动作,以缓解目标漂移并实现长期稳定控制。

Comments Accepted at ICML 2026

详情
AI中文摘要

大型语言模型(LLM)研究的一个核心目标是构建能够通过与动态环境持续交互进行规划、行动和适应的智能体系统。尽管最近的基于LLM的智能体展现出令人印象深刻的上下文推理能力,但它们的长期决策仍然脆弱,常常遭受目标漂移,即目标和计划在长时间交互中发生偏移。我们引入了Multi$^2$,一个分层多智能体决策框架,将智能体行为显式分解为互补角色。高层智能体(System 1)使用监督微调(SFT)专注于上下文感知的子目标生成,而低层智能体(System 2)通过交互环境中的离线到在线强化学习(RL)执行原子动作。这种分离实现了稳定的长期控制,减轻了目标漂移,并允许高效适应。在多种交互环境中,Multi$^2$持续优于强智能体基线,在多轮交互中展现出改进的鲁棒性和协调性。除了性能提升,我们还引入并发布了三个分层基准数据集,填补了训练和评估基于LLM智能体的分层决策的长期空白。

英文摘要

A central goal of large language model (LLM) research is to build agentic systems that can plan, act, and adapt through sustained interaction with dynamic environments. While recent LLM-based agents exhibit impressive contextual reasoning, their long-horizon decision-making remains fragile, often suffering from objective drift, where goals and plans drift over extended interactions. We introduce Multi$^2$, a hierarchical multi-agent decision-making framework that explicitly decomposes agent behavior into complementary roles. A high-level agent (System 1) focuses on context-aware sub-goal generation using supervised fine-tuning (SFT), while a low-level agent (System 2) executes atomic actions through offline-to-online reinforcement learning (RL) in interactive environments. This separation enables stable long-horizon control, mitigates objective drift, and allows efficient adaptation. Across diverse interactive environments, Multi$^2$ consistently outperforms strong agentic baselines, demonstrating improved robustness and coordination in multi-turn interaction. Beyond performance, we introduce and release three hierarchical benchmark datasets, filling a long-standing gap in training and evaluating hierarchical decision-making for LLM-based agents.

2606.03689 2026-06-03 cs.LG cs.AI 版本更新

Staying Alive: Uncensored Survival Analysis with Tabular Foundation Models

保持存活:基于表格基础模型的无删失生存分析

Mariana Vargas Vieyra

发表机构 * GitHub

AI总结 提出一种无需训练的生存回归方法,利用表格基础模型预测事件时间并迭代填补右删失数据,构建加速失效时间模型,在标准基准上表现与需训练的模型相当。

详情
AI中文摘要

生存分析是一种统计框架,用于建模直到某个感兴趣事件发生的时间跨度。它广泛应用于包括医疗保健和客户流失预测在内的多个领域,其适用性的一个核心挑战在于事件时间被部分观测或存在右删失。近年来,表格基础模型因其能够在单次前向传播中执行预测任务而无需数据集特定的参数拟合,引起了广泛关注。尽管取得了成功,但由于右删失的存在,它们在时间-事件数据预测任务中的应用仍然困难。在这项工作中,我们提出了一种无需训练的生存回归方法,通过利用表格基础模型来预测事件时间并迭代地填补右删失数据。我们的方法使用表格基础模型构建加速失效时间模型,除了拟合单个标量参数外无需训练。随后,基于Buckley-James估计器,我们引入了一种非参数上下文内估计器来处理右删失数据。我们在标准生存分析基准上的实验表明,我们的方法与几种需要训练的参数和半参数生存回归模型(包括Cox回归和参数加速失效时间模型)相比具有竞争力。

英文摘要

Survival Analysis (SA) is a statistical framework that models the time span until some event of interest occurs. It is widely used in several domains, including healthcare and churn prediction. A central challenge in its applicability is that the time of the event is often only partially observed, that is, right-censored. Tabular Foundation Models (TFMs) have attracted significant interest in recent years due to their ability to perform prediction tasks in a single forward pass, requiring no dataset-specific parameter fitting. Despite their success, their application to prediction tasks on time-to-event data remains difficult due to right censoring. In this work, we present a training-free method for survival regression by leveraging TFMs to both predict the time of the event and iteratively impute right-censored data. Our method uses a TFM to construct an Accelerated Failure Time (AFT) model requiring no training beyond fitting a single scalar parameter. Subsequently, by building on the Buckley-James estimator, we introduce a non-parametric in-context estimator for right-censored data. Our experiments on standard survival analysis benchmarks show that our method is competitive with several parametric and semi-parametric survival regression models that require training, including Cox regression and parametric AFT models.

2606.03685 2026-06-03 cs.LG cs.AI 版本更新

A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners

监督微调的大语言模型规划器中世界模型恢复的深入探究

Patrick Emami, Nan Qiang, Peter Graf

发表机构 * National Laboratory of the Rockies(落基山国家实验室)

AI总结 通过可解释性实验,研究监督微调如何影响大语言模型在经典规划任务中恢复世界模型的能力,发现微调使模型线性编码动作有效性和状态谓词,且更广泛的状态空间覆盖有助于更准确的世界模型恢复。

Comments 17 pages. Under review at TMLR

详情
AI中文摘要

监督微调(SFT)改进了大语言模型(LLM)中的端到端经典规划,但这些模型是否也学会了表示和推理它们正在解决的规划问题?由于经典规划问题的相对复杂性以及端到端规划生成对LLM的挑战,探索这个问题一直很困难。在我们的工作中,我们设计并执行了一系列可解释性实验,通过检查微调LLM的内部表示和生成能力,全面探究世界模型恢复。我们发现:a) 对有效动作序列进行监督微调使LLM能够线性编码动作有效性和一些状态谓词。b) 难以使用输出概率对动作有效性进行分类的模型可能仍然学习到将有效动作与无效动作分开的内部表示。c) 微调期间更广泛的状态空间覆盖(例如来自随机游走数据)能更准确地恢复底层世界模型。总之,这项工作为将可解释性技术应用于规划LLM提供了一种方法,并产生了有助于揭示LLM中知识表示方式的见解。

英文摘要

Supervised fine-tuning (SFT) improves end-to-end classical planning in large language models (LLMs), but do these models also learn to represent and reason about the planning problems they are solving? Due to the relative complexity of classical planning problems and the challenge that end-to-end plan generation poses for LLMs, it has been difficult to explore this question. In our work, we devise and perform a series of interpretability experiments that holistically interrogate world model recovery by examining both internal representations and generative capabilities of fine-tuned LLMs. We find that: a) Supervised fine-tuning on valid action sequences enables LLMs to linearly encode action validity and some state predicates. b) Models that struggle to use output probabilities for classifying action validity may still learn internal representations that separate valid from invalid actions. c) Broader state space coverage during fine-tuning, such as from random walk data, yields more accurate recovery of the underlying world model. In summary, this work contributes a recipe for applying interpretability techniques to planning LLMs and generates insights that shed light on open questions about how knowledge is represented in LLMs.

2606.03681 2026-06-03 cs.LG 版本更新

Speedrunning Tabular Foundation Model Pretraining

表格基础模型预训练的速通

Salih Bora Ozturk, Alexander Pfefferle, Frank Hutter

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出一种速通竞赛格式,通过优化单文件训练脚本,在nanoTabPFN上实现81倍预训练加速,并建立社区排行榜以累积改进。

详情
AI中文摘要

预训练成本是表格基础模型研究的主要瓶颈,减缓了新架构、先验知识和优化思路的迭代周期。然而,社区缺乏一种简单的方法来比较和累积预训练加速。我们为nanoTabPFN引入了一个社区速通:贡献者修改单文件训练脚本,并使用一块NVIDIA L40S GPU在子采样的TabArena上竞争达到固定的下游ROC AUC目标。当前最佳记录在0.92分钟内达到目标,相比74.32分钟的基线实现了81倍加速,同时使用的合成数据集减少了22倍。速通格式为社区提供了一种简单的协议来添加、验证和叠加预训练改进,排行榜对贡献开放。代码和记录可在 https://github.com/borawhocodess/modded-nanotabpfn 获取。

英文摘要

Pretraining cost is a major bottleneck for research on tabular foundation models, slowing the iteration cycle for new architectures, priors, and optimization ideas. Yet the community lacks a simple way to compare and accumulate pretraining speedups. We introduce a community speedrun for nanoTabPFN: contributors modify a single-file training script and compete to reach a fixed downstream ROC AUC target on subsampled TabArena using one NVIDIA L40S GPU. The current best record reaches the target in 0.92 minutes, an 81x speedup over the 74.32 minute baseline while using 22x fewer synthetic datasets. The speedrun format provides a simple protocol for the community to add, verify, and stack pretraining improvements, with the leaderboard open to contributions. Code and records are available at https://github.com/borawhocodess/modded-nanotabpfn.
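
As a quick consistency check on the reported figures, the speedup follows directly from the two wall-clock times given in the abstract:

$$
\frac{74.32\ \text{min}}{0.92\ \text{min}} \approx 80.8 \approx 81\times
$$

The $22\times$ reduction in synthetic datasets cannot be checked the same way, since the abstract does not give the underlying dataset counts.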

2606.03647 2026-06-03 cs.CR cs.AI cs.LG 版本更新

Black-box, Adaptive, Efficient, Transferable, Harmful, Applicable... Attacks Are All You Need to Break LLMs

黑盒、自适应、高效、可迁移、有害、适用……攻击是破解LLM所需的一切

Vincent Limbach, Jonas Dornbusch, David Lüdke, Stephan Günnemann, Leo Schwinn

发表机构 * University of St. Gallen(圣加仑大学)

AI总结 提出间接危害优化(IHO)方法,通过迭代偏好优化训练掩码扩散语言模型攻击器,实现黑盒、高效、可迁移的自适应攻击,显著提升对分层防御的破解成功率。

详情
AI中文摘要

准确评估对抗鲁棒性是一个长期挑战。有缺陷的攻击设计可能会夸大鲁棒性估计,使得部署风险评估和防御比较不可靠。历史上,像AutoAttack这样的标准化攻击在很大程度上解决了图像分类器的问题,为跨防御的系统比较提供了可靠的评估基线。然而,对于LLM越狱评估,目前还没有等效的方法,而设计这样的攻击要困难得多。一个可靠的攻击必须(除其他外)兼容黑盒、适用于任意防御管道且高效,而现有方法无法同时满足这些条件。我们引入了间接危害优化(IHO),这是一种掩码扩散语言模型攻击器,通过对危害评判器进行迭代偏好优化来训练,仅需对目标进行黑盒访问。相同的方法无需修改即可用作针对个体行为的强自适应攻击,或作为一种高效的摊销策略,无需微调即可迁移到未见行为和未见目标模型。即使面对分层防御(例如,结合辅助检测器的Circuit Breaker训练模型),IHO在攻击成功率上也显著优于最先进的方法,且无需任何防御特定的适应。我们的结果将IHO定位为向那种过去提高了可靠性的标准化越狱评估迈出的实际一步。代码和模型可在GitHub和Hugging Face上获取。

英文摘要

Accurately evaluating adversarial robustness is a longstanding challenge. A flawed attack design can inflate robustness estimates, making deployment risk assessment and defense comparison unreliable. Historically, standardized attacks such as AutoAttack have largely resolved this for image classifiers, providing a reliable evaluation baseline for systematic comparison across defenses. However, no equivalent exists for LLM jailbreak evaluation yet, where designing such an attack is considerably more difficult. A reliable attack must, among other things, be black-box compatible, applicable to arbitrary defense pipelines, and efficient, which no existing method jointly satisfies. We introduce Indirect Harm Optimization (IHO), a masked diffusion language model attacker trained via iterative preference optimization against a harmfulness judge, requiring only black-box access to the target. The same method can be used without modification as a strong adaptive attack on individual behaviors, or as an efficient amortized policy that transfers to held-out behaviors and unseen target models without fine-tuning. Even against layered defenses, such as a Circuit Breaker-trained model combined with an auxiliary detector, IHO improves attack success considerably over state-of-the-art approaches, without any defense-specific adaptation. Our results position IHO as a practical step toward the kind of standardized jailbreak evaluation that has improved reliability in the past. Code and models are available on GitHub and Hugging Face.

2606.03645 2026-06-03 cs.LG cs.AI 版本更新

The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models

加法的形状:大型语言模型中算术的几何结构

Liuyuan Wen, Xun Zhu, Lihao Huang, Wenbin Li, Yang Gao

发表机构 * University of Science and Technology of China(中国科学技术大学)

AI总结 通过分析多操作数加法中残差流的几何结构,发现等原始和轨迹(IRST)并建立噪声量化模型,将算术错误解释为由内部神经噪声引起的几何滑移,并利用几何一致性检查方法检测和纠正量化失败。

Comments Accepted by ICML 2026

详情
AI中文摘要

大型语言模型在基本算术中表现出矛盾的脆弱性,暗示内部计算与离散输出之间存在脱节。通过分析多操作数加法中的残差流几何结构,我们识别出等原始和轨迹(IRST),这是一种由语义数字锚定并由连续进位纤维调制的几何结构。我们提出噪声量化模型来解释这种几何结构,将算术错误视为由内部神经噪声推动连续的潜在进位势跨越量化阈值引起的几何滑移。这一几何框架进一步阐明了探针多功能性,解释了轻量级探针如何从单个激活向量中解开共存的潜在信号(如真实值与幻觉)。最后,我们通过一种几何一致性检查方法验证了这些见解,该方法在推理过程中有效检测和纠正了这些量化失败。我们的代码可在以下网址获取:https://github.com/RL-MIND/Shape-of-Addition。

英文摘要

Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discrete output. By analyzing the residual stream geometry during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure where representations are anchored by semantic digits and modulated by continuous carry fibers. We propose the Noisy Quantization Model to explain this geometry, framing arithmetic errors as Geometric Slippages caused by internal neural noise pushing a continuous, latent Carry Potential across quantization thresholds. This geometric framework further elucidates Probe Versatility, explaining how lightweight probes can disentangle coexisting latent signals (such as ground truth versus hallucination) from a single activation vector. Finally, we validate these insights through a geometric consistency check method that effectively detects and corrects these quantization failures during inference. Our code is available at https://github.com/RL-MIND/Shape-of-Addition.

2606.03644 2026-06-03 cs.LG 版本更新

Spatial Transcriptomics-Guided Alignment Enhances Molecular Profiling in Pathology Foundation Model

空间转录组学引导的对齐增强病理基础模型中的分子分析

Fengtao Zhou, Yingxue Xu, Zhengyu Zhang, Yihui Wang, Zhengrui Guo, Ling Liang, Jiabo Ma, Cheng Jin, Ziyi Liu, Huajun Zhou, Hongyi Wang, Du Cai, Chenglong Zhao, Xi Wang, Can Yang, Yu Wang, Wenbin Li, Feng Gao, Zhe Wang, Zhenhui Li, Xiuming Zhang, Li Liang, Hao Chen

发表机构 * Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China(香港科技大学计算机科学与工程系,中国香港特别行政区) Department of Pathology, Nanfang Hospital, Southern Medical University, Guangzhou, China(南方医科大学南方医院病理科,中国广州) Department of Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China(南方医科大学基础医学院病理学系,中国广州) Guangdong Province Key Laboratory of Molecular Tumor Pathology, Guangzhou, China(广东省分子肿瘤病理学重点实验室,中国广州) Jinfeng Laboratory, Chongqing, China(金凤实验室,中国重庆)

AI总结 提出STAMP框架,利用空间转录组数据通过通路感知对齐策略增强病理基础模型的分子感知能力,并在多层级评估中验证其临床效用。

详情
AI中文摘要

全面的分子分析对于现代精准肿瘤学至关重要,但高昂的成本、标本耗尽和漫长的周转时间仍然阻碍其应用。虽然病理基础模型(PFMs)已显示出从常规苏木精-伊红(H&E)全切片图像推断分子表型的潜力,但当前架构主要依赖于以视觉为中心的自监督学习或视觉-语言对齐,缺乏将细微形态学特征与潜在基因组改变联系起来所需的空间解析分子监督。空间转录组学(ST)作为一种变革性技术出现,能够在完整组织切片内进行转录组定量,从而保留组织学与分子谱之间的精确空间联系。在本研究中,我们提出了用于分子分析的空间转录组学引导对齐框架(STAMP),该框架赋予PFMs内在的分子感知能力。为支持这一范式,我们整理了HumanST-1k,一个涵盖不同解剖器官和测序平台的人类ST数据集。该图谱产生了180万对H&E斑块及其对应的转录组谱,提供了一个将组织学结构与其分子状态联系起来的语料库。为减轻原始转录组学中固有的技术噪声,STAMP采用了一种通路感知对齐策略,将转录组数据聚合为生物学功能通路,随后通过参数高效微调将其整合到PFMs中。这种对齐丰富了PFMs的表征空间,并释放了其解析亚视觉分子特征的能力。通过多层级评估框架验证了这些增强表征的临床实用性。

英文摘要

Comprehensive molecular profiling is essential for modern precision oncology but remains hindered by prohibitive costs, specimen exhaustion, and protracted turnaround times. While pathology foundation models (PFMs) have demonstrated potential for inferring molecular phenotypes from routine hematoxylin and eosin (H&E) whole-slide images (WSIs), current architectures primarily rely on vision-centric self-supervised learning or vision-language alignment, lacking the spatially resolved molecular supervision required to connect subtle morphological features with underlying genomic alterations. Spatial transcriptomics (ST) emerges as a transformative technology that enables transcriptomic quantification within intact tissue sections, thereby preserving the precise spatial link between histology and molecular profiles. In this study, we present a Spatial Transcriptomics-guided Alignment framework for Molecular Profiling (STAMP), which endows PFMs with intrinsic molecular awareness. To support this paradigm, we curated HumanST-1k, a human ST dataset spanning diverse anatomical organs and sequencing platforms. This atlas yields 1.8 million pairs of H&E patches and corresponding transcriptomic profiles, providing a corpus that links histological structures with their molecular states. To mitigate the technical noise inherent to raw transcriptomics, STAMP applies a pathway-informed alignment strategy that aggregates transcriptomic data into biologically functional pathways, which are subsequently integrated into PFMs via parameter-efficient fine-tuning. This alignment enriches the representation space of PFMs and unlocks their capacity to resolve sub-visual molecular signatures. The clinical utility of these augmented representations was validated through a multi-tier evaluation framework.

2606.03628 2026-06-03 cs.CL cs.AI cs.LG 版本更新

Building Reliable Long-Form Generation via Hallucination Rejection Sampling

通过幻觉拒绝采样构建可靠的长文本生成

Lin Li, Georgia Channing, Suhaas M Bhat, Gabriel Davis Jones, Yarin Gal

发表机构 * Georgia Institute of Technology(佐治亚理工学院) University of California, Berkeley(加州大学伯克利分校) University of Cambridge(剑桥大学) DeepMind(深度思维)

AI总结 提出分段幻觉拒绝采样框架SHARS,利用任意幻觉检测器在生成过程中拒绝并重采样幻觉片段,以缓解长文本生成中的幻觉累积问题,提升事实一致性。

Comments accepted by ICML 2026

详情
AI中文摘要

大型语言模型(LLMs)在开放式文本生成方面取得了显著进展,但仍容易产生不正确或无依据的幻觉内容,这损害了其可靠性。在长文本生成中,由于幻觉雪崩现象(早期错误传播并累积到后续输出),这一问题更加严重。为了解决这一挑战,我们提出了一种新颖的推理时幻觉缓解框架,称为分段幻觉拒绝采样(SHARS),该框架使用任意幻觉检测器在生成过程中识别并拒绝幻觉片段,并重新采样直到生成忠实的内容。通过仅保留可信信息并在此基础上构建后续生成,该框架减轻了幻觉累积并增强了事实一致性。为了实例化该框架,我们采用语义不确定性作为检测器,并引入了若干关键修改以解决其局限性并更好地适应长文本。我们的方法使模型能够自我纠正幻觉,无需外部资源(如网络搜索或知识库),同时保持与这些资源的兼容性以便未来扩展。在标准化幻觉基准上的实证评估表明,我们的方法显著减少了长文本生成中的幻觉,同时保持甚至提高了生成的信息量。代码可在以下网址获取:https://github.com/TreeLLi/hallucination-rejection-sampling。

英文摘要

Large language models (LLMs) have achieved remarkable progress in open-ended text generation, yet they remain prone to hallucinating incorrect or unsupported content, which undermines their reliability. This issue is exacerbated in long-form generation due to hallucination snowballing, a phenomenon where early errors propagate and compound into subsequent outputs. To address this challenge, we propose a novel inference-time hallucination mitigation framework, named Segment-wise HAllucination Rejection Sampling (SHARS), which uses an arbitrary hallucination detector to identify and reject hallucinated segments during generation and resample until faithful content is produced. By retaining only confident information and building subsequent generations upon it, the framework mitigates hallucination accumulation and enhances factual consistency. To instantiate this framework, we adopt semantic uncertainty as the detector and introduce several vital modifications to address its limitations and better adapt it to long-form text. Our method enables models to self-correct hallucinations without requiring external resources such as web search or knowledge bases, while remaining compatible with them for future extensions. Empirical evaluations on standardized hallucination benchmarks demonstrate that our method substantially reduces hallucinations in long-form generation while preserving or even improving the informativeness of generation. Code is available at: https://github.com/TreeLLi/hallucination-rejection-sampling.
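
The segment-wise reject-and-resample loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `propose_segment`, `is_hallucinated`, `max_resamples`, and the choice to skip a segment when every attempt is rejected are all assumptions introduced here; the paper instantiates the detector with a modified semantic-uncertainty score, which is not reproduced.

```python
from typing import Callable, List


def shars_generate(
    propose_segment: Callable[[List[str]], str],
    is_hallucinated: Callable[[List[str], str], bool],
    num_segments: int,
    max_resamples: int = 5,
) -> List[str]:
    """Build a long-form output segment by segment, keeping only accepted segments.

    propose_segment: hypothetical sampler that continues from the accepted context.
    is_hallucinated: hypothetical detector; any detector with this shape can be plugged in.
    """
    accepted: List[str] = []
    for _ in range(num_segments):
        for _attempt in range(max_resamples):
            candidate = propose_segment(accepted)
            if not is_hallucinated(accepted, candidate):
                # Later segments condition only on accepted text, which is
                # what limits error accumulation across the generation.
                accepted.append(candidate)
                break
        # Assumption: if every attempt is rejected, the segment is skipped
        # so that nothing flagged by the detector enters the context.
    return accepted
```

A real deployment would replace both callables with an LLM sampler and a calibrated detector; the resampling budget trades inference cost against the chance of finding a faithful segment.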

2606.03620 2026-06-03 cs.LG cs.AI 版本更新

Physics-Guided Policy Optimization with Self-Distillation

基于物理引导的自蒸馏策略优化

Ke Wang, Yuning Wu, Haoran Liu, Chaoqun Jia, Devin Chen, Kai Wei

发表机构 * Amazon(亚马逊)

AI总结 针对自蒸馏策略优化中固定步长导致训练不稳定的问题,提出受粘性流体动力学启发的物理引导策略优化(PGPO),通过互信息估计动态调整步长,在Science-QA数据集上提升性能并保持训练稳定性。

详情
AI中文摘要

自蒸馏策略优化(SDPO)已成为大语言模型后训练的一种流行范式,其中模型根据特权信息从自身预测中学习。然而,SDPO对每次更新步长的信任程度敏感:来自自我教师的修正可能在某些批次上信息丰富,而在其他批次上具有误导性,若以固定步长统一应用,会破坏训练稳定性。受粘性流体动力学启发,并在随机微分方程层面形式化类比,我们提出物理引导策略优化(PGPO),该方法引入一个基于学生预测与反馈条件教师之间互信息估计的信息调制步长乘子。我们证明这种调制保留了普通SGD的一阶弱近似保证,且每次迭代的额外开销可忽略。我们在Science-QA数据集上评估PGPO,它在4个领域中的3个上优于SDPO,提升高达+4.5个点,同时在SDPO训练后期崩溃的设置中保持稳定。

英文摘要

Self-distilled policy optimization (SDPO) has become a popular paradigm for LLM post-training, where a model learns from its own predictions conditioned on privileged information. SDPO, however, is sensitive to how much each update step should be trusted: corrections from a self-teacher can be highly informative on some batches and misleading on others, and applying them uniformly with a fixed step size can destabilize training. Drawing inspiration from viscous-fluid dynamics and formalizing the analogy at the SDE level, we propose Physics-Guided Policy Optimization (PGPO), which introduces an information-modulated step-size multiplier derived from a mutual-information estimate between the student's predictions and the feedback-conditioned teacher. We show that this modulation preserves the order-1 weak-approximation guarantees of vanilla SGD, and incurs negligible overhead per iteration. We evaluate PGPO on the Science-QA dataset, where it outperforms SDPO on 3 of the 4 domains with gains of up to +4.5 points, while remaining stable in a setting where SDPO collapses late in training.

2606.03608 2026-06-03 cs.LG cs.AI 版本更新

Exploiting Verification-Generation Gap: Test-Time Reinforcement Learning with Confidence-Conditioned Verification

利用验证-生成差距:基于置信度条件验证的测试时强化学习

Jiahui Li, Jianfeng Shan, Wenpei Chen, Shunyu Wu, Jian Lou, Wenjie Feng, Dan Li, See-Kiong Ng

发表机构 * Sun Yat-Sen University(中山大学) University of Science and Technology of China(中国科学技术大学) National University of Singapore(新加坡国立大学)

AI总结 提出TTRL-CoCoV框架,通过置信度自适应机制解决无标签设置下Pass@k优化中的伪标签错误和多样性崩溃问题,显著提升Pass@1和Pass@k性能。

详情
AI中文摘要

测试时强化学习已成为一种有前景的范式,用于在完全无标签的方式下增强大型语言模型的复杂推理能力。尽管现有研究关注Pass@1性能,但在无标签设置下优化Pass@k(衡量生成覆盖率以支持持续探索)仍未被充分探索且至关重要。在无标签设置下优化Pass@k极具挑战性,因为直接应用对RLVR有效的Pass@k优势设计会导致性能不佳。通过深入的实证分析,我们发现阻碍性能的根本原因:低置信度样本的伪标签估计很可能不正确,而高置信度样本的候选答案则遭受严重的多样性崩溃。为克服这些障碍,我们提出TTRL-CoCoV(基于置信度条件验证的测试时强化学习),一种新颖的置信度自适应框架,可扩展Pass@k覆盖率并提升Pass@1性能。基于验证能力通常领先于生成能力这一关键洞察,TTRL-CoCoV采用置信度条件机制:对于高置信度样本,它引导验证器并应用探索增强奖励以防止多样性崩溃;对于低置信度样本,它将伪标签选择委托给验证器以过滤错误伪标签;对于中等置信度样本,则完全绕过验证。大量实验表明,TTRL-CoCoV在6个广泛认可的基准上优于最佳竞争方法,相较于TTRL,在Pass@1上平均绝对提升+9.8%,在Pass@16上平均绝对提升+18.7%,甚至在与全监督强化学习方法相比时,在多个推理基准上实现了高达+5.0%的Pass@1绝对提升。我们的代码仓库:https://github.com/shanjf666/CoCoV。

英文摘要

Test-time reinforcement learning has emerged as a promising paradigm for enhancing the complex reasoning abilities of large language models in a completely label-free manner. Existing studies focus on Pass@1 performance, whereas optimizing Pass@k, which measures generation coverage for sustained exploration, remains under-explored yet critical in label-free settings. Optimizing Pass@k in the label-free setting is highly non-trivial, as directly applying the Pass@k advantage designs that are effective for RLVR yields unsatisfactory performance. Through in-depth empirical analysis, we discover the root causes hindering performance: pseudo-label estimations for low-confidence samples have a high probability of being incorrect, while candidate answers for high-confidence samples suffer from severe diversity collapse. To overcome these hurdles, we propose TTRL-CoCoV (Test-Time Reinforcement Learning with Confidence-Conditioned Verification), a novel confidence-adaptive framework that expands Pass@k coverage and improves Pass@1 performance. Based on our key insight that verification capability generally leads generation capability, TTRL-CoCoV employs a confidence-conditioned mechanism: for high-confidence samples, it bootstraps the verifier and applies an exploration-enhancing reward to prevent diversity collapse; for low-confidence samples, it delegates pseudo-label selection to the verifier to filter incorrect pseudo-labels; and for medium-confidence samples, it bypasses verification entirely. Extensive experiments demonstrate that TTRL-CoCoV outperforms the best competing methods across 6 widely recognized benchmarks, achieves average absolute gains of +9.8% in Pass@1 and +18.7% in Pass@16 over TTRL, and even achieves absolute Pass@1 improvements of up to +5.0% across multiple reasoning benchmarks when compared against fully supervised RL methods. Our code repository: https://github.com/shanjf666/CoCoV.
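
The three-way confidence routing described above might look roughly like this. It is a sketch under stated assumptions: confidence is taken to be the majority-vote share among sampled answers, and the thresholds `low` and `high` are hypothetical; the abstract specifies neither.

```python
from collections import Counter
from typing import List, Tuple


def route_by_confidence(
    answers: List[str], low: float = 0.3, high: float = 0.8
) -> Tuple[str, float, str]:
    """Return (majority_answer, confidence, branch) for one prompt's sampled answers."""
    majority, votes = Counter(answers).most_common(1)[0]
    confidence = votes / len(answers)  # assumed confidence proxy: vote share
    if confidence >= high:
        # High confidence: bootstrap the verifier and add an exploration
        # reward to counter diversity collapse.
        branch = "high"
    elif confidence < low:
        # Low confidence: the majority pseudo-label is likely wrong, so the
        # verifier selects the pseudo-label instead.
        branch = "low"
    else:
        # Medium confidence: verification is bypassed.
        branch = "medium"
    return majority, confidence, branch
```

The reward computation for each branch is left out because the abstract does not describe it in enough detail to sketch faithfully.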

2606.03602 2026-06-03 cs.LG cs.AI cs.CL 版本更新

CauTion: Knowing When to Trust LLMs for Ensemble Causal Discovery

CauTion:知道何时信任LLM进行集成因果发现

Bo Peng, Kaiwen Wu, Sirui Chen, Zhiheng Wang, Yu Qiao, Chaochao Lu

发表机构 * Shanghai AI Laboratory(上海人工智能实验室) Shanghai Innovation Institute(上海创新研究院) Shanghai Jiao Tong University(上海交通大学) Nanjing University(南京大学) Tongji University(同济大学)

AI总结 提出CauTion框架,通过共识过滤和LLM可靠性估计,将LLM领域知识可靠地集成到多个统计因果发现算法中,解决纯统计方法的局限和LLM错误问题。

详情
AI中文摘要

从观测数据进行因果发现仍然具有挑战性,因为纯统计方法存在根本性限制,例如等价类内的统计可区分性和对有限样本量的敏感性。虽然大型语言模型(LLM)提供了有希望的领域知识来源来补充统计推断,但现有的LLM增强方法容易受到LLM错误的影响,并且产生高昂的令牌成本。此外,依赖单一数据驱动算法可能使结果对算法特定偏差敏感。为了解决这些限制,我们提出了CauTion,一个通过共识过滤和LLM可靠性估计将LLM领域知识可靠地集成到统计因果发现算法集成中的框架。CauTion分三个阶段进行。首先,算法集成利用共识投票解决算法一致的最多96%的边,在过滤后的共识边上实现接近完美的准确性。其次,一个信任校准仲裁机制通过无注释的信任校准过程估计LLM和算法的相对可靠性,然后用于控制信任加权投票过程,将LLM仲裁限制在算法证据不可靠的边上。第三,应用循环修复步骤确保最终因果图是有效的无环图。在六个数据集上的实验表明,CauTion在性能上始终优于数据驱动和LLM增强的基线,在更大的图上获得更大的收益,并且对LLM错误具有强大的鲁棒性。代码可在以下网址获取:https://github.com/OpenCausaLab/CauTion。

英文摘要

Causal discovery from observational data remains challenging due to the fundamental limitations of purely statistical methods, such as statistical distinguishability within equivalence classes and sensitivity to finite sample sizes. While large language models (LLMs) offer a promising source of domain knowledge to complement statistical inference, existing LLM-augmented methods are vulnerable to LLM errors and incur high token costs. Moreover, reliance on a single data-centric algorithm can make results sensitive to algorithm-specific biases. To address these limitations, we propose CauTion, a framework that reliably integrates LLM domain knowledge into an ensemble of statistical causal discovery algorithms through consensus filtering and LLM reliability estimation. CauTion proceeds in three stages. First, an algorithm ensemble uses consensus voting to resolve up to 96% of edges on which algorithms agree, achieving near-perfect accuracy on the filtered consensus edges. Second, a trust-calibrated arbitration mechanism estimates the relative reliability of the LLM and the algorithms via an annotation-free trust calibration procedure, which then governs a trust-weighted voting process that restricts LLM arbitration exclusively to edges with unreliable algorithmic evidence. Third, a cycle repair step is applied to guarantee that the final causal graph is a valid acyclic graph. Experiments on six datasets demonstrate that CauTion consistently outperforms both data-centric and LLM-augmented baselines, with larger gains on larger graphs and strong robustness to LLM errors. Code is available at https://github.com/OpenCausaLab/CauTion.
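
The first stage (consensus filtering) can be illustrated with a small sketch. It assumes that "agreement" means unanimity across the ensemble and that each algorithm's output is a set of directed edges; both are simplifications introduced here, and the function name is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # directed edge (cause, effect)


def consensus_filter(
    graphs: List[Set[Edge]], candidates: Set[Edge]
) -> Tuple[Dict[Edge, bool], Set[Edge]]:
    """Split candidate edges into unanimously decided edges and disputed edges."""
    resolved: Dict[Edge, bool] = {}
    disputed: Set[Edge] = set()
    for edge in candidates:
        votes = sum(edge in graph for graph in graphs)
        if votes == len(graphs):
            resolved[edge] = True    # every algorithm includes the edge
        elif votes == 0:
            resolved[edge] = False   # every algorithm excludes the edge
        else:
            disputed.add(edge)       # left for trust-weighted arbitration
    return resolved, disputed
```

Only the disputed set would be passed on to the LLM-assisted arbitration stage, which is how the approach keeps token costs down; the arbitration and cycle-repair stages are not sketched here.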

2606.03584 2026-06-03 cs.LG cond-mat.dis-nn cs.NE 版本更新

Training a Predictive Coding Network on ImageNet using Equilibrium Propagation

使用均衡传播在ImageNet上训练预测编码网络

Tugdual Kerjan, Rasmus Høier, Benjamin Scellier

发表机构 * Rain AI

AI总结 提出一种结合中心化均衡传播与新型均衡方案的预测编码网络训练方法,在ImageNet上训练10层卷积PCN,达到13.23% top-5错误率,接近反向传播基线。

详情
AI中文摘要

均衡传播(EP)是一种基于物理的训练框架,主要应用于能量模型,包括连续Hopfield网络、非线性电阻网络和耦合相位振荡器。然而,EP的实际应用至今仍局限于相对小规模的问题。预测编码网络(PCN)是另一类根植于计算神经科学的能量模型,通常使用专门的算法训练,同样尚未在大规模上得到验证。在这项工作中,我们开发了一种基于EP的PCN训练方法,该方法将中心化EP与一种新的PCN均衡方案相结合。使用这种方法,我们在全尺寸ImageNet上训练了一个10层卷积PCN(VGG10),在top-5分类任务上实现了13.23%的测试错误率,接近12.2%的反向传播基线。据我们所知,这是PCN和基于EP的训练首次在ImageNet规模上得到验证。这些结果显著扩展了两种方法的可扩展性,并表明在其他物理系统中扩展EP的主要挑战可能更多地来自这些系统的计算特性,而非EP框架本身的固有限制。

英文摘要

Equilibrium Propagation (EP) is a physics-based training framework that has primarily been employed in energy-based models, including continuous Hopfield networks, nonlinear resistive networks and coupled phase oscillators. However, EP's practical applications have so far remained limited to relatively small-scale problems. Predictive coding networks (PCNs), another class of energy-based models rooted in computational neuroscience, are typically trained with a specialized algorithm and have likewise not yet been demonstrated at large scale. In this work, we develop an EP-based training method for PCNs which combines the centered variant of EP with a novel equilibration scheme for PCNs. Using this approach, we train a 10-layer convolutional PCN (VGG10) on full-size ImageNet, achieving 13.23% test error rate on the top-5 classification task, close to the 12.2% backpropagation baseline. To our knowledge, this is the first demonstration of both PCNs and EP-based training at ImageNet scale. These results significantly extend the scalability of both approaches and suggest that the primary challenges in scaling EP in other physical systems may come more from the computational properties of these systems than from inherent limitations of the EP framework.

2606.03568 2026-06-03 cs.CV cs.AI cs.LG cs.RO 版本更新

Learned Non-Maximum Suppression for 3D Object Detection

用于3D目标检测的学习型非极大值抑制

Timo Osterburg, Stefan Schütte, Torsten Bertram

发表机构 * Institute of Control Theory and Systems Engineering, TU Dortmund University(控制理论与系统工程研究所,多特蒙德技术大学)

AI总结 提出两种基于学习的过滤模块(D2D-Rescore和GossipNet3D)替代启发式NMS,通过检测间关系提升3D检测性能,尤其改善小物体和稀有类别的检测精度。

Comments 6 pages, accepted at IEEE Intelligent Vehicles Symposium (IV) 2026

详情
AI中文摘要

后处理是基于激光雷达的3D目标检测中的关键阶段,必须过滤密集且重叠的提议以实现紧凑可靠的感知。本文引入了两个学习型过滤模块,通过利用检测之间的关系来替代启发式非极大值抑制(NMS)。D2D-Rescore采用基于Transformer的检测到检测(D2D)注意力,而GossipNet3D通过鸟瞰图中的局部消息传递将2D GossipNet概念适应到3D。一种与nuScenes评估协议对齐的度量感知匹配策略确保了训练和验证行为的一致性,从而提高了整体检测性能。与CircleNMS相比,两种方法都提高了平均精度(mAP)、nuScenes检测分数(NDS)和真阳性质量,特别是对于小物体和稀有类别,同时增加了最小的计算开销。这些结果表明,学习型的检测级过滤可以在不修改基础网络的情况下增强3D检测器的可靠性,为启发式抑制提供了一种原则性的替代方案。代码可在以下网址获取:https://github.com/rst-tu-dortmund/learned-3d-nms。

英文摘要

Post-processing is a critical stage in LiDAR-based 3D object detection, where dense and overlapping proposals must be filtered for compact and reliable perception. This work introduces two learned filtering modules that replace heuristic non-maximum suppression (NMS) by leveraging relations among detections. D2D-Rescore employs transformer-based detection-to-detection (D2D) attention, while GossipNet3D adapts the 2D GossipNet concept to 3D through localized message passing in bird's-eye view. A metric-aware matching strategy aligned with the nuScenes evaluation protocol ensures consistent training and validation behavior, improving overall detection performance. Both approaches improve mean average precision (mAP), nuScenes detection score (NDS), and true positive quality compared to CircleNMS, particularly for small and infrequent classes, while adding minimal computational overhead. These results demonstrate that learned, detection-level filtering can enhance 3D detector reliability without modifying the base network, offering a principled alternative to heuristic suppression. Code is available at https://github.com/rst-tu-dortmund/learned-3d-nms.

2606.03549 2026-06-03 cs.LG math.PR 版本更新

How Many Trees in a Random Forest? A Revisited Approach with Plateau Search and Optuna Integration

随机森林中需要多少棵树?一种结合平台搜索与Optuna集成的重新审视方法

Vadim Porvatov, Andrey Dukhovny, Andrey Lange

发表机构 * Sberbank Skolkovo Institute of Science and Technology (Skoltech)(Skoltech) Federal Research Center "Computer Science and Control" of Russian Academy of Sciences (FRC CSC RAS)(俄罗斯科学院计算机科学与控制联邦研究中心)

AI总结 提出一种基于三元组平台搜索的算法,通过监控袋外分数的相对变化自动确定随机森林的树数量,避免预设搜索范围,并提供了理论分析和实验验证。

详情
AI中文摘要

随机森林的超参数优化在调整树数量时面临一个特定困难:预测分数通常随集成规模单调提升,因此诸如树结构Parzen估计器(TPE)和Hyperband等标准方法需要预定义搜索范围,且往往将估计推向其右边界。早停策略避免了固定这样的范围,但对分数噪声敏感且容易过早停止。为解决此问题,我们提出一种集成的基于三元组的平台搜索算法,该算法将树数量从直接TPE搜索空间中移除,同时仍利用跨HPO试验积累的信息。该方法通过监控三个森林规模上的袋外(OOB)分数相对变化,自适应地跟踪接近最小的充分集成规模,并相应移动该三元组。这产生了一个基于容差参数的自动化且用户可解释的过程。我们还提供了理论分析:我们将所提出的相对OOB分数准则与当前分数和极限分数之间的差距联系起来,并推导了相应的基于OOB的绝对相对差异的渐近方差估计。实验表明,所选树数量可能与常见启发式方法有显著差异:对于大多数经典基准数据集,它更小;而对于一些高维生物信息学数据集(如Arcene和Dorothea),则更大。源代码和可重复实验可在以下网址获取:https://github.com/lange-am/rf_plateau_hpo。

英文摘要

Hyperparameter optimization (HPO) for Random Forest faces a specific difficulty in tuning the number of trees: the predictive score typically improves monotonically with ensemble size, so standard methods such as Tree-structured Parzen Estimator (TPE) and Hyperband require a predefined search range and often drive the estimate toward its right boundary. Early-stopping strategies avoid fixing such a range, but can be sensitive to score noise and prone to premature stopping. To address this, we propose an integrated triplet-based plateau-search algorithm that removes the number of trees from the direct TPE search space and still exploits information accumulated across HPO trials. The method adaptively tracks a near-minimal sufficient ensemble size by monitoring relative changes in the out-of-bag (OOB) score across a triplet of forest sizes and shifting this triplet accordingly. This yields an automated and user-interpretable procedure based on a tolerance parameter. We also provide a theoretical analysis: we relate the proposed relative OOB-score criterion to the gap between the current and limiting scores, and derive an asymptotic variance estimate for the corresponding OOB-based absolute relative difference. Experiments show that the selected number of trees can differ substantially from the common heuristic: for most classical benchmark datasets it is smaller, whereas for some high-dimensional bioinformatics datasets, such as Arcene and Dorothea, it is larger. The source code and reproducible experiments are available at https://github.com/lange-am/rf_plateau_hpo.
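
A simplified, one-directional version of the triplet idea can be sketched as below. The geometric triplet `(n, ratio*n, ratio^2*n)`, the stopping rule, and all parameter names are assumptions made for illustration; the paper's algorithm also shifts the triplet using information shared across HPO trials, which is omitted. `oob_score` is a hypothetical callable that fits a forest of the given size and returns its (nonzero) OOB score.

```python
from typing import Callable


def plateau_search(
    oob_score: Callable[[int], float],
    start: int = 8,
    ratio: int = 2,
    tol: float = 0.01,
    max_trees: int = 4096,
) -> int:
    """Shift a triplet of forest sizes right until the OOB score has flattened."""
    n = start
    while n * ratio * ratio <= max_trees:
        s1, s2, s3 = (oob_score(n * ratio**k) for k in range(3))
        # Relative changes between consecutive sizes in the triplet.
        rel12 = abs(s2 - s1) / abs(s2)
        rel23 = abs(s3 - s2) / abs(s3)
        if max(rel12, rel23) <= tol:
            return n  # plateau reached: smallest size of the flat triplet
        n *= ratio    # score still improving: move the triplet right
    return n


# Synthetic stand-in for an OOB score that saturates as the forest grows.
print(plateau_search(lambda trees: 0.9 - 0.5 / trees))  # prints 32
```

The tolerance is the single user-facing knob: a smaller `tol` demands a flatter plateau and therefore selects a larger forest.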

2606.03532 2026-06-03 cs.LG cs.AI 版本更新

When Should the Teacher Move? Temporal Coupling and Stability in Self On-Policy Distillation

教师何时应该移动?自在线策略蒸馏中的时间耦合与稳定性

Haowei Guo, Baolong Bi, Ruicheng Zhang, Bingqian Sun, Wentao Zhang

发表机构 * Peking University(北京大学) University of Chinese Academy of Sciences(中国科学院大学) Tsinghua University(清华大学)

AI总结 研究自在线策略蒸馏中教师更新调度对稳定性的影响,提出基于隔离期和门控机制的CGTR方法,实现零崩溃和最佳性能。

详情
AI中文摘要

自在线策略蒸馏针对从自身参数历史派生的教师训练学生策略,但教师的更新调度(它控制教师与学生之间的时间耦合)尚未作为稳定性变量被系统研究。通过对Qwen3-8B进行受控调度扫描,我们确定隔离期(定义为更新之间教师完全冻结)是实现稳定学习的关键结构属性,而非教师年龄。为了刻画这些底层训练动态,我们引入了一个诊断框架,包括时间KL结构、刷新冲击和长度尾部风险。该框架进一步揭示了状态无感知崩溃:最优的短时程固定调度在长时程训练下灾难性失败,因为时钟驱动的刷新可以在单个不可逆步骤中将短暂漂移的学生复制到教师中。这种失败模式在短时程评估下不可见,并且在机制上不同于EMA的慢性污染。为了解决这个问题,我们提出了巩固门控教师刷新(CGTR),它在保持隔离期的同时,基于奖励改进和长度尾部安全的联合证据对每次刷新进行门控,确保每次教师移动响应于真正的学生巩固而非时钟信号。使用单一共享参数集且无需每数据集重新调整,CGTR在所有四个任务(化学、生物学、物理学、工具使用)上实现了零崩溃和最佳最终分数,并自动调节其刷新频率以适应每个任务的学习动态。

英文摘要

Self on-policy distillation trains a student policy against a teacher derived from its own parameter history, yet the teacher's update schedule, which governs the temporal coupling between teacher and student, has not been systematically studied as a stability variable. Through a controlled schedule sweep on Qwen3-8B, we establish that isolation periods, defined as complete teacher freezing between updates, are the key structural property enabling stable learning, not teacher age. To characterize these underlying training dynamics, we introduce a diagnostic framework of temporal KL structure, refresh shock, and length-tail risk. This framework further uncovers state-oblivious collapse: optimal short-horizon fixed schedules catastrophically fail under long-horizon training because a clock-driven refresh can copy a transiently drifting student into the teacher in a single, irreversible step. This failure mode is invisible under short-horizon evaluation and mechanistically distinct from EMA's chronic contamination. To address this, we propose Consolidation-Gated Teacher Refresh (CGTR), which preserves isolation periods while gating each refresh on joint evidence of reward improvement and length-tail safety, ensuring every teacher movement responds to genuine student consolidation rather than a clock signal. With a single shared parameter set and no per-dataset retuning, CGTR achieves zero collapse and the best final score on all four tasks (Chemistry, Biology, Physics, ToolUse), self-regulating its refresh frequency to each task's learning dynamics.
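
The gating idea (refresh the frozen teacher only on joint evidence of reward improvement and length-tail safety) can be sketched as follows. Everything concrete here is hypothetical: the `TeacherState` fields, the `min_gain` and `tail_limit` thresholds, and the use of a single tail-length statistic are illustrative stand-ins, since the abstract does not give the actual criteria.

```python
from dataclasses import dataclass


@dataclass
class TeacherState:
    refreshed_at_step: int = 0       # step of the last teacher refresh
    reward_at_refresh: float = 0.0   # student reward when the teacher was last copied


def maybe_refresh(
    state: TeacherState,
    step: int,
    reward: float,
    tail_length: float,
    *,
    min_gain: float = 0.02,
    tail_limit: float = 4096.0,
) -> bool:
    """Copy the student into the teacher only when both gate conditions hold."""
    improved = reward - state.reward_at_refresh >= min_gain  # reward-improvement evidence
    tail_safe = tail_length <= tail_limit                    # length-tail safety evidence
    if improved and tail_safe:
        state.refreshed_at_step = step
        state.reward_at_refresh = reward
        return True
    # Otherwise the teacher stays frozen, preserving the isolation period.
    return False
```

Compared with a clock-driven schedule, this gate never fires while the student is transiently drifting, which is the failure the abstract calls state-oblivious collapse.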

2606.03523 2026-06-03 cs.CR cs.AI cs.LG 版本更新

High-Precision APT Malware Attribution with Out-of-Scope Resilience

高精度APT恶意软件归因与越界鲁棒性

Peter Williams, Adam Sobey, Erisa Karafili

发表机构 * Department of Computer Science, University of Oxford(牛津大学计算机科学系)

AI总结 提出基于排名二元分类器与显式弃权的APT恶意软件归因方法,在越界样本占比87%时仍保持92%精度和95%选择性准确率。

详情
AI中文摘要

早期归因高级持续性威胁(APT)活动可帮助防御者优先调查、选择对策并减少入侵影响。恶意软件提供了有用的归因证据,但自动化APT恶意软件归因在实践中仍然困难。现有方法通常作为封闭集分类器在有限数量的已知APT组织上进行训练和评估。然而,在操作环境中,分类器很可能遇到训练中未出现的组织样本。封闭集分类器被迫将这些样本分配给已知组织,产生无根据且可能误导的归因。我们提出一种基于排名二元分类器与显式弃权的高精度APT恶意软件归因方法。我们的方法不是训练单个多类分类器,而是为每个APT组织训练和调整两个二元分类器,根据验证性能对分类器进行排名,并顺序应用它们。仅当分类器提供足够证据时才对样本进行归因;否则,弃权。我们在APT恶意软件数据集和旨在压力测试越界行为的更大组合数据集上评估该方法。在APT恶意软件数据集上,该方法实现了比先前公布结果更高的精度。在最具挑战性的设置中,87%的测试样本来自训练中排除的60个APT组织,该方法对94%的越界样本弃权,同时在其分类的样本上保持92%的精度和95%的选择性准确率。

英文摘要

Early attribution of Advanced Persistent Threat (APT) activity can help defenders prioritise investigation, select countermeasures, and reduce the impact of an intrusion. Malware provides useful attribution evidence, but automated APT malware attribution remains difficult in practice. Existing approaches are typically trained and evaluated as closed-set classifiers over a limited number of known APT groups. In operational environments, however, classifiers are likely to encounter samples from groups not represented during training. Closed-set classifiers are then forced to assign such samples to known groups, producing unsupported and potentially misleading attributions. We present a high-precision APT malware attribution method based on ranked binary classifiers with explicit abstention. Rather than training a single multi-class classifier, our approach trains and tunes two binary classifiers per APT group, ranks the classifiers by validation performance, and applies them sequentially. A sample is attributed only when a classifier provides sufficient evidence; otherwise, it abstains. We evaluate the method on the APT Malware dataset and on a larger combined dataset designed to stress-test out-of-scope behaviour. On the APT Malware dataset, the method achieves higher precision than previously published results on the same dataset. In the most challenging setting, where 87% of test samples came from 60 APT groups excluded from training, the method abstained on 94% of out-of-scope samples while maintaining 92% precision and 95% selective accuracy on the samples it classified.
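
The ranked, sequential decision rule with explicit abstention can be sketched as below. This is an illustrative reading of the abstract, not the authors' code: the tuple layout, the per-classifier threshold, and the scoring callables are assumptions, and the training and tuning of the binary classifiers is not shown.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (group name, validation score, scoring function, decision threshold)
RankedClassifier = Tuple[str, float, Callable[[Sequence[float]], float], float]


def attribute(
    sample: Sequence[float], classifiers: List[RankedClassifier]
) -> Optional[str]:
    """Apply binary classifiers in order of validation performance; abstain if none fires."""
    ranked = sorted(classifiers, key=lambda entry: entry[1], reverse=True)
    for group, _validation_score, score_fn, threshold in ranked:
        if score_fn(sample) >= threshold:
            return group  # the first sufficiently confident classifier decides
    return None  # abstain: no classifier provided sufficient evidence
```

Returning `None` instead of the nearest known group is what keeps out-of-scope samples from being forced into a misleading attribution.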

2606.03521 2026-06-03 cs.LG cs.AI 版本更新

Post-Hoc Robustness for Model-Based Reinforcement Learning

基于模型的强化学习的后验鲁棒性

Siemen Herremans, Ali Anwar, Siegfried Mercelis

发表机构 * Carnegie Mellon University(卡内基梅隆大学)

AI总结 提出一种在推理时利用学习模型和名义策略进行鲁棒策略改进的后验鲁棒化方法,通过对抗性展开的模型预测控制提升鲁棒性,无需额外训练神经网络。

详情
AI中文摘要

为了提高强化学习(RL)在现实世界中的适用性,对抗鲁棒RL领域研究如何在对抗环境扰动下训练智能体。在该设置中,主角智能体在对手的环境扰动下优化策略,形成零和马尔可夫博弈。当对抗鲁棒RL与基于模型的RL结合时,对手可以针对学习到的转移模型而非训练环境。扩展这一思想,本文引入了深度RL智能体在推理时的后验鲁棒化。通过将学习模型与训练的名义策略结合使用,我们的方法执行鲁棒策略改进步骤。目标是提高鲁棒性而无需对神经网络进行额外训练。具体来说,我们利用对抗性展开下的模型预测控制,这些展开通过有界不确定性集内的投影梯度下降进行近似。此外,这些离线展开在执行时考虑并缓解了分布外问题。通过在扰动的Gymnasium MuJoCo环境中评估算法,同时考虑后验推理设置的计算限制,验证了所提方法在鲁棒性上的显著提升。

英文摘要

To improve the real-world applicability of reinforcement learning (RL), the field of adversarially robust RL studies how to train agents under adversarial environment perturbations. In this setting, a protagonist agent optimizes a policy under environmental perturbations from an adversary, resulting in a zero-sum Markov game. When adversarially robust RL is combined with model-based RL, the adversary can target a learned transition model instead of the training environment. Extending this idea, this work introduces post-hoc robustification of deep RL agents at inference time. By using the learned model in combination with a trained nominal policy, our approach performs a robust policy improvement step. The goal is to improve robustness without any additional training of neural networks. Specifically, we utilize model-predictive control under adversarial rollouts, which are approximated via projected gradient descent within a bounded uncertainty set. Furthermore, these offline rollouts are performed while considering and mitigating out-of-distribution issues. The proposed methodology is validated by demonstrating significant improvements in robustness when the algorithm is evaluated in perturbed Gymnasium MuJoCo environments, while considering the computational limitations of the post-hoc inference setting.

2606.03498 2026-06-03 cs.LG cs.DC 版本更新

Demystifying Pipeline Parallelism: First Theory for PipeDream

揭秘流水线并行:PipeDream 的首个理论

Ivan Ilin, Peter Richtárik

发表机构 * KAUST(阿卜杜拉国王科技大学)

AI总结 本文通过引入随机化 PipeDream (RPD) 抽象,首次为 PipeDream 风格方法提供了非凸收敛保证,并分析了其稳态延迟与阶段数的缩放关系,同时与 LocalSGD 进行了比较。

Comments 40 pages, 4 figures

详情
AI中文摘要

训练现代机器学习模型越来越需要跨多个加速器进行分布式计算。数据并行仍然是默认选择,并且通常与张量并行分片相结合,但一旦参数、激活或优化器状态不再适合单个设备,模型并行就变得不可避免。本文通过 PipeDream (PD) (Harlap et al., 2018) 的视角研究流水线模型并行。我们的第一个贡献是理论性的:我们引入了随机化 PipeDream (RPD),一种陈旧块-SGD 抽象,据我们所知,这为 PD 风格方法提供了第一个干净的非凸收敛保证。我们的第二个贡献是扩展诊断:我们证明了稳态 PD 引起的延迟随阶段数 S 增长为 $S^2 - S/2 + O(1)$,因此收敛定理中的陈旧读取贡献缩放为 $\Theta(\gamma^2 S^4)$,在调谐速率形式中等价于 $\Theta(S^4/K)$。我们的第三个贡献是与 LocalSGD 的比较,后者通过周期性模型平均来权衡权重陈旧性与同步气泡。在我们报告的模拟时间实验中,表现更好的方法取决于目标:PD 在二次目标和小型语言建模训练损失任务上表现更好,而对于逻辑回归,随着阶段数增加,LocalSGD 变得优越。

英文摘要

Training modern machine learning models increasingly requires computation to be distributed across many accelerators. Data parallelism remains the default choice and is often paired with tensor-parallel sharding, but model parallelism becomes unavoidable once parameters, activations, or optimizer states no longer fit on a single device. This paper studies pipeline model parallelism through the lens of PipeDream (PD) (Harlap et al., 2018). Our first contribution is theoretical: we introduce Randomized PipeDream (RPD), a stale block-SGD abstraction that yields, to our knowledge, the first clean nonconvex convergence guarantee for a PD-style method. Our second contribution is a scaling diagnosis: we prove that the delay induced by steady-state PD grows as $S^2 - S/2 + O(1)$ for $S$ stages, so the stale-read contribution in the convergence theorem scales as $\Theta(\gamma^2 S^4)$, equivalently as $\Theta(S^4/K)$ in the tuned-rate form. Our third contribution is a comparison with LocalSGD, whose periodic model averaging trades weight staleness for synchronization bubbles. In our reported simulated-time experiments, the better-performing method depends on the objective: PD performs better on the quadratic objective and on a small language-modeling training-loss task, while for logistic regression LocalSGD becomes superior as the number of stages increases.
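
The step from the delay bound to the $S^4$ scaling can be spelled out as follows. This is a reading of the abstract under two assumptions that it does not state explicitly: that the stale-read term is proportional to $\gamma^2 \tau^2$ for delay $\tau$, and that the tuned step size satisfies $\gamma = \Theta(1/\sqrt{K})$ for $K$ iterations.

```latex
\begin{align*}
\tau(S) &= S^2 - \tfrac{S}{2} + O(1), \\
\tau(S)^2 &= S^4 - S^3 + O(S^2) = \Theta(S^4), \\
\gamma^2 \tau(S)^2 &= \Theta(\gamma^2 S^4)
  \quad\xrightarrow{\;\gamma = \Theta(1/\sqrt{K})\;}\quad
  \Theta\!\left(\tfrac{S^4}{K}\right).
\end{align*}
```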

2606.03495 2026-06-03 cs.LG 版本更新

HiSE: A Lightweight Hierarchical Semantic Explainer for Heterogeneous Graph Neural Networks

HiSE:一种用于异构图神经网络的轻量级层次语义解释器

Zongrui Li, Yuhang Zhao, Ying Zhao, Yuanzhao Guo, Qiang Huang, Yuan Tian

发表机构 * School of Artificial Intelligence, Jilin University(吉林大学人工智能学院) Mohamed bin Zayed University of Artificial Intelligence(穆罕默德·本·扎耶德人工智能大学)

AI总结 提出HiSE,一种轻量级特征导向的可解释模型,通过层次语义建模(语义级LASSO稀疏特征学习和跨语义级KL散度自适应融合)实现高保真、低计算开销的异构图神经网络解释。

详情
AI中文摘要

异构图神经网络(HGNNs)在建模复杂关系数据方面表现出色,然而在高风险应用中的可解释性仍然是一个关键挑战。现有的解释方法存在两个主要局限性:一方面,生成的解释未能反映HGNNs固有的语义层次,导致对模型内部决策机制的保真度不足;另一方面,特征解释通常依赖于复杂的搜索或扰动机制,导致计算复杂度过高且效率低下。为了解决这些问题,我们提出了HiSE,一种轻量级特征导向的HGNNs可解释模型。HiSE通过层次语义建模实现语义感知的特征解释:在语义层面,采用基于最小绝对收缩和选择算子(LASSO)的局部代理模型学习每个语义视图下的稀疏特征表示;在跨语义层面,通过KL散度自适应地表征不同语义视图的贡献,生成统一的解释。大量实验表明,HiSE在保真度、鲁棒性和跨语义解释能力方面优于现有方法,同时其轻量级框架具有较低的计算开销,能够高效应用于大规模、复杂的真实世界异构图。

英文摘要

Heterogeneous graph neural networks (HGNNs) have demonstrated remarkable performance in modeling complex relational data, however their interpretability in high-stakes applications remains a critical challenge. Existing explanation methods suffer from two major limitations: on the one hand, the generated explanations fail to reflect the inherent semantic hierarchy of HGNNs, resulting in a lack of fidelity to the model's internal decision-making mechanism; on the other hand, feature explanations often rely on complex search or perturbation mechanisms, leading to excessive computational complexity and poor efficiency. To address these issues, we propose HiSE, a lightweight feature-oriented interpretable model for HGNNs. HiSE achieves semantically aware feature explanations through hierarchical semantic modeling: at the semantic level, local surrogate models based on the Least Absolute Shrinkage and Selection Operator (LASSO) are employed to learn sparse feature representations under each semantic view; at the cross-semantic level, the contributions of different semantic views are adaptively characterized via KL divergence to produce a unified explanation. Extensive experiments demonstrate that HiSE outperforms existing methods in terms of fidelity, robustness, and cross-semantic explanation capability, while its lightweight framework incurs low computational overhead, enabling efficient application to large-scale, complex real-world heterogeneous graphs.

2606.03493 2026-06-03 cs.CV cs.LG 版本更新

Low-Frequency Shortcuts in Texture-Driven Visual Learning

纹理驱动视觉学习中的低频捷径

Utku Şirin, Cathy Hou, David Alvarez-Melis, Stratos Idreos

发表机构 * Harvard University(哈佛大学) Kempner Institute(凯姆纳研究所)

AI总结 本文分析了纹理驱动领域中神经网络依赖低频成分作为捷径的现象,提出通过裁剪低频成分来消除捷径,从而提升分布内准确率和鲁棒性。

Abstract

Neural networks suffer from shortcut learning, where learned features generalize well to the training set but not to in-distribution (ID) or out-of-distribution (OOD) test sets. Existing studies are all based on a few standard benchmarks, which are shape-driven. Numerous application domains, however, are texture-driven. In this work, we present a shortcut learning analysis for texture-driven domains and compare it with that of a standard benchmark. We show that texture-driven domains suffer from low-frequency shortcuts. Models make the majority of their decisions based on a few low-frequency components (LFCs) with a skewed spectral behavior, even though the classification information lies in higher-frequency, fine-grained details. Pruning LFCs from training and test sets eliminates the shortcut and provides a more balanced spectral behavior, improving the ID accuracy by up to 8%. We show that low-frequency shortcuts make the models highly vulnerable to OOD corruptions, leading to an accuracy drop of up to 70% compared to the ID accuracy. Pruning LFCs significantly improves robustness to low-frequency corruptions, by up to 40%, and introduces a trade-off for high-frequency corruptions; the balanced spectral behavior provides a better generalization performance, whereas the increased dependence on high-frequency features reduces it. OOD accuracy depends on the interaction between these two factors.

2606.03483 2026-06-03 cs.LG cs.AI Updated version

Analyzing Stream Collapse in Hyper-Connections: From Diagnosis to Mitigation

Ekaterina Alimaskina, Gleb Molodtsov, Aleksandr Beznosikov

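Affiliations: MIRAI BRAIn Lab Yandex Research Innopolis University

AI summary: Using fine-grained diagnostics, the paper finds that the multi-stream residual connections in Hyper-Connections suffer from stream collapse, where signal concentrates in a dominant stream, and mitigates it by breaking symmetry at initialization to improve performance.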

Abstract

Hyper-Connections (HC) replace the single Transformer residual stream with multiple streams, introducing a permutation symmetry over stream indices. We study how this symmetry is resolved in practice: whether streams specialize in a balanced way or exhibit dominant-stream usage. Using fine-grained diagnostics for HC-based language models, we trace how multi-stream representations are actually used. We find that after an early seeding stage, residual mixing often remains close to identity, limiting a core HC mechanism for exchanging information between streams. Moreover, both signal and interpretable features concentrate in a dominant stream, and the nominally multi-stream residual connection can underutilize its capacity, behaving closer to a single-stream residual pathway. Finally, we show that breaking symmetry at stream initialization reduces dominant behavior and improves performance across mHC variants. Our code is publicly available.

2606.03465 2026-06-03 cs.LG cs.AI Updated version

Rethinking the Role of Tensor Decompositions in Post-Training LLM Compression

Artur Zagitov, Alexander Miasnikov, Maxim Krutikov, Vladimir Aletov, Gleb Molodtsov, Nail Bashirov, Artem Tsedenov, Aleksandr Beznosikov

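Affiliations: University of Florida; National Research University Higher School of Economics

AI summary: Systematically evaluates tensor decompositions for post-training compression of dense and MoE architectures; empirical and theoretical analysis reveals a fundamental mismatch with the heterogeneous representations of LLMs, delineating the practical limits of these methods and their viable role in large-scale deployment.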

Abstract

Post-training compression is essential for deploying large language models (LLMs) under tight resource constraints. Tensor decompositions have emerged as a promising direction, offering compact parameterizations well suited to Transformer weight structures. However, existing studies evaluate these methods in narrow settings, leaving it unclear whether tensorization is effective in large-scale deployment. We systematically evaluate tensor compression across dense and MoE architectures, establishing performance trade-offs grounded in both empirical and theoretical analysis. We identify a fundamental mismatch between the shared subspaces assumed by tensor decompositions and the heterogeneous representations learned by modern LLMs, thereby delineating their practical limits and clarifying their viable role in large-scale deployment. The code is available at https://github.com/brain-lab-research/TT-LLM.

2606.03462 2026-06-03 cs.LG cs.SI Updated version

Topology-Aware Gaussian Graph Repair for Robust Graph Neural Networks

Anubha Goel, Juho Kanniainen

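Affiliations: Computing Science/Financial Computing and Data Analytics Group, Tampere University

AI summary: Proposes the Topology-Aware Gaussian Repair (TAGR) framework, which builds a sparse feature-neighborhood graph with an adaptive Gaussian kernel and combines it with a topology-aware residual correction, improving GNN robustness under noisy-edge and missing-edge settings without changing the network architecture.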

Abstract

Graph neural networks have achieved strong performance on graph-structured data, but their effectiveness depends heavily on the quality of the observed graph. In real applications, graph topology is often imperfect: noisy edges may connect unrelated nodes, while missing edges may prevent useful information from being propagated. Existing robust graph learning methods mainly address this problem by removing suspicious edges or by learning a new graph structure during training. However, edge removal alone cannot recover missing connections, and graph structure learning may introduce additional optimization complexity. In this paper, we propose Topology-Aware Gaussian Repair (TAGR), a simple graph repair framework for robust message passing in graph neural networks. Instead of learning a dense adjacency matrix, TAGR constructs a sparse feature-neighborhood graph using an adaptive Gaussian kernel and combines it with a topology-aware residual correction of the observed graph. The Gaussian repair component introduces auxiliary edges between feature-similar nodes, while the residual correction preserves and reweights the original topology according to local feature and structural consistency. The repaired graph can be used directly with standard graph neural networks without changing their architectures. Extensive experiments on benchmark citation networks show that TAGR improves the robustness of GNNs under both noisy-edge and missing-edge settings. Further analysis shows that Gaussian feature-neighborhood repair provides the main robustness gain, while topology-aware residual correction improves stability when the observed graph is incomplete. These results suggest that effective graph robustness can be achieved through lightweight sparse graph repair rather than dense graph structure learning.
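
As a rough illustration of the Gaussian repair idea in the abstract above, the following sketch builds a sparse feature-neighborhood graph by linking each node to its k nearest neighbors in feature space and weighting the edges with a Gaussian kernel. The function name `gaussian_knn_graph`, the parameter `k`, and the per-node bandwidth rule (distance to the k-th nearest neighbor) are assumptions made for this example; the abstract does not specify them, and the topology-aware residual correction is not shown.

```python
import math

def gaussian_knn_graph(features, k=2):
    """Return {i: {j: weight}} linking each node to its k nearest feature neighbors.

    Weights are exp(-d^2 / (sigma_i * sigma_j)), where sigma_i is node i's distance
    to its k-th nearest neighbor (an assumed "adaptive" bandwidth rule).
    Requires k <= len(features) - 1.
    """
    n = len(features)
    dist = [[math.dist(features[i], features[j]) for j in range(n)] for i in range(n)]
    neighbors, sigma = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        sigma.append(max(dist[i][order[k - 1]], 1e-12))  # guard against zero bandwidth
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in neighbors[i]:
            w = math.exp(-dist[i][j] ** 2 / (sigma[i] * sigma[j]))
            graph[i][j] = w
            graph[j][i] = w  # keep the auxiliary graph symmetric
    return graph

# Two well-separated feature clusters: auxiliary edges stay inside each cluster.
graph = gaussian_knn_graph([(0, 0), (0, 1), (5, 5), (5, 6)], k=1)
```

Because only k neighbors per node are kept, the number of auxiliary edges grows linearly with the number of nodes, which is the sparsity the abstract contrasts with learning a dense adjacency matrix.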

2606.03458 2026-06-03 cs.LG Updated version

KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks

Lorenz K. Muller, Philippe Bich, Chiara Boretti, Hyun-Min Chang, Jiawei Zhuang, Lukas Cavigelli

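Affiliations: Huawei

AI summary: Proposes KVarN, a calibration-free KV-cache quantization method that uses a Hadamard rotation and dual-scaling variance normalization to reduce quantization error accumulation during autoregressive decoding, reaching state-of-the-art results on generative benchmarks at 2-bit precision.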

Abstract

Test-time scaling is a powerful approach to obtain better reasoning in large language models, but it becomes memory-bottlenecked during long-horizon decoding, as the KV-cache grows. KV-cache quantization can help improve this, but current methods are evaluated under prefill-like settings and errors behave differently under autoregressive decoding. We show that in the latter regime, quantization errors accumulate across timesteps, driven primarily by incorrect token scales. We introduce KVarN, a calibration-free KV-cache quantizer that applies a Hadamard rotation followed by a dual-scaling variance normalization across both axes of the K and V matrices. We find that this combination fixes outlying token-scale errors and substantially reduces error accumulation over existing baselines. KVarN establishes a new state-of-the-art for KV-cache quantization on generative benchmarks, including MATH500, AIME24 and HumanEval, at 2-bit precision. A vLLM implementation of the KVarN method is available at https://github.com/huawei-csl/KVarN
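
The pipeline named in the abstract above (Hadamard rotation, then normalization along both axes, then low-bit rounding) can be sketched roughly as follows. This is not the KVarN implementation: the use of root-mean-square scales as the "variance normalization", the uniform 4-level grid, and all function names are assumptions chosen to keep the example small.

```python
import math

def hadamard_rotate(vec):
    """Orthonormal fast Walsh-Hadamard transform; len(vec) must be a power of two."""
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return [x / math.sqrt(n) for x in out]

def rms(values):
    """Root mean square, falling back to 1.0 for an all-zero input."""
    return math.sqrt(sum(v * v for v in values) / len(values)) or 1.0

def quantize_2bit(matrix):
    """Rotate each row (token), rescale rows and then columns, round to 4 levels."""
    rotated = [hadamard_rotate(row) for row in matrix]
    row_scale = [rms(row) for row in rotated]            # one scale per token
    scaled = [[x / s for x in row] for row, s in zip(rotated, row_scale)]
    col_scale = [rms(col) for col in zip(*scaled)]       # one scale per channel
    scaled = [[x / c for x, c in zip(row, col_scale)] for row in scaled]
    levels = [-1.5, -0.5, 0.5, 1.5]                      # assumed uniform 2-bit grid
    codes = [[min(range(4), key=lambda q: abs(levels[q] - x)) for x in row]
             for row in scaled]
    return codes, row_scale, col_scale, levels

def dequantize(codes, row_scale, col_scale, levels):
    """Undo the column scale, the row scale, and the rotation, in that order."""
    scaled = [[levels[q] * c for q, c in zip(row, col_scale)] for row in codes]
    rotated = [[x * s for x in row] for row, s in zip(scaled, row_scale)]
    # The orthonormal Hadamard transform is its own inverse.
    return [hadamard_rotate(row) for row in rotated]

keys = [[1.0, 2.0, 3.0, 4.0], [10.0, -20.0, 30.0, -40.0]]
codes, row_scale, col_scale, levels = quantize_2bit(keys)
restored = dequantize(codes, row_scale, col_scale, levels)
```

The second row is ten times larger than the first; the per-token scale is what keeps both rows on the same 4-level grid, which is the kind of token-scale handling the abstract identifies as the main driver of error accumulation.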

2606.03432 2026-06-03 cs.CR cs.AI cs.LG Updated version

A Hybrid Approach For Malware Classification Using Secondary Features Fusion

Raja Khurram Shahzad, Muhammad Mustaqeem, Haroon Elahi

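AI summary: Proposes a method for malware detection and family classification that fuses API-call and n-gram features and uses a voting-based ensemble, reaching 99.72% accuracy and 0.989 AUC on the Microsoft dataset.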

Abstract

The number of malware samples (whether variants or novel) is rapidly increasing, making malware detection and mitigation a complex problem. One approach to improving malware mitigation is automatic detection and malware family classification. However, traditional malware detection methods cannot classify detected malware into their respective families, hindering effective malware mitigation. Consequently, this paper proposes a method to automate malware detection and the classification of detected malware into their respective families. The proposed method uses feature fusion after extracting relevant malware features, such as API calls and fixed- and variable-length n-grams, with a customized feature selection method. Moreover, for the predictive model, a voting-based approach is proposed for algorithm fusion. For the experimental evaluation of the proposed method, both binary and multi-class classification approaches are applied to the data set provided by Microsoft. Finally, the experimental results are compared with the state of the art. The experimental results indicate the effectiveness and efficiency of the proposed approach with an AUC of 0.989, accuracy of 99.72%, and a log loss of 0.01.
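
Two of the building blocks mentioned in the abstract above, fixed-length n-gram extraction and voting-based fusion of classifier outputs, can be sketched as below. The helper names and the use of simple majority voting are assumptions for illustration; the paper's customized feature selection and its actual voting scheme are not described in the abstract.

```python
from collections import Counter

def byte_ngrams(data, n):
    """Count every contiguous n-byte sequence in a binary blob."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))

def majority_vote(predictions):
    """Return the label predicted by the most classifiers (first seen wins ties)."""
    return Counter(predictions).most_common(1)[0][0]

features = byte_ngrams(b"abab", 2)
label = majority_vote(["family_a", "family_b", "family_a"])
```

Variable-length n-grams could be approximated by merging the counters for several values of `n`.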

2606.03428 2026-06-03 cs.NE cs.AI cs.LG Updated version

PrimeSVT: An Automated Memory-aware Pruning Framework with Prioritized Compression Policy for Spiking Vision Transformers

Rachmad Vidya Wicaksana Putra, Achyuta Muthuvelan, Alberto Marchisio, Muhammad Shafique

Affiliations: eBRAIN Lab, Division of Engineering, New York University (NYU) Abu Dhabi; New York University (NYU) Abu Dhabi, United Arab Emirates (UAE)

AI summary: Proposes the PrimeSVT framework, which compresses Spiking Vision Transformers under accuracy and memory constraints through automated structured pruning with a prioritized compression policy, saving 26.68% memory with less than 3% accuracy loss.

Comments 8 pages, 8 figures, 3 tables

Abstract

The large sizes of Spiking Vision Transformers (SViTs) still hinder their embedded implementation, highlighting the need for model compression. State-of-the-art works compress SViT models through unstructured pruning, which needs specialized hardware accelerators that exploit the resulting sparsity patterns to maximize efficiency gains. Moreover, their manual approach requires substantial design time to find an appropriate pruning setting for each network, which makes it hard to scale. To address this limitation, we propose PrimeSVT, a novel framework that performs automated memory-aware structured pruning on pre-trained SViT models, thereby maximizing efficiency gains during inference on widely used computing architectures. To achieve this, PrimeSVT first sorts the SViT layers based on their sizes (i.e., number of parameters), identifies the targeted pruning layers based on their robustness under different pruning rates, then leverages this order for compressing the model layer-by-layer sequentially from the largest one to the smallest one (i.e., the so-called prioritized compression policy), while considering the user-defined constraints (i.e., acceptable accuracy and memory saving). In each layer, PrimeSVT employs channel-wise filter pruning based on L2-norm values to structurally remove the non-significant weights. Experimental results show that PrimeSVT saves 26.68% memory through automated single-shot pruning, while preserving accuracy within 3% (70.3% without fine-tuning and 72.9% with fine-tuning) of the original unpruned SViT model (73.3%), thus meeting the accuracy and memory constraints. These results show that our PrimeSVT framework enables design automation for SViTs and their embedded implementation.
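
A minimal sketch of the two mechanisms named in the abstract above: L2-norm filter pruning inside a layer, and a largest-layer-first ordering that stops once a memory target is met. The function names, the single shared `prune_rate`, and the parameter-count stand-in for memory are assumptions; the robustness analysis and accuracy constraint that PrimeSVT uses to pick layers and rates are omitted, as is the matching adjustment of the next layer's input channels.

```python
import math

def prune_filters_l2(weights, prune_rate):
    """Drop the `prune_rate` fraction of filters (rows) with the smallest L2 norms."""
    norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
    n_drop = int(len(weights) * prune_rate)
    drop = set(sorted(range(len(weights)), key=lambda i: norms[i])[:n_drop])
    return [row for i, row in enumerate(weights) if i not in drop]

def prioritized_pruning(layers, prune_rate, memory_target):
    """Prune layers from largest to smallest until the parameter count meets the target."""
    def size(layer):
        return sum(len(row) for row in layer)

    pruned = dict(layers)
    for name in sorted(layers, key=lambda n: size(layers[n]), reverse=True):
        if sum(size(layer) for layer in pruned.values()) <= memory_target:
            break  # constraint already satisfied; leave smaller layers untouched
        pruned[name] = prune_filters_l2(layers[name], prune_rate)
    return pruned

layers = {"a": [[3, 4], [0.1, 0.1], [1, 0], [0, 2]], "b": [[1, 1]]}
result = prioritized_pruning(layers, prune_rate=0.5, memory_target=6)
```

In this toy run only layer "a" is pruned: removing its two lowest-norm filters already brings the total from 10 parameters down to the target of 6, so the smaller layer "b" is left intact.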

2606.03391 2026-06-03 cs.LG cs.AI cs.CL Updated version

When Model Merging Breaks Routing: Training-Free Calibration for MoE

Canbin Huang, Tianyuan Shi, Xiaojun Quan, Jingang Wang, Jianfei Zhang, Qifan Wang

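Affiliations: University of Science and Technology of China

AI summary: To address the routing breakdown caused by model merging in MoE architectures, proposes HARC, a training-free calibration method based on second-order curvature that efficiently realigns the router via a closed-form solution and the conjugate gradient method, substantially improving mathematical reasoning and code generation performance.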

Abstract

Model merging has emerged as a cost-effective approach for consolidating the capabilities of multiple LLMs without retraining. However, existing merging techniques, largely based on linear parameter arithmetic or optimization, struggle when applied to Mixture-of-Experts (MoE) architectures. We identify a critical failure mode in MoE merging, termed routing breakdown, in which the merged router fails to dispatch tokens to suitable experts. Routing breakdown stems from the sensitivity of the non-linear softmax and discrete Top-k routing mechanisms to parameter perturbations from merging, a sensitivity further amplified by load-balancing constraints imposed during MoE pretraining. Because fine-tuned experts exhibit distinct specializations, even modest misrouting can cause severe performance degradation. To address this issue, we propose Hessian-Aware Router Calibration (HARC), a training-free framework that leverages second-order curvature information to realign the merged router. This approach admits a closed-form solution that can be efficiently solved using a matrix-free conjugate gradient method. Experiments on mathematical reasoning and code generation tasks show that HARC effectively mitigates routing breakdown across diverse MoE merging baselines and leads to substantial performance improvements. Our code is available at https://github.com/huangcb01/HARC.
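
The abstract above mentions solving for the calibrated router with a matrix-free conjugate gradient method. The sketch below shows the generic technique only: it solves A x = b for a symmetric positive-definite A that is available solely through a matrix-vector product. The function name and the small 2x2 system standing in for a curvature operator are assumptions; HARC's actual linear system is not given in the abstract.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A, given only v -> A v."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Hypothetical stand-in for a Hessian-vector product: A = [[4, 1], [1, 3]].
def hvp(v):
    return [4 * v[0] + v[1], v[0] + 3 * v[1]]

solution = conjugate_gradient(hvp, [1.0, 2.0])
```

The solver never forms A explicitly, which is why the approach scales to curvature matrices that are too large to store.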

2606.03365 2026-06-03 cs.LG Updated version

Link Prediction or Perdition: the Seeds of Instability in Knowledge Graph Embeddings

Guillaume Méroué, Fabien Gandon, Pierre Monnin

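Affiliations: Université Côte d’Azur, Inria, CNRS, I3S, France

AI summary: Systematically analyzes the stability of multiple knowledge graph embedding models in link prediction, finding that high-performance models are markedly unstable in their triple-level predictions and embedding spaces, that each stochastic factor (e.g., initialization, negative sampling) independently induces instability of comparable magnitude, and that voting improves stability only to a limited extent.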

Comments Paper accepted at ESWC 2026 (https://2026.eswc-conferences.org)

Abstract

Embedding models (KGEMs) constitute the main link prediction approach to complete knowledge graphs. Standard evaluation protocols emphasize rank-based metrics such as MRR or Hits@$K$, but usually overlook the influence of random seeds on result stability. Moreover, these metrics conceal potential instabilities in individual predictions and in the organization of embedding spaces. In this work, we conduct a systematic stability analysis of multiple KGEMs across several datasets. We find that high-performance models actually produce divergent predictions at the triple level and highly variable embedding spaces. By isolating stochastic factors (i.e., initialization, triple ordering, negative sampling, dropout, hardware), we show that each independently induces instability of comparable magnitude. Furthermore, for a given model, hyperparameter configurations with better MRR are not guaranteed to be more stable. Moreover, voting, albeit a known remediation mechanism, only provides a limited enhancement of stability. These findings highlight critical limitations of current benchmarking protocols, and raise concerns about the reliability of KGEMs for knowledge graph completion.
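
One simple way to quantify the triple-level divergence described in the abstract above is to compare the top-K predictions produced by the same model under different seeds. The sketch below uses mean pairwise Jaccard overlap; this particular measure and the function name are assumptions for illustration and may differ from the stability measures used in the paper.

```python
from itertools import combinations

def topk_jaccard(rankings_by_seed, k):
    """Mean pairwise Jaccard overlap of the top-k predicted entities across seeds."""
    tops = [set(ranking[:k]) for ranking in rankings_by_seed]
    pairs = list(combinations(tops, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Ranked tail candidates for one (head, relation) query under three seeds.
overlap = topk_jaccard([["a", "b", "c"], ["a", "c", "b"], ["b", "a", "d"]], k=2)
```

Two runs can share the same Hits@K on a test set while returning different top-K sets for individual queries, which is why a per-query overlap of this kind exposes instability that aggregate rank metrics hide.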

2606.03361 2026-06-03 cs.LG Updated version

Mitigating False Credit Propagation: Probabilistic Graphical Reward Aggregation for Rubric-Based Reinforcement Learning

Can Lv, Mingju Chen, Heng Chang, Shiji Zhou

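Affiliations: Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, School of Artificial Intelligence, Beihang University; Tsinghua University

AI summary: To address false credit propagation, which arises when rubric rewards ignore dependencies among criteria, proposes the probabilistic graphical framework Graphical Event Aggregation for Rubric rewards (GEAR), which models latent Bernoulli events and propagates soft suppression for dependency-aware reward aggregation, improving performance on multiple benchmarks and reducing credit leakage.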

Abstract

Rubric-based rewards are increasingly used for open-ended language model post-training, but criterion-level scores are often aggregated as independent utilities. This flat scalarization ignores rubric-specified prerequisite and activation relations among criteria, allowing reward or penalty to be counted even when the condition that licenses it is absent. We call this structural reward-aggregation failure False Credit Propagation (FCP). To address this limitation, we propose GEAR (Graphical Event Aggregation for Rubric rewards), a probabilistic graphical framework for dependency-aware rubric aggregation. GEAR models each criterion outcome as a latent Bernoulli event in a typed rubric graph, propagates soft suppression from unsupported parent events to their children, and aggregates the resulting event probabilities into a normalized expected signed utility. This yields a linear-time reward computation that can be plugged into standard rubric-based RL pipelines without changing the outer optimization algorithm. Experiments on HealthBench, WritingBench, and PLawBench with two policy backbones show that GEAR consistently improves over flat aggregation and deterministic gating, achieving relative gains of up to 15.5% over flat aggregation. FCP diagnostics further show that GEAR reduces leakage by 96.5% relative to flat aggregation while preserving more licensed downstream utility than deterministic gating. Our code is publicly available at https://github.com/LvCan926/GEAR.

2606.03359 2026-06-03 cs.SD cs.CL cs.LG Updated version

Speech Emotion Recognition using Attention-based LSTM-Network with Residual Connection

Daniil Krasnoproshin, Maxim Vashkevich

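Affiliations: Institute of Cybernetics and Machine Learning, Belarusian State University

AI summary: Proposes ResLSTM-SA, a lightweight architecture that integrates residual connections and soft attention into an LSTM; it reaches 0.6517 UAR on the RAVDESS dataset with 46.8k parameters, outperforming conventional baselines and suiting edge deployment.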

Comments 6 pages, 5 figures, DSPA 2026

Abstract

Speech emotion recognition is an important component of modern human-computer interaction systems. However, many state-of-the-art approaches rely on large pretrained models with high computational and memory requirements, limiting their applicability. This paper proposes ResLSTM-SA, a lightweight architecture that integrates residual connections with soft attention within an LSTM-based framework. Evaluated on the RAVDESS dataset under strict speaker-independent partitioning, the proposed model outperforms conventional attention-based LSTM baselines and several previously reported CNN- and hybrid CNN-LSTM architectures in terms of unweighted average recall (UAR). The best-performing variant (ResLSTM-SA-h64) achieves a maximum UAR of 0.6517 with only 46.8k trainable parameters, delivering competitive accuracy with three orders of magnitude fewer parameters than large-scale self-supervised alternatives, thereby enabling efficient deployment on edge devices and real-time voice assistants. The source code is available at https://github.com/Mak-Sim/ResLSTM-SER.
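
The abstract above reports results as unweighted average recall (UAR), which is the mean of the per-class recalls, so every emotion class counts equally regardless of how many samples it has. A minimal sketch of the metric follows; the function name and the toy labels are chosen for this example.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall; each class contributes equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

y_true = ["happy"] * 4 + ["sad"]
y_pred = ["happy"] * 5          # the classifier never predicts "sad"
uar = unweighted_average_recall(y_true, y_pred)
```

Here plain accuracy would be 0.8, while UAR is 0.5 because recall on the minority class is zero; this is why UAR is preferred on datasets with uneven class sizes.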

2606.03358 2026-06-03 cs.LG Updated version

The Impact of Temporal Granularity on Socio-Demographic Inference from Household Load Profiles

Dejan Radovanovic, Maximilian Schirl, Andreas Unterweger, Günther Eibl

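Affiliations: Center for Secure Energy Informatics; Salzburg University of Applied Sciences; Paris Lodron University of Salzburg

AI summary: By analyzing how load profiles at granularities from 15 minutes to 7 days affect the prediction of eight socio-demographic attributes, the paper shows how temporal resolution, feature extraction, and classifier choice jointly shape the privacy-utility trade-off.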

Comments 30 pages, 10 figures, book chapter

Abstract

Smart meter data can reveal sensitive socio-demographic characteristics of households, raising privacy concerns. While this risk has been demonstrated at fixed granularities, the role of temporal resolution in shaping inference performance remains insufficiently explored. This paper addresses this gap by analyzing how load profiles with granularities from 15 minutes to 7 days affect the predictability of eight socio-demographic attributes in a dataset of 1,589 households over one year. We introduce an evaluation framework where classifiers are trained on year-round data but tested on arbitrary weeks, forcing generalization across seasonal and weekly variations. Our results show three main findings. First, while coarsening granularity reduces predictive accuracy, two plateaus emerge: performance is stable between 15 minutes and 1 hour, and again between 1 and 7 days. This reveals opportunities for data minimization without sacrificing utility. Second, interpretable handcrafted and tsfresh features remain competitive with CNN-based autoencoder embeddings, while XGBoost consistently outperforms alternative classifiers. Third, feature importance analysis highlights differences between static and dynamic attributes: dwelling size can be inferred even from coarse data, whereas swimming pool usage requires fine-grained temporal signals. Overall, our study provides new insights into the privacy-utility trade-off in smart metering, showing how temporal resolution, feature extraction, and classifier choice jointly influence socio-demographic inference.
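
Coarsening the temporal granularity of a load profile, as studied in the abstract above, amounts to aggregating consecutive readings into longer intervals. The sketch below assumes the readings are mean power values, so aggregation is done by averaging (energy readings would be summed instead); the function name and the choice to drop an incomplete trailing block are also assumptions.

```python
def coarsen(profile, factor):
    """Average consecutive groups of `factor` readings; drop an incomplete trailing group."""
    n_blocks = len(profile) // factor
    return [sum(profile[i * factor:(i + 1) * factor]) / factor for i in range(n_blocks)]

# 15-minute readings -> hourly values (factor 4); factor 96 would give daily values.
hourly = coarsen([1, 1, 3, 3, 2, 2, 2, 2, 9], 4)
```

The first two hours average to the same value even though the first hour contains a clear step change, a small-scale picture of how coarser data hides the fine-grained signals that some attributes depend on.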

2606.03355 2026-06-03 cs.LG Updated version

APIC: Amortized Physics-Informed Calibration using Neural Processes

Aishwarya Venkataramanan, Sai Karthikeya Vemuri, Joachim Denzler

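Affiliations: Computer Vision Group, Friedrich Schiller University Jena

AI summary: Proposes the APIC framework, which uses Neural Processes for population-level Bayesian inference and a two-branch latent architecture to separate instance-specific physical parameters from shared structural discrepancies, enabling rapid calibration from sparse observations with uncertainty quantification.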

Comments Accepted at UAI 2026

Abstract

Physics models are inherently imperfect due to misspecified or missing mechanisms, resulting in systematic discrepancies between model predictions and real-world observations. The Kennedy-O'Hagan (KOH) framework addresses this issue through explicit discrepancy modeling. However, its non-amortized, per-instance formulation limits scalability across families of related systems. We introduce Amortized Physics-Informed Calibration (APIC), a population-level extension of KOH that leverages Neural Processes to perform scalable Bayesian inference across realizations. Our framework employs a two-branch latent architecture to disentangle instance-specific physical parameters from shared, state-dependent structural discrepancies. By integrating differentiable physics into an amortized inference backbone, APIC enables rapid calibration of unseen realizations from sparse observations while quantifying uncertainty. Experiments on the damped spring oscillator, the Lotka-Volterra system, and the advection-diffusion PDE with misspecified physics demonstrate improved parameter recovery and consistent identification of the systemic discrepancy structure compared to other calibration approaches.

2606.03347 2026-06-03 cs.LG cs.AI stat.ML Updated version

AugMask: Training Diffusion Models on Incomplete Tabular Data via Stochastic Augmentation and Masking

Jungkyu Kim, Taeyoung Park, Kibok Lee

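Affiliations: KAIST

AI summary: Proposes the AugMask training framework, which adapts standard diffusion models to tabular data with missing values through conditional stochastic augmentation and denoising of observed coordinates only, connects this rule to a Rao-Blackwellized objective that yields a variance-weighted penalty, and outperforms specialized missing-aware baselines.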

Abstract

Score-based diffusion models have emerged as prominent deep generative models; however, their application to tabular data remains challenging because their backbones assume fully specified inputs, whereas real-world tabular data often contain missing values. We propose AugMask, a plug-and-play training framework that adapts missing-unaware backbones to incomplete data by separating conditioning from supervision. AugMask 1) constructs numeric inputs via conditional stochastic augmentation using lightweight auxiliary models, and 2) applies denoising supervision only to observed coordinates. In effect, augmented missing entries serve as uncertain conditioning context rather than training targets. We connect this training rule to a Rao-Blackwellized objective and show that marginalizing missing entries yields a variance-weighted sensitivity penalty, discouraging over-reliance on uncertain completions. Across diverse datasets and missingness regimes, AugMask enables standard diffusion-based tabular generators to outperform specialized missing-aware baselines.
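
The two steps listed in the abstract above can be sketched as follows: missing entries are filled with random draws so the backbone sees a complete input, and the denoising loss is then computed on observed coordinates only. The function names are hypothetical, and the per-column sampler is an unconditional stand-in for the paper's lightweight auxiliary models, which condition on the observed entries.

```python
import random

def augment(row, mask, column_samples, rng):
    """Fill missing entries (mask == 0) with random draws from a per-column sampler."""
    return [x if m else rng.choice(column_samples[j])
            for j, (x, m) in enumerate(zip(row, mask))]

def masked_denoising_loss(predicted_noise, true_noise, mask):
    """Mean squared error over observed coordinates only (mask == 1)."""
    terms = [(p - t) ** 2 for p, t, m in zip(predicted_noise, true_noise, mask) if m]
    return sum(terms) / max(len(terms), 1)

mask = [1, 0, 1]                                   # the middle entry is missing
filled = augment([1.0, None, 3.0], mask, [[1.0], [7.0], [3.0]], random.Random(0))
loss = masked_denoising_loss([0.5, 100.0, 1.0], [0.0, 0.0, 1.0], mask)
```

The large error on the missing coordinate does not contribute to the loss, so the filled-in value acts only as conditioning context, never as a training target.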

2606.03344 2026-06-03 cs.CR cs.LG Updated version

RogueMerge: Robust and Unified Attacks against LLM Model Merging

Jinghuai Zhang, Yetian He, Kunlin Cai, Han Zhao, Fnu Suya, Yuan Tian

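Affiliations: University of California, Los Angeles

AI summary: Proposes the RogueMerge framework, which uses joint optimization, meta-learning-style simulation, and distributionally robust optimization to address three challenges in attacking model merging (autoregressive generation, unknown merging configurations, and generalization across attack prompts), yielding robust and unified attacks.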

Abstract

Model merging composes specialized capabilities into a single LLM by aggregating task vectors sourced from unverified public platforms, exposing a critical supply-chain attack surface: Because any malicious behavior can be encoded into a task vector, and merging grants third-party vectors direct write access to model weights, an attacker-provided task vector can enable or amplify diverse downstream threats. Prior work studies only backdoor attacks against model merging for classifiers using static arithmetic heuristics, which fail to effectively handle diverse attacks on generative LLMs for three reasons. (i) LLMs rely on autoregressive decoding, where the minor parameter drift introduced by merging compounds across tokens and rapidly degrades the attack. (ii) Attackers have no knowledge of the victim's merging configurations, causing a static attack vector optimized in isolation to be easily diluted or destroyed. (iii) Practical threat induction must generalize to attack prompts unseen during optimization, which static vectors cannot adequately encode. We present RogueMerge, the first principled, unified framework that addresses all three challenges. To handle autoregressive generation, we replace static arithmetic with a joint optimization that explicitly enforces attack success after merging. To handle unknown merging settings, we formulate attack injection as a stochastic min-max problem and solve it via meta-learning-style simulation. To generalize across heterogeneous attack prompts, we employ distributionally robust optimization and derive a tractable first-order Taylor approximation at LLM scale, with a provable error bound. Across four threats, six merging algorithms, and over 170 merged LLMs, RogueMerge consistently outperforms existing attacks. It also remains stable across diverse merging settings and resists standard defenses.

2606.03338 2026-06-03 cs.LG cs.CV Updated version

IdEst: Assessing Self-Supervised Learning Representations via Intrinsic Dimension

Julie Mordacq, Vicky Kalogeiton, Steve Oudot

Affiliations * University of California, Berkeley

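AI summary: Proposes IdEst, which uses the minimum spanning tree dimension estimator to assess the intrinsic dimension of self-supervised learning representations; the estimate correlates strongly with downstream linear-probing performance and enables efficient hyperparameter selection.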

Comments ICML 2026

详情
AI中文摘要

自监督学习(SSL)已成为从无标签数据中学习有意义表示的有效范式。然而,评估这些表示的标准协议——线性探测——计算成本高、对超参数敏感,并且对表示空间的几何结构提供的洞察有限。在这项工作中,受神经网络泛化与内在维度(ID)之间联系的启发,我们提出了IdEst,一种通过最小生成树维度估计器($\mathrm{dim}_\mathrm{MST}$)估计SSL表示ID的方法。在多种数据集、架构和SSL预训练目标上,我们表明IdEst与下游线性探测性能强相关。此外,我们证明IdEst能够实现高效的超参数选择,与监督替代方案相比显著降低计算成本。我们的结果突出了内在维度作为评估SSL表示的原则性几何代理,补充了标准的监督探测协议。

英文摘要

Self-supervised learning (SSL) has emerged as a powerful paradigm for learning meaningful representations from unlabeled data. However, the standard protocol for evaluating these representations, linear probing, is computationally expensive, sensitive to hyperparameters, and provides limited insight into the geometric structure of the representation space. In this work, motivated by connections between neural network generalization and intrinsic dimension (ID), we propose IdEst, a method for estimating the ID of SSL representations via the Minimum Spanning Tree dimension estimator ($\mathrm{dim}_\mathrm{MST}$). Across diverse datasets, architectures, and SSL pretraining objectives, we show that IdEst strongly correlates with downstream linear-probe performance. Furthermore, we demonstrate that IdEst enables efficient hyperparameter selection, significantly reducing the computational cost compared to supervised alternatives. Our results highlight intrinsic dimensionality as a principled geometric proxy for assessing SSL representations, complementing standard supervised probing protocols.
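As a rough illustration of how a minimum spanning tree can yield a dimension estimate, the sketch below relies on a classical scaling argument: for n points sampled from a d-dimensional set, the total Euclidean MST length grows roughly like n^((d-1)/d), so the slope a of log-length against log-n gives d = 1 / (1 - a). This is an assumption about the general family of MST-based estimators, not a reproduction of the paper's $\mathrm{dim}_\mathrm{MST}$ procedure; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mst_length(points: Sequence[Sequence[float]]) -> float:
    """Total Euclidean edge length of the minimum spanning tree (Prim, O(n^2))."""
    n = len(points)
    if n < 2:
        return 0.0
    best = [math.inf] * n      # cheapest known edge from the tree to each point
    in_tree = [False] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Add the point that is currently closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def dimension_from_growth(sizes_and_lengths: List[Tuple[int, float]]) -> float:
    """Fit log L = a * log n + b by least squares and return d = 1 / (1 - a)."""
    xs = [math.log(n) for n, _ in sizes_and_lengths]
    ys = [math.log(length) for _, length in sizes_and_lengths]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return 1.0 / (1.0 - slope)


# The four corners of a unit square are spanned by three unit-length edges.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(mst_length(square))  # 3.0
```

In use, one would subsample the representation at several sizes n, call `mst_length` on each subsample, and pass the resulting (n, length) pairs to `dimension_from_growth`; no labels or probe training are involved, which is the source of the cost advantage the abstract describes.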

2606.03334 2026-06-03 cs.CL cs.LG 版本更新

Lingo_Research_Group at SemEval-2026 Task 9: Evaluating Prompt Variants for Polarization Detection

Lingo_Research_Group 在 SemEval-2026 任务 9:评估用于极化检测的提示变体

Pritam Kadasi, Anuj Tiwari, Mayank Singh

Affiliations * Lingo Research Group; Indian Institute of Technology Gandhinagar; Noida Institute of Engineering and Technology; ML Collective

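AI summary: For multilingual polarization detection in SemEval-2026 Task 9, this paper designs 12 prompt variants and evaluates them with the aya-101 and Gemma3-27B models on three subtasks; prompting is effective for coarse-grained polarization detection but becomes increasingly difficult for fine-grained multi-label classification.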

Comments Accepted at the SemEval Workshop, ACL 2026

详情
AI中文摘要

本文提交的成果针对 SemEval-2026 任务 9:多语言文本分类挑战——极化检测,涵盖了所有三个子任务:(1) 二元极化检测,(2) 极化类型分类,以及 (3) 极化表现识别。我们采用系统性的短设计提示研究方法,考虑了 12 种在术语清晰度、定义详细程度、推理指导以及上下文示例使用上不同的设计提示。实验使用 aya-101 和 Gemma3-27B 进行,后者因性能考虑在开发阶段结束时被选用于提交。我们的系统在官方测试集上(22 种语言的平均值)在子任务 1 上的平均宏 F1 得分为 0.762,子任务 2 为 0.587,子任务 3 为 0.444,平均准确率分别为 0.819、0.678 和 0.498。通过跨任务和跨语言分析,我们证明基于提示的方法可以有效检测粗粒度极化,但在细粒度多标签社会语言学分类方面遇到越来越多的困难。

英文摘要

This paper presents our submission to SemEval-2026 Task 9: Multilingual Text Classification Challenge - Polarization Detection, covering all three subtasks: (1) binary polarization detection, (2) polarization type classification, and (3) polarization manifestation identification. We systematically study short, hand-designed prompts, comparing twelve variants that differ in terminology clarity, level of detail in the definition, reasoning guidance, and use of in-context examples. Experiments are conducted with aya-101 and Gemma3-27B; the latter was chosen for the final submission at the end of development based on its performance. On the official test set, averaged over 22 languages, our system achieves a macro F1-score of 0.762 on Subtask 1, 0.587 on Subtask 2, and 0.444 on Subtask 3, with accuracies of 0.819, 0.678, and 0.498, respectively. Through cross-task and cross-lingual analysis, we show that prompt-based approaches detect coarse-grained polarization effectively but face increasing difficulty on fine-grained, multi-label sociolinguistic classification.

2606.03332 2026-06-03 cs.LG 版本更新

Tailoring Strictly Proper Scoring Rules for Downstream Tasks: An Application to Causal Inference

为下游任务定制严格适当的评分规则:因果推断中的应用

Roman Plaud, Alexandre Perez-Lebel, Antoine Saillenfest, Thomas Bonald, Marine Le Morvan, Gaël Varoquaux, Matthieu Labeau

Affiliations * Inria; CNRS; Université de Paris; Université de Paris-Saclay

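AI summary: Proposes a framework that derives task-specific strictly proper scoring rules by matching the local curvature of the downstream error metric, and applies it to average treatment effect estimation, yielding a closed-form loss and its corresponding canonical probability mapping; experiments show it outperforms standard likelihood-based and covariate-balancing approaches.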

Comments Accepted to ICML 2026

详情
AI中文摘要

概率模型通常使用任务无关的目标函数(如对数损失)进行训练,这可能导致下游估计出现显著误差。这种脱节在因果推断的逆概率加权(IPW)中尤为关键,其中倾向得分在接近 $0$ 和 $1$ 处的误差常常导致高偏差和高方差。我们提出一个原则性框架,通过匹配下游误差指标的局部曲率来推导任务特定的严格适当评分规则。我们将此应用于平均处理效应(ATE)估计,导出了一个闭式损失函数及其对应的规范概率映射,该映射可以轻松集成到任何模型(如神经网络或梯度提升算法)中。在因果推断基准上的广泛评估表明,我们定制的目标函数始终优于标准的似然和协变量平衡方法。

英文摘要

Probabilistic models are typically trained using task-agnostic objectives like log-loss, which can lead to significant errors in downstream estimation. This disconnect is especially critical in Inverse Probability Weighting (IPW) for causal inference, where propensity score errors near $0$ and $1$ often lead to high bias and variance. We propose a principled framework for deriving task-specific strictly proper scoring rules by matching the local curvature of the downstream error metric. We apply this to Average Treatment Effect (ATE) estimation, deriving a closed-form loss and its corresponding canonical probability mapping that can be readily integrated with any model, such as a neural network or a gradient boosting algorithm. Extensive evaluations on causal inference benchmarks demonstrate that our tailored objective consistently outperforms standard likelihood-based and covariate-balancing approaches.
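The sensitivity to propensity scores near $0$ and $1$ can be seen in a small worked example. The sketch below implements the textbook IPW estimator of the ATE, the mean of T*Y/e - (1-T)*Y/(1-e); it does not implement the paper's tailored loss, whose closed form is not given in the abstract. The toy data and the function name `ipw_ate` are made up for illustration.

```python
from typing import Sequence


def ipw_ate(
    treated: Sequence[int],
    outcome: Sequence[float],
    propensity: Sequence[float],
) -> float:
    """Textbook IPW estimate of the average treatment effect."""
    total = 0.0
    for t, y, e in zip(treated, outcome, propensity):
        # Treated units are weighted by 1/e, control units by 1/(1-e).
        total += t * y / e - (1 - t) * y / (1.0 - e)
    return total / len(outcome)


treated = [1, 0, 1, 0]
outcome = [2.0, 1.0, 2.0, 1.0]   # treated outcome 2, control outcome 1

# Correct propensities of 0.5 recover the true effect of 1.0.
balanced = ipw_ate(treated, outcome, [0.5, 0.5, 0.5, 0.5])

# One propensity misestimated as 0.01 gives that unit a weight of 100.
skewed = ipw_ate(treated, outcome, [0.5, 0.5, 0.01, 0.5])

print(balanced, skewed)
```

With the balanced propensities the estimate is 1.0; with a single propensity pushed to 0.01 it jumps to about 50, even though log-loss would penalize that one error only moderately. This mismatch between the training objective and the downstream error is what motivates a loss whose curvature follows the estimator's sensitivity.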

2606.03330 2026-06-03 cs.LG cs.AI cs.CR 版本更新

FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences

FLIPS:通过伪随机序列为LLMs进行实例指纹识别

Gurvan Richardeau, Gohar Dashyan, Erwan Le Merrer, Gilles Tredan

Affiliations * Inria

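AI summary: Proposes FLIPS, which exploits biases in generated binary random sequences to reach 96% (closed-set) and 90% (open-set) identification accuracy across 237 model instances; it addresses the inability of existing fingerprinting techniques to distinguish configurations of the same LLM and offers an instance-level fingerprinting paradigm for AI regulation.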

Comments 20 pages, 20 figures, 3 tables. 43rd International Conference on Machine Learning (ICML 2026)

详情
AI中文摘要

文献揭示,大型语言模型(LLM)的行为不仅受其原始权重影响,还受其实例级参数(如指令提示、采样配置或量化)影响。在一种配置下生成安全输出的模型,在另一种配置下可能产生有毒内容。然而,当前的LLM识别技术(如指纹识别)侧重于知识产权保护,其设计倾向于对这些实例级参数的变化具有鲁棒性。这对AI监管构成了关键挑战,因为合规评估针对的是实际部署的行为,而非模型来源。在本文中,我们引入了实例级指纹识别,这是一种面向监管的范式,用于区分同一LLM的不同配置。我们的方法FLIPS利用生成的二进制随机序列中的偏差,在237个模型实例上达到96%(闭集)和90%(开集,其中一些目标未知)的识别准确率,而改编的LLMmap基线仅为35%。这表明实例级指纹识别对于监管既必要又实际可行。代码见 https://github.com/GurvanR/FLIPS-LLM-Instance-Fingerprinting。

英文摘要

Literature reveals that a Large Language Model's (LLM) behavior is conditioned not only by its original weights but also by its instance-level parameters, such as instructional prompt, sampling configuration, or quantization. A model that generates safe outputs under one configuration may produce toxic content under another. However, current LLM identification techniques (such as fingerprinting) focus on intellectual property protection, and their design favors robustness to changes in these instance-level parameters. This poses a critical challenge for AI regulation, in which compliance assessments target actual deployed behaviors, not model provenance. In this paper, we introduce instance-level fingerprinting, a regulator-oriented paradigm that distinguishes configurations of the same LLM. Our method, FLIPS, exploits biases in generated binary random sequences to reach 96% (closed-set) and 90% (open-set, where some targets are unknown) identification accuracy across 237 model instances, versus 35% for the adapted LLMmap baseline. This shows that instance-level fingerprinting is both necessary for regulation and practically feasible. Code available at https://github.com/GurvanR/FLIPS-LLM-Instance-Fingerprinting.

2606.03322 2026-06-03 cs.LG cs.AI 版本更新

Multi-Modal Graph Neural Network with Transformer-Guided Adaptive Diffusion for Preclinical Alzheimer Classification

多模态图神经网络与Transformer引导的自适应扩散用于临床前阿尔茨海默病分类

Jaeyoon Sim, Minjae Lee, Guorong Wu, Won Hwa Kim

Affiliations * Pohang University of Science and Technology; University of North Carolina at Chapel Hill

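AI summary: Proposes a graph neural network framework that combines diffusion kernels with multi-head attention, using a Transformer to guide an adaptive diffusion process; it fuses multi-modal features effectively, improves preclinical Alzheimer's disease classification, and identifies key brain regions.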

Comments 10 pages, Accepted to MICCAI 2024

详情
AI中文摘要

大脑的图形表示通过感兴趣区域(ROI)之间的关系为诊断和预测神经退行性疾病提供了关键见解。尽管近年来出现了各种图神经网络(GNN)来有效捕获关系信息,但在解释大脑网络方面仍存在固有局限性。具体而言,卷积方法无法有效聚合远邻域信息,而基于注意力的方法在捕获节点中心信息方面存在缺陷,特别是在保留关键节点的关键特征方面。这些不足揭示了从不同模态的不同特征中识别疾病特异性变化的挑战。为此,我们提出一个集成框架,通过下游Transformer引导每个节点的扩散过程,其中图的短程和长程属性分别通过扩散核和多头注意力进行聚合。我们通过使用多种模态改进临床前阿尔茨海默病(AD)分类的性能,证明了我们模型的优越性。此外,我们的模型能够熟练识别与AD临床前阶段密切相关的关键ROI,为疾病的早期诊断和预防提供了重要潜力。

英文摘要

The graphical representation of the brain offers critical insights into diagnosing and prognosing neurodegenerative disease via relationships between regions of interest (ROIs). Despite the recent emergence of various Graph Neural Networks (GNNs) that effectively capture relational information, inherent limitations remain in interpreting brain networks. Specifically, convolutional approaches ineffectively aggregate information from distant neighborhoods, while attention-based methods exhibit deficiencies in capturing node-centric information, particularly in retaining critical characteristics from pivotal nodes. These shortcomings make it challenging to identify disease-specific variation from diverse features across different modalities. In this regard, we propose an integrated framework in which a downstream transformer guides the diffusion process at each node, and the short- and long-range properties of graphs are aggregated via a diffusion kernel and multi-head attention, respectively. We demonstrate the superiority of our model by improving the performance of preclinical Alzheimer's disease (AD) classification with various modalities. Our model also adeptly identifies key ROIs that are closely associated with the preclinical stages of AD, showing significant potential for early diagnosis and prevention of the disease.

2606.03321 2026-06-03 cs.LG cs.MA cs.SY eess.SY 版本更新

Validation-Gated Multi-Agent Governance for Online Adaptation of Thermal-Hydraulic Surrogate Models under Operating-Regime Shift

验证门控多智能体治理:运行工况迁移下热工水力代理模型的在线自适应

Doyeong Lim, Seungyoon Lee, In Cheol Bang

Affiliations * Department of Nuclear Engineering, Ulsan National Institute of Science and Technology (UNIST)

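AI summary: To address the degradation of offline-trained models under operating-regime shift, proposes a validation-gated multi-agent governance framework in which role-separated agents cooperate under deterministic gates to achieve auditable online adaptation, reducing mean absolute error by 19% on experimental thermal-hydraulic data.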

详情
AI中文摘要

人工智能代理模型可以支持每秒的热工水力预测,但离线选定并冻结的模型一旦部署到预训练包络之外,可能会变得条件锁定。本研究针对实验热工水力回路数据开发了一个受保护的持续自适应框架,其中角色分离的智能体——监控器、诊断器、自适应器、安全审计器和编排器——诊断误差特征、优先考虑候选模型族并审查升级,而确定性的冠军-挑战者门控和后台影子学习保留对模型替换的最终权限。通过分块三折交叉验证筛选了七个代理模型族,并选择时间傅里叶神经算子作为初始冠军,用于两个保留瞬态的60秒历史到10秒轨迹预测,每种自适应模式使用三个种子。静态部署给出通道平均MAE为7.06,警告超标率为56.8%;基于规则的自适应将MAE降至6.54,而仅使用影子刷新则接近静态。MA-Full模式中,角色分离的多智能体委员会审查每个评估流步骤,实现了最低的平均误差5.72和35.8%的超标率,相比静态改进19.0%。与静态的配对自助区间排除零,但自适应模式之间的区间重叠,且六个配对单元限制了广泛的统计声明。从神经算子到Transformer和图神经网络的验证升级表明,记录的门控自适应可以支持可审计的代理模型演化,同时确定性门控保留部署权限。

英文摘要

Artificial-intelligence surrogates can support second-by-second thermal-hydraulic forecasting, but models selected and frozen offline may become condition-locked once deployed outside their pretraining envelope. This study develops a guarded continual-adaptation framework for experimental thermal-hydraulic loop data in which role-separated agents - Monitor, Diagnosis, Adaptation, Safety-Auditor, and Orchestrator - diagnose error signatures, prioritize candidate model families, and review promotions, while deterministic champion-challenger gates and background shadow learning retain final authority over model replacement. Seven surrogate families were screened by blocked three-fold cross-validation, and a temporal Fourier neural operator was selected as the initial champion for 60-s-history-to-10-s-trajectory forecasting on two held-out transients, with three seeds per adaptive mode. Static deployment gave a channel-averaged MAE of 7.06 and a 56.8% warning-exceedance ratio; rule-based adaptation reduced MAE to 6.54, whereas shadow refresh alone remained close to Static. The MA-Full mode, in which the role-separated multi-agent council reviews every evaluated stream step, achieved the lowest mean error, 5.72, and 35.8% exceedance, corresponding to a 19.0% improvement over Static. Paired bootstrap intervals against Static excluded zero, although intervals among adaptive modes overlapped and the six paired units limit broad statistical claims. Validated promotions from the neural operator to Transformer and graph neural network indicate that logged, gate-controlled adaptation can support auditable surrogate evolution while deterministic gates retain deployment authority.
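The reported improvement and the idea of a deterministic champion-challenger gate can be sketched in a few lines. The MAE values come from the abstract; the gate rule and its threshold `min_gain_pct` are hypothetical, since the abstract does not state the actual promotion criterion.

```python
def relative_improvement(baseline_mae: float, new_mae: float) -> float:
    """Percentage reduction in MAE relative to the baseline."""
    return 100.0 * (baseline_mae - new_mae) / baseline_mae


def gate_allows_promotion(
    champion_mae: float, challenger_mae: float, min_gain_pct: float
) -> bool:
    """Assumed gate: promote only if the challenger beats the champion by a margin.

    The agents may propose a challenger, but this fixed rule decides.
    """
    return relative_improvement(champion_mae, challenger_mae) >= min_gain_pct


# MAE values reported in the abstract.
static_mae, rule_based_mae, ma_full_mae = 7.06, 6.54, 5.72

print(round(relative_improvement(static_mae, ma_full_mae), 1))     # 19.0
print(round(relative_improvement(static_mae, rule_based_mae), 1))  # 7.4
```

Keeping the gate as a pure function of logged validation errors is what makes each promotion auditable: the same inputs always produce the same decision, regardless of what the agents recommended.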

2606.03315 2026-06-03 cs.LG 版本更新

A Graph Foundation Model with Spectral Parsing and Prototype-Guided Spatial Propagation

具有频谱解析和原型引导空间传播的图基础模型

Ankang Yang, Jitao Zhao, Dongxiao He, Liang Yang, Di Jin, Weixiong Zhang

Affiliations * School of Computer Science and Technology; Tianjin University; School of Artificial Intelligence; Hebei University of Technology; Department of Health Technology and Informatics; Department of Data Science and Artificial Intelligence; The Hong Kong Polytechnic University

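AI summary: Proposes SPG, which decomposes node features into multiple spectral responses with learnable Chebyshev filters and uses a Gromov-Wasserstein prototype geometry to distill transferable pairwise relations, enabling cross-domain graph generalization.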

详情
AI中文摘要

图基础模型旨在从多样化的图中学习可迁移知识,以泛化到未见过的图和任务。与文本和图像不同,图缺乏共享词汇或规则的空间网格,使得跨图迁移具有挑战性。这一挑战既来自特征差异,更关键的是来自多样化的图结构。现有的GFM主要通过统一特征空间或引入结构标记和词汇来提高可迁移性。然而,现有的拓扑感知设计仍有局限性。结构标记通常是离散的,而结构词汇通常依赖于预定义的子结构(如树和环),其有限覆盖可能遗漏跨图中更丰富的关系模式。此外,图信号包含高频局部模式和更平滑的低频模式,这需要不同的传播行为。这些成分在原始图信号中通常是纠缠的,而这一频谱视角在现有GFM中很少被探索。为了解决这些挑战,我们提出了SPG,一种具有频谱解析和原型引导空间传播的图基础模型。SPG应用可学习的切比雪夫滤波器将节点特征分解为多个频谱响应,减少频率特定图信号与传播行为之间的不匹配。然后,它构建一个Gromov-Wasserstein原型几何,将超越预定义子结构的可迁移成对关系蒸馏到共享结构空间中。学习到的原型几何进一步被投影回作为原型引导的传播算子。实验表明在跨域泛化中具有一致的改进。

英文摘要

Graph foundation models aim to learn transferable knowledge from diverse graphs for generalization to unseen graphs and tasks. Unlike text and images, graphs lack a shared vocabulary or regular spatial grid, making cross-graph transfer challenging. This challenge comes from both feature discrepancies and, more critically, diverse graph structures. Existing GFMs mainly improve transferability by unifying feature spaces or incorporating structural tokens and vocabularies. However, existing topology-aware designs still have limitations. Structural tokens are usually discrete, while structural vocabularies often rely on predefined substructures such as trees and cycles, whose limited coverage may miss richer relational patterns across graphs. Moreover, graph signals contain both high-frequency local patterns and smoother low-frequency patterns, which require different propagation behaviors. These components are often entangled in raw graph signals, while this spectral perspective is rarely explored in existing GFMs. To address these challenges, we propose SPG, a graph foundation model with spectral parsing and prototype-guided spatial propagation. SPG applies learnable Chebyshev filters to decompose node features into multiple spectral responses, reducing the mismatch between frequency-specific graph signals and propagation behaviors. It then constructs a Gromov-Wasserstein prototype geometry to distill transferable pairwise relations beyond predefined substructures into a shared structural space. The learned prototype geometry is further projected back as a prototype-guided propagation operator. Experiments demonstrate consistent improvements in cross-domain generalization.
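For readers unfamiliar with Chebyshev graph filters, the sketch below applies one such filter to a node signal using the standard three-term recurrence T_0 = I, T_1 = L, T_k = 2 L T_{k-1} - T_{k-2} on a rescaled Laplacian. It is a generic illustration under that assumption, not SPG's implementation; in a learnable setting the coefficients `theta` would be trained, and several coefficient vectors would give several spectral responses. All names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def chebyshev_filter(l_scaled: Matrix, x: Vector, theta: Vector) -> Vector:
    """Return sum_k theta[k] * T_k(l_scaled) x via the three-term recurrence.

    l_scaled is the Laplacian rescaled so its eigenvalues lie in [-1, 1].
    """
    t_prev = list(x)                                  # T_0 x = x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_curr = matvec(l_scaled, x)                      # T_1 x = L x
    out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        lt = matvec(l_scaled, t_curr)
        # T_k x = 2 L T_{k-1} x - T_{k-2} x
        t_next = [2.0 * a - b for a, b in zip(lt, t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Two nodes joined by one edge: normalized Laplacian [[1, -1], [-1, 1]],
# largest eigenvalue 2, so the rescaled Laplacian is L - I.
l_scaled = [[0.0, -1.0], [-1.0, 0.0]]
signal = [1.0, 0.0]

print(chebyshev_filter(l_scaled, signal, [0.5, 0.25, 0.25]))  # [0.75, -0.25]
```

Each order k mixes information from nodes up to k hops away, so coefficient vectors that emphasize different orders respond to smoother or more oscillatory components of the signal.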

2606.03310 2026-06-03 cs.LG cs.AI 版本更新

Learning Multi-Scale Hypergraph for High-Order Brain Connectivity Analysis

学习多尺度超图用于高阶脑连接分析

Jaeyoon Sim, Soojin Hwang, Seunghun Baek, Guorong Wu, Won Hwa Kim

Affiliations * KAIST

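AI summary: Proposes MuHL, an adaptive multi-scale hyperedge learning framework that constructs hierarchical node features and dynamically learns high-order interactions, improving neurodegenerative disease classification on multiple brain network benchmarks and identifying key brain regions.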

Comments 24 pages, Accepted to ICML 2026

详情
AI中文摘要

理解脑区之间的复杂交互对于早期神经退行性疾病(如阿尔茨海默病和帕金森病)的分类至关重要。虽然基于图的模型广泛用于分析脑网络,但大多数现有方法主要关注直接连接节点之间的成对交互,限制了其捕捉跨多个区域的高阶依赖关系的能力。尽管已有基于超图的方法来建模高阶关系,但许多方法依赖于预定义的超边或将学习限制在超边权重上,降低了灵活性并限制了其捕捉多分辨率结构模式的能力。为此,我们引入了一个自适应多尺度超边学习框架,即MuHL,该框架构建层次节点特征,并通过在多分辨率图信号上连续构建超边来动态学习高阶交互。在多个脑网络基准上的大量实验表明,MuHL在不同阶段持续提高了疾病分类性能,并从学习到的超边中识别出与疾病进展相关的关键感兴趣区域及其群体交互,突显了其作为神经退行性疾病脑网络分析强大工具的潜力。

英文摘要

Understanding complex interactions between brain regions is critical for early neurodegenerative disease classification such as Alzheimer's Disease (AD) and Parkinson's Disease (PD). While graph-based models are widely used to analyze brain networks, most existing approaches primarily focus on pairwise interactions between directly connected nodes, limiting their ability to capture higher-order dependencies across multiple regions. Although hypergraph-based methods have been proposed to model higher-order relations, many rely on predefined hyperedges or restrict learning to hyperedge weights, reducing flexibility and limiting their capacity to capture multi-resolution structural patterns. In this regard, we introduce an adaptive multi-scale hyperedge learning framework, i.e., MuHL, which constructs hierarchical node features and dynamically learns high-order interactions through continuous hyperedge construction over multi-resolution graph signals. Extensive experiments on multiple brain network benchmarks demonstrate that MuHL consistently improves disease classification performance across different stages, and further identifies key regions of interest (ROIs) and their group-wise interactions from the learned hyperedges that are associated with disease progression, highlighting its potential as a powerful tool for brain network analysis in neurodegenerative disorders.

2606.03304 2026-06-03 cs.CL cs.LG 版本更新

From Script to Semantics: Prompting Strategies for African NLI

从脚本到语义:非洲自然语言推理的提示策略

Anuj Tiwari, Terry Oko-odion, Hannah Nwokocha

Affiliations * arXiv

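AI summary: Systematically evaluates five prompting strategies on natural language inference in Swahili, Yoruba, and Hausa, finding that contrastive prompting is the most reliable strategy and significantly improves model performance.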

Comments Accepted at the RAIL Workshop, LREC 2026

详情
AI中文摘要

大型语言模型(LLMs)在多语言环境中的评估日益增多,但它们在低资源非洲语言中的推理行为仍未得到充分探索,尤其是在无微调的纯提示设置下。我们使用AfriXNLI基准,对斯瓦希里语、约鲁巴语和豪萨语的自然语言推理(NLI)提示策略进行了系统研究。我们评估了五种提示策略:基线(零样本)、脚本感知、语言特定、对比和原生标签自翻译(NL-STP),使用了两个中等规模的开源模型(Llama3.2-3B和Gemma3-4B)。为隔离提示设计的影响,我们的研究中排除了少样本示例和思维链推理的影响。我们发现不同策略在类别性能上存在显著差异,某些配置中出现高度中性类崩溃和高预测偏斜。对比提示被证明是最可靠的策略,在不同语言和模型上持续改进,并具有更好的类别行为平衡和整体准确率提升。值得注意的是,精心构建的提示足以击败提供少样本提示和思维链提示的更强大基线。我们发现提示表述对于低资源语言的多语言NLI至关重要,并且语言感知的决策结构可以有效地增强资源受限环境下的鲁棒性。

英文摘要

Large language models (LLMs) are increasingly evaluated in multilingual settings, yet their inference behavior in low-resource African languages remains underexplored, especially under pure prompting without fine-tuning. We present a systematic study of prompting strategies for Natural Language Inference (NLI) in Swahili, Yoruba, and Hausa using the AfriXNLI benchmark. We evaluate five prompting strategies, Baseline (zero-shot), Script-Aware, Language Specific, Contrastive, and Native-Label Self-Translation (NL-STP), across two mid-sized open-weight models (Llama3.2-3B and Gemma3-4B). To isolate the effect of prompt design, we exclude few-shot examples and Chain-of-Thought reasoning from our study. We find significant differences in class-wise performance across strategies, with pronounced collapse of the neutral class and high prediction skew in some configurations. Contrastive prompting proves to be the most reliable strategy, improving consistently across languages and models, with better-balanced class behavior and gains in overall accuracy. Notably, well-constructed prompts are sufficient to beat stronger baselines that use few-shot and Chain-of-Thought prompting. We find that prompt formulation is essential for multilingual NLI in low-resource languages, and that language-aware decision structuring can meaningfully enhance robustness in resource-constrained settings.

2606.03290 2026-06-03 cs.LG cs.AI 版本更新

Message Tuning Outshines Graph Prompt Tuning: A Prismatic Space Perspective

消息调优优于图提示调优:棱镜空间视角

Yancheng Chen, Dun Ma, Shuai Zhang, Yang Liu, Xixun Lin, Xiangyu Zhao, Wenguo Yang, Wei Chen, Chuan Zhou

Affiliations * University of Science and Technology of China

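AI summary: Proposes Prismatic Space Theory (PS-Theory) to quantify the upper bound on the adaptation capacity of graph prompt tuning, and introduces Message Tuning for GFMs (MTG), which exceeds that bound by injecting learnable message prototypes; experiments confirm its advantage.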

Comments Accepted by ICML 2026

详情
AI中文摘要

基于预训练与自适应范式的图基础模型(GFMs)已成为图学习的研究热点。对于基于GNN的GFMs,图提示调优已成为下游任务的主流自适应方法。尽管近期方法解释了图提示调优为何有效,但如何严格衡量其适应能力仍是一个开放问题。解决该问题对于理解图提示调优的能力极限以及开发更强大的自适应方法至关重要。本文提出棱镜空间理论(PS-Theory),一种新颖的数学框架,用于量化自适应方法的能力,同时重点建立图提示调优适应能力上限。基于所提出的PS-Theory,我们进一步引入GFMs的消息调优(MTG),一种轻量级方法,在GNN骨干网络的每一层注入少量可学习消息原型,以自适应地引导消息融合,无需更新预训练权重。通过我们的PS-Theory,我们证明MTG的适应能力可以超过图提示调优的理论上限。大量实验表明,MTG在多个基准数据集上始终优于图提示基线,为我们的理论发现提供了强有力的实证支持。

英文摘要

Graph Foundation Models (GFMs), built upon the Pre-training and Adaptation paradigm, have emerged as a research hotspot in graph learning. For GNN-based GFMs, graph prompt tuning has become the prevailing adaptation method for downstream tasks. Although recent methods explain why graph prompt tuning works, how to rigorously measure its adaptation capacity remains an open problem. Addressing this problem is critical for understanding the capability limits of graph prompt tuning and for developing more powerful adaptation methods. In this paper, we propose Prismatic Space Theory (PS-Theory), a novel mathematical framework to quantify the capacity of adaptation methods, while focusing on establishing the upper bound for the adaptation capacity of graph prompt tuning. Building upon the proposed PS-Theory, we further introduce Message Tuning for GFMs (MTG), a lightweight approach that injects a small set of learnable message prototypes into each layer of the GNN backbone to adaptively guide message fusion without updating pre-trained weights. Through our PS-Theory, we prove that the adaptation capacity of MTG can exceed the theoretical upper bound of graph prompt tuning. Extensive experiments demonstrate that MTG consistently outperforms graph prompt baselines across diverse benchmark datasets, providing strong empirical support for our theoretical findings.

2606.03279 2026-06-03 cs.LG 版本更新

A Geometric Lens on Physics-Aligned Data Compression

物理对齐数据压缩的几何视角

Aleix Segui, Wesley Armour

Affiliations * GitHub

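AI summary: Uses a local geometric theory to explain the rate-distortion tradeoff induced by physics-informed losses in scientific data compression, and proposes an alignment diagnostic based on dominant eigenspace overlap.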

Comments Proceedings of the 43rd International Conference on Machine Learning, Seoul, South Korea. PMLR 306, 2026

详情
Journal ref
Proceedings of the 43rd International Conference on Machine Learning, Seoul, South Korea. PMLR 306, 2026
AI中文摘要

在人工智能科学中,物理信息损失函数越来越多地被用于训练科学数据的学习压缩器,但其率失真影响仍知之甚少。在固定比特率下,这些目标通常能改善目标物理可观测量的保存,但会降低标准重建保真度。我们发展了一个局部几何理论,表明这种权衡由熵模型、物理可观测量和失真度量引起的潜在空间敏感性的相互作用所支配。在每个操作点,这些因素诱导出压缩噪声应被抑制的优先方向,从而产生各向异性的误差分配机制。当这些方向未对齐时,在固定速率下改善可观测量必然恶化标准失真,这确立了同时保存的基本限制。我们通过局部切空间率失真定律形式化这一点,并引入基于主特征空间重叠的实用对齐诊断方法。跨科学领域的实验测试了该理论,并验证了对齐诊断与观测到的数据和物理空间权衡相关。

英文摘要

In AI for Science, physics-informed losses are increasingly used to train learned compressors for scientific data, but their rate-distortion implications remain poorly understood. At fixed bitrate, these objectives often improve preservation of a target physical observable while degrading standard reconstruction fidelity. We develop a local geometric theory showing that this tradeoff is governed by the interaction of latent-space sensitivities induced by the entropy model, the physical observable, and the distortion metric. At each operating point, these induce preferred directions along which compression noise should be suppressed, yielding an anisotropic error-allocation mechanism. When these directions are misaligned, improving the observable at fixed rate necessarily worsens standard distortion, establishing a fundamental limit on simultaneous preservation. We formalise this through a local tangent-space rate-distortion law and introduce a practical alignment diagnostic based on dominant eigenspace overlap. Experiments across scientific domains test the theory and validate that the alignment diagnostic correlates with observed data- and physics-space trade-offs.

2606.03270 2026-06-03 cs.LG cs.AI 版本更新

Are Common Substructures Transferable? Riemannian Graph Foundation Model with Neural Vector Bundles

常见子结构可迁移吗?基于神经向量丛的黎曼图基础模型

Li Sun, Zhenhao Huang, Yiding Wang, Qin Chen, Pietro Lio, Philip S. Yu

Affiliations * University of Science and Technology of China

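AI summary: To address the lack of theory on graph structural transferability, proposes GAUGE, a neural vector bundle framework grounded in Riemannian geometry that learns transferable substructure representations through intrinsic geometry, validated on zero-shot link prediction and graph isomorphism tasks.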

Comments Accepted by ICML 2026

详情
AI中文摘要

基础模型通过预训练-适应范式引发了革命,最近的研究将这一成功扩展到图。与其他模态不同,图包含丰富的结构模式,但其结构迁移性仍知之甚少。先前的研究考虑离散领域中的常见子结构,我们被一个基本问题所驱动:常见子结构可迁移吗?其背后的理论很大程度上未被探索。在这项工作中,我们转向通过功能行为的视角学习可迁移结构。理论上,我们将可迁移子结构与表示空间的内在几何联系起来。然而,表征这种内在几何很少被触及。基于黎曼几何,我们开发了一个称为神经向量丛的图内在几何学习框架,该框架能够用局部坐标解析内在几何。在此基础上,我们设计了GAUGE,一个可预训练的神经架构,它构建向量丛,展平几何兼容的局部坐标,以及一个新的狄利克雷损失,该损失也衡量迁移努力。我们通过实验验证了其在具有挑战性的任务(包括零样本链接预测和图同构)中的优越表现力。

英文摘要

Foundation models have sparked a revolution via a pretraining-adaptation paradigm, with recent efforts extending this success to graphs. Unlike other modalities, graphs contain rich structural patterns, yet their structural transferability remains poorly understood. Prior studies consider common substructures in the discrete realm, and we are motivated by a fundamental question: Are common substructures transferable? The underlying theory is largely underexplored. In this work, we shift toward learning transferable structures through the lens of functional behavior. Theoretically, we connect transferable substructures to intrinsic geometry of the representation space. However, characterizing such intrinsic geometry has rarely been touched. Grounded in Riemannian geometry, we develop a graph intrinsic geometry learning framework called Neural Vector Bundle, which enables parsing intrinsic geometry with local coordinates. Building on this, we design GAUGE, a pretrainable neural architecture that constructs the vector bundle, flattening geometrically compatible local coordinates, and a new Dirichlet loss, which also measures the transfer effort. We empirically validate its superior expressiveness in challenging tasks including zero-shot link prediction and graph isomorphism.

2606.03262 2026-06-03 cs.LG cs.NA math.NA 版本更新

Let There Be Light: Reflection, Refraction and Scattering for Neural Operators

Let There Be Light: 面向神经算子的反射、折射与散射

Keke Wu, Yixuan Zhang, Jingrun Chen

Affiliations * Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou; School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei; Suzhou Big Data & AI Research and Engineering Center, Suzhou; School of Mathematical Science, Peking University, Beijing; School of Mathematical Sciences, University of Science and Technology of China, Hefei

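AI summary: Proposes LiNO, a neural operator inspired by light transport that decomposes latent evolution into reflection, refraction, and scattering, structurally separating local feature modulation from global spatial communication, and develops an efficient scattering variant that reduces spatial complexity from quadratic to linear.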

详情
AI中文摘要

神经算子学习无限维函数空间之间的映射,为参数化偏微分方程(PDE)提供数据驱动的代理建模范式。现有架构通常通过在指定变换域中参数化积分核,或对离散空间点应用类似注意力的交互来获得表达能力。尽管这些方法取得了显著进展,但它们常常面临物理可解释性、非局部空间通信、网格可扩展性和计算成本之间的持续权衡。我们提出了一种光启发的神经算子(LiNO),其潜在演化被分解为由基本光传输启发的三种机制:反射、折射和散射。反射和折射在潜在特征空间中充当自适应逐点变换,实现局部特征重定向和各向异性调制,而散射则在物理域上执行输入依赖的非局部传播。我们首先将散射公式化为具有相对位置偏置的归一化成对核,然后开发了一种高效的散射变体,用正特征全局传播和局部扩散分支替代显式的成对交互,将主导空间复杂度从二次降至线性。这产生了一个结构化的神经算子,将局部特征调制与全局空间通信分离,同时保留了模块化和可解释的潜在演化。

英文摘要

Neural operators learn mappings between infinite-dimensional function spaces and provide a data-driven surrogate modeling paradigm for parametric partial differential equations (PDEs). Existing architectures typically obtain expressivity by parameterizing integral kernels in prescribed transform domains or by applying attention-like interactions over discretized spatial points. While these approaches have achieved substantial progress, they often face a persistent trade-off among physical interpretability, nonlocal spatial communication, mesh scalability, and computational cost. We propose a Light-inspired neural operator (LiNO), an operator-learning architecture whose latent evolution is decomposed into three mechanisms motivated by elementary light transport: reflection, refraction, and scattering. Reflection and refraction act as adaptive pointwise transformations in latent feature space, enabling local feature reorientation and anisotropic modulation, whereas scattering performs input-dependent nonlocal propagation over the physical domain. We first formulate scattering as a normalized pairwise kernel with relative positional bias, and then develop an efficient scattering variant that replaces explicit pairwise interactions with positive-feature global propagation and a local diffusion branch, reducing the dominant spatial complexity from quadratic to linear. This yields a structured neural operator that separates local feature modulation from global spatial communication while retaining a modular and interpretable latent evolution.
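The step from quadratic to linear cost can be illustrated with the usual reordering trick for kernels built from positive features. The sketch below assumes a normalized kernel of the form phi(q_i) . phi(k_j) with phi(x) = elu(x) + 1, which is a common choice in linear attention and only a stand-in for LiNO's actual feature map; it also omits the relative positional bias and the local diffusion branch mentioned in the abstract. All names are hypothetical.

```python
import math
from typing import List

Vec = List[float]


def phi(x: Vec) -> Vec:
    """Positive feature map elu(x) + 1 (an assumed choice)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def pairwise_propagation(q: List[Vec], k: List[Vec], v: List[Vec]) -> List[Vec]:
    """Quadratic form: every query point visits every key point."""
    out = []
    for qi in q:
        fq = phi(qi)
        w = [sum(a * b for a, b in zip(fq, phi(kj))) for kj in k]
        z = sum(w)
        out.append(
            [sum(wj * vj[c] for wj, vj in zip(w, v)) / z for c in range(len(v[0]))]
        )
    return out


def linear_propagation(q: List[Vec], k: List[Vec], v: List[Vec]) -> List[Vec]:
    """Linear form: summarize keys and values once, then reuse for each query."""
    d, c = len(k[0]), len(v[0])
    s = [[0.0] * c for _ in range(d)]   # sum_j phi(k_j) v_j^T
    z = [0.0] * d                       # sum_j phi(k_j)
    for kj, vj in zip(k, v):
        fk = phi(kj)
        for a in range(d):
            z[a] += fk[a]
            for b in range(c):
                s[a][b] += fk[a] * vj[b]
    out = []
    for qi in q:
        fq = phi(qi)
        norm = sum(a * b for a, b in zip(fq, z))
        out.append([sum(fq[a] * s[a][b] for a in range(d)) / norm for b in range(c)])
    return out


q = [[0.5, -1.0], [2.0, 0.0]]
k = [[1.0, 0.0], [-0.5, 0.3], [0.2, 0.2]]
v = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
dense = pairwise_propagation(q, k, v)
fast = linear_propagation(q, k, v)
```

Both functions compute the same normalized weighted average of the values; the second never forms the N-by-N weight matrix, so its cost grows linearly with the number of spatial points for a fixed feature width.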

2606.03260 2026-06-03 cs.LG cs.AI 版本更新

EqGINO: Equivariant Geometry-Informed Fourier Neural Operators for 3D PDEs

EqGINO: 面向3D PDE的等变几何信息傅里叶神经算子

Sungwon Kim, Juho Song, Seungmin Shin, Guimok Cho, Sangkook Kim, Chanyoung Park

Affiliations * University of Texas at Austin

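AI summary: Proposes the EqGINO framework, which enforces isotropy in the spectral domain to obtain exact equivariance to discrete symmetries and to generalize to arbitrary continuous rotations, modeling coordinate-invariant physical laws for 3D PDEs.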

Comments ICML 2026

详情
AI中文摘要

用于3D偏微分方程(PDE)的深度学习代理通常难以在几何变换下泛化,因为它们严重依赖于特定的坐标系。虽然等变网络提供了一种解决方案,但它们通常依赖于空间域中的局部操作,使得对PDE动力学至关重要的全局感受野计算成本高昂。相反,傅里叶神经算子(FNO)高效地捕获全局交互,但由于谱群卷积的过高成本,在其中建立3D等变性仍然不切实际。为弥合这一差距,我们引入了EqGINO,一个在谱域中强制执行各向同性的几何鲁棒框架。通过设计,EqGINO保证对离散化计算域固有的离散对称性具有精确等变性。除了这种离散保证外,我们的结构先验使得即使在有限数量的SE(3)变换训练样本下,也能有效泛化到任意连续方向。因此,我们的方法在复杂的非规则3D几何上鲁棒地建模坐标不变的物理定律。我们的代码可在 https://github.com/sung-won-kim/EqGINO 获取。

英文摘要

Deep learning surrogates for 3D Partial Differential Equations (PDEs) often fail to generalize across geometric transformations because they depend heavily on specific coordinate systems. While equivariant networks offer a solution, they typically rely on local operations in the spatial domain, making the global receptive field, which is essential for PDE dynamics, computationally expensive. Conversely, Fourier Neural Operators (FNOs) efficiently capture global interactions, yet establishing 3D equivariance within them remains impractical due to the prohibitive cost of spectral group convolutions. To bridge this gap, we introduce EqGINO, a geometrically robust framework that enforces isotropy in the spectral domain. By design, EqGINO guarantees exact equivariance to the discrete symmetries inherent to the discretized computational domain. Beyond this discrete guarantee, our structural prior enables effective generalization to arbitrary continuous orientations even with a limited number of SE(3)-transformed training samples. Consequently, our method robustly models coordinate-invariant physical laws on complex irregular 3D geometries. Our code is available at https://github.com/sung-won-kim/EqGINO

2606.03257 2026-06-03 cs.NE cs.AI cs.LG 版本更新

PSViT: A Methodology for Structurally Pruning Spiking Vision Transformers

PSViT:一种结构剪枝脉冲视觉Transformer的方法

Rachmad Vidya Wicaksana Putra, Achyuta Muthuvelan, Alberto Marchisio, Muhammad Shafique

Affiliations * eBRAIN Lab, Division of Engineering, New York University (NYU) Abu Dhabi; New York University (NYU) Abu Dhabi, United Arab Emirates (UAE)

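AI summary: Proposes PSViT, which compresses spiking vision Transformers through structured pruning (uniform channel-wise filter pruning and sensitivity-based fine-grained pruning), achieving 22.4% memory savings on ImageNet-1K with an accuracy loss within 3%.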

Comments 8 pages, 7 figures, 3 tables

详情
AI中文摘要

脉冲视觉Transformer(SViT)模型是很有前景的低功耗ViT模型,用于解决基于视觉的任务,具有最先进的性能。然而,它们的大尺寸限制了在资源受限的嵌入式平台上的部署,凸显了模型压缩的需求。一种突出的压缩技术是剪枝,最先进的工作采用非结构化剪枝技术来压缩SViT模型。这种技术需要专门针对稀疏模式定制的硬件架构才能最大化其效率优势,使得这种方法不可扩展。为了解决这个问题,我们提出了PSViT,一种对SViT模型进行结构化剪枝的新方法,从而使得利用现有且广泛使用的计算架构高效加速其推理成为可能。为此,PSViT采用了几个关键步骤:均匀通道滤波器剪枝以结构化消除非显著权重,敏感性分析以评估单层通道剪枝对精度和网络大小的影响,以及基于敏感性分析和给定网络架构的细粒度通道剪枝。实验结果表明,PSViT通过单次剪枝有效获得了22.4%的内存节省,同时在ImageNet-1K上保持高精度(未经微调为70.3%,经微调为72.8%),与原始未剪枝SViT模型(73.3%)相比精度损失在3%以内。这些结果还表明,PSViT方法推进了在资源受限应用中实现高效SViT部署的努力。

英文摘要

Spiking Vision Transformer (SViT) models are promising low-power ViT models for solving vision-based tasks with state-of-the-art performance. However, their large size limits their deployment on resource-constrained embedded platforms, underscoring the need for model compression. One prominent compression technique is pruning, and state-of-the-art works employ unstructured pruning to compress SViT models. Such techniques require specialized hardware architectures tailored to the sparsity patterns to maximize their efficiency benefits, which makes this approach hard to scale. To address this, we propose PSViT, a novel methodology for structured pruning of SViT models, which makes it possible to accelerate their inference efficiently on existing, widely used computing architectures. PSViT employs several key steps: uniform channel-wise filter pruning to structurally eliminate non-significant weights, sensitivity analysis to evaluate the impact of channel-wise pruning of individual layers on accuracy and network size, and fine-grained channel-wise pruning based on the sensitivity analysis and the given network architecture. Experimental results show that PSViT obtains 22.4% memory savings through single-shot pruning while keeping accuracy on ImageNet-1K within 3% of the original non-pruned SViT model (73.3%): 70.3% without fine-tuning and 72.8% with fine-tuning. These results show that the PSViT methodology advances efforts toward efficient SViT deployment in resource-constrained applications.
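The difference between structured and unstructured pruning is easiest to see in code: structured pruning removes whole channels, so the remaining weights form a smaller dense tensor that ordinary hardware can run directly. The sketch below ranks channels by L1 norm, a common saliency criterion that is assumed here; the abstract does not say which criterion PSViT uses, and the names `prune_channels` and `keep_ratio` are hypothetical.

```python
from typing import List, Tuple


def prune_channels(
    filters: List[List[float]], keep_ratio: float
) -> Tuple[List[int], List[List[float]]]:
    """Keep the output channels with the largest L1 norm.

    filters holds one flat weight list per output channel. Returns the kept
    channel indices (in original order) and the smaller dense weight list.
    """
    n_keep = max(1, int(len(filters) * keep_ratio))
    scores = [sum(abs(w) for w in channel) for channel in filters]
    ranked = sorted(range(len(filters)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n_keep])
    return keep, [filters[i] for i in keep]


layer = [
    [0.1, -0.1],   # L1 norm 0.2
    [1.0, 2.0],    # L1 norm 3.0
    [0.0, 0.05],   # L1 norm 0.05
    [-3.0, 0.5],   # L1 norm 3.5
]
kept_indices, pruned_layer = prune_channels(layer, keep_ratio=0.5)
print(kept_indices)   # [1, 3]
print(pruned_layer)   # [[1.0, 2.0], [-3.0, 0.5]]
```

A per-layer sensitivity analysis, as described in the abstract, would then amount to choosing a different `keep_ratio` for each layer according to how much accuracy drops when that layer alone is pruned.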

2606.03251 2026-06-03 cs.AI cs.CV cs.LG eess.IV stat.ML 版本更新

Do Real-World Datasets Contain Natural Experiments? An Empirical Study Using Causal Feature Selection

现实世界数据集是否包含自然实验?基于因果特征选择的实证研究

Gautam Gare, John Galeotti, Michael Mozer, Deva Ramanan, Nan Rosemary Ke

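AI summary This paper uses causal discovery and feature selection to detect natural experiments in real-world datasets, and improves model performance by treating the data as interventional.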

详情
AI中文摘要

在自然界中,影响某些个体或群体但不影响其他个体或群体的事件构成隐式干预,被称为自然实验。例如,COVID-19大流行是冠状病毒对感染COVID的亚群的一次干预。我们问:现有的现实世界数据集中是否存在自然实验?如果存在,我们应该如何处理它们?为了检测数据中的自然实验,我们使用因果发现恢复潜在因果图,并基于因果链接进行特征选择。如果通过将数据视为干预性而非观测性来提升下游性能,我们认为这表明数据集包含自然实验。我们首先通过使用合成图模拟包含和不包含自然实验的数据集来验证这一假设。然后,我们在大量现实世界数据集上进行系统的实证评估。我们的结果表明,现实世界数据集确实包含自然实验,我们可以利用这些自然实验通过因果推断来提升模型性能。我们的工作代表了该领域的初步探索,在有限范围内进行了初步研究。

英文摘要

In nature, events that affect some individuals or groups but not others constitute an implicit intervention and are known as natural experiments. For example, the COVID-19 pandemic was an intervention by the coronavirus on the sub-population infected with COVID. We ask, do natural experiments occur in existing real-world datasets? If yes, how should we treat them? To detect natural experiments in data, we use causal discovery to recover the underlying causal graph and perform feature selection based on causal links. If downstream performance improves by treating the data as interventional rather than observational, we argue that this suggests the dataset contains natural experiments. We first validate this hypothesis by simulating datasets with and without natural experiments using synthetic graphs. We then perform a systematic empirical evaluation on a large suite of real-world datasets. Our results indicate that real-world datasets do contain natural experiments and we can take advantage of those natural experiments to improve model performance using causal inference. Our work represents the initial foray into this area, offering a preliminary exploration within a limited scope.

2606.03238 2026-06-03 cs.LG cs.AI 版本更新

When RLHF Fails: A Mechanistic Taxonomy of Reward Hacking, Collapse, and Evaluator Gaming

当RLHF失败时:奖励黑客、崩溃和评估者博弈的机制分类

Zelalem Abahana

Affiliations * First Citizens Bank; Alma Mater Europaea University

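AI summary Through comparative experiments with PPO, DPO, and related methods, this paper proposes a mechanistic taxonomy based on the directions of reward and judge scores, classifying RLHF failure modes as localizable, partially predictable training dynamics.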

Comments 20 pages, 8 figures; includes code, artifacts, and live demo

详情
AI中文摘要

从人类反馈中强化学习(RLHF)通过用学习到的可扩展代理替代未明确指定的人类目标,实现了大规模后训练。这种替代同时创建了一个结构化的失败面:优化可以提高学习到的奖励而外部质量下降,降低代理和评估者分数,揭示代理欠对齐,或产生评估者特定的分歧。我们展示了一个紧凑RLHF流程的实证失败模式研究,该流程包括近端策略优化(PPO)、直接偏好优化(DPO)、不确定性惩罚PPO(UP-PPO)、奖励模型不确定性、近似策略漂移、多样性和重复诊断,以及两个外部LLM评估者。我们不将奖励黑客视为单一终端事件,而是使用学习到的奖励、评估者分数和平均评估者分数的方向对检查点之间的匹配转换进行分类。在61个检查点行和1920个行级转换中,激进的PPO具有最高的局部奖励黑客率(14.45%;bootstrap 95% CI: 10.16-18.75),而UP-PPO在相同激进机制下产生较低率(11.33-10.94%)。转换前的逻辑模型以ROC-AUC 0.821预测未来行级奖励黑客,行级分析发现12个设置中有3个存在检查点平均值遗漏的局部奖励黑客。核心结论是方法论上的:RLHF失败不仅是最终模型病理,而且是可分类、可定位和部分可预测的训练动态。

英文摘要

Reinforcement learning from human feedback (RLHF) makes large-scale post-training possible by replacing an underspecified human objective with learned and scalable proxies. The same substitution creates a structured failure surface: optimization can raise the learned reward while external quality falls, degrade both proxy and judge scores, reveal proxy under-alignment, or produce evaluator-specific disagreement. We present an empirical failure-mode study of a compact RLHF pipeline with proximal policy optimization (PPO), direct preference optimization (DPO), uncertainty-penalized PPO (UP-PPO), reward-model uncertainty, approximate policy drift, diversity and repetition diagnostics, and two external LLM judges. Rather than treating reward hacking as a single terminal event, we classify matched transitions between checkpoints using the directions of the learned reward, judge scores, and average judge score. Across 61 checkpoint rows and 1920 row-level transitions, aggressive PPO has the highest localized reward-hacking rate (14.45%; bootstrap 95% CI: 10.16-18.75), while UP-PPO yields lower rates in the same aggressive regime (11.33-10.94%). A pre-transition logistic model predicts future row-level reward hacking with ROC-AUC 0.821, and row-level analysis finds localized reward hacking that checkpoint averages miss in 3 of 12 settings. The central conclusion is methodological: RLHF failures are not only final-model pathologies, but training dynamics that can be classified, localized, and partially anticipated.
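The classification idea in the abstract, labelling each checkpoint-to-checkpoint transition by the directions of the learned reward and the judge scores, can be sketched as below. The label names, the precedence of the rules, and the `eps` dead zone are assumptions for illustration and are not the paper's exact taxonomy.

```python
def classify_transition(d_reward, d_judges, eps=0.0):
    """Label one transition (hypothetical rule set).

    d_reward: change in learned reward between two checkpoints.
    d_judges: list with the change in each external judge's score.
    """
    avg = sum(d_judges) / len(d_judges)
    signs = {1 if d > eps else -1 if d < -eps else 0 for d in d_judges}
    if 1 in signs and -1 in signs:
        return "evaluator_disagreement"   # judges move in opposite directions
    if d_reward > eps and avg < -eps:
        return "reward_hacking"           # proxy rises while judged quality falls
    if d_reward < -eps and avg < -eps:
        return "joint_degradation"        # proxy and judges both fall
    if d_reward < -eps and avg > eps:
        return "proxy_under_alignment"    # judges improve but proxy does not see it
    if d_reward > eps and avg > eps:
        return "aligned_improvement"
    return "no_change"


# Four toy transitions: (reward delta, [judge A delta, judge B delta]).
transitions = [
    (0.3, [-0.2, -0.1]),
    (0.2, [0.1, 0.3]),
    (-0.1, [-0.2, -0.1]),
    (0.1, [0.2, -0.3]),
]
labels = [classify_transition(r, j) for r, j in transitions]
hacking_rate = labels.count("reward_hacking") / len(labels)
```

Applying the rule per row rather than per checkpoint average is what lets localized reward hacking show up even when the averages look healthy.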

2606.03237 2026-06-03 cs.AI cs.CL cs.CY cs.LG cs.MA 版本更新

Solipsistic Superintelligence is Unlikely to be Cooperative

唯我论超级智能不太可能合作

Rakshit S Trivedi, Natasha Jaques, Logan Cross, Alexander Sasha Vezhnevets, Joel Z Leibo

Affiliations * DeepMind; University of Cambridge; University of California, Berkeley

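AI summary This paper argues that superintelligence (an extremely capable task solver) designed with a solipsistic approach is unlikely to be cooperative because it ignores the endogenous non-stationarity that deployment induces, and calls for a non-solipsistic research paradigm that treats interdependence as a core design principle.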

Comments 24 pages, 1 figure, Accepted at Proceedings of the 43rd International Conference on Machine Learning, 2026

详情
AI中文摘要

AI的核心挑战正从能力转向共存。AI研究的主导范式侧重于开发将世界视为外生且平稳反馈源的强大智能体。我们认为,源于这种唯我论AI设计方法的超级智能(极端能力的任务求解器)不太可能合作。部署AI系统会引发内生非平稳性,导致训练-测试-部署差距,即历史分布与部署环境相偏离。我们称此为单边优化的自我削弱属性。缩小这一差距需要参与合作的AI:即多个行为体导航其相互依存的均衡选择过程。我们呼吁一种非唯我论的研究范式,将这种相互依存作为核心设计原则,而非将合作视为待解决的任务。这需要构建涉及自适应对手方的动态评估测试平台,将制度视为设计原语,并保留人类能动性作为我们构建系统的结构性特征。

英文摘要

AI's central challenge is shifting from capability to coexistence. The dominant paradigm in AI research focuses on developing powerful agents that treat the world as an exogenous and stationary source of feedback. We contend that superintelligence, an extremely capable task solver, born out of such a solipsistic approach to AI design, is unlikely to be cooperative. Deploying AI systems induces endogenous non-stationarity, resulting in a train-test-deploy gap where historical distributions diverge from the deployment context. We refer to this as the self-undermining property of unilateral optimization. Closing this gap requires AI that participates in cooperation: the equilibrium-selection process through which multiple actors navigate their interdependence. We call for a non-solipsistic research paradigm that treats this interdependence as a core design principle rather than approaching cooperation as a task to solve. This entails building dynamic evaluation testbeds involving adaptive counterparties, treating institutions as design primitives, and preserving human agency as a structural feature of the systems we build.

2606.03234 2026-06-03 cs.LG 版本更新

Right Makes Might: Aligning Verified Hidden States Empowers RL Reasoning

正确即力量:对齐验证的隐藏状态增强强化学习推理

Ziyue Wang, Aomufei Yuan, Yongfu Zhu, Shuai Dong, Wenpu Liu, Yiran Yao, Weichu Xie, Yuqi Xu, Caoyuan Ma, Wenqi Shao, Xiaoying Zhang, Nan Duan, Jiaqi Wang

Affiliations * Peking University; JINGDONG; Shanghai Innovation Institute; The University of Tokyo; Tianjin University

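AI summary Proposes Hidden-Align, an auxiliary loss that aligns the last-layer hidden states of correct rollouts at the anchor token during RL training, improving mathematical reasoning performance.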

Comments 16 pages, 7 figures

详情
AI中文摘要

基于可验证奖励的强化学习(RLVR)已成为提升大语言模型数学推理的主流方法,但当前方法将每个正确rollout简化为单个奖励比特,忽略了其隐藏状态共享的几何结构。研究这一结构发现,在锚点token(答案标记前的位置)处,正确rollout自然收敛,因为它们必须产生相同答案(余弦相似度约0.84),但每个rollout仍保留其独特推理路径的残余方差。鼓励在该点完全对齐,推动模型提取统一的“正确决策”表示,减少对推理路径的敏感性。基于此观察,我们提出Hidden-Align,一种辅助损失函数,在RL训练中对齐正确rollout在锚点token处的最后一层隐藏状态,训练和推理中零开销。在八个数学推理基准上,Hidden-Align在Qwen3-1.7B、4B和14B上分别比DAPO基线平均提升pass@1 3.8、6.2和5.4个百分点,且在所有三种规模上pass@k一致提升,消融实验支持了损失类型、锚点位置、层深度和损失权重的影响。

英文摘要

Reinforcement Learning from Verifiable Rewards (RLVR) has become the dominant approach for improving mathematical reasoning in large language models, yet current methods reduce each correct rollout to a single reward bit, ignoring the geometric structure shared among their hidden states. Investigating this structure, we find that at the anchor token (the position immediately before the answer marker), correct rollouts converge naturally because they must produce the same answer (cosine similarity ~0.84), yet each retains residual variance from its unique reasoning path. Encouraging full alignment at this point pushes the model to extract a unified "correct decision" representation, reducing sensitivity to which reasoning path was taken. Based on this observation, we propose Hidden-Align, an auxiliary loss function that aligns the last-layer hidden states of correct rollouts at the anchor token during RL training, with zero overhead in both training and inference. On eight mathematical reasoning benchmarks, Hidden-Align improves average pass@1 over the DAPO baseline by 3.8, 6.2, and 5.4 percentage points on Qwen3-1.7B, 4B, and 14B respectively, with consistent pass@k gains across all three scales, supported by ablations on loss type, anchor position, layer depth, and loss weight.
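One way to picture an auxiliary loss of this kind is the mean pairwise cosine distance between the anchor-token hidden states of the rollouts that were verified correct. The abstract does not give the loss formula, so the cosine-distance form, the function names, and the toy vectors below are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def alignment_loss(anchor_states, is_correct):
    """Mean (1 - cosine) over all pairs of correct rollouts (hypothetical form).

    anchor_states: one hidden-state vector per rollout, taken at the anchor token.
    is_correct: verifier outcome per rollout; incorrect rollouts are ignored.
    """
    correct = [h for h, ok in zip(anchor_states, is_correct) if ok]
    if len(correct) < 2:
        return 0.0  # nothing to align
    pairs = [(i, j) for i in range(len(correct)) for j in range(i + 1, len(correct))]
    return sum(1.0 - cosine(correct[i], correct[j]) for i, j in pairs) / len(pairs)


# Three rollouts for one prompt; the third failed verification.
states = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
flags = [True, True, False]
aux = alignment_loss(states, flags)  # roughly 0.4 for these toy vectors
```

In training, a term like this would be added to the RL objective with a small weight; the weight value is one of the things the abstract says was ablated.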

2606.03232 2026-06-03 cs.LG cs.AI 版本更新

GFFMERGE: Efficient Merging of Graph Neural Force Fields and Beyond

GFFMERGE: 图神经力场的高效合并及其扩展

Parth Verma, Parv P. Singh, Vipul Garg, Ishita Thakre, N. M. Anoop Krishnan, Sayan Ranu

Affiliations * University of California, Berkeley; Stanford University; University of Cambridge

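AI summary Proposes the GFFMERGE framework, which achieves closed-form model merging for graph neural networks through the analytical solution of a convex embedding-alignment problem, recovering performance close to joint training on force field regression with 5-27x speedups.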

详情
AI中文摘要

图神经网络(GNN)通过降低计算成本实现接近量子精度的原子模拟,彻底改变了神经力场,但将这些模型适应新化学系统需要对基础模型进行昂贵的重新训练。受视觉和语言处理中模型合并的启发,我们提出了GFFMERGE,这是第一个用于GNN闭式模型合并的原则性框架。我们利用消息传递层的线性结构,将合并问题形式化为具有解析解的凸嵌入对齐问题。通过对GNN模型合并的首次系统基准测试,我们发现为视觉和语言设计的现有方法在力场回归任务上灾难性地失败,而GFFMERGE恢复了接近黄金标准联合训练的性能。在分子(MD17、MD22)、固态(LiPS20)和大规模图基准测试中,GFFMERGE及其通用GNN对应物GNNMERGE实现了5-27倍的加速,同时支持专业模型的模块化组合。值得注意的是,我们的闭式解在微调前就优于所有基线方法,并为更快、数据高效的收敛提供了优越的初始化。

英文摘要

Graph Neural Networks (GNNs) have revolutionized Neural Force Fields for atomistic simulations, achieving near-quantum accuracy at reduced cost, yet adapting these models to new chemical systems requires expensive retraining of foundation models. Inspired by model merging in vision and language processing, we introduce GFFMERGE, the first principled framework for closed-form model merging in GNNs. We exploit the linear structure of message-passing layers and formulate merging as a convex embedding-alignment problem with an analytical solution. Through the first systematic benchmarking of model merging for GNNs, we show that existing methods designed for vision and language catastrophically fail on force field regression, while GFFMERGE recovers performance approaching gold standard joint training. Across molecular (MD17, MD22), solid-state (LiPS20), and large-scale graph benchmarks, GFFMERGE and GNNMERGE (its generic GNN counterpart) achieve 5-27$\times$ speedups while enabling modular composition of specialized models. Remarkably, our closed-form solution alone outperforms all baseline methods before fine-tuning and provides superior initialization for faster, data-efficient convergence.

2606.03227 2026-06-03 cs.LG 版本更新

Learning Temporal Causal Structure via Smooth Differentiable Optimization

通过平滑可微优化学习时间因果结构

Tong Zhao, Ce Guo, Wayne Luk, Emil Lupu, Ray Dipojjwal

Affiliations * Imperial College London; University of Bristol

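AI summary Proposes learning a differentiable variable ordering with the Gumbel-Sinkhorn operator and triangularizing the instantaneous coefficient matrix of a structural vector autoregressive model, turning acyclicity into a parameterization and enabling unified continuous optimization that improves the efficiency and accuracy of time-series causal discovery.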

详情
AI中文摘要

多变量时间序列中具有瞬时效应的因果发现具有挑战性,因为瞬时结构必须是无环的。先前的方法通过将瞬时和滞后估计分离为多阶段流水线,或通过复杂的增广拉格朗日优化施加代数无环性约束来强制执行这一点,这两种方法都会带来高昂的计算成本。在这项工作中,我们提出了一种不同的方法:我们使用Gumbel-Sinkhorn算子学习变量的可微排列,并按照学习到的顺序三角化结构向量自回归(SVAR)模型的瞬时系数矩阵。这将无环性从硬约束转化为参数化,并在整个优化过程中保持其有效性。通过这样做,我们的方法实现了基于梯度的学习的统一连续优化,从而提高了时间序列因果发现的效率。在三个真实世界基准测试中,我们的方法在发现准确性和效率方面均优于12个基线方法,取得了最佳整体性能。在大规模基准测试中,它进一步展示了强大的可扩展性,实现了比竞争方法快6倍以上的加速。

英文摘要

Causal discovery with instantaneous effects in multivariate time series is challenging, as the instantaneous structure must be acyclic. Prior methods enforce this by either separating instantaneous and lagged estimation into multi-stage pipelines or imposing algebraic acyclicity constraints via complex augmented Lagrangian optimization, both of which incur high computational cost. In this work, we propose a different approach: we learn a differentiable permutation of variables using the Gumbel-Sinkhorn operator and triangularize the instantaneous coefficient matrix of a Structural Vector Autoregressive (SVAR) model in the learned order. This converts acyclicity from a hard constraint into a parameterization and keeps it valid throughout optimization. In doing so, our method enables unified, continuous optimization with gradient-based learning, leading to improved efficiency in time-series causal discovery. Across three real-world benchmarks, our method achieves the best overall performance compared with 12 baselines in both discovery accuracy and efficiency. On the large-scale benchmark, it further demonstrates strong scalability, achieving more than a 6x speedup over competing methods.
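The core mechanism can be sketched in plain Python: Sinkhorn normalization turns a score matrix into an (approximately) doubly stochastic soft permutation, and a variable order read from it yields a mask under which the instantaneous coefficient matrix is triangular, hence acyclic. This is a minimal sketch under stated assumptions: the hard row-argmax rounding and the helper names are illustrative choices, and the paper's method keeps the whole pipeline differentiable rather than rounding.

```python
import math
import random


def _logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gumbel_sinkhorn(log_alpha, tau=1.0, n_iters=20, noise=True, seed=0):
    """Soft permutation from scores via log-space Sinkhorn iterations."""
    rng = random.Random(seed)
    n = len(log_alpha)

    def gumbel():
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        return -math.log(-math.log(u))

    x = [[(log_alpha[i][j] + (gumbel() if noise else 0.0)) / tau for j in range(n)]
         for i in range(n)]
    for _ in range(n_iters):
        row_lse = [_logsumexp(row) for row in x]           # normalize rows
        x = [[x[i][j] - row_lse[i] for j in range(n)] for i in range(n)]
        col_lse = [_logsumexp([x[i][j] for i in range(n)]) for j in range(n)]
        x = [[x[i][j] - col_lse[j] for j in range(n)] for i in range(n)]
    return [[math.exp(v) for v in row] for row in x]


def acyclic_mask(soft_perm):
    """mask[i][j] = 1 only if variable j precedes variable i in the order.

    Assumes the row-wise argmax of soft_perm is a valid permutation.
    """
    n = len(soft_perm)
    order = [max(range(n), key=lambda j: soft_perm[k][j]) for k in range(n)]
    pos = {var: k for k, var in enumerate(order)}
    return [[1 if pos[j] < pos[i] else 0 for j in range(n)] for i in range(n)]


scores = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 2.0, 0.0]]
P = gumbel_sinkhorn(scores, noise=False)   # noise off to keep the demo deterministic
mask = acyclic_mask(P)                     # multiply the instantaneous matrix by this
```

Any coefficient matrix multiplied elementwise by `mask` only allows edges from earlier to later variables in the order, so no cycle can form regardless of the coefficient values.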

2606.03219 2026-06-03 cs.CL cs.LG 版本更新

Sample-Size Scaling of the African Languages NLI Evaluation

非洲语言自然语言推理评估的样本量缩放

Anuj Tiwari, Oluwapelumi Ogunremu, Terry Oko-odion, Jesujuwon Egbewale, Hannah Nwokocha

Affiliations * Noida Institute of Engineering and Technology; ML Collective

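AI summary Through systematic sample-size scaling experiments on 16 African languages with the AfriXNLI benchmark, this study finds that NLI performance does not improve monotonically with sample size but instead shows language-sensitive, non-monotonic scaling, indicating that data volume alone does not guarantee stable gains and that language-sensitive datasets and stronger multilingual modeling strategies are needed.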

Comments Accepted at the AfricaNLP Workshop, EACL 2026

详情
AI中文摘要

非洲语言标注数据非常少,且增加标注数据量是否能可靠提升下游性能尚不明确。本研究基于AfriXNLI基准,对16种非洲语言进行了自然语言推理(NLI)的系统样本量缩放研究。在受控条件下,测试了两个约0.6B参数的多语言Transformer模型(在XNLI上微调的XLM-R Large和AfroXLM-R Large),样本量从50到500个标注示例不等,并在随机子采样运行中平均结果。与通常认为的随数据增加性能单调提升相反,我们发现了一种强烈语言敏感且通常非单调的缩放行为。一些语言在低资源场景下表现出早期饱和或性能下降,以及高方差。这些结果表明,数据量不足以保证非洲NLI的稳定收益,因此需要创建语言敏感的数据集和更强的多语言建模策略。

英文摘要

African languages have very little labelled data, and it is unclear whether increasing the quantity of annotated data reliably improves downstream performance. We present a systematic sample-size scaling study of natural language inference (NLI) on 16 African languages based on the AfriXNLI benchmark. Under controlled conditions, two multilingual transformer models with roughly 0.6B parameters (XLM-R Large fine-tuned on XNLI, and AfroXLM-R Large) are tested on sample sizes between 50 and 500 labelled examples, with results averaged across random subsampling runs. Contrary to the usual expectation that performance increases monotonically with more data, we find strongly language-sensitive and often non-monotonic scaling behavior. Some languages show early saturation or a decrease in performance as sample size grows, as well as high variance in low-resource regimes. These results indicate that data volume alone is not enough to guarantee stable gains for African NLI, motivating language-sensitive dataset creation and stronger multilingual modelling strategies.

2606.03214 2026-06-03 cs.AI cs.CV cs.CY cs.LG 版本更新

Effect of Demographic Bias on Skin Lesion Classification

人口统计偏差对皮肤病变分类的影响

Ralf Raumanns, Gerard Schouten, Veronika Cheplygina, Josien P. W. Pluim

Affiliations * Fontys University of Applied Science, Venlo, The Netherlands; Fontys University of Applied Science, Eindhoven, The Netherlands; Eindhoven University of Technology, Eindhoven, The Netherlands; IT University of Copenhagen, Denmark

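AI summary This study evaluates skin lesion classification with ResNet-based convolutional models, uses linear programming to control demographic characteristics, examines the effect of patient sex and age bias, and compares three learning strategies, finding that sex bias stems mainly from data imbalance while age bias consistently favors younger groups.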

Comments Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA), 26 pages, 12 figures

详情
Journal ref
https://melba-journal.org/2026:011
AI中文摘要

在这项研究中,我们评估了使用基于ResNet的卷积模型进行皮肤病变分类的性能,重点关注训练数据中人口统计偏差的影响,特别是患者性别和年龄的变化。我们使用线性规划生成具有受控人口统计特征的数据集,从而系统性地研究偏差效应。评估了三种学习策略:单任务模型、强化多任务模型和对抗学习方案。我们的性别分析表明,性别特定的训练数据集优化了模型性能。值得注意的是,在训练数据中包含男性患者提高了男性亚组的性能,即使在女性占多数的情况下也是如此。强化学习和对抗学习方案缩小或消除了平衡和女性占多数数据集中的偏差差距。然而,这些策略在男性占多数的环境中效果较差,模型在男性上的表现仍然优于女性。在主要男性患者群体中,与基线模型相比,这两种学习方案显示出边际偏差减少。基于年龄的分析表明,三种模型方法的基线性能相当,性能随年龄类别下降。无论训练数据分布如何,年轻组始终达到最高性能。尽管平衡训练对最年轻年龄组产生最佳结果,但较老年组的性能下降。我们发现性别偏差主要源于数据不平衡,而年龄偏差无论分布如何始终偏向年轻群体。这些不同的机制需要有针对性的缓解策略。此外,在两个外部数据集上的跨数据集验证表明,域转移显著影响性能和人口统计偏差模式。

英文摘要

In this study, we evaluate the performance of skin lesion classification using ResNet-based convolutional models, focusing on the impact of demographic bias in training data, particularly variations in patient sex and age. We use linear programming to generate datasets with controlled demographic characteristics, allowing systematic investigation of bias effects. Three learning strategies are evaluated: a single-task model, a reinforcing multi-task model, and an adversarial learning scheme. Our sex-based analysis indicates that sex-specific training datasets optimise model performance. Notably, including male patients in the training data improved performance for the male subgroup, even in female-majority cases. Reinforcing and adversarial learning schemes narrowed or eliminated bias gaps in balanced and female-majority datasets. However, these strategies proved less effective in male-majority settings, where models continued to perform better for males than females. The two learning schemes showed marginal bias reduction compared to the baseline model in predominantly male patient populations. Age-based analysis demonstrates comparable baseline performance across the three model approaches, with performance declining across age categories. Younger groups consistently achieve the highest performance, regardless of training data distribution. Although balanced training yields optimal results for the youngest age category, performance decreases in older categories. We find that sex biases arise mainly from data imbalances, while age biases consistently favour younger groups regardless of distribution. These distinct mechanisms require targeted mitigation strategies. Additionally, cross-dataset validation on two external datasets revealed that domain shifts notably affect performance and patterns of demographic bias.

2606.03210 2026-06-03 cs.CE cs.LG cs.NA math.NA 版本更新

Critical evaluation of PINN for FWD inverse analysis and differentiable FEM as an alternative

PINN 在 FWD 反分析中的批判性评估及可微有限元方法作为替代方案

Yongjin Choi, Hyeonbin Moon, Seunghwa Ryu

发表机构 * KAIST(韩国科学技术院)

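AI summary This paper critically evaluates physics-informed neural networks (PINNs) for falling weight deflectometer (FWD) inverse analysis of multilayer pavement systems and proposes the differentiable finite element method (DiffFEM) as a more accurate, stable, and efficient alternative.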

详情
AI中文摘要

基于自动微分的反分析方法,包括物理信息神经网络(PINN)和可微编程,最近因其计算精确梯度和收敛效率的能力而显示出巨大潜力。然而,它们对落锤式弯沉仪(FWD)反计算的适用性尚未被探索。本研究基于合成基准,批判性评估了基于PINN的多层路面系统反分析,并研究了可微有限元方法(DiffFEM)作为替代方案。标准PINN由于层状路面系统固有的尖锐域不连续性而无法恢复层模量。尽管我们使用了具有域分解的扩展PINN(XPINN),它在不连续域上表现更好,但其性能仍然对损失权重和网络架构高度敏感,并且在测量噪声下会退化。相比之下,DiffFEM始终获得更准确、稳定且计算高效的反演结果。这些结果表明,将控制物理作为硬约束强加的DiffFEM比基于PINN的方法(其中控制物理通过损失函数作为软约束施加)具有更好的准确性、鲁棒性和计算效率。更广泛地说,研究结果表明,在基于PINN和DiffFEM的反分析之间进行选择需要仔细考虑,当存在高效且稳健的可微正演求解器时,DiffFEM提供了实际优势。

英文摘要

Automatic-differentiation-based inverse analysis methods, including physics-informed neural networks (PINNs) and differentiable programming, have recently shown great promise because they compute accurate gradients and converge efficiently. However, their applicability to falling weight deflectometer (FWD) backcalculation remains unexplored. This study critically evaluates PINN-based inverse analysis for a multilayer pavement system and investigates the differentiable finite element method (DiffFEM) as an alternative, using a synthetic benchmark. The standard PINN does not recover layer moduli because of the sharp domain discontinuities inherent to layered pavement systems. Although we use an extended PINN with domain decomposition (XPINN), which performs better on discontinuous domains, its performance remains highly sensitive to loss weighting and network architecture, and degrades under measurement noise. By contrast, DiffFEM consistently achieves more accurate, stable, and computationally efficient inversion results. These results indicate that DiffFEM, which enforces the governing physics as a hard constraint, yields better accuracy, robustness, and computational efficiency than PINN-based approaches, in which the governing physics is imposed as a soft constraint through the loss function. More broadly, the findings suggest that the choice between PINN- and DiffFEM-based inverse analysis needs careful consideration, with DiffFEM offering practical advantages when an efficient and robust differentiable forward solver is available.

2606.03209 2026-06-03 cs.LG 版本更新

DECA: Decentralizing Block-Wise Adam for Efficient LLM Full-Parameter Fine-Tuning on Non-IID Data

DECA: 去中心化逐块Adam优化器用于非独立同分布数据上的高效大语言模型全参数微调

Yunsheng Yuan, Shaowei Li, Kai Wang, Zhongyuan Sun, Zheng Zhang, Kai Han, Jun Luo, Feng Li

Affiliations * School of Computer Science and Technology, Shandong University, Qingdao, China; School of Mathematical Science, Peking University, China; IEIT SYSTEM, China; School of Computer Science and Artificial Intelligence, Shanghai University of Finance and Economics, Shanghai, China; College of Computing and Data Science, Nanyang Technological University, Singapore

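AI summary For LLM fine-tuning in privacy-sensitive and resource-constrained environments, proposes the DECA framework, which uses block-wise Adam optimization and a decentralized consensus mechanism to achieve efficient full-parameter fine-tuning on non-IID data, balancing convergence speed, downstream performance, and resource efficiency.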

详情
AI中文摘要

在隐私敏感和资源受限的环境中微调大语言模型(LLM)仍然具有挑战性。由于训练数据通常分布在多个客户端上,去中心化微调提供了一种无需中央服务器的协作适应自然范式。然而,在这种去中心化设置中实现全参数微调(FPFT)是困难的:FPFT提供了强大的适应能力,但对于十亿级模型来说会带来高昂的资源消耗。因此,现有的去中心化LLM微调方法主要依赖于参数高效更新,这提高了效率但可能限制下游性能。此外,客户端数据通常是非独立同分布的,这使得去中心化优化更容易受到客户端漂移和不稳定收敛的影响。为了解决这些挑战,我们提出了DECA,一种用于非独立同分布数据上LLM的资源高效去中心化FPFT框架。DECA将模型参数划分为不相交的块,并执行顺序逐块Adam优化,在保持去中心化全参数适应的同时减少资源消耗。为了稳定训练,DECA进一步引入了基于新鲜局部梯度统计和共识衍生差异信号的一阶和二阶逐块矩估计。我们提供了严格的理论分析和广泛的实验,表明DECA实现了快速收敛、强大的下游性能和显著的资源效率。

英文摘要

Fine-tuning large language models (LLMs) in privacy-sensitive and resource-constrained environments remains challenging. Since training data are often distributed across multiple clients, decentralized fine-tuning offers a natural paradigm for collaborative adaptation without a central server. However, enabling full-parameter fine-tuning (FPFT) in this decentralized setting is difficult: FPFT provides strong adaptation capacity but incurs prohibitive resource consumption for billion-scale models. Existing decentralized LLM fine-tuning methods therefore mainly rely on parameter-efficient updates, which improve efficiency but may restrict downstream performance. Moreover, client data are typically non-IID, making decentralized optimization more vulnerable to client drift and unstable convergence. To address these challenges, we propose DECA, a resource-efficient decentralized FPFT framework for LLMs on non-IID data. DECA partitions model parameters into disjoint blocks and performs sequential block-wise Adam optimization, reducing resource consumption while preserving decentralized full-parameter adaptation. To stabilize training, DECA further introduces first- and second-order block-wise moment estimates with fresh local gradient statistics and consensus-derived discrepancy signals. We provide rigorous theoretical analysis and extensive experiments, showing that DECA achieves fast convergence, strong downstream performance, and significant resource efficiency.
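The resource argument behind sequential block-wise Adam can be illustrated on a single client: only the active block carries Adam moment estimates, so optimizer memory scales with the block size rather than the full model. The sketch below is an assumption-laden toy, not DECA itself: it omits the decentralized consensus step and the discrepancy signals, resets the moments whenever the active block changes, and uses hypothetical names and a quadratic toy loss.

```python
import math


def blockwise_adam(params, grad_fn, blocks, sweeps=1, inner_steps=50,
                   lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Update one disjoint parameter block at a time with Adam (toy sketch)."""
    params = list(params)  # work on a copy so the caller's list is untouched
    for _ in range(sweeps):
        for block in blocks:
            # Moment estimates exist only for the active block.
            m = {i: 0.0 for i in block}
            v = {i: 0.0 for i in block}
            for t in range(1, inner_steps + 1):
                grad = grad_fn(params)
                for i in block:
                    m[i] = beta1 * m[i] + (1 - beta1) * grad[i]
                    v[i] = beta2 * v[i] + (1 - beta2) * grad[i] ** 2
                    m_hat = m[i] / (1 - beta1 ** t)   # bias correction
                    v_hat = v[i] / (1 - beta2 ** t)
                    params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params


def loss(x):
    return sum(xi * xi for xi in x)


def grad(x):
    return [2.0 * xi for xi in x]


start = [1.0, -2.0, 3.0, 0.5]
only_first = blockwise_adam(start, grad, blocks=[[0, 1]])
trained = blockwise_adam(start, grad, blocks=[[0, 1], [2, 3]], sweeps=4)
```

With two equal blocks, the moment buffers here are half the size they would be for ordinary Adam, while every parameter is still updated over a full sweep.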

2606.03199 2026-06-03 cs.LG physics.chem-ph 版本更新

Fast Organic Crystal Structure Prediction with Unit Cell Flow Matching

基于晶胞流匹配的快速有机晶体结构预测

Alston Lo, Luka Mucko, Austin H. Cheng, Andy Cai, Alastair J. A. Price, Wojciech Matusik, Alán Aspuru-Guzik

Affiliations * MIT CSAIL; University of Zagreb; University of Toronto; Vector Institute for Artificial Intelligence; Acceleration Consortium; Canadian Institute for Advanced Research; NVIDIA

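AI summary Proposes Clari, a model that uses flow matching to generate redundancy-free unit cells, achieving organic crystal structure prediction in seconds with a 15-30x speedup.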

详情
AI中文摘要

有机晶体结构预测(CSP)是有机固体计算建模的必要条件,但传统上每个分子需要耗费数CPU年。诸如OXtal之类的生成模型通过直接采样稳定的有机晶体结构,大幅降低了这一成本。然而,OXtal放弃了显式晶格参数化,转而使用昂贵的三角形层对块体材料的大块区域进行建模,这可能导致每个分子花费数分钟的计算成本。在本文中,我们通过Clari将其降低到秒级,Clari是一个大规模流匹配模型,生成无冗余晶胞,并用纯对偏注意力取代三角形层。Clari仅需原子类型和键作为输入,无需RDKit可处理的输入分子,从而扩展了其适用于富勒烯、金属配合物和原子团簇等具有挑战性的化学体系。我们进一步消融了关键设计选择,如辅助损失、时间步分布、噪声先验和自条件化。在OXtal的测试集上,我们超越了OXtal的求解率,同时获得了15-30倍的加速。由于Clari还模拟了显式氢原子,它通过直接能量排序支持推理时扩展,无需任何修饰或弛豫步骤。当生成150个晶体并选择能量前30的晶体时,我们进一步提高了求解率,同时保持了5-8倍的加速。我们还引入了CSD教学子集,作为未来基准测试中多样化和复杂分子的新测试分割。我们的贡献使得在几秒内实现CSP成为可能,使有机固体的大规模虚拟筛选变得实用。代码可从 https://github.com/aspuru-guzik-group/clari 获取。

英文摘要

Organic crystal structure prediction (CSP) is a requirement for computational modelling of organic solids, but traditionally costs several CPU-years per molecule. Generative models such as OXtal dramatically reduce this cost by sampling stable organic crystal structures directly. However, OXtal forgoes explicit lattice parametrization in favour of modelling large crops of the bulk material with expensive triangle layers, which can incur a computational cost of minutes per molecule. In this paper, we reduce this to seconds with Clari, a large-scale flow matching model that generates redundancy-free unit cells and replaces triangle layers with pure pair-bias attention. Clari requires only atom types and bonds as input and does not need an RDKit-sanitizable input molecule, which expands its applicability to challenging chemistries such as fullerenes, metal complexes, and atom clusters. We further ablate key design choices such as auxiliary losses, timestep distributions, noise priors, and self-conditioning. On OXtal's test sets, we surpass OXtal's solve rate while obtaining a speedup of $15$-$30\times$. Because Clari also models explicit hydrogens, it supports inference-time scaling via direct energy ranking, without any decoration or relaxation step. When generating 150 crystals and selecting the top-30 by energy, we further improve solve rate while maintaining a speedup of $5$-$8\times$. We also introduce the CSD Teaching Subset as a new test split of diverse and complex molecules for future benchmarking. Our contributions enable CSP within seconds, making large-scale virtual screening of organic solids practical. Code is available at https://github.com/aspuru-guzik-group/clari.

2606.03180 2026-06-03 cs.CV cs.CL cs.LG 版本更新

GLINT: Sparsely Gated Vision-Language Alignment for Fine-Grained Radiology Representations

GLINT:面向细粒度放射学表征的稀疏门控视觉-语言对齐

Jonggwon Park, Seongeun Lee, Junhyun Park, Hannah Yun, Hyunwoong Kim, Sohyun Jeong, Hyewon Kang, Byungmu Yoon, Kyoyun Choi

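AI summary To address the mismatch between global image-report alignment and the scale of local findings in radiology, proposes the GLINT framework, which uses sparsely gated alignment and dense feature regularization to enable zero-shot classification, grounding, and segmentation.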

详情
AI中文摘要

放射学中的视觉-语言模型(VLM)通过利用临床工作流程中自然产生的图像-报告对,已成为一种可扩展的范式。然而,这种配对揭示了尺度上的不匹配:每个病灶仅占据图像的一小部分区域,但监督仅在全局图像-报告级别提供。这带来了一个核心挑战:先前的方法将权重密集地分布到所有补丁上,而不是集中在与给定查询相关的稀疏子集上。为了解决这个问题,我们提出了GLINT(门控语言-图像对齐)框架,该框架显式建模这种稀疏对应关系。在对齐方面,我们引入了稀疏门控对齐,这是一种新颖的架构,其中在单独的门控嵌入空间上的sigmoid门仅激活与每个文本查询相关的补丁,强制执行显式稀疏性。在表征方面,我们添加了密集特征正则化,将可训练编码器的中间特征锚定到冻结的自监督学习(SSL)教师模型上,从而保留门控所依赖的细粒度补丁特征。相同的方案适用于2D胸部X光片(CXR)和3D胸部计算机断层扫描(CT),分别基于DINOv3和V-JEPA 2.1构建。GLINT支持从自由文本查询进行零样本分类、定位和分割,据我们所知,这是首次在没有掩码监督的情况下在3D CT体积上展示零样本分割。值得注意的是,最显著的增益出现在零样本定位和分割上,这些任务需要稀疏的、特定于查询的定位,这与我们的设计意图一致。在下游评估中,GLINT在分类、报告生成和分割方面均优于SSL编码器和医学VLM。

英文摘要

Vision-language models (VLMs) for radiology have emerged as a scalable paradigm by leveraging image-report pairs naturally produced in clinical workflows. However, this pairing reveals a mismatch in scale: each finding occupies only a small region of the image, yet supervision is provided only at the global image-report level. This poses a central challenge: prior approaches spread weight densely across all patches rather than concentrating on the sparse subset relevant to a given query. To address this, we present GLINT (Gated Language-Image alignmeNT), a framework that explicitly models this sparse correspondence. On the alignment side, we introduce Sparsely Gated Alignment, a novel architecture in which a sigmoid gate over a separate gate embedding space activates only the patches relevant to each textual query, enforcing explicit sparsity. On the representation side, we add Dense Feature Regularization, which anchors the trainable encoder's intermediate features to a frozen self-supervised learning (SSL) teacher, preserving the fine-grained patch features that the gate relies on. The same recipe applies to both 2D chest X-ray (CXR) and 3D chest computed tomography (CT), built with DINOv3 and V-JEPA 2.1, respectively. GLINT enables zero-shot classification, grounding, and segmentation from free-text queries, and to our knowledge is the first to demonstrate zero-shot segmentation on 3D CT volumes without mask supervision. Notably, the most pronounced gains arise on zero-shot grounding and segmentation, where sparse, query-specific localization is required, consistent with our design intent. In downstream evaluation, GLINT outperforms both SSL encoders and medical VLMs on classification, report generation, and segmentation.
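A toy version of a sigmoid gate over a separate gate embedding space might look like the following: each patch gets a gate value from its gate embedding and the text query's gate embedding, and the image-text score is the gate-weighted average of patch-text similarities. The pooling rule, the bias term, and all names are assumptions for illustration; the abstract only states that a sigmoid gate activates the patches relevant to each query.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gated_alignment(patch_feats, patch_gate_embs, text_feat, text_gate_emb, bias=0.0):
    """Return (score, gates) for one image and one text query (hypothetical form)."""
    # Gates come from a separate embedding space, not from the alignment features.
    gates = [sigmoid(dot(g, text_gate_emb) + bias) for g in patch_gate_embs]
    sims = [dot(p, text_feat) for p in patch_feats]
    score = sum(g * s for g, s in zip(gates, sims)) / sum(gates)
    return score, gates


# Three patches; only the first is relevant to the query.
patch_feats = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
patch_gate_embs = [[4.0], [-4.0], [-4.0]]
score, gates = gated_alignment(patch_feats, patch_gate_embs,
                               text_feat=[1.0, 0.0], text_gate_emb=[1.0])
```

Unlike a softmax over patches, independent sigmoid gates do not have to sum to one, so a query can switch on one patch, several, or almost none. The per-patch gate values also double as a coarse localization map for the query.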

2606.03173 2026-06-03 cs.CY cs.LG cs.SI 版本更新

Auditing Engagement Incentives in the Kidfluencer Ecosystem: A Multimodal Weak Supervision Approach

审计儿童网红生态系统中的参与激励:一种多模态弱监督方法

Zijing Wei, Chao Peter Yang, Xuanjie Chen

Affiliations * University of California, Berkeley; Stanford University

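AI summary This study audits YouTube kidfluencer channels with a multimodal weak supervision approach, finding that exploitation signals are significantly positively correlated with view counts and that performative labor, emotional bait, and privacy violations carry an engagement premium.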

详情
AI中文摘要

YouTube上“儿童网红”的兴起引发了对儿童数字劳动和剥削的伦理担忧。尽管新兴立法试图规范这一生态系统,但由于大规模操作化剥削的困难,将剥削与参与度联系起来的实证证据仍然稀缺。本研究对79个儿童网红频道的5,051个视频进行了多模态AI审计,使用弱监督方法检测剥削信号,无需大规模人工标注。我们聚合了噪声标注函数——包括基于LLM的标题分类和基于GPT-4 Vision的缩略图与描述分析,涵盖六个基于文献的维度——为每个视频分配一个概率剥削分数。一项多标注者验证研究(N=107)显示与人类判断高度一致(宏平均F1=0.911),并对整体剥削风险具有高敏感性(召回率=0.960,F1=0.793)。我们的发现揭示了表演性劳动、情感诱饵和隐私侵犯的显著参与度溢价。剥削分数与观看次数相关(Spearman ρ=0.229,p<10^{-50}),控制频道层面变化的混合效应回归显示,剥削分数每增加一个单位,观看次数增加4.4倍(p<0.001)。频道内分析表明,情感诱饵的中位观看次数提升+65.6%,表演性内容提升+56.0%(FDR校正p<0.001),且在同年稳健性检验中效果持续(p=0.030)。相比之下,明确的商业内容(产品植入)没有溢价(-3.8%,不显著),表明平台奖励的是儿童身份和劳动的商品化,而非传统广告。这些发现挑战了仅关注财务信托的政策框架,表明参与度与儿童的密集表演性劳动系统性地相关。

英文摘要

The rise of 'kidfluencers' on YouTube has raised ethical concerns about child digital labor and exploitation. While emerging legislation attempts to regulate this ecosystem, empirical evidence linking exploitation to engagement remains scarce, given the difficulty of operationalizing exploitation at scale. This study presents a multimodal AI audit of 5,051 videos across 79 kidfluencer channels, using weak supervision to detect exploitation signals without large-scale manual labels. We aggregate noisy labeling functions, including LLM-based classification of titles and GPT-4 Vision analysis of thumbnails and descriptions across six literature-grounded dimensions, to assign a probabilistic exploitation score to each video. A multi-annotator validation study (N=107) shows strong agreement with human judgment (macro-average F1 $= 0.911$) and high sensitivity for overall exploitation risk (recall $= 0.960$, F1 $= 0.793$). Our findings reveal a significant engagement premium for performative labor, emotional bait, and privacy violations. Exploitation scores correlate with view counts (Spearman $\rho = 0.229$, $p < 10^{-50}$), and mixed-effects regression controlling for channel-level variation shows that a one-unit increase in exploitation score yields a $4.4\times$ increase in views ($p < 0.001$). Within-channel analyses indicate median view boosts of $+65.6\%$ for emotional bait and $+56.0\%$ for performative content (FDR-corrected $p<0.001$), with effects holding in same-year robustness checks ($p=0.030$). Explicit commercial content (product placement), by contrast, shows no premium ($-3.8\%$, n.s.), suggesting the platform rewards commodification of the child's identity and labor over traditional advertising. These findings challenge policy frameworks focused solely on financial trusts, showing that engagement is systematically tied to the intensive, performative labor of children.
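Aggregating noisy labeling functions into a probabilistic score can be illustrated with a weighted vote in which a function may abstain. This is a deliberately simple stand-in: the abstract does not describe the aggregation model, and the function names, dimension names, and weights below are hypothetical.

```python
def exploitation_score(votes, weights=None):
    """Weighted fraction of non-abstaining labeling functions that fired.

    votes: dict mapping a labeling-function name to 1 (signal present),
           0 (signal absent), or None (abstain).
    weights: optional dict of per-function weights; missing names default to 1.0.
    """
    active = {name: v for name, v in votes.items() if v is not None}
    if not active:
        return None  # every labeling function abstained
    weights = weights or {}
    total = sum(weights.get(name, 1.0) for name in active)
    return sum(weights.get(name, 1.0) * v for name, v in active.items()) / total


# One video scored by three hypothetical labeling functions.
video_votes = {"title_llm": 1, "thumbnail_vision": 0, "description_vision": None}
plain = exploitation_score(video_votes)
weighted = exploitation_score(video_votes, weights={"title_llm": 3.0})
```

Weak-supervision toolkits usually learn the per-function accuracies instead of fixing weights by hand, but the output has the same shape: one score in [0, 1] per video that can be validated against a small human-labelled sample.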

2606.03169 2026-06-03 cs.SD cs.LG cs.MM 版本更新

SketchSong: Hierarchical Song Generation with Sketch Planning and Fine-Grained Multi-Track Modeling

SketchSong: 基于草图规划与细粒度多轨建模的分层歌曲生成

Xiaoyue Duan, Nanxing Hu, Yutang Feng, Xudong Yan, Jiatao Chen, Jinchao Zhang, Jie Zhou

Affiliations * Pattern Recognition Center, WeChat AI, Tencent Inc.

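AI summary Proposes SketchSong, a hierarchical song generation framework that addresses incoherent arrangements and coarse modeling of musical parts through song-level sketch planning and fine-grained multi-track modeling, outperforming the baseline on objective metrics and human listening tests.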

详情
AI中文摘要

最近的歌曲生成系统能够合成逼真的音频,但生成完整歌曲仍面临两个挑战。首先,现有方法中缺乏明确的歌曲级编排规划,模型往往需要在生成底层音频细节的同时组织整体编排发展,这常导致编排不连贯,如段落过渡薄弱和动态进展受限。其次,对不同音乐部分的粗粒度建模掩盖了它们各自的作用和交互,限制了生成歌曲的编排丰富性。本文提出SketchSong,一种分层歌曲生成框架,通过歌曲级草图规划和细粒度多轨建模解决这些问题。在时间维度上,SketchSong首先预测从压缩音频表示中提取的高层草图标记的紧凑序列,然后基于这些草图生成音频标记。这种从粗到细的过程在详细音频生成之前为模型提供了明确的编排规划。在轨道维度上,SketchSong显式建模四个轨道,即人声、贝斯、鼓和其他乐器。这使得模型能够更精确地捕捉不同音乐部分的作用和交互。在歌曲生成基准上的实验表明,SketchSong在客观指标和人工听测上均持续优于基线。尽管没有采用额外的偏好优化后训练(如歌词和文本提示对齐),SketchSong仍取得了与经过后训练的强开源系统相竞争的结果,证明了我们整体设计的有效性。

英文摘要

Recent song generation systems can synthesize realistic audio, yet generating complete songs remains challenging for two reasons. First, explicit song-level arrangement planning remains limited in existing methods, so models often need to organize overall arrangement development while generating low-level audio details. This often leads to incoherence in arrangements, such as weak section transitions and limited dynamic progression. Second, coarse modeling of different musical parts obscures their distinct roles and interactions, limiting arrangement richness of generated songs. In this paper, we present SketchSong, a hierarchical song generation framework that addresses these issues through song-level sketch planning and fine-grained multi-track modeling. Along the temporal dimension, SketchSong first predicts a compact sequence of high-level sketch tokens derived from compressed audio representations, and then generates audio tokens conditioned on these sketches. This coarse-to-fine process gives the model an explicit arrangement plan before detailed audio generation. Along the track dimension, SketchSong explicitly models four tracks, i.e., vocals, bass, drums and other instruments. This enables the model to capture the roles and interactions of different musical parts more precisely. Experiments on song generation benchmarks show that SketchSong consistently outperforms our baseline on both objective metrics and human listening tests. Despite not employing additional post-training for preference optimization such as lyrics and text-prompt alignments, SketchSong achieves competitive results against strong, post-trained open-source systems, demonstrating the effectiveness of our overall design.

2606.03143 2026-06-03 cs.LG cs.CL 版本更新

FederatedSkill: Federated Learning for Agentic Skill Evolution

FederatedSkill: 面向智能体技能演化的联邦学习

Jingbo Yang, Guanyu Yao, Yang Zhang, Ramana Rao Kompella, Gaowen Liu, Shiyu Chang

发表机构 * UC Santa Barbara(加州大学圣塔芭芭拉分校) MIT-IBM Watson AI Lab(麻省理工-IBM Watson人工智能实验室) Cisco Research(思科研究)

AI总结 提出FederatedSkill框架,通过语义技能差异作为通信单元,在保护隐私的同时实现个性化技能演化,相比自演化基线成功率提升44.4%,计算成本降低37.5%。

详情
AI中文摘要

现代LLM智能体越来越依赖技能库来处理复杂任务,使得技能演化成为自我改进的主要驱动力。然而,孤立的单用户任务流缺乏构建全面技能所需的多样性。虽然跨用户协作可以克服这一数据瓶颈,但当前的轨迹共享方法会损害用户隐私,并强加一个统一的全局库,无法适应客户端的异质性。我们引入了FederatedSkill,一个用于协作智能体演化的隐私保护框架。FederatedSkill超越了原始轨迹共享,利用语义技能差异(即对本地库的结构化补丁)作为通信的基本单位。在服务器端,一个演化智能体聚合这些补丁,动态建模客户端特定的能力边界,促进严格个性化的技能演化,而不是次优的全局平均。在20个不同的智能体任务族上评估,FederatedSkill相比自演化基线表现出显著提升,成功率最高提高44.4%,计算成本降低37.5%。

英文摘要

Modern LLM agents increasingly rely on skill libraries to handle complex tasks, making skill evolution a primary driver of self-improvement. However, isolated single-user task streams lack the diversity required to build comprehensive skills. While cross-user collaboration can overcome this data bottleneck, current trajectory-sharing approaches compromise user privacy and impose a uniform global library that fails to accommodate client heterogeneity. We introduce FederatedSkill, a privacy-preserving framework for collaborative agent evolution. Moving beyond raw trajectory sharing, FederatedSkill utilizes semantic skill diffs, structured patches over local libraries, as the fundamental unit of communication. On the server side, an evolution agent aggregates these patches to dynamically model client-specific capability boundaries, facilitating strictly personalized skill evolution rather than a suboptimal global average. Evaluated across 20 distinct agent task families, FederatedSkill demonstrates substantial gains over self-evolving baselines, achieving up to a 44.4% increase in success rate and a 37.5% reduction in computational cost.

2606.03134 2026-06-03 cs.RO cs.LG 版本更新

How Visible Are Silent Manipulation Failures? An Observability Study of False-Success Detection in Simulated Robot Episodes

无声操作失败的可见性:模拟机器人任务中假成功检测的可观测性研究

Aarav Bedi

发表机构 * Aarav Bedi

AI总结 本研究通过模拟双机械臂ALOHA任务,探讨机器人自身成功检测器标记为成功的任务中,假成功(实际失败但被误判为成功)的可恢复性,发现基于关节数据的检测器在方块转移任务中几乎完全可恢复假成功,而在插销任务中仅部分可恢复,视觉检测器可弥补差距,且可分离性依赖于远低于实际传感器噪声的速度差异。

Comments 4 pages, 3 figures

详情
AI中文摘要

模仿学习策略用于机器人操作时,其训练任务的成功标签质量取决于机器人自身的成功检测器。一种特别有害的错误是假成功:机器人记录为成功但实际任务结果错误的任务。我们针对这些任务提出一个狭窄但实际的问题:一旦任务被标记为成功,推翻该标签所需的信息有多少存在于本体感觉中,又有多少需要视觉?我们在两个双机械臂ALOHA任务上构建模拟测试平台,通过环境扰动而非标签编辑诱发失败,利用检测器从未见过的特权模拟器状态标记每个任务,仅保留机器人标记为成功的任务。然后,我们将限制于本体感觉的检测器与基于视觉的检测器进行比较。我们发现可恢复性范围广泛:在方块转移任务中,假成功几乎完全可从关节数据中恢复,而在插销插入任务中,本体感觉仅恢复部分假成功,视觉检测器则弥补了大部分差距。我们还表明,我们测量的本体感觉可分离性依赖于远低于任何实际传感器噪声水平的速度差异,因此最好将其视为无噪声模拟器夸大的乐观上限。我们发布了生成和评估流程。

英文摘要

Imitation-learning policies for robot manipulation inherit the quality of the success labels attached to their training episodes, and those labels are usually produced by the robot's own success check. A particularly damaging error is the false success: an episode the robot logs as a success when the task outcome was actually wrong. We ask a narrow but practical question about these episodes. Once an episode has already been flagged as a success, how much of the information needed to overturn that label is present in proprioception, and how much requires vision? We build a simulated testbed on two bimanual ALOHA tasks, induce failures through environment perturbations rather than label edits, label every episode by privileged simulator state that the detector never sees, and keep only episodes the robot flagged as successful. We then compare detectors restricted to proprioception against a vision-based detector. We find that recoverability spans a wide range: in cube transfer the false successes are almost fully recoverable from joint data alone, while in peg insertion proprioception recovers only part of them and a vision detector closes most of the gap. We also show that the proprioceptive separability we measure rests on velocity differences far below any realistic sensor noise floor, so it is best read as an optimistic upper bound that a noiseless simulator inflates. We release the generation and evaluation pipeline.

2606.03131 2026-06-03 cs.LG 版本更新

HARVE: Hacking-Aware Reward-Head Vector Editing for Robust Reward Models

HARVE:面向鲁棒奖励模型的感知黑客奖励头向量编辑

Shuang Liu, Yuxuan Bo, Qiuyang Zhao, Caiyue Huang, Xiaorong Chen, Yanguang Liu, Mengnan Du

发表机构 * Carnegie Mellon University(卡内基梅隆大学) University of Virginia(弗吉尼亚大学) Harvard University(哈佛大学) Stanford University(斯坦福大学) University of Michigan(密歇根大学) New Jersey Institute of Technology(新泽西理工学院) The Chinese University of Hong Kong, Shenzhen(香港中文大学(深圳))

AI总结 针对奖励模型易受奖励黑客攻击的问题,提出无需训练的奖励头编辑方法HARVE,通过移除与黑客相关子空间对齐的奖励头向量分量,提升鲁棒性并保持通用能力。

详情
AI中文摘要

奖励模型对于大型语言模型(LLM)对齐至关重要,但它们仍然容易受到奖励黑客攻击。为了评估奖励模型的鲁棒性,我们引入了RewardHackBench,其中包含13种奖励黑客模式,涵盖现实生活中的高风险领域和通用设置,并且我们发现八个奖励模型在特定子类别上存在严重失败。为了缓解这些失败,我们提出了HARVE,一种针对标量奖励模型的无需训练的奖励头编辑方法。HARVE不是微调奖励模型,而是从与选定黑客子类别相关的残差流方向中识别出多方向黑客子空间,并移除与该子空间对齐的奖励头向量分量。这直接降低了奖励头对黑客相关特征的敏感性,仅使用少量对比性的黄金-黑客示例,无需梯度更新或微调。在八个奖励模型上的综合实验表明,该方法提高了黑客鲁棒性,优于微调基线,并保持了奖励模型的通用能力。进一步的分析表明,奖励黑客攻击更适合被捕捉为多维残差空间结构,而不是孤立的表面线索。

英文摘要

Reward models are central to large language model (LLM) alignment, but they remain vulnerable to reward hacking. To evaluate reward-model robustness, we introduce RewardHackBench, which contains 13 reward-hacking patterns covering real-life high-stakes domains and general settings, and we find severe failures on specific subcategories across eight reward models. To mitigate these failures, we propose HARVE, a training-free reward-head editing method for scalar reward models. Instead of fine-tuning the reward model, HARVE identifies a multi-directional hacking subspace from residual stream directions associated with selected hacking subcategories, and removes the component of the reward-head vector aligned with that subspace. This directly reduces the reward head's sensitivity to hacking-related features using only a small set of contrastive gold-hacked examples, without gradient updates or fine-tuning. Comprehensive experiments across eight reward models indicate that HARVE improves hacking robustness, outperforms fine-tuning baselines, and preserves the reward models' general capability. Further analyses suggest that reward hacking is better captured as a multidimensional residual-space structure than by isolated surface cues.
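The subspace-removal step described above can be illustrated with a minimal sketch. It assumes the hacking directions are already available as plain vectors; how HARVE estimates them from the contrastive gold-hacked examples is not stated in the abstract, and the names `orthonormalize` and `remove_subspace` are hypothetical, not taken from the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormalize(directions, eps=1e-10):
    """Gram-Schmidt: turn possibly correlated directions into an orthonormal basis."""
    basis = []
    for d in directions:
        v = list(d)
        for q in basis:
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > eps:  # skip directions already spanned by the basis
            basis.append([vi / norm for vi in v])
    return basis

def remove_subspace(head, directions):
    """Subtract from the reward-head vector its component inside span(directions)."""
    edited = list(head)
    for q in orthonormalize(directions):
        c = dot(edited, q)
        edited = [e - c * qi for e, qi in zip(edited, q)]
    return edited

# Toy 3-d example: the (assumed) hacking subspace is the x-y plane.
head = [1.0, 2.0, 3.0]
hack_dirs = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
edited_head = remove_subspace(head, hack_dirs)
```

After the edit, the head has zero inner product with every direction in the subspace, so a residual-stream feature lying in that subspace no longer changes the scalar reward. No gradient step is involved, which matches the training-free framing.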

2606.03130 2026-06-03 cs.LG 版本更新

Synthetic Hallucinations, Real Gains: Hard Negatives from Frontier Models for FIM Hallucination Mitigation

合成幻觉,真实收益:来自前沿模型的硬负样本用于FIM幻觉缓解

Mahdi Erfanian, Nelson Daniel Troncoso, Aashna Garg, Amabel Gale, Xiaoyu Liu, Pareesa Ameneh Golnari, Shengyu Fu

发表机构 * University of Illinois Chicago(伊利诺伊大学芝加哥分校) Microsoft(微软)

AI总结 针对小型开源代码模型在IDE自动补全中产生的填充中间(FIM)幻觉问题,提出一种无需执行的替代方法:利用前沿代码模型合成看似合理但错误的补全作为硬负样本,通过对比合成幻觉与真实开发者编辑的差异作为监督微调信号,在Delulu基准上提升精确匹配18.8个百分点。

详情
AI中文摘要

驱动IDE自动补全的小型开源代码模型仍然会输出幻觉的填充中间(FIM)补全:对项目中不存在的方法、参数、变量和导入的语法上自然的调用。现有的缓解方法要么需要每种语言的执行沙箱(在按键中途不适用),要么需要偏好优化管道(需要大量人工标注语料库)。我们提出一种无需执行的替代方案:使用前沿代码模型合成看似合理但错误的补全作为硬负样本,然后利用这些合成幻觉与真实开发者编辑之间的对比作为监督微调信号。我们的管道从公共GitHub中跨八种语言抓取多语言FIM上下文,并让一组三个前沿生成器为每个上下文针对Delulu分类法(一个经Docker验证的多语言FIM幻觉基准)中的四种幻觉类型各生成一个硬负样本,从而产生配对的选定/拒绝数据集。在10万行精选子集上微调Qwen2.5-Coder-7B-Instruct,使Delulu精确匹配提升+18.8点,编辑相似度提升+0.22,覆盖每种语言和每种类型,同时改进每个HumanEval-Infilling分割和每个SAFIM子集。同样的配方在3B模型上使Delulu提升+12.8 EM,并带有小的、特征化的一般FIM权衡。五轴消融实验(规模、类型混合、语言覆盖、基础模型家族和难度感知的愚弄率)加上头对头的SFT与DPO/ORPO比较,映射了哪些设计选择驱动了收益。我们发布完整的管道源代码——生成、愚弄率LLM评判、筛选和FIM微调配方——以便本文中的实验可以在任何许可语料库上端到端复现。

英文摘要

Small open-source code models that power IDE autocomplete still emit hallucinated Fill-in-the-Middle (FIM) completions: syntactically natural calls to methods, parameters, variables, and imports that do not exist in the surrounding project. Existing mitigations either require per-language execution sandboxes that do not apply at mid-keystroke or preference-optimisation pipelines that need large human-labelled corpora. We propose an execution-free alternative: use frontier code models to synthesise plausible-but-wrong completions as hard negatives, then leverage the contrast between these synthetic hallucinations and the ground-truth developer edit as a supervised fine-tuning signal. Our pipeline scrapes multilingual FIM contexts from public GitHub across eight languages and asks a panel of three frontier generators to produce one hard negative per context for each of four hallucination types drawn from the Delulu taxonomy, a Docker-verified multilingual FIM hallucination benchmark, yielding a paired chosen/rejected dataset. Fine-tuning Qwen2.5-Coder-7B-Instruct on a 100K-row curated subset lifts Delulu exact match by +18.8 points and edit similarity by +0.22 on every language and every type, while also improving every HumanEval-Infilling split and every SAFIM subset. The same recipe at 3B lifts Delulu by +12.8 EM with a small, characterised general-FIM trade-off. Five-axis ablations (size, type mix, language coverage, base-model family, and a difficulty-aware fool rate) plus a head-to-head SFT vs. DPO/ORPO comparison map which design choices drive the gain. We release the full pipeline source code -- generation, fool-rate LLM judging, curation, and the FIM fine-tuning recipe -- so that the experiments in this paper can be reproduced end-to-end on any permissively licensed corpus.

2606.03128 2026-06-03 cs.CR cs.AI cs.CL cs.LG 版本更新

Decoupled Smart Contract Audits: Lightweight LLM Framework via Distillation and Aggregation

解耦式智能合约审计:通过蒸馏与聚合的轻量级LLM框架

Bagus Rakadyanto Oktavianto Putra, Muhamad Risqi Utama Saputra, Widyawan, Guntur Dharma Putra

发表机构 * University of Indonesia(印度尼西亚大学)

AI总结 提出一种基于轻量级开源LLM(0.6B-4B参数)的解耦式智能合约审计框架,通过rsLoRA、知识蒸馏和链式验证聚合策略,在漏洞检测中达到98.25%准确率,优于7B-34B参数模型。

Comments 12 pages, 4 figures, 5 tables. Accepted to IEEE ICWS 2026

详情
AI中文摘要

智能合约面临关键安全挑战,需要在去中心化网络服务中进行彻底审计。虽然大型语言模型(LLMs)在自动漏洞检测中展现出潜力,但现有方法缺乏严重性评估和可操作的修复建议,且计算开销过大。在本研究中,我们引入了一个高效的端到端智能合约安全审计框架,利用轻量级、高度优化的开源LLMs(0.6B-4B参数)。我们的框架将综合审计任务解耦为四个相互关联的组件:漏洞检测、解释、严重性分类和修复建议。为了在无需庞大参数量的情况下保持高准确性,我们实现了秩稳定低秩适配器(rsLoRA)、知识蒸馏以及自定义链式验证(CoVe)聚合策略,系统性地筛选并整合模型生成的多个草稿响应,形成高准确度的审计报告。实验结果表明,我们的轻量级流水线持续优于最先进的开源代码密集LLMs(7B至34B参数),在漏洞检测中达到98.25%的准确率,在生成解释任务中达到0.4375的对齐分数。此外,我们广泛的消融研究实证验证了我们的解耦审计过程相对于统一提示的优越性,并揭示了一种新颖的严重性中心性偏差,为未来LLM辅助审计研究建立了关键基准。

英文摘要

Smart contracts face critical security challenges that require thorough auditing in decentralized web services. While Large Language Models (LLMs) have shown promise in automated vulnerability detection, existing approaches lack severity assessment and actionable remediation advice, and they incur excessive computational overhead. In this study, we introduce an efficient end-to-end smart contract security audit framework utilizing lightweight, highly optimized open-source LLMs (0.6B-4B parameters). Our framework decouples comprehensive audit tasks into four interconnected components: vulnerability detection, explanation, severity classification, and remediation recommendation. To maintain high accuracy without massive parameters, we implement Rank-Stabilized Low-Rank Adapters (rsLoRA), knowledge distillation, and a custom Chain-of-Verification (CoVe) aggregation strategy to systematically screen and consolidate multiple draft responses from the model into a highly accurate audit report. Experimental results demonstrate that our lightweight pipeline consistently outperforms state-of-the-art open-source dense code LLMs (7B to 34B parameters), achieving 98.25% accuracy in vulnerability detection and an alignment score of 0.4375 in generative explanation tasks. Furthermore, our extensive ablation studies empirically validate the superiority of our decoupled audit processes over unified prompting and uncover a novel severity centrality bias, establishing a critical benchmark for future research in LLM-assisted auditing.

2606.03125 2026-06-03 cs.LG 版本更新

Rethinking Neural Width for Alternating Current Optimal Power Flow Proxies

重新思考用于交流最优潮流代理的神经网络宽度

Dhruvi Khandelwal, Anurag Basistha, Ayushi Jolotia, Parikshit Pareek

发表机构 * Department of Electrical Engineering, National Institute of Technology Kurukshetra, India(印度库鲁克谢特拉国立理工学院电气工程系) Indraprastha Institute of Information Technology Delhi, India(印度德里英德拉普拉斯塔信息技术学院) Department of Electrical Engineering, Indian Institute of Technology Roorkee, India(印度理工学院罗尔基分校电气工程系)

AI总结 本文提出损失引导神经稠密化算法,通过逐步扩展网络容量来最小化宽度,以精确逼近交流最优潮流流形,并在多个IEEE系统上以少十倍的神经元达到与基线相当的性能。

详情
AI中文摘要

用于交流最优潮流(ACOPF)的深度学习代理缺乏确定架构大小的系统方法。本文通过一个建设性的思想实验来回答一个基本问题:神经网络必须有多宽才能几乎准确地逼近ACOPF流形?我们引入了一种损失引导神经稠密化(LG-ND)算法,该算法仅在当前深度神经网络拓扑无法进一步改进时进行扩展,从而逐步发现必要的容量。在多个IEEE系统上的实验结果表明,LG-ND使用每层最多少十倍的神经元即可达到与文献基线相当的性能。这种架构极简性对于安全关键电网运行中所需的正式验证至关重要。

英文摘要

Deep learning proxies for Alternating Current Optimal Power Flow (ACOPF) lack systematic methods for determining architectural size. This paper conducts a constructive thought experiment to answer a fundamental inquiry: how wide must a neural network be to almost accurately approximate the ACOPF manifold? We introduce a Loss-Guided Neural Densification (LG-ND) algorithm that incrementally discovers necessary capacity by expanding only when the current deep neural network topology fails to improve further. Empirical results across various IEEE systems show that LG-ND achieves performance parity with literature baselines using up to ten times fewer neurons per layer. Such architectural minimalism is critical for the formal verification required in safety-critical grid operations.
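The "expand only when the current topology stops improving" rule can be sketched as a plateau-triggered widening loop. Everything below is an illustrative assumption rather than the LG-ND algorithm itself: the `train_epoch` callback, the patience and tolerance values, the fixed growth step, and the toy loss model are all hypothetical.

```python
def densify(train_epoch, width, target_loss, patience=3, tol=1e-3,
            grow=2, max_width=64):
    """Train at a fixed width; widen only after `patience` epochs without improvement."""
    best, stale = float("inf"), 0
    history = []
    while True:
        loss = train_epoch(width)      # one epoch at the current width
        history.append((width, loss))
        if loss <= target_loss:
            return width, history      # enough capacity found
        if best - loss > tol:
            best, stale = loss, 0      # still improving: keep the topology
        else:
            stale += 1
        if stale >= patience:          # plateau: the topology fails to improve
            if width + grow > max_width:
                return width, history  # give up at the width cap
            width += grow
            best, stale = float("inf"), 0

# Toy stand-in for training: the reachable loss floor shrinks as 1/width.
def toy_epoch(width):
    return 1.0 / width

final_width, history = densify(toy_epoch, width=2, target_loss=0.2)
```

In this toy run the loop settles on the smallest width in the growth sequence whose loss floor meets the target, which is the sense in which a loss-guided procedure discovers a necessary capacity instead of fixing the width up front.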

2606.03121 2026-06-03 cs.LG 版本更新

TiWeaver: Unified Temporal Dynamics Modeling via Contextual Patching

TiWeaver:通过上下文补丁实现统一的时间动态建模

Zhe Li, Jindong Tian, Hao Miao, Zhi Lei, Chenjuan Guo, Bin Yang

发表机构 * East China Normal University(华东师范大学) Hong Kong Polytechnic University(香港理工大学) Aalborg University(奥尔堡大学)

AI总结 针对多变量时间序列中因缺失值和非均匀采样等不规则性导致的动态复杂性和通道间异步依赖问题,提出TiWeaver框架,通过图引导自适应分词器(G²AT)和细粒度异步依赖提取器(FADE)实现自适应建模,在12个数据集上取得最高25%的性能提升。

详情
AI中文摘要

多变量时间序列预测在现实世界应用中扮演着关键角色,包括天气预报、股票分析和健康监测。由于数据源的多样性,时间序列表现出多样的时间动态,通常伴随着各种不规则性,如缺失值和非均匀采样频率。这些不规则性导致跨通道的复杂异步时间依赖。因此,具有固定补丁方案的单一模型往往难以很好地适应多样化的多变量时间序列,阻碍了准确预测。在本文中,我们提出了TiWeaver,一个统一框架,旨在自适应地处理时间动态和细粒度的通道间依赖。具体来说,我们引入了一个图引导自适应分词器(G²AT),通过联合考虑时间密度和表示一致性,将时间序列划分为高度上下文连贯的补丁。此外,我们提出了一个细粒度异步依赖提取器(FADE),旨在建模细粒度的异步通道间依赖,同时结合长期历史依赖。我们在12个真实世界时间序列数据集上评估了TiWeaver,它取得了最先进的性能,优于现有方法高达25%。这些结果证明了其在多样化领域和数据特征上的鲁棒性和有效性。

英文摘要

Multivariate time series forecasting plays a critical role in real-world applications, including weather prediction, stock analysis, and health monitoring. Due to the diversity of data sources, time series exhibit diverse temporal dynamics, often accompanied by various irregularities such as missing values and non-uniform sampling frequencies. Such irregularities lead to complex and asynchronous temporal dependencies across channels. Thus, a single model with a fixed patching scheme often fails to adapt well to diverse multivariate time series, hindering accurate forecasting. In this paper, we propose TiWeaver, a unified framework designed to handle temporal dynamics and fine-grained inter-channel dependencies adaptively. Specifically, we introduce a Graph-Guided Adaptive Tokenizer (G$^2$AT) that divides time series into highly contextually coherent patches by jointly considering temporal density and representation consistency. In addition, we propose a Fine-grained Asynchronous Dependency Extractor (FADE), which is designed to model fine-grained asynchronous inter-channel dependencies while incorporating long-term historical dependencies. We evaluate TiWeaver on 12 real-world time series datasets, where it achieves state-of-the-art performance, outperforming existing methods by up to 25%. These results demonstrate its robustness and effectiveness across diverse domains and data characteristics.

2606.03119 2026-06-03 cs.CV cs.AI cs.LG 版本更新

GuidedBridge: Training-freely Improving Bridge Models with Prior Guidance

GuidedBridge: 无需训练地利用先验引导改进桥接模型

Zehua Chen, Yucheng Yang, Binjie Yuan, Kaiwen Zheng, Jun S. Liu, Jun Zhu

发表机构 * University of Science and Technology of China(中国科学技术大学)

AI总结 提出无需训练的先验引导方法(PG)和频率调制先验引导(FMPG),通过对比弱先验与已见先验增强桥接模型的先验利用,并设计级联框架CFG-FMPG用于图像修复,实验证明该方法能一致提升预训练桥接模型在多种图像翻译任务中的性能。

Comments ICML 2026

详情
AI中文摘要

引导方法,如无分类器引导(CFG)和自动引导(AG),推动了扩散模型中噪声到数据生成的发展。最近,桥接模型引入了一种数据到数据的生成过程,可以利用有指导性的干净先验。在这项工作中,受先前通过去噪结果质量差异作为引导的方法启发,我们提出了一种无需训练的桥接引导方法,称为先验引导(PG)。具体来说,我们引入一个弱先验,该先验在桥接预训练期间未见,阻碍先验利用从而降低去噪结果。然后,我们将其与已见先验对比,通过缩放因子突出并增强先验利用。此外,我们分析了桥接过程中先验利用的潜在机制,并设计了频率调制先验引导(FMPG),该引导将引导尺度调整到与桥接生成动力学一致的低频和高频带。为了解决图像修复中的先验利用问题,我们开发了一个级联框架CFG-FMPG,该框架首先通过CFG生成噪声隐藏表示,然后将其作为生成先验与FMPG一起利用,在不影响推理效率的情况下发挥它们的互补优势。实验表明,我们的PG方法在多种图像翻译任务中一致地改进了预训练桥接模型。

英文摘要

Guidance methods, such as classifier-free guidance (CFG) and auto-guidance (AG), have advanced noise-to-data generation in diffusion models. Recently, bridge models have introduced a data-to-data generative process that can exploit an instructive clean prior. In this work, inspired by previous methods that use a quality difference between denoising results as guidance, we propose a training-free bridge guidance method, termed Prior Guidance (PG). Specifically, we introduce a weak prior, unseen during bridge pre-training, which hinders prior exploitation and thereby degrades the denoising result. Then, we contrast it with the seen prior to highlight and enhance prior exploitation via a scaling factor. Moreover, we analyze the underlying mechanism of prior exploitation in the bridge process and design frequency-modulated prior guidance (FMPG), which tailors the guidance scale to low- and high-frequency bands in line with the bridge generative dynamics. To address prior exploitation in image in-painting, we develop a cascaded framework, CFG-FMPG, which first generates a noisy hidden representation via CFG and then exploits it as a generative prior with FMPG, combining their complementary strengths without compromising inference efficiency. Experiments demonstrate that our PG methods consistently improve pre-trained bridge models across diverse image translation tasks.
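The contrast between seen-prior and weak-prior predictions can be sketched by analogy with classifier-free guidance. The abstract does not give the exact combination rule, so the extrapolation formula below is an assumption, and `prior_guidance` is a hypothetical name.

```python
def prior_guidance(pred_seen, pred_weak, scale):
    """CFG-style extrapolation: start at the weak-prior prediction and move
    `scale` times the way toward (and past) the seen-prior prediction."""
    return [w + scale * (s - w) for s, w in zip(pred_seen, pred_weak)]

# Toy denoising outputs for one step (two values only).
pred_seen = [1.0, 2.0]   # prediction conditioned on the prior seen in pre-training
pred_weak = [0.5, 1.0]   # prediction conditioned on the degraded, unseen prior

unguided = prior_guidance(pred_seen, pred_weak, scale=1.0)
guided = prior_guidance(pred_seen, pred_weak, scale=2.0)
```

With `scale=1.0` the rule returns the ordinary seen-prior prediction; larger scales amplify whatever the seen prior contributes beyond the weak one. A frequency-modulated variant would apply different scales to low- and high-frequency bands, which this sketch does not attempt.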

2606.03118 2026-06-03 cs.LG cs.CV q-bio.NC 版本更新

Learning to See via Epiretinal Implant Stimulation in silico with Model-Based Deep Reinforcement Learning

通过基于模型的深度强化学习在计算机模拟中学习经由视网膜上植入物刺激的视觉

Jacob Lavoie, Marwan Besrour, William Lemaire, Jean Rouat, Réjean Fontaine, Eric Plourde

发表机构 * Department of Electrical Engineering and Computer Engineering, Université de Sherbrooke(舍布鲁克大学电气与计算机工程系)

AI总结 本研究提出使用各向同性和各向异性形状,通过深度强化学习在虚拟患者的视网膜上渲染可理解的图像,以提高人工恢复视觉的清晰度。

Comments 18 pages, 6 figures. Published version: Biomed. Phys. Eng. Express 10, 025006 (2024)

详情
Journal ref
Biomed. Phys. Eng. Express 10 (2024) 025006
AI中文摘要

目标:年龄相关性黄斑变性和视网膜色素变性等疾病会导致感光层退化。恢复视力的一种方法是通过微电极阵列(如视网膜上植入物)电刺激存活的视网膜神经节细胞。已知视网膜上植入物会产生沿邻近视网膜神经节细胞轴突束延伸的可见各向异性形状。最近的研究表明,为了获得各向同性的像素状形状,可以通过失活电极或降低刺激电流水平来映射轴突束并避免刺激它们。避免轴突束刺激旨在去除类似笔触的形状,转而采用更简化的像素状形状集合。方法:在本研究中,我们提出使用各向同性和各向异性形状,在名为rlretina的强化学习环境中为虚拟患者的视网膜渲染可理解的图像。该环境将任务形式化为在基于笔触的渲染任务中使用笔触。主要结果:我们训练了一个深度强化学习智能体,它学会组合各向同性和各向异性形状以形成图像。我们研究了哪种基于误差或基于感知的指标适合奖励智能体。该智能体以基于模型的数据生成方式训练,使用经过心理物理学验证的轴突映射模型来渲染不同虚拟患者感知到的图像。我们表明,与不同虚拟患者中的朴素方法相比,该智能体可以生成更可理解的图像。意义:这项工作提供了一种解决视网膜上刺激的新方法,这是朝着使用各向异性光幻视改善人工恢复视力中视觉敏锐度的第一步。

英文摘要

Objective: Diseases such as age-related macular degeneration and retinitis pigmentosa cause the degradation of the photoreceptor layer. One approach to restore vision is to electrically stimulate the surviving retinal ganglion cells with a microelectrode array such as an epiretinal implant. Epiretinal implants are known to generate visible anisotropic shapes elongated along the axon fascicles of neighboring retinal ganglion cells. Recent work has demonstrated that to obtain isotropic pixel-like shapes, it is possible to map axon fascicles and avoid stimulating them by inactivating electrodes or lowering stimulation current levels. Avoiding axon fascicle stimulation aims to remove brushstroke-like shapes in favor of a more reduced set of pixel-like shapes. Approach: In this study, we propose the use of isotropic and anisotropic shapes to render intelligible images on the retina of a virtual patient in a reinforcement learning environment named rlretina. The environment formalizes the task as using brushstrokes in a stroke-based rendering task. Main Results: We train a deep reinforcement learning agent that learns to assemble isotropic and anisotropic shapes to form an image. We investigate which error-based or perception-based metric is adequate to reward the agent. The agent is trained in a model-based data generation fashion using the psychophysically validated axon map model to render images as perceived by different virtual patients. We show that the agent can generate more intelligible images compared to the naive method in different virtual patients. Significance: This work shares a new way to address epiretinal stimulation that constitutes a first step towards improving visual acuity in artificially-restored vision using anisotropic phosphenes.

2606.03094 2026-06-03 cs.LG 版本更新

FGRPO: Federated GRPO with Adaptive Aggregation on Non-IID Data

FGRPO:非独立同分布数据上具有自适应聚合的联邦GRPO

Pengyu Chen, Shaowei Li, Kai Wang, Yunsheng Yuan, Kai Han, Jun Luo, Feng Li

发表机构 * School of Computer Science and Technology, Shandong University(山东大学计算机科学与技术学院) School of Mathematical Science, Peking University(北京大学数学科学学院) School of Computer Science and Artificial Intelligence, Shanghai University of Finance and Economics(上海财经大学计算机科学与人工智能学院) College of Computing and Data Science, Nanyang Technological University(南洋理工大学计算与数据科学学院)

AI总结 提出联邦GRPO(FGRPO)框架,通过基于相对性能增益的自适应聚合机制,在非独立同分布数据上实现去中心化推理模型微调,兼顾数据隐私与鲁棒收敛。

详情
AI中文摘要

语言模型的最新进展已将强化学习确立为引发自我纠正和长链推理的主要范式。虽然群体相对策略优化(GRPO)通过消除评论家网络提供了卓越的可扩展性,但将其部署在中央基础设施上需要从分布式所有者收集大量数据,这带来了显著的隐私风险。为了解决这些问题,我们引入了联邦GRPO(FGRPO),这是一个旨在跨异构数据所有者去中心化推理模型微调的框架。为了有效缓解异构任务间奖励尺度差异引起的不稳定性,FGRPO结合了一种基于相对性能增益的自适应聚合机制。通过刻画每个客户端相对于其个性化历史基线的改进,该框架动态地优先考虑有效的学习轨迹,而无需考虑局部任务的难度。FGRPO在非独立同分布数据上确保鲁棒收敛,同时保护数据隐私。

英文摘要

Recent advances in language models have established reinforcement learning as the primary paradigm for eliciting self-correction and long-chain reasoning. While group relative policy optimization (GRPO) offers superior scalability by eliminating the critic network, deploying it on a central infrastructure entails collecting a large volume of data from distributed owners, which poses significant privacy risks. To address these concerns, we introduce federated GRPO (FGRPO), a framework designed to decentralize the fine-tuning of reasoning models across heterogeneous data owners. To effectively mitigate the instability caused by divergent reward scales across heterogeneous tasks, FGRPO incorporates an adaptive aggregation mechanism based on relative performance gain. By characterizing each client's improvement relative to its personalized historical baseline, the framework dynamically prioritizes effective learning trajectories regardless of local task difficulty. FGRPO ensures robust convergence on non-IID data while preserving data privacy.
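One way to read "adaptive aggregation based on relative performance gain" is sketched below. The abstract does not give the formula, so the normalised gain, the softmax weighting, and the temperature parameter are all assumptions chosen for illustration.

```python
import math

def relative_gain(current, baseline, eps=1e-8):
    """Improvement over the client's own historical baseline, scale-normalised."""
    return (current - baseline) / (abs(baseline) + eps)

def aggregation_weights(gains, temperature=1.0):
    """Softmax over gains (assumed weighting rule), shifted by the max for stability."""
    m = max(gains)
    exps = [math.exp((g - m) / temperature) for g in gains]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(client_params, weights):
    """Weighted average of per-client parameter vectors."""
    dim = len(client_params[0])
    return [sum(w * p[k] for w, p in zip(weights, client_params))
            for k in range(dim)]

# Two clients whose tasks use very different reward scales.
rewards = [0.30, 12.0]
baselines = [0.20, 11.0]
gains = [relative_gain(r, b) for r, b in zip(rewards, baselines)]
weights = aggregation_weights(gains)
global_params = aggregate([[1.0, 0.0], [0.0, 1.0]], weights)
```

The first client gains 0.1 in absolute reward and the second gains 1.0, yet the first receives the larger weight because its improvement relative to its own baseline is larger. That is the sense in which a relative measure sidesteps divergent reward scales across heterogeneous tasks.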

2606.03087 2026-06-03 cs.LG 版本更新

Learning to Solve, Forgetting to Retain: Correct-Set Turnover in RLVR

学会解决,忘记保留:RLVR中的正确集更替

Chuanyu Qin, Chenxu Yang, Qingyi Si, Naibin Gu, Peng Fu, Zheng Lin

发表机构 * Institute of Information Engineering, Chinese Academy of Sciences(中国科学院信息工程研究所) School of Cyber Security, University of Chinese Academy of Sciences(中国科学院大学网络安全学院) JD.COM(京东)

AI总结 针对强化学习可验证奖励(RLVR)中模型遗忘已解决问题的问题,提出正确集更替现象和修复窗口原则,并设计保留感知的回顾机制\method{},通过零额外开销的预部署批量替换提升多模态任务性能。

详情
AI中文摘要

强化学习可验证奖励(RLVR)提升了大型语言模型的能力,然而头条准确率的提升往往掩盖了一个隐藏代价:随着训练进行,先前解决的问题悄然变得无法解决。我们将此现象定义为\emph{正确集更替},代表了在已掌握集上解决方案获取与退化的耦合动态。在此视角下,保留与获取一样成为明确的优化目标。我们分析并实证建立了\emph{修复窗口原则}:恢复退化提示的成本随回顾延迟急剧增加,定义了一个标准RLVR流程未能利用的低成本窗口。为解决此问题,我们提出\method{},一种保留感知的回顾机制,追踪已掌握提示并定期重新引入以\emph{提醒}模型先前的解决方案。通过利用预部署批量替换,\method{}引入零额外部署开销。在涵盖图像-文本、视频和纯文本任务的20个基准上,使用Qwen3-VL和Qwen2.5-Math进行评估,\method{}在GRPO、DAPO和回放基线上持续提升性能,展示了跨模态和算法的稳健泛化能力。

英文摘要

Reinforcement learning with verifiable rewards (RLVR) improves the ability of large language models, yet headline accuracy gains often conceal a hidden cost: previously solved problems quietly become unsolvable as training proceeds. We frame this phenomenon as \emph{correct-set turnover}, representing the coupled dynamics of solution acquisition and regression over the mastered set. Under this view, retention becomes an explicit optimization target alongside acquisition. We analytically and empirically establish the \emph{repair-window principle}: the cost of restoring a regressed prompt grows sharply with review delay, defining a low-cost window that standard RLVR pipelines fail to exploit. To address this, we propose \textbf{\method{}}, a retention-aware review mechanism that tracks mastered prompts and periodically reintroduces them to \textbf{remind} the model of previous solutions. By utilizing pre-rollout batch replacement, \method{} incurs zero additional rollout overhead. Evaluated across 20 benchmarks spanning image-text, video, and text-only tasks with Qwen3-VL and Qwen2.5-Math, \method{} consistently improves performance over GRPO, DAPO, and replay baselines, demonstrating robust generalizability across modalities and algorithms.
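Pre-rollout batch replacement can be sketched as a scheduling step that swaps some fresh prompts for mastered prompts that are due for review, keeping the batch size fixed. The review interval, the oldest-first ordering, and the cap on review slots are hypothetical choices; the abstract only states that mastered prompts are tracked and periodically reintroduced.

```python
def build_batch(fresh_prompts, last_review, step, review_interval, max_review):
    """Replace up to `max_review` fresh prompts with mastered prompts due for review.

    last_review maps each mastered prompt to the step at which it was last seen.
    The batch size never changes, so no extra rollouts are generated.
    """
    max_review = min(max_review, len(fresh_prompts))
    due = sorted(
        (p for p, t in last_review.items() if step - t >= review_interval),
        key=lambda p: last_review[p],          # longest-unseen prompts first
    )[:max_review]
    batch = due + fresh_prompts[: len(fresh_prompts) - len(due)]
    for p in due:
        last_review[p] = step                  # restart the review clock
    return batch

mastered = {"a": 0, "b": 8, "c": 2}            # prompt -> last review step
fresh = ["f1", "f2", "f3", "f4"]
batch = build_batch(fresh, mastered, step=10, review_interval=5, max_review=2)
```

Because the replacement happens before rollouts are generated, the number of rollouts per step stays the same as in the unmodified pipeline, and a short interval keeps each review inside the low-cost repair window.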

2606.03074 2026-06-03 cs.LG cs.SY eess.SY 版本更新

RMPrior: Bridging Propagation Priors and Diffusion Refinement for Efficient Radio Map Construction

RMPrior: 融合传播先验与扩散精炼的高效无线电地图构建

Zixuan Guo, Xiucheng Wang, Nan Cheng

发表机构 * Zhejiang University(浙江大学)

AI总结 提出一种中起点采样策略,通过将传播先验扰动至中间扩散时间步,仅执行剩余反向步骤,在加速2.01倍的同时提升重建质量,并理论分析了初始化差距的上界及截断条件。

详情
AI中文摘要

扩散模型通过迭代去噪实现高保真无线电地图构建,但其采样成本限制了在需要反复更新无线电地图的动态无线系统中的实用性。同时,经典传播模型编码了有价值的场景级知识,而标准扩散推理通过从纯高斯噪声初始化完全丢弃了这些知识。本文通过中起点采样策略桥接了传播先验与扩散精炼。匹配的传播先验被扰动至中间扩散时间步,预训练的扩散骨干仅执行剩余的反向步骤,将计算集中在多径感知精炼上,而非从噪声完全重建。我们提供了理论分析,建立了初始化差距的上界、截断提高重建保真度的充分条件,以及在激进截断下先验质量敏感性的形式化刻画。在IRT4HighRes上的实验表明,在$P_{\text{start}}=0.5$时,所提方法实现了2.01倍的加速,同时在NMSE、RMSE、SSIM和PSNR上均优于全步基线。跨三个不同保真度传播模型的先验质量消融实验证实,重建质量跟踪先验质量,且敏感性在更短的反向轨迹下放大,与理论预测一致。这些结果还表明,中起点重建质量可作为不同传播模型场景级保真度排序的代理指标。

英文摘要

Diffusion models achieve high-fidelity radio map construction through iterative denoising, yet their sampling cost limits practicality in dynamic wireless systems where radio maps must be refreshed repeatedly. Meanwhile, classical propagation models encode valuable scene-level knowledge that standard diffusion inference discards entirely by initializing from pure Gaussian noise. This paper bridges propagation priors and diffusion refinement through a mid-start sampling strategy. A matched propagation prior is perturbed to an intermediate diffusion timestep, and the pretrained diffusion backbone executes only the remaining reverse steps, focusing computation on multipath-aware refinement rather than full reconstruction from noise. We provide theoretical analysis establishing an upper bound on the initialization gap, a sufficient condition under which truncation improves reconstruction fidelity, and a formal characterization of prior-quality sensitivity under aggressive truncation. Experiments on IRT4HighRes show that, at $P_{\text{start}}=0.5$, the proposed method achieves a $2.01\times$ speedup while simultaneously improving NMSE, RMSE, SSIM, and PSNR over the full-step baseline. A prior-quality ablation across three propagation models of different fidelity confirms that reconstruction quality tracks prior quality, with the sensitivity amplified under shorter reverse trajectories, consistent with the theoretical predictions. These results also suggest that mid-start reconstruction quality can serve as a proxy for ranking the scene-level fidelity of different propagation models.
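The mid-start idea can be sketched under a standard DDPM-style forward process, which is an assumption here since the abstract does not name the noise schedule. The linear beta schedule, the reading of $P_{\text{start}}$ as a fraction of the full trajectory, and the `reverse_step` callback standing in for the pretrained backbone are all hypothetical.

```python
import math
import random

def alpha_bar(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    out, prod = [], 1.0
    for t in range(T):
        beta = beta_start + (beta_end - beta_start) * t / (T - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out

def perturb_prior(prior, abar_t, rng):
    """Forward-diffuse the propagation prior to an intermediate timestep."""
    a, s = math.sqrt(abar_t), math.sqrt(1.0 - abar_t)
    return [a * x + s * rng.gauss(0.0, 1.0) for x in prior]

def mid_start_sample(prior, reverse_step, T, p_start, rng):
    """Start the reverse chain at t_start = p_start * T instead of at pure noise."""
    t_start = max(1, int(p_start * T))
    x = perturb_prior(prior, alpha_bar(T)[t_start - 1], rng)
    steps = 0
    for t in range(t_start - 1, -1, -1):   # only the remaining reverse steps
        x = reverse_step(x, t)
        steps += 1
    return x, steps

rng = random.Random(0)
prior_map = [0.1, 0.2, 0.3]                # stand-in for a propagation-model radio map
identity_step = lambda x, t: x             # placeholder for the pretrained denoiser
sample, steps_run = mid_start_sample(prior_map, identity_step, T=1000,
                                     p_start=0.5, rng=rng)
```

Under these assumptions, a start fraction of 0.5 runs half of the reverse steps, which is in line with the roughly twofold speedup reported above. The prior's quality matters more as the start fraction shrinks, because fewer reverse steps remain to correct its errors.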

2606.03073 2026-06-03 cs.LG cs.AI 版本更新

Efficient Hyperparameter Optimization for LLM Reinforcement Learning

大语言模型强化学习的高效超参数优化

Minping Chen, Bowen Xiao, Du Liang, Chuxuan Zeng, Zeyi Wen

发表机构 * The Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州)) The Hong Kong University of Science and Technology(香港科技大学) China United Network Communications Group(中国联合网络通信集团)

AI总结 提出联合保真度超参数优化方法,通过同时调整模型大小和训练预算作为保真度,并集成早停策略和检查点机制,显著提升计算效率(每轮最高14.9倍)且性能提升5.8%-111.6%。

Comments 12 pages, 6 figures, accepted at ACL 2026

详情
AI中文摘要

大语言模型的强化学习对超参数配置高度敏感,使得超参数优化至关重要但计算成本高昂。现有的多保真度超参数优化方法由于模型规模庞大和训练周期资源密集,在LLM RL中仍然效率低下。本文提出联合保真度超参数优化(JF-HPO),它同时调整模型大小和训练预算作为保真度。JF-HPO通过以下方式实现:(i)在每次HPO试验中,利用目标LLM的小型代理模型进行高效训练和评估;(ii)基于训练动态整合精心设计的早停策略;(iii)引入高效的检查点机制以消除冗余计算。与现有HPO方法相比,JF-HPO显著提高了每次试验的计算效率(最高达14.9倍),同时在相同时间预算下达到更好或具有竞争力的预测精度。值得注意的是,与使用VeRL配方中的超参数配置相比,JF-HPO的性能提升范围从5.8%到111.6%。

英文摘要

Reinforcement learning (RL) for large language models (LLMs) is highly sensitive to hyperparameter configurations, making hyperparameter optimization (HPO) essential yet computationally expensive. Existing multi-fidelity HPO methods remain inefficient for LLM RL due to the massive model scale and resource-intensive training cycles. In this paper, we propose Joint Fidelity Hyperparameter Optimization (JF-HPO), which simultaneously adapts both model size and training budget as fidelity. JF-HPO rests on three components: (i) a small proxy model of the target LLM for efficient training and evaluation in each HPO trial; (ii) carefully designed early-stopping strategies based on training dynamics; and (iii) an efficient checkpointing mechanism that eliminates redundant computation. Compared with existing HPO methods, JF-HPO significantly improves the computational efficiency of each trial (up to 14.9 times), while achieving better or competitive predictive accuracy under the same time budget. Notably, compared with utilizing hyperparameter configurations from the VeRL Recipe, JF-HPO delivers performance improvements ranging from 5.8% to 111.6%.

2606.03069 2026-06-03 cs.CV cs.AI cs.LG 版本更新

ROBUST-WT: Robust Uncertainty-aware Segmentation Transform via Whitening and Training Enhancements

ROBUST-WT: 通过白化和训练增强的鲁棒不确定性感知分割变换

Aqsa Naseer, Maryam Bibi, Syeda Samiya Urooj, Muhammad Khurram Shahzad

发表机构 * SEECs, University of Engineering and Technology, Lahore, Pakistan(工程与技术大学,拉合尔,巴基斯坦)

AI总结 针对WT-PSE框架的四个局限性,提出域自适应增强、混合损失函数、课程式权重调度和消融控制标志四种改进,在眼底视盘分割中Dice达0.956。

Comments 8 pages, 6 figures; code available at https://github.com/213269/WT-PSE-code-main

详情
AI中文摘要

医学图像的广义分割可防止跨多个领域使用不同成像设备和临床协议时的性能下降。基于白化变换的概率形状正则化提取器(WT-PSE)发表于2024年IEEE Transactions on Medical Imaging,通过特征去相关和基于Wasserstein距离的知识蒸馏实现鲁棒的跨域分割。本研究系统性地检查了对WT-PSE学习框架的改进。识别出原始实现中的四个局限性:有限的训练增强无法模拟真实的扫描仪变化;依赖逐像素二元交叉熵损失对边缘噪声敏感;缺乏调度损失加权策略可能导致早期训练不稳定;以及缺乏用于受控科学比较的消融开关。为解决这些问题,我们提出四项增强:(1) 域自适应增强,包括随机擦除、伽马校正和椒盐噪声;(2) 混合BCE和Dice损失函数,用于在噪声条件下改进边缘感知分割;(3) 基于课程的Dice权重调度策略;(4) 命令行控制标志用于系统消融研究。在眼底视盘分割基准上的实验表明,改进后的流程在最终epoch的视盘Dice得分为0.956,ASD得分为13.31,优于基线epoch-5的Dice得分0.939。这些结果表明,在不修改底层WT-PSE架构的情况下,训练层面的改进可以提供一致的性能提升。

英文摘要

Generalized segmentation of medical images prevents performance degradation when different imaging devices and clinical protocols are used across multiple domains. The Whitening Transform-based Probabilistic Shape Regularization Extractor (WT-PSE), published in IEEE Transactions on Medical Imaging in 2024, addresses this challenge by employing feature decorrelation and Wasserstein distance-based knowledge distillation to achieve robust cross-domain segmentation. This study systematically examines improvements to the WT-PSE learning framework. Four limitations in the original implementation are identified: limited training augmentations that fail to simulate real scanner variations, reliance on per-pixel binary cross-entropy loss that is sensitive to edge noise, the absence of a scheduled loss weighting strategy that may destabilize early training, and the lack of ablation switches for controlled scientific comparison. To address these issues, we propose four enhancements: (1) domain-adaptive augmentation including random erasing, gamma correction, and salt-and-pepper noise; (2) a hybrid BCE and Dice loss function for improved edge-aware segmentation under noisy conditions; (3) a curriculum-based Dice weight scheduling strategy; and (4) command-line control flags for systematic ablation studies. Experiments on the fundus optic disc segmentation benchmark demonstrate that the improved pipeline achieves a final epoch optic-disc Dice score of 0.956 and an ASD score of 13.31, outperforming the baseline epoch-5 Dice score of 0.939. These results indicate that training-level improvements can provide consistent performance gains without modifying the underlying WT-PSE architecture.
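Enhancements (2) and (3) can be illustrated with a minimal sketch of a hybrid BCE and Dice loss whose Dice weight grows over training. This is not the authors' implementation: the abstract does not give the schedule, so the linear ramp, the `warmup_epochs` and `max_weight` parameters, and all function names are assumptions made for illustration.

```python
import math

def bce_loss(probs, targets, eps=1e-7):
    """Mean per-pixel binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

def dice_weight(epoch, warmup_epochs=5, max_weight=0.5):
    """Hypothetical curriculum: ramp the Dice weight linearly from 0 to max_weight."""
    return max_weight * min(1.0, epoch / warmup_epochs)

def hybrid_loss(probs, targets, epoch):
    """Blend BCE and Dice; early epochs are BCE-dominated, later ones add the region-level term."""
    w = dice_weight(epoch)
    return (1.0 - w) * bce_loss(probs, targets) + w * dice_loss(probs, targets)
```

With this assumed schedule the loss is pure BCE at epoch 0 and reaches an even BCE/Dice blend from epoch 5 onward, which is one plausible way to avoid destabilizing early training.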

2606.03068 2026-06-03 cs.LG cs.AI 版本更新

Learn When and Where to Connect: Adaptive Virtual Nodes for Dynamic Message Passing on Graphs

学习何时何地连接:图上动态消息传递的自适应虚拟节点

Jaejun Lee, Joyce Jiyoung Whang

发表机构 * School of Computing, KAIST(计算机学院,韩国科学技术院) Department of AI Computing, KAIST(人工智能计算系,韩国科学技术院)

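AI summary Proposes the MAVN framework, which decides in an end-to-end differentiable way at which layer of a message passing neural network, and for which nodes, to introduce virtual nodes, and builds the connections with a dual-perspective scoring mechanism; it is proven able to simulate any node-to-virtual-node connectivity pattern, and experiments show it substantially improves backbone performance on multiple datasets.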

Comments 12 pages, 6 figures, 10 tables, 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

详情
AI中文摘要

虽然虚拟节点(VN)常用于消息传递神经网络(MPNN)中以促进有效的消息传递,但现有的基于VN的方法存在局限性,例如限制所有节点连接到相同数量的VN、在应用MPNN之前固定连接,以及独立于连接到同一VN的其他节点而将节点连接到VN。我们提出了MAVN,一个端到端可微分的MPNN框架,允许节点和VN之间无约束的连接,并根据跨层演化的节点表示动态按需引入VN。具体来说,MAVN学习基于连接的相对重要性自适应地决定何时(在哪一层)以及何地(连接到哪些节点)引入和连接VN。从候选VN池中,MAVN在每一层选择必要的VN,每个选中的VN连接到非空节点子集,由双向评分机制引导,该机制同时捕捉节点对VN的偏好和VN对节点的偏好。我们理论上证明,对于任何节点-VN连接模式,都存在一组MAVN参数可以模拟该模式。在九个真实世界数据集上的实验表明,MAVN持续提升骨干MPNN的性能,相对于骨干网络实现高达46.5%的提升,并优于基线方法。

英文摘要

While Virtual Nodes (VNs) are often utilized in Message Passing Neural Networks (MPNNs) to facilitate effective message passing, existing VN-based methods have limitations, such as constraining all nodes to connect to the same number of VNs, fixing the connections before applying MPNNs, and connecting a node to a VN independently of the other nodes that connect to the same VN. We propose MAVN, an end-to-end differentiable MPNN framework that allows non-constrained connections between nodes and VNs and dynamically introduces VNs on demand in response to evolving node representations across layers. Specifically, MAVN learns to adaptively determine when (at which layer) and where (to which nodes) to introduce and connect VNs based on the relative importance of connections. From a pool of candidate VNs, MAVN selects the necessary VNs in each layer, where each selected VN is connected to a nonempty subset of nodes, guided by a dual-perspective scoring mechanism that jointly captures the nodes' preferences for VNs and the VNs' preferences for nodes. We theoretically prove that for any node-VN connectivity pattern, there exists a set of MAVN's parameters that can simulate the pattern. Experiments on nine real-world datasets demonstrate that MAVN consistently improves the performance of backbone MPNNs, achieving up to 46.5% improvement over the backbones and outperforming the baselines.

2606.03061 2026-06-03 cs.DC cs.AI cs.LG cs.NI cs.SY eess.SY 版本更新

Brief Announcement: Generative Markov Model for Distributed Computing Systems

简要公告:分布式计算系统的生成马尔可夫模型

Alfreds Lapkovskis, Ali Beikmohammadi, Sindri Magnússon, Praveen Kumar Donta

发表机构 * Department of Computer and Systems Sciences, Stockholm University, Sweden(斯德哥尔摩大学计算机与系统科学系)

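AI summary Addressing the heterogeneity and complexity of distributed computing systems, proposes a generative Markov model based on structured state factorization that enables tractable simulation, inference, and policy learning, and validates it with a collaborative AI inference case study.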

Comments Submitted to 40th International Symposium on Distributed Computing (DISC 2026)

详情
AI中文摘要

新兴的分布式计算范式,如计算连续体,本质上是异构、随机和复杂的。高效且有效地利用连续体中所有可用资源需要一个统一的系统形式化模型。为了解决这一差距,我们提出了一个通用框架,将分布式计算系统建模为生成马尔可夫模型,该模型在结构化系统状态上进行分解。在我们的模型中,状态分解为高维变量,每个变量进一步在其元素上分解,反映了分布式系统固有的稀疏依赖结构。这产生了一个可处理的模型,能够对原本难以处理的系统状态进行模拟、推理和策略学习,从而将分布式计算与马尔可夫链理论和强化学习(RL)联系起来。我们通过一个协作AI推理的案例研究来展示我们的框架,其中专用服务器将资源与服务用户自愿提供的资源相结合。我们的结果表明,集中式调度在规模上成为瓶颈,而将计算分布到用户设备上可减少延迟和服务器资源消耗。这些发现突显了自适应决策在分布式计算系统中的价值,并展示了该框架在建模、模拟和优化方面的实用性。

英文摘要

Emerging distributed computing paradigms, such as the computing continuum, are inherently heterogeneous, stochastic, and complex. Efficiently and effectively utilizing all available resources across the continuum demands a unified formal model of the system. To address this gap, we propose a general framework for modeling distributed computing systems as a generative Markov model, factorized over a structured system state. In our model, the state decomposes into high-dimensional variables, each further factorized over its elements, reflecting the sparse dependency structure inherent to distributed systems. This yields a tractable model enabling simulation, inference, and policy learning over otherwise intractable system states, bridging distributed computing with Markov chain theory and reinforcement learning (RL). We demonstrate our framework through a case study of collaborative AI inference, in which a dedicated server combines resources with those volunteered by service users. Our results show that centralized scheduling becomes a bottleneck at scale, while distributing computation across user devices reduces both latency and server resource consumption. These findings highlight the value of adaptive decision-making in distributed computing systems and demonstrate the framework's utility for modeling, simulation, and optimization.

2606.03057 2026-06-03 cs.LG cs.AI 版本更新

Rethinking Molecular Text Representations for LLMs: An Empirical Study

重新思考用于大语言模型的分子文本表示:一项实证研究

Arun Raja, Garrett M. Morris, Kian Ming A. Chai

发表机构 * University of Oxford(牛津大学) DSO National Laboratories(DSO国家实验室)

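AI summary Through a systematic benchmark, evaluates 16 LLMs across 9 molecular representations and 8 chemical tasks, finding that the choice of representation strongly affects results: structured text representations (CML, MolJSON) lead on structural tasks, IUPAC leads on semantic tasks, and SMILES is rarely optimal.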

Comments 25 pages, 11 figures, 20 tables

详情
AI中文摘要

大语言模型(LLMs)越来越多地用于分子任务,但目前尚不清楚使用哪种分子表示。我们提出了一个系统基准测试,评估了LLM在九种表示和八种化学任务上的分子能力。我们基准测试了16个LLM,涵盖五个模型家族,包括推理和非推理变体、化学专用LLM以及封闭前沿模型。性能强烈依赖于表示,没有单一表示在所有任务中获胜,尽管CML是最好的,其次是MolJSON、InChI,然后是规范SMILES。显式结构化文本表示(CML和MolJSON)主导结构任务;IUPAC主导语义任务,在所有16个LLM的分子检索中获胜;而SMILES变体尽管在预训练中普遍存在,但很少是最优的。化学专用模型在使用SMILES时表现良好,但使用结构化文本表示时性能大幅下降,这表明仅基于SMILES的评估奖励了不具泛化能力的专业化。使用LLM作为评判者,我们发现IUPAC产生的正确分子生成比例最高。通过分词审计、线性探针和注意力的机制研究表明,表示在模型内部以不同方式编码;例如,结构化表示需要跨分子范围的更高注意力。我们的结果反对表示不变的评估,并激励基于LLM的化学任务感知表示路由。

英文摘要

Large language models (LLMs) are increasingly used for molecular tasks, but it remains unclear which molecular representation to use. We present a systematic benchmark evaluating LLM molecular competence across nine representations and eight chemical tasks. We benchmark 16 LLMs across five model families, including reasoning and non-reasoning variants, chemistry-specialized LLMs, and closed frontier models. Performance is strongly representation-dependent and no single representation wins across tasks, though CML is the best, followed by MolJSON, InChI, and then canonical SMILES. Explicit structured text representations (CML and MolJSON) dominate structural tasks; IUPAC dominates semantic tasks, winning molecule retrieval for all 16 LLMs; and SMILES variants are rarely optimal despite their prevalence in pretraining. Chemistry-specialized models perform well with SMILES at the cost of large degradations with structured text representations, suggesting SMILES-only evaluation rewards specialization that does not generalize. Using LLM-as-a-judge, we find that IUPAC produces the highest fraction of correct molecule generations. A mechanistic study via tokenization audits, linear probes and attention shows that representations are encoded differently inside the model; for example, structured representations require higher attention across the molecular span. Our results argue against representation-invariant evaluation and motivate task-aware representation routing for LLM-based chemistry.

2606.03052 2026-06-03 cs.LG 版本更新

What Do Students Learn? A Feature-Level Analysis of Dark Knowledge

学生学到了什么?暗知识的特征级分析

Seungu Kang, Songkuk Kim

发表机构 * Yonsei University(延世大学)

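AI summary Uses the Interaction Tensor framework to analyze feature learning in student models under knowledge distillation, finding that effective distillation acts as a regularizer that removes low-frequency, sample-specific features, and proposes Confusion Distillation (CD), a teacher-free self-distillation method based on the confusion matrix that outperforms existing self-distillation methods on CIFAR-100.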

Comments Accepted at ICPR 2026

详情
AI中文摘要

知识蒸馏(KD)是模型压缩的强大工具,然而学生模型获取特征表示的确切机制仍未充分探索。在这项工作中,我们使用交互张量框架分析学生特征学习。我们的分析表明,有效的KD充当正则化器,修剪低频、样本特定的特征,鼓励学生依赖一组紧凑的高可重用特征。至关重要的是,我们观察到数据集级别的混淆矩阵包含类似于教师“暗知识”的结构信息。利用这一见解,我们提出了混淆蒸馏(CD),一种无教师自蒸馏方法,利用模型自身不断演化的混淆模式作为动态软目标。CD在CIFAR-100上的ResNet-34和ResNet-50上取得了有竞争力的性能,比现有的自蒸馏方法如CS-KD和PS-KD高出1.2%,同时提供了标准KD的计算高效替代方案。

英文摘要

Knowledge Distillation (KD) is a powerful tool for model compression, yet the precise mechanisms by which student models acquire feature representations remain underexplored. In this work, we analyze student feature learning using the Interaction Tensor framework. Our analysis reveals that effective KD acts as a regularizer that prunes low-frequency, sample-specific features, encouraging the student to rely on a compact set of highly reusable features. Crucially, we observe that the dataset-level confusion matrix contains structural information analogous to the teacher's "Dark Knowledge." Leveraging this insight, we propose Confusion Distillation (CD), a teacher-free self-distillation method that utilizes the model's own evolving confusion patterns as dynamic soft targets. CD achieves competitive performance on ResNet-34 and ResNet-50 for CIFAR-100, outperforming existing self-distillation methods like CS-KD and PS-KD by 1.2% while offering a computationally efficient alternative to standard KD.
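The idea of using the model's own confusion patterns as soft targets can be sketched as below. The abstract does not specify how Confusion Distillation builds its targets, so everything here is an assumption for illustration: a running confusion-count matrix, additive smoothing, and row normalization, with hypothetical function names.

```python
def update_confusion(confusion, true_label, predicted_label):
    """Accumulate one prediction into a dataset-level confusion-count matrix."""
    confusion[true_label][predicted_label] += 1

def soft_targets(confusion, true_label, smoothing=1.0):
    """Assumed construction: the smoothed, normalized confusion row of the true class."""
    row = [count + smoothing for count in confusion[true_label]]
    total = sum(row)
    return [value / total for value in row]
```

Because the matrix is updated as training proceeds, the targets returned for a class change over time, which matches the "evolving confusion patterns as dynamic soft targets" description at a high level.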

2606.03040 2026-06-03 cs.AI cs.LG 版本更新

RelGT-AC: A Relational Graph Transformer for Autocomplete Tasks in Relational Databases

RelGT-AC:用于关系数据库中自动完成任务的关系图Transformer

Phillip Jiang

发表机构 * Appsofa LLC(Appsofa公司)

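AI summary Proposes the RelGT-AC model, which uses a column masking strategy, a unified task head, and a TF-IDF text encoder to outperform a GraphSAGE baseline on autocomplete tasks in relational databases.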

Comments 12 pages, 6 figures. Code and model checkpoints available at https://github.com/jiangdmv/graph-transformer

详情
AI中文摘要

关系数据库支撑着现代企业、科学和医疗系统,但由于其多表、异构和时间结构,对此类数据进行预测性机器学习仍然具有挑战性。关系深度学习(RDL)通过将数据库表示为异构图并直接应用图神经网络(GNN)来解决这一问题。RelBench v2最近引入了自动完成任务——一种实际动机的任务类型,其目标是从关系上下文中预测现有列值,类似于智能表单填充助手。我们提出了RelGT-AC(用于自动完成的关系图Transformer),通过三个有针对性的贡献扩展了RelGT架构:(1)一种列掩码策略,通过在子图编码期间屏蔽目标列来防止平凡解;(2)一个统一的任务头,支持在单个模型内进行二分类、多分类和回归自动完成任务;(3)一个TF-IDF文本编码器,自动检测和编码自由文本列,恢复分类编码器丢弃的强词汇信号。在跨越3个RelBench v2数据集(rel-trial、rel-f1、rel-stack)的7个任务中,RelGT-AC在所有3个回归自动完成任务上优于GraphSAGE基线,并通过TF-IDF编码器在文本密集的资格任务上实现了高达+10 AUROC点的提升。

英文摘要

Relational databases underpin modern enterprise, scientific, and healthcare systems, yet predictive machine learning on such data remains challenging due to their multi-table, heterogeneous, and temporal structure. Relational Deep Learning (RDL) addresses this by representing databases as heterogeneous graphs and applying graph neural networks (GNNs) directly. RelBench v2 recently introduced autocomplete tasks -- a practically motivated task type where the goal is to predict an existing column value from relational context, analogous to an intelligent form-filling assistant. We propose RelGT-AC (Relational Graph Transformer for Autocomplete), extending the RelGT architecture with three targeted contributions: (1) a column masking strategy that prevents trivial solutions by masking the target column during subgraph encoding; (2) a unified task head supporting binary classification, multiclass classification, and regression autocomplete tasks within a single model; and (3) a TF-IDF text encoder that automatically detects and encodes free-text columns, recovering strong lexical signal that categorical encoders discard. Across 7 tasks spanning 3 RelBench v2 datasets (rel-trial, rel-f1, rel-stack), RelGT-AC outperforms the GraphSAGE baseline on all 3 regression autocomplete tasks and achieves up to +10 AUROC points on text-heavy eligibility tasks via the TF-IDF encoder.

2606.03038 2026-06-03 cs.LG physics.comp-ph physics.optics 版本更新

Will Accurate Fields Mislead Photonic Design? From Global Accuracy to Port Readout

精确的场会误导光子设计吗?从全局精度到端口读出

Yitian Zhang, Yonghong Chen, Youming Chen, Yiyang Li, Xing Zhe, Renhe Lu, Shaolin Liao, Yuzhe Ma, Zhong Guan

发表机构 * Sun Yat-sen University(中山大学) The Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州))

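AI summary Addressing photonic design cases where global field accuracy is high but port readout is unreliable, proposes the propagation-aligned neural operator PaNO and its output-aware variant PaNO-R2, cutting port-power error by 72.7% on an MMI splitter benchmark.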

详情
AI中文摘要

神经场代理可以加速光子设计循环,但一个在全局场误差上看起来精确的代理,当最终决策依赖于局部输出端口读出时,仍可能对候选器件进行错误排序。这种风险在传播主导的MMI分束器和耦合器中尤为严重,其中端口功率、分束、相位和耦合由累积的模态干涉和输出窗口聚合决定,而不仅仅是平均场相似性。我们通过场/中介/读出视角研究这种场到设计的不匹配,将密集复场误差与传播轮廓和输出窗口误差在端口聚合前分离。为了将代理与此链对齐,我们提出PaNO,一种传播对齐的神经算子,它保持全场预测接口,同时围绕局部边界结构、横向模态内容、轴向传播和交叉模态交互组织潜在状态。我们还评估了PaNO-R2,一种针对端口区域附近残余场分量的输出感知反馈变体。在具有4608个保留场的15波长可调谐$3\times3$ MMI基准上,PaNO将NeurOLight的端口功率误差从0.2018降低到0.0739,尽管cMAE略有升高,表明仅全局场精度不足以实现设计相关的读出保真度。PaNO-R2获得了最佳的cMAE、传播轮廓误差、输出轮廓误差和端口功率误差,将NeurOLight的端口功率和输出轮廓误差分别降低了72.7%和72.5%。

英文摘要

Neural field surrogates can accelerate photonic design loops, but a surrogate that looks accurate in global field error can still mis-rank candidate devices when the final decision depends on localized output-port readouts. This risk is acute in propagation-dominated MMI splitters and couplers, where port power, splitting, phase, and coupling are determined by accumulated modal interference and output-window aggregation rather than by average field similarity alone. We study this field-to-design mismatch through a Field/Mediator/Readout view that separates dense complex-field error from propagation-profile and output-window errors before port aggregation. To align the surrogate with this chain, we propose PaNO, a propagation-aligned neural operator that keeps the full-field prediction interface while organizing latent states around local boundary structure, transverse modal content, axial propagation, and cross-mode interaction. We also evaluate PaNO-R2, an output-aware feedback variant for residual field components near the port region. On a 15-wavelength tunable $3{\times}3$ MMI benchmark with 4608 held-out fields, PaNO lowers NeurOLight's port-power error from 0.2018 to 0.0739 despite slightly higher cMAE, showing that global field accuracy alone is not sufficient for design-relevant readout fidelity. PaNO-R2 attains the best cMAE, propagation-profile error, output-profile error, and port-power error, reducing NeurOLight's port-power and output-profile errors by 72.7% and 72.5%, respectively.

2606.03026 2026-06-03 cs.NE cs.AI cs.LG 版本更新

Spike-Aware C++ INT8 Inference for Sparse Spiking Language Models on Commodity CPUs

面向稀疏脉冲语言模型在商用CPU上的脉冲感知C++ INT8推理

Ting Liu

发表机构 * SymbolicLight Research(SymbolicLight研究院)

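AI summary Proposes a spike-aware C++ inference runtime that uses sparse binary spike states as an execution primitive and combines a mixed layout, AVX2/FMA kernels, and INT8 quantization to decode spiking language models efficiently on commodity CPUs, with higher throughput than dense models of similar scale but somewhat lower quality.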

Comments 11 pages, 7 tables

详情
AI中文摘要

脉冲语言模型展现出激活稀疏性,而稠密Transformer运行时无法直接利用。本文从系统角度研究这一特性。基于SymbolicLight V1脉冲门控语言模型家族,我们实现了一个C++ CPU推理运行时,将稀疏二进制脉冲状态视为执行原语,而非仅应用事后权重压缩。该运行时结合了清单驱动的权重加载器、混合行/列内存布局、AVX2/FMA内核、每通道对称INT8量化以及脉冲条件稀疏路径的整数域累加。在AMD Ryzen 7 5800X上,早期标量FP32基线解码速度为9.5 tokens/s。混合布局AVX2 FP32将其提升至14.7 tokens/s,而AVX2 INT8在相同step-30k导出模型上达到19.9 tokens/s,同时将权重占用从3.49 GB降至1.06 GB。对于可用的186k步874M参数INT8导出模型,C++运行时在单线程CPU基准测试中解码速度为22.63 tokens/s,相比之下,TinyLlama-1.1B Q8_0为16.31 tokens/s,Falcon3-1B Q8_0为11.26 tokens/s,Qwen2.5-1.5B Q8_0为9.70 tokens/s。线程扩展在四个CPU线程时达到47.90 tokens/s,512 token预填充从单线程的29.86 tokens/s提升至八线程的94.68 tokens/s。吞吐量提升伴随着质量代价:SNN报告WikiText-2困惑度为24.80,差于同一基准中的稠密基线。我们将结果定位为稀疏语言运行时的推理系统研究,长期动机在于可能受益于传感器和执行器附近本地低核推理的具身和边缘智能体。脉冲感知执行可以改善稀疏脉冲语言模型的CPU吞吐量和内存行为,而模型质量、受控稠密训练基线、具身任务评估和测量CPU能耗仍是开放问题。

英文摘要

Spiking language models expose activation sparsity that dense Transformer runtimes do not directly exploit. This paper studies that property from a systems perspective. Building on the SymbolicLight V1 spike-gated language model family, we implement a C++ CPU inference runtime that treats sparse binary spike states as an execution primitive rather than only applying post-hoc weight compression. The runtime combines a manifest-driven weight loader, mixed row/column memory layout, AVX2/FMA kernels, per-channel symmetric INT8 quantization, and integer-domain accumulation for spike-conditioned sparse paths. On an AMD Ryzen 7 5800X, an early scalar FP32 baseline decodes at 9.5 tokens/s. Mixed-layout AVX2 FP32 raises this to 14.7 tokens/s, and AVX2 INT8 reaches 19.9 tokens/s on the same step-30k export while reducing the weight footprint from 3.49 GB to 1.06 GB. For the available 186k-step 874M-parameter INT8 export, the C++ runtime decodes at 22.63 tokens/s in a single-thread CPU benchmark, compared with 16.31 tokens/s for TinyLlama-1.1B Q8_0, 11.26 tokens/s for Falcon3-1B Q8_0, and 9.70 tokens/s for Qwen2.5-1.5B Q8_0 under llama.cpp. Thread scaling reaches 47.90 tokens/s at four CPU threads, and 512-token prefill improves from 29.86 to 94.68 tokens/s from one to eight threads. The throughput result comes with a quality cost: the SNN reports WikiText-2 perplexity 24.80, worse than the dense baselines in the same benchmark. We frame the result as an inference-systems study for sparse language runtimes, with longer-term motivation in embodied and edge agents that may benefit from local, low-core inference near sensors and actuators. Spike-aware execution can improve CPU throughput and memory behavior for sparse spiking language models, while model quality, controlled dense training baselines, embodied-task evaluation, and measured CPU energy remain open problems.
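Two of the runtime's ingredients, per-channel symmetric INT8 quantization and integer-domain accumulation on spike-conditioned paths, can be sketched in a few lines. This is an illustrative Python model, not the paper's C++/AVX2 kernels; the function names and the row-per-output-channel weight layout are assumptions.

```python
def quantize_per_channel(weights):
    """Symmetric INT8 quantization with one scale per output channel (row)."""
    q_rows, scales = [], []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0  # all-zero rows keep scale 1
        q_rows.append([max(-127, min(127, round(w / scale))) for w in row])
        scales.append(scale)
    return q_rows, scales

def spike_matvec(q_rows, scales, spikes):
    """Matrix-vector product against a binary spike vector.

    Only columns whose spike is 1 are visited, the accumulator stays an
    integer, and each channel needs a single float multiply at the end.
    """
    active = [j for j, s in enumerate(spikes) if s]
    out = []
    for row, scale in zip(q_rows, scales):
        acc = 0  # integer accumulator
        for j in active:
            acc += row[j]
        out.append(acc * scale)
    return out
```

Because a binary spike only selects columns, the inner loop needs no multiplications, and its cost scales with the number of active spikes rather than the full input width. Column-wise access of this kind is one plausible reason a runtime would keep some weights in a column-oriented layout.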

2606.03017 2026-06-03 cs.LG cs.AI cs.RO 版本更新

ConTraIRL: Factorized Contrastive Abstractions for Transferable IRL

ConTraIRL:用于可迁移逆强化学习的分解对比抽象

Yikang Gui, Bikramjit Banerjee, Prashant Doshi

发表机构 * School of Computing University of Georgia(乔治亚大学计算学院) School of Computing Sciences & Computer Engineering The University of Southern Mississippi(南密西西比大学计算科学与计算机工程学院)

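AI summary Proposes the ConTraIRL framework, which uses dual-encoder contrastive learning to decouple latent representations of environment dynamics and task goals, enabling compositional reward transfer and markedly improving sample efficiency and reward recovery in few-shot transfer on continuous control benchmarks.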

详情
AI中文摘要

当策略必须泛化到未见过的环境动态与任务目标组合时,逆强化学习中的奖励迁移不可靠。我们提出用于可迁移逆强化学习的分解对比抽象(ConTraIRL),该框架通过学习这两个因素的解耦潜在表示来实现组合奖励迁移。ConTraIRL采用双编码器架构,将观测映射到分离的动态和目标的潜在空间,并通过双重对比目标进行训练。时间对齐鼓励动态编码器学习目标不变的结构,而目标编码器捕获动态不变的特征。这种分解支持在重组动态-目标设置下的奖励推断。在连续控制基准上的实验表明,对未见过的动态-目标配对进行有效的少样本迁移,与迁移逆强化学习基线相比,提高了样本效率和奖励恢复。

英文摘要

Reward transfer in Inverse Reinforcement Learning (IRL) is unreliable when policies must generalize to unseen combinations of environment dynamics and task goals. We propose Factorized Contrastive Abstractions for Transferable IRL (ConTraIRL), a framework that enables compositional reward transfer by learning decoupled latent representations of these two factors. ConTraIRL uses a dual-encoder architecture that maps observations into separate dynamics and goal latent spaces, trained with a dual contrastive objective. Temporal alignment encourages the dynamics encoder to learn goal-invariant structure, while the goal encoder captures dynamics-invariant features. This factorization supports reward inference under recombined dynamics-goal settings. Experiments on continuous control benchmarks demonstrate effective few-shot transfer to unseen dynamics-goal pairings, improving sample efficiency and reward recovery over transfer IRL baselines.

2606.03014 2026-06-03 cs.LG cs.AR 版本更新

MOSAIC: Efficient Mixture-of-Agent Scheduling via Adaptive Aggregation and Inference Concurrency

MOSAIC: 通过自适应聚合和推理并发的高效混合智能体调度

Saptarshi Mitra, Yifan Zhang, Rachid Karami, Phyo Pyae Moe Aung, Nazmul Takbir, Sreetama Sarkar, Souvik Kundu, Sitao Huang

发表机构 * University of California, Irvine, USA(加州大学 Irvine 分校) University of Southern California, Los Angeles, USA(南加州大学洛杉矶分校) Intel, USA(英特尔公司)

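AI summary Addressing load imbalance in Mixture-of-Agents systems under limited GPU resources, proposes the MOSAIC framework, built on an integer linear programming scheduler and confidence-aware adaptive aggregation, achieving up to 2.5x expert-stage, 4.23x aggregator-stage, and 1.7x to 2.3x end-to-end speedups with accuracy loss within 0.1 percentage points.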

Comments 13 pages, 8 main pages

详情
AI中文摘要

混合智能体(MoA)系统通过将每个查询路由到多个专家大语言模型并聚合其输出来提高推理准确性。在有限的GPU资源上高效执行此工作负载存在瓶颈。基于技能的调度导致专家需求倾斜,而将指令微调的大语言模型与长推理模型结合会导致生成长度的极端变化。因此,传统的调度策略由于负载不平衡而遭受显著的GPU空闲和吞吐量崩溃。我们提出了MOSAIC,一个加速MoA工作负载的调度框架。首先,我们制定了一个基于整数线性规划(ILP)的调度器,该调度器根据离线分析的成本联合优化专家放置和每个工作线程的提示分配,在工作线程间复制推理专家同时固定轻量级专家。其次,MOSAIC使用置信度感知的自适应聚合,利用专家间一致性来绕过重型最终聚合器大语言模型处理共识查询。在我们的4-GPU系统中,与基线调度器相比,MOSAIC实现了最高2.5倍的专家阶段、4.23倍的聚合阶段和1.7~2.3倍的端到端加速,同时精度匹配在0.1个百分点以内。

英文摘要

Mixture-of-Agents (MoA) systems improve reasoning accuracy by routing each query to multiple expert LLMs and aggregating their outputs. Executing this workload efficiently on limited GPU resources is difficult. Skill-based routing creates skewed expert demand, and combining instruction-tuned LLMs with long-reasoning models results in extreme variability in generation lengths. Consequently, traditional scheduling strategies suffer from significant GPU idling and throughput collapse due to load imbalances. We present MOSAIC, a scheduling framework to accelerate MoA workloads. First, we formulate an Integer Linear Program (ILP) based scheduler that jointly optimizes expert placement and per-worker prompt assignment from offline-profiled costs, replicating reasoning experts across workers while pinning lightweight ones. Second, MOSAIC uses confidence-aware adaptive aggregation, leveraging inter-expert agreement to bypass the heavy final aggregator LLM for consensus queries. In our 4-GPU system, MOSAIC achieves up to 2.5x expert-stage, 4.23x aggregator-stage, and 1.7x to 2.3x end-to-end speedups over the baseline scheduler, while matching accuracy within 0.1pp.
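The aggregator bypass can be pictured with a small sketch. The abstract only says that inter-expert agreement is used to skip the final aggregator on consensus queries, so the exact-match vote, the `agreement_threshold` value, and the `run_aggregator` callback below are all assumptions for illustration.

```python
from collections import Counter

def aggregate(expert_answers, run_aggregator, agreement_threshold=0.75):
    """Return (answer, aggregator_was_called).

    If a large enough fraction of experts gives the same answer, that answer
    is returned directly; otherwise the (expensive) aggregator is invoked.
    """
    counts = Counter(expert_answers)
    answer, votes = counts.most_common(1)[0]
    if votes / len(expert_answers) >= agreement_threshold:
        return answer, False  # consensus: aggregator bypassed
    return run_aggregator(expert_answers), True
```

For example, three matching answers out of four meet the assumed 0.75 threshold and skip the aggregator, while a two-out-of-four split falls through to it.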

2606.03003 2026-06-03 cs.LG cs.AI cs.RO 版本更新

Exact equivariance, kept through training, buys zero-shot generalisation across the symmetry group

精确等变性在训练中保持,实现跨对称群的零样本泛化

Hongbo Wang

发表机构 * Department of Mathematics, Stony Brook University(石溪大学数学系)

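AI summary A latent world model built from an equivariant encoder and predictor has a provably symmetric training loss, so fitting the dynamics on only a subset of orientations mathematically determines the behavior on the whole orbit, giving zero-shot generalization across the symmetry group.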

Comments 92 pages, 11 figures. Core paper plus an extended results-log appendix and a forward-looking theory supplement. All experiments are laptop-scale (CPU/MPS), fully seeded and deterministic

详情
AI中文摘要

由等变编码器 $E$ 和等变预测器 $f$ 构建的潜世界模型继承了其训练损失的可证明对称性:当世界的动力学真正承载一个群 $G$,通过正交表示 $\rho(g)$ 作用于潜变量时,单步预测 relMSE 在整个群上精确不变,因此仅在方向的受限切片上拟合动力学,数学上就确定了整个轨道上的动力学(举一反三)。我们在笔记本电脑规模(CPU/MPS,完全设定随机种子)上端到端验证了这一点。[A] 该对称性在真实的 Muon/AdamW + EMA + VICReg 运行中幸存——组合的编码-预测残差在优化后约为 $10^{-6}$,不仅在初始化时,而且在任何优化器下都成立。[B] 单步误差在整个群上平坦至五位小数,而相同假设类别的非等变基线拟合了切片但在分布外失效(2D 中 VN $\times 1.00$ 对比基线 $\times 13.8$,3D 中 $\times 17.2$,整个 $\mathrm{SE}(3)$ 阶梯上 $\times 157$),且等变模型小 $4.5$-$7.4$ 倍。[C] 相同的等距论证提升到闭环:在匹配的等变规划器下,方向 $g$ 处的控制轨迹恰好是所见轨迹应用 $\rho(g)$ 的结果,因此闭环误差在整个群上不变——在真实 PushT 上的 2D/$\mathrm{SO}(2)$ 中浮点地板精确,在 3D/$\mathrm{SE}(3)$ 中统计平坦(不相交的 95% 置信区间)。我们针对 Sutton 的苦涩教训对先验进行了压力测试:增强、暴力规模和软等变性各自最多缩小跨群任务指标,但从未达到浮点地板精确性。由于等变性在复合下封闭,$H$ 步展开在每个视界上保持平坦($\times 1.00$,$\le 2\times 10^{-7}$),而基线的残差随 $H$ 复合。超出范围:任务成功扫描、无规划器不变性和缩放。

英文摘要

A latent world model built from an equivariant encoder $E$ and an equivariant predictor $f$ inherits a provable symmetry of its training loss: when the world's dynamics genuinely carries a group $G$ acting on latents by an orthogonal representation $\rho(g)$, the one-step prediction relMSE is exactly invariant across the whole group, so fitting the dynamics on a restricted slice of orientations mathematically determines it on the entire orbit (jǔ yī fǎn sān). We verify this end-to-end at laptop scale (CPU/MPS, fully seeded). [A] The symmetry survives a real Muon/AdamW + EMA + VICReg run -- composed encode-then-predict residual $\sim 10^{-6}$ after optimisation, not just at initialisation, and under any optimiser. [B] One-step error is flat to five digits across the group, while a same-hypothesis-class non-equivariant baseline fits the slice but breaks out-of-distribution (VN $\times 1.00$ vs baseline $\times 13.8$ in 2D, $\times 17.2$ in 3D, $\times 157$ over the full $\mathrm{SE}(3)$ ladder), with the equivariant model $4.5$-$7.4\times$ smaller. [C] The same isometry argument lifts to closed loop: under a matching equivariant planner the control trajectory at orientation $g$ is exactly $\rho(g)$ applied to the seen one, so closed-loop error is invariant across the group -- float-floor-exact in 2D/$\mathrm{SO}(2)$ on real PushT and statistically flat in 3D/$\mathrm{SE}(3)$ (disjoint 95% CIs). We stress-test the prior against Sutton's Bitter Lesson: augmentation, brute-force scale, and soft-equivariance each close at most the across-group task metric, never the float-floor exactness. Because equivariance is closed under composition, the $H$-fold rollout stays flat ($\times 1.00$, $\le 2\times 10^{-7}$) at every horizon, while the baseline's residual compounds with $H$. Out of scope: task-success sweeps, planner-free invariance, and scaling.
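The invariance claim can be checked numerically in the simplest setting. The sketch below is a toy 2D/$\mathrm{SO}(2)$ illustration, not the paper's model: it works directly on latents (the encoder is taken as the identity), uses a hand-picked linear predictor that commutes with rotations, and defines relMSE as squared error divided by squared target norm. All of these are assumptions.

```python
import math

def rotate(v, theta):
    """Apply the 2D rotation rho(g) with angle theta to vector v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def predictor(z, a=0.9, b=0.2):
    """a*I + b*J commutes with every 2D rotation, so it is SO(2)-equivariant."""
    return (a * z[0] - b * z[1], b * z[0] + a * z[1])

def rel_mse(pred, target):
    """Squared prediction error relative to the squared norm of the target."""
    num = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    return num / (target[0] ** 2 + target[1] ** 2)

z, target = (1.0, 0.5), (0.7, 0.9)
base = rel_mse(predictor(z), target)
for theta in (0.3, 1.7, 3.0):
    # Rotating input and target together leaves the relative error unchanged,
    # because the error vector is itself rotated and rotations preserve norms.
    rotated = rel_mse(predictor(rotate(z, theta)), rotate(target, theta))
    assert abs(rotated - base) < 1e-12
```

A predictor that does not commute with rotations, for instance one that scales the two coordinates differently, would fail this check, which is the gap the non-equivariant baseline in [B] exhibits.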

2606.02998 2026-06-03 cs.LG eess.AS 版本更新

CoughSense: Five-Class Respiratory Disease Classification via Whisper Encoder Fine-Tuning and Dual-Encoder Cross-Attention Fusion with Balanced Contrastive Learning

CoughSense:通过Whisper编码器微调和双编码器交叉注意力融合与平衡对比学习的五类呼吸系统疾病分类

Nikhil Vincent

发表机构 * Independent Researcher, Bothell, Washington, USA(独立研究者,华盛顿州贝斯尔市)

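AI summary Proposes the CoughSense system, which uses Whisper encoder fine-tuning and dual-encoder cross-attention fusion, combined with active-frame attention pooling and balanced contrastive learning, to classify smartphone recordings into five respiratory classes (healthy, COVID-19, asthma/respiratory condition, bronchitis, pneumonia) with high accuracy.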

Comments 26 pages, 3 figures

详情
AI中文摘要

自动咳嗽分析为低成本呼吸系统筛查提供了一条途径,但现有工作大多止步于二元COVID-19检测。一个实用的工具需要能够从消费者智能手机的一次咳嗽录音中区分出多种呼吸系统疾病。我们提出了CoughSense,一个将咳嗽录音分为五类的系统:健康、COVID-19、哮喘或呼吸系统疾病、支气管炎和肺炎。我们汇集了来自四个公共数据集(Coswara、CoughVID、Virufy和West China Hospital Pediatric Cough Dataset)的18,301条录音,并使用OpenAI Whisper编码器作为预训练骨干进行咳嗽疾病分类。主要贡献是主动帧QKV注意力池化,它将注意力限制在1500个编码器令牌中的前200个。这避免了由于3秒咳嗽仅填充Whisper 30秒输入窗口中的150个令牌而产生的静音稀释问题。其他训练部分处理19:1的类别不平衡和四个数据集的领域偏移,包括加权随机采样器、SpecAugment、强制少数配对的平衡混合、监督对比辅助损失、FiLM症状条件化和梯度反转领域适应。双编码器模型通过交叉注意力将Whisper与OPERA-CT呼吸基础模型融合。CoughSense(Whisper-tiny,8.6M参数)在五折交叉验证中达到了82.3%的平衡准确率(宏F1为0.817,AUC为0.941),比ImageNet预训练的EfficientNet-B2高出11.1个百分点,比从头训练的ViT高出29.6个百分点。所有五个类别的召回率均超过74%,其中四个超过80%。双编码器模型达到了85.4%的平衡准确率。在所有消融组件中,主动帧池化是最大的单一贡献者,贡献了5.1个百分点,这应该有助于任何使用Whisper作为骨干的短音频任务。

英文摘要

Automated cough analysis offers a path to low-cost respiratory screening, but most existing work stops at binary COVID-19 detection. A practical tool needs to tell apart several respiratory conditions from one cough recording on a consumer smartphone. We present CoughSense, a system that sorts cough recordings into five classes: healthy, COVID-19, asthma or respiratory condition, bronchitis, and pneumonia. We aggregated 18,301 recordings from four public datasets (Coswara, CoughVID, Virufy, and the West China Hospital Pediatric Cough Dataset) and used the OpenAI Whisper encoder as a pretrained backbone for cough disease classification. The main contribution is active-frame QKV attention pooling, which restricts attention to the first 200 of 1500 encoder tokens. This avoids the silence-dilution problem that arises because a 3-second cough fills only 150 tokens of Whisper's 30-second input window. Other training components handle the 19:1 class imbalance and the four-dataset domain shift: WeightedRandomSampler, SpecAugment, Balanced Mixup with forced minority pairing, a supervised contrastive auxiliary loss, FiLM symptom conditioning, and gradient-reversal domain adaptation. A dual-encoder model fuses Whisper with the OPERA-CT respiratory foundation model through cross-attention. CoughSense (Whisper-tiny, 8.6M parameters) reached 82.3 percent balanced accuracy on five-fold cross-validation (macro-F1 of 0.817, AUC of 0.941). It beat an ImageNet-pretrained EfficientNet-B2 by 11.1 points and a ViT trained from scratch by 29.6 points. All five classes passed 74 percent recall and four of five passed 80 percent. The dual-encoder model reached 85.4 percent balanced accuracy. Active-frame pooling is the largest single contributor among the ablated components at 5.1 points, which should help any short-audio task using Whisper as a backbone.
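The token arithmetic and the pooling restriction can be made concrete with a short sketch. It is a simplification, not the paper's module: a single fixed query attends directly over the raw token vectors (no learned key or value projections), and the function names are hypothetical.

```python
import math

ENCODER_TOKENS = 1500   # tokens produced for Whisper's full input window
WINDOW_SECONDS = 30.0   # length of that window, i.e. 50 tokens per second

def tokens_for_duration(seconds):
    """Number of encoder tokens covered by `seconds` of audio."""
    return int(seconds * ENCODER_TOKENS / WINDOW_SECONDS)

def active_frame_pool(tokens, query, active=200):
    """Softmax-attention pooling over the first `active` tokens only."""
    frames = tokens[:active]
    dim = len(query)
    scores = [sum(q * x for q, x in zip(query, f)) / math.sqrt(dim) for f in frames]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    norm = sum(weights)
    return [sum(w * f[d] for w, f in zip(weights, frames)) / norm for d in range(dim)]
```

A 3-second cough covers `tokens_for_duration(3.0)` = 150 tokens, so a 200-token window keeps the whole cough plus a small margin while the remaining 1300 padding tokens never enter the softmax.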

2606.02993 2026-06-03 cs.LG math.OC math.RT math.ST stat.ML stat.TH 版本更新

Neural Networks Provably Learn Spectral Representations for Group Composition

神经网络可证明地学习群组合的谱表示

Jianliang He, Leda Wang, Fengzhuo Zhang, Siyu Chen, Zhuoran Yang

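AI summary By lifting the projected gradient flow to the Fourier domain, proves that in the group composition task each neuron of a two-layer neural network converges almost surely to a single irreducible representation, and reveals feature learning and a low-rank compression phenomenon from a representation-theoretic viewpoint.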

详情
AI中文摘要

理解神经网络训练过程中结构化内部结构如何涌现是深度学习研究的核心。我们通过群组合任务研究这一现象,其中训练一个两层神经网络来预测有限群 $G$ 中元素的 $g_1 \star g_2$。通过将投影梯度流提升到傅里叶域,我们证明训练动力学由一个表示论能量泛函上的黎曼梯度上升控制。我们证明,在随机初始化下,该流驱动每个神经元几乎必然收敛到单个不可约表示,而跨层傅里叶系数实现旋转秩一对齐。该框架提供了特征学习的表示论解释,并刻画了矩阵值群表示的一种新颖的低秩压缩现象。此外,对于阿贝尔群,我们提供了完整的总体水平描述:随机初始化促进非平凡表示上的均匀多样化,并诱导 Haar 均匀相位,通过多数投票机制联合逼近指示函数。我们进一步证明相位对齐和表示竞争都以指数收敛速率出现。

英文摘要

Understanding how internal structure emerges during neural network training is central to the study of deep learning. We investigate this phenomenon through the group composition task, where a two-layer neural network is trained to predict $g_1 \star g_2$ for elements of a finite group $G$. By lifting the projected gradient flow to the Fourier domain, we demonstrate that the training dynamics are governed by a Riemannian gradient ascent on a representation-theoretic energy functional. We prove that, under random initialization, this flow drives each neuron to converge almost surely toward a single irreducible representation, while the cross-layer Fourier coefficients achieve a rotational rank-one alignment. This framework provides a representation-theoretic account of feature learning and characterizes a novel low-rank compression phenomenon for matrix-valued group representations. Moreover, for Abelian groups, we provide a complete population-level description: random initialization promotes uniform diversification across nontrivial representations and induces Haar-uniform phases, jointly approximating the indicator via a majority-vote mechanism. We further prove that both phase alignment and representation competition emerge with exponential convergence rates.

2606.02982 2026-06-03 cs.PF cs.DC cs.LG 版本更新

DriftSched: Adaptive QoS-Aware Scheduling under Runtime Token Drift for Multi-Tenant GPU Inference

DriftSched: 多租户GPU推理中运行时令牌漂移下的自适应QoS感知调度

Kathiravan Palaniappan

发表机构 * University of California, Berkeley(加州大学伯克利分校)

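AI summary Proposes the DriftSched framework, which uses runtime feedback-driven drift compensation and adaptive bias correction to address scheduling problems caused by token drift in multi-tenant LLM inference, achieving on an NVIDIA L4 GPU an average 38.8% reduction in estimation error and a 42% improvement in median latency.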

Comments 17 pages, 22 figures, 7 tables

详情
AI中文摘要

大型语言模型(LLM)推理服务的快速增长增加了对高效多租户GPU调度的需求。尽管现代推理运行时(如vLLM)通过连续批处理和优化内存管理提高了吞吐量,但准确估计异构推理请求的运行时成本仍然是一个重大挑战。在实践中,观察到的输出长度通常偏离准入时的估计值,产生运行时令牌漂移,可能导致工作负载错误分类、队列不平衡、尾延迟增加和服务质量(QoS)下降。本文提出了DriftSched,一个用于NVIDIA L4 GPU上多租户LLM推理服务的自适应QoS感知调度框架。DriftSched结合了工作负载分类、令牌预算估计、租户感知队列管理和运行时反馈驱动的漂移补偿,以改进准入时的调度决策。该框架在异构多租户工作负载下评估了FIFO、优先级、加权、最短作业优先(SJF)和老化优先级调度策略。实验结果表明,各工作负载类别存在可测量的运行时令牌漂移。自适应偏差校正将工作负载估计误差平均降低38.8%(MAE)和40.5%(RMSE),提高了工作负载分类稳定性和调度准确性。在所有评估的调度器中,SJF实现了最佳整体性能,在持续GPU争用下,相对于FIFO,中位端到端延迟降低了约42%,P99延迟降低了约16%。该工作贡献了一个自适应漂移感知调度架构、一个运行时令牌漂移补偿机制,以及一个用于评估共享GPU基础设施上QoS感知LLM推理调度的可重复基准测试框架。

英文摘要

The rapid growth of large language model (LLM) inference services has increased the demand for efficient multi-tenant GPU scheduling. While modern inference runtimes such as vLLM improve throughput through continuous batching and optimized memory management, accurately estimating the runtime cost of heterogeneous inference requests remains a significant challenge. In practice, observed output lengths often deviate from admission-time estimates, creating runtime token drift that can lead to workload misclassification, queue imbalance, increased tail latency, and degraded Quality-of-Service (QoS). This paper presents DriftSched, an adaptive QoS-aware scheduling framework for multi-tenant LLM inference serving on NVIDIA L4 GPUs. DriftSched combines workload classification, token-budget estimation, tenant-aware queue management, and runtime feedback-driven drift compensation to improve admission-time scheduling decisions. The framework evaluates FIFO, Priority, Weighted, Shortest-Job-First (SJF), and Aging Priority scheduling policies under heterogeneous multi-tenant workloads. Experimental results demonstrate measurable runtime token drift across workload categories. Adaptive bias correction reduces workload estimation error by an average of 38.8% (MAE) and 40.5% (RMSE), improving workload classification stability and scheduling accuracy. Among all evaluated schedulers, SJF achieves the best overall performance, reducing median end-to-end latency by approximately 42% and P99 latency by approximately 16% relative to FIFO under sustained GPU contention. The work contributes an adaptive drift-aware scheduling architecture, a runtime token-drift compensation mechanism, and a reproducible benchmarking framework for evaluating QoS-aware LLM inference scheduling on shared GPU infrastructure.
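One way to picture feedback-driven drift compensation feeding an SJF queue is sketched below. The abstract does not describe the correction rule, so the per-class multiplicative bias, the exponential moving average with factor `alpha`, and the class and function names are assumptions for illustration only.

```python
import heapq

class DriftCorrector:
    """Per-workload-class multiplicative bias, updated by an exponential moving average."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.bias = {}  # workload class -> observed/estimated token ratio

    def estimate(self, workload_class, raw_estimate):
        """Admission-time token estimate after bias correction."""
        return raw_estimate * self.bias.get(workload_class, 1.0)

    def observe(self, workload_class, raw_estimate, observed_tokens):
        """Fold one finished request's observed output length into the bias."""
        ratio = observed_tokens / raw_estimate
        old = self.bias.get(workload_class, 1.0)
        self.bias[workload_class] = (1 - self.alpha) * old + self.alpha * ratio

def sjf_order(requests, corrector):
    """Shortest-Job-First over corrected estimates; requests are (id, class, raw_estimate)."""
    heap = [(corrector.estimate(cls, est), rid) for rid, cls, est in requests]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

If a class consistently generates more tokens than estimated, its bias rises above 1 and its requests move later in the SJF order, which is the kind of misclassification that uncorrected admission-time estimates would miss.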

2606.02974 2026-06-03 cs.AI cs.HC cs.LG 版本更新

WISE-HAR: A Generalizable Ensemble Deep Learning Framework for WiFi-Based Human Activity Recognition

WISE-HAR:一种基于WiFi的人类活动识别的可泛化集成深度学习框架

Maheen Arshad, Qindeel E Zahra, Muhammad Khuram Shahzad

发表机构 * Department of Computing, School of Electrical Engineering and Computer Science(计算机系,电气工程与计算机科学学院) National University of Sciences and Technology (NUST)(国家科学与技术大学(NUST))

AI总结 本文提出WISE-HAR框架,通过集成五种CNN架构、数据增强和跨场景评估,在Wallhack1.8k数据集上实现94.87%的LOS测试准确率,并展现出强泛化能力。

Comments 8 pages, 5 figures

详情
AI中文摘要

利用WiFi信号进行人类活动识别(HAR)已成为智能家居、医疗监控、安全系统和环境辅助生活的一项变革性技术。与引发严重隐私问题且在弱光条件下失效的传统基于摄像头的系统,或需要用户配合的可穿戴传感器不同,基于WiFi的HAR是非侵入性的、保护隐私的、成本效益高的,并且能在任何光照条件下无缝工作。本文提出了一种综合方法,使用Wallhack1.8k WiFi频谱图数据集识别三种不同的人类活动:“无人”(空房间)、“行走”和“行走+挥手”。我们提出了三项关键改进以应对基于WiFi的HAR的主要挑战。首先,为了解决高性能方差问题,我们实现了集成学习,采用五种不同的CNN架构(Deep CNN、Wide CNN、MobileNetV2、ResNet50V2和EfficientNetB0)。其次,为了解决小数据集大小的限制,我们应用了激进的数据增强技术,包括时间扭曲、频率掩蔽和噪声添加。第三,为了评估真实世界的泛化能力,我们进行了跨场景评估(在视距上训练,在非视距上测试)和跨天线评估(在双锥天线上训练,在PIFA天线上测试)。我们的集成模型在使用双锥天线的LOS场景下达到了94.87%的测试准确率,比最佳单个模型高出0.66%。数据增强将随机森林的性能从60%提升到95%。跨场景评估显示准确率下降极小,仅为1.37%和2.07%,证明了强大的泛化能力。结果表明,所提出的方法鲁棒、可靠,适用于不同硬件配置的多样化环境中的实际部署。

英文摘要

Human Activity Recognition (HAR) using WiFi signals has emerged as a transformative technology for smart homes, healthcare monitoring, security systems, and ambient assisted living. Unlike traditional camera-based systems that raise significant privacy concerns and fail in low-light conditions, or wearable sensors that require user compliance, WiFi-based HAR is non-intrusive, privacy-preserving, cost-effective, and works seamlessly in any lighting condition. This paper presents a comprehensive approach to recognize three distinct human activities: "No Presence" (empty room), "Walking", and "Walking + Arm-waving" using the Wallhack1.8k WiFi spectrogram dataset. We propose three key improvements to address the main challenges in WiFi-based HAR. First, to address high performance variance, we implement ensemble learning with five different CNN architectures (Deep CNN, Wide CNN, MobileNetV2, ResNet50V2, and EfficientNetB0). Second, to address the small dataset size limitation, we apply aggressive data augmentation techniques including time-warping, frequency masking, and noise addition. Third, to evaluate real-world generalization capability, we perform cross-scenario evaluation (training on Line-of-Sight and testing on Non-Line-of-Sight) and cross-antenna evaluation (training on Biquad antenna and testing on PIFA antenna). Our ensemble model achieved a test accuracy of 94.87% on the LOS scenario with Biquad antenna, outperforming the best individual model by 0.66%. Data augmentation improved Random Forest performance from 60% to 95%. Cross-scenario evaluation showed minimal accuracy drops of only 1.37% and 2.07%, demonstrating strong generalization capabilities. The results indicate that the proposed approach is robust, reliable, and suitable for real-world deployment in diverse environments with different hardware configurations.
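
Two of the ingredients named in the abstract, frequency masking and ensembling, can be sketched in a few lines of NumPy. The function names, the `(freq, time)` spectrogram layout, and the choice of soft voting (averaging class probabilities) are assumptions for illustration; the abstract does not state how the five CNN outputs are combined.

```python
import numpy as np


def frequency_mask(spec, max_width, rng):
    """Zero out one random band of frequency rows in a (freq, time) spectrogram."""
    spec = spec.copy()                       # leave the caller's array untouched
    n_freq = spec.shape[0]
    width = int(rng.integers(1, max_width + 1))
    start = int(rng.integers(0, n_freq - width + 1))
    spec[start:start + width, :] = 0.0
    return spec


def soft_vote(prob_list):
    """Average per-model class probabilities and return predicted class indices."""
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs.argmax(axis=1)
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh two uncertain ones, which is one common way to reduce the run-to-run variance the abstract mentions.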

2606.02964 2026-06-03 cs.AR cs.CL cs.LG 版本更新

Multi-Segment Attention: Enabling Efficient KV-Cache Management for Faster Large Language Model Serving

多段注意力:实现高效KV缓存管理以加速大型语言模型服务

Chunan Shi, Yilei Chen, Yilin Chen, Xupeng Miao, Bin Cui

发表机构 * Peking University(北京大学)

AI总结 提出AsymCache,一种计算延迟感知的KV缓存管理系统,通过多段注意力、缓存驱逐策略和自适应分块调度器,在保持无损精度的同时显著降低TTFT和TPOT。

详情
AI中文摘要

大型语言模型(LLM)推理依赖键值(KV)缓存以避免冗余的注意力计算。虽然近似KV缓存保留技术通过牺牲模型精度来减少内存使用,但无损方法则从GPU内存中驱逐KV缓存块并按需重建以保留精确输出。现有的无损KV缓存管理系统主要基于访问频率或位置启发式做出驱逐决策,而不考虑不同KV缓存块如何影响GPU注意力内核的执行效率。在本文中,我们提出了AsymCache,一种用于LLM推理的计算延迟感知KV缓存管理系统,它明确地将缓存驻留决策与GPU注意力内核性能对齐,包括三个关键组件:用于高效非连续KV上下文处理的多段注意力(MSA)、联合优化命中率和位置感知重计算成本的缓存驱逐策略,以及用于高硬件利用率的自适应分块调度器。实验表明,与最新基线相比,AsymCache将TTFT降低了高达1.90-2.03倍,每输出令牌时间(TPOT)降低了1.62-1.71倍,证实了该方法在常见工作负载中的有效性,并验证了其平衡计算效率与缓存命中率的设计目标。此外,AsymCache的低级设计允许无缝集成到诸如Continuum的代理服务系统中,进一步将平均作业延迟降低高达18.1%。

英文摘要

Large Language Model (LLM) inference relies on key-value (KV) caches to avoid redundant attention computation. While approximate KV cache retention techniques reduce memory usage by sacrificing model accuracy, lossless approaches instead evict KV cache blocks from GPU memory and reconstruct them on demand to preserve exact outputs. Existing lossless KV cache management systems primarily base eviction decisions on access frequency or positional heuristics, without considering how different KV cache blocks affect the execution efficiency of GPU attention kernels. In this paper, we propose AsymCache, a computation-latency-aware KV cache management system for LLM inference that explicitly aligns cache residency decisions with GPU attention kernel performance, including three key components: Multi-Segment Attention (MSA) for efficient non-contiguous KV context processing, a cache eviction policy that jointly optimizes hit rate and position-aware recomputation cost, and an adaptive chunking scheduler for high hardware utilization. Experiments show that AsymCache reduces TTFT by up to 1.90-2.03x and time-per-output-token (TPOT) by 1.62-1.71x over latest baselines, confirming the effectiveness of the method in common workloads and validating its design goal of balancing computational efficiency with cache hit rate. Moreover, the low-level design of AsymCache allows seamless integration into agent serving systems such as Continuum, where it further reduces average job latency by up to 18.1%.

2606.02963 2026-06-03 cs.LG 版本更新

KForge: LLM-Driven Cross-Platform Kernel Generation for AI Accelerators

KForge:面向AI加速器的LLM驱动跨平台内核生成

Taras Sereda, Burak Bartan, Ankita Nayak, Tom St. John, Natalie Serrino, Zain Asgar

发表机构 * Gimlet Labs Inc(Gimlet实验室)

AI总结 提出KForge框架,通过两个协作的LLM代理(生成代理和性能分析代理)迭代优化,自动生成跨平台高性能内核,在NVIDIA B200和Intel Arc B580上分别实现2.12%的吞吐量提升和5.13倍的几何平均加速。

Comments Accepted at ISCA 2026 Workshop MLArchSys

详情
AI中文摘要

生产推理越来越多地针对异构加速器组合。智能体管道交织推理、工具调用和多智能体协调,每个阶段具有不同的计算和内存特征。为达到最优效率,每个阶段应在最适合的加速器上运行。这带来了系统挑战:每个管道现在需要在越来越多的硬件后端和编程模型上拥有高性能内核。手工编写这些内核耗时、需要深厚的底层专业知识,并且随着内核复杂性增长而难以扩展。最近,大型语言模型(LLMs)已被用于自动内核生成,但在低级代码生成和跨后端泛化方面仍存在挑战。我们提出KForge,一个跨平台框架,围绕由两个协作的基于LLM的代理驱动的迭代优化循环构建:生成代理,使用编译和正确性反馈生成并逐步优化内核;性能分析代理,解释从编程API到基于GUI的工具的性能数据,并发出指导下一轮合成的建议。该循环在功能轮次(驱动候选达到正确性)和优化轮次(缩小与手工调优基线的性能差距)之间交替。我们在两个基线参考可用性差异很大的后端上评估KForge。在NVIDIA B200上,KForge在gpt-oss-20b推理速度基准上相比TensorRT-LLM实现了2.12%的端到端吞吐量提升。在Intel Arc B580上,KForge生成的Triton内核在KernelBench Level 2的37个GEMM+尾部操作工作负载上,主要通过算子融合和混合精度执行,实现了比PyTorch eager和torch.compile中较快者5.13倍的几何平均加速。

英文摘要

Production inference increasingly targets a heterogeneous mix of accelerators. Agentic pipelines interleave reasoning, tool calls, and multi-agent coordination, each with distinct compute and memory profiles. For optimal efficiency, each stage should run on the accelerator best suited to it. This creates a systems challenge: each pipeline now requires high-performance kernels across a growing set of hardware backends and programming models. Writing these kernels by hand is time-consuming, demands deep low-level expertise, and does not scale as kernel complexity grows. Recently, Large Language Models (LLMs) have been leveraged for automatic kernel generation, but challenges in low-level code generation and cross-backend generalization persist. We present KForge, a cross-platform framework built around an iterative refinement loop driven by two collaborating LLM-based agents: a generation agent that produces and progressively refines kernels using compilation and correctness feedback, and a performance-analysis agent that interprets profiling data, from programmatic APIs to GUI-based tools, and emits recommendations that steer the next round of synthesis. The loop alternates between functional passes, which drive a candidate to correctness, and optimization passes, which close the performance gap to hand-tuned baselines. We evaluate KForge on two backends with very different baseline reference availability. On NVIDIA B200, KForge achieves a 2.12$\%$ improvement in end-to-end throughput compared to TensorRT-LLM on the gpt-oss-20b inference speed benchmark. On Intel Arc B580, KForge generates Triton kernels achieving a 5.13$\times$ geometric mean speedup over the faster of PyTorch eager and torch.compile on 37 GEMM + tail-ops workloads from KernelBench Level 2, primarily via operator fusion and mixed-precision execution.

2606.02959 2026-06-03 cs.LG cs.CR 版本更新

Gate AI: LLM Security Benchmark Evaluation Methodology and Results

Gate AI:大语言模型安全基准评估方法与结果

Ryle Goehausen, Marcus Sousa

发表机构 * constellationnetwork(Constellation Network)

AI总结 针对提示注入和越狱检测器评估中数据集阈值调优和操作点未公开的问题,提出一种采用5折交叉验证、全局操作点选择和多种泛化诊断的评估框架,并在16个公开基准上进行了测试。

Comments 17 pages, 23 figures, 2 tables. Working preprint; subsequent versions may update benchmark numbers as the framework evolves

详情
AI中文摘要

已发布的大语言模型提示注入和越狱检测器评估通常存在两个系统性弱点:每个数据集单独调整阈值以及未公开的操作点。我们描述了一种解决这两个问题的评估框架。被评估的检测器在16个公共基准(12,111个样本)上使用5折交叉验证进行评分。主要流程采用StratifiedKFold(按行);同时,并行运行StratifiedGroupKFold流程,基于复合键(父提示ID加上Jaccard $\gtrsim 0.8$的MinHash + LSH近重复聚类)作为泄漏溢价诊断。在保留的折上选择一个全局操作点(在FPR $\leq 1\%$条件下最大化F1),并统一应用于每个数据集,因此每个数据集的结果反映一个阈值,而非每个基准的优化。通过一系列诊断检查泛化能力(留一数据集交叉验证、随机标签对照、对抗验证、排列特征重要性、长度偏差相关性、分类器头部一致性、跨源近重复检测、阈值可迁移性、训练集与OOF一致性以及释义不变性探测),其中大多数具有定量通过阈值,其余则说明失败模式。对于每次外部比较,检测器的阈值根据竞争对手公布的假阳性率重新调整,以便在匹配的操作点上评估对比值。

英文摘要

Published evaluations of prompt-injection and jailbreak detectors for Large Language Models often suffer from two systematic weaknesses: per-dataset threshold tuning and undisclosed operating points. We describe an evaluation harness that addresses both. The detector under evaluation is scored across 16 public benchmarks (12,111 samples) using 5-fold cross-validation. StratifiedKFold (by row) is the headline pass; a parallel StratifiedGroupKFold pass over a composite key (parent-prompt id plus MinHash + LSH near-duplicate clusters at Jaccard $\gtrsim 0.8$) runs alongside it as a leakage-premium diagnostic. A single global operating point is selected on the held-out folds (max F1 subject to FPR $\leq 1\%$) and applied uniformly to every dataset, so per-dataset results reflect one threshold rather than per-benchmark optimisation. Generalisation is examined through a battery of diagnostics (leave-one-dataset-out cross-validation, a random-label control, adversarial validation, permutation feature importance, length-bias correlation, classifier-head agreement, cross-source near-duplicate detection, threshold transferability, train-vs-OOF agreement, and a paraphrase-invariance probe), most with a quantitative pass threshold and the remainder with a stated failure mode. For every external comparison, the detector's threshold is re-tuned to the competitor's published false-positive rate so head-to-head values are evaluated at matched operating points.
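
The operating-point rule described above (maximise F1 subject to a false-positive-rate cap, then reuse that single threshold everywhere) can be sketched as follows. This is a minimal illustration, not the paper's harness: the function name, the `score >= threshold` convention, and the exhaustive scan over observed scores are assumptions.

```python
import numpy as np


def select_operating_point(scores, labels, max_fpr=0.01):
    """Return (threshold, f1) maximising F1 among thresholds with FPR <= max_fpr.

    A sample is predicted positive when score >= threshold. Returns
    (None, 0.0) if no candidate threshold meets the constraint with F1 > 0.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_neg = int((labels == 0).sum())
    best_thr, best_f1 = None, 0.0
    for thr in np.unique(scores):            # candidate thresholds, ascending
        pred = scores >= thr
        tp = int((pred & (labels == 1)).sum())
        fp = int((pred & (labels == 0)).sum())
        fn = int((~pred & (labels == 1)).sum())
        fpr = fp / n_neg if n_neg else 0.0
        if fpr > max_fpr or tp == 0:
            continue                         # violates the FPR cap or has zero F1
        f1 = 2 * tp / (2 * tp + fp + fn)
        if f1 > best_f1:
            best_thr, best_f1 = float(thr), f1
    return best_thr, best_f1
```

In the setting the abstract describes, `scores` and `labels` would be the pooled out-of-fold predictions, and the returned threshold would then be applied unchanged to each of the benchmarks rather than re-tuned per dataset.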

2606.02956 2026-06-03 cs.CV cs.LG cs.RO 版本更新

The Road Ahead in Autonomous Driving: The KITScenes Multimodal Dataset

自动驾驶的未来之路:KITScenes多模态数据集

Richard Schwarzkopf, Fabian Immel, Alexander Blumberg, Jonas Merkert, Nils Rack, Kaiwen Wang, Fabian Konstantinidis, Julian Truetsch, Carlos Fernandez, Annika Bätz, Kevin Rösch, Marlon Steiner, Willi Poh, Yinzhe Shen, Royden Wagner, Felix Hauser, Dominik Strutz, Jaime Villa, Gleb Stepanov, Holger Caesar, Ömer Şahin Taş, Frank Bieder, Jan-Hendrik Pauls, Christoph Stiller

发表机构 * FZI Research Center for Information Technology(FZI信息技术研究中心) Karlsruhe Institute of Technology(卡尔斯鲁厄理工学院) University Charles III of Madrid(马德里卡洛斯三世大学) Delft University of Technology(代尔夫特理工大学)

AI总结 本文提出KITScenes多模态数据集,通过高保真传感器和完整HD地图,解决现有数据集在传感器精度、地图完整性和地理多样性上的不足,并引入四个基准推动空间学习。

Comments 28 pages, 21 figures

详情
AI中文摘要

现有的自动驾驶数据集推动了重大进展,但在传感器保真度、地图完整性或地理多样性方面仍存在不足。我们提出了KITScenes多模态数据集,这是一个基于高保真传感器和地图构建的欧洲数据集。我们完全同步的传感器套件结合了高分辨率全局快门相机、超过400米的长距离激光雷达、4D成像雷达以及冗余的GNSS/INS定位。据我们所知,我们的HD地图是任何传感器数据集中最完整的,并通过开源软件上的自动驾驶试验进行了验证。首次在公共数据集中,所有与驾驶相关的交通元素(如交通灯)都以3D方式映射到重投影精确的水平,并具有完整的拓扑连接。我们的数据集记录在街道布局不规则且交通模式混合的城市中,通过拓宽可用的地理多样性来补充现有数据集。我们还引入了四个基准,每个基准都推动了具身AI的空间学习:在线HD地图构建、长距离深度估计、新颖视图合成和端到端驾驶。项目页面:https://kitscenes.com/

英文摘要

Existing autonomous driving datasets have enabled major progress, but fall short in sensor fidelity, map completeness, or geographic diversity. We present KITScenes Multimodal, a European dataset built around high-fidelity sensors and maps. Our fully synchronized sensor suite combines high-resolution global-shutter cameras, long-range lidar beyond 400m, 4D imaging radar, and redundant GNSS/INS localization. Our HD maps are, to our knowledge, the most complete of any sensor dataset, validated through autonomous driving trials on open-source software. For the first time in a public dataset, all driving-relevant traffic elements, such as traffic lights, are mapped in 3D to a reprojection-accurate level with full topological connectivity. Recorded in cities with irregular street layouts and mixed traffic modes, our dataset complements existing datasets by broadening the available geographic diversity. We also introduce four benchmarks, each advancing spatial learning for embodied AI: online HD map construction, long-range depth estimation, novel view synthesis, and end-to-end driving. Project page: https://kitscenes.com/

2606.02948 2026-06-03 cs.LG cs.DS 版本更新

From Non-Convex to Strongly Convex: Curvature-Adaptive FTPL for Online Optimization

从非凸到强凸:曲率自适应的FTPL在线优化

Moses Charikar, Chirag Pabbaraju, Ambuj Tewari

发表机构 * Stanford University(斯坦福大学) University of Michigan, Ann Arbor(密歇根大学安娜堡分校)

AI总结 提出一种曲率自适应的FTPL算法,通过时变扰动尺度实现非凸Lipschitz损失下的最优遗憾界,并在线性累积曲率下达到对数遗憾。

详情
AI中文摘要

曲率自适应是在线优化中的一个经典主题:对于凸Lipschitz损失,自适应方法在一般凸损失的最优$O(\sqrt{T})$遗憾和强凸性下的$O(\log T)$遗憾之间进行插值。最近的研究表明,假设可以访问近似离线优化预言机,Follow-the-Perturbed-Leader (FTPL) 即使对于在线非凸Lipschitz损失也能实现最优的$O(\sqrt{T})$遗憾,但这些保证没有利用曲率。我们证明,在非凸设置中,FTPL可以变得曲率自适应,而无需事先知道曲率如何随时间累积。我们的算法将标准FTPL的固定扰动尺度替换为仅使用过去信息选择的时变尺度。我们为该尺度给出了一个简单的跟随领先者(follow-the-leader)调节规则,并表明它在常数因子内可与事后最佳选择相媲美。所得到的方法对于任意非凸Lipschitz损失实现了$O(\sqrt{T})$遗憾,并随着累积曲率的增长而改进;在足够精确的预言机调用下,当累积曲率线性增长时(包括经典的强凸情形),它实现了$O(\log T)$遗憾。我们针对规定的累积曲率序列给出匹配的下界(对一维凸损失即已成立)来补充这些上界,表明最坏情况非凸遗憾与曲率驱动的快速速率之间的权衡是内在的。

英文摘要

Curvature adaptivity is a classical theme in online optimization: for convex Lipschitz losses, adaptive methods interpolate between the optimal $O(\sqrt{T})$ regret for general convex losses and $O(\log T)$ regret under strong convexity. Recent work has shown that Follow-the-Perturbed-Leader (FTPL) achieves optimal $O(\sqrt{T})$ regret even for online non-convex Lipschitz losses, assuming access to an approximate offline-optimization oracle, but these guarantees do not exploit curvature. We show that FTPL can be made curvature-adaptive in the non-convex setting, without knowing in advance how curvature will accumulate over time. Our algorithm replaces the fixed perturbation scale of standard FTPL with a time-varying scale chosen using only past information. We give a simple follow-the-leader tuning rule for this scale and show that it competes, up to constants, with the best choice in hindsight. The resulting method achieves $O(\sqrt{T})$ regret for arbitrary non-convex Lipschitz losses and improves as cumulative curvature grows; with sufficiently accurate oracle calls, it achieves $O(\log T)$ regret when cumulative curvature grows linearly, which includes the classical strongly convex regime. We complement these upper bounds with matching lower bounds for prescribed cumulative-curvature sequences, already for one-dimensional convex losses, showing that the tradeoff between worst-case non-convex regret and curvature-driven fast rates is intrinsic.
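
For readers unfamiliar with FTPL, the sketch below shows the basic loop on a one-dimensional grid with a perturbation scale that may change per round. It is only a toy: the grid search stands in for the offline-optimization oracle, and `scale_fn` is a hypothetical placeholder schedule, not the curvature-driven follow-the-leader tuning rule the abstract describes.

```python
import numpy as np


def ftpl(losses, grid, scale_fn, seed=0):
    """Follow-the-Perturbed-Leader on a finite 1-D grid with a time-varying scale.

    losses:   sequence of vectorised functions f_t(x) evaluated on the grid
    grid:     1-D array of candidate decisions (stand-in for the offline oracle)
    scale_fn: maps round index t (1-based) to the exponential perturbation scale
    """
    rng = np.random.default_rng(seed)
    cum = np.zeros(len(grid))                 # cumulative loss of each grid point
    plays = []
    for t, loss in enumerate(losses, start=1):
        sigma = rng.exponential(scale=scale_fn(t))
        idx = int(np.argmin(cum - sigma * grid))   # perturbed leader
        plays.append(grid[idx])
        cum += loss(grid)                     # loss is revealed after playing
    return np.array(plays)
```

With `scale_fn = lambda t: 0.0` the loop reduces to plain follow-the-leader; a larger scale adds exploration, which is the knob a curvature-adaptive rule would shrink as curvature accumulates.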

2606.02947 2026-06-03 cs.LG cs.CV 版本更新

BYORn: Bootstrap Your Own Responses to Defend Large Vision-Language Models Against Backdoor Attacks

BYORn:自举你的响应以防御大型视觉-语言模型的后门攻击

Ivan Sabolić, Marin Oršić, Josip Šarić, Sven Lončarić

发表机构 * University of Rijeka(里耶卡大学)

AI总结 提出BYORn框架,通过识别并替换语义不合理的后门目标响应,打破触发器与目标输出的关联,从而在保持干净任务性能的同时提升对后门攻击的鲁棒性。

Comments Accepted to ICML 2026

详情
AI中文摘要

监督微调是将自回归视觉-语言模型适应下游任务的主要方法。最近的研究表明,这种范式极易受到后门攻击,并且现有的防御在开放生成设置中无效。为此,我们提出了BYORn,一个鲁棒的后门防御微调框架,其动机是观察到,在给定相应图像-文本输入和预训练模型的情况下,被毒化的目标响应通常在语义上不合理。BYORn识别这种不对齐的响应,并动态地用模型生成的替代响应替换它们,从而打破触发器与目标输出之间的相关性。由此产生的目标梯度对应于干净数据分布上总体风险上界的经验估计的梯度。实验上,BYORn在保持干净任务性能的同时,持续提高了对后门攻击的鲁棒性,建立了泛化与攻击成功率之间的新权衡边界。最后,我们证明了BYORn对专门设计用于规避所提防御的自适应攻击仍然有效。

英文摘要

Supervised fine-tuning is the predominant approach for adapting autoregressive vision-language models to downstream tasks. Recent work has shown that this paradigm is highly vulnerable to backdoor attacks, and that existing defenses are ineffective in open-ended generation settings. In response, we propose BYORn, a backdoor-robust fine-tuning framework motivated by the observation that poisoned target responses are often semantically implausible given the corresponding image-text inputs and a pretrained model. BYORn identifies such misaligned responses and dynamically replaces them with alternative responses generated by the model, thereby breaking the correlation between triggers and target outputs. The resulting objective gradient corresponds to the gradient of the empirical estimate of the population risk upper bound over the clean data distribution. Empirically, BYORn consistently improves robustness to backdoor attacks while preserving clean-task performance, establishing a new trade-off frontier between generalization and attack success rate. Finally, we demonstrate that BYORn remains effective against adaptive attacks specifically designed to circumvent the proposed defense.

2606.02946 2026-06-03 cs.LG cs.CR 版本更新

Outsmarting the Chameleon: Counterfactual Decoupling for Tactical OOD Shifts in Live Streaming Risk Assessment

智取变色龙:针对直播风险评估中战术性OOD偏移的反事实解耦

Yiran Qiao, Jing Chen, Jiaqi Xu, Yang Liu, Qiwei Zhong, Xiang Ao

发表机构 * Institute of Computing Technology, Chinese Academy of Sciences(中国科学院计算技术研究所) ByteDance China(字节跳动中国)

AI总结 针对直播中恶意行为者通过战术性分布偏移(Tactical OOD Shift)规避检测的问题,提出基于潜在因果的反事实解耦框架LPCD,通过潜在层建模意图与叙事变化并强制潜在反事实一致性,实现鲁棒的风险评估。

Comments Accepted by KDD'26

详情
AI中文摘要

直播已成为社交互动和数字商务的主要媒介,但日益受到复杂风险的困扰。该领域的一个基本挑战是战术性分布外偏移(tactical OOD shift):虽然恶意行为者保持稳定的潜在目标,但他们不断重新设计叙事包装以逃避检测。这种对抗性偏移暴露了现有OOD泛化范式的关键局限性,其假设在意图-战术紧密耦合演变和原始级反事实定义不清的情况下难以满足。在本文中,我们从潜在因果角度解决这一问题,并提出潜在预测反事实解耦(LPCD),一个用于鲁棒直播风险评估的即插即用框架。LPCD通过在潜在层建模意图和叙事变化来实现对抗性战术重新包装下的反事实推理,并强制潜在反事实一致性以将风险预测锚定在因果稳定的恶意意图上。在推理时,LPCD应用轻量级、无参数的校准以进一步缓解战术引起的分布偏移。在大规模工业数据集和在线生产流量上的大量实验表明,LPCD持续优于最先进的基线,验证了其在现实直播中治理不断演变的对抗性风险的有效性。项目页面见 https://qiaoyran.github.io/LiveStreamingRiskAssessment/。

英文摘要

Live streaming has emerged as a primary medium for social interaction and digital commerce, yet it is increasingly plagued by sophisticated risks. A fundamental challenge in this domain is tactical out-of-distribution (OOD) shift: while malicious actors maintain stable underlying objectives, they continuously redesign narrative packaging to evade detection. Such adversarial shifts expose critical limitations of existing OOD generalization paradigms, whose assumptions are difficult to satisfy in the presence of tightly coupled intent-tactic evolution and ill-defined raw-level counterfactuals. In this paper, we tackle this issue from a latent causal perspective and propose Latent-Predictive Counterfactual Decoupling (LPCD), a plug-in framework for robust live streaming risk assessment. LPCD enables counterfactual reasoning under adversarial tactical re-packaging by modeling intent and narrative variation at the latent level, and enforces latent counterfactual consistency to anchor risk prediction on causally stable malicious intent. At inference time, LPCD applies a lightweight, parameter-free calibration to further mitigate tactic-induced distribution shifts. Extensive experiments on large-scale industrial datasets and online production traffic demonstrate that LPCD consistently outperforms state-of-the-art baselines, validating its effectiveness in moderating evolving adversarial risks in real-world live streaming. The project page is available at https://qiaoyran.github.io/LiveStreamingRiskAssessment/.

2606.02939 2026-06-03 cs.LG eess.SP 版本更新

ERP-XTTN: Interpretable Prototype-Guided Cross-Attention for Cross-Subject ERP Classification

ERP-XTTN: 可解释的原型引导跨注意力用于跨被试ERP分类

Charlotte Genevier Wyman, Leanne Hirshfield

发表机构 * University of Colorado Boulder(科罗拉多大学博尔德分校)

AI总结 提出ERP-XTTN,一种基于原型引导跨注意力的架构,在无需校准的跨被试条件下实现可解释的ERP分类,并揭示分类错误的神经生理学原因。

详情
AI中文摘要

构建无需校准即可跨被试泛化的可解释脑机接口分类器仍然是一个开放的挑战。我们测试了基于原型的跨注意力是否能在部署兼容条件下提供具有竞争力且可解释的事件相关电位(ERP)分类。我们提出ERP-XTTN,一种跨注意力架构,通过仅查询-键的跨注意力(无值投影)将输入EEG片段路由到固定的差异波原型,因此分类完全依赖于注意力路由,且注意力忠实性是结构性的而非事后解释的。原型从训练折差异波的极值自动推导。我们在三个公开数据源(BNCI Horizon 2020、HRI Cursor和ERP CORE)上评估,涵盖八个ERP成分(ERN、LRP、ErrP、N170、P300、N2pc、MMN、N400),使用留一被试(LOSO)评估,并在两种通道数(3通道和全导联)下采用因果滤波,与EEGNet和基于黎曼几何的xDAWN(xDAWN+RG)对比。最佳基线与ERP-XTTN的平均差距在3通道时为0.018 AUROC,在全导联时为0.034,这源于两个大致不同的来源:相对于EEGNet的时间灵活性成本和相对于xDAWN+RG的空间利用成本,后者在全导联时由信噪比驱动。除了准确性,透明的路由揭示了黑箱模型无法发现的跨被试信号结构:假阳性与真阳性的相似度高于真阴性与真阳性的相似度,表明分类错误在神经生理学上是可以解释的。ERP-XTTN在因果、无校准条件下泛化到多种ERP,并在最小导联设置下具有较小的可解释性代价。据我们所知,这是ERP CORE上首个epoch级LOSO基准测试。

英文摘要

Interpretable brain-computer interface classifiers that generalize across subjects without calibration remain an open challenge. We test whether prototype-based cross-attention can provide competitive, interpretable event-related potential (ERP) classification under deployment-compatible conditions. We propose ERP-XTTN, a cross-attention architecture that routes input EEG patches to fixed difference-wave prototypes via query-key-only cross-attention with no value projection, so classification depends entirely on attention routing and attention faithfulness is structural rather than post-hoc. Prototypes are derived automatically from extrema in the training-fold difference wave. We evaluate across three public sources (BNCI Horizon 2020, HRI Cursor, and ERP CORE) spanning eight ERP components (ERN, LRP, ErrP, N170, P300, N2pc, MMN, N400), using leave-one-subject-out (LOSO) evaluation with causal filtering at two channel counts (3-channel and full montage), against EEGNet and xDAWN with Riemannian geometry (xDAWN+RG). The mean gap between the best baseline and ERP-XTTN was .018 AUROC at 3 channels and .034 at full montage, arising from two largely distinct sources: a temporal-flexibility cost relative to EEGNet and a spatial-exploitation cost relative to xDAWN+RG, the latter driven by signal-to-noise ratio at full montage. Beyond accuracy, the transparent routing reveals cross-subject signal structure that black-box models cannot: false positives resembled true positives more than true negatives did, indicating that classification errors are neurophysiologically explicable. ERP-XTTN generalizes across diverse ERPs under causal, calibration-free conditions with a small interpretability cost at minimal montages. To our knowledge, this is the first epoch-level LOSO benchmark on ERP CORE.

2606.02936 2026-06-03 cs.LG 版本更新

Hierarchical RBF-KAN and RBF-SKAN Architectures for Multidimensional Function Approximation and Random Field Learning

分层RBF-KAN和RBF-SKAN架构用于多维函数逼近和随机场学习

Mingtao Xia, Qijing Shen

发表机构 * University of Houston(休斯顿大学) University of Birmingham(伯明翰大学) University of Oxford(牛津大学)

AI总结 提出并分析使用径向基函数作为激活函数的分层Kolmogorov-Arnold神经网络架构,用于逼近确定性函数和随机场模型,并证明其通用逼近性质及缓解维度灾难的潜力。

详情
AI中文摘要

本文提出并分析了使用径向基函数作为激活函数的分层Kolmogorov-Arnold神经网络架构,用于逼近确定性函数和随机场模型。具体地,我们开发了用于多维确定性函数逼近的分层径向基函数Kolmogorov-Arnold网络(分层RBF-KAN)和用于随机场学习的分层径向基函数随机Kolmogorov-Arnold网络(分层RBF-SKAN)。从理论角度,我们为两种架构建立了通用逼近结果。特别地,我们推导了分层RBF-KAN的定量逼近估计,表明所提出的框架通过降低逼近问题的有效维度,有潜力部分缓解高维函数学习中的维度灾难。此外,我们证明了分层RBF-SKAN可以在Wasserstein-2度量下逼近随机场模型。实验上,我们表明所提出的基于径向基函数的神经网络结构能够有效学习多元函数和随机场模型。

英文摘要

In this manuscript, we propose and analyze hierarchical Kolmogorov--Arnold neural network architectures employing radial basis functions as activation functions for approximating deterministic functions and random field models. Specifically, we develop a hierarchical radial-basis-function Kolmogorov--Arnold network (hierarchical RBF-KAN) for multidimensional deterministic function approximation and a hierarchical radial-basis-function stochastic Kolmogorov--Arnold network (hierarchical RBF-SKAN) for random field learning. From a theoretical perspective, we establish universal approximation results for both architectures. In particular, we derive quantitative approximation estimates for the hierarchical RBF-KAN, showing that the proposed framework has the potential to partially alleviate the curse of dimensionality in learning high-dimensional functions by reducing the effective dimensionality of the approximation problem. Furthermore, we show that the hierarchical RBF-SKAN can approximate random field models under the Wasserstein-2 metric. Empirically, we show that our proposed radial-basis-function-based neural network structure could effectively learn multivariate functions and random field models.

2606.02920 2026-06-03 cs.LG 版本更新

Fast Unlearning at Scale via Margin Self-Correction

通过边际自我修正实现大规模快速遗忘学习

Federico Di Gennaro, Alexander Shevchenko, Fanny Yang

发表机构 * ETH Zürich(苏黎世联邦理工学院)

AI总结 提出MASC方法,通过在线停止规则在无需下游评估的情况下高效实现语言模型遗忘,显著降低计算成本。

详情
AI中文摘要

语言模型遗忘学习更新已训练模型,使其表现得好像从未见过选定的训练样本,同时保持效用并避免昂贵的重新训练。现有方法通常使用固定的训练预算微调预训练模型,然后通过在下游验证数据上评估几个保存的检查点来最终选择模型。两种不必要的计算限制了可扩展性:训练超出期望的遗忘-保留权衡,以及需要额外存储和重复评估的检查点选择。为了解决这些限制,我们引入了MArgin Self-Correction (MASC),一种高效的遗忘学习方法,具有在线停止规则,不需要下游评估。给定一个要遗忘的文本序列,MASC主动减小原始下一个词元与最可能替代词元之间的logit差距。一旦这个差距在所有遗忘序列的足够大比例的词元位置上平均较小,它就会输出最终模型。在TOFU、MUSE News和MUSE Books上,MASC以现有基线计算成本的一小部分实现了具有竞争力的遗忘-保留权衡。我们进一步观察到,随着模型规模(即参数数量)的增加,MASC和SimNPO的权衡都得到了改善——遗忘指标保持可比,而保留效用增加。

英文摘要

Language-model unlearning updates a trained model to behave as if it had not seen selected training examples, while preserving utility and avoiding costly retraining. Existing approaches typically fine-tune the pretrained model with a fixed training budget and select the final model afterwards by evaluating several saved checkpoints on downstream validation data. Two sources of unnecessary computation limit scalability: training beyond the desired forget-retain trade-off, and checkpoint selection that requires extra storage and repeated evaluations. To address these limitations, we introduce MArgin Self-Correction (MASC), an efficient unlearning method with an online stopping rule that does not require downstream evaluation. Given a text sequence to be forgotten, MASC actively reduces the logit gap between the original next token and the most likely alternatives. It outputs a final model once this gap is small on average over a sufficiently large proportion of token positions across all forget sequences. On TOFU, MUSE News, and MUSE Books, MASC achieves a competitive forget-retain trade-off at a fraction of the computational cost of existing baselines. We further observe that as we increase model size (a.k.a. number of parameters), the trade-offs improve for both MASC and SimNPO -- the forget metrics remain comparable while retain utility increases.
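
The quantity MASC monitors, the logit gap between the original next token and its strongest competitor, is simple to compute. The sketch below shows that computation and one possible reading of the online stopping rule; the threshold `tau`, the required `fraction`, and the exact form of the rule are assumptions, since the abstract only describes it qualitatively.

```python
import numpy as np


def token_margins(logits, target_ids):
    """Logit of the original next token minus the largest competing logit, per position.

    logits:     (seq_len, vocab) array of next-token logits
    target_ids: (seq_len,) indices of the original next tokens
    """
    logits = np.asarray(logits, dtype=float)
    target_ids = np.asarray(target_ids)
    idx = np.arange(logits.shape[0])
    target_logit = logits[idx, target_ids]
    masked = logits.copy()
    masked[idx, target_ids] = -np.inf        # exclude the target from the competitors
    return target_logit - masked.max(axis=1)


def should_stop(margins, tau=0.0, fraction=0.9):
    """Hypothetical rule: stop once enough positions have margin <= tau."""
    margins = np.asarray(margins)
    return bool((margins <= tau).mean() >= fraction)
```

A positive margin means the model still prefers the original token at that position; because the check needs only forward-pass logits on the forget set, it can run during training without a separate downstream evaluation.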

2606.02902 2026-06-03 cs.CY cs.LG 版本更新

Fairness Definitions and Metrics in Deep Reinforcement Learning for Drug Discovery in Healthcare: A Rapid Evidence Review

医疗保健药物发现中深度强化学习的公平性定义与指标:快速证据综述

Esmaeil Shakeri, Ronnie de Souza Santos, Behrouz Far

发表机构 * Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary(电气与软件工程系,Schulich工程学院,卡尔加里大学)

AI总结 本文通过快速证据综述,系统总结了深度强化学习在药物分子生成中公平性的定义、测量指标,并分析了数据集组成、奖励设计对公平性的影响。

Comments 10 pages, 6 figures, 3 tables. Accepted as a full paper at a symposium of IEEE COMPSAC 2026

详情
AI中文摘要

深度强化学习(DRL)越来越多地应用于从头分子设计,但数据、奖励和评估的选择可能导致在不同疾病区域和化学类型上的性能不均。尽管如此,目前尚无关于DRL药物发现中公平性如何定义、测量和测试的简明综合。在这篇快速证据综述中,我们综合了医疗保健中DRL驱动分子生成的公平性定义和指标。我们关注三个问题:(i)数据集组成和划分策略(特别是支架划分与随机划分)如何影响评估和分布偏移;(ii)奖励设计(如QED、对接、毒性、合成可及性)如何产生或减轻偏差,重点关注癌症靶点;(iii)哪些可测量指标最能捕捉公平性。这包括癌症与非癌症适应症之间以及癌症亚型之间的均等性。还包括关键物理化学描述符的分布平衡、支架/化学类型多样性、组间有效性、毒性和合成可及性。从2017年起,我们检索了主要的生物医学、计算机科学和工程文献数据库,并使用arXiv进行地平线扫描。记录通过PRISMA式程序筛选,并通过内容编码分析,将报告的均等性结果与数据集和奖励选择联系起来。我们的综述为DRL分子生成提供了一套简洁的公平性定义和指标。它为报告分布均等性和结果均等性提供了实用指南。它还总结了数据集和奖励选择如何与观察到的均等性效应相关,并指出了与可信、癌症相关的DRL生成相关的未解决问题。

英文摘要

Deep reinforcement learning (DRL) is increasingly applied to de novo molecular design, but choices in data, rewards, and evaluation can yield uneven performance across disease areas and chemotypes. Despite this, there is no concise synthesis of how fairness is defined, measured, and tested in DRL-based drug discovery. In this rapid evidence review, we synthesize fairness definitions and metrics for DRL-driven molecule generation in healthcare. We focus on three questions: (i) how dataset composition and split strategies, especially scaffold versus random splits, affect evaluation and distribution shift; (ii) how reward design (e.g., QED, docking, toxicity, synthetic accessibility) can create or mitigate bias, with emphasis on cancer targets; and (iii) which measurable metrics best capture fairness. This includes parity across cancer versus non-cancer indications and across cancer subtypes. It also includes distributional balance in key physicochemical descriptors, scaffold/chemotype diversity, groupwise validity, toxicity, and synthetic accessibility. From 2017 onward, we searched major biomedical, computer science, and engineering literature databases and used arXiv for horizon scanning. Records were screened using PRISMA-style procedures and analyzed via content coding to link reported parity outcomes to dataset and reward choices. Our review provides a concise set of fairness definitions and metrics for DRL molecule generation. It offers practical guidance for reporting distribution parity and outcome parity. It also summarizes how dataset and reward choices relate to observed parity effects and identifies open gaps relevant to trustworthy, cancer-relevant DRL generation.

2606.02892 2026-06-03 cs.LG 版本更新

Multi-Modal Machine Learning for Breast Cancer Recurrence Prediction

多模态机器学习用于乳腺癌复发预测

Jiahao Shao, Xudong Wang, Anam Nawaz Khan, Christopher Brett, Xueping Li, Bing Yao

发表机构 * Department of Industrial & Systems Engineering, The University of Tennessee, Knoxville, TN 37996, USA(工业与系统工程系,田纳西大学,诺克斯维尔,TN 37996,美国) The University of Tennessee Medical Center, Knoxville, TN 37920, USA(田纳西大学医学中心,诺克斯维尔,TN 37920,美国)

AI总结 本研究通过整合结构化与非结构化临床数据(治疗记录、病理报告和临床笔记),结合基于规则的正则表达式提取和优先级冲突解决策略,显著提升了乳腺癌复发预测的准确性。

Comments 33 pages, 10 figures

详情
AI中文摘要

乳腺癌复发是幸存者长期死亡的主要原因,需要及时准确的风险评估来指导随访护理和治疗计划。传统预测模型通常局限于结构化或非结构化数据,难以捕捉完整的临床背景。本研究探讨了整合多模态临床数据(包括治疗记录、病理报告和临床笔记)对复发预测的影响。通过结合基于规则的正则表达式提取机制和严格的基于优先级的冲突解决策略,我们的方法有效地从自由文本病理叙述中恢复确定的肿瘤特征,以增强结构化记录。我们还与先前乳腺癌研究中常用的特征集进行性能基准测试,以评估多模态整合的附加价值。单源和多模态输入在一系列机器学习模型上进行评估。结果表明,与单模态方法相比,多模态整合一致地提高了预测准确性。

英文摘要

Breast cancer recurrence, a leading cause of long-term mortality among survivors, requires timely and accurate risk assessment to guide follow-up care and treatment planning. Traditional predictive models, often limited to either structured or unstructured data alone, struggle to capture the full clinical context. This study examines the impact of integrating multi-modal clinical data, including treatment records, pathology reports, and clinician notes, on recurrence prediction. By integrating a rule-based regular expression extraction mechanism with a rigorous precedence-based conflict reconciliation strategy, our approach effectively recovers definitive tumor characteristics from free-text pathology narratives to augment structured records. We also benchmark performance against commonly used feature sets from prior breast cancer studies to assess the added value of multi-modal integration. Single-source and multi-modal inputs are evaluated across a range of machine learning models. Results show that multi-modal integration consistently improves predictive accuracy compared to single-modal methods.
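
The extraction-plus-reconciliation step mentioned in the abstract can be pictured with a small example. Everything concrete here is hypothetical: the regular expression, the choice of estrogen-receptor (ER) status as the field, the source names, and the precedence order are illustrative assumptions, not the rules used in the study.

```python
import re

# Hypothetical pattern for statements such as "ER: Positive" or "ER negative".
ER_PATTERN = re.compile(r"\bER\s*[:\-]?\s*(positive|negative)\b", re.IGNORECASE)

# Assumed precedence: sources listed earlier win when values disagree.
PRECEDENCE = ("pathology_text", "structured_record")


def extract_er_status(report_text):
    """Return 'positive' or 'negative' if the report states ER status, else None."""
    match = ER_PATTERN.search(report_text)
    return match.group(1).lower() if match else None


def reconcile(values_by_source):
    """Pick the value from the highest-precedence source that has one."""
    for source in PRECEDENCE:
        value = values_by_source.get(source)
        if value is not None:
            return value
    return None
```

A fixed precedence order keeps the merged record deterministic: when the free-text report and the structured field disagree, the same source always wins, and the structured value is used only as a fallback.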

2606.02887 2026-06-03 cs.LG cs.NA math.NA math.OC 版本更新

A Nonmonotone Gradient-Based Algorithm for Symmetric Nonnegative Matrix Factorization and Graph Clustering

一种用于对称非负矩阵分解和图聚类的非单调梯度算法

Ryan Swart, Johannes Brust

AI总结 提出SNMPBB算法,首次将非单调投影Barzilai-Borwein方法应用于对称非负矩阵分解,并扩展至图聚类和大规模问题,证明全局收敛性,实验显示显著加速和精度提升。

详情
AI中文摘要

对称非负矩阵分解(Symmetric NMF)将矩阵近似为 $WW^T$,其中 $W$ 是非负矩形因子。它在图聚类和机器学习中有广泛应用。与NMF相比,投影梯度方法在对称问题上收敛缓慢。为了解决这个问题,我们引入了SNMPBB,这是非单调投影Barzilai-Borwein方法在对称NMF上的首次应用,表明梯度算法比以前认为的有效得多。我们进一步将SNMPBB扩展到使用图拉普拉斯正则化的图聚类(Graph-SNMPBB)以及使用低秩近似的大规模问题(LAI-SNMPBB)。对于所有变体,我们证明了全局收敛到一阶稳定点,并且Barzilai-Borwein曲率信息在随机近似下得以保留。在合成数据上,SNMPBB在相似残差下比替代的SymANLS快6倍,且优势随秩增加而扩大。在六个真实世界聚类基准测试中,Graph-SNMPBB匹配或超过了SymANLS的精度。最后,LAI-SNMPBB在34个SuiteSparse矩阵上,在运行时间和残差质量方面均优于最先进的LAI-SymPGNCG。

英文摘要

Symmetric nonnegative matrix factorization (Symmetric NMF) approximates a matrix as $WW^T$ with a nonnegative rectangular factor $W$. It has broad applications in graph clustering and machine learning. In contrast to standard NMF, projected gradient methods for the symmetric problem have been associated with slow convergence. To address this, we introduce SNMPBB, the first adaptation of nonmonotone projected Barzilai-Borwein methods to Symmetric NMF, demonstrating that gradient algorithms are significantly more effective than previously understood. We further extend SNMPBB to graph clustering using graph Laplacian regularization (Graph-SNMPBB) and to large problems with low-rank approximations (LAI-SNMPBB). For all variants, we prove global convergence to first-order stationary points and show that Barzilai-Borwein curvature information is preserved under randomized approximations. On synthetic data, SNMPBB achieves a 6x speedup over the alternative SymANLS at similar residuals, with the advantage growing at higher ranks. Across six real-world clustering benchmarks, Graph-SNMPBB matches or exceeds SymANLS accuracy. Lastly, LAI-SNMPBB outperforms the state-of-the-art LAI-SymPGNCG on 34 SuiteSparse matrices in both runtime and residual quality.
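
As a rough illustration of the kind of iteration this abstract describes, the sketch below runs a projected Barzilai-Borwein gradient method with a nonmonotone (max of recent values) backtracking test on $f(W) = \frac{1}{4}\|A - WW^T\|_F^2$. It is not the authors' SNMPBB: the objective scaling, the step-size safeguards, and every name and default (`projected_bb`, `memory`, `sigma`) are assumptions chosen for a small pure-Python demo.

```python
def matmul(X, Y):
    """Dense product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def residual(A, W):
    """Return W W^T - A."""
    Wt = [list(col) for col in zip(*W)]
    return [[r - a for r, a in zip(rr, ra)] for rr, ra in zip(matmul(W, Wt), A)]


def objective(A, W):
    """f(W) = 1/4 * ||A - W W^T||_F^2."""
    return 0.25 * sum(v * v for row in residual(A, W) for v in row)


def gradient(A, W):
    """grad f(W) = (W W^T - A) W, valid when A is symmetric."""
    return matmul(residual(A, W), W)


def inner(X, Y):
    """Frobenius inner product of two matrices."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def projected_bb(A, W, iters=100, memory=5, sigma=1e-4):
    """Projected BB iteration with a nonmonotone Armijo-type acceptance test."""
    history = [objective(A, W)]
    G = gradient(A, W)
    alpha = 1e-2  # assumed initial step; later steps come from the BB formula
    for _ in range(iters):
        f_ref = max(history[-memory:])  # nonmonotone reference value
        step, accepted = alpha, False
        for _ in range(30):  # backtrack until the trial point is acceptable
            W_new = [[max(0.0, w - step * g) for w, g in zip(rw, rg)]
                     for rw, rg in zip(W, G)]
            S = [[wn - w for wn, w in zip(rn, rw)] for rn, rw in zip(W_new, W)]
            f_new = objective(A, W_new)
            if f_new <= f_ref + sigma * inner(G, S):
                accepted = True
                break
            step *= 0.5
        if not accepted or inner(S, S) == 0.0:
            break  # no acceptable step, or the iterate no longer moves
        G_new = gradient(A, W_new)
        Y = [[gn - g for gn, g in zip(rn, rg)] for rn, rg in zip(G_new, G)]
        sy = inner(S, Y)
        # BB step <S,S>/<S,Y>, clipped to a safe range (fallback 1.0 if curvature <= 0)
        alpha = min(1e3, max(1e-6, inner(S, S) / sy)) if sy > 1e-12 else 1.0
        W, G = W_new, G_new
        history.append(f_new)
    return W, history
```

Because each accepted value is compared against the maximum of the last `memory` objective values, individual steps may increase $f$, but no recorded value can exceed the starting one. The projection `max(0.0, ...)` keeps every entry of $W$ nonnegative.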

2606.02884 2026-06-03 cs.LG cs.AI 版本更新

Are we really tilting? The mechanics of reward guidance in flow and diffusion models

我们真的在倾斜吗?流模型和扩散模型中奖励引导的机制

Sanjit Dandapanthula, Nicholas M. Boffi

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 本文通过高斯混合模型和二次奖励的闭式分析,揭示了奖励引导扩散中奖励黑客现象源于Doob h函数的有限粒子插件估计,并提出了无额外计算的闭式奖励阻尼调度来纠正模式内偏差。

详情
AI中文摘要

奖励引导算法在推理时将学习到的生成过程导向奖励倾斜的测度。虽然经验上强大,但这些方法容易产生奖励黑客行为:引导模型以牺牲对学习分布的保真度为代价过度优化奖励。先前的工作将其归因于神经奖励函数的复杂性或扩散训练中的隐式偏差,但其根本起源仍知之甚少。我们表明,奖励黑客行为源于大多数实际奖励引导扩散实现中的一个近似——Doob h函数的有限粒子插件估计——即使在最简单的高斯和高斯混合目标以及二次奖励的非平凡设置中也是如此。在闭式中,我们分离了插件估计器的两种不同失效模式:它导致每个模式内的奖励黑客行为,并且无法选择高奖励模式。我们提出了一种闭式奖励阻尼调度,无需额外计算即可纠正模式内偏差,并阐明了最佳-n采样在补偿模式选择失败中的作用。在高斯混合目标、2D棋盘和FLUX.1文本到图像生成上的实验证实了我们的理论见解适用于实际设置。

英文摘要

Reward guidance algorithms steer a learned generative process toward the reward-tilted measure at inference time. While empirically powerful, these methods are prone to reward hacking: the guided model over-optimizes the reward at the cost of fidelity to the learned distribution. Prior work has attributed this to the complexity of neural reward functions or implicit biases in diffusion training, but its fundamental origins remain poorly understood. We show that reward hacking arises from an approximation made in most practical implementations of reward-guided diffusion -- finite-particle plug-in estimation of the Doob h-function -- even in the simplest non-trivial settings of Gaussian and Gaussian mixture targets with quadratic rewards. In closed form, we isolate two distinct failure modes of the plug-in estimator: it leads to reward hacking within each mode and it cannot select high-reward modes. We propose a closed-form reward damping schedule that corrects the within-mode bias with no additional compute, and clarify the role of best-of-n sampling in compensating for the mode selection failure. Experiments on Gaussian mixture targets, a 2D checkerboard, and FLUX.1 text-to-image generation confirm that our theoretical insights carry over to practical settings.
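
To make "reward tilting" concrete, here is the one-dimensional Gaussian case with a quadratic reward worked out by hand. The tilting convention $\tilde p(x) \propto p(x)\,e^{r(x)}$ and the symbols $\mu$, $\sigma$, $\lambda$, $a$ are assumptions for this sketch; the paper's own setup (for example, a temperature on the reward) may differ.

```latex
\begin{aligned}
p(x) &= \mathcal{N}(x;\, \mu, \sigma^2), \qquad
r(x) = -\tfrac{\lambda}{2}(x - a)^2, \quad \lambda > 0, \\
\tilde p(x) &\propto \exp\!\Big(-\frac{(x-\mu)^2}{2\sigma^2} - \frac{\lambda}{2}(x-a)^2\Big)
\;\Longrightarrow\; \tilde p(x) = \mathcal{N}\big(x;\, \tilde\mu, \tilde\sigma^2\big), \\
\tilde\sigma^2 &= \frac{\sigma^2}{1 + \lambda\sigma^2}, \qquad
\tilde\mu = \frac{\mu + \lambda\sigma^2 a}{1 + \lambda\sigma^2}.
\end{aligned}
```

Under these assumptions the exact tilted law is narrower than $p$ and its mean moves only part of the way from $\mu$ toward $a$. One way to read the abstract's notion of reward hacking is a guided sampler whose output concentrates nearer $a$ than this target does.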

2606.02876 2026-06-03 cs.LG 版本更新

RRISE: Robust Radius Inference via a Surrogate Estimator

RRISE: 通过代理估计器进行鲁棒半径推断

Jong-Ik Park, Shreyas Chaudhari, Carlee Joe-Wong, José M. F. Moura

发表机构 * Carnegie Mellon University(卡内基梅隆大学)

AI总结 提出RRISE框架,使用代理模型替代蒙特卡洛采样进行随机平滑认证,通过一次性共形校准保证保守半径,在保持认证精度的同时大幅降低计算成本。

详情
AI中文摘要

随机平滑(RS)使用平滑分类器提供与架构无关的$\ell_2$分类鲁棒性认证,但其对每个输入的蒙特卡洛(MC)采样的依赖限制了其在实时系统中的应用。我们认为这种代价是结构性的而非根本性的,因此可以通过在部署流中共享信息来显著降低。我们引入RRISE,一个RS框架,将认证压缩为通过学习的代理进行的单次前向传播。RRISE通过软标签交叉熵损失针对预计算的MC类计数目标训练代理,并通过一次性共形校准步骤将代理预测转换为可证明保守的认证半径。得到的证书是可部署验证的:每当校准半径为正值时,代理的预测可证明与平滑分类器的预测一致,且平滑分类器在输入周围该半径的球内是常数。在图像分类基准测试中,RRISE在固定预算MC认证精度上相差0.84个百分点以内,同时将每次查询最多10^4次噪声基础模型评估替换为单次代理前向传播,在约10^5次部署查询后即可收回MC训练成本。在CIFAR-100和Tiny ImageNet上,先前唯一的离线代理方法失效,而RRISE实现了1.23到1.91倍的认证精度提升,确立了高效随机平滑作为重复部署场景中认证鲁棒性的实用路径。

英文摘要

Randomized smoothing (RS) uses a smoothed classifier to provide architecture-agnostic certificates of $\ell_2$ classification robustness, but its dependence on per-input Monte Carlo (MC) sampling undermines its use in real-time systems. We argue that this cost is structural rather than fundamental, such that it can be significantly reduced by sharing information across the deployment stream. We introduce RRISE, an RS framework that compresses certification into a single forward pass through a learned surrogate. RRISE trains the surrogate against precomputed MC class-count targets via a soft-label cross-entropy loss and converts surrogate predictions into provably conservative certified radii through a one-time conformal calibration step. The resulting certificate is deployment-verifiable: whenever the calibrated radius is positive, the surrogate's prediction provably matches the smoothed classifier's and the smoothed classifier is constant on a ball of that radius around the input. Across image classification benchmarks, RRISE matches fixed-budget MC certified accuracy within $0.84$ percentage points while replacing up to $10^4$ noisy base-model evaluations per query with a single surrogate forward pass, recouping MC training cost after $\approx 10^5$ deployment queries. On CIFAR-100 and Tiny ImageNet, where the only prior offline-surrogate method collapses, RRISE achieves $1.23$ to $1.91\times$ higher certified accuracy, establishing efficient randomized smoothing as a practical path to certified robustness in repeated-deployment settings.
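
For context on the Monte Carlo cost that RRISE aims to avoid, the sketch below computes the standard randomized-smoothing $\ell_2$ radius $R = \sigma\,\Phi^{-1}(\underline{p})$ from noisy class counts, as commonly stated in the RS literature. It does not reproduce RRISE's surrogate or its conformal calibration. The function name, the default `alpha`, and the use of a Hoeffding lower bound (simpler and looser than the Clopper-Pearson bound usually used) are assumptions.

```python
import math
from statistics import NormalDist


def certified_radius(top_count, n_samples, sigma, alpha=0.001):
    """Return an l2 radius holding with confidence 1 - alpha, or None to abstain.

    top_count: number of noisy samples assigned to the predicted class
    n_samples: total number of noisy samples drawn for this input
    sigma:     standard deviation of the Gaussian smoothing noise
    """
    p_hat = top_count / n_samples
    # One-sided Hoeffding lower confidence bound on the top-class probability.
    p_lower = p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * n_samples))
    if p_lower <= 0.5:
        return None  # cannot certify: the top class is not provably a majority
    return sigma * NormalDist().inv_cdf(p_lower)
```

Each call presupposes `n_samples` forward passes of the base model under noise, which is the per-query cost the abstract says a single surrogate forward pass replaces.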

2606.02860 2026-06-03 cs.LG cs.AI 版本更新

Forgetting is Not Erasure: Recovering Latent Knowledge via Transport Keys

遗忘并非擦除:通过传输键恢复潜在知识

Archie Chaudhury

发表机构 * Axionic Labs(Axionic实验室)

AI总结 通过缝合评估协议和紧凑的任务特定传输键,发现灾难性遗忘主要由内部阶段接口漂移而非任务相关计算的永久擦除引起,并能在顺序训练后恢复大部分早期任务性能。

Comments Technical report showcasing results from transport keys

详情
AI中文摘要

灾难性遗忘通常被视为表征问题:在顺序训练后,模型似乎失去了支持早期任务性能的特征。我们挑战了这一观点的更强形式。在受控的持续学习设置中,我们发现相当一部分明显的遗忘可归因于内部阶段之间的接口漂移,而非任务相关计算的永久擦除。我们通过一种缝合评估协议研究这一现象,该协议将更新后网络的早期计算与其前身的后期计算相结合,并可选地通过紧凑的任务特定传输键进行中介。我们在系统层面将传输键描述为紧凑的接口对齐算子,从少量配对的锚点激活中估计,并通过模型缝合进行评估。在split CIFAR-100上使用ResNet风格网络时,传输键在顺序训练任务B后恢复了大部分原始任务A的性能。在紧凑视觉变换器上,我们观察到类似的恢复模式。这些结果表明,持续学习可能需要更好的机制来索引和重新访问潜在计算,而不仅仅是防止权重变化的方法。

英文摘要

Catastrophic forgetting is often framed as a representational problem: after sequential training, a model appears to lose the features that supported performance on earlier tasks. We challenge the stronger form of this view. Across controlled continual-learning settings, we find that a significant portion of apparent forgetting can be attributed to interface drift between internal stages rather than permanent erasure of task-relevant computation. We study this phenomenon through a stitched evaluation protocol that combines early computation from a post-update network with late computation from its predecessor, optionally mediated by a compact, task-specific transport key. We describe transport keys at a systems level as compact interface-alignment operators estimated from a small set of paired anchor activations and evaluated through model stitching. On split CIFAR-100 with a ResNet-style network, transport keys recover most of the original Task A performance after sequential training on Task B. On a compact vision transformer, we observe a similar recovery pattern. These results suggest that continual learning may require better mechanisms for indexing and re-accessing latent computations, not only methods that prevent weight change.

2606.02857 2026-06-03 cs.LG cs.AI 版本更新

GRZO: Group-Relative Zeroth-Order Optimization for Large Language Model Fine-Tuning

GRZO:用于大语言模型微调的组相对零阶优化

Liyan Tan, Yequan Zhao, Yifan Yang, Ruijie Zhang, Xinling Yu, Zheng Zhang

发表机构 * University of California, Santa Barbara(加州大学圣巴巴拉分校)

AI总结 提出GRZO优化器,通过组相对归一化聚合每个样本的损失,在不增加前向成本的情况下将有效梯度方向数从1提升至批量大小,降低方差并改善收敛,在多个模型和任务上优于MeZO。

Comments Preprint. Under review

详情
AI中文摘要

零阶优化是微调大语言模型时一种内存高效的反向传播替代方案,但其部署受限于梯度估计的高方差。我们提出GRZO,一种组相对零阶优化器,它为每个小批量样本抽取一个伪独立扰动,并通过组相对归一化聚合每个样本的损失,在不增加额外前向成本的情况下将有效梯度方向数从1提升至批量大小,同时保持推理级内存。我们证明GRZO在方向上是无偏的,方差随批量大小成比例缩小,从而得到比MeZO更紧的非凸收敛界。在RoBERTa-large、Llama3-8B和OPT-13B上,跨多个任务,GRZO在Llama3-8B上的平均准确率比MeZO提高$+3.0$,峰值GPU内存降低$23\%$;作为MeZO核心的即插即用替代,它平均将稀疏、低秩和量化ZO变体提升$+6.0$。

英文摘要

Zeroth-order (ZO) optimization is a memory-efficient alternative to backpropagation for fine-tuning large language models, but its deployment is limited by the high variance of gradient estimation. We propose GRZO, a Group-Relative Zeroth-Order optimizer that draws one pseudo-independent perturbation per mini-batch example and aggregates the per-example losses through group-relative normalization, raising the effective gradient-direction count from one to the batch size at no additional forward cost while preserving inference-level memory. We prove that GRZO is directionally unbiased with variance shrinking proportionally to the batch size, yielding a tighter nonconvex convergence bound than MeZO. Across RoBERTa-large, Llama3-8B, and OPT-13B over multiple tasks, GRZO improves average accuracy on Llama3-8B by $+3.0$ over MeZO at $23\%$ lower peak GPU memory; as a drop-in replacement for the MeZO core, it lifts sparse, low-rank, and quantized ZO variants by $+6.0$ on average.
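
The baseline that GRZO is compared against, a MeZO-style two-point estimator, can be sketched as follows. This shows only the single-direction baseline, not GRZO's per-example perturbations or group-relative normalization, whose exact form the abstract does not give. The function name and defaults are assumptions, and unlike MeZO this sketch stores the perturbation `z` instead of regenerating it from a seed.

```python
import random


def zo_gradient(loss, theta, eps=1e-3, rng=random):
    """Two-point zeroth-order gradient estimate along one random direction.

    loss:  callable mapping a list of floats to a scalar loss
    theta: current parameters as a list of floats
    Returns (estimate, z), where estimate = projected_grad * z.
    """
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    plus = [t + eps * zi for t, zi in zip(theta, z)]
    minus = [t - eps * zi for t, zi in zip(theta, z)]
    # Central finite difference: approximates the directional derivative along z.
    projected_grad = (loss(plus) - loss(minus)) / (2.0 * eps)
    return [projected_grad * zi for zi in z], z
```

Only two forward evaluations are needed and no backward pass. Every coordinate of the estimate shares one scalar `projected_grad`, which is the single-direction bottleneck the abstract refers to.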

2606.02852 2026-06-03 cs.LG 版本更新

RESCAST-100K: A Comprehensive Dataset for Cross-Domain Residential Load and Indoor Temperature Forecasting

RESCAST-100K:一个用于跨领域住宅负荷和室内温度预测的综合数据集

Jainam Dhruva, Yousaf Raza, A. B. Siddique, Simone Silvestri

AI总结 提出RESCAST-100K大规模基准数据集,通过配置驱动接口支持跨领域泛化研究,涵盖约10万模拟住宅和5个真实数据集,用于评估迁移学习、域适应和零样本预测方法。

详情
AI中文摘要

住宅能源负荷和室内温度的准确短期预测对于家庭能源管理系统、电网级需求响应和社区能效工作至关重要。域适应和迁移学习在改善住宅环境中常见的数据异质性和稀缺性下的预测精度方面显示出潜力。然而,由于缺乏全面的住宅数据集,进展受到限制:现有基准在目标覆盖范围上狭窄,且很少支持结构化的跨领域评估。我们引入了RESCAST-100K,这是一个用于研究跨领域泛化的大规模住宅预测基准。它提供了一个配置驱动的接口,沿着可解释的轴(包括地理、气候区、墙体结构和供暖设备)实例化源域和目标域,从而能够在受控域偏移下系统评估迁移学习、域适应和零样本泛化。该基准涵盖约10万个来自ResStock的EnergyPlus模拟的美国住宅,每个住宅包含三个耦合目标的15分钟时间序列:总负荷、暖通空调负荷和室内温度。这些数据与天气通道、暖通空调设定点以及超过40个静态建筑协变量配对。RESCAST-100K还整合了五个真实世界住宅数据集,采用统一模式,支持在相同任务上进行模拟到真实的评估。我们对循环、注意力和MLP混合器架构进行了零样本性能基准测试,涵盖跨领域、缺失输入条件和预测任务。在域偏移下,交叉注意力和MLP混合器模型始终优于循环和经典Transformer基线。RESCAST-100K旨在帮助机器学习和建筑分析社区在家庭、社区和电网规模上推进跨领域住宅预测。

英文摘要

Accurate short-term forecasting of residential energy load and indoor temperature is essential for home energy management systems, grid-level demand response, and community energy efficiency efforts. Domain adaptation and transfer learning have shown promise for improving forecasting accuracy under the data heterogeneity and scarcity commonly seen in residential settings. However, progress is limited by the lack of comprehensive residential datasets: existing benchmarks are narrow in target coverage and rarely support structured cross-domain evaluation. We introduce RESCAST-100K, a large-scale residential forecasting benchmark for studying cross-domain generalization. It provides a configuration-driven interface that instantiates source and target domains along interpretable axes, including geography, climate zone, wall construction, and heating equipment, enabling systematic evaluation of transfer learning, domain adaptation, and zero-shot generalization under controlled domain shifts. The benchmark covers approximately 100,000 EnergyPlus-simulated U.S. homes derived from ResStock, with 15-minute time series for three coupled targets per home: total load, HVAC load, and indoor temperature. These are paired with weather channels, HVAC setpoints, and over 40 static building covariates. RESCAST-100K also integrates five real-world residential datasets under a unified schema, supporting sim-to-real evaluation on the same tasks. We benchmark recurrent, attention-based, and MLP-mixer architectures for zero-shot performance across domains, missing-input conditions, and forecasting tasks. Cross-attention and MLP-mixer models consistently outperform recurrent and classical transformer baselines under domain shift. RESCAST-100K is intended to help the machine learning and building analytics communities advance cross-domain residential forecasting at home, community, and grid scale.

2606.02849 2026-06-03 cs.LG 版本更新

A Systematic Evaluation of Current Architectures in Wind Power Forecasting

风电功率预测中当前架构的系统评估

Vinicius Bortolini, Gilson Adamczuk Oliveira, Erick Oliveira Rodrigues, Matheus Henrique Dal Molin Ribeiro

AI总结 本文通过系统文献综述,评估混合深度学习、模态分解和统计方法在风电区间预测中的应用,发现结合VMD或EEMD等分解技术能提高预测精度和可靠性。

详情
Journal ref
IEEE Access 2025
AI中文摘要

区间风速预测对于将风能有效集成到电力系统中至关重要,因为它考虑了风资源的固有不确定性。本研究对风电发电区间预测的混合方法进行了系统文献综述,探讨了深度学习、模态分解和统计方法的结合。为了指导论文选择,应用了潜在狄利克雷分配(LDA)进行主题建模,从而识别出模式和研究趋势。研究结果强调,将混合模型与分解技术(如变分模态分解(VMD)和集合经验模态分解(EEMD))相结合,通过在不牺牲覆盖率的情况下缩小预测区间,提高了预测准确性和可靠性。关于区间构建,大多数研究采用双模型策略,独立预测上下界。输入数据通常使用EMD、EEMD或VMD等技术进行分解,提取基于频率的分量。这些分量作为LSTM或ELM等模型的输入,分别针对每个边界进行训练。这种方法允许对不确定性进行有针对性的建模,提高了灵活性和精度。区间质量通常通过平衡覆盖率和区间宽度的指标进行评估。该综述还强调了挑战,包括缺乏标准化的评估指标、计算复杂性和有限的实际验证。总体而言,该研究强化了区间预测对风能运营的价值,并为提高模型鲁棒性和决策提供了见解。

英文摘要

Interval wind speed forecasting is essential for the efficient integration of wind energy into power systems, as it accounts for the inherent uncertainty of wind resources. This study presents a systematic literature review focused on hybrid approaches to interval forecasting of wind generation, exploring the combination of deep learning, modal decomposition, and statistical methods. To guide the paper selection, Latent Dirichlet Allocation (LDA) was applied for topic modeling, enabling the identification of patterns and research trends. The findings emphasize that integrating hybrid models with decomposition techniques, such as Variational Mode Decomposition (VMD) and Ensemble Empirical Mode Decomposition (EEMD), enhances forecast accuracy and reliability by narrowing prediction intervals without compromising coverage. Regarding interval construction, most studies adopt a dual-model strategy, independently forecasting the lower and upper bounds. Input data are commonly decomposed using techniques like EMD, EEMD, or VMD, which extract frequency-based components. These components serve as inputs to models such as LSTM or ELM, trained separately for each bound. This approach allows for targeted modeling of uncertainty, improving flexibility and precision. Interval quality is typically evaluated through metrics that balance coverage and interval width. The review also highlights challenges, including the lack of standardized evaluation metrics, computational complexity, and limited real-world validation. Overall, the study reinforces the value of interval forecasting for wind energy operations and offers insights for advancing model robustness and decision-making.

2606.02842 2026-06-03 cs.LG 版本更新

Spectral-Progressive Thought Flow for Lightweight Multimodal Reasoning

光谱渐进式思维流:轻量级多模态推理

Yixian Shen, Zhiheng Yang, Qi Bi, Changshuo Wang, Shuai Wang, Jia-Hong Huang, George Floros, Prayag Tiwari, Anuj Pathania

AI总结 提出光谱渐进式思维流(SpecFlow),通过在固定大小离散余弦空间中表示中间视觉思维,并利用无分类器引导将视觉状态更新与文本意图对齐,实现轻量级多模态空间推理,在保持竞争性能的同时将计算和KV缓存成本降低高达2.1倍。

Comments Accepted at ICML 2026

详情
AI中文摘要

多模态空间推理通常依赖于长链的中间文本和视觉思维,其中累积的视觉标记和密集的跨模态注意力会带来大量的计算和内存开销。为了解决这一挑战,我们提出了光谱渐进式思维流(SpecFlow),一种新颖的轻量级多模态空间推理框架,它在固定大小的离散余弦空间中表示中间视觉思维。通过利用强大的能量压缩,SpecFlow保留了全局布局和关系结构,同时仅在需要增加空间精度时引入高频细节。为了将视觉状态演化与语言意图对齐,无分类器引导使得自回归文本思维能够引导基于流的视觉工作空间/状态更新,而无需扩展上下文。因此,SpecFlow维持一个有界的视觉工作空间,其更新仅依赖于当前视觉状态和累积的文本轨迹,从而能够以稳定的延迟和内存使用进行长程推理,且与推理深度无关。实验结果表明,SpecFlow在实现竞争性或更优推理性能的同时,将计算和KV缓存成本降低了高达2.1倍。

英文摘要

Multimodal spatial reasoning often relies on long chains of intermediate textual and visual thoughts, where accumulating visual tokens and dense cross-modal attention incur substantial computation and memory overhead. To address this challenge, we propose Spectral-Progressive Thought Flow (SpecFlow), a novel lightweight multimodal spatial reasoning framework that represents intermediate visual thoughts in a fixed-size discrete cosine space. By exploiting strong energy compaction, SpecFlow preserves global layout and relational structure while introducing high-frequency details only when increased spatial precision is required. To align visual state evolution with linguistic intent, classifier-free guidance enables autoregressive textual thoughts to steer flow-based updates of the visual workspace/state without expanding the context. As a result, SpecFlow maintains a bounded visual workspace whose updates depend only on the current visual state and accumulated textual trace, enabling long-horizon inference with stable latency and memory usage independent of reasoning depth. Empirical results show that SpecFlow achieves competitive or superior reasoning performance while reducing computation and KV cache costs by up to 2.1 times.
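
The "energy compaction" this abstract relies on can be illustrated with a one-dimensional orthonormal DCT-II: for a smooth signal, most of the energy lands in the first few coefficients. This is a generic textbook demonstration, not SpecFlow itself; the helper names are assumptions, and the paper presumably works on 2D visual states rather than a 1D list.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a list of floats (energy-preserving)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(x)))
    return out


def low_frequency_energy(x, keep):
    """Fraction of total signal energy held by the first `keep` DCT coefficients."""
    coeffs = dct2(x)
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total
```

For the ramp `[0, 1, ..., 7]`, the first two coefficients already carry more than 99% of the energy, which is why a fixed-size low-frequency block can retain global layout while deferring fine detail.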

2606.02841 2026-06-03 cs.LG math.AT 版本更新

Learning Coherent Representations: A Topological Approach to Interpretability

学习一致表示:一种拓扑可解释性方法

Sigurd Gaukstad, Melvin Vaupel, Valdemar Kargård Olsen, Erik Hermansen, Benjamin Dunn

发表机构 * University of Oslo(奥斯陆大学)

AI总结 提出基于脑神经编码启发的“一致性”几何约束,通过Fréchet方差目标函数Coh训练模型,使特征在样本空间中形成连续区域,从而提升表示的可解释性。

Comments To appear in ICML 2026

详情
AI中文摘要

深度神经网络学习的表示中,单个特征往往缺乏可解释意义;一个神经元可能对分散、不相关的输入激活。我们引入一致性,这是一种受大脑神经编码启发的几何性质,其中像网格细胞和头部方向细胞这样的神经元对状态空间的连续区域做出响应。一个非负矩阵是一致的,如果每个行(样本)关注几何上聚类的列(特征),反之亦然,并且每个样本都由某个特征很好地描述,每个特征都被某个样本需要。我们证明一致矩阵在样本和特征的Vietoris-Rips过滤之间诱导有界交错,保证两个空间共享兼容的拓扑结构。这种几何约束促进了可解释性。例如,如果数据位于圆上,一致特征必须将该圆分割成连续的弧段。我们引入Coh,一种基于Fréchet方差的可微目标函数,在训练过程中强制执行一致性。与稀疏性(限制一个特征激活多少个样本)不同,一致性限制哪些样本,要求几何连通性而不仅仅是稀有性。这不仅产生可解释的特征,还产生可解释的特征空间。我们使用合成数据和旋转MNIST数据集在自编码器中验证Coh,并使用语言数据在BERT的词嵌入中验证Coh。

英文摘要

Deep neural networks learn representations where individual features often lack interpretable meaning; a single neuron may activate for scattered, unrelated inputs. We introduce coherence, a geometric property inspired by neural coding in the brain, where neurons like grid cells and head direction cells respond to contiguous regions of state space. A non-negative matrix is coherent if each row (sample) attends to geometrically clustered columns (features) and vice versa, and in addition every sample is well described by some feature and every feature is needed by some sample. We prove that coherent matrices induce a bounded interleaving between the Vietoris-Rips filtrations of samples and features, guaranteeing that both spaces share compatible topological structure. This geometric constraint facilitates interpretability. For example, if data lies on a circle, coherent features must tile that circle into contiguous arcs. We introduce Coh, a differentiable objective function based on Fréchet variance that enforces coherence during training. Unlike sparsity, which bounds how many samples a feature activates on, coherence bounds which samples, requiring geometric connectivity rather than only rarity. This yields not just interpretable features but an interpretable feature space. We validate Coh in an auto-encoder using synthetic and rotated MNIST datasets and in a token embedding of BERT using language data.
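
To give a feel for why a Fréchet-variance term can favor contiguous activations, the sketch below computes an activation-weighted Fréchet variance of sample positions in Euclidean space, where the Fréchet mean reduces to the weighted average. This is an assumption-laden illustration, not the paper's Coh objective: the function name, the use of one feature's activations as weights, and the Euclidean metric are all hypothetical choices.

```python
def frechet_variance(points, weights):
    """Weighted mean squared Euclidean distance to the weighted mean.

    points:  list of equal-length coordinate tuples (sample positions)
    weights: non-negative activations of one feature on each sample
    """
    total = sum(weights)
    dim = len(points[0])
    # In Euclidean space the Frechet mean is the weighted arithmetic mean.
    mean = [sum(w * p[d] for p, w in zip(points, weights)) / total
            for d in range(dim)]
    return sum(w * sum((p[d] - mean[d]) ** 2 for d in range(dim))
               for p, w in zip(points, weights)) / total
```

A feature that fires on three neighboring samples scores lower than one that fires on three scattered samples, even though both are equally sparse. That matches the abstract's distinction between bounding how many samples a feature activates on and bounding which ones.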

2606.02830 2026-06-03 cs.LG math.OC 版本更新

Mitigating Spurious Correlations with Memorization-Guided Dataset De-Biasing

利用记忆引导的数据集去偏缓解虚假相关性

Arda Fazla, Abolfazl Hashemi

发表机构 * School of Electrical and Computer Engineering, Purdue University(电子与计算机工程学院,普渡大学)

AI总结 提出一种两阶段样本评分函数,通过解耦核心特征与虚假特征的学习动态来识别并优先选择信息性样本,从而在仅需10%原始数据的情况下,使用标准ERM模型超越现有去偏技术。

详情
AI中文摘要

真实世界的数据集通常包含与目标标签无因果关系的虚假相关性。当这些相关性主导大部分训练样本时,模型倾向于依赖它们,导致对不呈现相同虚假模式的少数样本分类错误。虽然一种潜在的方法是选择数据子集以更好地代表少数样本,但这可能需要访问通常未知的组标签。此外,正如我们所展示的,在不变子集或核心集选择文献中广泛使用的样本评分函数在很大程度上依赖于虚假特征,因此无法准确捕捉核心因果相关特征的重要性或难度。因此,我们提出通过开发一种两阶段样本评分函数来缓解虚假相关性,该函数解耦核心特征和虚假特征的学习动态,并分别评估它们的难度。基于我们提出的度量,我们引入了一种新算法来查找并优先处理带有和不带有虚假相关性的信息样本。大量实验表明,在我们选择的样本上训练的标准ERM模型在性能上优于最先进的去偏技术,同时仅需要原始训练数据的10%。

英文摘要

Real-world datasets often contain spurious correlations that are not causally related to the target label. When such correlations dominate the majority of training samples, models tend to rely on them, leading to misclassification of minority samples that do not exhibit the same spurious patterns. While a potential approach is to select subsets of data to better represent the minority samples, this may require access to group labels, which are typically unknown. Furthermore, as we demonstrate, widely used sample scoring functions in the invariant subset or coreset selection literature largely depend on spurious features and therefore fail to accurately capture the importance or difficulty of core, causally relevant features. Accordingly, we propose to mitigate spurious correlations by developing a two-stage sample scoring function that disentangles the learning dynamics of core and spurious features and evaluates their difficulty separately. Based on our proposed metric, we introduce a new algorithm to find and prioritize informative samples both with and without spurious correlations. Extensive experiments demonstrate that a standard ERM model trained on our selected samples achieves superior performance compared to state-of-the-art debiasing techniques, while requiring as little as 10% of the original training data.

2606.02823 2026-06-03 cs.LG 版本更新

Qift: Shift-Friendly No-Zero W2 Post-Training Quantization for Rotated W2A4/KV4 LLM Inference

Qift: 面向旋转W2A4/KV4 LLM推理的移位友好型无零W2后训练量化

Chi-Wei Huang, Chia-Chi Tsai

发表机构 * National Cheng Kung University(国立成功大学)

AI总结 针对旋转量化流水线中W2A4/KV4设置下标准W2电平集性能不佳的问题,提出一种基于零中心高斯源模型的无零固定W2电平集Qift,通过重新设计码本映射实现无训练、无学习码本、无组网格、无零点的量化,显著提升模型困惑度和下游精度。

Comments 23 pages, 8 figures

详情
AI中文摘要

两位权重量化对于内存高效的LLM推理具有吸引力,但标准W2电平集{-2,-1,0,+1}在激进的W2A4/KV4设置下经常崩溃。我们研究了哈达玛旋转量化流水线中两位权重的标量电平集几何结构。传统的非对称W2相比标准电平集有显著改进,表明W2A4失败不仅是位宽问题,也是重构电平问题。在LLaMA-2-7B和LLaMA-3.1-8B的每个224个线性模块中,预训练权重已经接近零中心,而哈达玛旋转主要使其标准化形状高斯化:超额峰度和Q-Q误差下降数个数量级。基于这种近似零中心高斯型源模型,我们提出了Qift,一种用于旋转W2A4/KV4推理的固定无零W2电平集。主电平集为{+/-0.5, +/-1.5},在半尺度重参数化下等价于{+/-1, +/-3};一种2的幂次变体使用{+/-1, +/-4}用于符号移位解码权重应用。Qift重新设计了固定的两位码本到电平的映射,并且无需训练、无需学习码本、无需组网格、无需零点,保留了标准的每通道尺度。一种尺度不变比率分析确定了有效内/外质心比率范围为0.25到0.33,解释了为什么镜像无零(MNZ)、Lloyd、NF2和PoT-MNZ表现良好而{+/-1, +/-2}则不然。在两个模型上,无零电平集在纯W2A4困惑度、L层混合W2/W4困惑度、下游准确率和GPTQ残差行为上持续优于标准W2电平集。在L=16混合精度下,它们显著缩小了与W3A4的差距,同时保持一半的Transformer层为两位精度,为更复杂的学习型W2码本提供了一种简单、源感知且易于部署的替代方案。

英文摘要

Two-bit weight quantization is attractive for memory-efficient LLM inference, but the standard W2 level set {-2,-1,0,+1} often collapses under aggressive W2A4/KV4 settings. We study the scalar level-set geometry of two-bit weights in a Hadamard-rotated quantization pipeline. Conventional asymmetric W2 substantially improves over the standard level set, indicating that W2A4 failure is not only a bit-width problem but also a reconstruction-level problem. Across all 224 linear modules in each of LLaMA-2-7B and LLaMA-3.1-8B, pretrained weights are already nearly zero-centered, while Hadamard rotation primarily Gaussianizes their standardized shape: excess kurtosis and Q-Q error drop by orders of magnitude. Based on this approximate zero-centered Gaussian-like source model, we propose Qift, a fixed no-zero W2 level set for rotated W2A4/KV4 inference. The main level set is {+/-0.5, +/-1.5}, equivalently {+/-1, +/-3} under a half-scale reparameterization; a power-of-two variant uses {+/-1, +/-4} for sign-and-shift decoded weight application. Qift redesigns the fixed two-bit code-to-level mapping and is training-free, learned-codebook-free, group-grid-free, and zero-point-free, retaining the standard per-channel scale. A scale-invariant ratio analysis identifies an effective inner/outer centroid ratio range of 0.25 to 0.33, explaining why mirror no-zero (MNZ), Lloyd, NF2, and PoT-MNZ perform well while {+/-1, +/-2} does not. On both models, the no-zero level sets consistently improve pure W2A4 perplexity, L-layer mixed W2/W4 perplexity, downstream accuracy, and GPTQ residual behavior over the standard W2 level set. At L=16 mixed precision, they substantially narrow the gap to W3A4 while keeping half of the transformer layers at two-bit precision, giving a simple, source-aware, and deployment-friendly alternative to more complex learned W2 codebooks.
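
The level sets named in this abstract can be compared with a small nearest-level quantizer and a ratio check. This is a sketch under assumptions: `scale` is simply passed in (the abstract only says a standard per-channel scale is retained), the constant and function names are hypothetical, and "inner/outer ratio" is read here as the smallest level magnitude divided by the largest.

```python
STANDARD_W2 = (-2.0, -1.0, 0.0, 1.0)   # standard two-bit level set
QIFT_MAIN = (-1.5, -0.5, 0.5, 1.5)     # no-zero main level set
QIFT_POT = (-4.0, -1.0, 1.0, 4.0)      # power-of-two variant


def quantize(weights, levels, scale):
    """Map each weight to scale * (nearest level to weight / scale)."""
    return [scale * min(levels, key=lambda q: abs(w / scale - q)) for w in weights]


def inner_outer_ratio(levels):
    """Smallest level magnitude over largest level magnitude."""
    magnitudes = sorted({abs(q) for q in levels})
    return magnitudes[0] / magnitudes[-1]
```

With these definitions the main set has ratio 1/3 and the power-of-two set 0.25, both at the edges of the 0.25 to 0.33 range the abstract reports (0.33 being 1/3 rounded), while {+/-1, +/-2} has ratio 0.5 and falls outside it. A small weight such as 0.2 maps to 0.5 rather than 0 under `QIFT_MAIN`, which is what "no-zero" means in practice.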

2606.02785 2026-06-03 cs.LG hep-ex physics.atom-ph quant-ph 版本更新

QUIVER: Quantum-Informed Views for Enhanced Representations in Large ML Models

QUIVER: 用于大型机器学习模型中增强表示的量子信息视角

Aritra Bal, Michael Binder, Markus Klute, Benedikt Maier, Michael Spannowsky

发表机构 * University of California, Berkeley(加州大学伯克利分校) ETH Zurich(苏黎世联邦理工学院) University of Cambridge(剑桥大学)

AI总结 提出QUIVER框架,通过变分量子电路提取量子Fisher信息矩阵作为几何特征,与经典特征融合以提升大型机器学习模型性能,并在QM9和JetClass数据集上验证了有效性。

Comments 9 pages, 1 figure and 2 tables. Accepted as a poster at the AI4Physics Workshop, ICML 2026 (Seoul, South Korea)

详情
AI中文摘要

大型机器学习模型显著受益于提供同一示例互补视角的多模态输入。我们引入QUIVER(量子信息增强表示视角),这是一种用量子Fisher视角丰富经典数据驱动特征的范式:量子Fisher视角是一种几何驱动的、基无关的高阶相关性总结,由为执行相同任务而训练的变分量子电路(VQC)捕获。与经典特征增强不同,量子Fisher信息矩阵编码了学习到的量子态流形的内在几何结构。虽然这种受量子信息理论启发的特征映射通常难以经典建模,但它可以揭示额外的经典数据或模型容量难以学习的统计结构。这使得量子Fisher视角成为一种真正互补而非冗余的模态。我们证明QUIVER在两个来自完全不同领域的基准数据集上提升了标准性能指标:用于预测分子性质的QM9,以及用于预测大型强子对撞机(LHC)喷注风味的JetClass。然而,核心贡献是领域无关的:量子Fisher视角可以通过对基础架构进行针对性修改,融合到广泛类别的模型架构中,以纳入问题的量子几何信息。这些结果表明,从模拟变分电路中提取的量子几何特征,可以在容错量子硬件出现之前,为标准机器学习任务带来可衡量的价值。

英文摘要

Large machine learning models benefit substantially from multimodal inputs that provide a complementary view of the same example. We introduce QUIVER (QUantum-Informed Views for Enhanced Representations), a paradigm that enriches classical data-driven features with a quantum Fisher view: a geometrically motivated, basis-independent summary of higher-order correlations captured by a variational quantum circuit (VQC) trained to perform the same task. Unlike classical feature augmentation, the quantum Fisher information matrix encodes the intrinsic geometry of the learned quantum state manifold. While this feature map, motivated by quantum information theory, is ordinarily non-trivial to model classically, it can surface statistical structure that additional classical data or model capacity finds difficult to learn. This makes the quantum Fisher view a genuinely complementary modality rather than a redundant one. We demonstrate that QUIVER improves standard performance metrics on two benchmark datasets from very different fields: QM9 for predicting molecule properties, and JetClass for predicting jet flavor at the Large Hadron Collider (LHC). The core contribution, however, is domain-agnostic: the quantum Fisher view can be fused into a broad class of model architectures via targeted modifications to the base architecture, to incorporate information about the quantum geometry of the problem. These results demonstrate that quantum-geometric features, extracted from simulated variational circuits, can deliver measurable value for standard machine learning tasks, well before the advent of fault-tolerant quantum hardware.

2606.02767 2026-06-03 cs.RO cs.LG 版本更新

Hybrid Adaptive Kalman Filtering for Data-Efficient Joint Tracking and Classification

混合自适应卡尔曼滤波用于数据高效的联合跟踪与分类

Jiho Lee, Nisar R. Ahmed, Rebecca Russell

发表机构 * Charles Stark Draper Laboratory, Inc.(查尔斯·斯塔克·德雷珀实验室) Ann and H. J. Smead Department of Aerospace Engineering Sciences(安与H.J.斯梅德航空航天工程科学系)

AI总结 提出一种自监督混合自适应卡尔曼滤波器,通过仅从测量中学习系统动力学和过程噪声协方差的结构化校正,同时保持滤波器的概率结构,实现低数据和大数据场景下的高精度估计与鲁棒分类。

Comments 8 pages, 4 figures

详情
AI中文摘要

卡尔曼滤波性能对模型失配和噪声协方差调谐高度敏感。基于学习的方法解决了这些局限性,但通常依赖于大量数据集的监督训练,且不能产生一致的不确定性估计。在本文中,我们提出了一种自监督混合自适应卡尔曼滤波器,该滤波器仅从测量中学习系统动力学和过程噪声协方差的结构化校正,同时保持滤波器的概率结构。这使得可以计算创新似然,并随后通过广义贝叶斯推理用于模型分类。在真实世界和模拟数据集上的实验结果表明,在低数据和大数据场景下,估计精度和统计一致性均得到提高,分类性能也表现出鲁棒性。

英文摘要

Kalman filtering performance is highly sensitive to model mismatch and noise covariance tuning. Learning-based approaches address these limitations but typically rely on supervised training with large datasets and do not produce consistent uncertainty estimates. In this paper, we propose a self-supervised Hybrid Adaptive Kalman Filter that learns structured corrections to system dynamics and process noise covariance from measurements alone while preserving the probabilistic structure of the filter. This allows the innovation likelihood to be computed and subsequently used for model classification via generalized Bayesian inference. Experimental results on real-world and simulated datasets demonstrate improved estimation accuracy and statistical consistency as well as robust classification performance across both low-data and large-data scenarios.

2606.02765 2026-06-03 cs.LG cs.AI 版本更新

Representational Capacity: Geometric Limits on Feature Representation in Transformer Language Models

表示能力:Transformer语言模型中特征表示的几何限制

Alexander Guha

发表机构 * Arizona State University(亚利桑那州立大学)

AI总结 基于线性表示和叠加假设,通过嵌入矩阵的余弦相似度分布估计模型可支持的近正交方向数量,推导出容量公式,并发现容量对偏差ε指数敏感。

Comments 22 pages, 10 figures. Submitted to NeurIPS 2026. This is a condensed version of thesis: https://hdl.handle.net/2286/R.2.N.204857

详情
AI中文摘要

模型维度($d_{model}$)是Transformer语言模型中的一个基本超参数,但其在设定特征表示的几何限制方面的作用仍未得到充分探索。基于线性表示和叠加假设——这些假设提出模型将特征编码为潜在空间中的近正交方向——我们开发了一个框架来估计模型可以支持多少个这样的方向。我们首先将嵌入矩阵确立为跨潜在空间近正交约束的可测量代理:成对余弦相似度分布中有意义的token关系与偶然相似性之间的边界给出了模型对完美正交性的可接受偏差ε的具体估计。将此度量应用于数十个开源模型揭示了两个类别:具有高ε且其嵌入缺乏近正交结构的模型,以及具有低ε且保持近正交结构的模型。然后我们表明,标准的Johnson-Lindenstrauss引理大大低估了训练表示的填充效率,并推导出一个调整后的容量公式,其中近正交方向的数量取决于向量与维度的比率($k/d$)而非原始计数——这一单一修改在没有额外参数的情况下将预测误差降低了两个数量级。结合这些结果,我们将表示能力定义为模型潜在空间中可用于特征和嵌入的可区分方向上界。容量对ε指数敏感,并且较大的模型倾向于更严格的正交约束而非最大化原始容量——这一模式与几种解释(稳定性-容量权衡、可用概念的上限或模型规模的混杂因素)兼容,我们将这些留给未来工作。

英文摘要

Model dimension ($d_{model}$) is a fundamental hyperparameter in transformer language models, yet its role in setting the geometric limits of feature representation remains under-explored. Grounded in the Linear Representation and Superposition Hypotheses, which propose that models encode features as near-orthogonal directions in latent space, we develop a framework for estimating how many such directions a model can support. We first establish the embedding matrix as a measurable proxy for near-orthogonality constraints across the latent space: the boundary between meaningful token relationships and incidental similarity in the pairwise cosine similarity distribution gives a concrete estimate of the model's accepted deviation $\varepsilon$ from perfect orthogonality. Applying this metric across dozens of open-source models reveals two classes: models with high $\varepsilon$ whose embeddings lack near-orthogonal structure, and models with low $\varepsilon$ that maintain it. We then show that the standard Johnson-Lindenstrauss lemma greatly underestimates the packing efficiency of trained representations, and derive an adjusted capacity formula in which the number of near-orthogonal directions depends on the ratio of vectors to dimensions ($k/d$) rather than the raw count; this single modification cuts prediction error by two orders of magnitude with no extra parameters. Combining these results, we define representational capacity as an upper bound on the number of distinguishable directions available for features and embeddings in a model's latent space. Capacity is exponentially sensitive to $\varepsilon$, and larger models favor tighter orthogonality constraints over maximizing raw capacity, a pattern compatible with several explanations (a stability-capacity trade-off, a ceiling on usable concepts, or confounds with model scale) that we leave to future work.

2606.02762 2026-06-03 cs.LG 版本更新

Binary Road Surface Classification Using Machine Learning on Production Vehicle Signals During Cruising

基于生产车辆巡航信号的道路表面二分类机器学习方法

Vishal Hariharan, Salar Basiri, Kanwar Bharat Singh

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 针对巡航工况下传统摩擦估计方法失效的问题,提出基于特征和端到端数据驱动框架,利用车辆动力学信号统计特征对路面抓地力(干/湿)与打滑(雪/冰)进行二分类。

详情
AI中文摘要

实时道路滑溜性知识,甚至更精确的峰值抓地潜力估计,是车辆预警和干预控制系统的关键输入。通常,摩擦通过基于动力学的递归估计器计算滑移斜率来估计;然而,其有效性受到车辆动力学场景的严重限制。当车辆巡航且几乎没有滑移时,由于当前生产级传感器(如轮速传感器)和方法无法测量或准确估计微滑移(这对区分不同路面至关重要),这些方法变得无效。为了解决这一挑战,需要利用机器学习揭示巡航过程中车辆信号与路面条件之间的相关性。本文采用基于特征的框架和端到端数据驱动框架,将车辆动力学行为统计量与路面条件相关联,并执行二分类:抓地(干或湿)和打滑(雪或冰)。采用滑动窗口方法,将短时缓冲窗口内的轮速、轮扭矩、纵向加速度、转向角和横摆角速度批量输入机器学习模块,以预测道路状态。在公共道路数据上的验证结果表明,即使在巡航过程中,数据驱动方法也能正确识别路面,显示出在轮胎和车辆动力学领域实现精确数据驱动摩擦相关状态估计器的潜力。

英文摘要

Knowledge of real-time road slipperiness, or even better, a refined estimate of peak grip potential, is a critical input for vehicle warning and intervention control systems. Typically, friction is estimated through dynamics-based recursive estimators by calculating the slip slope; however, the efficacy of this approach is heavily constrained by the vehicle dynamic scenario. When the vehicle is cruising and there is little to no slip, these methods become ineffective because present-day production-grade sensors, such as wheel speed sensors, and current methods can neither measure nor accurately estimate micro slip, which is crucial for distinguishing different surfaces. To address this challenge, the correlation between vehicle signals and road surface condition during cruising needs to be uncovered using machine learning. In this paper, a feature-based framework and an end-to-end data-driven framework are used to correlate the statistics of vehicle dynamics behavior with the condition of the road surface and perform binary classification into grip (dry or damp) and slip (snow or ice) conditions. A sliding-window approach is adopted to batch a short buffered window of wheel speeds, wheel torques, longitudinal acceleration, steering angle, and yaw rate, which are fed into a machine learning module for predicting the road state. Validation results on public-road data show scenarios where the data-driven method identifies the road surface correctly even during cruising, showing promise for accurate data-driven friction-related state estimators in the field of tire and vehicle dynamics.
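
The sliding-window batching mentioned in this abstract might look like the following for a single signal channel. It is a minimal sketch: the window size, stride, and the choice of mean and standard deviation as per-window statistics are assumptions, since the abstract does not list the actual features or buffer length.

```python
from statistics import mean, pstdev


def sliding_windows(samples, size, stride):
    """Yield consecutive windows of `size` samples, advancing by `stride`."""
    for start in range(0, len(samples) - size + 1, stride):
        yield samples[start:start + size]


def window_features(window):
    """Simple per-window statistics that a feature-based classifier could use."""
    return {"mean": mean(window), "std": pstdev(window)}
```

In a feature-based pipeline, `window_features` would be applied to each channel (wheel speeds, wheel torques, longitudinal acceleration, steering angle, yaw rate) and the results concatenated. An end-to-end model would instead consume the raw windows directly.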

2606.02754 2026-06-03 cs.LG 版本更新

$\Psi$-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues

$\Psi$-Bench: 评估说服性对话中人格敏感的影响力

Peixuan Han, Hongyi Du, Jiayu Liu, Yihang Sun, Yutong Liu, Jiaxuan You

发表机构 * University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校)

AI总结 提出 $\Psi$-Bench 基准,通过三个现实场景评估 LLM 利用用户画像进行说服的能力,发现当前模型仍有较大提升空间,且用户画像带来 18.24% 的性能提升。

详情
AI中文摘要

个性化是现代语言代理的关键能力。然而,当前研究主要将个性化代理定位为对用户偏好的被动响应者,限制了其与用户交互并主动提供建议或指导的能力。为了在真实交互中系统评估这种主动个性化,我们提出了 $\Psi$-Bench,一个评估 LLM 通过对话影响真实用户能力的基准。我们在 $\Psi$-Bench 中设计了三个涉及说服的现实交互场景,并通过从对话历史中提取的显式用户画像赋予模拟客户个性特征。我们在 $\Psi$-Bench 上评估了 10 个前沿 LLM,发现尽管大多数模型能产生连贯合理的论点,但即使是最先进的模型在说服方面仍有相当大的改进空间。我们还发现,提供客户画像访问权限平均带来 18.24% 的性能提升,突显了用户特定信息对有效说服的重要性。总体而言,我们的工作强调了人格敏感的影响力作为评估和开发更主动的个性化 LLM 代理的一个具有挑战性但实用的方向。代码可在以下网址获取:https://github.com/Hanpx20/Psi-Bench。

英文摘要

Personalization is a crucial capability of modern language agents. However, current research primarily positions personalized agents as passive responders to user preferences, limiting their ability to interact with users and provide suggestions or guidance proactively. To systematically evaluate such proactive personalization in realistic interactions, we propose $\Psi$-Bench, a benchmark for assessing LLMs' ability to influence realistic users through conversation. We design three real-world interaction scenarios that involve persuasion in $\Psi$-Bench, and endow simulated clients with personal characteristics through explicit user profiles derived from dialogue histories. We evaluate 10 frontier LLMs on $\Psi$-Bench and find that while most models can produce coherent and reasonable arguments, even state-of-the-art models still leave considerable room for improvement in persuasion. We also find that providing access to client profiles yields an average performance gain of 18.24%, highlighting the importance of user-specific information for effective persuasion. Overall, our work highlights persona-sensitive influencing as a challenging yet practical direction for evaluating and developing more proactive personalized LLM agents. Codes are available at: https://github.com/Hanpx20/Psi-Bench.

2606.02745 2026-06-03 cs.RO cs.LG 版本更新

SeeTraceAct: Visibility-Aware Latent Planning from Cross-Embodiment Demonstration Videos

SeeTraceAct: 跨具身演示视频中的可见性感知潜在规划

Jaehyeon Son, Junhyun Kim, Kyle Kam, Jeremiah Coholich, Seok Joon Kim, Jinhoo Kim, Chris Dongjoo Kim, Jaemin Cho, Dieter Fox, Zsolt Kira

发表机构 * Georgia Institute of Technology(佐治亚理工学院) Allen Institute for AI(Allen人工智能研究所) Johns Hopkins University(约翰霍普金斯大学) University of Washington(华盛顿大学)

AI总结 提出SeeTraceAct框架,通过可见性感知的未来末端执行器轨迹预测增强空间定位,实现基于单次跨具身演示视频的机器人策略泛化,在模拟和真实场景中取得最优成功率。

详情
AI中文摘要

视觉-语言-动作模型(VLA)是有前途的通用机器人策略,但将其适应新任务通常需要昂贵的任务特定遥操作数据。作为替代,我们研究一次性演示条件VLA,其中机器人策略以未见任务的单个演示视频为条件。我们发现,当成功执行需要精确定位小目标区域时,现有的端到端方法往往难以应对。为解决这一限制,我们提出SeeTraceAct,一种演示条件VLA框架,通过可见性感知的未来末端执行器轨迹预测来鼓励精确的空间定位。为实现跨具身演示的可重复评估,我们引入并发布了RoboCasa-DC,这是RoboCasa的演示条件扩展,包含成对的人形机器人视频。在RoboCasa-DC和真实世界基准(Franka Panda臂以人类演示为条件)上的实验表明,SeeTraceAct优于基线,在所有四个RoboCasa-DC设置中实现了最佳成功率,并将真实世界平均成功率提高了12.5个百分点。

英文摘要

Vision-language-action models (VLAs) are promising general-purpose robot policies, but adapting them to new tasks typically requires costly task-specific teleoperation data. As an alternative, we study one-shot demo-conditioned VLAs, where a robot policy is conditioned on a single demonstration video of an unseen task. We find that existing end-to-end approaches often struggle when successful execution requires precisely localizing small target regions. To address this limitation, we propose SeeTraceAct, a demo-conditioned VLA framework that encourages precise spatial grounding through visibility-aware prediction of future end-effector traces. To enable reproducible evaluation with cross-embodiment demonstrations, we introduce and release RoboCasa-DC, a demo-conditioned extension of RoboCasa with episode-paired humanoid videos. Experiments on RoboCasa-DC and a real-world benchmark, where a Franka Panda arm is conditioned on human demonstrations, show that SeeTraceAct outperforms baselines, achieving the best success rate across all four RoboCasa-DC settings and improving real-world average success by 12.5 percentage points.

2606.02680 2026-06-03 cs.LG 版本更新

Locality Does Not Imply Reachability: Boundary Repair in Block-Sparse Causal Attention

局部性并不意味着可达性:块稀疏因果注意力中的边界修复

Zhibo Yang

发表机构 * Ocean University of China(中国海洋大学)

AI总结 本文研究块稀疏因果注意力中序列局部性与注意力图可达性之间的不匹配,通过结构依赖集形式化边界伪影,提出相位条件覆盖函数分析可达性,并引入边界桥接注意力作为最小修复方法。

Comments 36 pages, 5 figures, 16 tables

详情
AI中文摘要

稀疏因果注意力通常由序列局部性描述:附近的token应保持易于访问,而远处的token可能被丢弃以降低成本。本文研究了序列局部性与注意力图可达性之间的不匹配。在固定块因果注意力中,两个相邻token可能在每一深度的注意力图中断开连接。我们通过结构依赖集形式化了这种边界伪影:如果每个注意力层使用相同的固定块因果掩码,且所有剩余操作是位置级的,则目标表示只能依赖于其自身块前缀中的token。这为构造的K路边界复制分布产生了架构级的边界-复制分离,top-1准确率上界为1/K,期望交叉熵下界为log K。然后,我们推导了相位条件覆盖函数,表明可达性取决于源-目标距离以及目标在其块内的偏移。这些覆盖律预测了稀疏模式何时会失败、修复何时有帮助,以及滑动窗口注意力和边界修复为何不可互换。边界桥接注意力被视为建设性的证明:它保留了固定块路径,并在块边界附近使用共享投影添加了零额外参数的辅助因果边。受控的1024-token实验表明,收益集中在覆盖对齐的诊断中。作为次要的外部有效性证据,固定检查点的8K-token Qwen2.5-7B探针显示了相同的覆盖不可比模式。贡献在于一个理论指导的诊断框架,用于块稀疏因果注意力中的局部性-可达性不匹配,以及相位条件覆盖分析和最小建设性修复。

英文摘要

Sparse causal attention is usually described by sequence locality: nearby tokens should remain easy to access, while distant tokens may be dropped to reduce cost. This paper studies a mismatch between sequence locality and attention-graph reachability. In fixed block causal attention, two adjacent tokens can be disconnected in the attention graph at every depth. We formalize this boundary artifact through structural dependency sets: if every attention layer uses the same fixed block causal mask and all remaining operations are positionwise, a target representation can depend only on tokens in its own block prefix. This yields an architecture-level boundary-copy separation for a constructed K-way boundary-copy distribution, with top-1 accuracy upper bound 1/K and expected cross-entropy lower bound log K. We then derive phase-conditioned coverage functions showing that reachability depends on both source-target distance and the target's offset within its block. These coverage laws predict when a sparse pattern should fail, when a repair can help, and why sliding-window attention and boundary repair are not interchangeable. Boundary Bridge Attention is treated as a constructive witness: it preserves the fixed block path and adds zero-additional-parameter auxiliary causal edges near block boundaries using shared projections. Controlled 1024-token experiments show that gains concentrate in coverage-aligned diagnostics. As secondary external-validity evidence, a fixed-checkpoint 8K-token Qwen2.5-7B probe shows the same coverage-incomparability pattern. The contribution is a theory-guided diagnostic framework for locality-reachability mismatch in block-sparse causal attention, together with phase-conditioned coverage analysis and a minimal constructive repair.
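The boundary artifact described above can be checked on a toy example. The sketch below is an illustration under stated assumptions, not the paper's code: it assumes every layer uses the same fixed block causal mask and that all other operations are positionwise, so dependency sets propagate only along attention edges. The function names and the 8-token, block-size-4 setup are hypothetical.

```python
def block_causal_mask(n_tokens, block_size):
    """mask[i][j] is True when target i may attend to source j:
    j is not in the future and lies in the same fixed block as i."""
    return [
        [(j <= i) and (i // block_size == j // block_size) for j in range(n_tokens)]
        for i in range(n_tokens)
    ]

def dependency_sets(mask, depth):
    """Tokens each position can depend on after `depth` identical attention layers."""
    n = len(mask)
    reach = [{i} for i in range(n)]
    for _ in range(depth):
        # Position i inherits the dependencies of every position it attends to.
        reach = [
            set().union(*(reach[j] for j in range(n) if mask[i][j]))
            for i in range(n)
        ]
    return reach

mask = block_causal_mask(n_tokens=8, block_size=4)
deps = dependency_sets(mask, depth=6)

# Token 4 opens the second block; token 3 is its immediate neighbour in the first.
print(sorted(deps[4]))  # dependency set of token 4
print(sorted(deps[3]))  # dependency set of token 3
```

In this toy setting, stacking more layers does not enlarge any dependency set beyond the block prefix, which is the sense in which sequence locality fails to imply reachability.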

2606.02679 2026-06-03 cs.LG cs.MM cs.SD eess.AS 版本更新

Before Fusion, Ask What to Keep: Contextual Calibration of Multimodal Signals

融合之前,先问保留什么:多模态信号的上下文校准

Jiyuan Liu, Liangwei Nathan Zheng, Wei Emma Zhang, Xinpei Wang, Weitong Chen

发表机构 * Adelaide University(阿德莱德大学) Shandong University(山东大学)

AI总结 提出一种紧凑的校准模块,在融合前对各模态特征进行实例级和维度级调制,抑制不可靠成分并增强上下文支持信号,提升多模态任务性能。

Comments 11 pages, 7 figures, 9 tables

详情
AI中文摘要

多模态系统通常受益于跨语言、声音和视觉流的信息组合,但这种收益并非保证。一个模态对某个输入有用,可能对另一个输入成为干扰,同一模态内的局部特征响应可能与其他来源的证据不一致。本文研究如何在下游预测器合并多模态表示之前调整它们。我们开发了一个紧凑的校准模块,在摘要级别将每个模态与其他模态进行比较,提取跨源支持和冲突的线索,并将这些线索转换为实例级和维度级的调制信号。校准应用于原始模态特征而非已融合的表示,使模型能够抑制误导成分,保留微弱但有用的证据,并强调在当前多模态上下文中得到更好支持的响应。该模块设计为即插即用组件,可附加到不同的融合主干上,无需更改其预测头。在涵盖情感理解、动作识别、音视频事件检测和音视频情感分类的五个基准测试中,所提出的预融合校准策略在基于序列和卷积的融合设置下均提升了性能。模态移除、合成损坏、训练动态和特征级可视化的额外分析表明,在融合前校准信号可以减少来自不可靠模态的干扰,并产生更稳定的多模态优化。

英文摘要

Multimodal systems often benefit from combining information across language, sound, and visual streams, but this benefit is not guaranteed. A modality that is useful for one input may become distracting for another, and local feature responses within the same modality can disagree with evidence from other sources. This work investigates how to adjust multimodal representations before they are merged by a downstream predictor. We develop a compact calibration module that compares each modality with the others at the summary level, extracts cues of cross-source support and conflict, and converts these cues into instance-wise and dimension-wise modulation signals. The calibration is applied to the original modality features rather than to already fused representations, enabling the model to suppress misleading components, preserve weak but useful evidence, and emphasize responses that are better supported by the current multimodal context. The module is designed as a plug-in component and can be attached to different fusion backbones without changing their prediction heads. Across five benchmarks covering sentiment understanding, action recognition, audio-visual event detection, and audio-visual emotion classification, the proposed pre-combination calibration strategy improves performance under both sequence-based and convolutional fusion settings. Additional analyses under modality removal, synthetic corruption, training dynamics, and feature-level visualization show that calibrating signals before fusion can reduce interference from unreliable modalities and produce more stable multimodal optimization.

2606.02673 2026-06-03 cs.AI cs.LG 版本更新

Visual Graph Scaffolds for Structural Reasoning in Large Language Models

大语言模型中用于结构推理的可视化图脚手架

Runlin Lei, Xiaokui Xiao, Zhewei Wei

发表机构 * University of Science and Technology of China(中国科学技术大学)

AI总结 本文提出将图结构作为大语言模型的内部推理辅助而非仅外部知识源,通过多跳问答实验发现视觉图引导相比文本化图在无直接答案提示时仍保持有效性,支持图作为组织推理的可视化脚手架。

详情
AI中文摘要

图已被用于增强大语言模型的结构化推理,主要是在测试时作为外部知识源提供给模型。在本文中,我们采取不同的视角:图对LLMs的价值不仅在于提供信息,还在于组织推理。受人类使用图结构思维导图组织分支和汇聚思维的启发,我们探究图是否可以作为推理辅助的内部形式。我们在多跳问答任务上研究这一问题,其中教师提供的推理轨迹被重写为图思维导图并用于指导学生模型。我们的实验揭示了明显的模态差距。当图结构被扁平化为文本时,一旦直接答案提示被移除,其益处变得有限。在这种抽象引导设置下,推理效率和答案质量都大幅下降。相比之下,视觉图引导在没有直接答案线索时仍然有效,并且其优势在监督微调和基于KL的蒸馏后仍然保持。上述发现支持了以下主张:图不仅应作为LLMs的外部知识结构来研究,还应作为组织推理的可视化脚手架。

英文摘要

Graphs have been used to enhance large language models (LLMs) for structured reasoning, mostly as external knowledge sources provided to models at test time. In this paper, we take a different view: the value of graphs for LLMs lies not only in supplying information, but also in organizing reasoning. Inspired by how humans use graph-structured mind maps to organize branching and converging thoughts, we ask whether graphs can serve as an internal form of reasoning assistance. We study this question on multi-hop question answering tasks, where teacher-provided reasoning traces are rewritten as graph mind maps and used to guide a student model. Our experiments reveal a clear modality gap. When graph structures are flattened into text, their benefits become limited once direct answer hints are removed. Under this abstract guidance setting, both reasoning efficiency and answer quality degrade substantially. In contrast, visual graph guidance remains effective without direct answer clues, and its advantage persists after supervised fine-tuning and KL-based distillation. These findings support the claim that graphs should be studied not only as external knowledge structures for LLMs, but also as visual scaffolds for organizing reasoning.

2606.02671 2026-06-03 cs.LG cs.AI 版本更新

Aligning Data-Driven Predictors with Allocation: A Decision-Focused Approach to Survival Analysis

对齐数据驱动预测器与分配:面向决策的生存分析方法

Itai Zilberstein, Ioannis Anagnostides, Tuomas Sandholm

发表机构 * Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA(计算机科学系,卡内基梅隆大学,匹兹堡,PA) Strategy Robot, Inc.(策略机器人公司) Strategic Machine, Inc.(战略机器公司) Optimized Markets, Inc.(优化市场公司)

AI总结 针对生存分析中预测模型与分配决策目标不一致的问题,提出基于归一化折现累积增益(NDCG)的决策聚焦学习方法,通过优化NDCG提升分配效果,在心脏移植数据上使基线模型NDCG提升50-100%。

详情
AI中文摘要

机器学习预测器已成为指导自动化决策的重要工具。然而,一个主要的错位仍然存在:预测模型通常根据标准统计指标进行优化,而与其所指导的算法任务相孤立。我们在器官分配这一高风险领域中强调了这种不一致性,通过证明任何依赖(即使是高度准确的)针对标准指标(如一致性指数(C-index))优化的生存预测器的算法,在用于分配时可能产生任意差的结果,无法保证比均匀随机选择更好的效用。为了弥合生存分析与策略优化之间的差距,我们引入了一种基于优化归一化折现累积增益(NDCG)的决策聚焦学习方法,NDCG是信息检索中的主流指标。我们通过证明NDCG转化为分配性能的保证,确立了其在生存分析中的效用。在实证中,我们提出了一种自举方法来优化现有生存模型的NDCG。与先前工作不同,我们还解决了评估排名时右删失的挑战。在美国历史心脏移植数据上,我们的方法将基线模型的NDCG大幅提升了50-100%,这相当于在移植分配中每年额外获得数万生命年。我们预计我们的框架将在基于预测的决策中找到更广泛的应用。

英文摘要

Machine learning predictors have become essential tools for guiding automated decision making. However, a major misalignment persists: predictive models are typically optimized in terms of standard statistical metrics in isolation from the algorithmic tasks they inform. We highlight this incongruity in the high-stakes domain of organ allocation by demonstrating that any algorithm relying on (even highly accurate) survival predictors optimized for standard metrics -- such as the Concordance index (C-index) -- can yield arbitrarily poor outcomes when used for allocation, failing to guarantee utility better than a uniform random selection. To bridge the gap between survival analysis and policy optimization, we introduce a decision-focused learning approach based on optimizing normalized discounted cumulative gain (NDCG), a mainstay metric in information retrieval. We establish the utility of NDCG in survival analysis by proving that it translates to guarantees on the performance of allocation. Empirically, we propose a bootstrapping approach to optimize the NDCG of existing survival models. Unlike prior work, we also address the challenge of right censorship when evaluating ranking. On historical heart transplant data from the US, our method dramatically boosts the NDCG of baseline models by 50-100%, which translates to tens of thousands of additional life years gained annually when deployed for transplant allocation. We anticipate that our framework will find broader applications in decision making with predictions.
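For readers unfamiliar with the metric, the standard information-retrieval form of NDCG can be sketched as below. This is a generic illustration rather than the authors' estimator: it uses linear gains with a log2 rank discount, treats the true gains (for example, life years gained) as fully observed, and does not handle right-censoring, which the abstract says the paper addresses separately. The function names and the toy numbers are assumptions.

```python
import math

def dcg(gains):
    """Discounted cumulative gain: the gain at rank r (1-based) is divided by log2(r + 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

def ndcg(predicted_scores, true_gains, k=None):
    """NDCG@k of the ordering induced by `predicted_scores`, judged by `true_gains`."""
    k = len(true_gains) if k is None else k
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_gains[i] for i in order][:k]
    ideal = sorted(true_gains, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy example: three candidates with hypothetical true gains.
true_gains = [3.0, 1.0, 2.0]
aligned = ndcg([0.9, 0.1, 0.5], true_gains)     # scores order candidates by true gain
misaligned = ndcg([0.1, 0.9, 0.5], true_gains)  # scores put the worst candidate first
print(aligned, misaligned)
```

Because the discount concentrates weight on the top ranks, NDCG rewards getting the highest-benefit candidates first, which is the property that matters when only the top of the ranking receives an allocation.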

2606.02663 2026-06-03 cs.LG cs.AI 版本更新

AdaWeather: Adaptively Mixing Probabilistic Weather Forecasts with Logarithmic Regret

AdaWeather: 自适应混合概率天气预报与对数遗憾

Saptarishi Dhanuka, Sarvesh Iyer, Manmeet Singh, Mihir More, Rushil Gupta, Dhruman Gupta, Parthasarathi Mukhopadhyay, Sandeep Juneja

发表机构 * Ashoka University(阿育王大学) Western Kentucky University(西肯塔基大学)

AI总结 提出 AdaWeather 自适应框架,通过结合机器学习和专家混合方法融合多个概率天气预报,实现对数遗憾界,并在温度预测上取得改进。

Comments 36 pages, 16 figures. Submitted to arXiv. Forecast aggregation for probabilistic weather prediction using offline supervised learning and online prediction with expert advice. Includes theoretical regret guarantees and empirical evaluation on temperature forecasting. Submitted to NeurIPS 2026

详情
AI中文摘要

机器学习的最新进展已经产生了与最先进数值天气预报模型相当的概率天气预报模型。但没有任何模型在时空上持续占优,且相对性能高度依赖于上下文。这激发了自适应方法来组合多个预报以获得改进和鲁棒性。尽管文献中已提出组合预报,但这些要么通过监督学习实现,要么通过专家建议预测方法实现。我们引入 AdaWeather,一个自适应框架,它使用机器学习和专家混合方法结合多个概率预报,以得到统一的改进概率预报。传统专家方法针对事后最佳单一专家建立遗憾界,而我们扩展了算法和分析,表明我们的方法相对于事后最佳静态专家混合具有对数遗憾。实验上,我们专注于温度预测,并观察到相对于现有方法的改进。

英文摘要

Recent advances in machine learning have produced probabilistic weather forecasting models comparable to state-of-the-art numerical weather predictors. But no model consistently dominates spatio-temporally, and relative performance is highly context-dependent. This motivates adaptive methods for combining multiple forecasts to obtain improvements and robustness. While combined forecasts have been proposed in the literature, these are achieved either through supervised learning or through prediction with expert advice methods. We introduce AdaWeather, an adaptive framework that combines many probabilistic forecasts using both machine learning and a mixture of experts to arrive at a unified improved probabilistic forecast. While traditional expert methods develop regret bounds with respect to the best single expert in hindsight, we extend the algorithm and analysis to show our method has logarithmic regret compared to the best static mixture of experts in hindsight. Empirically, we focus on forecasting temperature, and observe improvements over existing methods.

2606.02662 2026-06-03 cs.LG cs.AI physics.chem-ph 版本更新

Improvise, Adapt, Overcome: An On-The-Fly Multifidelity Algorithm for Efficient Machine Learning

即兴、适应、克服:一种用于高效机器学习的即时多保真算法

Vivin Vinod, Peter Zaspel

发表机构 * School of Mathematics and Natural Sciences, University of Wuppertal(数学与自然科学学院,乌珀塔尔大学)

AI总结 提出一种自适应即时多保真机器学习框架,通过动态查询不同保真度的训练样本,自动确定数据集组成,在降低数据生成成本的同时提高模型精度。

Comments Supplementary Information added as separate PDF

详情
AI中文摘要

机器学习加速了量子化学,但受到生成高保真训练数据的高昂成本的阻碍。多保真机器学习(MFML)通过系统性地结合丰富的低保真数据和稀疏的高保真数据来减轻这一开销。尽管取得了成功,标准MFML方案依赖于预定义的缩放因子来确定不同保真度之间的稀疏数据比例,通常会产生冗余的多保真数据,导致效率损失。在这里,我们介绍了一种用于机器学习的自适应即时多保真框架,该框架自主确定训练数据集的组成。通过动态查询每个保真度的训练样本,该算法在转向更昂贵的参考计算之前,先在较低保真度上使模型精度饱和。我们在不同的化学性质上对新颖的自适应MFML进行了基准测试,包括计算化学金标准的耦合簇能量,以及更具化学挑战性的激发能。在我们的数值实验中,我们表明,与单保真方法相比,我们的自适应算法将数据生成成本降低了多达30倍,并且与标准MFML相比提高了多达5倍。数据冗余的缓解为量子化学中可持续的成本感知机器学习建立了一条高精度、低成本的途径。

英文摘要

Machine learning has accelerated quantum chemistry but is hindered by the prohibitive cost of generating high fidelity training data. Multifidelity machine learning (MFML) mitigates this overhead by systematically combining abundant low fidelity data with sparse high fidelity data. In spite of its success, standard MFML schemes rely on pre-defined scaling factors to determine sparse data ratio across fidelities, often generating redundant multifidelity data resulting in a loss of efficiency. Here, we introduce an adaptive on-the-fly multifidelity framework for machine learning that autonomously determines training dataset composition. By dynamically querying training samples at each fidelity, the algorithm saturates model accuracy at lower fidelities before moving up to more expensive reference calculations. We benchmark the novel adaptive-MFML across diverse chemical properties including the computational chemistry gold standard coupled cluster energies, and the more chemically challenging excitation energies. In our numerical experiments we show that our adaptive algorithm reduces data generation costs by up to a factor of 30 compared to single fidelity methods and improves upon standard MFML by up to a factor of 5. The mitigation of data redundancy establishes a high-accuracy low-cost pathway for sustainable cost-aware machine learning in quantum chemistry.

2606.02659 2026-06-03 cs.LG cs.AI 版本更新

CL-DMDF: Dynamic Multimodal Data Fusion Model Based on Contrastive Learning

CL-DMDF:基于对比学习的动态多模态数据融合模型

Dong Li, Lingling Zhang, Binghao Han, Linlin Ding, Yue Kou

发表机构 * Tsinghua University(清华大学)

AI总结 针对多模态数据融合中模态缺失和局部交互忽视全局互补线索的问题,提出基于对比学习的动态多模态数据融合模型(CL-DMDF),通过跨特征和模态维度的注意力机制、实体质心对比学习模块和自适应融合模块,提升动态融合的效率和准确性。

Comments 9 pages, 5 figures, 7 tables

详情
AI中文摘要

多模态数据融合涉及整合和分析来自多种模态的信息,以揭示潜在的关联和互补模式,从而增强数据处理和决策能力。尽管现有的结构化多模态输入方法通常针对特定任务设计并假设模态完全可观测,但实际应用中常因各种因素导致模态输入不确定或缺失。一些传统模型过度强调缺失模态内的局部交互,忽视了多模态表示中嵌入的全局互补线索。为克服这些限制,我们提出了一种基于对比学习的动态多模态数据融合模型(CL-DMDF)。CL-DMDF引入了一种新颖的注意力机制,该机制在特征和模态维度上同时操作,以计算可靠的注意力分数,有效反映每个层级的重要性。CL-DMDF进一步整合了实体质心对比学习模块,该模块从实体特征构建基于质心的正样本,以增强判别学习。此外,采用自适应融合模块以提高动态融合策略的效率和准确性。在三个数据集上进行的大量实验证明了CL-DMDF在各种多模态融合任务中的有效性。

英文摘要

Multimodal data fusion involves integrating and analyzing information from multiple modalities to uncover latent correlations and complementary patterns, thereby enhancing data processing and decision-making. While existing methods for structured multimodal inputs are typically designed around specific tasks and assume fully observed modalities, real-world applications often suffer from uncertain or missing modality inputs due to various factors. Some traditional models overly emphasize local interactions within missing modalities, neglecting the global complementary cues embedded in multimodal representations. To overcome these limitations, we propose a Dynamic Multimodal Data Fusion model based on Contrastive Learning (CL-DMDF). CL-DMDF introduces a novel attention mechanism that operates across both feature and modality dimensions to compute reliable attention scores, effectively reflecting importance at each level. The CL-DMDF further incorporates an entity-centroid contrastive learning module that constructs centroid-based positive samples from entity features to enhance discriminative learning. Additionally, an adaptive fusion module is employed to improve the efficiency and accuracy of dynamic fusion strategies. Extensive experiments conducted on three datasets demonstrate the effectiveness of the CL-DMDF across diverse multimodal fusion tasks.

2606.02657 2026-06-03 cs.LG q-fin.CP q-fin.ST 版本更新

Regime-Arrival Uncertainty in Generalization Bounds under Distribution Shift

分布偏移下泛化界中的制度到达不确定性

Prince Poudel

发表机构 * Independent researcher(独立研究者)

AI总结 针对分布偏移中制度组成不匹配带来的额外风险,提出量化框架,通过精确分解分离制度不匹配与制度敏感性,并扩展至β-混合数据,给出极小极大下界。

Comments 23 pages, 4 tables, 3 Figures

详情
AI中文摘要

标准泛化界假设训练和部署分布相同或静态,不考虑平静与危机状态比例不同的制度切换环境。本文提出一个框架,通过量化当分布偏移为马尔可夫切换时因制度组成不匹配导致的额外风险,来泛化制度感知模型。我们得到了精确分解,将制度不匹配与制度敏感性分离;将界限扩展到β-混合数据,使用针对谱间隙校正的有效样本量;并在合成数据和25年全球股指上展示了极小极大下界。所提出的惩罚是事后实现的泛化差距,而仅训练估计器未显示显著相关性:危机的特征几何可以被检测到,但时间到达不能。因此,该框架不是预测机器。在制度变化的罕见情况下,预测未来制度的组成是一个开放问题。

英文摘要

Standard generalization bounds assume that the training and deployment distributions are the same, or are static, and do not consider regime-switching environments where the ratio of calm to crisis states differs. This paper proposes a framework that generalizes regime-aware models by quantifying the extra risk due to regime composition mismatch, when distribution shifts are Markov-switching. We obtain an exact decomposition, separating regime mismatch from regime sensitivity; we extend the bound to beta-mixing data using the effective sample size corrected for the spectral gap; and we show a minimax lower bound for synthetic data and on 25 years of global equity indices. The proposed penalty is an ex post realized generalization gap, whereas the training-only estimator does not show significant correlation: the feature geometry of crises can be detected, but not the temporal arrival. Thus, the framework is not a forecast machine. Forecasting the composition of the future regime is an open question in the rare cases of regime change.

2606.02628 2026-06-03 cs.LG cs.CL 版本更新

Hallucination Is Linearly Decodable from Mid-Layer Hidden States in Quantized LLMs

幻觉可从量化LLM中间层隐藏状态线性解码

Aizierjiang Aiersilan

发表机构 * University of Macau(澳门大学)

AI总结 研究开源LLM在4位量化下中间层隐藏状态是否编码线性可分的真实性信号,发现单层线性探针AUROC达0.904-1.000,优于采样方法,且信号近似线性。

详情
AI中文摘要

我们研究开源LLM是否在其隐藏状态中编码线性可分的真实性信号,以及该信号在网络哪一层最强。在三个7B-8B指令微调模型(Llama-3.1-8B、Mistral-7B、Qwen2.5-7B)以4位NF4量化加载的情况下,我们在四个幻觉基准(TruthfulQA、HaluEval-QA、FEVER和一个受控合成集)上提取每层隐藏状态,并比较四种检测方法:线性与MLP探针、INSIDE EigenScore、自一致性和注意力熵。单个中间网络层的线性探针在保留分割上达到0.904-1.000 AUROC,而基于采样的检测器在相同协议下不超过0.541 AUROC。真实性信号近似线性:MLP探针很少超过线性探针0.01 AUROC。在自然语言基准上,峰值探测层落在模型家族的一致范围内:Llama和Mistral的32层中第13-18块,Qwen的28层中第19-25块。第一块注意力熵在知识基础设置中提供互补信号(HaluEval-QA上0.866-0.941 AUROC),且无额外推理成本。该协议下采样方法的低区分性反映了配对标签评估与这些方法访问信息之间的结构性不匹配,而非这些方法的固有限制。代码和数据已发布,可在单个8 GB GPU上完全复现。

英文摘要

We investigate whether open-source LLMs encode a linearly separable truthfulness signal in their hidden states, and at which network depth this signal is strongest. Across three 7B-8B instruction-tuned models (Llama-3.1-8B, Mistral-7B, Qwen2.5-7B) loaded in 4-bit NF4 quantization, we extract per-layer hidden states on four hallucination benchmarks (TruthfulQA, HaluEval-QA, FEVER, and a controlled synthetic set) and compare four detection approaches: linear and MLP probes, INSIDE EigenScore, self-consistency, and attention entropy. A linear probe on a single mid-network layer achieves 0.904-1.000 AUROC on held-out splits, while sampling-based detectors do not exceed 0.541 AUROC under the same protocol. The truthfulness signal is approximately linear: MLP probes rarely surpass linear probes by more than 0.01 AUROC. Peak probing layers fall in a consistent band across model families on natural-language benchmarks: blocks 13-18 of 32 for Llama and Mistral, and blocks 19-25 of 28 for Qwen. First-block attention entropy provides a complementary signal in knowledge-grounded settings (0.866-0.941 AUROC on HaluEval-QA) at no additional inference cost. The low discriminability of sampling methods under this protocol reflects a structural mismatch between paired-label evaluation and the information these methods access, rather than an inherent limitation of those methods. Code and data are released for full reproducibility on a single 8 GB GPU.

2606.02623 2026-06-03 cs.NE cs.AI cs.LG 版本更新

Oscillatory State-Space Models as Inductive Biases for Physics-Informed Neural PDE Solvers

振荡状态空间模型作为物理信息神经PDE求解器的归纳偏置

Abhishek Chandra, Taniya Kapoor

发表机构 * KTH Royal Institute of Technology(皇家理工学院) Wageningen University & Research(瓦赫宁根大学与研究中心)

AI总结 提出一种结合振荡状态空间动力学和PDE感知空间谱的PINN方法,以改进时变PDE求解的精度和内存效率。

详情
AI中文摘要

求解时变偏微分方程(PDE)是计算科学与工程中的一个重要问题。物理信息神经网络(PINN)从控制方程中学习PDE解。然而,准确捕捉时间演化仍然具有挑战性。最近的基于序列模型的方法使用通用序列模型参数化时间演化,这些模型捕捉时间依赖性,但没有显式编码PDE解的结构化动力学。此外,它们的内存需求可能随序列长度和分辨率而不利地扩展,限制了在大规模或高维设置中的适用性。本文介绍了一种PINN方法,该方法结合了振荡状态空间动力学来表示PDE解的模态结构。所提出的方法利用基于线性振荡器的时间演化,以及空间上的PDE感知谱基。这种设计实现了闭式空间微分和边界条件的一致强制执行。该方法在前向、逆和高维PDE问题上进行了评估,包括高达100个空间维度的情况。结果表明,与最近基于序列模型的PINN方法相比,该方法提高了精度并减少了内存使用。总体而言,本文强调了将结构化动力学先验纳入神经PDE求解器的时间演化中的好处,并建议设计更符合物理和计算高效的PINN架构。

英文摘要

Solving time-dependent partial differential equations (PDEs) is an important problem in computational science and engineering. Physics-informed neural networks (PINNs) learn PDE solutions from governing equations. However, accurately capturing temporal evolution remains challenging. Recent sequence-model-based approaches parameterize time evolution using general-purpose sequence models, which capture temporal dependencies but do not explicitly encode the structured dynamics of PDE solutions. In addition, their memory requirements can scale unfavorably with sequence length and resolution, limiting applicability in large-scale or high-dimensional settings. This work introduces a PINN approach that incorporates oscillatory state-space dynamics to represent the modal structure of PDE solutions. The proposed method leverages a linear-oscillator-based temporal evolution, together with a PDE-aware spectral basis in space. This design enables closed-form spatial differentiation and consistent enforcement of boundary conditions. The method is evaluated on forward, inverse, and high-dimensional PDE problems, including cases up to 100 spatial dimensions. The results show improved accuracy and reduced memory usage compared to recent sequence-model-based PINN approaches. Overall, this work highlights the benefits of incorporating structured dynamical priors into the temporal evolution of neural PDE solvers and suggests designing more physics-aligned and computationally efficient PINN architectures.

2606.02610 2026-06-03 cs.CE cs.AI cs.LG physics.ao-ph 版本更新

Samudra 2: Scaling Ocean Emulators across Resolutions

Samudra 2: 跨分辨率扩展海洋仿真器

Yuan Yuan, Jesse Rusak, Alexander Merose, Adam Subel, Pavel Perezhogin, Alistair Adcroft, Carlos Fernandez-Granda, Laure Zanna

发表机构 * Courant Institute School of Mathematics, Computing, and Data Science, New York University(Courant学院数学、计算与数据科学系,纽约大学) Open Athena AI Foundation, Inc.(开放Athena人工智能基金会) Program in Atmospheric and Oceanic Sciences, Princeton University(大气与海洋科学项目,普林斯顿大学)

AI总结 针对现有海洋神经仿真器在长期自回归滚动中出现的方差崩溃和印记伪影问题,提出Samudra 2,通过改进U-Net骨干网络和动态损失函数,在1°分辨率下将上层海洋全球平均温度R²从0.56提升至0.87,并将深层海洋温度误差降低约七倍,且可扩展至1/2°和1/4°分辨率。

详情
AI中文摘要

海洋环流模式(OGCM)对气候科学至关重要,但计算成本高,限制了集合规模和强迫情景。神经仿真器有望实现数量级的加速,然而现有的海洋仿真器未能将精细空间分辨率与多年自回归滚动相结合。Samudra是第一个产生多十年全球滚动的自回归神经海洋仿真器,但仅限于$1^\circ$分辨率,并表现出两种长期故障模式:方差崩溃,即时间变异性的丧失,以及印记伪影,即速度模式泄漏到深海场中。我们提出Samudra 2,它引入了更宽的U-Net骨干网络,采用修改后的ConvNeXt风格块和减小的块内扩展因子,以及一个动态损失函数,根据预测误差重新加权输出通道,从而增强缓慢演变的深海场的梯度。在$1^\circ$分辨率下,Samudra 2将上层海洋全球平均温度$R^2$从0.56提高到0.87,并将深海温度误差降低约七倍。相同的架构可扩展到$1/2^\circ$和$1/4^\circ$分辨率,在大约8年的自回归滚动中恢复中尺度涡旋和尖锐的西边界流。在单个GPU上运行,Samudra 2能够为海平面预测、海洋热吸收和气候变率研究提供更大的集合。我们在 https://openathena.ai/Ocean_Emulator/ 提供代码、文档和基准资源。

英文摘要

Ocean general circulation models (OGCMs) are essential to climate science but computationally expensive, limiting ensemble size and forcing scenarios. Neural emulators promise orders-of-magnitude speedups, yet existing ocean emulators have not combined fine spatial resolution with multi-year autoregressive rollouts. Samudra, the first autoregressive neural ocean emulator to produce multi-decade global rollouts, is limited to $1^\circ$ resolution and exhibits two long-horizon failure modes: variance collapse, the loss of temporal variability, and imprinting artifacts, in which velocity patterns leak into deep-ocean fields. We present Samudra 2, which introduces a wider U-Net backbone with modified ConvNeXt-style blocks and a reduced block-internal expansion factor, together with a dynamic loss that reweights output channels according to their prediction errors, strengthening gradients for slow-evolving deep-ocean fields. At $1^\circ$, Samudra 2 increases upper-ocean global-mean temperature $R^2$ from 0.56 to 0.87 and reduces deep-ocean temperature error by roughly sevenfold. The same architecture scales to $1/2^\circ$ and $1/4^\circ$ over approximately 8-year autoregressive rollouts, recovering mesoscale eddies and sharp western boundary currents. Running on a single GPU, Samudra 2 enables larger ensembles for sea-level projections, ocean heat uptake, and climate variability studies. We provide code, documentation, and benchmark resources at https://openathena.ai/Ocean_Emulator/.

2606.02607 2026-06-03 cs.LG cs.AI cs.CR 版本更新

Geometry-Aware Tabular Diffusion

几何感知表格扩散

David Turtora Zagardo

发表机构 * arXiv

AI总结 提出几何感知表格扩散(GATD),通过向扩散去噪器注入列值差异的成对角度和长度作为输入和辅助目标,以显式建模列间关系,在10个数据集上以更少参数取得SOTA性能。

Comments Accepted to the ICML 2026 main track. 24 pages, 10 figures, 22 tables

详情
AI中文摘要

表格合成对于隐私保护的共享和增强至关重要,然而扩散模型依赖隐式机制来捕捉列间关系。我们引入了几何感知表格扩散(GATD),它通过从列值差异计算出的成对角度和长度来增强表格扩散去噪器,并将其用作输入和辅助目标。我们的MLP实例化在平均使用3.5倍更少参数(对于分类任务最多25倍)的情况下实现了最先进的基准性能:在十个数据集上,它在8/10的形状、7/10的趋势和9/10的下游效用(F1/RMSE)上获胜,将形状和趋势误差分别降低了27%和20%。默认损失权重可迁移到GNN和Transformer去噪器,在27/30个架构-数据集单元上改善了形状,在25/30上改善了趋势。一项匹配的消融实验表明,监督(而非额外输入或容量)驱动了性能提升。这表明显式关系监督是表格扩散的一种可移植归纳偏置。

英文摘要

Tabular synthesis is critical for privacy-preserving sharing and augmentation, yet diffusion models rely on implicit mechanisms to capture inter-column relationships. We introduce Geometry-Aware Tabular Diffusion (GATD), which augments tabular diffusion denoisers with pairwise angles and lengths computed from column value differences and used as inputs and auxiliary targets. Our MLP instantiation achieves state-of-the-art benchmark performance while using 3.5x fewer parameters on average (up to 25x for classification tasks): on ten datasets, it wins 8/10 Shape, 7/10 Trend, and 9/10 downstream utility (F1/RMSE), reducing Shape and Trend error by 27% and 20%. Default loss weights transfer to GNN and Transformer denoisers, improving Shape on 27/30 and Trend on 25/30 architecture-dataset cells. A matched ablation shows supervision (not extra inputs or capacity) drives the gain. This shows explicit relational supervision is a portable inductive bias for tabular diffusion.

2606.02606 2026-06-03 cs.LG cs.AI 版本更新

ReLoRA: Knowledge-Reusing Adaptation for Fast Rollout of Evolving LLM Services

ReLoRA: 面向演化LLM服务快速部署的知识复用适配

Yang Xu, Zihuai Xu, Hongli Xu, Yunming Liao, Zhiwei Yao, Xitong Fu

发表机构 * School of Computer Science and Technology, University of Science and Technology of China(计算机科学与技术学院,中国科学技术大学) Suzhou Institute for Advanced Research, University of Science and Technology of China(苏州先进研究院,中国科学技术大学)

AI总结 针对基础模型频繁更新导致已有LoRA适配器失效的问题,提出ReLoRA框架,通过贝叶斯优化初始化与调度正则化微调,实现知识复用与快速重新适配,降低计算开销并提升性能。

详情
AI中文摘要

大型语言模型(LLM)越来越多地被部署为持续演化的服务,其中频繁的基础模型更新可能使先前部署的任务特定低秩适配(LoRA)适配器失效。对于管理众多下游模型服务的提供商来说,为每个更新的基础模型从头重新训练每个LoRA适配器在计算上代价高昂,并延迟服务部署。同时,更简单的替代方案,即简单地将原始LoRA适配器应用于更新的基础模型,由于适配器-骨干网络不兼容,常常导致服务质量下降。为了解决这个问题,我们提出了ReLoRA,一种知识复用的重新适配框架,能够高效地为演化的LLM服务恢复可用的LoRA适配器,同时保持或提升任务性能。具体来说,ReLoRA包含两个关键的优化步骤:1)自适应LoRA初始化利用贝叶斯优化,通过融合先前部署的任务适配器和基础模型演化的信息,构建一个兼容性感知的起点;2)带调度正则化的微调首先通过强正则化快速将适配器引导至高质量区域,随后通过放松正则化进行任务特定精炼。这种设计使得在减少重新适配开销的同时,能够快速恢复服务质量。大量实验表明,与基线相比,ReLoRA将就绪时间减少高达8.9倍,准确率提升高达4.6%。

英文摘要

Large Language Models (LLMs) are increasingly deployed as continuously evolving services, where frequent base-model updates may invalidate previously deployed task-specific Low-Rank Adaptation (LoRA) adapters. For service providers managing numerous downstream model services, retraining each LoRA adapter from scratch for every updated base model is computationally prohibitive and delays service rollout. Meanwhile, the simpler alternative, i.e., naively applying the original LoRA adapter to the updated base model, often leads to degraded service quality due to adapter-backbone incompatibility. To address this problem, we propose ReLoRA, a knowledge-reusing re-adaptation framework that efficiently restores service-ready LoRA adapters for evolving LLM services while preserving or improving task performance. Specifically, ReLoRA comprises two key optimization steps: 1) Adaptive LoRA initialization leverages Bayesian optimization to construct a compatibility-aware starting point by fusing information from both the previously deployed task adapter and the base model's evolution; 2) Fine-tuning with scheduled regularization first rapidly steers the adapter to a high-quality region via strong regularization, followed by relaxed regularization for task-specific refinement. This design enables rapid service-quality recovery with reduced re-adaptation overhead. Extensive experiments demonstrate that ReLoRA reduces time-to-readiness by up to 8.9$\times$ and improves accuracy by up to 4.6\% compared to baselines.
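
The scheduled-regularization step can be pictured with a small sketch. The abstract does not specify the regularizer or the schedule, so everything below is an assumption for illustration: an L2 penalty that pulls the adapter toward its initialization, and a hypothetical two-phase weight `reg_weight` that starts strong and is then relaxed.

```python
def reg_weight(step, switch_step=100, strong=1.0, relaxed=0.01):
    """Hypothetical two-phase schedule: strong regularization first, relaxed afterwards."""
    return strong if step < switch_step else relaxed


def regularized_loss(task_loss, adapter, init_adapter, step):
    """Task loss plus an L2 pull toward the initial adapter (an assumed regularizer)."""
    penalty = sum((w - w0) ** 2 for w, w0 in zip(adapter, init_adapter))
    return task_loss + reg_weight(step) * penalty


# Toy adapter weights, for illustration only.
adapter = [0.5, -0.2]
init_adapter = [0.4, 0.0]
early = regularized_loss(1.0, adapter, init_adapter, step=10)
late = regularized_loss(1.0, adapter, init_adapter, step=500)
print(early, late)
```

Early in training the penalty dominates and keeps the adapter near its compatibility-aware starting point; once the weight is relaxed, the task loss is free to drive task-specific refinement.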

2606.02605 2026-06-03 cs.LG cs.AI eess.IV 版本更新

Cross-Modal Contrastive Learning of ECG and Angiography Representations for Severe Stenosis Classification

用于严重狭窄分类的心电图与血管造影表示的跨模态对比学习

Nikola Cenikj, Özgün Turgut, Alexander Müller, Alexander Steger, Jan Kehrer, Marcus Brugger, Daniel Rueckert, Philip Müller

Affiliations: Chair for AI in Healthcare and Medicine, Technical University of Munich and TUM University Hospital; Department of Computing, Imperial College London; Munich Center for Machine Learning (MCML), Munich, Germany; Department of Internal Medicine, TUM University Hospital

AI summary: Proposes StenCE, a pretraining framework that uses cross-modal contrastive learning to stratify coronary artery stenosis risk from ECG-derived features; the resulting models are reported as the first to achieve high performance in severe stenosis classification.

详情
AI中文摘要

冠状动脉狭窄是一种常见的心血管疾病,未经治疗的严重病例具有显著的心肌梗死风险。尽管冠状动脉(X射线)血管造影仍是狭窄诊断的金标准,但其具有侵入性、耗时且资源密集,因此仅对基于症状和既往临床测试具有高疾病概率的患者进行。然而,一部分患者,尤其是无症状患者,可能仍未被诊断。从心电图(ECG)中检测狭窄的迹象,由于心电图快速、廉价、无创,因此即使在无症状患者中也常规采集,将支持早期诊断。然而,由于在心电图中尚未识别出可靠的狭窄特异性信号,目前无法用于狭窄风险分层。为解决这一问题,我们引入了StenCE,一个预训练框架,允许基于直接从心电图导出的特征对患者进行分层。在不同狭窄严重程度阈值和额外心电图疾病分类任务上的评估表明,不同心电图编码器均取得了一致的性能提升,优于先前的工作。所获得的模型成功检测到心电图中用于狭窄诊断的信号,并且是首个在严重狭窄分类中实现高性能的模型。源代码可在以下网址获取:此 https URL。

英文摘要

Coronary artery stenosis is a common cardiovascular disease, with severe, untreated cases posing significant risks of heart attack. Although coronary (X-ray) angiograms remain the standard for stenosis diagnosis, they are invasive, time- and resource-intensive, and therefore only performed on patients with a high probability of disease based on symptoms and prior clinical tests. As a result, a subset of patients, especially those without symptoms, may remain undiagnosed. Detecting indications of stenosis from ECGs, which are fast, cheap, non-invasive, and thus routinely acquired even in asymptomatic patients, would support early diagnosis. However, as no reliable stenosis-specific signal has been identified in ECGs, they cannot currently be used for stenosis risk stratification. To address this, we introduce StenCE, a pretraining framework that allows stratification of patients based on features derived directly from ECGs. Evaluations across varying stenosis severity thresholds and additional ECG disease classification tasks demonstrate consistent performance improvements across different ECG encoders, outperforming previous work. The obtained models successfully detect signals for stenosis diagnosis in ECGs and are the first to achieve high performance in severe stenosis classification. The source code is available at https://github.com/NikolaCenic/ecg-stenosis-cls.

2606.02604 2026-06-03 cs.LG cs.AI 版本更新

Auditable Climate Risk Intelligence from Fragmented ESG Data: Deterministic Orchestration and Imbalance-Aware Learning for Scope 1-3 Validation

来自碎片化ESG数据的可审计气候风险智能:面向范围1-3验证的确定性编排与不平衡感知学习

Karan Sehgal, Khawar Naveed Bhatti

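Affiliations: Kent Business School, University of Kent

AI summary: To address fragmented ESG data and the lack of auditability in conventional validation, proposes a framework combining deterministic orchestration, temporal anomaly detection, imbalance-aware ensemble learning, and explainability-oriented governance, and builds a synthetic benchmark for reproducible validation.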

Comments 22 pages, 7 figures. Preprint

详情
AI中文摘要

ESG和气候风险数据在异构的范围1、范围2和范围3报告环境中仍然碎片化,而传统的验证流程缺乏来源感知的可审计性、隐藏漂移检测和面向可复现性的治理。本文提出一个确定性气候风险智能框架,整合单一真相来源编排、时序异常检测、不平衡感知集成学习和面向可解释性的治理,用于可审计的ESG验证。为支持开放复现,我们构建并发布了一个合成ESG验证基准,该基准根据GHG协议、PCAF和ISSB标准的公开报告特征进行校准。该方法包括时序漂移分析、基于SMOTE的罕见事件优化、集成学习、来源感知编排以及基于TreeSHAP的可解释性,用于治理检查和审计重建。我们使用分类指标(召回率、F1、ROC AUC)、校准指标(ECE、Brier分数)以及面向治理的审计轨迹完整性度量(衡量可重建确定性来源到升级来源链的标记异常比例)将框架与统计分类器、异常检测方法、时序预测基线和基于阈值的系统进行评估。结果以分层五折交叉验证的均值和标准差报告,并进行配对显著性检验。该框架将ESG报告重新定义为确定性气候风险治理基础设施,支持可复现性、可解释性和运营可审计性。

英文摘要

ESG and climate risk data remain fragmented across heterogeneous Scope 1, Scope 2, and Scope 3 reporting environments, while conventional validation pipelines lack provenance-aware auditability, hidden-drift detection, and reproducibility-oriented governance. This paper proposes a deterministic climate risk intelligence framework integrating single-source-of-truth orchestration, temporal anomaly detection, imbalance-aware ensemble learning, and explainability-oriented governance for auditable ESG validation. To support open reproducibility, we construct and release a synthetic ESG validation benchmark calibrated against publicly reported characteristics of the GHG Protocol, PCAF, and ISSB standards. The methodology incorporates temporal drift analysis, SMOTE-based rare-event optimization, ensemble learning, provenance-aware orchestration, and TreeSHAP-based interpretability for governance inspection and audit reconstruction. We evaluate the framework against statistical classifiers, anomaly detection methods, temporal forecasting baselines, and a threshold-based system using classification metrics (recall, F1, ROC AUC), calibration metrics (ECE, Brier score), and a governance-oriented audit trace completeness metric measuring the fraction of flagged anomalies for which a deterministic source-to-escalation provenance chain can be reconstructed. Results are reported as mean and standard deviation across stratified five-fold cross-validation with paired significance testing. The framework reframes ESG reporting toward deterministic climate risk governance infrastructure supporting reproducibility, explainability, and operational auditability.
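
Two of the metrics named above are simple enough to sketch. The Brier score below is the standard definition; the audit trace completeness function is only one reading of the abstract's description (the fraction of flagged anomalies with a reconstructable provenance chain), and the `chain_reconstructed` field and the example records are hypothetical.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def audit_trace_completeness(flagged):
    """Fraction of flagged anomalies whose provenance chain could be reconstructed."""
    if not flagged:
        return 0.0
    return sum(1 for item in flagged if item["chain_reconstructed"]) / len(flagged)


# Made-up predictions and audit records, for illustration only.
probs = [0.9, 0.2, 0.6, 0.1]
labels = [1, 0, 1, 0]
flagged = [
    {"id": "a1", "chain_reconstructed": True},
    {"id": "a2", "chain_reconstructed": True},
    {"id": "a3", "chain_reconstructed": False},
    {"id": "a4", "chain_reconstructed": True},
]
print(brier_score(probs, labels))
print(audit_trace_completeness(flagged))
```

A lower Brier score indicates better-calibrated probabilities, while a completeness of 1.0 would mean every flagged anomaly can be traced from source to escalation.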

2606.02603 2026-06-03 cs.CV cs.LG 版本更新

COD10K-C: Benchmarking Robustness of Camouflaged Object Detection Under Natural Image Corruptions

COD10K-C:自然图像损坏下伪装目标检测的鲁棒性基准测试

Arafat Hossain Sayem

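Affiliations: CSE, Bangladesh University of Engineering and Technology

AI summary: Introduces COD10K-C, a benchmark with 8 corruption types and 5 severity levels for evaluating camouflaged object detection models on corrupted images, and a lightweight model, RobustCODLite, which uses corruption augmentation, a frequency-prior branch, and an uncertainty-consistency loss to retain a high Dice score under corruption.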

Comments 7 pages, 1 figure

详情
AI中文摘要

伪装目标检测已取得显著进步,但大多数标准基准仅评估模型在干净图像上的性能。这并不现实,因为真实相机经常捕捉到模糊、传感器噪声、天气效应和压缩伪影。我们提出了COD10K-C,一个基于COD10K的损坏鲁棒性基准。它包含8种损坏类型和5个严重级别,总共40种条件和81,040个评估对。我们评估了三种流行的伪装目标检测模型:SINet-v2、PFNet和ZoomNet,以及一个轻量级模型RobustCODLite。所有模型在损坏图像上均表现出明显的性能下降。运动模糊和高斯模糊导致最大的下降,其中SINet-v2在运动模糊下损失了18.5个Dice点。亮度和雾的影响较小。RobustCODLite使用了损坏增强、频率先验分支和不确定性一致性损失。它在损坏条件下保留了其干净Dice分数的92.3%,而SINet-v2为87.7%,ZoomNet为84.8%,PFNet为84.1%。在最严重的损坏情况下,RobustCODLite达到或超过了在干净数据上表现更好的模型。我们将发布COD10K-C的GitHub仓库,以支持未来在鲁棒伪装目标检测方面的研究。

英文摘要

Camouflaged object detection has improved substantially, but most standard benchmarks evaluate models only on clean images. This is not realistic because real cameras often capture blur, sensor noise, weather effects, and compression artifacts. We present COD10K-C, a corruption robustness benchmark based on COD10K. It includes 8 corruption types and 5 severity levels, giving 40 conditions and 81,040 evaluation pairs in total. We evaluate three popular camouflaged object detection models, SINet-v2, PFNet, and ZoomNet, as well as a lightweight model called RobustCODLite. All models show clear performance drops on corrupted images. Motion blur and Gaussian blur cause the largest drops, with SINet-v2 losing 18.5 Dice points under motion blur. Brightness and fog are less harmful. RobustCODLite uses corruption augmentation, a frequency-prior branch, and an uncertainty-consistency loss. It retains 92.3% of its clean Dice score under corruption, compared with 87.7% for SINet-v2, 84.8% for ZoomNet, and 84.1% for PFNet. On the hardest corruptions, RobustCODLite matches or outperforms models that perform better on clean data. We will release the COD10K-C GitHub repository to support future research in robust camouflaged object detection.
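
The benchmark arithmetic and the retention figure can be checked with a short sketch. The abstract does not define retention precisely; the function below assumes it is the mean corrupted Dice score divided by the clean Dice score, and the example Dice values are made up. The per-condition count assumes the evaluation pairs are split evenly across conditions.

```python
CORRUPTION_TYPES = 8
SEVERITY_LEVELS = 5
EVALUATION_PAIRS = 81_040

conditions = CORRUPTION_TYPES * SEVERITY_LEVELS
pairs_per_condition = EVALUATION_PAIRS // conditions


def dice_retention(clean_dice, corrupted_dice):
    """Percentage of the clean Dice score kept, averaged over corrupted conditions."""
    mean_corrupted = sum(corrupted_dice) / len(corrupted_dice)
    return 100.0 * mean_corrupted / clean_dice


# Made-up Dice scores, for illustration only.
example = dice_retention(0.80, [0.76, 0.72, 0.74])
print(conditions, pairs_per_condition, round(example, 1))
```

Under this reading, a model with a lower clean score can still show higher retention, which is how a lightweight model can match stronger models on the hardest corruptions.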

2606.02602 2026-06-03 cs.LG cs.CV 版本更新

Graph Mamba Survival Analysis Based on Topology-Aware ordering

基于拓扑感知排序的图Mamba生存分析

Yuanfang Chen, Peiqiang Yan, Yuntao Shou, Qian Zhao, Xiangyong Cao

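Affiliations: School of Mathematics and Statistics; West China Science and Technology Innovation Harbor; School of Computer Science and Technology

AI summary: Mamba is sensitive to input order, and its unidirectional architecture limits the use of spatial structure in WSI survival analysis. TopoMamSurv, a Graph Mamba framework based on topology-aware ordering, combines the TAO strategy, a bidirectional Mamba module, and a GCN to model long-range dependencies efficiently and capture bidirectional spatial context.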

详情
AI中文摘要

在计算病理学中,全切片图像(WSI)生存分析对于患者预后评估至关重要,但面临多项技术挑战。尽管Transformer通过其自注意力机制捕获长程依赖,但其$O(N^2)$时间复杂度在大规模WSI图结构中造成严重计算瓶颈。Mamba模型以线性复杂度突破了Transformer的计算瓶颈。然而,由于Mamba对输入数据顺序的高度敏感性,图Mamba中传统的节点排序方法(如基于节点度或子图大小的方法)未能充分考虑图数据的拓扑连通性,从而限制了Mamba序列建模的性能。此外,其单向架构无法利用图像的双向空间结构。为解决这些挑战,本文提出一种基于拓扑感知排序的新型图Mamba生存分析框架(TopoMamSurv),以适应Mamba的序列敏感性。我们的可视化实验进一步证实,通过拓扑感知排序(TAO)策略提取的节点确实表现出更高的相似性。此外,我们设计了双向Mamba模块并集成图卷积网络(GCN),以实现图像的双向空间上下文建模,形成“局部聚合-全局捕获”的分层特征学习架构。该框架通过TAO、双向语义建模和分层特征融合的系统设计,有效调和了WSI分析中长程依赖建模、计算效率和空间结构利用之间的矛盾。该框架在五个TCGA数据集上验证了其全面的性能优势。

英文摘要

In computational pathology, survival analysis on whole slide images (WSIs) is crucial for patient prognosis assessment, but it faces multiple technical challenges. Although the Transformer captures long-range dependencies through its self-attention mechanism, its $O(N^2)$ time complexity causes a severe computational bottleneck on large-scale WSI graph structures. The Mamba model overcomes this bottleneck with linear complexity. However, because Mamba is highly sensitive to the order of its input data, traditional node sorting methods in Graph Mamba, such as those based on node degree or subgraph size, fail to adequately account for the topological connectivity of graph data, which restricts the performance of Mamba's sequential modeling. Moreover, its unidirectional architecture cannot leverage the bidirectional spatial structure of images. To address these challenges, this paper proposes a novel Graph Mamba survival analysis framework based on topology-aware ordering (TopoMamSurv) that adapts to the sequential sensitivity of Mamba. Our visualization experiments further confirm that the nodes extracted through the topology-aware ordering (TAO) strategy indeed exhibit higher similarity. Furthermore, we design a bidirectional Mamba module and integrate a Graph Convolutional Network (GCN) to achieve bidirectional spatial context modeling of images, forming a hierarchical feature learning architecture for "local aggregation - global capture." Through its systematic design of TAO, bidirectional semantic modeling, and hierarchical feature fusion, the framework reconciles the tension among long-range dependency modeling, computational efficiency, and spatial structure utilization in WSI analysis. It is validated on five TCGA datasets, where it shows a comprehensive performance advantage.

2606.02601 2026-06-03 cs.LG 版本更新

Testing the Test: Score-Direction Instability in Class-Split Anomaly Detection

测试测试:类分割异常检测中的分数方向不稳定性

Alejandro Ascarate, Leo Lebrat, Rodrigo Santa Cruz, Clinton Fookes, Olivier Salvado

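AI summary: Introduces a neighborhood class leakage diagnostic showing that class-split anomaly detection protocols suffer score-direction instability when the held-out anomaly class overlaps the normal mixture, and recommends treating such benchmarks as geometry-dependent stress tests.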

Comments 4+1 pages, 1 figure, accepted at ICML 2026 Workshop on Hypothesis Testing

详情
AI中文摘要

数据集内类分割评估被广泛用作完全无条件分布外异常检测的代理。我们表明,当留出的异常类在表示空间中与正常混合重叠时,该协议可能变得不适定。在这种情况下,异常分数可能趋近于随机甚至反转,且偏好的分数方向可能取决于未知的异常类。我们引入了一种简单的无训练诊断方法——邻域类泄漏,并表明它在Fashion-MNIST、CIFAR-10和Imagenette上,无论是在像素空间还是VAE潜在空间中,都能预测分数方向的不稳定性。我们的结果表明,类分割异常检测基准应被视为几何依赖的压力测试,而非异常检测能力的无条件证据。

英文摘要

Within-dataset class-split evaluation is widely used as a proxy for fully unconditional out-of-distribution anomaly detection. We show that this protocol can become ill-posed when the held-out anomaly class overlaps the normal mixture in representation space. In this regime, anomaly scores may collapse toward chance or even invert, and the preferred score direction can depend on the unknown anomaly class. We introduce a simple training-free diagnostic, neighborhood class leakage, and show that it predicts score-direction instability across Fashion-MNIST, CIFAR-10, and Imagenette, in both pixel and VAE latent spaces. Our results suggest that class-split AD benchmarks should be treated as geometry-dependent stress tests rather than unconditional evidence of anomaly-detection ability.
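
The abstract names the diagnostic but does not define it, so the sketch below is a guess at what a training-free "neighborhood class leakage" score could look like, not the authors' formula: for each held-out anomaly point, take its k nearest neighbours in representation space and measure what fraction belong to the normal classes. The function name, the choice of k, and the toy points are all assumptions.

```python
import math


def neighborhood_class_leakage(normal_points, anomaly_points, k=3):
    """Assumed definition: mean fraction of each anomaly point's k nearest
    neighbours (among all other points) that come from the normal classes."""
    labelled = [(p, "normal") for p in normal_points]
    labelled += [(p, "anomaly") for p in anomaly_points]
    fractions = []
    for i, (query, label) in enumerate(labelled):
        if label != "anomaly":
            continue
        others = [
            (math.dist(query, p), lab)
            for j, (p, lab) in enumerate(labelled)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        fractions.append(sum(1 for _, lab in nearest if lab == "normal") / k)
    return sum(fractions) / len(fractions)


normal = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
separated = neighborhood_class_leakage(normal, [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)])
overlapping = neighborhood_class_leakage(normal, [(0.05, 0.05)])
print(separated, overlapping)
```

A score near 0 corresponds to a well-separated anomaly class, while a score near 1 corresponds to the overlap regime in which the abstract reports anomaly scores collapsing toward chance or inverting.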

2606.02598 2026-06-03 cs.LG cs.HC 版本更新

Assessing Region-Level EEG Contributions to Cognitive Workload Prediction

评估区域级脑电图对认知负荷预测的贡献

Jacob Wong, Sohan Singh, Prannaya Gupta, Jin Xing Ang, Kritika Johari, U-Xuan Tan

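Affiliations: School of InfoComm Technology, Ngee Ann Polytechnic; Engineering Product Development Pillar, Singapore University of Technology and Design; NUS High School of Math and Science

AI summary: Proposes a region-level evaluation framework and, in a large-scale analysis of four public datasets, finds that frontal electrode groups outperform the full-scalp baseline in subject-independent evaluations across all datasets, with fronto-central regions showing the most stable predictive utility, supporting the design of efficient and generalizable EEG workload monitoring systems.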

Comments Accepted to EMBC 2026

详情
AI中文摘要

准确且可泛化的脑电图(EEG)认知负荷估计对于以人为中心和安全性关键的系统至关重要。尽管EEG广泛用于负荷评估,但区域级EEG贡献在不同任务、数据集和受试者之间的一致性仍不清楚。本文提出了一个基于EEG的负荷预测区域级评估框架,其中模型使用仅从属于解剖学定义的头皮区域的电极提取的特征进行训练和评估。我们对四个公开可用的EEG负荷数据集进行了大规模分析,这些数据集涵盖了多样化的任务需求、记录硬件和电极布局。区域重要性通过一种模型无关的、基于性能的方法在混合受试者和受试者独立评估协议下进行量化,并使用基于排名的策略汇总结果,以确保跨实验配置的鲁棒性。在所有数据集和受试者独立评估中,额叶电极组在相对排名位置上优于全头皮基线约15-20%,同时使用的电极数量显著减少。额中央区域表现出最稳定的预测效用,而顶叶和枕叶区域在实验条件下的贡献一致性较低。这些发现表明,与负荷相关的EEG信息最一致地保留在额叶和额中央电极组中,支持设计高效且可泛化的基于EEG的负荷监测系统。

英文摘要

Accurate and generalizable estimation of cognitive workload from electroencephalography (EEG) is critical for human-centered and safety-critical systems. Although EEG is widely used for workload assessment, the consistency of region-level EEG contributions across tasks, datasets, and subjects remains unclear. This paper presents a region-level evaluation framework for EEG-based workload prediction in which models are trained and evaluated using features extracted exclusively from electrodes belonging to anatomically defined scalp regions. We perform a large-scale analysis across four publicly available EEG workload datasets spanning diverse task demands, recording hardware, and electrode montages. Region importance is quantified using a model-agnostic, performance-based approach under both mixed-subject and subject-independent evaluation protocols, with results aggregated using a rank-based strategy to ensure robustness across experimental configurations. Across all datasets and subject-independent evaluations, frontal electrode groups outperform the full-scalp baseline by approximately 15-20% in relative rank position while using substantially fewer electrodes. Fronto-central regions exhibit the most stable predictive utility, whereas posterior and occipital regions contribute less consistently across experimental conditions. These findings indicate that workload-relevant EEG information is most consistently retained within frontal and fronto-central electrode groups, supporting the design of efficient and generalizable EEG-based workload monitoring systems.

2606.02597 2026-06-03 cs.LG cs.CR 版本更新

Making Brain-Computer Interfaces More Secure

使脑机接口更安全

Md Fahimul Kabir Chowdhury, Gahangir Hossain

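Affiliations: University of California, Berkeley; Stanford University

AI summary: EEG-based brain-computer interfaces (BCIs) are vulnerable to adversarial attacks. Proposes a lightweight convolutional neural network (CNN) that retains better classification performance under gradient-based attacks than EEGNet, DeepConvNet, and SleepEEGNet.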

Comments Accepted and presented at IEEE World AI IoT Congress 2026

详情
AI中文摘要

基于脑电图(EEG)的脑机接口(BCI)的发展主要得益于机器学习而显著进步。尽管早期研究大多集中在提高分类准确率上,但安全性和鲁棒性方面关注较少。根据最近的研究,基于EEG的BCI容易受到对抗性攻击,这些攻击可能由于微小、精心设计的扰动而导致误诊。因此,评估模型对此类扰动的鲁棒性对于确保可靠部署至关重要。在本研究中,我们提出了一种轻量级的自定义卷积神经网络(CNN)架构,以研究基于EEG的BCI中的对抗鲁棒性。所提出的方法使用两个EEG数据集进行评估,并与三种针对EEG定制的新型CNN模型(即EEGNet、DeepConvNet和SleepEEGNet)在基于梯度的对抗攻击场景下进行对比。实验结果表明,在对抗扰动下,所提出的模型在分类性能上持续优于基线模型,显示出更强的鲁棒性。这些发现突显了轻量级架构在对抗条件下增强基于EEG的BCI系统可靠性的潜力。

英文摘要

The development of brain-computer interfaces (BCIs) based on electroencephalograms (EEGs) has advanced significantly, mainly due to machine learning. Although the majority of earlier research has focused on increasing classification accuracy, relatively little attention has been paid to security and robustness. According to recent research, EEG-based BCIs are susceptible to adversarial attacks, in which minute, well-crafted perturbations can cause misdiagnosis. Evaluating model robustness against such perturbations is therefore critical for ensuring reliable deployment. In this study, we propose a lightweight custom Convolutional Neural Network (CNN) architecture to investigate adversarial robustness in EEG-based BCIs. The proposed method is assessed using two EEG datasets and compared with three novel CNN models tailored to EEG, namely EEGNet, DeepConvNet, and SleepEEGNet, under gradient-based adversarial attack scenarios. Experimental results show that the proposed model consistently achieves better classification performance under adversarial perturbations than the baseline models, indicating improved robustness. These findings highlight the potential of lightweight architectures for enhancing the reliability of EEG-based BCI systems under adversarial conditions.

2606.02596 2026-06-03 cs.LG 版本更新

Spectral Asymptotics of Neural Network Loss Landscapes: An Exact Decomposition of the Curvature Exponent

神经网络损失景观的谱渐近:曲率指数的精确分解

Anherutowa Calvo

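Affiliations: 9D Labs

AI summary: Proves the Spectral Alignment Decomposition $α = 2 + d\log Φ_k / d\log σ_k$ for the curvature exponent, giving a geometric explanation for why it varies across layer types, and derives the spectral transfer identity $s = αγ$, which predicts the Hessian decay exponent to about 2% median error with no free parameters.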

Comments 13 pages, 6 figures, 3 tables. Code and data: https://github.com/9D-Labs/9d-spectral-alignment-decomposition

详情
AI中文摘要

曲率指数α(h_k ∝ σ_k^α中,控制Hessian特征值如何随梯度奇异值缩放)在不同层类型中系统变化(卷积层约2,Transformer注意力层约1,MLP上投影层小于1)。为什么?我们证明了谱对齐分解:α = 2 + dlogΦ_k / dlogσ_k,其中Φ_k衡量Kronecker因子特征基与梯度奇异方向之间的对齐程度。这将“为什么α变化?”简化为一个几何问题,我们针对LayerNorm、残差连接和softmax头给出了答案。该分解蕴含一个谱传递恒等式s = αγ,连接曲率指数、有效梯度秩衰减γ和Hessian衰减指数s。该恒等式是代数的;其实证内容是,在独立数据(HVPs vs. SVD)上拟合的α和γ,在93个层、五种架构和三个数据集上以约2%的中位误差恢复s,且无自由参数。参与比的zeta函数界表明曲率集中在每层一个有效方向上。作为概念验证,我们推导了架构自适应预条件子T(σ;α),并展示了在梯度奇异基中实现T的谱牛顿法在α≈2的视觉基准上优于AdamW。

英文摘要

The curvature exponent $α$ in $h_k \propto σ_k^α$ -- governing how Hessian eigenvalues scale with gradient singular values -- varies systematically across layer types ($α\approx 2$ for convolutions, $\approx 1$ for transformer attention, $< 1$ for MLP up-projections). Why? We prove the Spectral Alignment Decomposition: $α= 2 + d\logΦ_k / d\logσ_k$, where $Φ_k$ measures alignment between Kronecker factor eigenbases and gradient singular directions. This reduces "why does $α$ vary?" to a geometric question we answer for LayerNorm, residual connections, and softmax heads. The decomposition implies a spectral transfer identity $s = αγ$ linking curvature exponent, effective gradient rank-decay $γ$, and Hessian decay exponent $s$. The identity is algebraic; its empirical content is that $α$ and $γ$, fit on independent data (HVPs vs. SVD), recover $s$ to ~2% median error across 93 layers, five architectures, and three datasets -- with no free parameters. A zeta-function bound on participation ratio shows curvature concentrates onto effectively one direction per layer. As a proof of concept, we derive the architecture-adaptive preconditioner $T(σ;α)$ and show that Spectral Newton -- implementing $T$ in the gradient singular basis -- outperforms AdamW on vision benchmarks where $α\approx 2$.
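
The spectral transfer identity follows from composing two power laws. As a sketch, assume (this reading of "rank-decay" and "decay exponent" is an interpretation of the abstract, not quoted from the paper) that gradient singular values decay with index as $σ_k \propto k^{-γ}$ and Hessian eigenvalues as $h_k \propto k^{-s}$:

```latex
h_k \propto \sigma_k^{\alpha}, \qquad \sigma_k \propto k^{-\gamma}
\;\Longrightarrow\;
h_k \propto \left(k^{-\gamma}\right)^{\alpha} = k^{-\alpha\gamma},
\qquad\text{so}\qquad s = \alpha\gamma .
```

For instance, under these assumed forms a layer with $α = 2$ and $γ = 0.5$ would have Hessian decay exponent $s = 1$.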

2606.02595 2026-06-03 cs.LG 版本更新

Human-in-the-Loop Contextual Bandits for Short-Term Rental Dynamic Pricing: Structural Equivalence of Historical Warm-Up and Approval-Gated Live Learning

面向短租动态定价的人机协同上下文赌博机:历史预热与审批门控在线学习的结构等价性

Oleg Miroshnichenko

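Affiliations: Oleg Miroshnichenko

AI summary: For short-term rental dynamic pricing, where feedback is sparse and decisions are high-risk, proposes a human-in-the-loop gated bandit framework, shows that historical pricing data is structurally equivalent to on-policy warm-up data, and designs a regularised ridge-regression warm-up that compresses the cold-start period from about 150 episodes to about 30.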

详情
AI中文摘要

短租市场中的动态定价为在线学习算法带来了独特挑战:定价决策具有重大财务风险,运营商需要可解释性,且市场反馈稀疏(每个挂牌夜仅有一个预订结果)。我们提出了人机协同门控赌博机(HITL-GB)框架,其中上下文赌博机算法生成价格推荐,但人类代理保留在接受、修改或拒绝每条推荐后应用的权力。我们证明,在审批约束下,历史定价数据——在先前确定性策略下收集的——与用于初始化赌博机后验的策略内预热数据在结构上等价,从而避免了在稀疏反馈市场中使纯在线赌博机学习不可行的数周至数月的冷启动期。我们形式化了审批门控奖励信号,从历史片段推导出正则化岭回归预热程序,并在真实短租生产数据(匿名城市市场,2个房间,2022年4月至2026年4月,1461个夜间定价片段)上验证了该方法。当从层次因子化汤普森采样(HF-TS)家族初始化代理时,我们的预热程序将有效冷启动从约150轮压缩至约30轮。我们进一步论证,该结构等价结果具有领域无关性:任何法律或操作上需要人类审批的高风险领域——包括临床药物剂量、信贷发放、内容审核和放射诊断——都满足相同条件,并受益于相同的预热策略。在受监管行业中,强制性人类监督因此是一种统计资产而非部署约束。

英文摘要

Dynamic pricing in short-term rental (STR) markets presents a distinctive challenge for online learning algorithms: pricing decisions carry significant financial risk, operators require explainability, and market feedback is sparse (one booking outcome per listed night). We introduce the Human-in-the-Loop Gated Bandit (HITL-GB) framework, in which a contextual bandit algorithm generates price recommendations but a human agent retains authority to accept, modify, or reject each recommendation before it is applied. We show that under this approval constraint, historical pricing data -- collected under a prior deterministic policy -- is structurally equivalent to on-policy warm-up data for initialising the bandit's posterior, bypassing the weeks-to-months cold-start period that renders pure online bandit learning impractical in sparse-feedback markets. We formalise the approval-gated reward signal, derive a regularised ridge-regression warm-up procedure from historical episodes, and validate the approach on real STR production data (anonymised urban market, 2 rooms, April 2022 -- April 2026, 1,461 nightly pricing episodes). Our warm-up procedure compresses effective cold-start from ~150 episodes to ~30 episodes when initialising agents from the Hierarchical Factored Thompson Sampling (HF-TS) family. We further argue that the structural equivalence result is domain-agnostic: any high-stakes domain where human approval is legally or operationally required -- including clinical drug dosing, credit origination, content moderation, and radiological diagnosis -- satisfies the same conditions and benefits from the same warm-up strategy. In regulated industries, mandatory human oversight is thus a statistical asset rather than a deployment constraint.
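
A ridge-regression warm-up of a linear bandit can be sketched as follows. The abstract does not give the exact procedure, so this is only the textbook form under stated assumptions: each historical episode supplies a context vector and an observed reward (both hypothetical here), and the warm start accumulates $A = λI + \sum x x^\top$ and $b = \sum r x$, then solves for $θ = A^{-1} b$.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (A is copied)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ridge_warm_start(contexts, rewards, lam=1.0):
    """Return (A, b, theta) with A = lam*I + sum x x^T, b = sum r x, theta = A^{-1} b."""
    d = len(contexts[0])
    A = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for x, r in zip(contexts, rewards):
        for i in range(d):
            b[i] += r * x[i]
            for j in range(d):
                A[i][j] += x[i] * x[j]
    return A, b, solve(A, b)


# Toy historical episodes: two hypothetical context features and a reward.
contexts = [[1, 0], [0, 1], [1, 1]]
rewards = [2, 3, 5]
A, b, theta = ridge_warm_start(contexts, rewards, lam=1.0)
print(A, b, theta)
```

In a Thompson-sampling agent, `A` and `theta` would typically serve as the precision matrix and mean of the initial posterior, so live learning starts from the historical estimate rather than from an uninformed prior.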

2606.02582 2026-06-03 cs.CE cs.LG cs.NA math.NA 版本更新

Applying Two-Grid Preconditioner for Subsurface Flow Simulation using Attention-enhanced Hybrid Network to Accelerate Multiscale Discretization in High-contrast Media

应用注意力增强混合网络的两网格预条件子进行高对比度介质中地下流动模拟以加速多尺度离散化

Peiqi Li, Jie Chen, Shubin Fu

发表机构 * xjtlu.edu.cn(XTL大学)

AI总结 提出一种结合学习与多尺度数值方法的混合框架,利用注意力增强混合网络预测多尺度基函数,并通过两网格预条件求解器加速高对比度介质中达西方程的数值求解。

详情
AI中文摘要

本文研究了强非均质、高对比度渗透率介质中达西方程的高效数值求解,提出了一种结合学习与多尺度数值方法的混合框架。学习组件用于预测混合广义多尺度有限元方法(混合GMsFEM)中的多尺度基函数,旨在减少离线阶段所需的重复局部计算。一旦预测出这些基函数,全局系统被组装,并通过两网格预条件求解器计算压力场。所提方法加速了昂贵的局部基函数构建阶段,同时保留了底层求解器的多尺度离散化和预条件迭代结构。在二维非均质达西问题上的数值实验表明,与几种代表性基于学习的方法相比,所提框架能获得更准确的最终压力重构,并在强非均质和高对比度系数下保持稳定。与传统混合GMsFEM相比,其主要优势在于基函数生成阶段的效率,而全局求解的质量仍由两网格预条件子保证。这些结果表明,通过学习加速多尺度基函数构建,同时保留成熟的全局问题数值求解器,为高分辨率达西型模拟提供了一种可行方法。

英文摘要

In this paper, we study the efficient numerical solution of Darcy equations in strongly heterogeneous media with high-contrast permeability and propose a hybrid framework that combines learning with multiscale numerical methods. The learning component is used for the prediction of multiscale basis functions in the mixed generalized multiscale finite element method (mixed GMsFEM), with the goal of reducing the repeated local computations required in the offline stage. Once these basis functions are predicted, the global system is assembled and the pressure field is computed by a two-grid preconditioned solver. The resulting method accelerates the costly local basis-construction stage while retaining the multiscale discretization and preconditioned iterative structure of the underlying solver. Numerical experiments on two-dimensional heterogeneous Darcy problems show that the proposed framework yields more accurate final pressure reconstruction than several representative learning-based methods and remains stable under strong heterogeneity and high-contrast coefficients. In comparison with the traditional mixed GMsFEM, its main advantage lies in the efficiency of the basis-generation stage, while the quality of the global solve is still ensured by the two-grid preconditioner. These results indicate that accelerating multiscale basis construction through learning, while preserving a mature numerical solver for the global problem, provides a viable approach for high-resolution Darcy-type simulations.

2606.03769 2026-06-03 math.OC cs.LG math.PR 版本更新

Bregman meets Lévy: Stochastic mirror descent with heavy-tailed noise in continuous and discrete time

Bregman遇见Lévy:具有重尾噪声的随机镜像下降在连续和离散时间中

Pierre-Louis Cauvin, Panayotis Mertikopoulos

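AI summary: Studies the robustness of stochastic mirror descent under heavy-tailed noise. Introduces the Lévy mirror flow as a continuous-time model, establishes the time needed to reach $ε$-optimality for convex and strongly convex objectives, and derives matching discrete-time guarantees.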

Comments 68 pages, 3 figures; to appear in the proceedings of ICML 2026

详情
AI中文摘要

我们研究了随机镜像下降(SMD)在重尾噪声下的鲁棒性,重点关注该方法在使用无限方差随机梯度输入时是否保持其收敛保证。为了以原则性的方式解决这个问题,我们首先引入SMD的连续时间模型,作为一个由具有有限$p$阶矩($1 < p \leq 2$)的中心化Lévy噪声过程驱动的随机微分方程(SDE)。该方案——我们称之为Lévy镜像流(LMF)——自然作为重尾噪声下SMD的缩放极限出现。特别地,当$p < 2$(即重噪声区域)时,LMF的轨迹通常表现出任意大小的跳跃不连续性,如果这些跳跃足够频繁,会导致无限方差。然而,尽管存在这种高度奇异的行为,我们证明LMF在凸情况下在$\mathcal{O}(\epsilon^{-p/(p-1)})$时间内达到$\epsilon$-最优,在(相对)强凸目标下在$\mathcal{\tilde O}(\epsilon^{-1/(p-1)})$时间内达到$\epsilon$-最优。这些保证提供了频繁长跳跃对过程收敛影响的清晰刻画,并渗透到重尾噪声下SMD几种变体的系列匹配离散时间保证中。

英文摘要

We study the robustness of stochastic mirror descent (SMD) under heavy-tailed noise, focusing on whether the method retains its convergence guarantees when run with infinite-variance stochastic gradient input. To address this question in a principled manner, we begin by introducing a continuous-time model of SMD as a stochastic differential equation (SDE) driven by a centered Lévy noise process with finite $p$-th order moments, $1 < p \leq 2$. This scheme -- which we call the Lévy mirror flow (LMF) -- arises naturally as the scaling limit of SMD in the presence of heavy-tailed noise. In particular, when $p < 2$ -- the heavy noise regime -- the trajectories of LMF generically exhibit jump discontinuities of arbitrary magnitude which, if frequent enough, lead to infinite variance. Nonetheless, despite this highly singular behavior, we show that LMF attains $ε$-optimality within $\mathcal{O}(ε^{-p/(p-1)})$ time in the convex case, and within $\mathcal{\tilde O}(ε^{-1/(p-1)})$ time for (relatively) strongly convex objectives. These guarantees provide a transparent characterization of the impact of frequent long jumps on the convergence of the process, and percolate to a series of matching discrete-time guarantees for several variants of SMD under heavy-tailed noise.
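
To see how the tail index $p$ shapes these rates, it helps to plug in two values. This is plain arithmetic on the exponents stated above, with $p = 2$ as the finite-variance endpoint and $p = 3/2$ as an arbitrary heavy-tailed example:

```latex
\begin{aligned}
p = 2:&\quad \mathcal{O}\!\left(\epsilon^{-2/(2-1)}\right) = \mathcal{O}\!\left(\epsilon^{-2}\right),
&\quad \tilde{\mathcal{O}}\!\left(\epsilon^{-1/(2-1)}\right) &= \tilde{\mathcal{O}}\!\left(\epsilon^{-1}\right), \\
p = \tfrac{3}{2}:&\quad \mathcal{O}\!\left(\epsilon^{-(3/2)/(1/2)}\right) = \mathcal{O}\!\left(\epsilon^{-3}\right),
&\quad \tilde{\mathcal{O}}\!\left(\epsilon^{-1/(1/2)}\right) &= \tilde{\mathcal{O}}\!\left(\epsilon^{-2}\right).
\end{aligned}
```

Both exponents grow without bound as $p \to 1$, so heavier tails translate directly into longer times to reach $ε$-optimality.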

2606.02758 2026-06-03 math.DG cs.LG math.CT 版本更新

Theoretical Aspects of Lie Groupoid and Lie Algebroid Equivariant Convolutional Neural Networks

李群胚与李代数胚等变卷积神经网络的理论方面

Michael Astwood

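Affiliations: Department of Mathematics, University of Manitoba

AI summary: Introduces Lie groupoid equivariant neural networks as a specialization of topological category-equivariant neural networks to the differentiable setting, shows their equivalence to Lie algebroid equivariant neural networks for suitable Lie groupoids, and generalizes group invariant global pooling.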

Comments 28 pages, 2 figures. Preliminary version. Comments and criticism welcome!

详情
AI中文摘要

我们将李群胚等变神经网络作为最近提出的拓扑范畴等变神经网络在可微情形的特化引入。李群胚等变神经网络由李群胚提升卷积和李群胚卷积层组成,并且我们展示了对于合适的李群胚,它们等价于某些李代数胚等变神经网络。此外,我们将群胚不变全局池化描述为群不变全局池化的推广。进一步,我们通过证明上述每一层都定义了连续特征函子之间的连续自然变换,表明它们都是最近引入的可容许范畴等变层的特例。

英文摘要

We introduce Lie groupoid equivariant neural networks as a specialization of recently proposed topological category-equivariant neural networks to the differentiable setting. Lie groupoid equivariant neural networks are composed from Lie groupoid lifting convolutions and Lie groupoid convolution layers, and we show how for suitable Lie groupoids they are equivalent to certain Lie algebroid-equivariant neural networks. We additionally describe groupoid invariant global pooling as a generalization of group invariant global pooling. Furthermore, we show that each of the aforementioned layers is a special case of recently introduced admissible category-equivariant layers by demonstrating that they define continuous natural transformations between continuous feature functors.

2606.03517 2026-06-03 quant-ph cs.AI cs.LG 版本更新

Scalable On-Hardware Training of Quantum Neural Networks and Application to Clinical Data Imputation

可扩展的量子神经网络片上训练及其在临床数据填补中的应用

Natansh Mathur, Panagiotis Kl. Barkoutsos, Masako Yamada, Martin Roetteler, Iordanis Kerenidis

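Affiliations: IRIF, CNRS and Université Paris Cité; QC Ware, France; IonQ; Quantum Signals

AI summary: Proposes a training framework combining a Butterfly circuit architecture, layer-wise training, and a parallelised parameter-shift rule, reducing gradient-estimation cost from $O(n^2)$ to $O(\log n)$, and validates its scalability and performance on the MIMIC-III dataset.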

Comments 13 pages, 9 figures

详情
AI中文摘要

在量子硬件上训练量子神经网络(QNN)目前受限于梯度估计的成本:标准参数位移方法所需的电路评估次数随可训练参数数量二次增长,使得在小型系统之外难以进行基于硬件的优化。在这项工作中,我们引入了一个训练框架,将该成本降低到量子比特数的对数级别,使得在近期硬件上以更大规模进行基于梯度的QNN优化成为可能。我们的框架结合了三个协同设计的要素:(i)一种结构化的、保持子空间的蝴蝶电路架构,具有$O(n \log n)$个参数和对数深度;(ii)一种逐层训练策略,将片上优化限制在每次一个小型、结构良好的层上;(iii)一种并行化的参数位移规则,利用每个蝴蝶层内的交换结构,在恒定数量的电路执行中提取所有梯度。这些共同将每个优化步骤所需的独立电路评估次数从$O(n^2)$减少到$O(\log n)$。我们使用MIMIC-III电子健康记录数据集在临床数据填补上验证了该框架,这是一个对优化不稳定性和模型方差敏感的高要求基准。混合经典-量子模型直接在IonQ Forte Enterprise离子阱硬件上以16量子比特进行训练,性能相对于理想或噪声模拟没有下降,并通过张量网络模拟以32量子比特进行训练,32量子比特推理在硬件上执行。得到的模型在下游患者生存预测中匹配或超过强经典神经网络基线,同时表现出跨运行的低方差,证明了所提出的框架在现实硬件约束下实现了实用、可扩展的QNN训练。

英文摘要

Training quantum neural networks (QNNs) on quantum hardware is currently bottlenecked by the cost of gradient estimation: standard parameter-shift methods require a number of circuit evaluations that grows quadratically with the number of trainable parameters, making hardware-based optimisation impractical beyond small system sizes. In this work, we introduce a training framework that reduces this cost to logarithmic in the number of qubits, making gradient-based QNN optimisation feasible on near-term hardware at increasing scales. Our framework combines three co-designed ingredients: (i) a structured, subspace-preserving Butterfly circuit architecture with $O(n \log n)$ parameters and logarithmic depth; (ii) a layer-wise training strategy that confines on-hardware optimisation to one small, well-structured layer at a time; and (iii) a parallelised parameter-shift rule that exploits the commuting structure within each Butterfly layer to extract all gradients in a constant number of circuit executions. Together these reduce the number of distinct circuit evaluations per optimisation step from $O(n^2)$ to $O(\log n)$. We validate the framework on clinical data imputation using the MIMIC-III electronic health record dataset, a demanding benchmark sensitive to optimisation instability and model variance. Hybrid classical-quantum models are trained directly on IonQ Forte Enterprise trapped-ion hardware at 16 qubits without performance degradation relative to ideal or noisy simulation and via tensor-network simulation at 32 qubits, with 32-qubit inference executed on hardware. The resulting models match or exceed strong classical neural baselines in downstream patient survival prediction while exhibiting reduced variance across runs, demonstrating that the proposed framework enables practical, scalable QNN training under realistic hardware constraints.
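
For readers unfamiliar with the baseline being improved, here is a minimal sketch of the standard two-term parameter-shift rule on a toy one-parameter circuit. It is not the paper's parallelised rule; the toy expectation function (the Pauli-Z expectation after an RY rotation on the zero state, which equals cos θ) stands in for a hardware circuit evaluation.

```python
import math


def expectation(theta):
    """Toy single-parameter circuit: <Z> after RY(theta) on |0> equals cos(theta)."""
    return math.cos(theta)


def parameter_shift_gradient(f, theta, shift=math.pi / 2):
    """Two-term parameter-shift rule for a gate generated by a Pauli operator."""
    return (f(theta + shift) - f(theta - shift)) / (2.0 * math.sin(shift))


theta = 0.7
grad = parameter_shift_gradient(expectation, theta)
print(grad, -math.sin(theta))
```

Each parameter costs two circuit evaluations under this rule, which is why the number of distinct circuits per optimisation step becomes the bottleneck that the framework targets.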

2606.02655 2026-06-03 quant-ph cs.GT cs.LG math.OC 版本更新

Coherent Swap Regret and Channel-Proof Learning

相干交换遗憾与信道证明学习

Sohail Sarkar

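Affiliations: Sohail (Neel) Sarkar

AI summary: For local CPTP deviations in quantum games, proposes coherent swap regret as a benchmark and achieves an $O(\sqrt{dT\log d})$ regret bound via entropic mirror ascent; shows that the hardness stems from non-unital use of the recommendation register, and applies the result to finite quantum games to reach $\varepsilon$-approximate separable quantum correlated equilibria.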

Comments 23 pages

详情
AI中文摘要

外部遗憾仅保证相对于固定替代行为的稳定性。在量子博弈中,这遗漏了一个自然的物理操作:玩家可以对其实际接收或制备的状态应用局部完全正迹保持(CPTP)映射。我们引入相干交换遗憾作为针对所有此类局部CPTP偏离的遗憾基准,并给出一种算法,通过在CPTP Choi切片上的熵镜像上升结合不动点博弈规则,实现O(√(dT log d))的相干交换遗憾。主要结果是一个三级偏离类景观。替换信道以Θ(√(T log d))的速率恢复普通外部遗憾。保单位(unital)信道(包括幺正偏离和幺正变换的混合)具有零极小极大遗憾。确定性测量-制备信道在中等时间范围内已迫使Ω(√(dT log d))的遗憾,且该速率对所有CPTP偏离也是充分的。因此,困难源于对推荐寄存器的非保单位(non-unital)使用,而非仅源于量子相干性。作为应用,有限量子博弈中的去中心化完全信息学习在T=O(max_i d_i log d_i/ε^2)轮后达到ε-近似可分量子相关均衡。我们将这些均衡与中介量子推荐协议的抗信道性(channel-proofness)等同,给出适用于任意有限维状态的局部CPTP可利用性的SDP审计,并包含一个在Haar随机纯态探测下具有伪遗憾O(d^{4/3}T^{2/3}(log d)^{1/3})的探测-赌博机扩展。

英文摘要

External regret certifies stability only against replacing one's behavior by a fixed alternative. In a quantum game, this misses a natural physical move: a player can apply a local completely positive trace-preserving (CPTP) map to the state it actually received or prepared. We introduce coherent swap regret as the regret benchmark against all such local CPTP deviations, and give an algorithm achieving $O(\sqrt{dT\log d})$ coherent swap regret via entropic mirror ascent on the CPTP Choi slice with a fixed-point play rule. The main result is a three-level deviation-class landscape. Replacement channels recover ordinary external regret at rate $\Theta(\sqrt{T\log d})$. Unital channels, including unitary deviations and mixtures of unitaries, have zero minimax regret. Deterministic measurement-and-preparation channels already force $\Omega(\sqrt{dT\log d})$ regret in the moderate-horizon regime, and this rate is also sufficient for all CPTP deviations. Thus the hardness comes from non-unital use of the recommendation register, not from quantum coherence alone. As an application, decentralized full-information learning in finite quantum games reaches an $\varepsilon$-approximate separable quantum correlated equilibrium after $T=O(\max_i d_i\log d_i/\varepsilon^2)$ rounds. We identify these equilibria with channel-proofness of mediated quantum recommendation protocols, give an SDP audit for local CPTP exploitability applicable to arbitrary finite-dimensional states, and include a probing-bandit extension with pseudo-regret $O(d^{4/3}T^{2/3}(\log d)^{1/3})$ under Haar-random pure-state probes.

2606.03917 2026-06-03 physics.app-ph cs.LG 版本更新

Beyond Gradient Descent: Adam for Analog Ising Machines

超越梯度下降:用于模拟伊辛机的Adam优化器

Stijn Van Vooren, Guy Van der Sande, Guy Verschaffelt

发表机构 * Applied Physics research group, Vrije Universiteit Brussel(应用物理研究组,布鲁塞尔自由大学)

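AI summary: Studies momentum and Adam optimization for analog, continuous-time Ising machines. By deriving continuous-time versions of these optimizers, it substantially shortens time-to-target and improves solution quality on Max-Cut benchmarks, and introduces a first-order continuous-time approximation of Adam as a simpler starting point for physical implementations.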

Comments submitted to Physical Review E

详情
AI中文摘要

随着摩尔定律达到极限,伊辛机为难优化问题提供了一种有前景的替代计算方法。然而,许多模拟、时间连续的伊辛机依赖类似梯度下降的动力学来寻找解,这可能限制速度和鲁棒性。我们研究了动量法和Adam优化是否能改进这些系统。由于这些优化器传统上以离散时间形式表述,我们推导了适用于模拟、时间连续伊辛机动力的连续时间版本。在Max-Cut基准测试中,我们发现基于Adam的动力学相比基于梯度下降和动量的动力学,显著减少了达到目标的时间并提高了解质量。我们进一步引入了Adam的一阶连续时间近似,旨在作为未来物理实现的更简单起点,并且在连续时间设置中表现优于完整的Adam公式。我们还研究了纯算法离散时间设置,其中在较容易的问题实例上性能差距缩小,而在较难的加权问题实例上基于Adam的更新规则表现最佳。这些结果将连续时间Adam动力学确定为模拟伊辛机的一个强大设计原则。

英文摘要

As Moore's law reaches its limits, Ising machines offer a promising alternative computing approach for difficult optimization problems. However, many analog, time-continuous Ising machines rely on gradient-descent-like dynamics to find solutions, which can limit speed and robustness. We investigate whether momentum and Adam optimization can improve these systems. Since these optimizers are traditionally formulated in discrete time, we derive continuous-time versions suitable for analog, time-continuous Ising-machine dynamics. On Max-Cut benchmarks, we find that Adam-based dynamics substantially reduce time-to-target and improve solution quality compared with gradient-descent- and momentum-based dynamics. We further introduce a first-order continuous-time approximation of Adam that is intended as a simpler starting point for future physical implementations while also performing better than the full Adam formulation in a continuous-time setting. We also study a purely algorithmic discrete-time setting, where the performance gap is reduced on easier problem instances, while the Adam-based update rule performs best on harder weighted problem instances. These results identify continuous-time Adam dynamics as a powerful design principle for analog Ising machines.
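
To illustrate the purely algorithmic discrete-time setting mentioned above, the following sketch runs textbook Adam on a relaxed Ising energy for Max-Cut. It assumes soft spins clipped to [-1, 1] and the energy E(x) = 0.5 xᵀWx. It is not the authors' continuous-time derivation, and the function name, step size, and step count are hypothetical choices.

```python
import numpy as np

def adam_ising_maxcut(W, steps=500, lr=0.05, beta1=0.9, beta2=0.999,
                      eps=1e-8, seed=0):
    """Minimise E(x) = 0.5 * x^T W x over soft spins with Adam, then round."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    x = 0.01 * rng.standard_normal(n)      # soft spins start near the origin
    m = np.zeros(n)                        # first-moment estimate
    v = np.zeros(n)                        # second-moment estimate
    for t in range(1, steps + 1):
        grad = W @ x                       # gradient of the relaxed energy
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)       # bias correction
        v_hat = v / (1 - beta2 ** t)
        x = np.clip(x - lr * m_hat / (np.sqrt(v_hat) + eps), -1.0, 1.0)
    spins = np.where(x >= 0, 1, -1)        # round to Ising spins
    # Each cut edge (opposite spins) contributes its weight once.
    cut = 0.25 * np.sum(W * (1 - np.outer(spins, spins)))
    return spins, float(cut)

# 4-cycle graph: symmetric weights, zero diagonal.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
spins, cut = adam_ising_maxcut(W)
```

Replacing the Adam update with `x -= lr * grad` gives the gradient-descent-like baseline that the abstract compares against.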

2606.02600 2026-06-03 cond-mat.dis-nn cs.LG 版本更新

High-Dimensional Latents Should Be Diagnosed Through Phase Structure

高维潜在变量应通过相结构进行诊断

Alejandro Ascarate, Leo Lebrat, Rodrigo Santa Cruz, Clinton Fookes, Olivier Salvado

发表机构 * Queensland University of Technology(昆士兰科技大学)

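AI summary: Analyzes the latent spaces of autoencoders and variational autoencoders through spin-glass theory, proposes diagnostics based on phase structure, and demonstrates practical benefits in generation and anomaly detection.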

Comments 9+22 pages, 4+6 figures, under review

详情
AI中文摘要

我们通过自旋玻璃理论的视角研究自编码器和变分自编码器的潜在空间。本文包含两个部分。首先,我们形式化了一个潜在空间自旋玻璃字典:对于固定的解码器,重建项与超球坐标先验共同在潜在球面上诱导出一个哈密顿量,其中潜在坐标扮演连续自旋的角色,先验则充当外部磁场。这使我们能够引入可操作的自旋玻璃诊断——重叠分布、磁化率和块自旋粗粒化——来检测训练后潜在表示中的有序、无序和边缘稳定相。其次,我们表明,有意将潜在系统推向拓扑平凡化区域的边缘稳定状态会带来具体的下游后果。在生成方面,超球压缩改善了CIFAR-10和CelebA64上的重建-生成权衡,在保持或改善重建的同时降低了自FID。在异常检测方面,相同的半有序潜在几何提高了完全无监督和条件性OOD检测的性能,包括真实世界的火星车和Galaxy Zoo数据集,以及基于CIFAR-10/100和Imagenette的OOD基准。因此,我们倡导对AE/VAE采用相感知的评估范式,其中自旋玻璃可观测量补充标准机器学习指标,并揭示在许多情况下决定下游成功或失败的潜在区域。

英文摘要

We study autoencoder and variational-autoencoder latent spaces through the lens of spin-glass theory. The paper has two components. First, we formalize a latent-space spin-glass dictionary: for a fixed decoder, the reconstruction term together with a hyperspherical coordinates prior induces a Hamiltonian on the latent sphere, where latent coordinates play the role of continuous spins and the prior acts as an external magnetic field. This allows us to import operational spin-glass diagnostics -- overlap distributions, susceptibility, and block-spin coarse-graining -- to detect ordered, disordered, and edge-of-stability phases in trained latent representations. Second, we show that deliberately driving the latent system toward the edge-of-stability of the topological trivialization regime has concrete downstream consequences. In generation, hyperspherical compression improves the reconstruction-generation trade-off on CIFAR-10 and CelebA64, yielding lower self-FID while preserving or improving reconstruction. In anomaly detection, the same semi-ordered latent geometry improves both fully unsupervised and conditional OOD detection, including real-world Mars Rover and Galaxy Zoo datasets, as well as CIFAR-10/100 and Imagenette-based OOD benchmarks. We therefore advocate a phase-aware evaluation paradigm for AEs/VAEs, in which spin-glass observables complement standard ML metrics and expose the latent regimes that underlie downstream success or failure in many cases.
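
One of the diagnostics named above, the overlap distribution, can be sketched as follows. This assumes latent codes are projected onto the unit sphere and that the overlap between two codes is their inner product. The paper's exact normalisation is not given in the abstract, so treat this as an illustration only.

```python
import numpy as np

def overlap_distribution(latents: np.ndarray) -> np.ndarray:
    """Pairwise overlaps q_ab = z_a . z_b between unit-normalised latent codes."""
    z = latents / np.linalg.norm(latents, axis=1, keepdims=True)
    q = z @ z.T
    upper = np.triu_indices(len(z), k=1)   # each unordered pair once, no self-overlap
    return q[upper]

rng = np.random.default_rng(0)
latents = rng.standard_normal((50, 16))    # stand-in for encoder outputs
overlaps = overlap_distribution(latents)
```

A histogram of `overlaps` is the empirical overlap distribution. In spin-glass language, a narrow peak near zero suggests a disordered phase, while mass away from zero suggests ordering.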

2606.02912 2026-06-03 astro-ph.IM cs.LG gr-qc physics.geo-ph 版本更新

Data-Driven Forecasting of three-Component Seismograms Using Transformer Architectures

基于Transformer架构的三分量地震图数据驱动预测

Waleed Esmail, Stuart Russell, Jana Klinge, Alexander Kappes, Christine Thomas

发表机构 * Institut für Kernphysik, Universität Münster(明斯特大学核物理研究所) Institut für Geophysik, Universität Münster(明斯特大学地球物理研究所) James Cook University(詹姆斯·库克大学) Geological Survey of Denmark and Greenland(丹麦和格陵兰地质调查局)

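AI summary: Proposes SeismoGPT, a transformer-based autoregressive model that forecasts three-component seismic waveforms directly, framing the task as a physically constrained continuation problem. On synthetic data it reaches a median normalized cross correlation above 0.93, indicating that transformer sequence models can learn stable dynamical continuation of seismic wavefields.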

Comments 35 pages, 13 figures and 4 tables

详情
AI中文摘要

由于地震波传播的非线性、色散和多尺度特性,预测超出观测数据的地震波形仍然具有挑战性。在这项工作中,我们引入了 SeismoGPT,一种基于Transformer的自回归模型,旨在直接在时域中预测三分量地震波形。预测被表述为一个物理约束的延续问题,其中模型接收从P波到达开始并延伸至S波到达后定义时间的波形上下文,之后在没有真实样本的情况下递归生成未来运动。在合成地震图上进行评估,这些地震图覆盖了5--100 km的震源深度、10--90$^\circ$的震中距离以及$3 \leq M_w \leq 7$的震级。为了区分上下文长度和预测范围的影响,我们使用距离归一化上下文比率和固定的120秒及240秒预测范围定义了三种评估配置。在所有配置中,模型的中位数归一化互相关均高于0.93。对代表性预测的分析表明,成功的预测保留了相位一致性和频谱能量分布。在出现失败案例时,主要原因是自回归展开过程中的逐渐相位漂移,而非非物理的信号生成。这些结果表明,基于Transformer的序列模型可以学习地震波场的稳定动力学延续,凸显了基础模型方法在物理驱动时间序列预测中的潜力。该方法在地震预警和减灾中具有潜在应用,特别是对于下一代引力波观测站,如爱因斯坦望远镜。

英文摘要

Forecasting seismic waveforms beyond observed data remains challenging due to the nonlinear, dispersive, and multi-scale nature of seismic wave propagation. In this work, we introduce \textsc{SeismoGPT}, a transformer-based autoregressive model designed to forecast three-component seismic waveforms directly in the time domain. Forecasting is formulated as a physically constrained continuation problem in which the model receives waveform context beginning at the P-wave arrival and extending a defined time beyond the S-wave arrival, after which future motion is generated recursively without access to ground-truth samples. Evaluation is performed on synthetic seismograms spanning source depths of 5--100\,km, epicentral distances of 10--90$^\circ$, and magnitudes $3 \leq M_w \leq 7$. To disentangle the effects of context length and prediction horizon, we define three evaluation configurations using a distance-normalized context ratio and fixed prediction horizons of 120 and 240\,s. Across all configurations, the model achieves median normalized cross correlation above 0.93. Analysis of representative forecasts shows that successful predictions preserve both phase coherence and spectral energy distribution. Where failure cases arise, this is primarily due to gradual phase drift during autoregressive rollout rather than unphysical signal generation. These results demonstrate that transformer-based sequence models can learn stable dynamical continuation of seismic wavefields, highlighting the potential of foundation-model approaches for physics-driven time-series forecasting. There are potential applications of this methodology in seismic warning and hazard mitigation, particularly for next-generation gravitational-wave observatories, such as the Einstein Telescope.
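
The headline metric above, normalized cross correlation, can be sketched as below. The abstract does not define it precisely, so this assumes the zero-lag, mean-removed form for a single waveform component. The function name is illustrative.

```python
import numpy as np

def normalized_cross_correlation(pred: np.ndarray, target: np.ndarray) -> float:
    """Zero-lag NCC in [-1, 1]; invariant to offset and positive rescaling."""
    p = pred - pred.mean()
    t = target - target.mean()
    denom = np.linalg.norm(p) * np.linalg.norm(t)
    return float(p @ t / denom) if denom > 0 else 0.0

time = np.linspace(0.0, 1.0, 200)
signal = np.sin(2 * np.pi * 5 * time)
ncc_same = normalized_cross_correlation(signal, signal)
ncc_flipped = normalized_cross_correlation(signal, -signal)
```

Because the score depends on waveform shape rather than amplitude, a gradual phase drift during rollout lowers it even when the spectral content stays plausible.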

2606.02788 2026-06-03 astro-ph.IM cs.LG 版本更新

Neutrino Fingerprints: Image-Based Encodings of IceCube Events for CNN Direction Reconstruction

中微子指纹:基于图像的 IceCube 事件编码用于 CNN 方向重建

Floriano Tori, Brecht Verbeken, Vincent Ginis

发表机构 * Data Analytics Lab, Vrije Universiteit Brussel(布鲁塞尔自由大学数据分析实验室) imec-SMIT, Vrije Universiteit Brussel(布鲁塞尔自由大学imec-SMIT) School of Engineering and Applied Sciences, Harvard University(哈佛大学工程与应用科学学院)

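AI summary: Encodes IceCube neutrino events as compact 72×72×3 images (neutrino fingerprints) and reconstructs direction with a ResNet18 convolutional network, reaching a mean angular error of 1.10 rad, comparable to more complex architectures.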

Comments 6 pages, 1 figure

详情
AI中文摘要

在 IceCube 中微子天文台中重建入射中微子的方向是天体物理学中的一个重要问题。公开的 IceCube--Neutrinos in Deep Ice Kaggle 竞赛提供了 1.4 亿个模拟事件来基准测试重建技术。为了从新颖的角度解决这一挑战,我们引入了中微子指纹——紧凑的 $72 \times 72 \times 3$ 图像,其中每个像素代表一个探测器,脉冲时序和电荷统计编码为颜色通道。这种表示将稀疏、不规则的脉冲数据转换为适合卷积处理的密集图像。我们的 ResNet18 模型实现了 $1.10$ rad 的平均角误差,表明基于指纹训练的卷积网络在性能上可与更复杂的架构相媲美,同时为 IceCube 事件重建提供了有效、可解释的基线。

英文摘要

Reconstructing the direction of incoming neutrinos in the IceCube Neutrino Observatory is an important problem in astrophysics. The public IceCube--Neutrinos in Deep Ice Kaggle competition provided 140 million simulated events to benchmark reconstruction techniques. To address this challenge from a novel perspective, we introduce neutrino fingerprints: compact $72 \times 72 \times 3$ images in which each pixel represents a single detector, with pulse timing and charge statistics encoded as color channels. This representation transforms sparse, irregular pulse data into dense images suitable for convolutional processing. Our ResNet18 model achieves a mean angular error of $1.10$ rad, indicating that convolutional networks trained on fingerprints rival more complex architectures while offering an effective, interpretable baseline for IceCube event reconstruction.
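
A fingerprint encoding of this kind could look like the sketch below. The abstract fixes only the 72×72×3 shape and the one-pixel-per-detector idea. The row-major mapping from sensor index to pixel and the three channel choices (first pulse time, total charge, pulse count) are assumptions made here for illustration.

```python
import numpy as np

SIDE = 72                  # 72 * 72 = 5184 pixels, enough for one pixel per sensor
N_PIXELS = SIDE * SIDE

def make_fingerprint(sensor_ids, times, charges) -> np.ndarray:
    """Aggregate an event's pulses per sensor into a (72, 72, 3) image."""
    sensor_ids = np.asarray(sensor_ids, dtype=np.int64)
    times = np.asarray(times, dtype=np.float64)
    charges = np.asarray(charges, dtype=np.float64)

    first_time = np.full(N_PIXELS, np.inf)
    total_charge = np.zeros(N_PIXELS)
    pulse_count = np.zeros(N_PIXELS)

    # Unbuffered ufunc updates handle sensors that fire more than once.
    np.minimum.at(first_time, sensor_ids, times)
    np.add.at(total_charge, sensor_ids, charges)
    np.add.at(pulse_count, sensor_ids, 1.0)

    first_time[pulse_count == 0] = 0.0     # sensors without pulses stay black
    image = np.stack([first_time, total_charge, pulse_count], axis=-1)
    return image.reshape(SIDE, SIDE, 3).astype(np.float32)

# Hypothetical event: sensor 5 fires twice, sensor 100 once.
fingerprint = make_fingerprint([5, 5, 100], [10.0, 7.0, 3.0], [1.0, 2.0, 0.5])
```

In practice the channels would need normalising before they are fed to a CNN such as ResNet18. That step is omitted here.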

2606.02437 2026-06-03 cs.LG cs.CL 版本更新

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

论PEFT的扩展:迈向万亿参数百万个性化模型

Mind Lab: Vin Bo, Song Cao, Vic Cao, Andrew Chen, Kaijie Chen, Cleon Cheng, Steven Chiang, Kaixuan Fan, Hera Feng, Huan Feng, Arthur Fu, Jun Gao, Hongquan Gu, Aaron Guan, Nolan Ho, Mutian Hong, Hailee Hou, Peixuan Hua, Charles Huang, Miles Jiang, Nora Jiang, Yuyi Jiang, Qiuyu Jin, Fancy Kong, Andrew Lei, Kyrie Lei, Alexy Li, Lucian Li, Ray Li, Theo Li, Wenhao Li, Zhihui Li, Allen Lin, Jiayi Lin, Kairus Liu, Kieran Liu, Logan Liu, Xiang Liu, Irvine Lu, Maeve Luo, Runze Lv, Pony Ma, Verity Niu, Anson Qiu, Vincent Wang, Rio Yang, Maxwell Yao, Carrie Ye, Regis Ye, Wenlin Ye, Josh Ying, Danney Zeng, Yuhan Zhan, Anya Zhang, Di Zhang, Ruijia Zhang, Shiyang Zhang, Sueky Zhang, Ya Zhang, Wei Zhao, Ada Zhou, Adrian Zhou, Yuhua Zhou, Xinyue Zhu, Murphy Zhuang

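AI summary: Studies parameter-efficient fine-tuning (PEFT) as persistent local state on top of shared foundation models, analyzes its viability as a substrate for personal models along three scaling axes (Scale Up, Scale Down, Scale Out), and presents MinT as an infrastructure example for managing the adapter lifecycle.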

详情
AI中文摘要

参数高效微调(PEFT)通常被视为全微调的廉价替代方案。我们研究了一个更广泛的作用:将小型可训练适配器作为强大共享基础模型之上的持久局部状态。在这种框架下,基础模型提供共享能力,而适配器承载实例特定行为,如偏好、技能、工具习惯和类似记忆的更新。我们围绕三个扩展轴组织问题:向上扩展,更强的共享先验使小型局部更新更有用;向下扩展,研究适配器可以多小同时保持可靠性;向外扩展,许多持久化适配实例共存。MinT提供了一个管理适配器身份、修订、来源、评估和服务驻留的基础设施示例。综合来看,结果表明PEFT可以成为持久个性化模型的紧凑基质,而不仅仅是全微调的预算替代方案。

英文摘要

Parameter-efficient fine-tuning (PEFT) is usually treated as a cheaper alternative to full fine-tuning. We study a broader role: small trainable adapters as persistent local state on top of strong shared foundation models. In this framing, the base model provides shared competence while adapters carry instance-specific behavior such as preferences, skills, tool habits, and memory-like updates. We organize the problem around three scaling axes: Scale Up, where stronger shared priors make small local updates more useful; Scale Down, where we study how small adapters can be while remaining reliable; and Scale Out, where many persistent adapted instances coexist. MinT provides one infrastructure example for managing adapter identity, revision, provenance, evaluation, and serving residency. Together, the results suggest that PEFT can be a compact substrate for persistent personal models rather than only a budget substitute for full fine-tuning.

2606.02332 2026-06-03 cs.AI cs.CL cs.LG 版本更新

Forget Attention: Importance-Aware Attention Is All You Need

忘记注意力:重要性感知注意力即你所需

Suhyeong Shin, Yeongwook Yang

发表机构 * Department of Computer Engineering(计算机工程系)

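AI summary: Proposes SISA, which injects an importance signal from a state space model directly into the attention-score computation (score-level fusion), combining global retrieval with importance ranking in language modeling.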

Comments 20 pages, 6 figures, 25 tables

详情
AI中文摘要

将注意力的全局检索与状态空间模型(SSM)的顺序重要性信号相结合是混合语言建模的开放挑战。Transformer能看见所有位置但无法区分优先级;SSM知道什么重要但无法重新访问。现有混合模型——Jamba(块级)和Hymba(头级)——将两者置于独立模块,因此在注意力计算过程中彼此无法相互影响。我们提出SISA(SSM引导的Softmax注意力),该方法在注意力分数内部直接添加SSM导出的重要性项,并通过在增强的查询/键向量上执行单个SDPA调用来实现完整操作——无需循环状态,无需自定义内核。在152M/5B token上,SISA在LAMBADA-greedy上达到17.3%(对比Transformer的13.9和Mamba-3的15.5),并从第1K步起实现NIAH 100%,比Transformer的检索收敛速度快7倍;在369M规模下,Mamba-3在LAMBADA上领先,而SISA保持完美的NIAH和标准SDPA执行。因此,SISA为SSM-注意力混合模型定义了第三个设计轴——分数级融合——超越了此前主导该领域的块级和头级范式。

英文摘要

Combining attention's global retrieval with the sequential importance signal of state space models (SSMs) is the open challenge of hybrid language modeling. Transformers see everywhere but cannot prioritize; SSMs know what matters but cannot revisit. Existing hybrids -- Jamba (block level) and Hymba (head level) -- place the two in separate compartments, so neither informs the other during the attention computation itself. We propose SISA (SSM-Informed Softmax Attention), which adds an SSM-derived importance term directly inside the attention score and realizes the full operation as a single SDPA call on augmented query/key vectors -- no recurrent state, no custom kernel. At 152M / 5B tokens, SISA reaches LAMBADA-greedy 17.3% (vs. Transformer 13.9 and Mamba-3 15.5) and attains NIAH 100% from step 1K, 7x faster than Transformer's retrieval convergence; at 369M, Mamba-3 leads LAMBADA while SISA preserves perfect NIAH and stock-SDPA execution. SISA thus defines a third design axis for SSM-attention hybrids -- score-level fusion -- beyond the block-level and head-level paradigms that have dominated the field.
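
The claim that a per-key importance term can be folded into a single attention call on augmented query/key vectors can be checked with a small NumPy sketch. It assumes the importance signal is one scalar per key position (random numbers stand in for the SSM-derived values) and omits causal masking. The exact SISA construction may differ.

```python
import numpy as np

def attention(q, k, v, scale, bias=None):
    """Plain softmax attention; `bias` optionally adds one scalar per key."""
    scores = scale * (q @ k.T)
    if bias is not None:
        scores = scores + bias[None, :]
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
T, d = 6, 8
q, k, v = (rng.standard_normal((T, d)) for _ in range(3))
importance = rng.standard_normal(T)        # hypothetical per-token importance
scale = 1.0 / np.sqrt(d)

# Reference: importance added explicitly inside the score.
reference = attention(q, k, v, scale, bias=importance)

# Fused: append 1/scale to every query and the importance to every key, so
# scale * (q_aug . k_aug) = scale * (q . k) + importance.
q_aug = np.concatenate([q, np.full((T, 1), 1.0 / scale)], axis=1)
k_aug = np.concatenate([k, importance[:, None]], axis=1)
fused = attention(q_aug, k_aug, v, scale)
```

Because the fused form is an ordinary dot-product attention on slightly wider vectors, it can run through a stock scaled-dot-product kernel, provided the kernel's scale factor is accounted for as above.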

2606.02004 2026-06-03 cs.CL cs.LG 版本更新

Machine Learning for Coding Retail Product Names to Consumer-Price Categories: A Rule-plus-Bag-of-Words Pipeline with Reliability-Weighted Human-in-the-Loop Labeling

将零售产品名称编码为消费者价格类别的机器学习:基于规则加词袋的流水线,结合可靠性加权的人工参与标注

Vladimir Beskorovainyi

发表机构 * Besk Tech(Besk科技) Moscow Institute of Physics and Technology (MIPT)(莫斯科物理技术学院)

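AI summary: Proposes a pipeline that combines rules with a bag-of-words model, plus a reliability-weighted human-in-the-loop labeling protocol, to map retail product names to consumer-price categories (e.g., UN COICOP). Experiments show that bag-of-words models nearly saturate the task (F1 about 0.99), while reliability-weighted voting only barely beats plain majority voting.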

Comments 11 pages, 3 tables. Methodology paper; illustrative experiments only, no proprietary data

详情
AI中文摘要

消费者价格测量越来越多地依赖替代数据源——扫描仪、网络抓取和交易/收据数据。一个反复出现的障碍是,这些来源中的产品描述简短、嘈杂且缩写,没有标准产品代码,因此每个项目必须首先映射到消费分类(例如,联合国COICOP方案),然后才能比较价格。本文将该映射作为一种通用的、可重复的方法进行研究。流水线包括:(i) 对嘈杂项目名称进行文本归一化和分词;(ii) 基于每类关键短语和停用短语的前缀树(trie)规则预分类器;(iii) 每个类别的二元确认模型,决定一个项目是否属于暂定分配的类别。对于大规模标注,我们使用人工参与协议,其中标注者给出二元有效/拒绝判断,通过动态更新的可靠性权重进行聚合;模型按相同的规则参与投票,实现持续微调。我们的实证发现是降温性的(deflationary):在一个受控、无泄漏的研究中(一个类别,真实正例与困难负例,五个随机种子),词袋模型基本上饱和了任务(F1约0.99)——线性分类器匹配多层感知器,显式词序(n-gram)特征没有增加任何价值,约67个标注样本已经足够。标注协议的蒙特卡洛研究表明,可靠性加权投票勉强超过简单多数投票(其加性权重饱和),而Dawid-Skene方法明显更好地恢复标签。我们还讨论了价格层面的质量控制和统计办公室考虑交易数据时的设计经验。所有数字均为示意性;未复制任何机密数据、代码或文档。

英文摘要

Consumer-price measurement increasingly draws on alternative data sources -- scanner, web-scraped, and transaction/receipt data. A recurring obstacle is that product descriptions in such sources are short, noisy, and abbreviated, with no standard product code, so each item must first be mapped to a consumption classification (e.g., the UN COICOP scheme) before prices can be compared. This paper studies that mapping as a general, reproducible method. The pipeline is: (i) text normalization and tokenization of noisy item names; (ii) a prefix-tree (trie) rule-based pre-classifier driven by per-category key-phrases and stop-phrases; and (iii) a per-category binary confirmation model deciding whether an item belongs to a tentatively assigned category. For labels at scale we use a human-in-the-loop protocol in which annotators give a binary valid/reject judgment, aggregated by a dynamically updated reliability weight; the model joins the same rule, enabling continual fine-tuning. Our empirical finding is deflationary: in a controlled, leakage-free study (one category, real positives vs. hard negatives, five seeds), bag-of-words models essentially saturate the task (F1 about 0.99) -- a linear classifier matches a multilayer perceptron, explicit word-order (n-gram) features add nothing, and about 67 labeled examples already suffice. A Monte-Carlo study of the labeling protocol shows the reliability-weighted vote barely beats plain majority (its additive weights saturate) while Dawid-Skene recovers labels markedly better. We also discuss price-level quality control and design lessons for statistical offices considering transaction data. All figures are illustrative; no confidential data, code, or documentation is reproduced.
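
Step (ii) of the pipeline above, the trie-based rule pre-classifier, might be sketched as follows. The phrase lists, category names, and the convention that a stop-phrase vetoes its category are all hypothetical. The abstract does not specify how key-phrases and stop-phrases interact.

```python
END = "$cat"   # reserved marker key; assumes no real token equals this string

def build_trie(phrases: dict) -> dict:
    """Build a token-level prefix tree mapping phrases to categories."""
    root = {}
    for phrase, category in phrases.items():
        node = root
        for token in phrase.split():
            node = node.setdefault(token, {})
        node[END] = category
    return root

def match_categories(tokens: list, trie: dict) -> set:
    """Return categories of every phrase that occurs contiguously in tokens."""
    found = set()
    for start in range(len(tokens)):
        node = trie
        for token in tokens[start:]:
            if token not in node:
                break
            node = node[token]
            if END in node:
                found.add(node[END])
    return found

def pre_classify(name: str, key_trie: dict, stop_trie: dict) -> set:
    """Tentative categories: key-phrase hits minus categories vetoed by stop-phrases."""
    tokens = name.lower().split()
    return match_categories(tokens, key_trie) - match_categories(tokens, stop_trie)

key_trie = build_trie({"milk": "dairy", "whole milk": "dairy",
                       "chocolate": "confectionery"})
stop_trie = build_trie({"milk chocolate": "dairy"})   # veto 'dairy' for this phrase

candidates = pre_classify("Milk Chocolate bar 100g", key_trie, stop_trie)
```

Items that survive this step would then go to the per-category binary confirmation model described in step (iii).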

2606.01849 2026-06-03 cs.LG cs.CL cs.CR 版本更新

ContinuousBench: Can Differentially Private Synthetic Text Improve Capabilities?

ContinuousBench: 差分隐私合成文本能否提升能力?

Peihan Liu, Lucas Rosenblatt, Weiwei Kong, Natalia Ponomareva, Gautam Kamath, Rachel Cummings, Roxana Geambasu, Yu Gan, Lillian Tsai, Alex Bie

发表机构 * Columbia University(哥伦比亚大学) NYU(纽约大学) Google Research(谷歌研究) University of Waterloo(滑铁卢大学) Vector Institute(向量研究所) Google(谷歌)

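AI summary: Introduces ContinuousBench, a benchmark built on continuously updated datasets that tests whether differentially private synthetic text can transmit new knowledge from the original corpus. Experiments show that non-private synthesis transfers knowledge effectively, whereas state-of-the-art DP synthesis methods largely fail even at a high privacy budget.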

Comments For datasets, see https://huggingface.co/ContinuousBench; for the evaluation harness, see https://github.com/plau666/ContinuousBenchEval; for an accompanying blog post, see https://peihanliu.com/posts/continuousbench.html

详情
AI中文摘要

差分隐私(DP)文本合成有望为模型训练解锁敏感语料库,但目前尚不清楚DP合成数据是否能传递仅存在于这些语料库中的真正新知识和能力。这是因为现有评估依赖于无需训练即可几乎解决的任务,因此强大的基准性能并不能证明DP合成可以替代原始数据访问。为此,我们引入了ContinuousBench,一个持续自动更新的基准,用于衡量DP合成文本带来的能力提升。每季度发布一次新版本,配对一个从未见过的训练语料库和一个衍生的问答集,其构建满足:(1)无语料库无法解决;(2)在DP下可学习,因为测试的知识由数百条独立记录支持。研究人员从训练语料库生成DP合成数据,并在其合成数据上运行我们的标准化训练和评估框架以衡量增益。我们实例化两个轨道:Geminon,一个关于虚构生物的程序生成数据集;以及News,一个新爬取的公共新闻文章流。尽管标准基准几乎饱和,但在ContinuousBench上,我们发现非隐私合成从原始语料库转移了大量知识,而最先进的DP合成方法即使在高隐私预算(ε=100)下也基本无法做到。

英文摘要

Differentially private (DP) text synthesis promises to unlock sensitive corpora for model training, but it remains unclear whether DP synthetic data transmits genuinely new knowledge and capabilities present only in those corpora. This is because existing evaluations rely on tasks that are nearly solvable without training, so strong benchmark performance does not establish that DP synthesis can substitute for original data access. Thus, we introduce ContinuousBench, a continuously and automatically regenerated benchmark that measures capability gain from DP synthetic text. Each quarter, a new release pairs a never-before-seen training corpus with a derived QA set, constructed to be: (1) unsolvable sans-corpus; and (2) learnable under DP, as the tested knowledge is supported by hundreds of independent records. Researchers produce DP synthetic data from the training corpus and run our standardized training and evaluation harness on their synthetic data to measure gains. We instantiate two tracks: Geminon, a procedurally-generated dataset about fictional creatures; and News, a stream of newly crawled public news articles. Although standard benchmarks are nearly saturated, on ContinuousBench we find that non-private synthesis transfers substantial knowledge from the original corpus, while state-of-the-art DP synthesis methods generally fail to do so, even at $\varepsilon=100$.

2606.01532 2026-06-03 cs.LG cs.CC 版本更新

Rethinking the Role of Positional Encoding: Sliding-Window Transformers without PE Remain Turing Complete

重新思考位置编码的作用:无PE的滑动窗口Transformer仍具图灵完备性

Qian Li, Xinyu Mao, Shang-Hua Teng

发表机构 * Shenzhen Research Institute of Big Data(深圳大数据研究院) University of Southern California(南加州大学)

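AI summary: Proves that, under a sliding-window mechanism, transformers without positional encoding can still simulate Turing-complete Post machines through the evolution of the window, and are therefore capable of universal computation.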

详情
AI中文摘要

位置编码(PE)被广泛认为是Transformer处理有序序列所必需的:没有位置编码,下一个token映射在其上下文token中似乎是置换不变的。这一直觉支撑了所有先前的普适性结果,这些结果依赖位置信息来证明具有思维链的Transformer可以执行任意计算,即它们是图灵完备的。我们在与长程推理最相关的机制下重新审视这一信念,其中生成通过有限的滑动上下文窗口进行。我们的初步认识是,窗口机制本身(轻微地)打破了置换对称性。为了提炼并精确捕捉这种额外表达能力的大小,我们引入了一个抽象的自回归模型——HIST模型,其中每次更新仅依赖于恒定大小的内部状态和当前窗口内的token计数直方图。我们证明这个HIST模型是图灵完备的,通过展示窗口的演化可以揭示刚刚离开窗口的token,这足以模拟图灵完备的Post机器。然后,我们构建了一个在恒定大小token字母表上的滑动窗口Transformer,没有位置编码,并证明它可以模拟HIST模型。我们的结果表明,位置编码对于Transformer执行通用计算并非不可或缺:窗口滑动本身已经打破了置换对称性并捕获了足够的位置信息。

英文摘要

Positional encoding (PE) is widely viewed as necessary for transformers to process ordered sequences: without it, the next-token map appears permutation-invariant in its context tokens. This intuition underlies all prior universality results, which rely on positional information to prove that transformers with chain-of-thought can perform arbitrary computation, i.e., they are Turing complete. We revisit this belief in the regime most relevant to long-form reasoning, where generation proceeds through a finite sliding context window. Our starting observation is that the window mechanism itself (mildly) breaks the permutation symmetry. To distill and precisely capture the degree of this added expressiveness, we introduce an abstract autoregressive model, the HIST model, in which each update depends only on constant-size internal state and the token-count histogram within the current window. We prove that this HIST model is Turing complete by showing that the evolution of the window can reveal the token that has just left the window, which suffices to simulate Turing-complete Post machines. We then construct a sliding-window transformer over a constant-size token alphabet, without PE, and show that it can simulate the HIST model. Our result demonstrates that positional encodings are not indispensable for transformers to perform universal computation: the sliding of the window itself already breaks permutation symmetry and captures sufficient positional information.
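
The key observation above, that histogram evolution reveals the token that just left the window, can be illustrated with a few lines of Python. This is a toy simulation under the assumption that the model sees only the previous histogram, the new histogram, and the token it just emitted. It is not the paper's HIST construction itself.

```python
from collections import Counter, deque

def departed_token(prev_hist: Counter, new_hist: Counter, appended):
    """Infer which token left the window, using histograms only.

    prev_hist + {appended} - new_hist is empty while the window is still
    filling, and holds exactly the evicted token once it is full.
    """
    diff = (prev_hist + Counter([appended])) - new_hist
    return next(iter(diff)) if diff else None

def simulate(stream, window_size: int) -> list:
    """Slide a fixed-size window over `stream`, recording each departure."""
    window = deque(maxlen=window_size)   # deque drops the oldest token itself
    hist = Counter()
    departures = []
    for token in stream:
        window.append(token)
        new_hist = Counter(window)
        departures.append(departed_token(hist, new_hist, token))
        hist = new_hist
    return departures

departures = simulate("abcab", window_size=3)
```

Although each histogram is order-free, the sequence of departures recovers the stream in first-in, first-out order. This is the kind of queue-like access that Post machines rely on.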

2606.01472 2026-06-03 cs.DC cs.AI cs.LG 版本更新

Hierarchical Online Prompt Mutation with Dual-Loop Feedback for Guardrailed Evidence Document Generation: A Production-Evaluation Case Study

分层在线提示变异与双环反馈用于有护栏的证据文档生成:生产评估案例研究

Nataraj Agaram Sundar, Tejas Morabia

发表机构 * eBay Inc.(eBay公司)

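AI summary: Proposes HOPM, a hierarchical online prompt mutation framework that optimizes prompt policies through dual-loop feedback (human review and an automated judge), and reports substantially higher win rates and quality on real marketplace dispute-evidence generation.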

Comments 7 pages. Production-evaluation case study of guardrailed LLM evidence-document generation

详情
AI中文摘要

高风险生产文档生成系统要求语言模型具有适应性、基于证据且可审计。我们提出HOPM,一种分层在线提示变异框架,在真实市场纠纷证据工作流上评估。HOPM将提示视为在线策略:一个家族/版本路由器选择提示,确定性护栏将失败归因于可变的提示-令牌类别,来自人工审核和自动评判的双重反馈更新路由和变异优先级。主要证据是观察到的匹配生产评估消融:七个变体在相同的600个案例上评估,实现组件比较:静态提示、手动迭代、仅bandit路由、仅变异适应、仅人工反馈、仅自动评判反馈和全双环HOPM。全HOPM将计数胜率从34.7%提升至45.7%(+11.0个百分点;配对McNemar p=1.31e-11),金额加权胜率从22.3%提升至41.4%(+19.1个百分点;95%配对bootstrap CI [10.3, 28.9]个百分点)。它还将平均Likert质量从3.18提高到4.40,并将问题标记率从15.3%降低到5.2%。支持性审查工件涵盖770篇生成文本审查、318份标记审查员导出、一个10案例/61评分的校准切片和一个70案例/350评分的OCR基准;这些工件校准评分标准、护栏、标题风险和OCR风险解释,而非替代生产消融。论文包括控制设置、样本量、置信区间、配对检验、提示-令牌类别、伪代码、模式、评分标准、护栏分类法以及一个构造示例,以便在不暴露专有证据的情况下重现评估结构。

英文摘要

High-stakes production document-generation systems require language models to be adaptive, evidence-grounded, and auditable. We present HOPM, a hierarchical online prompt mutation framework evaluated on a real marketplace dispute-evidence workflow. HOPM treats prompts as online policies: a family/version router selects a prompt, deterministic guardrails attribute failures to mutable prompt-token categories, and dual feedback from human review and an automated judge updates both routing and mutation priorities. The primary evidence is an observed matched production-evaluation ablation: seven variants are evaluated on the same 600 cases each, enabling component comparisons against static prompting, manual iteration, bandit-only routing, mutation-only adaptation, human-only feedback, auto-judge-only feedback, and full dual-loop HOPM. Full HOPM improves count win rate over a static control from 34.7% to 45.7% (+11.0 pp; paired McNemar p = 1.31e-11) and amount-weighted win rate from 22.3% to 41.4% (+19.1 pp; 95% paired bootstrap CI [10.3, 28.9] pp). It also increases mean Likert quality from 3.18 to 4.40 and reduces issue-flag rate from 15.3% to 5.2%. Supporting review artifacts cover 770 generated-text reviews, 318 labeled reviewer exports, a 10-case/61-rating calibration slice, and a 70-case/350-rating OCR benchmark; these artifacts calibrate rubric, guardrail, title-risk, and OCR-risk interpretation rather than substituting for the production ablation. The paper includes control setup, sample sizes, confidence intervals, paired tests, prompt-token categories, pseudocode, schema, rubric, guardrail taxonomy, and a constructed example so the evaluation structure can be reproduced without exposing proprietary evidence.

2606.01340 2026-06-03 cs.LG stat.ML 版本更新

Sample Complexity and Decision-Theoretic Guarantees for Bayesian Model Averaging over Decision Trees with Catalan-Exponential Priors

基于Catalan指数先验的决策树贝叶斯模型平均的样本复杂度和决策理论保证

Livija Jakaite, Vitaly Schetinin

发表机构 * School of Computing and Engineering, University of Bedfordshire, Luton, UK(英国卢顿贝德福德郡大学计算与工程学院)

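AI summary: For Bayesian decision trees with Dirichlet-Multinomial leaf models and a Catalan-exponential tree-size prior, establishes a complete non-asymptotic theory of rational commitment thresholds, answering when Bayesian model averaging weights carry enough epistemic information to justify committed exploitation of the averaging distribution.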

Comments 22 pages, 3 figures, Submitted to the Journal of Machine Learning Research

详情
AI中文摘要

我们提出一个问题:当决策树上的贝叶斯模型平均(BMA)权重携带足够的认知信息时,何时可以证明对平均分布的承诺利用是合理的?对于具有Dirichlet-Multinomial叶模型和Catalan指数树大小先验(Schetinin & Jakaite, 2025)的贝叶斯决策树(BDTs),我们以闭式回答了这个问题,建立了理性承诺阈值的完整非渐近理论。

英文摘要

We ask: when do Bayesian model averaging (BMA) weights over decision trees carry sufficient epistemic information to justify committed exploitation of the averaging distribution? We answer this question in closed form for Bayesian decision trees (BDTs) with Dirichlet-Multinomial leaf models and a Catalan-exponential tree-size prior (Schetinin & Jakaite, 2025), establishing a complete non-asymptotic theory of rational commitment thresholds.

2606.01111 2026-06-03 cs.LG 版本更新

LeAP: Learnable Adaptive Permutation for Feature Selection in Heterogeneous and Sparse Recommender Systems

LeAP: 面向异构稀疏推荐系统的可学习自适应特征选择排列

Yihong Huang, Chen Chu, Fei Chen, Yu Lin, Ruiduan Li, Zhihao Li

发表机构 * Bilibili Inc.(哔哩哔哩公司)

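AI summary: To address heterogeneous features, extreme sparsity, and the high computational cost of permutation in industrial recommender systems, proposes LeAP, a learnable adaptive permutation module that turns random permutation into a learnable mechanism and adds adaptive regularization for efficient feature selection. It achieves the best performance on four public datasets and in an industrial search ranking model serving over a billion daily requests.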

详情
AI中文摘要

现代工业推荐系统依赖数千种异构特征——从低维标量(如统计值)到高维嵌入(如用户ID嵌入、MLP表示)——以实现高精度预测。鉴于训练相关的巨大计算成本,高效的特征选择至关重要。然而,现有方法面临三个主要瓶颈:(1)它们通常假设特征维度统一或需要昂贵的映射到固定大小;(2)它们难以处理极度稀疏性,其中大多数特征(例如99%以上)保持默认值;(3)传统的基于排列的方法在大规模设置中计算成本过高。为了解决这些挑战,我们提出了LeAP(可学习自适应排列),一种新颖的、模型无关的即插即用特征选择模块。LeAP将低效的随机排列过程转化为可学习机制,显著加速了特征重要性的评估。此外,我们引入了一种针对异构维度和极度稀疏性定制的自适应正则化策略,使得在非对称输入空间中获得优越的特征重要性排序结果。在四个公开推荐数据集上的实验表明,LeAP达到了最先进的性能。此外,LeAP已部署在一个大规模工业搜索排序模型中,该模型每天处理超过十亿次请求,模型参数规模达2TB。在这个涉及12000多个总特征维度的实际场景中,LeAP成功识别并移除了超过3600个冗余维度,且性能没有下降,其能力是基线方法的2到10倍。

英文摘要

Modern industrial recommender systems rely on thousands of heterogeneous features -- ranging from low-dimensional scalars (e.g., statistical value) to high-dimensional embeddings (e.g., user-id embeddings, MLP representations) -- to achieve high-precision predictions. Given the immense computational costs associated with training, efficient feature selection is critical. However, existing methods encounter three primary bottlenecks: (1) they typically assume uniform feature dimensions or require costly mapping to a fixed size; (2) they struggle with extreme sparsity, where the majority of features (e.g., 99%+) remain at default values; and (3) traditional permutation-based approaches are computationally prohibitive in large-scale settings. To address these challenges, we propose LeAP (Learnable Adaptive Permutation), a novel, model-agnostic plug-in module for feature selection. LeAP transforms the inefficient random permutation process into a learnable mechanism, significantly accelerating the evaluation of feature importance. In addition, we introduce an adaptive regularization strategy tailored for heterogeneous dimensions and extreme sparsity, enabling superior feature importance ranking results across asymmetric input spaces. Experiments on four public recommendation datasets demonstrate that LeAP achieves state-of-the-art performance. Furthermore, LeAP has been deployed in a large-scale industrial search ranking model with over a billion daily requests and a 2TB model parameter scale. In this real-world scenario involving 12,000+ total feature dimensions, LeAP successfully identified and removed over 3,600 redundant dimensions without performance degradation, which is 2 to 10 times what the compared baseline methods achieve.

2606.00757 2026-06-03 cs.LG 版本更新

RADE: Random Add-Drop Edge as a Regularizer

RADE: 随机增删边作为正则化器

Danial Saber, Amirali Salehi-Abari

发表机构 * University of California, Berkeley(加州大学伯克利分校)

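AI summary: Proposes RADE, a random add-drop edge method that tackles both overfitting and over-squashing of long-range information in graph neural networks, regularizes without distribution shift through train-inference alignment, and adapts its add and drop rates during training.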

Comments 27 pages, ICML 2026

详情
AI中文摘要

图神经网络(GNN)存在过拟合和长程信息过压缩的问题。随机图增强(如边删除)通过正则化训练来缓解过拟合,但会导致训练-推理错位,且无法改善过压缩。相反,重连方法通过改善连通性来缓解过压缩,但并非设计用于正则化训练。我们提出随机增删边(RADE),一种同时删除和添加边的随机图增强方法,以同时解决过拟合和过压缩。RADE被证明能够对齐训练和推理,使得随机增强在无分布偏移的情况下正则化训练,同时在推理时支持长程通信。我们进一步提出并研究了一种小批量梯度范数平衡算法,该算法在训练过程中自适应调整删除和添加率,使得RADE在实践中无需超参数。在节点和图分类基准上的实验表明,RADE是一种强大的正则化器,并能缓解过压缩。消融实验支持了训练-推理对齐、自适应率选择以及随机边删除和边添加的互补作用。

英文摘要

Graph Neural Networks (GNNs) suffer from overfitting and over-squashing of long-range information. Stochastic graph augmentations (e.g., edge deletion) regularize training against overfitting but can introduce train-inference misalignment and do not improve over-squashing. In contrast, rewiring methods improve connectivity to mitigate over-squashing, but are not designed to regularize training. We propose Random Add-Drop Edge (RADE), a stochastic graph augmentation method that jointly drops and adds edges to address both overfitting and over-squashing simultaneously. RADE is provably designed to align training and inference so that random augmentations regularize training without distribution shift, while supporting long-range communication at inference. We further propose and study a mini-batch gradient-norm balancing algorithm that adapts deletion and addition rates during training, rendering RADE hyperparameter-free in practice. Experiments on node- and graph-classification benchmarks show that RADE is a strong regularizer and mitigates over-squashing. Ablations support the roles of train-inference alignment, adaptive rate selection, and the complementary effects of random edge deletion and edge addition.
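
The basic augmentation step, jointly dropping and adding edges at random, can be sketched on a dense adjacency matrix as below. It assumes an undirected simple graph and fixed rates `p_drop` and `p_add`. RADE's train-inference alignment and its adaptive rate selection are not reproduced here.

```python
import numpy as np

def random_add_drop(adj: np.ndarray, p_drop: float, p_add: float,
                    rng: np.random.Generator) -> np.ndarray:
    """Drop each existing edge w.p. p_drop and add each missing edge w.p. p_add."""
    n = adj.shape[0]
    upper = np.triu_indices(n, k=1)        # one entry per unordered node pair
    is_edge = adj[upper].astype(bool)
    u = rng.random(is_edge.shape[0])       # uniform samples in [0, 1)
    kept = is_edge & (u >= p_drop)         # existing edges that survive
    added = ~is_edge & (u < p_add)         # non-edges that get switched on
    out = np.zeros_like(adj)
    out[upper] = kept | added
    return out + out.T                     # symmetrise; diagonal stays zero

# Path graph on four nodes.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
rng = np.random.default_rng(0)
augmented = random_add_drop(adj, p_drop=0.2, p_add=0.1, rng=rng)
```

Added edges create shortcuts between distant nodes, which is the intuition behind using edge addition against over-squashing, while dropped edges act as a regularizer.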

2606.00680 2026-06-03 cs.AI cs.LG 版本更新

Regularized Offline Policy Optimization with Posterior Hybrid Bayesian Belief

具有后验混合贝叶斯信念的正则化离线策略优化

Hongqiang Lin, Pengfei Wang, Nenggan Zheng

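AI summary: Proposes Posterior Hybrid Bayesian Belief (PhyB) to quantify epistemic uncertainty in offline reinforcement learning in a unified way, and builds on it an iterative regularized policy optimization algorithm that improves monotonically until convergence.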

详情
AI中文摘要

离线强化学习旨在从预先收集的数据集中优化策略。该范式的一个瓶颈是管理认知不确定性,这种不确定性源于有限的数据覆盖(样本层面)以及从有限数据中识别转移动态的模糊性(模型层面)。为了统一量化这些不确定性,贝叶斯强化学习通过将动态模型视为随机变量并维护相应的信念而被提出。尽管具有理论吸引力,贝叶斯强化学习中的策略优化在计算上仍然具有挑战性,因为它需要求解带有期望的复合目标。先前的方法要么采用计算可扩展性差的基于搜索的技术,要么施加牺牲贝叶斯强化学习适应性的限制性后验假设。为了解决这些局限性,我们提出了后验混合贝叶斯信念(PhyB),它将期望重新表述为动态模型子集上的凸组合。理论分析表明,这种近似引起的目标差异是有界的。基于PhyB,我们开发了一种迭代正则化策略优化算法,该算法为单调改进直至收敛提供了与度量无关的保证。实验结果表明,PhyB在各种基准测试中达到了最先进的性能。

英文摘要

Offline reinforcement learning (RL) aims to optimize policies from pre-collected datasets. A bottleneck of this paradigm is managing epistemic uncertainty, which arises from limited data coverage (sample-level) and the ambiguity in identifying transition dynamics from finite data (model-level). To provide a unified quantification of these uncertainties, Bayesian RL has been proposed by treating the dynamics model as a random variable and maintaining a corresponding belief. Despite its theoretical appeal, policy optimization in Bayesian RL remains computationally challenging as it requires solving composite objectives with expectations. Prior methods either employ search-based techniques with poor computational scalability or impose restrictive posterior assumptions that sacrifice the adaptability of Bayesian RL. To address these limitations, we propose Posterior Hybrid Bayesian Belief (PhyB), which reformulates the expectation as a convex combination over a subset of dynamics models. Theoretical analysis demonstrates that the objective discrepancy induced by this approximation remains bounded. Based on PhyB, we develop an iterative regularized policy optimization algorithm that provides metric-agnostic guarantees for monotonic improvement until convergence. Empirical results demonstrate that PhyB achieves state-of-the-art performance on various benchmarks.

2606.00542 2026-06-03 cs.LG 版本更新

Rethinking Bregman Divergences in Kronecker-Factored Optimizers

重新思考Kronecker因子优化器中的Bregman散度

Bing Liu, Wenjie Zhou, Chengcheng Zhao

发表机构 * College of Control Science and Engineering, Zhejiang University(浙江大学控制科学与工程学院) State Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences(中国科学院人工智能安全国家重点实验室,计算技术研究所)

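AI summary: Through a spectral analysis of the covariance matrix, studies how different Bregman divergences (Frobenius, von Neumann, LogDet) distribute the Kronecker approximation error, and proposes a subspace-aware Kronecker optimizer that applies eigenvalue-based preconditioning in the top subspace and an adaptive isotropic acceleration constant in the bottom subspace.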

详情
AI中文摘要

Shampoo风格的优化器使用Kronecker因子结构近似梯度协方差矩阵。最近的工作~\cite{lin2026understanding}表明,这种近似可以视为Bregman矩阵散度下的投影,从而得到不同的Kronecker因子预条件子。然而,当协方差并非精确Kronecker因子化时,散度选择的作用仍不清楚。我们通过协方差矩阵的谱来研究这个问题。我们表明,Frobenius、von Neumann和LogDet散度将不可避免的Kronecker近似误差以不同方式分布在协方差谱上。我们进一步表明,它们的Kronecker因子由散度加权残差而非原始近似误差主导,解释了这些谱偏好如何在所得预条件子中实现。实验上,我们观察到顶部协方差特征空间与Hessian矩阵的对齐程度显著更好,而尾部谱则更加嘈杂且不可靠。受这些发现启发,我们提出一种子空间感知的Kronecker优化器,在顶部子空间应用基于特征值的预处理,在底部子空间使用自适应各向同性加速常数。

英文摘要

Shampoo-style optimizers approximate gradient covariance matrices using Kronecker-factored structures. Recent work~\cite{lin2026understanding} showed that such approximations can be viewed as projections under Bregman matrix divergences, leading to different Kronecker-factored preconditioners. However, it remains unclear what role the choice of divergence plays when the covariance is not exactly Kronecker-factored. We study this question through the spectrum of the covariance matrix. We show that Frobenius, von Neumann, and LogDet divergences distribute the unavoidable Kronecker approximation error differently across the covariance spectrum. We further show that their Kronecker factors are governed by divergence-weighted residuals rather than the raw approximation error, explaining how these spectral preferences are realized in the resulting preconditioners. Empirically, we observe that the top covariance eigenspace is substantially better aligned with the Hessian matrix, while the tail spectrum is much noisier and unreliable. Motivated by these findings, we propose a subspace-aware Kronecker optimizer that applies eigenvalue-based preconditioning in the top subspace and uses an adaptive isotropic acceleration constant in the bottom subspace.

2606.00494 2026-06-03 cs.LG 版本更新

ProjQ: Project-and-Quantize for Adapter-Aware LLM Compression

ProjQ:面向适配器感知的大语言模型压缩的投影与量化

Wenya Yu, Chao Zhang, Li Wang, Samson Lasaulce, Merouane Debbah

发表机构 * University of California, Berkeley(加州大学伯克利分校)

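AI summary: Proposes the ProjQ framework, which constrains quantization noise to a low-rank manifold via orthogonal subspace projection and uses an alternating algorithm to offload the dominant error to the adapter, improving both quantization error compensation and downstream fine-tuning.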

Comments Accepted paper at ICML 2026

详情
AI中文摘要

训练后量化(PTQ)和低秩适配(LoRA)构成了高效大语言模型(LLM)部署的标准流程。然而,顺序应用它们会带来一个问题:PTQ常常留下分散(在模型权重中)的随机噪声,LoRA难以轻易修复,这意味着LoRA最终会浪费其有限的容量来试图修复不可校正的噪声,而不是提高任务性能。在本文中,我们提出了\textbf{ProjQ},一种通过正交子空间投影将量化噪声约束到低秩流形的新框架。我们推导出一种高效的交替算法,将量化噪声塑造成低秩结构,有效地将主导误差分量卸载给后续适配器,同时最小化正交“不可校正”子空间中的残差误差。我们的理论分析表明,与标准PTQ相比,ProjQ为下游任务保留了严格更大的模型可塑性。在LLaMA-2、Qwen2.5和Qwen3上的大量实验证实,ProjQ在量化误差补偿和下游任务微调方面均持续优于现有方法,在补偿方面实现了高达$2\times$的评估损失降低,并且仅用3比特就达到了标准4比特基线在语言建模任务上的性能。代码可在https://github.com/yy9301/ProjQ获取。

英文摘要

Post-Training Quantization (PTQ) and Low-Rank Adaptation (LoRA) constitute the standard pipeline for efficient Large Language Model (LLM) deployment. However, applying them sequentially poses a problem: PTQ often leaves behind random noise that is spread out (across the model's weights) in a way LoRA can't easily fix, meaning that LoRA ends up wasting its limited capacity trying to fix uncorrectable noise instead of improving task performance. In this paper, we propose \textbf{ProjQ}, a novel framework for constraining quantization noise to the low-rank manifold via orthogonal subspace projection. We derive an efficient alternating algorithm that shapes the quantization noise into a low-rank structure, effectively offloading dominant error components to the subsequent adapter while minimizing the residual error in the orthogonal "uncorrectable" subspace. Our theoretical analysis demonstrates that ProjQ preserves strictly greater model plasticity for downstream tasks compared to standard PTQ. Extensive experiments on LLaMA-2, Qwen2.5 and Qwen3 confirm that ProjQ consistently outperforms existing methods in both quantization error compensation and downstream task fine-tuning, achieving up to $2\times$ lower evaluation loss for compensation and matching the performance of standard 4-bit baselines on language modeling tasks with only 3 bits. The code is available on https://github.com/yy9301/ProjQ .

2606.00395 2026-06-03 cs.LG cs.AI 版本更新

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

PR2: 基于MoE的大语言模型强化学习中的预测性路由重放

Daize Dong, Junlin Chen, Haolong Jia, Jiang Liu, Jiawei Wu, Huanwei Di, Jialian Wu, Zhengzhong Liu, Zicheng Liu, Emad Barsoum, Dimitris N. Metaxas, Hongyi Wang

发表机构 * Rutgers University(罗格斯大学) AMD MBZUAI

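AI summary: To address the instability caused by router drift in reinforcement learning of MoE large language models, proposes Predictive Routing Replay, which uses a lightweight evolution predictor to reduce routing mismatch and improve training stability and performance.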

详情
AI中文摘要

混合专家(MoE)大语言模型(LLM)在规模上实现了强大的性能。然而,基于MoE的LLM的强化学习(RL)常常遭受训练不稳定性。一个根本原因是路由器漂移,即专家激活可能在模型更新时发生剧烈变化,并且在分解的推出和训练阶段之间不同,导致PPO风格RL算法中出现大的推出-训练不匹配和不稳定的重要性采样权重。路由重放通过在每个推理轨迹内冻结重放路由来缓解这个问题,但它忽略了路由器在离策略更新下如何演化,从而导致路由器过时。为了解决这个限制,我们提出了预测性路由重放(PR2),它为每个路由器配备了一个轻量级的演化预测器,学习预测短时域的路由器演化。在推出阶段,我们使用预测性路由分布来应用top-$k$路由,使梯度能够到达更新后可能激活的专家。在训练阶段,我们重放由此产生的预测路由,以保持一致性,从而实现稳定的重要性估计。理论分析和实验支持PR2减少了由路由引起的不匹配,提高了RL稳定性,并在各种推理基准上取得了更强的性能。

英文摘要

Mixture of Experts (MoE) Large Language Models (LLMs) achieve strong performance at scale. However, reinforcement learning (RL) on MoE-based LLMs often suffers from training instability. A root cause is router drift, i.e., expert activations can change drastically across model updates and differ between disaggregated rollout and training phases, causing large rollout--training mismatch and unstable importance sampling weights in PPO-style RL algorithms. Routing replay mitigates this issue by freezing the replay route within each reasoning trajectory, but it ignores how the router evolves under off-policy updates and thus causes router staleness. To address this limitation, we propose Predictive Routing Replay (PR2), which augments each router with a lightweight evolution predictor that learns to anticipate short-horizon router evolution. During the rollout phase, we use the predictive routing distribution to apply top-$k$ routing, enabling gradients to reach experts that are likely to become active after updates. During the training phase, we replay the resulting predicted route to retain consistency for stable importance estimation. Theoretical analysis and experiments support that PR2 reduces routing-induced mismatch, improves RL stability, and yields stronger performance across various reasoning benchmarks.
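To make the router-drift problem in this abstract concrete, the sketch below compares the top-$k$ route chosen from stale rollout logits with the route chosen from predicted logits. The additive `predicted_delta` is only a hypothetical stand-in for the learned evolution predictor, and all function names are illustrative rather than taken from the paper.

```python
def top_k_route(logits, k):
    """Indices of the k largest router logits, returned in ascending order."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return sorted(ranked[:k])

def predictive_route(logits, predicted_delta, k):
    """Route on logits shifted by a predicted change (hypothetical predictor output)."""
    shifted = [x + d for x, d in zip(logits, predicted_delta)]
    return top_k_route(shifted, k)

def route_overlap(route_a, route_b):
    """Fraction of experts shared by two routes of equal size."""
    return len(set(route_a) & set(route_b)) / len(route_a)

rollout_logits = [2.0, 1.0, 0.9, 0.0]
delta = [0.0, -0.5, 0.5, 0.0]          # assumed short-horizon router change
stale = top_k_route(rollout_logits, 2)  # experts 0 and 1
ahead = predictive_route(rollout_logits, delta, 2)  # experts 0 and 2
print(stale, ahead, route_overlap(stale, ahead))
```

In this toy case half of the experts differ between the two routes, which is the kind of mismatch that would distort importance weights if training replayed the stale route.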

2606.00366 2026-06-03 cs.LG math.OC 版本更新

GLENS: Global Search via Learning from Solver Iterates with Diffusion Models

GLENS: 通过扩散模型从求解器迭代中学习进行全局搜索

Anjian Li, Bartolomeo Stellato, Ryne Beeson

发表机构 * Department of Electrical and Computer Engineering, Princeton University(普林斯顿大学电气与计算机工程系) Department of Operations Research and Financial Engineering, Princeton University(普林斯顿大学运筹学与金融工程系) Department of Mechanical and Aerospace Engineering, Princeton University(普林斯顿大学机械与航空航天工程系)

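AI summary: Proposes GLENS, which uses diffusion models to learn local geometric structure from solver iterates and generates high-quality, diverse initial guesses that accelerate global search for multimodal non-convex optimization problems.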

详情
AI中文摘要

我们考虑为多模态非凸连续优化问题的局部最小值生成大量初始猜测的问题。目标是这些初始猜测质量高(即数值求解器快速收敛)且多样化(即代表许多不同的局部最小值)。识别多个局部最优解能够实现灵活的下游决策,但通常需要昂贵的全局搜索。现有的数据驱动方法仅使用离线求解器运行中最终收敛的最优值来预测初始猜测,这丢弃了关于解局部邻域的信息,并限制了可用的训练数据。我们提出GLENS(通过从求解器迭代中学习进行全局搜索),一种数据高效的全局搜索方法,利用中间求解器迭代作为免费的数据增强。GLENS由两个组件组成:邻域结构模型,使用扩散模型学习以问题参数为条件的最优值周围的局部几何结构;以及求解器行为模型,学习细化方向,在扩散采样期间进一步引导样本朝向附近的最优值。在修改的非凸基准问题和双机器人避障导航问题上的实验表明,GLENS生成高质量的初始猜测,同时保留了多样局部最优值的多模态分布。生成的初始猜测在不同问题设置和求解器中导致更快的求解器收敛。我们还分析了关键超参数选择对性能的影响。

英文摘要

We consider the problem of generating a large collection of initial guesses for local minima of multimodal non-convex continuous optimization problems. The goal is for these initial guesses to be high-quality (i.e., a numerical solver converges quickly) and diverse (i.e., represent many different local minima). Identifying multiple locally optimal solutions enables flexible downstream decision-making, but typically requires expensive global search. Existing data-driven methods predict initial guesses using only the final converged optima from offline solver runs, which discards information about the local neighborhoods of solutions and limits the available training data. We propose GLENS (Global Search via Learning from Solver Iterates), a data-efficient global search method that leverages intermediate solver iterates as free data augmentation. GLENS consists of two components: a neighborhood structure model that uses diffusion models to learn the local geometry around optima conditioned on problem parameters, and a solver behavior model that learns refinement directions to further guide samples towards nearby optima during diffusion sampling. Experiments on modified non-convex benchmark problems and a two-robot obstacle-avoidance navigation problem show that GLENS generates high-quality initial guesses while preserving the multimodal distribution of diverse local optima. The resulting initial guesses lead to faster solver convergence across different problem settings and solvers. We also analyze how key hyperparameter choices affect the performance.
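The idea of treating intermediate solver iterates as free training data can be sketched on a toy problem. The example below assumes a one-dimensional objective $f(x) = x^4 - 2x^2$ with local minima at $x = \pm 1$ and plain gradient descent as the solver; both are illustrative choices, not the benchmarks or solvers used in the paper.

```python
def grad(x):
    # Derivative of f(x) = x**4 - 2*x**2, which has local minima at -1 and +1
    return 4 * x**3 - 4 * x

def solve_with_iterates(x0, step=0.05, tol=1e-8, max_iter=1000):
    """Gradient descent that records every iterate, not just the final one."""
    iterates = [x0]
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x = x - step * g
        iterates.append(x)
    return iterates

def build_training_pairs(starts):
    """Pair each iterate with the optimum its own run converged to."""
    pairs = []
    for x0 in starts:
        iterates = solve_with_iterates(x0)
        optimum = iterates[-1]
        pairs.extend((x, optimum) for x in iterates)
    return pairs

pairs = build_training_pairs([-1.5, -0.5, 0.5, 1.5])
print(len(pairs), sorted({round(opt, 6) for _, opt in pairs}))
```

Four solver runs yield far more than four labelled points, and the pairs around each optimum describe its local neighborhood, which is the information a final-optimum-only dataset would discard.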

2606.00188 2026-06-03 cs.GR cs.CV cs.LG 版本更新

PaintBench: Deterministic Evaluation of Precise Visual Editing

PaintBench: 精确视觉编辑的确定性评估

Kai Xu, Ellis Brown, Shrikar Madhu, Rob Fergus, He He, Saining Xie

发表机构 * New York University(纽约大学)

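AI summary: Proposes the PaintBench benchmark, which procedurally generates 20 fundamental visual editing operations for deterministic pixel-level evaluation, finds that current models perform poorly (best mIoU 17.1%), and reveals the effects of task decomposition and scene variation.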

Comments Project Page: https://paintbench.github.io/

详情
AI中文摘要

虽然当前的多模态模型在开放式视觉编辑方面表现熟练,但执行精确的单答案编辑仍然是一个重要障碍。为了探究这一挑战,我们引入了PaintBench,一个动态可扩展的基准测试,针对四个类别的20种基本精确视觉编辑操作:几何变换、结构操作、颜色变化和符号推理。具有可配置复杂性的程序化生成实现了有效无限、抗污染的评估套件,而确定性像素级评估消除了对易偏见的评判模型的依赖。在11个图像编辑模型中,我们发现整体性能较低,当前表现最佳的行业领先者仅得17.1%(mIoU)。任务分解揭示了特别具有挑战性的操作类型(几何变换、大多数结构操作、基于公式的颜色变化)和模型特定的专长。细粒度的基准诊断进一步显示了由对象数量、背景复杂性、配色方案和编辑区域大小等场景变化引起的性能下降。为了测试PaintBench分数对应用任务性能的泛化能力,我们创建了一个用于数据可视化编辑的程序化确定性评估(TinyGrafixBench),并发现其与PaintBench分数之间存在强线性相关性($R^2 = 0.91$, $p < 0.001$)。总之,PaintBench为衡量和推动精确多模态视觉编辑的进展提供了严格的基础。

英文摘要

While current multimodal models are proficient at open-ended visual editing, executing precise single-answer edits remains an important obstacle. To probe this challenge, we introduce PaintBench, a dynamically scalable benchmark targeting 20 fundamental precise visual editing operations across four categories: geometric transformation, structural manipulation, color change, and symbolic reasoning. Procedural generation with configurable complexity enables an effectively infinite, contamination-resistant evaluation suite, and deterministic pixel-level evaluation eliminates reliance on bias-prone judge models. Across 11 image editing models, we find overall low performance, with the current highest-performing industry leader scoring only 17.1% (mIoU). Task decomposition reveals especially challenging operation types (geometric transformation, most structural manipulation, formula-based color change) and model-specific specializations. Fine-grained benchmark diagnostics further show performance degradations induced by scene variations in object count, background complexity, color scheme, and edit-region size. To test generalization of PaintBench scores to applied task performance, we create a procedural, deterministic evaluation for data visualization editing (TinyGrafixBench) and find strong linear correlation with PaintBench scores ($R^2 = 0.91$, $p < 0.001$). Altogether, PaintBench provides a rigorous foundation for measuring and driving progress in precise multimodal visual editing.
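A deterministic pixel-level score of the kind this abstract describes can be sketched as an intersection-over-union between edited regions. The definition below, which compares the set of pixels a model changed with the set the ground-truth edit changed, is an assumption made for illustration; the abstract does not specify how PaintBench computes its mIoU.

```python
def changed_mask(before, after):
    """Set of (row, col) pixels whose value differs between two images."""
    return {
        (r, c)
        for r, row in enumerate(before)
        for c, value in enumerate(row)
        if after[r][c] != value
    }

def iou(pred_mask, target_mask):
    """Intersection over union of two pixel sets."""
    union = pred_mask | target_mask
    if not union:
        return 1.0  # nothing to edit and nothing edited
    return len(pred_mask & target_mask) / len(union)

def mean_iou(cases):
    """cases: iterable of (source, model_output, ground_truth) image triples."""
    scores = [
        iou(changed_mask(src, out), changed_mask(src, gt))
        for src, out, gt in cases
    ]
    return sum(scores) / len(scores)

source = [[0, 0], [0, 0]]
truth = [[1, 1], [0, 0]]    # the correct edit recolors the top row
output = [[1, 0], [0, 0]]   # the model recolors only one pixel
print(mean_iou([(source, output, truth)]))
```

Because the score is a pure function of the pixel grids, repeated evaluation gives identical results, with no judge model involved.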

2605.30952 2026-06-03 cs.LG 版本更新

Spectral Anatomy of Quantum Gaussian Process Kernels

量子高斯过程核的谱解剖

Jian Xu, Chao Li, Guang Lin, Yuning Qiu, Delu Zeng, John Paisley, Qibin Zhao

发表机构 * RIKEN iTHEMS RIKEN AIP South China University of Technology(华南理工大学) Columbia University(哥伦比亚大学)

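AI summary: Uses the normalized spectral entropy S(K)/log n to give a unified explanation of the failure of exponential speedups and the posterior pathologies in quantum Gaussian process regression, shows that the diagnostic is kernel-agnostic across a range of quantum and classical kernels, and demonstrates low-error transfer to IBM Heron hardware.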

详情
AI中文摘要

两个近期结果重塑了量子高斯过程(QGPs)。一方面,\citet{lowe2025assessing} 排除了在典型、良态条件下基于HHL的QGP回归声称的指数加速;另一方面,一项独立工作表明,高表达性量子核存在后验病理,破坏了贝叶斯优化。我们证明这些看似无关的现象由同一个量控制:核Gram矩阵的归一化谱熵 $S(K)/\log n$。我们证明了Nyström近似误差的Cauchy–Schwarz尾部界、以Bach自由度 $d_σ(K)$ 表示的有限样本方差收缩恒等式,以及通过目标在核本征基中的内在维数对 \emph{依赖于目标} 的最优熵的表征。实验上,该诊断量与核无关:硬件高效、matchgate、IQP \emph{以及} RBF/Matérn/RFF/深度核族在去量子化、ECE和方差收缩面板上全部坍缩到相同的 $S/\log n$ 曲线上。NLL最佳点位于光滑目标的高熵和带限量子数据目标的低熵。该诊断量从模拟器迁移到IBM Heron硬件,在 $n_q = 4$ 的 $24$ 种配置中 $S/\log n$ 的中位绝对误差为 $3.2\%$,平均误差为 $5.2\%$,其中matchgate和IQP的平均误差在 $5\%$ 以内,单个HE配置返回 $30\%$ 的异常值,重新运行时降至 $0.5\%$(归因于校准漂移);相同的诊断量迁移到第二个Heron后端(平均误差 $2.7\%$)以及原始后端上的 $n_q = 6$ 扩展(平均误差 $1.7\%$)。全程未应用误差缓解。

英文摘要

Two recent results have reshaped quantum Gaussian processes (QGPs). On the one hand, \citet{lowe2025assessing} rule out the exponential speedups claimed by HHL-based QGP regression in the typical, well-conditioned regime; on the other, an independent line of work shows that highly expressive quantum kernels suffer posterior pathologies that break Bayesian optimization. We show that these seemingly unrelated phenomena are governed by the same quantity: the normalized spectral entropy $S(K)/\log n$ of the kernel Gram matrix. We prove a Cauchy--Schwarz tail bound on Nyström approximation error, a finite-sample variance-contraction identity in terms of Bach's degrees of freedom $d_σ(K)$, and a characterization of the \emph{target-dependent} optimal entropy via the intrinsic dimension of the target in the kernel eigenbasis. Empirically, the diagnostic is kernel-agnostic: hardware-efficient, matchgate, IQP \emph{and} RBF/Matérn/RFF/deep-kernel families all collapse onto identical $S/\log n$ curves on dequantization, ECE, and variance-contraction panels. The NLL sweet spot lives at high entropy for smooth targets and at low entropy for band-limited quantum-data targets. The diagnostic transfers from simulator to IBM Heron hardware with median absolute error $3.2\%$ and mean $5.2\%$ in $S/\log n$ across $24$ configurations at $n_q = 4$, with matchgate and IQP within $5\%$ mean and a single HE configuration returning a $30\%$ outlier that drops to $0.5\%$ on rerun (attributed to calibration drift); the same diagnostic transfers to a second Heron backend (mean error $2.7\%$) and to a $n_q = 6$ scale-up on the original backend (mean error $1.7\%$). No error mitigation is applied throughout.
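For readers unfamiliar with the quantity, one common way to define a normalized spectral entropy of an $n \times n$ Gram matrix is written out below. The abstract does not state the definition, so the trace normalization of the eigenvalues is an assumption here rather than the authors' formula.

```latex
% Assumed definition: eigenvalues of K rescaled to form a probability vector
p_i = \frac{\lambda_i(K)}{\operatorname{tr}(K)}, \qquad
S(K) = -\sum_{i=1}^{n} p_i \log p_i, \qquad
0 \le \frac{S(K)}{\log n} \le 1 .
```

Under this definition the ratio is 0 for a rank-one Gram matrix and 1 when all eigenvalues are equal, so it measures how evenly the kernel spreads its spectrum.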

2605.30789 2026-06-03 cs.LG cs.AI 版本更新

Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

小型模型是GRPO中策略级多样性的自然探索者

Yiming Ren, Yiran Xu, Zicheng Lin, Chufan Shi, Yukang Chen, Dingdong Wang, Tianhe Wu, Junjie Wang, Yujiu Yang, Yu Qiao, Ruihang Chu

发表机构 * University of Science and Technology of China(中国科学技术大学)

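AI summary: Proposes the S2L-PO framework, which uses small models as natural explorers to generate rollouts with policy-level diversity and transitions to the large model's own sampling through a progressive annealing strategy, improving mathematical reasoning performance while reducing compute.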

详情
AI中文摘要

我们识别出增强LLM组相对策略优化(GRPO)中rollout多样性的新维度。虽然GRPO依赖于多样化的rollout,但主流策略主要通过注入更多token级随机性来增加多样性,这可能引入逐步噪声并导致不连贯的轨迹。我们发现,同一模型族中的较小模型固有地表现出更高的策略级多样性,随着样本数量增加,其pass@k优于较大模型。与token级噪声不同,这种多样性在时间上相关,保持逻辑一致性,并为梯度估计提供结构化探索信号。因此,我们提出S2L-PO(从小到大的策略优化)框架,利用固定的小型模型作为自然探索者来训练大型模型。为了平衡探索与利用,我们设计了一种渐进退火策略,从离线的小模型rollout过渡到大型学习者自身的采样。这种转变优雅地避免了由小模型容量限制导致的训练中期性能下降,实现了更快的收敛并解锁了更高的性能上限。S2L-PO在多种数学推理基准上提高了准确率(例如,使用1.7B探索者指导8B模型在AIME 24上提高了8.8%),同时减少了rollout计算量。

英文摘要

We identify a new dimension for enhancing rollout diversity in Group Relative Policy Optimization (GRPO) for LLMs. While GRPO relies on diverse rollouts, prevailing strategies primarily increase diversity by injecting more token-level randomness, which may introduce step-wise noise and lead to incoherent trajectories. We uncover that smaller models within the same model family inherently exhibit higher policy-level diversity, indicated by their superior pass@k relative to larger counterparts as sample counts increase. Unlike token-level noise, this diversity is temporally correlated, preserves logical consistency, and provides structured exploration signals for gradient estimation. We thus propose S2L-PO (Small-to-Large Policy Optimization), a framework that leverages fixed small models as natural explorers to train larger models. To balance exploration and exploitation, we design a progressive annealing strategy that transitions from offline small-model rollouts to the large learner's own sampling. This shift elegantly avoids mid-training performance drops caused by the small model's capacity limits, achieving faster convergence and unlocking a higher performance ceiling. S2L-PO improves accuracy on diverse mathematical reasoning benchmarks (e.g., +8.8% on AIME 24 using a 1.7B explorer to guide the 8B model) while reducing rollout compute.
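Two ingredients mentioned in this abstract lend themselves to a short sketch: the pass@k metric and an annealing schedule between rollout sources. The estimator below is the standard unbiased pass@k formula; the linear schedule and the name `small_model_fraction` are assumptions, since the abstract only says the strategy is progressive.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples, c of which are correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

def small_model_fraction(step, anneal_steps):
    """Share of rollouts taken from the small explorer (assumed linear decay)."""
    return max(0.0, 1.0 - step / anneal_steps)

# A problem solved in 5 of 10 samples: the estimate rises quickly with k
print([round(pass_at_k(10, 5, k), 3) for k in (1, 2, 4)])
# Rollout mix at the start, midpoint, and after the annealing window
print([small_model_fraction(s, 100) for s in (0, 50, 200)])
```

Comparing `pass_at_k` curves of two models as k grows is one way to observe the diversity effect the abstract attributes to smaller models.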

2605.30722 2026-06-03 cs.LG stat.CO stat.ME 版本更新

Self-Certifying Transport MCMC via Dual Spectral-Gap Certificates

通过双谱间隙证书实现自认证传输MCMC

Jun Hu

发表机构 * Wuhan University of Technology(武汉理工大学)

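AI summary: Proposes the CerT-MCMC framework, which uses normalizing flows to provide automatic, rigorous convergence certification for transport MCMC, giving spectral-gap bounds through a covering certificate and a quantile-core certificate.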

Comments 35 pages, 3 figures, 9 tables. Submitted to JASA

详情
AI中文摘要

我们提出CerT-MCMC,一个为学习传输马尔可夫链蒙特卡洛配备自动、严格收敛证书的框架。归一化流将高斯参考映射到目标后验的近似;同一流同时作为独立Metropolis-Hastings提议和可计算谱间隙界的基础。我们开发了两个互补的证书。覆盖证书通过有限样本覆盖论证在全提议支撑上界权重比振荡,当保守梯度界可用时产生全支撑谱间隙界;其修正项以O(n^{-1/D})缩放,随着维度增加迅速变弱并最终无效。我们证明了一个匹配的Omega(n^{-1/D})下界,确立这一障碍是逐点Lipschitz认证固有的。分位数核心证书将注意力限制在高概率残差核心上,其振荡由一维经验分位数控制,具有O(n^{-1/2})的有限样本概率松弛,与维数无关。在合成目标(D=2-20)、结构工程后验(D=6,8)、心脏病数据集上的真实数据逻辑回归(D=13)以及合成贝叶斯逻辑回归(D=20)上,分位数核心证书在覆盖证书无效时提供了非平凡的谱间隙界,其谱间隙代理在7%内跟踪经验有效样本量。一个阴性对照实验证实,证书以超过10倍的因子区分流质量,而接受率仅相差1.15倍。据我们所知,双证书框架是第一个为学习传输MCMC提供自动、维度感知收敛证书的框架,区分了真正的传输失败与证明技术限制。

英文摘要

We propose CerT-MCMC, a framework that equips learned-transport Markov chain Monte Carlo with automatic, rigorous convergence certificates. A normalising flow maps a Gaussian reference to an approximation of the target posterior; the same flow then serves as both the independence Metropolis-Hastings proposal and the basis for a computable spectral-gap bound. We develop two complementary certificates. The covering certificate bounds the weight-ratio oscillation over the full proposal support via finite-sample covering arguments, yielding full-support spectral-gap bounds when a conservative gradient bound is available; its correction term scales as O(n^{-1/D}), making it rapidly weak and eventually vacuous as dimension increases. We prove a matching Omega(n^{-1/D}) lower bound, establishing that this barrier is intrinsic to pointwise Lipschitz certification. The quantile-core certificate restricts attention to a high-probability residual core on which the oscillation is controlled by one-dimensional empirical quantiles, with a finite-sample probability slack of O(n^{-1/2}), independent of the ambient dimension. On synthetic targets (D=2-20), structural-engineering posteriors (D=6,8), real-data logistic regression on the Heart Disease data set (D=13), and synthetic Bayesian logistic regression (D=20), the quantile-core certificate delivers non-vacuous spectral-gap bounds where the covering certificate is vacuous, and its spectral-gap proxy tracks empirical effective sample sizes within 7%. A negative control experiment confirms that the certificate discriminates flow quality by a factor exceeding 10x, whereas acceptance rates differ by only 1.15x. To our knowledge, the dual-certificate framework is the first to provide automatic, dimension-aware convergence certificates for learned-transport MCMC, distinguishing genuine transport failure from proof-technique limitations.
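The independence Metropolis-Hastings step at the center of this abstract depends only on the weight ratio $w = \pi / q$. The sketch below uses a one-dimensional standard normal target and a Gaussian proposal of width `scale` as a stand-in for the learned flow; both densities and all names are illustrative assumptions, and no certificate is computed.

```python
import math
import random

def log_weight(x, scale):
    """log w(x) = log pi(x) - log q(x), up to an additive constant."""
    log_target = -0.5 * x * x                        # standard normal target
    log_proposal = -0.5 * (x / scale) * (x / scale)  # N(0, scale^2) proposal
    return log_target - log_proposal

def imh_chain(n_steps, scale, rng):
    """Independence Metropolis-Hastings; returns samples and acceptance rate."""
    x = rng.gauss(0.0, scale)
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = rng.gauss(0.0, scale)  # the proposal ignores the current state
        log_alpha = log_weight(y, scale) - log_weight(x, scale)
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x, accepted = y, accepted + 1
        samples.append(x)
    return samples, accepted / n_steps

_, rate_exact = imh_chain(2000, 1.0, random.Random(0))  # proposal equals target
_, rate_wide = imh_chain(2000, 2.0, random.Random(0))   # mismatched proposal
print(rate_exact, rate_wide)
```

When the proposal matches the target the weight is constant and every move is accepted. A classical result for this sampler is that a uniform bound $w \le M$ on the normalized weight gives geometric convergence at rate $1 - 1/M$, which is why controlling the oscillation of $w$ leads to spectral-gap statements.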

2605.26704 2026-06-03 cs.LG cs.AI 版本更新

SL-BiLEM: Structured Learnable Behavior-in-the-Loop Epidemic Modeling for Forecasting and Policy Evaluation

SL-BiLEM: 用于预测和政策评估的结构化可学习行为循环流行病模型

Haochun Wang, Sendong Zhao, Jingbo Wang, Yanrui Du, Ting Liu, Bing Qin

发表机构 * Faculty of Computing, Harbin Institute of Technology(计算学院,哈尔滨工业大学)

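AI summary: Proposes the SL-BiLEM model, which achieves robust extrapolation by using physical constraints as regularization, reports a 76% improvement over neural-mechanistic baselines while remaining robust to distribution shift induced by policy interventions, and supports counterfactual analysis.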

Comments ACM SIGKDD 2026

详情
AI中文摘要

流行病预测面临一个基本挑战:人类行为会动态响应疾病传播,形成反馈循环,在政策干预点引发分布偏移。这使得数据驱动模型在分布偏移下不可靠。我们提出\textbf{SL-BiLEM}(结构化可学习行为循环流行病模型),利用物理约束作为正则化实现鲁棒外推。该框架将有效传播率分解为$β_{\text{eff}}(t,g) = β_0(g) \times m_{\text{policy}}(t) \times m_{\text{media}}(t) \times m_{\text{comp}}(t,g)$,其中对学习到的依从函数施加单调性、平滑性和有界跳跃约束,以在新政策制度下保持预测有效性。除预测外,SL-BiLEM还能为干预决策支持进行反事实分析。我们在三个真实世界数据集(邮轮、学校流感和学区COVID-19监测)上验证预测性能,并在已知真实情况的合成基准上评估反事实恢复。SL-BiLEM表明:(1)相比神经机制基线改进76%,在政策诱导偏移下仅53%的OOD退化,而神经基线为1142%;(2)在27个合成反事实实验中,自举置信区间覆盖率达100%;(3)处理效应准确度超过0.85。这些结果使SL-BiLEM成为公共卫生决策者寻求准确预测和原则性干预规划的可解释工具。

英文摘要

Epidemic forecasting faces a fundamental challenge: human behavior dynamically responds to disease spread, creating feedback loops that induce distribution shifts at policy intervention points. This renders data-driven models unreliable under distribution shift. We propose \textbf{SL-BiLEM} (Structured Learnable Behavior-in-the-Loop Epidemic Model), leveraging physical constraints as regularization for robust extrapolation. The framework decomposes effective transmission as $β_{\text{eff}}(t,g) = β_0(g) \times m_{\text{policy}}(t) \times m_{\text{media}}(t) \times m_{\text{comp}}(t,g)$, where monotonicity, smoothness, and bounded-jump constraints on the learned compliance function maintain predictive validity under novel policy regimes. Beyond forecasting, SL-BiLEM enables counterfactual analysis for intervention decision support. We validate forecasting on three real-world datasets (cruise ship, school influenza, and school-district COVID-19 surveillance) and evaluate counterfactual recovery on synthetic benchmarks with known ground truth. SL-BiLEM demonstrates: (1) 76\% improvement over neural-mechanistic baselines, with only 53\% OOD degradation versus 1142\% for neural baselines under policy-induced shift; (2) 100\% bootstrap CI coverage across 27 synthetic counterfactual experiments; and (3) Treatment Effect Accuracy exceeding 0.85. These results establish SL-BiLEM as an interpretable tool for public health decision-makers seeking accurate prediction and principled intervention planning.
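The multiplicative decomposition of the effective transmission rate in this abstract can be shown in a few lines. In the sketch below the multiplier values are made-up numbers, and the normalized SIR update is an assumed compartmental model chosen only to show where $β_{\text{eff}}$ would enter; the abstract does not say which epidemic model the framework uses.

```python
def beta_eff(beta0, m_policy, m_media, m_comp):
    """Effective transmission rate: baseline times three behavioral multipliers."""
    return beta0 * m_policy * m_media * m_comp

def sir_step(s, i, r, beta, gamma, dt=1.0):
    """One forward-Euler step of a normalized SIR model (s + i + r = 1)."""
    new_infections = beta * s * i * dt
    new_recoveries = gamma * i * dt
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries

# Hypothetical scenario: a policy halves contact, compliance is 80 percent
beta = beta_eff(beta0=0.4, m_policy=0.5, m_media=1.0, m_comp=0.8)
state = (0.99, 0.01, 0.0)
for _ in range(10):
    state = sir_step(*state, beta=beta, gamma=0.1)
print(round(beta, 3), tuple(round(x, 4) for x in state))
```

A counterfactual in this setting amounts to rerunning the same loop with a different `m_policy` trajectory and comparing the resulting infection curves.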

2605.30253 2026-06-03 stat.ML cs.LG math.FA math.OC math.PR stat.CO 版本更新

Wasserstein Contraction of Coordinate Ascent Variational Inference

坐标上升变分推断的Wasserstein收缩

Rocco Caprio, Adrien Corenflos, Sam Power

发表机构 * Department of Statistics, University of Warwick(沃里克大学统计系) School of Mathematics, University of Bristol(布里斯托大学数学学院)

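AI summary: Studies the contraction of the coordinate ascent variational inference algorithm in Wasserstein distance, gives local convergence guarantees under a transport-information inequality at the fixed points and a functional smoothness condition, and applies the results to Bayesian Gaussian mixture models, high-dimensional Bayesian probit regression, and Pólya-Gamma logistic regression.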

Comments 17 pages + 3 pages appendix, 3 figures. V2 fixes some citations not displaying properly in the appendix. No content change compared to prior version

详情
AI中文摘要

我们研究了坐标上升变分推断算法在Wasserstein距离下的收缩性。该性质在不动点处满足传输-信息不等式和函数光滑性条件时成立。结果是通用且精确的,允许局部收敛保证,适用于一般光滑流形,也适用于某些非光滑空间。我们考虑了在贝叶斯高斯混合模型、高维贝叶斯Probit回归以及带有Pólya-Gamma随机变量的逻辑回归(即Jaakkola-Jordan算法)中的应用。

英文摘要

We study the contraction in Wasserstein distance of the coordinate ascent variational inference algorithm. This is shown to hold under a transport-information inequality at the fixed points and a functional smoothness condition. The results are general and sharp, allow for local convergence guarantees, and hold for general smooth manifolds as well as some non-smooth spaces. We consider applications to Bayesian Gaussian Mixture Models, high-dimensional Bayesian Probit Regression, and Logistic Regression with Pólya-Gamma random variables (i.e., Jaakkola-Jordan's algorithm).
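Contraction of coordinate ascent variational inference is easy to observe in the textbook case of a bivariate Gaussian target with a mean-field Gaussian approximation. The sketch below is that standard example, not one of the paper's applications; the target mean `mu` and precision matrix `prec` are arbitrary illustrative values.

```python
def cavi_gaussian(mu, prec, m_init, sweeps):
    """Mean-field CAVI for a 2D Gaussian target N(mu, prec^{-1}).

    Each factor q_j is Gaussian with fixed variance 1 / prec[j][j], so only
    the factor means (m1, m2) change from sweep to sweep.
    """
    (l11, l12), (l21, l22) = prec
    m1, m2 = m_init
    history = [(m1, m2)]
    for _ in range(sweeps):
        m1 = mu[0] - (l12 / l11) * (m2 - mu[1])  # update q_1 given q_2
        m2 = mu[1] - (l21 / l22) * (m1 - mu[0])  # update q_2 given new q_1
        history.append((m1, m2))
    return history

mu = (1.0, -2.0)
prec = ((2.0, 1.0), (1.0, 2.0))
history = cavi_gaussian(mu, prec, m_init=(5.0, 5.0), sweeps=30)
errors = [abs(m2 - mu[1]) for _, m2 in history[:4]]
print(history[-1], [errors[k + 1] / errors[k] for k in range(3)])
```

The error in the second mean shrinks by the factor $\Lambda_{12}\Lambda_{21} / (\Lambda_{11}\Lambda_{22}) = 0.25$ on every sweep. Because the factor variances are fixed, the distance between two iterates' means equals their 2-Wasserstein distance, so this toy case shows a global Wasserstein contraction.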

2605.30225 2026-06-03 cs.LG 版本更新

ExDBSCAN: Explaining DBSCAN with Counterfactual Reasoning -- Additional Material

ExDBSCAN: 用反事实推理解释DBSCAN——附加材料

Pernille Matthews, Lena Krieger, Tommaso Amico, Artur Zimek, Thomas Seidl, Ira Assent

发表机构 * Aarhus University Department of Computer Science(奥胡斯大学计算机科学系) Forschungszentrum Jülich(朱利奇研究中心) LMU Munich(慕尼黑大学) University of Southern Denmark Department of Computer Science and Mathematics(南丹麦大学计算机科学与数学系) MCML Munich Germany(慕尼黑MCML德国) DBS

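AI summary: Proposes ExDBSCAN, which makes DBSCAN clustering results explainable through density-aware counterfactual explanations, with theoretical guarantees of validity and experiments showing that it outperforms the baselines.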

详情
AI中文摘要

聚类是一种通过相似性对数据点进行分组的无监督技术。尽管监督机器学习存在可解释性方法,但它们不能直接应用于聚类,使得理解聚类分配具有挑战性。这种可解释性差距在流行的基于密度的方法DBSCAN中尤为明显,该方法将点分配为内点(密集区域中的聚类成员)或离群点(稀疏区域中的噪声点)。DBSCAN没有提供关于为什么特定点得到其分配或其分配是否对数据的小变化鲁棒的见解。为了解决缺乏可解释性的问题,我们引入了ExDBSCAN,一种密度感知的事后解释方法。ExDBSCAN提供可操作的反事实解释,并具有有效性的理论保证。它使用密度连接的加权图生成多个反事实,采用物理启发模型,该模型使反事实候选者彼此排斥(多样性),同时将它们拉向要解释的实例(接近性)。在30个表格数据集上对比四个基线的实证评估表明,ExDBSCAN在所有基线上表现优异,同时达到完美的有效性并检索到多样、接近的反事实。

英文摘要

Clustering is an unsupervised technique for grouping data points by similarity. While explainability methods exist for supervised machine learning, they are not directly applicable to clustering, making it challenging to understand cluster assignments. This interpretability gap is particularly evident in the popular density-based method DBSCAN, which assigns points as inliers (cluster members in dense regions) or outliers (noise points in sparse regions). DBSCAN does not provide insight into why a particular point receives its assignment or whether its assignment is robust to small changes in the data. To address the lack of explainability, we introduce ExDBSCAN, a density-aware, post-hoc explanation method. ExDBSCAN offers actionable counterfactual explanations, with theoretical guarantees for validity. It generates multiple counterfactuals using a density connected weighted graph, adopting a physics-inspired model that repels counterfactual candidates from one another (diversity), while pulling them toward the instance to explain (proximity). Empirical evaluation on 30 tabular datasets comparing against four baselines shows that ExDBSCAN outperforms all baselines while attaining perfect validity and retrieving diverse, proximal counterfactuals.

2605.30166 2026-06-03 cs.SI cs.LG 版本更新

SAHG: Sector-Anisotropic Hyperbolic Graph Model for Social Bot Detection

SAHG:用于社交机器人检测的扇区各向异性双曲图模型

Hanning Lu, Yingguang Yang, Jinwei Su, Yang Liu, Zhaoqian Yao, Yaoming Li, Taoran Liang, Ziyi Zhang, Ran Ran, Kefu Xu, Bin Chong

发表机构 * University of Leeds(利兹大学) University of Science and Technology of China(中国科学技术大学) South China Normal University(华南师范大学) Tsinghua University(清华大学) The Chinese University of Hong Kong(香港中文大学) Harbin University of Commerce(哈尔滨商业大学) Beijing University of Posts and Telecommunications(北京邮电大学) Peking University(北京大学) University of California, Berkeley(加州大学伯克利分校)

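AI summary: Proposes SAHG, a sector-anisotropic hyperbolic graph model that uses a direction-dependent curvature field and sector prototypes to address both the distortion that Euclidean GNNs introduce on hierarchical, scale-free social graphs and the signal contamination caused by heterophilic connections, achieving the best performance on three benchmarks.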

详情
AI中文摘要

LLM驱动的社交机器人能生成流畅类人文本,降低了纯内容检测的判别优势。然而,协调活动仍留下关系模式——交互、行为相似性、共享邻居、社区位置和协调活动——图方法可利用这些模式。现有图检测器在利用此类证据时面临两个挑战。首先,欧几里得GNN扭曲了层次和无标度社交图;虽然双曲几何解决了这种体积增长不匹配,但固定曲率模型仍对不同密度和分离需求的结构方向分配均匀的几何分辨率。其次,关系证据并不总是可靠:复杂机器人与真实用户伪造异质连接,导致邻域聚合混合机器人和人类信号,稀释账户级证据。我们提出SAHG(扇区各向异性双曲图),解决这两个挑战。SAHG学习方向依赖的曲率场γ(u),适应结构方向上的几何分辨率,并使用扇区原型将角度集中和对齐转换为分类器可读特征。为防止受污染的聚合淹没账户级证据,SAHG在两个独立的SAH通道中编码每个账户特征和图邻域表示,仅在分类器处融合。在Fox8-23、BotSim-24和MGTAB上的实验表明,SAHG在所有三个基准上实现了最高准确率和F1,优于基于特征、基于图、基于LLM和各向同性双曲基线。消融和几何分析证实了各向异性几何和双通道设计的有效性。

英文摘要

LLM-driven social bots can generate fluent, human-like text, reducing the discriminative advantage of content-based detection alone. However, coordinated campaigns still leave relational patterns -- interactions, behavioral similarity, shared neighborhoods, community positions, and coordinated activity -- that graph-based methods can exploit. Existing graph detectors face two challenges when exploiting such evidence. First, Euclidean GNNs distort hierarchical and scale-free social graphs; while hyperbolic geometry addresses this volume-growth mismatch, fixed-curvature models still assign uniform geometric resolution to structural directions with different densities and separation needs. Second, relational evidence is not always reliable: sophisticated bots forge heterophilic connections with genuine users, causing neighborhood aggregation to mix bot and human signals and dilute account-level evidence. We propose SAHG (Sector-Anisotropic Hyperbolic Graph), addressing both challenges. SAHG learns a direction-dependent curvature field $γ(u)$ that adapts geometric resolution across structural directions, and uses sector prototypes to convert angular concentration and alignment into classifier-readable features. To prevent contaminated aggregation from overwhelming account-level evidence, SAHG encodes per-account features and graph-neighborhood representations in two independent SAH channels, fusing them only at the classifier. Experiments on Fox8-23, BotSim-24, and MGTAB show that SAHG achieves the highest accuracy and F1 on all three benchmarks, outperforming feature-based, graph-based, LLM-based, and isotropic hyperbolic baselines. Ablation and geometric analyses confirm the effectiveness of the anisotropic geometry and dual-channel design.
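The volume-growth argument for hyperbolic geometry in this abstract can be seen from the distance function of the standard Poincaré ball. The sketch below implements only that fixed-curvature baseline distance; the direction-dependent curvature field $γ(u)$ proposed in the paper is not modeled, and the sample points are arbitrary.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    def sq_norm(x):
        return sum(a * a for a in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)

origin = (0.0, 0.0)
# Equal Euclidean steps toward the boundary cover ever larger hyperbolic distances
for radius in (0.5, 0.9, 0.99):
    print(radius, round(poincare_distance(origin, (radius, 0.0)), 3))
```

Distances blow up near the boundary, which leaves room to embed the exponentially growing neighborhoods of hierarchical graphs with low distortion. A single curvature applies that resolution uniformly in every direction, which is the limitation the abstract attributes to fixed-curvature models.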

2605.28166 2026-06-03 cs.LG cs.AI 版本更新

QuITE: Query-Based Irregular Time Series Embedding

QuITE: 基于查询的不规则时间序列嵌入

Junghoon Lim

AI总结 提出一种即插即用的嵌入模块QuITE,通过可学习查询令牌聚合不规则观测值,无需插值或修改架构,显著提升多变量时间序列模型的预测和分类性能。

Comments ICML 2026

详情
AI中文摘要

不规则多变量时间序列在实践中很常见,但其不规则采样给有效建模带来了困难。现有方法通常要么(i)设计专门架构,限制了经过验证的多变量时间序列模型的复用,要么(ii)通过插值将不规则时间序列映射到规则时间网格,这可能会引入人工值从而扭曲时间动态。为解决这些限制,我们提出了一种新的基于输入嵌入的方法。我们发现关键瓶颈不在于主干架构,而在于假设均匀采样的传统嵌入层。在这项工作中,我们引入了QuITE(基于查询的不规则时间序列嵌入),一种简单而有效的即插即用嵌入模块。QuITE使用可学习查询令牌通过单层自注意力聚合不规则观测值,直接生成与主干兼容的潜在表示,无需生成人工值或修改架构。在真实世界基准上的大量实验表明,QuITE持续改进多变量时间序列模型,在不同数据集和主干架构上,预测任务平均相对提升高达54.7%,分类任务平均相对提升高达15.8%。代码可在 https://github.com/Meaningfull9502/QuITE 获取。

英文摘要

Irregular Multivariate Time Series (IMTS) are common in practice, yet their irregular sampling complicates effective modeling. Existing approaches typically either (i) design specialized architectures that limit the reuse of proven Multivariate Time Series (MTS) models, or (ii) map IMTS onto regular temporal grids through interpolation, which may distort temporal dynamics by introducing artificial values. To address these limitations, we propose a new input-embedding-based approach. We identify that the key bottleneck lies not in the backbone architecture, but in conventional embedding layers that assume uniform sampling. In this work, we introduce QuITE (Query-Based Irregular Time Series Embedding), a simple yet effective plug-and-play embedding module for IMTS. QuITE employs learnable query tokens to aggregate irregular observations through a single self-attention layer, directly producing backbone-compatible latent representations without artificial value generation or architectural modification. Extensive experiments on real-world benchmarks show that QuITE consistently improves MTS models, yielding average relative gains of up to $54.7\%$ in forecasting and $15.8\%$ in classification across diverse datasets and backbone architectures. Code is available at: https://github.com/Meaningfull9502/QuITE.
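The core mechanism in this abstract, a fixed set of query tokens pooling a variable number of irregular observations, can be sketched with plain scaled dot-product attention. The version below is a simplified cross-attention written without a deep learning library; the query values are fixed numbers standing in for learned parameters, and the way observations are turned into key and value vectors is assumed.

```python
import math

def attention_pool(queries, keys, values):
    """Scaled dot-product attention: one output row per query token."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        outputs.append([
            sum(w / total * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Five observations at irregular timestamps, embedded as (time, value) keys
keys = [[0.1, 1.0], [0.4, 1.2], [0.45, 0.9], [2.0, 3.1], [5.3, 2.8]]
values = [[k[1]] for k in keys]
queries = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # stand-ins for learned tokens
print(len(attention_pool(queries, keys, values)))
```

The output always has one row per query regardless of how many observations arrive, which is what lets a backbone that expects a fixed-length input consume irregular data without interpolation.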

2605.29058 2026-06-03 cs.LG 版本更新

Parallel Adaptive Multi-Objective Evolutionary Learning of Discretized Bayesian Network Classifiers for Clinical Data

面向临床数据的离散化贝叶斯网络分类器的并行自适应多目标进化学习

Damy M. F. Ha, Thalea Schlender, Yvette M. van der Linden, Peter A. N. Bosman, Tanja Alderliesten

发表机构 * Leiden University Medical Center(莱顿大学医学中心) Centrum Wiskunde & Informatica(数学与信息学中心)

AI总结 针对贝叶斯网络学习计算量大且仅用于合成数据的问题,提出并行化策略和自适应优化机制,将Baymex重构为多目标优化分类器,在真实临床数据上实现高效且可解释的分类性能。

AI中文摘要

贝叶斯网络(BN)从可解释人工智能的角度来看很有吸引力,为决策支持提供了透明的概率模型。Baymex是最近引入的一种用于学习离散化BN的多目标进化算法,使专家能够权衡不同的目标,如似然、模型复杂度和先验信念。虽然Baymex已被证明优于最先进的BN学习方法,但Baymex仍然1)需要大量计算时间,2)仅在合成数据上进行了评估。为了提高可扩展性,我们引入了并行化策略以及一种能够自适应地将优化导向过拟合较少网络的机制。此外,我们重新配置Baymex,通过交叉熵损失和BIC复杂度项的多目标优化来训练BN分类器,以评估其在真实世界临床分类任务上的性能。除了在16核CPU上观察到高达54倍以上的加速外,在两个开源数据集(RADCURE和SUPPORT)和一个内部数据集上,与临床熟悉的基线(决策树、逻辑回归、朴素贝叶斯和随机森林)进行比较表明,Baymex获得了统计上相似或更好的预测性能,同时生成了紧凑、临床可检查的BN。重要的是,Baymex找到了多个合理的BN分类器,其中包含与既定临床因素一致的预测因子。

英文摘要

Bayesian Networks (BNs) are of interest from an explainable AI viewpoint, offering transparent probabilistic models for decision support. Baymex is a recently introduced multi-objective evolutionary algorithm for learning discretized BNs, enabling experts to trade off different objectives of interest, such as likelihood, model complexity, and prior beliefs. While Baymex has been shown to outperform state-of-the-art BN learning approaches, it still 1) requires a lot of computation time and 2) has only been evaluated on synthetic data. To improve scalability, we introduce a parallelization strategy as well as a mechanism that adaptively steers optimization toward networks that overfit less. We furthermore reconfigure Baymex to train a BN classifier through multi-objective optimization of cross-entropy loss and the BIC complexity term so as to evaluate its performance on real-world clinical classification tasks. We observe speedups of up to over 54 times on a 16-core CPU. Comparisons against clinically familiar baselines (decision trees, logistic regression, naive Bayes, and random forests) on two open-source datasets (RADCURE and SUPPORT) and one in-house dataset show that Baymex obtains statistically similar or better predictive performance while producing compact, clinically inspectable BNs. Importantly, Baymex finds multiple plausible BN classifiers that contain predictors consistent with established clinical factors.

2605.26366 2026-06-03 cs.AI cs.LG 版本更新

Automatic Layer Selection for Hallucination Detection

幻觉检测的自动层选择

Xinpeng Wang, William X. Cao, Andrew Gordon Wilson, Zhe Zeng

发表机构 * University of Washington(华盛顿大学)

AI总结 针对大语言模型中幻觉检测的层选择问题,提出无需训练的FEPoID准则,自动识别最优中间层,并结合截断策略提升检测性能。

Comments Accepted at ICML 2026

AI中文摘要

最近关于幻觉检测的研究表明,在大语言模型(LLMs)中,与幻觉相关的信号在中间层比在最后一层编码得更强。尽管越来越多的研究试图利用这一特性进行幻觉检测,但如何自动选择高性能层仍未得到充分探索,且缺乏针对此目的的原则性方法。为填补这一空白,我们首先提出了几个关于为何这些信号出现在中间层的假设,并评估了相应的自动层选择准则,这些准则适用于不同的LLM架构、规模和任务,涵盖了问答和摘要幻觉检测基准。然而,我们发现这些准则均不能持续提供令人满意的性能。因此,我们提出了一种新的选择准则——第一有效本征维度峰值(FEPoID),它能够一致地识别最优或接近最优的层,并优于上述准则和现有的幻觉检测基线。FEPoID无需训练,且计算开销可忽略不计。此外,我们研究了LLM的生成行为,并引入了一种简单而有效的截断策略,该策略进一步放大了与幻觉相关的信号,并显著提高了整体检测性能。代码公开于 https://github.com/DesoloYw/Automatic-Layer-Selection-for-Hallucination-Detection.git

英文摘要

Recent studies on hallucination detection have shown that hallucination-related signals are more strongly encoded in intermediate layers than in the final layer of large language models (LLMs). Although a growing body of work has sought to exploit this property for hallucination detection, how to automate the selection of high-performing layers remains underexplored, and principled methods for this purpose are still lacking. To address this gap, we first propose several hypotheses for why such signals emerge in intermediate layers and evaluate corresponding criteria for automatic layer selection across diverse LLM architectures, scales, and tasks, covering both question answering and summarization hallucination detection benchmarks. However, we find that none of these criteria consistently delivers satisfactory performance. We therefore propose a new selection criterion, First Effective Peak of Intrinsic Dimension (FEPoID), which consistently identifies optimal or near-optimal layers and outperforms both the aforementioned criteria and existing hallucination detection baselines. FEPoID is training-free and incurs negligible computational overhead. In addition, we study the generation behaviors of LLMs and introduce a simple yet effective truncation strategy, which further amplifies hallucination-related signals and substantially improves overall detection performance. Code is publicly available at https://github.com/DesoloYw/Automatic-Layer-Selection-for-Hallucination-Detection.git

2605.25902 2026-06-03 cs.LG 版本更新

Reading the Finetuning Prior: Verbatim Content Recovery via Contrastive Decoding Diffing

读取微调先验:通过对比解码差异进行逐字内容恢复

Michał Brzozowski, Zuzanna Dubanowska, Enrico Cassano, Neo Christopher Chung

发表机构 * Samsung AI Center(三星人工智能中心) University of Turin(都灵大学) University of Warsaw(华沙大学)

AI总结 提出对比解码差异(CDD)方法,仅基于输出层logit分布,无需权重或内部访问,即可从微调模型中逐字恢复植入事实,并在多种架构和场景下优于白盒方法。

AI中文摘要

窄微调的语言模型会逐字记忆植入的内容,但在无法访问模型权重或训练数据的情况下,审计已部署模型所学内容仍然是一个开放挑战。最近的研究表明,基础模型与微调模型之间的激活差异携带了微调领域的可读痕迹;最先进的激活差异透镜(ADL)可以恢复模糊的领域级描述,但需要完全的“白盒”访问模型内部。我们引入了对比解码差异(CDD),一种仅操作输出级logit分布的模型差异方法,无需权重访问、无需层选择、无需针对模型调参,却能恢复植入的事实。CDD包含三个思想:绕过聊天模板以暴露原始微调先验,用最大模糊的前缀填充种子生成,以及在每个解码步骤放大微调模型与基础模型之间的logit空间差异。一个单一的默认配置即可逐字恢复植入的事实——精确的药物名称、投票数、物理测量值和程序细节——涵盖四种架构(1B-32B参数),尽管访问更少且运行速度约快170倍,但全面优于ADL。此外,CDD揭示了意外的数据管道伪影:LLM数据生成器通过模式崩溃引入的虚构角色泄露到模型权重中,并被CDD提取出来,据我们所知,这构成了首个从数据生成器伪影到模型权重再到恢复输出的端到端指纹链。我们在真实领域微调设置上进行了验证,在所有单数据集非CoT变体上实现了近乎完美的恢复,并在混合数据集设置中正确识别了所有四个数据集。CDD作为一种灰盒方法成功超越了白盒基线,凸显了其在AI系统透明度和问责性方面的实际效用。

英文摘要

Narrowly finetuned language models memorize implanted content verbatim, but auditing what a deployed model has been taught, without access to its weights or training data, remains an open challenge. Recent work shows that activation differences between base and finetuned models carry readable traces of the finetuning domain; the state-of-the-art Activation Difference Lens (ADL) recovers a vague domain-level description but requires full "white-box" access to model internals. We introduce Contrastive Decoding Diffing (CDD), a model diffing method that operates on output-level logit distributions only, with no weight access, no layer selection, and no per-model tuning, yet recovers implanted facts. CDD consists of three ideas: bypassing the chat template to expose the raw finetuning prior, seeding generation with maximally vague pre-fills, and amplifying the logit-space difference between finetuned and base models at each decoding step. A single default configuration recovers implanted facts verbatim -- exact drug names, vote counts, physical measurements, and procedural details -- across four architectures (1B--32B parameters), uniformly outperforming ADL despite less access and running ~170x faster. Furthermore, CDD surfaces unintended data pipeline artifacts: a fictional persona introduced by the LLM data generator via mode collapse leaked into model weights and was extracted by CDD, constituting to our knowledge the first demonstrated end-to-end fingerprinting chain from data generator artifact to model weights to recovered output. We validate on real-domain finetuning settings, achieving near-perfect recovery across all single-dataset non-CoT variants and correctly identifying all four datasets in the mixed-dataset setting. CDD's success as a grey-box method outperforming white-box baselines underscores its practical utility for transparency and accountability in AI systems.
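
The third idea in this abstract, amplifying the logit-space difference between the finetuned and base models at each decoding step, can be illustrated with a few lines of arithmetic. This is a sketch under assumptions: the abstract does not give the exact formula, so the linear extrapolation `finetuned + alpha * (finetuned - base)`, the parameter name `alpha`, and the toy vocabulary are all hypothetical.

```python
import math

def amplified_logits(finetuned, base, alpha=1.0):
    # Push each logit further in the direction the finetuned model moved it.
    return [f + alpha * (f - b) for f, b in zip(finetuned, base)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["the", "a", "zorbamycin"]   # made-up toy vocabulary
base = [2.0, 1.5, 0.0]
finetuned = [2.0, 1.5, 1.2]          # finetuning raised only the last token

plain = softmax(finetuned)
contrast = softmax(amplified_logits(finetuned, base, alpha=3.0))
plain_token = vocab[plain.index(max(plain))]
next_token = vocab[contrast.index(max(contrast))]
```

In this toy case the finetuned model alone would still emit a generic token, while the amplified distribution puts most of its mass on the token whose logit changed during finetuning. Only output logits from the two models are needed, which matches the grey-box setting the abstract describes.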

2603.05691 2026-06-03 cs.LG stat.ML 版本更新

Improved Scaling Laws via Weak-to-Strong Generalization in Random Feature Ridge Regression

随机特征岭回归中弱到强泛化的改进缩放定律

Diyuan Wu, Lehan Chen, Theodor Misiakiewicz, Marco Mondelli

AI总结 本文通过随机特征岭回归的确定性等价分析,揭示了弱教师训练的强学生模型在偏差主导和方差主导场景下均能改进缩放定律,甚至达到极小极大最优率。

AI中文摘要

在机器学习中,使用学习模型标记数据,然后用这些数据训练更强大的模型变得越来越常见。弱到强泛化现象体现了这种两阶段过程的优势:强学生模型在由弱教师模型获得的不完美标签上训练,但强学生模型的表现优于弱教师模型。在本文中,我们展示了这种潜在改进是显著的,因为它影响了测试误差所遵循的缩放定律。具体来说,我们考虑通过随机特征岭回归(RFRR)训练的学生和教师模型。我们的主要技术贡献是推导出学生模型在教师模型获得的标签上训练时的超额测试误差的确定性等价。通过这个确定性等价,我们识别出学生模型的缩放定律相对于教师模型得到改进的区域,揭示了这种改进可以在偏差主导和方差主导的设置中实现。引人注目的是,无论教师模型的缩放定律如何,学生模型都可能达到极小极大最优率——事实上,即使教师模型的测试误差不随样本量衰减,这一结论也成立。

英文摘要

It is increasingly common in machine learning to use learned models to label data and then employ such data to train more capable models. The phenomenon of weak-to-strong generalization exemplifies the advantage of this two-stage procedure: a strong student is trained on imperfect labels obtained from a weak teacher, and yet the strong student outperforms the weak teacher. In this paper, we show that the potential improvement is substantial, in the sense that it affects the scaling law followed by the test error. Specifically, we consider students and teachers trained via random feature ridge regression (RFRR). Our main technical contribution is to derive a deterministic equivalent for the excess test error of the student trained on labels obtained via the teacher. Via this deterministic equivalent, we then identify regimes in which the scaling law of the student improves upon that of the teacher, unveiling that the improvement can be achieved both in bias-dominated and variance-dominated settings. Strikingly, the student may attain the minimax optimal rate regardless of the scaling law of the teacher -- in fact, when the test error of the teacher does not even decay with the sample size.

2605.23055 2026-06-03 cs.LG cs.AI cs.CL 版本更新

Decomposing and Measuring Evaluation Awareness

分解与度量评估意识

Changling Li, Terry Jingchen Zhang, Jie Zhang, Zhijing Jin, Sahar Abdelnabi, Maksym Andriushchenko

发表机构 * ETH Zürich(苏黎世联邦理工学院) ELLIS Institute Tübingen(图宾根ELLIS研究所) Max Planck Institute for Intelligent Systems(智能系统马克斯·普朗克研究院) Tübingen AI Center(图宾根人工智能中心) University of Toronto & Vector Institute(多伦多大学及向量研究所) EuroSafeAI(欧洲安全人工智能)

AI总结 本研究借鉴社会心理学,将评估意识分解为环境与模型两部分,通过EvalAwareBench基准测试发现识别率取决于模型与基准的配对,且识别很少导致行为改变,安全评估比能力评估更易受影响。

AI中文摘要

前沿语言模型有时会意识到自己正在被评估并调整行为,从而破坏基准结果的有效性。然而,该领域的研究缺乏共同基础,混淆了评估属性与模型属性,以及检测与行为响应。我们将评估意识植根于社会心理学,将其分解为环境组件(任务的可识别程度)和模型组件(将识别与行动倾向分离)。我们通过八个分类触发因素(如占位符实体和评分式输出格式)来操作化环境组件,并通过思维链监控研究识别和行为。在九个前沿模型和四个基准上,识别率取决于模型和基准的具体配对,而非单独一方。识别很少导致行为改变,即使发生,方向也取决于所感知的评估类型。模型对安全评估的敏感性高于能力评估,使安全基准的有效性面临更大风险。为了研究每个模型对哪些因素敏感以及它们如何相互作用,我们提出了EvalAwareBench,这是一个包含100个配对安全-能力任务的因子控制基准,其中八个因素中的每一个都可以独立切换,在保持底层请求不变的同时改变评估信号。通过EvalAwareBench,我们发现没有单一因素能统一影响所有模型,但叠加因素会逐步提高所有模型的评估意识。我们的框架和EvalAwareBench提供了度量、归因和缓解评估意识的工具,指出在识别下的行为一致性是一条有前景的前进道路。

英文摘要

Frontier language models sometimes recognize that they are being evaluated and adjust their behavior, undermining validity of benchmark results. Yet the field studies it without a shared foundation, conflating properties of the evaluation with properties of the model, and detection with behavioral response. We ground evaluation awareness in social psychology, decomposing it into an environment component (how recognizable the task is) and a model component that separates recognition from propensity to act on it. We operationalize the environment component through eight categorized trigger factors, such as placeholder entities and grading-style output formats, and study recognition and behavior through chain-of-thought monitoring. Across nine frontier models and four benchmarks, recognition rates depend on the specific pairing of model and benchmark rather than on either in isolation. Recognition rarely leads to behavioral change, and when it does, the direction depends on the type of evaluation perceived. Models are also more sensitive to safety than capability evaluations, placing safety benchmark validity at greater risk. To study which factors each model is sensitive to and how they interact, we propose \textbf{EvalAwareBench}, a factor-controlled benchmark of 100 paired safety-capability tasks where each of the eight factors can be independently toggled, varying evaluative signals while holding the underlying request fixed. Through EvalAwareBench, we find that no single factor uniformly affects all models, but stacking factors progressively raises evaluation awareness across all of them. Our framework and EvalAwareBench provide the tools to measure, attribute, and mitigate evaluation awareness, pointing to behavioral consistency under recognition as a promising path forward.

2605.20402 2026-06-03 cs.LG cs.AI 版本更新

Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

分解 MXFP4 量化误差以用于大语言模型强化学习:可约简的偏差、可恢复的死区以及不可约简的底噪

Xiaocan Li, Shiliang Wu, Zheng Shen

发表机构 * Huawei Canada(华为加拿大)

AI总结 本文通过将 MXFP4 量化误差分解为三个可加分量(尺度偏差、死区截断和网格噪声),并针对每个分量提出针对性修正(宏块缩放、异常值回退和自适应量化噪声),从而在 LLM 强化学习后训练中恢复精度。

AI中文摘要

MXFP4 算术可以显著加速大语言模型(LLM)强化学习(RL)后训练,但量化误差会导致严重的精度下降。现有工作将量化误差视为单一噪声项,忽略了量化误差损害训练的不同机制。我们证明了量化误差的精确三向分解,并展示了每个分量如何主导不同的 RL 训练路径。我们的理论和实证分析将 MXFP4 量化误差分解为三个可加分量:来自 2 的幂次舍入的“尺度偏差”、来自小值归零的“死区截断”以及来自舍入到最近 4 位网格的“网格噪声”。每个分量主导不同的 RL 失效模式:尺度偏差通过反向传播乘法累积,影响梯度精度;死区截断降低 rollout 质量;网格噪声提高策略的熵。我们结合了针对 RL 失效模式但不限于特定分量的修正:宏块缩放以减少尺度偏差,异常值回退恢复死区条目,同时也部分减少尺度偏差引起的误差,以及自适应量化噪声(AQN)用于控制策略熵。在 Qwen2.5-3B 密集模型和 Qwen3-30B-A3B-Base 混合专家模型上,针对性修正分别将 BF16 精度恢复到 0.7% 以内,并超过 BF16 达 +1.0%。

英文摘要

MXFP4 arithmetic can dramatically accelerate reinforcement learning (RL) post-training of large language models (LLMs), yet the quantization error introduces severe accuracy degradation. Existing work treats the quantization error as a monolithic noise term, overlooking the distinct mechanisms by which quantization error damages training. We prove an exact three-way decomposition of quantization error and show how each component dominates a distinct RL training pathway. Our theoretical and empirical analysis decomposes the MXFP4 quantization error into three additive components: "scale bias" from power-of-two rounding, "deadzone truncation" from zeroing small values, and "grid noise" from rounding to the nearest 4-bit grid. Each component dominates a distinct RL failure mode: scale bias accumulates multiplicatively through the backward pass, affecting gradient accuracy; deadzone truncation degrades rollout quality; and grid noise raises the policy's entropy. We combine corrections that are RL failure mode-targeted but not component-exclusive: Macro-block scaling, which reduces scale bias; Outlier Fallback, which recovers deadzone entries and also partially reduces scale-bias-induced error; and Adaptive Quantization Noise (AQN), which controls the policy entropy. On the Qwen2.5-3B dense model and the Qwen3-30B-A3B-Base mixture-of-experts model, the targeted corrections recover BF16 accuracy to within 0.7% and exceed BF16 by +1.0%, respectively.
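
To make the three named components concrete, here is one illustrative way to split the error of a single block so that the parts add up exactly. This is a sketch under assumptions, not the paper's decomposition or proof: the function names, the rule for rounding the shared scale up to a power of two, and the telescoping split via a hypothetical real-valued "ideal" scale are all choices made for this example. Only the FP4 E2M1 magnitude grid is taken as given.

```python
import math

GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # FP4 E2M1 magnitudes

def to_grid(v):
    # Round |v| to the nearest representable magnitude, keep the sign.
    mag = min(GRID, key=lambda g: abs(abs(v) - g))
    return math.copysign(mag, v)

def quantize(block, scale):
    return [scale * to_grid(v / scale) for v in block]

def decompose(block):
    # Assumes the block has at least one non-zero entry.
    real_scale = max(abs(v) for v in block) / GRID[-1]
    pow2_scale = 2.0 ** math.ceil(math.log2(real_scale))  # assumed rounding rule
    q_real = quantize(block, real_scale)   # hypothetical ideal-scale quantizer
    q_pow2 = quantize(block, pow2_scale)   # power-of-two shared scale
    # Scale bias: what changes purely because the scale was rounded.
    scale_bias = [a - b for a, b in zip(q_real, q_pow2)]
    # Deadzone: entries the ideal-scale quantizer already flushes to zero.
    deadzone = [v if q == 0.0 else 0.0 for v, q in zip(block, q_real)]
    # Grid noise: rounding error on the entries that survive.
    grid_noise = [0.0 if q == 0.0 else v - q for v, q in zip(block, q_real)]
    total = [v - q for v, q in zip(block, q_pow2)]
    return scale_bias, deadzone, grid_noise, total

block = [0.03, -0.9, 2.7, 5.0, -0.1, 1.3]
scale_bias, deadzone, grid_noise, total = decompose(block)
residual = max(abs(s + d + g - t)
               for s, d, g, t in zip(scale_bias, deadzone, grid_noise, total))
```

The three lists sum to the total error up to floating-point rounding, which is the sense in which the split is additive here. The two smallest entries land entirely in the deadzone term.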

2604.27147 2026-06-03 cs.LG cs.AI 版本更新

How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

如何引导你的流:通过流图奖励引导实现少步对齐

Jerry Y. Huang, Justin Lin, Sheel Shah, Kartik Nair, Nicholas M. Boffi

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出流图奖励引导(FMRG),一种无训练、单轨迹的框架,利用流图在仅需3次NFE下实现奖励引导生成,速度比现有方法快一个数量级。

AI中文摘要

在生成建模中,我们通常希望生成能够最大化用户指定奖励(如美学质量或与人类偏好对齐)的样本,这一问题被称为\textit{引导}。尽管现有引导方法被广泛使用,但它们要么需要昂贵的多粒子、多步方案,要么依赖于理解不足的近似。我们将引导重新表述为一个\textit{确定性最优控制问题},产生了一个算法层次结构,在最粗略的层次上包含了现有方法。我们表明,\textit{流图}(因其在快速推理中的作用而近期受到广泛关注的对象)在最优解中自然出现。基于这一观察,我们提出\textbf{流图奖励引导(FMRG)}:一种无训练、\textit{单轨迹}框架,利用流图来积分和引导流。在文生图规模上,FMRG在逆问题和奖励引导生成中匹配或超越基线,且\textbf{仅需3次NFE},相比先前最先进方法至少实现一个数量级的加速。代码可在 https://github.com/jrrhuang/fmrg 获取。

英文摘要

In generative modeling, we often wish to produce samples that maximize a user-specified reward such as aesthetic quality or alignment with human preferences, a problem known as \textit{guidance}. Despite their widespread use, existing guidance methods either require expensive multi-particle, many-step schemes or rely on poorly understood approximations. We reformulate guidance as a \textit{deterministic optimal control problem}, yielding a hierarchy of algorithms that subsumes existing approaches at the coarsest level. We show that the \textit{flow map}, an object of significant recent interest for its role in fast inference, arises naturally in the optimal solution. Based on this observation, we propose \textbf{Flow Map Reward Guidance (FMRG)}: a training-free, \textit{single-trajectory} framework that uses the flow map to both integrate and guide the flow. At text-to-image scale, FMRG matches or surpasses baselines across inverse problems and reward-guided generation with \textbf{as few as 3 NFEs}, giving at least an order-of-magnitude speedup in comparison to prior state of the art. Code is available at https://github.com/jrrhuang/fmrg.

2605.20306 2026-06-03 cs.CV cs.LG 版本更新

WildRoadBench: A Wild Aerial Road-Damage Grounding Benchmark for Vision-Language Models and Autonomous Agents

WildRoadBench: 面向视觉语言模型与自主智能体的野外航拍道路损伤定位基准

Bingnan Liu, Chenhang Cui, Rui Huang, Jiani Luo, Zhirong Shen, Tinghao Wang, Xiande Huang, Lingbei Meng, Fei Shen, An Zhang

发表机构 * University of Electronic Science and Technology of China(电子科技大学) National University of Singapore(新加坡国立大学) De Artificial Intelligence Lab(德人工智能实验室) The Chinese University of Hong Kong, Shenzhen(香港中文大学(深圳)) University of Science and Technology of China(中国科学技术大学)

AI总结 提出WildRoadBench基准,通过VLM直接定位和LLM驱动智能体自主研究两种协议,评估模型在航拍道路损伤定位上的性能,发现现有方法在野外场景下仍不可靠。

Comments Preprint. Under review. 4 figures, 6 tables

AI中文摘要

我们介绍了WildRoadBench,一个野外航拍道路损伤定位基准,它在一个专业标注的无人机语料库上,将视觉语言模型的直接视觉定位与LLM驱动的智能体的自主研究与工程相结合。在两种协议下评估相同的图像集和相同的每类AP_50指标。VLM轨道衡量固定VLM是否能在统一的提示、解码和解析流程下,从一张图像和一个简短提示中定位特定领域的损伤。智能体轨道衡量一个自主智能体,在仅给定书面任务简介、少量探索切片和固定交互预算的情况下,能否搜索公共网络、调整预训练组件、编写训练和推理代码,并通过隐藏保留集上的标量反馈预言机提交预测。我们对广泛的闭源前沿模型和开源VLM以及几个前沿LLM驱动的智能体进行了基准测试。在野外环境中,两种途径都远未达到可靠性能:闭源前沿模型在VLM排行榜上领先,但仍留下超过一半的指标未达到;开源定位器远低于它们,且新一代或推理型变体并未持续改进定位;每个开源模型的小目标均崩溃;尽管智能体拥有更丰富的功能,但仍落后于最强的VLM,且有几个未能在预算内提交有效结果。我们在https://anonymous.4open.science/r/wildroadbench-0607发布代码和数据,以支持可重复的后续研究。

英文摘要

We introduce WildRoadBench, a wild aerial road-damage grounding benchmark that couples direct visual grounding by vision-language models with autonomous research-and-engineering by LLM-driven agents on a single professionally annotated UAV corpus. The same image set and the same per-class AP_50 metric are evaluated under two protocols. The VLM Track measures whether a fixed VLM can localise domain-specific damage from one image and one short prompt under a unified prompting, decoding and parsing pipeline. The Agent Track measures whether an autonomous agent, given only a written task brief, a small exploratory slice and a fixed interaction budget, can search the public web, adapt pretrained components, write training and inference code, and submit predictions through a scalar-feedback oracle on a hidden holdout. We benchmark a broad pool of closed-source frontier models and open-source VLMs together with several frontier LLM-driven agents. Both routes remain far from reliable performance in this wild setting: closed-source frontier models lead the VLM leaderboard but still leave more than half of the metric on the table; open-source grounders plateau well below them, and newer generations or reasoning-style variants do not consistently improve grounding; small targets collapse for every open-source model; agents lag the strongest VLM despite richer affordances, and several fail to land a valid submission within the budget. We release the code and data at https://anonymous.4open.science/r/wildroadbench-0607 to support reproducible follow-up research.
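
The AP_50 metric mentioned in this abstract rests on a simple test: a predicted box counts as a hit only when its intersection over union (IoU) with a ground-truth box reaches 0.5. The sketch below shows that test in isolation. The box format `(x1, y1, x2, y2)`, the helper names, and the example coordinates are assumptions for illustration, and the full average-precision computation (ranking by confidence, per-class averaging) is deliberately left out.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def is_hit(pred, truth, threshold=0.5):
    # The "50" in AP_50 is this IoU threshold.
    return iou(pred, truth) >= threshold

truth = (10, 10, 20, 20)     # a small 10x10 target
shifted = (13, 10, 23, 20)   # same size, 3 px off to the right
loose = (5, 5, 25, 25)       # encloses the target but has 4x the area

shifted_hit = is_hit(shifted, truth)
loose_hit = is_hit(loose, truth)
```

For a 10x10 target, a 3-pixel shift still passes, while a box that fully contains the target but is twice as wide and tall fails. This gives some intuition for why small targets are unforgiving under a fixed IoU threshold.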

2605.19805 2026-06-03 cs.LG cs.AI stat.ML 版本更新

Latent Laplace Diffusion for Irregular Multivariate Time Series

潜在拉普拉斯扩散用于不规则多元时间序列

Zinuo You, Jin Zheng, John Cartlidge

发表机构 * University of Cambridge(剑桥大学)

AI总结 提出潜在拉普拉斯扩散(LLapDiff)生成框架,通过低维潜在轨迹和拉普拉斯域参数化实现不规则时间序列的长时预测与缺失值插补。

Comments Accepted as a Spotlight at ICML 2026. The Version of Record will appear in Proceedings of Machine Learning Research (PMLR). 27 pages, 5 figures. Code: https://github.com/pixelhero98/LLapDiffusion

AI中文摘要

不规则多元时间序列对长期预测提出了权衡:离散方法通过重新网格化可能扭曲时间结构,而连续时间模型通常需要容易漂移的序贯求解器。为弥合这一差距,我们提出了潜在拉普拉斯扩散(LLapDiff),一种生成式框架,将目标建模为低维潜在轨迹,从而无需逐步积分物理时间即可实现全范围生成。我们利用受随机端口-哈密顿动力学启发的稳定模态参数化来引导逆向过程,并通过可学习的共轭复极点参数化其在拉普拉斯域中的均值演化,从而能够在不规则时间戳上直接评估。我们还通过更新平均分析将连续动力学与不规则观测联系起来,该分析将采样间隙映射到有效事件域极点,并激发了间隙感知的历史总结器。大量实验表明,LLapDiff在长期预测中优于基线,其连续时间生成性质通过在同一模型的历史时间戳上查询,支持缺失值插补。代码可在https://github.com/pixelhero98/LLapDiffusion获取。

英文摘要

Irregular multivariate time series impose a trade-off for long-horizon forecasting: discrete methods can distort temporal structure via re-gridding, while continuous-time models often require sequential solvers prone to drift. To bridge this gap, we present Latent Laplace Diffusion (LLapDiff), a generative framework that models the target as a low-dimensional latent trajectory, enabling horizon-wide generation without step-by-step integration over physical time. We guide the reverse process utilizing a stable modal parameterization motivated by stochastic port-Hamiltonian dynamics, and parameterize its mean evolution in the Laplace domain via learnable complex-conjugate poles, enabling direct evaluation over irregular timestamps. We also link continuous dynamics to irregular observations through renewal-averaging analysis, which maps sampling gaps to effective event-domain poles and motivates a gap-aware history summarizer. Extensive experiments show that LLapDiff improves over baselines in long-horizon forecasting, and its continuous-time generative nature supports missing-value imputation by querying the same model at historical timestamps. Code is available at https://github.com/pixelhero98/LLapDiffusion.
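
One reason a pole-based parameterization allows direct evaluation at irregular timestamps is that a sum of complex-conjugate pole pairs has a closed-form time response: each pair contributes a damped sinusoid that can be computed at any real-valued time. The sketch below illustrates only that property. The function name `modal_mean`, the pole and residue values, and the timestamps are made up for this example, and how LLapDiff actually learns and uses its poles is not specified in the abstract.

```python
import cmath

def modal_mean(t, poles, residues):
    # A pole p with residue c, together with its conjugate pair,
    # contributes 2 * Re(c * exp(p * t)), which is always real.
    return sum(2.0 * (c * cmath.exp(p * t)).real
               for p, c in zip(poles, residues))

poles = [complex(-0.5, 3.0), complex(-0.1, 0.7)]    # negative real part: decaying modes
residues = [complex(0.4, 0.0), complex(0.3, -0.2)]
timestamps = [0.0, 0.13, 0.9, 2.75, 2.8, 6.0]        # irregular gaps, no grid needed
values = [modal_mean(t, poles, residues) for t in timestamps]
```

Each value is obtained independently of the others, so there is no step-by-step integration between timestamps and no accumulated solver drift in this toy setting.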

2605.18740 2026-06-03 cs.CV cs.AI cs.CL cs.LG 版本更新

Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation

Vision-OPD:通过在线策略自蒸馏学习多模态大语言模型的精细细节

Qianhao Yuan, Jie Lou, Xing Yu, Hongyu Lin, Le Sun, Xianpei Han, Yaojie Lu

发表机构 * Tsinghua University(清华大学)

AI总结 提出Vision-OPD框架,通过在线策略自蒸馏将模型自身的局部区域感知能力迁移到全局图像策略,提升多模态大语言模型对细粒度视觉理解的准确性。

Comments Project page: https://github.com/VisionOPD/Vision-OPD

AI中文摘要

多模态大语言模型(MLLMs)在细粒度视觉理解方面仍然存在困难,答案往往依赖于全图中微小但决定性的证据。我们观察到一种区域到全局的感知差距:当以证据为中心的裁剪图像为条件时,同一MLLM回答细粒度问题的准确率高于以对应全图为条件,这表明许多失败源于难以聚焦于相关证据,而非局部识别能力不足。受此观察启发,我们提出Vision-OPD(视觉在线策略蒸馏),一种区域到全局的自蒸馏框架,将模型自身特权的区域感知迁移到其全图策略。Vision-OPD从同一MLLM实例化两个条件策略:一个以裁剪图像为条件的教师和一个以全图为条件的学生。学生生成在线策略轨迹,Vision-OPD沿这些轨迹最小化教师和学生下一个词元分布之间的词元级差异。这使得模型能够内化视觉放大的好处,而无需外部教师模型、真实标签、奖励验证器或推理时工具使用。在多个细粒度视觉理解基准上的实验表明,Vision-OPD模型在性能上可与更大的开源、闭源以及“思考图像”智能体模型相媲美或更优。

英文摘要

Multimodal Large Language Models (MLLMs) still struggle with fine-grained visual understanding, where answers often depend on small but decisive evidence in the full image. We observe a regional-to-global perception gap: the same MLLM answers fine-grained questions more accurately when conditioned on evidence-centered crops than on the corresponding full images, suggesting that many failures stem from difficulty focusing on relevant evidence rather than from insufficient local recognition ability. Motivated by this observation, we propose Vision-OPD (Vision On-Policy Distillation), a regional-to-global self-distillation framework that transfers the model's own privileged regional perception to its full-image policy. Vision-OPD instantiates two conditional policies from the same MLLM: a crop-conditioned teacher and a full-image-conditioned student. The student generates on-policy rollouts, and Vision-OPD minimizes token-level divergence between the teacher and student next-token distributions along these rollouts. This enables the model to internalize the benefit of visual zooming without external teacher models, ground-truth labels, reward verifiers, or inference-time tool use. Experiments on multiple fine-grained visual understanding benchmarks show that Vision-OPD models achieve competitive or superior performance against much larger open-source, closed-source, and "Thinking-with-Images" agentic models. The code is available at https://github.com/VisionOPD/Vision-OPD
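
The token-level objective described above can be sketched as an average divergence between two next-token distributions at each position of a student rollout. The abstract does not say which divergence is used, so the choice of KL(teacher || student), the function names, and the toy logits below are assumptions for illustration only.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def rollout_divergence(teacher_logits, student_logits):
    """Mean per-token KL(teacher || student) along one student rollout."""
    per_token = [kl(softmax(t), softmax(s))
                 for t, s in zip(teacher_logits, student_logits)]
    return sum(per_token) / len(per_token)

# Toy logits over a 3-token vocabulary at two positions of a rollout.
teacher = [[2.0, 0.1, 0.1], [0.2, 1.8, 0.0]]   # crop-conditioned
student = [[0.9, 0.8, 0.1], [0.2, 1.8, 0.0]]   # full-image-conditioned
loss = rollout_divergence(teacher, student)
self_loss = rollout_divergence(teacher, teacher)
```

The loss is positive only where the two conditionings disagree (the first position here) and vanishes when the student already matches the teacher, so no labels or reward model enter the computation.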

2605.19262 2026-06-03 cs.LG cs.CR 版本更新

Backdooring Masked Diffusion Language Models

掩码扩散语言模型的后门攻击

Daniel Yiming Cao, Chengzhong Wang, Sheng-Yen Chou, Chengyu Huang, Pin-Yu Chen, Shengwei An

发表机构 * Cornell University(康奈尔大学) Virginia Tech(弗吉尼亚理工学院) IBM Research(IBM研究院)

AI总结 提出SHADOWMASK后门攻击方法,通过修改掩码扩散语言模型的前向破坏过程,实现高成功率攻击并保持模型清洁性能。

AI中文摘要

掩码扩散语言模型(MDLM)正成为文本生成的一种引人注目的新范式,但其训练时安全性仍基本未被探索。现有的针对高斯扩散模型或自回归语言模型的后门攻击并不直接适用于MDLM,因为MDLM依赖于离散状态破坏和迭代去噪,而非连续噪声或从左到右预测。在这项工作中,我们首次对MDLM的训练时后门攻击进行了系统研究。我们提出了SHADOWMASK,一种通过将标准全掩码终端分布替换为触发-掩码混合先验来修改MDLM前向破坏过程的后门攻击。这创建了一条从触发破坏状态到攻击者指定目标的专用去噪路径,同时保持清洁去噪行为。我们进一步通过定义后门前向过程、推导反向时间后验以及获得连续时间训练目标,提供了原理性的数学表述。在基于DiT的MDLM和LLaDA-8B-Instruct上,针对WikiText-103、OpenWebText和Alpaca的评估表明,SHADOWMASK实现了接近100%的攻击成功率,显著优于标准数据投毒,在很大程度上保持了清洁效用,在全模型和参数高效微调下仍然有效,并且对代表性防御具有鲁棒性。

英文摘要

Masked diffusion language models (MDLMs) are emerging as a compelling new paradigm for text generation, but their training-time security remains largely unexplored. Existing backdoor attacks on Gaussian diffusion models or autoregressive language models do not directly apply to MDLMs because MDLMs rely on discrete state corruption and iterative denoising rather than continuous noising or left-to-right prediction. In this work, we present the first systematic study of training-time backdoor attacks on MDLMs. We propose SHADOWMASK, a backdoor attack that modifies the MDLM forward corruption process by replacing the standard all-mask terminal distribution with a trigger-mask mixture prior. This creates a dedicated denoising pathway from trigger-corrupted states to attacker-specified targets while preserving clean denoising behavior. We further provide a principled mathematical formulation by defining the backdoored forward process, deriving the reverse-time posterior, and obtaining the continuous-time training objective. Evaluations on DiT-based MDLM and LLaDA-8B-Instruct across WikiText-103, OpenWebText, and Alpaca show that SHADOWMASK achieves near-100% attack success, substantially outperforms standard data poisoning, largely preserves clean utility, remains effective under full-model and parameter-efficient fine-tuning, and is robust against representative defenses.

2512.18552 2026-06-03 cs.SE cs.AI cs.CL cs.LG 版本更新

Toward Training Superintelligent Software Agents through Self-Play SWE-RL

通过自我对弈SWE-RL训练超级智能软件代理

Yuxiang Wei, Zhiqing Sun, Emily McMilin, Jonas Gehring, David Zhang, Gabriel Synnaeve, Daniel Fried, Lingming Zhang, Sida Wang

发表机构 * Meta FAIR University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校) Meta TBD Lab(Meta TBD 实验室) Carnegie Mellon University(卡内基梅隆大学)

AI总结 提出自我对弈SWE-RL(SSR)方法,通过强化学习在自对弈环境中训练单一LLM代理,使其在无需人工标注问题或测试的情况下,在真实代码库中迭代注入和修复软件缺陷,在SWE-bench基准上实现显著自我改进并超越人类数据基线。

Comments Accepted to ICML 2026

AI中文摘要

尽管当前由大型语言模型(LLM)和智能体强化学习(RL)驱动的软件代理能够提高程序员的生产力,但其训练数据(例如GitHub问题和拉取请求)和环境(例如通过-通过和失败-通过测试)严重依赖人类知识或整理,这构成了通向超级智能的根本障碍。在本文中,我们提出了自我对弈SWE-RL(SSR),这是迈向超级智能软件代理训练范式的第一步。我们的方法仅需最小的数据假设,只需访问带有源代码和已安装依赖项的沙盒化仓库,无需人工标注的问题或测试。基于这些真实世界的代码库,单个LLM代理通过强化学习在自我对弈环境中进行训练,以迭代地注入和修复复杂度逐渐增加的软件缺陷,每个缺陷由测试补丁而非自然语言问题描述正式指定。在SWE-bench Verified和SWE-Bench Pro基准上,SSR实现了显著的自我改进(分别提升+10.4和+7.8分),并在整个训练轨迹中持续优于人类数据基线,尽管其评估的是自我对弈中未出现的自然语言问题。我们的结果虽然尚处于早期阶段,但表明了一条路径,即代理可以从真实软件仓库中自主收集广泛的学习经验,最终实现超越人类能力的超级智能系统,在理解系统构建方式、解决新挑战以及从头开始自主创建新软件方面超越人类。

英文摘要

While current software agents powered by large language models (LLMs) and agentic reinforcement learning (RL) can boost programmer productivity, their training data (e.g., GitHub issues and pull requests) and environments (e.g., pass-to-pass and fail-to-pass tests) heavily depend on human knowledge or curation, posing a fundamental barrier to superintelligence. In this paper, we present Self-play SWE-RL (SSR), a first step toward training paradigms for superintelligent software agents. Our approach makes minimal data assumptions, only requiring access to sandboxed repositories with source code and installed dependencies, with no need for human-labeled issues or tests. Grounded in these real-world codebases, a single LLM agent is trained via reinforcement learning in a self-play setting to iteratively inject and repair software bugs of increasing complexity, with each bug formally specified by a test patch rather than a natural language issue description. On the SWE-bench Verified and SWE-Bench Pro benchmarks, SSR achieves notable self-improvement (+10.4 and +7.8 points, respectively) and consistently outperforms the human-data baseline over the entire training trajectory, despite being evaluated on natural language issues absent from self-play. Our results, albeit early, suggest a path where agents autonomously gather extensive learning experiences from real-world software repositories, ultimately enabling superintelligent systems that exceed human capabilities in understanding how systems are constructed, solving novel challenges, and autonomously creating new software from scratch.

2605.18629 2026-06-03 cs.LG 版本更新

Aligned Training: A Parameter-Free Method to Improve Feature Quality and Stability of Sparse Autoencoders (SAE)

对齐训练:一种无参数方法提升稀疏自编码器(SAE)的特征质量与稳定性

Michał Brzozowski, Neo Christopher Chung

发表机构 * Samsung AI Center(三星人工智能中心) University of Warsaw(华沙大学)

AI总结 提出无参数的对齐训练方法,通过强制编码器与解码器方向内积为1的几何约束,同时提升稀疏自编码器的重建质量、消除死特征并增强训练稳定性。

AI中文摘要

稀疏自编码器(SAE)是解释深度神经网络(DNN)内部工作机制的主要方法之一,将激活分解为更高维的特征。然而,它们存在关键缺陷:大量特征从未被激活且不稳定。尽管SAE的变体试图缓解这些问题,但它们需要额外的数据、重采样或训练。我们提出了\textbf{对齐训练},一种无参数的SAE重参数化方法,同时提升重建质量、消除死特征,并显著增强跨训练种子的稳定性。我们的方法源于一个被忽视的观察:SAE特征质量(通过编码器和解码器方向之间的内积衡量,我们称之为\textbf{对齐分数})在所有现代架构中呈现双峰分布。所提出的对齐训练在编码器和解码器之间施加几何约束,使得每个特征的内积等于1,从而在不增加任何超参数的情况下消除了SAE训练中的一个退化来源。在多个模型、字典大小和稀疏度水平上,对齐训练在SAEBench基准测试中显示出帕累托改进。除了改善死特征、稳定性和重建外,我们的方法可以轻松集成到机制可解释性技术中,例如Top/BatchTop-K架构和p退火。总体而言,对齐训练在不增加计算复杂度或成本的情况下显著提升了SAE的特征质量和稳定性。

英文摘要

Sparse autoencoders (SAEs) are one of the main methods to interpret the inner workings of deep neural networks (DNNs), decomposing activations into higher-dimensional features. However, they exhibit critical shortcomings: a large fraction of features are never activated, and features are unstable. Although variants of SAEs attempt to mitigate these issues, they require additional data, resampling, or training. We propose \textbf{aligned training}, a parameter-free reparameterization of SAEs that simultaneously improves reconstruction quality, eliminates dead features, and significantly enhances stability across training seeds. Our approach is motivated by an overlooked observation that SAE feature quality, measured by the inner product between encoder and decoder directions (which we call the \textbf{alignment score}), follows a bimodal distribution across all modern architectures. The proposed aligned training enforces a geometric constraint between the encoder and decoder such that their inner product equals one for every feature, which removes a source of degeneracy in SAE training without adding any hyperparameters. Across multiple models, dictionary sizes, and sparsity levels, aligned training shows Pareto improvements on the SAEBench benchmarks. Beyond improving dead features, stability, and reconstruction, our method readily integrates with techniques in mechanistic interpretability such as Top/BatchTop-K architectures and p-Annealing. Overall, aligned training substantially improves the feature quality and stability of SAEs without added computational complexity or cost.
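
The geometric constraint in this abstract, an encoder-decoder inner product of one for every feature, can be illustrated with a simple rescaling. This is a sketch under assumptions: the paper describes a reparameterization applied during training, whereas the hypothetical `align` helper below just rescales given encoder directions after the fact, and the toy weight rows are made up.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def align(encoder_rows, decoder_rows):
    """Rescale each encoder direction so that <encoder_i, decoder_i> == 1."""
    aligned = []
    for e, d in zip(encoder_rows, decoder_rows):
        s = dot(e, d)
        if s <= 0.0:
            # A non-positive score cannot be fixed by positive rescaling alone.
            raise ValueError("direction pair has non-positive inner product")
        aligned.append([x / s for x in e])
    return aligned

encoder = [[0.5, 0.5, 0.0], [0.0, 2.0, 1.0]]   # one row per feature
decoder = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]   # unit-norm decoder directions
before = [dot(e, d) for e, d in zip(encoder, decoder)]
aligned = align(encoder, decoder)
scores = [dot(e, d) for e, d in zip(aligned, decoder)]
```

Before rescaling the two features have alignment scores of 0.5 and 2.0; afterwards both are one. No extra hyperparameter is involved, which is consistent with the parameter-free claim, though the real method may enforce the constraint differently.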

2605.18106 2026-06-03 math.OC cs.AI cs.LG stat.ML 版本更新

Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers

优化器设计的对称性兼容原理:嵌入、LM头、SwiGLU MLP和MoE路由器

Tim Tsz-Kit Lau, Weijie Su

发表机构 * University of Pennsylvania(宾夕法尼亚大学) Wharton School(沃顿商学院)

AI总结 针对现代神经网络参数空间的对称性与坐标级优化器之间的几何不匹配,提出对称性兼容的优化器设计原则,并针对嵌入矩阵、LM头、SwiGLU MLP投影和MoE路由器等特殊参数块导出相应更新规则,实验证明其改善验证损失、负载平衡和训练稳定性。

AI中文摘要

深度学习实践中长期存在一种显著的几何差异。现代神经网络架构自然展现出丰富的对称性和等变性,而流行的优化器如Adam及其变体本质上是坐标级的,无法尊重参数空间的等变结构。我们通过引入优化器设计的对称性兼容原则来解决这一差异:梯度更新规则应在作用于相应权重块的对称群下等变。遵循这一原则,我们首先为一般矩阵层提供了双正交等变更新的统一视角,如随机谱下降、Muon、Scion和极梯度方法所采用的。更重要的是,通过从正交群转向置换和共享移位对称性,我们为参数块(其对称性与一般矩阵层不同)推导了对称性兼容的优化器:嵌入和LM头矩阵、SwiGLU MLP投影以及MoE路由器矩阵。这些构造包括单边谱、行范数、混合行范数/谱、行感知、列感知、中心行范数和左谱更新。它们产生了一个端到端的逐层优化器堆栈,其中每个主要的矩阵值参数类被分配一个更新,其等变性与其对称群匹配。我们通过在密集和稀疏MoE语言模型上的预训练实验验证了这一原则,包括Qwen3-0.6B风格、Gemma 3 1B风格、OLMoE-1B-7B风格和缩小版gpt-oss架构。在这些实验中,对称性兼容的更新规则一致地改善了最终验证损失,减少了稀疏MoE模型中的负载不平衡,并在若干情况下比相应的AdamW更新提高了训练稳定性。

英文摘要

A striking geometric disparity has long persisted in the practice of deep learning. While modern neural network architectures naturally exhibit rich symmetry and equivariance properties, popular optimizers such as Adam and its variants operate inherently coordinate-wise, rendering them unable to respect the equivariance structures of the parameter space. We address this disparity by introducing a symmetry-compatible principle for optimizer design: the gradient update rule should be equivariant under the symmetry group acting on the corresponding weight block. Following this principle, we first provide a unified perspective on bi-orthogonally equivariant updates for general matrix layers, as employed by stochastic spectral descent, Muon, Scion, and polar gradient methods. More importantly, by moving from orthogonal groups to permutation and shared-shift symmetries, we derive symmetry-compatible optimizers for parameter blocks whose symmetries differ from those of general matrix layers: embedding and LM head matrices, SwiGLU MLP projections, and MoE router matrices. These constructions include one-sided spectral, row-norm, hybrid row-norm/spectral, row-aware, column-aware, centered row-norm, and left-spectral updates. They yield an end-to-end layerwise optimizer stack in which each major matrix-valued parameter class is assigned an update whose equivariance matches its symmetry group. We corroborate this principle through pre-training experiments on dense and sparse MoE language models, including Qwen3-0.6B-style, Gemma 3 1B-style, OLMoE-1B-7B-style, and downsized gpt-oss architectures. Across these experiments, symmetry-compatible update rules consistently improve final validation loss, reduce load imbalance in sparse MoE models, and in several cases improve training stability over the corresponding AdamW updates.
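The equivariance requirement can be checked numerically on a toy gradient. The sketch below assumes that a "row-norm" update normalizes each gradient row to unit Euclidean norm, and uses a plain sign update as a stand-in for a coordinate-wise rule; both are illustrative assumptions, and the abstract does not say which symmetry group the paper pairs with which parameter block. An update `U` is equivariant under right multiplication by an orthogonal `Q` when `U(G Q) = U(G) Q`.

```python
import math

def row_normalize(G, eps=1e-12):
    """Scale every row of G to unit Euclidean norm (assumed row-norm update)."""
    out = []
    for row in G:
        n = math.sqrt(sum(x * x for x in row))
        out.append([x / (n + eps) for x in row])
    return out

def sign_update(G):
    """Coordinate-wise sign of the gradient (stand-in for a coordinate-wise rule)."""
    return [[math.copysign(1.0, x) if x != 0 else 0.0 for x in row] for row in G]

def matmul(A, B):
    """Plain matrix product for small list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

theta = 0.3  # arbitrary rotation angle
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
G = [[3.0, 4.0], [1.0, -2.0]]  # toy gradient block

# Gap between "transform then update" and "update then transform".
row_gap = max_abs_diff(row_normalize(matmul(G, Q)), matmul(row_normalize(G), Q))
sign_gap = max_abs_diff(sign_update(matmul(G, Q)), matmul(sign_update(G), Q))
print(row_gap, sign_gap)
```

The row-normalized update commutes with the rotation up to floating-point error, because rotating a row does not change its norm, while the sign update does not commute.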

2605.17866 2026-06-03 cs.LG 版本更新

DAD4TS: Data-Augmentation-Oriented Diffusion Model for Time-Series Forecasting with Small-Scale Data

DAD4TS:面向小规模数据的时间序列预测的数据增强扩散模型

Masahiro Suzuki, Bohui Xia, Hiroto Yamamoto, Masanori Miyahara

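Affiliations * Sony Group Corporation

AI summary To address the difficulty of generating meaningful augmented data when forecasting from small-scale time series, the paper proposes DAD4TS, a method based on diffusion models and reinforcement learning that trains the diffusion model via projection into a geometric space; its effectiveness is validated across multiple datasets and models.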

详情
AI中文摘要

小规模数据是时间序列预测任务中的一个关键问题。数据增强是解决该问题的有效策略,但在生成有意义的数据方面存在局限性。为了解决这一局限性,我们提出了DAD4TS,一种基于扩散模型并结合强化学习的数据增强方法,专为小规模数据的时间序列预测设计。在DAD4TS中,数据生成器与时间序列模型同时训练,并由强化学习模型控制,以高效生成能够提高时间序列模型预测准确性的样本。为了支持小规模数据,我们使用数学方法代替传统的VAE方法,通过将时间序列数据投影到几何空间来训练扩散模型。我们通过定性和定量实验,在六个真实世界数据集和八个时间序列模型上,使用七种对比方法验证了DAD4TS的有效性。结果表明,DAD4TS在五个数据集上得到了验证。

英文摘要

Small-scale data is a critical problem in time-series forecasting tasks. Data augmentation is an effective strategy for this task, but it has a limitation in generating meaningful data. To address this limitation, we propose DAD4TS, a diffusion-model-based data augmentation method with reinforcement learning, designed for time-series forecasting with small-scale data. In DAD4TS, a data generator is simultaneously trained with a time-series model and controlled by a reinforcement learning model to efficiently generate samples that improve the forecast accuracy of the time-series model. To support small-scale data, we use mathematical methods instead of conventional VAE methods to train the diffusion model by projecting the time-series data into the geometric space. We validated the effectiveness of DAD4TS with seven comparative methods through qualitative and quantitative experiments on six real-world datasets and eight time-series models. As a result, DAD4TS was validated on five datasets.

2605.17219 2026-06-03 cs.CR cs.AI cs.LG cs.NI eess.SP 版本更新

Integration of AI in Cybersecurity: Current Trends with a Focused Look at Intrusion Detection Applications

AI在网络安全中的集成:当前趋势及入侵检测应用的聚焦分析

S. Tazili, A. Mansour, M. Y. Chkouri

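Affiliations * SIGL Laboratory, ENSATE, Abdelmalek Essaâdi University, Tetouan, Morocco

AI summary This paper reviews current trends in AI-based cybersecurity, focusing on intrusion detection methods, and draws insights by comparing AI techniques and reported performance metrics.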

Comments Accepted at AI2SD 2025. Forthcoming in Springer Lecture Notes in Networks and Systems (2026). Please cite this preprint as indicated in the paper!

详情
Journal ref
https://conferences.academyskills.net/ai2sd/2025/PapersManagement/all.php#:~:text=643174
AI中文摘要

人工智能(AI)如今被广泛采用,因其能够检测模式、自动化任务并减少各种应用中的时间和成本。AI与网络安全的整合引起了广泛关注,特别是在入侵检测、恶意软件分析以及钓鱼或垃圾邮件检测等领域。随着AI和网络安全的发展,新的方法和途径不断涌现。当前趋势包括使用生成式AI、自然语言处理、用于隐私保护协作训练的联邦学习以及可解释AI以确保可解释性和信任,这些在网络安全中至关重要。本文对当前基于AI的网络安全趋势进行了有趣的综述,重点聚焦入侵检测方法,旨在通过基于所采用的AI技术和报告性能的比较分析,揭示有意义的见解。

英文摘要

Artificial Intelligence (AI) is widely adopted today for its ability to detect patterns, automate tasks, and reduce time and cost across various applications. Its integration into Cybersecurity has garnered significant attention, particularly in areas such as intrusion detection, malware analysis, and phishing or spam detection. As AI and cybersecurity evolve, new methods and approaches emerge regularly. Current trends include the use of Generative AI, Natural Language Processing, Federated Learning for privacy-preserving collaborative training, and eXplainable AI to ensure interpretability and trust, which are vital in cybersecurity. This paper presents an interesting review of current AI-based cybersecurity trends, focusing on intrusion detection approaches and aiming to uncover meaningful insights through comparative analysis based on the employed AI techniques and reported performance.

2605.15806 2026-06-03 cs.LG 版本更新

Martingale Neural Operators: Learning Stochastic Marginals via Doob-Meyer Factorization

鞅神经算子:通过Doob-Meyer分解学习随机边际分布

Kai Hidajat

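AI summary Proposes the Martingale Neural Operator (MNO), which uses the Doob-Meyer theorem to decompose the marginal distributions of stochastic PDEs into a predictable drift and a martingale part and directly predicts the conditional mean and covariance, substantially reducing Wasserstein distance and improving efficiency across a range of tasks.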

详情
AI中文摘要

神经算子作为确定性代理表现出色,但在应用于随机偏微分方程时不可避免地坍缩到条件均值,丢弃了不确定性量化所依赖的方差和尾部结构。恢复这种结构通常需要蒙特卡洛滚动或嫁接的生成模型,两者都放弃了定义算子范式的单次效率和分辨率不变性。为解决此问题,我们借鉴Doob-Meyer定理,该定理确立了任何半鞅从根本上分解为一个可预测漂移和一个不可预测的零均值鞅。将该定理转化为架构先验,我们引入了鞅神经算子(MNO)。MNO将初始条件直接映射到终端律的条件均值和协方差,参数化为类似漂移的均值和低秩因子$B_ϕ$,其中$B_ϕ^\top B_ϕ$通过构造是半正定的。在我们的实验中,我们使用高斯残差实例化。在一维SPDE、粗糙波动率和二维算子任务中,MNO在$ϕ^4$场论上将Wasserstein距离减少高达$120$倍,在随机Burgers方程上减少$68$倍,在匹配的壁钟训练预算下,评估速度比条件扩散基线快约$3$倍。在二维任务上,MNO在零样本分辨率转移和湍流方面与FNO相当,而准确定性系统(如Gray-Scott)仍然是失败模式。

英文摘要

Neural operators excel as deterministic surrogates, but inevitably collapse to the conditional mean when applied to stochastic PDEs, discarding the variance and tail structure upon which uncertainty quantification depends. Recovering this structure typically requires Monte Carlo rollouts or grafted generative models, both of which surrender the one-shot efficiency and resolution invariance that define the operator paradigm. To resolve this, we draw on the Doob-Meyer theorem, which establishes that any semimartingale fundamentally decomposes into a predictable drift and an unpredictable, zero-mean martingale. Translating this theorem into an architectural prior, we introduce the Martingale Neural Operator (MNO). MNO maps an initial condition directly to the conditional mean and covariance of the terminal law, parameterized by a drift-like mean and a low-rank factor $B_ϕ$ with $B_ϕ^\top B_ϕ$ positive semi-definite by construction. For our experiments, we use a Gaussian residual instantiation. Across 1D SPDEs, rough volatility, and 2D operator tasks, MNO reduces Wasserstein distance by up to $120\times$ on $ϕ^4$ field theory and $68\times$ on stochastic Burgers, evaluating $\sim 3\times$ faster than a conditional diffusion baseline at matched wall-clock training budgets. On 2D tasks, MNO is comparable to FNO on zero-shot resolution transfer and turbulent flow, while quasi-deterministic systems such as Gray-Scott remain a failure mode.
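The phrase "positive semi-definite by construction" follows from a one-line identity, shown here together with one way to write the Gaussian residual instantiation. The symbols $u_0$, $u_T$, and $\mu_\phi$ are notational assumptions introduced for this sketch; the abstract only names the low-rank factor $B_\phi$.

```latex
v^\top \left(B_\phi^\top B_\phi\right) v
  \;=\; \left(B_\phi v\right)^\top \left(B_\phi v\right)
  \;=\; \lVert B_\phi v \rVert_2^2 \;\ge\; 0
  \quad \text{for all } v,
\qquad
u_T \mid u_0 \;\sim\; \mathcal{N}\!\left(\mu_\phi(u_0),\; B_\phi(u_0)^\top B_\phi(u_0)\right).
```

If $B_\phi$ has $r$ rows, the covariance has rank at most $r$, which is what makes the factor low-rank.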

2604.22891 2026-06-03 cs.LG cs.AI cs.CL 版本更新

Quantifying and Mitigating Self-Preference Bias of LLM Judges

量化与缓解LLM评判者的自我偏好偏差

Jinming Yang, Zheng Hu, Chuxian Qiu, Zhenyu Deng, Xinshan Jiao, Tao Zhou

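Affiliations * CompleX Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China

AI summary Proposes an automated framework for quantifying the self-preference bias of LLMs and reduces the bias by 31.5% on average through a multi-dimensional evaluation strategy based on cognitive load decomposition.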

详情
AI中文摘要

LLM-as-a-Judge已成为自动评估系统中的主导方法,在模型对齐、排行榜构建、质量控制等方面发挥关键作用。然而,该方法的可扩展性和可信度可能因自我偏好偏差(SPB)而严重失真,SPB是一种定向评估偏差,即LLM在评估时系统性地偏好或排斥自身生成的输出。现有测量方法依赖昂贵的人工标注,并将生成能力与评估立场混为一谈,因此不适用于实际系统中的大规模部署。为解决此问题,我们引入了一个完全自动化的框架来量化和缓解SPB,该框架构建质量差异可忽略的等质量回答对,从而在无需人工黄金标准的情况下,从偏差倾向中统计分离出可区分性。对20个主流LLM的实证分析表明,先进能力通常与低SPB不相关,甚至负相关。为缓解此偏差,我们提出了一种基于认知负荷分解的结构化多维评估策略,平均降低SPB 31.5%。

英文摘要

LLM-as-a-Judge has become a dominant approach in automated evaluation systems, playing critical roles in model alignment, leaderboard construction, quality control, and so on. However, the scalability and trustworthiness of this approach can be substantially distorted by Self-Preference Bias (SPB), which is a directional evaluative deviation in which LLMs systematically favor or disfavor their own generated outputs during evaluation. Existing measurements rely on costly human annotations and conflate generative capability with evaluative stance, and thus are impractical for large-scale deployment in real-world systems. To address this issue, we introduce a fully automated framework for quantifying and mitigating SPB, which constructs equal-quality pairs of responses with negligible quality differences, enabling statistical disentanglement of discriminability from bias propensity without human gold standards. Empirical analysis across 20 mainstream LLMs reveals that advanced capabilities are often uncorrelated, or even negatively correlated, with low SPB. To mitigate this bias, we propose a structured multi-dimensional evaluation strategy grounded in cognitive load decomposition, which reduces SPB by 31.5% on average.

2605.08935 2026-06-03 cs.AI cs.LG 版本更新

PnP-Corrector: A Universal Correction Framework for Coupled Spatiotemporal Forecasting

PnP-Corrector:一种用于耦合时空预测的通用校正框架

Hao Wu, Fan Xu, Yuxu Lu, Penghao Zhao, Fan Zhang, Hao Jia, Yuxuan Liang, Ruijian Gou, Qingsong Wen, Xian Wu, Xiaomeng Huang, Yuan Gao

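Affiliations * University of Science and Technology of China; Tsinghua University

AI summary To address the collapse of long-range forecasts caused by mutual error amplification in coupled systems, the paper proposes PnP-Corrector, a plug-and-play correction framework that freezes the physics simulation engines and trains a correction agent to proactively counteract systematic biases, markedly improving the stability and accuracy of long-term forecasts.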

详情
AI中文摘要

耦合时空预测对于预测多个相互作用动力系统的未来演化(如气候模型)非常重要。然而,现有方法受到复合误差这一持续瓶颈的严重限制。在耦合系统中,每个子系统模拟器的误差会相互传播和放大,我们将这种现象称为互惠误差放大,导致长期预测迅速崩溃。为了应对这一挑战,我们提出了一种通用框架,称为PnP-Corrector(即插即用校正器)。我们框架的核心思想是将物理模拟与误差校正过程解耦:它冻结预训练的物理模拟引擎,并专门训练一个校正代理,以主动抵消耦合系统中出现的系统偏差。此外,我们设计了一种高效的预测模型架构DSLCast,作为该框架的主干。大量实验表明,我们的方法显著增强了耦合预测系统的长期稳定性和准确性。例如,在300天的全球海洋-大气耦合预测这一具有挑战性的任务中,我们的PnP-Corrector框架将基线模型的预测误差降低了28%,并在多个关键指标上超越了最先进的模型。

英文摘要

Coupled spatiotemporal forecasting is important for predicting the future evolution of multiple interacting dynamical systems, such as in climate models. However, existing methods are severely constrained by the persistent bottleneck of compounding errors. In coupled systems, errors from each subsystem simulator propagate and amplify one another, a phenomenon we term Reciprocal Error Amplification, leading to a rapid collapse of long-range predictions. To address this challenge, we propose a universal framework called PnP-Corrector (Plug-and-Play Corrector). The core idea of our framework is to decouple the physical simulation from the error correction process: it freezes pre-trained physics simulation engines and exclusively trains a correction agent to proactively counteract the systematic biases emerging from the coupled system. Furthermore, we design an efficient predictive model architecture, DSLCast, to serve as the backbone of this framework. Extensive experiments demonstrate that our method significantly enhances the long-term stability and accuracy of coupled forecasting systems. For instance, in the challenging task of a 300-day global ocean-atmosphere coupled forecast, our PnP-Corrector framework reduces the prediction error of the baseline model by 28% and surpasses state-of-the-art models on several key metrics.

2605.11607 2026-06-03 stat.ML cs.AI cs.LG 版本更新

Exact Stiefel Optimization for Probabilistic PLS: Closed-Form Updates, Error Bounds, and Calibrated Uncertainty

概率PLS的精确Stiefel优化:闭式更新、误差界与校准不确定性

Haoran Hu, Xingce Wang

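Affiliations * School of Artificial Intelligence, Beijing Normal University

AI summary Proposes a probabilistic partial least squares framework based on exact optimization on the Stiefel manifold that combines noise pre-estimation, constrained likelihood optimization, and prediction calibration to deliver closed-form updates, error bounds, and calibrated uncertainty.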

详情
AI中文摘要

概率偏最小二乘(PPLS)是一种基于似然的核心双视图模型,适用于需要可解释潜在因子和校准不确定性的场景。基于Bouhaddani等人(2018)的可识别参数化,现有拟合流程仍面临两个实际瓶颈:联合EM/ECM更新下的噪声-信号耦合以及正交约束的非平凡处理。遵循固定噪声标量似然协议,我们开发了一个端到端框架,将噪声预估计、约束似然优化和预测校准整合到一条流水线中。我们从低特征值噪声子空间估计观测噪声,并通过精确的Stiefel流形优化强制执行正交性。噪声子空间估计器实现了与信号强度无关的前沿有限样本率,并匹配极小极大下界,而全谱噪声估计器在同一模型下携带确定性偏差。我们通过可选的高斯化将框架扩展到次高斯设置,并通过块结构Fisher分析提供闭式标准误差。在合成高噪声设置和两个多组学基准(TCGA-BRCA和PBMC CITE-seq)上,该方法无需事后重新校准即可实现接近名义覆盖,在TCGA-BRCA上秩$r=3$时达到Ridge级点精度,在跨视图预测上匹配或超过PO2PLS,同时提供原生校准不确定性,并提高参数恢复的稳定性。

英文摘要

Probabilistic partial least squares (PPLS) is a central likelihood-based model for two-view learning when one needs both interpretable latent factors and calibrated uncertainty. Building on the identifiable parameterization of Bouhaddani et al. (2018), existing fitting pipelines still face two practical bottlenecks: noise-signal coupling under joint EM/ECM updates and nontrivial handling of orthogonality constraints. Following the fixed-noise scalar-likelihood protocol, we develop an end-to-end framework that combines noise pre-estimation, constrained likelihood optimization, and prediction calibration in one pipeline. We estimate the observation noise from the low-eigenvalue noise subspace and enforce orthogonality through exact Stiefel-manifold optimization. The noise-subspace estimator attains a signal-strength-independent leading finite-sample rate and matches a minimax lower bound, whereas a full-spectrum noise estimator carries a deterministic bias under the same model. We further extend the framework to sub-Gaussian settings via optional Gaussianization and provide closed-form standard errors through a block-structured Fisher analysis. Across synthetic high-noise settings and two multi-omics benchmarks (TCGA-BRCA and PBMC CITE-seq), the method achieves near-nominal coverage without post-hoc recalibration, reaches Ridge-level point accuracy on TCGA-BRCA at rank $r=3$, matches or exceeds PO2PLS on cross-view prediction while providing native calibrated uncertainty, and improves stability of parameter recovery.

2605.11170 2026-06-03 cs.LG cs.CR 版本更新

Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data

非对称源下的反学习:利用公共数据改进反学习-效用权衡

Ahmed Mehdi Inane, Vincent Quirion, Gintare Karolina Dziugaite, Ioannis Mitliagkas

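Affiliations * University of Waterloo

AI summary Proposes the Asymmetric Langevin Unlearning (ALU) framework, which injects public data to reduce the noise cost of certified unlearning by a factor of O(1/n_pub^2) while maintaining high utility under distribution shift.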

详情
AI中文摘要

基于噪声的认证机器反学习目前面临一个硬性上限:认证反学习所需的噪声幅度通常会破坏模型效用,特别是在大规模删除请求的情况下。虽然利用公共数据是差分隐私中缓解这一紧张关系的标准技术,但其在反学习中的作用尚未被探索。我们通过引入非对称朗之万反学习(ALU)框架来填补这一空白,该框架利用公共数据来降低隐私成本。我们证明,公共数据注入将反学习成本抑制了O(1/n_pub^2)倍,保证了相对于重新训练的计算优势。这建立了一种新的控制机制:从业者可以通过增加公共数据量来缓解对高噪声及其相关效用损失的需求。关键的是,我们分析了分布偏移的现实场景,明确刻画了公共和私有源之间的偏移如何影响效用。我们表明,ALU能够实现对恒定数据集部分的大规模反学习——在这种机制下,标准对称方法变得不切实际——同时保持高效用。使用变分Rényi散度和成员推理攻击的实验评估证实,在合理的分布偏移下,ALU能有效阻止隐私攻击,同时保持效用。

英文摘要

Noise-based certified machine unlearning currently faces a hard ceiling: the noise magnitude required to certify unlearning typically destroys model utility, particularly for large-scale deletion requests. While leveraging public data is a standard technique in differential privacy to relax this tension, its role in unlearning remains unexplored. We address this gap by introducing Asymmetric Langevin Unlearning (ALU), a framework that uses public data to mitigate privacy costs. We prove that public data injection suppresses the unlearning cost by a factor of $O(1/n_{\mathrm{pub}}^2)$, guaranteeing a strict computational advantage over retraining. This establishes a new control mechanism: practitioners can mitigate the need for high noise -- and the associated utility loss -- by increasing the volume of public data. Crucially, we analyze the realistic setting of distribution mismatch, explicitly characterizing how shifts between public and private sources impact utility. We show that ALU enables mass unlearning of constant dataset fractions -- a regime where standard symmetric methods become impractical -- while maintaining high utility. Empirical evaluations using variational Rényi divergence and membership inference attacks confirm that ALU effectively thwarts privacy attacks while preserving utility under reasonable distribution shifts.

2602.22480 2026-06-03 cs.AI cs.CL cs.LG 版本更新

VeRO: A Harness for Agents to Optimize Agents

VeRO: 用于优化智能体的智能体框架

Varun Ursekar, Apaar Shanker, Veronica Chatrath, Yuan Xue, Samuel Marc Denton

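Affiliations * arXiv

AI summary Proposes the VeRO harness and the VeRO-Bench benchmark, which optimize agent code through versioned snapshots, budget-controlled evaluation, and structured execution traces, and experimentally compares how different optimizers improve target agents.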

Comments Accepted to the Forty-Third International Conference on Machine Learning (ICML), 2026

详情
AI中文摘要

编码智能体的一个重要新兴应用是智能体框架优化:通过编辑和评估目标智能体的代码来迭代改进它。尽管具有相关性,但社区对编码智能体在此任务上的表现缺乏系统理解。框架优化与传统软件工程不同:智能体框架将确定性代码与随机 LLM 完成交错,需要结构化捕获中间执行轨迹和下游结果。为了解决这些挑战,我们引入了 (1) VeRO(版本化、奖励和观察),一个外部框架,提供目标框架的版本化快照、预算控制评估和结构化执行轨迹,以及 (2) VeRO-Bench,一个包含参考评估程序的目标智能体和任务的基准套件。使用 VeRO,我们进行了一项实证研究,比较了不同任务上的优化器,并分析了哪些修改能可靠地改进目标智能体框架。我们发布 VeRO 以支持作为编码智能体核心能力的智能体优化研究。代码可在 https://github.com/scaleapi/vero 获取。

英文摘要

An important emerging application of coding agents is agent harness optimization: the iterative improvement of a target agent by editing and evaluating its code. Despite its relevance, the community lacks a systematic understanding of coding agent performance on this task. Harness optimization differs from conventional software engineering: agent harnesses interleave deterministic code with stochastic LLM completions, requiring structured capture of both intermediate execution traces and downstream outcomes. To address these challenges, we introduce (1) VeRO (Versioning, Rewards, and Observations), an outer harness that provides versioned snapshots, budget-controlled evaluation, and structured execution traces of target harnesses, and (2) VeRO-Bench, a benchmark suite of target agents and tasks with reference evaluation procedures. Using VeRO, we conduct an empirical study comparing optimizers across tasks and analyzing which modifications reliably improve target agent harnesses. We release VeRO to support research on agent optimization as a core capability for coding agents. Code is available at https://github.com/scaleapi/vero.

2605.05629 2026-06-03 stat.ML cs.CL cs.LG 版本更新

Spherical Flows for Sampling Categorical Data

用于分类数据采样的球面流

Jannis Chemseddine, Gregor Kornhardt, Gabriele Steidl

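Affiliations * Technische Universität Berlin

AI summary Proposes generative modeling of discrete sequences on the sphere using the von Mises-Fisher distribution, exploiting radial symmetry to reduce the continuity equation to a scalar ODE and combining posterior-weighted tangent sums with predictor-corrector sampling for efficient sampling.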

详情
AI中文摘要

我们研究了在连续嵌入空间中学习离散序列生成模型的问题。以往的方法通常在欧几里得空间或概率单纯形上操作,而我们则在球面$\mathbb S^{d-1}$上工作。在那里,von Mises-Fisher (vMF)分布诱导了一个自然的噪声过程,并允许闭式条件得分。条件速度通常是难以处理的。利用vMF密度的径向对称性,我们将$\mathbb S^{d-1}$上的连续性方程简化为关于余弦相似度的标量ODE,其唯一有界解决定了速度。$\mathbb S^{d-1}$上的边际速度和边际得分都分解为后验加权的切线和,仅因每个token的标量权重不同。这提供了ODE和预测-校正(PC)采样两种途径。后验是唯一需要学习的对象,通过交叉熵损失训练。实验将vMF路径与测地线和欧几里得替代方案进行了比较。vMF与PC采样的结合显著改善了数独和语言建模的结果。

英文摘要

We study the problem of learning generative models for discrete sequences in a continuous embedding space. Whereas prior approaches typically operate in Euclidean space or on the probability simplex, we instead work on the sphere $\mathbb S^{d-1}$. There the von Mises-Fisher (vMF) distribution induces a natural noise process and admits a closed-form conditional score. The conditional velocity is in general intractable. Exploiting the radial symmetry of the vMF density we reduce the continuity equation on $\mathbb S^{d-1}$ to a scalar ODE in the cosine similarity, whose unique bounded solution determines the velocity. The marginal velocity and marginal score on $(\mathbb S^{d-1})^L$ both decompose into posterior-weighted tangent sums that differ only by per-token scalar weights. This gives access to both ODE and predictor-corrector (PC) sampling. The posterior is the only learned object, trained by a cross-entropy loss. Experiments compare the vMF path against geodesic and Euclidean alternatives. The combination of vMF and PC sampling significantly improves results on Sudoku and language modeling.
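The closed-form score mentioned above can be illustrated with the standard vMF log-density, $\log p(x) = \kappa\,\mu^\top x + \text{const}$ on the unit sphere, whose Riemannian gradient is the tangent projection $\kappa\,(\mu - (\mu^\top x)\,x)$. The sketch below is a generic instance of that formula; the function names are hypothetical and the paper's time-dependent noise schedule is not reproduced here.

```python
import math

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Project a nonzero vector onto the unit sphere."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def vmf_score(x, mu, kappa):
    """Riemannian gradient of the vMF log-density at a unit vector x.

    Computes kappa * (mu - <mu, x> x), i.e. the Euclidean gradient kappa * mu
    projected onto the tangent space of the sphere at x.
    """
    c = dot(mu, x)
    return [kappa * (m - c * xi) for m, xi in zip(mu, x)]

mu = normalize([1.0, 2.0, 2.0])   # mean direction (toy value)
x = normalize([0.0, 1.0, 0.0])    # query point on the sphere
s = vmf_score(x, mu, kappa=5.0)

print(s)
print(dot(s, x))  # tangent vectors are orthogonal to x
```

The score vanishes at `x = mu`, the mode of the distribution, and always lies in the tangent space at `x`, which is what allows it to drive dynamics that stay on the sphere.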

2604.23099 2026-06-03 cs.LG cs.AI stat.ML 版本更新

ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation

ProEval:生成式AI评估的主动故障发现与高效性能估计

Yizheng Huang, Wenjun Zeng, Aditi Kumaresan, Zi Wang

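Affiliations * Google DeepMind

AI summary Proposes the ProEval framework, which uses pre-trained Gaussian processes for Bayesian quadrature and superlevel set sampling to achieve efficient performance estimation and proactive failure discovery, reaching estimates within 1% error with 8-65x fewer samples on reasoning, safety alignment, and classification benchmarks.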

Comments Our open-sourced code and data can be found at https://github.com/google-deepmind/proeval

详情
Journal ref
International Conference on Machine Learning, 2026
AI中文摘要

由于推理速度慢、评估成本高以及模型和基准的快速增长,评估生成式AI模型变得越来越资源密集。我们提出ProEval,一个主动评估框架,利用迁移学习高效估计性能并识别故障案例。ProEval采用预训练高斯过程(GPs)作为性能评分函数的代理,将模型输入映射到指标,如错误严重性或安全违规。通过将性能估计构建为贝叶斯积分(BQ)和故障发现构建为超水平集采样,我们开发了不确定性感知的决策策略,主动选择或合成高度信息量的输入进行测试。理论上,我们证明了基于预训练GP的BQ估计器是无偏且有界的。实验上,在推理、安全对齐和分类基准上的大量实验表明,ProEval比竞争基线显著更高效。它需要8-65倍更少的样本即可达到真实值1%内的估计,同时在更严格的评估预算下揭示更多样化的故障案例。

英文摘要

Evaluating generative AI models is increasingly resource-intensive due to slow inference, expensive raters, and a rapidly growing landscape of models and benchmarks. We propose ProEval, a proactive evaluation framework that leverages transfer learning to efficiently estimate performance and identify failure cases. ProEval employs pre-trained Gaussian Processes (GPs) as surrogates for the performance score function, mapping model inputs to metrics such as the severity of errors or safety violations. By framing performance estimation as Bayesian quadrature (BQ) and failure discovery as superlevel set sampling, we develop uncertainty-aware decision strategies that actively select or synthesize highly informative inputs for testing. Theoretically, we prove that our pre-trained GP-based BQ estimator is unbiased and bounded. Empirically, extensive experiments on reasoning, safety alignment, and classification benchmarks demonstrate that ProEval is significantly more efficient than competitive baselines. It requires 8-65x fewer samples to achieve estimates within 1% of the ground truth, while simultaneously revealing more diverse failure cases under a stricter evaluation budget.
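As a rough illustration of performance estimation by Bayesian quadrature, the sketch below fits a GP to the scores of a few tested inputs and averages the posterior mean over a finite pool of candidate inputs. It is not ProEval's implementation: the 1-D inputs, the fixed RBF kernel, the constant `prior_mean`, and all function names are assumptions standing in for the pre-trained GP described in the abstract.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel on scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def bq_mean_estimate(pool, tested, scores, prior_mean=0.0, noise=1e-6):
    """Average the GP posterior mean over the pool (uniform weights)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(tested)]
         for i, a in enumerate(tested)]
    alpha = solve(K, [y - prior_mean for y in scores])
    post = [prior_mean + sum(rbf(x, t) * a for t, a in zip(tested, alpha))
            for x in pool]
    return sum(post) / len(post)

pool = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]   # candidate inputs (toy)
tested = [0.0, 1.0, 2.0, 3.0]                # inputs actually evaluated
scores = [0.9, 0.7, 0.4, 0.2]                # observed metric values
estimate = bq_mean_estimate(pool, tested, scores)
print(estimate)
```

When every pool item has been tested, the estimate reduces to the plain sample mean up to the small noise term; the savings come from the GP filling in untested inputs.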

2604.20316 2026-06-03 cs.LG 版本更新

R2IF: Aligning Reasoning with Decisions via Composite Rewards for Interpretable LLM Function Calling

R2IF: 通过复合奖励对齐推理与决策以实现可解释的LLM函数调用

Aijia Cheng, Kailong Wang, Ling Shi, Yongxin Zhao

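Affiliations * Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai, China; Huazhong University of Science and Technology; Nanyang Technological University

AI summary Proposes the R2IF framework, which aligns the reasoning process with tool-call decisions through a composite reward (format/correctness constraints, a Chain-of-Thought Effectiveness Reward, and a Specification-Modification-Value reward) optimized with GRPO, improving function-calling accuracy and interpretability on BFCL/ACEBench.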

详情
AI中文摘要

函数调用使大型语言模型(LLM)能够与外部工具交互,但现有的基于强化学习的方法存在推理过程与工具调用决策之间的错位。我们提出了R2IF,一种面向可解释函数调用的推理感知强化学习框架,采用复合奖励,整合格式/正确性约束、思维链有效性奖励(CER)和规范-修改-价值(SMV)奖励,并通过GRPO进行优化。在BFCL/ACEBench上的实验表明,R2IF在性能上优于基线方法,最高提升34.62%(Llama3.2-3B在BFCL上),同时平均思维链有效性为正(Llama3.2-3B为0.05),增强了函数调用的准确性和可解释性,为可靠的工具增强型LLM部署提供了支持。

英文摘要

Function calling empowers large language models (LLMs) to interface with external tools, yet existing RL-based approaches suffer from misalignment between reasoning processes and tool-call decisions. We propose R2IF, a reasoning-aware RL framework for interpretable function calling, adopting a composite reward integrating format/correctness constraints, Chain-of-Thought Effectiveness Reward (CER), and Specification-Modification-Value (SMV) reward, optimized via GRPO. Experiments on BFCL/ACEBench show R2IF outperforms baselines by up to 34.62% (Llama3.2-3B on BFCL) with positive Average CoT Effectiveness (0.05 for Llama3.2-3B), enhancing both function-calling accuracy and interpretability for reliable tool-augmented LLM deployment.
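A composite reward optimized with GRPO can be sketched as a weighted sum of reward terms followed by group-relative normalization, in which each rollout's reward is standardized against the other rollouts sampled for the same prompt. The weights, the reward values, and the function names below are hypothetical; the abstract names the reward components (format/correctness, CER, SMV) but not how they are combined.

```python
from statistics import mean, pstdev

def composite_reward(fmt_ok, correct, cer, smv, weights=(1.0, 1.0, 0.5, 0.5)):
    """Weighted sum of reward terms (weights are illustrative assumptions)."""
    w_fmt, w_cor, w_cer, w_smv = weights
    return w_fmt * float(fmt_ok) + w_cor * float(correct) + w_cer * cer + w_smv * smv

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts for the same prompt (toy values).
rewards = [
    composite_reward(True, True, cer=0.2, smv=0.1),
    composite_reward(True, False, cer=-0.1, smv=0.0),
    composite_reward(False, False, cer=0.0, smv=0.0),
    composite_reward(True, True, cer=0.0, smv=0.3),
]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Because advantages are centered within the group, a rollout is rewarded only for beating its siblings, which removes the need for a learned value baseline.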

2604.18995 2026-06-03 cs.CL cs.AI cs.LG 版本更新

$R^2$-dLLM: Accelerating Diffusion Large Language Models via Spatio-Temporal Redundancy Reduction

$R^2$-dLLM: 通过时空冗余减少加速扩散大语言模型

Zhenbang Du, Kejing Xia, Xinrui Zhong, Yonggan Fu, Nicolai Oswald, Binfei Ji, Brucek Khailany, Pavlo Molchanov, Yingyan Lin

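AI summary Proposes the $R^2$-dLLM framework, which reduces spatial and temporal redundancy in diffusion LLM decoding at both the inference and training stages, cutting decoding steps by up to 88% while preserving generation quality.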

详情
AI中文摘要

扩散大语言模型(dLLMs)通过并行令牌预测成为自回归生成的有前途的替代方案。然而,实际的 dLLM 解码仍然遭受高推理延迟,限制了部署。在这项工作中,我们观察到这种低效率的很大一部分来自解码过程中反复出现的冗余,包括由置信度聚类和位置模糊性引起的空间冗余,以及由重复重新掩蔽已经稳定的预测引起的时间冗余。受这些模式的启发,我们提出了 $R^{2}$-dLLM,一个从推理和训练两个角度减少解码冗余的统一框架。在推理时,我们引入了无需训练的解码规则,聚合局部置信度和令牌预测,并最终确定时间稳定的令牌以避免冗余解码步骤。我们进一步提出了一个冗余感知的监督微调流程,使模型与高效解码轨迹对齐,并减少对手动调整阈值的依赖。实验表明,与现有解码策略相比,$R^{2}$-dLLM 一致地将解码步数减少高达 88%,同时在不同模型和任务上保持有竞争力的生成质量。这些结果验证了解码冗余是 dLLMs 的一个核心瓶颈,明确减少它能够带来显著的实用效率提升。我们的代码和模型可在 https://github.com/GATECH-EIC/R2-dLLM 获取。

英文摘要

Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to autoregressive generation by enabling parallel token prediction. However, practical dLLM decoding still suffers from high inference latency, which limits deployment. In this work, we observe that a substantial part of this inefficiency comes from recurring redundancy in the decoding process, including spatial redundancy caused by confidence clusters and positional ambiguity, and temporal redundancy caused by repeatedly remasking predictions that have already stabilized. Motivated by these patterns, we propose $R^{2}$-dLLM, a unified framework for reducing decoding redundancy from both inference and training perspectives. At inference time, we introduce training-free decoding rules that aggregate local confidence and token predictions, and finalize temporally stable tokens to avoid redundant decoding steps. We further propose a redundancy-aware supervised fine-tuning pipeline that aligns the model with efficient decoding trajectories and reduces reliance on manually tuned thresholds. Experiments demonstrate that $R^{2}$-dLLM consistently reduces the number of decoding steps by up to 88% compared to existing decoding strategies, while maintaining competitive generation quality across different models and tasks. These results validate that decoding redundancy is a central bottleneck in dLLMs, and that explicitly reducing it yields substantial practical efficiency gains. Our code and models are available at https://github.com/GATECH-EIC/R2-dLLM.

2604.18572 2026-06-03 cs.CV cs.AI cs.LG 版本更新

Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale

回到柏拉图的洞穴:大规模检验跨模态表示收敛性

A. Sophia Koepke, Daniil Zverev, Shiry Ginosar, Alexei A. Efros

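Affiliations * UC Berkeley; Technical University Munich, MCML; University of Tübingen, Tübingen AI Center; Toyota Technological Institute at Chicago

AI summary Through large-scale dataset experiments, this paper questions the evidence for cross-modal representational convergence in the Platonic Representation Hypothesis, finding that alignment degrades substantially as data scale grows and reflects only coarse semantic overlap.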

Comments Project page: http://akoepke.github.io/cave_umwelten/

详情
AI中文摘要

柏拉图表示假说认为,在不同模态(例如文本和图像)上训练的神经网络会趋向于对齐并最终收敛到相同的现实表示。如果该假说成立,将对模态选择是否重要产生重大影响。我们表明,该假说的实验证据是脆弱的,且关键依赖于评估方式。对齐度通过小数据集(约1000个样本)上的互最近邻测量,当数据集扩展到数百万样本时,对齐度显著下降。在文本-音频和文本-视频对齐中也观察到相同行为。模型表示之间剩余的对齐反映的是粗粒度语义重叠,而非一致的细粒度结构。此外,Huh等人的评估是在一对一图像-标题设置中进行的,这种约束在现实的多对多设置中失效,进一步降低了测量的对齐度。我们还发现,更强的语言模型与视觉对齐度增加的趋势似乎不适用于较新的模型。总体而言,我们的发现表明,当前跨模态表示收敛的证据比后续工作所认为的要弱得多。在不同模态上训练的模型可能学习到同样丰富的世界表示,但并非相同的表示。

英文摘要

The Platonic Representation Hypothesis suggests that neural networks trained on different modalities (e.g., text and images) align and eventually converge toward the same representation of reality. If true, this has significant implications for whether modality choice matters at all. We show that the experimental evidence for this hypothesis is fragile and depends critically on the evaluation regime. Alignment is measured using mutual nearest neighbors on small datasets ($\approx$1K samples) and degrades substantially as the dataset is scaled to millions of samples. The same behavior is observed beyond text-image, for text-audio and text-video alignment. The alignment that remains between model representations reflects coarse semantic overlap rather than consistent fine-grained structure. Moreover, the evaluations in Huh et al. are done in a one-to-one image-caption setting, a constraint that breaks down in realistic many-to-many settings and further reduces measured alignment. We also find that the reported trend of stronger language models increasingly aligning with vision does not appear to hold for newer models. Overall, our findings suggest that the current evidence for cross-modal representational convergence is considerably weaker than subsequent works have taken it to be. Models trained on different modalities may learn equally rich representations of the world, just not the same one.
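A mutual nearest-neighbor alignment score of the kind discussed above can be sketched as follows: for each paired sample, take its k nearest neighbors in each model's feature space and measure how much the two neighbor sets overlap. This is a minimal illustration that assumes Euclidean distance and brute-force search; the evaluated papers may use a different similarity measure or normalization.

```python
import math

def knn_indices(feats, i, k):
    """Indices of the k nearest neighbors of sample i (excluding i itself)."""
    dists = [(math.dist(feats[i], f), j) for j, f in enumerate(feats) if j != i]
    dists.sort()
    return {j for _, j in dists[:k]}

def mutual_knn_alignment(feats_a, feats_b, k):
    """Mean fraction of shared k-nearest neighbors across two feature spaces.

    feats_a[i] and feats_b[i] must describe the same underlying sample
    (e.g. an image and its caption); dimensions may differ between spaces.
    """
    n = len(feats_a)
    if n != len(feats_b) or not 0 < k < n:
        raise ValueError("need paired features and 0 < k < number of samples")
    overlap = sum(
        len(knn_indices(feats_a, i, k) & knn_indices(feats_b, i, k))
        for i in range(n)
    )
    return overlap / (n * k)

# Toy paired features: space B is space A scaled by 2, so neighborhoods agree.
feats_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0], [7.0, 5.0]]
feats_b = [[2.0 * x for x in f] for f in feats_a]
print(mutual_knn_alignment(feats_a, feats_b, k=2))
```

With a fixed k, adding more samples places more candidates near every point, so exact neighbor-set agreement becomes a stricter test, which is one way to read the reported drop in alignment at larger dataset sizes.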

2604.16029 2026-06-03 cs.CL cs.LG 版本更新

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

减少损失!学习早期剪枝路径以实现高效并行推理

Jiaxi Bi, Tongxu Luo, Wenyu Du, Zhengyang Tang, Benyou Wang

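Affiliations * The Chinese University of Hong Kong, Shenzhen; Shenzhen Loop Area Institute; USTB; DualityRL

AI summary Proposes STOP, a path pruning method based on learnable internal signals that prunes at the prefix level to cut futile paths in parallel reasoning, markedly improving the efficiency and performance of large reasoning models.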

Comments 9 pages, 7 figures

详情
AI中文摘要

并行推理增强了大型推理模型(LRMs),但由于早期错误导致的无效路径,其成本高昂。为了缓解这一问题,前缀级的路径剪枝至关重要,然而现有研究缺乏标准化框架,较为零散。在这项工作中,我们提出了第一个系统的路径剪枝分类法,根据信号来源(内部与外部)和可学习性(可学习与不可学习)对方法进行分类。这种分类揭示了可学习内部方法的未开发潜力,促使我们提出STOP(用于剪枝的超级令牌)。在参数规模从1.5B到20B的LRMs上的广泛评估表明,与现有基线相比,STOP在效果和效率上均表现出优越性。此外,我们在不同计算预算下严格验证了STOP的可扩展性——例如,在固定计算预算下,将GPT-OSS-20B在AIME25上的准确率从84%提升至近90%。最后,我们将发现提炼为形式化的经验指南,以促进最优的实际部署。代码、数据和模型可在 https://bijiaxihh.github.io/STOP 获取。

英文摘要

Parallel reasoning enhances Large Reasoning Models (LRMs) but incurs prohibitive costs due to futile paths caused by early errors. To mitigate this, path pruning at the prefix level is essential, yet existing research remains fragmented without a standardized framework. In this work, we propose the first systematic taxonomy of path pruning, categorizing methods by their signal source (internal vs. external) and learnability (learnable vs. non-learnable). This classification reveals the unexplored potential of learnable internal methods, motivating our proposal of STOP (Super TOken for Pruning). Extensive evaluations across LRMs ranging from 1.5B to 20B parameters demonstrate that STOP achieves superior effectiveness and efficiency compared to existing baselines. Furthermore, we rigorously validate the scalability of STOP under varying compute budgets - for instance, boosting GPT-OSS-20B accuracy on AIME25 from 84% to nearly 90% under fixed compute budgets. Finally, we distill our findings into formalized empirical guidelines to facilitate optimal real-world deployment. Code, data and models are available at https://bijiaxihh.github.io/STOP

2507.17506 2026-06-03 eess.SP cs.LG 版本更新

Power-Aware Cognitive Radar Multi-target Tracking Under Unknown Disturbances

未知扰动下的功率感知认知雷达多目标跟踪

Imad Bouhou, Stefano Fortunati, Leila Gharsalli, Alexandre Renaux

Affiliations * Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes; SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris; DR2I-IPSA

AI summary For multi-target tracking under unknown disturbances, proposes a cognitive radar framework based on Partially Observable Monte Carlo Planning (POMCP) that raises detection probability and tracking accuracy for low-SNR targets through adaptive waveform design and power allocation.

详情
AI中文摘要

本文提出了一种认知雷达(CR)框架,旨在利用大规模多输入多输出(MMIMO)系统在未知扰动下跟踪多架飞机。由于均匀功率分配在不同信噪比(SNR)下是次优的,我们结合了由部分可观测蒙特卡洛规划(POMCP)驱动的自适应波形设计。通过为每个目标分配独立的POMCP树,系统高效预测目标状态。这些预测指导一个约束优化问题,主动将发射能量导向较弱的目标,同时为较强的目标维持足够的功率。结果证实,所提出的POMCP方法将低SNR目标的检测概率从0.6提高到接近0.9,并且相比非自适应正交波形或认知均匀功率POMCP基线,对最弱目标的跟踪更精确。

英文摘要

This work presents a cognitive radar (CR) framework designed to track multiple aircraft under unknown disturbances using massive multiple-input multiple-output (MMIMO) systems. Since uniform power allocation is suboptimal across varying signal-to-noise ratios (SNRs), we incorporate an adaptive waveform design driven by Partially Observable Monte Carlo Planning (POMCP). By assigning an independent POMCP tree to each target, the system efficiently predicts target states. These predictions inform a constrained optimization problem that actively directs transmit energy toward weaker targets while maintaining sufficient power for stronger ones. Results confirm that the proposed POMCP method improves the detection probability for low-SNR targets from 0.6 to nearly 0.9, and yields more accurate tracking of the weakest target than a non-adaptive orthogonal waveform or a cognitive uniform-power POMCP baseline.

2604.10169 2026-06-03 cs.AI cs.LG 版本更新

MAVEN-T: Reinforced Heterogeneous Distillation for Real-Time Multi-Agent Trajectory Prediction

MAVEN-T:用于实时多智能体轨迹预测的强化异构蒸馏

Wenchang Duan, Zhenguo Gao, Jinguo Xian, Yi Shi

发表机构 * School of Mathematical Sciences, Shanghai Jiao Tong University(上海交通大学数学科学学院) Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University(上海交通大学Bio-X研究院、发育与神经精神疾病遗传学重点实验室) Shanghai Key Laboratory of Psychotic Disorders, Brain Science and Technology Research Center, Shanghai Jiao Tong University(上海精神疾病重点实验室、脑科学与技术研究中心,上海交通大学)

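AI summary: Proposes the MAVEN-T framework, which combines heterogeneous distillation from a high-capacity teacher to a compact student with reinforcement-learning refinement for real-time multi-agent trajectory prediction, reaching high accuracy and low latency on multiple datasets.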

详情
AI中文摘要

轨迹预测是自动驾驶系统的关键组成部分,因为未来运动直接影响碰撞检查、行为规划和控制。在密集交互、异构行为、多模态未来和有限车载计算条件下,该任务仍然具有挑战性。现有的图、注意力和生成式预测器改进了交互推理或不确定性建模,但其高容量设计通常成本高昂,难以实时部署。轻量级预测器和传统蒸馏降低了推理成本,但通常依赖静态模仿,并未明确纠正与安全相关的教师偏差。本文提出了MAVEN-T,一种用于实时多智能体轨迹预测的强化异构蒸馏框架。高容量教师模型通过环绕感知图编码器建模有向局部交互,结合高效时间滤波与移位窗口空间注意力,并通过稀疏混合专家头解码特定机动未来。紧凑的GRU-挤压激励学生模型配备低秩自适应策略头,通过特征级、注意力级和语义级蒸馏进行训练。为了与下游行为对齐,学生模型进一步通过近端策略优化奖励进行细化,奖励包括碰撞避免、舒适性和进度,同时复杂度感知课程和弹性权重巩固稳定了分阶段训练。在NGSIM、HighD、MoCAD、Argoverse 2和Waymo开放运动数据集上的实验评估了准确性、效率、泛化性、鲁棒性和闭环安全性。学生模型在NVIDIA Jetson AGX Orin上实现了6.2倍参数压缩、3.7倍推理加速和14.6毫秒延迟,同时保持竞争性准确性。

英文摘要

Trajectory prediction is a key component of autonomous driving systems because future motions directly affect collision checking, behavior planning, and control. The task remains challenging under dense interactions, heterogeneous behaviors, multimodal futures, and limited on-board computation. Existing graph, attention, and generative predictors improve interaction reasoning or uncertainty modeling, but their high-capacity designs are often costly for real-time deployment. Lightweight predictors and conventional distillation reduce inference cost, yet usually rely on static imitation and do not explicitly correct safety-relevant teacher bias. This paper proposes MAVEN-T, a reinforced heterogeneous distillation framework for real-time multi-agent trajectory prediction. A high-capacity teacher models directed local interactions with a surround-aware graph encoder, combines efficient temporal filtering with shifted-window spatial attention, and decodes maneuver-specific futures through a sparse Mixture-of-Experts head. A compact GRU-Squeeze-and-Excitation student with a Low-Rank Adapted policy head is trained by feature-, attention-, and semantic-level distillation. To align prediction with downstream behavior, the student is further refined by Proximal Policy Optimization rewards for collision avoidance, comfort, and progress, while a complexity-aware curriculum and Elastic Weight Consolidation stabilize stage-wise training. Experiments on NGSIM, HighD, MoCAD, Argoverse 2, and the Waymo Open Motion Dataset evaluate accuracy, efficiency, generalization, robustness, and closed-loop safety. The student achieves 6.2$\times$ parameter compression, 3.7$\times$ inference acceleration, and 14.6 ms latency on an NVIDIA Jetson AGX Orin while maintaining competitive accuracy.

2510.02779 2026-06-03 cs.LG 版本更新

Optimal Rates for Generalization of Gradient Descent for Deep ReLU Classification

深度ReLU分类中梯度下降泛化的最优速率

Yuanfan Li, Yunwen Lei, Zheng-Chu Guo, Yiming Ying

发表机构 * School of Mathematical Sciences, Zhejiang University(浙江大学数学科学学院) Department of Mathematics, The University of Hong Kong(香港大学数学系) School of mathematics and statistics, University of Sydney(悉尼大学数学与统计学学院)

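AI summary: For deep ReLU networks, trades off optimization and generalization errors to prove, under an NTK-separability assumption, an excess risk rate of $\widetilde{O}(L^6 / (n γ^2))$ for gradient descent, which matches the optimal SVM-type rate up to depth-dependent factors; the key technique is controlling activation patterns near a reference model to obtain a sharper Rademacher complexity bound.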

Comments Published in NeurIPS 2025

详情
AI中文摘要

近期进展显著提升了我们对深度神经网络中梯度下降(GD)方法泛化性能的理解。一个自然且基本的问题是:GD能否达到核方法中建立的最小最大最优速率?现有结果要么给出次优的$O(1/\sqrt{n})$速率,要么关注具有光滑激活函数的网络,导致对网络深度$L$的指数依赖。本文通过仔细权衡优化与泛化误差,为深度ReLU网络的GD建立了最优泛化速率,仅对深度有多项式依赖。具体地,在数据以间隔$γ$为NTK可分离的假设下,我们证明了过风险率为$\widetilde{O}(L^6 / (n γ^2))$,这与最优SVM型速率$\widetilde{O}(1 / (n γ^2))$仅差深度相关因子。一项关键的技术贡献是我们对参考模型附近激活模式的新颖控制,从而为梯度下降训练的深度ReLU网络获得了更紧的Rademacher复杂度界。

英文摘要

Recent advances have significantly improved our understanding of the generalization performance of gradient descent (GD) methods in deep neural networks. A natural and fundamental question is whether GD can achieve generalization rates comparable to the minimax optimal rates established in the kernel setting. Existing results either yield suboptimal rates of $O(1/\sqrt{n})$, or focus on networks with smooth activation functions, incurring exponential dependence on network depth $L$. In this work, we establish optimal generalization rates for GD with deep ReLU networks by carefully trading off optimization and generalization errors, achieving only polynomial dependence on depth. Specifically, under the assumption that the data are NTK-separable with margin $γ$, we prove an excess risk rate of $\widetilde{O}(L^6 / (n γ^2))$, which aligns with the optimal SVM-type rate $\widetilde{O}(1 / (n γ^2))$ up to depth-dependent factors. A key technical contribution is our novel control of activation patterns near a reference model, enabling a sharper Rademacher complexity bound for deep ReLU networks trained with gradient descent.

2604.07366 2026-06-03 cs.LG 版本更新

Flow Learners for PDEs: Toward a Physics-to-Physics Paradigm for Scientific Computing

PDE的流学习器:迈向科学计算的物理到物理范式

Yilong Dai, Shengyu Chen, Xiaowei Jia, Runlong Yu

发表机构 * The University of Alabama(阿拉巴马大学) University of Pittsburgh(匹兹堡大学)

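AI summary: Proposes the flow learner paradigm, which parameterizes transport vector fields and integrates them to generate trajectories, shifting learned PDE solving from state prediction to modeling transport over physically admissible futures and supporting continuous-time prediction, uncertainty quantification, and physics-aware solver design.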

详情
AI中文摘要

偏微分方程(PDE)支配着科学与工程中几乎所有的物理过程,但大规模求解仍然代价高昂。生成式AI已经改变了语言、视觉和蛋白质科学,但学习的PDE求解器尚未经历类似的转变。现有范式各自捕捉了问题的一部分。物理信息神经网络嵌入残差结构,尽管在刚性、多尺度或大区域情况下通常难以优化。神经算子跨实例进行摊销,尽管它们通常继承快照预测的求解视图,并可能在长滚动中退化。基于扩散的求解器对不确定性建模,尽管它们通常建立在仍以状态回归为中心的求解器模板上。我们认为核心问题是用于训练学习求解器的抽象。许多模型被要求预测状态,而许多科学设置需要建模不确定性如何在约束动力学中移动。相关对象是物理上允许的未来上的传输。这激发了流学习器:参数化传输向量场并通过积分生成轨迹的模型,呼应定义PDE演化的连续动力学。这种物理到物理的对齐支持连续时间预测、原生不确定性量化以及物理感知求解器设计的新机会。我们解释了为什么基于传输的学习为学习的PDE求解提供了更强的组织原则,并概述了从这一转变中产生的研究议程。

英文摘要

Partial differential equations (PDEs) govern nearly every physical process in science and engineering, but solving them at scale remains prohibitively expensive. Generative AI has transformed language, vision, and protein science, but learned PDE solvers have not undergone a comparable shift. Existing paradigms each capture part of the problem. Physics-informed neural networks embed residual structure, although they are often difficult to optimize in stiff, multiscale, or large-domain regimes. Neural operators amortize across instances, although they commonly inherit a snapshot-prediction view of solving and can degrade over long rollouts. Diffusion-based solvers model uncertainty, although they are often built on a solver template that still centers on state regression. We argue that the core issue is the abstraction used to train learned solvers. Many models are asked to predict states, while many scientific settings require modeling how uncertainty moves through constrained dynamics. The relevant object is transport over physically admissible futures. This motivates flow learners: models that parameterize transport vector fields and generate trajectories through integration, echoing the continuous dynamics that define PDE evolution. This physics-to-physics alignment supports continuous-time prediction, native uncertainty quantification, and new opportunities for physics-aware solver design. We explain why transport-based learning offers a stronger organizing principle for learned PDE solving and outline the research agenda that follows from this shift.

2604.04439 2026-06-03 cs.LG cs.CV 版本更新

Estimating Central, Peripheral, and Temporal Visual Contributions to Human Decision Making in Atari Games

估计Atari游戏中中央、周边和时间视觉对人类决策的贡献

Henrik Krauss, Takehisa Yairi

发表机构 * Department of Advanced Interdisciplinary Studies, The University of Tokyo(东京大学先进跨学科研究系) Research Center for Advanced Science and Technology, The University of Tokyo(东京大学先进科学与技术研究中心)

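AI summary: Analyzes eye-tracking data from Atari gameplay with a controlled ablation framework and finds that peripheral visual information contributes most to human decisions, while gaze information and past-state information contribute less.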

详情
AI中文摘要

我们研究了不同视觉信息源在动态视觉环境中对人类决策的贡献。利用Atari-HEAD(一个带有同步眼动追踪的大规模Atari游戏数据集),我们引入了一个受控消融框架,作为逆向工程周边视觉信息、显式注视信息(以注视图形式)以及人类行为中过去状态信息贡献的手段。我们在六种设置下训练动作预测网络,这些设置选择性地包含或排除这些信息源。在20个游戏中,周边信息的贡献最为显著,移除后预测准确率的中位数下降范围为35.27-43.90%。注视信息导致的下降较小,为2.11-2.76%,而过去状态信息的下降范围较广,为1.52-15.51%,其中上限可能因减少了周边信息泄露而更具信息量。为了补充总体准确率,我们根据不同模型配置分配的真实动作概率对状态进行聚类。该分析识别出粗略的行为模式,包括焦点主导、周边主导以及更多情境决策情境。这些结果表明,Atari游戏中的人类决策强烈依赖于当前注视焦点之外的信息,而所提出的框架提供了一种从行为中估计此类信息源贡献的方法。

英文摘要

We study how different visual information sources contribute to human decision making in dynamic visual environments. Using Atari-HEAD, a large-scale Atari gameplay dataset with synchronized eye-tracking, we introduce a controlled ablation framework as a means to reverse-engineer the contribution of peripheral visual information, explicit gaze information in the form of gaze maps, and past-state information from human behavior. We train action-prediction networks under six settings that selectively include or exclude these information sources. Across 20 games, peripheral information shows by far the strongest contribution, with median prediction-accuracy drops in the range of 35.27-43.90% when removed. Gaze information yields smaller drops of 2.11-2.76%, while past-state information shows a broader range of 1.52-15.51%, with the upper end likely more informative due to reduced peripheral-information leakage. To complement aggregate accuracies, we cluster states by true-action probabilities assigned by the different model configurations. This analysis identifies coarse behavioral regimes, including focus-dominated, periphery-dominated, and more contextual decision situations. These results suggest that human decision making in Atari depends strongly on information beyond the current focus of gaze, while the proposed framework provides a way to estimate such information-source contributions from behavior.

2604.04087 2026-06-03 cs.LG 版本更新

ArrowFlow: Hierarchical Machine Learning in the Space of Permutations

ArrowFlow:排列空间中的层次化机器学习

Ozgur Yilmaz

发表机构 * Department of Artificial Intelligence(人工智能系) Adana Science and Technology University(阿达纳科学技术大学)

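AI summary: Proposes ArrowFlow, an architecture that learns hierarchical ordinal representations in permutation space through ranking filters and permutation-matrix accumulation, with no floating-point parameters in the core computation, and uses violations of social-choice axioms as inductive biases; experiments show competitive results on several benchmarks along with noise robustness and privacy preservation.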

详情
AI中文摘要

我们引入了ArrowFlow,一种完全在排列空间中运行的机器学习架构。其计算单元是排序滤波器,即学习到的排序,通过Spearman's footrule距离比较输入,并通过置换矩阵累积(一种基于位移证据的非梯度规则)进行更新。层以层次方式组合:每一层的输出排序成为下一层的输入,从而在核心计算中无需任何浮点参数即可实现深度序数表示学习。我们将该架构与Arrow不可能定理联系起来,表明社会选择公平性公理(上下文依赖性、专业化、对称性破坏)的违反作为非线性、稀疏性和稳定性的归纳偏置。实验涵盖UCI表格基准、MNIST、基因表达癌症分类(TCGA)和偏好数据,均与GridSearchCV调优的基线进行比较。ArrowFlow在Iris上击败所有基线(2.7% vs. 3.3%),并在大多数UCI数据集上具有竞争力。单个参数多项式次数充当主开关:次数1带来噪声鲁棒性(退化减少8-28%)、隐私保护(成本增加0.5个百分点)和缺失特征弹性;更高次数则牺牲这些特性以换取更高的干净准确率。ArrowFlow并非旨在超越基于梯度的方法。它是一个存在性证明,表明在一种根本不同的计算范式(将序数结构提升为一等公民,且与纯整数和神经形态硬件自然对齐)中实现有竞争力的分类是可能的。

英文摘要

We introduce ArrowFlow, a machine learning architecture that operates entirely in the space of permutations. Its computational units are ranking filters, learned orderings that compare inputs via Spearman's footrule distance and update through permutation-matrix accumulation, a non-gradient rule rooted in displacement evidence. Layers compose hierarchically: each layer's output ranking becomes the next layer's input, enabling deep ordinal representation learning without any floating-point parameters in the core computation. We connect the architecture to Arrow's impossibility theorem, showing that violations of social-choice fairness axioms (context dependence, specialization, symmetry breaking) serve as inductive biases for nonlinearity, sparsity, and stability. Experiments span UCI tabular benchmarks, MNIST, gene expression cancer classification (TCGA), and preference data, all against GridSearchCV-tuned baselines. ArrowFlow beats all baselines on Iris (2.7% vs. 3.3%) and is competitive on most UCI datasets. A single parameter, polynomial degree, acts as a master switch: degree 1 yields noise robustness (8-28% less degradation), privacy preservation (+0.5pp cost), and missing-feature resilience; higher degrees trade these for improved clean accuracy. ArrowFlow is not designed to surpass gradient-based methods. It is an existence proof that competitive classification is possible in a fundamentally different computational paradigm, one that elevates ordinal structure to a first-class citizen, with natural alignment to integer-only and neuromorphic hardware.
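
As a rough illustration of the ranking-filter comparison described in the abstract above, the following Python sketch computes Spearman's footrule distance between two orderings and picks the closest stored ordering. The function names (`footrule_distance`, `to_ranking`, `nearest_filter`) and the tie-breaking rule are assumptions made for illustration; this is not ArrowFlow's update rule or its published code.

```python
def footrule_distance(ranking_a, ranking_b):
    """Spearman's footrule: sum over items of |position in a - position in b|."""
    if sorted(ranking_a) != sorted(ranking_b):
        raise ValueError("rankings must be permutations of the same items")
    position_in_b = {item: idx for idx, item in enumerate(ranking_b)}
    return sum(abs(idx - position_in_b[item]) for idx, item in enumerate(ranking_a))


def to_ranking(features):
    """Convert a real-valued vector into an ordering of its indices.

    Ascending by value; ties are broken by index (an assumption of this sketch).
    """
    return sorted(range(len(features)), key=lambda i: (features[i], i))


def nearest_filter(input_ranking, filters):
    """Return the index of the stored ranking closest to the input ranking."""
    return min(range(len(filters)),
               key=lambda k: footrule_distance(input_ranking, filters[k]))
```

Only integer positions are compared here, which is one way to read the abstract's point about integer-only computation.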

2603.19551 2026-06-03 stat.ME cs.LG 版本更新

Learning to Bet for Horizon-Aware Anytime-Valid Testing

学习在严格截止日期下进行前瞻性任意有效测试的投注

Ege Onur Taga, Samet Oymak, Shubhanshu Shekhar

发表机构 * Department of Electrical and Computer Engineering, University of Michigan(密歇根大学电气与计算机工程系)

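AI summary: Casts horizon-aware betting as a finite-horizon optimal control problem and uses deep reinforcement learning to learn a universal policy, yielding anytime-valid tests and confidence sequences for bounded means under a strict deadline.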

Comments To appear in ICML 2026; 29 pages, 22 figures

详情
AI中文摘要

我们针对严格截止日期 $N$ 下的有界均值,开发了前瞻性任意有效测试和置信序列。利用投注/e-过程框架,我们将前瞻性投注视为一个状态空间为 $(t, \log W_t)$ 的有限时域最优控制问题,其中 $t$ 是时间,$W_t$ 是测试鞅值。我们首先证明,在状态空间的某些内部区域,显著偏离Kelly投注的策略是次优的,而Kelly投注以高概率达到阈值。然后,我们识别出充分条件,表明在该区域之外,如果投注者落后于计划,比Kelly更激进的投注可能更好;如果投注者领先,比Kelly更保守的投注可能更好。这些结果共同暗示了 $(t, \log W_t)$ 平面上的一个简单相图,描绘了Kelly、分数Kelly和激进投注可能更优的区域。在此相图指导下,我们引入了一种基于通用深度Q网络(DQN)智能体的深度强化学习方法,该智能体从合成经验中学习单一策略,并将过去观测的简单统计量映射为跨时域和零假设的投注。在有限时域实验中,学习到的DQN策略取得了最先进的结果。

英文摘要

We develop horizon-aware anytime-valid tests and confidence sequences for bounded means under a strict deadline $N$. Using the betting/e-process framework, we cast horizon-aware betting as a finite-horizon optimal control problem with state space $(t, \log W_t)$, where $t$ is the time and $W_t$ is the test martingale value. We first show that in certain interior regions of the state space, policies that deviate significantly from Kelly betting are provably suboptimal, while Kelly betting reaches the threshold with high probability. We then identify sufficient conditions showing that outside this region, more aggressive betting than Kelly can be better if the bettor is behind schedule, and less aggressive can be better if the bettor is ahead. Taken together these results suggest a simple phase diagram in the $(t, \log W_t)$ plane, delineating regions where Kelly, fractional Kelly, and aggressive betting may be preferable. Guided by this phase diagram, we introduce a Deep Reinforcement Learning approach based on a universal Deep Q-Network (DQN) agent that learns a single policy from synthetic experience and maps simple statistics of past observations to bets across horizons and null values. In limited-horizon experiments, the learned DQN policy yields state-of-the-art results.
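
The state $(t, \log W_t)$ mentioned in the abstract above can be made concrete with a minimal Python sketch of a betting test for a bounded mean. It assumes observations in $[0, 1]$, a fixed bet (not the Kelly, fractional Kelly, or learned DQN bets the abstract discusses), and rejection once wealth reaches $1/α$, which is the standard threshold from Ville's inequality. The function name `run_betting_test` and its parameters are hypothetical.

```python
import math


def run_betting_test(observations, null_mean, bet, alpha=0.05):
    """Track (t, log W_t) for a fixed bet against H0: mean == null_mean.

    Wealth update: W_t = W_{t-1} * (1 + bet * (x_t - null_mean)).
    Returns (rejected, trajectory), stopping at the first threshold crossing.
    """
    if not 0.0 < null_mean < 1.0:
        raise ValueError("null_mean must lie strictly inside (0, 1)")
    lowest, highest = -1.0 / (1.0 - null_mean), 1.0 / null_mean
    if not lowest < bet < highest:
        raise ValueError("bet must keep wealth strictly positive")

    log_wealth = 0.0
    threshold = math.log(1.0 / alpha)
    trajectory = []
    for t, x in enumerate(observations, start=1):
        log_wealth += math.log(1.0 + bet * (x - null_mean))
        trajectory.append((t, log_wealth))
        if log_wealth >= threshold:
            return True, trajectory
    return False, trajectory
```

A horizon-aware policy in the abstract's sense would choose `bet` at each step as a function of the current state and the remaining time before the deadline $N$; the fixed bet here only shows the bookkeeping.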

2602.07768 2026-06-03 cs.CV cs.AI cs.LG cs.MM 版本更新

PAND: Prompt-Aware Neighborhood Distillation for Lightweight Fine-Grained Visual Classification

PAND:面向提示的邻域蒸馏用于轻量级细粒度视觉分类

Qiuming Luo, Yuebing Li, Feng Li, Chang Kong

发表机构 * arXiv

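AI summary: Proposes PAND, which transfers knowledge from large vision-language models to lightweight networks through prompt-aware semantic calibration and neighborhood-aware structural distillation, outperforming existing methods on fine-grained classification.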

Comments Accepted by ICIP2026

详情
AI中文摘要

在细粒度视觉分类(FGVC)中,从大型视觉语言模型(VLM)中蒸馏知识到轻量级网络至关重要但具有挑战性,原因是依赖于固定提示和全局对齐。为解决此问题,我们提出PAND(提示感知邻域蒸馏),一个两阶段框架,将语义校准与结构迁移解耦。首先,我们引入提示感知语义校准以生成自适应语义锚点。其次,我们提出邻域感知结构蒸馏策略以约束学生的局部决策结构。PAND在四个FGVC基准上持续优于现有方法。值得注意的是,我们的ResNet-18学生在CUB-200上达到76.09%的准确率,超过强基线VL2Lite 3.4%。代码可在https://github.com/LLLVTA/PAND获取。

英文摘要

Distilling knowledge from large Vision-Language Models (VLMs) into lightweight networks is crucial yet challenging in Fine-Grained Visual Classification (FGVC), due to the reliance on fixed prompts and global alignment. To address this, we propose PAND (Prompt-Aware Neighborhood Distillation), a two-stage framework that decouples semantic calibration from structural transfer. First, we incorporate Prompt-Aware Semantic Calibration to generate adaptive semantic anchors. Second, we introduce a neighborhood-aware structural distillation strategy to constrain the student's local decision structure. PAND consistently outperforms state-of-the-art methods on four FGVC benchmarks. Notably, our ResNet-18 student achieves 76.09% accuracy on CUB-200, surpassing the strong baseline VL2Lite by 3.4%. Code is available at https://github.com/LLLVTA/PAND.

2602.04132 2026-06-03 eess.SY cs.LG cs.RO cs.SY 版本更新

LC-SAC: Lyapunov-Constrained Soft Actor-Critic via Koopman Operator Theory for Trajectory Tracking and Stabilization

LC-SAC: 基于Koopman算子理论的李雅普诺夫约束软演员-评论家算法用于轨迹跟踪与镇定

Dhruv S. Kushwaha, Zoleikha A. Biron

发表机构 * University of California, Berkeley(加州大学伯克利分校)

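AI summary: Proposes a Lyapunov-constrained Soft Actor-Critic algorithm built on Koopman operator theory, which uses a linear lifted dynamics model and a closed-form Control Lyapunov Function for trajectory tracking and stabilization, and adds a Conditional Value-at-Risk constraint to handle rare but severe instability events.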

Comments 13 pages, 8 Figures

详情
AI中文摘要

强化学习在解决复杂序列决策问题中取得了显著成功,但其在安全关键物理系统中的应用仍受限于缺乏稳定性保证。标准强化学习算法优先考虑奖励最大化,往往产生可能引起振荡或无界状态发散的策略。本文提出一种基于Koopman算子理论的李雅普诺夫约束软演员-评论家算法。我们通过扩展动态模态分解学习误差动力学的线性提升代理模型,并求解离散代数Riccati方程以获得闭式二次候选控制李雅普诺夫函数。该控制李雅普诺夫函数作为拉格朗日惩罚项被纳入SAC演员更新中,通过条件风险价值目标聚合最坏情况尾部分布,将约束压力集中在罕见但严重的失稳事件上。我们进一步引入三种结构性的EDMD改进:在求解DARE之前对提升的A矩阵进行谱半径归一化、具有物理意义的LQR状态代价,以及强制V(0)=0的值偏置锚点,使得闭式控制李雅普诺夫函数对于更高维的提升模型(如倒立摆和3D四旋翼)是适定的。消融研究表明,硬拉格朗日约束是必要的,将其替换为奖励塑形会导致学习不稳定并在四旋翼任务中导致回报崩溃。

英文摘要

Reinforcement Learning (RL) has achieved remarkable success in solving complex sequential decision-making problems. However, its application to safety-critical physical systems remains constrained by the lack of stability guarantees. Standard RL algorithms prioritize reward maximization, often yielding policies that may induce oscillations or unbounded state divergence. In this work, we propose a Lyapunov-Constrained Soft Actor-Critic (LC-SAC) algorithm using Koopman operator theory. We learn a linear lifted surrogate of the error dynamics via Extended Dynamic Mode Decomposition (EDMD) and solve the Discrete Algebraic Riccati Equation (DARE) to obtain a closed-form quadratic candidate Control Lyapunov Function (CLF). This CLF is incorporated into the SAC actor update as a Lagrangian penalty that aggregates the worst-case tail of violations via a Conditional Value-at-Risk (CVaR) objective, concentrating constraint pressure on rare but severe instability events. We further introduce three structural EDMD refinements (spectral-radius normalization of the lifted A-matrix prior to the DARE solve, a physically meaningful LQR state cost, and a value-bias anchor enforcing V(0)=0) that make the closed-form CLF well-posed for higher-dimensional lifted models such as the cartpole and 3D quadrotor. The ablation study shows that a hard Lagrangian constraint is essential: replacing it with reward shaping (Lyap-RS-SAC) destabilizes learning and collapses return on quadrotor tasks.
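
To illustrate how a CVaR objective concentrates on the worst violations of a quadratic Lyapunov candidate, here is a small Python sketch. It assumes a violation is the positive part of $V(x') - V(x)$ with $V(x) = x^\top P x$, and it takes the matrix `P` as given instead of solving the DARE. The exact constraint form used in LC-SAC is not stated in the abstract, so the function names and the definition of a violation are assumptions.

```python
import math


def quadratic_clf(x, P):
    """V(x) = x^T P x for a symmetric matrix P given as nested lists."""
    n = len(x)
    return sum(x[i] * P[i][j] * x[j] for i in range(n) for j in range(n))


def decrease_violations(states, next_states, P):
    """Positive part of V(x') - V(x); zero whenever the candidate decreases."""
    return [max(0.0, quadratic_clf(xn, P) - quadratic_clf(x, P))
            for x, xn in zip(states, next_states)]


def cvar_tail_mean(violations, tail_fraction=0.1):
    """Empirical CVaR: mean of the largest `tail_fraction` share of values."""
    if not violations:
        raise ValueError("violations must be non-empty")
    if not 0.0 < tail_fraction <= 1.0:
        raise ValueError("tail_fraction must be in (0, 1]")
    k = max(1, math.ceil(tail_fraction * len(violations)))
    worst = sorted(violations, reverse=True)[:k]
    return sum(worst) / k
```

Compared with a plain average over a batch, the tail mean stays large when only a few transitions violate the decrease condition badly, which matches the abstract's stated emphasis on rare but severe events.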

2512.09106 2026-06-03 cs.LG 版本更新

Learning Unmasking Policies for Diffusion Language Models

学习扩散语言模型的去掩码策略

Metod Jazbec, Theo X. Olausson, Louis Béthune, Pierre Ablin, Michael Kirchhof, João Monteiro, Victor Turrisi, Jason Ramapuram, Marco Cuturi

发表机构 * Apple(苹果公司) University of Amsterdam(阿姆斯特丹大学) Massachusetts Institute of Technology(麻省理工学院)

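AI summary: For unmasking-based sampling in diffusion language models, proposes training a lightweight policy with reinforcement learning to replace manually tuned heuristics; the learned policies match those heuristics with semi-autoregressive (block) generation and outperform them in the full-diffusion setting.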

Comments V4: Accepted as an oral spotlight at ICML 2026

详情
AI中文摘要

扩散(大型)语言模型(dLLMs)现在在许多任务上与自回归模型的下游性能相匹配,同时有望在推理过程中更高效。dLLMs的一个关键设计方面是采样过程,该过程选择在每个扩散步骤中要去掩码哪些标记。实际上,最近的研究发现,与随机去掩码相比,诸如置信度阈值之类的启发式策略提高了样本质量和标记吞吐量。然而,此类启发式方法存在缺点:它们需要手动调整,并且我们观察到它们的性能随着块大小的增加而下降。在这项工作中,我们提出使用强化学习来训练采样过程。具体来说,我们将掩码扩散采样形式化为一个马尔可夫决策过程,其中dLLM充当环境,并提出了一个基于单层transformer的轻量级策略,该策略将dLLM标记置信度映射到去掩码决策。我们的实验表明,当与半自回归(块)生成结合时,这些训练后的策略与最先进的启发式方法的性能相匹配,同时在完全扩散设置中优于它们。

英文摘要

Diffusion (Large) Language Models (dLLMs) now match the downstream performance of their autoregressive counterparts on many tasks, while holding the promise of being more efficient during inference. One critical design aspect of dLLMs is the sampling procedure that selects which tokens to unmask at each diffusion step. Indeed, recent work has found that heuristic strategies such as confidence thresholding improve both sample quality and token throughput compared to random unmasking. However, such heuristics have downsides: they require manual tuning, and we observe that their performance degrades with larger block sizes. In this work, we instead propose to train sampling procedures using reinforcement learning. Specifically, we formalize masked diffusion sampling as a Markov decision process in which the dLLM serves as the environment, and propose a lightweight policy based on a single-layer transformer that maps dLLM token confidences to unmasking decisions. Our experiments show that these trained policies match the performance of state-of-the-art heuristics when combined with semi-autoregressive (block) generation, while outperforming them in the full-diffusion setting.
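
The confidence-thresholding heuristic that the abstract above uses as its point of comparison can be sketched in a few lines of Python. This is one plausible reading under stated assumptions: the function name, the default threshold, and the fallback of unmasking the single most confident position when nothing clears the threshold are all hypothetical, and the learned policy from the paper is not reproduced here.

```python
def select_tokens_to_unmask(confidences, is_masked, threshold=0.9):
    """Pick masked positions to reveal at one diffusion step.

    confidences: per-position model confidence in its top prediction.
    is_masked:   per-position booleans, True where the token is still masked.
    Returns positions with confidence >= threshold; if none qualify, returns
    the single most confident masked position so decoding always progresses.
    """
    masked = [i for i, flag in enumerate(is_masked) if flag]
    if not masked:
        return []
    chosen = [i for i in masked if confidences[i] >= threshold]
    if not chosen:
        chosen = [max(masked, key=lambda i: confidences[i])]
    return chosen
```

A learned policy in the abstract's sense would replace the fixed `threshold` rule with a model that maps the same confidences to unmasking decisions.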

2602.07075 2026-06-03 physics.chem-ph cs.AI cs.CL cs.LG 版本更新

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

LatentChem: 从文本思维链到化学推理中的潜在思考

Xinwu Ye, Yicheng Mao, Yuxuan Liao, Jia Zhang, Yimeng Liu, Li Hao, Fang Wu, Zhiwei Li, Zehong Wang, Zhiyuan Liu, Zhenfei Yin, Li Yuan, Philip Torr, Huan Sun, xiangxiang Zeng, Mengdi Wang, Le Cong, Shenghua Gao, Xiangru Tang

发表机构 * University of Science and Technology of China(中国科学技术大学)

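AI summary: To address the modality mismatch caused by chemical LLMs' reliance on explicit Chain-of-Thought, proposes LatentChem, a reasoning interface that decouples chemical logic from language generation through continuous thought vectors and dynamic perception; it achieves a 59.88% non-tie win rate over a strong CoT baseline on ChemCoTBench and a 10.84× average reduction in reasoning step overhead (5.96× wall-clock speedup).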

Comments Accepted at ICML 2026

详情
AI中文摘要

当前的化学大语言模型主要依赖显式的思维链来解决复杂推理问题。然而,将非语言的隐性化学逻辑强制转化为离散的自然语言,造成了根本性的“模态不匹配”,为推理带来了人为瓶颈。我们提出了LatentChem,一种将化学逻辑与语言生成解耦的推理接口,使模型能够通过连续思维向量和动态感知来处理信息。我们的研究揭示了一个关键涌现行为:自发内化,这里定义为在仅结果优化下的自我选择。当为任务成功进行优化时,模型放弃冗长的文本推导,转而采用隐式的潜在计算,这表明模型将连续流形视为化学逻辑更自然的载体。这一范式转变也被证明是一种更优的计算策略:在严格的ChemCoTBench基准上,LatentChem对强CoT基线取得了59.88%的非平局胜率,同时在所有评估基准上实现了平均10.84倍的推理步骤开销降低(5.96倍实际加速)。我们的结果提供了经验证据,表明化学推理更自然、更有效地实现为连续潜在动力学,而非离散的语言轨迹。

英文摘要

Current chemical large language models (LLMs) predominantly rely on explicit Chain-of-Thought (CoT) to solve complex reasoning problems. However, forcing nonverbal tacit chemical logic into discrete natural language imposes a fundamental "modality mismatch," creating an artificial bottleneck for reasoning. We introduce LatentChem, a reasoning interface that decouples chemical logic from linguistic generation, enabling the model to process information via continuous thought vectors and dynamic perception. Our investigation reveals a pivotal emergent behavior: spontaneous internalization, defined here as self-selected under outcome-only optimization. When optimized for task success, the model abandons verbose textual derivations in favor of implicit latent computation, suggesting that it identifies the continuous manifold as a more native substrate for chemical logic. This paradigm shift also proves to be a superior computational strategy: LatentChem achieves a 59.88% non-tie win rate against the strong CoT baseline on the rigorous ChemCoTBench, while delivering a broad 10.84$\times$ average reduction in reasoning step overhead (5.96$\times$ wall-clock speedup) across all evaluated benchmarks. Our results provide empirical evidence that chemical reasoning is more naturally and effectively realized as continuous latent dynamics rather than discretized linguistic trajectories.

2511.04469 2026-06-03 cs.LG physics.data-an q-fin.CP stat.ME stat.OT 版本更新

Towards Causal Market Simulators

迈向因果市场模拟器

Dennis Thumm, Luis Ontaneda Mijares

发表机构 * National University of Singapore(新加坡国立大学) Veracruz Mexico(墨西哥韦拉克鲁斯)

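AI summary: Proposes the Time-series Neural Causal Model VAE (TNCM-VAE), which combines variational autoencoders with structural causal models to generate counterfactual financial time series that preserve temporal dependencies and causal relationships, reaching L1 distances as low as 0.03-0.10 on synthetic data.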

Comments ICAIF 2025 Workshop on Rethinking Financial Time-Series

详情
AI中文摘要

使用深度生成模型的市场生成器在合成金融数据生成方面显示出前景,但现有方法缺乏反事实分析和风险评估所必需的因果推理能力。我们提出了一种时间序列神经因果模型VAE(TNCM-VAE),它将变分自编码器与结构因果模型相结合,以生成反事实金融时间序列,同时保留时间依赖性和因果关系。我们的方法通过解码器架构中的有向无环图施加因果约束,并使用因果Wasserstein距离进行训练。我们在受Ornstein-Uhlenbeck过程启发的合成自回归模型上验证了该方法,在反事实概率估计中表现出优越性能,与真实值相比L1距离低至0.03-0.10。该模型通过生成尊重潜在因果机制的合理反事实市场轨迹,实现了金融压力测试、情景分析和增强回测。

英文摘要

Market generators using deep generative models have shown promise for synthetic financial data generation, but existing approaches lack causal reasoning capabilities essential for counterfactual analysis and risk assessment. We propose a Time-series Neural Causal Model VAE (TNCM-VAE) that combines variational autoencoders with structural causal models to generate counterfactual financial time series while preserving both temporal dependencies and causal relationships. Our approach enforces causal constraints through directed acyclic graphs in the decoder architecture and employs the causal Wasserstein distance for training. We validate our method on synthetic autoregressive models inspired by the Ornstein-Uhlenbeck process, demonstrating superior performance in counterfactual probability estimation with L1 distances as low as 0.03-0.10 compared to ground truth. The model enables financial stress testing, scenario analysis, and enhanced backtesting by generating plausible counterfactual market trajectories that respect underlying causal mechanisms.

2603.04956 2026-06-03 cs.LG cs.IT math.IT 版本更新

WaterSIC: Information-Theoretically (Near) Optimal Linear Layer Quantization

WaterSIC: 信息论(近乎)最优的线性层量化

Egor Lifar, Semyon Savkin, Or Ordentlich, Yury Polyanskiy

发表机构 * University of Illinois at Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校) Stanford University(斯坦福大学) University of California, Berkeley(加州大学伯克利分校) University of Texas at Austin(德克萨斯大学奥斯汀分校)

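AI summary: For low-precision quantization of dense linear layers, proposes the WaterSIC algorithm, which assigns different quantization rates to different columns of the weight matrix, stays within 0.255 bits of the information-theoretic limit, and sets state-of-the-art results for 1- to 4-bit quantization on the Llama and Qwen families of LLMs.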

详情
AI中文摘要

本文考虑将给定的密集线性层转换为低精度的问题。从信息论(IT)角度分析了压缩长度与输出差异之间的权衡。结果表明,流行的GPTQ算法可能与IT极限存在任意大的差距。为解决此问题,提出了一种名为“WaterSIC”的新算法,并证明其与IT极限的速率差距不超过0.255比特,且对输入激活的所有可能协方差矩阵一致成立。WaterSIC的关键创新在于为权重矩阵的不同列(输入特征)分配不同的量化率,模拟了经典的IT解决方案“注水”。将WaterSIC应用于Llama和Qwen系列大语言模型,在1到4比特的所有量化率下均取得了新的最优性能。我们的代码可在https://github.com/egorlifar/watersic获取。

英文摘要

This paper considers the problem of converting a given dense linear layer to low precision. The tradeoff between compressed length and output discrepancy is analyzed information theoretically (IT). It is shown that the popular GPTQ algorithm may have an arbitrarily large gap to the IT limit. To alleviate this problem, a novel algorithm, termed "WaterSIC", is proposed and is shown to be within a rate gap of 0.255 bits to the IT limit, uniformly over all possible covariance matrices of input activations. The key innovation of WaterSIC is to allocate different quantization rates to different columns (in-features) of the weight matrix, mimicking the classical IT solution known as "waterfilling". Applying WaterSIC to the Llama and Qwen family of LLMs establishes new state-of-the-art performance for all quantization rates from 1 to 4 bits. Our code is available at https://github.com/egorlifar/watersic.
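
The "waterfilling" idea that the abstract above refers to can be illustrated with the textbook reverse waterfilling allocation for independent Gaussian components: each component with variance $σ_i^2$ gets rate $\max(0, \tfrac{1}{2}\log_2(σ_i^2/θ))$, and the water level $θ$ is chosen so the rates sum to the bit budget. The Python sketch below finds $θ$ by bisection. It shows the classical solution only, not the WaterSIC algorithm, and the function name and arguments are hypothetical.

```python
import math


def reverse_waterfilling(variances, total_rate_bits, iterations=200):
    """Split a bit budget across independent Gaussian components.

    Returns (water_level, rates). Components whose variance is at or below
    the water level receive zero bits; the rest get 0.5 * log2(var / level).
    """
    if not variances or any(v <= 0 for v in variances):
        raise ValueError("variances must be non-empty and strictly positive")
    if total_rate_bits < 0:
        raise ValueError("total_rate_bits must be non-negative")

    def rates_at(level):
        return [max(0.0, 0.5 * math.log2(v / level)) for v in variances]

    # Total rate decreases as the level rises and is zero at max(variances).
    low, high = 0.0, max(variances)
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if sum(rates_at(mid)) > total_rate_bits:
            low = mid   # level too low: spending more bits than allowed
        else:
            high = mid  # level feasible: try lowering it further
    return high, rates_at(high)
```

High-variance components receive more bits and low-variance ones may receive none, which is the non-uniform allocation pattern the abstract describes for weight-matrix columns.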

2603.03612 2026-06-03 cs.LG cs.CC cs.CL cs.FL 版本更新

Why Are Linear RNNs More Parallelizable?

为什么线性RNN更易于并行化?

William Merrill, Hongjian Jiang, Yanhong Li, Anthony Lin, Ashish Sabharwal

发表机构 * GitHub

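AI summary: By tightly linking RNN types to standard complexity classes, shows that linear RNNs (LRNNs) are easy to parallelize because they can be viewed as log-depth arithmetic circuits, whereas nonlinear RNNs face a parallelization barrier because they can solve $\mathsf{L}$-complete problems.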

Comments To appear at ICML 2026

详情
AI中文摘要

社区越来越多地探索线性RNN(LRNN)作为语言模型,受其表达能力和并行化能力的驱动。虽然先前的工作确立了LRNN相对于Transformer的表达优势,但尚不清楚是什么使得LRNN——而非传统的非线性RNN——在实践中与Transformer一样易于并行化。我们通过提供RNN类型与标准复杂度类之间的紧密联系来回答这个问题。我们表明,LRNN可以看作是对数深度(有界扇入)算术电路,相对于Transformer所允许的对数深度布尔电路,这仅代表轻微深度开销。此外,我们表明非线性RNN可以解决$\mathsf{L}$-完全问题(甚至在多项式精度下解决$\mathsf{P}$-完全问题),揭示了将它们与Transformer一样高效并行化的根本障碍。我们的理论还识别了近期流行LRNN变体之间的细粒度表达差异:置换对角LRNN是$\mathsf{NC}^1$-完全的,而对角加低秩LRNN更具表达性($\mathsf{PNC}^1$-完全)。我们通过将每种RNN类型与它可以模拟的相应自动机理论模型相关联,提供了进一步见解。总之,我们的结果揭示了非线性RNN与不同LRNN变体之间的基本权衡,为设计在表达性和并行性之间实现最佳平衡的LLM架构提供了基础。

英文摘要

The community is increasingly exploring linear RNNs (LRNNs) as language models, motivated by their expressive power and parallelizability. While prior work establishes the expressivity benefits of LRNNs over transformers, it is unclear what makes LRNNs -- but not traditional, nonlinear RNNs -- as easy to parallelize in practice as transformers. We answer this question by providing a tight connection between types of RNNs and standard complexity classes. We show that LRNNs can be viewed as log-depth (bounded fan-in) arithmetic circuits, which represents only a slight depth overhead relative to log-depth boolean circuits that transformers admit. Furthermore, we show that nonlinear RNNs can solve $\mathsf{L}$-complete problems (and even $\mathsf{P}$-complete ones, under polynomial precision), revealing a fundamental barrier to parallelizing them as efficiently as transformers. Our theory also identifies fine-grained expressivity differences between recent popular LRNN variants: permutation-diagonal LRNNs are $\mathsf{NC}^1$-complete whereas diagonal-plus-low-rank LRNNs are more expressive ($\mathsf{PNC}^1$-complete). We provide further insight by associating each type of RNN with a corresponding automata-theoretic model that it can simulate. Together, our results reveal fundamental tradeoffs between nonlinear RNNs and different variants of LRNNs, providing a foundation for designing LLM architectures that achieve an optimal balance between expressivity and parallelism.
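
One informal way to see the log-depth claim above: a scalar linear recurrence $h_t = a_t h_{t-1} + b_t$ is a composition of affine maps, and composing affine maps is associative, so the maps can be combined pairwise in a balanced tree. The Python sketch below compares a sequential pass with such a tree reduction and counts the rounds. It is a scalar toy under these assumptions, not the paper's circuit construction, and all names are hypothetical.

```python
def combine(first, second):
    """Compose affine maps h -> a*h + b, applying `first` and then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def sequential_final_state(steps, h0=0.0):
    """Reference: apply h_t = a_t * h_{t-1} + b_t one step at a time."""
    h = h0
    for a, b in steps:
        h = a * h + b
    return h


def tree_final_state(steps, h0=0.0):
    """Pairwise reduction; combines within a round are mutually independent.

    Returns (final_state, rounds), where rounds is about log2(len(steps)).
    """
    if not steps:
        return h0, 0
    level = list(steps)
    rounds = 0
    while len(level) > 1:
        merged = [combine(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            merged.append(level[-1])  # odd element carries over unchanged
        level = merged
        rounds += 1
    a, b = level[0]
    return a * h0 + b, rounds
```

With a nonlinearity such as `tanh` applied at every step, two consecutive updates no longer collapse into a single map of the same fixed-size form, so this pairwise trick does not apply directly.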

2510.20372 2026-06-03 stat.ML cs.LG econ.EM math.ST stat.ME stat.TH 版本更新

Testing Most Influential Sets

最具影响力集合的检验

Lucas D. Konrad, Nikolas Kuschnig

发表机构 * Vienna University of Economics and Business(维也纳经济与商业大学) Monash University(蒙纳士大学)

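AI summary: To address the problem that a small subset of data points can unduly drive model conclusions, derives an exact influence formula for linear least squares, identifies the extreme value distributions of maximal influence, and proposes a hypothesis-testing framework for excessive influence.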

Comments Published as a conference paper at ICLR 2026

详情
AI中文摘要

小的有影响力的数据子集可以极大地影响模型结论,少数数据点可能推翻关键发现。虽然最近的研究识别了这些最具影响力的集合,但没有正式的方法来判断最大影响何时是过度的,而非在自然随机抽样变异下预期的。我们通过开发一个关于最具影响力集合的原则性框架来填补这一空白。聚焦于线性最小二乘法,我们推导了一个方便的精确影响公式,并识别了最大影响的极值分布——对于固定大小的集合和重尾数据是重尾的弗雷歇分布,对于增长集合或轻尾数据是表现良好的耿贝尔分布。这使得我们能够对过度影响进行严格的假设检验。我们通过跨经济学、生物学和机器学习基准的应用,解决了有争议的发现,并用严格的推断取代了临时的启发式方法。

英文摘要

Small influential data subsets can dramatically impact model conclusions, with a few data points overturning key findings. While recent work identifies these most influential sets, there is no formal way to tell when maximum influence is excessive rather than expected under natural random sampling variation. We address this gap by developing a principled framework for most influential sets. Focusing on linear least-squares, we derive a convenient exact influence formula and identify the extreme value distributions of maximal influence: the heavy-tailed Fréchet for constant-size sets and heavy-tailed data, and the well-behaved Gumbel for growing sets or light tails. This allows us to conduct rigorous hypothesis tests for excessive influence. We demonstrate the framework through applications across economics, biology, and machine learning benchmarks, resolving contested findings and replacing ad-hoc heuristics with rigorous inference.

2510.16462 2026-06-03 cs.LG stat.ML 版本更新

Buzz, Choose, Forget: A Meta-Bandit Framework for Bee-Like Decision Making

Buzz, Choose, Forget: 一种类蜂决策的元老虎机框架

Emmanuelle Claeys, Elena Kerjean, Jean-Michel Loubes

发表机构 * University of Toulouse, IRIT(图卢兹大学,IRIT) University of Toulouse, CBI(图卢兹大学,CBI) Regalia Team, INRIA University of Toulouse, France(Regalia团队,法国国家信息与自动化研究所图卢兹大学)

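AI summary: Proposes MAYA, a sequential imitation learning model based on multi-armed bandits that models bees' limited memory through a temporal window $τ$; it outperforms baselines on real, simulated, and complementary datasets while offering interpretability and trajectory inference.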

详情
AI中文摘要

本文介绍了MAYA,一种基于多臂老虎机的序列模仿学习模型,旨在再现和预测个体蜜蜂在情境化觅食任务中的决策。该模型通过时间窗口$τ$考虑蜜蜂的有限记忆,其最优值约为7次试验,且轻微依赖于天气条件。在真实、模拟和补充(小鼠)数据集上的实验结果表明,MAYA(特别是使用Wasserstein距离时)优于模仿基线和经典统计模型,同时提供了个体学习策略的可解释性,并能够推断出用于未来生态应用的真实轨迹。

英文摘要

This work introduces MAYA, a sequential imitation learning model based on multi-armed bandits, designed to reproduce and predict individual bees' decisions in contextualized foraging tasks. The model accounts for bees' limited memory through a temporal window $τ$, whose optimal value is around 7 trials, with a slight dependence on weather conditions. Experimental results on real, simulated, and complementary (mice) datasets show that MAYA (particularly with the Wasserstein distance) outperforms imitation baselines and classical statistical models, while providing interpretability of individual learning strategies and enabling the inference of realistic trajectories for prospective ecological applications.
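
The limited-memory idea above (a temporal window $τ$ of about 7 trials) can be illustrated with a generic sliding-window bandit in Python. This is a sketch under assumptions, not MAYA itself: the class name, the greedy choice rule, and the handling of arms that have dropped out of the window are hypothetical, and the imitation-learning and Wasserstein components from the abstract are not modeled.

```python
from collections import deque


class WindowedBandit:
    """Greedy bandit whose estimates use only the most recent trials."""

    def __init__(self, n_arms, window=7):
        if n_arms < 1 or window < 1:
            raise ValueError("n_arms and window must be positive")
        self.n_arms = n_arms
        # deque(maxlen=...) drops the oldest trial once the window is full.
        self.history = deque(maxlen=window)

    def update(self, arm, reward):
        if not 0 <= arm < self.n_arms:
            raise ValueError("unknown arm")
        self.history.append((arm, reward))

    def estimates(self):
        """Mean reward per arm within the window; None if the arm is absent."""
        totals = [0.0] * self.n_arms
        counts = [0] * self.n_arms
        for arm, reward in self.history:
            totals[arm] += reward
            counts[arm] += 1
        return [totals[a] / counts[a] if counts[a] else None
                for a in range(self.n_arms)]

    def choose(self):
        """Try any arm missing from the window first, else pick the best mean."""
        current = self.estimates()
        for arm, value in enumerate(current):
            if value is None:
                return arm
        return max(range(self.n_arms), key=lambda a: current[a])
```

Shrinking `window` makes the agent forget older outcomes faster, which is the behavior the parameter $τ$ is meant to capture.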

2506.13107 2026-06-03 cs.LG stat.ML 版本更新

Honesty in Causal Forests: When It Helps and When It Hurts

因果森林中的诚实性:何时有益,何时有害

Yanfang Hou, Carlos Fernández-Loría

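AI summary: Through a bias-variance trade-off analysis, this paper finds that honest estimation (splitting the data between subgroup definition and effect estimation) can reduce the accuracy of individual treatment effect estimates when heterogeneity is strong and data are plentiful, and recommends treating it as a form of regularization rather than a default.

AI Chinese abstract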

因果森林估计处理效应如何随个体变化,指导营销、运营和公共政策等领域的个性化干预。标准做法是诚实估计:将数据分为两个样本,一个用于定义子组,另一个用于估计子组内的处理效应。这旨在减少过拟合,并且是许多软件包的默认设置。但这是正确的选择吗?我们表明,诚实估计会降低个体处理效应估计的准确性,特别是当效应异质性显著且数据集足够大以检测到它时。原因是偏差-方差权衡:诚实性降低了过拟合的风险,但通过限制可用于检测和建模异质性的数据,增加了欠拟合的风险。在超过7000个基准数据集上,我们发现默认使用诚实性的代价可能高达需要多27%的数据才能匹配未使用诚实性训练的模型的性能。诚实性最好被理解为一种正则化形式。是否采用它应取决于应用的目标及其经验表现,而不是反射性的默认使用。

English abstract

Causal forests estimate how treatment effects vary across individuals, guiding personalized interventions in areas like marketing, operations, and public policy. A standard practice is honest estimation: dividing the data into two samples, one to define subgroups and another to estimate treatment effects within them. This is intended to reduce overfitting and is the default in many software packages. But is it the right choice? We show that honest estimation can reduce the accuracy of estimates of individual treatment effects, especially when effect heterogeneity is substantial and datasets are large enough to detect it. The reason is a bias-variance trade-off: honesty lowers the risk of overfitting but increases the risk of underfitting by limiting the data available to detect and model heterogeneity. Across more than 7,000 benchmark datasets, we find that the cost of using honesty by default can be as high as requiring 27% more data to match the performance of models trained without it. Honesty is best understood as a form of regularization. Whether to adopt it should depend on the goals of the application and its empirical performance, not on reflexive default use.
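
The sample-splitting step described above can be sketched on a toy problem. This is a minimal illustration under loud assumptions, not a causal forest: it uses a single threshold split on one covariate, a difference in means as the effect estimate, and hypothetical function names throughout.

```python
import random


def diff_in_means(rows):
    """Treated-minus-control mean outcome; rows are (x, treated, y) tuples."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    if not treated or not control:
        return None  # effect is not estimable in this subgroup
    return sum(treated) / len(treated) - sum(control) / len(control)


def best_threshold(rows, candidates):
    """Pick the x-threshold whose two subgroups differ most in estimated effect."""
    best, best_gap = None, -1.0
    for c in candidates:
        left = diff_in_means([r for r in rows if r[0] <= c])
        right = diff_in_means([r for r in rows if r[0] > c])
        if left is None or right is None:
            continue
        if abs(left - right) > best_gap:
            best, best_gap = c, abs(left - right)
    return best


def subgroup_effects(split_rows, estimate_rows, candidates):
    """Define subgroups on split_rows, then estimate effects on estimate_rows."""
    c = best_threshold(split_rows, candidates)
    if c is None:
        return None, None, None
    left = diff_in_means([r for r in estimate_rows if r[0] <= c])
    right = diff_in_means([r for r in estimate_rows if r[0] > c])
    return c, left, right


def honest_effects(rows, candidates, seed=0):
    """Honest variant: disjoint halves for splitting and for estimation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    half = len(rows) // 2
    return subgroup_effects(rows[:half], rows[half:], candidates)


def adaptive_effects(rows, candidates):
    """Non-honest variant: the same rows define subgroups and estimate effects."""
    return subgroup_effects(rows, rows, candidates)
```

The trade-off in the abstract is visible in the signatures: `honest_effects` gives each step only half of the rows, which guards against overfitting the split but leaves less data to detect heterogeneity.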

2603.03480 2026-06-03 cs.LG stat.ML updated version

Minimax Optimal Strategy for Delayed Observations in Online Reinforcement Learning

在线强化学习中延迟观测的极小化最优策略

Harin Lee, Kevin Jamieson

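Affiliations: University of California, Berkeley

AI summary: For reinforcement learning with delayed state observations, proposes an algorithm that combines the augmentation method with upper confidence bounds and achieves a minimax-optimal regret bound on tabular MDPs.

Comments ICML camera ready version

AI Chinese abstract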

我们研究具有延迟状态观测的强化学习,其中智能体在随机数量的时间步后观察到当前状态。我们提出了一种结合增广方法和上置信界方法的算法。对于表格型马尔可夫决策过程(MDP),我们推导出遗憾界为$\tilde{\mathcal{O}}(H \sqrt{D_{\max} SAK})$,其中$S$和$A$是状态和动作空间的基数,$H$是时间跨度,$K$是回合数,$D_{\max}$是最大延迟长度。我们还提供了匹配的下界(对数因子除外),表明我们的方法是最优的。我们的分析框架将这个问题表述为一类更广泛的MDP的特例,其中它们的转移动态分解为已知部分和未知但结构化的部分。我们为这个抽象设定建立了通用结果,这可能具有独立的研究价值。

English abstract

We study reinforcement learning with delayed state observation, where the agent observes the current state after some random number of time steps. We propose an algorithm that combines the augmentation method and the upper confidence bound approach. For tabular Markov decision processes (MDPs), we derive a regret bound of $\tilde{\mathcal{O}}(H \sqrt{D_{\max} SAK})$, where $S$ and $A$ are the cardinalities of the state and action spaces, $H$ is the time horizon, $K$ is the number of episodes, and $D_{\max}$ is the maximum length of the delay. We also provide a matching lower bound up to logarithmic factors, showing the optimality of our approach. Our analytical framework formulates this problem as a special case of a broader class of MDPs, where their transition dynamics decompose into a known component and an unknown but structured component. We establish general results for this abstract setting, which may be of independent interest.

2603.01471 2026-06-03 cs.IR cs.LG updated version

Reconstructing Content with Collaborative Attention for Universal Multimodal Representation Learning

利用协同注意力重建内容以实现通用多模态表示学习

Jiahan Chen, Da Li, Hengran Zhang, Yinqiong Cai, Lixin Su, Jiafeng Guo, Daiting Shi, Dawei Yin, Keping Bi

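Affiliations: State Key Laboratory of AI Safety; Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Baidu Inc.

AI summary: Proposes CoCoA, a pre-training paradigm that restructures the attention flow and adds an EOS-based reconstruction task, using collaborative attention so that the model compresses the input semantics into the <EOS> token and thereby improves embedding quality.

AI Chinese abstract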

多模态嵌入模型,根植于多模态大语言模型(MLLMs),在检索和分类等多样任务中取得了显著的性能提升。然而,现有方法大多严重依赖大规模对比学习,对MLLMs的架构和训练范式如何影响嵌入质量的探索有限。虽然MLLMs的因果注意力和下一个令牌预测范式在生成任务中有效,但并未明确鼓励形成全局紧凑的表示,限制了其作为多模态嵌入骨干的有效性。为解决这一问题,我们提出了CoCoA,一种基于协同注意力的内容重建预训练范式,用于多模态嵌入优化。具体而言,我们重构注意力流并引入基于EOS的重建任务,鼓励模型从相应的<EOS>嵌入中重建输入。这促使多模态模型将输入的语义信息压缩到<EOS>令牌中,为后续的对比学习奠定基础。在MMEB-V1上的大量实验表明,基于Qwen2-VL和Qwen2.5-VL构建的CoCoA显著提升了嵌入质量。结果验证了内容重建作为最大化现有数据价值的有效策略,使多模态嵌入模型能够生成紧凑且信息丰富的表示,提升其性能上限。

English abstract

Multimodal embedding models, rooted in multimodal large language models (MLLMs), have yielded significant performance improvements across diverse tasks such as retrieval and classification. However, most existing approaches rely heavily on large-scale contrastive learning, with limited exploration of how the architectural and training paradigms of MLLMs affect embedding quality. While effective for generation, the causal attention and next-token prediction paradigm of MLLMs does not explicitly encourage the formation of globally compact representations, limiting their effectiveness as multimodal embedding backbones. To address this, we propose CoCoA, a Content reconstruction pre-training paradigm based on Collaborative Attention for multimodal embedding optimization. Specifically, we restructure the attention flow and introduce an EOS-based reconstruction task, encouraging the model to reconstruct input from the corresponding <EOS> embeddings. This drives the multimodal model to compress the semantic information of the input into the <EOS> token, laying the foundations for subsequent contrastive learning. Extensive experiments on MMEB-V1 demonstrate that CoCoA built upon Qwen2-VL and Qwen2.5-VL significantly improves embedding quality. Results validate that content reconstruction serves as an effective strategy to maximize the value of existing data, enabling multimodal embedding models to generate compact and informative representations and raising their performance ceiling.

2603.01372 2026-06-03 cs.LG cs.AI updated version

Causal Neural Probabilistic Circuits

因果神经概率电路

Weixin Chen, Han Zhao

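AI summary: Proposes the Causal Neural Probabilistic Circuit (CNPC), which combines a neural attribute predictor with a causal probabilistic circuit compiled from a causal graph to support exact interventional inference that respects causal dependencies, improving the classification accuracy of concept bottleneck models under interventions.

AI Chinese abstract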

概念瓶颈模型(CBM)通过引入概念层并从概念预测中预测类别标签,增强了端到端神经网络的可解释性。CBM的一个关键特性是支持干预,即领域专家可以在测试时纠正错误预测的概念值以提高最终准确性。然而,典型的CBM仅覆盖被纠正的概念,而保持其他概念预测不变,这忽略了概念间的因果依赖。为解决此问题,我们提出了因果神经概率电路(CNPC),它结合了神经属性预测器和从因果图编译的因果概率电路。该电路支持精确、易处理的因果推理,天然尊重因果依赖。在干预下,CNPC基于专家乘积(PoE)建模类别分布,融合了属性预测器的预测分布和电路计算的干预边际。我们从理论上刻画了CNPC相对于其模块的组合干预误差,并确定了CNPC接近真实干预类别分布的条件。在五个基准数据集上的分布内和分布外实验表明,与五个基线模型相比,CNPC在不同干预属性数量下均实现了更高的任务准确率。

English abstract

Concept Bottleneck Models (CBMs) enhance the interpretability of end-to-end neural networks by introducing a layer of concepts and predicting the class label from the concept predictions. A key property of CBMs is that they support interventions, i.e., domain experts can correct mispredicted concept values at test time to improve the final accuracy. However, typical CBMs apply interventions by overwriting only the corrected concept while leaving other concept predictions unchanged, which ignores causal dependencies among concepts. To address this, we propose the Causal Neural Probabilistic Circuit (CNPC), which combines a neural attribute predictor with a causal probabilistic circuit compiled from a causal graph. This circuit supports exact, tractable causal inference that inherently respects causal dependencies. Under interventions, CNPC models the class distribution based on a Product of Experts (PoE) that fuses the attribute predictor's predictive distribution with the interventional marginals computed by the circuit. We theoretically characterize the compositional interventional error of CNPC w.r.t. its modules and identify conditions under which CNPC closely matches the ground-truth interventional class distribution. Experiments on five benchmark datasets in both in-distribution and out-of-distribution settings show that, compared with five baseline models, CNPC achieves higher task accuracy across different numbers of intervened attributes.
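
The Product of Experts fusion mentioned above has a simple generic form for categorical distributions: multiply the experts' probabilities class by class and renormalize. The sketch below shows only that generic operation; how CNPC weights or conditions its two experts is not stated in the abstract, and the function name is hypothetical.

```python
def product_of_experts(p, q):
    """Fuse two categorical distributions over the same classes: r_i ∝ p_i * q_i."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same classes")
    unnormalized = [a * b for a, b in zip(p, q)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("experts assign zero joint mass to every class")
    return [u / total for u in unnormalized]
```

For example, fusing an uninformative expert `[0.5, 0.5]` with `[0.8, 0.2]` returns approximately `[0.8, 0.2]`: an expert with no opinion leaves the other expert's distribution unchanged, while a class that either expert rules out receives zero mass.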

2602.00423 2026-06-03 cs.LG updated version

scBatchProx: Federated-Inspired Refinement for Stable Cell-Type Discriminability under Heterogeneous Batch Compositions

scBatchProx:异质性批次组成下稳定细胞类型可区分性的联邦启发式精炼

Quang-Huy Nguyen, Jiaqi Wang, Wei-Shinn Ku

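Affiliations: National Institutes of Health (NIH)

AI summary: Proposes scBatchProx, a lightweight post-hoc method that stabilizes single-cell latent embeddings through federated-learning-inspired optimization and conservative regularization, improving cell-type classification under heterogeneous batches.

AI Chinese abstract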

单细胞整合工作流通常构建低维细胞嵌入,然后使用后处理方法减少批次效应。当细胞类型组成在不同批次间变化,某些群体在特定批次中代表性不足或缺失时,这种精炼过程可能变得不稳定。在动态单细胞数据系统中,新获取的批次可能改变技术条件和细胞类型组成,问题变得更加严重。这种不稳定性会降低下游细胞类型分类性能,并削弱在失衡扰动下的稳定性。我们引入scBatchProx,一种轻量级后处理方法,用于在这些异质和不断变化的环境中稳定单细胞潜在嵌入。scBatchProx直接操作预计算嵌入,并将每个批次或研究视为联邦启发优化过程中的客户端。批次条件FiLM适配器学习局部潜在更新,而近端和身份保持正则化使这些更新保持保守。在多批次和跨研究单细胞数据集上的实验表明,scBatchProx在不同上游嵌入上改善了下游细胞类型分类。在受控失衡扰动中,当选定群体从一个批次中降采样或移除时,scBatchProx维持更稳定的受影响细胞类型F1分数。在累积重训练和持续整合设置中,随着新数据集随时间到达,scBatchProx保持有效。这些结果共同表明,保守的联邦启发式精炼有助于在批次组成随数据集和时间变化时维持稳定的单细胞嵌入。

English abstract

Single-cell integration workflows often construct low-dimensional cell embeddings and then refine them with post-hoc methods to reduce batch effects. This refinement process can become unstable when cell-type compositions vary across batches, with some populations underrepresented or absent in particular batches. The problem becomes more consequential in dynamic single-cell data systems, where newly acquired batches can change both technical conditions and cell-type composition. Such instability can reduce downstream cell-type classification performance and weaken stability under imbalance perturbations. We introduce scBatchProx, a lightweight post-hoc refinement method for stabilizing single-cell latent embeddings in these heterogeneous and evolving settings. scBatchProx operates directly on precomputed embeddings and treats each batch or study as a client in a federated-inspired optimization procedure. A batch-conditioned FiLM adapter learns local latent updates, while proximal and identity-preserving regularization keep these updates conservative. Experiments on multi-batch and cross-study single-cell datasets show that scBatchProx improves downstream cell-type classification across different upstream embeddings. In controlled imbalance perturbations, scBatchProx maintains more stable affected-cell-type F1 when selected populations are downsampled or ablated from one batch. In cumulative retraining and continual integration settings, scBatchProx remains effective as new datasets arrive over time. Together, these results suggest that conservative, federated-inspired refinement can help maintain stable single-cell embeddings as batch compositions change across datasets and over time.

2505.23725 2026-06-03 cs.LG updated version

MuLoCo: Muon is a practical inner optimizer for DiLoCo

MuLoCo: Muon 是 DiLoCo 的实用内部优化器

Benjamin Thérien, Xiaolong Huang, Aaron Defazio, Irina Rish, Eugene Belilovsky

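Affiliations: FAIR at Meta; Mila Université de Montréal; Concordia University

AI summary: Proposes MuLoCo, which uses Muon as the inner optimizer for DiLoCo. By producing more directionally accurate pseudogradients, it improves large language model training across multiple workers while remaining compatible with quantization, streaming, and long synchronization intervals.

AI Chinese abstract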

DiLoCo 是一个强大的大语言模型(LLM)训练框架,能够在网络约束下实现更大的最优批大小和更高的加速器利用率。然而,研究表明 DiLoCo 的性能会随着工作节点数(K)的增加而下降(Charles 等人,2025)。在这项工作中,我们认为 DiLoCo 行为中一个相关但常被忽视的因素是内部优化器的选择,它塑造了外部优化器使用的伪梯度。鉴于最近 Muon 相对于 AdamW 在数据并行(DP)训练中的成功,我们研究了 Muon 的归一化优化器步骤如何影响伪梯度的质量。我们发现,相对于 AdamW,随着工作节点数(K)的增加,Muon 产生方向更正确的伪梯度。在我们预训练语言模型的实验中,我们对 150M、416M、914M、1.76B 和 3.1B 模型的 DiLoCo、MuLoCo、AdamW DP 和 Muon DP 进行了广泛的超参数调优。在所有规模上一致地发现,当 K≥1 时,MuLoCo(Muon 内部优化器 DiLoCo)在绝对性能上优于 DiLoCo,并且当 K>2 时,相对于它们各自的数据并行基线,MuLoCo 优于 DiLoCo,同时兼容量化、流式处理和长同步间隔。当 K=1 时,我们发现 MuLoCo 甚至可以优于数据并行黄金标准,同时具有更大的临界批大小。最后,我们将最优超参数外推到 15B 规模,并使用 K=1 和 K=16 个工作节点训练每个方法(共六种)的模型。我们发现,在此规模下,K=16 的 MuLoCo 几乎匹配单工作节点性能,而 K=1 的 MuLoCo 在使用更大的 16M token 批大小时匹配最佳基线性能。

English abstract

DiLoCo is a powerful framework for training large language models (LLMs), enabling larger optimal batch sizes and increased accelerator utilization under networking constraints. However, DiLoCo's performance has been shown to degrade as the number of workers ($K$) increases (Charles et al., 2025). In this work, we posit that a related but often overlooked factor in DiLoCo's behavior is the choice of inner optimizer, which shapes the pseudogradient used by the outer optimizer. Given the recent success of Muon relative to AdamW for data parallel (DP) training, we examine how Muon's normalized optimizer steps can affect the pseudogradient's quality. We find that, relative to AdamW, Muon yields more directionally correct pseudogradients as $K$ increases. In our experiments pre-training language models, we conduct extensive hyperparameter tuning across 150M, 416M, 914M, 1.76B, and 3.1B models for DiLoCo, MuLoCo, AdamW DP, and Muon DP. Consistently across all scales, we find that with $K \geq 1$ workers, MuLoCo (DiLoCo with a Muon inner optimizer) achieves superior performance to DiLoCo in absolute terms, and for $K > 2$ it outperforms DiLoCo relative to their data parallel baselines, while being compatible with quantization, streaming, and long synchronization intervals. At $K=1$, we find that MuLoCo can even outperform the data-parallel gold standard while having larger critical batch sizes. Finally, we extrapolate optimal hyperparameters to 15B scale and train a model with each method (six in total) using $K=1$ and $K=16$ workers. We find that $K=16$ MuLoCo nearly matches single-worker performance at this scale, while $K=1$ MuLoCo matches the best-performing baseline while using a much larger 16M-token batch size.
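
The abstract refers to the pseudogradient consumed by the outer optimizer without giving its formula. A minimal sketch, assuming the usual local-update formulation: each worker starts from the shared parameters, runs its inner optimizer for some steps, and the pseudogradient is the shared starting point minus the average of the workers' resulting parameters. The plain SGD outer step is a simplification for illustration; the function names are hypothetical and parameters are flat lists of floats.

```python
def pseudogradient(global_params, worker_params):
    """delta_j = theta_global_j - mean over workers of theta_worker_j."""
    k = len(worker_params)
    return [g - sum(w[j] for w in worker_params) / k
            for j, g in enumerate(global_params)]


def outer_step(global_params, worker_params, outer_lr=1.0):
    """Apply one outer update, treating the pseudogradient as a gradient."""
    delta = pseudogradient(global_params, worker_params)
    return [g - outer_lr * d for g, d in zip(global_params, delta)]
```

With `outer_lr=1.0` this reduces to plain parameter averaging across workers. The choice of inner optimizer (AdamW or Muon in the abstract) only changes how each entry of `worker_params` was produced, which is why it shapes the direction of `delta`.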

2602.20217 2026-06-03 cs.LG cs.AI updated version

KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem

KnapSpec: 通过自适应层选择作为背包问题的自推测解码

Seongjin Cha, Gyuwan Kim, Dongsu Han, Tao Yang, Insu Han

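Affiliations: KAIST

AI summary: Proposes KnapSpec, a training-free framework that recasts draft model selection as a knapsack problem. By decoupling attention and MLP layers and modeling their hardware-specific latencies, it uses a parallel dynamic programming algorithm to adaptively determine the optimal draft configuration and maximize token throughput.

Comments Accepted to ICML 2026

AI Chinese abstract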

自推测解码(SSD)通过跳过层来创建高效的草稿模型,从而加速LLM推理,但现有方法通常依赖静态启发式,忽略了长上下文场景中注意力的动态计算开销。我们提出KnapSpec,一种无需训练的框架,将草稿模型选择重新表述为背包问题,以最大化每时间令牌吞吐量。通过解耦注意力与MLP层,并将其硬件特定延迟建模为上下文长度的函数,KnapSpec通过并行动态规划算法自适应地即时识别最优草稿配置。此外,我们提供了首个严格的理论分析,建立了隐藏状态之间的余弦相似度作为令牌接受率的数学上合理的代理。这一基础使得我们的方法在导航现实世界硬件的动态瓶颈时,能够保持高草稿保真度。我们在Qwen3和Llama3上的实验表明,KnapSpec始终优于最先进的SSD基线,在各种基准测试中实现了高达1.47倍的墙钟加速。我们的即插即用方法确保了长序列的高效推理,无需额外训练或损害目标模型的输出分布。

English abstract

Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on static heuristics that ignore the dynamic computational overhead of attention in long-context scenarios. We propose KnapSpec, a training-free framework that reformulates draft model selection as a knapsack problem to maximize tokens-per-time throughput. By decoupling Attention and MLP layers and modeling their hardware-specific latencies as functions of context length, KnapSpec adaptively identifies optimal draft configurations on the fly via a parallel dynamic programming algorithm. Furthermore, we provide the first rigorous theoretical analysis establishing cosine similarity between hidden states as a mathematically sound proxy for the token acceptance rate. This foundation allows our method to maintain high drafting faithfulness while navigating the shifting bottlenecks of real-world hardware. Our experiments on Qwen3 and Llama3 demonstrate that KnapSpec consistently outperforms state-of-the-art SSD baselines, achieving up to 1.47x wall-clock speedup across various benchmarks. Our plug-and-play approach ensures high-speed inference for long sequences without requiring additional training or compromising the target model's output distribution.
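
To make the knapsack framing concrete, here is a textbook 0/1 knapsack solved by dynamic programming over layers. It is only a sketch of the problem class: the integer latencies, the additive per-layer scores, and the name `select_layers` are illustrative assumptions, and KnapSpec's actual objective (tokens-per-time throughput) and its parallel DP are not reproduced here.

```python
def select_layers(latencies, scores, budget):
    """0/1 knapsack: keep the layers with maximum total score whose total
    latency fits the budget. Latencies are non-negative ints, scores positive.
    Returns (best_score, sorted indices of the kept layers)."""
    n = len(latencies)
    # best[i][b] = best score using only the first i layers within latency b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, gain = latencies[i - 1], scores[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip layer i-1
            if cost <= b and best[i - 1][b - cost] + gain > best[i][b]:
                best[i][b] = best[i - 1][b - cost] + gain  # option 2: keep it
    # Walk the table backwards to recover which layers were kept.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= latencies[i - 1]
    return best[n][budget], sorted(chosen)
```

Because the abstract models latency as a function of context length, the `latencies` input would have to be recomputed as the context grows, which is presumably why the selection is redone on the fly.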

2602.16666 2026-06-03 cs.AI cs.CY cs.LG updated version

Towards a Science of AI Agent Reliability

迈向AI代理可靠性的科学

Stephan Rabanser, Sayash Kapoor, Peter Kirgis, Kangheng Liu, Saiteja Utpala, Arvind Narayanan

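Affiliations: University of California, Berkeley

AI summary: Proposes twelve concrete metrics that decompose AI agent reliability along four dimensions (consistency, robustness, predictability, and safety), and shows experimentally that capability gains have yielded only small improvements in reliability.

Comments Accepted at ICML 2026. Interactive dashboard available at: https://hal.cs.princeton.edu/reliability

AI Chinese abstract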

AI代理越来越多地被部署来执行重要任务。虽然标准基准测试上的准确率分数不断提高表明进展迅速,但许多代理在实践中仍然持续失败。这种差异凸显了当前评估的一个根本局限性:将代理行为压缩为单一成功指标会掩盖关键的操作缺陷。值得注意的是,它忽略了代理是否在不同运行中表现一致、能否承受扰动、是否可预测地失败,或者错误严重性是否有界。基于安全关键工程,我们通过提出十二个具体指标来提供全面的性能概况,这些指标将代理可靠性分解为四个关键维度:一致性、鲁棒性、可预测性和安全性。在两个互补基准测试上评估15个模型,我们发现最近的能力提升仅带来了可靠性的小幅改进。通过暴露这些持续的局限性,我们的指标补充了传统评估,同时提供了推理代理如何表现、退化和失败的工具。

English abstract

AI agents are increasingly deployed to execute important tasks. While rising accuracy scores on standard benchmarks suggest rapid progress, many agents continue to fail in practice. This discrepancy highlights a fundamental limitation of current evaluations: compressing agent behavior into a single success metric obscures critical operational flaws. Notably, it ignores whether agents behave consistently across runs, withstand perturbations, fail predictably, or have bounded error severity. Grounded in safety-critical engineering, we provide a holistic performance profile by proposing twelve concrete metrics that decompose agent reliability along four key dimensions: consistency, robustness, predictability, and safety. Evaluating 15 models across two complementary benchmarks, we find that recent capability gains have only yielded small improvements in reliability. By exposing these persistent limitations, our metrics complement traditional evaluations while offering tools for reasoning about how agents perform, degrade, and fail.

2602.18690 2026-06-03 q-bio.NC cs.CV cs.LG updated version

Neural Fields as World Models

神经场作为世界模型

Joshua Nunley

Affiliations: Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington; Cognitive Science Program, Indiana University, Bloomington

AI summary: Proposes isomorphic world models that use motor-gated neural fields to perform physical prediction within a spatial map, enabling offline task learning and body-linked representations.

Comments 6 pages, 6 figures. Annual Meeting of the Cognitive Science Society (CogSci 2026)

AI Chinese abstract

人类可以在离线状态下预演可能的未来,例如在心理练习和可能的梦境中,这表明世界模型可能支持远离环境的学习。标准的机器学习世界模型将视觉输入压缩为潜在向量,丢弃了感觉皮层的空间结构特征。我们提出了同构世界模型:一种保持感觉拓扑结构的架构,使得物理预测成为几何传播而非抽象状态转换。我们通过运动门控神经场实现这一想法,其中活动通过局部侧向连接演化,运动命令乘性地调制特定通道。在三个实验中,相同的架构学习了无“瞬移”的弹道预测,通过将任务误差通过冻结的学习世界模型传播,改进了离线接球策略,并在没有身体标签的情况下发展出身体选择性的运动通道。这些结果提供了初步证据,表明物理预测、离线任务学习和身体相关表征共享一个共同的计算基础:空间地图内的动作条件预测。

English abstract

Humans rehearse possible futures offline, as in mental practice and perhaps dreaming, suggesting that world models may support task learning away from the environment. Standard machine learning world models compress visual input into latent vectors, discarding the spatial structure that characterizes sensory cortex. We propose isomorphic world models: architectures that preserve sensory topology, so physics prediction becomes geometric propagation rather than abstract state transition. We implement this idea with motor-gated neural fields, where activity evolves through local lateral connectivity and motor commands multiplicatively modulate specific channels. Across three experiments, the same architecture learns ballistic prediction without "teleporting," improves a catching policy offline by propagating task error through a frozen learned world model, and develops body-selective motor channels without body labels. These results provide preliminary evidence that physical prediction, offline task learning, and body-linked representation share a common computational substrate: action-conditional prediction within a spatial map.

2602.18084 2026-06-03 cs.LG updated version

Balancing Symmetry and Efficiency in Graph Flow Matching

平衡图流匹配中的对称性与效率

Benjamin Honoré, Alba Carballo-Castro, Yiming Qin, Pascal Frossard

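Affiliations: LTS4, EPFL, Lausanne, Switzerland

AI summary: Using a controllable symmetry modulation scheme, studies the trade-off in graph generative models between the computational cost and convergence speed imposed by strict equivariance, and finds that properly modulating symmetry can accelerate convergence while avoiding overfitting.

Comments 15 pages, 11 figures

AI Chinese abstract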

等变性是图生成模型的核心,因为它确保模型尊重图的置换对称性。然而,严格的等变性由于增加了架构约束而提高了计算成本,并且由于模型必须在大量可能的节点置换空间上保持一致而可能减慢收敛速度。我们研究了图生成模型中的这种权衡。具体来说,我们从等变离散流匹配模型出发,在训练过程中通过基于正弦位置编码和节点置换的可控对称调制方案来放松其等变性。实验首先表明,对称性破缺可以通过提供更简单的学习信号来加速早期训练,但代价是鼓励捷径解决方案,可能导致过拟合,即模型重复生成训练集的重复图。相反,适当调节对称性信号可以延迟过拟合,同时加速收敛,使模型在基线训练周期的19%内达到更强的性能。

English abstract

Equivariance is central to graph generative models, as it ensures the model respects the permutation symmetry of graphs. However, strict equivariance can increase computational cost due to added architectural constraints, and can slow down convergence because the model must be consistent across a large space of possible node permutations. We study this trade-off for graph generative models. Specifically, we start from an equivariant discrete flow-matching model, and relax its equivariance during training via a controllable symmetry modulation scheme based on sinusoidal positional encodings and node permutations. Experiments first show that symmetry-breaking can accelerate early training by providing an easier learning signal, but at the expense of encouraging shortcut solutions that can cause overfitting, where the model repeatedly generates graphs that are duplicates of the training set. In contrast, properly modulating the symmetry signal can delay overfitting while accelerating convergence, allowing the model to reach stronger performance with 19% of the baseline training epochs.

2502.08834 2026-06-03 cs.LG cs.AI stat.ML updated version

Rex: A Family of Reversible Exponential (Stochastic) Runge-Kutta Solvers

Rex: 一族可逆指数(随机)龙格-库塔求解器

Zander W. Blasingame, Chen Liu

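Affiliations: University of Washington

AI summary: Proposes the Rex family of solvers, which uses Lawson methods to convert explicit (stochastic) Runge-Kutta schemes into algebraically reversible ones for diffusion ODEs and SDEs, achieving near-machine-precision reconstruction and improving the performance of flow and diffusion models.

Comments Accepted as an Oral presentation at ICML 2026

AI Chinese abstract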

基于神经微分方程的深度生成模型已成为许多生成任务的最先进方法。这些模型依赖于从先验分布积分到数据分布的ODE/SDE求解器;在许多应用中,逆方向积分也非常可取。然而,标准求解器会累积离散误差,阻碍精确反演,这种不准确性在精度关键的应用中是不可接受的。现有的反演方法稳定性差、收敛阶低,且严格限于ODE设置。在这项工作中,我们提出Rex,一族可逆指数(随机)龙格-库塔求解器,通过应用Lawson方法将任何显式(随机)龙格-库塔格式转化为扩散ODE和SDE的代数可逆格式。除了严格的理论分析——建立任意阶收敛性和非零线性稳定区域——我们通过实验证明Rex实现了近机器精度的重建,并改进了基于流模型的玻尔兹曼采样以及基于扩散模型的图像生成和编辑。

English abstract

Deep generative models based on neural differential equations have become state-of-the-art for many generation tasks. These models rely on ODE/SDE solvers that integrate from a prior distribution to the data distribution; in many applications it is also highly desirable to integrate in the inverse direction. Standard solvers, however, accumulate discretization errors that prohibit exact inversion, an inaccuracy that is unacceptable in precision-critical applications. Existing inversion methods suffer from poor stability and low order of convergence, and are strictly limited to the ODE setting. In this work, we propose Rex, a family of reversible exponential (stochastic) Runge-Kutta solvers obtained by applying Lawson methods to convert any explicit (stochastic) Runge-Kutta scheme into an algebraically reversible one for both diffusion ODEs and SDEs. Beyond a rigorous theoretical analysis (establishing arbitrary-order convergence and a non-zero region of linear stability), we empirically demonstrate that Rex achieves near-machine-precision reconstruction and improves Boltzmann sampling with flow models as well as image generation and editing with diffusion models.

2602.17149 2026-06-03 cs.LG cs.AI updated version

TimeOmni-VL: Unified Models for Time Series Understanding and Generation

TimeOmni-VL:统一时间序列理解与生成的模型

Tong Guan, Sheng Pan, Johan Barthelemy, Zhao Li, Yujun Cai, Cesare Alippi, Ming Jin, Shirui Pan

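Affiliations: Tsinghua University

AI summary: Proposes the TimeOmni-VL framework, which unifies time series understanding and generation for the first time through fidelity-preserving bidirectional mapping and understanding-guided generation.

Comments Accepted by the Forty-third International Conference on Machine Learning (ICML 2026)

AI Chinese abstract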

近期的时间序列建模在数值生成与语义理解之间存在明显鸿沟,研究表明生成模型往往依赖浅层模式匹配,而理解导向的模型难以输出高保真数值。尽管统一多模态模型(UMMs)已在视觉领域弥合这一差距,但其在时间序列上的潜力尚未被发掘。我们提出TimeOmni-VL,这是首个以视觉为中心的统一时间序列理解与生成框架,通过两项关键创新实现:(1)时间序列与图像之间的保真双向映射(Bi-TSI),改进了时间序列到图像(TS2I)和图像到时间序列(I2TS)的转换,确保近乎无损的变换。(2)理解引导生成。我们引入TSUMM-Suite,这是一个新颖的数据集,包含六个基于时间序列分析的理解任务,并耦合两个生成任务。通过校准的思维链,TimeOmni-VL首次利用时间序列理解作为高保真生成的显式控制信号。实验证实,这种统一方法显著提升了语义理解和数值精度,为多模态时间序列建模开辟了新前沿。

English abstract

Recent time series modeling faces a sharp divide between numerical generation and semantic understanding, with research showing that generation models often rely on superficial pattern matching, while understanding-oriented models struggle with high-fidelity numerical output. Although unified multimodal models (UMMs) have bridged this gap in vision, their potential for time series remains untapped. We propose TimeOmni-VL, the first vision-centric framework that unifies time series understanding and generation through two key innovations: (1) Fidelity-preserving bidirectional mapping between time series and images (Bi-TSI), which advances Time Series-to-Image (TS2I) and Image-to-Time Series (I2TS) conversions to ensure near-lossless transformations. (2) Understanding-guided generation. We introduce TSUMM-Suite, a novel dataset consisting of six understanding tasks rooted in time series analytics and coupled with two generation tasks. With a calibrated Chain-of-Thought, TimeOmni-VL is the first to leverage time series understanding as an explicit control signal for high-fidelity generation. Experiments confirm that this unified approach significantly improves semantic understanding and numerical precision, establishing a new frontier for multimodal time series modeling.

2602.17063 2026-06-03 cs.LG cs.AI cs.CL cs.CV updated version

Sign Lock-In: Randomly Initialized Weight Signs Persist and Bottleneck Sub-Bit Model Compression

符号锁定:随机初始化的权重符号持续存在并成为亚比特模型压缩的瓶颈

Akira Sakai, Yuma Ichikawa

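Affiliations: Fujitsu Limited; Tokai University; Riken Center for AIP

AI summary: Studies the sign-bit bottleneck in sub-bit model compression, explains the origin of weight-sign randomness through sign lock-in theory, and proposes a from-scratch low-rank sign-template training method to break through the bottleneck.

Comments Accepted at the Forty-Third International Conference on Machine Learning (ICML 2026)

AI Chinese abstract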

亚比特模型压缩的目标是将每个权重的存储降至1比特以下;当幅度被激进压缩时,符号位成为固定成本的瓶颈。在Transformer、CNN和MLP中,学习到的符号矩阵抵抗低秩近似,并且在频谱上与i.i.d. Rademacher基线无法区分。这种随机性导致了亚比特模型压缩的下界——1比特墙。尽管存在这种明显的随机性,大多数权重仍保留其初始化符号;翻转主要通过罕见的近零边界穿越发生,表明符号模式的随机性很大程度上继承自初始化。我们通过符号锁定理论形式化了这一行为,这是对SGD噪声下符号翻转的停时分析。在有界更新和零的小邻域内罕见重新进入的条件下,有效符号翻转的数量呈现几何尾部。基于这一机制,我们引入了一种从头开始的低秩符号模板训练方法,以防止这种1比特墙的出现。

English abstract

Sub-bit model compression targets storage below one bit per weight; as magnitudes are aggressively compressed, the sign bit becomes a fixed-cost bottleneck. Across Transformers, CNNs, and MLPs, learned sign matrices resist low-rank approximation and are spectrally indistinguishable from an i.i.d. Rademacher baseline. This randomness gives rise to a lower bound on sub-bit model compression: the one-bit wall. Despite this apparent randomness, most weights retain their initialization signs; flips primarily occur via rare near-zero boundary crossings, suggesting that sign-pattern randomness is largely inherited from initialization. We formalize this behavior with sign lock-in theory, a stopping-time analysis of sign flips under SGD noise. Under bounded updates and a rare re-entry condition into a small neighborhood of zero, the number of effective sign flips exhibits a geometric tail. Building on this mechanism, we introduce a from-scratch low-rank sign-template training method that prevents the emergence of this one-bit wall.

2602.14279 2026-06-03 cs.LG cs.AI cs.CL cs.SI updated version

Whom to Query for What: Adaptive Group Elicitation via Multi-Turn LLM Interactions

向谁询问什么:通过多轮LLM交互的自适应群体征询

Ruomeng Ding, Tianwei Gao, Thomas P. Zollo, Eitan Bachmat, Richard Zemel, Zhun Deng

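Affiliations: University of North Carolina at Chapel Hill; Columbia University; Ben-Gurion University of the Negev

AI summary: To reduce uncertainty about group-level properties under a limited budget, proposes an adaptive group elicitation framework that combines an LLM-based expected information gain objective with heterogeneous graph neural network propagation to jointly select questions and respondents, substantially improving group response prediction on three real-world datasets.

Comments Published as a conference paper at ICML 2026

AI Chinese abstract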

从调查和其他集体评估中征询信息以减少关于潜在群体属性的不确定性,需要在实际成本和缺失数据下分配有限的提问努力。尽管大型语言模型支持自然语言中的自适应多轮交互,但大多数现有征询方法优化了在固定受访者池中询问什么,并且在响应部分或不完整时不会调整受访者选择或利用群体结构。为解决这一差距,我们研究了自适应群体征询,这是一个多轮设置,其中智能体在明确的查询和参与预算下自适应地选择问题和受访者。我们提出了一个理论基础的框架,该框架结合了(i)基于LLM的期望信息增益目标,用于评分候选问题,以及(ii)异构图神经网络传播,该传播聚合观察到的响应和参与者属性,以插补缺失响应并指导每轮受访者选择。这种闭环过程查询一个小的、信息丰富的个体子集,同时通过结构化相似性推断群体级别的响应。在三个真实世界意见数据集上,我们的方法在预算受限的情况下持续提高了群体级别响应预测,包括在10%受访者预算下CES上相对提升超过12%。

English abstract

Eliciting information to reduce uncertainty about latent group-level properties from surveys and other collective assessments requires allocating limited questioning effort under real costs and missing data. Although large language models enable adaptive, multi-turn interactions in natural language, most existing elicitation methods optimize what to ask with a fixed respondent pool, and do not adapt respondent selection or leverage population structure when responses are partial or incomplete. To address this gap, we study adaptive group elicitation, a multi-round setting where an agent adaptively selects both questions and respondents under explicit query and participation budgets. We propose a theoretically grounded framework that combines (i) an LLM-based expected information gain objective for scoring candidate questions with (ii) heterogeneous graph neural network propagation that aggregates observed responses and participant attributes to impute missing responses and guide per-round respondent selection. This closed-loop procedure queries a small, informative subset of individuals while inferring population-level responses via structured similarity. Across three real-world opinion datasets, our method consistently improves population-level response prediction under constrained budgets, including a >12% relative gain on CES at a 10% respondent budget.
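
The expected information gain objective used to score candidate questions can be written out for a small discrete case: the prior entropy minus the expected posterior entropy after seeing the answer. The sketch below assumes an explicit likelihood table over a finite set of hypotheses and answers; the paper's LLM-based estimate of these quantities is not reproduced, and the function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(x * math.log(x) for x in p if x > 0)


def expected_information_gain(prior, likelihood):
    """EIG of asking a question.

    prior[h]         : P(hypothesis h)
    likelihood[h][a] : P(answer a | hypothesis h) for this question
    """
    n_hyp, n_answers = len(prior), len(likelihood[0])
    gain = entropy(prior)
    for a in range(n_answers):
        p_a = sum(prior[h] * likelihood[h][a] for h in range(n_hyp))
        if p_a == 0:
            continue  # this answer never occurs under the prior
        posterior = [prior[h] * likelihood[h][a] / p_a for h in range(n_hyp)]
        gain -= p_a * entropy(posterior)
    return gain
```

A question whose answer identifies the hypothesis exactly earns the full prior entropy (log 2 nats for two equally likely hypotheses), while a question answered the same way under every hypothesis earns zero, so ranking candidate questions by this score favors the most discriminating one.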

2602.11908 2026-06-03 cs.AI cs.CL cs.LG updated version

When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation

LLM何时应降低具体性?面向可靠长文本生成的选择性抽象

Shani Goren, Ido Galil, Ran El-Yaniv

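Affiliations: Technion; NVIDIA

AI summary: To keep LLMs from discarding valuable information in long-form generation when confidence is low, proposes a selective abstraction framework that replaces uncertain content with atom-level abstractions, improving accuracy and reliability while preserving meaning.

AI Chinese abstract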

LLM被广泛使用,但仍容易出现事实错误,这削弱了用户信任并限制了在高风险场景中的采用。缓解这一风险的一种方法是为模型配备不确定性估计机制,在置信度低时弃权。然而,这种二元的“全有或全无”方法在长文本场景中过于严格,常常丢弃有价值的信息。我们引入了选择性抽象(SA),这是一个框架,使LLM能够通过选择性地降低不确定内容的细节来用具体性换取可靠性。我们首先通过选择性风险和覆盖率的视角形式化SA。然后,我们提出原子级选择性抽象,这是一种声明级别的实例化,将响应分解为原子声明(简短、自包含的陈述,每个表达一个单一事实),并用更高置信度、更低具体性的抽象替换不确定的原子。为了评估这一框架,我们开发了一个新颖的端到端流水线用于开放式生成,将风险实例化为事实正确性,并使用信息论度量保留信息来衡量覆盖率。在FactScore和LongFact-Objects基准测试上的六个开源模型中,原子级SA始终优于现有基线,在风险-覆盖率曲线下面积(AURC)上比声明移除方法提升高达27.73%,表明降低具体性可以在保留大部分原始含义的同时提升准确性和可靠性。

English abstract

LLMs are widely used, yet they remain prone to factual errors that erode user trust and limit adoption in high-risk settings. One approach to mitigate this risk is to equip models with uncertainty estimation mechanisms that abstain when confidence is low. However, this binary "all-or-nothing" approach is excessively restrictive in long-form settings, often discarding valuable information. We introduce Selective Abstraction (SA), a framework that enables LLMs to trade specificity for reliability by selectively reducing the detail of uncertain content. We first formalize SA through the lenses of selective risk and coverage. We then propose Atom-wise Selective Abstraction, a claim-level instantiation that decomposes responses into atomic claims (short, self-contained statements each expressing a single fact) and replaces uncertain atoms with higher-confidence, less specific abstractions. To evaluate this framework, we develop a novel end-to-end pipeline for open-ended generation that instantiates risk as factual correctness and measures coverage using an information-theoretic measure of retained information. Across six open-source models on the FactScore and LongFact-Objects benchmarks, atom-wise SA consistently outperforms existing baselines, improving the area under the risk-coverage curve (AURC) by up to 27.73% over claim removal, demonstrating that reducing specificity can boost accuracy and reliability while preserving most of the original meaning.
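
The risk-coverage curve and its area (AURC) can be sketched in their classical selective-prediction form, where items are ranked by confidence and coverage is the fraction of items kept. This is an assumption-laden illustration: the paper measures coverage with an information-theoretic notion of retained information, which is not reproduced here, and the function names are hypothetical.

```python
def risk_coverage_curve(confidences, errors):
    """Return [(coverage, selective_risk)] as items are kept in order of
    decreasing confidence. errors[i] is 1 if item i is wrong, else 0."""
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    curve, cumulative = [], 0.0
    for k, i in enumerate(order, start=1):
        cumulative += errors[i]
        curve.append((k / len(order), cumulative / k))
    return curve


def aurc(confidences, errors):
    """Area under the risk-coverage curve, approximated as the mean
    selective risk over all coverage levels (lower is better)."""
    curve = risk_coverage_curve(confidences, errors)
    return sum(risk for _, risk in curve) / len(curve)
```

With confidences `[0.9, 0.8, 0.3, 0.1]` and errors `[0, 0, 1, 1]`, the selective risk stays at zero until the two low-confidence items are admitted. A confidence score that ranks errors last therefore yields a low AURC, and abstraction aims to lower risk without giving up as much coverage as outright removal.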

2602.10949 2026-06-03 stat.ML cs.LG math.DS math.PR updated version

Optimal Initialization in Depth: Lyapunov Initialization and Limit Theorems for Deep Leaky ReLU Networks

深度网络的最优初始化:深度Leaky ReLU网络的Lyapunov初始化与极限定理

Constantin Kogler, Tassilo Schwarz, Samuel Kittle

Affiliations: School of Mathematics, Institute for Advanced Study; Mathematical Institute, University of Oxford; Max Planck Institute for Multidisciplinary Sciences; Department of Mathematics, University College London

AI summary: Through a rigorous probabilistic analysis of random deep Leaky ReLU networks, proposes Lyapunov initialization, which sets the Lyapunov exponent to zero to ensure activation stability and thereby improve learning.

Comments Preprint, 44 pages

AI Chinese abstract

深度网络的有效初始化需要理解随机神经网络。本文对深度无偏置随机Leaky ReLU网络进行了严格的概率分析。我们证明了网络激活范数对数的强大数定律和中心极限定理,表明随着层数增加,其增长由称为Lyapunov指数的参数控制。该参数刻画了激活消失与爆炸之间的尖锐相变,并针对高斯或正交权重矩阵显式计算了Lyapunov指数。我们的结果表明,标准方法(如He初始化或正交初始化)无法保证低宽度深度网络的激活稳定性。基于这些理论见解,我们提出了一种新的初始化方法,称为Lyapunov初始化,它将Lyapunov指数设为零,从而确保神经网络尽可能稳定,经验上导致学习改进。

英文摘要

Effective initialization in deep networks requires an understanding of random neural networks. In this work, a rigorous probabilistic analysis of deep bias-free random Leaky ReLU networks is provided. We prove a Law of Large Numbers and a Central Limit Theorem for the logarithm of the norm of network activations, establishing that, as the number of layers increases, their growth is governed by a parameter called the Lyapunov exponent. This parameter characterizes a sharp phase transition between vanishing and exploding activations, and we calculate the Lyapunov exponent explicitly for Gaussian or orthogonal weight matrices. Our results reveal that standard methods, such as He initialization or orthogonal initialization, do not guarantee activation stability for deep networks of low width. Based on these theoretical insights, we propose a novel initialization method, referred to as Lyapunov initialization, which sets the Lyapunov exponent to zero and thereby ensures that the neural network is as stable as possible, leading empirically to improved learning.
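The role of the Lyapunov exponent can be illustrated numerically. The sketch below is a Monte Carlo estimate, not the explicit formula the abstract refers to: it tracks the average per-layer growth of the log activation norm in a bias-free Leaky ReLU network with Gaussian weights, then rescales the weight standard deviation so that the estimated exponent is zero. The function names, width, depth, and slope are illustrative assumptions.

```python
import math
import random


def leaky_relu(v, alpha):
    return [x if x > 0 else alpha * x for x in v]


def lyapunov_estimate(width, depth, sigma, alpha=0.1, seed=0):
    """Monte Carlo estimate of (1/depth) * log ||x_depth|| for a bias-free network."""
    rng = random.Random(seed)
    x = [1.0 / math.sqrt(width)] * width          # unit-norm input
    log_norm = 0.0
    for _ in range(depth):
        W = [[rng.gauss(0.0, sigma) for _ in range(width)] for _ in range(width)]
        x = leaky_relu([sum(w * xi for w, xi in zip(row, x)) for row in W], alpha)
        norm = math.sqrt(sum(v * v for v in x))
        log_norm += math.log(norm)                # accumulate growth in log space
        x = [v / norm for v in x]                 # renormalize to avoid overflow
    return log_norm / depth


def zero_exponent_sigma(width, depth, sigma, alpha=0.1, seed=0):
    """Rescale sigma so the estimated exponent becomes zero (assumed sketch)."""
    lam = lyapunov_estimate(width, depth, sigma, alpha, seed)
    return sigma * math.exp(-lam)
```

Because the network is bias-free and Leaky ReLU is positively homogeneous, multiplying every weight by c > 0 shifts the exponent by log c, which is why a single rescaling is enough in this sketch.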

2602.10352 2026-06-03 cs.CL cs.AI cs.LG 版本更新

Learning Self-Interpretation from Interpretability Artifacts: Training Lightweight Adapters on Vector-Label Pairs

从可解释性工件中学习自我解释:在向量-标签对上训练轻量级适配器

Keenan Pepper, Alex McKenzie, Florin Pop, Stijn Servaes, Martin Leitgab, Mike Vaiana, Judd Rosenblatt, Michael S. A. Graziano, Diogo de Lucena

发表机构 * University of Washington(华盛顿大学)

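AI Summary Training lightweight adapters (a scalar affine adapter with only d_model+1 parameters) on interpretability artifacts, with the language model kept entirely frozen, yields reliable self-interpretation across tasks and model families and substantially outperforms untrained baselines on sparse autoencoder feature labeling, topic identification, and bridge-entity decoding in multi-hop reasoning.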

Comments 26 pages, 18 tables, 17 figures. Code and data at https://github.com/agencyenterprise/selfie-adapters

详情
AI中文摘要

自我解释方法促使语言模型描述其内部状态,但由于超参数敏感性而仍然不可靠。我们表明,在可解释性工件上训练轻量级适配器,同时保持语言模型完全冻结,可以在任务和模型族中产生可靠的自我解释。一个仅需$d_\text{model}+1$个参数的标量仿射适配器就足够了:训练后的适配器生成稀疏自编码器特征标签,其性能优于训练标签本身(在70B规模下,生成评分为70% vs 50%),以94%的召回率@1识别主题(未训练基线为1%),并在多跳推理中解码既不在提示中也不在响应中出现的桥接实体,从而无需思维链即可揭示隐式推理。仅学习到的偏置向量就占了改进的85%,更简单的适配器比更具表达力的替代方案具有更好的泛化能力。通过提示描述控制模型知识,我们发现从7B到72B参数,自我解释的提升超过了能力提升。我们的结果表明,自我解释随着规模扩大而改善,且无需修改被解释的模型。

英文摘要

Self-interpretation methods prompt language models to describe their own internal states, but remain unreliable due to hyperparameter sensitivity. We show that training lightweight adapters on interpretability artifacts, while keeping the LM entirely frozen, yields reliable self-interpretation across tasks and model families. A scalar affine adapter with just $d_\text{model}+1$ parameters suffices: trained adapters generate sparse autoencoder feature labels that outperform the training labels themselves (70% vs 50% generation scoring at 70B scale), identify topics with 94% recall@1 versus 1% for untrained baselines, and decode bridge entities in multi-hop reasoning that appear in neither prompt nor response, surfacing implicit reasoning without chain-of-thought. The learned bias vector alone accounts for 85% of improvement, and simpler adapters generalize better than more expressive alternatives. Controlling for model knowledge via prompted descriptions, we find self-interpretation gains outpace capability gains from 7B to 72B parameters. Our results demonstrate that self-interpretation improves with scale, without modifying the model being interpreted.
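The parameter count $d_\text{model}+1$ is consistent with one scalar scale plus one bias vector. The sketch below assumes that reading (the abstract does not spell out the exact form), and the class name is hypothetical.

```python
class ScalarAffineAdapter:
    """Maps a hidden vector v to scale * v + bias (assumed form of the adapter)."""

    def __init__(self, d_model):
        self.scale = 1.0                  # 1 trainable parameter
        self.bias = [0.0] * d_model       # d_model trainable parameters

    def num_parameters(self):
        return 1 + len(self.bias)

    def __call__(self, v):
        return [self.scale * x + b for x, b in zip(v, self.bias)]
```

Under this reading, only the adapter's parameters would be trained while the language model stays frozen.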

2602.09708 2026-06-03 cs.LG cs.AI cs.CV cs.NA math.NA 版本更新

Physics-informed diffusion models in spectral space

谱空间中的物理信息扩散模型

Davide Gallon, Philippe von Wurstemberger, Patrick Cheridito, Arnulf Jentzen

发表机构 * ETH Zürich(苏黎世联邦理工学院)

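AI Summary Proposes physics-informed spectral diffusion (PISD), which combines generative latent diffusion models with physics-informed machine learning: PDE parameters and solutions are modeled by diffusion in a latent space of spectral representations, physics constraints and measurement conditions are imposed through diffusion posterior sampling, and the method achieves higher accuracy and computational efficiency than existing diffusion-based solvers on the Poisson, Helmholtz, and incompressible Navier-Stokes equations.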

Comments 18 pages, 10 figures

详情
AI中文摘要

我们提出物理信息谱扩散(PISD),一种将生成式潜扩散模型与物理信息机器学习相结合的方法,用于生成基于部分观测的偏微分方程(PDE)的解,特别包括正向和逆向PDE问题。我们在缩放谱表示的潜空间中通过扩散过程学习PDE参数和解的联合分布,其中高斯噪声对应于具有受控正则性的函数。与基于网格的扩散模型相比,这种谱公式能够实现显著的降维,并确保函数空间中的诱导过程保持在PDE算子定义良好的函数类内。基于扩散后验采样,我们在推理过程中施加物理信息约束和测量条件,在每个扩散步骤应用基于Adam的更新。我们在泊松、亥姆霍兹和不可压缩纳维-斯托克斯方程上评估了所提出的方法,与现有的基于扩散的PDE求解器(在稀疏观测下达到最先进水平)相比,展示了更高的精度和计算效率。代码可在 https://github.com/deeplearningmethods/PISD 获取。

英文摘要

We propose physics-informed spectral diffusion (PISD), a methodology that combines generative latent diffusion models with physics-informed machine learning to generate solutions of partial differential equations (PDEs) conditioned on partial observations, which includes, in particular, forward and inverse PDE problems. We learn the joint distribution of PDE parameters and solutions via a diffusion process in a latent space of scaled spectral representations, where Gaussian noise corresponds to functions with controlled regularity. This spectral formulation enables significant dimensionality reduction compared to grid-based diffusion models and ensures that the induced process in function space remains within a class of functions for which the PDE operators are well defined. Building on diffusion posterior sampling, we enforce physics-informed constraints and measurement conditions during inference, applying Adam-based updates at each diffusion step. We evaluate the proposed approach on Poisson, Helmholtz, and incompressible Navier-Stokes equations, demonstrating improved accuracy and computational efficiency compared with existing diffusion-based PDE solvers, which are state of the art for sparse observations. Code is available at https://github.com/deeplearningmethods/PISD.

2510.12636 2026-06-03 stat.ML cs.LG math.AP 版本更新

Adapting Noise to Data: Generative Flows from 1D Processes

将噪声适应于数据:来自一维过程的生成流

Jannis Chemseddine, Gregor Kornhardt, Richard Duong, Gabriele Steidl

发表机构 * University of Cambridge(剑桥大学)

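AI Summary Proposes a general framework that learns data-adaptive parametric prior distributions (latent noise) through one-dimensional quantile functions, optimized via the Wasserstein distance between noise and data, to improve how flow-based generative models learn heavy-tailed and similar distributions.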

Comments ICML 2026

详情
AI中文摘要

基于流的生成模型中的默认高斯潜变量在学习某些分布(如重尾分布)时会带来挑战。我们引入了一个通用框架,使用一维分位数函数学习数据自适应的参数化先验分布(潜在噪声),并通过噪声与数据之间的Wasserstein距离进行优化。基于分位数的先验参数化自然地适应重尾分布和紧支撑分布,并缩短传输路径。在重尾天气和图像数据集上的数值结果证实了该方法的灵活性和有效性,且计算开销可忽略不计。

英文摘要

The default Gaussian latent in flow-based generative models poses challenges when learning certain distributions such as heavy-tailed ones. We introduce a general framework for learning data-adaptive parametric prior distributions (latent noise) using one-dimensional quantile functions, optimized via the Wasserstein distance between noise and data. The quantile-based prior parameterization naturally adapts to both heavy-tailed and compactly supported distributions and shortens transport paths. Numerical results on heavy-tailed weather and image datasets confirm the method's flexibility and effectiveness, achieved with negligible computational overhead.
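Two building blocks mentioned in this abstract have simple one-dimensional forms: sampling noise by pushing uniform draws through a quantile function, and computing the Wasserstein distance between 1D samples by sorting. The sketch below shows both under stated assumptions: the Pareto quantile is only an example of a heavy-tailed prior, and the function names are hypothetical rather than taken from the paper.

```python
import math
import random


def pareto_quantile(u, tail_index):
    """Quantile function of a Pareto(x_min=1) law, an example heavy-tailed prior."""
    return (1.0 - u) ** (-1.0 / tail_index)


def sample_noise(quantile_fn, n, seed=0):
    """Inverse-transform sampling: push uniform draws through a quantile function."""
    rng = random.Random(seed)
    return [quantile_fn(rng.random()) for _ in range(n)]


def wasserstein_1d(xs, ys, p=2):
    """p-Wasserstein distance between two equal-size 1D samples via sorting."""
    assert len(xs) == len(ys)
    pairs = zip(sorted(xs), sorted(ys))
    return (sum(abs(a - b) ** p for a, b in pairs) / len(xs)) ** (1.0 / p)
```

In one dimension the optimal transport plan matches sorted samples, so the distance reduces to a comparison of empirical quantiles; this is what makes a quantile-based parameterization convenient to optimize.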

2602.06842 2026-06-03 math.NA cs.LG cs.NA 版本更新

Are Deep Learning Based Hybrid PDE Solvers Reliable? Why Training Paradigms and Update Strategies Matter

基于深度学习的混合PDE求解器可靠吗?为什么训练范式和更新策略很重要

Yuhan Wu, Jan Willem van Beek, Victorita Dolean, Alexander Heinlein

发表机构 * Delft Institute of Applied Mathematics(代尔夫特应用数学研究所) Delft University of Technology(代尔夫特理工大学) Department of Mathematics and Computer Science(数学与计算机科学系) Eindhoven University of Technology(埃因霍温理工大学) The Netherlands(荷兰)

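AI Summary Studies the reliability of deep learning-based hybrid iterative methods (DL-HIMs) in scientific computing, finds that residuals stagnate when training objectives are misaligned with solver dynamics and problem physics, and proposes physics-aware Anderson acceleration (PA-AA) to restore reliable convergence.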

Comments Accepted manuscript version of an article accepted for publication in IEEE Computing in Science & Engineering. The final published version will be available through IEEE Xplore

详情
AI中文摘要

基于深度学习的混合迭代方法(DL-HIMs)将经典数值求解器与神经算子相结合,利用它们互补的谱偏差来加速收敛。尽管有这一前景,许多DL-HIMs在假固定点处停滞,此时神经更新消失而物理残差仍然很大,这引发了对其在科学计算中可靠性的质疑。在本文中,我们提供证据表明,即使神经架构固定,性能对训练范式和更新策略高度敏感。通过对基于DeepONet的混合迭代数值可转移求解器(HINTS)和基于FFT的傅里叶神经求解器(FNS)的详细研究,我们展示了当训练目标与求解器动力学和问题物理不一致时,显著的物理残差可能持续存在。我们进一步研究了Anderson加速(AA),并证明其经典形式不适用于非线性神经算子。为了克服这一点,我们引入了物理感知的Anderson加速(PA-AA),它最小化物理残差而非固定点更新。数值实验证实,PA-AA在显著更少的迭代次数内恢复了可靠收敛。这些发现为围绕基于AI的PDE求解器的持续争议提供了具体答案:可靠性不仅取决于架构,还取决于物理信息驱动的训练和迭代设计。

英文摘要

Deep learning-based hybrid iterative methods (DL-HIMs) integrate classical numerical solvers with neural operators, utilizing their complementary spectral biases to accelerate convergence. Despite this promise, many DL-HIMs stagnate at false fixed points where neural updates vanish while the physical residual remains large, raising questions about reliability in scientific computing. In this paper, we provide evidence that performance is highly sensitive to training paradigms and update strategies, even when the neural architecture is fixed. Through a detailed study of a DeepONet-based hybrid iterative numerical transferable solver (HINTS) and an FFT-based Fourier neural solver (FNS), we show that significant physical residuals can persist when training objectives are not aligned with solver dynamics and problem physics. We further examine Anderson acceleration (AA) and demonstrate that its classical form is ill-suited for nonlinear neural operators. To overcome this, we introduce physics-aware Anderson acceleration (PA-AA), which minimizes the physical residual rather than the fixed-point update. Numerical experiments confirm that PA-AA restores reliable convergence in substantially fewer iterations. These findings provide a concrete answer to ongoing controversies surrounding AI-based PDE solvers: reliability hinges not only on architectures but on physically informed training and iteration design.
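For readers unfamiliar with Anderson acceleration, the sketch below implements the depth-1 variant for a fixed-point iteration x <- g(x), with the residual to be minimized passed in as a function. Passing the physical residual b - Ax instead of the fixed-point update g(x) - x is one possible reading of the "physics-aware" idea in the abstract; it is an assumption, not the paper's PA-AA algorithm. The toy linear system and all names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def anderson_m1(g, residual, x0, tol=1e-10, max_iter=500):
    """Depth-1 Anderson acceleration of x <- g(x).

    `residual(x)` is the vector whose norm the mixing weight gamma minimizes.
    """
    x_prev, x = x0, g(x0)
    for it in range(1, max_iter + 1):
        r = residual(x)
        if dot(r, r) ** 0.5 < tol:
            return x, it
        dr = [a - b for a, b in zip(r, residual(x_prev))]
        denom = dot(dr, dr)
        gamma = dot(r, dr) / denom if denom > 0.0 else 0.0   # least-squares weight
        gx, gx_prev = g(x), g(x_prev)
        x_prev, x = x, [a - gamma * (a - b) for a, b in zip(gx, gx_prev)]
    return x, max_iter


# Toy problem (hypothetical): Richardson iteration for A x = b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def physical_residual(x):
    return [bi - dot(row, x) for row, bi in zip(A, b)]


def richardson_step(x, omega=0.1):
    return [xi + omega * ri for xi, ri in zip(x, physical_residual(x))]


solution, iterations = anderson_m1(richardson_step, physical_residual, [0.0, 0.0])
```

In this linear toy problem the fixed-point update is proportional to the physical residual, so both choices give the same iterates. The distinction matters in the situation the abstract describes, where a neural update vanishes while the physical residual remains large.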

2511.12085 2026-06-03 cs.CR cs.AI cs.LG 版本更新

A Robust and Explainable Transformer-Based Framework for Phishing Email Detection

一种鲁棒且可解释的基于Transformer的钓鱼邮件检测框架

Sajad U P

发表机构 * Independent Researcher(独立研究者)

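AI Summary Proposes a lightweight DistilBERT-based phishing email detection framework that improves robustness through gradient-based adversarial training and character-level noise, integrates three explainable-AI methods (LIME, SHAP, and IG), and uses Flan-T5-Small to generate natural-language explanations, improving detection accuracy and user trust.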

详情
AI中文摘要

钓鱼及相关网络威胁正变得越来越复杂,基于电子邮件的钓鱼仍然是最持久的攻击载体。这些攻击利用人类漏洞来传递恶意软件或获取对敏感信息的未授权访问。基于Transformer的模型通过强大的上下文语言理解增强了钓鱼检测;然而,由于缺乏可解释性,它们通常被视为黑盒。此外,最近的AI驱动攻击进一步削弱了模型的韧性。为了解决这些挑战,本文提出了一种基于DistilBERT(一种轻量级Transformer模型)的轻量级钓鱼检测框架。通过使用快速梯度法(FGM)进行基于梯度的对抗训练,并结合随机字符级扰动,增强了对嵌入级扰动和字符级输入噪声的鲁棒性。为了提高透明度,集成了三种突出的可解释AI(XAI)方法:LIME(局部可解释模型无关解释)、SHAP(SHapley Additive exPlanations)和IG(积分梯度),以解释模型决策。一个结构化的基于规则的提示结合模型预测和XAI特征,引导Flan-T5-Small生成通俗易懂、基于证据的解释。实验结果表明,所提出的框架在准确性和韧性方面优于未经鲁棒性增强的标准DistilBERT检测模型。这种集成方法有助于弥合模型可靠性与用户信任之间的差距,推动透明钓鱼检测的发展。

英文摘要

Phishing and related cyber threats are becoming increasingly sophisticated, with email-based phishing remaining the most persistent attack vector. These attacks exploit human vulnerabilities to deliver malware or gain unauthorized access to sensitive information. Transformer-based models enhance phishing detection through robust contextual language understanding; yet they are often regarded as black boxes due to a lack of interpretability. Moreover, recent AI-enabled attacks further undermine model resilience. To address these challenges, this work proposes a lightweight phishing detection framework based on DistilBERT, a lightweight Transformer model. Robustness to embedding-level perturbations and character-level input noise is enhanced through gradient-based adversarial training using the Fast Gradient Method (FGM), combined with stochastic character-level perturbations. To improve transparency, three prominent Explainable AI (XAI) methods, LIME (Local Interpretable Model-agnostic Explanations), SHAP (SHapley Additive exPlanations), and IG (Integrated Gradients), are integrated to interpret model decision-making. A structured rule-based prompt combines model predictions and XAI features to guide Flan-T5-Small in generating plain-language, evidence-based explanations. Experimental results demonstrate that the proposed framework outperforms a standard DistilBERT-based detection model trained without robustness enhancements in terms of accuracy and resilience. This integrated approach helps bridge the gap between model reliability and user trust, advancing transparent phishing detection.
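The two robustness ingredients named in this abstract can be sketched in a few lines. The Fast Gradient Method perturbs an embedding along the normalized loss gradient; the character-level noise shown here uses random adjacent swaps, which is an assumed noise type since the abstract does not specify one. Function names are hypothetical.

```python
import math
import random


def fgm_perturbation(grad, epsilon):
    """Fast Gradient Method step: epsilon * grad / ||grad||_2."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]


def char_noise(text, rate, seed=0):
    """Swap adjacent characters with probability `rate` per position (assumed scheme)."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2                      # skip the swapped pair
        else:
            i += 1
    return "".join(chars)
```

In adversarial training, the perturbation would be added to the input embeddings before a second forward pass, so the model is also trained on the perturbed example.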

2602.05031 2026-06-03 cs.LG 版本更新

Laplacian Representations for Decision-Time Planning

用于决策时规划的拉普拉斯表示

Dikshant Shehmar, Matthew Schlegel, Matthew E. Taylor, Marlos C. Machado

发表机构 * University of Cambridge(剑桥大学)

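AI Summary Proposes the Laplacian representation as a latent space for decision-time planning that captures state-space distances at multiple time scales, and builds on it the hierarchical planning algorithm ALPS, which outperforms commonly used baselines on offline goal-conditioned reinforcement learning tasks.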

Comments Accepted at ICML 2026

详情
Journal ref
Proceedings of the 43rd International Conference on Machine Learning (ICML 2026)
AI中文摘要

在基于模型的强化学习中,使用学习到的模型进行规划仍然是一个关键挑战。在决策时规划中,状态表示至关重要,因为它们必须支持局部成本计算,同时保持长时程结构。在本文中,我们展示了拉普拉斯表示通过在多时间尺度上捕捉状态空间距离,为规划提供了一个有效的潜在空间。这种表示保留了有意义的距离,并自然地将长时程问题分解为子目标,同时也减轻了长预测范围内出现的复合误差。基于这些特性,我们引入了ALPS,一种层次规划算法,并证明它在来自OGBench(一个以前由无模型方法主导的基准)的离线目标条件强化学习任务选择上优于常用的基线。

英文摘要

Planning with a learned model remains a key challenge in model-based reinforcement learning (RL). In decision-time planning, state representations are critical as they must support local cost computation while preserving long-horizon structure. In this paper, we show that the Laplacian representation provides an effective latent space for planning by capturing state-space distances at multiple time scales. This representation preserves meaningful distances and naturally decomposes long-horizon problems into subgoals, also mitigating the compounding errors that arise over long prediction horizons. Building on these properties, we introduce ALPS, a hierarchical planning algorithm, and demonstrate that it outperforms commonly used baselines on a selection of offline goal-conditioned RL tasks from OGBench, a benchmark previously dominated by model-free methods.

2507.10419 2026-06-03 cs.LG cs.AI cs.CL stat.ML 版本更新

Multiple Choice Learning of Low-Rank Adapters for Language Modeling

低秩适配器的多选学习用于语言建模

Victor Letzelter, Hugo Malard, Mathieu Fontaine, Gaël Richard, Slim Essid, Andrei Bursuc, Patrick Pérez

发表机构 * Institut National de la Recherche Scientifique (INRS)(国家科学研究院)

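AI Summary Proposes LoRA-MCL, a training scheme that extends next-token prediction in language models with Multiple Choice Learning and Low-Rank Adaptation so that diverse, plausible sentence continuations can be decoded at inference time.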

Comments ICML 2026

详情
AI中文摘要

我们提出LoRA-MCL,一种训练方案,通过一种旨在推理时解码多样、合理的句子延续的方法,扩展语言模型中的下一词预测。传统语言建模是一个本质上不适定的问题:给定一个上下文,多个未来可能同样合理。我们的方法利用多选学习(MCL)和胜者全得损失,通过低秩适配有效处理歧义。我们提供了将MCL应用于语言建模的理论解释,假设数据来自混合分布。我们使用马尔可夫链混合来说明所提出的方法。然后,我们通过音频和视觉字幕以及机器翻译的实验证明,我们的方法在生成输出中实现了高多样性和相关性。我们发布了将LoRA-MCL应用于广泛语言模型的代码。

英文摘要

We propose LoRA-MCL, a training scheme that extends next-token prediction in language models with a method designed to decode diverse, plausible sentence continuations at inference time. Traditional language modeling is an intrinsically ill-posed problem: given a context, multiple futures may be equally plausible. Our approach leverages Multiple Choice Learning (MCL) and the winner-takes-all loss to efficiently handle ambiguity through Low-Rank Adaptation. We provide a theoretical interpretation of applying MCL to language modeling, assuming the data is generated from a mixture of distributions. We illustrate the proposed approach using mixtures of Markov chains. We then demonstrate with experiments on audio and visual captioning, as well as machine translation, that our method achieves high diversity and relevance in generated outputs. We release the code for applying LoRA-MCL to a wide range of language models.
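The winner-takes-all loss used in Multiple Choice Learning has a compact form: among K hypotheses for the same example, only the one with the lowest loss is penalized. The sketch below shows that selection on precomputed per-hypothesis losses; how LoRA-MCL obtains those losses from low-rank adapters is not shown, and the function names are hypothetical.

```python
def winner_takes_all(per_hypothesis_losses):
    """Return (loss, winner index): only the best hypothesis is penalized."""
    winner = min(range(len(per_hypothesis_losses)),
                 key=per_hypothesis_losses.__getitem__)
    return per_hypothesis_losses[winner], winner


def wta_batch_loss(batch_losses):
    """Average the winner's loss over a batch of examples."""
    return sum(winner_takes_all(losses)[0] for losses in batch_losses) / len(batch_losses)
```

Because gradients flow only through the winning hypothesis, different hypotheses can specialize on different plausible continuations.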

2602.03681 2026-06-03 cs.CL cs.LG 版本更新

Neural Attention Search Linear: Towards Adaptive Token-Level Hybrid Attention Models

神经注意力搜索线性:迈向自适应令牌级混合注意力模型

Difan Deng, Andreas Bentzen Winje, Lukas Fehring, Marius Lindauer

发表机构 * University of Copenhagen(哥本哈根大学)

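AI Summary Proposes the NAtS-L framework, which adaptively selects linear attention or softmax attention for different tokens within the same layer to balance efficiency and expressivity.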

Comments 21 pages, 12 figures

详情
AI中文摘要

softmax变换器的二次计算复杂度已成为长上下文场景的瓶颈。相比之下,线性注意力模型系列为更高效的序列模型提供了有希望的方向。这些线性注意力模型将过去的KV值压缩成单个隐藏状态,从而在训练和推理期间有效降低复杂度。然而,它们的表达能力仍然受限于隐藏状态的大小。先前的工作提出交错使用softmax和线性注意力层,以在保持表达能力的同时降低计算复杂度。然而,这些模型的效率仍然受限于其softmax注意力层。在本文中,我们提出神经注意力搜索线性(NAtS-L)框架,该框架在同一层内对不同令牌同时应用线性注意力和softmax注意力操作。NAtS-L自动判断一个令牌是否可以由线性注意力模型处理(即仅具有短期影响且可编码为固定大小隐藏状态的令牌),或者是否需要softmax注意力(即包含与长期检索相关信息且需要为未来查询保留的令牌)。通过跨令牌搜索最优的门控DeltaNet和softmax注意力组合,我们展示了NAtS-L提供了一种强大而高效的令牌级混合架构。

英文摘要

The quadratic computational complexity of softmax transformers has become a bottleneck in long-context scenarios. In contrast, linear attention model families provide a promising direction towards a more efficient sequential model. These linear attention models compress past KV values into a single hidden state, thereby efficiently reducing complexity during both training and inference. However, their expressivity remains limited by the size of their hidden state. Previous work proposed interleaving softmax and linear attention layers to reduce computational complexity while preserving expressivity. Nevertheless, the efficiency of these models remains bottlenecked by their softmax attention layers. In this paper, we propose Neural Attention Search Linear (NAtS-L), a framework that applies both linear attention and softmax attention operations within the same layer on different tokens. NAtS-L automatically determines whether a token can be handled by a linear attention model, i.e., tokens that have only short-term impact and can be encoded into fixed-size hidden states, or requires softmax attention, i.e., tokens that contain information related to long-term retrieval and need to be preserved for future queries. By searching for optimal Gated DeltaNet and softmax attention combinations across tokens, we show that NAtS-L provides a strong yet efficient token-level hybrid architecture.

2602.02890 2026-06-03 cs.LG 版本更新

Self-Soupervision: Cooking Model Soups without Labels

自我汤合:无标签的模型汤烹饪

Anthony Fuller, James R. Green, Evan Shelhamer

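AI Summary Proposes Self-Soupervision, which extends model soups to self-supervised learning by mixing parameters trained with different self-supervised algorithms on unlabeled data, improving model robustness and accuracy.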

Comments code: https://github.com/antofuller/self_soupervision data: https://huggingface.co/datasets/antofuller/mini-VTAB

详情
AI中文摘要

模型汤是参数的一种奇特且异常有效的组合。它们将一个模型(底汤)微调成多个模型(配料),然后将它们的参数混合回一个模型(汤)以改进预测。虽然所有已知的汤都需要监督学习,并在标记数据上优化相同的损失,但我们的Self-Soupervision配方将汤推广到自监督学习(SSL)。我们的Self-Souping允许我们在新的数据源上调味配料,例如来自迁移任务的无标签数据或来自鲁棒性迁移的数据。我们表明,在损坏的测试数据上进行Self-Souping,然后回到未损坏的训练数据上进行微调,可以将鲁棒性提升+3.5%(ImageNet-C)和+7%(LAION-C)。Self-Soupervision还解锁了无数SSL算法,以烹饪更鲁棒汤所需的各种配料。我们首次表明,配料可以在其SSL超参数上有所不同——更令人惊讶的是,在其SSL算法上也可以不同。我们烹饪了MAE、MoCoV3、MMCR和LeJEPA配料的汤,这些汤比任何单个SSL配料都更准确。

英文摘要

Model soups are strange and strangely effective combinations of parameters. They take a model (the stock), fine-tune it into multiple models (the ingredients), and then mix their parameters back into one model (the soup) to improve predictions. While all known soups require supervised learning, and optimize the same loss on labeled data, our recipes for Self-Soupervision generalize soups to self-supervised learning (SSL). Our Self-Souping lets us flavor ingredients on new data sources, e.g. from unlabeled data from a task for transfer or from a shift for robustness. We show that Self-Souping on corrupted test data, then fine-tuning back on uncorrupted train data, boosts robustness by +3.5% (ImageNet-C) and +7% (LAION-C). Self-Soupervision also unlocks countless SSL algorithms to cook the diverse ingredients needed for more robust soups. We show for the first time that ingredients can differ in their SSL hyperparameters -- and more surprisingly, in their SSL algorithms. We cook soups of MAE, MoCoV3, MMCR, and LeJEPA ingredients that are more accurate than any single SSL ingredient.
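The "mix their parameters back into one model" step can be sketched as a uniform elementwise average of the ingredients' parameters. This is the simplest soup recipe and an assumption here; the abstract does not say which mixing rule Self-Soupervision uses. Parameters are represented as plain dicts of float lists, and the function name is hypothetical.

```python
def uniform_soup(ingredients):
    """Average parameter dicts (name -> list of floats) elementwise."""
    names = ingredients[0].keys()
    n = len(ingredients)
    return {
        name: [sum(vals) / n for vals in zip(*(ing[name] for ing in ingredients))]
        for name in names
    }
```

Averaging is only meaningful when all ingredients were fine-tuned from the same starting model (the stock), so that their parameters stay aligned.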

2602.01903 2026-06-03 cs.LG stat.ML 版本更新

Data- and Variance-dependent Regret Bounds for Online Tabular MDPs

在线表格MDPs的数据依赖和方差依赖遗憾界

Mingyi Li, Taira Tsuchiya, Kenji Yamanishi

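AI Summary For online tabular Markov decision processes with known transitions, proposes best-of-both-worlds algorithms that achieve data-dependent regret bounds in the adversarial regime and variance-dependent regret bounds in the stochastic regime, and proves that the global-optimization approach is nearly optimal.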

Comments Accepted at ICML 2026. 72 pages, 4 tables

详情
AI中文摘要

本文研究具有已知转移的在线情景表格马尔可夫决策过程(MDPs),并开发了在对抗性环境下实现精细数据依赖遗憾界、在随机环境下实现方差依赖遗憾界的最佳算法。我们使用一阶量和几个新的数据依赖度量(包括二阶量和路径长度度量)来量化对抗性环境下的MDP复杂度,以及基于方差的度量来量化随机环境下的MDP复杂度。为了适应这些度量,我们基于全局优化和策略优化开发了算法,两者都建立在具有对数障碍正则化的乐观跟随正则化领导者之上。对于全局优化,我们的算法在对抗性环境下实现了一阶、二阶和路径长度遗憾界,在随机环境下实现了方差感知的无间隙依赖界和方差感知的间隙依赖界(该界关于情景数量为多对数)。对于策略优化,通过利用新的乐观$Q$函数估计器,我们的算法实现了相同的数据和方差依赖自适应性,但乘以情景视界因子。最后,我们针对对抗性环境下的数据依赖复杂度度量和随机环境下的方差度量建立了遗憾下界,表明全局优化方法实现的遗憾上界是近乎最优的。

英文摘要

This work studies online episodic tabular Markov decision processes (MDPs) with known transitions and develops best-of-both-worlds algorithms that achieve refined data-dependent regret bounds in the adversarial regime and variance-dependent regret bounds in the stochastic regime. We quantify MDP complexity using a first-order quantity and several new data-dependent measures for the adversarial regime, including a second-order quantity and a path-length measure, as well as variance-based measures for the stochastic regime. To adapt to these measures, we develop algorithms based on global optimization and policy optimization, both built on optimistic follow-the-regularized-leader with log-barrier regularization. For global optimization, our algorithms achieve first-order, second-order, and path-length regret bounds in the adversarial regime, and in the stochastic regime, they achieve a variance-aware gap-independent bound and a variance-aware gap-dependent bound that is polylogarithmic in the number of episodes. For policy optimization, our algorithms achieve the same data- and variance-dependent adaptivity, up to a factor of the episode horizon, by exploiting a new optimistic $Q$-function estimator. Finally, we establish regret lower bounds in terms of data-dependent complexity measures for the adversarial regime and a variance measure for the stochastic regime, implying that the regret upper bounds achieved by the global-optimization approach are nearly optimal.

2602.01483 2026-06-03 cs.LG cs.AI stat.ME 版本更新

Causal Preference Elicitation

因果偏好启发

Edwin V. Bonilla, He Zhao, Daniel M. Steinberg

发表机构 * University of California, Berkeley(加州大学伯克利分校)

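AI Summary Proposes a Bayesian framework for expert-in-the-loop causal discovery that actively queries local edge relations to concentrate the posterior over directed acyclic graphs.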

详情
AI中文摘要

我们提出因果偏好启发,一种用于专家参与因果发现的贝叶斯框架,该框架主动查询局部边关系以集中有向无环图(DAG)的后验分布。从任何黑箱观测后验出发,我们使用一个三向似然模型对专家的噪声判断进行建模,该似然涵盖边的存在性和方向。后验推断采用灵活的粒子近似,并通过专家分类响应的期望信息增益准则高效选择查询。在合成图、蛋白质信号数据以及人类基因扰动基准上的实验表明,在严格的查询预算下,后验集中速度更快,且对有向效应的恢复能力得到提升。

英文摘要

We propose causal preference elicitation, a Bayesian framework for expert-in-the-loop causal discovery that actively queries local edge relations to concentrate a posterior over directed acyclic graphs (DAGs). From any black-box observational posterior, we model noisy expert judgments with a three-way likelihood over edge existence and direction. Posterior inference uses a flexible particle approximation, and queries are selected by an efficient expected information gain criterion on the expert's categorical response. Experiments on synthetic graphs, protein signaling data, and a human gene perturbation benchmark show faster posterior concentration and improved recovery of directed effects under tight query budgets.
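The query-selection criterion can be illustrated with a small sketch: given a weighted set of DAG particles and a three-way expert response for a node pair (i, j), the expected information gain is the mutual information between the response and the particle. The symmetric noise model below (correct with probability 1 - noise, otherwise one of the other two answers) is an assumption, as are all names; the paper's likelihood may differ.

```python
import math

RESPONSES = ("i->j", "j->i", "none")


def edge_state(dag_edges, i, j):
    """Ground-truth relation between i and j in one DAG (a set of directed edges)."""
    if (i, j) in dag_edges:
        return "i->j"
    if (j, i) in dag_edges:
        return "j->i"
    return "none"


def likelihood(response, truth, noise):
    """Expert is correct with prob 1 - noise, else picks one of the other two answers."""
    return 1.0 - noise if response == truth else noise / 2.0


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def expected_information_gain(particles, weights, i, j, noise=0.1):
    """Mutual information between the expert's response and the DAG particle."""
    marginal = [0.0, 0.0, 0.0]
    conditional_entropy = 0.0
    for dag, w in zip(particles, weights):
        truth = edge_state(dag, i, j)
        probs = [likelihood(r, truth, noise) for r in RESPONSES]
        marginal = [m + w * p for m, p in zip(marginal, probs)]
        conditional_entropy += w * entropy(probs)
    return entropy(marginal) - conditional_entropy
```

A pair on which all particles agree has zero gain, so queries are directed to edges where the posterior is still undecided.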

2512.00956 2026-06-03 cs.LG cs.CL 版本更新

WUSH: Near-Optimal Adaptive Transforms for LLM Quantization

WUSH: 面向LLM量化的近最优自适应变换

Jiale Chen, Vage Egiazarian, Roberto L. Castro, Torsten Hoefler, Dan Alistarh

发表机构 * University of Tartu(塔尔图大学)

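AI Summary Proposes WUSH, a non-orthogonal transform that combines a Hadamard backbone with a data-dependent second-moment component, giving a closed-form optimal solution for joint weight-activation quantization under standard RTN AbsMax-scaled block quantizers; it substantially improves low-bit quantization accuracy and admits an efficient GPU implementation.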

Comments Published as a conference paper at the 43rd International Conference on Machine Learning (ICML 2026): https://openreview.net/forum?id=ZsECxUkbKB

详情
AI中文摘要

量化LLM权重和激活是实现高效部署的标准方法,但少数极端异常值会拉伸动态范围并放大低比特量化误差。先前的基于变换的缓解方法(例如Hadamard旋转)是固定的且与数据无关,其量化最优性尚不明确。我们推导了在标准RTN AbsMax缩放块量化器下,用于联合权重-激活量化的闭式最优线性块变换,涵盖整数和浮点格式。由此产生的构造WUSH将Hadamard骨干与数据依赖的二阶矩分量相结合,形成一种非正交变换,在温和假设下对FP和INT量化器证明是近最优的,同时支持高效的融合GPU实现。实验上,WUSH在最强Hadamard基线(例如,在Llama-3.1-8B-Instruct的MXFP4上,RTN平均提升+2.8个点,GPTQ提升+0.7个点)上改善了W4A4精度,同时通过FP4 MatMul实现了高达BF16的5.8倍每层吞吐量。源代码可在https://github.com/IST-DASLab/WUSH获取。

英文摘要

Quantizing LLM weights and activations is a standard approach for efficient deployment, but a few extreme outliers can stretch the dynamic range and amplify low-bit quantization errors. Prior transform-based mitigations (e.g., Hadamard rotations) are fixed and data-agnostic, and their optimality for quantization has remained unclear. We derive closed-form optimal linear blockwise transforms for joint weight-activation quantization under standard RTN AbsMax-scaled block quantizers, covering both integer and floating-point formats. The resulting construction, WUSH, combines a Hadamard backbone with a data-dependent second-moment component to form a non-orthogonal transform that is provably near-optimal for FP and INT quantizers under mild assumptions while admitting an efficient fused GPU implementation. Empirically, WUSH improves W4A4 accuracy over the strongest Hadamard-based baselines (e.g., on Llama-3.1-8B-Instruct in MXFP4, it gains +2.8 average points with RTN and +0.7 with GPTQ) while delivering up to 5.8$\times$ per-layer throughput over BF16 via FP4 MatMul. Source code is available at https://github.com/IST-DASLab/WUSH.
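The two standard components this abstract builds on can be sketched directly: an orthonormal Hadamard rotation of a block, and a round-to-nearest (RTN) quantizer whose scale is set by the block's maximum absolute value (AbsMax). The sketch covers only these baseline pieces for a symmetric integer grid; WUSH's data-dependent second-moment component is not implemented, and the function names are hypothetical.

```python
import math


def hadamard(v):
    """Orthonormal fast Walsh-Hadamard transform; len(v) must be a power of two."""
    out, h = list(v), 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for k in range(start, start + h):
                a, b = out[k], out[k + h]
                out[k], out[k + h] = a + b, a - b     # butterfly step
        h *= 2
    scale = 1.0 / math.sqrt(len(out))
    return [x * scale for x in out]


def rtn_absmax(block, bits=4):
    """Round to nearest on a symmetric integer grid scaled by the block's max |x|."""
    qmax = 2 ** (bits - 1) - 1
    absmax = max(abs(x) for x in block)
    if absmax == 0.0:
        return list(block)
    scale = absmax / qmax
    return [round(x / scale) * scale for x in block]
```

A transform-based pipeline would quantize `hadamard(block)` instead of `block` and undo the rotation afterwards; since the orthonormal Hadamard matrix is its own inverse, applying `hadamard` twice returns the original block.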

2510.02763 2026-06-03 cs.LG cs.AI 版本更新

Fusing Multi- and Hyperspectral Satellite Data for Harmful Algal Bloom Monitoring with Self-Supervised and Hierarchical Deep Learning

融合多光谱和高光谱卫星数据用于有害藻华监测的自监督与分层深度学习

Nicholas LaHaye, Kelly M. Luis, Michelle M. Gierach

发表机构 * University of Colorado Boulder(科罗拉多大学博尔德分校)

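AI Summary Proposes SIT-FUSE, a self-supervised machine learning framework that fuses multi-sensor satellite reflectance with TROPOMI solar-induced fluorescence data and uses hierarchical deep clustering to produce harmful algal bloom severity and speciation products, validated against in-situ data from the Gulf of Mexico and Southern California.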

详情
AI中文摘要

我们提出了一种自监督机器学习框架,用于利用多传感器卫星数据检测和绘制有害藻华(HABs)的严重程度和物种分类。通过融合来自运行极轨卫星仪器(VIIRS、MODIS、OLCI和OCI)的反射率数据与TROPOMI太阳诱导荧光(SIF),我们的框架SIT-FUSE无需每个仪器的标记数据集即可生成HAB严重程度和物种分类产品。该框架采用自监督表示学习和分层深度聚类,将浮游植物细胞丰度和物种分割成可解释的类别,并利用墨西哥湾和南加州(2018-2025年)的原位数据进行了验证。结果显示与总浮游植物、短凯伦藻和拟菱形藻属测量值高度一致。这项工作推进了在地面观测有限的环境中进行可扩展的HAB监测,同时通过分层嵌入实现探索性分析——这是将自监督学习应用于全球水生生物地球化学操作化的关键一步。

英文摘要

We present a self-supervised machine learning framework for detecting and mapping the severity and speciation of harmful algal blooms (HABs) using multi-sensor satellite data. By fusing reflectance data from operational polar-orbiting satellite-based instruments (VIIRS, MODIS, OLCI, and OCI) with TROPOMI solar-induced fluorescence (SIF), our framework, called SIT-FUSE, generates HAB severity and speciation products without requiring per-instrument labeled datasets. The framework employs self-supervised representation learning and hierarchical deep clustering to segment phytoplankton cell abundance and species into interpretable classes, validated against in-situ data from the Gulf of Mexico and Southern California (2018-2025). Results show strong agreement with total phytoplankton, Karenia brevis, and Pseudo-nitzschia spp. measurements. This work advances scalable HAB monitoring in environments where ground truth observations are limited, while enabling exploratory analysis via hierarchical embeddings, a critical step toward operationalizing self-supervised learning for global aquatic biogeochemistry.

2602.00392 2026-06-03 cs.LG 版本更新

Localized, High-resolution Geographic Representations with Slepian Functions

基于Slepian函数的局部高分辨率地理表示

Arjun Rao, Ruth Crasto, Tessa Ooms, David Rolnick, Konstantin Klemmer, Marc Rußwurm

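AI Summary Proposes a geographic location encoder built from spherical Slepian functions that concentrates representational capacity inside a region of interest and reaches high resolution at low computational cost, and introduces a hybrid Slepian-spherical harmonic encoder that balances local and global performance; the encodings outperform baselines on classification, regression, and other tasks.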

Comments ICML 2026

详情
AI中文摘要

地理数据本质上是局部的。疾病爆发集中在人口中心,生态模式沿着海岸线出现,经济活动集中在国家边界内。然而,编码地理位置的机器学习模型将表示能力均匀地分布在全球,难以满足局部应用所需的细粒度分辨率。我们提出了一种基于球面Slepian函数的地理位置编码器,它将表示能力集中在感兴趣区域内,并在无需大量计算需求的情况下扩展到高分辨率。对于需要全局上下文的情况,我们提出了一种混合Slepian-球谐编码器,它有效地平衡了局部与全局性能的权衡,同时保留了诸如极点安全和球面距离保持等理想特性。在涵盖分类、回归和图像增强预测的五项任务中,Slepian编码优于基线,并在广泛的神经网络架构中保持性能优势。

英文摘要

Geographic data is fundamentally local. Disease outbreaks cluster in population centers, ecological patterns emerge along coastlines, and economic activity concentrates within country borders. Machine learning models that encode geographic location, however, distribute representational capacity uniformly across the globe, struggling at the fine-grained resolutions that localized applications require. We propose a geographic location encoder built from spherical Slepian functions that concentrate representational capacity inside a region-of-interest and scale to high resolutions without extensive computational demands. For settings requiring global context, we present a hybrid Slepian-Spherical Harmonic encoder that efficiently bridges the tradeoff between local-global performance, while retaining desirable properties such as pole-safety and spherical-surface-distance preservation. Across five tasks spanning classification, regression, and image-augmented prediction, Slepian encodings outperform baselines and retain performance advantages across a wide range of neural network architectures.

2601.23169 2026-06-03 cs.LG cs.LO cs.SC 版本更新

Names Don't Matter: Symbol-Invariant Transformer for Open-Vocabulary Learning

名称无关:面向开放词汇学习的符号不变Transformer

İlker Işık, Wenchao Li

发表机构 * University of California, Berkeley(加州大学伯克利分校) Stanford University(斯坦福大学)

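AI Summary Proposes a symbol-invariant Transformer mechanism that achieves invariance to the renaming of interchangeable tokens through parallel embedding streams and aggregated attention, with substantial performance gains on open-vocabulary tasks.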

Comments ICML 2026 Poster (Camera-Ready Version)

详情
AI中文摘要

当前的神经架构缺乏处理可互换令牌(即语义等价但可区分的符号,如绑定变量)的原则性方法。因此,在固定词汇表上训练的模型通常难以泛化到未见过的符号,即使底层语义保持不变。我们提出了一种新颖的基于Transformer的机制,该机制对可互换令牌的重命名具有可证明的不变性。我们的方法采用并行嵌入流来隔离输入中每个可互换令牌的贡献,并结合聚合注意力机制实现跨流的结构化信息共享。实验结果证实了我们方法的理论保证,并在需要泛化到新符号的开放词汇任务上展示了显著的性能提升。项目页面:https://bu-depend-lab.github.io/Symbol-Invariant-Transformer/

英文摘要

Current neural architectures lack a principled way to handle interchangeable tokens, i.e., symbols that are semantically equivalent yet distinguishable, such as bound variables. As a result, models trained on fixed vocabularies often struggle to generalize to unseen symbols, even when the underlying semantics remain unchanged. We propose a novel Transformer-based mechanism that is provably invariant to the renaming of interchangeable tokens. Our approach employs parallel embedding streams to isolate the contribution of each interchangeable token in the input, combined with an aggregated attention mechanism that enables structured information sharing across streams. Experimental results confirm the theoretical guarantees of our method and demonstrate substantial performance gains on open-vocabulary tasks that require generalization to novel symbols. Project page: https://bu-depend-lab.github.io/Symbol-Invariant-Transformer/

2601.22443 2026-06-03 cs.LG cs.CV stat.CO stat.ML 版本更新

Weak Diffusion Priors Can Still Achieve Strong Inverse-Problem Performance

弱扩散先验仍能实现强逆问题性能

Jing Jia, Wei Yuan, Sifan Liu, Liyue Shen, Guanyang Wang

发表机构 * University of California, Berkeley(加州大学伯克利分校) Stanford University(斯坦福大学)

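AI Summary Studies the robustness of weak diffusion priors in inverse problems, using Bayesian consistency theory and local-correlation analysis to explain why they remain effective when measurements are highly informative.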

Comments 37 pages, ICML 2026 spotlight. Code: https://github.com/jjia131/weak-diffusion-priors-inverse-problem, Project Page: https://jjia131.github.io/weak-diffusion-priors-inverse-problem/

详情
AI中文摘要

在卧室图像上训练的扩散模型能否恢复人脸图像?扩散模型被广泛用作逆问题的先验,但标准方法通常假设一个高保真模型,该模型在与未知信号高度匹配的数据上训练。实践中,常常必须使用不匹配或低保真的扩散先验。令人惊讶的是,这些弱先验的表现往往几乎与全强度的域内基线相当。我们研究了逆求解器何时以及为何对弱扩散先验具有鲁棒性。通过大量实验,我们发现当测量信息高度丰富(例如,大量观测像素)时,弱先验能够成功,并识别了它们失败的场景。为了解释这一行为,我们将贝叶斯一致性理论与局部相关性分析相结合:理论给出了高维测量使后验集中于真实信号附近的条件,而相关性分析表明弱先验和更强的自然图像先验可以共享相似的局部空间结构。这些结果为何时可以可靠地使用弱扩散先验提供了原则性依据。代码可在 https://github.com/jjia131/weak-diffusion-priors-inverse-problem 获取。

英文摘要

Can a diffusion model trained on bedrooms recover human faces? Diffusion models are widely used as priors for inverse problems, but standard approaches usually assume a high-fidelity model trained on data that closely match the unknown signal. In practice, one often must use a mismatched or low-fidelity diffusion prior. Surprisingly, these weak priors often perform nearly as well as full-strength, in-domain baselines. We study when and why inverse solvers are robust to weak diffusion priors. Through extensive experiments, we find that weak priors succeed when measurements are highly informative (e.g., many observed pixels), and we identify regimes where they fail. To explain this behavior, we combine Bayesian-consistency theory with local-correlation analysis: the theory gives conditions under which high-dimensional measurements make the posterior concentrate near the true signal, while the correlation analysis shows that weak and stronger natural-image priors can share similar local spatial structure. These results provide a principled justification for when weak diffusion priors can be used reliably. Code is available at https://github.com/jjia131/weak-diffusion-priors-inverse-problem.

2601.21683 2026-06-03 cs.LG 版本更新

Can Local Learning Match Self-Supervised Backpropagation?

局部学习能否匹配自监督反向传播?

Wu S. Zihan, Ariane Delrocq, Wulfram Gerstner, Guillaume Bellec

发表机构 * University of Zurich(苏黎世大学)

AI总结 本文通过理论分析和算法变体,证明局部自监督学习在深度非线性卷积网络中可接近全局反向传播自监督学习的性能,并在图像数据集上达到或超越现有最优水平。

Comments Accepted at ICML 2026; Code is available at https://github.com/zihan-wu/local-SSL

详情
AI中文摘要

虽然基于反向传播的端到端自监督学习(全局BP-SSL)已成为训练现代AI系统的核心,但局部自监督学习(local-SSL)理论在深度神经网络中构建功能表示方面仍面临挑战。为建立全局与局部规则之间的联系,我们首先发展了深度线性网络的理论:识别了局部SSL算法(如Forward-forward或CLAPP)实现与全局BP-SSL完全相同的权重更新的条件。从理论见解出发,我们随后开发了局部SSL算法的新变体,以近似深度非线性卷积神经网络中的全局BP-SSL。那些提高局部SSL与全局BP-SSL梯度更新相似性的变体在图像数据集(CIFAR-10、STL-10和Tiny ImageNet)上也表现出更好的性能。使用CLAPP损失函数的最佳局部SSL规则与使用InfoNCE或CPC类损失函数的可比全局BP-SSL性能相匹配,并在这些基准上改进了局部SSL的最新技术水平。

英文摘要

While end-to-end self-supervised learning with backpropagation (global BP-SSL) has become central for training modern AI systems, theories of local self-supervised learning (local-SSL) have struggled to build functional representations in deep neural networks. To establish a link between global and local rules, we first develop a theory for deep linear networks: we identify conditions for local-SSL algorithms (like Forward-forward or CLAPP) to implement exactly the same weight update as a global BP-SSL. Starting from the theoretical insights, we then develop novel variants of local-SSL algorithms to approximate global BP-SSL in deep non-linear convolutional neural networks. Variants that improve the similarity between gradient updates of local-SSL with those of global BP-SSL also show better performance on image datasets (CIFAR-10, STL-10, and Tiny ImageNet). The best local-SSL rule with the CLAPP loss function matches the performance of a comparable global BP-SSL with InfoNCE or CPC-like loss functions, and improves upon the state of the art for local-SSL on these benchmarks.

2601.20844 2026-06-03 cs.LG cs.AI cs.IR 版本更新

$\mathbb{R}^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval

$\mathbb{R}^{2k}$ 理论上足够大,用于基于嵌入的 Top-$k$ 检索

Zihao Wang, Hang Yin, Lihui Liu, Hanghang Tong, Yangqiu Song, Ginny Wong, Simon See

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 研究最小可嵌入维度(MED),证明对于内积、欧氏距离和余弦相似度,MED 为 Θ(k),与 m 无关;进一步考虑鲁棒 MED(RMED),推导出可行性上限 ε_⋆(m,k),并通过实验验证理论结果。

Comments v2: fix broken citation. v3: ICML 2026

详情
AI中文摘要

本文研究最小可嵌入维度(MED):即存在 m 个对象向量配置的最小维度,使得每个大小至多为 k 的子集都能通过分数比较被精确检索。我们的结果表明,对于内积、欧氏距离和余弦相似度,MED 为 Θ(k),与 m 无关。然后我们考虑鲁棒 MED(RMED),其中所有向量为单位范数,并且需要 ε 的分数间隙。我们推导出依赖于 m 的可行性上限 ε_⋆(m,k)=m/√(k(m-1)(m-k)),当 m≫k 时趋近于 1/√k,并且高斯质心构造在可行边界区域内给出了鲁棒见证的上界。在合成 top-2 检索上的数值模拟,使用循环多面体和质心查询优化,证实了我们的理论主张。在 LIMIT 和 LIMIT-small 数据集上的实验也表明,简单的基于嵌入的检索基线可能过拟合,并优于报告的单向量 LLM 嵌入基线。理论和实证结果都排除了精确几何容量不足作为障碍的可能性。

英文摘要

This paper studies the Minimal Embeddable Dimension (MED): the least dimension in which there exists a configuration of $m$ object vectors so that every subset of size at most $k$ is exactly retrieved by score comparison. Our result shows MED is $Θ(k)$, independent of $m$, for inner product, Euclidean distance, and cosine similarity. We then consider Robust MED (RMED), where all vectors are unit normed and an $ε$ gap of scores is required. We derive the $m$-dependent feasibility ceiling $ε_\star(m,k)=m/\sqrt{k(m-1)(m-k)}$, which approaches $1/\sqrt{k}$ when $m\gg k$, and a Gaussian centroid construction gives a robust witness upper bound in the feasible margin regime. Numerical simulation on synthetic top-$2$ retrieval with cyclic polytope and centroid query optimization confirmed our theoretical claims. Experiments on LIMIT and LIMIT-small datasets also show that simple embedding-based retrieval baselines can overfit and outperform the reported single-vector LLM embedding baseline. Both theoretical and empirical findings rule out the lack of exact geometric capacity as the obstruction.
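The feasibility ceiling quoted in the abstract can be checked numerically. The sketch below only evaluates the stated closed form $m/\sqrt{k(m-1)(m-k)}$ and compares it with the stated limit $1/\sqrt{k}$; the function names are illustrative and not taken from the paper.

```python
import math


def feasibility_ceiling(m: int, k: int) -> float:
    """Evaluate eps_star(m, k) = m / sqrt(k * (m - 1) * (m - k)).

    The closed form is the one quoted in the abstract; the function name
    and the argument check (1 <= k < m) are assumptions made here.
    """
    if not (1 <= k < m):
        raise ValueError("requires 1 <= k < m")
    return m / math.sqrt(k * (m - 1) * (m - k))


def large_m_limit(k: int) -> float:
    """Limit 1 / sqrt(k) that the ceiling approaches when m >> k."""
    return 1.0 / math.sqrt(k)


if __name__ == "__main__":
    k = 2
    for m in (4, 100, 10_000, 1_000_000):
        print(m, feasibility_ceiling(m, k), large_m_limit(k))
```

For fixed $k$ the printed ceiling decreases toward $1/\sqrt{k}$ as $m$ grows, matching the asymptotic statement in the abstract.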

2601.12247 2026-06-03 cs.CL cs.AI cs.LG 版本更新

Plan, Verify and Fill: A Structured Parallel Decoding Approach for Diffusion Language Models

规划、验证与填充:扩散语言模型的结构化并行解码方法

Miao Li, Hanyang Jiang, Sikai Cheng, Hengyu Fu, Yuhang Cai, Baihe Huang, Tinghan Ye, Xuanzhou Chen, Pascal Van Hentenryck

发表机构 * Georgia Institute of Technology(佐治亚理工学院) University of California, Berkeley(加州大学伯克利分校) University of Michigan(密歇根大学)

AI总结 提出Plan-Verify-Fill (PVF)方法,通过定量验证进行分层骨架规划,并采用验证协议实现结构化停止,在保持准确性的同时将函数评估次数减少高达65%。

详情
AI中文摘要

扩散语言模型(DLM)为文本生成提供了一种有前景的非顺序范式,不同于标准的自回归(AR)方法。然而,当前的解码策略通常采取被动姿态,未能充分利用全局双向上下文来指导全局轨迹。为了解决这个问题,我们提出了Plan-Verify-Fill(PVF),一种无需训练的范式,通过定量验证来锚定规划。PVF通过优先考虑高杠杆语义锚点主动构建分层骨架,并采用验证协议来实现实用的结构化停止,在进一步思考收益递减时停止。在LLaDA-8B-Instruct和Dream-7B-Instruct上的广泛评估表明,与基于置信度的并行解码相比,PVF在基准数据集上将函数评估次数(NFE)减少了高达65%,在不牺牲准确性的情况下实现了卓越的效率。

英文摘要

Diffusion Language Models (DLMs) present a promising non-sequential paradigm for text generation, distinct from standard autoregressive (AR) approaches. However, current decoding strategies often adopt a reactive stance, underutilizing the global bidirectional context to dictate global trajectories. To address this, we propose Plan-Verify-Fill (PVF), a training-free paradigm that grounds planning via quantitative validation. PVF actively constructs a hierarchical skeleton by prioritizing high-leverage semantic anchors and employs a verification protocol to operationalize pragmatic structural stopping where further deliberation yields diminishing returns. Extensive evaluations on LLaDA-8B-Instruct and Dream-7B-Instruct demonstrate that PVF reduces the Number of Function Evaluations (NFE) by up to 65% compared to confidence-based parallel decoding across benchmark datasets, unlocking superior efficiency without compromising accuracy.

2509.01641 2026-06-03 eess.SP cs.AI cs.LG 版本更新

Non-Identical Diffusion Models in MIMO-OFDM Channel Generation

MIMO-OFDM信道生成中的非相同扩散模型

Yuzhi Yang, Omar Alhussein, Mérouane Debbah

AI总结 提出非相同扩散模型,通过元素级时间指示器捕获局部误差变化,解决MIMO-OFDM信道估计中元素可靠性不均的问题,理论验证其正确性并数值实验证明有效性。

Comments resubmitted to IEEE TCOM

详情
AI中文摘要

我们提出了一种新颖的扩散模型,称为非相同扩散模型,并研究了其在无线正交频分复用(OFDM)信道生成中的应用。与使用标量时间索引表示全局噪声水平的标准扩散模型不同,我们将这一概念扩展为元素级时间指示器,以更准确地捕获局部误差变化。非相同扩散使我们能够表征噪声输入中每个元素(例如OFDM中的子载波)的可靠性,从而在初始化有偏时改善生成结果。具体来说,我们专注于无线多输入多输出(MIMO)OFDM信道矩阵的恢复,其中由于导频方案,初始信道估计在元素间表现出高度不均匀的可靠性。传统的时间嵌入假设噪声进展均匀,无法捕获这种跨导频方案和噪声水平的变化。我们引入一个与输入大小匹配的矩阵来控制元素级噪声进展。遵循与现有方法类似的扩散过程,我们从理论和数值上证明了所提出的非相同扩散方案的正确性和有效性。对于MIMO-OFDM信道生成,我们提出了一种维度级时间嵌入策略。我们还开发并评估了多种训练和生成方法,并通过数值实验进行了比较。

英文摘要

We propose a novel diffusion model, termed the non-identical diffusion model, and investigate its application to wireless orthogonal frequency division multiplexing (OFDM) channel generation. Unlike the standard diffusion model that uses a scalar-valued time index to represent the global noise level, we extend this notion to an element-wise time indicator to capture local error variations more accurately. Non-identical diffusion enables us to characterize the reliability of each element (e.g., subcarriers in OFDM) within the noisy input, leading to improved generation results when the initialization is biased. Specifically, we focus on the recovery of wireless multi-input multi-output (MIMO) OFDM channel matrices, where the initial channel estimates exhibit highly uneven reliability across elements due to the pilot scheme. Conventional time embeddings, which assume uniform noise progression, fail to capture such variability across pilot schemes and noise levels. We introduce a matrix that matches the input size to control element-wise noise progression. Following a similar diffusion procedure to existing methods, we show the correctness and effectiveness of the proposed non-identical diffusion scheme both theoretically and numerically. For MIMO-OFDM channel generation, we propose a dimension-wise time embedding strategy. We also develop and evaluate multiple training and generation methods and compare them through numerical experiments.

2501.17377 2026-06-03 cs.LG cs.AI 版本更新

ASAP: Exploiting the Satisficing Generalization Edge in Neural Combinatorial Optimization

ASAP:利用神经组合优化中的满意泛化优势

Han Fang, Paul Weng, Yutong Ban

AI总结 针对神经组合优化模型在分布偏移下的脆弱性,提出ASAP框架,通过将决策分解为提案和选择两阶段,并利用MAML增强在线适应能力,在3D-BPP、TSP和CVRP上提升了泛化性能。

Comments Accepted as poster of ICML-2026

详情
AI中文摘要

深度强化学习(DRL)已成为解决组合优化(CO)问题(如3D装箱问题(3D-BPP)、旅行商问题(TSP)或车辆路径问题(VRP))的一种有前景的方法,但这些神经求解器在面对分布偏移时往往表现出脆弱性。为了解决这个问题,我们揭示了满意泛化优势,并在理论和实验上进行了验证:识别一组有希望的行动本质上比选择单一最优行动更具泛化性。为了利用这一特性,我们提出了提案后自适应选择(ASAP),这是一个通用框架,将决策过程分解为两个不同的阶段:作为鲁棒过滤器的提案策略和作为可适应决策者的选择策略。这种架构使得一种高效的在线适应策略成为可能,其中选择策略可以在新分布上快速微调。具体地,我们引入了一个由模型无关元学习(MAML)增强的两阶段训练框架,以使模型能够快速适应。在3D-BPP、TSP和CVRP上的大量实验表明,ASAP提高了最先进基线的泛化能力,并在分布外实例上实现了优越的在线适应。

英文摘要

Deep Reinforcement Learning (DRL) has emerged as a promising approach for solving Combinatorial Optimization (CO) problems, such as the 3D Bin Packing Problem (3D-BPP), Traveling Salesman Problem (TSP), or Vehicle Routing Problem (VRP), but these neural solvers often exhibit brittleness when facing distribution shifts. To address this issue, we uncover the Satisficing Generalization Edge, which we validate both theoretically and experimentally: identifying a set of promising actions is inherently more generalizable than selecting the single optimal action. To exploit this property, we propose Adaptive Selection After Proposal (ASAP), a generic framework that decomposes the decision-making process into two distinct phases: a proposal policy that acts as a robust filter, and a selection policy as an adaptable decision maker. This architecture enables a highly effective online adaptation strategy where the selection policy can be rapidly fine-tuned on a new distribution. Concretely, we introduce a two-phase training framework enhanced by Model-Agnostic Meta-Learning (MAML) to prime the model for fast adaptation. Extensive experiments on 3D-BPP, TSP, and CVRP demonstrate that ASAP improves the generalization capability of state-of-the-art baselines and achieves superior online adaptation on out-of-distribution instances.

2511.04243 2026-06-03 quant-ph cs.LG 版本更新

Twirlator: A Pipeline for Analyzing Subgroup Symmetry Effects in Quantum Machine Learning Ansatzes

Twirlator: 分析量子机器学习拟设中子群对称性效应的流水线

Valter Uotila, Väinö Mehtola, Ilmo Salmenperä, Bo Zhao

AI总结 提出Twirlator流水线,通过对称群子群大小建模部分对称性,量化对称性增加时量子机器学习拟设的生成器漂移、电路开销、表达能力和纠缠能力之间的权衡。

Comments 8 pages; 7 figures; presented at the 7th International Workshop on Quantum Software Engineering (Q-SE 2026)

详情
Journal ref
Q-SE '26: Proceedings of the 7th IEEE/ACM International Workshop on Quantum Software Engineering (2026) 55 - 62
AI中文摘要

对称性是几何深度学习及其量子对应物中的强归纳偏置,并因其改善QML模型可训练性而受到越来越多的关注。然而,将对称性纳入量子机器学习(QML)拟设并非免费:对称化通常会增加门并约束电路。为了理解这些效应,我们提出了Twirlator,这是一个自动化流水线,用于对称化参数化QML拟设,并量化随着对称性增加而产生的权衡。Twirlator通过对称群子群的大小对部分对称性进行建模,从而能够分析“无对称性”和“完全对称性”极端之间的情形。在19种常见拟设模式中,Twirlator针对$S_n$的任何子群对称化电路,并测量(1)生成器漂移,(2)电路开销(深度和大小),以及(3)表达能力和纠缠能力。实验评估聚焦于$S_4$和$S_5$的子群。Twirlator揭示,较大的子群通常会增加电路开销,降低表达能力,并往往增加纠缠能力。该流水线和结果为在对称性感知的QML应用中选择平衡硬件成本和模型性能的拟设模式和对称性水平提供了实用指导。

英文摘要

Symmetry is a strong inductive bias in geometric deep learning and its quantum counterpart, and has attracted increasing attention for improving the trainability of QML models. Yet incorporating symmetries into quantum machine learning (QML) ansatzes is not free: symmetrization often adds gates and constrains the circuits. To understand these effects, we present Twirlator, an automated pipeline that symmetrizes parameterized QML ansatzes and quantifies the trade-offs as the amount of symmetry increases. Twirlator models partial symmetries by the size of a subgroup of the symmetric group, enabling analysis between the "no symmetry" and "full symmetry" extremes. Across 19 common ansatz patterns, Twirlator symmetrizes circuits with respect to any subgroup of $S_n$ and measures (1) generator drift, (2) circuit overhead (depth and size), and (3) expressibility and entangling capability. The experimental evaluation focuses on subgroups of $S_4$ and $S_5$. Twirlator reveals that larger subgroups typically increase circuit overhead, reduce expressibility, and often increase entangling capability. The pipeline and results provide practical guidance for selecting ansatz patterns and symmetry levels that balance hardware cost and model performance in symmetry-aware QML applications.

2601.17130 2026-06-03 cs.LG cs.CR 版本更新

Impact of Graph Structure on Membership-Inference Risk for Graph Neural Networks

图结构对图神经网络成员推理风险的影响

Megha Khosla

发表机构 * Delft University of Technology(代尔夫特理工大学)

AI总结 本文通过分析训练图构建和推理时边访问两个维度,研究了图结构如何影响图神经网络的节点级成员推理风险,并发现雪球采样会损害泛化能力,而推理时边访问能显著改变成员推理优势。

Comments Accepted for publication in PETS 2026

详情
AI中文摘要

图神经网络(GNN)广泛用于节点分类和链接预测等任务,但在敏感场景中的使用引发了训练数据泄露的担忧。先前关于GNN隐私泄露的工作大多借鉴非图领域的假设,忽视了图结构的作用。我们主张对隐私风险进行图特定的分析,并研究图结构如何影响节点级成员推理。我们形式化了节点-邻域元组上的成员推理(MI),并探讨了两个重要维度:(i)训练图构建和(ii)推理时边访问。我们比较了雪球采样(一种结构感知过程)与均匀随机节点采样用于构建训练图。实验表明,雪球采样由于其覆盖偏差,通常比随机采样更损害泛化能力。相反,在推理时允许访问训练-测试间边可以提高测试准确率,缩小训练-测试差距,同时也会对成员推理优势产生强烈且依赖于设置的影响。这些结果表明图结构直接塑造了隐私风险。我们进一步表明,泛化差距(以训练和测试节点之间的性能差异衡量)是成员推理风险的不完全代理:成员推理优势可以独立于该差距的变化而上升或下降,而推理时边访问通常起着关键作用。理论上,我们证明对于节点级任务,基于成员推理的标准隐私审计结果不能直接推广到归纳图设置,因为训练和测试节点在结构上相互依赖而非可互换。我们在https://github.com/PriXAI/GraphStructurePrivacyAnalysis-public 发布代码和数据。

英文摘要

Graph neural networks (GNNs) are widely used for tasks such as node classification and link prediction, but their use in sensitive settings raises concerns about training-data leakage. Prior work on privacy leakage in GNNs largely borrows assumptions from non-graph domains, overlooking the role of graph structure. We argue for a graph-specific analysis of privacy risk and study how graph structure affects node-level membership inference. We formalize membership inference (MI) over node-neighborhood tuples and investigate two important dimensions: (i) training-graph construction and (ii) inference-time edge access. We compare snowball sampling, a structure-aware procedure, with uniform random node sampling for constructing training graphs. Our experiments show that snowball sampling often hurts generalization relative to random sampling due to its coverage bias. In contrast, allowing access to inter-train-test edges at inference improves test accuracy and reduces the train-test gap, while also having a strong and setting-dependent effect on membership advantage. These results show that graph structure directly shapes privacy risk. We further show that the generalization gap, measured as the performance difference between training and test nodes, is an incomplete proxy for membership inference risk: membership advantage can rise or fall independently of changes in this gap, with inference-time edge access often playing a crucial role. Theoretically, we show that for node-level tasks, standard privacy-auditing results based on membership inference do not directly carry over to inductive graph settings, because training and test nodes are structurally dependent rather than interchangeable. We release the code and data at https://github.com/PriXAI/GraphStructurePrivacyAnalysis-public.

2601.14569 2026-06-03 cs.CL cs.LG 版本更新

Social Caption: Evaluating Social Understanding in Multimodal Models

Social Caption: 评估多模态模型的社会理解能力

Leena Mathur, Bhaavanaa Thumu, Youssouf Kebe, Louis-Philippe Morency

发表机构 * School of Computer Science, Carnegie Mellon University(卡内基梅隆大学计算机科学学院)

AI总结 提出基于交互理论的SOCIAL CAPTION框架,从社会推理、整体社会分析和定向社会分析三个维度评估多模态大语言模型的社会理解能力,并分析影响性能的因素。

Comments 25 pages, 10 figures

详情
AI中文摘要

社会理解能力对于多模态大语言模型(MLLMs)解读人类社交互动至关重要。我们引入SOCIAL CAPTION,这是一个基于交互理论的框架,用于从三个维度评估MLLMs的社会理解能力:社会推理(SI),即对互动做出准确推断的能力;整体社会分析(HSA),即生成互动全面描述的能力;定向社会分析(DSA),即从互动中生成相关信息的能力。我们分析了影响模型社会理解性能的因素,如规模、架构设计和口语语境。使用MLLM评判员的实验展示了扩展多模态社会理解自动化评估的路径。

英文摘要

Social understanding abilities are crucial for multimodal large language models (MLLMs) to interpret human social interactions. We introduce SOCIAL CAPTION, a framework grounded in interaction theory to evaluate social understanding abilities of MLLMs along three dimensions: Social Inference (SI), the ability to make accurate inferences about interactions; Holistic Social Analysis (HSA), the ability to generate comprehensive descriptions of interactions; Directed Social Analysis (DSA), the ability to generate relevant information from interactions. We analyze factors influencing model performance in social understanding, such as scale, architectural design, and spoken context. Experiments with MLLM judges demonstrate a path towards scaling automated evaluation of multimodal social understanding.

2601.11667 2026-06-03 cs.LG cs.AI 版本更新

Distill-then-Replace: Efficient Task-Specific Hybrid Attention Model Construction

Distill-then-Replace: 高效的任务特定混合注意力模型构建

Xiaojie Xia, Huigang Zhang, Chaoliang Zhong, Jun Sun, Yusuke Oishi

发表机构 * Fujitsu Research & Development Center CO., LTD(富士通研发中心有限公司) Fujitsu Research, FUJITSU LTD(富士通研究所,富士通有限公司)

AI总结 提出Distill-then-Replace (DtR)方法,通过逐块局部蒸馏和贪婪层替换策略,将预训练的全注意力模型高效转换为任务特定的混合注意力模型,无需重新训练或神经架构搜索。

详情
AI中文摘要

Transformer架构通过密集的全注意力机制实现了最先进的准确性,但其相对于序列长度的二次时间和内存复杂度限制了实际部署。线性注意力机制提供线性或接近线性的缩放,但通常会导致性能下降。集成全注意力和线性注意力层的混合模型有望在效率和表达能力之间取得平衡,但面临两个主要挑战:从头训练此类混合模型计算成本高,且手动设计注意力类型的最佳放置位置非常困难。我们提出DtR(Distill-then-Replace),首先通过逐块局部蒸馏将预训练的全注意力模块的权重转移到其线性注意力对应模块,然后应用贪婪层替换策略,迭代地用线性注意力块替换全注意力块,同时监控目标任务的验证性能。DtR在单次高效过程中生成任务特定的混合模型,无需昂贵的重新训练或神经架构搜索,并可应用于任何预训练的全注意力骨干网络以处理各种下游任务。

英文摘要

Transformer architectures deliver state-of-the-art accuracy via dense full-attention, but their quadratic time and memory complexity with respect to sequence length limits practical deployment. Linear attention mechanisms offer linear or near-linear scaling yet often incur performance degradation. Hybrid models that integrate full and linear attention layers promise a balance between efficiency and expressiveness, but face two major challenges: training such hybrid models from scratch is computationally expensive, and manually designing the optimal placement of attention types is highly nontrivial. We propose DtR (Distill-then-Replace), which first transfers weights from the pretrained full-attention modules to their linear attention counterparts through blockwise local distillation, and then applies a greedy layer replacement strategy that iteratively substitutes full attention blocks with linear ones while monitoring validation performance on the target task. DtR yields a task-specific hybrid model in a single efficient pass, without costly re-training or neural architecture search, and can be applied to any pretrained full-attention backbone for diverse downstream tasks.
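The greedy replacement step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `evaluate` callback, the `tolerance` stopping rule, and the toy per-layer sensitivities are all hypothetical, and the distillation stage is omitted entirely.

```python
from typing import Callable, List


def greedy_replace(num_layers: int,
                   evaluate: Callable[[List[str]], float],
                   tolerance: float) -> List[str]:
    """Greedily swap "full" attention layers for "linear" ones.

    At each round, every remaining full-attention layer is tried in turn,
    and the swap with the highest validation score is kept. The loop stops
    once the best available swap drops the score more than `tolerance`
    below the all-full baseline (an assumed stopping rule).
    """
    layout = ["full"] * num_layers
    floor = evaluate(layout) - tolerance
    while "full" in layout:
        best_score, best_idx = None, None
        for idx, kind in enumerate(layout):
            if kind != "full":
                continue
            trial = layout.copy()
            trial[idx] = "linear"
            score = evaluate(trial)
            if best_score is None or score > best_score:
                best_score, best_idx = score, idx
        if best_score < floor:
            break
        layout[best_idx] = "linear"
    return layout


# Toy stand-in for validation accuracy: each replaced layer costs a fixed,
# made-up amount of accuracy (purely illustrative numbers).
SENSITIVITY = [0.30, 0.01, 0.02, 0.25]


def toy_validation_score(layout: List[str]) -> float:
    return 1.0 - sum(s for s, kind in zip(SENSITIVITY, layout) if kind == "linear")


if __name__ == "__main__":
    print(greedy_replace(4, toy_validation_score, tolerance=0.05))
```

In this toy setting the two least sensitive layers are replaced and the two most sensitive ones stay as full attention; with a real model, `evaluate` would run the target-task validation set.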

2510.22491 2026-06-03 cs.LG cs.CE cs.CV 版本更新

LAMP: Data-Efficient Linear Affine Weight-Space Models for Parameter-Controlled 3D Shape Generation and Extrapolation

LAMP: 数据高效的线性仿射权重空间模型用于参数控制的3D形状生成与外推

Ghadi Nehme, Yanxia Zhang, Dule Shu, Matt Klenk, Faez Ahmed

AI总结 提出LAMP框架,通过过拟合共享初始化的符号距离函数解码器并对齐权重空间,以少量样本实现参数约束下的可控3D生成与外推,并引入线性失配安全度量确保可靠性。

详情
AI中文摘要

在显式参数约束下生成高保真3D几何体是工程设计的核心,但当前方法通常需要大型数据集,且无法在训练分布之外提供可靠控制。我们提出LAMP,一个数据高效的框架,用于可控和可解释的3D生成,该框架通过从共享初始化过拟合每个样本并对齐符号距离函数(SDF)解码器,然后在对齐的权重空间中通过求解参数约束的仿射混合问题来生成新设计。为了提高可靠性,我们提出一种线性失配安全度量,用于检测混合解码器何时离开有效的局部区域。我们在DrivAerNet++、BlendedNet以及额外的工业级车辆系列(包括跑车、SUV和敞篷车)上评估LAMP。LAMP能够以少至50个样本实现受控插值,在训练范围外安全外推高达100%,并在固定参数下进行性能引导优化,在外推、数据效率和参数保真度方面优于条件自编码器和深度网络插值(DNI)基线。我们的结果表明,LAMP推进了用于设计探索、数据集生成和性能驱动优化的可控、数据高效且安全的3D生成。

英文摘要

Generating high-fidelity 3D geometries under explicit parameter constraints is central to engineering design, yet current methods often require large datasets and fail to provide reliable control beyond the training distribution. We introduce LAMP, a data-efficient framework for controllable and interpretable 3D generation that aligns signed distance function (SDF) decoders by overfitting each exemplar from a shared initialization, then generates new designs by solving a parameter-constrained affine mixing problem in the aligned weight space. To improve reliability, we propose a linearity-mismatch safety metric that detects when mixed decoders leave the valid local regime. We evaluate LAMP on DrivAerNet++, BlendedNet, and additional industry-level vehicle families, including sports cars, SUVs, and convertibles. LAMP enables controlled interpolation with as few as 50 samples, safe extrapolation up to 100% beyond training ranges, and performance-guided optimization under fixed parameters, outperforming conditional autoencoder and Deep Network Interpolation (DNI) baselines in extrapolation, data efficiency, and parameter fidelity. Our results demonstrate that LAMP advances controllable, data-efficient, and safe 3D generation for design exploration, dataset generation, and performance-driven optimization.

2601.04120 2026-06-03 math.OC cs.LG 版本更新

A Single-Loop Bilevel Deep Learning Method for Optimal Control of Obstacle Problems

障碍问题最优控制的单环双层深度学习方法

Yongcun Song, Shangzhi Zeng, Jin Zhang, Lvgang Zhang

发表机构 * SUSTech(南方科技大学)

AI总结 提出一种无网格、可扩展的单环双层深度学习方法,通过约束嵌入神经网络和单环随机一阶双层算法高效求解障碍问题的最优控制。

详情
AI中文摘要

障碍问题的最优控制出现在广泛的应用中,由于其非光滑性、非线性和双层结构,计算上具有挑战性。经典数值方法依赖于基于网格的离散化,通常需要求解一系列代价高昂的子问题。在这项工作中,我们提出了一种单环双层深度学习方法,该方法无网格、可扩展到高维和复杂域,并避免重复求解离散子问题。该方法采用约束嵌入神经网络来逼近状态和控制,并保持双层结构。为了高效训练神经网络,我们提出了一种单环随机一阶双层算法(S2-FOBA),该算法消除了嵌套优化,并且不依赖于限制性的下层唯一性假设。我们在温和假设下分析了S2-FOBA的收敛行为。在基准示例上的数值实验,包括复杂域上具有规则和不规则障碍的分布控制和障碍控制问题,表明与经典数值方法相比,所提出的方法在降低计算成本的同时实现了令人满意的精度。

英文摘要

Optimal control of obstacle problems arises in a wide range of applications and is computationally challenging due to its nonsmoothness, nonlinearity, and bilevel structure. Classical numerical approaches rely on mesh-based discretization and typically require solving a sequence of costly subproblems. In this work, we propose a single-loop bilevel deep learning method, which is mesh-free, scalable to high-dimensional and complex domains, and avoids repeated solution of discretized subproblems. The method employs constraint-embedding neural networks to approximate the state and control and preserves the bilevel structure. To train the neural networks efficiently, we propose a Single-Loop Stochastic First-Order Bilevel Algorithm (S2-FOBA), which eliminates nested optimization and does not rely on restrictive lower-level uniqueness assumptions. We analyze the convergence behavior of S2-FOBA under mild assumptions. Numerical experiments on benchmark examples, including distributed and obstacle control problems with regular and irregular obstacles on complex domains, demonstrate that the proposed method achieves satisfactory accuracy while reducing computational cost compared to classical numerical methods.

2505.11785 2026-06-03 cs.LG cs.AI stat.ML 版本更新

Improving Coverage in Combined Prediction Sets with Weighted p-values

通过加权p值提高组合预测集的覆盖范围

Gina Wong, Drew Prinster, Suchi Saria, Rama Chellappa, Anqi Liu

发表机构 * Johns Hopkins University(约翰霍普金斯大学)

AI总结 提出一种加权聚合预测集的框架,通过为每个预测集分配权重,实现覆盖率保证在$1-2α$与$1-α$之间的灵活控制,并推广到数据依赖权重,在混合专家模型等场景中保持有限样本有效性。

详情
Journal ref
AISTATS 2026
AI中文摘要

共形预测通过用有效的预测集增强点预测来量化机器学习模型的不确定性。对于涉及多个试验、模型或数据源的复杂场景,可以聚合共形预测集以创建捕获整体不确定性的预测集,通常能提高精度。然而,聚合具有个体$1-α$覆盖率的多个预测集不可避免地削弱了整体保证,通常导致最坏情况覆盖率为$1-2α$。在这项工作中,我们提出了一个预测集加权聚合的框架,其中根据每个预测集的贡献为其分配权重。我们的框架提供了对集合聚合方式的灵活控制,实现了更紧的覆盖界限,根据权重的分布在组合模型的$1-2α$保证和单个模型的$1-α$保证之间插值。重要的是,我们的框架推广到数据依赖的权重,因为我们推导了一个加权聚合程序,即使权重依赖于数据,也能保持有限样本有效性。这一扩展使我们的框架广泛适用于权重被学习的场景,例如混合专家模型(MoE),并且我们通过在MoE设置中的实验证明,我们的方法实现了自适应覆盖。

英文摘要

Conformal prediction quantifies the uncertainty of machine learning models by augmenting point predictions with valid prediction sets. For complex scenarios involving multiple trials, models, or data sources, conformal prediction sets can be aggregated to create a prediction set that captures the overall uncertainty, often improving precision. However, aggregating multiple prediction sets with individual $1-α$ coverage inevitably weakens the overall guarantee, typically resulting in $1-2α$ worst-case coverage. In this work, we propose a framework for the weighted aggregation of prediction sets, where weights are assigned to each prediction set based on their contribution. Our framework offers flexible control over how the sets are aggregated, achieving tighter coverage bounds that interpolate between the $1-2α$ guarantee of the combined models and the $1-α$ guarantee of an individual model depending on the distribution of weights. Importantly, our framework generalizes to data-dependent weights, as we derive a procedure for weighted aggregation that maintains finite-sample validity even when the weights depend on the data. This extension makes our framework broadly applicable to settings where weights are learned, such as mixture-of-experts (MoE), and we demonstrate through experiments in the MoE setting that our methods achieve adaptive coverage.
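To make the idea of weighted set aggregation concrete, the sketch below implements a plain weighted majority vote over label sets. It is an assumed, simplified stand-in and not the authors' weighted p-value procedure: the function name, the `threshold` parameter, and the example sets are hypothetical, and no coverage guarantee is claimed for this code.

```python
from typing import Hashable, Iterable, Sequence, Set


def weighted_majority_vote(prediction_sets: Sequence[Iterable[Hashable]],
                           weights: Sequence[float],
                           threshold: float = 0.5) -> Set[Hashable]:
    """Keep a label if the normalized weight of sets containing it exceeds `threshold`."""
    if len(prediction_sets) != len(weights):
        raise ValueError("need one weight per prediction set")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    votes = {}
    for pred_set, weight in zip(prediction_sets, weights):
        for label in set(pred_set):
            votes[label] = votes.get(label, 0.0) + weight / total
    return {label for label, vote in votes.items() if vote > threshold}


if __name__ == "__main__":
    sets = [{0, 1}, {1, 2}, {1}]
    # Spread-out weights: only the label shared by all sets survives.
    print(weighted_majority_vote(sets, [0.4, 0.4, 0.2]))
    # All weight on the first set: the aggregate reduces to that set.
    print(weighted_majority_vote(sets, [1.0, 0.0, 0.0]))
```

The second call shows the single-model extreme: when one set carries all the weight, the aggregate is exactly that set, which is the regime where the abstract's $1-α$ individual guarantee applies.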

2507.16003 2026-06-03 cs.CL cs.LG 版本更新

Learning without training: The implicit dynamics of in-context learning

无需训练的学习:上下文学习的隐式动态

Benoit Dherin, Michael Munn, Hanna Mazzawi, Michael Wunder, Javier Gonzalvo

发表机构 * Google(谷歌)

AI总结 本文通过理论分析和实验证明,自注意力层与MLP的组合使Transformer块能够根据上下文隐式修改MLP权重,从而解释大语言模型在推理时无需权重更新即可进行上下文学习的机制。

详情
AI中文摘要

大型语言模型(LLMs)最显著的特征之一是其上下文学习能力。即在推理时,即使提示中呈现的模式在训练中未见,LLM也能在无需额外权重更新的情况下学习新模式。这种机制如何实现仍很大程度上未知。本文中,我们展示了自注意力层与MLP的堆叠使得Transformer块能够根据上下文隐式修改MLP层的权重。通过理论分析和实验,我们认为这种简单机制可能有助于解释为什么LLMs展现出超越训练捕获的上下文学习能力。具体而言,我们证明带有上下文的标准前向传播在数学上等价于无上下文但MLP权重通过表示上下文的最小低秩更新进行更新的前向传播。

英文摘要

One of the most striking features of Large Language Models (LLMs) is their ability to learn in-context. Namely, at inference time, an LLM is able to learn new patterns without any additional weight update when these patterns are presented in the form of examples in the prompt, even if these patterns were not seen during training. The mechanisms through which this can happen are still largely unknown. In this work, we show that the stacking of a self-attention layer with an MLP allows the transformer block to implicitly modify the weights of the MLP layer according to the context. We argue through theoretical analysis and experimentation that this simple mechanism may help explain why LLMs demonstrate capabilities of in-context learning, beyond what is captured during training. Specifically, we show that a standard forward pass with context is mathematically equivalent to a forward pass without context but with the MLP weights updated by a minimal low-rank update representing the context.
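The equivalence stated in the last sentence can be illustrated with a small numerical check. The sketch assumes a single weight matrix `W` applied to the attention output, and uses one rank-1 construction that satisfies the identity by plain linear algebra; the paper's exact update formula may differ, and the vectors here are random stand-ins rather than real attention outputs.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def implicit_update(W, a_no_ctx, a_ctx):
    """Rank-1 dW with (W + dW) @ a_no_ctx == W @ a_ctx.

    dW = (W @ delta) a_no_ctx^T / ||a_no_ctx||^2, with delta = a_ctx - a_no_ctx.
    This is an assumed construction chosen because the identity is easy to verify.
    """
    delta = [c - n for c, n in zip(a_ctx, a_no_ctx)]
    w_delta = matvec(W, delta)
    norm_sq = sum(x * x for x in a_no_ctx)
    return [[wd * a / norm_sq for a in a_no_ctx] for wd in w_delta]


rng = random.Random(0)
d_in, d_out = 4, 3
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
a_no_ctx = [rng.gauss(0, 1) for _ in range(d_in)]  # stand-in: attention output without context
a_ctx = [rng.gauss(0, 1) for _ in range(d_in)]     # stand-in: attention output with context

dW = implicit_update(W, a_no_ctx, a_ctx)
W_new = [[w + d for w, d in zip(row_w, row_d)] for row_w, row_d in zip(W, dW)]

with_context = matvec(W, a_ctx)            # original weights, context present
without_context = matvec(W_new, a_no_ctx)  # updated weights, context absent
max_gap = max(abs(p - q) for p, q in zip(with_context, without_context))
```

Because `dW` is an outer product of two vectors it has rank at most one, and `max_gap` is zero up to floating-point error, so the first layer sees the same pre-activation in both cases.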

2512.16882 2026-06-03 physics.chem-ph cond-mat.mtrl-sci cs.LG 版本更新

A Cartesian-3j Framework for Machine Learning Interatomic Potentials

机器学习原子间势的 Cartesian-3j 框架

Zemin Xu, Chenyu Wu, Wenbo Xie, P. Hu

发表机构 * University of Science and Technology of China(中国科学技术大学)

AI总结 提出基于Cartesian-3j符号和Cartesian广义Clebsch-Gordan系数的不可约Cartesian张量框架,构建MACE、NequIP和Allegro的Cartesian版本,并引入TACE-v1-OAM-M模型在Matbench Discovery上取得竞争性能。

详情
AI中文摘要

机器学习原子间势(MLIPs)在计算化学的外推能力方面带来了显著提升。然而,大多数等变模型通常使用球张量(STs)构建,而笛卡尔张量公式尽管与原子坐标和张量目标自然对齐,但仍未得到充分发展。在这项工作中,我们通过引入Cartesian-3j符号和Cartesian广义Clebsch-Gordan系数,为不可约Cartesian张量(ICTs)开发了一个Cartesian框架,这些符号和系数直接类比于为ST耦合定义的Wigner-3j符号和广义Clebsch-Gordan系数。我们扩展了e3nn库以支持ICT乘积,并使用该框架构建了MACE、NequIP和Allegro的Cartesian对应版本,从而首次实现了在固定架构仅改变张量基下的受控比较。我们的实验表明,不可约Cartesian模型可以达到与球面对应版本相当的精度,但直接Cartesian化会导致不利的计算和内存缩放,这促使我们采用专门的Cartesian架构选择。利用ICTs和我们的框架,我们引入了TACE-v1-OAM-M,并证明它在Matbench Discovery上取得了与最先进ST模型竞争的性能。

英文摘要

Machine learning interatomic potentials (MLIPs) have brought substantial gains in the extrapolation capability in computational chemistry. However, most equivariant models are typically built with spherical tensors (STs), while Cartesian tensor formulations remain less developed despite their natural alignment with atomic coordinates and tensorial targets. In this work, we develop a Cartesian framework for irreducible Cartesian tensors (ICTs) by introducing the Cartesian-3j symbol and Cartesian Generalized Clebsch-Gordan Coefficients, which serve as direct analogues of the Wigner-3j symbol and Generalized Clebsch-Gordan coefficients defined for ST coupling. We extend the e3nn library to support ICT products, and use this framework to build Cartesian counterparts of MACE, NequIP, and Allegro, allowing the first controlled comparison where architectures are held fixed and only the tensor basis is changed. Our experiments show that irreducible Cartesian models can achieve accuracy comparable to spherical counterparts, but direct Cartesianization incurs unfavorable compute and memory scaling, motivating dedicated Cartesian architectural choices. Leveraging ICTs and our framework, we introduce TACE-v1-OAM-M and demonstrate that it achieves competitive performance on Matbench Discovery compared to state-of-the-art ST models.

2512.15427 2026-06-03 cs.LG cond-mat.stat-mech math.ST stat.TH 版本更新

Statistics of Min-max Normalized Eigenvalues in Random Matrices

随机矩阵中最小-最大归一化特征值的统计

Hyakka Nakada, Shu Tanaka

发表机构 * Graduate School of Science and Technology(理工学研究科) Keio University(庆应大学) Department of Applied Physics and Physico-Informatics(应用物理与物理信息学系) Keio University Sustainable Quantum Artificial Intelligence Center (KSQAIC)(庆应大学可持续量子人工智能中心) Human Biology-Microbiome-Quantum Research Center (WPI-Bio2Q)(人类生物学-微生物组-量子研究中心(WPI-Bio2Q)) Green Computing System Research Organization(绿色计算系统研究机构)

AI总结 研究随机矩阵中最小-最大归一化特征值的统计性质,提出有效分布并推导累积分布的标度律和矩阵分解的残差误差。

Comments 4 pages, 4 figures

详情
Journal ref
Journal of the Physical Society of Japan, vol. 95, no. 6, pp. 064003, 2026
AI中文摘要

随机矩阵理论在纯数学、数学物理和机器学习的各个领域都发挥了重要作用。从数据科学的实际角度来看,输入数据通常在处理前进行归一化。因此,本研究探讨了随机矩阵中最小-最大归一化特征值的统计性质。先前,已经提出了这种归一化特征值的有效分布。在本研究中,我们将其应用于评估累积分布的标度律。此外,我们推导了随机矩阵分解过程中产生的残差误差。我们进行了数值实验来验证这些理论预测。

英文摘要

Random matrix theory has played an important role in various areas of pure mathematics, mathematical physics, and machine learning. From a practical perspective of data science, input data are usually normalized prior to processing. Thus, this study investigates the statistical properties of min-max normalized eigenvalues in random matrices. Previously, the effective distribution for such normalized eigenvalues has been proposed. In this study, we apply it to evaluate a scaling law of the cumulative distribution. Furthermore, we derive the residual error that arises during matrix factorization of random matrices. We conducted numerical experiments to verify these theoretical predictions.

2512.03019 2026-06-03 cs.LG cs.AI 版本更新

Distribution-Calibrated Inference Time Compute for Thinking LLM-as-a-Judge

分布校准的推理时间计算用于思考型LLM作为评判者

Hamid Dadkhahi, Firas Trabelsi, Parker Riley, Juraj Juraska, Mehdi Mirzazadeh

发表机构 * University of California, Berkeley(加州大学伯克利分校) DeepMind(DeepMind) University of Cambridge(剑桥大学)

AI总结 针对思考型大语言模型作为评判者时单样本噪声和聚合不一致问题,提出基于Bradley-Terry-Davidson模型的分布校准聚合方案,利用极性(非平局边际)和决定性(非平局率)区分微弱多数与强共识,显著降低MAE并提高成对准确率,匹配或超越人类评判者。

详情
AI中文摘要

用作成对偏好评判的思考型大语言模型在单样本层面仍存在噪声,常见的聚合规则(多数投票、软自一致性或基于指令的自聚合)在允许平局时不一致。我们研究了评估者的推理时间计算(ITC),该评估者为每个项目生成n个独立的思考-评分样本,并提出了一种原则性的、分布校准的聚合方案。我们的方法使用Bradley-Terry-Davidson公式对评分计数进行三向偏好建模,利用极性(非平局间的边际)和决定性(非平局率)来区分微弱多数与强共识。在各种评估基准上,与标准基线相比,我们的方法持续降低MAE并提高成对准确率,并且在针对人类共识元标签进行评估时,匹配或超过单个人类评判者。这些结果表明,精心分配ITC并使用分布感知方法进行聚合,可以将嘈杂的个体模型判断转化为可靠的评估评分。

英文摘要

Thinking Large Language Models (LLMs) used as judges for pairwise preferences remain noisy at the single-sample level, and common aggregation rules (majority vote, soft self-consistency, or instruction-based self-aggregation) are inconsistent when ties are allowed. We study inference-time compute (ITC) for evaluators that generate n independent thinking--rating samples per item, and propose a principled, distribution-calibrated aggregation scheme. Our method models three-way preferences with a Bradley-Terry-Davidson formulation on rating counts, leveraging both polarity (margin among non-ties) and decisiveness (non-tie rate) to distinguish narrow margins from strong consensus. Across various evaluation benchmarks, our approach consistently reduces MAE and increases pairwise accuracy versus standard baselines, and when evaluated against human-consensus meta-labels, matches or exceeds individual human raters. These results show that carefully allocating ITC and aggregating with distribution-aware methods turns noisy individual model judgments into reliable ratings for evaluation.
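The quantities named in the abstract can be sketched from rating counts. The Davidson tie extension of Bradley-Terry used below is the standard textbook form; the exact definitions of polarity and decisiveness are assumptions inferred from the parenthetical glosses, and the paper's calibration and aggregation procedure is not reproduced here.

```python
import math


def davidson_probs(strength_a: float, strength_b: float, nu: float):
    """Return (P[A wins], P[B wins], P[tie]) under the Davidson tie model.

    P[tie] is proportional to nu * sqrt(strength_a * strength_b);
    nu = 0 recovers plain Bradley-Terry.
    """
    tie_mass = nu * math.sqrt(strength_a * strength_b)
    total = strength_a + strength_b + tie_mass
    return strength_a / total, strength_b / total, tie_mass / total


def polarity(n_a: int, n_b: int) -> float:
    """Margin among non-tie votes, in [-1, 1] (assumed definition)."""
    non_ties = n_a + n_b
    return 0.0 if non_ties == 0 else (n_a - n_b) / non_ties


def decisiveness(n_a: int, n_b: int, n_tie: int) -> float:
    """Fraction of samples that are not ties (assumed definition)."""
    total = n_a + n_b + n_tie
    return 0.0 if total == 0 else (n_a + n_b) / total


if __name__ == "__main__":
    # Two vote profiles where A leads in both, but with very different support.
    for n_a, n_b, n_tie in [(3, 2, 5), (9, 1, 0)]:
        print((n_a, n_b, n_tie), polarity(n_a, n_b), decisiveness(n_a, n_b, n_tie))
    print(davidson_probs(2.0, 1.0, 0.5))
```

A plain majority vote returns "A" for both profiles, whereas polarity and decisiveness separate the narrow, tie-heavy case from the near-unanimous one.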

2511.19959 2026-06-03 cs.LG cs.DC 版本更新

ParaBlock: Communication-Computation Parallel Block Coordinate Federated Learning for Large Language Models

ParaBlock:面向大语言模型的通信-计算并行块坐标联邦学习

Yujia Wang, Yuanpu Cao, Jinghui Chen

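Affiliations * College of Information Sciences and Technology; Pennsylvania State University

AI summary Proposes ParaBlock, which runs communication and computation in two parallel threads to improve communication efficiency when training LLMs with federated learning; proves that its convergence rate matches that of standard federated block coordinate descent and confirms its performance and efficiency gains empirically.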

Comments Accepted by TMLR

详情
AI中文摘要

联邦学习作为一种隐私保护训练范式已被广泛研究。最近,联邦块坐标下降方案在训练大规模模型中成为流行选择,因为它允许客户端仅本地训练模型的一个子集而非整个模型。然而,在大语言模型时代,即使单个块也可能包含大量参数,导致显著的通信延迟,特别是对于资源受限的客户端。为了解决联邦训练/微调大语言模型中的这一挑战,我们提出了ParaBlock,一种新颖的方法,它建立两个并行线程分别用于通信和计算,以提高通信效率。我们从理论上证明,所提出的ParaBlock实现了与标准联邦块坐标下降方法相同的收敛率。在通用指令遵循和数学推理任务上微调大语言模型的实证评估证实,ParaBlock不仅保持了强大的性能,而且显著提高了通信效率。

英文摘要

Federated learning (FL) has been extensively studied as a privacy-preserving training paradigm. Recently, the federated block coordinate descent scheme has become a popular option for training large-scale models, as it allows clients to train only a subset of the model locally instead of the entire model. However, in the era of large language models (LLMs), even a single block can contain a significant number of parameters, causing substantial communication latency, particularly for resource-constrained clients. To address this challenge in federated training and fine-tuning of LLMs, we propose ParaBlock, a novel approach that establishes two parallel threads for communication and computation to enhance communication efficiency. We theoretically prove that ParaBlock achieves the same convergence rate as standard federated block coordinate descent methods. Empirical evaluations on fine-tuning LLMs for general instruction following and mathematical reasoning confirm that ParaBlock not only maintains strong performance but also significantly improves communication efficiency.

2511.17126 2026-06-03 eess.IV cs.CV cs.LG physics.optics 版本更新

Towards Blind Lens Aberration Correction via Large LensLib Pre-training and Discrete Degradation Priors

面向盲镜头像差校正的大规模LensLib预训练与离散退化先验

Xiaolong Qian, Qi Jiang, Yao Gao, Lei Sun, Kailun Yang, Xian Wang, Zhonghua Yi, Wenyong Li, Ming-Hsuan Yang, Luc Van Gool, Kaiwei Wang

Affiliations * National Research Center for Optical Instrumentation, Zhejiang University; INSAIT, Sofia University "St. Kliment Ohridski"; School of Artificial Intelligence and Robotics, Hunan University; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University

AI summary Proposes the FoundCAC framework, which builds AODLibpro, a large-scale unbiased lens library, and LPR, a discrete degradation prior, to address limited data scaling and missing prior guidance, achieving zero-shot generalization and efficient few-shot adaptation for blind lens aberration correction.

Comments Accepted to 2026 IEEE International Conference on Computational Photography (ICCP). The source code and datasets will be made publicly available at https://github.com/zju-jiangqi/FoundCAC

详情
AI中文摘要

新兴的基于深度学习的镜头库预训练(LensLib-PT)流程通过训练通用神经网络,为盲镜头像差校正提供了新途径,展现出处理多种未知光学退化的强大能力。本文提出FoundCAC,一个通用的基础框架,解决了阻碍现有流程泛化的两个挑战:训练数据扩展的困难以及缺乏表征光学退化的先验指导。为提高数据可扩展性,我们扩展设计规范以增加退化多样性,并基于均匀采样策略构建了大规模无偏镜头库AODLibpro,该策略量化了空间变化模式和严重程度。在模型设计方面,为利用点扩散函数(PSF)作为指导同时保持盲范式,我们提出了一种多阶段向量量化表示学习方案。该范式专门设计用于构建潜在PSF表示(LPR),将复杂的连续PSF显式编码为离散退化先验,以规范高度病态的恢复过程。通过简单而有效的码本冻结策略,我们的框架利用离散先验提升全样本恢复性能,并实现对未见镜头的高效少样本适应。在合成LensLib和真实镜头的多种像差上的实验表明,我们的框架实现了最先进的零样本泛化,同时支持针对特定镜头的高效少样本适应。源代码和数据集将在https://github.com/zju-jiangqi/FoundCAC公开提供。

英文摘要

The emerging deep-learning-based lens library pre-training (LensLib-PT) pipeline offers a new avenue for blind lens aberration correction by training a universal neural network, demonstrating strong capability in handling diverse unknown optical degradations. This work proposes FoundCAC, a universal foundational framework that resolves two challenges hindering the generalization of existing pipelines: the difficulty of scaling training data and the absence of prior guidance characterizing optical degradation. To improve data scalability, we expand the design specifications to increase degradation diversity and construct AODLibpro, a large-scale, unbiased lens library based on a uniform sampling strategy that quantifies spatial-variation patterns and severity. In terms of model design, to leverage Point Spread Functions (PSFs) as guidance while maintaining the blind paradigm, we propose a multi-stage vector-quantized representation learning scheme. This paradigm is specifically designed to construct a Latent PSF Representation (LPR), explicitly encoding complex continuous PSFs into a discrete degradation prior that regularizes the highly ill-posed restoration process. Through a simple yet effective codebook-freezing strategy, our framework leverages the discrete prior to elevate full-shot restoration performance and unlock highly efficient few-shot adaptation for unseen lenses. Experiments on diverse aberrations of synthetic LensLib and real-world lenses demonstrate that our framework achieves state-of-the-art zero-shot generalization while enabling highly efficient few-shot adaptation for specific lenses. The source code and datasets will be made publicly available at https://github.com/zju-jiangqi/FoundCAC.

2511.05050 2026-06-03 stat.ML cs.LG stat.ME 版本更新

Estimating Bidirectional Causal Effects with Large Scale Online Kernel Learning

基于大规模在线核学习的双向因果效应估计

Masahiro Tanaka

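Affiliations * Japan Society for the Promotion of Science

AI summary Proposes a scalable online kernel learning framework that combines heteroskedasticity-based identification with quasi-maximum likelihood estimation to estimate bidirectional causal effects in systems with mutual dependence and heteroskedasticity, with efficient computation via random Fourier features and adaptive online gradient descent.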

详情
Journal ref
Proceedings of the 2025 International Conference on Data Science and Intelligent Systems (DSIS 2025), Article 65, pp. 449-455
AI中文摘要

本研究提出一种可扩展的在线核学习框架,用于估计以相互依赖和异方差为特征的系统中的双向因果效应。传统因果推断通常关注单向效应,忽略了现实世界中常见的双向关系。基于异方差识别,该方法将联立方程模型的拟极大似然估计与大规模在线核学习相结合。它采用随机傅里叶特征逼近来灵活建模非线性条件均值和方差,同时自适应在线梯度下降算法确保了对流式和高维数据的计算效率。大量模拟结果表明,与单方程和多项式逼近基线相比,该方法在多种数据生成过程中实现了更高的准确性和稳定性,偏差和均方根误差更低。这些结果证实,该方法以近线性计算扩展有效捕获了复杂的双向因果效应。通过将计量经济学识别与现代机器学习技术相结合,所提框架为自然科学/社会科学、政策制定、商业和工业应用中的大规模因果推断提供了一种实用、可扩展且理论扎实的解决方案。

英文摘要

In this study, a scalable online kernel learning framework is proposed for estimating bidirectional causal effects in systems characterized by mutual dependence and heteroskedasticity. Traditional causal inference often focuses on unidirectional effects, overlooking the bidirectional relationships common in real-world phenomena. Building on heteroskedasticity-based identification, the proposed method integrates a quasi-maximum likelihood estimator for simultaneous equation models with large-scale online kernel learning. It employs random Fourier feature approximations to flexibly model nonlinear conditional means and variances, while an adaptive online gradient descent algorithm ensures computational efficiency for streaming and high-dimensional data. Extensive simulations demonstrate that the proposed method achieves higher accuracy and stability than single-equation and polynomial approximation baselines, exhibiting lower bias and root mean squared error across various data-generating processes. These results confirm that the proposed approach effectively captures complex bidirectional causal effects with near-linear computational scaling. By combining econometric identification with modern machine learning techniques, the proposed framework offers a practical, scalable, and theoretically grounded solution for large-scale causal inference in the natural and social sciences, policy making, business, and industrial applications.
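The random Fourier feature approximation mentioned in the abstract can be illustrated with a short sketch. This is the textbook construction for a Gaussian (RBF) kernel, not the paper's estimator; the function names, the bandwidth `sigma`, and the feature count are illustrative assumptions. The point is that an explicit finite-dimensional feature map replaces kernel evaluations, so a model can be updated online with cost linear in the number of features.

```python
import math
import random

def make_rff(dim, n_features, sigma=1.0, seed=0):
    """Build a random Fourier feature map for the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = random.Random(seed)
    # Frequencies drawn from the kernel's spectral density, N(0, sigma^-2 I).
    ws = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
          for _ in range(n_features)]
    # Random phases, uniform on [0, 2*pi].
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return features

def rbf_kernel(x, y, sigma=1.0):
    """Exact RBF kernel, used here only to check the approximation."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))
```

The inner product of two feature vectors approximates the kernel value, with error shrinking roughly as one over the square root of the number of features. A linear model on these features can then be trained by online gradient descent, one observation at a time.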

2511.13899 2026-06-03 q-bio.NC cs.CE cs.LG 版本更新

A Factorized Low-Rank RNN Framework for Uncovering Independent Neural Latent Dynamics and Connectivity

一种分解低秩RNN框架用于揭示独立神经潜在动力学和连接性

Chengrui Li, Yunmiao Wang, Yule Wang, Weihan Li, Dieter Jaeger, Anqi Wu

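Affiliations * University of California, San Diego

AI summary Proposes the FacRNN framework, which uses a group-wise independence assumption and a partial correlation penalty to improve the disentanglement and interpretability of latent dynamics in low-rank recurrent neural networks.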

详情
AI中文摘要

低秩循环神经网络(lrRNN)是一类揭示神经群体活动背后低维潜在动力学的模型。尽管其功能连接是低秩的,但缺乏独立性解释,使得难以将不同的计算角色分配给不同的潜在维度。为了解决这个问题,我们提出了分解循环神经网络(FacRNN),这是一种生成式lrRNN框架,它假设潜在动力学之间具有组间独立性,同时允许组内灵活纠缠。这些独立的潜在组允许潜在动力学分别演化,但内部丰富以进行复杂计算。我们在变分自编码器(VAE)框架下重新表述lrRNN,从而引入部分相关惩罚,鼓励潜在维度组之间的独立性。在合成数据、猴子M1和小鼠电压成像数据上的实验表明,与不鼓励组间独立性的基线lrRNN相比,FacRNN持续改善了在低维空间和低秩连接中学到的神经潜在轨迹的解耦性和可解释性。

英文摘要

Low-rank recurrent neural networks (lrRNNs) are a class of models that uncover low-dimensional latent dynamics underlying neural population activity. Although their functional connectivity is low-rank, it lacks independence interpretations, making it difficult to assign distinct computational roles to different latent dimensions. To address this, we propose the Factored Recurrent Neural Network (FacRNN), a generative lrRNN framework that assumes group-wise independence among latent dynamics while allowing flexible within-group entanglement. These independent latent groups allow latent dynamics to evolve separately, but are internally rich for complex computation. We reformulate the lrRNN under a variational autoencoder (VAE) framework, enabling us to introduce a partial correlation penalty that encourages independence between groups of latent dimensions. Experiments on synthetic, monkey M1, and mouse voltage imaging data show that FacRNN consistently improves the disentanglement and interpretability of learned neural latent trajectories in low-dimensional space and low-rank connectivity over baseline lrRNNs that do not encourage group-wise independence.

2511.13663 2026-06-03 cs.PL cs.LG 版本更新

SAIL: Sound Abstract Interpreters with LLMs

SAIL: 基于LLM的可靠抽象解释器

Qiuhan Gu, Avaljot Singh, Gagandeep Singh

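AI summary Proposes the SAIL framework, which uses LLMs to automatically synthesize globally sound abstract transformers, steering candidates toward soundness through constrained optimization and a cost function, and matches or even surpasses hand-designed transformers in neural network verification.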

Comments 43 pages, 21 figures

详情
Journal ref
Proc. ACM Program. Lang. 10, PLDI, Article 230, 26 pages (2026)
AI中文摘要

如何构建全局可靠的抽象解释器以安全地近似程序行为仍然是抽象解释中的一个瓶颈。在本文中,我们展示了使用最先进的大语言模型来自动化这一繁琐过程的潜力。聚焦于神经网络验证领域,我们利用大语言模型从零开始在无限空间中搜索,跨不同抽象域合成非平凡的可靠抽象变换器。我们将合成任务形式化为一个约束优化问题,为此设计了一种新颖的基于数学的代价函数,用于衡量每个生成候选变换器的不可靠程度,同时强制执行硬性的语法和语义有效性约束。基于这一公式,我们引入了SAIL,一个新颖的统一框架,结合了模型生成、语法和语义验证以及基于代价函数的细化,以合成全局可靠的抽象变换器。评估结果表明,SAIL不仅匹配了人工设计的变换器的性能,还能够合成为复杂非线性算子设计的、文献中不存在的可靠且高精度的变换器。

英文摘要

How to construct globally sound abstract interpreters to safely approximate program behaviors remains a bottleneck in abstract interpretation. In this paper, we show the potential of using state-of-the-art LLMs to automate this tedious process. Focusing on the neural network verification area, we synthesize non-trivial sound abstract transformers across diverse abstract domains, using LLMs to search an infinite space from scratch. We formalize the synthesis task as a constrained optimization problem, for which we design a novel mathematically grounded cost function that measures the degree of unsoundness of each generated candidate transformer, while enforcing hard syntactic and semantic validity constraints. Building on this formulation, we introduce SAIL, a novel unified framework that combines model generation, syntactic and semantic validation, and cost-function-based refinement to synthesize globally sound abstract transformers. Evaluation results show that SAIL not only matches the performance of manually designed transformers, but is also able to synthesize sound and high-precision transformers for complex non-linear operators that do not exist in the literature.

2511.12482 2026-06-03 quant-ph cs.LG 版本更新

Discovering autonomous quantum error correction via deep reinforcement learning

通过深度强化学习发现自主量子纠错

Yue Yin, Tailong Xiao, Xiaoyang Deng, Ming He, Jianping Fan, Guihua Zeng

Affiliations * Zhiyuan College, Shanghai Jiao Tong University, Shanghai 200240, P.R. China; State Key Laboratory of Photonics and Communications, Institute for Quantum Sensing and Information Processing, Shanghai Jiao Tong University, Shanghai 200240, P.R. China; Hefei National Laboratory, Hefei, 230088, P.R. China; Shanghai Research Center for Quantum Sciences, Shanghai, 201315, P.R. China; AI Lab, Lenovo Research, Beijing 100094, P.R. China

AI summary Uses curriculum-learning-enabled deep reinforcement learning within an approximate autonomous quantum error correction framework to discover bosonic codes that resist single-photon and double-photon loss, obtaining optimal codewords that surpass the break-even point.

详情
Journal ref
Phys. Rev. A 112, 062618 (2025)
AI中文摘要

量子纠错对于容错量子计算至关重要。然而,依赖主动测量的标准方法可能会引入额外错误。自主量子纠错(AQEC)通过利用玻色子系统中的工程耗散和驱动来规避这一问题,但由于严格的Knill-Laflamme条件,识别实用的编码仍然具有挑战性。在本工作中,我们利用课程学习启发的深度强化学习,在近似AQEC框架下发现抵抗单光子和双光子损失的玻色子码。我们提出了在近似条件下求解主方程的解析解,这可以显著加速强化学习的训练过程。智能体首先通过在受限演化时间框架内快速探索,识别出超越盈亏平衡点的编码子空间,然后策略性地微调其策略,以在更长的时间范围内维持这一性能优势。我们发现,经过两阶段训练的智能体能够发现最优码字集合,即考虑单光子和双光子损失效应的Fock态$\ket{4}$和$\ket{7}$。我们识别出该码在更长的演化时间内超越了盈亏平衡阈值,并达到了最先进的性能。我们还分析了该码对相位阻尼和振幅阻尼噪声的鲁棒性。我们的工作突显了课程学习启发的深度强化学习在发现最优量子纠错码方面的潜力,特别是在早期容错量子系统中。

英文摘要

Quantum error correction is essential for fault-tolerant quantum computing. However, standard methods relying on active measurements may introduce additional errors. Autonomous quantum error correction (AQEC) circumvents this by utilizing engineered dissipation and drives in bosonic systems, but identifying practical encodings remains challenging due to the stringent Knill-Laflamme conditions. In this work, we utilize curriculum-learning-enabled deep reinforcement learning to discover bosonic codes under an approximate AQEC framework that resist both single-photon and double-photon losses. We present an analytical solution of the master equation under approximation conditions, which can significantly accelerate the training process of reinforcement learning. The agent first identifies an encoded subspace surpassing the break-even point through rapid exploration within a constrained evolution time frame, then strategically fine-tunes its policy to sustain this performance advantage over extended temporal horizons. We find that the two-phase trained agent can discover the optimal set of codewords, i.e., the Fock states $\ket{4}$ and $\ket{7}$, considering the effect of both single-photon and double-photon loss. The discovered code surpasses the break-even threshold over a longer evolution time and achieves state-of-the-art performance. We also analyze the robustness of the code against phase damping and amplitude damping noise. Our work highlights the potential of curriculum-learning-enabled deep reinforcement learning in discovering optimal quantum error-correcting codes, especially in early fault-tolerant quantum systems.

2511.11346 2026-06-03 cs.LG 版本更新

Fast and Expressive Multi-Byte Prediction with Probabilistic Circuits

基于概率电路的快速且富有表现力的多字节预测

Andreas Grivas, Lorenzo Loconte, Emile van Krieken, Piotr Nawrot, Yu Zhao, Euan Wielewski, Pasquale Minervini, Edoardo Ponti, Antonio Vergari

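Affiliations * University of Cambridge

AI summary Proposes the MTPC framework, which uses probabilistic circuits to encode the joint distribution over future tokens, enabling fast generation in byte-level LLMs while preserving expressiveness.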

详情
AI中文摘要

多令牌预测(MTP)是一种显著加速大型语言模型(LLM)生成的突出策略,尤其是在字节级LLM中,这些模型无需分词器但速度极慢。然而,许多现有的MTP方法要么假设未来令牌之间独立,牺牲了表现力,要么在窗口内逐个生成令牌,增加了延迟。在这项工作中,我们在概率电路(PC)框架内研究了MTP中表现力与延迟之间的权衡。我们的框架MTPC允许通过选择电路架构来探索编码未来令牌联合分布的不同方式,推广了经典模型,如(层次)混合模型、隐马尔可夫模型和张量网络。我们通过改造现有的字节级LLM(如EvaByte)和字节化的子词模型(如Llama3.2 3B)展示了MTPC的有效性。实验表明,当与推测解码结合时,与具有独立性假设的MTP相比,MTPC显著加速了生成,同时保证保持原始验证器LLM的性能。我们还严格研究了在探索MTPC的可能参数化(如PC架构以及验证器和草稿LLM之间的部分层共享)时,表现力与延迟之间的最优权衡。

英文摘要

Multi-token prediction (MTP) is a prominent strategy to significantly speed up generation in large language models (LLMs), especially in byte-level LLMs, which are tokeniser-free but prohibitively slow. However, many existing MTP methods either assume independence between future tokens, sacrificing expressiveness, or generate tokens one at a time within the window, increasing latency. In this work, we investigate the trade-off between expressiveness and latency in MTP within the framework of probabilistic circuits (PCs). Our framework, MTPC, allows one to explore different ways to encode the joint distributions over future tokens by selecting circuit architectures, generalising classical models such as (hierarchical) mixture models, hidden Markov models, and tensor networks. We show the efficacy of MTPC by retrofitting existing byte-level LLMs, such as EvaByte, and byte-fied subword models, such as Llama3.2 3B. Our experiments show that, when combined with speculative decoding, MTPC substantially speeds up generation compared to MTP with independence assumptions, while guaranteeing to retain the performance of the original verifier LLM. We also rigorously study the optimal trade-off between expressiveness and latency when exploring the possible parameterisations of MTPC, such as PC architectures and partial layer sharing between the verifier and draft LLMs.

2511.07971 2026-06-03 cs.LG 版本更新

Low-Rank Curvature for Zeroth-Order Optimization in LLM Fine-Tuning

低秩曲率用于大语言模型微调中的零阶优化

Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko

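Affiliations * University of Wisconsin – Madison; University of Georgia; Hanyang University

AI summary Proposes LOREN, which captures curvature with a low-rank block diagonal preconditioner and reduces variance with a REINFORCE leave-one-out gradient estimator, achieving higher accuracy and faster convergence in LLM fine-tuning while cutting peak memory usage by up to 27.3% compared with MeZO-Adam.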

Comments Accepted to the AAAI Conference on Artificial Intelligence (AAAI-2026)

详情
AI中文摘要

我们引入了LOREN,一种用于微调大型语言模型(LLM)的曲率感知零阶(ZO)优化方法。现有的ZO方法通过随机扰动的有限差分估计梯度,常常遭受高方差和次优搜索方向的问题。我们的方法通过以下方式解决这些挑战:(i)将梯度预条件问题重新表述为自适应估计用于梯度估计的各向异性扰动分布的问题,(ii)通过自然进化策略框架,使用低秩块对角预条件器捕捉曲率,以及(iii)应用REINFORCE留一法(RLOO)梯度估计器来降低方差。在标准LLM基准上的实验表明,我们的方法通过实现更高的精度和更快的收敛,优于最先进的ZO方法,同时与MeZO-Adam相比,峰值内存使用减少了高达27.3%。

英文摘要

We introduce LOREN, a curvature-aware zeroth-order (ZO) optimization method for fine-tuning large language models (LLMs). Existing ZO methods, which estimate gradients via finite differences using random perturbations, often suffer from high variance and suboptimal search directions. Our approach addresses these challenges by: (i) reformulating the problem of gradient preconditioning as that of adaptively estimating an anisotropic perturbation distribution for gradient estimation, (ii) capturing curvature through a low-rank block diagonal preconditioner using the framework of natural evolution strategies, and (iii) applying a REINFORCE leave-one-out (RLOO) gradient estimator to reduce variance. Experiments on standard LLM benchmarks show that our method outperforms state-of-the-art ZO methods by achieving higher accuracy and faster convergence, while cutting peak memory usage by up to 27.3% compared with MeZO-Adam.
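For context, the baseline that the abstract contrasts against, gradient estimation via finite differences along random perturbations, can be sketched as follows. This is the generic isotropic two-point estimator, not LOREN: it has no preconditioner and no RLOO variance reduction, and the function name and default values are illustrative assumptions.

```python
import random

def zo_gradient(f, theta, eps=1e-3, n_samples=1, rng=None):
    """Two-point zeroth-order gradient estimate of f at theta.

    Each sample draws an isotropic Gaussian direction z, measures the
    central finite difference of f along z, and scales z by that slope.
    Only function evaluations are used; no derivatives are computed.
    """
    rng = rng or random.Random(0)
    d = len(theta)
    grad = [0.0] * d
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        plus = [t + eps * zi for t, zi in zip(theta, z)]
        minus = [t - eps * zi for t, zi in zip(theta, z)]
        slope = (f(plus) - f(minus)) / (2.0 * eps)
        for i in range(d):
            grad[i] += slope * z[i] / n_samples
    return grad
```

The estimate is unbiased for smooth objectives in the small-`eps` limit, but its variance grows with the dimension and with the gradient norm. That is the high-variance problem the abstract refers to, and it is why a single-sample estimate is usually a poor search direction on its own.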

2506.08464 2026-06-03 cs.LG 版本更新

MAC: An Efficient Gradient Preconditioning using Mean Activation Approximated Curvature

MAC:一种使用平均激活近似曲率的高效梯度预条件方法

Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko

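Affiliations * University of Wisconsin – Madison; University of Georgia; Hanyang University

AI summary Proposes MAC, which approximates the Kronecker factors of the Fisher information matrix in KFAC to reduce the computational burden of second-order optimization and is, to the authors' knowledge, the first method to apply Kronecker factorization to Transformer attention layers; it outperforms KFAC and other existing methods across a range of networks and datasets.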

Comments Accepted to the IEEE International Conference on Data Mining (ICDM-2025)

详情
AI中文摘要

用于训练神经网络的二阶优化方法,如KFAC,通过利用损失景观的曲率信息展现出优越的收敛性。然而,这是以高计算负担为代价的。在这项工作中,我们分析了构成KFAC中逐层Fisher信息矩阵(FIM)的两个组件:与激活和预激活梯度相关的Kronecker因子。基于对其特征谱的实证观察,我们提出了它们的有效近似,从而产生了一种计算高效的优化方法,称为MAC。据我们所知,MAC是第一个将Kronecker分解应用于Transformer中注意力层的FIM,并明确将注意力分数整合到预条件中的算法。我们还研究了MAC在非线性神经网络上的收敛性质,并提供了其收敛到全局最小值的两个条件。我们在各种网络架构和数据集上的广泛评估表明,所提出的方法在准确性、端到端训练时间和内存使用方面优于KFAC和其他最先进的方法。

英文摘要

Second-order optimization methods for training neural networks, such as KFAC, exhibit superior convergence by utilizing curvature information of the loss landscape. However, this comes at the expense of a high computational burden. In this work, we analyze the two components that constitute the layer-wise Fisher information matrix (FIM) used in KFAC: the Kronecker factors related to activations and pre-activation gradients. Based on empirical observations of their eigenspectra, we propose efficient approximations for them, resulting in a computationally efficient optimization method called MAC. To the best of our knowledge, MAC is the first algorithm to apply Kronecker factorization to the FIM of attention layers used in transformers and to explicitly integrate attention scores into the preconditioning. We also study the convergence properties of MAC on nonlinear neural networks and provide two conditions under which it converges to global minima. Our extensive evaluations on various network architectures and datasets show that the proposed method outperforms KFAC and other state-of-the-art methods in terms of accuracy, end-to-end training time, and memory usage.

2310.00965 2026-06-03 cs.LG 版本更新

Node Perturbation Can Effectively Train Multi-Layer Neural Networks

节点扰动可以有效训练多层神经网络

Sander Dalm, Marcel van Gerven, Nasir Ahmad

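Affiliations * Donders Institute for Brain, Cognition and Behaviour

AI summary By aligning node perturbation with the directional derivative and decorrelating inputs at every layer, substantially improves the parameter convergence and test performance of node perturbation learning, approaching backpropagation.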

详情
AI中文摘要

反向传播(BP)仍然是训练深度神经网络参数的主导且最成功的方法。然而,BP依赖于两个计算上不同的阶段,不能提供对生物学习的满意解释,并且可能难以应用于具有不连续性或噪声节点动态的网络训练。相比之下,节点扰动(NP),也称为活动扰动前向梯度,提出通过向网络激活中注入噪声并随后测量引起的损失变化来学习。NP依赖于两次前向(推理)传递,不使用网络导数,并已被提出作为生物系统中学习的模型。然而,标准NP数据效率极低,并且由于其无引导的基于噪声的搜索过程可能不稳定。在这项工作中,我们通过将NP与方向导数相关联并引入输入去相关,发展了一种现代视角。我们发现,与方向导数的更紧密对齐以及每层的输入去相关在理论和实践上增强了NP学习的性能,在参数收敛方面有大幅改进,并且在测试数据上获得更高的性能,接近BP。此外,我们的新公式允许应用于噪声过程本身不可访问的噪声系统,这对于神经形态芯片上的学习特别有意义。

英文摘要

Backpropagation (BP) remains the dominant and most successful method for training parameters of deep neural network models. However, BP relies on two computationally distinct phases, does not provide a satisfactory explanation of biological learning, and can be challenging to apply for training of networks with discontinuities or noisy node dynamics. By comparison, node perturbation (NP), also known as activity-perturbed forward gradients, proposes learning by the injection of noise into network activations, and subsequent measurement of the induced loss change. NP relies on two forward (inference) passes, does not make use of network derivatives, and has been proposed as a model for learning in biological systems. However, standard NP is highly data inefficient and can be unstable due to its unguided noise-based search process. In this work, we develop a modern perspective on NP by relating it to the directional derivative and incorporating input decorrelation. We find that a closer alignment with directional derivatives together with input decorrelation at every layer theoretically and practically enhances performance of NP learning with large improvements in parameter convergence and much higher performance on the test data, approaching that of BP. Furthermore, our novel formulation allows for application to noisy systems in which the noise process itself is inaccessible, which is of particular interest for on-chip learning in neuromorphic systems.
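The basic node perturbation rule described in the abstract, two forward passes and no derivatives, can be sketched for a single linear layer. This is the standard rule only. It does not include the paper's directional-derivative alignment or input decorrelation, and the toy regression task, learning rate, and noise scale are assumptions chosen for illustration.

```python
import random

def forward(W, x, noise=None):
    """Linear layer y = W x, optionally with noise added to the outputs."""
    y = [sum(w * xj for w, xj in zip(row, x)) for row in W]
    if noise is not None:
        y = [yi + ni for yi, ni in zip(y, noise)]
    return y

def loss(y, target):
    """Half squared error between output and target."""
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))

def np_step(W, x, target, lr, sigma, rng):
    """One node-perturbation update of W, in place."""
    clean = loss(forward(W, x), target)        # first forward pass
    xi = [rng.gauss(0.0, sigma) for _ in W]    # noise on each output node
    noisy = loss(forward(W, x, xi), target)    # second, perturbed pass
    coeff = (noisy - clean) / sigma ** 2       # scaled loss change
    for i, row in enumerate(W):
        for j in range(len(row)):
            row[j] -= lr * coeff * xi[i] * x[j]

def train(steps=3000, lr=0.05, sigma=0.01, seed=0):
    """Fit a 2x2 linear map; return the probe loss before and after."""
    rng = random.Random(seed)
    W_true = [[1.0, -1.0], [0.5, 2.0]]
    W = [[0.0, 0.0], [0.0, 0.0]]
    probes = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]

    def probe_loss():
        return sum(loss(forward(W, p), forward(W_true, p)) for p in probes)

    before = probe_loss()
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        np_step(W, x, forward(W_true, x), lr, sigma, rng)
    return before, probe_loss()
```

In expectation, `coeff * xi` equals the gradient of the loss with respect to the layer output, so the update follows the same direction as gradient descent on average. The variance of that estimate grows with the number of perturbed nodes, which is one reason plain node perturbation scales poorly to wide or deep networks.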

2511.02986 2026-06-03 stat.ML cs.LG q-bio.GN 版本更新

Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models

基于潜在扩散模型的可扩展单细胞基因表达生成

Giovanni Palla, Sudarshan Babu, Payam Dibaeinia, James D. Pearce, Donghui Li, Aly A. Khan, Theofanis Karaletsos, Jakub M. Tomczak

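Affiliations * University of Cambridge

AI summary Proposes scLDM, a scalable generative method that combines a variational autoencoder with a latent diffusion model and uses permutation-invariant/equivariant architectures and Diffusion Transformers to generate high-quality single-cell gene expression data.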

Comments Accepted to ICML 2026, Github: https://github.com/czi-ai/scldm/

详情
AI中文摘要

单细胞基因表达的计算建模对于理解细胞过程至关重要,但生成真实的表达谱仍然是一个主要挑战。这一困难源于基因表达数据的计数性质以及基因之间复杂的潜在依赖性。现有的生成模型通常强加人工基因排序或依赖浅层神经网络架构。我们引入了一种可扩展的潜在扩散模型用于单细胞基因表达数据,称为scLDM,该模型尊重数据的基本可交换性属性。我们的VAE使用固定大小的潜在变量,利用统一的多头交叉注意力块(MCAB)架构,该架构具有双重作用:编码器中的置换不变池化和解码器中的置换等变反池化。我们通过用使用扩散Transformer和线性插值的潜在扩散模型替换高斯先验来增强这一框架,从而通过多条件无分类器引导实现高质量生成。我们在观察性和扰动性单细胞数据的多种实验以及下游任务(如细胞水平分类)中展示了其优越性能。

英文摘要

Computational modeling of single-cell gene expression is crucial for understanding cellular processes, but generating realistic expression profiles remains a major challenge. This difficulty arises from the count nature of gene expression data and complex latent dependencies among genes. Existing generative models often impose artificial gene orderings or rely on shallow neural network architectures. We introduce a scalable latent diffusion model for single-cell gene expression data, which we refer to as scLDM, that respects the fundamental exchangeability property of the data. Our VAE uses fixed-size latent variables leveraging a unified Multi-head Cross-Attention Block (MCAB) architecture, which serves dual roles: permutation-invariant pooling in the encoder and permutation-equivariant unpooling in the decoder. We enhance this framework by replacing the Gaussian prior with a latent diffusion model using Diffusion Transformers and linear interpolants, enabling high-quality generation with multi-conditional classifier-free guidance. We show its superior performance in a variety of experiments for both observational and perturbational single-cell data, as well as downstream tasks like cell-level classification.

2511.02304 2026-06-03 cs.MA cs.AI cs.CL cs.FL cs.LG 版本更新

Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning

自动机条件化协作多智能体强化学习

Beyazit Yalcinkaya, Marcell Vazquez-Chanlatte, Ameesh Shah, Hanna Krasowski, Sanjit A. Seshia

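Affiliations * Massachusetts Institute of Technology; Stanford University

AI summary Proposes an automata-conditioned cooperative multi-agent reinforcement learning framework that uses automata to decompose team objectives into sub-tasks and learns task-conditioned decentralized policies, enabling optimal task assignment and multi-step coordination.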

详情
AI中文摘要

我们研究在集中训练、分散执行下,针对协作性时间目标的多任务、多智能体策略学习。在此设置中,使用自动机表示分配给智能体的任务,能够将团队级目标分解为更简单、更小的子任务。然而,现有方法样本效率低下,且局限于单任务情况,需要为每个新任务重新训练策略。在这项工作中,我们提出了自动机条件化协作多智能体强化学习(ACC-MARL),一个学习任务条件化分散团队策略的框架。我们识别了ACC-MARL可行性的挑战,提出了解决方案,并证明了我们的方法是最优的。我们进一步展示了学习到的价值函数可用于在测试时最优地分配任务。实验表明,智能体之间涌现出任务感知的多步协调,例如按下按钮开门、扶住门以及短路任务。

英文摘要

We study learning multi-task, multi-agent policies for cooperative, temporal objectives, under centralized training, decentralized execution. In this setting, using automata to represent tasks assigned to agents enables breaking down a team-level objective into simpler, smaller sub-tasks. However, existing approaches remain sample-inefficient and are limited to the single-task case, requiring retraining policies for each new task. In this work, we present Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning (ACC-MARL), a framework for learning task-conditioned, decentralized team policies. We identify challenges to the feasibility of ACC-MARL, propose solutions, and prove that our approach is optimal. We further show that learned value functions can be used to assign tasks optimally at test time. Experiments demonstrate emergent task-aware, multi-step coordination among agents, such as pressing a button to unlock a door, holding the door, and short-circuiting tasks.

2510.23216 2026-06-03 cs.AI cs.LG 版本更新

Human-Like Goalkeeping in a Realistic Football Simulation: a Sample-Efficient Reinforcement Learning Approach

逼真足球模拟中的人性化守门:一种样本高效的强化学习方法

Alessandro Sestini, Joakim Bergdahl, Jean-Philippe Barrette-LaPierre, Florian Fuchs, Brady Chen, Fabio Zinno, Michael Jones, Linus Gisslén

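Affiliations * University of Edinburgh; KTH Royal Institute of Technology; University of California, Berkeley

AI summary Proposes a sample-efficient deep reinforcement learning method that leverages pre-collected data and increases network plasticity to train a goalkeeper agent in EA SPORTS FC 25; the agent's ball saving rate is 10% higher than that of the built-in AI, training is 50% faster than with standard DRL, and its behavior is more human-like.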

详情
AI中文摘要

尽管多个知名视频游戏已成为深度强化学习(DRL)的测试平台,但该技术很少被游戏行业用于制作真实的AI行为。先前的研究侧重于使用大型模型训练超人类智能体,这对于资源有限、旨在实现类人智能体的游戏工作室来说并不实际。本文提出了一种样本高效的DRL方法,专为在工业环境(如视频游戏行业)中训练和微调智能体而设计。我们的方法通过利用预收集的数据和增加网络可塑性来提高基于价值的DRL的样本效率。我们在EA SPORTS FC 25(当今最畅销的足球模拟游戏之一)中评估了该方法训练守门员智能体的效果。我们的智能体在扑救率上比游戏内置AI高出10%。消融研究表明,与标准DRL方法相比,我们的方法训练智能体速度提高了50%。最后,领域专家的定性评估表明,与手工制作的智能体相比,我们的方法创造了更人性化的游戏玩法。作为该方法影响力的证明,该技术已被用于该系列的最新版本中。

英文摘要

While several high-profile video games have served as testbeds for Deep Reinforcement Learning (DRL), this technique has rarely been employed by the game industry for crafting authentic AI behaviors. Previous research focuses on training super-human agents with large models, which is impractical for game studios with limited resources aiming for human-like agents. This paper proposes a sample-efficient DRL method tailored for training and fine-tuning agents in industrial settings such as the video game industry. Our method improves the sample efficiency of value-based DRL by leveraging pre-collected data and increasing network plasticity. We evaluate our method by training a goalkeeper agent in EA SPORTS FC 25, one of the best-selling football simulations today. Our agent outperforms the game's built-in AI by 10% in ball saving rate. Ablation studies show that our method trains agents 50% faster than standard DRL methods. Finally, qualitative evaluation from domain experts indicates that our approach creates more human-like gameplay than hand-crafted agents. As a testament to the impact of the approach, the method has been adopted for use in the most recent release of the series.

2510.23469 2026-06-03 cs.LG 版本更新

Towards Fair Graph Prompting: A Dual-Prompt Mechanism for Mitigating Attribute and Structural Bias

面向公平图提示:一种缓解属性与结构偏差的双提示机制

Yuhan Yang, Xingbo Fu, Jundong Li

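Affiliations * University of Michigan; University of Virginia

AI summary Proposes the Adaptive Dual Prompting (ADPrompt) framework, whose two modules, Adaptive Feature Rectification and Adaptive Message Calibration, mitigate bias in node attributes and graph structure while adapting a pre-trained GNN, yielding fair node classification.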

详情
AI中文摘要

对未标记图数据进行自监督预训练已成为图神经网络(GNN)的常见范式。然而,预训练目标与下游任务之间通常存在目标差距。为弥补这一差距,图提示方法通过可学习提示将冻结的预训练GNN适应到特定下游任务。尽管有效,但现有大多数图提示方法主要关注提升模型性能,而很大程度上忽略了公平性问题。由于下游图数据在节点属性和图结构中固有地包含偏差,预训练GNN可能在不同人口统计子组之间产生不同的表示。为解决这一局限,我们提出自适应双提示(ADPrompt),一种公平感知的图提示框架,用于适应预训练GNN。ADPrompt包含两个互补组件:自适应特征修正,学习个性化属性提示以在输入层面抑制敏感信息;以及自适应消息校准,引入逐层结构提示以动态调节来自邻居节点的信息传播。通过联合优化这两个模块,ADPrompt在适应预训练GNN的同时缓解了属性级和结构级偏差。在四个基准数据集上采用多种预训练策略的实验表明,ADPrompt在节点分类任务中始终优于七个竞争基线。

英文摘要

Self-supervised pre-training on unlabeled graph data has become a common paradigm for Graph Neural Networks (GNNs). However, a gap often remains between pre-training objectives and downstream tasks. To bridge this gap, graph prompting methods adapt frozen pre-trained GNNs to specific downstream tasks through learnable prompts. Despite their effectiveness, most existing graph prompting methods primarily focus on improving model performance and largely overlook fairness concerns. As downstream graph data inherently contains biases in both node attributes and graph structures, pre-trained GNNs may produce representations that differ across demographic subgroups. To address this limitation, we propose Adaptive Dual Prompting (ADPrompt), a fairness-aware graph prompting framework for adapting pre-trained GNNs. ADPrompt incorporates two complementary components: Adaptive Feature Rectification, which learns personalized attribute prompts to suppress sensitive information at the input level, and Adaptive Message Calibration, which introduces layer-wise structure prompts to dynamically regulate information propagation from neighboring nodes. By jointly optimizing these two modules, ADPrompt adapts the pre-trained GNN while mitigating both attribute-level and structural bias. Experiments on four benchmark datasets with multiple pre-training strategies demonstrate that ADPrompt consistently outperforms seven competitive baselines in node classification tasks.

2510.15780 2026-06-03 stat.AP cs.LG 版本更新

Enhanced Renewable Energy Forecasting using Context-Aware Conformal Prediction

基于上下文感知保形预测的增强型可再生能源预测

Alireza Moradi, Mathieu Tanneau, Reza Zandehshahvar, Pascal Van Hentenryck

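Affiliations * EPFL, Switzerland; Ghent University, Belgium

AI summary Proposes the Context-Aware Conformal Prediction (CACP) framework, which calibrates prediction intervals by weighting historical observations without retraining the model, improving the reliability and efficiency of renewable energy forecasts.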

详情
AI中文摘要

人工智能(AI)越来越多地被用于支持可再生能源预测和电网运营。随着可再生能源渗透率的增长,可靠的概率预测对于管理不确定性和支持风险感知的运营决策变得至关重要。然而,由于时间变异性、天气条件变化和异质运行机制,这些预测常常存在校准偏差。在许多实际场景中,可再生能源预测由外部来源、供应商或独立训练的系统提供,由于模型访问受限或计算约束,重新训练不可行。这需要高效且模型无关的方法来在预测生成后提高其可靠性。本文提出了上下文感知保形预测(CACP),一种用于校准可再生能源预测的框架。所提方法在校准过程中依赖于一种加权机制,该机制为与目标预测条件更相似的历史观测分配更高的权重。这使得能够自适应预测区间,反映局部不确定性机制,而无需访问或重新训练底层预测模型。实验在来自美国国家可再生能源实验室(NREL)的日前太阳能预测大规模数据集上进行,涵盖包括MISO、ERCTO和SPP在内的多个系统。结果表明,与NREL的基础预测模型和其他保形预测基线相比,CACP在站点和系统层面均改善了可靠性-效率权衡。这些结果表明,CACP可以作为可信AI驱动的可再生能源预测和运营决策支持的实际可靠性增强层。

英文摘要

Artificial intelligence (AI) is increasingly used to support renewable energy forecasting and grid operations. As renewable penetration grows, reliable probabilistic forecasting is becoming essential for managing uncertainty and supporting risk-aware operational decision-making. However, these forecasts often suffer from miscalibration due to temporal variability, changing weather conditions, and heterogeneous operating regimes. In many real-world settings, renewable energy forecasts are provided by external sources, vendors, or independently trained systems, making retraining infeasible because of limited model access or computational constraints. This creates a need for efficient, model-agnostic methods that can improve the reliability of forecasts after they are produced. This paper presents Context-Aware Conformal Prediction (CACP), a framework for calibrating renewable energy forecasts. The proposed method relies on a weighting mechanism in the calibration procedure that assigns higher weights to historical observations that are more similar to the target forecasting condition. This enables adaptive prediction intervals that reflect local uncertainty regimes without requiring access to, or retraining of, the underlying forecasting model. Experiments are performed on a large-scale dataset of day-ahead solar forecasts from the National Renewable Energy Laboratory (NREL), covering multiple systems including MISO, ERCOT, and SPP. The results show that CACP improves the reliability-efficiency tradeoff at both site and system levels compared to NREL's base forecasting model and other conformal prediction baselines. These results suggest that CACP can serve as a practical reliability-enhancement layer for trustworthy AI-enabled renewable energy forecasting and operational decision support.
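The weighting idea in the abstract can be illustrated with a generic weighted split-conformal sketch. It is not the CACP procedure itself: the scalar context, the Gaussian similarity kernel, the bandwidth, and the unit weight placed on the test point are all assumptions made for illustration. Scores are absolute residuals of an existing forecaster on past data, so the underlying model is never retrained.

```python
import math

def context_weights(contexts, target, bandwidth=1.0):
    """Gaussian similarity between each past context and the target one
    (an assumed kernel; contexts are scalars here for simplicity)."""
    return [math.exp(-((c - target) ** 2) / (2.0 * bandwidth ** 2))
            for c in contexts]

def weighted_quantile(scores, weights, level):
    """Smallest score whose cumulative normalized weight reaches `level`.

    A unit weight is reserved for the test point at +infinity, as in
    standard weighted conformal prediction, so the result can be infinite
    when the calibration weights are too small.
    """
    total = sum(weights) + 1.0
    acc = 0.0
    for s, w in sorted(zip(scores, weights)):
        acc += w
        if acc / total >= level:
            return s
    return math.inf

def interval(point_forecast, scores, contexts, target_context,
             alpha=0.1, bandwidth=1.0):
    """Symmetric prediction interval around an existing point forecast."""
    w = context_weights(contexts, target_context, bandwidth)
    q = weighted_quantile(scores, w, 1.0 - alpha)
    return point_forecast - q, point_forecast + q
```

With calibration residuals that are small under one condition (say, context 0 for clear sky) and large under another (context 1 for cloudy), the same point forecast receives a narrow interval for a clear-sky target and a wide one for a cloudy target, because each target context up-weights its own similar history.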

2509.08048 2026-06-03 hep-ph cs.LG 版本更新

Forecasting Generative Amplification

预测生成放大

Henning Bahl, Sascha Diefenbacher, Nina Elmer, Tilman Plehn, Jonas Spinner

发表机构 * Institut für Theoretische Physik, Universität Heidelberg, Germany(海德堡大学理论物理研究所) Physics Division, Lawrence Berkeley National Laboratory, Berkeley, USA(伯克利国家实验室物理部) Interdisciplinary Center for Scientific Computing (IWR), Universität Heidelberg, Germany(海德堡大学跨学科科学计算中心(IWR))

AI总结 本文提出两种互补方法(平均放大和差分放大)来估计生成网络在LHC模拟中的统计放大因子,无需大型保留数据集,并应用于最新事件生成器,表明放大在相空间特定区域可行但尚未覆盖整个分布。

Comments 23 pages, 15 figures. v2: added link to github repo, extended acknowledgements. v3: updated conventions and refined text, now 25 pages

Journal ref
SciPost Phys. 20, 150 (2026)
AI中文摘要

生成网络是提高LHC模拟速度和精度的完美工具。理解其统计精度至关重要,尤其是在生成超出训练数据集大小的事件时。我们提出了两种互补方法来估计放大因子,无需大型保留数据集。平均放大使用贝叶斯网络或集成方法,通过给定相空间体积上积分的精度来估计放大。差分放大使用假设检验来量化放大,且没有任何分辨率损失。应用于最先进的事件生成器时,两种方法都表明放大在相空间的特定区域是可能的,但尚未覆盖整个分布。

英文摘要

Generative networks are perfect tools to enhance the speed and precision of LHC simulations. It is important to understand their statistical precision, especially when generating events beyond the size of the training dataset. We present two complementary methods to estimate the amplification factor without large holdout datasets. Averaging amplification uses Bayesian networks or ensembling to estimate amplification from the precision of integrals over given phase-space volumes. Differential amplification uses hypothesis testing to quantify amplification without any resolution loss. Applied to state-of-the-art event generators, both methods indicate that amplification is possible in specific regions of phase space, but not yet across the entire distribution.

2506.09398 2026-06-03 cs.LG physics.comp-ph 版本更新

Efficient Prediction of SO(3)-Equivariant Hamiltonian Matrices via SO(2) Local Frames

通过SO(2)局部框架高效预测SO(3)等变哈密顿矩阵

Haiyang Yu, Yuchao Lin, Xuan Zhang, Xiaofeng Qian, Shuiwang Ji

发表机构 * Texas A&M University(德克萨斯农工大学)

AI总结 提出QHNetV2网络,利用SO(2)局部框架和SO(2)等变操作实现全局SO(3)等变性,避免昂贵的SO(3)张量积,高效预测哈密顿矩阵。

Comments Code available at: https://github.com/divelab/AIRS

AI中文摘要

我们考虑预测哈密顿矩阵以加速电子结构计算的任务,这在物理、化学和材料科学中扮演重要角色。受哈密顿矩阵的非对角块与SO(2)局部框架之间固有关系的启发,我们提出了一种新颖高效的网络,称为QHNetV2,该网络在不使用昂贵的SO(3) Clebsch-Gordan张量积的情况下实现了全局SO(3)等变性。这是通过引入一组新的高效且强大的SO(2)等变操作,并在SO(2)局部框架内执行所有非对角特征更新和消息传递来实现的,从而消除了对SO(3)张量积的需求。此外,在每个节点的SO(2)局部框架内执行连续的SO(2)张量积以融合节点特征,模拟对称收缩操作。在大型QH9和MD17数据集上的大量实验表明,我们的模型在广泛的分子结构和轨迹上实现了优越的性能,凸显了其强大的泛化能力。所提出的基于SO(2)局部框架的SO(2)操作为可扩展且对称感知的电子结构学习提供了一个有前景的方向。我们的代码将作为AIRS库的一部分发布,网址为https://github.com/divelab/AIRS。

英文摘要

We consider the task of predicting Hamiltonian matrices to accelerate electronic structure calculations, which plays an important role in physics, chemistry, and materials science. Motivated by the inherent relationship between the off-diagonal blocks of the Hamiltonian matrix and the SO(2) local frame, we propose a novel and efficient network, called QHNetV2, that achieves global SO(3) equivariance without the costly SO(3) Clebsch-Gordan tensor products. This is achieved by introducing a set of new efficient and powerful SO(2)-equivariant operations and performing all off-diagonal feature updates and message passing within SO(2) local frames, thereby eliminating the need for SO(3) tensor products. Moreover, a continuous SO(2) tensor product is performed within the SO(2) local frame at each node to fuse node features, mimicking the symmetric contraction operation. Extensive experiments on the large QH9 and MD17 datasets demonstrate that our model achieves superior performance across a wide range of molecular structures and trajectories, highlighting its strong generalization capability. The proposed SO(2) operations on SO(2) local frames offer a promising direction for scalable and symmetry-aware learning of electronic structures. Our code will be released as part of the AIRS library at https://github.com/divelab/AIRS.

2308.07867 2026-06-03 eess.SY cs.LG cs.SY 版本更新

Learning Power Flow with Confidence: A Probabilistic Guarantee Framework for Voltage Risk

学习潮流与置信度:电压风险的概率保证框架

Parikshit Pareek, Sidhant Misra, Deepjyoti Deka

AI总结 针对机器学习在电力系统安全应用中缺乏形式化性能保证的问题,提出基于高斯过程回归的概率保证框架,通过顶点度核和网络扫描主动学习算法实现数据高效且可靠的电压风险评估。

Comments 10 pages

AI中文摘要

机器学习缺乏形式化性能保证限制了其在安全关键的电力系统应用中的采用,在这些应用中,置信度和可解释性与准确性同样重要。在这项工作中,我们通过高斯过程回归框架,为潮流学习和电压风险估计提供了概率保证。具体来说,我们建立了期望估计误差的界限,将GP的预测方差与电压风险估计的置信度联系起来,确保与基于蒙特卡洛的ACPF风险量化在统计上等价。为了在低数据情况下增强模型的可学习性,我们首先设计了顶点度核,这是一种拓扑感知的加性核,将电压-负荷相互作用分解为局部邻域,以实现高效的大规模学习。在此基础上,我们引入了一种网络扫描主动学习算法,该算法自适应地采样信息丰富的运行点,并提供了原则性的停止准则,无需样本外验证。这些进展通过结合数据效率和分析保证,缓解了基于机器学习的潮流的主要瓶颈——缺乏可靠的保证。在IEEE 118、500和1354节点系统上的实证评估证实,所提出的VDK-GP实现了低于1E-03 p.u.的平均绝对电压误差,以15倍更少的ACPF计算复现了蒙特卡洛级别的电压风险估计,并在保守地约束违规概率的同时实现了超过120倍的评估时间减少。

英文摘要

The absence of formal performance guarantees in machine learning (ML) has limited its adoption for safety-critical power system applications, where confidence and interpretability are as vital as accuracy. In this work, we present a probabilistic guarantee for power flow learning and voltage risk estimation, derived through the framework of Gaussian Process (GP) regression. Specifically, we establish a bound on the expected estimation error that connects the GP's predictive variance to confidence in voltage risk estimates, ensuring statistical equivalence with Monte Carlo-based ACPF risk quantification. To enhance model learnability in the low-data regime, we first design the Vertex-Degree Kernel (VDK), a topology-aware additive kernel that decomposes voltage-load interactions into local neighborhoods for efficient large-scale learning. Building on this, we introduce a network-swipe active learning (AL) algorithm that adaptively samples informative operating points and provides a principled stopping criterion without requiring out-of-sample validation. Together, these developments mitigate the principal bottleneck of ML-based power flow, its lack of guaranteed reliability, by combining data efficiency with analytical assurance. Empirical evaluations across IEEE 118-, 500-, and 1354-bus systems confirm that the proposed VDK-GP achieves mean absolute voltage errors below 1E-03 p.u., reproduces Monte Carlo-level voltage risk estimates with 15x fewer ACPF computations, and achieves over 120x reduction in evaluation time while conservatively bounding violation probabilities.

2510.09845 2026-06-03 cs.LG cs.AI cs.CV 版本更新

Harnessing Self-Supervised Deep Learning and Geostationary Remote Sensing for Advancing Wildfire and Associated Air Quality Monitoring: Improved Smoke and Fire Front Masking using GOES and TEMPO Radiance Data

利用自监督深度学习和地球静止遥感推进野火及相关空气质量监测:使用GOES和TEMPO辐射数据改进烟雾和火锋掩膜

Nicholas LaHaye, Thilanka Munasinghe, Hugo Lee, Xiaohua Pan, Gonzalo Gonzalez Abad, Hazem Mahmoud, Jennifer Wei

AI总结 本研究利用NASA TEMPO卫星任务的每小时数据和自监督深度学习,提出了一种创新系统,通过GOES-18和TEMPO数据有效区分烟雾与云层,实时绘制野火火锋和烟雾羽流,显著优于现有业务产品。

Comments https://2025.ieeeigarss.org/view_paper.php?PaperNum=6389&SessionID=1611

AI中文摘要

这项工作展示了通过利用NASA的TEMPO卫星任务前所未有的每小时数据以及自监督深度学习的进展,改善美国西部野火和空气质量管理的可能性。我们展示了一种创新的自监督深度学习系统在绘制近实时每小时野火火锋和烟雾羽流扩散方面的有效性:成功使用GOES-18和TEMPO数据区分烟雾与云层,不同传感模态生成的烟雾和火掩膜之间具有强一致性,并且对于相同案例相比业务产品有显著改进。

英文摘要

This work demonstrates the potential for improving wildfire and air quality management in the western United States by leveraging the unprecedented hourly data from NASA's TEMPO satellite mission and advances in self-supervised deep learning. We demonstrate the efficacy of an innovative self-supervised deep learning system for mapping the near-real-time hourly spread of wildfire fronts and smoke plumes. The system successfully distinguishes smoke plumes from clouds using GOES-18 and TEMPO data, produces smoke and fire masks that agree strongly across sensing modalities, and significantly improves on operational products for the same cases.

2510.08977 2026-06-03 cs.LG cs.CL 版本更新

Breaking the Self-Confirming Loop: Diagnosing and Mitigating Systemic Reward Bias in Self-Rewarding RL

打破自我确认循环:诊断与缓解自奖励强化学习中的系统性奖励偏差

Chuyi Tan, Peiwen Yuan, Xinglin Wang, Yiwei Li, Shaoxiong Feng, Yueqi Zhang, Jiayi Shi, Ji Zhang, Boyuan Pan, Yao Hu, Kan Li

发表机构 * University of Science and Technology of China(中国科学技术大学)

AI总结 本文通过量化反馈回路偏差并提出集成奖励强化学习(RLER)方法,诊断并缓解了自奖励强化学习中由置信度耦合导致的系统性奖励偏差,从而提升性能与稳定性。

AI中文摘要

基于可验证奖励的强化学习(RLVR)高效扩展了大语言模型(LLMs)的推理能力,但受限于稀缺的标注数据。基于内在奖励的强化学习(RLIR)通过自奖励提供了一种可扩展的替代方案,但常面临不稳定和性能较差的问题。我们将这一差距归因于置信度耦合的自奖励中的系统性偏差:模型倾向于过度奖励高置信度的错误,形成自我确认循环。我们通过三个指标量化这种反馈回路偏差:奖励噪声幅度(rho_noise)、策略-奖励耦合(rho_selfbias)和过度/不足奖励偏斜(rho_symbias)。我们的分析显示了一种复合效应,其中强耦合放大了置信度条件误差,并导致向过度奖励的漂移,从而引发不稳定和较低的性能上限。为缓解这一问题,我们提出集成奖励强化学习(RLER),该方法通过自适应奖励插值和分歧感知的轨迹选择聚合多样化的模型,以减少耦合并抑制过度奖励漂移。大量实验表明,RLER相比最佳RLIR基线提升了6.2%,且与RLVR的差距在3.6%以内,同时在未标注样本上表现出稳定的扩展性。

英文摘要

Reinforcement learning with verifiable rewards (RLVR) efficiently scales the reasoning ability of large language models (LLMs) but is bottlenecked by scarce labeled data. Reinforcement learning with intrinsic rewards (RLIR) offers a scalable alternative via self-rewarding, yet often suffers from instability and inferior performance. We trace this gap to a systemic bias in confidence-coupled self-rewarding: the model tends to over-reward high-confidence mistakes, forming a self-confirming loop. We quantify this feedback-loop bias with three metrics: reward noise magnitude (rho_noise), policy-reward coupling (rho_selfbias), and over-/under-reward skew (rho_symbias). Our analyses show a compounding effect where strong coupling amplifies confidence-conditioned errors and drives a drift toward over-reward, leading to instability and a lower performance ceiling. To mitigate this, we propose reinforcement learning with ensembled rewards (RLER), which aggregates diverse models with adaptive reward interpolation and disagreement-aware rollout selection to reduce coupling and suppress over-reward drift. Extensive experiments show that RLER improves by 6.2% over the best RLIR baseline and is within 3.6% of RLVR, while exhibiting stable scaling on unlabeled samples.

2502.09755 2026-06-03 cs.CR cs.LG 版本更新

Jailbreak Attack Initializations as Extractors of Compliance Directions

越狱攻击初始化作为合规方向的提取器

Amit Levi, Rom Himelstein, Yaniv Nemcovsky, Avi Mendelson, Chaim Baskin

发表机构 * Department of Computer Science, Technion - Israel Institute of Technology(以色列理工学院计算机科学系) Department of Data and Decision Science, Technion - Israel Institute of Technology(以色列理工学院数据与决策科学系) School of Electrical and Computer Engineering, Ben-Gurion University of the Negev(内盖夫本-古里安大学电气与计算机工程学院)

AI总结 本文发现基于梯度的越狱攻击初始化会收敛到抑制拒绝的单一合规方向,并据此提出CRI框架,通过沿合规方向投影未见提示来提高攻击成功率并降低计算开销。

Comments Accepted to Findings of the Association for Computational Linguistics: EMNLP 2025

AI中文摘要

安全对齐的LLM对提示的响应要么是合规要么是拒绝,每种响应对应模型激活空间中的不同方向。最近的研究表明,通过从其他提示进行自我迁移来初始化攻击可以显著提升其性能。然而,这些初始化的潜在机制仍不清楚,并且攻击使用任意或手动选择的初始化。本文表明,每个基于梯度的越狱攻击及其后续初始化逐渐收敛到一个抑制拒绝的单一合规方向,从而能够实现从拒绝到合规的高效转换。基于这一见解,我们提出了CRI,一个旨在将未见提示进一步投影到合规方向的初始化框架。我们在多种攻击、模型和数据集上展示了我们的方法,实现了更高的攻击成功率(ASR)并降低了计算开销,突显了安全对齐LLM的脆弱性。参考实现可在以下网址获取:https://amit1221levi.github.io/CRI-Jailbreak-Init-LLMs-evaluation

英文摘要

Safety-aligned LLMs respond to prompts with either compliance or refusal, each corresponding to distinct directions in the model's activation space. Recent works show that initializing attacks via self-transfer from other prompts significantly enhances their performance. However, the underlying mechanisms of these initializations remain unclear, and attacks utilize arbitrary or hand-picked initializations. This work shows that each gradient-based jailbreak attack and subsequent initialization gradually converges to a single compliance direction that suppresses refusal, thereby enabling an efficient transition from refusal to compliance. Based on this insight, we propose CRI, an initialization framework that aims to project unseen prompts further along compliance directions. We demonstrate our approach on multiple attacks, models, and datasets, achieving an increased attack success rate (ASR) and reduced computational overhead, highlighting the fragility of safety-aligned LLMs. A reference implementation is available at: https://amit1221levi.github.io/CRI-Jailbreak-Init-LLMs-evaluation.

2510.03316 2026-06-03 cs.CV cs.AI cs.LG 版本更新

The View From Space: Navigating Instrumentation Differences with EOFMs

从太空视角:利用EOFMs导航仪器差异

Ryan P. Demilt, Nicholas LaHaye, Karis Tenneson

发表机构 * Spatial Informatics Group(空间信息组)

AI总结 本研究通过分析地球观测基础模型(EOFMs)对传感器架构的敏感性,揭示了当前模型设计的缺陷,并为模型开发者、用户和遥感科学社区指明了前进方向。

Journal ref
https://neurips.cc/virtual/2025/loc/san-diego/122891
AI中文摘要

地球观测基础模型(EOFMs)作为处理大量遥感及其他地球观测数据、并对许多关键地球监测任务产生影响的工具,其普及程度急剧上升。一个新兴趋势是利用预训练模型的输出作为“嵌入”,这些嵌入总结了高维数据,可用于通用任务,如相似性搜索和内容特定查询。然而,大多数EOFMs仅在单一模态数据上训练,然后通过匹配不同模态的波段进行应用或基准测试。现有工作尚不清楚多样化的传感器架构如何影响当前EOFMs套件的内部表示。我们在本工作中表明,EOFMs的表示空间对传感器架构高度敏感,理解这一差异为我们提供了关于当前EOFMs设计陷阱的关键视角,并指明了作为模型开发者、用户以及以稳健遥感科学为指导的社区应如何前进的方向。

英文摘要

Earth Observation Foundation Models (EOFMs) have exploded in prevalence as tools for processing the massive volumes of remotely sensed and other earth observation data, and for delivering impact on the many essential earth monitoring tasks. An emerging trend posits using the outputs of pre-trained models as 'embeddings' which summarize high dimensional data to be used for generic tasks such as similarity search and content-specific queries. However, most EOFM models are trained only on single modalities of data and then applied or benchmarked by matching bands across different modalities. It is not clear from existing work what impact diverse sensor architectures have on the internal representations of the present suite of EOFMs. We show in this work that the representation space of EOFMs is highly sensitive to sensor architecture and that understanding this difference gives a vital perspective on the pitfalls of current EOFM design and signals for how to move forward as model developers, users, and a community guided by robust remote-sensing science.

2510.01377 2026-06-03 math.OC cs.AI cs.LG cs.MA cs.SY eess.SY 版本更新

DeMuon: A Decentralized Muon for Matrix Optimization over Graphs

DeMuon:一种用于图上矩阵优化的去中心化Muon方法

Chuan He, Shuyi Ren, Jingwei Mao, Erik G. Larsson

发表机构 * Department of Mathematics, Linköping University(林雪平大学数学系) Department of Electrical Engineering, Linköping University(林雪平大学电气工程系) Department of Computer and Information Science, Linköping University(林雪平大学计算机与信息科学系)

AI总结 提出DeMuon方法,通过牛顿-舒尔茨迭代实现矩阵正交化,并利用梯度跟踪处理局部函数异质性,在重尾噪声下达到与集中式算法匹配的复杂度,首次将Muon扩展到去中心化图优化并具有可证明的复杂度保证。

Comments Add an accelerated variant of the proposed method. New proofs of proposed methods

AI中文摘要

本文提出DeMuon,一种在给定通信拓扑上进行去中心化矩阵优化的方法。DeMuon通过牛顿-舒尔茨迭代(继承自其集中式前身Muon)实现矩阵正交化,并采用梯度跟踪来减轻局部函数之间的异质性。在重尾噪声条件和额外的温和假设下,我们建立了DeMuon达到近似随机驻点的迭代复杂度。该复杂度结果在目标容差依赖方面与已知的最佳集中式算法复杂度界相匹配。据我们所知,DeMuon是首个将Muon直接扩展到图上去中心化优化并具有可证明复杂度保证的方法。我们在不同连通程度的图上进行了去中心化Transformer预训练的初步数值实验。数值结果表明,在不同网络拓扑下,DeMuon相比其他流行的去中心化算法具有明显的改进优势。

英文摘要

In this paper, we propose DeMuon, a method for decentralized matrix optimization over a given communication topology. DeMuon incorporates matrix orthogonalization via Newton-Schulz iterations, a technique inherited from its centralized predecessor Muon, and employs gradient tracking to mitigate heterogeneity among local functions. Under heavy-tailed noise conditions and additional mild assumptions, we establish the iteration complexity of DeMuon for reaching an approximate stochastic stationary point. This complexity result matches the best-known complexity bounds of centralized algorithms in terms of dependence on the target tolerance. To the best of our knowledge, DeMuon is the first direct extension of Muon to decentralized optimization over graphs with provable complexity guarantees. We conduct preliminary numerical experiments on decentralized transformer pretraining over graphs with varying degrees of connectivity. Our numerical results demonstrate a clear margin of improvement of DeMuon over other popular decentralized algorithms across different network topologies.
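
To make the orthogonalization step concrete, here is a minimal sketch of the classical cubic Newton-Schulz iteration, X <- 1.5 X - 0.5 X X^T X, which drives the singular values of a suitably scaled full-rank matrix toward 1. It is an assumption that this textbook form is representative: Muon and DeMuon may use different polynomial coefficients and iteration counts, and the helper names below are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def newton_schulz_orthogonalize(g, steps=30):
    """Approximate the orthogonal polar factor of a nonzero, full-rank matrix g."""
    # Scale by the Frobenius norm so every singular value lies in (0, 1],
    # which is inside the convergence region of the cubic iteration.
    fro = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * x[i][j] - 0.5 * xxtx[i][j] for j in range(len(x[0]))]
             for i in range(len(x))]
    return x


# A toy "gradient" matrix; the result should have orthonormal rows.
q = newton_schulz_orthogonalize([[3.0, 1.0], [0.0, 2.0]])
gram = matmul(q, transpose(q))
```

The iteration uses only matrix products, which is why it is attractive as a replacement for an explicit singular value decomposition inside an optimizer.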

2509.26169 2026-06-03 cs.LG 版本更新

Alignment-Aware Decoding

对齐感知解码

Frédéric Berdoz, Luca A. Lanzendörfer, René Caky, Roger Wattenhofer

发表机构 * ETH Zurich, Switzerland(苏黎世联邦理工学院)

AI总结 提出一种推理时增强模型对齐的方法——对齐感知解码(AAD),可解释为隐式奖励优化,无需额外训练,在多种基准和模型规模上优于强基线,并能生成合成数据改善数据受限场景下的对齐。

Comments Accepted at ICML 2026

AI中文摘要

大型语言模型的对齐仍然是自然语言处理中的一个核心挑战。偏好优化已成为一种流行且有效的改进对齐的方法,通常通过训练时或基于提示的干预来实现。在本文中,我们介绍了一种对齐感知解码(AAD)方法,该方法直接在推理时增强模型对齐。理论上,AAD可以解释为隐式奖励优化,但它不需要超出标准DPO设置之外的专门训练。经验上,AAD在各种对齐基准和模型规模上始终优于强基线。此外,在数据受限的设置中,AAD可以生成高质量的合成数据,以改善标准解码下的对齐,为标记数据有限时提供了一种实用的解决方案。

英文摘要

Alignment of large language models remains a central challenge in natural language processing. Preference optimization has emerged as a popular and effective method for improving alignment, typically through training-time or prompt-based interventions. In this paper, we introduce alignment-aware decoding (AAD), a method to enhance model alignment directly at inference. Theoretically, AAD can be interpreted as implicit reward optimization, yet it requires no specialized training beyond the standard DPO setup. Empirically, AAD consistently outperforms strong baselines across diverse alignment benchmarks and model scales. Moreover, in data-constrained settings, AAD can produce high-quality synthetic data to improve alignment under standard decoding, providing a practical solution when labeled data is limited.

2509.22468 2026-06-03 cs.LG cs.AI 版本更新

Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining

学习邻域:无对比的多模态自监督分子图预训练

Boshra Ariguib, Mathias Niepert, Andrei Manolache

发表机构 * University of Tübingen(图宾根大学)

AI总结 提出C-FREE框架,通过预测子图嵌入与互补邻域的关系,融合2D拓扑和3D构象信息,实现无对比、无负样本的多模态自监督分子图预训练,在MoleculeNet上取得最优结果。

Comments Accepted at ICML 2026

AI中文摘要

高质量的分子表示对于性质预测和分子设计至关重要,然而大型标注数据集仍然稀缺。尽管分子图上的自监督预训练已显示出潜力,但许多现有方法要么依赖于手工数据增强或复杂的生成目标,要么仅利用2D拓扑,导致宝贵的3D结构信息未被充分利用。为弥补这一空白,我们引入了C-FREE(基于自我网络的无需对比的表示学习),一个将2D图与3D构象集成在一起的简单框架。C-FREE通过从潜在空间中互补邻域预测子图嵌入来学习分子表示,使用固定半径的自我网络作为不同构象之间的建模单元。这种设计使我们能够在混合图神经网络(GNN)-Transformer骨干中整合几何和拓扑信息,无需负样本、位置编码或昂贵的预处理。在提供丰富3D构象多样性的GEOM数据集上进行预训练后,C-FREE在MoleculeNet上取得了最先进的结果,超越了对比、生成和其他多模态自监督方法。在具有不同规模和分子类型的数据集上进行微调进一步表明,预训练能有效迁移到新的化学领域,突显了3D信息分子表示的重要性。

英文摘要

High-quality molecular representations are essential for property prediction and molecular design, yet large labeled datasets remain scarce. While self-supervised pretraining on molecular graphs has shown promise, many existing approaches either depend on hand-crafted augmentations or complex generative objectives, and often rely solely on 2D topology, leaving valuable 3D structural information underutilized. To address this gap, we introduce C-FREE (Contrast-Free Representation learning on Ego-nets), a simple framework that integrates 2D graphs with ensembles of 3D conformers. C-FREE learns molecular representations by predicting subgraph embeddings from their complementary neighborhoods in the latent space, using fixed-radius ego-nets as modeling units across different conformers. This design allows us to integrate both geometric and topological information within a hybrid Graph Neural Network (GNN)-Transformer backbone, without negatives, positional encodings, or expensive pre-processing. Pretraining on the GEOM dataset, which provides rich 3D conformational diversity, C-FREE achieves state-of-the-art results on MoleculeNet, surpassing contrastive, generative, and other multimodal self-supervised methods. Fine-tuning across datasets with diverse sizes and molecule types further demonstrates that pretraining transfers effectively to new chemical domains, highlighting the importance of 3D-informed molecular representations.

2509.08726 2026-06-03 math.OC cs.LG 版本更新

Decentralized Stochastic Nonconvex Optimization under the $(L_0,L_1)$-Smoothness

$(L_0,L_1)$-光滑条件下的去中心化随机非凸优化

Luo Luo, Xue Cui, Tingkai Jia, Cheng Chen

发表机构 * School of Data Science, Fudan University(复旦大学数据科学学院) East China Normal University(华东师范大学)

AI总结 针对满足$(L_0,L_1)$-光滑条件的非凸函数,提出去中心化归一化随机梯度下降算法,实现每个局部智能体达到ε-稳定点,并给出样本复杂度和通信复杂度的上界。

AI中文摘要

本文关注去中心化随机优化问题 $f(\mathbf{x})=\frac{1}{m}\sum_{i=1}^m f_i(\mathbf{x})$,该问题定义在由 $m$ 个智能体组成的连通网络上,每个局部函数形如 $f_i(\mathbf{x}) = {\mathbb E}\left[F(\mathbf{x};{\boldsymbol \xi}_i)\right]$,满足 $(L_0,L_1)$-光滑条件但可能非凸,且每个随机变量 ${\boldsymbol \xi}_i$ 服从分布 ${\mathcal D}_i$。我们提出一种新算法——去中心化归一化随机梯度下降(DNSGD),该算法可使每个局部智能体达到 $\varepsilon$-稳定点。我们提出了一个基于梯度范数与一致性误差乘积的李雅普诺夫函数的新框架,用于分析 $(L_0,L_1)$-光滑设置下的去中心化一阶方法。我们证明,所提算法在每个智能体上的样本复杂度上界为 ${\mathcal O}(m^{-1}(L_f\sigma^2\Delta_f\varepsilon^{-4} + \sigma^2\varepsilon^{-2} + L_f^{-2}L_1^3\sigma^2\Delta_f\varepsilon^{-1} + L_f^{-2}L_1^2\sigma^2))$,通信复杂度上界为 $\tilde{\mathcal O}((L_f\varepsilon^{-2} + L_1\varepsilon^{-1})\gamma^{-1/2}\Delta_f)$,其中 $L_f=L_0 +L_1\zeta$,$\sigma^2$ 是随机梯度的方差,$\Delta_f$ 是初始最优函数值差距,$\gamma$ 是网络的谱间隙,$\zeta$ 是梯度异质性程度。在 $L_1=0$ 的特殊情况下,上述结果(几乎)匹配标准光滑条件下去中心化随机非凸优化的下界。我们还进行了数值实验,以展示我们方法的实证优越性。

英文摘要

This paper focuses on the decentralized stochastic optimization problem $f(\mathbf{x})=\frac{1}{m}\sum_{i=1}^m f_i(\mathbf{x})$ over a connected network of $m$ agents, where each local function has the form $f_i(\mathbf{x}) = {\mathbb E}\left[F(\mathbf{x};{\boldsymbol \xi}_i)\right]$, satisfies the $(L_0,L_1)$-smoothness condition but is possibly nonconvex, and each random variable ${\boldsymbol \xi}_i$ follows distribution ${\mathcal D}_i$. We propose a novel algorithm called decentralized normalized stochastic gradient descent (DNSGD), which can achieve an $\varepsilon$-stationary point at each local agent. We present a new framework for analyzing decentralized first-order methods in the $(L_0,L_1)$-smooth setting, based on a Lyapunov function related to the product of the gradient norm and the consensus error. We show that the proposed algorithm attains upper bounds on the sample complexity of ${\mathcal O}(m^{-1}(L_f\sigma^2\Delta_f\varepsilon^{-4} + \sigma^2\varepsilon^{-2} + L_f^{-2}L_1^3\sigma^2\Delta_f\varepsilon^{-1} + L_f^{-2}L_1^2\sigma^2))$ per agent and on the communication complexity of $\tilde{\mathcal O}((L_f\varepsilon^{-2} + L_1\varepsilon^{-1})\gamma^{-1/2}\Delta_f)$, where $L_f=L_0 +L_1\zeta$, $\sigma^2$ is the variance of the stochastic gradient, $\Delta_f$ is the initial optimal function value gap, $\gamma$ is the spectral gap of the network, and $\zeta$ is the degree of the gradient dissimilarity. In the special case of $L_1=0$, the above results (nearly) match the lower bounds of decentralized stochastic nonconvex optimization under standard smoothness. We also conduct numerical experiments to show the empirical superiority of our method.
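
For readers unfamiliar with the setting, the block below writes out the commonly used form of the $(L_0,L_1)$-smoothness condition and a generic normalized stochastic gradient step. Both are stated as assumptions for orientation only: the paper may work with a first-order variant of the condition, and the update shown is the plain single-agent normalized step with a hypothetical step size $\eta$, not the DNSGD update, which also involves communication between agents.

```latex
% (L_0, L_1)-smoothness for a twice-differentiable f (commonly used form).
% Setting L_1 = 0 recovers the standard L_0-smoothness condition.
\[
  \|\nabla^2 f(\mathbf{x})\| \le L_0 + L_1 \|\nabla f(\mathbf{x})\|
  \quad \text{for all } \mathbf{x}.
\]
% Generic normalized stochastic gradient step (illustrative, single agent),
% where g_t is an unbiased stochastic gradient at x_t.
\[
  \mathbf{x}_{t+1} = \mathbf{x}_t - \eta \, \frac{\mathbf{g}_t}{\|\mathbf{g}_t\|},
  \qquad \mathbb{E}\left[\mathbf{g}_t \mid \mathbf{x}_t\right] = \nabla f(\mathbf{x}_t).
\]
% An epsilon-stationary point in expectation.
\[
  \mathbb{E}\,\|\nabla f(\mathbf{x})\| \le \varepsilon.
\]
```

Normalizing the step keeps its length fixed at $\eta$, so the iterate stays controlled in regions where the local smoothness constant grows with the gradient norm.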

2502.02748 2026-06-03 cs.LG cond-mat.mtrl-sci 版本更新

ReciNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction

ReciNet: 用于晶体性质预测的倒易空间感知长程建模

Jianan Nie, Peiyao Xiao, Kaiyi Ji, Peng Gao

发表机构 * Department of Computer Science, Virginia Tech(维吉尼亚理工大学计算机科学系) Department of Computer Science and Engineering, University at Buffalo(布法罗大学计算机科学与工程系)

AI总结 提出基于倒易空间的几何网络ReciNet,通过傅里叶级数表示和可学习滤波器结合几何GNN与倒易模块,实现晶体中短程和长程相互作用建模,在多个基准上取得优异预测精度。

AI中文摘要

从晶体结构预测其性质是材料科学中一项基础但具有挑战性的任务。与分子不同,晶体结构表现出原子的无限周期排列,需要能够有效捕捉局部和全局信息的方法。然而,当前的工作在捕捉周期结构内的长程相互作用方面存在不足。为了解决这个问题,我们利用倒易空间(周期晶体的自然域),并从分数坐标和倒易格矢出发,使用可学习滤波器构建傅里叶级数表示。在此基础上,我们引入了基于倒易空间的几何网络(ReciNet),这是一种新颖的架构,它集成了几何GNN和倒易模块来建模短程和长程相互作用。在综合基准JARVIS、Materials Project和MatBench上的实验表明,ReciNet在一系列晶体性质预测任务中取得了出色的预测精度。此外,我们探索了使用混合专家模型进行多性质预测的模型扩展,该扩展展示了高计算效率,并揭示了相关性质之间的正迁移。这些发现凸显了我们的模型作为可扩展且准确的晶体性质预测解决方案的潜力。

英文摘要

Predicting properties of crystals from their structures is a fundamental yet challenging task in materials science. Unlike molecules, crystal structures exhibit infinite periodic arrangements of atoms, requiring methods capable of capturing both local and global information effectively. However, current works fall short of capturing long-range interactions within periodic structures. To address this, we leverage reciprocal space, the natural domain for periodic crystals, and construct a Fourier series representation from fractional coordinates and reciprocal lattice vectors with learnable filters. Building on this, we introduce the reciprocal space-based geometry network (ReciNet), a novel architecture that integrates geometric GNNs and reciprocal blocks to model short-range and long-range interactions. Experiments on comprehensive benchmarks JARVIS, Materials Project, and MatBench demonstrate that ReciNet achieves outstanding predictive accuracy across a range of crystal property prediction tasks. Additionally, we explore a model extension for multi-property prediction with the mixture-of-experts, which demonstrates high computational efficiency and reveals positive transfer between correlated properties. These findings highlight the potential of our model as a scalable and accurate solution for crystal property prediction.
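
A minimal sketch of a Fourier series representation built from fractional coordinates and integer reciprocal-lattice indices may help show why reciprocal space suits periodic crystals. It is only an illustration under stated assumptions: the function name, the index cutoff `max_index`, and the fixed per-mode `weights` (standing in for the learnable filters mentioned in the abstract) are hypothetical and do not reproduce the ReciNet architecture.

```python
import math
from itertools import product


def fourier_features(frac_coords, max_index=1, weights=None):
    """Per-mode (cos, sin) sums over atoms for each integer index (h, k, l).

    frac_coords: list of (u, v, w) fractional coordinates of atoms in the cell.
    The phase 2*pi*(h*u + k*v + l*w) is the dot product of a reciprocal
    lattice vector with the atomic position, written in fractional form.
    """
    indices = [hkl for hkl in product(range(-max_index, max_index + 1), repeat=3)
               if hkl != (0, 0, 0)]
    features = []
    for n, (h, k, l) in enumerate(indices):
        scale = 1.0 if weights is None else weights[n]  # stand-in for a learned filter
        real = sum(math.cos(2.0 * math.pi * (h * u + k * v + l * w))
                   for u, v, w in frac_coords)
        imag = sum(math.sin(2.0 * math.pi * (h * u + k * v + l * w))
                   for u, v, w in frac_coords)
        features.append((scale * real, scale * imag))
    return features


cell = [(0.0, 0.0, 0.0), (0.25, 0.5, 0.75)]
# The same atoms, with the second one moved into a neighboring periodic image.
shifted = [(0.0, 0.0, 0.0), (1.25, 0.5, -0.25)]
base_features = fourier_features(cell)
shifted_features = fourier_features(shifted)
```

Because the indices are integers, translating an atom by a whole lattice vector leaves every feature unchanged, so the representation respects the infinite periodicity of the crystal by construction.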

2509.19305 2026-06-03 cs.LG cs.AI eess.SP 版本更新

Wavelet Fourier Diffuser: Frequency-Aware Diffusion Model for Reinforcement Learning

小波傅里叶扩散器:用于强化学习的频率感知扩散模型

Yifu Luo, Yongzhe Chang, Xueqian Wang

发表机构 * Tsinghua University, China(清华大学)

AI总结 针对现有扩散模型在离线强化学习中忽略频域特征导致频率偏移的问题,提出WFDiffuser,通过离散小波变换分解轨迹并利用短时傅里叶变换和交叉注意力增强频域建模,在D4RL基准上有效缓解频率偏移,提升轨迹稳定性和决策性能。

Comments IJCNN 2025

Journal ref
IJCNN 2025
AI中文摘要

扩散概率模型通过直接建模轨迹序列,在离线强化学习中展现出显著潜力。然而,现有方法主要关注时域特征而忽略频域特征,根据我们的观察,这会导致频率偏移和性能下降。在本文中,我们从频域的新视角研究强化学习问题。我们首先观察到,仅使用时域的方法会无意中引入频域低频分量的偏移,从而导致轨迹不稳定和性能下降。为了解决这个问题,我们提出了小波傅里叶扩散器(WFDiffuser),一种新颖的基于扩散的强化学习框架,它集成了离散小波变换将轨迹分解为低频和高频分量。为了进一步增强每个分量的扩散建模,WFDiffuser采用短时傅里叶变换和交叉注意力机制来提取频域特征并促进跨频率交互。在D4RL基准上的大量实验结果表明,WFDiffuser有效缓解了频率偏移,从而产生更平滑、更稳定的轨迹,并相比现有方法提高了决策性能。

英文摘要

Diffusion probabilistic models have shown significant promise in offline reinforcement learning by directly modeling trajectory sequences. However, existing approaches primarily focus on time-domain features while overlooking frequency-domain features, which, according to our observations, leads to frequency shift and degraded performance. In this paper, we investigate the RL problem from the new perspective of the frequency domain. We first observe that time-domain-only approaches inadvertently introduce shifts in the low-frequency components of the frequency domain, which results in trajectory instability and degraded performance. To address this issue, we propose Wavelet Fourier Diffuser (WFDiffuser), a novel diffusion-based RL framework that integrates the Discrete Wavelet Transform to decompose trajectories into low- and high-frequency components. To further enhance diffusion modeling for each component, WFDiffuser employs the Short-Time Fourier Transform and cross-attention mechanisms to extract frequency-domain features and facilitate cross-frequency interaction. Extensive experimental results on the D4RL benchmark demonstrate that WFDiffuser effectively mitigates frequency shift, leading to smoother, more stable trajectories and improved decision-making performance over existing methods.
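
The decomposition step described above can be sketched with a single-level Haar wavelet transform, which splits a sequence into a low-frequency (approximation) and a high-frequency (detail) component and reconstructs it exactly. The choice of the Haar wavelet, the single decomposition level, and the function names are assumptions for illustration; the abstract does not state which wavelet or depth WFDiffuser uses.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar transform on an even-length sequence."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise sums capture the slowly varying (low-frequency) part.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Pairwise differences capture the rapidly varying (high-frequency) part.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, interleaving the reconstructed samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


# A toy one-dimensional trajectory (for example, one state dimension over time).
trajectory = [0.0, 0.1, 0.4, 0.3, 1.0, 1.2]
low, high = haar_dwt(trajectory)
reconstructed = haar_idwt(low, high)
```

A model can then treat `low` and `high` as separate inputs, which is the general pattern of modeling each frequency band on its own before recombining them.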

2509.08707 2026-06-03 q-bio.BM cs.LG 版本更新

Tokenizing Loops of Antibodies

抗体环的标记化

Ada Fang, Robert G. Alberstein, Simon Kelow, Frédéric A. Dreyer

发表机构 * Harvard University(哈佛大学) Prescient Design, Genentech(Prescient Design,基因泰克)

AI总结 提出Igloo多模态抗体环标记器,通过对比学习编码主链二面角和序列,高效检索相似环结构,提升H3环识别性能5.9%,并集成到蛋白质语言模型中改善抗体设计。

Comments 21 pages, 7 figures, 10 tables, code available at https://github.com/prescient-design/igloo

AI中文摘要

抗体的互补决定区是环状结构,对其与抗原的相互作用至关重要,并且对新型生物制品的设计具有高度重要性。自20世纪80年代以来,将CDR结构的多样性分类为规范簇使得能够识别抗体的关键结构基序。然而,现有方法的覆盖范围有限,并且不能轻易地整合到蛋白质基础模型中。在这里,我们介绍了免疫球蛋白环标记器Igloo,这是一种多模态抗体环标记器,用于编码主链二面角和序列。Igloo使用对比学习目标进行训练,以在潜在空间中将具有相似主链二面角的环映射得更近。Igloo可以高效地从结构抗体数据库中检索最接近的匹配环结构,在识别相似H3环方面比现有方法高出5.9%。Igloo为所有环分配标记,解决了规范簇覆盖范围有限的问题,同时保留了恢复规范环构象的能力。为了展示Igloo标记的多功能性,我们展示了它们可以通过IglooLM和IglooALM整合到蛋白质语言模型中。在预测重链变体的结合亲和力方面,IglooLM在10个抗体-抗原靶点中的8个上优于基础蛋白质语言模型。此外,它与现有的最先进的基于序列和多模态蛋白质语言模型相当,与参数多7倍的模型表现相当。IglooALM采样的抗体环在序列上多样化,在结构上比最先进的抗体逆折叠模型更一致。Igloo展示了引入多模态标记用于抗体环在编码抗体环的多样化景观、改进蛋白质基础模型以及抗体CDR设计方面的优势。

英文摘要

The complementarity-determining regions (CDRs) of antibodies are loop structures that are key to their interactions with antigens, and of high importance to the design of novel biologics. Since the 1980s, categorizing the diversity of CDR structures into canonical clusters has enabled the identification of key structural motifs of antibodies. However, existing approaches have limited coverage and cannot be readily incorporated into protein foundation models. Here we introduce ImmunoGlobulin LOOp Tokenizer, Igloo, a multimodal antibody loop tokenizer that encodes backbone dihedral angles and sequence. Igloo is trained using a contrastive learning objective to map loops with similar backbone dihedral angles closer together in latent space. Igloo can efficiently retrieve the closest matching loop structures from a structural antibody database, outperforming existing methods on identifying similar H3 loops by 5.9%. Igloo assigns tokens to all loops, addressing the limited coverage issue of canonical clusters, while retaining the ability to recover canonical loop conformations. To demonstrate the versatility of Igloo tokens, we show that they can be incorporated into protein language models with IglooLM and IglooALM. On predicting binding affinity of heavy chain variants, IglooLM outperforms the base protein language model on 8 out of 10 antibody-antigen targets. Additionally, it is on par with existing state-of-the-art sequence-based and multimodal protein language models, performing comparably to models with 7x more parameters. IglooALM samples antibody loops that are diverse in sequence and more consistent in structure than those from state-of-the-art antibody inverse folding models. Igloo demonstrates the benefit of multimodal tokens for antibody loops in encoding the diverse landscape of antibody loops, improving protein foundation models, and designing antibody CDRs.

2507.23035 2026-06-03 cs.LG cs.AR 版本更新

OASIS: Outlier-Aware LUT-Based GEMM with Dual-Side Quantization for LLM Inference Acceleration

OASIS:基于查找表的离群点感知双端量化LLM推理加速通用矩阵乘法

Xueying Wu, Baijun Zhou, Zhihui Gao, Yuzhe Fu, Qilin Zheng, Yintao He, Hai Li

发表机构 * Duke University(杜克大学)

AI总结 提出OASIS架构,利用预计算笛卡尔积查找表实现非均匀量化权重与激活的高效通用矩阵乘法,通过离群点感知量化方案和实时离群点检测引擎Orizuru,在保持精度的同时显著提升推理速度和能效。

AI中文摘要

大型语言模型(LLM)在各种应用中展现了令人印象深刻的能力,但在推理过程中需要大量的内存和计算资源。现有的量化方法在效率和准确性之间存在权衡:仅权重量化(WOQ)引入了昂贵的反量化开销,而整数权重和激活量化(INT-WAQ)降低了精度并损害了模型质量。非均匀权重和激活量化(NU-WAQ)能更好地捕捉LLM权重和激活的非均匀分布,但仍与传统的低精度计算单元不兼容。本文提出了OASIS,一种基于查找表(LUT)的架构,能够在无需反量化的情况下实现非均匀量化权重和激活之间的高效通用矩阵乘法(GEMM)。OASIS采用预计算的笛卡尔积LUT,实现了LUT大小的64倍缩减,并相较于现有基于LUT的GEMM方法实现了1024倍的计算并行度提升。为了在激进的激活量化下保持精度,OASIS引入了一种离群点感知量化方案,同时进行基于LUT的GEMM和针对离群点的误差补偿。此外,我们设计了Orizuru,一种用于实时激活离群点检测的高效top-k检测引擎。根据广泛评估,与FP16基线相比,OASIS的平均精度下降仅为1.98%,比Atom低5.18%。在硬件方面,与FIGLUT加速器相比,OASIS实现了平均3.00倍的加速和1.44倍的能效提升。

英文摘要

Large language models (LLMs) have demonstrated impressive capabilities across a wide range of applications, but demand substantial memory and compute resources during inference. Existing quantization methods expose a trade-off between efficiency and accuracy: weight-only quantization (WOQ) incurs costly dequantization overheads, while integer weight-and-activation quantization (INT-WAQ) reduces precision and degrades model quality. Non-uniform weight-and-activation quantization (NU-WAQ) can better capture the non-uniform distributions of LLM weights and activations, yet remains incompatible with conventional low-precision compute units. This paper presents OASIS, a lookup table (LUT)-based architecture that enables efficient general matrix multiplication (GEMM) between non-uniformly quantized weights and activations without requiring dequantization. OASIS employs pre-computed Cartesian product LUTs, achieving a 64x reduction in LUT size and enabling 1024x higher computational parallelism than existing LUT-based GEMM methods. To preserve accuracy under aggressive activation quantization, OASIS introduces an outlier-aware quantization scheme with concurrent LUT-based GEMM and error compensation for outliers. Furthermore, we design Orizuru, an efficient top-k detection engine for real-time activation outlier identification. According to extensive evaluations, OASIS incurs an average accuracy drop of only 1.98% compared to the FP16 baseline, a drop that is 5.18% smaller than that of Atom. On the hardware side, OASIS achieves an average 3.00x speedup and a 1.44x energy efficiency improvement compared to the FIGLUT accelerator.
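
The core idea of multiplying non-uniformly quantized weights and activations through a lookup table, without dequantizing either operand, can be sketched as follows. This is a simplified illustration, not the OASIS design: the 2-bit codebooks, the single shared table, and the function names are assumptions, and the sketch leaves out the LUT-size reduction, parallel organization, and outlier compensation that the abstract describes.

```python
def build_product_lut(weight_codebook, act_codebook):
    """Precompute every weight-level times activation-level product once."""
    return [[w * a for a in act_codebook] for w in weight_codebook]


def lut_matvec(weight_idx, act_idx, lut):
    """Matrix-vector product using only table lookups and additions.

    weight_idx: rows of integer codes into the weight codebook.
    act_idx: integer codes into the activation codebook.
    """
    return [sum(lut[wi][ai] for wi, ai in zip(row, act_idx)) for row in weight_idx]


def dequant_matvec(weight_idx, act_idx, weight_codebook, act_codebook):
    """Reference path: dequantize both operands, then multiply and accumulate."""
    acts = [act_codebook[ai] for ai in act_idx]
    return [sum(weight_codebook[wi] * a for wi, a in zip(row, acts))
            for row in weight_idx]


# Hypothetical non-uniform 2-bit codebooks (four levels each).
weight_codebook = [-1.0, -0.25, 0.25, 1.0]
act_codebook = [0.0, 0.5, 1.5, 4.0]
lut = build_product_lut(weight_codebook, act_codebook)

weight_idx = [[0, 3, 2, 1], [2, 2, 0, 3]]
act_idx = [1, 3, 0, 2]
via_lut = lut_matvec(weight_idx, act_idx, lut)
via_dequant = dequant_matvec(weight_idx, act_idx, weight_codebook, act_codebook)
```

With 2-bit codes on both sides the table has only 16 entries, and the inner loop needs no multiplier, which is the property that makes LUT-based GEMM attractive when the quantization levels are not uniformly spaced.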

2508.13174 2026-06-03 cs.AI cs.LG q-fin.CP stat.ML 版本更新

AlphaEval: A Comprehensive and Efficient Evaluation Framework for Formula Alpha Mining

AlphaEval:一个全面高效的公式化Alpha挖掘评估框架

Hongjun Ding, Binqi Chen, Jinsheng Huang, Taian Guo, Zhengyang Mao, Guoyi Shao, Lutong Zou, Luchen Liu, Ming Zhang

发表机构 * CUNY Baruch College(纽约市立大学巴鲁克学院) Peking University(北京大学) Harvard University(哈佛大学) Zhengren Research(正人研究所) Zhengren Quant(正人量化)

AI总结 提出AlphaEval框架,通过五个维度(预测能力、稳定性、鲁棒性、金融逻辑、多样性)对自动Alpha挖掘模型进行统一、可并行化且无需回测的评估,实现与回测相当的评估一致性并提高效率。

Comments Accepted by KDD2026

详情
AI中文摘要

公式化Alpha挖掘从金融数据中生成预测信号,对量化投资至关重要。尽管遗传编程、强化学习和大语言模型等多种算法方法显著扩展了Alpha发现的能力,但系统评估仍是一个关键挑战。现有评估指标主要包括回测和基于相关性的度量。回测计算密集、本质上是顺序的,并且对特定策略参数敏感。基于相关性的度量虽然高效,但仅评估预测能力,忽略了时间稳定性、鲁棒性、多样性和可解释性等其他关键属性。此外,大多数现有Alpha挖掘模型的闭源性质阻碍了可重复性并减缓了该领域的进展。为解决这些问题,我们提出了AlphaEval,一个统一、可并行化且无需回测的自动Alpha挖掘模型评估框架。AlphaEval沿五个互补维度评估生成Alpha的整体质量:预测能力、稳定性、对市场扰动的鲁棒性、金融逻辑和多样性。跨代表性Alpha挖掘算法的广泛实验表明,AlphaEval实现了与全面回测相当的评估一致性,同时提供更全面的洞察和更高的效率。此外,与传统的单一指标筛选方法相比,AlphaEval能有效识别更优的Alpha。所有实现和评估工具均已开源,以促进可重复性和社区参与。

英文摘要

Formula alpha mining, which generates predictive signals from financial data, is critical for quantitative investment. Although various algorithmic approaches, such as genetic programming, reinforcement learning, and large language models, have significantly expanded the capacity for alpha discovery, systematic evaluation remains a key challenge. Existing evaluation metrics predominantly include backtesting and correlation-based measures. Backtesting is computationally intensive, inherently sequential, and sensitive to specific strategy parameters. Correlation-based metrics, though efficient, assess only predictive ability and overlook other crucial properties such as temporal stability, robustness, diversity, and interpretability. Additionally, the closed-source nature of most existing alpha mining models hinders reproducibility and slows progress in this field. To address these issues, we propose AlphaEval, a unified, parallelizable, and backtest-free evaluation framework for automated alpha mining models. AlphaEval assesses the overall quality of generated alphas along five complementary dimensions: predictive power, stability, robustness to market perturbations, financial logic, and diversity. Extensive experiments across representative alpha mining algorithms demonstrate that AlphaEval achieves evaluation consistency comparable to comprehensive backtesting, while providing more comprehensive insights and higher efficiency. Furthermore, AlphaEval effectively identifies superior alphas compared to traditional single-metric screening approaches. All implementations and evaluation tools are open-sourced to promote reproducibility and community engagement.

2507.19684 2026-06-03 cs.LG cs.AI cs.CL cs.CV 版本更新

CoMPAS3D: A Dataset and Benchmark for Interactive Motion

CoMPAS3D: 一个用于交互动作的数据集和基准

Bermet Burkanova, Yasaman Etesam, Payam Jome Yazdian, Trinity Evans, Chuxuan Zhang, Zoe Stanley, Paige Tuttösí, Angelica Lim

发表机构 * School of Computing Science Simon Fraser University(计算科学学院西蒙弗雷泽大学)

AI总结 提出CoMPAS3D数据集和评估框架,通过动作可读性和熟练度适当性等客观指标,解决交互式动作生成中缺乏社交上下文评估的问题。

Comments https://rosielab.github.io/compas3d

详情
AI中文摘要

社交互动型人形机器人必须通过身体与人类互动,实时适应伙伴的动作、意图和能力。这需要模型不仅理解身体如何移动,还要理解在共享社交背景下动作的含义。然而,交互式动作生成的评估框架并未衡量生成的动作是否在共享动作词汇中可读,也不评估其是否适合伙伴的熟练水平。这一差距有两个原因:现有框架依赖运动学指标(如FID和节拍对齐),无法衡量上述特性;现有数据集缺乏动作标注和熟练度变化。萨尔萨舞作为评估领域很合适:即兴、双人、由动作词汇和评判标准(涵盖时机、音乐性、技巧、难度、配合和原创性)指导。我们提出CoMPAS3D,一个即兴双人萨尔萨舞的动作捕捉数据集,附带评估框架,涵盖运动学质量、两个客观指标(动作可读性和熟练度适当性)以及六个基于竞赛的主观维度。数据集包含18名舞者(涵盖初级、中级和高级水平)的3小时即兴表演,超过2800个专家标注片段,涵盖动作类型、错误和风格元素。我们定义了三个基准:动作分类(类似于转录)、熟练度估计(流利度评估)和跟随者生成(对话响应)。微调的视觉语言模型在应用于真实动作序列的客观指标上表现强劲。应用于Duolando和InterGen时,这些指标揭示了运动学指标遗漏的失败。人工评估确认了生成动作与真实动作之间的差距。CoMPAS3D、标注、基准代码和基线结果公开可用。

英文摘要

Socially interactive humanoid robots must engage with humans through their bodies, adapting in real time to a partner's movement, intent, and abilities. This requires models that understand not just how bodies move, but what movement means in a shared social context. Yet evaluation frameworks for interactive motion generation do not measure whether generated follower motion is legible within a shared movement vocabulary, nor whether it is appropriate to the partner's proficiency level. This gap has two causes: existing frameworks rely on kinematic metrics such as FID and beat alignment that cannot measure either property, and existing datasets lack the move annotations and proficiency variation needed. Salsa is well-suited as an evaluation domain: improvised, dyadic, and governed by a move vocabulary and judging criteria covering timing, musicality, technique, difficulty, partnering, and originality. We present CoMPAS3D, a motion capture dataset of improvised partner salsa paired with an evaluation framework covering kinematic quality, two objective metrics (move legibility and proficiency appropriateness), and six competition-based subjective dimensions. The dataset includes 3 hours of improvisation by 18 dancers spanning beginner, intermediate, and professional levels, with over 2,800 expert-annotated segments covering move types, errors, and stylistic elements. We define three benchmarks: move classification (analogous to transcription), proficiency estimation (fluency assessment), and follower generation (dialogue response). Fine-tuned vision-language models perform strongly on objective metrics applied to ground-truth motion sequences. Applied to Duolando and InterGen, the metrics reveal failures that kinematic metrics miss. Human evaluations confirm the gap between generated and ground-truth motion. CoMPAS3D, annotations, benchmark code, and baseline results are publicly available.

2504.01531 2026-06-03 cs.LG 版本更新

DRAN: A Distribution and Relation Adaptive Network for Spatio-temporal Forecasting

DRAN:一种面向时空预测的分布与关系自适应网络

Xiaobei Zou, Luolin Xiong, Kexuan Zhang, Cesare Alippi, Yang Tang

发表机构 * Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai(能源化工过程智能制造教育部重点实验室,华东理工大学,上海) Faculty of Informatics, Università della Svizzera italiana(瑞士意大利大学信息学院) Department of Electronics, Information and Bioengineering, Politecnico di Milano(米兰理工学院电子、信息与生物工程系)

AI总结 针对非平稳时空系统的预测挑战,提出分布与关系自适应网络(DRAN),通过空间因子学习器(SFL)和动态-静态融合学习器(DSFL)分别适应分布偏移和关系变化,在天气和交通预测任务上超越现有方法。

Comments 15 pages, 10 figures

详情
AI中文摘要

准确的时空系统预测对于系统管理、控制和危机预防等任务至关重要。然而,许多时空系统固有的时变性给在非平稳条件下实现准确预测带来了挑战。为了解决非平稳性问题,我们提出了一种分布与关系自适应网络(DRAN),能够动态适应随时间变化的关系和分布。虽然时间归一化和反归一化是适应分布偏移的常用技术,但这种操作不适用于时空上下文,因为时间归一化会缩放节点的时间序列,可能破坏节点间的空间关系。为了解决这个问题,我们开发了一个空间因子学习器(SFL)模块,使得归一化和反归一化过程得以实现。为了适应传感器间空间关系的动态变化,我们提出了一种动态-静态融合学习器(DSFL)模块,通过自适应融合比例机制有效整合从动态和静态关系中学习到的特征。此外,我们引入了一个随机学习器来捕获时空表示中的噪声成分。我们的方法在天气预测和交通流预测任务上优于现有最先进方法。实验结果表明,我们的SFL在各种时间归一化操作下有效保持了空间关系。对学习到的动态和静态关系的可视化表明,DSFL能够捕获节点间的局部和远程关系。

英文摘要

Accurate predictions of spatio-temporal systems are crucial for tasks such as system management, control, and crisis prevention. However, the inherent time variance of many spatio-temporal systems poses challenges to achieving accurate predictions whenever stationarity is not granted. In order to address non-stationarity, we propose a Distribution and Relation Adaptive Network (DRAN) capable of dynamically adapting to relation and distribution changes over time. While temporal normalization and de-normalization are frequently used techniques to adapt to distribution shifts, this operation is not suitable for the spatio-temporal context as temporal normalization scales the time series of nodes and possibly disrupts the spatial relations among nodes. In order to address this problem, a Spatial Factor Learner (SFL) module is developed that enables the normalization and de-normalization process. To adapt to dynamic changes in spatial relationships among sensors, we propose a Dynamic-Static Fusion Learner (DSFL) module that effectively integrates features learned from both dynamic and static relations through an adaptive fusion ratio mechanism. Furthermore, we introduce a Stochastic Learner to capture the noisy components of spatio-temporal representations. Our approach outperforms state-of-the-art methods on weather prediction and traffic flow forecasting tasks. Experimental results show that our SFL efficiently preserves spatial relationships across various temporal normalization operations. Visualizations of the learned dynamic and static relations demonstrate that DSFL can capture both local and distant relationships between nodes.
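The concern about plain temporal normalization can be made concrete with a short NumPy sketch. It implements only the generic per-node normalize and de-normalize step, not the SFL module, whose design the abstract does not specify. The two synthetic sensors, their scales, and the function names are assumptions for illustration.

```python
import numpy as np

def temporal_normalize(x, eps=1e-5):
    """x: (time, nodes). Normalize each node's series over the time axis."""
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True) + eps
    return (x - mean) / std, mean, std

def temporal_denormalize(y, mean, std):
    """Invert temporal_normalize using the stored per-node statistics."""
    return y * std + mean

rng = np.random.default_rng(0)
# Two hypothetical sensors: same underlying signal, very different scales.
base = rng.normal(size=(48, 1))
x = np.hstack([base * 1.0 + 5.0, base * 20.0 + 100.0])

x_norm, mean, std = temporal_normalize(x)
x_back = temporal_denormalize(x_norm, mean, std)

# Ratio of the two nodes' scales before and after normalization.
scale_ratio_before = x.std(axis=0)[1] / x.std(axis=0)[0]
scale_ratio_after = x_norm.std(axis=0)[1] / x_norm.std(axis=0)[0]
```

Here the second sensor varies 20 times more than the first before normalization, while after normalization both have roughly unit scale. The round trip is lossless only because the per-node statistics are kept, which is the information a spatially aware module would need to retain.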

2506.21129 2026-06-03 cs.LG cs.AI 版本更新

Curriculum-Adapted Robust Reinforcement Learning for UAV Deconfliction in Adversarial Environments

对抗环境中无人机冲突消解的课程自适应鲁棒强化学习

Deepak Kumar Panda, Adolfo Perrusquia, Weisi Guo

发表机构 * Faculty of Engineering and Applied Sciences, Cranfield University(工程与应用科学学院,克兰菲尔德大学)

AI总结 提出一种课程引导的适应框架,通过渐进暴露于梯度对抗观测扰动并对齐时序差分误差分布,提升无人机在GNSS欺骗攻击下的鲁棒性和泛化能力。

详情
AI中文摘要

自主无人机(UAV)越来越依赖强化学习(RL)进行导航。然而,全球导航卫星系统(GNSS)欺骗攻击可能导致分布外观测偏移,破坏价值估计并降低任务性能。现有的鲁棒RL方法通常能提高对特定攻击模型的抵抗力,但往往无法泛化到训练中未遇到的攻击。为解决这一局限,我们提出一种课程引导的适应框架,该框架逐步将鲁棒策略暴露于强度递增的基于梯度的对抗观测扰动,同时对齐课程阶段间的时序差分(TD)误差分布。所提出的方法不是适应特定的攻击模型,而是保持TD误差一致性以促进跨攻击条件的可迁移性。我们进一步推导了一个TD空间泛化保证,表明如果测试时攻击引起的TD误差分布与最终课程阶段的分布足够接近,则由此产生的性能退化是有界的。该框架在具有动态3D障碍物的无人机冲突消解环境中进行评估,面对之前未见过的固定和动态GNSS欺骗攻击。在固定欺骗条件下,课程适应策略实现了近乎完美的任务成功率,而标准和鲁棒RL基线为20-56%。在动态障碍物引诱欺骗攻击下,它获得了最高的情节奖励,同时随着空中交通密度的增加,任务完成步骤最多减少了45%。

英文摘要

Autonomous unmanned aerial vehicles (UAVs) increasingly rely on reinforcement learning (RL) for navigation. However, global navigation satellite system (GNSS) spoofing attacks can induce out-of-distribution observation shifts that corrupt value estimation and degrade mission performance. Existing robust RL approaches typically improve resilience against specific attack models but often fail to generalize to attacks not encountered during training. To address this limitation, we propose a curriculum-guided adaptation framework that progressively exposes a robust policy to gradient-based adversarial observation perturbations of increasing intensity while aligning temporal-difference (TD) error distributions across curriculum stages. Rather than adapting to a particular attack model, the proposed approach preserves TD-error consistency to promote transferability across attack conditions. We further derive a TD-space generalization certificate showing that if the TD-error distribution induced by a test-time attack remains sufficiently close to that of the final curriculum stage, the resulting performance degradation is bounded. The framework is evaluated in a UAV deconfliction environment with dynamic 3D obstacles under previously unseen fixed and dynamic GNSS spoofing attacks. Under fixed spoofing conditions, the curriculum-adapted policy achieved near-perfect mission success rates, compared with 20-56% for standard and robust RL baselines. Under dynamic obstacle-luring spoofing attacks, it achieved the highest episodic rewards while reducing mission completion steps by up to 45% across increasing aerial traffic densities.

2506.01969 2026-06-03 cs.DC cs.AI cs.LG 版本更新

FlashMLA-ETAP: Efficient Transpose Attention Pipeline for Accelerating MLA Inference on NVIDIA H20 GPUs

FlashMLA-ETAP:用于加速NVIDIA H20 GPU上MLA推理的高效转置注意力流水线

Pengcuo Dege, Qiuming Luo, Rui Mao, Chang Kong

发表机构 * Tencent(腾讯) College of Computer Science and Software Engineering, Shenzhen University(深圳大学计算机科学与软件工程学院) College of Artificial Intelligence, Shenzhen Polytechnic University(深圳职业技术大学人工智能学院)

AI总结 针对单多GPU服务器部署DeepSeek-R1 671B模型时多头潜在注意力(MLA)推理效率低的问题,提出FlashMLA-ETAP框架,通过高效转置注意力流水线(ETAP)重配置注意力计算,在NVIDIA H20 GPU上实现2.78倍加速,并保持数值稳定性。

Comments Accepted by ICONIP2025

详情
AI中文摘要

多头潜在注意力(MLA)的高效推理面临在单台多GPU服务器上部署DeepSeek-R1 671B模型的挑战。本文介绍FlashMLA-ETAP,一种新颖的框架,用于增强NVIDIA H20 GPU上单实例部署场景的MLA推理。我们提出了高效转置注意力流水线(ETAP),通过转置重新配置注意力计算,使KV上下文长度与WGMMA操作中的\(M\)维度对齐,显著减少冗余计算。FlashMLA-ETAP在64K序列长度(批大小16)下比FlashMLA加速2.78倍,比FlashAttention-3和FlashInfer分别提升5.24倍和4.94倍,同时保持数值稳定性,均方根误差(RMSE)比FlashAttention-3低15.2倍(\(1.25 \times 10^{-5}\))。此外,ETAP的设计能够无缝集成到FlashAttention-3和FlashInfer等框架中,并有详细的理论分析支持。我们的工作解决了资源受限推理中的一个关键空白,为中端GPU提供了可扩展的解决方案,并为硬件感知优化的更广泛采用铺平了道路。代码可在https://github.com/pengcuo/FlashMLA-ETAP获取。

英文摘要

Deploying the DeepSeek-R1 671B model on a single multi-GPU server makes efficient Multi-Head Latent Attention (MLA) inference challenging. This paper introduces FlashMLA-ETAP, a novel framework that enhances MLA inference for the single-instance deployment scenario on NVIDIA H20 GPUs. We propose the Efficient Transpose Attention Pipeline (ETAP), which reconfigures attention computation through transposition to align the KV context length with the \(M\)-dimension in WGMMA operations, significantly reducing redundant computations. FlashMLA-ETAP achieves a 2.78x speedup over FlashMLA at 64K sequence length (batch size 16), with 5.24x and 4.94x improvements over FlashAttention-3 and FlashInfer, respectively, while maintaining numerical stability with a 15.2x lower RMSE (\(1.25 \times 10^{-5}\)) than FlashAttention-3. Furthermore, ETAP's design enables seamless integration into frameworks like FlashAttention-3 and FlashInfer, supported by a detailed theoretical analysis. Our work addresses a critical gap in resource-constrained inference, offering a scalable solution for mid-tier GPUs and paving the way for broader adoption in hardware-aware optimization. Code is available at https://github.com/pengcuo/FlashMLA-ETAP.
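The algebra behind computing attention in transposed order can be checked with a small NumPy sketch. This is not the ETAP kernel and involves no WGMMA instructions. The tensor shapes and function names are assumptions chosen for illustration. It only verifies that computing the score matrix as K Q^T, so that the long KV axis becomes the row dimension, yields the same output as the standard order.

```python
import numpy as np

def attention(q, k, v):
    """Standard order: scores are (n_q, n_kv), softmax over the KV axis."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return p @ v

def attention_transposed(q, k, v):
    """Transposed order: scores are (n_kv, n_q), so KV length is the row dimension."""
    s_t = k @ q.T / np.sqrt(q.shape[-1])
    # Softmax still runs over the KV axis, which is now axis 0.
    p_t = np.exp(s_t - s_t.max(axis=0, keepdims=True))
    p_t /= p_t.sum(axis=0, keepdims=True)
    # O^T = V^T P^T, transposed back to (n_q, d) at the end.
    return (v.T @ p_t).T

rng = np.random.default_rng(0)
q = rng.normal(size=(16, 64))    # few query rows, as in decoding (hypothetical sizes)
k = rng.normal(size=(1024, 64))  # long KV context
v = rng.normal(size=(1024, 64))

out = attention(q, k, v)
out_t = attention_transposed(q, k, v)
```

With only 16 query rows, the standard order leaves the row dimension of the score matrix small, whereas the transposed order places the 1024-long KV axis there, which is the alignment the abstract describes.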

2506.03087 2026-06-03 cs.LG cs.AI 版本更新

Do Explanations Increase the Risk of Decision Logic Leakage? Explanation-Guided Stealing of Graph Models

解释是否会增加决策逻辑泄露的风险?解释引导的图模型窃取

Bin Ma, Yuyuan Feng, Minhua Lin, Enyan Dai

发表机构 * The Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州)) Xiamen University(厦门大学) The Pennsylvania State University(宾夕法尼亚州立大学)

AI总结 研究解释机制可能泄露图神经网络决策逻辑的风险,提出一种结合解释对齐与数据增强的模型窃取框架,实验证明其优于传统方法。

详情
AI中文摘要

图神经网络(GNNs)已成为药物发现和金融分析等领域中分析图结构数据的重要工具,导致对模型透明度的需求日益增长。可解释GNNs的最新进展通过揭示影响预测的重要子图满足了这一需求,但这些解释机制可能无意中使这些模型面临安全风险。本文研究了此类解释如何潜在泄露可被利用进行模型窃取的关键决策逻辑。我们提出了EGSteal,一种新颖的窃取框架,它将用于捕获决策逻辑的解释对齐与用于在有限查询下高效训练的引导数据增强相结合,从而能够有效复制目标模型的预测行为和底层推理模式。在分子图数据集上的实验表明,我们的方法在模型窃取方面优于传统方法。这项工作突出了在敏感领域部署可解释GNNs时的重要安全考虑,并表明需要针对基于解释的攻击采取保护措施。我们的代码可在https://github.com/beanmah/EGSteal获取。

英文摘要

Graph Neural Networks (GNNs) have become essential tools for analyzing graph-structured data in domains such as drug discovery and financial analysis, leading to a growing demand for model transparency. Recent advances in explainable GNNs have addressed this need by revealing important subgraphs that influence predictions, but these explanation mechanisms may inadvertently expose these models to security risks. This paper investigates how such explanations potentially leak critical decision logic that can be exploited for model stealing. We propose EGSteal, a novel stealing framework that integrates explanation alignment for capturing decision logic with guided data augmentation for efficient training under limited queries, enabling effective replication of both the predictive behavior and underlying reasoning patterns of target models. Experiments on molecular graph datasets demonstrate that our approach shows advantages over conventional methods in model stealing. This work highlights important security considerations for the deployment of explainable GNNs in sensitive domains and suggests the need for protective measures against explanation-based attacks. Our code is available at https://github.com/beanmah/EGSteal.

2506.01075 2026-06-03 cs.DS cs.IT cs.LG math.IT 版本更新

Learning DNF through Generalized Fourier Representations

通过广义傅里叶表示学习DNF

Mohsen Heidari, Roni Khardon

发表机构 * Department of Computer Sciences, Indiana University, Bloomington, IN, USA(印第安纳大学计算机科学系,美国印第安纳州布卢明顿)

AI总结 针对非乘积分布下DNF学习难题,引入基于贝叶斯网络的广义傅里叶表示,证明合取式的L1谱范数有界性,实现DNF和决策树的可学习性。

Comments 60 pages

详情
AI中文摘要

布尔傅里叶表示在学习理论中被广泛使用,特别是在均匀分布和乘积分布下学习析取范式(DNF)。将这些结果扩展到非乘积分布一直是一个长期未解决的开放问题。我们通过引入一种广义傅里叶表示来应对这一挑战,该表示能够在广泛的一类非乘积分布下进行学习。我们的方法将任意分布$D$表示为贝叶斯网络(BN),并推导出相应的傅里叶展开。我们证明了使用成员查询来识别重系数的标准基于傅里叶的学习技术可以通过少量修改适应于这种广义表示。我们证明了对于差分有界树BN,合取式的$L_1$谱范数在这种展开下保持有界,显著推广了均匀分布的已知结果;匹配的下界证明了这些约束的必要性。利用这些结果,我们建立了DNF的可学习性以及决策树在此类分布下的不可知学习性。最后,我们提出了一种学习差分有界树BN分布的算法,将我们的结果扩展到分布未知的场景。

英文摘要

The Boolean Fourier representation has been widely used in learning theory, particularly for learning Disjunctive Normal Form (DNF) under uniform and product distributions. Extending these results to non-product distributions has remained a longstanding open problem. We address this challenge by introducing a generalized Fourier representation that enables learning under a broad class of non-product distributions. Our approach represents any distribution $D$ as a Bayesian network (BN) and derives a corresponding Fourier expansion. We show that standard Fourier-based learning techniques using membership queries to identify heavy coefficients can be adapted to this generalized representation with minor modifications. We prove that the $L_1$ spectral norm of conjunctions remains bounded under this expansion for difference-bounded tree BNs, significantly generalizing the known result for uniform distributions; matching lower bounds demonstrate the necessity of these constraints. Using these results, we establish the learnability of DNF and the agnostic learnability of decision trees under such distributions. Finally, we present an algorithm for learning difference-bounded tree BN distributions, extending our results to settings where the distribution is unknown.
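For readers unfamiliar with the quantity being bounded, the sketch below computes the classical Boolean Fourier expansion of a conjunction by brute force under the uniform distribution and sums the absolute coefficients. It covers only the known uniform-distribution case, not the generalized Bayesian-network expansion proposed in the paper. The encoding (+1 for true, 0/1-valued output) and the function names are assumptions.

```python
import itertools
import numpy as np

def fourier_coefficients(f, n):
    """Brute-force Fourier expansion of f: {-1,+1}^n -> R under the uniform distribution."""
    points = list(itertools.product([-1, 1], repeat=n))
    values = np.array([f(x) for x in points], dtype=float)
    coeffs = {}
    for r in range(n + 1):
        for subset in itertools.combinations(range(n), r):
            # Parity character chi_S(x) = prod_{i in S} x_i (empty product is 1).
            chi = np.array([np.prod([x[i] for i in subset]) for x in points])
            # Coefficient is the uniform average of f(x) * chi_S(x).
            coeffs[subset] = float(np.mean(values * chi))
    return coeffs

def conjunction(x):
    """0/1-valued AND of all variables, with +1 encoding 'true' (an assumed convention)."""
    return 1.0 if all(v == 1 for v in x) else 0.0

coeffs = fourier_coefficients(conjunction, n=3)
l1_norm = sum(abs(c) for c in coeffs.values())
```

For this 3-variable conjunction every one of the 8 coefficients equals 1/8, so the L1 spectral norm is 1 no matter how many variables are involved. This constant bound is the uniform-distribution fact that the paper generalizes.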

2506.00431 2026-06-03 cs.LG 版本更新

TIDFormer: Exploiting Temporal and Interactive Dynamics Makes A Great Dynamic Graph Transformer

TIDFormer: 利用时间和交互动态打造卓越的动态图Transformer

Jie Peng, Zhewei Wei, Yuhang Ye

发表机构 * Renmin University of China(中国人民大学) Huawei Shenzhen, Guangdong China(华为深圳,广东中国)

AI总结 提出TIDFormer,通过高效利用时间和交互动态,并设计可解释的自注意力机制,在多个动态图数据集上超越现有模型。

Comments KDD2025

详情
AI中文摘要

由于自注意力机制(SAMs)在序列建模中捕捉依赖关系的能力,一些现有的动态图神经网络(DGNNs)利用具有各种编码设计的Transformer架构来捕捉动态图的序列演化。然而,这些基于Transformer的DGNNs的有效性和效率差异很大,凸显了在动态图上正确定义SAM以及在不增加额外复杂模块的情况下全面编码时间和交互动态的重要性。在这项工作中,我们提出了TIDFormer,一种以高效方式充分利用时间和交互动态的动态图Transformer。我们阐明并验证了我们提出的SAM的可解释性,解决了先前工作中在动态图上其定义不可解释的开放问题。为了分别建模时间和交互动态,我们利用基于日历的时间划分信息,并仅使用采样的一阶邻居为二分图和非二分图提取信息丰富的交互嵌入。此外,我们通过简单的分解捕捉历史交互模式的潜在变化,联合建模时间和交互特征。我们在多个动态图数据集上进行了大量实验,以验证TIDFormer的有效性和效率。实验结果表明,TIDFormer表现出色,在大多数数据集和实验设置中超越了最先进的模型。此外,与之前基于Transformer的方法相比,TIDFormer展现出显著的效率优势。

英文摘要

Due to the proficiency of self-attention mechanisms (SAMs) in capturing dependencies in sequence modeling, several existing dynamic graph neural networks (DGNNs) utilize Transformer architectures with various encoding designs to capture sequential evolutions of dynamic graphs. However, the effectiveness and efficiency of these Transformer-based DGNNs vary significantly, highlighting the importance of properly defining the SAM on dynamic graphs and comprehensively encoding temporal and interactive dynamics without extra complex modules. In this work, we propose TIDFormer, a dynamic graph TransFormer that fully exploits Temporal and Interactive Dynamics in an efficient manner. We clarify and verify the interpretability of our proposed SAM, addressing the open problem of its uninterpretable definitions on dynamic graphs in previous works. To model the temporal and interactive dynamics, respectively, we utilize the calendar-based time partitioning information and extract informative interaction embeddings for both bipartite and non-bipartite graphs using merely the sampled first-order neighbors. In addition, we jointly model temporal and interactive features by capturing potential changes in historical interaction patterns through a simple decomposition. We conduct extensive experiments on several dynamic graph datasets to verify the effectiveness and efficiency of TIDFormer. The experimental results demonstrate that TIDFormer excels, outperforming state-of-the-art models across most datasets and experimental settings. Furthermore, TIDFormer exhibits significant efficiency advantages compared to previous Transformer-based methods.

2505.20853 2026-06-03 cs.LG cs.AI 版本更新

Cooperation of Experts: Fusing Heterogeneous Information with Large Margin

专家合作:大间隔融合异构信息

Shuo Wang, Shunyang Huang, Jinghui Yuan, Zhixiang Shen, Zhao Kang

AI总结 提出专家合作框架,通过大间隔机制融合异构信息,在统一异构多路网络中编码多类型数据,实现鲁棒且互补的知识提取。

Comments Accepted at the 42nd International Conference on Machine Learning (ICML 2025)

详情
Journal ref
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:63169-63185, 2025
AI中文摘要

融合异构信息仍然是现代数据分析中的一个持续挑战。尽管已取得显著进展,但现有方法往往未能考虑对象模式在不同语义空间中的固有异质性。为解决这一局限性,我们提出了专家合作(CoE)框架,该框架将多类型信息编码到统一的异构多路网络中。通过克服模态和连接差异,CoE为捕捉现实世界复杂数据的复杂结构提供了一个强大且灵活的模型。在我们的框架中,专用编码器充当领域特定专家,每个专家专门学习特定语义空间中的不同关系模式。为了增强鲁棒性并提取互补知识,这些专家通过一种新颖的大间隔机制进行协作,该机制由定制的优化策略支持。严格的理论分析保证了框架的可行性和稳定性,而跨多种基准的广泛实验证明了其优越的性能和广泛的适用性。我们的代码可在 https://github.com/strangeAlan/CoE 获取。

英文摘要

Fusing heterogeneous information remains a persistent challenge in modern data analysis. While significant progress has been made, existing approaches often fail to account for the inherent heterogeneity of object patterns across different semantic spaces. To address this limitation, we propose the Cooperation of Experts (CoE) framework, which encodes multi-typed information into unified heterogeneous multiplex networks. By overcoming modality and connection differences, CoE provides a powerful and flexible model for capturing the intricate structures of real-world complex data. In our framework, dedicated encoders act as domain-specific experts, each specializing in learning distinct relational patterns in specific semantic spaces. To enhance robustness and extract complementary knowledge, these experts collaborate through a novel large margin mechanism supported by a tailored optimization strategy. Rigorous theoretical analyses guarantee the framework's feasibility and stability, while extensive experiments across diverse benchmarks demonstrate its superior performance and broad applicability. Our code is available at https://github.com/strangeAlan/CoE.

2505.20142 2026-06-03 cs.LG 版本更新

Grounding Functional Similarity by Invariance-Aware Model Stitching

通过不变性感知模型拼接实现功能相似性评估

Ioannis Athanasiadis, Anmar Karmush, Michael Felsberg

AI总结 针对标准模型拼接忽略不变性导致功能相似性误判的问题,提出前向-后向兼容性要求下的不变性感知模型拼接方法,揭示隐藏的功能差异。

详情
AI中文摘要

在深度学习中,功能相似性评估量化了独立训练的模型学习相似输入-输出关系的程度。在模型拼接中,功能相似性被表述为表示前向兼容性,即两个模型的表示能否对齐以解决给定任务。然而,最近的研究强调了一个关键限制:依赖不同信息线索的模型仍可能产生兼容的表示,使其看起来具有误导性的相似性(Smith et al., 2025)。我们将此失败归因于标准模型拼接本质上对拼接模型的不变性特性视而不见。为解决这一限制,我们引入了前向-后向兼容性要求,并据此制定了不变性感知模型拼接。通过分析关键拼接配置,我们研究了前向和后向兼容性之间的相互作用,表明不变性感知模型拼接为功能相似性评估提供了更原则性的方法,同时揭示了先前被掩盖的功能差异。

英文摘要

In deep learning, functional similarity evaluation quantifies the extent to which independently trained models learn similar input-output relationships. In model stitching, functional similarity is framed as representation forward compatibility, i.e., whether the representations of two models can be aligned to solve a given task. Recent studies, however, highlight a critical limitation: models relying on different information cues can still produce compatible representations, making them appear misleadingly similar (Smith et al., 2025). We attribute this failure to standard model stitching being inherently blind to the invariance properties of the stitched models. To address this limitation, we introduce the forward-backward compatibility requirement under which we formulate the invariance-aware model stitching. Through analyzing key stitching configurations, we study the interplay between forward and backward compatibility, showing that invariance-aware model stitching provides a more principled approach to functional similarity evaluation while revealing functional discrepancies previously obscured.

2502.08006 2026-06-03 cs.LG cs.AI stat.ML 版本更新

Greed is Good: A Unifying Perspective on Guided Generation

贪婪即美德:引导生成的统一视角

Zander W. Blasingame, Chen Liu

AI总结 本文通过将后验引导视为端到端引导的贪婪策略,统一了两种梯度引导方法,并提出了在计算与精度之间权衡的插值方法,在逆图像问题和分子生成任务上验证了有效性。

Comments Accepted at NeurIPS 2025

详情
AI中文摘要

无训练引导生成是一种广泛使用且强大的技术,允许最终用户对流/扩散模型的生成过程施加进一步控制。一般来说,针对基于梯度的引导,已经出现了两种技术系列:即后验引导(即通过目标预测模型将当前样本投影到目标分布进行引导)和端到端引导(即通过在整个ODE求解过程中执行反向传播进行引导)。在这项工作中,我们表明这两个看似分离的系列实际上可以通过将后验引导视为端到端引导的贪婪策略来统一。我们探索了这两个系列之间的理论联系,并深入分析了这两种技术相对于连续理想梯度的关系。基于这一分析,我们提出了一种在这两个系列之间插值的方法,从而在引导梯度的计算与精度之间实现权衡。然后,我们在几个逆图像问题和性质引导的分子生成任务上验证了这项工作。

英文摘要

Training-free guided generation is a widely used and powerful technique that allows the end user to exert further control over the generative process of flow/diffusion models. Generally speaking, two families of techniques have emerged for solving this problem for gradient-based guidance: namely, posterior guidance (i.e., guidance via projecting the current sample to the target distribution via the target prediction model) and end-to-end guidance (i.e., guidance by performing backpropagation throughout the entire ODE solve). In this work, we show that these two seemingly separate families can actually be unified by looking at posterior guidance as a greedy strategy of end-to-end guidance. We explore the theoretical connections between these two families and provide an in-depth theoretical analysis of these two techniques relative to the continuous ideal gradients. Motivated by this analysis, we then show a method for interpolating between these two families, enabling a trade-off between compute and accuracy of the guidance gradients. We then validate this work on several inverse image problems and property-guided molecular generation.

2505.08886 2026-06-03 cs.CV cs.LG 版本更新

Optimizing Neuro-Fuzzy and Colonial Competition Algorithms for Skin Cancer Diagnosis in Dermatoscopic Images

优化神经模糊与殖民竞争算法用于皮肤镜图像中的皮肤癌诊断

Hamideh Khaleghpour, Brett McKinney

AI总结 本研究融合图像处理、神经模糊和殖民竞争算法,在ISIC数据库的560张皮肤镜图像上实现94%准确率,旨在辅助临床早期黑色素瘤检测。

Comments 7 pages, 10 figures. Accepted at the 2nd Asia Pacific Computer Systems Conference (APCS 2024), March 15-17, 2024

详情
Journal ref
Proceedings of the 2024 7th International Conference on Information and Computer Technologies, pages 166-172, IEEE, March 2024
AI中文摘要

皮肤癌发病率的上升,加上公众意识有限和临床专业知识的不足,凸显了对先进诊断辅助工具的迫切需求。人工智能(AI)已成为该领域有前景的工具,特别是在区分恶性与良性皮肤病变方面。利用公开可用的皮肤病变数据集,研究人员一直在开发基于AI的诊断解决方案。然而,此类计算机系统在临床环境中的整合仍处于初期阶段。本研究旨在通过融合图像处理技术和机器学习算法(特别是神经模糊和殖民竞争方法)来弥合这一差距。应用于ISIC数据库中的皮肤镜图像,我们的方法在560张图像的数据集上达到了94%的显著准确率。这些结果强调了我们的方法在帮助临床医生早期检测黑色素瘤方面的潜力,从而为皮肤癌诊断做出重要贡献。

英文摘要

The rising incidence of skin cancer, coupled with limited public awareness and a shortfall in clinical expertise, underscores an urgent need for advanced diagnostic aids. Artificial Intelligence (AI) has emerged as a promising tool in this domain, particularly for distinguishing malignant from benign skin lesions. Leveraging publicly available datasets of skin lesions, researchers have been developing AI-based diagnostic solutions. However, the integration of such computer systems in clinical settings is still nascent. This study aims to bridge this gap by employing a fusion of image processing techniques and machine learning algorithms, specifically neuro-fuzzy and colonial competition approaches. Applied to dermoscopic images from the ISIC database, our method achieved a notable accuracy of 94% on a dataset of 560 images. These results underscore the potential of our approach in aiding clinicians in the early detection of melanoma, thereby contributing significantly to skin cancer diagnostics.

2505.07068 2026-06-03 stat.ML cs.LG math.DS 版本更新

A Sparse Bayesian Learning Algorithm for Estimation of Interaction Kernels in Motsch-Tadmor Model

Motsch-Tadmor模型中交互核估计的稀疏贝叶斯学习算法

Jinchao Feng, Sui Tang

发表机构 * Department of Mathematics, Great Bay University(大湾区大学数学系) Department of Mathematics, University of California, Santa Barbara(加州大学圣芭芭拉分校数学系)

AI总结 针对Motsch-Tadmor模型中非对称交互核的估计问题,提出一种基于变分框架和稀疏贝叶斯学习的算法,实现核函数的鲁棒识别与不确定性量化。

Comments 23 pages

详情
AI中文摘要

本文基于观测轨迹数据,研究Motsch-Tadmor模型中非对称交互核的数据驱动辨识。所考虑的模型由一类半线性演化方程控制,其中交互核定义了一个归一化的、依赖于状态的拉普拉斯算子,该算子支配集体动力学。为了解决由此产生的非线性逆问题,我们提出一个变分框架,利用控制方程的隐式形式重新表述核辨识问题,将其简化为子空间辨识问题。我们建立了一个可辨识性结果,刻画了交互核在尺度意义下可唯一恢复的条件。为了鲁棒地求解逆问题,我们开发了一种稀疏贝叶斯学习算法,该算法引入信息先验进行正则化,量化不确定性,并实现原则性的模型选择。在代表性交互粒子系统上的大量数值实验表明,所提出的框架在不同噪声水平和数据范围内具有准确性、鲁棒性和可解释性。

英文摘要

In this paper, we investigate the data-driven identification of asymmetric interaction kernels in the Motsch-Tadmor model based on observed trajectory data. The model under consideration is governed by a class of semilinear evolution equations, where the interaction kernel defines a normalized, state-dependent Laplacian operator that governs collective dynamics. To address the resulting nonlinear inverse problem, we propose a variational framework that reformulates kernel identification using the implicit form of the governing equations, reducing it to a subspace identification problem. We establish an identifiability result that characterizes conditions under which the interaction kernel can be uniquely recovered up to scale. To solve the inverse problem robustly, we develop a sparse Bayesian learning algorithm that incorporates informative priors for regularization, quantifies uncertainty, and enables principled model selection. Extensive numerical experiments on representative interacting particle systems demonstrate the accuracy, robustness, and interpretability of the proposed framework across a range of noise levels and data regimes.

2504.01250 2026-06-03 cs.LG cs.SY eess.SY 版本更新

R2DN: Scalable Parameterization of Contracting and Lipschitz Recurrent Deep Networks

R2DN:收缩和Lipschitz循环深度网络的可扩展参数化

Nicholas H. Barbara, Ruigang Wang, Ian R. Manchester

发表机构 * Australian Centre for Robotics(澳大利亚机器人中心) School of Aerospace, Mechanical and Mechatronic Engineering(航空航天、机械与机电工程学院) The University of Sydney(悉尼大学)

AI总结 本文提出鲁棒循环深度网络(R2DN),通过将线性时不变系统与1-Lipschitz深度前馈网络反馈互联,直接参数化权重以保证模型稳定(收缩)且对小输入扰动鲁棒(Lipschitz),相比循环均衡网络(REN)无需迭代求解均衡层,显著提升GPU上的推理和反向传播速度,并在非线性系统辨识、观测器设计和基于学习的反馈控制中实现相近性能下训练和推理速度提升一个数量级。

详情
AI中文摘要

本文提出鲁棒循环深度网络(R2DN),这是一种用于机器学习和数据驱动控制的鲁棒循环神经网络的可扩展参数化。我们将R2DN构造为线性时不变系统与1-Lipschitz深度前馈网络的反馈互联,并直接参数化权重,使得我们的模型天生稳定(收缩)且对小输入扰动鲁棒(Lipschitz)。我们的参数化使用了类似于先前提出的循环均衡网络(REN)的结构,但无需在每个时间步迭代求解均衡层。这加速了GPU上的模型推理和反向传播,并且与REN相比,使得网络规模、批大小和输入序列长度的扩展在计算上可行。我们在非线性系统辨识、观测器设计和基于学习的反馈控制三个代表性问题上将R2DN与REN进行比较。我们发现,在相似的测试集性能下,训练和推理速度均提升一个数量级,并且它们在模型表达能力方面具有更好的可扩展性。

英文摘要

This paper presents the Robust Recurrent Deep Network (R2DN), a scalable parameterization of robust recurrent neural networks for machine learning and data-driven control. We construct R2DNs as the feedback interconnection of a linear time-invariant system and a 1-Lipschitz deep feedforward network, and directly parameterize the weights so that our models are stable (contracting) and robust to small input perturbations (Lipschitz) by design. Our parameterization uses a structure similar to the previously-proposed recurrent equilibrium network (REN), but without the requirement to iteratively solve an equilibrium layer at each time-step. This speeds up both model inference and backpropagation on GPUs, and makes it computationally feasible to scale up the network size, batch size, and input sequence length in comparison to RENs. We compare R2DNs to RENs on three representative problems in nonlinear system identification, observer design, and learning-based feedback control. We find that training and inference are both up to an order of magnitude faster with similar test set performance, and that they scale more favorably with respect to model expressivity.

2502.03139 2026-06-03 astro-ph.CO astro-ph.IM cs.LG 版本更新

Fast Sampling of Cosmological Initial Conditions with Gaussian Neural Posterior Estimation

基于高斯神经后验估计的宇宙学初始条件快速采样

Oleg Savchenko, Guillermo Franco Abellán, Florian List, Noemi Anau Montel, Christoph Weniger

发表机构 * GRAPPA Institute, Institute for Theoretical Physics Amsterdam, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands(阿姆斯特丹大学GRAPPA研究所、阿姆斯特丹理论物理研究所,Science Park 904,1098 XH 阿姆斯特丹,荷兰) Department of Astrophysics, University of Vienna, Türkenschanzstraße 17, 1180 Vienna, Austria(维也纳大学天体物理学系,Türkenschanzstraße 17,1180 维也纳,奥地利)

AI总结 提出一种基于模拟推理的方法,通过高斯后验建模和神经网络估计,实现从晚期观测数据快速重建宇宙初始密度场,比现有方法快数个数量级。

Comments 9 + 2 pages, 7 figures, 1 table. Comments welcome!

详情
Journal ref
Mon Not R Astron Soc (2026)
AI中文摘要

了解宇宙大尺度结构在宇宙时间中形成的原初物质密度场对宇宙学至关重要。然而,从晚期观测重建这些宇宙学初始条件是一项著名的困难任务,需要先进的宇宙学模拟器和复杂的统计方法来探索数百万维的参数空间。我们展示了如何利用基于模拟的推理(SBI)来解决这个问题,并以模拟高效的方式使用通用的不可微模拟器获得数据约束的原初暗物质密度场实现。我们的方法适用于完整的高分辨率暗物质$N$体模拟,并基于将约束初始条件的后验分布建模为傅里叶空间中对角协方差矩阵的高斯分布。因此,我们可以在单个GPU上几秒内生成数千个后验样本,比现有方法快数个数量级,为宇宙学场的顺序SBI铺平了道路。此外,我们对协方差与波数的依赖关系进行了解析拟合,有效地将任何初始条件的点估计器转化为快速采样器。我们通过汇总统计将获得的样本与真实值进行比较,并执行贝叶斯一致性检验,验证了样本的有效性。

英文摘要

Knowledge of the primordial matter density field from which the large-scale structure of the Universe emerged over cosmic time is of fundamental importance for cosmology. However, reconstructing these cosmological initial conditions from late-time observations is a notoriously difficult task, which requires advanced cosmological simulators and sophisticated statistical methods to explore a multi-million-dimensional parameter space. We show how simulation-based inference (SBI) can be used to tackle this problem and to obtain data-constrained realisations of the primordial dark matter density field in a simulation-efficient way with general non-differentiable simulators. Our method is applicable to full high-resolution dark matter $N$-body simulations and is based on modelling the posterior distribution of the constrained initial conditions to be Gaussian with a diagonal covariance matrix in Fourier space. As a result, we can generate thousands of posterior samples within seconds on a single GPU, orders of magnitude faster than existing methods, paving the way for sequential SBI for cosmological fields. Furthermore, we perform an analytical fit of the estimated dependence of the covariance on the wavenumber, effectively transforming any point-estimator of initial conditions into a fast sampler. We test the validity of our obtained samples by comparing them to the true values with summary statistics and performing a Bayesian consistency test.
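The sampling step described in the abstract, a Gaussian posterior with a diagonal covariance in Fourier space, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `sample_fourier_gaussian`, the 2D grid, and the toy scale function passed as `sigma_of_k` are all assumptions made for illustration.

```python
import numpy as np

def sample_fourier_gaussian(mean_field, sigma_of_k, n_samples, rng):
    """Draw real fields whose Fourier modes are independent Gaussians.

    mean_field : (n, n) real array, standing in for a point estimate.
    sigma_of_k : callable mapping |k| to a per-mode scale factor
                 (hypothetical stand-in for a fitted covariance).
    """
    n = mean_field.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n)        # wavenumbers along one axis
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    sigma = sigma_of_k(np.sqrt(kx**2 + ky**2))   # (n, n) scale per mode

    # White noise in real space has independent Fourier modes with the
    # Hermitian symmetry of a real field; a scale that depends only on |k|
    # preserves both properties.
    white = rng.standard_normal((n_samples, n, n))
    modes = np.fft.fft2(white) * sigma           # FFT over the last two axes
    return mean_field + np.fft.ifft2(modes).real
```

Because each sample costs one forward and one inverse FFT, drawing many samples is cheap once the mean and the per-mode scale are available, which is the property the abstract relies on.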

2502.02260 2026-06-03 cs.LG cs.CR 版本更新

Position: Adversarial ML for LLMs Is Not Making Any Progress

立场:针对LLM的对抗性机器学习并未取得任何进展

Javier Rando, Jie Zhang, Nicholas Carlini, Florian Tramèr

发表机构 * GitHub University of California, Berkeley(加州大学伯克利分校)

AI总结 本文认为,在大语言模型时代,对抗性机器学习研究的问题定义更模糊、更难解决且更难以评估,可能导致未来十年仍无法取得有意义进展。

Comments Accepted at ICML 2026 Position Paper Track

详情
AI中文摘要

在过去十年中,大量研究工作致力于保护在对抗性环境中运行的机器学习模型。然而,即使是简单的“玩具”问题(例如,对微小对抗扰动的鲁棒性),进展也很缓慢,并且常常受到非严格评估的阻碍。如今,对抗性机器学习研究已转向研究更大规模、通用目的的语言模型。在这篇立场论文中,我们认为情况现在更糟:在大语言模型时代,对抗性机器学习研究的问题(1)定义更不明确,(2)更难解决,以及(3)更难以评估。因此,我们警告说,又一个十年的对抗性机器学习工作可能无法产生有意义的进展。

英文摘要

In the past decade, considerable research effort has been devoted to securing machine learning (ML) models that operate in adversarial settings. Yet, progress has been slow even for simple "toy" problems (e.g., robustness to small adversarial perturbations) and is often hindered by non-rigorous evaluations. Today, adversarial ML research has shifted towards studying larger, general-purpose language models. In this position paper, we argue that the situation is now even worse: in the era of LLMs, the field of adversarial ML studies problems that are (1) less clearly defined, (2) harder to solve, and (3) even more challenging to evaluate. As a result, we caution that yet another decade of work on adversarial ML may fail to produce meaningful progress.

2501.02173 2026-06-03 cs.IR cs.LG 版本更新

The Efficiency vs. Accuracy Trade-off: Optimizing RAG-Enhanced LLM Recommender Systems Using Multi-Head Early Exit

效率与准确性的权衡:使用多头早期退出优化RAG增强的LLM推荐系统

Huixue Zhou, Hengrui Gu, Xi Liu, Kaixiong Zhou, Mingfu Liang, Yongkang Xiao, Srinivas Govindan, Piyush Chawla, Jiyan Yang, Xiangfei Meng, Huayu Li, Buyun Zhang, Liang Luo, Wen-Yen Chen, Yiping Han, Bo Long, Rui Zhang, Tianlong Chen

发表机构 * Meta Platforms(Meta平台) University of Minnesota(明尼苏达大学) NCSU(北卡罗来纳州立大学) UNC at Chapel Hill(Chapel Hill分校,北卡罗来纳大学)

AI总结 提出结合检索增强生成(RAG)与多头早期退出架构的优化框架,通过图卷积网络(GCN)高效检索和动态推理终止,在降低计算时间的同时保持或提升点击率(CTR)预测准确性。

详情
AI中文摘要

在推荐系统中部署大型语言模型(LLM)以预测点击率(CTR)需要在计算效率和预测准确性之间取得微妙的平衡。本文提出一个优化框架,结合检索增强生成(RAG)与创新的多头早期退出架构,同时增强这两个方面。通过集成图卷积网络(GCN)作为高效检索机制,我们能够显著减少数据检索时间,同时保持高模型性能。采用的早期退出策略允许动态终止模型推理,利用跨多个头的实时预测置信度评估。这不仅加快了LLM的响应速度,还维持或提高了其准确性,使其非常适合实时应用场景。我们的实验表明,该架构有效减少了计算时间,而不牺牲可靠推荐交付所需的准确性,为商业系统中高效、实时的LLM部署建立了新标准。

英文摘要

The deployment of Large Language Models (LLMs) in recommender systems for predicting Click-Through Rates (CTR) necessitates a delicate balance between computational efficiency and predictive accuracy. This paper presents an optimization framework that combines Retrieval-Augmented Generation (RAG) with an innovative multi-head early exit architecture to concurrently enhance both aspects. By integrating Graph Convolutional Networks (GCNs) as efficient retrieval mechanisms, we are able to significantly reduce data retrieval times while maintaining high model performance. The early exit strategy employed allows for dynamic termination of model inference, utilizing real-time predictive confidence assessments across multiple heads. This not only quickens the responsiveness of LLMs but also upholds or improves their accuracy, making it ideal for real-time application scenarios. Our experiments demonstrate how this architecture effectively decreases computation time without sacrificing the accuracy needed for reliable recommendation delivery, establishing a new standard for efficient, real-time LLM deployment in commercial systems.
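The dynamic termination idea in the abstract can be sketched as a confidence-gated loop over exit heads. The sketch below is a generic illustration, not the paper's architecture: the function name `predict_with_early_exit`, the representation of heads as zero-argument callables, and the confidence measure `max(p, 1 - p)` are assumptions.

```python
from typing import Callable, List, Tuple

def predict_with_early_exit(
    heads: List[Callable[[], float]],
    threshold: float = 0.9,
) -> Tuple[float, int]:
    """Evaluate exit heads in order and stop once one is confident enough.

    Each head returns a click probability in [0, 1]. Confidence is taken
    as max(p, 1 - p), an illustrative choice.
    Returns (probability, index of the head that produced it).
    """
    if not heads:
        raise ValueError("at least one head is required")
    p = 0.5
    for i, head in enumerate(heads):
        p = head()
        if max(p, 1.0 - p) >= threshold:
            return p, i           # confident: later heads are never evaluated
    return p, len(heads) - 1      # no head was confident: use the last one
```

Lowering `threshold` trades accuracy for speed, since more inputs leave at shallow heads.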

2412.05109 2026-06-03 cs.LG cs.IT math.IT math.PR math.ST stat.ML stat.TH 版本更新

Generating Rectifiable Measures through Neural Networks

通过神经网络生成可求积测度

Erwin Riegler, Alex Bühler, Yang Pan, Helmut Bölcskei

AI总结 本文证明可数m-可求积测度可通过ReLU神经网络将[0,1]上的一维勒贝格测度推前得到,在Wasserstein距离下达到任意小逼近误差,且所需网络数量上界为2^{O(ε^{-m} log^2 ε)},该率等于可求积参数m。

详情
AI中文摘要

我们推导了(可数)$m$-可求积测度类的通用逼近结果。具体地,我们证明$m$-可求积测度可以通过ReLU神经网络将$[0,1]$上的一维勒贝格测度推前得到,在Wasserstein距离下达到任意小的逼近误差。此外,所考虑网络的权重是量化和有界的,达到逼近误差$\varepsilon$所需的ReLU神经网络数量不超过$2^{b(\varepsilon)}$,其中$b(\varepsilon)=\mathcal{O}(\varepsilon^{-m}\log^2(\varepsilon))$。这一结果改进了Perekrestenko等人的引理IX.4,因为它表明当$\varepsilon$趋于零时$b(\varepsilon)$趋于无穷的速率等于可求积参数$m$,而$m$可能远小于环境维度。我们将此结果推广到可数$m$-可求积测度,并证明该速率仍然等于可求积参数$m$,前提是(除其他技术假设外)测度在可数$m$-可求积支撑集的各个分量上指数衰减。

英文摘要

We derive universal approximation results for the class of (countably) $m$-rectifiable measures. Specifically, we prove that $m$-rectifiable measures can be approximated as push-forwards of the one-dimensional Lebesgue measure on $[0,1]$ using ReLU neural networks with arbitrarily small approximation error in terms of Wasserstein distance. What is more, the weights in the networks under consideration are quantized and bounded and the number of ReLU neural networks required to achieve an approximation error of $\varepsilon$ is no larger than $2^{b(\varepsilon)}$ with $b(\varepsilon)=\mathcal{O}(\varepsilon^{-m}\log^2(\varepsilon))$. This result improves Lemma IX.4 in Perekrestenko et al. as it shows that the rate at which $b(\varepsilon)$ tends to infinity as $\varepsilon$ tends to zero equals the rectifiability parameter $m$, which can be much smaller than the ambient dimension. We extend this result to countably $m$-rectifiable measures and show that this rate still equals the rectifiability parameter $m$ provided that, among other technical assumptions, the measure decays exponentially on the individual components of the countably $m$-rectifiable support set.
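As a toy illustration of a push-forward of the Lebesgue measure on $[0,1]$ through a ReLU network, the sketch below maps uniform samples onto a V-shaped curve in $\mathbb{R}^2$, which supports a 1-rectifiable measure. The network and the names `relu_curve` and `sample_pushforward` are hypothetical and are not taken from the paper.

```python
import random

def relu(x: float) -> float:
    return x if x > 0.0 else 0.0

def relu_curve(t: float) -> tuple:
    """One-hidden-layer ReLU map [0, 1] -> R^2 tracing a V-shaped curve.

    Uses |x| = relu(x) + relu(-x), so the second coordinate is |t - 0.5|.
    """
    h0 = relu(t)            # equals t on [0, 1]
    h1 = relu(t - 0.5)
    h2 = relu(0.5 - t)
    return (h0, h1 + h2)

def sample_pushforward(n: int, seed: int = 0) -> list:
    """Samples from the push-forward of Uniform[0, 1] under relu_curve."""
    rng = random.Random(seed)
    return [relu_curve(rng.random()) for _ in range(n)]
```

Every sample lies on the curve $y = |x - 0.5|$, so the generated measure is concentrated on a one-dimensional set even though the ambient dimension is two.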

2409.08958 2026-06-03 cs.LG cs.AI physics.comp-ph physics.flu-dyn 版本更新

PINNfluence: Interpreting PINNs through Influence Functions

PINNfluence: 通过影响函数解释 PINN

Aleksander Krasowski, Jonas R. Naujoks, Moritz Weckbecker, Galip Ü. Yolcu, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek, René P. Klausen

发表机构 * Technical University of Munich(慕尼黑技术大学) Max Planck Institute for Intelligent Systems(智能系统马克斯·普朗克研究所) University of Tübingen(图宾根大学) ETH Zurich(苏黎世联邦理工学院)

AI总结 提出 PINNfluence 框架,基于影响函数对物理信息神经网络进行训练数据归因,实现预测、损失分量和训练数据点之间的细粒度归因,并通过基准实验区分训练好与差的 PINN 的结构特征。

Comments Accepted at ICML 2026

详情
AI中文摘要

物理信息神经网络(PINN)已成为物理科学中求解偏微分方程(PDE)的强大深度学习方法,但其行为在很大程度上仍然不透明,通常通过故障模式分析而非显式可解释性来理解。为了解决这个问题,我们引入了 PINNfluence,这是一个基于影响函数解释 PINN 的训练数据归因框架。通过将影响函数扩展到复合物理信息训练目标,我们实现了预测、损失分量和训练数据点之间的细粒度归因。通过跨各种 PDE 的基准实验,我们证明了影响模式提供了区分训练良好和训练不良的 PINN 结构特征的细粒度诊断。因此,PINNfluence 通过数据视角为理解和提高 PINN 的可靠性开辟了新途径。

英文摘要

Physics-informed neural networks (PINNs) have emerged as a powerful deep learning approach for solving partial differential equations (PDEs) in the physical sciences, yet their behavior remains largely opaque and is typically understood through failure mode analyses rather than explicit interpretability. To address this issue, we introduce PINNfluence, a training data attribution framework for interpreting PINNs based on influence functions. By extending influence functions to composite physics-informed training objectives, we enable fine-grained attribution between predictions, loss components, and training data points. Through benchmark experiments across various PDEs, we demonstrate that influence patterns provide granular diagnostics that distinguish structural characteristics across well-trained and poorly-trained PINNs. PINNfluence thus opens a new avenue for understanding and improving the reliability of PINNs through the lens of their data.

2410.14573 2026-06-03 cs.LG cs.AI 版本更新

Building Trust in Black-box Optimization: A Comprehensive Framework for Explainability

在黑盒优化中建立信任:可解释性的综合框架

Nazanin Nezami, Hadis Anahideh

发表机构 * University of Illinois Chicago(伊利诺伊大学芝加哥分校)

AI总结 提出一套模型无关的指标IEMSO,通过采样核心、批次属性、优化过程和特征重要性四类指标,增强代理优化方法的透明性和可解释性。

详情
AI中文摘要

在受限评估预算内优化昂贵的黑盒函数在许多实际应用中面临重大挑战。代理优化(SO)是一种常见的解决方案,但其由代理模型和采样核心(例如采集函数)的复杂性引入的专有性质往往导致缺乏可解释性和透明度。尽管现有文献主要集中在增强对全局最优的收敛性,但新提出策略的实际解释仍未被充分探索,特别是在批量评估设置中。在本文中,我们提出了代理优化的包容性可解释性指标(IEMSO),这是一组全面的模型无关指标,旨在增强SO方法的透明度、可信度和可解释性。通过这些指标,我们在执行昂贵评估之前和之后为从业者提供中间和事后解释,以建立信任。我们考虑了四类主要指标,每类针对SO过程的特定方面:采样核心指标、批次属性指标、优化过程指标和特征重要性。我们的实验评估证明了所提指标在不同基准上的显著潜力。

英文摘要

Optimizing costly black-box functions within a constrained evaluation budget presents significant challenges in many real-world applications. Surrogate Optimization (SO) is a common resolution, yet its proprietary nature introduced by the complexity of surrogate models and the sampling core (e.g., acquisition functions) often leads to a lack of explainability and transparency. While existing literature has primarily concentrated on enhancing convergence to global optima, the practical interpretation of newly proposed strategies remains underexplored, especially in batch evaluation settings. In this paper, we propose Inclusive Explainability Metrics for Surrogate Optimization (IEMSO), a comprehensive set of model-agnostic metrics designed to enhance the transparency, trustworthiness, and explainability of SO approaches. Through these metrics, we provide both intermediate and post-hoc explanations to practitioners before and after performing expensive evaluations to gain trust. We consider four primary categories of metrics, each targeting a specific aspect of the SO process: Sampling Core Metrics, Batch Properties Metrics, Optimization Process Metrics, and Feature Importance. Our experimental evaluations demonstrate the significant potential of the proposed metrics across different benchmarks.

2406.10407 2026-06-03 math.OC cs.LG cs.NA math.NA 版本更新

Suboptimality bounds for trace-bounded SDPs enable a faster and scalable low-rank SDP solver SDPLR+

迹有界半定规划的次优性界实现更快且可扩展的低秩SDP求解器SDPLR+

Yufan Huang, David F. Gleich

发表机构 * Purdue University(普渡大学)

AI总结 本文利用迹有界半定规划的次优性界改进Burer-Monteiro的低秩SDP求解器SDPLR,提出SDPLR+,通过动态调整秩并跟踪原始不可行性和次优性,实现更快的求解和更好的可扩展性。

Comments 31 pages, 12 figures

详情
AI中文摘要

半定规划(SDP)及其求解器是机器学习和数据科学中许多应用的有力工具。设计可扩展的SDP求解器具有挑战性,因为标准情况下正半定决策变量是一个$n \times n$的稠密矩阵,尽管输入通常是$n \times n$的稀疏矩阵。然而,如Barvinok和Pataki所示,解可能不需要满秩矩阵。二十年前,Burer和Monteiro开发了SDP求解器SDPLR,它在低秩分解上而不是完整矩阵上进行优化。这大大降低了存储成本,并且对许多问题效果良好。原始求解器SDPLR仅跟踪解的原始不可行性,因而无法在中等精度下提前终止。我们利用迹有界SDP问题的次优性界,使我们能够更好地跟踪进展并执行提前终止。然后我们开发了SDPLR+,它以极低秩分解开始优化,并基于原始不可行性和次优性动态更新秩。这进一步加速了计算并节省了存储。在Max Cut、Minimum Bisection、Cut Norm和Lovász Theta问题上与许多近期的内存高效可扩展SDP求解器的数值比较展示了SDPLR+在决策变量达到百万乘百万规模问题上的可扩展性。它通常是达到中等精度$10^{-2}$的最快求解器。在$\mu$-电导、矩阵补全和$k$-均值聚类上的进一步实验显示了SDPLR+在更广泛数据科学应用中的潜力。

英文摘要

Semidefinite programs (SDPs) and their solvers are powerful tools with many applications in machine learning and data science. Designing scalable SDP solvers is challenging because by default the positive semidefinite decision variable is an $n \times n$ dense matrix, even though the input is often an $n \times n$ sparse matrix. However, the solution may not require a full-rank matrix, as shown by Barvinok and Pataki. Two decades ago, Burer and Monteiro developed an SDP solver SDPLR that optimizes over a low-rank factorization instead of the full matrix. This greatly decreases the storage cost and works well for many problems. The original solver SDPLR tracks only the primal infeasibility of the solution, preventing early termination at moderate accuracy. We use a suboptimality bound for trace-bounded SDP problems that enables us to track the progress better and perform early termination. We then develop SDPLR+, which starts the optimization with an extremely low-rank factorization and dynamically updates the rank based on the primal infeasibility and suboptimality. This further speeds up the computation and saves storage. Numerical comparisons on Max Cut, Minimum Bisection, Cut Norm, and Lovász Theta problems with many recent memory-efficient scalable SDP solvers demonstrate the scalability of SDPLR+ up to problems with million-by-million decision variables. It is often the fastest solver to a moderate accuracy of $10^{-2}$. Further experiments on $\mu$-conductance, matrix completion, and $k$-means clustering show the potential of SDPLR+ on a broader range of data science applications.

2407.18428 2026-06-03 cs.LG cs.AI cs.CV 版本更新

Weighted Risk Invariance: Domain Generalization under Invariant Feature Shift

加权风险不变性:不变特征偏移下的领域泛化

Gina Wong, Joshua Gleason, Rama Chellappa, Yoav Wald, Anqi Liu

发表机构 * Johns Hopkins University(约翰霍普金斯大学) University of Maryland, College Park(马里兰大学学院公园分校) New York University(纽约大学) Center for Data Science(数据科学中心)

AI总结 针对不变协变量偏移下现有不变学习方法性能不佳的问题,提出加权风险不变性(WRI)框架,通过环境间损失的不变性并加权训练样本,在理论上保证学习到不变模型,并在实验中优于先前方法。

详情
Journal ref
TMLR 2024
AI中文摘要

学习预测在多个环境下不变的模型是一种有前景的分布外泛化方法。这类模型被训练来提取特征 $X_{\text{inv}}$,其中给定提取特征时标签的条件分布 $Y \mid X_{\text{inv}}$ 在不同环境下不发生变化。不变模型还应能泛化到提取特征 $X_{\text{inv}}$ 的边缘分布 $p(X_{\text{inv}})$ 的偏移,这种偏移称为不变协变量偏移。然而,我们表明,现有学习不变模型的方法在不变协变量偏移下表现不佳,要么无法学习到不变模型(即使对于从简单且经过充分研究的线性-高斯模型生成的数据也是如此),要么有限样本性能较差。为了解决这些问题,我们提出加权风险不变性(WRI)。我们的框架基于对训练样本进行适当加权,强制要求损失在不同环境下保持不变。我们证明,在线性-高斯设置下,WRI 可证明地学习到不变模型,即丢弃虚假相关性。我们提出了一种实用算法,通过同时学习密度 $p(X_{\text{inv}})$ 和模型参数来实现 WRI,并且实验表明,在不变协变量偏移下,WRI 优于先前的不变学习方法。

英文摘要

Learning models whose predictions are invariant under multiple environments is a promising approach for out-of-distribution generalization. Such models are trained to extract features $X_{\text{inv}}$ where the conditional distribution $Y \mid X_{\text{inv}}$ of the label given the extracted features does not change across environments. Invariant models are also supposed to generalize to shifts in the marginal distribution $p(X_{\text{inv}})$ of the extracted features $X_{\text{inv}}$, a type of shift we call an invariant covariate shift. However, we show that proposed methods for learning invariant models underperform under invariant covariate shift, either failing to learn invariant models, even for data generated from simple and well-studied linear-Gaussian models, or having poor finite-sample performance. To alleviate these problems, we propose weighted risk invariance (WRI). Our framework is based on imposing invariance of the loss across environments subject to appropriate reweightings of the training examples. We show that WRI provably learns invariant models, i.e. discards spurious correlations, in linear-Gaussian settings. We propose a practical algorithm to implement WRI by learning the density $p(X_{\text{inv}})$ and the model parameters simultaneously, and we demonstrate empirically that WRI outperforms previous invariant learning methods under invariant covariate shift.

2405.03386 2026-06-03 cs.LG 版本更新

Annot-Mix: Learning with Noisy Class Labels from Multiple Annotators via a Mixup Extension

Annot-Mix: 通过混合扩展从多个标注者学习带噪声类别标签

Marek Herde, Lukas Lührs, Denis Huseljic, Bernhard Sick

发表机构 * University of Kassel(卡塞尔大学) European Conference on Artificial Intelligence(欧洲人工智能会议) Conference on Prestigious Applications of Intelligent Systems(智能系统杰出应用会议)

AI总结 提出Annot-Mix框架,通过扩展mixup处理多标注者提供的类别标签,在11个数据集上优于11种现有方法。

Comments 9 pages, 8 figures, 4 tables; post-publication arXiv version with minor editorial corrections; methodology, results, and conclusions unchanged

详情
Journal ref
ECAI 2024: 27th European Conference on Artificial Intelligence, IOS Press, pp. 2910-2918, 2024
AI中文摘要

使用带噪声的类别标签进行训练会损害神经网络的泛化性能。在此背景下,mixup是一种流行的正则化技术,通过使记忆错误类别标签更加困难来提高训练鲁棒性。然而,mixup忽略了多个标注者(例如众包工作者)通常提供类别标签的事实。因此,我们提出了mixup的一种扩展,该扩展处理每个实例的多个类别标签,同时考虑哪个类别标签来自哪个标注者。集成到我们的多标注者分类框架annot-mix中,在包含来自人类或模拟标注者的噪声类别标签的11个数据集的评估研究中,它的性能优于11种(大多数是最先进的)方法。我们的代码通过我们的GitHub仓库公开提供:https://github.com/ies-research/multi-annotator-machine-learning/tree/annot-mix

英文摘要

Training with noisy class labels impairs neural networks' generalization performance. In this context, mixup is a popular regularization technique to improve training robustness by making memorizing false class labels more difficult. However, mixup neglects that multiple annotators, e.g., crowdworkers, typically provide class labels. Therefore, we propose an extension of mixup, which handles multiple class labels per instance while considering which class label originates from which annotator. Integrated into our multi-annotator classification framework annot-mix, it performs superiorly to eleven (mostly state-of-the-art) approaches in an evaluation study with eleven datasets comprising noisy class labels from either human or simulated annotators. Our code is publicly available through our GitHub repository at https://github.com/ies-research/multi-annotator-machine-learning/tree/annot-mix
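For context, standard mixup, the technique the paper extends, can be sketched in a few lines of NumPy. This shows only the baseline with one label per instance, not the multi-annotator extension. The function name `mixup_batch` and its signature are assumptions for illustration; the authors' implementation is in the repository linked above.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha, rng):
    """Standard mixup: convex combinations of inputs and one-hot labels."""
    lam = rng.beta(alpha, alpha)              # mixing coefficient in [0, 1]
    perm = rng.permutation(x.shape[0])        # random partner per instance
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam
```

Each mixed label row still sums to one, so the usual cross-entropy loss applies unchanged to the soft targets.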

1212.5524 2026-06-03 eess.SY cs.LG cs.SY 版本更新

Reinforcement learning for port-Hamiltonian systems

面向端口-哈密顿系统的强化学习

Olivier Sprangers, Gabriel A. D. Lopes, Robert Babuska

AI总结 针对端口-哈密顿系统的无源控制中性能优化与PDE求解困难的问题,提出一种基于演员-评论家强化学习的参数化能量平衡无源控制方法,实现近最优控制策略学习并保持系统稳定性。

Comments submitted

详情
Journal ref
IEEE Transactions on Cybernetics, Volume: 45 , Issue: 5 , May 2015
AI中文摘要

端口-哈密顿系统的无源控制(PBC)通过使系统相对于期望的存储函数无源,提供了一种直观的稳定化方法。然而,在大多数情况下,控制律的获得没有考虑任何性能指标,并且必须通过求解复杂的偏微分方程(PDE)来计算。为了解决这些问题,我们将强化学习方法引入能量平衡无源控制(EB-PBC)方法中,这是一种PBC形式,其中闭环能量等于存储能量与供给能量之差。我们提出了一种参数化EB-PBC的技术,该技术保留了系统的PDE匹配条件,不需要指定全局期望哈密顿量,包含性能标准,并且对额外的非线性(如控制输入饱和)具有鲁棒性。控制律的参数通过演员-评论家强化学习找到,从而能够学习满足期望闭环能量景观的近最优控制策略。其优点是,可以使用标准的能量整形技术生成近最优控制器,并且学习到的解可以在能量整形和阻尼注入方面进行解释,从而使得利用无源性理论对稳定性进行数值评估成为可能。从强化学习的角度来看,我们的方法允许将端口-哈密顿系统类纳入演员-评论家框架,通过策略的参数化加速学习。该方法已成功应用于仿真和实际实验中的摆锤起摆问题。

英文摘要

Passivity-based control (PBC) for port-Hamiltonian systems provides an intuitive way of achieving stabilization by rendering a system passive with respect to a desired storage function. However, in most instances the control law is obtained without any performance considerations and it has to be calculated by solving a complex partial differential equation (PDE). In order to address these issues we introduce a reinforcement learning approach into the energy-balancing passivity-based control (EB-PBC) method, which is a form of PBC in which the closed-loop energy is equal to the difference between the stored and supplied energies. We propose a technique to parameterize EB-PBC that preserves the system's PDE matching conditions, does not require the specification of a global desired Hamiltonian, includes performance criteria, and is robust to extra non-linearities such as control input saturation. The parameters of the control law are found using actor-critic reinforcement learning, enabling the learning of near-optimal control policies satisfying a desired closed-loop energy landscape. The advantages are that near-optimal controllers can be generated using standard energy shaping techniques and that the solutions learned can be interpreted in terms of energy shaping and damping injection, which makes it possible to numerically assess stability using passivity theory. From the reinforcement learning perspective, our proposal allows for the class of port-Hamiltonian systems to be incorporated in the actor-critic framework, speeding up the learning thanks to the resulting parameterization of the policy. The method has been successfully applied to the pendulum swing-up problem in simulations and real-life experiments.

1206.3582 2026-06-03 math.OC cs.LG cs.SY eess.SY 版本更新

Decentralized Learning for Multi-player Multi-armed Bandits

多人多臂老虎机的分散式学习

Dileep Kalathil, Naumaan Nayyar, Rahul Jain

AI总结 针对多人多臂老虎机问题,提出了一种无需协调的分散式在线学习算法dUCB_4,实现了近O(log^2 T)的期望遗憾。

Comments 33 pages, 3 figures. Submitted to IEEE Transactions on Information Theory

详情
AI中文摘要

我们考虑多人多臂老虎机(MAB)模型中的分布式在线学习问题。每个玩家可以选择多个臂。当玩家选择一个臂时,它会获得奖励。我们考虑独立同分布奖励模型和马尔可夫奖励模型。在独立同分布模型中,每个臂被建模为具有未知均值的未知分布的独立同分布过程。在马尔可夫模型中,每个臂被建模为具有未知概率转移矩阵和平稳分布的有限、不可约、非周期且可逆的马尔可夫链。不同玩家从臂中获得不同奖励。如果两个玩家选择同一个臂,则发生“碰撞”,两者均无法获得任何奖励。玩家之间没有专用的控制信道用于协调或通信。用户之间的任何其他通信都是有代价的,并会增加遗憾。我们提出了一种基于索引的在线分布式学习策略,称为${\tt dUCB_4}$算法,该算法以正确的方式权衡探索与利用,并实现期望遗憾增长不超过近$O(\log^2 T)$。该研究的动机来自认知无线电网络中多个次要用户的机会频谱接入,他们必须在不同用户看起来不同的各种无线信道中进行选择。据我们所知,这是首个针对多人MAB的分布式学习算法。

英文摘要

We consider the problem of distributed online learning with multiple players in multi-armed bandits (MAB) models. Each player can pick among multiple arms. When a player picks an arm, it gets a reward. We consider both the i.i.d. reward model and the Markovian reward model. In the i.i.d. model each arm is modelled as an i.i.d. process with an unknown distribution with an unknown mean. In the Markovian model, each arm is modelled as a finite, irreducible, aperiodic and reversible Markov chain with an unknown probability transition matrix and stationary distribution. The arms give different rewards to different players. If two players pick the same arm, there is a "collision", and neither of them gets any reward. There is no dedicated control channel for coordination or communication among the players. Any other communication between the users is costly and will add to the regret. We propose an online index-based distributed learning policy called the ${\tt dUCB_4}$ algorithm that trades off exploration vs. exploitation in the right way, and achieves expected regret that grows at most as near-$O(\log^2 T)$. The motivation comes from opportunistic spectrum access by multiple secondary users in cognitive radio networks wherein they must pick among various wireless channels that look different to different users. This is the first distributed learning algorithm for multi-player MABs to the best of our knowledge.

1303.4778 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

Greedy Feature Selection for Subspace Clustering

子空间聚类的贪婪特征选择

Eva L. Dyer, Aswin C. Sankaranarayanan, Richard G. Baraniuk

AI总结 本文研究使用贪婪方法(正交匹配追踪)进行子空间聚类的精确特征选择,并证明其在稀疏采样条件下优于最近邻方法。

Comments 32 pages, 7 figures, 1 table

详情
Journal ref
Journal of Machine Learning Research, Vol.14, Issue 1, pp. 2487-2517, January 2013
AI中文摘要

子空间的并集为高维数据集合提供了对线性子空间模型的强大推广。为了从数据集合中学习子空间的并集,必须识别集合中属于同一子空间的信号集,以获得数据中存在的子空间结构的准确估计。最近,稀疏恢复方法已被证明为精确特征选择(EFS)提供了可证明且稳健的策略——从集合中恢复位于同一子空间的点集。与最近关于L1最小化EFS的研究并行,本文为使用贪婪方法(即正交匹配追踪(OMP))进行稀疏信号恢复的EFS发展了充分条件。在分析之后,我们提供了对生活在子空间并集上的信号的特征选择策略的实证研究,并刻画了稀疏恢复方法与基于最近邻(NN)的方法之间的差距。特别是,我们证明了稀疏恢复方法比NN方法具有显著优势,并且当数据集中子空间的采样稀疏时,这两种方法之间的差距尤为明显。我们的结果表明,在NN方法无法揭示集合中点所属子空间的许多情况下,OMP可以可靠地恢复精确的特征集。

英文摘要

Unions of subspaces provide a powerful generalization to linear subspace models for collections of high-dimensional data. To learn a union of subspaces from a collection of data, sets of signals in the collection that belong to the same subspace must be identified in order to obtain accurate estimates of the subspace structures present in the data. Recently, sparse recovery methods have been shown to provide a provable and robust strategy for exact feature selection (EFS), that is, recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with L1-minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and the gap between the two approaches is particularly pronounced when the sampling of subspaces in the dataset is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble.
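A minimal sketch of OMP used for feature selection is given below: a query point is written as a sparse combination of the other points, and the selected indices form its feature set. Exact feature selection holds when every selected point lies in the query's own subspace. The function name `omp_select`, the stopping tolerance, and the assumption of unit-norm columns are illustrative and are not taken from the paper.

```python
import numpy as np

def omp_select(D, y, k, tol=1e-10):
    """Greedy sparse coding of y over the unit-norm columns of D.

    Returns the selected column indices and their least-squares weights.
    """
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support, coef = [], np.zeros(0)
    for _ in range(k):
        if np.linalg.norm(residual) <= tol:
            break                                  # y is already explained
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = -np.inf                # never pick a column twice
        support.append(int(np.argmax(corr)))
        # Orthogonal step: refit all selected columns jointly.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    return support, coef
```

The joint refit after each selection is what distinguishes OMP from plain matching pursuit: the residual stays orthogonal to every column chosen so far.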

1207.2940 2026-06-03 stat.ML cs.LG cs.SY eess.SY 版本更新

Expectation Propagation in Gaussian Process Dynamical Systems: Extended Version

高斯过程动态系统中的期望传播:扩展版

Marc Peter Deisenroth, Shakir Mohamed

AI总结 本文提出基于期望传播的消息传递算法用于高斯过程动态系统的近似推理,通过前向后向平滑迭代获得更精确的潜在结构后验分布,提升预测性能,并统一了现有GPDS平滑器。

详情
Journal ref
Advances in Neural Information Processing Systems 25 (NIPS), pp. 2609-2617, 2012
AI中文摘要

丰富且复杂的时间序列数据,例如来自工程系统、金融市场、视频或神经记录的生成数据,现在已成为现代数据分析的常见特征。解释这些多样化数据集背后的现象需要灵活且准确的模型。在本文中,我们推广高斯过程动态系统(GPDS)作为适合此类分析的丰富模型类。特别地,我们提出了一种基于期望传播的GPDS近似推理消息传递算法。通过将推理视为一般的消息传递问题,我们迭代前向后向平滑。因此,我们获得了更准确的潜在结构后验分布,与最先进的GPDS平滑器(这些平滑器是我们一般消息传递算法的特例)相比,预测性能得到改善。因此,我们提供了一种统一的方法,在其中将消息传递置于GPDS的上下文中。

英文摘要

Rich and complex time-series data, such as those generated from engineering systems, financial markets, videos or neural recordings, are now a common feature of modern data analysis. Explaining the phenomena underlying these diverse data sets requires flexible and accurate models. In this paper, we promote Gaussian process dynamical systems (GPDS) as a rich model class that is appropriate for such analysis. In particular, we present a message passing algorithm for approximate inference in GPDSs based on expectation propagation. By posing inference as a general message passing problem, we iterate forward-backward smoothing. Thus, we obtain more accurate posterior distributions over latent structures, resulting in improved predictive performance compared to state-of-the-art GPDS smoothers, which are special cases of our general message passing algorithm. Hence, we provide a unifying approach within which to contextualize message passing in GPDSs.

1102.5597 2026-06-03 math.NA cs.LG cs.NA 版本更新

Fast and Faster: A Comparison of Two Streamed Matrix Decomposition Algorithms

快速与更快:两种流式矩阵分解算法的比较

Radim Řehůřek

AI总结 本文比较了单遍分布式算法和两遍流式随机算法在恒定内存下处理大规模矩阵分解的性能与精度,以英文维基百科为数据集进行潜在语义分析实验。

详情
Journal ref
NIPS Workshop on Low-Rank Methods for Large-Scale Machine Learning, 2010
AI中文摘要

随着数字数据集规模的爆炸式增长,分解算法的限制因素是对输入的遍历次数,因为输入通常存储在外存甚至异地。此外,我们只关注相对于输入大小在恒定内存中运行的算法,以便处理任意大的输入。在本文中,我们提出了两种此类算法的实际比较:一种是对输入进行单遍操作的分布式方法,另一种是流式两遍随机算法。实验跟踪了分布式计算、过采样和内存权衡对两种算法精度和性能的影响。为了确保有意义的结果,我们选择真实数据集,即整个英文维基百科,作为潜在语义分析的应用场景。

英文摘要

With the explosion of the size of digital datasets, the limiting factor for decomposition algorithms is the number of passes over the input, as the input is often stored out-of-core or even off-site. Moreover, we're only interested in algorithms that operate in constant memory w.r.t. the input size, so that arbitrarily large input can be processed. In this paper, we present a practical comparison of two such algorithms: a distributed method that operates in a single pass over the input vs. a streamed two-pass stochastic algorithm. The experiments track the effect of distributed computing, oversampling and memory trade-offs on the accuracy and performance of the two algorithms. To ensure meaningful results, we choose the input to be a real dataset, namely the whole of the English Wikipedia, in the application settings of Latent Semantic Analysis.

1111.2262 2026-06-03 cs.LG cs.NA math.NA 版本更新

Improved Bound for the Nystrom's Method and its Application to Kernel Classification

Nyström方法的改进界及其在核分类中的应用

Rong Jin, Tianbao Yang, Mehrdad Mahdavi, Yu-Feng Li, Zhi-Hua Zhou

AI总结 本文通过积分算子集中不等式和压缩感知理论改进了Nyström方法的谱范数逼近误差界,并应用于核分类,证明在特征值服从p次幂律时可将支持向量数量减少至N^{2p/(p^2-1)}。

详情
AI中文摘要

我们开发了两种分析Nyström方法逼近误差界的方法,一种基于积分算子的集中不等式,另一种基于压缩感知理论。我们表明,在大特征间隙的情况下,以谱范数度量的逼近误差可以从$O(N/\sqrt{m})$改进到$O(N/m^{1 - \rho})$,其中$N$是数据点总数,$m$是采样数据点数,$\rho \in (0, 1/2)$是刻画特征间隙的正常数。当核矩阵的特征值服从$p$次幂律时,基于压缩感知理论的分析在非相干性假设下进一步将界改进为$O(N/m^{p - 1})$,这解释了为什么Nyström方法对特征值倾斜的核矩阵效果良好。我们提出了一种基于Nyström方法的核分类方法,并利用改进的界推导了其泛化性能。我们表明,当核矩阵的特征值服从$p$次幂律时,我们可以将支持向量数量减少到$N^{2p/(p^2 - 1)}$,当$p > 1+\sqrt{2}$时该数量小于$N$,而不会严重牺牲其泛化性能。

英文摘要

We develop two approaches for analyzing the approximation error bound for the Nyström method, one based on a concentration inequality for integral operators, and one based on compressive sensing theory. We show that the approximation error, measured in the spectral norm, can be improved from $O(N/\sqrt{m})$ to $O(N/m^{1 - \rho})$ in the case of a large eigengap, where $N$ is the total number of data points, $m$ is the number of sampled data points, and $\rho \in (0, 1/2)$ is a positive constant that characterizes the eigengap. When the eigenvalues of the kernel matrix follow a $p$-power law, our analysis based on compressive sensing theory further improves the bound to $O(N/m^{p - 1})$ under an incoherence assumption, which explains why the Nyström method works well for kernel matrices with skewed eigenvalues. We present a kernel classification approach based on the Nyström method and derive its generalization performance using the improved bound. We show that when the eigenvalues of the kernel matrix follow a $p$-power law, we can reduce the number of support vectors to $N^{2p/(p^2 - 1)}$, a number less than $N$ when $p > 1+\sqrt{2}$, without seriously sacrificing its generalization performance.
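The quantity bounded in the abstract is the spectral-norm error of the Nyström approximation $K \approx C W^{+} C^{\top}$, where $C$ holds $m$ sampled columns of the kernel matrix and $W$ is the block among the sampled points. A minimal NumPy sketch follows; the RBF kernel, the helper names `rbf_kernel` and `nystrom`, and uniform column sampling are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Dense RBF kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))   # clamp rounding noise

def nystrom(K, idx):
    """Nystrom approximation K ~= C W^+ C^T from the sampled columns idx."""
    C = K[:, idx]                     # N x m sampled columns
    W = K[np.ix_(idx, idx)]           # m x m block among sampled points
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
K = rbf_kernel(X, gamma=0.5)
idx = rng.choice(20, size=5, replace=False)        # uniform sampling, m = 5
spectral_error = np.linalg.norm(K - nystrom(K, idx), 2)
```

Repeating the last two lines for increasing `m` gives an empirical error curve that can be compared against the rates quoted in the abstract.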

1205.4133 2026-06-03 math.NA cs.LG cs.NA 版本更新

Constrained Overcomplete Analysis Operator Learning for Cosparse Signal Modelling

约束过完全分析算子学习用于共稀疏信号建模

Mehrdad Yaghoobi, Sangnam Nam, Remi Gribonval, Mike E. Davies

AI总结 提出一种基于L1优化的约束学习框架,通过投影次梯度和Douglas-Rachford分裂技术学习过完全分析算子,实现共稀疏信号建模,并验证了其在干净和噪声训练集上的鲁棒恢复能力。

Comments 29 pages, 13 figures, accepted to be published in TSP

详情
AI中文摘要

我们考虑从训练样本集合中学习低维信号模型的问题。主流方法是学习一个过完全字典,利用稀疏合成系数对训练样本提供良好近似。这个著名的稀疏模型有一个不太为人知的对应物,即分析形式的共稀疏分析模型。在这个新模型中,信号的特征在于它们在使用过完全(线性)分析算子的变换域中的简约性。我们提出基于L1优化的约束优化框架,从训练语料库中学习分析算子。在优化框架中引入约束的原因是为了排除平凡解。尽管目前还没有最终答案确定哪个约束最相关,但我们研究了模型自适应领域的一些常规约束,并为此使用了均匀归一化紧框架(UNTF)。然后,我们推导了一个实用的学习算法,基于投影次梯度和Douglas-Rachford分裂技术,并展示了当提供足够大小的干净训练集时,该算法能够稳健地恢复真实分析算子。我们还使用一些含噪的共稀疏信号找到了图像的分析算子,这确实是一个更现实的实验。由于推导出的优化问题不是凸规划,我们通常使用此类变分方法找到局部最小值。针对两种不同设置推导了局部最优性条件,为学习问题在适当条件下的适定性提供了初步的理论支持。

英文摘要

We consider the problem of learning a low-dimensional signal model from a collection of training samples. The mainstream approach would be to learn an overcomplete dictionary to provide good approximations of the training samples using sparse synthesis coefficients. This famous sparse model has a less well known counterpart, in analysis form, called the cosparse analysis model. In this new model, signals are characterised by their parsimony in a transformed domain using an overcomplete (linear) analysis operator. We propose to learn an analysis operator from a training corpus using a constrained optimisation framework based on L1 optimisation. The reason for introducing a constraint in the optimisation framework is to exclude trivial solutions. Although there is no final answer here for which constraint is the most relevant constraint, we investigate some conventional constraints in the model adaptation field and use the uniformly normalised tight frame (UNTF) for this purpose. We then derive a practical learning algorithm, based on projected subgradients and Douglas-Rachford splitting technique, and demonstrate its ability to robustly recover a ground truth analysis operator, when provided with a clean training set, of sufficient size. We also find an analysis operator for images, using some noisy cosparse signals, which is indeed a more realistic experiment. As the derived optimisation problem is not a convex program, we often find a local minimum using such variational methods. Some local optimality conditions are derived for two different settings, providing preliminary theoretical support for the well-posedness of the learning problem under appropriate conditions.

0911.1419 2026-06-03 cs.DS cond-mat.stat-mech cs.DM cs.LG cs.NA math.NA math.OC 版本更新

Belief Propagation and Loop Calculus for the Permanent of a Non-Negative Matrix

非负矩阵积和式的信念传播与环路微积分

Yusuke Watanabe, Michael Chertkov

AI总结 针对非负矩阵积和式的计算问题,利用信念传播固定点导出了精确的积和式表达式,并提供了基于Bethe自由能和Ihara图zeta函数的两种推导。

Comments 11 pages; submitted to Journal of Physics A: Mathematical Theoretical

详情
AI中文摘要

我们考虑计算一个正$(N\times N)$非负矩阵$P=(P_i^j|i,j=1,\cdots,N)$的积和式,或等价地,完全二分图$K_{N,N}$上完美匹配的加权计数问题。该问题很可能具有指数复杂度。作为图模型的配分函数$Z$,该问题允许精确的环路微积分表示[Chertkov, Chernyak '06],该表示基于Bethe自由能泛函在非整数双随机边际信念矩阵$\beta=(\beta_i^j|i,j=1,\cdots,N)$上的内部最小值,该矩阵也对应于信念传播(BP)型迭代消息传递算法的固定点。我们的主要结果是给出精确配分函数(积和式)用BP边际矩阵$\beta$表示的显式表达式:$Z=\mbox{Perm}(P)=Z_{BP} \mbox{Perm}(\beta_i^j(1-\beta_i^j))/\prod_{i,j}(1-\beta_i^j)$,其中$Z_{BP}$是用$\beta$显式表示的BP积和式表达式。我们给出了该公式的两种推导:一种直接基于Bethe自由能,另一种结合了Ihara图$\zeta$函数和环路微积分方法。假设已计算出信念传播边际矩阵$\beta$,我们提供了两个下界和一个上界来估计其中的乘性因子。两个互补的下界分别基于Gurvits-van der Waerden定理以及修正积和式与行列式之间的关系。

英文摘要

We consider computation of the permanent of a positive $(N\times N)$ non-negative matrix, $P=(P_i^j|i,j=1,\cdots,N)$, or equivalently the problem of weighted counting of the perfect matchings over the complete bipartite graph $K_{N,N}$. The problem is known to be of likely exponential complexity. Stated as the partition function $Z$ of a graphical model, the problem allows an exact Loop Calculus representation [Chertkov, Chernyak '06] in terms of an interior minimum of the Bethe Free Energy functional over a non-integer doubly stochastic matrix of marginal beliefs, $\beta=(\beta_i^j|i,j=1,\cdots,N)$, which also corresponds to a fixed point of the iterative message-passing algorithm of the Belief Propagation (BP) type. Our main result is an explicit expression of the exact partition function (permanent) in terms of the matrix of BP marginals, $\beta$, as $Z=\mbox{Perm}(P)=Z_{BP} \mbox{Perm}(\beta_i^j(1-\beta_i^j))/\prod_{i,j}(1-\beta_i^j)$, where $Z_{BP}$ is the BP expression for the permanent stated explicitly in terms of $\beta$. We give two derivations of the formula, a direct one based on the Bethe Free Energy and an alternative one combining the Ihara graph-$\zeta$ function and the Loop Calculus approaches. Assuming that the matrix $\beta$ of the Belief Propagation marginals is calculated, we provide two lower bounds and one upper bound to estimate the multiplicative term. Two complementary lower bounds are based on the Gurvits-van der Waerden theorem and on a relation between the modified permanent and determinant, respectively.
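The main identity in this abstract can be checked numerically on a small symmetric case. The sketch below is not taken from the paper: it assumes the all-ones matrix, for which the uniform matrix with entries 1/N is taken (by symmetry) as the BP marginal matrix, and it assumes the Bethe free energy has the form F = sum_ij [beta log(beta/P) - (1 - beta) log(1 - beta)] with Z_BP = exp(-F). The helper names `permanent` and `check_identity` are illustrative.

```python
import itertools
import numpy as np

def permanent(A):
    """Brute-force permanent: sum over all permutations (exponential cost)."""
    n = A.shape[0]
    return sum(
        np.prod([A[i, p[i]] for i in range(n)])
        for p in itertools.permutations(range(n))
    )

def check_identity(n):
    """Return (Perm(P), Z_BP * Perm(beta*(1-beta)) / prod(1-beta)) for P = all-ones."""
    P = np.ones((n, n))
    # Assumption: by symmetry the BP marginals for the all-ones matrix are uniform.
    beta = np.full((n, n), 1.0 / n)
    # Assumption: Bethe free energy of the matching model, with Z_BP = exp(-F).
    F = np.sum(beta * np.log(beta / P) - (1 - beta) * np.log(1 - beta))
    z_bp = np.exp(-F)
    rhs = z_bp * permanent(beta * (1 - beta)) / np.prod(1 - beta)
    return permanent(P), rhs

for n in (2, 3, 4):
    lhs, rhs = check_identity(n)
    print(n, lhs, rhs)  # the two values should agree up to rounding
```

The brute-force permanent is only usable for very small N; it serves here as a reference value, not as a practical algorithm.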

1208.4773 2026-06-03 eess.SY cs.AI cs.LG cs.SY 版本更新

Optimized Look-Ahead Tree Policies: A Bridge Between Look-Ahead Tree Policies and Direct Policy Search

优化前瞻树策略:连接前瞻树策略与直接策略搜索的桥梁

Tobias Jung, Louis Wehenkel, Damien Ernst, Francis Maes

AI总结 提出一种混合策略学习方案,通过直接策略搜索学习节点评分函数来指导小型前瞻树的构建,从而结合直接策略搜索和前瞻树策略的优势。

Comments In Submission

详情
AI中文摘要

直接策略搜索(DPS)和前瞻树(LT)策略是两类广泛使用的技术,用于为序列决策问题产生高性能策略。要使DPS方法有效工作,一个关键问题是针对目标问题选择合适的参数化策略空间。LT方法的一个基本问题是,为了做出好的决策,这类策略必须展开非常大的前瞻树,这可能需要过多的在线计算资源。在本文中,我们提出了一种新的混合策略学习方案,它位于DPS和LT的交集,其中策略是一种算法,以有向方式展开一个小型前瞻树,由通过DPS学习的节点评分函数引导。基于LT的表示被证明是在DPS方案中表示策略的一种通用方式,同时,DPS能够显著减少做出高质量决策所需的前瞻树的大小。我们通过实验将我们的方法与两种其他最先进的DPS技术和四种常见的LT策略在四个基准领域进行比较,并表明它结合了其起源的两种技术的优势。特别是,我们表明我们的方法:(1)总体上比纯DPS和纯LT策略产生更好的性能策略,(2)需要的策略评估次数远少于其他DPS技术,(3)易于调整,(4)产生的策略对初始条件的扰动具有相当的鲁棒性。

英文摘要

Direct policy search (DPS) and look-ahead tree (LT) policies are two widely used classes of techniques to produce high-performance policies for sequential decision-making problems. To make DPS approaches work well, one crucial issue is to select an appropriate space of parameterized policies with respect to the targeted problem. A fundamental issue in LT approaches is that, to take good decisions, such policies must develop very large look-ahead trees which may require excessive online computational resources. In this paper, we propose a new hybrid policy learning scheme that lies at the intersection of DPS and LT, in which the policy is an algorithm that develops a small look-ahead tree in a directed way, guided by a node scoring function that is learned through DPS. The LT-based representation is shown to be a versatile way of representing policies in a DPS scheme, while at the same time, DPS makes it possible to significantly reduce the size of the look-ahead trees that are required to take high-quality decisions. We experimentally compare our method with two other state-of-the-art DPS techniques and four common LT policies on four benchmark domains and show that it combines the advantages of the two techniques from which it originates. In particular, we show that our method: (1) produces overall better performing policies than both pure DPS and pure LT policies, (2) requires a substantially smaller number of policy evaluations than other DPS techniques, (3) is easy to tune and (4) results in policies that are quite robust with respect to perturbations of the initial conditions.

1205.2584 2026-06-03 math.NA cs.LG cs.NA math.OC 版本更新

Low Complexity Damped Gauss-Newton Algorithms for CANDECOMP/PARAFAC

低复杂度阻尼高斯-牛顿算法用于CANDECOMP/PARAFAC分解

Anh Huy Phan, Petr Tichavský, Andrzej Cichocki

AI总结 针对CP分解中阻尼高斯-牛顿算法计算复杂度过高的问题,提出基于分块逆近似Hessian的快速实现,显著降低计算和内存需求。

详情
AI中文摘要

用于CANDECOMP/PARAFAC (CP) 分解的阻尼高斯-牛顿 (dGN) 算法可以处理因子共线性和不同因子量级的挑战;然而,对于大小为 $I_1\times \cdots \times I_N$、秩为 $R$ 的 $N$ 维张量分解,该算法由于需要构建大小为 $(RT \times RT)$ 的大型近似Hessian矩阵并求逆(其中 $T = \sum_n I_n$),计算量巨大。本文提出了一种dGN算法的快速实现,基于分块形式的逆近似Hessian的新表达式。新实现具有较低的计算复杂度,除了梯度计算(这部分两种方法相同)外,只需要对一个大小为 $NR^2\times NR^2$ 的矩阵求逆,如果 $T \gg NR$,这远小于整个近似Hessian矩阵。此外,该实现具有更低的内存需求,因为Hessian矩阵及其逆矩阵都不需要完整存储。还提出了处理复数数据的算法变体。在困难基准张量示例上,将所提算法的复杂度和性能与dGN和带线搜索的ALS进行了比较。

英文摘要

The damped Gauss-Newton (dGN) algorithm for CANDECOMP/PARAFAC (CP) decomposition can handle the challenges of collinearity of factors and different magnitudes of factors; nevertheless, for factorization of an $N$-D tensor of size $I_1\times \cdots \times I_N$ with rank $R$, the algorithm is computationally demanding due to the construction of a large approximate Hessian of size $(RT \times RT)$ and its inversion, where $T = \sum_n I_n$. In this paper, we propose a fast implementation of the dGN algorithm which is based on novel expressions of the inverse approximate Hessian in block form. The new implementation has lower computational complexity: besides computation of the gradient (this part is common to both methods), it requires the inversion of a matrix of size $NR^2\times NR^2$, which is much smaller than the whole approximate Hessian if $T \gg NR$. In addition, the implementation has lower memory requirements, because neither the Hessian nor its inverse ever needs to be stored in its entirety. A variant of the algorithm working with complex-valued data is proposed as well. The complexity and performance of the proposed algorithm are compared with those of dGN and ALS with line search on examples of difficult benchmark tensors.
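To make the size comparison concrete, the following minimal sketch simply evaluates the two matrix orders quoted in the abstract, $RT$ with $T = \sum_n I_n$ versus $NR^2$. The function name `dgn_matrix_sizes` and the example dimensions are hypothetical and chosen only for illustration.

```python
def dgn_matrix_sizes(dims, rank):
    """Return (order of full approximate Hessian, order of the reduced matrix).

    dims: mode sizes I_1, ..., I_N of the tensor; rank: CP rank R.
    """
    n_modes = len(dims)
    t = sum(dims)                  # T = sum_n I_n
    full = rank * t                # approximate Hessian is (R*T) x (R*T)
    reduced = n_modes * rank ** 2  # matrix to invert is (N*R^2) x (N*R^2)
    return full, reduced

# Hypothetical example: a 100 x 100 x 100 tensor with rank 5.
full, reduced = dgn_matrix_sizes([100, 100, 100], 5)
print(full, reduced)  # 1500 75
```

In this example $T = 300 \gg NR = 15$, which is the regime in which the abstract says the reduced inversion is much cheaper.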

1104.3792 2026-06-03 stat.ML cs.LG cs.NA math.NA 版本更新

A sufficient condition on monotonic increase of the number of nonzero entry in the optimizer of L1 norm penalized least-square problem

L1范数惩罚最小二乘问题优化器中非零条目数单调递增的充分条件

J. Duan, Charles Soussen, David Brie, Jerome Idier, Y. -P. Wang

AI总结 本文针对L1范数惩罚最小二乘问题(LASSO),提出了一个充分条件,在该条件下当超参数减小时优化器中非零条目数单调递增,并将结果推广到全变分情形。

详情
AI中文摘要

基于$\ell$-1范数的优化广泛应用于信号处理,尤其是近期的压缩感知理论。本文研究$\ell$-1范数惩罚最小二乘问题的解路径,其约束形式称为最小绝对收缩和选择算子(LASSO)。解路径是随着超参数(拉格朗日乘子)变化的所有优化器的集合。解路径的研究对于理解和观察近似项与正则化项之间的权衡曲线具有重要意义。如果已知给定问题的解路径,它可以帮助我们在给定准则(如Akaike信息准则)下找到最优超参数。本文提出了$\ell$-1范数惩罚最小二乘问题的一个充分条件。在该充分条件下,当超参数减小时,优化器或解向量中的非零条目数单调递增。我们还将结果推广到常用的全变分情形,其中$\ell$-1范数作用于解向量的一阶导数。我们证明所提出的条件与Donoho等人\cite{Donoho08}给出的条件以及Efron等人\cite{Efron04}的正锥条件具有内在联系。然而,所提出的条件不需要像Donoho等人的条件那样假设信号的稀疏水平,并且在用于实际应用时比Efron等人的正锥条件更容易验证。

英文摘要

The $\ell$-1 norm based optimization is widely used in signal processing, especially in recent compressed sensing theory. This paper studies the solution path of the $\ell$-1 norm penalized least-square problem, whose constrained form is known as the Least Absolute Shrinkage and Selection Operator (LASSO). A solution path is the set of all the optimizers with respect to the evolution of the hyperparameter (Lagrange multiplier). The study of the solution path is of great significance in viewing and understanding the profile of the tradeoff between the approximation and regularization terms. If the solution path of a given problem is known, it can help us to find the optimal hyperparameter under a given criterion such as the Akaike Information Criterion. In this paper we present a sufficient condition for the $\ell$-1 norm penalized least-square problem. Under this sufficient condition, the number of nonzero entries in the optimizer or solution vector increases monotonically when the hyperparameter decreases. We also generalize the result to the often used total variation case, where the $\ell$-1 norm is taken over the first order derivative of the solution vector. We prove that the proposed condition has intrinsic connections with the condition given by Donoho et al. \cite{Donoho08} and the positive cone condition by Efron et al. \cite{Efron04}. However, the proposed condition does not need to assume the sparsity level of the signal as required by Donoho et al.'s condition, and is easier to verify than Efron et al.'s positive cone condition when being used for practical applications.
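The monotone growth of the support can be seen in the simplest special case, which is not the paper's sufficient condition: when the matrix has orthonormal columns, the minimizer of $\frac{1}{2}\|y - Ax\|^2 + \lambda\|x\|_1$ is given in closed form by soft-thresholding $A^T y$. The sketch below assumes that setting; the data are random and the names `soft_threshold` and `lasso_orthonormal` are illustrative.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_orthonormal(A, y, lam):
    """Minimizer of 0.5*||y - A x||^2 + lam*||x||_1, assuming A.T @ A = I."""
    return soft_threshold(A.T @ y, lam)

rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.standard_normal((20, 8)))  # orthonormal columns
y = rng.standard_normal(20)

lams = np.linspace(2.0, 0.0, 41)  # hyperparameter decreasing from 2 to 0
support_sizes = [
    int(np.count_nonzero(lasso_orthonormal(A, y, lam))) for lam in lams
]
print(support_sizes)  # non-decreasing as lam decreases
```

For general matrices the support size need not be monotone along the path, which is why a sufficient condition of the kind studied in the paper is of interest.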

1101.4003 2026-06-03 cs.AI cs.LG cs.SY eess.SY math.OC 版本更新

Dyna-H: a heuristic planning reinforcement learning algorithm applied to role-playing-game strategy decision systems

Dyna-H:一种应用于角色扮演游戏策略决策系统的启发式规划强化学习算法

Matilde Santos, Jose Antonio Martin H., Victoria Lopez, Guillermo Botella

AI总结 提出Dyna-H算法,结合启发式搜索与Dyna框架,在角色扮演游戏策略决策中实现无模型在线强化学习,实验表明其性能显著优于Q-Learning和Dyna-Q。

详情
AI中文摘要

在角色扮演游戏中,寻找最优轨迹是最重要的任务之一。实际上,策略决策系统成为游戏引擎的关键组成部分。决策方式(在线、批处理或模拟)以及决策所消耗的资源(如执行时间、内存)将在很大程度上影响游戏性能。当可以使用经典搜索算法(如A*)时,它们是最优先的选择。然而,这些方法依赖于搜索空间的精确和完整模型,在许多有趣的场景中无法应用。此时,无模型的序贯决策方法(在不确定性下)是最佳选择。本文提出一种启发式规划策略,将启发式搜索在路径规划中的能力融入Dyna智能体。所提出的Dyna-H算法,与A*一样,会选择更有可能产生结果的路径分支。此外,它具有无模型在线强化学习算法的优点。该方案与单步Q-Learning和Dyna-Q算法进行了对比评估,获得了优异的实验结果:Dyna-H在所有实验中显著优于这两种方法。我们还提出了一个功能类比,即从最差轨迹中采样的启发式与人类行为中梦境(如噩梦)的作用类似。

英文摘要

In a Role-Playing Game, finding optimal trajectories is one of the most important tasks. In fact, the strategy decision system becomes a key component of a game engine. Determining the way in which decisions are taken (online, batch or simulated) and the consumed resources in decision making (e.g. execution time, memory) will influence, to a major degree, the game performance. When classical search algorithms such as A* can be used, they are the very first option. Nevertheless, such methods rely on precise and complete models of the search space, and there are many interesting scenarios where their application is not possible. Then, model-free methods for sequential decision making under uncertainty are the best choice. In this paper, we propose a heuristic planning strategy to incorporate the ability of heuristic search in path-finding into a Dyna agent. The proposed Dyna-H algorithm, as A* does, selects branches more likely to produce outcomes than other branches. Besides, it has the advantages of being a model-free online reinforcement learning algorithm. The proposal was evaluated against the one-step Q-Learning and Dyna-Q algorithms, obtaining excellent experimental results: Dyna-H significantly outperforms both methods in all experiments. We also suggest a functional analogy between the proposed sampling-from-worst-trajectories heuristic and the role of dreams (e.g. nightmares) in human behavior.

1012.3005 2026-06-03 math.OC cs.LG cs.NI cs.SY eess.SY math.PR 版本更新

On the Combinatorial Multi-Armed Bandit Problem with Markovian Rewards

关于马尔可夫奖励的组合多臂老虎机问题

Yi Gai, Bhaskar Krishnamachari, Mingyan Liu

AI总结 针对用户-资源匹配中状态演化为马尔可夫链的组合多臂老虎机问题,提出一种多项式存储和每步多项式复杂度的学习算法,实现接近对数时间的遗憾界。

详情
AI中文摘要

我们考虑经典多臂老虎机问题的一个组合推广,定义如下:给定一个二分图,包含 $M$ 个用户和 $N \geq M$ 个资源。对于每个用户-资源对 $(i,j)$,存在一个关联状态,该状态演化为一个参数未知的不可约非周期有限状态马尔可夫链,每次特定用户 $i$ 被分配资源 $j$ 时状态发生转移。用户 $i$ 每次被分配资源 $j$ 时获得一个依赖于对应状态的奖励。系统目标是学习用户与资源的最佳匹配,使得所有用户获得的长期奖励总和最大化。这对应于最小化遗憾,这里定义为最佳可能静态匹配所能获得的期望总奖励与给定算法所能达到的期望总奖励之间的差距。我们针对该问题提出了一种多项式存储和每步多项式复杂度的匹配学习算法。我们证明该算法的遗憾关于时间可以一致地任意接近对数阶,且关于用户和资源数量为多项式阶。该公式广泛适用于网络中的调度和交换问题,并显著扩展了该领域的先前结果。

英文摘要

We consider a combinatorial generalization of the classical multi-armed bandit problem that is defined as follows. There is a given bipartite graph of $M$ users and $N \geq M$ resources. For each user-resource pair $(i,j)$, there is an associated state that evolves as an aperiodic irreducible finite-state Markov chain with unknown parameters, with transitions occurring each time the particular user $i$ is allocated resource $j$. The user $i$ receives a reward that depends on the corresponding state each time it is allocated the resource $j$. The system objective is to learn the best matching of users to resources so that the long-term sum of the rewards received by all users is maximized. This corresponds to minimizing regret, defined here as the gap between the expected total reward that can be obtained by the best-possible static matching and the expected total reward that can be achieved by a given algorithm. We present a polynomial-storage and polynomial-complexity-per-step matching-learning algorithm for this problem. We show that this algorithm can achieve a regret that is uniformly arbitrarily close to logarithmic in time and polynomial in the number of users and resources. This formulation is broadly applicable to scheduling and switching problems in networks and significantly extends prior results in the area.
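The regret definition above compares against the best static matching. The sketch below illustrates that benchmark only, not the paper's learning algorithm: it assumes a known matrix of long-run mean rewards per user-resource pair (the numbers are made up) and finds the best matching by brute force, which is exponential and therefore only suitable for tiny instances. The names `best_static_matching` and `regret` are hypothetical.

```python
import itertools
import numpy as np

def best_static_matching(mean_reward):
    """Brute-force search over injective user -> resource assignments."""
    m, n = mean_reward.shape
    best_value, best_assign = -np.inf, None
    for assign in itertools.permutations(range(n), m):
        value = sum(mean_reward[i, assign[i]] for i in range(m))
        if value > best_value:
            best_value, best_assign = value, assign
    return best_assign, float(best_value)

def regret(mean_reward, chosen_matchings):
    """Expected regret of a sequence of matchings versus the best static one."""
    _, best = best_static_matching(mean_reward)
    earned = sum(
        sum(mean_reward[i, a[i]] for i in range(len(a)))
        for a in chosen_matchings
    )
    return best * len(chosen_matchings) - float(earned)

# Hypothetical mean rewards for M = 2 users and N = 3 resources.
mu = np.array([[0.9, 0.2, 0.5],
               [0.8, 0.7, 0.1]])
assign, value = best_static_matching(mu)
print(assign, value)                   # (0, 1) with total mean reward 1.6 (up to rounding)
print(regret(mu, [(0, 1), (2, 0)]))    # approximately 0.3
```

A learning algorithm does not know `mu` in advance and must estimate it from observed rewards; the abstract states that the proposed method does this with polynomial storage and per-step complexity.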

1005.2146 2026-06-03 cs.LG cs.NA math.NA 版本更新

On the Finite Time Convergence of Cyclic Coordinate Descent Methods

关于循环坐标下降法的有限时间收敛性

Ankan Saha, Ambuj Tewari

AI总结 本文证明了在保序性(isotonicity)假设下,两种循环坐标下降变体具有O(1/k)的收敛速率,并通过与梯度下降的比较展示了其优越性。

Comments 20 pages

详情
AI中文摘要

循环坐标下降是一种经典的优化方法,在机器学习中重新引起了兴趣。原因包括其简单性、速度和稳定性,以及在$\ell_1$正则化光滑优化问题上的竞争性能。令人惊讶的是,关于该方法在这些问题上的有限时间收敛行为知之甚少。大多数现有结果要么仅证明收敛,要么提供渐近速率。我们通过证明在保序性(isotonicity)假设下,两种循环坐标下降变体具有$O(1/k)$的收敛速率(其中$k$是迭代次数),填补了这一文献空白。我们的分析将两种变体所达到的目标值相互比较,并与梯度下降算法所达到的目标值进行比较。我们表明,循环坐标下降方法生成的迭代点始终一致地优于梯度下降的迭代点。

英文摘要

Cyclic coordinate descent is a classic optimization method that has witnessed a resurgence of interest in machine learning. Reasons for this include its simplicity, speed and stability, as well as its competitive performance on $\ell_1$ regularized smooth optimization problems. Surprisingly, very little is known about its finite time convergence behavior on these problems. Most existing results either just prove convergence or provide asymptotic rates. We fill this gap in the literature by proving $O(1/k)$ convergence rates (where $k$ is the iteration counter) for two variants of cyclic coordinate descent under an isotonicity assumption. Our analysis proceeds by comparing the objective values attained by the two variants with each other, as well as with the gradient descent algorithm. We show that the iterates generated by the cyclic coordinate descent methods remain better than those of gradient descent uniformly over time.
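As a concrete instance of the method discussed here, the sketch below runs a textbook cyclic coordinate descent on an $\ell_1$-regularized least-squares objective, minimizing exactly over one coordinate at a time. It is not necessarily either of the two variants analyzed in the paper, and it does not check the isotonicity assumption; the data are random and the function names are illustrative.

```python
import numpy as np

def objective(A, y, x, lam):
    """0.5*||y - A x||^2 + lam*||x||_1."""
    r = y - A @ x
    return 0.5 * (r @ r) + lam * np.sum(np.abs(x))

def cyclic_cd_lasso(A, y, lam, n_sweeps=50):
    """Cyclic coordinate descent with exact minimization per coordinate."""
    n = A.shape[1]
    x = np.zeros(n)
    r = y.astype(float).copy()          # residual y - A x, kept up to date
    col_sq = np.sum(A ** 2, axis=0)     # squared column norms
    history = [objective(A, y, x, lam)]
    for _ in range(n_sweeps):
        for j in range(n):              # fixed cyclic order 0, 1, ..., n-1
            r += A[:, j] * x[j]         # remove coordinate j from the fit
            rho = A[:, j] @ r
            # Exact 1-D minimizer: soft-threshold, then rescale.
            x[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= A[:, j] * x[j]         # add updated coordinate back
        history.append(objective(A, y, x, lam))
    return x, history

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
y = rng.standard_normal(40)
x_hat, history = cyclic_cd_lasso(A, y, lam=1.0)
print(history[0], history[-1])  # objective never increases across sweeps
```

Because each coordinate update is an exact one-dimensional minimization, the objective value is non-increasing from sweep to sweep, which is the quantity whose decay rate the paper bounds.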

1303.3183 2026-06-03 eess.SY cs.CE cs.LG cs.SY q-bio.MN 版本更新

Toggling a Genetic Switch Using Reinforcement Learning

使用强化学习切换遗传开关

Aivar Sootla, Natalja Strelkowa, Damien Ernst, Mauricio Barahona, Guy-Bart Stan

AI总结 本文采用拟合Q迭代强化学习算法,无需系统数学模型,直接利用测量数据实现基因调控网络的最优外源控制,并以切换开关系统为例驱动两种蛋白质浓度到达目标状态区域。

Comments 12 pages, presented at the 9th French Meeting on Planning, Decision Making and Learning, Liège (Belgium), May 12-13, 2014

详情
AI中文摘要

在本文中,我们考虑了基因调控网络的最优外源控制问题。我们的方法在于采用一种称为拟合Q迭代的成熟强化学习算法。该算法直接根据系统对外部控制输入的响应测量值推断控制律,而无需使用系统的数学模型。测量数据集既可以从湿实验室实验中收集,也可以通过系统动力学模型的计算机模拟人工创建。由于该算法能够处理非线性和随机系统动力学,因此适用于广泛的生物系统。为了说明该算法在基因调控网络中的应用,考虑了切换开关系统的调控。该问题的控制目标是驱动两种特定蛋白质的浓度到达状态空间中的目标区域。

英文摘要

In this paper, we consider the problem of optimal exogenous control of gene regulatory networks. Our approach consists in adapting an established reinforcement learning algorithm called the fitted Q iteration. This algorithm infers the control law directly from the measurements of the system's response to external control inputs without the use of a mathematical model of the system. The measurement data set can either be collected from wet-lab experiments or artificially created by computer simulations of dynamical models of the system. The algorithm is applicable to a wide range of biological systems due to its ability to deal with nonlinear and stochastic system dynamics. To illustrate the application of the algorithm to a gene regulatory network, the regulation of the toggle switch system is considered. The control objective of this problem is to drive the concentrations of two specific proteins to a target region in the state space.

1201.5604 2026-06-03 cs.AI cs.LG cs.NE cs.SY eess.SY math.OC 版本更新

Discrete and fuzzy dynamical genetic programming in the XCSF learning classifier system

XCSF学习分类系统中的离散与模糊动态遗传编程

Richard J. Preen, Larry Bull

AI总结 本文在XCSF框架内使用离散和模糊动态系统表示(异步随机布尔网络和模糊逻辑网络),通过自适应的开放式进化设计集成系统,解决多个经典测试问题。

详情
Journal ref
Soft Computing (2014), 18(1):153-167
AI中文摘要

学习分类系统中已经提出了多种表示方案,从二进制编码到神经网络。本文报告了在XCSF学习分类系统中使用离散和模糊动态系统表示的研究结果。具体而言,在离散情况下使用异步随机布尔网络表示传统的条件-动作生产系统规则,在连续值情况下使用异步模糊逻辑网络。研究表明,可以在XCSF中使用自适应的开放式进化来设计此类动态系统的集成,以解决多个著名的测试问题。

英文摘要

A number of representation schemes have been presented for use within learning classifier systems, ranging from binary encodings to neural networks. This paper presents results from an investigation into using discrete and fuzzy dynamical system representations within the XCSF learning classifier system. In particular, asynchronous random Boolean networks are used to represent the traditional condition-action production system rules in the discrete case and asynchronous fuzzy logic networks in the continuous-valued case. It is shown that self-adaptive, open-ended evolution can be used to design an ensemble of such dynamical systems within XCSF to solve a number of well-known test problems.

1106.3703 2026-06-03 nlin.AO cs.AI cs.IT cs.LG cs.SY eess.SY math.IT q-bio.QM stat.ME 版本更新

Prediction and Modularity in Dynamical Systems

动力系统中的预测与模块性

Artemy Kolchinsky, Luis M. Rocha

AI总结 本文从统计建模和预测的角度,利用模型简洁性与预测精度之间的权衡,提出了一种将动力网络最优多尺度分解为弱耦合简单模块的方法,并给出了状态依赖和因果版本。

Comments v1 published in ECAL 2011 (European Conference on Artificial Life). v2 fixes error in causal risk (number of parameters should be based on training distribution)

详情
AI中文摘要

识别和理解模块化组织是复杂系统研究中的核心问题。已有多种方法被提出,其中许多以信息论术语表述。我们的研究从动力系统的统计建模和预测这一互补视角出发。已知对于有限量的训练数据,简单模型可能比复杂模型具有更强的预测能力。我们利用模型简洁性与预测精度之间的权衡,将动力网络最优多尺度分解为弱耦合的简单模块。还提出了我们方法的状态依赖和因果版本。

英文摘要

Identifying and understanding modular organizations is centrally important in the study of complex systems. Several approaches to this problem have been advanced, many framed in information-theoretic terms. Our treatment starts from the complementary point of view of statistical modeling and prediction of dynamical systems. It is known that for finite amounts of training data, simpler models can have greater predictive power than more complex ones. We use the trade-off between model simplicity and predictive accuracy to generate optimal multiscale decompositions of dynamical networks into weakly-coupled, simple modules. State-dependent and causal versions of our method are also proposed.

1209.2194 2026-06-03 math.OC cs.LG cs.MA cs.SY eess.SY 版本更新

Cooperative learning in multi-agent systems from intermittent measurements

基于间歇测量的多智能体系统协同学习

Naomi Ehrich Leonard, Alex Olshevsky

AI总结 针对时变连接和间歇测量下的多智能体系统,提出一种分布式学习协议,从噪声测量中学习未知向量μ,并给出学习速度与网络大小和组合特征的关系。

详情
AI中文摘要

受分布式跟踪方向问题的启发,我们考虑具有时变连接和间歇测量的多智能体系统中的协同学习一般问题。我们提出了一种分布式学习协议,能够从自主节点独立进行的噪声测量中学习未知向量$μ$。我们的协议完全分布式,能够应对智能体间通信的时变、不可预测和噪声特性,以及$μ$的间歇噪声测量。我们的主要结果根据连接节点的(时变)网络的大小和组合特征,界定了协议的学习速度。

英文摘要

Motivated by the problem of tracking a direction in a decentralized way, we consider the general problem of cooperative learning in multi-agent systems with time-varying connectivity and intermittent measurements. We propose a distributed learning protocol capable of learning an unknown vector $μ$ from noisy measurements made independently by autonomous nodes. Our protocol is completely distributed and able to cope with the time-varying, unpredictable, and noisy nature of inter-agent communication, and intermittent noisy measurements of $μ$. Our main result bounds the learning speed of our protocol in terms of the size and combinatorial features of the (time-varying) networks connecting the nodes.

1210.7559 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

Tensor decompositions for learning latent variable models

用于学习潜变量模型的张量分解

Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade, Matus Telgarsky

AI总结 本文利用低阶可观测矩的张量结构,通过对称张量分解(类似矩阵SVD的推广)实现高斯混合模型、隐马尔可夫模型等潜变量模型的参数估计,并提供了鲁棒张量幂法的详细分析。

详情
Journal ref
Journal of Machine Learning Research, 15(Aug):2773-2832, 2014
AI中文摘要

本文研究了一类广泛的潜变量模型(包括高斯混合模型、隐马尔可夫模型和潜在狄利克雷分配)的计算和统计高效参数估计方法,该方法利用了其低阶可观测矩(通常是二阶和三阶)中的特定张量结构。具体地,参数估计被简化为从矩导出的对称张量中提取某种(正交)分解的问题;这种分解可以看作是矩阵奇异值分解的自然推广。尽管张量分解通常难以计算,但这些特殊结构张量的分解可以通过多种方法高效获得,包括幂迭代和最大化方法(类似于矩阵的情况)。本文提供了鲁棒张量幂方法的详细分析,建立了与矩阵奇异向量的Wedin扰动定理相类似的结果。这意味着对于几种流行的潜变量模型,存在一种鲁棒且计算可行的估计方法。

英文摘要

This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models---including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation---which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.
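The power iteration mentioned above can be sketched on a synthetic third-order symmetric tensor with an exact orthogonal decomposition $T = \sum_i \lambda_i\, v_i \otimes v_i \otimes v_i$. This is the plain iteration $u \leftarrow T(I,u,u)/\|T(I,u,u)\|$ only; the robust method analyzed in the paper may include further steps (such as restarts and deflation) that are not shown. The weights and vectors are made up, and `tensor_power_iteration` is an illustrative name.

```python
import numpy as np

def tensor_power_iteration(T, u0, n_iter=100):
    """Plain power iteration for a symmetric 3rd-order tensor."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_iter):
        u = np.einsum("ijk,j,k->i", T, u, u)   # contraction T(I, u, u)
        u = u / np.linalg.norm(u)
    eigval = np.einsum("ijk,i,j,k->", T, u, u, u)  # T(u, u, u)
    return u, float(eigval)

rng = np.random.default_rng(0)
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthonormal columns v_i
weights = np.array([3.0, 2.0, 1.5, 1.0])           # lambda_i > 0
T = np.einsum("r,ir,jr,kr->ijk", weights, V, V, V)

u, eigval = tensor_power_iteration(T, rng.standard_normal(4))
overlaps = np.abs(V.T @ u)
print(overlaps.round(6), eigval)  # one overlap is close to 1
```

Which component the iteration converges to depends on the starting vector, so recovering all components requires repeating the procedure after removing the components already found.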

1204.4200 2026-06-03 cs.AI cs.LG cs.NE cs.SY eess.SY 版本更新

Discrete Dynamical Genetic Programming in XCS

XCS中的离散动力遗传编程

Richard J. Preen, Larry Bull

AI总结 本文研究在XCS学习分类器系统中使用异步随机布尔网络作为离散动力系统表示,通过自适应的开放式进化设计集成系统以解决多个经典测试问题。

Comments arXiv admin note: substantial text overlap with arXiv:1201.5604

详情
Journal ref
In Proceedings of the 11th annual conference on genetic and evolutionary computation, GECCO '09, pp. 1299-1306. ACM, 2009
AI中文摘要

在XCS学习分类器系统中,已有多种表示方案,从二进制编码到神经网络。本文研究了在XCS中使用离散动力系统表示的结果。特别地,使用异步随机布尔网络来表示传统的条件-动作生产系统规则。结果表明,可以通过自适应的开放式进化在XCS中设计这样的离散动力系统集成,以解决多个著名的测试问题。

英文摘要

A number of representation schemes have been presented for use within Learning Classifier Systems, ranging from binary encodings to neural networks. This paper presents results from an investigation into using a discrete dynamical system representation within the XCS Learning Classifier System. In particular, asynchronous random Boolean networks are used to represent the traditional condition-action production system rules. It is shown that self-adaptive, open-ended evolution can be used to design an ensemble of such discrete dynamical systems within XCS to solve a number of well-known test problems.

1211.4116 2026-06-03 cs.LG cs.NA math.AG math.CO math.NA stat.ML 版本更新

The Algebraic Combinatorial Approach for Low-Rank Matrix Completion

低秩矩阵补全的代数组合方法

Franz J. Király, Louis Theran, Ryota Tomioka

AI总结 本文提出一种基于代数几何和拟阵理论的代数组合新视角,通过研究少量条目间的关系来解决低秩矩阵补全问题,并给出概率为1的算法判断特定条目是否可补全、从少量条目补全该条目及估计补全误差。

Comments 37 pages, with an appendix by Takeaki Uno

详情
AI中文摘要

我们提出了一种新颖的低秩矩阵补全的代数组合观点,该观点基于使用代数几何和拟阵理论的工具研究少量条目之间的关系。该方法的固有局部性允许在封闭的理论和实践框架中处理单个条目。更具体地说,除了引入低秩矩阵补全的代数组合理论外,我们还提出了概率为1的算法,以决定矩阵的特定条目是否可以被补全。我们还描述了从少量其他条目补全该条目的方法,以及估计任何补全该条目的方法所产生的误差。此外,我们展示了关于矩阵补全的已知结果及其采样假设如何与我们的新视角相关联,并可根据可补全性相变进行解释。

英文摘要

We present a novel algebraic combinatorial view on low-rank matrix completion based on studying relations between a few entries with tools from algebraic geometry and matroid theory. The intrinsic locality of the approach allows for the treatment of single entries in a closed theoretical and practical framework. More specifically, apart from introducing an algebraic combinatorial theory of low-rank matrix completion, we present probability-one algorithms to decide whether a particular entry of the matrix can be completed. We also describe methods to complete that entry from a few others, and to estimate the error which is incurred by any method completing that entry. Furthermore, we show how known results on matrix completion and their sampling assumptions can be related to our new perspective and interpreted in terms of a completability phase transition.

1012.1919 2026-06-03 math.NA cs.IT cs.LG cs.NA math.IT 版本更新

Low-Rank Structure Learning via Log-Sum Heuristic Recovery

通过对数求和启发式恢复的低秩结构学习

Yue Deng, Qionghai Dai, Risheng Liu, Zengke Zhang, Sanqing Hu

AI总结 提出对数求和启发式恢复(LHR)模型,通过非凸对数求和度量增强稀疏性,并采用MM型算法求解,在鲁棒主成分分析和低秩表示任务中优于基于ℓ1的方法。

Comments 13 pages, 3 figures

详情
Journal ref
IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 3, March 2013
AI中文摘要

从被破坏的观测中恢复内在数据结构在机器学习和信号处理的各种任务中扮演重要角色。本文提出一种新模型,称为对数求和启发式恢复(LHR),用于从被破坏数据中学习本质的低秩结构。与传统方法直接使用ℓ1范数衡量稀疏性不同,LHR引入更合理的对数求和度量,以增强内在低秩结构和稀疏破坏中的稀疏性。尽管所提出的LHR优化不再凸,但仍可通过一种主要化-最小化(MM)类型算法有效求解,该算法用凸代理迭代替换非凸目标函数,最终LHR落入重加权方法的一般框架。我们证明了MM型算法在连续迭代后可以收敛到一个稳定点。我们通过将模型应用于解决两个典型问题来测试其性能:鲁棒主成分分析(RPCA)和低秩表示(LRR)。对于RPCA,我们从模拟和实际应用的角度将LHR与基准方法主成分追踪(PCP)进行比较。对于LRR,我们将LHR应用于计算运动分割和股票聚类的低秩表示矩阵。低秩结构学习的实验结果表明,所提出的基于对数求和的模型在数据秩更高且破坏更密集的情况下,性能远优于基于ℓ1的方法。

英文摘要

Recovering intrinsic data structure from corrupted observations plays an important role in various tasks in the communities of machine learning and signal processing. In this paper, we propose a novel model, named log-sum heuristic recovery (LHR), to learn the essential low-rank structure from corrupted data. Different from traditional approaches, which directly utilize the $\ell_1$ norm to measure the sparseness, LHR introduces a more reasonable log-sum measurement to enhance the sparsity in both the intrinsic low-rank structure and in the sparse corruptions. Although the proposed LHR optimization is no longer convex, it can still be effectively solved by a majorization-minimization (MM) type algorithm, with which the non-convex objective function is iteratively replaced by its convex surrogate, and LHR finally falls into the general framework of reweighted approaches. We prove that the MM-type algorithm can converge to a stationary point after successive iterations. We test the performance of our proposed model by applying it to solve two typical problems: robust principal component analysis (RPCA) and low-rank representation (LRR). For RPCA, we compare LHR with the benchmark Principal Component Pursuit (PCP) method from both the perspectives of simulations and practical applications. For LRR, we apply LHR to compute the low-rank representation matrix for motion segmentation and stock clustering. Experimental results on low-rank structure learning demonstrate that the proposed log-sum based model performs much better than the $\ell_1$-based method on data with higher rank and with denser corruptions.
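The majorization-minimization idea can be illustrated on a much simpler problem than the paper's low-rank model: a separable denoising objective $\frac{1}{2}\|x-y\|^2 + \lambda\sum_i \log(|x_i|+\delta)$. Since the logarithm is concave, it is majorized by its tangent at the current iterate, and minimizing the resulting surrogate is a weighted soft-thresholding step. The toy problem, the parameter values, and the function names below are assumptions made for illustration and are not the LHR algorithm itself.

```python
import numpy as np

def logsum_objective(x, y, lam, delta):
    """0.5*||x - y||^2 + lam * sum(log(|x_i| + delta))."""
    return 0.5 * np.sum((x - y) ** 2) + lam * np.sum(np.log(np.abs(x) + delta))

def logsum_denoise_mm(y, lam, delta=0.1, n_iter=30):
    """MM iterations: reweighted soft-thresholding of y."""
    x = y.astype(float).copy()
    history = [logsum_objective(x, y, lam, delta)]
    for _ in range(n_iter):
        w = 1.0 / (np.abs(x) + delta)   # slope of log(t + delta) at t = |x_i|
        # Minimizer of the convex surrogate 0.5*(x - y)^2 + lam * w * |x|.
        x = np.sign(y) * np.maximum(np.abs(y) - lam * w, 0.0)
        history.append(logsum_objective(x, y, lam, delta))
    return x, history

rng = np.random.default_rng(0)
signal = np.zeros(50)
signal[:5] = 3.0                                   # sparse ground truth
y = signal + 0.3 * rng.standard_normal(50)
x_hat, history = logsum_denoise_mm(y, lam=0.2)
print(np.count_nonzero(x_hat), history[0], history[-1])
```

Small entries receive large weights and are pushed to zero, while large entries are penalized less than under a plain $\ell_1$ penalty, which is the intuition behind the reweighted framework named in the abstract.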

1302.6768 2026-06-03 math.NA cs.LG cs.NA stat.ML 版本更新

Missing Entries Matrix Approximation and Completion

缺失条目的矩阵逼近与补全

Gil Shabat, Yaniv Shmueli, Amir Averbuch

AI总结 针对仅部分条目已知的矩阵,提出一系列算法实现矩阵补全与逼近,支持低秩、核范数、谱范数等多种约束,并证明凸情形下全局收敛,无需参数且适用于图像重建及偏微分方程数据恢复。

详情
AI中文摘要

我们描述了当矩阵仅部分条目已知时,用于矩阵补全和矩阵逼近的几种算法。逼近约束可以是任何对于完整矩阵已知其近似解的约束。对于低秩逼近,类似的算法最近在文献中以不同名称出现。在这项工作中,我们引入了矩阵逼近的新定理,并表明这些算法可以扩展到处理不同的约束,例如核范数、谱范数、正交约束等,这些约束与低秩逼近不同。由于这些算法可以从优化的角度看待,我们讨论了它们在凸情形下收敛到全局解的问题。我们还讨论了最优步长,并表明它在每次迭代中是固定的。此外,推导出的矩阵补全流是鲁棒的,不需要任何参数。该矩阵补全流适用于不同的谱最小化问题,并可应用于物理、数学和电气工程问题,例如图像数据重建以及来自偏微分方程(如用于电磁波的亥姆霍兹方程)的数据重建。

英文摘要

We describe several algorithms for matrix completion and matrix approximation when only some of the matrix entries are known. The approximation constraint can be any constraint whose approximated solution is known for the full matrix. For low-rank approximations, similar algorithms have appeared recently in the literature under different names. In this work, we introduce new theorems for matrix approximation and show that these algorithms can be extended to handle constraints other than low-rank approximation, such as nuclear norm, spectral norm, and orthogonality constraints. As the algorithms can be viewed from an optimization point of view, we discuss their convergence to a global solution for the convex case. We also discuss the optimal step size and show that it is fixed in each iteration. In addition, the derived matrix completion flow is robust and does not require any parameters. This matrix completion flow is applicable to different spectral minimizations and can be applied to physics, mathematics and electrical engineering problems such as data reconstruction of images and data coming from PDEs such as the Helmholtz equation used for electromagnetic waves.
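One familiar scheme in this spirit, shown here as a hedged sketch rather than the authors' exact algorithm, alternates between two steps: apply the known full-matrix solution for the constraint (for a rank constraint, the truncated SVD) and then restore the known entries. The test matrix, the mask, and the name `complete_low_rank` are made up for illustration.

```python
import numpy as np

def complete_low_rank(M, mask, rank, n_iter=200):
    """Alternate rank truncation with re-imposing the known entries.

    M: matrix whose entries are trusted only where mask is True.
    """
    X = np.where(mask, M, 0.0)                       # start from zero fill
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]     # best rank-r approximation
        X = np.where(mask, M, L)                     # keep known entries fixed
    return X

# Hypothetical rank-1 matrix with four unknown entries.
u = 1.0 + 0.1 * np.arange(6)
v = 2.0 - 0.1 * np.arange(6)
M_true = np.outer(u, v)
mask = np.ones((6, 6), dtype=bool)
for idx in [(0, 0), (1, 2), (3, 4), (5, 1)]:
    mask[idx] = False

X = complete_low_rank(M_true, mask, rank=1)
print(np.max(np.abs(X - M_true)))  # small when the iteration has converged
```

Replacing the truncated SVD with the full-matrix solution for another constraint gives the corresponding variant; the scheme has no tuning parameter apart from the constraint itself and the number of iterations.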

1206.1623 2026-06-03 stat.ML cs.DS cs.LG cs.NA math.NA math.OC 版本更新

Proximal Newton-type methods for minimizing composite functions

最小化复合函数的近端牛顿型方法

Jason D. Lee, Yuekai Sun, Michael A. Saunders

AI总结 针对光滑函数与非光滑凸函数之和的优化问题,提出近端牛顿型方法,并证明其在不精确搜索方向下仍保持牛顿型方法的收敛性,统一了生物信息学、信号处理和统计学习中的多种流行方法。

详情
AI中文摘要

我们将最小化光滑函数的牛顿型方法推广到处理两个凸函数之和:一个光滑函数和一个具有简单近端映射的非光滑函数。我们表明,即使搜索方向计算不精确,所得到的近端牛顿型方法也继承了最小化光滑函数的牛顿型方法的理想收敛行为。许多针对生物信息学、信号处理和统计学习中出现的特定问题的流行方法是近端牛顿型方法的特例,我们的分析为其中一些方法提供了新的收敛结果。

英文摘要

We generalize Newton-type methods for minimizing smooth functions to handle a sum of two convex functions: a smooth function and a nonsmooth function with a simple proximal mapping. We show that the resulting proximal Newton-type methods inherit the desirable convergence behavior of Newton-type methods for minimizing smooth functions, even when search directions are computed inexactly. Many popular methods tailored to problems arising in bioinformatics, signal processing, and statistical learning are special cases of proximal Newton-type methods, and our analysis yields new convergence results for some of these methods.

1301.3584 2026-06-03 cs.LG cs.NA math.NA 版本更新

Revisiting Natural Gradient for Deep Networks

重新审视深度网络的自然梯度

Razvan Pascanu, Yoshua Bengio

AI总结 本文重新评估了自然梯度算法在深度模型学习中的应用,揭示了其与Hessian-Free、Krylov子空间下降和TONGA方法的联系,并提出了利用无标签数据改进泛化误差、评估对训练集顺序的鲁棒性以及结合二阶信息的扩展算法。

详情
AI中文摘要

我们评估了最初由Amari (1997)提出的自然梯度算法在深度模型学习中的应用。本文的贡献如下:我们展示了自然梯度与其他三种最近提出的深度模型训练方法——Hessian-Free (Martens, 2010)、Krylov子空间下降 (Vinyals and Povey, 2012)和TONGA (Le Roux et al., 2008)——之间的联系。我们描述了如何利用无标签数据来改善自然梯度所获得的泛化误差,并实证评估了该算法相对于随机梯度下降对训练集顺序的鲁棒性。最后,我们将自然梯度扩展到结合流形信息与二阶信息,并使用截断牛顿方法(而非对角近似)来求逆度量矩阵,从而对新算法进行了基准测试。

英文摘要

We evaluate natural gradient, an algorithm originally proposed in Amari (1997), for learning deep models. The contributions of this paper are as follows. We show the connection between natural gradient and three other recently proposed methods for training deep models: Hessian-Free (Martens, 2010), Krylov Subspace Descent (Vinyals and Povey, 2012) and TONGA (Le Roux et al., 2008). We describe how one can use unlabeled data to improve the generalization error obtained by natural gradient and empirically evaluate the robustness of the algorithm to the ordering of the training set compared to stochastic gradient descent. Finally we extend natural gradient to incorporate second order information alongside the manifold information and provide a benchmark of the new algorithm using a truncated Newton approach for inverting the metric matrix instead of using a diagonal approximation of it.

1302.3447 2026-06-03 math.ST cs.LG cs.NA math.NA math.PR stat.TH 版本更新

Exact Methods for Multistage Estimation of a Binomial Proportion

二项比例多阶段估计的精确方法

Zhengjia Chen, Xinjia Chen

AI总结 本文回顾了现有二项比例序贯估计方法,提出了一类新的组序贯抽样方案,在给定误差和置信水平下实现均匀覆盖概率控制和渐近最优性,并推导了样本数的解析界。

Comments 38 pages, 9 figures

详情
AI中文摘要

我们首先回顾了现有的估计二项比例的序贯方法。之后,我们提出了一类新的组序贯抽样方案,用于在给定的误差范围和置信水平下估计二项比例。特别地,我们建立了这类抽样方案的覆盖概率的一致可控性和渐近最优性。我们的理论结果确立了这类抽样方案的参数可以确定,从而以很少的样本浪费保证指定的置信水平。推导了累积分布函数和样本数期望的解析界。此外,我们讨论了各种抽样方案的内在联系。解决了数值问题以提高计算的准确性和效率。进行了计算实验以比较抽样方案。给出了在临床试验中应用的说明性示例。

英文摘要

We first review existing sequential methods for estimating a binomial proportion. Afterward, we propose a new family of group sequential sampling schemes for estimating a binomial proportion with prescribed margin of error and confidence level. In particular, we establish the uniform controllability of coverage probability and the asymptotic optimality for such a family of sampling schemes. Our theoretical results establish the possibility that the parameters of this family of sampling schemes can be determined so that the prescribed level of confidence is guaranteed with little waste of samples. Analytic bounds for the cumulative distribution functions and expectations of sample numbers are derived. Moreover, we discuss the inherent connection of various sampling schemes. Numerical issues are addressed for improving the accuracy and efficiency of computation. Computational experiments are conducted for comparing sampling schemes. Illustrative examples are given for applications in clinical trials.

1303.4207 2026-06-03 cs.LG cs.NA math.NA 版本更新

Improving CUR Matrix Decomposition and the Nyström Approximation via Adaptive Sampling

通过自适应采样改进CUR矩阵分解和Nyström近似

Shusen Wang, Zhihua Zhang

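AI summary By establishing a more general error bound for the adaptive column/row sampling algorithm, this paper proposes more accurate CUR and Nyström algorithms with expected relative-error bounds, and gives a theoretical analysis of lower error bounds for the standard and ensemble Nyström methods.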

Journal ref
Journal of Machine Learning Research, 14: 2549-2589, 2013
AI中文摘要

CUR矩阵分解和Nyström近似是两种重要的低秩矩阵近似技术。Nyström方法通过少量列来近似对称半正定矩阵,而CUR通过少量列和行来近似任意数据矩阵。因此,CUR分解可以看作是Nyström近似的扩展。在本文中,我们为自适应列/行采样算法建立了更一般的误差界,基于此我们提出了具有预期相对误差界的更精确CUR和Nyström算法。所提出的CUR和Nyström算法也具有低时间复杂度,并且可以避免将整个数据矩阵保存在内存中。此外,我们对标准Nyström方法和集成Nyström方法的低误差界进行了理论分析。本文建立的主要理论结果是新颖的,并且我们的分析对数据矩阵没有特殊假设。

英文摘要

The CUR matrix decomposition and the Nyström approximation are two important low-rank matrix approximation techniques. The Nyström method approximates a symmetric positive semidefinite matrix in terms of a small number of its columns, while CUR approximates an arbitrary data matrix by a small number of its columns and rows. Thus, CUR decomposition can be regarded as an extension of the Nyström approximation. In this paper we establish a more general error bound for the adaptive column/row sampling algorithm, based on which we propose more accurate CUR and Nyström algorithms with expected relative-error bounds. The proposed CUR and Nyström algorithms also have low time complexity and can avoid maintaining the whole data matrix in RAM. In addition, we give theoretical analysis for the lower error bounds of the standard Nyström method and the ensemble Nyström method. The main theoretical results established in this paper are novel, and our analysis makes no special assumption on the data matrices.
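
For orientation, the two approximations named in this abstract are commonly written as below. This is a sketch in standard notation; the symbols $K$, $A$, $C$, $W$, $U$, $R$ and the particular choice of $U$ are assumptions made here for illustration and are not taken from the abstract.

```latex
% Nyström: K is n x n symmetric positive semidefinite, C holds c sampled
% columns of K, and W is the c x c block of K at the sampled rows and columns.
\tilde{K} = C \, W^{\dagger} \, C^{\top}

% CUR: A is an arbitrary m x n matrix, C holds sampled columns of A,
% R holds sampled rows of A. One common choice of the middle factor is
% U = C^{\dagger} A R^{\dagger}.
A \approx C \, U \, R
```

Setting $R = C^{\top}$ for a symmetric matrix shows why CUR can be read as an extension of the Nyström form.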

1303.4694 2026-06-03 math.NA cs.LG cs.NA stat.ML 版本更新

Recovering Non-negative and Combined Sparse Representations

恢复非负和组合稀疏表示

Karthikeyan Natesan Ramamurthy, Jayaraman J. Thiagarajan, Andreas Spanias

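AI summary Using the theory of polytopes, this paper studies conditions for the uniqueness of non-negative solutions to underdetermined linear systems, and proposes the combined sparse representation paradigm together with a combined orthogonal matching pursuit algorithm for recovering the unique sparsest coefficient vector.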

AI中文摘要

有时,欠定线性系统的非负解可以被唯一恢复,即使不施加任何额外的稀疏约束。在本文中,我们基于多面体理论推导了此类系统存在唯一非负解的条件。此外,我们发展了组合稀疏表示的范式,其中只有部分系数向量被约束为非负,其余部分无约束(一般)。我们分析了在三种不同的系数支持知识情况下,组合表示的唯一最稀疏解的恢复:(a)非负系数和一般系数的非零支持已知,(b)仅一般系数的非零支持已知,(c)两个非零支持均未知。对于情况(c),我们提出了组合正交匹配追踪算法用于系数恢复,并推导了确定性稀疏度阈值,在该阈值下可以恢复唯一最稀疏的系数向量。我们量化了算法的阶复杂度,并检验了它们在各种噪声条件下精确和近似恢复系数的性能。此外,我们还获得了它们的经验相变特性。我们表明,与无约束的对应算法相比,具有部分非负约束的基追踪算法和所提出的贪婪算法在恢复唯一稀疏表示方面表现更好。最后,我们展示了所提方法在恢复受饱和噪声污染的图像中的实用性。

英文摘要

The non-negative solution to an underdetermined linear system can be uniquely recovered sometimes, even without imposing any additional sparsity constraints. In this paper, we derive conditions under which a unique non-negative solution for such a system can exist, based on the theory of polytopes. Furthermore, we develop the paradigm of combined sparse representations, where only a part of the coefficient vector is constrained to be non-negative, and the rest is unconstrained (general). We analyze the recovery of the unique, sparsest solution, for combined representations, under three different cases of coefficient support knowledge: (a) the non-zero supports of non-negative and general coefficients are known, (b) the non-zero support of general coefficients alone is known, and (c) both the non-zero supports are unknown. For case (c), we propose the combined orthogonal matching pursuit algorithm for coefficient recovery and derive the deterministic sparsity threshold under which recovery of the unique, sparsest coefficient vector is possible. We quantify the order complexity of the algorithms, and examine their performance in exact and approximate recovery of coefficients under various conditions of noise. Furthermore, we also obtain their empirical phase transition characteristics. We show that the basis pursuit algorithm, with partial non-negative constraints, and the proposed greedy algorithm perform better in recovering the unique sparse representation when compared to their unconstrained counterparts. Finally, we demonstrate the utility of the proposed methods in recovering images corrupted by saturation noise.

1102.2490 2026-06-03 math.ST cs.LG cs.SY eess.SY math.OC stat.TH 版本更新

The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond

KL-UCB算法:有界随机赌博机及其扩展

Aurélien Garivier, Olivier Cappé

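AI summary This paper presents a finite-time analysis of the KL-UCB algorithm, showing that its regret bound is better than those of UCB and UCB2 for bounded rewards and that it reaches the Lai and Robbins lower bound for Bernoulli rewards; the algorithm is extended to exponential families, and numerical experiments show it to be efficient and stable.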

Comments 18 pages, 3 figures; Conf. Comput. Learning Theory (COLT) 2011 in Budapest, Hungary

Journal ref
Conference On Learning Theory n°24 Jul. 2011 pp.359-376
AI中文摘要

本文对KL-UCB算法进行了有限时间分析,该算法是一种在线、无时间视界的随机赌博机问题的索引策略。我们证明了两个不同的结果:首先,对于任意有界奖励,KL-UCB算法满足比UCB或UCB2一致更优的遗憾界;其次,在伯努利奖励的特殊情况下,它达到了Lai和Robbins的下界。此外,我们展示了KL-UCB算法的简单改编对于特定类别的(可能无界)奖励也是最优的,包括那些从指数族分布生成的奖励。一项大规模数值研究将KL-UCB与其主要竞争对手(UCB、UCB2、UCB-Tuned、UCB-V、DMED)进行比较,表明KL-UCB非常高效且稳定,包括在短时间范围内。KL-UCB也是唯一始终优于基本UCB策略的方法。我们的遗憾界依赖于附录中陈述并证明的具有独立兴趣的偏差结果。作为副产品,我们还获得了标准UCB算法的改进遗憾界。

英文摘要

This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems. We prove two distinct results: first, for arbitrary bounded rewards, the KL-UCB algorithm satisfies a uniformly better regret bound than UCB or UCB2; second, in the special case of Bernoulli rewards, it reaches the lower bound of Lai and Robbins. Furthermore, we show that simple adaptations of the KL-UCB algorithm are also optimal for specific classes of (possibly unbounded) rewards, including those generated from exponential families of distributions. A large-scale numerical study comparing KL-UCB with its main competitors (UCB, UCB2, UCB-Tuned, UCB-V, DMED) shows that KL-UCB is remarkably efficient and stable, including for short time horizons. KL-UCB is also the only method that always performs better than the basic UCB policy. Our regret bounds rely on deviation results of independent interest, which are stated and proved in the Appendix. As a by-product, we also obtain an improved regret bound for the standard UCB algorithm.
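
To make the idea of a KL-based index concrete, here is a minimal sketch for Bernoulli rewards. It assumes the index of an arm is the largest mean `q` whose Bernoulli KL divergence from the empirical mean, scaled by the number of pulls, stays within `log(t)`; the function names, the bisection solver, and the omission of any additional lower-order exploration term are choices made here for illustration, not details given in the abstract.

```python
import math

def bernoulli_kl(p: float, q: float) -> float:
    """KL(Ber(p) || Ber(q)); inputs are clipped away from 0 and 1 to avoid log(0)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def kl_ucb_index(mean: float, pulls: int, t: int, iters: int = 60) -> float:
    """Approximate the largest q in [mean, 1] with pulls * KL(mean, q) <= log(t)."""
    level = math.log(t) / pulls
    lo, hi = mean, 1.0
    # Bisection works because KL(mean, q) is increasing in q on [mean, 1].
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if bernoulli_kl(mean, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo
```

A policy built on this would pull each arm once and then, at every round, play the arm with the largest index; the index shrinks toward the empirical mean as an arm accumulates pulls.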

1211.6687 2026-06-03 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices

Hottopixx的鲁棒性分析:一个用于非负矩阵分解的线性规划模型

Nicolas Gillis

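AI summary This paper gives a robustness analysis of the Hottopixx linear programming model and proposes a post-processing strategy that improves robustness to duplicates and near duplicates in the dataset.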

Comments 23 pages; new numerical results; Comparison with Arora et al.; Accepted in SIAM J. Mat. Anal. Appl

Journal ref
SIAM J. Matrix Anal. & Appl. 34 (3), pp. 1189-1212, 2013
AI中文摘要

尽管非负矩阵分解(NMF)通常是NP难的,但最近研究表明,在输入非负数据矩阵接近可分离的假设下(可分离性要求输入矩阵的所有列属于由这些列的一个小子集张成的锥),NMF是易处理的。此后,设计了多种算法来处理这类NMF子问题。特别地,Bittorf、Recht、Ré和Tropp(《用线性规划分解非负矩阵》,NIPS 2012)提出了一种线性规划模型,称为Hottopixx。本文提供了对其方法的一种新的、更一般的鲁棒性分析。特别地,我们通过一种后处理策略设计了一个可证明更鲁棒的变体,该策略允许我们处理数据集中的重复和近似重复。

英文摘要

Although nonnegative matrix factorization (NMF) is NP-hard in general, it has been shown very recently that it is tractable under the assumption that the input nonnegative data matrix is close to being separable (separability requires that all columns of the input matrix belong to the cone spanned by a small subset of these columns). Since then, several algorithms have been designed to handle this subclass of NMF problems. In particular, Bittorf, Recht, Ré and Tropp ("Factoring nonnegative matrices with linear programs", NIPS 2012) proposed a linear programming model, referred to as Hottopixx. In this paper, we provide a new and more general robustness analysis of their method. In particular, we design a provably more robust variant using a post-processing strategy which allows us to deal with duplicates and near duplicates in the dataset.

1210.5323 2026-06-03 cs.IT cs.LG cs.NA math.IT math.NA 版本更新

The performance of orthogonal multi-matching pursuit under RIP

正交多匹配追踪在RIP下的性能

Zhiqiang Xu

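AI summary This paper studies the performance of orthogonal multi-matching pursuit (OMMP) under the restricted isometry property (RIP), proving that under a specific RIP condition OMMP recovers $s$-sparse signals within $s$ iterations and that fewer iterations suffice for slowly-decaying sparse signals.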

Comments 22 pages

AI中文摘要

正交多匹配追踪(OMMP)是正交匹配追踪(OMP)的自然扩展。我们将参数为$M$的OMMP记为OMMP(M),其中$M\geq 1$是整数。OMP与OMMP(M)的主要区别在于,OMMP(M)每次迭代选择$M$个原子,而OMP每次只向最优原子集添加一个原子。本文研究正交多匹配追踪(OMMP)在RIP下的性能。特别地,我们证明,当测量矩阵A满足$(9s, 1/10)$-RIP时,存在绝对常数$M_0\leq 8$使得OMMP(M_0)能在$s$次迭代内恢复$s$-稀疏信号。我们进一步证明,对于慢衰减的$s$-稀疏信号,对于一大类$M$,OMMP(M)能在$O(\frac{s}{M})$次迭代内恢复$s$-稀疏信号。特别地,对于$M=s^a$且$a\in [0,1/2]$,OMMP(M)能在$O(s^{1-a})$次迭代内恢复慢衰减的$s$-稀疏信号。该结果表明OMMP能大幅降低计算复杂度。

英文摘要

The orthogonal multi-matching pursuit (OMMP) is a natural extension of orthogonal matching pursuit (OMP). We denote the OMMP with the parameter $M$ as OMMP(M), where $M\geq 1$ is an integer. The main difference between OMP and OMMP(M) is that OMMP(M) selects $M$ atoms per iteration, while OMP only adds one atom to the optimal atom set. In this paper, we study the performance of OMMP under RIP. In particular, we show that, when the measurement matrix A satisfies $(9s, 1/10)$-RIP, there exists an absolute constant $M_0\leq 8$ so that OMMP(M_0) can recover $s$-sparse signals within $s$ iterations. We furthermore prove that, for slowly-decaying $s$-sparse signals, OMMP(M) can recover $s$-sparse signals within $O(\frac{s}{M})$ iterations for a large class of $M$. In particular, for $M=s^a$ with $a\in [0,1/2]$, OMMP(M) can recover slowly-decaying $s$-sparse signals within $O(s^{1-a})$ iterations. The result implies that OMMP can substantially reduce the computational complexity.
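
For readers unfamiliar with the notation, the restricted isometry property is usually stated as follows. This is the standard textbook definition; reading "$(9s, 1/10)$-RIP" as order $k = 9s$ with constant $\delta = 1/10$ is an assumption about the abstract's convention.

```latex
% A satisfies the RIP of order k with constant \delta \in (0, 1) if,
% for every vector x with at most k nonzero entries,
(1 - \delta)\,\|x\|_2^2 \;\le\; \|A x\|_2^2 \;\le\; (1 + \delta)\,\|x\|_2^2 .
```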

1211.3500 2026-06-03 math.NA cs.LG cs.NA 版本更新

Accelerated Canonical Polyadic Decomposition by Using Mode Reduction

利用模式约简加速典型多路分解

Guoxu Zhou, Andrzej Cichocki, Shengli Xie

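AI summary To address the efficiency bottleneck caused by frequently unfolding the tensor to each of its N modes in CP decomposition of high-order tensors, this paper proposes first converting the Nth-order tensor to a 3rd-order tensor and then decomposing it, which avoids mode-by-mode unfolding while preserving uniqueness of the decomposition and improving efficiency.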

Comments 12 pages. Accepted by TNNLS

AI中文摘要

典型多路(或CANDECOMP/PARAFAC,CP)分解(CPD)广泛应用于分析高阶张量。现有的CPD方法使用交替最小二乘(ALS)迭代,因此需要频繁地将张量展开到每个$N$个模式,这是大规模数据特别是当$N$较大时效率的主要瓶颈之一。为了解决这个问题,本文提出了一种新的CPD方法,该方法首先将原始的$N$阶($N>3$)张量转换为3阶张量。然后通过分解这个模式约简后的张量,再经过Khatri-Rao积投影过程来实现完整的CPD。这种方法非常高效,因为避免了展开到每个$N$个模式,并且可以轻松地加入降维以进一步提高效率。我们证明,在温和条件下,任何$N$阶CPD都可以转化为3阶情况,而不会破坏本质唯一性,并且理论上给出与直接$N$路CPD方法相同的结果。仿真表明,与最先进的CPD方法相比,所提方法更高效,且更容易摆脱局部解。

英文摘要

Canonical Polyadic (or CANDECOMP/PARAFAC, CP) decompositions (CPD) are widely applied to analyze high-order tensors. Existing CPD methods use alternating least squares (ALS) iterations and hence need to unfold tensors to each of the $N$ modes frequently, which is one major efficiency bottleneck for large-scale data, especially when $N$ is large. To overcome this problem, in this paper we propose a new CPD method which first converts the original $N$th-order ($N>3$) tensor to a 3rd-order tensor. The full CPD is then realized by decomposing this mode-reduced tensor, followed by a Khatri-Rao product projection procedure. This approach is quite efficient because unfolding to each of the $N$ modes is avoided, and dimensionality reduction can also be easily incorporated to further improve the efficiency. We show that, under mild conditions, any $N$th-order CPD can be converted into a 3rd-order case without destroying the essential uniqueness, and that it theoretically gives the same results as direct $N$-way CPD methods. Simulations show that, compared with state-of-the-art CPD methods, the proposed method is more efficient and escapes from local solutions more easily.

1303.1849 2026-06-03 cs.LG cs.DS cs.NA math.NA 版本更新

Revisiting the Nystrom Method for Improved Large-Scale Machine Learning

重新审视 Nyström 方法以改进大规模机器学习

Alex Gittens, Michael W. Mahoney

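AI summary This paper revisits randomized algorithms for low-rank approximation of symmetric positive semi-definite matrices, compares sampling and projection methods through empirical evaluation and theoretical analysis, and presents improved error bounds.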

Comments 60 pages, 15 color figures; updated proof of Frobenius norm bounds, added comparison to projection-based low-rank approximations, and an analysis of the power method applied to SPSD sketches

AI中文摘要

我们重新考虑了数据分析和机器学习应用中出现的对称半正定(SPSD)矩阵(如拉普拉斯矩阵和核矩阵)低秩近似的随机算法。我们的主要结果包括对一系列多样化的SPSD矩阵上采样和投影方法的性能质量和运行时间的经验评估。我们的结果突出了采样与投影方法的互补性;它们描述了常见数据预处理步骤对这些算法性能的影响;并指出了基于杠杆得分的均匀采样与非均匀采样方法之间的重要差异。此外,我们的经验结果表明,现有理论非常薄弱,甚至无法为实践提供定性指导。因此,我们用一套随机采样和随机投影方法的最坏情况理论界限来补充我们的经验结果。这些界限在质量上优于现有界限——例如,谱范数和Frobenius范数误差的改进加性误差界,以及迹范数误差的相对误差界——并指出了使这些算法在更大规模机器学习应用中更有用的未来方向。

英文摘要

We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods; they characterize the effects of common data preprocessing steps on the performance of these algorithms; and they point to important differences between uniform sampling and nonuniform sampling methods based on leverage scores. In addition, our empirical results illustrate that existing theory is so weak that it does not provide even a qualitative guide to practice. Thus, we complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds (e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error), and they point to future directions to make these algorithms useful in even larger-scale machine learning applications.

1204.4202 2026-06-03 cs.AI cs.LG cs.NE cs.SY eess.SY 版本更新

Fuzzy Dynamical Genetic Programming in XCSF

XCSF中的模糊动态遗传编程

Richard J. Preen, Larry Bull

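AI summary This paper investigates a fuzzy Dynamical Genetic Programming representation within the XCSF Learning Classifier System, using asynchronous Fuzzy Logic Networks and self-adaptive, open-ended evolution to solve continuous-valued test problems.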

Comments 2 page GECCO 2011 poster paper

Journal ref
In Proceedings of the 13th annual conference companion on genetic and evolutionary computation, GECCO '11, pp. 167-168. ACM, 2011
AI中文摘要

学习分类器系统中已提出多种表示方案,从二进制编码到神经网络,以及最近的动态遗传编程(DGP)。本文研究了在XCSF学习分类器系统中使用模糊DGP表示的结果。特别是,异步模糊逻辑网络用于表示传统的条件-动作产生式系统规则。结果表明,可以在XCSF内通过自适应性、开放式的演化设计一组这样的模糊动态系统,以解决几个著名的连续值测试问题。

英文摘要

A number of representation schemes have been presented for use within Learning Classifier Systems, ranging from binary encodings to Neural Networks, and more recently Dynamical Genetic Programming (DGP). This paper presents results from an investigation into using a fuzzy DGP representation within the XCSF Learning Classifier System. In particular, asynchronous Fuzzy Logic Networks are used to represent the traditional condition-action production system rules. It is shown possible to use self-adaptive, open-ended evolution to design an ensemble of such fuzzy dynamical systems within XCSF to solve several well-known continuous-valued test problems.

1211.7045 2026-06-03 cs.LG cs.NA math.NA math.OC q-bio.BM 版本更新

Orientation Determination from Cryo-EM images Using Least Unsquared Deviation

使用最小未平方偏差从冷冻电镜图像确定方向

Lanhui Wang, Amit Singer, Zaiwen Wen

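AI summary For two-dimensional projection images with unknown orientations in cryo-EM single particle reconstruction, this paper proposes a robust global self-consistency error model based on least unsquared deviations, solved via semidefinite relaxation with a spectral norm constraint or regularizer, which significantly reduces orientation estimation error when the detection rate of common-lines is low.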

AI中文摘要

冷冻电镜单颗粒重构的一个主要挑战是利用未知方向的二维投影图像建立可靠的三维初始模型。基于共线的方法无需额外几何信息即可估计方向。然而,当图像噪声水平过高导致共线检测率过低时,此类方法会失效。通过半定规划的凸松弛,得到了最小二乘全局自洽误差的近似。本文引入一种更鲁棒的全局自洽误差,并证明相应的优化问题可通过半定松弛求解。为了防止估计视角的人为聚类,我们进一步引入一个谱范数项,作为约束或正则化项添加到松弛的最小化问题中。所得问题通过交替方向乘子法或迭代重加权最小二乘过程求解。模拟和真实图像的数值实验表明,当共线检测率较低时,所提方法显著降低了方向估计误差。

英文摘要

A major challenge in single particle reconstruction from cryo-electron microscopy is to establish a reliable ab-initio three-dimensional model using two-dimensional projection images with unknown orientations. Common-lines based methods estimate the orientations without additional geometric information. However, such methods fail when the detection rate of common-lines is too low due to the high level of noise in the images. An approximation to the least squares global self-consistency error was obtained using convex relaxation by semidefinite programming. In this paper we introduce a more robust global self-consistency error and show that the corresponding optimization problem can be solved via semidefinite relaxation. In order to prevent artificial clustering of the estimated viewing directions, we further introduce a spectral norm term that is added as a constraint or as a regularization term to the relaxed minimization problem. The resulting problems are solved by using either the alternating direction method of multipliers or an iteratively reweighted least squares procedure. Numerical experiments with both simulated and real images demonstrate that the proposed methods significantly reduce the orientation estimation error when the detection rate of common-lines is low.

1204.1259 2026-06-03 cs.LG cs.IR cs.NA math.NA 版本更新

Fast ALS-based tensor factorization for context-aware recommendation from implicit feedback

基于快速ALS的张量分解用于隐式反馈的上下文感知推荐

Balázs Hidasi, Domonkos Tikk

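AI summary This paper proposes the iTALS algorithm, an ALS-based tensor factorization method that scales linearly with the number of non-zero elements and incorporates context information (such as seasonality and sequential patterns), significantly improving recommendation quality on implicit feedback datasets.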

Comments Accepted for ECML/PKDD 2012, presented on 25th September 2012, Bristol, UK

Journal ref
Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
AI中文摘要

尽管基于隐式反馈的推荐问题(仅用户历史可用而无评分)是实际应用中最典型的场景,但其研究远少于显式反馈情况。在显式情况下高效的最先进算法,若要保持可扩展性,无法直接转化为隐式情况。隐式反馈基准数据集很少,因此新想法通常基于显式基准进行实验。本文提出了一种通用的上下文感知隐式反馈推荐算法,称为iTALS。iTALS应用了一种快速的、基于ALS的张量分解学习方法,其规模与张量中非零元素数量呈线性关系。该方法还允许我们在保持计算效率的同时,将多样化的上下文信息融入模型。特别地,我们提出了iTALS的两种上下文感知实现变体。第一种融入季节性,能够区分不同时间间隔的用户行为。另一种将用户历史视为序列信息,能够识别特定物品组的典型使用模式,例如自动区分通常重复购买(收藏品、杂货)或一次性购买(家用电器)的产品类型或类别。在三个隐式数据集(两个专有数据集和Netflix数据集的隐式变体)上进行的实验表明,通过将上下文感知信息与我们的分解框架集成到最先进的隐式推荐算法中,推荐质量显著提高。

英文摘要

Although the implicit feedback based recommendation problem (when only the user history is available but there are no ratings) is the most typical setting in real-world applications, it is much less researched than the explicit feedback case. State-of-the-art algorithms that are efficient in the explicit case cannot be straightforwardly transformed to the implicit case if scalability is to be maintained. There are few if any implicit feedback benchmark datasets; therefore, new ideas are usually tested on explicit benchmarks. In this paper, we propose a generic context-aware implicit feedback recommender algorithm, coined iTALS. iTALS applies a fast, ALS-based tensor factorization learning method that scales linearly with the number of non-zero elements in the tensor. The method also allows us to incorporate diverse context information into the model while maintaining its computational efficiency. In particular, we present two such context-aware implementation variants of iTALS. The first incorporates seasonality and makes it possible to distinguish user behavior in different time intervals. The other views the user history as sequential information and can recognize usage patterns typical of certain groups of items, e.g. to automatically tell apart product types or categories that are typically purchased repetitively (collectibles, grocery goods) or once (household appliances). Experiments performed on three implicit datasets (two proprietary ones and an implicit variant of the Netflix dataset) show that integrating context-aware information with our factorization framework into the state-of-the-art implicit recommender algorithm improves the recommendation quality significantly.

1303.6370 2026-06-03 stat.ML cs.LG cs.NA math.NA 版本更新

Convex Tensor Decomposition via Structured Schatten Norm Regularization

通过结构化Schatten范数正则化的凸张量分解

Ryota Tomioka, Taiji Suzuki

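AI summary This paper studies structured Schatten norms for convex-optimization-based tensor decomposition, proves theoretically that the "latent" approach outperforms the "overlapped" approach in some settings, and establishes duality, consistency, and identifiability results.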

Comments 12 pages, 3 figures

AI中文摘要

我们讨论了用于张量分解的结构化Schatten范数,包括最近提出的两种用于基于凸优化的张量分解的范数(“重叠”和“潜在”),并将张量分解与更广泛的结构化稀疏性文献联系起来。基于结构化Schatten范数的性质,我们从数学上分析了“潜在”方法在张量分解中的性能,该方法在某些设置下经验上被发现比“重叠”方法表现更好。我们从理论上证明了这确实是事实。特别是,当未知的真实张量在特定模式下是低秩时,该方法的表现与知道最小秩的模式一样好。在此过程中,我们展示了结构化Schatten范数的一个新颖的对偶性结果,建立了一致性,并讨论了该方法的可识别性。通过数值模拟,我们确认了我们的理论预测可以精确预测均方误差的缩放行为。

英文摘要

We discuss structured Schatten norms for tensor decomposition that include two recently proposed norms ("overlapped" and "latent") for convex-optimization-based tensor decomposition, and connect tensor decomposition with the wider literature on structured sparsity. Based on the properties of the structured Schatten norms, we mathematically analyze the performance of the "latent" approach for tensor decomposition, which was empirically found to perform better than the "overlapped" approach in some settings. We show theoretically that this is indeed the case. In particular, when the unknown true tensor is low-rank in a specific mode, this approach performs as well as knowing the mode with the smallest rank. Along the way, we show a novel duality result for structured Schatten norms, establish consistency, and discuss the identifiability of this approach. We confirm through numerical simulations that our theory precisely predicts the scaling behavior of the mean squared error.

1303.4434 2026-06-03 cs.LG cs.NA math.NA stat.CO stat.ML 版本更新

A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems

非凸正则化优化问题的一般迭代收缩与阈值算法

Pinghua Gong, Changshui Zhang, Zhaosong Lu, Jianhua Huang, Jieping Ye

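AI summary For optimization problems with non-convex sparsity-inducing penalties, this paper proposes a General Iterative Shrinkage and Thresholding (GIST) algorithm that is solved efficiently using closed-form proximal operators and a line search initialized by the BB rule, and provides a convergence analysis.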

AI中文摘要

非凸稀疏诱导惩罚近年来在稀疏学习中受到广泛关注。最近的理论研究表明,在若干稀疏学习场景中,非凸惩罚优于其凸对应物。然而,与非凸惩罚相关的非凸优化问题的求解仍然是一个重大挑战。一种常用方法是多阶段(MS)凸松弛(或DC规划),它将原始非凸问题松弛为一系列凸问题。这种方法通常不适用于大规模问题,因为其计算成本是求解单个凸问题的倍数。在本文中,我们提出了一种通用迭代收缩与阈值(GIST)算法,用于求解一大类非凸惩罚的非凸优化问题。GIST算法迭代求解一个近端算子问题,而该问题对于许多常用惩罚具有闭式解。在算法的每次外迭代中,我们使用由Barzilai-Borwein(BB)规则初始化的线搜索,以快速找到合适的步长。本文还给出了GIST算法的详细收敛性分析。通过在大规模数据集上的大量实验,证明了所提算法的效率。

英文摘要

Non-convex sparsity-inducing penalties have recently received considerable attention in sparse learning. Recent theoretical investigations have demonstrated their superiority over the convex counterparts in several sparse learning settings. However, solving the non-convex optimization problems associated with non-convex penalties remains a big challenge. A commonly used approach is the Multi-Stage (MS) convex relaxation (or DC programming), which relaxes the original non-convex problem to a sequence of convex problems. This approach is usually not very practical for large-scale problems because its computational cost is a multiple of that of solving a single convex problem. In this paper, we propose a General Iterative Shrinkage and Thresholding (GIST) algorithm to solve the non-convex optimization problem for a large class of non-convex penalties. The GIST algorithm iteratively solves a proximal operator problem, which in turn has a closed-form solution for many commonly used penalties. At each outer iteration of the algorithm, we use a line search initialized by the Barzilai-Borwein (BB) rule that allows finding an appropriate step size quickly. The paper also presents a detailed convergence analysis of the GIST algorithm. The efficiency of the proposed algorithm is demonstrated by extensive experiments on large-scale data sets.
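
As an illustration of what a "closed-form solution" of a proximal operator problem looks like, the sketch below gives two textbook cases: soft-thresholding for the convex L1 penalty and hard-thresholding for the non-convex L0 penalty. These two penalties, the function names, and the `step` parameterization (the proximal problem is taken as minimizing `0.5 * (x - u)**2 + step * penalty(x)` entrywise) are assumptions chosen for illustration; they are not necessarily the penalties treated in the paper.

```python
import math
from typing import List

def soft_threshold(u: List[float], lam: float, step: float) -> List[float]:
    """Prox of step * lam * ||x||_1: shrink each entry toward zero by lam * step."""
    t = lam * step
    return [x - t if x > t else x + t if x < -t else 0.0 for x in u]

def hard_threshold(u: List[float], lam: float, step: float) -> List[float]:
    """Prox of step * lam * ||x||_0: keep x only when 0.5 * x**2 > lam * step."""
    cutoff = math.sqrt(2.0 * lam * step)
    return [x if abs(x) > cutoff else 0.0 for x in u]

# Example input: a point obtained after a gradient step on the smooth loss.
u = [3.0, -0.5, -2.0]
print(soft_threshold(u, lam=1.0, step=1.0))  # [2.0, 0.0, -1.0]
print(hard_threshold(u, lam=0.5, step=1.0))  # [3.0, 0.0, -2.0]
```

Both operators act entrywise and cost linear time, which is what makes an iterative shrinkage scheme cheap per iteration.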

1301.3527 2026-06-03 cs.LG cs.NA math.NA 版本更新

Block Coordinate Descent for Sparse NMF

块坐标下降法用于稀疏非负矩阵分解

Vamsi K. Potluru, Sergey M. Plis, Jonathan Le Roux, Barak A. Pearlmutter, Vince D. Calhoun, Thomas P. Hayes

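AI summary For the sparse NMF problem, this paper proposes a block coordinate descent algorithm based on the mixed L1/L2 norm that substantially improves computation speed while maintaining sparsity and is suitable for large-scale datasets.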

AI中文摘要

非负矩阵分解(NMF)已成为数据分析中无处不在的工具。一个重要变体是稀疏NMF问题,当我们明确要求学习到的特征稀疏时就会出现。稀疏性的自然度量是L$_0$范数,但其优化是NP难的。混合范数(如L$_1$/L$_2$度量)已被证明能够基于这些度量需要满足的直观属性来稳健地建模稀疏性。这与计算上更便宜的替代方案(如普通L$_1$范数)形成对比。然而,当前为优化混合范数L$_1$/L$_2$而设计的算法速度较慢,并且已经提出了其他稀疏NMF的公式,例如基于L$_1$和L$_0$范数的公式。我们提出的算法允许我们在不牺牲计算时间的情况下解决混合范数稀疏约束。我们在真实世界数据集上的实验证据表明,与当前最先进的优化混合范数的求解器相比,我们的新算法速度快一个数量级,并且适用于大规模数据集。

英文摘要

Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data analysis. An important variant is the sparse NMF problem, which arises when we explicitly require the learnt features to be sparse. A natural measure of sparsity is the L$_0$ norm; however, its optimization is NP-hard. Mixed norms, such as the L$_1$/L$_2$ measure, have been shown to model sparsity robustly, based on intuitive attributes that such measures need to satisfy. This is in contrast to computationally cheaper alternatives such as the plain L$_1$ norm. However, present algorithms designed for optimizing the mixed norm L$_1$/L$_2$ are slow, and other formulations for sparse NMF have been proposed, such as those based on L$_1$ and L$_0$ norms. Our proposed algorithm allows us to solve the mixed norm sparsity constraints while not sacrificing computation time. We present experimental evidence on real-world datasets that shows our new algorithm performs an order of magnitude faster compared to the current state-of-the-art solvers optimizing the mixed norm and is suitable for large-scale datasets.
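
One widely used way to turn the L$_1$/L$_2$ ratio into a sparsity score is the normalization below, which maps a vector with a single nonzero entry to 1 and a vector with equal-magnitude entries to 0. The sketch assumes this particular normalization for illustration; the abstract does not state which form of the L$_1$/L$_2$ measure the paper constrains, and the function name is hypothetical.

```python
import math
from typing import Sequence

def l1_l2_sparseness(x: Sequence[float]) -> float:
    """Return (sqrt(n) - ||x||_1 / ||x||_2) / (sqrt(n) - 1), a score in [0, 1]."""
    n = len(x)
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if n < 2 or l2 == 0.0:
        raise ValueError("need a nonzero vector with at least two entries")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)
```

Unlike the plain L$_1$ norm, this score is invariant to rescaling the vector, which is one of the intuitive attributes a sparsity measure is often asked to satisfy.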

1301.3389 2026-06-03 math.NA cs.LG cs.NA 版本更新

The Diagonalized Newton Algorithm for Nonnegative Matrix Factorization

非负矩阵分解的对角化牛顿算法

Hugo Van hamme

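AI summary For nonnegative matrix factorization, this paper proposes a diagonalized Newton algorithm (DNA) that accelerates convergence while remaining simple to implement and suitable for high-rank problems.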

Comments 8 pages + references; International Conference on Learning Representations, 2013

AI中文摘要

非负矩阵分解(NMF)已成为文本挖掘、语音和图像处理、生物信息学以及地震数据分析等领域中许多问题的流行机器学习方法。在NMF中,非负数据矩阵被近似为两个非负矩阵的低秩乘积。本文使用数据与其低秩重构之间的Kullback-Leibler散度来衡量近似质量。简单的乘法更新(MU)算法的存在促进了NMF的成功。尽管已有收敛更快的算法,MU因其简单性仍然流行。本文提出一种对角化牛顿算法(DNA),在保持实现简单且适用于高秩问题的同时,显示出更快的收敛速度。将DNA算法应用于各种公开数据集,在现代硬件上实现了显著的加速。

英文摘要

Non-negative matrix factorization (NMF) has become a popular machine learning approach to many problems in text mining, speech and image processing, bio-informatics and seismic data analysis to name a few. In NMF, a matrix of non-negative data is approximated by the low-rank product of two matrices with non-negative entries. In this paper, the approximation quality is measured by the Kullback-Leibler divergence between the data and its low-rank reconstruction. The existence of the simple multiplicative update (MU) algorithm for computing the matrix factors has contributed to the success of NMF. Despite the availability of algorithms showing faster convergence, MU remains popular due to its simplicity. In this paper, a diagonalized Newton algorithm (DNA) is proposed showing faster convergence while the implementation remains simple and suitable for high-rank problems. The DNA algorithm is applied to various publicly available data sets, showing a substantial speed-up on modern hardware.
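
The multiplicative update (MU) baseline mentioned in this abstract can be sketched in a few lines. The code below is the classical MU rule for the generalized Kullback-Leibler objective, written in plain Python for small matrices; it is not the diagonalized Newton algorithm proposed in the paper, and the helper names, the `eps` safeguard, and the random test data are assumptions made for illustration.

```python
import math
import random

def matmul(A, B):
    """Plain-Python matrix product of A (p x q) and B (q x r)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kl_divergence(V, R):
    """Generalized KL divergence sum(v * log(v / r) - v + r) between V and R."""
    total = 0.0
    for v_row, r_row in zip(V, R):
        for v, r in zip(v_row, r_row):
            total += (v * math.log(v / r) if v > 0 else 0.0) - v + r
    return total

def ratio(V, R, eps):
    """Entrywise V / R, with eps guarding against division by zero."""
    return [[v / (r + eps) for v, r in zip(v_row, r_row)] for v_row, r_row in zip(V, R)]

def mu_step(V, W, H, eps=1e-12):
    """One multiplicative update of H, then of W, for V ~ W H under KL."""
    Q = ratio(V, matmul(W, H), eps)
    Wt = [list(col) for col in zip(*W)]            # k x m, rows are columns of W
    numer = matmul(Wt, Q)                          # k x n
    H = [[h * a / (sum(w_col) + eps) for h, a in zip(h_row, n_row)]
         for h_row, n_row, w_col in zip(H, numer, Wt)]
    Q = ratio(V, matmul(W, H), eps)                # recompute with the new H
    Ht = [list(col) for col in zip(*H)]            # n x k
    numer = matmul(Q, Ht)                          # m x k
    h_sums = [sum(h_row) for h_row in H]
    W = [[w * a / (s + eps) for w, a, s in zip(w_row, n_row, h_sums)]
         for w_row, n_row in zip(W, numer)]
    return W, H

# Small demo on random positive data: the divergence should not increase.
random.seed(0)
V = [[random.random() + 0.1 for _ in range(5)] for _ in range(4)]
W = [[random.random() + 0.1 for _ in range(2)] for _ in range(4)]
H = [[random.random() + 0.1 for _ in range(5)] for _ in range(2)]
history = [kl_divergence(V, matmul(W, H))]
for _ in range(20):
    W, H = mu_step(V, W, H)
    history.append(kl_divergence(V, matmul(W, H)))
```

Each update only rescales existing entries by nonnegative factors, so nonnegativity is preserved without any projection step, which is the simplicity the abstract refers to.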

1106.6104 2026-06-03 math.OC cs.LG cs.SY eess.SY math.PR math.ST stat.TH 版本更新

Deterministic Sequencing of Exploration and Exploitation for Multi-Armed Bandit Problems

多臂赌博机问题的探索与利用确定性排序

Sattar Vakili, Keqin Liu, Qing Zhao

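AI summary This paper proposes policies based on a Deterministic Sequencing of Exploration and Exploitation (DSEE) that achieve optimal logarithmic regret for light-tailed distributions and sublinear regret for heavy-tailed distributions, and that extend to several MAB variants.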

Comments 22 pages, 2 figures

AI中文摘要

在多臂赌博机(MAB)问题中,存在一组具有未知奖励模型的臂。在每个时刻,玩家选择一个臂进行游戏,旨在最大化在T长度时间范围内的总期望奖励。本文开发了一种基于探索与利用确定性排序(DSEE)的方法来构建顺序臂选择策略。结果表明,对于所有轻尾奖励分布,DSEE实现了遗憾的最优对数阶,其中遗憾定义为相对于已知奖励模型的理想情况的总期望奖励损失。对于重尾奖励分布,当奖励分布的矩存在到p阶(1<p≤2)时,DSEE实现了O(T^{1/p})的遗憾,对于p>2时实现了O(T^{1/(1+p/2)})的遗憾。利用对重尾奖励分布有限矩的上界知识,DSEE提供了最优的对数遗憾阶。所提出的DSEE方法通过为一般奖励分布提供相应结果,补充了现有的MAB工作。此外,通过明确定义的可调参数——探索序列的基数,DSEE方法易于扩展到MAB的变体,包括具有不同目标的MAB、具有多个玩家和碰撞下不完全奖励观测的分散式MAB、具有未知马尔可夫动力学的MAB,以及具有依赖臂的组合MAB,这些常出现在网络优化问题中,如未知随机权重下的最短路径、最小生成树和支配集问题。

英文摘要

In the Multi-Armed Bandit (MAB) problem, there is a given set of arms with unknown reward models. At each time, a player selects one arm to play, aiming to maximize the total expected reward over a horizon of length T. An approach based on a Deterministic Sequencing of Exploration and Exploitation (DSEE) is developed for constructing sequential arm selection policies. It is shown that for all light-tailed reward distributions, DSEE achieves the optimal logarithmic order of the regret, where regret is defined as the total expected reward loss against the ideal case with known reward models. For heavy-tailed reward distributions, DSEE achieves O(T^{1/p}) regret when the moments of the reward distributions exist up to the pth order for 1 < p ≤ 2, and O(T^{1/(1+p/2)}) regret for p > 2. With knowledge of an upper bound on a finite moment of the heavy-tailed reward distributions, DSEE offers the optimal logarithmic regret order. The proposed DSEE approach complements existing work on MAB by providing corresponding results for general reward distributions. Furthermore, with a clearly defined tunable parameter (the cardinality of the exploration sequence), the DSEE approach is easily extendable to variations of MAB, including MAB with various objectives, decentralized MAB with multiple players and incomplete reward observations under collisions, MAB with unknown Markov dynamics, and combinatorial MAB with dependent arms that often arise in network optimization problems such as the shortest path, the minimum spanning tree, and the dominating set problems under unknown random weights.
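
The idea of fixing the exploration times in advance can be sketched as follows for Bernoulli arms. This is a minimal illustration under several assumptions that are not stated in the abstract: the exploration sequence is grown so that its cardinality tracks `k * explore_const * log(t)`, exploration plays arms round-robin, and exploitation uses sample means computed from exploration steps only. The function names and the constant are hypothetical.

```python
import math
import random
from typing import List, Tuple

def exploration_target(t: int, k: int, explore_const: float) -> int:
    """Required number of exploration steps by time t (grows like log t)."""
    return max(k, math.ceil(k * explore_const * math.log(t)))

def dsee(arm_means: List[float], horizon: int,
         explore_const: float = 5.0, seed: int = 0) -> Tuple[List[int], int]:
    """Simulate the policy; return per-arm play counts and the exploration count."""
    rng = random.Random(seed)
    k = len(arm_means)
    sums = [0.0] * k      # reward sums from exploration steps only (assumption)
    counts = [0] * k      # exploration pulls per arm
    plays = [0] * k       # all pulls per arm
    n_explore = 0
    for t in range(1, horizon + 1):
        if n_explore < exploration_target(t, k, explore_const):
            arm = n_explore % k                      # round-robin exploration
            reward = 1.0 if rng.random() < arm_means[arm] else 0.0
            sums[arm] += reward
            counts[arm] += 1
            n_explore += 1
        else:
            # Exploit: every arm has been explored at least once by now.
            arm = max(range(k), key=lambda a: sums[a] / counts[a])
        plays[arm] += 1
    return plays, n_explore
```

Whether a given time is an exploration step depends only on `t` and the constants, never on observed rewards, which is what makes the sequencing deterministic; the cardinality of the exploration sequence is the single tunable parameter.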

1303.1264 2026-06-03 cs.LG cs.NA math.NA 版本更新

Discovery of factors in matrices with grades

带等级矩阵中的因子发现

Radim Belohlavek, Vilem Vychodil

AI总结 提出一种针对有序数据矩阵的分解与因子分析方法,基于完全剩余格结构,利用几何洞察识别矩形子矩阵作为最优因子,并设计贪心近似算法实现少量因子的分解。

AI中文摘要

我们提出了一种处理有序数据矩阵的分解与因子分析的方法。矩阵中的条目是对象(由行表示)满足属性(由列表示)的等级,例如图像红色的程度、产品具有给定特征的程度或一个人在测试中表现良好的程度。我们假设这些等级构成一个有界尺度,配备特定的聚合算子,并符合完全剩余格的结构。我们提出了一种贪心近似算法,用于在因子数量较小的限制下,将此类矩阵分解为两个带等级矩阵的乘积。我们的算法基于一个定理提供的几何洞察,该定理将特定的矩形子矩阵识别为分解的最优因子。这些因子对应于输入数据的形式概念,并允许对分解进行简单解释。我们提供了说明性示例和实验评估。

英文摘要

We present an approach to decomposition and factor analysis of matrices with ordinal data. The matrix entries are grades to which objects represented by rows satisfy attributes represented by columns, e.g. grades to which an image is red, a product has a given feature, or a person performs well in a test. We assume that the grades form a bounded scale equipped with certain aggregation operators and conform to the structure of a complete residuated lattice. We present a greedy approximation algorithm for the problem of decomposing such a matrix into a product of two matrices with grades under the restriction that the number of factors be small. Our algorithm is based on a geometric insight provided by a theorem identifying particular rectangular-shaped submatrices as optimal factors for the decompositions. These factors correspond to formal concepts of the input data and allow an easy interpretation of the decomposition. We present illustrative examples and experimental evaluation.

1302.7283 2026-06-03 cs.LG cs.NA math.NA 版本更新

Source Separation using Regularized NMF with MMSE Estimates under GMM Priors with Online Learning for The Uncertainties

基于GMM先验下MMSE估计的正则化NMF源分离及其不确定性在线学习

Emad M. Grais, Hakan Erdogan

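AI summary This paper proposes regularizing nonnegative matrix factorization with minimum mean square error estimates under Gaussian mixture model priors for single-channel source separation, with the uncertainties learned online to improve separation performance.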

AI中文摘要

我们提出了一种在非负矩阵分解(NMF)解上施加先验的新方法。所提出的算法可用于去噪或单通道源分离(SCSS)应用。NMF解被引导遵循源信号的高斯混合先验模型(GMM)下的最小均方误差(MMSE)估计。在SCSS应用中,观测到的混合信号的频谱被分解为每个源使用NMF训练的基向量的加权线性组合。在这项工作中,NMF分解权重矩阵被视为被失真算子扭曲的图像,该失真算子直接从观测信号中学习。然后,在GMM先验和对数正态分布失真下找到权重矩阵的MMSE估计,以改进NMF分解结果。MMSE估计被嵌入优化目标中,形成一个新的正则化NMF代价函数。本文推导了新目标的相应更新规则。实验结果表明,与不使用先验或使用其他先验模型的NMF相比,所提出的正则化NMF算法提高了源分离性能。

英文摘要

We propose a new method to enforce priors on the solution of the nonnegative matrix factorization (NMF). The proposed algorithm can be used for denoising or single-channel source separation (SCSS) applications. The NMF solution is guided to follow the Minimum Mean Square Error (MMSE) estimates under Gaussian mixture prior models (GMM) for the source signal. In SCSS applications, the spectra of the observed mixed signal are decomposed as a weighted linear combination of trained basis vectors for each source using NMF. In this work, the NMF decomposition weight matrices are treated as images distorted by a distortion operator, which is learned directly from the observed signals. The MMSE estimate of the weight matrix under a GMM prior and a log-normal distribution for the distortion is then found to improve the NMF decomposition results. The MMSE estimate is embedded within the optimization objective to form a novel regularized NMF cost function. The corresponding update rules for the new objectives are derived in this paper. Experimental results show that the proposed regularized NMF algorithm improves the source separation performance compared with using NMF without a prior or with other prior models.

0910.5260 2026-06-03 math.NA cs.LG cs.NA 版本更新

A Gradient Descent Algorithm on the Grassman Manifold for Matrix Completion

格拉斯曼流形上的梯度下降算法用于矩阵补全

Raghunandan H. Keshavan, Sewoong Oh

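AI summary Proposes OptSpace, an algorithm based on singular value decomposition followed by local manifold optimization, which exactly recovers a low-rank matrix from a small number of observed entries.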

Comments 26 pages, 15 figures

详情
AI中文摘要

我们考虑从少量观测条目中重构低秩矩阵的问题。本文描述了一种高效算法OptSpace的实现,该算法基于奇异值分解后接局部流形优化,用于解决低秩矩阵补全问题。已有研究表明,如果观测条目数量足够大,奇异值分解的输出能给出原始矩阵的良好估计,从而局部优化能以高概率重构出正确矩阵。我们给出的数值结果表明,该算法能从极少量观测条目中精确重构低秩矩阵。我们进一步研究了算法对噪声的鲁棒性,以及在实际协同过滤数据集上的性能。

英文摘要

We consider the problem of reconstructing a low-rank matrix from a small subset of its entries. In this paper, we describe the implementation of an efficient algorithm called OptSpace, based on singular value decomposition followed by local manifold optimization, for solving the low-rank matrix completion problem. It has been shown that if the number of revealed entries is large enough, the output of singular value decomposition gives a good estimate for the original matrix, so that local optimization reconstructs the correct matrix with high probability. We present numerical results which show that this algorithm can reconstruct the low rank matrix exactly from a very small subset of its entries. We further study the robustness of the algorithm with respect to noise, and its performance on actual collaborative filtering datasets.
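
The first stage described above, a singular value decomposition of the observed entries, can be sketched as follows. This is only a simplified spectral estimate under stated assumptions: unobserved entries are filled with zeros, the matrix is rescaled by the observed fraction, and the top singular components are kept. The trimming and local manifold optimization steps of OptSpace are not reproduced, and the name `spectral_estimate` and the test sizes are hypothetical.

```python
import numpy as np

def spectral_estimate(M_obs, mask, rank):
    """Rank-`rank` estimate from the zero-filled, rescaled observation matrix."""
    p = mask.mean()                                   # fraction of revealed entries
    filled = np.where(mask, M_obs, 0.0) / p           # unbiased estimate of M entrywise
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]       # keep the top singular triplets

rng = np.random.default_rng(0)
n, r = 300, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-2 ground truth
mask = rng.random((n, n)) < 0.5                                  # reveal about half the entries
M_hat = spectral_estimate(M, mask, r)
rel_err = np.linalg.norm(M - M_hat) / np.linalg.norm(M)
```

The estimate is approximate rather than exact; in the abstract's description it serves as the starting point that local optimization then refines.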

1112.6234 2026-06-03 cs.IT cs.LG cs.SY eess.SY math.IT 版本更新

Sparse Recovery from Nonlinear Measurements with Applications in Bad Data Detection for Power Networks

非线性测量下的稀疏恢复及其在电力网络不良数据检测中的应用

Weiyu Xu, Meng Wang, Jianfeng Cai, Ao Tang

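AI summary Proposes an iterative mixed ℓ1 and ℓ2 convex programming method that performs state estimation by locally linearizing nonlinear measurements, gives performance bounds and convergence conditions for linear and nonlinear measurements, and applies the method to bad data detection in power networks.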

Comments journal. arXiv admin note: substantial text overlap with arXiv:1105.0442

详情
AI中文摘要

本文考虑非线性测量下的稀疏恢复问题,该问题在电力网络的状态估计和不良数据检测中有应用。使用一种迭代混合ℓ1和ℓ2凸规划,通过局部线性化非线性测量来估计真实状态。当测量为线性时,通过利用线性子空间的几乎欧几里得性质,我们推导出在稀疏不良数据和加性观测噪声下状态估计误差的新性能界。作为副产品,本文利用几何泛函分析中的“逃逸网格”定理,给出了线性子空间几乎欧几里得性质的尖锐界。当测量为非线性时,我们给出了即使局部线性化测量可能不是实际非线性测量,迭代算法解收敛到真实状态的条件。我们数值评估了所提出的迭代凸规划方法在非线性电力网络问题中进行不良数据检测的性能。我们能够使用半定规划来验证所提出的从非线性测量中迭代稀疏恢复算法收敛的条件。

英文摘要

In this paper, we consider the problem of sparse recovery from nonlinear measurements, which has applications in state estimation and bad data detection for power networks. An iterative mixed $\ell_1$ and $\ell_2$ convex program is used to estimate the true state by locally linearizing the nonlinear measurements. When the measurements are linear, by using the almost Euclidean property for a linear subspace, we derive a new performance bound for the state estimation error under sparse bad data and additive observation noise. As a byproduct, in this paper we provide sharp bounds on the almost Euclidean property of a linear subspace, using the "escape-through-the-mesh" theorem from geometric functional analysis. When the measurements are nonlinear, we give conditions under which the solution of the iterative algorithm converges to the true state even though the locally linearized measurements may not be the actual nonlinear measurements. We numerically evaluate our iterative convex programming approach to bad data detection in nonlinear electrical power network problems. We are able to use semidefinite programming to verify the conditions for convergence of the proposed iterative sparse recovery algorithms from nonlinear measurements.

1301.0584 2026-06-03 cs.AI cs.LG cs.SY eess.SY 版本更新

Decayed MCMC Filtering

衰减MCMC滤波

Bhaskara Marthi, Hanna Pasula, Stuart Russell, Yuval Peres

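AI summary Proposes decayed MCMC, a stochastic approximate filtering algorithm that samples state trajectories with a proposal distribution biased toward flipping recent state variables, proves that the convergence time stays bounded as the observation sequence grows, and shows experimentally that it performs comparably to algorithms such as particle filtering.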

Comments Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002)

详情
AI中文摘要

滤波——从观测序列中估计部分可观测马尔可夫过程的状态——是控制理论、人工智能和计算统计学中研究最广泛的问题之一。对于大型离散系统和非线性连续系统,后验分布的精确计算通常是难以处理的,因此大量工作致力于开发鲁棒的近似算法。本文描述了一种简单的随机近似滤波算法,称为衰减MCMC。该算法对状态轨迹空间应用马尔可夫链蒙特卡罗采样,使用偏向翻转较新状态变量的提议分布。该算法的形式化分析涉及MCMC收敛的标准耦合论证的推广。我们证明,对于任何遍历的底层马尔可夫过程,随着观测序列长度的增长,具有逆多项式衰减的衰减MCMC的收敛时间保持有界。实验表明,衰减MCMC至少与粒子滤波等其他近似算法具有竞争力。

英文摘要

Filtering---estimating the state of a partially observable Markov process from a sequence of observations---is one of the most widely studied problems in control theory, AI, and computational statistics. Exact computation of the posterior distribution is generally intractable for large discrete systems and for nonlinear continuous systems, so a good deal of effort has gone into developing robust approximation algorithms. This paper describes a simple stochastic approximation algorithm for filtering called decayed MCMC. The algorithm applies Markov chain Monte Carlo sampling to the space of state trajectories using a proposal distribution that favours flips of more recent state variables. The formal analysis of the algorithm involves a generalization of standard coupling arguments for MCMC convergence. We prove that for any ergodic underlying Markov process, the convergence time of decayed MCMC with inverse-polynomial decay remains bounded as the length of the observation sequence grows. We show experimentally that decayed MCMC is at least competitive with other approximation algorithms such as particle filtering.

1212.4507 2026-06-03 stat.ML cs.LG cs.NA math.NA 版本更新

Variational Optimization

变分优化

Joe Staines, David Barber

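AI summary Proposes a general technique that constructs differentiable bounds for optimizing non-differentiable or discrete objective functions, with applications to sparse learning and support vector classification.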

详情
AI中文摘要

我们讨论了一种通用技术,可用于形成不可微或离散目标函数最优值的可微边界。我们形成了这些方法的统一描述,并考虑了在何种情况下该边界是凹的。特别地,我们考虑了该方法的两个具体应用,即稀疏学习和支持向量分类。

英文摘要

We discuss a general technique that can be used to form a differentiable bound on the optima of non-differentiable or discrete objective functions. We form a unified description of these methods and consider under which circumstances the bound is concave. In particular we consider two concrete applications of the method, namely sparse learning and support vector classification.
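
One way to read the differentiable bound mentioned above is through the inequality min_x f(x) <= E[f(x)] for x drawn from any distribution p(x | theta); the right-hand side can be smooth in theta even when f is not. The sketch below is an illustration under assumptions that do not come from the abstract: a minimization convention, a one-dimensional Gaussian with fixed width, a toy objective `f`, and a score-function gradient estimate.

```python
import numpy as np

def f(x):
    """Non-differentiable objective: a kink at x = 2 plus a unit jump for |x - 2| > 1."""
    d = np.abs(x - 2.0)
    return d + (d > 1.0)

def bound_and_grad(mu, sigma, n_samples, rng):
    """Monte Carlo estimates of U(mu) = E[f(x)], x ~ N(mu, sigma^2), and of dU/dmu."""
    x = mu + sigma * rng.standard_normal(n_samples)
    fx = f(x)
    u = fx.mean()
    # Score-function gradient; subtracting the sample mean only reduces variance.
    grad = np.mean((fx - u) * (x - mu) / sigma**2)
    return u, grad

rng = np.random.default_rng(0)
mu, sigma = -3.0, 0.5
u_start, _ = bound_and_grad(mu, sigma, 2000, rng)
for _ in range(300):
    _, g = bound_and_grad(mu, sigma, 2000, rng)
    mu -= 0.1 * g            # plain gradient descent on the smooth bound
u_end, _ = bound_and_grad(mu, sigma, 2000, rng)
```

The bound stays above the true minimum of `f` (which is 0 here) for any fixed width, so shrinking `sigma` over time would be needed to tighten it.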

1212.2475 2026-06-03 cs.LG cs.SY eess.SY 版本更新

Efficient Gradient Estimation for Motor Control Learning

运动控制学习的高效梯度估计

Gregory Lawrence, Noah Cowan, Stuart Russell

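AI summary For gradient estimation under input noise, proposes two methods that reduce estimation error, a reinforcement baseline based on a local linear model and variance discounting, and applies them to a dart-throwing task with a simulated three-link arm, significantly improving reward-function gradient estimates and learning curves.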

Comments Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

详情
AI中文摘要

在噪声存在的情况下估计函数梯度的任务是几种强化学习形式的核心,包括策略搜索方法。我们提出了两种技术,用于在可观测输入噪声应用于控制信号时减少梯度估计误差。第一种方法通过拟合一个局部线性模型到被估计梯度的函数,扩展了强化基线的思想;我们展示了如何找到最小化梯度估计方差的线性模型,以及如何从数据中估计该模型。第二种方法通过折扣具有高方差的梯度向量分量进一步改进了这一点。这些方法被应用于运动控制学习问题,其中执行器噪声对行为有显著影响。特别地,我们将这些技术应用于使用模拟三连杆机械臂的投镖任务中学习局部最优控制器;我们证明了所提出的方法显著改善了奖励函数梯度估计,并因此改善了学习曲线,优于现有方法。

英文摘要

The task of estimating the gradient of a function in the presence of noise is central to several forms of reinforcement learning, including policy search methods. We present two techniques for reducing gradient estimation errors in the presence of observable input noise applied to the control signal. The first method extends the idea of a reinforcement baseline by fitting a local linear model to the function whose gradient is being estimated; we show how to find the linear model that minimizes the variance of the gradient estimate, and how to estimate the model from data. The second method improves this further by discounting components of the gradient vector that have high variance. These methods are applied to the problem of motor control learning, where actuator noise has a significant influence on behavior. In particular, we apply the techniques to learn locally optimal controllers for a dart-throwing task using a simulated three-link arm; we demonstrate that the proposed methods significantly improve the reward function gradient estimate and, consequently, the learning curve, over existing methods.

1212.2471 2026-06-03 cs.LG cs.AI cs.NA math.NA 版本更新

Monte Carlo Matrix Inversion Policy Evaluation

蒙特卡洛矩阵求逆策略评估

Fletcher Lu, Dale Schuurmans

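AI summary Proposes Monte Carlo matrix inversion (MCMI) for reinforcement learning policy evaluation, uses importance sampling to reduce variance, and improves on a maximum likelihood model-based approach in runtime and on temporal differencing methods in both runtime and accuracy.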

Comments Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

详情
AI中文摘要

1950年,Forsythe和Leibler(1950)引入了一种统计技术,通过将矩阵逆的元素表征为一系列随机游走的期望值来求矩阵的逆。Barto和Duff(1994)随后展示了该技术与标准动态规划和时序差分方法之间的关系。蒙特卡洛矩阵求逆(MCMI)方法的优势在于,它相对于其他技术,在状态空间大小方面具有更好的可扩展性。在本文中,我们介绍了一种使用MCMI进行强化学习策略评估的算法。我们证明,MCMI在运行时间上优于基于最大似然模型的策略评估方法,并且在运行时间和准确性上都优于时序差分(TD)策略评估方法。我们进一步通过向算法添加重要性采样技术来降低估计器的方差,从而改进了MCMI策略评估。最后,我们展示了将MCMI扩展到大规模状态空间以进行策略改进的技术。

英文摘要

In 1950, Forsythe and Leibler (1950) introduced a statistical technique for finding the inverse of a matrix by characterizing the elements of the matrix inverse as expected values of a sequence of random walks. Barto and Duff (1994) subsequently showed relations between this technique and standard dynamic programming and temporal differencing methods. The advantage of the Monte Carlo matrix inversion (MCMI) approach is that it scales better with respect to state-space size than alternative techniques. In this paper, we introduce an algorithm for performing reinforcement learning policy evaluation using MCMI. We demonstrate that MCMI improves on runtime over a maximum likelihood model-based policy evaluation approach and on both runtime and accuracy over the temporal differencing (TD) policy evaluation approach. We further improve on MCMI policy evaluation by adding an importance sampling technique to our algorithm to reduce the variance of our estimator. Lastly, we illustrate techniques for scaling up MCMI to large state spaces in order to perform policy improvement.
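
The random-walk characterization of a matrix inverse can be illustrated on policy evaluation, where the value function satisfies V = (I - gamma P)^{-1} r. The sketch below is a plain estimator written for this digest, not the paper's algorithm: each walk accumulates rewards and stops with probability 1 - gamma, so its expected total equals one entry of V. The transition matrix `P`, rewards `r`, discount `gamma` and the function name are hypothetical toy choices, and no importance sampling is applied.

```python
import numpy as np

def mc_policy_value(P, r, gamma, start, n_walks, rng):
    """Estimate V(start) = [(I - gamma P)^{-1} r]_start with terminating random walks."""
    n_states = len(r)
    total = 0.0
    for _ in range(n_walks):
        s = start
        while True:
            total += r[s]                      # reward of the current state
            if rng.random() >= gamma:          # walk stops with probability 1 - gamma
                break
            s = rng.choice(n_states, p=P[s])   # otherwise follow the chain
    return total / n_walks

P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.1, 0.0, 0.9]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.8
rng = np.random.default_rng(0)
estimate = mc_policy_value(P, r, gamma, start=0, n_walks=5000, rng=rng)
exact = np.linalg.solve(np.eye(3) - gamma * P, r)[0]
```

Only the states a walk actually visits are touched, which is the source of the favourable scaling with state-space size that the abstract refers to.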

1211.7369 2026-06-03 stat.ML cs.LG cs.NA math.NA 版本更新

Approximate Rank-Detecting Factorization of Low-Rank Tensors

低秩张量的近似秩检测分解

Franz J. Király, Andreas Ziehe

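AI summary Proposes AROFAC2, an algorithm that detects the CP rank of a third-order tensor and factorizes it into rank-one components; it intrinsically detects the true rank, avoids spurious components, and is robust to outliers and non-Gaussian noise.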

详情
AI中文摘要

我们提出了一种算法AROFAC2,该算法能够检测三阶张量的(CP-)秩并将其分解为秩一分量。我们给出了算法有效的生成条件,并在合成数据和真实世界数据上证明,AROFAC2是黄金标准PARAFAC的潜在更优替代方案,其优势在于能够内在检测真实秩、避免虚假分量,并且对异常值和非高斯噪声具有稳定性。

英文摘要

We present an algorithm, AROFAC2, which detects the (CP-)rank of a degree 3 tensor and calculates its factorization into rank-one components. We provide generative conditions for the algorithm to work and demonstrate on both synthetic and real-world data that AROFAC2 is a potentially superior alternative to the gold-standard PARAFAC, with the advantages that it can intrinsically detect the true rank, avoids spurious components, and is stable with respect to outliers and non-Gaussian noise.

1211.5414 2026-06-03 cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

Analysis of a randomized approximation scheme for matrix multiplication

矩阵乘法的随机近似方案分析

Daniel Hsu, Sham M. Kakade, Tong Zhang

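AI summary Analyzes the randomized approximation scheme for matrix multiplication proposed by Sarlos (2006), based on a random rotation followed by uniform column sampling, giving a simple analysis via a matrix Bernstein inequality and a tail inequality for quadratic forms in subgaussian random vectors.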

详情
AI中文摘要

本文对Sarlos (2006)提出的基于随机旋转后均匀列采样的矩阵乘法随机近似方案进行了简单分析。结果来自于矩阵版本的Bernstein不等式以及次高斯随机向量中二次型的尾部不等式。

英文摘要

This note gives a simple analysis of a randomized approximation scheme for matrix multiplication proposed by Sarlos (2006) based on a random rotation followed by uniform column sampling. The result follows from a matrix version of Bernstein's inequality and a tail inequality for quadratic forms in subgaussian random vectors.
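
The scheme analyzed above, a random rotation followed by uniform column sampling, can be sketched for the product A B^T as follows. Several details here are assumptions made for illustration: the rotation is a dense random orthogonal matrix obtained from a QR factorization (practical versions would typically use a fast structured transform), columns are sampled without replacement, and the name `sketched_product` is hypothetical.

```python
import numpy as np

def sketched_product(A, B, m, rng):
    """Approximate A @ B.T from m uniformly sampled columns after a random rotation."""
    n = A.shape[1]
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
    idx = rng.choice(n, size=m, replace=False)          # uniform column sample
    A_s, B_s = (A @ Q)[:, idx], (B @ Q)[:, idx]
    return (n / m) * A_s @ B_s.T                        # rescale so the estimate is unbiased

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 512))
B = rng.standard_normal((4, 512))
exact = A @ B.T
approx = sketched_product(A, B, 256, rng)
rel_err = np.linalg.norm(approx - exact) / (np.linalg.norm(A) * np.linalg.norm(B))
full = sketched_product(A, B, 512, rng)                 # keeping every column is exact
```

Because the rotation is orthogonal, (A Q)(B Q)^T equals A B^T exactly; the error comes only from discarding columns, and the rotation spreads the mass so that uniform sampling is reasonable.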

1103.1417 2026-06-03 math.ST cs.LG cs.SY eess.SY math.OC math.PR stat.TH 版本更新

Localization from Incomplete Noisy Distance Measurements

基于不完整含噪距离测量的定位

Adel Javanmard, Andrea Montanari

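AI summary For localizing a point cloud in Euclidean space from noisy, partial distance measurements, proposes an algorithm based on semidefinite programming and characterizes its performance bounds under a random geometric graph model.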

Comments 46 pages, 8 figures, numerical experiments added. Journal version (v1,v2: Conference versions, ISIT 2011); Journal of Foundations of Computational Mathematics, 2012

详情
AI中文摘要

我们考虑在欧氏空间 $\mathbb{R}^d$ 中利用部分成对距离的含噪测量来定位点云的问题。该任务在传感器网络定位和从NMR测量重建蛋白质构象等领域有应用。此外,它与降维问题和流形学习密切相关,后者的目标是通过局部(或部分)度量信息学习数据集的潜在全局几何结构。本文提出一种基于半定规划的重建算法。对于随机几何图模型和一致有界噪声,我们精确刻画了算法的性能:在无噪声情况下,我们找到一个半径 $r_0$,超过该半径算法能重建精确位置(直至刚性变换)。在存在噪声的情况下,我们得到的重建误差上下界仅相差一个依赖于维度 $d$ 和图中节点平均度的因子。

英文摘要

We consider the problem of positioning a cloud of points in the Euclidean space $\mathbb{R}^d$, using noisy measurements of a subset of pairwise distances. This task has applications in various areas, such as sensor network localization and reconstruction of protein conformations from NMR measurements. Also, it is closely related to dimensionality reduction problems and manifold learning, where the goal is to learn the underlying global geometry of a data set using local (or partial) metric information. Here we propose a reconstruction algorithm based on semidefinite programming. For a random geometric graph model and uniformly bounded noise, we provide a precise characterization of the algorithm's performance: In the noiseless case, we find a radius $r_0$ beyond which the algorithm reconstructs the exact positions (up to rigid transformations). In the presence of noise, we obtain upper and lower bounds on the reconstruction error that match up to a factor that depends only on the dimension $d$, and the average degree of the nodes in the graph.

1011.4104 2026-06-03 cs.LG cs.NA math.NA math.SP 版本更新

Clustering and Latent Semantic Indexing Aspects of the Singular Value Decomposition

奇异值分解的聚类和潜在语义索引方面

Andri Mirzal

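AI summary Explains how the singular value decomposition (SVD) can be used for clustering, shows that its clustering and latent semantic indexing (LSI) aspects stem from the same principle, and designs an LSI algorithm that mimics the clustering capability of the SVD, needs no decomposition rank, and performs comparably to the SVD.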

Comments 38 pages, submitted to Pattern Recognition

详情
AI中文摘要

本文讨论了奇异值分解(SVD)的聚类和潜在语义索引(LSI)方面。本文的目的有两个。第一是解释奇异向量如何以及为何可用于聚类。第二是表明这两个看似无关的SVD方面实际上源于同一来源:在低秩近似矩阵的图表示中,相关顶点比在原始语义图中更倾向于聚集在一起。因此,SVD可以提高信息检索系统的检索性能,因为对近似矩阵的查询比原始矩阵的相同查询能检索到更多相关文档并过滤掉更多不相关文档。利用这一事实,我们将设计一种LSI算法,模拟SVD在聚类相关顶点方面的能力。收敛性分析表明该算法收敛,并对每个输入产生唯一解。使用LSI研究中一些标准数据集的实验结果表明,该算法的检索性能与SVD相当。此外,该算法更实用且更易使用,因为无需确定分解秩,而分解秩对驱动SVD的检索性能至关重要。

英文摘要

This paper discusses clustering and latent semantic indexing (LSI) aspects of the singular value decomposition (SVD). The purpose of this paper is twofold. The first is to explain how and why the singular vectors can be used in clustering. The second is to show that the two seemingly unrelated SVD aspects actually originate from the same source: related vertices tend to be more clustered in the graph representation of the lower-rank approximate matrix obtained with the SVD than in the original semantic graph. Accordingly, the SVD can improve the retrieval performance of an information retrieval system, since queries made to the approximate matrix can retrieve more relevant documents and filter out more irrelevant documents than the same queries made to the original matrix. By utilizing this fact, we will devise an LSI algorithm that mimics the SVD's capability in clustering related vertices. Convergence analysis shows that the algorithm is convergent and produces a unique solution for each input. Experimental results using some standard datasets in LSI research show that the retrieval performance of the algorithm is comparable to the SVD's. In addition, the algorithm is more practical and easier to use because there is no need to determine the decomposition rank, which is crucial in driving the retrieval performance of the SVD.

1211.3444 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

Spectral Clustering: An empirical study of Approximation Algorithms and its Application to the Attrition Problem

谱聚类:近似算法的实证研究及其在员工流失问题中的应用

B. Cung, T. Jin, J. Ramirez, A. Thompson, C. Boutsidis, D. Needell

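AI summary Experimentally evaluates several approximation methods for spectral clustering and applies them to employee attrition prediction, showing that approximate spectral clustering lowers computational cost while maintaining classification accuracy.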

详情
AI中文摘要

聚类是将一组对象分成组(称为簇)的问题,使得同一簇内的对象比不同簇中的对象更相似。谱聚类是一种众所周知的聚类方法,它利用数据相似性矩阵的谱来进行这种分离。由于该方法依赖于求解特征向量问题,对于大型数据集计算成本很高。为了克服这一限制,人们开发了近似方法,旨在减少运行时间同时保持准确的分类。在本文中,我们总结并实验评估了几种谱聚类的近似方法。从应用的角度,我们使用谱聚类来解决所谓的员工流失问题,其目标是从一组员工中识别出那些可能自愿离开公司的人。我们的研究揭示了现有近似谱聚类方法的实证性能,并展示了这些方法在一个重要的业务优化相关问题中的适用性。

英文摘要

Clustering is the problem of separating a set of objects into groups (called clusters) so that objects within the same cluster are more similar to each other than to those in different clusters. Spectral clustering is a now well-known method for clustering which utilizes the spectrum of the data similarity matrix to perform this separation. Since the method relies on solving an eigenvector problem, it is computationally expensive for large datasets. To overcome this constraint, approximation methods have been developed which aim to reduce running time while maintaining accurate classification. In this article, we summarize and experimentally evaluate several approximation methods for spectral clustering. From an applications standpoint, we employ spectral clustering to solve the so-called attrition problem, where one aims to distinguish, within a set of employees, those who are likely to voluntarily leave the company from those who are not. Our study sheds light on the empirical performance of existing approximate spectral clustering methods and shows the applicability of these methods in an important business optimization related problem.
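
For reference, the exact (non-approximate) procedure that the evaluated methods try to speed up can be sketched for two clusters as below. The Gaussian similarity, the symmetric normalized Laplacian and the sign-based split are common choices assumed for this example, and `spectral_bipartition` and the synthetic points are hypothetical; the approximation methods compared in the article are not shown.

```python
import numpy as np

def spectral_bipartition(X, sigma=1.0):
    """Split points into two clusters using the second eigenvector of the Laplacian."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.exp(-sq / (2 * sigma**2))                      # Gaussian similarity matrix
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(d, d))      # symmetric normalized Laplacian
    _, vecs = np.linalg.eigh(L)                           # eigenvalues in ascending order
    fiedler = vecs[:, 1] / np.sqrt(d)                     # second eigenvector, degree-rescaled
    return (fiedler > 0).astype(int)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
               rng.normal(0.0, 0.3, (30, 2)) + np.array([4.0, 0.0])])
labels = spectral_bipartition(X)
```

The dense eigendecomposition is the cubic-cost step; approximation methods typically replace it with a computation on a sampled or sparsified similarity matrix.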

1211.1550 2026-06-03 cs.LG cs.NA math.NA math.OC 版本更新

A Riemannian geometry for low-rank matrix completion

低秩矩阵补全的黎曼几何

B. Mishra, K. Adithya Apuroop, R. Sepulchre

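AI summary For low-rank matrix completion, proposes a new Riemannian geometry for fixed-rank matrices that tunes the metric on the quotient space to the least-squares cost function, and develops gradient descent and trust-region algorithms that are competitive with the state-of-the-art algorithm LMaFit.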

Comments Title modified, Typos removed. arXiv admin note: text overlap with arXiv:1209.0430

详情
AI中文摘要

我们提出了一种新的固定秩矩阵的黎曼几何,专门针对低秩矩阵补全问题。利用商空间的自由度,我们将搜索空间上的度量调整为特定的最小二乘代价函数。一方面,它以一种新颖的方式展示了如何利用商流形优化的灵活框架。另一方面,我们的算法可以被视为LMaFit(最先进的高斯-赛德尔算法)的改进版本。我们开发了执行一阶和二阶优化所需的必要工具。特别地,我们提出了梯度下降方案(最速下降和共轭梯度)以及信赖域算法。我们还表明,由于代价函数的简单性,在给定搜索方向时进行精确线搜索在数值上是廉价的,这使得我们的算法在标准低秩矩阵补全实例上与最先进算法具有竞争力。

英文摘要

We propose a new Riemannian geometry for fixed-rank matrices that is specifically tailored to the low-rank matrix completion problem. Exploiting the degree of freedom of a quotient space, we tune the metric on our search space to the particular least-squares cost function. At one level, it illustrates in a novel way how to exploit the versatile framework of optimization on quotient manifolds. At another level, our algorithm can be considered as an improved version of LMaFit, the state-of-the-art Gauss-Seidel algorithm. We develop the tools needed to perform both first-order and second-order optimization. In particular, we propose gradient descent schemes (steepest descent and conjugate gradient) and trust-region algorithms. We also show that, thanks to the simplicity of the cost function, it is numerically cheap to perform an exact linesearch given a search direction, which makes our algorithms competitive with the state-of-the-art on standard low-rank matrix completion instances.

1211.1690 2026-06-03 cs.RO cs.CV cs.LG cs.SY eess.SY 版本更新

Learning Monocular Reactive UAV Control in Cluttered Natural Environments

学习在杂乱自然环境中进行单目反应式无人机控制

Stephane Ross, Narek Melik-Barkhudarov, Kumar Shaurya Shankar, Andreas Wendel, Debadeepta Dey, J. Andrew Bagnell, Martial Hebert

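AI summary Uses a monocular camera and imitation learning to train a controller that lets a small quadrotor navigate autonomously and avoid obstacles in natural forest environments at speeds of up to 1.5 m/s.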

Comments 8 pages, 10 figures

详情
AI中文摘要

大型无人机的自主导航相对简单,因为可以使用昂贵的传感器和监控设备。相比之下,在杂乱环境中低空飞行的微型飞行器(MAV)的避障仍然是一项具有挑战性的任务。与大型飞行器不同,MAV只能携带非常轻的传感器,如摄像头,这使得通过障碍物的自主导航更具挑战性。本文描述了一个系统,该系统能够使小型四旋翼直升机在自然森林环境中低空自主导航。仅使用单个廉价摄像头感知环境,我们能够保持高达1.5m/s的恒定速度。通过少量人类飞行员演示,我们使用最新的模仿学习技术训练了一个控制器,该控制器通过调整MAV的航向来避免树木。我们在室内更受控的环境和室外真实自然森林环境中展示了系统的性能。

英文摘要

Autonomous navigation for large Unmanned Aerial Vehicles (UAVs) is fairly straightforward, as expensive sensors and monitoring devices can be employed. In contrast, obstacle avoidance remains a challenging task for Micro Aerial Vehicles (MAVs) which operate at low altitude in cluttered environments. Unlike large vehicles, MAVs can only carry very light sensors, such as cameras, making autonomous navigation through obstacles much more challenging. In this paper, we describe a system that navigates a small quadrotor helicopter autonomously at low altitude through natural forest environments. Using only a single cheap camera to perceive the environment, we are able to maintain a constant velocity of up to 1.5 m/s. Given a small set of human pilot demonstrations, we use recent state-of-the-art imitation learning techniques to train a controller that can avoid trees by adapting the MAV's heading. We demonstrate the performance of our system in a more controlled environment indoors, and in real natural forest environments outdoors.

1211.0056 2026-06-03 math.OC cs.LG cs.NA math.NA stat.CO stat.ML 版本更新

Iterative Hard Thresholding Methods for $l_0$ Regularized Convex Cone Programming

$l_0$ 正则化凸锥规划问题的迭代硬阈值方法

Zhaosong Lu

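AI summary Proposes an iterative hard thresholding method and its variants for $l_0$ regularized convex cone programming, proves convergence to a local minimizer, and establishes the iteration complexity.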

Comments 25 pages

详情
AI中文摘要

本文考虑 $l_0$ 正则化凸锥规划问题。具体地,我们首先提出一种迭代硬阈值(IHT)方法及其变体,用于求解 $l_0$ 正则化箱约束凸规划。我们证明这些方法生成的序列收敛到局部极小点。同时,我们建立了 IHT 方法寻找 $\epsilon$-局部最优解的迭代复杂度。然后,我们通过将 IHT 方法应用于二次罚松弛,提出一种求解 $l_0$ 正则化凸锥规划的方法,并建立其寻找 $\epsilon$-近似局部极小解的迭代复杂度。最后,我们提出该方法的变体,其中相关的罚参数动态更新,并证明每个聚点是问题的局部极小点。

英文摘要

In this paper we consider $l_0$ regularized convex cone programming problems. In particular, we first propose an iterative hard thresholding (IHT) method and its variant for solving $l_0$ regularized box constrained convex programming. We show that the sequence generated by these methods converges to a local minimizer. Also, we establish the iteration complexity of the IHT method for finding an $\epsilon$-local-optimal solution. We then propose a method for solving $l_0$ regularized convex cone programming by applying the IHT method to its quadratic penalty relaxation and establish its iteration complexity for finding an $\epsilon$-approximate local minimizer. Finally, we propose a variant of this method in which the associated penalty parameter is dynamically updated, and show that every accumulation point is a local minimizer of the problem.
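
The basic iteration behind an IHT method can be sketched on the simplest instance, minimizing 0.5 ||Ax - b||^2 + lambda ||x||_0 without box or cone constraints. Each step takes a gradient step of length 1/L and then zeroes every entry whose magnitude does not exceed sqrt(2 lambda / L), which is the exact minimizer of the separable surrogate. The least-squares objective, the absence of constraints, the name `iht_l0` and the test problem are all assumptions for illustration; the box-constrained and penalty variants from the abstract are not reproduced.

```python
import numpy as np

def iht_l0(A, b, lam, n_iter=500):
    """Iterative hard thresholding for 0.5 * ||Ax - b||^2 + lam * ||x||_0."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    thresh = np.sqrt(2.0 * lam / L)            # entries at or below this are set to zero
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        u = x - A.T @ (A @ x - b) / L          # gradient step with step size 1/L
        x = np.where(np.abs(u) > thresh, u, 0.0)   # hard thresholding
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
x_true = np.zeros(20)
x_true[[2, 7, 15]] = [3.0, -3.0, 3.0]          # sparse ground truth
b = A @ x_true                                 # noiseless measurements
x_hat = iht_l0(A, b, lam=0.5)
```

Keeping an entry costs lambda, and dropping it costs (L/2) u_i^2 in the surrogate, which is where the threshold comes from.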

1202.5298 2026-06-03 eess.SY cs.LG cs.SY 版本更新

Min Max Generalization for Two-stage Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes

两阶段确定性批量模式强化学习的最小最大泛化:松弛方案

Raphael Fonteneau, Damien Ernst, Bernard Boigelot, Quentin Louveaux

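AI summary For the min max optimization problem in deterministic batch mode reinforcement learning, proposes two relaxation schemes (constraint dropping and Lagrangian dualization) that reduce computational complexity, and proves that they outperform the existing method.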

详情
AI中文摘要

我们研究了[22]中提出的用于计算确定性环境下批量模式强化学习策略的最小最大优化问题。首先,我们证明该问题是NP难的。在两阶段情况下,我们提供了两种松弛方案。第一种松弛方案通过丢弃一些约束来获得一个可在多项式时间内求解的问题。第二种松弛方案基于拉格朗日松弛,将所有约束对偶化,得到一个圆锥二次规划问题。我们还从理论上证明并实验说明,两种松弛方案均能提供比[22]中更好的结果。

英文摘要

We study the minmax optimization problem introduced in [22] for computing policies for batch mode reinforcement learning in a deterministic setting. First, we show that this problem is NP-hard. In the two-stage case, we provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, leads to a conic quadratic programming problem. We also theoretically prove and empirically illustrate that both relaxation schemes provide better results than those given in [22].

1210.5034 2026-06-03 cs.LG cs.CV cs.NA math.NA 版本更新

Optimal Computational Trade-Off of Inexact Proximal Methods

非精确近端方法的最优计算权衡

Pierre Machart, Sandrine Anthoine, Luca Baldassarre

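AI summary Studies the trade-off between computational cost and convergence rate in proximal-gradient methods and proposes SIP, a Speedy Inexact Proximal-gradient algorithm that is computationally efficient and easy to implement.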

详情
AI中文摘要

在本文中,我们研究了在使用近端梯度方法(机器学习中流行的优化工具)最小化复合泛函时,收敛速度与计算代价之间的权衡。我们考虑近端算子通过迭代过程计算的情况,该过程提供了精确近端算子的近似。在这种情况下,我们得到具有两个嵌套循环的算法。我们表明,在有限时间内达到所需精度的解时,最小化计算代价的策略是将内迭代次数设置为常数,这与收敛速度分析所指示的策略不同。在此过程中,我们还提出了一种称为SIP(快速非精确近端梯度算法)的新程序,该程序既计算高效又易于实现。我们的数值实验证实了理论发现,并表明SIP可以成为标准程序的非常有竞争力的替代方案。

英文摘要

In this paper, we investigate the trade-off between convergence rate and computational cost when minimizing a composite functional with proximal-gradient methods, which are popular optimisation tools in machine learning. We consider the case when the proximity operator is computed via an iterative procedure, which provides an approximation of the exact proximity operator. In that case, we obtain algorithms with two nested loops. We show that the strategy that minimizes the computational cost to reach a solution with a desired accuracy in finite time is to set the number of inner iterations to a constant, which differs from the strategy indicated by a convergence rate analysis. In the process, we also present a new procedure called SIP (that is Speedy Inexact Proximal-gradient algorithm) that is both computationally efficient and easy to implement. Our numerical experiments confirm the theoretical findings and suggest that SIP can be a very competitive alternative to the standard procedure.

1210.4883 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

A Model-Based Approach to Rounding in Spectral Clustering

基于模型的谱聚类舍入方法

Leonard K. M. Poon, April H. Liu, Tengfei Liu, Nevin Lianwen Zhang

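AI summary Proposes a rounding method for spectral clustering based on latent tree models that simultaneously solves three subproblems: selecting eigenvectors, determining the number of clusters, and partitioning the data.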

Comments Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

详情
AI中文摘要

在谱聚类中,首先为数据点集合定义相似矩阵,然后变换矩阵得到拉普拉斯矩阵,接着计算拉普拉斯矩阵的特征向量,最后利用前导特征向量获得数据的划分。最后一步有时称为舍入,需要决定使用多少个前导特征向量、确定聚类数以及划分数据点。本文提出了一种新的舍入方法。该方法在三个方面与以往方法不同。首先,我们放宽了聚类数等于所用特征向量数的假设。其次,在决定使用多少个前导特征向量时,我们不仅依赖前导特征向量本身包含的信息,还使用后续特征向量。第三,我们的方法是基于模型的,并使用一类称为潜树模型的图模型来解决舍入的三个子问题。我们在合成数据和真实数据上评估了该方法。结果表明,在理想情况下(即类间相似度为0),我们的方法能够正确工作,并且随着偏离理想情况,性能会优雅地下降。

英文摘要

In spectral clustering, one defines a similarity matrix for a collection of data points, transforms the matrix to get the Laplacian matrix, finds the eigenvectors of the Laplacian matrix, and obtains a partition of the data using the leading eigenvectors. The last step is sometimes referred to as rounding, where one needs to decide how many leading eigenvectors to use, to determine the number of clusters, and to partition the data points. In this paper, we propose a novel method for rounding. The method differs from previous methods in three ways. First, we relax the assumption that the number of clusters equals the number of eigenvectors used. Second, when deciding the number of leading eigenvectors to use, we not only rely on information contained in the leading eigenvectors themselves, but also use subsequent eigenvectors. Third, our method is model-based and solves all the three subproblems of rounding using a class of graphical models called latent tree models. We evaluate our method on both synthetic and real-world data. The results show that our method works correctly in the ideal case where between-clusters similarity is 0, and degrades gracefully as one moves away from the ideal case.

1210.4081 2026-06-03 math.NA cs.CV cs.DS cs.LG cs.NA math.OC 版本更新

Getting Feasible Variable Estimates From Infeasible Ones: MRF Local Polytope Study

从不可行变量估计获得可行变量估计:MRF局部多面体研究

Bogdan Savchynskyy, Stefan Schmidt

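AI summary For large-scale optimization problems with separability properties, proposes a method for constructing approximately feasible primal solutions from dual ones, applies it to the local polytope relaxation of Markov random field inference problems, and demonstrates its superiority over existing methods.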

Comments 20 pages, 4 figures

详情
AI中文摘要

本文提出了一种方法,用于从对偶解构造具有特定可分离性的大规模优化问题的近似可行原始解。虽然通常可以从对偶函数的(次)梯度产生不可行的原始估计,但将其投影到原始可行集往往并不容易,因为投影本身的复杂度与初始问题的复杂度相当。我们提出了一种替代的有效方法来获得可行性,并证明了其影响收敛到最优性的性质与欧几里得投影的性质相似。我们将我们的方法应用于马尔可夫随机场推理问题的局部多面体松弛,并证明了其优于现有方法。

英文摘要

This paper proposes a method for construction of approximate feasible primal solutions from dual ones for large-scale optimization problems possessing certain separability properties. Whereas infeasible primal estimates can typically be produced from (sub-)gradients of the dual function, it is often not easy to project them to the primal feasible set, since the projection itself has a complexity comparable to the complexity of the initial problem. We propose an alternative efficient method to obtain feasibility and show that its properties influencing the convergence to the optimum are similar to the properties of the Euclidean projection. We apply our method to the local polytope relaxation of inference problems for Markov Random Fields and demonstrate its superiority over existing methods.

1202.3772 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

Rank/Norm Regularization with Closed-Form Solutions: Application to Subspace Clustering

具有闭式解的秩/范数正则化:应用于子空间聚类

Yao-Liang Yu, Dale Schuurmans

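AI summary Generalizes the Eckart-Young-Mirsky theorem to all unitarily invariant norms, obtains closed-form solutions for rank/norm regularized problems, and applies them to subspace clustering, yielding new theoretical insights and promising experimental results.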

Comments 11 pages, 1 figure, appeared in UAI 2011. One footnote corrected and appendix added

详情
AI中文摘要

当数据从未知子空间采样时,主成分分析(PCA)提供了一种有效的方法来估计子空间,从而降低数据的维度。PCA的核心是Eckart-Young-Mirsky定理,该定理刻画了矩阵的最佳秩k近似。在本文中,我们证明了Eckart-Young-Mirsky定理在所有酉不变范数下的推广。利用这一结果,我们得到了一组秩/范数正则化问题的闭式解,并推导出一类通用子空间聚类问题(其中数据由未知子空间的并集建模)的闭式解。从这些结果中,我们获得了新的理论见解和有希望的实验结果。

英文摘要

When data is sampled from an unknown subspace, principal component analysis (PCA) provides an effective way to estimate the subspace and hence reduce the dimension of the data. At the heart of PCA is the Eckart-Young-Mirsky theorem, which characterizes the best rank k approximation of a matrix. In this paper, we prove a generalization of the Eckart-Young-Mirsky theorem under all unitarily invariant norms. Using this result, we obtain closed-form solutions for a set of rank/norm regularized problems, and derive closed-form solutions for a general class of subspace clustering problems (where data is modelled by unions of unknown subspaces). From these results we obtain new theoretical insights and promising experimental results.
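
The classical Eckart-Young-Mirsky statement that the abstract starts from can be checked numerically: truncating the SVD to the top k singular values gives a best rank-k approximation, and the error in a unitarily invariant norm depends only on the discarded singular values. The sketch below verifies the error values for the Frobenius, spectral and nuclear norms on a random matrix; it does not cover the paper's generalization or the regularized problems, and `best_rank_k` is a hypothetical helper name.

```python
import numpy as np

def best_rank_k(A, k):
    """Truncated SVD: keep the k largest singular values and their vectors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k], s

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))
k = 2
A_k, s = best_rank_k(A, k)

# Errors in three unitarily invariant norms, each a function of the tail s[k:].
fro_err = np.linalg.norm(A - A_k, "fro")    # sqrt of the sum of squared tail values
spec_err = np.linalg.norm(A - A_k, 2)       # largest discarded singular value
nuc_err = np.linalg.norm(A - A_k, "nuc")    # sum of the tail values
```

PCA uses exactly this truncation on centered data, with the retained right singular vectors spanning the estimated subspace.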

1107.3090 2026-06-03 cs.CC cs.LG cs.SY eess.SY math.OC 版本更新

On the Computational Complexity of Stochastic Controller Optimization in POMDPs

关于POMDP中随机控制器优化的计算复杂度

Nikos Vlassis, Michael L. Littman, David Barber

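AI summary Proves that finding an optimal stochastic 'blind' controller in a Markov decision process is NP-hard, that the corresponding decision problem is in PSPACE and SQRT-SUM-hard, and that the more general problem of stochastic controller optimization in POMDPs is also NP-hard, while a convex special case can be solved efficiently.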

Comments Corrected error in the proof of Theorem 2, and revised Section 5

详情
AI中文摘要

我们证明了在马尔可夫决策过程中寻找最优随机“盲”控制器是一个NP难问题。相应的决策问题是NP难的、属于PSPACE且是SQRT-SUM难的,因此将其置于NP中将意味着计算机科学中长期未解难题的突破。我们的结果确立了POMDP中更一般的随机控制器优化问题也是NP难的。尽管如此,我们概述了一个凸的特殊情况,该情况允许高效的全局解。

英文摘要

We show that the problem of finding an optimal stochastic 'blind' controller in a Markov decision process is an NP-hard problem. The corresponding decision problem is NP-hard, in PSPACE, and SQRT-SUM-hard, hence placing it in NP would imply breakthroughs in long-standing open problems in computer science. Our result establishes that the more general problem of stochastic controller optimization in POMDPs is also NP-hard. Nonetheless, we outline a special case that is convex and admits efficient global solutions.

1206.4481 2026-06-03 math.NA cs.LG cs.NA 版本更新

Parsimonious Mahalanobis Kernel for the Classification of High Dimensional Data

用于高维数据分类的简约马氏核

M. Fauvel, A. Villa, J. Chanussot, J. A. Benediktsson

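AI summary Exploiting the emptiness property of high-dimensional spaces, proposes a parsimonious kernel based on the Mahalanobis distance; the High Dimensional Discriminant Analysis model estimates signal and noise subspaces to give a stable inverse covariance matrix, hyperparameters are selected by optimizing the radius-margin bound in an SVM framework, and experiments show that the kernel outperforms the Gaussian kernel.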

详情
AI中文摘要

本文考虑使用核方法对高维数据进行分类。利用高维空间的空性,提出了一种基于马氏距离的核。计算马氏距离需要协方差矩阵的逆。在高维空间中,估计的协方差矩阵是病态的,其逆不稳定或不可能。使用简约统计模型,即高维判别分析模型,为每个考虑的类别估计特定的信号和噪声子空间,使得类别特定协方差矩阵的逆显式且稳定,从而定义了简约马氏核。采用基于SVM的框架,通过优化所谓的半径-间隔界来选择简约马氏核的超参数。在三个高维数据集上的实验结果表明,所提出的核适用于高维数据分类,比传统高斯核提供更好的分类精度。

英文摘要

The classification of high dimensional data with kernel methods is considered in this article. Exploiting the emptiness property of high dimensional spaces, a kernel based on the Mahalanobis distance is proposed. The computation of the Mahalanobis distance requires the inversion of a covariance matrix. In high dimensional spaces, the estimated covariance matrix is ill-conditioned and its inversion is unstable or impossible. Using a parsimonious statistical model, namely the High Dimensional Discriminant Analysis model, the specific signal and noise subspaces are estimated for each considered class, making the inverse of the class specific covariance matrix explicit and stable, leading to the definition of a parsimonious Mahalanobis kernel. An SVM-based framework is used for selecting the hyperparameters of the parsimonious Mahalanobis kernel by optimizing the so-called radius-margin bound. Experimental results on three high dimensional data sets show that the proposed kernel is suitable for classifying high dimensional data, providing better classification accuracies than the conventional Gaussian kernel.

1209.0001 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

An Improved Bound for the Nystrom Method for Large Eigengap

大特征间隙下Nyström方法的改进界

Mehrdad Mahdavi, Tianbao Yang, Rong Jin

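AI summary When the kernel matrix spectrum has a large eigengap, uses a concentration inequality for integral operators and matrix perturbation theory to improve the Frobenius-norm approximation error of the Nyström method from $O(N/m^{1/4})$ to $O(N/m^{1/2})$.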

详情
AI中文摘要

我们在大特征间隙假设下,为Nyström方法的近似误差建立了一个改进的界。这是基于经验观察,即特征间隙对Nyström方法的近似误差有显著影响。我们的方法基于积分算子的集中不等式和矩阵扰动理论。我们的分析表明,当存在大特征间隙时,在Frobenius范数下,我们可以将Nyström方法的近似误差从$O(N/m^{1/4})$改进到$O(N/m^{1/2})$,其中$N$是核矩阵的大小,$m$是采样列的数量。

英文摘要

We develop an improved bound for the approximation error of the Nyström method under the assumption that there is a large eigengap in the spectrum of kernel matrix. This is based on the empirical observation that the eigengap has a significant impact on the approximation error of the Nyström method. Our approach is based on the concentration inequality of integral operator and the theory of matrix perturbation. Our analysis shows that when there is a large eigengap, we can improve the approximation error of the Nyström method from $O(N/m^{1/4})$ to $O(N/m^{1/2})$ when measured in Frobenius norm, where $N$ is the size of the kernel matrix, and $m$ is the number of sampled columns.
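
For readers unfamiliar with the method being analyzed, the following is a minimal sketch of a Nyström approximation with uniform column sampling, assuming a Gaussian kernel on synthetic data. The function name `nystrom`, the data, and the `rcond` cutoff are illustrative choices, and the code does not attempt to verify the bound stated in the abstract.

```python
import numpy as np

def nystrom(K, m, rng):
    """Approximate a PSD matrix K by C W^+ C^T from m uniformly sampled columns."""
    N = K.shape[0]
    idx = rng.choice(N, size=m, replace=False)
    C = K[:, idx]                  # N x m sampled columns
    W = K[np.ix_(idx, idx)]        # m x m intersection block
    # The rcond cutoff discards tiny eigenvalues of W for numerical stability.
    return C @ np.linalg.pinv(W, rcond=1e-10) @ C.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq / 2.0)              # Gaussian kernel matrix, N = 200

# Frobenius-norm error for several numbers of sampled columns m.
errors = {m: np.linalg.norm(K - nystrom(K, m, rng), "fro") for m in (10, 40, 160)}
```

Here `N` is the size of the kernel matrix and `m` the number of sampled columns, matching the notation of the abstract.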

1208.0864 2026-06-03 math.OC cs.LG cs.SY eess.SY 版本更新

Statistical Results on Filtering and Epi-convergence for Learning-Based Model Predictive Control

基于学习的模型预测控制的滤波与上收敛统计结果

Anil Aswani, Humberto Gonzalez, S. Shankar Sastry, Claire Tomlin

AI总结 本文证明了基于学习的模型预测控制中测量模型选择的合理性,并给出了随机收敛性证明,同时证明了用于LBMPC的非参数估计器的统计性质。

详情
AI中文摘要

基于学习的模型预测控制(LBMPC)是一种提供鲁棒性确定性保证的技术,同时使用统计识别工具来识别更丰富的系统模型以提高性能。本技术说明提供了证明,阐明我们选择测量模型的原因,并给出了关于LBMPC随机收敛性的证明。第一部分讨论了可用常微分方程(ODE)描述的动力系统的同时状态估计和未建模动力学的统计识别(或学习)。第二部分提供了关于可与基于学习的模型预测控制(LBMPC)技术一起使用的不同统计估计器的上收敛的证明。特别地,我们证明了一种非参数估计器的统计性质,该估计器设计用于与LBMPC结合使用时具有正确的确定性和随机性数值实现性质。

英文摘要

Learning-based model predictive control (LBMPC) is a technique that provides deterministic guarantees on robustness, while statistical identification tools are used to identify richer models of the system in order to improve performance. This technical note provides proofs that elucidate the reasons for our choice of measurement model, as well as giving proofs concerning the stochastic convergence of LBMPC. The first part of this note discusses simultaneous state estimation and statistical identification (or learning) of unmodeled dynamics, for dynamical systems that can be described by ordinary differential equations (ODE's). The second part provides proofs concerning the epi-convergence of different statistical estimators that can be used with the learning-based model predictive control (LBMPC) technique. In particular, we prove results on the statistical properties of a nonparametric estimator that we have designed to have the correct deterministic and stochastic properties for numerical implementation when used in conjunction with LBMPC.

1107.2487 2026-06-03 math.OC cs.LG cs.SY eess.SY math.ST stat.TH 版本更新

Provably Safe and Robust Learning-Based Model Predictive Control

可证明安全且鲁棒的基于学习的模型预测控制

Anil Aswani, Humberto Gonzalez, S. Shankar Sastry, Claire Tomlin

AI总结 提出一种基于学习的模型预测控制(LBMPC)方案,通过解耦安全与性能,利用统计学习改进性能并保证鲁棒性。

详情
AI中文摘要

控制器设计面临鲁棒性与性能之间的权衡,线性控制器的可靠性使得许多从业者关注前者。然而,为了应对日益增长的能源约束,提高系统性能重新引起兴趣。本文描述了一种基于学习的模型预测控制(LBMPC)方案,该方案提供鲁棒性的确定性保证,同时使用统计识别工具来识别更丰富的系统模型以提高性能;该框架的优点在于它处理状态和输入约束,根据成本函数优化系统性能,并且可以设计使用各种参数或非参数统计工具。LBMPC的主要见解是,在优化框架中,通过维护两个系统模型,可以在合理条件下解耦安全性和性能。第一个是具有不确定性界限的近似模型,第二个模型通过统计方法更新。LBMPC通过选择最小化成本的输入(受学习动力学约束)来提高性能,并通过检查这些相同的输入是否在不确定性下保持近似模型稳定来确保安全性和鲁棒性。此外,我们证明如果系统充分激励,则LBMPC控制动作概率收敛到使用真实动力学计算的MPC的控制动作。

英文摘要

Controller design faces a trade-off between robustness and performance, and the reliability of linear controllers has caused many practitioners to focus on the former. However, there is renewed interest in improving system performance to deal with growing energy constraints. This paper describes a learning-based model predictive control (LBMPC) scheme that provides deterministic guarantees on robustness, while statistical identification tools are used to identify richer models of the system in order to improve performance; the benefits of this framework are that it handles state and input constraints, optimizes system performance with respect to a cost function, and can be designed to use a wide variety of parametric or nonparametric statistical tools. The main insight of LBMPC is that safety and performance can be decoupled under reasonable conditions in an optimization framework by maintaining two models of the system. The first is an approximate model with bounds on its uncertainty, and the second model is updated by statistical methods. LBMPC improves performance by choosing inputs that minimize a cost subject to the learned dynamics, and it ensures safety and robustness by checking whether these same inputs keep the approximate model stable when it is subject to uncertainty. Furthermore, we show that if the system is sufficiently excited, then the LBMPC control action probabilistically converges to that of an MPC computed using the true dynamics.

1207.3438 2026-06-03 stat.ML cs.LG cs.NA math.NA 版本更新

MahNMF: Manhattan Non-negative Matrix Factorization

MahNMF: 曼哈顿非负矩阵分解

Naiyang Guan, Dacheng Tao, Zhigang Luo, John Shawe-Taylor

AI总结 针对重尾噪声和异常值问题,提出基于曼哈顿距离的MahNMF模型,并开发了秩一残差迭代和Nesterov平滑两种快速优化算法。

Comments 43 pages, 20 figures, 2 tables, submission to Journal of Machine Learning Research

详情
AI中文摘要

非负矩阵分解(NMF)通过两个非负低秩因子矩阵 $W$ 和 $H$ 的乘积来逼近非负矩阵 $X$。NMF 及其扩展通过最小化 $X$ 与 $W^T H$ 之间的 Kullback-Leibler 散度或欧氏距离来建模泊松噪声或高斯噪声。然而,当噪声分布具有重尾特性时,这些方法表现不佳。本文提出曼哈顿 NMF(MahNMF),通过最小化 $X$ 与 $W^T H$ 之间的曼哈顿距离来建模重尾拉普拉斯噪声。与稀疏和低秩矩阵分解类似,MahNMF 能够鲁棒地估计非负矩阵的低秩部分和稀疏部分,从而在数据受到异常值污染时有效工作。我们通过开发带盒约束的 MahNMF、流形正则化 MahNMF、组稀疏 MahNMF、弹性网诱导 MahNMF 和对称 MahNMF,将 MahNMF 扩展到各种实际应用。本文的主要贡献在于为 MahNMF 及其扩展提出了两种快速优化算法:秩一残差迭代(RRI)方法和 Nesterov 平滑方法。具体地,通过将 MahNMF 中的残差矩阵近似为 $W$ 的一行和 $H$ 的一行的外积,我们开发了 RRI 方法,以闭式解迭代更新 $W$ 和 $H$ 的每个变量。尽管 RRI 对于小规模 MahNMF 及其某些扩展是高效的,但它既不能扩展到大规模矩阵,也不够灵活以优化所有 MahNMF 扩展。由于 MahNMF 及其扩展的目标函数既非凸也不光滑,我们应用 Nesterov 平滑方法,在固定一个因子矩阵的情况下递归优化另一个因子矩阵。通过将平滑参数设置为与迭代次数成反比,我们逐步提高了 MahNMF 及其扩展的逼近精度。

英文摘要

Non-negative matrix factorization (NMF) approximates a non-negative matrix $X$ by a product of two non-negative low-rank factor matrices $W$ and $H$. NMF and its extensions minimize either the Kullback-Leibler divergence or the Euclidean distance between $X$ and $W^T H$ to model the Poisson noise or the Gaussian noise. In practice, when the noise distribution is heavy tailed, they cannot perform well. This paper presents Manhattan NMF (MahNMF) which minimizes the Manhattan distance between $X$ and $W^T H$ for modeling the heavy tailed Laplacian noise. Similar to sparse and low-rank matrix decompositions, MahNMF robustly estimates the low-rank part and the sparse part of a non-negative matrix and thus performs effectively when data are contaminated by outliers. We extend MahNMF for various practical applications by developing box-constrained MahNMF, manifold regularized MahNMF, group sparse MahNMF, elastic net inducing MahNMF, and symmetric MahNMF. The major contribution of this paper lies in two fast optimization algorithms for MahNMF and its extensions: the rank-one residual iteration (RRI) method and Nesterov's smoothing method. In particular, by approximating the residual matrix by the outer product of one row of $W$ and one row of $H$ in MahNMF, we develop an RRI method to iteratively update each variable of $W$ and $H$ in a closed form solution. Although RRI is efficient for small scale MahNMF and some of its extensions, it is neither scalable to large scale matrices nor flexible enough to optimize all MahNMF extensions. Since the objective functions of MahNMF and its extensions are neither convex nor smooth, we apply Nesterov's smoothing method to recursively optimize one factor matrix with another matrix fixed. By setting the smoothing parameter inversely proportional to the iteration number, we improve the approximation accuracy iteratively for both MahNMF and its extensions.

1203.1007 2026-06-03 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Agnostic System Identification for Model-Based Reinforcement Learning

基于模型的强化学习的不可知系统辨识

Stephane Ross, J. Andrew Bagnell

AI总结 针对模型类可能不包含真实系统的不可知情况,提出一种利用无遗憾在线学习算法获得近优策略的迭代方法,并在离散和连续域上验证其有效性。

Comments 8 pages, published in ICML 2012

详情
AI中文摘要

控制中的一个基本问题是从观测中学习一个对控制器综合有用的系统模型。为了提供良好的性能保证,现有方法必须假设真实系统属于学习过程中考虑的模型类。我们提出了一种迭代方法,即使在系统不在模型类中的不可知情况下,也能提供强有力的保证。特别地,我们表明,只要某个模型实现了低训练误差并且能够访问良好的探索分布,任何无遗憾在线学习算法都可以用于获得近优策略。我们的方法适用于离散和连续域。我们在文献中一个具有挑战性的直升机领域上展示了其有效性和可扩展性。

英文摘要

A fundamental problem in control is to learn a model of a system from observations that is useful for controller synthesis. To provide good performance guarantees, existing methods must assume that the real system is in the class of models considered during learning. We present an iterative method with strong guarantees even in the agnostic case where the system is not in the class. In particular, we show that any no-regret online learning algorithm can be used to obtain a near-optimal policy, provided some model achieves low training error and access to a good exploration distribution. Our approach applies to both discrete and continuous domains. We demonstrate its efficacy and scalability on a challenging helicopter domain from the literature.

1206.6857 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

Faster Gaussian Summation: Theory and Experiment

更快的高斯求和:理论与实验

Dongryeol Lee, Alexander G. Gray

AI总结 本文针对机器学习中常见的高斯求和问题,提出两种新扩展(带严格误差界的O(Dp)泰勒展开和集成任意近似方法的新误差控制方案),并在自适应分层数据结构框架下实现更快的算法,通过核密度估计中的最优带宽选择实验首次揭示了当前最先进方法的优缺点。

Comments Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

详情
AI中文摘要

我们为高斯求和问题提供了更快的算法,该问题出现在许多机器学习方法中。我们在使用自适应分层数据结构的最佳离散算法框架内,开发了两个新的扩展——一个具有严格误差界的O(Dp)泰勒展开式用于高斯核,以及一个集成任意近似方法的新误差控制方案。我们在核密度估计中最优带宽选择的背景下严格评估了这些技术的实证效果,首次揭示了当前最先进方法的优缺点。我们的结果表明,新的误差控制方案提高了性能,而级数展开方法仅在低维(五维或以下)中有效。

英文摘要

We provide faster algorithms for the problem of Gaussian summation, which occurs in many machine learning methods. We develop two new extensions - an O(Dp) Taylor expansion for the Gaussian kernel with rigorous error bounds and a new error control scheme integrating any arbitrary approximation method - within the best discrete algorithmic framework using adaptive hierarchical data structures. We rigorously evaluate these techniques empirically in the context of optimal bandwidth selection in kernel density estimation, revealing the strengths and weaknesses of current state-of-the-art approaches for the first time. Our results demonstrate that the new error control scheme yields improved performance, whereas the series expansion approach is only effective in low dimensions (five or less).

1206.6833 2026-06-03 cs.LG cs.CE cs.NA math.NA stat.ML 版本更新

Matrix Tile Analysis

矩阵瓦片分析

Inmar Givoni, Vincent Cheung, Brendan J. Frey

AI总结 提出矩阵瓦片分析(MTA)问题,通过非重叠瓦片分解矩阵,并设计近似迭代算法和和积松弛方法,在合成数据和酵母基因敲除数据上验证其有效性。

Comments Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

详情
AI中文摘要

许多任务需要在数字、符号或类别似然矩阵中寻找元素组。一种方法是使用高效的双线性或三线性分解技术,包括PCA、ICA、稀疏矩阵分解和格子分析。当矩阵元素的加法和乘法没有明确定义时,这些技术不适用。更直接地,像双聚类这样的方法可用于对矩阵元素进行分类,但这些方法做出了过于严格的假设,即每个元素的类别是行类别和列类别的函数。我们引入一个通用的计算问题——矩阵瓦片分析(MTA),它将矩阵分解为一组非重叠的瓦片,每个瓦片由通常不相邻的行和列的子集定义。MTA不需要用于组合瓦片的代数,但必须搜索瓦片分配的离散组合。精确MTA是一个计算上难以处理的整数规划问题,但我们描述了一种近似迭代技术和一种计算高效的整数规划和积松弛。我们在数百个随机生成的任务上比较了这些方法与PCA和格子分析的有效性。利用双基因敲除数据,我们展示了MTA找到了具有生物学相关功能的相互作用酵母基因群。

英文摘要

Many tasks require finding groups of elements in a matrix of numbers, symbols or class likelihoods. One approach is to use efficient bi- or tri-linear factorization techniques including PCA, ICA, sparse matrix factorization and plaid analysis. These techniques are not appropriate when addition and multiplication of matrix elements are not sensibly defined. More directly, methods like bi-clustering can be used to classify matrix elements, but these methods make the overly-restrictive assumption that the class of each element is a function of a row class and a column class. We introduce a general computational problem, `matrix tile analysis' (MTA), which consists of decomposing a matrix into a set of non-overlapping tiles, each of which is defined by a subset of usually nonadjacent rows and columns. MTA does not require an algebra for combining tiles, but must search over discrete combinations of tile assignments. Exact MTA is a computationally intractable integer programming problem, but we describe an approximate iterative technique and a computationally efficient sum-product relaxation of the integer program. We compare the effectiveness of these methods to PCA and plaid on hundreds of randomly generated tasks. Using double-gene-knockout data, we show that MTA finds groups of interacting yeast genes that have biologically-related functions.

1206.6474 2026-06-03 cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

Estimation of Simultaneously Sparse and Low Rank Matrices

同时稀疏和低秩矩阵的估计

Emile Richard, Pierre-Andre Savalle, Nicolas Vayatis

AI总结 本文提出一种凸混合惩罚方法,同时使用ℓ1范数和迹范数,以估计同时稀疏和低秩的矩阵,并推导了预言不等式和链接预测的泛化误差界,通过近端下降算法高效求解。

Comments Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

详情
AI中文摘要

本文介绍了一种惩罚矩阵估计过程,旨在同时实现稀疏和低秩的解。这种结构出现在社交网络或蛋白质相互作用的背景下,其中底层图的邻接矩阵在适当基下是块对角化的。我们引入了一种凸混合惩罚,同时涉及ℓ1范数和迹范数。我们得到了一个预言不等式,指示了两种效应如何根据目标矩阵的性质相互作用。我们界定了链接预测问题中的泛化误差。我们还开发了近端下降策略来高效求解优化问题,并在合成和真实数据集上评估了性能。

英文摘要

The paper introduces a penalized matrix estimation procedure aiming at solutions which are sparse and low-rank at the same time. Such structures arise in the context of social networks or protein interactions where underlying graphs have adjacency matrices which are block-diagonal in the appropriate basis. We introduce a convex mixed penalty which involves $\ell_1$-norm and trace norm simultaneously. We obtain an oracle inequality which indicates how the two effects interact according to the nature of the target matrix. We bound generalization error in the link prediction problem. We also develop proximal descent strategies to solve the optimization problem efficiently and evaluate performance on synthetic and real data sets.
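
The two penalty terms in the abstract each have a well-known proximal operator, sketched below. This is only an illustration of the building blocks: the abstract does not say how the authors combine them, and the proximal operator of the sum of the two penalties is not simply the composition of the two functions shown here. The names `prox_l1` and `prox_trace` are hypothetical.

```python
import numpy as np

def prox_l1(S, tau):
    """Entrywise soft-thresholding: proximal operator of tau * ||S||_1."""
    return np.sign(S) * np.maximum(np.abs(S) - tau, 0.0)

def prox_trace(S, tau):
    """Singular value thresholding: proximal operator of tau * trace norm."""
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
S = rng.normal(size=(6, 6))
sparse_part = prox_l1(S, 0.5)        # small entries are set to zero
low_rank_part = prox_trace(S, 2.0)   # small singular values are set to zero
```

Soft-thresholding promotes sparsity entry by entry, while singular value thresholding promotes low rank, which is why a mixed penalty can favor matrices with both properties.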

1206.6470 2026-06-03 cs.LG cs.DM cs.NA math.NA stat.ML 版本更新

A Combinatorial Algebraic Approach for the Identifiability of Low-Rank Matrix Completion

低秩矩阵完备可辨识性的组合代数方法

Franz Kiraly, Ryota Tomioka

AI总结 本文通过组合代数方法,首次给出了任意秩矩阵从一组矩阵条目中可辨识的充要组合条件,并提出了新算法。

Comments Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

详情
AI中文摘要

本文回顾了矩阵完备问题,并揭示了其与代数几何、组合学和图论的密切联系。我们首次给出了任意秩矩阵从一组矩阵条目中可辨识的充要组合条件,为矩阵完备问题提供了理论约束和新算法。最后,我们通过算法评估了给定条件和算法在实际相关矩阵大小上的紧致性,表明代数组合方法可以改进现有的矩阵完备方法。

英文摘要

In this paper, we review the problem of matrix completion and expose its intimate relations with algebraic geometry, combinatorics and graph theory. We present the first necessary and sufficient combinatorial conditions for matrices of arbitrary rank to be identifiable from a set of matrix entries, yielding theoretical constraints and new algorithms for the problem of matrix completion. We conclude by algorithmically evaluating the tightness of the given conditions and algorithms for practically relevant matrix sizes, showing that the algebraic-combinatoric approach can lead to improvements over state-of-the-art matrix completion methods.

1206.6141 2026-06-03 cs.LG cs.SY eess.SY stat.ML 版本更新

Directed Time Series Regression for Control

面向控制的定向时间序列回归

Yi-Hao Kao, Benjamin Van Roy

AI总结 提出定向时间序列回归方法,结合最小二乘回归与经验优化的优点,用于确定性等价模型预测控制中的时间序列模型参数估计,在随机倒立摆平衡问题中显著提升控制器性能。

详情
AI中文摘要

我们提出了定向时间序列回归,这是一种用于确定性等价模型预测控制中时间序列模型参数估计的新方法。该方法结合了最小二乘回归和经验优化的优点。通过一个涉及著名倒立摆平衡问题的随机版本的计算研究,我们证明了定向时间序列回归能够在控制器性能上比上述任何一种替代方法产生显著的改进。

英文摘要

We propose directed time series regression, a new approach to estimating parameters of time-series models for use in certainty equivalent model predictive control. The approach combines merits of least squares regression and empirical optimization. Through a computational study involving a stochastic version of a well known inverted pendulum balancing problem, we demonstrate that directed time series regression can generate significant improvements in controller performance over either of the aforementioned alternatives.

1206.4676 2026-06-03 cs.LG cs.CV cs.NA math.NA stat.ML 版本更新

Clustering by Low-Rank Doubly Stochastic Matrix Decomposition

基于低秩双随机矩阵分解的聚类

Zhirong Yang, Erkki Oja

AI总结 提出一种超越矩阵分解的低秩学习方法,通过两步二分随机游走逼近聚类分配概率,利用KL散度最小化实现判别模型的最大似然估计,并采用松弛的MM算法优化,显著提升大规模流形数据的聚类纯度。

Comments ICML2012

详情
AI中文摘要

在过去十年中,通过非负低秩近似进行聚类分析取得了显著进展。然而,该方向上的大多数近似方法仍局限于矩阵分解。我们提出了一种新的低秩学习方法以提高聚类性能,该方法超越了矩阵分解。该近似基于通过虚拟聚类节点的两步二分随机游走,其中近似仅由聚类分配概率构成。通过Kullback-Leibler散度测量的近似误差最小化等价于判别模型的最大似然估计,这为我们的方法提供了坚实的概率解释。优化通过一种松弛的Majorization-Minimization算法实现,该算法在寻找良好局部最小值方面具有优势。此外,我们指出带有Dirichlet先验的正则化算法仅作为初始化。实验结果表明,新方法在各种数据集上,特别是大规模流形数据上,具有强大的聚类纯度性能。

英文摘要

Clustering analysis by nonnegative low-rank approximations has achieved remarkable progress in the past decade. However, most approximation approaches in this direction are still restricted to matrix factorization. We propose a new low-rank learning method to improve the clustering performance, which is beyond matrix factorization. The approximation is based on a two-step bipartite random walk through virtual cluster nodes, where the approximation is formed by only cluster assigning probabilities. Minimizing the approximation error measured by Kullback-Leibler divergence is equivalent to maximizing the likelihood of a discriminative model, which endows our method with a solid probabilistic interpretation. The optimization is implemented by a relaxed Majorization-Minimization algorithm that is advantageous in finding good local minima. Furthermore, we point out that the regularized algorithm with Dirichlet prior only serves as initialization. Experimental results show that the new method has strong performance in clustering purity for various datasets, especially for large-scale manifold data.
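
The two-step random walk can be illustrated with a short sketch. It assumes a uniform prior over data points, so that a walk from point j to a cluster k and back to point i has probability W[j, k] * W[i, k] / sum_v W[v, k]; this formula is a reading of the abstract rather than a quotation from it, and `two_step_walk` is a hypothetical name.

```python
import numpy as np

def two_step_walk(W):
    """A_hat[i, j] = sum_k W[i, k] * W[j, k] / sum_v W[v, k]."""
    col_mass = W.sum(axis=0)               # total assignment mass per cluster
    return (W / col_mass) @ W.T

rng = np.random.default_rng(0)
W = rng.random((8, 3))
W /= W.sum(axis=1, keepdims=True)          # rows are P(cluster | point)
A_hat = two_step_walk(W)                   # 8 x 8 similarity approximation
```

Under this assumption the approximation is symmetric, has rank at most the number of clusters, and has rows and columns that each sum to one, which is consistent with the "low-rank doubly stochastic" wording in the title.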

1206.4645 2026-06-03 cs.LG cs.NA math.NA stat.ME stat.ML 版本更新

Ensemble Methods for Convex Regression with Applications to Geometric Programming Based Circuit Design

凸回归的集成方法及其在基于几何规划的电路设计中的应用

Lauren Hannah, David Dunson

AI总结 本文提出集成方法(如bagging、smearing和随机划分)来改进分段线性凸回归的稳定性,并应用于基于几何规划的电路设计中的器件建模和约束近似。

Comments ICML2012

详情
AI中文摘要

凸回归是连接统计估计和确定性凸优化的一个有前景的领域。新的分段线性凸回归方法快速且可扩展,但在用于近似优化问题的约束或目标函数时可能不稳定。集成方法,如bagging、smearing和随机划分,可以缓解这一问题并保持底层估计器的理论性质。我们通过实验检验了集成方法在预测和优化中的性能,然后将其应用于基于几何规划的电路设计中的器件建模和约束近似。

英文摘要

Convex regression is a promising area for bridging statistical estimation and deterministic convex optimization. New piecewise linear convex regression methods are fast and scalable, but can have instability when used to approximate constraints or objective functions for optimization. Ensemble methods, like bagging, smearing and random partitioning, can alleviate this problem and maintain the theoretical properties of the underlying estimator. We empirically examine the performance of ensemble methods for prediction and optimization, and then apply them to device modeling and constraint approximation for geometric programming based circuit design.

1206.4643 2026-06-03 cs.LG cs.GT cs.SY eess.SY 版本更新

Lightning Does Not Strike Twice: Robust MDPs with Coupled Uncertainty

闪电不会两次击中同一地点:具有耦合不确定性的鲁棒MDP

Shie Mannor, Ofir Mebel, Huan Xu

AI总结 针对马尔可夫决策过程中参数不确定性的耦合问题,提出“闪电不会两次击中同一地点”概念,设计可计算最优策略的算法。

Comments ICML2012

详情
AI中文摘要

我们考虑参数不确定性下的马尔可夫决策过程。以往的研究都限制在不同状态之间的不确定性是解耦的,这导致保守的解。相比之下,我们引入了一个直观的概念,称为“闪电不会两次击中同一地点”,来建模耦合的不确定参数。具体来说,我们要求系统只能偏离其名义参数有限次数。我们给出了概率保证,表明该模型代表了现实生活中的情况,并设计了使用这一概念计算最优控制策略的可行算法。

英文摘要

We consider Markov decision processes under parameter uncertainty. Previous studies all restrict to the case that uncertainties among different states are uncoupled, which leads to conservative solutions. In contrast, we introduce an intuitive concept, termed "Lightning Does not Strike Twice," to model coupled uncertain parameters. Specifically, we require that the system can deviate from its nominal parameters only a bounded number of times. We give probabilistic guarantees indicating that this model represents real life situations and devise tractable algorithms for computing optimal control policies using this concept.

1206.4640 2026-06-03 math.NA cs.LG cs.NA stat.ML 版本更新

Stability of matrix factorization for collaborative filtering

协同过滤中矩阵分解的稳定性

Yu-Xiang Wang, Huan Xu

AI总结 研究矩阵分解算法在矩阵补全中对抗性噪声的稳定性,通过误差界、子空间分析和个体预测误差分析,为协同过滤系统设计提供指导。

Comments ICML2012

详情
AI中文摘要

我们研究了矩阵分解算法在矩阵补全中对抗性噪声的稳定性。具体地,我们的结果包括:(I)我们以均方根误差为度量,给出了分解方法解矩阵与真实值之间的差距的界;(II)我们将矩阵分解视为子空间拟合问题,并分析了求解子空间与真实子空间之间的差异;(III)我们基于子空间稳定性分析了单个用户的预测误差。我们将这些结果应用于操纵者攻击下的协同过滤问题,从而为协同过滤系统设计提供了有用的见解和指导。

英文摘要

We study the stability vis-à-vis adversarial noise of matrix factorization algorithms for matrix completion. In particular, our results include: (I) we bound the gap between the solution matrix of the factorization method and the ground truth in terms of root mean square error; (II) we treat the matrix factorization as a subspace fitting problem and analyze the difference between the solution subspace and the ground truth; (III) we analyze the prediction error of individual users based on the subspace stability. We apply these results to the problem of collaborative filtering under manipulator attack, which leads to useful insights and guidelines for collaborative filtering system design.

1206.4608 2026-06-03 cs.LG cs.DS cs.NA math.NA stat.ML 版本更新

A Hybrid Algorithm for Convex Semidefinite Optimization

凸半定优化的一种混合算法

Soeren Laue

AI总结 提出一种混合算法用于优化凸光滑函数在半正定矩阵锥上的问题,该算法收敛到全局最优解,可解决大规模半定规划,在矩阵补全、度量学习和稀疏PCA上优于现有方法。

Comments ICML2012

详情
AI中文摘要

我们提出了一种混合算法,用于在半正定矩阵锥上优化凸光滑函数。我们的算法收敛到全局最优解,可用于解决一般的大规模半定规划问题,因此可以轻松应用于各种机器学习问题。我们在三个机器学习问题(矩阵补全、度量学习和稀疏PCA)上展示了实验结果。我们的方法优于最先进的算法。

英文摘要

We present a hybrid algorithm for optimizing a convex, smooth function over the cone of positive semidefinite matrices. Our algorithm converges to the global optimal solution and can be used to solve general large-scale semidefinite programs and hence can be readily applied to a variety of machine learning problems. We show experimental results on three machine learning problems (matrix completion, metric learning, and sparse PCA). Our approach outperforms state-of-the-art algorithms.

1206.4602 2026-06-03 math.NA cs.LG cs.NA stat.ML 版本更新

Quasi-Newton Methods: A New Direction

拟牛顿方法:一个新方向

Philipp Hennig, Martin Kiefel

AI总结 本文通过将拟牛顿方法解释为贝叶斯线性回归的近似,揭示了经典算法的缺陷,并提出了一种新的非参数拟牛顿方法,在相似计算成本下更高效地利用信息。

Comments ICML2012

详情
AI中文摘要

在拟牛顿方法发明四十年后,它们仍然是无约束数值优化中的最先进技术。虽然通常不被这样解释,但这些是拟合目标函数的局部二次逼近的学习算法。我们表明,许多(包括最流行的)拟牛顿方法可以解释为在不同先验假设下贝叶斯线性回归的近似。这一新概念阐明了经典算法的一些缺陷,并为一种新颖的非参数拟牛顿方法指明了道路,该方法能够在与之前方法相似的计算成本下更有效地利用可用信息。

英文摘要

Four decades after their invention, quasi-Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.
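
As a reminder of what "fitting a local quadratic approximation" means in practice, here is a minimal sketch of the classical BFGS update of a Hessian estimate. It shows only the standard update rule; the Bayesian interpretation and the nonparametric method from the abstract are not reproduced. The function name `bfgs_update` and the test matrix `A` are illustrative.

```python
import numpy as np

def bfgs_update(B, s, y):
    """Return B_new with B_new @ s == y (secant equation); requires s @ y > 0."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

# Quadratic f(x) = 0.5 * x^T A x, so gradient differences are y = A s.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.eye(2)                               # initial Hessian estimate
for s in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    B = bfgs_update(B, s, A @ s)            # learn from one observed step
```

Each update enforces the secant equation for the most recent step only, so `B` agrees with `A` along the last direction but need not equal `A` overall.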

1206.3285 2026-06-03 cs.AI cs.LG cs.SY eess.SY 版本更新

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

具有线性函数逼近和优先级扫描的Dyna风格规划

Richard S. Sutton, Csaba Szepesvari, Alborz Geramifard, Michael P. Bowling

AI总结 本文提出一种基于模型的Dyna风格规划方法,扩展至线性函数逼近,证明其收敛性,并引入线性Dyna的优先级扫描算法。

Comments Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

详情
AI中文摘要

我们考虑在在线设置中高效学习最优控制策略和价值函数的问题,其中状态空间很大,且必须在每次与世界交互后获得估计。本文开发了一种显式的基于模型的方法,将Dyna架构扩展到线性函数逼近。Dyna风格规划通过从世界模型生成想象经验,然后将无模型强化学习算法应用于想象的状态转移来进行。我们的主要结果是证明,在自然条件下,线性Dyna风格规划收敛到一个独立于生成分布的唯一解。在策略评估设置中,我们证明极限点是最小二乘(LSTD)解。我们的结果的一个含义是,优先级扫描可以合理地扩展到线性逼近情况,即回溯到前驱特征而不是前驱状态。我们介绍了两种线性Dyna的优先级扫描版本,并在Mountain Car和Boyan Chain问题上简要展示了它们的经验性能。

英文摘要

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available after each interaction with the world. This paper develops an explicitly model-based approach extending the Dyna architecture to linear function approximation. Dyna-style planning proceeds by generating imaginary experience from the world model and then applying model-free reinforcement learning algorithms to the imagined state transitions. Our main results are to prove that linear Dyna-style planning converges to a unique solution independent of the generating distribution, under natural conditions. In the policy evaluation setting, we prove that the limit point is the least-squares (LSTD) solution. An implication of our results is that prioritized-sweeping can be soundly extended to the linear approximation case, backing up to preceding features rather than to preceding states. We introduce two versions of prioritized sweeping with linear Dyna and briefly illustrate their performance empirically on the Mountain Car and Boyan Chain problems.

1008.5373 2026-06-03 math.OC cs.LG cs.NA cs.SY eess.SY math.NA q-fin.CP q-fin.ST 版本更新

Penalty Decomposition Methods for Rank Minimization

秩最小化的罚分解方法

Zhaosong Lu, Yong Zhang

AI总结 本文提出罚分解方法求解目标或约束中含秩的秩最小化问题,通过块坐标下降法求解子问题,并证明序列聚点满足一阶最优性条件,在矩阵补全和最近低秩相关矩阵问题上表现优于或持平现有方法。

Comments This paper has been withdrawn by the author

详情
AI中文摘要

本文考虑一般的秩最小化问题,其中秩出现在目标函数或约束中。我们首先建立一类特殊的秩最小化问题具有闭式解。利用这一结果,我们随后提出针对一般秩最小化问题的罚分解方法,其中每个子问题通过块坐标下降法求解。在适当假设下,我们证明由罚分解方法生成的序列的任何聚点都满足问题的非线性重构的一阶最优性条件。最后,我们将方法应用于矩阵补全和最近低秩相关矩阵问题以测试性能。计算结果表明,我们的方法在解的质量方面通常与现有方法相当或更优。

英文摘要

In this paper we consider general rank minimization problems with rank appearing in either objective function or constraint. We first establish that a class of special rank minimization problems has closed-form solutions. Using this result, we then propose penalty decomposition methods for general rank minimization problems in which each subproblem is solved by a block coordinate descent method. Under some suitable assumptions, we show that any accumulation point of the sequence generated by the penalty decomposition methods satisfies the first-order optimality conditions of a nonlinear reformulation of the problems. Finally, we test the performance of our methods by applying them to the matrix completion and nearest low-rank correlation matrix problems. The computational results demonstrate that our methods are generally comparable or superior to the existing methods in terms of solution quality.
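
The abstract does not state which class of rank problems admits a closed-form solution. A familiar example of a rank-constrained problem with such a solution is the best rank-r approximation in Frobenius norm, obtained by truncating the singular value decomposition; the sketch below shows that example only, under the assumption that it is representative. The name `best_rank_r` is hypothetical.

```python
import numpy as np

def best_rank_r(A, r):
    """Closest matrix to A in Frobenius norm among matrices of rank <= r."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]     # keep the r largest singular values

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 5))
A2 = best_rank_r(A, 2)

# The approximation error equals the norm of the discarded singular values.
s = np.linalg.svd(A, compute_uv=False)
err = np.linalg.norm(A - A2, "fro")
```

A closed-form step of this kind is cheap to evaluate, which makes it a natural inner step for an alternating scheme such as block coordinate descent.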

1205.2643 2026-06-03 cs.LG cs.SY eess.SY math.OC stat.CO stat.ML 版本更新

New inference strategies for solving Markov Decision Processes using reversible jump MCMC

使用可逆跳跃MCMC求解马尔可夫决策过程的新推理策略

Matthias Hoffman, Hendrik Kueck, Nando de Freitas, Arnaud Doucet

AI总结 本文提出基于可逆跳跃MCMC的改进推理策略,通过新目标分布和打破参数-轨迹相关性,实现高维空间中的最优策略估计。

Comments Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

详情
AI中文摘要

本文基于先前使用推理技术(特别是马尔可夫链蒙特卡洛(MCMC)方法)求解参数化控制问题的工作,提出了一系列改进,以使该方法在一般的高维空间中更加实用。我们首先引入了一个新的目标分布,能够从采样轨迹中融入更多奖励信息。我们还展示了如何打破策略参数与采样轨迹之间的强相关性,以实现更自由的采样。最后,我们展示了如何以原则性的方式将这些技术结合起来,以获得最优策略的估计。

英文摘要

In this paper we build on previous work which uses inference techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampled trajectories. We also show how to break strong correlations between the policy parameters and sampled trajectories in order to sample more freely. Finally, we show how to incorporate these techniques in a principled manner to obtain estimates of the optimal policy.

1008.5372 2026-06-03 math.OC cs.CV cs.IT cs.LG cs.NA math.IT math.NA stat.ME 版本更新

Penalty Decomposition Methods for $L0$-Norm Minimization

L0-范数最小化的罚分解方法

Zhaosong Lu, Yong Zhang

AI总结 提出罚分解方法求解含L0-范数的优化问题,通过转化为秩最小化问题并利用向量化操作,在压缩感知等应用中优于现有方法。

Comments This paper has been withdrawn by the author because an updated version has been resubmitted

详情
AI中文摘要

本文考虑一般的l0-范数最小化问题,即目标函数或约束中出现l0-范数的问题。特别地,我们首先将l0-范数约束问题重新表述为等价的秩最小化问题,然后应用[33]中提出的罚分解(PD)方法求解后者。通过利用特殊结构,我们将该方法的所有矩阵运算转化为向量运算,得到仅涉及向量运算的PD方法。在适当的假设下,我们证明PD方法生成的序列的任何聚点满足一阶最优性条件,该条件通常比一个自然最优性条件更强。我们进一步扩展PD方法以求解目标函数中出现l0-范数的问题。最后,通过将PD方法应用于压缩感知、稀疏逻辑回归和稀疏逆协方差选择来测试其性能。计算结果表明,我们的方法在解质量和/或速度方面通常优于现有方法。

英文摘要

In this paper we consider general l0-norm minimization problems, that is, the problems with l0-norm appearing in either objective function or constraint. In particular, we first reformulate the l0-norm constrained problem as an equivalent rank minimization problem and then apply the penalty decomposition (PD) method proposed in [33] to solve the latter problem. By utilizing the special structures, we then transform all matrix operations of this method to vector operations and obtain a PD method that only involves vector operations. Under some suitable assumptions, we establish that any accumulation point of the sequence generated by the PD method satisfies a first-order optimality condition that is generally stronger than one natural optimality condition. We further extend the PD method to solve the problem with the l0-norm appearing in objective function. Finally, we test the performance of our PD methods by applying them to compressed sensing, sparse logistic regression and sparse inverse covariance selection. The computational results demonstrate that our methods generally outperform the existing methods in terms of solution quality and/or speed.
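
The abstract notes that the method reduces to vector operations but does not spell them out. One standard vector operation for l0-constrained problems is the Euclidean projection onto vectors with at most k nonzeros, which keeps the k largest-magnitude entries. The sketch below shows that projection as an assumed building block, not as the authors' exact subproblem; `project_l0` is a hypothetical name.

```python
import numpy as np

def project_l0(x, k):
    """Closest vector to x (Euclidean norm) with at most k nonzero entries."""
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argsort(np.abs(x))[-k:]   # indices of the k largest magnitudes
        out[keep] = x[keep]
    return out

x = np.array([0.3, -2.0, 0.1, 1.5, -0.7])
x_sparse = project_l0(x, 2)                 # keeps -2.0 and 1.5
```

The projection costs only a sort over the entries of `x`, which is consistent with the abstract's emphasis on avoiding matrix operations.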

1204.6250 2026-06-03 eess.SY cs.LG cs.SY 版本更新

Feature Selection for Generator Excitation Neurocontroller Development Using Filter Technique

使用滤波技术的发电机励磁神经控制器特征选择

Abdul Ghani Abro, Junita Mohamad Saleh

AI总结 针对发电机励磁控制问题,提出采用滤波技术选择最优输入特征以训练人工神经网络控制器,提升控制性能。

Comments 10-Pages, 10-Figures, 8-Tables, International Journal of Computer Science Issues, Vol. 8, Issue 5, No 3, September 2011

详情
Journal ref
International Journal of Computer Science Issues, pp. 108-117, Vol. 8, Issue 5, No 3, September 2011
AI中文摘要

本质上,使用控制系统的动机是生成适当的控制信号,以产生物理过程的期望响应。同步发电机的控制在电力系统运行和控制中始终非常关键。出于某些众所周知的原因,发电机通常在其稳态稳定性极限以下运行。这提高了对高效快速控制器的需求。据报道,人工智能在控制工程领域带来了革命性的成果。人工神经网络(ANN)作为人工智能的一个分支,利用其固有的可观测性,已被用于非线性和自适应控制。神经控制器的整体性能也依赖于输入特征。选择最优特征以最优地训练神经控制器非常关键。数据的质量和大小对于更好的性能同等重要。在这项工作中,采用滤波技术选择用于ANN训练的独立因素。

英文摘要

Essentially, the motive behind using a control system is to generate a suitable control signal that yields the desired response of a physical process. Control of synchronous generators has always been critical in power system operation and control. For certain well-known reasons, power generators are normally operated well below their steady-state stability limit. This raises the demand for efficient and fast controllers. Artificial intelligence has been reported to give revolutionary outcomes in the field of control engineering. The Artificial Neural Network (ANN), a branch of artificial intelligence, has been used for nonlinear and adaptive control, utilizing its inherent observability. The overall performance of a neurocontroller also depends on its input features. Selecting optimum features to train a neurocontroller optimally is very critical. Both the quality and the size of the data are equally important for better performance. In this work, a filter technique is employed to select independent factors for ANN training.

1106.1933 2026-06-03 cs.GT cs.LG cs.SY eess.SY math.OC 版本更新

Lyapunov stochastic stability and control of robust dynamic coalitional games with transferable utilities

具有可转移效用的鲁棒动态联盟博弈的Lyapunov随机稳定性与控制

Dario Bauso, Puduru Viswanadha Reddy, Tamer Basar

AI总结 针对特征函数为连续时间有界均值遍历过程的动态可转移效用博弈,提出基于额外奖励观测的分配规则,确保平均分配收敛到平均博弈的核心且联盟超额收敛到先验给定锥。

详情
AI中文摘要

本文考虑一个具有可转移效用(TU)的动态博弈,其中特征函数是一个连续时间有界均值遍历过程。一个中央规划者通过选择满足预算约束的瞬时分配,随时间与玩家持续交互。在博弈开始前,中央规划者知道过程的性质(有界均值遍历)、联盟值采样的有界集以及长期平均联盟值。另一方面,他不知道产生联盟值的潜在概率函数。我们的目标是找到分配规则,该规则使用对联盟截至当前时间所获得的额外奖励的度量,通过在玩家之间重新分配预算。目标有两个:i) 保证平均分配收敛到平均博弈的核心(或核心中的特定点),ii) 驱动联盟超额收敛到先验给定的锥。由此产生的分配规则是鲁棒的,因为尽管联盟值具有不确定性和时变性,它们仍能保证上述收敛性质。我们强调三个主要贡献。首先,我们基于对额外奖励的完全观测设计了一个分配规则,使得平均分配接近平均博弈核心中的特定点,而联盟超额收敛到先验给定的方向。其次,我们基于对额外奖励的部分观测设计了一个新的分配规则,使得平均分配收敛到平均博弈的核心,而联盟超额收敛到先验给定的锥。第三,我们建立了与逼近理论和可达性理论的联系。

英文摘要

This paper considers a dynamic game with transferable utilities (TU), where the characteristic function is a continuous-time bounded mean ergodic process. A central planner interacts continuously over time with the players by choosing the instantaneous allocations subject to budget constraints. Before the game starts, the central planner knows the nature of the process (bounded mean ergodic), the bounded set from which the coalitions' values are sampled, and the long run average coalitions' values. On the other hand, he has no knowledge of the underlying probability function generating the coalitions' values. Our goal is to find allocation rules that use a measure of the extra reward that a coalition has received up to the current time by re-distributing the budget among the players. The objective is two-fold: i) guaranteeing convergence of the average allocations to the core (or a specific point in the core) of the average game, ii) driving the coalitions' excesses to an a priori given cone. The resulting allocation rules are robust as they guarantee the aforementioned convergence properties despite the uncertain and time-varying nature of the coalitions' values. We highlight three main contributions. First, we design an allocation rule based on full observation of the extra reward so that the average allocation approaches a specific point in the core of the average game, while the coalitions' excesses converge to an a priori given direction. Second, we design a new allocation rule based on partial observation on the extra reward so that the average allocation converges to the core of the average game, while the coalitions' excesses converge to an a priori given cone. And third, we establish connections to approachability theory and attainability theory.

1204.4717 2026-06-03 math.OC cs.LG cs.SY eess.SY 版本更新

Energy-Efficient Building HVAC Control Using Hybrid System LBMPC

使用混合系统LBMPC的节能建筑HVAC控制

Anil Aswani, Neal Master, Jay Taneja, Andrew Krioukov, David Culler, Claire Tomlin

AI总结 本文提出一种基于混合系统学习模型预测控制(LBMPC)的建筑HVAC控制方法,通过系统辨识和模型更新实现日均1.5MWh的节能效果,且不降低舒适度。

详情
AI中文摘要

提高供暖、通风和空调(HVAC)系统的能效具有巨大的经济和社会效益。本文关注建筑级HVAC系统的混合系统模型辨识,以及后续使用基于学习的模型预测控制(LBMPC)的混合系统公式进行控制。这里,学习指的是对混合系统模型的更新,除了底层控制中固有的积分器动态外,还纳入了由于人员占用、太阳效应、室外空气温度(OAT)和设备引起的加热效应。尽管我们做了显著的建模简化,但使用该模型的相应控制器能够在实验中实现大幅降低能耗,且不降低居住者舒适度。通过这种方式,我们证明了所做出的建模简化的合理性。最后,我们展示了在建筑HVAC测试平台上的实验结果,显示平均每天节省1.5MWh的能源(p = 0.002),95%置信区间为1.0MWh至2.1MWh。

英文摘要

Improving the energy-efficiency of heating, ventilation, and air-conditioning (HVAC) systems has the potential to realize large economic and societal benefits. This paper concerns the system identification of a hybrid system model of a building-wide HVAC system and its subsequent control using a hybrid system formulation of learning-based model predictive control (LBMPC). Here, the learning refers to model updates to the hybrid system model that incorporate the heating effects due to occupancy, solar effects, outside air temperature (OAT), and equipment, in addition to integrator dynamics inherently present in low-level control. Though we make significant modeling simplifications, our corresponding controller that uses this model is able to experimentally achieve a large reduction in energy usage without any degradations in occupant comfort. It is in this way that we justify the modeling simplifications that we have made. We conclude by presenting results from experiments on our building HVAC testbed, which show an average of 1.5MWh of energy savings per day (p = 0.002) with a 95% confidence interval of 1.0MWh to 2.1MWh of energy savings.

1204.0885 2026-06-03 eess.SY cs.LG cs.NE cs.SY 版本更新

PID Parameters Optimization by Using Genetic Algorithm

使用遗传算法优化PID参数

Andri Mirzal, Shinichiro Yoshii, Masashi Furukawa

AI总结 针对一阶滞后加时滞系统,采用遗传算法确定PID控制器参数,并与迭代法和Ziegler-Nichols规则的结果进行比较。

Comments 12 pages, 4 figures

详情
Journal ref
ISTECS Journal, Vol. 8, pp. 34-43, 2006
AI中文摘要

时滞是导致系统响应滞后的组件。它们出现在物理、化学、生物和经济系统以及测量和计算过程中。在这项工作中,我们采用遗传算法确定PID控制器参数,以补偿一阶滞后加时滞(FOLPD)系统中的延迟,并将结果与迭代法和Ziegler-Nichols规则的结果进行比较。

英文摘要

Time delays are components that introduce a time lag into a system's response. They arise in physical, chemical, biological and economic systems, as well as in the process of measurement and computation. In this work, we apply a genetic algorithm (GA) to determine PID controller parameters that compensate for the delay in a First Order Lag plus Time Delay (FOLPD) system, and compare the results with those of the iterative method and the Ziegler-Nichols rule.

1203.2511 2026-06-03 cs.LG cs.CE cs.NI cs.SY eess.SY stat.AP 版本更新

A Simple Flood Forecasting Scheme Using Wireless Sensor Networks

一种使用无线传感器网络的简单洪水预测方案

Victor Seal, Arnab Raha, Shovan Maity, Souvik Kr Mitra, Amitava Mukherjee, Mrinal Kanti Naskar

AI总结 提出一种基于无线传感器网络的多元鲁棒线性回归洪水预测模型,通过简单快速的计算实现实时预测,并与其他算法对比验证改进效果。

Comments 16 pages, 4 figures, published in International Journal Of Ad-Hoc, Sensor And Ubiquitous Computing, February 2012; V. seal et al, 'A Simple Flood Forecasting Scheme Using Wireless Sensor Networks', IJASUC, Feb.2012

详情
AI中文摘要

本文提出一种使用无线传感器网络(WSNs)设计的预测模型,用于预测河流洪水,采用简单快速的计算提供实时结果,以拯救可能受洪水影响的生命。我们的预测模型使用多元鲁棒线性回归,易于理解,实现简单且成本效益高,速度高效,资源利用率低,同时提供可靠精度的实时预测,因此具有任何实际算法所期望的特征。我们的预测模型独立于参数数量,即可以根据现场需求添加或删除任意数量的参数。当水位上升时,我们使用多项式表示水位,其性质用于判断水位是否可能在近期超过洪水线。我们将我们的工作与一种当代算法进行比较,以展示我们的改进。然后,我们展示了预测水位与实际水位的仿真结果。

英文摘要

This paper presents a forecasting model designed using WSNs (Wireless Sensor Networks) to predict floods in rivers using simple and fast calculations, providing real-time results that can save the lives of people who may be affected by a flood. Our prediction model uses multiple-variable robust linear regression, which is easy to understand, simple and cost-effective to implement, fast, and light on resources, yet provides real-time predictions with reliable accuracy; these are features desirable in any real-world algorithm. Our prediction model is independent of the number of parameters, i.e. any number of parameters may be added or removed based on the on-site requirements. When the water level rises, we represent it using a polynomial whose nature is used to determine if the water level may exceed the flood line in the near future. We compare our work with a contemporary algorithm to demonstrate our improvements over it. Then we present our simulation results for the predicted water level compared to the actual water level.

1008.3043 2026-06-03 math.NA cs.CC cs.LG cs.NA stat.ML 版本更新

Learning Functions of Few Arbitrary Linear Parameters in High Dimensions

高维中少量任意线性参数的函数学习

Massimo Fornasier, Karin Schnass, Jan Vybiral

AI总结 针对高维空间中由少量线性参数决定的函数,提出基于随机采样和压缩感知的近似算法,在多项式时间内实现高概率逼近。

Comments 31 pages, this version was accepted to Foundations of Computational Mathematics, the final publication will be available on http://www.springerlink.com

详情
AI中文摘要

假设 $f$ 是定义在 $\mathbb R^d$ 的单位球上的连续函数,形式为 $f(x) = g (A x)$,其中 $A$ 是 $k \times d$ 矩阵,$g$ 是 $k$ 个变量的函数,且 $k \ll d$。我们有一个预算 $m \in \mathbb N$,即允许查询 $f$ 的 $m$ 个点 $f(x_i)$,$i=1,...,m$,以构造一致逼近函数。在函数 $g$ 的某些光滑性和变差假设下,以及矩阵 $A$ 的任意选择下,本文提出: 1. 随机抽取点 $\{x_i\}$ 的采样选择,用于每个函数逼近; 2. 计算逼近函数的算法(算法1和算法2),其复杂度在维度 $d$ 和点数 $m$ 上最多为多项式。 由于 $A$ 的任意性,采样点的选择将根据适当的随机分布进行,我们的结果以压倒性概率成立。我们的方法使用了压缩感知框架中的工具、正半定矩阵和的近期Chernoff界,以及奇异值分解不变子空间的经典稳定性界。

英文摘要

Let us assume that $f$ is a continuous function defined on the unit ball of $\mathbb R^d$, of the form $f(x) = g (A x)$, where $A$ is a $k \times d$ matrix and $g$ is a function of $k$ variables for $k \ll d$. We are given a budget $m \in \mathbb N$ of possible point evaluations $f(x_i)$, $i=1,...,m$, of $f$, which we are allowed to query in order to construct a uniform approximating function. Under certain smoothness and variation assumptions on the function $g$, and an {\it arbitrary} choice of the matrix $A$, we present in this paper 1. a sampling choice of the points $\{x_i\}$ drawn at random for each function approximation; 2. algorithms (Algorithm 1 and Algorithm 2) for computing the approximating function, whose complexity is at most polynomial in the dimension $d$ and in the number $m$ of points. Due to the arbitrariness of $A$, the choice of the sampling points will be according to suitable random distributions and our results hold with overwhelming probability. Our approach uses tools taken from the {\it compressed sensing} framework, recent Chernoff bounds for sums of positive-semidefinite matrices, and classical stability bounds for invariant subspaces of singular value decompositions.

1108.6296 2026-06-03 cs.LG cs.NA math.NA 版本更新

Infinite Tucker Decomposition: Nonparametric Bayesian Models for Multiway Data Analysis

无限Tucker分解:用于多路数据分析的非参数贝叶斯模型

Zenglin Xu, Feng Yan, Yuan Qi

AI总结 提出基于非参数贝叶斯的无限Tucker分解模型(InfTucker),通过潜在高斯/t过程和非线性协方差函数,在概率框架下处理连续和二元数据,并开发高效变分推理方法,显著提升预测精度。

详情
AI中文摘要

张量分解是多路数据分析的强大计算工具。许多流行的张量分解方法——如Tucker分解和CANDECOMP/PARAFAC (CP)——本质上是多线性因子分解。它们不足以建模(i)数据实体间的复杂交互、(ii)各种数据类型(如缺失数据和二元数据)以及(iii)噪声观测和异常值。为解决这些问题,我们提出了张量变量潜在非参数贝叶斯模型,并结合高效推理方法,用于多路数据分析。我们将这些模型命名为InfTucker。使用这些InfTucker,我们在无限特征空间中进行Tucker分解。与经典张量分解模型不同,我们的新方法在概率框架下处理连续和二元数据。与先前关于矩阵和张量的贝叶斯模型不同,我们的模型基于具有非线性协方差函数的潜在高斯或t过程。为了从数据中高效学习InfTucker,我们开发了一种张量上的变分推理技术。与经典实现相比,新技术将时间和空间复杂度降低了几个数量级。我们在化学计量学和社交网络数据集上的实验结果表明,我们的新模型比最先进的张量分解方法取得了显著更高的预测精度。

英文摘要

Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches---such as the Tucker decomposition and CANDECOMP/PARAFAC (CP)---amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g. missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models, coupled with efficient inference methods, for multiway data analysis. We name these models InfTucker. Using these InfTucker, we conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or $t$ processes with nonlinear covariance functions. To efficiently learn the InfTucker from data, we develop a variational inference technique on tensors. Compared with classical implementation, the new technique reduces both time and space complexities by several orders of magnitude. Our experimental results on chemometrics and social network datasets demonstrate that our new models achieved significantly higher prediction accuracy than the most state-of-art tensor decomposition

1109.1533 2026-06-03 math.OC cs.LG cs.NI cs.SY eess.SY math.PR 版本更新

The Non-Bayesian Restless Multi-Armed Bandit: A Case of Near-Logarithmic Strict Regret

非贝叶斯不安分多臂老虎机:近对数严格遗憾的一个案例

Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Zhao

AI总结 针对非贝叶斯不安分多臂老虎机问题,提出一种元策略方法,通过学习有限策略集中的最优策略,实现近对数遗憾,并首次在非贝叶斯RMAB中达到与已知模型最优策略相同的平均奖励。

Comments arXiv admin note: significant text overlap with arXiv:1011.4752

详情
AI中文摘要

在经典的贝叶斯不安分多臂老虎机(RMAB)问题中,有$N$个臂,所有臂上的奖励在每个时刻以已知参数的马尔可夫链演化。玩家每时刻选择激活$K \geq 1$个臂,以最大化多次游戏获得的期望总奖励。RMAB是一个具有挑战性的问题,通常已知为PSPACE-hard。本文考虑更困难的问题:非贝叶斯RMAB,其中马尔可夫链的参数假设先验未知。我们提出了一种原创方法,适用于当对应的贝叶斯问题具有如下结构时:根据已知参数值,最优解是预设的有限策略集中的一个。在此类设置中,我们提出通过采用合适的元策略来学习非贝叶斯RMAB的最优策略,该元策略将有限策略集中的每个策略视为另一个非贝叶斯多臂老虎机问题中的一个臂,而该问题的单臂选择策略是最优的。我们通过开发一种新的感知策略来演示该方法,用于在未知动态信道上进行机会频谱接入。我们证明,我们的策略实现了近对数遗憾(与模型感知的“精灵”相比的期望奖励差异),从而获得了与已知模型下最优策略相同的平均奖励。这是文献中首次在非贝叶斯RMAB上得到这样的结果。在证明中,我们还开发了Chernoff-Hoeffding界的一个新推广。

英文摘要

In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are $N$ arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A player seeks to activate $K \geq 1$ arms at each time in order to maximize the expected total reward obtained over multiple plays. RMAB is a challenging problem that is known to be PSPACE-hard in general. We consider in this work the even harder non-Bayesian RMAB, in which the parameters of the Markov chain are assumed to be unknown \emph{a priori}. We develop an original approach to this problem that is applicable when the corresponding Bayesian problem has the structure that, depending on the known parameter values, the optimal solution is one of a prescribed finite set of policies. In such settings, we propose to learn the optimal policy for the non-Bayesian RMAB by employing a suitable meta-policy which treats each policy from this finite set as an arm in a different non-Bayesian multi-armed bandit problem for which a single-arm selection policy is optimal. We demonstrate this approach by developing a novel sensing policy for opportunistic spectrum access over unknown dynamic channels. We prove that our policy achieves near-logarithmic regret (the difference in expected reward compared to a model-aware genie), which leads to the same average reward that can be achieved by the optimal policy under a known model. This is the first such result in the literature for a non-Bayesian RMAB. For our proof, we also develop a novel generalization of the Chernoff-Hoeffding bound.

1109.1552 2026-06-03 cs.LG cs.NI cs.SY eess.SY math.OC math.PR 版本更新

Efficient Online Learning for Opportunistic Spectrum Access

机会频谱接入的高效在线学习

Wenhan Dai, Yi Gai, Bhaskar Krishnamachari

AI总结 针对认知无线电网络中机会频谱接入的非贝叶斯多臂赌博机问题,提出连续探索与利用(CEE)算法,实现近对数遗憾界,并在已知部分信息时达到对数遗憾。

详情
AI中文摘要

认知无线电网络中的机会频谱接入问题最近被建模为非贝叶斯不安分多臂赌博机问题。该问题中,有N个臂(对应信道)和一个玩家(对应次用户)。每个臂的状态演变为参数未知的有限状态马尔可夫链。在每个时隙,玩家可以选择K < N个臂进行播放,并获得状态相关的奖励(对应主用户活动下的吞吐量)。目标是最大化多次播放获得的期望总奖励(即总吞吐量)。此类多臂赌博机算法的性能通过遗憾来衡量,定义为与始终播放最佳K个臂的模型感知精灵相比的期望奖励差异。本文针对该问题提出了一种新的连续探索与利用(CEE)算法。当没有关于臂动态的信息时,CEE是首个保证随时间均匀近对数遗憾的算法。当已知与平稳状态分布和状态相关奖励对应的某些界限时,我们证明CEE可以轻松修改以实现随时间对数遗憾。相比之下,先前算法需要关于转移矩阵第二特征值界限的额外信息才能保证对数遗憾。最后,通过数值模拟表明CEE比先前算法更高效。

英文摘要

The problem of opportunistic spectrum access in cognitive radio networks has been recently formulated as a non-Bayesian restless multi-armed bandit problem. In this problem, there are N arms (corresponding to channels) and one player (corresponding to a secondary user). The state of each arm evolves as a finite-state Markov chain with unknown parameters. At each time slot, the player can select K < N arms to play and receives state-dependent rewards (corresponding to the throughput obtained given the activity of primary users). The objective is to maximize the expected total rewards (i.e., total throughput) obtained over multiple plays. The performance of an algorithm for such a multi-armed bandit problem is measured in terms of regret, defined as the difference in expected reward compared to a model-aware genie who always plays the best K arms. In this paper, we propose a new continuous exploration and exploitation (CEE) algorithm for this problem. When no information is available about the dynamics of the arms, CEE is the first algorithm to guarantee near-logarithmic regret uniformly over time. When some bounds corresponding to the stationary state distributions and the state-dependent rewards are known, we show that CEE can be easily modified to achieve logarithmic regret over time. In contrast, prior algorithms require additional information concerning bounds on the second eigenvalues of the transition matrices in order to guarantee logarithmic regret. Finally, we show through numerical simulations that CEE is more efficient than prior algorithms.

1105.2176 2026-06-03 math.OC cs.IT cs.LG cs.SY eess.SY math.IT 版本更新

A Framework for Optimization under Limited Information

有限信息下的优化框架

Tansu Alpcan

AI总结 针对有限信息下的优化问题,提出一个融合信息收集、估计和优化的统一框架,采用贝叶斯方法和高斯过程回归,并利用信息论熵量化信息获取。

详情
AI中文摘要

在许多现实世界问题中,优化决策必须在信息有限的情况下做出。决策者可能没有关于通常非凸目标函数的先验或后验数据,只能通过随时间推移的昂贵观测获得有限数量的点。本文提出了一个优化框架,以整体和结构化的方式考虑信息收集(观测)、估计(回归)和优化(最大化)方面。通过使用信息论中的熵度量显式量化每个优化步骤中获取的信息,采用贝叶斯方法并使用高斯过程作为最先进的回归方法,对要优化(最大化)的(非凸)目标函数进行建模和估计。由此产生的迭代方案允许决策者通过同时定量表达每个方面的偏好来解决问题。

英文摘要

In many real world problems, optimization decisions have to be made with limited information. The decision maker may have no a priori or a posteriori data about the often nonconvex objective function except at a limited number of points that are obtained over time through costly observations. This paper presents an optimization framework that takes into account the information collection (observation), estimation (regression), and optimization (maximization) aspects in a holistic and structured manner. Explicitly quantifying the information acquired at each optimization step using the entropy measure from information theory, the (nonconvex) objective function to be optimized (maximized) is modeled and estimated by adopting a Bayesian approach and using Gaussian processes as a state-of-the-art regression method. The resulting iterative scheme allows the decision maker to solve the problem by expressing preferences for each aspect quantitatively and concurrently.

1110.1781 2026-06-03 cs.LG cs.SY eess.SY 版本更新

A Study of Unsupervised Adaptive Crowdsourcing

无监督自适应众包研究

G. Kesidis, A. Kurve

AI总结 基于用户响应与多数响应的一致性,研究无监督众包性能,提出两种场景下的可靠性度量方法。

Comments Technical Report, 2 figures

详情
AI中文摘要

我们考虑基于以下模型的无监督众包性能:最终用户的响应本质上根据其响应与相同子任务/问题的其他多数响应的相关性进行评分。在一种设置中,我们考虑独立同分布的众包任务(元任务)序列,而在另一种设置中,我们考虑具有大量组件子任务的单个任务。两个问题都产生了直观的结果,其中群体的整体可靠性是一个因素。

英文摘要

We consider unsupervised crowdsourcing performance based on the model wherein the responses of end-users are essentially rated according to how their responses correlate with the majority of other responses to the same subtasks/questions. In one setting, we consider an independent sequence of identically distributed crowdsourcing assignments (meta-tasks), while in the other we consider a single assignment with a large number of component subtasks. Both problems yield intuitive results in which the overall reliability of the crowd is a factor.

1107.1744 2026-06-03 math.OC cs.LG cs.SY eess.SY 版本更新

Stochastic convex optimization with bandit feedback

带赌博机反馈的随机凸优化

Alekh Agarwal, Dean P. Foster, Daniel Hsu, Sham M. Kakade, Alexander Rakhlin

AI总结 针对带赌博机反馈的随机凸优化问题,提出椭球算法的推广,实现$\tilde{O}(\mathrm{poly}(d)\sqrt{T})$遗憾,在$T$的尺度上最优。

详情
AI中文摘要

本文研究了在随机赌博机反馈模型下,最小化凸紧集$\mathcal{X}$上的凸Lipschitz函数$f$的问题。在该模型中,算法可以在任意查询点$x \in \mathcal{X}$处观察到函数值$f(x)$的带噪声实现。关注的指标是算法的遗憾,即算法各查询点处的函数值与最优函数值之差的累加。我们展示了椭球算法的一个推广,其遗憾为$\tilde{O}(\mathrm{poly}(d)\sqrt{T})$。由于任何算法在该问题上的遗憾至少为$\Omega(\sqrt{T})$,我们的算法在$T$的尺度上是最优的。

英文摘要

This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\mathcal{X}$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in \mathcal{X}$. The quantity of interest is the regret of the algorithm, which is the sum, over all rounds, of the gap between the function value at the algorithm's query point and the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\tilde{O}(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.
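
The regret described above can be written out explicitly. The following is a sketch in standard notation, writing $\mathcal{X}$ for the feasible set; the symbols $x_t$ (the query point in round $t$), $x^\star$ (a minimizer of $f$), and $R_T$ are assumptions introduced here for illustration and do not appear in the abstract.

```latex
% Cumulative regret after T noisy function evaluations
R_T \;=\; \sum_{t=1}^{T} \bigl( f(x_t) - f(x^\star) \bigr),
\qquad x^\star \in \arg\min_{x \in \mathcal{X}} f(x).

% The stated upper bound, and what it implies for the average gap per round
R_T = \tilde{O}\bigl(\mathrm{poly}(d)\,\sqrt{T}\bigr)
\quad\Longrightarrow\quad
\frac{R_T}{T} = \tilde{O}\!\left(\frac{\mathrm{poly}(d)}{\sqrt{T}}\right).
```

Read this way, the average suboptimality per query vanishes at rate $1/\sqrt{T}$ up to logarithmic factors, which matches the $\Omega(\sqrt{T})$ lower bound on $R_T$ in its dependence on $T$.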

1110.0718 2026-06-03 cs.IT cs.LG cs.SY eess.SY math.IT 版本更新

Directed information and Pearl's causal calculus

有向信息与Pearl因果演算

Maxim Raginsky

AI总结 本文探讨Pearl因果形式化与信息论中因果性及反馈概念之间的联系,并展示条件有向信息如何用于发展Pearl后门准则的信息论版本。

Comments 8 pages, uses ieeeconf.cls; to appear in Proc. 49th Annual Allerton Conf. on Communication, Control and Computing (2011)

详情
AI中文摘要

概率图模型是统计学、机器学习、信号处理和控制中的基本工具。当这样的模型定义在有向无环图(DAG)上时,可以为相应随机系统中发生的事件分配偏序。基于Judea Pearl等人的工作,这些基于DAG的联合概率测度的“因果分解”已被用于功能依赖(因果联系)的表征和推断。这篇主要是说明性的论文聚焦于Pearl形式化(特别是他的“干预”概念)与信息论中的因果性和反馈概念(如因果条件、有向随机核和有向信息)之间的若干联系。作为一个应用,我们展示了如何利用条件有向信息来发展Pearl的“后门”准则的信息论版本,该准则用于从被动观测中识别因果效应。这表明后门准则可以被视为统计充分性的因果类比。

英文摘要

Probabilistic graphical models are a fundamental tool in statistics, machine learning, signal processing, and control. When such a model is defined on a directed acyclic graph (DAG), one can assign a partial ordering to the events occurring in the corresponding stochastic system. Based on the work of Judea Pearl and others, these DAG-based "causal factorizations" of joint probability measures have been used for characterization and inference of functional dependencies (causal links). This mostly expository paper focuses on several connections between Pearl's formalism (and in particular his notion of "intervention") and information-theoretic notions of causality and feedback (such as causal conditioning, directed stochastic kernels, and directed information). As an application, we show how conditional directed information can be used to develop an information-theoretic version of Pearl's "back-door" criterion for identifiability of causal effects from passive observations. This suggests that the back-door criterion can be thought of as a causal analog of statistical sufficiency.
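
For readers unfamiliar with the information-theoretic notions named above, the standard definitions of causal conditioning and directed information can be sketched as follows. The notation ($X^n = (X_1,\dots,X_n)$, the double bar for causal conditioning) is the conventional one from the information theory literature and is an assumption here, since the abstract does not fix notation.

```latex
% Causally conditioned distribution: y_i may depend on x only up to time i
P(y^n \,\|\, x^n) \;=\; \prod_{i=1}^{n} P\bigl(y_i \mid y^{i-1}, x^{i}\bigr)

% Directed information from X^n to Y^n
I(X^n \to Y^n) \;=\; \sum_{i=1}^{n} I\bigl(X^{i};\, Y_i \mid Y^{i-1}\bigr)

% For comparison, the chain rule for ordinary mutual information
I(X^n ; Y^n) \;=\; \sum_{i=1}^{n} I\bigl(X^{n};\, Y_i \mid Y^{i-1}\bigr)
```

The only difference between the last two lines is $X^{i}$ versus $X^{n}$: directed information counts only the influence of past and present inputs on each output, which is what makes it sensitive to the direction of information flow.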

1109.2088 2026-06-03 cs.LG cs.NI cs.SY eess.SY math.OC math.PR 版本更新

Online Learning Algorithms for Stochastic Water-Filling

随机注水的在线学习算法

Yi Gai, Bhaskar Krishnamachari

AI总结 针对未知信道增益分布下的随机时变信道,提出两种基于多臂老虎机的在线注水算法CWF1和CWF2,分别优化期望和速率与和速率的期望,并证明其遗憾或错误分配次数随信道数多项式增长、随时间对数增长。

详情
AI中文摘要

注水是解决将受限功率分配给一组并行信道以最大化总数据速率的经典问题的术语。它在实践中被广泛使用,例如在WiMax等多用户OFDM系统中用于子载波的功率分配。经典的注水算法是确定性的,并且需要信道增益与噪声比的完美知识。在本文中,我们考虑如何在随机时变(i.i.d.)信道上进行功率分配,且增益与噪声比分布未知。我们采用基于随机多臂老虎机的在线学习框架。我们考虑该问题的两种变体:一种目标是找到最大化 $\sum\limits_i \mathbb{E}[\log(1 + SNR_i)]$ 的功率分配,另一种目标是找到最大化 $\sum\limits_i \log(1 + \mathbb{E}[SNR_i])$ 的功率分配。对于第一个问题,我们提出了一种称为CWF1的认知注水算法。我们证明CWF1获得的遗憾(定义为随时间累积的、由分布感知的预言机获得的速率和与该策略获得的速率和之间的差距)随信道数多项式增长、随时间对数增长,这意味着它渐近地达到了在已知增益分布时可以获得的最优时间平均速率。对于第二个问题,我们提出了一种称为CWF2的算法,据我们所知,这是随机多臂老虎机文献中第一个利用臂之间非线性依赖关系的算法。我们证明CWF2选择错误功率分配的次数被一个随信道数多项式增长、随时间对数增长的函数所界定,这意味着其错误分配频率趋于零。

英文摘要

Water-filling is the term for the classic solution to the problem of allocating constrained power to a set of parallel channels to maximize the total data-rate. It is used widely in practice, for example, for power allocation to sub-carriers in multi-user OFDM systems such as WiMax. The classic water-filling algorithm is deterministic and requires perfect knowledge of the channel gain to noise ratios. In this paper we consider how to do power allocation over stochastically time-varying (i.i.d.) channels with unknown gain to noise ratio distributions. We adopt an online learning framework based on stochastic multi-armed bandits. We consider two variations of the problem, one in which the goal is to find a power allocation to maximize $\sum\limits_i \mathbb{E}[\log(1 + SNR_i)]$, and another in which the goal is to find a power allocation to maximize $\sum\limits_i \log(1 + \mathbb{E}[SNR_i])$. For the first problem, we propose a \emph{cognitive water-filling} algorithm that we call CWF1. We show that CWF1 obtains a regret (defined as the cumulative gap over time between the sum-rate obtained by a distribution-aware genie and this policy) that grows polynomially in the number of channels and logarithmically in time, implying that it asymptotically achieves the optimal time-averaged rate that can be obtained when the gain distributions are known. For the second problem, we present an algorithm called CWF2, which is, to our knowledge, the first algorithm in the literature on stochastic multi-armed bandits to exploit non-linear dependencies between the arms. We prove that the number of times CWF2 picks the incorrect power allocation is bounded by a function that is polynomial in the number of channels and logarithmic in time, implying that its frequency of incorrect allocation tends to zero.
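
For context, the classic deterministic water-filling solution that the abstract contrasts with can be sketched as follows. This is a minimal sketch, assuming known gain-to-noise ratios $g_i$ and the objective $\sum_i \log(1 + g_i p_i)$ under a total power budget; it is not CWF1 or CWF2, and the function name `water_filling` and its arguments are illustrative choices.

```python
import numpy as np

def water_filling(gains, total_power):
    """Classic water-filling for known gain-to-noise ratios (illustrative sketch).

    Maximizes sum_i log(1 + gains[i] * p[i]) subject to sum(p) == total_power
    and p >= 0. The optimum has the form p[i] = max(0, level - 1 / gains[i]).
    """
    gains = np.asarray(gains, dtype=float)
    floor = 1.0 / gains                # "floor height" of each channel
    order = np.argsort(floor)          # best channels (lowest floor) first
    floor_sorted = floor[order]
    power = np.zeros_like(floor)
    # Try to use the k best channels, for k = n, n-1, ..., 1.
    for k in range(len(floor), 0, -1):
        level = (total_power + floor_sorted[:k].sum()) / k
        if level > floor_sorted[k - 1]:    # all k channels get positive power
            power[order[:k]] = level - floor_sorted[:k]
            break
    return power
```

Calling `water_filling([2.0, 1.0], 1.0)` puts more power on the stronger channel, while a sufficiently weak channel receives none. The online setting in the abstract differs in that the gain distributions must be learned from observations.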

1102.5288 2026-06-03 stat.ML cs.LG cs.SY eess.SY math.OC stat.AP 版本更新

Sparse Bayesian Methods for Low-Rank Matrix Estimation

低秩矩阵估计的稀疏贝叶斯方法

S. Derin Babacan, Martin Luessi, Rafael Molina, Aggelos K. Katsaggelos

AI总结 提出基于稀疏贝叶斯学习的矩阵补全和鲁棒主成分分析算法,通过稀疏约束自动确定秩并实现高恢复性能。

Comments This paper has been withdrawn by the author due to significant revisions in the paper. The new version will be uploaded soon

详情
AI中文摘要

低秩矩阵的恢复最近在科学和工程的许多领域引起了显著关注,这得益于精确重构保证的理论结果和有趣的实际应用。针对这一恢复问题,已经开发了许多方法。然而,通常没有提供选择未知目标秩的原则性方法。在本文中,我们提出了基于稀疏贝叶斯学习(SBL)原理的矩阵补全和鲁棒主成分分析中估计低秩矩阵的新恢复算法。从矩阵分解公式出发,将估计中的低秩约束作为稀疏约束强制执行,我们开发了一种在确定正确秩的同时提供高恢复性能的有效方法。我们提供了与其他类似问题中现有方法的联系,以及经验结果和与当前最先进方法的比较,说明了该方法的有效性。

英文摘要

Recovery of low-rank matrices has recently seen significant activity in many areas of science and engineering, motivated by recent theoretical results for exact reconstruction guarantees and interesting practical applications. A number of methods have been developed for this recovery problem. However, a principled method for choosing the unknown target rank is generally not provided. In this paper, we present novel recovery algorithms for estimating low-rank matrices in matrix completion and robust principal component analysis based on sparse Bayesian learning (SBL) principles. Starting from a matrix factorization formulation and enforcing the low-rank constraint in the estimates as a sparsity constraint, we develop an approach that is very effective in determining the correct rank while providing high recovery performance. We provide connections with existing methods in other similar problems and empirical results and comparisons with current state-of-the-art methods that illustrate the effectiveness of this approach.

1004.2027 2026-06-03 cs.LG cs.AI cs.SY eess.SY math.OC stat.ML 版本更新

Dynamic Policy Programming

动态策略编程

Mohammad Gheshlaghi Azar, Vicenc Gomez, Hilbert J. Kappen

AI总结 提出动态策略编程(DPP)方法,通过平均累积误差的无穷范数界,在近似误差下优于标准近似值迭代和近似策略迭代,并在多个问题域中显著超越现有强化学习方法。

Comments Submitted to Journal of Machine Learning Research

详情
AI中文摘要

在本文中,我们提出了一种新颖的策略迭代方法,称为动态策略编程(DPP),用于估计无限时域马尔可夫决策过程中的最优策略。我们证明了在存在近似/估计误差的情况下,DPP的有限迭代和渐近l∞范数性能损失界。这些界以平均累积误差的l∞范数表示,而不是标准近似值迭代(AVI)和近似策略迭代(API)中误差的l∞范数。这表明DPP可以实现比AVI和API更好的性能,因为它平均了整个学习过程中由蒙特卡洛采样引起的模拟噪声。我们通过在不同问题域上比较DPP的近似变体与现有强化学习(RL)方法的性能,数值验证了这一理论结果。我们的结果表明,在所有情况下,基于DPP的算法都大幅优于其他RL方法。

英文摘要

In this paper, we propose a novel policy iteration method, called dynamic policy programming (DPP), to estimate the optimal policy in infinite-horizon Markov decision processes. We prove finite-iteration and asymptotic $\ell_\infty$-norm performance-loss bounds for DPP in the presence of approximation/estimation error. The bounds are expressed in terms of the $\ell_\infty$-norm of the average accumulated error, as opposed to the $\ell_\infty$-norm of the error in the case of standard approximate value iteration (AVI) and approximate policy iteration (API). This suggests that DPP can achieve better performance than AVI and API, since it averages out the simulation noise caused by Monte-Carlo sampling throughout the learning process. We examine this theoretical result numerically by comparing the performance of the approximate variants of DPP with existing reinforcement learning (RL) methods on different problem domains. Our results show that, in all cases, DPP-based algorithms outperform other RL methods by a wide margin.

1011.1716 2026-06-03 math.NA cs.LG cs.NA 版本更新

Least Squares Ranking on Graphs

图上的最小二乘排序

Anil N. Hirani, Kaushik Kalyanaraman, Seth Watts

AI总结 本文利用图上的最小二乘计算解决基于成对比较数据的排序问题,并展示了其与谱图理论、代数多重网格、Hodge分解等多个领域的深刻联系。

Comments Added missing references, comparison of linear solvers overhauled, conclusion section added, some new figures added

详情
AI中文摘要

给定一组待排序的备选方案以及一些成对比较数据,排序就是图上的最小二乘计算。顶点是备选方案,边值包含比较数据。基本思想非常简单且古老:在顶点上赋予数值,使得它们的差值匹配给定的边数据。由于通常无法精确匹配,因此只能以最小二乘意义进行匹配。该公式由Leake于1976年首次描述,用于对足球队进行排名,并作为示例出现在Gilbert Strang教授的经典线性代数教科书中。如果进一步观察残差,问题就会真正活跃起来,正如Jiang等人最近一篇引人注目的论文所有效展示的那样。无论是否采用这一技巧,图上的最小二乘问题都与当前许多研究领域有着深远的联系。这些联系涉及理论计算机科学(谱图理论、图拉普拉斯系统的多重网格方法)、数值分析(代数多重网格、有限元外微积分)、其他数学(Hodge分解、随机团复形)以及应用(套利、运动队排名)。本文并未探索所有这些联系,但探索了许多。基本思想易于解释,仅需要初等线性代数中的四个基本子空间。我们的目标之一是解释这些基本思想和联系,以引起许多领域研究者的兴趣。另一个目标是利用我们的数值实验来指导方法选择并揭示进一步发展的需求。

英文摘要

Given a set of alternatives to be ranked, and some pairwise comparison data, ranking is a least squares computation on a graph. The vertices are the alternatives, and the edge values comprise the comparison data. The basic idea is very simple and old: come up with values on vertices such that their differences match the given edge data. Since an exact match will usually be impossible, one settles for matching in a least squares sense. This formulation was first described by Leake in 1976 for ranking football teams and appears as an example in Professor Gilbert Strang's classic linear algebra textbook. If one is willing to look into the residual a little further, then the problem really comes alive, as shown effectively by the remarkable recent paper of Jiang et al. With or without this twist, the humble least squares problem on graphs has far-reaching connections with many current areas of research. These connections are to theoretical computer science (spectral graph theory, and multilevel methods for graph Laplacian systems); numerical analysis (algebraic multigrid, and finite element exterior calculus); other mathematics (Hodge decomposition, and random clique complexes); and applications (arbitrage, and ranking of sports teams). Not all of these connections are explored in this paper, but many are. The underlying ideas are easy to explain, requiring only the four fundamental subspaces from elementary linear algebra. One of our aims is to explain these basic ideas and connections, to get researchers in many fields interested in this topic. Another aim is to use our numerical experiments for guidance on selecting methods and exposing the need for further development.
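
The basic idea in the abstract (vertex values whose differences match the edge data in a least squares sense) can be sketched with a dense solver. This is a minimal sketch under stated assumptions: the edge convention `(i, j, value)` meaning "score of j minus score of i should be about `value`", the function name, and the use of `numpy.linalg.lstsq` are illustrative choices, not the solvers compared in the paper.

```python
import numpy as np

def least_squares_ranking(n_vertices, edges):
    """Least squares ranking on a graph (illustrative sketch).

    edges: list of (i, j, value), read as score[j] - score[i] ~ value.
    Returns (scores, residual); scores are shifted to have zero mean,
    because adding a constant to every score changes no difference.
    """
    B = np.zeros((len(edges), n_vertices))   # edge-vertex incidence matrix
    y = np.zeros(len(edges))
    for row, (i, j, value) in enumerate(edges):
        B[row, i] = -1.0
        B[row, j] = 1.0
        y[row] = value
    scores, *_ = np.linalg.lstsq(B, y, rcond=None)
    scores = scores - scores.mean()
    residual = y - B @ scores                # the part no ranking can explain
    return scores, residual

# Consistent data: the differences add up around the triangle.
scores, residual = least_squares_ranking(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)])

# Purely cyclic data: 0 < 1 < 2 < 0, so no ranking fits at all.
cyc_scores, cyc_residual = least_squares_ranking(3, [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)])
```

In the cyclic case every score comes out equal and the whole input ends up in the residual, which is the sense in which the residual carries information about inconsistency in the comparisons.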

1106.1651 2026-06-03 cs.IT cs.LG cs.SY eess.SY math.IT math.OC 版本更新

Sparse Principal Component of a Rank-deficient Matrix

秩亏矩阵的稀疏主成分

Megasthenis Asteris, Dimitris S. Papailiopoulos, George N. Karystinos

AI总结 针对秩亏矩阵的稀疏主成分识别问题,通过引入辅助球面变量并证明存在多项式大小的候选指标集,提出了一种多项式时间算法来计算任意稀疏度下的最优稀疏主成分。

Comments 5 pages, 1 figure, to be presented at ISIT

详情
AI中文摘要

我们考虑识别秩亏矩阵的稀疏主成分的问题。我们引入辅助球面变量,并证明存在一组候选指标集(即向量参数非零元素的指标集合),其大小关于秩是多项式有界的,并且包含最优指标集,即最优解的非零元素的指标集。最后,我们开发了一种算法,对于任何稀疏度,都能在多项式时间内计算出最优稀疏主成分。

英文摘要

We consider the problem of identifying the sparse principal component of a rank-deficient matrix. We introduce auxiliary spherical variables and prove that there exists a set of candidate index-sets (that is, sets of indices to the nonzero elements of the vector argument) whose size is polynomially bounded, in terms of rank, and contains the optimal index-set, i.e. the index-set of the nonzero elements of the optimal solution. Finally, we develop an algorithm that computes the optimal sparse principal component in polynomial time for any sparsity degree.

1105.3931 2026-06-03 cs.LG cs.NA math.NA stat.ML 版本更新

Behavior of Graph Laplacians on Manifolds with Boundary

带边界流形上图拉普拉斯算子的行为

Xueyuan Zhou, Mikhail Belkin

AI总结 本文分析了带边界流形上图拉普拉斯算子在边界附近的行为,揭示了其与内部不同的缩放特性及全局影响,并给出了收敛速率和数值结果。

详情
AI中文摘要

在流形学习中,基于数据构建的图拉普拉斯算法在实际应用和理论分析中都受到了广泛关注。特别是,从采样数据获得的图拉普拉斯算子收敛到某些连续算子最近成为一个活跃的研究课题。现有的大部分工作都假设数据采样自无边界流形,或者感兴趣的函数在远离边界的点处评估。然而,边界行为问题具有相当大的实践和理论意义。在本文中,我们分析了图拉普拉斯算子在边界附近或边界上的点的行为,讨论了它们的收敛速率及其含义,并提供了一些数值结果。结果表明,虽然边界附近的点只占流形总体积的一小部分,但图拉普拉斯算子在这些点的行为具有与流形上其他地方不同的缩放特性,并对整个流形产生全局影响,这一观察对于流形学习的普遍问题具有潜在的重要意义。

英文摘要

In manifold learning, algorithms based on graph Laplacians constructed from data have received considerable attention both in practical applications and theoretical analysis. In particular, the convergence of graph Laplacians obtained from sampled data to certain continuous operators has become an active research topic recently. Most of the existing work has been done under the assumption that the data is sampled from a manifold without boundary or that the functions of interest are evaluated at a point away from the boundary. However, the question of boundary behavior is of considerable practical and theoretical interest. In this paper we provide an analysis of the behavior of graph Laplacians at a point near or on the boundary, discuss their convergence rates and their implications and provide some numerical results. It turns out that while points near the boundary occupy only a small part of the total volume of a manifold, the behavior of the graph Laplacian there has different scaling properties from its behavior elsewhere on the manifold, with global effects on the whole manifold, an observation with potentially important implications for the general problem of learning on manifolds.
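
The contrast between interior and boundary behavior can be seen in a one-dimensional toy case. The sketch below is an assumption-laden illustration, not the paper's construction: it uses an unnormalized graph Laplacian with a Gaussian kernel on a uniform grid over $[0, 1]$, applied to the linear function $f(x) = x$; the function name and the bandwidth value are hypothetical.

```python
import numpy as np

def graph_laplacian_apply(points, values, bandwidth):
    """Apply an unnormalized Gaussian-kernel graph Laplacian (illustrative).

    (L f)(x_i) = (1/n) * sum_j w_ij * (f(x_i) - f(x_j)),
    with w_ij = exp(-|x_i - x_j|^2 / (4 * bandwidth)).
    """
    diffs = points[:, None] - points[None, :]
    weights = np.exp(-diffs**2 / (4.0 * bandwidth))
    return (weights * (values[:, None] - values[None, :])).mean(axis=1)

# Uniform samples of the interval [0, 1], a 1-D manifold with boundary {0, 1}.
x = np.linspace(0.0, 1.0, 201)
Lf = graph_laplacian_apply(x, x, bandwidth=1e-3)   # f(x) = x

interior_value = Lf[100]   # at x = 0.5, neighbours on both sides cancel
boundary_value = Lf[0]     # at x = 0, all neighbours lie on one side
```

For this linear function the interior value cancels to numerical zero, as expected of a second-order operator, whereas the boundary value does not, because the kernel sees neighbours on one side only. This one-sided effect is the kind of boundary behavior the abstract refers to.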

1009.4219 2026-06-03 cs.LG cs.SY eess.SY math.OC 版本更新

Safe Feature Elimination for the LASSO and Sparse Supervised Learning Problems

LASSO和稀疏监督学习问题的安全特征消除

Laurent El Ghaoui, Vivian Viallon, Tarek Rabbani

AI总结 提出一种快速方法,在LASSO问题中消除无关特征,显著减少运行时间,并可推广到一般l1惩罚凸问题。

Comments Submitted to JMLR in April 2011

详情
AI中文摘要

我们描述了一种快速方法,用于消除l1惩罚最小二乘回归(或LASSO)问题中的特征(变量)。特征的消除可能导致运行时间的大幅减少,特别是对于惩罚参数的大值。我们的方法不是启发式的:它只消除那些在解决LASSO问题后保证不存在的特征。特征消除步骤易于并行化,并且可以独立测试每个特征的消除。此外,与解决LASSO问题的计算量相比,我们的方法的计算努力可以忽略不计——大致相当于单个梯度步骤的计算量。我们的方法扩展了现有LASSO算法的范围,以处理以前无法达到的更大数据集。我们展示了如何将我们的方法扩展到一般的l1惩罚凸问题,并给出了稀疏支持向量机和逻辑回归问题的初步结果。

英文摘要

We describe a fast method to eliminate features (variables) in l1-penalized least-squares regression (or LASSO) problems. The elimination of features leads to a potentially substantial reduction in running time, especially for large values of the penalty parameter. Our method is not heuristic: it only eliminates features that are guaranteed to be absent after solving the LASSO problem. The feature elimination step is easy to parallelize and can test each feature for elimination independently. Moreover, the computational effort of our method is negligible compared to that of solving the LASSO problem: it is roughly the same as a single gradient step. Our method extends the scope of existing LASSO algorithms to treat larger data sets, previously out of their reach. We show how our method can be extended to general l1-penalized convex problems and present preliminary results for the Sparse Support Vector Machine and Logistic Regression problems.
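
A per-feature screening test of the kind described above can be sketched as follows. The abstract does not state the test itself, so this sketch uses the basic SAFE rule as it is commonly quoted in the screening literature, under the assumption that the LASSO objective is $\tfrac{1}{2}\lVert y - X\beta\rVert_2^2 + \lambda\lVert\beta\rVert_1$; the function name `safe_screen` and the example data are hypothetical.

```python
import numpy as np

def safe_screen(X, y, lam):
    """Basic SAFE-style screening for the LASSO (illustrative sketch).

    Assumes the objective 0.5 * ||y - X @ beta||^2 + lam * ||beta||_1 and
    that X.T @ y is not identically zero. Returns a boolean mask; True marks
    a feature whose coefficient is guaranteed to be zero at the solution.
    """
    corr = np.abs(X.T @ y)                  # |x_k^T y| for every feature k
    lam_max = corr.max()                    # smallest lam giving beta = 0
    col_norms = np.linalg.norm(X, axis=0)
    threshold = lam - col_norms * np.linalg.norm(y) * (lam_max - lam) / lam_max
    return corr < threshold

# Orthonormal design: the LASSO solution is soft-thresholding of X.T @ y,
# so with lam = 2.5 only the first coefficient is nonzero.
X = np.eye(3)
y = np.array([3.0, 1.0, 0.2])
mask = safe_screen(X, y, lam=2.5)
```

The cost is one matrix-vector product plus column norms, in line with the abstract's remark that screening costs about as much as a single gradient step, and each feature is tested independently of the others.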

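1105.2211 2026-06-03 math.OC cs.IT cs.LG cs.SY eess.SY math.IT Updated version

Dual Control with Active Learning using Gaussian Process Regression

Tansu Alpcan

AI summary For control problems with limited information, proposes a dual control approach based on an information-theoretic entropy measure and Gaussian process regression that jointly optimizes system identification and control objectives, validated on a chaotic system and on inverted-pendulum control.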

Details
Abstract

In many real-world problems, control decisions have to be made with limited information. The controller may have no a priori (or even a posteriori) data on the nonlinear system, except for a limited number of points that are obtained over time. This is due either to the high cost of observation or to the highly non-stationary nature of the system. The resulting conflict between information collection (identification, exploration) and control (optimization, exploitation) necessitates an active learning approach for iteratively selecting the control actions which concurrently provide the data points for system identification. This paper presents a dual control approach where the information acquired at each control step is quantified using the entropy measure from information theory and serves as the training input to a state-of-the-art Gaussian process regression (Bayesian learning) method. The explicit quantification of the information obtained from each data point allows for iterative optimization of both identification and control objectives. The approach developed is illustrated with two examples: control of the logistic map as a chaotic system and position control of a cart with an inverted pendulum.
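
As a point of reference for the entropy measure mentioned above: if the Gaussian process prediction at an input $x$ is Gaussian with variance $\sigma^2(x)$ (the symbol is an assumption for illustration, not the paper's notation), its differential entropy has a closed form. The paper's exact information measure may differ.

```latex
h\bigl(\mathcal{N}(\mu(x), \sigma^2(x))\bigr) = \tfrac{1}{2}\,\ln\!\bigl(2\pi e\,\sigma^2(x)\bigr)
```

Under this reading, inputs with larger predictive variance carry more entropy, so sampling them is more informative for identification.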

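1104.5391 2026-06-03 cs.LG cs.SY eess.SY math.OC Updated version

On Optimality of Greedy Policy for a Class of Standard Reward Function of Restless Multi-armed Bandit Problem

Quan Liu, Kehao Wang, Lin Chen

AI summary For the restless multi-armed bandit problem, analyzes a class of standard reward functions to establish a closed-form condition on the discount factor that guarantees the optimality of the greedy policy under the discounted expected reward criterion, and verifies it on cognitive radio network examples.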

Details
Abstract

In this paper, we consider the restless bandit problem, which is one of the most well-studied generalizations of the celebrated stochastic multi-armed bandit problem in decision theory. However, it is known to be PSPACE-hard to approximate to any non-trivial factor, so optimality is very difficult to obtain. A natural approach is to adopt the greedy policy for its stability and simplicity. However, the greedy policy generally incurs an optimality loss because of its intrinsically myopic behavior. In this paper, by analyzing a class of so-called standard reward functions, we establish a closed-form condition on the discount factor β such that the optimality of the greedy policy is guaranteed under the discounted expected reward criterion; in particular, the condition β = 1 indicates the optimality of the greedy policy under the average accumulative reward criterion. Thus, the standard form of the reward function can easily be used to judge the optimality of the greedy policy without any complicated calculation. Some examples in cognitive radio networks are presented to verify the effectiveness of the mathematical result in judging the optimality of the greedy policy.
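
A minimal sketch of a greedy (myopic) policy, assuming a setting often used in cognitive radio examples: each arm is a two-state (good/bad) Markov channel with shared transition probabilities `p11` (good to good) and `p01` (bad to good), and only the played arm is observed. The model, the unit reward, and all names are assumptions for illustration; the paper's class of standard reward functions is more general.

```python
def update_belief(belief, p11, p01, observed=None):
    """One-step belief that an arm is in the good state.

    observed is True/False for the played arm and None for unobserved arms.
    """
    if observed is True:
        return p11
    if observed is False:
        return p01
    # Unobserved arms evolve by the Markov chain alone (the restless part).
    return belief * p11 + (1.0 - belief) * p01

def greedy_arm(beliefs):
    """Myopic choice: the arm with the highest immediate expected reward."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])

def greedy_step(beliefs, p11, p01, states):
    """Play the greedy arm, collect its reward, and update all beliefs."""
    arm = greedy_arm(beliefs)
    reward = 1.0 if states[arm] else 0.0
    new_beliefs = [
        update_belief(b, p11, p01, states[i] if i == arm else None)
        for i, b in enumerate(beliefs)
    ]
    return arm, reward, new_beliefs

arm, reward, beliefs = greedy_step([0.2, 0.7, 0.5], 0.8, 0.2, [True, False, True])
print(arm, reward, beliefs)
```

The policy never looks beyond the next reward, which is why a condition on the discount factor is needed before it can be called optimal.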

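1001.4475 2026-06-03 cs.LG cs.SY eess.SY math.OC math.ST stat.TH Updated version

X-Armed Bandits

Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvari

AI summary For stochastic multi-armed bandits in which the arm set is a generic measurable space and the mean-payoff function satisfies a local Lipschitz condition with respect to a known dissimilarity, proposes the HOO algorithm, which achieves dimension-independent regret bounds and is proved to be minimax optimal.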

Details
Abstract

We consider a generalization of stochastic bandits where the set of arms, $\mathcal{X}$, is allowed to be a generic measurable space and the mean-payoff function is "locally Lipschitz" with respect to a dissimilarity function that is known to the decision maker. Under this condition we construct an arm selection policy, called HOO (hierarchical optimistic optimization), with improved regret bounds compared to previous results for a large class of problems. In particular, our results imply that if $\mathcal{X}$ is the unit hypercube in a Euclidean space and the mean-payoff function has a finite number of global maxima around which the behavior of the function is locally continuous with a known smoothness degree, then the expected regret of HOO is bounded up to a logarithmic factor by $\sqrt{n}$, i.e., the rate of growth of the regret is independent of the dimension of the space. We also prove the minimax optimality of our algorithm when the dissimilarity is a metric. Our basic strategy has quadratic computational complexity as a function of the number of time steps and does not rely on the doubling trick. We also introduce a modified strategy, which relies on the doubling trick but runs in linearithmic time. Both results are improvements with respect to previous approaches.
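
A minimal sketch of a HOO-style strategy on the interval $[0,1)$, assuming a binary tree of dyadic cells, an upper bound of the form (empirical mean + $\sqrt{2\ln t/T}$ + $\nu_1\rho^h$), and 0-based node indices. The parameter names `nu1` and `rho`, the uniform draw inside a cell, and the tie-breaking rule are assumptions for illustration; the paper's definitions are authoritative.

```python
import math
import random

def hoo(reward, n, nu1=1.0, rho=0.5, seed=0):
    """Run n rounds and return the list of arms played in [0, 1)."""
    rng = random.Random(seed)
    count, mean = {}, {}   # statistics per tree node (depth h, index i)
    pulls = []
    for t in range(1, n + 1):
        # Recompute B-values every round because the bonus depends on t;
        # this full pass is what makes the basic strategy quadratic in n.
        bval = {}
        for (h, i) in sorted(count, reverse=True):   # deepest nodes first
            u = (mean[(h, i)]
                 + math.sqrt(2.0 * math.log(t) / count[(h, i)])
                 + nu1 * rho ** h)
            kids = max(bval.get((h + 1, 2 * i), math.inf),
                       bval.get((h + 1, 2 * i + 1), math.inf))
            bval[(h, i)] = min(u, kids)

        # Descend from the root along the larger B-value until leaving the tree.
        node, path = (0, 0), []
        while node in count:
            path.append(node)
            h, i = node
            left, right = (h + 1, 2 * i), (h + 1, 2 * i + 1)
            bl, br = bval.get(left, math.inf), bval.get(right, math.inf)
            go_left = bl > br or (bl == br and rng.random() < 0.5)
            node = left if go_left else right
        path.append(node)   # the new node joins the tree this round

        h, i = node
        x = (i + rng.random()) / 2 ** h   # arm drawn from cell [i/2^h, (i+1)/2^h)
        y = reward(x)
        for v in path:                    # update counts and running means
            count[v] = count.get(v, 0) + 1
            mean[v] = mean.get(v, 0.0) + (y - mean.get(v, 0.0)) / count[v]
        pulls.append(x)
    return pulls

pulls = hoo(lambda x: 1.0 - abs(x - 0.3), 200)
print(len(pulls), min(pulls), max(pulls))
```

Unvisited nodes keep an infinite B-value, so the tree grows by exactly one node per round and refines most where the optimistic bound is highest.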

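1010.5290 2026-06-03 cs.LG cs.NA math.NA Updated version

Converged Algorithms for Orthogonal Nonnegative Matrix Factorizations

Andri Mirzal

AI summary Proposes uni-orthogonal and bi-orthogonal nonnegative matrix factorization algorithms based on the algorithms of Lee and Seung and ideas from Lin, gives convergence proofs, and confirms convergence experimentally.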

Comments 55 pages, 11 figures

Details
Abstract

This paper proposes uni-orthogonal and bi-orthogonal nonnegative matrix factorization algorithms with robust convergence proofs. We design the algorithms based on the work of Lee and Seung [1], and derive the converged versions by utilizing ideas from the work of Lin [2]. The experimental results confirm the theoretical convergence guarantees.
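
A minimal sketch of the standard Lee and Seung multiplicative updates for plain NMF under the Frobenius objective $\|V - WH\|_F^2$, which is the starting point the abstract refers to. It does not include the orthogonality constraints or the modifications drawn from Lin's work; the helper names and the small `eps` added to the denominators are assumptions for illustration.

```python
def matmul(a, b):
    """Plain-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf_step(v, w, h, eps=1e-12):
    """One round of multiplicative updates: H first, then W."""
    wt = transpose(w)
    num, den = matmul(wt, v), matmul(matmul(wt, w), h)
    h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(len(h[0]))]
         for a in range(len(h))]
    ht = transpose(h)
    num, den = matmul(v, ht), matmul(w, matmul(h, ht))
    w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(len(w[0]))]
         for i in range(len(w))]
    return w, h

v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0]]
h = [[1.0, 1.0], [1.0, 2.0]]
for _ in range(50):
    w, h = nmf_step(v, w, h)
print(frob_error(v, w, h))
```

Because every update multiplies a nonnegative entry by a nonnegative ratio, the factors stay nonnegative without any projection step.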

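1103.2491 2026-06-03 cs.LG cs.GT cs.SY eess.SY math.OC Updated version

Heterogeneous Learning in Zero-Sum Stochastic Games with Incomplete Information

Quanyan Zhu, Hamidou Tembine, Tamer Basar

AI summary For zero-sum stochastic games with incomplete information, proposes heterogeneous learning schemes (each agent adopts a different learning pattern), uses stochastic approximation to reduce them to ordinary differential equations, and applies them to security games in which the attacker and defender use different learning schemes owing to differences in rationality and information.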

Details
Abstract

Learning algorithms are essential for the applications of game theory in a networking environment. In dynamic and decentralized settings where the traffic, topology and channel states may vary over time and the communication between agents is impractical, it is important to formulate and study games of incomplete information and fully distributed learning algorithms that require, for each agent, a minimal amount of information regarding the remaining agents. In this paper, we address this major challenge and introduce heterogeneous learning schemes in which each agent adopts a distinct learning pattern in the context of games with incomplete information. We use stochastic approximation techniques to show that the heterogeneous learning schemes can be studied in terms of their deterministic ordinary differential equation (ODE) counterparts. Depending on the learning rates of the players, these ODEs could be different from the standard replicator dynamics, (myopic) best response (BR) dynamics, logit dynamics, and fictitious play dynamics. We apply the results to a class of security games in which the attacker and the defender adopt different learning schemes due to differences in their rationality levels and the information they acquire.
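
For orientation, the standard replicator dynamics named above is usually written as follows, where $x_i$ is the probability a player assigns to action $i$ and $u_i(x)$ is the expected payoff of that action. The notation is an assumption for illustration; the ODEs derived in the paper may take a different form depending on the learning rates.

```latex
\dot{x}_i = x_i \Bigl( u_i(x) - \sum_{j} x_j\, u_j(x) \Bigr)
```

Actions that earn more than the current average payoff gain probability mass, and those that earn less lose it.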

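1102.2975 2026-06-03 math.OC cs.LG cs.SY eess.SY math.PR Updated version

Decentralized Restless Bandit with Multiple Players and Unknown Dynamics

Haoyang Liu, Keqin Liu, Qing Zhao

AI summary For the decentralized restless multi-armed bandit problem with multiple players and unknown dynamics, proposes a decentralized policy that achieves logarithmic-order regret when bounds on certain system parameters are known, and regret arbitrarily close to logarithmic order when no prior knowledge is available.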

Comments 7 pages, 2 figures, in Proc. of Information Theory and Applications Workshop (ITA), January, 2011

Details
Abstract

We consider decentralized restless multi-armed bandit problems with unknown dynamics and multiple players. The reward state of each arm transitions according to an unknown Markovian rule when it is played and evolves according to an arbitrary unknown random process when it is passive. Players activating the same arm at the same time collide and suffer from reward loss. The objective is to maximize the long-term reward by designing a decentralized arm selection policy to address unknown reward models and collisions among players. A decentralized policy is constructed that achieves a regret with logarithmic order when an arbitrary nontrivial bound on certain system parameters is known. When no knowledge about the system is available, we extend the policy to achieve a regret arbitrarily close to the logarithmic order. The result finds applications in communication networks, financial investment, and industrial engineering.

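1102.0899 2026-06-03 cs.AI cs.CV cs.LG cs.NA math.NA math.PR Updated version

Evidence Feed Forward Hidden Markov Model: A New Type of Hidden Markov Model

Michael DelRose, Christian Wagner, Philip Frederick

AI summary To address the inability of hidden Markov models to model links between observations, proposes the Evidence Feed Forward Hidden Markov Model, which introduces probabilistic observation-to-observation links to improve classification, and validates it on visual action data and measurement data.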

Comments 19 pages, International Journal of Artificial Intelligence and Applications

Details
Journal ref
International Journal of Artificial Intelligence and Applications (IJAIA), Vol. 2, No. 1, Jan 2011
Abstract

The ability to predict the intentions of people based solely on their visual actions is a skill only performed by humans and animals. The intelligence of current computer algorithms has not reached this level of complexity, but there are several research efforts that are working towards it. With the number of classification algorithms available, it is hard to determine which algorithm works best for a particular situation. In classification of visual human intent data, Hidden Markov Models (HMMs) and their variants are leading candidates. The inability of HMMs to provide a probability for the observation-to-observation linkages is a major drawback of this classification technique. When a person visually identifies an action of another person, they monitor patterns in the observations. By estimating the next observation, people are able to summarize the actions and thus determine, with fairly good accuracy, the intention of the person performing the action. These visual cues and linkages are important in creating intelligent algorithms for determining human actions based on visual observations. The Evidence Feed Forward Hidden Markov Model is a newly developed algorithm which provides observation-to-observation linkages. The following research addresses the theory behind Evidence Feed Forward HMMs, provides mathematical proofs that their parameters can be learned so as to optimize the likelihood of observations with an Evidence Feed Forward HMM, which is important in all computational intelligence algorithms, and gives comparative examples with standard HMMs in classification of both visual action data and measurement data, thus providing a strong base for Evidence Feed Forward HMMs in classification of many types of problems.

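1011.1518 2026-06-03 stat.ML cs.LG cs.NA math.NA Updated version

Robust Matrix Decomposition with Outliers

Daniel Hsu, Sham M. Kakade, Tong Zhang

AI summary Studies conditions for recovering a low-rank matrix and a sparse outlier matrix from an observed matrix via $\ell_1$-norm and trace-norm minimization, giving stronger recovery guarantees than previous work without assuming that the spatial pattern of outliers is random.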

Comments Corrected comparisons to previous work of Candes et al (2009)

Details
Abstract

Suppose a given observation matrix can be decomposed as the sum of a low-rank matrix and a sparse matrix (outliers), and the goal is to recover these individual components from the observed sum. Such additive decompositions have applications in a variety of numerical problems including system identification, latent variable graphical modeling, and principal components analysis. We study conditions under which recovering such a decomposition is possible via a combination of $\ell_1$ norm and trace norm minimization. We are specifically interested in the question of how many outliers are allowed so that convex programming can still achieve accurate recovery, and we obtain stronger recovery guarantees than previous studies. Moreover, we do not assume that the spatial pattern of outliers is random, which stands in contrast to related analyses under such assumptions via matrix completion.
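
One common way to write a convex program that combines the two norms, given an observed matrix $Y$, a low-rank part $L$, a sparse part $S$, and a trade-off weight $\lambda > 0$ (all symbols are assumptions for illustration; the program analyzed in the paper may include additional constraints or a noise term):

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \,\|S\|_{1}
\quad \text{subject to} \quad L + S = Y
```

Here $\|L\|_{*}$ is the trace norm (sum of singular values), which promotes low rank, and $\|S\|_{1}$ is the entrywise $\ell_1$ norm, which promotes sparsity.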

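1008.4406 2026-06-03 cs.MM cs.LG cs.SY eess.SY Updated version

Structural Solutions to Dynamic Scheduling for Multimedia Transmission in Unknown Wireless Environments

Fangwen Fu, Mihaela van der Schaar

AI summary For scheduling delay-sensitive media data over time-varying wireless channels, proposes a structural solution based on a Markov decision process (MDP) and a priority graph (DAG) that reduces complexity by decomposing multi-data-unit decisions into sequential single-data-unit decisions, and develops a low-complexity online learning algorithm for unknown statistics; it significantly outperforms existing methods.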

Details
Abstract

In this paper, we propose a systematic solution to the problem of scheduling delay-sensitive media data for transmission over time-varying wireless channels. We first formulate the dynamic scheduling problem as a Markov decision process (MDP) that explicitly considers the users' heterogeneous multimedia data characteristics (e.g., delay deadlines, distortion impacts, and dependencies) and time-varying channel conditions, which are not simultaneously considered in state-of-the-art packet scheduling algorithms. This formulation allows us to perform foresighted decisions to schedule multiple data units for transmission at each time in order to optimize the long-term utilities of the multimedia applications. The heterogeneity of the media data enables us to express the transmission priorities between the different data units as a priority graph, which is a directed acyclic graph (DAG). This priority graph provides us with an elegant structure to decompose the multi-data unit foresighted decision at each time into multiple single-data unit foresighted decisions which can be performed sequentially, from the high priority data units to the low priority data units, thereby significantly reducing the computational complexity. When the statistical knowledge of the multimedia data characteristics and channel conditions is unknown a priori, we develop a low-complexity online learning algorithm to update the value functions which capture the impact of the current decision on the future utility. The simulation results show that the proposed solution significantly outperforms existing state-of-the-art scheduling solutions.
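
A minimal sketch of the sequential decomposition idea: walk the priority DAG from high-priority to low-priority data units and make one decision per unit, with each decision seeing the earlier ones. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The unit names, costs, and the simple budget rule are hypothetical placeholders for the paper's foresighted single-unit decisions, which depend on learned value functions.

```python
from graphlib import TopologicalSorter

def schedule_in_priority_order(depends_on, decide):
    """Visit data units so that every unit comes after the units it depends on.

    depends_on: dict mapping a unit to the set of higher-priority units
    decide:     callable (unit, decisions_so_far) -> bool (transmit or not)
    """
    order = list(TopologicalSorter(depends_on).static_order())
    decisions = {}
    for unit in order:
        # One single-unit decision at a time instead of a joint decision.
        decisions[unit] = decide(unit, decisions)
    return order, decisions

def make_budget_rule(costs, budget):
    """Hypothetical per-unit rule: transmit while the budget allows it."""
    def decide(unit, decisions):
        used = sum(costs[u] for u, sent in decisions.items() if sent)
        return used + costs[unit] <= budget
    return decide

depends_on = {"P1": {"I0"}, "B2": {"I0", "P1"}}   # B2 needs P1, which needs I0
costs = {"I0": 3, "P1": 2, "B2": 2}
order, decisions = schedule_in_priority_order(depends_on, make_budget_rule(costs, 5))
print(order, decisions)
```

Deciding per unit keeps the work linear in the number of units, whereas a joint decision over all subsets of units would grow exponentially.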

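1007.0380 2026-06-03 math.NA cs.LG cs.NA Updated version

Additive Non-negative Matrix Factorization for Missing Data

Mithun Das Gupta

AI summary Proposes an additive non-negative matrix factorization method that generates missing attributes in test data by jointly optimizing the missing attributes and the factors, and proves monotonic convergence of the algorithm.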

Comments General extension of the NMF framework

Details
Abstract

Non-negative matrix factorization (NMF) has previously been shown to be a useful decomposition for multivariate data. We interpret the factorization in a new way and use it to generate missing attributes from test data. We provide a joint optimization scheme for the missing attributes as well as the NMF factors. We prove the monotonic convergence of our algorithms. We present classification results for cases with missing attributes.

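0910.0921 2026-06-03 cs.LG cs.NA math.NA Updated version

Low-rank Matrix Completion with Noisy Observations: a Quantitative Comparison

Raghunandan H. Keshavan, Andrea Montanari, Sewoong Oh

AI summary Quantitatively compares three leading low-rank matrix completion algorithms (OptSpace, ADMiRA, and FPCA) under noisy observations on a single simulation platform, and shows that they accurately reconstruct both real and randomly generated matrices.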

Comments 7 pages, 7 figures, 47th Allerton Conference on Communication Control and Computing, 2009, invited paper

Details
Abstract

We consider a problem of significant practical importance, namely, the reconstruction of a low-rank data matrix from a small subset of its entries. This problem appears in many areas such as collaborative filtering, computer vision and wireless sensor networks. In this paper, we focus on the matrix completion problem in the case when the observed samples are corrupted by noise. We compare the performance of three state-of-the-art matrix completion algorithms (OptSpace, ADMiRA and FPCA) on a single simulation platform and present numerical results. We show that in practice these efficient algorithms can be used to reconstruct real data matrices, as well as randomly generated matrices, accurately.

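0909.5000 2026-06-03 cs.LG cs.NA cs.NE math.NA Updated version

Eignets for function approximation on manifolds

H. N. Mhaskar

AI summary For function approximation on compact smooth Riemannian manifolds, proposes a deterministic, universal algorithm for constructing eignets (linear combinations of kernel functions), gives optimal estimates of the degree of approximation, and proves bounds on the coefficients and optimal approximation of derivatives.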

Comments 28 pages. Articles in press; Applied and Computational Harmonic Analysis, 2009

Details
Abstract

Let $\mathbb{X}$ be a compact, smooth, connected, Riemannian manifold without boundary, and let $G:\mathbb{X}\times\mathbb{X}\to \mathbb{R}$ be a kernel. Analogous to a radial basis function network, an eignet is an expression of the form $\sum_{j=1}^M a_jG(\circ,y_j)$, where $a_j\in\mathbb{R}$, $y_j\in\mathbb{X}$, $1\le j\le M$. We describe a deterministic, universal algorithm for constructing an eignet for approximating functions in $L^p(\mu;\mathbb{X})$ for a general class of measures $\mu$ and kernels $G$. Our algorithm yields linear operators. Using the minimal separation amongst the centers $y_j$ as the cost of approximation, we give modulus of smoothness estimates for the degree of approximation by our eignets, and show by means of a converse theorem that these are the best possible for every individual function. We also give estimates on the coefficients $a_j$ in terms of the norm of the eignet. Finally, we demonstrate that if any sequence of eignets satisfies the optimal estimates for the degree of approximation of a smooth function, measured in terms of the minimal separation, then the derivatives of the eignets also approximate the corresponding derivatives of the target function in an optimal manner.
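
A minimal sketch of evaluating an expression of the form $\sum_{j=1}^M a_jG(\circ,y_j)$. The Gaussian-shaped kernel on the real line is a hypothetical stand-in chosen only to make the example runnable; the paper's kernels live on a manifold and come from a specific class, and the construction of the coefficients is not shown here.

```python
import math

def eignet(coeffs, centers, kernel):
    """Return the function x -> sum_j a_j * G(x, y_j)."""
    def evaluate(x):
        return sum(a * kernel(x, y) for a, y in zip(coeffs, centers))
    return evaluate

# Hypothetical kernel for illustration only.
gaussian = lambda x, y: math.exp(-(x - y) ** 2)

f = eignet([1.0, -2.0], [0.0, 1.0], gaussian)
print(f(0.0), f(1.0))
```

For fixed centers and kernel the value depends linearly on the coefficients $a_j$, the same structure as a radial basis function network.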

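0810.0877 2026-06-03 math.NA cs.LG cs.NA Updated version

Bias-Variance Techniques for Monte Carlo Optimization: Cross-validation for the CE Method

Dev Rajnarayan, David Wolpert

AI summary Uses the bias-variance tradeoff and cross-validation to improve the performance of the Cross Entropy (CE) method in Monte Carlo optimization, and notes that techniques from parametric learning can carry over to optimization algorithms.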

Details
Abstract

In this paper, we examine the CE method in the broad context of Monte Carlo Optimization (MCO) and Parametric Learning (PL), a type of machine learning. A well-known overarching principle used to improve the performance of many PL algorithms is the bias-variance tradeoff. This tradeoff has been used to improve PL algorithms ranging from Monte Carlo estimation of integrals, to linear estimation, to general statistical estimation. Moreover, MCO is very closely related to PL. Owing to this similarity, the bias-variance tradeoff affects MCO performance, just as it does PL performance. In this article, we exploit the bias-variance tradeoff to enhance the performance of MCO algorithms. We use the technique of cross-validation, a technique based on the bias-variance tradeoff, to significantly improve the performance of the Cross Entropy (CE) method, which is an MCO algorithm. In previous work we have confirmed that other PL techniques improve the performance of other MCO algorithms. We conclude that the many techniques pioneered in PL could be investigated as ways to improve MCO algorithms in general, and the CE method in particular.
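
A minimal sketch of the plain Cross Entropy method for one-dimensional minimization, assuming a Gaussian sampling distribution that is refit to the best samples each iteration. It does not include the cross-validation step the abstract describes; the function name, default parameters, and objective are assumptions for illustration.

```python
import random
import statistics

def cross_entropy_minimize(f, mu=0.0, sigma=5.0, n_samples=100,
                           n_elite=10, iters=30, seed=0):
    """Iteratively refit a Gaussian to the lowest-cost samples."""
    rng = random.Random(seed)
    for _ in range(iters):
        # Sample candidates and sort them by cost, lowest first.
        xs = sorted((rng.gauss(mu, sigma) for _ in range(n_samples)), key=f)
        elite = xs[:n_elite]
        # Refit the sampling distribution to the elite set.
        mu = statistics.fmean(elite)
        sigma = statistics.pstdev(elite) + 1e-12   # keep the spread positive
    return mu, sigma

mu, sigma = cross_entropy_minimize(lambda x: (x - 3.0) ** 2)
print(mu, sigma)
```

The elite fraction and sample size control how aggressively the distribution narrows, which is the kind of parameter choice where a bias-variance argument becomes relevant.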