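arXivDaily daily academic digest: synced with the full arXiv data feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.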

2605.30327 2026-05-29 cs.LG cs.AI cs.CL math.ST stat.ML stat.TH

Reasoning with Sampling: Cutting at Decision Points

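Felix Zhou, Anay Mehrotra, Quanquan C. Liu

Affiliations: Yale University; Stanford University

AI summary: Proposes the Entropy-Cut Metropolis-Hastings algorithm, which uses the base model's next-token entropy as a proxy to identify key decision points and resample from them, enabling efficient sampling from the power distribution to strengthen reasoning; it outperforms baselines and RL-trained models on several benchmarks.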

Abstract

Frontier reasoning models are produced by posttraining base language models with reinforcement learning. Recent work has challenged this by showing that sampling from a sharpened version of the base model's distribution, a so-called power distribution, elicits comparable reasoning without additional training, curated datasets, or verifiers. However, making this method practical requires efficiently sampling from the power distribution. A sampler needs to "mix" to the power distribution, which necessitates moving between modes of the target distribution; intuitively, e.g., trying different reasoning strategies. The samplers proposed in prior works repeatedly select a "cut" position in the current reasoning trace uniformly at random and resample the suffix from that position onward. However, reasoning traces typically contain a few consequential decisions (e.g., the choice of proof strategy or algorithm), and we observe that a uniformly chosen cut tends to rewrite local details rather than revisit decision points. We introduce an algorithm (Entropy-Cut Metropolis-Hastings) that uses the base model's next-token entropy as a proxy to identify key decision points and resample from those positions. We empirically verify that entropy jumps are a useful proxy for decision points and, in a stylized model of reasoning, prove that our method's mixing time scales with the number of decisions in a trace rather than with the number of tokens, which can be much larger. Across MATH500, HumanEval, GPQA Diamond, and AIME26, our method consistently improves over baselines and RL-trained models.
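The cut-selection idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the base model's next-token distributions are already available as plain probability lists, and it assumes, purely for illustration, that a cut position is drawn with probability proportional to its next-token entropy (the abstract speaks of entropy jumps but does not state the exact rule). The names `entropy` and `sample_cut` are hypothetical.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return sum(-p * math.log(p) for p in probs if p > 0.0)


def sample_cut(next_token_dists, rng):
    """Pick a cut position, favouring high-entropy positions.

    ASSUMPTION: weights are proportional to entropy; the paper's rule may differ.
    """
    weights = [entropy(d) for d in next_token_dists]
    if sum(weights) == 0.0:
        # Every position is deterministic: fall back to a uniform cut.
        return rng.randrange(len(weights))
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Toy trace: position 1 is a coin flip (a "decision"), the others are near-certain.
dists = [[1.0, 0.0], [0.5, 0.5], [0.99, 0.01]]
cut = sample_cut(dists, random.Random(0))
```

A complete Metropolis-Hastings step would also resample the suffix from the chosen position and apply an accept/reject test against the power distribution; both are omitted here.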

2605.30324 2026-05-29 cs.DS cs.AI cs.CL cs.LG stat.ML

On Language Generation in the Limit with Bounded Memory

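Jon Kleinberg, Anay Mehrotra, Amin Saberi, Grigoris Velegkas

Affiliations: Cornell University; Stanford University; Google Research

AI summary: Studies language generation in the limit under bounded memory, using combinatorial bounds and sliding windows to analyze how memory constraints affect generability, density, and identification.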

Comments The abstract has been shortened to fit within the arXiv limit

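Wasserstein KL Divergence for Gaussian Distributions

Adwait Datar, Nihat Ay

AI summary: Proposes a new version of the KL divergence for Gaussian distributions based on Wasserstein geometry (the WKL divergence), shows that it is consistent with the geometry of the sample space, and shows that the WKL divergence between Dirac measures is proportional to the squared distance between the two points.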

2605.30289 2026-05-29 cs.LG stat.AP stat.ML

Statistical Embeddings for Similarity, Retrieval, and Interpretable Alignment of Numeric Tabular Datasets

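M. Ross Kunz, John Merickel, Keith Wilson

Affiliations: Idaho National Laboratory

AI summary: Proposes a method for characterizing and comparing numeric tabular datasets through structured exploratory data analysis descriptors, sentence-transformer embeddings, and Canonical Correlation Analysis (CCA), enabling cross-dataset similarity retrieval and interpretable variable-level alignment, with support for differential privacy.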

Abstract

Numeric tabular datasets are the dominant data format in scientific practice, yet large language models lack native mechanisms for representing numeric datasets in a meaningful way across heterogeneous feature spaces. Existing approaches either target predictive modeling over individual datasets, which requires a shared set of variable definitions, or lack mechanisms for interpretable cross-dataset alignment. The proposed methodology characterizes numeric tabular datasets through structured exploratory data analysis descriptors, embeds those descriptors into a shared vector space using a pretrained sentence transformer, and quantifies cross-dataset similarity via Canonical Correlation Analysis (CCA). Furthermore, a penalized formulation of CCA is applied to recover sparse, interpretable variable-level correspondences between datasets, identifying which statistical descriptors or variable-level quantities drive cross-dataset alignment without requiring shared variable names or feature conventions. Differential privacy is optionally applied to the descriptor set prior to embedding, supporting deployment in sensitive data contexts without requiring access to raw observations at time of comparison. The methodology is evaluated across 15 datasets spanning general-purpose benchmarks, materials informatics, and nuclear-grade graphite characterization. Results demonstrate a total P@1 score of 0.9, with known nearest-neighbor retrieval and cluster structure remaining robust across embedding ablations and differential privacy budgets. The proposed framework provides a principled pathway for integrating heterogeneous numeric data into retrieval-augmented generation pipelines while preserving statistical context, with direct applications to data-driven algorithm selection and simulation model initialization for unknown datasets.
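The P@1 retrieval score reported above can be illustrated with a small sketch. It assumes each dataset has already been reduced to one embedding vector and carries a known group label, and it uses plain cosine similarity rather than the CCA-based similarity described in the abstract; `cosine` and `precision_at_1` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def precision_at_1(embeddings, labels):
    """Fraction of items whose nearest neighbour (excluding itself) shares their label."""
    hits = 0
    for i, e in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        best = max(others, key=lambda j: cosine(e, embeddings[j]))
        hits += labels[best] == labels[i]
    return hits / len(embeddings)


# Toy example: two tight groups of dataset embeddings.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["a", "a", "b", "b"]
score = precision_at_1(toy_embeddings, toy_labels)
```

Swapping `cosine` for another similarity function leaves the scoring loop unchanged, which is why the metric can be reported across embedding ablations.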

2605.30287 2026-05-29 stat.ME

MoSAIC: Multi-Resolution Spatial Regression Analysis of Cellular Colocalizations in Cancer Imaging

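Jessica Aldous, Michele Peruzzi, Maria Masotti, Aaron Udager, Allison May, Evan Keller, Veerabhadran Baladandayuthapani

AI summary: Proposes MoSAIC, a hierarchical Bayesian spatial regression model that jointly analyzes multi-resolution spatial data and decomposes variation into global tumor-gradient effects, patient-specific effects, and spatial dependence; in renal cell carcinoma imaging it identifies changes in immune-tumor colocalization associated with the EMT gradient.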

Comments 45 pages (30 before supplement), 6 figures, submitted to ISBA and JSM

Abstract

Hierarchical multiplex imaging approaches generate spatially resolved single-cell measurements across multiple, spatially organized fields of view (FOVs) within patient tumor specimens, thereby enabling systematic investigation of how the organization of the tumor microenvironment varies along biologically meaningful intratumoral gradients. Existing approaches fail to jointly address this multi-resolution data structure needed to recover true biological signals. We propose MoSAIC: multi-resolution spatial regression analysis of cell colocalizations, a hierarchical Bayesian spatial regression model designed for multi-resolution spatial data. MoSAIC decomposes the joint variation into three model components: (i) global tumor-gradient effects, (ii) patient-specific effects to capture inter-patient variability, and (iii) Gaussian process models to account for spatial dependence between FOVs within each patient tumor tissue. Simulations demonstrate MoSAIC has improved prediction and model fit compared to existing spatial and non-spatial model alternatives. Our method is motivated by and applied to a renal cell carcinoma multiplex imaging cohort to investigate immune-tumor colocalization patterns across the epithelial-to-mesenchymal transition (EMT) gradient. MoSAIC identifies increased macrophage-tumor colocalization and decreased cytotoxic T-tumor colocalization progressing across the increasing EMT gradient, consistent with EMT-associated immune suppression and spatially varying immune engagement. Overall, MoSAIC provides an interpretable, multi-resolution framework for quantifying spatial tumor-gradient effects in cancer imaging studies. Software is available on GitHub at jcaldous/MoSAIC.

2605.30266 2026-05-29 math.ST stat.TH

Wasserstein Least Squares: A Canonical Regression Method for Probability Distributions

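Uriel Martínez León, Jonathan Niles-Weed

AI summary: Proposes the Wasserstein least squares regression method, shows from the perspective of convex analysis that it is the canonical extension of Euclidean least squares to the space of probability distributions, establishes an n^{-1/2} estimation rate under the template deformation model, and applies the method to demographic data.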

Abstract

We perform a mathematical and statistical analysis of the Wasserstein least squares problem, a regression method for vector-valued covariates and distribution-valued responses. Our proposal contrasts with other distributional regression methods by having a direct interpretation in terms of random variables, as a nonparametric analogue of the classic random-effects model. On the mathematical side, we use a strategy of Lavenant (2024) to show that Wasserstein least squares is the canonical extension of Euclidean least squares to the space of probability distributions from the perspective of convex analysis; this viewpoint gives rise to multimarginal and dual formulations of the Wasserstein least squares problem, extending a similar theory for Wasserstein barycenters. We perform a statistical analysis of the Wasserstein least squares problem under the template deformation model, showing, surprisingly, that estimation is possible at the n^{-1/2} rate. As a special case, we obtain improved rates of estimation for Wasserstein barycenters, which are an exponential improvement over those established by Ahidar-Coutrix, Le Gouic and Paris (2020). Finally, we propose a heuristic particle method for Wasserstein least squares and use it to conduct a novel analysis of large-scale demographic data from the RAND Health and Retirement Study.
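The barycenter special case mentioned above has a simple closed form in one dimension, which the sketch below illustrates. It relies on the standard fact that the 2-Wasserstein barycenter of distributions on the real line is obtained by averaging their quantile functions; for empirical samples of equal size this reduces to averaging order statistics. It does not implement the paper's regression method or its particle heuristic, and `barycenter_1d` is a hypothetical name.

```python
def barycenter_1d(samples):
    """Equal-weight 2-Wasserstein barycenter of 1-D empirical distributions.

    ASSUMPTION: every sample has the same number of points, so quantile
    averaging is just averaging the sorted values position by position.
    """
    ordered = [sorted(s) for s in samples]
    n = len(ordered[0])
    if any(len(s) != n for s in ordered):
        raise ValueError("all samples must have the same size")
    return [sum(s[i] for s in ordered) / len(ordered) for i in range(n)]


# Two toy response distributions; the barycenter sits halfway between them.
center = barycenter_1d([[0.0, 1.0, 2.0], [10.0, 12.0, 11.0]])
```

In higher dimensions no such sorting shortcut exists, which is one reason numerical heuristics are needed there.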

2605.30209 2026-05-29 econ.GN q-fin.EC stat.AP

Betting Against Integrity: Identifying Match-Fixing Through In-Play Market Dynamics

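David Winkelmann, Maya Vienken, Christian Deutscher, Roland Langrock

AI summary: Uses high-frequency live-betting data from the Italian Serie B, describes standard betting market dynamics and predicts expected betting volumes with a state-space model, and then applies outlier detection to identify abnormal betting behaviour, providing statistical support for the early detection of match-fixing.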

Abstract

Match-fixing undermines the integrity of sport by eroding public trust and threatening the financial sustainability of clubs and leagues. The global expansion of sports betting markets has created new incentives and opportunities for manipulation, calling for rigorous, data-driven monitoring tools. Football, which accounts for the largest share of global betting turnover, remains particularly exposed: integrity reports continue to flag several suspicious matches, with past scandals in Italy and Turkey underlining the problem's persistence. This study uses high-frequency live-betting data from the Italian Serie B (2018/19-2020/21) to explore statistical approaches for detecting abnormal betting behaviour. A state-space modelling framework is employed to describe standard betting market dynamics and to predict expected betting volumes conditional on match characteristics. Deviations from these expectations can then be analysed using outlier detection techniques to identify potentially suspicious periods. The results demonstrate how statistical modelling can contribute to the early identification of irregular betting patterns, thereby supporting integrity assurance in live sports betting markets.
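The second stage described above, flagging deviations from expected betting volumes, can be sketched as follows. This is an illustration under stated assumptions rather than the study's procedure: the expected volumes are taken as given (standing in for the output of the state-space model), and the outlier rule is a generic robust z-score based on the median and the median absolute deviation (MAD). The function name and the threshold of 3.5 are hypothetical choices.

```python
import statistics


def flag_suspicious(observed, expected, threshold=3.5):
    """Return indices of periods whose residual is a robust outlier."""
    residuals = [o - e for o, e in zip(observed, expected)]
    med = statistics.median(residuals)
    mad = statistics.median(abs(r - med) for r in residuals)
    if mad == 0.0:
        return []  # no spread at all: nothing can be called an outlier
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [i for i, r in enumerate(residuals) if abs(r - med) / scale > threshold]


# Toy in-play volumes: the last period deviates sharply from expectation.
observed = [10.1, 9.8, 10.0, 10.15, 9.9, 15.0]
expected = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
suspicious = flag_suspicious(observed, expected)
```

Using the median and MAD keeps a single extreme period from inflating the scale estimate and thereby hiding itself.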

2605.30178 2026-05-29 stat.ME stat.CO

Cellwise Robust Discriminant Analysis

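Fabio Centofanti, Can Hakan Dagidir, Mia Hubert, Peter J. Rousseeuw

AI summary: Addresses cellwise outliers in the data matrix by proposing discriminant analysis methods (cellQDA/cellLDA) based on cellwise robust estimation and penalized maximum likelihood, handling cellwise and casewise outliers as well as missing values in both training and prediction.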

Abstract

Classical discriminant analysis (DA) is based on the mean and empirical covariance matrix of each class, both of which are sensitive to outliers in the data. In the past the focus was on casewise outliers, that is, data points that lie far away. Nowadays there is increasing interest in cellwise outliers, which are unexpected entries in the data matrix. Removing an entire case because it has one or a few outlying cells would lose much information. Cellwise robust methods aim to detect the outlying cells and to preserve the information in the other cells. We propose a DA method that is trained by estimating the location and covariance of each class with cellwise and casewise robust estimators, which can also handle missing values (NAs). The main novelty of our approach is in the prediction on test data, which may themselves contain outlying cells and NAs. The new robust discriminant function is derived from a novel statistical model by penalized maximum likelihood. We focus on quadratic DA, but also cover the setting of linear DA. The new cellQDA and cellLDA methods perform well in simulation. The approach is illustrated on real data, and the results are interpreted with the help of graphical displays.

2605.30175 2026-05-29 astro-ph.HE cs.LG stat.ML

A new completely parameter-free clustering algorithm for unsupervised classification of BATSE gamma-ray bursts

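Soumita Modak

Affiliations: Department of Statistics, Presidency University

AI summary: Proposes a completely parameter-free clustering algorithm and applies it to the BATSE gamma-ray burst sample; the resulting two groups (short and long bursts) support the merger-collapsar theory.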

Abstract

Cluster analysis is a widely applied machine learning technique for understanding the patterns in the population of gamma-ray bursts (GRBs) and thereby exploring their physical sources. At present, the number of clusters corresponding to distinguishable groups is still disputed, despite numerous attempts with state-of-the-art clustering procedures. This crucial unknown parameter needs to be evaluated, either directly or indirectly in terms of other tuning parameters, before an appropriate clustering algorithm can produce clusters of GRBs. While most of the applied algorithms arrive at two physically explained groups, mergers and collapsars, dominated by short and long bursts respectively, other statistical approaches contradict this binary partition. However, the physical basis of any additional cluster(s) has not yet been confirmed. Therefore, we propose a new algorithm, from a different stream of clustering referred to as 'completely parameter-free', which classifies GRBs in a manner that has not been tried so far. It indicates two main groups, of short- and long-duration bursts from the BATSE sample, compatible with the merger-collapsar theory.

2605.30167 2026-05-29 stat.ML cs.CV cs.LG stat.AP

Visual Spatial Learning: Single-Field Spatial Interpolation Using Convolutional Neural Networks

Daniel Tinoco, Raquel Menezes, Carlos Baquero, Alexandra Silva

Affiliations: Centro de Matemática (CMAT), Universidade do Minho; DEI-FEUP & INESC TEC, Universidade do Porto; Instituto Português do Mar e da Atmosfera, I. P. (IPMA, I. P.), Lisboa, Portugal; Centro de Ciências do Mar e do Ambiente (MARE), Évora, Portugal

AI summary: Proposes a convolutional neural network (CNN) architecture that learns spatial interpolation directly from a single partially observed field, without external data or prior fields, as an alternative to Kriging.

Comments 53 pages, 10 figures

Abstract

Predicting a complete spatially correlated field from sparse observations is a fundamental challenge in spatial statistics and environmental modelling. Classical interpolation methods such as Kriging rely on Gaussian process assumptions and variography, which can limit their effectiveness in non-stationary settings and require substantial domain expertise. In this work, we leverage an architecture based on convolutional neural networks (CNNs) for spatial interpolation that is trained and applied on a single partially observed field, without access to external data or prior fields. The model is supervised directly on the observed locations and learns to predict values at unobserved points on a user-defined grid. Unlike Kriging, our method does not require explicit covariance modelling or variogram estimation, and it can flexibly capture local spatial patterns in a data-driven manner. This work demonstrates the potential of CNNs for single-instance spatial interpolation under sparse supervision, offering a practical alternative to classical geostatistical methods, and extending the use of CNNs to a new problem domain.
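Supervising a model only at observed locations usually comes down to a masked loss. The sketch below shows that idea in plain Python on nested lists; it is a generic masked mean squared error, assumed here for illustration, and is not taken from the paper's architecture or training code. The name `masked_mse` is hypothetical.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over grid cells where mask is True (observed cells only)."""
    total, count = 0.0, 0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, observed in zip(pred_row, target_row, mask_row):
            if observed:
                total += (p - t) ** 2
                count += 1
    return total / count if count else 0.0


# 2x2 grid with two observed cells; unobserved targets are placeholders and ignored.
pred = [[1.0, 2.0], [3.0, 4.0]]
target = [[1.0, 0.0], [0.0, 6.0]]
mask = [[True, False], [False, True]]
loss = masked_mse(pred, target, mask)
```

Because unobserved cells contribute nothing to the loss, the network's outputs there are shaped only by what it has learned from the observed neighbours.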

2605.30158 2026-05-29 stat.ME

High-Dimensional Data with Measurement Error

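Herman Tesso, Georges Nguefack-Tsague

AI summary: Addresses measurement error in the covariates of high-dimensional regression by reviewing and comparing ridge, lasso, the Dantzig selector, and the elastic net together with their error-corrected variants, and offers practical recommendations based on simulations and real data.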

Comments 21 pages, 0 figures

Abstract

In many important statistical analyses, the number of covariates $p$ often exceeds the data size $n$, a regime commonly referred to as high-dimensional. While considerable progress has been made in high-dimensional regression under the assumption of error-free covariates, real-world data frequently involve noisy or corrupted measurements. When left unaddressed, measurement errors can silently distort the analysis and mislead the conclusions. This paper reviews and evaluates some advisable statistical inference methods for high-dimensional regression in the presence of mismeasured covariates. We discuss four penalized regression methods (ridge, lasso, Dantzig selector, and elastic net) alongside their measurement-error-corrected variants, and conduct a comparative study under linear additive and uncorrelated measurement error models. Through simulation studies and a real application to high-dimensional medical genetic data, we illustrate the methods studied, show that the choice of correction procedure is problem-specific, and provide practical recommendations to help practitioners navigate this methodological landscape.
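How additive measurement error distorts a regression is easiest to see with a single covariate. The simulation below is a textbook illustration and not one of the penalized methods the paper compares: with true covariate x, observed w = x + u, and a known error variance, the naive slope is attenuated toward zero, and subtracting the error variance from the variance of w undoes the bias. All function names and parameter values are hypothetical.

```python
import random


def simulate(n, beta, sigma_u, rng):
    """Draw (w, y) pairs where w is a noisy measurement of the true covariate x."""
    ws, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        ws.append(x + rng.gauss(0.0, sigma_u))   # covariate observed with error
        ys.append(beta * x + rng.gauss(0.0, 0.5))
    return ws, ys


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def naive_slope(ws, ys):
    """Ordinary least-squares slope that ignores the measurement error."""
    return covariance(ws, ys) / covariance(ws, ws)


def corrected_slope(ws, ys, sigma_u):
    """Moment correction; ASSUMES the error variance sigma_u**2 is known."""
    return covariance(ws, ys) / (covariance(ws, ws) - sigma_u ** 2)


ws, ys = simulate(20000, beta=2.0, sigma_u=1.0, rng=random.Random(0))
naive = naive_slope(ws, ys)              # close to 1.0: attenuated by half
corrected = corrected_slope(ws, ys, 1.0)  # close to the true slope 2.0
```

With unit variance for both x and u, the attenuation factor is 1 / (1 + 1) = 0.5, which is why the naive slope lands near half of the true value.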

2605.30157 2026-05-29 stat.AP

Leveraging Large Language Models to Improve Precision in Randomized Controlled Trials

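Jaylin Lowe, Adam Sales, Johann A. Gagnon-Bartsch

AI summary: Explores how predictions from large language models (LLMs) can be used safely and rigorously to improve the precision of randomized controlled trials (RCTs), and validates the approach on three case studies.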

Comments Submitted to Machine Learning and Artificial Intelligence for Causal Inference in the Behavioral and Social Sciences: Methodological Advances and Applications, a topical issue of the Zeitschrift für Psychologie

2605.30153 2026-05-29 stat.ML cs.IT cs.LG math.IT math.ST stat.TH

Diffusion Models Are Statistically Optimal for Learning Low-Dimensional Multi-Modal Distributions

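Jingda Wu, Changxiao Cai

Affiliations: Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, USA

AI summary: Proves that, when learning distributions supported on a union of low-dimensional subspaces, diffusion models have sample complexity that depends only on the intrinsic dimension and attain a near-optimal 1-Wasserstein error rate, without smoothness or bounded-density assumptions.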

Comments accepted to ICML 2026

Abstract

Score-based diffusion models have demonstrated remarkable empirical success in learning high-dimensional distributions, particularly those exhibiting low-dimensional and multi-modal structures. However, theoretical understanding of their statistical efficiency remains limited. Existing theories typically rely on strong regularity assumptions, such as uniformly bounded densities or globally smooth score functions, which fail to capture such intrinsic structures. In this work, we study the sample complexity of diffusion models for learning distributions supported on a union of low-dimensional subspaces. Assuming that the data distribution within each subspace is subgaussian, we show that diffusion models require at most $\widetilde{O}(\varepsilon^{-k \vee 2})$ samples to achieve $\varepsilon$ error in 1-Wasserstein distance, where $k$ is the intrinsic dimension. This near-optimal convergence rate depends only on the intrinsic dimension and significantly improves upon prior theoretical guarantees that suffer from the curse of dimensionality. Notably, our analysis applies to a broad collection of distributions without imposing smoothness, bounded-density, or log-concavity assumptions. Overall, our results show that diffusion models can statistically adapt to intrinsic low-dimensional structure while naturally accommodating multi-modal data, offering a rigorous theoretical justification for their success in complex high-dimensional learning tasks.
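A quick calculation shows what the exponent $k \vee 2$ (the larger of $k$ and 2) means in practice. The helper below drops the constants and logarithmic factors hidden in the $\widetilde{O}$ notation, so its output is an order-of-magnitude guide only; `samples_needed` is a hypothetical name.

```python
def samples_needed(eps, k):
    """Leading-order sample count eps^(-max(k, 2)), ignoring constants and log factors."""
    return eps ** (-max(k, 2))


# Target error 0.1 in 1-Wasserstein distance, for several intrinsic dimensions k.
for k in (1, 2, 3, 5):
    print(k, round(samples_needed(0.1, k)))
```

For intrinsic dimension 1 or 2 the rate is the same, and beyond that each extra intrinsic dimension multiplies the requirement by roughly 1/eps, independently of the ambient dimension.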

2605.30134 2026-05-29 stat.CO

Accurate and Efficient MCMC for Latent Position Models

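Zonghao Li, Aaron Smith

AI summary: For Bayesian inference in latent position models (LPMs), proposes two MCMC algorithms: one runs in almost O(|E|) time with stronger accuracy, and the other runs in almost O(|V|) time with slightly improved accuracy; the main innovation is an auxiliary data structure that can be updated cheaply.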

Comments 43 pages, 8 figures

Abstract

Latent position models (LPMs) are a large and popular class of models for random graphs. However, fitting Bayesian LPMs is computationally challenging: computing the likelihood even once takes time that is quadratic in the number of vertices $|V|$ of the observed graph $G = (V,E)$. Many previous papers have introduced approximate MCMC algorithms to speed this up, with the most similar to ours, Rastelli et al. (2024), presenting an algorithm that has amortized running time that can be reduced almost to $O(|E|)$ and good empirical performance on reasonable inference problems. The present paper offers two algorithms for solving the same problem: a "fast" algorithm with running time of the same almost-$O(|E|)$ order as Rastelli et al. and much stronger accuracy guarantees, and a "faster" algorithm with an improved running time of almost $O(|V|)$, and accuracy guarantees that are slightly improved compared to Rastelli et al. (but not sufficient for all tasks). The main improvements come from the introduction of a simple auxiliary data structure that can be cheaply updated during an MCMC run; we suspect that the same "cheap sketch" may be useful for other MCMC algorithms.
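The quadratic cost of a single likelihood evaluation is visible in a direct implementation. The sketch below assumes the classic logistic distance model, in which the log-odds of an edge between vertices i and j equal alpha minus the Euclidean distance between their latent positions; the abstract does not specify the link function, so this choice and the name `log_likelihood` are assumptions for illustration.

```python
import math


def log_likelihood(positions, edges, alpha):
    """Exact log-likelihood of an undirected graph under a logistic distance LPM.

    The double loop visits every vertex pair, so the cost is quadratic in |V|
    no matter how few edges the graph has.
    """
    edge_set = {frozenset(e) for e in edges}
    n = len(positions)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            eta = alpha - math.dist(positions[i], positions[j])
            if frozenset((i, j)) in edge_set:
                total -= math.log1p(math.exp(-eta))  # log sigmoid(eta)
            else:
                total -= math.log1p(math.exp(eta))   # log(1 - sigmoid(eta))
    return total


# Three vertices in the plane with a single observed edge.
ll = log_likelihood([(0.0, 0.0), (0.0, 0.0), (3.0, 4.0)], [(0, 1)], alpha=0.0)
```

Most of the work in a sparse graph goes into non-edges, which is the term that approximate schemes aim to avoid summing exactly.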

2605.30132 2026-05-29 cs.LG stat.ML

Learning to Extrapolate to New Tasks: A Relational Approach to Task Extrapolation

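Adam Ousherovitch, Yixin Wang

Affiliations: Department of Statistics, University of Michigan, Ann Arbor

AI summary: Proposes the Relational Task Extrapolator (RTE), which decomposes each target task into an anchor task and a transformation and learns a relational operator, enabling systematic extrapolation to unseen tasks; it substantially outperforms existing approaches in function prediction and sequence prediction.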

Comments ICML 2026

Abstract

Modern learning systems excel at interpolation but struggle to generalize to unseen tasks outside the training distribution's support. This failure occurs even in simple settings, such as handling task parameters beyond the training range, and persists despite advances in foundation models. To this end, we develop the Relational Task Extrapolator (RTE), an algorithm designed to enable systematic extrapolation to novel tasks. The key observation is that extrapolation is inherently relational: extrapolating to unseen tasks requires learning how tasks transform into one another. If a model learns the transformation between tasks A and B during training, it can apply that same transformation to relate known tasks to unseen ones at test time. RTE operationalizes this idea by decomposing each target task into a known anchor task and a transformation linking the anchor and target. It then learns a relational operator, mapping an anchor-transformation pair to predictions for the target task. We instantiate RTE across multiple task extrapolation regimes in function prediction, e.g. where target tasks use out-of-range parameters (parameter extrapolation), have greater compositional depth (length extrapolation), and/or recombine function primitives in unseen ways (compositional extrapolation). We further extend RTE to sequence prediction, integrating it into fine-tuning algorithms for foundation models. Across empirical studies, we find that RTE substantially outperforms existing approaches on extrapolation to novel, unseen tasks.

2605.30113 2026-05-29 math.ST cs.CC cs.DS math.PR stat.TH

Low-degree estimation thresholds in planted hypergraphs and tensor PCA

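Daniel Fu, Youngtak Sohn

AI summary: Studies statistical-computational gaps in planted dense subhypergraph, sparse tensor PCA, and tensor PCA with a general prior through the low-degree framework, identifies sharp thresholds for low-degree estimation, and gives polynomial-time algorithms.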

Comments 67 pages, 1 figure

Abstract

A central question in high-dimensional statistics is to understand statistical-computational gaps: regimes in which recovering a hidden signal is information-theoretically possible but conjectured to be computationally intractable. The low-degree framework offers a concrete way to study this gap by restricting attention to estimators that are polynomials of degree at most $D$ in the observed data. In this paper, we study low-degree estimation in planted dense subhypergraph, sparse tensor PCA, and tensor PCA with a general prior. For the planted dense subhypergraph model on $n$ vertices, we identify two regimes depending on whether the planted set is larger or smaller than $\sqrt{n}$. Above this scale, we identify a sharp threshold for low-degree estimation. Below this scale, we establish hardness in the regimes predicted by prior work, thereby resolving a question of Schramm and Wein (2022) and Sohn and Wein (2025). For sparse tensor PCA, we identify an analogous sharp phase transition. For tensor PCA with a general prior, we prove a low-degree estimation lower bound at the critical signal scale, matching the degree-signal tradeoff suggested by prior work. Our lower bounds apply to degree $D=n^{\delta}$, where $n$ is the dimension and $\delta>0$ is a constant, and we complement them with corresponding low-degree upper bounds. In addition, for planted dense subhypergraph and sparse tensor PCA above the $\sqrt{n}$ scale, we convert our upper bounds into polynomial-time algorithms that achieve almost exact recovery above the sharp threshold, yielding polynomial-time algorithms succeeding up to this threshold. Our proofs extend the framework of Sohn and Wein (2025) through a conditional variant that yields the correct signal-to-noise ratio in settings where the unconditional approach is insufficient.

2605.30095 2026-05-29 math.ST cs.IT eess.SP math.IT stat.TH

The generalized method of moments is (almost) statistically efficient in low-SNR Gaussian latent-variable models

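Amnon Balanov, Tamir Bendory, Dan Edidin

AI summary: For low-SNR Gaussian latent-variable models, proves that the optimally weighted generalized method of moments has the same first-order asymptotic covariance as maximum likelihood estimation, offering a statistically efficient alternative.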

Abstract

We study estimation in the low signal-to-noise ratio (SNR) regime for a broad class of Gaussian latent-variable models, including Gaussian mixtures and orbit recovery problems. We show that, in this regime, the generalized method-of-moments (GMoM) matches the first-order asymptotic efficiency of maximum likelihood. In particular, if the moment features are chosen up to the minimal local order required for identification and are weighted optimally, then the resulting GMoM estimator has the same leading asymptotic covariance as the maximum-likelihood estimator. Our analysis shows that, in low SNR, this equivalence is governed by a layered local geometry: different directions become informative at different moment orders, partitioning the space into layers with distinct SNR scalings. We prove that the observed Fisher information and the GMoM information operator admit matching layerwise expansions across these layers. As a consequence, in the low-SNR regime, GMoM provides a statistically efficient alternative to maximum likelihood, while preserving the computational advantages of moment-based estimation.

2605.30085 2026-05-29 cs.AI cs.CL cs.LG stat.ML

Conformal Certification of Reasoning Trace Prefixes

推理轨迹前缀的保形认证

Matt Y. Cheung, Ashok Veeraraghavan, Hanjie Chen, Guha Balakrishnan

发表机构 * Department of Electrical & Computer Engineering, Rice University（莱斯大学电气与计算机工程系）

AI总结提出CROP方法，通过保形校准选择阈值，返回最长无错前缀，并控制错误包含概率，平衡保留有效推理与丢弃误导后缀。

Comments Code available at https://github.com/matthewyccheung/crop

AI中文摘要

语言模型推理轨迹很少是全有或全无；在关键错误发生之前，它们通常包含有效的中间步骤。现有的不确定性量化方法通常认证最终答案或整个响应，未能为顺序轨迹中可安全保留的比例提供统计保证。为了解决这个问题，我们引入了CROP（保形推理输出前缀），一种与验证器无关的校准程序，用于干净前缀认证。给定任何步骤级风险代理，CROP选择一个校准阈值，并返回其步骤风险代理保持低于该阈值的最长连续前缀，将未认证的后缀路由到下游审查或修复。假设可交换性，CROP严格控制了返回前缀包含注释错误的边际概率。在六个过程标记的推理数据集上，我们证明了标准步骤级指标（如AUROC）不能完全捕捉前缀效用，建议验证器应改为通过认证前缀长度进行评估。此外，CROP平衡了过度保留和不足保留，通过保留有效的中间推理同时丢弃误导后缀，提高了下游修复的准确性。最终，这项工作将前缀认证定位为过程监督、弃权和修复之间的严格、实用的桥梁。

英文摘要

Language model reasoning traces are rarely all-or-nothing; they frequently contain valid intermediate steps before a critical error occurs. Existing uncertainty quantification methods typically certify final answers or entire responses, failing to provide statistical guarantees for the proportion of a sequential trace that can be safely retained. To address this, we introduce CROP (Conformal Reasoning Output Prefixes), a verifier-agnostic calibration procedure for clean-prefix certification. Given any step-level risk proxy, CROP selects a calibrated threshold and returns the longest contiguous prefix whose step risk proxies remain below it, routing the uncertified suffix for downstream review or repair. Assuming exchangeability, CROP rigorously controls the marginal probability that the returned prefix contains an annotated error. Across six process-labeled reasoning datasets, we demonstrate that standard step-level metrics such as AUROC do not fully capture prefix utility, suggesting verifiers should instead be evaluated by certified prefix length. Furthermore, CROP balances over- and under-withholding, improving downstream repair accuracy by preserving valid intermediate reasoning while discarding misleading suffixes. Ultimately, this work positions prefix certification as a rigorous, practical bridge between process supervision, abstention, and repair.
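
The calibrate-then-truncate idea in the abstract can be illustrated with a generic split-conformal sketch. This is not the authors' implementation: the function names, the choice of conformal score (the largest step risk up to the first annotated error), and the strict "below the threshold" rule are all assumptions made for illustration. Under exchangeability of calibration and test traces, picking the floor(alpha * (n + 1))-th smallest calibration score as the threshold keeps the probability that the returned prefix contains the first annotated error at or below alpha.

```python
import math

def error_score(risks, first_error):
    """Largest step risk up to and including the first annotated error.

    A prefix cut at threshold t contains the error exactly when this
    score is strictly below t. Traces without an error get +inf.
    (Hypothetical score, assumed for this sketch.)
    """
    if first_error is None:
        return math.inf
    return max(risks[: first_error + 1])

def calibrate_threshold(cal_traces, alpha):
    """cal_traces: list of (step_risks, first_error_index_or_None)."""
    scores = sorted(error_score(r, e) for r, e in cal_traces)
    k = math.floor(alpha * (len(scores) + 1))
    if k == 0:
        # Too few calibration traces for this alpha: certify nothing.
        return -math.inf
    return scores[k - 1]

def certified_prefix_length(risks, threshold):
    """Number of leading steps whose risk stays strictly below the threshold."""
    kept = 0
    for r in risks:
        if r >= threshold:
            break
        kept += 1
    return kept

# Toy calibration set: (step risks, index of first annotated error or None).
cal = [
    ([0.1, 0.2, 0.9], 2),
    ([0.1, 0.3], None),
    ([0.5, 0.6], 0),
    ([0.2, 0.8, 0.1], 1),
]
t = calibrate_threshold(cal, alpha=0.5)
kept = certified_prefix_length([0.1, 0.2, 0.85, 0.1], t)
```

The steps after position `kept` form the uncertified suffix, which the abstract says is routed to downstream review or repair. A sharper step-level risk proxy raises the scores of erroneous steps relative to clean ones, which lets the same alpha certify longer prefixes.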

2605.30072 2026-05-29 stat.ME

Credible rectangles for high-dimensional posterior comparison

高维后验比较的可信矩形

Alice Chevaux, Julyan Arbel, Guillaume Kon Kam King, Sophie Achard

AI总结提出一种贝叶斯框架，通过构建和比较后验分布的可信超矩形，实现脑连接图分析中的不确定性量化与比较，并提供高维可扩展算法和理论保证。

Comments 35 pages, 4 figures

AI中文摘要

我们提出了一种用于脑连接图分析中不确定性量化和比较的贝叶斯框架。标准的基于图的方法通常依赖于相关矩阵的点估计，忽视了从有限数据中进行高维估计所引入的不确定性。我们的方法构建并比较从后验分布导出的可信超矩形，为个体水平推断和纵向监测提供了可解释的工具。我们开发了在高维中估计这些区域的可扩展算法，并在静息态fMRI数据的逆Wishart模型中建立了理论保证，包括相关矩阵的Bernstein--von Mises定理和贝叶斯族系错误率的控制。所提出的框架能够在保持联合依赖结构的同时，从全局和局部两个层面原则性地检测显著的连接差异。在合成数据集上，该方法与多重检验程序相比表现出竞争性能，同时它还促进了单个患者两次不同扫描的直接比较，这是文献中目前缺失的能力。我们利用这一新颖性在真实数据集上提高了可解释性。除了fMRI数据，该方法为高维依赖环境中的比较问题提供了一个通用框架。

构建传染病建模中的接触矩阵和连通性矩阵

Xiahui Li, Dongni Zhang, Neha Bansal, Jessica R. E. Bridgen, Chris Jewell, Emma McBryde, Glenn Marion, Emily Nixon, Philip D. O'Neill, David J. Pascall, Lorenzo Pellis, Simon E. F. Spencer, Panayiota Touloupou, Lloyd Chapman, Ben Swallow

AI总结本文综述了用于构建接触矩阵的数据类型以及将不确定性和异质性纳入矩阵的方法，并指出了未来研究方向。

2605.29961 2026-05-29 stat.ME

Modifying causal models to distinguish between transient and lasting causal effects

修改因果模型以区分瞬时和持久因果效应

Russell Steele, Naftali Weinberger, Tess Baker, Ian Shrier

AI总结本文提出一种基于系统和状态的方法，通过定义新的零效应概念来区分时间变化系统中的瞬时和持久因果效应。

Comments 18 pages, 7 figures

AI中文摘要

本文考虑如何对随时间观察的结果和暴露的因果模型中的干预效应进行分类。首先，我们展示了在时变框架中，潜在结果和因果有向无环图的最常见用法在捕捉所有可能干预方面的局限性，特别是在关键问题涉及维持或改变平衡行为的干预时。其次，我们采用基于系统和状态的方法，而不是基于测量的方法，来识别因果参数。特别是，我们讨论了关于系统平衡的假设以及干预对该平衡的影响如何允许更具体的因果解释，并阐明设计和分析的目标。第三，我们展示了识别时变系统因果参数的能力如何取决于测量系统状态的时间点的选择。我们通过提出一种新颖的零效应版本来解决这个问题，该版本旨在区分瞬时和持久因果效应。

英文摘要

This paper considers how to classify the effects of interventions in causal models for outcomes and exposures observed over time. First, we demonstrate the limitations of the most common uses of potential outcomes and causal directed acyclic graphs for capturing all possible interventions in a time-varying framework, particularly in problems where the key question concerns interventions to maintain or change equilibrium behaviour. Second, we adopt a system- and state-based approach rather than a measurement-based approach to identify the causal parameters. In particular, we discuss how assumptions about the system's equilibrium and the effects of interventions on that equilibrium can allow for more specific causal interpretations and clarify the goals of design and analysis. Third, we show how the ability to identify the causal parameters of a time-varying system depends on the selection of timepoints for measuring the system's states. We address this by proposing a novel version of the null effect, which is designed to distinguish between transient and lasting causal effects.

拓扑稳定性指数：一种基于方差的持久性条形码度量

Joris Kirchner, Ioannis Diamantis

AI总结提出拓扑稳定性指数（TSI），一种基于方差的持久性条形码标量度量，量化持久性寿命的离散程度，并建立其基本性质及与Rényi熵的联系。

Comments 31 pages, 14 figures

AI中文摘要

我们引入了拓扑稳定性指数（TSI），一种基于方差的持久性条形码标量度量，用于量化持久性寿命的离散程度。与仅依赖于归一化权重的持久熵不同，TSI捕获绝对变异性，并对异质特征尺度敏感。我们建立了TSI的基本性质，包括其缩放行为、在寿命平移下的不变性以及在插入和删除条形下的显式更新公式。我们还考虑了一种互补的一阶矩型量——拓扑信号指数（TSigI），它捕获持久性寿命的典型尺度，并与TSI一起提供额外的可解释性。我们进一步引入了一个归一化版本$cv\text{TSI}$，它是尺度不变的，并且与二阶Rényi熵有显式的代数关系。特别地，$cv\text{TSI}$是碰撞概率$\sum_i p_i^2$的仿射函数，因此是Rényi熵的单调重参数化，为拓扑数据分析中基于方差和基于熵的摘要提供了直接联系。在合成数据和随机时间序列上的数值实验表明，TSI捕获了与熵互补的结构变异性：它对确定性趋势相对不敏感，而对随机波动和持久性幅度的变化响应强烈。

英文摘要

We introduce the Topological Stability Index (TSI), a variance-based scalar measure for persistence barcodes that quantifies the dispersion of persistence lifetimes. Unlike persistent entropy, which depends only on normalized weights, the TSI captures absolute variability and is sensitive to heterogeneous feature scales. We establish fundamental properties of the TSI, including its scaling behavior, invariance under lifetime translation, and explicit update formulas under insertion and deletion of bars. We also consider a complementary first-moment-type quantity, the Topological Signal Index (TSigI), which captures the typical scale of persistence lifetimes and provides additional interpretability alongside the TSI. We further introduce a normalized version, $cv\text{TSI}$, which is scale invariant and admits an explicit algebraic relation to the Rényi entropy of order two. In particular, $cv\text{TSI}$ is an affine function of the collision probability $\sum_i p_i^2$, and therefore a monotone reparametrization of the Rényi entropy, providing a direct link between variance-based and entropy-based summaries in topological data analysis. Numerical experiments on synthetic data and stochastic time series demonstrate that the TSI captures structural variability complementary to entropy: it is relatively insensitive to deterministic trends, while responding strongly to stochastic fluctuations and variations in persistence magnitude.
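
The link between a variance-based summary and the order-two Rényi entropy can be checked numerically. The sketch below rests on assumptions that the abstract does not spell out: the TSI is taken to be the population variance of the lifetimes, and the normalized version is taken to be the squared coefficient of variation. With $p_i = \ell_i / \sum_j \ell_j$ and $n$ bars, that choice gives $\mathrm{Var}/\mathrm{mean}^2 = n \sum_i p_i^2 - 1$, which is affine in the collision probability for fixed $n$. The function names are hypothetical.

```python
import math

def lifetimes(barcode):
    """Persistence lifetimes (death - birth) of a list of (birth, death) bars."""
    return [death - birth for birth, death in barcode]

def tsi(barcode):
    """Assumed definition: population variance of the lifetimes."""
    ls = lifetimes(barcode)
    mean = sum(ls) / len(ls)
    return sum((l - mean) ** 2 for l in ls) / len(ls)

def squared_cv(barcode):
    """Assumed normalized version: variance divided by the squared mean lifetime."""
    ls = lifetimes(barcode)
    mean = sum(ls) / len(ls)
    return tsi(barcode) / mean ** 2

def collision_probability(barcode):
    """Sum of squared normalized lifetimes p_i = l_i / sum_j l_j."""
    ls = lifetimes(barcode)
    total = sum(ls)
    return sum((l / total) ** 2 for l in ls)

def renyi2(barcode):
    """Rényi entropy of order two of the normalized lifetimes."""
    return -math.log(collision_probability(barcode))

bars = [(0.0, 1.0), (0.0, 2.0), (1.0, 4.0)]  # lifetimes 1, 2, 3
n = len(bars)
lhs = squared_cv(bars)
rhs = n * collision_probability(bars) - 1.0   # affine in the collision probability
```

Because the collision probability equals `exp(-renyi2(bars))`, the normalized quantity under these assumptions is a decreasing function of the order-two Rényi entropy for a fixed number of bars. Lengthening every bar by the same constant leaves `tsi` unchanged, which matches the translation invariance mentioned in the abstract.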

2605.29836 2026-05-29 cs.LG cs.AI stat.ML

CB-SLICE: Concept-Based Interpretable Error Slice Discovery

CB-SLICE: 基于概念的可解释错误切片发现

Yael Konforti, Mateo Espinosa Zarlenga, Elaf Almahmoud, Mateja Jamnik

发表机构 * Department of Computer Science and Technology, University of Cambridge, Cambridge, UK（计算机科学与技术系，剑桥大学，剑桥，英国）； Trinity College, University of Oxford, Oxford, UK（牛津大学三一学院，牛津，英国）； Cambridge Institute for Technology and Humanity, Cambridge, UK（剑桥技术与人类研究所，剑桥，英国）

AI总结提出CB-SLICE方法，利用概念瓶颈模型的概念预测失败来发现错误切片，并通过关键词概念解释失败模式，优于现有方法。

Comments 20 pages, 7 figures, 12 tables, to be published at Proceedings of the 43rd International Conference on Machine Learning (ICML 2026)

AI中文摘要

尽管平均性能强劲，深度学习模型在特定人群组（称为错误切片）上常表现出系统性错误。识别这些组及其失败的根本原因对于模型调试和偏差缓解至关重要。然而，现有的错误切片发现方法（SDMs）通常生成与模型推理过程脱节的解释，因此只能近似潜在错误源，可能不准确。我们通过利用概念瓶颈模型（CBMs）来解决这一局限，其预测直接依赖于人类可理解的语义概念。由于CBM中下游任务失败通常源于概念预测错误，概念表示为错误切片识别提供了强有力的候选，提供了直接关联错误源的细粒度解释。基于这一见解，我们引入CB-SLICE，一种基于概念的SDM，它将共享概念预测失败的样本分组，并识别每个切片失败模式中最关键的关键词概念。在多个基准测试中，我们展示了CB-SLICE在发现已知偏差方面优于最先进方法，同时提供更丰富、更忠实的模型错误解释。

英文摘要

Despite strong average-case performance, deep learning models often exhibit systematic errors on specific population groups, known as error slices. Identifying these groups and the root causes of their failures is critical for model debugging and bias mitigation. However, existing error Slice Discovery Methods (SDMs) typically generate explanations disconnected from the model's inference process; these explanations only approximate the underlying error source and may be inaccurate. We address this limitation by leveraging Concept Bottleneck Models (CBMs), whose predictions are directly dependent on human-understandable semantic concepts. Since downstream task failures in CBMs commonly arise from concept mispredictions, concept representations provide a strong candidate for error slice identification, offering fine-grained explanations directly linked to the error source. Building on this insight, we introduce CB-SLICE, a concept-based SDM that groups samples with shared concept prediction failures and identifies the keyword concepts most responsible for each slice's failure mode. Across multiple benchmarks, we show that CB-SLICE outperforms state-of-the-art methods in uncovering well-known biases while providing richer and more faithful explanations of model errors.

2605.29830 2026-05-29 math.ST math.PR stat.AP stat.TH

A Multi-factorial Innovation Model with Feature Interaction

具有特征交互的多因素创新模型

Giacomo Aletti, Irene Crimaldi, Andrea Ghiglietti

AI总结提出一个基于印度自助餐过程的多因素创新模型，引入特征交互机制，分析参数对渐近行为的影响，并建立强律和中心极限定理。

AI中文摘要

我们引入了一个用于多因素创新的印度自助餐类型模型，其中每个到达的智能体可能同时表现出先前观察到的特征和新特征。新特征的数量遵循幂律行为，而选择旧特征的概率结合了自强化（取决于特征特定的流行度）和平均场交互项（取决于所有观察到的特征的平均流行度）。该模型由通常的创新参数（质量、折扣和浓度）以及两个额外参数控制：一个控制强化相对于强制输入（趋向于零）的强度，另一个调节特征交互的强度。尽管观察到的不同特征总数的增长与三参数印度自助餐过程相同，但交互机制产生了新的渐近状态。对于聚合量，包括预测均值、每个智能体的平均特征数、平均包含概率和平均特征流行度，相变由折扣参数与强制输入权重的比较决定。对于特征特定量，根据交互水平与临界阈值的比较，出现进一步的转变。特别是，高交互导致特征特定包含概率的渐近同步。我们建立了强律和二阶渐近结果，包括在鞅波动与确定性或随机项竞争的状态下的中心极限定理。该分析依赖于递归随机动力学的新的一般结果，这些结果可能在本框架之外也有用。

英文摘要

We introduce an Indian-buffet-type model for multi-factorial innovation in which each arriving agent may exhibit both previously observed and new features. The number of new features follows a power-law behavior, while the probability of selecting an old feature combines self-reinforcement, depending on the feature-specific popularity, with a mean-field interaction term depending on the average popularity of all observed features. The model is governed by the usual innovation parameters (mass, discount and concentration), together with two additional parameters: one controlling the strength of reinforcement against a forcing input toward zero, and one regulating the intensity of feature interaction. Although the growth of the total number of distinct observed features has the same behavior as in the three-parameter Indian buffet process, the interaction mechanism produces new asymptotic regimes. For aggregate quantities, including the predictive mean, the averaged number of features per agent, the mean inclusion probability, and the mean feature popularity, the phase transition is determined by the comparison between the discount parameter and the weight of the forcing input. For feature-specific quantities, a further transition appears according to the comparison between the interaction level and a critical threshold. In particular, high interaction leads to an asymptotic synchronization of feature-specific inclusion probabilities. We establish strong laws and second-order asymptotic results, including central limit theorems in regimes where martingale fluctuations compete with deterministic or random terms. The analysis relies on novel general results for recursive stochastic dynamics, which may be useful beyond the present framework.

2605.29758 2026-05-29 stat.ME

Fisher's ideas and the design of field experiments in agronomy and plant breeding

Fisher的思想与农学和植物育种中的田间试验设计

Hans-Peter Piepho

AI总结回顾Fisher在实验设计中的关键思想，并联系当前农业田间试验中的系统设计、行列设计、多环境试验等方法。

Comments 31 pages, 2 tables

2605.29748 2026-05-29 stat.ML cs.LG

Instance-dependent Stochastic Lipschitz bandit

实例依赖的随机Lipschitz bandit

Marius Potfer, Vianney Perchet

发表机构 * Crest (Fairplay joint team)（Crest（Fairplay联合团队））； EDF R&D（EDF研发）； Criteo AI Lab（Criteo人工智能实验室）

AI总结针对Lipschitz bandit问题，提出一种基于水平集次优性间隙积分的算法，实现比传统缩放维度更优的实例依赖遗憾界。

AI中文摘要

我们研究Lipschitz bandit问题，其中学习器通过带噪声的点评估在域$\mathcal{X} \subset [0,1]^d$上顺序最大化未知的Lipschitz函数$f$。现有的遗憾界要么是最坏情况的，缩放为$\tilde\Theta\left(T^{(d+1)/(d+2)}\right)$，要么通过缩放维度$d_z$自适应，得到$\tilde\Theta\left(T^{(d_z+1)/(d_z+2)}\right)$。然而，这种基于缩放的保证仅是部分实例依赖的，因为它们仅依赖于近最优水平集的渐近增长，未能捕捉$f$的更精细结构性质。我们提供了一种分析和算法，通过$f$在其水平集上的次优性间隙的积分来刻画遗憾。这产生了适应水平集局部增长（而不仅仅是其渐近行为）的遗憾界。作为推论，当最大化者集合的维度$d^\star>0$时，我们获得了阶为$\tilde{\mathcal{O}}\left(T^{(d_z+1)/(\max(d_z,d^\star)+2)}\right)$的改进自适应速率，在该情况下严格优于经典的缩放界。最后，我们将分析扩展到完全信息设置（Lipschitz专家），并展示了如何放宽一些正则性假设。

英文摘要

We study the Lipschitz bandit problem, where a learner sequentially maximizes an unknown Lipschitz function $f$ over a domain $\mathcal{X} \subset [0,1]^d$ using noisy pointwise evaluations. Existing regret bounds are either worst-case, scaling as $\tilde\Theta\left(T^{(d+1)/(d+2)}\right)$, or adaptive via the zooming dimension $d_z$, yielding $\tilde\Theta\left(T^{(d_z+1)/(d_z+2)}\right)$. However, such zooming-based guarantees are only partially instance-dependent, as they depend solely on the asymptotic growth of near-optimal level sets and fail to capture finer structural properties of $f$. We provide an analysis and an algorithm that characterizes the regret through integrals of the suboptimality gap of $f$ over its level sets. This yields regret bounds that adapt to the local growth of level sets, rather than only their asymptotic behavior. As a corollary, when the set of maximizers has dimension $d^\star>0$, we obtain improved adaptive rates of order $\tilde{\mathcal{O}}\left(T^{(d_z+1)/(\max(d_z,d^\star)+2)}\right)$, strictly improving over classical zooming bounds in this regime. Finally, we extend our analysis to the full-information setting (Lipschitz experts) and show how some of the regularity assumptions can be relaxed.

2605.29702 2026-05-29 stat.ME

A Jensen-Shannon divergence based $k$--$NN$ algorithm for missing value imputation in compositional data

基于Jensen-Shannon散度的$k$--$NN$算法用于成分数据缺失值插补

Michail Tsagris, Connie Stewart, Abdulaziz Alenazi

AI总结提出一种基于Jensen-Shannon散度和Fréchet均值的$k$--$NN$非参数方法，用于成分数据缺失值插补，具有自适应性且适用于含零值数据。

Comments This is the preprint of the paper that was published in the Journal of Applied Statistics. https://www.tandfonline.com/doi/full/10.1080/02664763.2026.2677908

2605.29684 2026-05-29 cs.LG cond-mat.dis-nn stat.ML

Kernel Renormalization in Bayesian Deep Neural Networks: the Equivalent Wishart Ansatz in the Proportional Regime

贝叶斯深度神经网络中的核重整化：比例机制下的等效Wishart假设

Paolo Baglioni, Christian Keup, Vincenzo Zimbardo, Rosalba Pacelli, Alessandro Vezzani, Raffaella Burioni, Pietro Rotondo

发表机构 * INFN, Sezione di Milano Bicocca（意大利国家核物理研究所（INFN）米兰比可卡分部）； INFN, Gruppo Collegato di Parma（意大利国家核物理研究所（INFN）帕尔马联合小组）； Dipartimento di Scienze Matematiche, Fisiche e Informatiche, Università degli Studi di Parma（帕尔马大学数学、物理和信息科学系）； Istituto dei Materiali per l’Elettronica ed il Magnetismo (IMEM-CNR), Parco Area delle Scienze（电子与磁性材料研究所（IMEM-CNR），帕尔马科技园区）

AI总结针对固定深度L的贝叶斯多层感知机，提出等效Wishart假设来捕捉层次经验核的随机涨落，通过大偏差分析得到重正化NNGP核描述，在比例极限下用至多L个标量序参数刻画表示学习，并扩展到CNN揭示局部核重整化机制。

Comments 45 pages, 21 figures

AI中文摘要

训练集大小$P$和深度神经网络宽度$N$以相同速率增长的比例宽度极限，已被深入研究用于浅层单隐藏层网络。然而，将这些非微扰结果从浅层架构扩展到深度非线性网络已被证明非常具有挑战性。在这里，我们提出了一种有效的近似方法，用于预测固定深度$L$的贝叶斯多层感知机（MLP）在任意高维数据上的泛化性能。我们提出了一个等效Wishart假设，以捕捉MLP层次经验核的主要随机涨落。这使我们能够在比例极限下对MLP的配分函数进行大偏差分析，并用重正化NNGP核表示。在这种描述中，即使比例极限下的强表示学习也由至多$L$个标量序参数编码，这些参数自洽确定。将该方法扩展到卷积架构（CNN），我们识别出一种层次局部核重整化机制，该机制允许量化CNN中由于有限宽度效应导致的大宽度核的更复杂数据相关变换。我们在经典基准数据集上，针对深度$L \sim O(10)$和$P\sim O(10^3)$的有限深度神经网络的贝叶斯后验采样实验测试了我们的有效理论，发现总体吻合良好，同时存在两种不同类型的系统性偏差。

英文摘要

The scaling limit where both the size of the training set $P$ and the width $N$ of a deep neural network grow at the same rate, the so-called proportional-width regime, has been intensely studied for shallow, single-hidden-layer networks. However, extending these non-perturbative results from shallow architectures to deep non-linear networks has proven very challenging. Here we present an effective approximate approach to predict the generalization performance of Bayesian multi-layer perceptrons (MLPs) of fixed depth $L$ on arbitrary high-dimensional data. We propose an equivalent Wishart Ansatz to capture the dominant stochastic fluctuations of the hierarchical empirical kernels of MLPs. This allows us to perform a large deviation analysis for the partition function of MLPs in the proportional limit, expressed in terms of a renormalized NNGP kernel. In this description, even strong representation learning in the proportional limit is encoded in at most $L$ scalar order parameters, determined self-consistently. Extending the approach to convolutional architectures (CNNs), we identify a hierarchical local kernel renormalization mechanism, which allows us to quantify more complex data-dependent transformations of the large-width kernel in CNNs due to finite-width effects. We test our effective theory against sampling experiments from the Bayesian posterior of finite deep neural networks with depths $L \sim O(10)$ and $P\sim O(10^3)$ on classic benchmark datasets, finding overall very good agreement together with two distinct types of systematic deviations.

2605.29645 2026-05-29 cs.LG cs.AI stat.ML

The Sample Complexity of Multiclass and Sparse Contextual Bandits

多类别和稀疏上下文赌博机的样本复杂度

Liad Erez, Fan Chen, Alon Cohen, Tomer Koren, Yishay Mansour, Shay Moran, Alexander Rakhlin

发表机构 * Tel Aviv University（特拉维夫大学）； Massachusetts Institute of Technology（麻省理工学院）； Google Research Tel Aviv（谷歌研究院特拉维夫）； Technion—Israel Institute of Technology（以色列理工学院）

AI总结针对随机i.i.d.上下文赌博机，提出基于决策估计系数和低方差探索的算法，在稀疏奖励下实现接近最优的样本复杂度，并匹配下界。

AI中文摘要

我们研究随机i.i.d.设置下的上下文赌博机，其中学习器观察来自未知分布的上下文，从有限集合$A$中选择动作，并旨在基于赌博机反馈从给定类别中识别近似最优策略。受零一奖励的赌博机多类别分类启发，我们关注“$s$-稀疏”设置，其中对于每个上下文，奖励向量的$L_1$范数至多为$s \ll |A|$。我们的主要结果是设计算法，以高概率输出一个相对于策略类$Π$的$ε$-最优策略，使用$\tilde{O} ((s/ε^2 + |A|/ε)\log |Π|/δ)$个样本。我们将此界推广到一般Natarajan类，并补充了匹配的下界（对数因子内），从而弥合了先前工作（Erez等人，2024, 2025）留下的巨大差距，后者额外增加了$Θ(|A|^9)$依赖。我们通过两种互补方法获得这些结果。首先，我们从具有结构化观测的上下文决策角度分析上下文赌博机，设计了一种探索-优化算法，其样本复杂度由“决策估计系数”（DEC；Foster等人，2021, 2022）控制。我们证明，在$s$-稀疏奖励下，诱导的模型类具有随$s$缩放的尖锐DEC界，直接产生最优速率。由于这种方法主要是信息论性的，并涉及求解复杂的min-max优化问题，我们还开发了第二种更专门的算法方法，基于低方差探索技术。这种方法产生了具体、易处理的算法，并自然地扩展到上下文组合半赌博机，为赌博机多类别列表分类提供了改进的样本复杂度保证。

英文摘要

We study contextual bandits in the stochastic i.i.d. setting, where a learner observes contexts drawn from an unknown distribution, selects actions from a finite set $A$, and aims to identify an approximately optimal policy from a given class based on bandit feedback. Motivated by bandit multiclass classification with zero-one rewards, we focus on the $s$-sparse setting in which, for every context, the reward vector has $L_1$-norm at most $s \ll |A|$. Our main result is the design of algorithms that, with high probability, output an $ε$-optimal policy compared to policy class $Π$ using $\tilde{O} ((s/ε^2 + |A|/ε)\log |Π|/δ)$ samples. We extend this bound to general Natarajan classes and complement it with a matching lower bound (up to logarithmic factors), thereby closing a substantial gap left by prior work (Erez et al., 2024, 2025), which incurred an additional $Θ(|A|^9)$ dependence. We obtain these results via two complementary approaches. First, we analyze contextual bandits through the lens of contextual decision making with structured observations, designing an exploration-by-optimization algorithm whose sample complexity is governed by the decision-estimation coefficient (DEC; Foster et al., 2021, 2022). We show that, with $s$-sparse rewards, the induced model class admits a sharp DEC bound that scales with $s$ and directly yields the optimal rate. Since this approach is largely information-theoretic and involves solving complex min-max optimization problems, we also develop a second, more specialized algorithmic method based on a low-variance exploration technique. This approach leads to concrete, tractable algorithms and naturally extends to contextual combinatorial semi-bandits, leading to improved sample complexity guarantees for bandit multiclass list classification.

2605.29642 2026-05-29 stat.ML cs.IT cs.LG math.IT

Matching Rates and Optimal Allocation for Federated Probe-Logit Distillation under Heterogeneous Bandwidth Budgets

异构带宽预算下的联邦探针-逻辑蒸馏匹配率与最优分配

Prasanjit Dubey, Xiaoming Huo

发表机构 * H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, U.S.A（H. Milton Stewart工业与系统工程学院，佐治亚理工学院，亚特兰大，GA 30332，美国）

AI总结针对联邦探针-逻辑蒸馏（FPLD）中带宽项速率紧性及异构节点带宽分配问题，提出匹配下界、多轮改进方案及闭合形式最优分配规则。

AI中文摘要

在联邦语言建模中，$K$个节点各自持有$n$个样本，但无法合并数据或交换全精度梯度或权重。我们研究当每个节点在公共探针集上每次查询最多上传$B$比特时，对$V$个令牌上的条件分布进行估计的极小极大速率。在联邦探针-逻辑蒸馏（FPLD）中，每个节点在探针集上传输一个标量量化的逻辑向量，聚合器蒸馏出一个全局参数化学生模型。先前的工作（Dubey and Huo, 2026）建立了高概率KL速率$O(d/(Kn) + ρ\sqrt{V \log V / m} + K^{-1} \cdot 2^{-2B/V})$加上优化松弛项，其中带宽项采用迹锐化形式。该带宽项速率是否紧致，以及上界如何推广到异构每节点带宽，仍是开放问题。我们填补了这两个空白。首先，抖动FPLD构造在非退化条件下具有匹配的单轮下界$Ω(K^{-1} \cdot 2^{-2B/V})$，将带宽轴速率确定为$Θ(K^{-1} \cdot 2^{-2B/V})$。使用嵌套/缩放残差量化器的$T$轮顺序细化达到$O(K^{-1} \cdot 2^{-2TB/V})$；对于任意$T > 1$，原始FPLD的与$T$无关的带宽项是次优的。其次，我们建立了每节点预算$B_i$的异构带宽上界，并配以闭合形式的最优分配$B_i^* = B_{\mathrm{tot}}/K + (V/2) \log_2(w_i / \bar{w}_g)$，这是一种对数倾斜的注水规则，是失真率优化中反向注水的每节点类比。一种即插即用自适应变体通过短预热阶段估计权重，并达到$1 + O(\sqrt{\log(K/δ)/(m T_0)})$的相对次优性。合成n-gram模拟证实经验KL被上界和下界所界定，并且在异构裁剪下最优分配严格优于均匀和逆权重基线。

英文摘要

In federated language modeling, $K$ nodes each hold $n$ samples but cannot pool data or exchange full-precision gradients or weights. We study the minimax rate at which a conditional distribution over $V$ tokens can be estimated when each node may upload at most $B$ bits per query in a public probe set. In federated probe-logit distillation (FPLD), each node transmits a scalar-quantized logit vector on the probe set, and an aggregator distills a global parametric student. Prior work (Dubey and Huo, 2026) establishes a high-probability KL rate $O(d/(Kn) + ρ\sqrt{V \log V / m} + K^{-1} \cdot 2^{-2B/V})$ plus optimization slack, with the bandwidth term in its trace-sharpened form. Whether this bandwidth-term rate is tight, and how the upper bound generalizes to heterogeneous per-node bandwidths, are left open. We close both gaps. First, the dithered FPLD construction has a matching single-round lower bound $Ω(K^{-1} \cdot 2^{-2B/V})$ under non-degeneracy, pinning the bandwidth-axis rate at $Θ(K^{-1} \cdot 2^{-2B/V})$. $T$-round sequential refinement with nested/scaled residual quantizers achieves $O(K^{-1} \cdot 2^{-2TB/V})$; vanilla FPLD's $T$-independent bandwidth term is suboptimal for every $T > 1$. Second, we establish a heterogeneous-bandwidth upper bound for per-node budgets $B_i$, paired with a closed-form optimal allocation $B_i^* = B_{\mathrm{tot}}/K + (V/2) \log_2(w_i / \bar{w}_g)$, a log-tilted water-filling rule that is the per-node analogue of reverse water-filling for distortion-rate optimization. A plug-in adaptive variant estimates the weights from a short warm-up phase and attains $1 + O(\sqrt{\log(K/δ)/(m T_0)})$ relative suboptimality. Synthetic n-gram simulations confirm that empirical KL is bracketed by the upper and lower bounds and that the optimal allocation strictly dominates uniform and inverse-weighted baselines under heterogeneous clipping.
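
The closed-form allocation $B_i^* = B_{\mathrm{tot}}/K + (V/2) \log_2(w_i / \bar{w}_g)$ can be evaluated directly. The sketch below assumes that $\bar{w}_g$ denotes the geometric mean of the node weights; the abstract does not define it, but under that reading the log-tilt terms sum to zero and the per-node budgets add up to $B_{\mathrm{tot}}$. The function name is hypothetical, and the sketch neither rounds to whole bits nor clips negative budgets, both of which a practical scheme would need to handle.

```python
import math

def optimal_allocation(weights, total_bits, vocab_size):
    """Per-node bit budgets B_i* = B_tot/K + (V/2) * log2(w_i / w_bar_g).

    Assumption: w_bar_g is the geometric mean of the positive weights w_i.
    """
    k = len(weights)
    log2_geo_mean = sum(math.log2(w) for w in weights) / k
    uniform_share = total_bits / k
    return [
        uniform_share + (vocab_size / 2.0) * (math.log2(w) - log2_geo_mean)
        for w in weights
    ]

# Two nodes, the second weighted four times as heavily as the first.
alloc = optimal_allocation([1.0, 4.0], total_bits=100.0, vocab_size=10)
```

With equal weights the rule reduces to the uniform split $B_{\mathrm{tot}}/K$. In this toy call the heavier node receives 55 bits and the lighter one 45, so each doubling of a node's weight relative to the geometric mean shifts $V/2$ bits toward it.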

2605.29611 2026-05-29 stat.ME stat.CO

Hierarchical forecasting: The role of information

分层预测：信息的作用

Minh Nguyen, Farshid Vahid, Shanika L Wickramasuriya

AI总结本文提出信息组合（IComb）方法，通过结合不同信息集的基础预测来改进分层时间序列预测，并证明信息组合与聚合约束对预测改进的独立贡献。

AI中文摘要

在分层预测中，预测协调过程将一组不满足实际数据中分层聚合约束的“基础”或“原始”预测，转换为一组满足这些约束的“一致”预测。学术文献提供了大量模拟证据和实际案例，证明预测协调在改进分层时间序列预测方面的价值。这种改进归因于聚合约束的施加。然而，这些证据来源于每个使用不同信息集（通常是每个时间序列对应的单变量信息集）生成的基础预测。由于协调算法结合了预测，很难确定改进在多大程度上是由于约束的施加，还是由于不同预测所携带信息的组合。在本文中，我们证明当基础预测基于不同的信息集且历史数据可用时，即使预测已经一致，通过组合每个预测携带的信息也有改进这些预测的空间。我们提出了一种新方法，称为信息组合（IComb）方法，该方法在协调过程中结合预测的信息内容。该方法基于回归，可以使用现有的惩罚回归包实现。我们提供模拟证据来说明信息集与聚合约束在分层时间序列预测中的不同作用。最后，我们将我们的方法应用于文献中先前使用的数据集，并证明与传统方法相比，它取得了更优的结果。

具有多项式改进近似因子的 $2 \rightarrow q$ 范数算法及其应用

Samuel B. Hopkins, Stefan Tiegel

发表机构 * MIT（麻省理工学院）

AI总结本文针对 $q>2$ 时的 $2 \rightarrow q$ 范数，提出了首个多项式时间近似算法，其近似因子在多项式级别上优于基线 $d^{1/4}$，例如 $q=4$ 时达到 $d^{1/8}$，并构造了平方和证书，从而改进了鲁棒均值估计、协方差估计、回归和聚类等问题的算法。

Comments v2 corrected minor typos

AI中文摘要

矩阵 $X \in \mathbb{R}^{n \times d}$ 的 $2 \rightarrow q$ 范数定义为 $\lVert X \rVert_{2 \rightarrow q} = \sup_{\lVert v \rVert_2 = 1} \lVert Xv \rVert_q$。我们针对 $q > 2$（即超收缩设置）给出了该范数的多项式时间乘法近似算法。该问题要么直接对应，要么与组合优化和近似难度（例如小集扩张）、量子信息（例如最佳可分态）以及算法统计学中长期存在的开放问题密切相关。关于在多项式时间内能为此问题达到何种近似因子，我们所知甚少，尽管此类近似具有重要的下游影响。Barak、Brandão、Harrow、Kelner、Steurer 和 Zhou 表明，假设指数时间假设（FOCS'12），没有多项式时间算法能实现优于 $2^{\sqrt{\log n}}$ 的近似因子。另一方面，一个简单的谱算法给出了 $d^{1/4}$ 的基线近似。据我们所知，我们给出了首个在多项式因子内超越该基线的多项式时间近似算法。对于重要的特例 $q = 4$，它实现了 $d^{1/8}$ 的近似。所有先前的算法要么需要对 $X$ 附加假设，要么仅在 $n$ 较小时才能超越基线。此外，我们为 $2 \rightarrow q$ 范数构造了平方和证书。这直接改进了当数据仅满足 $q$ 阶矩有界时的鲁棒均值和协方差估计、鲁棒回归以及聚类算法。

英文摘要

The $2 \rightarrow q$ norm of a matrix $X \in \mathbb{R}^{n \times d}$ is defined as $\lVert X \rVert_{2 \rightarrow q} = \sup_{\lVert v \rVert_2 = 1} \lVert Xv \rVert_q$. We give polynomial-time multiplicative approximation algorithms for this norm when $q > 2$ (i.e. in the hypercontractive setting). This problem either directly captures or is closely related to long-standing open problems in combinatorial optimization and hardness of approximation (e.g. Small Set Expansion), quantum information (e.g. Best Separable State), and algorithmic statistics. Very little is known about what approximation factors we can achieve for this problem in polynomial time, even though such approximations have significant downstream consequences. Barak, Brandão, Harrow, Kelner, Steurer, and Zhou showed that no polynomial-time algorithm can achieve an approximation factor better than $2^{\sqrt{\log n}}$, assuming the Exponential Time Hypothesis (FOCS'12). On the other hand, a simple spectral algorithm gives a $d^{1/4}$-approximation as a baseline. We give, to the best of our knowledge, the first polynomial-time approximation algorithm beating this baseline by polynomial factors. For the important special case of $q = 4$ it achieves a $d^{1/8}$-approximation. All previous algorithms required additional assumptions on $X$, or only surpassed the baseline for small values of $n$. Moreover, we construct sum-of-squares certificates for the $2 \rightarrow q$ norm. This directly implies improved algorithms for robust mean and covariance estimation, robust regression, and clustering, when the data only satisfies a bound on its $q$-th moment.

2605.13986 2026-05-29 cs.LG stat.ML

TabPFN-3: Technical Report

TabPFN-3: 技术报告

Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Mihir Manium, Shi Bin Hoo, Magnus Bühler, Anurag Garg, Dominik Safaric, Jake Robertson, Benjamin Jäger, Simone Alessi, Adrian Hayler, Vladyslav Moroshan, Lennart Purucker, Philipp Singer, Alan Arazi, Julien Siems, Jan Hendrik Metzen, Georg Grab, Nick Erickson, Siyuan Guo, Eliott Kalfon, Simon Bing, David Salinas, Clara Cornu, Lilly Charlotte Wehrhahn, Diana Kriuchkova, Kursat Kaya, Lydia Sidhoum, Marie Salmon, Jerry Chen, Madelon Hulsebos, Yann LeCun, Samuel Müller, Bernhard Schölkopf, Sauraj Gambhir, Noah Hollmann, Frank Hutter

发表机构 * Prior Labs

AI总结本文提出TabPFN-3，通过扩展训练数据和优化推理，在表格数据上实现最先进性能，并支持时间序列、关系数据和表格文本数据。

AI中文摘要

表格数据支撑着科学和工业中大多数高价值预测问题，而TabPFN推动了该模态的基础模型革命。根据用户反馈设计，TabPFN-3在此基础上将最先进性能扩展到具有100万训练行的数据集，并大幅减少训练和推理时间。TabPFN-3完全基于我们先验的合成数据进行预训练，极大地推动了表格预测的前沿，并在时间序列、关系数据和表格文本数据上带来了实质性收益。在标准表格基准TabArena上，TabPFN-3的前向传播以显著优势优于所有其他模型（包括调优和集成基线），并在速度/性能前沿上占据帕累托优势。在更多样化的数据集上，TabPFN-3在多类数据集上排名第一，并在多达100万训练行和200个特征的数据集上击败了经过8小时调优的梯度提升树基线。TabPFN-3将测试时计算缩放引入表格基础模型。我们的API产品TabPFN-3-Plus（思考版）利用这一点，在TabArena上以超过200 Elo的优势击败所有非TabPFN模型，在最大数据子集上达到420 Elo，并且比AutoGluon 1.5 extreme快10倍，同时不使用LLM、真实数据、互联网搜索或除TabPFN之外的任何其他模型。TabPFN-3扩展了我们模型的能力，实现了对关系数据（在RelBenchV1上新的最先进基础模型）和表格文本数据（通过TabPFN-3-Plus在TabSTAR上达到最先进）的最先进预测；并改进了现有集成：专用检查点TabPFN-TS-3在时间序列基准fev-bench上排名第二，SHAP值计算速度提升高达120倍。TabPFN-3在实现这一性能的同时，比TabPFN-2.5快20倍。此外，减少的KV缓存和行分块技术使得在单个H100上以快速推理速度扩展到100万行。

英文摘要

Tabular data underpins most high-value prediction problems in science and industry, and TabPFN has driven the foundation model revolution for this modality. Designed with feedback from our users, TabPFN-3 builds on this foundation to scale state-of-the-art performance to datasets with 1M training rows and substantially reduce training and inference time. Pretrained exclusively on synthetic data from our prior, TabPFN-3 dramatically pushes the frontier of tabular prediction and brings substantial gains on time series, relational, and tabular-text data. On the standard tabular benchmark TabArena, a forward pass of TabPFN-3 outperforms all other models, including tuned and ensembled baselines, by a significant margin, and pareto-dominates the speed/performance frontier. On more diverse datasets, TabPFN-3 ranks first on datasets with many classes, and beats 8-hour-tuned gradient-boosted-tree baselines on datasets up to 1M training rows and 200 features. TabPFN-3 introduces test-time compute scaling to tabular foundation models. Our API offering TabPFN-3-Plus (Thinking) exploits this to beat all non-TabPFN models by over 200 Elo on TabArena, rising to 420 Elo on the largest data subset, and outperforms AutoGluon 1.5 extreme while being 10x faster, without using LLMs, real data, internet search or any other model besides TabPFN. TabPFN-3 extends the capabilities of our models, enabling SOTA prediction on relational data (new SOTA foundation model on RelBenchV1) and tabular-text data (SOTA on TabSTAR via TabPFN-3-Plus); and improves existing integrations: a specialized checkpoint, TabPFN-TS-3, ranks 2nd on the time-series benchmark fev-bench, and SHAP-value computation is up to 120x faster. TabPFN-3 achieves this performance while being up to 20x faster than TabPFN-2.5. In addition, a reduced KV cache and row-chunking scale to 1M rows on one H100 with fast inference speed.

2605.07596 2026-05-29 stat.ML cs.LG

A Refined Generalization Analysis for Extreme Multi-class Supervised Contrastive Representation Learning

极端多类监督对比表示学习的精细泛化分析

Nong Minh Hieu, Antoine Ledent

发表机构 * School of Computing and Information Systems, Singapore Management University（新加坡管理大学计算与信息系统学院）

AI总结针对对比表示学习在有限标注数据中构造元组导致依赖性的问题，提出改进的U-统计量分析，得到与类别数R同阶的样本复杂度，并设计新估计器在长尾分布下实现O(k)的样本复杂度。

Comments Accepted at ICML 2026

AI中文摘要

对比表示学习（CRL）在多个机器学习领域取得了强大的实证成功，但其理论样本复杂度仍然知之甚少。现有分析通常假设输入元组是独立同分布的，这一假设在大多数实际设置中被违反，因为对比元组是从有限标注数据池中构造的，导致元组之间存在依赖性。虽然最近有一项工作使用U-统计量分析这种学习设置以估计总体风险，但其中使用的技术要求每个类别的风险均匀集中，使得超额风险界限的规模为$ρ_{\min}^{-1/2}$，其中$ρ_{\min}$表示最稀有类别的概率。这种依赖在极端多类设置中可能过于悲观，因为存在许多尾部类别，它们对总体风险的贡献极小。我们的贡献有两方面。首先，我们改进了先前的工作，证明了一个样本复杂度与类别数$R$同阶的界限，无论类别分布如何。此外，我们制定了一个不同的估计器，捕捉风险跨类别的集中性，从而在极端多类学习场景中实现更尖锐的界限，特别是在类别分布为长尾的情况下。在类别分布的温和假设下，得到的样本复杂度为$\mathcal{O}(k)$，其中$k$是每个元组的样本数。

英文摘要

Contrastive Representation Learning (CRL) has achieved strong empirical success in multiple machine learning disciplines, yet its theoretical sample complexity remains poorly understood. Existing analyses usually assume that input tuples are independent and identically distributed, an assumption violated in most practical settings where contrastive tuples are constructed from a finite pool of labeled data, inducing dependencies among tuples. While one recent work analyzed this learning setting using U-Statistics to estimate the population risk, the techniques used therein require the risk of each class to concentrate uniformly, making excess risk bounds scale in the order of $ρ_{\min}^{-1/2}$ where $ρ_{\min}$ denotes the probability of the rarest class. Such a dependency can be overly pessimistic in extreme multi-class settings where there are many tail classes that contribute minimally to the overall population risk. Our contributions are two-fold. Firstly, we improve upon the previous work and prove a bound with a sample complexity of the same order as the number of classes $R$, regardless of the distribution over classes. Furthermore, we formulate a different estimator that captures the concentration of the risk across classes, enabling sharper bounds in extreme multi-class learning scenarios, especially where class distributions are long-tailed. Under mild assumptions on the class distributions, the resulting sample complexity is $\mathcal{O}(k)$ where $k$ is the number of samples per tuple.

2605.06355 2026-05-29 cs.LG stat.ML

Order-Agnostic Autoregressive Modelling with Missing Data

缺失数据下的顺序无关自回归建模

Ignacio Peis, Pablo M. Olmos, Jes Frellsen

发表机构 * Technical University of Denmark（丹麦技术大学）； Pioneer Centre for AI（先锋人工智能中心）； Universidad Carlos III de Madrid（马德里卡洛斯三世大学）

AI总结本文通过缺失数据视角重新审视顺序无关自回归模型，提出缺失感知训练框架，并利用其条件密度估计进行主动信息获取，在多个基准上优于传统插补方法。

详情

AI中文摘要

顺序无关自回归模型在深度生成建模中表现出色，但其在数据不完整情况下的应用尚未被充分探索。本文从缺失数据的角度重新审视这些模型。首先，我们证明它们在完全观测数据上的标准训练过程隐式地在完全随机缺失机制下进行插补，从而在高缺失率场景下实现了稳健的样本外插补性能。其次，我们提出了第一个原则性框架，用于在一般缺失机制下直接从不完整数据集中训练这些模型。第三，我们利用其摊销条件密度估计进行主动信息获取，即顺序选择对下游预测或推理最有信息量的缺失变量。在一系列真实世界基准测试中，我们的缺失感知顺序无关自回归模型（MO-ARM）持续优于已建立的插补基线。

英文摘要

Order-Agnostic autoregressive models have demonstrated strong performance in deep generative modeling, yet their use in settings with incomplete data remains largely unexplored. In this work, we reinterpret them through the lens of missing data. First, we show that their standard training procedure on fully observed data implicitly performs imputation under a missing completely at random mechanism, resulting in robust out-of-sample imputation performance in settings with high missingness. Second, we introduce the first principled framework for training them directly on incomplete datasets under general missingness mechanisms. Third, we leverage their amortized conditional density estimation to perform active information acquisition, i.e., sequentially selecting the most informative missing variables for downstream prediction or inference. Across a suite of real-world benchmarks, our Missingness-Aware Order-Agnostic Autoregressive Model (MO-ARM) consistently outperforms established imputation baselines.

2605.01665 2026-05-29 econ.EM stat.ME

Exact Likelihood Inference and Robust Filtering for Gauss-Cauchy Convolution Models

高斯-柯西卷积模型的精确似然推断与鲁棒滤波

Peter Reinhard Hansen, Chen Tong

AI总结本文推导了Voigt分布（高斯与柯西卷积）的解析表达式，用于重尾测量噪声建模，并基于此提出GCC滤波器，在状态空间模型中实现鲁棒滤波，实证中优于高斯、t分布等滤波器。

详情

AI中文摘要

高斯分布与柯西分布的卷积，即Voigt分布，广泛应用于光谱学，并为重尾测量噪声建模提供了自然框架。我们利用缩放互补误差函数推导了其密度、得分、海森矩阵、Fisher信息量和条件矩的解析表达式，从而无需数值卷积、有限差分导数或伪Voigt近似即可实现稳定的最大似然估计。潜在高斯分量的条件期望由再下降位置得分控制，因此极端观测值会被自动折扣而非传播。该结构导致了用于具有高斯潜在动态和Voigt测量误差的状态空间模型的高斯-柯西卷积（GCC）滤波器，其中Masreliez高斯预测近似保留了Voigt预测误差密度。在对科技板块SPDR基金的对数已实现波动率的应用中，GCC滤波器将持久的潜在变化与瞬时的测量噪声分离，并在所考虑的高斯、Student-t、Huber及相关滤波规格中实现了最高的预测误差准则。

英文摘要

The convolution of a Gaussian and a Cauchy distribution, known as the Voigt distribution, is widely used in spectroscopy and provides a natural framework for modeling heavy-tailed measurement noise. We derive analytical expressions for its density, score, Hessian, Fisher information, and conditional moments using the scaled complementary error function, enabling stable maximum likelihood estimation without numerical convolution, finite-difference derivatives, or pseudo-Voigt approximations. The conditional expectation of the latent Gaussian component is governed by a redescending location score, so extreme observations are automatically discounted rather than propagated. This structure leads to the Gauss-Cauchy Convolution (GCC) filter for state-space models with Gaussian latent dynamics and Voigt measurement errors, where the Masreliez Gaussian prediction approximation preserves a Voigt prediction-error density. In an application to log realized volatility for the Technology Select Sector SPDR Fund, the GCC filter separates persistent latent variation from transient measurement noise and attains the highest implemented prediction-error criterion among the Gaussian, Student-$t$, Huber, and related filtering specifications considered.
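
For orientation, the closed form that the abstract alludes to can be written with the Faddeeva function $w$, which is the scaled complementary error function evaluated at a complex argument. The notation below ($\mu$ for location, $\sigma$ for the Gaussian scale, $\gamma$ for the Cauchy scale) is an assumption chosen for illustration and may differ from the paper's parametrization.

```latex
f(x) = \frac{\operatorname{Re}\, w(z)}{\sigma\sqrt{2\pi}},
\qquad
z = \frac{(x-\mu) + i\gamma}{\sigma\sqrt{2}},
\qquad
w(z) = e^{-z^{2}}\operatorname{erfc}(-iz) = \operatorname{erfcx}(-iz).
```

Because $w$ stays bounded where $e^{-z^{2}}$ and $\operatorname{erfc}(-iz)$ individually under- or overflow, evaluating the density through $w$ avoids numerical convolution altogether.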

2605.01050 2026-05-29 stat.AP

Trust Me, I'm a Doctor?

相信我，我是医生？

Zach Shahn, Mats Stensrud

AI总结利用嵌套在观察性队列中的随机试验数据，推导医生个人策略优于试验平均推荐策略的比例的严格界限。

详情

AI中文摘要

临床试验通常针对平均治疗效果，但治疗决策是针对个体做出的。这种张力引发了对循证医学的一个常见批评：平均有益的特定治疗可能不适合特定患者，而熟练的医生可能优于严格遵循随机试验中表现最佳的策略。我们考虑如何使用来自同一目标人群的随机和观察性数据来评估这种可能性。具体来说，我们研究了随机试验嵌套在观察性队列中的设置，从而在治疗、对照和常规护理下观察到结果。我们询问观察数据能揭示医生多久能胜过试验建议的策略。我们推导了医生个人策略优于始终选择试验中表现更好的治疗的比例的严格界限，假设没有医生的策略比始终选择试验中表现更差的治疗更差。这些结果揭示了临床数据何时支持依赖医生自由裁量权而非试验平均推荐，以及何时需要更强的理由。

英文摘要

Clinical trials usually target average treatment effects, but treatment decisions are made for individuals. This tension motivates a common criticism of evidence-based medicine: a treatment that is beneficial on average may be inappropriate for a particular patient, and skilled physicians may outperform rigid adherence to the strategy that performed best in a randomized trial. We consider how randomized and observational data from the same target population can be used to assess that possibility. Specifically, we study settings in which a randomized trial is nested within an observational cohort, so that outcomes are observed under treatment, control, and usual care. We ask what the observed data can reveal about how often physicians outperform the strategy suggested by the trial. We derive sharp bounds on the proportion of physicians whose personal strategies perform better than always choosing the better performing treatment from the trial under the assumption that no physician's strategy is worse than always choosing the worse performing treatment from the trial. These results shed light on when clinical data support relying on physician discretion over the trial-average recommendation and when stronger justification is required.

2604.02094 2026-05-29 math.ST math.PR stat.TH

Importance sampling for Bayesian inference: polynomial-dimension dependent error bounds

贝叶斯推断的重要性采样：多项式维数依赖的误差界

Fabián González, Víctor Elvira, Joaquín Míguez

AI总结本文在随机观测视角下，通过引入链接函数，证明了重要性采样的L2误差界在任意维数下以标准蒙特卡洛速率收敛，并给出了多项式维数依赖的误差分析。

详情

AI中文摘要

许多贝叶斯推断问题涉及高维模型，其中标准重要性采样（IS）方法的性能通常随着维数增加而迅速下降。经典IS分析通常假设观测是任意但固定的（即确定性的），从而忽略了贝叶斯模型赋予数据的概率结构。在本文中，我们采用观测本身是随机变量的视角，其分布由底层模型控制。在这个概率框架内，我们识别出一个模型依赖的函数，称为链接函数，它连接了固定观测和随机观测的表述。我们给出了L2蒙特卡洛估计误差的特征：具体来说，我们证明了当且仅当链接函数是Bochner可积时，L2误差界是有限的，并且以标准蒙特卡洛速率O(N^{-1/2})收敛，对于任意大的维数。这一结果揭示了控制近似误差的基本量，并建立了一种管理模型状态维数依赖性的机制。因此，我们的方法提供了一种原则性的方式来缓解高维挑战，提供了超越现有文献中主导的最坏情况分析的见解。最后，我们推导了几类模型（包括线性高斯系统和具有有界观测函数的模型）相关误差的维数缩放的显式解析示例。

英文摘要

Many Bayesian inference problems involve high-dimensional models where the performance of standard importance sampling (IS) methods often degrades rapidly as the dimensionality increases. Classical analyses of IS typically rely on the assumption that observations are arbitrary but fixed (i.e., deterministic), thereby neglecting the probabilistic structure that the Bayesian model induces on the data. In this paper, we adopt the perspective that observations are themselves random variables whose distribution is governed by the underlying model. Within this probabilistic framework, we identify a model-dependent function, referred to as the link function, which connects the fixed- and random-observation formulations. We provide a characterization of the $L^2$ Monte Carlo estimation error: specifically, we show that the $L^2$ error bounds are finite and converge at the standard Monte Carlo rate $O(N^{-1/2})$, for arbitrarily large dimension, if and only if the link function is Bochner integrable. This result reveals the fundamental quantity controlling the approximation error and establishes a mechanism to manage the dependence on the model state dimension. Consequently, our approach provides a principled way to alleviate the challenges of high dimensionality, offering insights that transcend worst-case analyses dominant in the existing literature. Finally, we derive explicit analytical examples of the dimensional scaling of the associated errors for several model classes, including linear-Gaussian systems and models with bounded observation functions.
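
As a point of reference for the estimator being analysed, here is a minimal sketch of self-normalized importance sampling in a one-dimensional linear-Gaussian model. The model (prior $N(0,1)$, likelihood $N(\theta,1)$), the choice of the prior as proposal, and the function name `snis_posterior_mean` are all assumptions made for illustration; the sketch does not implement the paper's link function or its error bounds.

```python
import math
import random


def snis_posterior_mean(y, n_samples, seed=0):
    """Self-normalized IS estimate of E[theta | y].

    Assumed toy model: theta ~ N(0, 1), y | theta ~ N(theta, 1).
    The prior is used as the proposal, so each weight is proportional
    to the likelihood of the fixed observation y.
    """
    rng = random.Random(seed)
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n_samples):
        theta = rng.gauss(0.0, 1.0)               # draw from the proposal (prior)
        w = math.exp(-0.5 * (y - theta) ** 2)     # unnormalized likelihood weight
        weighted_sum += w * theta
        weight_total += w
    return weighted_sum / weight_total


y_obs = 1.0
estimate = snis_posterior_mean(y_obs, n_samples=20000)
exact = y_obs / 2.0  # posterior is N(y/2, 1/2) in this conjugate model
print(f"IS estimate: {estimate:.4f}, exact posterior mean: {exact:.4f}")
```

In this conjugate setting the exact posterior mean is available, so the Monte Carlo error can be inspected directly as `n_samples` grows.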

2603.20329 2026-05-29 stat.ML cs.LG math.PR

Measure flow path recovery in Bayes Hilbert spaces

贝叶斯希尔伯特空间中的测度流路径恢复

S. David Mis, Maarten V. de Hoop

发表机构 * Rice University（莱斯大学）

AI总结针对有限移动局部传感器恢复概率测度流的不适定问题，提出基于贝叶斯希尔伯特框架的变分理论，通过构造最小能量传输实现和线性化观测算子，分析可恢复性条件，并发展有限维约化方法实现稳定重建。

详情

AI中文摘要

我们研究使用贝叶斯希尔伯特框架从有限个移动局部传感器恢复概率测度流的不适定问题。相对于固定的参考概率测度，概率律由其中心化对数比坐标表示，因此演化律成为希尔伯特函数空间中的一条路径。对于足够正则的贝叶斯希尔伯特路径，我们通过在每个时间点求解加权纽曼问题，构造路径的规范最小能量传输实现，得到切方向上的内在传输形式。然后，我们直接在贝叶斯希尔伯特路径空间上制定逆问题。观测算子的线性化产生可观测性形式，可恢复性由其与传输几何通过联合传输-可观测性形式的相互作用决定。在无穷维环境中，我们发展了正则化变分理论，并识别了局部传感器的局限性：移动传感器可以使联合形式单射，但通常不能在整个状态空间上产生强制稳定性估计。这一障碍自然导致有限维贝叶斯希尔伯特约化。在那里，传输形式成为动能张量，线性化观测成为约化感知矩阵，因此可恢复性可以通过显式的格拉姆条件表达。我们证明局部凸起传感器检测每个固定的约化方向，有限个适当放置的静态传感器产生均匀的约化可观测性，并且存在依赖于路径的传感器轨迹，使得即使单个移动传感器也能恢复约化路径。最后，我们证明这些约化恢复结果可以提升到对由所选有限维子空间良好近似的路径的近似环境恢复，从而实现稳定重建至投影误差。

英文摘要

We study the ill-posed problem of recovering a probability measure flow from finitely many moving localized sensors using a Bayes Hilbert framework. Relative to a fixed reference probability measure, a probability law is represented by its centered log-ratio coordinates, so that an evolving law becomes a path in a Hilbert space of functions. For sufficiently regular Bayes Hilbert paths, we construct a canonical minimum-energy transport realization of the path by solving a weighted Neumann problem at each time, yielding an intrinsic transport form on tangent directions. We then formulate an inverse problem directly on Bayes Hilbert path space. Linearization of an observation operator yields an observability form, and recoverability is governed by its interaction with the transport geometry through a joint transport--observability form. In the ambient infinite-dimensional setting, we develop a regularized variational theory and identify limitations of localized sensing: mobile sensors can make the joint form injective, but they do not in general yield a coercive stability estimate on the full state space. This obstruction leads naturally to finite-dimensional Bayes Hilbert reductions. There the transport form becomes a kinetic tensor and the linearized observations become reduced sensing matrices, so recoverability can be expressed through explicit Gramian conditions. We show that localized bump sensors detect every fixed reduced direction, that finitely many suitably placed static sensors yield uniform reduced observability, and there exist path-dependent sensor trajectories such that even a single moving sensor can recover the reduced path. Finally, we show that these reduced recovery results lift to approximate ambient recovery for paths that are well approximated by the chosen finite-dimensional subspaces, yielding stable reconstruction up to projection error.

2603.15192 2026-05-29 stat.AP

Benchmarking Formula 1 results using a normal model

使用正态模型对标一级方程式赛车成绩

John Fry, Silvio Fanzon, Mark Austin, Tom Brighton

AI总结本文使用单变量和双变量正态模型，区分精英与非精英车队，量化车手和车队层面的合理性能预期，并应用于2025赛季数据。

2603.05002 2026-05-29 cs.LG math.OC stat.ML

Non-Euclidean Gradient Descent Operates at the Edge of Stability

非欧几里得梯度下降在稳定性边缘运行

Rustem Islamov, Michael Crawshaw, Jeremy Cohen, Robert Gower

发表机构 * University of Basel（巴塞尔大学）； George Mason University（乔治·梅森大学）； Flatiron Institute（Flatiron研究所）

AI总结本文通过方向光滑性解释梯度下降中的稳定性边缘现象，并将其推广到非欧几里得范数，定义广义尖锐度，实验表明非欧几里得梯度下降也表现出渐进尖锐化和阈值振荡。

详情

AI中文摘要

稳定性边缘（EoS）是一种现象，其中Hessian矩阵的尖锐度（最大特征值）在梯度下降（GD）中接近并徘徊在稳定性阈值$2/η$附近（步长为$η$）。尽管（表面上）违反了经典光滑性假设，但EoS在深度学习中已被广泛观察到，其理论基础仍不完整。我们通过方向光滑性[Mishkin et al., 2024]的视角提供了对EoS的解释。这种解释自然地扩展到非欧几里得范数，我们用它来定义任意范数下的广义尖锐度。我们的广义尖锐度度量包括先前研究的普通GD和预处理GD作为特例，以及尚未研究EoS的方法，例如$\ell_{\infty}$下降、块坐标下降、谱GD及其归一化版本。通过在神经网络上的实验，我们表明具有广义尖锐度的非欧几里得GD也表现出渐进尖锐化，随后在阈值$2/η$附近或之上振荡。在实践中，我们的框架提供了一种几何感知的谱诊断方法，可应用于广泛的非欧几里得梯度方法类别。

英文摘要

The Edge of Stability (EoS) is a phenomenon where the sharpness (largest eigenvalue) of the Hessian approaches and then hovers near the stability threshold $2/η$ during gradient descent (GD) with step size $η$. Despite (apparently) violating classical smoothness assumptions, EoS has been widely observed in deep learning, but its theoretical foundations remain incomplete. We provide an interpretation of EoS through the lens of Directional Smoothness [Mishkin et al., 2024]. This interpretation naturally extends to non-Euclidean norms, which we use to define generalized sharpness under an arbitrary norm. Our generalized sharpness measure includes previously studied vanilla GD and preconditioned GD as special cases, as well as methods for which EoS has not been studied, such as $\ell_{\infty}$-descent, Block CD, Spectral GD, and their normalized versions. Through experiments on neural networks, we show that non-Euclidean GD with our generalized sharpness also exhibits progressive sharpening followed by oscillations around or above the threshold $2/η$. Practically, our framework provides a geometry-aware spectral diagnostic that can be applied across a broad class of non-Euclidean gradient methods.
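
The threshold $2/η$ comes from the classical Euclidean case, which can be checked on a one-dimensional quadratic. The sketch below is only that textbook illustration, with a hypothetical helper `gd_on_quadratic`; it does not implement the paper's generalized sharpness or any non-Euclidean method.

```python
def gd_on_quadratic(sharpness, step_size, x0=1.0, n_steps=200):
    """Run GD on f(x) = 0.5 * sharpness * x**2 and return |x| at the end.

    Each step multiplies x by (1 - step_size * sharpness), so the iterates
    shrink exactly when sharpness < 2 / step_size.
    """
    x = x0
    for _ in range(n_steps):
        x -= step_size * sharpness * x  # gradient of f is sharpness * x
    return abs(x)


eta = 0.1
threshold = 2.0 / eta  # stability threshold for this step size

below = gd_on_quadratic(0.9 * threshold, eta)  # contraction factor -0.8
above = gd_on_quadratic(1.1 * threshold, eta)  # expansion factor -1.2
print(f"sharpness below threshold: |x| = {below:.3e}")
print(f"sharpness above threshold: |x| = {above:.3e}")
```

On a neural network the Hessian changes along the trajectory, which is why the sharpness can drift up to the threshold and hover there rather than simply diverging.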

2602.16449 2026-05-29 cs.LG cs.AI stat.ML

GICDM: Mitigating Hubness for Reliable Distance-Based Generative Model Evaluation

GICDM: 缓解枢纽性以实现可靠的基于距离的生成模型评估

Nicolas Salvy, Hugues Talbot, Bertrand Thirion

发表机构 * Inria, Palaiseau, France（法国国家信息与自动化研究所，帕莱索）

AI总结针对生成模型评估中高维嵌入空间的枢纽性现象，提出GICDM方法（基于迭代上下文不相似度度量），通过多尺度扩展校正邻域估计，恢复可靠度量并与人类评估对齐。

Comments Forty-third International Conference on Machine Learning, 2026

2602.10637 2026-05-29 cs.LG cond-mat.stat-mech physics.chem-ph stat.ML

Coarse-Grained Boltzmann Generators

粗粒度玻尔兹曼生成器

Weilong Chen, Bojun Zhao, Jan Eckwert, Julija Zavadlav

发表机构 * Professorship of Multiscale Modeling of Fluid Materials, Department of Engineering Physics and Computation, TUM School of Engineering and Design, Technical University of Munich, Germany（德国慕尼黑工业大学工程与设计学院工程物理与计算系，流体材料多尺度建模教席）； Atomistic Modeling Center, Munich Data Science Institute, Technical University of Munich, Germany（德国慕尼黑工业大学慕尼黑数据科学研究所原子建模中心）

AI总结提出粗粒度玻尔兹曼生成器（CG-BGs）框架，结合基于流的生成模型与重要性采样，利用学习到的平均力势（PMF）进行重加权，在降低计算成本的同时实现大分子系统的平衡采样。

Comments Accepted at ICML 2026

详情

AI中文摘要

从玻尔兹曼分布中采样平衡分子构型是一个长期挑战。玻尔兹曼生成器（BGs）通过结合精确似然生成模型与重要性采样来解决这一问题，但实际可扩展性有限。同时，粗粒度代理模型通过降低有效维度来建模更大系统，但往往缺乏确保渐近正确统计量的重加权过程。在这项工作中，我们提出了粗粒度玻尔兹曼生成器（CG-BGs），一个用于粗粒度坐标空间中的降阶生成建模与重要性采样的框架。CG-BGs使用基于流的模型生成样本，并使用学习到的平均力势（PMF）进行重加权。我们表明，可以通过增强采样力匹配从快速收敛的轨迹中学习PMF。实验证明，CG-BGs在高度降阶表示中捕获溶剂介导的相互作用，同时相对于原子级BGs大幅降低计算成本，为更大分子系统的平衡采样提供了实用途径。

英文摘要

Sampling equilibrium molecular configurations from the Boltzmann distribution is a longstanding challenge. Boltzmann Generators (BGs) address this by combining exact-likelihood generative models with importance sampling, but practical scalability is limited. Meanwhile, coarse-grained surrogates enable the modeling of larger systems by reducing effective dimensionality, yet often lack a reweighting procedure required to ensure asymptotically correct statistics. In this work, we propose Coarse-Grained Boltzmann Generators (CG-BGs), a framework for reduced-order generative modeling with importance sampling in coarse-grained coordinate space. CG-BGs generate samples using a flow-based model and reweight them using a learned potential of mean force (PMF). We show that the PMF can be learned from rapidly converged trajectories via enhanced sampling force matching. Experiments demonstrate that CG-BGs capture solvent-mediated interactions in highly reduced representations while substantially reducing computational cost relative to atomistic BGs, providing a practical route toward equilibrium sampling of larger molecular systems.

2602.06361 2026-05-29 cs.GT cs.IT cs.LG math.IT stat.ML

Envy-Free Allocation of Indivisible Goods via Noisy Queries

通过噪声查询实现不可分割物品的无嫉妒分配

Zihan Li, Yan Hao Ling, Jonathan Scarlett, Warut Suksompong

发表机构 * Meta（Meta公司）； National University of Singapore（新加坡国立大学）； Nanyang Technological University（南洋理工大学）

AI总结针对不可直接观测估值、仅能通过噪声查询获取信息的不可分割物品分配问题，在双智能体高斯噪声和有界估值设定下，推导了实现无嫉妒分配所需查询次数的上下界，并证明了当最优分配负嫉妒值Δ不太小时最优查询次数与m^{2.5}/Δ^2成比例。

Comments ICML 2026

2602.05961 2026-05-29 cs.LG stat.ML

Discrete diffusion samplers and bridges: Off-policy algorithms and applications in latent spaces

离散扩散采样器与桥：离策略算法及其在潜在空间中的应用

Arran Carter, Sanghyeok Choi, Kirill Tamogashev, Víctor Elvira, Esmeralda S. Whitammer

发表机构 * University of Edinburgh（爱丁堡大学）； CIFAR Fellow（加拿大高等研究院研究员）

AI总结提出离策略训练技术改进离散扩散采样器性能，并首次引入离散域的数据到能量薛定谔桥训练，应用于图像生成模型的离散潜在空间中的无数据后验采样。

Comments ICML 2026. Code: https://github.com/mmacosha/offpolicy-discrete-diffusion-samplers-and-bridges

详情

AI中文摘要

从归一化常数未知的分布 $p(x) \propto e^{-\mathcal{E}(x)}$ 中采样是统计学中一个重要且具有挑战性的问题。近年来，出现了一类新的摊销采样算法，通常称为扩散采样器，能够从未归一化的密度中快速高效地采样。这类算法在连续空间采样任务中已被广泛研究；然而，它们在离散空间问题中的应用仍基本未被探索。尽管该领域已取得一些进展，但离散扩散采样器并未充分利用连续空间采样中常用的思想。在本文中，我们提出通过引入离散扩散采样器的离策略训练技术来弥合这一差距。我们证明这些技术在已有和新颖的合成基准上提高了离散采样器的性能。接下来，我们将离散扩散采样器推广到两个任意分布之间的桥接任务，首次为离散域引入了数据到能量薛定谔桥训练。最后，我们展示了所提出的扩散采样器在图像生成模型的离散潜在空间中进行无数据后验采样的应用。

This paper develops distribution theory and bootstrap-based inference methods for a broad class of convex pairwise difference estimators. These estimators minimize a kernel-weighted convex-in-parameter function over observation pairs with similar covariates, where the similarity is governed by a localization (bandwidth) parameter. While classical results establish asymptotic normality under restrictive bandwidth conditions, we show that valid Gaussian and bootstrap-based inference remains possible under substantially weaker assumptions. First, we extend the theory of small bandwidth asymptotics to convex pairwise difference estimation settings, deriving robust Gaussian approximations even when a smaller than standard bandwidth is used. Second, we employ a debiasing procedure based on generalized jackknifing to enable inference with larger bandwidths, while preserving convexity of the objective function. Third, we construct a novel bootstrap method that adjusts for bandwidth-induced variance distortions, yielding valid inference across a wide range of bandwidth choices. Our proposed inference method enjoys demonstrably greater robustness, while retaining the practical appeal of convex pairwise difference estimators.

2510.00094 2026-05-29 physics.soc-ph stat.AP

Proximity-based cities emit less mobility-driven CO$_2$

基于邻近性的城市减少出行驱动的CO$_2$排放

Francesco Marzolla, Matteo Bruno, Hygor P. M. Melo, Vittorio Loreto

AI总结通过分析全球近400个城市的数据，发现服务设施靠近居民的区域人均交通碳排放更低，并量化了优化服务布局可实现的减排潜力。

详情

DOI: 10.1038/s44333-025-00074-0

AI中文摘要

在追求更环境可持续的城市区域的过程中，提出了15分钟城市的概念，以鼓励主动出行，主要是步行和骑行。如果每个居民都能在离家15分钟步行或骑行范围内获得基本服务，则该城市区域被视为“15分钟城市”。然而，关于该模式在减少汽车使用和碳排放方面的有效性仍存在争议。在本研究中，我们进行了一项大规模数据驱动分析，以评估服务设施靠近住宅对CO$_2$排放的影响。通过检查全球近400个城市，我们发现，在同一城市内，服务设施更靠近居民的区域人均交通CO$_2$排放更低。我们为每个城市建立了服务设施邻近性与CO$_2$排放之间的明确关系。此外，我们量化了30个城市在优化服务设施位置后潜在的减排量。这种优化保持每个城市的服务设施总数不变，同时重新分布它们以确保整个城市区域的平等可达性。我们的研究结果表明，改善服务设施的邻近性可以显著减少与交通相关的预期城市排放。

英文摘要

In the quest for more environmentally sustainable urban areas, the concept of the 15-minute city has been proposed to encourage active mobility, primarily through walking and cycling. An urban area is considered a "15-minute city" if every resident can access essential services within a 15-minute walk or bike ride from their home. However, there is an ongoing debate about the effectiveness of this model in reducing car usage and carbon emissions. In this study, we conduct a large-scale data-driven analysis to evaluate the impact of service proximity to homes on CO$_2$ emissions. By examining nearly 400 cities worldwide, we discover that, within the same city, areas with services located closer to residents produce less CO$_2$ emissions per capita from transportation. We establish a clear relationship between the proximity of services and CO$_2$ emissions for each city. Additionally, we quantify the potential reduction in emissions for 30 cities if they optimise the location of their services. This optimisation maintains each city's total number of services while redistributing them to ensure equal accessibility throughout the entire urban area. Our findings indicate that improving the proximity of services can significantly reduce expected urban emissions related to transportation.

2509.24100 2026-05-29 stat.ME cs.LG

SpeedCP: Fast Kernel-based Conditional Conformal Prediction

SpeedCP: 基于核的快速条件共形预测

Yating Liu, Yeo Jin Jung, Zixuan Wu, So Won Jeong, Claire Donnat

发表机构 * Department of Statistics, University of Chicago（芝加哥大学统计学系）

AI总结提出一种基于路径追踪的高效算法，在保持RKHS条件共形预测框架理论优势的同时，将计算速度提升40倍，区间长度缩短30%。

详情

AI中文摘要

共形预测提供了具有有限样本条件保证的分布自由预测集。我们基于Gibbs等人(2023)的RKHS框架，该框架利用协变量偏移族来提供近似条件共形预测区间，具有强大的理论前景，但计算成本过高。为弥补这一差距，我们开发了一种稳定高效的算法，该算法以与单次核分位数拟合基本相同的成本计算正则化RKHS共形优化问题的完整解路径。我们的路径追踪框架同时调整超参数，提供平滑控制和数据自适应校准。为了将方法扩展到高维设置，我们进一步将我们的方法与低秩潜在嵌入相结合，在数据驱动的潜在空间中捕获条件有效性。实验上，我们的方法在各种现代黑盒预测器上提供了可靠的条件覆盖，将Gibbs等人(2023)的区间长度改善了30%，同时实现了40倍的加速。

英文摘要

Conformal prediction provides distribution-free prediction sets with finite-sample conditional guarantees. We build upon the RKHS-based framework of Gibbs et al. (2023), which leverages families of covariate shifts to provide approximate conditional conformal prediction intervals, an approach with strong theoretical promise, but with prohibitive computational cost. To bridge this gap, we develop a stable and efficient algorithm that computes the full solution path of the regularized RKHS conformal optimization problem, at essentially the same cost as a single kernel quantile fit. Our path-tracing framework simultaneously tunes hyperparameters, providing smoothness control and data-adaptive calibration. To extend the method to high-dimensional settings, we further integrate our approach with low-rank latent embeddings that capture conditional validity in a data-driven latent space. Empirically, our method provides reliable conditional coverage across a variety of modern black-box predictors, improving the interval length of Gibbs et al. (2023) by 30%, while achieving a 40-fold speedup.
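
For readers new to the area, the baseline that conditional methods refine is plain split conformal prediction, which only guarantees marginal coverage. The sketch below shows that baseline, not the RKHS path-tracing algorithm of the paper; the helper names `split_conformal_halfwidth` and `prediction_interval` and the toy residuals are assumptions for illustration.

```python
import math


def split_conformal_halfwidth(calibration_residuals, alpha):
    """Return the split-conformal half-width at miscoverage level alpha.

    Uses absolute residuals as conformity scores and takes the
    ceil((n + 1) * (1 - alpha))-th smallest one. If that rank exceeds n,
    the interval must be infinite to keep the coverage guarantee.
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return scores[rank - 1]


def prediction_interval(point_prediction, halfwidth):
    """Symmetric interval around a black-box point prediction."""
    return (point_prediction - halfwidth, point_prediction + halfwidth)


# Toy calibration residuals (y - prediction) from a held-out set.
residuals = [-0.5, 0.1, 0.3, -0.2, 0.4, 0.05, -0.6, 0.25, 0.15]
halfwidth = split_conformal_halfwidth(residuals, alpha=0.5)
print(prediction_interval(2.0, halfwidth))
```

The half-width here is the same for every test point, which is exactly the limitation that conditional approaches address by letting the interval adapt to the covariates.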

2506.21543 2026-05-29 math.ST cs.IT math.IT math.PR stat.TH

Detecting weighted hidden cliques

检测加权隐藏团

Urmisha Chatterjee, Karissa Huang, Ritabrata Karmakar, B. R. Vinay Kumar, Gábor Lugosi, Nandan Malhotra, Anirban Mandal, Maruf Alam Tarafdar

AI总结研究加权图中隐藏团检测问题，通过假设检验框架，在已知和部分已知分布下推导可检测性统计极限，并给出谱方法算法。

Comments Revision with organised references

详情

AI中文摘要

我们研究了经典隐藏团问题到具有实值边权图的推广。形式上，我们定义了一个假设检验问题。在原假设下，$n$个顶点的完全图的边权独立同分布于分布$P$。在备择假设下，随机选择$k$个顶点，它们之间的边权来自分布$Q$，其余边权来自$P$。目标是观察边权后决定它们来自哪个假设。我们在两种场景下研究该问题：(1) $P$和$Q$完全已知，(2) $P$和$Q$只有部分信息。在第一种场景中，我们得到了当两个假设可区分和不可区分时$k$的统计极限。此外，在每个场景中，当$Q$关于$P$不是绝对连续时，我们给出了假设检验问题最小风险的界。我们还提供了计算高效的谱检验，只要$k=Ω(\sqrt{n})$，即可在两种场景下区分两个假设。

英文摘要

We study a generalization of the classical hidden clique problem to graphs with real-valued edge weights. Formally, we define a hypothesis testing problem. Under the null hypothesis, edges of a complete graph on $n$ vertices are associated with independent and identically distributed edge weights from a distribution $P$. Under the alternate hypothesis, $k$ vertices are chosen at random and the edge weights between them are drawn from a distribution $Q$, while the remaining are sampled from $P$. The goal is to decide, upon observing the edge weights, which of the two hypotheses they were generated from. We investigate the problem under two different scenarios: (1) when $P$ and $Q$ are completely known, and (2) when there is only partial information of $P$ and $Q$. In the first scenario, we obtain statistical limits on $k$ when the two hypotheses are distinguishable, and when they are not. Additionally, in each of the scenarios, we provide bounds on the minimal risk of the hypothesis testing problem when $Q$ is not absolutely continuous with respect to $P$. We also provide computationally efficient spectral tests that can distinguish the two hypotheses as long as $k=Ω(\sqrt{n})$ in both the scenarios.
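
To make the spectral idea concrete, the following sketch simulates both hypotheses for one assumed pair of distributions, $P = N(0,1)$ and $Q = N(\text{shift}, 1)$, and thresholds the largest eigenvalue magnitude of the weight matrix. The threshold $3\sqrt{n}$ is a heuristic chosen here because the null spectrum concentrates near $2\sqrt{n}$ for unit-variance weights; it is not the test statistic or constant from the paper, and all function names are hypothetical.

```python
import math
import random


def sample_weights(n, clique, shift, rng):
    """Symmetric weight matrix: N(0, 1) edges, N(shift, 1) inside the clique."""
    members = set(clique)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            mean = shift if (i in members and j in members) else 0.0
            a[i][j] = a[j][i] = rng.gauss(mean, 1.0)
    return a


def top_eigenvalue_magnitude(a, rng, n_iter=100):
    """Approximate the largest |eigenvalue| of a symmetric matrix by power iteration."""
    n = len(a)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    value = 0.0
    for _ in range(n_iter):
        av = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        value = math.sqrt(sum(x * x for x in av))  # ||A v|| for unit v
        v = [x / value for x in av]
    return value


def spectral_test(a, rng):
    """Reject the null when the top eigenvalue magnitude exceeds 3 * sqrt(n)."""
    return top_eigenvalue_magnitude(a, rng) > 3.0 * math.sqrt(len(a))


rng = random.Random(0)
n, k = 60, 25
null_matrix = sample_weights(n, [], 2.0, rng)
planted_matrix = sample_weights(n, range(k), 2.0, rng)
print("reject under null:", spectral_test(null_matrix, rng))
print("reject under alternative:", spectral_test(planted_matrix, rng))
```

With a planted block of size $k$ and mean shift $s$, the top eigenvalue is roughly $ks$, so it separates from the null bulk once $k$ is a sufficiently large multiple of $\sqrt{n}$, in line with the $k=Ω(\sqrt{n})$ regime mentioned in the abstract.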

2505.13745 2026-05-29 cs.LG stat.ML

Synthetic Non-stationary Data Streams for Recognition of the Unknown

用于未知识别的合成非平稳数据流

Joanna Komorniczak

发表机构 * Wrocław University of Science and Technology（弗罗茨瓦夫科技大学）

AI总结提出一种同时包含概念漂移和新类出现的合成数据流生成策略，并评估无监督漂移检测器在开放集识别任务中的表现。

详情

DOI: 10.1007/978-3-032-19102-1_9

AI中文摘要

数据非平稳性问题在数据流处理中常被讨论。在动态环境中，方法应持续准备分析时变数据——因此，它们应支持增量训练并应对概念漂移。非平稳数据流环境中另一个同样重要的变化是新的、先前未知类别的出现。通常，方法专注于这两种现象之一——检测概念漂移或检测新类别——而数据流中可能同时出现这两种困难。此外，关于先前未知的观测，开放类别集的话题近年来变得尤为重要，方法的目标是在已知类别内高效分类，并识别模型能力范围外的对象。本文提出一种合成数据流生成策略，其中同时出现概念漂移和代表未知对象的新类别。所呈现的研究展示了无监督漂移检测器如何处理检测新类别和概念漂移的任务，并演示了生成的数据流如何用于开放集识别任务。

英文摘要

The problem of data non-stationarity is commonly addressed in data stream processing. In a dynamic environment, methods should continuously be ready to analyze time-varying data -- hence, they should enable incremental training and respond to concept drifts. An equally important variability typical for non-stationary data stream environments is the emergence of new, previously unknown classes. Often, methods focus on one of these two phenomena -- detection of concept drifts or detection of novel classes -- while both difficulties can be observed in data streams. Additionally, concerning previously unknown observations, the topic of open set of classes has become particularly important in recent years, where the goal of methods is to efficiently classify within known classes and recognize objects outside the model competence. This article presents a strategy for synthetic data stream generation in which both concept drifts and the emergence of new classes representing unknown objects occur. The presented research shows how unsupervised drift detectors address the task of detecting novelty and concept drifts and demonstrates how the generated data streams can be utilized in the open set recognition task.

2505.07989 2026-05-29 stat.ME econ.EM stat.CO

rd2d: Causal Inference in Boundary Discontinuity Designs

rd2d：边界断点设计中的因果推断

Matias D. Cattaneo, Rocio Titiunik, Ruiqi Rae Yu

AI总结本文介绍rd2d软件包，用于边界断点设计中基于局部多项式估计的因果效应推断，支持双变量得分或单变量符号距离得分，并提供带宽选择、偏差校正、置信带等功能。

详情

AI中文摘要

边界断点（BD）设计用于实证研究，以了解由双变量得分定义的连续分配边界上的因果处理效应。这些设计也称为多得分断点回归（RD）设计，其中地理RD设计是一个突出的例子。本文介绍了rd2d，一个用于R、Python和Stata的统计软件包，该软件包使用双变量得分或单变量符号距离边界得分实现BD设计的局部多项式估计和推断。该软件涵盖精确和模糊BD设计，提供自动带宽选择、稳健偏差校正逐点推断、一致置信带、联合或单独拟合约定的聚类稳健推断、协变量调整效率改进、质量点检查和协方差正则化等功能。我们通过一个应用于机会区的实证例子来说明该软件包，在该区域中，资格对指定有强烈的第一阶段效应，但对早期工作场所就业增长没有显著影响。

英文摘要

Boundary Discontinuity (BD) designs are used in empirical research to learn about causal treatment effects along a continuous assignment boundary defined by a bivariate score. These designs are also known as multi-score regression discontinuity (RD) designs, and include geographic RD designs as a prominent example. This article introduces rd2d, a statistical software package for R, Python, and Stata that implements local polynomial estimation and inference for BD designs using either the bivariate score or a univariate signed distance-to-boundary score. The software covers sharp and fuzzy BD designs, providing automatic bandwidth selection, robust bias-corrected pointwise inference, uniform confidence bands, cluster-robust inference with joint or separate fitting conventions, covariate-adjusted efficiency improvements, mass-point checks, and covariance regularization, among other features. We illustrate the package with an empirical application to Opportunity Zones, where eligibility has a strong first-stage effect on designation but no significant effects on early workplace-job growth.

2505.02069 2026-05-29 cs.LG stat.ML

Neural Logistic Bandits

神经逻辑老虎机

Seoungbin Bae, Dabeen Lee

发表机构 * Department of Industrial & Systems Engineering, KAIST ； Department of Mathematical Sciences & Research Institute of Mathematics, Seoul National University ； Interdisciplinary Program in Artificial Intelligence, Seoul National University

AI总结针对神经逻辑老虎机问题，利用一种新型的自归一化向量值鞅的Bernstein型不等式，提出两种算法NeuralLog-UCB-1和NeuralLog-UCB-2，分别实现与有效维度相关的遗憾上界，改进了现有结果。

详情

AI中文摘要

我们研究了神经逻辑老虎机问题，其主要任务是通过神经网络学习逻辑链接函数内的未知奖励函数。现有方法要么对$κ$（其中$1/κ$表示奖励分布的最小方差）有不利的依赖，要么直接依赖于特征维度$d$，而在基于神经网络的设置中$d$可能非常大。在这项工作中，我们引入了一种新型的自归一化向量值鞅的Bernstein型不等式，旨在绕过对环境维度的直接依赖。这使我们能够推导出一个遗憾上界，该上界随有效维度$\widetilde{d}$增长，而不是特征维度，同时保持对$κ$的最小依赖。基于该集中不等式，我们提出了两种算法NeuralLog-UCB-1和NeuralLog-UCB-2，它们分别保证了$\widetilde{O}(\widetilde{d}\sqrt{κT})$和$\widetilde{O}(\widetilde{d}\sqrt{T/κ})$阶的遗憾上界，改进了现有结果。最后，我们在合成数据集和真实数据集上报告了数值结果，以验证我们的理论发现。

英文摘要

We study the problem of neural logistic bandits, where the main task is to learn an unknown reward function within a logistic link function using a neural network. Existing approaches either exhibit unfavorable dependencies on $κ$, where $1/κ$ represents the minimum variance of reward distributions, or suffer from direct dependence on the feature dimension $d$, which can be huge in neural network-based settings. In this work, we introduce a novel Bernstein-type inequality for self-normalized vector-valued martingales that is designed to bypass a direct dependence on the ambient dimension. This lets us deduce a regret upper bound that grows with the effective dimension $\widetilde{d}$, not the feature dimension, while keeping a minimal dependence on $κ$. Based on the concentration inequality, we propose two algorithms, NeuralLog-UCB-1 and NeuralLog-UCB-2, that guarantee regret upper bounds of order $\widetilde{O}(\widetilde{d}\sqrt{κT})$ and $\widetilde{O}(\widetilde{d}\sqrt{T/κ})$, respectively, improving on the existing results. Lastly, we report numerical results on both synthetic and real datasets to validate our theoretical findings.

2503.15287 2026-05-29 stat.CO cs.DC

Distributed Generalized Linear Models: A Privacy-Preserving Approach

分布式广义线性模型：一种隐私保护方法

Daniel Tinoco, Raquel Menezes, Carlos Baquero

AI总结提出一种隐私保护的分布式广义线性模型方法，通过数据流或分布式计算实现模型训练，并扩展到GLM框架，数值实验验证了其在联邦环境中的有效性。

Comments Total PDF pages: 23 Figures: 7

2410.23222 2026-05-29 cs.LG cs.AI stat.ML

Dataset-Driven Channel Masks in Transformers for Multivariate Time Series

数据集驱动的Transformer通道掩码用于多变量时间序列

Seunghan Lee, Taeyoung Park, Kibok Lee

发表机构 * Department of Statistics and Data Science, Yonsei University（延世大学统计与数据科学系）； LG AI Research（LG人工智能研究）

AI总结提出部分通道依赖（PCD）概念，通过数据集特定的通道掩码（CMs）改进Transformer中的通道依赖建模，并在多种任务和数据集上验证有效性。

Comments ICASSP 2026. Preliminary version: NeurIPS Workshop on Time Series in the Age of Large Models 2024 (Oral presentation)

详情

AI中文摘要

最近基础模型的进展已成功扩展到时间序列（TS）领域，这得益于大规模TS数据集的出现。捕获通道依赖（CD）对于建模多变量时间序列至关重要，基于注意力的方法已被广泛用于此目的。尽管如此，这些方法主要关注修改架构，往往忽略了数据集特定特征的重要性。在这项工作中，我们引入了部分通道依赖（PCD）的概念，通过利用数据集特定信息来增强基于Transformer的模型中的CD建模，从而细化模型捕获的CD。为了实现PCD，我们提出了通道掩码（CMs），通过逐元素乘法将其集成到Transformer的注意力矩阵中。CMs由两个组件组成：1）捕获通道之间关系的相似性矩阵，以及2）数据集特定且可学习的领域参数，用于细化相似性矩阵。我们在多种任务和数据集上使用不同的骨干网络验证了PCD的有效性。代码可在此存储库获取：https://github.com/YonseiML/pcd。

英文摘要

Recent advancements in foundation models have been successfully extended to the time series (TS) domain, facilitated by the emergence of large-scale TS datasets. Capturing channel dependency (CD) is essential for modeling multivariate time series (TS), and attention-based methods have been widely employed for this purpose. Nonetheless, these methods primarily focus on modifying the architecture, often neglecting the importance of dataset-specific characteristics. In this work, we introduce the concept of partial channel dependence (PCD) to enhance CD modeling in Transformer-based models by leveraging dataset-specific information to refine the CD captured by the model. To achieve PCD, we propose channel masks (CMs), which are integrated into the attention matrices of Transformers via element-wise multiplication. CMs consist of two components: 1) a similarity matrix that captures relationships between the channels, and 2) dataset-specific and learnable domain parameters that refine the similarity matrix. We validate the effectiveness of PCD across diverse tasks and datasets with various backbones. Code is available at this repository: https://github.com/YonseiML/pcd.
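
The two-component mask described above can be sketched in a few lines. Everything specific here is an assumption made for illustration: absolute Pearson correlation as the similarity matrix, a sigmoid with two scalar parameters `alpha` and `beta` standing in for the learnable domain parameters, and the function names. The authors' repository is the reference for the actual construction.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def channel_mask(series, alpha, beta):
    """Channel-by-channel mask: sigmoid(alpha * |corr| + beta).

    alpha and beta are hypothetical stand-ins for the dataset-specific
    parameters that refine the similarity matrix; here they are fixed
    numbers rather than learned values.
    """
    c = len(series)
    mask = []
    for i in range(c):
        row = []
        for j in range(c):
            sim = abs(pearson(series[i], series[j]))
            row.append(1.0 / (1.0 + math.exp(-(alpha * sim + beta))))
        mask.append(row)
    return mask


def apply_mask(attention, mask):
    """Element-wise product of an attention matrix with the channel mask."""
    return [[a * m for a, m in zip(a_row, m_row)]
            for a_row, m_row in zip(attention, mask)]


# Three channels: the first two are perfectly correlated, the third is not.
series = [[1, 2, 3, 4], [2, 4, 6, 8], [1, -1, 1, -1]]
uniform_attention = [[1.0 / 3.0] * 3 for _ in range(3)]
masked = apply_mask(uniform_attention, channel_mask(series, alpha=4.0, beta=-2.0))
for row in masked:
    print([round(v, 3) for v in row])
```

With a positive `alpha`, attention between strongly related channels is kept closer to its original value, while attention between weakly related channels is damped, which is one way to read "partial" channel dependence.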

2409.06439 2026-05-29 cs.LG stat.CO stat.ML

Extending Explainable Ensemble Trees (E2Tree) to regression contexts

将可解释集成树（E2Tree）扩展到回归场景

Massimo Aria, Agostino Gnasso, Carmela Iorio, Marjolein Fokkema

发表机构 * Department of Economics and Statistics, University of Naples Federico II（那不勒斯费德里科二世大学经济学与统计学系）； Institute of Psychology, Leiden University（莱顿大学心理学研究所）

AI总结本文通过引入新的不相似度度量，将可解释集成树方法从分类扩展到回归，并在真实数据集上验证其解释能力。

Journal ref Applied Stochastic Models in Business and Industry, Vol. 42, No. 1, e70064 (2026)

详情

DOI: 10.1002/asmb.70064

AI中文摘要

集成方法如随机森林通过聚合多个弱学习器提供了高精度的预测，改变了监督学习的格局。然而，尽管它们有效，这些方法往往缺乏透明度，阻碍了用户理解随机森林模型如何得出预测。可解释集成树（E2Tree）是一种解释随机森林的新方法，提供了响应变量与预测变量之间关系的图形表示。E2Tree的一个显著特点是它不仅考虑预测变量对响应的影响，还通过计算和使用不相似度度量来考虑预测变量之间的关联。E2Tree方法最初是为分类任务提出的。在本文中，我们将该方法扩展到回归场景。为了展示所提算法的解释能力，我们在真实数据集上进行了演示。

英文摘要

Ensemble methods such as random forests (RF) have transformed the landscape of supervised learning, offering highly accurate prediction through the aggregation of multiple weak learners. However, despite their effectiveness, these methods often lack transparency, impeding users' comprehension of how RF models arrive at their predictions. Explainable ensemble trees (E2Tree) is a novel methodology for explaining random forests that provides a graphical representation of the relationship between response variables and predictors. A striking characteristic of E2Tree is that it not only accounts for the effects of predictor variables on the response but also accounts for associations between the predictor variables through the computation and use of dissimilarity measures. The E2Tree methodology was initially proposed for use in classification tasks. In this paper, we extend the methodology to encompass regression contexts. To demonstrate the explanatory power of the proposed algorithm, we illustrate its use on real-world datasets.


2408.13596 2026-05-29 stat.ME stat.CO

Robust Principal Components by Casewise and Cellwise Weighting

通过案例加权和单元加权实现稳健主成分

Fabio Centofanti, Mia Hubert, Peter J. Rousseeuw

AI总结提出 cellPCA 方法，通过结合两种稳健损失函数和迭代重加权最小二乘算法，同时处理案例异常值、单元异常值和缺失数据，实现稳健的主成分分析。

详情

DOI: 10.1080/00401706.2026.2643216

AI中文摘要

主成分分析（PCA）是分析多元数据的基本工具。这里关注的是降维到主子空间，其特征由投影矩阵表示。经典的主子空间可能受到异常值的强烈影响。传统的稳健方法考虑案例异常值，即由与干净案例不同的未指定异常分布生成的案例。但也可能存在单元异常值，即可能出现在数据矩阵中任何位置的可疑条目。另一个常见问题是某些单元格可能缺失。本文提出了一种新的稳健PCA方法，称为cellPCA，它可以同时处理案例异常值、单元异常值和缺失单元格。其单一目标函数结合了两个稳健损失函数，共同减轻案例和单元异常值的影响。目标函数通过迭代重加权最小二乘（IRLS）算法最小化。提出了残差单元格图和增强异常值图用于异常值检测。推导了主子空间的案例和单元影响函数，并得到了其渐近分布。广泛的模拟和两个真实数据示例说明了cellPCA的性能。

检验函数自回归模型的拟合优度

W. González-Manteiga, M. D. Ruiz-Medina, M. Febrero-Bande

AI总结针对函数型时间序列中的线性自相关模型，提出基于经验过程的拟合优度检验，推导了经验过程收敛到时间变换的Wiener过程的函数中心极限定理，并验证了检验在简单和复合原假设下的大样本性质及一致性。

2212.12435 2026-05-29 stat.ME math.ST stat.TH

Second-level global sensitivity analysis of numerical simulators with application to an accident scenario in a sodium-cooled fast reactor

数值模拟器的二级全局灵敏度分析及其在钠冷快堆事故场景中的应用

Anouar Meynaoui, Amandine Marrel, Béatrice Laurent

AI总结针对输入分布不确定对全局灵敏度分析结果的影响，提出基于加权HSIC估计量的单环蒙特卡洛方法，实现计算预算有限的二级全局灵敏度分析，并应用于核反应堆严重事故场景。

Comments This work was intended as a replacement of arXiv:1902.07030 and any subsequent updates will appear there

详情

AI中文摘要

数值模拟器广泛用于模拟物理现象，全局灵敏度分析（GSA）旨在研究输入不确定性对模拟器输出的全局影响。为了进行GSA，通常使用基于输入/输出依赖度量的统计工具。我们关注希尔伯特-施密特独立性准则（HSIC）。有时，建模输入不确定性的概率分布本身可能是不确定的，量化其对GSA结果的影响非常重要。我们将其称为二级全局灵敏度分析（GSA2）。然而，当使用蒙特卡洛双环方法进行GSA2时，需要大量的模型评估，这对于CPU时间昂贵的模拟器来说是难以处理的。为了解决这一限制，我们提出了一种基于蒙特卡洛单环且计算预算有限的新统计方法。首先，我们从精心选择的输入概率分布中构建一个唯一的输入和模拟器输出样本。从该样本中，我们通过使用加权HSIC度量估计器，对各种假设的输入概率分布进行GSA。证明了这些加权估计器的统计性质。随后，我们定义了输入分布与GSA结果之间的基于HSIC的二级度量，即GSA2指标。通过一个解析示例说明了我们的GSA2方法的效率，从而比较了几种技术选项。最后，将其应用于模拟核反应堆严重事故场景的测试案例。

英文摘要

Numerical simulators are widely used to model physical phenomena, and global sensitivity analysis (GSA) aims at studying the global impact of the input uncertainties on the simulator output. To perform GSA, statistical tools based on input/output dependence measures are commonly used. We focus here on the Hilbert-Schmidt independence criterion (HSIC). Sometimes, the probability distributions modeling the uncertainty of inputs may themselves be uncertain, and it is important to quantify their impact on GSA results. We call this second-level global sensitivity analysis (GSA2). However, GSA2, when performed with a Monte Carlo double-loop, requires a large number of model evaluations, which is intractable with CPU-time-expensive simulators. To cope with this limitation, we propose a new statistical methodology based on a Monte Carlo single-loop with a limited calculation budget. First, we build a unique sample of inputs and simulator outputs from a well-chosen probability distribution of inputs. From this sample, we perform GSA for various assumed probability distributions of inputs by using weighted estimators of HSIC measures. Statistical properties of these weighted estimators are demonstrated. Subsequently, we define second-level HSIC-based measures between the distributions of inputs and GSA results, which constitute GSA2 indices. The efficiency of our GSA2 methodology is illustrated on an analytical example, thereby comparing several technical options. Finally, an application to a test case simulating a severe accident scenario in a nuclear reactor is provided.
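The single-loop idea (reuse one sample and reweight it for each candidate input distribution) can be illustrated with a self-normalised, importance-weighted version of the biased HSIC V-statistic. The sketch below is an assumption-laden illustration, not the paper's estimator: it uses scalar inputs and outputs, Gaussian kernels with a fixed bandwidth `bw`, and weights `w` that would in practice be density ratios between an assumed input distribution and the sampling distribution.

```python
import math

def gauss_gram(v, bw):
    """Gaussian-kernel Gram matrix for a list of scalars."""
    return [[math.exp(-(a - b) ** 2 / (2 * bw ** 2)) for b in v] for a in v]

def weighted_hsic(x, y, w, bw=1.0):
    """Weighted V-statistic: E[kl] + E[k]E[l] - 2 E[E[k|x] E[l|y]].

    w: non-negative importance weights (normalised internally to sum to 1).
    """
    s = sum(w)
    w = [wi / s for wi in w]
    n = len(x)
    K, L = gauss_gram(x, bw), gauss_gram(y, bw)
    Kw = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
    Lw = [sum(L[i][j] * w[j] for j in range(n)) for i in range(n)]
    t1 = sum(w[i] * w[j] * K[i][j] * L[i][j] for i in range(n) for j in range(n))
    t2 = sum(w[i] * Kw[i] for i in range(n)) * sum(w[i] * Lw[i] for i in range(n))
    t3 = sum(w[i] * Kw[i] * Lw[i] for i in range(n))
    return t1 + t2 - 2 * t3
```

With uniform weights the function reduces to the usual biased HSIC estimate, trace(KHLH)/n², so the same simulator runs can be rescored under a different input distribution simply by changing `w`.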


2605.29580 2026-05-29 cs.LG stat.ML

On the Construction and Implications of Low-Loss Valleys in LoRA-based Bayesian Inference

基于LoRA的贝叶斯推理中低损失谷的构造与启示

Daniel Dold, Emanuel Sommer, Julius Kobialka, Oliver Dürr, David Rügamer

发表机构 * HTWG Konstanz（康斯坦茨应用科学大学）； LMU Munich（慕尼黑大学）； Munich Center for Machine Learning (MCML)（慕尼黑机器学习中心）

AI总结本文提出LoRA-Curve方法，通过分段贝塞尔曲线参数化在LoRA空间中连接独立最优解，形成连续低损失谷，并结合平坦极小扰动和JS散度正则化，在不牺牲性能的前提下提高预测分布的互信息，实现功能多样性。

详情

AI中文摘要

虽然低秩适应（LoRA）等参数高效微调方法已成为大型语言模型的标准方法，但对认知不确定性的原则性估计仍然具有挑战性。最近在LoRA机制下的结果表明，深度集成等离散多模态方法相比单模态方法几乎没有优势。这与深度学习中的更广泛观察相矛盾，在深度学习中，集成独立最优解通常能改善泛化，而通过连续低损失谷连接这些模态能进一步增强贝叶斯模型平均（BMA）。LoRA空间中是否存在这种结构，以及它是否能产生局部或离散方法所遗漏的功能多样性，尚未被研究。我们引入了LoRA-Curve，一种在LoRA空间中的分段贝塞尔曲线参数化，包含两种变体：一种自由配置，联合优化所有控制点；另一种锚定配置，连接独立微调的LoRA最优解。我们证明了损失沿曲线的路径连续性和Lipschitz正则性，并通过Qwen2.5 7B在推理和分类基准上的实验表明，线性插值会遇到损失障碍，而我们的锚定多段曲线通过连续低损失谷连接独立最优解。结合平坦极小扰动和詹森-香农散度正则化，LoRA-Curve在不牺牲性能的情况下，可测量地提高了预测分布的互信息，并将连续参数空间遍历与功能多样性联系起来。

英文摘要

While parameter-efficient fine-tuning methods like low-rank adaptation (LoRA) are standard for large language models, principled estimation of epistemic uncertainty remains challenging. Recent results in the LoRA regime suggest that discrete multi-mode approaches such as deep ensembles offer little benefit over single-mode methods. This contradicts broader observations in deep learning, where ensembling independent optima typically improves generalization, and linking these modes through continuous low-loss valleys further enhances Bayesian model averaging (BMA). Whether such structure exists in the LoRA space and whether it yields functional diversity missed by local or discrete methods has not been studied. We introduce LoRA-Curve, a segmented Bézier curve parameterization in the LoRA space, with two variants: a free configuration that jointly optimizes all control points, and an anchored configuration that connects independently fine-tuned LoRA optima. We prove pathwise continuity and Lipschitz regularity of the loss along the curve and empirically show, across reasoning and classification benchmarks with Qwen2.5 7B, that linear interpolation encounters loss barriers, while our anchored multi-segment curves connect independent optima through continuous low-loss valleys. Combined with flat-minima perturbations and a Jensen-Shannon divergence regularizer, LoRA-Curve yields measurably higher mutual information of the predictive distribution without sacrificing performance, and links continuous parameter-space traversal to functional diversity.
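To make the "segmented Bézier curve" parameterization concrete, the sketch below evaluates a piecewise Bézier path through parameter vectors with De Casteljau's algorithm. It is an illustration only: the flat parameter vectors, the uniform split of the curve parameter `s` across segments, and the function names are assumptions, and nothing here reproduces the LoRA-Curve training procedure.

```python
def bezier_point(control_points, t):
    """Evaluate one Bezier segment at t in [0, 1] by repeated linear interpolation."""
    pts = [list(p) for p in control_points]
    while len(pts) > 1:
        pts = [[(1 - t) * a + t * b for a, b in zip(p, q)]
               for p, q in zip(pts, pts[1:])]
    return pts[0]

def segmented_curve(segments, s):
    """Evaluate a chain of Bezier segments at a global parameter s in [0, 1].

    segments: list of control-point lists; consecutive segments are assumed
    to share their end and start points (the anchored optima).
    """
    m = len(segments)
    idx = min(int(s * m), m - 1)       # which segment s falls into
    return bezier_point(segments[idx], s * m - idx)
```

In an anchored configuration, the first and last control points of each segment would be independently fine-tuned optima held fixed, while the interior control points are trained so that the loss stays low along the path.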


2605.29541 2026-05-29 stat.ME q-fin.ST

Change-point estimation for Weibull time series with copula-based Markov models

基于Copula马尔可夫模型的威布尔时间序列变点估计

Li-Hsien Sun, Zong-Yuan Huang, Yi-Ling Huang, Chi-Yang Chiu, Ning Ning

AI总结针对具有非线性序列依赖的时间序列，提出基于Copula的马尔可夫链模型（威布尔边缘分布），通过Clayton和Joe copula捕捉非对称尾部依赖，利用牛顿-拉夫逊算法进行最大似然估计变点，并采用参数自助法构建置信区间。

详情

AI中文摘要

我们研究具有非线性序列依赖的时间序列数据的离线变点估计。为解决此问题，我们提出一个基于Copula的马尔可夫链模型，具有威布尔边缘分布，适用于建模非负数据，如事件时间和波动率度量。通过Clayton和Joe copula纳入非线性依赖，使模型能够分别捕捉非对称下尾和上尾依赖结构。我们推导相应的似然函数，并通过牛顿-拉夫逊算法实现的最大似然估计来估计变点和模型参数。通过参数自助蒙特卡洛程序构建置信区间。进行大量数值研究，评估所提出方法在不同依赖结构和copula误设情况下的有限样本性能和稳健性。结果表明，所提出的估计量在RMSE和相对误差方面表现良好，特别是对于变点的估计。对COVID-19大流行期间VIX指数的实证应用进一步说明了所提出方法在检测边缘分布和序列依赖结构中的结构性变化方面的实际效用。

英文摘要

We study offline change-point estimation for time series data exhibiting nonlinear serial dependence. To address this problem, we propose a copula-based Markov chain model with Weibull marginal distributions, which is suitable for modeling nonnegative data such as event times and volatility measures. Nonlinear dependence is incorporated through the Clayton and Joe copulas, allowing the model to capture asymmetric lower-tail and upper-tail dependence structures, respectively. We derive the corresponding likelihood function and estimate the change point and model parameters using maximum likelihood estimation implemented through the Newton-Raphson algorithm. Confidence intervals are constructed via a parametric bootstrap Monte Carlo procedure. Extensive numerical studies are conducted to evaluate the finite-sample performance and robustness of the proposed method under different dependence structures and copula misspecification scenarios. The results demonstrate that the proposed estimators perform well in terms of RMSE and relative error, particularly for the estimation of the change point. An empirical application to the VIX index during the COVID-19 pandemic further illustrates the practical usefulness of the proposed approach in detecting structural changes in both the marginal distributions and serial dependence structure.
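A copula-based Markov chain with Weibull marginals can be simulated by drawing each uniform state from the conditional copula given the previous state and then applying the Weibull quantile function. The sketch below does this for the Clayton copula only; the parameter names (`theta`, `shape`, `scale`) and the clamping constant are illustrative assumptions, and the code does not implement the paper's likelihood or change-point estimator.

```python
import math
import random

def clayton_conditional_inverse(u, w, theta):
    """Solve dC(u, v)/du = w for v, where C is the Clayton copula (theta > 0)."""
    return (u ** (-theta) * (w ** (-theta / (1 + theta)) - 1) + 1) ** (-1 / theta)

def weibull_quantile(p, shape, scale):
    """Inverse CDF of the Weibull distribution."""
    return scale * (-math.log(1 - p)) ** (1 / shape)

def simulate_chain(n, theta, shape, scale, seed=0):
    """Simulate n observations of a first-order Clayton-copula Markov chain."""
    rng = random.Random(seed)

    def unit():
        # keep draws strictly inside (0, 1) to avoid division by zero
        return min(max(rng.random(), 1e-12), 1 - 1e-12)

    u = unit()
    out = []
    for _ in range(n):
        out.append(weibull_quantile(u, shape, scale))
        u = clayton_conditional_inverse(u, unit(), theta)
    return out
```

A series with a change point could be mimicked by concatenating two runs with different `shape`, `scale`, or `theta`, which is the kind of data the estimator described above is meant to segment.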


2605.29516 2026-05-29 stat.ME math.OC

Active learning strategy for excursion-set confidence regions of functional simulator outputs

面向函数型模拟器输出的超越集置信区域的主动学习策略

Lucas Brunel, Mathieu Balesdent, Loïc Brevault, Rodolphe Le Riche, Bruno Sudret

AI总结提出结合主成分分析和高斯过程回归的代理模型，并引入基于最大-最小准则的主动学习策略，高效估计具有随机输入和函数型输出的函数的超越集置信区域。

详情

AI中文摘要

估计超越集置信区域旨在以给定的置信水平识别函数可能超过某个阈值的区域。本文关注于函数具有随机输入且函数型输出一次性返回的情况下的置信区域估计。我们开发了一种基于代理模型的方法来估计置信区域，结合了主成分分析和高斯过程回归。还引入了一种基于最大-最小准则的主动学习策略，该策略选择可能减少置信区域不确定性的新样本。该策略通过Karhunen-Loève展开利用高斯过程的高效采样。将所提出的方法应用于三个案例研究的置信区域估计：一个合成函数、高超声速飞行器的表面压力系数分布以及可重复使用运载器第一级的滑翔返回轨迹。该方法在准确估计置信区域的同时减少了建模不确定性的来源，表现出高效性。与文献中的参考方法进行了基准比较。讨论了评估置信区域估计性能的相关指标。

英文摘要

Estimating excursion set confidence regions seeks to identify regions where a function may exceed some threshold with a given confidence level. This paper focuses on estimating such confidence regions in cases where the function has random inputs and a functional output that is returned all at once. We develop a surrogate-based approach for estimating the confidence region, combining principal component analysis and Gaussian process regression. An active learning strategy is also introduced, based on a max-min criterion that selects new samples which are likely to reduce the uncertainty in the confidence region. This strategy leverages efficient sampling of the Gaussian process through a Karhunen-Loève expansion. The proposed approach is applied to estimate the confidence regions of three case studies: a synthetic function, the surface pressure coefficient distribution of a hypersonic vehicle, and the glide-back trajectory of a reusable launcher first stage. The method demonstrates efficiency in accurately estimating the confidence region while reducing sources of modeling uncertainties. It is benchmarked against reference methods from the literature. Relevant metrics for assessing the confidence region estimation performance are discussed.


2605.29466 2026-05-29 stat.CO physics.data-an

`pandemonium`: High Dimensional Analysis in Linked Spaces

`pandemonium`: 链接空间中的高维分析

Gabriel McCoy, German Valencia, Ursula Laa

AI总结提出R包pandemonium，通过聚类分析和链接可视化探索预测变量与响应变量之间的关系，核心方法包括非线性降维和动画游览，主要贡献在于提供了一种在双空间中同时可视化和分析高维数据结构的工具。

详情

AI中文摘要

数据分析中的一个常见挑战是在涉及大量预测变量和响应变量的问题中揭示它们之间的关系。当预测变量和响应变量的数量有限时，可视化方法特别有效。我们提出了一个R包pandemonium，旨在通过将聚类分析与链接可视化相结合来探索此类问题。聚类在一组变量中执行，以识别在该空间中具有相似模式的区域。使用基于非线性降维和动画游览的链接视图，同时在两个空间中可视化得到的聚类。我们通过两个示例介绍该包，这些示例说明了不同类型的链接空间。在第一个示例中，我们考虑一组输入变量如何映射到神经网络回归模型中的潜在激活，以识别导致相似激活模式的输入组合。在第二个示例中，我们分析了一个在物理学中出现的复杂多变量数学模型，以研究预测空间中的结构如何与响应相关。

英文摘要

A common challenge in data analysis is uncovering relationships between predictors and responses in problems involving large numbers of both. When the number of predictors and responses is limited, visual approaches are particularly effective. We present an R package, pandemonium, designed to explore such problems by combining cluster analysis with linked visualisations. Clustering is performed in one set of variables to identify regions with similar patterns in that space. The resulting clusters are simultaneously visualised in both spaces using linked views based on non-linear dimension reduction and animated tours. We introduce the package through two examples that illustrate different types of linked spaces. In the first example, we consider how a set of input variables is mapped to latent activations in a neural network regression model, to identify input combinations that result in similar activation patterns. In the second example, we analyse a complex multivariable mathematical model arising in physics to investigate how structure in the predictor space relates to the responses.


2605.29464 2026-05-29 stat.ML cs.LG

Deep Optimal Individualized Treatment Rules for Bivariate Survival Outcomes via Adaptive Prediction-Powered Learning

双变量生存结局的深度最优个体化治疗规则：基于自适应预测驱动学习

Kun Ren, Yifan Cui, Wen Su

发表机构 * Department of Biostatistics, City University of Hong Kong（香港城市大学生物统计学系）； Center for Data Science, Zhejiang University（浙江大学数据科学中心）

AI总结针对随机试验中的双变量生存结局，提出一种基于深度神经网络的自适应预测驱动方法，通过随机策略建模治疗规则并耦合边际加速失效时间模型，以最大化联合生存概率。

2605.29424 2026-05-29 stat.AP cond-mat.soft physics.data-an

Model-free estimation in scattering analysis of microscopy

显微镜散射分析中的无模型估计

Tong Lin, Jinseok Lee, Matt Helgeson, Megan T. Valentine, Yimin Luo, Mengyang Gu

AI总结提出一种基于概率框架的无模型方法MF-AIUQ，通过中间散射函数与均方位移的关系，利用边际最大似然估计从显微镜视频中直接估计均方位移，无需粒子追踪或参数模型。

Comments 18 pages, 6 figures

详情

AI中文摘要

粒子的均方位移通常通过粒子追踪方法从显微镜视频中估计，这些方法依赖手动调整参数，并且在整个滞后时间范围内往往不稳定，尤其是在密集或低对比度情况下。在这项工作中，我们提出了无模型从头不确定性量化方法，这是一种基于概率框架的显微镜视频散射分析无模型方法，它无需分离粒子或链接其轨迹即可估计均方位移。基于累积量定理导出的中间散射函数与均方位移之间的关系，MF-AIUQ通过边际最大似然估计器估计均方位移值。为了降低计算成本，似然函数通过傅里叶变换强度的一个子集来近似。这些强度在傅里叶基函数和对数滞后时间点的对数值上等间距分布。我们发现中间散射函数在这个对数输入空间中是平滑的，并且中间散射函数的信息可以通过这个输入子集捕获。我们通过涵盖几个代表性随机过程的模拟研究和三个实验系统来检验该方法：用于评估在光学密集和明场设置下性能的牛顿流体、具有演化均方位移形状的凝胶系统，以及用于模量估计的粘弹性生物聚合物蜗牛粘液。在这些研究中，MF-AIUQ在整个滞后时间范围内提供了平滑且稳定的均方位移估计，并在粒子追踪不可靠或均方位移参数模型不可用或不可验证的情况下，作为一种有用的补充方法。

英文摘要

The mean squared displacement (MSD) of particles or probes is commonly estimated from microscopy videos using particle tracking approaches, which rely on tuning parameters manually, and are often unstable over the entire lag time range, especially in dense or low-contrast situations. In this work, we propose model-free ab initio uncertainty quantification (MF-AIUQ), a model-free method for scattering analysis of microscopy video based on a probabilistic framework, which estimates MSD without isolating particles and linking their trajectories. Based on the relationship between the intermediate scattering function (ISF) and the MSD derived from the cumulant theorem, MF-AIUQ estimates the MSD values by the marginal maximum likelihood estimator. To reduce the computational cost, the likelihood function is approximated by a subset of Fourier-transformed intensities. These intensities are equally spaced at the logarithmic values of Fourier basis functions and lag time points. We found that the ISF is smooth in this logarithmic input space, and the information of the ISF can be captured by this subset of inputs. We examine the method through simulation studies covering several representative stochastic processes and three experimental systems: a Newtonian fluid for evaluating performance in optically dense and bright-field settings, a gelation system with an evolving MSD shape, and snail mucin, a viscoelastic biopolymer, for modulus estimation. Across these studies, MF-AIUQ provides smooth and stable MSD estimates over the full lag time range and serves as a useful complementary approach in settings where particle tracking is unreliable or a parametric model of MSD is unavailable or unverifiable.
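The link between the ISF and the MSD that the abstract relies on can be illustrated with the standard Gaussian (second-cumulant) approximation, f(q, τ) = exp(−q² · MSD(τ) / (2d)) in d dimensions. The sketch below only encodes that textbook relation and its inverse; it is not the MF-AIUQ estimator, and the default `dim=2` (planar microscopy) is an assumption.

```python
import math

def isf_from_msd(q, msd, dim=2):
    """Intermediate scattering function under the Gaussian approximation."""
    return math.exp(-q * q * msd / (2 * dim))

def msd_from_isf(q, f, dim=2):
    """Invert the same relation to recover the MSD from an ISF value in (0, 1]."""
    return -2 * dim * math.log(f) / (q * q)
```

The inversion becomes numerically fragile when `f` is close to 0 or 1, which is one reason a likelihood-based estimate over many wave vectors and lag times is attractive.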


2605.29415 2026-05-29 eess.IV cs.CV cs.LG eess.SP stat.ML

Constructing efficient channels for ideal observers using the conjugate gradient method

使用共轭梯度法构建理想观察者的高效通道

Weimin Zhou

发表机构 * University of Arizona, Wyant College of Optical Sciences（亚利桑那大学光学科学学院）； University of Arizona, Department of Radiology & Imaging Sciences（亚利桑那大学放射科与成像科学系）

AI总结针对医学成像系统图像质量的任务评估，提出基于共轭梯度（CG）的方法构建高效通道，以近似贝叶斯理想观察者（IO）和霍特林观察者（HO）的性能。

Comments Submitted to the Journal of Medical Imaging (JMI) Special Issue Honoring Dr. Harrison H. Barrett

2605.29413 2026-05-29 q-fin.PM q-fin.MF q-fin.RM q-fin.ST stat.AP

From Classical Optimization to Bayesian Integration: A Comprehensive Analysis of Systematic Portfolio Management

从经典优化到贝叶斯整合：系统性投资组合管理的全面分析

Ajay Kumar Verma, Shravya Barkam

AI总结本文通过十只美国股票在2023年9月至2025年12月期间的数据，比较了均值-方差优化、约束优化、Fama-French五因子回归、蒙特卡洛模拟和Black-Litterman模型等现代投资组合构建方法，分析了约束、风险因子、模拟近似和市场观点对投资组合配置、绩效和稳定性的影响。

详情

AI中文摘要

本文通过选取十只美国股票（TSLA、WMT、BAC、GS、LLY、MRK、GOOG、META、AAPL和XOM），在2023年9月至2025年12月的时间范围内，比较了一系列当代投资组合构建方法。本文探讨了基本的均值-方差优化、约束优化、Fama-French五因子回归建模、蒙特卡洛模拟以及Black-Litterman模型，以确定解的约束、策略的风险因子、模拟近似以及特定的市场观点如何影响投资组合配置、绩效和稳定性。总体而言，结果表明：标准优化可能导致高度集中的投资组合，而约束优化通过改变有效前沿导致投资组合配置发生变化；五因子回归模型表明一种防御性大价值与盈利暴露的基本投资风格；蒙特卡洛近似是一种可行的技术，用于获得均值-方差最优投资组合，前提是模拟次数足够高，尤其是在箱约束下；与标准均值-方差优化相比，Black-Litterman投资组合方法产生了更具经济直觉的配置和更高的稳定性，因为该方法平衡了均衡收益与投资者观点。

英文摘要

This paper compares a series of contemporary portfolio construction approaches using ten U.S. stocks (TSLA, WMT, BAC, GS, LLY, MRK, GOOG, META, AAPL and XOM) over the period from September 2023 to December 2025. The paper explores basic mean-variance optimization, constrained optimization, Fama-French five-factor regression modeling, Monte Carlo simulation, and the Black-Litterman model to determine how constraints on a solution, the risk factors of a strategy, simulated approximations, and specific market views affect portfolio allocation, performance and stability. Overall, the results show the following. Standard optimization may result in highly concentrated portfolios, while constrained optimization changes portfolio allocations by altering the efficient frontier. The five-factor regression models suggest a basic investment style of defensive large-value and profitability exposure. Monte Carlo approximation is a viable technique for arriving at mean-variance optimal portfolios provided the number of simulations is high enough, especially under a box constraint. The Black-Litterman approach produces more economically intuitive allocations and greater stability than standard mean-variance optimization, as it balances equilibrium returns with investor views.
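The Monte Carlo approximation under a box constraint mentioned above can be sketched as a random search over long-only weight vectors. The example below targets the minimum-variance portfolio, a special case of mean-variance optimization chosen here for brevity; the sampling scheme, the `upper` weight cap, and the covariance values in the usage note are illustrative assumptions, not the paper's setup or data.

```python
import random

def portfolio_variance(w, cov):
    """Quadratic form w' * cov * w for list-based inputs."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))

def monte_carlo_min_variance(cov, upper, n_draws=20000, seed=0):
    """Random search for low-variance weights with 0 <= w_i <= upper, sum(w) = 1.

    Starts from equal weights, which is feasible when upper >= 1/n.
    """
    rng = random.Random(seed)
    n = len(cov)
    best = [1.0 / n] * n
    best_var = portfolio_variance(best, cov)
    for _ in range(n_draws):
        raw = [rng.random() for _ in range(n)]
        total = sum(raw)
        w = [r / total for r in raw]
        if max(w) > upper:          # reject draws that violate the box constraint
            continue
        v = portfolio_variance(w, cov)
        if v < best_var:
            best, best_var = w, v
    return best, best_var
```

For example, `monte_carlo_min_variance([[0.04, 0.01, 0.0], [0.01, 0.09, 0.02], [0.0, 0.02, 0.16]], upper=0.6)` returns a feasible weight vector whose variance is no worse than equal weighting; the quality of the approximation improves with `n_draws`.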


2605.29411 2026-05-29 cs.LG cs.AI stat.ME stat.ML

The Good, the Bad, and the Ugly of Markov Boundary for Tabular Prediction

马尔可夫边界在表格预测中的好、坏与丑

Shu Wan, Abhinav Gorantla, Huan Liu, K. Selçuk Candan

发表机构 * Arizona State University（亚利桑那州立大学）

AI总结研究马尔可夫边界在表格预测中的实际效用，发现理论上最优的边界在实践中有条件地提升预测性能，但因果发现方法难以实现其潜力。

Comments 11 pages, 9 figures, 2 tables. Preprint

详情

AI中文摘要

在标准图形假设下，目标变量的马尔可夫边界是使所有其他特征冗余的最小特征集。一旦观察到边界，目标变量与表格的其余部分条件独立。这对于表格预测来说是一个诱人的对象，因为它恰好指出了模型所需的列。然而，现代回归器仍然在完整特征集上训练。我们询问马尔可夫边界是否在SCM3K（一个包含3450个任务的合成SCM基准，特征数量从40到1000，涵盖六个SCM家族）上对预测真正有用，并使用六个回归器进行评估。答案比理论所暗示的要微妙得多。将回归器限制在oracle边界上通常会显著改善预测，并且随着特征空间变得更大更稀疏，改善程度增加。但是，通过因果发现恢复边界并在恢复的掩码上训练的自然流程并不奏效。现有的估计器在达到边界最有帮助的区域之前就耗尽了计算预算，即使它们运行，也很少能击败完整特征集。我们将此归因于三个原因。发现优化的是结构恢复而非预测。假阴性和假阳性具有高度不对称的预测成本。精确边界只是众多击败所有特征的特征集之一。然后，我们阐述了这些事实对于预测对齐的特征选择以及学习使用因果结构的表格模型的意义。

英文摘要

Under standard graphical assumptions, the Markov boundary of a target variable is the smallest set of features that renders every other feature redundant. Once the boundary is observed, the target is conditionally independent of the rest of the table. This is a tempting object for tabular prediction, since it names exactly the columns a model should need. Yet modern regressors are still trained on the full feature set. We ask whether the Markov boundary is genuinely useful for prediction on SCM3K, a 3,450-task synthetic SCM benchmark with feature counts from 40 to 1000 and six SCM families, evaluated with six regressors. The answer is more nuanced than the theory suggests. Restricting a regressor to the oracle boundary often improves prediction substantially, and the improvement grows as the feature space becomes larger and sparser. But the natural pipeline of recovering the boundary with causal discovery and training on the recovered mask does not deliver. Existing estimators exhaust the compute budget before reaching the regime where the boundary helps most, and even where they run they rarely beat the full feature set. We trace this to three causes. Discovery optimizes structural recovery rather than prediction. False negatives and false positives carry sharply asymmetric predictive cost. The exact boundary is only one of many feature sets that beat all features. We then develop what these facts imply for prediction-aligned feature selection and for tabular models that learn to use causal structure.
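When the causal graph is known (the "oracle" setting above), the Markov boundary of a target in a DAG is its parents, its children, and the other parents of its children, assuming faithfulness. The sketch below reads that set off a graph given as a parent map; the representation and node names are illustrative assumptions, and recovering the graph from data is a separate and much harder step.

```python
def markov_boundary(parents, target):
    """Return parents, children, and spouses of `target` in a DAG.

    parents: dict mapping every node to the set of its parent nodes.
    """
    children = {node for node, ps in parents.items() if target in ps}
    spouses = set()
    for child in children:
        spouses |= parents[child]   # co-parents of each child
    boundary = set(parents.get(target, set())) | children | spouses
    boundary.discard(target)
    return boundary
```

For a graph with edges X→A, A→T, T→C and S→C, the call `markov_boundary(parents, "T")` returns `{"A", "C", "S"}`: the grandparent X and any disconnected columns are excluded.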


2605.29403 2026-05-29 stat.ME stat.AP

Power Estimation for Longitudinal Studies with Time Dependent Covariates Using Generalized Method of Moments

使用广义矩方法对含时变协变量的纵向研究进行功效估计

Niloofar Ramezani, Oliver Hurst

AI总结本文针对含时变协变量的纵向研究，提出基于广义矩方法（GMM）的两种功效估计方法（Wald法和距离度量法），填补了GMM框架下缺乏实用功效分析工具的空白。

Comments 27 pages with appendix, 16 pages main manuscript, 3 figures in main manuscript, 7 figures including figures in appendix

详情

AI中文摘要

纵向研究经常包含随时间变化的协变量，这会在结果和预测变量之间产生复杂的依赖结构。当协变量是时变时，标准的功效分析工具——主要针对广义估计方程（GEE）开发——可能会产生误导性结果，因为它们没有考虑有效边际推断所需的矩基结构。广义矩方法（GMM）为在存在时变协变量的情况下估计边际效应提供了一个灵活且高效的框架，但目前尚无实用工具可用于GMM下的功效分析。本文介绍了一个现代、可实施的框架，用于使用GMM对含时变协变量的纵向研究进行功效估计。开发了两种互补方法：一种基于Wald的方法，利用GMM估计量的渐近正态性；另一种基于距离度量的方法，基于样本和总体矩条件的二次型。两种方法仅需有限的分布假设，并依赖于有效的矩条件而非完整的似然设定。我们概述了理论基础，提供了逐步实施指南，并利用骨关节炎倡议的数据说明了这些方法。提出了一个模拟框架用于评估实证性能。这些方法填补了纵向建模文献中的一个关键空白，为应用研究人员提供了一种实用的、分布轻量的功效估计方法，适用于存在时变协变量且GMM是首选估计技术的情况。

英文摘要

Longitudinal studies frequently incorporate covariates that evolve over time, creating complex dependence structures between outcomes and predictors. When covariates are time-dependent, standard power analysis tools, largely developed for generalized estimating equations (GEE), can yield misleading results because they do not account for the moment-based structure required for valid marginal inference. Generalized Method of Moments (GMM) provides a flexible and efficient framework for estimating marginal effects in the presence of time-dependent covariates, yet no practical tools exist for conducting power analysis under GMM. This paper introduces a modern, implementable framework for power estimation in longitudinal studies with time-dependent covariates using GMM. Two complementary approaches are developed: a Wald-based method that leverages the asymptotic normality of GMM estimators, and a distance metric method based on quadratic forms of sample and population moment conditions. Both approaches require only limited distributional assumptions and rely on valid moment conditions rather than full likelihood specification. We outline the theoretical foundations, provide step-by-step implementation guidance, and illustrate the methods using data from the Osteoarthritis Initiative. A simulation framework is presented for evaluating empirical performance. These methods fill a critical gap in the longitudinal modeling literature by offering applied researchers a practical, distribution-light approach to power estimation when time-dependent covariates are present and GMM is the preferred estimation technique.
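The Wald-based route can be illustrated for a single coefficient: if the estimator is asymptotically normal with standard error `se`, the power of a two-sided level-`alpha` test follows from the normal CDF. The sketch below is a generic textbook calculation under that assumption; how `se` is obtained from the GMM moment conditions is exactly what the paper addresses and is not reproduced here.

```python
from statistics import NormalDist

def wald_power(beta, se, alpha=0.05):
    """Approximate power of a two-sided Wald test of H0: beta = 0.

    beta: assumed true effect; se: asymptotic standard error of its estimator.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = beta / se
    return z.cdf(shift - crit) + z.cdf(-shift - crit)
```

Because `se` typically shrinks like one over the square root of the number of subjects, the function can be scanned over candidate sample sizes to find the smallest design that reaches a target power.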


2605.29395 2026-05-29 stat.ME stat.ML

注意力作为上下文经验贝叶斯：通过粒子动力学的两阶段视角

Matthew Smart, Soumya Ganguly, Nilava Metya, Alexandre V. Morozov, Anirvan M. Sengupta

发表机构 * Lewis-Sigler Institute for Integrative Genomics（利斯-西格尔整合基因组研究所）； Princeton University（普林斯顿大学）； Department of Mathematics（数学系）； Rutgers University（罗格斯大学）； Department of Physics and Astronomy（物理与天文学系）； Center for Computational Quantum Physics and Center for Computational Mathematics（计算量子物理中心和计算数学中心）； Flatiron Institute（Flatiron研究所）； Simons Foundation（西蒙斯基金会）

AI总结本文通过粒子动力学将最小注意力仅变换器解释为两阶段经验贝叶斯过程，揭示了深度和注意力残差的统计角色，并证明无需显式噪声调度即可实现有效去噪。

Comments 52 pages, 5 figures

详情

AI中文摘要

我们研究了在所有标记损坏情况下的最小注意力仅变换器，并表明它们具有两阶段经验贝叶斯解释。单个注意力步骤计算相对于由上下文定义的经验分布的核加权后验均值。深度通过粒子动力学（阶段1）细化该分布，而长程跳跃连接将噪声输入作为查询用于后验推断（阶段2），揭示了深度和注意力残差的独特统计角色。该框架隔离了一个最小设置，其中上下文本身诱导了一个控制上下文推断的深度依赖能量景观。我们表明，无需显式噪声调度即可出现有效去噪：固定的核带宽和有限的积分范围就足够了，从而产生了一个有原则的深度-噪声关系。我们进一步为一类表现良好的先验建立了后验均值恢复保证，其中经验估计器在渐近条件下收敛到贝叶斯最优预测器。将这些动力学与反向扩散极限联系起来，我们的结果为注意力作为通过基于样本的后验估计进行上下文推断提供了统计解释，无需显式密度建模。

英文摘要

We study minimal attention-only transformers under all-token corruption and show they admit a two-stage empirical Bayes interpretation. A single attention step computes a kernel-weighted posterior mean with respect to the empirical distribution defined by the context. Depth refines this distribution through particle dynamics (Stage 1), while a long-range skip-connection carries the noisy input as a query for posterior inference (Stage 2), revealing distinct statistical roles for depth and attention residuals. The framework isolates a minimal setting in which the context itself induces a depth-dependent energy landscape governing in-context inference. We show that effective denoising can emerge without an explicit noise schedule: a fixed kernel bandwidth and finite integration horizon suffice, yielding a principled depth-noise relationship. We further establish a posterior-mean recovery guarantee for a class of well-behaved priors, where the empirical estimator converges to the Bayes-optimal predictor under asymptotic conditions. Connecting these dynamics to reverse-diffusion limits, our results provide a statistical interpretation of attention as in-context inference via sample-based posterior estimation, without explicit density modeling.
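The claim that one attention step computes a kernel-weighted posterior mean can be checked in the simplest Gaussian case. If a clean token is drawn uniformly from the context and observed with isotropic Gaussian noise of scale `sigma`, the posterior mean given the noisy query is a softmax-weighted average of the context tokens. The sketch below implements only this identity; treating `sigma` as the kernel bandwidth and the function name are assumptions, and no depth or particle dynamics are modelled.

```python
import math

def posterior_mean(query, context, sigma):
    """Posterior mean of the clean token under an empirical prior on `context`.

    Returns (mean_vector, weights); weights are a softmax over
    -||query - x_j||^2 / (2 * sigma^2).
    """
    logits = [-sum((q - c) ** 2 for q, c in zip(query, x)) / (2 * sigma ** 2)
              for x in context]
    m = max(logits)                         # subtract max for numerical stability
    w = [math.exp(l - m) for l in logits]
    s = sum(w)
    w = [v / s for v in w]
    dim = len(query)
    mean = [sum(w[j] * context[j][k] for j in range(len(context))) for k in range(dim)]
    return mean, w
```

Small `sigma` snaps the output to the nearest context token, while large `sigma` returns the context centroid, which gives some intuition for why a fixed bandwidth interacts with depth.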


2605.29315 2026-05-29 econ.EM stat.ME

支付网络中的因果标签恢复

Gaurav Dhama

发表机构 * Mastercard（万事达卡）

AI总结针对支付网络中标签存在的四种系统偏差，提出序列三重稳健（STR）估计器，同时纠正所有偏差并达到半参数效率界，实现基于数天而非数月数据的训练。

Comments 49 pages

详情

AI中文摘要

支付网络中的欺诈检测模型依赖于存在系统性偏差的退单标签进行训练。每个标签必须依次经过三个门控：授权（被拒绝的交易不产生标签）、发卡行报告（未报告的欺诈不可见）和延迟（待处理的退单在训练时缺失）。到达的标签可能因第一方滥用或发卡行错误分类而受损。配套论文[arXiv:2605.27557]证明这四种损害对检测性能施加了极小极大下界。本文问：能否达到该下界？我们将观测流程形式化为一个具有三个倾向阶段和一个损坏层的顺序缺失数据问题，并构建了序列三重稳健（STR）估计器。STR同时纠正所有四种损害，并达到半参数效率界——没有估计器能具有更低的渐近方差。它是序列三重稳健的：在每个门控处，一致性仅要求倾向模型或结果回归中有一个正确指定，而非两者。我们提供了通过噪声率调整的伪标签进行损坏校正、通过经验贝叶斯收缩稳定小发卡行的逆倾向权重、提供有效置信区间的插件方差估计量，以及用于有限样本保证的伯恩斯坦集中不等式。在操作层面，我们推导了最优训练延迟——使标签质量损失和模型过时之和最小化的成熟窗口——并证明STR允许使用数天而非数月前的数据进行训练，将模型新鲜度与退单成熟周期解耦。对于任何样本量，STR在均方误差上严格优于基于退单的朴素训练。

英文摘要

Fraud detection models in payment networks train on chargeback labels that are systematically biased. Every label must survive three sequential gates: authorization (declined transactions generate no labels), issuer reporting (unreported fraud is invisible), and delay (pending chargebacks are missing at training time). Labels that do arrive may be corrupted by first-party misuse or issuer misclassification. A companion paper [arXiv:2605.27557] proved that these four impairments impose a minimax lower bound on detection performance. This paper asks: can that bound be achieved? We formalize the observation pipeline as a sequential missing-data problem with three propensity stages and a corruption layer, and construct the Sequential Triply Robust (STR) estimator. The STR corrects for all four impairments simultaneously and achieves the semiparametric efficiency bound: no estimator can have lower asymptotic variance. It is sequentially triply robust: at each gate, consistency requires only that either the propensity model or the outcome regression is correctly specified, not both. We provide corruption correction via noise-rate-adjusted pseudo-labels, empirical Bayes shrinkage to stabilize inverse-propensity weights for small issuers, a plug-in variance estimator yielding valid confidence intervals, and a Bernstein concentration inequality for finite-sample guarantees. On the operational side, we derive the optimal training delay (the maturity window that minimizes the sum of label-quality loss and model staleness) and prove that the STR permits training on data that is days old rather than months old, decoupling model freshness from the chargeback maturity cycle. The STR provably dominates naive chargeback-based training in mean squared error for any sample size.
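The "either the propensity model or the outcome regression" property at a single gate is the familiar doubly robust (augmented inverse-propensity-weighted) construction. The sketch below estimates a mean label when some labels are missing; it covers one gate only, ignores label corruption, and uses hypothetical argument names, so it should be read as background for the STR idea rather than as the STR estimator.

```python
def aipw_mean(y, observed, propensity, outcome_pred):
    """Doubly robust estimate of the mean label under missing labels.

    y: labels (value ignored where observed is 0, may be None).
    observed: 1 if the label passed the gate, else 0.
    propensity: modelled probability of passing the gate.
    outcome_pred: modelled expected label for every unit.
    """
    total = 0.0
    for yi, r, e, m in zip(y, observed, propensity, outcome_pred):
        correction = (yi - m) / e if r else 0.0   # residual, reweighted by 1/e
        total += m + correction
    return total / len(y)
```

If `outcome_pred` is correct, the residual corrections average to zero whatever the propensities are; if the propensities are correct, the reweighted residuals repair a wrong `outcome_pred`. Chaining such corrections across gates is the intuition behind sequential robustness.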


2605.29255 2026-05-29 stat.ME

船舶涂层破损预测与检查规划

Huy Truong-Ba, Michael E. Cholette, Geoffrey Will, Marc Hartmann

AI总结采用幂律非齐次泊松过程（PL-NHPP）和分层贝叶斯方法，解决数据稀缺下船舶涂层缺陷预测问题，并优化检查规划以降低生命周期成本。

详情

AI中文摘要

海洋腐蚀显著降低船舶可用性，增加运营成本并可能影响安全。防护涂层可缓解这些风险，但其有效性随时间恶化。早期检测涂层破损对于防止昂贵的维修和安全问题至关重要。虽然腐蚀本身已被充分理解，但由于缺乏长期数据，涂层退化仍研究不足。本文通过增强涂层缺陷预测和优化船舶检查规划来填补这一知识空白。采用幂律非齐次泊松过程（PL-NHPP）对涂层缺陷到达进行建模。与先前研究不同，我们采用分层贝叶斯方法进行参数拟合，有效解决了与稀缺真实数据相关的局限性。此外，我们通过考虑停运成本和延迟维修导致的潜在成本增加来优化检查规划。通过一项涉及最近投入使用且历史数据有限的船队的综合案例研究，评估了这些方法的有效性。本研究通过实现更准确的涂层破损预测和在船队寿命早期优化检查计划，推动了船舶基于状态的维护（CBM）策略的发展。该方法最终提高了运营效率并降低了生命周期成本。

英文摘要

Marine corrosion significantly reduces a ship's availability, increases operating costs, and could impact safety. Protective coatings mitigate these risks, but their effectiveness deteriorates over time. Early detection of coating breakdown is crucial to prevent costly repairs and safety concerns. While corrosion itself is well understood, coating degradation remains under-investigated due to insufficient long-term data. This work addresses this knowledge gap by enhancing coating defect prediction and optimizing inspection planning for ships. The Power Law Non-Homogeneous Poisson Process (PL-NHPP) is utilized for modeling coating defect arrivals. Unlike prior studies, we employ a hierarchical Bayesian approach for parameter fitting, effectively addressing limitations associated with scarce real-world data. Furthermore, we optimize inspection planning by incorporating out-of-service costs and potential cost increases due to delayed repairs. The efficacy of these methods is evaluated through a comprehensive case study involving a recently commissioned fleet with limited historical data. This research contributes to the advancement of condition-based maintenance (CBM) strategies for ships by enabling more accurate prediction of coating breakdowns and optimizing inspection schedules early in the life of the fleet. This approach ultimately improves operational efficiency and reduces life-cycle costs.
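A power-law NHPP is fully described by its cumulative intensity, so expected defect counts between inspections follow directly from it. The sketch below assumes the parameterization Λ(t) = (t / scale)^shape, which is one common convention and may differ from the paper's; the hierarchical Bayesian fitting and cost optimization are not shown.

```python
import math

def expected_defects(t0, t1, scale, shape):
    """Expected number of defect arrivals in (t0, t1] under Lambda(t) = (t/scale)**shape."""
    return (t1 / scale) ** shape - (t0 / scale) ** shape

def prob_no_defect(t0, t1, scale, shape):
    """Probability of zero arrivals in (t0, t1], from the Poisson count distribution."""
    return math.exp(-expected_defects(t0, t1, scale, shape))
```

With `shape > 1` the arrival rate grows with coating age, so a fixed inspection interval sees more expected defects later in life; that trade-off is what an inspection-planning optimization has to price.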


2605.29193 2026-05-29 stat.AP

Bayesian reversal of the liquid level trajectory in a draining tank for pollution forensics

用于污染溯源的排水罐中液位轨迹的贝叶斯反演

Kyla D. Jones, Gbenga Fabusola, Alexander W. Dowling, Cory M. Simon

AI总结针对污染事件中未知初始液位的反问题，提出基于贝叶斯统计反演的框架，结合托里拆利定律物理模型和经验偏差函数，从最终液位和排水时长推断初始液位并量化不确定性。

详情

AI中文摘要

危险液体储罐在工业和农业中很常见。在污染事件中，液体可能通过小孔、裂缝或管道从储罐中排出。控制泄漏后，估算排出的液体体积对于公共安全、监管评估和修复至关重要。当原始库存未知时，这构成一个反问题。在这项工作中，我们提出了一个框架，用于从污染事件后观察到的最终液位和排水时长的估计值推断部分排空储罐中的初始液位。由于排水动力学、模型参数和观测值存在不确定性，我们采用贝叶斯统计反演将先验物理知识与实验液位时间序列数据相结合，以预测具有量化不确定性的初始液位。我们使用基于托里拆利定律的物理模型来描述储罐排水动力学，并通过经验偏差函数对其进行增强，以解释缺失或不完美建模的物理过程。在我们用水进行储罐排水的实验中，我们发现推断的初始液位是准确的，尽管不确定性随着排水时长的增加而增加。除了应用于污染溯源外，这项工作还可以作为动手课堂项目，说明动态建模、模型偏差和贝叶斯推断。
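The deterministic core of the inverse problem can be sketched from Torricelli's law alone. For a tank of constant cross-section, √h decreases linearly in time, so the initial level follows in closed form from the final level and the drainage duration. The sketch below assumes an ideal orifice (discharge coefficient 1), a constant tank area, and SI units; it omits the empirical discrepancy function and the Bayesian treatment of uncertainty that the abstract describes.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def level_after(h0, t, hole_area, tank_area):
    """Liquid level after draining for t seconds from initial level h0 (metres)."""
    c = (hole_area / tank_area) * math.sqrt(2 * G)
    root = math.sqrt(h0) - 0.5 * c * t
    return max(root, 0.0) ** 2          # the tank cannot drain below empty

def initial_level(h_final, t, hole_area, tank_area):
    """Reverse the trajectory: initial level implied by the final level and duration."""
    c = (hole_area / tank_area) * math.sqrt(2 * G)
    return (math.sqrt(h_final) + 0.5 * c * t) ** 2
```

Because the reconstructed level grows with `t`, any uncertainty in the drainage duration is amplified for long events, consistent with the reported growth in uncertainty with duration.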

深度网络会忘记初始化吗？实用归纳偏见的遗忘时间视角

Mohua Das, Pierfrancesco Beneventano, Shibshankar Dey, Gareth H. McKinley, Tomaso Poggio

发表机构 * MIT（麻省理工学院）； Northwestern University（西北大学）

AI总结通过引入初始化记忆度量，研究随机初始化对训练后预测器的影响，发现低学习率SGD保留初始化记忆而Adam族方法遗忘，且遗忘动力学与泛化正则化相关。

Comments 39 pages, 9 figures

详情

AI中文摘要

随机初始化的神经网络在函数上诱导先验，但实践中使用的预测器仅在训练后产生。我们询问这种初始偏差有多少在训练流程中幸存。为了使问题可测量，我们引入初始化记忆：验证选择的预测器对随机初始化尺度的依赖性。我们在ResNet上进行了受控的CIFAR-10实验，其中初始化记忆已经尖锐地分离了训练机制。低学习率SGD可以在记住初始化的同时进行插值：在批大小$b=128$的ResNet-9上，尽管训练准确率$\ge99.5\%$，测试准确率在不同初始化尺度上变化$26.5$个百分点。这不是欠训练：将相同的低学习率机制扩展到$5{,}000$个epoch，差异基本不变。相比之下，Adam族方法在很大程度上消除了这种依赖性。当较大的学习率与显式$L_2$范数控制配对时，SGD也可以被遗忘。我们根据遗忘的时间尺度解释这些发现：梯度流式动力学可以保留初始化记忆，而随机有限步效应、显式范数衰减和自适应预处理在由显式或隐式正则化大小控制的尺度上消除它。因此，训练网络的实用归纳偏见不仅仅是架构先验，而是经过训练流程遗忘动力学过滤后的架构先验；并且改善泛化的相同正则化器正是那些消除初始化记忆的。

英文摘要

Randomly initialized neural networks induce a prior over functions, but the predictor used in practice is produced only after training. We ask how much of this initial bias survives the training pipeline. To make the question measurable, we introduce initialization memory: the dependence of the validation-selected predictor on the scale of the random initialization. We perform controlled CIFAR-10 experiments on ResNets where initialization memory already sharply separates training regimes. Low-learning-rate SGD can interpolate while still remembering its initialization: on ResNet-9 with batch size $b=128$, test accuracy varies by $26.5$ percentage points across initialization scales despite $\ge99.5\%$ training accuracy. This is not undertraining: extending the same low-learning-rate regime to $5{,}000$ epochs leaves the spread essentially unchanged. In contrast, Adam-family methods largely erase the dependence. SGD can also be made to forget when larger learning rates are paired with explicit $L_2$ norm control. We interpret these findings in terms of the time scale of forgetting: gradient-flow-like dynamics can preserve initialization memory, whereas stochastic finite-step effects, explicit norm decay, and adaptive preconditioning erase it on scales governed by the size of explicit or implicit regularization. The practical inductive bias of a trained network is therefore not the architectural prior alone, but the architectural prior after being filtered by the forgetting dynamics of the training pipeline; and the same regularizers that improve generalization are precisely those that erase memory of initialization.

2605.29148 2026-05-29 cs.LG stat.ML

Optimal Gap-Dependent Regret for Private Stochastic Decision-Theoretic Online Learning

私有随机决策理论在线学习的最优间隙相关遗憾

Tommaso Cesari, Roberto Colomboni

发表机构 * School of Electrical Engineering and Computer Science University of Ottawa（电气工程与计算机科学学院，渥太华大学）； School of Mathematics University of Bristol（数学学院，布里斯托尔大学）

AI总结针对完全信息、事件级纯差分隐私的随机决策理论在线学习，提出一种不依赖时域的纯差分隐私算法，并证明遗憾界为O(log K / Δ_min + log K / ε)。

AI中文摘要

我们研究具有完全信息和事件级纯差分隐私的随机决策理论在线学习。Hu和Mehta在COLT上提出的一个开放问题要求确定在纯事件级差分隐私下，随机决策理论在线学习的最优间隙相关遗憾率。对于$K$个动作，损失在$[0,1]$中，且唯一最优动作与次优动作的间隙为$Δ_{\min}$，已知下界的阶为$ \frac{\log K}{\min\{Δ_{\min},\varepsilon\}} $，或等价地，在相差通用常数的意义下，其阶为\[ \frac{\log K}{Δ_{\min}}+\frac{\log K}{\varepsilon}. \]我们给出一个不依赖时域的纯DP算法，并证明对于任意时域$T$，均有显式遗憾界\[ \operatorname{Reg}_T \le 1000 \cdot \left(\frac{\log K}{Δ_{\min}}+\frac{\log K}{\varepsilon}\right). \]数值常数未经优化。该算法将时间划分为大小指数增长的块，在每个块内始终执行单个动作，并通过对前一个块的与数据无关的随机前缀应用指数机制来选择下一个动作。随机前缀将块遗憾转化为所有前缀长度上softmax选择误差的和。单个熵势论证以代价$\log K/\varepsilon$控制所有由隐私主导的大间隙动作。

英文摘要

We study stochastic decision-theoretic online learning with full information and event-level pure differential privacy. A COLT open problem of Hu and Mehta asks to determine the optimal gap-dependent regret rate for stochastic decision-theoretic online learning under pure event-level differential privacy. For $K$ actions, losses in $[0,1]$, and a unique best action separated from the second-best action by gap $Δ_{\min}$, the known lower bound is of order $ \frac{\log K}{\min\{Δ_{\min},\varepsilon\}}, $ or equivalently, up to universal constants, of order \[ \frac{\log K}{Δ_{\min}}+\frac{\log K}{\varepsilon}. \] We give a horizon-free pure-DP algorithm and prove the explicit regret bound \[ \operatorname{Reg}_T \le 1000 \cdot \left(\frac{\log K}{Δ_{\min}}+\frac{\log K}{\varepsilon}\right) \] for every horizon $T$. The numerical constant is not optimized. The algorithm partitions time into blocks of exponentially increasing size, plays a single action throughout each block, and chooses the next action by an exponential mechanism applied to a data-independent random prefix of the previous block. The random prefix converts block regret into a sum, over all prefix lengths, of softmax selection errors. A single entropy-potential argument controls all privacy-dominated large-gap actions at cost $\log K/\varepsilon$.
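
The claim that the two forms of the lower bound agree "up to universal constants" follows from an elementary inequality. A short worked derivation, using the abstract's own symbols and the fact that a maximum of two positive numbers lies between half their sum and their sum:

```latex
\[
\frac{1}{\min\{Δ_{\min},\varepsilon\}}
  = \max\Bigl\{\frac{1}{Δ_{\min}},\frac{1}{\varepsilon}\Bigr\}
  \le \frac{1}{Δ_{\min}}+\frac{1}{\varepsilon}
  \le 2\max\Bigl\{\frac{1}{Δ_{\min}},\frac{1}{\varepsilon}\Bigr\}
  = \frac{2}{\min\{Δ_{\min},\varepsilon\}} .
\]
```

Multiplying through by $\log K$ shows that the two expressions differ by at most a factor of $2$.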

2605.29139 2026-05-29 stat.ML cs.LG

Anytime-Valid Federated Conformal RAG for LLM Swarms

面向LLM群体的任意有效联邦共形RAG

Prasanjit Dubey, Xiaoming Huo

发表机构 * H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology（H.米尔顿·斯图尔特工业与系统工程学院，佐治亚理工学院）

AI总结提出Anytime-FC-RAG，通过可累积的逐步校准偏差预算和截断投注e过程，将联邦共形RAG扩展到任意停止时间均有效的序贯覆盖，并保证时间均匀报警有效性、Hoeffding拼接累积误覆盖包络及自适应控制下的安全性。

AI中文摘要

联邦共形RAG（FC-RAG）为带宽受限的弱语言模型群体提供了无分布假设的覆盖保证，但仅限于固定时间范围。我们将其扩展到任意有效序贯覆盖：在每个停止时间均有效，且在可预测自适应控制（重新校准、每节点带宽升级、蒸馏学生刷新）下保持不变，且无需比固定时间范围FC-RAG更多的假设。朴素组合失败，因为FC-RAG的边缘覆盖界使得投注e过程在不利校准抽取下成为非超鞅，无法调用Ville不等式。我们提出Anytime-FC-RAG，这是一种序贯扩展，基于可累加的逐步校准偏差预算，将边缘界转换为校准好事件上的严格条件界，并配以在整个概率空间上为非负超鞅的截断投注e过程。由这两个要素，我们获得四个保证：时间均匀报警有效性$\mathbb{P}(\sup_t E_t \ge 1/δ_e) \le δ_e + δ_{\mathrm{cal}}$，相同总预算下的Hoeffding拼接累积误覆盖包络，任何可预测控制器（重新校准、带宽升级、学生刷新）下的安全性，以及通过可累加训练预算在无界序列的联邦探针-逻辑蒸馏（FPLD）刷新上的训练侧误差传播。实际结果是，仅在e过程超过警告阈值时升级检索带宽的自适应控制器，以显著更低的通信成本匹配固定高带宽调度的报警率。在GPT-2-small + MiniLM群体上对MMLU、DBpedia和AG News的实验验证了预测的报警率、检测延迟、包络覆盖以及14%-57%的带宽节省；报警仅在覆盖真正失效时触发。

英文摘要

Federated Conformal RAG (FC-RAG) provides distribution-free coverage for a bandwidth-limited swarm of weak language models, but only at a fixed horizon. We extend it to anytime-valid sequential coverage: validity at every stopping time, preserved under predictable adaptive control (recalibration, per-node bandwidth escalation, distilled-student refresh), at no extra cost in assumptions over fixed-horizon FC-RAG. Naive composition fails because FC-RAG's marginal coverage bound makes the betting e-process a non-supermartingale on adverse calibration draws, and Ville's inequality cannot be invoked. We give Anytime-FC-RAG, a sequential extension built on a summable per-step calibration-deviation budget that converts the marginal bound into a strict conditional bound on a calibration-good event, paired with a truncated betting e-process that is a nonnegative supermartingale on the entire probability space. From these two ingredients, we obtain four guarantees: time-uniform alarm validity $\mathbb{P}(\sup_t E_t \ge 1/δ_e) \le δ_e + δ_{\mathrm{cal}}$, a Hoeffding-stitched cumulative-miscoverage envelope at the same total budget, safety under any predictable controller (recalibration, bandwidth escalation, student refresh), and training-side error propagation across an unbounded sequence of Federated Probe-Logit Distillation (FPLD) refreshes via a summable training budget. As a practical consequence, an adaptive controller that escalates retrieval bandwidth only when the e-process crosses a warning threshold matches the alarm rate of a fixed-high-bandwidth schedule at substantially lower communication cost. Experiments on a GPT-2-small + MiniLM swarm across MMLU, DBpedia, and AG News verify the predicted alarm rate, detection delay, envelope coverage, and $14$-$57\%$ bandwidth savings; the alarm fires when and only when coverage genuinely breaks.
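
To make the alarm rule concrete, here is a minimal sketch of a generic betting e-process with a Ville-style threshold. It is an illustration under stated assumptions, not Anytime-FC-RAG itself: the function name, the fixed `bet` size, and the 0/1 miscoverage stream are hypothetical, and the truncation and calibration-deviation budget described in the abstract are omitted. If each step's conditional miscoverage probability is at most `alpha`, the wealth is a nonnegative supermartingale, so Ville's inequality bounds the chance of ever reaching `1/delta` by `delta`.

```python
from typing import Iterable, Optional


def betting_alarm_time(miscovered: Iterable[int], alpha: float,
                       bet: float, delta: float) -> Optional[int]:
    """Return the first 1-indexed step where wealth reaches 1/delta, else None.

    miscovered: stream of 0/1 indicators (1 = the prediction set missed).
    alpha:      target miscoverage rate under the null.
    bet:        fixed betting fraction (hypothetical; must be in [0, 1/alpha]).
    delta:      alarm level; the alarm fires when wealth >= 1/delta.
    """
    if not 0.0 <= bet <= 1.0 / alpha:
        raise ValueError("bet must lie in [0, 1/alpha] to keep wealth nonnegative")
    wealth = 1.0
    for step, miss in enumerate(miscovered, start=1):
        # Wealth grows on a miss and shrinks on a cover; expected factor <= 1 under the null.
        wealth *= 1.0 + bet * (miss - alpha)
        if wealth >= 1.0 / delta:
            return step
    return None
```

An adaptive controller of the kind the abstract describes could watch the same wealth value against a lower warning threshold before the alarm level is reached; that part is not sketched here.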

2605.29112 2026-05-29 stat.ME

Efficient First-Order Methods for Estimating Generalized Additive Index Models

估计广义可加指标模型的高效一阶方法

Ziyu Peng, Linglingzhi Zhu, Yao Xie

AI总结针对广义可加指标模型（GAIM）的序贯估计计算效率低的问题，提出基于基展开的梯度下降和变分不等式算法，实现同时估计，并证明收敛到稳定点，数值实验显示优于经典方法。

AI中文摘要

广义可加指标模型（GAIM）提供了一个灵活的半参数框架，用于捕捉复杂的数据关系，平衡了参数模型的可解释性和非参数方法的灵活性。然而，经典的GAIM逐阶段估计方法由于其序贯性质和对非参数平滑的依赖而遭受计算效率低下的问题。为了克服这些缺点，我们提出了高效的GAIM同时估计算法。通过利用基展开，我们将半参数估计任务转化为一个有限维优化问题，该问题可以通过一阶方法（如梯度下降（GD））求解。此外，我们引入了一种变分不等式（VI）估计算法，将VI框架从广义线性模型扩展到GAIM。我们为两种算法提供了统一的收敛到稳定点的结果。数值实验突出了我们的方法相对于经典逐阶段方法的计算和统计优势，并揭示了基于VI的方法在非规范链接函数上相对于GD的潜在优势。

英文摘要

Generalized additive index models (GAIMs) offer a flexible semiparametric framework for capturing complex data relationships, balancing the interpretability of parametric models with the flexibility of nonparametric approaches. However, classical stage-wise estimation procedures for GAIMs suffer from computational inefficiencies due to their sequential nature and reliance on nonparametric smoothing. To overcome these drawbacks, we propose efficient, simultaneous estimation algorithms for GAIMs. By leveraging basis expansion, we cast the semiparametric estimation task as a finite-dimensional optimization problem solvable by first-order methods such as gradient descent (GD). Furthermore, we introduce a variational inequality (VI) estimation algorithm, extending the VI framework from generalized linear models to GAIMs. We provide a unified convergence result to a stationary point for both algorithms. Numerical experiments highlight the computational and statistical advantages of our methods over classical stage-wise procedures, and reveal the potential benefits of the VI-based approach over GD for non-canonical link functions.

2605.29081 2026-05-29 stat.ME

Bayesian Inference of Mixing and Transmission Heterogeneity in Stratified Disease Surveillance Models

分层疾病监测模型中混合与传播异质性的贝叶斯推断

Miles Moran, Rob Trangucci, Lisa Madsen

AI总结提出一种贝叶斯潜变量扩展的地方性-流行性（EE）模型，用于从分层疾病监测数据中推断未观测的个体传播性、疾病发病率与流行率的分离以及人口组间混合结构。

AI中文摘要

当传染病发病率的监测数据（如每周病例数）按人口统计指标分层时，这些群体之间长期健康结果的差异变得明显。准确识别高风险亚群将使政策制定者能够在流行病早期进行针对性干预；然而，疾病发病的时间模型通常缺乏对多变量（即亚群水平）结果的稳健处理。我们提出了一种新颖的贝叶斯潜变量扩展，用于通常为此目的使用的地方性-流行性（“EE”）建模框架。具体来说，我们通过明确表示未观测的个体水平传播性、明确分离疾病发病率和流行率以及参数化估计人口组间混合结构来增强EE模型类。得到的模型可以针对罕见疾病（高度地方性）或暴发驱动（高度流行性）背景进行定制，并且能够仅从发病率数据推断社会接触混合模式，包括多重分层数据中的混合模式。为了演示，我们进行了一项模拟研究，在预期的罕见疾病应用情形下将我们的模型与现有的双重分层EE模型进行比较。然后，我们将我们的推断与对比模型对2011-2015年柏林诺如病毒胃肠炎实际发病率数据（按六个年龄组和十二个地理区域分层）的推断进行比较。最后，我们报告了我们的模型对疫情第一年密歇根州记录的COVID-19发病率（按六个年龄组和六十六个地理区域分层）的推断。

英文摘要

When surveillance data of infectious disease incidence (e.g. weekly case counts) are disaggregated by demographic indicators, disparities in long-run health outcomes between these groups become apparent. Accurate identification of high-risk subpopulations would enable policy-makers to target interventions early in an epidemic; however, temporal models of disease incidence typically lack robust treatment of multivariate (i.e. subpopulation-level) outcomes. We propose a novel Bayesian latent-variable extension of the endemic-epidemic ("EE") modeling framework commonly used for this purpose. Specifically, we augment the EE model class with explicit representation of unobserved individual-level transmissibility; explicit separation of disease incidence and prevalence; and parametric estimation of between-demographic-groups mixing structure. The resulting model may be tailored for either rare-disease (highly-endemic) contexts or outbreak-driven (highly-epidemic) contexts, and is capable of inferring social contact mixing patterns from incidence data alone, including mixing patterns among multiply-stratified data. To demonstrate, we conduct a simulation study comparing our model to an existing doubly-stratified EE model in the intended rare-disease application regime. We then compare our inference to the competitor's for real incidence data of norovirus gastroenteritis in Berlin, 2011-2015, disaggregated by six age groups and twelve geographic regions. Finally, we report inference of our model on COVID-19 incidence recorded in Michigan during the first year of the pandemic, disaggregated by six age groups and sixty-six geographic regions.

2605.29066 2026-05-29 math.ST math.PR stat.TH

A scale-free density bound for Gaussian maxima

高斯最大值的一个无尺度密度界

Suhas Vijaykumar

AI总结针对中心化高斯向量的最大值，推导了一个无尺度密度界，该界非均匀、对数依赖于维数且适用于任意协方差矩阵，并应用于高维假设检验和反集中性等。

2605.29032 2026-05-29 cs.LG stat.ML

Theoretical Foundations and Effective Algorithms for Policy-Aware Simulator Learning

策略感知模拟器学习的理论基础与有效算法

Christoph Dann, Yishay Mansour, Mehryar Mohri

发表机构 * Google Research（谷歌研究）； Tel Aviv University（特拉维夫大学）； Courant Institute of Mathematical Sciences（库朗数学科学研究所）

AI总结针对基于模型的强化学习中的模拟器利用问题，提出以战略鲁棒性为目标，通过零和极小极大博弈学习模拟器，并给出理论保证与有效算法。

AI中文摘要

基于模型的强化学习（MBRL）智能体通常通过最小化预测损失来学习世界模型。然而，强大的RL优化器不可避免地会利用微小的模型不准确性，导致模拟器利用和现实差距，即策略在模拟中成功但在现实世界中失败。我们提出学习模拟器的目标应该是战略鲁棒性而非预测准确性，并将其形式化为模型玩家与对抗策略玩家之间的零和极小极大博弈。我们提供了全面的理论分析：（1）在线学习保证，表明该博弈是可学习的，具有次线性遗憾界；（2）一个可处理的基于评论家的简化，通过局部评论家的损失来界定全局策略价值差距；（3）误差-MDP对偶性，证明寻找最坏情况策略在形式上是标准RL问题的对偶，其中奖励是一步评论家误差。这种对偶性产生了一个可证明收敛的主动数据选择算法。在连续控制任务上的实验表明，我们的方法在战略上重要的区域将预测误差降低了1.5-2.2倍，并使完全在模拟中训练的策略能够匹配接近最优的现实世界性能。

英文摘要

Model-based reinforcement learning (MBRL) agents typically learn world models by minimizing predictive loss. However, powerful RL optimizers inevitably exploit minor model inaccuracies, leading to simulator exploitation and a reality gap where policies succeed in simulation but fail in the real world. We propose that the objective for learning simulators should be strategic robustness rather than predictive accuracy, and formulate this as a zero-sum minimax game between a model player and an adversarial policy player. We provide a comprehensive theoretical analysis: (1) an online learning guarantee showing the game is learnable with sublinear regret bounds; (2) a tractable critic-based simplification bounding the global policy-value gap by the local critic's loss; and (3) an Error-MDP duality, proving that finding the worst-case policy is formally dual to a standard RL problem where the reward is the one-step critic error. This duality yields a provably convergent active data selection algorithm. Experiments on continuous control tasks demonstrate that our approach reduces prediction error in strategically important regions by $1.5$-$2.2\times$ and enables policies trained purely in simulation to match near-optimal real-world performance.

2605.28974 2026-05-29 math.ST math.RT stat.AP stat.ME stat.TH

Algorithm to check Maximum Likelihood Estimate Existence for integrated PCA

检查集成PCA的最大似然估计存在性的算法

Dmitri Shmelkin

AI总结基于不变理论与统计学的桥梁，利用箭图半不变技术，提出并验证了集成PCA模型中最大似然估计存在的充要条件，并开发了易于使用的软件。

Comments 6 pages

2605.28961 2026-05-29 stat.ML cs.LG math.OC

Dynamics of Stochastic Momentum with Sparse Updates in High Dimensions

高维稀疏更新下随机动量的动力学

Katie Everett, Elliot Paquette

发表机构 * Google DeepMind & MIT（谷歌DeepMind及麻省理工学院）； McGill University & Mila（麦吉尔大学及MILA）

AI总结本文通过最小二乘和逻辑回归模型，理论分析了稀疏更新下动量的动力学，揭示了由动量保留时间尺度与学习时间尺度之比决定的相结构，并发现不同令牌稀疏度下的振荡动力学存在谱冲突。

AI中文摘要

现有的动量理论假设梯度以大致恒定的速率到达每个参数，但这一假设在重尾数据分布和现代架构中常被违反。我们理论分析了稀疏更新下两种可处理动量模型的动力学：具有稀疏输入的最小二乘模型和具有稀有类别的逻辑回归模型。两者都给出了精确的闭式二阶矩动力学，我们针对稀疏性、批量大小和动量衰减的三个标度指数刻画了其高维极限。两个问题上的相结构由两个内在时间尺度之比决定：动量保留时间尺度（缓冲区存活的活动更新次数）和学习时间尺度（减少平方误差所需的活动更新次数）。当学习远慢于保留时，极限匹配SGD；当学习更快时，系统不稳定；当时间尺度相当时，我们恢复经典的重球动力学。振荡动力学发生在不同令牌稀疏度的不同动量值处，从而在全局动量上产生跨令牌频率的谱冲突。

英文摘要

Existing theory of momentum assumes that gradients arrive at every parameter at a roughly constant rate, an assumption violated in practice by heavy-tailed data distributions and modern architectures. We theoretically analyze the dynamics of two tractable models of momentum under sparse updates: a least squares model with sparse inputs and a logistic regression model with a rare class. Both admit exact closed-form second-moment dynamics whose high-dimensional limits we characterize across three scaling exponents for sparsity, batch size, and momentum decay. The phase structure on both problems is governed by the ratio of two intrinsic timescales: a momentum retention timescale (how many active updates the buffer survives) and a learning timescale (how many active updates it takes to reduce the squared error). When learning is much slower than retention, the limit matches SGD; when learning is faster, the system is unstable; where the timescales coincide, we recover classical heavy-ball dynamics. The oscillatory dynamics occur at different momentum values for different token sparsity, creating a spectral conflict for global momentum across token frequencies.

2605.28920 2026-05-29 cs.LG cs.AI stat.ML

Conf-Gen: Conformal Uncertainty Quantification for Generative Models

Conf-Gen: 生成模型的共形不确定性量化

Gabriel Loaiza-Ganem, Kevin Zhang, Wei Cui, Marc T. Law, Kin Kwan Leung

发表机构 * layer6ai-labs（layer6ai实验室）

AI总结提出Conf-Gen框架，通过共形风险控制适配生成任务，统一并扩展了共形预测在大型语言模型等生成模型中的应用，并在图像生成、对话AI和AI代理等新领域提供了形式化保证。

Comments ICML 2026

AI中文摘要

共形预测（CP）及其扩展共形风险控制（CRC）是通过形式化保证量化监督机器学习中不确定性的成熟框架。然而，人工智能（AI）的最新突破由无监督生成模型驱动，例如大型语言模型（LLMs）和图像生成器，这些模型与CP或CRC不直接兼容。在这项工作中，我们引入了共形生成（Conf-Gen），这是一个将CRC适配到生成任务同时放宽其理论假设的通用框架。Conf-Gen统一并泛化了先前将CP应用于LLMs的尝试，并将共形方法扩展到全新的领域。我们通过一些新颖的应用展示了Conf-Gen的灵活性，包括在以下方面获得共形保证：生成非记忆图像的图像生成器、提出足够澄清问题的对话AI系统，以及AI代理输出的正确性。

英文摘要

Conformal prediction (CP) and its extension, conformal risk control (CRC), are established frameworks for quantifying uncertainty in supervised machine learning through formal guarantees. However, recent breakthroughs in artificial intelligence (AI) have been driven by unsupervised generative models, such as large language models (LLMs) and image generators, which are not directly compatible with CP or CRC. In this work we introduce conformal generation (Conf-Gen), a general framework adapting CRC to generative tasks while relaxing its theoretical assumptions. Conf-Gen unifies and generalizes previous attempts to apply CP to LLMs, and extends conformal methodology to entirely new domains. We demonstrate the flexibility of Conf-Gen through some novel applications, including obtaining conformal guarantees on: image generators producing non-memorized images, conversational AI systems having asked enough clarifying questions, and the output of AI agents being correct.

通过工具变量交互作用的结构性加速失效时间模型的识别与推断

Qiushi Bu, Wen Su, Xinyu Zhang, Xingqiu Zhao, Zhonghua Liu

AI总结针对存在未测量混杂的右删失时间-事件结局，提出一种利用工具变量交互作用进行识别和推断的框架，无需经典工具变量有效性假设，并采用增强逆概率删失加权和广义经验似然方法实现稳健推断。

AI中文摘要

我们研究在存在未测量混杂的情况下，右删失时间-事件结局的因果推断。聚焦于结构性加速失效时间模型，我们开发了一个利用工具变量交互作用的识别和推断框架。所提出的方法不依赖于经典工具变量有效性，并在有效和无效工具变量下均能产生有效的因果推断，前提是交互作用识别条件成立。为处理右删失，我们使用增强逆概率删失加权方法构建了一个删失调整的观测数据矩函数。该矩函数对 nuisance 函数具有 Neyman 正交性，并具有双重稳健性，从而在灵活的 nuisance 估计下实现有效推断。使用广义经验似然进行估计和推断，该方法适用于具有许多潜在弱交互作用矩条件的情形。我们在许多弱矩渐近条件下建立了相合性和渐近正态性，并开发了诊断工具来评估交互作用识别强度和过度识别限制。模拟研究展示了在各种删失率和工具配置下良好的有限样本性能。对英国生物银行数据的应用说明了所提出方法在大规模观察性研究中进行因果生存分析的实际意义。

英文摘要

We study causal inference for time-to-event outcomes under right censoring in the presence of unmeasured confounding. Focusing on structural accelerated failure time models, we develop an identification and inference framework that exploits interactions among instrumental variables. The proposed approach does not rely on classical instrumental variable validity and yields valid causal inference under both valid and invalid instruments, provided that the interaction-based identification condition holds. To accommodate right censoring, we construct a censoring-adjusted observed data moment function using an augmented inverse probability censoring weighting approach. The resulting moment function is Neyman orthogonal with respect to nuisance functions and enjoys a double robustness property, enabling valid inference under flexible nuisance estimation. Estimation and inference are conducted using generalized empirical likelihood, which is well suited to settings with many potentially weak interaction-based moment conditions. We establish consistency, and asymptotic normality under many weak moment asymptotics, and develop diagnostic tools to assess interaction-based identification strength and overidentifying restrictions. Simulation studies demonstrate favorable finite sample performance across a range of censoring rates and instrument configurations. An application to UK Biobank data illustrates the practical relevance of the proposed method for causal survival analysis in large-scale observational studies.

2605.27975 2026-05-29 cs.LG stat.ML

Continual Learning in Modern Hopfield Networks with an Application to Diffusion Models

现代Hopfield网络中的持续学习及其在扩散模型中的应用

Ken Takeda, Masafumi Oizumi, Ryo Karakida

发表机构 * Graduate School of Arts and Science, The University of Tokyo（东京大学艺术与科学研究生院）； Artificial Intelligence Research Center, AIST（AIST人工智能研究中心）

AI总结通过现代Hopfield能量分析扩散模型中的持续学习，证明高能量异常样本更容易被遗忘，并基于能量选择重放样本以缓解遗忘。

AI中文摘要

生成模型（包括扩散模型）越来越多地被用作基础模型，并通过顺序微调进行适配，这使得持续学习成为一个关键问题设定。然而，此类生成模型中的持续学习仍未被充分理解：任务变化后，学习分布的哪些方面最容易丢失，以及应优先重放哪些样本？我们通过现代Hopfield能量来解决这些问题。现代Hopfield网络（MHN）与扩散模型之间的最新联系使得MHN中的分析可以迁移到扩散模型。我们引入内在遗忘作为任务变化后Hopfield能量的增加。在MHN的可处理设定中，我们证明高能量、类似异常值的样本比类似聚类的样本经历更大的能量增加，这意味着位于尖锐、孤立盆地中的样本更容易被遗忘。我们进一步分析了记忆重放，并表明重放对高能量样本特别有效，从而实现了基于能量的重放样本选择。我们在MHN和两种扩散模型（Stable Diffusion和像素空间DDPM）的持续学习设置实验中验证了这些预测。在这些扩散模型中，Hopfield能量追踪基于重建的遗忘，重放实验揭示了与MHN分析一致的能量依赖性遗忘缓解。

英文摘要

Generative models, including diffusion models, are increasingly used as foundation models and adapted through sequential fine-tuning, making continual learning an essential problem setting. However, continual learning in such generative models remains poorly understood: after a task change, what aspects of the learned distribution are most easily lost, and what replay samples should be prioritized? We address these questions through the modern Hopfield energy. Recent links between modern Hopfield networks (MHNs) and diffusion models allow analyses in MHNs to be transferred to diffusion models. We introduce intrinsic forgetting as an increase in Hopfield energy after the task change. In tractable settings in an MHN, we prove that high-energy, outlier-like samples undergo a larger energy increase than cluster-like samples, implying that samples located in sharp, isolated basins are more forgettable. We further analyze memory replay and show that replay is particularly effective for high-energy samples, enabling an energy-based selection of replay samples. We validate these predictions in experiments on MHNs and two diffusion models under continual-learning settings: Stable Diffusion and a pixel-space DDPM. In these diffusion models, Hopfield energy tracks reconstruction-based forgetting, and replay experiments reveal energy-dependent mitigation of forgetting that is consistent with the MHN analysis.

2605.27625 2026-05-29 math.ST stat.TH

Admissibility of Adaptive Monotone Step-Down Multiple Testing Procedures Under Arbitrary Covariance Dependence

任意协方差依赖下自适应单调逐步多重检验程序的可容许性

Prasenjit Ghosh, Arijit Chakrabarti

AI总结针对任意协方差依赖下多元正态均值的同步检验问题，建立了一类基于残差的自适应单调逐步下降多重检验程序的可容许性定理，证明其关于向量值损失函数是可容许的。

AI中文摘要

本文考虑在任意协方差依赖下多元正态均值的同步检验问题。具体地，设 $\boldsymbol{X}\sim N_n(\boldsymbolθ,\boldsymbolΣ)$，其中 $\boldsymbolθ\in\mathbb{R}^n$ 未知，$\boldsymbolΣ$ 是已知的正定协方差矩阵。目标是同时检验 $H_{0i}:θ_i=0$ 对 $H_{Ai}:θ_i\neq 0$，$i=1,\ldots,n$。我们为一类广泛的基于残差的单调逐步下降多重检验程序建立了通用可容许性定理，这些程序通过使用由条件正态分布产生的适当标准化残差统计量的局部自适应严格递增变换得到的统计量，迭代地对活跃假设进行排序。我们的主要结果表明，每个这样的程序关于向量值损失函数（其分量是通常的单个 $0$-$1$ 检验损失）都是可容许的。证明依赖于对诱导接受区域的精细几何分析以及自适应逐步拒绝指标的结构不变性。该定理实质性地扩展了 Cohen 等人 (2009) 为最大残差下降程序建立的可容许性理论，并揭示了依赖下的可容许性根本上是由残差统计量诱导的单调排序结构驱动的，而不是由检验规则本身的精确函数形式驱动的。

英文摘要

In this paper, we consider the problem of simultaneous testing of multivariate normal means under arbitrary covariance dependence. Specifically, let $\boldsymbol{X}\sim N_n(\boldsymbolθ,\boldsymbolΣ)$, where $\boldsymbolθ\in\mathbb{R}^n$ is unknown and $\boldsymbolΣ$ is a known positive definite covariance matrix. The objective is to test $H_{0i}:θ_i=0$ against $H_{Ai}:θ_i\neq 0$, simultaneously for $i=1,\ldots,n$. We establish a general admissibility theorem for a broad class of monotone residual-based step-down multiple testing procedures which iteratively rank the active hypotheses using statistics obtained through locally adaptive strictly increasing transformations of suitably standardized residual statistics arising from conditional normal distributions. Our main result shows that every such procedure is admissible with respect to a vector-valued loss function whose components are the usual individual $0$--$1$ testing losses. The proof relies on a delicate geometric analysis of the induced acceptance regions together with structural invariance properties of the adaptive stagewise rejection indices. The theorem substantially extends the admissibility theory developed for the maximum residual down procedure of Cohen et al. (2009) and reveals that admissibility under dependence is fundamentally driven by the monotone ordering structure induced by the residual statistics rather than by the precise functional form of the testing rule itself.

2605.27474 2026-05-29 stat.ML cs.LG

Stop Suppressing the Tail: Causal Inference for Extreme Events

停止抑制尾部：极端事件的因果推断

Eichi Uehara

发表机构 * Eichi Uehara

AI总结针对重尾结果，提出一种平均剂量-响应函数（ADRF）估计器，通过基于中位数中心化的尾部诊断（PDHTE+JK）打破循环依赖，输出结构化尾部形状和深层尾部风险指标，在极端事件预测中显著优于传统方法。

Comments 22 pages, 6 figures, 13 tables. Keywords: double machine learning, dose-response, heavy tails, extreme value theory, causal inference

AI中文摘要

估计结果如何响应连续处理（平均剂量-响应函数，ADRF）是因果推断的核心基础。然而，当结果具有重尾时，标准的鲁棒双重机器学习（DML）会刻意抑制这些极端值以稳定整体均值。在高风险场景（如金融收益或气候损失）中，这种被忽略的千分之一极端事件恰恰是实际目标量。此外，当前从模型残差中读取尾部的方法存在循环依赖，导致仅因核心估计器在Huber和Welsch之间切换，尾部形状推断就会发生剧烈变化。本研究提出一种ADRF估计器，它在标准点估计之外输出结构化的尾部形状。其尾部诊断（PDHTE+JK）通过基于中位数中心化的结果评估每个处理下的尾部形状，成功打破了循环依赖，使诊断结果不受核心方法选择的影响。输出包含四个处理条件量：尾部形状$\hatξ(t)$、深层尾部回报水平$\hat{Q}_α(t)$、条件短缺$\hat{S}_α(t)$、恢复的均值ADRF，以及一个明确的拒绝机制，当数据不支持极值建模时拒绝外推。与核加权分位数回归（QR）相比，所提估计器在重尾面板上将深层尾部（$α=0.001$）回报水平MAE降低了11%，条件短缺MAE降低了25.5%。在样本稀缺场景（$n\le2000$）中，MAE降低了20-29%。在freMTPL2汽车保险索赔数据上，它在对数索赔尺度上成功触发了明确的外推拒绝，这是QR或仅损失DML无法实现的。

英文摘要

Estimating how an outcome responds to a continuous treatment (the Average Dose-Response Function, or ADRF) is a core causal-inference primitive. However, when outcomes possess heavy tails, standard robust double machine learning (DML) deliberately suppresses these extremes to stabilize the bulk average. In high-stakes settings, such as financial returns or climate losses, this omitted 1-in-1000 extreme event is the actual target quantity. Furthermore, current methods that read the tail from a model's residuals suffer from circular dependence, causing tail shape inferences to shift drastically based solely on whether the core estimator is switched between Huber and Welsch. The research proposes an ADRF estimator that emits a structured tail-shape output alongside the standard point estimate. Its tail diagnostic (PDHTE+JK) evaluates the per-treatment tail shape from the outcome centered by a pilot median, successfully breaking the circular dependence and rendering the diagnostic invariant to the choice of core method. The output encompasses four treatment-conditional quantities: tail shape $\hatξ(t)$, deep-tail return levels $\hat{Q}_α(t)$, conditional shortfalls $\hat{S}_α(t)$, the recovered mean ADRF, and an explicit refusal mechanism that declines extrapolation when extreme-value modeling is unsupported by the data. Compared to kernel-weighted quantile regression (QR), the proposed estimator reduces deep-tail ($α=0.001$) return-level MAE by 11% and conditional-shortfall MAE by 25.5% across a heavy-tailed panel. It also achieves a 20-29% MAE reduction in sample-scarce regimes ($n\le2000$). On freMTPL2 motor-insurance claims, it successfully triggered an explicit extrapolation refusal on the log-claim scale, which neither QR nor loss-only DML can produce.

2605.26653 2026-05-29 stat.ME

Nonparametric Regression via Tree-Guided Feature Aggregation

基于树引导特征聚合的非参数回归

Sithija Manage, Y. Samuel Wang, Martin T. Wells

AI总结针对协变量具有层次树结构的回归问题，提出一种基于惩罚的Nadaraya-Watson型估计器KR-TEXAS，通过自适应惩罚权重同时实现模型选择和特征聚合，并证明了模型选择一致性。

AI中文摘要

在协变量自然组织成层次树结构的回归问题中，一个核心挑战是选择协变量进入模型的粒度。确定这种特征聚合水平具有内在的科学意义，并且可以通过引入稀疏性来提高统计效率。尽管已有丰富文献在线性设置中解决了这一问题，但将特征聚合扩展到非线性设置仍然是一个开放的挑战。在这项工作中，我们提出通过一种惩罚的Nadaraya-Watson型估计器同时进行模型选择和特征聚合。我们提出的估计器，即带有树探索聚合的核回归（KR-TEXAS），基于回归函数偏导数的初始估计器为特征构建自适应惩罚权重。在温和条件下，我们为定义良好的目标聚合集建立了模型选择一致性，并且我们的模拟显示在模型选择和预测方面均表现强劲。最后，我们通过将我们的方法应用于一个微生物组数据集来预测短链脂肪酸，展示了其效用。我们的方法在R包krtexas中提供了用户友好的实现。

英文摘要

In regression problems where covariates are naturally organized in a hierarchical tree structure, a central challenge is to select the resolution at which covariates enter the model. Determining this level of feature aggregation is of intrinsic scientific interest and can improve statistical efficiency by inducing sparsity. While a rich literature addresses this problem in the linear setting, extending feature aggregation to the nonlinear setting remains an open challenge. In this work, we propose to simultaneously perform model selection and feature aggregation through a penalized Nadaraya-Watson-type estimator. Our proposed estimator, Kernel Regression with Tree-EXploring AggregationS (KR-TEXAS), constructs adaptive penalty weights for the features based on pilot estimators of the regression function's partial derivatives. Under mild conditions, we establish model selection consistency for a well-defined target aggregation set, and our simulations show strong performance in both model selection and prediction. Finally, we demonstrate the utility of our procedure by applying it to a microbiome data set to predict short chain fatty acids. A user-friendly implementation of our procedure is available in the R package krtexas.
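
For readers unfamiliar with the base estimator, the following is a minimal sketch of a plain, univariate Nadaraya-Watson regressor with a Gaussian kernel. It is not KR-TEXAS: the tree-guided penalty, the adaptive weights, and the multivariate feature aggregation from the abstract are omitted, and the function name and `bandwidth` parameter are hypothetical choices for illustration.

```python
import math
from typing import Sequence


def nadaraya_watson(x0: float, xs: Sequence[float], ys: Sequence[float],
                    bandwidth: float) -> float:
    """Kernel-weighted average of ys, with Gaussian weights centered at x0."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    # total can underflow to 0.0 if x0 is very far from every xs relative to bandwidth.
    return sum(w * y for w, y in zip(weights, ys)) / total
```

The estimate is a locally weighted mean, so a smaller `bandwidth` tracks the data more closely while a larger one smooths more aggressively.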

2605.26408 2026-05-29 cs.LG stat.ME stat.ML

Function-Valued Causal Influence in Nonlinear Time Series

非线性时间序列中的函数值因果影响

Valentina V. Kuskova, Dmitry Zaytsev, Michael Coppedge

发表机构 * Lucy Family Institute for Data & Society, University of Notre Dame, Notre Dame, Indiana, USA； Department of Political Science, University of Notre Dame, Notre Dame, Indiana, USA

AI总结针对非线性时间序列因果发现中常用标量评分掩盖状态依赖函数效应的问题，提出基于个体条件期望的框架从神经加性向量自回归模型直接估计因果响应函数，揭示标量评分无法区分的多种函数行为。

Comments 26 pages, 6 tables, 8 figures

AI中文摘要

时间序列中的因果发现越来越多地使用非线性机器学习模型进行，但由此产生的因果关系几乎总是通过标量边评分来总结。我们认为，这种做法掩盖了非线性自回归模型真正学习到的对象：一个状态依赖的函数，其效应随机制、幅度和上下文而变化。我们形式化了加性、贡献可分解架构的函数值因果影响，并表明标量因果评分构成了严重的信息瓶颈，将状态间变化与状态内残差噪声混为一谈。以神经加性向量自回归作为代表性架构，我们引入了一个基于个体条件期望的实用框架，直接从训练好的模型估计因果响应函数。通过受控的合成实验，我们证明了具有无法区分的标量评分的边可以表现出定性的不同函数行为，包括单调、阈值、饱和和符号变化效应。一个关于民主发展的应用案例进一步表明，函数值分析揭示了以评分为中心的方法系统性遗漏的特定于机制和非对称的因果结构。

英文摘要

Causal discovery in time series is increasingly performed using nonlinear machine-learning models, yet the resulting causal relationships are almost always summarized by scalar edge scores. We argue that this practice obscures the true object learned by nonlinear autoregressive models: a state-dependent function whose effect varies across regimes, magnitudes, and contexts. We formalize function-valued causal influence for additive, contribution-decomposable architectures and show that scalar causal scores constitute a severe information bottleneck, conflating between-state variation with within-state residual noise. Using Neural Additive Vector Autoregression as a representative architecture, we introduce a practical framework based on Individual Conditional Expectation for estimating causal response functions directly from trained models. Through controlled synthetic experiments, we demonstrate that edges with indistinguishable scalar scores can exhibit qualitatively different functional behaviors, including monotonic, thresholded, saturating, and sign-changing effects. An applied case study on democratic development further shows that function-valued analysis reveals regime-specific and asymmetric causal structure systematically missed by score-centric approaches.

2605.13168 2026-05-29 stat.ME

Variance-Aware Estimation and Inference for Michaelis-Menten Models with Heteroscedastic Errors and Clustered Measurements

异方差误差和聚类测量下Michaelis-Menten模型的方差感知估计与推断

Mijeong Kim, Minkyoung Cha, Ah Young Jeong

AI总结针对Michaelis-Menten模型，提出一种基于条件矩约束的方差感知估计与推断方法，通过简单条件高斯工作模型实现单曲线和聚类数据的参数估计，改善了异方差和聚类结构下的推断效率。

AI中文摘要

Michaelis-Menten分析通常在恒定方差假设下通过非线性最小二乘法进行，尽管酶动力学数据经常表现出浓度依赖的异方差性，并且通常包含重复或聚类测量。我们开发了一种方差感知的Michaelis-Menten估计与推断程序，该程序由条件矩约束驱动，并通过简单的条件高斯工作模型实现。对于单曲线，该方法简化为对$K_m$的一维求根，随后对$V_{\max}$和方差尺度参数进行闭式插件更新；相同的得分逻辑通过随机效应诱导的工作协方差扩展到聚类水平。在模拟中，相对于同方差非线性最小二乘法，建模异方差改善了方差恢复和区间效率，而聚类感知的半参数和NLME拟合比忽略聚类的合并分析更有效地恢复了固定效应覆盖。在自主实验室和土壤胞外酶数据中，异方差模型实现了比同方差非线性最小二乘法更低的信息准则，其中平方根方差函数在预指定的工作模型中给出了最稳定的经验拟合。我们在配套的inferMM包中实现了单曲线、分组和聚类Michaelis-Menten分析的工作流程。这些结果表明，当变异性随底物浓度变化或测量值聚类时，简单的方差函数和协方差建模可以稳定原始尺度的Michaelis-Menten推断。

英文摘要

Michaelis-Menten analysis is often conducted by nonlinear least squares under a constant-variance assumption, even though enzyme-kinetic data frequently display concentration-dependent heteroscedasticity and often include repeated or clustered measurements. We develop a variance-aware procedure for Michaelis-Menten estimation and inference that is motivated by conditional moment restrictions and implemented through simple conditionally Gaussian working models. For single curves, the method reduces to one-dimensional root finding for $K_m$ followed by closed-form plug-in updates for $V_{\max}$ and a variance scale parameter; the same score logic yields a cluster-level extension through a random-effect-induced working covariance. In simulation, modeling heteroscedasticity improved variance recovery and interval efficiency relative to homoscedastic nonlinear least squares, while cluster-aware semiparametric and NLME fits restored fixed-effect coverage far more effectively than pooled analyses that ignored clustering. In self-driving laboratory and soil exoenzyme data, heteroscedastic models achieved lower information criteria than homoscedastic nonlinear least squares, with the square-root variance function giving the most stable empirical fit among the prespecified working models. We implement the workflow in the companion inferMM package for single-curve, grouped, and clustered Michaelis-Menten analysis. These results show that simple variance-function and covariance modeling can stabilize original-scale Michaelis-Menten inference when variability changes with substrate concentration or measurements are clustered.
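
The idea of a closed-form plug-in update for $V_{\max}$ once $K_m$ is fixed can be illustrated with a small sketch. This is an assumption-laden illustration, not the paper's score equations and not the inferMM API: it simply profiles $V_{\max}$ out of a weighted least-squares criterion, and the function names and the optional `weights` argument (standing in for inverse variances) are hypothetical.

```python
from typing import Optional, Sequence


def mm_rate(substrate: float, vmax: float, km: float) -> float:
    """Michaelis-Menten reaction rate v = Vmax * S / (Km + S)."""
    return vmax * substrate / (km + substrate)


def vmax_given_km(substrate: Sequence[float], rates: Sequence[float], km: float,
                  weights: Optional[Sequence[float]] = None) -> float:
    """Weighted least-squares Vmax for a fixed Km.

    Minimizes sum_i w_i * (y_i - Vmax * f_i)^2 with f_i = S_i / (Km + S_i),
    whose minimizer is sum(w * y * f) / sum(w * f^2).
    """
    if weights is None:
        weights = [1.0] * len(substrate)  # homoscedastic case
    f = [s / (km + s) for s in substrate]
    numerator = sum(w * y * fi for w, y, fi in zip(weights, rates, f))
    denominator = sum(w * fi * fi for w, fi in zip(weights, f))
    return numerator / denominator
```

With this inner step in closed form, only $K_m$ remains to be found by a one-dimensional search, which mirrors the structure the abstract describes for single curves.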

2605.12208 2026-05-29 stat.ML cs.AI cs.LG stat.CO

Self-Supervised Laplace Approximation for Bayesian Uncertainty Quantification

自监督拉普拉斯近似用于贝叶斯不确定性量化

Julian Rodemann, Alexander Marquard, Thomas Augustin, Michele Caprio

Affiliations: Rational Intelligence Lab, CISPA Helmholtz Center for Information Security Department of Statistics, LMU Munich; Department of Statistics, LMU Munich; Department of Computer Science, The University of Manchester

AI summary: Proposes the Self-Supervised Laplace Approximation (SSLA), which approximates the posterior predictive distribution directly by refitting on self-predicted data, giving deterministic, sampling-free Bayesian uncertainty quantification that outperforms the classical Laplace approximation on regression tasks.

Comments Accepted for publication in TMLR (https://openreview.net/forum?id=T8w8L2t3JG), v2: fixed typos and added a deceased-author footnote with a dedication to Thomas Augustin

Journal ref Transactions on Machine Learning Research (TMLR). ISSN 2835-8856 (2026)

AI Chinese abstract

近似贝叶斯推断通常围绕计算后验参数分布展开。然而，在实践中，感兴趣的主要对象通常是模型的预测而非其参数。在这项工作中，我们提出绕过参数后验，直接关注近似后验预测分布。我们通过从自监督和半监督学习中的自训练中汲取灵感来实现这一点。本质上，我们通过重新拟合自预测数据来量化贝叶斯模型的预测不确定性。这个想法非常简单：如果模型对自预测数据赋予高似然，那么这些预测的不确定性低，反之亦然。这产生了后验预测的确定性、无采样近似。我们的自监督拉普拉斯近似（SSLA）的模块化结构进一步允许我们插入不同的先验规范，从而实现经典的贝叶斯敏感性（关于先验选择）分析。为了绕过昂贵的重新拟合，我们进一步引入了SSLA的近似版本，称为ASSLA。我们从理论和经验上研究了（A）SSLA，涉及从贝叶斯线性模型到贝叶斯神经网络的回归模型。在模拟和真实数据集的广泛回归任务中，我们的方法在预测校准方面优于经典拉普拉斯近似，同时保持计算效率。

English abstract

Approximate Bayesian inference typically revolves around computing the posterior parameter distribution. In practice, however, the main object of interest is often a model's predictions rather than its parameters. In this work, we propose to bypass the parameter posterior and focus directly on approximating the posterior predictive distribution. We achieve this by drawing inspiration from self-training within self-supervised and semi-supervised learning. Essentially, we quantify a Bayesian model's predictive uncertainty by refitting on self-predicted data. The idea is strikingly simple: If a model assigns high likelihood to self-predicted data, these predictions are of low uncertainty, and vice versa. This yields a deterministic, sampling-free approximation of the posterior predictive. The modular structure of our Self-Supervised Laplace Approximation (SSLA) further allows us to plug in different prior specifications, enabling classical Bayesian sensitivity (w.r.t. prior choice) analysis. In order to bypass expensive refitting, we further introduce an approximate version of SSLA, called ASSLA. We study (A)SSLA both theoretically and empirically in regression models ranging from Bayesian linear models to Bayesian neural networks. Across a wide array of regression tasks with simulated and real-world datasets, our methods outperform classical Laplace approximations in predictive calibration while remaining computationally efficient.

2605.02574 2026-05-29 stat.CO cs.NA math.NA stat.ME

Fast and accurate conditioning for large-scale and online Gaussian process prediction problems

大规模和在线高斯过程预测问题的快速且精确的条件化方法

Samanyu Arora, Christopher J. Geoga

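AI summary: Proposes conditioning on carefully designed linear combinations of the data, giving exponentially convergent large-scale Gaussian process prediction and supporting online prediction.

AI Chinese abstract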

高斯过程模型为预测和不确定性量化提供了灵活框架。然而，对于大多数协方差函数，基于 $n$ 个点的精确 GP 预测计算复杂度为 $\mathcal{O}(n^3)$，这使得它在大数据集或大量预测点上代价高昂。虽然基于最近邻的预测在某些情况下效果良好，但非病理情况（例如测量噪声）会严重限制其效率。本文提出了一种互补方法，即对精心设计的数据线性组合进行条件化，这在联合预测数据域中大型连通区域内的多个值时特别有效。对于远离原点光滑的核函数和简单预测域，该方法在用于条件化的线性组合数量 $r$ 上呈指数收敛，并且当 $r \approx 100$ 时可达到机器精度。该方法的计算成本为 $\mathcal{O}(T r^2)$，其中 $T$ 是求解数据协方差矩阵线性系统的成本，因此在许多情况下，通过利用良好协方差矩阵的秩结构，可以以线性或近线性成本计算。通过额外 $\mathcal{O}(n r^2)$ 的预计算成本，该方法还可以在 $\mathcal{O}(1)$ 的在线工作中为指定区域的任意点提供预测，这使得它对于预测点事先未知的问题特别有吸引力。

English abstract

Gaussian Process (GP) models provide a flexible framework for prediction and uncertainty quantification. For most covariance functions, however, exact GP prediction with $n$ points scales as $\mathcal{O}(n^3)$, making it prohibitively expensive for large datasets or large numbers of prediction points. While nearest neighbor-based prediction can work well in certain settings, non-pathological circumstances (for example measurement noise) can severely restrict its efficiency. This work presents a complementary approach where one conditions on carefully designed linear combinations of data, which is particularly effective in the setting of jointly predicting many values in large connected regions of the data domain. For kernel functions that are smooth away from the origin and simple prediction domains, this method can be exponentially convergent in the number of linear combinations $r$ used for conditioning, and can be accurate to machine precision for $r \approx 100$. This approach costs $\mathcal{O}(T r^2)$ work to compute where $T$ is the cost of solving a linear system with the data covariance matrix, and so in many cases can be computed in linear or near-linear cost by exploiting rank structure in well-behaved covariance matrices. At the cost of $\mathcal{O}(nr^2)$ additional precomputation work, this approach can also provide predictions at arbitrary points of a designated region in $\mathcal{O}(1)$ online work, making it particularly attractive for problems where prediction points are not known in advance.

2604.13410 2026-05-29 stat.ME cs.LG stat.ML

Estimating Continuous Treatment Effects with Two-Stage Kernel Ridge Regression

使用两阶段核岭回归估计连续治疗效果

Seok-Jin Kim, Kaizheng Wang

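Affiliations: Department of IEOR, Columbia University; Department of IEOR and Data Science Institute, Columbia University

AI summary: For estimating the effect function of a continuous treatment, proposes a two-stage kernel ridge regression method. The first stage models the response as a function of treatment and covariates, and the second stage builds pseudo-outcomes that correct for distribution shift. The method attains optimal learning bounds without estimating the conditional treatment density and supports data-driven model selection.

AI Chinese abstract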

我们研究连续治疗的效果函数估计问题，该函数将每个治疗值映射到群体平均结果。该设置中的一个核心挑战是混杂：治疗分配通常依赖于协变量，产生选择偏差，使得直接对响应进行回归不可靠。为了解决这个问题，我们提出了一种两阶段核岭回归方法。在第一阶段，我们学习一个模型，将响应表示为治疗和协变量的函数；在第二阶段，我们使用该模型构造伪结果以校正分布偏移，然后拟合第二个模型来估计治疗效果。尽管响应随治疗和协变量变化，但通过对协变量平均得到的诱导效果函数通常更简单，我们的估计器适应这种结构。我们在不估计条件治疗密度的情况下实现了最优学习界，从而绕过了现有方法中的一个主要瓶颈。此外，我们引入了一种完全数据驱动的模型选择程序，该程序对未知的重叠程度和底层核的谱衰减具有可证明的自适应性。

English abstract

We study the problem of estimating the effect function for a continuous treatment, which maps each treatment value to a population-averaged outcome. A central challenge in this setting is confounding: treatment assignment often depends on covariates, creating selection bias that makes direct regression of the response on treatment unreliable. To address this issue, we propose a two-stage kernel ridge regression method. In the first stage, we learn a model for the response as a function of both treatment and covariates; in the second stage, we use this model to construct pseudo-outcomes that correct for distribution shift, and then fit a second model to estimate the treatment effect. Although the response varies with both treatment and covariates, the induced effect function obtained by averaging over covariates is typically much simpler, and our estimator adapts to this structure. Our optimal learning bounds are achieved without estimating the conditional treatment density, thereby bypassing a major bottleneck in existing methods. Furthermore, we introduce a fully data-driven model selection procedure that achieves provable adaptivity to both the unknown degree of overlap and the spectral decay of the underlying kernel.

2604.13147 2026-05-29 stat.ML cs.LG math.PR

Adaptive Learning via Off-Model Training and Importance Sampling for Fully Non-Markovian Optimal Stochastic Control. Complete version

基于离模型训练和重要性采样的自适应学习用于完全非马尔可夫最优随机控制（完整版）

Dorival Leão, Alberto Ohashi, Simone Scotti, Adolfo M. D da Silva

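Affiliations: Departamento de Matemática, Universidade de Brasília; Università di Pisa, DEM; Université Paris Cité, LPSM

AI summary: For continuous-time stochastic control problems that are fully non-Markovian and depend on unknown model parameters, proposes a Monte Carlo learning method based on discrete skeletons and importance sampling, yielding an off-model training architecture and adaptive parameter updates with non-asymptotic error bounds.

Comments Typos are fixed. Numerical experiment is revised

AI Chinese abstract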

本文研究连续时间随机控制问题，其受控状态是完全非马尔可夫的，且依赖于未知模型参数。这类问题自然出现在路径依赖随机微分方程、粗糙波动率对冲以及分数布朗运动驱动的系统中。基于先前工作中发展的离散骨架方法，我们提出了一种用于相关嵌入后向动态规划方程的蒙特卡洛学习方法。我们的主要贡献有两方面。首先，针对几类具有代表性的非马尔可夫受控系统，我们构造了显式的支配训练律和Radon-Nikodym权重。这产生了一种离模型训练架构，其中在参考律下生成固定的合成数据集，而通过重要性采样恢复与目标模型相关的动态规划算子。其次，我们利用这种结构设计了参数模型不确定性下的自适应更新机制，使得可以通过重新加权相同的训练样本而非重新生成新轨迹来执行重复校准。对于固定参数，我们建立了通过深度神经网络逼近嵌入动态规划方程的非渐近误差界。对于自适应学习，我们推导了将蒙特卡洛逼近误差与模型风险误差分离的定量估计。数值实验在结构化线性二次型例子中展示了离模型训练机制和自适应重要性采样更新。

English abstract

This paper studies continuous-time stochastic control problems whose controlled states are fully non-Markovian and depend on unknown model parameters. Such problems arise naturally in path-dependent stochastic differential equations, rough-volatility hedging, and systems driven by fractional Brownian motion. Building on the discrete skeleton approach developed in earlier work, we propose a Monte Carlo learning methodology for the associated embedded backward dynamic programming equation. Our main contribution is twofold. First, we construct explicit dominating training laws and Radon--Nikodym weights for several representative classes of non-Markovian controlled systems. This yields an off-model training architecture in which a fixed synthetic dataset is generated under a reference law, while the dynamic programming operators associated with a target model are recovered by importance sampling. Second, we use this structure to design an adaptive update mechanism under parametric model uncertainty, so that repeated recalibration can be performed by reweighting the same training sample rather than regenerating new trajectories. For fixed parameters, we establish non-asymptotic error bounds for the approximation of the embedded dynamic programming equation via deep neural networks. For adaptive learning, we derive quantitative estimates that separate Monte Carlo approximation error from model-risk error. Numerical experiments illustrate both the off-model training mechanism and the adaptive importance-sampling update in structured linear-quadratic examples.
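
The reweighting idea in the abstract (generate one fixed sample under a reference law, then recover expectations under a target model by importance sampling) can be shown on a deliberately simple toy problem. The sketch below is not the paper's path-space construction: the Gaussian reference and target laws and the helper names `draw_reference_sample` and `reweighted_mean` are assumptions chosen for illustration. The point is that the same sample serves several target parameters without regenerating any draws.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def draw_reference_sample(n, ref_sd, seed=0):
    """Draw the fixed synthetic dataset once under the reference law N(0, ref_sd^2)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, ref_sd) for _ in range(n)]


def reweighted_mean(sample, f, theta, ref_sd):
    """Self-normalized importance-sampling estimate of E[f(X)] under N(theta, 1)."""
    weights = [normal_pdf(x, theta, 1.0) / normal_pdf(x, 0.0, ref_sd)
               for x in sample]
    return sum(wi * f(x) for wi, x in zip(weights, sample)) / sum(weights)


# One reference sample, reused for two different target parameters.
sample = draw_reference_sample(20000, ref_sd=2.0)
mean_a = reweighted_mean(sample, lambda x: x, theta=0.5, ref_sd=2.0)
mean_b = reweighted_mean(sample, lambda x: x, theta=-0.5, ref_sd=2.0)
```

The reference law is taken wider than every target (standard deviation 2 against 1) so that the density ratio stays bounded, a toy analogue of the dominating training law mentioned in the abstract.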

2604.05446 2026-05-29 stat.ML cs.LG

MEC: Machine-Learning-Assisted Generalized Entropy Calibration for Semi-Supervised Mean Estimation

MEC：基于机器学习的广义熵校准用于半监督均值估计

Se Yoon Lee, Jae Kwang Kim

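Affiliations: Texas A&M University; Iowa State University

AI summary: Proposes MEC, which improves prediction-powered inference through cross-fitted calibration weighting, attaining the semiparametric efficiency bound in semi-supervised mean estimation and improving confidence-interval coverage and precision.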

BITS for GAPS: Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates

Kyla D. Jones, Alexander W. Dowling

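Affiliations: Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN 46556, USA

AI summary: Proposes the BITS for GAPS framework, which propagates hyperparameter uncertainty into the sampling criterion through Bayesian hierarchical modeling, enabling information-theoretic experimental design for Gaussian process surrogate models. A vapor-liquid equilibrium case study shows improved predictive accuracy and information gain.

Journal ref Computers & Chemical Engineering, 197, 109041 (2026)

DOI: 10.1016/j.compchemeng.2026.109650

AI Chinese abstract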

我们引入了用于层次高斯过程代理的贝叶斯信息论采样（BITS for GAPS），这是一个框架，能够实现基于高斯过程的代理模型的信息论实验设计。与标准方法（在采集函数中使用固定或点估计的超参数）不同，我们的方法通过贝叶斯层次建模将超参数不确定性传播到采样准则中。在该框架中，潜在函数接受高斯过程先验，而超参数被赋予额外的先验以捕捉建模者对控制物理现象的知识。因此，采集函数同时包含了来自潜在函数及其超参数的不确定性，确保采样由数据稀缺性和模型不确定性共同指导。我们进一步在此背景下建立了理论结果：后验微分熵的闭式近似和下界。我们通过一个汽液平衡案例研究展示了该框架在混合建模中的实用性。具体来说，我们为二元混合物中的潜在活度系数构建了一个代理模型。通过将代理嵌入扩展形式的拉乌尔定律中，我们构建了一个混合模型。该混合模型随后用于指导蒸馏设计。该案例研究展示了如何将部分物理知识转化为层次高斯过程代理。它还表明，使用BITS for GAPS通过瞄准Wilson活度模型的高不确定性区域，增加了期望信息增益和预测准确性。总体而言，BITS for GAPS是一个用于复杂物理系统中自适应数据采集的通用不确定性感知框架。

English abstract

We introduce Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates (BITS for GAPS), a framework enabling information-theoretic experimental design of Gaussian process-based surrogate models. Unlike standard methods, which use fixed or point-estimated hyperparameters in acquisition functions, our approach propagates hyperparameter uncertainty into the sampling criterion through Bayesian hierarchical modeling. In this framework, a latent function receives a Gaussian process prior, while hyperparameters are assigned additional priors to capture the modeler's knowledge of the governing physical phenomena. Consequently, the acquisition function incorporates uncertainties from both the latent function and its hyperparameters, ensuring that sampling is guided by both data scarcity and model uncertainty. We further establish theoretical results in this context: a closed-form approximation and a lower bound of the posterior differential entropy. We demonstrate the framework's utility for hybrid modeling with a vapor-liquid equilibrium case study. Specifically, we build a surrogate model for latent activity coefficients in a binary mixture. We construct a hybrid model by embedding the surrogate into an extended form of Raoult's law. This hybrid model then informs distillation design. This case study shows how partial physical knowledge can be translated into a hierarchical Gaussian process surrogate. It also shows that using BITS for GAPS increases expected information gain and predictive accuracy by targeting high-uncertainty regions of the Wilson activity model. Overall, BITS for GAPS is a generalized uncertainty-aware framework for adaptive data acquisition in complex physical systems.

2510.16060 2026-05-29 cs.LG cs.AI stat.ME stat.ML

Beyond Accuracy: Are Time Series Foundation Models Well-Calibrated?

超越准确性：时间序列基础模型是否良好校准？

Coen Adler, Yuxin Chang, Felix Draxler, Samar Abdi, Padhraic Smyth

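Affiliations: Department of Computer Science; Department of Statistics; Google, Irvine

AI summary: Systematically evaluates the calibration of five time series foundation models and two baselines, finding that the foundation models are better calibrated than the baselines and show no systematic overconfidence or underconfidence.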

Comments Published as a conference paper at ICLR 2026

Journal ref Proceedings of ICLR 2026

2510.12152 2026-05-29 stat.ML cs.LG

Follow-the-Perturbed-Leader for Decoupled Bandits: Best-of-Both-Worlds and Practicality

解耦赌博机的跟随扰动领导者：两全其美与实用性

Chaiwon Kim, Jongyeong Lee, Min-hwan Oh

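Affiliations: Seoul National University, Seoul, Korea; Korea Institute of Science and Technology, Seoul, Korea

AI summary: For the decoupled multi-armed bandit problem, proposes an efficient Follow-the-Perturbed-Leader policy that attains constant regret in the stochastic regime and optimal $O(\sqrt{KT})$ regret in the adversarial regime while avoiding convex optimization and resampling, substantially reducing computational cost.

Comments Accepted to ICML 2026, 31 pages

AI Chinese abstract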

我们研究了解耦多臂赌博机问题，其中学习者在每一轮分别选择一个臂进行探索，并选择另一个可能不同的臂进行利用。在此设置中，探索臂的损失被观察到但不承担，而利用臂的损失被承担但不被观察到。我们提出了一种高效的跟随扰动领导者（FTPL）策略，该策略在随机环境下实现常数遗憾，在对抗环境下实现最优$O(\sqrt{KT})$遗憾，从而获得两全其美（BOBW）保证。我们方法的一个关键特征是它完全避免了先前BOBW策略所需的凸优化以及FTPL赌博机策略中通常使用的重采样过程。这使得FTPL能够充分发挥其计算效率优势，大幅降低计算成本。我们通过实验证实，我们的策略不仅提高了运行时间，而且在两种环境下都表现出优越的遗憾性能。

English abstract

We study the decoupled multi-armed bandit problem, where the learner separately selects one arm for exploration and one, possibly different, arm for exploitation at each round. In this setting, the loss of the explored arm is observed but not incurred, whereas the loss of the exploited arm is incurred without being observed. We propose an efficient Follow-the-Perturbed-Leader (FTPL) policy that achieves Best-of-Both-Worlds (BOBW) guarantee with constant regret in the stochastic regime and optimal $O(\sqrt{KT})$ regret in the adversarial regime. A key feature of our method is that it completely avoids both the convex optimization required by prior BOBW policies and the resampling procedures typically used in FTPL bandit policies. This allows FTPL to fully realize its computational efficiency advantages, leading to substantial reductions in computational cost. We empirically confirm that our policy not only improves the runtime but also demonstrates superior regret performance in both regimes.
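
To make the decoupled setting concrete, here is a generic Follow-the-Perturbed-Leader loop in which the exploited arm and the explored arm are chosen separately. It is only a sketch and should not be read as the paper's policy: uniform exploration, Fréchet perturbations with shape 2, the learning rate $1/\sqrt{t}$, and the function names are all assumptions. Because the exploration probability $1/K$ is known in this toy version, the importance-weighted loss estimate needs no resampling step.

```python
import math
import random


def frechet(rng, shape=2.0):
    """Inverse-CDF draw from a Frechet distribution, F(x) = exp(-x**(-shape))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return (-math.log(u)) ** (-1.0 / shape)


def run_decoupled_ftpl(mean_losses, horizon, seed=0):
    """Simulate FTPL exploitation with uniform exploration on Bernoulli losses."""
    rng = random.Random(seed)
    k = len(mean_losses)
    est = [0.0] * k            # importance-weighted cumulative loss estimates
    exploit_counts = [0] * k   # how often each arm was exploited
    for t in range(1, horizon + 1):
        eta = 1.0 / math.sqrt(t)  # assumed learning-rate schedule
        # Exploit the arm with the smallest perturbed cumulative loss estimate.
        perturbed = [est[i] - frechet(rng) / eta for i in range(k)]
        exploit = min(range(k), key=perturbed.__getitem__)
        exploit_counts[exploit] += 1
        # Explore a uniformly random arm; only its loss is observed.
        explore = rng.randrange(k)
        loss = 1.0 if rng.random() < mean_losses[explore] else 0.0
        est[explore] += loss * k  # unbiased because P(explore = i) = 1/k
    return exploit_counts


counts = run_decoupled_ftpl([0.2, 0.8, 0.8], horizon=2000)
```

In this stochastic toy instance the exploited arm settles on the arm with the smallest mean loss once the estimated gaps outgrow the perturbations.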

2510.10578 2026-05-29 math.PR math.ST stat.TH

On extremes for Gaussian subordination

高斯从属过程的极值理论

Shuyang Bai, Marie-Christine Duker

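AI summary: Studies extreme value theory for subordinated Gaussian processes, refining an existing method, extending the theory to the multivariate setting, and introducing the notion of $m$-extremal-dependence, and establishes weak convergence of point processes and multivariate extreme value limit theorems.

Comments 32 pages; revised based on reviewer's comments

AI Chinese abstract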

本文研究了通过对平稳高斯过程应用变换而得到的过程（也称为从属高斯过程）的极值理论。主要贡献如下：首先，我们改进了\cite{sly2008nonstandard}的方法，允许底层高斯过程的协方差以比任何多项式速率更慢的速度衰减，几乎达到Berman条件。其次，我们将理论推广到多元设置，其中从属过程和底层高斯过程都可以是向量值，且变换是有限维的。特别地，我们建立了由从属高斯过程构造的点过程的弱收敛，从而得到多元极值极限定理。一个促进我们分析的关键观察（可能具有独立意义）是：任何从两个具有非单位典型相关的联合高斯向量变换得到的二元随机向量始终保持极值独立。这一观察也促使我们引入并讨论一个称为m-极值依赖的概念，它扩展了经典的m-依赖概念。此外，我们放宽了对有限维变换的限制，通过近似论证将结果推广到无限维设置。作为示例，我们为多元移动最大值过程建立了极限定理，该过程由具有潜在长记忆的从属高斯过程产生的规则变化创新驱动。

English abstract

This paper investigates extreme value theory for processes obtained by applying transformations to stationary Gaussian processes, also called subordinated Gaussian processes. The main contributions are as follows. First, we refine the method of \cite{sly2008nonstandard} to allow the covariance of the underlying Gaussian process to decay more slowly than any polynomial rate, nearly matching Berman's condition. Second, we extend the theory to a multivariate setting, where both the subordinated process and the underlying Gaussian process may be vector-valued, and the transformation is finite-dimensional. In particular, we establish the weak convergence of a point process constructed from the subordinated Gaussian process, from which a multivariate extreme value limit theorem follows. A key observation that facilitates our analysis, and may be of independent interest, is the following: any bivariate random vector derived from transformations of two jointly Gaussian vectors with a non-unity canonical correlation always remains extremally independent. This observation also motivates us to introduce and discuss a notion we call $m$-extremal-dependence, which extends the classical concept of $m$-dependence. Moreover, we relax the restriction to finite-dimensional transforms, extending the results to infinite-dimensional settings via an approximation argument. As an illustration, we establish a limit theorem for a multivariate moving maxima process driven by regularly varying innovations that arise from subordinated Gaussian processes with potentially long memory.

2510.10020 2026-05-29 stat.ML cs.LG q-bio.BM

Calibrating Generative Models to Distributional Constraints

生成模型的分布约束校准

Henry D. Smith, Nathaniel L. Diamant, Brian L. Trippe

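Affiliations: Stanford University, Palo Alto, CA USA

AI summary: Addresses calibration of generative models whose sample statistics deviate from desired values by formulating calibration as a constrained optimization problem and fine-tuning with two surrogate objectives, a relaxed loss and a reward loss. In protein design, image generation, and language modeling, this substantially reduces calibration error under hundreds of simultaneous constraints.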

Comments To appear at the International Conference on Machine Learning (ICML), 2026. Codebase accompanying the paper is available at: https://github.com/smithhenryd/cgm

2509.21707 2026-05-29 stat.ML cs.LG stat.ME

SADA: Safe and Adaptive Aggregation of Multiple Black-Box Predictions in Semi-Supervised Learning

SADA：半监督学习中多个黑箱预测的安全自适应聚合

Jiawei Shan, Zhifeng Chen, Yiming Dong, Yazhen Wang, Jiwei Zhao

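Affiliations: Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison; Department of Statistics, University of Wisconsin-Madison

AI summary: Proposes a method that safely and adaptively aggregates multiple black-box predictions of uncertain quality, guaranteed to perform no worse than using the labeled data alone and attaining a faster convergence rate or the semiparametric efficiency bound when one of the predictions is perfect.

AI Chinese abstract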

半监督学习（SSL）在实践中出现于标注数据稀缺或获取成本高昂，而大量未标注数据易于获取的情况下。随着机器学习技术的广泛采用，使用多种模型和算法（包括深度学习、大语言模型和生成式AI）生成多个预测标签已变得越来越可行。在本文中，我们提出了一种新颖方法，能够安全且自适应地聚合多个质量不确定的黑箱预测，用于推理和预测任务。我们的方法提供两个关键保证：（i）无论预测质量如何，其表现永远不会差于仅使用标注数据；（ii）如果任意一个预测（无需知道是哪一个）完美拟合真实标签，算法会自适应地利用这一点，以实现更快的收敛速度或半参数效率界。我们通过小规模模拟和两项具有不同科学目标的真实数据分析展示了所提算法的有效性。提供了用户友好的R包sada以促进实际实施。

English abstract

Semi-supervised learning (SSL) arises in practice when labeled data are scarce or expensive to obtain, while large quantities of unlabeled data are readily available. With the growing adoption of machine learning techniques, it has become increasingly feasible to generate multiple predicted labels using a variety of models and algorithms, including deep learning, large language models, and generative AI. In this paper, we propose a novel approach that safely and adaptively aggregates multiple black-box predictions of uncertain quality for both inference and prediction tasks. Our method provides two key guarantees: (i) it never performs worse than using the labeled data alone, regardless of the quality of the predictions; and (ii) if any one of the predictions (without knowing which one) perfectly fits the ground truth, the algorithm adaptively exploits this to achieve either a faster convergence rate or the semiparametric efficiency bound. We demonstrate the effectiveness of the proposed algorithm through small-scale simulations and two real-data analyses with distinct scientific goals. A user-friendly R package, sada, is provided to facilitate practical implementation.

2509.08194 2026-05-29 cs.LG stat.ML

Sensor Fusion for Track Geometry Monitoring: Integrating On-Board Condition Monitoring and Degradation Models via Kalman Filtering

Huy Truong-Ba, Jacky Chin, Michael E. Cholette, Pietro Borghesani

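AI summary: Proposes fusing vibration signals from low-cost on-board sensors with degradation models through a Kalman filter to make track geometry predictions more reliable, and shows experimentally that frequent sensor data substantially reduces prediction uncertainty.

AI Chinese abstract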

轨道几何监测对于维护铁路运营的安全性和效率至关重要。虽然轨道检测车（TRCs）能提供轨道几何指标的精确测量，但其有限的可用性和高昂的运营成本限制了在大型铁路网络中的频繁监测。近年来，安装在运营列车上的车载传感器系统提供了一种成本效益高的替代方案，能够实现高频但精度较低的数据采集。本研究提出一种方法，通过卡尔曼滤波框架将低精度传感器振动信号与退化模型相结合，以增强轨道几何预测的可靠性。一项使用安装在TRC上的低成本传感器系统的实验活动评估了所提出的方法。结果表明，即使数据存在噪声，融入频繁的传感器数据也能显著降低预测不确定性。研究还探讨了数据记录频率如何影响可信预测区间的大小，为有效轨道监测和维护规划中车载传感器的最优部署提供指导。

English abstract

Track geometry monitoring is essential for maintaining the safety and efficiency of railway operations. While Track Recording Cars (TRCs) provide accurate measurements of track geometry indicators, their limited availability and high operational costs restrict frequent monitoring across large rail networks. Recent advancements in on-board sensor systems installed on in-service trains offer a cost-effective alternative by enabling high-frequency, albeit less accurate, data collection. This study proposes a method to enhance the reliability of track geometry predictions by integrating low-accuracy sensor vibration signals with degradation models through a Kalman filter framework. An experimental campaign using a low-cost sensor system mounted on a TRC evaluates the proposed approach. The results demonstrate that incorporating frequent sensor data significantly reduces prediction uncertainty, even when the data is noisy. The study also investigates how the frequency of data recording influences the size of the credible prediction interval, providing guidance on the optimal deployment of on-board sensors for effective track monitoring and maintenance planning.
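
The claim that frequent but noisy measurements shrink prediction uncertainty can be checked with a scalar Kalman filter. The sketch below assumes a random-walk-with-drift degradation model and made-up noise levels; the abstract does not specify the paper's state model, so the function `kalman_step` and every numeric value are illustrative assumptions.

```python
def kalman_step(mean, var, drift, q, z=None, r=None):
    """One predict step, plus a measurement update if z is given (scalar state).

    mean, var: current state estimate and its variance
    drift, q:  assumed degradation per step and process-noise variance
    z, r:      optional sensor reading and its measurement-noise variance
    """
    # Predict: the degradation model moves the state and inflates the variance.
    mean_pred = mean + drift
    var_pred = var + q
    if z is None:
        return mean_pred, var_pred
    # Update: blend prediction and measurement according to the Kalman gain.
    gain = var_pred / (var_pred + r)
    mean_new = mean_pred + gain * (z - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new


# Ten steps without any sensor data: the variance only grows.
m_open, v_open = 0.0, 1.0
for _ in range(10):
    m_open, v_open = kalman_step(m_open, v_open, drift=0.2, q=0.1)

# Ten steps with a noisy reading each step (measurement variance 4.0).
readings = [0.3, 0.1, 0.9, 0.6, 1.4, 0.8, 1.7, 1.5, 2.2, 1.8]  # made-up values
m_fused, v_fused = 0.0, 1.0
for z in readings:
    m_fused, v_fused = kalman_step(m_fused, v_fused, drift=0.2, q=0.1, z=z, r=4.0)
```

Even though each reading is four times noisier than the initial state estimate, `v_fused` ends below `v_open`, because every update multiplies the predicted variance by a factor smaller than one.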

2505.20634 2026-05-29 cs.LG stat.ML

Explaining Concept Shift with Interpretable Feature Attribution

用可解释的特征归因解释概念漂移

Ruiqi Lyu, Alistair Turcan, Bryan Wilder

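Affiliations: Carnegie Mellon University

AI summary: Proposes SGShift, which frames concept shift as a feature selection task and uses statistical tools such as generalized additive models, knockoffs, and absorption to identify the sparse set of shifted features that explains performance differences between source-domain and target-domain models.

AI Chinese abstract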

当特征条件标签分布在域间发生变化时，就会发生概念漂移，这可能导致即使调优良好的机器学习模型在新域上校准失效。识别这些漂移特征可以独特地揭示域间特征-标签关系如何不同，考虑到这种差异可能跨越科学相关的维度（如时间、疾病状态、人群等）。在本文中，我们提出SGShift，一种将表格数据中概念漂移导致的性能下降归因于稀疏漂移特征集的方法。我们将概念漂移框架化为特征选择任务，以学习能够解释源域和目标域模型间性能差异的特征。该框架使SGShift能够适应强大的统计工具，如广义加性模型、敲除和吸收，以识别这些漂移特征。我们在各种机器学习模型的合成数据和真实数据上进行了广泛实验，发现SGShift比基线方法更准确地识别漂移特征，在漂移域中所需样本少，并且对复杂的概念漂移情况具有鲁棒性。

English abstract

Concept shift occurs when the distribution of labels conditioned on the features changes between domains, which can make even a well-tuned ML model miscalibrated on a new domain. Identifying these shifted features provides unique insight into how feature-label relationships differ between domains, considering the difference may be across a scientifically relevant dimension, such as time, disease status, population, etc. In this paper, we propose SGShift, a method for attributing performance degradation under concept shift in tabular data to a sparse set of shifted features. We frame concept shift as a feature selection task to learn the features that can explain performance differences between models in the source and target domain. This framework enables SGShift to adapt powerful statistical tools such as generalized additive models, knockoffs, and absorption towards identifying these shifted features. We conduct extensive experiments in synthetic and real data across various ML models and find SGShift can identify shifted features much more accurately than baseline methods, requires few samples in the shifted domain, and is robust to complex cases of concept shift.

2505.02743 2026-05-29 cs.LG stat.ML

Cooperative Variance Estimation and Bayesian Neural Networks for Disentangling Aleatoric and Epistemic Uncertainties

合作方差估计与贝叶斯神经网络用于分离偶然不确定性和认知不确定性

Jiaxiang Yi, Miguel A. Bessa

Affiliations: Faculty of Mechanical Engineering, Delft University of Technology, Mekelweg 2, Delft, 2628 CD, The Netherlands; School of Engineering, Brown University, 184 Hope St., Providence, RI 02912, USA

AI summary: Proposes cooperatively training a variance estimation network with a Bayesian neural network to disentangle aleatoric and epistemic uncertainty while improving mean estimation.

Comments 38 pages, 26 figures

AI Chinese abstract

真实世界的数据包含偶然不确定性——由不完美的测量或对数据生成过程的不完全了解引起的不可约噪声。均值-方差估计网络可以学习这种类型的不确定性，但需要即兴的正则化策略以避免过拟合，并且无法预测认知不确定性（模型不确定性）。相反，贝叶斯神经网络可以预测认知不确定性，但由于贝叶斯推断的近似性质，它们以难以训练而著称。我们提出合作训练一个方差估计网络与一个贝叶斯神经网络，并通过实验证明，所得模型在改善均值估计的同时分离了偶然不确定性和认知不确定性。我们展示了该方法在多种数据集上的有效性和可扩展性，包括我们创建的一个时间依赖异方差回归数据集，其中偶然不确定性是已知的。所提出的方法易于实现、鲁棒，并且适用于各种模型架构。

English abstract

Real-world data contains aleatoric uncertainty - irreducible noise arising from imperfect measurements or from incomplete knowledge about the data generation process. Mean-variance estimation networks can learn this type of uncertainty but require ad-hoc regularization strategies to avoid overfitting and are unable to predict epistemic uncertainty (model uncertainty). Conversely, Bayesian neural networks predict epistemic uncertainty but are notoriously difficult to train due to the approximate nature of Bayesian inference. We propose to cooperatively train a variance estimation network with a Bayesian neural network and empirically demonstrate that the resulting model disentangles aleatoric and epistemic uncertainties while improving the mean estimation. We demonstrate the effectiveness and scalability of this method across a diverse range of datasets, including a time-dependent heteroscedastic regression dataset we created where the aleatoric uncertainty is known. The proposed method is straightforward to implement, robust, and adaptable to various model architectures.

2502.04867 2026-05-29 stat.AP

Invariant Image Reparameterisation: Bridging Symbolic and Numerical Methods for Identifiability Analysis, Model Reduction, and Prediction

不变图像重参数化：连接符号与数值方法进行可辨识性分析、模型简化与预测

Oliver J. Maclaren, Ruanui Nicholson, Joel A. Trent, Joshua Rottenberry, Matthew Simpson

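AI summary: Proposes Invariant Image Reparameterisation (IIR), which replaces symbolic reparameterisation conditions with numerical derivative calculations at a single reference point, supporting model reduction, identifiability analysis, and prediction.

Comments 41 pages incl. supplementary material (main text approx. 28 pages)

AI Chinese abstract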

当数学模型用于解释数据时，结构性和实践性参数不可辨识问题很常见。此类问题促使了模型重参数化和简化方法的发展。本文考虑不变图像重参数化（IIR），探讨何时可将符号重参数化条件替换为单参考点处的数值导数计算。核心对象是不变图像：一种简化且与基无关的表示，用于描述控制可观测模型行为的参数组合。我们证明，当存在一一对应的分量变换使得可观测行为仅依赖于变换后参数的固定线性组合时，单个数值雅可比矩阵即可确定相关的低维重参数化空间。这包括依赖于原始参数单项式组合的模型。我们还给出了一阶不变性条件，通过局部零空间的不变部分区分最小简化与非最小但精确的简化。在结构可辨识但实践弱信息的情况下，相同的计算可分离强信息与弱信息的参数组合。不变图像支持多种坐标表示：奇异值分解（SVD）提供默认的按局部可辨识性排序的标准正交基，而稀疏单项式基通常更具可解释性。将这些坐标作为剖面分析（Profile-Wise Analysis）中的关注参数，可得到基于似然的不确定性量化和预测。我们在具有泊松极限、扩展泊松极限和非极限情况的参数化正态模型上，以及在基因调控的非线性微分方程模型repressilator上演示了该方法。IIR的Julia实现（包含这些示例及更多案例）可在https://github.com/omaclaren/reparam获取。

English abstract

Structural and practical parameter non-identifiability issues are common when mathematical models are used to interpret data. Such issues motivate model reparameterisation and reduction methods. Here, we consider Invariant Image Reparameterisation (IIR), which asks when symbolic reparameterisation conditions can be replaced by numerical derivative calculations at a single reference point. The central object is the invariant image: a reduced, basis-independent representation of the parameter combinations controlling observable model behaviour. We show that when a one-to-one componentwise transformation makes observable behaviour depend only on fixed linear combinations of the transformed parameters, a single numerical Jacobian determines the associated lower-dimensional reparameterisation space. This includes models depending on monomial combinations of the original parameters. We also give a first-order invariance condition that distinguishes minimal from non-minimal but exact reductions via the invariant part of the local null space. In structurally identifiable but practically weakly informed settings, the same calculations separate strongly and weakly informed parameter combinations. The invariant image admits multiple coordinate representations: the SVD gives a default orthonormal basis ordered by local identifiability, while sparse monomial bases are often more interpretable. Treating these coordinates as interest parameters in Profile-Wise Analysis gives likelihood-based uncertainty quantification and prediction. We demonstrate the method on parameterised normal models with Poisson-limit, extended Poisson-limit, and non-limit cases, and on the repressilator, a nonlinear differential equation model of gene regulation. A Julia implementation of IIR, with these and further examples, is available at https://github.com/omaclaren/reparam.

2410.19371 2026-05-29 stat.ML cs.CR cs.LG

Noise-Aware Differentially Private Variational Inference

噪声感知的差分隐私变分推断

Talal Alrawajfeh, Joonas Jälkö, Antti Honkela

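Affiliations: University of Helsinki

AI summary: To address unreliable downstream inference under differential privacy, proposes a noise-aware approximate Bayesian inference method based on stochastic gradient variational inference that applies to high-dimensional and non-conjugate models and yields more accurate posterior estimates.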

Comments 26 pages, 4 figures

2408.15451 2026-05-29 cs.LG cs.CR stat.ME

Certified Causal Defense with Generalizable Robustness

具有泛化鲁棒性的认证因果防御

Yiran Qiao, Yu Yin, Chen Chen, Jing Ma

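Affiliations: Case Western Reserve University; University of Virginia

AI summary: Proposes the GLEAN framework, which uses certifiable causal factor learning to disentangle causal relations from spurious correlations and a causally certified defense strategy to generalize certified robustness across domains with distribution shifts.

Comments Accepted by AAAI 2025

AI Chinese abstract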

尽管机器学习模型在各种场景中已被证明有效，但普遍认为许多模型容易受到对抗性攻击。近年来，出现了大量对抗性防御的研究。其中，认证防御因其对输入在特定范围内（例如$l_2$球）的任意对抗性扰动具有理论保证而闻名。然而，该领域现有的大多数工作难以将其认证鲁棒性泛化到具有分布偏移的其他数据域中。这一问题的根源在于难以消除不同域中虚假相关性对鲁棒性的负面影响。为解决此问题，本文提出了一种新颖的认证防御框架GLEAN，该框架将因果视角引入认证防御的泛化问题。具体而言，我们的框架集成了一个可认证的因果因子学习组件，以解耦输入与标签之间的因果关系和虚假相关性，从而排除虚假相关性对防御的负面影响。在此基础上，我们设计了一种因果认证防御策略来处理对潜在因果因子的对抗性攻击。通过这种方式，我们的框架不仅对训练分布中数据上的恶意噪声具有鲁棒性，而且能够将其鲁棒性泛化到具有分布偏移的各个域中。在基准数据集上的大量实验验证了我们的框架在不同数据域中认证鲁棒性泛化的优越性。代码见补充材料。

English abstract

While machine learning models have proven effective across various scenarios, it is widely acknowledged that many models are vulnerable to adversarial attacks. Recently, there have emerged numerous efforts in adversarial defense. Among them, certified defense is well known for its theoretical guarantees against arbitrary adversarial perturbations on input within a certain range (e.g., $l_2$ ball). However, most existing works in this line struggle to generalize their certified robustness in other data domains with distribution shifts. This issue is rooted in the difficulty of eliminating the negative impact of spurious correlations on robustness in different domains. To address this problem, in this work, we propose a novel certified defense framework GLEAN, which incorporates a causal perspective into the generalization problem in certified defense. More specifically, our framework integrates a certifiable causal factor learning component to disentangle the causal relations and spurious correlations between input and label, and thereby exclude the negative effect of spurious correlations on defense. On top of that, we design a causally certified defense strategy to handle adversarial attacks on latent causal factors. In this way, our framework is not only robust against malicious noises on data in the training distribution but also can generalize its robustness across domains with distribution shifts. Extensive experiments on benchmark datasets validate the superiority of our framework in certified robustness generalization in different data domains. Code is available in the supplementary materials.

2407.04142 2026-05-29 stat.ME

Bayesian Structured Mediation Analysis With Unobserved Confounders

贝叶斯结构化中介分析：存在未观测混杂因素

Yuliang Xu, Shu Yang, Jian Kang

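AI summary: For high-dimensional mediators with spatially smooth structure (such as brain imaging data) affected by unobserved confounders, proposes the Bayesian Structured Mediation analysis with Unobserved confounders (BASMU) framework, which debiases mediation effects by introducing latent individual effects as unobserved confounders, and establishes model identifiability conditions and a two-stage estimation algorithm.

AI Chinese abstract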

我们探索了减少未观测混杂因素对具有空间平滑结构的高维中介变量（如脑成像数据）的因果中介分析影响的方法。关键方法是将影响结构化中介变量的潜在个体效应作为未观测混杂因素纳入结果模型，从而可能对中介效应进行去偏。我们开发了贝叶斯结构化中介分析（BASMU）框架，并建立了其模型可识别性条件。当中介分析中忽略未观测混杂因素时，我们对自然间接效应（NIE）和自然直接效应（NDE）的渐近偏差进行了理论分析。针对BASMU，我们提出了一种两阶段估计算法，以减轻这些未观测混杂因素对估计中介效应的影响。大量模拟表明，BASMU在各种场景下显著减少了偏差。我们将BASMU应用于青少年脑认知发展（ABCD）研究的fMRI数据分析，重点关注先前报道显示具有有意义中介效应的四个脑区。与现有的图像中介分析方法相比，BASMU识别出具有显著中介效应的体素数量增加了两到四倍，其中NIE增加了41%，NDE减少了26%。

English abstract

We explore methods to reduce the impact of unobserved confounders on the causal mediation analysis of high-dimensional mediators with spatially smooth structures, such as brain imaging data. The key approach is to incorporate the latent individual effects, which influence the structured mediators, as unobserved confounders in the outcome model, thereby potentially debiasing the mediation effects. We develop BAyesian Structured Mediation analysis with Unobserved confounders (BASMU) framework, and establish its model identifiability conditions. Theoretical analysis is conducted on the asymptotic bias of the Natural Indirect Effect (NIE) and the Natural Direct Effect (NDE) when the unobserved confounders are omitted in mediation analysis. For BASMU, we propose a two-stage estimation algorithm to mitigate the impact of these unobserved confounders on estimating the mediation effect. Extensive simulations demonstrate that BASMU substantially reduces the bias in various scenarios. We apply BASMU to the analysis of fMRI data in the Adolescent Brain Cognitive Development (ABCD) study, focusing on four brain regions previously reported to exhibit meaningful mediation effects. Compared with the existing image mediation analysis method, BASMU identifies two to four times more voxels that have significant mediation effects, with the NIE increased by 41%, and the NDE decreased by 26%.

2402.01866 2026-05-29 stat.ME

Parametric Bootstrap for Fixed Edge-Probability Network Models

Zhixuan Shao, Can M. Le

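AI summary: To quantify the uncertainty of network statistics under the Chung-Lu model, the paper proposes a two-level bootstrap that reduces the bias of the parametric bootstrap and yields more accurate confidence intervals.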

Abstract

This paper studies parametric bootstrap methods for network data, with the goal of quantifying the uncertainty of network statistics of interest. While existing network resampling methods primarily focus on count statistics under node-exchangeable graphon models, we consider more general network statistics, including local statistics, under the Chung-Lu model without assuming node exchangeability. We show that the natural network parametric bootstrap, which first estimates the network-generating model and then draws bootstrap samples from the estimated model, generally suffers from bootstrap bias. As a general remedy, we show that a two-level bootstrap procedure provably reduces this bias. This extends the classical idea of the iterative bootstrap to the network setting, where the number of parameters grows with the network size. Moreover, for many network statistics, the second-level bootstrap provides a way to construct confidence intervals with higher accuracy. As a by-product of this analysis, we also obtain a central limit theorem for subgraph counts under the inhomogeneous Erdős-Rényi model, which may be of independent interest.
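
The "estimate the model, then resample from it" procedure in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: edge probabilities are estimated from observed degrees as min(1, d_i d_j / Σd), the triangle count stands in for a generic network statistic, all function names are hypothetical, and the two-level step uses the classical iterated-bootstrap bias correction (2·θ₁ − θ₂), which may differ from the procedure analyzed in the paper.

```python
import random
from statistics import mean


def degrees_of(adj):
    """Node degrees of a symmetric 0/1 adjacency matrix."""
    return [sum(row) for row in adj]


def chung_lu_probs(degrees):
    """Assumed estimator: p_ij = min(1, d_i * d_j / sum(d)), zero on the diagonal."""
    total = sum(degrees)
    n = len(degrees)
    if total == 0:
        return [[0.0] * n for _ in range(n)]
    return [
        [0.0 if i == j else min(1.0, degrees[i] * degrees[j] / total)
         for j in range(n)]
        for i in range(n)
    ]


def sample_graph(probs, rng):
    """Draw a symmetric adjacency matrix with independent edges."""
    n = len(probs)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < probs[i][j]:
                adj[i][j] = adj[j][i] = 1
    return adj


def triangle_count(adj):
    """Example statistic: number of triangles (O(n^3), fine for small graphs)."""
    n = len(adj)
    return sum(
        adj[i][j] * adj[j][k] * adj[i][k]
        for i in range(n)
        for j in range(i + 1, n)
        for k in range(j + 1, n)
    )


def parametric_bootstrap(adj, statistic, n_boot, rng):
    """One-level bootstrap: fit the model to adj, then resample the statistic."""
    probs = chung_lu_probs(degrees_of(adj))
    return [statistic(sample_graph(probs, rng)) for _ in range(n_boot)]


def double_bootstrap_mean(adj, statistic, n_outer, n_inner, rng):
    """Classical iterated-bootstrap bias correction of the bootstrap mean.

    theta1: mean statistic over graphs drawn from the model fitted to adj.
    theta2: the same quantity after refitting the model to each level-1 graph.
    Returns 2 * theta1 - theta2 (an assumption, not the paper's exact estimator).
    """
    probs = chung_lu_probs(degrees_of(adj))
    level1, level2 = [], []
    for _ in range(n_outer):
        g1 = sample_graph(probs, rng)
        level1.append(statistic(g1))
        level2.append(mean(parametric_bootstrap(g1, statistic, n_inner, rng)))
    return 2 * mean(level1) - mean(level2)
```

A typical call would pass an observed adjacency matrix, for example `double_bootstrap_mean(adj, triangle_count, 200, 50, random.Random(0))`. The nested loop costs `n_outer * n_inner` graph draws, which is the usual price of a second bootstrap level.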

2308.13222 2026-05-29 physics.comp-ph cs.LG physics.flu-dyn stat.ML

Bayesian Reasoning for Physics Informed Neural Networks

Krzysztof M. Graczyk, Kornel Witkowski

Affiliations: Institute for Theoretical Physics, University of Wrocław; Institute of Low Temperature and Structure Research; Polish Academy of Sciences

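AI summary: Proposes an evidence-driven Bayesian approach to physics-informed neural networks that uses the Laplace approximation to compute the model evidence efficiently and automatically tunes the loss weights among the PDE residual, boundary conditions, and observational data; its solution accuracy and uncertainty quantification are validated on the heat, wave, and Burgers' equations.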

Comments 21 pages, 12 figures, re-edit the description of the Bayesian framework, some of the content moved to Appendix. Discussion of numerical performance added, as well as related approaches

Journal ref Phys. Rev. E 113, 055307 (2026)

2212.08549 2026-05-29 stat.CO astro-ph.IM hep-lat hep-th

Microcanonical Hamiltonian Monte Carlo

Jakob Robnik, G. Bruno De Luca, Eva Silverstein, Uroš Seljak

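AI summary: Proposes Microcanonical Hamiltonian Monte Carlo (MCHMC), which achieves ergodicity through fixed-energy Hamiltonian dynamics combined with energy-conserving momentum bounces, and develops an underdamped Langevin-like variant (MCLMC) with continuous, partially direction-preserving bounces. On several benchmark problems it outperforms NUTS HMC, in some cases by more than an order of magnitude.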

Comments 34 pages, 11 figures

Abstract

We develop Microcanonical Hamiltonian Monte Carlo (MCHMC), a class of models that follow fixed-energy Hamiltonian dynamics, in contrast to Hamiltonian Monte Carlo (HMC), which follows the canonical distribution with different energy levels. MCHMC tunes the Hamiltonian function such that the marginal of the uniform distribution on the constant-energy surface over the momentum variables gives the desired target distribution. We show that MCHMC requires occasional energy-conserving, billiard-like momentum bounces for ergodicity, analogous to momentum resampling in HMC. We generalize the concept of bounces to a continuous version with partial direction-preserving bounces at every step, which gives energy-conserving, underdamped Langevin-like dynamics with non-Gaussian noise (MCLMC). MCHMC and MCLMC exhibit favorable scalings with condition number and dimensionality. We develop an efficient hyperparameter tuning scheme that achieves high performance and consistently outperforms NUTS HMC on several standard benchmark problems, in some cases by more than an order of magnitude.
