arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.16246 2026-05-18 stat.ME stat.ML

FRESH: Information-Geometric Calibration of Patient-Level Models to Aggregate Evidence

Franklin Fuller, Daniele Bertolini, Samantha Liang, Jason Christopher, Aaron M. Smith

AI summary: FRESH uses information geometry to combine population-level results with patient-level data, improving the efficiency and accuracy of clinical decision models.


Abstract

This note introduces FRESH (Fusion of Recent Evidence and Subject Histories), a method for incorporating population-level summary results -- published clinical trials, registry summaries, prior natural-history studies, and peer-reviewed indirect comparisons -- into predictive models trained on patient-level data. This method provides a principled means of combining both patient-level and aggregate-level data types into a unified data-efficient model for clinical decision making. FRESH assumes access to a generative model trained on patient-level data sources (e.g. clinical trial or real-world data). The method produces patient-level predictions from a re-calibrated model that matches a set of specified aggregate statistics for a target population. This can be understood as a patient-level recapitulation of the aggregate source -- with the key property that the recalibration is a minimal perturbation of the original joint distribution in a specific information-geometric sense. The resulting samples can be analyzed directly or combined into a post-training procedure to update the original generative model. This approach enables several applications where rigorously incorporating patient-level data with summary information is valuable, including (i) contextualizing single-arm trial results with respect to recent standard-of-care, (ii) clinical-trial simulations for design and probability-of-technical-success estimation, and (iii) comparative-effectiveness analyses of on-market therapies.
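One standard way to formalize a "minimal perturbation" that matches an aggregate statistic is the I-projection: among all reweightings of the model's samples whose weighted mean equals the target, pick the one closest in KL divergence to uniform weights. The solution is an exponential tilt. The sketch below is only an illustration of that general idea; the abstract does not say that FRESH uses this construction, and the function name `tilt_weights` and the toy numbers are hypothetical.

```python
import math

def tilt_weights(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Weights w_i proportional to exp(lam * x_i) whose weighted mean hits target_mean.

    Assumes min(values) < target_mean < max(values). The weighted mean is
    increasing in lam, so lam is found by bisection.
    """
    def tilted(lam):
        shift = max(lam * v for v in values)          # for numerical stability
        w = [math.exp(lam * v - shift) for v in values]
        total = sum(w)
        w = [wi / total for wi in w]
        return sum(wi * v for wi, v in zip(w, values)), w

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        mean, _ = tilted(mid)
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted(0.5 * (lo + hi))[1]

# Toy "patient-level" outcomes from a base model, and a published aggregate mean.
samples = [0.0, 1.0, 2.0, 3.0, 4.0]
weights = tilt_weights(samples, target_mean=2.5)
print(weights)
```

The reweighted samples reproduce the aggregate mean while staying as close as possible (in KL divergence) to the original uniform weighting.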

2605.16229 2026-05-18 cs.IT math.IT math.ST stat.ML stat.TH

Breaking the Finite-Sample Barrier in Entropy Coupling

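Shahab Asoodeh, Jun Chen

AI summary: The paper introduces the minimum list entropy coupling and studies how dependent observations can break the finite-sample barrier. A conditional-entropy analysis shows that independent observations reduce uncertainty exponentially, whereas dependent observations can eliminate it after finitely many samples.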


Abstract

Dependence among marginally constrained observations can break a finite-sample barrier. To formalize this phenomenon, we introduce the \emph{minimum list entropy coupling} $H(P\|Q_1,\dots,Q_m)$, the minimum conditional entropy $H(X|Y_1,\dots,Y_m)$ over all joint distributions with prescribed discrete marginals $X\sim P$ and $Y_i\sim Q_i$. Unlike classical formulations based on independent observations, our model allows $Y_1,\dots,Y_m$ to be arbitrarily dependent while keeping each marginal fixed. This enlarged coupling space reveals a sharp dichotomy: independent observations reduce residual uncertainty exponentially, whereas dependent observations can eliminate it exactly after finitely many samples. We characterize this zero-entropy regime through necessary and sufficient conditions and give concrete structural criteria under which it occurs. In particular, under mild support assumptions, zero entropy is achieved with $O(\log(1/P_{\min}))$ observations, where $P_{\min}$ is the minimum nonzero mass of $P$. We also develop a greedy algorithm with monotone approximation guarantees for computing $H(P\|Q_1,\dots,Q_m)$. Finally, we show that the same framework formalizes finite-sample limits in distribution-matching representation learning and randomness extraction, where zero entropy corresponds to exact recovery and exact extraction.
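To make the objective concrete, the sketch below evaluates $H(X|Y_1,Y_2)$ for three hand-built couplings that all share the same marginals ($X$ uniform on four values, each $Y_i$ a fair bit). It only illustrates that the choice of coupling changes the conditional entropy while the marginals stay fixed; it is not the paper's greedy algorithm, and the example couplings are assumptions made for illustration.

```python
import math
from collections import defaultdict

def conditional_entropy(joint):
    """H(X | Y) in bits for a joint pmf given as {(x, y): probability}."""
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    h = 0.0
    for (_, y), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / p_y[y])
    return h

xs = range(4)
# Y1, Y2 are the two bits of X: X is recovered exactly.
bits_of_x = {(x, (x >> 1, x & 1)): 0.25 for x in xs}
# Y1 = Y2 = high bit of X: the second observation adds nothing.
repeated_bit = {(x, (x >> 1, x >> 1)): 0.25 for x in xs}
# Observations independent of X: nothing is learned.
independent = {(x, (a, b)): 0.25 * 0.25 for x in xs for a in (0, 1) for b in (0, 1)}

for name, joint in [("bits", bits_of_x), ("repeated", repeated_bit), ("independent", independent)]:
    print(name, conditional_entropy(joint))
```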

2605.16221 2026-05-18 stat.ME stat.AP

Why Empirical p-Values Are Not Uniform: Reference Samples, Dependence, and PIT Backtesting

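Jakub Lis

AI summary: Empirical p-values are not uniformly distributed because of the shared reference sample and the dependence it induces; backtesting methods need to be revised to account for the two-stage sampling structure.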

Comments 16 pages, 5 figures


Abstract

Probability integral transforms (PITs) and empirical $p$-values are widely used to assess the calibration of predictive distributions. While exact PIT values are uniformly distributed under correct model specification, practical implementations rely on empirical estimates constructed from finite samples. We show that this estimation step fundamentally alters the statistical structure of the problem. In particular, common-sample and rolling-window implementations introduce dependence and variance distortions that invalidate classical one-sample uniformity tests. When empirical percentiles are conditioned on a shared reference sample, the resulting statistics converge towards a two-sample Kolmogorov--Smirnov regime, while rolling windows induce autocorrelation and variance suppression. Our findings indicate that treating empirical percentiles as independent uniform draws can distort statistical inference and that backtesting procedures based on PITs require revised calibration methods accounting for the underlying two-stage sampling structure.
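The variance distortion from a shared reference sample can be seen in a small simulation. The sketch below computes empirical PITs of `n_test` values against one common reference sample of size `n_ref` and tracks the variance of their mean across replications. Under the null the mean is a rescaled Mann-Whitney statistic with variance $(m+n+1)/(12mn)$, which is much larger than the $1/(12m)$ expected for independent uniform draws when the reference sample is small. The sample sizes, the seed, and the function names are illustrative assumptions, not taken from the paper.

```python
import bisect
import random
import statistics

def empirical_pit(sorted_reference, x):
    """Fraction of the reference sample that is <= x (an empirical PIT)."""
    return bisect.bisect_right(sorted_reference, x) / len(sorted_reference)

def mean_pit_variance(n_ref, n_test, n_rep, seed=0):
    """Variance, across replications, of the mean empirical PIT when all
    n_test values in a replication share one reference sample of size n_ref."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_rep):
        ref = sorted(rng.random() for _ in range(n_ref))
        pits = [empirical_pit(ref, rng.random()) for _ in range(n_test)]
        means.append(statistics.fmean(pits))
    return statistics.pvariance(means)

shared = mean_pit_variance(n_ref=20, n_test=200, n_rep=500)
iid_uniform = 1 / (12 * 200)  # variance of the mean of 200 independent U(0,1) draws
print(shared, iid_uniform)
```

A one-sample uniformity test calibrated to the second number would reject far too often when the statistics actually behave like the first.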

2605.16219 2026-05-18 cs.LG stat.ML

The Privacy Price of Tail-Risk Learning: Effective Tail Sample Size in Differentially Private CVaR Optimization

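El Mustapha Mansouri

AI summary: The paper shows how differential privacy changes the effective sample size for CVaR learning, decomposes the excess risk into a statistical error and a privacy price, derives rates for scalar estimation and finite classes, and identifies private learning on the effective tail sample as the core difficulty.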

Comments 34 pages, 3 figures, 2 tables


Abstract

Differential privacy changes the effective sample size governing CVaR learning. For tail mass $τ$, the privacy-relevant sample size is not $n$, but $nτ$; equivalently, the effective private tail sample size is $εnτ$. Private CVaR excess risk decomposes into ordinary tail-risk statistical error and a privacy price. This decomposition is complete for scalar estimation and finite classes: scalar estimation has rate $Θ(B \min\{1,(nτ)^{-1/2}+(εnτ)^{-1}\})$, and finite classes of size $M$ have rate $Θ(B \min\{1,\sqrt{\log(2M)/(nτ)}+\log(2M)/(εnτ)\})$. These complete rates hold under pure DP, and their lower bounds extend to approximate DP in the stated small-$δ$ regimes. For convex Lipschitz learning, modular upper and lower reductions show that the CVaR-specific privacy term necessarily scales as $1/(εnτ)$, with dimension dependence inherited from private stochastic convex optimization. Together, these results identify ordinary private learning on $Θ(nτ)$ informative tail records as the canonical hard subproblem inside private CVaR learning.
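The role of the tail sample size $nτ$ can be made concrete with a short sketch. It computes a plain (non-private) empirical CVaR as the average of the $\lceil nτ \rceil$ largest losses and evaluates the scalar-estimation rate from the abstract with the hidden $Θ(\cdot)$ constant set to 1. The constant, the function names, and the toy inputs are assumptions for illustration; no privacy mechanism is implemented here.

```python
import math

def empirical_cvar(losses, tau):
    """Average of the ceil(n * tau) largest losses (upper-tail CVaR)."""
    n = len(losses)
    k = max(1, math.ceil(n * tau - 1e-12))  # number of tail records actually used
    tail = sorted(losses, reverse=True)[:k]
    return sum(tail) / k

def scalar_rate(B, n, tau, eps):
    """B * min(1, (n*tau)^(-1/2) + (eps*n*tau)^(-1)), constant set to 1."""
    n_tail = n * tau
    return B * min(1.0, n_tail ** -0.5 + 1.0 / (eps * n_tail))

losses = [float(i) for i in range(1, 11)]
print(empirical_cvar(losses, tau=0.2))                  # mean of the two largest losses
print(scalar_rate(B=1.0, n=10_000, tau=0.01, eps=1.0))  # statistical term plus privacy term
print(scalar_rate(B=1.0, n=10_000, tau=0.01, eps=0.1))  # smaller eps inflates the privacy term
```

With $n = 10{,}000$ and $τ = 0.01$ only about 100 records inform the estimate, which is why both terms are governed by $nτ$ rather than $n$.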

2605.16208 2026-05-18 stat.ML cs.LG

A Scalable Nonparametric Continuous-Time Survival Model through Numerical Quadrature

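Chaeyeon Lee, Sehwan Kim, Hyungrok Do

AI summary: The paper proposes QSurv, which uses Gauss-Legendre quadrature for nonparametric continuous-time survival modeling without time discretization or restrictive distributional assumptions. It captures non-stationary hazard dynamics, and experiments show advantages in instantaneous hazard function estimation.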


Abstract

Flexible continuous-time survival modeling is critical for capturing complex time-varying hazard dynamics in high-dimensional data; however, training such models remains challenging due to the intractable integral required for likelihood estimation. We introduce QSurv, a scalable deep learning framework that enables nonparametric continuous-time modeling without relying on time discretization or restrictive distributional assumptions. We propose a training objective based on Gauss-Legendre numerical quadrature, which approximates the cumulative hazard with high-order accuracy while facilitating efficient end-to-end training via standard backpropagation. Furthermore, to effectively capture non-stationary hazard dynamics in complex architectures, we introduce time-conditioned low-rank adaptation, a mechanism that conditions general neural backbones on time by dynamically modulating weights via low-rank updates. We provide theoretical analysis establishing approximation error bounds for cumulative-hazard evaluation. Comprehensive experiments across synthetic benchmarks, large-scale real-world tabular datasets, and high-dimensional medical imaging tasks demonstrate that QSurv achieves competitive predictive performance with advantages in instantaneous hazard function estimation, enabling more interpretable characterization of time-varying risk patterns.
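The quadrature step can be sketched in a few lines. The survival log-likelihood needs the cumulative hazard $H(t) = \int_0^t h(s)\,ds$, which Gauss-Legendre quadrature approximates by a weighted sum of hazard evaluations at fixed nodes. The sketch below hardcodes the 3-point rule and uses a closed-form hazard in place of a neural network; both choices, and the function names, are assumptions for illustration rather than the QSurv implementation.

```python
import math

# 3-point Gauss-Legendre nodes and weights on [-1, 1] (exact up to degree 5).
NODES = (-math.sqrt(3 / 5), 0.0, math.sqrt(3 / 5))
WEIGHTS = (5 / 9, 8 / 9, 5 / 9)

def cumulative_hazard(hazard, t):
    """Approximate the integral of hazard(s) over [0, t] by mapping nodes to [0, t]."""
    half = 0.5 * t
    return half * sum(w * hazard(half * (u + 1.0)) for u, w in zip(NODES, WEIGHTS))

def log_likelihood(hazard, t, event):
    """log h(t) - H(t) for an observed event, -H(t) for a censored time."""
    ll = -cumulative_hazard(hazard, t)
    if event:
        ll += math.log(hazard(t))
    return ll

def weibull_hazard(s):
    return 3.0 * s * s  # Weibull hazard with shape 3 and scale 1, so H(t) = t**3

print(cumulative_hazard(weibull_hazard, 2.0))
print(log_likelihood(weibull_hazard, 1.0, event=True))
```

Because the sum is a fixed linear combination of hazard evaluations, gradients flow through it with ordinary backpropagation when `hazard` is a network; a higher-order rule would normally come from a library routine such as `numpy.polynomial.legendre.leggauss`.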

2605.16145 2026-05-18 stat.ML cs.LG

Skew-adaptive conformal prediction

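Paulo C. Marques F., Helton Graziadei

AI summary: The paper proposes a skew-adaptive conformal prediction method that builds conformity scores from an asymmetric interval family via the gauge approach, and trains an additional predictive model on an inverse hyperbolic sine transform to learn how uncertainty tilts across the feature space. The method keeps finite-sample marginal validity while adapting to local scale and skewness.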

Comments 17 pages, 2 figures


Abstract

We develop a skew-adaptive extension of split conformal prediction for regression. The method starts from an asymmetric interval family centered at a point prediction and uses the gauge approach to deduce the conformity score induced by this family. The inverse hyperbolic sine transform of signed scaled residuals provides the training target for an additional predictive model, whose role is to learn how predictive uncertainty should tilt across the feature space. The resulting procedure preserves the finite-sample marginal validity of split conformal prediction under exchangeability, while producing intervals that adapt to both local scale and local skewness. We also develop a calibration-sample-based estimator for comparing the expected relative future width of the skew-adaptive and classical scaled-score intervals. Experiments on a variety of datasets indicate gains in prediction interval efficiency over the scaled-score construction and conformalized quantile regression, and show that the proposed estimator closely matches the corresponding average width ratio observed on the test sample.
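For orientation, the sketch below shows the classical scaled-score split conformal construction that the abstract uses as a baseline, together with the inverse hyperbolic sine of a signed scaled residual that the abstract names as the training target for the additional model. It does not implement the skew-adaptive score itself, which the abstract does not spell out; the function names and toy numbers are hypothetical.

```python
import math

def conformal_quantile(scores, alpha):
    """The ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def scaled_score_interval(mu, sigma, q):
    """Symmetric interval mu +/- q * sigma from scores |y - mu| / sigma."""
    return mu - q * sigma, mu + q * sigma

def asinh_target(y, mu, sigma):
    """Inverse hyperbolic sine of the signed scaled residual."""
    return math.asinh((y - mu) / sigma)

calibration_scores = [3, 1, 2]           # toy values of |y - mu| / sigma
q = conformal_quantile(calibration_scores, alpha=0.5)
print(scaled_score_interval(mu=10.0, sigma=2.0, q=q))
```

The symmetric interval always extends equally on both sides of the point prediction, which is the limitation a skew-adaptive score is meant to remove.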

2605.16140 2026-05-18 cs.IT cs.SY eess.SY math.IT math.ST stat.TH

Covert Bayesian Quickest Change Detection

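Yun-Feng Lo, Matthieu R. Bloch

AI summary: The paper studies covert quickest change detection in a Bayesian, infinite-horizon setting, introduces a covertness budget metric, analyzes second-order bounds on detection delay under false-alarm probability and covertness budget constraints, and proposes an achievability scheme.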

Comments 36 pages, 2 figures. Submitted to IEEE ITW 2026


Abstract

We investigate the problem of covert quickest change detection in a Bayesian and infinite-horizon setting. A legitimate entity seeks to detect a change in the state of a discrete memoryless channel as quickly as possible by actively probing it. Simultaneously, the entity must ensure its probing remains covert from an adversary monitoring the channel for active sensing. We introduce the expected covertness budget (ECB) as an analytically tractable covertness metric that bounds from above the relative entropy between the observation sequences induced by active and passive sensing. Under constraints on both the probability of false alarm (PFA) and the ECB, we establish a second-order asymptotic converse bound on the average detection delay as the PFA constraint approaches zero, for any positive ECB constraint, explicitly quantifying the maximum square-root-order covert sensing gain possible. Furthermore, we propose an achievability scheme utilizing a constant-sensing-probability Shiryaev-type policy and show that it matches the second-order asymptotic converse. We illustrate our result with a numerical example.

2605.16126 2026-05-18 cs.LG cs.AI cs.IT math.IT math.ST stat.OT stat.TH

Entropy Across the Bridge: Conditional-Marginal Discretization for Flow and Schrödinger Samplers

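Bruno Trentini, Dejan Stancevic, Michael M. Bronstein, Alexander Tong, Luca Ambrogioni

AI summary: The paper proposes an entropy-rate objective for bridge-aware discretization that separates endpoint-conditioned bridge geometry from marginal flow evolution, improving low-budget sampling for high-dimensional bridge and flow samplers.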


Abstract

For a fixed flow-based generative model under a small inference budget, sample quality can depend strongly on where the sampler spends its few function evaluations. Flow matching and Schrödinger bridges define probability paths, yet their inference grids are usually heuristic or inherited from one-endpoint diffusion. We derive a conditional-marginal entropy-rate objective for bridge-aware discretization, separating endpoint-conditioned bridge geometry from marginal flow evolution, and use it to build a training-free entropic inference-time scheduler from first principles. For Gaussian Brownian bridges this rate is closed-form and U-shaped, motivating boundary-heavy nonuniform grids. On trained two-dimensional bridge/flow models, the estimated profile recovers the predicted shape and improves 10-step ODE-Heun MMD over linear by 18.1%, with a paired 22.7% SDE-Heun improvement in the same low-NFE sweep. On EDM/CIFAR-10, the entropic time-discretization gives the best tested five-step FID ($186.3 \pm 4.0$ versus $200.5 \pm 2.9$ for linear and $238.0 \pm 5.3$ for cosine). On AlphaFlow protein generation, entropic conditional-marginal (cond-marg) scheduling shows an advantage in low-NFE regimes on both CAMEO22 and ATLAS benchmarks. These results support entropy-rate scheduling as a practical low-budget allocation signal for high-dimensional bridge and flow samplers.

2605.16078 2026-05-18 stat.ML cs.LG

A numerical study into neural network surrogate model performance for uncertainty propagation

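Noah Wade, Kirubel Teferra

AI summary: The paper studies how well neural network surrogate models capture the full distribution of solution fields over the entire probability space, with a focus on the distribution tails, comparing a fully connected network with a Deep Operator Network on the heat conduction equation.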


Abstract

Neural network surrogate models have emerged as a promising approach to model solution fields for a wide variety of boundary value problems encountered in physical modeling. Stochastic problems represent an area of particularly high interest because of the potential to significantly reduce the repeated evaluation of expensive forward models via traditional numerical solvers when conducting parametric analysis. However, many studies found in the literature primarily focus on the ability of neural network surrogate models to represent deterministic samples or mean field solutions and largely overlook surrogate model performance at the tails of the distribution. The present study examines in detail the ability of neural network surrogate models to capture the full distribution of solution fields over the entire probability space, with emphasis on the tails of the distribution. The canonical problem is the heat conduction equation with a highly stochastic source term, which induces extremely large variation in the thermal solution field. Comparisons are made between a classic feed-forward fully connected network and a Deep Operator Network architecture, using both data-driven and physics-informed loss functions. Results show that the worst-case prediction errors are an order of magnitude larger than the mean field error, highlighting the importance of the outlier samples. The large errors associated with extreme samples result from the networks having to extrapolate beyond the bounds of the training data. A method for identifying these samples is presented along with a discussion of potential approaches to account for their errors. Among the models considered, the fully connected neural network trained using a weak form residual loss performs best in handling these extrapolated inputs, achieving the highest prediction accuracy for the numerically produced datasets.

2605.16075 2026-05-18 stat.ME stat.CO

REX-SUB: A Scalable Subsampling Strategy for Modeling Large Spatial Datasets

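Nicholas Rios, Ben Seiyon Lee

AI summary: The paper proposes the REX-SUB subsampling strategy, which uses a randomized exchange algorithm to efficiently select small subsamples that minimize the prediction error of spatial GP models, combined with a scalable Vecchia approximation for computational efficiency. Experiments show lower prediction errors and interval scores than competing subsampling methods.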


Abstract

Recent advances in data collection technologies have led to the emergence of massive spatial datasets, with measurements obtained at millions of spatial locations. Geostatistical models typically employ Gaussian processes (GPs) to capture spatial dependence, but standard GP fitting becomes prohibitive at such scales. A promising solution is optimal subsampling, where a subset of locations is selected that optimizes a criterion. In this study, we propose a randomized exchange algorithm for subsampling (REX-SUB) which efficiently selects small subsamples that minimize prediction errors in the fitted spatial GP models. To further improve computational efficiency, we embed a scalable Vecchia approximation to the GP's joint likelihood, which takes advantage of sparsity in the precision matrix to enable fast inference on the selected subsamples. Through a simulation study and an application to a remotely sensed precipitable water dataset, we show that REX-SUB yields lower mean squared prediction errors and interval scores compared to competing subsampling strategies.

2605.16067 2026-05-18 cs.LG stat.ML

SAFE Quantum Machine Learning with Variational Quantum Classifiers

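Ying Chen, Paolo Giudici, Vasily Kolesnikov, Paolo Recchia

AI summary: The paper proposes an amplitude-encoding variational quantum classifier that combines normalized amplitude embeddings with bounded quantum observables to obtain a structured, smooth hypothesis class. Model reliability is assessed with SAFE-AI metrics; experiments show predictive performance competitive with classical baselines and improved robustness to noise.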

Comments 31 pages, 8 figures


Abstract

We propose a variational quantum classifier operating on high-dimensional deep representations via amplitude encoding, stabilized by a learnable classical pre-encoding layer. By combining normalized amplitude embeddings with bounded quantum observables, the resulting model induces a structured and smooth hypothesis class with controlled sensitivity to input variations. Model reliability is assessed using SAFE-AI metrics derived from the Cramér-von Mises divergence, enabling consistent evaluation across accuracy, robustness, and explainability dimensions. Empirical results show that the proposed quantum model provides competitive predictive performance compared with strong classical baselines while exhibiting a more balanced SAFE reliability profile, with improved robustness to noise and stability under structured feature removal. These findings suggest that variational quantum circuits offer a principled mechanism for stability-oriented SAFE learning in safety-critical settings.

2605.16066 2026-05-18 stat.AP

A market-calibrated accelerated failure time model for in-play football forecasting

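Lawrence Clegg, Zixing Song, John Cartlidge

AI summary: The paper proposes an accelerated failure time model that combines market information with in-play data: team strength parameters are calibrated to market prices, and post-shot expected goals enter as a time-varying covariate. The approach improves in-play football forecasting accuracy and is evaluated on 140 matches.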

Comments 25 pages, 6 figures, 6 tables


Abstract

In-play football forecasting models have struggled to match the accuracy of betting exchange prices, which aggregate information from many market participants. We close this gap by combining two extensions to a Weibull accelerated failure time model: calibrating team strength parameters to Betfair Exchange prices at kick-off to capture pre-match market information, and including post-shot expected goals as a time-varying covariate to capture in-play information. The calibration approach, where we jointly fit team-strength parameters to 1X2 and over/under betting markets via squared-error minimisation, applies to any intensity-based goal arrival model and enables stronger in-play forecasting. Evaluated across 140 English Premier League matches at minute intervals, the calibrated model almost matches Betfair's classification accuracy (70.2% versus 70.6%) while retaining interpretable team-level parameters and covariate effects. A comparison with two alternative continuous-time scoring models, both calibrated to the same pre-match odds, confirms that market calibration is the dominant driver of predictive accuracy. A betting simulation against Betfair in-play odds yields a 4.5% return on investment (Sharpe ratio 5.94) over 17,458 bets, suggesting an inefficiency within in-play football markets.

2605.16041 2026-05-18 stat.ML cs.LG

Explainable AI Isn't Enough! Rethinking Algorithmic Contestability

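Timo Freiesleben, Kristof Meding, Gunnar König

AI summary: The paper argues for the importance of algorithmic contestability, proposes a new definition, shows that traditional XAI methods are insufficient for challenging algorithmic decisions, and identifies three types of evidence that warrant reversing a decision.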


Abstract

Machine learning systems increasingly make life-changing decisions about individuals, such as loan approvals, hiring, and cheating detection, raising a pressing question: how can individuals respond to negative decisions made by these opaque systems? While explainable artificial intelligence (XAI) has largely focused on algorithmic recourse -- helping individuals change their features to obtain a desired outcome -- the parallel problem of algorithmic contestability -- helping individuals review and correct erroneous algorithmic decisions -- has received far less attention, despite its central ethical and legal importance. We trace this neglect to the absence of clear formal definitions and a systematic operationalization of contestability as an algorithmic problem. To address it, we propose an operational definition of contestability as a natural complement to recourse: contestability starts from the presumption that a decision may be incorrect and focuses on identifying evidence to challenge and potentially overturn it, whereas recourse assumes the decision is valid and instead provides pathways for changing it. We show that standard XAI explanations, such as counterfactuals, LIME, or Anchors, even when combined with human intuitions about decision continuity or monotonicity, reveal only errors in the neighborhood of the individual, but provide insufficient grounds for overturning the decision at hand. Going thus beyond traditional XAI, we identify three types of evidence warranting reversal according to the decision maker's own ethical standards: predictive multiplicity, incorrect feature values, and neglected overruling evidence. We argue that these render decisions normatively indefensible and thus successfully contestable. Finally, we analyze how existing EU legislation connects to our framework and argue that individuals already hold some legal rights to these forms of evidence.

2605.16033 2026-05-18 math.ST stat.TH

Tests for the mean of high-dimensional data

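Dietmar Ferger

AI summary: The paper proposes a test for the mean of high-dimensional data based on the statistic $V_n$, which requires no covariance matrix inversion. Asymptotic behavior is derived by embedding into the Hilbert space $\ell^2$, and a bootstrap approximation is shown to be asymptotically valid without sparsity assumptions.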

Comments 16 pages


Abstract

We consider the problem of testing the mean of high-dimensional data when the dimension may grow without explicit rate restrictions relative to the sample size. The proposed procedure is based on the statistic $V_n = n\|X_n\|^2$, which avoids inversion of the covariance matrix and is therefore suitable for high-dimensional settings. We establish asymptotic distributional results for both fixed and increasing dimension by embedding the observations into the Hilbert space $\ell^2$. Furthermore, we prove the asymptotic validity of a bootstrap approximation for the distribution of the test statistic. The resulting bootstrap test yields asymptotic level-$\alpha$ procedures without requiring sparsity assumptions or structural conditions on the covariance matrix. Throughout, a new central limit theorem in $\ell^2$ proves to be an extremely useful tool.
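A sketch of how such a test could be run in practice, under two loudly stated assumptions: that the vector inside the norm is the coordinate-wise sample mean, and that the bootstrap resamples centered observations with replacement. The abstract specifies neither detail, so the scheme and the function names below are hypothetical illustrations for testing a zero mean, not the paper's procedure.

```python
import random

def v_stat(rows):
    """n times the squared Euclidean norm of the coordinate-wise sample mean."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return n * sum(m * m for m in mean)

def bootstrap_pvalue(rows, n_boot=500, seed=0):
    """Resample centered rows to approximate the null distribution of the statistic."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    observed = v_stat(rows)
    hits = sum(
        v_stat([rng.choice(centered) for _ in range(n)]) >= observed
        for _ in range(n_boot)
    )
    return (1 + hits) / (1 + n_boot)

# Toy data whose mean is clearly far from zero.
shifted = [[5.0 + 0.1 * i, 5.0 - 0.1 * i] for i in range(10)]
print(v_stat(shifted), bootstrap_pvalue(shifted))
```

No covariance matrix is estimated or inverted at any point, which is the property the abstract highlights for high-dimensional data.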

2605.16027 2026-05-18 math.ST stat.TH

Nearest-Neighbour Matching on Unbounded Supports and Covariate Shift Transfer

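Simon Viel

AI summary: The paper studies the convergence of nearest-neighbour matching on unbounded supports. Instead of assuming compact covariate supports, it guarantees estimation efficiency through a measure of transferability between the source and target distributions.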


Abstract

Expectations of multivariate functions with missing labels occur in various fields such as transfer learning and the estimation of average treatment effects. Although non-parametric estimators based on nearest-neighbour matching are frequently used in this context, the existing literature assumes that the covariates live in some well-shaped compact subset of $\mathbb{R}^d$, with densities that are bounded away from zero. In this paper, we show that the usual rates of convergence can be achieved with minimal assumptions on the covariate supports. These assumptions are replaced with conditions on the source and target distributions, including a measure of the transferability between the two probability measures. We show that these conditions are general, can be applied to distributions supported on manifolds, and allow the target distribution to have a heavier tail than the source distribution. We also show that this control of the transferability is needed for any estimator to achieve good rates of convergence. Finally, applying our results to the estimation of treatment effects, we relax the assumption that the assignment probabilities must be bounded away from zero and one.
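The basic estimator under study can be sketched as follows: labels are observed only for source covariates, and the target expectation is estimated by giving each target point the label of its nearest source point and averaging. This is the plain 1-nearest-neighbour version with Euclidean distance; the function name and the toy data are assumptions for illustration, and none of the paper's conditions are checked.

```python
import math

def nn_matching_estimate(source_x, source_y, target_x):
    """Average, over target covariates, of the label of the nearest source point."""
    total = 0.0
    for x in target_x:
        nearest = min(range(len(source_x)), key=lambda i: math.dist(source_x[i], x))
        total += source_y[nearest]
    return total / len(target_x)

source_x = [(0.0,), (1.0,), (2.0,)]
source_y = [0.0, 10.0, 20.0]
target_x = [(0.1,), (1.9,), (2.2,)]   # the last target point lies outside the source range
print(nn_matching_estimate(source_x, source_y, target_x))
```

The third target point shows why heavier target tails are delicate: it can only be matched to the outermost source point, however far away that is.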

2605.15996 2026-05-18 stat.ML cs.LG math.ST stat.TH

Testing properties of trees in graphical models with covariance queries

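Sofiya Burova, Francisco Calvillo, Gábor Lugosi, Piotr Zwiernik

AI summary: The paper studies property testing for tree structures in high-dimensional graphical models, designs randomized tests that use a sub-quadratic number of queries, and gives explicit query complexity bounds for the number of leaves, the maximum degree, the typical distance, and the diameter.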


Abstract

We consider the problem of testing properties of graphs underlying high-dimensional graphical models. We adopt the model of covariance queries introduced by Lugosi, Truszkowski, Velona, and Zwiernik (2021). We study the case when the underlying graph is a tree. The main results of the paper show that, while reconstructing the entire tree may be costly, certain global structural properties can be tested efficiently. In particular, we design randomized tests for global structural properties that use a sub-quadratic number of queries. We develop testing procedures for several fundamental properties, including the number of leaves, the maximum degree, the typical distance, and the diameter of the tree. For each property, we obtain explicit query complexity bounds that depend on the target threshold and tolerance parameters.

2605.15966 2026-05-18 econ.EM stat.ME

Quasi-Bayesian Local Projection Instrumental-Variables Method: Application to Renewable Energy and Electricity Prices

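Masahiro Tanaka

AI summary: The paper proposes a quasi-Bayesian method for local projection instrumental-variables estimation. It builds a quasi-posterior from the generalized method of moments and uses a roughness-penalty prior to smooth impulse responses across horizons. The method keeps the first-order properties of traditional LP-IV, improves finite-sample stability, and allows joint inference. Simulations show that the regularization lowers root mean squared error at medium and longer horizons.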

Comments This paper supersedes a working paper circulated under the title "Quasi-Bayesian Local Projections: Simultaneous Inference and Extension to the Instrumental Variable Method" (arXiv:2503.20249)


Abstract

This paper introduces a quasi-Bayesian approach for local projection instrumental-variables (LP-IV) estimation. It builds a moment-based quasi-posterior using the generalized method of moments (GMM) objective and applies a roughness-penalty prior to smooth impulse responses over different horizons. The approach maintains the key first-order features of traditional LP-IV methods, while enhancing stability in finite samples and allowing for joint inference through simultaneous bands. Simulations indicate that this regularization decreases root mean squared error compared to standard GMM, especially at medium and longer horizons. An application to Danish electricity markets highlights the method's practical usefulness.

2605.15943 2026-05-18 math.ST stat.ML stat.TH

Node-private community estimation in stochastic block models: Tractable algorithms and lower bounds

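Laurentiu Marchis, Ethan D'souza, Tomáš Flídr, Po-Ling Loh

AI summary: The paper studies community recovery in stochastic block models with a fixed number of communities. Under a node differential privacy constraint, it proposes tractable algorithms based on spectral clustering together with lower bounds, using private PCA, private convex optimization, and related methods to achieve consistent community estimation.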

Comments 78 pages


英文摘要

We study the classical problem of community recovery in stochastic block models with a fixed number of communities, with a twist: We seek algorithms that are stable with respect to node-wise changes in the graph structure, formally defined as a differential privacy constraint. The algorithms we develop are based on spectral clustering, where we introduce privacy to the community recovery pipeline in the form of directly privatizing the adjacency matrix; private PCA; private convex optimization; private low-rank matrix estimation; and private approximate subspace estimation. Straightforward applications of existing private algorithms lead to a rapid increase in the privacy parameter $ε$ in order to ensure consistent estimation under node differential privacy, in contrast with the simpler setting of edge privacy. To alleviate these issues, we develop novel algorithms based on (1) sampling from an exponential mechanism with a Lipschitz extension and (2) a general framework for constructing smooth projections from the space of undirected graphs to the space of bounded-degree graphs, which can then be combined with various edge-private algorithms. Importantly, the methods we develop are all computable in polynomial-time as a function of the number of nodes in the graph. We also develop novel lower bounds on the growth rate of $ε$ required in order to achieve consistent community estimation under node privacy. On a technical note, our paper highlights the complications that arise when analyzing private algorithms under the non-standard scaling $ε\rightarrow \infty$ and proposes some solutions. We also provide a novel application of the HGR maximal correlation from information theory in the context of accuracy amplification in PAC learning, which may be of independent interest.

2605.15920 2026-05-18 stat.ML cs.LG

Unsupervised Domain Shift Detection with Interpretable Subspace Attribution

无监督领域偏移检测与可解释子空间归因

Sebastian Springer, Alessandro Laio

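AI summary: This paper presents an unsupervised tool for detecting domain shift. It detects localized density anomalies in high-dimensional feature spaces, identifies the feature subspace in which the shift is most pronounced so that its source is interpretable, and provides a protocol for compensating the shift.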

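Details
AI-translated abstract

We develop a tool for detecting domain shifts, namely subtle differences in the probability distributions of datasets. We identify these shifts by detecting localized density anomalies in high-dimensional feature spaces. If an anomaly is present, we determine the feature subspace in which it is most pronounced. This lets us trace the shift to a small set of features, making it interpretable. We also provide a protocol for compensating domain shifts by extracting, from two unlabelled datasets, subsets of samples with no detectable residual distributional difference. We validate the framework on controlled 20-dimensional benchmarks with known ground truth, recovering both broad and localized shifts together with their supporting feature subspaces. We then apply it to healthy electrocardiogram (ECG) recordings represented by 782 features. In age- and sex-matched cohort comparisons that differ in measurement-device composition, the method detects device-induced shifts, extracts representative subsets enriched in the imbalanced device components, and identifies ECG features associated with the acquisition contrast. These results suggest that density-shift detection and subspace attribution offer a practical framework for uncovering hidden cohort biases before downstream modelling.

English abstract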

We developed a tool for detecting domain shifts, namely subtle differences in the probability distributions of datasets. We identify these shifts using an algorithm designed to detect localised density anomalies in high-dimensional feature spaces. If an anomaly is present, we then identify the feature subspace in which the anomaly is most pronounced. This allows us to trace the domain shift to a small set of features, making the shift interpretable. Moreover, we provide a protocol for compensating domain shifts by extracting, from two unlabelled datasets, subsets of samples with no detectable residual distributional difference. We validate the framework on controlled 20-dimensional benchmarks with known ground truth, recovering both broad and localized shifts together with their supporting feature subspaces. We then apply it to healthy electrocardiogram (ECG) recordings represented by 782 features. In age- and sex-matched cohort comparisons differing in measurement-device composition, the method detects device-induced shifts, extracts representative subsets enriched in the imbalanced device components, and identifies ECG features associated with the acquisition contrast. These results suggest that density-shift detection and subspace attribution provide a practical framework for uncovering hidden cohort biases before downstream modelling.

2605.15911 2026-05-18 stat.ME

Statistical Inference for Smoothed Support Vector Machines in High Dimensions: From Offline to Online Data

高维环境下平滑支持向量机的统计推断:从离线到在线数据

Shuya Zhou, Junwen Xia, Jingxiao Zhang

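AI summary: This paper proposes a unified inference framework for Lasso-penalized linear SVMs in offline and online settings. Convolution smoothing of the hinge loss and a debiased estimator remove the shrinkage bias, giving valid statistical inference and improved computational efficiency.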

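Details
AI-translated abstract

High-dimensional classification problems often rely on Lasso-penalized linear support vector machines (SVMs). However, the double non-smoothness of the hinge loss and the Lasso penalty makes statistical inference difficult and impedes computational efficiency. This paper proposes a unified inference framework for both offline and online settings. In the offline case, we apply convolution smoothing to the hinge loss and construct a debiased estimator that removes the shrinkage bias, which yields valid confidence intervals. For online streaming data, we develop a real-time estimator and inference procedure that rely only on summary statistics of historical data. Theoretically, we give rigorous proofs of the asymptotic normality of the offline and online debiased estimators. Simulation studies and real data applications show that the methods achieve valid statistical inference and improved computational efficiency.

English abstract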

High-dimensional classification problems often rely on the Lasso-penalized linear Support Vector Machines (SVMs). However, the double non-smoothness induced by the hinge loss and Lasso penalty in this model makes statistical inference challenging and impedes computational efficiency. In this paper, we propose a unified inference framework in both offline and online settings. In the offline case, by applying a convolution smoothing technique to the hinge loss, we construct a debiased estimator that eliminates the shrinkage bias, thereby building a valid confidence interval. For online streaming data, we develop a real-time estimator and inference procedure that relies only on summary statistics of historical data. Theoretically, we provide rigorous proofs for the asymptotic normality of our offline and online debiased estimators. Simulation studies and real data applications demonstrate that our methods achieve valid statistical inference and improved computational efficiency.
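
As an illustration of convolution smoothing applied to the hinge loss, the sketch below assumes a Gaussian smoothing kernel with bandwidth `h`; the abstract does not say which kernel the paper uses. With a Gaussian kernel the smoothed loss has the closed form $v\,Φ(v/h) + h\,φ(v/h)$, where $v = 1 - u$ is the margin deficit and $Φ$, $φ$ are the standard normal CDF and density.

```python
import math


def hinge(u):
    """Standard hinge loss max(1 - u, 0) evaluated at the margin u."""
    return max(1.0 - u, 0.0)


def smoothed_hinge(u, h):
    """Hinge loss convolved with a Gaussian kernel of bandwidth h (assumed kernel)."""
    v = 1.0 - u
    z = v / h
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return v * cdf + h * pdf


for margin in (-1.0, 1.0, 2.0):
    print(margin, hinge(margin), smoothed_hinge(margin, h=0.1))
```

The smoothed loss is differentiable everywhere, lies above the hinge loss, and approaches it as `h` shrinks, which is what makes gradient-based and debiasing arguments easier to apply.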

2605.15907 2026-05-18 math.ST stat.TH

Edge-indexed network time series with graph Ornstein-Uhlenbeck dynamics

基于图奥尔内-乌尔岑动态的边索引网络时间序列

Jiaming Chen, Almut E. D. Veraart

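AI summary: This paper proposes Lévy-driven graph Ornstein-Uhlenbeck (grOU) models for edge-indexed network time series, estimates the parameters in a maximum-likelihood framework, derives the asymptotic properties of the estimator, and demonstrates the approach on high-frequency financial data.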

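Details
AI-translated abstract

We introduce a class of Lévy-driven graph Ornstein-Uhlenbeck (grOU) models for edge-indexed network time series. The proposed framework extends generalized network autoregressive (GNAR) processes to continuous time and adapts graph Ornstein-Uhlenbeck dynamics, originally designed for node-indexed processes, to the edge-indexed setting. The model accommodates general Lévy noise and therefore captures both Brownian and jump behavior. We show that the model parameters can be estimated in a maximum-likelihood framework and derive the asymptotic properties of the estimator. Simulation studies examine the finite-sample performance of the method, and an empirical application to high-frequency financial data illustrates its practical relevance. The results indicate that, relative to standard benchmarks, grOU models for edge-indexed network time series improve forecasting accuracy and reduce computational time while maintaining robustness through their network-based parametrization.

English abstract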

We introduce a class of Lévy-driven graph Ornstein-Uhlenbeck (grOU) models for edge-indexed network time series. The proposed framework extends generalized network autoregressive (GNAR) processes for edge-indexed network time series to continuous time and adapts graph Ornstein-Uhlenbeck dynamics, originally developed for node-indexed processes, to the edge-indexed setting. The model accommodates general Lévy noise and therefore captures both Brownian and jump behavior. We show that the model parameters can be estimated via a maximum-likelihood framework and derive the asymptotic properties of the estimator. We examine the finite-sample performance of the methodology through simulation studies and illustrate its practical relevance in an empirical application to high-frequency financial data. The results indicate that grOU models for edge-indexed network time series improve forecasting accuracy and reduce computational time relative to standard benchmarks while maintaining robustness through their network-based parametrization.

2605.15896 2026-05-18 stat.ME stat.AP

A Model-Agnostic Bootstrap for Macro-Level Claims Reserving Under the Conditioning Principle

基于条件原理的宏观层面赔款准备金模型无关自助法

Robin Van Oirbeek, Tim Verdonck

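AI summary: This paper proposes a bootstrap for macro-level claims reserving that satisfies the conditioning principle. A Dirichlet-Gamma hierarchy lets the bootstrap hold the observed triangle fixed, reducing the coverage error from $O(1)$ for existing bootstraps to $O(I^{-1/2})$.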

Comments 23 pages

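Details
AI-translated abstract

The correct inferential object is the conditional predictive distribution $p(R \mid \mathcal{D}, \hat{θ})$, where $\mathcal{D}$ is the observed triangle held fixed. We call this the conditioning principle. All existing bootstraps violate it by resampling functions of $\mathcal{D}$ inside the predictive loop, producing an $O(1)$ coverage error that does not vanish as the triangle grows. The Dirichlet-Gamma hierarchy admits a bootstrap that satisfies the principle: $S^{IBNP}_i = X^{obs}_i (1-W_i)/W_i$, where $W_i \sim \mathrm{Beta}(c\hat{F}_{I-i}, c(1-\hat{F}_{I-i}))$ is sampled directly from its predictive distribution. Only the allocation proportion $W_i$ is simulated; the observed triangle is held fixed. The bootstrap therefore inherits calibration from any development-proportion method (Chain-Ladder, Bornhuetter-Ferguson, Cape Cod, or other), which makes it model-agnostic. The coverage deficit is $O(I^{-1/2})$, independent of the number of development periods. Under compound Poisson data-generating processes the bootstrap is conservative for every $F_{I-i} \in (0,1)$: the predictive standard deviation analytically exceeds the true value by the factor $1/\sqrt{F_{I-i}}$. The ODP bootstrap violates the principle through two mechanisms that act in opposite directions: re-estimation inflates the bootstrap variance under the ODP DGP, while missing accident-year frailty deflates it under frailty DGPs. The resulting coverage discrepancy is $Ω(1)$ regardless of $I$, which gives a structural explanation for the cross-portfolio miscalibration heterogeneity documented by Meyers (2015). Chain-Ladder, Bornhuetter-Ferguson and Cape Cod emerge as credibility estimators under diffuse, informative and pooling priors respectively, with identical structure for counts and amounts. The concentration $c$ serves as a diagnostic: $\hat{c} < 30$ signals non-stationary development.

English abstract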

The correct inferential object in claims reserving is the conditional predictive distribution $p(R \mid \mathcal{D}, \hatθ)$, where $\mathcal{D}$ is the observed triangle held fixed. We refer to this as the conditioning principle. All existing bootstraps violate it by resampling functions of $\mathcal{D}$ inside the predictive loop, producing an $O(1)$ coverage error that does not vanish as the triangle grows. The Dirichlet-Gamma hierarchy admits a bootstrap that satisfies the principle exactly: $S^{IBNP}_i = X^{obs}_i (1-W_i)/W_i$ with $W_i \sim \mathrm{Beta}(c\hat{F}_{I-i}, c(1-\hat{F}_{I-i}))$ sampled directly from its predictive distribution. Only the allocation proportion $W_i$ is simulated; the observed triangle is held fixed. It thus inherits calibration from any development-proportion method (Chain-Ladder, Bornhuetter-Ferguson, Cape Cod, or other), making it model-agnostic. The coverage deficit is $O(I^{-1/2})$, independent of the number of development periods. Under compound Poisson data-generating processes the bootstrap is conservative for every $F_{I-i} \in (0,1)$: the predictive standard deviation analytically exceeds the true value by the factor $1/\sqrt{F_{I-i}}$. The ODP bootstrap violates the principle through two mechanisms in opposite directions: re-estimation inflates bootstrap variance under the ODP DGP, while missing accident-year frailty deflates it under frailty DGPs. The resulting coverage discrepancy is $Ω(1)$ regardless of $I$, providing a structural explanation for the cross-portfolio miscalibration heterogeneity documented by Meyers (2015). Chain-Ladder, Bornhuetter-Ferguson and Cape Cod emerge as credibility estimators under diffuse, informative and pooling priors respectively, with identical structure for counts and amounts. The concentration $c$ serves as a diagnostic: $\hat{c} < 30$ signals non-stationary development.
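
The simulation step stated in the abstract, $S^{IBNP}_i = X^{obs}_i (1-W_i)/W_i$ with $W_i \sim \mathrm{Beta}(c\hat{F}_{I-i}, c(1-\hat{F}_{I-i}))$, can be sketched for a single accident year as follows. The input values and the function name `simulate_ibnp` are hypothetical; how $\hat{F}$ and $c$ are estimated is not covered here.

```python
import random


def simulate_ibnp(x_obs, f_hat, c, n_draws, seed=0):
    """Draw S = x_obs * (1 - W) / W with W ~ Beta(c * f_hat, c * (1 - f_hat))."""
    rng = random.Random(seed)
    a, b = c * f_hat, c * (1.0 - f_hat)
    draws = []
    for _ in range(n_draws):
        w = rng.betavariate(a, b)  # allocation proportion: the only simulated quantity
        draws.append(x_obs * (1.0 - w) / w)  # the observed amount x_obs stays fixed
    return draws


# Hypothetical inputs for one accident year.
draws = simulate_ibnp(x_obs=1000.0, f_hat=0.8, c=100.0, n_draws=20000)
mean_ibnp = sum(draws) / len(draws)
print(round(mean_ibnp, 1))
```

For a Beta$(a, b)$ variable with $a > 1$, $E[(1-W)/W] = b/(a-1)$, so with these inputs the simulated mean should be close to $1000 \times 20/79 \approx 253.2$.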

2605.15859 2026-05-18 cs.DS cs.LG math.ST stat.ML stat.TH

Complexity of Non-Log-Concave Sampling in Fisher Information

非对数凹分布采样中复杂性的研究

Sinho Chewi, Andre Wibisono

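AI summary: This paper studies the query complexity of obtaining relative Fisher information guarantees when sampling from non-log-concave distributions. It proposes an algorithm based on the proximal sampler, implemented through an approximate restricted Gaussian oracle (RGO), that improves on prior complexity bounds for non-log-concave sampling, and it shows a converse reduction to high-accuracy log-concave sampling.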

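Details
AI-translated abstract

We study the query complexity of obtaining a relative Fisher information guarantee for sampling from a log-smooth non-log-concave distribution, a problem analogous to finding an approximate stationary point in optimization. Our algorithm is based on the proximal sampler, an implicit discretization of the Langevin diffusion, and requires an implementation of the backward step known as the restricted Gaussian oracle (RGO). We show that, by leveraging recent results on high-accuracy log-concave sampling in Rényi divergence, we can obtain an approximate RGO implementation that, combined with the proximal sampler, yields a complexity guarantee in relative Fisher information that inherits the dimension dependence of log-concave sampling and improves on prior work for non-log-concave sampling. We also show a converse reduction: any improvement in the dimension dependence in relative Fisher information for non-log-concave sampling would yield an improved dimension dependence for high-accuracy log-concave sampling.

English abstract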

We study the query complexity of obtaining a relative Fisher information guarantee for sampling from a log-smooth non-log-concave distribution; this is a sampling analog of finding an approximate stationary point in optimization. Our algorithm is based on the proximal sampler, which is an implicit discretization of the Langevin diffusion, and requires an implementation of the backward step known as the restricted Gaussian oracle (RGO). We show that by leveraging the recent results for log-concave sampling with high-accuracy guarantees in Rényi divergence, we can obtain an approximate RGO implementation that -- when used with the proximal sampler -- yields a complexity guarantee in relative Fisher information that inherits the same dimension dependence as log-concave sampling, and improves upon prior work for non-log-concave sampling. We also show a converse reduction that any improvement in the dimension dependence in relative Fisher information for non-log-concave sampling will yield an improved dimension dependence for high-accuracy log-concave sampling.
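
The two-step structure of the proximal sampler can be sketched on a toy case where the restricted Gaussian oracle is available in closed form. The example assumes a one-dimensional Gaussian target $N(0, σ^2)$, which is log-concave and therefore not the setting the paper addresses; it only shows the alternation between the forward Gaussian step and the backward RGO step. The step size `eta` and all names are illustrative.

```python
import math
import random


def proximal_sampler_gaussian(sigma2, eta, n_steps, x0=0.0, seed=0):
    """Proximal sampler for the toy target N(0, sigma2) with an exact RGO."""
    rng = random.Random(seed)
    # RGO density is proportional to exp(-x^2 / (2 sigma2) - (x - y)^2 / (2 eta)),
    # a Gaussian with the variance below and mean post_var * y / eta.
    post_var = 1.0 / (1.0 / sigma2 + 1.0 / eta)
    x, samples = x0, []
    for _ in range(n_steps):
        y = x + math.sqrt(eta) * rng.gauss(0.0, 1.0)  # forward step: y | x ~ N(x, eta)
        x = post_var * y / eta + math.sqrt(post_var) * rng.gauss(0.0, 1.0)  # backward step
        samples.append(x)
    return samples


samples = proximal_sampler_gaussian(sigma2=4.0, eta=2.0, n_steps=50000)
kept = samples[1000:]  # discard a short burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(round(mean, 2), round(var, 2))
```

Because the two steps form a Gibbs sampler on the joint density of $(x, y)$, the $x$-marginal of the chain is the target, so the empirical variance should be close to `sigma2`. For general targets the backward step has no closed form, which is why an approximate RGO implementation is needed.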

2605.15856 2026-05-18 stat.CO

crossfit: A Graph-Based Cross-Fitting Engine in R

crossfit: 一种基于图的交叉拟合引擎

Etienne Peyrot, François Petit

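AI summary: crossfit is an R package providing a general-purpose, estimator-agnostic cross-fitting engine. Users specify a target functional and a directed acyclic graph (DAG) of nuisance models to estimate low-dimensional targets. Schedules are reproducible and auditable, and the package is aimed at simulation-heavy benchmarking and method development.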

Comments 25 pages, 1 figure

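Details
AI-translated abstract

Cross-fitting is a key ingredient in many semiparametric estimation procedures, such as double/debiased machine learning (DML): by enforcing out-of-sample use of nuisance predictions, it enables valid estimation of low-dimensional targets. crossfit is an R package that provides a general-purpose, estimator-agnostic cross-fitting engine. Users specify (i) a target functional and (ii) a directed acyclic graph (DAG) of nuisance models, with node-specific training fold widths and target-specific evaluation windows. The engine executes a reproducible schedule over folds, panels, and repetitions, returning either a scalar estimate (mode="estimate") or a cross-fitted predictor function for application to new data (mode="predict").

English abstract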

Cross-fitting is a key ingredient in many semiparametric estimation procedures, such as double/debiased machine learning (DML), enabling valid estimation of low-dimensional targets in the presence of high-dimensional nuisance functions by enforcing out-of-sample use of nuisance predictions. crossfit is an R package that provides a general-purpose, estimator-agnostic cross-fitting engine. Users specify (i) a target functional and (ii) a directed acyclic graph (DAG) of nuisance models, with node-specific training fold widths and target-specific evaluation windows. The engine executes a reproducible schedule over folds, panels, and repetitions, returning either a scalar estimate (mode="estimate") or a cross-fitted predictor function for application to new data (mode="predict"). Beyond standard cross-fitting, crossfit implements fold-allocation modes that control how training data are shared across nuisance components, including disjoint and independence-enforcing allocations that duplicate reused nodes to reduce dependence between nuisance branches. The implementation targets simulation-heavy benchmarking and method development, with explicit and auditable schedules, defensive validation of specifications and nuisance dependencies, reuse-aware caching to avoid redundant refits, and failure isolation policies for large experiment grids. The crossfit package is available on CRAN, openly developed on GitHub under GPL-3, and is intended as a lightweight, tested foundation to prototype and empirically evaluate cross-fitted estimators with explicit control over fold geometry, dependence, and computation.
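
The basic idea of cross-fitting, fitting nuisance models on some folds and using their predictions only on held-out observations, can be sketched generically. The example below is written in Python and does not use or imitate the crossfit package's API; it assumes a partially linear model $y = θ d + g(x) + ε$ with linear nuisance fits, and every name in it is hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares line through (xs, ys); returns a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cross_fit_plr(x, d, y, n_folds=5):
    """Cross-fitted partialling-out estimate of theta in y = theta * d + g(x) + noise."""
    n = len(y)
    folds = [list(range(k, n, n_folds)) for k in range(n_folds)]
    num = den = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        # Nuisance models are fitted on the training folds only.
        d_hat = fit_line([x[i] for i in train], [d[i] for i in train])
        y_hat = fit_line([x[i] for i in train], [y[i] for i in train])
        # Residuals use out-of-sample predictions on the held-out fold.
        for i in held_out:
            rd, ry = d[i] - d_hat(x[i]), y[i] - y_hat(x[i])
            num += rd * ry
            den += rd * rd
    return num / den


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
d = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [2.0 * di + xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
theta_hat = cross_fit_plr(x, d, y)
print(round(theta_hat, 2))
```

The simulated data use a true coefficient of 2.0, so the estimate should land near that value. A DAG of nuisance models, as described in the abstract, generalizes this two-nuisance setup.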

2605.15847 2026-05-18 stat.ME stat.CO

Bayesian Inference for Non-Conjugate Distance Dependent Chinese Restaurant Process Models

非共轭距离依赖中文餐厅过程模型的贝叶斯推断

Joseph Marsh, Theodore Kypraios, Rowland G. Seymour

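AI summary: This paper proposes a reversible jump MCMC (RJMCMC) framework for non-conjugate ddCRP models, where the parameter dimension changes with cluster assignments. It compares several proposal strategies and shows, on discrete and continuous observation models, that data-driven moment-matching proposals are an effective alternative to prior-based proposals.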

Comments 23 pages, 6 figures, 1 table. Includes supplementary material. Code available at https://github.com/jmarsh96/non-conjugate-ddcrp

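Details
AI-translated abstract

The distance dependent Chinese Restaurant Process (ddCRP) provides a flexible prior distribution for clustering observations; it incorporates covariate information through pairwise distances and accommodates rich cluster structures. When the cluster parameters are conjugate to the likelihood, Bayesian inference is straightforward. In the non-conjugate setting, however, inference is more challenging because the parameter space becomes trans-dimensional as cluster assignments change. We develop a reversible jump Markov chain Monte Carlo (RJMCMC) framework to address this, targeting the dimension-changing nature of the cluster parameter vectors when observation assignments are updated. We introduce and compare several proposal strategies, including prior-based, independence, and data-driven moment-matching proposals that target regions of high posterior density. For fixed-dimensional moves, we propose a posterior resampling strategy that improves acceptance rates while maintaining computational efficiency. Through a simulation study and an application to Old Faithful eruption durations, we show that moment-matched proposals offer a principled, data-driven alternative to prior-based proposals. The resulting methodology provides a general RJMCMC framework for non-conjugate ddCRP models, demonstrated here on both discrete and continuous observation models.

English abstract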

The distance dependent Chinese Restaurant Process (ddCRP) provides a flexible prior distribution for clustering observations, incorporating covariate information through pairwise distances and accommodating a rich variety of cluster structures. When cluster parameters are conjugate to the likelihood, Bayesian inference is straightforward. In the non-conjugate setting, however, inference becomes substantially more challenging due to the trans-dimensional parameter spaces that arise as cluster assignments change. We develop a reversible jump Markov chain Monte Carlo (RJMCMC) framework to address this challenge, targeting the dimension-changing nature of cluster parameter vectors when observation assignments are updated. We introduce and compare several proposal strategies for birth and death moves, including prior-based, independence, and data-driven moment-matching proposals that target regions of high posterior density. For fixed-dimensional moves, we propose a posterior resampling strategy that improves acceptance rates while maintaining computational efficiency. Through a simulation study and an application to Old Faithful eruption durations, we demonstrate moment-matched proposals offer a principled, data-driven alternative to prior-based proposals. The resulting methodology provides a general RJMCMC framework for ddCRP models with non-conjugate likelihoods, demonstrated here on both discrete and continuous observation models.
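
To make the ddCRP prior concrete, the sketch below draws customer links and converts them to clusters. It assumes the usual ddCRP construction, in which observation $i$ links to $j \neq i$ with probability proportional to a decay function of their distance and to itself with probability proportional to a concentration parameter `alpha`; the exponential decay used in the example is an arbitrary illustrative choice. It samples from the prior only and does not implement the paper's RJMCMC moves.

```python
import math
import random


def sample_ddcrp_links(dist, alpha, decay, rng):
    """Draw one link per observation from the ddCRP prior."""
    n = len(dist)
    links = []
    for i in range(n):
        weights = [alpha if j == i else decay(dist[i][j]) for j in range(n)]
        links.append(rng.choices(range(n), weights=weights)[0])
    return links


def links_to_clusters(links):
    """Clusters are the connected components of the link graph (union-find)."""
    parent = list(range(len(links)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j in enumerate(links):
        parent[find(i)] = find(j)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(links))]


points = [0.0, 0.1, 5.0, 5.2]
dist = [[abs(a - b) for b in points] for a in points]
rng = random.Random(0)
links = sample_ddcrp_links(dist, alpha=1.0, decay=lambda d: math.exp(-d), rng=rng)
print(links, links_to_clusters(links))
```

Because cluster membership is induced by the links, changing one link can merge or split clusters, which is the source of the dimension changes that the RJMCMC framework has to handle.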

2605.15823 2026-05-18 stat.AP cs.IT math.IT

Active Redundancy Allocation Strategy at Component and System Level

组件和系统层面的主动冗余分配策略

Bidhan Modok, Shovan Chowdhury, Amarjit Kundu

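AI summary: This paper studies how to allocate non-matching active redundancies (spares) in coherent systems of possibly dependent, identical components to improve system reliability. Component dependence is modeled through copulas using the distortion function, and sufficient conditions are derived for the optimal allocation of two heterogeneous active redundancies at the component or system level.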

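Details
AI-translated abstract

Researchers and practitioners in reliability engineering and optimization frequently use active redundancy techniques to enhance system performance. This article studies allocation strategies for non-matching active redundancies (spares) in coherent systems of possibly dependent, identical components, with the aim of better system reliability. Dependence among the components is modeled through copulas using the distortion function. Sufficient conditions are derived for the optimal allocation of two heterogeneous active redundancies at the component or system level. Moreover, the results hold when the component lifetimes follow a general family of parametric distributions. The results guarantee the likelihood ratio (reversed hazard) ordering between the coherent systems with active redundancies at the component level (system level). Some aging properties are also established along the way. Several examples illustrate the theoretical results.

English abstract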

Researchers and practitioners in the field of reliability engineering and optimization frequently use active redundancy techniques to intensify the performance of systems. In this article, we study allocation strategies of non-matching active redundancies (spares) in coherent systems consisting of possibly dependent and identical components for achieving better system reliability. The dependence of the components is modeled through copulas using the distortion function. Sufficient conditions are derived to establish optimal allocation strategies for two heterogeneous active redundancies at the component or system levels. Moreover, the results are true for the component lifetimes following a general family of parametric distributions. The results guarantee the likelihood ratio (reversed hazard) ordering between the coherent systems at the component level (system level) active redundancies. Some aging properties are also established in this endeavor. Several examples are provided to demonstrate the theoretical results.

2605.15822 2026-05-18 cs.LG stat.ML

Intrinsic Wasserstein Rates for Score-Based Generative Models on Smooth Manifolds

基于光滑流形的分数布朗运动生成模型的内在Wasserstein速率

Guoji Fu, Taiji Suzuki, Wee Sun Lee, Atsushi Nitanda

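AI summary: This paper derives intrinsic Wasserstein-1 rates for score-based generative models (SGMs) on smooth manifolds. For compact $d$-dimensional manifolds with $β$-Hölder densities, a variance-preserving SGM estimator attains the intrinsic sample exponent $n^{-(β+1)/(d+2β)}$ up to logarithmic factors and explicit geometry and density factors, with the score approximation analyzed separately in large-noise and small-noise regimes.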

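Details
AI-translated abstract

This paper studies intrinsic Wasserstein rates for score-based generative models (SGMs) on smooth manifolds. It proves that, on manifolds satisfying suitable conditions, a variance-preserving SGM estimator attains a specific sample exponent, and it analyzes score approximation in different noise regimes.

English abstract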

Score-based generative models are trained in high-dimensional ambient spaces, yet many data distributions are supported on low-dimensional nonlinear structures. We prove that, for compact $d$-dimensional smooth manifolds $\mathcal{M} \subset [0,1]^D$ with $d > 2$ and $β$-Hölder densities strictly positive on $\mathcal{M}$, a variance-preserving SGM estimator attains the intrinsic Wasserstein--1 sample exponent $\tilde{\mathcal{O}}(D^{\mathcal{O}_β(d)}n^{-(β+1)/(d+2β)})$, up to logarithmic factors and explicit geometry and density factors. The full nonasymptotic bound explicitly isolates the finite-order geometry envelope, Hölder radius, density lower bound, ambient dependence, and finite-order correction terms. The analysis separates score approximation into a large-noise tangent-cell regime and a small-noise projection-centered, de-Gaussianized Laplace regime. The key technical ingredient is a ReLU implementation of nearest-projection coordinates via finite intrinsic anchors and Gauss--Newton iterations, rather than approximating the manifold projection as a black-box high-dimensional smooth map. Consequently, for families with polynomially controlled geometry and density lower bounds, the constructed score-network parameters have polynomial ambient dependence.

2605.15814 2026-05-18 math.ST stat.TH

Goodness-of-Fit Testing for Point Processes in Large Populations

大规模群体中点过程的拟合优度检验

Sami Umut Can, Estate V. Khmaladze, Roger J. A. Laeven

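AI summary: This paper proposes a new approach to goodness-of-fit testing for parametric families of point processes in large populations. A unitary transformation makes a natural parametric testing process converge weakly to a standard target process, which yields asymptotically distribution-free tests.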

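Details
AI-translated abstract

Suppose we observe a path of a point process that counts event occurrences in a large population. Based on the observed path, we want to test the null hypothesis that the conditional intensity of the point process belongs to a particular parametric family. We propose a new approach to such goodness-of-fit tests. The idea is to construct a unitary transformation of a natural parametric testing process so that it converges weakly to a "standard" target process that does not depend on the particular parametric form assumed under the null hypothesis. This transformation therefore paves the way for asymptotically distribution-free goodness-of-fit testing of parametric point processes. Monte Carlo simulations of Aalen-type survival processes (without and with censoring), mixture cure models, and software reliability models show good finite-sample performance, and we illustrate the approach on observed human lifetimes and real software failures.

English abstract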

Suppose we have an observed path from a point process counting event occurrences in a large population. Based on the observed path, we would like to test the null hypothesis that the conditional intensity of the point process belongs to a particular parametric family. We propose a novel approach to conducting such goodness-of-fit tests. The idea is to construct a unitary transformation of a natural parametric testing process such that it converges weakly to a "standard" target process, independent of the particular parametric form assumed under the null hypothesis. This transformation therefore paves the way for asymptotically distribution-free goodness-of-fit testing of parametric point processes. We demonstrate the good finite-sample performance of our approach through Monte Carlo simulations of Aalen-type survival processes, without and with censoring, mixture cure models, and software reliability models, and we illustrate its applicability with observed human lifetimes as well as real software failures.

2605.15811 2026-05-18 stat.ME stat.AP

The Negative Binomial Chain-Ladder: A Full Likelihood Model for Claim Count Reserving

负二项链梯法:一种完整的似然模型用于赔款准备

Robin Van Oirbeek

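AI summary: This paper proposes the Negative Binomial Chain-Ladder (NB-CL) model. The negative binomial distribution arises naturally from a Poisson-Gamma construction, which gives a clear generative interpretation and unifies the Chain-Ladder family. Simulations show near-nominal coverage under correct specification and controlled degradation under misspecification.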

Comments 35 pages, 3 figures

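Details
AI-translated abstract

The Chain-Ladder (CL) method remains the dominant macro-level technique for claims reserving in non-life insurance, yet its classical formulation lacks a coherent probabilistic foundation. Existing stochastic extensions, including the Mack model and the Over-Dispersed Poisson (ODP) framework, provide measures of uncertainty but rely on second-moment assumptions or quasi-likelihood variance structures. This paper develops a Negative Binomial Chain-Ladder (NB-CL) model that embeds the CL method in a full likelihood framework. The key contribution is a micro-level derivation showing that the negative binomial distribution arises naturally from a Poisson-Gamma construction: claims arrive according to a Poisson process with Gamma-distributed accident-year heterogeneity, and aggregation yields negative binomial incremental counts. The derivation gives the dispersion parameter $κ$ a structural interpretation as accident-year heterogeneity rather than an ad hoc overdispersion adjustment. The NB-CL model generalises the Poisson Chain-Ladder model in the limit $κ \to \infty$, shares the point estimates of the ODP model but has a different variance function (quadratic vs. linear), and unifies the Chain-Ladder family within a single probabilistic hierarchy. A parametric bootstrap procedure incorporates both process and parameter uncertainty. Simulation studies confirm near-nominal coverage under correct specification once the dispersion parameter is bias-corrected, and controlled degradation under model misspecification. Empirical studies on claim count data (Australian motor bodily injury) and paid amounts (Taylor-Ashe) support the structural reading of $κ$ and the working-approximation status of the model in the amounts case.

English abstract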

The Chain-Ladder (CL) method remains the dominant macro-level technique for claims reserving in non-life insurance, yet its classical formulation lacks a coherent probabilistic foundation. Existing stochastic extensions, including the Mack model and the Over-Dispersed Poisson (ODP) framework, provide measures of uncertainty but rely on second-moment assumptions or quasi-likelihood variance structures without clear generative interpretations. This paper develops a Negative Binomial Chain-Ladder (NB-CL) model that embeds the CL method within a full likelihood-based framework. The key contribution is a micro-level derivation showing that the negative binomial distribution arises naturally from a Poisson-Gamma construction: claims arrive according to a Poisson process with Gamma-distributed accident-year heterogeneity, and aggregation yields negative binomial incremental counts. This derivation gives the dispersion parameter $κ$ a structural interpretation as accident-year heterogeneity, rather than an ad-hoc overdispersion adjustment. The NB-CL model generalises the Poisson Chain-Ladder model in the limit $κ\to \infty$, shares the point estimates of the ODP model while differing in its variance function (quadratic vs. linear), and unifies the Chain-Ladder family within a single probabilistic hierarchy. A parametric bootstrap procedure is developed to incorporate both process and parameter uncertainty. Simulation studies confirm near-nominal coverage under correct specification once the dispersion parameter is bias-corrected, and a controlled degradation under model misspecification. Empirical illustrations on claim count data (Australian motor bodily injury) and paid amounts (Taylor-Ashe) document both the structural reading of $κ$ and the working-approximation status of the model in the amounts case.
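
The Poisson-Gamma construction can be checked by simulation. The sketch assumes a mean-one Gamma heterogeneity term with shape $κ$ (so its variance is $1/κ$), which gives counts with mean $μ$ and the quadratic variance function $μ + μ^2/κ$; the paper's exact parametrization may differ, and the values of `mu` and `kappa` are hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """Poisson draw by Knuth's multiplication method; adequate for small means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def nb_count(rng, mu, kappa):
    """Poisson count whose rate is scaled by Gamma heterogeneity with mean 1."""
    theta = rng.gammavariate(kappa, 1.0 / kappa)  # shape kappa, scale 1 / kappa
    return poisson_draw(rng, mu * theta)


rng = random.Random(1)
mu, kappa = 5.0, 2.0
counts = [nb_count(rng, mu, kappa) for _ in range(50000)]
m = sum(counts) / len(counts)
v = sum((c - m) ** 2 for c in counts) / len(counts)
print(round(m, 2), round(v, 2), mu + mu ** 2 / kappa)
```

The empirical variance should be close to $μ + μ^2/κ = 17.5$ rather than the Poisson value $μ = 5$. As $κ$ grows the heterogeneity term concentrates at 1 and the counts approach a Poisson distribution, consistent with the limit stated in the abstract.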

2605.15802 2026-05-18 stat.ME

Generalized raking and stabilized weights for regression modeling in two-phase samples

双重抽样回归建模中的广义校正与稳定权重

Tong Chen, Joshua Slone, Gustavo Amorim, Pamela A. Shaw, Bryan E. Shepherd, Thomas Lumley

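AI summary: This paper proposes combining generalized raking with stabilized weights for regression modeling in two-phase samples. The combination improves efficiency by reducing unnecessary weight variation and using information from auxiliary variables.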

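Details
AI-translated abstract

In regression models fitted to data from complex survey designs, sampling weights often contain non-essential variation that inflates variance estimates. Stabilized weights mitigate this by adjusting the sampling weights to account for variation explained by covariates. In the context of two-phase sampling, we evaluate the performance of optimal stabilized weights and propose combining the stabilized weight estimator with generalized raking, a class of efficient design-based estimators. The combination improves efficiency by reducing unnecessary weight variation and using information from auxiliary variables. We show that it can be implemented with standard statistical software that handles two-phase samples and generalized raking. Simulation studies show that the proposed estimator improves precision under realistic two-phase designs, although efficiency gains may be limited in highly informative designs. The methods are applied to a large multinational two-phase study of Kaposi sarcoma among people living with HIV.

English abstract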

In regression models fitted to data from complex survey designs, sampling weights often incorporate non-essential variation, inflating variance estimates. Stabilized weights mitigate this issue by adjusting sampling weights to account for variation explained by covariates. In the context of two-phase sampling, we evaluate the performance of optimal stabilized weights and propose combining the stabilized weight estimator with generalized raking, a class of efficient design-based estimators. This combination improves efficiency by reducing unnecessary weight variation and leveraging information from auxiliary variables. We show this combination can be implemented using the standard statistical package that handles two-phase samples and generalized raking. Simulation studies demonstrate that the proposed estimator enhances precision under realistic two-phase designs, though efficiency gains may be limited in highly informative designs. The developed methods were applied to a large multinational two-phase study of Kaposi sarcoma among people living with HIV.
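
For readers unfamiliar with raking, the sketch below shows its classical form: design weights are rescaled iteratively until the weighted totals on each categorical margin match known totals. This is only the simplest member of the generalized raking family and is not the estimator proposed in the paper; the margins, targets, and function names are hypothetical.

```python
def weighted_total(weights, cats, cat):
    """Sum of the weights of units whose category equals cat."""
    return sum(w for w, c in zip(weights, cats) if c == cat)


def rake(weights, margins, targets, n_iter=200):
    """Classical raking (iterative proportional fitting) of sampling weights.

    margins[m][i] is unit i's category on margin m;
    targets[m][cat] is the known total for that category.
    """
    w = list(weights)
    for _ in range(n_iter):
        for cats, target in zip(margins, targets):
            totals = {c: weighted_total(w, cats, c) for c in target}
            w = [wi * target[c] / totals[c] for wi, c in zip(w, cats)]
    return w


sex = [0, 0, 0, 0, 0, 1, 1, 1]
age = [0, 0, 0, 1, 1, 0, 1, 1]
design_w = [10.0] * 8
raked = rake(design_w, [sex, age], [{0: 60.0, 1: 40.0}, {0: 50.0, 1: 50.0}])
print([round(w, 2) for w in raked])
```

After raking, the weighted totals reproduce the targets on both margins, which is how auxiliary information known for the full first-phase sample can be carried into the second-phase weights.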

2605.15789 2026-05-18 cs.LG eess.SP stat.ML

Learning Context-conditioned Gaussian Overbounds for Convolution-Based Uncertainty Propagation

基于卷积的不确定性传播的上下文条件高斯上界学习

Ruirui Liu, Xuejie Hou, Yiping Jiang, Hui Ren

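AI summary: This paper proposes a unified learning framework that trains neural networks to produce context-aware Gaussian overbounds with provable conservatism on a finite quantile grid and, under three explicit regularity assumptions, continuous-tail conservatism on a certified interval.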

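Details
AI-translated abstract

Uncertainty quantification is essential in safety-critical settings, from autonomous driving to aviation, finance, and health, where decisions must rely on conservative bounds rather than point estimates. Predictor-level intervals (e.g., from quantile regression, conformal prediction, variance networks, or Bayesian models) generally do not compose: adding two per-variable intervals need not yield a valid interval for their sum or preserve coverage. In aviation, Gaussian overbounding replaces complex error distributions with a conservative Gaussian whose tails dominate the true distribution, so conservatism propagates through linear operations. Yet classical overbounds are global, often overly conservative, and hard to adapt to feature-conditioned errors. We propose a unified learning framework that trains neural networks to produce context-aware Gaussian overbounds (mean and scale) with provable conservatism on a finite quantile grid and, under three explicit regularity assumptions, continuous-tail conservatism on a certified interval. Our overbounding loss enforces conservatism at selected quantiles while penalizing distributional distance with a Wasserstein-style term. The learned bounds support conservative linear-combination and convolution analysis on the enforced grid, and on the certified interval when the assumptions hold, while being less redundant than traditional methods. We provide a scoped analysis of discrete-to-continuous conservatism and compact-domain objective regularity, and validate on synthetic and real-world datasets, including multipath, ionospheric, and tropospheric residual errors. Across these settings, the method yields tighter bounds while maintaining conservatism on the enforced grid. The framework is modality-agnostic and applies to learning systems that need conservative, feature-conditioned uncertainty estimates in dynamic environments.

English abstract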

Uncertainty quantification is essential in safety-critical settings--from autonomous driving to aviation, finance, and health--where decisions must rely on conservative bounds rather than point estimates. Predictor-level intervals (e.g., from quantile regression, conformal prediction, variance networks, or Bayesian models) generally do not compose: adding two per-variable intervals need not yield a valid interval for their sum or preserve coverage. In aviation, Gaussian overbounding replaces complex error distributions with a conservative Gaussian whose tails dominate the truth, so conservatism propagates through linear operations. Yet classical overbounds are global, often overly conservative, and hard to adapt to feature-conditioned errors. We propose a unified learning framework that trains neural networks to produce context-aware Gaussian overbounds--mean and scale--with provable conservatism on a finite quantile grid and, under three explicit regularity assumptions, continuous-tail conservatism on a certified interval. Our overbounding loss enforces conservativeness at selected quantiles while penalizing distributional distance with a Wasserstein-style term. The learned bounds support conservative linear-combination and convolution analysis on the enforced grid, and on the certified interval when assumptions hold, while being less redundant than traditional methods. We provide a scoped analysis of discrete-to-continuous conservatism and compact-domain objective regularity, and validate on synthetic data and real-world datasets, including multipath, ionospheric, and tropospheric residual errors. Across these settings, the method yields tighter bounds while maintaining conservatism on the enforced grid and in experiments. The framework is modality-agnostic and applicable to learning systems that require conservative, feature-conditioned uncertainty estimates in dynamic environments.

2605.15108 2026-05-18 stat.ML cs.AI cs.IR cs.LG stat.ME

Logging Policy Design for Off-Policy Evaluation

为离线策略评估设计日志策略

Connor Douglas, Joel Persson, Foster Provost

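AI summary: This paper studies how to design logging policies that minimize off-policy evaluation (OPE) error. It characterizes a fundamental reward-coverage tradeoff and derives optimal policies under several informational regimes.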

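Details
AI-translated abstract

Off-policy evaluation (OPE) estimates the value of a target policy (e.g., a recommender system) using data collected by a different logging policy. It enables high-stakes experimentation without live deployment, but in practice its accuracy depends heavily on the logging policy that collected the data used to compute the estimate. We study how to design logging policies that minimize OPE error. We characterize a fundamental reward-coverage tradeoff: concentrating probability mass on high-reward actions reduces variance but risks missing signal on actions the target policy may take. We propose a unifying framework for logging policy design and derive optimal policies in informational regimes where the target policy and reward distribution are known, unknown, or partially known through priors or noisy estimates. Our results give actionable guidance to firms choosing among multiple candidate recommendation systems. We demonstrate the importance of treatment selection when gathering data for OPE and describe theoretically optimal approaches when this is the firm's primary objective. We also distill practical design principles for selecting logging policies when operational constraints prevent implementing the theoretical optimum.

English abstract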

Off-policy evaluation (OPE) estimates the value of a target treatment policy (e.g., a recommender system) using data collected by a different logging policy. It enables high-stakes experimentation without live deployment, yet in practice accuracy depends heavily on the logging policy used to collect data for computing the estimate. We study how to design logging policies that minimize OPE error for given target policies. We characterize a fundamental reward-coverage tradeoff: concentrating probability mass on high-reward actions reduces variance but risks missing signal on actions the target policy may take. We propose a unifying framework for logging policy design and derive optimal policies in canonical informational regimes where the target policy and reward distribution are (i) known, (ii) unknown, and (iii) partially known through priors or noisy estimates at logging time. Our results provide actionable guidance for firms choosing among multiple candidate recommendation systems. We demonstrate the importance of treatment selection when gathering data for OPE, and describe theoretically optimal approaches when this is a firm's primary objective. We also distill practical design principles for selecting logging policies when operational constraints prevent implementing the theoretical optimum.
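
The dependence of OPE on the logging policy can be seen in the standard inverse propensity scoring (IPS) estimator, sketched here for a context-free bandit with three actions. IPS is a common OPE estimator used for illustration; the abstract does not say which estimators the paper analyzes, and the policies and reward means below are made up.

```python
import random


def ips_estimate(actions, rewards, target_probs, logging_probs):
    """Inverse propensity scoring estimate of the target policy's value."""
    weighted = (
        target_probs[a] / logging_probs[a] * r for a, r in zip(actions, rewards)
    )
    return sum(weighted) / len(actions)


rng = random.Random(0)
logging = [0.5, 0.3, 0.2]      # hypothetical logging policy
target = [0.1, 0.2, 0.7]       # hypothetical target policy
mean_reward = [0.2, 0.5, 0.8]  # Bernoulli reward means per action

actions = rng.choices(range(3), weights=logging, k=50000)
rewards = [1.0 if rng.random() < mean_reward[a] else 0.0 for a in actions]

true_value = sum(p * r for p, r in zip(target, mean_reward))
estimate = ips_estimate(actions, rewards, target, logging)
print(round(true_value, 3), round(estimate, 3))
```

The importance weights `target_probs[a] / logging_probs[a]` grow when the logging policy rarely plays actions that the target policy favors, so the estimator's variance depends directly on how the logging policy is chosen.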

2605.14260 2026-05-18 stat.ML cs.LG

On the Burden of Achieving Fairness in Conformal Prediction

在符合预测中实现公平性的负担

Ziang Gao, Pengqi Liu, Archer Yi Yang, Mouloud Belbahri, Jesse C. Cresswell, Masoud Asgharian

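AI summary: This paper shows that calibrating conformal prediction with a single pooled threshold hides cross-group heterogeneity, proves that the two leading fairness definitions (Equalized Coverage and Equalized Set Size) are fundamentally in tension, and quantifies the cost of moving between calibration policies.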

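Details
AI-translated abstract

Conformal prediction is often calibrated with a single pooled threshold, but this can hide cross-group heterogeneity in score distributions and distort group-wise coverage. We study this phenomenon through the population score distributions underlying split conformal calibration. First, we derive a conservation law and a lower bound showing that pooled calibration incurs irreducible group-wise coverage distortion at a scale set by cross-group quantile heterogeneity. Second, we show that the two leading fairness definitions for conformal prediction, Equalized Coverage and Equalized Set Size, are fundamentally in tension. Third, we quantify the cost of moving between policies that treat groups separately and policies that pool them. Experiments on synthetic and real data confirm the same bidirectional trade-off after finite-sample calibration. Our results show that, for the policy families studied, the choice of calibration does not remove cross-group heterogeneity; it determines whether the resulting distortion appears in the coverage or the size dimension, which provides a principled lens for analyzing fairness-oriented calibration choices in practice.

English abstract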

Conformal prediction is often calibrated with a single pooled threshold, but this can hide cross-group heterogeneity in score distributions and distort group-wise coverage. We study this phenomenon through the population score distributions underlying split conformal calibration. First, we derive a conservation law and lower bound showing that pooled calibration incurs irreducible group-wise coverage distortion at a scale set by cross-group quantile heterogeneity. Second, we demonstrate that the two leading fairness definitions for conformal prediction, Equalized Coverage and Equalized Set Size, are fundamentally in tension. Third, we quantify the cost of moving between policies which treat groups separately or pool them. Experiments on synthetic and real data confirm the same bidirectional trade-off after finite-sample calibration. Our results show that, for the policy families studied here, calibration choice does not remove cross-group heterogeneity; it determines whether the resulting distortion appears in the coverage or size dimension, providing a principled lens for analyzing fairness-oriented calibration choices in practice.
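
The difference between pooled and group-wise calibration can be sketched with the standard split conformal threshold, the $\lceil (n+1)(1-α) \rceil$-th smallest calibration score. The two groups of integer scores below are a toy example chosen so the numbers are easy to follow, and coverage is evaluated on the calibration scores themselves purely for illustration.

```python
import math


def conformal_threshold(scores, alpha):
    """Split conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def coverage(scores, threshold):
    """Fraction of scores at or below the threshold."""
    return sum(s <= threshold for s in scores) / len(scores)


group_a = list(range(1, 21))   # toy scores 1..20
group_b = list(range(11, 31))  # toy scores 11..30, shifted upward

pooled = conformal_threshold(group_a + group_b, alpha=0.1)
per_group = {
    "a": conformal_threshold(group_a, alpha=0.1),
    "b": conformal_threshold(group_b, alpha=0.1),
}
print(pooled, per_group)
print(coverage(group_a, pooled), coverage(group_b, pooled))
```

With a pooled threshold the group with lower scores is over-covered and the group with higher scores is under-covered, while group-wise thresholds equalize coverage at the price of different thresholds (and hence different set sizes) per group.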

2605.12830 2026-05-18 stat.ME

Linking COPD Prevalence with Income Distribution: A Spatial Heterogeneous Compositional Regression via Geographically Weighted Penalized Approach

将慢性阻塞性肺病患病率与收入分布联系起来:一种通过地理加权惩罚方法的空间异质组合回归

Jingwen Deng, Shujie Ma, Sergio J. Rey, Guanyu Hu

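AI summary: This paper proposes a geographically weighted penalized compositional regression model for the spatially heterogeneous relationship between income distribution and COPD prevalence. Nonconvex penalties improve estimation accuracy and interpretability in high-dimensional spatial settings.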

Comments 39 pages, 7 figures, appendix included

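Details
AI-translated abstract

Income inequality is a major contributor to health disparities, yet its effects often vary by geography and are commonly represented as compositional distributions (e.g., proportions of households across income brackets). Existing spatial regression methods struggle in this setting: they typically assume smooth spatial variation, cannot accommodate abrupt spatial heterogeneity, and lack a principled treatment of compositional covariates. We propose a geographically weighted penalized compositional regression model that addresses these challenges simultaneously. The method adopts a pairwise fusion penalty that can detect both contiguous and noncontiguous regional clusters with shared regression effects, relaxing strong assumptions of spatial smoothness and geographic contiguity. Regions with similar socioeconomic structures can therefore be identified even when they are not geographically adjacent. By incorporating nonconvex penalties such as the minimax concave penalty (MCP), the approach achieves improved estimation accuracy, interpretability, and scalability in high-dimensional spatial settings. We illustrate the method by analyzing the relationship between U.S. income composition and chronic obstructive pulmonary disease (COPD) prevalence, which reveals spatially heterogeneous associations that conventional models obscure. The proposed framework provides a flexible and robust tool for spatial data analysis involving compositional predictors and regional heterogeneity.

English abstract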

Income inequality is a major contributor to health disparities, yet its effects often vary by geography and are commonly represented as compositional distributions (e.g., proportions of households across income brackets). Existing spatial regression methods struggle in this setting: they typically assume smooth spatial variation, cannot accommodate abrupt spatial heterogeneity, and lack principled treatment of compositional covariates. We propose a geographically weighted penalized compositional regression model that addresses these challenges simultaneously. Our method adopts a pairwise fusion penalty that enables detection of both contiguous and noncontiguous regional clusters with shared regression effects, thereby relaxing strong assumptions of spatial smoothness and geographic contiguity. This allows regions with similar underlying socioeconomic structures to be identified even when they are not geographically adjacent. By incorporating nonconvex penalties, such as the minimax concave penalty (MCP), the approach achieves improved estimation accuracy, interpretability, and scalability in high-dimensional spatial settings. We illustrate the method through an analysis linking U.S. income composition to chronic obstructive pulmonary disease (COPD) prevalence, revealing spatially heterogeneous associations that are obscured by conventional models. The proposed framework provides a flexible and robust tool for spatial data analysis involving compositional predictors and region-specific heterogeneity.
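
The minimax concave penalty (MCP) named in this abstract has a standard closed form, sketched below in Python. The function names and the parameter values are illustrative assumptions; in a pairwise fusion setting the argument `t` would typically be the size of the difference between two regions' coefficients, but the exact way the authors apply the penalty is not stated here.

```python
def mcp_penalty(t, lam, gamma):
    """Standard MCP: lam*|t| - t^2/(2*gamma) for |t| <= gamma*lam, else the constant gamma*lam^2/2."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam

def mcp_slope(t, lam, gamma):
    """Magnitude of the MCP derivative: it decays linearly from lam to 0, then stays at 0."""
    return max(0.0, lam - abs(t) / gamma)

# Hypothetical tuning values, chosen only to show the shape of the penalty.
lam, gamma = 1.0, 3.0
for t in (0.0, 0.5, 1.5, 3.0, 10.0):
    print(t, mcp_penalty(t, lam, gamma), mcp_slope(t, lam, gamma))
```

Because the slope drops to zero for large arguments, large differences are not shrunk further, which is the usual motivation for preferring a nonconvex penalty such as MCP over a lasso-type penalty.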

2605.09231 2026-05-18 cs.CV stat.ML

An Elastic Shape Variational Autoencoder for Skeleton Pose Trajectories

一种弹性形状变分自编码器用于骨骼姿态轨迹

Arafat Rahman, Shashwat Kumar, Laura E. Barnes, Anuj Srivastava

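AI summary This paper proposes ES-VAE, which learns a generative model of skeletal trajectories on Kendall's shape manifold through the transported square-root velocity field representation. The model isolates shape dynamics and outperforms standard VAEs and sequence-modeling baselines in gait analysis and action recognition.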

Comments 9 pages

详情
AI中文摘要

深度生成模型为建模复杂结构数据提供了灵活的框架,如图像、视频、3D物体和文本。然而,当应用于人体骨骼序列时,标准变分自编码器(VAEs)通常将大量容量分配给干扰因素,如摄像机方向、主体尺寸、视角和执行速度,而非形状和运动的内在几何结构。我们提出弹性形状-变分自编码器(ES-VAE),一种针对骨骼轨迹的几何感知生成模型,利用传输平方根速度场(TSRVF)表示在Kendall形状流形上。该表示本质上消除了形状的刚体平移、旋转和全局缩放以及序列的时间率变化,隔离了底层形状动态。ES-VAE编码器将骨骼序列映射到低维潜在空间,结合黎曼对数映射,而解码器利用相应的指数映射重建序列。我们在两个数据集上展示了ES-VAE的有效性。首先,我们分析骨骼步态周期以预测临床移动评分并分类主体为健康和中风后组。其次,我们在NTU RGB+D数据集上评估动作识别。在两种设置中,ES-VAE均优于标准VAE和一系列序列建模基线,包括时间卷积网络、Transformer和图卷积网络。更广泛地说,ES-VAE为在姿态形状流形上学习生成模型提供了系统框架,相较于现有深度学习方法,提供了改进的潜在表示和下游性能。

英文摘要

Deep generative models provide flexible frameworks for modeling complex, structured data such as images, videos, 3D objects, and texts. However, when applied to sequences of human skeletons, standard variational autoencoders (VAEs) often allocate substantial capacity to nuisance factors (such as camera orientation, subject scale, viewpoint, and execution speed) rather than the intrinsic geometry of shapes and their motion. We propose the Elastic Shape - Variational Autoencoder (ES-VAE), a geometry-aware generative model for skeletal trajectories that leverages the transported square-root velocity field (TSRVF) representation on Kendall's shape manifold. This representation inherently removes rigid translations, rotations, and global scaling of shapes, and temporal rate variability of sequences, isolating the underlying shape dynamics. The ES-VAE encoder maps skeletal sequences to a low-dimensional latent space incorporating the Riemannian logarithm map, while the decoder reconstructs sequences using the corresponding exponential map. We demonstrate the effectiveness of ES-VAE on two datasets. First, we analyze skeletal gait cycles to predict clinical mobility scores and classify subjects into healthy and post-stroke groups. Second, we evaluate action recognition on the NTU RGB+D dataset. Across both settings, ES-VAE consistently outperforms standard VAEs and a range of sequence modeling baselines, including temporal convolutional networks, transformers, and graph convolutional networks. More broadly, ES-VAE provides a principled framework for learning generative models of longitudinal data on pose shape manifolds, offering improved latent representation and downstream performance compared to existing deep learning approaches.

2605.03406 2026-05-18 stat.ME

A General Framework for Optimal Group Sequential Testing via Mixed-Integer Linear Programming

一种通过混合整数线性规划进行最优分组序贯检验的一般框架

Dae Woong Ham, Stefanus Jasin, Xuejun Zhao

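AI summary This paper proposes an optimization approach based on mixed-integer linear programming that improves the rejection criterion of group sequential tests while controlling type-1 and type-2 errors, and demonstrates it on a study of acute kidney injury interventions.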

详情
AI中文摘要

序贯假设检验被广泛用于在数据随时间到达时进行多个检验。特别是,研究者经常使用分组序贯检验(GST)在K次或“组”中测试相同的假设。在这一设定中,许多方法已被提出以允许研究者在K次检查中统一控制I型错误(通常称为各种alpha-spending预算)。尽管这些方法在控制统一I型错误方面都有效,但不清楚在尝试尽快拒绝原假设时哪些方法是最优的。在本文中,我们直接在GST设定下优化拒绝准则,同时在控制I型和II型错误的相同约束下进行优化。我们使用样本平均近似结合混合整数线性规划(S-MILP)方法来解决这个问题,并展示了我们的S-MILP方法如何优于经典的GST程序,如Lan-DeMets、Pocock和O'Brien-Fleming方法。我们还发现最优解通常会尽早激进地使用alpha预算,为长期存在的关于哪种alpha-spending预算更高效的争论提供了见解。最后,我们将我们的最优S-MILP方法应用于最近一项关于急性肾损伤干预的研究,并发现我们的最优S-MILP方法可以比原始研究和其他GST方法更快地达到统计上显著的结论。

英文摘要

Sequential hypothesis tests are widely adopted as a principled way to perform multiple tests on data that arrives over time. In particular, researchers frequently utilize group sequential hypothesis tests (GST) to test the same hypotheses at K times or "groups" while data arrives sequentially. In this setting, many methods have been proposed to allow researchers to uniformly control type-1 error across K checks (often known as various alpha-spending budgets). Although all of these methods validly control the uniform type-1 error, it is not clear which of them is optimal when trying to reject the null as soon as possible. In this paper, we directly optimize the rejection criterion in the GST setting under the same constraints of controlling type-1 and type-2 errors. We use an approach that combines sample average approximation with mixed-integer linear programming (S-MILP) for this problem and show how our S-MILP approach dominates classical GST procedures such as the Lan-DeMets, Pocock, and O'Brien-Fleming methods. We also find that the optimal solution typically spends the alpha-budget aggressively early, shedding light on the long-standing debate over which alpha-spending budgets are more efficient. Finally, we apply our optimal S-MILP approach to a recent study on acute kidney injury interventions and find that it can reach the same statistically significant conclusion faster than the original study and other GST methods.
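
To make the notion of an alpha-spending budget concrete, the sketch below evaluates the two classical Lan-DeMets spending functions (O'Brien-Fleming type and Pocock type) at K equally spaced looks. It illustrates only the classical baselines mentioned in the abstract, not the S-MILP method; the choice of K = 4 and alpha = 0.05 is an assumption for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def obf_spending(t, alpha):
    """Lan-DeMets O'Brien-Fleming-type spending: 2 - 2 * Phi(z_{alpha/2} / sqrt(t))."""
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 - 2 * _STD_NORMAL.cdf(z / math.sqrt(t))

def pocock_spending(t, alpha):
    """Lan-DeMets Pocock-type spending: alpha * ln(1 + (e - 1) * t)."""
    return alpha * math.log(1 + (math.e - 1) * t)

K = 4          # hypothetical number of looks
alpha = 0.05   # hypothetical overall type-1 error budget
fractions = [k / K for k in range(1, K + 1)]   # information fractions t = k/K
obf = [obf_spending(t, alpha) for t in fractions]
poc = [pocock_spending(t, alpha) for t in fractions]

for t, a_obf, a_poc in zip(fractions, obf, poc):
    print(f"t={t:.2f}  cumulative alpha spent: OBF-type={a_obf:.5f}  Pocock-type={a_poc:.5f}")
```

Both functions spend the full budget by the final look, but the Pocock-type function spends far more of it at the first look, which is the kind of early spending the abstract reports for the optimal solution.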

2605.02248 2026-05-18 math.ST cs.DM eess.SP q-bio.GN q-fin.ST stat.TH

Statistics of a multi-factor function from its Fourier transform

多因素函数的统计学与其傅里叶变换

Matthew A. Herman, Stephen Doro

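AI summary Derives the population statistics of a multi-factor function from its Fourier transform and presents an m-Coefficient/Index Annihilation Theorem: within each Fourier-domain term, the coefficient indices sum to zero. The techniques can serve as analysis and design tools or as feasibility constraints in search algorithms.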

Comments Submitted to the Journal of Fourier Analysis and Applications. 42 pages, 6 figures

详情
AI中文摘要

对于定义在有限阿贝尔群G上的n因素函数f,我们仅通过其傅里叶变换f̂推导其总体统计量。主要结果是m系数/索引湮灭定理:函数f的m阶矩成为一系列项,每个项恰好包含m个傅里叶系数---令人惊讶的是,每个项中的系数索引在群加法下求和为零。这一条件像一个过滤器,限制傅里叶域中出现的项,可揭示驱动f的变量之间的深层关系。这些技术也可作为分析/设计工具或搜索算法中的可行性约束。对于定义在Z_2^n上的函数,我们展示了如何从傅里叶域推导二项分布的偏度、峰度等统计量。其他示例也进行了展示。

英文摘要

For a phenomenon $\boldsymbol{f}$ that is a function of $n$ factors, defined on a finite abelian group $G$, we derive its population statistics solely from its Fourier transform $\hat{\boldsymbol{f}}$. Our main result is an $m$-Coefficient/Index Annihilation Theorem: the $m$th moment of $\boldsymbol{f}$ becomes a series of terms, each with precisely $m$ Fourier coefficients; surprisingly, the coefficient indices in each term sum to zero under group addition. This condition acts like a filter, limiting which terms appear in the Fourier domain, and can reveal deeper relationships between the variables driving $\boldsymbol{f}$. These techniques can also be used as an analytical/design tool, or as a feasibility constraint in search algorithms. For functions defined on $\mathbb{Z}_2^n$, we show how the skew, kurtosis, etc. of a binomial distribution can be derived from the Fourier domain. Several other examples are presented.
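
The zero-sum index condition can be checked numerically on $\mathbb{Z}_2^n$, where group addition is bitwise XOR. The sketch below relies only on standard character orthogonality, not on the paper's theorem statement; the normalization of the transform and the test function `f` are assumptions made for illustration.

```python
def walsh_hadamard(values, n):
    """Fourier coefficients on Z_2^n: fhat[a] = (1/N) * sum_x f[x] * (-1)^(a . x)."""
    N = 1 << n
    return [
        sum(values[x] * (-1) ** bin(a & x).count("1") for x in range(N)) / N
        for a in range(N)
    ]

n = 3
N = 1 << n
f = [1.0, 4.0, 2.0, 0.5, 3.0, 3.0, 7.0, 1.5]   # arbitrary values of f on Z_2^3
fhat = walsh_hadamard(f, n)

# Raw moments computed directly from the function values.
direct_m2 = sum(v ** 2 for v in f) / N
direct_m3 = sum(v ** 3 for v in f) / N

# The same moments from Fourier coefficients whose indices XOR to zero.
fourier_m2 = sum(fhat[a] * fhat[a] for a in range(N))                 # a ^ a == 0
fourier_m3 = sum(fhat[a] * fhat[b] * fhat[a ^ b]                      # a ^ b ^ (a ^ b) == 0
                 for a in range(N) for b in range(N))

print(direct_m2, fourier_m2)
print(direct_m3, fourier_m3)
```

Each term of the third moment uses exactly three coefficients with indices `a`, `b`, and `a ^ b`, so the indices cancel under XOR, in line with the filter-like condition the abstract describes.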

2605.00934 2026-05-18 cs.LG cs.CV stat.ML

Structured Analytic Coherent Point Drift for Non-Rigid Point Set Registration

结构化分析一致点漂移用于非刚性点集配准

Wei Feng, Haiyong Zheng

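AI summary This paper proposes Analytic-CPD, which replaces the standard CPD M-step with structured analytic mapping estimation to make non-rigid point set registration more efficient and easier to control. Experiments on several datasets support its effectiveness and its accuracy-efficiency performance.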

Comments Revised version. Supplementary material incorporated as appendices; method, implementation, and experimental details expanded

详情
AI中文摘要

Coherent Point Drift (CPD) 是一种用于无监督非刚性点集配准的概率框架。其标准非刚性M-step然而依赖于点索引高斯核系统,其大小随移动点数量增长,导致大点集的形变估计计算负担重且难以控制复杂度。为解决这些限制,我们提出Analytic-CPD,一种新的无监督非刚性配准框架,为CPD提供结构化分析重述。Analytic-CPD保留CPD后验对应层,但将M-step从点索引核位移估计提升到结构化分析映射估计。通过将CPD的高斯混合后验机制与结构化分析映射(SAM)耦合,该方法获得一个系数维度由环境维度和分析阶数而非移动点数量决定的形变模型。更重要的是,形变估计在可解释的分析函数空间层次上组织,因此分析阶数可以随着后验对应可靠性增加而逐步提升。我们通过增加阶数连续策略与减少阶段长度实现该想法:低阶分析映射首先稳定后验对应结构,而更高阶模式随后细化非线性残差形变。在受控模型匹配、平滑模型不匹配和注册人体形状数据上的实验验证了Analytic-CPD的有效性和优越的精度-效率性能。

英文摘要

Coherent Point Drift (CPD) is a representative probabilistic framework for unsupervised non-rigid point set registration. Its standard non-rigid M-step, however, relies on a point-indexed Gaussian-kernel system whose size grows with the number of moving points, making deformation estimation computationally heavy for large point sets and difficult to control in complexity during registration. To address these limitations, we propose Analytic-CPD, a new unsupervised non-rigid registration framework that gives CPD a structured analytic reformulation. Analytic-CPD preserves the CPD posterior correspondence layer, but lifts the M-step from point-indexed kernel displacement estimation to structured analytic mapping estimation. By coupling the Gaussian-mixture posterior mechanism of CPD with Structured Analytic Mappings (SAM), the method obtains a deformation model whose coefficient dimension is governed by the ambient dimension and analytic order rather than by the number of moving points. More importantly, deformation estimation is organized over an interpretable hierarchy of analytic function spaces, so the analytic order can be increased progressively as posterior correspondences become more reliable. We implement this idea through an increasing-degree continuation strategy with decreasing stage lengths: low-order analytic maps first stabilize the posterior correspondence structure, while higher-order modes later refine nonlinear residual deformation. Experiments on controlled model-matched, smooth model-mismatch, and registered human-shape data demonstrate the effectiveness and favorable accuracy-efficiency performance of Analytic-CPD.

2604.26126 2026-05-18 eess.SY cs.SY stat.ML

Application of Deep Reinforcement Learning to Event-Triggered Control for Networked Artificial Pancreas Systems

深度强化学习在联网人工胰腺系统事件触发控制中的应用

Junya Ikemoto, Satoshi Maruyama, Kazumune Hashimoto

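AI summary This paper proposes a deep-reinforcement-learning-based event-triggered controller design that introduces a rule-based criterion defined by changes in blood glucose, which avoids explicitly learning the update timing and improves communication efficiency while maintaining control performance.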

Comments 14 pages, 7 figures, submitted to a journal

详情
AI中文摘要

本文提出了一种基于深度强化学习(DRL)的事件触发控制器设计,用于联网人工胰腺(AP)系统。尽管现有基于DRL的AP控制器通常假设周期性控制更新,联网控制系统(NCSs)需要减少通信频率以实现高效能操作,这与控制更新直接相关。然而,联合学习胰岛素剂量和更新时间显著增加了学习问题的复杂性。为缓解这一复杂性,我们开发了一种实用的DRL控制器设计,通过引入基于血糖变化的规则准则,避免显式学习更新时间。结果表明,决策发生在不规则间隔,问题自然地被建模为半马尔可夫决策过程(SMDP),我们扩展了标准DRL算法。数值实验表明,所提出的方法在提高通信效率的同时保持了控制性能。

英文摘要

This paper proposes a deep reinforcement learning (DRL)-based event-triggered controller design for networked artificial pancreas (AP) systems. Although existing DRL-based AP controllers typically assume periodic control updates, networked control systems (NCSs) require a reduction in communication frequency to achieve energy-efficient operation, which is directly tied to control updates. However, jointly learning both insulin dosing and update timing significantly increases the complexity of the learning problem. To alleviate this complexity, we develop a practical DRL-based controller design that avoids explicitly learning update timing by introducing a rule-based criterion defined by changes in blood glucose. As a result, decision-making occurs at irregular intervals, and the problem is naturally formulated as a semi-Markov decision process (SMDP), for which we extend a standard DRL algorithm. Numerical experiments demonstrate that the proposed method improves communication efficiency while maintaining control performance.
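
A rule-based trigger of the kind described in this abstract can be sketched as follows. The specific rule (update when glucose has moved by at least a fixed amount since the last update), the threshold, and the readings are all hypothetical; the abstract does not specify the authors' criterion in this detail.

```python
def event_triggered_steps(glucose, threshold):
    """Indices at which a control update fires: the reading has changed by at least
    `threshold` since the reading at the most recent update."""
    triggers = [0]                 # assume an update at the first sample
    last = glucose[0]
    for t in range(1, len(glucose)):
        if abs(glucose[t] - last) >= threshold:
            triggers.append(t)
            last = glucose[t]
    return triggers

# Hypothetical glucose readings (mg/dL) at equally spaced sampling instants.
readings = [110, 112, 115, 124, 126, 140, 139, 128, 120, 118]
updates = event_triggered_steps(readings, threshold=10)
sojourn = [b - a for a, b in zip(updates, updates[1:])]

print("update indices:", updates)
print("steps between updates:", sojourn)
```

The gaps between updates are not constant, which is why decision epochs arrive at irregular intervals and a semi-Markov formulation is a natural fit.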

2603.11719 2026-05-18 stat.ME

Cross-Validation in Bipartite Networks

二元网络中的交叉验证

Bokai Yang, Yuanxing Chen, Yuhong Yang

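AI summary This paper proposes Bipartite Cross-Validation (BCV) for estimating the numbers of communities in bipartite networks, establishes the first model selection consistency theory for this setting, and shows strong finite-sample performance in simulations and real-data applications.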

Comments 48 pages, 7 figures

详情
AI中文摘要

二元网络,即编码两种不同实体之间相互作用的网络,在应用中广泛出现,并在节点集之间表现出固有的不对称性。尽管关于二元社区检测的文献日益增多,但估计社区数量(K₁,K₂)仍然是二元网络分析中的关键问题,且在没有建立模型选择一致性的情况下仍缺乏理论发展。实际上,固有的不对称性和可能剧烈不同的K₁和K₂的二维参数空间提出了与单元网络不同的独特挑战。特别是,候选模型可能同时在一组节点上过拟合而在另一组上欠拟合。为了解决这些挑战,我们提出了二元交叉验证(BCV),即一种惩罚交叉验证框架,能够以完全数据驱动的方式联合选择(K₁,K₂)。我们建立了首个二元网络的模型选择一致性理论,特别地,容纳了社区数量随网络规模变化的 regime,揭示了稀疏性和模型复杂性之间的复杂相互作用。模拟和实际数据应用证明了BCV的强有限样本性能。

英文摘要

Bipartite networks, which encode interactions between two distinct types of entities, arise widely in applications and exhibit inherent asymmetry across node sets. Despite a growing literature on bipartite community detection, estimating community numbers $(K_1, K_2)$, a critical issue for bipartite network analysis, remains theoretically underdeveloped without any model selection consistency established, to our knowledge. Indeed, the inherent asymmetry and the two-dimensional parameter space with possibly drastically different $K_1$ and $K_2$ pose unique challenges that differ from unipartite cases. In particular, the candidate models may simultaneously overfit one node set while underfitting the other. To address these challenges, we propose Bipartite Cross-Validation (BCV), a penalized cross-validation framework that jointly selects $(K_1,K_2)$ in a fully data-driven manner. We establish the first model selection consistency for bipartite networks, notably accommodating the regime where the numbers of communities scale with the network size, revealing the intricate interplay between sparsity and model complexity. Simulations and real-data applications demonstrate strong finite-sample performance of BCV.

2602.23892 2026-05-18 math.OC cs.IT math.IT stat.CO

Towards Tsallis Fully Probabilistic Design

迈向Tsallis完全概率设计

Vyacheslav Kungurtsev, Giovanni Russo

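AI summary This paper proposes a fully probabilistic design framework based on Tsallis divergence for stochastic processes with non-Gaussian tail behavior, and proves convergence and optimality through a double iteration scheme.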

详情
AI中文摘要

完全概率设计(FPD)是一种强大的框架,提供了随机控制、学习和决策的优雅统一描述。本文引入了广义的FPD框架,称为Tsallis FPD。Tsallis FPD使用Tsallis散度替代标准FPD中的Kullback-Leibler散度。Tsallis散度是非广泛统计力学的自然推广,为具有非高斯尾部行为的随机过程提供灵活性。在构建Tsallis FPD后,我们通过固定点迭代法证明了其收敛性。该构造采用双迭代方案,执行一系列逆向归纳,而非传统FPD中单次向下传递的步骤。我们证明了该构造渐近收敛到固定点,并且该固定点是Tsallis FPD的最优解。

英文摘要

Fully Probabilistic Design (FPD) is a powerful framework offering an elegant and unifying account of stochastic control, learning and decision-making. Here we introduce a generalized FPD framework, which we term Tsallis FPD. Tsallis FPD uses Tsallis divergence in place of the Kullback-Leibler divergence that defines the standard FPD cost term. Tsallis divergence is a natural generalization of the KL divergence, rooted in non-extensive statistical mechanics and providing flexibility towards modeling stochastic processes with non-Gaussian tail behavior. After formulating Tsallis FPD, we develop a constructive proof of convergence by formulating a fixed point iteration. The construction takes the form of a double iteration scheme that performs a sequence of backwards inductions, rather than a single pass down the stages that constitutes the proven approach for classical FPD. We prove that this construction asymptotically converges to a fixed point and that this fixed point is an optimal solution to Tsallis FPD.
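
For discrete distributions, one common form of the Tsallis relative entropy is $D_q(p\,\|\,r) = \frac{1}{q-1}\left(\sum_i p_i^{q} r_i^{1-q} - 1\right)$, which recovers the KL divergence as $q \to 1$. The sketch below checks that limit numerically. Conventions for Tsallis divergence vary, so the form used here is an assumption and may differ from the paper's definition; the example distributions are arbitrary.

```python
import math

def tsallis_divergence(p, r, q):
    """One common Tsallis relative entropy: (sum_i p_i^q * r_i^(1-q) - 1) / (q - 1), for q != 1."""
    return (sum(pi ** q * ri ** (1 - q) for pi, ri in zip(p, r)) - 1) / (q - 1)

def kl_divergence(p, r):
    """Kullback-Leibler divergence sum_i p_i * log(p_i / r_i), skipping zero-mass terms."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r) if pi > 0)

p = [0.5, 0.3, 0.2]
r = [0.2, 0.5, 0.3]

kl = kl_divergence(p, r)
near_one = tsallis_divergence(p, r, q=1.0001)   # should be close to KL
heavier = tsallis_divergence(p, r, q=2.0)       # a different member of the family

print("KL:", kl)
print("Tsallis, q=1.0001:", near_one)
print("Tsallis, q=2:", heavier)
```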

2601.21765 2026-05-18 stat.CO stat.ME stat.ML

Mean-field Variational Bayes for Sparse Probit Regression

稀疏Probit回归的均场变分贝叶斯方法

Augusto Fasano, Giovanni Rebaudo

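AI summary This paper proposes a mean-field variational Bayes method for sparse variable selection with binary outcomes. Closed-form updates enable efficient inference that is orders of magnitude faster than MCMC while maintaining comparable accuracy.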

详情
AI中文摘要

我们考虑在probit链接下二元结果的贝叶斯变量选择,采用spike-and-slab先验对回归系数进行建模。受高维情况下MCMC采样计算挑战的启发,我们开发了一种均场变分贝叶斯近似方法,其中所有变分因子均可获得闭式更新,且证据下界可闭式表达。这使得可以开发一种高效的坐标上升变分推断算法来寻找变分参数的最优值。该方法能够产生后验包含概率和参数估计,从而在单一框架内实现可解释的选择和预测。如模拟和真实数据应用所示,所提方法成功识别了重要变量,并在速度上比MCMC快数个数量级,同时保持了可比的准确性。

英文摘要

We consider Bayesian variable selection for binary outcomes under a probit link with a spike-and-slab prior on the regression coefficients. Motivated by the computational challenges encountered by Markov chain Monte Carlo (MCMC) samplers in high-dimensional regimes, we develop a mean-field variational Bayes approximation in which all variational factors admit closed-form updates, and the evidence lower bound is available in closed form. This, in turn, allows the development of an efficient coordinate ascent variational inference algorithm to find the optimal values of the variational parameters. The approach produces posterior inclusion probabilities and parameter estimates, enabling interpretable selection and prediction within a single framework. As shown in both simulated and real data applications, the proposed method successfully identifies the important variables and is orders of magnitude faster than MCMC, while maintaining comparable accuracy.

2601.21636 2026-05-18 cs.LG cs.CR stat.ML

Sampling-Free Privacy Accounting for Matrix Mechanisms under Random Allocation

无需采样矩阵机制下的随机分配隐私计费

Jan Schuchardt, Nikita Kalinin

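AI summary This paper proposes sampling-free bounds, based on Rényi divergence and conditional composition, for differential privacy amplification of matrix factorization mechanisms under random allocation. The bounds avoid the high-probability-only guarantees and random abstention that sampling-based approaches require, and they apply to arbitrary banded and non-banded matrices.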

详情
AI中文摘要

我们研究了在随机分配(也称为球入箱模型)下矩阵分解中差分隐私模型训练的隐私放大。Choquette-Choo等人(2025)提出了一种基于采样的蒙特卡洛方法来计算放大参数,但其保证要么仅在高概率下成立,要么需要机制的随机放弃。此外,确保(ε,δ)-DP所需的样本数与δ成反比。相反,我们开发了基于Rényi散度和条件组合的无采样界限。前者通过动态规划公式高效计算界限,后者通过提供更强的隐私保证来补充,特别是在小ε的情况下,Rényi散度界限本质上导致过估计。我们的框架适用于任意带状和非带状矩阵。通过数值比较,我们展示了我们的方法在广泛使用的矩阵机制中的有效性。

英文摘要

We study privacy amplification for differentially private model training with matrix factorization under random allocation (also known as the balls-in-bins model). Recent work by Choquette-Choo et al. (2025) proposes a sampling-based Monte Carlo approach to compute amplification parameters in this setting. However, their guarantees either only hold with some high probability or require random abstention by the mechanism. Furthermore, the required number of samples for ensuring $(ε,δ)$-DP is inversely proportional to $δ$. In contrast, we develop sampling-free bounds based on Rényi divergence and conditional composition. The former is facilitated by a dynamic programming formulation to efficiently compute the bounds. The latter complements it by offering stronger privacy guarantees for small $ε$, where Rényi divergence bounds inherently lead to an over-approximation. Our framework applies to arbitrary banded and non-banded matrices. Through numerical comparisons, we demonstrate the efficacy of our approach across a broad range of matrix mechanisms used in research and practice.

2601.20761 2026-05-18 cs.IT math.IT math.ST stat.TH

Anytime-Valid Quantum State Tomography via Confidence Sequences

基于置信序列的 anytime 量子态重构

Aldo Cumitini, Luca Barletta, Osvaldo Simeone

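AI summary This paper proposes an anytime-valid quantum state tomography method based on confidence sequences. It attaches to each current state estimate a confidence set that contains the true state with a user-defined probability, giving a rigorous quantification of estimation uncertainty.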

Comments Paper submitted to an IEEE journal

详情
AI中文摘要

在本文中,我们解决开发在测量序列过程中始终保持有效性的量子态重构(QST)方法的问题。具体而言,目标是提供严格量化当前态估计不确定性的方法,随着数据的逐步获取。为此,所提出的框架通过将当前状态点估计与保证以用户定义的概率包含真实量子态的置信集关联起来,来增强现有的QST技术。该方法基于最近在 anytime-valid 置信序列中的统计进展。数值结果验证了所提出 anytime-valid QST 的理论覆盖性质。

英文摘要

In this letter, we address the problem of developing quantum state tomography (QST) methods that remain valid at any time during a sequence of measurements. Specifically, the aim is to provide a rigorous quantification of the uncertainty associated with the current state estimate as data are acquired incrementally. To this end, the proposed framework augments existing QST techniques by associating current point estimates of the state with confidence sets that are guaranteed to contain the true quantum state with a user-defined probability. The methodology is grounded in recent statistical advances in anytime-valid confidence sequences. Numerical results confirm the theoretical coverage properties of the proposed anytime-valid QST.
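
The idea of a confidence sequence can be illustrated on the simplest possible target: the probability of a single binary measurement outcome. The sketch below uses Hoeffding's inequality with a union bound over time, a textbook and deliberately conservative construction; it is not the construction used in the paper and does not address full quantum states. The true probability and the error level are assumptions for illustration.

```python
import math
import random

def hoeffding_confidence_sequence(outcomes, delta):
    """Intervals that contain the true mean of [0, 1]-valued i.i.d. outcomes at ALL times
    simultaneously with probability >= 1 - delta (union bound, delta_t = delta / (t (t + 1)))."""
    intervals = []
    total = 0.0
    for t, x in enumerate(outcomes, start=1):
        total += x
        delta_t = delta / (t * (t + 1))            # these error budgets sum to delta
        radius = math.sqrt(math.log(2 / delta_t) / (2 * t))
        mean = total / t
        intervals.append((max(0.0, mean - radius), min(1.0, mean + radius)))
    return intervals

random.seed(1)
true_p = 0.3   # hypothetical probability of observing outcome 1
outcomes = [1 if random.random() < true_p else 0 for _ in range(2000)]
intervals = hoeffding_confidence_sequence(outcomes, delta=0.05)

print("after 10 shots:  ", intervals[9])
print("after 2000 shots:", intervals[-1])
```

Because the guarantee holds at every time step at once, an experimenter can stop collecting measurements whenever the interval is narrow enough without invalidating the coverage statement.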

2512.14473 2026-05-18 math.ST stat.TH

Sharp convergence rates for Spectral methods via the feature space decomposition method

通过特征空间分解方法获得谱方法的精确收敛率

Guillaume Lecué, Zhifan Li, Zong Shang

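AI summary Using the feature space decomposition method, this paper obtains, under fairly general conditions, matching upper and lower bounds on the population excess risk of spectral methods in linear regression. This yields a pre-order on spectral methods by convergence rate, generalizes the saturation effect from inverse problems, and gives conditions for its occurrence.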

详情
AI中文摘要

在本文中,我们应用在[LS24, GLS25, LSSW26, ALSS26]中开发的特征空间分解(FSD)方法,以在较为一般条件下获得谱方法在线性回归中平方损失下的总体超额风险的匹配上界和下界,对于每一个协方差和信号。这一结果使我们能够在给定的线性回归问题中,根据收敛率对谱方法集进行排序,从而表征哪种谱算法对于该特定问题更优。此外,这使我们能够推广反问题中提出的饱和效应,并提供其发生的必要和充分条件。我们的方法还表明,在广泛条件下,任何谱算法在单指数学习等问题中都无法超越信息指数的障碍。

英文摘要

In this paper, we apply the Feature Space Decomposition (FSD) method developed in [LS24, GLS25, LSSW26, ALSS26] to obtain, under fairly general conditions, matching upper and lower bounds for the population excess risk of spectral methods in linear regression under the squared loss, for every covariance and every signal. This result enables us, for a given linear regression problem, to define a pre-order on the set of spectral methods according to their convergence rates, thereby characterizing which spectral algorithm is superior for that specific problem. Furthermore, this allows us to generalize the saturation effect proposed in inverse problems and to provide necessary and sufficient conditions for its occurrence. Our method also shows that, under broad conditions, no spectral algorithm can overcome the barrier of the information exponent in problems such as single-index learning.

2511.18225 2026-05-18 cs.LG stat.ML stat.OT

Adaptive Conformal Prediction for Quantum Machine Learning

适应性符合预测用于量子机器学习

Douglas Spencer, Samual Nicholls, Michele Caprio

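AI summary This paper proposes Adaptive Quantum Conformal Prediction, which addresses the effect of time-varying noise in quantum processors on conformal guarantees by maintaining validity through repeated recalibration. Experiments on an IBM quantum processor show that it reaches the target coverage with greater stability.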

Comments Accepted at TMLR 05/2026. 27 pages, 5 figures

详情
Journal ref
Transactions on Machine Learning Research, May 2026, ISSN 2835-8856
AI中文摘要

量子机器学习旨在利用量子计算机改进经典机器学习算法。目前,量子领域仍缺乏稳健的不确定性量化方法,尽管需要可靠和可信的预测。最近的工作引入了量子符合预测框架,该框架能产生保证包含真实结果的概率预测集。本文正式阐述了量子处理器中固有的时间变化噪声如何即使在校准和测试数据可交换的情况下也会破坏符合保证。为解决这一挑战,我们借鉴了适应性符合推断方法,该方法通过重复校准在时间上保持有效性。我们引入了适应性量子符合预测(AQCP)算法,该算法在任意硬件噪声条件下提供渐近平均覆盖率保证。在IBM量子处理器上的实验证明,AQCP实现了目标覆盖率并表现出比量子符合预测更大的稳定性。

英文摘要

Quantum machine learning seeks to leverage quantum computers to improve upon classical machine learning algorithms. Currently, robust uncertainty quantification methods remain underdeveloped in the quantum domain, despite the critical need for reliable and trustworthy predictions. Recent work has introduced quantum conformal prediction, a framework that produces prediction sets that are guaranteed to contain the true outcome with a user-specified probability. In this work, we formalise how the time-varying noise inherent in quantum processors can undermine conformal guarantees, even when calibration and test data are exchangeable. To address this challenge, we draw on Adaptive Conformal Inference, a method which maintains validity over time via repeated recalibration. We introduce Adaptive Quantum Conformal Prediction (AQCP), an algorithm which provides asymptotic average coverage guarantees under arbitrary hardware noise conditions. Empirical studies on an IBM quantum processor demonstrate that AQCP achieves the target coverage level and exhibits greater stability than quantum conformal prediction.
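
Adaptive Conformal Inference, which the abstract builds on, adjusts the working miscoverage level online with the update $\alpha_{t+1} = \alpha_t + \gamma(\alpha - \mathrm{err}_t)$. The sketch below applies this generic update to synthetic classical scores whose test-time noise is larger than at calibration. It is not the AQCP algorithm; the score distributions, step size, and helper names are assumptions for illustration.

```python
import math
import random

def aci_update(alpha_t, target_alpha, err_t, gamma):
    """One ACI step: alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t)."""
    return alpha_t + gamma * (target_alpha - err_t)

def empirical_quantile(sorted_scores, level):
    """Conformal-style quantile, with infinite thresholds outside the attainable range."""
    if level <= 0.0:
        return -math.inf
    k = math.ceil(level * (len(sorted_scores) + 1))
    if k > len(sorted_scores):
        return math.inf
    return sorted_scores[k - 1]

random.seed(0)
target_alpha, gamma, T = 0.1, 0.01, 5000
cal = sorted(abs(random.gauss(0, 1.0)) for _ in range(1000))   # calibration-time scores
static_threshold = empirical_quantile(cal, 1 - target_alpha)

alpha_t = target_alpha
adaptive_errors, static_errors = [], []
for _ in range(T):
    score = abs(random.gauss(0, 1.5))                # hypothetical drifted test-time score
    threshold = empirical_quantile(cal, 1 - alpha_t)
    err = 1 if score > threshold else 0              # 1 means the prediction set missed
    adaptive_errors.append(err)
    static_errors.append(1 if score > static_threshold else 0)
    alpha_t = aci_update(alpha_t, target_alpha, err, gamma)

avg_adaptive = sum(adaptive_errors) / T
avg_static = sum(static_errors) / T
print("static miscoverage:", avg_static, "adaptive miscoverage:", avg_adaptive)
```

With a fixed threshold the miss rate drifts well above the 10% target, while the adaptive level pulls the long-run miss rate back towards it without any assumption on how the scores drift.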

2510.24539 2026-05-18 stat.ME

Unbiased likelihood estimation of the Langevin diffusion for animal movement modelling

兰格vin扩散在动物运动建模中的无偏似然估计

Ron R. Togunov, S. Knutsen Furset, Martin E. Pettersen, Robert B. O'Hara

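AI summary This paper proposes importance sampling with Brownian bridges to improve likelihood estimation for the Langevin diffusion model, addressing autocorrelation and temporal irregularity in telemetry data and improving the accuracy of ecological habitat selection studies.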

详情
AI中文摘要

动物生态学中持续存在的挑战是开发能够考虑测距数据中自相关性和时间不规则性的运动模型。连续时间兰格vin扩散模型已被提出用于建模时间自相关和不规则采样数据。然而,当前的估计技术在观测间隔增加时会获得越来越偏的参数估计。本文提出利用布朗桥在重要性采样方案中改进兰格vin扩散模型的似然估计。在一系列模拟研究中,我们展示了我们的方法在各种场景下有效去除了偏倚。我们发现,数据跨度更长但采样频率较低时,估计的栖息地系数的精度提高。这表明该模型可能更适合用于采样分辨率较低的数据集,这在使用旧一代动物标签收集的数据集时很常见。我们利用斯特勒海狮(Eumetopias jubatus)的跟踪数据展示了本模型的应用。我们发现系数估计值收敛到显著不同于以前研究估计值的值,表明传统估计方法中的偏倚可能对栖息地偏好结论产生重大影响。这些改进拓宽了兰格vin扩散模型的应用范围,从而提高了对栖息地选择的生态见解。

英文摘要

An ongoing challenge in animal ecology is developing movement models that account for the autocorrelation, and often temporal irregularity, in telemetry data. Continuous-time Langevin diffusion models have been proposed to model temporally autocorrelated and irregularly sampled data. However, current estimation techniques obtain increasingly biased parameter estimates as the time between observations increases. In this paper, we propose using Brownian bridges in an importance sampling scheme to improve the likelihood approximation of the Langevin diffusion model. In a series of simulation studies, we showed that our approach effectively removed the bias under various scenarios. We found that the precision of the estimated habitat coefficients increased for data spanning a longer duration at a lower frequency than for shorter, more frequently sampled tracks. This suggests that the model may be well suited for modelling tracking data sampled at a coarser resolution, as is common in datasets collected with older generations of animal tags. We illustrated the application of our model using tracking data from Steller sea lions (Eumetopias jubatus). We found that the coefficient estimates converged to values significantly different from those estimated in previous studies, suggesting that bias in conventional estimation methods may meaningfully affect ecological conclusions about habitat preference. Together, these improvements broaden the applicability of Langevin diffusion models, thereby improving ecological insight into habitat selection.
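
The Brownian bridge proposal at the heart of such an importance sampling scheme can be sketched in one dimension as below: intermediate positions are drawn between two observed locations, conditional on both endpoints. The importance weights that would correct for the Langevin drift are omitted, and the function name, speed parameter `gamma`, and observation values are hypothetical rather than taken from the paper.

```python
import math
import random

def brownian_bridge(x0, x1, total_time, n_inner, gamma, rng):
    """Sample n_inner equally spaced interior points of a 1-D Brownian bridge with
    diffusion coefficient gamma, pinned at x0 (time 0) and x1 (time total_time)."""
    dt = total_time / (n_inner + 1)
    path, t_prev, x_prev = [x0], 0.0, x0
    for i in range(1, n_inner + 1):
        s = i * dt
        remaining = total_time - t_prev
        # Conditional law of the bridge at time s given the previous point and the endpoint.
        mean = x_prev + (s - t_prev) / remaining * (x1 - x_prev)
        var = gamma ** 2 * (s - t_prev) * (total_time - s) / remaining
        x_prev = rng.gauss(mean, math.sqrt(var))
        t_prev = s
        path.append(x_prev)
    path.append(x1)
    return path

rng = random.Random(42)
# Two hypothetical consecutive telemetry fixes, 4 time units apart.
path = brownian_bridge(x0=0.0, x1=4.0, total_time=4.0, n_inner=3, gamma=0.5, rng=rng)
straight = brownian_bridge(x0=0.0, x1=4.0, total_time=4.0, n_inner=3, gamma=0.0, rng=rng)
print(path)
print(straight)
```

Setting `gamma` to zero collapses the bridge to linear interpolation between the two fixes, which is a quick sanity check on the conditional mean.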

2510.20163 2026-05-18 math.PR math.ST stat.TH

Topics in Probability, Parametric Estimation and Stochastic Calculus

概率、参数估计与随机分析中的专题

Levi Lopes de Lima

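AI summary This text systematically develops the core tools of parametric estimation, explores probability theory from a geometric perspective, covers concentration inequalities, limit theorems and related topics, and introduces Brownian motion and Itô's formula together with applications.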

Comments 201 pages; 2 figures; substantially rewritten in several parts to improve clarity and exposition, with new examples and contextual remarks added throughout; lots of typos fixed

详情
AI中文摘要

我们从概率论的基础开始,回顾其在现实问题中的重要应用:参数估计。文中系统发展这一主题,介绍集中不等式、极限定理、置信区间、最大似然估计、最小二乘和假设检验等核心工具,强调理论基础与实际相关性。通过几何视角探讨概率的不变性性质,特别是正态分布随机向量。附录介绍布朗运动和随机分析,最终得出伊藤公式。文章还展示了高斯集中不等式、费曼-科茨公式以及金融中的黑-索斯策略等应用。

英文摘要

We begin our journey by recalling the fundamentals of Probability Theory that underlie one of its most significant applications to real-world problems: Parametric Estimation. Throughout the text, we systematically develop this theme by presenting and discussing the main tools it encompasses (concentration inequalities, limit theorems, confidence intervals, maximum likelihood, least squares, and hypothesis testing) always with an eye toward both their theoretical underpinnings and practical relevance. While our approach follows the broad contours of conventional expositions, we depart from tradition by consistently exploring the geometric aspects of probability, particularly the invariance properties of normally distributed random vectors. This geometric perspective is taken further in an extended appendix, where we introduce the rudiments of Brownian motion and the corresponding stochastic calculus, culminating in Itô's celebrated change-of-variables formula. To highlight its scope and elegance, we present some of its most striking applications: the sharp Gaussian concentration inequality (a central example of the "concentration of measure phenomenon"), the Feynman-Kac formula (used to derive a path integral representation for the Laplacian heat kernel), and, as a concluding delicacy, the Black-Scholes strategy in Finance.

2510.18903 2026-05-18 stat.ME math.ST q-fin.ST stat.TH

Centered-Innovation MA for Bayesian Dirichlet ARMA: Theoretical Equivalence and an Application to Bank-Asset Shares

基于贝叶斯狄利克雷ARMA的中心创新MA:理论等价性及对银行资产份额的应用

Harrison Katz

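AI summary This paper studies a minimal change to Bayesian Dirichlet ARMA for compositional time series: replacing the raw additive log-ratio residual with a centered innovation. It proves a first-order (in 1/ϕ) equivalence between the centered specification and a digamma-link DARMA at fixed parameters, and compares predictive performance on bank-asset share data.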

详情
AI中文摘要

我们研究了对组合时间序列的贝叶斯狄利克雷ARMA(B--DARMA)进行最小修改:将移动平均块中的原始加性对数比(ALR)残差替换为一个中心创新,该创新减去狄利克雷条件ALR均值,可通过digamma恒等式得到闭合形式。我们证明了在固定参数下,中心化规格与digamma链接DARMA在1/ϕ阶上的等价性,前提是显式的内部和滞后稳定性条件成立。结果澄清了为何在高精度 regime 中两个规格应预测上不可区分,但本身并不控制重新估计产生的贝叶斯后验的几何结构。在每周联邦储备委员会H.8银行资产份额(2015年10月至2025年10月,T=522周)上,预测性能在104个滚动周起始点上在所有精度指标上统计上不可区分,而原始规格下的哈密顿蒙特卡罗发散转换在孤立的滚动拟合中大约更频繁一个数量级,这由局部的滚动拟合引起,原始后验表现出局部病态。四参考敏感性分析证实了预测等价性是参考不变的,并且中心化在不同参考下保持几何优势,但随原始病态拟合的普遍性而变化,从贷款参考的显著减少到现金参考的平局。实际意义是操作而非预测:中心化避免了在孤立滚动起始点出现的原始MA发散尖峰,这对生产流程中后验模拟用于下游压力测试至关重要。该调整是分析且插件式的,只需对MA创新计算进行局部修改。

英文摘要

We study a minimal change to an observation-driven Bayesian Dirichlet ARMA (B-DARMA) for compositional time series: replace the raw additive log-ratio (ALR) residual in the moving-average block with a centered innovation that subtracts the Dirichlet conditional ALR mean, available in closed form via digamma identities. We prove a recursion-level first-order equivalence (in $1/\phi$) between the centered specification and a digamma-link DARMA at fixed parameters, under explicit interior and lag-stability conditions. The result clarifies why the two specifications should be predictively indistinguishable in the high-precision regime but does not by itself govern the geometry of the Bayesian posteriors that re-estimation produces. On weekly Federal Reserve H.8 bank-asset shares (October 2015 through October 2025, $T=522$ weeks), predictive performance is statistically indistinguishable across $104$ rolling weekly origins on every accuracy metric examined, while Hamiltonian Monte Carlo divergent transitions are approximately an order of magnitude more frequent under the raw specification, driven by isolated rolling fits at which the raw posterior exhibits localized pathologies. A four-reference sensitivity analysis confirms that predictive equivalence is reference-invariant and that the geometric advantage of centering is preserved across references but varies with the prevalence of pathological raw fits, from a substantial reduction at the loans reference to parity at the cash reference. The practical implication is operational rather than predictive: centering avoids the catastrophic raw-MA divergence spikes that occur at isolated rolling origins, which matters for production workflows in which posterior simulation feeds downstream stress tests. The adjustment is analytic and plug-in, and requires only a local change to the MA innovation calculation.
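
The digamma identity behind the centered innovation can be written out as follows. The notation ($y_t$ for the composition, $\mu_t$ for the conditional mean, $\phi$ for the precision, component $J$ as the ALR reference, $\mathcal{F}_{t-1}$ for the past) is assumed here for illustration and may not match the paper's; the last line uses the standard expansion $\psi(x) = \log x - \tfrac{1}{2x} + O(x^{-2})$.

```latex
% Assumed notation: y_t on the simplex, \mu_t the conditional mean, \phi the precision,
% component J the ALR reference, \psi the digamma function.
\begin{align*}
y_t \mid \mathcal{F}_{t-1} &\sim \mathrm{Dirichlet}(\phi\,\mu_t), \\
\mathbb{E}\!\left[\log\frac{y_{t,j}}{y_{t,J}} \,\middle|\, \mathcal{F}_{t-1}\right]
  &= \psi(\phi\,\mu_{t,j}) - \psi(\phi\,\mu_{t,J}), \\
\epsilon^{\mathrm{c}}_{t,j}
  &= \log\frac{y_{t,j}}{y_{t,J}} - \bigl[\psi(\phi\,\mu_{t,j}) - \psi(\phi\,\mu_{t,J})\bigr], \\
\psi(\phi\,\mu_{t,j}) - \psi(\phi\,\mu_{t,J})
  &= \log\frac{\mu_{t,j}}{\mu_{t,J}}
     - \frac{1}{2\phi}\left(\frac{1}{\mu_{t,j}} - \frac{1}{\mu_{t,J}}\right) + O(\phi^{-2}).
\end{align*}
```

The final line shows that the centered mean and the plain log-ratio of the mean components differ only at order $1/\phi$, which is consistent with the abstract's statement that the two specifications are hard to tell apart in the high-precision regime.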

2509.22739 2026-05-18 cs.CL cs.AI cs.LG stat.ML

Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models

无痛激活导向:一种自动化、轻量级的微调大型语言模型方法

Sasha Cui, Zhongren Chen

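AI summary This paper proposes Painless Activation Steering, a fully automated approach that uses labeled data to improve model performance without human intervention. It works reliably on behavior tasks but has limited effect on intelligence-oriented tasks.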

详情
AI中文摘要

语言模型通常通过权重或提示导向进行微调,但前者耗时昂贵,后者控制不精确且需手动试错。激活导向(AS)提供了一种更经济、快速且可控的替代方法,但现有技术需人工构造提示对或进行大量特征标注,不如RL和SFT等方法方便。本文引入Painless Activation Steering(PAS),一种完全自动的方法,可利用任何标注数据集进行AS,无需提示构造、特征标注或人工干预。在三个开源模型和18个任务上评估PAS,发现其在行为任务中性能可靠,但对智能任务效果有限。 introspective variant(iPAS)在偏差、道德和对齐任务上分别提升了10.1%、5.2%和34.8%。此外,PAS在上下文学习(ICL)和SFT基础上还提供了额外增益。PAS构建了一个快速、轻量的激活向量,可低成本训练、存储和激活。实验结果为AS的应用提供了明确的指导,展示了其作为实用自动化微调方法的潜力。

英文摘要

Language models (LMs) are typically post-trained for desired capabilities and behaviors via weight-based or prompt-based steering, but the former is time-consuming and expensive, and the latter is not precisely controllable and often requires manual trial-and-error. While activation steering (AS) promises a cheap, fast, and controllable alternative to the two existing post-training methods, current AS techniques require hand-crafted prompt pairs or labor-intensive feature annotation, making them more inconvenient than the plug-and-play methods such as Reinforcement Learning (RL) and Supervised Fine-Tuning (SFT). We introduce Painless Activation Steering (PAS), a family of fully automated methods that make AS readily usable with any given labeled dataset, with no need for prompt construction, feature labeling, or human intervention. We evaluate PAS on three open-weight models (Llama3.1-8B-Instruct, DeepSeek-R1-Distill-8B, and Nous-Hermes-2) and 18 tasks; we find that PAS reliably improves performance for behavior tasks, but not for intelligence-oriented tasks. The introspective variant (iPAS) delivers the strongest causal steering effects (10.1% on Bias, 5.2% on Morality, and 34.8% on Alignment). We also show PAS delivers additional gains on top of In-Context Learning (ICL) and SFT. PAS constructs a fast, lightweight activation vector that can be cheaply trained, easily stored, and activated at will. Our results provide a characterization of where AS helps, where it fails, and how to deploy it as a practical, automated LM post-training option.

2508.14690 2026-05-18 stat.ME

Nesting a Target Study within a Target Trial: A Framework for Evaluating Intervention Effects on Disparities

将目标研究嵌入目标试验:评估干预对不平等影响的框架

Xinyi Sun, Theodore J. Iwashyna, Emmanuel F. Drabo, Deidra C. Crews, Kadija Ferryman, John W. Jackson

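AI summary This paper proposes the TS+TT framework, which grounds the measurement of disparity in ethical assumptions, combines stratified sampling with randomization within social groups to evaluate how interventions affect disparities, and extends G-computation to continuous stochastic interventions.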

Comments Main text: 23 pages, 4 tables; Appendix: 45 pages

详情
AI中文摘要

我们提出了一种新颖的框架(TS+TT),用于将目标研究(TS)嵌入目标试验(TT)中,以评估干预对不平等的影响。TS部分基于允许性概念,将不平等的测量根植于伦理假设,并将其锚定在特定时间内的明确人群。它指定了分层抽样计划,以获得在允许的协变量上社会群体分布相似的样本。在该样本中,TT部分在每个社会群体内随机化干预策略。由于社会群体在基线时在允许的协变量上处于相似位置,并且在社会群体内分配的干预组是可交换的,TS+TT反映了评估干预如何影响不平等的有意义的因果估计量。我们描述了该框架的关键组成部分、其模拟以及其在评估假设干预对脉搏血氧仪偏见影响治疗获取不平等的临床护理中的应用。我们还扩展了半参数G计算法,以适应连续随机干预,并估计时间到事件结果的因果不平等。TS+TT框架提供了一种灵活且政策相关的方法,用于生成具有伦理意识的因果证据,以减少不平等并避免加剧不平等。

英文摘要

We present a novel framework (TS+TT) to nest a Target Study (TS) within a Target Trial (TT) for evaluating the effects of interventions on disparities. The TS component grounds the measurement of disparity in ethical assumptions, based on the concept of allowability, and anchors it to an explicit population within calendar time. It specifies an enrollment plan of stratified sampling of eligible persons to yield a sample where social groups are distributionally similar on covariates deemed allowable for measuring disparity. Within this enrolled sample, the TT component specifies randomization of intervention strategies within each social group. Because social groups are similarly situated on allowable covariates at baseline, and because assigned intervention arms are exchangeable within social groups, TS+TT reflects a meaningful causal estimand for evaluating how interventions impact disparity. We describe the framework's key components, its emulation, and demonstrate its application to evaluate how hypothetical interventions on pulse oximeter bias affect disparities in treatment receipt in clinical care. We also extend semiparametric G-computation to accommodate continuous stochastic interventions and estimate counterfactual disparities in time-to-event outcomes. The TS+TT framework offers a versatile and policy-relevant approach for generating ethically informed causal evidence to reduce disparities and avoid exacerbating disparities.

2507.15475 2026-05-18 eess.SP math.PR stat.AP

On the Distribution of a Two-Dimensional Random Walk with Restricted Angles

二维受限角度随机游走的分布

Karl-Ludwig Besser

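AI summary Studies the distribution of a two-dimensional random walk with restricted step angles: derives the joint and marginal distributions for two steps, gives numerical solutions for a general number of steps and approximations for a large number of steps, and characterizes the support exactly.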

Comments 14 pages, 14 figures

详情
AI中文摘要

本文推导了二维(复数)随机游走的分布,其中每一步的角度被限制在圆的一个子集。这种设置出现在信号处理中的空中计算等领域。特别地,我们推导了两步的联合和边缘分布,给出了任意步数的数值解,并对大步数提供了近似解。此外,我们为任意步数提供了支持集的精确描述。本文的结果为未来涉及此类问题的研究提供了参考。

英文摘要

In this paper, we derive the distribution of a two-dimensional (complex) random walk in which the angle of each step is restricted to a subset of the circle. This setting appears in various domains, such as in over-the-air computation in signal processing. In particular, we derive the exact joint and marginal distributions for two steps, numerical solutions for a general number of steps, and approximations for a large number of steps. Furthermore, we provide an exact characterization of the support for an arbitrary number of steps. The results in this work provide a reference for future work involving such problems.
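
The setting can be explored numerically with a small Monte Carlo sketch. The abstract only says that each step angle is restricted to a subset of the circle; the unit step length, the uniform angle distribution, and the symmetric interval `[-max_angle, max_angle]` below are illustrative assumptions, not the paper's model.

```python
import cmath
import math
import random


def restricted_walk(n_steps, max_angle, rng):
    """Endpoint of a complex random walk with restricted step angles.

    Assumption (not from the paper): unit-length steps whose angles are
    uniform on [-max_angle, max_angle].
    """
    position = 0j
    for _ in range(n_steps):
        theta = rng.uniform(-max_angle, max_angle)
        position += cmath.exp(1j * theta)
    return position


rng = random.Random(0)
n_steps, max_angle = 5, math.pi / 4
samples = [restricted_walk(n_steps, max_angle, rng) for _ in range(10_000)]

# Simple support checks: no endpoint can be farther than n_steps from the
# origin, and with |theta| <= pi/4 every step adds at least cos(pi/4) to
# the real part.
max_abs = max(abs(z) for z in samples)
min_real = min(z.real for z in samples)
```

Histogramming `samples` gives an empirical picture of the endpoint distribution and its support for a chosen number of steps.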

2507.02032 2026-05-18 hep-ph hep-ex physics.data-an stat.ML

Neural simulation-based inference of the Higgs trilinear self-coupling via off-shell Higgs production

基于神经模拟的Higgs三线性自耦合推断:通过非壳Higgs生产

Aishik Ghosh, Maximilian Griese, Ulrich Haisch, Tae Hyoun Park

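AI summary This paper proposes a hybrid neural simulation-based inference approach for the Higgs trilinear self-coupling that incorporates Standard Model effective field theory modifications and background processes, and provides expected constraints for the high-luminosity Large Hadron Collider.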

Comments 27 pages, 17 figures, 2 tables; v2: revised and improved version of the manuscript as accepted for publication in EPJC

详情
AI中文摘要

粒子物理中的一项重大挑战是实验确定Higgs三线性自耦合。尽管研究主要集中在质子-质子碰撞中的壳内双Higgs和单Higgs生产,非壳Higgs生产也被提出作为有价值的补充探测手段。本文设计了一种混合神经模拟基于推断(NSBI)方法,以构建包含标准模型有效场论(SMEFT)修改、相关背景过程和量子干涉效应的Higgs信号似然性。该方法利用矩阵元增强技术的训练效率,对于稳健的SMEFT应用至关重要,同时结合基于分类方法的实用优势以获得有效的背景估计。我们证明了NSBI方法的灵敏度接近理论最优,并提供了预期的高亮度升级大型强子对撞机的约束。虽然我们主要关注Higgs三线性自耦合,但也考虑了影响非壳Higgs生产其他SMEFT算符的约束。

英文摘要

One of the forthcoming major challenges in particle physics is the experimental determination of the Higgs trilinear self-coupling. While efforts have largely focused on on-shell double- and single-Higgs production in proton-proton collisions, off-shell Higgs production has also been proposed as a valuable complementary probe. In this article, we design a hybrid neural simulation-based inference (NSBI) approach to construct a likelihood of the Higgs signal incorporating modifications from the Standard Model effective field theory (SMEFT), relevant background processes, and quantum interference effects. It leverages the training efficiency of matrix-element-enhanced techniques, which are vital for robust SMEFT applications, while also incorporating the practical advantages of classification-based methods for effective background estimates. We demonstrate that our NSBI approach achieves sensitivity close to the theoretical optimum and provide expected constraints for the high-luminosity upgrade of the Large Hadron Collider. While we primarily concentrate on the Higgs trilinear self-coupling, we also consider constraints on other SMEFT operators that affect off-shell Higgs production.

2504.20268 2026-05-18 stat.AP

Spatio-temporal fusion of reanalysis and in situ data for censored threshold exceedances of PM2.5

PM2.5阈值超量的再分析与实地数据空间-时间融合

M. Daniela Cuba, Craig Wilkie, Marian Scott, Daniela Castro-Camilo

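AI summary This paper proposes a Bayesian hierarchical data fusion framework based on extreme value theory. It uses the Dirac-delta generalised Pareto distribution to jointly account for threshold and non-threshold exceedances, and predicts PM2.5 threshold exceedances in the Greater London region better than traditional Gaussian-based models.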

详情
AI中文摘要

数据融合模型广泛用于空气质量监测,整合实地数据和大范围网格化产品,提供空间完整和时间详细的估计。然而,传统高斯模型常低估极端污染值,导致风险评估偏差。为此,本文提出基于极值理论的贝叶斯分层数据融合框架,使用Dirac-delta广义帕雷托分布联合考虑阈值和非阈值超量,同时保留超量和非超量事件的时间信息。我们的模型用于描述和预测伦敦地区PM2.5污染的censored阈值超量,使用CAMS大气成分再分析数据和英国政府运营的自动城乡网络(AURN)实地观测站。关键特点包括结合不同空间-时间分辨率的数据,并完全考虑参数不确定性。结果表明,我们的模型在大多数观测站点预测阈值超量时优于高斯模型和单独再分析数据,甚至能产生比背景数据更明显的PM2.5污染空间模式。此外,我们的方法捕捉到更大的变异性与空间模式,如沿海地区更高的PM2.5浓度,这些在再分析数据中不明显。

英文摘要

Data fusion models are widely used in air quality monitoring to integrate in situ and large-scale gridded products, offering spatially complete and temporally detailed estimates. However, traditional Gaussian-based models often underestimate extreme pollution values, leading to biased risk assessments. To address this, we present a Bayesian hierarchical data fusion framework rooted in extreme value theory, using the Dirac-delta generalised Pareto distribution to jointly account for threshold and non-threshold exceedances while preserving the timing of exceedance and non-exceedance episodes. Our model is used to describe and predict censored threshold exceedances of PM2.5 pollution in the Greater London region by using CAMS atmospheric composition reanalysis, and in situ observation stations from the automatic urban and rural network (AURN) run by the UK government. Key features of our approach include combining data with varying spatio-temporal resolutions and fully accounting for parameter uncertainties. Results show that our model outperforms Gaussian-based alternatives and standalone reanalysis data in predicting threshold exceedances at the majority of observation sites and can even yield clearer spatial patterns of PM2.5 pollution than those discernible from the background data. Moreover, our approach captures greater variability and spatial patterns, such as higher PM2.5 concentrations near coastal areas, which are not evident in the reanalysis data alone.

2504.18522 2026-05-18 stat.ML cs.LG

Extrapolation Guarantees for Perturbation Modeling Under the Additive Latent Shift Assumption

在加性潜在位移假设下对扰动建模的外推保证

Julius von Kügelgen, Jakob Ketterer, Michael Vollenweider, Michael Scholkemper, Xinwei Shen, Nicolai Meinshausen, Jonas Peters

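AI summary This paper studies how to predict the distribution of measurements for new combinations of perturbations under the additive latent shift assumption, proposes the PDAE model, and proves extrapolation guarantees.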

Comments Updated preprint with new material and empirical results; previous version presented at the ICLR'25 Workshop on Learning Meaningful Representations of Life

详情
AI中文摘要

我们考虑了建模如基因敲除等扰动对测量(如单细胞RNA计数)的影响问题。给定某些扰动的数据,我们旨在预测新扰动组合的测量分布。为此,我们假设扰动在合适但未知的嵌入空间中是加性的。我们将数据生成过程建模为潜在变量模型,其中扰动相当于潜在空间中的均值位移,并且可以加性组合。我们证明,在训练扰动足够多样时,表示和扰动效应可识别到正交变换为止,并利用此推导出对未见扰动的外推保证,这些未见扰动可表示为已见扰动的线性组合。为了从数据中估计模型,我们提出扰动分布自编码器(PDAE),该模型通过最大化真实与模拟扰动分布之间的分布相似性进行训练。训练后的模型可用于预测之前未见的扰动分布。为了支持我们的理论结果,我们通过模拟展示了PDAE能够准确预测未见但可识别的扰动效应,并在组合基因扰动数据上展示了该方法。

英文摘要

We consider the problem of modeling the effects of perturbations like gene knockouts on measurements such as single-cell RNA counts. Given data for some perturbations, we aim to predict the distribution of measurements for new combinations of perturbations. To address this challenging extrapolation task, we posit that perturbations act additively in a suitable, unknown embedding space. We formulate the data-generating process as a latent variable model, in which perturbations amount to mean shifts in latent space and can be combined additively. We then prove that, given sufficiently diverse training perturbations, the representation and perturbation effects are identifiable up to orthogonal transformation and use this to derive extrapolation guarantees for unseen perturbations that can be expressed as linear combinations of seen ones. To estimate the model from data, we propose the perturbation distribution autoencoder (PDAE), which is trained by maximizing the distributional similarity between true and simulated perturbation distributions. The trained model can then be used to predict previously unseen perturbation distributions. In support of our theoretical results, we demonstrate through simulations that PDAE can accurately predict the effects of unseen but identifiable perturbations, and showcase the method on combinatorial gene perturbation data.

2503.23927 2026-05-18 stat.ML cs.LG

Detecting Localized Density Anomalies in Multivariate Data via Coin-Flip Statistics

通过硬币翻转统计检测多变量数据中的局部密度异常

Sebastian Springer, Andre Scaffidi, Maximilian Autenrieth, Gabriella Contardo, Alessandro Laio, Roberto Trotta, Heikki Haario

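AI summary This paper proposes EagleEye, which encodes k-nearest-neighbour lists as binary sequences to detect local over- and under-densities in multivariate data, and demonstrates its efficacy in three scenarios.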

Comments Code Availability: The code used to generate the results of this study is available at GitHub via the link: https://github.com/sspring137/EagleEye

详情
AI中文摘要

检测两个样本之间的局部差异是科学数据分析的核心任务,用于识别信号事件、制度变化或模型不匹配。我们引入EagleEye方法,通过将有序k近邻列表编码为二进制成员序列,并测试该序列中累积成功次数是否与二项式(硬币翻转)空模型一致,来定位多变量特征空间中的局部过密度和欠密度。在存在真实局部异常时,邻居将优先属于其中一个数据集,导致相对于二项式空模型的“成功”次数过多。这些局部点检测通过确定性细化程序整合为可解释的异常集,同时可以估计不可约背景和局部密度异常纯度。我们通过三种场景展示了EagleEye的有效性:首先考虑具有已知局部过密度和欠密度的人工数据示例;其次展示EagleEye在粒子对撞机实验中检测新物理现象时在系统背景建模差异下的应用;最后进行气候分析研究,揭示了时空温度模式重复中的局部变化。

英文摘要

Detecting localized differences between two samples is a central task in scientific data analysis, required for the identification of signal events, regime changes, or model mismatch. We introduce EagleEye, a method that pinpoints local over- and under-densities in multivariate feature spaces. EagleEye assigns each point an anomaly score by encoding its ordered k-nearest-neighbour list as a binary membership sequence and testing whether the cumulative number of successes in this sequence is consistent with a binomial (coin-flipping) null model. In the presence of a genuine local anomaly, neighbours will preferentially belong to one of the two datasets, yielding an excess of "successes" relative to the binomial null model. These local, pointwise detections are consolidated into interpretable anomaly sets through a deterministic refinement procedure that can also estimate the irreducible background and local density anomaly purity. We demonstrate EagleEye's efficacy in three scenarios. We first consider an artificial data example with known localized over- and under-densities. Second, we demonstrate how EagleEye may be used for new physics searches at particle collider experiments in the presence of systematic background modelling differences. Finally, we conduct a climate analysis study that reveals localized changes in spatiotemporal temperature-pattern recurrence.
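
The coin-flip idea can be illustrated with a simplified sketch. It is not the EagleEye implementation (see the repository linked in the Comments for that): it tests the success count at a single neighbourhood size `k` instead of the cumulative sequence, looks only for over-densities, and all function names and the toy data are hypothetical.

```python
import math
import random


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        math.comb(trials, j) * p**j * (1 - p) ** (trials - j)
        for j in range(successes, trials + 1)
    )


def membership_sequence(point, reference, test, k):
    """Binary sequence over the k nearest pooled neighbours (1 = from `test`)."""
    pooled = [(math.dist(point, x), 0) for x in reference]
    pooled += [(math.dist(point, x), 1) for x in test]
    pooled.sort(key=lambda pair: pair[0])
    return [label for _, label in pooled[:k]]


rng = random.Random(1)
reference = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
test = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
# Toy anomaly: inject a tight over-density into the test sample near (3, 3).
test += [(3 + rng.gauss(0, 0.1), 3 + rng.gauss(0, 0.1)) for _ in range(50)]

# Under the null, each neighbour is a "success" with the pooled test fraction.
p_null = len(test) / (len(test) + len(reference))
k = 20
seq = membership_sequence((3.0, 3.0), reference, test, k)
p_value = binomial_upper_tail(sum(seq), k, p_null)
```

A small `p_value` at a query point means its neighbourhood contains more test-sample members than a fair coin with bias `p_null` would plausibly produce.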

2503.15107 2026-05-18 stat.ML cs.LG

Interpretability of Graph Neural Networks to Assess Effects of Global Change Drivers on Ecological Networks

图神经网络的可解释性:评估全球变化驱动因素对生态网络的影响

Emre Anakok, Pierre Barbillon, Colin Fontaine, Elisa Thebault

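AI summary Uses graph neural networks to analyze how global change drivers affect pollination network connectivity, examines interactions between environmental covariates and plant genera, and tests how debiasing techniques influence the estimated effects.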

详情
AI中文摘要

传粉者在植物繁殖中起关键作用,无论是自然生态系统还是人类修改的景观。全球变化驱动因素,如气候变化或土地利用修改,会改变植物-传粉者相互作用。为了评估全球变化驱动因素对传粉的影响,需要大规模的相互作用、气候和土地利用数据。尽管最近的机器学习方法,如图神经网络(GNNs),允许分析此类数据集,但解释其结果具有挑战性。我们探索现有的GNN解释方法,以突出各种环境协变量对传粉网络连接性的影响。进行了广泛的模拟研究,以确认这些方法能否检测协变量与植物属之间的交互作用,以及去偏技术的应用是否影响这些效果的估计。对Spipoll数据集的应用,包括和不包括考虑采样效应,突显了土地利用对网络连接性潜在影响,并显示考虑采样效应部分改变了这些效果的估计。

英文摘要

Pollinators play a crucial role in plant reproduction, in both natural ecosystems and human-modified landscapes. Global change drivers, including climate change or land use modifications, can alter plant-pollinator interactions. To assess the potential influence of global change drivers on pollination, large-scale interactions, climate and land use data are required. While recent machine learning methods, such as graph neural networks (GNNs), allow the analysis of such datasets, interpreting their results can be challenging. We explore existing methods for interpreting GNNs in order to highlight the effects of various environmental covariates on pollination network connectivity. An extensive simulation study is performed to confirm whether these methods can detect the interactive effect between a covariate and a genus of plant on connectivity, and whether the application of debiasing techniques influences the estimation of these effects. An application to the Spipoll dataset, with and without accounting for sampling effects, highlights the potential impact of land use on network connectivity and shows that accounting for sampling effects partially alters the estimation of these effects.

2503.14311 2026-05-18 math.ST stat.ME stat.TH

Asymptotic properties of the MLE in distributional regression under random censoring

分布回归中随机截断下MLE的渐进行为

Gitte Kremling, Gerhard Dikta

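AI summary Studies the asymptotic properties of the MLE in distributional regression under random right censoring, proving almost sure consistency and asymptotic normality, and illustrates them with a simulation study and a real data example.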

详情
AI中文摘要

分布回归旨在从给定的参数条件分布族中找到最佳候选分布来建模给定数据集。由于分布族中的每个候选可以通过对应的分布参数识别,常用方法是使用最大似然估计器(MLE)估计参数。本文在响应变量受随机右截断的情况下,建立了该估计量的理论结果。特别地,我们证明了在截断下的MLE几乎处处一致性和渐近正态性。通过模拟研究和实际数据示例展示了经验行为。

英文摘要

Distributional regression aims to find the best candidate in a given parametric family of conditional distributions to model a given dataset. As each candidate in the distribution family can be identified by the corresponding distribution parameters, a common approach for this task is to use the maximum likelihood estimator (MLE) for the parameters. In this paper, we establish theoretical results for this estimator in case the response variable is subject to random right censoring. In particular, we provide proofs of almost sure consistency and asymptotic normality of the MLE under censoring. The empirical behavior is illustrated by a simulation study and a real data example.
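
To make the censored likelihood concrete, here is a minimal sketch for the simplest special case: an exponential model without covariates, which is an assumption chosen for illustration and not the paper's general setup. Observed events contribute the log-density and censored observations contribute the log-survival function.

```python
import math
import random


def censored_exp_loglik(rate, times, observed):
    """Log-likelihood of an exponential model under right censoring.

    An observed event contributes log(rate) - rate * t; a censored
    observation contributes the log-survival term -rate * t.
    """
    return sum(
        (math.log(rate) if d else 0.0) - rate * t
        for t, d in zip(times, observed)
    )


rng = random.Random(2)
true_rate, censor_rate = 2.0, 1.0  # hypothetical simulation settings
times, observed = [], []
for _ in range(5000):
    event = rng.expovariate(true_rate)
    censor = rng.expovariate(censor_rate)  # independent censoring time
    times.append(min(event, censor))
    observed.append(event <= censor)

# Closed-form maximiser for this model: events divided by total time at risk.
rate_hat = sum(observed) / sum(times)
```

With covariates, the rate would depend on a regression parameter and the maximiser would generally have to be found numerically.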

2412.11308 2026-05-18 stat.ML cs.LG

From XAI to MLOps: Explainable Concept Drift Detection with Profile Drift Detection

从XAI到MLOps:基于轮廓漂移检测的可解释概念漂移检测

Ugur Dar, Mustafa Cavus

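AI summary This paper proposes Profile Drift Detection, which uses Partial Dependence Profiles, an explainable AI tool, together with new drift metrics to detect concept drift and understand its causes. Experiments show that it maintains predictive performance while balancing the sensitivity and stability of drift signals.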

Comments 15 pages, 6 figures

详情
Journal ref
Future Generation Computer Systems (2026)
AI中文摘要

预测模型的性能往往因数据分布的变化而下降,这种现象称为数据漂移。其中,概念漂移(解释变量与响应变量之间的关系变化)尤其难以检测和适应。传统漂移检测方法通常依赖准确率或边缘变量分布等指标,可能无法捕捉到微妙但重要的概念变化。本文提出了一种新方法,轮廓漂移检测(PDD),通过利用可解释AI工具部分依赖性轮廓图(PDPs),实现了对概念漂移的检测和对其潜在原因的深入理解。PDD通过新的漂移度量标准量化PDPs的变化,这些度量标准对数据流中的变化敏感,同时保持计算效率。该方法与MLOps实践一致,强调在动态环境中持续的模型监控和适应性重训练。在合成和实际数据集上的实验表明,PDD在保持高预测性能的同时,有效平衡了漂移信号的敏感性和稳定性。结果突显了其在实时应用中的适用性,本文最后讨论了该方法的优势、限制以及向更广泛应用场景扩展的潜力。

英文摘要

Predictive models often degrade in performance due to evolving data distributions, a phenomenon known as data drift. Among its forms, concept drift, where the relationship between explanatory variables and the response variable changes, is particularly challenging to detect and adapt to. Traditional drift detection methods often rely on metrics such as accuracy or marginal variable distributions, which may fail to capture subtle but important conceptual changes. This paper proposes a novel method, Profile Drift Detection (PDD), which enables both the detection of concept drift and an enhanced understanding of its underlying causes by leveraging an explainable AI tool: Partial Dependence Profiles (PDPs). PDD quantifies changes in PDPs through new drift metrics that are sensitive to shifts in the data stream while remaining computationally efficient. This approach is aligned with MLOps practices, emphasizing continuous model monitoring and adaptive retraining in dynamic environments. Experiments on synthetic and real-world datasets demonstrate that PDD outperforms existing methods by maintaining high predictive performance while effectively balancing sensitivity and stability in drift signals. The results highlight its suitability for real-time applications, and the paper concludes by discussing the method's advantages, limitations, and potential extensions to broader use cases.
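
The core idea of comparing partial dependence profiles can be sketched as follows. The abstract does not specify the paper's drift metrics, so the root-mean-square gap used here is a stand-in, and the two models and the data are hypothetical.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    profile = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value
            total += predict(modified)
        profile.append(total / len(rows))
    return profile


def profile_distance(profile_a, profile_b):
    """Root-mean-square gap between two profiles on the same grid.

    Illustrative metric only; not one of the paper's drift metrics.
    """
    squared = [(a - b) ** 2 for a, b in zip(profile_a, profile_b)]
    return (sum(squared) / len(squared)) ** 0.5


# Two hypothetical fitted models: the effect of feature 0 flips after drift.
def model_before(x):
    return 2.0 * x[0] + x[1]


def model_after(x):
    return -2.0 * x[0] + x[1]


rows = [(i / 10, (i % 3) - 1.0) for i in range(30)]
grid = [0.0, 0.5, 1.0, 1.5, 2.0]

pdp_before = partial_dependence(model_before, rows, 0, grid)
pdp_after = partial_dependence(model_after, rows, 0, grid)
drift = profile_distance(pdp_before, pdp_after)
no_drift = profile_distance(pdp_before, pdp_before)
```

Because the profile is computed per feature, a large distance also points to which explanatory variable changed its relationship with the response.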

2406.02834 2026-05-18 stat.ME

Asymptotic inference with flexible covariate adjustment under rerandomization and stratified rerandomization

基于重新随机化和分层重新随机化的灵活协变量调整的渐近推断

Bingkai Wang, Fan Li

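AI summary Develops the asymptotic theory of a broader class of covariate-adjusted estimators under rerandomization and stratified rerandomization. It proves that the asymptotic linearity and influence function of any M-estimator are identical under simple randomization and rerandomization, although rerandomization may lead to a non-Gaussian asymptotic distribution, and it establishes the efficiency optimality of efficient estimators based on data-adaptive machine learners.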

详情
AI中文摘要

重新随机化是一种有效的治疗分配程序,用于控制基线协变量不平衡。对于估计平均治疗效应,重新随机化已被证明可以提高未调整和线性调整估计量的精度,而不会影响一致性。然而,重新随机化是否适用于更广泛的M估计量类,包括广义线性回归的g计算公式和双重鲁棒方法,以及更广泛的数据自适应机器学习的高效估计量,仍不清楚。本文发展了在重新随机化及其分层扩展下更广泛的协变量调整估计量的渐近理论。证明了在简单随机化和重新随机化下,任何M估计量的渐近线性及影响函数保持相同,但重新随机化可能导致非高斯渐近分布。我们进一步通过几个常见M估计量的例子解释,如果在最终估计量中适当调整重新随机化变量,则可以实现渐近正态性。这些结果扩展到分层重新随机化。最后,我们研究了基于数据自适应机器学习的高效估计量的渐近理论,并证明其在重新随机化和分层重新随机化下的效率最优性。我们的结果通过模拟和重新分析一个使用分层重新随机化的集群随机化实验得到验证。

英文摘要

Rerandomization is an effective treatment allocation procedure to control for baseline covariate imbalance. For estimating the average treatment effect, rerandomization has been previously shown to improve the precision of the unadjusted and the linearly-adjusted estimators over simple randomization without compromising consistency. However, it remains unclear whether such results apply more generally to the class of M-estimators, including the g-computation formula with generalized linear regression and doubly-robust methods, and more broadly, to efficient estimators with data-adaptive machine learners. In this paper, we develop the asymptotic theory for a more general class of covariate-adjusted estimators under rerandomization and its stratified extension. We prove that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. We further explain, drawing examples from several common M-estimators, that asymptotic normality can be achieved if rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. Finally, we study the asymptotic theory for efficient estimators based on data-adaptive machine learners, and prove their efficiency optimality under rerandomization and stratified rerandomization. Our results are demonstrated via simulations and re-analyses of a cluster-randomized experiment that used stratified rerandomization.
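
As a reminder of the allocation procedure itself, the sketch below redraws a balanced assignment until a covariate imbalance criterion is met. It assumes a single baseline covariate and an absolute difference-in-means criterion; the threshold and the data are hypothetical, and rerandomization schemes in practice often use a Mahalanobis distance over several covariates.

```python
import random
import statistics


def mean_difference(covariate, assignment):
    """Treated-minus-control difference in covariate means."""
    treated = [x for x, a in zip(covariate, assignment) if a == 1]
    control = [x for x, a in zip(covariate, assignment) if a == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def rerandomize(covariate, threshold, rng, max_draws=10_000):
    """Redraw a balanced assignment until the imbalance is acceptable."""
    n = len(covariate)
    assignment = [1] * (n // 2) + [0] * (n - n // 2)
    for _ in range(max_draws):
        rng.shuffle(assignment)
        if abs(mean_difference(covariate, assignment)) <= threshold:
            return list(assignment)
    raise RuntimeError("no acceptable assignment found")


rng = random.Random(3)
covariate = [rng.gauss(0, 1) for _ in range(100)]  # hypothetical baseline data
assignment = rerandomize(covariate, threshold=0.05, rng=rng)
```

Since only well-balanced assignments are accepted, the accepted assignment is no longer a simple random draw, which is why the sampling distribution of estimators under this design needs separate treatment.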

2312.13992 2026-05-18 stat.ME

Bayesian nonparametric boundary detection for multiple areal data

基于多重区域数据的贝叶斯非参数边界检测

Matteo Gianella, Mario Beraha, Alessandra Guglielmi

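AI summary Proposes a Bayesian nonparametric mixture model for boundary detection with multiple areal data. Using spatially dependent weights and a random number of components, it identifies boundaries between areas with different densities without external information, and is applied to income inequality in the greater Los Angeles region.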

详情
AI中文摘要

我们考虑了区域数据的边界检测问题,重点在于每个区域单元有多个观测值的情况。我们提出了一种贝叶斯非参数混合模型,用于区域特定的人口密度,具有空间依赖权重和随机组件数量。与之前的方法不同,我们不需要外部信息如区域特定协变量或相似性度量。通过利用每个区域的多个样本信息,能够识别出密度不同的区域边界。关键在于混合组件数量需要从数据中学习以获得有意义的边界检测,因为过度拟合的混合模型存在非识别性。因此,我们假设该数量随机,并在其上放置先验。动机应用是分析大洛杉矶地区的经济不平等,通常导致社会不平等和动荡。通过利用最近引入的最优辅助先验,高效的后验计算由一种转维马尔可夫链蒙特卡洛采样器实现。该方法通过广泛模拟验证,并应用于大洛杉矶地区的收入数据。我们识别出收入分布中的几个边界,这些边界可以事后解释为无健康保险人口的百分比,但不能解释为总犯罪数,显示了这种分析对政策制定者的有用性。

英文摘要

We consider the problem of boundary detection for areal data, focusing on situations where for each areal unit multiple observations are available. We propose a Bayesian nonparametric mixture model for the area-specific population densities, with spatially dependent weights and a random number of components. Contrary to previously proposed methods for boundary detection, which consider one observation per areal unit, ours does not require external information such as area-specific covariates or dissimilarity metrics. Instead, by exploiting information from multiple samples per area, it is able to identify boundaries between areas that exhibit different densities. Crucially, the number of mixture components needs to be learned from data to obtain meaningful boundary detection, due to the non-identifiability of overfitted mixtures. Therefore, we assume it random by placing a prior on it. The motivating application is the analysis of economic inequality in the greater Los Angeles region, which typically yields social inequality and unrest. Efficient posterior computation is facilitated by a transdimensional Markov Chain Monte Carlo sampler which exploits the recently introduced optimal auxiliary priors to improve the mixing. The methodology is validated via extensive simulations and applied to the income data in the greater Los Angeles region. We identify several boundaries in the income distributions, which can be explained ex-post in terms of the percentage of the population without health insurance, though not in terms of the total number of crimes, showing the usefulness of such an analysis to policymakers.

2605.15756 2026-05-18 cs.HC stat.AP

Separating Acute Psychological Stress from Physical Exertion in Biometric Signals

在生物信号中区分急性心理压力与体力消耗

Esther Bosch

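AI summary Analyzes how five physiological signals respond to cognitive stress and physical activity. Tonic electrodermal activity is the most effective indicator for separating psychological stress from physical exertion, while the other signals are driven mainly by physical activity.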

详情
AI中文摘要

急性心理压力在日常情境中广泛出现,包括交通、职业环境和体力活动,其可靠检测可实现自适应系统响应并支持人类福祉。自动压力识别的持续挑战是区分急性心理压力的生物信号与同时发生的体力消耗信号。本研究考察了五种生理信号( tonic electrodermal activity、trapezius electromyography、心率、心率变异性、呼吸率)在认知压力和体力活动下的反应,单独和组合情况下。十九名参与者在2x3组内设计中完成了n-back算术任务结合社交压力和金钱奖励,三个活动条件:静坐、行走和静力骑行。多级线性混合模型和重复测量方差分析用于分解每种传感器的主效应和交互作用。tonic electrodermal activity 对认知压力(r=0.48)和体力消耗(r=0.67)有稳健的加法反应,无交互作用,使其成为体力活动期间压力检测最有前途的候选者。心率和trapezius electromyography几乎完全由体力消耗驱动,无可靠敏感性。RMSSD被体力活动强烈抑制,对认知负荷只有微弱敏感性。呼吸率受体力活动主导,主分析中无可靠压力效应。这些发现提供了现实世界压力检测的传感器特异性层级,并突显tonic electrodermal activity为在体力活动人群中识别认知压力时最信息丰富的通道。

英文摘要

Acute psychological stress occurs in a wide range of everyday contexts, including transportation, occupational settings, and physical activity, where its reliable detection could enable adaptive system responses and support human well-being. A persistent challenge in automated stress recognition is disentangling the biometric signatures of acute psychological stress from those of concurrent physical exertion. This study examined how five physiological signals (tonic electrodermal activity, trapezius electromyography, heart rate, heart rate variability, and respiration rate) respond to cognitive stress and physical activity, independently and in combination. Nineteen participants completed a 2x3 within-subjects design in which acute psychological stress was induced via an n-back arithmetic task combined with social pressure and financial reward, across three activity conditions: idle sitting, walking, and stationary cycling. Multilevel linear mixed models and repeated-measures ANOVA were used to decompose main effects and interactions for each sensor. Tonic electrodermal activity showed a robust, additive response to both cognitive stress (r=0.48) and physical exertion (r=0.67), with no interaction, making it the most promising candidate for stress detection during physical activity. Heart rate and trapezius electromyography were driven almost exclusively by physical exertion, with no reliable sensitivity to the stress task. RMSSD was strongly suppressed by physical activity and showed only marginal sensitivity to cognitive load. Respiration rate was dominated by physical activity, with no reliable stress effect in the primary analysis. These findings provide a sensor-specific hierarchy for real-world stress detection and highlight tonic electrodermal activity as the most informative channel when cognitive stress must be identified in physically active populations.

2605.15702 2026-05-18 stat.ME

Re-examining and calibrating weighted survival analysis for causal inference

重新审视和校准加权生存分析用于因果推断

Wenfu Xu, Yi Zhang, Tobias Gerhard, Zhiqiang Tan

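AI summary Re-examines the weighted Kaplan-Meier method and develops new calibrated estimation methods to improve the statistical properties of weighted survival analysis, validated in a simulation study and an empirical application.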

详情
AI中文摘要

基于时间到事件结局的因果推断在各种科学研究中至关重要。在静态设置中使用拟合的倾向分数时,加权Kaplan-Meier估计生存概率和加权Breslow-Peto估计危险比已被广泛应用,但其统计特性往往被忽视或仅有限研究。我们通过正式将其与增强逆概率加权估计的一般框架联系起来,重新审视加权Kaplan-Meier方法,并通过校准估计在低维和高维设置中开发新的方法和相关理论。我们展示了模拟研究和精神分裂症患者辅助抗精神病治疗效果的实证应用。校准方法在模拟研究中产生更接近目标的覆盖率比例,并在模拟和实证研究中产生更短的置信区间。

英文摘要

Causal inference with time-to-event outcomes is fundamental in various scientific studies. In a static setup with fitted propensity scores, weighted Kaplan-Meier estimation for survival probabilities and weighted Breslow-Peto estimation for hazard ratios have been widely used, but their statistical properties have been overlooked or studied only to a limited extent. We re-examine the weighted Kaplan-Meier method by formally linking it with the general framework of augmented inverse probability weighted estimation including both point and variance estimation. Furthermore, to address limitations of existing weighted methods for survival analysis, we develop new methods and associated theory through calibrated estimation in both low-dimensional and high-dimensional settings. We present a simulation study and an empirical application on the effectiveness of adjunctive psychotropic treatments for patients with schizophrenia. The calibrated methods yield coverage proportions closer to target ones in the simulation study, and produce shorter confidence intervals in both simulation and empirical studies.
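
For readers unfamiliar with the weighted Kaplan-Meier estimator, a minimal sketch follows. In the causal setting the weights would typically come from fitted propensity scores; the numeric weights below are made up for illustration, and the paper's calibrated estimation and variance estimation are not shown.

```python
def weighted_kaplan_meier(times, events, weights):
    """Weighted Kaplan-Meier survival curve.

    Returns (time, survival) pairs at the distinct event times. Each
    subject counts `weight` times in the at-risk and event totals.
    """
    curve, survival = [], 1.0
    for t in sorted({s for s, e in zip(times, events) if e}):
        at_risk = sum(w for s, w in zip(times, weights) if s >= t)
        died = sum(
            w for s, e, w in zip(times, events, weights) if e and s == t
        )
        survival *= 1.0 - died / at_risk
        curve.append((t, survival))
    return curve


times = [1, 2, 3, 4]
events = [1, 0, 1, 1]  # 0 marks a censored observation
unweighted = weighted_kaplan_meier(times, events, [1, 1, 1, 1])
# Hypothetical inverse-propensity weights, for illustration only.
weighted = weighted_kaplan_meier(times, events, [2.0, 1.0, 1.0, 2.0])
```

With unit weights the function reduces to the ordinary Kaplan-Meier estimator, which makes it easy to check against hand calculations.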

2605.15692 2026-05-18 cs.LG stat.ML

Tighter Regret Bounds for Contextual Action-Set Reinforcement Learning

更紧的基于上下文动作集强化学习的遗憾界

Zijun Chen, Zihan Zhang

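AI summary Studies episodic reinforcement learning with fixed reward and transition functions but episode-dependent admissible action sets. Using the MVP algorithm, it establishes tighter regret bounds for adversarial and stochastic contexts and derives sample complexity and gap-dependent regret bounds.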

详情
AI中文摘要

我们研究了具有固定奖励和转移函数的回合制强化学习,但每个回合的动作集依赖于回合。性能通过累积遗憾衡量,即$\sum_{k=1}^K [V^{*,M^k} - V^{π^k,M^k}]$,其中$M^k$表示第$k$个回合的动作上下文。我们证明MVP算法可以自然扩展到此框架并享有强理论保证。特别是,我们建立了对抗性情境下的最小最大遗憾界$\widetilde{O}(\sqrt{SAH^3K\log L})$,其中$L$表示可能的上下文数量。此结果意味着在随机性情境下的遗憾界为$\widetilde{O}(\sqrt{SAH^3K})$。我们进一步将随机性遗憾保证转换为固定上下文分布的样本复杂度界$\widetilde{O}(SAH^3/ε^2)$。此外,我们推导了一个依赖间隙的遗憾界$\widetilde O\left( \inf_{p\in [0,1)} \left( \frac{1}{Δ_{\min}^{p}} + pKΔ_{\min}^{p} \right)\log K \cdot \mathrm{poly}(S,A,H) \right)$,其中$Δ_{\min}^{p}$是子最优$(h,s,a)$三元组的全局$p$-修剪正间隙底。此界在相关子最优间隙较大的情况下可以显著改进最小最大速率。

英文摘要

We study episodic reinforcement learning with fixed reward and transition functions, but with episode-dependent admissible action sets that are observed at the start of each episode. Performance is measured by cumulative regret against the episode-wise optimal value, $\sum_{k=1}^K [V^{*,M^k} - V^{π^k,M^k}]$, where $M^k$ represents the action context in the $k$-th episode. We show that the MVP algorithm naturally extends to this framework and enjoys strong theoretical guarantees. In particular, we establish a minimax regret bound of $\widetilde{O}(\sqrt{SAH^3K\log L})$ for adversarial contexts, where $L$ denotes the number of possible contexts. This result implies a regret bound of $\widetilde{O}(\sqrt{SAH^3K})$ for stochastic contexts. We further translate the stochastic regret guarantee into a sample complexity bound of $\widetilde{O}(SAH^3/ε^2)$ for a fixed context distribution. In addition, we derive a gap-dependent regret bound of \[ \widetilde O\left( \inf_{p\in [0,1)} \left( \frac{1}{Δ_{\min}^{p}} + pKΔ_{\min}^{p} \right)\log K \cdot \mathrm{poly}(S,A,H) \right), \] where $Δ_{\min}^{p}$ is the global $p$-trimmed positive-gap floor over suboptimal $(h,s,a)$ triples. This bound can substantially improve upon the minimax rate when the relevant suboptimality gaps are large.

2605.15688 2026-05-18 stat.ML cs.AI cs.LG math.PR

$α$-TCAV: A Unified Framework for Testing with Concept Activation Vectors

$α$-TCAV:基于概念激活向量的测试统一框架

Ekkehard Schnoor, Jawher Said, Malik Tiomoko, Wojciech Samek, Alexander Jung

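AI summary Proposes the $α$-TCAV framework, which addresses the variance problem caused by the discontinuous indicator function in standard TCAV by replacing it with a parameterized smooth function. This yields a unified probabilistic formulation, guidance on parameter tuning, and recommendations that challenge established practice.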

Comments 44 pages, 12 figures

详情
AI中文摘要

概念激活向量(CAVs)是深度学习中基于概念的可解释性基础工具,但其实际应用受限于统计不稳定性。本文分析了CAVs和TCAV方法的随机性质,推导了主要CAV类别的分布,包括PatternCAV、FastCAV和基于岭回归的CAV。识别了标准TCAV得分的根本缺陷:其依赖不连续指示函数导致关键区域方差不衰减。为此,引入$α$-TCAV,一种通用框架,用参数化平滑函数替代指示函数,得到统一的概率表述,涵盖TCAV和Multi-TCAV。刻画了灵敏度得分和不同TCAV变体的诱导分布,显示现有最先进的选择缺乏理论依据。提供原理指导,调优$α$-TCAV参数:要么以较低计算成本模仿Multi-TCAV,要么获得校准的贝叶斯最优概率度量。最终分析产生实用建议,挑战现有惯例:最显著的是将全部采样预算分配给单一CAV而非多个。

英文摘要

Concept Activation Vectors (CAVs) are a fundamental tool for concept-based explainability in deep learning, yet their practical utility is limited by statistical instability. We analyze the stochastic nature of CAVs and the Testing with CAVs (TCAV) method, deriving the distributions of major CAV classes including PatternCAV, FastCAV, and ridge regression-based CAVs. We then identify a fundamental flaw in the standard TCAV score: its reliance on a discontinuous indicator function induces non-decaying variance in critical regimes. To address this, we introduce $α$-TCAV, a generalized framework that replaces the indicator with a parameterized smooth function, yielding a unified probabilistic formulation that subsumes both TCAV and Multi-TCAV. We characterize the induced distributions of sensitivity scores and different TCAV variants, showing that established state-of-the-art choices lack theoretical justification. We provide principled guidance on tuning the parameter in $α$-TCAV -- either to imitate Multi-TCAV at substantially lower computational cost, or to obtain a calibrated Bayes-optimal probabilistic measure of a concept's influence. Finally, our analysis yields practical recommendations that challenge established routines: most notably, allocating the full sampling budget to a single CAV rather than splitting it across several.
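
The contrast between the indicator-based score and a smoothed score can be sketched as below. The abstract does not state which parameterized smooth function $α$-TCAV uses, so the logistic function here is an assumption, and the sensitivity values are made up.

```python
import math


def tcav_score(sensitivities):
    """Standard TCAV score: fraction of strictly positive sensitivities."""
    return sum(s > 0 for s in sensitivities) / len(sensitivities)


def smooth_tcav_score(sensitivities, alpha):
    """Smoothed score with the indicator replaced by a logistic function.

    Assumption: the logistic choice is illustrative, not the paper's.
    Written via tanh so that large alpha does not overflow.
    """
    return sum(
        0.5 * (1.0 + math.tanh(0.5 * alpha * s)) for s in sensitivities
    ) / len(sensitivities)


# Hypothetical directional-derivative sensitivities for five inputs.
sensitivities = [0.8, 0.3, -0.2, 0.01, -0.01]

hard = tcav_score(sensitivities)
nearly_hard = smooth_tcav_score(sensitivities, alpha=1000.0)
flat = smooth_tcav_score(sensitivities, alpha=0.0)
```

Sensitivities close to zero flip the indicator under tiny perturbations, whereas the smoothed score changes gradually; a very large `alpha` recovers the indicator-based score.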

2605.15639 2026-05-18 stat.ME stat.CO stat.ML

Leveraging heterogeneity for identifiability: Bayesian order-based learning of multiple DAGs

利用异质性提高可识别性:基于顺序的贝叶斯学习多个DAG

Hyunwoong Chang, Fariha Taskin

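AI summary Proposes a joint order-based scoring framework for learning the causal structure of directed acyclic graph models under heterogeneous data. It shows that leveraging heterogeneity improves the accuracy of causal ordering estimation, proposes an order-based Bayesian method, and establishes its theoretical properties in the high-dimensional regime.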

详情
AI中文摘要

我们提出了一种联合顺序评分框架,用于在异质数据设置下学习有向无环图(DAG)模型的因果结构。我们证明,利用异质性可以提高因果顺序估计的准确性。在最有利的情况下,因果顺序在两个排列内是可识别的。在此框架基础上,我们提出了一种基于顺序的贝叶斯方法用于高斯DAG模型,并在高维情况下建立了其理论性质。对于顺序空间的后验推断,我们引入了随机到随机(R2R)提议邻域用于Metropolis-Hastings算法,该方法在理论上得到支持,并表现出高效的混合行为。模拟研究证实了所提方法的强经验性能,且对重大抑郁障碍单核RNA测序数据的应用展示了实际用途。

英文摘要

We propose a joint order-based scoring framework for causal structure learning of directed acyclic graph (DAG) models under heterogeneous data settings. We show that leveraging heterogeneity improves the accuracy of causal ordering estimation. In the most favorable case, the causal ordering is identifiable up to two permutations. Building on this framework, we propose an order-based Bayesian method for Gaussian DAG models and establish its theoretical properties in the high-dimensional regime. For posterior inference over the space of orderings, we introduce a random-to-random (R2R) proposal neighborhood for the Metropolis-Hastings algorithm, which is theoretically motivated and exhibits efficient mixing behavior. Simulation studies confirm the strong empirical performance of the proposed method, and an application to single-nucleus RNA sequencing data from major depressive disorder demonstrates practical utility.
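
A random-to-random move on an ordering is commonly understood as removing one element and reinserting it at a random position. The sketch below implements that reading; whether it matches the paper's exact R2R proposal neighborhood is an assumption, and the node labels are hypothetical.

```python
import random


def random_to_random_move(ordering, rng):
    """Remove one uniformly chosen node and reinsert it at a uniform position."""
    proposal = list(ordering)
    node = proposal.pop(rng.randrange(len(proposal)))
    proposal.insert(rng.randrange(len(proposal) + 1), node)
    return proposal


rng = random.Random(4)
ordering = ["A", "B", "C", "D", "E"]  # hypothetical node labels
proposal = random_to_random_move(ordering, rng)
```

In a Metropolis-Hastings sampler over orderings, such a proposal would be accepted or rejected according to the ratio of posterior scores of the two orderings.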

2605.15633 2026-05-18 stat.ME

Structured Transfer Learning for Survival Risk Stratification in Data-Sparse Clinical Cohorts

结构化迁移学习用于数据稀疏临床队列的生存风险分层

Junhan Yu, Yurui Chen, Juan Delgado-SanMartin, Dennis Wang, Hong Pan, Doudou Zhou

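AI summary: This paper proposes CORE-Cox, which learns shared risk-factor patterns through a low-rank Cox coefficient structure and adapts them to the target cohort through regularized residual correction, improving the reliability of survival risk stratification.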

详情
AI中文摘要

背景:生存预测模型在样本量有限或事件较少的临床群体中可靠性较低。目标仅模型可能不稳定,而来自大群体的模型在风险因子效应差异时转移效果差。我们评估了结构化迁移学习是否能改善数据稀疏群体的生存风险分层并允许群体特异性适应。方法:我们开发了COhort-shared Rank-rEduced Cox模型(CORE-Cox),一种用于多结局生存预测的两阶段框架。CORE-Cox通过低秩Cox系数结构在较大来源群体中学习相关结局的共享风险因子模式,然后通过正则化残差校正适应较小的目标群体。我们评估了CORE-Cox在UK Biobank(白人来源,n=150,093;亚洲目标,n=2,534)和MIMIC-IV(白人ICU来源,n=15,997;亚洲ICU目标,n=672)中的表现,与目标仅Cox、惩罚Cox、低秩多任务、朴素池化、直接转移和单结局残差转移在重复嵌套交叉验证中进行比较。结果:CORE-Cox在大多数结局上实现了最佳或接近最佳的区分能力。在UK Biobank中,平均C指数从0.733提高到0.766,在MIMIC-IV中从0.628提高到0.658,有九个结局中的八个有所提升。CORE-Cox还提高了前15%风险富集,危险比估计通常介于来源仅和目标仅模型之间。讨论:CORE-Cox为数据稀疏群体的生存风险分层提供了可解释的迁移学习框架,结合了跨结局的共享结构与群体特异性适应。进一步验证是使用校准的绝对风险预测或临床决策之前的必要步骤。

英文摘要

Background: Survival prediction models are often less reliable in clinical groups with limited sample sizes or few outcome events. Target-only models may be unstable, whereas models from larger cohorts may transfer poorly when risk-factor effects differ across populations. We evaluated whether structured transfer learning can improve survival risk stratification in data-sparse cohorts while allowing cohort-specific adaptation. Methods: We developed the COhort-shared Rank-rEduced Cox model (CORE-Cox), a two-stage framework for multi-outcome survival prediction. CORE-Cox learns shared risk-factor patterns across related outcomes in a larger source cohort via a low-rank Cox coefficient structure, then adapts these patterns to a smaller target cohort through regularized residual correction. We evaluated CORE-Cox in UK Biobank (White source, n=150,093; Asian target, n=2,534) and MIMIC-IV (White ICU source, n=15,997; Asian ICU target, n=672), comparing against target-only Cox, penalized Cox, low-rank multi-task, naive pooling, direct transfer, and single-outcome residual transfer under repeated nested cross-validation. Results: CORE-Cox achieved best or near-best discrimination across most outcomes. Mean C-index improved from 0.733 to 0.766 in UK Biobank and from 0.628 to 0.658 in MIMIC-IV, with gains in eight of nine outcomes. CORE-Cox also improved top-15% risk enrichment, with hazard-ratio estimates typically intermediate between source-only and target-only models. Discussion: CORE-Cox offers an interpretable transfer-learning framework for survival risk stratification in data-sparse cohorts, combining shared cross-outcome structure with cohort-specific adaptation. Further validation is needed before use in calibrated absolute-risk prediction or clinical decision-making.

2605.15620 2026-05-18 stat.ML cs.LG

Pessimistic Risk-Aware Policy Learning in Contextual Bandits

悲观风险感知策略学习在上下文老虎机中

Yilong Wan, Yuqiang Li, Xianyi Wu

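AI summary: This paper proposes a unified framework for optimizing Lipschitz-continuous risk functionals, covering mean-variance, entropic risk, and others. Using novel empirical concentration inequalities, it derives data-dependent suboptimality bounds without strong overlap assumptions and attains the minimax-optimal rate.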

详情
AI中文摘要

我们研究风险感知的离线策略学习,旨在从记录数据中学习最优决策规则,满足一般风险标准。在高风险领域,线上交互不可行且需严格控制不利结果。现有离线上下文老虎机文献要么聚焦预期奖励标准,要么仅限于策略评估而非优化。本文提出统一分布框架优化Lipschitz连续风险函数,涵盖均值-方差、熵风险、条件风险价值等。通过开发新型经验集中不等式用于重要性采样分布估计,分析推导数据依赖的次优界,无须强叠加假设,该速率最小最大最优,与风险中性离线策略优化一致,表明优化一般Lipschitz风险标准无额外统计成本。

英文摘要

We study risk-aware offline policy learning, aiming to learn a decision rule from logged data that is optimal under general risk criteria. This problem is crucial in high-stakes domains where online interaction is infeasible and adverse outcomes must be carefully controlled. However, existing literature on offline contextual bandits either centers on expected-reward criteria or restricts risk considerations to policy evaluation instead of optimization. In this work, we propose a unified distributional framework for optimizing Lipschitz-continuous risk functionals, a broad class of risk measures encompassing mean-variance, entropic risk, and conditional value-at-risk, among others. By developing novel empirical concentration inequalities for importance sampling-based distributional estimators, our analysis derives data-dependent suboptimality bounds with an $\tilde{\mathcal{O}}(1/\sqrt{n})$ rate, without relying on restrictive uniform overlap assumptions. This rate is minimax optimal and matches that of risk-neutral offline policy optimization, indicating that optimizing general Lipschitz risk criteria incurs no additional statistical cost relative to the expected-reward criterion.
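To make the three risk measures named in the abstract concrete, the following is a minimal sketch of their plain empirical versions on a sample of rewards. It is not the paper's importance-sampling estimator: the function names, the reward (higher is better) sign convention, the trade-off weight `lam`, the risk-aversion parameter `beta`, and the lower-tail `level` are all assumptions chosen for illustration.

```python
import math


def mean_variance(rewards, lam=1.0):
    """Empirical mean minus lam times the (population) variance."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    return mean - lam * var


def entropic_risk(rewards, beta=1.0):
    """Empirical entropic risk: -(1/beta) * log mean(exp(-beta * R))."""
    n = len(rewards)
    return -math.log(sum(math.exp(-beta * r) for r in rewards) / n) / beta


def cvar(rewards, level=0.25):
    """Empirical lower-tail CVaR: mean of the worst `level` fraction of rewards."""
    ordered = sorted(rewards)
    k = max(1, math.ceil(level * len(ordered)))
    return sum(ordered[:k]) / k


if __name__ == "__main__":
    sample = [0.0, 1.0, 2.0, 3.0]
    print(mean_variance(sample), entropic_risk(sample), cvar(sample, 0.5))
```

Under this sign convention each functional penalizes spread or bad tails relative to the plain mean, which is why the entropic risk of a non-constant sample falls below its average.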

2605.15612 2026-05-18 cs.IT math.IT stat.AP

Statistical two-round search for one excellent element

统计两轮搜索寻找一个优秀元素

Nagananda K G, Jong Sung Kim

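AI summary: This paper studies a statistical two-round search problem whose goal is to find at least one excellent element with the minimum number of tests, and proves that in the sparse Poisson regime the number of tests grows logarithmically with the population size.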

Comments 17 pages

详情
AI中文摘要

本文研究了统计意义上的两轮搜索问题,旨在以最小的测试次数找到至少一个优秀元素。考虑一个包含n个元素的总体,其中每个元素独立地以概率λ/n成为优秀元素,λ>0。一个子集测试是无噪声的:当查询的子集包含至少一个优秀元素时,它会返回阳性。目标是在保证以至少1-α的概率找到一个优秀元素的前提下,最小化预期测试次数,其中0<α<1。与传统的群体测试不同,目标不是恢复所有优秀元素的集合,而是仅识别其中一个。我们首先证明成功本质上受到没有优秀元素的可能性的限制。在稀疏泊松条件下,这提出了必要的可行性条件α≥e^{-λ}。当目标成功概率是可行的,我们证明最优的预期测试次数随总体规模对数增长。上界通过结合初始存在测试和第二轮分离设计获得;下界则来自信息计数论证。数值示例展示了可行性边界和由此产生的对数尺度。

英文摘要

We formulate and study a statistical version of Katona's two-round search problem of finding at least one excellent element in a set. A population of $n$ elements is considered, where each element is independently excellent with probability $λ/n$, $λ> 0$. A subset test is noiseless: it returns positive exactly when the queried subset contains at least one excellent element. The goal is to minimize the expected number of tests subject to finding one excellent element with probability at least $1-α$, where $0<α<1$, under the restriction that testing is performed in two rounds. Unlike classical group testing, the objective is not to recover the full set of excellent elements, but only to identify one of them. We first show that success is fundamentally limited by the possibility that no excellent element exists. In the sparse Poisson regime, this imposes the necessary feasibility condition $α\ge e^{-λ}$. When the target success probability is feasible, we prove that the optimal expected number of tests grows logarithmically with the population size. The upper bound is obtained by combining an initial existence test with a second-round separating design; the lower bound follows from an information-counting argument. Numerical illustrations show the feasibility boundary and the resulting logarithmic scaling.
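The feasibility condition can be checked numerically. With each of $n$ elements independently excellent with probability $λ/n$, the probability that no excellent element exists is $(1-λ/n)^n$, which tends to $e^{-λ}$, so no procedure can succeed with probability above roughly $1-e^{-λ}$. The sketch below illustrates only that boundary; the function names are hypothetical and nothing here reproduces the paper's search design.

```python
import math


def prob_no_excellent(n, lam):
    """Probability that none of n elements is excellent (each with prob lam/n)."""
    return (1.0 - lam / n) ** n


def is_feasible(alpha, lam):
    """Sparse-regime feasibility condition from the abstract: alpha >= exp(-lam)."""
    return alpha >= math.exp(-lam)


if __name__ == "__main__":
    lam = 2.0
    for n in (10, 1_000, 1_000_000):
        print(n, prob_no_excellent(n, lam), math.exp(-lam))
    print(is_feasible(0.2, lam), is_feasible(0.1, lam))
```

For $λ=2$ the limit is $e^{-2}\approx 0.135$, so a target failure level of $α=0.2$ is feasible while $α=0.1$ is not.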

2605.15596 2026-05-18 stat.ME

Tail postcoloring in long-run variance estimation of time series

长程方差估计中的尾部后着色

Xu Liu, Kin Wai Chan

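AI summary: This paper proposes tail postcoloring, which uses parametric models to project the tail autocovariances neglected by nonparametric estimators onto the final estimator, bridging the nonparametric variance estimator and the parametric coloring model and improving robustness and efficiency.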

详情
AI中文摘要

预白化是处理强自相关性的常见方法。本文提出一种新的方法,称为尾部后着色,通过参数模型将非参数估计中被忽视的尾部自协方差投影到最终估计器。该方法通过缩放因子连接非参数方差估计器和参数着色模型,并通过带宽参数自动切换这两种方法,无需将整个数据集转换为残差。当着色模型正确指定时,可获得参数速率。在有限样本中,它比标准方法中的白化模型更稳健,且避免了标准方法中由于着色因子导致的严重方差膨胀或功率下降。本文展示了多种参数模型可构建多重稳健的尾部后着色估计器,并自然适用于多元时间序列。通过马尔可夫链蒙特卡罗输出分析的实证数据示例进行了说明。

英文摘要

Prewhitening is a common approach to deal with strong autocorrelation. Motivated by it, we propose a new approach called tail postcoloring. It uses parametric models to project, or color back, the neglected tail autocovariances in nonparametric estimators onto the final estimator. This approach bridges the nonparametric variance estimator and the parametric coloring model through a scaling factor. It automatically switches between these two arms using a bandwidth parameter, without the need to transform the entire dataset into residuals, as in the standard prewhitening approach. When the coloring model is well-specified, a parametric rate can be achieved. In finite samples, it is also more robust to misspecification of the coloring model compared to the whitening model in the standard approach. Moreover, it avoids the potentially severe variance inflation or power reduction caused by the recoloring factor in the standard approach. We show that multiple parametric models can be used to construct a multiply robust tail postcolored estimator. The approach also extends naturally to multivariate time series. A real-data example in Markov chain Monte Carlo output analysis is provided.

2605.15571 2026-05-18 stat.ML cs.LG

MaxSketch: Robust Distinct Counting in Streams via Random Projections

MaxSketch:通过随机投影在数据流中实现鲁棒的唯一计数

Nikos Tsikouras, Constantine Caramanis, Christos Tzamos

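AI summary: This paper proposes MaxSketch, which uses random Gaussian projections for robust distinct counting in high-dimensional noisy data streams, and proves that under geometric structure the memory requirement can be reduced to $\widetilde{O}(\log n / \varepsilon^2)$.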

详情
AI中文摘要

估计数据流中不同元素的数量在重复元素相同的情况下已知。然而在现代设置中,观测是高维且噪声的,相同对象的重复实例仅近似相似——例如不同个体的图像在像素层面可能有显著差异。经典草图如HyperLogLog依赖一致的哈希值来处理相同元素,在这种情况下会失效。最近在一般度量空间中关于鲁棒唯一计数的研究实现了~Θ(√n)的内存需求,这是最坏情况下的最优。本文证明在学习表示中常见的几何结构下,可以实现显著改进的内存保证。我们介绍了MaxSketch,一种由随机高斯投影构建的简单max线性草图,并证明其能够估计潜在对象的数量。具体而言,我们证明在这一假设下,m = ~O(log n / ε²)的随机投影(因此~O(log n / ε²)的内存)足以在(1+ε)因子内恢复真实的唯一计数。在图像流上的实验证实MaxSketch能够准确估计唯一计数,并在训练范围外泛化。我们的结果将经典流算法与现代表示学习连接起来,展示了几何结构如何从根本上减少唯一计数的复杂性。

英文摘要

Estimating the number of distinct elements in a data stream is well understood when repeated elements are identical. In modern settings, however, observations are high-dimensional and noisy, so repeated instances of the same object are only approximately similar -- for example, different images of the same individual may vary significantly at the pixel level. Classical sketches such as HyperLogLog rely on consistent hash values for identical elements and break down in this regime. Recent work on robust distinct counting in general metric spaces achieves $\widetilde{Θ}(\sqrt{n})$ memory, which is tight in the worst case. We show that substantially improved memory guarantees are possible under geometric structure common in learned representations. We introduce MaxSketch, a simple max-linear sketch built from random Gaussian projections, and prove that it succeeds in estimating the number of distinct latent objects. Concretely, we show that under this assumption $m = \widetilde{O} (\log n / \varepsilon^2)$ random projections (and hence $\widetilde{O} (\log n/\varepsilon^2)$ memory) suffice to recover the true distinct count within a $(1+\varepsilon)$ factor. Experiments on image streams confirm that MaxSketch accurately estimates distinct counts and generalizes beyond the training regime. Our results bridge classical streaming algorithms and modern representation learning, showing how geometric structure can fundamentally reduce the complexity of distinct counting.
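As a rough illustration of what a max-linear sketch over random Gaussian projections could look like, the sketch below keeps, for each of $m$ projection directions, the running maximum of the projected stream items. This is an assumed reading of the abstract's one-line description, not the authors' algorithm: the function names are hypothetical, and the step that converts the sketch into a distinct-count estimate is not given in the abstract and is deliberately omitted.

```python
import random


def make_projections(m, dim, seed=0):
    """Draw m random Gaussian directions in R^dim (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]


def update(sketch, projections, x):
    """Fold one stream item into the sketch: coordinate-wise max of <g_j, x>."""
    for j, g in enumerate(projections):
        value = sum(gi * xi for gi, xi in zip(g, x))
        if value > sketch[j]:
            sketch[j] = value


def sketch_stream(stream, projections):
    """Process a whole stream; memory is one float per projection."""
    sketch = [float("-inf")] * len(projections)
    for x in stream:
        update(sketch, projections, x)
    return sketch


if __name__ == "__main__":
    P = make_projections(m=8, dim=2)
    a, b = [1.0, 0.0], [0.0, 1.0]
    print(sketch_stream([a, b, a, a, b], P) == sketch_stream([a, b], P))
```

Because the state is a coordinate-wise maximum, exact repeats leave it unchanged, and a slightly perturbed repeat can move each coordinate only by the size of the projected perturbation. That insensitivity is what makes a max-based summary a natural candidate when duplicates are only approximately equal.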

2605.15531 2026-05-18 math.ST math.CO stat.TH

Bounds on the Number of Modes of a Gaussian Mixture Density

高斯混合密度模态数的界限

Hien Duy Nguyen

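AI summary: This paper studies upper and lower bounds on the number of nondegenerate critical points and modes of Gaussian mixture densities, proposes a direct Pfaffian bound and an augmented bound, sharpens the finite-mode upper bound via Morse theory, and gives improved bounds for the homoscedastic case together with lower bounds.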

详情
AI中文摘要

我们推导了高斯混合密度在R^d中k个分量的非退化临界点数量的显式上界,以及当模态集有限时的模态数上界和下界。通过将临界点方程归一化于参考分量,对于k≥2,得到直接Pfaffian界U_het(d,k)=2^{d+组合数(k-1,2)}(d+2min(d,k-1)+1)^{k-1}。对于相同参数范围,通过精确消元和代数倒数变量得到替代界U_aug(d,k)=2^{组合数(k-1,2)}(d+1)((2k-1)d+2k-1)^{k-1}。因此,对于k≥2,最佳临界点界是它们的最小值。Morse理论将对应的有限模态上界改进为floor(min{U_het(d,k),U_aug(d,k)}+1)/2。在同方差情况下,对于k≥2,直接界改进为U_hom(d,k)=2^{d+组合数(k-1,2)}(d+min(d,k-1)+1)^{k-1},仿射秩减少将d替换为组件均值的仿射秩,而增强同方差减少给出无维度界U_aug,hom(k)=2^{组合数(k-1,2)+1}(2k)^{k-1}。在下界方面,对于d,k≥2,我们得到L_bin(d,k)=k+max_{2≤r≤min(d,k)}组合数(k,r),并给出一个填充-产品族,特别说明线性下界d+k-1,以及一个种子-闭包原理,将产品和填充构造打包。我们进一步给出了临界集连通分支数的显式界。

英文摘要

We derive explicit upper bounds for the number of nondegenerate critical points of a $k$-component Gaussian mixture density in $\mathbb{R}^d$, and the number of modes when the modal set is finite, together with lower bounds. By normalizing the critical-point equations by a reference component, for $k\ge2$ we get the direct Pfaffian bound \[ U_{\mathrm{het}}(d,k)=2^{\,d+\binom{k-1}{2}}\left(d+2\min(d,k-1)+1\right)^{k-1}. \] For the same parameter range, an exact elimination augmented by an algebraic reciprocal variable gives the alternative bound \[ U_{\mathrm{aug}}(d,k)= 2^{\binom{k-1}{2}}(d+1)\left((2k-1)d+2k-1\right)^{k-1}. \] Thus, for $k\ge2$, the best critical-point bound is their minimum. A Morse-theoretic argument improves the corresponding finite-mode upper bound to \[ \left\lfloor \frac{\min\{U_{\mathrm{het}}(d,k),U_{\mathrm{aug}}(d,k)\}+1}{2}\right\rfloor. \] In the homoscedastic case, for $k\ge2$, the direct bound improves to \[ U_{\mathrm{hom}}(d,k)=2^{\,d+\binom{k-1}{2}}\left(d+\min(d,k-1)+1\right)^{k-1}, \] an affine-rank reduction replaces $d$ by the affine rank of the component means, and an augmented homoscedastic reduction gives the dimension-free bound \[ U_{\mathrm{aug,hom}}(k)=2^{\binom{k-1}{2}+1}(2k)^{k-1}. \] On the lower-bound side, for $d,k\ge 2$ we obtain \[ L_{\mathrm{bin}}(d,k)=k+\max_{2\le r\le \min(d,k)}\binom{k}{r}, \] together with a padding-product family that in particular implies the linear lower bound $d+k-1$, and a seed-closure principle that packages product and padding constructions. We further give explicit bounds for the number of connected components of the critical set.
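The closed-form bounds in the abstract are easy to evaluate for specific $(d,k)$. The sketch below transcribes the stated formulas directly; the function names are hypothetical, and the code assumes $k\ge 2$ for the upper bounds and $d,k\ge 2$ for the lower bound, as in the abstract.

```python
from math import comb


def u_het(d, k):
    """Direct Pfaffian bound on nondegenerate critical points (heteroscedastic)."""
    return 2 ** (d + comb(k - 1, 2)) * (d + 2 * min(d, k - 1) + 1) ** (k - 1)


def u_aug(d, k):
    """Augmented bound via the algebraic reciprocal variable."""
    return 2 ** comb(k - 1, 2) * (d + 1) * ((2 * k - 1) * d + 2 * k - 1) ** (k - 1)


def mode_upper_bound(d, k):
    """Morse-theoretic finite-mode bound: floor((min(U_het, U_aug) + 1) / 2)."""
    return (min(u_het(d, k), u_aug(d, k)) + 1) // 2


def u_hom(d, k):
    """Direct bound in the homoscedastic case."""
    return 2 ** (d + comb(k - 1, 2)) * (d + min(d, k - 1) + 1) ** (k - 1)


def u_aug_hom(k):
    """Dimension-free augmented homoscedastic bound."""
    return 2 ** (comb(k - 1, 2) + 1) * (2 * k) ** (k - 1)


def l_bin(d, k):
    """Lower bound L_bin(d, k) = k + max over 2 <= r <= min(d, k) of C(k, r)."""
    return k + max(comb(k, r) for r in range(2, min(d, k) + 1))


if __name__ == "__main__":
    for d, k in [(2, 2), (2, 3), (5, 3)]:
        print(d, k, u_het(d, k), u_aug(d, k), mode_upper_bound(d, k), l_bin(d, k))
```

For example, at $d=2$, $k=2$ the two critical-point bounds are 20 and 27, so the finite-mode upper bound is $\lfloor 21/2 \rfloor = 10$, while the lower bound is 3.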

2605.15524 2026-05-18 cs.LG cs.AI math.DG math.ST stat.TH

Neural Point-Forms

神经点形

Bruno Trentini, Jacob Hume, Vincenzo Antonio Isoldi, Philipp Misof, Ekaterina S. Ivshina, Kelly Maggs

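AI summary: This paper proposes neural point-forms (NPFs), learnable geometric features for point clouds built with Laplacian-based techniques from diffusion geometry to compare differential forms, and shows their advantages in synthetic and biologically relevant experiments when labels depend on sampling density, manifold structure, or population geometry.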

详情
AI中文摘要

点云学习通常基于观察样本是嵌入高维特征空间的底层几何对象的噪声轨迹的假设。然而,许多几何特性无法仅通过坐标、成对距离或学习的图邻域直接捕捉。在光滑情况下,微分形式用于编码高阶切线信息。本文引入了一种新的可学习几何特征家族,称为神经点形(NPFs)。在没有自然切线结构的情况下,我们使用来自扩散几何的拉普拉斯技术,通过内积构建点云的离散模型,以比较微分形式。在连续情况下,共享环境特征空间的子流形表示为比较矩阵,其条目描述了特征形式对偶切线信息的相互作用。我们通过证明在标准采样、带宽、密度和流形假设下比较矩阵的长期一致性,使这一直觉精确化。这产生了一个紧凑、高效且可交换的神经层,其输出是一个学习的形比较矩阵。在合成和生物相关实验中,我们展示了NPFs提供了一个竞争性且可解释的表示,当标签依赖于采样密度、流形结构或响应相关群体几何时,其优势最为明显。

英文摘要

Point cloud learning often rests on the premise that observed samples are noisy traces of an underlying geometric object, such as a manifold embedded in a high-dimensional feature space. Yet much of this geometry is not captured directly by coordinates, pairwise distances, or learned graph neighborhoods alone. In the smooth setting, differential forms are devices to encode higher-order tangency information. In this work, we introduce a new family of principled learnable geometric features for point clouds called neural point-forms (NPFs). In the absence of a natural tangency structure, we instead use Laplacian-based techniques from Diffusion Geometry to build a discrete model for comparing differential forms on point clouds via inner products. In the continuum, submanifolds of a shared ambient feature space are represented as comparison matrices, whose entries describe how pairs of feature forms interact with extrinsic tangency information. We make this intuition precise by proving the long-run consistency of comparison matrices under standard sampling, bandwidth, density, and manifold-hypothesis assumptions. This yields a compact, efficient and permutation-invariant neural layer whose output is a learned form-comparison matrix. Across synthetic and biologically relevant experiments, we show that NPFs provide a competitive and interpretable representation, with the strongest benefits appearing when labels depend on sampling density, manifold-like structure, or response-relevant population geometry.

2605.15516 2026-05-18 eess.SY cs.SY stat.AP

Co-Design Optimization for Data Center Cooling System via Digital Twin

通过数字孪生实现数据中心冷却系统协同优化

Shrenik Jadhav, Zheng Liu

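AI summary: This paper proposes a three-layer optimization framework for allocating coolant distribution units and flow in supercomputer cooling plants, evaluates all feasible partitions with a reduced-order model, and achieves annual cooling energy savings of 35.48%.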

Comments 12 pages, 8 figures

详情
AI中文摘要

液冷超算系统通过多并行子回路冷却装置散热,但如何分配冷却单元(CDUs)及流体分配尚未系统解决。本文提出三层优化框架,联合确定CDUs在子回路中的整数划分、连续流体分配及每时间步总流量和供温的协同优化,满足子回路热安全约束。基于前沿超算的数据构建Modelica仿真模型,开发降阶替代模型,评估所有611种可行划分方案。比较三种渐进丰富操作策略,最终最优设计为双子回路系统,实现35.48%年度冷却能耗节省,仅比现有三子回路前沿设计高出0.18%。流体分配优化可补偿任何可行CDU到子回路分配,降低设计敏感性93%,提供低成本软件路径实现现有前沿硬件近最优性能。该框架可迁移至其他液冷高性能计算系统。

英文摘要

Liquid-cooled exascale supercomputers dissipate heat through cooling plants organized as multiple parallel subloops, but how to allocate coolant distribution units (CDUs) across subloops and how to distribute flow among them has not been systematically addressed for facilities at this scale. This paper presents a three-layer optimization framework that jointly determines the integer partition of CDUs across subloops, the continuous flow fraction allocation, and the per-timestep co-design optimization of total flow rate and supply temperature subject to per-subloop thermal safety constraints. The Modelica simulation model is built from data from the Frontier exascale supercomputer at Oak Ridge National Laboratory. By developing a reduced-order surrogate model, all 611 feasible partitions of 25 CDUs are evaluated across the full-year operational dataset of 49,353 timesteps. Three progressively richer operational strategies are compared, ranging from flow control optimization to full three-layer co-design optimization with dynamically adjusted flow fractions. The globally optimal design is a two-subloop plant achieving 35.48% annual cooling energy savings, only 0.18 percentage points above the current three-subloop Frontier design at 35.30%. Flow fraction optimization is shown to compensate for any feasible CDU-to-subloop assignment, reducing the design sensitivity by 93% and providing a low-cost software-only pathway to near-optimal performance on the existing Frontier hardware. The framework is transferable to other liquid-cooled high-performance computing plants.

2605.15488 2026-05-18 cs.LG stat.ML

SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference

SurvivalPFN: 通过上下文贝叶斯推断实现生存预测的 amortization

Shi-ang Qi, Vahid Balazadeh, Michael Cooper, Russell Greiner, Rahul G. Krishnan

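AI summary: SurvivalPFN amortizes survival prediction through in-context learning: a pretrained network handles right-censored data in a single forward pass, avoids parametric assumptions, produces calibrated survival distributions, and performs strongly across 61 datasets.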

详情
AI中文摘要

生存分析提供了一个强大的统计框架,用于在删失存在的情况下建模时间到事件的结果。然而,从众多专门的生存方法中选择合适的估计器通常需要大量方法论和领域专业知识。我们引入了SurvivalPFN,这是一种先验-数据拟合网络,通过上下文学习实现对删失观测的贝叶斯推断的amortization。SurvivalPFN 在多样化的合成、可识别和右删失数据生成过程中进行预训练,使其能够在推理过程中单次前向传递中实现生存分析的amortization。结果,模型适应每个数据集的有效复杂性,而无需任务特定的训练或超参数调整,避免了限制性的参数假设,并产生校准的生存分布。在涵盖61个数据集、21种方法和5种评估指标的大型基准测试中,SurvivalPFN实现了强大的预测性能,并经常优于已建立的生存模型。这些结果表明,SurvivalPFN为生存分析提供了一个原理上和实用的基础模型,潜在应用领域包括医疗、金融和工程(https://github.com/rgklab/SurvivalPFN)

英文摘要

Survival analysis provides a powerful statistical framework for modeling time-to-event outcomes in the presence of censoring. However, selecting an appropriate estimator from the many specialized survival approaches often requires substantial methodological and domain expertise. We introduce SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for censored observations through in-context learning. SurvivalPFN is pretrained on a diverse family of synthetic, identifiable, and right-censored data-generating processes, enabling it to amortize survival analysis in a single forward pass during inference. As a result, the model adapts to the effective complexity of each dataset without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. In a large-scale benchmark spanning 61 datasets, 21 methods, and 5 evaluation metrics, SurvivalPFN achieves strong predictive performance and often improves upon established survival models. These results suggest that SurvivalPFN offers a principled and practical foundation model for survival analysis, with potential applications in high-impact domains such as healthcare, finance, and engineering (https://github.com/rgklab/SurvivalPFN).

2605.15483 2026-05-18 stat.ME stat.ML

Improving the Efficiency of Subgroup Analysis in Randomized Controlled Trials with TMLE

利用TMLE改进随机对照试验中的亚组分析效率

Sky Qiu, Nerissa Nance, Rachael Phillips, Jens Tarp, Maya Petersen, Mark van der Laan

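AI summary: This paper studies TMLE-PR and A-TMLE, which use information from trial participants outside the subgroup to improve the precision of subgroup treatment-effect estimates while avoiding bias from external data, and applies them to estimate subgroup risk reductions in a cardiovascular outcome trial.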

详情
AI中文摘要

在随机对照试验中,亚组分析常因样本量不足而缺乏统计效力。本文通过利用非亚组参与者的信息来增强亚组估计。具体而言,我们研究了两种目标最大似然估计器(TMLE):一种使用池化回归的TMLE(TMLE-PR)和一种自适应目标最大似然估计器(A-TMLE)。这两种估计器能够在不依赖外部真实世界数据的情况下共享信息,从而利用试验的关键优势:随机治疗带来的偏倚保护,以及数据收集的一致性和定义的一致性。本文提出的一般策略直接服务于关键监管机构(如FDA)的优先事项,通过在不引入外部偏倚的情况下提高亚组特定治疗效应估计的精度,从而促进严谨推断以支持公平的标签、可及性和市场后评估。在基于心血管结局试验(LEADER,NCT01179048)数据的案例研究中,我们使用所提出的估计器估计了利拉鲁肽治疗下黑人和亚洲亚组中主要不良心脏事件(MACE)的风险降低,这两个亚组各自占试验人口的不到10%。使用A-TMLE,我们发现亚洲参与者在365、540和730天时的估计绝对MACE风险降低分别为1.6、1.5和1.5个百分点,黑人参与者分别为2.1、2.0和2.1个百分点,95%置信区间在每个时间点均不包含零。

英文摘要

Subgroup analyses within randomized controlled trials are often underpowered due to limited sample sizes. We address this challenge by leveraging trial participants outside the subgroup of interest to augment estimation within the subgroup. Specifically, we study two Targeted Maximum Likelihood Estimators (TMLEs) that borrow information from non-subgroup participants within the same trial: a TMLE with pooled regression (TMLE-PR) and an Adaptive Targeted Maximum Likelihood Estimator (A-TMLE). Both estimators enable information sharing without relying on any external real-world data, thereby capitalizing on key strengths of the trial: most importantly, the protection against bias afforded by the randomized treatment, but also harmonized data collection, and consistent treatment and outcome definitions. The general strategy proposed here directly advances the priorities of key regulatory agencies, including the FDA, by improving the precision of subgroup-specific treatment effect estimates without introducing external sources of bias, thereby facilitating rigorous inference to support equitable labeling, access, and post-market evaluation. In a case study based on analysis of data from a cardiovascular outcome trial (LEADER, NCT01179048), we estimate the risk reduction of major adverse cardiac events (MACE) under liraglutide treatment among Black and Asian subgroups -- each comprising less than 10% of the trial population -- using the proposed estimators that borrow information from the remainder of the trial. Using A-TMLE, in particular, we find estimated absolute MACE risk reductions of 1.6, 1.5, and 1.5 percentage points among Asian participants and 2.1, 2.0, and 2.1 percentage points among Black participants at 365, 540, and 730 days, respectively, with 95% confidence intervals excluding the null at each time point.

2605.15469 2026-05-18 stat.ME

Tree-aggregated regression for compositional data with measurement errors

基于测量误差的树聚合回归

Zhenghan Li, Tianying Wang

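AI summary: This paper proposes TARCO, which integrates bias-corrected estimating quantities with a tree-aware positive semidefinite stabilization to address the bias and instability caused by the interaction between tree aggregation and measurement error, improving estimation accuracy and interpretability in microbiome studies.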

详情
AI中文摘要

高维组成型协变量,通常源自计数数据,易受测量误差影响,常通过预设树聚合以提高可解释性。现有方法通常处理树引导的组成回归或误差修正,但未考虑其交互引起的层级污染。本文提出TARCO方法,整合偏差校正估计量与树感知正半定稳定化及稀疏正则化,通过交叉验证选择超参数。所得凸优化问题可通过可扩展算法求解。建立预测和估计误差的有限样本界,并在反映树异质性的条件下证明符号一致性。当测量误差协方差被一致估计器替代时,保证仍成立。多树深度模拟和微生物组应用显示,相比忽略树聚合与测量误差交互的方法,TARCO在估计精度、支持恢复和聚合层面解释性上表现更优。

英文摘要

High-dimensional compositional covariates, often derived from count data, are subject to measurement error and are frequently analyzed after aggregation along a prespecified tree to improve interpretability in applications such as microbiome studies. Existing approaches typically handle either tree-guided compositional regression or errors-in-variables correction, but they do not account for the hierarchical contamination induced by their interaction. We show that tree aggregation turns leaf-level measurement error into level-dependent, correlated contamination across aggregated nodes, which inflates bias, weakens concentration rates for corrected estimating quantities, and leads to unstable variable selection for naive approaches. We propose Tree-Aggregated Regression with Correction for Observation Error (TARCO), which integrates bias-corrected estimating quantities with a tree-aware positive semidefinite stabilization and sparse regularization, with tuning selected by cross-validation based on the corrected objective. The resulting convex program can be solved with scalable algorithms. We establish finite-sample bounds for prediction and estimation errors and prove sign consistency under conditions that explicitly reflect tree heterogeneity. The guarantees persist when the measurement-error covariance is replaced by a consistent estimator. Simulations across multiple tree depths and a microbiome application demonstrate improved estimation accuracy, support recovery, and aggregation-level interpretability compared with methods that ignore the interaction between tree aggregation and measurement error.

2605.15459 2026-05-18 cs.LG stat.ML

Don't Stop Me Yet: Sampling Loss Minima via Dissipative Riemannian Mechanics

别停止我:通过耗散黎曼流形力学采样损失极小值

Albert Kjøller Jacobsen, Leo Uhre Jakobsen, Johanna Marie Gegenfurtner, Georgios Arvanitidis

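AI summary: This paper proposes DiMS, which samples loss minima exactly via dissipative Riemannian mechanics, addressing the inability of existing methods to sample reparameterization-invariant solutions exactly, and validates it in Bayesian inference.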

详情
AI中文摘要

现代神经网络损失函数的极小值通常不是孤立的,而是形成在训练数据上重参数化不变解的连通组件。分析这些解是一个难题,但采样方法是可行的。现有方法要么在低损失区域扩散,无法精确采样重参数化不变解,要么本质上是局部的,限制了对其他极小值盆地的探索。本文提出基于动能的动力系统,受重力和摩擦项驱动,以精确采样极小水平集。DiMS方法依赖物理动机的超参数,允许控制采样器的探索能力。我们以不确定性量化作为动机问题,在贝叶斯推断中观察到比之前方法更好的性能。

英文摘要

The minima of modern neural network loss functions are typically not isolated; rather, they form connected components of reparameterization-invariant solutions on the training data. Analytically characterizing these solutions is a hard problem, but sampling approaches are feasible. By construction, existing methods either spread over low-loss regions, and thus do not sample reparameterization-invariant solutions exactly, or are inherently local, which limits exploration of other minima valleys. We propose sampling such reparameterization-invariant models using a dynamical system based on kinetic energy, subject to a gravitational pull and a friction term that dissipates energy from the system. Our proposed sampler, DiMS, is guaranteed to sample exactly from the minimum level sets and depends on physically motivated hyperparameters, which allow control over the exploration capabilities of the sampler. We consider uncertainty quantification in Bayesian inference as the motivating problem and observe improved performance compared to previously proposed approaches.

2605.15428 2026-05-18 stat.ME

Modeling Misclassification in Spousal Violence Reporting: Evidence from Bayesian Quantile Regression

夫妻暴力报告中的误分类建模:来自贝叶斯分位数回归的证据

Joon Jin Song, Mohammad Arshad Rahman, Yoo-Mi Chin, James Stamey

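AI summary: This paper proposes a Bayesian quantile regression framework for misclassified binary data that introduces a latent true response and explicitly models false-negative and false-positive reporting errors, improving inference for sensitive binary outcomes.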

详情
AI中文摘要

分位数回归扩展了回归分析,超越条件均值,提供更丰富的协变量效果特征。然而,对于敏感的二元结果,由于漏报导致的误分类可能显著偏误推断。本文提出一种贝叶斯分位数回归框架,用于误分类的二元结果,引入潜在真实响应并显式建模假阴性和假阳性报告误差。估计通过一种新的马尔可夫链蒙特卡罗(MCMC)算法进行。在不同先验规格和误分类率下的模拟研究显示,该方法在忽略误分类的模型上表现更优。本文将该方法应用于自我报告的夫妻暴力数据,研究就业状况和家庭财富与关联,同时调整社会人口因素。结果表明,各分位数中漏报超过过报,并且考虑误分类可以改变实质性结论。

英文摘要

Quantile regression extends regression analysis beyond the conditional mean, providing a richer characterization of covariate effects across the outcome distribution. For sensitive binary outcomes, however, misclassification due to underreporting can substantially bias inference. We propose a Bayesian quantile regression framework for misclassified binary outcomes that introduces a latent true response and explicitly models false negative and false positive reporting errors. Estimation is performed through a novel Markov chain Monte Carlo (MCMC) algorithm. Simulation studies under varying prior specifications and misclassification rates demonstrate improved performance over models that ignore misclassification. We apply the method to self-reported spousal violence data, examining associations with employment status and household wealth while adjusting for socio-demographic factors. The results indicate that underreporting exceeds overreporting across quantiles and that accounting for misclassification can change substantive conclusions.

2605.15411 2026-05-18 stat.ML cs.LG math.OC

Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

通过Oracle价格图学习利用单峰性在半参数上下文定价中

Yingying Fan, Yuxuan Han, Jinchi Lv, Xiaocong Xu, Zhengyuan Zhou

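AI summary: This paper studies contextual dynamic pricing in a semiparametric scalar-index valuation model. By learning the oracle price map under $β$-Hölder smoothness and a revenue-geometry condition, it proposes a modular coarse-to-fine policy whose regret is optimal in the nonparametric oracle-map learning term.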

详情
AI中文摘要

我们研究了半参数标量指数估值模型中的上下文动态定价,其中潜在价值为 $v_t=μ_\ast(\mathsf c_t)+ξ_t$,其中未知效用图 $μ_\ast$ 和未知加性噪声分布。关键决策对象是通过标量指数 $u=μ_\ast(\mathsf c)$ 和噪声尾部诱导的一维Oracle价格图 $u\mapsto p^\ast(u)$。在 $β$-Hölder光滑性($β\geq 2$)和收益几何条件(提供唯一、稳定的内部最大化器)下,该Oracle图本身为 $(β-1)$-光滑。我们通过 $\mathsf{ORBIT}$,一种模块化粗到细策略,利用标量试点指数作为输入,在每个活跃区间内局部化基准价格,并通过多臂凸优化学习Oracle图的局部多项式近似。对于基线线性效用模型 $μ_\ast(\mathsf c)=\mathsf c^\topθ_\ast$,自适应椭圆探索方案在不假设上下文分布的情况下构建所需的标量试点在线。所得到的策略达到 regret $\widetilde{O}\big(T^{\frac{2β-1}{4β-3}}+\sqrt{dT}\big)$。对于固定 $d$,我们建立了在时间范围依赖上的匹配下界,揭示了非参数Oracle图学习项的最小最大尖锐性。相同的标量试点接口还扩展到稀疏高维线性效用和非参数Hölder效用。

英文摘要

We study contextual dynamic pricing in a semiparametric scalar-index valuation model where the latent value is $v_t=μ_\ast(\mathsf c_t)+ξ_t$, with an unknown utility map $μ_\ast$ and an unknown additive noise distribution. The key decision object is the one-dimensional oracle price map $u\mapsto p^\ast(u)$ induced by the scalar index $u=μ_\ast(\mathsf c)$ and the noise tail. Under the $β$-Hölder smoothness of the tail function for $β\geq 2$ and a revenue-geometry condition that gives a unique, stable, interior maximizer, this oracle map is itself $(β-1)$-smooth. We exploit such structure through $\mathsf{ORBIT}$, a modular coarse-to-fine policy that takes a scalar pilot index as input, localizes a benchmark price in each active bin, and learns a local polynomial approximation of the oracle map inside a trust region via bandit convex optimization. For the baseline linear utility model $μ_\ast(\mathsf c)=\mathsf c^\topθ_\ast$, an adaptive elliptical exploration scheme constructs the required scalar pilot online without distributional assumptions on the contexts. The resulting policy achieves regret $\widetilde{O}\big(T^{\frac{2β-1}{4β-3}}+\sqrt{dT}\big)$. For fixed $d$, we establish a matching lower bound in the horizon dependence, unveiling that the nonparametric oracle-map learning term is minimax sharp. The same scalar-pilot interface also yields extensions to sparse high-dimensional linear utility and nonparametric Hölder utility.

2605.15405 2026-05-18 econ.GN q-fin.EC stat.ME

Estimating Social Norm Complementarities

估计社会规范的互补性

Eliana La Ferrara, Cheaheon Lim, Davide Viviano

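AI summary: This paper empirically studies complementarities between social norms along technological and social dimensions, finding that female genital cutting and child marriage are complements, especially in Sierra Leone, while polygyny and child marriage are substitutes, particularly in Nigeria, which informs policy design.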

详情
AI中文摘要

我们开发了一个关于社会规范选择的模型,允许在两个维度上存在互补性:技术维度,类似于消费商品之间的互补性,以及社会维度,捕捉从从众中获得的回报。这些共同决定了两种规范是互补品、替代品还是独立品,这由一种规范的均衡普及率如何响应另一种规范效用的边际变化来定义。我们使用塞拉利昂和尼日利亚的重复横断面数据估计该模型,重点研究女性割礼、多妻制和童婚。社会回报在所有规范中均显著。对于女性割礼和童婚,我们发现互补性证据,尤其是在塞拉利昂尤为明显。对于多妻制和童婚,我们发现社会替代性证据,特别是在尼日利亚。我们利用人类学见解解释这些差异。最后,我们通过迭代模型研究政策反事实,评估法律改革和社会干预的潜在影响。

英文摘要

We develop a model of choice over social norms that allows for complementarities along two dimensions: technological, analogous to complementarities between consumption goods, and social, capturing returns from conformity. Together, these determine whether two norms are complements, substitutes, or independent, as defined by how the equilibrium prevalence of one norm responds to a marginal shift in the utility of another. We estimate the model using repeated cross-sections from Sierra Leone and Nigeria, focusing on female genital cutting, polygyny, and child marriage. Social returns are significant across all specifications. For female genital cutting and child marriage, we find evidence of complementarities, especially strong in Sierra Leone. For polygyny and child marriage, we find evidence of social substitutability, particularly in Nigeria. We interpret these differences using insights from anthropology. Finally, we iterate the model forward to study policy counterfactuals, assessing the potential effects of legal reforms and social interventions.

2605.15403 2026-05-18 cs.LG math.OC stat.ML

$ϕ$-Balancing for Mixture-of-Experts Training

$ϕ$-平衡用于专家混合训练

Lizhang Chen, Jonathan Li, Qi Wang, Runlong Liao, Shuozhe Li, Chen Liang, Ni Lao, Qiang Liu

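AI summary: This paper proposes the $ϕ$-balancing framework, which achieves population-level expert balance by minimizing a strictly convex, symmetric, and differentiable potential, outperforms existing heuristic methods, and yields more stable and effective expert utilization in large-scale pretraining and downstream fine-tuning.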

详情
AI中文摘要

混合专家(MoE)模型依赖于平衡的专家利用以充分发挥其可扩展性。然而,现有负载平衡方法大多是启发式的,并基于嘈杂的小批量分配统计,引入了相对于群体目标的偏差。我们提出$ϕ$-平衡,一个原则性的框架,通过最小化预期路由分布的严格凸、对称且可导的潜在函数,直接针对群体层面的专家平衡。利用凸对偶性,我们推导出等价的min-max形式,并通过镜像下降法获得一个简单的在线算法,得到一个高效的EMA基于路由调整,具有可忽略的开销。在大规模预训练和下游微调中,$ϕ$-平衡一致优于先前的Switch式和无损失基线,展示了更稳定和有效的专家利用。

英文摘要

Mixture-of-Experts (MoE) models rely on balanced expert utilization to fully realize their scalability. However, existing load-balancing methods are largely heuristic and operate on noisy mini-batch assignment statistics, introducing bias relative to population-level objectives. We propose $ϕ$-balancing, a principled framework that directly targets population-level expert balance by minimizing a strictly convex, symmetric, and differentiable potential of the expected routing distribution. Using convex duality, we derive an equivalent min-max formulation and obtain a simple online algorithm via mirror descent, yielding an efficient EMA-based routing adjustment with negligible overhead. Across large-scale pretraining and downstream fine-tuning, $ϕ$-balancing consistently outperforms prior Switch-style and loss-free baselines, demonstrating more stable and effective expert utilization.

2605.15394 2026-05-18 cs.LG cs.AI stat.ML

Representation Without Reward: A JEPA Audit for LLM Fine-Tuning

Biswa Sengupta

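AI summary This paper examines whether the JEPA architecture learns more effective representations in a reward-free setting. It tests a variety of auxiliary terms on a natural-language-to-regex generation task and finds that some auxiliaries are significant under particular statistical tests, but the overall effect is not significant.

Details
AI abstract (translated from Chinese)

Joint-embedding predictive architectures (JEPAs) propose that a model should learn more useful abstractions when it is trained to predict latent representations rather than observed outputs. For autoregressive language-model fine-tuning, this principle implies that the induced hidden-state geometry must reach the language-model head and improve the decoded task metric. We test this on natural-language-to-regex generation under a fixed Llama-3.2-1B-Instruct LoRA harness, comparing 22 training-time auxiliaries, including trajectory-shape regularization, distributional constraints, predictor/target asymmetry, Fisher-metric Jacobi residuals, and a decoder-visible JEPA objective that lies in the positive cone of cross-entropy. The empirical result is a structured null: several auxiliaries are significant at single-cell paired α=0.10 (the strongest is T3-Local, with Δ=+2.53 pp, p=0.003), but none passes Bonferroni or Holm-Bonferroni correction. Decoder-visible JEPA produces the first positive auxiliary-cross-entropy gradient cosine in the study, yet exact match stays within seed noise. In a five-seed full-fine-tuning replication, the same auxiliary reproduces the null on both benchmarks (TURK: Δ=+0.04 pp, p_paired=0.96; SYNTH: Δ=+0.52 pp, p_paired=0.28), so the null is robust across LoRA and full fine-tuning for the decoder-visible construction. Hidden-state representation and decoded-task accuracy are therefore only weakly coupled in this regime. We accordingly reframe JEPA evaluation in the LLM domain as a coupling problem, where the central question is under which metrics useful hidden geometry becomes decoder-visible task signal.

English abstract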

Joint-embedding predictive architectures (JEPAs) propose that a model should learn more useful abstractions when trained to predict latent representations rather than observed outputs. For autoregressive language-model fine-tuning the principle entails a stricter requirement: the induced hidden-state geometry must reach the language-model head \emph{and} improve the decoded task metric. We test that requirement under a fixed Llama-3.2-1B-Instruct LoRA harness on natural-language-to-regex generation, comparing twenty-two training-time auxiliaries across trajectory-shape regularisation, distributional constraints, predictor/target asymmetry, Fisher-metric Jacobi residuals, and a decoder-visible JEPA objective constructed to lie in cross-entropy's positive cone. The empirical answer is a structured null: several auxiliaries clear single-cell paired $α= 0.10$ without correction (T3-Local at $Δ= +2.53$~pp, $p = 0.003$ being the strongest), but none survives Bonferroni or Holm--Bonferroni at the relevant family-wise threshold, even though many change curvature, anisotropy, variance, and gradient direction. Decoder-visible JEPA yields the first positive auxiliary--cross-entropy gradient cosine in the study, yet exact match remains inside seed noise; a full-fine-tuning replication of the same auxiliary at $n = 5$ seeds reproduces the null on both benchmarks (TURK: $Δ= +0.04$~pp, $p_{\text{paired}} = 0.96$; SYNTH: $Δ= +0.52$~pp, $p_{\text{paired}} = 0.28$), so the null is robust across LoRA and full fine-tuning for the decoder-visible construction. Hidden-state representation work and decoded-task accuracy are therefore weakly coupled in this regime; we accordingly reframe LLM-domain JEPA evaluation as a coupling problem, in which the operative question is under which metrics useful hidden geometry becomes decoder-visible task signal.
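
The multiple-comparison step mentioned in the abstract can be illustrated with a minimal sketch of the Bonferroni and Holm-Bonferroni procedures. The p-values and the family size of five below are hypothetical and are not taken from the paper; the function names are illustrative only.

```python
def bonferroni_reject(p_values, alpha):
    """Single-step Bonferroni: reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha):
    """Step-down Holm procedure: compare the k-th smallest p-value
    (k = 0, 1, ...) with alpha / (m - k) and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for k, i in enumerate(order):
        if p_values[i] > alpha / (m - k):
            break
        reject[i] = True
    return reject


# Hypothetical p-values for five auxiliaries (NOT the paper's numbers).
p_vals = [0.012, 0.030, 0.20, 0.011, 0.50]
alpha = 0.05
uncorrected = [p <= alpha for p in p_vals]   # three pass without correction
bonf = bonferroni_reject(p_vals, alpha)      # none pass at alpha / 5 = 0.01
holm = holm_reject(p_vals, alpha)            # none pass: smallest p is 0.011 > 0.01
```

In this toy family, three tests clear the uncorrected threshold while none survives either correction, which is the qualitative pattern the abstract calls a structured null.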

2605.15373 2026-05-18 stat.ME

Nonparametric inference for sublevel-set probabilities of conditional average treatment effect functions

Anders Munch, Thomas A. Gerds

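AI summary This paper studies sublevel-set probabilities of the conditional average treatment effect function and proposes a nonparametric estimation approach that combines monotone function estimation with machine learning, developing a Grenander-type estimator for visualizing heterogeneity.

Details
AI abstract (translated from Chinese)

The average treatment effect can mask important differences in how individuals respond to a treatment. Although the conditional average treatment effect (CATE) function captures this heterogeneity, it is hard to communicate when it depends on many covariates. Sublevel sets of a multivariate CATE function are equally complicated, but the probability of such a set is a single number: the proportion of individuals whose expected treatment effect does not exceed a prespecified threshold. Varying the threshold yields a univariate monotone curve that can be used to visualize the type and degree of heterogeneity in a population. This paper takes the curve as the target parameter and shows that it is not pathwise differentiable under a nonparametric model. To handle this nonstandard estimation problem, the paper draws on recent advances in monotone function estimation and develops a Grenander-type estimator that incorporates machine learning. It also shows that the best piecewise linear approximation to the target curve is a pathwise differentiable parameter and develops a debiased machine learning estimator of that approximation. The finite-sample performance of the proposed estimators is examined in a series of numerical studies based on data synthesized from a randomized trial. The methods are illustrated on data from a randomized trial of diabetes medication.

English abstract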

The average treatment effect can obscure important heterogeneity when individuals respond differently to a treatment. While the conditional average treatment effect (CATE) function captures such heterogeneity, it is difficult to communicate when it depends on many covariates. Sublevel sets of a multivariate CATE function are equally complicated objects, but the probability of a sublevel set of a CATE function is a single number with a simple interpretation as the proportion of individuals whose expected treatment effect does not exceed a prespecified threshold. By varying the threshold, a univariate monotone curve appears which can be used to visualize the overall type and degree of heterogeneity in a population. We formalize this curve as a target parameter and show that it is not pathwise differentiable under a nonparametric model. To address this nonstandard estimation problem, we leverage recent advances in monotone function estimation and develop a Grenander-type estimator that incorporates machine learning. We also show that the best piecewise linear approximation to the curve of interest is a pathwise differentiable parameter, and we develop a debiased machine learning estimator of this approximation. We investigate our proposed estimators' finite sample performance in a sequence of numerical studies based on data synthesized from a randomized trial. The methods are illustrated in data from a randomized trial on diabetes medication.
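
To make the target curve concrete, here is a naive plug-in sketch: given CATE values for a sample of individuals, it returns the fraction at or below each threshold. This is not the Grenander-type or debiased estimator developed in the paper and carries none of its inferential guarantees; the CATE values and threshold grid are hypothetical.

```python
def sublevel_probability(cate_values, threshold):
    """Fraction of individuals whose (estimated) CATE is <= threshold."""
    return sum(1 for v in cate_values if v <= threshold) / len(cate_values)


def sublevel_curve(cate_values, thresholds):
    """Evaluate the monotone curve t -> P(CATE(X) <= t) on a grid."""
    return [sublevel_probability(cate_values, t) for t in thresholds]


# Hypothetical CATE estimates for eight individuals (illustration only).
cate_hat = [-0.4, -0.1, 0.0, 0.2, 0.2, 0.5, 0.9, 1.3]
grid = [-0.5, 0.0, 0.5, 1.5]
curve = sublevel_curve(cate_hat, grid)  # [0.0, 0.375, 0.75, 1.0]
```

The curve is non-decreasing in the threshold by construction, which is the monotonicity the abstract exploits.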

2605.15356 2026-05-18 math.NA cs.NA stat.ML

Proposal-Guided Greedy Surrogate Refinement for PDE-Driven High-Dimensional Rare-Event Estimation

Zhiwei Gao, George Karniadakis

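AI summary This paper proposes a surrogate-assisted adaptive importance sampling framework that refines the surrogate model locally, improving the accuracy of high-dimensional rare-event simulation while reducing the number of high-fidelity evaluations.

Details
AI abstract (translated from Chinese)

In PDE-driven high-dimensional rare-event simulation, building an accurate surrogate model is challenging because performance evaluations are expensive. Since a globally accurate surrogate may require many high-fidelity evaluations, adaptive importance sampling offers a natural localization tool: its evolving proposal distribution progressively identifies the failure-relevant region. Motivated by this, we propose a surrogate-assisted adaptive importance sampling framework that refines the surrogate locally along the evolving proposal rather than over the entire input space. The surrogate combines an encoder with a neural network and provides a low-dimensional latent representation for prediction and sample selection. At each adaptive iteration, candidates drawn from the current proposal are selected by a greedy latent-space rule that balances proximity to the estimated failure boundary against sample diversity. The selected samples are evaluated with the high-fidelity model and used to refine the surrogate, which then guides the next cross-entropy-type adaptive proposal update. We establish one-step proposal stability bounds under local surrogate error, along with bounds on surrogate-induced misclassification and finite-sample estimation error. Numerical experiments on multimodal benchmarks and PDE-driven rare-event problems in up to 100 dimensions show that the method matches the accuracy of true-model adaptive importance sampling while substantially reducing the number of high-fidelity evaluations.

English abstract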

Accurate surrogate construction for PDE-driven high-dimensional rare-event simulation is challenging when performance evaluations are expensive. Since a globally accurate surrogate may require many high-fidelity evaluations, adaptive importance sampling provides a natural localization tool: its evolving proposal distribution progressively identifies the failure-relevant region. Motivated by this observation, we propose a surrogate-assisted adaptive importance sampling framework that refines the surrogate locally along the evolving proposal, rather than over the entire input space. The surrogate combines an encoder with a neural network, providing a low-dimensional latent representation for both prediction and sample selection. At each adaptive iteration, candidates drawn from the current proposal are selected by a greedy latent-space rule balancing proximity to the estimated failure boundary and sample diversity. The selected samples are evaluated by the high-fidelity model and used to refine the surrogate, which then guides the subsequent cross-entropy-type adaptive proposal update. We establish one-step proposal stability bounds under local surrogate errors, together with surrogate-induced misclassification and finite-sample estimation error bounds. Numerical experiments on multimodal benchmarks and PDE-driven rare-event problems up to 100 dimensions show that the proposed method achieves accuracy comparable to true-model adaptive importance sampling while requiring substantially fewer high-fidelity evaluations.
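
For readers unfamiliar with the importance sampling building block, the sketch below estimates a one-dimensional Gaussian tail probability with a fixed, shifted proposal. It is a toy stand-in under stated assumptions: there is no surrogate, no adaptation, and no PDE, and the function name and the choice of proposal N(threshold, 1) are illustrative rather than the paper's.

```python
import math
import random


def tail_prob_importance_sampling(threshold, n, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling from the
    shifted proposal N(threshold, 1) and reweighting each hit."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio p(x) / q(x) for N(0,1) over N(threshold,1).
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


t = 4.0
estimate = tail_prob_importance_sampling(t, n=20000)
exact = 0.5 * math.erfc(t / math.sqrt(2.0))  # closed-form Gaussian tail
```

Because the proposal is centred on the failure boundary, roughly half of the draws land in the rare region, whereas plain Monte Carlo with the same budget would expect fewer than one hit.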

2605.15306 2026-05-18 cs.LG stat.ML

How Data Augmentation Shapes Neural Representations

Tianxiao He, Alex H. Williams, Sarah E. Harvey

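AI summary This study examines how different data augmentation strategies change the geometry of a neural network's internal representations, revealing how augmentation strength relates to representation shape and how neural geometry can be applied to model ensembling.

Details
AI abstract (translated from Chinese)

Data augmentation is widely used to improve the generalization of deep networks, but its effect on the geometry of learned representations remains unclear. Using tools from shape analysis, this paper embeds network hidden representations into a metric space that is invariant to scaling, translation, rotation, and reflection. The study shows that increasing augmentation strength produces well-behaved trajectories in this space and that different augmentation types steer representations in different directions. It also investigates how the shapes of neural representations are distorted along data augmentation trajectories and shows that neural geometry can predict which representations perform best in model ensembling. The results reveal geometric patterns shared across architectures and seeds, suggesting that analyzing shape-space trajectories is a principled tool for understanding and comparing data augmentation methods.

English abstract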

Data augmentation is widely recognized for improving generalization in deep networks, yet its impact on the geometry of learned representations remains poorly understood. In this work, we characterize how different data augmentation strategies reshape internal representations in neural networks. Using tools from shape analysis, we embed network hidden representations into a metric space where distance is invariant to scaling, translation, rotation and reflection. We show that increasing augmentation strength leads to well-behaved trajectories in this space, and that different augmentation types steer representations in distinct directions. Moreover, we investigate how neural representation shapes are distorted along data augmentation trajectories, and show that insights from neural geometry can predict which representations provide the most improvement when ensembling models. Our results reveal shared geometric patterns across architectures and seeds, and suggest that analyzing shape-space trajectories offers a principled tool for understanding and comparing data augmentation methods.
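
One common metric with the invariances listed in the abstract is the angular Procrustes distance. The sketch below assumes that choice; the abstract does not say which shape metric the authors use, and the function name and random test data are hypothetical.

```python
import numpy as np


def procrustes_distance(X, Y):
    """Angular Procrustes distance between two (n_samples, n_features)
    representations; invariant to translation, isotropic scaling,
    rotation, and reflection of either argument."""
    Xc = X - X.mean(axis=0)              # remove translation
    Yc = Y - Y.mean(axis=0)
    Xc = Xc / np.linalg.norm(Xc)         # remove scale (Frobenius norm)
    Yc = Yc / np.linalg.norm(Yc)
    # The sum of singular values equals the best alignment over all
    # orthogonal maps (rotations and reflections).
    s = np.linalg.svd(Xc.T @ Yc, compute_uv=False)
    return float(np.arccos(np.clip(s.sum(), -1.0, 1.0)))


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Qm, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # random orthogonal matrix
Y = 3.0 * X @ Qm + 5.0                          # same shape, different pose
Z = rng.normal(size=(50, 4))                    # unrelated representation
d_same = procrustes_distance(X, Y)
d_diff = procrustes_distance(X, Z)
```

Here `d_same` is zero up to rounding, while `d_diff` is clearly positive, so distances in this space reflect only differences in shape.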

2605.15303 2026-05-18 stat.ME

Functional Cox model for interval-censored data

Yangjianchen Xu, Peijun Sang

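AI summary This paper proposes a functional Cox model for analyzing how functional covariates affect event times in interval-censored data, estimates its parameters by penalized maximum likelihood with an EM algorithm, and constructs a global test.

Details
AI abstract (translated from Chinese)

Interval-censored data are common in scientific studies, where the event of interest is known only to have occurred within a specific time interval. Functional covariates, such as continuous curves or spatial profiles, are increasingly common in such studies, and understanding how their trajectories affect the event time is of considerable scientific interest. This paper uses a functional Cox model to study the effects of scalar and functional covariates on interval-censored event times. We consider penalized maximum likelihood estimation for the model and devise an EM algorithm that computes the parameter estimates stably. The resulting estimators of the regression parameters and of linear functionals of the coefficient function are shown to be consistent and asymptotically normal, with limiting covariance matrices that attain the semiparametric efficiency bound and can be estimated by the profile likelihood method. Building on these results, we construct a global test for the overall effect of the functional covariate. Finally, we evaluate the proposed methods in extensive simulation studies and present an application to data from the Alzheimer's Disease Neuroimaging Initiative.

English abstract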

Interval-censored data arise frequently in scientific studies, where the event of interest is known only to occur within a specific time interval. In such studies, functional covariates taking the form of continuous curves or spatial profiles are increasingly encountered, and it is of substantial scientific relevance to investigate how the trajectory of a functional covariate affects the event time. We formulate the effects of both scalar and functional covariates on the interval-censored event time through a functional Cox model. We consider penalized maximum likelihood estimation for this model and devise an EM algorithm to stably compute the parameter estimators. The resulting estimators for the regression parameters and linear functionals of the coefficient function are shown to be consistent and asymptotically normal, with limiting covariance matrices that attain the semiparametric efficiency bound and can be readily estimated through the profile likelihood method. Building upon these results, we construct a global test for the overall effect of the functional covariate. Finally, we assess the performance of the proposed methods through extensive simulation studies and present an application to data from the Alzheimer's Disease Neuroimaging Initiative.

2605.15291 2026-05-18 stat.AP

BaySC: Uncovering Tissue Architecture in Spatial Multi-Omics via Probabilistic Spatial Clustering

Xin Li, Xiaofei Dong, Zhenke Duan, Lulu Shang, Xiao Wang, Xinyuan Song, Hanwen Ning, Guanyu Hu

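AI summary BaySC integrates multi-omics data in a probabilistic spatial clustering framework, learns the number of spatial domains automatically, and uses a Markov random field to enforce local spatial coherence, which improves preservation of spatial topography.

Details
AI abstract (translated from Chinese)

Spatial domain identification requires jointly modeling molecular signatures and physical coordinates, but existing tools often over-smooth biological boundaries, require a user-specified number of clusters, and lack a principled approach to integration. We propose BaySC, an integrative Bayesian spatial clustering framework. BaySC learns the number of spatial domains automatically through a Mixture of Finite Mixtures (MFM) prior. Tissue topology is modeled by a Markov Random Field (MRF) applied to discrete cellular assignments, a strategy that enforces local spatial coherence without distorting gene expression features. This allows BaySC to map contiguous tissue layers accurately, as well as geographically scattered but transcriptionally identical cell populations. BaySC also performs weighted log-likelihood fusion via Gibbs sampling, assigning an interpretable weight to each modality so that users can quantify the biological relevance of different data layers to the final tissue map. Validated on ten single-modal spatial transcriptomics datasets and two spatial multi-omics datasets, BaySC produces highly interpretable probabilistic outputs. It is competitive on standard clustering metrics and consistently outperforms existing tools on the spatially-aware Adjusted Rand Index (spARI).

English abstract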

Spatial domain identification requires jointly modeling molecular signatures and physical coordinates, yet current tools frequently over-smooth biological boundaries, require user-specified cluster numbers, and lack principled multimodal integration. We introduce BaySC, an integrative Bayesian spatial clustering framework for spatial domain identification. BaySC inherently learns the true number of spatial domains from the data by employing a Mixture of Finite Mixtures (MFM) prior. Tissue topology is modeled via a Markov Random Field (MRF) applied to discrete cellular assignments, a strategy that enforces local spatial coherence without distorting the underlying gene expression features. This enables BaySC to accurately map contiguous tissue layers as well as geographically scattered, transcriptionally identical cell populations. Furthermore, BaySC handles spatial multi-omics data through a weighted log-likelihood fusion mechanism executed via Gibbs sampling. This approach assigns interpretable weights to each modality, allowing users to quantify the biological relevance of different data layers to the final tissue map. Validated across ten single-modal spatial transcriptomics and two spatial multi-omics datasets, BaySC yields highly interpretable probabilistic outputs. It demonstrates competitive accuracy on standard clustering metrics and consistently outperforms existing tools in preserving spatial topography, as measured by spatially-aware Adjusted Rand Index (spARI).

2605.15278 2026-05-18 math.ST stat.TH

Intrinsic-dimension empirical Bernstein inequalities for bounded self-adjoint operators

Diego Martinez-Taboada, Aaditya Ramdas

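AI summary This paper studies empirical Bennett and Bernstein inequalities for sums of bounded self-adjoint operators, addressing two practical limitations of standard operator-valued concentration inequalities: their strong reliance on a priori knowledge of the variance and their dependence on the ambient dimension. The authors propose a fully data-driven approach that replaces the unknown variance with an empirical estimate and works with the intrinsic dimension of the operators rather than the ambient dimension, giving sharper concentration bounds that also apply in infinite-dimensional spaces. The bounds perform better for non-isotropic random matrices and achieve the best known asymptotic sharpness in their theoretical guarantees.

Details
English abstract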

Operator-valued concentration inequalities are foundational to the analysis of modern high-dimensional statistics and randomized algorithms. However, standard oracle bounds are frequently limited in practice: they require explicit a priori knowledge of the true variance, and often explicitly scale with the ambient dimension, rendering them vacuous for infinite-dimensional or heavily structured operators. Motivated by these challenges, we establish the first empirical Bennett and Bernstein inequalities for sums of independent, bounded, compact self-adjoint operators. Our fully data-driven bounds replace the unknown variance with an empirical estimate and rely strictly on the intrinsic dimension rather than the ambient dimension. This structural shift yields computable, dimension-free guarantees that are strictly sharper for non-isotropic random matrices and seamlessly extend to infinite-dimensional Hilbert spaces. We demonstrate that our empirical bounds achieve asymptotic sharpness with the best known oracle rates. Finally, as an independent byproduct, we derive novel empirical concentration guarantees for the intrinsic dimension itself.
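
The abstract does not define the intrinsic dimension; the sketch below assumes the definition that is standard in the matrix concentration literature, the trace of a positive semidefinite matrix divided by its operator norm. The function name and example matrices are illustrative.

```python
import numpy as np


def intrinsic_dimension(A):
    """trace(A) / ||A||_op for a symmetric positive semidefinite matrix A.
    The value lies between 1 and rank(A)."""
    eig = np.linalg.eigvalsh(A)
    return float(eig.sum() / eig.max())


# A 100 x 100 matrix whose spectrum decays quickly: the ambient
# dimension is 100 but the intrinsic dimension is only 1.75.
decaying = np.diag([1.0, 0.5, 0.25] + [0.0] * 97)
d_decaying = intrinsic_dimension(decaying)

# An isotropic matrix is the worst case: intrinsic = ambient dimension.
d_identity = intrinsic_dimension(np.eye(100))
```

The gap between 1.75 and 100 is the kind of saving an intrinsic-dimension bound can offer for non-isotropic matrices, as the abstract describes.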

2605.15240 2026-05-18 stat.ML cs.LG

On Kernel Eigen-alignments of KRR: Reconstruction and Generalization

Yang Liu, Ernest Fokoue, Richard Lange, Daniel Krutz

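AI summary This paper studies the key role that eigen-alignment between the kernel matrix and the learning targets plays in robust generalization, and establishes a direct link between the generalization performance of kernel methods and the estimation of matrix eigenvectors and eigenvalues. By analyzing how perturbations of the kernel matrix affect predictions, the authors derive an upper bound on the generalization error in terms of the estimation stability of eigenvalues and eigenvectors, and point out that for high-rank kernels the reconstruction error has limited predictive power for generalization. The study gives a new generalization bound from the perspective of eigenvalue estimation, showing that strong generalization requires stronger eigenvector alignment, larger eigenvalue magnitudes, or larger gaps between consecutive eigenvalues.

Details
English abstract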

This paper investigates the critical role of eigenalignments between the kernel matrix and learning targets in achieving robust generalization in learning problems. We establish a direct connection between generalization performance in kernel methods and the estimation of eigenvectors and eigenvalues of matrices, offering a more intuitive understanding compared to prior work with minimal assumptions. We also show that, since the prediction task in KRR is essentially the weighted sum of eigenvectors/singular vectors, by analyzing how much error can be caused by perturbations to the kernel matrix, we can then derive a bound on this generalization error using the estimation stability of matrix eigenvalues and eigenvectors. Compared with previous work, our analysis concentrates on finite-sample settings and on the generalization error arising from having a suboptimal finite training set. Our findings reveal that in kernel methods, as long as the kernel is of high rank, the near-zero reconstruction error can be trivially obtained, implying that the reconstruction error will have limited predictive power for generalization. Finally, we establish a generalization bound from an eigenvalues/eigenvectors estimation perspective, showing that strong generalization requires increasing eigenvector alignment, eigenvalue magnitude, or gaps between consecutive eigenvalues.
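
The statement that KRR prediction is a weighted sum of eigenvectors can be checked numerically on the training set. The sketch below assumes an RBF kernel, synthetic data, and the ridge convention (K + lam I); none of these choices come from the paper.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
lam = 0.1                                   # assumed ridge parameter

K = rbf_kernel(X)
# Direct KRR fitted values: K (K + lam I)^{-1} y.
direct = K @ np.linalg.solve(K + lam * np.eye(30), y)

# Spectral form: each eigenvector u_i contributes with weight
# lambda_i / (lambda_i + lam) times its alignment u_i^T y with the target.
evals, U = np.linalg.eigh(K)
alignment = U.T @ y
spectral = U @ (evals / (evals + lam) * alignment)
```

The two vectors agree up to rounding, and the spectral form shows why the alignment between eigenvectors and targets, together with eigenvalue magnitude, controls the fit.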

2605.15234 2026-05-18 math.NA cs.NA math.SP math.ST stat.CO stat.TH

Sampling pseudospectrum for data-driven matrices

Caroline Wormell

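AI summary Many complex systems can be reduced to their key components by spectrally decomposing matrices that capture their dynamics; these matrices are usually built from data, for example by least-squares fitting. Existing methods, however, cannot easily tell whether the resulting discrete eigenvalues are errors caused by finite data or true features of the system. This paper proposes a sampling pseudospectrum $P(λ)$ that provides probabilistic information on the behaviour of finite-data eigenvalues in the complex plane, together with an estimator $\hat P(λ)$ that can be obtained by reprocessing the data. The estimator is computationally efficient and allows statistical testing for the location of the true eigenvalues, giving a rigorous and general way to judge whether patterns extracted from finite data are signal or noise.

Comments 30 pages. Comments welcome

Details
English abstract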

Many complex systems can be reduced to their key components through spectrally decomposing matrices that capture their dynamics. These matrices can in turn be constructed from data, often by least-squares fitting: examples of algorithms to do this include Dynamical Mode Decomposition and variants, subspace identification and eigenvalue realisation algorithms. Typical outputs of these algorithms include a range of isolated, peripheral eigenvalues capturing persistent emergent patterns in the system. However, there is no objective way to assess which of these discrete eigenvalues are artefacts of finite data error, and which are reflections of a fully sampled operator. In this paper, we present a sampling pseudospectrum $P(λ)$ that provides probabilistic information on the behaviour of finite-data eigenvalues in the complex plane, and an estimator $\hat P(λ)$, which can be obtained by reprocessing our finite data sample. The estimator, which is computationally efficient to implement, allows us to test statistically for the location of the true eigenvalues. This gives us a rigorous and very general way to assess whether the patterns we extract from finite data are likely to be signal or noise.

2605.05179 2026-05-18 cs.LG cond-mat.dis-nn stat.ML

Estimating the expected output of wide random MLPs more efficiently than sampling

Wilson Wu, Victor Lecomte, Michael Winer, George Robinson, Jacob Hilton, Paul Christiano

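AI summary This paper proposes a method, more efficient than sampling, for estimating the expected output of a wide random multilayer perceptron (MLP) at initialization under Gaussian inputs. The method builds approximate distributions of the activations at each layer using tools such as cumulants and Hermite expansions, avoiding the cost of running inputs through the network one at a time as in conventional sampling. Experiments show that, for a given mean squared error, the method substantially reduces computation; it performs especially well at estimating the probabilities of rare events and in model training, suggesting a new route to reducing tail risk in models.

Comments 68 pages. Code is available at https://github.com/alignment-research-center/mlp_cumulant_propagation

Details
English abstract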

By far the most common way to estimate an expected loss in machine learning is to draw samples, compute the loss on each one, and take the empirical average. However, sampling is not necessarily optimal. Given an MLP at initialization, we show how to estimate its expected output over Gaussian inputs without running samples through the network at all. Instead, we produce approximate representations of the distributions of activations at each layer, leveraging tools such as cumulants and Hermite expansions. We show both theoretically and empirically that for sufficiently wide networks, our estimator achieves a target mean squared error using substantially fewer FLOPs than Monte Carlo sampling. We find moreover that our methods perform particularly well at estimating the probabilities of rare events, and additionally demonstrate how they can be used for model training. Together, these findings suggest a path to producing models with a greatly reduced probability of catastrophic tail risks.

2604.15598 2026-05-18 nlin.CG q-bio.QM stat.AP

When do trajectories matter? Identifiability analysis for stochastic transport phenomena

Matthew J Simpson, Michael J Plank

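AI summary This study examines how trajectory data affect the identifiability of model parameters in stochastic diffusion models. Combining agent-based simulation, partial differential equation approximations, likelihood-based estimation, and identifiability analysis, it finds that using count data alone can lead to structural non-identifiability, whereas adding individual trajectory data can improve the accuracy of parameter estimates. The study also analyzes how different experimental designs affect parameter identifiability and provides open-source code for further use.

Comments 7 Figures

Details
English abstract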

Stochastic models of diffusion are routinely used to study dispersal of populations, including populations of animals, plants, seeds and cells. Advances in imaging and field measurement technologies mean that data are often collected across a range of scales, including count data collected across a series of fixed sampling regions to characterize population-level dispersal, as well as individual trajectory data to examine the motion of individuals within a diffusive population. In this work we consider a lattice-based random walk model and examine the extent to which model parameters can be determined by collecting count data and/or trajectory data. Our analysis combines agent-based stochastic simulations, mean-field partial differential equation approximations, likelihood-based estimation, identifiability analysis, and model-based prediction. These combined tools reveal that working with count data alone can sometimes lead to challenges involving structural non-identifiability that can be alleviated by collecting trajectory data. Furthermore, these tools allow us to explore how different experimental designs impact inferential precision by comparing how different trajectory data collection protocols affect practical identifiability. Open source implementations of all algorithms used in this work are available on GitHub.

2604.13137 2026-05-18 stat.CO math.NT math.ST stat.TH

$p$-adic Linear Regression for Random Sampling with Digitwise Noise

Tomoki Mihara

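AI summary This paper proposes a new probabilistic algorithm for $p$-adic linear regression that handles random sampling with digitwise noise. The method includes a new probabilistic algorithm for modulo $p$ linear regression, which can estimate regression parameters more accurately in noisy settings. The main contribution is to bring $p$-adic analysis to statistical regression, providing new theoretical tools and computational methods for high-noise data.

Details
English abstract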

We propose a new probabilistic algorithm of $p$-adic linear regression for random sampling with digitwise noise. This includes a new probabilistic algorithm of modulo $p$ linear regression.

2602.16274 2026-05-18 cs.LG stat.ML

Regret and Sample Complexity of Online Q-Learning via Concentration of Stochastic Approximation with Time-Inhomogeneous Markov Chains

Rahul Singh, Siddharth Chandak, Eric Moulines, Vivek S. Borkar, Nicholas Bambos

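AI summary This paper gives the first regret bound for classical online Q-learning in infinite-horizon discounted Markov decision processes, without relying on optimism or bonus terms. It analyzes Boltzmann Q-learning with decaying temperature and proposes a smoothed exploration scheme that combines ε_n-greedy with Boltzmann exploration, proving a regret bound that is robust to the suboptimality gap, with an upper bound of roughly O(N^{9/10}). The authors also give high-probability sample complexity guarantees and develop a high-probability concentration bound for contractive Markovian stochastic approximation, a result of independent interest.

Details
English abstract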

We present the first regret bound for classical online Q-learning in infinite-horizon discounted Markov decision processes (MDPs), without relying on optimism or bonus terms. We first analyze Boltzmann Q-learning with decaying temperature and show that its regret depends critically on the suboptimality gap of the MDP: for sufficiently large gaps, the regret is sublinear, while for small gaps it deteriorates and can approach linear growth. To address this limitation, we study a Smoothed $ε_n$-Greedy exploration scheme that combines $ε_n$-greedy and Boltzmann exploration, for which we prove a gap-robust regret bound of near-$\tilde{O}(N^{9/10})$. We also obtain sample complexity guarantees, with both regret and sample complexity bounds holding with high probability. To analyze these algorithms, we develop a high-probability concentration bound for contractive Markovian stochastic approximation with iterate- and time-dependent transition dynamics. This bound may be of independent interest as the contraction factor in our framework is allowed to converge to one asymptotically.
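
The abstract does not spell out the Smoothed $ε_n$-Greedy rule, so the sketch below only illustrates its two ingredients. The softmax (Boltzmann) distribution over Q-values is standard; mixing it with a uniform distribution is an assumed form for illustration, not necessarily the paper's scheme, and all names and numbers are hypothetical.

```python
import math


def boltzmann_probs(q_values, temperature):
    """Softmax over Q-values; lower temperature means greedier actions."""
    m = max(q_values)  # subtract the max for numerical stability
    w = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(w)
    return [x / z for x in w]


def mixed_probs(q_values, temperature, epsilon):
    """ASSUMED combination: with probability epsilon act uniformly at
    random, otherwise sample from the Boltzmann distribution."""
    k = len(q_values)
    return [(1.0 - epsilon) * p + epsilon / k
            for p in boltzmann_probs(q_values, temperature)]


q = [1.0, 2.0, 0.5]                       # hypothetical Q-values in one state
probs = mixed_probs(q, temperature=0.5, epsilon=0.1)
cold = boltzmann_probs(q, temperature=0.01)
```

With the uniform component, every action keeps probability at least epsilon / k regardless of how close or far apart the Q-values are, which is one way exploration can be made insensitive to the suboptimality gap.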

2602.14342 2026-05-18 math.ST cs.DS cs.LG math.PR stat.TH

High-accuracy log-concave sampling with stochastic queries

Fan Chen, Sinho Chewi, Constantinos Daskalakis, Alexander Rakhlin

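AI summary This paper studies how to obtain high-accuracy guarantees for log-concave sampling, showing that stochastic gradients with subexponential tails suffice for iteration and query complexities that scale as $\mathrm{poly}\log(1/δ)$. This contrasts with convex optimization, where stochasticity in the gradient requires $\mathrm{poly}(1/δ)$ queries. The study also argues from an information-theoretic standpoint that light-tailed stochastic gradients are necessary for high-accuracy sampling, and gives improved complexity results for stochastic zeroth-order queries and for sampling from finite-sum potentials.

Details
English abstract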

We show that high-accuracy guarantees for log-concave sampling -- that is, iteration and query complexities which scale as $\mathrm{poly}\log(1/δ)$, where $δ$ is the desired target accuracy -- are achievable using stochastic gradients with subexponential tails. Notably, this exhibits a separation with the problem of convex optimization, where stochasticity (even additive Gaussian noise) in the gradient oracle incurs $\mathrm{poly}(1/δ)$ queries. We also give an information-theoretic argument that light-tailed stochastic gradients are necessary for high accuracy: for example, in the bounded variance case, we show that the minimax-optimal query complexity scales as $Θ(1/δ)$. Our framework also provides similar high accuracy guarantees under stochastic zeroth order (value) queries, and an improved complexity result for sampling from finite-sum potentials.

2601.23030 2026-05-18 stat.ML cs.LG stat.ME

Neural Backward Filtering Forward Guiding

Gefan Yang, Frank van der Meulen, Stefan Sommer

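AI summary This paper proposes Neural Backward Filtering Forward Guiding (NBFFG), a unified framework for inference in nonlinear continuous stochastic processes on trees, especially when observations are sparse and the topology is complex. The method constructs a proxy linear-Gaussian process whose closed-form backward filter guides the generative path toward high-likelihood regions, and uses a neural residual to capture nonlinear discrepancies; this enables unbiased pathwise subsampling and substantially lowers training complexity. Experiments show that NBFFG outperforms existing methods on synthetic datasets and on a high-dimensional phylogenetic analysis task.

Details
English abstract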

Inference in nonlinear continuous stochastic processes on trees is challenging, particularly when observations are sparse and the topology is complex. Exact smoothing via Doob's $h$-transform is intractable for general nonlinear dynamics. We propose Neural Backward Filtering Forward Guiding (NBFFG), a unified framework for both discrete transitions and continuous diffusions. Our method constructs a variational posterior by leveraging a proxy linear-Gaussian process. This proxy process yields a closed-form backward filter that serves as a guide, steering the generative path toward high-likelihood regions. We then learn a neural residual to capture the non-linear discrepancies. This formulation allows for an unbiased pathwise subsampling scheme, reducing the training complexity from tree-size dependent to path-length dependent. Empirical results show that NBFFG outperforms baselines on synthetic benchmarks, and we demonstrate the method on a high-dimensional inference task in phylogenetic analysis with reconstruction of ancestral butterfly wing shapes.

2601.21294 2026-05-18 cs.LG stat.ML

Missing-Data-Induced Phase Transitions in Spectral PLS for Multimodal Learning

Anders Gjølbye, Ida Kargaard, Emma Kargaard, Lina Skerath, Lars Kai Hansen

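AI summary This paper studies how missing data affect the performance of spectral partial least squares (PLS) in multimodal learning. Analyzing the effect of independent entry-wise missing-completely-at-random masking on the cross-covariance matrix in a high-dimensional spiked model, it finds that missing data attenuate the signal strength and produce a BBP-type phase transition: when the signal-to-noise ratio is below a critical threshold, the leading singular vectors fail to capture the latent shared structure, while above the threshold they achieve nontrivial alignment. The study also states a finite-rank extension as a conjecture and confirms the predicted phase diagram and recovery curves in simulations and semi-synthetic experiments.

Comments Preprint

Details
English abstract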

Partial Least Squares (PLS) learns shared structure from paired data via the top singular vectors of the empirical cross-covariance (PLS-SVD), but multimodal datasets often have missing entries in both views. We study PLS-SVD under independent entry-wise missing-completely-at-random masking in a proportional high-dimensional spiked model. After appropriate normalization, the masked cross-covariance behaves like a spiked rectangular random matrix whose effective signal strength is attenuated by $\sqrtρ$, where $ρ$ is the joint entry retention probability. The replica-symmetric analysis predicts a sharp BBP-type phase transition: below a critical signal-to-noise threshold the leading singular vectors are asymptotically uninformative, while above it they achieve nontrivial alignment with the latent shared directions, with closed-form asymptotic overlap formulas. We also state a finite-rank extension as a conjecture, predicting that the same missingness-adjusted threshold applies componentwise when the latent spikes are separated. Simulations and semi-synthetic multimodal experiments agree with the predicted phase diagram and recovery curves across aspect ratios, signal strengths, and missingness levels.

2512.18250 2026-05-18 stat.ME

NMF-FFB: Non-negative matrix factorization with feedforward-feedback structure

Kenichi Satoh

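AI summary This paper proposes non-negative matrix factorization with feedforward-feedback structure (NMF-FFB) for handling relationships among endogenous variables in non-negative data. The method extends conventional NMF with a latent feedback mechanism among endogenous variables and uses a simultaneous-equation formulation to separate endogenous and exogenous pathways. NMF-FFB suits small-sample, non-negative additive data, discovers latent factors automatically, distinguishes direct effects from cumulative feedback effects, and shows good interpretability and practical results on several real datasets.

Details
English abstract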

Non-negative matrix factorization (NMF) approximates a non-negative endogenous data matrix as $Y_1 \approx XB$, with non-negative latent components $X$ and coefficients $B$. Standard covariate-aware NMF is feedforward: $B$ depends only on exogenous variables $Y_2$, with no latent feedback among endogenous variables. We propose NMF-FFB (NMF with feedforward-feedback structure), an exploratory data-fitting framework that embeds the simultaneous equation $B = Θ_1 Y_1 + Θ_2 Y_2$ in NMF, where $Θ_1$ is non-negative latent feedback and $Θ_2$ non-negative exogenous pathways. NMF-FFB is positioned within data-fitting structural equation modeling (SEM): it fits $Y_1$ directly rather than a model-implied covariance, and is not a confirmatory measurement model or a replacement for maximum-likelihood SEM under standard confirmatory factor analysis assumptions. When $ρ(XΘ_1)<1$, the reduced form $Y_1 \approx (I-XΘ_1)^{-1} XΘ_2 Y_2$ defines a latent Leontief inverse separating direct from cumulative feedback-amplified effects. Estimation uses regularized multiplicative updates with orthogonality and sparsity penalties; an $X$-fixed bootstrap summarizes uncertainty for the feedback spectral radius, the amplification ratio, and path coefficients. Unlike conventional SEM, NMF-FFB requires only the latent rank $Q$ and lets $X$ group endogenous indicators into latent factors. This suits non-negative additive data, automatic loading discovery, Leontief-type cumulative effects, and small samples where covariance-based maximum-likelihood fitting is ill-conditioned. Applications to Holzinger-Swineford, Los Angeles pollution-mortality, and Mississippi county-level health data demonstrate interpretable parts-based representations across distinct latent-feedback regimes.
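
The reduced form in the abstract can be checked numerically. The sketch below uses randomly generated non-negative matrices with assumed shapes (P endogenous variables, Q latent components, R exogenous variables, N samples); it does not implement the paper's estimation procedure, only the algebra linking $B = Θ_1 Y_1 + Θ_2 Y_2$ to the latent Leontief inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
P, Q, R, N = 4, 2, 3, 5                   # assumed sizes, illustration only
X = rng.uniform(size=(P, Q))              # non-negative latent components
Theta1 = 0.1 * rng.uniform(size=(Q, P))   # non-negative latent feedback
Theta2 = rng.uniform(size=(Q, R))         # non-negative exogenous pathways
Y2 = rng.uniform(size=(R, N))             # exogenous data

M = X @ Theta1                            # P x P feedback operator
rho = float(np.max(np.abs(np.linalg.eigvals(M))))
if rho >= 1.0:
    raise ValueError("the reduced form requires spectral radius < 1")

leontief = np.linalg.inv(np.eye(P) - M)   # latent Leontief inverse
direct = X @ Theta2 @ Y2                  # direct (feedforward-only) effect
Y1 = leontief @ direct                    # cumulative, feedback-amplified
B = Theta1 @ Y1 + Theta2 @ Y2             # simultaneous equation for B
```

Because all factors are non-negative and the spectral radius is below one, the inverse equals the series I + M + M^2 + ..., so the cumulative effect `Y1` is entrywise at least as large as the direct effect.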

2512.09673 2026-05-18 cs.LG cs.AI cs.NE stat.ML

Drawback of Enforcing Equivariance and its Compensation via the Lens of Expressive Power

Yuzhu Chen, Tian Qin, Xinmei Tian, Fengxiang He, Dacheng Tao

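AI summary This paper studies how enforcing equivariance affects the expressive power of neural networks and finds that the constraint can weaken it. By analyzing boundary hyperplanes and channel vectors, the authors demonstrate the problem constructively, show that it can be compensated for by enlarging the model, and prove upper bounds on the required enlargement. Surprisingly, the enlarged architectures have a hypothesis space of lower dimensionality, which may lead to better generalization.

Details
English abstract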

Equivariant neural networks encode the intrinsic symmetry of data as an inductive bias, which has yielded impressive performance across a wide range of domains. However, the understanding of their expressive power remains immature. Focusing on 2-layer ReLU networks, this paper investigates the impact of enforcing equivariance constraints on the expressive power. By examining the boundary hyperplanes and the channel vectors, we constructively demonstrate that enforcing equivariance constraints could undermine the expressive power. Naturally, this drawback can be compensated for by enlarging the model size -- we further prove upper bounds on the required enlargement for compensation. Surprisingly, we show that the enlarged neural architectures have reduced hypothesis space dimensionality, implying even better generalizability.

2512.00242 2026-05-18 cs.LG cs.AI cs.ET stat.ML

Polynomial Neural Sheaf Diffusion: A Spectral Filtering Approach on Cellular Sheaves

Alessio Borgi, Fabrizio Silvestri, Pietro Liò

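AI summary This paper proposes Polynomial Neural Sheaf Diffusion (PolyNSD), a new method that improves the diffusion process of sheaf neural networks on graphs. It applies a degree-K polynomial propagation operator in the normalized sheaf Laplacian, giving a K-hop receptive field independent of the stalk dimension, and models a trainable spectral response as a convex mixture of orthogonal polynomial basis responses. Compared with conventional approaches, PolyNSD keeps the model stable while lowering compute and memory requirements, and achieves new state-of-the-art results on homophilic and heterophilic graph benchmarks.

Details
English abstract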

Sheaf Neural Networks equip graph structures with a cellular sheaf: a geometric structure which assigns local vector spaces (stalks) and linear learnable restriction/transport maps to nodes and edges, yielding an edge-aware inductive bias that handles heterophily and limits oversmoothing. However, common Neural Sheaf Diffusion implementations rely on SVD-based sheaf normalization and dense per-edge restriction maps, which scale with stalk dimension, require frequent Laplacian rebuilds, and yield brittle gradients. To address these limitations, we introduce Polynomial Neural Sheaf Diffusion (PolyNSD), a new sheaf diffusion approach whose propagation operator is a degree-K polynomial in a normalised sheaf Laplacian, evaluated via a stable three-term recurrence on a spectrally rescaled operator. This provides an explicit K-hop receptive field in a single layer (independently of the stalk dimension), with a trainable spectral response obtained as a convex mixture of K+1 orthogonal polynomial basis responses. PolyNSD enforces stability via convex mixtures, spectral rescaling, and residual/gated paths, reaching new state-of-the-art results on both homophilic and heterophilic benchmarks, inverting the Neural Sheaf Diffusion trend by obtaining these results with just diagonal restriction maps, decoupling performance from large stalk dimension, while reducing runtime and memory requirements.
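
A three-term recurrence on a spectrally rescaled operator can be sketched with Chebyshev polynomials. Two assumptions are made here: the abstract does not name the orthogonal basis, so Chebyshev is a stand-in, and an ordinary normalized graph Laplacian replaces the sheaf Laplacian (it corresponds to one-dimensional stalks with identity restriction maps). The mixture weights are fixed rather than trained.

```python
import numpy as np


def chebyshev_filter(L, x, coeffs, lam_max=2.0):
    """Apply sum_k coeffs[k] * T_k(L_tilde) to x without forming powers,
    where L_tilde = 2 L / lam_max - I has spectrum in [-1, 1]."""
    L_tilde = (2.0 / lam_max) * L - np.eye(L.shape[0])
    t_prev, t_curr = x, L_tilde @ x            # T_0 x and T_1 x
    out = coeffs[0] * t_prev
    if len(coeffs) > 1:
        out = out + coeffs[1] * t_curr
    for c in coeffs[2:]:
        # T_{k+1} = 2 L_tilde T_k - T_{k-1}
        t_prev, t_curr = t_curr, 2.0 * (L_tilde @ t_curr) - t_prev
        out = out + c * t_curr
    return out


# Path graph 0-1-2-3 with its symmetric normalized Laplacian.
A = np.zeros((4, 4))
for i in range(3):
    A[i, i + 1] = A[i + 1, i] = 1.0
d = A.sum(axis=1)
L = np.eye(4) - A / np.sqrt(np.outer(d, d))

coeffs = np.array([0.5, 0.3, 0.2])   # convex mixture of K + 1 = 3 responses
x = np.array([1.0, 0.0, 0.0, 0.0])   # signal concentrated on node 0
y = chebyshev_filter(L, x, coeffs)
```

With K = 2 the signal on node 0 reaches nodes 1 and 2 but not node 3, which is three hops away, illustrating the explicit K-hop receptive field of a single layer.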

2511.17426 2026-05-18 cs.LG cs.CV stat.ML

Self-Supervised Learning by Curvature Alignment

Benyamin Ghojogh, M. Hadi Sepanj, Paul Fieguth

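AI summary This paper proposes CurvSSL, a self-supervised learning method based on curvature alignment, and its kernel-space extension, kernel CurvSSL, which aim to improve representation learning by explicitly modelling the local geometry of the data manifold. The method adds a curvature regularization term to a standard non-contrastive framework: it computes the local curvature of the embedded features, then aligns and decorrelates it across augmented views, strengthening the invariance and geometric consistency of the representation. Experiments on MNIST and CIFAR-10 show competitive or improved linear evaluation performance relative to Barlow Twins and VICReg.

Comments A shorter version of this paper has been published in: Journal of Computational Vision and Imaging Systems, Vol. 11, No. 1, Special Issue: Proceedings of CVIS 2025

Details
Journal ref
Shorter version of this paper is published in Journal of Computational Vision and Imaging Systems, Vol. 11, No. 1, Special Issue: Proceedings of CVIS 2025
English abstract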

Self-supervised learning (SSL) has recently advanced through non-contrastive methods that couple an invariance term with variance, covariance, or redundancy-reduction penalties. While such objectives shape first- and second-order statistics of the representation, they largely ignore the local geometry of the underlying data manifold. In this paper, we introduce CurvSSL, a curvature-regularized self-supervised learning framework, and its RKHS extension, kernel CurvSSL. Our approach retains a standard two-view encoder-projector architecture with a Barlow Twins-style redundancy-reduction loss on projected features, but augments it with a curvature-based regularizer. Each embedding is treated as a vertex whose $k$ nearest neighbors define a discrete curvature score via cosine interactions on the unit hypersphere; in the kernel variant, curvature is computed from a normalized local Gram matrix in an RKHS. These scores are aligned and decorrelated across augmentations by a Barlow-style loss on a curvature-derived matrix, encouraging both view invariance and consistency of local manifold bending. Experiments on MNIST and CIFAR-10 datasets with a ResNet-18 backbone show that curvature-regularized SSL yields competitive or improved linear evaluation performance compared to Barlow Twins and VICReg. Our results indicate that explicitly shaping local geometry is a simple and effective complement to purely statistical SSL regularizers.

2511.03606 2026-05-18 stat.ML cs.LG math.ST stat.TH

Vector-valued self-normalized concentration inequalities beyond sub-Gaussianity

Diego Martinez-Taboada, Tomas Gonzalez, Aaditya Ramdas

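AI summary This paper studies concentration inequalities for vector-valued self-normalized processes beyond the sub-Gaussian setting, addressing a gap in the theory for non-sub-Gaussian conditions. The authors give concentration bounds for light-tailed processes (such as Bennett- or Bernstein-type bounds), extending the reach of classical self-normalized analysis. The results are applied to online linear regression and (kernelized) linear bandits.

Details
English abstract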

The study of self-normalized processes plays a crucial role in a wide range of applications, from sequential decision-making to econometrics. While the behavior of self-normalized concentration has been widely investigated for scalar-valued processes, vector-valued processes remain comparatively underexplored, especially outside of the sub-Gaussian framework. In this contribution, we provide concentration bounds for self-normalized processes with light tails beyond sub-Gaussianity (such as Bennett or Bernstein bounds). We illustrate the relevance of our results in the context of online linear regression, with applications in (kernelized) linear bandits.

2510.20741 2026-05-18 stat.ME stat.AP

A comparison of methods for designing hybrid type 2 cluster-randomized trials with continuous effectiveness and implementation endpoints

Melody Owen, Fan Li, Ruyi Liu, Donna Spiegelman

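AI summary This paper compares five methods for designing hybrid type 2 cluster-randomized trials with continuous effectiveness and implementation endpoints, aiming to give researchers practical guidance on power analysis. Through theoretical analysis and a large-scale numerical study, it shows how power differs across methods and scenarios: the disjunctive 2-DF test tends to be most powerful when treatment effects are unequal, while the single 1-DF test tends to dominate when they are equal. The paper also introduces the R package crt2power for computing power and sample size for such trials.

Details
English abstract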

Hybrid type 2 studies are gaining popularity for their ability to assess both implementation and health outcomes as co-primary endpoints. These studies are often conducted as cluster-randomized trials (CRTs), and five design methods can validly power them: p-value adjustment methods, combined outcomes approach, single weighted 1-DF test, disjunctive 2-DF test, and conjunctive test. We compared these methods theoretically and numerically. Theoretical comparisons of power equations allowed us to identify when one method had more or less power than another globally. We showed that p-value adjustment methods are always less powerful than both the combined outcomes approach and the single 1-DF test, and identified conditions where the disjunctive 2-DF test is less powerful than the single 1-DF test. To further investigate when power advantages shift, we conducted a large-scale numerical study using our novel crt2power R package, which calculates power or sample size for CRTs with two continuous co-primary endpoints using these methods. Across 45,000 input scenarios, we found specific patterns: when treatment effects are unequal, the disjunctive 2-DF test tends to be most powerful; when treatment effects are equal, the single 1-DF test tends to dominate. Together, these comparisons offer practical guidance for powering hybrid type 2 studies.

2509.01685 2026-05-18 stat.ML cs.LG math.OC stat.CO

Preconditioned Regularized Wasserstein Proximal Sampling

Hong Ye Tan, Stanley Osher, Wuchen Li

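AI summary This paper studies sampling from a Gibbs distribution by evolving finitely many particles and proposes a preconditioned regularized Wasserstein proximal sampling method. The method approximates the score function with the numerically tractable score of a regularized Wasserstein proximal operator, whose kernel form is derived via a Cole-Hopf transformation of anisotropic heat equations. Experiments show acceleration and stability on a range of log-concave and non-log-concave distributions, Bayesian image deconvolution, and neural network training.

Details
English abstract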

We consider sampling from a Gibbs distribution by evolving finitely many particles. We propose a preconditioned version of a recently proposed noise-free sampling method, governed by approximating the score function with the numerically tractable score of a regularized Wasserstein proximal operator. This is derived by a Cole-Hopf transformation on coupled anisotropic heat equations, yielding a kernel formulation for the preconditioned regularized Wasserstein proximal. The diffusion component of the proposed method is also interpreted as a modified self-attention block, as in transformer architectures. For quadratic potentials, we provide a discrete-time non-asymptotic convergence analysis and explicitly characterize the bias, which is dependent on regularization and independent of step-size. Experiments demonstrate acceleration and particle-level stability on problems ranging from log-concave and non-log-concave toy examples to Bayesian total-variation regularized image deconvolution, and competitive or better performance on non-convex Bayesian neural network training when utilizing variable preconditioning matrices.

2506.18673 2026-05-18 math.PR math-ph math.MP math.ST stat.TH

Asymptotic Expansions of Gaussian and Laguerre Ensembles at the Soft Edge III: Generating Functions

Folkmar Bornemann

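AI summary This paper studies asymptotic expansions at the soft edge for the $n$-dimensional Gaussian and Laguerre ensembles, focusing on the structure of the gap-probability generating functions. The author shows that the correction terms of the expansion are multilinear forms in the higher-order derivatives of the leading-order term, with rational polynomial coefficients that do not depend on the generating-function variable. The same structure carries over to linearly induced quantities such as the distribution of the $k$-th largest level. For the orthogonal and symplectic ensembles the discussion rests on hypotheses, which are checked against simulation data.

Details
Journal ref
SIGMA 22 (2026), 047, 20 pages
English abstract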

We conclude our work [arXiv:2403.07628, arXiv:2503.12644] on asymptotic expansions at the soft edge for the classical $n$-dimensional Gaussian and Laguerre ensembles, now studying the gap-probability generating functions. We show that the correction terms in the asymptotic expansion are multilinear forms of the higher-order derivatives of the leading-order term, with certain rational polynomial coefficients that are independent of the dummy generating function variable. In this way, the same multilinear structure, with the same polynomial coefficients, is inherited by the asymptotic expansion of any linearly induced quantity such as the distribution of the $k$-th largest level. Whereas the results for the unitary ensembles are presented with proof, the discussion of the orthogonal and symplectic ones is based on some hypotheses. To substantiate the hypotheses, we check the result for the $k$-th largest level in the orthogonal ensembles against simulation data for choices of $n$ and $k$ that require as many as four correction terms to achieve satisfactory accuracy.

2506.12532 2026-05-18 stat.ME

Bayesian inference for the learning rate in Generalised Bayesian inference

Jeong Eun Lee, Sitong Liu, Geoff K. Nicholls

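AI summary This paper studies the estimation of the learning rate and loss-function hyperparameters in Generalised Bayesian Inference (GBI). The authors propose a Bayesian approach based on held-out data to infer posterior distributions for these hyperparameters, and define two hyperparameter posteriors, one based on an ELPPD utility and one aiming to cover the pseudo-true parameter. The framework supports joint estimation and uncertainty quantification for multiple hyperparameters. Experiments show that it outperforms standard Bayesian inference on simulated data and selects optimal or near-optimal hyperparameter values in a real text-analysis problem; it is particularly suited to combining multiple data sets.

Comments 33 pages, 7 figures, 1 Table with 32 pages of appendices including 18 further figures and 4 further tables

Details
English abstract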

In Generalised Bayesian Inference (GBI), the learning rate and hyperparameters of the loss must be estimated. These inference-hyperparameters can't be estimated jointly with the other parameters, from the data, by giving them a prior. However, in some settings there exist unknown "true" hyperparameter-values about which it is meaningful to have prior belief. It is then possible to use Bayesian inference with held-out data to get hyperparameter-posteriors. We define two hyperparameter posteriors, one based on an ELPPD-utility and one aiming to cover the pseudo-true parameter. The new framework supports estimation and uncertainty quantification for multiple hyperparameters jointly. Experiments show that the resulting GBI-posteriors out-perform Bayesian inference on simulated test data and select optimal or near optimal hyperparameter values in a large real problem of text analysis. Generalised Bayesian inference is particularly useful for combining multiple data sets and most of our examples belong to that setting. We also give asymptotic results for some of the special "multi-modular" Generalised Bayes posteriors which we use in our examples.

2506.00182 2026-05-18 stat.ML cs.IT cs.LG math.IT math.ST stat.TH

Overfitting has a limitation: a model-independent generalization gap bound based on Rényi entropy

Atsushi Suzuki, Jing Wang

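AI summary This paper studies the limits on the generalization ability of machine learning models and proposes a model-independent upper bound on the generalization gap that depends only on the Rényi entropy of the data-generating distribution. It shows that a small generalization gap can be maintained even for arbitrarily large models, provided the amount of data is sufficient relative to the Rényi entropy. The framework explains why performance degrades when noise is injected into data, and it adapts the no-free-lunch theorem to be data-distribution-dependent, highlighting the role of the entropy of the data distribution in successful learning.

Details
English abstract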

Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding the generalization gap, which is the impact of overfitting. Understanding the generalization gap behavior of increasingly large-scale machine learning models remains a significant area of investigation, as conventional analyses often link error bounds to model complexity, failing to fully explain the success of extremely large architectures. This research introduces a novel perspective by establishing a model-independent upper bound for the generalization gap applicable to algorithms whose outputs are determined solely by the data's histogram, such as empirical risk minimization or gradient-based methods. Crucially, this bound is shown to depend only on the Rényi entropy of the data-generating distribution, suggesting that a small generalization gap can be maintained even with arbitrarily large models, provided the data quantity is sufficient relative to this entropy. This framework offers a direct explanation for the phenomenon where generalization performance degrades significantly upon injecting random noise into data: the performance degradation is attributed to the consequent increase in the data distribution's Rényi entropy. Furthermore, we adapt the no-free-lunch theorem to be data-distribution-dependent, demonstrating that an amount of data corresponding to the Rényi entropy is indeed essential for successful learning, thereby highlighting the tightness of our proposed generalization bound.
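
To make the quantity in this abstract concrete, the sketch below computes the Rényi entropy of a discrete distribution, using the standard definition H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), and shows how mixing in uniform noise raises it. The abstract does not say which order alpha the bound uses, so alpha = 2 here is an arbitrary choice; the function name `renyi_entropy` and the toy distributions are also assumptions for illustration.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy (in nats) of a discrete distribution p, for order alpha >= 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                                   # zero-probability outcomes contribute nothing
    if np.isclose(alpha, 1.0):
        return float(-np.sum(p * np.log(p)))       # Shannon entropy, the alpha -> 1 limit
    return float(np.log(np.sum(p ** alpha)) / (1.0 - alpha))

clean = np.array([0.7, 0.2, 0.1, 0.0])
# Assumed noise model: with probability 1/2 an outcome is replaced by a uniform draw.
noisy = 0.5 * clean + 0.5 * np.full(4, 0.25)

h_clean = renyi_entropy(clean, alpha=2.0)
h_noisy = renyi_entropy(noisy, alpha=2.0)
```

Here `h_noisy` exceeds `h_clean`. On the abstract's account, that increase in entropy is what drives the weaker generalization under noise injection.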

2503.16589 2026-05-18 cs.LG cs.ET math.ST stat.TH

A Statistical Analysis for Per-Instance Evaluation of Stochastic Optimizers: Avoiding Unreliable Conclusions

Moslem Noori, Elisabetta Valiante, Thomas Van Vaerenbergh, Masoud Mohseni, Ignacio Rozada

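AI summary This paper presents a statistical analysis of the performance evaluation of stochastic optimizers, aimed at avoiding unreliable conclusions caused by poor experiment design. It analyses the confidence intervals of common performance metrics and their dependence on the number of repeats, and derives a lower bound on the number of repeats needed to guarantee a given accuracy. Building on this bound, the authors propose an algorithm that adaptively adjusts the number of repeats. Simulation results demonstrate its utility for benchmarking and hyperparameter tuning.

Details
Journal ref
Physical Review Applied 25, no. 3 (2026): 034081
English abstract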

A key trait of stochastic optimizers is that multiple runs of the same optimizer in attempting to solve the same problem can produce different results. As a result, their performance is evaluated over several repeats, or runs, on the problem. However, the accuracy of the estimated performance metrics depends on the number of runs and should be studied using statistical tools. We present a statistical analysis of the common metrics, and develop guidelines for experiment design to measure the optimizer's performance using these metrics to a high level of confidence and accuracy. To this end, we first discuss the confidence interval of the metrics and how they are related to the number of runs of an experiment. We then derive a lower bound on the number of repeats in order to guarantee achieving a given accuracy in the metrics. Using this bound, we propose an algorithm to adaptively adjust the number of repeats needed to ensure the accuracy of the evaluated metric. Our simulation results demonstrate the utility of our analysis and how it allows us to conduct reliable benchmarking as well as hyperparameter tuning and prevent us from drawing premature conclusions regarding the performance of stochastic optimizers.
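
The link between accuracy and number of runs can be illustrated with a textbook bound. The sketch below uses Hoeffding's inequality for a metric bounded in [0, 1], such as a success probability estimated as the fraction of successful runs. This is a generic stand-in and not the bound derived in the paper; the function names and the choice of Hoeffding's inequality are assumptions for the example.

```python
import math

def hoeffding_repeats(eps, delta):
    """Smallest n with 2 * exp(-2 * n * eps**2) <= delta.

    After n independent runs, the sample mean of a [0, 1]-valued metric is then
    within eps of its expectation with probability at least 1 - delta.
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

def hoeffding_half_width(n, delta):
    """Half-width of the two-sided (1 - delta) Hoeffding interval after n runs."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

# Runs needed to pin a success probability down to +/- 0.05 at 95% confidence.
n_needed = hoeffding_repeats(eps=0.05, delta=0.05)
```

The required number of runs grows as 1/eps^2, so halving the tolerated error quadruples the budget. An adaptive scheme would recompute the half-width as runs accumulate and stop once it falls below the target.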

2503.00326 2026-05-18 stat.ME stat.ML

A Bayesian Additive Regression Tree Model for Learning Conditional Average Treatment Effects in Regression Discontinuity Designs

Rafael Alcantara, P. Richard Hahn, Hedibert F. Lopes

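AI summary This paper proposes an efficient Bayesian approach to estimating conditional average treatment effects (CATE) in regression discontinuity designs (RDD). The method builds on a Bayesian additive regression tree (BART) model with leaf-level linear regressions on the running variable and a treatment dummy, giving interpretable estimates of treatment effects. The model adaptively partitions covariate space into regions where the slope on the running variable differs appreciably, avoiding the known-basis-expansion assumption of recent frequentist approaches.

Details
English abstract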

This paper develops a performant Bayesian approach to conditional average treatment effect (CATE) estimation in regression discontinuity designs (RDD), an increasingly prevalent form of quasi-experiment that facilitates causal inference. Earlier Bayesian approaches do not easily accommodate CATE estimation while recent frequentist approaches to this problem assume a known basis expansion, a steep model specification requirement that our approach avoids. The new model is a variant of a Bayesian additive regression tree (BART) model with linear leaf-level regressions on the running variable and a treatment dummy (and their interaction). The model adaptively partitions covariate space into regions where the slope on the running variable appreciably differs, providing interpretable Bayesian inference on conditional average treatment effects near the cutoff.
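
The leaf-level regression mentioned in this abstract can be pictured with a single leaf. The sketch below fits an ordinary least-squares regression of the outcome on the running variable, a treatment dummy, and their interaction, and reads the jump at the cutoff off the dummy's coefficient. It omits the tree ensemble, the priors, and the posterior sampling entirely, so it is not the paper's model; the function name `leaf_rdd_fit`, the sharp design, and the noiseless synthetic data are assumptions for illustration.

```python
import numpy as np

def leaf_rdd_fit(x, z, y, cutoff=0.0):
    """Least-squares fit of y ~ 1 + (x - cutoff) + z + (x - cutoff) * z.

    Returns [intercept, slope, jump at cutoff, change in slope]. Centring the
    running variable at the cutoff makes the dummy's coefficient the effect there.
    """
    xc = x - cutoff
    X = np.column_stack([np.ones_like(xc), xc, z, xc * z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)          # running variable
z = (x >= 0.0).astype(float)                  # sharp design: treated at or above the cutoff
y = 1.0 + 0.5 * x + 2.0 * z - 0.3 * x * z     # noiseless outcome with a jump of 2 at x = 0
coef = leaf_rdd_fit(x, z, y)
```

In a tree-based version, each leaf would hold its own set of such coefficients, so the jump at the cutoff could vary across the regions of covariate space that the trees define.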

2502.12187 2026-05-18 cs.CL cs.FL cs.LG math.ST stat.ML stat.TH

Hallucinations are inevitable but can be made statistically negligible

Atsushi Suzuki, Yulan He, Feng Tian, Zhongyuan Wang

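AI summary This paper examines hallucinations in language models, that is, the generation of nonfactual content. Prior work proved from a computability-theoretic standpoint that any language model must hallucinate on an infinite set of inputs. This paper shows from a probabilistic standpoint that hallucinations can be made statistically negligible, provided the quality and quantity of the training data are sufficient. The authors argue that the probabilistic result reflects practical considerations better than the computability-theoretic one.

Details
English abstract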

Hallucinations, a phenomenon where a language model (LM) generates nonfactual content, pose a significant challenge to the practical deployment of LMs. While many empirical methods have been proposed to mitigate hallucinations, recent studies established a computability-theoretic result showing that any LM will inevitably generate hallucinations on an infinite set of inputs, regardless of the quality and quantity of training datasets and the choice of the language model architecture and training and inference algorithms. Although the computability-theoretic result may seem pessimistic, its practical significance has remained unclear. This paper claims that those "innate" inevitability results from computability theory and the diagonal argument, in principle, cannot explain practical issues of LLMs. We demonstrate this claim by presenting a positive theoretical result from a probabilistic perspective. Specifically, we prove that hallucinations can be made statistically negligible, provided that the quality and quantity of the training data are sufficient. Interestingly, our positive result coexists with the computability-theoretic result, implying that while hallucinations on an infinite set of inputs cannot be entirely eliminated, their probability can always be reduced by improving algorithms and training data. By evaluating the two seemingly contradictory results through the lens of information theory, we argue that our probability-theoretic positive result better reflects practical considerations than the computability-theoretic negative result.

2407.08094 2026-05-18 stat.ML cs.LG physics.chem-ph physics.data-an

Density Estimation via Binless Multidimensional Integration

Matteo Carli, Alex Rodriguez, Alessandro Laio, Aldo Glielmo

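AI summary This paper proposes Binless Multidimensional Thermodynamic Integration (BMTI), a nonparametric method for robust, data-efficient density estimation on high-dimensional data. It estimates the log-density by computing log-density differences between neighbouring data points and integrating them, weighted by their uncertainties, in a maximum-likelihood formulation. BMTI requires no binning or space partitioning; it builds a neighbourhood graph using adaptive bandwidth selection and relies on the manifold hypothesis to estimate quantities on the intrinsic data manifold, mitigating the limitations of traditional nonparametric estimators in high-dimensional spaces.

Details
English abstract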

We introduce the Binless Multidimensional Thermodynamic Integration (BMTI) method for nonparametric, robust, and data-efficient density estimation. BMTI estimates the logarithm of the density by initially computing log-density differences between neighbouring data points. Subsequently, such differences are integrated, weighted by their associated uncertainties, using a maximum-likelihood formulation. This procedure can be seen as an extension to a multidimensional setting of the thermodynamic integration, a technique developed in statistical physics. The method leverages the manifold hypothesis, estimating quantities within the intrinsic data manifold without defining an explicit coordinate map. It does not rely on any binning or space partitioning, but rather on the construction of a neighbourhood graph based on an adaptive bandwidth selection procedure. BMTI mitigates the limitations commonly associated with traditional nonparametric density estimators, effectively reconstructing smooth profiles even in high-dimensional embedding spaces. The method is tested on a variety of complex synthetic high-dimensional datasets, where it is shown to outperform traditional estimators, and is benchmarked on realistic datasets from the chemical physics literature.

2404.04775 2026-05-18 stat.ME

Bipartite causal inference with interference, time series data, and a random network

Zhaoyan Song, Georgia Papadogeorgou

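AI summary This paper studies bipartite causal inference with interference, time series data, and a random network, aiming to estimate the immediate and carryover effects of interventional units on outcome units. The authors define these effects under an exposure mapping framework and establish unconfoundedness of the outcome units' exposure from unconfoundedness assumptions on the interventional units' treatment assignment and the random network. The results hold for binary, continuous, and multivariate exposure mappings. For binary exposures, the authors propose algorithms combining matching and covariate balancing and show that the bias of the estimators is bounded. The motivating study finds some evidence that wildfire smoke has an immediate negative effect on bicycle transportation in San Francisco.

Details
English abstract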

In bipartite causal inference with interference, interventional units might receive treatment or control, and they might affect the outcome of outcome units through their connections on a bipartite network. We study bipartite causal inference with interference based on observational data across time and a changing bipartite network. Under an exposure mapping framework, we define the immediate and carryover causal effects for each outcome unit, representing contrasts of potential outcomes under different values of the immediately preceding and past exposures, respectively, averaged over time. We establish unconfoundedness of the exposure received by outcome units based on unconfoundedness assumptions on the interventional units' treatment assignment and the random network, hence respecting the bipartite structure of the problem. Our results hold for binary, continuous, and multivariate exposure mappings. In the special case of binary exposure and carryover mappings, we propose algorithms for the immediate and carryover causal effects that combine matching and covariate balancing. We show that the bias of the resulting estimators is bounded. In our motivating study, we find some evidence that smoke from wildfires has an immediate impact on reducing transportation by bicycle in San Francisco.

2404.03099 2026-05-18 cs.LG cs.AI cs.CE cs.IT math.IT stat.ML

Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks

Leonardo Ferreira Guilhoto, Paris Perdikaris

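AI summary This paper proposes NEON, a neural network architecture for producing predictions with uncertainty in infinite-dimensional function spaces using far fewer parameters than deep ensembles of comparable performance. It focuses on composite Bayesian optimization, that is, optimizing the composition of an unknown function-valued map with a known functional, and shows that NEON achieves state-of-the-art optimization performance across several scenarios while greatly reducing model complexity.

Details
Journal ref
Guilhoto, Leonardo Ferreira, and Paris Perdikaris. "Composite Bayesian optimization in function spaces using NEON - Neural Epistemic Operator Networks." Scientific Reports 14.1 (2024): 29199
English abstract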

Operator learning is a rising field of scientific computing where inputs or outputs of a machine learning model are functions defined in infinite-dimensional spaces. In this paper, we introduce NEON (Neural Epistemic Operator Networks), an architecture for generating predictions with uncertainty using a single operator network backbone, which has orders of magnitude fewer trainable parameters than deep ensembles of comparable performance. We showcase the utility of this method for sequential decision-making by examining the problem of composite Bayesian Optimization (BO), where we aim to optimize a function $f=g\circ h$, where $h:X\to C(\mathcal{Y},\mathbb{R}^{d_s})$ is an unknown map which outputs elements of a function space, and $g: C(\mathcal{Y},\mathbb{R}^{d_s})\to \mathbb{R}$ is a known and cheap-to-compute functional. By comparing our approach to other state-of-the-art methods on toy and real-world scenarios, we demonstrate that NEON achieves state-of-the-art performance while requiring orders of magnitude fewer trainable parameters.

2311.03658 2026-05-18 cs.CL cs.AI cs.LG stat.ML

The Linear Representation Hypothesis and the Geometry of Large Language Models

Kiho Park, Yo Joong Choe, Victor Veitch

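AI summary This paper examines the "linear representation hypothesis", the idea that high-level concepts are represented as linear directions in a representation space. It gives two formalizations of "linear representation", one in the output (word) space and one in the input (sentence) space. By introducing a causal inner product, the authors obtain a non-Euclidean inner product structure that unifies the notions of linear representation and supports the construction of probes and steering vectors. Experiments with LLaMA-2 show that linear representations of concepts exist and that the choice of inner product plays a fundamental role in interpretation and control.

Comments Accepted for a presentation at ICML 2024 and an oral presentation at NeurIPS 2023 Workshop on Causal Representation Learning. Code is available at https://github.com/KihoPark/linear_rep_geometry

Details
Journal ref
In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024
English abstract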

Informally, the 'linear representation hypothesis' is the idea that high-level concepts are represented linearly as directions in some representation space. In this paper, we address two closely related questions: What does "linear representation" actually mean? And, how do we make sense of geometric notions (e.g., cosine similarity or projection) in the representation space? To answer these, we use the language of counterfactuals to give two formalizations of "linear representation", one in the output (word) representation space, and one in the input (sentence) space. We then prove these connect to linear probing and model steering, respectively. To make sense of geometric notions, we use the formalization to identify a particular (non-Euclidean) inner product that respects language structure in a sense we make precise. Using this causal inner product, we show how to unify all notions of linear representation. In particular, this allows the construction of probes and steering vectors using counterfactual pairs. Experiments with LLaMA-2 demonstrate the existence of linear representations of concepts, the connection to interpretation and control, and the fundamental role of the choice of inner product.

2306.15199 2026-05-18 stat.ME

Rank-Transformed Dissimilarity Profiles for High-Dimensional Classification

Xiangbo Mo, Hao Chen

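AI summary High-dimensional classification remains challenging in low-sample-size regimes, where labeled data are limited and the dominant signal varies across applications. This paper proposes a dissimilarity-profiling classification framework that represents each observation by its class-wise dissimilarity profile, yielding a low-dimensional representation that captures systematic within-class and between-class dissimilarity patterns. A rank transformation further improves robustness to outliers, and the method performs competitively with, or better than, commonly used classifiers on a range of high-dimensional datasets.

Details
English abstract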

Despite advances in representation learning, high-dimensional classification remains challenging in low-sample-size regimes, where the dominant signal may vary across applications and labeled data are often limited. We propose a dissimilarity-profiling classification framework that represents each observation by its class-wise dissimilarity profile, transforming the original feature space into a low-dimensional representation that summarizes how the observation relates to each class. The key idea is to turn a consequence of the curse of dimensionality into signal: high-dimensional geometry can induce systematic within-class and between-class dissimilarity patterns under location, scale, or other distributional changes, and these patterns are captured by the class-wise profiles. Building on this representation, we introduce a rank-transformed algorithm that converts dissimilarities into class-wise rank profiles, yielding a compact representation for classification. The proposed method delivers competitive or improved performance relative to commonly used classifiers on two-class, multi-class, network, and real high-dimensional low-sample-size datasets. To provide insight into the mechanism underlying the method, we analyze a distance-based surrogate and show that the resulting profiles encode differences in first, second, and higher-order moments, while the rank transformation improves robustness to outliers. Together, these results show that rank-transformed dissimilarity profiles provide an adaptive representation for high-dimensional classification when the signal structure is unknown.

2212.05524 2026-05-18 stat.ME stat.AP

Bayesian inference for partial orders from random linear extensions: power relations from 12th Century Royal Acta

Geoff K. Nicholls, Jeong Eun Lee, Nicholas Karn, David Johnson, Rukuang Huang, Alexis Muir-Watt

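AI summary This paper studies changes in the order in which bishops are listed in twelfth-century Royal Acta from England, Wales and Normandy, in order to detect changes in social status and power. It models the social order as a partially ordered set (poset) evolving over time and builds a Hidden Markov Model in which the hidden state is the evolving poset and the observations are random total orders respecting it. The method accounts for noise and lets actor covariates inform the position of bishops in the hierarchy. Fitting the model yields evidence of changes in status over time, which the authors interpret in terms of court politics.

Comments 64 pages, 37 figures and 3 tables including appendices

Details
Journal ref
Annals of Applied Statistics, 19(2), 1663-1690, (June 2025)
English abstract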

In the eleventh and twelfth centuries in England, Wales and Normandy, Royal Acta were legal documents in which witnesses were listed in order of social status. Any bishops present were listed as a group. For our purposes, each witness-list is an ordered permutation of bishop names with a known date or date-range. Changes over time in the order bishops are listed may reflect changes in their authority. Historians would like to detect and quantify these changes. There is no reason to assume that the underlying social order which constrains bishop-order within lists is a complete order. We therefore model the evolving social order as an evolving partially ordered set or poset. We construct a Hidden Markov Model for these data. The hidden state is an evolving poset (the evolving social hierarchy) and the emitted data are random total orders (dated lists) respecting the poset present at the time the order was observed. This generalises existing models for rank-order data such as Mallows and Plackett-Luce. We account for noise via a random "queue-jumping" process. Our latent-variable prior for the random process of posets is marginally consistent. A parameter controls poset depth and actor-covariates inform the position of actors in the hierarchy. We fit the model, estimate posets and find evidence for changes in status over time. We interpret our results in terms of court politics. Simpler models, based on Bucket Orders and vertex-series-parallel orders, are rejected. We compare our results with a time-series extension of the Plackett-Luce model. Our software is publicly available.
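
The phrase "total orders respecting the poset" has a simple operational meaning, sketched below: a list is a linear extension of a partial order if no pair is listed against the order. This is only the noise-free consistency check, not the paper's model or software. The function name `is_linear_extension`, the pair encoding, and the actor labels are assumptions for illustration.

```python
def is_linear_extension(order, relations):
    """Check that a witness list respects a partial order.

    `order` lists actors from highest to lowest status. `relations` is a set of
    pairs (a, b) meaning "a ranks above b" and is assumed to be transitively
    closed, so that constraints still hold when an intermediate actor is absent.
    Pairs involving actors who are not in the list impose no constraint.
    """
    position = {name: i for i, name in enumerate(order)}
    return all(position[a] < position[b]
               for a, b in relations
               if a in position and b in position)

# Hypothetical hierarchy: A outranks B and C, while B and C are incomparable.
relations = {("A", "B"), ("A", "C")}

ok_first = is_linear_extension(["A", "B", "C"], relations)
ok_second = is_linear_extension(["A", "C", "B"], relations)
violating = is_linear_extension(["B", "A", "C"], relations)
```

Both orderings of the incomparable pair B and C are admissible, which is why a partial order can explain lists that disagree with each other. A list that places B above A is not admissible without some form of noise, which is the role the abstract assigns to its queue-jumping process.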

2003.06804 2026-05-18 stat.ME math.ST stat.ML stat.TH

Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components

Chris U. Carmona, Geoff K. Nicholls

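AI summary This paper proposes Semi-Modular Inference (SMI), which aims to improve learning in multi-modular models. By introducing an influence parameter, it flexibly regulates the influence of modules on one another during inference: Bayesian inference and Cut-models arise as special cases, and information flow between modules becomes both tunable and directed. The authors also give a meta-learning criterion for choosing the inference scheme and illustrate the method on standard test cases and an archaeological data set.

Comments For the associated R package to reproduce results, see https://github.com/christianu7/aistats2020smi

Details
Journal ref
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:4226-4235, 2020
English abstract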

Bayesian statistical inference loses predictive optimality when generative models are misspecified. Working within an existing coherent loss-based generalisation of Bayesian inference, we show existing Modular/Cut-model inference is coherent, and write down a new family of Semi-Modular Inference (SMI) schemes, indexed by an influence parameter, with Bayesian inference and Cut-models as special cases. We give a meta-learning criterion and estimation procedure to choose the inference scheme. This returns Bayesian inference when there is no misspecification. The framework applies naturally to Multi-modular models. Cut-model inference allows directed information flow from well-specified modules to misspecified modules, but not vice versa. An existing alternative power posterior method gives tunable but undirected control of information flow, improving prediction in some settings. In contrast, SMI allows tunable and directed information flow between modules. We illustrate our methods on two standard test cases from the literature and a motivating archaeological data set.

1904.04185 2026-05-18 stat.ME

Multiple imputation in data that grow over time: A comparison of three strategies

X. M. Kavelaars, S. van Buuren, J. R. van Ginkel

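AI summary This study compares three multiple imputation strategies for handling missing values in longitudinal data that grow over time: re-imputation, nested imputation, and appended imputation. In simulations, all strategies gave valid inference under a monotone missingness pattern, whereas a non-monotone pattern could produce bias after nested and appended imputation. Correlations within timepoints must be stronger than correlations between timepoints for inference to be valid, and the authors conclude that appended imputation is especially beneficial in longitudinal data with dropout.

Comments 15 pages, 5 tables, 1 figure

Details
Journal ref
Multivariate Behavioral Research, 57(2-3):513-523, 2022
English abstract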

Multiple imputation is a highly recommended technique to deal with missing data, but the application to longitudinal datasets can be done in multiple ways. When a new wave of longitudinal data arrives, we can treat the combined data of multiple waves as a new missing data problem and overwrite existing imputations with new values (re-imputation). Alternatively, we may keep the existing imputations, and impute only the new data. We may do either a full multiple imputation (nested) or a single imputation (appended) on the new data per imputed set. This study compares these three strategies by means of simulation. All techniques resulted in valid inference under a monotone missingness pattern. A non-monotone missingness pattern led to biased and non-confidence-valid regression coefficients after nested and appended imputation, depending on the correlation structure of the data. Correlations within timepoints must be stronger than correlations between timepoints to obtain valid inference. In an empirical example, the three strategies performed similarly. We conclude that appended imputation is especially beneficial in longitudinal datasets that suffer from dropout.

1702.00971 2026-05-18 stat.ME

Multiple imputation for multilevel data with continuous and binary variables

Vincent Audigier, Ian R. White, Shahab Jolani, Thomas P. A. Debray, Matteo Quartagno, James Carpenter, Stef van Buuren, Matthieu Resche-Rigon

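AI summary This paper studies multiple imputation methods for multilevel data with continuous and binary variables, comparing their performance when variables are systematically and sporadically missing. Through theoretical analysis and a simulation study motivated by a real dataset, it finds that heteroscedastic imputation methods give more accurate inferences than homoscedastic ones, which should be reserved for data with few individuals per cluster, and that valid inference requires a large number of clusters. It also indicates which methods suit which cluster sizes and variable types.

Details
Journal ref
Statistical Science, 33(2):160-183, 2018
English abstract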

We present and compare multiple imputation methods for multilevel continuous and binary data where variables are systematically and sporadically missing. The methods are compared from a theoretical point of view and through an extensive simulation study motivated by a real dataset comprising multiple studies. Simulations are reproducible. The comparisons show why these multiple imputation methods are the most appropriate to handle missing values in a multilevel setting and why their relative performances can vary according to the missing data pattern, the multilevel structure and the type of missing variables. This study shows that valid inferences can only be obtained if the dataset gathers a large number of clusters. In addition, it highlights that heteroscedastic MI methods provide more accurate inferences than homoscedastic methods, which should be reserved for data with few individuals per cluster. Finally, the method of Quartagno and Carpenter (2016a) appears generally accurate for binary variables, the method of Resche-Rigon and White (2016) with large clusters, and the approach of Jolani et al. (2015) with small clusters.