arXivDaily: arXiv daily academic digest, updated Monday to Friday
STAT Statistics 111

1. Statistical Theory and Methods (15 papers)

2606.12174 2026-06-11 stat.AP stat.ME 新提交

The data-driven extreme value distribution: non-parametric tail estimation with a derived stability criterion

数据驱动的极值分布:基于导出稳定性准则的非参数尾部估计

Michael Sandbichler, Tobias Hell

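AI summary: Proposes the data-driven extreme value distribution (DDEVD), a non-parametric estimator that reconstructs the base distribution with a kernel and comes with a derived stability criterion; it is more stable than GEV fits on Alpine precipitation data and matches a GEV fit on metallurgical grain-size tails.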

Comments
28 pages, 6 figures
AI中文摘要

量化极端事件的可能性是风险评估的基础,然而经典极值理论依赖于渐近假设,这在数据稀疏、非平稳的情况下失效,而实践者越来越常遇到这种情况。我们引入了数据驱动极值分布(DDEVD),一种非参数估计器,它元统计地聚合所有观测值,并用核重建基分布,去除了参数尾部假设。我们推导了其最优带宽,并证明了一个稳定性定律 $m < C\,n^{1+\gamma/2}$,将可靠外推与极值指数 $\gamma$ 联系起来。在亚小时尺度的阿尔卑斯降水数据中,DDEVD 从单个十年中恢复了稳定的100年重现水平(校准比率 $0.96$),与完整记录参考值的偏差超过 $50\,\%$ 的情况在不到五十分之一的窗口中发生,而 GEV 拟合则为五分之一。在冶金显微图像中,它在安全相关的晶粒尺寸尾部上与广义极值拟合相匹配,而标准对数正态分布在 $1\,\mathrm{cm}^{2}$ 处高估了 $58\,\%$。

英文摘要

Quantifying the likelihood of extreme events underpins risk assessment, yet classical Extreme Value Theory relies on asymptotic assumptions that fail in the data-sparse, non-stationary regimes practitioners increasingly face. We introduce the Data-Driven Extreme Value Distribution (DDEVD), a non-parametric estimator that aggregates all observations metastatistically and reconstructs the base distribution with a kernel, removing parametric tail assumptions. We derive its optimal bandwidth and prove a stability law $m < C\,n^{1+\gamma/2}$ relating reliable extrapolation to the extreme value index $\gamma$. In sub-hourly Alpine precipitation, DDEVD recovers stable 100-year return levels from single decades (calibration ratio $0.96$), departing from the full-record reference by over $50\,\%$ in fewer than one window in fifty -- versus one in five for a GEV fit. In metallurgical micrographs, it matches a generalised extreme-value fit on the safety-relevant grain-size tail, where the standard log-normal over-predicts by $58\,\%$ at $1\,\mathrm{cm}^{2}$.

2606.12015 2026-06-11 stat.ME 新提交

Introducing precision-weighted bias as a performance measure to inform the inclusion of adaptive designs in meta-analysis

引入精度加权偏倚作为性能度量以指导元分析中适应性设计的纳入

Martin Law (1 and 2), David S. Robertson (1), Sofia S. Villar (1), Tim P. Morris (3), Babak Choodari-Oskooei (4), Thomas Jaki (1 and 5), Ian R. White (4) ((1) Medical Research Council Biostatistics Unit, University of Cambridge, (2) Royal Papworth Hospital, Cambridge, (3) Statistical Methodology, Novartis Pharmaceuticals UK Ltd., (4) UCL Innovative Clinical Trials Unit, University College London, (5) Department of Machine Learning and Statistics, University of Regensburg, DE)

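AI summary: Proposes precision-weighted bias as a new measure of statistical performance, shows that including adaptive designs often changes the overall meta-analysis bias only negligibly, and recommends it as a standard complement in simulation studies.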

Comments
9 pages, 2 figures
AI中文摘要

我们提出一种新颖、直观的统计性能度量:精度加权偏倚。精度加权偏倚定义为估计量的无条件偏倚以其所含信息量(精度)加权。当前指南(如GRADE和CONSORT)常将适应性设计中潜在的偏倚增加视为系统综述中纳入此类设计的阻碍。然而,我们证明共同效应元分析中的偏倚近似等于其组成研究的精度加权偏倚的精度加权平均,而非其未加权无条件偏倚的平均。通过模拟研究,我们表明虽然适应性设计可能表现出未加权偏倚,但它们通常具有零精度加权偏倚。因此,纳入这些设计通常导致整体元分析偏倚的微小变化。这些结果表明,精度加权偏倚是决定是否将适应性设计纳入元分析的更优指标。我们建议在模拟研究中使用精度加权偏倚作为未加权无条件偏倚和条件偏倚的标准补充,以支持更具包容性和准确的证据合成。

英文摘要

We propose a novel, intuitive measure of statistical performance: precision-weighted bias. Precision-weighted bias is defined as the unconditional bias of an estimator weighted by the degree of information (precision) it contains. Current guidelines, such as GRADE and CONSORT, often view the potential for increased bias in adaptive designs as a deterrent for the inclusion of such designs in systematic reviews. However, we demonstrate that the bias in a common-effect meta-analysis is approximately equal to the precision-weighted average of the precision-weighted biases of its constituent studies, rather than of their unweighted unconditional biases. Through simulation studies, we show that while adaptive designs may exhibit unweighted bias, they frequently have zero precision-weighted bias. Consequently, including these designs often results in a negligible change to the overall meta-analysis bias. These results suggest that precision-weighted bias is a superior indicator for determining whether to include an adaptive design in a meta-analysis. We recommend that precision-weighted bias be used as a standard complement to unweighted unconditional and conditional bias in simulation studies to support more inclusive and accurate evidence synthesis.
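The meta-analysis statement in this abstract can be made concrete with a short sketch. The notation ($\hat\theta_i$, $w_i$, $\mathrm{PWB}_i$) and the ratio-of-expectations form below are assumptions chosen for illustration, not taken from the paper, whose exact definition may differ.

```latex
\begin{align*}
% Common-effect (inverse-variance) pooled estimate over k studies,
% where w_i = 1/\widehat{\mathrm{se}}_i^2 is the estimated precision of study i.
\hat\theta_{\mathrm{MA}} &= \frac{\sum_{i=1}^{k} w_i\,\hat\theta_i}{\sum_{i=1}^{k} w_i}, \\
% Assumed form of the precision-weighted bias of study i (true effect \theta).
\mathrm{PWB}_i &= \frac{\mathbb{E}\bigl[w_i(\hat\theta_i-\theta)\bigr]}{\mathbb{E}[w_i]}, \\
% Replacing the expectation of a ratio by a ratio of expectations (first order).
\mathbb{E}\bigl[\hat\theta_{\mathrm{MA}}\bigr]-\theta
  &\approx \frac{\sum_{i}\mathbb{E}\bigl[w_i(\hat\theta_i-\theta)\bigr]}{\sum_{i}\mathbb{E}[w_i]}
   = \frac{\sum_{i}\mathbb{E}[w_i]\,\mathrm{PWB}_i}{\sum_{i}\mathbb{E}[w_i]}.
\end{align*}
```

Under this reading, a fixed design has an essentially non-random $w_i$, so $\mathrm{PWB}_i$ reduces to the ordinary bias. In an adaptive design the precision depends on the data (for example through early stopping), so the two quantities can differ.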

2606.11933 2026-06-11 math.ST stat.ME 新提交

Testing axial symmetry in multivariate location-scale linear regression

多元位置尺度线性回归中的轴向对称性检验

Šárka Hudecová, Miroslav Šiman

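AI summary: Proposes a test based on integrated rank scores for conditional axial symmetry in multivariate linear heteroscedastic regression, derives its asymptotic distribution, and illustrates it with simulations and real data.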

AI中文摘要

本文研究多元线性异方差回归框架下条件轴向对称性的检验问题。提出了一种基于积分秩得分的新检验,并推导了其渐近分布。所提出的方法将针对多元数据开发的类似程序扩展到回归设定中。该检验也可用于评估关于误差项分布特性的特定假设。通过一个小型模拟研究和实际经济数据说明了其性能和应用。本文还包含一些关于轴向对称性的理论结果,这些结果可能具有独立的意义。

英文摘要

The article deals with the problem of testing conditional axial symmetry within a multivariate linear heteroscedastic regression framework. A new test based on integrated rank scores is introduced and its asymptotic distribution is derived. The proposed method extends a similar procedure developed for multivariate data to the regression setting. The test may also be employed to assess specific hypotheses concerning distributional properties of the error term. Its performance and application are illustrated in a small simulation study and with real economic data. The article also contains a few theoretical results regarding axial symmetry that may be of independent interest.

2606.11548 2026-06-11 stat.ME 新提交

Estimating the local false discovery rate under an unknown symmetric null

在未知对称零假设下估计局部错误发现率

Daniel Xiang, William Fithian, Nikolaos Ignatiadis, Jake A. Soloff, Asaf Weinstein

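AI summary: For a two-groups model whose null distribution is assumed only to be symmetric about zero, proposes estimating the local false discovery rate by logistic regression with a natural cubic spline basis, and shows that thresholding a consistent estimate gives asymptotic lfdr control in multiple testing.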

AI中文摘要

本文关注在双组模型中估计局部错误发现率(lfdr),其中关于零分布的唯一假设是它关于零对称。我们的动机来自当代多重假设检验框架,特别适用于变量选择问题,该框架将任何用户指定的分数转换为统计量,其零分布关于零对称,而非零分布通常预期在零右侧富集。虽然现代方法如knockoff滤波器(Barber and Candes; 2015)能够利用零性质来控制错误发现率(FDR),但一个更合适的目标是针对被拒绝的假设控制局部错误发现率,如Soloff等人(2024)所提出的,其中分析了标准的双组模型(已知$f_0$和独立性)。在这里,我们朝这个方向迈出一步,提出通过针对替代密度比$f(-w)/f(w)$($w>0$)来估计lfdr,其中$f$是上述“简化”双组模型中的边际密度。我们研究了几种估计量,并提出了一种基于自然三次样条基的逻辑回归方法。我们还证明了该替代的任何一致估计量都能使以名义水平阈值估计的多重检验过程渐近控制lfdr。

英文摘要

This paper is concerned with estimating the local false discovery rate (lfdr) in a two-groups model where the only assumption regarding the null distribution is symmetry about zero. Our motivation comes from the contemporary framework for multiple hypothesis testing, particularly relevant in variable selection problems, which transforms any user-specified scores into statistics whose null distributions are symmetric about zero, whereas enrichment to the right of zero is generally expected for the non-nulls. While modern methods such as the knockoff filter (Barber and Candès, 2015) are able to exploit the null property for controlling the false discovery rate (FDR), an arguably more appropriate goal is to target control of the local false discovery rate for the rejected hypotheses, as proposed in Soloff et al. (2024), where the standard two-groups model (known $f_0$ and independence) is analyzed. Here, we take a step in this direction and propose to estimate the lfdr by targeting the surrogate density ratio $f(-w)/f(w)$, for $w>0$, where $f$ is the marginal density in the aforementioned "stripped-down" two-groups model. We study several estimators and propose a logistic-regression-based method with a natural cubic spline basis. We also show that any consistent estimator of this surrogate yields asymptotic lfdr control of the multiple testing procedure that thresholds the estimate at the nominal level.
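A short derivation may help explain why the ratio $f(-w)/f(w)$ is a natural surrogate for the lfdr. The symbols $\pi_0$ (null proportion) and $f_1$ (non-null density) are notation assumed here for illustration; the abstract itself names only $f_0$ and $f$.

```latex
\begin{align*}
% Two-groups model with a null density symmetric about zero.
f(w) &= \pi_0 f_0(w) + (1-\pi_0) f_1(w), \qquad f_0(-w) = f_0(w), \\
\mathrm{lfdr}(w) &= \frac{\pi_0 f_0(w)}{f(w)}. \\
% For w > 0, evaluate the marginal density at the mirrored point -w.
f(-w) &= \pi_0 f_0(w) + (1-\pi_0) f_1(-w) \;\ge\; \pi_0 f_0(w), \\
\frac{f(-w)}{f(w)} &\ge \mathrm{lfdr}(w),
\quad \text{with equality when } f_1(-w) = 0.
\end{align*}
```

Under these assumptions the surrogate is a conservative upper bound on the lfdr. It is tight when the non-nulls place negligible mass to the left of zero, which matches the enrichment to the right of zero described in the abstract.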

2606.11421 2026-06-11 stat.ME math.ST stat.CO 新提交

Second-Order Least Squares as a Special Case of the Polynomial Maximization Method

二阶最小二乘法作为多项式最大化方法的特例

Serhii Zabolotnii

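AI summary: Proves that, under conditionally homoskedastic non-Gaussian errors, optimally weighted second-order least squares coincides with the degree-two generalized polynomial maximization method, and identifies an efficiency reserve at higher degrees.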

Comments
26 pages, 3 figures, 7 tables. Includes Lean 4 formal verification and Monte Carlo simulation
AI中文摘要

我们证明,对于具有条件同方差非高斯误差的线性回归,最优加权二阶最小二乘法(SLS)与二次广义多项式最大化方法(PMM)是相同的总体估计方程:它们选择前两个中心残差矩的最优线性组合,求解同一个总体正规方程组,共享同一个影响函数,并达到相同的渐近方差 $c_2g_2/N$,即普通最小二乘斜率方差因子 $c_2$ 乘以 PMM 方差缩减系数 $g_2=1-\gamma_3^2/(2+\gamma_4)$(其中 $\gamma_3,\gamma_4$ 为误差偏度和超额峰度)。因此,可行的插件实现是一阶等价的,仅存在高阶有限样本差异。这一等价性是尖锐的:在异方差下,无条件 PMM 主体与条件 SLS 加权分离,导致对称误差的效率损失和不对称误差的一致性损失。在二次以上,PMM 拥有 SLS 在其二阶矩范围内无法达到的效率储备。对于对称的低峰态(platykurtic)误差,SLS 在斜率估计上退化为普通最小二乘法,而三次 PMM 通过闭式系数 $g_3$ 利用 SLS 矩范围之外的峰度信息;对于典型非对称分布,在三次多项式矩类中,这一储备为 $30$--$50\%$。Lean 4 形式化开发对特定次数的代数核心($g_2$ 和 $g_3$ 的闭式、$g_2\le1$ 结果、设计抵消和对称退化)进行了机器检验,而一般单调性 $g_{S+1}\le g_S\le1$ 通过嵌套分析证明。蒙特卡洛研究说明了等价性、储备和异方差边界在有限样本中的表现。

英文摘要

We prove that optimally weighted second-order least squares (SLS) and the degree-two generalized polynomial maximization method (PMM) are the same population estimating equation for linear regression with conditionally homoskedastic non-Gaussian errors: they choose the same optimal linear combination of the first two centered residual moments, solve one population normal system, share one influence function, and attain the common asymptotic variance $c_2g_2/N$ -- the ordinary-least-squares slope-variance factor $c_2$ scaled by the PMM variance-reduction coefficient $g_2=1-\gamma_3^2/(2+\gamma_4)$ (with $\gamma_3,\gamma_4$ the error skewness and excess kurtosis). Feasible plug-in implementations are therefore first-order equivalent, with only higher-order finite-sample differences. The identity is sharp: under heteroskedasticity the unconditional PMM body and the conditional SLS weighting separate, costing efficiency for symmetric errors and consistency for asymmetric errors. Beyond degree two, PMM holds an efficiency reserve that SLS cannot reach within its second-moment span. For symmetric platykurtic errors SLS collapses to ordinary least squares for the slope, while degree-three PMM exploits kurtosis information outside the SLS moment span through a closed-form coefficient $g_3$; for canonical asymmetric laws this reserve is $30$--$50\%$ within the degree-three polynomial moment class. The Lean 4 development machine-checks the degree-specific algebraic core -- the closed forms for $g_2$ and $g_3$, the $g_2\le1$ result, the design cancellations, and the symmetric collapse -- while the general monotonicity $g_{S+1}\le g_S\le1$ is proved analytically by nesting. A Monte Carlo study illustrates the equivalence, the reserve, and the heteroskedastic boundary at finite samples.
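The closed form for $g_2$ quoted in the abstract is easy to evaluate numerically. The sketch below is only an illustration of that formula: the function name `pmm_g2` and the example distributions are assumptions, not part of the paper or its code.

```python
def pmm_g2(skewness: float, excess_kurtosis: float) -> float:
    """Evaluate g2 = 1 - gamma3^2 / (2 + gamma4) from the abstract.

    skewness is gamma3 and excess_kurtosis is gamma4 of the error law.
    The name pmm_g2 is hypothetical, chosen for this illustration.
    """
    denom = 2.0 + excess_kurtosis
    if denom <= 0.0:
        raise ValueError("g2 is undefined unless 2 + excess kurtosis > 0")
    return 1.0 - skewness ** 2 / denom


# Gaussian errors: zero skewness, so the factor is 1 (no gain over OLS).
g2_gaussian = pmm_g2(0.0, 0.0)

# Exponential errors: skewness 2 and excess kurtosis 6 (standard values),
# so g2 = 1 - 4/8 = 0.5, i.e. half the OLS slope variance asymptotically.
g2_exponential = pmm_g2(2.0, 6.0)
```

Any symmetric error law has zero skewness and therefore $g_2 = 1$, which is consistent with the abstract's remark that SLS collapses to ordinary least squares for the slope in the symmetric case.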

2606.10212 2026-06-11 math.ST stat.ML 版本更新

Intrinsic Riemannian Cross-covariance for Manifold-valued Random Objects

流形值随机对象的内蕴黎曼互协方差

Carlos Soto, Cheng Wang, Yujing Huang, Xiaoyu Chen

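AI summary: Proposes a Riemannian cross-covariance that maps local variations to a common tangent space via parallel transport, giving second-order statistics for random objects on manifolds; establishes its asymptotic properties and validates it on spheres, SPD manifolds, and heart valve shape data.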

Comments
31 pages, 16 figures
AI中文摘要

协方差估计是表示学习、降维和依赖建模中基本的二阶统计量。虽然协方差在欧几里得空间中已被充分理解,但对于位于非线性黎曼流形上的随机对象(在现代机器学习应用中日益常见,涉及形状、对称正定(SPD)矩阵等),协方差定义不明确。本文引入了一种针对流形值随机对象的内蕴黎曼互协方差。我们的方法通过平行传输将局部变化映射到公共切空间来定义协方差和相关,从而得到一个独立于任意坐标选择的二阶描述符。我们证明了所提出的协方差继承了欧几里得对应物的理想性质,并刻画了其渐近行为。在球面和SPD流形上的数值研究,以及在Kendall形状空间中心脏瓣膜形状的真实数据实验,证明了我们估计量的有效性并验证了所述性质。我们的结果将黎曼协方差定位为非欧几里得表示空间中二阶学习和分析的基本工具。

英文摘要

Covariance estimation yields a fundamental second-order statistic underlying representation learning, dimension reduction, and dependence modeling. While covariance has been well understood in Euclidean spaces, it is ill-defined for random objects residing on nonlinear Riemannian manifolds, which increasingly arise in modern machine learning applications involving shapes, symmetric positive definite (SPD) matrices, etc. This paper introduces an intrinsic Riemannian cross-covariance for manifold-valued random objects. Our approach defines covariance and correlation by transporting local variations to a common tangent space via parallel transport, yielding a second-order descriptor that is independent of arbitrary coordinate choices. We establish that the proposed covariance inherits desirable properties of its Euclidean counterparts and characterize its asymptotic behavior. Numerical studies on spheres and SPD manifolds, together with real-data experiments on heart valve shapes in Kendall's shape space, demonstrate the effectiveness of our estimators and verify the stated properties. Our results position the Riemannian covariance as a fundamental tool for second-order learning and analysis in non-Euclidean representation spaces.

2606.01854 2026-06-11 stat.ME 版本更新

A Uniform Improvement of the Benjamini-Hochberg Procedure via e-Closure

使用e-闭包对Benjamini-Hochberg方法的统一改进

Jelle Goeman

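AI summary: Proposes closed BH, a uniform improvement of the BH procedure built on the e-Closure principle; under the same assumptions it never rejects fewer hypotheses than BH and can gain power, especially when the number of false null hypotheses is large.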

AI中文摘要

本文提出了closed BH,这是Benjamini和Hochberg(BH)的假发现率控制方法的一种统一改进。Closed BH在BH相同的子集正回归依赖(PRDS)假设下有效。作为一种统一改进,closed BH从不比BH拒绝更少的假设,但它可能拒绝更多。功效的增加尤其当假零假设数量大时观察到。该新方法是使用e-闭包原理构建的,这是最近推导出的多重检验的一般原理。

英文摘要

This paper presents closed BH, a uniform improvement of the False Discovery Rate controlling method of Benjamini and Hochberg (BH). Closed BH is valid under the same assumption of Positive Regression Dependency on a Subset (PRDS) as BH, but also under an alternative and weaker minimal sufficient condition. As a uniform improvement, closed BH never rejects fewer hypotheses than BH, but it may reject quite a few more. An increase in power is observed especially when the number of false null hypotheses is large. The novel method is constructed using the e-Closure principle, a recently derived general principle for multiple testing. The method is implemented in the eClosure package in R.
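For reference, the baseline that closed BH improves on can be sketched in a few lines. This is the standard Benjamini-Hochberg step-up procedure only, not closed BH and not the eClosure R package; the function name and the example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard BH step-up: return sorted indices of rejected hypotheses.

    Finds the largest rank k with p_(k) <= k * alpha / m and rejects the
    k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k = rank  # keep the largest rank that passes (step-up)
    return sorted(order[:k])


# Illustrative p-values (not from the paper).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(p, alpha=0.05)  # rejected == [0, 1]
```

According to the abstract, closed BH would reject at least these hypotheses on the same input and possibly more.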

2605.21641 2026-06-11 stat.ME stat.CO 版本更新

Stable direct estimation for GPLSIAMs using P-splines with dynamically updated boundaries

使用动态更新边界的P样条实现GPLSIAMs的稳定直接估计

Danilo V. Silva, Gilberto A. Paula

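AI summary: Proposes a stable direct estimation method for GPLSIAMs that uses model matrices and the penalized complete Fisher information matrix to dynamically update the boundaries of the single-index covariates within a unified iterative framework, enabling fast computation of effective degrees of freedom and pointwise confidence bands.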

AI中文摘要

广义部分线性单指数加性模型(GPLSIAMs)兼具函数灵活性与参数降维,同时保持可解释性,因而在不同领域得到日益广泛的应用。然而,其估计面临严重的计算挑战。本文提出一种新的稳定方法,在统一的迭代框架内,利用由单指数系数定义的各单指数效应的模型矩阵以及惩罚完全Fisher信息矩阵,动态更新单指数协变量的边界。推导出的模型矩阵使得能够快速计算估计的有效自由度和单指数效应的逐点置信带。平滑参数更新通过广义Fellner-Schall方法整合到迭代过程中,该方法复用已得到的矩阵分解,从而为全局惩罚优化问题提供高效近似。在中等样本量和非高斯分布下的模拟研究证实了估计在多个场景下的经验一致性。值得注意的是,所提出的方法在最先进的竞争方法无法恢复真实单指数系数和非线性函数的情形下仍保持稳定,并且在计算最密集的场景中比常规两步方法快80.13倍。通过对Capital Bike Sharing数据的应用展示了该方法的建模优势:其中每年各有一个单指数交互效应,且单指数系数各不相同,这一复杂结构使竞争方法无法适用。所提出的方法已在R中实现,并提供了相应函数以保证比较的可重复性和透明性。

英文摘要

Generalized partially linear single-index additive models (GPLSIAMs) have been increasingly applied across diverse areas due to their versatility in integrating functional flexibility with parametric dimension reduction while maintaining interpretability. However, the estimation presents severe computational challenges. This paper introduces a novel stable method that uses the model matrix for each single-index effect, defined by its single-index coefficients, and the penalized complete Fisher information matrix to dynamically update the boundaries of the single-index covariates within a unified iterative framework. The derived model matrices enable the fast computation of the estimated effective degrees of freedom and pointwise confidence bands for the single-index effects. The smoothing parameter updates are integrated into the iterative process via the generalized Fellner-Schall method, which recycles the derived matrix decompositions, thereby providing an efficient approximation to the global penalized optimization problem. Simulation studies with moderate sample sizes under non-Gaussian distributions confirm the empirical consistency of the estimation across multiple scenarios. Notably, the proposed approach remains stable where state-of-the-art competitive methods fail to recover true single-index coefficients and nonlinear functions, and is 80.13 times faster than the usual two-step method in the most computationally intensive scenario. The modeling advantage is illustrated through an application to Capital Bike Sharing data, where we deal with a single-index interaction effect for each year, with distinct single-index coefficients, a complex structure that makes competitive methods inapplicable. The proposed method is implemented in R, with functions available for reproducibility and transparency in comparisons.

2603.22668 2026-06-11 math.ST stat.ME 版本更新

Fixed-level calibration of the Cauchy combination test

柯西组合检验的固定水平校准

Hirofumi Ota

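AI summary: Studies the asymptotic exactness of the Cauchy combination test (CCT) at a fixed significance level, shows that the raw CCT is generally not exact there, and proposes the boundary-layer calibrated CCT (BL-CCT), which corrects the reference distribution rather than the statistic to restore asymptotic exactness while retaining power under several classes of alternatives.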

Comments
Added several related references, conducted power analyses and polished the proofs and the simulation section
AI中文摘要

柯西组合检验(CCT)被广泛使用,因为它能产生闭式组合$p$值,并且在宽依赖结构下当名义水平$\alpha\downarrow0$时渐近有效。我们研究了一个不同的渐近问题:当组合$p$值的数量$K$在依赖下增长时,通常的柯西截断值在普通固定水平下是否仍然准确。在典型单因子等相关高斯copula模型下,我们证明原始CCT在固定$\alpha$下通常不是渐近精确的。在固定正相关下,统计量收敛到随机潜在因子极限,因此不存在通用的固定水平参考分布。当公共相关$\rho_K$随$K$减弱时,固定水平行为由边界层尺度$s_K=\sqrt{\rho_K}(\log K)^{3/2}$控制,且原始CCT渐近精确当且仅当$\rho_K(\log K)^3\to0$。由于大小失真完全来自参考分布而非统计量,因此可以在不修改检验统计量本身的情况下进行校正。我们提出了边界层校准CCT(BL-CCT),它用单参数高斯平滑柯西族替代标准柯西参考分布。与最近修改检验统计量的变体不同,BL-CCT保持统计量不变,仅校正参考分布。BL-CCT在更弱的条件$\rho_K\log K\to0$下渐近精确,并在有界边界层上提供有用的有限$K$近似。我们还进行了若干功效分析:尽管BL-CCT仅提高了截断值,但在局部密集、稀疏和密集高斯备择假设下,它在精确度尺度上相对于原始CCT没有一阶功效损失。数值实验支持校准理论。

英文摘要

The Cauchy combination test (CCT) is widely used because it yields a closed-form combined $p$-value and is known to be asymptotically valid as the nominal level $\alpha\downarrow0$ under broad dependence structures. We study a different asymptotic question: whether the usual Cauchy cutoff remains accurate at an ordinary fixed level when the number $K$ of combined $p$-values grows under dependence. Under a canonical one-factor equicorrelated Gaussian copula model, we show that the raw CCT is generally not asymptotically exact at fixed $\alpha$. With fixed positive correlation, the statistic converges to a random latent-factor limit, so there is no universal fixed-level reference law. When the common correlation $\rho_K$ weakens with $K$, fixed-level behaviour is governed by the boundary-layer scale $s_K=\sqrt{\rho_K}(\log K)^{3/2}$, and the raw CCT is asymptotically exact if and only if $\rho_K(\log K)^3\to0$. Because the size distortion arises entirely from the reference law and not from the statistic, it can be corrected without modifying the test statistic itself. We propose the boundary-layer calibrated CCT (BL-CCT), which replaces the standard Cauchy reference by a one-parameter Gaussian-smoothed Cauchy family. Unlike recent variants that modify the test statistic, BL-CCT leaves the statistic unchanged and corrects only the reference law. BL-CCT is asymptotically exact under the weaker condition $\rho_K\log K\to0$ and provides a useful finite-$K$ approximation on bounded boundary layers. We also conduct several power analyses: although BL-CCT only raises the cutoff, it incurs no first-order power loss relative to the raw CCT on the exactness scale, under local dense, sparse, and dense Gaussian alternatives. Numerical experiments support the calibration theory.
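The closed-form combined $p$-value of the raw CCT, which the abstract takes as its starting point, can be sketched as follows. This is the standard Cauchy combination rule with the usual Cauchy reference law, not the BL-CCT calibration proposed in the paper; the function name and the equal default weights are assumptions.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Raw Cauchy combination test: combine p-values in (0, 1).

    T = sum_i w_i * tan((0.5 - p_i) * pi), with weights summing to one,
    is referred to a standard Cauchy law: p_comb = 0.5 - atan(T) / pi.
    """
    m = len(p_values)
    if weights is None:
        weights = [1.0 / m] * m  # equal weights by default (assumption)
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, p_values))
    return 0.5 - math.atan(t) / math.pi


# Identical inputs are returned unchanged, a quick sanity check of the
# transform and its inverse.
combined = cauchy_combination([0.2, 0.2, 0.2])
```

Per the abstract, BL-CCT would keep the statistic `t` as computed here and change only the reference law used in the final step.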

2505.03649 2026-06-11 stat.ML cs.LG math.CO math.PR 版本更新

Weighted Random Dot Product Graphs

加权随机点积图

Bernardo Marenco, Paola Bermolen, Marcelo Fiori, Federico Larroca, Gonzalo Mateos

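AI summary: Proposes the weighted random dot product graph (WRDPG) model, in which inner products of nodal latent positions specify the moments, including higher-order ones, of the edge-weight distributions, and provides statistical guarantees for a spectral-embedding estimator together with a generative framework.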

Comments
30 pages, 12 figures, code to generate Figures 3 to 12 available at this https URL. Updated to match the published version
AI中文摘要

复杂关系模式的建模已成为当代统计研究和相关数据科学领域的基石。以图形式表示的网络为这种分析提供了自然框架。本文扩展了随机点积图(RDPG)模型以适应加权图,显著拓宽了该模型的适用范围,使其能够处理边权呈现异质分布的场景。我们提出了一种非参数加权(W)RDPG模型,为每个节点分配一系列潜位置。这些节点向量的内积通过矩生成函数指定其关联边权分布的矩。与现有技术不同,WRDPG能够区分具有相同均值但高阶矩不同的权重分布。我们推导了基于工作马邻接谱嵌入的节点潜位置估计量的统计保证,建立了其一致性和渐近正态性。我们还贡献了一个生成框架,能够采样符合(指定或数据拟合的)WRDPG的图,从而促进例如使用恰当的参考分布对观测图指标进行分析和检验。本文组织如下:形式化模型定义、估计(或节点嵌入)过程及其保证,以及生成加权图的方法,所有内容均辅以说明性和可重复的示例,展示WRDPG在各种网络分析应用中的有效性。

英文摘要

Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the Random Dot Product Graph (RDPG) model to accommodate weighted graphs, markedly broadening the model's scope to scenarios where edges exhibit heterogeneous weight distributions. We propose a nonparametric weighted (W)RDPG model that assigns a sequence of latent positions to each node. Inner products of these nodal vectors specify the moments of their incident edge weights' distribution via moment-generating functions. In this way, and unlike prior art, the WRDPG can discriminate between weight distributions that share the same mean but differ in higher-order moments. We derive statistical guarantees for an estimator of the nodes' latent positions adapted from the workhorse adjacency spectral embedding, establishing its consistency and asymptotic normality. We also contribute a generative framework that enables sampling of graphs that adhere to a (prescribed or data-fitted) WRDPG, facilitating, e.g., the analysis and testing of observed graph metrics using judicious reference distributions. The paper is organized to formalize the model's definition, the estimation (or nodal embedding) process and its guarantees, as well as the methodologies for generating weighted graphs, all complemented by illustrative and reproducible examples showcasing the WRDPG's effectiveness in various network analytic applications.

2603.02566 2026-06-11 stat.ME 版本更新

Modeling double bounded data based on correlated gamma random variables

基于相关伽马随机变量的双有界数据建模

Roberto Vila, Felipe Quintino, Marcelo Bourguignon

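AI summary: For bounded data on the unit interval that arise as ratios, proposes a new model that links correlated gamma variables through a copula, relaxing the usual independence assumption and allowing positive and negative correlation; its flexibility and effectiveness are assessed with simulations and real economic data.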

Comments
41 pages, 14 figures
AI中文摘要

许多定义在单位区间上的有界数据自然以 $X/(X + Y)$ 的比率形式出现。在现有文献中,针对此类有界数据的主要统计模型通常基于随机变量 $X$ 和 $Y$ 独立的假设。然而,在实际应用中,由于共享的潜在机制或共同的变异来源,$X$ 和 $Y$ 往往存在相关性,因此这一假设常常不切实际。在本文中,我们克服了这些局限性,提出了一种模型,其中两个分量的边际分布通过Copula连接,从而得到更灵活、更真实的单位区间数据表示。特别地,在所提出的模型中,$X$ 和 $Y$ 是相依的伽马随机变量,其联合分布通过Morgenstern二元分布指定,允许分量之间存在正相关和负相关。我们严格研究了其数学性质和实践应用。所得分布呈现广泛的形状,适应不同程度的偏度,并且在某些参数配置下,具有更复杂的密度结构。进行了蒙特卡洛模拟研究,表明最大似然估计在多种参数选择场景下具有良好的性能。还讨论了高效基于似然计算的潜力和局限性。我们通过建模与经济相关的真实数据集,评估了新模型及其估计的有效性。

英文摘要

Many types of bounded data defined on the unit interval arise naturally as ratios of the form $X/(X + Y)$. In the existing literature, the main statistical models proposed for this type of bounded data are typically based on the assumption that the random variables $X$ and $Y$ are independent. However, this assumption is often unrealistic in practical applications, where $X$ and $Y$ tend to be correlated due to shared underlying mechanisms or common sources of variability. In this paper, we overcome such limitations and propose a model in which the marginal distributions of the two components are linked by a copula, leading to a more flexible and realistic representation of unit-interval data. In particular, in the proposed model, $X$ and $Y$ are dependent gamma random variables whose joint distribution is specified via Morgenstern's bivariate distribution, allowing for positive and negative correlations between the components. The mathematical properties and practical applications are rigorously investigated. The resulting distribution exhibits a wide range of shapes, accommodating different degrees of skewness and, for some parameter configurations, more complex density structures. A Monte Carlo simulation study shows the good performance of the maximum likelihood estimator across several parameter settings. The potential and limitations of efficient likelihood-based computations are also discussed. We evaluate the effectiveness of the new model and its estimates in modeling real-world datasets related to economics.

2511.11862 2026-06-11 econ.EM math.ST stat.ME 版本更新

Compound Selection Decisions: An Almost SURE Approach

复合选择决策:一种几乎无偏的SURE方法

Jiafeng Chen, Lihua Lei, Timothy Sudijono, Liyang Sun, Tian Xie

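AI summary: For compound selection problems in the Gaussian sequence model, proposes ASSURE, an almost unbiased SURE-inspired estimator of expected utility; decision rules are chosen by optimizing the estimated utility, and the resulting rule is shown to be asymptotically no worse than the best rule in the pre-specified class.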

Comments
V2: Additional Results and Simulations. 110 pages. Comments welcome
AI中文摘要

本文提出了在高斯序列模型中生成复合选择决策的方法。给定未知的固定参数 $\mu_ {1:n}$ 和已知的 $\sigma_{1:n}$,观测值 $Y_i \sim \textsf{N}(\mu_i, \sigma_i^2)$,决策者希望选择一个子集 $S$ 以最大化效用 $\frac{1}{n}\sum_{i\in S} (\mu_i - K_i)$,其中 $K_i$ 为已知成本。受Stein无偏风险估计(SURE)启发,我们引入了一种几乎无偏的估计量,称为ASSURE,用于估计给定决策规则的期望效用。ASSURE允许用户通过优化估计福利,从预先指定的类别中选择福利最大化的规则,从而产生能够跨噪声估计借用强度的选择决策。我们证明,ASSURE产生的决策规则在渐近意义上不劣于预指定类别中最优但不可行的决策规则。我们将ASSURE应用于经济机会的人口普查区选择、歧视性企业的识别以及A/B测试中 $p$ 值决策程序的分析。

英文摘要

This paper proposes methods for producing compound selection decisions in a Gaussian sequence model. Given unknown, fixed parameters $\mu_ {1:n}$ and known $\sigma_{1:n}$ with observations $Y_i \sim \textsf{N}(\mu_i, \sigma_i^2)$, the decision maker would like to select a subset of indices $S$ so as to maximize utility $\frac{1}{n}\sum_{i\in S} (\mu_i - K_i)$, for known costs $K_i$. Inspired by Stein's unbiased risk estimate (SURE), we introduce an almost unbiased estimator, called ASSURE, for the expected utility of a proposed decision rule. ASSURE allows a user to choose a welfare-maximizing rule from a pre-specified class by optimizing the estimated welfare, thereby producing selection decisions that borrow strength across noisy estimates. We show that ASSURE produces decision rules that are asymptotically no worse than the optimal but infeasible decision rule in the pre-specified class. We apply ASSURE to the selection of Census tracts for economic opportunity, the identification of discriminating firms, and the analysis of $p$-value decision procedures in A/B testing.

2506.00330 2026-06-11 physics.data-an cs.IT stat.ML 版本更新

Accurate Estimation of Mutual Information in High Dimensional Data

高维数据中互信息的准确估计

Eslam Abdelaleem, K. Michael Martini, Ilya Nemenman

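AI summary: To address mutual information estimation in high-dimensional, undersampled regimes, proposes neural estimators built on low-dimensional latent representations, combined with statistical consistency checks, bias correction, and confidence intervals, and introduces the VSIB family of probabilistic critics; reports reliable estimates on synthetic and real image data.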

Comments
15 pages main text, 21 pages SI, 12 Figs overall
AI中文摘要

互信息(MI)量化变量之间的统计依赖性,广泛应用于科学领域,但从有限数据中准确估计仍然非常困难。常见方法在现代实验典型的高维欠采样场景($N \lesssim K$)中失败,且没有公认的测试来检测基于神经网络的估计器何时失效,使其实际上无法作为科学仪器使用。我们证明,当统计依赖关系具有低维潜在表示时,神经MI估计器可以变得可靠。样本复杂度由潜在维度$K_Z \ll K$而非环境维度决定——我们通过随机矩阵理论从经验上确认并从理论上奠定了这一机制转变。基于这一见解,我们开发了一个实用协议,为神经估计器提供显式的统计一致性检查、偏差校正和置信区间。此外,我们引入了一类新的概率批评器(VSIB族),在标准估计器失效的高MI值下显著降低偏差和方差。我们在合成基准($K=500$,$N$低至256)、Czyz等人(2023)的标准40数据集基准套件、噪声MNIST($K=784$)以及使用ResNet-20骨干网络的CIFAR-10/100($K=3072$)上验证了该协议。我们的协议始终匹配或超越现有方法,同时是唯一报告置信区间并标记不可靠估计的方法,在真实图像上实现了远低于环境像素维度的可靠MI检测。

英文摘要

Mutual information (MI) quantifies statistical dependence between variables and is widely used across scientific disciplines, yet accurate estimation from finite data remains notoriously difficult. Common approaches fail in high-dimensional, undersampled regimes ($N \lesssim K$) typical of modern experiments, and no accepted tests exist to detect when neural network-based estimators fail, making them effectively unusable as scientific instruments. We show that neural MI estimators can be made reliable when the statistical dependencies admit a low-dimensional latent representation. Sample complexity is then governed by the latent dimensionality $K_Z \ll K$ rather than the ambient dimension -- a regime shift we confirm empirically and ground theoretically via random matrix theory. Building on this insight, we develop a practical protocol that provides neural estimators with explicit statistical consistency checks, bias correction, and confidence intervals. We additionally introduce a new class of probabilistic critics (the VSIB family) that substantially reduce bias and variance at higher MI values where standard estimators break down. We validate the protocol on synthetic benchmarks ($K=500$, $N$ as low as $256$), on the standard 40-dataset benchmark suite of Czyz et al. (2023), on noisy MNIST ($K=784$), and on CIFAR-10/100 ($K=3072$) with a ResNet-20 backbone. Our protocol consistently matches or exceeds existing methods while being the only approach to report confidence intervals and flag unreliable estimates, achieving reliable MI detection well below the ambient pixel dimension on real images.

2509.10817 2026-06-11 math.ST stat.ME 版本更新

Conditional Independence Testing Using Exchangeable Pairs

使用可交换对的条件独立性检验

Bilol Banerjee

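AI summary: Proposes a conditional independence test based on exchangeable pairs, which recasts the problem as a two-sample test, measures departures with an energy-distance-type discrepancy, and is shown to be consistent and to attain the minimax separation rate.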

AI中文摘要

本文考虑在给定混杂随机向量 \(\bm Z\) 的情况下,检验两个随机向量 \(\bm X\) 和 \(\bm Y\) 之间的条件独立性问题。引入了一个可交换对框架,通过该框架将条件独立性检验问题重新表述为两样本检验问题。该框架受模型X文献思想的启发,基于在原假设条件独立性下成立的基本可交换性性质。采用能量距离/最大均值差异类型的度量来衡量可交换对与条件独立性的偏离。构建了所提出的差异度量的一致估计量,并在一般假设下建立了其理论性质。然后,使用该估计量作为检验统计量开发了条件独立性检验,并通过适当的重采样程序进行校准。结果表明,所提出的检验对固定备择假设是一致的,对局部邻接备择假设具有非平凡的渐近功效,达到了检测由所提出的差异度量表征的备择假设的极小极大分离率,并且在数据维度随样本量发散时仍然一致。还研究了用于生成可交换对的条件分布估计的影响,并建立了保持有效性和功效性质的条件。广泛的模拟研究表明,所提出的方法与一些最先进的方法相比具有竞争力。

英文摘要

This article considers the problem of testing conditional independence between two random vectors \(\bm X\) and \(\bm Y\) given a confounding random vector \(\bm Z\). An exchangeable-pairs framework is introduced through which the conditional independence testing problem is reformulated as a two-sample testing problem. The framework is motivated by ideas from the model-X literature and is based on a fundamental exchangeability property that holds under the null hypothesis of conditional independence. An energy-distance/maximum mean discrepancy type measure is employed on the resulting exchangeable pairs to quantify departures from conditional independence. A consistent estimator of the proposed discrepancy measure is constructed and its theoretical properties are established under general assumptions. A conditional independence test is then developed using this estimator as a test statistic and is calibrated through a suitable resampling procedure. It is shown that the proposed test is consistent against fixed alternatives, possesses nontrivial asymptotic power against local contiguous alternatives, attains the minimax separation rate for detecting alternatives characterized by the proposed discrepancy measure, and remains consistent when the data dimension diverges with the sample size. The effect of estimating the conditional distribution used to generate the exchangeable pairs is also investigated, and conditions under which validity and power properties are preserved are established. Extensive simulation studies demonstrate that the proposed procedure performs competitively with some state-of-the-art methods.
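As background for the two-sample step, the generic energy distance between two samples can be sketched as below. This is the textbook plug-in (V-statistic) form for scalar data, $2\,\mathbb{E}|X-Y| - \mathbb{E}|X-X'| - \mathbb{E}|Y-Y'|$, and not the paper's discrepancy on exchangeable pairs; the function names are hypothetical.

```python
def mean_abs_diff(a, b):
    """Average of |u - v| over all pairs (u, v) with u in a and v in b."""
    return sum(abs(u - v) for u in a for v in b) / (len(a) * len(b))


def energy_distance(xs, ys):
    """Plug-in (V-statistic) energy distance between two scalar samples.

    Estimates 2 E|X - Y| - E|X - X'| - E|Y - Y'|, which is non-negative
    and equals zero when the two samples coincide.
    """
    return (2.0 * mean_abs_diff(xs, ys)
            - mean_abs_diff(xs, xs)
            - mean_abs_diff(ys, ys))


same = energy_distance([0.0, 1.0], [0.0, 1.0])   # identical samples: 0.0
apart = energy_distance([0.0], [1.0])            # point masses at 0 and 1: 2.0
```

In a test of the kind the abstract describes, a statistic like this would be compared against a resampling distribution rather than a fixed cutoff.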

2405.01651 2026-06-11 stat.ME 版本更新

Confidence regions for a persistence diagram of a single image with one or more loops

含一个或多个环的单张图像的持久图置信区域

Susan Glenn, Jessi Cisewski-Kehe, Jun Zhu, William M. Bement

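AI summary: Proposes a TDA-based framework for estimating the underlying structure in a single image and quantifying its uncertainty: the image is partitioned into background and damaged-cell regions, and confidence regions are built in the space of persistence diagrams, correcting the bias of traditional TDA approaches.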

Comments
30 pages, 8 figures
AI中文摘要

拓扑数据分析(TDA)利用持续同调量化数据中的环形和高维孔洞,尤其适用于细胞生物学中细胞图像特征分析。在细胞损伤情况下,随着时间推移,细胞图像中会出现环状伤口并逐渐消失。对单张图像中环状模式进行统计推断具有挑战性,因为缺乏重复样本。本文提出一种新颖的框架,利用TDA估计单张图像中的底层结构并量化相关不确定性。我们的方法将图像分为背景和受损细胞区域,然后利用受影响细胞区域的像素在持久图空间中建立置信区域。该方法在持久图上建立估计以纠正传统TDA方法的偏差。通过模拟研究评估所提置信区域的覆盖概率,并与本文提出的替代方法进行比较。我们还通过细胞修复提供的实际例子展示了我们的方法。

英文摘要

Topological data analysis (TDA) uses persistent homology to quantify loops and higher-dimensional holes in data, making it particularly relevant for examining the characteristics of images of cells in the field of cell biology. In the context of a cell injury, as time progresses, a wound in the form of a ring emerges in the cell image and then gradually vanishes. Performing statistical inference on this ring-like pattern in a single image is challenging due to the absence of repeated samples. In this paper, we develop a novel framework leveraging TDA to estimate underlying structures within individual images and quantify associated uncertainties through confidence regions. Our proposed method partitions the image into the background and the damaged cell regions. Then pixels within the affected cell region are used to establish confidence regions in the space of persistence diagrams (topological summary statistics). The method establishes estimates on the persistence diagrams which correct the bias of traditional TDA approaches. A simulation study is conducted to evaluate the coverage probabilities of the proposed confidence regions in comparison with an alternative approach that is also proposed in this paper. We also illustrate our methodology with a real-world example from cell repair.

2. Bayesian Statistics and Probabilistic Modeling (10 papers)

2606.12305 2026-06-11 stat.ME 新提交

Bayesian nonparametric Mallows model for clustering preference data

贝叶斯非参数Mallows模型用于偏好数据聚类

Lorenzo Zuccato, Veronica Vinciotti, Valeria Vitelli

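AI summary: Proposes a Bayesian nonparametric Mallows model based on a Dirichlet process mixture, enabling joint inference on the number of clusters and the cluster allocation. The sampler is implemented in the R package BayesMallows and is evaluated on simulated and real data.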

详情
Comments
21 pages (main text), 28 pages including supplementary material. Submitted for peer review
AI中文摘要

偏好学习是指从不同类型的排序和偏好数据中学习潜在模式。偏好学习的典型目标是推断共享共识排序、学习个体级偏好以及进行无监督聚类。Mallows模型是少数能够同时实现所有这些目标的方法之一。先前的工作基于MCMC Metropolis-Hastings方案开发了计算上可行的贝叶斯推断方法,其中通过有限混合Mallows模型进行聚类,然后对聚类数进行后验推断。这里我们提出基于狄利克雷过程混合模型的贝叶斯非参数Mallows模型,允许对非空聚类数和聚类分配进行联合推断,以及对聚类特定参数进行后验推断。所提出的采样算法已集成到现有的R包BayesMallows中,该包还支持不完整排序和成对比较形式的数据。模拟数据表明,与有限混合模型相比,非参数模型在恢复正确聚类数方面表现良好,而电影评分的实证数据展示了该模型在丢弃评分上提供个性化电影推荐的有效性。

英文摘要

Preference learning refers to the learning of latent patterns from ranking and preference data of different kinds. Typical aims of preference learning are to infer a shared consensus ranking, to learn individual-level preferences, and to perform unsupervised clustering. The Mallows model is among the few approaches that can achieve all these objectives jointly. Previous work has developed computationally tractable methods for Bayesian inference based on an MCMC Metropolis-Hastings scheme, where clustering is performed via a finite mixture of Mallows models. Inference on the number of clusters is then conducted a posteriori. Here we propose a Bayesian nonparametric Mallows model, based on a Dirichlet process mixture model. This allows joint inference on the number of non-empty clusters and on the clustering allocation, as well as posterior inference on cluster-specific parameters. The implementation of the proposed sampling algorithm is integrated into the existing R package BayesMallows, which also supports data in the form of incomplete rankings and pairwise comparisons. Simulated data show good performance of the nonparametric model compared to a finite mixture model in terms of recovery of the correct number of clusters, while empirical data on movie ratings show the model's effectiveness in providing personalized movie recommendations on discarded ratings.

2606.12296 2026-06-11 stat.ME 新提交

Bayesian Triangulation Splines: Spatial Adaptation on Irregular Domains

贝叶斯三角剖分样条:不规则域上的空间自适应

Sihyeon Pyeon, Sunwoo Lim, Seonghyun Jeong

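AI summary: Proposes Bayesian triangulation splines, which use constrained Delaunay triangulations to handle irregular domain boundaries and heterogeneous smoothness and thereby achieve spatial adaptation, with proofs of the optimal posterior contraction rate and an oracle property.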

详情
AI中文摘要

针对二维非矩形域的传统非参数回归方法常常忽略域几何结构,允许跨边界平滑。在空间和地质统计应用中,这一假设通常无效,因为域边界通常约束观测之间的相互作用。适应空间变化的平滑度也比单变量设置更具挑战性,大多数现有方法未能充分捕捉目标函数的局部结构。为了解决这些问题,我们提出了贝叶斯三角剖分样条,该方法在多边形域上构造局部自适应样条。该方法采用约束Delaunay三角剖分来尊重边界几何并适应异质性平滑。精心设计的先验进一步提高了经验性能。在全局Sobolev平滑假设下,我们证明了所提方法实现了最优后验收缩率,并适应未知平滑度。我们还表明,该方法在实现非均匀或局部变化结构特征的Oracle率方面表现出理想的空间适应性。至关重要的是,这种Oracle保证并非特定于约束Delaunay三角剖分,而是适用于任何满足弱形状正则条件的三角剖分。模拟研究证实,所提方法通过实现更高的估计精度同时保持低模型复杂度,优于现有方法。

英文摘要

Conventional nonparametric regression methods for two-dimensional non-rectangular domains often overlook domain geometry and allow smoothing across boundaries. In spatial and geostatistical applications, this assumption is frequently invalid because domain boundaries typically constrain interactions among observations. Accommodating spatially varying smoothness is also substantially more challenging than in the univariate setting, and most existing methods do not adequately capture this local structure of the target function. To address these challenges, we propose Bayesian triangulation splines, which construct locally adaptive splines over a polygonal domain. The method employs constrained Delaunay triangulations to respect boundary geometry and adapt to heterogeneous smoothness. A carefully designed prior further improves empirical performance. Under a global Sobolev smoothness assumption, we show that the proposed method achieves the optimal posterior contraction rate and adapts to unknown smoothness. We also show that the method exhibits ideal spatial adaptation in the sense that it achieves the oracle rate for inhomogeneous or locally varying structural features. Crucially, this oracle guarantee is not specific to constrained Delaunay triangulations, but holds over any triangulation satisfying weak shape-regularity conditions. Simulation studies confirm that the proposed method outperforms existing approaches by achieving higher estimation accuracy while maintaining low model complexity.

2606.12164 2026-06-11 stat.ME 新提交

Bayesian Effect Selection for Additive Quantile Regression with an Application to Air Pollution Thresholds

加性分位数回归的贝叶斯效应选择及其在空气污染阈值中的应用

Nadja Klein, Aaron Wei Qi Lee, Jorge Mateu

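AI summary: Proposes a Bayesian effect selection method that uses a Demmler-Reinsch basis expansion to orthogonally decompose each additive effect into linear and nonlinear parts and selects them with spike-and-slab priors. Applied to air pollution data from Madrid, it reveals the drivers of extreme NO2 concentrations.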

详情
Comments
arXiv admin note: substantial text overlap with arXiv:2105.10890
AI中文摘要

空气污染监管限值通常以浓度阈值超标来定义,这些阈值自然与污染物分布的条件分位数相关,因此直接关系到严重污染事件的评估。同时,不仅要确定协变量是否影响空气污染,还要确定这种影响是线性、非线性还是两者兼有。我们通过开发加性分位数回归的贝叶斯效应选择方法来解决这些问题。虽然惩罚样条的常用混合模型表示(MMR)允许灵活的非线性效应,但它们不能提供线性和非线性效应成分的有意义分离。因此,我们采用Demmler-Reinsch基展开,将每个加性效应正交分解为线性和非线性部分,并从理论上证明两个效应成分可以一致估计。为了促进数据驱动的模型构建,我们提出贝叶斯效应选择,对与线性和非线性成分相关的标量重要性参数分别使用尖峰-板先验,并实现高效的Gibbs采样器。通过模拟研究,我们展示了该方法对非对称拉普拉斯工作似然引起的误设具有鲁棒性,并显示出相对于MMR的优越性能。在对西班牙马德里空气污染数据的详细分析中,我们强调了灵活建模极端二氧化氮(NO$_2$)浓度的附加价值,并揭示了阈值相关的污染水平受气候变量和交通相关空间结构的不同驱动。这些发现强调了需要先进的统计模型来支持短期决策,并帮助地方当局减轻或潜在防止NO$_2$浓度限值超标。

英文摘要

Air pollution regulatory limits are typically defined in terms of exceedances of concentration thresholds, which are naturally related to conditional quantiles of the pollutant distribution and are therefore of direct relevance for assessing severe pollution events. At the same time, it is important to determine not only whether a covariate affects air pollution but also whether this effect is linear, nonlinear, or both. We address these issues by developing a Bayesian effect selection approach for additive quantile regression. While commonly used mixed model representations (MMRs) of penalized splines allow for flexible nonlinear effects, they do not provide a meaningful separation of linear and nonlinear effect components. We therefore employ a Demmler-Reinsch basis expansion, which yields an orthogonal decomposition of each additive effect into linear and nonlinear parts, and show theoretically that both effect components can be estimated consistently. To facilitate data-driven model building, we propose Bayesian effect selection with separate spike and slab priors on the scalar importance parameters associated with the linear and nonlinear components and implement an efficient Gibbs sampler. Through simulation studies, we demonstrate robustness to the misspecification induced by the employed asymmetric Laplace working likelihood and show superior performance relative to the MMR. In a detailed analysis of air pollution data in Madrid, Spain, we highlight the added value of flexibly modeling extreme nitrogen dioxide (NO$_2$) concentrations and reveal that threshold-relevant pollution levels are driven differently by climatological variables and traffic-related spatial structure. These findings underline the need for advanced statistical models that support short-term decision-making and help local authorities mitigate, or potentially prevent, exceedances of NO$_2$ concentration limits.

2606.11876 2026-06-11 q-bio.QM cs.LG stat.ME 新提交

Seeing Below the Limit of Detection: A Censored-Poisson Bayesian Latent-Growth Change-Point Detector (the Span Detector) for Serial ctDNA in HR+/HER2- Metastatic Breast Cancer

检测限以下:用于HR+/HER2-转移性乳腺癌连续ctDNA的删失泊松贝叶斯潜在增长变点检测器(Span检测器)

Aarchi Singh Thakur, Abhijoy Sarkar

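AI summary: Proposes the Span detector, a censored-Poisson Bayesian latent-growth change-point model that treats ctDNA non-detects as left-censored observations and uses a sequential generalised-likelihood-ratio statistic to detect an upward change in the per-variant detection rate. On a synthetic cohort at a matched 10% false-alarm rate, it raises the fraction of progressions caught three months ahead from 11% to 25% in the indolent regime.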

详情
Comments
9 pages, 4 figures, 2 tables. Code and synthetic data generator: this https URL
AI中文摘要

循环肿瘤DNA(ctDNA)在影像学显示耐药性数月前就已携带证据,但最早证据存在于检测限(LoD)以下:新生亚克隆仅被间歇性检测到,产生微弱检测和非检测的闪烁序列。商业液体活检将每次抽取视为独立快照,并将非检测视为无信号。我们认为非检测是左删失观测,而随时间变化的非检测和微弱检测模式在单个值可信之前就携带了可操作的生长证据。我们引入Span,一种删失泊松贝叶斯潜在增长变点检测器,它对二元检测过程建模,为每个变异的检测率累积一个向上变点的序贯广义似然比统计量,并以校准的假警报控制发出竞争风险警报。Span没有学习权重,因此没有过拟合风险。在一线CDK4/6抑制剂联合内分泌治疗的HR+/HER2-转移性乳腺癌合成队列中,在匹配的10%假警报率下,Span将提前三个月捕获的即将进展比例大约翻倍(惰性出现:25% vs 快照的11%),具有可证伪的剂量反应:对惰性出现效果显著,对快速出现效果消失。值轨迹基线表现与快照相同,将增益归因于删失检测模型。生存主干在真实乳腺癌数据(GBSG-2,n=686;C指数0.67 vs 0.68)上与Cox基线匹配,在具有清洁生物标志物的真实纵向队列(PBC2,n=312)上,同一管道正确拒绝获胜,这是一个可证伪的边界测试,确认机制是特定于状态的。所有ctDNA轨迹均为合成数据。

英文摘要

Circulating-tumour DNA (ctDNA) carries evidence of drug resistance months before imaging shows it, but the earliest evidence lives below the assay's limit of detection (LoD): a nascent subclone is detected only intermittently, producing a flickering sequence of faint detects and non-detects. Commercial liquid biopsies treat each draw as an independent snapshot and a non-detect as nothing. We argue a non-detect is a left-censored observation, and the pattern of non-detects and faint detects over time carries actionable evidence of growth before any single value is trustworthy. We introduce Span, a censored-Poisson Bayesian latent-growth change-point detector that models the binary detection process, accumulates a sequential generalised-likelihood-ratio statistic for an upward change-point in the per-variant detection rate, and raises a competing-risks alarm with calibrated false-alarm control. Span has no learned weights, so there is nothing to overfit. On a synthetic cohort of HR+/HER2- metastatic breast cancer on first-line CDK4/6-inhibitor plus endocrine therapy, at a matched 10% false-alarm rate, Span roughly doubles the fraction of impending progressions caught three months ahead (indolent regime: 25% vs 11% for the snapshot), with a falsifiable dose-response: large for indolent emergence, vanishing for fast emergence. A value-trajectory baseline performs identically to the snapshot, isolating the gain to the censored detection model. The survival backbone matches a Cox baseline on real breast-cancer data (GBSG-2, n=686; C-index 0.67 vs 0.68), and on a real longitudinal cohort with clean biomarkers (PBC2, n=312) the same pipeline correctly declines to win, a falsifiable boundary test confirming the mechanism is regime-specific. All ctDNA trajectories are synthetic.
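
To make the idea of accumulating evidence from detects and non-detects concrete, here is a deliberately simplified sketch. It assumes the mutant-molecule count per draw is Poisson and that a draw counts as a detect when at least one molecule is present, so the detection probability is 1 - exp(-rate). It then computes a basic generalised-likelihood-ratio statistic for an upward change in that probability. This is not the Span detector itself (there is no Bayesian latent growth, competing-risks alarm, or false-alarm calibration), and every function name, rate, and sequence below is hypothetical.

```python
import math


def detection_probability(rate):
    """P(at least one mutant molecule) when the count is Poisson(rate)."""
    return 1.0 - math.exp(-rate)


def bernoulli_loglik(detections, p):
    """Log-likelihood of a 0/1 detection sequence with detection probability p."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    hits = sum(detections)
    return hits * math.log(p) + (len(detections) - hits) * math.log(1.0 - p)


def glr_upward_change(detections, baseline_rate):
    """Max over candidate change times of the log-likelihood ratio
    (post-change detection probability >= baseline) versus no change."""
    p0 = detection_probability(baseline_rate)
    best = 0.0
    for start in range(len(detections)):
        tail = detections[start:]
        p_hat = max(sum(tail) / len(tail), p0)  # only upward changes count
        stat = bernoulli_loglik(tail, p_hat) - bernoulli_loglik(tail, p0)
        best = max(best, stat)
    return best


# Hypothetical serial draws: 0 = non-detect, 1 = faint detect.
quiet = [0] * 12
flicker = [0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
print(glr_upward_change(quiet, baseline_rate=0.05))
print(glr_upward_change(flicker, baseline_rate=0.05))
```

A sequence of pure non-detects yields a statistic of zero, while the flickering sequence accumulates evidence even though no single draw is decisive. An alarm threshold would have to be calibrated separately.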

2606.11624 2026-06-11 stat.ME 新提交

The Triply-Randomized Negative Binomial Beta for Robust Regression and Conjugate Models of Bounded Support Data

三重随机负二项贝塔分布用于鲁棒回归和有界支持数据的共轭模型

Jimmy Lederman, Aaron Schein

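AI summary: Proposes the triply-randomized negative binomial beta distribution (TNBbeta), obtained by randomizing the parameters of a standard beta distribution, to address the beta distribution's sensitivity to outliers, its inability to handle exact zeros, and its lack of conjugate priors. Pólya-gamma augmentation yields efficient Gibbs sampling.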

详情
AI中文摘要

贝塔分布是许多响应变量支持为$[0,1]$的回归问题中默认的似然函数选择,尽管它对异常值敏感、无法处理精确为零的观测值,并且缺乏闭式共轭先验。我们通过引入三重随机负二项贝塔分布(记为$\mathrm{TNBbeta}(p,\\,q,\\,\varepsilon)$)来解决这些缺陷,该分布由中位数$p$、浓度参数$q$和允许在$0$和$1$处具有正密度的边界参数$\varepsilon$参数化。TNBbeta通过随机化标准贝塔分布的参数(使用三个相依的负二项随机变量)得到,我们证明了每个随机变量的完全条件分布本身也是负二项分布。此外,将$p$和$q$与具有logit链接函数的高斯潜变量连接,通过Pólya-gamma增广得到闭式更新。这些性质共同为有界支持数据的回归模型提供了简单的辅助变量吉布斯采样器,在有效样本量每秒和留一预测方面通常优于标准贝塔回归方法,尤其是在存在异常值的情况下。在森林冠层覆盖度的案例研究中,我们证明了该框架可以轻松融入空间结构和精确零观测。总体而言,这项工作大大扩展了可高效拟合的$[0,1]$有界支持数据的贝叶斯模型类别。

英文摘要

The beta distribution is the default choice of likelihood in many regression problems with a $[0,1]$-bounded support response despite its sensitivity to outliers, inability to accommodate exact zero observations, and a lack of closed-form conjugate priors. We address these shortcomings by introducing the triply-randomized negative binomial beta distribution, denoted $\mathrm{TNBbeta}(p,\,q,\,\varepsilon)$, parameterized by a median $p$, concentration parameter $q$, and boundary parameter $\varepsilon$ which permits positive density at $0$ and $1$. The TNBbeta arises by randomizing the parameters of a standard beta distribution with three dependent negative binomial random variables, each of whose complete conditional distribution we show is itself negative binomial. Moreover, connecting $p$ and $q$ to Gaussian latent variables with logit link functions yields closed-form updates via Pólya-gamma augmentation. Together, these properties yield simple auxiliary-variable Gibbs samplers for regression models of bounded-support data, which often outperform standard beta regression approaches in terms of effective sample size per second and held-out prediction, especially in the presence of outliers. In a case study of forest canopy cover, we demonstrate that this framework can easily incorporate spatial structure and exact zero observations. Overall, this work substantially expands the class of Bayesian models for $[0,1]$-bounded support data that can be fit efficiently.

2605.11340 2026-06-11 stat.ME 版本更新

Hyperbolic Latent Space Models for Network Embedding: Model Specification and Bayesian Inference

双曲潜空间模型用于网络嵌入:模型规范与贝叶斯推断

Yiwei Gong, Anna L. Smith, Dena Asta, Catherine A. Calder

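AI summary: Proposes a Bayesian hyperbolic latent space model for network embedding that captures tree-like structure and heavy-tailed degree distributions, and emphasizes the importance of the temperature parameter for network topology.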

详情
AI中文摘要

许多现实世界网络表现出分层、树状结构和厚尾度分布,这些现象无法被传统网络数据统计模型轻易捕捉。本文基于统计物理的见解,提出具有双曲几何基础的连续潜空间模型,以概率方式将节点嵌入具有恒定负曲率的潜空间。然而,大多数统计实现简化了原始物理模型,忽略了控制潜距离到概率映射锐度的温度参数。本文认为这一省略是关键性的。我们证明温度是控制网络树状拓扑的根本参数,未能推断温度会削弱模型表达能力。我们正式提出一个具有未知可学习温度参数的贝叶斯双曲连续潜空间模型。然后开发了两种推断程序:用于严谨后验特征化的哈密顿蒙特卡罗方法和用于大规模网络的可扩展自编码变分贝叶斯算法。通过模拟和实际数据示例,我们证明在大多数情况下,本文模型在图重建任务中优于具有固定温度和错误指定欧几里得几何的模型,确认温度是复杂网络的关键且可推断的特征。

英文摘要

Many real-world networks exhibit hierarchical, tree-like structure and heavy-tailed degree distributions, phenomena not readily captured by standard statistical models for network data. Extensions of the popular continuous latent space modeling framework have been proposed to accommodate such networks. Drawing on insights from statistical physics, continuous latent space models with underlying hyperbolic geometry have been proposed as a natural framework, probabilistically embedding nodes in a latent Riemannian manifold with constant negative curvature. Most statistical implementations, however, simplify the original physics-based model by omitting the "temperature parameter," which controls the sharpness of the latent distance-to-probability mapping. We argue this omission is critical. We demonstrate that temperature is the fundamental parameter governing a network's tree-like topology, and that failing to infer it weakens model expressiveness. We formalize a Bayesian hyperbolic continuous latent space model with an unknown, learnable temperature parameter. We then develop two inferential procedures: a Hamiltonian Monte Carlo approach for rigorous posterior characterization and a scalable auto-encoding variational Bayes algorithm for large-scale networks. Through simulation and real data examples, we show that our model outperforms models with fixed temperature and misspecified Euclidean geometries in graph reconstruction tasks in most settings, confirming temperature is a crucial and inferable feature of complex networks.

2603.27843 2026-06-11 math.ST stat.ME 版本更新

Empirical Bayes Estimation and Inference via Smooth Nonparametric Maximum Likelihood

经验贝叶斯估计与推断:基于光滑非参数最大似然法

Taehyun Kim, Bodhisattva Sen

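AI summary: To address the discreteness of the nonparametric maximum likelihood estimator and its slow logarithmic deconvolution rate, introduces a Gaussian smoothing layer. The resulting smooth NPMLE achieves a polynomial deconvolution rate, nearly parametric denoising performance, and consistent posterior estimation, and is used to construct optimal marginal coverage sets.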

详情
AI中文摘要

基于非参数最大似然估计(NPMLE)的经验贝叶斯 $g$-建模方法一直是正态均值问题中大规模估计和推断的核心。然而,不确定性量化的理论保证仍然很少。一个关键障碍是NPMLE必然是离散的,这导致离散的后验可信集和缓慢的对数解卷积率。我们通过引入一个分层高斯平滑层来解决这两个限制,该平滑层将混合分布限制为高斯位置混合。我们的光滑NPMLE继承了经典NPMLE的优良性质:它可以通过凸优化计算,并实现近乎参数的降噪性能。此外,它实现了多项式解卷积率,在相应类别上是渐近极小极大的。我们的过程还导致估计的光滑后验以多项式率收敛到真实后验。进一步,我们刻画了在期望长度上最优的边际覆盖集,构造了这些集的插件估计量,并在覆盖概率和期望长度方面为估计集建立了理论保证。我们还将理论扩展到模型误设和异方差高斯观测的设置,并研究了所提分层模型的可识别性。

英文摘要

The empirical Bayes $g$-modeling approach based on the nonparametric maximum likelihood estimator (NPMLE) has been central to large-scale estimation and inference in the normal means problem. However, theoretical guarantees for uncertainty quantification remain scarce. A key obstacle is that the NPMLE is necessarily discrete, which yields discrete posterior credible sets and a slow logarithmic deconvolution rate. We address both limitations by introducing a hierarchical Gaussian smoothing layer that restricts the mixing distribution to a Gaussian location mixture. Our smooth NPMLE inherits the favorable properties of the classical NPMLE: it is computable via convex optimization and achieves nearly parametric denoising performance. Moreover, it achieves a polynomial deconvolution rate that is asymptotically minimax over the corresponding class. Our procedure also leads to estimated smooth posteriors that converge to the true posteriors at a polynomial rate. Further, we characterize marginal coverage sets that are optimal in expected length, construct plug-in estimators of these sets, and establish theoretical guarantees for the estimated sets in terms of both coverage probability and expected length. We also extend the theory to settings with model misspecification and heteroscedastic Gaussian observations, and study identifiability of the proposed hierarchical model.

2512.23581 2026-06-11 stat.ME 版本更新

Profile Bayesian Optimization for Expensive Computer Experiments

面向昂贵计算机实验的轮廓贝叶斯优化

Courtney Kyger, James Fernandez, John A. Grunenwald, James Braun, Annie Booth

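AI summary: Proposes a new Bayesian optimization method that uses a two-stage acquisition scheme with deep and shallow Gaussian process surrogates to identify profile optima efficiently across the range of the control parameter, with an application to diffuser design for a rotating detonation engine.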

详情
AI中文摘要

我们提出了一种新颖的贝叶斯优化(BO)程序,旨在识别具有单个控制参数和多个干扰参数的确定性黑箱计算机模拟的“轮廓最优”。轮廓最优捕捉作为控制参数函数的最优响应值。我们的目标是在控制参数的整个合理范围内识别这些最优值。经典BO针对所有参数寻找单个最优值,不会探索整个控制参数范围。相反,我们开发了一种新颖的两阶段采集方案,以平衡控制参数上的探索和轮廓最优的利用,利用深度和浅层高斯过程代理来促进不确定性量化。我们的动机来自旋转爆震燃烧发动机中扩散器的计算机模拟,该模拟返回作为各种设计参数函数的通过扩散损失的能量。我们旨在识别作为扩散器长度函数的最低可能能量损失;理解这种关系将能够做出明智的设计选择。我们的“轮廓贝叶斯优化”程序在各种基准测试中优于传统BO和轮廓优化方法,并在我们的激励应用中证明比最先进的多目标优化更有效。

英文摘要

We propose a novel Bayesian optimization (BO) procedure aimed at identifying the "profile optima" of a deterministic black-box computer simulation that has a single control parameter and multiple nuisance parameters. The profile optima capture the optimal response values as a function of the control parameter. Our objective is to identify these optima across the entire plausible range of the control parameter. Classic BO, which targets a single optimum over all parameters, does not explore the entire control parameter range. Instead, we develop a novel two-stage acquisition scheme to balance exploration across the control parameter and exploitation of the profile optima, leveraging deep and shallow Gaussian process surrogates to facilitate uncertainty quantification. We are motivated by a computer simulation of a diffuser in a rotating detonation combustion engine, which returns the energy lost through diffusion as a function of various design parameters. We aim to identify the lowest possible energy loss as a function of the diffuser's length; understanding this relationship will enable well-informed design choices. Our "profile Bayesian optimization" procedure outperforms traditional BO and profile optimization methods on a variety of benchmarks and proves effective in our motivating application against state-of-the-art multi-objective optimization.
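
The notion of "profile optima" in the abstract above can be illustrated with a brute-force toy example: for each value of the control parameter, minimise the response over the nuisance parameter. The sketch below uses a hypothetical closed-form `simulator` and plain grid search purely to show the definition; it is not the paper's two-stage acquisition scheme and uses no Gaussian process surrogate.

```python
def simulator(control, nuisance):
    """Hypothetical cheap stand-in for an expensive black-box simulation."""
    return (nuisance - control) ** 2 + 0.5 * control


def profile_optimum(control, nuisance_grid):
    """Best (lowest) response at a fixed control value, over the nuisance grid."""
    return min(simulator(control, z) for z in nuisance_grid)


nuisance_grid = [i / 100 for i in range(101)]  # nuisance values in [0, 1]
control_grid = [i / 10 for i in range(11)]     # control values in [0, 1]

# Profile optima: one optimal response per control value.
profile = [profile_optimum(c, nuisance_grid) for c in control_grid]

# Classic optimisation would report only the single best value overall.
global_optimum = min(profile)

for c, value in zip(control_grid, profile):
    print(f"control={c:.1f}  profile optimum={value:.3f}")
print("global optimum:", global_optimum)
```

For this toy function the profile optimum at control value c is 0.5 * c, so the whole curve carries information that the single global optimum (at c = 0) does not.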

2505.22587 2026-06-11 stat.ME 版本更新

Bayesian Non-Parametric Inference for Lévy Measures in State-Space Models

状态空间模型中Lévy测度的贝叶斯非参数推断

Bill Z. Lin, Simon Godsill

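AI summary: Proposes a Bayesian nonparametric framework that uses the Independent Gamma-scaled Dirichlet Process (IGSDP) to infer the Lévy measures of subordinators and normal variance-mean processes within linear state-space models, with an identifiable parameterization and efficient MCMC algorithms.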

详情
AI中文摘要

Lévy过程以其能够建模具有偏斜、重尾和不连续性的复杂动态而闻名,在各个领域的随机建模中发挥着关键作用。然而,大多数Lévy过程的推断,无论是在参数还是非参数设置中,仍然是一个重大挑战。在这项工作中,我们提出了一个新颖的贝叶斯非参数推断框架,用于在线性状态空间模型内推断子序和正态方差均值(NVM)过程的Lévy测度。引入了一种灵活随机测度——独立伽马缩放狄利克雷过程(IGSDP),其中著名的伽马过程是一个特例,从而为关于两个Lévy测度的推断提供了可处理的条件分布。我们进一步表明,在伽马过程特例中,可以实现超参数推断的共轭性。提供了NVM过程参数轮廓的显式表征,使得模型具有可识别的参数化,从而在后验推断中实现有效的马尔可夫链蒙特卡洛算法。该方法在合成数据和逐笔(高频)金融数据集上进行了演示。

英文摘要

Lévy processes, known for their ability to model complex dynamics with skewness, heavy tails, and discontinuities, play a critical role in stochastic modeling across various domains. However, inference for most Lévy processes, whether in parametric or non-parametric settings, remains a significant challenge. In this work, we present a novel Bayesian non-parametric inference framework for inferring the Lévy measures of subordinators and normal variance-mean (NVM) processes within a linear state space model. A flexible random measure, the Independent Gamma-scaled Dirichlet Process (IGSDP), is introduced, for which the well-known Gamma process is a special case, leading to tractable conditional distributions for inference about both Lévy measures. We further show that in the Gamma process special case, conjugacy can be achieved for hyper-parameter inference. An explicit characterization of the parameter contour for NVM processes is provided, enabling an identifiable parameterization of the model for effective Markov Chain Monte Carlo algorithms in posterior inference. The method is demonstrated on both synthetic and tick-level (high-frequency) financial datasets.

2505.02653 2026-06-11 math.ST math.PR stat.ME 版本更新

Hierarchical Random Measures without Tables

无表格的层次随机测度

Marta Catalano, Claudio Del Sole

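AI summary: Identifies a new prior for the concentration parameter of the hierarchical Dirichlet process that removes the latent table variables, yields a quasi-conjugate posterior and efficient sampling algorithms, and extends to a framework of normalized hierarchical random measures.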

详情
AI中文摘要

层次狄利克雷过程是贝叶斯非参数多层次模型的基石。其生成模型可通过一组潜在变量描述,在流行的餐馆特许经营隐喻中通常称为表格。潜在表格简化了后验的表达,并允许实现吉布斯采样算法以近似抽取后验样本。然而,管理它们的分配可能变得计算昂贵,特别是随着数据集大小和层次数量的增加。在这项工作中,我们为层次狄利克雷过程的浓度参数确定了一个先验,该先验(i)诱导准共轭后验分布,并且(ii)消除了对表格的需求,导致后验更可解释的表达,同时具有可扩展和精确的算法来从中采样。值得注意的是,这种构造超越了狄利克雷过程,导致了一个定义归一化层次随机测度的新框架和一类从其后验采样的新算法。关键的分析工具是多元增量的独立性,即它们作为完全随机向量的表示。

英文摘要

The hierarchical Dirichlet process is the cornerstone of Bayesian nonparametric multilevel models. Its generative model can be described through a set of latent variables, commonly referred to as tables within the popular restaurant franchise metaphor. The latent tables simplify the expression of the posterior and allow for the implementation of Gibbs sampling algorithms to approximately draw posterior samples. However, managing their assignments can become computationally expensive, especially as the size of the dataset and the number of levels increase. In this work, we identify a prior for the concentration parameter of the hierarchical Dirichlet process that (i) induces a quasi-conjugate posterior distribution, and (ii) removes the need for tables, leading to more interpretable expressions for the posterior, with both a scalable and an exact algorithm to sample from it. Remarkably, this construction extends beyond the Dirichlet process, leading to a new framework for defining normalized hierarchical random measures and a new class of algorithms to sample from their posteriors. The key analytical tool is the independence of multivariate increments, that is, their representation as completely random vectors.

3. Causal Inference and Experimental Design (6 papers)

2606.11715 2026-06-11 stat.ME 新提交

Bracketing Relationships of Weighted Average Treatment Effects

加权平均处理效应的括号关系

Pengfei Tian, Fan Yang, Peng Ding

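AI summary: Under the canonical setting of observational studies for causal inference, shows that when the propensity score and the conditional average treatment effect are monotonically related, the average treatment effect under the overlap weight lies between the average treatment effects on the treated and on the controls. The result is extended to weighted local average treatment effects and to other weights, and a CP-plot is recommended.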

详情
AI中文摘要

在因果推断的观测研究规范设定下,我们证明了在倾向得分与条件平均处理效应之间存在单调关系时,重叠权重(权重与给定协变量下处理的条件方差成比例)下的平均处理效应介于处理组和对照组的平均处理效应之间。我们进一步将结果推广到具有二元工具变量和二元处理的规范设定下的加权局部平均处理效应。我们还将结果推广到其他权重。基于该理论,我们建议绘制估计的条件平均处理效应与估计的倾向得分的“CP图”。

英文摘要

Under the canonical setting of observational studies for causal inference, we show that the average treatment effect under the overlap weight, the weight that is proportional to the conditional variance of the treatment given the covariates, is bounded between the average treatment effects on the treated and control, under a monotonic relationship between the propensity score and the conditional average treatment effect. We further extend the result to weighted local average treatment effects, under the canonical setting with a binary instrumental variable and a binary treatment. We also extend the results to other weights. Based on the theory, we recommend the "CP-plot" of the estimated conditional average treatment effect against the estimated propensity score.
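
The bracketing claim can be checked numerically on a toy population. The sketch below assumes a population spread evenly over a grid of propensity scores and a hypothetical conditional average treatment effect (CATE) that increases linearly in the propensity score. It uses the standard weighted-average definitions, with weights e(x) for the effect on the treated, 1 - e(x) for the effect on the controls, and e(x)(1 - e(x)) for the overlap weight. The numbers are illustrative and are not taken from the paper.

```python
def weighted_effect(weights, effects):
    """Weighted average treatment effect: sum(w * tau) / sum(w)."""
    return sum(w * t for w, t in zip(weights, effects)) / sum(weights)


# Equal population mass on a grid of propensity scores in [0.1, 0.9].
propensity = [0.1 + 0.8 * i / 100 for i in range(101)]

# Hypothetical CATE that is monotonically increasing in the propensity score.
cate = [1.0 + 2.0 * e for e in propensity]

att = weighted_effect(propensity, cate)                          # treated
atc = weighted_effect([1.0 - e for e in propensity], cate)       # controls
ato = weighted_effect([e * (1.0 - e) for e in propensity], cate) # overlap

print(f"ATC={atc:.4f}  ATO={ato:.4f}  ATT={att:.4f}")
```

With an increasing CATE the ordering is ATC <= ATO <= ATT; reversing the sign of the slope reverses the ordering, and the overlap-weighted effect stays in between.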

2606.11405 2026-06-11 stat.ME stat.AP 新提交

Bayesian Causal Machine Learning for Cure Models

治愈模型的贝叶斯因果机器学习

Antonio R. Linero, F. Javier Rubio, Piyali Basak

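AI summary: To separate a treatment's effect on the probability of cure from its effect on survival time among uncured patients, proposes BartCure, a Bayesian causal machine learning method that decomposes the causal effect on restricted mean survival time, and applies it to the CALGB 40101 breast cancer trial.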

详情
AI中文摘要

在生存研究中,治疗可以通过不同机制使患者受益:治疗可能增加治愈的概率,或延迟未治愈患者的失败时间。量化哪种机制占主导地位,以及它是否在不同亚群中变化,具有临床重要性,但因果机器学习文献中针对此问题的研究有限。标准的因果生存学习器针对有限时间生存或受限平均生存时间,而许多治愈模型在未估计因果效应的情况下捕捉治愈结构。在这项工作中,我们在存在治愈亚群的情况下定义了有意义的因果效应,并引入了BartCure,一种用于估计这些效应的贝叶斯因果机器学习方法。我们推荐的因果效应将受限平均生存时间的因果效应分解为随机治愈和随机潜伏期成分,并将这些新效应与随机干预效应和主层中的因果效应联系起来。在模拟中,BartCure在估计平均效应方面具有竞争力,并且在保守地检测治疗效应异质性的方向方面特别有效。我们将BartCure应用于CALGB 40101乳腺癌试验,以估计平均和亚组因果效应,并识别治疗效应异质性。

英文摘要

In survival studies, treatments can benefit patients through different mechanisms: a treatment may increase the probability of being cured or delay failure among patients who are not cured. Quantifying which mechanism is dominant, and whether it varies across subpopulations, is clinically important, yet there is limited work in the causal machine learning literature addressing this problem. Standard causal survival learners target finite-horizon survival or restricted mean survival time, while many cure models capture cure structures without estimating causal effects. In this work, we define meaningful causal effects in the presence of a cured subpopulation and introduce BartCure, a Bayesian causal machine learning approach for estimating them. The causal effects we recommend decompose the causal effect on restricted mean survival time into a stochastic cure and stochastic latency component, and we relate these new effects to both stochastic intervention effects and causal effects in principal strata. In simulations, BartCure is competitive for estimating average effects and is especially effective at conservatively detecting the direction of treatment-effect heterogeneity. We apply BartCure to estimate average and subgroup causal effects and to identify treatment effect heterogeneity in the CALGB 40101 breast cancer trial.

2411.10959 2026-06-11 econ.EM cs.LG math.ST stat.AP stat.ME stat.ML 版本更新

Program Evaluation with Remotely Sensed Outcomes

利用遥感结果的程序评估

Ashesh Rambachan, Rahul Singh, Davide Viviano

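AI summary: Studies causal inference in experiments and quasi-experiments where the economic outcome is imperfectly measured by a remotely sensed variable, and proposes a formula that nonparametrically identifies the causal parameter by combining experimental and observational data, together with a method for n^{-1/2} inference.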

详情
AI中文摘要

我们研究了在实验和准实验中,经济结果由遥感变量不完全测量的因果推断问题。遥感变量是低成本、可扩展且在观测数据中预测经济结果的变量,例如卫星图像和移动电话活动。我们将遥感变量视为后结果:经济结果的变化导致遥感变量的变化。例如,环境质量的变化导致卫星图像的变化,而不是相反。在这一假设下,我们提出了一种结合实验和观测数据的公式,以非参数方式识别因果参数。我们开发了一种n^{-1/2}推断方法,该方法对规格不正确具有鲁棒性,并且不限制用于处理遥感变量的算法。

英文摘要

We study causal inference in experiments and quasi-experiments, where the economic outcome is imperfectly measured by a remotely sensed variable. The remotely sensed variable is low-cost, scalable, and predictive of the economic outcome in observational data; examples include satellite imagery and mobile phone activity. We model the remotely sensed variable as post-outcome: variation in the economic outcome causes variation in the remotely sensed variable. For example, changes in environmental quality cause changes in satellite imagery, not vice versa. Under this assumption, we propose a formula to nonparametrically identify the causal parameter by combining experimental and observational data. We develop a method for $n^{-1/2}$ inference that is robust to misspecification and that does not restrict the algorithms used to process remotely sensed variables.

2403.16673 2026-06-11 stat.ME econ.EM 版本更新

Quasi-randomization tests for network interference: a random graph approach

网络干扰的准随机化检验:一种随机图方法

Supriya Tiwari, Pallavi Basu

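AI summary: Proposes treating the network as a random variable and building the null distribution of no spillover effects from random graph null models. This avoids the computational intractability of existing conditional randomization tests, is exactly valid in finite samples, and substantially improves power.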

详情
AI中文摘要

当一个单元的处理状态影响其他单元的潜在结果时,就会产生网络干扰,导致难以检验的溢出效应。我们提出将网络视为随机变量而非固定量来应对这一挑战。这克服了原假设下潜在结果不可插补的关键难题,并避免了现有条件随机化检验的计算复杂性。我们的准随机化检验利用随机图零模型构建无溢出效应的零分布,在网络生成过程的温和假设下,在有限样本中精确有效,并且比现有方法(尤其是在整群随机试验中)提供显著更高的检验功效。我们通过模拟验证了该方法,并通过在中国农村的一项天气保险采纳实验中检验干扰效应进行了说明。

英文摘要

Network interference occurs when the treatment status of one unit affects the potential outcomes of other units, giving rise to spillover effects that are difficult to test for. We propose treating the network as a random variable rather than a fixed quantity to address this challenge. This overcomes a key challenge of non-imputability of potential outcomes under the null and avoids the computational intractability of existing conditional randomization tests. Our quasi-randomization test builds the null distribution of no spillover effects using random graph null models, is exactly valid in finite samples under mild assumptions on the network-generating process, and offers substantially improved power over existing methods, particularly in cluster-randomized trials. We validate our approach via simulation and illustrate it by testing for interference in a weather insurance adoption experiment in rural China.

2508.05859 2026-06-11 stat.ME 版本更新

Doubly robust integration of nonprobability and probability survey data

非概率与概率调查数据的双重稳健整合

Shaun R Seaman, Tommy Nyberg, Anne M Presanis

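AI summary: Reviews doubly robust estimators that integrate a nonprobability sample with probability survey data, extends them to domain (subpopulation) estimation, combines them with estimators that use only the probability survey data, and provides variance formulae and an analysis of asymptotic relative efficiency.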

详情
Comments
66 pages, 31 figures. The preprint v2 extends the paper with: domain estimation; a new Hajek-style version of the Kim--Haziza doubly robust estimator; and, theory on the asymptotic relative efficiency of the combined estimators and a simulation study to assess the relative efficiency
AI中文摘要

针对整合非概率样本的结果和协变量数据与概率调查的协变量数据,已提出用于估计总体均值(或患病率)的双重稳健估计器。这些估计器结合了逆概率加权估计与大规模插补。然而,如何将这些双重稳健估计器与仅使用概率调查结果数据的Horvitz-Thompson或Hajek估计器相结合的问题,仅受到有限关注。本文首先回顾了先前提出的仅使用非概率样本结果数据的双重稳健估计器。我们将这些估计器扩展至能够估计域(子总体)均值(或患病率),可能利用域外个体的数据以改进小域估计。然后,我们考虑如何将此双重稳健估计器与仅使用概率调查数据的Horvitz-Thompson或Hajek估计器相结合。我们描述了有效的组合估计器,并给出了其重复抽样方差以及这些方差估计量的公式。我们还研究了组合估计器相对于其两个分量估计器的渐近相对效率,并进行了模拟研究以评估它们在有限样本中的相对效率。这些相对效率取决于两个分量估计器的方差之比以及协变量对结果的预测能力。

英文摘要

Doubly robust estimators for estimating the population mean (or prevalence) of an outcome have been proposed for integrating outcome and covariate data from a nonprobability sample with covariate data from a probability survey. These estimators combine inverse probability weighting estimation with mass imputation. However, the question of how to combine these doubly robust estimators with a Horvitz-Thompson or Hajek estimator that uses only outcome data from the probability survey has received only limited attention. In this paper, we first review previously proposed doubly robust estimators that use outcome data from only the nonprobability sample. We extend these estimators to enable estimation of domain (subpopulation) means (or prevalences), possibly using data from individuals outside the domain to improve estimation when the domain is small. We then consider how to combine this doubly robust estimator with a Horvitz-Thompson or Hajek estimator that uses only the probability survey data. We describe efficient combined estimators, and provide formulae for their repeated-sampling variances and for estimators of these variances. We also investigate the asymptotic relative efficiencies of the combined estimators compared to their two component estimators, and carry out a simulation study to assess their relative efficiencies in finite samples. These relative efficiencies depend on the ratio of the variances of the two component estimators and on how predictive the covariates are of the outcome.
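
The combination of inverse probability weighting with mass imputation can be sketched as follows. This uses one commonly used Hajek-normalised doubly robust form: a weighted mean of residuals over the nonprobability sample plus a design-weighted mean of model predictions over the probability sample. Whether this matches the exact estimators studied in the paper is an assumption, and the participation probabilities, predictions, and design weights below are made-up inputs that would normally be estimated.

```python
def doubly_robust_mean(nonprob_sample, prob_sample):
    """Doubly robust estimate of a population mean.

    nonprob_sample: list of (y, m, pi) with outcome y, outcome-model
        prediction m, and estimated participation probability pi.
    prob_sample: list of (m, d) with outcome-model prediction m and
        survey design weight d (outcomes are not needed here).
    """
    n_hat_a = sum(1.0 / pi for _, _, pi in nonprob_sample)
    n_hat_b = sum(d for _, d in prob_sample)
    # IPW-weighted residuals correct the outcome model where it is wrong.
    residual_term = sum((y - m) / pi for y, m, pi in nonprob_sample) / n_hat_a
    # Mass imputation: design-weighted mean of predictions.
    imputation_term = sum(d * m for m, d in prob_sample) / n_hat_b
    return residual_term + imputation_term


# Hypothetical toy inputs.
nonprob = [(2.0, 1.0, 0.5), (4.0, 3.0, 0.25)]
prob = [(1.0, 10.0), (3.0, 30.0)]
print(doubly_robust_mean(nonprob, prob))  # 3.5
```

When the outcome model is exactly right the residual term vanishes and only the mass-imputed mean remains; when it is off, the weighted residuals shift the estimate, which is the source of the double robustness.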

2310.14983 2026-06-11 econ.EM math.ST stat.ME 版本更新

Causal clustering: design of cluster experiments under network interference

因果聚类:网络干扰下的聚类实验设计

Davide Viviano, Lihua Lei, Guido Imbens, Brian Karrer, Okke Schrijvers, Liang Shi

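AI summary: Studies the design of cluster experiments for estimating the global treatment effect under network interference, proposes choosing clusters by solving a penalized min-cut optimization problem that minimizes the worst-case mean-squared error, and gives simple conditions for choosing between cluster designs.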

详情
AI中文摘要

本文研究在存在网络溢出效应的情况下,用于估计全局处理效应的聚类实验设计。我们提供了一个框架来选择聚类,以最小化估计全局效应的最坏情况均方误差。我们证明最优聚类解决了一个新颖的惩罚最小割优化问题,可通过现成的半定规划算法计算。我们的分析还刻画了在任何两个聚类设计之间进行选择的简单条件,包括在聚类或个体水平随机化之间进行选择。我们使用来自Facebook用户宇宙的独特网络数据和现有现场实验数据来说明该方法的性质。

英文摘要

This paper studies the design of cluster experiments to estimate the global treatment effect in the presence of network spillovers. We provide a framework to choose the clustering that minimizes the worst-case mean-squared error of the estimated global effect. We show that optimal clustering solves a novel penalized min-cut optimization problem computed via off-the-shelf semi-definite programming algorithms. Our analysis also characterizes simple conditions to choose between any two cluster designs, including choosing between a cluster or individual-level randomization. We illustrate the method's properties using unique network data from the universe of Facebook's users and existing data from a field experiment.

4. 高维统计与正则化 3 篇

2606.11887 2026-06-11 stat.ME 新提交

Model-based sparse mixed-type PCA

基于模型的稀疏混合类型PCA

Lauri Heinonen, Joni Virta

AI总结 针对混合类型数据,提出一种基于矩估计的潜在协方差矩阵估计方法,实现稀疏主成分分析,并通过模拟和实际数据验证性能。

AI中文摘要

本文提出了一种新的主成分分析方法,用于处理由连续、二元、整数和正连续变量组成的混合类型数据。假设数据来自一个概率模型,其中指数族分布的参数由一组共享的高斯潜在变量决定。所提出的方法MTPCA基于通过矩估计来估计这些潜在混合物的协方差矩阵。提出了一种稀疏化成分载荷的方法,并与经典稀疏PCA理论一致。我们提出了一种估计主成分得分的策略,并讨论了潜在维度的选择。通过模拟混合类型数据研究了该方法的性能,并在由二元动物特征组成的Zoo数据集上展示了该模型。

英文摘要

This work presents a new method for principal component analysis (PCA) of mixed-type data consisting of continuous, binary, integer-valued and positive continuous variables. The data are assumed to come from a probability model, where the parameters of the exponential family distributions are determined by a set of shared Gaussian latent variables. The proposed method, MTPCA, is based on estimating the covariance matrix of these latent mixtures through the method of moments. A way to sparsify the component loadings is presented and aligns with the classical theory of sparse PCA. We propose a strategy for estimating the principal component scores and discuss the choice of the latent dimension. The method's performance is studied with simulated mixed-type data and we illustrate the model on the Zoo data set consisting of binary animal characteristics.

2606.11738 2026-06-11 stat.ML cs.LG 新提交

Renewable Lasso without Batch-Number Constraints: A Gradient-Enhanced Approach

无批次数量约束的可再生Lasso:一种梯度增强方法

Junzhuo Gao, Ling Peng, Xu Guo, Heng Lian

AI总结 针对高维广义线性模型的流数据在线估计,提出梯度增强替代损失函数,消除批次数量约束,并扩展到分布式流数据场景,理论推导非渐近误差界,实验验证精度提升。

AI中文摘要

我们研究具有流数据的高维广义线性模型的在线估计。首先,针对非分布式设置,我们提出一种梯度增强替代损失函数,仅使用历史摘要近似累积损失,修改并改进了现有高维设置下同一模型的可再生估计方法,并消除了先前研究中的批次数量约束。然后,我们将该方法扩展到主从架构下的分布式流数据,其中批次按站点划分,仅交换摘要(梯度向量)。我们没有直接将Jordan等人(2019)的流行方法应用于替代二次损失,而是采用一种调整后的方法,不要求客户端计算完整的替代损失。我们在高维尺度下推导了非渐近误差界,没有先前研究中严格的批次数量约束。在线性和逻辑模型下的模拟结果以及实际数据应用表明,与现有的可再生估计器相比,精度有所提高。

英文摘要

We study online estimation for high-dimensional generalized linear models with streaming data. First, for the non-distributed setting, we propose a gradient-enhanced surrogate loss that approximates the cumulative loss using only historical summaries, which modifies and improves upon the existing renewable estimation approach for the same model in the high-dimensional setting, and removes the batch-number constraint in previous studies. We then extend the method to distributed streaming data under the master-client architecture, where batches are partitioned across sites and only summaries (gradient vectors) are exchanged. Instead of directly applying the popular method of Jordan et al. (2019) to the surrogate quadratic loss, our adjusted approach does not require the clients to compute the full surrogate loss. We derive non-asymptotic error bounds under the high-dimensional scaling, without the stringent constraint on the number of batches in the previous studies. Simulation results under linear and logistic models, together with a real-data application, show improved accuracy over existing renewable estimators.

2504.15510 2026-06-11 stat.ME 版本更新

Ridge-Regularized Largest Root Test For High-Dimensional General Linear Hypotheses

高维一般线性假设的岭正则化最大根检验

Haoran Li

AI总结 针对高维多元线性模型中一般线性假设检验问题,提出岭正则化Roy最大根检验,通过岭项稳定协方差估计,建立正则化F矩阵最大特征值的渐近Tracy-Widom分布,并开发高效参数估计方法。

AI中文摘要

多元分析中的一个基本问题是检验多元线性模型中回归系数的一般线性假设。该框架涵盖了广泛的研究任务,包括MANOVA、预测变量的联合显著性检验以及趋势或季节效应的检测。在经典方法中,Roy的最大根检验特别适用于检测集中信号,它依赖于由残差协方差矩阵构造的F矩阵的最大特征值。然而,在高维设置中,这些矩阵通常变得病态或奇异,使得检验不可行。为了解决这个问题,我们提出了一种岭正则化的Roy检验,通过岭项稳定协方差估计。我们在高维框架下建立了正则化F矩阵最大特征值的渐近Tracy-Widom分布,其中维度和假设都与样本量相当,且仅假设有限矩条件。我们开发了一种计算高效的程序来估计相关的中心化和尺度参数。我们进一步分析了检验在一类低秩备择假设下的功效,并考察了正则化参数的影响。该方法在模拟中表现出色,并应用于人类连接组项目的数据,以评估脑体积测量与行为变量之间的关联。

英文摘要

A fundamental problem in multivariate analysis is testing general linear hypotheses for regression coefficients in a multivariate linear model. This framework encompasses a wide range of well-studied tasks, including MANOVA, joint significance testing of predictors, and detection of trends or seasonal effects. Among classical approaches, Roy's largest root test is particularly effective for detecting concentrated signals, relying on the largest eigenvalue of an F matrix constructed from residual covariance matrices. However, in high-dimensional settings, these matrices often become ill-conditioned or singular, rendering the test infeasible. To address this, we propose a ridge-regularized Roy's test that stabilizes the covariance estimation via a ridge term. We establish the asymptotic Tracy-Widom distribution of the largest eigenvalue of the regularized F-matrix under a high-dimensional regime, where both the dimension and hypotheses are comparable to the sample size, assuming only finite-moment conditions. A computationally efficient procedure is developed to estimate the associated centering and scaling parameters. We further analyze the power of the test under a class of low-rank alternatives and examine the influence of the regularization parameter. The method demonstrates strong performance in simulations and is applied to data from the Human Connectome Project to assess associations between volumetric brain measurements and behavioral variables.

5. 时间序列与空间统计 8 篇

2606.12097 2026-06-11 stat.AP physics.data-an 新提交

Weibull-Stationary Stochastic Differential Equations for Conditional Long-Horizon Wind Power Forecasting

条件长期风电预测的威布尔平稳随机微分方程

Luca Di Persio, Mehrdad Ghadiri

AI总结 提出一种基于威布尔平稳SDE的月度风电概率预测框架,通过异方差卡尔曼滤波和三种SDE模型实现高分辨率预测,CRPS约1.57 m/s,功率Wasserstein距离低于额定容量1.4%。

AI中文摘要

我们提出了一个以十分钟分辨率进行一个月前风电预测的条件概率框架。从序列相关的SCADA风速数据中估计月度威布尔形状和尺度参数,通过Godambe协方差修正,并使用异方差卡尔曼滤波在双变量VAR(1)状态空间模型上进行预测。以MMSE预测的威布尔不变律为条件,我们构建并比较了三种正风速SDE模型:Ornstein-Uhlenbeck-Weibull变换、Fokker-Planck漂移优先扩散和Fokker-Planck扩散优先模型。模拟的风速集合通过校准的XGBoost功率曲线映射到功率。应用于Kelmarsh风电场Senvion MM92涡轮机2021年1月的数据,三种SDE公式在概率精度上统计上不可区分,平均CRPS值在1.569至1.575 m/s之间。因此,扩散优先模型在计算上更优,运行时间相对于OU-Weibull模型减少了约七倍。在功率域中,模拟与观测分布之间的Wasserstein距离为26.1-27.6 kW,低于额定容量的1.4%,而所检查月份的月能量产出偏差约为-7.3%。在0-1500 kW范围内,超越概率误差保持在1.6个百分点以下,在额定功率附近约为2.2个百分点。这些量为下游运行问题提供了决策相关的概率输入,而非完成的备用、储能、市场或疲劳优化决策。完全边缘化卡尔曼预测律下的威布尔参数是一个自然的扩展。

英文摘要

We present a one-month-ahead conditional probabilistic framework for wind-power forecasting at ten-minute resolution. Monthly Weibull shape and scale parameters are estimated from serially dependent SCADA wind-speed data, corrected through a Godambe covariance, and forecast by a heteroskedastic Kalman filter on a bivariate VAR(1) state-space model. Conditional on the MMSE forecasted Weibull invariant law, we construct and compare three positive wind-speed SDE models: an Ornstein-Uhlenbeck-Weibull transform, a Fokker-Planck drift-first diffusion, and a Fokker-Planck diffusion-first model. The simulated wind-speed ensembles are mapped to power through a calibrated XGBoost power curve. Applied to January 2021 data from a Senvion MM92 turbine at Kelmarsh Wind Farm, the three SDE formulations are statistically indistinguishable in probabilistic accuracy, with mean CRPS values between 1.569 and 1.575 m/s. The diffusion-first model is therefore preferred on computational grounds, reducing runtime by about a factor of seven relative to the OU-Weibull model. In the power domain, the Wasserstein distance between simulated and observed distributions is 26.1-27.6 kW, below $1.4\%$ of rated capacity, while the monthly energy-yield bias is about $-7.3\%$ for the examined month. Exceedance-probability errors remain below 1.6 percentage points over the 0-1500 kW range and about 2.2 percentage points near rated power. These quantities provide decision-relevant probabilistic inputs for downstream operational problems, rather than completed reserve, storage, market, or fatigue-optimization decisions. Full marginalisation over the Kalman predictive law of the Weibull parameters is left as a natural extension.
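The abstract above reports accuracy as a mean CRPS over simulated wind-speed ensembles. As a minimal sketch of how such a score can be computed for one observation, the block below uses the standard kernel form of the empirical CRPS, E|X - y| - 0.5 E|X - X'|. The function name `ensemble_crps` and the sample values are illustrative assumptions and are not taken from the paper.

```python
def ensemble_crps(ensemble, observation):
    """Empirical CRPS of an ensemble forecast against one observation.

    Kernel (energy) form: E|X - y| - 0.5 * E|X - X'|, where both
    expectations are averages over the ensemble members.
    """
    m = len(ensemble)
    if m == 0:
        raise ValueError("ensemble must not be empty")
    # Average distance between members and the observation.
    term_obs = sum(abs(x - observation) for x in ensemble) / m
    # Average distance between all ordered pairs of members.
    term_spread = sum(abs(a - b) for a in ensemble for b in ensemble) / (m * m)
    return term_obs - 0.5 * term_spread


# Hypothetical wind-speed ensemble in m/s (illustrative values only).
members = [6.0, 7.5, 8.0, 9.5]
score = ensemble_crps(members, 8.2)
```

The score has the units of the forecast variable (here m/s), and lower is better; averaging it over all forecast times would give a mean CRPS of the kind quoted in the abstract.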

2606.11962 2026-06-11 stat.ME q-fin.ST stat.CO 新提交

Composite likelihood inference of fractional Gaussian processes with sequentially optimal subset selection

具有顺序最优子集选择的分数高斯过程的复合似然推断

Mathis Fourreau, Matthieu Garcin

AI总结 针对分数高斯过程,提出通过顺序最大化Godambe信息来选择子集,以平衡估计精度与计算成本,并推导了Fisher信息和Godambe信息的理论表达式。

AI中文摘要

复合似然方法通过考虑观测的几个子集而非全部来降低时间序列参数估计的计算成本。该方法的渐近性质与Godambe信息相关,Godambe信息是Fisher信息的扩展,考虑了观测子集之间的依赖性。我们旨在将该方法应用于线性高斯模型,特别是分数布朗运动和分数高斯噪声。我们推导了其Fisher信息和Godambe信息的理论表达式,并推导出一种顺序最大化Godambe信息的子集选择设计。子集的大小使我们能够控制估计精度与计算成本之间的权衡。通过模拟,我们将该方法与矩方法和最大似然估计进行比较,并将其应用于真实数据,即股票指数的波动率序列和风速时间序列。

英文摘要

The composite likelihood method reduces the computational cost of parameter estimation in time series by considering several subsets of observations instead of all observations at once. The asymptotic properties of this method are related to the Godambe information, an extension of the Fisher information that accounts for the dependence between subsets of observations. We aim to apply this method to linear Gaussian models, in particular fractional Brownian motion and fractional Gaussian noise. We derive theoretical expressions for their Fisher information and their Godambe information and deduce a subset selection design that sequentially maximizes the Godambe information. The size of the subsets then allows us to control the trade-off between estimation accuracy and computational cost. Through simulations, we compare this method with the method of moments and maximum likelihood estimation, and we apply it to real data, namely volatility series of a stock index and a wind speed time series.
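To make the idea of a composite likelihood on subsets concrete, here is a minimal sketch for zero-mean fractional Gaussian noise, using the well-known autocovariance $\gamma(k) = \tfrac{\sigma^2}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$ and the simplest possible subsets, namely pairs of observations at chosen lags. The pairwise design, the function names and the default arguments are assumptions for illustration; the sequentially optimal subset selection described in the abstract is not reproduced.

```python
import math


def fgn_autocovariance(lag, hurst, sigma2=1.0):
    """Autocovariance of fractional Gaussian noise at an integer lag."""
    k = abs(lag)
    h2 = 2.0 * hurst
    return 0.5 * sigma2 * (abs(k + 1) ** h2 - 2.0 * k ** h2 + abs(k - 1) ** h2)


def pairwise_composite_loglik(x, hurst, lags=(1,), sigma2=1.0):
    """Sum of bivariate Gaussian log-densities over pairs (x[t], x[t+d]).

    Each pair is treated as a zero-mean bivariate normal with variance
    gamma(0) and covariance gamma(d); dependence between pairs is ignored,
    which is what makes this a composite rather than a full likelihood.
    """
    total = 0.0
    s = fgn_autocovariance(0, hurst, sigma2)
    for d in lags:
        c = fgn_autocovariance(d, hurst, sigma2)
        det = s * s - c * c  # determinant of the 2x2 covariance matrix
        for t in range(len(x) - d):
            a, b = x[t], x[t + d]
            quad = (s * (a * a + b * b) - 2.0 * c * a * b) / det
            total += -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    return total
```

Maximising this function over `hurst` costs time linear in the series length per lag, whereas the full Gaussian likelihood requires handling the whole covariance matrix; adding more lags trades computation for accuracy.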

2606.11775 2026-06-11 math.MG q-bio.QM stat.ML 新提交

Magnitude-Based Features for Multispecies Spatial Data

基于量值的多物种空间数据特征

Julia Sollberger, Joshua Bull, Sara Kališnik, Bernadette Stolz

AI总结 提出基于量值的全局和局部特征向量,用于分析多物种空间数据中的相互作用,在合成肿瘤微环境和人类结直肠癌组织微阵列数据中验证了其识别空间异质性和分类能力。

Comments
32 pages, 24 figures
AI中文摘要

多物种空间数据出现在许多应用中,其中不同实体之间的相互作用对系统行为至关重要,包括生物医学成像、地理空间分析和物种生态学。尽管它们很重要,但捕获这种相互作用的定量工具相对较少。在这项工作中,我们提出了基于量值的特征用于分析多物种空间数据。量值是有限度量空间的一个实值不变量,可以解释为有效点数,结合了空间配置和尺度。我们开发了全局和局部量值特征向量,并在合成肿瘤微环境数据以及人类结直肠癌样本的组织微阵列数据中展示了它们的实用性。在局部,该方法识别出不同的邻域类型并揭示空间异质性;在模型中,这包括与模拟的不同定性结果相关的径向模式,而在真实世界数据中,它反映了B细胞和T细胞群体之间三级淋巴结构样相互作用的重要性。在全局上,该方法恢复了合成数据中跨参数区域的长期模拟结果的已知分类,并提示CD4+ T细胞和CD163+巨噬细胞在区分有利的克罗恩样反应与不利的弥漫性免疫浸润患者中发挥重要作用。总之,这些结果表明基于量值的特征为多物种空间数据分析提供了强大而灵活的工具。

英文摘要

Multispecies spatial data arise in many applications where interactions between different entities are central to system behaviour, including biomedical imaging, geospatial analysis, and species ecology. Despite their importance, relatively few quantitative tools exist to capture such interactions. In this work, we propose magnitude-based features for the analysis of multispecies spatial data. Magnitude is a real-valued invariant of finite metric spaces that can be interpreted as an effective number of points, incorporating both spatial configuration and scale. We develop global and local magnitude feature vectors and demonstrate their utility on synthetic tumour microenvironment data and on tissue microarray data from human colorectal cancer samples. Locally, the method identifies distinct neighbourhood types and reveals spatial heterogeneity; in the model, this includes radial patterns associated with different qualitative outcomes of the simulations, while in the real-world data it reflects the importance of tertiary lymphoid structure-like interactions between B and T cell populations. Globally, the approach recovers known classifications of long-term simulation outcomes across parameter regimes in synthetic data, and suggests important roles for CD4+ T cells and CD163+ macrophages in distinguishing patients with favourable Crohn's-like reactions from those with unfavourable diffuse immune infiltration. Together, these results suggest that magnitude-based features provide a powerful and flexible tool for the analysis of multispecies spatial data.

2606.11768 2026-06-11 stat.ME stat.AP 新提交

Hierarchical excitatory processes for modelling event-time data in the presence of exogenous stimuli

外源刺激下事件时间数据建模的分层激发过程

Francesco Sanna Passino, Nicholas A. Heard, Jeffrey W. Brown, William N. Frost, Vince P. Lyzinski

AI总结 提出分层激发过程(HEP)模型,通过动态演化核函数叠加外源刺激的激发效应,实现对重复刺激下事件时间数据的灵活建模,并嵌入聚类框架识别潜在响应模式。

AI中文摘要

我们引入了分层激发过程(HEP),一种用于在重复外部刺激下观察到的事件时间数据的灵活点过程模型。所提出的框架将点过程的条件强度建模为外部刺激引起的激发效应的叠加,其特征由参数随时间动态演化的核函数刻画。这种分层结构使得能够跨重复刺激调节激发强度,提供了一种可解释的结构。我们为所提出的模型建立了基于似然的推断,并将HEP嵌入到基于模型的聚类框架中,以识别具有相似响应动态的潜在组。模拟研究证明了该模型恢复演化潜在模式的能力,而对海蛞蝓足神经节尖峰序列记录的应用展示了HEP如何能够在不同实验条件下表征重复刺激下神经元的刺激驱动兴奋性。

英文摘要

We introduce the Hierarchical Excitatory Process (HEP), a flexible point process model for event-time data observed under repeated external stimuli. The proposed framework models the conditional intensity of a point process as a superposition of excitation effects induced by external stimuli, characterised by kernels with parameters dynamically evolving over time. This hierarchical construction enables modulation of excitation strength across repeated stimuli, providing an interpretable structure. We establish likelihood-based inference for the proposed model and embed HEP within a model-based clustering framework to identify latent groups sharing similar response dynamics. Simulation studies demonstrate the model's ability to recover evolving latent patterns, and an application to spike train recordings from the sea slug Aplysia pedal ganglion illustrates how HEPs are able to characterise stimulus-driven excitability of neurons across repeated stimulation under different experimental conditions.
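The phrase "superposition of excitation effects induced by external stimuli" can be illustrated with a small sketch. The block below assumes exponential kernels and a geometric rule for how the excitation amplitude changes from one stimulus to the next; both choices, and all parameter names, are hypothetical and serve only to show the shape of such a conditional intensity. They should not be read as the HEP specification.

```python
import math


def conditional_intensity(t, stimulus_times, baseline, alpha0, decay, beta):
    """Intensity at time t as baseline plus one kernel per past stimulus.

    Assumed (hypothetical) form:
        lambda(t) = baseline + sum_j alpha_j * exp(-beta * (t - s_j)),
    summed over stimuli s_j < t, with alpha_j = alpha0 * decay**j so that
    the excitation strength evolves across repeated stimuli.
    """
    rate = baseline
    for j, s in enumerate(stimulus_times):
        if s < t:
            alpha_j = alpha0 * decay ** j  # amplitude for the j-th stimulus
            rate += alpha_j * math.exp(-beta * (t - s))
    return rate


# Three stimuli, each weaker than the previous one (decay < 1).
stimuli = [0.0, 5.0, 10.0]
rates = [conditional_intensity(s + 0.1, stimuli, 0.5, 2.0, 0.6, 1.5) for s in stimuli]
```

With `decay` below one, the response just after each successive stimulus shrinks, which is one simple way a model could express habituation-like behaviour.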

2606.11746 2026-06-11 astro-ph.IM stat.ML 新提交

Time Series Analysis in Machine Learning

机器学习中的时间序列分析

Antonio Pagliaro, Anna Anzalone

AI总结 从机器学习视角综述时间序列分析,涵盖经典统计模型与现代机器学习方法,强调跨领域应用原则。

Comments
Invited chapter for the edited book "Machine Learning Techniques for Astrophysics and Cosmology" (Eds. Cosimo Bambi, Vinay Kashyap, Swarnim Shashank, Naoki Yoshida, Springer Singapore, expected in 2026). Submitted version
AI中文摘要

时间序列分析是机器学习的基本组成部分,尤其是在天体物理学和宇宙学中,时域数据丰富。本章从机器学习的视角对时间序列分析技术进行了教学性综述。我们涵盖了时间序列的基本概念(平稳性、自相关、季节性)、经典统计模型(自回归、移动平均、ARIMA、指数平滑、状态空间模型)以及现代机器学习方法。特别地,我们讨论了传统统计方法如何奠定基础,然后探索了用于时间序列的机器学习方法,包括基于特征的回归、基于树的集成方法、隐马尔可夫模型、高斯过程和深度学习模型(循环神经网络、卷积网络、变换器)。在整章中,我们通过来自多个领域(例如天文学、天气预报、金融)的示例进行说明,以强调共同原则。目标是使读者具备理论理解和实践背景,以便在其研究中应用机器学习技术进行时间序列分析。

英文摘要

Time series analysis is a fundamental component of machine learning, especially in astrophysics and cosmology where temporal data abound. This chapter provides a pedagogical review of time series analysis techniques from a machine learning perspective. We cover the basic concepts of time series (stationarity, autocorrelation, seasonality), classical statistical models (autoregressive, moving average, ARIMA, exponential smoothing, state-space models), and modern machine learning approaches. In particular, we discuss how traditional statistical methods lay the groundwork, and then explore machine learning methods for time series, including feature-based regression, tree-based ensemble methods, hidden Markov models, Gaussian processes, and deep learning models (recurrent neural networks, convolutional networks, transformers). Throughout, we illustrate with examples drawn from multiple domains (e.g. astronomy, weather forecasting, finance) to emphasize common principles. The goal is to equip readers with both the theoretical understanding and practical context to apply machine learning techniques for time series analysis in their research.

2605.27478 2026-06-11 stat.ML cs.LG math.PR 版本更新

Triangular-Reference Schrödinger Bridges for Time Series Generation

三角参考薛定谔桥用于时间序列生成

Gabriele Bocchi

AI总结 提出三角参考薛定谔桥框架,通过区间冻结的退化扩散参考和层次化潜在波动率结构,实现时间序列的保守生成,并保持熵最小化的变分核心。

AI中文摘要

我们引入了用于时间序列的三角参考薛定谔桥(TR-SBTS),这是SBTS框架的一种保守扩展,其中布朗参考被替换为区间冻结的、可能退化的扩散参考,在潜在波动率水平的层次上呈三角形。该构造是在增广状态空间上的单一熵投影,变分约束在时间和潜在水平上联合施加,并通过相对熵的分解层次展开。SBTS的变分核心得以保留:熵最小化器是参考的h-变换,在每个冻结区间上,最优动力学在活跃协方差方向的仿射叶上具有对数梯度漂移公式,即使冻结协方差是秩亏的也成立。我们建立了冻结近似的稳定性以及相应正则化核估计量的收敛性。该构造通过一个有限维条件映射实现,该映射由三种互补的过去约简组成——块PCR摘要、由运行时冻结协方差累积量诱导的过去增量的参考感知马氏核,以及在同一参考度量下的过去窗口WLS漂移回归器——以及一个耦合的状态-协方差桥步骤,其中每个潜在水平为上一水平产生动态参考,并由协方差描述符总结;该构造在数值实验上进行了评估。

英文摘要

Schrödinger bridges for time series (SBTS) generate synthetic paths by projecting, in relative entropy, a Brownian reference onto the path laws that match the joint distribution of the data on the observation grid. The Brownian reference, however, fixes the quadratic variation of the generated paths, which is restrictive when stochastic volatility, correlated noise, or rank-deficient covariance structures must be reproduced. We introduce "Triangular-Reference Schrödinger Bridges for Time Series" (TR-SBTS), which keeps the entropy-projection backbone of SBTS but replaces the Brownian reference by a triangular, volatility-informed, intervalwise frozen reference on a state augmented with latent covariance descriptors. The construction remains a single entropy projection on the augmented state: the minimiser is the \(h\)-transform of the reference, and on each frozen interval the optimal drift has the logarithmic-gradient form \(b^\star(t,x)=A\,\nabla\log H(t,x)\), intrinsic to the active covariance directions when the frozen covariance \(A\) is degenerate. We prove stability of the frozen approximation and consistency of the associated regularised kernel estimators, describe a reference-aware Nadaraya--Watson implementation of the conditional next-increment law, and evaluate the construction on numerical experiments.

2601.14031 2026-06-11 stat.ML cs.LG 版本更新

Intermittent time series forecasting: local vs global models

间歇性时间序列预测:局部模型与全局模型

Stefano Damato, Nicolò Rubattu, Dario Azzimonti, Giorgio Corani

AI总结 针对间歇性时间序列预测问题,首次系统比较了概率性局部模型与全局模型(如TiDE),发现简单神经网络架构TiDE在精度和计算效率上均优于局部模型,且Tweedie分布头对高分位数估计最佳。

详情
Comments
Submitted to the Journal of the Operational Research Society
AI中文摘要

预测包含零值的间歇性时间序列是供应链中的一个关键挑战,因为库存策略需要概率预测来建立安全水平。间歇性时间序列通常使用局部模型进行预测,即对每个时间序列单独训练。近年来,基于大量时间序列训练的全局模型在时间序列预测中变得流行。全局模型通常基于神经网络或梯度提升树。我们进行了首次研究,比较了最先进的概率性局部模型和全局模型在间歇性时间序列上的表现。对于全局模型,我们考虑了三种适用于间歇性时间序列的不同分布头:负二项、障碍移位负二项和Tweedie。据我们所知,这是后两者首次与神经网络结合使用。我们在五个数据集上进行了实验,这些数据集总共包含超过40,000个真实世界的时间序列。在全局模型中,TiDE(一种简单的神经网络架构)取得了最佳精度;它还持续优于局部模型,并且计算需求更低。大型全局模型反而计算需求更高且精度更低。在分布头中,Tweedie提供了最高分位数的最佳估计。

英文摘要

Forecasting intermittent time series, which contain zeros, is a crucial challenge in supply chains as inventory policies require probabilistic forecasts to establish safety levels. Intermittent time series are commonly forecast using local models, trained individually on each time series. In recent years, global models, trained on a large collection of time series, have become popular for time series forecasting. Global models are often based on neural networks or gradient boosted trees. We carry out the first study comparing state-of-the-art probabilistic local and global models on intermittent time series. For global models we consider three different distribution heads suitable for intermittent time series: negative binomial, hurdle-shifted negative binomial and Tweedie. To the best of our knowledge, this is the first use of the latter two with neural networks. We perform experiments on five datasets comprising more than 40,000 real-world time series overall. Among global models, TiDE, a simple neural network architecture, achieves the best accuracy; it also consistently outperforms local models and has lower computational requirements. Large global models are instead much more computationally demanding and less accurate. Among the distribution heads, the Tweedie provides the best estimates of the highest quantiles.

2411.12193 2026-06-11 stat.AP cs.LG stat.ML 版本更新

Hierarchical Probabilistic Conformal Prediction for Distributed Energy Resources Adoption

分布式能源采纳的分层概率保形预测

Wenbin Zhou, Shixiang Zhu

AI总结 针对分布式能源采纳预测中的不确定性和分层电网结构,提出基于多元霍克斯过程与分裂保形预测的量化框架,确保聚合后统计有效性,在印第安纳波利斯数据上优于基线。

AI中文摘要

分布式能源(DERs)的快速增长为电网管理带来了机遇和运营挑战。准确预测DER采纳对于主动基础设施规划至关重要,但DER增长的固有不确定性和空间差异使传统预测方法复杂化。此外,配电网的分层结构要求预测在电路和变电站层面均满足统计保证,这是可靠决策的非平凡要求。本文提出了一种新的DER采纳预测不确定性量化框架,确保在分层电网结构中的有效性。利用多元霍克斯过程建模DER采纳动态,并采用定制的分裂保形预测算法,我们引入了一种新的非一致性分数,在保持预测效率的同时,在聚合下保留统计保证。我们在温和条件下建立了理论有效性,并通过印第安纳州印第安纳波利斯的客户级太阳能电池板安装数据实证评估,表明我们的方法在预测准确性和不确定性校准方面始终优于现有基线。

英文摘要

The rapid growth of distributed energy resources (DERs) presents both opportunities and operational challenges for electric grid management. Accurately predicting DER adoption is critical for proactive infrastructure planning, but the inherent uncertainty and spatial disparity of DER growth complicate traditional forecasting approaches. Moreover, the hierarchical structure of distribution grids demands that predictions satisfy statistical guarantees at both the circuit and substation levels, a non-trivial requirement for reliable decision-making. In this paper, we propose a novel uncertainty quantification framework for DER adoption predictions that ensures validity across hierarchical grid structures. Leveraging a multivariate Hawkes process to model DER adoption dynamics and a tailored split conformal prediction algorithm, we introduce a new nonconformity score that preserves statistical guarantees under aggregation while maintaining prediction efficiency. We establish theoretical validity under mild conditions and demonstrate through empirical evaluation on customer-level solar panel installation data from Indianapolis, Indiana that our method consistently outperforms existing baselines in both predictive accuracy and uncertainty calibration.
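For readers unfamiliar with split conformal prediction, the following is a minimal sketch of the standard procedure with the absolute-residual nonconformity score. It is the textbook baseline, not the aggregation-preserving score introduced in the paper, and the function names are illustrative assumptions.

```python
import math


def conformal_quantile(scores, alpha):
    """The ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns infinity when the calibration set is too small to support
    the requested coverage level.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def split_conformal_interval(calib_preds, calib_truth, new_pred, alpha=0.1):
    """Symmetric prediction interval around new_pred.

    calib_preds / calib_truth come from a held-out calibration split that
    was not used to fit the underlying point predictor.
    """
    scores = [abs(y - p) for p, y in zip(calib_preds, calib_truth)]
    q = conformal_quantile(scores, alpha)
    return new_pred - q, new_pred + q
```

Under exchangeability of calibration and test points, the interval covers the true value with probability at least 1 - alpha. Intervals built this way at circuit level do not automatically keep that guarantee once summed to substation level, which is the difficulty the abstract describes.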

6. 计算统计与MCMC 5 篇

2606.12021 2026-06-11 stat.ME 新提交

Adaptive spatial blocking for scalable clustering inference with applications to high-throughput spatial proteomics

自适应空间分块用于可扩展聚类推断及其在高通量空间蛋白质组学中的应用

Mingyu Go, Julia Wrobel, Hoseung Song

AI总结 提出自适应空间分块算法,通过构造满足点计数和形状约束的局部块,利用渐近正态近似实现大规模点模式数据的快速聚类推断,平衡统计功效与计算效率。

AI中文摘要

Ripley's K函数是一种广泛用于评估点模式聚类的空间汇总统计量。然而,现有的基于K的方法在处理大规模数据时计算成本高昂,特别是在高通量空间蛋白质组学中,因为它们依赖于图像中所有点的空间信息。为应对这一挑战,我们提出了一种计算高效的基于分块的测试框架,该框架从图像中提取不相交的局部块,并跨块聚合聚类证据。所提出的自适应空间分块算法构造满足点计数和形状约束的块,通过渐近正态近似实现可扩展的空间聚类推断和快速p值计算。数值研究表明,所提出的方法在统计功效和计算效率之间提供了良好的平衡。在健康人肠道空间蛋白质组学数据的应用中,我们的方法检测到浆细胞的强空间聚集以及浆细胞与巨噬细胞之间的共定位,同时在大图像上具有良好的可扩展性。

英文摘要

Ripley's K-function is a widely used spatial summary statistic for assessing clustering in point patterns. However, existing K-based methods can be computationally prohibitive for large-scale data, particularly in high-throughput spatial proteomics, because they rely on spatial information from all points in the image. To address this challenge, we propose a computationally efficient block-based testing framework that extracts disjoint local blocks from an image and aggregates clustering evidence across them. The proposed adaptive spatial blocking algorithm constructs blocks satisfying point-count and shape constraints, enabling scalable spatial clustering inference and fast p-value computation through an asymptotic normal approximation. Numerical studies demonstrate that the proposed method provides a favorable balance between statistical power and computational efficiency. In an application to healthy human intestine spatial proteomics data, our method detects strong spatial aggregation of plasma cells and colocalization between plasma cells and macrophages, while scaling favorably to large images.
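The computational burden mentioned above is easy to see in a naive estimator of Ripley's K-function. The sketch below omits edge correction (a simplifying assumption) and counts every ordered pair of points, so its cost grows quadratically with the number of points in the image. The function name and arguments are illustrative.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate without edge correction.

    K(r) = area / (n * (n - 1)) * #{ordered pairs (i, j), i != j,
                                    with distance(i, j) <= r}
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close_pairs = 0
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            if i != j:
                xj, yj = points[j]
                if math.hypot(xi - xj, yi - yj) <= radius:
                    close_pairs += 1
    return area * close_pairs / (n * (n - 1))
```

Under complete spatial randomness K(r) is approximately pi * r^2, so values well above that suggest clustering at scale r. Evaluating the statistic on small disjoint blocks instead of the whole image replaces one large quadratic computation with many small ones, which is the general motivation for a block-based approach.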

2606.11487 2026-06-11 math.ST math.PR stat.ML 新提交

Unbiased Derivative Estimation for Stationary Mean of Parameterized Markov chains

参数化马尔可夫链平稳均值的无偏导数估计

Jeffrey Wang, Chang-han Rhee

AI总结 提出一种针对参数化马尔可夫链平稳均值梯度的无偏估计方法,在慢混合率下高效,无需密度函数先验知识,适用于神经网络参数化。

详情
Comments
Preliminary draft. Full version in preparation
AI中文摘要

我们提出了一种新方法,用于无偏估计与参数化马尔可夫链族相关的平稳均值的梯度。当马尔可夫链具有慢混合率时,我们的估计器特别高效。我们的方法不需要特定的参数化,除了一个预言机来评估给定数据点的转移密度及其梯度,而无需关于密度函数本身的任何额外知识。这使得我们的估计器适用于与神经网络相关的参数化。该估计器在效率方面可能实现大幅提升。数值实验证实了理论预测的良好性能。

英文摘要

We propose a new approach to unbiased estimation of the gradients of the stationary means associated with parametrized families of Markov chains. Our estimators are particularly efficient when the Markov chains have a slow mixing rate. Our approach does not require a specific parametrization; it needs only an oracle that evaluates the transition density and its gradient at a given data point, without any additional knowledge about the density function itself. This makes our estimator suitable for parametrizations associated with neural networks. The estimator can potentially achieve a large improvement in efficiency. Numerical experiments confirm the good performance predicted by the theory.

2606.11402 2026-06-11 stat.CO astro-ph.IM stat.ML 新提交

GraphGP: Scalable Gaussian Processes with Vecchia's Approximation

GraphGP: 基于Vecchia近似的可扩展高斯过程

Benjamin Dodge, Philipp Frank, Susan E. Clark

AI总结 提出GraphGP算法,利用Vecchia近似和GPU加速,将高斯过程扩展到近十亿参数,实现线性时间和内存复杂度,适用于大动态范围任意点分布。

详情
Comments
Accepted to Conference on Physics and AI at Stanford University (PAI 2026)
AI中文摘要

高斯过程是建模连续场的强大工具,但其朴素的$\mathcal{O}(N^3)$计算成本和$\mathcal{O}(N^2)$内存需求常常限制其实际应用。Vecchia近似是一种针对平稳、衰减核的稀疏精度矩阵近似,它将每个点仅条件于其$k$个最近邻。我们提出GraphGP,一种用于Vecchia近似的GPU算法,可扩展到近十亿参数,具有线性时间和内存需求,并能处理大动态范围内的任意点分布。我们的关键贡献是:(1) 一种比特反转k-d树排序,允许高效邻居搜索同时最大化批处理并行性;(2) 一种可微的CUDA实现,比纯JAX基线显著更快且内存效率更高。GraphGP提供了推理所需的构建块,包括前向生成、逆应用、对数行列式和核参数导数。

英文摘要

Gaussian processes are a powerful tool for modeling continuous fields, but their naive $\mathcal{O}(N^3)$ computational cost and $\mathcal{O}(N^2)$ memory requirement often limit their practical use. Vecchia's approximation is a sparse precision matrix approximation for stationary, decaying kernels that conditions each point only on its $k$ nearest neighbors. We present GraphGP, a GPU algorithm for Vecchia's approximation that scales to nearly a billion parameters with linear time and memory requirements, handling arbitrary point distributions over a large dynamic range. Our key contributions are (1) a bit-reversed k-d tree ordering that allows efficient neighbor searches while also maximizing batch parallelism, and (2) a differentiable CUDA implementation, which is substantially faster and more memory efficient than our pure JAX baseline. GraphGP provides the building blocks for inference, including forward generation, inverse application, log-determinant, and kernel parameter derivatives.
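A minimal sketch of Vecchia's approximation may help fix ideas. It is restricted to one-dimensional inputs, a zero-mean process with an exponential kernel, and k = 1 (each point is conditioned only on its nearest earlier point in the given ordering). These restrictions, the kernel choice and the function names are assumptions for illustration; nothing here reflects GraphGP's tree ordering or its CUDA implementation.

```python
import math


def exp_kernel(d, variance=1.0, length=1.0):
    """Stationary, decaying exponential kernel."""
    return variance * math.exp(-abs(d) / length)


def normal_logpdf(y, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


def vecchia_loglik_k1(xs, ys, variance=1.0, length=1.0):
    """Vecchia log-likelihood with one conditioning neighbour per point.

    log p(y) is approximated by log p(y_0) + sum_i log p(y_i | y_nn(i)),
    where nn(i) is the nearest point among those earlier in the ordering.
    Locations are assumed to be distinct.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == 0:
            total += normal_logpdf(y, 0.0, variance)
            continue
        # Nearest earlier point (linear scan, kept simple for clarity).
        j = min(range(i), key=lambda m: abs(x - xs[m]))
        c = exp_kernel(x - xs[j], variance, length)
        cond_mean = (c / variance) * ys[j]
        cond_var = variance - c * c / variance
        total += normal_logpdf(y, cond_mean, cond_var)
    return total
```

Each term touches only a constant number of points, so the number of conditional densities is linear in N; with larger k the scalar conditioning is replaced by a small k-by-k solve per point. The linear scan for the neighbour is what an efficient spatial index would replace.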

2606.11347 2026-06-11 stat.ML cs.LG math.OC 新提交

Annealed Entropic Allocation for Ranking and Selection

退火熵分配用于排序与选择

Xin Fei, Juergen Branke

AI总结 提出退火熵分配框架,通过加权log-sum-exp替代非光滑极大极小大偏差率目标,结合鞍点近似提升有限预算下的区分能力,数值实验表明在多个候选接近时性能优异。

AI中文摘要

我们提出了退火熵分配,一种用于排序与选择中顺序预算分配的退火加权软最小化框架。核心思想是用加权log-sum-exp替代非光滑的极大极小大偏差率目标,该替代通过软最小化权重聚合特定候选对的得分,从而在多个候选几乎同时活跃时缓解硬切换。为了提升有限预算下的区分能力,我们引入了鞍点近似——一种从精细化的成对尾部渐近性导出的次指数修正。由于这些修正是次指数的,且平滑参数退火至零,该替代保持了与经典极大极小公式相同的一阶大偏差目标。我们证明了该替代一致收敛于硬最小值,软最小化权重集中于活跃候选,并且在固定权重下,诱导的目标分配映射在单纯形内部是连续的。在高斯和指数实例上的数值实验展示了竞争性能,尤其是在多个候选几乎持平时。

英文摘要

We propose Annealed Entropic Allocation, an annealed weighted soft-min framework for sequential budget allocation in ranking and selection. The central idea is to replace the non-smooth maximin large-deviation rate objective with a weighted log-sum-exp surrogate that aggregates challenger-specific pairwise scores through soft-min weights, mitigating hard switching when several challengers are nearly active. To improve finite-budget discrimination, we incorporate the saddlepoint approximation -- a sub-exponential correction derived from refined pairwise tail asymptotics. Because these corrections are sub-exponential and the smoothing parameter is annealed to zero, the surrogate preserves the same first-order large-deviation target as the classical maximin formulation. We show that the surrogate converges uniformly to the hard minimum, that the soft-min weights concentrate on the active challengers, and that, under fixed weights, the induced target allocation map is continuous on the simplex interior. Numerical experiments on Gaussian and exponential instances demonstrate competitive performance, especially when multiple challengers are nearly tied.
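The weighted log-sum-exp surrogate for a hard minimum can be sketched in a few lines. The block below implements a generic weighted soft-min, $-\tau \log \sum_j w_j e^{-s_j/\tau}$, together with the induced soft-min weights. The scores, weights and temperatures are illustrative assumptions, and the paper's annealing schedule and saddlepoint corrections are not included.

```python
import math


def weighted_softmin(scores, weights, tau):
    """Weighted log-sum-exp soft-min and its induced weights.

    value = -tau * log(sum_j w_j * exp(-s_j / tau)), computed stably by
    factoring out the smallest score. The returned probabilities are
    proportional to w_j * exp(-s_j / tau).
    """
    m = min(scores)
    terms = [w * math.exp(-(s - m) / tau) for s, w in zip(scores, weights)]
    z = sum(terms)
    value = m - tau * math.log(z)
    probs = [t / z for t in terms]
    return value, probs


# Two nearly tied challengers and one clearly worse (illustrative scores).
pairwise_scores = [1.0, 1.05, 3.0]
uniform = [1.0 / 3.0] * 3
warm_value, warm_probs = weighted_softmin(pairwise_scores, uniform, tau=1.0)
cold_value, cold_probs = weighted_softmin(pairwise_scores, uniform, tau=0.001)
```

At the larger temperature the two near-tied scores receive similar weight, which avoids abrupt switching between them; as the temperature is lowered towards zero the value approaches the hard minimum and the weight concentrates on the smallest score.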

2510.01861 2026-06-11 stat.ME stat.CO 版本更新

Compressed Bayesian Tensor Regression

压缩贝叶斯张量回归

Roberto Casarin, Radu Craiu, Qing Wang

AI总结 针对张量回归中的高维问题,提出广义张量随机投影方法将高维协变量嵌入低维子空间,结合贝叶斯推理框架和低秩参数表示,实现高效预测与计算成本降低。

AI中文摘要

为了解决张量回归中常见的高维问题,我们引入了一种广义张量随机投影方法,该方法将高维张量值协变量嵌入低维子空间,同时最小化响应信息的损失。该方法灵活,允许张量-wise、模式-wise 或组合随机投影作为特例。我们提供了一个贝叶斯推理框架,其特点是使用分层先验分布和参数的低秩表示。为随机投影的集中性质和贝叶斯推理的后验一致性提供了强有力的理论支持。开发了一个高效的吉布斯采样器来对压缩数据进行推理。为了减轻随机投影引入的敏感性,采用了贝叶斯模型平均,并使用逆逻辑回归估计归一化常数。进行了广泛的模拟研究,以检查不同调谐参数的影响。模拟表明,并且实际数据应用证实,与标准贝叶斯张量回归相比,压缩贝叶斯张量回归可以在显著降低计算成本的同时实现更好的样本外预测。

英文摘要

To address the common problem of high dimensionality in tensor regressions, we introduce a generalized tensor random projection method that embeds high-dimensional tensor-valued covariates into low-dimensional subspaces with minimal loss of information about the responses. The method is flexible, allowing for tensor-wise, mode-wise, or combined random projections as special cases. A Bayesian inference framework is provided featuring the use of a hierarchical prior distribution and a low-rank representation of the parameter. Strong theoretical support is provided for the concentration properties of the random projection and posterior consistency of the Bayesian inference. An efficient Gibbs sampler is developed to perform inference on the compressed data. To mitigate the sensitivity introduced by random projections, Bayesian model averaging is employed, with normalising constants estimated using reverse logistic regression. An extensive simulation study is conducted to examine the effects of different tuning parameters. Simulations indicate, and the real data application confirms, that compressed Bayesian tensor regression can achieve better out-of-sample prediction while significantly reducing computational cost compared to standard Bayesian tensor regression.

7. 机器学习统计基础 27 篇

2606.12058 2026-06-11 stat.ML cond-mat.dis-nn cs.LG 新提交

Phase Transitions in Attention: A Bayesian Theory of Copy Head Emergence

注意力中的相变:复制头涌现的贝叶斯理论

Itay Lavie, Kirsten Fischer, Andrey Lekov, Frederic Van Maele, Zohar Ringel, Moritz Helias

AI总结 通过分析单层softmax注意力网络在复制任务上的训练,提出贝叶斯理论揭示注意力矩阵的后验分布存在相变,并对比线性注意力发现softmax注意力呈现一阶相变。

详情
AI中文摘要

注意力是Transformer中上下文学习的关键机制,经验上观察到注意力模式在训练过程中突然涌现。我们提出了注意力中特征学习的贝叶斯理论;然后通过分析在复制任务上训练的单层softmax注意力网络,专注于归纳头第一层中复制子电路的学习方式。我们推导出注意力矩阵上的闭式后验,并将其简化为低维序参数空间。这种简化揭示了训练数据量上的相变,我们通过贝叶斯采样和使用Adam的标准训练验证了这一点。我们将结果与线性注意力对比,发现softmax注意力表现出一阶相变,而在线性注意力中,初始的二阶相变之后是向结构化注意力模式的平滑连续演化(交叉)。我们的工作为复制子电路的突然涌现提供了第一性原理的理论解释,这让人联想到在大语言模型训练中观察到的现象。

英文摘要

Attention is the key mechanism underlying in-context learning in transformers, and attention patterns have been observed empirically to emerge abruptly during training. We present a Bayesian theory of feature learning in attention; we then focus on how the copy subcircuit in the first layer of an induction head is learned by analyzing a single-layer softmax attention network trained on a copy task. We derive a closed-form posterior over the attention matrix and reduce it to a low-dimensional order parameter space. This reduction reveals a phase transition in the amount of training data, which we verify using both Bayesian sampling and standard training with Adam. We contrast our results with linear attention and find that softmax attention exhibits a first-order phase transition while in linear attention an initial second-order phase transition is followed by a smooth, continuous evolution toward the structured attention pattern (crossover). Our work provides a first-principles theoretical account of the abrupt emergence of the copy subcircuit, reminiscent of the one observed in training large language models.

2606.11988 2026-06-11 cs.LG stat.ML 新提交

What Uncertainties Do We Need for Dynamical Systems?

动力系统需要哪些不确定性?

Yusuf Sale, Christopher Bülte, Felix Czaja, Joshua Stiller, Eyke Hüllermeier

发表机构 Institute of Computer Science, LMU Munich(慕尼黑大学计算机科学研究所); Munich Center for Machine Learning (MCML)(慕尼黑机器学习中心); Department of Mathematics, LMU Munich(慕尼黑大学数学系); German Research Center for Artificial Intelligence (DFKI, DSA)(德国人工智能研究中心(DFKI, DSA))

AI总结 本文从机器学习视角探讨动力系统中的不确定性,区分偶然与认知不确定性,并讨论不同任务中表示和量化不确定性的目标。

详情
Comments
EIML@ICML
AI中文摘要

偶然不确定性和认知不确定性之间的区别在机器学习研究中受到了相当大的关注,主要是在监督学习的背景下,但也涉及其他设置,如生成建模。在本文中,我们提供了一个关于动力系统不确定性建模的机器学习视角,这方面的研究迄今较少。特别是,我们提出:动力系统需要哪些不确定性?我们讨论了不确定性的来源,阐明了它们的性质(偶然或认知),并考虑了表示和量化不确定性的目标如何在不同任务中变化。

英文摘要

The distinction between aleatoric and epistemic uncertainty has received considerable attention in machine learning research, mainly in the context of supervised learning but also in other settings such as generative modeling. In this paper, we offer a machine learning perspective on uncertainty modeling for dynamical systems, which has been studied much less so far. In particular, we ask: what uncertainties do we need for dynamical systems? We discuss sources of uncertainty, clarify their nature (aleatoric or epistemic), and consider how the objectives of representing and quantifying uncertainty vary across different tasks.

2606.11968 2026-06-11 cs.LG stat.ML 新提交

Efficient Multinomial Logistic Bandit via Frequent Directions

基于频繁方向的高效多项逻辑斯蒂老虎机

Linzhe He, Yu-Jie Zhang, Sifan Yang, Lijun Zhang

发表机构 * State Key Laboratory of Novel Software Technology, Nanjing University(南京大学计算机软件新技术国家重点实验室);School of Artificial Intelligence, Nanjing University(南京大学人工智能学院);Paul G. Allen School of Computer Science & Engineering, University of Washington(华盛顿大学保罗·G·艾伦计算机科学与工程学院)

AI总结 针对多项逻辑斯蒂老虎机(MLogB)的高维计算瓶颈,提出将频繁方向(Frequent Directions)矩阵草图集成到OFUL-MLogB中的EOFD-MLogB算法,将每轮复杂度降至$\mathcal{O}(Kd(m+K)^2)$时间和$\mathcal{O}(Kd(m+K))$空间,并证明当Hessian近似低秩时其遗憾界接近OFUL-MLogB。

详情
AI中文摘要

本文研究多项逻辑斯蒂老虎机(MLogB)的高效在线算法,其中$K+1$个结果的反馈分布遵循$d$维动作向量的多项逻辑斯蒂模型。代表性的UCB型算法OFUL-MLogB实现了$\tilde{\mathcal{O}}(Kd\sqrt{T})$的遗憾界,但由于参数估计和乐观奖励构造,每轮仍需$\mathcal{O}(K^3d^3)$时间和$\mathcal{O}(K^2d^2)$空间,在高维场景下代价过高。为解决此限制,我们提出EOFD-MLogB,将频繁方向(frequent directions)矩阵草图集成到OFUL-MLogB中。通过维护累积Hessian的低秩SVD草图,参数估计中的约束在线牛顿更新和奖励加成项(reward bonus)中的$Kd \times K$谱范数计算分别简化为一维求根任务和$K \times K$特征值计算。由此得到每轮主导时间复杂度$\mathcal{O}(Kd(m+K)^2)$和空间复杂度$\mathcal{O}(Kd(m+K))$,其中$m \ll d$为草图大小。我们进一步证明了$\tilde{\mathcal{O}}(\Delta_T(Kd\ln\Delta_T+m)\sqrt{T})$的遗憾界,其中草图误差因子$\Delta_T$由Hessian的$m$截断谱尾控制。因此,当Hessian近似低秩时,遗憾接近OFUL-MLogB。实验验证了该算法的计算效率和有竞争力的性能。

英文摘要

This paper studies efficient online algorithms for multinomial logistic bandits (MLogB), where the feedback distribution over $K+1$ outcomes follows a multinomial logistic model of $d$-dimensional action vectors. A representative UCB-type algorithm, OFUL-MLogB, achieves a regret bound of $\tilde{\mathcal{O}}(Kd\sqrt{T})$, but still requires $\mathcal{O}(K^3d^3)$ time and $\mathcal{O}(K^2d^2)$ space per round due to parameter estimation and optimistic reward construction, which is prohibitive in high-dimensional settings. To address this limitation, we propose EOFD-MLogB, which integrates frequent directions matrix sketching into OFUL-MLogB. By maintaining a low-rank SVD sketch of the accumulated Hessian, constrained online Newton updates in parameter estimation and $Kd \times K$ spectral-norm computations in the reward bonus are reduced to one-dimensional root-finding tasks and $K \times K$ eigenvalue computations, respectively. This yields dominant per-round time complexity $\mathcal{O}(Kd(m+K)^2)$ and space complexity $\mathcal{O}(Kd(m+K))$, where $m \ll d$ is the sketch size. We further prove a regret bound of $\tilde{\mathcal{O}}(\Delta_T(Kd\ln\Delta_T+m)\sqrt{T})$, where the sketching error factor $\Delta_T$ is controlled by the $m$-truncated spectral tail of the Hessian. Thus, when the Hessian is approximately low-rank, the regret is close to that of OFUL-MLogB. Experiments validate the computational efficiency and competitive performance.

2606.11711 2026-06-11 cs.LG stat.ML 新提交

Capacity-Constrained Online Convex Optimization with Delayed Feedback

具有延迟反馈的容量受限在线凸优化

Alexander Ryabchenko, Idan Attias, Daniel M. Roy

发表机构 * Department of Statistical Sciences, University of Toronto(多伦多大学统计科学系);Vector Institute(向量研究所);Institute for Data, Econometrics, Algorithms, and Learning (IDEAL), hosted by UIC and TTIC(数据、计量经济学、算法与学习研究所(IDEAL),由伊利诺伊大学芝加哥分校和丰田工业大学芝加哥分校主办)

AI总结 研究在硬容量约束下(最多同时跟踪C个待处理轮次)的延迟在线凸优化,通过引入半先知模型和延迟加权FTRL算法,首次给出了凸和强凸损失下容量受限OCO的遗憾界。

详情
AI中文摘要

具有延迟反馈的在线学习通常假设学习者可以跟踪所有待处理轮次直到其反馈到达。在实践中,跟踪资源是有限的,未跟踪轮次的反馈将永久丢失。在本文中,我们研究了在硬容量约束下的延迟在线凸优化(OCO),其中任何时候最多可以跟踪$C$个待处理轮次。为了建模延迟信息,我们引入了一个半先知模型,该模型细化了先前工作中的先知假设:学习者不需要在预测时知道延迟,而是在线观察延迟到期,这与经典的无约束延迟设置一致。我们的方法通过归约到一个新颖的“延迟且加权”的OCO问题来实现,使用一个随机化跟踪决策并对结果观测进行重要性加权的调度器。对于这个基础问题,我们提出并分析了延迟加权FTRL及其赌博机变体,建立了明确刻画时变权重与延迟反馈之间相互作用的遗憾界。将这些基础学习器与我们的调度器相结合,首次给出了在凸和强凸损失下容量受限OCO的遗憾保证,适用于一阶和赌博机反馈。对于一阶反馈,容量$C = \Omega(\log T)$足以在忽略对数因子的情况下恢复标准延迟OCO的速率。对于赌博机反馈,遗憾率由$(1 + \sigma_{\text{max}}/C)$的幂次调制,其中$\sigma_{\text{max}}$是任何时候的最大待处理观测数。这使得当$C < \sigma_{\text{max}}$时遗憾界能够优雅地退化,同时保持次线性。

英文摘要

Online learning with delayed feedback typically assumes that the learner can track all pending rounds until their feedback arrives. In practice, tracking resources are finite, and feedback from untracked rounds is permanently lost. In this paper, we study delayed online convex optimization (OCO) under a hard capacity constraint, where at most $C$ pending rounds can be tracked at any time. To model delay information, we introduce a semi-clairvoyant model that refines the clairvoyant assumption from prior work: rather than requiring delays to be known at prediction time, the learner observes delay expirations online, consistent with the classical unconstrained delayed setting. Our approach proceeds via a reduction to a novel ``delayed and weighted'' OCO problem, using a scheduler that randomizes tracking decisions and importance-weights the resulting observations. For this base problem, we propose and analyze Delayed-Weighted FTRL and its bandit analogue, establishing regret bounds that explicitly characterize the interaction between time-varying weights and delayed feedback. Combining these base learners with our schedulers yields the first regret guarantees for capacity-constrained OCO under convex and strongly convex losses, for both first-order and bandit feedback. For first-order feedback, capacity $C = \Omega(\log T)$ suffices to recover standard delayed OCO rates up to logarithmic factors. For bandit feedback, the regret rates are modulated by powers of $(1 + \sigma_{\text{max}}/C)$, where $\sigma_{\text{max}}$ is the maximum number of pending observations at any time. This allows the regret bound to degrade gracefully when $C < \sigma_{\text{max}}$, while remaining sublinear.

2606.11646 2026-06-11 cs.LG q-bio.QM stat.ML 新提交

Tree-Structured Orthonormal Decomposition of the Aitchison Simplex

Aitchison单纯形的树结构正交分解

Daisuke Yamada, Qijun Zhang, Travis Pence, Barbara B. Bendlin, Federico Rey, Vikas Singh

AI总结 提出PolyILR方法,利用树结构对成分数据进行正交分解,在微生物组和单细胞数据中生成稳定可解释的特征,并建立与softmax分类器的理论联系。

详情
Comments
Accepted at ICML 2026. To appear in PMLR vol. 306
AI中文摘要

成分数据——编码相对比例的向量——出现在包括生态学、地球化学和基因组学在内的科学领域。这些数据中的特征通常具有已知的层次结构(例如,分类学、系统发育、本体论),但现有方法要么忽略这种结构,要么丢弃内在的Aitchison几何,要么设计用于二叉树,要么产生不完整的坐标系。我们描述了PolyILR,一种与任何树拓扑对齐的Aitchison切空间的正交分解。我们的构造在每个内部节点定义了一个加权局部几何,捕获完整的分支结构,然后将这些提升到一个全局正交基,其中每个坐标对应一个特定的树位置。在微生物组和单细胞基准测试中,PolyILR产生稳定、可解释的特征,并支持多尺度树分辨率下的推理。我们还建立了与softmax分类器的新理论联系,暗示了在概率建模中的可能应用。

英文摘要

Compositional data -- vectors encoding relative proportions -- arise across scientific domains, including ecology, geochemistry, and genomics. The features in these data often come with known hierarchical structure (e.g., taxonomies, phylogenies, ontologies), yet existing methods either ignore this structure, discard the intrinsic Aitchison geometry, are designed for binary trees, or yield incomplete coordinate systems. We describe PolyILR, a canonical orthonormal decomposition of the Aitchison tangent space aligned with any tree topology. Our construction defines a weighted local geometry at each internal node capturing full branching structure, then lifts these to a global orthonormal basis where every coordinate corresponds to a specific tree location. On microbiome and single-cell benchmarks, PolyILR yields stable, interpretable features and enables inference at multiscale tree resolution. We also establish a novel theoretical connection to softmax classifiers, suggesting possible applications to probabilistic modeling.

2606.11574 2026-06-11 cs.LG cond-mat.mtrl-sci physics.chem-ph stat.ML 新提交

Range-Aware Bayesian Optimization for Discovering Diverse Designs within Target Property Windows

范围感知贝叶斯优化用于在目标属性窗口内发现多样化设计

Shengli Jiang, Jason Wu, Charles M. Schroeder, Michael A. Webb

发表机构 * Department of Chemical and Biological Engineering, Princeton University(普林斯顿大学化学与生物工程系)

AI总结 提出范围感知贝叶斯优化框架,通过采集函数直接评分候选解满足目标范围的后验概率,在基准任务和实际案例中比标准方法发现更多样化的有效设计。

详情
Comments
64 pages, 6 main text figures, 17 supporting figures, 6 supporting tables
AI中文摘要

在许多材料和产品设计问题中,理想的候选物表现出可接受范围内的属性,而非达到单一最优值。恢复满足此类规格的多个不同解也具有实际价值,因为某些候选物可能因成本、可加工性或鲁棒性等原因而更受青睐,而这些因素难以直接编码到目标函数中。在此,我们开发了一个范围感知贝叶斯优化(BO)框架,其中采集函数直接评分候选解满足目标范围的后验概率。该框架自然扩展到在共享候选空间上并行追求多个不同规格。在基准任务中,范围感知采集一致地比标准BO基线和最近的目标寻求方法恢复更大且更多样化的有效设计集。其效用进一步在两个实际动机的设计案例研究中得到证明,涉及优化聚合物合成的反应条件和发现指定光学吸收带的序列定义低聚物,并得到量子化学计算的支持。这些结果表明,范围感知BO可以为规格驱动设计提供实用且样本高效的基础,特别是当设计灵活性和解多样性是重要考虑因素时。

英文摘要

In many materials and product design problems, desirable candidates exhibit properties that fall within an acceptable range rather than achieve a single optimum. Recovering multiple, distinct solutions that satisfy such specifications is also practically valuable, as some candidates may be preferred for reasons of cost, processability, or robustness that are difficult to encode directly in an objective function. Here, we develop a range-aware Bayesian optimization (BO) framework in which the acquisition function directly scores the posterior probability that a candidate satisfies a target range. The framework naturally extends to parallel pursuit of multiple distinct specifications over a shared candidate space. Across benchmark tasks, range-aware acquisition consistently recovers larger and more diverse sets of valid designs than standard BO baselines and recent goal-seeking methods. Its utility is further demonstrated in two practically motivated design case studies involving optimizing reaction conditions for polymer synthesis and sequence-defined oligomer discovery for prescribed optical absorption bands, supported by quantum chemical calculations. These results suggest that range-aware BO can provide a practical and sample-efficient foundation for specification-driven design, particularly when design flexibility and solution diversity are important considerations.
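
As a rough illustration of the acquisition idea described in this abstract, the sketch below scores candidates by the probability that a property falls inside a target window. It assumes (the abstract does not say this) that the surrogate posterior at each candidate is Gaussian with mean `mu` and standard deviation `sigma`. The function names `prob_in_range` and `rank_candidates` and the example numbers are hypothetical, and this is not the authors' implementation.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_in_range(mu: float, sigma: float, lo: float, hi: float) -> float:
    """P(lo <= f <= hi) when f ~ Normal(mu, sigma^2) (assumed posterior)."""
    if sigma <= 0.0:
        # Degenerate posterior: the prediction is treated as exact.
        return 1.0 if lo <= mu <= hi else 0.0
    return normal_cdf((hi - mu) / sigma) - normal_cdf((lo - mu) / sigma)


def rank_candidates(posteriors, lo, hi):
    """Sort candidate names by in-range probability, highest first."""
    scored = [(prob_in_range(mu, s, lo, hi), name)
              for name, (mu, s) in posteriors.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical posterior (mean, std) for three candidates; target window [2, 3].
posteriors = {"A": (2.5, 0.2), "B": (3.5, 0.2), "C": (2.5, 2.0)}
ranking = rank_candidates(posteriors, lo=2.0, hi=3.0)
print(ranking)
```

Candidate C has the same mean as A but a much wider posterior, so less of its probability mass lies inside the window. A score of this kind favours candidates that are confidently inside the range rather than ones that merely have a high or low predicted value.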

2606.11570 2026-06-11 stat.ML cs.LG stat.ME 新提交

Enhancing Spectral Embedding through Robust and Flexible Knowledge Transfer in Electronic Health Records

通过电子健康记录中的鲁棒且灵活的知识迁移增强谱嵌入

Feiqing Huang, Zongqi Xia, Rong Ma, Tianxi Cai

AI总结 提出一种基于谱的无监督表示学习框架,通过从更广泛人群提取知识矩阵并放松信号对齐假设,为罕见病队列生成低维嵌入,在模拟和真实多发性硬化症数据中优于现有方法。

详情
AI中文摘要

我们提出了一种基于谱的无监督表示学习框架,用于从电子健康记录中为罕见病队列的临床概念和患者导出低维嵌入,其中数据是高维的但样本量有限。为了克服这一挑战,我们引入了一个从更广泛人群中提取的知识矩阵,该矩阵与罕见病队列共享部分重叠的子空间。我们的方法不同于现有方法,它放松了潜在数据矩阵和知识矩阵之间严格的一对一信号对齐假设,允许更灵活和现实的结构化共享形式。我们引入了一种新颖的两步谱嵌入过程:首先,我们从知识矩阵中识别并移除不相关的成分;然后,我们应用基于投影的方法分别恢复共享和异质成分。模拟和对真实世界多发性硬化症队列的分析表明,所提出的方法优于竞争方法,特别是在共享信号较弱且仅部分对齐的挑战性场景中,这在罕见病数据中很常见。

英文摘要

We propose a spectral-based, unsupervised representation learning framework to derive low-dimensional embeddings for clinical concepts and patients in rare disease cohorts from electronic health records, where data are high-dimensional but sample sizes are limited. To overcome this challenge, we incorporate a knowledge matrix extracted from a broader population that shares a partially overlapping subspace with the rare-disease cohort. Our method departs from existing approaches by relaxing restrictive one-to-one signal-alignment assumptions between the latent data matrix and knowledge matrix, allowing more flexible and realistic forms of structured sharing. We introduce a novel two-step spectral embedding procedure: first, we identify and remove irrelevant components from the knowledge matrix; then, we apply a projection-based method to separately recover shared and heterogeneous components. Simulations and an analysis of a real-world multiple sclerosis cohort show that the proposed method outperforms competing approaches, particularly in challenging scenarios where shared signals are weak and only partially aligned, as is common in rare-disease data.

2606.11437 2026-06-11 cs.DS cs.AI cs.LG stat.ML 新提交

The Power of Test-Time Training for Approximate Sampling

测试时训练对近似采样的威力

Noah Golowich, Ankur Moitra, Dhruv Rohatgi

AI总结 本文形式化测试时训练(TTT)为从已知分布类中采样的问题,证明查询复杂度的二次下界,并展示在分布类大小受限时可规避该下界,为TTT提供理论框架。

详情
AI中文摘要

从复杂概率分布中高效采样是一个基本问题,近年来随着生成式AI的兴起,这一问题变得越来越重要,因为从大语言模型(LLM)中提出的复杂采样程序已被用于解决具有挑战性的推理问题。然而,这类采样算法的有效性受到LLM与特定采样任务之间关系的限制,这推动了测试时训练(TTT)框架的发展。TTT通过根据推理时收到的部分生成和奖励反馈更新模型权重来工作,从而适应特定问题。在这项工作中,我们提出了一种TTT的形式化,将其定义为从属于已知分布类$F$的给定概率测度$\mu^\star$中生成样本的问题,给定一个提供$\mu^\star$近似密度估计的预言机$\hat \mu$。这与Jerrum、Valiant和Vazirani(1986)以及Jerrum和Sinclair(1989)的开创性工作中研究的将采样约化为近似计数的问题密切相关:即当$F$是所有分布的类时,它恰好与上述计数到采样的约化一致。在本文中,我们首先证明了在给定对$\hat \mu$的查询访问的情况下,从$\mu^\star$采样的查询复杂度的二次下界(对于足够大的类$F$),从而表明Jerrum和Sinclair(1989)提出并由Hayes和Sinclair(2010)改进的随机游走方法是最优的。这回答了Hayes和Sinclair提出的一个开放问题。然后,我们证明如果$F$的大小适当受限,这个下界可以被规避。正如我们所讨论的,后一个结果可以被视为TTT的抽象,因此代表了为TTT发展一个原则性理论框架的起点。

英文摘要

Efficiently sampling from a complex probability distribution is a fundamental problem which has become increasingly pertinent in recent years with the rise of generative AI, as sophisticated sampling procedures from LLMs have been proposed to solve challenging reasoning problems. The efficacy of such sampling algorithms is limited, however, by the relationship between the LLM and the particular sampling task at hand, which has motivated the framework of test-time training (TTT). TTT works by updating a model's weights in response to partial generations and reward feedback received at inference time, thus adapting to the particular problem. In this work, we propose a formalization for TTT as the problem of producing a sample from a given probability measure $\mu^\star$ belonging to a known class ${F}$ of distributions, given an oracle $\hat \mu$ which yields approximate density estimates for $\mu^\star$. This is closely related to the problem of reducing sampling to approximate counting studied in seminal works of Jerrum, Valiant & Vazirani (1986) and Jerrum & Sinclair (1989): namely, when ${F}$ is the class of all distributions, it coincides exactly with the aforementioned counting-to-sampling reduction. In this paper, we first show a quadratic lower bound on the query complexity of sampling from $\mu^\star$ given query access to $\hat \mu$ (for sufficiently large classes ${F}$), thus showing that the random walk approach proposed by Jerrum & Sinclair (1989) and refined by Hayes & Sinclair (2010), is optimal. This answers an open question posed by Hayes & Sinclair. We then show that this lower bound can be circumvented if the size of ${F}$ is bounded appropriately. As we discuss, this latter result can be viewed as an abstraction of TTT, and thus represents a starting point for the development of a principled theoretical framework for TTT.

2606.11417 2026-06-11 cs.LG cs.AI stat.ML 新提交

Signed Compression Progress on a Sealed Audit is Goodhart-Resistant

密封审计上的有符号压缩进展具有抗古德哈特性

Ayush Mittal, Dhruv Gupta

AI总结 提出有符号压缩进展作为内在动机,证明其累积奖励等于审计改进,且对有限审计面板具有假阳性预算,抵抗古德哈特定律。

详情
Comments
16 pages, 7 figures. Lean 4 (Mathlib) mechanized core and ARC-TGI experiment code: this https URL
AI中文摘要

压缩进展是一个由来已久的内在动机方案:当智能体的世界模型在预测或压缩经验方面变得更好时给予奖励。通常的说法是,这种奖励是“可信的”,因为它只为学习付酬。我们将这一说法精确化并加以证明。如果内在奖励是固定密封审计损失的有符号下降量,即 $r_t = E(\theta_{t-1}) - E(\theta_t)$,那么累积奖励经裂项相消(telescoping)恰好等于首尾两端的审计改进,因此没有任何策略能够在真实审计性能停滞或下降时无限推高奖励。对于有限审计面板,同样的结果成立,并带有紧的假阳性预算:累积经验奖励至多为真实审计改进加上 $2\Delta_n(F, \delta)$,即模型类的一致审计偏差。该结果与时间跨度(horizon)无关:一旦密封面板一致地控制了该模型类,随时间的自适应性不产生额外代价。该定理还指出了失效模式:如果进展被裁剪(clip)、在智能体自身的数据流上评分、在可重用面板上暴露给高容量模型,或应用于使 $\Delta_n$ 变得空泛(vacuous)的神经网络模型类,则保证不再成立。我们给出了结构核心(裂项相消、有限审计界、有限吉布斯和熵下限)的 Lean 4 机械化证明,以及在 ARC-TGI 网格变换生成器上带有自适应留出集(holdout)攻击的实验套件。实验证实了理论:有限审计偏差按 $n^{-0.527}$ 缩放;有符号进展能够抵抗裁剪刷分(clip-farming)、数据流泄漏和噪声电视(noisy-TV)式好奇心;朴素的可重用审计可被黑盒标量反馈利用,而标准发布防御将攻击保持在 $2\Delta_n$ 阈值以下。密封审计上的有符号压缩进展是反映真实改进的记账信号。

英文摘要

Compression progress is a long-standing proposal for intrinsic motivation: reward an agent when its world model becomes better at predicting or compressing experience. The folk claim is that this reward is "credible" because it is paid only for learning. We make this precise and prove it. If intrinsic reward is the signed decrease of a fixed sealed-audit loss, $r_t = E(\theta_{t-1}) - E(\theta_t)$, then cumulative reward telescopes exactly to endpoint audit improvement, so no policy can push reward up indefinitely while true audit performance stagnates or degrades. For finite audit panels the same result holds with a sharp false-positive budget: cumulative empirical reward is at most true audit improvement plus $2\Delta_n(F, \delta)$, the uniform audit deviation of the model class. This is horizon-free: adaptivity over time costs nothing once the sealed panel uniformly controls the class. The theorem also identifies the failure modes: the guarantee disappears if progress is clipped, scored on the agent's own stream, exposed to a high-capacity model on a reusable panel, or applied to a neural class that makes $\Delta_n$ vacuous. We give a Lean 4 mechanization of the structural core (telescoping, the finite-audit bound, finite Gibbs, and the entropy floor) and an experiment suite on ARC-TGI grid-transformation generators with adaptive holdout attacks. Experiments confirm the theory: finite-audit deviation scales as $n^{-0.527}$; signed progress resists clip-farming, stream leakage, and noisy-TV curiosity; naive reusable audits are exploitable by black-box scalar feedback, while standard release defenses keep the attack below the $2\Delta_n$ threshold. Signed compression progress on a sealed audit is an accounting signal of genuine improvement.
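
The telescoping argument in this abstract can be checked numerically with a toy example. The sketch below takes a sequence of audit losses as given (the trajectory is hypothetical, and how the sealed audit is evaluated is outside its scope) and compares the signed reward with a clipped variant that pays only for decreases. The helper names are illustrative and not taken from the paper's code.

```python
def signed_rewards(losses):
    """r_t = E(theta_{t-1}) - E(theta_t) for consecutive audit losses."""
    return [prev - cur for prev, cur in zip(losses, losses[1:])]


def clipped_rewards(losses):
    """Variant that pays only for decreases (negative progress clipped to 0)."""
    return [max(0.0, r) for r in signed_rewards(losses)]


# Hypothetical audit-loss trajectory that oscillates and ends where it started.
losses = [1.0, 0.5, 1.0, 0.5, 1.0]

total_signed = sum(signed_rewards(losses))    # telescopes to losses[0] - losses[-1]
total_clipped = sum(clipped_rewards(losses))  # grows with every oscillation
print(total_signed, total_clipped)
```

In this toy trajectory the signed total equals the first loss minus the last, so oscillating earns nothing. The clipped total increases with each down-swing even though the final loss has not improved, which is the clip-farming failure mode the abstract names.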

2606.11339 2026-06-11 math.OC cs.AI cs.LG eess.SY stat.ML 新提交

Quantized Stochastic Primal-Dual Methods for Distributed Optimization under Relaxed Global Geometry

松弛全局几何下分布式优化的量化随机原始-对偶方法

Susmit Sarkar, Abhinav Raghuvanshi, Kushal Chakrabarti, Mayank Baranwal

AI总结 提出量化随机原始-对偶方法q-PDGD,在松弛全局几何下证明线性收敛到邻域或O(1/k)收敛,匹配最优集中随机复杂度。

详情
Comments
Accepted to UAI
AI中文摘要

我们研究具有随机梯度和有限比特通信(由随机(无偏)量化建模)的分布式优化。我们提出q-PDGD,一种量化的随机原始-对偶方法,并在松弛全局几何下对其进行分析。在受限割线不等式(RSI)下,常数步长产生线性收缩到由梯度噪声、量化失真和网络连通性确定的显式邻域,而递减步长在没有共享最小化器假设的情况下实现O(1/k)收敛。在Polyak-Lojasiewicz(PL)不等式下,我们在相同的随机量化设置中获得线性到邻域的收敛。我们的结果在预言复杂度上匹配已知最优的集中随机速率,并通过实验证明了量化水平、步长选择和图结构之间的预测权衡。

英文摘要

We study distributed optimization with stochastic gradients and finite-bit communication modeled by random (unbiased) quantization. We propose q-PDGD, a quantized stochastic primal-dual method, and analyze it under relaxed global geometry. Under the restricted secant inequality (RSI), a constant step-size yields linear contraction to an explicit neighborhood determined by gradient noise, quantization distortion, and network connectivity, while a diminishing step-size achieves $O(1/k)$ convergence without shared-minimizer assumptions. Under the Polyak-Łojasiewicz (PL) inequality, we obtain linear-to-neighborhood convergence in the same stochastic quantized setting. Our results match the best-known centralized stochastic rates in oracle complexity, and are supported by experiments demonstrating the predicted tradeoffs between quantization level, step-size choice, and graph structure.
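
To make "random (unbiased) quantization" concrete, here is a minimal sketch of one common unbiased quantizer, stochastic rounding to a uniform grid. The abstract does not specify which quantizer q-PDGD uses, so the grid spacing `step`, the function name `stochastic_quantize`, and the test values are assumptions for illustration only.

```python
import math
import random


def stochastic_quantize(x: float, step: float, rng: random.Random) -> float:
    """Round x to a multiple of `step`, up or down at random, so that E[Q(x)] = x."""
    lower = math.floor(x / step) * step
    p_up = (x - lower) / step          # probability of rounding up
    return lower + step if rng.random() < p_up else lower


rng = random.Random(0)                  # fixed seed for reproducibility
x, step = 0.3, 0.5
samples = [stochastic_quantize(x, step, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
print(mean)
```

Each quantized value lies on the grid, so it can be sent with a finite number of bits, while the average over many draws stays close to the original value. A coarser grid (larger `step`) keeps the quantizer unbiased but increases its variance, which is one way to read the "quantization distortion" term mentioned in the abstract.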

2605.04893 2026-06-11 cs.LG cs.CL stat.ML 版本更新

Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

自注意力作为传输:对称谱诊断的极限

Dominik Dahlem, Diego Maniloff, Mac Misiura

AI总结 研究语言模型注意力路由的两种失效形状(过度集中或过度分散),证明对称谱诊断对方向不敏感,并揭示因果注意力中传输容量的理论下限,提出基于容量和方向的双轴诊断方法。

详情
Comments
48 pages, 6 figures, 7 tables; 81-page online supplement (proofs, additional experiments, dataset statistics) as an ancillary file
AI中文摘要

当语言模型处理幻觉响应时,其注意力路由往往以两种形状之一失效:过度集中在狭窄的位置集合上,或者分散得如此广泛以至于相关性被稀释,而失效的形状携带诊断信号。我们研究这些形状作为诊断特征,从在基准标记响应的\emph{强制评分}下计算的注意力矩阵中得出,而不是在实时生成期间。一类广泛使用的谱方法分析度归一化注意力算子的对称分量,该算子控制传输\emph{容量};我们证明该算子的每个转置不变谱诊断在结构上是\emph{方向盲的}(它无法区分算子与其转置,因此无法检测信息流方向),并且盲定理的逆定理将任何Lipschitz诊断的转置敏感性限制为不对称系数$G$。将其与规范因果架构的闭式二分-Cheeger景观配对,我们证明均匀因果注意力满足一个与$n$无关的下界$\phi \ge 1/5$,而窗口注意力以$O(w/n)$穿透下界;失效模式在形状上不同,而不仅仅在数值上不同。这个下界是一个理想化架构的基准,而不是经验吸引子:穿透它的真实注意力头的比例本身就是一个架构特征。由此产生的双轴诊断($\phi$表示容量,$G$表示方向)产生一个可证伪的极性预测:瓶颈主导和分散主导的基准应表现出相反的极性。在长度控制评估下,传输特征在测试的仅解码器、仅编码器和编码器-解码器模型中保持可解释的信号(0.62-0.84 LC-AUROC),极性在HaluEval和MedHallu之间如预测般反转。

英文摘要

When a language model processes a hallucinated response, its attention routing tends to fail in one of two shapes: over-concentrating on a narrow set of positions, or spreading so diffusely that relevance is diluted, and the shape of the failure carries diagnostic signal. We study these shapes as a diagnostic characterization, computed from attention matrices under \emph{forced scoring} of benchmark-labeled responses rather than during live generation. A widely used family of spectral methods analyzes the symmetric component of the degree-normalized attention operator, which governs transport \emph{capacity}; we prove that every transpose-invariant spectral diagnostic of this operator is structurally \emph{orientation-blind} (it cannot distinguish an operator from its transpose, and therefore cannot detect information-flow direction), with a converse to the blindness theorem bounding any Lipschitz diagnostic's transpose sensitivity by the asymmetry coefficient $G$. Pairing this with a closed-form bipartite-Cheeger landscape for canonical causal architectures, we show that uniform causal attention satisfies an $n$-independent floor $\phi \ge 1/5$, while window attention pierces the floor as $O(w/n)$; failure modes are shape-different, not just value-different. This floor is an idealized-architecture benchmark, not an empirical attractor: the fraction of real attention heads that pierce it is itself an architectural signature. The resulting two-axis diagnostic ($\phi$ for capacity, $G$ for direction) yields a falsifiable polarity prediction: bottleneck- and diffuse-dominated benchmarks should exhibit opposite polarity. Under length-controlled evaluation, transport features retain interpretable signal (0.62-0.84 LC-AUROC) across the tested decoder-only, encoder-only, and encoder-decoder models, with polarity reversing as predicted between HaluEval and MedHallu.

2606.08493 2026-06-11 q-bio.GN cs.LG stat.ML 版本更新

Querying Counterfactuals on Tissue Graphs with Supervised Disentanglement

在组织图上通过监督解缠查询反事实

Abdul Moeed, Stefan Schrod, Martin Rohbeck, Marc Jan Bonder, Pavlo Lutsik, Oliver Stegle, Daniel Dimitrov

AI总结 本文形式化组织图反事实为空间干预,提出Cellina框架通过监督解缠分解细胞内在状态与空间上下文,用于反事实预测,在结直肠癌和小鼠大脑数据上优于现有方法。

详情
AI中文摘要

组织图反事实所问的是:当细胞的空间邻域上下文发生改变时,其表达将如何变化。这类查询对于预测组织中的细胞行为至关重要,但缺乏统一定义,现有方法要么只针对特定干预类型,要么将细胞视为独立同分布。在这项工作中,我们首先将组织图反事实形式化为一类空间干预,这些干预要么重新连接细胞之间的边(边扰动),要么修改其邻居的表达(节点扰动)。然后,我们介绍Cellina(https://cellina.readthedocs.io),一个使用监督解缠将细胞内在状态从其空间上下文中分解出来的框架,并将后者作为反事实预测的条件输入。在涵盖结直肠癌和小鼠大脑中超过250万个空间分辨细胞的基准测试中,Cellina在计算机模拟(in silico)图扰动、解缠和可扩展性方面优于空间感知和非空间的对比方法。此外,我们展示了Cellina能以无监督方式揭示生物学上不同的癌症子域,并支持靶向邻居扰动模拟。

英文摘要

Tissue graph counterfactuals ask how a cell's expression would change under altered spatial neighbor contexts. Such queries are central to predicting cell behavior in tissues, but lack a unified definition, with existing methods targeting specific intervention types or treating cells as i.i.d. In this work, we first formalize tissue graph counterfactuals as a class of spatial interventions that either rewire connections between cells (edge perturbation) or modify the expression of their neighbors (node perturbation). We then introduce Cellina (https://cellina.readthedocs.io), a framework that uses supervised disentanglement to decompose a cell's intrinsic state from its spatial context, using the latter as a conditioning input for counterfactual predictions. Across benchmarks spanning over 2.5 million spatially-resolved cells in colorectal cancer and mouse brain, Cellina outperforms spatially-informed and non-spatial competitors in in-silico graph perturbations, disentanglement, and scalability. Additionally, we show that Cellina reveals biologically distinct cancer subdomains in an unsupervised manner and enables targeted neighbor perturbation simulations.

2606.05551 2026-06-11 stat.ML cs.AI cs.LG 版本更新

Conformal Risk-Averse Decision Making with Action Conditional Guarantee

具有行动条件保证的共形风险规避决策

Zihan Zhu, Shayan Kiyani, George Pappas, Hamed Hassani

AI总结 提出行动条件共形预测方法,通过分位数损失最小化算法实现行动条件风险价值优化,在有限样本下提供行动条件安全保证。

详情
AI中文摘要

由机器学习模型驱动的可靠决策管道需要具有明确安全保证的不确定性量化(UQ)方法。共形预测通过将ML预测包装成预测集来提供这种UQ,而Kiyani等人(2025b)的最新工作表明,这些集合可以转化为最优的风险规避决策策略——但仅继承边际安全保证。我们通过以下方式推广并加强了他们的结果:(i)引入行动条件共形预测,该预测产生明确条件于决策者所采取的每个行动的安全保证;(ii)表明行动条件预测集可作为风险规避决策者旨在优化行动条件风险价值的可行决策空间的代理;(iii)提出一种基于分位数损失最小化的原则性有限样本算法,将Gibbs等人(2025)的框架与行动条件保证联系起来。在两个真实世界数据集上的实验证实,我们的方法在行动条件性能上显著优于共形基线。

英文摘要

Reliable decision making pipelines powered by machine learning models require uncertainty quantification (UQ) methods that come with explicit safety guarantees. Conformal prediction provides such UQ by wrapping ML predictions into prediction sets, and recent work by Kiyani et al. (2025b) established that these sets can be translated into optimal risk-averse decision policies -- yet only inheriting marginal safety guarantees. We generalize and strengthen their results by (i) introducing action-conditional conformal prediction, which yields safety guarantees conditioned explicitly on each action taken by the decision maker, (ii) showing that action-conditional prediction sets serve as a proxy for the feasible decision space for risk-averse decision makers aiming to optimize action-conditional value-at-risk, and (iii) proposing a principled finite-sample algorithm based on pinball-loss minimization, connecting the framework of Gibbs et al. (2025) to action-conditional guarantees. Experiments on two real-world datasets confirm that our approach significantly improves action-conditional performance over conformal baselines.
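
The abstract refers to pinball-loss minimization. The sketch below shows only the basic fact that minimizing the pinball loss over a constant recovers an empirical quantile. It is not the authors' action-conditional algorithm, and the calibration scores and the level `tau = 0.85` are hypothetical.

```python
def pinball_loss(q: float, ys, tau: float) -> float:
    """Total pinball (quantile) loss of the constant prediction q at level tau."""
    return sum(tau * (y - q) if y >= q else (1.0 - tau) * (q - y) for y in ys)


def best_constant(ys, tau: float) -> float:
    """Minimise the pinball loss over the observed values (a minimiser always lies there)."""
    return min(ys, key=lambda q: pinball_loss(q, ys, tau))


scores = [float(v) for v in range(1, 11)]   # hypothetical calibration scores 1..10
q_hat = best_constant(scores, tau=0.85)
print(q_hat)
```

Because the loss is convex and piecewise linear with kinks at the data points, searching over the observed values is enough to find a minimizer. In a conformal pipeline a threshold of this kind would typically be applied to nonconformity scores to form prediction sets; that step is not shown here.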

2605.22346 2026-06-11 stat.ML cs.LG cs.SI 版本更新

The ASE-LSE Disagreement Landscape: An End-to-End Characterisation of Extremes and Structural Drivers

ASE-LSE分歧全景:极值与结构驱动因素的端到端刻画

Minh Triet Pham, Ian Gallagher

AI总结 本文端到端地刻画了邻接谱嵌入(ASE)与拉普拉斯谱嵌入(LSE)潜在子空间的分歧:证明二者在正则图或二部双正则图上完全一致,证明不存在达到最大分歧的图,并给出以度异质性和特征间隙为主要结构因素的正则性偏离界。

详情
Comments
This paper is being withdrawn as it was submitted without the consent of all listed authors, and contains work that is currently under academic assessment. It will be resubmitted at an appropriate time once evaluation is complete
AI中文摘要

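邻接谱嵌入(ASE)和拉普拉斯谱嵌入(LSE)是图数据分析中最常用的两种方法,但应用于同一张图时常产生不同结果,而这种分歧背后的结构原因仍未被完全理解。本文对ASE-LSE潜在子空间分歧给出了端到端的刻画。我们首先证明:只要拉普拉斯矩阵是邻接矩阵的标量倍数,两种方法在每个嵌入维度上都产生相同的潜在子空间;并证明这一标量关系成立当且仅当图是正则图或二部双正则图。这一锚定结果给出了完全一致的一个充分条件,确定了分歧谱的下端,并为扰动分析提供了基线。随后我们证明不存在达到最大分歧的图或图族:分歧始终严格低于其理论上限,并且我们构造了一个见证图族,表明有限最大值无法取到,因此分歧全景没有最大化者。在确立两个端点之后,我们推导出一个正则性偏离界(Regularity Departure Bound),其两项分别刻画度异质性和特征间隙,二者是中间区域影响分歧的主要结构因素。在数千个模拟图上的实证验证证实了该界所预测的机制:异质性推高分歧,特征间隙抑制分歧,二者的比值可作为ASE-LSE分歧的统一预测量,提示两种嵌入何时可以互换、何时不能。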

英文摘要

Two of the most widely used methods for analysing graph data, Adjacency Spectral Embedding and Laplacian Spectral Embedding, often produce different results when applied to the same graph. Yet the structural reasons behind this disagreement remain incompletely understood. This paper provides an end-to-end account of ASE-LSE latent subspace disagreement. We first prove that the two methods produce identical latent subspaces for every embedding dimension whenever the Laplacian is a scalar multiple of the adjacency matrix, and show that this scalar relationship holds if and only if the graph is either regular or bipartite biregular. This anchor result identifies a sufficient condition for perfect agreement that pins down the floor of the disagreement spectrum and supplies the baseline for the perturbation analysis. We then prove that no maximal-disagreement graph or family of graphs exists: the disagreement is always strictly below its theoretical ceiling, and we exhibit a witness family demonstrating that no finite maximum is attainable, so the disagreement landscape has no maximiser. With both endpoints established, we derive a Regularity Departure Bound whose two terms isolate degree heterogeneity and eigengap as the primary structural factors influencing disagreement in the middle regime. Empirical validation across thousands of simulated graphs confirms the mechanisms predicted by the bound: heterogeneity pushes disagreement up, eigengap suppresses it, and their joint ratio emerges as a unified predictor of ASE-LSE disagreement, suggesting when the two embeddings can be treated as interchangeable and when they cannot.

2603.12901 2026-06-11 stat.ML cond-mat.dis-nn cs.IT cs.LG 版本更新

A theory of learning data statistics in diffusion models, from easy to hard

扩散模型中学习数据统计的理论:从容易到困难

Lorenzo Bardone, Claudia Merger, Sebastian Goldt

AI总结 本文研究了扩散模型在学习数据统计量时的分布简单性偏差,揭示了学习成对(pairwise)统计量与高阶统计量所需样本复杂度的差异,并引入了扩散信息指数这一不变量。

详情
AI中文摘要

尽管扩散模型已成为一类强大的生成模型,但其学习动态仍未被充分理解。我们首先通过实验表明,在自然图像上训练的标准扩散模型表现出分布简单性偏差:先学习简单的成对(pairwise)输入统计量,再专门化到高阶相关性。我们在最小数据模型(混合累积量模型,mixed cumulant model)上训练的简单去噪器中重现了这一行为,该模型允许我们精确控制输入的成对相关性和高阶相关性。我们识别出该模型的一个标量不变量,它决定了学习成对和高阶相关性的样本复杂度,我们称之为扩散信息指数,类比于其他学习范式中的相关不变量。利用这一不变量,我们证明去噪器以线性样本复杂度学习输入的简单成对统计量,而更复杂的高阶统计量(如四阶累积量)至少需要立方样本复杂度。我们还证明,如果成对统计量和高阶统计量共享相关的潜在结构,则学习四阶累积量的样本复杂度是线性的。本文描述了扩散模型如何学习复杂度递增的分布的一个关键机制。

英文摘要

While diffusion models have emerged as a powerful class of generative models, their learning dynamics remain poorly understood. We address this issue first by empirically showing that standard diffusion models trained on natural images exhibit a distributional simplicity bias, learning simple, pair-wise input statistics before specializing to higher-order correlations. We reproduce this behaviour in simple denoisers trained on a minimal data model, the mixed cumulant model, where we precisely control both pair-wise and higher-order correlations of the inputs. We identify a scalar invariant of the model that governs the sample complexity of learning pair-wise and higher-order correlations that we call the diffusion information exponent, in analogy to related invariants in different learning paradigms. Using this invariant, we prove that the denoiser learns simple, pair-wise statistics of the inputs at linear sample complexity, while more complex higher-order statistics, such as the fourth cumulant, require at least cubic sample complexity. We also prove that the sample complexity of learning the fourth cumulant is linear if pair-wise and higher-order statistics share a correlated latent structure. Our work describes a key mechanism for how diffusion models can learn distributions of increasing complexity.

2603.08558 2026-06-11 cs.LG stat.ML 版本更新

Impact of Connectivity on Laplacian Representations in Reinforcement Learning

连通性对强化学习中拉普拉斯表示的影响

Tommaso Giorgi, Pierriccardo Olivieri, Keyue Jiang, Laura Toni, Matteo Papini

AI总结 本文研究了连通性对强化学习中拉普拉斯表示的误差影响,通过分析状态图的代数连通性,推导了线性价值函数近似误差的上界,并展示了表示学习管道中的端到端误差分解。

详情
AI中文摘要

在马尔可夫决策过程(MDP)中学习紧凑的状态表示,对于应对大规模强化学习(RL)问题中的维度灾难至关重要。现有的有原则的方法利用MDP的结构先验,将状态表示构造为状态图拉普拉斯特征向量的线性组合。当转移图未知或状态空间过大时,可通过样本轨迹直接估计图谱特征。本文证明了在学习到的谱特征下线性价值函数近似误差的上界,并展示了该误差如何随状态图的代数连通性变化,从而将近似质量与MDP的拓扑结构联系起来。我们进一步给出了由特征向量估计本身引入的误差的界,得到贯穿表示学习管道的端到端误差分解。此外,我们针对RL设置给出的拉普拉斯算子表达式虽与现有表达式等价,却能避免一些常见误解,我们还列举了文献中出现此类误解的若干例子。我们的结果适用于一般的(非均匀)策略,无需对诱导转移核的对称性做任何假设。我们通过网格世界环境中的数值模拟验证了理论发现。

英文摘要

Learning compact state representations in Markov Decision Processes (MDPs) has proven crucial for addressing the curse of dimensionality in large-scale reinforcement learning (RL) problems. Existing principled approaches leverage structural priors on the MDP by constructing state representations as linear combinations of the state-graph Laplacian eigenvectors. When the transition graph is unknown or the state space is prohibitively large, the graph spectral features can be estimated directly via sample trajectories. In this work, we prove an upper bound on the approximation error of linear value function approximation under the learned spectral features. We show how this error scales with the algebraic connectivity of the state-graph, grounding the approximation quality in the topological structure of the MDP. We further bound the error introduced by the eigenvector estimation itself, leading to an end-to-end error decomposition across the representation learning pipeline. Additionally, our expression of the Laplacian operator for the RL setting, although equivalent to existing ones, prevents some common misunderstandings, of which we show some examples from the literature. Our results hold for general (non-uniform) policies without any assumptions on the symmetry of the induced transition kernel. We validate our theoretical findings with numerical simulations on gridworld environments.

2604.27442 2026-06-11 math.ST stat.ML 版本更新

Bayesian online learning in the one-pass regime: Frequentist validity and uncertainty quantification

单次遍历下的贝叶斯在线学习:频率有效性及不确定性量化

Jeyong Lee, Junhyeok Choi, Dongguen Kim, Minwoo Chae

AI总结 提出一种针对单次遍历的贝叶斯在线学习算法,通过预热阶段确保稳定更新,证明后验达到最优收敛率并建立在线Bernstein-von Mises定理,实现无需小批量样本量发散的不确定性量化。

详情
Comments
52 pages
AI中文摘要

贝叶斯在线学习为序贯推理提供了一个连贯的框架。然而,其理论理解仍然有限,特别是在单次遍历设置中。现有的理论保证通常要求小批量样本量发散,这一条件在单次遍历机制下无法满足。在本文中,我们提出了一种针对单次遍历设置量身定制的新贝叶斯在线学习算法,该算法包含一个预热阶段以确保稳定的序贯更新。对于该算法,我们证明了序贯更新的后验达到了最优收敛率。在此基础上,我们建立了Bernstein-von Mises定理的在线类比,该定理保证了在没有发散的小批量样本量的情况下有效的不确定性量化。我们的分析基于一个新颖的理论框架,该框架与在线学习文献中的现有方法有根本不同。在广义线性模型上的数值实验表明,所提出的方法匹配了批处理估计器的性能,同时优于现有的在线程序。

英文摘要

Bayesian online learning provides a coherent framework for sequential inference. However, its theoretical understanding remains limited, particularly in the one-pass setting. Existing theoretical guarantees typically require the mini-batch sample size to diverge, a condition that fails in the one-pass regime. In this paper, we propose a new Bayesian online learning algorithm tailored to the one-pass setting, which incorporates a warm-start phase to ensure stable sequential updates. For this algorithm, we show that the sequentially updated posterior attains the optimal convergence rate. Building on this, we establish an online analogue of the Bernstein-von Mises theorem, which guarantees valid uncertainty quantification without diverging mini-batch sample sizes. Our analysis is based on a novel theoretical framework that differs fundamentally from existing approaches in the online learning literature. Numerical experiments on generalized linear models show that the proposed method matches the performance of the batch estimator while outperforming existing online procedures.

2603.09276 2026-06-11 stat.ML cs.LG 版本更新

On Regret Bounds of Thompson Sampling for Bayesian Optimization

关于贝叶斯优化中汤普森采样遗憾界的分析

Shion Takeno, Shogo Iwazaki

AI总结 本文针对高斯过程汤普森采样(GP-TS)方法,在目标函数为GP样本路径的假设下,推导了其遗憾下界、累积遗憾二阶矩上界、期望宽松遗憾上界以及改进的累积遗憾上界,填补了GP-TS在高概率遗憾界方面的空白。

详情
Comments
43 pages, Accepted to ICML 2026
AI中文摘要

我们研究了一种广泛使用的贝叶斯优化方法,即高斯过程汤普森采样(GP-TS),假设目标函数是高斯过程的一个样本路径。与已具备高概率和期望遗憾界的高斯过程上置信界(GP-UCB)相比,GP-TS的大多数分析仅限于期望遗憾。此外,最近关于GP-UCB的宽松遗憾和改进的累积遗憾上界的分析是否能应用于GP-TS仍不清楚。为了填补这些空白,本文给出了几个遗憾界:(i) GP-TS的遗憾下界,这意味着GP-TS以概率$\delta$遭受对$1/\delta$的多项式依赖;(ii) 累积遗憾二阶矩的上界,由此可直接得到关于$\delta$的改进遗憾上界;(iii) 期望宽松遗憾上界;(iv) 关于时间范围$T$的改进累积遗憾上界。在此过程中,我们提供了几个有用的引理,包括放宽近期分析中为获得关于$T$的改进遗憾上界所需的必要条件。

英文摘要

We study a widely used Bayesian optimization method, Gaussian process Thompson sampling (GP-TS), under the assumption that the objective function is a sample path from a GP. Compared with the GP upper confidence bound (GP-UCB) with established high-probability and expected regret bounds, most analyses of GP-TS have been limited to expected regret. Moreover, whether the recent analyses of GP-UCB for the lenient regret and the improved cumulative regret upper bound can be applied to GP-TS remains unclear. To fill these gaps, this paper shows several regret bounds: (i) a regret lower bound for GP-TS, which implies that GP-TS suffers from a polynomial dependence on $1/\delta$ with probability $\delta$, (ii) an upper bound of the second moment of cumulative regret, which directly suggests an improved regret upper bound on $\delta$, (iii) expected lenient regret upper bounds, and (iv) an improved cumulative regret upper bound on the time horizon $T$. Along the way, we provide several useful lemmas, including a relaxation of the necessary condition from recent analysis to obtain improved regret upper bounds on $T$.

2601.21817 2026-06-11 stat.ML cs.LG 版本更新

A Judge-Aware Ranking Framework for Evaluating Large Language Models without Ground Truth

一种面向评委的排名框架:无需真实标签评估大语言模型

Mingyuan Xu, Xinzi Tan, Jiawei Wu, Doudou Zhou

AI总结 本文提出一种面向评委的排名框架,通过引入评委特定的辨别参数扩展Bradley-Terry-Luce模型,在不参考标签的情况下联合估计潜在模型质量和评委可靠性,从而提高人类偏好的一致性,提高数据效率,并产生校准的不确定性量化。

详情
AI中文摘要

在缺乏真实标签的开放性任务上评估大语言模型(LLMs),越来越多地通过LLM-as-a-judge范式进行。一个关键但未充分建模的问题是,充当评委的LLM在可靠性上存在显著差异;对所有评委一视同仁会导致有偏的排行榜和误导性的不确定性估计。在聚合方式设定错误的情况下,更多的数据可能使评估更加自信地出错。我们提出了一种面向评委的排名框架,通过引入评委特定的辨别参数扩展Bradley-Terry-Luce模型,在没有参考标签的情况下从成对比较中联合估计潜在模型质量和评委可靠性。我们建立了在自然归一化意义下的可识别性,并证明了最大似然估计量的一致性和渐近正态性,从而能够为分数差异和排名比较构造置信区间。在多个公开基准和一个新收集的数据集上,我们的方法提高了与人类偏好的一致性,比无权基线实现了更高的数据效率,并产生了校准的LLM排名不确定性量化。

英文摘要

Evaluating large language models (LLMs) on open-ended tasks without ground-truth labels is increasingly done via the LLM-as-a-judge paradigm. A critical but under-modeled issue is that judge LLMs differ substantially in reliability; treating all judges equally can yield biased leaderboards and misleading uncertainty estimates. More data can make evaluation more confidently wrong under misspecified aggregation. We propose a judge-aware ranking framework that extends the Bradley-Terry-Luce model by introducing judge-specific discrimination parameters, jointly estimating latent model quality and judge reliability from pairwise comparisons without reference labels. We establish identifiability up to natural normalizations and prove consistency and asymptotic normality of the maximum likelihood estimator, enabling confidence intervals for score differences and rank comparisons. Across multiple public benchmarks and a newly collected dataset, our method improves agreement with human preferences, achieves higher data efficiency than unweighted baselines, and produces calibrated uncertainty quantification for LLM rankings.
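A minimal sketch of how a judge-specific discrimination parameter can enter a Bradley-Terry-Luce model. The abstract does not give the exact parameterization, so the logistic form below, and the names `win_probability` and `negative_log_likelihood`, are illustrative assumptions rather than the authors' implementation.

```python
import math

def win_probability(score_i, score_j, discrimination):
    """Assumed form: P(model i preferred over model j by a judge)
    = sigmoid(discrimination * (score_i - score_j)).
    A discrimination of 0 describes a judge whose votes carry no information."""
    return 1.0 / (1.0 + math.exp(-discrimination * (score_i - score_j)))

def negative_log_likelihood(comparisons, scores, discriminations):
    """comparisons: iterable of (judge, winner, loser) index triples.
    scores: latent quality per model; discriminations: reliability per judge."""
    nll = 0.0
    for judge, winner, loser in comparisons:
        p = win_probability(scores[winner], scores[loser], discriminations[judge])
        nll -= math.log(p)
    return nll
```

Minimizing this objective jointly over scores and discriminations needs a normalization (for example, fixing the mean score and the scale of the discriminations), which is consistent with the abstract's statement that identifiability holds only up to natural normalizations.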

2505.15201 2026-06-11 cs.LG cs.AI cs.CL stat.ML 版本更新

Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems

Pass@K 策略优化:解决更困难的强化学习问题

Christian Walder, Deep Karkhanis

AI总结 提出 Pass-at-k 策略优化 (PKPO),通过变换奖励直接优化 pass@k 性能,利用低方差无偏估计器,在训练中退火 k 可同时提升 pass@1 和 pass@k,解决更难问题。

详情
AI中文摘要

强化学习算法对每个问题采样多个 n>1 的解决方案尝试并独立奖励它们。这优化了 pass@1 性能,优先考虑孤立样本的强度,而牺牲了样本集的多样性和集体效用。这未充分利用采样能力,限制了探索和在更难示例上的最终改进。作为修复,我们提出 Pass-at-k 策略优化 (PKPO),一种对最终奖励的变换,导致直接优化 pass@k 性能,从而优化联合考虑时最大化奖励的样本集。我们的贡献是推导出 pass@k 及其梯度在二元和连续奖励设置中的新型低方差无偏估计器。我们展示了使用我们的估计器进行优化简化为标准强化学习,其中奖励经过稳定高效的变换函数联合变换。虽然先前的工作仅限于 k=n,但我们是第一个能够对任意 k ≤ n 实现 pass@k 鲁棒优化的。此外,我们的方法不是以 pass@1 性能换取 pass@k 增益,而是允许在训练中退火 k,同时优化两个指标,通常能在显著 pass@k 增益的同时获得强大的 pass@1 数值。我们在玩具实验上验证了我们的奖励变换,揭示了我们的公式的方差减少特性。我们还使用开源 LLM GEMMA-2 包含了真实世界的例子。我们发现我们的变换有效地优化了目标 k。此外,更高的 k 值能够解决更多和更难的问题,而退火 k 则同时提升了 pass@1 和 pass@k。关键的是,在传统 pass@1 优化停滞的具有挑战性的任务集上,我们的 pass@k 方法解锁了学习,这可能是由于通过优先考虑联合效用而非单个样本的效用实现了更好的探索。

英文摘要

Reinforcement Learning (RL) algorithms sample multiple n>1 solution attempts for each problem and reward them independently. This optimizes for pass@1 performance and prioritizes the strength of isolated samples at the expense of the diversity and collective utility of sets of samples. This under-utilizes the sampling capacity, limiting exploration and eventual improvement on harder examples. As a fix, we propose Pass-at-k Policy Optimization (PKPO), a transformation on the final rewards which leads to direct optimization of pass@k performance, thus optimizing for sets of samples that maximize reward when considered jointly. Our contribution is to derive novel low variance unbiased estimators for pass@k and its gradient, in both the binary and continuous reward settings. We show optimization with our estimators reduces to standard RL with rewards that have been jointly transformed by a stable and efficient transformation function. While previous efforts are restricted to k=n, ours is the first to enable robust optimization of pass@k for any arbitrary k <= n. Moreover, instead of trading off pass@1 performance for pass@k gains, our method allows annealing k during training, optimizing both metrics and often achieving strong pass@1 numbers alongside significant pass@k gains. We validate our reward transformations on toy experiments, which reveal the variance reducing properties of our formulations. We also include real-world examples using the open-source LLM, GEMMA-2. We find that our transformation effectively optimizes for the target k. Furthermore, higher k values enable solving more and harder problems, while annealing k boosts both the pass@1 and pass@k. Crucially, for challenging task sets where conventional pass@1 optimization stalls, our pass@k approach unblocks learning, likely due to better exploration by prioritizing joint utility over the utility of individual samples.
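For orientation, the widely used unbiased pass@k estimator for binary rewards can be sketched as follows. This is the standard combinatorial estimator from the code-generation literature, shown here as an assumption-laden illustration; the abstract does not spell out PKPO's own estimators, which also cover gradients and continuous rewards.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased estimate of pass@k from n sampled attempts, c of them correct.

    It is the probability that a uniformly random size-k subset of the
    n attempts contains at least one correct attempt."""
    if not 0 < k <= n:
        raise ValueError("k must satisfy 0 < k <= n")
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    # comb(n - c, k) is 0 when fewer than k attempts are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With k = 1 this reduces to the plain success rate c / n, which matches the pass@1 objective the abstract contrasts against.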

2512.11081 2026-06-11 stat.ML cs.LG stat.ME 版本更新

Provable Recovery of Locally Important Signed Features and Interactions from Random Forest

从随机森林中可证明地恢复局部重要符号特征和交互

Kata Vuk, Nicolas Alexander Ihlo, Merle Behr

AI总结 提出一种局部、模型特定的特征与交互重要性方法,通过结合全局和局部决策路径模式,在局部尖峰稀疏模型下可证明地恢复真实信号特征及其交互,并识别特征值大小对预测的驱动方向。

详情
AI中文摘要

特征与交互重要性(FII)方法在监督学习中至关重要,用于评估复杂预测模型中输入变量及其交互的相关性。在许多领域,如个性化医疗,通常需要针对单个预测的局部解释,而不是总结整体特征重要性的全局分数。随机森林(RF)在这些场景中被广泛使用,现有的可解释性方法通常利用树结构和分裂统计量来提供模型特定的见解。然而,对RF的局部FII方法的理论理解仍然有限,这使得如何解释单个预测的高重要性分数变得不明确。我们提出了一种新颖的、局部的、模型特定的FII方法,该方法识别特征在决策路径上的频繁共现,将全局模式与特定测试点路径上的模式相结合。我们证明,在局部尖峰稀疏(LSS)模型下,我们的方法一致地恢复真实的局部信号特征及其交互,并识别出大或小的特征值是否驱动预测。通过模拟研究和真实数据示例,我们展示了我们的方法和理论结果的有用性。

英文摘要

Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many domains, such as personalized medicine, local interpretations for individual predictions are often required, rather than global scores summarizing overall feature importance. Random Forests (RFs) are widely used in these settings, and existing interpretability methods typically exploit tree structures and split statistics to provide model-specific insights. However, theoretical understanding of local FII methods for RF remains limited, making it unclear how to interpret high importance scores for individual predictions. We propose a novel, local, model-specific FII method that identifies frequent co-occurrences of features along decision paths, combining global patterns with those observed on paths specific to a given test point. We prove that our method consistently recovers the true local signal features and their interactions under a Locally Spike Sparse (LSS) model and also identifies whether large or small feature values drive a prediction. We illustrate the usefulness of our method and theoretical results through simulation studies and a real-world data example.

2408.07498 2026-06-11 math.AP stat.ML 版本更新

Wasserstein Gradient Flows of MMD Functionals with Distance Kernel and Cauchy Problems on Quantile Functions

距离核MMD泛函的Wasserstein梯度流及分位数函数上的Cauchy问题

Richard Duong, Viktor Stein, Robert Beinert, Johannes Hertrich, Gabriele Steidl

AI总结 研究负距离核下最大均值差异泛函的Wasserstein梯度流,通过将Wasserstein-2空间等距嵌入分位数函数空间,将梯度流转化为L2上的Cauchy问题并给出解公式,证明了流的正则性。

详情
Comments
We corrected the implicit Euler scheme in our code and updated the plots. Also, a minor mistake in the def. (14) and an error in the proof of Thm. 3.5 have been corrected. We thank the anonymous contributors for their valuable feedback, further improving the clarity of the paper. 48 pages, 23 figures, comments welcome!
AI中文摘要

我们全面描述了实直线上最大均值差异(MMD)泛函 $\mathcal F_\nu:= \text{MMD}_K^2(\cdot, \nu)$ 朝向给定目标测度 $\nu$ 的Wasserstein梯度流,其中我们关注负距离核 $K(x,y):= -|x-y|$。在一维情况下,Wasserstein-2空间可以等距嵌入到分位数函数的锥 $\mathcal C(0,1) \subset L_2(0,1)$ 中,从而通过 $L_2(0,1)$ 上相关Cauchy问题的解来刻画Wasserstein梯度流。基于在 $L_2(0,1)$ 上构造 $\mathcal F_\nu$ 的适当对应物及其次微分,我们给出了Cauchy问题的解。对于离散目标测度 $\nu$,这导致一个分段线性解公式。我们证明了流在 $\mathcal C(0,1)$ 子集上的不变性和光滑性。对于某些 $\mathcal F_\nu$ 流,这意味着初始点测度立即变得绝对连续,并随时间保持。最后,我们通过使用隐式欧拉格式的各种数值例子说明了流的行为,该格式可通过二分法轻松计算。对于连续目标 $\nu$,也可以使用显式欧拉格式,尽管收敛保证有限。

英文摘要

We give a comprehensive description of Wasserstein gradient flows of maximum mean discrepancy (MMD) functionals $\mathcal F_\nu:= \text{MMD}_K^2(\cdot, \nu)$ towards given target measures $\nu$ on the real line, where we focus on the negative distance kernel $K(x,y):= -|x-y|$. In one dimension, the Wasserstein-2 space can be isometrically embedded into the cone $\mathcal C(0,1) \subset L_2(0,1)$ of quantile functions leading to a characterization of Wasserstein gradient flows via the solution of an associated Cauchy problem on $L_2(0,1)$. Based on the construction of an appropriate counterpart of $\mathcal F_\nu$ on $L_2(0,1)$ and its subdifferential, we provide a solution of the Cauchy problem. For discrete target measures $\nu$, this results in a piecewise linear solution formula. We prove invariance and smoothing properties of the flow on subsets of $\mathcal C(0,1)$. For certain $\mathcal F_\nu$-flows this implies that initial point measures instantly become absolutely continuous, and stay so over time. Finally, we illustrate the behavior of the flow by various numerical examples using an implicit Euler scheme, which is easily computable by a bisection algorithm. For continuous targets $\nu$, also the explicit Euler scheme can be employed, although with limited convergence guarantees.
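As a small illustration of the functional being minimized, the squared MMD between two empirical measures on the real line with the negative distance kernel $K(x,y) = -|x-y|$ can be computed directly from its definition. This is a sketch under the usual convention $\text{MMD}_K^2(\mu,\nu) = \iint K\,d\mu\,d\mu - 2\iint K\,d\mu\,d\nu + \iint K\,d\nu\,d\nu$; the paper may include an extra constant factor, and the function name `mmd_squared` is hypothetical.

```python
def mmd_squared(xs, ys):
    """Squared MMD between the empirical measures of xs and ys (equal weights)
    for the negative distance kernel K(x, y) = -|x - y|."""
    def mean_kernel(a, b):
        # Average of K over all pairs (u, v) with u in a and v in b.
        return sum(-abs(u - v) for u in a for v in b) / (len(a) * len(b))
    return mean_kernel(xs, xs) - 2.0 * mean_kernel(xs, ys) + mean_kernel(ys, ys)
```

The double loop costs O(len(xs) * len(ys)); in one dimension the same quantity can be obtained after sorting, which is in the spirit of the quantile-function viewpoint described in the abstract.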

2510.02149 2026-06-11 cs.LG math.OC stat.ML 版本更新

Reinforcement Learning with Action-Triggered Observations

具有动作触发观测的强化学习

Alexander Ryabchenko, Wenlong Mou

AI总结 提出动作触发稀疏可追踪MDP框架,推导Bellman方程并证明最优策略存在,利用观测间动作序列的线性表示实现基于回归的方法,在几何分布情节下达到与完全可观测线性MDP匹配的遗憾界。

详情
AI中文摘要

我们引入了动作触发稀疏可追踪马尔可夫决策过程(ATST-MDPs),这是一种用于部分可观测性的强化学习框架,其中完整状态观测在每个步骤以由所选动作决定的概率随机发生。我们推导了针对该设置的Bellman方程,并证明了最优策略的存在性。利用稀疏观测揭示完整状态的事实,我们提供了一个等价公式,其中智能体在连续观测之间承诺动作序列。在线性MDP假设下,我们证明了这些动作序列上的值函数在有限维特征映射中具有线性表示,从而能够使用标准的基于回归的方法。作为一个应用,我们推导了ATST-LSVI-UCB,一种乐观算法,在几何分布的情节学习中实现了遗憾界$\widetilde{O}(\sqrt{Kd^3(1-\gamma)^{-3}})$,其中$K$是情节数,$d$是特征维度,$\gamma$是折扣因子(情节继续概率),与完全可观测线性MDP的已知速率相匹配。

英文摘要

We introduce Action-Triggered Sporadically Traceable Markov Decision Processes (ATST-MDPs), a reinforcement learning framework for partial observability in which full state observations occur stochastically at each step, with probability determined by the chosen action. We derive Bellman equations tailored to this setting and establish the existence of an optimal policy. Exploiting the fact that sporadic observations reveal the full state, we provide an equivalent formulation in which agents commit to action-sequences between consecutive observations. Under the linear MDP assumption, we show that the value function over such action-sequences admits a linear representation in a finite-dimensional feature map, enabling standard regression-based methods. As an application, we derive ATST-LSVI-UCB, an optimistic algorithm achieving regret $\widetilde{O}(\sqrt{Kd^3(1-\gamma)^{-3}})$ for episodic learning with geometrically distributed horizons, where $K$ is the number of episodes, $d$ the feature dimension, and $\gamma$ the discount factor (episode continuation probability), matching the known rate for linear MDPs with full observability.

2508.17077 2026-06-11 stat.ML cs.LG 版本更新

CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference

CP4SBI: 基于模拟推断中可信集的局部共形校准

Luben M. C. Cabezas, Vagner S. Santos, Thiago R. Ramos, Pedro L. C. Rodrigues, Rafael Izbicki

AI总结 提出CP4SBI框架,通过回归树和CDF校准实现局部贝叶斯覆盖,为任意评分函数提供有限样本局部覆盖保证,提升神经后验估计的不确定性量化质量。

详情
AI中文摘要

当前实验科学家越来越依赖基于模拟的推断(SBI)来反演具有难以处理似然的复杂非线性模型。然而,通过SBI获得的后验近似通常校准不佳,导致可信区域对真实参数的覆盖不足。我们开发了$\texttt{CP4SBI}$,一个模型无关的共形校准框架,用于构建具有局部贝叶斯覆盖的可信集。我们提出的两种变体,即通过回归树进行局部校准和基于CDF的校准,为任意评分函数(包括HPD、对称和基于分位数的区域)提供了有限样本局部覆盖保证。在广泛使用的SBI基准上的实验表明,我们的方法提高了基于归一化流和分数扩散建模的神经后验估计器的不确定性量化质量。

英文摘要

Current experimental scientists have been increasingly relying on simulation-based inference (SBI) to invert complex non-linear models with intractable likelihoods. However, posterior approximations obtained with SBI are often miscalibrated, causing credible regions to undercover true parameters. We develop $\texttt{CP4SBI}$, a model-agnostic conformal calibration framework that constructs credible sets with local Bayesian coverage. Our two proposed variants, namely local calibration via regression trees and CDF-based calibration, enable finite-sample local coverage guarantees for any scoring function, including HPD, symmetric, and quantile-based regions. Experiments on widely used SBI benchmarks demonstrate that our approach improves the quality of uncertainty quantification for neural posterior estimators using both normalizing flows and score-diffusion modeling.

2507.21164 2026-06-11 cs.LG cs.AI eess.IV stat.ML 版本更新

OCSVM-Guided Representation Learning for Unsupervised Anomaly Detection

OCSVM引导的无监督异常检测表示学习

Nicolas Pinon (MYRIAD), Robin Trombetta (MYRIAD), Carole Lartizien (MYRIAD)

AI总结 提出一种将表示学习与可解析求解的一类SVM耦合的方法,通过定制损失函数直接对齐潜在特征与决策边界,在MNIST-C和脑MRI病变检测任务上展现了鲁棒性和性能。

详情
AI中文摘要

无监督异常检测(UAD)旨在无需标签数据检测异常,这在许多机器学习应用中是必要的,因为异常样本稀少或不可用。大多数最先进的方法分为两类:基于重构的方法(通常重构异常过于完美)和与密度估计器解耦的表示学习(可能遭受次优特征空间)。虽然一些近期方法尝试耦合特征学习和异常检测,但它们通常依赖替代目标、限制核选择或引入近似,从而限制了表达能力和鲁棒性。为解决这一挑战,我们提出了一种新颖方法,通过自定义损失公式将表示学习与可解析求解的一类SVM(OCSVM)耦合,该损失直接使潜在特征与OCSVM决策边界对齐。该模型在两个任务上评估:基于MNIST-C的新基准,以及具有挑战性的脑MRI细微病变检测任务。与大多数关注图像级别大而高信号病变的方法不同,我们的方法成功针对小而非高信号的病变,同时我们评估体素级别的指标,处理了更具临床相关性的场景。两个实验评估了对领域偏移的鲁棒性形式,包括MNIST-C中的损坏类型以及MRI中的纹理或人群年龄变化。结果展示了我们提出模型的性能和鲁棒性,突显了其在通用UAD和现实医学成像应用中的潜力。源代码可在此https URL获取。

英文摘要

Unsupervised anomaly detection (UAD) aims to detect anomalies without labeled data, a necessity in many machine learning applications where anomalous samples are rare or not available. Most state-of-the-art methods fall into two categories: reconstruction-based approaches, which often reconstruct anomalies too well, and decoupled representation learning with density estimators, which can suffer from suboptimal feature spaces. While some recent methods attempt to couple feature learning and anomaly detection, they often rely on surrogate objectives, restrict kernel choices, or introduce approximations that limit their expressiveness and robustness. To address this challenge, we propose a novel method that couples representation learning with an analytically solvable One-Class SVM (OCSVM), through a custom loss formulation that directly aligns latent features with the OCSVM decision boundary. The model is evaluated on two tasks: a benchmark based on MNIST-C, and a challenging brain MRI lesion detection task. Unlike most methods that focus on large, hyperintense lesions at the image level, our approach succeeds in targeting small, non-hyperintense lesions, while we evaluate voxel-wise metrics, addressing a more clinically relevant scenario. Both experiments evaluate a form of robustness to domain shifts, including corruption types in MNIST-C and texture or population age variations in MRI. Results demonstrate performance and robustness of our proposed model, highlighting its potential for general UAD and real-world medical imaging applications. The source code is available at this https URL.

2505.08784 2026-06-11 stat.ML cs.LG math.ST stat.ME 版本更新

PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework

PCS-UQ:基于可预测性-可计算性-稳定性框架的不确定性量化

Abhineet Agarwal, Fange Xiao, Rebecca Barter, Omer Ronen, Boyu Fan, Bin Yu

AI总结 提出PCS-UQ框架,通过预测检查、bootstrap采样和乘法校准实现不确定性量化,在回归和分类任务中优于或媲美共形预测方法,并提供理论保证。

详情
AI中文摘要

随着机器学习进入高风险领域,可信的不确定性量化对于安全性至关重要。本文基于真实数据科学的可预测性、可计算性和稳定性原则,提出了PCS-UQ框架。从候选模型或算法集开始,PCS-UQ集成了严格的预测检查以筛选出集合中不合适的模型,并利用bootstrap样本来捕获预测检查算法的样本间变异性和算法不稳定性。然后,我们引入了一种新颖的乘法校准方案来增强局部自适应性,这基本上对应于共形预测中的新分数。此外,我们编制了17个真实世界回归数据集,并手动构建了子组。在该基准测试中,PCS-UQ在保持目标覆盖率的同时,在区间宽度上优于或匹配配备有oracle选择算法的共形方法。PCS-UQ实现了一致的子组覆盖率,优于这些oracle选择的共形方法。值得注意的是,PCS-UQ在实现竞争性区间宽度和一致子组覆盖率方面表现出色。在6个分类数据集上,PCS-UQ将预测集大小减少了20%。为了将框架扩展到深度学习,我们提出了计算高效的变体,避免了昂贵的重新训练。在三个计算机视觉基准测试中,这些变体将预测集大小比共形基线减少了20%。最后,我们提供了理论证明,即修改后的PCS-UQ算法在可交换性下作为分割共形推断的一种形式保持了有效的覆盖率。

英文摘要

As machine learning (ML) enters high-stakes domains, trustworthy uncertainty quantification (UQ) is essential for safety. In this paper we introduce PCS-UQ, a framework based on the Predictability, Computability, and Stability (PCS) principles for veridical data science. Starting with a candidate set of models or algorithms, PCS-UQ integrates a rigorous prediction-check to screen out unsuitable models in the set and utilizes bootstrap samples, in order to capture both inter-sample variability and algorithmic instability for the prediction-checked algorithms. We then introduce a novel multiplicative calibration scheme to enhance local adaptivity, which basically corresponds to a new score in conformal prediction. Moreover, we produce a compilation of 17 real-world regression datasets with manually-constructed subgroups. On this benchmark, PCS-UQ maintains the target coverage while outperforming or matching conformal methods equipped with oracle-selected algorithms in interval width. PCS-UQ achieves consistent subgroup coverage, outperforming these oracle-selected conformal methods. Notably, PCS-UQ stands out in achieving both competitive interval widths and consistent subgroup coverage. On 6 classification datasets, PCS-UQ reduces prediction set sizes by 20\%. To scale the framework for deep learning, we propose computationally efficient variants that bypass expensive retraining. On three computer vision benchmarks, these variants reduce prediction set sizes by 20\% over conformal baselines. Finally, we provide theoretical proof that a modified PCS-UQ algorithm preserves valid coverage under exchangeability as a form of split conformal inference.

2410.24145 2026-06-11 stat.ML cs.LG stat.ME 版本更新

Projected random forests and conformal prediction of circular data

投影随机森林与圆形数据的共形预测

Paulo C. Marques F., Rinaldo Artes, Helton Graziadei

AI总结 针对圆形响应回归问题,应用共形预测技术,通过投影方法将线性回归模型转换为圆形模型,并利用随机森林的袋外机制避免额外校准样本,生成具有有限样本覆盖保证和自适应弧长的预测集。

详情
Comments
7 pages; 4 figures
AI中文摘要

我们将共形预测技术应用于具有圆形响应的回归问题,在数据可交换性假设下,为任何圆形预测模型生成具有自适应弧长和有限样本覆盖保证的预测集。利用现有为线性响应设计的高性能预测模型,我们分析了一种通用的投影过程,将任何线性响应回归模型转换为适用于圆形响应的模型。当在此投影过程中使用随机森林作为基模型时,我们利用随机森林的袋外机制,在构建预测集时无需单独的校准样本。在合成和真实数据集上,与两种现有替代模型生成的分割共形预测集相比,所得的投影随机森林模型产生了更高效的袋外共形预测集,中位弧长更短。

英文摘要

We apply conformal prediction techniques to regression problems with circular responses, producing prediction sets with adaptive arc length and finite-sample coverage guarantees for any circular predictive model under the assumption of data exchangeability. Leveraging the high performance of existing predictive models designed for linear responses, we analyze a general projection procedure that converts any linear-response regression model into one suitable for circular responses. When random forests are used as base models in this projection procedure, we leverage the random forest out-of-bag mechanism to eliminate the need for a separate calibration sample in the construction of prediction sets. On synthetic and real datasets, the resulting projected random forest model produces more efficient out-of-bag conformal prediction sets, with shorter median arc length, than the split conformal prediction sets generated by two existing alternative models.
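To make the idea of a conformal prediction set on the circle concrete, here is a minimal split-conformal sketch with an angular-distance score. It deliberately uses a separate calibration sample and a constant arc radius, so it is a simplified baseline and not the out-of-bag, adaptive-arc-length procedure of the paper; all function names are hypothetical and angles are assumed to be in radians.

```python
import math

def angular_distance(a, b):
    """Shortest arc length between two angles, a value in [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def conformal_radius(cal_preds, cal_truths, alpha):
    """Split-conformal arc radius from calibration predictions and truths.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest score; if that rank
    exceeds n, the prediction set is the whole circle (radius pi)."""
    scores = sorted(angular_distance(p, t) for p, t in zip(cal_preds, cal_truths))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.pi
    return scores[rank - 1]

def in_prediction_arc(pred, radius, angle):
    """True if angle lies in the arc of half-width radius centred at pred."""
    return angular_distance(pred, angle) <= radius
```

Under exchangeability of calibration and test points, the arc built this way covers the true angle with probability at least 1 - alpha, which is the finite-sample guarantee the abstract refers to.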

8. 生物统计与医学统计 7 篇

2606.11439 2026-06-11 stat.ME 新提交

A Likelihood Ratio Testing Approach for Interval-Censored Data

区间删失数据的似然比检验方法

Yuan Wu, Susan Halabi

AI总结 针对区间删失数据,提出基于样条筛的稳健似然比检验,解决Wald检验在小样本中的不稳定性,理论推导渐近分布,模拟和实例验证其优越性。

详情
AI中文摘要

区间删失数据在临床研究中经常出现,其中事件时间仅已知落在特定的评估窗口内。尽管Cox比例风险模型是处理此类数据的标准方法,但现有的Wald型检验在小样本中常常存在不稳定性或性能较差。在本文中,我们提出了一种基于样条筛的稳健似然比检验,用于区间删失数据。我们开发了一个计算高效的估计框架,确保了数值稳定性。此外,我们严格建立了所提出的似然比统计量的渐近分布,为统计推断提供了坚实的理论基础。广泛的模拟研究表明,与传统方法相比,我们的方法实现了更优的错误控制和更高的功效。通过一个真实临床数据集的分析,进一步说明了该方法的实用性。

英文摘要

Interval-censored data frequently arise in clinical research where event times are only known to fall within specific assessment windows. Although the Cox proportional hazards model is a standard approach for such data, existing Wald-type tests often suffer from instability or poor performance in small samples. In this paper, we propose a robust spline-sieve-based likelihood ratio test for interval-censored data. We develop a computationally efficient estimation framework that ensures numerical stability. Furthermore, we rigorously establish the asymptotic distribution of the proposed likelihood ratio statistic, providing a solid theoretical foundation for statistical inference. Extensive simulation studies demonstrate that our approach achieves superior error control and higher power compared with traditional approaches. The practical utility of the method is further illustrated through the analysis of a real-world clinical dataset.

2606.11414 2026-06-11 stat.ME 新提交

Group Sequential Sample Size for Comparing Two Survival Probabilities at a Specific Time Point

比较特定时间点两个生存概率的组序贯样本量

Susan Halabi, Lu Liu, Chenxi Yu, Yuan Wu

AI总结 提出一种新方法,在固定和组序贯试验设计中同时确定检验两个生存概率的样本量,控制I类错误,适用于比例风险假设不成立或含新辅助治疗的随机试验。

详情
AI中文摘要

我们提出了一种新方法,可同时在固定和组序贯试验设计中确定在预先指定时间点比较两个生存概率所需的样本量,并保证I类错误得到控制。在不同假设差异、失效分布、删失比例和名义功效下的模拟显示出一致的性能,而中期分析则突出了每次分析时降低的I类错误和增加的功效,无论潜在的失效时间分布或花费函数如何。重要的是,我们的方法特别适用于评估随机试验中固定时间的生存结局,其中一组治疗包括术前新辅助治疗,而另一组仅进行手术。此外,当比例风险假设不满足时,该方法也具有优势,这常见于具有延迟或时变治疗效果或生存曲线交叉的免疫治疗试验中。该方法也适用于随机II期试验,其中较小的样本量和中间或替代时间至事件终点的使用要求高效的数据利用和稳健的错误控制。我们通过肾癌和前列腺癌的实例说明了该方法。附带的R Shiny应用程序使研究者能够交互式地计算样本量,从而促进不同环境下的实际试验规划。

英文摘要

We propose a novel method that simultaneously determines the sample size for testing two survival probabilities at a pre-specified time while guaranteeing type I error control in both fixed and group-sequential trial designs. Simulations across varying hypothesized differences, failure distributions, censoring proportions, and nominal powers demonstrate consistent performance, while interim analyses highlight reduced type I error and increased power at each look, regardless of the underlying failure time distribution or spending function. Importantly, our method is especially useful for evaluating survival outcomes at a fixed time in randomized trials where one treatment arm includes neoadjuvant therapy prior to surgery while the other involves surgery alone. Furthermore, it is advantageous when the proportional hazards assumption is not satisfied, as often occurs in immunotherapy trials with delayed or time-varying treatment effects or crossing survival curves. The method is also applicable to randomized phase II trials, where smaller sample sizes and the use of intermediate or surrogate time-to-event endpoints demand efficient data use and robust error control. We illustrate the approach with motivating examples in renal and prostate cancer. An accompanying R Shiny application enables investigators to compute sample sizes interactively, facilitating practical trial planning in diverse settings.

2604.23464 2026-06-11 stat.ME stat.AP 版本更新

Design-Based Cross-Validation for Comparing Small Area Estimators

用于比较小区域估计器的基于设计的交叉验证

Qianyu Dong, Zehang Richard Li

AI总结 本文提出一种适用于复杂调查设计的小区域估计器交叉验证框架,通过分解交叉验证平方误差,揭示可识别偏差与不可识别成分,提升模型比较的稳健性和可解释性。

详情
Comments
Previous title: "On cross-validation for small area estimators"
AI中文摘要

地方公共卫生监测常常依赖住户调查,但所需空间分辨率的数据稀少。小区域估计(SAE)方法通过跨区域借用强度和辅助信息解决这一挑战。然而,在缺乏真实数据的情况下,比较这些估计器仍然困难。我们提出了一种适用于复杂调查设计的交叉验证框架,用于评估小区域估计器。我们的方法使能够对区域级和单元级SAE模型进行模型无关的比较。框架的核心是交叉验证平方误差的分解,揭示了可识别偏差和不可识别成分,后者可以被界定。我们的理论结果和模拟研究显示,传统方法如留一区域法交叉验证可能导致误导性的模型排名,而所提方法提供了更稳健和可解释的模型比较,并具有不确定性量化。我们通过比较赞比亚Demographic and Health Surveys中估计的亚国家女性识字率的小区域估计模型,展示了该框架。

英文摘要

Subnational monitoring of public health and development indicators often relies on household surveys where data are sparse at the desired spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and incorporating auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a design-based cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level SAE models. We derive a decomposition of the conditional mean squared error that yields a consistent cross-validation score, show that finite-sample comparisons carry an unidentifiable bias that can be bounded, and use this bound as a principled threshold for ranking models. We further show that leave-one-area-out cross-validation, a popular alternative, targets extrapolation rather than smoothing error and can reverse the correct ranking. We evaluate the framework through extensive design-based simulations. We apply the framework to compare subnational female literacy estimators in Zambia using the 2024 Demographic and Health Survey. The framework applies broadly across prevalence mapping and other SAE problems and is applicable to any small area estimator irrespective of the underlying model class.

2602.00434 2026-06-11 stat.AP 版本更新

How should covariates be handled in randomized trials? Empirical evidence from 50 trials and recommendations for practice

随机试验中应如何处理协变量?来自50项试验的实证证据与实践建议

Yulin Shao, Liangbo Lyu, Menggang Yu, Bingkai Wang

AI总结 本文通过大规模实证研究比较了不同协变量调整策略在随机临床试验中的表现,发现简洁的回归方法在效率提升方面表现优异,而基于机器学习的方法在二元结局中计算稳定性较差。

详情
AI中文摘要

背景和目的:协变量调整可以提高随机临床试验的精度和统计功效,并被主要监管机构推荐。然而,关于不同调整策略在多样化真实世界试验中的表现缺乏实证证据,导致对统计分析计划中应预指定的方法和协变量存在不确定性。我们旨在填补这一空白并提供实用建议。 方法:我们利用50个公开可用的随机试验的个体层面数据(29,094名参与者;574个治疗-结局比较)进行了大规模实证研究。我们比较了常用的协变量调整估计量,包括分析协方差、逆概率加权、g计算和基于机器学习的方法,并结合三种协变量选择策略。性能通过精度提升、点估计变化、计算可靠性以及协变量调整改变统计显著性概率来评估。 结果:协变量调整在大多数情况下提高了精度,连续结局的中位方差减少率为13.3%,二元结局为4.6%。使用少量预指定的预测性协变量的简洁回归方法在小至中等样本中表现与更复杂的方法相当或更好。基于机器学习的估计量在二元结局中未提供额外的精度,并且更易出现计算失败。 结论:在不同试验中,简洁的协变量调整提供了稳定的效率提升,而不引入系统性偏差。这些发现支持在主要试验分析中常规使用协变量调整。所有整理的数据集和分析代码已公开发布,以支持未来临床研究。

英文摘要

Background and Objective: Covariate adjustment can improve precision and power in randomized clinical trials and is recommended by major regulatory agencies. However, there is limited empirical evidence on how different adjustment strategies perform across diverse real-world trials, leaving uncertainty about which methods and covariates should be prespecified in statistical analysis plans. We aim to address this gap and provide practical recommendations. Methods: We conducted a large-scale empirical study using individual-level data from 50 publicly available randomized trials (29,094 participants; 574 treatment-outcome comparisons). We compared commonly used covariate-adjusted estimators, including analysis of covariance, inverse-probability weighting, g-computation, and machine-learning-based approaches, combined with three covariate-selection strategies. Performance was evaluated using precision gains, changes in point estimates, computational reliability, and the probability that covariate adjustment altered statistical significance relative to an unadjusted analysis. Results: Covariate adjustment improved precision in most settings, with a median variance reduction of 13.3\% for continuous outcomes and 4.6\% for binary outcomes. Parsimonious regression approaches using a small prespecified set of prognostic covariates performed as well as or better than more complex methods, particularly in small to medium samples. Machine-learning-based estimators did not provide additional precision and were more prone to computational failure for binary outcomes. Conclusions: Across trials, parsimonious covariate adjustment provided consistent efficiency gains without introducing systematic bias. These findings support routine covariate adjustment in primary trial analyses. All curated datasets and analysis code are openly released to support future clinical research.

2505.00571 2026-06-11 stat.ML cs.LG 版本更新

Discovery and inference beyond linearity for epidemiological data by integrating Bayesian regression, tree ensembles and Shapley values

通过整合贝叶斯回归、树集成和Shapley值对流行病学数据进行线性之外的发现与推断

Giorgio Spadaccini, Marjolein Fokkema, Mark A. van de Wiel

AI总结 提出RuleSHAP框架,结合贝叶斯稀疏回归、改进的树规则生成器和Shapley值,实现非线性与交互效应的检测及个体水平的不确定性量化,应用于流行病学数据发现高胆固醇和血压的影响因素。

详情
AI中文摘要

机器学习在流行病学和医疗健康研究中越来越受欢迎,用于无假设地发现风险和保护因素。机器学习在发现非线性和交互作用方面很强,但这种能力因缺乏可靠的推断而受损。尽管Shapley值提供了特征效应的局部度量,但这些效应通常缺乏有效的不确定性量化,从而排除了统计推断。我们提出RuleSHAP,一个通过结合专用贝叶斯稀疏回归模型、改进的基于树的规则生成器和Shapley值归因来解决这一局限性的框架。RuleSHAP能够检测非线性和交互效应,其关键贡献在于个体水平的不确定性量化。我们推导了一个在该框架内计算边际Shapley值的有效公式。我们将RuleSHAP应用于一个流行病学队列的数据,以检测和推断高胆固醇和血压的几种效应,例如年龄、性别、种族、BMI和血糖水平等特征之间的非线性交互效应。最后,我们在模拟数据上证明了我们框架的有效性。

英文摘要

Machine Learning (ML) is gaining popularity in epidemiology and healthcare studies for hypothesis-free discovery of risk and protective factors. ML is strong at discovering nonlinearities and interactions, but this power is compromised by a lack of reliable inference. Although Shapley values provide local measures of features' effects, valid uncertainty quantification for these effects is typically lacking, thus precluding statistical inference. We propose RuleSHAP, a framework that addresses this limitation by combining a dedicated Bayesian sparse regression model with an improved tree-based rule generator and Shapley value attribution. RuleSHAP provides detection of nonlinear and interaction effects, with uncertainty quantification at the individual level as a key contribution. We derive an efficient formula for computing marginal Shapley values within this framework. We apply RuleSHAP to data from an epidemiological cohort to detect and infer several effects for high cholesterol and blood pressure, such as nonlinear interaction effects between features like age, sex, ethnicity, BMI and glucose level. To conclude, we demonstrate the validity of our framework on simulated data.

1910.07712 2026-06-11 stat.AP stat.CO stat.ME 版本更新

Estimating Spatially-Smoothed Fiber Orientation Distribution from Diffusion-MRI Experiments

从扩散MRI实验估计空间平滑的纤维取向分布

Jilei Yang, Seungyong Hwang, Mengjie Shi, Jie Peng

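AI summary: Proposes the Nearest-Neighbor Adaptive Regression Model (NARM), which estimates the fiber orientation distribution (FOD) in a spatially adaptive way through weighted local likelihood estimation over nested spatial neighborhoods. Voxel-wise rescaling and a data-driven stopping rule guard against over-smoothing, and a configuration-aware strategy selects the similarity-smoothing parameter. NARM improves estimation accuracy in simulations and reproducibility on Human Connectome Project data.
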
AI中文摘要

扩散加权磁共振成像(D-MRI)是一种非侵入性体内技术,用于探测生物组织的微观结构架构。在每个体素处,纤维取向分布(FOD)表征局部纤维构型和方向,因此是D-MRI分析中的核心估计对象。我们提出了最近邻自适应回归模型(NARM),这是一种用于FOD估计的空间自适应框架,它在嵌套的空间邻域上执行加权局部似然估计,其中权重联合编码相邻FOD之间的空间邻近性和相似性,通过最优传输或Hellinger距离测量。为了防止过平滑同时保留结构异质性,我们引入了体素级重缩放方案和基于最小最近邻相异性的数据驱动停止规则。我们进一步开发了一种配置感知策略来选择相似性平滑参数,使平滑强度能够适应局部纤维复杂性。模拟研究表明,相对于体素级方法和现有的空间平滑方法PMARM,NARM提高了FOD估计精度。对人类连接组项目的重测数据的应用还表明,NARM产生了更可重复的FOD估计。实现细节以及模拟和真实数据分析的脚本可在以下网址获得:https://github.com/DMRIdotL/NARM

英文摘要

Diffusion-weighted magnetic resonance imaging (D-MRI) is a noninvasive in vivo technique for probing the microstructural architecture of biological tissues. At each voxel, the fiber orientation distribution (FOD) characterizes local fiber configurations and orientations and is therefore a central object of estimation in D-MRI analysis. We propose the Nearest-Neighbor Adaptive Regression Model (NARM), a spatially adaptive framework for FOD estimation that performs weighted local likelihood estimation over nested spatial neighborhoods, where the weights jointly encode spatial proximity and similarity among neighboring FODs, measured by either the optimal transport or Hellinger distance. To prevent over-smoothing while preserving structural heterogeneity, we introduce a voxel-wise rescaling scheme and a data-driven stopping rule based on minimum nearest-neighbor dissimilarity. We further develop a configuration-aware strategy for selecting the similarity-smoothing parameter, allowing the smoothing strength to adapt to local fiber complexity. Simulation studies demonstrate that NARM improves FOD estimation accuracy relative to voxel-wise methods and the existing spatial smoothing approach PMARM. Application to test-retest data from the Human Connectome Project additionally shows that NARM yields more reproducible FOD estimates. Implementation details and scripts for the simulation and real data analyses are available at https://github.com/DMRIdotL/NARM

2305.09455 2026-06-11 stat.AP 版本更新

A latent class approach to assess the effects of dynamic adherence to polytherapy in heart failure patients

评估心力衰竭患者多药治疗动态依从性影响的潜在类别方法

Nicole Fontana, Laura Savaré, Emanuele Di Angelantonio, Francesca Ieva

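AI summary: Combines latent Markov models with dynamic adherence modeling to analyze polytherapy adherence patterns in heart failure patients and their effect on rehospitalization risk, finding that high adherence significantly lowers that risk.
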
AI中文摘要

心力衰竭(HF)的治疗严重依赖药物治疗,特别是根据临床指南推荐联合使用多种疗法。然而,对规定方案的依从性不佳仍然是一个重大挑战,导致住院率增加和患者预后恶化。本研究引入了一种新颖的方法学流程,将潜在马尔可夫模型(LMM)与动态依从性建模相结合,以评估依从性行为及其对HF再住院的影响。使用意大利伦巴第大区的行政医疗数据,我们分析了2020年7月至12月期间因HF住院的6,818名患者。在六个月的观察期内每月评估依从性,并使用Cox回归将依从性概况与临床结局联系起来。识别出七种潜在行为概况,反映了不同的依从性水平和轨迹。结果显示,较高的依从性水平显著降低了再住院风险。与低依从性患者相比,持续高依从性患者的HF再住院风险降低了56%。重要的是,观察期内依从性的改善与更好的生存概率相关,突显了及时干预的潜在益处。此外,依从性行为受到年龄、合并症负担和观察期内住院等因素的影响。本研究强调了动态和个性化策略在监测和增强多药治疗依从性方面的重要性。通过将依从性模式与临床结局联系起来,所提出的方法为改善患者管理和减轻HF对医疗系统的负担提供了可操作的见解。

英文摘要

Heart failure (HF) treatment relies heavily on pharmacotherapy, particularly combining multiple therapies as recommended by clinical guidelines. However, non-adherence to prescribed regimens remains a significant challenge, contributing to increased hospitalizations and poorer patient outcomes. This study introduces a novel methodological pipeline that integrates Latent Markov Models (LMM) with dynamic adherence modeling to evaluate adherence behaviors and their impact on HF rehospitalization. Using administrative healthcare data from Lombardy, Italy, we analyzed 6,818 patients hospitalized for HF between July and December 2020. Adherence was assessed monthly over a six-month observation period, and adherence profiles were linked to clinical outcomes using Cox regression. Seven latent behavioral profiles were identified, reflecting varying levels and trajectories of adherence. The findings revealed that higher adherence levels significantly reduced the risk of rehospitalization. Patients with consistently high adherence exhibited a 56% lower risk of HF rehospitalization compared to those with low adherence. Importantly, improving adherence during the observation period was associated with better survival probabilities, highlighting the potential benefits of timely interventions. Additionally, adherence behaviors were influenced by factors such as age, comorbidity burden, and hospitalization during the observation period. This study underscores the importance of dynamic and personalized strategies to monitor and enhance adherence to polytherapy. By linking adherence patterns to clinical outcomes, the proposed approach offers actionable insights for improving patient management and reducing the burden of HF on healthcare systems.

9. Statistics in Economics, Finance, and Social Science (6 papers)

2606.12260 2026-06-11 econ.TH cs.AI cs.GT cs.LG stat.ML 新提交

Market Design for AI: Beyond the Copyright Binary

人工智能的市场设计:超越版权二元论

Yan Dai, Maryam Farboodi, Negin Golrezaei, Sepehr Shahshahani

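AI summary: Uses static and dynamic game models to show why both the "free-for-all" and the "strong intellectual property rights" regimes fail in the market for AI training data, and proposes a market design in which a data intermediary internalizes externalities and subsidizes innovative contributions.
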
AI中文摘要

我们如何设计一个用于训练AI模型的人类生成内容市场,既能促进技术进步,又能保留个人创作高质量内容的激励?现有方法采取两极立场:基于合理使用的“自由使用”模式和“强知识产权”模式。我们证明两者均失败:自由使用不补偿创作者,而通过建模为静态Stackelberg博弈,强知识产权也削弱了创作激励。我们发现这对更具创新性的创作者尤其如此,我们将此现象称为“原创性惩罚”。将这一见解扩展到动态模型,我们发现另一种市场失灵会损害AI模型性能,即使对于初始良好的模型也是如此:此类模型导致人类更依赖AI辅助创作,导致同质化内容反馈到训练中,从而降低模型性能——即“精确性诅咒”。我们进一步提出一种市场设计,通过数据中介内部化跨创作者外部性并补贴创新贡献,从而恢复效率。

英文摘要

How can we design a market of human-generated content for use in training AI models that both enables technological progress and preserves individual incentives for high-quality content creation? Existing approaches take polar positions: a "free-for-all" model based on fair use and a "strong intellectual property rights" model. We show that both fail. Free-for-all does not compensate creators, and, as we show by modeling the market as a static Stackelberg game, strong intellectual property rights also underpower creative incentives. We find this especially true for more innovative creators, a phenomenon we term the "originality penalty." Extending this insight to a dynamic model, we find another market failure that undermines AI model performance, even for an initially good model: such a model induces greater reliance by humans on AI-assisted creation, so that homogenized content feeds back into training and degrades model performance, a "curse of precision." We further propose a market design in which a data intermediary internalizes cross-creator externalities and subsidizes innovative contributions, thereby restoring efficiency.

2606.11526 2026-06-11 stat.ME econ.EM 新提交

What is the Long-Term Value of Reliability?

可靠性的长期价值是什么?

Chenyu Qiu, Xu Kuang, Inessa Liskovich, Ali Rauh, Stefan Wager

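AI summary: Describes Chronos LTV, a system that models customer interactions as a Markov decision process and uses a covariate-balancing algorithm to estimate the long-term effect of the delay rate on business metrics.
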
AI中文摘要

我们描述了Chronos LTV,一个用于衡量延迟和其他服务缺陷对关键业务指标长期影响的系统。我们使用马尔可夫决策过程对客户随时间推移的交互进行建模,并将我们的目标估计量形式化为相对于移动平均延迟率的边际政策效应。在此设定下,我们表明,在给定观察到的订单特征的情况下,延迟在顺序无混淆假设下(即延迟近似随机)可以识别长期效应;并且可以使用简单的协变量平衡算法来估计这些效应。

英文摘要

We describe Chronos LTV, a system to measure the long-term impact of delays and other service defects on key business metrics. We use Markov decision processes to model customer interactions over time, and formalize our target estimand as the marginal policy effect with respect to moving the average delay rate. Given this setup, we show that long-term effects can be identified under a sequential unconfoundedness assumption, in which delays are as good as random given observed order characteristics, and that these effects can be estimated using a simple covariate-balancing algorithm.

2606.11118 2026-06-11 cs.LG math.OC math.PR stat.AP stat.ML 版本更新

Data-Driven Dynamic Assortment in Online Platforms: Learning about Two Sides

在线平台中的数据驱动动态分类:学习双边信息

Rahul Roy, Nur Sunar, Jayashankar M. Swaminathan

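AI summary: For two-sided service platforms, proposes a data-driven algorithm that dynamically optimizes the assortment when the choice parameters of both customers and sellers are unknown, and proves that its regret grows polylogarithmically over time, which is rate-optimal.
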
AI中文摘要

我们研究了一个在离散时间环境下,具有不完全信息和异质顾客的双边服务平台上的动态分类问题。在每个周期,一位顾客到达寻求服务,平台选择一组卖家进行展示。顾客根据多项逻辑选择模型,最多向分类中的一个卖家提出交易。经过固定数量的周期后,卖家审查收到的提议,并根据另一个多项逻辑选择模型,每位卖家最多选择一个顾客,然后循环重复。一个关键挑战是平台事先不知道顾客或卖家的选择模型参数。据我们所知,这是首次研究双边选择参数均未知的动态分类问题。我们开发了一种数据驱动算法,该算法在优化平台目标的同时学习这些参数。我们使用遗憾值来评估性能,该遗憾值衡量相对于一个预知所有参数和顾客到达时间的先知基准的收入损失。我们证明该算法的最坏情况遗憾值随时间呈多对数增长,并推导出匹配的下界,从而确定其速率最优性。

英文摘要

We study a dynamic assortment problem on a two-sided service platform with incomplete information and heterogeneous customers in a discrete-time setting. In each period, a customer arrives seeking service, and the platform chooses an assortment of sellers to display. The customer then proposes a transaction to at most one seller in the assortment according to a multinomial logit choice model. After a fixed number of periods, sellers review the proposals they have received and each chooses at most one customer according to another multinomial logit choice model, after which the cycle repeats. A key challenge is that the platform does not know the choice-model parameters of either customers or sellers in advance. To our knowledge, this is the first study of a dynamic assortment problem in which both sides' choice parameters are unknown. We develop a data-driven algorithm that learns these parameters while optimizing the platform's objective over time. We evaluate performance using regret, which measures revenue loss relative to a clairvoyant benchmark that knows all parameters and customer arrivals in advance. We show that the algorithm's worst-case regret grows polylogarithmically over time, and we derive a matching lower bound, establishing its rate optimality.
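
The customer-side choice step can be illustrated with a minimal sketch of multinomial logit (MNL) probabilities. The abstract only says the customer proposes to at most one seller; the normalization of the "no proposal" option to utility 0, the function names, and the per-seller revenues are assumptions made here for illustration, not details taken from the paper.

```python
import math


def mnl_choice_probabilities(utilities):
    """Return (p_no_proposal, [p_seller_i]) under an MNL model.

    Assumption: the outside option (no proposal) has utility 0,
    so its unnormalized weight is exp(0) = 1.
    """
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)
    return 1.0 / denom, [w / denom for w in weights]


def expected_revenue(utilities, revenues):
    """Expected revenue of a displayed assortment (hypothetical objective)."""
    _, probs = mnl_choice_probabilities(utilities)
    return sum(p * r for p, r in zip(probs, revenues))
```

In a learning setting the utilities would be unknown and estimated from observed proposals; this sketch only shows how an assortment maps to choice probabilities once utilities are given.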

2606.01650 2026-06-11 q-fin.PM q-fin.TR stat.AP stat.ME 版本更新

Post Selection Estimation of Sharpe Ratios

夏普比率的事后选择估计

Steven E. Pav

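AI summary: For an asset selected because it has the highest in-sample Sharpe ratio among many, studies estimators based on the polyhedral lemma, James-Stein shrinkage, debiasing of the expected maximum Sharpe ratio, thresholding, and empirical Bayes, and evaluates their bias, root mean square error, and rank correlation in simulations.
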
AI中文摘要

我们考虑估计一个资产的真实夏普比率的问题,该资产因在众多资产中具有最高的样本内夏普比率而被选中。我们讨论了基于多面体引理、James-Stein收缩、期望最大夏普比率去偏、阈值法和经验贝叶斯的估计器。我们在模拟中测试了这些估计器,计算了不同样本量、资产数量以及总体夏普比率的分布范围和形状下的偏差和均方根误差。我们还计算了估计器与潜在真实值的秩相关性,模拟了这些估计器如何用于比较或排序执行此选择过程的不同团队的结果。我们发现James-Stein估计器在相关参数的许多不同实际值下提供了最佳性能,其次是Jiang和Zhang的GMLEB估计器。这些结果对资产收益的相关性相当稳健,但有一些注意事项。

英文摘要

We consider the problem of estimating the true Sharpe ratio of an asset selected for having the highest observed in-sample Sharpe ratio among many assets. We discuss estimators based on the polyhedral lemma, James-Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding, and empirical Bayes. We test these estimators in simulations, computing bias and root mean square error across different values of sample size, number of assets, and spread and shape of population Sharpe ratios. We also compute rank correlation of the estimators against the underlying quantity, simulating how these estimators might be used to compare or rank the output of different teams which perform this selection process. We find that the James-Stein estimator provides the best performance across many different realistic values of the relevant parameters, followed by the GMLEB estimator of Jiang and Zhang. These results are fairly robust to correlation of asset returns, with some caveats.
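
The selection effect described above can be seen in a small simulation. The sketch below is not the paper's estimator: it uses the textbook positive-part James-Stein rule that shrinks toward the grand mean, and it assumes the sampling variance of a per-period sample Sharpe ratio is roughly 1/n when the true ratio is near zero. The function names, sample sizes, and seed are illustrative choices.

```python
import random
import statistics


def sample_sharpe(returns):
    """Per-period sample Sharpe ratio (mean over standard deviation)."""
    return statistics.fmean(returns) / statistics.stdev(returns)


def james_stein_toward_mean(estimates, variance):
    """Positive-part James-Stein shrinkage toward the grand mean."""
    k = len(estimates)
    center = statistics.fmean(estimates)
    spread = sum((z - center) ** 2 for z in estimates)
    shrink = max(0.0, 1.0 - (k - 3) * variance / spread) if spread > 0 else 0.0
    return [center + shrink * (z - center) for z in estimates]


def simulate(num_assets=50, num_periods=252, true_sharpe=0.0, seed=0):
    """Return (raw, shrunk) estimates for the asset with the best sample Sharpe."""
    rng = random.Random(seed)
    observed = [
        sample_sharpe([rng.gauss(true_sharpe, 1.0) for _ in range(num_periods)])
        for _ in range(num_assets)
    ]
    # Assumed approximation: Var(sample Sharpe) ~ 1/n for small true Sharpe.
    shrunk = james_stein_toward_mean(observed, 1.0 / num_periods)
    best = max(range(num_assets), key=observed.__getitem__)
    return observed[best], shrunk[best]
```

With every true Sharpe ratio set to zero, the raw estimate for the selected asset is positive purely because of selection, and the shrunk estimate is pulled back toward the grand mean.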

1911.04090 2026-06-11 stat.ME q-fin.PM 版本更新

A post hoc test on the Sharpe ratio

夏普比率的事后检验

Steven E. Pav

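AI summary: Proposes a post hoc test for the Sharpe ratio, analogous to Tukey's test, for comparing the Sharpe ratios of assets after the hypothesis that all population signal-noise ratios are equal has been rejected.
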
AI中文摘要

我们描述了一种针对夏普比率的事后检验,类似于Tukey检验用于均值的两两相等性检验。该检验可以在拒绝所有总体信噪比相等的假设后应用。该检验适用于资产收益间具有简单相关结构的情形。模拟表明,该检验在广泛条件下维持名义第一类错误率,并在合理备择假设下具有中等功效。

英文摘要

We describe a post hoc test for the Sharpe ratio, analogous to Tukey's test for pairwise equality of means. The test can be applied after rejection of the hypothesis that all population signal-noise ratios are equal. The test is applicable under a simple correlation structure among asset returns. Simulations indicate that the test maintains the nominal type I error rate under a wide range of conditions and is moderately powerful under reasonable alternatives.

2509.04691 2026-06-11 stat.AP 版本更新

Inferring Piece Value in Chess and Chess Variants

推断国际象棋及其变体中的棋子价值

Steven E. Pav

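AI summary: Uses logistic regression on Lichess data to estimate piece values in standard chess and four variants. Relative values of the major pieces agree with historical valuations, but bishops are worth slightly more than knights, and absolute values are smaller in Atomic chess and Antichess.
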
Comments
58 pages
AI中文摘要

我们使用逻辑回归来估计标准国际象棋及几种变体(即Chess 960、原子棋、反象棋和部落棋)中棋子的价值。我们对来自免费开源互联网国际象棋服务器Lichess的多年数据进行回归分析。我们使用已发布的玩家等级分来控制不同玩家技能带来的混杂效应。我们调整了由于观测等级分噪声导致的回归衰减偏差。我们发现,主要棋子的价值相对于兵的价值,与历史估值体系相当一致。然而,我们发现象的价值略高于马。我们发现,在原子棋和反象棋中,棋子的绝对值比标准国际象棋小。我们还给出了当不同技能水平的玩家对战时,使棋局平衡的近似棋子价值。我们简要考虑了使用Stockfish引擎进行自我对弈实验,这提供了关于棋子价值的对比视角。

英文摘要

We use logistic regression to estimate the value of the pieces in standard chess and several chess variants, namely Chess 960, Atomic chess, Antichess, and Horde chess. We perform our regressions on several years of data from Lichess, the free and open-source internet chess server. We use the published player ratings to control for the confounding effect of differential player skill. We adjust for the attenuation bias in regressions due to the noise in observed ratings. We find that major piece values, relative to the value of a pawn, are fairly consistent with historical valuation systems. However, we find that bishops are worth slightly more than knights. We find that piece values are smaller, in absolute value, in Atomic chess and Antichess than in standard chess. We also present approximate values of the pieces to equalize odds when players of varying skill face off. We briefly consider self-play experiments using the Stockfish engine, which give a contrasting view of piece value.

10. Data Privacy, Robustness, and Fairness (6 papers)

2606.11949 2026-06-11 cs.LG cs.CR stat.ML 新提交

Online Shift Detection and Conformal Adaptation for Deployed Safety Classifiers

已部署安全分类器的在线漂移检测与共形自适应

Jun Wen Leong

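AI summary: Proposes an online monitoring system that detects distribution shift with calibrated sequential statistics and restores the target error rate by adapting thresholds in a conformal abstention layer, achieving 86.6% valid detection across 800 experimental cells.
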
Comments
16 pages, 4 figures, 7 tables. Code and data at this https URL
AI中文摘要

我们提出了一种在线监测系统,用于检测已部署安全分类器中的分布漂移,使用校准的序列统计量来检测分类器何时移出分布。一旦检测到,共形弃权层会自适应调整决策阈值,以恢复目标错误率ε=0.1。在一项预注册的析因评估(4个分类器×5种漂移条件×20个种子×2个窗口大小,共800个单元)中,该系统实现了86.6%的有效检测(693/800,95% CI [84.1%, 88.8%]),平均延迟为39.5步。检测在三种真实标签机制下均有效:合成发作(86.6%)、真实时间越狱(85%,17/20)和GCG对抗攻击。加权共形预测为DeBERTa恢复了高达39个百分点的丢失覆盖率(ESS=46/300),但所有其他分类器均崩溃(ESS≈300):逻辑密度比估计在高维嵌入空间中实现了完美的源/目标可分离性,将所有重要性权重裁剪至下限。DeBERTa显示出从有效校正(释义,ESS=46)到几乎完全崩溃(对抗后缀,ESS=206)的梯度。PCA降至32维打破了崩溃,为Llama Guard恢复了33个百分点,为ShieldGemma恢复了21个百分点。方差分解显示分类器(η²=0.243)、漂移类型(η²=0.237)及其交互作用(η²=0.185)均对检测延迟方差有显著贡献(所有p<0.001),表明需要针对每个分类器的监测配置文件。

英文摘要

We present an online monitoring system for distributional shift in deployed safety classifiers, using calibrated sequential statistics to detect when a classifier has moved out of distribution. Upon detection, a conformal abstention layer adapts decision thresholds to recover a target error rate $\varepsilon = 0.1$. In a pre-registered factorial evaluation (4 classifiers $\times$ 5 shift conditions $\times$ 20 seeds $\times$ 2 window sizes, 800 cells), the system achieves 86.6% valid detection (693/800, 95% CI [84.1%, 88.8%]) with mean latency of 39.5 steps. Detection holds across three ground-truth regimes: synthetic onset (86.6%), real temporal jailbreaks (85%, 17/20), and GCG adversarial attacks. Weighted conformal prediction recovers up to 39 pp of lost coverage for DeBERTa (ESS = 46/300) but collapses for all other classifiers (ESS $\approx$ 300): logistic density ratio estimation achieves perfect source/target separability in high-dimensional embedding spaces, clipping all importance weights to the floor. DeBERTa shows a gradient from effective correction (paraphrase, ESS = 46) to near-total collapse (adversarial suffix, ESS = 206). PCA to 32 dimensions breaks the collapse, recovering 33 pp for Llama Guard and 21 pp for ShieldGemma. Variance decomposition reveals classifier ($\eta^2 = 0.243$), shift type ($\eta^2 = 0.237$), and their interaction ($\eta^2 = 0.185$) all contribute substantially to detection latency variance (all $p < 0.001$), indicating per-classifier monitoring profiles are necessary.
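
The idea of a conformal abstention layer can be sketched with plain split conformal prediction. The abstract does not specify the layer's design, so the score (one minus the predicted class probability), the rule of abstaining whenever the prediction set is not a single label, and the function names are all assumptions made for illustration.

```python
import math


def conformal_threshold(cal_scores, epsilon=0.1):
    """Split-conformal threshold: the ceil((n + 1)(1 - epsilon))-th smallest score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - epsilon))
    if rank > n:
        return float("inf")  # too few calibration points for this epsilon
    return sorted(cal_scores)[rank - 1]


def predict_or_abstain(class_probs, threshold):
    """Return the single conforming label, or None to abstain.

    Assumed nonconformity score: 1 - predicted probability of the label.
    """
    kept = [label for label, p in class_probs.items() if 1.0 - p <= threshold]
    return kept[0] if len(kept) == 1 else None
```

After a detected shift, one way to adapt would be to recompute the threshold on recent calibration data; whether the paper does this, or reweights the original calibration set, is not stated in the abstract.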

2606.11865 2026-06-11 stat.ML cs.LG 新提交

Conformal Bayes under Label Shift: Post-Hoc Calibration vs. In-Training Adaptation

标签偏移下的共形贝叶斯:事后校准与训练内适应

Seungjin Choi

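AI summary: Studies conformal Bayes under label shift, restoring target-domain coverage through importance-weighted conformal calibration and comparing two strategies, post-hoc calibration and in-training adaptation; the latter acts as a debiasing step when training is biased.
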
Comments
2nd Workshop on Epistemic Intelligence in Machine Learning (EIML@ICML 2026)
AI中文摘要

共形贝叶斯将贝叶斯后验预测与共形校准相结合,产生既统计有效又几何高效的预测集。我们从统一视角研究标签偏移下的共形贝叶斯,识别出两种互补方法,它们通过重要性加权共形校准恢复名义目标域覆盖,但通过独立机制运作。\emph{事后校准}将后验预测向目标域倾斜,并通过重要性加权分位数校正共形阈值,保持参数后验不变。\emph{训练内适应}将参数后验本身向目标域倾斜,产生校正后的预测,其最高预测密度区域作为基于拟合目标预测的最高预测密度(HPD)预测集;效率依赖于模型,并不保证有限样本条件最优性。两个受控实验表明,在无偏训练机制下,两种策略同样实现有效覆盖,而在领先优化机制下,训练内适应作为去偏算子,在覆盖不变的情况下减少区间宽度。

英文摘要

Conformal Bayes combines Bayesian posterior predictives with conformal calibration to produce prediction sets that are both statistically valid and geometrically efficient. We study conformal Bayes under label shift from a unified perspective, identifying two complementary approaches that restore nominal target-domain coverage through importance-weighted conformal calibration but operate through independent mechanisms. Post-hoc calibration tilts the posterior predictive toward the target domain and corrects the conformal threshold via an importance-weighted quantile, leaving the parameter posterior unchanged. In-training adaptation tilts the parameter posterior itself to the target domain, producing a corrected predictive whose highest predictive density (HPD) region serves as the prediction set under the fitted target predictive; efficiency is model-dependent and does not imply finite-sample conditional optimality. Two controlled experiments show that in an unbiased training regime both strategies achieve valid coverage equally, while in a lead-optimization regime in-training adaptation acts as a debiasing operator, reducing interval width at unchanged coverage.
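
The importance-weighted quantile used in post-hoc calibration can be sketched as follows. This is the generic weighted conformal construction with label-shift weights equal to the ratio of target to source class priors; the function names, and the assumption that both sets of priors are known, are illustrative and not taken from the paper.

```python
def label_shift_weights(source_priors, target_priors):
    """Importance weight per label: target prior divided by source prior."""
    return {y: target_priors[y] / source_priors[y] for y in source_priors}


def weighted_conformal_threshold(cal_scores, cal_weights, test_weight, alpha=0.1):
    """Weighted (1 - alpha) quantile of calibration scores.

    The test point contributes its own weight as a point mass at +infinity,
    which is why the threshold can be infinite when that weight dominates.
    """
    total = sum(cal_weights) + test_weight
    target = (1 - alpha) * total
    cumulative = 0.0
    for score, weight in sorted(zip(cal_scores, cal_weights)):
        cumulative += weight
        if cumulative >= target:
            return score
    return float("inf")
```

Under label shift the test weight depends on the candidate label, so a threshold would be computed per candidate label and the label kept in the prediction set when its score falls at or below that threshold.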

2606.11283 2026-06-11 cs.DS cs.LG stat.ML 新提交

Fixed-Parameter Tractability of Private Synthetic Data Generation

私有合成数据生成的固定参数可处理性

Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi

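AI summary: Studies synthetic data generation under differential privacy, establishes fixed-parameter tractability with the treewidth of the query family's incidence graph as the parameter, and gives two algorithms with optimal error.
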
AI中文摘要

我们研究在差分隐私下生成合成数据的问题。我们建立了该问题的固定参数可处理性(FPT),其中参数是查询族关联图的树宽。我们的算法在所有情况下都达到最优错误率,并通过两种不同方法实现:第一种基于线性规划(LP)和LP对偶分离问题的FPT;第二种基于子采样私有乘法权重方法,其中我们获得了从吉布斯分布采样的FPT。两种方法都通过树分解上的动态规划框架统一。

英文摘要

We study the problem of generating synthetic data under differential privacy. We establish fixed-parameter tractability (FPT) for this problem where the parameter is the treewidth of the query family's incidence graph. Our algorithms attain optimal error rates across all regimes and are realized by two different approaches: the first is based on linear programming (LP) and the FPT of the separation problem for the LP dual; the second is based on a subsampled private multiplicative weights method, where we obtain FPT for sampling from Gibbs distributions. Both approaches are unified by a dynamic programming framework over a tree decomposition.

2510.07750 2026-06-11 stat.ML cs.LG 版本更新

Calibrating Decision Robustness via Inverse Conformal Risk Control

通过逆保形风险控制校准决策鲁棒性

Wenbin Zhou, Shixiang Zhu

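AI summary: Proposes an inverse conformal risk control framework that gives distribution-free, finite-sample guarantees on miscoverage and regret for robust optimization policies, and traces the Pareto frontier so that decision-makers can calibrate the robustness level to their cost-risk preferences.
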
AI中文摘要

鲁棒优化通过针对最坏情况优化来保护决策免受不确定性影响,但其有效性取决于预先指定的鲁棒性水平,该水平通常是临时选择的,导致保护不足或过度保守且成本高昂的解决方案。最近使用保形预测的方法构建了具有有限样本覆盖保证的数据驱动不确定性集,但它们仍然事先固定覆盖目标,并且对选择鲁棒性水平提供的指导很少。我们提出了一个新框架,该框架为任何鲁棒预测-然后优化策略族提供了无分布、有限样本的误覆盖和遗憾保证。我们的方法构建了有效的估计量,这些估计量描绘出误覆盖-遗憾帕累托前沿,使决策者能够根据其成本-风险偏好可靠地评估和校准鲁棒性水平。该框架易于实现,广泛适用于经典优化公式,并实现了更优的有限样本性能。本文提供了一种原则性的数据驱动方法,用于指导鲁棒性选择,并使从业者能够在高风险决策中平衡鲁棒性和保守性。

英文摘要

Robust optimization safeguards decisions against uncertainty by optimizing against worst-case scenarios, yet its effectiveness hinges on a prespecified robustness level that is often chosen ad hoc, leading to either insufficient protection or overly conservative and costly solutions. Recent approaches using conformal prediction construct data-driven uncertainty sets with finite-sample coverage guarantees, but they still fix coverage targets a priori and offer little guidance for selecting robustness levels. We propose a new framework that provides distribution-free, finite-sample guarantees on both miscoverage and regret for any family of robust predict-then-optimize policies. Our method constructs valid estimators that trace out the miscoverage-regret Pareto frontier, enabling decision-makers to reliably evaluate and calibrate robustness levels according to their cost-risk preferences. The framework is simple to implement, broadly applicable across classical optimization formulations, and achieves sharper finite-sample performance. This paper offers a principled data-driven methodology for guiding robustness selection and empowers practitioners to balance robustness and conservativeness in high-stakes decision-making.

2506.01396 2026-06-11 cs.LG cs.CR stat.ML 版本更新

Mitigating Disparate Impact of Differentially Private Learning through Bounded Adaptive Clipping

通过有界自适应裁剪减轻差分隐私学习中的差异影响

Linzh Zhao, Aki Rehn, Mikko A. Heikkilä, Razane Tajeddine, Antti Honkela

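AI summary: To address the disparate impact of gradient clipping on minority groups in differentially private learning, proposes bounded adaptive clipping, which adds a tunable lower bound to prevent excessive gradient suppression and improves worst-class accuracy by over 10 percentage points on Skewed and Fashion MNIST.
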
Comments
TMLR camera-ready version
AI中文摘要

差分隐私已成为隐私保护机器学习的基本框架。然而,现有的差分隐私学习方法通常对模型预测产生差异影响,例如对少数群体。梯度裁剪常用于差分隐私学习,但会抑制来自困难样本的较大梯度。我们表明,自适应裁剪会加剧这一问题,因为它通常会将裁剪边界缩小到极小值以匹配拟合良好的多数类,同时显著降低其他类的准确率。我们提出有界自适应裁剪,引入可调下界以防止过度梯度抑制。与无界自适应裁剪相比,我们的方法在Skewed和Fashion MNIST上将最差类准确率提高了超过10个百分点,与自动裁剪相比提高了7个百分点,与恒定裁剪相比提高了5个百分点。代码可在该 https URL 获取。

英文摘要

Differential privacy (DP) has become an essential framework for privacy-preserving machine learning. Existing DP learning methods, however, often have disparate impacts on model predictions, e.g., for minority groups. Gradient clipping, which is often used in DP learning, can suppress larger gradients from challenging samples. We show that this problem is amplified by adaptive clipping, which will often shrink the clipping bound to tiny values to match a well-fitting majority, while significantly reducing the accuracy for others. We propose bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression. Our method improves worst-class accuracy by over 10 percentage points on Skewed and Fashion MNIST compared to unbounded adaptive clipping, 7 points compared to Automatic clipping, and 5 points compared to constant clipping. The code is available at this https URL.
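
A minimal sketch of how a lower bound changes an adaptive clipping rule. The geometric update below follows the general shape of quantile-based adaptive clipping and is an assumption: the abstract does not give the paper's update rule. The privacy noise that a real DP implementation would add to the gradients and to the unclipped fraction is omitted, and all names and constants are hypothetical.

```python
import math


def clip_gradient(grad, bound):
    """Scale a gradient so that its L2 norm is at most `bound`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, bound / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def update_clip_bound(bound, grad_norms, target_quantile=0.5, lr=0.1, lower_bound=0.0):
    """One geometric update of the clipping bound, floored at `lower_bound`.

    The bound shrinks when more than `target_quantile` of the per-example
    norms already fit under it, and grows otherwise. Setting
    lower_bound=0.0 recovers the unbounded adaptive rule.
    """
    unclipped = sum(n <= bound for n in grad_norms) / len(grad_norms)
    new_bound = bound * math.exp(-lr * (unclipped - target_quantile))
    return max(new_bound, lower_bound)
```

When most examples are well fitted and have tiny gradients, the unbounded rule keeps shrinking the bound toward those tiny norms, which also shrinks the contribution of the few hard examples; the floor stops that collapse.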

2310.01009 2026-06-11 stat.ME 版本更新

Neyman-Pearson and equal opportunity: when efficiency meets fairness in classification

Neyman-Pearson 与机会均等:当分类中的效率遇到公平

Jianqing Fan, Xin Tong, Yanhui Wu, Lucy Xia, Shunan Yao

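AI summary: Incorporates the equal opportunity constraint into the Neyman-Pearson classification framework, derives the oracle classifier, proposes finite-sample classifiers that satisfy the fairness and efficiency constraints, and validates them on simulated and real data.
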
AI中文摘要

组织通常依赖统计算法做出具有社会和经济影响的决策。我们必须解决这些重要自动化决策中的公平性问题。另一方面,经济效率对于组织的生存和成功仍然至关重要。因此,在促进现实世界数据科学解决方案的公平性时,适当兼顾公平和效率至关重要。作为朝着这一双重目标的首次努力之一,我们将机会均等(EO)约束纳入 Neyman-Pearson(NP)分类范式。在这个新的 NP-EO 框架下,我们(a)推导了最优分类器,(b)提出了基于有限样本的分类器,以高概率满足总体水平的公平性和效率约束,以及(c)在模拟和真实数据集上展示了我们算法的统计和社会有效性。

英文摘要

Organizations often rely on statistical algorithms to make socially and economically impactful decisions. We must address the fairness issues in these important automated decisions. On the other hand, economic efficiency remains instrumental in organizations' survival and success. Therefore, a proper dual focus on fairness and efficiency is essential in promoting fairness in real-world data science solutions. Among the first efforts towards this dual focus, we incorporate the equal opportunity (EO) constraint into the Neyman-Pearson (NP) classification paradigm. Under this new NP-EO framework, we (a) derive the oracle classifier, (b) propose finite-sample based classifiers that satisfy population-level fairness and efficiency constraints with high probability, and (c) demonstrate statistical and social effectiveness of our algorithms on simulated and real datasets.
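
To illustrate what a high-probability, finite-sample guarantee looks like in Neyman-Pearson classification, the sketch below implements the order-statistic rule from the classical NP umbrella approach, which controls the type I error only. It is not the NP-EO procedure of this paper, which additionally handles the equal opportunity constraint; the function names are illustrative.

```python
from math import comb


def violation_bound(k, n, alpha):
    """Upper bound on P(type I error > alpha) when thresholding at the
    k-th smallest of n held-out class-0 scores."""
    return sum(comb(n, j) * (1 - alpha) ** j * alpha ** (n - j) for j in range(k, n + 1))


def np_threshold_rank(n, alpha, delta):
    """Smallest rank k whose violation bound is at most delta, or None
    if n class-0 scores are too few for the requested (alpha, delta)."""
    for k in range(1, n + 1):
        if violation_bound(k, n, alpha) <= delta:
            return k
    return None
```

Classifying a point as class 1 when its score exceeds the k-th smallest held-out class-0 score then keeps the type I error at or below alpha with probability at least 1 - delta over the held-out sample.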

11. Datasets, Software, and Applications (7 papers)

2606.12317 2026-06-11 stat.ME stat.CO 新提交

ShrinkageTrees: An R Package for Bayesian Tree Ensembles for Survival Analysis and Causal Inference

ShrinkageTrees: 用于生存分析和因果推断的贝叶斯树集成R包

Tijn Jacobs

AI总结 ShrinkageTrees是一个R包,通过贝叶斯加性回归树模型处理右删失和区间删失生存数据,支持因果推断中的预后和治疗效应分解,并引入深度惩罚、Dirichlet分裂和马蹄铁先验等正则化策略,适用于高维场景。

详情
AI中文摘要

ShrinkageTrees是一个用于生存分析和因果推断的贝叶斯树集成R包。该包在加速失效时间(AFT)框架下实现了针对右删失和区间删失生存结果的贝叶斯加性回归树模型,并可选择分解为预后和治疗效应成分以进行因果推断。提供两种互补的正则化形式:通过深度惩罚先验和Dirichlet分裂先验对树结构进行正则化,以及通过全局-局部收缩先验对步高进行正则化。ShrinkageTrees首次实现了马蹄铁森林,即对步高施加马蹄铁先验。这些正则化策略将贝叶斯树集成扩展到高维设置。高效的Rcpp后端、多链MCMC和S3方法支持完整的流程:拟合、预测、因果效应估计和收敛诊断。

英文摘要

ShrinkageTrees is an R package for Bayesian tree ensembles in survival analysis and causal inference. The package implements Bayesian additive regression tree models for right- and interval-censored survival outcomes within an accelerated failure time (AFT) framework, with optional decomposition into prognostic and treatment-effect components for causal inference. Two complementary forms of regularisation are available: regularisation of the tree structure, via depth-penalising priors and Dirichlet splitting priors, and regularisation of the step heights, via global-local shrinkage priors. ShrinkageTrees provides the first implementation of the Horseshoe Forest, which places a horseshoe prior on the step heights. These regularisation strategies extend Bayesian tree ensembles to high-dimensional settings. An efficient Rcpp backend, multi-chain MCMC, and S3 methods support the full workflow: fitting, prediction, causal effect estimation, and convergence diagnostics.

2606.11911 2026-06-11 stat.ML cs.LG math.AT 新提交

From Persistence to Survival: Hypothesis Testing, Effect Sizes and Vectorisation for Topological Features

从持续性到生存:拓扑特征的假设检验、效应大小与向量化

Juliette Murris, Bernadette Stolz, Karsten Borgwardt

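AI summary: Proposes STRAND, which treats persistence diagrams as survival data and uses the persistence survival function as a single basis for hypothesis testing, effect sizes, and vectorization, validated on synthetic data and real benchmarks.
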
AI中文摘要

持久性图是拓扑数据分析中常见的表示形式,但它们并非天然存在于向量空间中,且用于比较它们的统计工具在很大程度上与用于下游预测的工具分开发展。我们引入STRAND(生存拓扑表示图分析),将(集合的)持久性图视为生存数据:每个具有持久性值 $p = d - b$ 的拓扑特征是一个完全观测的事件时间,持久性生存函数 $S(t) = \mathbb{P}(p > t)$ 是比较图的中心对象。从这个单一表示中,我们推导出(i)一个非参数双样本检验,具有校准的第一类错误率和少量图的高功效;(ii)可解释的效应大小;以及(iii)用于下游机器学习的1-Wasserstein稳定特征向量。我们在具有受控拓扑的合成流形上验证了校准和功效,展示了在14个图和3D点云基准上的竞争性向量化,并将该方法应用于fMRI/神经科学数据中的功能性脑连接研究。据我们所知,STRAND是第一个从单一连贯且可解释的表示为持久性图提供假设检验和向量化的方法。

英文摘要

Persistence diagrams (PDs) are common representations in topological data analysis, but they do not naturally live in a vector space, and the statistical tools developed for comparing them have largely evolved separately from those used for downstream prediction. We introduce STRAND (Survival Topological Representation ANalysis of Diagrams), which treats (collections of) PDs as survival data: each topological feature with persistence value $p = d - b$ is a fully observed time-to-event, and the persistence survival function $S(t) = \mathbb{P}(p > t)$ is the central object for comparing diagrams. From this single representation we derive (i) a non-parametric two-sample test with calibrated Type I error and high power from a small number of diagrams; (ii) interpretable effect sizes; and (iii) a 1-Wasserstein-stable feature vector for downstream machine learning. We validate calibration and power on synthetic manifolds with controlled topology, demonstrate competitive vectorisation across 14 graph and 3D point cloud benchmarks, and apply the method to study functional brain connectivity in fMRI/neuroscience data. To our knowledge, STRAND is the first method to provide hypothesis testing and vectorisation for persistence diagrams from a single coherent and interpretable representation.
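
The persistence survival function $S(t) = \mathbb{P}(p > t)$ has a direct empirical counterpart, sketched below for a single diagram given as (birth, death) pairs. Restricting to finite death values and evaluating on a fixed grid to obtain a feature vector are assumptions made here; the paper's exact vectorisation and test statistic are not described in the abstract.

```python
def persistence_values(diagram):
    """Persistence p = d - b for each (birth, death) pair.

    Assumes every death value is finite.
    """
    return [death - birth for birth, death in diagram]


def survival_function(persistences, t):
    """Empirical S(t): fraction of features with persistence above t."""
    return sum(p > t for p in persistences) / len(persistences)


def survival_vector(persistences, grid):
    """Evaluate the empirical survival function on a fixed grid."""
    return [survival_function(persistences, t) for t in grid]
```

Because every feature is fully observed, no censoring correction is needed, and the empirical survival curve is simply one minus the empirical distribution function of the persistence values.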

2606.11651 2026-06-11 cs.LG q-bio.QM stat.AP 新提交

DeepRHP: A Hybrid Variational Autoencoder for Designing Random Heteropolymers as Protein Mimics

DeepRHP:一种用于设计随机异聚合物作为蛋白质模拟物的混合变分自编码器

Shuni Li, Zhiyuan Ruan, Andy Shen, Ivan Jayapurna, Ting Xu, Haiyan Huang

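AI summary: Proposes DeepRHP, a hybrid variational autoencoder that pairs a feature-based VAE with a classical VAE in a semi-supervised framework so that the latent space captures both key chemical features and sequence patterns, guiding random heteropolymer design; its suggestions for stabilizing membrane proteins are cross-validated against published results.
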
Comments
Oral presentation at AAAI 2023 Workshop on AI to Accelerate Science and Engineering
AI中文摘要

由预定义单体组成的合成随机异聚合物(RHP)为设计类蛋白质材料提供了一种方法。如果设计得当,这些RHP可以模拟蛋白质的行为和功能。因此,需要计算工具来有效指导RHP设计。我们通过开发DeepRHP(一种在半监督框架下改进的变分自编码器(VAE)模型)来弥补这一差距。通过为经典VAE配备额外的基于特征的VAE,DeepRHP迫使潜在空间捕获关键化学特征的结构以及单个RHP序列模式。从这个意义上说,我们的方法是通用的,允许以混合方式纳入任何相关特征。我们通过提出在非原生环境中稳定膜蛋白(例如水通道蛋白Z)的潜在单体组成,并将我们的预测与已发表的结果进行交叉验证,证明了DeepRHP的有效性。我们的模型与真实RHP功能之间的一致性表明,利用混合自编码器架构来指导蛋白质和其他生物化合物的RHP设计具有巨大潜力。

英文摘要

Synthetic random heteropolymers (RHPs), consisting of a predefined set of monomers, offer an approach toward the design of protein-like materials. These RHPs, if designed appropriately, can mimic protein behavior and function. As such, there is a need for computational tools to efficiently guide RHP design. We bridge this gap by developing DeepRHP, a modified variational autoencoder (VAE) model under a semi-supervised framework. By equipping a classical VAE with an additional feature-based VAE, DeepRHP forces the latent space to capture structures of critical chemical features as well as individual RHP sequence patterns. In this sense, our method is versatile by allowing any relevant features to be incorporated in a hybrid manner. We demonstrate the effectiveness of DeepRHP by suggesting potential monomer compositions that stabilize membrane proteins (e.g. Aquaporin Z) in non-native environments and cross-validating our prediction with published results. The concordance between our model and true RHP function suggests strong potential in utilizing hybrid autoencoder architectures to guide RHP design for proteins and other biological compounds.

2606.11510 2026-06-11 q-bio.QM q-bio.PE stat.ML 新提交

Continuous biome representations from Earth observation embeddings

从地球观测嵌入中提取连续生物群落表示

Maxwell B. Joseph, Flávia De Souza Mendes, Dieu My T. Nguyen, Camile Sothe, Christopher B. Anderson (Planet Labs PBC)

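AI summary: Because discrete biome maps compress ecological continuity, learns a continuous probabilistic biome representation from satellite image embeddings. Evaluated on six Brazilian biomes and 4,672 plant species, it predicts species occurrence better than discrete labels.
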
Comments
8 pages, 4 figures
AI中文摘要

生物群落随空间连续变化,但生物群落图通过分类边界压缩了这种变化,特别是在生态过渡带,过渡群落具有独特的生态特征。地球观测基础模型通过密集嵌入编码光谱、空间和时间信息,能否将离散的生物群落图转换为更好地捕捉生态变化的连续表示?本文在Clay v1.5卫星图像嵌入上拟合线性分类器,从分类图中预测生物群落标签。softmax输出产生一个连续概率向量,其维度对应命名的生物群落类别。我们使用巴西六个生物群落、130万个嵌入和10015个保留的森林清查样地(涵盖4672种植物)评估该方法。连续生物群落表示在预测物种出现方面优于离散生物群落标签(10次空间交叉验证中平均每物种AUC 0.618 vs. 0.570)。分解这一增益表明,改进来自分级概率输出的连续性,而非标签重新分配;该模式在距生物群落边界的所有距离上均成立。原始1024维嵌入仍然是我们测试的最强预测因子(平均AUC 0.646 vs. 0.618),但连续表示恢复了嵌入相对于离散标签的大部分增益。这种简单方法为分类地图标签提供了概率替代方案,保留了其含义,同时编码了离散地图抑制的分级变化。

英文摘要

Biotic communities vary continuously across space, yet biome maps impose categorical boundaries that compress this variation, particularly at ecotones where transitional communities are ecologically distinct. Could Earth observation (EO) foundation models, which encode spectral, spatial, and temporal information with dense embeddings, convert discrete biome maps into continuous representations that better capture ecological variation? Here, we fit a linear classifier on Clay v1.5 satellite image embeddings to predict biome labels from a categorical map. The softmax output yields a continuous probability vector whose dimensions correspond to named biome classes. We evaluate this approach using six Brazilian biomes, 1.3 million embeddings, and 10,015 withheld forest inventory plots spanning 4,672 plant species. The continuous biome representation outperforms discrete biome labels for predicting species occurrence (mean per-species AUC 0.618 vs. 0.570 across 10 spatial cross-validation folds). Decomposing this gain shows that continuity in the graded probability output, rather than label reassignment, accounts for the improvement; the pattern holds across all distances from biome boundaries. The raw 1024-dimensional embedding remains the strongest predictor we tested (mean AUC 0.646 vs. 0.618), but the continuous representation recovers most of the embedding's gain over discrete labels. This simple approach provides a probabilistic replacement for categorical map labels, preserving their meaning while encoding graded variation that discrete maps suppress.
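
The step from a linear classifier to a continuous biome representation is a softmax over class logits. The sketch below uses a toy three-dimensional embedding and two classes; the weights, biases, and function names are placeholders, whereas the study fits the classifier on 1024-dimensional Clay v1.5 embeddings with six biome classes.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def biome_probabilities(embedding, weights, biases):
    """Probability vector over biome classes from a linear classifier.

    `weights` holds one row per class; `biases` holds one value per class.
    """
    logits = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)
```

A discrete map would keep only the index of the largest entry; the continuous representation keeps the whole vector, so a location near a biome boundary can carry comparable mass on two classes.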

2606.11473 2026-06-11 cs.LG cs.AI stat.ML 新提交

CRUMB: Efficient Prior Fitted Network Inference via Distributionally Matched Context Batching

CRUMB: 通过分布匹配上下文批处理实现高效先验拟合网络推理

Jamie Heredge, Mattia J. Villani, Pranav Deshpande, Akshay Seshadri, Niraj Kumar

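Affiliation * Global Technology Applied Research, JPMorganChase

AI summary: Proposes CRUMB, which clusters the test queries, selects training subsets by minimising the maximum mean discrepancy, and then runs exact inference, speeding up prior-fitted network inference without retraining and outperforming comparable methods on 51 datasets.

Details
Comments
26 pages, 13 figures
AI-generated abstract

Prior-fitted networks (PFNs) are a promising class of tabular foundation models that perform in-context learning: the entire labelled training set is supplied as context, and predictions for the test queries are produced in a single forward pass. However, the quadratically scaling self-attention mechanism in many PFN architectures makes inference infeasible for very large training datasets. We propose CRUMB (Clustered Retrieval Using Minimised-MMD Batching), a three-stage inference wrapper that (i) clusters the test queries, (ii) selects a small, distributionally matched training subset for each cluster by greedily minimising the maximum mean discrepancy (MMD), and (iii) runs exact PFN inference on each reduced-context batch. CRUMB is architecture-agnostic and requires no retraining. On the 51-dataset TabArena benchmark, evaluated across three PFN architectures (TabPFNv2, TabICLv1, TabICLv2), we show that CRUMB outperforms comparable state-of-the-art context selection strategies. We also show that CRUMB is robust to covariate drift, because the MMD-minimisation step naturally helps align the training context distribution with the distribution of the current test batch.

English abstract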

Prior-fitted networks (PFNs) are a promising class of tabular foundation models that perform in-context learning, whereby the entire labelled training set is supplied as context, and predictions for test queries are produced in a single forward pass. However, the quadratically scaling self-attention mechanism in many PFN architectures makes inference prohibitive for very large training datasets. We propose CRUMB (Clustered Retrieval Using Minimised-MMD Batching), a three-stage inference wrapper that (i) clusters the test queries, (ii) selects a small, distributionally matched training subset for each cluster by greedily minimising the maximum mean discrepancy (MMD), and (iii) runs exact PFN inference on each reduced-context batch. CRUMB is architecture-agnostic and requires no retraining. On the 51-dataset TabArena benchmark, evaluated across three PFN architectures (TabPFNv2, TabICLv1, TabICLv2), we show that CRUMB outperforms similar state-of-the-art context selection strategies. We also show that CRUMB is resilient to covariate drift, as the MMD-minimisation step naturally helps align the training context distribution to match the current test batch distributions.
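
Stage (ii), greedy MMD minimisation, can be illustrated with a small sketch. It is a generic greedy selection under assumptions of my own (a Gaussian kernel, the biased MMD estimate, 1-D toy points); the abstract does not state the kernel or estimator CRUMB actually uses, and the function names are hypothetical.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def mmd2(sample, target, gamma=1.0):
    """Biased estimate of the squared MMD between two point sets."""
    def mean_k(A, B):
        return sum(rbf(a, b, gamma) for a in A for b in B) / (len(A) * len(B))
    return mean_k(sample, sample) - 2 * mean_k(sample, target) + mean_k(target, target)

def greedy_mmd_subset(train, queries, k, gamma=1.0):
    """Pick k training indices; each step adds the point giving the lowest MMD."""
    chosen = []
    remaining = list(range(len(train)))
    for _ in range(k):
        best = min(
            remaining,
            key=lambda i: mmd2([train[j] for j in chosen + [i]], queries, gamma),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Toy data: the query cluster sits near 5, so the nearby training points win.
train = [(0.0,), (0.1,), (5.0,), (5.2,), (10.0,)]
queries = [(5.1,), (4.9,)]
subset = greedy_mmd_subset(train, queries, k=2)
```

Each greedy step here recomputes the full MMD, which is simple but costly. A practical implementation would presumably cache kernel sums, though the abstract gives no detail on that.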

2606.11235 2026-06-11 cs.LG cs.DB stat.ME New submission

Few-Shot Resampling for Scalable Statistically-Sound Data Mining


Leonardo Pellegrina, Fabio Vandin

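Affiliation * Department of Information Engineering, University of Padova

AI summary: Proposes FewRS, a resampling-based method for assessing the statistical significance of data mining results; by deriving a new bound on the supremum deviation, it guarantees the false-discovery probability with only a very small number of resampled datasets, greatly improving scalability.

Details
Comments
Accepted to KDD 2026
AI-generated abstract

A key step in knowledge discovery is the evaluation of data mining results. In several applications, including pattern mining and graph analysis, this step involves assessing the statistical significance of the results, to avoid spurious discoveries caused only by noise or random fluctuations in the data. Although specialized procedures have been developed for some specific applications, resampling-based methods are widely used, especially for complex analyses where analytical results cannot be derived. However, current resampling-based methods need to generate and analyze thousands of resampled datasets and are therefore impractical for large datasets or computationally intensive analyses. In this paper, we introduce FewRS, a simple and effective resampling-based method for assessing the statistical significance of data mining results with rigorous guarantees on the probability of false discoveries. Our method can be applied wherever resampling methods are used. FewRS builds on a new bound that we derive on the supremum deviation of the test statistics representing the quality of data mining results. We prove that FewRS needs to generate and analyze only a very small number of resampled datasets, yielding a highly scalable and widely applicable method. We test our method on common tasks such as pattern mining and network analysis. In all cases, it reduces running time by up to two orders of magnitude compared with the state of the art while preserving high statistical power, making it possible to statistically validate data mining results on large real-world datasets.

English abstract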

A key step in knowledge discovery is the evaluation of data mining results. In several applications, including pattern mining, graph analysis, and others, this step includes the evaluation of the statistical significance of the results, to avoid spurious discoveries due only to noise or random fluctuations in the data. While specialized procedures have been developed for some specific applications, resampling-based approaches are widely used, in particular for complex analyses where analytical results cannot be derived. However, current resampling-based approaches require the generation and analysis of thousands of resampled datasets, and are therefore impractical for large datasets or computationally intensive analyses. In this paper, we introduce FewRS, a simple and effective resampling-based approach to assess the statistical significance of data mining results with rigorous guarantees on the probability of false discoveries. Our approach can be used in every situation where resampling-based approaches are applied. FewRS builds on our derivation of a novel bound to the supremum deviation of test statistics representing the quality of data mining results. We prove that FewRS needs to generate and analyze an extremely small number of resampled datasets, leading to a highly scalable approach with wide applicability. We test our approach on common tasks such as pattern mining and network analysis. In all cases, our approach results in a reduction of up to two orders of magnitude in running time compared to the state of the art, while preserving high statistical power, enabling the statistical validation of data mining results on large-scale real-world datasets.
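
For context, the conventional resampling procedure that the abstract contrasts FewRS against can be sketched as a permutation test. This is the standard baseline, not FewRS itself (its supremum-deviation bound is not reproduced here); the function name and the toy data are hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, num_resamples=1000, seed=0):
    """Empirical p-value for the absolute difference in means under label shuffling."""
    rng = random.Random(seed)

    def stat(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(num_resamples):
        rng.shuffle(pooled)  # one resampled dataset: random relabelling
        if stat(pooled[:n_a], pooled[n_a:]) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (num_resamples + 1)

p_separated = permutation_p_value([10, 11, 12, 13, 14], [0, 1, 2, 3, 4])
p_identical = permutation_p_value([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
```

The smallest p-value this estimator can return is `1 / (num_resamples + 1)`. Reaching the very small thresholds required when many hypotheses are tested at once therefore takes thousands of resampled datasets, each of which must be re-analyzed. That is the cost the abstract says FewRS reduces.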

2602.10908 2026-06-11 cs.CL cs.LG stat.ML Updated version

SoftMatcha 2: A Fast and Soft Pattern Matcher for Trillion-Scale Corpora


Masataka Yoneda, Yusuke Matsushita, Go Kamoda, Kohei Suenaga, Takuya Akiba, Masaki Waga, Sho Yokoi

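AI summary: Proposes SoftMatcha 2, an ultra-fast soft search algorithm based on suffix arrays and word vectors; using dynamic corpus-aware pruning and a disk-aware design, it searches trillion-scale corpora in under 0.3 seconds for semantic variants involving substitution, insertion, and deletion, and uncovers benchmark contamination.

Details
Comments
Accepted at ICML 2026. Project Page & Web Interface: this https URL, Source Code: this https URL
AI-generated abstract

We present SoftMatcha 2, an ultra-fast and flexible search algorithm that can search trillion-scale natural language corpora in under 0.3 seconds while allowing semantic variations in the form of substitution, insertion, and deletion. Our method uses string matching based on suffix arrays, which scale well with corpus size, and represents words as vectors, which underpins its semantic flexibility. To mitigate the combinatorial explosion caused by semantically relaxing the query, the method builds on two key algorithmic ideas: dynamic corpus-aware pruning and fast exact lookup enabled by a disk-aware design. We analyze the efficiency of the proposed method theoretically, showing that it can mitigate the exponential growth of the search space. Experiments on FineWeb-Edu (Lozhkov et al., 2024) (1.4T tokens) show that it achieves substantially lower search latency than the existing methods infini-gram (Liu et al., 2024), infini-gram mini (Xu et al., 2025), and SoftMatcha (Deguchi et al., 2025). As a practical application, our method uncovers benchmark contamination in training corpora that existing methods miss, and it also benefits information retrieval and paraphrase detection. We also provide an online demo that supports fast soft search over corpora in seven languages.

English abstract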

We present SoftMatcha 2, an ultra-fast and flexible search algorithm that enables search over trillion-scale natural language corpora in under 0.3 seconds while allowing semantic variations in the form of substitution, insertion, and deletion. Our approach employs string matching based on suffix arrays that scales well with corpus size, and represents words as vectors, which underpin its semantic flexibility. To mitigate the combinatorial explosion induced by the semantic relaxation of queries, our method is built on two key algorithmic ideas: dynamic corpus-aware pruning and fast exact lookup enabled by a disk-aware design. We theoretically analyze the efficiency of the proposed method, indicating that it can mitigate exponential growth in the search space. Empirically, on FineWeb-Edu (Lozhkov et al., 2024) (1.4T tokens), it attains substantially lower search latency than existing methods: infini-gram (Liu et al., 2024), infini-gram mini (Xu et al., 2025), and SoftMatcha (Deguchi et al., 2025). As a practical application, our method uncovers benchmark contamination in training corpora that existing approaches miss, and it also benefits information retrieval and paraphrase detection. We also provide an online demo of fast, soft search across corpora in seven languages.
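
The exact-lookup building block, string matching over a suffix array, can be sketched on a toy token sequence. This shows only the textbook technique with a naive construction. The soft (word-vector) matching, pruning, and disk-aware layout described in the abstract are not reproduced, and the function names are hypothetical.

```python
def build_suffix_array(tokens):
    """Indices of all suffixes of `tokens`, sorted lexicographically (naive build)."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def find_occurrences(tokens, suffix_array, pattern):
    """Start positions of every exact occurrence of `pattern`, via binary search."""
    m = len(pattern)
    # Lower bound: first suffix whose m-token prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if tokens[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose m-token prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if tokens[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[start:lo])

corpus = "the cat sat on the mat near the cat".split()
sa = build_suffix_array(corpus)
hits = find_occurrences(corpus, sa, ["the", "cat"])
```

Because all suffixes sharing a prefix are contiguous in the sorted order, each query costs two binary searches regardless of how many matches exist, which is why the structure scales well with corpus size.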

12. Other/General Statistics: 11 papers

2606.12057 2026-06-11 stat.AP New submission

ChargeBD: Character-Aware Heterogeneous Agent Reasoning for Guided Engineering in Battery Development


Rui Huang, Zekun Jiang, Xingyu Niu, Yuqiang Li, Xinying Gu, Tianhang Zhou

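AI summary: Proposes the ChargeBD framework, which uses a matrix of MBTI-inspired persona agents combined with heterogeneous reasoning to address adaptivity in the multi-scale, multi-objective R&D of redox-flow batteries.

Details
AI-generated abstract

Redox-flow battery (RFB) research spans molecular design, electrolyte optimization, electrode and membrane materials, stack operation, system management, and safety analysis, making it a constrained, multi-scale, multi-objective energy-storage R&D problem. Although large language models (LLMs) can support scientific knowledge integration and proposal generation, generic LLM reasoning is still not adaptive enough across innovation-oriented exploration, rule-based execution, mechanistic modeling, and system-level trade-offs. This paper introduces ChargeBD, a character-aware heterogeneous-agent reasoning framework for guided engineering in battery development. Starting from a set of 50 RFB-specific tasks, we build the 500-question ESS-LLM Benchmark and define MBTI-inspired persona agents as structured cognitive-bias templates rather than psychometric instruments or representations of real personalities. DeepSeek-V3-Plus is chosen as the shared base model, and 16 MBTI-inspired persona agents are evaluated to construct a persona capability matrix and a cognitive advantage matrix.

English abstract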

Redox-flow battery (RFB) research spans molecular design, electrolyte optimization, electrode and membrane materials, stack operation, system management, and safety analysis, making it a constrained, multi-scale, and multi-objective energy-storage R&D problem. Although large language models (LLMs) can support scientific knowledge integration and proposal generation, generic LLM reasoning remains insufficiently adaptive across innovation-oriented exploration, rule-based execution, mechanistic modeling, and system-level trade-offs. Here we introduce ChargeBD, a character-aware heterogeneous-agent reasoning framework for guided engineering in battery development. Starting from a 50-question RFB-specific task set, we construct a 500-question ESS-LLM Benchmark and define MBTI-inspired persona agents as structured cognitive-bias templates rather than psychometric instruments or representations of real personalities. DeepSeek-V3-Plus is selected as the shared base model, and 16 MBTI-inspired persona agents are evaluated to construct a persona capability matrix and a cognitive advantage matrix.

2606.12047 2026-06-11 cs.CV cs.AI stat.ML New submission

Metadata-Aware Multi-Prompt Reasoning for Zero-Shot Accident Understanding


Tarandeep Singh, Soumyanetra Pal, Soham Biswas, Nishanth Chandran

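Affiliation * Netradyne

AI summary: Proposes a three-stage pipeline that uses vision-language similarity, metadata-driven multi-prompt reasoning, and open-vocabulary detection to perform temporal localization, semantic classification, and spatial grounding of accident videos in the zero-shot setting, substantially improving performance.

Details
Comments
Accepted at the AUTOPILOT Workshop, CVPR 2026 (non-archival). Workshop Paper ID 15
AI-generated abstract

In this paper, we address zero-shot understanding of accidents in surveillance video using natural language, by identifying when an impact event occurs, what type it is, and where in the frame it occurs. We propose a three-stage pipeline that decomposes accident understanding into when, what, and where. The first stage uses vision-language similarity to extract a short temporal window around the impact. In the second stage, we perform metadata-driven multi-prompt reasoning with five complementary views (baseline, motion, geometry, contrast, and tiebreaker) and resolve disagreement with an entropy-gated pairwise adjudicator. Finally, we query an open-vocabulary detector on the predicted accident type and scene layout to localize the impact, and aggregate the detections across keyframes with a score-weighted centroid. On the zero-shot ACCIDENT @ CVPR benchmark, our pipeline substantially improves the harmonic-mean score over a centre-of-frame baseline. We show that decomposing zero-shot video understanding into temporal localization, semantic classification, and spatial grounding enables more reliable reasoning with vision-language models than direct prompting.

English abstract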

In this paper, we address the problem of zero-shot understanding of accidents from surveillance videos by identifying when an impact event occurs, what type of impact it is, and where in the frame it occurs using natural language. We propose a three-stage pipeline that decomposes accident understanding into when, what, and where. The first stage extracts a short temporal window around the impact using vision-language similarity. In the second stage, we perform metadata-driven multi-prompt reasoning with five complementary views (baseline, motion, geometry, contrast, and tiebreaker) and resolve disagreement via an entropy-gated pairwise adjudicator. Finally, we localize the impact with an open-vocabulary detector queried on the predicted accident type and scene layout, and aggregate detections across keyframes using a score-weighted centroid. Our pipeline achieves a substantial improvement in the harmonic-mean score over a centre-of-frame baseline on the zero-shot ACCIDENT @ CVPR benchmark. We show that decomposing zero-shot video understanding into temporal localization, semantic classification, and spatial grounding enables more reliable reasoning with vision-language models than direct prompting alone.
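
The final aggregation step, a score-weighted centroid over keyframe detections, might look like the sketch below. The tuple layout `(x, y, score)` with normalized frame coordinates and the sample values are assumptions made for illustration; the abstract does not specify the detector's output format.

```python
def score_weighted_centroid(detections):
    """Aggregate (x, y, score) detections into one point, weighting by score."""
    total = sum(score for _, _, score in detections)
    if total == 0:
        raise ValueError("at least one detection needs a positive score")
    x = sum(cx * score for cx, _, score in detections) / total
    y = sum(cy * score for _, cy, score in detections) / total
    return x, y

# Hypothetical detections from three keyframes: (x, y, confidence score).
detections = [(0.40, 0.60, 0.9), (0.50, 0.50, 0.3), (0.44, 0.62, 0.6)]
centroid = score_weighted_centroid(detections)
```

Weighting by score lets a confident detection dominate, while low-confidence outliers pull the estimate only slightly. That is a plausible reason for preferring this over a plain average.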

2606.11282 2026-06-11 stat.AP math.PR math.ST New submission

The Statistical Compass


Eliuvish Han Cui

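AI summary: Develops probability and stochastic-process ideas as a translation language for statistics, from designed observations to data objects, targets, stability, inference, and use, with examples that tie abstract objects to records, mechanisms, and decisions.

Details
Comments
669 pages, 23 figures; textbook/monograph working manuscript
AI-generated abstract

This monograph develops probability and stochastic-process ideas as a translation language for statistics: from designed observations and data objects to targets, stability statements, inference, and use. The chapters start from motivating examples and randomization and cover probability measures, kernels, likelihoods, data objects, weak convergence, empirical fields, functional data, M- and Z-estimation, testing, local approximations, event-time processes, and prediction. Historical and biomedical examples tie abstract objects to records, mechanisms, and decisions. The aim is to give readers a common grammar for classical probability, modern data structures, and statistical practice.

English abstract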

This monograph develops probability and stochastic-process ideas as a translation language for statistics: from designed observations and data objects to targets, stability statements, inference, and use. The chapters move from motivating examples and randomization through probability measures, kernels, likelihoods, data objects, weak convergence, empirical fields, functional data, M- and Z-estimation, testing, local approximations, event-time processes, and prediction. Historical and biomedical examples are used to keep abstract objects tied to records, mechanisms, and decisions. The aim is to give readers a common grammar for classical probability, modern data structures, and statistical practice.

2605.13631 2026-06-11 stat.CO Updated version

ProjGuard: Safety Monitoring for Computer-Use Agents via Low-Dimensional Projections


Kebin Contreras, Carlos Hinojosa, Jorge Bacca, Bernard Ghanem

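AI summary: ProjGuard protects computer-use agents through behavioral trajectory monitoring: a lightweight risk signal gives early warning of potential danger, and an auxiliary vision-language model makes targeted corrections, raising task completion and reducing safety risk.

Details
Comments
The manuscript was submitted under an inappropriate category. In addition, substantial updates and improvements are currently being made to the document. To avoid confusion and ensure that readers access the most accurate version of the work, we request withdrawal of the current manuscript
AI-generated abstract

Computer-use agents increasingly run on real operating systems, which also raises the risk of prompt injection, indirect instructions, and visual attacks. Existing defenses usually rely on a second large model that analyzes the prompt or each potentially malicious input at inference time, which can limit coverage or increase deployment cost. We propose ProjGuard, an alternative based on behavioral trajectory monitoring. At each step, we derive a lightweight scalar risk signal from the agent's accumulated interaction history and evaluate online whether execution is starting to drift toward an unsafe region. This allows an early warning before the trajectory reaches a potentially harmful action. When an alert is raised, we selectively activate an auxiliary vision-language model to propose a corrected next step and steer execution back toward task completion. In experiments on OS-Harm, monitoring with on-demand correction reduces the unsafe rate from 16% to 3% while raising task completion from 59% to 65%. We further evaluate transfer to RiosWorld, where the method remains competitive, reaching a 4% unsafe rate and 64% task completion. Overall, these results support a hierarchical safety strategy in which always-on monitoring anticipates deviations and activates correction only when needed.

English abstract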

Computer-use agents are increasingly capable of operating on real operating systems, but this capability has also increased the risks posed by prompt injection, indirect instructions, and visual attacks. Existing defenses typically rely on analyzing the prompt or each potentially malicious input with a second large model at inference time, which can limit coverage or increase deployment cost. We propose ProjGuard, an alternative based on behavioral trajectory monitoring. At each step, we derive a lightweight scalar risk signal from the agent's accumulated interaction history and evaluate, online, whether execution is beginning to drift toward an unsafe region. This enables early warnings before the trajectory reaches a potentially harmful action. When an alert is raised, we selectively activate an auxiliary vision-language model to propose a corrected next step and steer execution back toward task completion. Experiments on OS-Harm show that monitoring with on-demand correction reduces the unsafe rate from 16 percent to 3 percent while improving task completion from 59 percent to 65 percent. We further evaluate transfer to RiosWorld, where the method remains competitive, reaching 4 percent unsafe and 64 percent completion. Overall, these results support a hierarchical safety strategy in which always-on monitoring anticipates deviations and activates correction only when needed.

2601.09072 2026-06-11 cs.AI cs.CL stat.ME

Human-AI Co-design for Clinical Prediction Models

Jean Feng, Avni Kothari, Patrick Vossler, Andrew Bishara, Lucas Zier, Newton Addo, Aaron Kornblith, Yan Shuo Tan, Chandan Singh

Details
Journal ref
npj Digital Medicine 2026
English abstract

Developing safe, effective, and practically useful clinical prediction models (CPMs) traditionally requires iterative collaboration between clinical experts, data scientists, and informaticists. This process refines the often small but critical details of the model building process, such as which features/patients to include and how clinical categories should be defined. However, this traditional collaboration process is extremely time- and resource-intensive, resulting in only a small fraction of CPMs reaching clinical practice. This challenge intensifies when teams attempt to incorporate unstructured clinical notes, which can contain an enormous number of concepts. To address this challenge, we introduce HACHI, an iterative human-in-the-loop framework that uses AI agents to accelerate the development of fully interpretable CPMs by enabling the exploration of concepts in clinical notes. HACHI alternates between (i) an AI agent rapidly exploring and evaluating candidate concepts in clinical notes and (ii) clinical and domain experts providing feedback to improve the CPM learning process. HACHI defines concepts as simple yes-no questions that are used in linear models, allowing the clinical AI team to transparently review, refine, and validate the CPM learned in each round. In two real-world prediction tasks (acute kidney injury and traumatic brain injury), HACHI outperforms existing approaches, surfaces new clinically relevant concepts not included in commonly-used CPMs, and improves model generalizability across clinical sites and time periods. Furthermore, HACHI reveals the critical role of the clinical AI team, such as directing the AI agent to explore concepts that it had not previously considered, adjusting the granularity of concepts it considers, changing the objective function to better align with the clinical objectives, and identifying issues of data bias and leakage.

2102.08591 2026-06-11 stat.ME stat.ML

Data-Driven Logistic Regression Ensembles

Anthony-Alexander Christidis, Stefan Van Aelst, Ruben Zamar

Details
English abstract

Advances in data collecting technologies in genomics have significantly increased the need for tools designed to study the genetic basis of many diseases. Effective statistical methods should excel in both prediction accuracy and biomarker identification. We introduce a novel approach to high-dimensional binary classification that integrates regularization with ensembling techniques. The method constructs compact ensembles of interpretable models derived by optimizing a global objective function. In medical genomics applications, the proposed approach identifies critical biomarkers overlooked by competing methods. We develop a variable importance ranking system to help researchers prioritize promising genes. The method's asymptotic properties are established, and an efficient computational algorithm is provided. Through extensive simulations across complex scenarios and analysis of cancer genomics datasets, we demonstrate strong predictive performance. Based on the numerical experiments, we offer practical guidelines for determining optimal ensemble size.

2502.04046 2026-06-11 stat.ME math.ST stat.TH

A method for sparse and robust independent component analysis

Lauri Heinonen, Joni Virta

Details
Journal ref
Journal of Multivariate Analysis, 213, 105587 (2026)
Comments
27 pages, 9 figures
English abstract

This work presents sparse invariant coordinate selection, SICS, a new method for sparse and robust independent component analysis. SICS is based on classical invariant coordinate selection, which is presented in such a form that a LASSO-type penalty can be applied to promote sparsity. Robustness is achieved by using robust scatter matrices. In the first part of the paper, the background and building blocks: scatter matrices, measures of robustness, ICS and independent component analysis, are carefully introduced. Then the proposed new method and its algorithm are derived and presented. This part also includes consistency and breakdown point results for a general case of sparse ICS-like methods. The performance of SICS in identifying sparse independent component loadings is investigated with multiple simulations. The method is illustrated with an example in constructing sparse causal graphs and we also propose a graphical tool for selecting the appropriate sparsity level in SICS.

2503.11683 2026-06-11 stat.AP

MealMeter: Using Multimodal Sensing and Machine Learning for Automatically Estimating Nutrition Intake

Asiful Arefeen, Samantha Fessler, Sayyed Mostafa Mostafavi, Carol S Johnston, Hassan Ghasemzadeh

Details
English abstract

Accurate estimation of meal macronutrient composition is a prerequisite for precision nutrition, metabolic health monitoring, and glycemic management. Traditional dietary assessment methods, such as self-reported food logs or diet recalls, are time-intensive and prone to inaccuracies and biases. Several existing AI-driven frameworks are data intensive. In this study, we propose MealMeter, a machine-learning-driven method that leverages multimodal sensor data from wearable and mobile devices. Data are collected from 12 participants to estimate macronutrient intake. Our approach integrates physiological signals (e.g., continuous glucose, heart rate variability), inertial motion data, and environmental cues to model the relationship between meal intake and metabolic responses. Using lightweight machine learning models trained on a diverse dataset of labeled meal events, MealMeter predicts the composition of carbohydrates, proteins, and fats with high accuracy. Our results demonstrate that multimodal sensing combined with machine learning significantly improves meal macronutrient estimation compared with baselines, including a foundation model, and achieves average mean absolute errors (MAE) and average root mean squared relative errors (RMSRE) as low as 13.2 grams and 0.37, respectively, for carbohydrates. Therefore, our developed system has the potential to automate meal tracking, enhance dietary interventions, and support personalized nutrition strategies for individuals managing metabolic disorders such as diabetes and obesity.
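
The two reported error metrics can be made concrete with a short sketch. The definitions below are the usual ones and are assumed here, since the abstract does not spell them out; the carbohydrate values are invented toy numbers, not data from the study.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the same unit as the data (grams here)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_relative_error(actual, predicted):
    """Square root of the mean squared error relative to each true value (unitless)."""
    squared = [((a - p) / a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Invented toy values: true vs. predicted carbohydrate grams for three meals.
actual = [50.0, 80.0, 20.0]
predicted = [40.0, 100.0, 25.0]

mae = mean_absolute_error(actual, predicted)
rmsre = root_mean_squared_relative_error(actual, predicted)
```

The two metrics answer different questions. MAE reports the typical miss in grams, while RMSRE scales each miss by the meal's true content, so a 5-gram error on a 20-gram meal counts as much as a 20-gram error on an 80-gram meal.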

2406.07909 2026-06-11 eess.AS cs.CL cs.SD stat.ML

Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation

Eungbeom Kim, Hantae Kim, Kyogu Lee

Details
Comments
Accepted by Interspeech 2024
English abstract

The Transformer encoder with a connectionist temporal classification (CTC) framework is widely used for automatic speech recognition (ASR). However, knowledge distillation (KD) for ASR suffers from disagreement between teacher and student models in frame-level alignment, which ultimately prevents it from improving the student model's performance. To resolve this problem, this paper introduces a self-knowledge distillation (SKD) method that guides the frame-level alignment during training. In contrast to the conventional method, which uses separate teacher and student models, this study introduces a simple and effective method that shares encoder layers and uses the sub-model as the student model. Overall, our approach improves both resource efficiency and performance. We also conduct an experimental analysis of spike timings to show that the proposed method improves performance by reducing the alignment disagreement.

1812.05678 2026-06-11 stat.ME

Objective-Driven Ensembles: Bridging the Gap Between Interpretable Sparsity and Algorithmic Prediction


Anthony Christidis, Stefan Van Aelst, Ruben Zamar

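AI summary: Presents objective-driven ensembles, which generalize best subset selection into a joint mathematical optimization problem to produce interpretable ensembles, and shows theoretically that penalizing predictor overlap bounds the prediction covariance and mitigates finite-sample spurious correlations, combining machine-learning-level accuracy with the interpretability of sparse models.

Details
AI-generated abstract

Sparse methods (e.g., best subset selection, elastic net) are the standard way to obtain interpretable models, but they can suffer from high variance and vulnerability to spurious correlations. Algorithmic ensembles (e.g., random forests, gradient boosting), on the other hand, achieve high prediction accuracy but produce uninterpretable black boxes driven by randomization or sequential residual fitting. In recent years a unifying paradigm has emerged: objective-driven ensembles. By generalizing best subset selection into a joint mathematical optimization problem, the approach generates interpretable ensembles by optimally splitting the predictors across a small number of diverse models. In this paper, we synthesize this growing literature and offer theoretical insight into its empirical success. Specifically, we show that penalizing predictor overlap mathematically bounds the prediction covariance and mitigates the effect of finite-sample spurious correlations. We demonstrate these properties using an exact combinatorial oracle, and review how recent computational approximations have successfully extended the framework to a variety of domains, including high-dimensional data, classification tasks, and settings with casewise or cellwise contamination, achieving machine-learning-level accuracy while retaining the interpretability of sparse models.

English abstract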

Sparse methods (e.g., Best Subset Selection, Elastic Net) are the standard approach for obtaining interpretable models, but they can suffer from high variance and vulnerability to spurious correlations. Alternatively, algorithmic ensembles (e.g., Random Forests, Gradient Boosting) achieve high prediction accuracy but yield uninterpretable black boxes driven by randomization or sequential residual fitting. In recent years, a unifying paradigm has emerged: Objective-Driven Ensembles. By generalizing best subset selection into a joint mathematical optimization problem, this approach generates interpretable ensembles by optimally splitting predictors across a small number of diverse models. In this paper, we synthesize this growing body of literature and illustrate the statistical principles driving its empirical success. Specifically, we utilize finite-sample bounds to demonstrate how penalizing predictor overlap controls ensemble covariance and provides a mathematical hedge against spurious correlations. We evaluate these mechanics using an exact combinatorial oracle, and review how recent computational approximations have successfully scaled this framework to a variety of domains, including high-dimensional data, classification tasks, and settings with casewise or cellwise contamination, achieving machine-learning-level accuracy while retaining the interpretability of sparse models.

1609.08725 2026-06-11 stat.ME

An adaptable generalization of Hotelling's $T^2$ test in high dimension

Haoran Li, Alexander Aue, Debashis Paul, Jie Peng, Pei Wang

Details
Comments
42 pages, 6 figures
English abstract

We propose a two-sample test for detecting the difference between mean vectors in a high-dimensional regime based on a ridge-regularized Hotelling's $T^2$. To choose the regularization parameter, a method is derived that aims at maximizing power within a class of local alternatives. We also propose a composite test that combines the optimal tests corresponding to a specific collection of local alternatives. Weak convergence of the stochastic process corresponding to the ridge-regularized Hotelling's $T^2$ is established and used to derive the cut-off values of the proposed test. Large sample properties are verified for a class of sub-Gaussian distributions. Through an extensive simulation study, the composite test is shown to compare favorably against a host of existing two-sample test procedures in a wide range of settings. The performance of the proposed test procedure is illustrated through an application to a breast cancer data set where the goal is to detect the pathways with different DNA copy number alterations across breast cancer subtypes.
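
For orientation, a ridge-regularized Hotelling statistic is commonly written in the form below. The notation is an assumption, since the abstract gives no formula: $n_1, n_2$ are the two sample sizes, $\bar{X}_1, \bar{X}_2$ the sample mean vectors, $S_n$ the pooled sample covariance matrix, $I_p$ the $p \times p$ identity, and $\lambda$ the regularization parameter.

```latex
\mathrm{RHT}(\lambda) \;=\; \frac{n_1 n_2}{n_1 + n_2}\,
(\bar{X}_1 - \bar{X}_2)^{\top} \left(S_n + \lambda I_p\right)^{-1} (\bar{X}_1 - \bar{X}_2),
\qquad \lambda > 0 .
```

The classical $T^2$ uses $S_n^{-1}$, which does not exist once the dimension $p$ exceeds $n_1 + n_2 - 2$ because $S_n$ is then singular. Adding $\lambda I_p$ keeps the matrix invertible, which is what makes the statistic usable in the high-dimensional regime.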