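arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.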

2606.12174 2026-06-11 stat.AP stat.ME New submission

The data-driven extreme value distribution: non-parametric tail estimation with a derived stability criterion

Michael Sandbichler, Tobias Hell

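AI summary: Proposes the Data-Driven Extreme Value Distribution (DDEVD), a non-parametric estimator that reconstructs the base distribution with a kernel and comes with a derived stability criterion; it gives more stable return levels than a GEV fit on precipitation data and matches a GEV fit on metallurgical grain-size tails.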

Comments: 28 pages, 6 figures

Abstract

Quantifying the likelihood of extreme events underpins risk assessment, yet classical Extreme Value Theory relies on asymptotic assumptions that fail in the data-sparse, non-stationary regimes practitioners increasingly face. We introduce the Data-Driven Extreme Value Distribution (DDEVD), a non-parametric estimator that aggregates all observations metastatistically and reconstructs the base distribution with a kernel, removing parametric tail assumptions. We derive its optimal bandwidth and prove a stability law $m < C\,n^{1+\gamma/2}$ relating reliable extrapolation to the extreme value index $\gamma$. In sub-hourly Alpine precipitation, DDEVD recovers stable 100-year return levels from single decades (calibration ratio $0.96$), departing from the full-record reference by over $50\,\%$ in fewer than one window in fifty -- versus one in five for a GEV fit. In metallurgical micrographs, it matches a generalised extreme-value fit on the safety-relevant grain-size tail, where the standard log-normal over-predicts by $58\,\%$ at $1\,\mathrm{cm}^{2}$.
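To get a feel for the stability law, the following Python sketch evaluates the right-hand side $C\,n^{1+\gamma/2}$ for a few sample sizes and tail indices. The abstract does not define $m$, $n$, or $C$: here `n` is assumed to be the number of observations, the returned value is assumed to bound the extrapolation horizon $m$, and `C = 1.0` is a placeholder rather than a value from the paper.

```python
def stability_bound(n: int, gamma: float, C: float = 1.0) -> float:
    """Right-hand side of the stability law m < C * n**(1 + gamma/2).

    n, gamma and C follow the abstract's symbols; their interpretation
    (sample size, extreme value index, unspecified constant) is assumed.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return C * n ** (1.0 + gamma / 2.0)


# Heavier tails (larger gamma) give a larger bound for the same n.
for gamma in (0.0, 0.2, 0.5):
    bounds = [round(stability_bound(n, gamma)) for n in (100, 1000)]
    print(f"gamma={gamma:.1f}  bound for n=100, 1000: {bounds}")
```

With a placeholder constant, only the scaling in `n` and `gamma` is meaningful, not the absolute numbers.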

2606.12015 2026-06-11 stat.ME New submission

Introducing precision-weighted bias as a performance measure to inform the inclusion of adaptive designs in meta-analysis

Martin Law (1 and 2), David S. Robertson (1), Sofia S. Villar (1), Tim P. Morris (3), Babak Choodari-Oskooei (4), Thomas Jaki (1 and 5), Ian R. White (4) ((1) Medical Research Council Biostatistics Unit, University of Cambridge, (2) Royal Papworth Hospital, Cambridge, (3) Statistical Methodology, Novartis Pharmaceuticals UK Ltd., (4) UCL Innovative Clinical Trials Unit, University College London, (5) Department of Machine Learning and Statistics, University of Regensburg, DE)

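AI summary: Proposes precision-weighted bias as a new measure of statistical performance, shows that adaptive designs contribute negligible bias to a meta-analysis, and recommends the measure as a standard complement in simulation studies.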

Comments: 9 pages, 2 figures

Abstract

We propose a novel, intuitive measure of statistical performance: precision-weighted bias. Precision-weighted bias is defined as the unconditional bias of an estimator weighted by the degree of information (precision) it contains. Current guidelines, such as GRADE and CONSORT, often view the potential for increased bias in adaptive designs as a deterrent for the inclusion of such designs in systematic reviews. However, we demonstrate that the bias in a common-effect meta-analysis is approximately equal to the precision-weighted average of the precision-weighted biases of its constituent studies, rather than of their unweighted unconditional biases. Through simulation studies, we show that while adaptive designs may exhibit unweighted bias, they frequently have zero precision-weighted bias. Consequently, including these designs often results in a negligible change to the overall meta-analysis bias. These results suggest that precision-weighted bias is a superior indicator for determining whether to include an adaptive design in a meta-analysis. We recommend that precision-weighted bias be used as a standard complement to unweighted unconditional and conditional bias in simulation studies to support more inclusive and accurate evidence synthesis.
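The weighting argument can be illustrated with the standard inverse-variance (common-effect) pooled estimate. The Python sketch below uses made-up study estimates and standard errors, plus an assumed true effect `theta`, to show that the error of the pooled estimate equals the precision-weighted average of the study-level errors, which can differ noticeably from their unweighted mean. It illustrates the algebra for one fixed data set only and is not the authors' simulation code.

```python
def common_effect(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and the weights used."""
    weights = [1.0 / se ** 2 for se in std_errors]  # precision = 1 / SE^2
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return pooled, weights


# Hypothetical inputs (not taken from the paper).
theta = 0.30
estimates = [0.42, 0.25, 0.31]
std_errors = [0.20, 0.10, 0.05]

pooled, weights = common_effect(estimates, std_errors)
weighted_error = sum(w * (e - theta) for w, e in zip(weights, estimates)) / sum(weights)
unweighted_error = sum(e - theta for e in estimates) / len(estimates)

print(f"pooled error:             {pooled - theta:+.4f}")
print(f"precision-weighted error: {weighted_error:+.4f}")
print(f"unweighted mean error:    {unweighted_error:+.4f}")
```

The first two printed values coincide by construction; the third shows how an unweighted average can overstate the contribution of small, imprecise studies.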

2606.11933 2026-06-11 math.ST stat.ME New submission

Testing axial symmetry in multivariate location-scale linear regression

Šárka Hudecová, Miroslav Šiman

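AI summary: Proposes tests based on integrated rank scores for conditional axial symmetry in multivariate linear heteroscedastic regression, derives their asymptotic distributions, and validates them on simulated and real data.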

2606.11548 2026-06-11 stat.ME New submission

Estimating the local false discovery rate under an unknown symmetric null

Daniel Xiang, William Fithian, Nikolaos Ignatiadis, Jake A. Soloff, Asaf Weinstein

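AI summary: For a two-groups model whose null distribution is assumed only to be symmetric about zero, proposes estimating the local false discovery rate by logistic regression on a natural cubic spline basis, and shows that the estimate yields asymptotic lfdr control in multiple testing.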

Abstract

This paper is concerned with estimating the local false discovery rate (lfdr) in a two-groups model where the only assumption regarding the null distribution is symmetry about zero. Our motivation comes from the contemporary framework for multiple hypothesis testing, particularly relevant in variable selection problems, which transforms any user-specified scores into statistics whose null distributions are symmetric about zero, whereas enrichment to the right of zero is generally expected for the non-nulls. While modern methods such as the knockoff filter (Barber and Candes; 2015) are able to exploit the null property for controlling the false discovery rate (FDR), an arguably more appropriate goal is to target control of the local false discovery rate for the rejected hypotheses, as proposed in Soloff et al. (2024) where the standard two-groups model (known $f_0$ and independence) is analyzed. Here, we take a step in this direction and propose to estimate the lfdr by targeting the surrogate density ratio $f(-w)/f(w)$, for $w>0$, where $f$ is the marginal density in the aforementioned ``stripped-down'' two-groups model. We study several estimators and propose a logistic regression based method with natural cubic spline basis. We also show that any consistent estimator of this surrogate yields asymptotic lfdr control of the multiple testing procedure that thresholds the estimate at the nominal level.
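Why the ratio $f(-w)/f(w)$ can stand in for the lfdr is easiest to see in a toy case. The Python sketch below assumes a Gaussian two-groups mixture $f = \pi_0 N(0,1) + (1-\pi_0) N(\mu,1)$ with hypothetical values of $\pi_0$ and $\mu$ (neither taken from the paper) and compares the true lfdr, $\pi_0 f_0(w)/f(w)$, with the surrogate. Because $f_0$ is symmetric, $f(-w) \ge \pi_0 f_0(-w) = \pi_0 f_0(w)$, so the surrogate never falls below the true lfdr in this example.

```python
import math


def normal_pdf(x, mean=0.0):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def marginal(w, pi0, mu):
    """Two-groups marginal density f(w) = pi0*N(0,1) + (1-pi0)*N(mu,1)."""
    return pi0 * normal_pdf(w) + (1.0 - pi0) * normal_pdf(w, mu)


def true_lfdr(w, pi0, mu):
    """Oracle lfdr, which needs the null density f0 and pi0."""
    return pi0 * normal_pdf(w) / marginal(w, pi0, mu)


def surrogate(w, pi0, mu):
    """Surrogate density ratio f(-w)/f(w), defined for w > 0."""
    return marginal(-w, pi0, mu) / marginal(w, pi0, mu)


pi0, mu = 0.9, 3.0  # hypothetical null proportion and signal strength
for w in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(f"w={w:.1f}  lfdr={true_lfdr(w, pi0, mu):.4f}  "
          f"surrogate={surrogate(w, pi0, mu):.4f}")
```

The surrogate depends only on the marginal density $f$, which is what makes it estimable without knowing $f_0$; the sketch evaluates it from the known mixture rather than estimating it from data.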

2606.11421 2026-06-11 stat.ME math.ST stat.CO New submission

Second-Order Least Squares as a Special Case of the Polynomial Maximization Method

Serhii Zabolotnii

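AI summary: Shows that, under conditionally homoskedastic non-Gaussian errors, optimally weighted second-order least squares is equivalent to the degree-two generalized polynomial maximization method, and identifies an efficiency reserve at higher degrees.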

Comments: 26 pages, 3 figures, 7 tables. Includes Lean 4 formal verification and Monte Carlo simulation

Abstract

We prove that optimally weighted second-order least squares (SLS) and the degree-two generalized polynomial maximization method (PMM) are the same population estimating equation for linear regression with conditionally homoskedastic non-Gaussian errors: they choose the same optimal linear combination of the first two centered residual moments, solve one population normal system, share one influence function, and attain the common asymptotic variance $c_2g_2/N$ -- the ordinary-least-squares slope-variance factor $c_2$ scaled by the PMM variance-reduction coefficient $g_2=1-\gamma_3^2/(2+\gamma_4)$ (with $\gamma_3,\gamma_4$ the error skewness and excess kurtosis). Feasible plug-in implementations are therefore first-order equivalent, with only higher-order finite-sample differences. The identity is sharp: under heteroskedasticity the unconditional PMM body and the conditional SLS weighting separate, costing efficiency for symmetric errors and consistency for asymmetric errors. Beyond degree two, PMM holds an efficiency reserve that SLS cannot reach within its second-moment span. For symmetric platykurtic errors SLS collapses to ordinary least squares for the slope, while degree-three PMM exploits kurtosis information outside the SLS moment span through a closed-form coefficient $g_3$; for canonical asymmetric laws this reserve is $30$--$50\%$ within the degree-three polynomial moment class. The Lean 4 development machine-checks the degree-specific algebraic core -- the closed forms for $g_2$ and $g_3$, the $g_2\le1$ result, the design cancellations, and the symmetric collapse -- while the general monotonicity $g_{S+1}\le g_S\le1$ is proved analytically by nesting. A Monte Carlo study illustrates the equivalence, the reserve, and the heteroskedastic boundary at finite samples.
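The variance-reduction coefficient $g_2 = 1-\gamma_3^2/(2+\gamma_4)$ is simple enough to evaluate directly. The Python sketch below computes it for three textbook error laws, using their standard skewness and excess kurtosis values; the function name and the choice of examples are illustrative and not taken from the paper.

```python
def pmm_g2(skewness: float, excess_kurtosis: float) -> float:
    """Degree-two coefficient g2 = 1 - gamma3^2 / (2 + gamma4)."""
    denom = 2.0 + excess_kurtosis
    if denom <= 0:
        raise ValueError("2 + excess kurtosis must be positive")
    return 1.0 - skewness ** 2 / denom


# (skewness, excess kurtosis) for standard distributions.
examples = {
    "Gaussian": (0.0, 0.0),
    "exponential": (2.0, 6.0),
    "uniform": (0.0, -1.2),
}

for name, (g3, g4) in examples.items():
    print(f"{name:12s} g2 = {pmm_g2(g3, g4):.3f}")
```

Any symmetric law has $\gamma_3 = 0$ and therefore $g_2 = 1$, which is consistent with the abstract's remark that SLS collapses to ordinary least squares for symmetric errors; skewed errors such as the exponential are where the degree-two gain appears.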

2606.10212 2026-06-11 math.ST stat.ML Updated version

Intrinsic Riemannian Cross-covariance for Manifold-valued Random Objects

Carlos Soto, Cheng Wang, Yujing Huang, Xiaoyu Chen

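AI summary: Proposes a Riemannian cross-covariance that maps local variations to a common tangent space via parallel transport, enabling estimation of second-order statistics for manifold-valued random objects; establishes its asymptotic properties and validates it on spheres, SPD manifolds, and heart valve shape data.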

Comments: 31 pages, 16 figures

Abstract

Covariance estimation yields a fundamental second-order statistic underlying representation learning, dimension reduction, and dependence modeling. While covariance has been well understood in Euclidean spaces, it is ill-defined for random objects residing on nonlinear Riemannian manifolds, which increasingly arise in modern machine learning applications involving shapes, symmetric positive definite (SPD) matrices, etc. This paper introduces an intrinsic Riemannian cross-covariance for manifold-valued random objects. Our approach defines covariance and correlation by transporting local variations to a common tangent space via parallel transport, yielding a second-order descriptor that is independent of arbitrary coordinate choices. We establish that the proposed covariance inherits desirable properties of its Euclidean counterparts and characterize its asymptotic behavior. Numerical studies on spheres and SPD manifolds, together with real-data experiments on heart valve shapes in Kendall's shape space, demonstrate the effectiveness of our estimators and verify the stated properties. Our results position the Riemannian covariance as a fundamental tool for second-order learning and analysis in non-Euclidean representation spaces.
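The transport step can be made concrete on the unit sphere, where the log map and parallel transport have closed forms. The Python sketch below is a minimal illustration using the standard sphere formulas; it shows only how a tangent vector at one point is moved to the tangent space at another, and does not reproduce the paper's cross-covariance definition, which the abstract does not spell out. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing towards q.

    Assumes p and q are unit vectors and not antipodal.
    """
    c = max(-1.0, min(1.0, dot(p, q)))
    theta = math.acos(c)  # geodesic distance between p and q
    if theta < 1e-12:
        return [0.0] * len(p)
    perp = [qi - c * pi for pi, qi in zip(p, q)]  # part of q orthogonal to p
    norm = math.sqrt(dot(perp, perp))
    return [theta * x / norm for x in perp]


def parallel_transport(p, q, v):
    """Move tangent vector v at p to q along the minimising geodesic."""
    u = sphere_log(p, q)
    d2 = dot(u, u)
    if d2 < 1e-24:
        return list(v)
    w = sphere_log(q, p)
    coef = dot(u, v) / d2
    return [vi - coef * (ui + wi) for vi, ui, wi in zip(v, u, w)]


p = [0.0, 0.0, 1.0]  # north pole
q = [1.0, 0.0, 0.0]  # point on the equator
v = [0.3, 0.4, 0.0]  # tangent vector at p (orthogonal to p)
tv = parallel_transport(p, q, v)

print("transported vector:", [round(x, 6) for x in tv])
print("tangent at q:", abs(dot(q, tv)) < 1e-12)
print("norm preserved:", abs(math.sqrt(dot(tv, tv)) - math.sqrt(dot(v, v))) < 1e-12)
```

Once variations attached to different base points have been transported into one tangent space, they live in a common vector space, so products between them are well defined.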

2606.01854 2026-06-11 stat.ME Updated version

A Uniform Improvement of the Benjamini-Hochberg Procedure via e-Closure

Jelle Goeman

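AI summary: Proposes the closed BH procedure, a uniform improvement of the BH procedure based on the e-closure principle: under the same assumptions it never rejects fewer hypotheses and gains power, especially when the number of false null hypotheses is large.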

2605.21641 2026-06-11 stat.ME stat.CO Updated version

Stable direct estimation for GPLSIAMs using P-splines with dynamically updated boundaries

Danilo V. Silva, Gilberto A. Paula

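AI summary: Proposes a stable direct estimation method for GPLSIAMs that uses the model matrices and the penalized complete Fisher information matrix to dynamically update the boundaries of the single-index covariates within a unified iterative framework, enabling fast computation of the effective degrees of freedom and pointwise confidence bands.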

Abstract

Generalized partially linear single-index additive models (GPLSIAMs) have been increasingly applied across diverse areas due to their versatility in integrating functional flexibility with parametric dimension reduction while maintaining interpretability. However, the estimation presents severe computational challenges. This paper introduces a novel stable method that uses the model matrix for each single-index effect, defined by its single-index coefficients, and the penalized complete Fisher information matrix to dynamically update the boundaries of the single-index covariates within a unified iterative framework. The derived model matrices enable the fast computation of the estimated effective degrees of freedom and pointwise confidence bands for the single-index effects. The smoothing parameter updates are integrated into the iterative process via the generalized Fellner-Schall method, which recycles the derived matrix decompositions, thereby providing an efficient approximation to the global penalized optimization problem. Simulation studies with moderate sample sizes under non-Gaussian distributions confirm the empirical consistency of the estimation across multiple scenarios. Notably, the proposed approach remains stable where state-of-the-art competitive methods fail to recover true single-index coefficients and nonlinear functions, and is 80.13 times faster than the usual two-step method in the most computationally intensive scenario. The modeling advantage is illustrated through an application to Capital Bike Sharing data, where we deal with a single-index interaction effect for each year, with distinct single-index coefficients, a complex structure that makes competitive methods inapplicable. The proposed method is implemented in R, with functions available for reproducibility and transparency in comparisons.

2603.22668 2026-06-11 math.ST stat.ME Updated version

Fixed-level calibration of the Cauchy combination test

Hirofumi Ota

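AI summary: Studies the asymptotic exactness of the Cauchy combination test at a fixed significance level, finds that the raw CCT is not exact at fixed level, and proposes the boundary-layer calibrated CCT (BL-CCT), which achieves asymptotic exactness by correcting the reference distribution rather than the statistic while retaining power under several alternatives.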

Comments: Added several related references, conducted power analyses and polished the proofs and the simulation section

Abstract

The Cauchy combination test (CCT) is widely used because it yields a closed-form combined $p$-value and is known to be asymptotically valid as the nominal level $\alpha\downarrow0$ under broad dependence structures. We study a different asymptotic question: whether the usual Cauchy cutoff remains accurate at an ordinary fixed level when the number $K$ of combined $p$-values grows under dependence. Under a canonical one-factor equicorrelated Gaussian copula model, we show that the raw CCT is generally not asymptotically exact at fixed $\alpha$. With fixed positive correlation, the statistic converges to a random latent-factor limit, so there is no universal fixed-level reference law. When the common correlation $\rho_K$ weakens with $K$, fixed-level behaviour is governed by the boundary-layer scale $s_K=\sqrt{\rho_K}(\log K)^{3/2}$, and the raw CCT is asymptotically exact if and only if $\rho_K(\log K)^3\to0$. Because the size distortion arises entirely from the reference law and not from the statistic, it can be corrected without modifying the test statistic itself. We propose the boundary-layer calibrated CCT (BL-CCT), which replaces the standard Cauchy reference by a one-parameter Gaussian-smoothed Cauchy family. Unlike recent variants that modify the test statistic, BL-CCT leaves the statistic unchanged and corrects only the reference law. BL-CCT is asymptotically exact under the weaker condition $\rho_K\log K\to0$ and provides a useful finite-$K$ approximation on bounded boundary layers. We also conduct several power analyses: although BL-CCT only raises the cutoff, it incurs no first-order power loss relative to the raw CCT on the exactness scale, under local dense, sparse, and dense Gaussian alternatives. Numerical experiments support the calibration theory.
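For orientation, the raw CCT that the abstract takes as its starting point can be written in a few lines. The Python sketch below uses the usual form of the statistic, $T=\sum_i w_i \tan\{(0.5-p_i)\pi\}$ with weights summing to one, and the standard Cauchy reference to obtain the combined $p$-value; equal weights are an assumption made here for simplicity. The BL-CCT correction is not implemented, because the abstract does not give the Gaussian-smoothed reference family explicitly.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Raw Cauchy combination test with a standard Cauchy reference.

    Returns the statistic T and the combined p-value 0.5 - atan(T)/pi.
    Weights default to 1/K (an assumption of this sketch).
    """
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values))
    return stat, 0.5 - math.atan(stat) / math.pi


# Hypothetical p-values: one strong signal among otherwise unremarkable tests.
stat, p_comb = cauchy_combination([0.001, 0.40, 0.65, 0.22, 0.81])
print(f"T = {stat:.3f}, combined p-value = {p_comb:.4f}")
```

The closed form is what makes the test attractive; the abstract's point is that the Cauchy cutoff used in the last line can be inaccurate at a fixed level when many dependent $p$-values are combined.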

2505.03649 2026-06-11 stat.ML cs.LG math.CO math.PR Updated version

Weighted Random Dot Product Graphs

Bernardo Marenco, Paola Bermolen, Marcelo Fiori, Federico Larroca, Gonzalo Mateos

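AI summary: Proposes the weighted random dot product graph (WRDPG) model, in which inner products of nodes' latent positions specify the higher-order moments of the edge-weight distribution, and provides statistical guarantees for a spectral-embedding estimator together with a generative framework.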

Comments: 30 pages, 12 figures, code to generate Figures 3 to 12 available at this https URL. Updated to match the published version

Abstract

Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the Random Dot Product Graph (RDPG) model to accommodate weighted graphs, markedly broadening the model's scope to scenarios where edges exhibit heterogeneous weight distributions. We propose a nonparametric weighted (W)RDPG model that assigns a sequence of latent positions to each node. Inner products of these nodal vectors specify the moments of their incident edge weights' distribution via moment-generating functions. In this way, and unlike prior art, the WRDPG can discriminate between weight distributions that share the same mean but differ in other higher-order moments. We derive statistical guarantees for an estimator of the nodes' latent positions adapted from the workhorse adjacency spectral embedding, establishing its consistency and asymptotic normality. We also contribute a generative framework that enables sampling of graphs that adhere to a (prescribed or data-fitted) WRDPG, facilitating, e.g., the analysis and testing of observed graph metrics using judicious reference distributions. The paper is organized to formalize the model's definition, the estimation (or nodal embedding) process and its guarantees, as well as the methodologies for generating weighted graphs, all complemented by illustrative and reproducible examples showcasing the WRDPG's effectiveness in various network analytic applications.

2603.02566 2026-06-11 stat.ME Updated version

Modeling double bounded data based on correlated gamma random variables

Roberto Vila, Felipe Quintino, Marcelo Bourguignon

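AI summary: For bounded data on the unit interval arising as ratios, proposes a new model that links correlated gamma variables through a copula, overcoming the limitations of the traditional independence assumption and allowing both positive and negative correlation; its flexibility and effectiveness are assessed through simulations and real economic data.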

Comments: 41 pages, 14 figures

Abstract

Many types of bounded data defined on the unit interval arise naturally as ratios of the form $X/(X + Y)$. In the existing literature, the main statistical models proposed for this type of bounded data are typically based on the assumption that the random variables $X$ and $Y$ are independent. However, this assumption is often unrealistic in practical applications, where $X$ and $Y$ tend to be correlated due to shared underlying mechanisms or common sources of variability. In this paper, we overcome such limitations and propose a model in which the marginal distributions of the two components are linked by a copula, leading to a more flexible and realistic representation of unit-interval data. In particular, in the proposed model, $X$ and $Y$ are dependent gamma random variables whose joint distribution is specified via Morgenstern's bivariate distribution, allowing for positive and negative correlations between the components. The mathematical properties and practical applications are rigorously investigated. The resulting distribution exhibits a wide range of shapes, accommodating different degrees of skewness and, for some parameter configurations, more complex density structures. A Monte Carlo simulation study is carried out that shows the good performance of the maximum likelihood estimator in several scenarios of parameter choices. The potential and limitations of efficient likelihood-based computations are also discussed. We evaluate the effectiveness of the new model and its estimates in modeling real-world datasets related to economics.
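A small simulation helps show how dependence between $X$ and $Y$ enters the ratio. The Python sketch below draws $(U,V)$ from the Farlie-Gumbel-Morgenstern copula $C(u,v)=uv\{1+\theta(1-u)(1-v)\}$ by inverting the conditional distribution, then maps them to the margins. To stay within the standard library it assumes exponential margins, that is, gamma margins with shape 1; the paper works with general gamma margins, and the parameter values and function names here are hypothetical.

```python
import math
import random


def fgm_pair(theta, rng):
    """Draw (u, v) from the FGM copula with dependence parameter theta in [-1, 1]."""
    u, w = rng.random(), rng.random()
    a = theta * (1.0 - 2.0 * u)
    # Solve the conditional CDF equation (1 + a)*v - a*v**2 = w for v in [0, 1].
    v = 2.0 * w / (1.0 + a + math.sqrt((1.0 + a) ** 2 - 4.0 * a * w))
    return u, v


def exp_quantile(u, rate):
    """Quantile of Exp(rate), i.e. a gamma variable with shape 1."""
    return -math.log(1.0 - u) / rate


def sample_ratio(theta, rate_x=1.0, rate_y=1.0, rng=random):
    """One draw of X / (X + Y) with FGM-dependent exponential X and Y."""
    u, v = fgm_pair(theta, rng)
    x, y = exp_quantile(u, rate_x), exp_quantile(v, rate_y)
    return x / (x + y)


rng = random.Random(0)
for theta in (-1.0, 0.0, 1.0):
    draws = [sample_ratio(theta, rng=rng) for _ in range(5000)]
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"theta={theta:+.1f}  mean={mean:.3f}  variance={var:.4f}")
```

Setting `theta = 0` recovers independent components; comparing the printed variances across `theta` gives a rough sense of how the sign of the dependence changes the spread of the ratio.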

2511.11862 2026-06-11 econ.EM math.ST stat.ME Updated version

Compound Selection Decisions: An Almost SURE Approach

Jiafeng Chen, Lihua Lei, Timothy Sudijono, Liyang Sun, Tian Xie

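AI summary: For compound selection problems in the Gaussian sequence model, proposes ASSURE, an almost unbiased SURE-based estimator of expected utility, which selects a decision rule by optimizing the estimated utility, and proves its asymptotic optimality.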

Comments: V2: Additional Results and Simulations. 110 pages. Comments welcome

Abstract

This paper proposes methods for producing compound selection decisions in a Gaussian sequence model. Given unknown, fixed parameters $\mu_{1:n}$ and known $\sigma_{1:n}$ with observations $Y_i \sim \textsf{N}(\mu_i, \sigma_i^2)$, the decision maker would like to select a subset of indices $S$ so as to maximize utility $\frac{1}{n}\sum_{i\in S} (\mu_i - K_i)$, for known costs $K_i$. Inspired by Stein's unbiased risk estimate (SURE), we introduce an almost unbiased estimator, called ASSURE, for the expected utility of a proposed decision rule. ASSURE allows a user to choose a welfare-maximizing rule from a pre-specified class by optimizing the estimated welfare, thereby producing selection decisions that borrow strength across noisy estimates. We show that ASSURE produces decision rules that are asymptotically no worse than the optimal but infeasible decision rule in the pre-specified class. We apply ASSURE to the selection of Census tracts for economic opportunity, the identification of discriminating firms, and the analysis of $p$-value decision procedures in A/B testing.
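The selection problem itself is easy to simulate, which clarifies what an estimator of expected utility has to target. The Python sketch below generates a Gaussian sequence model with made-up $\mu_i$ and $\sigma_i$, applies a hypothetical one-parameter class of threshold rules (select $i$ when $Y_i > K_i + c\,\sigma_i$), and reports the realised utility $\frac{1}{n}\sum_{i\in S}(\mu_i - K_i)$ for several values of $c$. It does not implement ASSURE, whose construction the abstract does not give; the realised utility printed here uses the true $\mu_i$ and is therefore available only in simulation.

```python
import random


def utility(selected, mu, costs):
    """Realised utility (1/n) * sum over selected i of (mu_i - K_i)."""
    return sum(mu[i] - costs[i] for i in selected) / len(mu)


def threshold_rule(y, sigma, costs, c):
    """Hypothetical rule class: select i when Y_i exceeds K_i + c * sigma_i."""
    return [i for i in range(len(y)) if y[i] > costs[i] + c * sigma[i]]


rng = random.Random(0)
n = 2000
mu = [rng.gauss(0.0, 1.0) for _ in range(n)]       # unknown in practice
sigma = [rng.uniform(0.5, 2.0) for _ in range(n)]  # known noise levels
costs = [0.0] * n                                  # known costs K_i
y = [rng.gauss(m, s) for m, s in zip(mu, sigma)]   # observations

for c in (-1.0, 0.0, 0.5, 1.0, 2.0):
    chosen = threshold_rule(y, sigma, costs, c)
    print(f"c={c:+.1f}  selected={len(chosen):4d}  "
          f"realised utility={utility(chosen, mu, costs):+.4f}")
```

Choosing `c` to maximise the last column is the infeasible benchmark; the abstract's contribution is an estimate of that column computed from `y`, `sigma`, and `costs` alone.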

2506.00330 2026-06-11 physics.data-an cs.IT stat.ML Updated version

Accurate Estimation of Mutual Information in High Dimensional Data

Eslam Abdelaleem, K. Michael Martini, Ilya Nemenman

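AI summary: To address mutual information estimation in high-dimensional, undersampled regimes, proposes neural estimators built on low-dimensional latent representations, combined with statistical consistency checks, bias correction, and confidence intervals, and introduces the VSIB family of probabilistic critics; reports reliable estimates on synthetic and real image data.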

Comments: 15 pages main text, 21 pages SI, 12 Figs overall

Abstract

Mutual information (MI) quantifies statistical dependence between variables and is widely used across scientific disciplines, yet accurate estimation from finite data remains notoriously difficult. Common approaches fail in high-dimensional, undersampled regimes ($N \lesssim K$) typical of modern experiments, and no accepted tests exist to detect when neural network-based estimators fail, making them effectively unusable as scientific instruments. We show that neural MI estimators can be made reliable when the statistical dependencies admit a low-dimensional latent representation. Sample complexity is then governed by the latent dimensionality $K_Z \ll K$ rather than the ambient dimension -- a regime shift we confirm empirically and ground theoretically via random matrix theory. Building on this insight, we develop a practical protocol that provides neural estimators with explicit statistical consistency checks, bias correction, and confidence intervals. We additionally introduce a new class of probabilistic critics (the VSIB family) that substantially reduce bias and variance at higher MI values where standard estimators break down. We validate the protocol on synthetic benchmarks ($K=500$, $N$ as low as $256$), on the standard 40-dataset benchmark suite of Czyz et al. (2023), on noisy MNIST ($K=784$), and on CIFAR-10/100 ($K=3072$) with a ResNet-20 backbone. Our protocol consistently matches or exceeds existing methods while being the only approach to report confidence intervals and flag unreliable estimates, achieving reliable MI detection well below the ambient pixel dimension on real images.

2509.10817 2026-06-11 math.ST stat.ME Updated version

Conditional Independence Testing Using Exchangeable Pairs

Bilol Banerjee

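AI summary: Proposes a conditional independence test based on exchangeable pairs that recasts the problem as a two-sample test, measures departures with an energy distance, and establishes consistency and the optimal detection rate.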

Abstract

This article considers the problem of testing conditional independence between two random vectors $\bm X$ and $\bm Y$ given a confounding random vector $\bm Z$. An exchangeable-pairs framework is introduced through which the conditional independence testing problem is reformulated as a two-sample testing problem. The framework is motivated by ideas from the model-X literature and is based on a fundamental exchangeability property that holds under the null hypothesis of conditional independence. An energy-distance/maximum mean discrepancy type measure is employed on the resulting exchangeable pairs to quantify departures from conditional independence. A consistent estimator of the proposed discrepancy measure is constructed and its theoretical properties are established under general assumptions. A conditional independence test is then developed using this estimator as a test statistic and is calibrated through a suitable resampling procedure. It is shown that the proposed test is consistent against fixed alternatives, possesses nontrivial asymptotic power against local contiguous alternatives, attains the minimax separation rate for detecting alternatives characterized by the proposed discrepancy measure, and remains consistent when the data dimension diverges with the sample size. The effect of estimating the conditional distribution used to generate the exchangeable pairs is also investigated, and conditions under which validity and power properties are preserved are established. Extensive simulation studies demonstrate that the proposed procedure performs competitively with some state-of-the-art methods.
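The two-sample building block mentioned in the abstract can be sketched in its simplest form. The Python code below computes the plug-in estimate of the energy distance $2\,E|X-Y| - E|X-X'| - E|Y-Y'|$ between two univariate samples; the paper applies a discrepancy of this type to exchangeable pairs of random vectors, so this one-dimensional version with made-up data is an illustration of the ingredient, not the proposed test statistic.

```python
import random


def mean_abs_diff(xs, ys):
    """Average of |x - y| over all pairs (x, y)."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Plug-in (V-statistic) estimate of 2E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (2.0 * mean_abs_diff(xs, ys)
            - mean_abs_diff(xs, xs)
            - mean_abs_diff(ys, ys))


rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(200)]

print(f"same distribution:    {energy_distance(same_a, same_b):.4f}")
print(f"shifted distribution: {energy_distance(same_a, shifted):.4f}")
```

In a test, a statistic like this would be compared against a resampling distribution rather than read off directly, which is the calibration step the abstract refers to.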

2405.01651 2026-06-11 stat.ME Updated version

Confidence regions for a persistence diagram of a single image with one or more loops

Susan Glenn, Jessi Cisewski-Kehe, Jun Zhu, William M. Bement

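AI summary: Proposes a TDA-based method to estimate the underlying structure in a single image and quantify its uncertainty: the image is partitioned into background and damaged cell regions, and confidence regions are built in the space of persistence diagrams to correct the bias of traditional TDA.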

Comments: 30 pages, 8 figures

Abstract

Topological data analysis (TDA) uses persistent homology to quantify loops and higher-dimensional holes in data, making it particularly relevant for examining the characteristics of images of cells in the field of cell biology. In the context of a cell injury, as time progresses, a wound in the form of a ring emerges in the cell image and then gradually vanishes. Performing statistical inference on this ring-like pattern in a single image is challenging due to the absence of repeated samples. In this paper, we develop a novel framework leveraging TDA to estimate underlying structures within individual images and quantify associated uncertainties through confidence regions. Our proposed method partitions the image into the background and the damaged cell regions. Then pixels within the affected cell region are used to establish confidence regions in the space of persistence diagrams (topological summary statistics). The method establishes estimates on the persistence diagrams which correct the bias of traditional TDA approaches. A simulation study is conducted to evaluate the coverage probabilities of the proposed confidence regions in comparison to an alternative approach that is also proposed in this paper. We also illustrate our methodology with a real-world example from cell repair.

2606.12305 2026-06-11 stat.ME New submission

Bayesian nonparametric Mallows model for clustering preference data

Lorenzo Zuccato, Veronica Vinciotti, Valeria Vitelli

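AI summary: Proposes a Bayesian nonparametric Mallows model based on a Dirichlet process mixture, which infers the number of clusters and the cluster allocation jointly; implemented in the R package BayesMallows and validated on simulated and real data.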

Comments: 21 pages (main text), 28 pages including supplementary material. Submitted for peer review

Abstract

Preference learning refers to the learning of latent patterns from ranking and preference data of different kinds. Typical aims of preference learning are to infer a shared consensus ranking, to learn individual-level preferences, and to perform unsupervised clustering. The Mallows model is among the few approaches that can achieve all these objectives jointly. Previous work has developed computationally tractable methods for Bayesian inference based on an MCMC Metropolis-Hastings scheme, where clustering is performed via a finite mixture of Mallows models. Inference on the number of clusters is then conducted a posteriori. Here we propose a Bayesian nonparametric Mallows model, based on a Dirichlet process mixture model. This allows joint inference on the number of non-empty clusters and on the clustering allocation, as well as posterior inference on cluster-specific parameters. The implementation of the proposed sampling algorithm is integrated into the existing R package BayesMallows, which also supports data in the form of incomplete rankings and pairwise comparisons. Simulated data show good performance of the nonparametric model compared to a finite mixture model in terms of recovery of the correct number of clusters, while empirical data on movie ratings show the model's effectiveness in providing personalized movie recommendations on discarded ratings.

2606.12296 2026-06-11 stat.ME New submission

Bayesian Triangulation Splines: Spatial Adaptation on Irregular Domains

贝叶斯三角剖分样条：不规则域上的空间自适应

Sihyeon Pyeon, Sunwoo Lim, Seonghyun Jeong

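AI summary: Proposes Bayesian triangulation splines, which use constrained Delaunay triangulations to handle irregular domain boundaries and heterogeneous smoothness, achieving spatial adaptation; the method is shown to attain the optimal posterior contraction rate and an oracle property.

AI Chinese abstract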

针对二维非矩形域的传统非参数回归方法常常忽略域几何结构，允许跨边界平滑。在空间和地质统计应用中，这一假设通常无效，因为域边界通常约束观测之间的相互作用。适应空间变化的平滑度也比单变量设置更具挑战性，大多数现有方法未能充分捕捉目标函数的局部结构。为了解决这些问题，我们提出了贝叶斯三角剖分样条，该方法在多边形域上构造局部自适应样条。该方法采用约束Delaunay三角剖分来尊重边界几何并适应异质性平滑。精心设计的先验进一步提高了经验性能。在全局Sobolev平滑假设下，我们证明了所提方法实现了最优后验收缩率，并适应未知平滑度。我们还表明，该方法在实现非均匀或局部变化结构特征的Oracle率方面表现出理想的空间适应性。至关重要的是，这种Oracle保证并非特定于约束Delaunay三角剖分，而是适用于任何满足弱形状正则条件的三角剖分。模拟研究证实，所提方法通过实现更高的估计精度同时保持低模型复杂度，优于现有方法。

English abstract

Conventional nonparametric regression methods for two-dimensional non-rectangular domains often overlook domain geometry and allow smoothing across boundaries. In spatial and geostatistical applications, this assumption is frequently invalid because domain boundaries typically constrain interactions among observations. Accommodating spatially varying smoothness is also substantially more challenging than in the univariate setting, and most existing methods do not adequately capture this local structure of the target function. To address these challenges, we propose Bayesian triangulation splines, which construct locally adaptive splines over a polygonal domain. The method employs constrained Delaunay triangulations to respect boundary geometry and adapt to heterogeneous smoothness. A carefully designed prior further improves empirical performance. Under a global Sobolev smoothness assumption, we show that the proposed method achieves the optimal posterior contraction rate and adapts to unknown smoothness. We also show that the method exhibits ideal spatial adaptation in the sense that it achieves the oracle rate for inhomogeneous or locally varying structural features. Crucially, this oracle guarantee is not specific to constrained Delaunay triangulations, but holds over any triangulation satisfying weak shape-regularity conditions. Simulation studies confirm that the proposed method outperforms existing approaches by achieving higher estimation accuracy while maintaining low model complexity.

2606.12164 2026-06-11 stat.ME New submission

Bayesian Effect Selection for Additive Quantile Regression with an Application to Air Pollution Thresholds

加性分位数回归的贝叶斯效应选择及其在空气污染阈值中的应用

Nadja Klein, Aaron Wei Qi Lee, Jorge Mateu

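AI summary: Proposes a Bayesian effect selection method that orthogonally decomposes each additive effect into linear and nonlinear parts via a Demmler-Reinsch basis expansion and selects them with spike-and-slab priors; applied to air pollution data from Madrid, it reveals the drivers of extreme NO2 concentrations.

Comments: arXiv admin note: substantial text overlap with arXiv:2105.10890

AI Chinese abstract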

空气污染监管限值通常以浓度阈值超标来定义，这些阈值自然与污染物分布的条件分位数相关，因此直接关系到严重污染事件的评估。同时，不仅要确定协变量是否影响空气污染，还要确定这种影响是线性、非线性还是两者兼有。我们通过开发加性分位数回归的贝叶斯效应选择方法来解决这些问题。虽然惩罚样条的常用混合模型表示（MMR）允许灵活的非线性效应，但它们不能提供线性和非线性效应成分的有意义分离。因此，我们采用Demmler-Reinsch基展开，将每个加性效应正交分解为线性和非线性部分，并从理论上证明两个效应成分可以一致估计。为了促进数据驱动的模型构建，我们提出贝叶斯效应选择，对与线性和非线性成分相关的标量重要性参数分别使用尖峰-板先验，并实现高效的Gibbs采样器。通过模拟研究，我们展示了该方法对非对称拉普拉斯工作似然引起的误设具有鲁棒性，并显示出相对于MMR的优越性能。在对西班牙马德里空气污染数据的详细分析中，我们强调了灵活建模极端二氧化氮（NO$_2$）浓度的附加价值，并揭示了阈值相关的污染水平受气候变量和交通相关空间结构的不同驱动。这些发现强调了需要先进的统计模型来支持短期决策，并帮助地方当局减轻或潜在防止NO$_2$浓度限值超标。

English abstract

Air pollution regulatory limits are typically defined in terms of exceedances of concentration thresholds which are naturally related to conditional quantiles of the pollutant distribution and are therefore of direct relevance for assessing severe pollution events. At the same time, it is important to determine not only whether a covariate affects air pollution but also whether this effect is linear, nonlinear, or both. We address these issues by developing a Bayesian effect selection approach for additive quantile regression. While commonly used mixed model representations (MMRs) of penalized splines allow for flexible nonlinear effects, they do not provide a meaningful separation of linear and nonlinear effect components. We therefore employ a Demmler-Reinsch basis expansion, which yields an orthogonal decomposition of each additive effect into linear and nonlinear parts and show theoretically that both effect components can be estimated consistently. To facilitate data-driven model building, we propose Bayesian effect selection with separate spike and slab priors on the scalar importance parameters associated with the linear and nonlinear components and implement an efficient Gibbs sampler. Through simulation studies, we demonstrate robustness to the misspecification induced by the employed asymmetric Laplace working likelihood and show superior performance relative to the MMR. In a detailed analysis of air pollution data in Madrid, Spain we highlight the added value of flexibly modeling extreme nitrogen dioxide (NO$_2$) concentrations and reveal that threshold-relevant pollution levels are driven differently by climatological variables and traffic-related spatial structure. These findings underline the need for advanced statistical models that support short-term decision-making and help local authorities mitigate, or potentially prevent, exceedances of NO$_2$ concentration limits.

2606.11876 2026-06-11 q-bio.QM cs.LG stat.ME New submission

Seeing Below the Limit of Detection: A Censored-Poisson Bayesian Latent-Growth Change-Point Detector (the Span Detector) for Serial ctDNA in HR+/HER2- Metastatic Breast Cancer

检测限以下：用于HR+/HER2-转移性乳腺癌连续ctDNA的删失泊松贝叶斯潜在增长变点检测器（Span检测器）

Aarchi Singh Thakur, Abhijoy Sarkar

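AI summary: Proposes the Span detector, a censored-Poisson Bayesian latent-growth change-point model that treats ctDNA non-detects as left-censored observations and detects an upward change in the per-variant detection rate with a sequential generalised-likelihood-ratio statistic. On a synthetic cohort, at a matched 10% false-alarm rate, it raises the fraction of progressions caught three months ahead from 11% to 25% in the indolent regime.

Comments: 9 pages, 4 figures, 2 tables. Code and synthetic data generator: this https URL

AI Chinese abstract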

循环肿瘤DNA（ctDNA）在影像学显示耐药性数月前就已携带证据，但最早证据存在于检测限（LoD）以下：新生亚克隆仅被间歇性检测到，产生微弱检测和非检测的闪烁序列。商业液体活检将每次抽取视为独立快照，并将非检测视为无信号。我们认为非检测是左删失观测，而随时间变化的非检测和微弱检测模式在单个值可信之前就携带了可操作的生长证据。我们引入Span，一种删失泊松贝叶斯潜在增长变点检测器，它对二元检测过程建模，为每个变异的检测率累积一个向上变点的序贯广义似然比统计量，并以校准的假警报控制发出竞争风险警报。Span没有学习权重，因此没有过拟合风险。在一线CDK4/6抑制剂联合内分泌治疗的HR+/HER2-转移性乳腺癌合成队列中，在匹配的10%假警报率下，Span将提前三个月捕获的即将进展比例大约翻倍（惰性出现：25% vs 快照的11%），具有可证伪的剂量反应：对惰性出现效果显著，对快速出现效果消失。值轨迹基线表现与快照相同，将增益归因于删失检测模型。生存主干在真实乳腺癌数据（GBSG-2，n=686；C指数0.67 vs 0.68）上与Cox基线匹配，在具有清洁生物标志物的真实纵向队列（PBC2，n=312）上，同一管道正确拒绝获胜，这是一个可证伪的边界测试，确认机制是特定于状态的。所有ctDNA轨迹均为合成数据。

English abstract

Circulating-tumour DNA (ctDNA) carries evidence of drug resistance months before imaging shows it, but the earliest evidence lives below the assay's limit of detection (LoD): a nascent subclone is detected only intermittently, producing a flickering sequence of faint detects and non-detects. Commercial liquid biopsies treat each draw as an independent snapshot and a non-detect as nothing. We argue a non-detect is a left-censored observation, and the pattern of non-detects and faint detects over time carries actionable evidence of growth before any single value is trustworthy. We introduce Span, a censored-Poisson Bayesian latent-growth change-point detector that models the binary detection process, accumulates a sequential generalised-likelihood-ratio statistic for an upward change-point in the per-variant detection rate, and raises a competing-risks alarm with calibrated false-alarm control. Span has no learned weights, so there is nothing to overfit. On a synthetic cohort of HR+/HER2- metastatic breast cancer on first-line CDK4/6-inhibitor plus endocrine therapy, at a matched 10% false-alarm rate, Span roughly doubles the fraction of impending progressions caught three months ahead (indolent regime: 25% vs 11% for the snapshot), with a falsifiable dose-response: large for indolent emergence, vanishing for fast emergence. A value-trajectory baseline performs identically to the snapshot, isolating the gain to the censored detection model. The survival backbone matches a Cox baseline on real breast-cancer data (GBSG-2, n=686; C-index 0.67 vs 0.68), and on a real longitudinal cohort with clean biomarkers (PBC2, n=312) the same pipeline correctly declines to win, a falsifiable boundary test confirming the mechanism is regime-specific. All ctDNA trajectories are synthetic.

2606.11624 2026-06-11 stat.ME New submission

The Triply-Randomized Negative Binomial Beta for Robust Regression and Conjugate Models of Bounded Support Data

三重随机负二项贝塔分布用于鲁棒回归和有界支持数据的共轭模型

Jimmy Lederman, Aaron Schein

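AI summary: Proposes the triply-randomized negative binomial beta distribution (TNBbeta), obtained by randomizing the parameters of a standard beta distribution, to address the beta's sensitivity to outliers, its inability to handle exact zeros, and its lack of conjugate priors; Pólya-gamma augmentation yields an efficient Gibbs sampler.

AI Chinese abstract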

贝塔分布是许多响应变量支持为$[0,1]$的回归问题中默认的似然函数选择，尽管它对异常值敏感、无法处理精确为零的观测值，并且缺乏闭式共轭先验。我们通过引入三重随机负二项贝塔分布（记为$\mathrm{TNBbeta}(p,\\,q,\\,\varepsilon)$）来解决这些缺陷，该分布由中位数$p$、浓度参数$q$和允许在$0$和$1$处具有正密度的边界参数$\varepsilon$参数化。TNBbeta通过随机化标准贝塔分布的参数（使用三个相依的负二项随机变量）得到，我们证明了每个随机变量的完全条件分布本身也是负二项分布。此外，将$p$和$q$与具有logit链接函数的高斯潜变量连接，通过Pólya-gamma增广得到闭式更新。这些性质共同为有界支持数据的回归模型提供了简单的辅助变量吉布斯采样器，在有效样本量每秒和留一预测方面通常优于标准贝塔回归方法，尤其是在存在异常值的情况下。在森林冠层覆盖度的案例研究中，我们证明了该框架可以轻松融入空间结构和精确零观测。总体而言，这项工作大大扩展了可高效拟合的$[0,1]$有界支持数据的贝叶斯模型类别。

English abstract

The beta distribution is the default choice of likelihood in many regression problems with a $[0,1]$-bounded support response despite its sensitivity to outliers, inability to accommodate exact zero observations, and a lack of closed-form conjugate priors. We address these shortcomings by introducing the triply-randomized negative binomial beta distribution, denoted $\mathrm{TNBbeta}(p,\,q,\,\varepsilon)$, parameterized by a median $p$, concentration parameter $q$, and boundary parameter $\varepsilon$ which permits positive density at $0$ and $1$. The TNBbeta arises by randomizing the parameters of a standard beta distribution with three dependent negative binomial random variables, each of whose complete conditional distribution we show is itself negative binomial. Moreover, connecting $p$ and $q$ to Gaussian latent variables with logit link functions yields closed-form updates via Pólya-gamma augmentation. Together, these properties yield simple auxiliary-variable Gibbs samplers for regression models of bounded-support data, which often outperform standard beta regression approaches in terms of effective sample size per second and held-out prediction, especially in the presence of outliers. In a case study of forest canopy cover, we demonstrate that this framework can easily incorporate spatial structure and exact zero observations. Overall, this work substantially expands the class of Bayesian models for $[0,1]$-bounded support data that can be fit efficiently.

2605.11340 2026-06-11 stat.ME Updated version

Hyperbolic Latent Space Models for Network Embedding: Model Specification and Bayesian Inference

双曲潜空间模型用于网络嵌入：模型规范与贝叶斯推断

Yiwei Gong, Anna L. Smith, Dena Asta, Catherine A. Calder

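AI summary: Proposes a hyperbolic latent space model with Bayesian inference to capture tree-like structure and heavy-tailed degree distributions in network embedding, emphasising the importance of the temperature parameter for network topology.

AI Chinese abstract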

许多现实世界网络表现出分层、树状结构和厚尾度分布，这些现象无法被传统网络数据统计模型轻易捕捉。本文基于统计物理的见解，提出具有双曲几何基础的连续潜空间模型，以概率方式将节点嵌入具有恒定负曲率的潜空间。然而，大多数统计实现简化了原始物理模型，忽略了控制潜距离到概率映射锐度的温度参数。本文认为这一省略是关键性的。我们证明温度是控制网络树状拓扑的根本参数，未能推断温度会削弱模型表达能力。我们正式提出一个具有未知可学习温度参数的贝叶斯双曲连续潜空间模型。然后开发了两种推断程序：用于严谨后验特征化的哈密顿蒙特卡罗方法和用于大规模网络的可扩展自编码变分贝叶斯算法。通过模拟和实际数据示例，我们证明在大多数情况下，本文模型在图重建任务中优于具有固定温度和错误指定欧几里得几何的模型，确认温度是复杂网络的关键且可推断的特征。

English abstract

Many real-world networks exhibit hierarchical, tree-like structure and heavy-tailed degree distributions, phenomena not readily captured by standard statistical models for network data. Extensions of the popular continuous latent space modeling framework have been proposed to accommodate such networks. Drawing on insights from statistical physics, continuous latent space models with underlying hyperbolic geometry have been proposed as a natural framework, probabilistically embedding nodes in a latent Riemannian manifold with constant negative curvature. Most statistical implementations, however, simplify the original physics-based model by omitting the "temperature parameter," which controls the sharpness of the latent distance-to-probability mapping. We argue this omission is critical. We demonstrate that temperature is the fundamental parameter governing a network's tree-like topology, and that failing to infer it weakens model expressiveness. We formalize a Bayesian hyperbolic continuous latent space model with an unknown, learnable temperature parameter. We then develop two inferential procedures: a Hamiltonian Monte Carlo approach for rigorous posterior characterization and a scalable auto-encoding variational Bayes algorithm for large-scale networks. Through simulation and real data examples, we show that our model outperforms models with fixed temperature and misspecified Euclidean geometries in graph reconstruction tasks in most settings, confirming temperature is a crucial and inferable feature of complex networks.
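To make the role of temperature concrete, here is a minimal sketch of a distance-to-probability mapping of the kind used in physics-style hyperbolic network models. It assumes polar coordinates on a disk of curvature $-1$ and a Fermi-Dirac link $p = 1/(1 + e^{(d - R)/(2T)})$; the exact parameterization in the paper may differ, and the names `hyperbolic_distance`, `edge_probability`, and `radius` are illustrative.

```python
import math

def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between two points given in polar coordinates, curvature -1."""
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(theta1 - theta2))
    return math.acosh(max(arg, 1.0))  # clamp guards against rounding below 1

def edge_probability(d, radius, temperature):
    """Assumed Fermi-Dirac link: 1 / (1 + exp((d - radius) / (2 * temperature)))."""
    x = (d - radius) / (2.0 * temperature)
    if x > 700.0:  # exp would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

d = hyperbolic_distance(2.0, 0.0, 3.0, math.pi / 2)
for temperature in (0.05, 0.5, 2.0):
    print(temperature, edge_probability(d, radius=4.0, temperature=temperature))
```

At low temperature the link approaches a hard threshold at `d = radius`; at high temperature it flattens, so edges depend less on latent distance. Fixing the temperature therefore fixes how sharply geometry determines the graph, which is the omission the abstract argues against.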

2603.27843 2026-06-11 math.ST stat.ME Updated version

Empirical Bayes Estimation and Inference via Smooth Nonparametric Maximum Likelihood

经验贝叶斯估计与推断：基于光滑非参数最大似然法

Taehyun Kim, Bodhisattva Sen

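AI summary: To address the discreteness of the nonparametric maximum likelihood estimator and its slow logarithmic deconvolution rate, introduces a Gaussian smoothing layer and proposes a smooth NPMLE that achieves a polynomial deconvolution rate, nearly parametric denoising performance, and consistent posterior estimation, and constructs optimal marginal coverage sets.

AI Chinese abstract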

基于非参数最大似然估计（NPMLE）的经验贝叶斯 $g$-建模方法一直是正态均值问题中大规模估计和推断的核心。然而，不确定性量化的理论保证仍然很少。一个关键障碍是NPMLE必然是离散的，这导致离散的后验可信集和缓慢的对数解卷积率。我们通过引入一个分层高斯平滑层来解决这两个限制，该平滑层将混合分布限制为高斯位置混合。我们的光滑NPMLE继承了经典NPMLE的优良性质：它可以通过凸优化计算，并实现近乎参数的降噪性能。此外，它实现了多项式解卷积率，在相应类别上是渐近极小极大的。我们的过程还导致估计的光滑后验以多项式率收敛到真实后验。进一步，我们刻画了在期望长度上最优的边际覆盖集，构造了这些集的插件估计量，并在覆盖概率和期望长度方面为估计集建立了理论保证。我们还将理论扩展到模型误设和异方差高斯观测的设置，并研究了所提分层模型的可识别性。

English abstract

The empirical Bayes $g$-modeling approach based on the nonparametric maximum likelihood estimator (NPMLE) has been central to large-scale estimation and inference in the normal means problem. However, theoretical guarantees for uncertainty quantification remain scarce. A key obstacle is that the NPMLE is necessarily discrete, which yields discrete posterior credible sets and a slow logarithmic deconvolution rate. We address both limitations by introducing a hierarchical Gaussian smoothing layer that restricts the mixing distribution to a Gaussian location mixture. Our smooth NPMLE inherits the favorable properties of the classical NPMLE: it is computable via convex optimization and achieves nearly parametric denoising performance. Moreover, it achieves a polynomial deconvolution rate that is asymptotically minimax over the corresponding class. Our procedure also leads to estimated smooth posteriors that converge to the true posteriors at a polynomial rate. Further, we characterize marginal coverage sets that are optimal in expected length, construct plug-in estimators of these sets, and establish theoretical guarantees for the estimated sets in terms of both coverage probability and expected length. We also extend the theory to settings with model misspecification and heteroscedastic Gaussian observations, and study identifiability of the proposed hierarchical model.
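The effect of restricting the prior to a Gaussian location mixture can be illustrated with a small denoising sketch. It assumes a prior $\sum_j w_j\, N(a_j, \tau^2)$ with made-up atoms, weights, and smoothing variance (in the paper these would come from the fitted smooth NPMLE, which is not reproduced here) and unit-variance Gaussian noise by default. The function name `posterior_mean` is hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def posterior_mean(x, atoms, weights, tau2, sigma2=1.0):
    """E[theta | x] when theta ~ sum_j w_j N(a_j, tau2) and x | theta ~ N(theta, sigma2)."""
    marg_var = sigma2 + tau2   # each component's marginal variance for x
    shrink = tau2 / marg_var   # within-component shrinkage factor toward x
    resp = [w * normal_pdf(x, a, marg_var) for a, w in zip(atoms, weights)]
    total = sum(resp)
    return sum(p / total * (a + shrink * (x - a)) for p, a in zip(resp, atoms))

# Illustrative values only, not estimates from any data set.
atoms, weights, tau2 = [-3.0, 0.0, 3.0], [0.2, 0.6, 0.2], 0.25
for x in (-2.0, 0.5, 4.0):
    print(x, posterior_mean(x, atoms, weights, tau2))
```

Because every component has positive variance, the resulting posterior for each observation is a continuous Gaussian mixture rather than a set of point masses, which is what makes smooth credible sets possible.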

2512.23581 2026-06-11 stat.ME Updated version

Profile Bayesian Optimization for Expensive Computer Experiments

面向昂贵计算机实验的轮廓贝叶斯优化

Courtney Kyger, James Fernandez, John A. Grunenwald, James Braun, Annie Booth

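AI summary: Proposes a new Bayesian optimization method that uses a two-stage acquisition scheme with deep and shallow Gaussian process surrogates to identify profile optima efficiently across the range of the control parameter, applied to diffuser design for a rotating detonation engine.

AI Chinese abstract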

我们提出了一种新颖的贝叶斯优化（BO）程序，旨在识别具有单个控制参数和多个干扰参数的确定性黑箱计算机模拟的“轮廓最优”。轮廓最优捕捉作为控制参数函数的最优响应值。我们的目标是在控制参数的整个合理范围内识别这些最优值。经典BO针对所有参数寻找单个最优值，不会探索整个控制参数范围。相反，我们开发了一种新颖的两阶段采集方案，以平衡控制参数上的探索和轮廓最优的利用，利用深度和浅层高斯过程代理来促进不确定性量化。我们的动机来自旋转爆震燃烧发动机中扩散器的计算机模拟，该模拟返回作为各种设计参数函数的通过扩散损失的能量。我们旨在识别作为扩散器长度函数的最低可能能量损失；理解这种关系将能够做出明智的设计选择。我们的“轮廓贝叶斯优化”程序在各种基准测试中优于传统BO和轮廓优化方法，并在我们的激励应用中证明比最先进的多目标优化更有效。

English abstract

We propose a novel Bayesian optimization (BO) procedure aimed at identifying the "profile optima" of a deterministic black-box computer simulation that has a single control parameter and multiple nuisance parameters. The profile optima capture the optimal response values as a function of the control parameter. Our objective is to identify these optima across the entire plausible range of the control parameter. Classic BO, which targets a single optimum over all parameters, does not explore the entire control parameter range. Instead, we develop a novel two-stage acquisition scheme to balance exploration across the control parameter and exploitation of the profile optima, leveraging deep and shallow Gaussian process surrogates to facilitate uncertainty quantification. We are motivated by a computer simulation of a diffuser in a rotating detonation combustion engine, which returns the energy lost through diffusion as a function of various design parameters. We aim to identify the lowest possible energy loss as a function of the diffuser's length; understanding this relationship will enable well-informed design choices. Our "profile Bayesian optimization" procedure outperforms traditional BO and profile optimization methods on a variety of benchmarks and proves effective in our motivating application against state-of-the-art multi-objective optimization.
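The target of the procedure, the profile optimum $f^\star(c) = \min_x f(c, x)$ for each control value $c$, can be illustrated by brute force on a cheap toy function. This sketch uses an exhaustive grid rather than Bayesian optimization or Gaussian process surrogates, and a single nuisance parameter instead of several; `profile_optima` and `toy` are hypothetical names.

```python
def profile_optima(f, control_grid, nuisance_grid):
    """For each control value c, return (c, min over x of f(c, x), minimising x)."""
    results = []
    for c in control_grid:
        best_x = min(nuisance_grid, key=lambda x: f(c, x))
        results.append((c, f(c, best_x), best_x))
    return results

def toy(c, x):
    """Toy response: the best nuisance setting moves with the control value."""
    return (x - c) ** 2 + 0.5 * c

grid = [i / 10 for i in range(11)]
for c, value, x_star in profile_optima(toy, grid, grid):
    print(c, value, x_star)
```

Classic optimization would report only the single best row of this table; the profile view keeps one optimum per control value, which is what an expensive simulator makes hard to obtain and what the paper's acquisition scheme is designed to find with few evaluations.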

2505.22587 2026-06-11 stat.ME Updated version

Bayesian Non-Parametric Inference for Lévy Measures in State-Space Models

状态空间模型中Lévy测度的贝叶斯非参数推断

Bill Z. Lin, Simon Godsill

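AI summary: Proposes a Bayesian nonparametric framework that uses the Independent Gamma-scaled Dirichlet Process (IGSDP) to infer the Lévy measures of subordinators and normal variance-mean processes in linear state-space models, with an identifiable parameterization and efficient MCMC algorithms.

AI Chinese abstract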

Lévy过程以其能够建模具有偏斜、重尾和不连续性的复杂动态而闻名，在各个领域的随机建模中发挥着关键作用。然而，大多数Lévy过程的推断，无论是在参数还是非参数设置中，仍然是一个重大挑战。在这项工作中，我们提出了一个新颖的贝叶斯非参数推断框架，用于在线性状态空间模型内推断子序和正态方差均值（NVM）过程的Lévy测度。引入了一种灵活随机测度——独立伽马缩放狄利克雷过程（IGSDP），其中著名的伽马过程是一个特例，从而为关于两个Lévy测度的推断提供了可处理的条件分布。我们进一步表明，在伽马过程特例中，可以实现超参数推断的共轭性。提供了NVM过程参数轮廓的显式表征，使得模型具有可识别的参数化，从而在后验推断中实现有效的马尔可夫链蒙特卡洛算法。该方法在合成数据和逐笔（高频）金融数据集上进行了演示。

English abstract

Lévy processes, known for their ability to model complex dynamics with skewness, heavy tails, and discontinuities, play a critical role in stochastic modeling across various domains. However, inference for most Lévy processes, whether in parametric or non-parametric settings, remains a significant challenge. In this work, we present a novel Bayesian non-parametric inference framework for inferring the Lévy measures of subordinators and normal variance-mean (NVM) processes within a linear state space model. A flexible random measure, the Independent Gamma-scaled Dirichlet Process (IGSDP), is introduced, for which the well-known Gamma process is a special case, leading to tractable conditional distributions for inference about both Lévy measures. We further show that in the Gamma process special case, conjugacy can be achieved for hyper-parameter inference. An explicit characterization of the parameter contour for NVM processes is provided, enabling an identifiable parameterization of the model for effective Markov Chain Monte Carlo algorithms in posterior inference. The method is demonstrated on both synthetic and tick-level (high-frequency) financial datasets.

2505.02653 2026-06-11 math.ST math.PR stat.ME Updated version

Hierarchical Random Measures without Tables

无表格的层次随机测度

Marta Catalano, Claudio Del Sole

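AI summary: Proposes a new prior for the concentration parameter of the hierarchical Dirichlet process that eliminates the latent table variables, yielding a quasi-conjugate posterior and efficient sampling algorithms, and extends the construction to a framework of normalized hierarchical random measures.

AI Chinese abstract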

层次狄利克雷过程是贝叶斯非参数多层次模型的基石。其生成模型可通过一组潜在变量描述，在流行的餐馆特许经营隐喻中通常称为表格。潜在表格简化了后验的表达，并允许实现吉布斯采样算法以近似抽取后验样本。然而，管理它们的分配可能变得计算昂贵，特别是随着数据集大小和层次数量的增加。在这项工作中，我们为层次狄利克雷过程的浓度参数确定了一个先验，该先验(i)诱导准共轭后验分布，并且(ii)消除了对表格的需求，导致后验更可解释的表达，同时具有可扩展和精确的算法来从中采样。值得注意的是，这种构造超越了狄利克雷过程，导致了一个定义归一化层次随机测度的新框架和一类从其后验采样的新算法。关键的分析工具是多元增量的独立性，即它们作为完全随机向量的表示。

English abstract

The hierarchical Dirichlet process is the cornerstone of Bayesian nonparametric multilevel models. Its generative model can be described through a set of latent variables, commonly referred to as tables within the popular restaurant franchise metaphor. The latent tables simplify the expression of the posterior and allow for the implementation of Gibbs sampling algorithms to approximately draw posterior samples. However, managing their assignments can become computationally expensive, especially as the size of the dataset and the number of levels increase. In this work, we identify a prior for the concentration parameter of the hierarchical Dirichlet process that (i) induces a quasi-conjugate posterior distribution, and (ii) removes the need for tables, leading to more interpretable expressions for the posterior, with both a scalable and an exact algorithm to sample from it. Remarkably, this construction extends beyond the Dirichlet process, leading to a new framework for defining normalized hierarchical random measures and a new class of algorithms to sample from their posteriors. The key analytical tool is the independence of multivariate increments, that is, their representation as completely random vectors.

2606.11715 2026-06-11 stat.ME New submission

Bracketing Relationships of Weighted Average Treatment Effects

加权平均处理效应的括号关系

Pengfei Tian, Fan Yang, Peng Ding

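AI summary: In the canonical observational-study setting of causal inference, proves that when the propensity score and the conditional average treatment effect satisfy a monotonic relationship, the overlap-weighted average treatment effect lies between the average treatment effects for the treated and for the controls; extends the result to weighted local average treatment effects and other weights, and recommends using the CP plot.

2606.11405 2026-06-11 stat.ME stat.AP New submission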

Bayesian Causal Machine Learning for Cure Models

治愈模型的贝叶斯因果机器学习

Antonio R. Linero, F. Javier Rubio, Piyali Basak

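AI summary: To address the different effects of a treatment on the cure probability and on the survival time of uncured patients in cure models, proposes BartCure, a Bayesian causal machine learning method that decomposes the causal effect on restricted mean survival time, and applies it to a breast cancer trial.

AI Chinese abstract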

在生存研究中，治疗可以通过不同机制使患者受益：治疗可能增加治愈的概率，或延迟未治愈患者的失败时间。量化哪种机制占主导地位，以及它是否在不同亚群中变化，具有临床重要性，但因果机器学习文献中针对此问题的研究有限。标准的因果生存学习器针对有限时间生存或受限平均生存时间，而许多治愈模型在未估计因果效应的情况下捕捉治愈结构。在这项工作中，我们在存在治愈亚群的情况下定义了有意义的因果效应，并引入了BartCure，一种用于估计这些效应的贝叶斯因果机器学习方法。我们推荐的因果效应将受限平均生存时间的因果效应分解为随机治愈和随机潜伏期成分，并将这些新效应与随机干预效应和主层中的因果效应联系起来。在模拟中，BartCure在估计平均效应方面具有竞争力，并且在保守地检测治疗效应异质性的方向方面特别有效。我们将BartCure应用于CALGB 40101乳腺癌试验，以估计平均和亚组因果效应，并识别治疗效应异质性。

English abstract

In survival studies, treatments can benefit patients through different mechanisms: a treatment may increase the probability of being cured or delay failure among patients who are not cured. Quantifying which mechanism is dominant, and whether it varies across subpopulations, is clinically important, yet there is limited work in the causal machine learning literature addressing this problem. Standard causal survival learners target finite-horizon survival or restricted mean survival time, while many cure models capture cure structures without estimating causal effects. In this work, we define meaningful causal effects in the presence of a cured subpopulation and introduce BartCure, a Bayesian causal machine learning approach for estimating them. The causal effects we recommend decompose the causal effect on restricted mean survival time into a stochastic cure and stochastic latency component, and we relate these new effects to both stochastic intervention effects and causal effects in principal strata. In simulations, BartCure is competitive for estimating average effects and is especially effective at conservatively detecting the direction of treatment-effect heterogeneity. We apply BartCure to estimate average and subgroup causal effects and to identify treatment effect heterogeneity in the CALGB 40101 breast cancer trial.

2411.10959 2026-06-11 econ.EM cs.LG math.ST stat.AP stat.ME stat.ML Updated version

Program Evaluation with Remotely Sensed Outcomes

利用遥感结果的程序评估

Ashesh Rambachan, Rahul Singh, Davide Viviano

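AI summary: Studies causal inference problems in experiments and quasi-experiments that arise when remotely sensed variables imperfectly measure economic outcomes, and proposes a method for nonparametric identification of causal parameters that combines experimental and observational data for $n^{-1/2}$ inference.

2403.16673 2026-06-11 stat.ME econ.EM Updated version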

Quasi-randomization tests for network interference: a random graph approach

网络干扰的准随机化检验：一种随机图方法

Supriya Tiwari, Pallavi Basu

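AI summary: Proposes treating the network as a random variable and using random-graph null models to construct the null distribution under no spillover effects, overcoming the computational difficulty of existing conditional randomization tests; the test is exactly valid in finite samples and substantially improves power.

2508.05859 2026-06-11 stat.ME Updated version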

Doubly robust integration of nonprobability and probability survey data

非概率与概率调查数据的双重稳健整合

Shaun R Seaman, Tommy Nyberg, Anne M Presanis

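AI summary: Proposes doubly robust estimators that integrate a nonprobability sample with probability survey data, extends them to domain estimation, combines them with estimators that use only the probability data, and provides variance formulae and an analysis of asymptotic efficiency.

Comments: 66 pages, 31 figures. The preprint v2 extends the paper with: domain estimation; a new Hajek-style version of the Kim-Haziza doubly robust estimator; and theory on the asymptotic relative efficiency of the combined estimators and a simulation study to assess the relative efficiency

AI Chinese abstract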

针对整合非概率样本的结果和协变量数据与概率调查的协变量数据，已提出用于估计总体均值（或患病率）的双重稳健估计器。这些估计器结合了逆概率加权估计与大规模插补。然而，如何将这些双重稳健估计器与仅使用概率调查结果数据的Horvitz-Thompson或Hajek估计器相结合的问题，仅受到有限关注。本文首先回顾了先前提出的仅使用非概率样本结果数据的双重稳健估计器。我们将这些估计器扩展至能够估计域（子总体）均值（或患病率），可能利用域外个体的数据以改进小域估计。然后，我们考虑如何将此双重稳健估计器与仅使用概率调查数据的Horvitz-Thompson或Hajek估计器相结合。我们描述了有效的组合估计器，并给出了其重复抽样方差以及这些方差估计量的公式。我们还研究了组合估计器相对于其两个分量估计器的渐近相对效率，并进行了模拟研究以评估它们在有限样本中的相对效率。这些相对效率取决于两个分量估计器的方差之比以及协变量对结果的预测能力。

English abstract

Doubly robust estimators for estimating the population mean (or prevalence) of an outcome have been proposed for integrating outcome and covariate data from a nonprobability sample with covariate data from a probability survey. These estimators combine inverse probability weighting estimation with mass imputation. However, the question of how to combine these doubly robust estimators with a Horvitz-Thompson or Hajek estimator that uses only outcome data from the probability survey has received only limited attention. In this paper, we first review previously proposed doubly robust estimators that use outcome data from only the nonprobability sample. We extend these estimators to enable estimation of domain (subpopulation) means (or prevalences), possibly using data from individuals outside the domain to improve estimation when the domain is small. We then consider how to combine this doubly robust estimator with a Horvitz-Thompson or Hajek estimator that uses only the probability survey data. We describe efficient combined estimators, and provide formulae for their repeated-sampling variances and for estimators of these variances. We also investigate the asymptotic relative efficiencies of the combined estimators compared to their two component estimators, and carry out a simulation study to assess their relative efficiencies in finite samples. These relative efficiencies depend on the ratio of the variances of the two component estimators and on how predictive the covariates are of the outcome.

2310.14983 2026-06-11 econ.EM math.ST stat.ME Updated version

Causal clustering: design of cluster experiments under network interference

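AI summary: Proposes magnitude-based global and local feature vectors for analysing interactions in multispecies spatial data, and demonstrates their ability to identify spatial heterogeneity and to classify outcomes on synthetic tumour microenvironment data and tissue microarray data from human colorectal cancer.

Comments: 32 pages, 24 figures

AI Chinese abstract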

多物种空间数据出现在许多应用中，其中不同实体之间的相互作用对系统行为至关重要，包括生物医学成像、地理空间分析和物种生态学。尽管它们很重要，但捕获这种相互作用的定量工具相对较少。在这项工作中，我们提出了基于量值的特征用于分析多物种空间数据。量值是有限度量空间的一个实值不变量，可以解释为有效点数，结合了空间配置和尺度。我们开发了全局和局部量值特征向量，并在合成肿瘤微环境数据以及人类结直肠癌样本的组织微阵列数据中展示了它们的实用性。在局部，该方法识别出不同的邻域类型并揭示空间异质性；在模型中，这包括与模拟的不同定性结果相关的径向模式，而在真实世界数据中，它反映了B细胞和T细胞群体之间三级淋巴结构样相互作用的重要性。在全局上，该方法恢复了合成数据中跨参数区域的长期模拟结果的已知分类，并提示CD4+ T细胞和CD163+巨噬细胞在区分有利的克罗恩样反应与不利的弥漫性免疫浸润患者中发挥重要作用。总之，这些结果表明基于量值的特征为多物种空间数据分析提供了强大而灵活的工具。

English abstract

Multispecies spatial data arise in many applications where interactions between different entities are central to system behaviour, including biomedical imaging, geospatial analysis, and species ecology. Despite their importance, relatively few quantitative tools exist to capture such interactions. In this work, we propose magnitude-based features for the analysis of multispecies spatial data. Magnitude is a real-valued invariant of finite metric spaces that can be interpreted as an effective number of points, incorporating both spatial configuration and scale. We develop global and local magnitude feature vectors and demonstrate their utility on synthetic tumour microenvironment data, and in tissue microarray data from human colorectal cancer samples. Locally, the method identifies distinct neighbourhood types and reveals spatial heterogeneity; in the model, this includes radial patterns associated with different qualitative outcomes of the simulations, while in the real-world data it reflects the importance of tertiary lymphoid structure-like interactions between B and T cell populations. Globally, the approach recovers known classifications of long-term simulation outcomes across parameter regimes in synthetic data, and suggests important roles for CD4+ T cells and CD163+ macrophages in distinguishing patients with favourable Crohn's like reactions from unfavourable diffuse immune infiltration. Together, these results suggest that magnitude-based features provide a powerful and flexible tool for the analysis of multispecies spatial data.

2606.11768 2026-06-11 stat.ME stat.AP New submission

Hierarchical excitatory processes for modelling event-time data in the presence of exogenous stimuli

外源刺激下事件时间数据建模的分层激发过程

Francesco Sanna Passino, Nicholas A. Heard, Jeffrey W. Brown, William N. Frost, Vince P. Lyzinski

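AI summary: Proposes the Hierarchical Excitatory Process (HEP) model, which superposes the excitation effects of exogenous stimuli through dynamically evolving kernels to model event-time data flexibly under repeated stimuli, and embeds it in a clustering framework to identify latent response patterns.

AI Chinese abstract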

我们引入了分层激发过程(HEP)，一种用于在重复外部刺激下观察到的事件时间数据的灵活点过程模型。所提出的框架将点过程的条件强度建模为外部刺激引起的激发效应的叠加，其特征由参数随时间动态演化的核函数刻画。这种分层结构使得能够跨重复刺激调节激发强度，提供了一种可解释的结构。我们为所提出的模型建立了基于似然的推断，并将HEP嵌入到基于模型的聚类框架中，以识别具有相似响应动态的潜在组。模拟研究证明了该模型恢复演化潜在模式的能力，而对海蛞蝓足神经节尖峰序列记录的应用展示了HEP如何能够在不同实验条件下表征重复刺激下神经元的刺激驱动兴奋性。

English abstract

We introduce the Hierarchical Excitatory Process (HEP), a flexible point process model for event-time data observed under repeated external stimuli. The proposed framework models the conditional intensity of a point process as a superposition of excitation effects induced by external stimuli, characterised by kernels with parameters dynamically evolving over time. This hierarchical construction enables modulation of excitation strength across repeated stimuli, providing an interpretable structure. We establish likelihood-based inference for the proposed model and embed HEP within a model-based clustering framework to identify latent groups sharing similar response dynamics. Simulation studies demonstrate the model's ability to recover evolving latent patterns, and an application to spike train recordings from the sea slug Aplysia pedal ganglion illustrates how HEPs are able to characterise stimulus-driven excitability of neurons across repeated stimulation under different experimental conditions.

2606.11746 2026-06-11 astro-ph.IM stat.ML New submission

Time Series Analysis in Machine Learning

机器学习中的时间序列分析

Antonio Pagliaro, Anna Anzalone

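AI summary: Reviews time series analysis from a machine learning perspective, covering classical statistical models and modern machine learning methods and emphasising principles that apply across domains.

Comments: Invited chapter for the edited book "Machine Learning Techniques for Astrophysics and Cosmology" (Eds. Cosimo Bambi, Vinay Kashyap, Swarnim Shashank, Naoki Yoshida, Springer Singapore, expected in 2026). Submitted version

AI Chinese abstract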

时间序列分析是机器学习的基本组成部分，尤其是在天体物理学和宇宙学中，时域数据丰富。本章从机器学习的视角对时间序列分析技术进行了教学性综述。我们涵盖了时间序列的基本概念（平稳性、自相关、季节性）、经典统计模型（自回归、移动平均、ARIMA、指数平滑、状态空间模型）以及现代机器学习方法。特别地，我们讨论了传统统计方法如何奠定基础，然后探索了用于时间序列的机器学习方法，包括基于特征的回归、基于树的集成方法、隐马尔可夫模型、高斯过程和深度学习模型（循环神经网络、卷积网络、变换器）。在整章中，我们通过来自多个领域（例如天文学、天气预报、金融）的示例进行说明，以强调共同原则。目标是使读者具备理论理解和实践背景，以便在其研究中应用机器学习技术进行时间序列分析。

English abstract

Time series analysis is a fundamental component of machine learning, especially in astrophysics and cosmology where temporal data abound. This chapter provides a pedagogical review of time series analysis techniques from a machine learning perspective. We cover the basic concepts of time series (stationarity, autocorrelation, seasonality), classical statistical models (autoregressive, moving average, ARIMA, exponential smoothing, state-space models), and modern machine learning approaches. In particular, we discuss how traditional statistical methods lay the groundwork, and then explore machine learning methods for time series, including feature-based regression, tree-based ensemble methods, hidden Markov models, Gaussian processes, and deep learning models (recurrent neural networks, convolutional networks, transformers). Throughout, we illustrate with examples drawn from multiple domains (e.g. astronomy, weather forecasting, finance) to emphasize common principles. The goal is to equip readers with both the theoretical understanding and practical context to apply machine learning techniques for time series analysis in their research.
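Two of the basic concepts listed in this abstract, autocorrelation and the autoregressive model, fit in a short self-contained sketch. It is not taken from the chapter: it simulates an AR(1) series with an arbitrary coefficient and recovers that coefficient from the lag-1 sample autocorrelation (the Yule-Walker estimate for AR(1)). The names `autocorrelation` and `simulate_ar1` are illustrative.

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

def simulate_ar1(phi, n, seed=0):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal noise e_t."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

series = simulate_ar1(phi=0.7, n=5000)
phi_hat = autocorrelation(series, lag=1)  # Yule-Walker estimate of phi for AR(1)
print(phi_hat)
print([round(autocorrelation(series, k), 2) for k in range(1, 6)])
```

For a stationary AR(1) process the autocorrelation decays geometrically with lag, roughly as `phi ** k`, so the printed sequence gives a quick visual check of the fitted coefficient.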

2605.27478 2026-06-11 stat.ML cs.LG math.PR Updated version

Triangular-Reference Schrödinger Bridges for Time Series Generation

三角参考薛定谔桥用于时间序列生成

Gabriele Bocchi

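AI summary: Proposes the triangular-reference Schrödinger bridge framework, which uses an intervalwise frozen, degenerate diffusion reference and a hierarchical latent volatility structure for conservative time series generation while preserving the entropy-minimization variational core.

AI Chinese abstract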

我们引入了用于时间序列的三角参考薛定谔桥（TR-SBTS），这是SBTS框架的一种保守扩展，其中布朗参考被替换为区间冻结的、可能退化的扩散参考，在潜在波动率水平的层次上呈三角形。该构造是在增广状态空间上的单一熵投影，变分约束在时间和潜在水平上联合施加，并通过相对熵的分解层次展开。SBTS的变分核心得以保留：熵最小化器是参考的h-变换，在每个冻结区间上，最优动力学在活跃协方差方向的仿射叶上具有对数梯度漂移公式，即使冻结协方差是秩亏的也成立。我们建立了冻结近似的稳定性以及相应正则化核估计量的收敛性。该构造通过一个有限维条件映射实现，该映射由三种互补的过去约简组成——块PCR摘要、由运行时冻结协方差累积量诱导的过去增量的参考感知马氏核，以及在同一参考度量下的过去窗口WLS漂移回归器——以及一个耦合的状态-协方差桥步骤，其中每个潜在水平为上一水平产生动态参考，并由协方差描述符总结；该构造在数值实验上进行了评估。

英文摘要

Schrödinger bridges for time series (SBTS) generate synthetic paths by projecting, in relative entropy, a Brownian reference onto the path laws that match the joint distribution of the data on the observation grid. The Brownian reference, however, fixes the quadratic variation of the generated paths, which is restrictive when stochastic volatility, correlated noise, or rank-deficient covariance structures must be reproduced. We introduce "Triangular-Reference Schrödinger Bridges for Time Series" (TR-SBTS), which keeps the entropy-projection backbone of SBTS but replaces the Brownian reference by a triangular, volatility-informed, intervalwise frozen reference on a state augmented with latent covariance descriptors. The construction remains a single entropy projection on the augmented state: the minimiser is the $h$-transform of the reference, and on each frozen interval the optimal drift has the logarithmic-gradient form $b^\star(t,x)=A\,\nabla\log H(t,x)$, intrinsic to the active covariance directions when the frozen covariance $A$ is degenerate. We prove stability of the frozen approximation and consistency of the associated regularised kernel estimators, describe a reference-aware Nadaraya--Watson implementation of the conditional next-increment law, and evaluate the construction on numerical experiments.

2601.14031 2026-06-11 stat.ML cs.LG 版本更新

Intermittent time series forecasting: local vs global models

间歇性时间序列预测：局部模型与全局模型

Stefano Damato, Nicolò Rubattu, Dario Azzimonti, Giorgio Corani

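AI summary: For intermittent time series forecasting, presents the first systematic comparison of probabilistic local models and global models (such as TiDE), finding that TiDE, a simple neural network architecture, beats local models in both accuracy and computational cost, and that the Tweedie distribution head gives the best estimates of the highest quantiles.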

Comments: Submitted to the Journal of the Operational Research Society

AI中文摘要

预测包含零值的间歇性时间序列是供应链中的一个关键挑战，因为库存策略需要概率预测来建立安全水平。间歇性时间序列通常使用局部模型进行预测，即对每个时间序列单独训练。近年来，基于大量时间序列训练的全局模型在时间序列预测中变得流行。全局模型通常基于神经网络或梯度提升树。我们进行了首次研究，比较了最先进的概率性局部模型和全局模型在间歇性时间序列上的表现。对于全局模型，我们考虑了三种适用于间歇性时间序列的不同分布头：负二项、障碍移位负二项和Tweedie。据我们所知，这是后两者首次与神经网络结合使用。我们在五个数据集上进行了实验，这些数据集总共包含超过40,000个真实世界的时间序列。在全局模型中，TiDE（一种简单的神经网络架构）取得了最佳精度；它还持续优于局部模型，并且计算需求更低。大型全局模型反而计算需求更高且精度更低。在分布头中，Tweedie提供了最高分位数的最佳估计。

英文摘要

Forecasting intermittent time series, which contain zeros, is a crucial challenge in supply chains as inventory policies require probabilistic forecasts to establish safety levels. Intermittent time series are commonly forecast using local models, trained individually on each time series. In the last years global models, trained on a large collection of time series, have become popular for time series forecasting. Global models are often based on neural networks or gradient boosted trees. We carry out the first study comparing state-of-the-art probabilistic local and global models on intermittent time series. For global models we consider three different distribution heads suitable for intermittent time series: negative binomial, hurdle-shifted negative binomial and Tweedie. To the best of our knowledge, this is the first use of the latter two with neural networks. We perform experiments on five datasets comprising overall more than 40'000 real-world time series. Among global models, TiDE, a simple neural network architecture, achieves the best accuracy; it also consistently outperforms local models and has lower computational requirements. Large global models are instead much more computationally demanding and less accurate. Among the distribution heads, the Tweedie provides the best estimates of the highest quantiles.

2411.12193 2026-06-11 stat.AP cs.LG stat.ML 版本更新

Hierarchical Probabilistic Conformal Prediction for Distributed Energy Resources Adoption

分布式能源采纳的分层概率保形预测

Wenbin Zhou, Shixiang Zhu

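AI summary: Addressing uncertainty and hierarchical grid structure in forecasting distributed energy resource adoption, proposes an uncertainty quantification framework built on a multivariate Hawkes process and split conformal prediction that keeps statistical validity under aggregation and outperforms baselines on Indianapolis data.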

AI中文摘要

分布式能源（DERs）的快速增长为电网管理带来了机遇和运营挑战。准确预测DER采纳对于主动基础设施规划至关重要，但DER增长的固有不确定性和空间差异使传统预测方法复杂化。此外，配电网的分层结构要求预测在电路和变电站层面均满足统计保证，这是可靠决策的非平凡要求。本文提出了一种新的DER采纳预测不确定性量化框架，确保在分层电网结构中的有效性。利用多元霍克斯过程建模DER采纳动态，并采用定制的分裂保形预测算法，我们引入了一种新的非一致性分数，在保持预测效率的同时，在聚合下保留统计保证。我们在温和条件下建立了理论有效性，并通过印第安纳州印第安纳波利斯的客户级太阳能电池板安装数据实证评估，表明我们的方法在预测准确性和不确定性校准方面始终优于现有基线。

英文摘要

The rapid growth of distributed energy resources (DERs) presents both opportunities and operational challenges for electric grid management. Accurately predicting DER adoption is critical for proactive infrastructure planning, but the inherent uncertainty and spatial disparity of DER growth complicate traditional forecasting approaches. Moreover, the hierarchical structure of distribution grids demands that predictions satisfy statistical guarantees at both the circuit and substation levels, a non-trivial requirement for reliable decision-making. In this paper, we propose a novel uncertainty quantification framework for DER adoption predictions that ensures validity across hierarchical grid structures. Leveraging a multivariate Hawkes process to model DER adoption dynamics and a tailored split conformal prediction algorithm, we introduce a new nonconformity score that preserves statistical guarantees under aggregation while maintaining prediction efficiency. We establish theoretical validity under mild conditions and demonstrate through empirical evaluation on customer-level solar panel installation data from Indianapolis, Indiana that our method consistently outperforms existing baselines in both predictive accuracy and uncertainty calibration.

2606.12021 2026-06-11 stat.ME 新提交

Adaptive spatial blocking for scalable clustering inference with applications to high-throughput spatial proteomics

自适应空间分块用于可扩展聚类推断及其在高通量空间蛋白质组学中的应用

Mingyu Go, Julia Wrobel, Hoseung Song

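AI summary: Proposes an adaptive spatial blocking algorithm that constructs local blocks satisfying point-count and shape constraints and uses an asymptotic normal approximation for fast clustering inference on large-scale point pattern data, balancing statistical power against computational efficiency.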

AI中文摘要

Ripley's K函数是一种广泛用于评估点模式聚类的空间汇总统计量。然而，现有的基于K的方法在处理大规模数据时计算成本高昂，特别是在高通量空间蛋白质组学中，因为它们依赖于图像中所有点的空间信息。为应对这一挑战，我们提出了一种计算高效的基于分块的测试框架，该框架从图像中提取不相交的局部块，并跨块聚合聚类证据。所提出的自适应空间分块算法构造满足点计数和形状约束的块，通过渐近正态近似实现可扩展的空间聚类推断和快速p值计算。数值研究表明，所提出的方法在统计功效和计算效率之间提供了良好的平衡。在健康人肠道空间蛋白质组学数据的应用中，我们的方法检测到浆细胞的强空间聚集以及浆细胞与巨噬细胞之间的共定位，同时在大图像上具有良好的可扩展性。

英文摘要

Ripley's K-function is a widely used spatial summary statistic for assessing clustering in point patterns. However, existing K-based methods can be computationally prohibitive for large-scale data, particularly in high-throughput spatial proteomics, because they rely on spatial information from all points in the image. To address this challenge, we propose a computationally efficient block-based testing framework that extracts disjoint local blocks from an image and aggregates clustering evidence across them. The proposed adaptive spatial blocking algorithm constructs blocks satisfying point-count and shape constraints, enabling scalable spatial clustering inference and fast p-value computation through an asymptotic normal approximation. Numerical studies demonstrate that the proposed method provides a favorable balance between statistical power and computational efficiency. In an application to healthy human intestine spatial proteomics data, our method detects strong spatial aggregation of plasma cells and colocalization between plasma cells and macrophages, while scaling favorably to large images.
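For orientation, the all-points baseline that the block-based framework avoids can be sketched as a naive Ripley's K estimate. This is a minimal illustration without edge correction, not the authors' method; the function name `ripley_k` and the sample points are hypothetical. Every ordered pair of points is visited, which is the quadratic cost the abstract refers to.

```python
import math

def ripley_k(points, area, r):
    """Naive Ripley's K estimate at radius r, without edge correction.

    K_hat(r) = area / (n * (n - 1)) * #{ordered pairs (i, j), i != j, d_ij <= r}
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close_pairs = 0
    # Visit every ordered pair: O(n^2) distance evaluations.
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))

# Hypothetical pattern in a unit-area window: two nearby points and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.9, 0.9)]
k_hat = ripley_k(points, area=1.0, r=0.2)

# Under complete spatial randomness, K(r) = pi * r^2; values above it suggest clustering.
csr_reference = math.pi * 0.2 ** 2
```

Comparing `k_hat` with `csr_reference` is the usual informal reading of the statistic; a formal test would also need a null distribution, which this sketch does not provide.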

2606.11487 2026-06-11 math.ST math.PR stat.ML 新提交

Unbiased Derivative Estimation for Stationary Mean of Parameterized Markov chains

参数化马尔可夫链平稳均值的无偏导数估计

Jeffrey Wang, Chang-han Rhee

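AI summary: Proposes an unbiased estimator for the gradient of the stationary mean of parameterized Markov chains that remains efficient under slow mixing, requires no prior knowledge of density functions, and applies to neural-network parameterizations.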

2606.11402 2026-06-11 stat.CO astro-ph.IM stat.ML 新提交

GraphGP: Scalable Gaussian Processes with Vecchia's Approximation

GraphGP: 基于Vecchia近似的可扩展高斯过程

Benjamin Dodge, Philipp Frank, Susan E. Clark

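AI summary: Proposes GraphGP, which combines Vecchia's approximation with GPU acceleration to scale Gaussian processes to nearly a billion parameters with linear time and memory complexity, handling arbitrary point distributions over a large dynamic range.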

Comments: Accepted to Conference on Physics and AI at Stanford University (PAI 2026)

AI中文摘要

高斯过程是建模连续场的强大工具，但其朴素的$\mathcal{O}(N^3)$计算成本和$\mathcal{O}(N^2)$内存需求常常限制其实际应用。Vecchia近似是一种针对平稳、衰减核的稀疏精度矩阵近似，它将每个点仅条件于其$k$个最近邻。我们提出GraphGP，一种用于Vecchia近似的GPU算法，可扩展到近十亿参数，具有线性时间和内存需求，并能处理大动态范围内的任意点分布。我们的关键贡献是：(1) 一种比特反转k-d树排序，允许高效邻居搜索同时最大化批处理并行性；(2) 一种可微的CUDA实现，比纯JAX基线显著更快且内存效率更高。GraphGP提供了推理所需的构建块，包括前向生成、逆应用、对数行列式和核参数导数。

英文摘要

Gaussian processes are a powerful tool for modeling continuous fields, but their naive $\mathcal{O}(N^3)$ computational cost and $\mathcal{O}(N^2)$ memory requirement often limit their practical use. Vecchia's approximation is a sparse precision matrix approximation for stationary, decaying kernels that conditions each point only on its $k$ nearest neighbors. We present GraphGP, a GPU algorithm for Vecchia's approximation that scales to nearly a billion parameters with linear time and memory requirements, handling arbitrary point distributions over a large dynamic range. Our key contributions are (1) a bit-reversed k-d tree ordering that allows efficient neighbor searches while also maximizing batch parallelism, and (2) a differentiable CUDA implementation, which is substantially faster and more memory efficient than our pure JAX baseline. GraphGP provides the building blocks for inference, including forward generation, inverse application, log-determinant, and kernel parameter derivatives.
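The conditioning idea behind Vecchia's approximation can be sketched in a few lines. This is a toy illustration under strong assumptions that are not taken from the paper: one-dimensional inputs, a zero-mean process, an exponential kernel, distinct points, and k = 1 (each point is conditioned on its single nearest predecessor in the given ordering). The names `exp_kernel` and `vecchia_loglik_k1` are hypothetical, and the brute-force neighbor search here is not linear-time, unlike the k-d tree ordering the abstract describes.

```python
import math

def exp_kernel(a, b, variance=1.0, length=1.0):
    """Stationary, decaying exponential kernel on 1-D inputs."""
    return variance * math.exp(-abs(a - b) / length)

def vecchia_loglik_k1(xs, ys, variance=1.0, length=1.0):
    """Vecchia log-likelihood with k = 1 for a zero-mean GP.

    The joint density is replaced by a product of univariate conditionals,
    each point conditioned on its nearest preceding point in the ordering.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        k_ii = exp_kernel(x, x, variance, length)
        if i == 0:
            mean, var = 0.0, k_ii  # first point: marginal distribution
        else:
            # Brute-force search for the nearest previously ordered point.
            j = min(range(i), key=lambda t: abs(xs[t] - x))
            k_ij = exp_kernel(x, xs[j], variance, length)
            k_jj = exp_kernel(xs[j], xs[j], variance, length)
            mean = k_ij / k_jj * ys[j]      # Gaussian conditional mean
            var = k_ii - k_ij ** 2 / k_jj   # Gaussian conditional variance
        total += -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)
    return total

# Hypothetical data: four ordered locations with observed values.
xs = [0.0, 0.5, 1.2, 2.0]
ys = [0.3, -0.2, 0.1, 0.4]
loglik = vecchia_loglik_k1(xs, ys)
```

With only two points the factorization is just the chain rule, so the result matches the exact bivariate Gaussian log-density; with more points it is an approximation whose quality depends on the kernel, the ordering, and k.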

2606.11347 2026-06-11 stat.ML cs.LG math.OC 新提交

Annealed Entropic Allocation for Ranking and Selection

退火熵分配用于排序与选择

Xin Fei, Juergen Branke

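AI summary: Proposes the Annealed Entropic Allocation framework, which replaces the non-smooth maximin large-deviation rate objective with a weighted log-sum-exp surrogate and adds a saddlepoint approximation to improve finite-budget discrimination; numerical experiments show competitive performance when several challengers are nearly tied.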

AI中文摘要

我们提出了退火熵分配，一种用于排序与选择中顺序预算分配的退火加权软最小化框架。核心思想是用加权log-sum-exp替代非光滑的极大极小大偏差率目标，该替代通过软最小化权重聚合特定候选对的得分，从而在多个候选几乎同时活跃时缓解硬切换。为了提升有限预算下的区分能力，我们引入了鞍点近似——一种从精细化的成对尾部渐近性导出的次指数修正。由于这些修正是次指数的，且平滑参数退火至零，该替代保持了与经典极大极小公式相同的一阶大偏差目标。我们证明了该替代一致收敛于硬最小值，软最小化权重集中于活跃候选，并且在固定权重下，诱导的目标分配映射在单纯形内部是连续的。在高斯和指数实例上的数值实验展示了竞争性能，尤其是在多个候选几乎持平时。

英文摘要

We propose Annealed Entropic Allocation, an annealed weighted soft-min framework for sequential budget allocation in ranking and selection. The central idea is to replace the non-smooth maximin large-deviation rate objective with a weighted log-sum-exp surrogate that aggregates challenger-specific pairwise scores through soft-min weights, mitigating hard switching when several challengers are nearly active. To improve finite-budget discrimination, we incorporate the saddlepoint approximation -- a sub-exponential correction derived from refined pairwise tail asymptotics. Because these corrections are sub-exponential and the smoothing parameter is annealed to zero, the surrogate preserves the same first-order large-deviation target as the classical maximin formulation. We show that the surrogate converges uniformly to the hard minimum, that the soft-min weights concentrate on the active challengers, and that, under fixed weights, the induced target allocation map is continuous on the simplex interior. Numerical experiments on Gaussian and exponential instances demonstrate competitive performance, especially when multiple challengers are nearly tied.
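The soft-min mechanism can be illustrated with a generic weighted log-sum-exp. This is a sketch of the smoothing idea only: it assumes normalized weights and a temperature `tau`, omits the saddlepoint correction, and is not claimed to be the authors' exact surrogate. The names `soft_min` and `soft_min_weights` and the scores are hypothetical. For weights summing to one, the value lies between the hard minimum and the hard minimum plus `tau * log(1 / w)`, where `w` is the weight on the minimizing entry, so annealing `tau` to zero recovers the hard minimum.

```python
import math

def soft_min(scores, weights, tau):
    """Weighted soft-min: -tau * log(sum_j w_j * exp(-s_j / tau)), computed stably."""
    m = min(scores)
    total = sum(w * math.exp(-(s - m) / tau) for s, w in zip(scores, weights))
    return m - tau * math.log(total)

def soft_min_weights(scores, weights, tau):
    """Normalized weights p_j proportional to w_j * exp(-s_j / tau)."""
    m = min(scores)
    raw = [w * math.exp(-(s - m) / tau) for s, w in zip(scores, weights)]
    z = sum(raw)
    return [r / z for r in raw]

# Hypothetical pairwise scores: two challengers nearly tied, one clearly worse.
scores = [1.0, 1.05, 3.0]
weights = [1.0 / 3.0] * 3

for tau in (1.0, 0.1, 0.001):
    value = soft_min(scores, weights, tau)
    shares = soft_min_weights(scores, weights, tau)
    print(tau, value, shares)
```

At moderate `tau` the two nearly tied challengers share the weight, which is the behavior that avoids hard switching; as `tau` shrinks the weight concentrates on the smallest score.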

2510.01861 2026-06-11 stat.ME stat.CO 版本更新

Compressed Bayesian Tensor Regression

压缩贝叶斯张量回归

Roberto Casarin, Radu Craiu, Qing Wang

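AI summary: To handle high dimensionality in tensor regression, proposes a generalized tensor random projection method that embeds high-dimensional covariates into low-dimensional subspaces, combined with a Bayesian inference framework and a low-rank parameter representation, giving efficient prediction at reduced computational cost.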

AI中文摘要

为了解决张量回归中常见的高维问题，我们引入了一种广义张量随机投影方法，该方法将高维张量值协变量嵌入低维子空间，同时最小化响应信息的损失。该方法灵活，允许逐张量、逐模式或组合随机投影作为特例。我们提供了一个贝叶斯推理框架，其特点是使用分层先验分布和参数的低秩表示。我们为随机投影的集中性质和贝叶斯推理的后验一致性提供了强有力的理论支持，并开发了一个高效的吉布斯采样器来对压缩数据进行推理。为了减轻随机投影引入的敏感性，采用了贝叶斯模型平均，并使用反向逻辑回归（reverse logistic regression）估计归一化常数。我们进行了广泛的模拟研究，以检查不同调节参数的影响。模拟结果表明，且实际数据应用也证实，与标准贝叶斯张量回归相比，压缩贝叶斯张量回归可以在显著降低计算成本的同时实现更好的样本外预测。

英文摘要

To address the common problem of high dimensionality in tensor regressions, we introduce a generalized tensor random projection method that embeds high-dimensional tensor-valued covariates into low-dimensional subspaces with minimal loss of information about the responses. The method is flexible, allowing for tensor-wise, mode-wise, or combined random projections as special cases. A Bayesian inference framework is provided featuring the use of a hierarchical prior distribution and a low-rank representation of the parameter. Strong theoretical support is provided for the concentration properties of the random projection and posterior consistency of the Bayesian inference. An efficient Gibbs sampler is developed to perform inference on the compressed data. To mitigate the sensitivity introduced by random projections, Bayesian model averaging is employed, with normalising constants estimated using reverse logistic regression. An extensive simulation study is conducted to examine the effects of different tuning parameters. Simulations indicate, and the real data application confirms, that compressed Bayesian tensor regression can achieve better out-of-sample prediction while significantly reducing computational cost compared to standard Bayesian tensor regression.

2606.12058 2026-06-11 stat.ML cond-mat.dis-nn cs.LG 新提交

Phase Transitions in Attention: A Bayesian Theory of Copy Head Emergence

注意力中的相变：复制头涌现的贝叶斯理论

Itay Lavie, Kirsten Fischer, Andrey Lekov, Frederic Van Maele, Zohar Ringel, Moritz Helias

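AI summary: By analyzing the training of a single-layer softmax attention network on a copy task, proposes a Bayesian theory showing that the posterior distribution of the attention matrix undergoes a phase transition, and, in comparison with linear attention, finds that softmax attention exhibits a first-order phase transition.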

AI中文摘要

Aitchison单纯形的树结构正交分解

Daisuke Yamada, Qijun Zhang, Travis Pence, Barbara B. Bendlin, Federico Rey, Vikas Singh

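AI summary: Proposes PolyILR, which uses tree structure to give an orthonormal decomposition of compositional data, yields stable and interpretable features on microbiome and single-cell data, and establishes a theoretical connection to softmax classifiers.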

Comments: Accepted at ICML 2026. To appear in PMLR vol. 306

AI中文摘要

成分数据——编码相对比例的向量——出现在包括生态学、地球化学和基因组学在内的科学领域。这些数据中的特征通常具有已知的层次结构（例如，分类学、系统发育、本体论），但现有方法要么忽略这种结构，要么丢弃内在的Aitchison几何，要么设计用于二叉树，要么产生不完整的坐标系。我们描述了PolyILR，一种与任何树拓扑对齐的Aitchison切空间的正交分解。我们的构造在每个内部节点定义了一个加权局部几何，捕获完整的分支结构，然后将这些提升到一个全局正交基，其中每个坐标对应一个特定的树位置。在微生物组和单细胞基准测试中，PolyILR产生稳定、可解释的特征，并支持多尺度树分辨率下的推理。我们还建立了与softmax分类器的新理论联系，暗示了在概率建模中的可能应用。

英文摘要

Compositional data -- vectors encoding relative proportions -- arise across scientific domains, including ecology, geochemistry, and genomics. The features in these data often come with known hierarchical structure (e.g., taxonomies, phylogenies, ontologies), yet existing methods either ignore this structure, discard the intrinsic Aitchison geometry, are designed for binary trees, or yield incomplete coordinate systems. We describe PolyILR, a canonical orthonormal decomposition of the Aitchison tangent space aligned with any tree topology. Our construction defines a weighted local geometry at each internal node capturing full branching structure, then lifts these to a global orthonormal basis where every coordinate corresponds to a specific tree location. On microbiome and single-cell benchmarks, PolyILR yields stable, interpretable features and enables inference at multiscale tree resolution. We also establish a novel theoretical connection to softmax classifiers, suggesting possible applications to probabilistic modeling.

2606.11574 2026-06-11 cs.LG cond-mat.mtrl-sci physics.chem-ph stat.ML 新提交

Range-Aware Bayesian Optimization for Discovering Diverse Designs within Target Property Windows

范围感知贝叶斯优化用于在目标属性窗口内发现多样化设计

Shengli Jiang, Jason Wu, Charles M. Schroeder, Michael A. Webb

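Affiliation: Department of Chemical and Biological Engineering, Princeton University

AI summary: Proposes a range-aware Bayesian optimization framework whose acquisition function directly scores the posterior probability that a candidate satisfies a target range; on benchmark tasks and practical case studies it recovers more diverse valid designs than standard methods.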

Comments: 64 pages, 6 main text figures, 17 supporting figures, 6 supporting tables

AI中文摘要

在许多材料和产品设计问题中，理想的候选物表现出可接受范围内的属性，而非达到单一最优值。恢复满足此类规格的多个不同解也具有实际价值，因为某些候选物可能因成本、可加工性或鲁棒性等原因而更受青睐，而这些因素难以直接编码到目标函数中。在此，我们开发了一个范围感知贝叶斯优化（BO）框架，其中采集函数直接评分候选解满足目标范围的后验概率。该框架自然扩展到在共享候选空间上并行追求多个不同规格。在基准任务中，范围感知采集一致地比标准BO基线和最近的目标寻求方法恢复更大且更多样化的有效设计集。其效用进一步在两个实际动机的设计案例研究中得到证明，涉及优化聚合物合成的反应条件和发现指定光学吸收带的序列定义低聚物，并得到量子化学计算的支持。这些结果表明，范围感知BO可以为规格驱动设计提供实用且样本高效的基础，特别是当设计灵活性和解多样性是重要考虑因素时。

英文摘要

In many materials and product design problems, desirable candidates exhibit properties that fall within an acceptable range rather than achieve a single optimum. Recovering multiple, distinct solutions that satisfy such specifications is also practically valuable, as some candidates may be preferred for reasons of cost, processability, or robustness that are difficult to encode directly in an objective function. Here, we develop a range-aware Bayesian optimization (BO) framework in which the acquisition function directly scores the posterior probability that a candidate satisfies a target range. The framework naturally extends to parallel pursuit of multiple distinct specifications over a shared candidate space. Across benchmark tasks, range-aware acquisition consistently recovers larger and more diverse sets of valid designs than standard BO baselines and recent goal-seeking methods. Its utility is further demonstrated in two practically motivated design case studies involving optimizing reaction conditions for polymer synthesis and sequence-defined oligomer discovery for prescribed optical absorption bands, supported by quantum chemical calculations. These results suggest that range-aware BO can provide a practical and sample-efficient foundation for specification-driven design, particularly when design flexibility and solution diversity are important considerations.
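The scoring rule described above has a simple closed form if one assumes a Gaussian posterior for the property at each candidate, as a Gaussian-process surrogate would give. That assumption, the function names `normal_cdf` and `prob_in_range`, and the candidate values below are illustrative and not taken from the paper.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_in_range(mean, std, lower, upper):
    """Posterior probability that a Gaussian property value falls in [lower, upper]."""
    if std <= 0.0:
        # Degenerate posterior: the value is known exactly.
        return 1.0 if lower <= mean <= upper else 0.0
    return normal_cdf((upper - mean) / std) - normal_cdf((lower - mean) / std)

# Hypothetical candidates: (posterior mean, posterior std) of the target property.
candidates = {
    "inside, confident": (5.0, 0.2),
    "inside, uncertain": (5.0, 2.0),
    "outside, confident": (8.0, 0.2),
}
target_lower, target_upper = 4.5, 5.5

scores = {
    name: prob_in_range(mean, std, target_lower, target_upper)
    for name, (mean, std) in candidates.items()
}
best = max(scores, key=scores.get)
```

Unlike an acquisition that rewards the largest or smallest predicted value, this score is highest for candidates that are both predicted to land in the window and predicted with little uncertainty.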

2606.11570 2026-06-11 stat.ML cs.LG stat.ME 新提交

Enhancing Spectral Embedding through Robust and Flexible Knowledge Transfer in Electronic Health Records

通过电子健康记录中的鲁棒且灵活的知识迁移增强谱嵌入

Feiqing Huang, Zongqi Xia, Rong Ma, Tianxi Cai

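AI summary: Proposes a spectral, unsupervised representation learning framework that extracts a knowledge matrix from a broader population and relaxes signal-alignment assumptions to produce low-dimensional embeddings for rare-disease cohorts, outperforming existing methods on simulations and real multiple sclerosis data.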

AI中文摘要

我们提出了一种基于谱的无监督表示学习框架，用于从电子健康记录中为罕见病队列的临床概念和患者导出低维嵌入，其中数据是高维的但样本量有限。为了克服这一挑战，我们引入了一个从更广泛人群中提取的知识矩阵，该矩阵与罕见病队列共享部分重叠的子空间。我们的方法不同于现有方法，它放松了潜在数据矩阵和知识矩阵之间严格的一对一信号对齐假设，允许更灵活和现实的结构化共享形式。我们引入了一种新颖的两步谱嵌入过程：首先，我们从知识矩阵中识别并移除不相关的成分；然后，我们应用基于投影的方法分别恢复共享和异质成分。模拟和对真实世界多发性硬化症队列的分析表明，所提出的方法优于竞争方法，特别是在共享信号较弱且仅部分对齐的挑战性场景中，这在罕见病数据中很常见。

英文摘要

We propose a spectral-based, unsupervised representation learning framework to derive low-dimensional embeddings for clinical concepts and patients in rare disease cohorts from electronic health records, where data are high-dimensional but sample sizes are limited. To overcome this challenge, we incorporate a knowledge matrix extracted from a broader population that shares a partially overlapping subspace with the rare-disease cohort. Our method departs from existing approaches by relaxing restrictive one-to-one signal-alignment assumptions between the latent data matrix and knowledge matrix, allowing more flexible and realistic forms of structured sharing. We introduce a novel two-step spectral embedding procedure: first, we identify and remove irrelevant components from the knowledge matrix; then, we apply a projection-based method to separately recover shared and heterogeneous components. Simulations and an analysis of a real-world multiple sclerosis cohort show that the proposed method outperforms competing approaches, particularly in challenging scenarios where shared signals are weak and only partially aligned, as is common in rare-disease data.

2606.11437 2026-06-11 cs.DS cs.AI cs.LG stat.ML 新提交

The Power of Test-Time Training for Approximate Sampling

测试时训练对近似采样的威力

Noah Golowich, Ankur Moitra, Dhruv Rohatgi

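AI summary: Formalizes test-time training (TTT) as the problem of sampling from a distribution in a known class, proves a quadratic lower bound on query complexity, and shows that the bound can be circumvented when the size of the class is suitably bounded, offering a theoretical framework for TTT.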

AI中文摘要

从复杂概率分布中高效采样是一个基本问题，近年来随着生成式AI的兴起，这一问题变得越来越重要，因为从大语言模型（LLM）中提出的复杂采样程序已被用于解决具有挑战性的推理问题。然而，这类采样算法的有效性受到LLM与特定采样任务之间关系的限制，这推动了测试时训练（TTT）框架的发展。TTT通过根据推理时收到的部分生成和奖励反馈更新模型权重来工作，从而适应特定问题。在这项工作中，我们提出了一种TTT的形式化，将其定义为从属于已知分布类$F$的给定概率测度$\mu^\star$中生成样本的问题，给定一个提供$\mu^\star$近似密度估计的预言机$\hat \mu$。这与Jerrum、Valiant和Vazirani（1986）以及Jerrum和Sinclair（1989）的开创性工作中研究的将采样约化为近似计数的问题密切相关：即当$F$是所有分布的类时，它恰好与上述计数到采样的约化一致。在本文中，我们首先证明了在给定对$\hat \mu$的查询访问的情况下，从$\mu^\star$采样的查询复杂度的二次下界（对于足够大的类$F$），从而表明Jerrum和Sinclair（1989）提出并由Hayes和Sinclair（2010）改进的随机游走方法是最优的。这回答了Hayes和Sinclair提出的一个开放问题。然后，我们证明如果$F$的大小适当受限，这个下界可以被规避。正如我们所讨论的，后一个结果可以被视为TTT的抽象，因此代表了为TTT发展一个原则性理论框架的起点。

英文摘要

Efficiently sampling from a complex probability distribution is a fundamental problem which has become increasingly pertinent in recent years with the rise of generative AI, as sophisticated sampling procedures from LLMs have been proposed to solve challenging reasoning problems. The efficacy of such sampling algorithms is limited, however, by the relationship between the LLM and the particular sampling task at hand, which has motivated the framework of test-time training (TTT). TTT works by updating a model's weights in response to partial generations and reward feedback received at inference time, thus adapting to the particular problem. In this work, we propose a formalization for TTT as the problem of producing a sample from a given probability measure $\mu^\star$ belonging to a known class ${F}$ of distributions, given an oracle $\hat \mu$ which yields approximate density estimates for $\mu^\star$. This is closely related to the problem of reducing sampling to approximate counting studied in seminal works of Jerrum, Valiant & Vazirani (1986) and Jerrum & Sinclair (1989): namely, when ${F}$ is the class of all distributions, it coincides exactly with the aforementioned counting-to-sampling reduction. In this paper, we first show a quadratic lower bound on the query complexity of sampling from $\mu^\star$ given query access to $\hat \mu$ (for sufficiently large classes ${F}$), thus showing that the random walk approach proposed by Jerrum & Sinclair (1989) and refined by Hayes & Sinclair (2010), is optimal. This answers an open question posed by Hayes & Sinclair. We then show that this lower bound can be circumvented if the size of ${F}$ is bounded appropriately. As we discuss, this latter result can be viewed as an abstraction of TTT, and thus represents a starting point for the development of a principled theoretical framework for TTT.

2606.11417 2026-06-11 cs.LG cs.AI stat.ML 新提交

Signed Compression Progress on a Sealed Audit is Goodhart-Resistant

密封审计上的有符号压缩进展可抵抗古德哈特效应

Ayush Mittal, Dhruv Gupta

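AI summary: Proposes signed compression progress as an intrinsic motivation signal, proves that its cumulative reward equals the audit improvement, gives a false-positive budget for finite audit panels, and shows resistance to Goodhart's law.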

Comments: 16 pages, 7 figures. Lean 4 (Mathlib) mechanized core and ARC-TGI experiment code: this https URL

AI中文摘要

压缩进展是一个长期存在的内在动机方案：当智能体的世界模型在预测或压缩经验方面变得更好时给予奖励。流行的说法是这种奖励是“可信的”，因为它只为学习付酬。我们将这一点精确化并加以证明。如果内在奖励是固定密封审计损失的有符号减少量，即 r_t = E(theta_{t-1}) - E(theta_t)，那么累积奖励经裂项相消后恰好等于端点处的审计改进，因此没有任何策略能在真实审计性能停滞或下降时无限推高奖励。对于有限审计面板，同样的结果成立，并带有一个紧的假阳性预算：累积经验奖励至多为真实审计改进加上 2 Delta_n(F, delta)，即模型类的一致审计偏差。这一结论与时间跨度无关：一旦密封面板一致地控制了该模型类，随时间的自适应性不需要付出额外代价。该定理还指出了失效模式：如果进展被截断、在智能体自身的数据流上评分、在可重用面板上暴露给高容量模型，或应用于使 Delta_n 失去意义的神经网络类，则保证不再成立。我们给出了结构核心（裂项求和、有限审计界、有限吉布斯以及熵下限）的 Lean 4 形式化，以及在 ARC-TGI 网格变换生成器上带有自适应保留集攻击的实验套件。实验证实了理论：有限审计偏差按 n^{-0.527} 缩放；有符号进展能抵抗刷截断奖励（clip-farming）、流泄漏和噪声电视（noisy-TV）好奇心；朴素的可重用审计可被黑盒标量反馈利用，而标准的发布防御将攻击保持在 2 Delta_n 阈值以下。密封审计上的有符号压缩进展是真正改进的会计信号。

英文摘要

Compression progress is a long-standing proposal for intrinsic motivation: reward an agent when its world model becomes better at predicting or compressing experience. The folk claim is that this reward is "credible" because it is paid only for learning. We make this precise and prove it. If intrinsic reward is the signed decrease of a fixed sealed-audit loss, r_t = E(theta_{t-1}) - E(theta_t), then cumulative reward telescopes exactly to endpoint audit improvement, so no policy can push reward up indefinitely while true audit performance stagnates or degrades. For finite audit panels the same result holds with a sharp false-positive budget: cumulative empirical reward is at most true audit improvement plus 2 Delta_n(F, delta), the uniform audit deviation of the model class. This is horizon-free: adaptivity over time costs nothing once the sealed panel uniformly controls the class. The theorem also identifies the failure modes: the guarantee disappears if progress is clipped, scored on the agent's own stream, exposed to a high-capacity model on a reusable panel, or applied to a neural class that makes Delta_n vacuous. We give a Lean 4 mechanization of the structural core (telescoping, the finite-audit bound, finite Gibbs, and the entropy floor) and an experiment suite on ARC-TGI grid-transformation generators with adaptive holdout attacks. Experiments confirm the theory: finite-audit deviation scales as n^{-0.527}; signed progress resists clip-farming, stream leakage, and noisy-TV curiosity; naive reusable audits are exploitable by black-box scalar feedback, while standard release defenses keep the attack below the 2 Delta_n threshold. Signed compression progress on a sealed audit is an accounting signal of genuine improvement.
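The telescoping identity for r_t = E(theta_{t-1}) - E(theta_t) can be checked numerically. The sketch below uses made-up audit losses and hypothetical function names; it also shows why clipping negative rewards, one of the failure modes named above, breaks the identity.

```python
def signed_progress_rewards(audit_losses):
    """Return r_t = E(theta_{t-1}) - E(theta_t) for t = 1..T."""
    return [prev - curr for prev, curr in zip(audit_losses, audit_losses[1:])]

def clipped_progress_rewards(audit_losses):
    """Same rewards with negative values clipped to zero (a failure mode)."""
    return [max(r, 0.0) for r in signed_progress_rewards(audit_losses)]

# Hypothetical sealed-audit losses E(theta_0), ..., E(theta_4); the model
# regresses at steps 2 and 4.
audit_losses = [1.00, 0.80, 0.95, 0.70, 0.72]

cumulative_signed = sum(signed_progress_rewards(audit_losses))
cumulative_clipped = sum(clipped_progress_rewards(audit_losses))
endpoint_improvement = audit_losses[0] - audit_losses[-1]

# Signed rewards sum to the endpoint improvement (up to rounding); clipped
# rewards exceed it because losses incurred during regressions are forgotten.
print(cumulative_signed, cumulative_clipped, endpoint_improvement)
```

An agent that oscillates between better and worse models would accumulate clipped reward without bound, while its signed reward stays pinned to the net audit improvement.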

2606.11339 2026-06-11 math.OC cs.AI cs.LG eess.SY stat.ML 新提交

Quantized Stochastic Primal-Dual Methods for Distributed Optimization under Relaxed Global Geometry

松弛全局几何下分布式优化的量化随机原始-对偶方法

Susmit Sarkar, Abhinav Raghuvanshi, Kushal Chakrabarti, Mayank Baranwal

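AI summary: Proposes q-PDGD, a quantized stochastic primal-dual method, and proves linear convergence to a neighborhood or O(1/k) convergence under relaxed global geometry, matching the best-known centralized stochastic complexity.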

Comments: Accepted to UAI

AI中文摘要

我们研究具有随机梯度和有限比特通信（由随机（无偏）量化建模）的分布式优化。我们提出q-PDGD，一种量化的随机原始-对偶方法，并在松弛全局几何下对其进行分析。在受限割线不等式（RSI）下，常数步长产生线性收缩到由梯度噪声、量化失真和网络连通性确定的显式邻域，而递减步长在没有共享最小化器假设的情况下实现O(1/k)收敛。在Polyak-Lojasiewicz（PL）不等式下，我们在相同的随机量化设置中获得线性到邻域的收敛。我们的结果在预言复杂度上匹配已知最优的集中随机速率，并通过实验证明了量化水平、步长选择和图结构之间的预测权衡。

英文摘要

We study distributed optimization with stochastic gradients and finite-bit communication modeled by random (unbiased) quantization. We propose q-PDGD, a quantized stochastic primal-dual method, and analyze it under relaxed global geometry. Under restricted secant inequality (RSI), a constant step-size yields linear contraction to an explicit neighborhood determined by gradient noise, quantization distortion, and network connectivity, while a diminishing step-size achieves O(1/k) convergence without shared-minimizer assumptions. Under Polyak-Lojasiewicz (PL) inequality, we obtain linear-to-neighborhood convergence in the same stochastic quantized setting. Our results match the best-known centralized stochastic rates in oracle complexity, and are supported by experiments demonstrating the predicted tradeoffs between quantization level, step-size choice, and graph structure.
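One common way to realize a random, unbiased quantizer is stochastic rounding to a uniform grid, sketched below. The abstract does not specify the quantizer used in q-PDGD, so the grid form, the function name `stochastic_quantize`, and the parameters are assumptions for illustration.

```python
import math
import random

def stochastic_quantize(x, step, rng):
    """Round x to a grid of spacing `step`, up or down at random.

    The probability of rounding up equals the fractional position of x
    between its two neighboring grid points, so E[Q(x)] = x.
    """
    lower = math.floor(x / step) * step
    p_up = (x - lower) / step
    return lower + step if rng.random() < p_up else lower

# Empirical check of unbiasedness on a hypothetical value and grid.
rng = random.Random(0)
x, step = 0.3, 0.25
draws = [stochastic_quantize(x, step, rng) for _ in range(10000)]
empirical_mean = sum(draws) / len(draws)  # should be close to x
```

A coarser grid (larger `step`) needs fewer bits per message but has larger variance, which is the kind of quantization distortion that enters the neighborhood radius in the convergence result.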

2605.04893 2026-06-11 cs.LG cs.CL stat.ML 版本更新

Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

自注意力作为传输：对称谱诊断的极限

Dominik Dahlem, Diego Maniloff, Mac Misiura

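AI summary: Studies two failure shapes of attention routing in language models (over-concentration and over-diffusion), proves that symmetric spectral diagnostics are blind to orientation, establishes a theoretical floor on transport capacity in causal attention, and proposes a two-axis diagnostic based on capacity and direction.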

Comments: 48 pages, 6 figures, 7 tables; 81-page online supplement (proofs, additional experiments, dataset statistics) as an ancillary file

AI中文摘要

当语言模型处理幻觉响应时，其注意力路由往往以两种形状之一失效：过度集中在狭窄的位置集合上，或者分散得如此广泛以至于相关性被稀释，而失效的形状携带诊断信号。我们研究这些形状作为诊断特征，从在基准标记响应的\emph{强制评分}下计算的注意力矩阵中得出，而不是在实时生成期间。一类广泛使用的谱方法分析度归一化注意力算子的对称分量，该算子控制传输\emph{容量}；我们证明该算子的每个转置不变谱诊断在结构上是\emph{方向盲的}（它无法区分算子与其转置，因此无法检测信息流方向），并且盲定理的逆定理将任何Lipschitz诊断的转置敏感性限制为不对称系数$G$。将其与规范因果架构的闭式二分-Cheeger景观配对，我们证明均匀因果注意力满足一个与$n$无关的下界$\phi \ge 1/5$，而窗口注意力以$O(w/n)$穿透下界；失效模式在形状上不同，而不仅仅在数值上不同。这个下界是一个理想化架构的基准，而不是经验吸引子：穿透它的真实注意力头的比例本身就是一个架构特征。由此产生的双轴诊断（$\phi$表示容量，$G$表示方向）产生一个可证伪的极性预测：瓶颈主导和分散主导的基准应表现出相反的极性。在长度控制评估下，传输特征在测试的仅解码器、仅编码器和编码器-解码器模型中保持可解释的信号（0.62-0.84 LC-AUROC），极性在HaluEval和MedHallu之间如预测般反转。

英文摘要

When a language model processes a hallucinated response, its attention routing tends to fail in one of two shapes: over-concentrating on a narrow set of positions, or spreading so diffusely that relevance is diluted, and the shape of the failure carries diagnostic signal. We study these shapes as a diagnostic characterization, computed from attention matrices under \emph{forced scoring} of benchmark-labeled responses rather than during live generation. A widely used family of spectral methods analyzes the symmetric component of the degree-normalized attention operator, which governs transport \emph{capacity}; we prove that every transpose-invariant spectral diagnostic of this operator is structurally \emph{orientation-blind} (it cannot distinguish an operator from its transpose, and therefore cannot detect information-flow direction), with a converse to the blindness theorem bounding any Lipschitz diagnostic's transpose sensitivity by the asymmetry coefficient $G$. Pairing this with a closed-form bipartite-Cheeger landscape for canonical causal architectures, we show that uniform causal attention satisfies an $n$-independent floor $\phi \ge 1/5$, while window attention pierces the floor as $O(w/n)$; failure modes are shape-different, not just value-different. This floor is an idealized-architecture benchmark, not an empirical attractor: the fraction of real attention heads that pierce it is itself an architectural signature. The resulting two-axis diagnostic ($\phi$ for capacity, $G$ for direction) yields a falsifiable polarity prediction: bottleneck- and diffuse-dominated benchmarks should exhibit opposite polarity. Under length-controlled evaluation, transport features retain interpretable signal (0.62-0.84 LC-AUROC) across the tested decoder-only, encoder-only, and encoder-decoder models, with polarity reversing as predicted between HaluEval and MedHallu.

2606.08493 2026-06-11 q-bio.GN cs.LG stat.ML 版本更新

Querying Counterfactuals on Tissue Graphs with Supervised Disentanglement

在组织图上通过监督解缠查询反事实

Abdul Moeed, Stefan Schrod, Martin Rohbeck, Marc Jan Bonder, Pavlo Lutsik, Oliver Stegle, Daniel Dimitrov

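AI summary: Formalizes tissue graph counterfactuals as spatial interventions and proposes Cellina, a framework that uses supervised disentanglement to separate a cell's intrinsic state from its spatial context for counterfactual prediction, outperforming existing methods on colorectal cancer and mouse brain data.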

AI中文摘要

组织图反事实询问的是：在空间邻居上下文被改变的情况下，细胞的表达将如何变化。这类查询对于预测组织中的细胞行为至关重要，但缺乏统一定义，现有方法要么针对特定干预类型，要么将细胞视为独立同分布。在这项工作中，我们首先将组织图反事实形式化为一类空间干预，这些干预要么重新连接细胞之间的边（边扰动），要么修改其邻居的表达（节点扰动）。然后，我们介绍Cellina（https://cellina.readthedocs.io），一个使用监督解缠将细胞内在状态从其空间上下文中分解出来的框架，并将后者作为反事实预测的条件输入。在跨越结直肠癌和小鼠大脑中超过250万个空间分辨细胞的基准测试中，Cellina在计算机模拟（in-silico）图扰动、解缠和可扩展性方面优于空间感知和非空间的竞争方法。此外，我们展示了Cellina以无监督方式揭示生物学上不同的癌症子域，并支持靶向邻居扰动模拟。

英文摘要

Tissue graph counterfactuals ask how a cell's expression would change under altered spatial neighbor contexts. Such queries are central to predicting cell behavior in tissues, but lack a unified definition, with existing methods targeting specific intervention types or treating cells as i.i.d. In this work, we first formalize tissue graph counterfactuals as a class of spatial interventions that either rewire connections between cells (edge perturbation) or modify the expression of their neighbors (node perturbation). We then introduce Cellina (https://cellina.readthedocs.io), a framework that uses supervised disentanglement to decompose a cell's intrinsic state from its spatial context, using the latter as a conditioning input for counterfactual predictions. Across benchmarks spanning over 2.5 million spatially-resolved cells in colorectal cancer and mouse brain, Cellina outperforms spatially-informed and non-spatial competitors in in-silico graph perturbations, disentanglement, and scalability. Additionally, we show that Cellina reveals biologically distinct cancer subdomains in an unsupervised manner and enables targeted neighbor perturbation simulations.

2606.05551 2026-06-11 stat.ML cs.AI cs.LG 版本更新

Conformal Risk-Averse Decision Making with Action Conditional Guarantee

具有行动条件保证的共形风险规避决策

Zihan Zhu, Shayan Kiyani, George Pappas, Hamed Hassani

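AI summary: Proposes action-conditional conformal prediction, with a pinball-loss minimization algorithm that targets action-conditional value-at-risk and provides finite-sample safety guarantees conditioned on each action.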

AI中文摘要

由机器学习模型驱动的可靠决策管道需要具有明确安全保证的不确定性量化（UQ）方法。共形预测通过将ML预测包装成预测集来提供这种UQ，而Kiyani等人（2025b）的最新工作表明，这些集合可以转化为最优的风险规避决策策略——但仅继承边际安全保证。我们通过以下方式推广并加强了他们的结果：（i）引入行动条件共形预测，该预测产生明确条件于决策者所采取的每个行动的安全保证；（ii）表明行动条件预测集可作为风险规避决策者旨在优化行动条件风险价值的可行决策空间的代理；（iii）提出一种基于分位数损失最小化的原则性有限样本算法，将Gibbs等人（2025）的框架与行动条件保证联系起来。在两个真实世界数据集上的实验证实，我们的方法在行动条件性能上显著优于共形基线。

英文摘要

Reliable decision making pipelines powered by machine learning models require uncertainty quantification (UQ) methods that come with explicit safety guarantees. Conformal prediction provides such UQ by wrapping ML predictions into prediction sets, and recent work by Kiyani et al. (2025b) established that these sets can be translated into optimal risk-averse decision policies -- yet only inheriting marginal safety guarantees. We generalize and strengthen their results by (i) introducing action-conditional conformal prediction, which yields safety guarantees conditioned explicitly on each action taken by the decision maker, (ii) showing that action-conditional prediction sets serve as a proxy for the feasible decision space for risk-averse decision makers aiming to optimize action-conditional value-at-risk, and (iii) proposing a principled finite-sample algorithm based on pinball-loss minimization, connecting the framework of Gibbs et al. (2025) to action-conditional guarantees. Experiments on two real-world datasets confirm that our approach significantly improves action-conditional performance over conformal baselines.
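The pinball-loss idea in (iii) can be illustrated in isolation: minimizing the average pinball loss at level alpha over a set of scores recovers an empirical alpha-quantile, the kind of threshold a conformal procedure calibrates. This is a generic sketch, not the authors' action-conditional algorithm; `pinball_loss`, `fit_quantile`, and the scores are hypothetical.

```python
def pinball_loss(y, q, alpha):
    """Pinball (quantile) loss of predicting q for outcome y at level alpha."""
    diff = y - q
    return alpha * diff if diff >= 0 else (alpha - 1.0) * diff

def fit_quantile(values, alpha):
    """Return the sample value minimizing the total pinball loss.

    A minimizer of the total pinball loss always exists among the sample
    values, so a search over them is enough for this small example.
    """
    return min(
        sorted(values),
        key=lambda q: sum(pinball_loss(y, q, alpha) for y in values),
    )

# Hypothetical calibration scores (for example, nonconformity scores).
calibration_scores = [float(i) for i in range(1, 11)]
alpha = 0.85
threshold = fit_quantile(calibration_scores, alpha)

# Fraction of calibration scores covered by the fitted threshold.
coverage = sum(s <= threshold for s in calibration_scores) / len(calibration_scores)
```

Conditioning the same minimization on features such as the chosen action is what would turn this marginal threshold into a conditional one; that step is left out here.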

2605.22346 2026-06-11 stat.ML cs.LG cs.SI 版本更新

The ASE-LSE Disagreement Landscape: An End-to-End Characterisation of Extremes and Structural Drivers

偏离正则性：度异质性和特征间隙作为ASE-LSE潜在子空间分歧的结构驱动因素

Minh Triet Pham, Ian Gallagher

AI总结本文研究了图数据分析中邻接谱嵌入和拉普拉斯谱嵌入方法在相同网络上产生不同结果的结构原因，揭示了度异质性和社区结构强度对潜在子空间分歧的影响。

Comments: This paper is being withdrawn as it was submitted without the consent of all listed authors, and contains work that is currently under academic assessment. It will be resubmitted at an appropriate time once evaluation is complete

AI中文摘要

图数据分析中，邻接谱嵌入和拉普拉斯谱嵌入两种最常用方法在相同网络上常产生不同结果。本文提供了结构上的解释。我们证明正则性是完美一致的充分条件：当每个节点具有相同数量的连接时，两种方法产生相同的潜在子空间。任何偏离正则性都会引入分歧，我们证明了一个显式的界限，其两个术语表明控制分歧的结构因素：度异质性推动方法分离，社区结构强度则拉近它们。我们通过成千上万个模拟网络验证了这两种驱动因素，确认异质性推动分歧增加，社区强度抑制它，其比值提供了两种嵌入可以互换或不可互换的强预测。

英文摘要

Two of the most widely used methods for analysing graph data, Adjacency Spectral Embedding and Laplacian Spectral Embedding, often produce different results when applied to the same graph. Yet the structural reasons behind this disagreement remain incompletely understood. This paper provides an end-to-end account of ASE-LSE latent subspace disagreement. We first prove that the two methods produce identical latent subspaces for every embedding dimension whenever the Laplacian is a scalar multiple of the adjacency matrix, and show that this scalar relationship holds if and only if the graph is either regular or bipartite biregular. This anchor result identifies a sufficient condition for perfect agreement that pins down the floor of the disagreement spectrum and supplies the baseline for the perturbation analysis. We then prove that no maximal-disagreement graph or family of graphs exists: the disagreement is always strictly below its theoretical ceiling, and we exhibit a witness family demonstrating that no finite maximum is attainable, so the disagreement landscape has no maximiser. With both endpoints established, we derive a Regularity Departure Bound whose two terms isolate degree heterogeneity and eigengap as the primary structural factors influencing disagreement in the middle regime. Empirical validation across thousands of simulated graphs confirms the mechanisms predicted by the bound: heterogeneity pushes disagreement up, eigengap suppresses it, and their joint ratio emerges as a unified predictor of ASE-LSE disagreement, suggesting when the two embeddings can be treated as interchangeable and when they cannot.
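
The anchor result can be checked numerically on small graphs. The sketch below assumes that LSE is built on the symmetrically normalised matrix D^{-1/2} A D^{-1/2}, a common convention that the abstract does not spell out. It tests whether that matrix is a scalar multiple of the adjacency matrix A for a regular graph, a bipartite biregular graph, and a path. All helper names are illustrative.

```python
import math


def normalised_adjacency(adj):
    """Return D^{-1/2} A D^{-1/2} for a graph without isolated nodes."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def scalar_multiple(adj, tol=1e-12):
    """Return c if D^{-1/2} A D^{-1/2} equals c * A, otherwise None.

    Zero entries of A stay zero after normalisation, so only the edges
    need to be compared.
    """
    lap = normalised_adjacency(adj)
    n = len(adj)
    ratios = [lap[i][j] / adj[i][j]
              for i in range(n) for j in range(n) if adj[i][j]]
    c = ratios[0]
    return c if all(abs(r - c) <= tol for r in ratios) else None


def from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


cycle6 = from_edges(6, [(i, (i + 1) % 6) for i in range(6)])      # 2-regular
k23 = from_edges(5, [(i, j) for i in (0, 1) for j in (2, 3, 4)])  # biregular
path4 = from_edges(4, [(0, 1), (1, 2), (2, 3)])                   # neither

print(scalar_multiple(cycle6), scalar_multiple(k23), scalar_multiple(path4))
```

When the two matrices are proportional they share eigenvectors, so the leading subspaces coincide in every dimension. The path graph fails the test because its end nodes and interior nodes have different degrees.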

2603.12901 2026-06-11 stat.ML cond-mat.dis-nn cs.IT cs.LG 版本更新

A theory of learning data statistics in diffusion models, from easy to hard

扩散模型中学习数据统计的理论：从容易到困难

Lorenzo Bardone, Claudia Merger, Sebastian Goldt

AI总结本文研究了扩散模型在学习数据统计时的分布简单性偏差，揭示了学习 pairwise 统计和 higher-order 统计所需的样本复杂度差异，并引入了扩散信息指数这一不变量。

AI中文摘要

尽管扩散模型已成为强大的生成模型，但其学习动态仍不明确。我们通过实验证明，标准扩散模型在自然图像上学习时存在分布简单性偏差，先学习简单的 pairwise 输入统计，再转向更高阶相关性。我们在简单的去噪器上用最小数据模型混合累积模型重现了这一行为，并精确控制了输入的 pairwise 和 higher-order 相关性。我们识别出一个模型不变量，即扩散信息指数，类比于不同学习范式中的相关不变量。利用这一不变量，我们证明去噪器在线性样本复杂度下学习输入的简单 pairwise 统计，而更复杂的 higher-order 统计如四阶累积量需要至少立方样本复杂度。我们还证明，如果 pairwise 和 higher-order 统计共享相关潜在结构，则学习四阶累积量的样本复杂度是线性的。本文描述了扩散模型如何学习越来越复杂分布的关键机制。

英文摘要

While diffusion models have emerged as a powerful class of generative models, their learning dynamics remain poorly understood. We address this issue first by empirically showing that standard diffusion models trained on natural images exhibit a distributional simplicity bias, learning simple, pair-wise input statistics before specializing to higher-order correlations. We reproduce this behaviour in simple denoisers trained on a minimal data model, the mixed cumulant model, where we precisely control both pair-wise and higher-order correlations of the inputs. We identify a scalar invariant of the model that governs the sample complexity of learning pair-wise and higher-order correlations that we call the diffusion information exponent, in analogy to related invariants in different learning paradigms. Using this invariant, we prove that the denoiser learns simple, pair-wise statistics of the inputs at linear sample complexity, while more complex higher-order statistics, such as the fourth cumulant, require at least cubic sample complexity. We also prove that the sample complexity of learning the fourth cumulant is linear if pair-wise and higher-order statistics share a correlated latent structure. Our work describes a key mechanism for how diffusion models can learn distributions of increasing complexity.

2603.08558 2026-06-11 cs.LG stat.ML 版本更新

Impact of Connectivity on Laplacian Representations in Reinforcement Learning

连通性对强化学习中拉普拉斯表示的影响

Tommaso Giorgi, Pierriccardo Olivieri, Keyue Jiang, Laura Toni, Matteo Papini

AI总结本文研究了连通性对强化学习中拉普拉斯表示的误差影响，通过分析状态图的代数连通性，推导了线性价值函数近似误差的上界，并展示了表示学习管道中的端到端误差分解。

AI中文摘要

在马尔可夫决策过程（MDPs）中学习紧凑的状态表示对于解决大规模强化学习（RL）问题中的维度灾难至关重要。现有方法通过构造状态表示为状态图拉普拉斯特征向量的线性组合，利用结构先验。当转移图未知或状态空间过大时，可通过样本轨迹直接估计图谱特征。本文证明了在学习的谱特征下线性价值函数近似误差的上界，并展示了该误差如何随状态图的代数连通性变化，从而将近似质量根植于MDP的拓扑结构中。进一步界定了由特征向量估计本身引入的误差，导致表示学习管道中的端到端误差分解。此外，尽管RL设置中的拉普拉斯算子表达式等价于现有方法，但其防止了一些常见的误解，并展示了文献中的示例。我们的结果适用于一般的（非均匀）策略，无需对诱导转移核的对称性做任何假设。我们通过在网格世界环境中进行数值模拟验证了理论发现。

英文摘要

Learning compact state representations in Markov Decision Processes (MDPs) has proven crucial for addressing the curse of dimensionality in large-scale reinforcement learning (RL) problems. Existing principled approaches leverage structural priors on the MDP by constructing state representations as linear combinations of the state-graph Laplacian eigenvectors. When the transition graph is unknown or the state space is prohibitively large, the graph spectral features can be estimated directly via sample trajectories. In this work, we prove an upper bound on the approximation error of linear value function approximation under the learned spectral features. We show how this error scales with the algebraic connectivity of the state-graph, grounding the approximation quality in the topological structure of the MDP. We further bound the error introduced by the eigenvector estimation itself, leading to an end-to-end error decomposition across the representation learning pipeline. Additionally, our expression of the Laplacian operator for the RL setting, although equivalent to existing ones, prevents some common misunderstandings, of which we show some examples from the literature. Our results hold for general (non-uniform) policies without any assumptions on the symmetry of the induced transition kernel. We validate our theoretical findings with numerical simulations on gridworld environments.

2604.27442 2026-06-11 math.ST stat.ML 版本更新

Bayesian online learning in the one-pass regime: Frequentist validity and uncertainty quantification

单次遍历下的贝叶斯在线学习：频率有效性及不确定性量化

Jeyong Lee, Junhyeok Choi, Dongguen Kim, Minwoo Chae

AI总结提出一种针对单次遍历的贝叶斯在线学习算法，通过预热阶段确保稳定更新，证明后验达到最优收敛率并建立在线Bernstein-von Mises定理，实现无需小批量样本量发散的不确定性量化。

Comments: 52 pages

AI中文摘要

贝叶斯在线学习为序贯推理提供了一个连贯的框架。然而，其理论理解仍然有限，特别是在单次遍历设置中。现有的理论保证通常要求小批量样本量发散，这一条件在单次遍历机制下无法满足。在本文中，我们提出了一种针对单次遍历设置量身定制的新贝叶斯在线学习算法，该算法包含一个预热阶段以确保稳定的序贯更新。对于该算法，我们证明了序贯更新的后验达到了最优收敛率。在此基础上，我们建立了Bernstein-von Mises定理的在线类比，该定理保证了在没有发散的小批量样本量的情况下有效的不确定性量化。我们的分析基于一个新颖的理论框架，该框架与在线学习文献中的现有方法有根本不同。在广义线性模型上的数值实验表明，所提出的方法匹配了批处理估计器的性能，同时优于现有的在线程序。

英文摘要

Bayesian online learning provides a coherent framework for sequential inference. However, its theoretical understanding remains limited, particularly in the one-pass setting. Existing theoretical guarantees typically require the mini-batch sample size to diverge, a condition that fails in the one-pass regime. In this paper, we propose a new Bayesian online learning algorithm tailored to the one-pass setting, which incorporates a warm-start phase to ensure stable sequential updates. For this algorithm, we show that the sequentially updated posterior attains the optimal convergence rate. Building on this, we establish an online analogue of the Bernstein-von Mises theorem, which guarantees valid uncertainty quantification without diverging mini-batch sample sizes. Our analysis is based on a novel theoretical framework that differs fundamentally from existing approaches in the online learning literature. Numerical experiments on generalized linear models show that the proposed method matches the performance of the batch estimator while outperforming existing online procedures.

2603.09276 2026-06-11 stat.ML cs.LG 版本更新

On Regret Bounds of Thompson Sampling for Bayesian Optimization

关于贝叶斯优化中汤普森采样遗憾界的分析

Shion Takeno, Shogo Iwazaki

AI总结本文针对高斯过程汤普森采样（GP-TS）方法，在目标函数为GP样本路径的假设下，推导了其遗憾下界、累积遗憾二阶矩上界、期望宽松遗憾上界以及改进的累积遗憾上界，填补了GP-TS在高概率遗憾界方面的空白。

Comments: 43 pages, Accepted to ICML 2026

AI中文摘要

我们研究了一种广泛使用的贝叶斯优化方法——高斯过程汤普森采样（GP-TS），假设目标函数是高斯过程的一个样本路径。与具有高概率和期望遗憾界的高斯过程上置信界（GP-UCB）相比，GP-TS的大多数分析仅限于期望遗憾。此外，最近关于GP-UCB的宽松遗憾和改进的累积遗憾上界的分析是否能应用于GP-TS仍不清楚。为了填补这些空白，本文展示了几个遗憾界：(i) GP-TS的遗憾下界，这意味着GP-TS以概率δ依赖于$1/\delta$的多项式；(ii) 累积遗憾二阶矩的上界，直接暗示了关于δ的改进遗憾上界；(iii) 期望宽松遗憾上界；(iv) 关于时间水平T的改进累积遗憾上界。在此过程中，我们提供了几个有用的引理，包括从最近分析中放松必要条件以获得关于T的改进累积遗憾上界。

英文摘要

We study a widely used Bayesian optimization method, Gaussian process Thompson sampling (GP-TS), under the assumption that the objective function is a sample path from a GP. Compared with the GP upper confidence bound (GP-UCB) with established high-probability and expected regret bounds, most analyses of GP-TS have been limited to expected regret. Moreover, whether the recent analyses of GP-UCB for the lenient regret and the improved cumulative regret upper bound can be applied to GP-TS remains unclear. To fill these gaps, this paper shows several regret bounds: (i) a regret lower bound for GP-TS, which implies that GP-TS suffers from a polynomial dependence on $1/\delta$ with probability $\delta$, (ii) an upper bound of the second moment of cumulative regret, which directly suggests an improved regret upper bound on $\delta$, (iii) expected lenient regret upper bounds, and (iv) an improved cumulative regret upper bound on the time horizon $T$. Along the way, we provide several useful lemmas, including a relaxation of the necessary condition from recent analysis to obtain improved regret upper bounds on $T$.

2601.21817 2026-06-11 stat.ML cs.LG 版本更新

A Judge-Aware Ranking Framework for Evaluating Large Language Models without Ground Truth

一种面向评委的排名框架：无需真实标签评估大语言模型

Mingyuan Xu, Xinzi Tan, Jiawei Wu, Doudou Zhou

AI总结本文提出一种面向评委的排名框架，通过引入评委特定的辨别参数扩展Bradley-Terry-Luce模型，在不参考标签的情况下联合估计潜在模型质量和评委可靠性，从而提高人类偏好的一致性，提高数据效率，并产生校准的不确定性量化。

AI中文摘要

评估大语言模型（LLMs）在开放性任务上无需真实标签的评估越来越通过LLM-as-a-judge范式进行。一个关键但未充分建模的问题是，评判LLMs在可靠性上存在显著差异；将所有评委视为同等对待会导致偏见的排行榜和误导性的不确定性估计。更多的数据在不正确的聚合下可能导致评估更加自信地错误。我们提出了一种面向评委的排名框架，通过引入评委特定的辨别参数扩展Bradley-Terry-Luce模型，在不参考标签的情况下联合估计潜在模型质量和评委可靠性。我们建立了可识别性，直到自然归一化，并证明最大似然估计的一致性和渐近正态性，从而能够为分数差异和排名比较生成置信区间。在多个公开基准和一个新收集的数据集上，我们的方法提高了与人类偏好的一致性，比无权基线实现了更高的数据效率，并产生了校准的LLM排名不确定性量化。

英文摘要

Evaluating large language models (LLMs) on open-ended tasks without ground-truth labels is increasingly done via the LLM-as-a-judge paradigm. A critical but under-modeled issue is that judge LLMs differ substantially in reliability; treating all judges equally can yield biased leaderboards and misleading uncertainty estimates. More data can make evaluation more confidently wrong under misspecified aggregation. We propose a judge-aware ranking framework that extends the Bradley-Terry-Luce model by introducing judge-specific discrimination parameters, jointly estimating latent model quality and judge reliability from pairwise comparisons without reference labels. We establish identifiability up to natural normalizations and prove consistency and asymptotic normality of the maximum likelihood estimator, enabling confidence intervals for score differences and rank comparisons. Across multiple public benchmarks and a newly collected dataset, our method improves agreement with human preferences, achieves higher data efficiency than unweighted baselines, and produces calibrated uncertainty quantification for LLM rankings.
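
One way to read "judge-specific discrimination parameters" is sketched below. The functional form, a logistic in `gamma_k * (theta_i - theta_j)`, is an assumption made for illustration, since the abstract does not state the exact parameterisation. All names and numeric values are hypothetical.

```python
import math


def win_probability(theta_i, theta_j, gamma_k):
    """P(judge k prefers model i over model j) under the assumed form
    sigmoid(gamma_k * (theta_i - theta_j))."""
    return 1.0 / (1.0 + math.exp(-gamma_k * (theta_i - theta_j)))


def log_likelihood(comparisons, theta, gamma):
    """Log-likelihood of a list of (winner, loser, judge) triples."""
    return sum(math.log(win_probability(theta[w], theta[l], gamma[k]))
               for w, l, k in comparisons)


theta = {"model_a": 1.0, "model_b": 0.0}          # hypothetical latent qualities
gamma = {"sharp_judge": 3.0, "noisy_judge": 0.2}  # hypothetical discriminations

p_sharp = win_probability(theta["model_a"], theta["model_b"], gamma["sharp_judge"])
p_noisy = win_probability(theta["model_a"], theta["model_b"], gamma["noisy_judge"])
print(round(p_sharp, 3), round(p_noisy, 3))
```

Under this form a judge with small `gamma_k` returns verdicts close to a coin flip, so its comparisons carry little information about the quality gap. Multiplying every `theta` by a constant and dividing every `gamma` by the same constant leaves all probabilities unchanged, which is one reason some normalisation is needed before the parameters can be identified.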

2505.15201 2026-06-11 cs.LG cs.AI cs.CL stat.ML 版本更新

Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems

Pass@K 策略优化：解决更困难的强化学习问题

Christian Walder, Deep Karkhanis

AI总结提出 Pass-at-k 策略优化 (PKPO)，通过变换奖励直接优化 pass@k 性能，利用低方差无偏估计器，在训练中退火 k 可同时提升 pass@1 和 pass@k，解决更难问题。

AI中文摘要

强化学习算法对每个问题采样多个 n>1 的解决方案尝试并独立奖励它们。这优化了 pass@1 性能，优先考虑孤立样本的强度，而牺牲了样本集的多样性和集体效用。这未充分利用采样能力，限制了探索和在更难示例上的最终改进。作为修复，我们提出 Pass-at-k 策略优化 (PKPO)，一种对最终奖励的变换，导致直接优化 pass@k 性能，从而优化联合考虑时最大化奖励的样本集。我们的贡献是推导出 pass@k 及其梯度在二元和连续奖励设置中的新型低方差无偏估计器。我们展示了使用我们的估计器进行优化简化为标准强化学习，其中奖励经过稳定高效的变换函数联合变换。虽然先前的工作仅限于 k=n，但我们是第一个能够对任意 k ≤ n 实现 pass@k 鲁棒优化的。此外，我们的方法不是以 pass@1 性能换取 pass@k 增益，而是允许在训练中退火 k，同时优化两个指标，通常能在显著 pass@k 增益的同时获得强大的 pass@1 数值。我们在玩具实验上验证了我们的奖励变换，揭示了我们的公式的方差减少特性。我们还使用开源 LLM GEMMA-2 包含了真实世界的例子。我们发现我们的变换有效地优化了目标 k。此外，更高的 k 值能够解决更多和更难的问题，而退火 k 则同时提升了 pass@1 和 pass@k。关键的是，在传统 pass@1 优化停滞的具有挑战性的任务集上，我们的 pass@k 方法解锁了学习，这可能是由于通过优先考虑联合效用而非单个样本的效用实现了更好的探索。

英文摘要

Reinforcement Learning (RL) algorithms sample multiple n>1 solution attempts for each problem and reward them independently. This optimizes for pass@1 performance and prioritizes the strength of isolated samples at the expense of the diversity and collective utility of sets of samples. This under-utilizes the sampling capacity, limiting exploration and eventual improvement on harder examples. As a fix, we propose Pass-at-k Policy Optimization (PKPO), a transformation on the final rewards which leads to direct optimization of pass@k performance, thus optimizing for sets of samples that maximize reward when considered jointly. Our contribution is to derive novel low variance unbiased estimators for pass@k and its gradient, in both the binary and continuous reward settings. We show optimization with our estimators reduces to standard RL with rewards that have been jointly transformed by a stable and efficient transformation function. While previous efforts are restricted to k=n, ours is the first to enable robust optimization of pass@k for any arbitrary k <= n. Moreover, instead of trading off pass@1 performance for pass@k gains, our method allows annealing k during training, optimizing both metrics and often achieving strong pass@1 numbers alongside significant pass@k gains. We validate our reward transformations on toy experiments, which reveal the variance reducing properties of our formulations. We also include real-world examples using the open-source LLM, GEMMA-2. We find that our transformation effectively optimizes for the target k. Furthermore, higher k values enable solving more and harder problems, while annealing k boosts both the pass@1 and pass@k. Crucially, for challenging task sets where conventional pass@1 optimization stalls, our pass@k approach unblocks learning, likely due to better exploration by prioritizing joint utility over the utility of individual samples.
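
For the binary-reward case, the quantity being optimised can be made concrete with the widely used combinatorial estimator of pass@k. This sketch is not the paper's new estimator or its gradient: it only shows the standard unbiased estimate from `n` samples with `c` correct, valid for any `k <= n`. The function name is illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased estimate of pass@k from n samples of which c are correct.

    Equals 1 - C(n - c, k) / C(n, k): one minus the probability that a
    uniformly random size-k subset of the n samples contains no correct one.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    return 1.0 - comb(n - c, k) / comb(n, k)


print(pass_at_k(10, 3, 1), pass_at_k(10, 3, 5))
```

With `k = 1` the estimate reduces to the fraction of correct samples, `c / n`. Larger `k` rewards a set of samples for containing at least one success, which is the joint utility the abstract contrasts with per-sample rewards.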

2512.11081 2026-06-11 stat.ML cs.LG stat.ME 版本更新

Provable Recovery of Locally Important Signed Features and Interactions from Random Forest

从随机森林中可证明地恢复局部重要符号特征和交互

Kata Vuk, Nicolas Alexander Ihlo, Merle Behr

AI总结提出一种局部、模型特定的特征与交互重要性方法，通过结合全局和局部决策路径模式，在局部尖峰稀疏模型下可证明地恢复真实信号特征及其交互，并识别特征值大小对预测的驱动方向。

AI中文摘要

特征与交互重要性（FII）方法在监督学习中至关重要，用于评估复杂预测模型中输入变量及其交互的相关性。在许多领域，如个性化医疗，通常需要针对单个预测的局部解释，而不是总结整体特征重要性的全局分数。随机森林（RF）在这些场景中被广泛使用，现有的可解释性方法通常利用树结构和分裂统计量来提供模型特定的见解。然而，对RF的局部FII方法的理论理解仍然有限，这使得如何解释单个预测的高重要性分数变得不明确。我们提出了一种新颖的、局部的、模型特定的FII方法，该方法识别特征在决策路径上的频繁共现，将全局模式与特定测试点路径上的模式相结合。我们证明，在局部尖峰稀疏（LSS）模型下，我们的方法一致地恢复真实的局部信号特征及其交互，并识别出大或小的特征值是否驱动预测。通过模拟研究和真实数据示例，我们展示了我们的方法和理论结果的有用性。

英文摘要

Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many domains, such as personalized medicine, local interpretations for individual predictions are often required, rather than global scores summarizing overall feature importance. Random Forests (RFs) are widely used in these settings, and existing interpretability methods typically exploit tree structures and split statistics to provide model-specific insights. However, theoretical understanding of local FII methods for RF remains limited, making it unclear how to interpret high importance scores for individual predictions. We propose a novel, local, model-specific FII method that identifies frequent co-occurrences of features along decision paths, combining global patterns with those observed on paths specific to a given test point. We prove that our method consistently recovers the true local signal features and their interactions under a Locally Spike Sparse (LSS) model and also identifies whether large or small feature values drive a prediction. We illustrate the usefulness of our method and theoretical results through simulation studies and a real-world data example.

2408.07498 2026-06-11 math.AP stat.ML 版本更新

Wasserstein Gradient Flows of MMD Functionals with Distance Kernel and Cauchy Problems on Quantile Functions

距离核MMD泛函的Wasserstein梯度流及分位数函数上的Cauchy问题

Richard Duong, Viktor Stein, Robert Beinert, Johannes Hertrich, Gabriele Steidl

AI总结研究负距离核下最大均值差异泛函的Wasserstein梯度流，通过将Wasserstein-2空间等距嵌入分位数函数空间，将梯度流转化为L2上的Cauchy问题并给出解公式，证明了流的正则性。

Comments: We corrected the implicit Euler scheme in our code and updated the plots. Also, a minor mistake in the def. (14) and an error in the proof of Thm. 3.5 have been corrected. We thank the anonymous contributors for their valuable feedback, further improving the clarity of the paper. 48 pages, 23 figures, comments welcome!

AI中文摘要

我们全面描述了实直线上最大均值差异（MMD）泛函 $\mathcal F_\nu:= \text{MMD}_K^2(\cdot, \nu)$ 朝向给定目标测度 $\nu$ 的Wasserstein梯度流，其中我们关注负距离核 $K(x,y):= -|x-y|$。在一维情况下，Wasserstein-2空间可以等距嵌入到分位数函数的锥 $\mathcal C(0,1) \subset L_2(0,1)$ 中，从而通过 $L_2(0,1)$ 上相关Cauchy问题的解来刻画Wasserstein梯度流。基于在 $L_2(0,1)$ 上构造 $\mathcal F_\nu$ 的适当对应物及其次微分，我们给出了Cauchy问题的解。对于离散目标测度 $\nu$，这导致一个分段线性解公式。我们证明了流在 $\mathcal C(0,1)$ 子集上的不变性和光滑性。对于某些 $\mathcal F_\nu$ 流，这意味着初始点测度立即变得绝对连续，并随时间保持。最后，我们通过使用隐式欧拉格式的各种数值例子说明了流的行为，该格式可通过二分法轻松计算。对于连续目标 $\nu$，也可以使用显式欧拉格式，尽管收敛保证有限。

英文摘要

We give a comprehensive description of Wasserstein gradient flows of maximum mean discrepancy (MMD) functionals $\mathcal F_\nu:= \text{MMD}_K^2(\cdot, \nu)$ towards given target measures $\nu$ on the real line, where we focus on the negative distance kernel $K(x,y):= -|x-y|$. In one dimension, the Wasserstein-2 space can be isometrically embedded into the cone $\mathcal C(0,1) \subset L_2(0,1)$ of quantile functions leading to a characterization of Wasserstein gradient flows via the solution of an associated Cauchy problem on $L_2(0,1)$. Based on the construction of an appropriate counterpart of $\mathcal F_\nu$ on $L_2(0,1)$ and its subdifferential, we provide a solution of the Cauchy problem. For discrete target measures $\nu$, this results in a piecewise linear solution formula. We prove invariance and smoothing properties of the flow on subsets of $\mathcal C(0,1)$. For certain $\mathcal F_\nu$-flows this implies that initial point measures instantly become absolutely continuous, and stay so over time. Finally, we illustrate the behavior of the flow by various numerical examples using an implicit Euler scheme, which is easily computable by a bisection algorithm. For continuous targets $\nu$, also the explicit Euler scheme can be employed, although with limited convergence guarantees.
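
Two objects from the abstract can be computed directly for empirical measures on the real line. The sketch below assumes uniform weights and the standard definition of squared MMD without an extra factor of 1/2, so the paper's normalisation may differ. The second function illustrates the isometric embedding for the special case of two measures with equally many atoms. Function names are illustrative.

```python
def mmd2_negative_distance(xs, ys):
    """Squared MMD between two empirical measures for K(x, y) = -|x - y|.

    Uses MMD^2 = E K(X, X') + E K(Y, Y') - 2 E K(X, Y) with uniform weights,
    which for this kernel is 2 E|X - Y| - E|X - X'| - E|Y - Y'|.
    """
    def mean_abs(a, b):
        return sum(abs(u - v) for u in a for v in b) / (len(a) * len(b))

    return 2.0 * mean_abs(xs, ys) - mean_abs(xs, xs) - mean_abs(ys, ys)


def quantile_l2_distance2(xs, ys):
    """Squared L2(0, 1) distance between the quantile functions of two
    empirical measures with the same number of atoms.

    In one dimension this equals the squared Wasserstein-2 distance.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many atoms")
    return sum((u - v) ** 2 for u, v in zip(sorted(xs), sorted(ys))) / len(xs)


print(mmd2_negative_distance([0.0], [1.0]))
print(quantile_l2_distance2([2.0, 0.0], [1.0, 3.0]))
```

Sorting the atoms is all that is needed to evaluate the quantile functions here, since the quantile function of an empirical measure with n atoms is a step function taking the i-th smallest atom on the interval ((i - 1)/n, i/n).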

2510.02149 2026-06-11 cs.LG math.OC stat.ML 版本更新

Reinforcement Learning with Action-Triggered Observations

具有动作触发观测的强化学习

Alexander Ryabchenko, Wenlong Mou

AI总结提出动作触发稀疏可追踪MDP框架，推导Bellman方程并证明最优策略存在，利用观测间动作序列的线性表示实现基于回归的方法，在几何分布情节下达到与完全可观测线性MDP匹配的遗憾界。

AI中文摘要

投影随机森林与圆形数据的共形预测

Paulo C. Marques F., Rinaldo Artes, Helton Graziadei

AI总结针对圆形响应回归问题，应用共形预测技术，通过投影方法将线性回归模型转换为圆形模型，并利用随机森林的袋外机制避免额外校准样本，生成具有有限样本覆盖保证和自适应弧长的预测集。

Comments: 7 pages; 4 figures

AI中文摘要

我们将共形预测技术应用于具有圆形响应的回归问题，在数据可交换性假设下，为任何圆形预测模型生成具有自适应弧长和有限样本覆盖保证的预测集。利用现有为线性响应设计的高性能预测模型，我们分析了一种通用的投影过程，将任何线性响应回归模型转换为适用于圆形响应的模型。当在此投影过程中使用随机森林作为基模型时，我们利用随机森林的袋外机制，在构建预测集时无需单独的校准样本。在合成和真实数据集上，与两种现有替代模型生成的分割共形预测集相比，所得的投影随机森林模型产生了更高效的袋外共形预测集，中位弧长更短。

英文摘要

We apply conformal prediction techniques to regression problems with circular responses, producing prediction sets with adaptive arc length and finite-sample coverage guarantees for any circular predictive model under the assumption of data exchangeability. Leveraging the high performance of existing predictive models designed for linear responses, we analyze a general projection procedure that converts any linear-response regression model into one suitable for circular responses. When random forests are used as base models in this projection procedure, we leverage the random forest out-of-bag mechanism to eliminate the need for a separate calibration sample in the construction of prediction sets. On synthetic and real datasets, the resulting projected random forest model produces more efficient out-of-bag conformal prediction sets, with shorter median arc length, than the split conformal prediction sets generated by two existing alternative models.
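
A minimal sketch of conformal prediction on the circle follows. It uses the simpler split-conformal recipe with a fixed arc radius, not the out-of-bag construction or the adaptive arc lengths described in the abstract, and the calibration data are made up. The nonconformity score is the shortest arc length between prediction and observation.

```python
import math

TWO_PI = 2.0 * math.pi


def circ_dist(a, b):
    """Shortest arc length between two angles given in radians."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def conformal_radius(preds, ys, alpha):
    """Split-conformal quantile of the circular-distance scores.

    Returns pi (the whole circle) when the calibration set is too small.
    """
    scores = sorted(circ_dist(p, y) for p, y in zip(preds, ys))
    k = math.ceil((len(scores) + 1) * (1.0 - alpha))
    return math.pi if k > len(scores) else scores[k - 1]


def covers(y, centre, radius):
    """True if angle y lies in the arc centre +/- radius."""
    return circ_dist(y, centre) <= radius


# Hypothetical calibration data: point predictions and observed angles.
cal_preds = [0.0, 0.0, 0.0, 6.2, 6.2, 3.0, 3.0, 1.5, 1.5]
cal_ys = [0.1, 0.3, 6.1, 0.1, 6.0, 3.4, 2.5, 1.6, 2.2]
radius = conformal_radius(cal_preds, cal_ys, alpha=0.25)
print(radius, covers(0.05, 6.2, radius), covers(1.0, 6.2, radius))
```

The wrap-around in `circ_dist` is the point of the example: a prediction of 6.2 radians and an observation of 0.05 radians are close on the circle even though their numerical difference is large.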

2606.11439 2026-06-11 stat.ME 新提交

A Likelihood Ratio Testing Approach for Interval-Censored Data

区间删失数据的似然比检验方法

Yuan Wu, Susan Halabi

AI总结针对区间删失数据，提出基于样条筛的稳健似然比检验，解决Wald检验在小样本中的不稳定性，理论推导渐近分布，模拟和实例验证其优越性。

2606.11414 2026-06-11 stat.ME 新提交

Group Sequential Sample Size for Comparing Two Survival Probabilities at a Specific Time Point

比较特定时间点两个生存概率的组序贯样本量

Susan Halabi, Lu Liu, Chenxi Yu, Yuan Wu

AI总结提出一种新方法，在固定和组序贯试验设计中同时确定检验两个生存概率的样本量，控制I类错误，适用于比例风险假设不成立或含新辅助治疗的随机试验。

AI中文摘要

我们提出了一种新方法，该方法在固定和组序贯试验设计中同时确定检验两个预先指定时间点的生存概率所需的样本量，同时保证I类错误控制。在不同假设差异、失效分布、删失比例和名义功效下的模拟显示出一致的性能，而中期分析则突出了每次分析时降低的I类错误和增加的功效，无论潜在的失效时间分布或花费函数如何。重要的是，我们的方法特别适用于评估随机试验中固定时间的生存结局，其中一组治疗包括术前新辅助治疗，而另一组仅进行手术。此外，当比例风险假设不满足时，该方法也具有优势，这常见于具有延迟或时变治疗效果或生存曲线交叉的免疫治疗试验中。该方法也适用于随机II期试验，其中较小的样本量和中间或替代时间至事件终点的使用要求高效的数据利用和稳健的错误控制。我们通过肾癌和前列腺癌的激励性例子说明了该方法。附带的R Shiny应用程序使研究者能够交互式地计算样本量，从而促进不同环境下的实际试验规划。

英文摘要

We propose a novel method that simultaneously determines the sample size for testing two survival probabilities at a pre-specified time while guaranteeing type I error control in both fixed and group-sequential trial designs. Simulations across varying hypothesized differences, failure distributions, censoring proportions, and nominal powers demonstrate consistent performance, while interim analyses highlight reduced type I error and increased power at each look, regardless of the underlying failure time distribution or spending function. Importantly, our method is especially useful for evaluating survival outcomes at a fixed time in randomized trials where one treatment arm includes neoadjuvant therapy prior to surgery while the other involves surgery alone. Furthermore, it is advantageous when the proportional hazards assumption is not satisfied, as often occurs in immunotherapy trials with delayed or time-varying treatment effects or crossing survival curves. The method is also applicable to randomized phase II trials, where smaller sample sizes and the use of intermediate or surrogate time-to-event endpoints demand efficient data use and robust error control. We illustrate the approach with motivating examples in renal and prostate cancer. An accompanying R Shiny application enables investigators to compute sample sizes interactively, facilitating practical trial planning in diverse settings.

2604.23464 2026-06-11 stat.ME stat.AP 版本更新

Design-Based Cross-Validation for Comparing Small Area Estimators

关于小区域估计器的交叉验证

Qianyu Dong, Zehang Richard Li

AI总结本文提出一种适用于复杂调查设计的小区域估计器交叉验证框架，通过分解交叉验证平方误差，揭示可识别偏差与不可识别成分，提升模型比较的稳健性和可解释性。

Comments: Previous title: "On cross-validation for small area estimators"

AI中文摘要

地方公共卫生监测常常依赖住户调查，但所需空间分辨率的数据稀少。小区域估计（SAE）方法通过跨区域借用强度和辅助信息解决这一挑战。然而，在缺乏真实数据的情况下，比较这些估计器仍然困难。我们提出了一种适用于复杂调查设计的交叉验证框架，用于评估小区域估计器。我们的方法使能够对区域级和单元级SAE模型进行模型无关的比较。框架的核心是交叉验证平方误差的分解，揭示了可识别偏差和不可识别成分，后者可以被界定。我们的理论结果和模拟研究显示，传统方法如留一区域法交叉验证可能导致误导性的模型排名，而所提方法提供了更稳健和可解释的模型比较，并具有不确定性量化。我们通过比较赞比亚Demographic and Health Surveys中估计的亚国家女性识字率的小区域估计模型，展示了该框架。

英文摘要

Subnational monitoring of public health and development indicators often relies on household surveys where data are sparse at the desired spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and incorporating auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a design-based cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level SAE models. We derive a decomposition of the conditional mean squared error that yields a consistent cross-validation score, show that finite-sample comparisons carry an unidentifiable bias that can be bounded, and use this bound as a principled threshold for ranking models. We further show that leave-one-area-out cross-validation, a popular alternative, targets extrapolation rather than smoothing error and can reverse the correct ranking. We evaluate the framework through extensive design-based simulations. We apply the framework to compare subnational female literacy estimators in Zambia using the 2024 Demographic and Health Survey. The framework applies broadly across prevalence mapping and other SAE problems and is applicable to any small area estimator irrespective of the underlying model class.

2602.00434 2026-06-11 stat.AP 版本更新

How should covariates be handled in randomized trials? Empirical evidence from 50 trials and recommendations for practice

随机临床试验中协变量调整策略的基准测试

Yulin Shao, Liangbo Lyu, Menggang Yu, Bingkai Wang

AI总结本文通过大规模实证研究比较了不同协变量调整策略在随机临床试验中的表现，发现简洁的回归方法在效率提升方面表现优异，而基于机器学习的方法在二元结局中计算稳定性较差。

AI中文摘要

背景和目的：协变量调整可以提高随机临床试验的精度和统计功效，并被主要监管机构推荐。然而，关于不同调整策略在多样化真实世界试验中的表现缺乏实证证据，导致对统计分析计划中应预指定的方法和协变量存在不确定性。我们旨在填补这一空白并提供实用建议。方法：我们利用50个公开可用的随机试验的个体层面数据（29,094名参与者；574个治疗-结局比较）进行了大规模实证研究。我们比较了常用的协变量调整估计量，包括分析协方差、逆概率加权、g计算和基于机器学习的方法，并结合三种协变量选择策略。性能通过精度提升、点估计变化、计算可靠性以及协变量调整改变统计显著性概率来评估。结果：协变量调整在大多数情况下提高了精度，连续结局的中位方差减少率为13.3%，二元结局为4.6%。使用少量预指定的预测性协变量的简洁回归方法在小至中等样本中表现与更复杂的方法相当或更好。基于机器学习的估计量在二元结局中未提供额外的精度，并且更易出现计算失败。结论：在不同试验中，简洁的协变量调整提供了稳定的效率提升，而不引入系统性偏差。这些发现支持在主要试验分析中常规使用协变量调整。所有整理的数据集和分析代码已公开发布，以支持未来临床研究。

英文摘要

Background and Objective: Covariate adjustment can improve precision and power in randomized clinical trials and is recommended by major regulatory agencies. However, there is limited empirical evidence on how different adjustment strategies perform across diverse real-world trials, leaving uncertainty about which methods and covariates should be prespecified in statistical analysis plans. We aim to address this gap and provide practical recommendations. Methods: We conducted a large-scale empirical study using individual-level data from 50 publicly available randomized trials (29,094 participants; 574 treatment-outcome comparisons). We compared commonly used covariate-adjusted estimators, including analysis of covariance, inverse-probability weighting, g-computation, and machine-learning-based approaches, combined with three covariate-selection strategies. Performance was evaluated using precision gains, changes in point estimates, computational reliability, and the probability that covariate adjustment altered statistical significance relative to an unadjusted analysis. Results: Covariate adjustment improved precision in most settings, with a median variance reduction of 13.3\% for continuous outcomes and 4.6\% for binary outcomes. Parsimonious regression approaches using a small prespecified set of prognostic covariates performed as well as or better than more complex methods, particularly in small to medium samples. Machine-learning-based estimators did not provide additional precision and were more prone to computational failure for binary outcomes. Conclusions: Across trials, parsimonious covariate adjustment provided consistent efficiency gains without introducing systematic bias. These findings support routine covariate adjustment in primary trial analyses. All curated datasets and analysis code are openly released to support future clinical research.

2505.00571 2026-06-11 stat.ML cs.LG 版本更新

Discovery and inference beyond linearity for epidemiological data by integrating Bayesian regression, tree ensembles and Shapley values

通过整合贝叶斯回归、树集成和Shapley值对流行病学数据进行线性之外的发现与推断

Giorgio Spadaccini, Marjolein Fokkema, Mark A. van de Wiel

AI总结提出RuleSHAP框架，结合贝叶斯稀疏回归、改进的树规则生成器和Shapley值，实现非线性与交互效应的检测及个体水平的不确定性量化，应用于流行病学数据发现高胆固醇和血压的影响因素。

AI中文摘要

机器学习在流行病学和医疗健康研究中越来越受欢迎，用于无假设地发现风险和保护因素。机器学习在发现非线性和交互作用方面很强，但这种能力因缺乏可靠的推断而受损。尽管Shapley值提供了特征效应的局部度量，但这些效应通常缺乏有效的不确定性量化，从而排除了统计推断。我们提出RuleSHAP，一个通过结合专用贝叶斯稀疏回归模型、改进的基于树的规则生成器和Shapley值归因来解决这一局限性的框架。RuleSHAP能够检测非线性和交互效应，其关键贡献在于个体水平的不确定性量化。我们推导了一个在该框架内计算边际Shapley值的有效公式。我们将RuleSHAP应用于一个流行病学队列的数据，以检测和推断高胆固醇和血压的几种效应，例如年龄、性别、种族、BMI和血糖水平等特征之间的非线性交互效应。最后，我们在模拟数据上证明了我们框架的有效性。

英文摘要

Machine Learning (ML) is gaining popularity in epidemiology and healthcare studies for hypothesis-free discovery of risk and protective factors. ML is strong at discovering nonlinearities and interactions, but this power is compromised by a lack of reliable inference. Although Shapley values provide local measures of features' effects, valid uncertainty quantification for these effects is typically lacking, thus precluding statistical inference. We propose RuleSHAP, a framework that addresses this limitation by combining a dedicated Bayesian sparse regression model with an improved tree-based rule generator and Shapley value attribution. RuleSHAP provides detection of nonlinear and interaction effects, with uncertainty quantification at the individual level as a key contribution. We derive an efficient formula for computing marginal Shapley values within this framework. We apply RuleSHAP to data from an epidemiological cohort to detect and infer several effects for high cholesterol and blood pressure, such as nonlinear interaction effects between features like age, sex, ethnicity, BMI and glucose level. To conclude, we demonstrate the validity of our framework on simulated data.
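
To make the attribution step concrete, the sketch below computes exact Shapley values for a toy model with an interaction term by enumerating all feature orderings. It is not RuleSHAP and not the paper's efficient formula: absent features are simply set to a baseline value, which is a simplifying assumption, and the cost grows factorially with the number of features. The model and all names are hypothetical.

```python
from itertools import permutations


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x by enumerating feature orderings.

    Features outside the coalition are set to their baseline value.
    """
    d = len(x)
    phi = [0.0] * d
    orderings = list(permutations(range(d)))
    for order in orderings:
        current = list(baseline)
        prev = f(current)
        for j in order:
            # Add feature j to the coalition and record its marginal gain.
            current[j] = x[j]
            value = f(current)
            phi[j] += value - prev
            prev = value
    return [p / len(orderings) for p in phi]


def toy_model(z):
    """Hypothetical model with an interaction between the two features."""
    return z[0] + 2.0 * z[1] + z[0] * z[1]


phi = shapley_values(toy_model, x=[1.0, 1.0], baseline=[0.0, 0.0])
print(phi, sum(phi))
```

The attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is split equally between the two features. These are point attributions only, with no uncertainty attached, which is the gap the abstract says the framework addresses.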

1910.07712 2026-06-11 stat.AP stat.CO stat.ME 版本更新

Estimating Spatially-Smoothed Fiber Orientation Distribution from Diffusion-MRI Experiments

从扩散MRI实验估计空间平滑的纤维取向分布

Jilei Yang, Seungyong Hwang, Mengjie Shi, Jie Peng

AI总结提出最近邻自适应回归模型（NARM），通过加权局部似然估计和空间邻域嵌套实现纤维取向分布（FOD）的空间自适应估计，引入体素级重缩放和数据驱动停止规则防止过平滑，并基于配置感知策略选择相似性平滑参数，在模拟和人类连接组项目数据中提高了估计准确性和可重复性。

AI中文摘要

扩散加权磁共振成像（D-MRI）是一种非侵入性体内技术，用于探测生物组织的微观结构架构。在每个体素处，纤维取向分布（FOD）表征局部纤维构型和方向，因此是D-MRI分析中的核心估计对象。我们提出了最近邻自适应回归模型（NARM），这是一种用于FOD估计的空间自适应框架，它在嵌套的空间邻域上执行加权局部似然估计，其中权重联合编码相邻FOD之间的空间邻近性和相似性，通过最优传输或Hellinger距离测量。为了防止过平滑同时保留结构异质性，我们引入了体素级重缩放方案和基于最小最近邻相异性的数据驱动停止规则。我们进一步开发了一种配置感知策略来选择相似性平滑参数，使平滑强度能够适应局部纤维复杂性。模拟研究表明，相对于体素级方法和现有的空间平滑方法PMARM，NARM提高了FOD估计精度。对人类连接组项目的重测数据的应用还表明，NARM产生了更可重复的FOD估计。实现细节以及模拟和真实数据分析的脚本可在以下网址获得：https://github.com/DMRIdotL/NARM

英文摘要

Diffusion-weighted magnetic resonance imaging (D-MRI) is a noninvasive in vivo technique for probing the microstructural architecture of biological tissues. At each voxel, the fiber orientation distribution (FOD) characterizes local fiber configurations and orientations and is therefore a central object of estimation in D-MRI analysis. We propose the Nearest-Neighbor Adaptive Regression Model (NARM), a spatially adaptive framework for FOD estimation that performs weighted local likelihood estimation over nested spatial neighborhoods, where the weights jointly encode spatial proximity and similarity among neighboring FODs, measured by either the optimal transport or Hellinger distance. To prevent over-smoothing while preserving structural heterogeneity, we introduce a voxel-wise rescaling scheme and a data-driven stopping rule based on minimum nearest-neighbor dissimilarity. We further develop a configuration-aware strategy for selecting the similarity-smoothing parameter, allowing the smoothing strength to adapt to local fiber complexity. Simulation studies demonstrate that NARM improves FOD estimation accuracy relative to voxel-wise methods and the existing spatial smoothing approach PMARM. Application to test-retest data from the Human Connectome Project additionally shows that NARM yields more reproducible FOD estimates. Implementation details and scripts for the simulation and real data analyses are available at https://github.com/DMRIdotL/NARM

2305.09455 2026-06-11 stat.AP 版本更新

A latent class approach to assess the effects of dynamic adherence to polytherapy in heart failure patients

评估心力衰竭患者多药治疗动态依从性影响的潜在类别方法

Nicole Fontana, Laura Savaré, Emanuele Di Angelantonio, Francesca Ieva

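AI summary: Proposes a method combining latent Markov models with dynamic adherence modeling to analyze polytherapy adherence patterns in heart failure patients and their effect on rehospitalization risk, finding that high adherence significantly reduces that risk.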

详情

AI中文摘要

心力衰竭（HF）的治疗严重依赖药物治疗，特别是根据临床指南推荐联合使用多种疗法。然而，对规定方案的依从性不佳仍然是一个重大挑战，导致住院率增加和患者预后恶化。本研究引入了一种新颖的方法学流程，将潜在马尔可夫模型（LMM）与动态依从性建模相结合，以评估依从性行为及其对HF再住院的影响。使用意大利伦巴第大区的行政医疗数据，我们分析了2020年7月至12月期间因HF住院的6,818名患者。在六个月的观察期内每月评估依从性，并使用Cox回归将依从性概况与临床结局联系起来。识别出七种潜在行为概况，反映了不同的依从性水平和轨迹。结果显示，较高的依从性水平显著降低了再住院风险。与低依从性患者相比，持续高依从性患者的HF再住院风险降低了56%。重要的是，观察期内依从性的改善与更好的生存概率相关，突显了及时干预的潜在益处。此外，依从性行为受到年龄、合并症负担和观察期内住院等因素的影响。本研究强调了动态和个性化策略在监测和增强多药治疗依从性方面的重要性。通过将依从性模式与临床结局联系起来，所提出的方法为改善患者管理和减轻HF对医疗系统的负担提供了可操作的见解。

英文摘要

Heart failure (HF) treatment relies heavily on pharmacotherapy, particularly combining multiple therapies as recommended by clinical guidelines. However, non-adherence to prescribed regimens remains a significant challenge, contributing to increased hospitalizations and poorer patient outcomes. This study introduces a novel methodological pipeline that integrates Latent Markov Models (LMM) with dynamic adherence modeling to evaluate adherence behaviors and their impact on HF rehospitalization. Using administrative healthcare data from Lombardy, Italy, we analyzed 6,818 patients hospitalized for HF between July and December 2020. Adherence was assessed monthly over a six-month observation period, and adherence profiles were linked to clinical outcomes using Cox regression. Seven latent behavioral profiles were identified, reflecting varying levels and trajectories of adherence. The findings revealed that higher adherence levels significantly reduced the risk of rehospitalization. Patients with consistently high adherence exhibited a 56% lower risk of HF rehospitalization compared to those with low adherence. Importantly, improving adherence during the observation period was associated with better survival probabilities, highlighting the potential benefits of timely interventions. Additionally, adherence behaviors were influenced by factors such as age, comorbidity burden, and hospitalization during the observation period. This study underscores the importance of dynamic and personalized strategies to monitor and enhance adherence to polytherapy. By linking adherence patterns to clinical outcomes, the proposed approach offers actionable insights for improving patient management and reducing the burden of HF on healthcare systems.

2606.12260 2026-06-11 econ.TH cs.AI cs.GT cs.LG stat.ML 新提交

Market Design for AI: Beyond the Copyright Binary

人工智能的市场设计：超越版权二元论

Yan Dai, Maryam Farboodi, Negin Golrezaei, Sepehr Shahshahani

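AI summary: Using static and dynamic game models, this paper analyzes the failure of both the "free use" and "strong intellectual property" regimes in the market for AI training data, and proposes a market design in which a data intermediary internalizes externalities and subsidizes innovative contributions.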

详情

AI中文摘要

夏普比率的事后检验

Steven E. Pav

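AI summary: Proposes a post hoc test for the Sharpe ratio, analogous to Tukey's test, for comparing differences among assets' Sharpe ratios after rejecting the hypothesis that all population signal-to-noise ratios are equal.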

2509.04691 2026-06-11 stat.AP 版本更新

Inferring Piece Value in Chess and Chess Variants

推断国际象棋及其变体中的棋子价值

Steven E. Pav

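AI summary: Uses logistic regression on Lichess data to estimate piece values in standard chess and four variants, finding that the relative values of the major pieces agree with historical valuations, but that bishops are worth slightly more than knights and that absolute values are smaller in Atomic chess and Antichess.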

详情

Comments: 58 pages

AI中文摘要

我们使用逻辑回归来估计标准国际象棋及几种变体（即Chess 960、原子棋、反象棋和部落棋）中棋子的价值。我们对来自免费开源互联网国际象棋服务器Lichess的多年数据进行回归分析。我们使用已发布的玩家等级分来控制不同玩家技能带来的混杂效应。我们调整了由于观测等级分噪声导致的回归衰减偏差。我们发现，主要棋子的价值相对于兵的价值，与历史估值体系相当一致。然而，我们发现象的价值略高于马。我们发现，在原子棋和反象棋中，棋子的绝对值比标准国际象棋小。我们还给出了当不同技能水平的玩家对战时，使棋局平衡的近似棋子价值。我们简要考虑了使用Stockfish引擎进行自我对弈实验，这提供了关于棋子价值的对比视角。

英文摘要

We use logistic regression to estimate the value of the pieces in standard chess and several chess variants, namely Chess 960, Atomic chess, Antichess, and Horde chess. We perform our regressions on several years of data from Lichess, the free and open-source internet chess server. We use the published player ratings to control for the confounding effect of differential player skill. We adjust for the attenuation bias in regressions due to the noise in observed ratings. We find that major piece values, relative to the value of a pawn, are fairly consistent with historical valuation systems. However, we find that bishops are worth slightly more than knights. We find that piece values are smaller, in absolute value, in Atomic chess and Antichess than in standard chess. We also present approximate values of the pieces to equalize odds when players of varying skill face off. We briefly consider self-play experiments using the Stockfish engine, which give a contrasting view of piece value.

2606.11949 2026-06-11 cs.LG cs.CR stat.ML 新提交

Online Shift Detection and Conformal Adaptation for Deployed Safety Classifiers

已部署安全分类器的在线漂移检测与共形自适应

Jun Wen Leong

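AI summary: Proposes an online monitoring system that detects distribution shift with calibrated sequential statistics and restores the target error rate by adapting thresholds through a conformal abstention layer, achieving 86.6% valid detection across 800 experimental cells.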

详情

Comments: 16 pages, 4 figures, 7 tables. Code and data at this https URL

AI中文摘要

我们提出了一种在线监测系统，用于检测已部署安全分类器中的分布漂移，使用校准的序列统计量来检测分类器何时移出分布。一旦检测到，共形弃权层会自适应调整决策阈值，以恢复目标错误率ε=0.1。在一项预注册的析因评估（4个分类器×5种漂移条件×20个种子×2个窗口大小，共800个单元）中，该系统实现了86.6%的有效检测（693/800，95% CI [84.1%, 88.8%]），平均延迟为39.5步。检测在三种真实标签机制下均有效：合成发作（86.6%）、真实时间越狱（85%，17/20）和GCG对抗攻击。加权共形预测为DeBERTa恢复了高达39个百分点的丢失覆盖率（ESS=46/300），但所有其他分类器均崩溃（ESS≈300）：逻辑密度比估计在高维嵌入空间中实现了完美的源/目标可分离性，将所有重要性权重裁剪至下限。DeBERTa显示出从有效校正（释义，ESS=46）到几乎完全崩溃（对抗后缀，ESS=206）的梯度。PCA降至32维打破了崩溃，为Llama Guard恢复了33个百分点，为ShieldGemma恢复了21个百分点。方差分解显示分类器（η²=0.243）、漂移类型（η²=0.237）及其交互作用（η²=0.185）均对检测延迟方差有显著贡献（所有p<0.001），表明需要针对每个分类器的监测配置文件。

英文摘要

We present an online monitoring system for distributional shift in deployed safety classifiers, using calibrated sequential statistics to detect when a classifier has moved out of distribution. Upon detection, a conformal abstention layer adapts decision thresholds to recover a target error rate epsilon=0.1. In a pre-registered factorial evaluation (4 classifiers x 5 shift conditions x 20 seeds x 2 window sizes, 800 cells), the system achieves 86.6% valid detection (693/800, 95% CI [84.1%, 88.8%]) with mean latency of 39.5 steps. Detection holds across three ground-truth regimes: synthetic onset (86.6%), real temporal jailbreaks (85%, 17/20), and GCG adversarial attacks. Weighted conformal prediction recovers up to 39 pp of lost coverage for DeBERTa (ESS=46/300) but collapses for all other classifiers (ESS~300): logistic density ratio estimation achieves perfect source/target separability in high-dimensional embedding spaces, clipping all importance weights to the floor. DeBERTa shows a gradient from effective correction (paraphrase, ESS=46) to near-total collapse (adversarial suffix, ESS=206). PCA to 32 dimensions breaks the collapse, recovering 33 pp for Llama Guard and 21 pp for ShieldGemma. Variance decomposition reveals classifier (eta^2=0.243), shift type (eta^2=0.237), and their interaction (eta^2=0.185) all contribute substantially to detection latency variance (all p<0.001), indicating per-classifier monitoring profiles are necessary.
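
The ESS figures in the abstract can be read with a small worked example. The sketch below assumes that ESS means the Kish effective sample size, (sum of weights)^2 / (sum of squared weights), which the abstract does not state; the function name `effective_sample_size` and the toy weights are illustrative only. Under that assumption, identical weights give an ESS equal to the number of calibration points, which is consistent with the reported collapse (ESS~300 out of 300) when every importance weight is clipped to the same floor.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2). Assumed definition."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / total_sq


# Collapse case: every weight clipped to the same floor (hypothetical value).
floor = 0.01
collapsed = [floor] * 300

# Informative case: a few points carry most of the weight (toy numbers).
concentrated = [1.0] * 30 + [0.01] * 270

print(round(effective_sample_size(collapsed)))        # uniform weights: ESS equals n
print(round(effective_sample_size(concentrated), 1))  # far fewer effective points
```

A low ESS therefore signals that the weights are doing real reweighting work, while an ESS near the calibration-set size signals that the weights have become uninformative.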

2606.11865 2026-06-11 stat.ML cs.LG 新提交

Conformal Bayes under Label Shift: Post-Hoc Calibration vs. In-Training Adaptation

标签偏移下的共形贝叶斯：事后校准与训练内适应

Seungjin Choi

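AI summary: Studies conformal Bayes under label shift, restoring target-domain coverage through importance-weighted conformal calibration, and compares two strategies, post-hoc calibration and in-training adaptation; the latter acts as a debiasing operator under biased training.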

详情

Comments: 2nd Workshop on Epistemic Intelligence in Machine Learning (EIML@ICML 2026)

AI中文摘要

共形贝叶斯将贝叶斯后验预测与共形校准相结合，产生既统计有效又几何高效的预测集。我们从统一视角研究标签偏移下的共形贝叶斯，识别出两种互补方法，它们通过重要性加权共形校准恢复名义目标域覆盖，但通过独立机制运作。\emph{事后校准}将后验预测向目标域倾斜，并通过重要性加权分位数校正共形阈值，保持参数后验不变。\emph{训练内适应}将参数后验本身向目标域倾斜，产生校正后的预测，其最高预测密度区域作为基于拟合目标预测的最高预测密度（HPD）预测集；效率依赖于模型，并不保证有限样本条件最优性。两个受控实验表明，在无偏训练机制下，两种策略同样实现有效覆盖，而在领先优化机制下，训练内适应作为去偏算子，在覆盖不变的情况下减少区间宽度。

英文摘要

Conformal Bayes combines Bayesian posterior predictives with conformal calibration to produce prediction sets that are both statistically valid and geometrically efficient. We study conformal Bayes under label shift from a unified perspective, identifying two complementary approaches that restore nominal target-domain coverage through importance-weighted conformal calibration but operate through independent mechanisms. \emph{Post-hoc calibration} tilts the posterior predictive toward the target domain and corrects the conformal threshold via an importance-weighted quantile, leaving the parameter posterior unchanged. \emph{In-training adaptation} tilts the parameter posterior itself to the target domain, producing a corrected predictive whose highest predictive density region serves as the highest predictive density (HPD) based prediction set under the fitted target predictive; efficiency is model-dependent and does not imply finite-sample conditional optimality. Two controlled experiments show that in an unbiased training regime both strategies achieve valid coverage equally, while in a lead-optimization regime in-training adaptation acts as a debiasing operator, reducing interval width at unchanged coverage.

2606.11283 2026-06-11 cs.DS cs.LG stat.ML 新提交

Fixed-Parameter Tractability of Private Synthetic Data Generation

私有合成数据生成的固定参数可处理性

Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi

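AI summary: Studies synthetic data generation under differential privacy, establishes fixed-parameter tractability in terms of the treewidth of a graph associated with the query family, and proposes two optimal algorithms.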

2510.07750 2026-06-11 stat.ML cs.LG 版本更新

Calibrating Decision Robustness via Inverse Conformal Risk Control

通过逆保形风险控制校准决策鲁棒性

Wenbin Zhou, Shixiang Zhu

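AI summary: Proposes an inverse conformal risk control framework that gives distribution-free, finite-sample guarantees on miscoverage and regret for robust optimization policies, and helps decision-makers calibrate the robustness level to their cost-risk preferences by tracing the Pareto frontier.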

详情

AI中文摘要

鲁棒优化通过针对最坏情况优化来保护决策免受不确定性影响，但其有效性取决于预先指定的鲁棒性水平，该水平通常是临时选择的，导致保护不足或过度保守且成本高昂的解决方案。最近使用保形预测的方法构建了具有有限样本覆盖保证的数据驱动不确定性集，但它们仍然事先固定覆盖目标，并且对选择鲁棒性水平提供的指导很少。我们提出了一个新框架，该框架为任何鲁棒预测-然后优化策略族提供了无分布、有限样本的误覆盖和遗憾保证。我们的方法构建了有效的估计量，这些估计量描绘出误覆盖-遗憾帕累托前沿，使决策者能够根据其成本-风险偏好可靠地评估和校准鲁棒性水平。该框架易于实现，广泛适用于经典优化公式，并实现了更优的有限样本性能。本文提供了一种原则性的数据驱动方法，用于指导鲁棒性选择，并使从业者能够在高风险决策中平衡鲁棒性和保守性。

英文摘要

Robust optimization safeguards decisions against uncertainty by optimizing against worst-case scenarios, yet its effectiveness hinges on a prespecified robustness level that is often chosen ad hoc, leading to either insufficient protection or overly conservative and costly solutions. Recent approaches using conformal prediction construct data-driven uncertainty sets with finite-sample coverage guarantees, but they still fix coverage targets a priori and offer little guidance for selecting robustness levels. We propose a new framework that provides distribution-free, finite-sample guarantees on both miscoverage and regret for any family of robust predict-then-optimize policies. Our method constructs valid estimators that trace out the miscoverage--regret Pareto frontier, enabling decision-makers to reliably evaluate and calibrate robustness levels according to their cost--risk preferences. The framework is simple to implement, broadly applicable across classical optimization formulations, and achieves sharper finite-sample performance. This paper offers a principled data-driven methodology for guiding robustness selection and empowers practitioners to balance robustness and conservativeness in high-stakes decision-making.

2506.01396 2026-06-11 cs.LG cs.CR stat.ML 版本更新

Mitigating Disparate Impact of Differentially Private Learning through Bounded Adaptive Clipping

通过有界自适应裁剪减轻差分隐私学习中的差异影响

Linzh Zhao, Aki Rehn, Mikko A. Heikkilä, Razane Tajeddine, Antti Honkela

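AI summary: To address the unfair impact of gradient clipping on minority groups in differentially private learning, proposes bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression and improves worst-class accuracy by over 10 percentage points on Skewed and Fashion MNIST.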

详情

Comments: TMLR camera-ready version

AI中文摘要

差分隐私已成为隐私保护机器学习的基本框架。然而，现有的差分隐私学习方法通常对模型预测产生差异影响，例如对少数群体。梯度裁剪常用于差分隐私学习，但会抑制来自困难样本的较大梯度。我们表明，自适应裁剪会加剧这一问题，因为它通常会将裁剪边界缩小到极小值以匹配拟合良好的多数类，同时显著降低其他类的准确率。我们提出有界自适应裁剪，引入可调下界以防止过度梯度抑制。与无界自适应裁剪相比，我们的方法在Skewed和Fashion MNIST上将最差类准确率提高了超过10个百分点，与自动裁剪相比提高了7个百分点，与恒定裁剪相比提高了5个百分点。代码可在该 https URL 获取。

英文摘要

Differential privacy (DP) has become an essential framework for privacy-preserving machine learning. Existing DP learning methods, however, often have disparate impacts on model predictions, e.g., for minority groups. Gradient clipping, which is often used in DP learning, can suppress larger gradients from challenging samples. We show that this problem is amplified by adaptive clipping, which will often shrink the clipping bound to tiny values to match a well-fitting majority, while significantly reducing the accuracy for others. We propose bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression. Our method improves worst-class accuracy by over 10 percentage points on Skewed and Fashion MNIST compared to unbounded adaptive clipping, 7 points compared to Automatic clipping, and 5 points compared to constant clipping. The code is available at this https URL.
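
The effect of the lower bound can be illustrated with a minimal sketch. It assumes per-sample L2 clipping and simply floors an externally supplied adaptive bound; the adaptive update rule, the noise addition, and the authors' actual implementation are not reproduced here, and the names `clip_gradient`, `bounded_clip`, and `lower_bound` are hypothetical.

```python
import math


def clip_gradient(grad, bound):
    """Scale a per-sample gradient so its L2 norm is at most `bound`."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= bound or norm == 0.0:
        return list(grad)
    scale = bound / norm
    return [g * scale for g in grad]


def bounded_clip(grad, adaptive_bound, lower_bound):
    """Clip with the adaptive bound, but never let it fall below lower_bound."""
    effective_bound = max(adaptive_bound, lower_bound)
    return clip_gradient(grad, effective_bound)


hard_sample_grad = [3.0, 4.0]   # L2 norm 5.0, e.g. a poorly fitted sample
tiny_adaptive_bound = 0.01      # what unbounded adaptation might shrink to

unbounded = clip_gradient(hard_sample_grad, tiny_adaptive_bound)
bounded = bounded_clip(hard_sample_grad, tiny_adaptive_bound, lower_bound=1.0)
print(unbounded)  # almost all of the gradient signal is suppressed
print(bounded)    # the floor keeps a usable share of the signal
```

In this toy case the floor keeps the clipped norm at 1.0 rather than 0.01, which is the kind of suppression the abstract says the lower bound is meant to prevent.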

2310.01009 2026-06-11 stat.ME 版本更新

Neyman-Pearson and equal opportunity: when efficiency meets fairness in classification

Neyman-Pearson 与机会均等：当分类中的效率遇到公平

Jianqing Fan, Xin Tong, Yanhui Wu, Lucy Xia, Shunan Yao

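AI summary: Incorporates an equal opportunity constraint into the Neyman-Pearson classification framework, derives the optimal classifier, proposes a finite-sample classifier that satisfies both the fairness and efficiency constraints, and validates it on simulated and real data.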

2606.12317 2026-06-11 stat.ME stat.CO 新提交

ShrinkageTrees: An R Package for Bayesian Tree Ensembles for Survival Analysis and Causal Inference

ShrinkageTrees: 用于生存分析和因果推断的贝叶斯树集成R包

Tijn Jacobs

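AI summary: ShrinkageTrees is an R package that handles right-censored and interval-censored survival data with Bayesian additive regression tree models, supports the decomposition into prognostic and treatment effects for causal inference, and introduces regularization strategies such as depth penalties, Dirichlet splitting, and horseshoe priors, making it suitable for high-dimensional settings.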

2606.11911 2026-06-11 stat.ML cs.LG math.AT 新提交

From Persistence to Survival: Hypothesis Testing, Effect Sizes and Vectorisation for Topological Features

从持续性到生存：拓扑特征的假设检验、效应大小与向量化

Juliette Murris, Bernadette Stolz, Karsten Borgwardt

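AI summary: Proposes STRAND, which treats persistence diagrams as survival data and uses the persistence survival function to unify hypothesis testing, effect-size computation, and vectorisation; validated on synthetic data and real benchmarks.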

详情

AI中文摘要

持久性图是拓扑数据分析中常见的表示形式，但它们并非天然存在于向量空间中，且用于比较它们的统计工具在很大程度上与用于下游预测的工具分开发展。我们引入STRAND（生存拓扑表示图分析），将（集合的）持久性图视为生存数据：每个具有持久性值 $p = d - b$ 的拓扑特征是一个完全观测的事件时间，持久性生存函数 $S(t) = \mathbb{P}(p > t)$ 是比较图的中心对象。从这个单一表示中，我们推导出（i）一个非参数双样本检验，具有校准的第一类错误率和少量图的高功效；（ii）可解释的效应大小；以及（iii）用于下游机器学习的1-Wasserstein稳定特征向量。我们在具有受控拓扑的合成流形上验证了校准和功效，展示了在14个图和3D点云基准上的竞争性向量化，并将该方法应用于fMRI/神经科学数据中的功能性脑连接研究。据我们所知，STRAND是第一个从单一连贯且可解释的表示为持久性图提供假设检验和向量化的方法。

英文摘要

Persistence diagrams are common representations in topological data analysis, but they do not naturally live in a vector space, and the statistical tools developed for comparing them have largely evolved separately from those used for downstream prediction. We introduce STRAND (Survival Topological Representation ANalysis of Diagrams), which treats (collections of) PDs as survival data: each topological feature with persistence value $p = d - b$ is a fully observed time-to-event, and the persistence survival function $S(t) = \mathbb{P}(p > t)$ is the central object for comparing diagrams. From this single representation we derive (i) a non-parametric two-sample test with calibrated Type I error and high power from a small number of diagrams; (ii) interpretable effect sizes; and (iii) a 1-Wasserstein-stable feature vector for downstream machine learning. We validate calibration and power on synthetic manifolds with controlled topology, demonstrate competitive vectorisation across 14 graph and 3D point cloud benchmarks, and apply the method to study functional brain connectivity in fMRI/neuroscience data. To our knowledge, STRAND is the first method to provide hypothesis testing and vectorisation for persistence diagrams from a single coherent and interpretable representation.
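
The persistence survival function defined in the abstract can be made concrete with a short sketch. It assumes a diagram is given as a list of (birth, death) pairs and uses the plain empirical estimator of $S(t) = \mathbb{P}(p > t)$; how STRAND itself estimates or pools $S(t)$ across diagrams is not stated in the abstract, and the function names and toy numbers are illustrative.

```python
def persistence_values(diagram):
    """Persistence p = d - b for each (birth, death) pair in a diagram."""
    return [death - birth for birth, death in diagram]


def survival_function(diagram, t):
    """Empirical S(t): fraction of features with persistence strictly above t."""
    values = persistence_values(diagram)
    if not values:
        raise ValueError("diagram has no features")
    return sum(1 for p in values if p > t) / len(values)


# Toy diagram of (birth, death) pairs; the numbers are made up for illustration.
diagram = [(0.0, 0.5), (0.1, 0.3), (0.2, 1.2), (0.4, 0.5)]

for t in (0.0, 0.25, 0.75, 1.5):
    print(t, survival_function(diagram, t))
```

Evaluating such a function on a fixed grid of thresholds gives a fixed-length vector, which is one simple way to picture how a survival curve can double as a feature vector.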

2606.11651 2026-06-11 cs.LG q-bio.QM stat.AP 新提交

DeepRHP: A Hybrid Variational Autoencoder for Designing Random Heteropolymers as Protein Mimics

DeepRHP：一种用于设计随机异聚合物作为蛋白质模拟物的混合变分自编码器

Shuni Li, Zhiyuan Ruan, Andy Shen, Ivan Jayapurna, Ting Xu, Haiyan Huang

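AI summary: Proposes DeepRHP, a hybrid variational autoencoder that combines a feature-based VAE with a classical VAE in a semi-supervised framework, capturing key chemical features and sequence patterns in the latent space to guide random heteropolymer design; its suggested compositions for stabilizing membrane proteins are cross-validated against published results.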

详情

Comments: Oral presentation at AAAI 2023 Workshop on AI to Accelerate Science and Engineering

AI中文摘要

由预定义单体组成的合成随机异聚合物（RHP）为设计类蛋白质材料提供了一种方法。如果设计得当，这些RHP可以模拟蛋白质的行为和功能。因此，需要计算工具来有效指导RHP设计。我们通过开发DeepRHP（一种在半监督框架下改进的变分自编码器（VAE）模型）来弥补这一差距。通过为经典VAE配备额外的基于特征的VAE，DeepRHP迫使潜在空间捕获关键化学特征的结构以及单个RHP序列模式。从这个意义上说，我们的方法是通用的，允许以混合方式纳入任何相关特征。我们通过提出在非原生环境中稳定膜蛋白（例如水通道蛋白Z）的潜在单体组成，并将我们的预测与已发表的结果进行交叉验证，证明了DeepRHP的有效性。我们的模型与真实RHP功能之间的一致性表明，利用混合自编码器架构来指导蛋白质和其他生物化合物的RHP设计具有巨大潜力。

英文摘要

Synthetic random heteropolymers (RHPs), consisting of a predefined set of monomers, offer an approach toward the design of protein-like materials. These RHPs, if designed appropriately, can mimic protein behavior and function. As such, there is a need for computational tools to efficiently guide RHP design. We bridge this gap by developing DeepRHP, a modified variational autoencoder (VAE) model under a semi-supervised framework. By equipping a classical VAE with an additional feature-based VAE, DeepRHP forces the latent space to capture structures of critical chemical features as well as individual RHP sequence patterns. In this sense, our method is versatile by allowing any relevant features to be incorporated in a hybrid manner. We demonstrate the effectiveness of DeepRHP by suggesting potential monomer compositions that stabilize membrane proteins (e.g. Aquaporin Z) in non-native environments and cross-validating our prediction with published results. The concordance between our model and true RHP function suggests strong potential in utilizing hybrid autoencoder architectures to guide RHP design for proteins and other biological compounds.

2606.11510 2026-06-11 q-bio.QM q-bio.PE stat.ML 新提交

Continuous biome representations from Earth observation embeddings

从地球观测嵌入中提取连续生物群落表示

Maxwell B. Joseph, Flávia De Souza Mendes, Dieu My T. Nguyen, Camile Sothe, Christopher B. Anderson (Planet Labs PBC)

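AI summary: To address the way discrete biome maps compress ecological continuity, proposes learning a continuous probabilistic representation from satellite image embeddings; evaluated on 6 Brazilian biomes and 4,672 plant species, it predicts species occurrence better than discrete labels.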

详情

Comments: 8 pages, 4 figures

AI中文摘要

生物群落随空间连续变化，但生物群落图通过分类边界压缩了这种变化，特别是在生态过渡带，过渡群落具有独特的生态特征。地球观测基础模型通过密集嵌入编码光谱、空间和时间信息，能否将离散的生物群落图转换为更好地捕捉生态变化的连续表示？本文在Clay v1.5卫星图像嵌入上拟合线性分类器，从分类图中预测生物群落标签。softmax输出产生一个连续概率向量，其维度对应命名的生物群落类别。我们使用巴西六个生物群落、130万个嵌入和10015个保留的森林清查样地（涵盖4672种植物）评估该方法。连续生物群落表示在预测物种出现方面优于离散生物群落标签（10次空间交叉验证中平均每物种AUC 0.618 vs. 0.570）。分解这一增益表明，改进来自分级概率输出的连续性，而非标签重新分配；该模式在距生物群落边界的所有距离上均成立。原始1024维嵌入仍然是我们测试的最强预测因子（平均AUC 0.646 vs. 0.618），但连续表示恢复了嵌入相对于离散标签的大部分增益。这种简单方法为分类地图标签提供了概率替代方案，保留了其含义，同时编码了离散地图抑制的分级变化。

英文摘要

Biotic communities vary continuously across space, yet biome maps impose categorical boundaries that compress this variation, particularly at ecotones where transitional communities are ecologically distinct. Could Earth observation (EO) foundation models, which encode spectral, spatial, and temporal information with dense embeddings, convert discrete biome maps into continuous representations that better capture ecological variation? Here, we fit a linear classifier on Clay v1.5 satellite image embeddings to predict biome labels from a categorical map. The softmax output yields a continuous probability vector whose dimensions correspond to named biome classes. We evaluate this approach using six Brazilian biomes, 1.3 million embeddings, and 10,015 withheld forest inventory plots spanning 4,672 plant species. The continuous biome representation outperforms discrete biome labels for predicting species occurrence (mean per-species AUC 0.618 vs. 0.570 across 10 spatial cross-validation folds). Decomposing this gain shows that continuity in the graded probability output, rather than label reassignment, accounts for the improvement; the pattern holds across all distances from biome boundaries. The raw 1024-dimensional embedding remains the strongest predictor we tested (mean AUC 0.646 vs. 0.618), but the continuous representation recovers most of the embedding's gain over discrete labels. This simple approach provides a probabilistic replacement for categorical map labels, preserving their meaning while encoding graded variation that discrete maps suppress.

2606.11473 2026-06-11 cs.LG cs.AI stat.ML 新提交

CRUMB: Efficient Prior Fitted Network Inference via Distributionally Matched Context Batching

CRUMB: 通过分布匹配上下文批处理实现高效先验拟合网络推理

Jamie Heredge, Mattia J. Villani, Pranav Deshpande, Akshay Seshadri, Niraj Kumar

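Affiliation: Global Technology Applied Research, JPMorganChase

AI summary: Proposes CRUMB, which speeds up prior-fitted network inference without retraining by clustering queries, selecting training subsets that minimize the maximum mean discrepancy, and then running exact inference; it outperforms comparable methods on 51 datasets.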

详情

Comments: 26 pages, 13 figures

AI中文摘要

先验拟合网络（PFNs）是一类有前景的表格基础模型，执行上下文学习，其中整个带标签的训练集作为上下文提供，并在单次前向传播中生成测试查询的预测。然而，许多PFN架构中二次缩放的自注意力机制使得对于非常大的训练数据集推理变得不可行。我们提出CRUMB（使用最小化MMD批处理的聚类检索），一个三阶段推理包装器：（i）聚类测试查询，（ii）通过贪心最小化最大均值差异（MMD）为每个聚类选择一个小型、分布匹配的训练子集，（iii）在每个缩减上下文的批次上执行精确的PFN推理。CRUMB是架构无关的，无需重新训练。在51个数据集的TabArena基准测试中，跨三种PFN架构（TabPFNv2、TabICLv1、TabICLv2）评估，我们展示了CRUMB优于类似的最先进的上下文选择策略。我们还展示了CRUMB对协变量漂移具有鲁棒性，因为MMD最小化步骤自然有助于对齐训练上下文分布以匹配当前测试批次分布。

英文摘要

Prior-fitted networks (PFNs) are a promising class of tabular foundation models that perform in-context learning, whereby the entire labelled training set is supplied as context, and predictions for test queries are produced in a single forward pass. However, the quadratically scaling self-attention mechanism in many PFN architectures makes inference prohibitive for very large training datasets. We propose CRUMB (Clustered Retrieval Using Minimised-MMD Batching), a three-stage inference wrapper that (i) clusters the test queries, (ii) selects a small, distributionally matched training subset for each cluster by greedily minimising the maximum mean discrepancy (MMD), and (iii) runs exact PFN inference on each reduced-context batch. CRUMB is architecture-agnostic and requires no retraining. On the 51-dataset TabArena benchmark, evaluated across three PFN architectures (TabPFNv2, TabICLv1, TabICLv2), we show that CRUMB outperforms similar state-of-the-art context selection strategies. We also show that CRUMB is resilient to covariate drift, as the MMD-minimisation step naturally helps align the training context distribution to match the current test batch distributions.
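
Stage (ii) of the pipeline, greedy MMD minimisation, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it uses a Gaussian (RBF) kernel with a fixed `gamma`, the biased squared-MMD estimate, and plain forward selection, and it omits the query clustering and the PFN inference step entirely. All function names are hypothetical.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd_squared(subset, queries, gamma=1.0):
    """Biased estimate of the squared MMD between two point sets."""
    def mean_kernel(xs, ys):
        return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))
    return (mean_kernel(subset, subset)
            - 2.0 * mean_kernel(subset, queries)
            + mean_kernel(queries, queries))


def greedy_mmd_subset(train, queries, size, gamma=1.0):
    """Greedily pick `size` training indices whose points best match `queries`."""
    chosen = []
    remaining = list(range(len(train)))
    for _ in range(min(size, len(train))):
        # Add the candidate that gives the lowest MMD for the enlarged subset.
        best = min(
            remaining,
            key=lambda i: mmd_squared([train[j] for j in chosen + [i]], queries, gamma),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy 1-D data: queries sit near 0, training points form two separate groups.
train = [(-0.1,), (0.0,), (0.2,), (5.0,), (5.1,), (4.9,)]
queries = [(0.05,), (-0.05,), (0.1,)]
print(greedy_mmd_subset(train, queries, size=3))
```

On this toy data the selection stays within the training group near the queries, which mirrors the abstract's point that matching the context to the current test batch helps under covariate drift.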

2606.11235 2026-06-11 cs.LG cs.DB stat.ME 新提交

Few-Shot Resampling for Scalable Statistically-Sound Data Mining

少样本重采样：可扩展的统计可靠数据挖掘

Leonardo Pellegrina, Fabio Vandin

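Affiliation: Department of Information Engineering, University of Padova

AI summary: Proposes FewRS, a resampling-based method for assessing the statistical significance of data mining results; by deriving a new bound on the supremum deviation, it guarantees the false-discovery probability with only a very small number of resampled datasets, greatly improving scalability.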

详情

Comments: Accepted to KDD 2026

AI中文摘要

知识发现的一个关键步骤是评估数据挖掘结果。在包括模式挖掘、图分析等多个应用中，此步骤包括评估结果的统计显著性，以避免仅由噪声或数据随机波动导致的虚假发现。虽然针对某些特定应用已经开发了专门程序，但基于重采样的方法被广泛使用，尤其是在无法推导解析结果的复杂分析中。然而，当前基于重采样的方法需要生成和分析数千个重采样数据集，因此对于大型数据集或计算密集型分析不实用。本文中，我们介绍了FewRS，一种简单有效的基于重采样的方法，用于评估数据挖掘结果的统计显著性，并对错误发现概率提供严格保证。我们的方法可应用于任何使用重采样方法的情况。FewRS基于我们对表示数据挖掘结果质量的检验统计量的上确界偏差推导出的新界。我们证明FewRS需要生成和分析极少数量的重采样数据集，从而得到高度可扩展且广泛适用的方法。我们在常见任务（如模式挖掘和网络分析）上测试了我们的方法。在所有情况下，与现有技术相比，我们的方法在运行时间上减少了多达两个数量级，同时保持高统计功效，使得能够在大型真实世界数据集上对数据挖掘结果进行统计验证。

英文摘要

A key step in knowledge discovery is the evaluation of data mining results. In several applications, including pattern mining, graph analysis, and others, this step includes the evaluation of the statistical significance of the results, to avoid spurious discoveries due only to noise or random fluctuations in the data. While specialized procedures have been developed for some specific applications, resampling-based approaches are widely used, in particular for complex analyses where analytical results cannot be derived. However, current resampling-based approaches require the generation and analysis of thousands of resampled datasets, and are therefore impractical for large datasets or computationally intensive analyses. In this paper, we introduce FewRS, a simple and effective resampling-based approach to assess the statistical significance of data mining results with rigorous guarantees on the probability of false discoveries. Our approach can be used in every situation where resampling-based approaches are applied. FewRS builds on our derivation of a novel bound to the supremum deviation of test statistics representing the quality of data mining results. We prove that FewRS needs to generate and analyze an extremely small number of resampled datasets, leading to a highly scalable approach with wide applicability. We test our approach on common tasks such as pattern mining and network analysis. In all cases, our approach results in a reduction of up to two orders of magnitude in running time compared to the state of the art, while preserving high statistical power, enabling the statistical validation of data mining results on large-scale real-world datasets.

2602.10908 2026-06-11 cs.CL cs.LG stat.ML 版本更新

SoftMatcha 2: A Fast and Soft Pattern Matcher for Trillion-Scale Corpora

SoftMatcha 2：一种用于万亿级语料库的快速软模式匹配器

Masataka Yoneda, Yusuke Matsushita, Go Kamoda, Kohei Suenaga, Takuya Akiba, Masaki Waga, Sho Yokoi

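AI summary: Proposes SoftMatcha 2, an ultra-fast soft search algorithm based on suffix arrays and word vectors; using dynamic corpus-aware pruning and a disk-aware design, it searches trillion-scale corpora in under 0.3 seconds for semantic variants involving substitution, insertion, and deletion, and uncovers benchmark contamination.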

详情

Comments: Accepted at ICML2026. Project Page & Web Interface: this https URL, Source Code: this https URL

AI中文摘要

我们提出SoftMatcha 2，一种超快速且灵活的搜索算法，能够在0.3秒内搜索万亿规模的自然语言语料库，同时允许以替换、插入和删除形式进行的语义变体。我们的方法采用基于后缀数组的字符串匹配，该数组随语料库规模扩展良好，并将单词表示为向量，这支撑了其语义灵活性。为了缓解查询语义放松导致的组合爆炸，我们的方法建立在两个关键算法思想上：动态语料感知剪枝和由磁盘感知设计实现的快速精确查找。我们从理论上分析了所提出方法的效率，表明它可以缓解搜索空间的指数增长。在FineWeb-Edu（Lozhkov等人，2024）（1.4T tokens）上的实验表明，与现有方法infini-gram（Liu等人，2024）、infini-gram mini（Xu等人，2025）和SoftMatcha（Deguchi等人，2025）相比，它实现了显著更低的搜索延迟。作为实际应用，我们的方法发现了现有方法遗漏的训练语料库中的基准污染，并且也有利于信息检索和释义检测。我们还提供了一个在线演示，支持七种语言的语料库快速软搜索。

英文摘要

We present SoftMatcha 2, an ultra-fast and flexible search algorithm that enables search over trillion-scale natural language corpora in under 0.3 seconds while allowing semantic variations in the form of substitution, insertion, and deletion. Our approach employs string matching based on suffix arrays that scales well with corpus size, and represents words as vectors, which underpin its semantic flexibility. To mitigate the combinatorial explosion induced by the semantic relaxation of queries, our method is built on two key algorithmic ideas: dynamic corpus-aware pruning and fast exact lookup enabled by a disk-aware design. We theoretically analyze the efficiency of the proposed method, indicating that it can mitigate exponential growth in the search space. Empirically, on FineWeb-Edu (Lozhkov et al., 2024) (1.4T tokens), it attains substantially lower search latency than existing methods: infini-gram (Liu et al., 2024), infini-gram mini (Xu et al., 2025), and SoftMatcha (Deguchi et al., 2025). As a practical application, our method uncovers benchmark contamination in training corpora that existing approaches miss, and it also benefits information retrieval and paraphrase detection. We also provide an online demo of fast, soft search across corpora in seven languages.

2606.12057 2026-06-11 stat.AP 新提交

ChargeBD: Character-Aware Heterogeneous Agent Reasoning for Guided Engineering in Battery Development

ChargeBD：面向电池开发中引导工程的字符感知异构智能体推理

Rui Huang, Zekun Jiang, Xingyu Niu, Yuqiang Li, Xinying Gu, Tianhang Zhou

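AI summary: Proposes the ChargeBD framework, which uses a matrix of MBTI-inspired persona agents with heterogeneous reasoning to address adaptivity in multi-scale, multi-objective redox-flow battery R&D.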

详情

AI中文摘要

液流电池（RFB）研究涵盖分子设计、电解质优化、电极和膜材料、电堆运行、系统管理和安全分析，使其成为一个受约束、多尺度、多目标的储能研发问题。尽管大型语言模型（LLM）可以支持科学知识整合和提案生成，但通用LLM推理在创新导向探索、基于规则的执行、机理建模和系统级权衡方面仍不够自适应。本文介绍ChargeBD，一个用于电池开发中引导工程的字符感知异构智能体推理框架。从50个RFB特定任务集开始，我们构建了500个问题的ESS-LLM基准，并定义了MBTI启发的角色智能体作为结构化认知偏差模板，而非心理测量工具或真实人格表征。选择DeepSeek-V3-Plus作为共享基础模型，评估16个MBTI启发的角色智能体，以构建角色能力矩阵和认知优势矩阵。

英文摘要

Redox-flow battery (RFB) research spans molecular design, electrolyte optimization, electrode and membrane materials, stack operation, system management, and safety analysis, making it a constrained, multi-scale, and multi-objective energy-storage R&D problem. Although large language models (LLMs) can support scientific knowledge integration and proposal generation, generic LLM reasoning remains insufficiently adaptive across innovation-oriented exploration, rule-based execution, mechanistic modeling, and system-level trade-offs. Here we introduce ChargeBD, a character-aware heterogeneous-agent reasoning framework for guided engineering in battery development. Starting from a 50-question RFB-specific task set, we construct a 500-question ESS-LLM Benchmark and define MBTI-inspired persona agents as structured cognitive-bias templates rather than psychometric instruments or representations of real personalities. DeepSeek-V3-Plus is selected as the shared base model, and 16 MBTI-inspired persona agents are evaluated to construct a persona capability matrix and a cognitive advantage matrix.

2606.12047 2026-06-11 cs.CV cs.AI stat.ML 新提交

Metadata-Aware Multi-Prompt Reasoning for Zero-Shot Accident Understanding

元数据感知的多提示推理用于零样本事故理解

Tarandeep Singh, Soumyanetra Pal, Soham Biswas, Nishanth Chandran

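Affiliation: Netradyne

AI summary: Proposes a three-stage pipeline that uses vision-language similarity, metadata-driven multi-prompt reasoning, and open-vocabulary detection to perform temporal localization, semantic classification, and spatial grounding of accidents in video under a zero-shot setting, with a substantial performance gain.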

详情

Comments: Accepted at the AUTOPILOT Workshop, CVPR 2026 (non-archival). Workshop Paper ID 15

AI中文摘要

在本文中，我们通过识别冲击事件发生的时间、类型以及帧中的位置，使用自然语言解决监控视频中事故的零样本理解问题。我们提出一个三阶段流水线，将事故理解分解为何时、何物和何地。第一阶段利用视觉-语言相似性提取冲击周围的短时间窗口。第二阶段，我们执行元数据驱动的多提示推理，包含五个互补视角（基线、运动、几何、对比和决胜），并通过熵门控成对裁决器解决分歧。最后，我们基于预测的事故类型和场景布局查询开放词汇检测器以定位冲击，并使用分数加权质心聚合关键帧上的检测结果。我们的流水线在零样本ACCIDENT @ CVPR基准测试上，相对于帧中心基线，调和平均分数有显著提升。我们表明，将零样本视频理解分解为时序定位、语义分类和空间定位，比直接提示更能实现视觉-语言模型的可靠推理。

英文摘要

In this paper, we address the problem of zero-shot understanding of accidents from surveillance videos by identifying when an impact event occurs, what type of impact it is, and where in the frame it occurs using natural language. We propose a three-stage pipeline that decomposes accident understanding into when, what, and where. The first stage extracts a short temporal window around the impact using vision-language similarity. In the second stage, we perform metadata-driven multi-prompt reasoning with five complementary views (baseline, motion, geometry, contrast, and tiebreaker) and resolve disagreement via an entropy-gated pairwise adjudicator. Finally, we localize the impact with an open-vocabulary detector queried on the predicted accident type and scene layout, and aggregate detections across keyframes using a score-weighted centroid. Our pipeline achieves a substantial improvement in the harmonic-mean score over a centre-of-frame baseline on the zero-shot ACCIDENT @ CVPR benchmark. We show that decomposing zero-shot video understanding into temporal localization, semantic classification, and spatial grounding enables more reliable reasoning with vision-language models than direct prompting alone.

2606.11282 2026-06-11 stat.AP math.PR math.ST 新提交

The Statistical Compass

统计罗盘

Eliuvish Han Cui

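AI summary: Treats ideas from probability and stochastic processes as a translation language for statistics, moving from designed observations to data objects, targets, stability, inference, and application, and uses examples to connect abstract objects with records, mechanisms, and decisions.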

2605.13631 2026-06-11 stat.CO 版本更新

ProjGuard: Safety Monitoring for Computer-Use Agents via Low-Dimensional Projections

ProjGuard：通过低维投影实现计算机使用代理的安全监控

Kebin Contreras, Carlos Hinojosa, Jorge Bacca, Bernard Ghanem

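AI summary: ProjGuard protects computer-use agents by monitoring behavioral trajectories, using a lightweight risk signal to warn of potential danger early and an auxiliary vision-language model to make targeted corrections, which raises task completion and lowers safety risk.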

详情

Comments: The manuscript was submitted under an inappropriate category. In addition, substantial updates and improvements are currently being made to the document. To avoid confusion and ensure that readers access the most accurate version of the work, we request withdrawal of the current manuscript

AI中文摘要

计算机使用代理越来越多地在真实操作系统上运行，但这也增加了提示注入、间接指令和视觉攻击的风险。现有防御通常依赖于在推理时分析提示或每个潜在恶意输入，使用第二个大模型，这可能限制覆盖范围或增加部署成本。我们提出了ProjGuard，一种基于行为轨迹监控的替代方案。在每一步，我们从代理的累积交互历史中推导出一个轻量级的标量风险信号，并在线评估执行是否开始向不安全区域偏移。这使在轨迹达到潜在有害操作之前就能发出预警。当触发警报时，我们选择性地激活辅助的视觉语言模型，提出修正的下一步，并将执行引导回任务完成。在OS-Harm实验中，使用按需修正的监控将不安全率从16%降低到3%，同时提高任务完成率从59%到65%。我们进一步评估了在RiosWorld上的迁移效果，方法保持竞争力，达到4%的不安全率和64%的任务完成率。总体而言，这些结果支持了一种分层的安全策略，即持续监控可提前预警偏差，并仅在需要时激活修正。

英文摘要

Computer-use agents are increasingly capable of operating on real operating systems, but this capability has also increased the risks posed by prompt injection, indirect instructions, and visual attacks. Existing defenses typically rely on analyzing the prompt or each potentially malicious input with a second large model at inference time, which can limit coverage or increase deployment cost. We propose ProjGuard, an alternative based on behavioral trajectory monitoring. At each step, we derive a lightweight scalar risk signal from the agent's accumulated interaction history and evaluate, online, whether execution is beginning to drift toward an unsafe region. This enables early warnings before the trajectory reaches a potentially harmful action. When an alert is raised, we selectively activate an auxiliary vision-language model to propose a corrected next step and steer execution back toward task completion. Experiments on OS-Harm show that monitoring with on-demand correction reduces the unsafe rate from 16 percent to 3 percent while improving task completion from 59 percent to 65 percent. We further evaluate transfer to RiosWorld, where the method remains competitive, reaching 4 percent unsafe and 64 percent completion. Overall, these results support a hierarchical safety strategy in which always-on monitoring anticipates deviations and activates correction only when needed.

2601.09072 2026-06-11 cs.AI cs.CL stat.ME

Human-AI Co-design for Clinical Prediction Models

Jean Feng, Avni Kothari, Patrick Vossler, Andrew Bishara, Lucas Zier, Newton Addo, Aaron Kornblith, Yan Shuo Tan, Chandan Singh

DOI: 10.1038/s41746-026-02838-5
Journal ref: npj Digital Medicine 2026

Abstract

Developing safe, effective, and practically useful clinical prediction models (CPMs) traditionally requires iterative collaboration between clinical experts, data scientists, and informaticists. This process refines the often small but critical details of the model building process, such as which features/patients to include and how clinical categories should be defined. However, this traditional collaboration process is extremely time- and resource-intensive, resulting in only a small fraction of CPMs reaching clinical practice. This challenge intensifies when teams attempt to incorporate unstructured clinical notes, which can contain an enormous number of concepts. To address this challenge, we introduce HACHI, an iterative human-in-the-loop framework that uses AI agents to accelerate the development of fully interpretable CPMs by enabling the exploration of concepts in clinical notes. HACHI alternates between (i) an AI agent rapidly exploring and evaluating candidate concepts in clinical notes and (ii) clinical and domain experts providing feedback to improve the CPM learning process. HACHI defines concepts as simple yes-no questions that are used in linear models, allowing the clinical AI team to transparently review, refine, and validate the CPM learned in each round. In two real-world prediction tasks (acute kidney injury and traumatic brain injury), HACHI outperforms existing approaches, surfaces new clinically relevant concepts not included in commonly-used CPMs, and improves model generalizability across clinical sites and time periods. Furthermore, HACHI reveals the critical role of the clinical AI team, such as directing the AI agent to explore concepts that it had not previously considered, adjusting the granularity of concepts it considers, changing the objective function to better align with the clinical objectives, and identifying issues of data bias and leakage.

2102.08591 2026-06-11 stat.ME stat.ML

Data-Driven Logistic Regression Ensembles

Anthony-Alexander Christidis, Stefan Van Aelst, Ruben Zamar

2502.04046 2026-06-11 stat.ME math.ST stat.TH

A method for sparse and robust independent component analysis

Lauri Heinonen, Joni Virta

2503.11683 2026-06-11 stat.AP

MealMeter: Using Multimodal Sensing and Machine Learning for Automatically Estimating Nutrition Intake

Asiful Arefeen, Samantha Fessler, Sayyed Mostafa Mostafavi, Carol S Johnston, Hassan Ghasemzadeh

2406.07909 2026-06-11 eess.AS cs.CL cs.SD stat.ML

Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation

Eungbeom Kim, Hantae Kim, Kyogu Lee

1812.05678 2026-06-11 stat.ME

Objective-Driven Ensembles: Bridging the Gap Between Interpretable Sparsity and Algorithmic Prediction

Anthony Christidis, Stefan Van Aelst, Ruben Zamar

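AI summary: This paper presents Objective-Driven Ensembles, which generalize best subset selection into a joint mathematical optimization problem to produce interpretable ensembles. It argues theoretically that penalizing predictor overlap bounds the prediction covariance and mitigates the effect of finite-sample spurious correlations, combining machine-learning-level accuracy with the interpretability of sparse models.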

Abstract

Sparse methods (e.g., Best Subset Selection, Elastic Net) are the standard approach for obtaining interpretable models, but they can suffer from high variance and vulnerability to spurious correlations. Alternatively, algorithmic ensembles (e.g., Random Forests, Gradient Boosting) achieve high prediction accuracy but yield uninterpretable black boxes driven by randomization or sequential residual fitting. In recent years, a unifying paradigm has emerged: Objective-Driven Ensembles. By generalizing best subset selection into a joint mathematical optimization problem, this approach generates interpretable ensembles by optimally splitting predictors across a small number of diverse models. In this paper, we synthesize this growing body of literature and illustrate the statistical principles driving its empirical success. Specifically, we utilize finite-sample bounds to demonstrate how penalizing predictor overlap controls ensemble covariance and provides a mathematical hedge against spurious correlations. We evaluate these mechanics using an exact combinatorial oracle, and review how recent computational approximations have successfully scaled this framework to a variety of domains, including high-dimensional data, classification tasks, and settings with casewise or cellwise contamination, achieving machine-learning-level accuracy while retaining the interpretability of sparse models.

1609.08725 2026-06-11 stat.ME

An adaptable generalization of Hotelling's $T^2$ test in high dimension

Haoran Li, Alexander Aue, Debashis Paul, Jie Peng, Pei Wang

1. Statistical Theory and Methods (15 papers)

The data-driven extreme value distribution: non-parametric tail estimation with a derived stability criterion

Introducing precision-weighted bias as a performance measure to inform the inclusion of adaptive designs in meta-analysis

Testing axial symmetry in multivariate location-scale linear regression

Estimating the local false discovery rate under an unknown symmetric null

Second-Order Least Squares as a Special Case of the Polynomial Maximization Method

Intrinsic Riemannian Cross-covariance for Manifold-valued Random Objects

A Uniform Improvement of the Benjamini-Hochberg Procedure via e-Closure

Stable direct estimation for GPLSIAMs using P-splines with dynamically updated boundaries

Fixed-level calibration of the Cauchy combination test

Weighted Random Dot Product Graphs

Modeling double bounded data based on correlated gamma random variables

Compound Selection Decisions: An Almost SURE Approach

Accurate Estimation of Mutual Information in High Dimensional Data

Conditional Independence Testing Using Exchangeable Pairs

Confidence regions for a persistence diagram of a single image with one or more loops

2. Bayesian Statistics and Probabilistic Modeling (10 papers)

Bayesian nonparametric Mallows model for clustering preference data

Bayesian Triangulation Splines: Spatial Adaptation on Irregular Domains

Bayesian Effect Selection for Additive Quantile Regression with an Application to Air Pollution Thresholds

Seeing Below the Limit of Detection: A Censored-Poisson Bayesian Latent-Growth Change-Point Detector (the Span Detector) for Serial ctDNA in HR+/HER2- Metastatic Breast Cancer

The Triply-Randomized Negative Binomial Beta for Robust Regression and Conjugate Models of Bounded Support Data

Hyperbolic Latent Space Models for Network Embedding: Model Specification and Bayesian Inference

Empirical Bayes Estimation and Inference via Smooth Nonparametric Maximum Likelihood

Profile Bayesian Optimization for Expensive Computer Experiments

Bayesian Non-Parametric Inference for Lévy Measures in State-Space Models

Hierarchical Random Measures without Tables

3. Causal Inference and Experimental Design (6 papers)

Bracketing Relationships of Weighted Average Treatment Effects

Bayesian Causal Machine Learning for Cure Models

Program Evaluation with Remotely Sensed Outcomes

Quasi-randomization tests for network interference: a random graph approach

Doubly robust integration of nonprobability and probability survey data

Causal clustering: design of cluster experiments under network interference

4. High-Dimensional Statistics and Regularization (3 papers)

Model-based sparse mixed-type PCA

Renewable Lasso without Batch-Number Constraints: A Gradient-Enhanced Approach

Ridge-Regularized Largest Root Test For High-Dimensional General Linear Hypotheses

5. Time Series and Spatial Statistics (8 papers)

Weibull-Stationary Stochastic Differential Equations for Conditional Long-Horizon Wind Power Forecasting

Composite likelihood inference of fractional Gaussian processes with sequentially optimal subset selection

Magnitude-Based Features for Multispecies Spatial Data

Hierarchical excitatory processes for modelling event-time data in the presence of exogenous stimuli

Time Series Analysis in Machine Learning

Triangular-Reference Schrödinger Bridges for Time Series Generation

Intermittent time series forecasting: local vs global models

Hierarchical Probabilistic Conformal Prediction for Distributed Energy Resources Adoption

6. Computational Statistics and MCMC (5 papers)

Adaptive spatial blocking for scalable clustering inference with applications to high-throughput spatial proteomics

Unbiased Derivative Estimation for Stationary Mean of Parameterized Markov chains

GraphGP: Scalable Gaussian Processes with Vecchia's Approximation

Annealed Entropic Allocation for Ranking and Selection

Compressed Bayesian Tensor Regression

7. Statistical Foundations of Machine Learning (27 papers)

Phase Transitions in Attention: A Bayesian Theory of Copy Head Emergence

What Uncertainties Do We Need for Dynamical Systems?

Efficient Multinomial Logistic Bandit via Frequent Directions

Capacity-Constrained Online Convex Optimization with Delayed Feedback

Tree-Structured Orthonormal Decomposition of the Aitchison Simplex

Range-Aware Bayesian Optimization for Discovering Diverse Designs within Target Property Windows

Enhancing Spectral Embedding through Robust and Flexible Knowledge Transfer in Electronic Health Records

The Power of Test-Time Training for Approximate Sampling

Signed Compression Progress on a Sealed Audit is Goodhart-Resistant

Quantized Stochastic Primal-Dual Methods for Distributed Optimization under Relaxed Global Geometry

Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

Querying Counterfactuals on Tissue Graphs with Supervised Disentanglement

Conformal Risk-Averse Decision Making with Action Conditional Guarantee

The ASE-LSE Disagreement Landscape: An End-to-End Characterisation of Extremes and Structural Drivers

A theory of learning data statistics in diffusion models, from easy to hard

Impact of Connectivity on Laplacian Representations in Reinforcement Learning

Bayesian online learning in the one-pass regime: Frequentist validity and uncertainty quantification

On Regret Bounds of Thompson Sampling for Bayesian Optimization

A Judge-Aware Ranking Framework for Evaluating Large Language Models without Ground Truth

Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems

Provable Recovery of Locally Important Signed Features and Interactions from Random Forest

Wasserstein Gradient Flows of MMD Functionals with Distance Kernel and Cauchy Problems on Quantile Functions

Reinforcement Learning with Action-Triggered Observations

CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference

OCSVM-Guided Representation Learning for Unsupervised Anomaly Detection

PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework