arXivDaily: arXiv daily academic digest, updated Monday through Friday
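2606.20514 2026-06-19 stat.ME New submission

Hypergraph Variable Selection with False Discovery Rate Control

Sarah Organ, Toby Kenney, Hong Gu

AI summary: To address the loss of power that variable selection methods suffer when predictors have complex dependence structures, the paper proposes a hypergraph-based selection method that raises selection power while controlling the false discovery rate.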

Comments 28 pages, 4 figures

Abstract

Variable selection methods that control the false discovery rate often lose power when predictors exhibit complex dependence structures. We previously showed that selecting hierarchically clustered groups of predictors can mitigate this issue while maintaining false discovery rate control. When correlations are less structured, however, overlapping predictor sets may be more effective. We introduce a generalized false discovery rate for hypotheses defined on sets of predictors and propose a hypergraph-based selection method. This approach achieves higher power across diverse settings while preserving rigorous false discovery rate control.

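2606.20406 2026-06-19 stat.ME stat.CO New submission

Flexible modeling of bimodal distributions via skewed-$t$ mixtures

Marco Bee, Flavio Santi

AI summary: Proposes a mixture model based on the skewed-$t$ distribution of Fernández and Steel (1998), with maximum likelihood estimation via the EM algorithm and a likelihood ratio test, for fitting bimodal, skewed, and heavy-tailed data; bimodality is confirmed in the Standard & Poor's 500 index.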

Abstract

We propose a mixture of location-scale skewed-$t$ distributions to fit bimodal, skewed and heavy-tailed data. In particular, the mixture is based on the skewed-$t$ distribution by Fernández and Steel (1998), so that the model-building procedure can be easily extended to mixtures of other symmetric distributions. After studying the properties of the mixture, we develop a maximum likelihood estimation approach via the EM algorithm and a likelihood ratio test of the null hypothesis of no skewness in any given component. A simulation-based comparison to a recently proposed mixture of g-and-h distributions suggests that the performance of the proposed model is excellent, in terms of both estimation precision in well-specified setups and modeling capability in mis-specified frameworks. Fitting the model to the Standard & Poor's 500 distortion allows us to confirm the bimodality of its distribution, with the implication that the US stock market has historically been in bearish or bullish conditions, rather than near its fundamental value.

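2606.20341 2026-06-19 stat.ME stat.AP New submission

Anchors Away: Navigating Unanchored Indirect Comparisons with Multilevel Unanchored Meta-Regression (ML-UMR)

Conor Chandler, Jack Ishak

AI summary: For unanchored treatment comparisons when randomized evidence is missing, proposes multilevel unanchored meta-regression (ML-UMR), which jointly models individual-level and aggregate data in a Bayesian framework to estimate marginal and conditional effects across multiple treatments, studies, and target populations, and makes the identification and transportability assumptions explicit.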

Comments 20 pages (excluding supplementary material), 5 figures

Abstract

Unanchored indirect treatment comparisons using single-arm studies or disconnected evidence are increasingly used in health technology assessment (HTA) when randomized evidence is unavailable. Existing methods, including matching-adjusted indirect comparison (MAIC) and simulated treatment comparison (STC), are generally limited to pairwise settings and typically estimate marginal effects in the comparator study population, which may differ from the decision-relevant population. We propose multilevel unanchored meta-regression (ML-UMR), a Bayesian regression framework for synthesizing individual patient data and aggregate data from fully disconnected evidence. ML-UMR extends multilevel network meta-regression (ML-NMR) to unanchored settings by jointly modeling individual- and aggregate-level data within a unified likelihood, enabling estimation of treatment-specific outcomes and both marginal and conditional effects across multiple treatments, studies, and target populations. ML-UMR distinguishes assumptions required to identify treatment effects from those required to transport results to target populations. As with all unanchored comparisons, valid inference relies on strong and often unverifiable assumptions, including conditional exchangeability, correct specification of the outcome model, and cross-treatment assumptions (e.g., shared prognostic factor assumption (SPFA)). ML-UMR does not lessen these requirements but makes them explicit within a unified framework and facilitates sensitivity analyses. In simulation studies, ML-UMR produced low bias and nominal coverage for comparator-population effects. Transportability to alternative populations depended critically on identifying assumptions: violations of SPFA led to bias under strong effect modification, whereas incorporating subgroup information restored near-unbiased estimation and nominal coverage.

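2606.20226 2026-06-19 stat.ME stat.CO New submission

Analysis of uncertain fixed-effects model for Latin square designs

Yaru Cheng, Zhiming Li

AI summary: For uncertain experimental data lacking frequency stability, builds an uncertain fixed-effects model for Latin square designs, proposes three estimation methods with confidence intervals, carries out uncertain homogeneity and common tests, and validates the model through numerical simulations and examples.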

Abstract

Uncertain data without frequency stability often arises in experimental design. Classical fixed-effects models can only analyze precise experimental data. Based on an uncertain measure, this paper establishes uncertain fixed-effects models for Latin square designs. First, we propose three methods with uncertainty to estimate the treatment and block effects and construct their confidence intervals. Then, uncertain homogeneity and common tests are conducted to assess the significance of treatment effects. In the numerical simulations, the three estimation methods are compared based on bias, mean squared error, mean absolute error, overall standard deviation, coverage probability, and average interval length. Several examples are given to illustrate the process of estimation and hypothesis testing. Finally, the uncertain fixed-effects model is applied to real education data, demonstrating its practical value.

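2606.20148 2026-06-19 stat.ME New submission

A case study of causal mediation using Bayesian nonparametrics and semiparametric corrections

Yuhua Zhang, Michael J. Daniels

AI summary: Proposes a truncated enriched Dirichlet process mixture model to estimate natural direct and indirect effects, combined with an efficient MCMC algorithm and a one-step posterior correction based on the efficient influence function, to obtain reliable inference for causal estimands in Bayesian nonparametrics.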

Abstract

We propose a Bayesian nonparametric approach using a truncated Enriched Dirichlet Process mixture (EDPM) model to estimate natural direct (NDE) and indirect (NIE) effects in causal mediation analyses in the presence of post-treatment confounders. We introduce an efficient cluster reallocation Metropolis-Hastings algorithm to improve mixing in the blocked Gibbs sampler. We implement a one-step posterior correction based on the efficient influence function for our setting. This post-processing step solves a critical problem in Bayesian nonparametrics: how to obtain reliable estimates and posteriors for a specific causal estimand of interest (the NDE and NIE) with excellent frequentist properties, such as correct coverage, from a model designed for complex joint distributions. We conduct simulation studies to assess our method's performance and apply it to evaluate causal mediation effects in a weight management clinical trial.

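2606.20114 2026-06-19 stat.ME stat.AP New submission

Community detection in small-sample ordinal regimes: A benchmarking framework for Delphi data

Yuri Calleo, Simone Di Zio, Fabrizio Maturo

AI summary: To address the rank deficiency caused by high-dimensional, small-sample Delphi data, proposes moving from variable-centric covariance models to network-centric connectivity models, using community detection algorithms to identify latent thematic structures and achieve structurally stable dimensionality reduction.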

Abstract

The statistical modeling of consensus in Delphi data faces a critical bottleneck: the high dimensionality of questionnaire items relative to the limited sample size of expert panels. This rank deficiency leads traditional latent variable models, such as Principal Component Analysis, to be structurally unstable and prone to overfitting. Addressing this methodological gap, this study proposes a transition from variable-centric covariance models to network-centric connectivity models. By mapping item correlations onto a weighted graph topology, we present a simulation-based benchmark that utilizes community detection algorithms to identify latent thematic structures, effectively addressing the spectral instability and rank deficiency typical of high-dimensional, low-sample-size regimes. The research systematically evaluates the robustness of topological approaches based on structural density, information flow, and spectral partitioning against synthetic datasets designed to replicate the pathological conditions of consensus data, including ordinal scales and systemic noise. The central methodological contribution lies in demonstrating that collinearity among expert judgments - traditionally treated as statistical redundancy to be regularized - can be effectively reinterpreted as a topological signal of cohesion. This framework provides researchers with a structured and automated procedure for dimensionality reduction, ensuring structural stability and psychometric consistency even in small-sample regimes where standard factor analysis breaks down.

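2606.20069 2026-06-19 stat.ME New submission

A minimum-risk and cost-efficient two-sample sequential testing framework for the shifted exponential models with application to precipitation data

Ashwani Rajput, Neeraj Joshi

AI summary: Proposes a double sequential sampling framework that tests the difference between the location parameters of two shifted exponential models by controlling the type I error probability and minimizing a loss function that includes the type II error and the sampling cost; the procedure has first-order efficiency, second-order efficiency, and risk efficiency.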

Abstract

This paper investigates the problem of comparing the location parameters of two shifted exponential models through a novel double sequential sampling framework. The proposed hypothesis testing procedure is developed by controlling the type I error probability at a preassigned level while minimizing a loss function that incorporates both the type II error probability and the associated sampling cost. The corresponding optimal fixed-sample-size expressions are shown to depend on unknown scale parameters, rendering the desired testing accuracies unattainable in practice under fixed-sample designs. To overcome this difficulty, a double sequential sampling procedure is proposed to test the difference between location parameters when the scale parameters are unknown and unequal. The proposed methodology is shown to possess desirable asymptotic properties, including first-order efficiency, second-order efficiency, and second-order risk efficiency. Extensive simulation studies and a real-data application that involves heavy precipitation episodes at meteorological stations demonstrate the practical effectiveness and applicability of the proposed procedure.

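2606.19982 2026-06-19 stat.ME New submission

Built-in Selection Bias in Proportional Hazards Models with Omitted Covariates: Simulation Evidence and Alternative Approaches

Ayoub Bifenzi, Helene Jacqmin-Gadda

AI summary: Using simulations and real data, shows that in randomized trials an omitted covariate biases the treatment hazard ratio estimated by the Cox proportional hazards model even when it is independent of treatment, and compares the robustness of alternatives including frailty models, accelerated failure time models, and Kaplan-Meier curves.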

Abstract

In time-to-event analysis, the hazard ratio (HR) derived from the Cox proportional hazards (PH) model is the most commonly used and widely reported measure for assessing treatment effects. However, hazard ratios are non-collapsible due to their inherent conditioning on survival up to each time point. As a result, they are subject to built-in selection bias in the presence of unmeasured heterogeneity arising from omitted important covariates, even when these covariates are independent of the main exposure at baseline, as is the case in randomized controlled trials. This article aims to provide an overview of key findings from the literature on how unobserved heterogeneity, due to omitted covariates that affect the outcome, can bias the estimation of the treatment hazard ratio in standard proportional hazards models, even in randomized trials where treatment is assigned independently of such covariates. Through simulations, we evaluate the extent of bias in the semi-parametric Cox PH model and parametric PH model under various scenarios of unmeasured heterogeneity. We then compare these standard models to alternative approaches that either account for this issue or are considered robust to it. These alternatives include the hazard ratio estimated from frailty models, regression parameters from an Accelerated Failure Time (AFT) model, and survival differences between treatment groups estimated nonparametrically using Kaplan-Meier curves or based on a Cox model with time-dependent effect of the exposure. We illustrate the practical relevance of the explored alternatives through a real data application to a randomized controlled trial from the Radiation Therapy Oncology Group (RTOG 9202).
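
The built-in selection effect described in this abstract can be illustrated with a closed-form toy case. The sketch below is not the paper's simulation design: it assumes a constant baseline hazard and an omitted covariate acting as a gamma frailty with mean 1 and variance `theta`, independent of treatment. Under those assumptions the population-level hazard ratio at time `t` equals the conditional hazard ratio multiplied by `(1 + theta * H0(t)) / (1 + theta * exp(beta) * H0(t))`, so it drifts toward 1 over time even though treatment is randomized. The function name and its parameters are hypothetical.

```python
import math


def marginal_hazard_ratio(t, beta, theta, base_rate=1.0):
    """Population-level hazard ratio at time t when a gamma frailty is omitted.

    Assumed model (illustrative only): conditional hazard
    Z * base_rate * exp(beta * A), with Z ~ Gamma(mean 1, variance theta)
    independent of the randomized treatment indicator A.
    """
    cum0 = base_rate * t  # cumulative baseline hazard in the control arm
    numerator = 1.0 + theta * cum0
    denominator = 1.0 + theta * math.exp(beta) * cum0
    return math.exp(beta) * numerator / denominator


if __name__ == "__main__":
    beta = math.log(0.5)  # conditional hazard ratio of 0.5
    for t in (0.0, 1.0, 5.0):
        print(t, round(marginal_hazard_ratio(t, beta, theta=1.0), 3))
```

With `theta = 0` (no omitted heterogeneity) the function returns the conditional hazard ratio at every time point, which is a quick sanity check on the formula.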

2606.19892 2026-06-19 stat.ME New submission

The Ghosh-Lin and Fine-Gray models for a mix of administrative and random censoring

Thomas H. Scheike, Christian Mirian, Isao Yokota, Giuliana Cortese

AI summary: For data subject to both administrative and random censoring, proposes combining risk-set adjustment with inverse probability of censoring weighting so that the Ghosh-Lin and Fine-Gray models yield consistent estimates.

Abstract

Recurrent events or competing risks regression models are often applied in the biomedical setting, and both can be considered marginal models. In the presence of right-censoring, such models need to be adjusted to give consistent estimators. When censoring is administrative, marginal regression models are particularly easy to estimate. However, when censoring instead acts randomly, inverse probability of censoring weighting (IPCW) adjustments are typically considered to obtain parameter estimates. This technique relies on a censoring-weights adjustment via a correct censoring model, but for administrative censoring the adjustment is done correctly simply by modifying the risk set. In practice, for large central registries or some clinical trials, the administrative censoring time will be known for all subjects, but there will typically also be a proportion of subjects that are censored at random. In this work, we consider two frequently used regression approaches, the Ghosh-Lin model for recurrent events with terminal events and the Fine-Gray model for competing events. For these two settings, when both administrative and random censoring are present, we demonstrate how to obtain correct estimation by dealing with the combination of the two different types of censoring, relying on a minimum of modeling assumptions.

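2606.19743 2026-06-19 stat.ME stat.AP New submission

A Bayesian spatio-temporal nearest neighbor Gaussian process model for pooled genetic data

Imke Botha, Tianxiao Hao, Lucinda E. Harrison, Nick Golding, Daniel J. Weiss, Jennifer A. Flegg

AI summary: Proposes a nearest neighbor Gaussian process model, combined with a sequential Monte Carlo squared algorithm, for efficient inference of haplotype frequencies from pooled genetic data, applied to genetic data on antimalarial drug resistance in Africa.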

Abstract

Large-scale genetic datasets often aggregate the total allele counts of distinct genetic markers. Inferring haplotype frequencies (i.e., the frequency of multimarker alleles) from these pooled data is a challenge. Previous spatio-temporal modelling in this context has been limited to 3 markers due to the computational cost. In this work, we propose a nearest neighbor Gaussian process (NNGP) model to improve scaling with the number of markers and observations. To infer the parameters of our model, we develop a novel sequential Monte Carlo squared algorithm, which uses particle Gibbs with ancestor sampling to mutate the NNGP function values. The latter has a linear cost in the number of observations and the number of NNGPs, and can be applied to a broad range of NNGP models. As a case study, we analyse genetic data relating to antimalarial drug resistance in Africa, and show our scaling results empirically on datasets with 3 and 6 genetic markers.

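2606.19737 2026-06-19 stat.ME stat.ML New submission

Calibration without labels in multiple testing

Adway S. Wadekar, Jake A. Soloff

AI summary: Because true labels are never observed in multiple testing, constructs pseudo-labels from the spacings of ordered $p$-values to calibrate local false discovery rates, and shows that $q$-values can be severely miscalibrated in the psychology and neuroscience literature.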

Abstract

Large-scale hypothesis testing supports probability claims about individual hypotheses, as in empirical Bayes methods for estimating local false discovery rates. We study how such claims can be interpreted as approximately calibrated forecasts of the null hypothesis, yielding interpretable error probabilities even under model misspecification. Our approach draws conceptual inspiration from probabilistic forecasting but addresses a different challenge: unlike forecasting, where labels are eventually observed, in multiple testing the ground truth is never revealed, so calibration must be assessed stochastically and established indirectly. We address this challenge by constructing a set of pseudo-labels, derived from the spacings of ordered $p$-values, which have the local false discovery rate as their regression target. Our construction unlocks existing tools for assessing and performing post-hoc calibration in multiple testing. Notably, we find on a large-scale empirical survey of published psychology and neuroscience literature that the $q$-value, a popular error measure based on the false discovery rate, can be severely miscalibrated.
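
For readers unfamiliar with the $q$-value mentioned in this abstract, the sketch below computes one common version of it: the Benjamini-Hochberg adjusted $p$-value, which coincides with Storey's $q$-value when the null proportion is taken to be 1. This is an assumption for illustration; the abstract does not say which estimator the authors evaluate, and the pseudo-label construction itself is not reproduced here. The function name `bh_q_values` is hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values, often reported as q-values.

    For the p-value with rank r among m tests, the adjusted value is the
    minimum of p_(s) * m / s over all ranks s >= r.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the result is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


if __name__ == "__main__":
    print(bh_q_values([0.01, 0.04, 0.03, 0.5]))
```

The returned list is in the same order as the input, so each hypothesis keeps its own adjusted value.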

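2606.19580 2026-06-19 stat.ME stat.ML New submission

Machine Learning Integrated in Wavelet Shrinkage (MLShrink)

Dixon Vimalajeewa, Vijini Lakmini, Brani Vidakovic

AI summary: Proposes MLShrink, which combines wavelet shrinkage with machine learning: two thresholds define an intermediate band whose coefficients are classified in a data-adaptive way, while the simplicity of classical thresholding is kept elsewhere. The paper proves nonexpansiveness and oracle consistency, and the method performs well on non-smooth signals.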

Abstract

Data encountered in practice are frequently contaminated by additive noise, and wavelet shrinkage remains a fundamental tool for recovering underlying signals in nonparametric estimation. Classical procedures such as hard and soft thresholding decide whether to retain a wavelet coefficient almost entirely from its magnitude. Although effective in many settings, these rules can be too rigid for coefficients whose magnitudes fall in an intermediate region where the distinction between signal and noise is uncertain. We propose MLShrink, a two-threshold wavelet denoising procedure that combines wavelet shrinkage with machine learning. Coefficients below a lower threshold are discarded, coefficients above an upper threshold are retained, and coefficients in the intermediate band are classified using local wavelet-domain features. In this way, MLShrink preserves the simplicity of classical thresholding away from the decision boundary while allowing data-adaptive decisions for ambiguous coefficients. The paper also develops a theoretical framework tailored to this architecture. We show that MLShrink is a nonexpansive support-selection rule, derive an oracle-based risk decomposition showing that excess denoising risk is determined by classification errors on the undecided band, and establish an oracle-consistency result under suitable assumptions on classifier performance. Simulation experiments on standard benchmark signals indicate that MLShrink is competitive with several established wavelet shrinkage methods and is especially effective for signals with irregular, edge-rich, or non-smooth structure. These findings suggest that learned decisions on the intermediate threshold band provide a useful and interpretable connection between classical wavelet denoising and modern statistical learning.
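
The three-way rule described in this abstract (discard below the lower threshold, retain above the upper threshold, classify in between) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the classifier or its features, so `neighbor_energy_rule` is a hypothetical stand-in that looks only at the magnitudes of adjacent coefficients, and all names and default values are assumptions.

```python
def two_threshold_shrink(coeffs, lower, upper, classify):
    """Keep, discard, or classify each coefficient by its magnitude band."""
    out = []
    for i, c in enumerate(coeffs):
        mag = abs(c)
        if mag < lower:
            out.append(0.0)  # clearly noise: discard
        elif mag > upper:
            out.append(c)  # clearly signal: retain unchanged
        else:
            # Ambiguous band: defer to the supplied classifier.
            out.append(c if classify(coeffs, i) else 0.0)
    return out


def neighbor_energy_rule(coeffs, i, level=1.0):
    """Hypothetical classifier: keep if adjacent coefficients are large on average."""
    neighbors = [abs(coeffs[j]) for j in (i - 1, i + 1) if 0 <= j < len(coeffs)]
    return bool(neighbors) and sum(neighbors) / len(neighbors) >= level


if __name__ == "__main__":
    detail = [0.1, 1.5, 3.0, 1.5, 0.2, 1.5, 0.1]
    print(two_threshold_shrink(detail, lower=1.0, upper=2.0, classify=neighbor_energy_rule))
```

Passing the classifier as a callable keeps the thresholding logic separate from the learned component, so any trained model exposing the same `(coeffs, index) -> bool` interface could be swapped in.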

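2606.19572 2026-06-19 stat.ME New submission

SCOPE Shrinkage: A Unified Framework for Wavelet Denoising

Dixon Vimalajeewa, Vijini Lakmini, Malith Premarathna, Fabrizio Ruggeri, Brani Vidakovic

AI summary: Proposes the SCOPE shrinkage family, built from the cumulative distribution functions of symmetric unimodal distributions; two interpretable parameters separate scale and shape effects, balancing strong local shrinkage with asymptotic unbiasedness and combining performance with interpretability in wavelet denoising.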

Abstract

We introduce Symmetric CDF Oriented Probability Enhanced (SCOPE) shrinkage, a unified family of sign-preserving shrinkage rules constructed from centered cumulative distribution functions of symmetric unimodal distributions. The proposed framework generates a broad class of attenuation profiles that interpolate between strong local shrinkage near zero and asymptotically unbiased behavior in the tails. A general formulation is developed that separates scale and shape effects through two interpretable parameters, allowing effective threshold location and transition sharpness to be controlled independently. Under explicit regularity assumptions, structural properties of SCOPE shrinkage are established, including oddness, monotonicity, continuity, contractivity, and a mixture representation that connects the rules to softened thresholding operators. A Bayesian and penalized likelihood interpretation is also developed: SCOPE rules admit even penalty representations that are nondecreasing in coefficient magnitude, and suitable subclasses arise as exact maximum a posteriori estimators under proper symmetric unimodal priors. Representative examples based on logistic, uniform, and Cauchy distributions illustrate how probabilistic shape governs shrinkage behavior. Data driven parameter selection for smooth subclasses is discussed via Stein-type unbiased risk estimation. Oracle calibrated simulation studies on standard Donoho-Johnstone test functions show that SCOPE shrinkage performs competitively with several established wavelet denoising methods, while retaining a high degree of interpretability and structural flexibility. The results highlight centered distribution functions as a natural and versatile design principle for shrinkage in wavelet denoising and related estimation problems.

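2606.19540 2026-06-19 stat.ME stat.CO stat.ML New submission

Overfitted high-dimensional matrix factorizations via adaptive spectral shrinkage

Lorenzo Mauri, David B. Dunson

AI summary: Proposes EigenBayes, which uses spectral estimation and adaptive empirical Bayes calibration of hyperparameters to deliver fast overfitted factor models with uncertainty quantification, outperforming existing methods in numerical experiments and a genomics application.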

Abstract

Factor models are popular approaches for analyzing high-dimensional data to extract low-rank signals and estimate covariances. They decompose the covariance matrix as the sum of low-rank and diagonal components. A key issue is how to choose the latent dimension $k$, which is particularly challenging when the factor model only holds approximately and in low signal-to-noise scenarios. Bayesian overfitted factor models specify an upper bound on $k$ and rely on structured shrinkage priors to effectively remove extra components. Such approaches are popular and effective, but computationally expensive. We propose a much faster EigenBayes approach that provides valid uncertainty quantification, based on spectral estimation of latent factors and adaptive empirical Bayes calibration of key hyperparameters. The resulting posterior distribution factorizes across outcomes and is analytically tractable, bypassing Markov chain Monte Carlo. We show that EigenBayes adapts to the signal-to-noise ratio of each outcome and latent dimension, while shrinking superfluous latent components to zero. We establish favorable asymptotic properties and demonstrate strong empirical performance in numerical experiments and a genomics application, where EigenBayes outperforms state-of-the-art alternatives.
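
The covariance decomposition mentioned in this abstract is usually written as below. The symbols $\Sigma$, $\Lambda$, $\Psi$, and $p$ are standard textbook notation assumed here for illustration; only the latent dimension $k$ appears in the abstract itself.

```latex
\Sigma = \Lambda \Lambda^{\top} + \Psi,
\qquad \Lambda \in \mathbb{R}^{p \times k},
\qquad \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p)
```

Here $\Lambda \Lambda^{\top}$ is the low-rank component (rank at most $k$) and $\Psi$ is the diagonal component; an overfitted model fixes $k$ at an upper bound and relies on shrinkage to switch off the columns of $\Lambda$ that are not needed.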

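2606.19041 2026-06-19 stat.ME New submission

Efficient Cumulative Incidence Estimation in Biobank Studies Using All Prevalent and Incident Events

David M. Zucker, Malka Gorfine

AI summary: For biobank data containing both individuals with disease onset before recruitment (prevalent cases) and individuals with onset during follow-up, proposes a new cumulative incidence function estimator that incorporates all cases and handles diseases with onset at young ages and long survival; asymptotic properties are proved, and simulations and UK Biobank cancer data demonstrate its advantages.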

Abstract

Population-based biobanks, now established in many countries, offer opportunities for large-scale studies investigating the incidence of various diseases. Biobank data is typically collected from a study cohort recruited over a defined calendar period, with subjects entering the study at various ages falling between $R_L$ and $R_U$. This work focuses on biobank data that includes individuals in whom onset of the disease of interest occurred before recruitment, termed prevalent cases, along with individuals initially recruited as disease-free in whom disease onset occurred during the follow-up period. We propose a novel cumulative incidence function (CIF) estimator that goes beyond existing methods in that it incorporates all disease cases, both prevalent and incident, irrespective of their subsequent life course. In particular, the new method can handle situations involving diseases that can occur at young ages with long survival after disease onset. Asymptotic properties of the new method are established and a simulation study is presented examining the performance of the method. We illustrate the use of the method and highlight its advantages over existing methods with an application to cancer data from the UK biobank.

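2606.17308 2026-06-19 stat.ME stat.ML New submission

Kernel-Based Functional Balancing for Causal Inference with Compositional Treatments

Sungbum Kim, Jiayi Wang

AI summary: For causal effect estimation with compositional treatments (exposures lying on a simplex), proposes a kernel-based covariate functional balancing weighting method that constructs weights by minimizing the worst-case balancing error over a reproducing kernel Hilbert space, and builds an augmented weighted estimator that achieves $\sqrt{n}$-consistency.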

Comments 40 pages, 3 figures

Abstract

We study causal effect estimation with compositional treatments, where the exposure lies on a simplex and the estimand is defined over compositions rather than scalar or binary values. By considering a projection of the average potential outcome onto the treatment space, a kernel-based covariate functional balancing approach is adopted for weight construction. The weights are obtained by directly minimizing a worst-case balancing error over a reproducing kernel Hilbert space (RKHS) defined on the joint space of treatments and covariates, instead of being estimated under a treatment assignment model. Building on these weights, an augmented weighted estimator (AWE) is proposed, where the outcome function is estimated via kernel ridge regression and combined with a marginal augmentation over the covariate distribution. Despite the complex structure of the resulting objective, a finite-dimensional convex optimization problem is formulated via a representer theorem and a low-rank approximation. The proposed estimator achieves $\sqrt{n}$-consistency without requiring consistent estimation or smoothness of the weights. An asymptotic normality result is established around a sample-specific target. Empirical performance is demonstrated through simulation studies and a real data application.

2606.17165 2026-06-19 stat.ME cs.AI econ.EM math.ST stat.TH New submission

Statistical Foundations of LLM-based A/B Testing: A Surrogacy Framework for Human Causal Inference

Joel Persson, Mårten Schultzberg, Sebastian Ankargren

Affiliation: Spotify USA, Inc.

AI summary: Proposes a surrogate-endpoint theoretical framework, shows that calibrated LLM outputs identify the average treatment effect under conditions weaker than distributional equivalence, and analyzes the bias and variance introduced by stochasticity.

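AI abstract

Organizations and researchers are increasingly interested in using large language models (LLMs) in place of human participants in A/B tests, in the hope of running experiments faster and at lower cost. We study when a treatment effect estimated on LLM outcomes recovers the effect that would be measured on the human population of interest. Distributional equivalence between LLM and human outcomes would make any standard estimator valid, but it is unrealistic. We therefore develop a statistical framework that adapts surrogate endpoint theory to LLMs. The framework shows that calibrating LLM outcomes to human outcomes identifies the average treatment effect under surrogacy and comparability conditions that are jointly weaker than distributional equivalence. When these conditions fail, the effect of interest is only partially identified; we provide diagnostics that can falsify surrogacy on historical experiments and give a bound on the worst-case bias under limited overlap. We further show that the stochasticity inherent to LLMs introduces bias and variance, but that using the average of multiple draws as the surrogate mitigates both. We demonstrate the methods and theory in simulations and in an application to A/B tests on Upworthy headlines. A central conclusion of our work is that the validity of LLM outcomes as a surrogate can only be falsified for past treatments and cannot be verified for new ones, so human experiments remain indispensable for novel interventions. We discuss the role of LLM choice, prompting, and temperature as design variables, and how to size human experiments for validation.

Abstract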

Organizations and researchers show increasing interest in using large language models (LLMs) in place of human participants in A/B tests, in the hope of experimenting faster and at lower cost. We study when a treatment effect estimated on LLM outcomes can recover the effect that would have been measured on the human population of interest. Distributional equivalence between LLM and human outcomes would make any standard estimator valid but is unrealistic. We therefore develop a statistical framework that adapts surrogate endpoint theory to LLMs, showing that calibrating LLM outcomes to human outcomes identifies the average treatment effect under surrogacy and comparability conditions that are jointly weaker than distributional equivalence. We present a falsification test for surrogacy and a bound on the worst-case bias from limited overlap between the LLM and human samples. We further show that the stochasticity inherent to LLMs can weaken surrogacy for identification while also introducing bias and variance during estimation, but that using an average over multiple LLM draws per unit as the surrogate mitigates these issues. Simulations validate the results, and an empirical application to A/B tests on Upworthy headlines shows that raw LLM predictions recover only 39\% of the human treatment effect while nonparametric calibration closes the gap. A central takeaway is that A/B testing on LLMs yields correct results only by assumption, whereas A/B testing on humans is correct by design, and that the required assumptions are hardest to justify precisely where A/B testing on LLMs promises the greatest benefit. We discuss the role of LLM choice, prompting, and temperature as design variables, the compounded challenge posed by long-term outcomes, and how to size human pilot studies for validation.

2606.20191 2026-06-19 stat.ML stat.ME cross-list

AK-MCS-C2: Active Kriging Monte Carlo Simulation method with conformal certification for failure probability estimation

AK-MCS-C2: An Active Kriging Monte Carlo Simulation Method with Conformal Certification for Failure Probability Estimation

Edgar Jaber, Vincent Chabridon, Mathilde Mougeot

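AI summary Proposes an active-learning framework that combines Active Kriging Monte Carlo simulation with conformal prediction. Using an adaptive cross-conformal strategy and the J+GP conformal estimator, it provides distribution-free guarantees on prediction errors from few samples and classifies samples near the limit-state surface more reliably, improving the accuracy and robustness of failure probability estimates.

Details
AI Chinese abstract

We propose a novel active-learning framework for failure probability estimation in structural reliability analysis that combines Active Kriging Monte Carlo simulation with conformal prediction. The proposed method uses an adaptive cross-conformal strategy designed specifically for small-sample settings and for kriging surrogate models based on the J+GP conformal estimator. Unlike standard AK-MCS methods, the proposed framework provides distribution-free guarantees on prediction errors, so that samples near the limit-state surface are classified more reliably. This improved uncertainty quantification enhances the accuracy and robustness of failure probability estimates, especially in rare-event regimes where such efficiency is crucial. Reproducible numerical results illustrate the effectiveness of the method and compare it with classical approaches on well-established benchmarks.

English abstract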

We introduce a novel active-learning framework for failure probability estimation in structural reliability analysis that integrates Active Kriging Monte Carlo simulation with conformal prediction. The proposed approach employs an adaptive cross-conformal strategy specifically designed for small-sample settings and kriging surrogate models using the J+GP conformal estimator. Unlike standard AK-MCS methods, the proposed framework provides distribution-free guarantees on prediction errors, leading to more reliable classification of samples near the limit-state surface. This improved uncertainty quantification enhances both the accuracy and robustness of failure probability estimates, especially for rare-event regimes where such efficiency is crucial. Reproducible numerical results illustrate the effectiveness of the method and also compare it to classical approaches on well-established benchmarks.

2606.19714 2026-06-19 stat.ML cs.AI cs.LG stat.CO stat.ME cross-list

AURA: Adaptive Uncertainty-aware Refinement for LLM-as-a-Judge Auditing

AURA: Adaptive Uncertainty-Aware Refinement for Auditing LLM-as-a-Judge

Zilong Zhang, Yi-Ting Hung, Weiyi He, Junxi Zhang, Lei Ding, Chi-Kuang Yeh

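AI summary Proposes the AURA framework, which uses adaptive uncertainty-aware refinement to iteratively learn a human-consistency signal from a small amount of human verification and prioritizes uncertain comparisons for review, improving the reliability of LLM judges.

Details
AI Chinese abstract

Large language models (LLMs) are increasingly used as judges for open-ended generation, because large-scale human evaluation is often expensive and hard to scale, yet their preferences remain imperfect proxies for human judgment. Existing auditing pipelines usually assume that a reliable subset of examples or clean supervision signals exists in advance, for example from human annotation, heuristic filtering, or the outputs of strong judges. In LLM evaluation this assumption is fragile: the initial split may inherit judge bias, and human verification is usually too scarce to define stable groups at scale. We propose AURA, an adaptive uncertainty-aware refinement framework for auditing pairwise LLM-as-a-judge decisions under selected human verification. AURA iteratively learns a human-consistency signal, propagates reliable evidence, and prioritizes uncertain comparisons for human review. The key idea is to treat trust in a judge as a latent quantity that is refined step by step as evidence accumulates. We provide a compact formulation, a stable refinement procedure, and a comprehensive evaluation on synthetic and real pairwise LLM-answer data.

English abstract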

Large language models (LLMs) are increasingly used as judges for open-ended generation, as large-scale human evaluation is often expensive and difficult to scale, yet their preferences remain imperfect proxies for human judgment. Existing auditing pipelines often assume that a reliable subset of examples or clean supervision signals are available beforehand, for example from human annotation, heuristic filtering, or the outputs of strong judges. In LLM evaluation, this assumption is fragile: the initial split may inherit judge bias, while human verification is typically too scarce to define stable groups at scale. We propose AURA, an adaptive uncertainty-aware refinement framework for auditing pairwise LLM-as-a-judge decisions under selected human verification. AURA iteratively learns a human-consistency signal, propagates reliable evidence, and prioritizes uncertain comparisons for human review. The key idea is to treat trust in a judge as a latent quantity that is progressively refined as evidence accumulates. We provide a compact formulation, a stable refinement procedure, and a comprehensive evaluation on both synthetic and real pairwise LLM-answer data.

2606.19909 2026-06-19 stat.CO math.PR stat.ME cross-list

Establishing an $\Omega(\sqrt{d})$ complexity lower bound for PDMP samplers and how to break it: a sub-$\sqrt{d}$ algorithm for Gaussian-tailed targets

Establishing an $\Omega(\sqrt{d})$ Complexity Lower Bound for PDMP Samplers and How to Break It: A Sub-$\sqrt{d}$ Algorithm for Gaussian-Tailed Targets

Augustin Chevallier

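AI summary Proves that piecewise deterministic Markov process samplers have an $\Omega(\sqrt{d})$ complexity lower bound in a standard setup and, by relaxing the assumption that the target density is invariant in continuous time, proposes a new scheme that achieves an empirical complexity of $O(d^\alpha)$ ($\alpha\in[0.2,0.3]$) for Gaussian-tailed targets.

Details
AI Chinese abstract

Despite the theoretical appeal of the non-reversibility of piecewise deterministic Markov process (PDMP) samplers, no PDMP sampler has so far been developed whose computational complexity scales better than $\mathcal{O}(\sqrt{d})$ in the target dimension $d$. We prove that this is a fundamental limitation by establishing an $\Omega(\sqrt{d})$ lower bound on the algorithmic complexity of PDMP samplers in a standard setup. By relaxing the assumption that the target density must remain invariant at all continuous times, we then show how to break this barrier. Specifically, we introduce a novel PDMP sampling scheme and show that it achieves an empirical complexity of $\mathcal{O}(d^\alpha)$ for Gaussian-tailed targets, where $\alpha \in [0.2, 0.3]$. In addition, the PDMP scheme is locally adaptive in both trajectory length and the distance between velocity updates.

English abstract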

Despite the theoretical appeal of their non-reversibility, to date, no Piecewise Deterministic Markov Process (PDMP) samplers have been developed that scale better than $\mathcal{O}(\sqrt{d})$ in computational complexity with respect to the target dimension $d$. We prove that this is a fundamental limitation by establishing an $\Omega(\sqrt{d})$ lower bound on the algorithmic complexity of PDMP samplers in a standard setup. By relaxing the assumption that the target density must remain invariant at all continuous times, we then demonstrate how to bypass this barrier. Specifically, we introduce a novel PDMP sampling scheme and show that it achieves an empirical complexity of $\mathcal{O}(d^\alpha)$, where $\alpha \in [0.2, 0.3]$, for Gaussian-tailed targets. In addition, this PDMP scheme is locally adaptive in both trajectory length and distance between velocity updates.

2606.19361 2026-06-19 cs.LG cs.AI cs.NA math.NA stat.CO stat.ME stat.ML cross-list

Computational Identifiability

Computational Identifiability

Lucius E. J. Bynum, Rajesh Ranganath, Kyunghyun Cho

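Affiliations * New York University

AI summary Proposes the 'computational identifiability' framework, in which a finite computational search procedure looks for an empirical estimator within a specified error tolerance, addressing the shortcomings of theoretical identifiability in practical settings such as finite samples and ambiguous graphical criteria.

Details
AI Chinese abstract

Identification conditions describe the computability of a target query or parameter of interest as a function of the type and amount of available information. In causal identification, this information is usually expressed as a causal graph, and data are observed or collected for some subset of the variables in the graph. A target query may be a single effect or a class of effects in a given model. Deriving an identification algorithm defines mathematically the process by which the desired causal effect is, in theory, uniquely determined in expectation. Identifiability in expectation, or 'theoretical identifiability,' generally assumes asymptotic properties, infinite data, or other mathematically idealized conditions. In this paper, we explore the fundamental distinction between this theoretical, idealized identifiability and a computation-bound alternative. The framework we propose, 'computational identifiability,' instead defines a finite computational search procedure for an empirical estimator. If the procedure empirically finds an estimator within the desired error tolerance, identifiability is satisfied, conditional on the specified assumptions of the search (i.e., a prior distribution over the parameters) and on the search procedure itself. Through several experiments, we show how the framework answers fine-grained, practical identification questions, such as identification with small finite samples, with ambiguous graphical criteria, with mixed observational-interventional data, and across counterfactual data and estimands. Code is available at https://github.com/lbynum/metadentify.

English abstract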

Identification conditions describe the computability of a target query or parameter of interest as a function of the type and amount of information available. In causal identification, this information is often expressed in the form of a causal graph, and data are observed or collected for some subset of variables in the graph. Target queries may be for a single effect alone or for a class of effects in a given model. The derivation of an identification algorithm then defines mathematically the process by which the desired causal effect(s) can be uniquely determined, theoretically, in expectation. Identifiability in expectation, or 'theoretical identifiability,' generally assumes asymptotic properties, infinite data, or other mathematically idealized conditions. In this paper, we explore a fundamental distinction between this theoretical, idealized notion of identifiability and a proposed alternative that is computation-bound. The framework we propose, 'computational identifiability,' instead defines a finite computational search procedure for an empirical estimator. If this process finds an estimator empirically, within a desired error tolerance, then identifiability is satisfied, conditional on the specified assumptions of the search (i.e., a prior distribution over the parameters) and conditional on the search procedure itself. Through several experiments, we demonstrate how this framework allows us to answer fine-grained, practical identification questions, such as identification with small finite samples, with ambiguous graphical criteria, with mixed observational-interventional data, and across counterfactual data and estimands. Code is available at https://github.com/lbynum/metadentify.

2606.20427 2026-06-19 math.ST stat.ME stat.TH cross-list

Private Rate-Double-Robust Inference

Private Rate-Double-Robust Inference

Máté Kormos, Aad van der Vaart

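AI summary Protects individual privacy by injecting noise through a local privacy mechanism while exploiting rate-double-robustness to achieve unbiased and semiparametrically efficient inference for target parameters, and develops private nonparametric and parametric nuisance estimation methods.

Details
AI Chinese abstract

We reconcile privacy protection and rate-double-robust inference. Individual privacy is protected by a local privacy mechanism: noise is injected into the sensitive data, and only the noisy data are revealed for inference. Privacy protection therefore hinders inference. In contrast, inference on a target parameter is rate-double-robust when the large-sample bias of an estimator of the parameter is characterised by a trade-off between the estimation errors of two other, nuisance, parameters. Rate-double-robustness therefore facilitates inference. The starting point of our reconciliation is a class of rate-double-robust target parameters indexed linearly by an infinite-dimensional regression and nonlinearly by a low-dimensional regression. This class includes causal parameters, among others. To infer these targets privately, we show how suitable privacy mechanisms transfer the semiparametric properties of the sensitive-data model to the private setting. Rate-double-robustness is transferred, enabling locally private, unbiased and semiparametrically efficient inference on the target parameters. Finally, we transform general nonparametric nuisance estimators into private ones that inherit the convergence properties of their non-private counterparts. For parametric nuisance models, we develop a private method-of-moments estimator and its large-sample inference theory.

English abstract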

We reconcile privacy protection and rate-double-robust inference. The privacy of individuals is protected by a local privacy mechanism: injecting noise into their sensitive data, revealing only the noisy data for inference. Hence, privacy protection hinders inference. In contrast, the inference of a target parameter is rate-double-robust when the large-sample bias of an estimator of the parameter is characterised by a trade-off between the estimation errors of two other, nuisance, parameters. Hence, rate-double-robustness facilitates inference. Our starting point of reconciliation is a class of rate-double-robust target parameters indexed linearly by an infinite-dimensional and nonlinearly by a low-dimensional regression. Among others, this includes causal parameters. To infer these targets privately, we show how suitable privacy mechanisms transfer the semiparametric properties of the sensitive-data model to the private setting. Rate-double-robustness is transferred, enabling locally-private, unbiased and semiparametrically efficient inference of our target parameters. Finally, we transform general nonparametric nuisance estimators into private ones, which inherit convergence properties of their nonprivate counterparts. For parametric nuisance models, we develop a private method-of-moments estimator and its large-sample inference theory.

2606.19789 2026-06-19 math.OC stat.ME cross-list

Dynamic Core Allocation for Malleable Jobs with Unknown Speed-up Parameters

Dynamic Core Allocation for Malleable Jobs with Unknown Speed-Up Parameters

S. A. Bodas, J. L. Dorsman, M. Mandjes, L. Ravner

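AI summary For malleable jobs with unknown speed-up parameters in multicore systems, proposes an iterative learning-and-control framework that estimates the unknown parameters by maximum likelihood and solves a Markov decision process to update the allocation policy, so as to minimize the long-run mean number of jobs.

Details
AI Chinese abstract

We study dynamic resource allocation in a multicore computing system with a fixed number of processing cores and a stream of malleable jobs. Each job can adjust its level of parallelism during execution, which allows resources to be adaptively redistributed across concurrently active jobs. Jobs belong to one of two observable classes, each characterized by a distinct speed-up function with unknown parameters. The objective is to learn a core-allocation policy that minimizes the long-run mean number of jobs in the system, equivalently the mean response time in steady state. To address this uncertainty, we develop an iterative learning-and-control framework. The system alternates between estimating the unknown speed-up parameters from observed job completions and solving the associated Markov decision process to update the allocation policy. Within each job class, cores are shared equally among active jobs; the fraction of capacity assigned to each class is obtained from the MDP formulation of reference [17], evaluated at the current parameter estimates. We construct a maximum likelihood estimator based on state-dependent departure times and prove its strong consistency under a fixed allocation policy. We further propose two learning algorithms that combine this estimation step with dynamic-programming-based policy updates, and illustrate their performance through numerical experiments.

English abstract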

We study dynamic resource allocation in a multicore computing system with a fixed number of processing cores and a stream of malleable jobs. Each job may adjust its level of parallelism during execution, allowing adaptive redistribution of resources across concurrently active jobs. Jobs belong to one of two observable classes, each characterized by a distinct speed-up function with unknown parameters. The objective is to learn a core-allocation policy that minimizes the long-run mean number of jobs in the system, equivalently the mean response time in steady state. To address this uncertainty, we develop an iterative learning-and-control framework. The system alternates between estimating the unknown speed-up parameters from observed job completions and solving the associated Markov decision process (MDP) to update the allocation policy. Within each job class, cores are shared equally among active jobs; the fraction of capacity assigned to each class is obtained from the MDP formulation of \cite{berg2017}, evaluated at the current parameter estimates. We construct a maximum likelihood estimator based on state-dependent inter-departure times and prove its strong consistency under a fixed allocation policy. We further propose two learning algorithms that combine this estimation step with dynamic programming-based policy updates, and illustrate their performance through numerical experiments.

2606.18933 2026-06-19 cs.LG cs.IR stat.ME cross-list

Zero-Shot Active Feature Acquisition via LLM-Elicitation

Zero-Shot Active Feature Acquisition Based on LLM Elicitation

Binyamin Perets, Natalie Mendelson, Shiran Vainberg, Yehuda Chowers, Shai Shen-Orr, Shie Mannor

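Affiliations * Faculty of EE, Technion; Faculty of Medicine, Technion; CytoReason; NVIDIA

AI summary Proposes a zero-shot active feature acquisition framework that elicits the sufficient statistics of a Markov random field from an LLM, addressing the shortage of labeled data, and outperforms existing methods on the diagnosis of IBD patients.

Details
AI Chinese abstract

Active feature acquisition (AFA) sequentially selects which features to observe in order to reach a classification or ranking decision. Its main limitation is its reliance on large amounts of labeled data to fit the probabilistic models that guide acquisition. Large language models (LLMs) supply unsupervised domain knowledge but perform poorly as sequential planners. Asking one to both know and decide conflates capabilities that are best kept separate. Here, we develop a zero-shot AFA framework through disciplined elicitation: the LLM is asked only for what it can be trusted to return, namely the sufficient statistics of a Markov random field (MRF), the unary deviations and pairwise co-variations. We apply the framework to two settings: binary classification and top-$k$ identification. In practice, the LLM reliably returns only discriminative statistics, that is, statistics that distinguish the classes rather than describe each class in isolation, which hinders classical AFA. We apply a maximum-entropy closure to resolve this gauge ambiguity. We evaluate on a cohort of inflammatory bowel disease (IBD) patients, an active clinical setting in which diagnostic ambiguity and patient heterogeneity obstruct stable treatment strategies. Our framework outperforms the LLM both on real labels and on its own extracted beliefs. Where it matters most, on the hardest patients, our top-$k$ acquisition policy markedly outperforms all existing methods.

English abstract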

Active feature acquisition (AFA) sequentially selects which features to observe to reach a classification or ranking decision. Its central limitation is reliance on a large amount of labeled data to fit probabilistic models guiding acquisition. Large language models (LLMs) supply unsupervised domain knowledge, but are poor sequential planners. Asking one to both know and decide conflates capabilities best kept separate. Here, we develop a framework for zero-shot AFA through disciplined elicitation: asking the LLM only for what it can be trusted to return, the unary deviations and pairwise co-variations that are the sufficient statistics of a Markov random field (MRF). We apply our framework to two settings: binary classification and top-$k$ identification. In practice, the LLM reliably returns only discriminative statistics, what distinguishes the classes rather than each class in isolation, which precludes classical AFA. We apply a maximum-entropy closure that resolves this gauge ambiguity. We evaluate on a cohort of Inflammatory Bowel Disease (IBD) patients, an active clinical setting where diagnostic ambiguity and patient heterogeneity obstruct stable treatment strategies. Our framework outperforms the LLM both on real labels and on its own extracted beliefs. Where it matters most, on the hardest patients, our top-$k$ acquisition policy markedly outperforms all existing methods.

2606.04307 2026-06-19 cs.LG stat.CO stat.ME version update

Folded Transport MCMC: Eliminating Label Switching by Sampling on a Fundamental Domain

Folded Transport MCMC: Certifiable Quotient-Posterior Computation for Symmetric Bayesian Models

Jun Hu

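Affiliations * Wuhan University of Technology

AI summary For symmetric Bayesian models, where redundant multimodality degrades MCMC convergence diagnostics, proposes Folded Transport MCMC, which performs inference directly on the quotient posterior by constructing an independence sampler on a fundamental domain of the symmetry group, and uses an LCNF oscillation certification framework to give provable certificate lower bounds under the quotient metric.

Comments 50 pages (including supplementary material), 5 figures, 6 tables. Submitted to Journal of Computational and Graphical Statistics

Details
AI Chinese abstract

Bayesian models with finite symmetries, such as mixture models with exchangeable components and structural identification with closely spaced modes, define posteriors that are invariant under the label permutation group, producing redundant multimodality that degrades MCMC convergence diagnostics. We introduce Folded Transport MCMC (FolT-MCMC), which performs inference directly on the quotient posterior by constructing an independence sampler on a fundamental domain of the symmetry group. The quotient proposal distribution is obtained by symmetrizing a learned normalizing flow over the group orbits. We prove that the LCNF-oscillation-based certification framework transfers to the quotient metric, with stabilizer-corrected ball-mass bounds and improved covering radii, and that the quantile-core certificate lower bound improves when the unfolded flow exhibits cross-mode proposal deficiencies. On Gaussian mixtures (d=2-20), label-switching targets (up to 24 equivalent modes), and a standard Bayesian three-component mixture posterior, the quantile-core certificate improvement ratio ranges from 2x to 145x, and the folded certificate is empirically almost independent of dimension. On real accelerometer data from a supertall building during Typhoon Mangkhut, FolT-MCMC produces non-trivial quantile-core certificates, whereas the unfolded certificates are trivial.

English abstract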

In Bayesian mixture models and other exchangeable-component models, the posterior is invariant under permutation of component labels, creating m! equivalent modes: the label-switching problem. Standard MCMC methods either mix poorly across these modes or rely on post-hoc relabelling that cannot guarantee the sampler has converged. We propose Folded Transport MCMC (FolT-MCMC), which eliminates label switching before sampling by restricting the Markov chain to a fundamental domain, a sorted or reflected subspace containing exactly one representative from each symmetric mode. The proposal is a learned normalising flow whose density is symmetrised over the group orbits, ensuring correct targeting on the reduced space. We show that this construction preserves a computable convergence diagnostic based on the oscillation of the log-density ratio, and that the diagnostic becomes sharper on the fundamental domain whenever the original-space flow under-covers one or more symmetric modes. Experiments on Gaussian mixtures (d=2-20), label-switching targets (up to 24 equivalent modes), a standard Bayesian three-component mixture posterior, and real accelerometer data from a supertall building show improvement ratios of 2x to 145x, with the folded diagnostic stable across dimensions while the unfolded diagnostic collapses.
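
To make the idea of a sorted fundamental domain concrete, here is a minimal Python sketch that maps the parameters of a mixture with scalar means to one canonical representative by ordering the components by mean. The helper name `to_fundamental_domain` and the choice of ordering by mean are illustrative assumptions, not taken from the paper, and the paper restricts the chain to the domain rather than relabelling samples afterwards.

```python
from itertools import permutations


def to_fundamental_domain(weights, means, scales):
    """Relabel components so that the means are in ascending order.

    Hypothetical helper: ordering by component mean is one possible choice
    of fundamental domain for a mixture with scalar means.
    """
    order = sorted(range(len(means)), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [scales[k] for k in order],
    )


# Every one of the 3! relabellings of the same mixture is mapped to the
# same representative, so the set below ends up with a single element.
weights, means, scales = [0.2, 0.5, 0.3], [1.5, -2.0, 0.4], [1.0, 0.7, 2.0]
representatives = set()
for perm in permutations(range(3)):
    w, m, s = to_fundamental_domain(
        [weights[k] for k in perm],
        [means[k] for k in perm],
        [scales[k] for k in perm],
    )
    representatives.add((tuple(w), tuple(m), tuple(s)))
```

The sketch assumes distinct means; ties would need an extra tie-breaking rule to keep the representative unique.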

2605.15896 2026-06-19 stat.ME stat.AP version update

A Model-Agnostic Bootstrap for Macro-Level Claims Reserving Under the Conditioning Principle

A Model-Agnostic Bootstrap for Macro-Level Claims Reserving Based on the Conditioning Principle

Robin Van Oirbeek, Tim Verdonck

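AI summary Proposes a bootstrap for macro-level claims reserving that satisfies the conditioning principle; a Dirichlet-Gamma hierarchy enables exact calibration and corrects the coverage error of existing bootstraps.

Comments 23 pages, v2: correction of the interpretation of the $\kappa$ parameter

Details
AI Chinese abstract

The correct inferential object is the conditional predictive distribution $p(R \mid \mathcal{D}, \hat{\theta})$, where the observed triangle $\mathcal{D}$ is held fixed. We call this the conditioning principle. All existing bootstraps violate this principle by resampling functions of $\mathcal{D}$ inside the predictive loop, which produces an $O(1)$ coverage error that does not vanish as the triangle grows. The Dirichlet-Gamma hierarchy admits a bootstrap that satisfies the principle: $S^{IBNP}_i = X^{obs}_i (1-W_i)/W_i$, where $W_i \sim \mathrm{Beta}(cF_{I-i}, c(1-F_{I-i}))$ is sampled directly from its predictive distribution. Only the allocation proportion $W_i$ is simulated; the observed triangle is held fixed. The bootstrap therefore inherits the calibration of any development-proportion method (Chain-Ladder, Bornhuetter-Ferguson, Cape Cod, or other), which makes it model-agnostic. The coverage deficit is $O(I^{-1/2})$, independent of the number of development periods. Under compound Poisson data-generating processes, the bootstrap is conservative for every $F_{I-i} \in (0,1)$: the predictive standard deviation analytically exceeds the true value by the factor $1/\sqrt{F_{I-i}}$. The ODP bootstrap violates the principle through two mechanisms acting in opposite directions: re-estimation inflates the bootstrap variance under the ODP DGP, while missing accident-year frailty deflates it under frailty DGPs. The resulting coverage discrepancy is $\Omega(1)$ regardless of $I$, which gives a structural explanation for the cross-portfolio miscalibration heterogeneity documented by Meyers (2015). Chain-Ladder, Bornhuetter-Ferguson and Cape Cod emerge as credibility estimators under diffuse, informative and pooling priors respectively, with identical structure for counts and amounts. The concentration $c$ serves as a diagnostic: $\hat{c} < 30$ signals non-stationary development.

English abstract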

The correct inferential object in claims reserving is the conditional predictive distribution $p(R \mid \mathcal{D}, \hat{\theta})$, where $\mathcal{D}$ is the observed triangle held fixed. We refer to this as the conditioning principle. All existing bootstraps violate it by resampling functions of $\mathcal{D}$ inside the predictive loop, producing an $O(1)$ coverage error that does not vanish as the triangle grows. The Dirichlet-Gamma hierarchy admits a bootstrap that satisfies the principle exactly: $S^{IBNP}_i = X^{obs}_i (1-W_i)/W_i$ with $W_i \sim \mathrm{Beta}(c\hat{F}_{I-i}, c(1-\hat{F}_{I-i}))$ sampled directly from its predictive distribution. Only the allocation proportion $W_i$ is simulated; the observed triangle is held fixed. It thus inherits calibration from any development-proportion method (Chain-Ladder, Bornhuetter-Ferguson, Cape Cod, or other), making it model-agnostic. The coverage deficit is $O(I^{-1/2})$, independent of the number of development periods. Under compound Poisson data-generating processes the bootstrap is conservative for every $F_{I-i} \in (0,1)$: the predictive standard deviation analytically exceeds the true value by the factor $1/\sqrt{F_{I-i}}$. The ODP bootstrap violates the principle through two mechanisms in opposite directions: re-estimation inflates bootstrap variance under the ODP DGP, while missing accident-year frailty deflates it under frailty DGPs. The resulting coverage discrepancy is $\Omega(1)$ regardless of $I$, providing a structural explanation for the cross-portfolio miscalibration heterogeneity documented by Meyers (2015). Chain-Ladder, Bornhuetter-Ferguson and Cape Cod emerge as credibility estimators under diffuse, informative and pooling priors respectively, with identical structure for counts and amounts. The concentration $c$ serves as a diagnostic: $\hat{c} < 30$ signals non-stationary development.
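
The sampling step stated in the abstract, $S^{IBNP}_i = X^{obs}_i (1-W_i)/W_i$ with $W_i \sim \mathrm{Beta}(c\hat{F}_{I-i}, c(1-\hat{F}_{I-i}))$, can be sketched for a single accident year as follows. This is a minimal illustration only: the function name `simulate_ibnp` and the numbers for the observed amount, development proportion and concentration are made-up assumptions, and the paper's estimation of $\hat{F}$ and $c$ is not reproduced.

```python
import random


def simulate_ibnp(x_obs, f_hat, c, n_sims=10_000, seed=0):
    """Draw S = x_obs * (1 - W) / W with W ~ Beta(c * f_hat, c * (1 - f_hat)).

    x_obs is held fixed across draws; only the allocation proportion W is
    simulated. All argument names are hypothetical.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        w = rng.betavariate(c * f_hat, c * (1.0 - f_hat))
        draws.append(x_obs * (1.0 - w) / w)
    return draws


# Illustrative inputs (assumed, not from the paper).
draws = simulate_ibnp(x_obs=1000.0, f_hat=0.8, c=200.0)
mean_ibnp = sum(draws) / len(draws)
```

For $W \sim \mathrm{Beta}(a, b)$ with $a > 1$, the ratio $(1-W)/W$ has mean $b/(a-1)$, so with these inputs the simulated mean should sit near $1000 \times 40/159$.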

2605.15811 2026-06-19 stat.ME stat.AP version update

The Negative Binomial Chain-Ladder: A Full Likelihood Model for Claim Count Reserving

The Negative Binomial Chain-Ladder Method: A Full Likelihood Model for Claims Reserving

Robin Van Oirbeek

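AI summary Proposes the negative binomial Chain-Ladder model, in which the negative binomial distribution arises naturally from a Poisson-Gamma construction, giving a clearer generative interpretation, unifying the Chain-Ladder family, and verifying the model's robustness by simulation.

Comments 35 pages, 3 figures, v2: correction of the interpretation of the $\kappa$ parameter

Details
AI Chinese abstract

The Chain-Ladder method remains the dominant macro-level technique for claims reserving in non-life insurance, yet its classical form lacks a coherent probabilistic foundation. Existing stochastic extensions, including the Mack model and the Over-Dispersed Poisson (ODP) framework, provide measures of uncertainty but rely on second-moment assumptions or quasi-likelihood variance structures. This paper develops a Negative Binomial Chain-Ladder (NB-CL) model that embeds the Chain-Ladder method in a full likelihood framework. The key contribution is a micro-level derivation showing that the negative binomial distribution arises naturally from a Poisson-Gamma construction: claims arrive according to a Poisson process with Gamma-distributed accident-year heterogeneity, and aggregation yields negative binomial incremental counts. This derivation gives the dispersion parameter $\kappa$ a structural interpretation as accident-year heterogeneity, rather than an ad-hoc overdispersion adjustment. The NB-CL model generalises the Poisson Chain-Ladder model in the limit $\kappa \to \infty$, shares the point estimates of the ODP model while differing in its variance function (quadratic vs. linear), and unifies the Chain-Ladder family within a single probabilistic hierarchy. A parametric bootstrap procedure is developed to incorporate both process and parameter uncertainty. Simulation studies confirm that, under correct specification, coverage is close to the nominal level once the dispersion parameter is bias-corrected, and that degradation is controlled under model misspecification. Empirical studies on claim count data (Australian motor bodily injury) and paid amounts (Taylor-Ashe) confirm the structural reading of $\kappa$ and the working-approximation status of the model in the amounts case.

English abstract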

The Chain-Ladder (CL) method remains the dominant macro-level technique for claims reserving in non-life insurance, yet its classical formulation lacks a coherent probabilistic foundation. Existing stochastic extensions, including the Mack model and the Over-Dispersed Poisson (ODP) framework, provide measures of uncertainty but rely on second-moment assumptions or quasi-likelihood variance structures without clear generative interpretations. This paper develops a Negative Binomial Chain-Ladder (NB-CL) model that embeds the CL method within a full likelihood-based framework. The key contribution is a micro-level derivation showing that the negative binomial distribution arises naturally from a Poisson-Gamma construction: claims arrive according to a Poisson process with Gamma-distributed accident-year heterogeneity, and aggregation yields negative binomial incremental counts. This derivation gives the dispersion parameter $\kappa$ a structural interpretation as accident-year heterogeneity, rather than an ad-hoc overdispersion adjustment. The NB-CL model generalises the Poisson Chain-Ladder model in the limit $\kappa \to \infty$, shares the point estimates of the ODP model while differing in its variance function (quadratic vs. linear), and unifies the Chain-Ladder family within a single probabilistic hierarchy. A parametric bootstrap procedure is developed to incorporate both process and parameter uncertainty. Simulation studies confirm near-nominal coverage under correct specification once the dispersion parameter is bias-corrected, and a controlled degradation under model misspecification. Empirical illustrations on claim count data (Australian motor bodily injury) and paid amounts (Taylor-Ashe) document both the structural reading of $\kappa$ and the working-approximation status of the model in the amounts case.
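
The Poisson-Gamma construction mentioned in the abstract can be checked numerically with a short Python sketch. It assumes one common parametrisation, in which the Poisson rate is Gamma-distributed with shape $\kappa$ and mean $\mu$, so that the mixed count has variance $\mu + \mu^2/\kappa$ (quadratic in $\mu$, and reducing to the Poisson variance as $\kappa \to \infty$). The paper's exact parametrisation may differ, and the function names are hypothetical.

```python
import math
import random


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poisson_gamma_counts(mu, kappa, n, seed=0):
    """Counts whose Poisson rate is Gamma(shape=kappa, scale=mu/kappa)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        lam = rng.gammavariate(kappa, mu / kappa)  # mean mu, variance mu^2/kappa
        counts.append(poisson_draw(rng, lam))
    return counts


# Assumed illustrative values: mu = 5, kappa = 2, so mu + mu^2/kappa = 17.5.
counts = poisson_gamma_counts(mu=5.0, kappa=2.0, n=20_000)
sample_mean = sum(counts) / len(counts)
sample_var = sum((k - sample_mean) ** 2 for k in counts) / len(counts)
```

Comparing `sample_var` with `sample_mean` shows the overdispersion relative to a pure Poisson count, for which the two would be roughly equal.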

2412.17470 2026-06-19 math.ST econ.EM stat.ME stat.TH version update

A Necessary and Sufficient Condition for Size Controllability of Heteroskedasticity Robust Test Statistics

A Necessary and Sufficient Condition for the Size Controllability of Heteroskedasticity Robust Test Statistics

Benedikt M. Pötscher, David Preinerstorfer

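AI summary For testing a single restriction in regression models, gives a necessary and sufficient condition for size controllability of heteroskedasticity robust test statistics, improving on existing results that give only sufficient conditions.

Comments Clarification in Footnote 15 added

Details
AI Chinese abstract

We revisit the size controllability results of Pötscher and Preinerstorfer (2025) for heteroskedasticity robust test statistics in regression models. For the special but important case of testing a single restriction (e.g., a zero restriction on a single coefficient), we give a necessary and sufficient condition for size controllability, whereas the condition in Pötscher and Preinerstorfer (2025) is, in general, only sufficient (even when testing a single restriction).

English abstract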

We revisit size controllability results in Pötscher and Preinerstorfer (2025) concerning heteroskedasticity robust test statistics in regression models. For the special, but important, case of testing a single restriction (e.g., a zero restriction on a single coefficient), we provide a necessary and sufficient condition for size controllability, whereas the condition in Pötscher and Preinerstorfer (2025) is, in general, only sufficient (even in the case of testing a single restriction).

2603.20022 2026-06-19 stat.ME version update

Q-approximation of operating characteristics of clinical trial designs

Q-Approximation of the Operating Characteristics of Clinical Trial Designs

Susanna Gentile, Daniel E. Schwartz, Riddhiman Saha, Lorenzo Trippa

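AI summary Proposes the Q-approximation, which replaces simulation of full datasets with quadratic approximations of the likelihood function to evaluate the operating characteristics of clinical trials quickly, with a computational efficiency 150 to 1,900 times that of Monte Carlo simulation.

Details
AI Chinese abstract

Designing clinical trials requires evaluating multiple operating characteristics (OCs), such as the likelihood of an early stopping decision, the probability of detecting a treatment effect, and the Type I error rate. In most cases, these evaluations rely on computationally intensive Monte Carlo simulations. As the complexity of clinical trials and the use of adaptive designs increase, the computational burden can quickly become prohibitive. We introduce a strategy for rapidly approximating OCs, called the Q-approximation. Our approach is based on quadratic approximations of the log-likelihood and asymptotic arguments. The main idea is to replace simulation of full trial datasets with simulation of the approximate likelihood functions that determine the trial's interim and final decisions. The Q-approximation can be applied to any trial design that uses data analysis methods coherent with the likelihood principle, including multistage designs with early stopping, adaptively randomized designs, and designs that leverage external data. We illustrate the approach with several examples and show that it gives accurate approximations of important OCs while reducing computation time. In particular, in our experiments the standard Monte Carlo approximation of OCs requires a computing budget 150 to 1,900 times larger than the Q-approximation to reach a comparable level of accuracy. By enabling fast OC evaluation, the Q-approximation can support broader use of innovative trial designs in both applied trial planning and methodological development.

English abstract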

Designing clinical trials requires evaluating multiple operating characteristics (OCs), such as the likelihood of an early stopping decision, the probability of detecting a treatment effect, and the Type I error rate. In most cases, these evaluations are based on computationally intensive Monte Carlo simulations. As the complexity of clinical trials and the use of adaptive designs increase, the computational burden can quickly become prohibitive. We introduce a strategy for rapidly approximating OCs, called the Q-approximation. Our approach is based on quadratic approximations of the log-likelihood and asymptotic arguments. The main idea is to replace simulation of full trial datasets with simulation of the approximate likelihood functions that determine the trial's interim and final decisions. The Q-approximation approach can be applied to any trial design that uses data analysis methods coherent with the likelihood principle, including multistage designs with early stopping, adaptively randomized designs, and designs that leverage external data. We illustrate the approach with several examples and show that it provides an accurate approximation of important OCs while reducing the computation time compared to Monte Carlo simulations. In particular, in our experiments, the standard Monte Carlo approximation of OCs requires 150 to 1,900 times greater computing budget than Q-approximations to achieve comparable levels of accuracy. By enabling fast OC evaluations, Q-approximations can support the broader use of innovative trial designs in both applied trial planning and methodological development.
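
A toy Python illustration of the general idea, not the authors' algorithm: instead of simulating every patient outcome, draw the summary that drives the decision from its asymptotic distribution. Here the summary is the estimated response rate of a single-arm trial, drawn from a normal law, which is what a quadratic log-likelihood around the true rate implies. The design (one-sided z-test at 1.645, 100 patients, response rates 0.30 under the null and 0.45 under the alternative) and both function names are assumptions made for the example.

```python
import math
import random


def power_full_simulation(p_true, p_null, n, n_sims, rng):
    """Simulate every patient outcome, then apply a one-sided z-test."""
    se_null = math.sqrt(p_null * (1 - p_null) / n)
    rejections = 0
    for _ in range(n_sims):
        responders = sum(rng.random() < p_true for _ in range(n))
        z = (responders / n - p_null) / se_null
        rejections += z > 1.645
    return rejections / n_sims


def power_normal_shortcut(p_true, p_null, n, n_sims, rng):
    """Draw the estimate itself from its asymptotic normal distribution."""
    se_null = math.sqrt(p_null * (1 - p_null) / n)
    se_true = math.sqrt(p_true * (1 - p_true) / n)
    rejections = 0
    for _ in range(n_sims):
        p_hat = rng.gauss(p_true, se_true)
        z = (p_hat - p_null) / se_null
        rejections += z > 1.645
    return rejections / n_sims


rng = random.Random(0)
power_full = power_full_simulation(0.45, 0.30, n=100, n_sims=4000, rng=rng)
power_short = power_normal_shortcut(0.45, 0.30, n=100, n_sims=4000, rng=rng)
```

The shortcut needs one random draw per simulated trial instead of one per patient, which is where the saving comes from in this toy setting; the two power estimates should agree closely when the normal approximation is adequate.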

2603.19745 2026-06-19 stat.ME version update

Invariant quantile regression for heterogeneous environments

Invariant Quantile Regression in Heterogeneous Environments

Bo Fu, Dandan Jiang

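AI summary Proposes an invariant quantile regression framework for multi-environment datasets, in which a kernel-smoothed estimator exploits invariance across environments to achieve causal discovery and overcome endogeneity.

Comments 25 pages, 4 figures

Details
AI Chinese abstract

In this paper, we propose an invariant quantile regression (IQR) framework designed specifically for multi-environment datasets, which captures the invariance across different environments. The framework is closely related to transfer learning, causal inference, and fair machine learning, and is motivated by scenarios in which the conditional probability of the response given the covariates varies while certain key variables remain invariant. This perspective differs notably from previous work that focuses only on the conditional mean, which is often insufficient to capture the full causal relationships between covariates and the response in heterogeneous environments. In contrast, quantile-based invariance naturally accommodates heterogeneity and aligns more closely with structural causal models, in which variables that are invariant across environments at one or more quantile levels directly indicate potential and stable causal variables. Moreover, we show that IQR may yield a larger set of endogenous variables than the conditional mean framework, which excludes spurious (non-causal) variables more effectively. To this end, we introduce a kernel-smoothed invariant quantile regression (KS-IQR) estimator, which exploits the underlying invariance structure and the heterogeneity among environments to ensure stable estimation across multiple environments. In a non-asymptotic framework, we establish the causal discovery properties of our method, demonstrate its ability to overcome the "curse of endogeneity", and derive an $\ell_2$ error bound for the estimator. We apply our method to real data for causal discovery and obtain biologically meaningful relationships, recovering known signaling pathways and revealing additional quantile-specific effects.

English abstract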

In this paper, we propose an invariant quantile regression (IQR) framework specifically designed for multi-environment datasets, which captures the invariance across different environments. This framework is closely related to transfer learning, causal inference, and fair machine learning, and is motivated by scenarios in which the conditional probability of the response given covariates varies, while certain key variables remain invariant. This perspective differs notably from previous works that restrict attention to the conditional mean, which is often insufficient to capture the full causal relationships between covariates and the response in heterogeneous environments. In contrast, quantile-based invariance naturally accommodates heterogeneity, and aligns more closely with structural causal models, in which variables invariant across environments at one or multiple quantile levels directly indicate potential and stable causal variables. Moreover, we show that IQR may yield a larger set of endogenous variables compared to the conditional mean framework, which in turn promotes more effective exclusion of spurious (non-causal) variables. To achieve this, we introduce a Kernel-Smoothed Invariant Quantile Regression (KS-IQR) estimator, which leverages the underlying invariance structure and heterogeneity among environments, ensuring stable estimation across multiple environments. We establish the causal discovery properties of our method, demonstrate its ability to overcome the "curse of endogeneity", and derive an $\ell_2$ error bound for our estimator, all in a non-asymptotic framework. We apply our method to real data for causal discovery and obtain biologically meaningful relationships, recovering known signaling pathways and revealing additional quantile-specific effects.