arXivDaily arXiv每日学术速递 周一至周五更新
2605.23879 2026-05-25 stat.ML cs.CR cs.LG math.ST stat.TH

On the Stability of Spherical Hellinger-Kantorovich Flows and Their Implications for Differential Privacy

球形Hellinger-Kantorovich流的稳定性及其对差分隐私的影响

Aratrika Mustafi, Soumya Mukherjee

AI总结 本文研究了球形Hellinger-Kantorovich梯度流的稳定性问题,并探讨其在差分隐私中的应用。作者建立了该梯度流的扰动理论,分析了不同势函数下流的动力学差异,并给出了与时间相关的log-似然比和Rényi散度的统一上界,进一步推导了KL散度的界。这些结果被用于差分隐私中的指数机制采样,提供了基于SHK梯度流的纯差分隐私和近似差分隐私保证,并分离了机制本身的次优性与有限时间采样误差的影响。

详情
AI中文摘要

梯度流采样将吉布斯分布解释为概率测度上能量泛函的最小值,并生成收敛到该目标的动力学。在球形Hellinger-Kantorovich (SHK)几何下,流耦合输运和反应,并与生灭Langevin动力学一致。本文发展了SHK梯度流的摄动理论。对于两个势函数$V$和$V^{\prime}$,我们从共同初始值出发比较相关的流,并量化势差异随时间传播的程度。一个统一的扰动界给出了对数似然比和Rényi散度的无维、逐点控制,而额外的结构使我们能够推导出KL散度的界。我们将这些结果应用于差分隐私中指数机制的近似采样。似然比控制为基于SHK的采样器提供了显式的时间依赖纯DP保证,而KL界通过hockey-stick散度给出了近似DP证书。我们还推导了一个效用界,将指数机制的内在次优性与有限时间采样误差分离。

英文摘要

Gradient-flow sampling interprets a Gibbs distribution as the minimizer of an energy functional over probability measures and generates dynamics converging to this target. Under spherical Hellinger-Kantorovich (SHK) geometry, the flow couples transport and reaction and coincides with birth-death Langevin dynamics. In this work, we develop a perturbation theory for SHK gradient flows. For two potentials $V$ and $V^{\prime}$, we compare the associated flows from a common initialization and quantify how potential discrepancies propagate over time. A uniform perturbation bound yields dimension-free, pointwise control of the log-likelihood ratio and Rényi divergence, while additional structure allows us to derive bounds for the KL divergence as well. We apply these results to approximate sampling for the exponential mechanism in differential privacy. The likelihood-ratio control provides explicit time-dependent Pure-DP guarantees for SHK-based samplers, while the KL bound yields Approximate-DP certificates via hockey-stick divergence. We also derive a utility bound separating intrinsic exponential-mechanism suboptimality from finite-time sampling error.
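
The link between likelihood-ratio control and pure DP can be illustrated on the exponential mechanism itself. The sketch below is a minimal illustration over a finite candidate set, whereas the paper works with continuous Gibbs targets; the function names and the example scores are assumptions made here, not taken from the paper. It uses the standard weighting exp(ε·u/(2Δ)) and checks that the pointwise log-likelihood ratio between two neighboring score vectors stays below ε.

```python
import math

def exponential_mechanism_probs(scores, epsilon, sensitivity):
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    scaled = [epsilon * s / (2.0 * sensitivity) for s in scores]
    shift = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def max_abs_log_ratio(p, q):
    """Largest pointwise |log(p_i / q_i)| between two distributions on the same support."""
    return max(abs(math.log(a / b)) for a, b in zip(p, q))

# Hypothetical utilities on two neighboring datasets (each entry differs by at most 1).
p = exponential_mechanism_probs([1.0, 2.0, 3.0], epsilon=1.0, sensitivity=1.0)
q = exponential_mechanism_probs([2.0, 3.0, 2.0], epsilon=1.0, sensitivity=1.0)
print(max_abs_log_ratio(p, q) <= 1.0)
```

A uniform bound of ε on this log-ratio is exactly the pure ε-DP condition, which is why pointwise control of the log-likelihood ratio along the flow translates into a time-dependent pure-DP guarantee.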

2605.23872 2026-05-25 cs.LG cs.NA math.NA stat.ML

Training-Free Looped Transformers

免训练循环Transformer

Lizhang Chen, Jonathan Li, Chen Liang, Ni Lao, Qiang Liu

AI总结 本文提出了一种无需训练的循环Transformer模型,通过在冻结的预训练模型中引入一个轻量级的推理时包装器,对连续的中间层块进行循环应用,而无需额外微调或结构修改。研究发现,直接重复使用中间层块会导致性能下降,因此作者借鉴常微分方程的前向欧拉方法,将循环视为对同一近似的细化,采用更小的阻尼子步骤替代单一的大更新。实验表明,该方法在七个模型家族上均能提升推理性能,例如在MMLU-Pro上将Qwen3-4B-Instruct提升2.64个百分点。

详情
AI中文摘要

我们引入了免训练循环Transformer,其中轻量级推理时包装器对冻结检查点中一段连续的中间层块进行循环,无需额外微调、继续训练或架构更改。与先前使用循环结构端到端训练的循环Transformer方法不同,我们在测试时将循环机制加装到预训练模型上。我们表明,简单的块重新应用通常会降低性能,凸显了循环应用策略的重要性。受将预归一化Transformer块视为ODE上的前向欧拉步骤的启发,我们将循环视为同一近似的细化,将一次大的更新替换为若干更小的阻尼子步骤。在七个密集、稀疏MoE和MLA+MoE模型家族中,我们的方法在MMLU-Pro上将Qwen3-4B-Instruct提升了2.64个百分点,在CommonsenseQA上将Qwen3-30B-A3B-Instruct提升了1.14个百分点,在OpenBookQA上将Moonlight-16B-A3B-Instruct提升了1.20个百分点。

英文摘要

We introduce training-free looped transformers, in which a lightweight inference-time wrapper loops a contiguous mid-stack block of layers of a frozen checkpoint without additional fine-tuning, continued training, or architectural changes. Unlike prior looped transformer methods that train with the looped structure end-to-end, we retrofit recurrence onto pretrained models at test time. We show that naive block reapplication usually degrades performance, highlighting the importance of the loop application strategy. Motivated by viewing a pre-norm transformer block as a forward Euler step on an ODE, we instead treat looping as a refinement of the same approximation, replacing one large update with smaller damped sub-steps. Across seven dense, sparse MoE, and MLA+MoE model families, our method improves Qwen3-4B-Instruct by +2.64 pp on MMLU-Pro, Qwen3-30B-A3B-Instruct by +1.14 pp on CommonsenseQA, and Moonlight-16B-A3B-Instruct by +1.20 pp on OpenBookQA.
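
The forward-Euler reading can be made concrete with a toy sketch. A residual block computes x + f(x), which is one Euler step of size 1 on dx/dt = f(x); replacing it with k sub-steps of size 1/k re-applies the same block k times with damped updates. The scalar state, the linear stand-in for the block, the function names, and the uniform 1/k damping are all illustrative assumptions, not the authors' wrapper.

```python
import math

def block_update(x):
    # Stand-in for the residual branch f(x) of a frozen block (hypothetical toy function).
    return -0.5 * x

def exact_flow(x, t=1.0):
    # Exact solution of dx/dt = -0.5 * x after time t, used only as a reference.
    return x * math.exp(-0.5 * t)

def single_step(x):
    # Standard residual block: one forward Euler step of size 1.
    return x + block_update(x)

def naive_loop(x, k):
    # Naive reapplication: k full-size steps, which integrates k units of "time".
    for _ in range(k):
        x = x + block_update(x)
    return x

def damped_loop(x, k):
    # Damped sub-steps: k steps of size 1/k cover the same unit of "time" more accurately.
    for _ in range(k):
        x = x + block_update(x) / k
    return x

x0 = 1.0
print(single_step(x0), naive_loop(x0, 4), damped_loop(x0, 4), exact_flow(x0))
```

In this toy setting the damped loop lands closer to the exact flow than the single step, while the naive loop drifts away from it, which mirrors the abstract's observation that naive reapplication tends to hurt.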

2605.23871 2026-05-25 stat.ML cs.LG math.ST stat.TH

Move on Muon : A Hamiltonian probability gradient flow perspective of Muon optimizer

Muon上的移动:Muon优化器的哈密顿概率梯度流视角

Aratrika Mustafi, Soumya Mukherjee, Bharath K. Sriperumbudur

AI总结 本文从哈密顿概率梯度流的视角,研究了Muon优化器的连续时间动力学行为,提出了正则化Muon优化的梯度流形式,并揭示了其与核范数的Fenchel对偶平滑之间的联系。通过将Muon优化推广到有限粒子概率目标函数,作者推导了其惯性连续时间极限,并建立了参数-动量对的概率相空间平均场方程,证明了该动力学为阻尼哈密顿概率动力系统,具有单调递减的哈密顿能量。此外,文章还分析了目标函数的收敛性,并将该方法扩展到适用于Transformer混合专家模型的块状Muon概率流。

详情
AI中文摘要

我们开发了一种在矩阵值参数概率测度空间上的梯度流,该梯度流由正则化Muon(理想化Muon优化器的解析平滑版本)诱导。关键观察是正则化正交化映射是核范数的光滑Fenchel对偶平滑的梯度。这确定了(正则化)Muon更新为更新变量中的镜像/近端步骤,其中动量充当对偶坐标。我们利用这一结构将Muon从单个矩阵参数提升到形如$J(ρ)=R\left(\int F d ρ\right)$的有限粒子概率目标,这一设置由神经网络训练的均场描述所激发,并推导出惯性连续时间极限。利用这一结构,我们在步长和动量的惯性缩放下推导出有限粒子连续时间极限,然后过渡到参数-动量对概率律上的相空间均场方程。所得流可被证明是阻尼哈密顿概率动力学,其动能由正则化Muon镜像势诱导。我们证明了一个精确的哈密顿耗散恒等式,显示哈密顿能量单调递减。虽然目标函数本身在惯性Muon动力学下不一定单调,但在额外的梯度占优、有界动量和曲率/对齐假设下,我们获得了目标间隙的连续和离散时间指数收敛率。我们还研究了均场极限方程的适定性,并建立了相互作用粒子系统的混沌传播保证。最后,我们将公式扩展到乘积矩阵空间上的Hilbert值特征映射,得到适用于光滑Transformer混合专家模型的块状Muon概率流。

英文摘要

We develop a gradient flow on the space of probability measures defined on matrix-valued parameters induced by regularized Muon, an analytically smoothed version of the idealized Muon optimizer. The key observation is that the regularized orthogonalization map is the gradient of a smooth Fenchel-dual smoothing of the nuclear norm. This identifies the (regularized) Muon update as a mirror/prox step in the update variable, with momentum acting as the dual coordinate. We use this structure to lift Muon from a single matrix parameter to finite-particle probability objectives of the form $J(ρ)=R\left(\int F d ρ\right)$, a setting motivated by mean-field descriptions of neural-network training, and derive the inertial continuous-time limit. Using this structure, we derive the finite-particle continuous-time limit under the inertial scaling of step size and momentum, and then pass to a phase-space mean-field equation over probability laws on parameter-momentum pairs. The resulting flow can be shown to be a damped Hamiltonian probability dynamics whose kinetic energy is induced by the regularized Muon mirror potential. We prove an exact Hamiltonian dissipation identity, showing that the Hamiltonian energy decreases monotonically. While the target objective itself need not be monotone along the inertial Muon dynamics, under additional gradient-dominance, bounded-momentum, and curvature/alignment assumptions, we obtain continuous and discrete-time exponential convergence rates for the objective gap. We also study the well-posedness of the mean-field limit equation and establish propagation of chaos guarantees for the interacting particle system. Finally, we extend the formulation to Hilbert-valued feature maps on product matrix spaces, yielding a blockwise Muon probability flow applicable to smooth transformer mixture-of-experts models.

2605.23858 2026-05-25 stat.AP

Anticipating Continued Global Fertility Decline via Neural Forecasting

通过神经预测预见持续的全球生育率下降

Daniel Ciganda, Facundo Morini, Francisco Piriz, Henrik-Alexander Schubert, Ugofilippo Basellini, Mikko Myrskylä

AI总结 本文研究了全球生育率持续下降的趋势,并引入了一个基于循环神经网络的预测框架NeuralTFR,用于评估各国生育率未来的发展路径。该模型通过整合196个国家和地区的历史生育数据,学习人口动态并生成预测区间,表现出比现有方法更优的预测精度和不确定性校准能力。研究发现,与联合国的BayesTFR模型相比,NeuralTFR预测到2040年全球将面临更广泛的低生育率状况,表明短期内生育率稳定的可能性较低。

详情
AI中文摘要

向低和超低生育率的加速转变加剧了争论:目前正在经历快速下降的国家是趋于稳定还是进入更持久的低生育率体制?现有预测系统对此给出了不同答案,因为它们嵌入了关于恢复和外部驱动因素作用的不同假设。为了在这场辩论中提供经验基准,我们引入了NeuralTFR,一个基于循环神经网络的内生全球预测框架。利用来自196个国家和地区的协调历史生育率序列面板,该模型汇集跨国信息以学习人口惯性,并通过多分位数回归生成经验预测区间。在保留期(2009-2023年)的评估中,NeuralTFR的点预测误差低于Naive Drift基线和联合国贝叶斯层次模型BayesTFR,同时保持竞争性的不确定性校准。在到2040年的前瞻预测中,NeuralTFR指出低和极低生育率的暴露范围比BayesTFR更广,表明对近期稳定性的支持较弱,但仍未达到全球疾病负担项目预测的最严重下降路径。

英文摘要

The accelerating shift toward low and ultra-low fertility has intensified the debate over whether countries now undergoing rapid decline are approaching stabilization or entering a more persistent low-fertility regime. Existing projection systems answer that question differently because they embed different assumptions about recovery and about the role of external drivers. To provide an empirical benchmark in this debate, we introduce NeuralTFR, an endogenous global forecasting framework based on a recurrent neural network. Drawing on a harmonized panel of historical fertility series from 196 countries and territories, the model pools cross-country information to learn demographic momentum and generate empirical prediction intervals via multi-quantile regression. Evaluated on a held-out period (2009--2023), NeuralTFR achieves lower point-forecast errors than a Naive Drift baseline and BayesTFR, the United Nations' Bayesian Hierarchical Model, while maintaining competitive uncertainty calibration. In forward projections to 2040, NeuralTFR points to broader exposure to low and very low fertility than BayesTFR, suggesting weaker support for near-term stabilization while still falling short of the most severe decline paths predicted by the Global Burden of Disease project.
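
Multi-quantile regression is usually trained with the pinball (quantile) loss, averaged over the quantile levels of interest. The sketch below shows that loss for a single observation; whether NeuralTFR uses exactly this form, and which quantile levels it targets, is an assumption here, and the function names and the example values are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss for one observation at quantile level tau in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction is penalized by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def multi_quantile_loss(y_true, preds_by_tau):
    """Average pinball loss over several quantile levels.

    preds_by_tau maps each level tau to the predicted quantile at that level."""
    losses = [pinball_loss(y_true, q, tau) for tau, q in preds_by_tau.items()]
    return sum(losses) / len(losses)

# Hypothetical example: observed TFR of 2.0 with predicted 10%, 50%, and 90% quantiles.
print(multi_quantile_loss(2.0, {0.1: 1.0, 0.5: 2.0, 0.9: 3.0}))
```

Minimizing this loss at levels such as 0.1 and 0.9 yields the lower and upper ends of an empirical 80% prediction interval without assuming a parametric error distribution.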

2605.23854 2026-05-25 cs.LG math.ST stat.ML stat.TH

Entrywise Error Bounds for Spectral Ranking with Semi-Random Adversaries

半随机对抗下谱排序的逐项误差界

Dongmin Lee, Anuran Makur, Japneet Singh

AI总结 本文研究了在半随机对抗环境下谱方法用于谱排序的逐项误差界问题。针对能够任意增强某些边采样概率的半随机对手,作者分析了无权重谱方法的性能,并发现其表现高度依赖生成图的谱特性。通过适当重加权观测边以抵消对手影响,可恢复接近均匀采样图的渐近性能。数值实验验证了理论结果的有效性。

详情
Comments
17 pages, 2 figures, 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2
AI中文摘要

Bradley-Terry-Luce (BTL) 模型估计是一种基于成对比较数据对项目集合进行排序的成熟策略。尽管在均匀采样图的情况下,谱估计和最大似然估计等 BTL 估计方法的理论性能已得到充分研究,但将这些结果推广到更广泛的随机图类已被证明具有挑战性。在这项工作中,我们研究了谱算法在半随机对抗下的逐项误差,该对抗可以任意提升某些边的采样概率。我们发现,未加权谱方法的性能严重依赖于生成图的谱性质。此外,我们表明,通过适当地重新加权观察到的边以对抗对抗并恢复谱间隙,可以恢复接近均匀采样图的渐近性能。最后,我们提供了支持我们理论发现的数值模拟。

英文摘要

Bradley-Terry-Luce (BTL) model estimation is a well-established strategy to rank a collection of items given a dataset of pairwise comparisons. Although the theoretical performance of BTL estimation methods, such as spectral and maximum likelihood estimation, is well studied in the regime of uniformly sampled graphs, generalizing such results to a wider class of random graphs has proved challenging. In this work, we investigate the entry-wise error of spectral algorithms against a semi-random adversary that can arbitrarily boost the sampling probabilities of certain edges. We find that the performance of the unweighted spectral method is heavily dependent on the spectral properties of the generated graph. Furthermore, we show that asymptotic performance approaching that of uniformly sampled graphs can be recovered by appropriately reweighting the observed edges to counteract the adversary and restore the spectral gap. Finally, we provide numerical simulations that support our theoretical findings.
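
For readers unfamiliar with spectral BTL estimation, the sketch below shows an unweighted estimator in the style of Rank Centrality: build a Markov chain that moves from item i to item j in proportion to how often j beat i, and read the scores off its stationary distribution. This is a minimal illustration assuming that standard construction; the paper's reweighting scheme against the adversary is not reproduced, and the function name and input format are hypothetical.

```python
def spectral_btl_scores(win_frac, n_items, iters=2000):
    """Rank-Centrality-style spectral estimate of BTL weights.

    win_frac[(i, j)] is the observed fraction of i-vs-j comparisons won by j;
    each compared pair appears once, keyed with i < j."""
    # Degree of each item in the comparison graph.
    nbrs = {i: set() for i in range(n_items)}
    for (i, j) in win_frac:
        nbrs[i].add(j)
        nbrs[j].add(i)
    d_max = max(len(s) for s in nbrs.values())

    # Row-stochastic transition matrix with self-loops absorbing the leftover mass.
    P = [[0.0] * n_items for _ in range(n_items)]
    for (i, j), f in win_frac.items():
        P[i][j] = f / d_max
        P[j][i] = (1.0 - f) / d_max
    for i in range(n_items):
        P[i][i] = 1.0 - sum(P[i])

    # Power iteration for the stationary distribution pi = pi P.
    pi = [1.0 / n_items] * n_items
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n_items)) for j in range(n_items)]
    return pi

# Noise-free win fractions generated from hypothetical true weights w = (1, 2, 3).
w = [1.0, 2.0, 3.0]
fracs = {(i, j): w[j] / (w[i] + w[j]) for i in range(3) for j in range(i + 1, 3)}
print(spectral_btl_scores(fracs, 3))
```

With exact win fractions the chain is reversible and its stationary distribution is proportional to the true weights. The convergence speed of the power iteration is governed by the spectral gap of the chain, which is the quantity the abstract says the adversary can degrade.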

2605.23791 2026-05-25 stat.ME

Joint Bayesian models for validating spatial health-event databases against a gold standard: separating global and local discrepancies

联合贝叶斯模型用于验证空间健康事件数据库与金标准:分离全局和局部差异

Mathias Brugel, Florine Kempf, Camille Ternynck, Marta Blangiardo, Michaël Génin

AI总结 该研究提出了一种基于贝叶斯框架的空间健康事件数据库验证方法,用于对比候选复用数据库与黄金标准数据库之间的全局和局部差异。研究引入了随机误差模型和结构化误差模型,并与共享成分模型进行比较,通过全局风险比和局部误差超概率指标评估数据库的一致性。实验表明,该方法能够有效识别空间结构差异,适用于医疗数据的复用验证。

详情
AI中文摘要

医疗行政和合成空间数据的重用可能克服基于人群的登记的一些局限性,前提是进行严格的验证。然而,目前没有工具可以对候选重用数据库(CFRD)与金标准(GS)进行空间验证。我们提出了一个贝叶斯框架,用于空间健康事件数据库的二维(全局和局部)地图到地图验证。我们考虑了一个误差模型族(随机[REM]和结构化[SEM]),其中CFRD被建模为与GS的偏离。两者都与共享成分模型(SCM)进行比较。全局不一致性通过数据库特定的截距差异($RR_{\mathrm{global}}$)评估,而局部不一致性通过数据库特定误差项的超越概率测量。扰动情景包括CFRD中的零、均匀、聚类和随机扰动。敏感性、特异性、错误发现率和马修斯相关系数评估了检测性能。$RR_{\mathrm{global}}$在所有模型和情景中准确恢复了全图偏移。REM和SEM对局部差异既敏感又特异。SCM更为保守。应用于来自EPIMAD登记和CFRD的克罗恩病数据,所有模型得出相同结论:CFRD再现了全局和局部空间结构,整体信号约低7%。扩展到其他结果分布、时空模型和校准是自然的下一步。 \textit{关键词:} 数据重用;空间数据库验证;贝叶斯层次模型;疾病映射;共享成分模型。

英文摘要

The reuse of medico-administrative and synthetic spatial data may overcome some limitations of population-based registries, provided rigorous validation is performed. However, no tool exists to spatially validate a candidate-for-reuse database (CFRD) against a gold standard (GS). We propose a Bayesian framework for two-dimensional (global and local) map-to-map validation of spatial health-event databases. We consider an error-model family (random [REM] and structured [SEM]) in which the CFRD is modelled as a departure from the GS. Both are compared with a shared component model (SCM). Global disagreement is assessed using the database-specific intercept difference ($RR_{\mathrm{global}}$), while local disagreement is measured by the exceedance probability of the database-specific error term. Disturbance scenarios included null, uniform, clustered, and random perturbations in the CFRD. Sensitivity, specificity, false detection rate, and Matthews Correlation Coefficient assessed detection performance. $RR_{\mathrm{global}}$ accurately recovered map-wide shifts across all models and scenarios. REM and SEM were both sensitive and specific to local discrepancies. SCM was more conservative. Applied to Crohn's disease data from the EPIMAD registry and a CFRD, all models reached the same conclusion: the CFRD reproduced global and local spatial structures with an overall signal about 7\% lower. Extensions to other outcome distributions, spatio-temporal models and calibration constitute natural next steps. \textit{Keywords:} data reuse; spatial database validation; Bayesian hierarchical models; disease mapping; shared component model.

2605.23760 2026-05-25 stat.ME

Global Sensitivity Analysis: a novel generation of mighty estimators based on rank statistics

全局敏感性分析:基于秩统计的新型强大估计器

Fabrice Gamboa, Pierre Gremaud, Thierry Klein, Agnès Lagnoux

AI总结 本文提出了一种基于秩统计量的新型统计估计框架,用于计算多种全局敏感性分析指标。该方法利用了Chatterjee提出的经验相关系数,能够高效估计包括Cramér-von-Mises指标、一阶Sobol指标、广义度量空间指标和高阶矩指标等在内的多种敏感性指标。研究证明了所提估计量的一致性与数值效率,尤其在小样本情况下表现突出,并为一阶Sobol指标的估计量建立了中心极限定理。

详情
Journal ref
Bernoulli, 2022, 28 (4), pp.2345-2374
Comments
Erratum for Global Sensitivity Analysis: a novel generation of mighty estimators based on rank statistics. Fabrice Gamboa, Thierry Klein, Agnès Lagnoux, and Paul Rochet. arXiv admin note: substantial text overlap with arXiv:2003.01772
AI中文摘要

我们为一大类全局敏感性分析指标提出了一种新的统计估计框架。我们的方法基于秩统计,并使用了Chatterjee [9]最近引入的经验相关系数。我们展示了如何应用该方法不仅计算与Chatterjee相关性概念直接相关的Cramér-von-Mises指标,还计算一阶Sobol指标、一般度量空间指标和高阶矩指标。我们建立了所得估计量的一致性,并展示了其数值效率,特别是在小样本情况下。此外,我们为一阶Sobol指标的估计量证明了中心极限定理。

英文摘要

We propose a new statistical estimation framework for a large family of global sensitivity analysis indices. Our approach is based on rank statistics and uses an empirical correlation coefficient recently introduced by Chatterjee [9]. We show how to apply this approach to compute not only the Cramér-von-Mises indices, which are directly related to Chatterjee's notion of correlation, but also first-order Sobol indices, general metric space indices and higher-order moment indices. We establish consistency of the resulting estimators and demonstrate their numerical efficiency, especially for small sample sizes. In addition, we prove a central limit theorem for the estimators of the first-order Sobol indices.
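
The rank-based idea can be sketched in a few lines. The first function computes Chatterjee's coefficient for data without ties: sort the pairs by x, rank the outputs, and sum the absolute rank differences between neighbors. The second applies the same neighbor pairing to estimate a first-order Sobol index by multiplying each output with the output of its right neighbor in the x-ordering. This is a minimal sketch of the construction as commonly stated, with hypothetical function names; tie handling and the other indices from the abstract are omitted.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n (formula for data without ties)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])   # indices sorted by x
    y_sorted = [y[i] for i in order]
    # Rank (1..n) of each output within the x-sorted sequence.
    by_value = sorted(range(n), key=lambda i: y_sorted[i])
    ranks = [0] * n
    for r, idx in enumerate(by_value):
        ranks[idx] = r + 1
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)

def rank_sobol_first_order(x, y):
    """Rank-based estimate of the first-order Sobol index of y with respect to x."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    mean = sum(y) / n
    second_moment = sum(v * v for v in y) / n
    # Pair each output with its right neighbor in the x-ordering (last wraps to first).
    cross = sum(y_sorted[i] * y_sorted[(i + 1) % n] for i in range(n)) / n
    return (cross - mean ** 2) / (second_moment - mean ** 2)

xs = [float(i) for i in range(99)]
print(chatterjee_xi(xs, xs), rank_sobol_first_order(xs, xs))
```

Both estimators need only one sample of input-output pairs and a sort, which is what makes the approach attractive for small sample sizes compared with designs that require repeated model evaluations.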

2605.23726 2026-05-25 cs.LG cs.DS stat.ML

Optimal Dimension-Free Sampling for Regularized Classification

正则化分类的最优无维度采样

Meysam Alishahi, Alexander Munteanu, Simon Omlor, Jeff M. Phillips

AI总结 本文研究了在正则化分类问题中实现$(1\pm\varepsilon)$相对误差的最优无维度采样方法,适用于一大类满足Lipschitz条件的分类损失函数,如logistic损失、铰链损失和ReLU损失等。作者给出了不同正则化项下的采样复杂度上界和下界,证明了基于$\|\cdot\|_2/k$和$\|\cdot\|_1/k$正则化的采样复杂度分别为$k^2/\varepsilon^2$和$k/\varepsilon^2$,并分析了$\|\cdot\|_2^2/k$正则化下采样复杂度对函数导数性质的依赖。相比现有基于敏感度采样的立方$k^3/\varepsilon^2$界,本文通过均匀采样或(平方)范数采样以及更精细的高阶矩分析,实现了更优的采样效率。

详情
AI中文摘要

我们证明了对于一大类Lipschitz连续分类损失函数,在各种正则化项下,达到$(1\pm\varepsilon)$相对误差的最优采样界。这包括重要的函数如logistic和sigmoid损失、hinge损失和ReLU损失,作为突出和流行的代表性例子。特别地,我们证明了对于$\|\cdot\|_2/k$正则化的$k^2/\varepsilon^2$上下界,以及对于$\|\cdot\|_1/k$正则化的$k/\varepsilon^2$上下界。对于$\|\cdot\|_2^2/k$正则化,采样复杂度主要取决于有界导数性质:如果$|g'(x)|\leq g(x)$,且$g(0)>0$,且$g$是单调或凸的,则采样复杂度是$k$的线性;否则一般界为$k^2/\varepsilon^2$。然而,如果$g(0)=0$,我们的结果表明不可能得到无维度界,甚至次线性界也被排除。所有上界都有匹配的下界(至多相差多对数项)。此外,我们的工作在概念上和算法上依赖于简单的均匀或(平方)范数采样,从而改进了最近(Alishahi and Phillips, ICML'24)的立方$k^3/\varepsilon^2$敏感度采样界。这是通过涉及更高矩界和经验过程分析的精细论证来实现的,以避免在事实上的标准VC维和敏感度框架中出现的过度计数。

英文摘要

We prove optimal sampling bounds achieving $(1\pm\varepsilon)$-relative error for a broad class of Lipschitz continuous classification loss functions under various regularization terms. This includes important functions such as logistic and sigmoid loss, hinge loss, and ReLU loss, as prominent and popular representative examples. In particular, we prove $k^2/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_2/k$ regularization, and $k/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_1/k$ regularization. For $\|\cdot\|_2^2/k$ regularization, the sampling complexity depends mainly on a bounded derivative property: if $|g'(x)|\leq g(x)$, and $g(0)>0$, and $g$ is monotonic or convex, then it admits linear in $k$ sampling complexity; otherwise the general bound is $k^2/\varepsilon^2$. However, if $g(0)=0$, our results indicate that no dimension-free bounds are possible, and even sublinear bounds are ruled out. All upper bounds are complemented by matching lower bounds up to polylogarithmic terms. Moreover, our work relies conceptually and algorithmically on simple uniform or (squared) norm sampling and hereby improves over recent cubic $k^3/\varepsilon^2$ sensitivity sampling bounds of (Alishahi and Phillips, ICML'24). This is achieved by refined arguments involving higher moment bounds and empirical process analyses to avoid overcounting that appears in the de-facto standard VC-dimension and sensitivity framework.

2605.23692 2026-05-25 stat.CO stat.AP

Trajectory-Oriented Optimization Via Adaptive Thompson Sampling And Grid Refinement: A Tutorial With The ADAPTIVE_TS Package

基于自适应汤普森采样和网格细化的轨迹导向优化:ADAPTIVE_TS 包教程

David O'Gara, Arindam Fadikar, Mickaël Binois, Nicholson Collier, Jonathan Ozik

AI总结 本文介绍了一种基于自适应汤普森采样和网格细化的轨迹导向优化方法,并通过开源Python工具包adaptive_ts进行教程讲解。该方法无需对模拟器的随机行为做出假设,特别适用于校准流行病学模拟器等复杂模型,能够有效通过模拟器与观测数据之间的误差识别轨迹。文章还提供了多个实际案例,帮助用户理解和应用该优化框架。

详情
AI中文摘要

随机模拟器越来越多地被用于扩展科学知识的边界并为现实世界中的决策提供信息。模拟器校准是一个关键步骤,通过调整内部模型输入以匹配外部标准(通常以观测数据的形式),在模型设计和验证中至关重要。流行病学模拟器提供了一个特别引人注目的用例,正如最近的新冠疫情所证明的那样。在几种校准范式中,轨迹导向优化是一种新兴方法,它不需要对模拟器复制的随机行为做出假设,并且通过模拟器与观测数据之间的误差视角,特别有效地识别轨迹,尤其是与贝叶斯优化结合时。我们提供了一个关于使用开源Python包\texttt{adaptive\_ts}进行轨迹导向优化的教程。我们还在一个配套网页上提供了一系列工作示例。

英文摘要

Stochastic simulators are increasingly used to expand the frontier of scientific knowledge and inform decision-making across real-world contexts. Simulator calibration, a process by which internal model inputs are tuned to match some external criteria, usually in the form of observed data, is a key step in model design and validation. Epidemiological simulators present an especially compelling use case, as evidenced by the recent COVID-19 pandemic. Among several calibration paradigms, trajectory-oriented optimization is an emerging approach that does not require assumptions on the stochastic behavior of the simulator replicates and is particularly effective at identifying trajectories through the lens of errors between the simulator and observed data, especially when combined with Bayesian optimization. We present a tutorial on trajectory-oriented optimization with \texttt{adaptive\_ts}, an open-source Python package. We also provide a series of worked examples on an accompanying webpage.

2605.23691 2026-05-25 stat.ME

Joint Estimation of Marginal and Heterogeneous Treatment Effects

边际和异质性治疗效应的联合估计

Leticia Wuethrich, Torsten Hothorn

AI总结 该研究旨在同时估计边际治疗效应和异质性治疗效应,解决传统随机对照试验中协变量调整可能改变边际效应估计的问题。提出了一种联合建模框架,将边际治疗效应直接嵌入结果和基线协变量的联合模型中,从而在保持边际可解释性的同时提升估计效率并评估协变量的预测作用。该方法适用于多种类型结局变量,并通过模拟和实际应用验证了其有效性与优越性。

详情
AI中文摘要

随机临床试验通常旨在估计边际治疗效应。虽然协变量调整可以提高精度,但在非线性模型中由于不可折叠性,它可能改变估计目标,导致条件治疗效应而非边际治疗效应。同时,识别预后和预测协变量对于理解治疗效应异质性和指导临床决策至关重要。在保持边际可解释性的同时允许效率提升和异质性评估仍然是一个方法论挑战。 在这项工作中,我们将非参数正态调整边际推断扩展到允许异质性治疗效应。所提出的框架将边际治疗效应直接嵌入到结果和基线协变量的联合模型中。这种构造在调整潜在的预后和/或预测协变量的同时,保持了边际可解释性。该方法适用于连续、二元、有序和事件时间结局,并允许在共同尺度上对预后和预测协变量进行显式估计和排序。 对于连续结局,我们表明,在协变量调整下,以Cohen's $d$衡量的边际治疗效应的渐近方差从不比未调整时更差,且通常更好。效率提升主要由预后效应驱动,而现实中的预测效应贡献甚微。模拟研究在不同结局类型中证实了这些发现,并展示了对于Cohen's d、对数优势比和对数风险比的无偏且更有效的边际效应估计。应用于一项针灸试验表明,该方法在提高效率并允许对预后和预测协变量进行排序的同时,再现了原始试验结果。

英文摘要

Randomized clinical trials typically aim to estimate a marginal treatment effect. While covariate adjustment can improve precision, it may change the estimand in nonlinear models due to noncollapsibility, leading to conditional rather than marginal treatment effects. At the same time, identifying prognostic and predictive covariates is important for understanding treatment effect heterogeneity and informing clinical decision-making. Keeping marginal interpretability while allowing efficiency gains and assessment of heterogeneity remains a methodological challenge. In this work, we extend nonparanormal adjusted marginal inference to allow for heterogeneous treatment effects. The proposed framework embeds the marginal treatment effect directly in a joint model for the outcome and baseline covariates. This construction preserves marginal interpretability while adjusting for potentially prognostic and/or predictive covariates. The method applies to continuous, binary, ordinal, and time-to-event outcomes and allows explicit estimation and ranking of prognostic and predictive covariates on a common scale. For continuous outcomes, we show that the asymptotic variance of the marginal treatment effect measured as Cohen's $d$ is never worse and often better under covariate adjustment than without adjustment. Efficiency gains are primarily driven by prognostic effects, with realistic predictive effects contributing little additional improvement. Simulation studies confirm these findings across outcome types and demonstrate unbiased and more efficient estimation of marginal effects for Cohen's d, log-odds ratios, and log-hazard ratios. Application to an acupuncture trial demonstrates that the method reproduces the original trial findings while improving efficiency and allowing ranking of prognostic and predictive covariates.

2605.22738 2026-05-25 cs.LG cs.AI stat.ML

Proxy-Based Approximation of Shapley and Banzhaf Interactions

基于代理的Shapley和Banzhaf交互近似

Santo M. A. R. Thies, Hubert Baniecki, R. Teal Witter, Eyke Hüllermeier, Maximilian Muschalik, Fabian Fumagalli

AI总结 本文研究了如何高效准确地估计Shapley和Banzhaf交互值,以解释机器学习模型中特征之间的复杂相互作用。为此,作者提出了ProxySHAP方法,结合树模型代理的高效采样与残差校正策略,实现了在保证精度的同时提升计算效率。理论分析表明,ProxySHAP能够在多项式时间内计算树集成模型的精确交互指数,并有效控制偏差与方差。实验表明,ProxySHAP在多个基准测试中表现优异,尤其在大规模高维数据上显著优于现有方法。

详情
AI中文摘要

Shapley和Banzhaf交互捕捉了现代机器学习应用中固有的复杂动态。然而,当前对这些高阶交互的估计器在速度和准确性之间进行权衡。为了克服这一限制,我们引入了ProxySHAP。ProxySHAP将基于树的代理模型的高样本效率与通过残差校正实现一致性的原则路径相结合。在理论层面,我们推导了干预TreeSHAP的多项式时间推广,以计算树集成的精确交互指数,成功避免了先前方法中的指数树深度依赖。此外,我们正式分析了残差调整策略,刻画了最大样本重用(MSR)在特定条件下校正代理偏差而不使其方差随交互规模指数增长的条件。广泛的基准测试表明,ProxySHAP在近似质量上树立了新的最先进标准,包括在具有数千个特征的大规模应用中。通过在小预算和大预算场景下均实现最低误差,ProxySHAP显著优于先前最佳估计器ProxySPEX和KernelSHAP-IQ,同时在可解释性下游任务上也提供了卓越性能。

英文摘要

Shapley and Banzhaf interactions capture the complex dynamics inherent in modern machine learning applications. However, current estimators for these higher-order interactions trade off between speed and accuracy. To overcome this limitation, we introduce ProxySHAP. ProxySHAP reconciles the high sample efficiency of tree-based proxy models with a principled path to consistency via residual correction. On a theoretical level, we derive a polynomial-time generalization of interventional TreeSHAP to compute exact interaction indices for tree ensembles, successfully bypassing exponential tree-depth dependencies in prior methods. Furthermore, we formally analyze the residual adjustment strategy, characterizing the specific conditions under which Maximum Sample Reuse (MSR) corrects proxy bias without its variance scaling exponentially with interaction size. Extensive benchmarking demonstrates that ProxySHAP sets a new state-of-the-art standard for approximation quality, including in large-scale applications with thousands of features. By achieving the lowest error in both small- and large-budget regimes, ProxySHAP significantly outperforms the prior best estimators ProxySPEX and KernelSHAP-IQ, while also delivering superior performance on downstream explainability tasks.
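
To make the estimated quantities concrete, the sketch below computes exact pairwise Shapley and Banzhaf interaction indices by brute-force enumeration, using the standard weighted sum of discrete second derivatives of the game. It is exponential in the number of players and is only meant as a reference definition for tiny games; it is not ProxySHAP, and the function name and the toy game are hypothetical.

```python
from itertools import combinations
from math import factorial

def pairwise_interaction(value, n, i, j, index="shapley"):
    """Exact order-2 interaction between players i and j by enumeration.

    value: function mapping a frozenset of players to a float.
    index: "shapley" or "banzhaf" (selects the coalition weighting)."""
    if index not in ("shapley", "banzhaf"):
        raise ValueError("index must be 'shapley' or 'banzhaf'")
    others = [p for p in range(n) if p not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        if index == "shapley":
            weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
        else:
            weight = 1.0 / 2 ** (n - 2)
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Discrete second derivative of the game at coalition s.
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total

# Toy game: a payoff of 1 is earned only when players 0 and 1 are both present.
def toy_game(s):
    return 1.0 if {0, 1} <= s else 0.0

print(pairwise_interaction(toy_game, 3, 0, 1), pairwise_interaction(toy_game, 3, 0, 2))
```

The enumeration visits every coalition of the remaining players, which is the cost that sampling-based and proxy-based estimators are designed to avoid.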

2605.21813 2026-05-25 cs.LG stat.ME stat.ML

Symbolic Density Estimation for Discrete Distributions

离散分布的符号密度估计

Ziwen Liu, Meng Li

AI总结 本文提出了一种名为符号密度估计(SDE)的无监督框架,用于自动恢复离散分布的闭式概率质量函数。该方法通过在结构化的搜索空间中组合基本解析操作,结合领域特定的结构先验、进化搜索和有效性感知推理阶段,能够有效扩展至更复杂的分布族,如零膨胀分布和有限混合分布。研究还构建了一个涵盖多种常用离散分布的基准数据集,并在实验中验证了该算法在参数估计和模型拟合方面的优越性。

详情
Comments
28 pages, 5 figures, 22 tables
AI中文摘要

离散概率法则支撑着统计建模,然而可解释分布的目录通过几个世纪以来逐案数学推导仅逐渐扩展。我们引入了符号密度估计(SDE),这是一个无监督框架,通过在结构化搜索空间内组合基本解析操作自动恢复闭式概率质量函数。我们的方法将领域特定的结构先验与进化搜索和有效性感知推理阶段相结合,并扩展到更丰富的分布族,如零膨胀和有限混合。为了支持系统评估和未来研究,我们贡献了一个涵盖广泛常用离散分布的基准数据集。所提出的算法恢复了所有基准分布族,并给出了准确的参数估计。一个真实数据应用表明,它识别出简洁且可解释的混合模型,这些模型在拟合优度上优于标准模型。

英文摘要

Discrete probability laws underpin statistical modeling, yet the catalog of interpretable distributions has expanded only gradually through centuries of case-by-case mathematical derivations. We introduce symbolic density estimation (SDE), an unsupervised framework that automatically recovers closed-form probability mass functions by composing elementary analytic operations within a structured search space. Our method integrates domain-specific structural priors with evolutionary search and a validity-aware inference stage, and it extends to richer distribution families such as zero inflation and finite mixtures. To support systematic evaluation and future research, we contribute a benchmark dataset spanning a broad collection of commonly used discrete distributions. The proposed algorithm recovers all benchmark families with accurate parameter estimates. A real data application shows that it identifies concise and interpretable mixture models that improve goodness-of-fit over standard models.

2605.20143 2026-05-25 stat.AP stat.CO stat.ML

Semi-Parametric Bayesian Additive Regression Trees for Risk Prediction with High-Dimensional Epigenetic Signatures and Low-Dimensional Covariates

半参数贝叶斯加性回归树用于高维表观遗传特征和低维协变量的风险预测

Saurabh Bhandari, Parveen Bhatti, Brian C. -H. Chiu, Yuan Ji

AI总结 在精准医学背景下,如何有效结合高维表观遗传数据与低维协变量进行风险预测是一个重要挑战。本文提出一种半参数贝叶斯加性回归树模型(spBART),通过参数部分建模低维协变量以获得可解释的效应估计,同时利用树集成捕捉高维预测变量之间的复杂非线性关系。该方法结合交叉验证与贝叶斯错误发现率控制,实现了稳定的变量选择,并在多发性骨髓瘤研究中表现出优异的预测性能,验证集AUC达到0.96。

详情
AI中文摘要

在精准医学时代,全基因组表观遗传修饰提供了可用于风险预测的丰富数据。然而,这些数据具有高维性和复杂依赖结构,当目标是获得协变量调整的可解释效应估计时,很难将它们与低维协变量联合建模。标准贝叶斯加性回归树(BART)提供强大的预测性能,但在树集成中统一处理所有预测变量,掩盖了重要协变量的贡献,并使高维设置中的变量选择复杂化。我们提出了一种半参数BART模型(spBART),通过参数组件建模低维协变量(具有可解释系数),同时通过树集成捕捉高维预测变量之间的复杂非线性关联,从而解决了这一局限性。为了进行稳定的变量选择,我们开发了一种基于交叉验证的程序,该程序汇总各折的后验包含概率,并应用贝叶斯错误发现率控制。我们将所提出的方法应用于两项多发性骨髓瘤研究(N = 869)中循环游离DNA衍生的高维全基因组5-羟甲基胞嘧啶谱的汇总病例-对照分析。该方法识别出一组简约的候选位点,并在保留验证集中实现了强大的样本外判别能力(AUC = 0.96)。总体而言,spBART为在高维生物医学研究中结合可解释协变量推断与灵活建模及变量选择提供了一个统一框架。

英文摘要

In the era of precision medicine, genome-wide epigenetic modifications offer rich data that could inform risk prediction. However, these data are high-dimensional and exhibit complex dependence structures, which makes it difficult to jointly model them with low-dimensional covariates when the goal is to obtain interpretable effect estimates for covariate adjustment. Standard Bayesian additive regression trees (BART) provide strong predictive performance but treat all predictors uniformly within the tree ensemble, obscuring the contributions of significant covariates and complicating variable selection in high-dimensional settings. We propose a semi-parametric BART model (spBART) that addresses this limitation by modeling low-dimensional covariates through a parametric component with interpretable coefficients, while capturing complex nonlinear associations among high-dimensional predictors through the tree ensemble. To perform stable variable selection, we develop a cross-validation-based procedure that aggregates posterior inclusion probabilities across folds and applies Bayesian false discovery rate control. We apply the proposed method to a pooled case--control analysis of high-dimensional genome-wide 5-hydroxymethylcytosine profiles derived from circulating cell-free DNA in two multiple myeloma studies ($N = 869$). The approach identifies a parsimonious set of candidate loci and achieves strong out-of-sample discrimination (AUC $= 0.96$) in a held-out validation set. Overall, spBART provides a unified framework for combining interpretable covariate inference with flexible modeling and variable selection in high-dimensional biomedical studies.

2605.11138 2026-05-25 cond-mat.stat-mech cs.IT hep-th math.IT stat.ME

Field Theory of Data: Anomaly Detection via the Functional Renormalization Group. The 2D Ising Model as a Benchmark

数据的场论:基于泛函重整化群的异常检测——以二维Ising模型为基准

Riccardo Finotello, Vincent Lahoche, Parham Radpay, Dine Ousmane Samary

AI总结 本文将高噪声环境下的异常检测与非平衡场论的重整化群流建立对应关系,提出了一种基于场论的异常检测新方法。通过证明相互作用非平衡系统的相变检测可映射到近高斯固定点的有效平衡场论,并识别出该分布为通用的Marchenko-Pastur分布,为该框架提供了物理基础。以二维Ising模型为基准,研究展示了噪声与信号比相当于物理温度,信号在涨落背景中以有序区域形式出现,该方法在临界阈值识别上误差低于4%,显著优于传统信息论指标。

详情
Comments
15 pages, 2 appendixes; correction of typos and captions, improved clarity
AI中文摘要

我们在高噪声区域的异常检测与非平衡场论的重整化群流之间建立了对应关系。通过证明相互作用非平衡系统中的相变检测映射到其高斯不动点附近的有效平衡场论的研究(我们将其识别为通用的Marchenko-Pastur分布),我们为该框架提供了物理基础。将泛函重整化群应用于二维Model A,我们证明了噪声信号比扮演物理温度的角色,其中信号作为有序域出现在热化涨落背景中。利用Onsager精确解作为基准,我们表明该方法识别临界阈值的误差低于4%,显著优于标准信息论度量如Kullback-Leibler散度。我们的结果为在临界点附近解析复杂数据集中的结构提供了通用策略,弥合了统计力学与统计推断之间的鸿沟。

英文摘要

We establish a correspondence between anomaly detection in high-noise regimes and the renormalization group flow of non-equilibrium field theories. We provide a physical grounding for this framework by proving that the detection of phase transitions in interacting non-equilibrium systems maps to the study of an effective equilibrium field theory near its Gaussian fixed point, which we identify with the universal Marchenko-Pastur distribution. Applying the Functional Renormalization Group to the two-dimensional Model A, we demonstrate that the noise-to-signal ratio acts as a physical temperature, where the signal emerges as ordered domains within a thermalized background of fluctuations. Using the exact Onsager solution as a benchmark, we show that this approach identifies critical thresholds with an error below 4%, significantly outperforming standard information-theoretic metrics such as the Kullback-Leibler divergence. Our results provide a universal strategy for resolving structures in complex datasets near criticality, bridging the gap between statistical mechanics and statistical inference.

2605.05428 2026-05-25 stat.ME cond-mat.stat-mech physics.plasm-ph

Parameter estimation for kappa distributions using the EM algorithm in the superstatistical framework

超统计框架下使用EM算法估计kappa分布的参数

Leonardo Herrera-Fuenzalida, Sergio Davis

AI总结 本文研究了在超统计框架下利用EM算法对κ分布参数进行估计的问题。由于κ分布不属于指数族,传统极大似然估计难以直接应用,作者通过引入贝克-科恩超统计框架,将逆温度参数β作为潜在变量,从而恢复了指数族结构,并实现了EM算法的解析求解。该方法在合成数据上的实验表明,能够稳定收敛并准确恢复生成参数,为存在局部温度波动的超统计系统提供了高效且透明的参数估计方法。

详情
AI中文摘要

Kappa分布广泛应用于空间等离子体物理中,用于模拟具有重尾的速度分布函数。然而,这些分布中的参数估计因kappa分布不属于指数族而变得复杂,因此它没有充分统计量,直接最大似然估计需要数值优化,且没有解析的闭式更新方程。在Beck-Cohen超统计框架下,其中伽马分布的逆温度\(\beta\)通过边缘化生成kappa分布,我们将\(\beta\)视为潜变量。这种层次描述恢复了边缘kappa分布所缺乏的指数族结构,并得到了期望最大化(EM)算法的解析可处理实现,其E步和M步以充分统计量的形式具有闭式表达式。应用于从模型生成的合成数据时,该算法单调收敛到边缘kappa对数似然的平稳点,并在所探索的\(\kappa\)范围内一致地恢复生成参数。因此,EM为具有局部温度波动的超统计系统中的推断提供了一种可处理且透明的途径。

英文摘要

Kappa distributions are widely used in space plasma physics to model velocity distribution functions with heavy tails. Parameter estimation in these distributions is, however, complicated by the fact that the kappa distribution does not belong to the exponential family, so it admits no sufficient statistics and direct maximum likelihood requires numerical optimization without analytically closed-form update equations. Working within the Beck-Cohen superstatistics framework, where a gamma-distributed inverse temperature \(β\) generates the kappa distribution upon marginalization, we treat \(β\) as a latent variable. This hierarchical description restores the exponential family structure that the marginal kappa distribution lacks, and yields an analytically tractable implementation of the expectation-maximization (EM) algorithm whose E-step and M-step admit closed-form expressions in terms of sufficient statistics. Applied to synthetic data drawn from the model, the algorithm converges monotonically to a stationary point of the marginal kappa log-likelihood and recovers the generating parameters consistently across the explored range of \(κ\). EM thus offers a tractable and transparent route to inference in superstatistical systems with local temperature fluctuations.
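
The role of the latent inverse temperature can be illustrated with a one-dimensional worked derivation. The parametrization below (a zero-mean Gaussian with precision \(β\), and a gamma prior with shape \(a\) and rate \(b\)) is an assumption chosen for illustration; the paper's own convention and its mapping from \((a, b)\) to \(κ\) are not reproduced here.

```latex
% Hierarchical model (assumed parametrization)
x \mid \beta \sim \mathcal{N}\!\left(0, \beta^{-1}\right), \qquad
\beta \sim \mathrm{Gamma}(a, b)

% Marginalizing over beta gives a heavy-tailed, kappa-like density
p(x) = \int_0^\infty p(x \mid \beta)\, p(\beta)\, d\beta
     = \frac{\Gamma\!\left(a + \tfrac{1}{2}\right)}{\Gamma(a)\sqrt{2\pi b}}
       \left(1 + \frac{x^2}{2b}\right)^{-\left(a + \frac{1}{2}\right)}

% E-step: the posterior of the latent variable is again a gamma distribution
\beta \mid x \sim \mathrm{Gamma}\!\left(a + \tfrac{1}{2},\; b + \tfrac{x^2}{2}\right)

% so the expected sufficient statistics are available in closed form
\mathbb{E}[\beta \mid x] = \frac{a + \tfrac{1}{2}}{b + \tfrac{x^2}{2}}, \qquad
\mathbb{E}[\ln \beta \mid x] = \psi\!\left(a + \tfrac{1}{2}\right) - \ln\!\left(b + \tfrac{x^2}{2}\right)
```

Here \(\psi\) is the digamma function. Because the complete-data likelihood of \((x, β)\) is in the exponential family, the M-step only needs sample averages of these two conditional expectations, which is the structure the abstract refers to.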

2602.15602 2026-05-25 cs.LG stat.ML

Certified Per-Instance Unlearning Using Individual Sensitivity Bounds

使用个体灵敏度界限的认证逐实例遗忘

Hanna Benarroch, Jamal Atif, Olivier Cappé

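AI summary: This paper studies how to achieve certified per-instance machine unlearning using individual sensitivity bounds. Unlike conventional noise injection calibrated to worst-case sensitivity, the authors propose calibrating the noise adaptively to each data point's contribution, which reduces the amount of injected noise and improves model performance. Experiments on ridge regression and deep learning validate the approach, showing that it substantially lowers the noise while preserving the unlearning certificate.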

详情
AI中文摘要

认证的机器遗忘可以通过注入噪声实现,从而提供差分隐私保证,其中噪声根据最坏情况灵敏度进行校准。这种保守的校准通常会导致性能下降,限制了实际适用性。在这项工作中,我们研究了一种基于自适应逐实例噪声校准的替代方法,该校准针对每个数据点对学习解的个体贡献进行定制。这引发了以下挑战:当机制依赖于要移除的特定点时,如何建立正式的遗忘保证?为了定义噪声梯度动力学中的个体数据点灵敏度,我们考虑使用逐实例差分隐私。对于通过朗之万动力学训练的岭回归,我们推导出高概率的逐实例灵敏度界限,从而在注入显著更少噪声的情况下实现认证遗忘。我们通过线性设置中的实验证实了我们的理论发现,并提供了进一步的经验证据,表明该方法在深度学习设置中的相关性。

英文摘要

Certified machine unlearning can be achieved via noise injection leading to differential privacy guarantees, where noise is calibrated to worst-case sensitivity. Such conservative calibration often results in performance degradation, limiting practical applicability. In this work, we investigate an alternative approach based on adaptive per-instance noise calibration tailored to the individual contribution of each data point to the learned solution. This raises the following challenge: how can one establish formal unlearning guarantees when the mechanism depends on the specific point to be removed? To define individual data point sensitivities in noisy gradient dynamics, we consider the use of per-instance differential privacy. For ridge regression trained via Langevin dynamics, we derive high-probability per-instance sensitivity bounds, yielding certified unlearning with substantially less noise injection. We corroborate our theoretical findings through experiments in linear settings and provide further empirical evidence on the relevance of the approach in deep learning settings.

2602.12534 2026-05-25 stat.ML cs.DS cs.LG math.ST stat.TH

Linear Regression with Unknown Truncation Beyond Gaussian Features

未知截断下的线性回归:超越高斯特征

Alexandros Kouridakis, Anay Mehrotra, Alkis Kalavasis, Constantine Caramanis

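AI summary: This paper studies how to efficiently estimate the unknown regression parameters in truncated linear regression when the survival set of the response is unknown. Unlike prior work that relies on a known survival set or on strong assumptions such as Gaussianity, it proposes an algorithm that only requires sub-Gaussian feature vectors and runs in polynomial time, a substantial gain in computational efficiency. The core of the method is a new subroutine that, under a smoothness condition, efficiently learns unions of a bounded number of intervals from positive examples only; this subroutine may be of independent interest.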

详情
AI中文摘要

在截断线性回归中,只有当结果 $y$ 落在某个生存集 $S^\star$ 内时,样本 $(x,y)$ 才被观测到,目标是估计未知的 $d$ 维回归系数 $w^\star$。该问题在统计学和机器学习中有着悠久的研究历史,可追溯到 (Galton, 1897; Tobin, 1958) 的工作,以及近期如 (Daskalakis et al., 2019; 2021; Lee et al., 2023; 2024) 的研究。然而,尽管历史久远,大多数先前工作仅限于 $S^\star$ 精确已知的特殊情况。更实际相关的情况——$S^\star$ 未知且需从数据中学习——仍然开放:实际上,目前可用的算法要么要求特征向量分布有强假设(如高斯性),即使如此,达到 $\varepsilon$ 精度的运行时间也为 $d^{\mathrm{poly} (1/\varepsilon)}$。在本工作中,我们给出了首个针对未知生存集的截断线性回归算法,运行时间为 $\mathrm{poly} (d/\varepsilon)$,仅要求特征向量是次高斯的。我们的算法依赖于一个新颖的子程序,该子程序在某种平滑条件下,利用正例(无负例)高效学习有界数量区间的并集。该学习保证补充了正例仅 PAC 学习的研究路线,并可能具有独立意义。

英文摘要

In truncated linear regression, samples $(x,y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^\star$ and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history of study in Statistics and Machine Learning going back to the works of (Galton, 1897; Tobin, 1958) and more recently in, e.g., (Daskalakis et al., 2019; 2021; Lee et al., 2023; 2024). Despite this long history, however, most prior works are limited to the special case where $S^\star$ is precisely known. The more practically relevant case, where $S^\star$ is unknown and must be learned from data, remains open: indeed, here the only available algorithms require strong assumptions on the distribution of the feature vectors (e.g., Gaussianity) and, even then, have a $d^{\mathrm{poly} (1/\varepsilon)}$ run time for achieving $\varepsilon$ accuracy. In this work, we give the first algorithm for truncated linear regression with unknown survival set that runs in $\mathrm{poly} (d/\varepsilon)$ time, by only requiring that the feature vectors are sub-Gaussian. Our algorithm relies on a novel subroutine for efficiently learning unions of a bounded number of intervals using access to positive examples (without any negative examples) under a certain smoothness condition. This learning guarantee adds to the line of works on positive-only PAC learning and may be of independent interest.

2509.06896 2026-05-25 cs.LG stat.ML

Are Targeted Data Poisoning Attacks as Effective as We Think?

定向数据投毒攻击是否如我们想象中那么有效?

William Xu, Chenyu Zhang, Yihan Wang, Matthew Y. R. Yang, Zuoqiu Liu, Gautam Kamath, Yaoliang Yu, Yiwei Lu

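AI summary: This paper examines the practical effectiveness of targeted data poisoning attacks and points out that existing evaluations, which rely on randomly selected targets, fail to reflect worst-case attack effectiveness. The authors argue that evaluation should focus on the samples that are hardest to poison and, using only clean-model information, propose a method for identifying the easiest and hardest samples to poison, enabling more rigorous worst-case evaluation and proactive defense.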

详情
AI中文摘要

定向数据投毒攻击通过向训练数据中注入恶意样本来操纵模型对特定测试样本的预测。然而,现有评估通常报告随机选择目标上的平均攻击成功率,掩盖了真实的最坏情况效果。我们认为正确的评估应聚焦于最难投毒的样本。同样的推理适用于防御:由于定向攻击在分布层面不留下痕迹,防御者应主动识别最脆弱的样本并应用定向对策。给定一个测试数据集,本文仅基于清洁模型信息识别最容易和最难投毒的样本。具体而言,我们利用清洁训练动态提供粗粒度评估,并利用投毒距离和预算对投毒类别进行细粒度分类。实验表明,这些指标能够可靠地按投毒脆弱性对样本分层,从而实现严格的最坏情况评估和主动的脆弱性感知防御。

英文摘要

Targeted data poisoning attacks manipulate model predictions on specific test samples by injecting malicious data into training. Yet existing evaluations report average attack success rates over randomly selected targets, obscuring true worst-case effectiveness. We argue that the right evaluation focuses on the hardest samples to poison. The same reasoning applies to defense: since targeted attacks leave no footprint at the distribution level, defenders should proactively identify the most vulnerable samples and apply targeted countermeasures. Given a test dataset, this paper identifies both the easiest and hardest to poison examples based on only clean model information. Specifically, we offer coarse evaluations using clean training dynamics, and fine-grained classification on poison class using poison distances and budgets. Our experiments show these metrics reliably stratify samples by poisoning vulnerability, enabling both rigorous worst-case evaluation and proactive vulnerability-aware defense.

2505.08045 2026-05-25 math.ST stat.TH

Measures of association for approximating copulas

近似copula的关联度量

Marcus Rockel

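AI summary: This paper studies closed-form expressions for several association measures of copulas commonly used for approximation, including Bernstein, shuffle-of-min, checkerboard and check-min copulas. In particular, it gives closed-form expressions for Chatterjee's ξ, which has attracted attention in recent years, and proves that under certain conditions the ξ of the checkerboard approximation is at most that of the original copula and converges to it as the approximation order grows. This result provides theoretical support for copula approximation and dependence analysis.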

详情
Comments
28 pages
AI中文摘要

本文研究了常用于近似目的的多种copula关联度量的闭式表达式,包括Bernstein、shuffle-of-min、checkerboard和check-min copula。特别地,提供了最近流行的Chatterjee's ξ的闭式表达式,该度量量化了两个随机变量之间的依赖性。给定一个具有TP2密度的绝对连续二元copula C及其近似n×n棋盘copula Cn,我们证明ξ(Cn) ≤ ξ(C)且当n→∞时ξ(Cn)→ξ(C)。

英文摘要

This paper studies closed-form expressions for multiple association measures of copulas commonly used for approximation purposes, including Bernstein, shuffle-of-min, checkerboard and check-min copulas. In particular, closed-form expressions are provided for the recently popularized Chatterjee's $\xi$, which quantifies the dependence between two random variables. Given an absolutely continuous bivariate copula $C$ with TP$_2$ density and approximating $n\times n$-checkerboard copula $C_n$, we show that $\xi(C_n) \le \xi(C)$ with $\xi(C_n) \to \xi(C)$ as $n\to\infty$.
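To make the checkerboard case concrete, here is a small sketch that evaluates Chatterjee's $\xi$ for a checkerboard copula through the copula representation $\xi(C) = 6\int_0^1\int_0^1 (\partial_1 C(u,v))^2\,du\,dv - 2$. It is a direct piecewise integration written for illustration, not the closed form derived in the paper. The function name and the convention that `mass[i][j]` is the probability of the cell with $u$ in the $i$-th interval and $v$ in the $j$-th interval are assumptions of this sketch; rows and columns are each assumed to sum to $1/n$.

```python
def checkerboard_xi(mass):
    """Chatterjee's xi of the n x n checkerboard copula with cell masses mass[i][j].

    For u in the i-th interval, the conditional CDF v -> dC/du is piecewise
    linear: across column j it rises from a to b = a + n * mass[i][j].
    The integral of a squared linear segment over a width-1/n interval is
    (a^2 + a*b + b^2) / (3 * n).
    """
    n = len(mass)
    total = 0.0
    for row in mass:
        a = 0.0
        for p in row:
            b = a + n * p
            total += (a * a + a * b + b * b) / 3.0
            a = b
    # Two factors of 1/n: one from the u-interval width, one from the v-interval width.
    return 6.0 * total / (n * n) - 2.0
```

With uniform masses $1/n^2$ (independence) the function returns 0 up to rounding, and with mass $1/n$ on the diagonal it returns $1 - 1/n$, which stays below 1 and approaches 1 as $n$ grows, in line with the monotone approximation result stated in the abstract.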

2605.23664 2026-05-25 stat.ME

A note on closed-form solutions for estimating sample size when externally validating a binary prediction model based on $C$-statistic precision

基于C统计量精度的二分类预测模型外部验证中样本量估计的闭式解注记

Denis A. Shah, Erick D. De Wolf, Pierce A. Paul, Laurence V. Madden

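AI summary: For sample size estimation based on $C$-statistic precision when externally validating a binary prediction model, this paper presents seven new closed-form solutions, obtained by algebraically rearranging Newcombe's formula with different computer algebra systems and AI models. Compared with the existing iterative approach, these solutions are far faster, by a factor of 148,000 to 264,000 on average in benchmarks, while giving the same sample size estimates, which makes them an efficient and reliable tool for planning external validation studies.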

详情
Comments
8 pages, 2 figures
AI中文摘要

临床预测模型的外部验证对于评估其是否适合使用至关重要。C统计量是衡量此类预测二分类结果模型判别性能的广泛使用的指标。最近提出了一种基于Newcombe的C统计量标准误公式{SE($C$)}的重排来获得验证期间精确估计C统计量所需的最小样本量的方法,并通过迭代计算方法在R和Stata软件中实现。我们提出了七种新颖的闭式解,这些解是通过使用不同的计算机代数系统和人工智能模型对Newcombe公式进行代数重排得到的。我们展示了这些不同的形式,以说明不同的计算工具如何产生结构不同但数学上等价的解,并评估它们在计算性能上的实际差异。当应用于说明性示例时,我们的闭式解与迭代方法产生相同的样本量估计。在基准分析中,闭式解的中位执行时间平均比当前迭代实现快148,000到264,000倍,同时它们之间也存在微小的效率差异。这项工作为外部验证研究的样本量计算提供了一个经过验证的高效计算工具。提供了实现闭式解的R代码函数。

英文摘要

External validation of clinical prediction models is crucial for assessing whether they are fit for use. The $C$-statistic is a widely used measure of discriminative performance of such models predicting a binary outcome. A method for obtaining the minimum sample size required for the precise estimation of the $C$-statistic during validation, based on the rearrangement of Newcombe's formula for the standard error of the $C$-statistic, SE($C$), was recently proposed and implemented in R and Stata software via an iterative computational approach. We present seven novel closed-form solutions, derived using different computer algebra systems and artificial intelligence models, to the algebraic rearrangement of Newcombe's formula. We present these distinct forms to demonstrate how different computational tools yield structurally distinct but mathematically equivalent solutions, and to evaluate their practical differences in computational performance. Our closed-form solutions yield identical sample size estimates to the iterative method when applied to illustrative examples. In a benchmarking analysis, the closed-form solutions were on average 148,000 to 264,000 times faster in median execution time than the current iterative implementation, while also exhibiting minor efficiency differences among themselves. This work provides a validated, highly efficient computational tool applicable to sample size calculation for external validation studies. R code functions implementing the closed-form solutions are provided.
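The rearrangement can be sketched as follows, assuming the commonly quoted form of Newcombe's approximation, $\mathrm{SE}(C)^2 = C(1-C)\,[1 + (N/2-1)A] \,/\, [N^2\phi(1-\phi)]$ with $A = (1-C)/(2-C) + C/(1+C)$ and outcome prevalence $\phi$. That form should be checked against the paper before use. Under this assumption the equation is quadratic in $N$, so one closed form follows from the quadratic formula. The function names are hypothetical, the code is in Python rather than the R provided by the authors, and this is not claimed to be any of their seven forms.

```python
import math


def newcombe_se(n, c, phi):
    """Approximate SE of the C-statistic for sample size n (assumed formula, see lead-in)."""
    a = (1.0 - c) / (2.0 - c) + c / (1.0 + c)
    return math.sqrt(c * (1.0 - c) * (1.0 + (n / 2.0 - 1.0) * a)
                     / (n * n * phi * (1.0 - phi)))


def sample_size_closed_form(target_se, c, phi):
    """Positive root of k*N^2 - (a/2)*N - (1 - a) = 0, with k = SE^2 * phi*(1-phi) / (C*(1-C))."""
    a = (1.0 - c) / (2.0 - c) + c / (1.0 + c)
    k = target_se ** 2 * phi * (1.0 - phi) / (c * (1.0 - c))
    return (a / 2.0 + math.sqrt(a * a / 4.0 + 4.0 * k * (1.0 - a))) / (2.0 * k)
```

In practice the returned value would be rounded up to the next integer. Substituting it back into `newcombe_se` recovers the target standard error, which is a quick way to check the algebra.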

2605.23649 2026-05-25 eess.SP math.ST stat.TH

Diffusion Fluid Antenna Systems for Resilient ISAC

扩散流体天线系统用于弹性ISAC

Noor Waqar, Kai-Kit Wong, Chan-Byoung Chae, Ross Murch

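AI summary: This paper studies diffusion fluid antenna systems (diffusion FAS) for resilient integrated sensing and communication (ISAC), aiming to improve performance in complex electromagnetic environments from the object-side perspective. By introducing a reconfigurable spatial degree of freedom, the method uses a generative AI framework to reconstruct the spatial correlation structure from sparse observations and to shape sensing signatures dynamically. Two new ISAC modes are proposed: generative spatial stealth, which markedly suppresses a target's sensing visibility, and target isolation, which rejects interference from adjacent objects, offering a new route to more secure and reliable ISAC systems.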

详情
AI中文摘要

大多数现有的集成感知与通信(ISAC)研究侧重于通过先进的波形设计和功率分配,使基站(BS)能够在共享资源上支持感知和通信。相比之下,对象侧视角仍未得到充分探索。例如,一个对象可能希望保持难以检测以保障安全,而另一个邻近对象可能产生主导反射,混淆BS并损害预期目标的感知可靠性。这些挑战激发了流体天线系统(FAS)范式,该范式引入了一种可重构的空间自由度(DoF),通过端口选择重塑感知特征,其能力超越了波形和功率控制单独所能提供的。在本文中,我们设计了扩散FAS,这是一种生成式人工智能(AI)驱动的框架,利用空间敏捷性在电磁衰落流形上引导ISAC性能。扩散FAS不是仅在功率域优化ISAC,而是将ISAC视为一个动态空间选择问题,其中选择天线状态(即端口)以塑造感知特征,同时保持通信目标。为了在稀疏测量下工作,我们采用条件去噪扩散概率模型(DDPM),从少量观测端口重建潜在的空间相关结构,从而有效探索可重构孔径。我们展示了两种FAS启用的ISAC模式:(1)生成式空间隐身,它识别局部深衰落以将目标的感知可见性抑制多达两个数量级,以及(2)目标隔离,它合成空间零点以抑制来自邻近物体的干扰。

英文摘要

Most existing integrated sensing and communication (ISAC) studies focus on enabling a base station (BS) to support sensing and communication over shared resources through advanced waveform design and power allocation. In contrast, the object-side perspective remains underexplored. For example, an object may wish to remain difficult to detect for security reasons, while another object in close proximity may generate dominant reflections that confuse the BS and impair sensing reliability for the intended target. These challenges motivate the fluid antenna system (FAS) paradigm, which introduces a reconfigurable spatial degree of freedom (DoF) that can reshape sensing signatures via port selection, beyond what waveform and power control alone can provide. In this paper, we devise diffusion FAS, a generative artificial intelligence (AI)-driven framework that exploits spatial agility to steer ISAC performance over the electromagnetic fading manifold. Instead of optimizing ISAC solely in the power domain, diffusion FAS casts ISAC as a dynamic spatial selection problem in which antenna states (i.e., ports) are chosen to shape sensing signatures while maintaining communication objectives. To work under sparse measurements, we employ a conditional denoising diffusion probabilistic model (DDPM) to reconstruct the latent spatial correlation structure from a small set of observed ports, enabling efficient exploration of the reconfigurable aperture. We demonstrate two FAS-enabled ISAC modes: (1) generative spatial stealth, which identifies localized deep fades to suppress a target's sensing visibility by up to two orders of magnitude, and (2) target isolation, which synthesizes spatial nulls that reject interference from adjacent objects.

2605.23635 2026-05-25 stat.ML cs.LG

Dirichlet-Based Monte Carlo Dropout for Uncertainty Estimation in Neural Networks

基于狄利克雷的蒙特卡洛丢弃法用于神经网络不确定性估计

Rouaa Hoblos, Noura Dridi, Noureddine Zerhouni, Zeina Al Masry

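AI summary: Traditional neural networks provide no uncertainty estimates for their predictions, while Bayesian neural networks quantify uncertainty at a high computational cost. This paper proposes a Dirichlet-based Monte Carlo Dropout method that improves the quality of uncertainty estimates while keeping the computation efficient. By modeling class probabilities with a Dirichlet distribution, the method gives a more informative uncertainty representation, and experiments confirm its effectiveness for uncertainty calibration.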

详情
Journal ref
56es Journées de Statistique de la SFdS, Jun 2025, Marseille, France
AI中文摘要

传统神经网络提供确定性预测,缺乏固有的不确定性估计。虽然贝叶斯神经网络(BNN)为不确定性量化提供了原则性方法,但其计算复杂度限制了可扩展性。蒙特卡洛(MC)Dropout最初作为正则化技术引入,已被证明通过多次随机前向传播实现概率建模,从而近似贝叶斯推断。在这项工作中,我们通过在MC Dropout中集成基于狄利克雷的框架来增强深度学习中的不确定性估计。具体来说,我们利用Sensoy等人(2018)提出的公式,其中使用狄利克雷分布对类概率进行建模,从而允许更信息化的不确定性表示。所提出的方法保持了MC Dropout的计算效率,同时提高了不确定性估计的质量。我们讨论了所提出方法的理论基础,并将其与现有的不确定性量化技术进行了比较。结果突显了所提出方法在产生良好校准的不确定性估计方面的有效性,为不确定性感知的深度学习模型提供了实用解决方案。

英文摘要

Traditional neural networks provide deterministic predictions without inherent uncertainty estimates. While Bayesian Neural Networks (BNNs) offer a principled approach to uncertainty quantification, their computational complexity limits scalability. Monte Carlo (MC) Dropout, initially introduced as a regularization technique, has been shown to approximate Bayesian inference by enabling probabilistic modeling through multiple stochastic forward passes. In this work, we enhance uncertainty estimation in deep learning by integrating a Dirichlet-based framework within MC Dropout. Specifically, we leverage the formulation proposed by Sensoy et al. (2018), where class probabilities are modeled using a Dirichlet distribution, allowing for a more informative uncertainty representation. The proposed approach maintains the computational efficiency of MC Dropout while improving the quality of uncertainty estimates. We discuss the theoretical foundations of our method and compare it with existing uncertainty quantification techniques. The results highlight the effectiveness of the proposed method in producing well-calibrated uncertainty estimates, offering a practical solution for uncertainty-aware deep learning models.
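As a rough illustration of the ingredients named here, the sketch below computes the Dirichlet quantities used in the formulation of Sensoy et al. (2018): non-negative evidence $e_k$ gives $\alpha_k = e_k + 1$, strength $S = \sum_k \alpha_k$, expected class probabilities $\alpha_k / S$ and an uncertainty mass $K / S$. The second function, which simply averages these summaries over several stochastic forward passes, is a hypothetical aggregation chosen for this sketch; the abstract does not say how the authors combine the passes. The evidence vectors stand in for the outputs of a network evaluated with dropout active.

```python
def dirichlet_summary(evidence):
    """Expected class probabilities and uncertainty mass from non-negative evidence."""
    k = len(evidence)
    alpha = [e + 1.0 for e in evidence]   # Dirichlet parameters
    strength = sum(alpha)                 # Dirichlet strength S
    probs = [a / strength for a in alpha]
    uncertainty = k / strength            # equals 1 when there is no evidence at all
    return probs, uncertainty


def mc_dropout_dirichlet(evidence_passes):
    """Average the per-pass summaries over T stochastic passes (hypothetical aggregation)."""
    t = len(evidence_passes)
    k = len(evidence_passes[0])
    mean_probs = [0.0] * k
    mean_uncertainty = 0.0
    for evidence in evidence_passes:
        probs, uncertainty = dirichlet_summary(evidence)
        mean_probs = [m + p / t for m, p in zip(mean_probs, probs)]
        mean_uncertainty += uncertainty / t
    return mean_probs, mean_uncertainty
```

A pass with zero evidence for every class yields uniform probabilities and uncertainty 1, while strong evidence for a single class drives the uncertainty mass towards 0.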

2605.23631 2026-05-25 stat.CO

Directional subset simulation method for reliability analysis

方向性子集模拟方法用于可靠性分析

Oindrila Kanjilal, Julien Bect

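AI summary: Estimating the probability of rare failure events is a key challenge in reliability analysis. Subset simulation (SS) is a widely used adaptive Monte Carlo method, but for multimodal failure domains its samples can become trapped in a local region, giving inaccurate estimates. This paper proposes directional subset simulation (dSS), which uses ideas from directional sampling to guide samples towards the failure region, explores multiple directions of the parameter space more effectively, and improves the accuracy of the failure probability estimate.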

详情
AI中文摘要

估计罕见失效事件的概率是物理系统可靠性分析中的一个关键挑战。子集模拟(SS)是一种非常流行的自适应蒙特卡洛方法。在SS中,通过使用马尔可夫链蒙特卡洛方法迭代采样参数空间的一系列嵌套子域(涵盖目标失效域),将小的失效概率计算为一系列较大的条件概率的乘积。对于具有多模态的失效域,用于探索SS中间层的马尔可夫链样本可能被困在输入参数空间的受限区域,导致失效概率估计不准确。本文针对该问题提出了方向性子集模拟(dSS)方法,该方法利用方向采样的概念,有信息地将样本向失效方向传播。这是通过一种新颖的中间失效域选择来实现的,该选择在每个中间层中保留参数空间中多个方向的样本。通过一系列数值算例展示了dSS方法的优点。

英文摘要

Estimating the probabilities of rare failure events is a key challenge in the reliability analysis of physical systems. Subset simulation (SS) is a very popular adaptive Monte Carlo method for this problem. In SS, the small failure probability is evaluated as a product of larger conditional probabilities by iteratively sampling a sequence of nested sub-domains of the parameter space, encompassing the target failure domain of interest, using Markov chain Monte Carlo methods. For failure domains with multiple modes, the Markov chain samples used to explore the intermediate levels of SS can be trapped in a confined region of the input parameter space, leading to inaccurate failure probability estimates. In this contribution, we propose the directional subset simulation (dSS) method for this problem, which uses concepts from directional sampling to informedly propagate samples towards failure. This is accomplished through a novel selection of the intermediate failure domains, which preserves samples in several directions in the parameter space in each intermediate level. The merits of the dSS method are illustrated through a selection of numerical examples.
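For orientation, here is a minimal sketch of plain subset simulation, the baseline that dSS modifies; it does not implement the directional variant. It assumes standard normal inputs, defines failure as `g(x) >= threshold`, and uses a conditional-sampling proposal `rho * x + sqrt(1 - rho^2) * noise`, which leaves the standard normal invariant so that a candidate is accepted whenever it stays inside the current intermediate domain. All names and default values are choices made for this sketch.

```python
import math
import random


def subset_simulation(g, dim, threshold, n=2000, p0=0.1, rho=0.8, seed=0, max_levels=20):
    """Estimate P(g(X) >= threshold) for X ~ N(0, I_dim) by plain subset simulation."""
    rng = random.Random(seed)
    n_seeds = int(round(p0 * n))   # samples kept as chain seeds at each level
    steps = n // n_seeds           # chain length grown from each seed
    samples = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    values = [g(x) for x in samples]
    prob = 1.0
    for _ in range(max_levels):
        order = sorted(range(len(values)), key=values.__getitem__, reverse=True)
        level = values[order[n_seeds - 1]]   # intermediate threshold
        if level >= threshold:
            break                            # target domain reached
        prob *= n_seeds / len(values)        # conditional probability of this level
        new_samples, new_values = [], []
        for idx in order[:n_seeds]:
            x, v = samples[idx], values[idx]
            for _ in range(steps):
                cand = [rho * xi + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
                        for xi in x]
                gv = g(cand)
                if gv >= level:              # accept only inside the current domain
                    x, v = cand, gv
                new_samples.append(x)
                new_values.append(v)
        samples, values = new_samples, new_values
    return prob * sum(v >= threshold for v in values) / len(values)
```

For example, `subset_simulation(lambda x: x[0], 2, 3.0)` targets a tail probability of roughly 1.35e-3. Because every chain starts from a seed in the previous level, chains that begin near one failure mode tend to stay there, which is the trapping behaviour the abstract describes for multimodal failure domains.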

2605.23614 2026-05-25 stat.ME

The frame problem in quantitative practice: ontological uncertainty and epistemic humility in an age of automated inference

量化实践中的框架问题:自动推理时代的存在论不确定性与认知谦逊

William Fauriat

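AI summary: As the automation of inference spreads through statistics, engineering, and machine learning, quantitative practice faces the challenge of the "frame problem". The paper points out that every inference rests on a finite specification of conditions, and that factors outside the specification cannot enter the uncertainty analysis, which can lead to serious failures. It reviews three kinds of uncertainty in quantitative practice (aleatory, epistemic, and frame or ontological) and stresses that frame uncertainty, being structurally invisible, is the most easily overlooked yet the most critical. It further discusses the implications for different quantitative practitioners and for non-specialist recipients, and argues that epistemic humility deserves more weight in the age of automated inference.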

详情
AI中文摘要

统计、工程和机器学习领域的量化实践已被推理自动化所变革。预测以人类中介推理无法匹敌的规模和速度产生、验证和部署。这种转变与推理的结构性限制相交织,任何方法论改进都无法消除:每个推理都依赖于有限的条件规范,而规范之外的内容不会表现为更宽的不确定性带——它根本不会出现。规范的选择——即框架——位于推理的上游,无法从使用该系统的内部进行审计。本文提供了一份综合性的、面向应用的综述。我们认为,量化实践中存在三类不确定性——偶然性、认知性和框架(或存在论)——而第三类,即有限规范的残余,在所选框架内对形式分析而言是结构上不可见的,并且是大多数严重后果的根源。我们追溯了为什么该限制同样适用于演绎和归纳推理,为什么没有元层次程序能消除这种回归,以及为什么当前自动推理的条件使得认知谦逊——这一论点所支持的实际倾向——变得更加重要,而非更不重要。我们阐述了该论点对当代量化工作中五种典型角色——工程师、统计学家、数学家、机器学习从业者以及专家主张的非专业接收者——的具体共鸣,展示了结构性论点如何影响每种实践的自然防御。该论点并非反对严谨性或反对量化;而是为了区分在框架内获得的严谨性与相对于框架的严谨性。

英文摘要

Quantitative practice across statistics, engineering, and machine learning has been transformed by the automation of inference. Predictions are produced, validated, and deployed at scale and speed that human-mediated reasoning could not match. This shift intersects with a structural limit of reasoning that no methodological refinement dissolves: every inference rests on a finite specification of conditions, and what falls outside the specification does not appear as a widened uncertainty band; it does not appear at all. The choice of specification (the frame) is upstream of the inference and cannot be audited from inside the system that uses it. This paper offers a synthetic, application-oriented review. We argue that three categories of uncertainty operate in quantitative practice (aleatory, epistemic, and frame, or ontological) and that the third, the residue of finite specification, is structurally invisible to formal analysis within the chosen frame and is the locus of most consequential failures. We trace why the limit applies equally to deductive and inductive reasoning, why no meta-level procedure dissolves the regress, and why current conditions of automated inference make epistemic humility, the practical disposition this argument supports, more, not less, important. We articulate the argument's specific resonances for five typical figures of contemporary quantitative work (the engineer, the statistician, the mathematician, the machine-learning practitioner, and the non-specialist recipient of expert claims), showing how the structural argument bears on each practice's natural defenses. The argument is not against rigor or against quantification; it is for distinguishing rigor earned within a frame from rigor with respect to the frame.

2605.23591 2026-05-25 stat.ML cond-mat.dis-nn cs.LG math.ST stat.TH

Asymmetric Scaling Laws from Sparse Features

基于稀疏特征的非对称缩放定律

John Sous, Michael Winer

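AI summary: This paper studies neural scaling laws under sparse activations and proposes a new model in which test loss is dominated by rare coordinates never observed in the training inputs, creating a bottleneck absent from dense models. It derives the asymptotic loss in the underparameterized and overparameterized regimes and finds that the loss curve shows double descent near the interpolation threshold, with two distinct scaling exponents whose gap is determined by the degree of sparsity. It also analyzes gradient-descent dynamics, gives a scaling law for the probability that fixed-step gradient descent becomes unstable, and shows that the sparsity-induced effect persists under nonlinear activations.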

详情
AI中文摘要

我们引入了一个稀疏激活下的神经缩放定律模型。在该模型中,测试损失通常由训练输入中从未观察到的稀有坐标主导。这种机制引入了一个密集模型中不存在的新瓶颈。我们推导了欠参数化和过参数化区域的渐近总体损失,并表明损失在插值阈值附近出现双下降峰值——其中参数数量刚好足以拟合训练数据——导致损失曲线由两个不同的缩放指数控制:一个用于过参数化区域,一个用于欠参数化区域,其差距由稀疏程度决定。此外,我们推导了一个计算最优边界,在固定计算预算下倾向于增加数据集大小而非模型容量。我们还分析了梯度下降动力学,并确定了固定步长梯度下降变得不稳定的概率的缩放定律。我们进一步表明,稀疏诱导效应在非线性激活下仍然存在。

英文摘要

We introduce a model for neural scaling laws under sparse activations. In the model, test loss is often dominated by rare coordinates that are never observed in the training input. This mechanism induces a novel bottleneck absent from dense models. We derive the asymptotic population loss in both the underparameterized and overparameterized regimes, and show that the loss exhibits a double-descent peak near the interpolation threshold, where the number of parameters is just sufficient to fit the training data. The resulting loss curve is governed by two distinct scaling exponents, one for the overparameterized regime and one for the underparameterized regime, with a gap determined by the degree of sparsity. Additionally, we derive a compute-optimal frontier that favors increasing dataset size over model capacity under fixed compute budgets. We also analyze gradient-descent dynamics and identify a scaling law for the probability that fixed-step gradient descent becomes unstable. We further show that the sparsity-induced effect persists under nonlinear activations.

2605.23537 2026-05-25 stat.ML eess.SP

Concomitant DAG Learning: On the Roles of Noise Adaptivity, Sparsity, and Non-negativity

伴随DAG学习:噪声自适应性、稀疏性和非负性的作用

Gonzalo Mateos, Samuel Rey, Hamed Ajorlou, Mariano Tepper

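AI summary: This paper discusses how to learn directed acyclic graphs (DAGs) from observational data in order to reveal causal relationships in complex systems. It presents a continuous, score-based estimation framework that models DAG structure through adjacency matrices, addressing the scalability and identifiability challenges of classical methods. It also describes concomitant DAG estimation methods, which jointly infer sparse causal structure and exogenous noise levels and improve robustness under heteroscedasticity and distribution shifts, and it points to research directions at the intersection of causal inference and large-scale graph learning.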

详情
Comments
Submitted to the IEEE Signal Processing Magazine Special Issue: From Signals to Causes: Methodological Advances in Causal Inference. arXiv admin note: text overlap with arXiv:2310.02895
AI中文摘要

有向无环图(DAG)构成了一种核心建模工具,能够对复杂系统中的因果相互作用进行原则性推理。然而,由于一组变量背后的因果结构通常是未知的,并且干预可能不可行或实施起来有伦理挑战,因此需要解决从观测数据推断DAG的任务。然而,大多数经典的结构识别方法面临两个关键障碍:强制执行无环性的组合挑战(严重限制了可扩展性),以及由潜在混杂或异质噪声引起的可识别性挑战。本教程概述了最近信号处理和优化方面的进展,这些进展通过将DAG结构学习重新表述为关于邻接矩阵的连续、基于得分的估计问题来解决这些问题。我们首先对结构方程模型和因果图恢复的公式进行教学性介绍,然后对基于得分的方法进行历史综述,从早期的组合搜索方案和贪婪启发式方法到利用无环性平滑表征的现代连续框架。在此基础上,我们描述了伴随DAG估计方法,该方法联合推断稀疏因果结构和外生噪声水平,通过使估计器具有噪声自适应性,提高了在异方差性和分布偏移下的鲁棒性。总而言之,本教程向读者介绍了因果推断、高维统计和可扩展图学习交叉领域信号处理研究的挑战和机遇,同时概述了新兴方向,包括在线、非线性和神经因果发现。

英文摘要

Directed acyclic graphs (DAGs) constitute a central modeling tool to enable principled reasoning about cause-effect interactions in complex systems. However, since the causal structure underlying a group of variables is often unknown and interventions may be infeasible or ethically challenging to implement, there is a need to address the task of inferring DAGs from observational data. Yet most classical structure identification approaches face two key obstacles: the combinatorial challenge of enforcing acyclicity, which severely limits scalability, and identifiability challenges arising from latent confounding or heterogeneous noise. This tutorial offers an overview of recent signal processing and optimization advances that address these issues by recasting DAG structure learning as a continuous, score-based estimation problem over adjacency matrices. We begin with a didactic introduction to structural equation models and the formulation of causal graph recovery, followed by a historical survey of score-based methods ranging from early combinatorial search schemes and greedy heuristics to modern continuous frameworks that leverage smooth characterizations of acyclicity. Building on this foundation, we describe concomitant DAG estimation methods that jointly infer sparse causal structure and exogenous noise levels, improving robustness under heteroscedasticity and distribution shifts by rendering the estimator noise adaptive. All in all, the tutorial introduces readers to challenges and opportunities for signal processing research at the crossroads of causal inference, high-dimensional statistics, and scalable graph learning, while outlining emerging directions including online, nonlinear, and neural causal discovery.

2605.23362 2026-05-25 cs.LG cs.IT math.IT math.ST stat.ML stat.TH

Instance-Optimal Estimation with Multiple LLM Judges on a Budget

预算限制下多LLM裁判的实例最优估计

Junghyun Lee, Sanghwa Kim, Yassir Jedra, Alexandre Proutière, Se-Young Yun

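AI summary: This paper studies how to allocate, under a limited budget, evaluation queries across multiple LLM judges with different costs and reliabilities so as to obtain the most accurate score estimates. The authors formulate the budgeted heteroskedastic multi-judge estimation problem and design an adaptive algorithm, EST-IVWE, which stabilizes the allocation with optimistically biased variance estimates and provably performs close to the oracle allocation. They also establish a matching local minimax lower bound, proving instance-optimality, and show experimentally that the method outperforms uniform allocation.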

详情
Comments
53 pages, 4 figures; the first two authors contributed equally
AI中文摘要

评估大型语言模型越来越依赖于LLM作为裁判的协议,但此类评估仍然成本高昂:不同的裁判有不同的价格和可靠性,且每个提示-响应对的难度可能差异很大。这引发了一个基本的分配问题:在固定预算下,应如何在异构裁判和实例之间分配评估查询,以获得最准确的分数估计?我们将此问题形式化为*预算限制下的异方差多裁判估计*。给定$K$个提示-响应对、$J$个已知成本的裁判以及未知的查询-裁判方差,目标是估计一个有界分数向量,同时最小化$\ell_p$误差。我们的第一个贡献是分析逆方差加权估计量(IVWE)并推导出最小化其误差率的最优分配。由于该分配依赖于未知方差,我们随后通过提出EST-IVWE来解决实际中的未知方差设置,这是一种自适应算法,它构建并利用*乐观偏差*方差估计来稳定经验分配。我们证明EST-IVWE在预算内匹配了IVWE的速率,直至低阶项。我们的第二个且核心的理论贡献是一个匹配的*局部*极小极大下界,这确立了所提出算法的实例最优性。一个关键的技术见解是,Fano型高概率论证对于这个问题过于粗糙:它们的填充构造失去了控制最优分配的局部方差结构。我们转而使用基于局部扰动的Assouad型期望论证,该论证保留了这一结构并产生了尖锐的分配相关下界。最后,我们在合成数据集和HelpSteer2数据集上数值验证了我们的方法优于朴素的均匀分配。

英文摘要

Evaluating large language models increasingly relies on LLM-as-a-judge protocols, but such evaluations remain costly: different judges have different prices and reliabilities, and the difficulty of each prompt-response pair can vary substantially. This raises a basic allocation question: under a fixed budget, how should one distribute evaluation queries across heterogeneous judges and instances to obtain the most accurate score estimates? We formalize this question as *budgeted heteroskedastic multi-judge estimation*. Given $K$ prompt-response pairs, $J$ judges with known costs, and unknown query-judge variances, the goal is to estimate a bounded score vector while minimizing an $\ell_p$-error. Our first contribution is to analyze the inverse-variance weighted estimator (IVWE) and to derive the oracle allocation that minimizes its error rate. Since this allocation depends on the unknown variances, we then address the practical unknown-variance setting by proposing EST-IVWE, an adaptive algorithm that constructs and leverages *optimistically biased* variance estimates to stabilize the empirical allocation. We prove that EST-IVWE matches the oracle IVWE rate up to lower-order terms in the budget. Our second and central theoretical contribution is a matching *local* minimax lower bound, which establishes the instance-optimality of the proposed algorithms. A key technical insight is that Fano-type high-probability arguments are too coarse for this problem: their packing construction loses the local variance structure that governs the optimal allocation. We instead use an Assouad-type in-expectation argument, based on local perturbations, which preserves this structure and yields the sharp allocation-dependent lower bound. Finally, we numerically validate the superiority of our approach over naïve uniform allocation on synthetic and HelpSteer2 datasets.
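The inverse-variance weighted estimator at the centre of this abstract is a standard construction, sketched below for a single prompt-response pair. The second function illustrates, under a strong simplification (one item, known variances, fractional query counts), why cost matters: each unit of budget spent on judge $j$ buys $1/(c_j\sigma_j^2)$ units of precision, so a single-item budget is best spent on the judge minimizing $c_j\sigma_j^2$. This is not the paper's oracle allocation across $K$ items under an $\ell_p$ error, and the function names are hypothetical.

```python
def ivwe(means, variances, counts):
    """Inverse-variance weighted estimate from per-judge sample means.

    means[j]     : average score from judge j
    variances[j] : variance of a single query to judge j (assumed known here)
    counts[j]    : number of queries sent to judge j
    Returns the combined estimate and its variance.
    """
    weights = [n / v for n, v in zip(counts, variances)]  # precision of each judge's mean
    total = sum(weights)
    estimate = sum(w * m for w, m in zip(weights, means)) / total
    return estimate, 1.0 / total


def cheapest_information(costs, variances):
    """Index of the judge minimizing cost * variance (single-item simplification)."""
    scores = [c * v for c, v in zip(costs, variances)]
    return scores.index(min(scores))
```

For instance, `ivwe([0.0, 10.0], [1.0, 4.0], [4, 4])` weights the first judge four times as heavily as the second, because its per-query variance is four times smaller.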

2605.23318 2026-05-25 stat.ME

Generalized Rank Regression

广义秩回归

Jiyuan Tu, Suqi Wu, Yichen Zhang, Wen-Xin Zhou

AI总结 本文提出了一种广义秩回归(GRR)方法,旨在提升经典秩回归在非单调得分函数下的统计效率。该方法引入了非凸且非光滑的目标函数,为此,作者推导了估计量的非渐近Bahadur表示并建立了其渐近正态性,并设计了一种两阶段次梯度下降算法以实现高效计算。此外,还提出了乘数自助法用于统计推断,并揭示了GRR与复合分位数回归在渐近方差上的等价性,通过仿真实验和实际数据分析验证了其优势。

详情
Comments
29 pages, 10 figures
AI中文摘要

秩回归对异常值和重尾响应分布具有鲁棒性,对单调变换不变,并在非高斯误差下提高效率,使其成为分析复杂数据的通用工具。本文介绍了广义秩回归(GRR),这是对经典秩方法的扩展,允许非单调得分函数。虽然旨在提高鲁棒估计量的统计效率,但这种推广导致目标函数可能非凸且非光滑,给理论分析和算法实现带来挑战。我们推导了所提估计量的非渐近Bahadur表示,并在温和条件下建立了其渐近正态性。为了解决优化挑战,我们提出了一种新的两阶段次梯度下降算法,能够高效计算具有良好统计性质的GRR估计量。此外,我们开发了一种乘子自助法进行统计推断。揭示了GRR与分位数回归变体之间的密切联系,表明GRR和复合分位数回归具有渐近等价的方差。通过广泛的模拟研究和实际数据应用说明了GRR的优势。

英文摘要

Rank regression offers robustness to outliers and heavy-tailed response distributions, invariance to monotonic transformations, and improved efficiency under non-Gaussian errors, making it a versatile tool for analyzing complex data. This paper introduces Generalized Rank Regression (GRR), an extension of classical rank-based methods that accommodates non-monotonic score functions. While aimed at enhancing the statistical efficiency of robust estimators, this generalization results in a potentially non-convex and non-smooth objective function, presenting challenges for both theoretical analysis and algorithmic implementation. We derive a non-asymptotic Bahadur representation of the proposed estimator and establish its asymptotic normality under mild conditions. To address the optimization challenges, we propose a new two-stage sub-gradient descent algorithm that enables efficient computation of GRR estimators with desirable statistical properties. Furthermore, we develop a multiplier bootstrap procedure for conducting statistical inference. A close connection between GRR and variants of quantile regression is uncovered, which demonstrates that GRR and composite quantile regression share asymptotically equivalent variances. The advantages of GRR are illustrated through extensive simulation studies and a real data application.

2605.23268 2026-05-25 stat.ML cs.LG

Coupled Training with Privileged Information and Unlabeled Data

基于特权信息与未标记数据的联合训练

Jiahao Shi, Omar Hagrass, Jason M. Klusowski

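AI summary: In many prediction tasks, extra information (such as measurements that are expensive or hard to collect) is available during training but not when the model is deployed. This paper proposes a joint training method that trains the model using the extra information together with the deployment model that uses only test-time inputs, so that the deployment model exploits the extra information only when it truly helps prediction and does not inherit its mistakes. The method comes with theoretical guarantees on improved prediction accuracy, and experiments on synthetic data and real-world tasks confirm its advantage.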

详情
Comments
37 pages, 6 figures. Accepted to ICML 2026
AI中文摘要

在许多预测问题中,我们在训练期间拥有额外信息(例如,昂贵或收集缓慢的测量值),但在模型部署时这些信息将不可用。一种常见策略是首先训练一个使用所有训练信息的模型,然后利用其对未标记样本的预测来训练第二个模型,该模型仅使用测试时可用的输入。然而,当额外的训练专用信息较弱或存在噪声时,这种两阶段方法可能会误导部署模型,甚至降低准确性。我们提出一种联合训练方法,同时学习两个模型,使得部署模型仅在额外信息真正有帮助时从中受益,而不是继承其错误。我们提供了描述联合训练何时提高预测准确性的保证,并分析了一种适用于大规模高维模型的简单交替训练算法。在合成数据和真实世界预测任务上的实验表明,我们的方法避免了这些失败,并稳健地优于标准两阶段基线。

英文摘要

In many prediction problems, we have extra information during training (for example, measurements that are expensive or slow to collect) that will not be available when the model is deployed. A common strategy is to first train a model that uses all training information, then use its predictions on unlabeled examples to train a second model that only uses the inputs available at test time. However, when the extra training-only information is weak or noisy, this Two-Stage approach can mislead the deployment model and even hurt accuracy. We propose a joint training method that learns the two models together, so the deployment model can benefit from the extra information only when it actually helps, instead of inheriting its mistakes. We provide guarantees that describe when joint training improves prediction accuracy and analyze a simple alternating training algorithm for large, high-dimensional models. Experiments on synthetic data and real-world prediction tasks show that our approach avoids these failures and robustly outperforms standard Two-Stage baselines.

2605.23246 2026-05-25 stat.AP

Regulatory Considerations for Using Artificial Intelligence Models to Reduce Sample Sizes in Registrational Studies

在注册研究中利用人工智能模型减少样本量的监管考量

Aaron M. Smith, Tala Fakhouri, Run Zhuang, Jonathan R. Walsh

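AI summary: This paper discusses the regulatory considerations involved in using AI models to reduce sample sizes in registrational clinical trials. It proposes using model-derived prognostic covariates to prospectively reduce the planned sample size of a randomized controlled trial, which shortens timelines, speeds up decisions, and lowers costs. Following recent FDA draft guidance, it walks through general recommendations for model development, evaluation, and sample size determination, aiming to give clear guidance for the responsible use of AI in drug development, and demonstrates the approach with an Alzheimer's disease example.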

详情
Comments
22 pages, 3 figures
AI中文摘要

人工智能(AI)在药物开发中的应用持续快速增长。监管机构对AI在受监管应用中的使用提供了越来越清晰的视角,包括FDA最近的指南草案,该指南提供了一个7步风险框架来评估AI模型在这些情况下的可信度。我们提出了一种应用AI模型的方法,通过使用模型衍生的预后协变量,前瞻性地减少随机对照试验的计划样本量。这可以缩短试验时间线,加快决策速度,并降低成本。当治疗有效且可耐受时,患者可以更快地获得治疗,这是FDA指南的一个引人注目的用例。我们逐步解析指南中的每个步骤,提供模型开发、评估和样本量确定方法的通用建议,旨在提供一套清晰的指南,说明如何遵循FDA指南并推动AI在药物开发中的负责任使用。我们以阿尔茨海默病为例演示了该应用。

英文摘要

Applications of artificial intelligence (AI) in drug development continue to increase at a rapid pace. Regulatory authorities have provided increasingly clear perspectives on the use of AI in regulated applications, including recent draft guidance from FDA that provides a 7-step risk-based framework to assess AI model credibility for these cases. We present an application of AI models to prospectively reduce the planned sample size in a randomized controlled trial, using model-derived prognostic covariates. This can shorten trial timelines, enable faster decision making, and lower costs. When treatments are effective and tolerable, they can be accessible to patients sooner, which is a compelling use case for the FDA guidance. We walk through each of the steps in the guidance, providing general recommendations for model development, evaluation, and approaches for sample size determination, with the intent of providing a clear set of guidelines on how to engage with the FDA guidance and advance responsible use of AI in drug development. We demonstrate the application with an example in Alzheimer's disease.

2605.23225 2026-05-25 cs.DS cs.DM cs.IT cs.LG math.IT math.ST stat.TH

Entropy Equivalence Testing

熵等价性检验

Clément L. Canonne, Yash Pote, Jonathan Scarlett, Joy Qiping Yang

AI总结 本文提出了一个名为“熵等价性检验”的新问题,旨在判断两个未知分布的熵是否相差超过给定阈值,相较于传统的分布接近性检验更为宽松。研究设计了一种时间与样本效率较高的算法,证明其样本复杂度可显著低于传统接近性检验。该成果进一步应用于低阶贝叶斯网络的接近性检验,显著提升了现有基于完整学习方法的样本或时间效率。

详情
AI中文摘要

我们引入了概率分布的熵等价性检验问题,这是经典接近性检验问题的松弛版本。在该问题中,给定来自两个未知分布$p,q$的样本和一个参数$\varepsilon \in(0,1/2]$,分布检验算法只需区分$p=q$和$|H(p)-H(q)| \geq \varepsilon$(其中$H$表示香农熵)。我们为此任务提供了一个时间和样本高效的算法,表明该问题的最优样本复杂度可以显著低于接近性检验。作为应用,我们利用这一结果首次为低度贝叶斯网络的(标准)接近性提供了非平凡的检验算法,显著改进了基于完全学习的基线方法在样本或时间复杂度上的表现。

英文摘要

We introduce the problem of \emph{entropy equivalence testing} for probability distributions, a relaxation of the well-studied closeness testing problem, where the distribution testing algorithm is now only required to distinguish, given samples from two unknown distributions $p,q$ and a parameter $\varepsilon \in(0,1/2]$, between $p=q$ and $|H(p)-H(q)| \geq \varepsilon$ (where $H$ denotes the Shannon entropy). We provide a time- and sample-efficient algorithm for this task, showing that the optimal sample complexity for this task can be significantly lower than that of closeness testing. As an application, we leverage this result to provide the first non-trivial testing algorithm for (standard) closeness of low-degree \emph{Bayesian networks}, which significantly improves on either the sample or time complexity of a baseline based on full learning.
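
To make the problem statement concrete, the sketch below implements a naive baseline: estimate each entropy with the plug-in (empirical) estimator and compare the gap to a threshold. This is explicitly not the algorithm from the paper, whose sample complexity is claimed to be much better; the function names and the choice of threshold $\varepsilon/2$ are assumptions made for illustration, and entropy is measured in nats.

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Empirical (plug-in) Shannon entropy in nats."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def naive_entropy_test(samples_p, samples_q, eps):
    """Naive tester: report 'far' when the plug-in entropies differ by >= eps/2.

    A baseline sketch only; it needs enough samples for the plug-in
    estimates to be accurate, which the paper's tester is designed to avoid.
    """
    gap = abs(plugin_entropy(samples_p) - plugin_entropy(samples_q))
    return "far" if gap >= eps / 2 else "equal"
```

Note that the two hypotheses are asymmetric: $p=q$ implies equal entropies, but equal entropies do not imply $p=q$, which is why this is a relaxation of closeness testing.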

2605.23210 2026-05-25 eess.SP math.ST stat.ME stat.TH

Fundamental Bounds and Efficient Estimation for Dead-Time-Constrained Event Detection, with Application to Single-Photon Lidar

死时间约束事件检测的基本界限与高效估计,及其在单光子激光雷达中的应用

Frederic J. N. Jorgensen, Steven G. Johnson

AI总结 本文研究了在存在死时间约束和门控机制的二值事件检测过程中参数估计的统计理论,该问题在单光子激光雷达、荧光寿命成像等领域有重要应用。作者提出了一个充分统计量,揭示了传统硬件方法中被忽略的统计信息,并证明了最大似然估计器能够达到理论上的估计下界。为避免复杂的非凸优化,还提出了仅需单步修正的一阶估计方法,理论分析和实验验证表明其在单光子激光雷达中具有良好的性能。

详情
Comments
24 pages, 5 figures
AI中文摘要

我们为从一类非独立同分布的周期二元事件检测过程中进行参数估计发展了一种渐近统计理论,该过程受非麻痹性死时间和门控约束,我们称之为“死时间事件检测”(DED)过程。这类过程出现在单光子激光雷达、荧光寿命成像、X射线天文学以及核物理中的粒子或辐射通量测量中,其中每次检测会使辐射/粒子探测器在恢复间隔内失效。我们的理论量化了死时间和门控如何影响估计的基本下界,并识别出达到这些下界的实用估计量。首先,我们识别了一个充分统计量,特别表明激活计数可以携带传统直方图硬件丢弃的统计有用信息。然后我们证明了局部渐近正态性并推导了相应的Fisher信息率,从而获得了DED过程估计的基本下界。我们证明了在DED应用中广泛使用的最大似然估计(MLE)达到了这些下界。由于计算MLE通常需要求解非凸优化问题,我们还提出了Le Cam一步估计量,它仅需一次局部校正而非迭代优化即可达到相同的渐近界。我们通过单光子激光雷达的仿真和真实数据实验说明了渐近理论的有效性以及一步估计量的实际用途。

英文摘要

We develop an asymptotic statistical theory for parameter estimation from a class of non-i.i.d. periodic binary event-detection processes subject to nonparalyzable dead time and gating, which we call "dead-time event detection" (DED) processes. Such processes arise in single-photon lidar, fluorescence lifetime imaging, X-ray astronomy, and particle or radiation flux measurements in nuclear physics, where each detection renders the radiation/particle detector inactive for a recovery interval. Our theory quantifies how dead time and gating affect the fundamental lower bounds of estimation and identifies practical estimators that attain these bounds. First, we identify a sufficient statistic, showing in particular that activation counts can carry statistically useful information discarded by conventional histogramming hardware. We then prove local asymptotic normality and derive the corresponding Fisher-information rate, thereby obtaining fundamental lower bounds for estimation from DED processes. We prove that the maximum likelihood estimator (MLE), widely used in DED applications, attains these lower bounds. Since computing the MLE typically requires solving a nonconvex optimization problem, we also propose Le Cam one-step estimators, which attain the same asymptotic bounds with only a single local correction rather than iterative optimization. We illustrate the validity of our asymptotic theory and the practical usefulness of one-step estimators through the example of single-photon lidar in both simulations and real-data experiments.
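
The idea of a Le Cam one-step estimator can be illustrated on a much simpler model than the DED process. The sketch below uses a Cauchy location model (an assumption chosen only because its likelihood is also nonconvex): start from a root-n consistent pilot estimate, here the sample median, and apply a single Newton-type correction using the score and the Fisher information (1/2 per observation for the unit-scale Cauchy). All names are hypothetical.

```python
import math
import random
import statistics

FISHER_INFO = 0.5  # per-observation Fisher information, unit-scale Cauchy location

def cauchy_score(x, theta):
    """Derivative of the Cauchy log-density with respect to the location theta."""
    d = x - theta
    return 2.0 * d / (1.0 + d * d)

def one_step_estimate(xs):
    """Pilot estimate (median) plus one score-based correction."""
    theta0 = statistics.median(xs)
    mean_score = sum(cauchy_score(x, theta0) for x in xs) / len(xs)
    return theta0 + mean_score / FISHER_INFO

def sample_cauchy(theta, n, seed=0):
    """Draw n unit-scale Cauchy samples centered at theta (inverse-CDF method)."""
    rng = random.Random(seed)
    return [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
```

The design point is that no iterative optimization is run: the single correction is enough, asymptotically, to match the efficiency of the maximum likelihood estimator when the pilot estimate is already root-n consistent.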

2605.23208 2026-05-25 stat.ME stat.AP

A Direct Variance Estimation (DiVE) for Meta-Analysis of Median Differences

中位数差异元分析的直接方差估计(DiVE)

Tadahisa Okuda, Masataka Taguri, Kenichi Hayashi

AI总结 该论文提出了一种直接方差估计方法(DiVE),用于对两组研究中中位数差异进行元分析。传统方法需要每项研究的中位数差异、样本量以及分散度指标(如四分位距或范围),而DiVE仅需中位数差异和样本量即可直接估计合并差异的方差,从而避免了对分散度统计量的依赖。模拟研究表明,DiVE在各种分布场景下表现良好,尤其在研究数量较少时具有优势,并能纳入原本因缺乏分散度信息而被排除的研究,提升元分析的全面性和可靠性。

详情
AI中文摘要

报告两组研究中位数差异的元分析通常依赖于除中位数差异和样本量外还需要离散度汇总度量(如四分位数或范围)的方法。未报告此类统计量的研究常被排除在元分析之外。现有的两阶段方法首先在参数假设下估计每个研究内中位数差异的渐近方差,然后合并这些研究特异性估计以获得合并中位数差异及其方差。我们提出直接方差估计(DiVE),一种仅使用研究水平中位数差异及其样本量直接估计合并差异方差的方法。跨广泛分布情景的综合模拟研究表明,DiVE 的表现与传统两阶段方法相当或更优,在研究数量较少时具有明显优势。对已发表元分析的重新分析表明,DiVE 能够纳入缺乏离散度统计量的研究,从而获得更全面且可能偏差更小的证据综合。

英文摘要

Meta-analyses of two-group studies that report median differences typically rely on methods that require, in addition to the median difference and sample size, summary measures of dispersion such as quartiles or ranges. Studies that do not report such statistics are often excluded from the meta-analysis. Existing two-stage approaches first estimate the asymptotic variance of the median difference within each study under parametric assumptions, and then combine these study-specific estimates to obtain the pooled median difference and its variance. We propose Direct Variance Estimation (DiVE), a method that directly estimates the variance of the pooled difference using only study-level median differences and their sample sizes. A comprehensive simulation study across a wide range of distributional scenarios shows that DiVE performs comparably to or better than conventional two-stage methods, with clear advantages when the number of studies is small. A re-analysis of published meta-analyses demonstrates that DiVE enables the inclusion of studies lacking dispersion statistics, leading to a more comprehensive and potentially less biased synthesis of evidence.

2605.23207 2026-05-25 stat.ME

Mixture-of-Finite-Mixtures Wishart Model for Clustering Covariance Matrices with an Application to Brain Functional Connectivity

有限混合威沙特模型用于协方差矩阵聚类及其在脑功能连接中的应用

Zongyu Li, Stefano Castruccio, Zhiyong Zhang

AI总结 该研究提出了一种基于有限混合先验的Wishart混合模型(MFM-Wishart),用于对协方差矩阵进行聚类分析,特别适用于脑功能连接数据。该方法结合了Wishart分布与混合有限混合(MFM)先验,能够同时推断聚类数量和聚类分配,具有理论保证和计算效率。实验表明,该模型在模拟数据和实际脑成像数据中均表现出良好的聚类性能和鲁棒性,能够揭示婴儿睡眠期间功能连接的可解释异质性。

详情
AI中文摘要

协方差型矩阵数据出现在许多领域,包括脑功能连接和扩散张量成像。我们开发了MFM-Wishart,一种基于贝叶斯模型的聚类方法,将威沙特混合分量与有限混合先验相结合,允许对簇数和聚类分配进行联合后验推断。理论上,我们研究了威沙特核在混合模型中的性质,然后在标准正则性条件下建立了簇数的后验一致性和混合测度的后验收缩结果。计算上,我们开发了一种高效的马尔可夫链蒙特卡洛算法用于后验推断。模拟研究显示,即使在模型误设下,聚类性能也具有竞争力,并能准确恢复簇数。我们将MFM-Wishart应用于基于功能近红外光谱数据估计的婴儿睡眠期间功能连接聚类,展示了模型的实际效用并揭示了可解释的异质性。

英文摘要

Data represented as covariance-type matrices arise in many fields, including brain functional connectivity and diffusion tensor imaging. We develop the MFM-Wishart, a Bayesian model-based clustering approach for such data that combines Wishart mixture components with a mixture-of-finite-mixtures (MFM) prior, allowing joint posterior inference on both the number of clusters and clustering assignments. Theoretically, we study the properties of Wishart kernels in the context of mixture models and then establish results for posterior consistency for the number of clusters and posterior contraction of the mixing measure under standard regularity conditions. Computationally, we develop an efficient Markov chain Monte Carlo (MCMC) algorithm for posterior inference. Simulation studies show competitive clustering performance and accurate recovery of the number of clusters, even under model misspecification. We apply MFM-Wishart to cluster infants based on functional connectivity during sleep, estimated from functional near-infrared spectroscopy (fNIRS) data, illustrating the practical utility of the model and revealing interpretable heterogeneity.

2605.23171 2026-05-25 cs.LG cs.AI stat.ML

Understanding and Improving Noisy Embedding Techniques in Instruction Finetuning

理解与改进指令微调中的噪声嵌入技术

Abhay Yadav

AI总结 该研究探讨了指令微调中嵌入层添加噪声的技术,分析了均匀噪声与高斯噪声的效果差异,并提出了一种新的对称噪声嵌入方法SymNoise。通过理论与实验分析,研究发现不同噪声类型性能相近,而SymNoise通过更严格地调控模型局部曲率,显著提升了微调效果。在多个基准测试中,SymNoise相比当前最优方法NEFTune取得了约6.7%的性能提升,展示了其在语言模型微调中的优越性。

详情
Journal ref
IEEE International Conference on Language Modeling (COLM), 2025
Comments
arXiv admin note: substantial text overlap with arXiv:2312.01523
AI中文摘要

最近指令微调的进展在嵌入中注入噪声,其中NEFTune(Jain等人,2024)使用均匀噪声设立了基准。尽管NEFTune的实验发现均匀噪声优于高斯噪声,其原因仍不清楚。本文旨在通过提供彻底的理论和实证分析来澄清这一点,表明这些噪声类型之间的性能相当。此外,我们引入了一种新的语言模型微调方法,在嵌入中使用对称噪声。该方法旨在通过更严格地调节模型的局部曲率来增强模型功能,表现出优于当前方法NEFTune的性能。当使用Alpaca微调LLaMA-2-7B模型时,标准技术在AlpacaEval上获得29.79%的分数。然而,我们的方法SymNoise使用对称噪声嵌入将这一分数显著提高到69.04%,比最先进方法NEFTune(64.69%)提高了6.7%。此外,当在各种模型和更强的基线指令数据集(如Evol-Instruct、ShareGPT、OpenPlatypus)上测试时,SymNoise始终优于NEFTune。当前文献,包括NEFTune,强调了在语言模型微调中应用基于噪声的策略需要更深入的研究。我们的方法SymNoise是朝着这一方向迈出的又一重要步骤,显示出对现有最先进方法的显著改进。

英文摘要

Recent advancements in instruction fine-tuning have injected noise into embeddings, with NEFTune (Jain et al., 2024) setting benchmarks using uniform noise. Despite NEFTune's empirical findings that uniform noise outperforms Gaussian noise, the reasons for this remain unclear. This paper aims to clarify this by offering a thorough analysis, both theoretical and empirical, indicating comparable performance among these noise types. Additionally, we introduce a new fine-tuning method for language models, utilizing symmetric noise in embeddings. This method aims to enhance the model's function by more stringently regulating its local curvature, demonstrating superior performance over the current method, NEFTune. When fine-tuning the LLaMA-2-7B model using Alpaca, standard techniques yield a 29.79% score on AlpacaEval. However, our approach, SymNoise, increases this score significantly to 69.04%, using symmetric noisy embeddings. This is a 6.7% relative improvement (4.35 percentage points) over the state-of-the-art method, NEFTune (64.69%). Furthermore, when tested on various models and stronger baseline instruction datasets, such as Evol-Instruct, ShareGPT, and OpenPlatypus, SymNoise consistently outperforms NEFTune. The current literature, including NEFTune, has underscored the importance of more in-depth research into the application of noise-based strategies in the fine-tuning of language models. Our approach, SymNoise, is another significant step in this direction, showing notable improvement over the existing state-of-the-art method.
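
As a rough illustration of the noise types being compared, the sketch below perturbs a sequence of embedding vectors with either uniform or Gaussian noise, using the NEFTune-style scale $\alpha/\sqrt{Ld}$ for sequence length $L$ and embedding dimension $d$. The function name and the plain-list representation are assumptions for illustration; the symmetric noise used by SymNoise is not specified in the abstract and is deliberately not implemented here.

```python
import math
import random

def noisy_embeddings(emb, alpha=5.0, kind="uniform", rng=None):
    """Add scaled noise to a (seq_len x dim) list of embedding rows.

    Scale alpha / sqrt(seq_len * dim) follows the NEFTune convention;
    'gaussian' uses the same scale factor (not variance-matched).
    """
    rng = rng or random.Random(0)
    seq_len, dim = len(emb), len(emb[0])
    scale = alpha / math.sqrt(seq_len * dim)

    def draw():
        if kind == "uniform":
            return rng.uniform(-1.0, 1.0)
        if kind == "gaussian":
            return rng.gauss(0.0, 1.0)
        raise ValueError(f"unknown noise kind: {kind}")

    return [[v + scale * draw() for v in row] for row in emb]
```

In a training loop such noise would be applied to the embedding output only during fine-tuning and disabled at inference time.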

2605.23156 2026-05-25 cs.LG math.FA math.RT stat.ML

Any-Dimensional Invariant Universality

任意维不变泛化性

Shengtai Yao, Eitan Levin, Mateo Díaz

AI总结 本文研究了适用于任意尺寸输入的机器学习模型的泛化能力问题,这类模型如处理不同节点数的图或点云的数据。传统泛化性分析通常针对固定尺寸的输入,而本文提出了一种系统的方法,通过将任意维函数映射到一个合适的无限维极限空间,从而建立任意维模型的泛化性理论。该方法利用输入的对称性及不同尺寸输入之间的关系,定义了该空间上的自然拓扑结构,并展示了如何在该空间上建立任意维泛化性。研究还指出了一些现有模型的泛化性缺陷,并提出了简单的改进方案以恢复其泛化能力。

详情
AI中文摘要

一些机器学习模型是为任意大小的输入定义的,例如具有不同节点数的图和包含不同点数目的点云。这类任意维模型的泛化性仍然知之甚少,因为泛化性传统上是在接受固定大小输入的模型上研究的,定义在其域的紧致子集上。与此形成鲜明对比的是,任意维模型可以被视为定义在规模不断增长的输入上的函数序列,目前尚不清楚它们在何种意义上可以是泛化的。我们开发了一种系统的方法来建立任意维泛化性,通过将任意维函数与一个唯一的函数等同起来,该函数在合适的无限维极限空间中接受输入,该空间包含所有有限大小的输入及其极限。利用这些输入的对称性以及不同大小输入之间的关系,我们证明了该极限空间具有自然的拓扑结构,并且包含丰富的紧致集族,在这些紧致集上可以建立任意维泛化性。我们通过展示几种现有架构无法实现泛化性,并提出了恢复泛化性的简单修改,来说明我们的方法。

英文摘要

Several machine learning models are defined for inputs of any size, such as graphs with different numbers of nodes and point clouds containing varying numbers of points. The universality properties of such any-dimensional models remain poorly understood, as universality is traditionally studied for models accepting inputs of a fixed size, defined on a compact subset of their domain. In sharp contrast, any-dimensional models can be viewed as sequences of functions defined on growing-sized inputs, and it is not clear in which sense they can be universal. We develop a systematic approach to establish any-dimensional universality, by identifying any-dimensional functions with a unique function taking inputs in a suitable infinite-dimensional limit space containing inputs of all finite sizes as well as their limits. Using the symmetries of these inputs and relations between inputs of different sizes, we show that this limit space admits a natural topology with rich families of compact sets on which any-dimensional universality can be established. We illustrate our approach by showing that several existing architectures fail to be universal, and we propose simple modifications that restore universality.

2605.23151 2026-05-25 eess.SY cs.SY stat.ML

Convex Hybrid Modeling: An Operator-Based Approach

凸混合建模:一种基于算子的方法

Wentao Tang

AI总结 本文提出了一种基于算子理论的凸混合建模方法,旨在在保持模型结构简单和物理可解释性的前提下,准确建模过程系统。该方法通过构建凸学习问题,系统地考虑可解释性约束,并高效生成替代模型。研究讨论了三种建模场景,包括围绕参考模型的正则化、可解释子空间的限制以及非线性参数化的可解释流形限制,并通过算子理论技术将系统表示为基于核的可解释模型混合体,适用于静态和动态系统建模。

详情
Comments
19 pages, 6 figures. A 6-page shortened version under the same title is submitted to 2027 Foundations of Computer Aided Process Operations (FOCAPO) / Chemical Process Control (CPC) Conference. This is the full-length version
AI中文摘要

虽然机器学习可以精确建模过程系统,但用于决策的模型也应结构简单且物理可解释。例如,在过程控制中,(近乎)线性模型比非线性模型更受青睐,这促进了算子理论的应用,该理论通过非参数算子“通用”地表示非线性系统。另一方面,可解释性要求采用满足第一性原理的“非通用”参数化非线性模型族;这些约束往往使学习过程复杂化。本文通过制定凸学习问题来考虑混合建模,系统性地考虑可解释性,并高效地给出替代模型。讨论了三种设置——(i)围绕特定“参考模型”的正则化,(ii)对“可解释子空间”的限制,以及更一般地,(iii)对非线性参数化的“可解释流形”的限制。在更一般的设置中,通过引入一种算子理论技术,在“提升”参数(“规范特征”,可能是无限维的)中重新参数化模型,系统被视为基于核的可解释模型的混合。在数值研究中举例说明了静态和动态模型的应用。

英文摘要

While machine learning can accurately model process systems, models for decision making should also be structurally simple and physically interpretable. In process control, for example, (nearly) linear models are favored over nonlinear ones, promoting the use of operator theory, which ``universally'' represents a nonlinear system by a nonparametric operator. On the other hand, interpretability requires a ``non-universal'', parametric nonlinear model family satisfying first principles; these constraints tend to complicate the learning procedure. This paper considers hybrid modeling by formulating convex learning problems that account for interpretability systematically and give surrogate models efficiently. Three settings are discussed: (i) regularization around a particular ``reference model'', (ii) restriction to an ``interpretable subspace'', and more generally, (iii) restriction to an ``interpretable manifold'' that is nonlinearly parameterized. In the more general setting, by introducing an operator-theoretic technique to re-parameterize models in the ``lifted'' parameters (``canonical features'', potentially infinite-dimensional), the system is regarded as a kernel-based mixture of interpretable models. Applications to both static and dynamic models are exemplified in numerical studies.

2605.23145 2026-05-25 stat.ML cs.LG math.ST stat.ME stat.TH

Operationalizing Individual Fairness via Gradient Descent and Bradley-Terry Models

通过梯度下降和Bradley-Terry模型实现个体公平性

Conlan Olson, Linjun Zhang, Zhun Deng, Pragya Sur

AI总结 本文研究如何通过梯度下降和Bradley-Terry模型实现个体公平性,解决在实际应用中学习个体相似度度量的困难问题。作者提出了一种基于三元组查询学习马哈兰诺比斯相似度度量的算法,结合谱初始化和梯度下降方法,并提供了理论保证,证明该算法能快速收敛到真实度量。研究还表明,基于估计度量实现的个体公平性可近似保证对真实度量的公平性,并探讨了该方法在AI模型调优中的潜在应用。

详情
Comments
60 pages, 2 figures
AI中文摘要

个体公平性,即“相似个体应受到相似对待”的概念,为算法决策者提供了强大而灵活的公平性保证。然而,在实践中实施个体公平性的一个障碍是难以学习个体间的相似性度量。在这项工作中,我们提出了一种从三元组查询(形式为“个体$i$与个体$j$还是$k$更相似?”)中学习马氏距离度量的算法。我们在标准的Bradley-Terry成对比较模型下工作。我们的算法包括一个谱初始化步骤,随后是梯度下降。我们为算法提供了广泛的理论保证,表明尽管我们模型中的损失是非凸的,但算法能快速收敛到真实度量。由于我们的重点是公平性,我们还表明,相对于估计度量的个体公平性足以实现相对于真实度量的类似公平性。我们还讨论了我们的工作在AI模型调优中的潜在应用。最后,我们展示了实验结果,证明了我们算法的收敛性以及基于估计度量训练的下游公平预测器的公平性性能。

英文摘要

Individual fairness, the notion that "similar individuals should be treated similarly," provides a strong and flexible fairness guarantee for algorithmic decision makers. However, a barrier to implementing individual fairness in practice is the difficulty of learning the similarity metric over individuals. In this work, we present an algorithm for learning a Mahalanobis similarity metric from triplet queries of the form "is individual $i$ more similar to individual $j$ or $k$?" We work in the standard Bradley-Terry model for pairwise comparisons. Our algorithm consists of a spectral initialization step followed by gradient descent. We provide extensive theoretical guarantees on our algorithm, showing that it converges quickly to the ground truth metric despite the non-convexity of the loss in our model. Because our focus is on fairness, we also show that individual fairness with respect to an estimated metric is sufficient to achieve similar fairness with respect to the true metric. We also discuss potential applications of our work to AI model tuning. Finally, we present experimental results that demonstrate the convergence of our algorithm and the fairness performance of downstream fair predictors trained on our estimated metric.
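
The triplet-comparison model can be sketched as follows. The abstract does not give the exact parameterization, so this sketch assumes a common one: the probability that individual $i$ is judged closer to $j$ than to $k$ is a logistic function of the difference in squared Mahalanobis distances under a matrix $M$. Function names are hypothetical, and the spectral initialization and gradient descent steps are not shown.

```python
import math

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[r] * M[r][c] * d[c] for r in range(n) for c in range(n))

def prob_j_closer(xi, xj, xk, M):
    """Assumed Bradley-Terry form: P(i is more similar to j than to k)."""
    z = mahalanobis_sq(xi, xk, M) - mahalanobis_sq(xi, xj, M)
    if z >= 0:  # numerically stable logistic function
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def triplet_nll(triplets, answers, M):
    """Negative log-likelihood; answers[t] is True when j was chosen over k."""
    total = 0.0
    for (xi, xj, xk), chose_j in zip(triplets, answers):
        p = prob_j_closer(xi, xj, xk, M)
        total -= math.log(p if chose_j else 1.0 - p)
    return total
```

Minimizing `triplet_nll` over positive semidefinite $M$ is the kind of non-convex problem (when $M$ is factored) for which the abstract reports convergence guarantees.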

2605.23115 2026-05-25 cs.LG stat.ML

Robust OT-Guided Generative Residual Domain Adaptation for Bike-Sharing Demand Prediction under Temporal Domain Shift

鲁棒OT引导的生成式残差域适应用于时间域偏移下的共享单车需求预测

Yiming Ma

AI总结 本文研究了从2021年到2026年纽约Citi Bike共享单车需求预测中的时间域适应问题,提出了一种基于最优运输引导的残差域适应框架Gen-ROTDA。该方法通过拟合目标域的站点-时间锚点,转移残差而非原始需求,并采用确定性标签保持的残差特征生成器,提升了模型在时间域偏移下的鲁棒性。实验表明,Gen-ROTDA在主要任务2025至2026年的预测中取得了最低的平均绝对误差,并在多任务中优于其他最优运输方法,尤其在面对噪声数据时表现出更强的稳定性。

详情
AI中文摘要

基于历史站点-小时数据训练的共享单车模型在后续年份部署时,由于出行模式随时间变化,性能可能会下降。本文将2021年至2026年3月Citi Bike需求预测作为时间域适应问题进行研究,并提出了Gen-ROTDA,一种鲁棒最优传输引导的残差域适应框架。该方法利用少量标记目标子集拟合目标域站点-时间锚点,传输残差而非原始需求,应用确定性标签保持残差特征生成器,并在训练最终残差预测器之前修剪高成本传输匹配。实验将Gen-ROTDA与仅锚点、仅源域、仅目标域、微调、MMD适应、Sinkhorn OTDA、ROTDA和Gen-OTDA进行比较。Gen-ROTDA在2025年至2026年主要任务上取得了最低MAE,并且在多年度任务中平均表现最佳,尽管微调和MMD适应仍然是强大的整体基线。在异常目标无标签记录下,Gen-ROTDA比非鲁棒OT变体稳定得多,表明鲁棒传输对于共享单车需求预测中的噪声时间迁移是有用的。

英文摘要

Bike-sharing models trained on historical station-hour data may degrade when deployed in later years because travel patterns change over time. This paper studies March Citi Bike demand prediction from 2021 to 2026 as a temporal domain adaptation problem and proposes Gen-ROTDA, a robust optimal transport-guided residual domain adaptation framework. The method fits a target-domain station-time anchor with a small labeled target subset, transfers residual rather than raw demand, applies a deterministic label-preserving residual feature generator, and trims high-cost transport matches before training the final residual predictor. Experiments compare Gen-ROTDA with anchor-only, source-only, target-only, fine-tuning, MMD adaptation, Sinkhorn OTDA, ROTDA, and Gen-OTDA. Gen-ROTDA achieves the lowest MAE on the main 2025 to 2026 task and is the best OT-family method on average across multi-year tasks, although fine-tuning and MMD adaptation remain strong overall baselines. Under abnormal target-unlabeled records, Gen-ROTDA is much more stable than non-robust OT variants, suggesting that robust transport is useful for noisy temporal transfer in bike-sharing demand prediction.

2605.23102 2026-05-25 stat.ML cs.LG stat.ME

LLM Sparsity Prior for Robust Feature Selection

LLM 稀疏先验用于鲁棒特征选择

Caleb Skinner, Yihan Guo, Meng Li

AI总结 本文提出了一种基于大语言模型(LLM)稀疏性先验的鲁棒特征选择方法,用于高维变量选择。该方法通过引入可解释的超参数将LLM生成的权重整合到Spike-and-Slab模型中,同时利用分层超先验动态过滤无信息或误导性权重,从而在保证准确权重利用的同时提升鲁棒性。实验表明,该方法在医疗数据集上不仅提高了预测精度,还识别出基线方法遗漏的临床相关特征,尤其在小样本场景下表现出色。

详情
AI中文摘要

大型语言模型 (LLM) 提供了一种可扩展的机制,用于引出领域信息的先验知识,以进行高维变量选择。然而,现有方法如 LLM-Lasso 对权重质量敏感,当 LLM 生成的权重不准确时,性能会大幅下降。为了解决这一挑战,我们首先引入了一个量化 LLM 生成权重质量的框架,从而能够对不同权重机制下的 LLM 信息方法进行严格评估。然后,我们提出了 LLM 稀疏先验 (LSP),它通过两个可解释的超参数(控制全局稀疏性和权重集中度)将 LLM 生成的权重整合到 Spike-and-Slab 和 Spike-and-Slab Lasso 模型的先验包含概率中。这些参数上的层次超先验允许模型动态地折扣无信息或误导性权重,从而在权重准确时提高鲁棒性而不牺牲收益。最后,我们开发了原则性的提示工程策略,并在一个研究急性肾损伤的私有医学数据集上验证了该方法。LSP 提高了预测准确性,并识别出了基线方法遗漏的临床相关特征,对提示变化具有鲁棒性,在低数据场景下尤其有效。

英文摘要

Large language models (LLMs) offer a scalable mechanism to elicit domain-informed prior information for high-dimensional variable selection. However, existing methods such as LLM-Lasso are sensitive to weight quality, with performance degrading substantially when LLM-generated weights are inaccurate. To address this challenge, we first introduce a framework for quantifying the quality of LLM-generated weights, enabling rigorous evaluation of LLM-informed methods across varying weight regimes. We then propose the LLM Sparsity Prior (LSP), which integrates LLM-generated weights into the prior inclusion probabilities of Spike-and-Slab and Spike-and-Slab Lasso models via two interpretable hyperparameters governing global sparsity and weight concentration. Hierarchical hyperpriors on these parameters allow the model to dynamically discount uninformative or misleading weights, improving robustness without sacrificing gains when weights are accurate. Finally, we develop principled prompt engineering strategies and validate the method on a private medical dataset studying Acute Kidney Injury. LSP improves prediction accuracy and identifies clinically relevant features missed by the baselines, with robustness to prompt variation and particular effectiveness in low-data regimes.

2605.23101 2026-05-25 math.NA cs.NA stat.ML

Mode-Shape Expansion Using Physics-Constrained Gaussian Process Regression

使用物理约束高斯过程回归的模态振型扩展

Farid Ghahari

AI总结 本文研究了如何从稀疏传感器数据中重建全场结构模态形状的问题。为了解决传统高斯过程回归在稀疏传感条件下可能产生物理不一致结果的缺陷,提出了一种基于物理约束的单输出高斯过程框架(CONS-SOGP),通过引入模态核和质量正交性惩罚来耦合优化过程。数值实验表明,该方法在多自由度结构上能够提供更准确可靠的模态形状扩展结果。

详情
AI中文摘要

本文解决了从稀疏传感器数据重构全场结构模态振型的挑战。虽然高斯过程回归为空间插值和不确定性量化提供了稳健的非参数框架,但标准公式在稀疏传感条件下通常会产生物理上不一致的模态振型重构。推导了一种物理约束单输出高斯过程框架,该框架利用独立模态核,同时通过质量正交惩罚耦合优化。本文给出了边际似然、超参数梯度和惩罚耦合的推导。在多自由度结构上的数值验证表明,所提出的方法克服了现有基于GP预测的局限性,提供了更准确可靠的扩展模态振型。

英文摘要

This paper addresses the challenge of reconstructing full-field structural mode shapes from sparse sensor data. While Gaussian Process Regression (GPR) offers a robust non-parametric framework for spatial interpolation and uncertainty quantification, standard formulations often yield physically inconsistent mode-shape reconstructions under sparse sensing conditions. A Physics-Constrained Single-Output Gaussian Process (CONS-SOGP) framework is derived that utilizes independent modal kernels while coupling the optimization via a mass-orthogonality penalty. The paper presents derivations for the marginal likelihood, hyperparameter gradients, and penalty coupling. Numerical verification on a multi-degree-of-freedom structure demonstrates that the proposed method overcomes existing limitations in GP-based prediction, providing more accurate and reliable expanded mode shapes.

2605.23061 2026-05-25 cs.LG cs.AI math.OC stat.ML

Anytime Training with Schedule-Free Spectral Optimization

任意时间训练:无调度谱优化

Anuj Apte, Pranav Deshpande, Niraj Kumar, Shouvanik Chakrabarti, Junhyung Lyle Kim

AI总结 本文提出了一种名为 SF-NorMuon 的无调度谱优化器,用于解决传统神经网络训练中依赖固定学习率计划的问题。该方法在无需预设训练时间范围的情况下,能够在大规模语言模型上达到甚至超越精心调参的 AdamW 优化器的性能。研究还从理论上证明了无调度谱动态的稳定性保证,并指出快速迭代中的权重衰减对长期训练稳定性至关重要,为无需预设时间范围的持续学习提供了更实用的优化方案。

详情
AI中文摘要

标准神经网络训练依赖于与固定训练步数绑定的学习率调度,导致路径依赖性强,且当数据可用性变化时需要昂贵的重新调优。无调度(SF)方法通过移除显式调度来解决这一问题,然而当前最先进的任意时间优化器SF-AdamW始终不如调优后的AdamW基线。我们提出SF-NorMuon,一种无调度谱优化器,弥补了这一差距:使用单一超参数配置,SF-NorMuon在125M和772M参数的语言模型上,在$1$--$8\times$ Chinchilla训练步数范围内匹配或超过了调优的AdamW。在理论方面,我们证明了无调度谱动力学的平稳性保证,并指出快速迭代上的权重衰减对于长步数稳定性至关重要。SF-NorMuon使从业者能够在训练过程中的任何时刻获得高质量检查点,而无需预先承诺训练步数。通过缩小与调优基线的性能差距,SF-NorMuon使无步数优化更加实用,向真正开放式的持续学习迈出了一步。

英文摘要

Standard neural network training relies on learning-rate schedules tied to a fixed horizon, leading to strong path dependence and costly re-tuning as data availability changes. Schedule-Free (SF) methods address this by removing explicit schedules, yet SF-AdamW, the current state-of-the-art anytime optimizer, consistently underperforms well-tuned AdamW baselines. We propose SF-NorMuon, a schedule-free spectral optimizer that closes this gap: with a single hyperparameter configuration, SF-NorMuon matches or exceeds tuned AdamW on 125M and 772M parameter language models across $1$--$8\times$ Chinchilla horizons. On the theoretical side, we prove a stationarity guarantee for schedule-free spectral dynamics and identify weight decay at the fast iterate as essential for long-horizon stability. SF-NorMuon enables practitioners to obtain high-quality checkpoints at any point during training without committing to a horizon in advance. By closing the performance gap with tuned baselines, SF-NorMuon makes horizon-free optimization more practical, taking a step towards truly open-ended, continual learning.
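
For readers unfamiliar with the schedule-free idea, the sketch below shows the basic schedule-free SGD recursion on a scalar problem: gradients are evaluated at an interpolation `y` between a fast iterate `z` and a running average `x`, and `x` is returned at any time without a learning-rate schedule. This is plain schedule-free SGD under assumed hyperparameters, not SF-NorMuon: the spectral (orthogonalized) update and the weight decay at the fast iterate described in the abstract are omitted.

```python
def schedule_free_sgd(grad, w0, lr=0.1, beta=0.9, steps=2000):
    """Schedule-free SGD on a scalar parameter (illustrative sketch).

    z: fast iterate updated by gradient steps
    x: uniform running average of z, the point one would evaluate or checkpoint
    y: interpolation between z and x where the gradient is taken
    """
    z = w0
    x = w0
    for t in range(1, steps + 1):
        y = (1.0 - beta) * z + beta * x
        z = z - lr * grad(y)
        c = 1.0 / (t + 1)
        x = (1.0 - c) * x + c * z
    return x
```

Because `x` is a valid output after every step, the number of steps does not need to be fixed in advance, which is the "anytime" property the abstract refers to.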

2605.23048 2026-05-25 cs.HC cs.CY stat.AP stat.ME

StanBKT: Rethinking Parameter Estimation in Bayesian Knowledge Tracing

StanBKT:重新思考贝叶斯知识追踪中的参数估计

Siddhartha Pradhan, Yanping Pei, Morgan Lee, Puyuan Zhang, Erin Ottmar, Adam C. Sales

AI总结 本文提出 StanBKT,一种基于贝叶斯推断的开源 Python 工具包,用于改进传统贝叶斯知识追踪(BKT)模型的参数估计方法。与传统依赖期望最大化方法的实现不同,StanBKT 支持包括汉密尔顿蒙特卡洛、变分推断等多种贝叶斯方法,能够在保留 BKT 模型可解释性的基础上,提供更准确的不确定性量化和更灵活的层次化建模能力。实验表明,StanBKT 在大规模教育数据集上表现出良好的预测性能,并能够更可靠地比较不同实验条件下学习、遗忘、猜测和失误参数的差异。

详情
Comments
5 figures, 7 tables
AI中文摘要

贝叶斯知识追踪(BKT)是智能辅导系统和教育数据挖掘中广泛使用且可解释的学生建模方法。然而,大多数实现依赖于期望最大化或相关优化方法,仅产生点估计,限制了不确定性量化以及跨学习者和条件的合理比较。我们介绍了 StanBKT,一个使用 Stan 进行贝叶斯推断来估计 BKT 模型的开源 Python 包。StanBKT 提供了一个统一框架,支持哈密顿蒙特卡洛、变分推断、Pathfinder 和基于优化的估计,同时保留了经典 BKT 的隐马尔可夫结构和可解释性。它支持标准、分组和层次化 BKT 模型,灵活的先验设定,后验预测推断,以及可视化和诊断工具。我们在大规模观测性和受控教育数据集上评估了 StanBKT。在 ASSISTments 2020 数据集上,我们展示了所支持的推断方法在预测性能上相当,但在计算效率和后验保真度上有所不同。我们进一步展示了后验推断如何实现对涉及感知线索操纵的教育干预中条件特定参数的合理比较。结果说明了不确定性量化如何促进对实验条件下学习、遗忘、猜测和滑移参数差异的更可靠解释。总体而言,StanBKT 通过提供用于教育数据挖掘中概率学生建模、不确定性量化和层次化推断的灵活框架,将 BKT 扩展到点估计之外。

英文摘要

Bayesian Knowledge Tracing (BKT) is a widely used and interpretable student modeling approach in intelligent tutoring systems and educational data mining. However, most implementations rely on expectation-maximization or related optimization methods that yield only point estimates, limiting uncertainty quantification and principled comparisons across learners and conditions. We introduce StanBKT, an open-source Python package for estimating BKT models using Bayesian inference in Stan. StanBKT provides a unified framework supporting Hamiltonian Monte Carlo, variational inference, Pathfinder, and optimization-based estimation while preserving the hidden Markov structure and interpretability of classical BKT. It supports standard, grouped, and hierarchical BKT models, flexible prior specification, posterior predictive inference, and utilities for visualization and diagnostics. We evaluate StanBKT on large-scale observational and controlled educational datasets. On the ASSISTments 2020 dataset, we show that supported inference methods achieve comparable predictive performance while differing in computational efficiency and posterior fidelity. We further demonstrate how posterior inference enables principled comparison of condition-specific parameters in an educational intervention involving perceptual cue manipulations. Results illustrate how uncertainty quantification facilitates more reliable interpretation of differences in learning, forgetting, guessing, and slipping parameters across experimental conditions. Overall, StanBKT extends BKT beyond point estimation by providing a flexible framework for probabilistic student modeling, uncertainty quantification, and hierarchical inference in educational data mining.
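
The hidden Markov structure of classical BKT reduces to a short update rule, sketched below. This is the standard textbook BKT recursion with an optional forgetting probability, not StanBKT's API; the function name and argument names are hypothetical.

```python
def bkt_update(p_know, correct, learn, guess, slip, forget=0.0):
    """One step of classical BKT for a single skill.

    First condition the mastery probability on the observed response
    (Bayes' rule with guess and slip), then apply the learn/forget transition.
    """
    if correct:
        num = p_know * (1.0 - slip)
        den = num + (1.0 - p_know) * guess
    else:
        num = p_know * slip
        den = num + (1.0 - p_know) * (1.0 - guess)
    posterior = num / den
    return posterior * (1.0 - forget) + (1.0 - posterior) * learn
```

A point-estimate workflow would plug fitted values of `learn`, `guess`, `slip`, and `forget` into this recursion; a Bayesian workflow such as the one the abstract describes would instead propagate posterior draws of those parameters, giving an uncertainty band around the mastery trajectory.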

2605.23043 2026-05-25 cs.CL stat.ML

HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation

HawkesLLM:智能体文本模拟中的语义不确定性传播

Zewei Deng, Tinghan Ye, Liyan Xie

AI总结 本文提出HawkesLLM框架,用于解决智能体文本模拟系统中语义不确定性随时间累积的问题。该方法将时间影响建模与文本生成过程分离,通过多变量Hawkes过程建模节点间的激活关系,并利用语言模型基于时间模型选择的紧凑记忆生成新内容。实验表明,在GDELT新闻传播案例中,HawkesLLM在有限提示记忆预算下有效提升了后期语义对齐的效果。

详情
Comments
10 pages, 4 figures, Accepted at the ICML 2026 Workshop on Statistical Frameworks for Uncertainty in Agentic Systems
AI中文摘要

智能体文本模拟系统按顺序生成文本,每个项目成为后续步骤的可能上下文。这使得不确定性具有路径依赖性:早期的模糊性可能影响后续输出。本文通过HawkesLLM框架研究这一问题,该框架将时间影响建模与文本生成分离。我们将级联表示为一个网络,其节点是文本生成智能体。多变量Hawkes过程模拟这些节点随时间激活的方式,以及哪些早期节点输出应影响后续提示。然后,语言模型根据该时间模型选择的紧凑记忆编写每个新事件。我们在一个保留的全球事件、语言和语调数据库(GDELT)新闻级联案例研究中评估该框架。诊断跟踪与局部保留参考的语义对齐,并区分局部漂移和全局漂移。在此设置下,HawkesLLM在紧凑的提示记忆预算下改善了后期语义对齐。

英文摘要

Agentic text-simulation systems write in sequence, with each item becoming possible context for later steps. That makes uncertainty path-dependent: an early ambiguity can affect later outputs. This paper studies this problem with HawkesLLM, a framework that separates temporal influence modeling from text generation. We represent the cascade as a network whose nodes are text-generating agents. A multivariate Hawkes process models how these nodes activate over time and which earlier node outputs should influence later prompts. A language model then writes each new event from the compact memory selected by this temporal model. We evaluate the framework on a held-out Global Database of Events, Language, and Tone (GDELT) news-cascade case study. The diagnostics track semantic alignment with local held-out references and separate local drift from global drift. In this setting, HawkesLLM improves late-stage semantic alignment under a compact prompt-memory budget.
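
The temporal side of such a framework can be sketched with a multivariate Hawkes intensity. The abstract does not specify the kernel, so the sketch assumes an exponential kernel; `hawkes_intensity` and `top_influences` are hypothetical names, and the second function only illustrates one plausible way to pick a compact memory of earlier events for a prompt.

```python
import math

def hawkes_intensity(i, t, events, mu, alpha, beta):
    """Intensity of node i at time t.

    events: list of (time, node) pairs; mu[i]: baseline rate;
    alpha[i][j]: excitation of node i by node j; beta: decay rate
    (exponential kernel, an assumption of this sketch).
    """
    lam = mu[i]
    for s, j in events:
        if s < t:
            lam += alpha[i][j] * beta * math.exp(-beta * (t - s))
    return lam

def top_influences(i, t, events, alpha, beta, k):
    """Return the k past events contributing most to node i's intensity at t."""
    scored = [
        (alpha[i][j] * beta * math.exp(-beta * (t - s)), s, j)
        for s, j in events
        if s < t
    ]
    scored.sort(reverse=True)
    return [(s, j) for _, s, j in scored[:k]]
```

Separating the two functions mirrors the separation described in the abstract: the point process decides when a node fires and which earlier outputs matter, and a language model would then write the text given only the selected events.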

2605.23016 2026-05-25 stat.ME stat.CO

Sample correlation adjustments for robust Multi-fidelity Monte Carlo under limited pilot sampling

有限试点采样下鲁棒多保真蒙特卡洛的样本相关性调整

Michael Stanley, Thomas Coons, Geoffrey Bomarito, Patrick Leser, Joshua Pribe, James Warner

AI总结 本文研究了在有限试点样本条件下,如何改进多保真度蒙特卡洛(MFMC)方法中的相关系数估计,以提高估计器的稳健性。作者提出了一种基于样本协方差矩阵的概率信息和问题结构的新方法,通过定义一个新的差异函数来量化估计器的次优性,并选择最小化最坏情况期望差异的相关估计器。实验表明,该方法在试点样本较少时能生成比传统样本协方差更优的MFMC估计器。

详情
AI中文摘要

多保真蒙特卡洛(MFMC)是一种方差缩减方法,利用不同成本和精度的多保真模型集合。构建具有最优方差的MFMC估计器需要知道不同保真度模型之间的相关系数,而这些系数在实践中通常未知。相关性通常通过离线试点样本和样本相关性公式进行估计,之后MFMC方法将估计的相关性视为真实相关性。计算成本常常限制使用的试点样本数量,导致相关性估计不佳和次优估计器。利用MFMC问题设置和样本协方差矩阵的概率信息,我们提出了一种在有限试点样本下改进标准基于样本的相关性估计的方法。我们定义了一个新的差异函数来量化估计器的次优性,从而有助于选择最小化最坏情况期望差异的相关性估计器,其中期望是针对试点采样变异性取的。通过一个简单的二元高斯示例和来自NASA进入、下降与着陆(EDL)问题的多保真建模应用,我们表明在较小的试点样本量和有限的总预算下,该方法比标准样本协方差产生更好的MFMC估计器。

英文摘要

Multi-fidelity Monte Carlo (MFMC) is a variance reduction method that leverages a multi-fidelity ensemble of models of varying cost and accuracy levels. Constructing an MFMC estimator with optimal variance requires knowledge of the correlation coefficients between the different fidelity models, which are not usually known in practice. The correlations are typically estimated using offline pilot samples and the sample correlation formula, after which the MFMC method proceeds as if the estimated correlations are the true correlations. Computational cost often restricts the number of pilot samples used, leading to poor correlation estimates and suboptimal estimators. Leveraging the MFMC problem setting and probabilistic information about the sample covariance matrix, we present a method to improve standard sample-based correlation estimates in the presence of limited pilot samples. We define a novel discrepancy function quantifying the estimator suboptimality, which in turn facilitates selecting a correlation estimator minimizing the worst-case expected discrepancy, where the expectation is taken with respect to the pilot sampling variability. Through a simple bivariate Gaussian example and a multi-fidelity modeling application from a NASA Entry, Descent, and Landing (EDL) problem, we show that this method produces better MFMC estimators than the standard sample covariance under small pilot sample sizes and limited total budgets.
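
The standard workflow that the paper sets out to improve can be sketched for two fidelities. The control-variate weight $\rho\,\sigma_{\text{hi}}/\sigma_{\text{lo}}$ equals the pilot covariance divided by the pilot low-fidelity variance, and the estimator corrects the high-fidelity mean with the difference between low-fidelity means on the large and small sample sets. Function names are hypothetical, and the paper's adjusted correlation estimator is not implemented here.

```python
def mean(xs):
    return sum(xs) / len(xs)

def sample_cov(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def control_variate_weight(pilot_hi, pilot_lo):
    """Plug-in weight rho * sigma_hi / sigma_lo from paired pilot samples."""
    return sample_cov(pilot_hi, pilot_lo) / sample_cov(pilot_lo, pilot_lo)

def two_fidelity_estimate(hi_vals, lo_paired, lo_extra, weight):
    """Two-model MFMC estimate of the high-fidelity mean.

    hi_vals and lo_paired come from the same inputs; lo_extra holds the
    additional cheap low-fidelity evaluations.
    """
    lo_all = list(lo_paired) + list(lo_extra)
    return mean(hi_vals) + weight * (mean(lo_all) - mean(lo_paired))
```

With only a handful of pilot samples, `control_variate_weight` can be far from its population value, which is the failure mode the abstract targets.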

2605.22968 2026-05-25 q-bio.QM cs.LG stat.ML

Uncertainty-aware classification and triage of structural heart disease using electrocardiography and echocardiography metrics

基于心电图和超声心动图指标的结构性心脏病不确定性感知分类与分诊

Mitchel J. Colebank

AI总结 该研究探讨了利用心电图(ECG)和超声心动图指标对结构性心脏病(SHD)进行分类与分诊的不确定性感知方法。研究对比了频率学派和贝叶斯神经网络分类器在SHD检测中的表现,发现贝叶斯方法在分类性能和不确定性量化方面更具优势。研究还展示了如何将不确定性感知分类应用于SHD筛查,为通过机器学习辅助分诊、优化医疗资源分配提供了可行方案。

详情
Comments
15 pages, 5 figures
AI中文摘要

机器学习方法提供了一种方法创新,可以通过无创且易于获得的测量方式帮助筛查心血管疾病。最近在利用心电图数据筛查结构性心脏病方面的投资就是一个例子,其中心电图提供了一种低成本、可用的筛查方式。这导致了EchoNext数据集的产生,这是一个配对的心电图-超声心动图数据存储库,用于测试新的结构性心脏病检测方法。然而,相对较少的研究探讨了通过贝叶斯推理进行更概率性的分类如何改善这种情况下的不确定性量化。此外,很少有研究考虑如何开发分诊系统以缓解医疗瓶颈,例如由专家超声技师审查来自服务不足的农村诊所的数据以进行结构性心脏病评估。在本研究中,我们利用现有的心电图-超声心动图数据来比较频率派和贝叶斯神经网络分类器。我们表明,贝叶斯方法在结构性心脏病分类中与频率派方法相当或更好,并且它们具有更稳健的不确定性量化。我们提供了一个示例,说明如何将此不确定性感知分类方案用于结构性心脏病筛查,为机器学习如何帮助分诊提供了概念验证,即在结构性心脏病高度可能或测量高度不确定时,让个体获得专家超声技师的输入。

英文摘要

Machine learning methods provide a methodological innovation that can help screen for cardiovascular disease through noninvasive and readily available measurement modalities. Recent investments in using electrocardiogram (ECG) data to screen for structural heart disease (SHD) are one example, where ECGs provide a low-cost, available modality for screening. This has led to the EchoNext dataset, a paired ECG-echocardiogram data repository for testing new methods of SHD detection. However, relatively few studies have investigated how more probabilistic classification through Bayesian inference may improve uncertainty quantification in this setting. Moreover, few studies have considered how triage systems can be developed to alleviate healthcare bottlenecks, such as the review of data from underserved, rural clinics by expert sonographers for SHD assessment. In this study, we leverage existing ECG-echocardiogram data to compare frequentist and Bayesian neural network classifiers. We show that the Bayesian approach is comparable to or better than frequentist methods in SHD classification and that it provides more robust uncertainty quantification. We provide an example of how this uncertainty-aware classification scheme can be used for screening SHD, providing a proof-of-concept for how machine learning can help with triage by getting individuals expert sonographer input when SHD is highly likely or measurements are highly uncertain.

2605.22350 2026-05-25 cs.LG stat.ML

Partial Fusion of Neural Networks: Efficient Tradeoffs Between Ensembles and Weight Aggregation

神经网络的部分融合:集成与权重聚合之间的高效权衡

Fabian Morelli, Stephan Eckstein

AI总结 该论文提出了一种神经网络的部分融合方法,在集成学习与权重聚合之间实现计算成本与性能的灵活权衡。核心思想是基于神经元层面的相似性,仅对最相似的神经元进行权重聚合,从而在保持较高准确率的同时降低计算开销。研究还展示了通过部分最优运输方法识别和匹配相似神经元的具体实现,并将权重聚合与部分融合视为集成模型的广义剪枝过程,允许对神经元进行删除或线性组合操作,进一步拓展了模型优化的灵活性。

详情
Comments
Accepted to ICML 2026
AI中文摘要

神经网络的集成通常优于单个网络,但计算成本高昂,而权重聚合产生的聚合模型成本较低,但精度也较低。我们引入了网络的部分融合,它在集成和权重聚合之间进行插值,从而允许在计算成本和性能之间进行灵活的权衡。实现这一目标的一种直接方法是扩展现有的基于不同网络之间神经元级相似性的权重聚合方法,其中部分融合仅聚合最相似神经元的权重。我们展示了一种特定方法,通过部分最优传输联合识别哪些神经元最相似并进行匹配。此外,我们将权重聚合和部分融合视为集成模型的广义剪枝,其中神经元不仅可以被删除,还可以线性组合。最后,我们表明,应用于单个网络的广义剪枝通过允许基于相似性隔离、删除和线性组合神经元之间的权衡,产生了与部分融合类似的优势。我们的代码可在 https://github.com/Fabian-Mor/partial_fusion_nn 获取。

英文摘要

Ensembles of neural networks typically outperform individual networks but incur large computational costs, whereas weight aggregation produces less costly, yet also less accurate, aggregate models. We introduce partial fusion of networks, which interpolates between ensembles and weight aggregation and thus allows for a flexible tradeoff between computational cost and performance. A direct way to achieve this is to extend existing weight aggregation methods based on neuron-level similarity between different networks, where partial fusion then only aggregates weights of neurons which are most similar. We showcase one particular method to jointly identify which neurons are most similar and match them via partial optimal transport. Further, we consider the more general perspective of weight aggregation and partial fusion as generalized pruning of ensemble models, where neurons cannot just be deleted, but also linearly combined. Finally, we show that generalized pruning applied to a single network yields similar benefits as partial fusion by allowing for a tradeoff between isolating, deleting, and linearly combining neurons based on similarity. Our code is available at https://github.com/Fabian-Mor/partial_fusion_nn.

2605.21893 2026-05-25 stat.ME stat.AP

Sequential Sensitivity Analysis for Multiple Assumptions: A Framework for Understanding Racial Disparity in Police Use of Force

多重假设的序贯敏感性分析:理解警察使用武力中种族差异的框架

Thomas Leavitt, Jake Bowers, Luke Miratrix

AI总结 该研究旨在分析警察使用武力中的种族差异问题,探讨在缺乏直接观测的情况下如何推断种族歧视的影响。研究提出了一种序贯敏感性分析框架,同时考虑警察在拦截和遭遇环节中可能存在的种族偏见假设,并应用于纽约市警察局的拦截数据。研究发现,在合理假设下存在显著的种族差异,但这一结论对遭遇环节中轻微的种族偏差假设非常敏感,表明需综合考虑多个因素以更准确理解种族差异的成因。

详情
AI中文摘要

推断警察使用武力中的种族歧视——平民种族对使用武力的平均因果效应——需要关于潜在使用武力前警务的两个假设:警察在拦截谁时不存在歧视(拦截无歧视),并且,在巡逻背景条件下,遭遇少数族裔而非白人平民的概率在不同遭遇中不变(遭遇无偏见)。正如Knox等人(2020年)所示,第一个假设的违反可能掩盖武力中的种族差异。这是否反映武力中的歧视也取决于第二个假设。现有的敏感性分析一次只处理一个假设。我们开发了一个框架,依次变化两个假设,并将其应用于纽约市警察局拦截、询问和搜身数据(2003-2013年)。在拦截歧视的合理水平下,我们发现武力中存在显著的种族差异。然而,这一差异反映歧视的结论对遭遇无偏见假设的微小偏离是脆弱的,而基于人口普查的校准表明这种偏离在人口统计上是可行的。通过联合解决这两个混淆渠道,该框架揭示了它们如何以单独分析无法做到的方式相互作用,有助于理解种族差异产生的原因以及如何解决它们。

英文摘要

Inferring racial discrimination in police use of force -- the average causal effect of civilian race on use of force -- requires two assumptions about policing prior to potential use of force: that officers do not discriminate in whom they would stop (no discrimination in stops) and that, conditional on patrol context, the probability that an encounter is with a minority rather than a white civilian does not vary across encounters (no bias in encounters). As Knox et al. (2020) show, violations of the first can mask racial disparity in force. Whether it reflects discrimination in force also depends on the second. Existing sensitivity analyses address one assumption at a time. We develop a framework that varies both sequentially and apply it to NYPD Stop, Question, and Frisk data (2003--2013). Under plausible levels of discrimination in stops, we find substantial racial disparity in force. However, the conclusion that this disparity reflects discrimination is fragile to modest departures from no bias in encounters that census-based calibration suggests are demographically feasible. By jointly addressing both confounding channels, the framework reveals how they interact in ways that separate analyses cannot, contributing to understanding what generates racial disparities and how they might be addressed.

2605.21489 2026-05-25 cs.LG cs.AI cs.CV stat.CO stat.ML

Variance Reduction for Expectations with Diffusion Teachers

具有扩散教师的期望方差缩减

Jesse Bettencourt, Xindi Wu, Matan Atzmon, James Lucas, Jonathan Lorraine

AI总结 本文研究了如何在使用预训练扩散模型作为“教师”进行下游任务(如文本到3D生成、单步蒸馏等)时,降低梯度估计的方差。提出了一种名为CARV的计算感知方差控制框架,通过分层蒙特卡洛估计器,将昂贵的上游计算过程与廉价的扩散噪声重采样相结合,并结合时间步重要性采样和分层逆CDF构造,有效减少了计算成本。实验表明,CARV在不改变目标函数的前提下显著提升了计算效率,但在某些任务中梯度方差的降低并未带来生成质量的提升,表明此时方差已不再是性能瓶颈。

详情
Comments
Project page: https://research.nvidia.com/labs/sil/projects/CARV/
AI中文摘要

预训练的扩散模型作为冻结教师,为文本到3D、单步蒸馏和数据归因等下游流程提供支持。这些流程消耗的教师梯度是关于噪声水平和高斯噪声样本的蒙特卡洛期望;其估计器方差主导了计算成本,因为每次抽取都需要昂贵的上游工作(渲染、模拟、编码)。我们引入了CARV,一个计算感知的方差核算框架,它激发了一种分层蒙特卡洛估计器:通过廉价的扩散噪声重采样来摊销昂贵的上游计算,并通过时间步重要性采样和分层逆CDF构造加以强化。在我们的文本到3D蒸馏和归因实验中,CARV在不改变目标的情况下提供了2-3倍的有效计算乘数(主要来自摊销重用;约25%来自IS+分层);在单步蒸馏中,相同的技术将梯度方差降低了一个数量级,但并未改善下游FID,标志着MC方差不再是瓶颈的区间。

英文摘要

Pretrained diffusion models serve as frozen teachers feeding downstream pipelines such as text-to-3D, single-step distillation, and data attribution. The teacher gradients these pipelines consume are Monte Carlo (MC) expectations over noise levels and Gaussian noise samples; their estimator variance dominates compute cost because each draw requires expensive upstream work (rendering, simulation, encoding). We introduce CARV, a compute-aware variance-accounting framework that motivates a hierarchical MC estimator: amortize the expensive upstream computation over cheap diffusion-noise resamples, sharpened by timestep importance sampling and a stratified-inverse-CDF construction. In our text-to-3D distillation and attribution experiments, CARV delivers 2-3x effective compute multipliers (most from amortized reuse; ~25% additional from IS+stratification) without changing the objective; in single-step distillation, the same techniques cut gradient variance by an order of magnitude but do not improve downstream FID, marking the regime where MC variance is no longer the bottleneck.
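The amortization idea can be sketched in a few lines, assuming a toy setting: each expensive upstream result is reused across several cheap noise resamples, with timesteps drawn by stratified inverse-CDF sampling. The names `hierarchical_estimate`, `expensive`, `cheap`, and `inv_cdf` are hypothetical, and importance weights are omitted, so this is not the CARV estimator itself.

```python
import random

def stratified_uniforms(m, rng):
    """One uniform draw per equal-width stratum of [0, 1)."""
    return [(k + rng.random()) / m for k in range(m)]

def hierarchical_estimate(expensive, cheap, inv_cdf, n_outer, m_inner, rng):
    """Average cheap(feature, t, eps) over m_inner resamples per expensive call."""
    total = 0.0
    for _ in range(n_outer):
        feature = expensive(rng)           # costly upstream work, done once
        inner = 0.0
        for u in stratified_uniforms(m_inner, rng):
            t = inv_cdf(u)                 # timestep via stratified inverse CDF
            eps = rng.gauss(0.0, 1.0)      # cheap diffusion-noise resample
            inner += cheap(feature, t, eps)
        total += inner / m_inner
    return total / n_outer
```

For example, with `expensive = lambda r: r.gauss(1.0, 1.0)`, `cheap = lambda f, t, e: f + t * e`, and `inv_cdf = lambda u: u`, the estimate targets the value 1.0; raising `m_inner` reduces only the part of the variance that comes from the noise and timestep draws.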

2605.18370 2026-05-25 stat.ML cs.LG math.ST stat.TH

On Stability and Decomposition of Sample Quantiles under Heavy-Tailed Distributions

重尾分布下样本分位数的稳定性与分解

Choudur Lakshminarayan

AI总结 本文研究了在重尾分布下,基于估计参数的样本分位数的稳定性与分解问题,尤其关注与金融收益线性投影相关的风险价值(VaR)估计。传统Bahadur表示在固定分布下难以分离投影方向和分位数阈值带来的不稳定性,本文提出一种Q-Q正交性方法,将两者的影响分离开来,并将样本分位数与理论分位数的差异分解为三个部分,分别对应投影方向变化、样本分位数波动以及余项,从而更精确地分析分位数估计的稳定性来源。

详情
Comments
0 figures
AI中文摘要

我们研究由估计参数索引的分布样本分位数,重点关注与金融收益线性投影相关的风险价值,其潜在概率律是重尾的。在此设定下,投影方向和经验分位数阈值均从数据中估计,因此固定分布下的标准Bahadur表示无法分离不同的不稳定性来源。一个规范的起点是Bahadur表示,它通过经验分布函数加上余项来表达样本分位数\cite{bahadur1966}。经验过程理论通过半空间、对称差和Glivenko-Cantelli一致收敛的机制提供了可用的框架。它们给出了稳定性界,但将投影方向的变化和分位数阈值的变化吸收到单一的对称差度量中。有趣的是,对于本质上是局部分位数稳定性问题,却施加了全局一致收敛的要求。 本文引入了一种Q-Q正交性公式来分离投影方向和分位数阈值效应。关注的对象是使用估计投影方向计算的经验分位数与参考投影方向下的总体分位数之间的差异。我们将此差异分解为三项:$\hat q_α(\hat w)-q_α(w_0)=D_1+D_2+D_3$。其中,$D_1$衡量由投影方向扰动引起的总体分位数移动,$D_2$衡量在投影方向固定时经验分位数的波动,$D_3$是Bahadur型余项。

英文摘要

We study sample quantiles of distributions indexed by estimated parameters, with a focus on Value-at-Risk related to linear projections of financial returns whose underlying probability law is heavy-tailed. In this setting, the projection direction and the empirical quantile threshold are estimated from the data, so the standard Bahadur representation under a fixed distribution does not separate the distinct sources of instability. A canonical starting point is Bahadur's representation, which expresses the sample quantile through the empirical distribution function plus a remainder term \cite{bahadur1966}. Empirical-process theory provides a usable scaffolding through the mechanics of half-spaces, symmetric differences, and Glivenko--Cantelli uniform convergence. These tools yield stability bounds, but absorb changes in projection direction and changes in quantile threshold into a single symmetric-difference measure. Interestingly, a global uniform-convergence requirement is imposed on what is intrinsically a local quantile-stability problem. This paper introduces a Q-Q orthogonality formulation for separating projection-direction and quantile-threshold effects. The object of interest is the difference between the empirical quantile computed using the estimated projection direction and the population quantile computed at the reference projection direction. We decompose this difference into three terms, $\hat q_α(\hat w)-q_α(w_0)=D_1+D_2+D_3$. Here, $D_1$ measures the population quantile movement induced by perturbing the projection direction, $D_2$ measures the empirical quantile fluctuation with the projection direction held fixed, and $D_3$ is the Bahadur-type remainder.
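One way to read the three-term decomposition is sketched below. The explicit forms of $D_1$, $D_2$, and $D_3$ are an assumption consistent with the verbal description in the abstract, not formulas taken from the paper. The symbols $F_{w}$ and $f_{w}$ (distribution function and density of the projected returns $w^\top X$), $F_{n,w}$ (the empirical distribution function), and $R_n$ (defined as whatever is left over, so that the identity holds exactly) are introduced only for this illustration.

```latex
\begin{aligned}
\hat q_\alpha(\hat w) - q_\alpha(w_0)
  &= \underbrace{q_\alpha(\hat w) - q_\alpha(w_0)}_{D_1}
   + \underbrace{\frac{\alpha - F_{n,\hat w}\bigl(q_\alpha(\hat w)\bigr)}
                      {f_{\hat w}\bigl(q_\alpha(\hat w)\bigr)}}_{D_2}
   + \underbrace{R_n(\hat w)}_{D_3}.
\end{aligned}
```

Under this reading, $D_1$ involves only population quantities and responds to the direction error $\hat w - w_0$, while $D_2$ is the leading Bahadur term evaluated with the direction held at $\hat w$.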

2605.17767 2026-05-25 stat.ML cs.LG

Feature Learning in Linear-Width Two-Layer Networks: Two vs. One Step of Gradient Descent

线性宽度双层网络中的特征学习:梯度下降的两步 vs 一步

Behrad Moniri, Hamed Hassani

AI总结 本文研究了在宽度线性增长的两层神经网络中特征学习的行为,重点分析了梯度下降第二步更新时隐藏层权重的变化。作者超越了之前仅分析单步更新的研究,揭示了第二步更新中权重的谱特性,表明其行为类似于具有多个异常值的尖峰随机矩阵,这些异常值对应于学习到的不同方向。研究还发现,通过重复使用训练批次而非独立批次,可以学习到信息指数大于一的方向,表明批次重用在宽网络中仍具有优势。

详情
AI中文摘要

我们在线性宽度机制下研究双层神经网络中的特征学习,其中隐藏神经元数量、样本量和输入维度成比例缩放。尽管近期工作分析了该机制下通过单步梯度下降更新第一层权重的特征学习,但这种单步更新方案存在根本性限制:权重更新近似秩一,仅捕获单个方向,且要求目标函数的信息指数为1。本文超越单步更新,完整刻画了步长$η_1\asymp N^{α_1}$和$η_2 \asymp N^{α_2}$($α_1, α_2 \in [0,0.5)$,$N$为隐藏神经元数)的梯度下降\textit{第二步}过程中学习的特征。我们推导了更新权重的谱特征,证明其表现为具有多个离群点的尖峰随机矩阵,每个离群点对应一个学习方向。我们证明离群点数量由参数$α_1, α_2$通过$\lfloor \frac{α_2}{1/2 - α_1} \rfloor$决定。此外,通过分析学习方向与目标函数之间的对齐,我们发现了独立批次与重用批次训练之间的差距。独立批次将学习限制在信息指数为1的方向上,而批重用使得第二步更新能够捕获信息指数超过1的方向,前提是$α_1, α_2$选择得当。这表明先前在窄宽度机制中观察到的批重用优势在线性宽度极限下仍然存在。通过刻画这些早期阶段的演化,我们的工作为研究现代过参数化网络中的优化和特征学习现象提供了一个易处理的框架。

英文摘要

We study feature learning in two-layer neural networks within the linear-width regime, where the number of hidden neurons, sample size, and input dimension scale proportionally. While recent work has analyzed feature learning via a single step of gradient descent on the first layer weights in this regime, such one-step update schemes are fundamentally limited: the update to the weights is approximately rank-one, captures only a single direction, and requires the target function to have an information exponent of one. In this paper, we go beyond one-step updates to provide a full characterization of the features learned during the \textit{second step} of gradient descent with step-sizes $η_1\asymp N^{α_1}$ and $η_2 \asymp N^{α_2}$ for $α_1, α_2 \in [0,0.5)$, where $N$ is the number of hidden neurons. We derive a spectral characterization of the updated weights, demonstrating they behave as a spiked random matrix with multiple outliers, each corresponding to a learned direction. We show that the number of the outliers is determined by the parameters $α_1, α_2$ through $\lfloor \frac{α_2}{1/2 - α_1} \rfloor$. Furthermore, by analyzing the alignment between the learned directions and the target function, we identify a gap between training with independent versus reused batches. While independent batches restrict learning to directions with an information exponent of one, batch reuse enables the second update to capture directions even when the information exponent exceeds one, provided that $α_1, α_2$ are chosen properly. This shows that the benefits of batch reuse, previously observed in narrow-width regimes, persist in the linear-width limit as well. By characterizing these early-phase evolutions, our work proposes a tractable framework for studying optimization and feature learning phenomenology in modern overparameterized networks.
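The outlier count $\lfloor \frac{α_2}{1/2 - α_1} \rfloor$ is easy to evaluate directly. The helper below is a small illustration; the name `num_outliers` is hypothetical, and exact rationals are used so that boundary cases are not disturbed by floating-point rounding.

```python
from fractions import Fraction
from math import floor

def num_outliers(alpha1, alpha2):
    """Evaluate floor(alpha2 / (1/2 - alpha1)) for exponents in [0, 1/2)."""
    a1, a2 = Fraction(alpha1), Fraction(alpha2)
    half = Fraction(1, 2)
    if not (0 <= a1 < half and 0 <= a2 < half):
        raise ValueError("both exponents must lie in [0, 0.5)")
    return floor(a2 / (half - a1))
```

Passing the exponents as strings or `Fraction` objects, for example `num_outliers("0.3", "0.4")`, keeps the arithmetic exact. The count grows as $α_1$ approaches $1/2$, since the denominator shrinks.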

2605.16606 2026-05-25 stat.ME stat.AP

Beyond the Composite: Enhancing Trial Analysis through a Divide & Conquer Approach to 'Days Alive and at Home': Insights from the NOTACS trial

超越复合指标:通过分治法增强对“存活且居家天数”的试验分析——来自NOTACS试验的见解

Letao Yuan, Sofía S. Villar, Dominique-Laurent Couturier

AI总结 本文针对围手术期试验中常用但统计特性复杂的“在家生存天数”(DAH)这一患者中心结局指标,提出了一种新的“分而治之”建模方法,将DAH分解为多个独立部分分别建模,从而提升模型拟合效果。该方法有效解决了DAH分布零膨胀、左偏和双峰等特性带来的统计分析难题,为基于DAH的试验样本量计算和分析方法提供了更可靠的依据,具有广泛的应用前景。

详情
Comments
35 pages, 8 figures, 2 tables
AI中文摘要

“存活且居家天数”(DAH)是一种近期用于围手术期试验的以患者为中心的结果指标,定义为随访期间患者在家中的天数。DAH通常呈零膨胀、左偏、双峰分布。其他日益常用的复合终点,如无呼吸机存活天数,也具有这些统计特征,这些特征源于将生存与另一个临床相关的计数结果合并为一个综合指标。DAH及类似终点的一个关键挑战是缺乏易于识别的分布形式,这使将其作为主要终点的试验的统计设计复杂化,特别是在中心极限定理可能不适用的样本量计算和最终分析的稳健性方面。利用NOTACS试验(ISRCTN14092678)中期数据的200个数据点(其主要终点为DAH),我们开发了一种新颖的“分治”模型,将DAH分解为不同的部分分别建模。据我们所知,这种模型此前未用于DAH。我们证明,与现有替代方法相比,我们的方法显著改善了模型拟合,能够生成更合适的DAH数据,用于基于模拟的样本量计算和统计检验的操作特性评估。超越NOTACS,我们的工作对于使用DAH或类似复合终点的其他试验的设计和分析具有巨大潜力。

英文摘要

"Days alive and at home" (DAH) is a recent patient-centered outcome measure for perioperative trials, defined as the number of days a patient spends at home during the follow-up period. DAH typically follows a zero-inflated, left-skewed, bi-modal distribution. Other increasingly used complex endpoints, such as days alive without a ventilator, share these statistical features arising from combining survival with another clinically relevant count outcome into a single, comprehensive measure. A key challenge for DAH and similar endpoints is the lack of a readily identifiable distributional form, which complicates the statistical design of trials using it as the primary endpoint, particularly regarding the robustness of sample size calculations and final analyses where the central limit theorem might not be suitable. Using 200 data points from the interim data of the NOTACS trial (ISRCTN14092678), whose primary endpoint was DAH, we developed a novel 'Divide & Conquer' model that breaks DAH into distinct parts modeled individually. To our knowledge, such a model has not been used before for DAH. We demonstrate that our approach significantly improves model fit compared to existing alternatives, enabling more suitable DAH data generation that can be used for simulation-based sample size calculations and evaluation of operating characteristics of the statistical test(s). Beyond NOTACS, our work has large potential to inform the design and analysis of other trials using DAH or similar complex endpoints.

2605.11639 2026-05-25 physics.ao-ph math.ST stat.TH

Enabling High-Accuracy Data Assimilation with Limited Ensembles via Machine Learning-Based Covariance Correction

基于机器学习协方差校正的有限集合高精度数据同化

Zhou Yao, Zhilin Li, Li Zhao, Zeng Liu, Zhaokuan Lu, Seungnam Kim, Guangyao Wang

AI总结 本文研究如何在数据同化中利用机器学习方法提高有限集合下的估计精度。提出了一种基于多层感知机(MLP)的协方差修正方法,用于预测有限集合与足够大集合之间的误差协方差差异,并将其引入集合卡尔曼滤波(EnKF)中,从而提升协方差矩阵的准确性。实验表明,该方法在保持计算效率的同时,显著提高了分析精度,为高维非线性系统的数据同化提供了可行的新途径。

详情
AI中文摘要

数据同化(DA)将数值模型预报与观测相结合以实现最优状态估计。基于集合的方法,如集合卡尔曼滤波(EnKF),被广泛用于高维非线性动力系统的状态估计。然而,其性能强烈依赖于集合大小,因此在分析精度和计算成本之间存在权衡问题。为解决此问题,本研究提出一种基于机器学习的EnKF框架,能够在相对较小的集合大小下保持高精度。具体而言,构建一个多层感知机(MLP)函数来预测由有限集合估计的预报误差协方差与由足够大集合估计的协方差之间的差异,其中后者被假定为真实值的准确近似。然后,通过逐元素缩放策略将该预测的协方差差异项纳入EnKF算法,得到修正的预报协方差矩阵,该矩阵更好地逼近真实不确定性水平,并依次产生更准确的分析结果。为了证明所提算法的可行性和鲁棒性,我们在各种配置下对Lorenz-63和Lorenz-96系统进行了一组数值实验,结果一致表明,所提算法在相同有限集合大小下显著优于标准EnKF,实现了显著更高的分析精度,同时保持计算效率。该方法为高维非线性动力系统的准确且计算高效的数据同化提供了一条实用可行的途径。

英文摘要

Data assimilation (DA) integrates numerical model forecasts with observations to achieve optimal state estimation. Ensemble-based methods, such as the ensemble Kalman filter (EnKF), are widely used for state estimation in high-dimensional, nonlinear dynamical systems. However, their performance depends strongly on the ensemble size, which creates a tradeoff between analysis accuracy and computational cost. To address this problem, this study presents a machine learning-based EnKF framework that maintains high accuracy with a relatively small ensemble size. Specifically, a multilayer perceptron (MLP) is built to predict the difference between the forecast error covariance estimated from a limited ensemble and that estimated from a sufficiently large ensemble, with the latter assumed to be an accurate approximation of the underlying truth. This predicted covariance difference term is then incorporated into the EnKF algorithm via an element-wise scaling strategy, resulting in an amended forecast covariance matrix that better approximates the true uncertainty level and in turn produces more accurate analysis results. To demonstrate the feasibility and robustness of the proposed algorithm, we perform a set of numerical experiments with the Lorenz-63 and Lorenz-96 systems under various configurations. The results consistently indicate that the proposed algorithm significantly outperforms the standard EnKF at the same limited ensemble size, achieving notably higher analysis accuracy while remaining computationally efficient. This approach provides a practical pathway to accurate and computationally efficient data assimilation for high-dimensional, nonlinear dynamical systems.

2605.11490 2026-05-25 cs.LG stat.ML

Adaptive Calibration in Non-Stationary Environments

非平稳环境中的自适应校准

Junyan Liu, Haipeng Luo, Lillian J. Ratliff

AI总结 在非平稳环境中实现自适应校准是现代AI系统中的核心挑战。本文提出了一类能够根据环境非平稳程度自动调整校准误差的在线预测算法,在i.i.d.和对抗性环境之间实现平滑过渡。该方法在多种校准度量下均取得了理论保证,其误差上界在平稳和对抗性场景下均达到最优,并扩展了先前相关工作,引入了基于阶段的调度策略和预测空间的非均匀划分技术。

详情
Comments
Added results for piecewise-stationary environments and included a comparison with the concurrent work of Huang et al. (arXiv:2605.09273)
AI中文摘要

在现代AI系统中,进行校准的在线预测是一个核心挑战。现有文献大多关注完全对抗性环境,其中结果可能是任意的,导致算法保守,在更温和的设置(如结果近乎平稳)中表现次优。这一差距引发了一个自然问题:我们能否设计在线预测算法,其校准误差自动适应环境的非平稳程度,在独立同分布和对抗性场景之间平滑插值?我们对此问题给出肯定回答,并开发了一套算法,在多种校准度量下实现自适应校准保证。具体地,设$T$为轮数,$K$为环境中未知的独立同分布段数,$C\in[0,T]$为另一个未知的非平稳度量(定义为均值结果的最小$\ell_1$偏差),我们的算法对$\ell_1$校准误差达到$\widetilde{O}(\min\{\sqrt{T}+(TC)^{\frac{1}{3}}, \sqrt{KT}\})$,对$\ell_2$和伪KL校准误差均达到$\widetilde{O}(\min\{(1+C)^{\frac{1}{3}}, K\})$。这些界匹配平稳情况($C=0$且$K=1$)的最优率,并在完全对抗性场景($C, K=\Omega(T)$)中恢复已知保证。我们的方法建立在并扩展了先前工作[Hu等人,2026,Luo等人,2025]的基础上,引入基于epoch的调度以及对预测空间进行新颖的非均匀划分,在底层真实值附近分配更精细的分辨率。

英文摘要

Making calibrated online predictions is a central challenge in modern AI systems. Much of the existing literature focuses on fully adversarial environments where outcomes may be arbitrary, leading to conservative algorithms that can perform suboptimally in more benign settings, such as when outcomes are nearly stationary. This gap raises a natural question: can we design online prediction algorithms whose calibration error automatically adapts to the degree of non-stationarity in the environment, smoothly interpolating between i.i.d. and adversarial regimes? We answer this question in the affirmative and develop a suite of algorithms that achieve adaptive calibration guarantees under multiple calibration measures. Specifically, with $T$ being the number of rounds, $K$ being the unknown number of i.i.d. segments of the environment, and $C\in[0,T]$ being another unknown non-stationary measure defined as the minimal $\ell_1$ deviation of the mean outcomes, our algorithms attain $\widetilde{O}(\min\{\sqrt{T}+(TC)^{\frac{1}{3}}, \sqrt{KT}\})$ for $\ell_1$ calibration error and $\widetilde{O}(\min\{(1+C)^{\frac{1}{3}}, K\})$ for both $\ell_2$ and pseudo KL calibration error. These bounds match the optimal rates in the stationary case ($C=0$ and $K=1$) and recover known guarantees in the fully adversarial regime ($C, K=Ω(T)$). Our approach builds on and extends prior work [Hu et al., 2026, Luo et al., 2025], introducing an epoch-based scheduling together with a novel non-uniform partition of the prediction space that allocates finer resolution near the underlying ground truth.
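For readers unfamiliar with the calibration measures involved, the sketch below computes the $\ell_1$ and $\ell_2$ calibration errors of a finished prediction sequence, assuming binary outcomes and the common bucket-by-prediction-value definitions. The function name `calibration_errors` is hypothetical, and the pseudo KL measure is not covered.

```python
from collections import defaultdict

def calibration_errors(preds, outcomes):
    """Return (l1, l2) calibration errors of binary-outcome forecasts."""
    gap = defaultdict(float)  # sum of (p - y_t) over rounds with prediction p
    count = defaultdict(int)  # number of rounds with prediction p
    for p, y in zip(preds, outcomes):
        gap[p] += p - y
        count[p] += 1
    l1 = sum(abs(g) for g in gap.values())
    l2 = sum(g * g / count[p] for p, g in gap.items())
    return l1, l2
```

A forecaster who always predicts 0.5 on the outcomes 1, 0, 1, 0 has zero error under both measures, while predicting 0.5 twice on outcomes 1, 1 gives an $\ell_1$ error of 1 and an $\ell_2$ error of 0.5.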

2605.07050 2026-05-25 math.PR math-ph math.MP math.ST stat.TH

Universality of the fluctuations of the free energy in generalized Sherrington-Kirkpatrick models and the log likelihood ratio in spiked Wigner models

广义Sherrington-Kirkpatrick模型中自由能的涨落与尖峰Wigner模型中对数似然比的普适性

Hyunsuk Choo, Yoochan Han, Ji Oon Lee

AI总结 本文研究了广义Sherrington-Kirkpatrick模型中自由能的波动以及尖峰Wigner模型中对数似然比的统计特性,在高温/亚临界 regime 下,证明了这些波动的极限分布是高斯分布,并且该结果具有普适性,不依赖于无序分布或先验分布的具体形式,仅依赖于模型中少数参数的均值和方差。证明方法基于多图展开,为分析这两类模型提供了统一的框架。

详情
Comments
50 pages, 1 figure; Typos corrected
AI中文摘要

我们考虑高温/次临界区域下广义Sherrington-Kirkpatrick模型中自由能的涨落以及尖峰Wigner模型的对数似然比。我们证明在适当假设下,涨落的极限律是高斯分布,并且该结果具有普适性,即除了极限律的均值和方差依赖于模型的少数参数外,它不依赖于无序或先验的分布。证明基于多图展开,为分析这两个模型提供了统一的方法。

英文摘要

We consider the fluctuations of the free energy in generalized Sherrington-Kirkpatrick models and the log likelihood ratio of spiked Wigner models in the high temperature/subcritical regime. We prove that the limiting laws of the fluctuations are Gaussian under suitable assumptions, and the result is universal in the sense that it does not depend on the distribution of the disorder or the prior except that the means and the variances of the limiting laws depend on a few parameters of the model. The proof is based on the multigraph expansion that provides a unified approach to analyze both models.

2604.19353 2026-05-25 math.ST stat.ME stat.TH

Asymptotic e-processes

渐近 e-过程

Pierre-François Massiani, Sebastian Schulze, Mattes Mollenhauer

AI总结 本文研究了一类称为渐近e-过程的双指标随机过程,旨在在监测时间索引下渐近地满足e-过程的性质,适用于存在模型误设或估计误差的序贯假设检验问题。作者提出了渐近Ville不等式,并探讨了渐近e-过程与渐近上鞅的关系,还给出了包括校准、渐近e-变量累积乘积等在内的多种构建方法,为渐近序贯任意时有效推断提供了理论基础和实用工具。

详情
Comments
49 pages, 3 figures. Under review, may be subject to changes
AI中文摘要

我们研究了渐近 e-过程的概念,这是一个双索引随机过程 $(E_{m,n})_{m,n\in\mathbb{N}}$,当近似指标 $m\to\infty$ 时,它渐近地具有沿监测时间指标 $n$ 的 e-过程的性质。这是对这一最近引入概念的首次深入研究,该概念在渐近序列任意时间有效推断中具有重要意义。我们的理论源于序列假设检验的实际应用,其中由于模型误设或估计误差,e-变量和 e-过程只能近似地从观测中构造。在技术上,渐近 e-过程满足渐近版本的 Ville 不等式,该不等式在监测时间范围 $r_m$ 内一致地限制了 $(E_{m,n})_{m,n\in\mathbb{N}}$ 的超出概率。我们证明了允许有限值 $r_m$ 的必要性,并在 $r_m\to\infty$ 时渐近地恢复真正的任意时间有效保证。我们推导了渐近 e-过程的各种性质,并研究了它们与渐近鞅的联系。我们还研究了其构造的一般方法,例如校准、渐近 e-变量的累积乘积以及依赖于估计参数的 e-过程的监测。后一种构造构成了渐近事后推断背景下最近方法的推广。

英文摘要

We investigate the concept of an asymptotic e-process, which is a doubly-indexed stochastic process $(E_{m,n})_{m,n\in\mathbb{N}}$ that possesses, asymptotically for an approximation index $m\to\infty$, the properties of an e-process along a monitoring time index $n$. This constitutes the first in-depth study of this recently introduced concept, which is relevant in asymptotic sequential anytime-valid inference. Our theory is motivated by practical applications in sequential hypothesis testing, in which e-variables and e-processes can only be constructed approximately from observations due to model misspecification or estimation errors. Technically, asymptotic e-processes satisfy an asymptotic version of Ville's inequality, which bounds excursion probabilities of $(E_{m,n})_{m,n\in\mathbb{N}}$ uniformly over $n$ up to a monitoring time horizon $r_m$. We show the necessity of allowing for finite values of $r_m$, recovering truly anytime-valid guarantees asymptotically if $r_m\to\infty$. We derive various properties of asymptotic e-processes, and study their connections to asymptotic supermartingales. We also investigate general methods for their construction such as calibration, the cumulative product of asymptotic e-variables, and the monitoring of an e-process that depends on an estimated parameter. The latter construction constitutes a generalization of a recent approach within the context of asymptotic post-hoc inference.
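As background, the exact (non-asymptotic) guarantee that the asymptotic theory approximates can be illustrated with a textbook example: a likelihood-ratio e-process for a simple Bernoulli null, monitored until it crosses $1/\alpha$. By Ville's inequality, the crossing probability under the null is at most $\alpha$. The function names and the choice of alternative `q` are assumptions for illustration, and nothing here is specific to the paper's construction.

```python
def likelihood_ratio_eprocess(xs, q=0.7, p0=0.5):
    """Running product of likelihood ratios for Bernoulli(q) vs Bernoulli(p0)."""
    e, path = 1.0, []
    for x in xs:
        e *= (q if x == 1 else 1.0 - q) / (p0 if x == 1 else 1.0 - p0)
        path.append(e)
    return path

def first_rejection(path, alpha=0.05):
    """First monitoring time n with E_n >= 1/alpha, or None if never reached."""
    for n, e in enumerate(path, start=1):
        if e >= 1.0 / alpha:
            return n
    return None
```

In the paper's setting the factors would only be approximately valid e-variables, for instance because `q` or the null model is estimated, which is why the bound is asserted only up to a horizon $r_m$.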

2604.13779 2026-05-25 math.ST stat.TH

The Integer-valued Moving-Average Random Field

整数值移动平均随机场

Angelika Silbernagel, Christian H. Weiß

AI总结 本文提出并研究了一种用于计数随机场的整数值移动平均(INMA)模型。该模型推导了其边缘分布和空间依赖结构的闭式表达式,适用于任意阶数和多边情况,并提供了双变量分布和自协方差的一般表达式。研究还表明,INMA 随机场可以具有泊松边缘分布,并能够表现出多种可解释的空间依赖结构,文中还通过实际数据案例展示了模型的应用。

详情
Comments
18 pages, 3 figures, 2 tables
AI中文摘要

提出并研究了一种用于计数随机场的整数值移动平均(INMA)模型。对于任意模型阶数并涵盖多边情形,推导了其边际分布和空间依赖结构的闭式表达式。特别地,给出了二元分布和自协方差的通用表达式。结果表明,INMA随机场可以配备(其中包括)泊松边际分布。还证明了可以实现不同且易于解释的依赖结构。为说明起见,我们讨论了一个真实数据示例,并提出了对给定空间依赖结构的INMA近似。

英文摘要

An integer-valued moving average (INMA) model for count random fields is proposed and investigated. Closed-form expressions are derived for both its marginal distribution and spatial dependence structure, for arbitrary model order and also covering the multilateral case. In particular, general expressions for bivariate distributions and autocovariances are provided. It is shown that the INMA random field can be equipped (among others) with a Poisson marginal distribution. It is also demonstrated that different and well-interpretable dependence structures are possible. For illustration, we discuss a real-world data example and propose an INMA approximation to a given spatial dependence structure.
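A small simulation can make the model class concrete. The sketch below assumes a unilateral first-order field built from Poisson innovations and binomial thinning, which is the usual INMA construction but is not spelled out in the abstract. The function names, the two-neighbour structure, and the parameters `beta_up` and `beta_left` are illustrative choices.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def thin(n, beta, rng):
    """Binomial thinning: keep each of n counts independently with prob beta."""
    return sum(1 for _ in range(n) if rng.random() < beta)

def simulate_inma_field(rows, cols, lam, beta_up, beta_left, rng):
    """X[i][j] = eps[i][j] + beta_up o eps[i-1][j] + beta_left o eps[i][j-1]."""
    eps = [[poisson(lam, rng) for _ in range(cols + 1)] for _ in range(rows + 1)]
    return [[eps[i][j]
             + thin(eps[i - 1][j], beta_up, rng)
             + thin(eps[i][j - 1], beta_left, rng)
             for j in range(1, cols + 1)]
            for i in range(1, rows + 1)]
```

Because thinned Poisson counts are again Poisson and the three summands are independent, each site in this sketch is Poisson with mean `lam * (1 + beta_up + beta_left)`, in line with the Poisson marginal mentioned in the abstract.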

2604.07796 2026-05-25 stat.ML cs.IT cs.LG math.IT math.ST stat.TH

Order-Optimal Sequential 1-Bit Mean Estimation in General Tail Regimes

一般尾分布下的最优序贯1比特均值估计

Ivan Lau, Jonathan Scarlett

AI总结 本文研究了在1比特通信约束下的均值估计问题,提出了一种基于随机阈值查询的自适应均值估计方法,每个1比特反馈表示样本是否超过顺序选择的阈值。该估计器对任意具有有界均值和有界中心矩的分布具有$(ε, δ)$-PAC性质,且在所有尾部分布情形下均达到最优的样本复杂度。研究还揭示了1比特量化在有限方差情况下的基本性能限制,并展示了自适应方法相比非自适应方法在样本效率上的显著优势。

详情
Comments
This article substantially extends the AISTATS version, arXiv:2509.21940
AI中文摘要

本文研究了1比特通信约束下的均值估计问题。我们提出了一种新颖的自适应均值估计器,仅基于随机化阈值查询,其中每个1比特输出指示给定样本是否超过顺序选择的阈值。对于任何具有有界均值$\mu\in [-\lambda, \lambda]$和有界$k$阶中心矩$\mathbb{E}[|X-\mu|^k] \le \sigma^k$($k>1$固定)的分布,我们的估计器是$(\varepsilon, \delta)$-PAC的。此外,我们的样本复杂度在所有此类尾分布下都是阶数最优的,即对于每个这样的$k$值。对于$k\neq 2$,我们的估计器的样本复杂度匹配未量化极小极大下界加上不可避免的$O(\log(\lambda/\sigma))$定位代价。对于有限方差情形($k=2$),我们的估计器的样本复杂度有额外的乘法$O(\log(\sigma/\varepsilon))$惩罚,并且我们建立了新的信息论下界,表明该惩罚是1比特量化的基本限制。我们还建立了一个显著的适应性差距:对于阈值查询和更一般的区间查询,任何非自适应估计器的样本复杂度必须与搜索空间参数$\lambda/\sigma$线性增长,使其样本效率远低于我们的自适应方法。最后,我们提出了算法变体,这些变体(i)处理未知的采样预算,(ii)在给定(可能宽松的)界限下适应未知尺度参数$\sigma$,(iii)仅需两个自适应阶段即可实现阶数最优样本复杂度,但以更一般的1比特查询为代价,以及(iv)利用每个1比特查询的多个局部样本按比例减少通信成本。

英文摘要

In this paper, we study the problem of mean estimation under 1-bit communication constraints. We propose a novel adaptive mean estimator based solely on randomized threshold queries, where each 1-bit outcome indicates whether a given sample exceeds a sequentially chosen threshold. Our estimator is $(ε, δ)$-PAC for any distribution with a bounded mean $μ\in [-λ, λ]$ and a bounded $k$-th central moment $\mathbb{E}[|X-μ|^k] \le σ^k$ for any fixed $k > 1$. Moreover, our sample complexity is order-optimal in all such tail regimes, i.e., for every such $k$ value. For $k \neq 2$, our estimator's sample complexity matches the unquantized minimax lower bounds plus an unavoidable $O(\log(λ/σ))$ localization cost. For the finite-variance case ($k=2$), our estimator's sample complexity has an extra multiplicative $O(\log(σ/ε))$ penalty, and we establish a novel information-theoretic lower bound showing that this penalty is a fundamental limit of 1-bit quantization. We also establish a significant adaptivity gap: for both threshold queries and more general interval queries, the sample complexity of any non-adaptive estimator must scale linearly with the search space parameter $λ/σ$, rendering it vastly less sample efficient than our adaptive approach. Finally, we present algorithmic variants that (i) handle an unknown sampling budget, (ii) adapt to an unknown scale parameter $σ$ given (possibly loose) bounds, (iii) require only two stages of adaptivity to achieve order-optimal sample complexity at the expense of more general 1-bit queries, and (iv) leverage multiple local samples per 1-bit query to proportionally reduce communication costs.
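To make the query model concrete, here is a sketch of the simplest non-adaptive baseline built from randomized threshold queries. It assumes every sample lies in $[-λ, λ]$, which is stronger than the paper's moment conditions, and it is not the adaptive estimator proposed in the paper. The function name `one_bit_mean_estimate` is hypothetical.

```python
import random

def one_bit_mean_estimate(samples, lam, rng):
    """Estimate the mean of samples in [-lam, lam] from one bit per sample."""
    bits = []
    for x in samples:
        threshold = rng.uniform(-lam, lam)      # randomized threshold query
        bits.append(1 if x > threshold else 0)  # the only information sent
    # P(bit = 1 | x) = (x + lam) / (2 * lam), so E[bit] = (mu + lam) / (2 * lam)
    return 2.0 * lam * sum(bits) / len(bits) - lam
```

The variance of this estimate scales with $λ^2$ rather than $σ^2$, which gives some intuition for why non-adaptive schemes pay for the size of the search space and why choosing thresholds sequentially can help.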

2602.08927 2026-05-25 stat.ML cs.LG stat.ME

Online monotone density estimation and log-optimal calibration

在线单调密度估计与对数最优校准

Rohan Hore, Ruodu Wang, Aaditya Ramdas

AI总结 本文研究在线单调密度估计问题,即从序列观测数据中可预测地构建密度估计器。作者提出了两种在线估计方法:一种是经典Grenander估计器的在线版本,另一种是受在线学习中指数加权方法启发的专家聚合估计器。理论分析表明,在密度单调的设定下,所提估计器与真实密度之间的累积对数似然差距具有$O(n^{1/3})$的上界,并且专家聚合估计器相对于最优离线估计器具有$\sqrt{n\log{n}}$的路径遗憾界。此外,作者还展示了该问题与序贯假设检验中对数最优p值到e值校准的联系,并基于所提方法构建了经验自适应的校准器。

详情
Comments
31 pages, 2 figures
AI中文摘要

我们研究在线单调密度估计问题,其中密度估计器必须根据顺序观测数据以可预测的方式构建。我们提出两种在线估计器:经典Grenander估计器的在线类比,以及受在线学习文献中指数加权方法启发的专家聚合估计器。在良好指定的随机设定下,即底层密度是单调的,我们证明在线估计器与真实密度之间的期望累积对数似然差距具有$O(n^{1/3})$界。我们进一步建立了专家聚合估计器相对于事后选择的离线最优单调估计器的$\sqrt{n\log{n}}$路径后悔界,对观测序列的正则性假设要求极低。作为一个独立兴趣的应用,我们证明构建用于序贯假设检验的对数最优p-to-e校准器的问题可以表述为在线单调密度估计问题。我们调整所提出的估计器以构建经验自适应的p-to-e校准器,并证明其最优性。数值实验验证了理论结果。

英文摘要

We study the problem of online monotone density estimation, where density estimators must be constructed in a predictable manner from sequentially observed data. We propose two online estimators: an online analogue of the classical Grenander estimator, and an expert aggregation estimator inspired by exponential weighting methods from the online learning literature. In the well-specified stochastic setting, where the underlying density is monotone, we show that the expected cumulative log-likelihood gap between the online estimators and the true density admits an $O(n^{1/3})$ bound. We further establish a $\sqrt{n\log{n}}$ pathwise regret bound for the expert aggregation estimator relative to the best offline monotone estimator chosen in hindsight, under minimal regularity assumptions on the observed sequence. As an application of independent interest, we show that the problem of constructing log-optimal p-to-e calibrators for sequential hypothesis testing can be formulated as an online monotone density estimation problem. We adapt the proposed estimators to build empirically adaptive p-to-e calibrators and establish their optimality. Numerical experiments illustrate the theoretical results.
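The link between calibrators and monotone densities can be seen in the standard fixed calibrator $f(p) = κ p^{κ-1}$ with $κ \in (0,1)$, which is a decreasing density on $[0,1]$. The sketch below implements only this classical, non-adaptive choice; the function name `power_calibrator` is hypothetical, and the paper's data-adaptive calibrators are not reproduced here.

```python
def power_calibrator(p, kappa=0.5):
    """Map a p-value in (0, 1] to an e-value via f(p) = kappa * p**(kappa - 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie in (0, 1)")
    return kappa * p ** (kappa - 1.0)
```

Since $f$ integrates to one over $[0,1]$, the output has expectation one when the p-value is uniform, so it is a valid e-value under the null. Choosing the decreasing function that maximizes expected log growth is where a monotone density estimate of the p-value distribution becomes useful.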

2602.07252 2026-05-25 stat.ME

Beyond Euclidean Summaries: Online Change Point Detection for Distribution-Valued Data

超越欧几里得摘要:面向分布值数据的在线变点检测

Yingyan Zeng, Yujing Huang, Xiaoyu Chen

AI总结 本文研究了针对分布值数据的在线突变点检测问题,传统方法依赖于固定维度的欧几里得特征,可能忽略分布形状或结构的变化。作者提出了一种基于2-Wasserstein空间的内在分布值突变点检测框架,通过将经验分布映射到预变化Fréchet均值的切空间,实现了对分布变化的更精确捕捉。该方法在理论上有保证,并在合成和实际数据中表现出更优的检测性能和更低的延迟。

详情
Journal ref
Proceedings of the 43rd International Conference on Machine Learning, Seoul, South Korea. PMLR 306, 2026.
AI中文摘要

现有的在线变点检测方法依赖于固定维度的欧几里得摘要,隐含假设分布变化能被基于矩或基于特征的表示很好地捕捉。它们可能掩盖分布形状或几何上的重要变化。我们提出了一种内在的分布值变点检测框架,将流式批处理数据视为2-Wasserstein空间上的随机过程。我们的方法通过将每个经验分布映射到相对于变化前Fréchet重心的切空间,检测该过程规律的变化,从而得到2-Wasserstein空间的参考中心局部线性化。这种表示通过将经典多元监控统计量适配到切场,使得顺序检测器得以实现。我们提供了理论保证,并通过合成和真实实验证明,与基于矩和无模型基线相比,我们的方法在匹配的$\mathrm{ARL}_0$下以更短的检测延迟检测到复杂的分布变化。代码可在https://github.com/yyzeng43/IDD-icml获取。

英文摘要

Existing online change-point detection (CPD) methods rely on fixed-dimensional Euclidean summaries, implicitly assuming that distributional changes are well captured by moment-based or feature-based representations. They can obscure important changes in distributional shape or geometry. We propose an intrinsic distribution-valued CPD framework that treats streaming batch data as a stochastic process on the 2-Wasserstein space. Our method detects changes in the law of this process by mapping each empirical distribution to a tangent space relative to a pre-change Fréchet barycenter, yielding a reference-centered local linearization of 2-Wasserstein space. This representation enables sequential detectors by adapting classical multivariate monitoring statistics to tangent fields. We provide theoretical guarantees and demonstrate, via synthetic and real-world experiments, that our approach detects complex distributional shifts with reduced detection delay at matched $\mathrm{ARL}_0$ compared with moments-based and model-free baselines. The code is available at https://github.com/yyzeng43/IDD-icml .
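In one dimension the tangent-space representation has a simple closed form, which the sketch below uses: the log map of a distribution at a reference is the difference of their quantile functions. The restriction to univariate batches, the quantile grid, and all function names are assumptions for illustration and are not taken from the authors' code.

```python
def quantile_grid(sample, m):
    """Empirical quantiles at levels (k + 0.5) / m, k = 0..m-1."""
    xs = sorted(sample)
    n = len(xs)
    return [xs[min(n - 1, int((k + 0.5) / m * n))] for k in range(m)]

def tangent_vector(sample, reference_quantiles):
    """Log map at the reference: transport map minus identity, on the grid."""
    q = quantile_grid(sample, len(reference_quantiles))
    return [a - b for a, b in zip(q, reference_quantiles)]

def w2_distance(sample, reference_quantiles):
    """2-Wasserstein distance approximated as the L2 norm of the tangent vector."""
    v = tangent_vector(sample, reference_quantiles)
    return (sum(d * d for d in v) / len(v)) ** 0.5
```

Each incoming batch then becomes a fixed-length vector, so a classical multivariate monitoring statistic could be run on the sequence of tangent vectors, with the reference quantiles obtained by averaging the quantile grids of pre-change batches.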

2602.01449 2026-05-25 math.NA cs.NA math.PR stat.ML

Dimension-Free Multimodal Sampling via Preconditioned Annealed Langevin Dynamics

无维度多模态采样:基于预处理退火朗之万动力学

Lorenzo Baldassari, Josselin Garnier, Knut Solna, Maarten V. de Hoop

AI总结 本文研究了在高维空间中对多模态目标分布进行稳定采样的问题,提出了一种基于预条件退火朗之万动力学(ALD)的维度无关采样方法。通过分析高斯混合目标下的连续时间ALD过程,作者给出了在维度上统一的理论保证,表明在满足特定谱条件时,ALD能够在统一的时间范围内达到预设的KL散度精度。此外,研究还展示了在初始条件不完美和分数估计有偏差的情况下,通过适当的预条件操作可保持采样稳定性,从而实现了维度无关的误差控制。

详情
Comments
ICML 2026
AI中文摘要

设计对于多模态目标的采样算法,使其在底层函数空间问题的有限维近似细化下保持稳定,是一个核心挑战。退火朗之万动力学(ALD)是经典朗之万动力学在此背景下的自然替代方案,因为它通常被观察到能改善跨模态的探索。然而,其经验成功与现有理论之间仍存在差距:在何种条件下可以保证ALD在维度间保持稳定?在本文中,我们通过提供连续时间ALD对高斯混合目标的维度一致分析来弥合这一差距。沿着一条显式退火路径(通过逐渐从目标中移除高斯平滑获得),我们识别出谱条件,将平滑协方差与分量协方差联系起来,在该条件下ALD在维度一致的时间范围内达到KL散度的指定精度。然后,我们在具有不完美初始化和近似分数的微扰区域内建立稳定性。在错误指定的混合分数模型下,我们表明,用谱衰减足够快的算子对ALD进行预处理,可以防止误差项在坐标间累积,从而保持维度一致的控制。

英文摘要

Designing sampling algorithms for multimodal targets that remain stable under refinement of the finite-dimensional approximation of an underlying function-space problem is a central challenge. Annealed Langevin dynamics (ALD) is a natural alternative to classical Langevin in this context, since it is often observed to improve exploration across modes. Yet a gap remains between its empirical success and existing theory: under which conditions can ALD be guaranteed to remain stable across dimensions? In this paper, we bridge this gap by providing a uniform-in-dimension analysis of continuous-time ALD for Gaussian-mixture targets. Along an explicit annealing path obtained by gradually removing Gaussian smoothing from the target, we identify spectral conditions linking the smoothing covariance to the component covariances under which ALD achieves a prescribed accuracy in Kullback-Leibler divergence within a dimension-uniform time horizon. We then establish stability in a perturbative regime with imperfect initialization and approximate scores. Under a misspecified-mixture score model, we show that preconditioning ALD with an operator whose spectrum decays sufficiently fast prevents error terms from accumulating across coordinates and thereby preserves dimension-uniform control.
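A one-dimensional toy version of the annealing path can be written down directly, since a Gaussian mixture convolved with Gaussian noise is again a Gaussian mixture with inflated variances. The sketch below assumes a linear decay of the smoothing variance and a plain Euler-Maruyama discretization. It includes no preconditioning and no numerical safeguards, and the function names are hypothetical.

```python
import math
import random

def smoothed_score(x, weights, means, variances, sigma2):
    """Score of a 1-D Gaussian mixture convolved with N(0, sigma2)."""
    dens, grad = 0.0, 0.0
    for w, m, v in zip(weights, means, variances):
        s = v + sigma2  # component variance after smoothing
        pdf = w * math.exp(-0.5 * (x - m) ** 2 / s) / math.sqrt(2.0 * math.pi * s)
        dens += pdf
        grad += pdf * (m - x) / s
    return grad / dens

def annealed_langevin(x0, weights, means, variances, sigma2_max, steps, dt, rng):
    """Euler-Maruyama steps while the smoothing variance decays linearly to zero."""
    x = x0
    for k in range(steps):
        sigma2 = sigma2_max * (1.0 - (k + 1) / steps)
        drift = smoothed_score(x, weights, means, variances, sigma2)
        x += drift * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
    return x
```

Early in the path the heavily smoothed target is close to unimodal, which is the informal reason annealing tends to help a sampler move between modes before the smoothing is removed.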

2601.22367 2026-05-25 stat.ML cs.LG

Amortized Simulation-Based Inference in Generalized Bayes via Neural Posterior Estimation

通过神经后验估计在广义贝叶斯中进行摊销的基于模拟的推理

Shiyi Sun, Geoff K. Nicholls, Jeong Eun Lee

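AI summary: This paper proposes a neural-posterior-estimation approach to generalized Bayesian inference, in which a temperature parameter $β$ mitigates overconfidence under model misspecification and improves robustness. It gives a fully amortized variational approximation that samples from the posterior for any data and any value of $β$ in a single forward pass, without simulator calls or MCMC at inference time. With two complementary training strategies, the method performs comparably to non-amortized MCMC samplers on several standard simulation-based inference benchmarks.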

详情
Comments
Accepted at ICML 2026
AI中文摘要

广义贝叶斯推理(GBI)通过温度β>0调整损失以减轻过度自信并提高模型误设下的鲁棒性,但现有GBI方法通常依赖昂贵的MCMC或基于SDE的采样器,且必须为每个新数据集和每个β值重新运行。我们通过训练单一数据与β条件神经后验估计器,首次为温度后验族提供了完全摊销的变分近似,使得单次前向传播即可采样,无需模拟器调用或推理时MCMC。我们引入了两种互补的训练路径:一种从温度联合分布中合成流形外样本,另一种使用自归一化重要性采样(SNIS)对固定基础数据集进行重加权。我们证明,SNIS加权目标在有限权重方差下为温度后验提供了一致的前向KL拟合。在四个标准基于模拟的推理基准(包括混沌Lorenz-96系统)中,我们的β摊销估计器在标准双样本指标上实现了具有竞争力的后验近似,在广泛温度范围内匹配了非摊销的基于MCMC的幂后验采样器。

英文摘要

Generalized Bayesian Inference (GBI) tempers a loss with a temperature $β > 0$ to mitigate overconfidence and improve robustness under model misspecification, but existing GBI methods typically rely on costly MCMC or SDE-based samplers and must be re-run for each new dataset and each $β$ value. We give the first fully amortized variational approximation for the tempered posterior family by training a single data- and $β$-conditioned neural posterior estimator that enables sampling in a single forward pass, without simulator calls or inference-time MCMC. We introduce two complementary training routes: one synthesizes off-manifold samples from the tempered joint distribution, and the other reweights a fixed base dataset using self-normalized importance sampling (SNIS). We show that the SNIS-weighted objective provides a consistent forward-KL fit to the tempered posterior with finite weight variance. Across four standard simulation-based inference benchmarks, including the chaotic Lorenz-96 system, our $β$-amortized estimator achieves competitive posterior approximations in standard two-sample metrics, matching non-amortized MCMC-based power-posterior samplers over a wide range of temperatures.
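The SNIS reweighting idea can be illustrated with a minimal sketch. It assumes, hypothetically, that the base samples were generated at β = 1 and that a per-sample loss is available, so that moving to temperature β multiplies the weight of sample i by exp(-(β - 1) · loss_i). The function names are illustrative, and this is not claimed to be the authors' exact training objective.

```python
import math

def snis_weights(losses, beta):
    """Self-normalized importance weights proportional to exp(-(beta - 1) * loss).

    Assumption: the base dataset corresponds to beta = 1, so beta = 1 gives
    uniform weights. The maximum is subtracted before exponentiating to keep
    the computation numerically stable.
    """
    log_w = [-(beta - 1.0) * loss for loss in losses]
    top = max(log_w)
    unnormalized = [math.exp(lw - top) for lw in log_w]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def effective_sample_size(weights):
    """Standard diagnostic 1 / sum(w_i^2) for normalized weights."""
    return 1.0 / sum(w * w for w in weights)
```

A low effective sample size at some β signals high weight variance, which is the regime where a reweighted objective of this kind would be expected to become unreliable.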

2601.20192 2026-05-25 stat.ME

Online Change Point Detection for Multivariate Inhomogeneous Poisson Processes Time Series

多元非齐次泊松过程时间序列的在线变点检测

Xiaokai Luo, Haotian Xu, Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla

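AI summary: This paper studies online change point detection for time series of multivariate inhomogeneous Poisson point processes, a problem with applications in seismology, climate monitoring, and epidemic surveillance. The authors propose an adaptive nonparametric detection procedure that represents the Poisson intensity functions with low-rank matrices; the algorithm is single-pass, with constant computational cost per new observation. They provide theoretical guarantees that control the overall false alarm probability, characterize the detection delay under temporal dependence, and develop a new matrix Bernstein inequality for temporally dependent Poisson point process time series that may be of independent interest.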

详情
AI中文摘要

我们研究多元非齐次泊松点过程时间序列的在线变点检测。该设置常见于地震学、气候监测和流行病监测等应用中,但在机器学习和统计学文献中尚未得到充分探索。我们提出了一种使用低秩矩阵表示多元泊松强度函数的方法,从而得到一种自适应非参数检测程序。我们的算法是单次遍历的,每次新观测仅需常数计算成本,与时间序列的已观测长度无关。我们提供了控制整体误报概率的理论保证,并刻画了时间依赖下的检测延迟。我们还为时间依赖的泊松点过程时间序列推导了一个新的矩阵 Bernstein 不等式,这可能具有独立的研究价值。数值实验表明,我们的方法在统计上稳健且计算高效。

英文摘要

We study online change point detection for multivariate inhomogeneous Poisson point process time series. This setting arises commonly in applications such as earthquake seismology, climate monitoring, and epidemic surveillance, yet remains underexplored in the machine learning and statistics literature. We propose a method that uses low-rank matrices to represent the multivariate Poisson intensity functions, resulting in an adaptive nonparametric detection procedure. Our algorithm is single-pass and requires only constant computational cost per new observation, independent of the elapsed length of the time series. We provide theoretical guarantees to control the overall false alarm probability and characterize the detection delay under temporal dependence. We also develop a new Matrix Bernstein inequality for temporally dependent Poisson point process time series, which may be of independent interest. Numerical experiments demonstrate that our method is both statistically robust and computationally efficient.

2601.19117 2026-05-25 eess.IV cs.CV stat.AP

Optimized $k$-means color quantization of digital images in machine-based and human perception-based colorspaces

基于机器感知和人类感知色彩空间的优化 $k$-均值图像颜色量化

Ranjan Maitra

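AI summary: This study examines $k$-means color quantization of digital images in different colorspaces, comparing RGB, CIE-XYZ, and CIE-LUV/CIE-HCL at several quantization levels. Using the Visual Information Fidelity (VIF) measure to assess image quality, it finds that $k$-means performs best in RGB in about half of the cases, that CIE-XYZ usually does better at higher quantization levels, and that CIE-LUV is best in some cases at lower quantization levels. The study also analyzes how the distributions of hue, chromaticity, and luminance affect the choice of colorspace, giving more detailed guidance for color quantization.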

详情
Journal ref
Journal of Electronic Imaging, Vol. 35, Issue 2, 023002 (Mar 2026)
Comments
25 pages, 11 figures, 5 tables, accepted in the Journal of Electronic Imaging
AI中文摘要

颜色量化使用原始颜色数量的一小部分来表示图像,同时仅最小程度地损失视觉质量。$k$-均值算法在此背景下常用,但主要应用于由三原色组成的基于机器的RGB色彩空间。然而,最近一些研究表明其在基于人类感知的色彩空间中性能有所提升。我们研究了在RGB、CIE-XYZ和CIE-LUV/CIE-HCL色彩空间中,$k$-均值颜色量化在四个量化级别下对148张涵盖广泛场景、主题和设置的多样化数字图像的性能。视觉信息保真度(VIF)度量数值上评估了量化图像的质量,并显示在大约一半的情况下,$k$-均值颜色量化在RGB空间中最佳,而在其他时候,特别是对于更高的量化级别($k$),CIE-XYZ色彩空间通常表现更好。也有一些情况,尤其是在较低的$k$下,最佳性能在CIE-LUV色彩空间中获得。进一步根据图像中色调、色度和亮度分布对性能的分析,为每个色彩空间更适合$k$-均值颜色量化的图像提供了细致的视角和特征描述。

英文摘要

Color quantization represents an image using a fraction of its original number of colors while only minimally losing its visual quality. The $k$-means algorithm is commonly used in this context, but has mostly been applied in the machine-based RGB colorspace composed of the three primary colors. However, some recent studies have indicated its improved performance in human perception-based colorspaces. We investigated the performance of $k$-means color quantization at four quantization levels in the RGB, CIE-XYZ, and CIE-LUV/CIE-HCL colorspaces, on 148 varied digital images spanning a wide range of scenes, subjects and settings. The Visual Information Fidelity (VIF) measure numerically assessed the quality of the quantized images, and showed that in about half of the cases, $k$-means color quantization is best in the RGB space, while at other times, and especially for higher quantization levels ($k$), the CIE-XYZ colorspace is where it usually does better. There are also some cases, especially at lower $k$, where the best performance is obtained in the CIE-LUV colorspace. Further analysis of the performances in terms of the distributions of the hue, chromaticity and luminance in an image presents a nuanced perspective and characterization of the images for which each colorspace is better for $k$-means color quantization.
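For readers unfamiliar with the procedure, $k$-means color quantization can be sketched in a few lines. The sketch below works on RGB triples only, uses Lloyd iterations, and swaps in a deterministic farthest-point initialization for reproducibility; these are illustrative assumptions rather than the study's optimized setup, and the conversion to CIE-XYZ or CIE-LUV is not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_quantize(pixels, k, iters=20):
    """Quantize a list of color triples to k colors with Lloyd's algorithm.

    Initialization is farthest-point (an assumption made here so the result
    is deterministic): each new center is the pixel farthest from the
    centers chosen so far.
    """
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(max(pixels, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: nearest center for every pixel.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in pixels]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    quantized = [centers[lab] for lab in labels]
    return quantized, centers
```

Running the same routine on pixels converted to another colorspace changes which colors count as "close", which is the effect the study measures with VIF.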

2601.07545 2026-05-25 cs.LG stat.ML

Near-Optimal Private Linear Regression via Iterative Hessian Mixing

通过迭代Hessian混合实现近最优私有线性回归

Omri Lev, Moshe Shenfeld, Vishwak Srinivasan, Katrina Ligett, Ashia C. Wilson

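AI summary: This paper studies differentially private ordinary least squares regression with bounded data and proposes Iterative Hessian Mixing (IHM), an algorithm built on Gaussian sketching. IHM is differentially private and comes with excess empirical risk bounds that improve on those of existing methods such as AdaSSP by removing a multiplicative factor that can be as large as the square root of the data dimension. It also outperforms prior baselines empirically across a large suite of datasets.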

详情
AI中文摘要

我们研究通过草图机制实现带界数据$(X,Y)$的差分隐私普通最小二乘(DP-OLS)。虽然高斯草图方法已被探索用于DP-OLS (Sheffet, 2017),但它们通常被认为不如自适应充分统计量扰动(AdaSSP)方法 (Wang, 2018),后者直接扰动充分统计量$(X^{\top}X, X^{\top}Y)$。该方法被证明接近信息论最优,同时表现出强大的实证性能。在这项工作中,我们提出了迭代Hessian混合(IHM),一种基于高斯草图方法构建的DP-OLS算法,其灵感来自Pilanci和Wainwright (2016)的迭代Hessian草图。我们证明IHM是差分私有的,并以超额经验风险界的形式提供效用保证。这些界通过移除一个可能高达数据维度平方根的乘法因子,改进了AdaSSP的界。IHM的设计基于我们为先前DP-OLS的高斯草图方法提出的新准确性保证,这些保证阐明了这些方法何时预期表现良好,以及IHM如何规避其固有局限性。我们还在大量数据集上进行了严格的实证评估,表明IHM始终优于包括AdaSSP在内的先前基线。

英文摘要

We study differentially private ordinary least squares (DP-OLS) with bounded data $(X,Y)$ via sketching-based mechanisms. While Gaussian sketching approaches have been explored for DP-OLS (Sheffet, 2017), they are typically viewed as less competitive than the Adaptive Sufficient Statistics Perturbation (AdaSSP) method (Wang, 2018), which directly perturbs the sufficient statistics $(X^{\top}X, X^{\top}Y)$. This method was shown to be close to information-theoretically optimal, while also exhibiting strong empirical performance. In this work, we propose Iterative Hessian Mixing (IHM), an algorithm that builds on Gaussian sketching approaches to DP-OLS and is inspired by the Iterative Hessian Sketch of Pilanci and Wainwright (2016). We prove that IHM is differentially private and provide utility guarantees in the form of excess empirical risk bounds. These bounds improve upon those of AdaSSP by removing a multiplicative factor that can be as large as the square root of the data dimension. The design of IHM is based on new accuracy guarantees that we present for prior Gaussian sketching approaches for DP-OLS, which clarify when these methods are expected to perform well and how IHM circumvents their inherent limitations. We also conduct a rigorous empirical evaluation on a large suite of datasets, demonstrating that IHM consistently outperforms prior baselines, including AdaSSP.

2512.15436 2026-05-25 stat.ML cs.LG

Online Partitioned Local Depth for semi-supervised applications

面向半监督应用的在线分区局部深度

John D. Foley, Justin T. Lee

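AI summary: This paper presents online PaLD, an extension of the partitioned local depth (PaLD) algorithm adapted to online settings such as semi-supervised prediction. Once the cohesion network of a reference dataset has been precomputed, the algorithm extends it to a new data point in $O(n^2)$ time, which improves computational efficiency. The paper illustrates online PaLD's potential with applications to anomaly detection and semi-supervised classification on health-care datasets, broadening the scope of the PaLD framework.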

详情
Comments
Added theorem statements and refined results; 21 pages, 2 figures
AI中文摘要

我们介绍了分区局部深度(PaLD)算法的一个扩展,该扩展适用于在线应用,如半监督预测。PaLD以无监督、无参数聚类而闻名,但其鲁棒性基于数据点的三元组,使得精确分析计算成本高昂。目前正在研究如何提高底层离散算法的可扩展性并扩大PaLD的应用范围。我们提出的新算法online PaLD非常适合那些可以预先从参考数据集中计算凝聚网络的情况。在花费$O(n^3)$步骤构建可查询的数据结构后,online PaLD可以在$O(n^2)$时间内将凝聚网络扩展到新的数据点。我们的方法补充了之前基于近似和并行的加速方法。在实际应用中,online PaLD通过相对简单的实现使得更大的数据集可以进行精确分析。我们展示了在医疗保健数据集上的在线异常检测和半监督分类应用,作为online PaLD扩展PaLD框架应用潜力的初步说明。

英文摘要

We introduce an extension of the partitioned local depth (PaLD) algorithm that is adapted to online applications such as semi-supervised prediction. PaLD is best known for unsupervised, parameter-free clustering, but its robustness is based on triples of data points, making exact analysis computationally expensive. Research is ongoing to improve the scalability of the underlying discrete algorithm and expand the breadth of PaLD's applications. The new algorithm we present, online PaLD, is well-suited to situations where it is possible to pre-compute a cohesion network from a reference dataset. After $O(n^3)$ steps to construct a queryable data structure, online PaLD can extend the cohesion network to a new data point in $O(n^2)$ time. Our approach complements previous speed-up approaches based on approximation and parallelism. In practical terms, online PaLD makes larger datasets accessible to exact analysis with a relatively simple implementation. We present applications to online anomaly detection and semi-supervised classification for health-care datasets as initial illustrations of online PaLD's potential to expand applications of the PaLD framework.
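The cubic cost mentioned in this abstract comes from looping over triples of points. A minimal sketch of the batch cohesion computation, written from the usual definition of partitioned local depth, makes that visible; it is not the online extension or the queryable data structure proposed in the paper, and the function name and tie-handling convention are assumptions made for illustration.

```python
def pald_cohesion(dist):
    """Batch PaLD cohesion matrix from a symmetric distance matrix, O(n^3).

    For each ordered pair (x, y), the local focus is the set of points z
    within distance d(x, y) of x or of y. A point z in the focus supports x
    if it is strictly closer to x than to y; ties are split evenly.
    Entry C[x][z] accumulates the support that z lends to x.
    """
    n = len(dist)
    cohesion = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x == y:
                continue
            dxy = dist[x][y]
            focus = [z for z in range(n)
                     if dist[z][x] <= dxy or dist[z][y] <= dxy]
            weight = 1.0 / len(focus)
            for z in focus:
                if dist[z][x] < dist[z][y]:
                    cohesion[x][z] += weight
                elif dist[z][x] == dist[z][y]:
                    cohesion[x][z] += 0.5 * weight
    scale = 1.0 / (n - 1)
    return [[c * scale for c in row] for row in cohesion]
```

The row sum of the matrix is the local depth of a point, and the local depths of all n points add up to n/2, which gives a quick sanity check on any implementation.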

2512.15244 2026-05-25 stat.ME econ.EM

Non-parametric Causal Inference in Dynamic Thresholding Designs

动态阈值设计中的非参数因果推断

Aditya Ghosh, Stefan Wager

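AI summary: This paper studies causal inference in dynamic thresholding designs, where treatment is assigned by thresholding a state variable that changes over time. Unlike the static setting, past treatments can affect the current state and future treatments, so standard regression-discontinuity methods no longer apply. The authors propose a local linear regression estimator that consistently estimates the marginal policy effect identified by dynamic thresholding designs, and demonstrate it in a simulation experiment on continuous glucose monitoring.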

详情
AI中文摘要

我们考虑动态设置中的因果推断,其中治疗通过随时间变化的状态变量阈值分配。关于回归间断方法的大量文献基于这样一个事实:在静态设置中,通过阈值跨越的治疗分配产生准实验设计,从而实现实用的因果推断。但动态设置涉及静态设置中不存在的挑战,例如,过去的治疗可能影响当前状态,从而影响未来的治疗,因此现有的回归间断方法不适用。在这里,我们证明动态阈值设计识别出一个边际政策效应,该效应在静态设置中嵌套了经典的回归间断参数;并提出了一个定制的局部线性回归估计量,该估计量对此边际政策效应具有一致性。我们使用一个实验来演示我们的方法,该实验使用FDA批准的模拟器生成的数据,模拟了连续血糖监测阈值的真实世界优化。

英文摘要

We consider causal inference in dynamic settings where treatment is assigned by thresholding a state variable that can change over time. There is a large literature on regression-discontinuity methods building on the fact that, in the static setting, treatment assignment via threshold crossing induces a quasi-experimental design that enables pragmatic causal inference. But dynamic settings involve challenges not present in the static setting, e.g., past treatments may affect current state and thus future treatments, and so existing regression-discontinuity methods do not apply. Here, we show that dynamic thresholding designs identify a marginal policy effect that nests the classical regression-discontinuity parameter in the static setting; and propose a tailored local linear regression estimator that is consistent for this marginal policy effect. We demonstrate our approach using an experiment that emulates real-world optimization of thresholds for continuous glucose monitoring using data generated from an FDA-approved simulator.

2510.04406 2026-05-25 stat.ML cs.LG

Decomposition-Based Modular Conformal Prediction for Two-Stage Modeling

基于分解的模块化共形预测用于两阶段建模

William Zhang, Saurabh Amin, Georgia Perakis

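AI summary: This paper proposes a decomposition-based modular conformal prediction framework for uncertainty quantification in two-stage modeling. The method decomposes the overall prediction residual into stage-specific components, so that uncertainty can be attributed to individual model stages. With a parameter selection procedure based on family-wise error rate control and an extension to non-stationary settings, the method achieves better coverage under structural, stage-wise shifts and better diagnostics than standard conformal prediction.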

详情
Comments
11 pages (37 with appendix), 15 figures
AI中文摘要

共形预测在最小假设下提供了有限样本覆盖保证。然而,现有方法将整个建模过程视为黑箱,忽视了利用和理解模块化结构的机会。我们引入了一种针对两阶段顺序模型的共形预测框架,其中上游预测器为下游模型生成中间表示。通过将整体预测残差分解为阶段特定成分,我们的方法使从业者能够将不确定性归因于特定的流水线阶段。我们开发了一个使用族系错误率(FWER)控制的风险控制参数选择程序,以校准阶段级缩放参数,并引入了一个针对非平稳设置的自适应扩展。在合成分布偏移以及真实供应链和股票市场数据上的实验表明,与标准共形方法相比,我们的方法在结构性的阶段级偏移下提高了覆盖,同时识别了阶段级误差贡献。该框架提供了标准共形方法所缺乏的诊断优势和鲁棒覆盖。

英文摘要

Conformal prediction offers finite-sample coverage guarantees under minimal assumptions. However, existing methods treat the entire modeling process as a black box, overlooking opportunities to exploit and understand modular structure. We introduce a conformal prediction framework for two-stage sequential models, where an upstream predictor generates intermediate representations for a downstream model. By decomposing the overall prediction residual into stage-specific components, our method enables practitioners to attribute uncertainty to specific pipeline stages. We develop a risk-controlled parameter selection procedure using family-wise error rate (FWER) control to calibrate stage-wise scaling parameters, and introduce an adaptive extension for non-stationary settings. Experiments on synthetic distribution shifts, as well as real-world supply chain and stock market data, demonstrate that our approach improves coverage under structural, stage-wise shifts compared to standard conformal methods, while identifying stage-wise error contribution. This framework offers diagnostic advantages and robust coverage that standard conformal methods lack.
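As background for the finite-sample coverage guarantee mentioned at the start of this abstract, the standard split conformal baseline can be sketched as follows. This is the black-box method the paper contrasts itself with, not its stage-wise decomposition; the function names are illustrative, and the scores are assumed to be absolute residuals on a held-out calibration set.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected quantile of calibration scores.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small for the requested level.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]

def prediction_interval(point_prediction, scores, alpha):
    """Symmetric interval around a point prediction at miscoverage level alpha."""
    q = conformal_quantile(scores, alpha)
    return (point_prediction - q, point_prediction + q)
```

Under exchangeability of calibration and test points, an interval built this way covers the true response with probability at least 1 - alpha; a stage-wise method would replace the single score with components attributed to each pipeline stage.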

2509.03675 2026-05-25 stat.AP

Latent space projections and atlases: A cautionary tale in deep neuroimaging using autoencoders

潜在空间投影与图谱:自编码器在深度神经影像中的警示故事

J. M. Gorriz, F. Segovia, C. Jimenez, J. E. Arco, F. J. Martinez, J Ramirez, S. Abulikemu, J. Suckling

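AI summary: This paper proposes a deep learning framework based on a convolutional autoencoder for exploring latent representations of 3D brain MRI. With a hierarchical encoder and a compact latent space, the model learns latent representations that preserve neuroanatomical structure and reflect clinical variability across cognitive status. The study introduces the Latent-Regional Correlation Profiling (LRCP) framework, which combines statistical association with supervised discriminability to identify brain regions that encode clinically relevant latent information, and assesses interpretability with SHAP-based regression. The results indicate that even a simple architecture captures meaningful patterns associated with progression to Alzheimer's disease, suggesting a role for autoencoders in biomarker discovery and hypothesis generation in clinical neuroscience.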

详情
Comments
36 pages, 24 figures
AI中文摘要

本研究引入了一个深度学习框架,用于对3D脑MRI中的潜在表示进行推断性探索,利用了一个具有层次编码器和紧凑潜在空间的简单卷积自编码器。在阿尔茨海默病神经影像学倡议(ADNI)数据集的分割灰质图像上训练后,该模型学习了保留神经解剖结构并反映跨认知状态临床变异性的潜在表示。应用降维技术(PCA、t-SNE、PLS、UMAP)可视化和解释潜在空间,并将其与AAL图谱定义的解剖区域相关联。作为一项新颖贡献,提出了潜在-区域相关分析(LRCP)框架,该框架结合统计关联和监督可区分性,以识别编码临床相关潜在信息的脑区。我们的结果表明,即使是最小的架构也能捕捉到与阿尔茨海默病进展相关的有意义模式。通过将基于SHAP的回归应用于事后模型(该模型从基于图谱的区域灰质强度预测重建误差),评估可解释性,从而识别参与类别特定重建策略的解剖学上有意义的区域。这些发现进一步使用统计不可知方法进行验证,强调了神经影像中严格评估的重要性。这项工作展示了自编码器作为临床神经科学中生物标志物发现和假设生成的探索性工具的潜力。

英文摘要

This study introduces a deep learning framework for the inferential exploration of latent representations in 3D brain MRI, leveraging a simple convolutional autoencoder with a hierarchical encoder and a compact latent space. Trained on segmented gray matter images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the model learns latent representations that preserve neuroanatomical structure and reflect clinical variability across cognitive status. Dimensionality reduction techniques (PCA, t-SNE, PLS, UMAP) were applied to visualize and interpret the latent space, correlating it with anatomical regions defined by the AAL atlas. As a novel contribution, we propose the Latent-Regional Correlation Profiling (LRCP) framework, which combines statistical association and supervised discriminability to identify brain regions that encode clinically relevant latent information. Our results show that even minimal architectures capture meaningful patterns associated with progression to Alzheimer's disease. Interpretability is assessed by applying SHAP-based regression to a post-hoc model that predicts reconstruction error from atlas-based regional gray matter intensities, thereby identifying anatomically meaningful regions involved in class-specific reconstruction strategies. These findings are further validated using statistical agnostic methods, highlighting the importance of rigorous evaluation in neuroimaging. This work demonstrates the potential of autoencoders as exploratory tools for biomarker discovery and hypothesis generation in clinical neuroscience.

2508.02332 2026-05-25 cs.LG stat.ML

BOOST: A Data-Driven Framework for the Automated Joint Selection of Kernel and Acquisition Functions in Bayesian Optimization

BOOST: 一种用于贝叶斯优化中核函数与采集函数自动联合选择的数据驱动框架

Joon-Hyun Park, Mujin Cheon, Jeongsu Wi, Dong-Yeun Koh

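AI summary: Bayesian optimization (BO) is a highly sample-efficient method for expensive black-box problems, and its performance depends strongly on the choice of hyperparameters such as the kernel and acquisition functions. This paper proposes BOOST, a framework that automates the joint selection of the kernel-acquisition function pair instead of relying on heuristics or manual tuning. BOOST uses an offline evaluation stage to predict the performance of candidate pairs and picks the most promising one before the expensive evaluations begin. In experiments on synthetic benchmarks and machine learning hyperparameter optimization tasks, BOOST improves over fixed-hyperparameter BO and remains competitive with state-of-the-art adaptive methods.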

详情
Comments
25 pages
AI中文摘要

贝叶斯优化(BO)是一种对昂贵黑箱问题具有高样本效率的方法,其性能关键取决于超参数的选择,包括核函数和采集函数。这带来了一个重要的实际挑战:不恰当的组合可能导致性能差和评估浪费。虽然对核函数和采集函数的单独改进已被积极探索,但自动联合选择最佳超参数对在很大程度上被忽视,迫使从业者依赖启发式方法或昂贵的手动训练。在这项工作中,我们提出了一个框架BOOST(贝叶斯优化与最优核函数和采集函数选择技术),该框架自动化了这一选择过程。BOOST利用一个简单的离线评估阶段来预测各种核函数-采集函数对的性能,并在进行昂贵的评估过程之前识别出最有希望的对。BOOST是一种数据驱动的策略选择程序,它根据候选策略在手头数据上的经验性能来评估核函数-采集函数对。在每次迭代中,先前观察到的点被划分为参考集和查询集。这些子集扮演类似于机器学习中训练集和验证集的角色:参考集用于模型构建,而查询集代表未见的区域,用于回顾性评估每个候选策略在向目标值推进方面的有效性。在合成基准和机器学习超参数优化任务上的实验表明,BOOST始终优于固定超参数的BO,并与最先进的自适应方法保持竞争力,突显了其在各种场景下的鲁棒性。

英文摘要

The performance of Bayesian optimization (BO), a highly sample-efficient method for expensive black-box problems, is critically governed by the selection of its hyperparameters, including the kernel and acquisition functions. This presents a significant practical challenge: an inappropriate combination of these can lead to poor performance and wasted evaluations. While individual improvements to kernel functions and acquisition functions have been actively explored, the joint and autonomous selection of the best pair of these fundamental hyperparameters has been largely overlooked. This has forced practitioners to rely on heuristics or costly manual training. In this work, we propose a framework, BOOST (Bayesian Optimization with Optimal Kernel and Acquisition Function Selection Technique), that automates this selection. BOOST utilizes a simple offline evaluation stage to predict the performance of various kernel-acquisition function pairs and identify the most promising pair before committing to the expensive evaluation process. BOOST is a data-driven strategy selection procedure that evaluates kernel-acquisition pairs based on their empirical performance on the data in hand. At each iteration, previously observed points are partitioned into a reference set and a query set. These subsets play roles analogous to training and validation sets in machine learning: the reference set is used for model construction, while the query set represents unseen regions to retrospectively evaluate how effectively each candidate strategy progresses toward the target value. Experiments on synthetic benchmarks and machine learning hyperparameter optimization tasks demonstrate that BOOST consistently improves over fixed-hyperparameter BO and remains competitive with state-of-the-art adaptive methods, highlighting its robustness across diverse landscapes.

2507.23505 2026-05-25 stat.AP

The effect of a new power interconnector on energy prices volatility: the case of Sicily

新电力互联器对能源价格波动的影响:以西西里岛为例

Francesco Lisi, Pierdomenico Duttilo, Marina Bertolini

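AI summary: This paper studies the effect of the new Sorgente-Rizziconi power interconnector on electricity price volatility in Sicily. Using a semi-parametric GARCH model and a non-parametric additive model, the analysis finds that the interconnector significantly increased the volatility of Sicilian electricity prices without lowering average prices, and had no significant effect on other Italian market zones. The study stresses that infrastructure impacts are context-dependent and that physical integration alone does not guarantee price stability, with implications for energy policy and market risk management.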

详情
Comments
Applied Stochastic Models in Business and Industry (2026)
AI中文摘要

将能源岛整合到欧洲电力市场是能源转型的关键挑战。本研究调查了Sorgente-Rizziconi互联器对西西里岛电价波动的影响。在2016年5月28日投入运行之前,西西里电力市场区域与意大利本土的互联程度较低。利用2015年至2018年的日度数据,分析采用带有逻辑干预函数的半参数GARCH模型来估计条件价格方差的变化。作为稳健性检验,使用全非参数加性模型,允许数据塑造波动动态而不施加预定义结构。结果显示,新互联器显著增加了西西里岛的价格波动,而未降低平均价格水平。在意大利其他市场区域未观察到显著影响。这些发现强调了基础设施影响的背景依赖性,并表明物理整合本身并不能保证价格稳定。结果对电力市场的能源政策、投资规划和风险管理具有重要意义。

英文摘要

Integrating energy islands into the European electricity market is a key challenge for the energy transition. This study investigates the impact of the Sorgente-Rizziconi interconnector on electricity price volatility in Sicily. Before its commissioning on 28 May 2016, the Sicilian electricity market zone was poorly interconnected with the Italian mainland. Using daily data from 2015 to 2018, the analysis applies a semi-parametric GARCH model with a logistic intervention function to estimate changes in conditional price variance. A fully non-parametric additive model is employed as a robustness check, allowing the data to shape volatility dynamics without imposing a predefined structure. The results reveal that the new interconnector significantly increased price volatility in Sicily, without reducing average price levels. No significant effects were observed in other Italian market zones. These findings highlight the context-dependent nature of infrastructure impacts and suggest that physical integration alone does not guarantee price stability. The results have important implications for energy policy, investment planning, and risk management in electricity markets.

2507.05064 2026-05-25 stat.ML cs.LG stat.ME

Vecchia-Inducing-Points Full-Scale Approximations for Gaussian Processes

高斯过程的Vecchia诱导点全尺度近似

Tim Gyger, Reinhard Furrer, Fabio Sigrist

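AI summary: This paper proposes the Vecchia-inducing-points full-scale (VIF) approximation for Gaussian processes, which combines the strengths of global inducing points and local Vecchia approximations to address the computational bottleneck of Gaussian processes on large datasets. A correlation-based neighbor-finding strategy, implemented with a modified cover tree algorithm, makes the Vecchia approximation of the residual process efficient. The framework is also extended to non-Gaussian likelihoods, where iterative methods substantially reduce computational cost, and experiments on simulated and real-world datasets show gains in computational efficiency, accuracy, and numerical stability.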

详情
AI中文摘要

高斯过程是灵活、概率性的非参数模型,广泛应用于机器学习和统计学。然而,其在大数据集上的可扩展性受计算限制。为克服这些挑战,我们提出Vecchia诱导点全尺度(VIF)近似,结合全局诱导点和局部Vecchia近似的优势。Vecchia近似在低维输入和中等光滑协方差函数设置中表现优异,而诱导点方法更适合高维输入和更光滑的协方差函数。我们的VIF方法通过使用基于相关性的高效邻居搜索策略(通过改进的覆盖树算法实现)对残差过程进行Vecchia近似,从而桥接这两种情况。我们进一步将框架扩展到非高斯似然,引入迭代方法,与基于Cholesky的计算相比,在使用拉普拉斯近似时,训练和预测的计算成本降低了几个数量级。特别是,我们提出并比较了新颖的预条件器,并提供了理论收敛结果。在模拟和真实数据集上的大量数值实验表明,VIF近似不仅计算高效,而且比最先进的替代方法更准确、数值更稳定。所有方法均在开源C++库GPBoost中实现,并配有高级Python和R接口。

英文摘要

Gaussian processes are flexible, probabilistic, non-parametric models widely used in machine learning and statistics. However, their scalability to large data sets is limited by computational constraints. To overcome these challenges, we propose Vecchia-inducing-points full-scale (VIF) approximations combining the strengths of global inducing points and local Vecchia approximations. Vecchia approximations excel in settings with low-dimensional inputs and moderately smooth covariance functions, while inducing point methods are better suited to high-dimensional inputs and smoother covariance functions. Our VIF approach bridges these two regimes by using an efficient correlation-based neighbor-finding strategy for the Vecchia approximation of the residual process, implemented via a modified cover tree algorithm. We further extend our framework to non-Gaussian likelihoods by introducing iterative methods that reduce computational costs for training and prediction by several orders of magnitude compared to Cholesky-based computations when using a Laplace approximation. In particular, we propose and compare novel preconditioners and provide theoretical convergence results. Extensive numerical experiments on simulated and real-world data sets show that VIF approximations are computationally efficient and are more accurate and numerically stable than state-of-the-art alternatives. All methods are implemented in the open source C++ library GPBoost with high-level Python and R interfaces.

2506.21501 2026-05-25 math.ST stat.TH

Causal inference via implied interventions

通过隐含干预进行因果推断

Carlos García Meixide, Mark J. van der Laan

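AI summary: This paper studies causal inference with an instrumental variable and proposes determining causal effects from the interventions that the observational distribution allows one to identify. Rather than starting from a target causal effect, the authors use the randomization of the instrument and its exclusion restriction to define a class of auxiliary stochastic interventions on the treatment implied by stochastic interventions on the instrument, and use it to characterize the identifiable causal effects. The approach gives a flexible framework for a range of estimation problems, some of which are solved with Expectation-Maximization and the Highly Adaptive Lasso.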

详情
AI中文摘要

在拥有工具变量的背景下,因果推断的标准实践始于瞄准感兴趣的效应,并逐步提出使其可识别的假设。我们反其道而行之,坚持观测分布允许识别的干预,而不是从期望的因果估计量出发并施加不可检验的条件。工具的随机化及其排除限制定义了一类由工具上的随机干预隐含的治疗上的辅助随机干预。这种映射刻画了在给定观测分布下治疗对结果的可识别因果效应。被识别的效应是工具随机鼓励的影响,该影响通过未改变的治疗选择机制传播,而不是覆盖自然治疗选择的假设干预的效果。或者,寻找一个工具上的干预,其隐含干预最接近期望目标,自然导致一个投影,代表最接近的可识别治疗效应。该投影的通用性允许选择不同的范数和索引函数集,从而产生多样化的估计问题,其中一些我们使用期望最大化算法和高自适应Lasso来解决。

英文摘要

In the context of having an instrumental variable, the standard practice in causal inference begins by targeting an effect of interest and proceeds by formulating assumptions enabling its identification. We turn this around by adhering to the interventions that the observational distribution allows us to identify, rather than starting with a desired causal estimand and imposing untestable conditions. The randomization of an instrument and its exclusion restriction define a class of auxiliary stochastic interventions on the treatment that are implied by stochastic interventions on the instrument. This mapping characterizes the identifiable causal effects of the treatment on the outcome given the observable distribution. The identified effect is the impact of a stochastic encouragement by the instrument that propagates through the unaltered treatment selection mechanism, rather than the effect of a hypothetical intervention that overrides how treatment is naturally chosen. Alternatively, searching for an intervention on the instrument whose implied one best approximates a desired target naturally leads to a projection representing the closest identifiable treatment effect. The generality of this projection allows one to select different norms and indexing functional sets that give rise to diverse estimation problems, some of which we address using Expectation-Maximization and the Highly Adaptive Lasso.

2506.10152 2026-05-25 stat.ME

Robust copula estimation for one-shot devices with correlated failure modes

具有相关失效模式的单次使用设备的稳健Copula估计

E. Castilla, P. J. Chocano

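AI summary: This paper proposes a robust method for estimating copula models to assess dependence between failure modes in one-shot devices. Traditional approaches such as maximum likelihood estimation tend to be unreliable in the presence of outliers or model misspecification, so the authors introduce a divergence-based estimation technique that improves robustness and characterizes the joint failure-time distribution more reliably. Simulation studies and an application to real data support the method's robustness and practical utility.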

详情
AI中文摘要

本文提出了一种稳健的Copula模型估计方法,用于评估单次使用设备(即设计为一次性使用并在激活时被破坏的系统)中失效模式之间的依赖关系。传统方法如最大似然估计(MLE)在面对异常值或模型误设时往往产生不可靠的结果。为克服这些局限性,我们引入了一种基于散度的估计技术,增强了稳健性,并提供了更可靠的联合失效时间分布特征描述。广泛的模拟研究证实了所提方法的稳健性。此外,我们通过分析一个真实世界数据集说明了其实用性。

英文摘要

This paper presents a robust method for estimating copula models to evaluate dependence between failure modes in one-shot devices, that is, systems designed for single use and destroyed upon activation. Traditional approaches, such as maximum likelihood estimation (MLE), often produce unreliable results when faced with outliers or model misspecification. To overcome these limitations, we introduce a divergence-based estimation technique that enhances robustness and provides a more reliable characterization of the joint failure-time distribution. Extensive simulation studies confirm the robustness of the proposed method. Additionally, we illustrate its practical utility through the analysis of a real-world dataset.

2505.17354 2026-05-25 cs.LG stat.ML

CT-OT Flow: Estimating Continuous-Time Dynamics from Discrete Temporal Snapshots

CT-OT Flow:从离散时间快照估计连续时间动态

Keisuke Kawano, Takuro Kutsuna, Naoki Hayashi, Yasushi Esaki, Hidenori Tanaka

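AI summary: This paper studies how to estimate continuous-time dynamics from discrete temporal snapshots, for settings such as single-cell RNA sequencing and mobility sensing where data are available only as temporally aggregated snapshots with possibly noisy or uncertain time labels. It proposes Continuous-Time Optimal Transport Flow (CT-OT Flow), a two-stage framework that infers high-resolution time labels by aligning neighboring time intervals via partial optimal transport, then reconstructs a continuous-time data distribution through temporal kernel smoothing in order to train standard ODE or SDE models. The method accounts for snapshot aggregation and time-label uncertainty, uses practical accelerations to reduce computation, and reduces distributional and trajectory errors on several synthetic and real datasets.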

详情
Comments
https://github.com/ToyotaCRDL/CT-OT_Flow
AI中文摘要

在许多现实场景中(例如单细胞RNA测序、移动感知和环境监测),数据仅作为在有限时间窗口内收集的时间聚合快照被观测到,通常带有噪声或不确定的时间戳,并且无法访问连续轨迹。我们研究从这类快照估计连续时间动态的问题。我们提出连续时间最优传输流(CT-OT Flow),这是一个两阶段框架:(i)通过部分最优传输(POT)对齐相邻区间来推断高分辨率时间标签,(ii)通过时间核平滑重建连续时间数据分布,从中采样邻近时间对以训练标准ODE/SDE模型。我们的公式明确考虑了快照聚合和时间标签不确定性,并使用实际加速(筛选和小批量POT),使其适用于大型数据集。在合成基准和两个真实数据集(scRNA-seq和台风轨迹)上,与OT-CFM、[SF]²M、TrajectoryNet、MFM和ENOT相比,CT-OT Flow减少了分布和轨迹误差。

英文摘要

In many real-world settings (e.g., single-cell RNA sequencing, mobility sensing, and environmental monitoring), data are observed only as temporally aggregated snapshots collected over finite time windows, often with noisy or uncertain timestamps, and without access to continuous trajectories. We study the problem of estimating continuous-time dynamics from such snapshots. We present Continuous-Time Optimal Transport Flow (CT-OT Flow), a two-stage framework that (i) infers high-resolution time labels by aligning neighboring intervals via partial optimal transport (POT) and (ii) reconstructs a continuous-time data distribution through temporal kernel smoothing, from which we sample pairs of nearby times to train standard ODE/SDE models. Our formulation explicitly accounts for snapshot aggregation and time-label uncertainty and uses practical accelerations (screening and mini-batch POT), making it applicable to large datasets. Across synthetic benchmarks and two real datasets (scRNA-seq and typhoon tracks), CT-OT Flow reduces distributional and trajectory errors compared with OT-CFM, [SF]$^{2}$M, TrajectoryNet, MFM, and ENOT.

2502.17142 2026-05-25 math.ST math.PR stat.ML stat.TH

The feasibility of multi-graph alignment: a Bayesian approach

多图对齐的可行性:一种贝叶斯方法

Louis Vassaux, Laurent Massoulié

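AI summary: This paper studies the feasibility of random multi-graph alignment and establishes thresholds in two models. In the Gaussian model, the authors find an "all-or-nothing" phenomenon: above a critical threshold, exact alignment is achievable with high probability; below it, even partial alignment is statistically impossible. In the sparse Erdős-Rényi model, they rigorously identify a threshold below which no meaningful partial alignment is possible and conjecture that partial alignment can be achieved above it. To prove these results, they develop a general Bayesian estimation framework over metric spaces, which offers insight into a broader class of high-dimensional statistical problems.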

详情
Comments
Minor revisions; 41 pages
AI中文摘要

我们建立了两种模型中随机多图对齐可行性的阈值。在高斯模型中,我们展示了“全有或全无”现象:在临界阈值以上,高概率实现精确对齐;而在阈值以下,即使部分对齐在统计上也是不可能的。在稀疏Erdős-Rényi模型中,我们严格识别了一个阈值,低于该阈值无法实现有意义的局部对齐,并推测高于该阈值可以实现局部对齐。为了证明这些结果,我们开发了一个度量空间上的通用贝叶斯估计框架,为更广泛的高维统计问题提供了见解。

英文摘要

We establish thresholds for the feasibility of random multi-graph alignment in two models. In the Gaussian model, we demonstrate an "all-or-nothing" phenomenon: above a critical threshold, exact alignment is achievable with high probability, while below it, even partial alignment is statistically impossible. In the sparse Erdős-Rényi model, we rigorously identify a threshold below which no meaningful partial alignment is possible and conjecture that above this threshold, partial alignment can be achieved. To prove these results, we develop a general Bayesian estimation framework over metric spaces, which provides insight into a broader class of high-dimensional statistical problems.

2502.07646 2026-05-25 cs.LG stat.ME stat.ML

Causal Additive Models with Unobserved Causal Paths and Backdoor Paths

具有未观测因果路径和后门路径的因果加性模型

Thong Pham, Takashi Nicholas Maeda, Shohei Shimizu

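AI summary: This paper studies how to identify causal directions between variables when unobserved causal paths and backdoor paths are present. The authors give new characterizations of regression sets that determine independence among regression residuals and conditional independencies among observed variables, and use them to establish sufficient conditions under which causal directions are identifiable. Building on these results, they propose a search algorithm, prove its soundness and completeness, and show experimentally that it is competitive with state-of-the-art methods.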

详情
Journal ref
Proceedings of AISTATS 2026
Comments
23 pages
AI中文摘要

因果加性模型为存在隐藏变量时的因果发现提供了一个可处理且富有表现力的框架。当两个变量之间存在未观测的后门或因果路径时,其因果关系在现有理论下通常不可识别。我们建立了在许多此类情况下可识别因果方向的充分条件。这些条件依赖于回归集的新特征,以确定回归残差之间的独立性以及观测变量之间的条件独立性。基于这些结果,我们引入了一个结合这些创新的搜索算法,并证明了其可靠性和完备性。实证评估表明,其性能与最先进的方法相比具有竞争力。

英文摘要

Causal additive models provide a tractable yet expressive framework for causal discovery in the presence of hidden variables. When unobserved backdoor or causal paths exist between two variables, their causal relationship is often unidentifiable under existing theories. We establish sufficient conditions under which causal directions can be identified in many such cases. These conditions rely on new characterizations of regression sets to determine independence among regression residuals and conditional independencies among observed variables. Building on these results, we introduce a search algorithm that incorporates these innovations and prove its soundness and completeness. Empirical evaluations demonstrate its competitive performance against state-of-the-art methods.

2411.15713 2026-05-25 stat.ME

Bayesian High-dimensional Grouped-regression using Sparse Projection-posterior

Samhita Pal, Subhashis Ghosal

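AI summary: This paper proposes a new Bayesian method for high-dimensional grouped regression, aimed at effective variable selection and parameter estimation under a sparsity assumption. It introduces a sparse projection map that sends the high-dimensional parameter space to a lower-dimensional structured space, which captures the group structure, and proposes three projection-posterior methods based on different penalty functions. The work also derives optimal posterior contraction rates for estimation and prediction, proves model selection consistency, and proposes a debiased projection map that ensures exact coverage of credible sets. The method is particularly suited to nonparametric additive models, and the paper validates it through simulations and an application to Alzheimer's disease brain MRI data.

Details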
Abstract

We present a novel Bayesian approach for high-dimensional grouped regression under sparsity. We leverage a sparse projection method that uses a sparsity-inducing map to derive an induced posterior on a lower-dimensional parameter space. Our method introduces three distinct projection maps based on popular penalty functions: the Group LASSO Projection Posterior, Group SCAD Projection Posterior, and Adaptive Group LASSO Projection Posterior. Each projection map is constructed to immerse dense posterior samples into a structured, sparse space, allowing for effective group selection and estimation in high-dimensional settings. We derive optimal posterior contraction rates for estimation and prediction, proving that the methods are model selection consistent. Additionally, we propose a Debiased Group LASSO Projection Map, which ensures exact coverage of credible sets. Our methodology is particularly suited for applications in nonparametric additive models, where we apply it with B-spline expansions to capture complex relationships between covariates and response. Extensive simulations validate our theoretical findings, demonstrating the robustness of our approach across different settings. Finally, we illustrate the practical utility of our method with an application to brain MRI volume data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), where our model identifies key brain regions associated with Alzheimer's progression.

2411.08126 2026-05-25 stat.ML cs.LG

A Tale of Two Cities: Pessimism and Opportunism in Offline Dynamic Pricing

Zeyu Bian, Lan Wang, Zhengling Qi

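AI summary: This paper studies offline dynamic pricing when historical data do not cover the full price range, in particular the realistic case where the optimal price may be entirely unobserved. The authors propose a nonparametric partial identification framework that uses the monotonicity of demand in price to bound the value of unobserved prices, and design two dynamic pricing policies: a pessimistic policy that maximizes worst-case revenue and an opportunistic policy that minimizes worst-case regret. The method outperforms standard baselines in no-coverage settings and gives firms practical guidance for choosing a pricing policy according to their risk posture.

Details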
Abstract

We study offline dynamic pricing when historical data provide incomplete coverage of the price space such that some candidate prices, including the optimal one, may be entirely unobserved. This setting is common in practice and is especially difficult in dynamic environments. Existing offline reinforcement learning methods typically rely on full or partial coverage and can therefore perform poorly in such settings. We develop a nonparametric partial identification framework for offline dynamic pricing that exploits the monotonicity of demand in price to bound the value of unobserved prices. Within this framework, we formulate two dynamic decision rules: a pessimistic policy that maximizes worst-case revenue and an opportunistic policy that minimizes worst-case regret. These rules are tailored to a sequential no-coverage environment and are not direct extensions of existing pessimistic offline RL or static opportunistic approaches. We establish finite-sample regret bounds for both policies, recovering the standard rate when the optimal price is covered and quantifying the additional cost when it is not. We also develop efficient algorithms and show, through simulations and an airline ticket application, that our methods outperform standard offline RL baselines in no-coverage settings. Managerially, the framework provides a practical mapping from a firm's risk posture to its pricing policy: firms seeking revenue stability and downside protection should prefer the pessimistic policy, whereas firms willing to bear measured risk for potential gains from underexplored prices should prefer the opportunistic policy.

2311.06139 2026-05-25 stat.AP

Joint Object Tracking and Intent Recognition

Jiaming Liang, Bashar I. Ahmad, Simon Godsill

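AI summary: This paper proposes a Bayesian framework for inferring the posterior of a target's augmented state, which includes its underlying goal or intent, such as intermediate waypoints or the final destination. It introduces several latent intent models in a virtual leader formulation to capture the influence of the target's hidden goal on its instantaneous behaviour, and considers several motion models to accommodate highly maneuvering targets. A sequential Monte Carlo (particle filtering) method jointly estimates the target's state and intent, Rao-Blackwellisation improves estimation performance, and experiments on simulated data and real radar measurements demonstrate the method's effectiveness.

Details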
Comments
Submitted to IEEE Transactions on Aerospace and Electronic Systems (T-AES)
Abstract

This paper presents a Bayesian framework for inferring the posterior of the augmented state of a target, incorporating its underlying goal or intent, such as any intermediate waypoints and/or the final destination. The framework therefore performs joint object tracking and intent recognition. Several latent intent models are proposed within a virtual leader formulation; they capture the influence of the target's hidden goal on its instantaneous behaviour. In this context, various motion models, including models for highly maneuvering objects, are also considered. The a priori unknown target intent (e.g. destination) can dynamically change over time and take any value within the state space (e.g. a location or spatial region). A sequential Monte Carlo (particle filtering) approach is introduced for the simultaneous estimation of the target's (kinematic) state and its intent. Rao-Blackwellisation is employed to enhance the statistical performance of the inference routine. Simulated data and real radar measurements are used to demonstrate the efficacy of the proposed techniques.

1810.04449 2026-05-25 stat.CO cs.DS

Faster Hamiltonian Monte Carlo by Learning Leapfrog Scale: a self-calibrated randomized solution

Changye Wu, Pierre Pudlo, Christian P. Robert, Julien Stoehr

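AI summary: This paper proposes a Hamiltonian Monte Carlo method based on randomly selected integration times (eHMC). An offline calibration phase uses importance sampling to build an empirical distribution on discretization parameters, removing the need for manual burn-in diagnostics and online adaptation. The proposal distribution for the calibration phase comes from a Population Monte Carlo scheme combined with tempering and flexible parametric variational families such as normalizing flows, and the resulting algorithm is a mixture of HMC kernels with a fixed mixing distribution that leaves the target distribution invariant. Experiments show that, once computational cost is accounted for, eHMC matches or exceeds the efficiency of NUTS on benchmarks, suggesting that offline calibration with randomized integration is a viable alternative to adaptive HMC methods.

Details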
Comments
Code available on https://github.com/jstoehr/eHMC
Abstract

We introduce a Hamiltonian Monte Carlo (HMC) methodology based on a randomized selection of integration times, referred to as eHMC, where "e" stands for empirical. The approach relies on an offline calibration phase that leverages importance sampling to construct an empirical distribution on discretization parameters, thereby eliminating the need for manual burn-in diagnostics and online adaptation. The proposal distribution used in the calibration stage is obtained via a Population Monte Carlo scheme combined with tempering and flexible parametric variational families such as normalizing flows. The resulting algorithm defines a mixture of HMC kernels with a fixed mixing distribution, preserving the target distribution. Numerical experiments on benchmarks demonstrate that eHMC achieves competitive or improved efficiency compared to the No-U-Turn Sampler (NUTS) when accounting for computational cost. These results suggest that offline calibration combined with randomized integration schemes provides a viable alternative to adaptive HMC methods.
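The randomized-integration idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' eHMC implementation: the calibration phase (importance sampling, Population Monte Carlo, normalizing flows) is omitted, and `step_choices` is a hypothetical stand-in for the calibrated empirical distribution, here taken as uniform over a few values. The toy target, step size, and function names are likewise assumptions made for illustration.

```python
import numpy as np

def leapfrog(q, p, grad_U, eps, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * eps * grad_U(q)            # initial half step for momentum
    for i in range(n_steps):
        q += eps * p                      # full step for position
        if i < n_steps - 1:
            p -= eps * grad_U(q)          # full step for momentum
    p -= 0.5 * eps * grad_U(q)            # final half step for momentum
    return q, p

def randomized_hmc(U, grad_U, q0, eps, step_choices, n_samples, rng):
    """HMC in which the number of leapfrog steps is redrawn at every iteration
    from a fixed distribution (uniform over step_choices, an assumption)."""
    q = np.asarray(q0, dtype=float)
    samples = np.empty((n_samples, q.size))
    for t in range(n_samples):
        # The draw does not depend on the current state, so each iteration
        # applies one kernel from a fixed mixture of HMC kernels.
        n_steps = int(rng.choice(step_choices))
        p = rng.standard_normal(q.size)
        q_new, p_new = leapfrog(q, p, grad_U, eps, n_steps)
        # Metropolis correction with H(q, p) = U(q) + |p|^2 / 2
        h_old = U(q) + 0.5 * p @ p
        h_new = U(q_new) + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:
            q = q_new
        samples[t] = q
    return samples

# Toy target (assumption): standard normal in 2 dimensions.
U = lambda q: 0.5 * q @ q
grad_U = lambda q: q
rng = np.random.default_rng(0)
step_choices = np.array([3, 5, 8, 13])    # hypothetical calibrated values
samples = randomized_hmc(U, grad_U, np.zeros(2), 0.2, step_choices, 5000, rng)
```

Because the step count is drawn independently of the chain's state and each fixed-length kernel carries its own Metropolis correction, the mixture keeps the target invariant; the sketch says nothing about how to choose the mixing distribution well.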

2605.22950 2026-05-25 stat.ML cs.LG math.ST stat.ME stat.TH

Diffusion-based Denoising Beats Vanilla Score Matching in Parameter Estimation: A Theoretical Explanation

Benedikt Lütke Schwienhorst, Nadja Klein, Johannes Lederer

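AI summary: This paper studies why diffusion-based denoising score matching outperforms vanilla score matching for parameter estimation in multimodal distributions, and gives a theoretical explanation. The authors propose a new diffusion-based denoising score matching estimator (DDSME) and prove that the error bound of the vanilla score matching estimator worsens as the separation between modes grows, whereas DDSME avoids this with suitable hyperparameter tuning. The work provides a new theoretical basis for the advantage of diffusion-based score matching in parameter estimation.

Details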
Abstract

Score matching is an alternative to maximum likelihood estimation when the normalizing constant is unknown or too costly to evaluate. However, vanilla score matching has been shown to be inefficient relative to maximum likelihood estimation for multimodal distributions with well-separated modes, which are commonly encountered in practical applications. We compare a novel diffusion-based denoising score matching estimator (DDSME) to the vanilla score matching estimator (SME) in this scenario. In particular, we prove statistical guarantees for both estimators, showing that the error bound for the vanilla SME worsens when the separation between the modes increases, which can be avoided in the case of the DDSME with suitable hyperparameter tuning. This provides a novel theoretical explanation for the superior behavior of diffusion-based score matching over the vanilla version.

2605.22940 2026-05-25 cs.LG cs.AI stat.ML

Human-Centered Learning Mechanics: A Dynamical Framework for Entropy-Regulated Representation Learning

Kim Phuc Tran

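AI summary: This paper proposes Human-Centered Learning Mechanics (HCLM), a dynamical and information-theoretic framework for open and controlled learning systems. It observes that entropy regularization can in some cases yield weak, unstable, or misaligned gradients, and therefore introduces the notion of effective entropy and studies tractable geometric entropy surrogates, including variance-based and log-determinant covariance proxies. The main contributions are a formalization of entropy regularization through effective information force, convergence and generalization results, and a dynamical interpretation of scaling-law-like behavior. Controlled experiments indicate that geometric entropy surrogates, especially log-determinant covariance entropy, induce stronger and more stable information forces.

Details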
Comments
Submitted to JMLR
Abstract

Deep learning is increasingly viewed as a dynamical process in parameter space, yet many existing theories still treat training as a closed optimization system. This view is limited for real-world AI, where models operate under uncertainty, resource constraints, distribution shift, downstream decision risks, and human feedback. We propose Human-Centered Learning Mechanics (HCLM), a dynamical and information-theoretic framework for open and controlled learning systems. The central idea is that entropy regularization is useful only when the chosen entropy surrogate generates a non-degenerate information force along the optimization trajectory. Otherwise, entropy terms may produce weak, unstable, or misaligned gradients, causing the dynamics to collapse toward ordinary loss minimization. We introduce the notion of effective entropy and study tractable geometric entropy surrogates, including variance-based and log-determinant covariance proxies. The paper makes three contributions. First, it formalizes entropy regularization through effective information force and characterizes degenerate entropy regimes. Second, it derives convergence, entropy-flow, Wasserstein-gradient-flow, and noisy-representation generalization results under explicit assumptions. Third, it offers a conditional dynamical interpretation of scaling-law-like behavior as a balance between information injection, entropy dissipation, and residual risk, without claiming an unconditional derivation of empirical neural scaling laws. Controlled representation-learning experiments support the hypothesis that geometric entropy surrogates, especially log-determinant covariance entropy, induce stronger and more stable information forces than softmax-normalized entropy.
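The two geometric surrogates named in this abstract can be illustrated with a small Python sketch. The exact definitions used in the paper are not given here, so both formulas below are assumptions: a Gaussian-style proxy `0.5 * logdet(Cov + eps * I)` and a per-dimension variance proxy. The function names and the `eps` regularizer are hypothetical.

```python
import numpy as np

def logdet_cov_entropy(z, eps=1e-3):
    """Assumed proxy: 0.5 * logdet(Cov(z) + eps * I) for an (n, d) batch."""
    z = np.asarray(z, dtype=float)
    cov = np.atleast_2d(np.cov(z, rowvar=False))   # (d, d) sample covariance
    cov = cov + eps * np.eye(z.shape[1])           # keep the matrix invertible
    _, logdet = np.linalg.slogdet(cov)             # numerically stable log-determinant
    return 0.5 * logdet

def variance_entropy(z):
    """Assumed proxy: 0.5 * sum of log per-dimension variances (ignores correlations)."""
    z = np.asarray(z, dtype=float)
    return 0.5 * np.sum(np.log(z.var(axis=0, ddof=1)))

rng = np.random.default_rng(0)
spread = rng.standard_normal((512, 4))     # four roughly independent dimensions
collapsed = spread.copy()
collapsed[:, 1:] = collapsed[:, [0]]       # every dimension duplicates dimension 0

print(logdet_cov_entropy(spread), logdet_cov_entropy(collapsed))
print(variance_entropy(spread), variance_entropy(collapsed))
```

Under these assumed formulas, the log-determinant proxy drops sharply for the rank-one `collapsed` batch, while the variance proxy barely changes because it ignores correlations between dimensions. That is a property of the two formulas as written here, not a result taken from the paper.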

2605.22871 2026-05-25 cs.LG cs.AI stat.ML

Approximate Machine Unlearning through Manifold Representation Forgetting Guided by Self Mode Connectivity

Weiqi Wang, Zhiyi Tian, Chenhan Zhang, Luoyu Chen, Shui Yu

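AI summary: This paper proposes ManiF-SMC, an approximate machine unlearning method that addresses the trade-off in existing methods between unlearning effectiveness and preserving the original learning objective. Motivated by the observation that a model retrained on the remaining data classifies erased samples by semantic similarity, it achieves approximate unlearning by pushing each erased sample away from its original manifold representation centroid toward its semantic neighbors in the retained data. To improve unlearning effectiveness and reduce reliance on labels and task gradients, ManiF-SMC uses a margin-based triplet loss and a self-mode-connectivity module that generates unlearning margins adaptively; experiments on four datasets show unlearning effectiveness comparable to state-of-the-art approximate methods.

Details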
Abstract

Machine unlearning is a fundamental mechanism that enforces the right to be forgotten. Existing unlearning studies that rely on label manipulation or task-gradient reversal often deliver limited unlearning effectiveness. Moreover, they can undermine the original learning objective and typically do not guarantee equivalence to standard unlearning by retraining. In this paper, we propose ManiF-SMC (Manifold Forgetting with Self Mode Connectivity), motivated by the observation that a model retrained on the remaining data tends to classify erased samples by their semantic similarity to the retained data. We begin by systematically recasting approximate unlearning as pushing each erased sample away from its original learned manifold representation centroid toward its nearest semantic neighbors in the retained data. This reformulation aligns unlearning with retraining behavior and operates purely in representation space, reducing reliance on labels and task-specific gradients. To tackle the manifold representation-based unlearning problem, ManiF-SMC encapsulates the unlearning and representation preservation goals in a margin-based triplet loss. Because finding a suitable margin for unlearning is challenging, we propose a self-mode-connectivity module that rapidly reconstructs the local manifold to guide adaptive margin generation for each unlearning case. Extensive experiments on four representative datasets show that ManiF-SMC achieves unlearning effectiveness comparable to state-of-the-art approximate methods while operating solely within the model's representation space.
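A margin-based triplet loss of the kind mentioned in this abstract can be sketched in Python as follows. This is a generic hinge triplet loss, not the paper's exact objective. The role assignment is an assumption read off the abstract: the erased sample is the anchor, its nearest retained neighbor is the point to move toward, and its original centroid is the point to move away from. The margin is a fixed hypothetical constant here, whereas the paper generates it adaptively.

```python
import numpy as np

def triplet_unlearning_loss(z_erased, z_neighbor, centroid, margin):
    """Hinge triplet loss: zero once the erased sample is closer to its retained
    neighbor than to its original centroid by at least `margin`."""
    d_pos = np.linalg.norm(z_erased - z_neighbor, axis=-1)   # distance to the pull target
    d_neg = np.linalg.norm(z_erased - centroid, axis=-1)     # distance to the push-away point
    return np.maximum(0.0, d_pos - d_neg + margin)

centroid = np.array([0.0, 0.0])    # original class centroid (hypothetical)
neighbor = np.array([4.0, 0.0])    # nearest retained-data neighbor (hypothetical)
before = np.array([0.5, 0.0])      # erased sample still near its centroid
after = np.array([3.5, 0.0])       # erased sample moved toward the neighbor

loss_before = triplet_unlearning_loss(before, neighbor, centroid, margin=1.0)
loss_after = triplet_unlearning_loss(after, neighbor, centroid, margin=1.0)
```

In this toy setting the loss is positive while the erased representation sits near its old centroid and reaches zero once it lies at least one margin closer to the retained neighbor, which is the push-and-pull behavior the abstract describes in representation space.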

2605.22838 2026-05-25 q-bio.GN math.OC stat.AP

Detecting and Correcting Sample-by-Sample Scale Distortion in RNA Sequencing Data

Christopher Thron, Farhad Jafari

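AI summary: This paper studies sample-to-sample, expression-level-dependent scale biases in RNA sequencing data and proposes two statistically motivated nonlinear transforms to detect and correct them. Conventional normalization techniques do not remove these biases, whereas the proposed transforms reduce sample-to-sample variance, improve the characteristics of gene-gene correlation distributions, and increase the sensitivity and specificity of tests for differences between subpopulations. The results support a more accurate understanding of gene-gene relationships and may lead to new ways of using information from clinical tests.

Details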
Journal ref
BMC Bioinformatics 26.1 (2025): 32
Comments
25 pages, 17 figures
Abstract

RNA sequencing (RNA-seq) is the conventional genome-scale approach used to capture the expression levels of all detectable genes in a biological sample. It is now regularly used for population-based studies designed to identify genetic determinants of various diseases. Naturally, the accuracy of these tests should be verified and improved if possible. In this study, we aimed to detect and correct for expression-level-dependent errors that vary from sample to sample and are not corrected by conventional normalization techniques. We examined several RNA-seq datasets from the Cancer Genome Atlas (TCGA), Stand Up 2 Cancer (SU2C), and GTEx databases with various types of preprocessing. By applying local averaging, we found sample-by-sample, expression-level-dependent biases in all datasets studied. Using simulations, we show that these biases corrupt gene-gene correlation estimations and $t$-tests between subpopulations. To mitigate these biases, we introduce two different nonlinear transforms, based on statistical considerations, that correct the observed biases. We demonstrate that these transforms effectively remove the observed per-sample biases, reduce sample-to-sample variance, and improve the characteristics of gene-gene correlation distributions. Using a novel simulation methodology that creates controlled differences between subpopulations, we show that these transforms reduce variability and increase the sensitivity of two-population tests. The improvements in sensitivity and specificity were of the order of 3-5% in most instances after the data was corrected for bias. Altogether, these results improve our capacity to understand gene-gene relationships, and may lead to novel ways to utilize the information derived from clinical tests.