2605.12492 2026-05-13 cs.LG stat.ML

Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation

Kexuan Shi, Hanxuan Li, Zeju Qiu, Yandong Wen, Simon Buchholz, Weiyang Liu

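AI summary: This paper proposes Pion, a spectrum-preserving optimizer based on orthogonal equivalence transformation, for training large language models. Unlike additive optimizers such as Adam, Pion updates each weight matrix through left and right orthogonal transformations, so its singular values stay unchanged throughout training. The method adjusts the geometry of the weight matrices while keeping their spectral norm fixed, and experiments show stable, competitive performance in LLM pretraining and finetuning.

Details
Comments
Technical report v1 (30 pages, 19 figures, project page: https://spherelab.ai/pion/)
Abstract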

We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior along with several key properties. Empirical results show that Pion offers a stable and competitive alternative to standard optimizers for both LLM pretraining and finetuning.
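The invariant described in this abstract can be checked numerically. The sketch below is not the Pion update rule, which the abstract does not state. It only illustrates, under assumed choices, that multiplying a weight matrix on the left and right by orthogonal matrices leaves its singular values unchanged. The Cayley transform, the random generator matrices, the step size `lr`, and all function names are assumptions made for illustration.

```python
import numpy as np

def skew(m):
    """Skew-symmetric part of a square matrix."""
    return 0.5 * (m - m.T)

def cayley(a):
    """Cayley transform (I + A)^{-1} (I - A); orthogonal when A is skew-symmetric."""
    eye = np.eye(a.shape[0])
    return np.linalg.solve(eye + a, eye - a)

def orthogonal_equivalence_step(w, gen_left, gen_right, lr=0.1):
    """Hypothetical step W -> R W P with orthogonal R and P (not the paper's rule)."""
    r = cayley(lr * skew(gen_left))    # left orthogonal factor
    p = cayley(lr * skew(gen_right))   # right orthogonal factor
    return r @ w @ p

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3))
w_new = orthogonal_equivalence_step(
    w, rng.standard_normal((4, 4)), rng.standard_normal((3, 3))
)

# The entries of W change, but its singular values do not.
sv_before = np.linalg.svd(w, compute_uv=False)
sv_after = np.linalg.svd(w_new, compute_uv=False)
print(np.allclose(sv_before, sv_after))
```

An additive step such as `w - lr * grad` would generally change the singular values, which is the contrast the abstract draws with Adam and Muon.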

2605.12461 2026-05-13 math.ST cs.DS cs.LG stat.ML stat.TH

A proximal gradient algorithm for composite log-concave sampling

Linghai Liu, Sinho Chewi

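AI summary: This paper proposes a proximal gradient algorithm for sampling from composite log-concave distributions of the form $π \propto e^{-f - g}$, assuming access to gradients of $f$ and a restricted Gaussian oracle (RGO) for $g$. The algorithm combines gradient information with RGO sampling. When $f + g$ is strongly convex and $f$ is smooth, it reaches $\varepsilon$ accuracy in total variation distance in $\widetilde{\mathcal{O}}(κ\sqrt{d} \log^4(1/\varepsilon))$ iterations, matching the best existing results, and the analysis extends to non-log-concave distributions and non-smooth $f$.

Details
Abstract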

We propose an algorithm to sample from composite log-concave distributions over $\mathbb{R}^d$, i.e., densities of the form $π\propto e^{-f-g}$, assuming access to gradient evaluations of $f$ and a restricted Gaussian oracle (RGO) for $g$. The latter requirement means that we can easily sample from the density $\text{RGO}_{g,h,y}(x) \propto \exp(-g(x) -\frac{1}{2h}||y-x||^2)$, which is the sampling analogue of the proximal operator for $g$. If $f + g$ is $α$-strongly convex and $f$ is $β$-smooth, our sampler achieves $\varepsilon$ error in total variation distance in $\widetilde{\mathcal O}(κ\sqrt d \log^4(1/\varepsilon))$ iterations where $κ:= β/α$, which matches prior state-of-the-art results for the case $g=0$. We further extend our results to cases where (1) $π$ is non-log-concave but satisfies a Poincaré or log-Sobolev inequality, and (2) $f$ is non-smooth but Lipschitz.
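To make the restricted Gaussian oracle concrete, here is a minimal sketch for one case where it has a closed form. It assumes the scalar quadratic $g(x) = \frac{λ}{2}x^2$, which is a choice made here and not an example from the paper. Completing the square in $-g(x) - \frac{1}{2h}(y-x)^2$ gives a Gaussian with mean $y/(1+λh)$ and variance $h/(1+λh)$. The function names and the parameter `lam` are hypothetical.

```python
import random

def rgo_quadratic_params(y, h, lam):
    """Mean and variance of RGO_{g,h,y} for the assumed g(x) = (lam / 2) * x**2."""
    # exp(-(lam/2) x^2 - (x - y)^2 / (2h)) is Gaussian with precision lam + 1/h.
    mean = y / (1.0 + lam * h)
    variance = h / (1.0 + lam * h)
    return mean, variance

def rgo_quadratic_sample(y, h, lam, rng):
    """Draw one sample from the oracle density above."""
    mean, variance = rgo_quadratic_params(y, h, lam)
    return rng.gauss(mean, variance ** 0.5)

rng = random.Random(0)
mean, variance = rgo_quadratic_params(y=2.0, h=0.5, lam=1.0)
draws = [rgo_quadratic_sample(2.0, 0.5, 1.0, rng) for _ in range(20000)]
empirical_mean = sum(draws) / len(draws)
print(mean, variance, empirical_mean)
```

For this $g$, the oracle mean $y/(1+λh)$ coincides with the proximal operator of $hg$ at $y$, which matches the abstract's description of the RGO as the sampling analogue of the proximal operator.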

2605.12410 2026-05-13 stat.ML cs.LG math.OC math.ST stat.TH

Model-based Bootstrap of Controlled Markov Chains

Ziwei Su, Imon Banerjee, Diego Klabjan

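AI summary: This paper proposes and analyzes a model-based bootstrap for estimating transition kernels in finite controlled Markov chains (CMCs) with possibly nonstationary or history-dependent control policies, a setting relevant to offline reinforcement learning when the behavior policy is unknown. Using a new bootstrap law of large numbers and the martingale central limit theorem, the authors establish distributional consistency of the bootstrap transition estimator and extend it to offline policy evaluation and optimal policy recovery, obtaining asymptotically valid confidence intervals for value and Q-functions. Experiments show better coverage accuracy than existing methods, especially with small samples and short episodes.

Details
Comments
45 pages, 7 figures, 19 tables
Abstract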

We propose and analyze a model-based bootstrap for transition kernels in finite controlled Markov chains (CMCs) with possibly nonstationary or history-dependent control policies, a setting that arises naturally in offline reinforcement learning (RL) when the behavior policy generating the data is unknown. We establish distributional consistency of the bootstrap transition estimator in both a single long-chain regime and the episodic offline RL regime. The key technical tools are a novel bootstrap law of large numbers (LLN) for the visitation counts and a novel use of the martingale central limit theorem (CLT) for the bootstrap transition increments. We extend bootstrap distributional consistency to the downstream targets of offline policy evaluation (OPE) and optimal policy recovery (OPR) via the delta method by verifying Hadamard differentiability of the Bellman operators, yielding asymptotically valid confidence intervals for value and $Q$-functions. Experiments on the RiverSwim problem show that the proposed bootstrap confidence intervals (CIs), especially the percentile CIs, outperform the episodic bootstrap and plug-in CLT CIs, and are often close to nominal ($50\%$, $90\%$, $95\%$) coverage, while the baselines are poorly calibrated at small sample sizes and short episode lengths.

2605.12341 2026-05-13 stat.ML cs.LG

Multi-Variable Conformal Prediction: Optimizing Prediction Sets without Data Splitting

Laura Lützow, Simone Garatti, Marco C. Campi, Lars Lindemann, Matthias Althoff

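AI summary: This paper proposes multi-variable conformal prediction (MCP), a framework that optimizes the shape of prediction sets without data splitting while retaining finite-sample coverage guarantees. MCP extends conventional conformal prediction to vector-valued score functions and multiple calibration variables, unifying prediction set design and calibration into a single optimization problem. Two efficient variants, RemMCP and RelMCP, target different kinds of optimization problems; in experiments they meet the target coverage with smaller or comparable prediction sets and markedly lower variance across calibration runs.

Details
Abstract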

Conformal prediction constructs prediction sets with finite-sample coverage guarantees, but its calibration stage is structurally constrained to a scalar score function and a single threshold variable - forcing shapes of prediction sets to be fixed before calibration, typically through data splitting. We introduce multi-variable conformal prediction (MCP), a framework that extends conformal prediction to vector-valued score functions with multiple simultaneous calibration variables. Building on scenario theory as a principled framework for certifying data-driven decisions, MCP unifies prediction set design and calibration into a single optimization problem, eliminating data splitting without sacrificing coverage guarantees. We propose two computationally efficient variants: RemMCP, grounded in constrained optimization with constraint removal, which admits a clean generalization of split conformal prediction; and RelMCP, based on iterative optimization with constraint relaxation, which supports non-convex score functions at the cost of possibly greater conservatism. Through numerical experiments on ellipsoidal and multi-modal prediction sets, we demonstrate that RemMCP and RelMCP consistently meet the target coverage with prediction set sizes smaller than or comparable to those of baselines with data split, while considerably reducing variance across calibration runs - a direct consequence of using all available data for shape optimization and calibration simultaneously.
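For context, the baseline this abstract generalizes (a scalar score with a single calibrated threshold) can be sketched as follows. This is standard split conformal calibration, not MCP, RemMCP, or RelMCP. The absolute-residual score, the symmetric interval, and the function names are assumptions made for illustration.

```python
import math

def split_conformal_threshold(calibration_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this level: the set must be unbounded.
        return math.inf
    return sorted(calibration_scores)[rank - 1]

def interval(prediction, threshold):
    """Prediction set {y : |y - prediction| <= threshold} for an absolute-residual score."""
    return (prediction - threshold, prediction + threshold)

# Hypothetical absolute residuals on a held-out calibration split.
scores = [0.3, 0.1, 0.9, 0.5, 0.2, 0.7, 0.4, 0.8, 0.6]
threshold = split_conformal_threshold(scores, alpha=0.1)
low, high = interval(2.0, threshold)
print(threshold, low, high)
```

The shape of the set (here a symmetric interval) is fixed before the single threshold is calibrated, which is the structural constraint the abstract says MCP removes.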

2605.12338 2026-05-13 cs.LG cs.AI stat.CO

Manifold Sampling via Entropy Maximization

Cornelius V. Braun, Tilman Burghoff, Marc Toussaint

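AI summary: This paper studies sampling on manifolds implicitly defined by smooth equality and inequality constraints, particularly when the feasible set has several disconnected components. The authors propose MASEM, a resampling method that maximizes the entropy of the empirical distribution using k-nearest-neighbor density estimation, to improve sampling efficiency. Experiments on synthetic data and robotics applications show better mixing efficiency and scalability than existing methods.

Details
Abstract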

Sampling from constrained distributions has a wide range of applications, including in Bayesian optimization and robotics. Prior work establishes convergence and feasibility guarantees for constrained sampling, but assumes that the feasible set is connected. However, in practice, the feasible set often decomposes into multiple disconnected components, which makes efficient sampling under constraints challenging. In this paper, we propose MAnifold Sampling via Entropy Maximization (MASEM) for sampling on a manifold with an unknown number of disconnected components, implicitly defined by smooth equality and inequality constraints. The presented method uses a resampling scheme to maximize the entropy of the empirical distribution based on k-nearest neighbor density estimation. We show that, in the mean field, MASEM decreases the KL-divergence between the empirical distribution and the maximum-entropy target exponentially in the number of resampling steps. We instantiate MASEM with multiple local samplers and demonstrate its versatility and efficiency on synthetic and robotics-based benchmarks. MASEM enables fast and scalable mixing across a range of constrained sampling problems, improving over alternatives by an order of magnitude in Sinkhorn distance with competitive runtime.

2605.12301 2026-05-13 cs.LG math.ST stat.TH

Approximation of Maximally Monotone Operators: A Graph Convergence Perspective

Takashi Furuya, Yury Korolev, Takaharu Yaguchi

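AI summary: This paper studies approximation of maximally monotone operators, which have important applications in mathematics and machine learning, through graph convergence. Conventional uniform or $L^p$ approximation is limited for such operators, so the authors adopt graph convergence (Painlevé-Kuratowski convergence) as the approximation framework. They prove that any maximally monotone operator can be approximated in the sense of local graph convergence by encoder-decoder architectures, and they construct structure-preserving approximations that retain maximal monotonicity. The results provide a theoretical basis for operator learning with discontinuous or set-valued operators.

Details
Abstract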

Operator learning has been highly successful for continuous mappings between infinite-dimensional spaces, such as PDE solution operators. However, many operators of interest, including differential operators, are discontinuous or set-valued, and lie outside classical approximation frameworks. We propose a paradigm shift by formulating approximation via graph convergence (Painlevé-Kuratowski convergence), which is well-suited for closed operators. We show that uniform and $L^p$ approximation are fundamentally inadequate in this setting. Focusing on maximally monotone operators, we prove that any such operator can be approximated in the sense of local graph convergence by continuous encoder-decoder architectures, and further construct structure-preserving approximations that retain maximal monotonicity via resolvent-based parameterizations.

2605.12296 2026-05-13 math.ST stat.TH

Efficiency of pattern-based independence test

L. Baringhaus, R. Grübel

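AI summary: This paper studies the efficiency of pattern-based tests of independence, examining the consistency of tests built on patterns of different lengths and their asymptotic relative efficiencies in large samples. Drawing on concepts from discrete mathematics and theoretical computer science, such as quasi-randomness and its connection to the consistency of pattern-based tests, it describes the limiting distributions of the tests in detail and provides theoretical analysis and simulation results to guide the use of independence tests in practice.

Details
Comments
to appear in: Electronic Journal of Statistics
Abstract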

Tests of independence are an important tool in applications, specifically in connection with the detection of a relationship between variables; they also have initiated many developments in statistical theory. In the present paper we build upon and extend a recently established link to Discrete Mathematics and Theoretical Computer Science, exemplified by the appearance of copulas in connection with limits of permutation sequences, and by the connection between quasi-randomness and consistency of pattern-based tests of independence. The latter include classical procedures, such as Kendall's tau, which uses patterns of length two. Longer patterns lead to tests that are consistent against large classes of alternatives, as first shown by Hoeffding (1948) with patterns of length five, and by Yanagimoto (1970) and Bergsma and Dassios (2014) for patterns of length four. More recently Chan et al. (2020) characterized quasi-randomness for sets of patterns of length four, which leads to several new consistent pattern-based tests for independence. We give a detailed and complete description of the respective limiting null distributions. In connection with the power performance of the tests, which is of interest for practical purposes, we provide results on their (local) asymptotic relative efficiencies. We also include a small simulation study that supports our theoretical findings.
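As a concrete instance of a pattern of length two, the following sketch computes Kendall's tau by counting concordant and discordant pairs. It covers only this classical statistic, not the length-four or length-five pattern tests the paper analyzes. The function name is hypothetical, and tied pairs are assumed to count as neither concordant nor discordant.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau from length-two patterns: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # the pair is ordered the same way in x and y
        elif sign < 0:
            discordant += 1   # the pair is ordered in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

tau_up = kendall_tau([1, 2, 3, 4], [1, 2, 3, 4])
tau_down = kendall_tau([1, 2, 3, 4], [4, 3, 2, 1])
tau_mixed = kendall_tau([1, 2, 3], [1, 3, 2])
print(tau_up, tau_down, tau_mixed)
```

Under independence of continuous variables a pair is equally likely to be concordant or discordant, so the statistic has expectation zero. Tests based on longer patterns, as the abstract notes, are consistent against larger classes of alternatives.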

2605.12248 2026-05-13 stat.ME stat.AP stat.CO

Time-variant reliability using time-dependent surrogate models

Stefano Marelli, Styfen Schär, Bruno Sudret

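AI summary: This paper addresses time-variant reliability analysis of engineering dynamical systems under stochastic excitation. Because conventional Monte Carlo methods are computationally expensive, it presents two surrogate modeling approaches, mNARX and F-NARX, which handle high-dimensional inputs and temporal dependence through a reduced-order manifold and functional trajectory features, respectively. On two benchmark reliability problems, both methods accurately capture the tail behavior of the system response, enabling efficient estimation of first-passage probabilities.

Details
Abstract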

Time-variant reliability analysis is a critical task for ensuring the safety of engineering dynamical systems subjected to stochastic excitations. However, assessing failure probability for realistic systems with Monte-Carlo simulation-based methods is often computationally intractable due to the high cost of the underlying models and the large number of simulations required. While surrogate models such as polynomial chaos expansions or Kriging are well-established for time-invariant reliability problems, their direct application to time-dependent systems remains challenging. This chapter introduces two advanced surrogate modeling frameworks designed specifically for dynamical systems: manifold-NARX (mNARX) and functional NARX (F-NARX). The mNARX approach constructs the surrogate on a reduced-order manifold of auxiliary state variables, enabling the efficient handling of high-dimensional inputs by embedding physical insight into a regression formulation. Conversely, the F-NARX framework exploits the functional nature of system trajectories, extracting principal component features from continuous time windows to mitigate issues associated with discrete lag selection and long-memory effects. We demonstrate the efficacy of these methods on two benchmark reliability problems: a stochastic quarter-car model and a hysteretic Bouc-Wen oscillator. The results highlight that, when combined with suitably biased experimental designs, both frameworks accurately capture the tail behavior of the system response, enabling precise and efficient estimation of first-passage probabilities.
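The quantity being estimated, a first-passage probability, can be illustrated with crude Monte Carlo on a toy process. This sketch does not implement mNARX or F-NARX, and the AR(1) process stands in for an expensive simulator. The process, its parameters `phi` and `noise_std`, the threshold, and all function names are assumptions made for illustration.

```python
import random

def exceeds(trajectory, threshold):
    """True if the response leaves the safe band [-threshold, threshold] at any time step."""
    return any(abs(v) > threshold for v in trajectory)

def simulate_ar1(n_steps, rng, phi=0.9, noise_std=1.0):
    """Toy stand-in for a costly dynamical model: x_t = phi * x_{t-1} + noise."""
    x, path = 0.0, []
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, noise_std)
        path.append(x)
    return path

def first_passage_probability(threshold, n_steps, n_runs, rng):
    """Fraction of simulated trajectories that cross the threshold within the horizon."""
    failures = sum(
        exceeds(simulate_ar1(n_steps, rng), threshold) for _ in range(n_runs)
    )
    return failures / n_runs

rng = random.Random(0)
p_fail = first_passage_probability(threshold=5.0, n_steps=100, n_runs=2000, rng=rng)
print(p_fail)
```

Each Monte Carlo run needs one full simulation, so rare failures require many runs. That cost is what motivates replacing the simulator with a surrogate, as the abstract describes.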

2605.12235 2026-05-13 stat.ML cs.LG

Optimal Policy Learning under Budget and Coverage Constraints

Giovanni Cerulli

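AI summary: This paper studies optimal policy learning under budget and minimum coverage constraints. It shows that the problem has a knapsack-type structure and that the optimal policy can be characterized by an affine threshold rule combining budget and coverage shadow prices. The linear programming relaxation of the combinatorial problem has a constant integrality gap, implying asymptotic equivalence with the optimal discrete allocation. Building on this, the author proposes two implementable algorithms, a Greedy-Lagrangian algorithm and a rank-and-cut algorithm, and verifies their near-optimal performance under different conditions through experiments.

Details
Abstract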

We study optimal policy learning under combined budget and minimum coverage constraints. We show that the problem admits a knapsack-type structure and that the optimal policy can be characterized by an affine threshold rule involving both budget and coverage shadow prices. We establish that the linear programming relaxation of the combinatorial solution has an O(1) integrality gap, implying asymptotic equivalence with the optimal discrete allocation. Building on this result, we analyze two implementable approaches: a Greedy-Lagrangian (GLC) and a rank-and-cut (RC) algorithm. We show that the GLC closely approximates the optimal solution and achieves near-optimal performance in finite samples. By contrast, RC is approximately optimal whenever the coverage constraint is slack or costs are homogeneous, while misallocation arises only when cost heterogeneity interacts with a binding coverage constraint. Monte Carlo evidence supports these findings.

2605.12190 2026-05-13 stat.ML cs.LG

Information-Theoretic Generalization Bounds for Sequential Decision Making

Futoshi Futami, Masahiro Fujisawa

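AI summary: This paper studies generalization bounds for sequential decision-making problems such as online learning, streaming active learning, and multi-armed bandits, and proposes a sequential supersample framework. By separating the learner filtration from a proof-side enlargement used for ghost-coordinate comparisons, it controls the generalization gap with sequential conditional mutual information (CMI), built from roundwise selector-loss information terms, and establishes a Bernstein-type refinement that yields faster rates under suitable variance conditions. The approach applies to a range of sequential decision-making settings and provides a new tool for algorithm-dependent generalization analysis.

Details
Abstract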

Information-theoretic generalization bounds based on the supersample construction are a central tool for algorithm-dependent generalization analysis in the batch i.i.d. setting. However, existing supersample conditional mutual information (CMI) bounds do not directly apply to sequential decision-making problems such as online learning, streaming active learning, and bandits, where data are revealed adaptively and the learner evolves along a causal trajectory. To address this limitation, we develop a sequential supersample framework that separates the learner filtration from a proof-side enlargement used for ghost-coordinate comparisons. Under a row-wise exchangeability assumption, the sequential generalization gap is controlled by sequential CMI, a sum of roundwise selector-loss information terms. We also establish a Bernstein-type refinement that yields faster rates under suitable variance conditions. The selector-SCMI proof strategy applies to online learning, streaming active learning with importance weighting, and stochastic multi-armed bandits.

2605.12136 2026-05-13 stat.ME

Synthetic Control Method with Mixed Frequency Data

Lu Zhang, Shijin Gong, Xinyu Zhang

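AI summary: This paper studies how to apply the synthetic control method (SCM) to mixed-frequency data, where variables are observed at different temporal resolutions, as is common in economics and finance. The authors propose a mixed-frequency synthetic control method (MF-SCM) with a flexible estimation procedure that integrates data of different frequencies without discarding information, and they establish the theoretical properties of the estimator. The method is validated in numerical simulations and two empirical applications, showing its usefulness for evaluating real policy effects.

Details
Abstract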

Mixed-frequency data, where variables are observed at different temporal resolutions, commonly occur in economic and financial studies. Classical synthetic control methods (SCM) are ill-suited for such data, often necessitating aggregation or prefiltering that may discard valuable information. This paper proposes a novel Mixed-Frequency Synthetic Control Method (MF-SCM) to integrate mixed-frequency data into the synthetic control framework effectively. We develop a flexible estimation procedure to construct synthetic control weights under mixed-frequency settings and establish the theoretical properties of the MF-SCM estimator. Specifically, we first prove that the estimator achieves asymptotic optimality, in the sense that it achieves the lowest possible squared prediction error among all potential treatment effect estimators from averaging outcomes of control units. We then derive the asymptotic distribution of the average treatment effect (ATE) estimator using projection theory and construct confidence intervals for the ATE estimator. The method's effectiveness is demonstrated through numerical simulations and two empirical applications concerning the 2017 Tax Cuts and Jobs Act in the US and air pollution alerts.

2605.12103 2026-05-13 stat.ME

Informative Simultaneous Confidence Intervals for Graphical Group Sequential Test Procedures

Liane Kluge, Werner Brannath

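AI summary: This paper studies tests that control the family-wise error rate in group sequential clinical trials with multiple hypotheses. It proposes a new test strategy based on the repeated p-value of the current stage, which works consistently across all hypotheses and is more powerful. Corresponding simultaneous confidence intervals are given, and earlier work on informative confidence intervals for one-stage graphical tests is extended with iterative algorithms that compute informative bounds at a small loss of power relative to the original test, together with a criterion for assessing the accuracy of the numerically computed bounds. The results support more reliable estimation of treatment effects in group sequential trials with multiple hypotheses.

Details
Abstract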

Test procedures for multiple hypotheses in a group sequential clinical trial that control the family-wise error rate are considered. Several graphical group sequential tests suggested in the literature, which are special cases of Bonferroni-closure tests, are discussed. The focus is on the question of whether to consider at the current stage only the evidence of the current repeated p-value or the evidence over all repeated p-values from the previous stages. A new test strategy controlling the family-wise error rate is introduced that consistently works across all hypotheses, with the evidence (i.e., repeated p-value) from the current stage. The strategy is more powerful than similar previously suggested test procedures. This is achieved by using the evidence from previous stages to increase the significance levels. For the test procedures, corresponding compatible simultaneous confidence intervals are presented, having the disadvantage of often not providing additional information on the treatment effects. For this reason, we extend previous work about informative simultaneous confidence intervals for one-stage graphical tests to graphical group sequential trials. Iterative algorithms are introduced that calculate these informative bounds that have a small power loss compared to the original graphical group sequential test. The boundaries can be calculated after each stage. In addition, previous work is extended by a criterion to estimate the accuracy of the numerically calculated boundaries. The suggested informative bounds can be used to provide median-conservative, i.e., reliable estimators, for estimating the treatment effects in a group sequential test with multiple hypotheses.

2605.12099 2026-05-13 stat.ME q-fin.ST

Bayesian Dynamic Modeling of Realized Volatility in Financial Asset Price Forecasting

Patrick Woitschig, Mike West

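AI summary: This paper proposes a class of Bayesian dynamic models for bivariate time series of financial asset prices and realized volatility. A new dynamic gamma process model for realized volatility is combined with traditional Bayesian dynamic linear models to capture volatility leverage and feedback effects in the price series. Synthesizing high-frequency data improves the tracking and forecasting of volatility changes, and empirical analysis of multiple S&P sector ETFs shows improved asset price forecasts, with practical relevance for portfolio construction and risk management.

Details
Comments
20 pages, 7 figures
Abstract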

We present a new class of Bayesian dynamic models for bivariate price-realized volatility time series in financial forecasting. A novel dynamic gamma process model adopted for realized volatility is integrated with traditional Bayesian dynamic linear models (DLMs) for asset price series. This represents reduced-form volatility leverage and feedback effects through use of realized volatility proxies in conditional DLMs for prices or returns, coupled with the synthesis of higher frequency data to track and anticipate volatility fluctuations. Analysis is computationally straightforward, extending conjugate-form Bayesian analyses for sequential filtering and model monitoring with simple and direct simulation for forecasting. A main applied setting is equity return forecasting with daily prices and realized volatility from high-frequency, intraday data. Detailed empirical studies of multiple S&P sector ETFs highlight the improvements achievable in asset price forecasting relative to standard models and deliver contextual insights on the nature and practical relevance of volatility leverage and feedback effects. The analytic structure and negligible extra computational cost will enable scaling to higher dimensions for multivariate price series forecasting for decouple/recouple portfolio construction and risk management applications.

2605.12092 2026-05-13 stat.ME

Laplacian-P-splines for shared Gamma frailty models applied to clustered right-censored time-to-event data

Piotr Lewczuk, Oswaldo Gressani, Steven Abrams, Christel Faes

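AI summary: This paper studies the analysis of right-censored time-to-event data with unobserved cluster-level risk factors using shared Gamma frailty models. The authors propose an approach based on Laplacian-P-splines (LPS) that relies on Gaussian approximations instead of Markov chain Monte Carlo sampling, giving efficient, sampling-free parameter estimation. Analytical expressions for the gradient and Hessian improve computational efficiency and allow natural uncertainty quantification. The method is assessed in a simulation study and on three real biomedical datasets.

Details
Comments
31 pages, 1 figure, 5 tables
Abstract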

Shared frailty models have been proposed to accommodate unmeasured cluster-specific risk factors through the inclusion of a common latent frailty term. Among possible frailty distributions, the Gamma distribution is appealing due to its non-negativity, flexibility, and algebraic tractability leading to closed-form marginal survival or hazard function expressions. Under the Bayesian paradigm, the posterior distributions of model parameters are usually explored with computationally intensive procedures relying on Markov chain Monte Carlo sampling. As an alternative, Laplacian-P-splines (LPS) provide a flexible and sampling-free approach by relying on Gaussian approximations of the posterior target distributions. In this model class, analytical formulas are obtained for the gradient and Hessian, yielding a computationally efficient inference scheme for estimation of model parameters with a natural way of quantifying uncertainty. This article extends the LPS toolbox to the inclusion of shared Gamma frailty models for clustered time-to-event data. We assess the finite-sample performance of the LPS estimation procedure through an extensive simulation study and compare estimates with those obtained using penalized partial likelihood estimation, without specification of the baseline hazard, and with the variance of the frailty term being estimated using profile likelihood. Finally, the proposed LPS estimation method is exemplified using three publicly available biomedical datasets on: (i) recurrent infections in children, (ii) cancer prevention, and (iii) kidney transplantation.

2605.12089 2026-05-13 stat.ME astro-ph.IM hep-ex stat.AP

Power Studies For Two-Sample and Goodness-of-Fit Methods For Multivariate Data

Wolfgang Rolke

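AI summary: Through a large number of simulation studies, this paper examines the power of various goodness-of-fit and non-parametric two-sample tests for multivariate data. Performance varies considerably across combinations of null hypothesis and alternative, so the author recommends a small set of methods chosen so that at least one has good power in each case studied. The studies were implemented with the R packages MD2sample and MDgof.

Details
Abstract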

We present the results of a large number of simulation studies regarding the power of various goodness-of-fit as well as non-parametric two-sample tests for multivariate data. In two dimensions this includes both continuous and discrete data, in higher dimensions continuous data only. In general no single method can be relied upon to provide good power; any one method may be quite good for some combination of null hypothesis and alternative and may fail badly for another. Based on the results of these studies we propose a fairly small number of methods chosen such that for any of the case studies included here at least one of the methods has good power. The studies were carried out using the R packages MD2sample and MDgof, available from CRAN.

2605.12025 2026-05-13 cs.LG stat.ML

Approximation Theory of Laplacian-Based Neural Operators for Reaction-Diffusion System

Takashi Furuya, Ryo Ozawa, Jenn-Nan Wang

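AI summary: This paper studies the approximation theory of Laplacian-based neural operators for nonlinear reaction-diffusion systems, taking a generalized Gierer-Meinhardt model as the example and analyzing the problem of learning the map from initial conditions to time-dependent solutions. Using the Laplacian spectral representation of the PDE's Green's function, the authors derive explicit approximation error bounds in terms of network depth, width, and spectral rank, and show that the required parameter complexity grows polynomially with the target accuracy, avoiding the exponential growth in parameter complexity faced by generic operator learning. Numerical experiments support the theoretical results.

Details
Abstract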

Neural operators provide a framework for learning solution operators of partial differential equations (PDEs), enabling efficient surrogate modeling for complex systems. While universal approximation results are now well understood, approximation analysis specific to nonlinear reaction-diffusion systems remains limited. In this paper, we study neural operators applied to the solution mapping from initial conditions to time-dependent solutions of a generalized Gierer-Meinhardt reaction-diffusion system, a prototypical model of nonlinear pattern formation. Our main results establish explicit approximation error bounds in terms of network depth, width, and spectral rank by exploiting the Laplacian spectral representation of the Green's function underlying the PDE. We show that the required parameter complexity grows at most polynomially with respect to the target accuracy, demonstrating that Laplacian eigenfunction-based neural operator architectures alleviate the curse of parametric complexity encountered in generic operator learning. Numerical experiments on the Gierer-Meinhardt system support the theoretical findings.

2605.11987 2026-05-13 cs.AI cs.LG stat.AP stat.ML

Random-Set Graph Neural Networks

Tommy Woodley, Shireen Kudukkil Manchingal, Matteo Tolloso, Davide Bacciu, Fabio Cuzzolin

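AI summary: This paper proposes Random-Set Graph Neural Networks (RS-GNN), a new graph neural network framework for quantifying node-level uncertainty more accurately. Node-level epistemic uncertainty is modeled in a belief-function formalism, so the network outputs both a precise probability prediction and a measure of uncertainty. Experiments show strong uncertainty quantification on several real-world graph learning datasets.

Details
Comments
23 pages, 6 figures
Abstract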

Uncertainty quantification has become an important factor in understanding the data representations produced by Graph Neural Networks (GNNs). Despite their predictive capabilities being ever useful across industrial workspaces, the inherent uncertainty induced by the nature of the data is a huge mitigating factor to GNN performance. While aleatoric uncertainty is the result of noisy and incomplete stochastic data such as missing edges or over-smoothing, epistemic uncertainty arises from lack of knowledge about a system or model (e.g., a graph's topology or node feature representation), which can be reduced by gathering more data and information. In this paper, we propose an original new framework in which node-level epistemic uncertainty is modelled in a belief function (finite random set) formalism. The resulting Random-Set Graph Neural Networks have a belief-function head predicting a random set over the list of classes, from which both a precise probability prediction and a measure of epistemic uncertainty can be obtained. Extensive experiments on 9 different graph learning datasets, including real-world autonomous driving benchmarks such as Nuscene and ROAD, demonstrate RS-GNN's superior uncertainty quantification capabilities.

2605.11983 2026-05-13 cs.LG stat.ML

QDSB: Quantized Diffusion Schrödinger Bridges

Tobias Fuchs, Florian Kalinke, Nadja Klein

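AI summary: Learning generative models when the source and target distributions are specified only through unpaired samples is increasingly important. This paper proposes quantized diffusion Schrödinger bridges (QDSB) to accelerate the training of simulation-free Schrödinger bridges. The method computes the endpoint coupling on anchor-quantized distributions and maps the result back to the original data points through cell-wise sampling, reducing computational cost while keeping the global transport structure stable. Experiments show that QDSB substantially improves training efficiency while maintaining sample quality.

Details
Abstract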

Learning generative models in settings where the source and target distributions are only specified through unpaired samples is gaining in importance. Here, frequently used models are Schrödinger bridges (SBs), which represent the most likely evolution between both endpoint distributions. To accelerate training, simulation-free SBs avoid the path simulation of the original SB models. However, learning simulation-free SBs requires paired data; a coupling of the source and target samples is obtained as the solution of the entropic optimal transport (OT) problem. As obtaining the optimal global coupling is infeasible in many practical cases, the entropic OT problem is iteratively solved on minibatches instead. Still, the repeated cost remains substantial and the locality can distort the global transport geometry. We propose quantized diffusion Schrödinger bridges (QDSB), which compute the endpoint coupling on anchor-quantized endpoint distributions and lift the resulting plan back to original data points through cell-wise sampling. We show that the regularized optimal coupling is stable w.r.t. anchor quantization, with an error controlled by the quality of the anchor approximation. In real-world experiments, QDSB matches the sample quality of existing baselines, requiring substantially less time. Code and data are available at github.com/mathefuchs/qdsb.

2605.11973 2026-05-13 math.ST math.PR stat.TH

Stochastic Ordering under Weaker Likelihood-Ratio Shape Conditions

Z. Derbazi

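AI summary: This paper shows that endpoint criteria for the hazard-rate and usual stochastic orders continue to hold under weaker shape conditions on the likelihood ratio. The endpoint property is preserved under unimodality and under a sign-pattern condition on the likelihood ratio minus one, and the author further gives a direct criterion based on superlevel sets that is especially useful for discontinuous likelihood ratios. The results broaden the applicability of these classical orderings and give more flexible theoretical support for related statistical inference.

Details
Comments
10 pages, 2 figures
Abstract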

We show that the shape hypothesis on a likelihood ratio can be weakened while retaining endpoint criteria for the hazard-rate and usual stochastic orders. The endpoint reduction persists under unimodality of the likelihood ratio and under a sign-pattern condition on the likelihood ratio minus one, with at most two sign changes and a negative right tail. It also follows from a direct superlevel-set criterion involving the same expression, which is useful in particular for discontinuous likelihood ratios.

2605.11935 2026-05-13 stat.ME stat.CO

Bayesian low-rank latent-cluster regression for mixed health outcomes

Hsin-Hsiung Huang, Suyeon Kang

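AI summary: This study proposes a Bayesian low-rank latent-cluster regression model for high-dimensional health and surveillance data with correlated responses of multiple types and latent heterogeneity. The model is a finite mixture of regression surfaces in which each latent cluster has its own mean shift and low-rank coefficient matrix, providing clustering, dimension reduction, and interpretability. Multiplicative gamma process shrinkage adapts the effective rank of each cluster, the number of clusters and the maximal rank are chosen with a WAIC-based criterion, and the model shows good clustering performance under different data-generating mechanisms as well as in applications.

Details
Abstract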

High-dimensional health and surveillance studies often involve many collinear predictors, multiple correlated outcomes of different types, and latent heterogeneity across observational units. We propose a Bayesian latent-cluster reduced-rank regression model for multivariate mixed outcomes. The model is a finite mixture of regression surfaces: each latent cluster has a cluster-specific mean shift and a low-rank coefficient matrix, yielding simultaneous clustering, dimension reduction, and component-wise interpretability. Response coordinates may be Gaussian, Bernoulli, or negative binomial. Multiplicative gamma process shrinkage adapts the effective rank within each cluster, and a WAIC-based criterion is used to tune the number of clusters and the nominal maximal rank. We establish posterior contraction for the identifiable component-specific regression surfaces and mean shifts, up to label permutation, and derive corresponding contraction for predictor-side singular subspaces. We also analyze the default label-invariant reporting pipeline based on the posterior similarity matrix: an eigenspace embedding followed by mean shift is shown to consistently recover the latent partition under an additional strong separation margin. Simulation experiments spanning all-Gaussian, all-Bernoulli, all-negative-binomial, and mixed Gaussian-Bernoulli-negative-binomial regimes show accurate recovery of the number of clusters and competitive clustering performance against $K$-means, mclust, PCA-based clustering, and a Gaussian reduced-rank mixture benchmark. We illustrate the method in three applications that show how the model separates individual-level utilization groups and produces interpretable county- and state-level cluster maps together with response-specific posterior predictive maps.

2605.11926 2026-05-13 stat.AP

An ensemble prediction method for forecasting sap flux density and water-use in temperate trees

Mengyi Gong, Rebecca Killick, Andrew Hirons

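AI summary: This paper proposes an ensemble prediction method based on additive models for forecasting the daily water use of temperate trees, with the aim of improving irrigation management. The method combines sap flow sensors, weather station data, and statistical models, and produces reliable forecasts of tree water use by accounting for the non-linear relationships between environmental drivers and sap flux density and for the variability of individual trees across growing seasons. Field data on nine tree species from 2022 to 2024 are used to validate the method, and the effects of climate stress and tree size on prediction are discussed, giving a general forecasting framework for agriculture, forestry, and conservation.

Details
Comments
Main manuscript: 18 pages, 6 figures. Supplementary document: 11 pages, 10 figures
Abstract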

Efficient irrigation management is crucial to agriculture, forestry and horticulture, especially under climate change. Developments in novel sensors and Internet of Things technology provide an opportunity to carry out real-time monitoring of tree sap flux density, which, when coupled with advanced modelling techniques, enables online prediction of tree water-use suitable for irrigation planning. This manuscript proposes one such pipeline that integrates tree sap flow sensors, weather station sensors, and statistical models to predict tree daily water-use. In particular, an ensemble prediction approach based on additive models has been developed, using weather data as the main predictors of sap flux density. The method simultaneously considers the non-linear relationships and interactions between sap flux density and its environmental drivers, as well as the variability among individual trees over different growing seasons. Using field data collected on nine species of trees over the 2022, 2023 and 2024 growing seasons, this manuscript demonstrates the ability of the proposed ensemble prediction method to produce reliable daily water-use forecasts. The challenge of predicting tree water-use under climate stress, such as heatwaves, and the impact of tree sizes on prediction have also been discussed. Despite the complexity of the problem, the proposed method provides a general framework which can be used in a variety of settings, from commercial tree growers to conservation work. The model can be integrated into an online monitoring platform, assisting real-time decision making on irrigation management.

2605.11872 2026-05-13 cs.LG stat.ML

LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection

Lanxin Zhao, Bamdev Mishra, Pratik Jawanpuria, Lequan Lin, Dai Shi, Junbin Gao, Andi Han

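AI summary: This paper proposes LOFT, a low-rank orthogonal fine-tuning framework that addresses the conflation of subspace selection and transformation type in existing orthogonal parameter-efficient fine-tuning methods. By viewing orthogonal fine-tuning as a subspace rotation, LOFT unifies several existing methods, treats support selection as a central design element, and proposes practical support selection strategies based on task signals. Experiments show that LOFT achieves a favorable efficiency-performance balance across multiple tasks, highlighting the importance of principled support selection for improving orthogonal fine-tuning.

Details
English abstract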

Orthogonal parameter-efficient fine-tuning (PEFT) adapts pretrained weights through structure-preserving multiplicative transformations, but existing methods often conflate two distinct design choices: the subspace in which adaptation occurs and the transformation applied within that subspace. This paper introduces LOFT, a low-rank orthogonal fine-tuning framework that explicitly separates these two components. By viewing orthogonal adaptation as a multiplicative subspace rotation, LOFT provides a unified formulation that recovers representative orthogonal PEFT methods, including coordinate-, butterfly-, Householder-, and principal-subspace-based variants. More importantly, this perspective exposes support selection as a central design axis rather than a byproduct of a particular parameterization. We develop a first-order analysis showing that useful adaptation supports should be informed by the downstream training signal, motivating practical task-aware support selection strategies. Across language understanding, visual transfer, mathematical reasoning, and multilingual out-of-distribution adaptation, LOFT recovers principal-subspace orthogonal adaptation while gradient-informed supports improve the efficiency-performance trade-off under matched parameter, memory, and compute budgets. These results suggest that principled support selection is an important direction for improving orthogonal PEFT.
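
Illustrative sketch (not taken from the paper): one simple way to read "multiplicative subspace rotation" is to rotate only inside an r-dimensional support spanned by an orthonormal basis U and leave the orthogonal complement untouched, i.e. Q = I + U (R - I) U^T with R an r x r orthogonal matrix. The function name, the random choice of U, and the shapes below are assumptions made for illustration; the paper's task-aware support selection is not reproduced here.

```python
import numpy as np

def subspace_rotation(W, U, R):
    """Left-multiply W by Q = I + U (R - I) U^T.

    Q is orthogonal whenever U has orthonormal columns and R is orthogonal,
    so the singular values of W are unchanged.
    """
    r = U.shape[1]
    Q = np.eye(W.shape[0]) + U @ (R - np.eye(r)) @ U.T
    return Q @ W, Q

rng = np.random.default_rng(0)
d, k, r = 8, 5, 2
W = rng.standard_normal((d, k))                    # stand-in for a pretrained weight
U, _ = np.linalg.qr(rng.standard_normal((d, r)))   # hypothetical support basis (d x r)
R, _ = np.linalg.qr(rng.standard_normal((r, r)))   # orthogonal transform inside the support
W_new, Q = subspace_rotation(W, U, R)
```

Under these assumptions, the choice of U (the support) and the choice of R (the transformation) are independent inputs, which mirrors the separation the abstract describes.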

2605.11865 2026-05-13 stat.ML cs.LG

Variance-aware Reward Modeling with Anchor Guidance

Shuxing Fang, Ruijian Han, Liangyu Zhang, Fan Zhou

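AI summary: This paper studies how to improve reward models so that they reflect preference uncertainty more accurately when human preferences are pluralistic. It proposes an anchor-guided, variance-aware reward modeling method that introduces two coarse response-level anchor labels to resolve the fundamental non-identifiability of Gaussian reward models trained on pairwise preference data alone. The method shows superior reward modeling performance and reinforcement learning results in both theoretical analysis and on multiple real-world datasets.

Details
English abstract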

Standard Bradley--Terry (BT) reward models are limited when human preferences are pluralistic. Although soft preference labels preserve disagreement information, BT can only express it by shrinking reward margins. Gaussian reward models provide an alternative by jointly predicting a reward mean and a reward variance, but suffer from a fundamental non-identifiability from pairwise preferences alone. We propose Anchor-guided Variance-aware Reward Modeling, a framework that resolves this non-identifiability by augmenting preference data with two coarse response-level anchor labels. Building on this, we prove that two anchors are sufficient for identification, develop a joint training objective and establish a non-asymptotic convergence rate for both the estimated reward mean and variance functions. Across simulation studies and four real-world diverging-preference datasets, our method consistently improves reward modeling performance and downstream RLHF, including PPO training and best-of-$N$ selection.
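
Illustrative sketch (not taken from the paper): to see why pairwise preferences alone may fail to pin down a Gaussian reward model, assume independent rewards r_a ~ N(mu_a, sd_a^2) and r_b ~ N(mu_b, sd_b^2), so that P(a preferred to b) = Phi((mu_a - mu_b) / sqrt(sd_a^2 + sd_b^2)). This probit form is an assumption; the paper's exact likelihood may differ. Under it, multiplying every mean and standard deviation by the same positive constant leaves all preference probabilities unchanged.

```python
import math

def pref_prob(mu_a, sd_a, mu_b, sd_b):
    """P(a preferred to b) for independent Gaussian rewards (assumed model)."""
    z = (mu_a - mu_b) / math.sqrt(sd_a**2 + sd_b**2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Original parameters and the same parameters scaled by c = 3.
p1 = pref_prob(1.0, 0.5, 0.2, 1.0)
p2 = pref_prob(3.0, 1.5, 0.6, 3.0)

# Shifting both means by a constant also leaves the probability unchanged.
p3 = pref_prob(1.0 + 10.0, 0.5, 0.2 + 10.0, 1.0)
```

Here p1, p2, and p3 agree up to floating-point error, so extra information beyond pairwise comparisons is needed to fix the scale; the abstract states that two response-level anchor labels suffice.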

2605.11841 2026-05-13 stat.ML cs.AI cs.LG

Minimax Rates and Spectral Distillation for Tree Ensembles

Binh Duc Vu, David S. Watson

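AI summary: This paper studies the theoretical properties of tree ensembles such as random forests and gradient boosting machines and proposes an analysis framework based on spectral methods. By analyzing the eigenvalue decay of the induced kernel operator, it derives minimax convergence rates for random forest regression and, building on this viewpoint, develops model compression methods. These methods learn the leading eigenfunctions of the kernel operator or the leading singular vectors of the smoother matrix, producing distilled models that predict well at a greatly reduced size and suit resource-constrained computing.

Details
Comments
9 pages main text, 33 pages total, with 12 figures and 7 tables total
English abstract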

Tree ensembles such as random forests (RFs) and gradient boosting machines (GBMs) are among the most widely used supervised learners, yet their theoretical properties remain incompletely understood. We adopt a spectral perspective on these algorithms, with two main contributions. First, we derive minimax-optimal convergence for RF regression, showing that, under mild regularity conditions on tree growth, the eigenvalue decay of the induced kernel operator governs the statistical rate. Second, we exploit this spectral viewpoint to develop compression schemes for tree ensembles. For RFs, leading eigenfunctions of the kernel operator capture the dominant predictive directions; for GBMs, leading singular vectors of the smoother matrix play an analogous role. Learning nonlinear maps for these spectral representations yields distilled models that are orders of magnitude smaller than the originals while maintaining competitive predictive performance. Our methods compare favorably to state-of-the-art algorithms for forest pruning and rule extraction, with applications to resource-constrained computing.

2605.08471 2026-05-13 math.ST stat.TH

Asymptotics for likelihood ratio tests of boundary points with singular information and unidentifiable nuisance parameters

Karl Oskar Ekvall, Ola Hössjer, Matteo Bottai, J. M. Patrik Albin

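AI summary: This paper studies the asymptotic distribution of likelihood ratio tests when nuisance parameters are unidentifiable, the parameters of interest lie on the boundary of the parameter space, and the information matrix of the identifiable parameters may be singular. The authors develop a theoretical framework for such settings, showing that the asymptotic distribution of the test statistic is the supremum of a $\barχ^2$-process, which generalizes to the supremum of a noncentral $\barχ^2$-process under local alternatives. The result unifies and extends existing conclusions on boundary inference and singular information, and applies to practical problems such as mixture models and genetic linkage analysis.

Details
English abstract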

We establish the asymptotic distribution of likelihood ratio tests (LRTs) in settings where some of the nuisance parameters are unidentifiable under the null hypothesis, parameters of interest lie on the boundary of the parameter space, and the information matrix of the identifiable parameters may be singular. Our work is motivated by mixture models and genetic linkage analysis, which exhibit all three features simultaneously, but it is applicable more broadly to other problems such as change-point detection. Under suitable regularity conditions, the asymptotic distribution of the LRT statistic under the null hypothesis is the supremum of a $\barχ^2$-process, that is, a stochastic process whose marginal distributions are mixtures of $χ^2$-distributions with weights depending on the nuisance parameter. Under local alternatives, the asymptotic distribution of the LRT statistic is the supremum of a noncentral $\barχ^2$-process, whose marginal distributions are mixtures of truncated, noncentral $χ^2$-distributions. In contrast to prior work on singular information, where singularity stems from the parameter of interest and changes the form of the limit distribution, here singularity is determined by the nuisance parameter and the limit has the same form as in the nonsingular case. Existing results for boundary inference with nonsingular information or without nuisance parameters are obtained as special cases, and several existing application-specific results for mixture models and genetic linkage analysis are recovered and extended.

2605.02858 2026-05-13 math.ST stat.TH

Characterizing Schur-concave commutative copulas as the closure of associative ones

Manuel Úbeda-Flores

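AI summary: This paper studies the closure of the class of associative copulas under the uniform metric and proves that this closure coincides with the class of Schur-concave commutative copulas. Through the closure of the convex hull, it relates the two classes by inclusion and establishes their equivalence, offering a new structural understanding for copula theory.

Details
Comments
The author wishes to withdraw this manuscript due to an error that could invalidate the main conclusion of the work. The paper requires a complete revision.
English abstract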

Let $\mathcal{C}_a$ denote the class of associative copulas, and let $\overline{\mathcal{C}}_a$ be the closure, in the uniform metric $d_\infty$, of the convex hull of $\mathcal{C}_a$. It is known that $\mathcal{C}_a \subseteq \mathcal{C}_{SC}$, the class of Schur-concave commutative copulas. We prove the reverse inclusion, establishing $\overline{\mathcal{C}}_a = \mathcal{C}_{SC}$.

2604.24037 2026-05-13 cs.LG math.ST stat.TH

A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws

Jun Shu, Junxiong Jia, Deyu Meng, Zongben Xu

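AI summary: Taking the perspective of limit theory, this paper proposes a mathematical approach to formalizing emergent intelligence in foundation models. It introduces a performance function that depends on data size, model size, and training steps, views the emergence of intelligent behavior as a transition from finite to infinite knowledge, and characterizes the phenomenon through the existence of a limit. The theoretical analysis shows that emergent intelligence is closely tied to the existence of a limit architecture, and it derives scaling laws for foundation models, providing a theoretical basis for understanding how intelligence emerges.

Details
Comments
There are some typos and inaccurate expressions in this version
English abstract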

Emergent intelligence has played a major role in modern AI development. While existing studies primarily rely on empirical observations to characterize this phenomenon, a rigorous theoretical framework remains underexplored. This study attempts to develop a mathematical approach to formalize emergent intelligence from the perspective of limit theory. Specifically, we introduce a performance function $\mathcal{E}(N,P,K)$, dependent on data size $N$, model size $P$ and training steps $K$, to quantify intelligent behavior. We posit that intelligence emerges as a transition from finite to effectively infinite knowledge, and thus recast emergent intelligence as the existence of the limit $\lim_{N,P,K \to \infty} \mathcal{E}(N,P,K)$, with emergent abilities corresponding to the limiting behavior. This limit theory helps reveal that emergent intelligence originates from the existence of a parameter-limit architecture (referred to as the limit architecture), and that emergent intelligence rationally corresponds to the learning behavior of this limit system. By introducing tools from nonlinear Lipschitz operator theory, we establish necessary and sufficient conditions for the existence of the limit architecture. Furthermore, we derive the scaling law of foundation models by leveraging tools from Lipschitz operator theory and covering numbers. Theoretical results show that: 1) emergent intelligence is governed by three key factors: training steps, data size and the model architecture, where the properties of basic blocks play a crucial role in constructing foundation models; 2) the critical condition Lip(T)=1 for emergent intelligence provides theoretical support for existing findings; 3) emergent intelligence is determined by an infinite-dimensional system, yet can be effectively realized in practice through a finite-dimensional architecture. Our empirical results corroborate these theoretical findings.

2604.23260 2026-05-13 stat.ML cs.LG

Explicit integral representations and quantitative bounds for two-layer ReLU networks

Anthony Lee

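AI summary: This paper proposes a method for constructing explicit integral representations for two-layer ReLU networks, which yields relatively simple representations for any multivariate polynomial. By introducing a sharpened ReLU integral representation involving a harmonic extension and a projection, it gives quantitative error bounds showing that the $L^{2}(\mathcal{D})$ approximation error depends on the coefficients of the function's monomial expansion and the distribution $\mathcal{D}$ rather than explicitly on dimension or degree. The paper also connects this representation to the reproducing kernel Hilbert space of the exponential kernel and proposes a simple integral representation with better error bounds.

Details
English abstract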

An approach for constructing explicit integral representations for two-layer ReLU networks is presented, which provides relatively simple representations for any multivariate polynomial. Quantitative bounds are provided for a particular, sharpened ReLU integral representation, which involves a harmonic extension and a projection. The bounds demonstrate that functions can be approximated with $L^{2}(\mathcal{D})$ errors that do not depend explicitly on dimension or degree, but rather on the coefficients of their monomial expansions and the distribution $\mathcal{D}$. We also present a connection to the RKHS of the exponential kernel $K(x,y)=\exp\left(\left\langle x,y\right\rangle \right)$, and a very simple integral representation that additionally involves multiplication by a fixed function and has better quantitative bounds.

2604.16642 2026-05-13 q-bio.QM q-bio.CB q-bio.GN stat.AP

Geometric coherence of single-cell CRISPR perturbations reveals regulatory architecture and predicts cellular stress

Prashant C. Raju

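AI summary: This study proposes Shesha, a new geometric stability metric that assesses the directional consistency of single-cell CRISPR perturbation responses, reveals gene regulatory architecture, and predicts cellular stress states. Analyzing multiple CRISPR datasets, the study finds that stability correlates strongly with perturbation effect magnitude, but that the two decouple in some cases, exposing the biological characteristics of different regulators. The metric offers a new perspective for target prioritization in screening experiments, phenotypic quality control in cell manufacturing, and the evaluation of computational perturbation predictions.

Details
English abstract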

Genome engineering has achieved remarkable sequence-level precision, yet predicting the transcriptomic state that a cell will occupy after perturbation remains an open problem. Single-cell CRISPR screens measure how far cells move from their unperturbed state, but this effect magnitude ignores a fundamental question: do the cells move together? Two perturbations with identical magnitude can produce qualitatively different outcomes if one drives cells coherently along a shared trajectory while the other scatters them across expression space. We introduce a geometric stability metric, Shesha, that quantifies the directional coherence of single-cell perturbation responses as the mean cosine similarity between individual cell shift vectors and the mean perturbation direction. Across five CRISPR datasets (2,200+ perturbations spanning CRISPRa, CRISPRi, and pooled screens), stability correlates strongly with effect magnitude (Spearman $ρ=0.75-0.97$), with a calibrated cross-dataset correlation of 0.97. Crucially, discordant cases where the two metrics decouple expose regulatory architecture: pleiotropic master regulators such as CEBPA and GATA1 pay a "geometric tax," producing large but incoherent shifts, while lineage-specific factors such as KLF1 produce tightly coordinated responses. After controlling for magnitude, geometric instability is independently associated with elevated chaperone activation (HSPA5/BiP; $ρ_{partial}=-0.34$ and $-0.21$ across datasets), and the high-stability/high-stress quadrant is systematically depleted. The magnitude-stability relationship persists in scGPT foundation model embeddings, confirming it is a property of biological state space rather than linear projection. Perturbation stability provides a complementary axis for hit prioritization in screens, phenotypic quality control in cell manufacturing, and evaluation of in silico perturbation predictions.
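
Illustrative sketch (not the authors' implementation): the abstract defines the metric as the mean cosine similarity between individual cell shift vectors and the mean perturbation direction. The version below assumes each cell's shift is its expression vector minus the centroid of the control cells; that choice, the function name, and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def directional_coherence(perturbed, control):
    """Mean cosine similarity between per-cell shifts and the mean shift direction.

    perturbed, control: arrays of shape (n_cells, n_features).
    """
    shifts = perturbed - control.mean(axis=0)          # per-cell shift vectors
    mean_dir = shifts.mean(axis=0)
    mean_dir = mean_dir / np.linalg.norm(mean_dir)     # unit mean direction
    norms = np.maximum(np.linalg.norm(shifts, axis=1), 1e-12)  # guard against zero shifts
    return float(((shifts @ mean_dir) / norms).mean())

control = np.zeros((2, 3))
coherent = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])   # same direction, different magnitude
scattered = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # orthogonal directions

score_coherent = directional_coherence(coherent, control)
score_scattered = directional_coherence(scattered, control)
```

In this toy case the coherent shifts score 1 while the scattered shifts score about 0.71, even though the individual shift magnitudes are comparable, which is the distinction between magnitude and coherence that the abstract draws.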

2604.14322 2026-05-13 stat.ML cs.LG

Doubly Outlier-Robust Online Infinite Hidden Markov Model

Horace Yiu, Leandro Sánchez-Betancourt, Álvaro Cartea, Gerardo Duran-Martin

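AI summary: This paper studies how to make the online infinite hidden Markov model (iHMM) more robust when streaming data contain outliers and the model is misspecified. It defines robustness through the posterior influence function (PIF), gives conditions under which the online iHMM has bounded PIF, and proposes a method called BR-iHMM that balances adaptivity and robustness through two tunable parameters. Experiments show that the method substantially reduces one-step-ahead forecasting error on several real-world datasets, supporting its effectiveness for forecasting and interpretable online learning.

Details
Comments
43rd International Conference on Machine Learning (ICML 2026)
English abstract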

We derive a robust update rule for the online infinite hidden Markov model (iHMM) for when the streaming data contains outliers and the model is misspecified. Leveraging recent advances in generalised Bayesian inference, we define robustness via the posterior influence function (PIF), and provide conditions under which the online iHMM has bounded PIF. Imposing robustness inevitably induces an adaptation lag for regime switching. Our method, which is called Batched Robust iHMM (BR-iHMM), balances adaptivity and robustness with two additional tunable parameters. Across limit order book data, hourly electricity demand, and a synthetic high-dimensional linear system, BR-iHMM reduces one-step-ahead forecasting error by up to 67% relative to competing online Bayesian methods. Together with theoretical guarantees of bounded PIF, our results highlight the practicality of our approach for both forecasting and interpretable online learning.

2604.14083 2026-05-13 physics.comp-ph cond-mat.mtrl-sci stat.CO

Distributional Inverse Homogenization

Arnaud Vadeboncoeur, Mark Girolami, Kaushik Bhattacharya, Andrew M. Stuart

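AI summary: This study proposes a noninvasive method, called distributional inverse homogenization, for inferring the statistical properties of a material's microstructure from its macroscopic mechanical properties. Using large collections of macroscopic measurements, the method learns global statistics of the microstructure without observing it directly, and it applies to both periodic and stochastic homogenization. The study also shows how the natural spatial variability of microstructure can be exploited for distributional inversion and builds a surrogate model to accelerate computation, offering a new route to learning microstructural variability from macroscopic measurements.

Details
English abstract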

For many materials, macroscopic mechanical behavior is determined by an intricate microstructure. Understanding the relation between these two scales helps scientists and engineers design better materials. The relation which maps microstructure to bulk mechanical properties can be understood via the well-established theory of homogenization. However, inverting the homogenization process, to recover microstructural information from measured macroscopic properties, is fraught with difficulties because of the averaging processes that underlie homogenization. Therefore, scientists and engineers usually need recourse to more invasive, often highly localized, investigations to learn about a microstructure. In this work, we develop a noninvasive methodology by which one can leverage large collections of measured bulk mechanical properties to learn information about the statistics of microstructure at a global level. We call this distributional inverse homogenization. We study this problem in one and two dimensions, considering both periodic and stochastic homogenization. We demonstrate the methodology in the context of 2D Voronoi constructions and underpin the observed empirical success with theory in 1D. We also show how the natural spatial variability of microstructure can be exploited to gather data that enables distributional inversion. We concurrently learn a surrogate model, approximating the homogenization map, that accelerates the resulting computations in this setting. The work formulates a new class of inverse problems, bridging ideas from probability and homogenization to facilitate the learning of microstructural material variability from macroscopic measurements.

2603.22000 2026-05-13 cs.LG stat.ML

CRPS-Optimal Binning for Univariate Conformal Regression

Paolo Toccaceli

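AI summary: This paper proposes a binning-based method for non-parametric conditional distribution estimation that partitions the covariate-sorted observations into contiguous bins and uses the within-bin empirical CDF as the predictive distribution. Bin boundaries are chosen by minimizing the leave-one-out continuous ranked probability score (LOO-CRPS), and dynamic programming efficiently recovers the globally optimal partition. Experiments show that the method substantially narrows prediction intervals while keeping coverage close to the nominal level, outperforming several mainstream split-conformal regression methods.

Details
English abstract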

We propose a method for non-parametric conditional distribution estimation based on partitioning covariate-sorted observations into contiguous bins and using the within-bin empirical CDF as the predictive distribution. Bin boundaries are chosen to minimise the total leave-one-out Continuous Ranked Probability Score (LOO-CRPS), which admits a closed-form cost function with $O(n^2 \log n)$ precomputation and $O(n^2)$ storage; the globally optimal $K$-partition is recovered by a dynamic programme in $O(n^2 K)$ time. Minimisation of within-sample LOO-CRPS turns out to be inappropriate for selecting $K$ as it results in in-sample optimism. We instead select $K$ by $K$-fold cross-validation of test CRPS, which yields a U-shaped criterion with a well-defined minimum. Having selected $K^*$ and fitted the full-data partition, we form two complementary predictive objects: the Venn prediction band and a conformal prediction set based on CRPS as the nonconformity score, which carries a finite-sample marginal coverage guarantee at any prescribed level $\varepsilon$. The conformal prediction is transductive and data-efficient, as all observations are used for both partitioning and p-value calculation, with no need to reserve a hold-out set. On real benchmarks against split-conformal competitors (Gaussian split conformal, CQR, CQR-QRF, and conformalized isotonic distributional regression), the method produces substantially narrower prediction intervals while maintaining near-nominal coverage.
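
Illustrative sketch (not the paper's algorithm): the code below uses the standard identity CRPS(F, y) = E|X - y| - 0.5 E|X - X'| for an empirical distribution, computes a leave-one-out cost per bin by brute force, and runs a plain dynamic programme over contiguous bins of responses already sorted by covariate. The paper's closed-form cost and its stated complexity are not reproduced; the function names and the `min_size` parameter are assumptions.

```python
import numpy as np

def crps_empirical(sample, y):
    """CRPS of the empirical distribution of `sample`, evaluated at observation y."""
    sample = np.asarray(sample, dtype=float)
    return np.abs(sample - y).mean() - 0.5 * np.abs(sample[:, None] - sample[None, :]).mean()

def loo_crps_bin(values):
    """Sum over a bin of the CRPS of each point under the empirical CDF of the others."""
    values = np.asarray(values, dtype=float)
    return sum(crps_empirical(np.delete(values, i), values[i]) for i in range(len(values)))

def optimal_partition(y_sorted, K, min_size=2):
    """Best split of y_sorted into K contiguous bins; returns (total cost, [(start, end), ...])."""
    y_sorted = np.asarray(y_sorted, dtype=float)
    n, inf = len(y_sorted), float("inf")
    cost = {(i, j): loo_crps_bin(y_sorted[i:j])
            for i in range(n) for j in range(i + min_size, n + 1)}
    best = [[inf] * (n + 1) for _ in range(K + 1)]
    back = [[-1] * (n + 1) for _ in range(K + 1)]
    best[0][0] = 0.0
    for k in range(1, K + 1):
        for j in range(1, n + 1):
            for i in range(0, j - min_size + 1):
                if best[k - 1][i] < inf and best[k - 1][i] + cost[(i, j)] < best[k][j]:
                    best[k][j], back[k][j] = best[k - 1][i] + cost[(i, j)], i
    bins, j = [], n
    for k in range(K, 0, -1):          # walk the back-pointers from the last bin
        bins.append((back[k][j], j))
        j = back[k][j]
    return best[K][n], bins[::-1]

total, bins = optimal_partition([0.0, 0.1, 0.2, 5.0, 5.1], K=2)
```

On this toy input the two clusters of responses end up in separate bins. As the abstract notes, the in-sample criterion is optimistic as a function of K, so the number of bins would be chosen by cross-validation rather than by this cost.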

2603.14094 2026-05-13 stat.ML cs.LG math.ST stat.CO stat.ME stat.TH

Maximin Robust Bayesian Experimental Design

Hany Abdulsamad, Sahel Iqbal, Christian A. Naesseth, Takuo Matsubara, Adrien Corenflos

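AI summary: This paper studies the robustness of Bayesian experimental design under model misspecification, formulating it as a maximin game between the experimenter and an adversarial nature subject to information-theoretic constraints. It shows that Sibson's α-mutual information arises as the robust objective, identifies the α-tilted posterior as the robust belief update, and uses the Rényi divergence as the measure of conditional information gain. To reduce the bias and variance of nested Monte Carlo estimators, the authors adopt a PAC-Bayes framework to search over stochastic design policies, obtaining lower bounds on the robust expected information gain with explicit finite-sample error control.

Details
English abstract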

We address the brittleness of Bayesian experimental design under model misspecification by formulating the problem as a max--min game between the experimenter and an adversarial nature subject to information-theoretic constraints. We demonstrate that this approach yields a robust objective governed by Sibson's $α$-mutual information (MI), which identifies the $α$-tilted posterior as the robust belief update and establishes the Rényi divergence as the appropriate measure of conditional information gain. To mitigate the bias and variance of nested Monte Carlo estimators needed to estimate Sibson's $α$-MI, we adopt a PAC-Bayes framework to search over stochastic design policies, yielding rigorous high-probability lower bounds on the robust expected information gain that explicitly control finite-sample error.

2602.02406 2026-05-13 stat.ML cs.LG

Provably Data-driven Multiple Hyper-parameter Tuning with Structured Loss Function

Tung Quoc Le, Anh Tuan Nguyen, Viet Anh Nguyen

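AI summary: This paper studies generalization guarantees for tuning multi-dimensional hyperparameters in data-driven methods. Existing theory applies only to a one-dimensional hyperparameter, and the paper proposes the first general framework for the multi-dimensional case. The approach uses tools from real algebraic geometry to strengthen generalization bounds for semi-algebraic function classes, yielding sharper and more broadly applicable guarantees; it further extends to hyperparameter tuning under the validation loss and demonstrates the framework on new learning problems such as data-driven weighted group lasso and weighted fused lasso.

Details
Comments
Accepted to ICML 2026
English abstract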

Data-driven algorithm design automates hyperparameter tuning, but its statistical foundations remain limited because model performance can depend on hyperparameters in implicit and highly non-smooth ways. Existing guarantees focus on the simple case of a one-dimensional (scalar) hyperparameter. This leaves the practically important multi-dimensional hyperparameter tuning setting unresolved. We address this open question by developing the first general framework for establishing generalization guarantees for tuning multi-dimensional hyperparameters in data-driven settings. Our approach strengthens the generalization guarantee framework for semi-algebraic function classes by exploiting tools from real algebraic geometry, yielding sharper, more broadly applicable guarantees. For completeness, we also instantiate the first lower bound for this general setting. We further extend the analysis to hyperparameter tuning using the validation loss under minimal assumptions, and derive improved bounds when additional structure is available. Finally, we demonstrate the scope of the framework with new learnability results, including data-driven weighted group lasso and weighted fused lasso.

2601.19836 2026-05-13 stat.ME stat.AP

Personalized Treatment Hierarchies in Bayesian Network Meta-Analysis

Augustine Wigle, Erica E. M. Moodie

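AI summary: This paper studies how to construct personalized treatment hierarchies in Bayesian network meta-analysis (NMA), accounting for the influence of treatment-covariate interactions (TCIs) on treatment effects. The authors show that when an NMA model includes TCIs, treatment rankings should be generated for a specific covariate profile. The paper presents a method for constructing personalized treatment hierarchies from a covariate profile and demonstrates it on a real network of studies of treatments for depression, providing a new analysis tool for precision medicine.

Details
English abstract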

Network Meta-Analysis (NMA) is an increasingly popular evidence synthesis tool that can provide a ranking of competing treatments, also known as a treatment hierarchy. Treatment-Covariate Interactions (TCIs) can be included in NMA models to allow relative treatment effects to vary with covariate values. We show that in an NMA model that includes TCIs, treatment hierarchies should be created with a particular covariate profile in mind. We outline the typical approach for creating a treatment hierarchy in standard Bayesian NMA and show how a treatment hierarchy for a particular covariate profile can be created from an NMA model that estimates TCIs. We demonstrate our methods using a real network of studies for treatments of major depressive disorder.
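
Illustrative sketch (not the authors' code): assuming posterior draws of a baseline relative effect d_k and a single linear interaction coefficient beta_k for each treatment, with larger effects being better, a hierarchy for a covariate value x can be summarized by rank probabilities and SUCRA computed from d_k + beta_k * x. The linear single-covariate form, the hand-made draws, and the use of SUCRA as the summary are all assumptions for illustration.

```python
import numpy as np

def rank_probabilities(d, beta, x):
    """P(treatment t has rank r) at covariate value x; rank 0 is best.

    d, beta: arrays of shape (n_draws, n_treatments); the reference arm is all zeros.
    """
    effects = d + beta * x
    order = np.argsort(-effects, axis=1)     # treatments sorted best-first per draw
    ranks = np.argsort(order, axis=1)        # rank of each treatment per draw
    n_draws, n_trt = effects.shape
    return np.array([np.bincount(ranks[:, t], minlength=n_trt) / n_draws
                     for t in range(n_trt)])

def sucra(probs):
    """Surface under the cumulative ranking curve for each treatment."""
    n_trt = probs.shape[0]
    return probs.cumsum(axis=1)[:, :-1].sum(axis=1) / (n_trt - 1)

# Two hypothetical posterior draws for three treatments (column 0 is the reference).
d = np.array([[0.0, 1.0, 0.5], [0.0, 1.2, 0.4]])
beta = np.array([[0.0, -1.0, 0.5], [0.0, -1.1, 0.6]])

sucra_x0 = sucra(rank_probabilities(d, beta, x=0.0))
sucra_x2 = sucra(rank_probabilities(d, beta, x=2.0))
```

With these made-up draws, treatment 1 ranks first at x = 0 but last at x = 2, which illustrates why a hierarchy from a model with TCIs needs to be stated for a particular covariate profile.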

2512.11114 2026-05-13 cs.LG cs.AI stat.ML

In-Context Multi-Objective Optimization

Xinyu Zhang, Conor Hassan, Julien Martinelli, Daolang Huang, Samuel Kaski

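AI summary: Balancing multiple competing objectives is a pervasive challenge in multi-objective optimization, particularly in areas such as drug design and autonomous systems. This paper proposes TAMO, a fully amortized, universal policy that uses a transformer architecture to perform multi-objective black-box optimization across varying input and objective dimensions, without retraining the model for each task. Pretrained with reinforcement learning, TAMO proposes the next design in a single forward pass, which substantially improves computational efficiency, and it achieves strong Pareto front quality on multiple benchmarks and real tasks.

Details
English abstract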

Balancing competing objectives is omnipresent across disciplines, from drug design to autonomous systems. Multi-objective Bayesian optimization is a promising solution for such expensive, black-box problems: it fits probabilistic surrogates and selects new designs via an acquisition function that balances exploration and exploitation. In practice, it requires tailored choices of surrogate and acquisition that rarely transfer to the next problem, is myopic when multi-step planning is often required, and adds refitting overhead, particularly in parallel or time-sensitive loops. We present TAMO, a fully amortized, universal policy for multi-objective black-box optimization. TAMO uses a transformer architecture that operates across varying input and objective dimensions, enabling pretraining on diverse corpora and transfer to new problems without retraining: at test time, the pretrained model proposes the next design with a single forward pass. We pretrain the policy with reinforcement learning to maximize cumulative hypervolume improvement over full trajectories, conditioning on the entire query history to approximate the Pareto frontier. Across synthetic benchmarks and real tasks, TAMO produces fast proposals, reducing proposal time by 50-1000x versus alternatives while matching or improving Pareto quality under tight evaluation budgets. These results show that transformers can perform multi-objective optimization entirely in-context, eliminating per-task surrogate fitting and acquisition engineering, and open a path to foundation-style, plug-and-play optimizers for scientific discovery workflows.
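
Illustrative sketch (not TAMO itself): the abstract's training signal is cumulative hypervolume improvement. For two objectives under maximization, the hypervolume dominated by a set of points relative to a reference point can be computed with a simple sweep, and the improvement from a new design is the difference in hypervolume. The reference point, the maximization convention, and the function names are assumptions.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (both objectives maximized)."""
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: p[0], reverse=True)
    hv, best_y = 0.0, ref[1]
    for x, y in pts:                 # sweep from the largest first objective downwards
        if y > best_y:               # only non-dominated points add a new strip
            hv += (x - ref[0]) * (y - best_y)
            best_y = y
    return hv

def hypervolume_improvement(points, new_point, ref):
    """Gain in hypervolume from adding `new_point` to the evaluated set."""
    return hypervolume_2d(list(points) + [new_point], ref) - hypervolume_2d(points, ref)

front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
hv = hypervolume_2d(front, ref=(0.0, 0.0))
gain = hypervolume_improvement(front, (2.5, 2.5), ref=(0.0, 0.0))
dominated_gain = hypervolume_improvement(front, (1.0, 1.0), ref=(0.0, 0.0))
```

A dominated proposal yields zero improvement, so summing this quantity along a trajectory rewards designs that push the Pareto front outward.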

2511.11412 2026-05-13 cs.CL cs.CY stat.OT

MajinBook: An open catalogue of digitally mediated world literature

Antoine Mazières, Thierry Poibeau

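AI summary: This paper introduces MajinBook, an open catalogue of digital literature intended to facilitate the use of shadow libraries (such as Library Genesis and Z-Library) in computational social science and cultural analytics. By linking the metadata of these crowd-sourced archives with structured bibliographic data from Goodreads, it builds a high-precision corpus of more than 539,000 English-language books, annotated with first publication dates, genres, and popularity information. The work uses natively digital EPUB files to ensure machine readability, addresses biases in traditional corpora, and provides secondary datasets for French, German, and Spanish.

Details
Comments
9 pages, 5 figures, 1 table
English abstract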

This data paper introduces MajinBook, an open catalogue designed to facilitate the use of shadow libraries, such as Library Genesis and Z-Library, for computational social science and cultural analytics. By linking metadata from these vast, crowd-sourced archives with structured bibliographic data from Goodreads, we create a high-precision corpus of over 539,000 references to digitally mediated English-language books. Spanning three centuries and reflecting a contemporary selection bias, these entries are enriched with first publication dates, genres, and popularity metrics like ratings and reviews. Our methodology prioritises natively digital EPUB files to ensure machine-readable quality, while addressing biases in traditional corpora like HathiTrust, and includes secondary datasets for French, German, and Spanish. We evaluate the linkage strategy for accuracy, release all underlying data openly, and discuss the project's legal permissibility under EU and US frameworks for text and data mining in research.

2509.21711 2026-05-13 stat.ML cs.LG

Multi-modal Bayesian Neural Network Surrogates with Conjugate Last-Layer Estimation

Ian Taylor, Juliane Mueller, Julie Bessac

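AI summary: This paper studies how to build efficient surrogate models from multi-modal data to support the modeling and analysis of expensive quantities of interest. The authors propose two multi-modal Bayesian neural network surrogate models based on conjugate posterior distributions and estimate the parameters with variational inference, including in the presence of partially missing observations. Experiments show higher prediction accuracy and better uncertainty quantification than uni-modal models on both scalar and time series data.

Details
Comments
47 pages including references and appendix, 9 figures
English abstract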

As data collection and simulation capabilities advance, multi-modal learning, the task of learning from multiple modalities and sources of data, is becoming an increasingly important area of research. Surrogate models that learn from data of multiple auxiliary modalities to support the modeling of a highly expensive quantity of interest have the potential to aid outer loop applications such as optimization, inverse problems, or sensitivity analyses when multi-modal data are available. We develop two multi-modal Bayesian neural network surrogate models and leverage conditionally conjugate distributions in the last layer to estimate model parameters using stochastic variational inference (SVI). We provide a method to perform this conjugate SVI estimation in the presence of partially missing observations. We demonstrate improved prediction accuracy and uncertainty quantification compared to uni-modal surrogate models for both scalar and time series data.

2509.13548 2026-05-13 cs.SD eess.AS stat.ML

Mixture-of-Experts Framework for Field-of-View Enhanced Signal-Dependent Binauralization of Moving Talkers

Manan Mittal, Thomas Deppisch, Joseph Forrer, Chris Le Sueur, Zamir Ben-Hur, David Lou Alon, Daniel D. E. Wong

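AI summary: This paper proposes a new method based on a mixture-of-experts framework for field-of-view enhanced binaural rendering of moving talkers. The method fuses multiple binaural filters online using implicit localization, enabling real-time tracking and enhancement of continuously moving sound sources and allowing sounds from selected directions to be emphasized or suppressed while preserving natural binaural cues. Unlike traditional methods that rely on direction-of-arrival estimation or operate in the Ambisonics domain, this signal-dependent framework is agnostic to array geometry and suits spatial audio capture and personalized playback in next-generation consumer audio devices.

Details
Comments
5 pages, 3 figures
English abstract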

We propose a novel mixture-of-experts framework for field-of-view enhancement in binaural signal matching. Our approach enables dynamic spatial audio rendering that adapts to continuous talker motion, allowing users to emphasize or suppress sounds from selected directions while preserving natural binaural cues. Unlike traditional methods that rely on explicit direction-of-arrival estimation or operate in the Ambisonics domain, our signal-dependent framework combines multiple binaural filters in an online manner using implicit localization. This allows for real-time tracking and enhancement of moving sound sources, supporting applications such as speech focus, noise reduction, and world-locked audio in augmented and virtual reality. The method is agnostic to array geometry, offering a flexible solution for spatial audio capture and personalized playback in next-generation consumer audio devices.

2509.09162 2026-05-13 stat.CO math.PR

Divide, Interact, Sample: The Two-System Paradigm

James Chok, Myung Won Lee, Daniel Paulin, Geoffrey M. Vasil

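AI summary: This paper proposes a unified "two-system" framework that brings several Monte Carlo methods, including mean-field, ensemble-chain, and adaptive samplers, under a single theory. The approach splits an ensemble of particles into two interacting subsystems that propose updates for each other in a symmetric, alternating fashion, which guarantees the invariant distribution of the finite ensemble. The study also shows that ensemble-chain samplers are finite approximations to mean-field samplers, provides principled guidance for discretizing mean-field Langevin dynamics, and demonstrates applications to adaptive single-chain methods. Experiments show that samplers in this framework outperform conventional methods in effective sample size and computational efficiency, especially on high-dimensional posterior inference tasks.

Details
English abstract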

Mean-field, ensemble-chain, and adaptive samplers have historically been viewed as distinct approaches to Monte Carlo sampling. In this paper, we present a unifying two-system framework that brings all three under one roof. In our approach, an ensemble of particles is split into two interacting subsystems that propose updates for each other in a symmetric, alternating fashion. For the memoryless two-system samplers, this cross-system interaction ensures that the finite ensemble has $ρ^{\otimes 2N}$ as its invariant distribution; for finite-adaptive variants, exact stationarity applies after the adaptation phase is frozen. The two-system construction reveals that ensemble-chain samplers can be interpreted as finite-$N$ approximations to an ideal mean-field sampler; conversely, it provides a principled recipe for discretizing mean-field Langevin dynamics into tractable parallel MCMC algorithms. The framework also connects naturally to adaptive single-chain methods: by replacing particle-based statistics with time-averaged statistics from a single chain, one recovers analogous adaptive dynamics in the long-time limit without requiring a large ensemble. We derive novel two-system versions of both overdamped and underdamped Langevin MCMC samplers within this paradigm. Across synthetic benchmarks and real-world posterior inference tasks, these two-system samplers -- which use a single BCSS-2 integrator step per Metropolis--Hastings accept/reject, in contrast to the long-trajectory style of HMC/NUTS -- exhibit substantial performance gains over No-U-Turn Sampler baselines, achieving higher effective sample sizes per gradient evaluation and markedly higher wall-clock throughput. On higher-dimensional posteriors, the adaptive MAKLA-BCSS-2 methods remain stable and achieve substantially better per-gradient efficiency and wall-clock throughput than the NUTS variants in our benchmark suite.

2509.00931 2026-05-13 stat.ML cs.LG

Semi-Supervised Bayesian GANs with Log-Signatures for Uncertainty-Aware Credit Card Fraud Detection

David Hirnschall

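AI summary: This paper proposes a new deep generative framework based on semi-supervised Bayesian generative adversarial networks (GANs) for credit card fraud detection, framing the problem as time series classification. The method uses conditional GANs for targeted data augmentation, introduces Bayesian inference to quantify predictive uncertainty, and uses log-signatures for robust feature encoding of transaction histories, together with a Wasserstein distance-based loss that aligns generated samples with real unlabeled samples. Experiments show that the method outperforms existing benchmarks on the BankSim dataset, in both statistical and domain-specific metrics, across different proportions of labeled data.

Details
English abstract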

We present a novel deep generative semi-supervised framework for credit card fraud detection, formulated as a time series classification task. As financial transaction data streams grow in scale and complexity, traditional methods often require large labeled datasets and struggle with time series of irregular sampling frequencies and varying sequence lengths. To address these challenges, we extend conditional Generative Adversarial Networks (GANs) for targeted data augmentation, integrate Bayesian inference to obtain predictive distributions and quantify uncertainty, and leverage log-signatures for robust feature encoding of transaction histories. We introduce a novel Wasserstein distance-based loss to align generated and real unlabeled samples while simultaneously maximizing classification accuracy on labeled data. Our approach is evaluated on the BankSim dataset, a widely used simulator for credit card transaction data, under varying proportions of labeled samples, demonstrating consistent improvements over benchmarks in both global statistical and domain-specific metrics. These findings highlight the effectiveness of GAN-driven semi-supervised learning with log-signatures for irregularly sampled time series and emphasize the importance of uncertainty-aware predictions.

2508.20614 2026-05-13 stat.ML cs.LG stat.CO

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

Šimon Kucharský, Aayush Mishra, Daniel Habermann, Stefan T. Radev, Paul-Christian Bürkner

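AI summary: This paper studies how to improve the accuracy of amortized Bayesian model comparison (BMC), particularly when the simulation models are misspecified. The authors supplement simulation-based training of neural surrogates with a self-consistency loss on unlabeled real data, making model comparison more robust under distribution shift. Experiments show that, in the open-world scenario, self-consistency training strongly improves BMC estimates when analytic likelihoods are available or surrogate likelihoods are locally accurate, even for severely misspecified models.

Details
Comments
22 pages, 14 figures. This version extends our initial results presented at Reliable ML from Unreliable Data Workshop at NeurIPS 2025. Previously, this version appeared as arXiv:2512.14308v2, which has now been withdrawn: the two versions share too much content to be considered separate papers
English abstract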

Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the accuracy of neural surrogates deteriorates when simulation models are misspecified, which is precisely the case where model comparison is most needed. We evaluate four different amortized BMC methods. We supplement traditional simulation-based training of these methods with a \emph{self-consistency} (SC) loss on unlabeled real data to improve BMC estimates under distribution shifts. Using one artificial and two real-world case studies, we compare amortized BMC estimators with and without SC against analytic or bridge sampling benchmarks. In the \emph{closed-world} case (data is generated by one of the candidate models), BMC estimators using classifiers work acceptably well even without SC training. However, these methods also benefit the least from SC training. In the \emph{open-world} scenario (all models misspecified), SC training strongly improves BMC estimators when analytic likelihoods are available, or when surrogate likelihoods are locally accurate near the true parameter posterior, even for severely misspecified models. We conclude with practical recommendations for amortized BMC and suggestions for future research.

2508.06138 2026-05-13 stat.ME

Variable selection via knockoffs in missing data settings with categorical predictors

Silvia Bacci, Emanuela Dreassi, Leonardo Grilli, Carla Rampichini

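AI summary: This study extends the knockoffs method for variable selection to large-scale assessment data with many categorical variables and missing values. It proposes a multiple-imputation strategy: missing values are imputed first, and a knockoff filter is then applied to each imputed dataset to select important predictors. The method performs well in a simulation study and in a case study on real educational data, and it suits complex settings with unordered categorical predictors and a multilevel structure.

Details
English abstract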

Large-scale assessment data typically include numerous categorical variables, often affected by missing values. Motivated by the challenges arising in this framework, we extend the knockoffs method for selecting predictors to settings with missing values. Our proposal relies on a preliminary phase consisting of multiple imputations of missing values. Each imputed dataset is then processed using a suitable knockoff filter. We evaluate the performance of the proposed method through a simulation study, showing satisfactory results consistent with a recently advocated cutting-edge method. We apply the method to large-scale assessment data collected by INVALSI about test scores of Italian students in grade 5 with many background variables. This case study is challenging, as most predictors have unordered categories, a setting not taken into account by traditional knockoffs methods. In addition, some of the key predictors are affected by missing values. The model includes random effects to account for the multilevel structure of students nested into schools. Our proposal to implement the knockoffs method within a multiple imputation framework proves to be feasible, flexible and effective.

2506.19988 2026-05-13 stat.ME

Tipping Point Sensitivity Analysis for Missing Data in Time-to-Event Endpoints: Model-Based and Ad hoc Approaches

Ajmal Oodally, Craig Wang, Zheng Li, Tim Morris, Tobias Mütze, Arunava Chakravartty

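AI summary: This study examines how to assess the robustness of treatment effects under a treatment policy estimand when time-to-event endpoints have missing data. It contrasts a model-based approach with two ad hoc approaches to tipping point analysis, which evaluate how departures from the independent censoring assumption affect trial conclusions. Using reconstructed examples based on real clinical trials, it sets out the assumptions of each approach and their implications for interpretation and clinical plausibility.

Details
English abstract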

Treatment policy estimands are frequently favored by regulators, as they assess the effect of treatment assignment regardless of post-randomization events. Despite best efforts, missing data due to study discontinuation cannot be fully avoided and, for time-to-event endpoints, typically manifests as right censoring. Study discontinuation is often more likely following intercurrent events, particularly when it coincides with treatment discontinuation, raising concerns about violations of the independent censoring assumption. Although the independent censoring assumption is routinely adopted for the main analyses, it may be unrealistic in practice and could lead to biased estimation of the treatment effect under the treatment policy estimand. Tipping-point analyses provide a structured framework to assess the robustness of trial conclusions to departures from the independent censoring assumption. This paper describes and contrasts model-based and two ad hoc tipping point approaches, which involve "landmark" or "percentile sampling" based imputation. We illustrate their application using re-constructed examples based on real clinical trials, highlighting their underlying assumptions and implications for interpretation and clinical plausibility assessments of different tipping point approaches.

2506.10664 2026-05-13 stat.ML cs.LG

Sequential Off-Policy Learning with Logarithmic Smoothing

Maxime Haddouche, Otmane Sakhi

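AI summary: This paper studies sequential off-policy learning, in which policies in real systems are repeatedly updated and redeployed and each update learns from all previously collected data. The authors propose a simple algorithm that combines Logarithmic Smoothing (LS) estimation with online PAC-Bayesian tools, and show that a principled adjustment to LS improves performance and accelerates convergence under mild conditions. The algorithm matches state-of-the-art offline methods in the batch setting and substantially outperforms them under sequential updates, as confirmed by experiments.

Details
Comments
AISTATS 2026
English abstract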

Off-policy learning enables training policies from logged interaction data. Most prior work considers the batch setting, where a policy is learned from data generated by a single behavior policy. In real systems, however, policies are updated and redeployed repeatedly, each time training on all previously collected data while generating new interactions for future updates. This sequential off-policy learning setting is common in practice but remains largely unexplored theoretically. In this work, we present and study a simple algorithm for sequential off-policy learning, combining Logarithmic Smoothing (LS) estimation with online PAC-Bayesian tools. We further show that a principled adjustment to LS improves performance and accelerates convergence under mild conditions. The algorithms introduced generalise previous work: they match state-of-the-art offline approaches in the batch case and substantially outperform them when policies are updated sequentially. Empirical evaluations highlight both the benefits of the sequential framework and the strength of the proposed algorithms.
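
To make the LS estimator mentioned in this abstract concrete, here is a minimal sketch. It assumes the form used in the earlier logarithmic-smoothing literature, $\hat{R}^{\lambda} = -\frac{1}{n}\sum_i \frac{1}{\lambda}\log(1 - \lambda w_i c_i)$, with importance weights $w_i = \pi(a_i \mid x_i)/\pi_0(a_i \mid x_i)$ and costs $c_i \in [-1, 0]$. The abstract itself does not state this formula, the sequential algorithm and the adjusted LS variant are not reproduced, and the function and argument names are illustrative.

```python
import math


def ls_estimate(weights, costs, lam):
    """Logarithmic-smoothing style risk estimate (assumed form, see lead-in).

    weights: importance weights pi(a|x) / pi_0(a|x), all >= 0
    costs:   logged costs in [-1, 0] (negative rewards)
    lam:     smoothing parameter >= 0; lam == 0 falls back to plain IPS
    """
    if len(weights) != len(costs) or not weights:
        raise ValueError("weights and costs must be non-empty and equally long")
    n = len(weights)
    if lam == 0:
        # Plain inverse propensity scoring, the lam -> 0 limit of the formula.
        return sum(w * c for w, c in zip(weights, costs)) / n
    # 1 - lam * w * c >= 1 because w >= 0 and c <= 0, so the log is defined.
    total = sum(math.log(1.0 - lam * w * c) for w, c in zip(weights, costs))
    return -total / (lam * n)
```

Because $\log(1 + x) \le x$ for $x \ge 0$, this estimate is never below the IPS estimate and never above zero, so large importance weights are damped rather than clipped.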

2506.02084 2026-05-13 cs.LG stat.ML

Adversarial Causal Tuning for Realistic Time-series Generation

Nikolaos Gkorgkolis, Nikolaos Kougioulis, MingXue Wang, Bora Caglayan, Andrea Tonon, Dario Simionato, Ioannis Tsamardinos

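AI summary: This paper studies how to generate simulated data with the same observational and interventional distributions as a real time-series dataset, with the aim of building a probabilistic causal digital twin. The authors propose Adversarial Causal Tuning (ACT), which draws on ideas from generative adversarial network training and AutoML to search for optimal causal models and discriminators, and uses a permutation test to penalize model complexity. Experiments show that ACT selects the optimal causal model on synthetic datasets while avoiding overfitting, and that reproducing real data distributions remains an open challenge for all state-of-the-art methods.

Details
Comments
22 pages, 3 figures
English abstract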

We address the problem of generating simulated, yet realistic, time-series data from a causal model with the same observational and interventional distributions as a given real dataset (probabilistic causal digital twin). While non-causal models (e.g., GANs) also strive to simulate realistic data, causal models are fundamentally more powerful: they can simulate the effect of interventions (what-if scenarios), optimize decisions, perform root-cause analysis, and carry out counterfactual causal reasoning. We introduce the Adversarial Causal Tuning (ACT) methodology, which outputs the optimal causal model that fits the data, along with a quantification of the goodness-of-fit. The returned causal model can then be employed to simulate new data or to perform other causal reasoning tasks. ACT adopts ideas from Generative Adversarial Network training and AutoML to search for optimal causal pipelines and discriminators that detect deviations between the distributions of real and simulated data. It also adapts a permutation testing procedure from established causal tuning methods to penalize models for complexity. Through extensive experiments on real, semi-synthetic, and synthetic datasets, we show that (a) employing multiple optimized discriminators is paramount for selecting the optimal causal models and quantifying goodness-of-fit, (b) ACT selects the optimal causal model in synthetic datasets while avoiding overfitting, generating data indistinguishable from the true data distribution, and (c) all state-of-the-art generative and causal simulation methods exhibit room for improvement in reproducing real data distributions; generating realistic temporal data is still an open research challenge.

2505.20754 2026-05-13 stat.ML cs.LG

Stationary MMD Points

Zonghao Chen, Toni Karvonen, Heishiro Kanagawa, François-Xavier Briol, Chris. J. Oates

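AI summary: This paper studies the approximation of a target probability distribution by a finite set of points, selected by minimising a maximum mean discrepancy (MMD). Because the MMD objective is non-convex, global minimisers are hard to obtain, so the authors study stationary points of the MMD, which can be computed accurately. They prove that, for integrands in the associated reproducing kernel Hilbert space, the numerical integration error of stationary MMD points vanishes faster than the MMD itself. On this basis they consider MMD gradient flow as a practical way to compute stationary points, with a rigorous convergence analysis and a non-asymptotic error bound.

Details
Journal ref
International Conference on Machine Learning, 2026
English abstract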

Approximation of a target probability distribution using a finite set of points is a problem of fundamental importance in numerical integration. Several authors have proposed to select points by minimising a maximum mean discrepancy (MMD), but the non-convexity of this objective typically precludes global minimisation. Instead, we consider the concept of \emph{stationary points of the MMD} which, in contrast to points globally minimising the MMD, can be accurately computed. Our main contributions are two-fold and theoretical in nature. We first prove the (perhaps surprising) result that, for integrands in the associated reproducing kernel Hilbert space, the numerical integration error of stationary MMD points vanishes \emph{faster} than the MMD. Motivated by this \emph{super-convergence} property, we consider MMD gradient flows as a practical strategy for computing stationary points of the MMD. We then prove that MMD gradient flow can indeed compute stationary MMD points, based on a refined convergence analysis that establishes a novel non-asymptotic finite-particle error bound.
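
As a rough illustration of the idea in this abstract, the sketch below runs plain gradient descent on the squared MMD between a small set of particles and a target sample. It is a time-discretised stand-in for an MMD gradient flow, not the authors' algorithm: the one-dimensional setting, the Gaussian kernel, the sample-based target, the step size, and all function names are assumptions.

```python
import math
import random


def gauss_kernel(x, y, ell=1.0):
    """Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 ell^2))."""
    return math.exp(-((x - y) ** 2) / (2.0 * ell ** 2))


def sample_target(m, seed=0):
    """Draw m standard normal points standing in for the target distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(m)]


def mmd_squared(xs, ys, ell=1.0):
    """Squared MMD between the empirical measures of xs and ys."""
    n, m = len(xs), len(ys)
    kxx = sum(gauss_kernel(a, b, ell) for a in xs for b in xs) / n ** 2
    kxy = sum(gauss_kernel(a, b, ell) for a in xs for b in ys) / (n * m)
    kyy = sum(gauss_kernel(a, b, ell) for a in ys for b in ys) / m ** 2
    return kxx - 2.0 * kxy + kyy


def mmd_gradient(xs, ys, ell=1.0):
    """Gradient of the squared MMD with respect to each particle in xs."""
    n, m = len(xs), len(ys)

    def dk(x, y):
        # Derivative of the kernel in its first argument.
        return -(x - y) / ell ** 2 * gauss_kernel(x, y, ell)

    return [
        2.0 / n ** 2 * sum(dk(x, b) for b in xs)
        - 2.0 / (n * m) * sum(dk(x, y) for y in ys)
        for x in xs
    ]


def mmd_descent(xs, ys, steps=200, step_size=0.1, ell=1.0):
    """Move the particles along the negative MMD^2 gradient (scaled by n)."""
    xs = list(xs)
    n = len(xs)
    for _ in range(steps):
        grad = mmd_gradient(xs, ys, ell)
        xs = [x - step_size * n * g for x, g in zip(xs, grad)]
    return xs
```

A point set at which `mmd_gradient` returns (numerically) zero is a stationary point of this discretised objective; the descent is only guaranteed to reach such a point, not a global minimiser.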

2505.17907 2026-05-13 stat.ML cs.LG

Approximating Simple ReLU Networks based on Spectral Decomposition of Fisher Information

Ka Long Keith Ho, Yoshinari Takeishi, Junichi Takeuchi

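AI summary: This paper studies the Fisher information matrix of two-layer ReLU neural networks with random hidden weights. Its eigenvalue distribution concentrates on a few eigenspaces: the eigenvalues of the first three eigenspaces account for 97.7% of the trace of the Fisher information matrix, independently of the number of parameters. The authors identify the function space corresponding to these major eigenspaces and show that it consists of spherical harmonics of order at most 2, a result closely related to the Mercer decomposition of the neural tangent kernel.

Details
Comments
18 pages, 1 figure, 1 table
English abstract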

We study properties of the Fisher information matrices of two-layer ReLU neural networks with random hidden weights. For these networks, the eigenvalue distribution is known to concentrate approximately on a few eigenspaces. In particular, the eigenvalues for the first three eigenspaces account for 97.7% of the trace of the Fisher information matrix, independently of the number of parameters. In this paper, we identify the function space that corresponds to those major eigenspaces: it consists of the spherical harmonic functions of order at most 2. This result relates to the Mercer decomposition of the neural tangent kernels.

2505.16156 2026-05-13 stat.ML cs.LG

Integral Imprecise Probability Metrics

Siu Lun Chau, Michele Caprio, Krikamol Muandet

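AI summary: This paper proposes the integral imprecise probability metric (IIPM) framework, based on the Choquet integral, for comparing imprecise probability models, extending classical integral probability metrics. The framework applies to many imprecise probability models, including lower probabilities, probability intervals, and belief functions, and can quantify epistemic uncertainty. The theory gives conditions under which IIPM is a valid metric and metrises a form of weak convergence of capacities. Selective classification experiments show strong performance, outperforming established measures when the number of classes is large.

Details
Comments
48 pages
English abstract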

Quantifying differences between probability distributions is fundamental to statistics and machine learning, primarily for comparing statistical uncertainty. In contrast, epistemic uncertainty -- due to incomplete knowledge -- requires richer representations than those offered by classical probability. Imprecise probability (IP) theory offers such models, capturing ambiguity and partial belief. This has driven growing interest in imprecise probabilistic machine learning (IPML), where inference and decision-making rely on broader uncertainty models -- highlighting the need for metrics beyond classical probability. This work introduces the integral imprecise probability metric (IIPM) framework, a Choquet integral-based generalisation of classical integral probability metrics to the setting of capacities -- a broad class of IP models encompassing many existing ones, including lower probabilities, probability intervals, belief functions, and more. Theoretically, we establish conditions under which IIPM serves as a valid metric and metrises a form of weak convergence of capacities. Practically, IIPM not only enables comparison across different IP models but also supports the quantification of epistemic uncertainty (EU) within a single IP model. In particular, by comparing an IP model with its conjugate, IIPM gives rise to a new class of epistemic uncertainty measures -- Maximum Mean Imprecision (MMI) -- which satisfy key axiomatic properties proposed in the uncertainty quantification literature. We validate MMI through selective classification experiments, demonstrating strong empirical performance against established EU measures, and outperforming them when classical methods struggle to scale to a large number of classes. Our work advances both theory and practice in Imprecise Probabilistic Machine Learning, offering a principled framework for comparing and quantifying epistemic uncertainty under imprecision.
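
The building block named in this abstract, the Choquet integral with respect to a capacity, can be sketched on a finite set as follows. The sketch uses the standard discrete formula; the lower-envelope capacity, the conjugate helper, and every name in the code are illustrative assumptions, and neither the IIPM supremum over test functions nor the paper's uncertainty measure is implemented.

```python
def choquet_integral(f, capacity, omega):
    """Discrete Choquet integral of f over the finite set omega.

    f:        dict mapping each element of omega to a real value
    capacity: callable frozenset -> [0, 1], monotone, with value 0 on the
              empty set and 1 on the whole of omega
    """
    order = sorted(omega, key=lambda x: f[x])  # ascending in f
    total, prev = 0.0, 0.0
    for i, x in enumerate(order):
        upper_set = frozenset(order[i:])  # elements with f at least f[x]
        total += (f[x] - prev) * capacity(upper_set)
        prev = f[x]
    return total


def probability(p):
    """Additive capacity induced by a probability mass function p."""
    return lambda subset: sum(p[x] for x in subset)


def lower_envelope(pmfs):
    """Lower probability: setwise minimum over a finite family of pmfs."""
    return lambda subset: min(sum(p[x] for x in subset) for p in pmfs)


def conjugate(capacity, omega):
    """Conjugate capacity: 1 minus the capacity of the complement."""
    full = frozenset(omega)
    return lambda subset: 1.0 - capacity(full - frozenset(subset))
```

For an additive capacity the Choquet integral reduces to the ordinary expectation. For a lower envelope it never exceeds any of the individual expectations, and the gap between the integrals under a capacity and under its conjugate gives one simple view of how much imprecision a single test function can detect.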

2504.10428 2026-05-13 stat.ML cs.DS cs.LG math.ST stat.TH

Smoothed Analysis of Learning from Positive Samples

Jane H. Lee, Anay Mehrotra, Manolis Zampetakis

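AI summary: This paper initiates a smoothed analysis of binary classification from positive samples only, addressing the limited learnability of the classical worst-case setting. Assuming the true distribution is smooth with respect to a reference distribution, the authors prove that all VC classes are learnable in the smoothed model, and give the required number of samples and an efficient algorithm. The results also yield improved algorithms for estimation with unknown truncation, truncation detection, and learning from a list of reference distributions.

Details
Comments
Accepted for presentation at the 58th ACM Symposium on Theory of Computing (STOC), 2026; abstract shortened for arXiv
English abstract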

Binary classification from positive-only samples is a variant of PAC learning where the learner receives i.i.d. positive samples and aims to learn a classifier with low error. Previous work by Natarajan, Gereb-Graus, and Shvaytser characterized learnability and revealed a largely negative picture: almost no interesting classes, including two-dimensional halfspaces, are learnable. This poses a challenge for applications from bioinformatics to ecology, where practitioners rely on heuristics. In this work, we initiate a smoothed analysis of positive-only learning. We assume samples from a reference distribution $D$ such that the true distribution $D^*$ is smooth with respect to it. In stark contrast to the worst-case setting, we show that all VC classes become learnable in the smoothed model, requiring $O(VC/ε^2)$ positive samples for $ε$ classification error. We also give an efficient algorithm for any class admitting $\mathrm{poly}(ε)$-approximation by degree-$k$ polynomials whose range is lower-bounded by a constant with respect to $D$ in L1-norm. It runs in time $\mathrm{poly}(d^k/ε)$, qualitatively matching L1-regression. Our results also imply faster or more general algorithms for: (1) estimation with unknown truncation, giving the first polynomial-time algorithm for estimating exponential-family parameters from samples truncated to an unknown set approximable by non-negative polynomials in L1 norm, improving on [KTZ FOCS19; LMZ FOCS24], who required strong L2-approximation; (2) truncation detection for broad classes, including non-product distributions, improving on [DLNS STOC24], who required product distributions; and (3) learning from a list of reference distributions, where samples come from $O(1)$ distributions, one of which witnesses smoothness of $D^*$, as arises when list-decoding algorithms learn samplers for $D^*$ from corrupted data.

2503.21576 2026-05-13 math.PR cs.LO math.CT math.ST stat.TH

Empirical Measures and Strong Laws of Large Numbers in Categorical Probability

Tobias Fritz, Tomáš Gonda, Antonio Lorenzin, Paolo Perrone, Areeb Shah Mohammed

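AI summary: This paper studies limits of empirical measures in categorical probability and generalizes the strong law of large numbers to that setting. The authors propose two axioms, permutation invariance and empirical adequacy, characterizing morphisms that produce a sample from the empirical measure of an infinite sequence, and note that such empirical sampling morphisms live in quasi-Markov categories. Using these morphisms and a few additional properties, the paper proves abstract versions of the de Finetti theorem, the Glivenko-Cantelli theorem, and the strong law of large numbers, and gives concrete constructions on standard Borel spaces, yielding a joint proof of these theorems.

Details
Journal ref
Logical Methods in Computer Science, Volume 22, Issue 2 (May 5, 2026) lmcs:15844
English abstract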

The Glivenko--Cantelli theorem is a uniform version of the strong law of large numbers. It states that for every IID sequence of random variables, the empirical measure converges to the underlying distribution (in the sense of uniform convergence of the CDF). In this work, we provide tools to study such limits of empirical measures in categorical probability. We propose two axioms, namely permutation invariance and empirical adequacy, that a morphism of type $X^{\mathbb{N}} \to X$ should satisfy to be interpretable as taking an infinite sequence as input and producing a sample from its empirical measure as output. Since not all sequences have a well-defined empirical measure, such \emph{empirical sampling morphisms} live in quasi-Markov categories, which, unlike Markov categories, allow for partial morphisms. Given an empirical sampling morphism and a few other properties, we prove representability as well as abstract versions of the de Finetti theorem, the Glivenko--Cantelli theorem and the strong law of large numbers. We provide several concrete constructions of empirical sampling morphisms as partially defined Markov kernels on standard Borel spaces. Instantiating our abstract results then recovers the standard Glivenko--Cantelli theorem and the strong law of large numbers for random variables with finite first moment. Our work thus provides a joint proof of these two theorems in conjunction with the de Finetti theorem from first principles.

2503.17606 2026-05-13 stat.ME stat.AP

Combining longitudinal cohort studies to examine cardiovascular risk factor trajectories across the adult lifespan

Zeynab Aghabazaz, Michael J Daniels, Hongyan Ning, Donald M. Lloyd-Jones, Juned Siddique

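AI summary: This study proposes a statistical framework for combining data from multiple large longitudinal cardiovascular cohorts to study long-term cardiovascular health starting in early adulthood. A Bayesian hierarchical multivariate model jointly models multiple risk factors over time and across cohorts, increasing the precision of trajectory estimates and filling in unobserved risk factors. The analysis reveals substantial variation in risk factor trajectories across life stages, subgroups, and cohorts, highlighting key periods for cardiovascular prevention and monitoring.

Details
English abstract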

We introduce a statistical framework for combining data from multiple large longitudinal cardiovascular cohorts to enable the study of long-term cardiovascular health starting in early adulthood. Using data from seven cohorts belonging to the Lifetime Risk Pooling Project (LRPP), we present a Bayesian hierarchical multivariate approach that jointly models multiple longitudinal risk factors over time and across cohorts. Because few cohorts in our project cover the entire adult lifespan, our strategy uses information from all risk factors to increase precision for each risk factor trajectory and borrows information across cohorts to fill in unobserved risk factors. We develop novel diagnostic testing and model validation methods to ensure that our model robustly captures and maintains critical relationships over time and across risk factors. Our modeling reveals substantial age-related variation in risk factor trajectories, with patterns that differ across life stages, subgroups, and cohorts, thereby highlighting key periods for cardiovascular prevention and monitoring. Keywords: Bayesian hierarchical models; Missing data; Model validation; Multiple imputation; Random effects.

2503.15821 2026-05-13 stat.AP stat.OT

Temporal Point Process Modeling of Aggressive Behavior Onset in Psychiatric Inpatient Youths with Autism

Michael Potter, Michael Everett, Ashutosh Singh, Georgios Stratis, Yuna Watanabe, Ahmet Demirkaya, Deniz Erdogmus, Tales Imbiriba, Matthew S. Goodwin

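AI summary: This study aims to predict the timing of aggressive behavior in psychiatric inpatient youth with autism. Given their communication difficulties and poor emotional insight, it models aggression onsets with temporal point processes, in particular self-exciting Hawkes processes. Compared with traditional Poisson models, this approach captures the irregular, clustered nature of aggression onsets more accurately and offers interpretable analysis of onset dynamics, supporting clinical decision-making and preemptive intervention.

Details
Comments
Accepted to Nature Scientific Reports. Updated results on Hawkes Process with Power Law intensity, and made stricter conditions for sampling evaluation points in the Mean Absolute Percent Error and ROC-AUC calculations. Small notation discrepancies fixed
English abstract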

Aggressive behavior, including aggression towards others and self-injury, occurs in up to 80% of children and adolescents with autism, making it a leading cause of behavioral health referrals and a major driver of healthcare costs. Predicting when autistic youth will exhibit aggression can be challenging due to their communication difficulties. Many are minimally verbal or have poor emotional insight. Recent advances in Machine Learning and wearable biosensing demonstrate the ability to predict aggression within a limited future window (typically one to three minutes) in autistic individuals. However, existing work does not estimate aggression onset probability or the expected number of aggression onsets over longer periods, nor does it provide interpretable insights into onset dynamics. To address these limitations, we apply Temporal Point Processes (TPPs), particularly self-exciting Hawkes processes, to model the timing of aggressive behavior onsets in psychiatric inpatient autistic youth. We benchmark several TPP models by evaluating their goodness-of-fit and predictive metrics. Our results demonstrate that self-exciting TPPs capture the irregular and clustered nature of aggression onsets more accurately, especially compared to traditional Poisson models. These incipient findings suggest that TPPs can provide interpretable, probabilistic forecasts of aggression onset along a time continuum, supporting future clinical decision-making and preemptive intervention.
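
To show what "self-exciting" means in this abstract, here is a minimal sketch of a Hawkes process with an exponential kernel, $\lambda(t) = \mu + \sum_{t_i < t} \alpha e^{-\beta (t - t_i)}$, simulated by Ogata's thinning method. The exponential kernel, the parameter values in any usage, and the function names are assumptions; the paper's fitted models (including the power-law intensity mentioned in its comments) are not reproduced.

```python
import math
import random


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given the past event times."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) with Ogata's thinning method.

    Requires alpha / beta < 1 for a stationary (non-explosive) process.
    """
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        # The kernel only decays, so the intensity just after t bounds the
        # intensity until the next accepted event.
        lam_bar = mu + sum(
            alpha * math.exp(-beta * (t - s)) for s in events if s <= t
        )
        t += rng.expovariate(lam_bar)  # candidate from the dominating process
        if t >= horizon:
            return events
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)
```

Each accepted event raises the intensity by `alpha`, which then decays at rate `beta`, so events tend to arrive in clusters. Setting `alpha = 0` recovers a homogeneous Poisson process with rate `mu`, the kind of baseline the abstract compares against.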

2502.01577 2026-05-13 stat.CO

plmmr: an R package to fit penalized linear mixed models for genome-wide association data with complex correlation structure

Tabitha K. Peter, Anna C. Reisetter, Yujing Lu, Oscar A. Rysavy, Patrick J. Breheny

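AI summary: This paper introduces plmmr, an open-source R package for fitting penalized linear mixed models to high-dimensional genome-wide association data with complex correlation structure. The package estimates correlation among observations and uses the best linear unbiased predictor to improve prediction, and it uses memory-mapping so that genome-scale data exceeding RAM can be analyzed on ordinary machines. The paper presents the methods, workflow, and file-backing approach behind plmmr and demonstrates its computational capabilities on real GWAS data.

Details
Journal ref
Briefings in Bioinformatics (2026), 27: bbaf672
Comments
23 pages, 5 figures; https://github.com/pbreheny/plmmr
English abstract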

Correlation among the observations in high-dimensional regression modeling can be a major source of confounding. We present a new open-source package, plmmr, to implement penalized linear mixed models in R. This R package estimates correlation among observations in high-dimensional data and uses those estimates to improve prediction with the best linear unbiased predictor. The package uses memory-mapping so that genome-scale data can be analyzed on ordinary machines even if the size of data exceeds RAM. We present here the methods, workflow, and file-backing approach upon which plmmr is built, and we demonstrate its computational capabilities with two examples from real GWAS data.

2501.16931 2026-05-13 cs.LG stat.AP

Beyond Point Estimates: Distributional Uncertainty in Machine Learning Performance Evaluation

Christoph Lehmann, Yahor Paromau

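AI summary: This paper proposes a distributional approach to evaluating machine learning models, treating performance metrics as random quantities rather than fixed values so as to reflect the uncertainty of the training process more fully. It analyzes the empirical distributions of performance metrics using quantiles and confidence intervals, for both point and interval estimation, with particular attention to the feasibility of statistical inference at small sample sizes. Compared with mean-based evaluation, the approach characterizes the variability and uncertainty of model performance in more detail, suits applications where reliability matters, and is easy to implement and broadly applicable.

Details
Comments
21 pages, 9 figures
English abstract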

Machine learning models are often evaluated using point estimates of performance metrics such as accuracy, F1 score, or mean squared error. Such summaries fail to capture the inherent variability induced by stochastic elements of the training process, including data splitting, initialization, and hyperparameter optimization. This work proposes a distributional perspective on model evaluation by treating performance metrics as random quantities rather than fixed values. Instead of focusing solely on aggregate measures, empirical distributions of performance metrics are analyzed using quantiles and corresponding confidence intervals. The study investigates point and interval estimation of quantiles based on real-data use cases for classification and regression tasks, complemented by simulation studies for validation. Special emphasis is placed on small sample sizes, reflecting practical constraints in machine learning, where repeated training is computationally expensive. The results show that meaningful statistical inference on the underlying performance distribution is feasible even with sample sizes in the range of 10-25, while standard nonparametric confidence intervals remain applicable under these conditions. The proposed approach provides a more detailed characterization of variability and uncertainty compared to mean-based evaluation and enables a more differentiated comparison of models. In particular, it supports a risk-oriented interpretation of model performance, which is relevant in applications where reliability is critical. The presented methods are easy to implement and broadly applicable, making them a practical extension to standard performance evaluation procedures in machine learning.
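
One standard nonparametric interval of the kind this abstract refers to is the distribution-free confidence interval for a quantile built from order statistics. The sketch below is a textbook construction, not necessarily the exact procedure used in the paper; it assumes a continuous metric distribution and independent training runs, and the function names are illustrative.

```python
from math import comb


def binom_pmf(n, k, p):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def quantile_ci_ranks(n, p, alpha=0.05):
    """Ranks (lo, hi) of the order statistics bracketing the p-quantile.

    For a continuous distribution, the interval [x_(lo), x_(hi)) covers the
    p-quantile with probability sum_{k=lo}^{hi-1} Binomial(n, p).pmf(k).
    Returns the shortest rank interval with coverage >= 1 - alpha together
    with its exact coverage, or None if no pair of ranks achieves it.
    """
    best = None
    for lo in range(1, n):
        for hi in range(lo + 1, n + 1):
            cover = sum(binom_pmf(n, k, p) for k in range(lo, hi))
            if cover >= 1 - alpha:
                cand = (hi - lo, -cover, lo, hi)  # shortest, then best covered
                if best is None or cand < best:
                    best = cand
    if best is None:
        return None
    return best[2], best[3], -best[1]


def quantile_ci(sample, p, alpha=0.05):
    """Confidence interval for the p-quantile of the sampled metric."""
    xs = sorted(sample)
    ranks = quantile_ci_ranks(len(xs), p, alpha)
    if ranks is None:
        return None
    lo, hi, cover = ranks
    return xs[lo - 1], xs[hi - 1], cover
```

With ten runs, a 95% interval for the median is spanned by the 2nd and 9th smallest values (exact coverage 1002/1024, about 97.9%). With only five runs no pair of order statistics reaches 95%, which illustrates why the small-sample regime needs care.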

2412.18594 2026-05-13 cs.LG stat.ML

Local and Mixing-Based Algorithms for Gaussian Graphical Model Selection from Glauber Dynamics

Vignesh Tirukkonda, Anirudh Rayas, Gautam Dasarathy

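AI summary: This paper studies structure learning for Gaussian graphical models when the data are dependent samples from Gaussian Glauber dynamics. The authors propose two complementary approaches. The first is a local edge-testing estimator based on a correlation test, which does not require waiting for the chain to mix and can be run in parallel. The second applies under a Dobrushin contraction condition: the Gaussian Gibbs trajectory is subsampled so that it is close in total variation to an i.i.d. sample, which lets standard i.i.d. graphical model learners be used directly. The paper also gives finite-sample recovery guarantees and information-theoretic lower bounds on the observation time.

Details
Comments
Major revision. Corrects the earlier local ratio-estimator analysis by replacing it with a local product estimator; adds a burn-in/thinning estimator based on total-variation decoupling for Gaussian Gibbs samplers; strengthens the lower bounds; adds experiments; and compares with the related ICML 2026 work of Shen, Wu, Majid, and Moitra
English abstract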

Gaussian graphical model selection is usually studied under independent sampling, but in many applications observations arise from dependent dynamics. We study structure learning when the data consist of a single trajectory of Gaussian Glauber dynamics. We develop two complementary approaches. The first is a local edge-testing estimator based on an appropriately designed correlation test that reveals edges. This estimator does not require waiting for the chain to mix and admits an embarrassingly parallel edgewise implementation. The second is a burn-in/thinning reduction: under a Dobrushin contraction condition, we prove that a suitably subsampled Gaussian Gibbs trajectory is close in total variation to an i.i.d. product sample, allowing standard i.i.d. Gaussian graphical model learners to be used as black boxes. The key technical ingredient, which may be of independent interest, is a high-dimensional total-variation bound for random-scan Gaussian Gibbs samplers, obtained by combining Wasserstein contraction with an approximate Lipschitz smoothing argument. We prove finite-sample recovery guarantees for both approaches, establish information-theoretic lower bounds on the observation time, and empirically compare the resulting sample-computation tradeoffs.
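
The data-generating process in this abstract can be sketched as a random-scan Gibbs sampler for a zero-mean Gaussian with precision matrix $\Theta$: at each step one coordinate $i$ is chosen uniformly and resampled from its conditional law $N\big(-\sum_{j \ne i} \Theta_{ij} x_j / \Theta_{ii},\ 1/\Theta_{ii}\big)$. The discrete-time formulation, the zero initial state, and the names below are assumptions; the paper's estimators are not implemented.

```python
import random


def gaussian_glauber(theta, steps, seed=0):
    """Random-scan Gibbs trajectory for N(0, theta^{-1}).

    theta: symmetric positive definite precision matrix (list of lists);
           theta[i][j] != 0 exactly when i and j share an edge in the graph
    Returns a list of (updated_index, state_copy) pairs, one per step.
    """
    rng = random.Random(seed)
    d = len(theta)
    x = [0.0] * d
    trajectory = []
    for _ in range(steps):
        i = rng.randrange(d)  # coordinate chosen uniformly at random
        # Conditional mean depends only on the graph neighbours of i.
        mean = -sum(theta[i][j] * x[j] for j in range(d) if j != i) / theta[i][i]
        x[i] = rng.gauss(mean, theta[i][i] ** -0.5)
        trajectory.append((i, list(x)))
    return trajectory
```

Consecutive states differ in at most one coordinate, so the samples are strongly dependent. That dependence is why a learner must either test edges locally along the trajectory or thin it heavily before treating it as approximately i.i.d., the two routes the abstract describes.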

2412.11875 2026-05-13 stat.ML cs.LG

Bayesian Surrogate Training on Multiple Data Sources: A Hybrid Modeling Strategy

Philipp Reiser, Paul-Christian Bürkner, Anneli Guthke

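AI summary: This paper proposes a hybrid modeling strategy that combines simulation data and real-world measurement data to improve surrogate model training. Two probabilistic approaches integrate information from both data sources during training: one trains a separate surrogate for each data source and combines their predictive distributions, and the other trains a single surrogate on both sources. Both use a novel weighting strategy for heterogeneous data, which can improve predictive accuracy and coverage and help diagnose problems in the simulation model.

Details
English abstract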

Surrogate models are often used as computationally efficient approximations to complex simulation models, enabling tasks such as solving inverse problems, sensitivity analysis, and probabilistic forward predictions, which would otherwise be computationally infeasible. During training, surrogate parameters are fitted such that the surrogate reproduces the simulation model's outputs as closely as possible. However, the simulation model itself is merely a simplification of the real-world system, often missing relevant processes or suffering from misspecifications, e.g., in inputs or boundary conditions. Hints about these might be captured in real-world measurement data, and yet we typically ignore those hints during surrogate building. In this paper, we propose two novel probabilistic approaches to integrate simulation data and real-world measurement data during surrogate training. The first method trains separate surrogate models for each data source and combines their predictive distributions, while the second incorporates both data sources by training a single surrogate. Both hybrid modeling approaches employ a novel weighting strategy for combining heterogeneous data sources during surrogate training, which operates independently of the chosen surrogate family. We show the conceptual differences and benefits of the two approaches through both synthetic and real-world case studies. The results demonstrate the potential of these methods to improve predictive accuracy and predictive coverage, and to diagnose problems in the underlying simulation model. These insights can improve system understanding and future model development.

2406.12017 2026-05-13 stat.ML cs.LG stat.CO

Sparsity-Constraint Optimization via Splicing Iteration

Jin Zhu, Junxian Zhu, Zezhi Wang, Borui Tang, Hongmei Lin, Xueqin Wang

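AI summary: This paper proposes SCOPE, a new algorithm for sparsity-constrained optimization problems arising in signal processing, statistics, and machine learning. It replaces the usual gradient step with a splicing iteration, so no continuous hyperparameter needs tuning and convergence comes naturally. The theory shows that SCOPE converges linearly and recovers the true support set when the sparsity level is correctly specified, with parallel results that do not rely on restricted-isometry-property-type conditions. Experiments show superior performance on sparse quadratic optimization, sparse classifier learning, and sparse Markov network recovery.

Details
Comments
35 pages
English abstract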

Sparsity-constrained optimization underlies many problems in signal processing, statistics, and machine learning. State-of-the-art hard-thresholding (HT) algorithms rely on an appropriately selected continuous step-size parameter to ensure convergence. In this paper, we propose a naturally convergent iterative algorithm, SCOPE (Sparsity-Constrained Optimization via sPlicing itEration). The algorithm is capable of optimizing nonlinear differentiable objective functions that are strongly convex and smooth on low-dimensional subspaces. SCOPE replaces the gradient step with a splicing operation guided directly by the objective value, thereby eliminating the need to tune any continuous hyperparameter. Theoretically, it achieves a linear convergence rate and recovers the true support set when the sparsity level is correctly specified. We also establish parallel theoretical results without relying on restricted-isometry-property-type conditions. We demonstrate SCOPE's versatility and power by solving sparse quadratic optimization problems, learning sparse classifiers, and recovering sparse Markov networks for binary variables. With our C++ implementation of SCOPE, numerical experiments on these tasks show that it achieves superior support recovery performance, confirming both its algorithmic efficiency and theoretical guarantees.

2110.01729 2026-05-13 stat.ML cs.LG

Stochastic tensor space feature theory with applications to robust machine learning

Julio Enrique Castrillon-Candas, Kaili Shi, Dingning Liu, Sicheng Yang, Xiaoling Zhang, Mark Kon, the Alzheimer's Disease Neuroimaging Initiative

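AI summary: This paper proposes a Multilevel Orthogonal Subspace (MOS) Karhunen-Loeve feature theory based on stochastic tensor spaces for constructing robust machine learning features. Training data are treated as instances of a random field in a Bochner space, and the Karhunen-Loeve expansion together with a hierarchical expansion is used to construct multilevel orthogonal subspaces that detect anomalous signal components, yielding more discriminative features for classification. Experiments show that on an Alzheimer's disease blood plasma dataset the method's classification accuracy is markedly higher than that of mainstream machine learning methods such as gradient boosting and random forests.

Details
English abstract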

In this paper we develop a Multilevel Orthogonal Subspace (MOS) Karhunen-Loeve feature theory based on stochastic tensor spaces, for the construction of robust machine learning features. Training data are treated as instances of a random field within a relevant Bochner space. Our key observation is that separate machine learning classes can reside predominantly in mostly distinct subspaces. Using the Karhunen-Loeve expansion and a hierarchical expansion of the first (nominal) class, a MOS is constructed to detect anomalous signal components, treating the second class as an outlier of the first. The projection coefficients of the input data into these subspaces are then used to train a Machine Learning (ML) classifier. These coefficients become new features from which much clearer separation surfaces can arise for the underlying classes. Tests on the blood plasma dataset (Alzheimer's Disease Neuroimaging Initiative) show dramatic increases in accuracy, in contrast to popular ML methods such as Gradient Boosting, RUS Boost, Random Forest and Neural Networks. We show that with a non-invasive blood test, high-accuracy results can be obtained for predicting AD stages such as cognitively normal, mild cognitive impairment and dementia.

1905.10432 2026-05-13 stat.ME

Cross validation approaches for penalized Cox regression

Biyue Dai, Patrick Breheny

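AI summary: This paper studies cross-validation methods for selecting the tuning parameter in penalized Cox regression models. Because of the partial likelihood structure of the Cox model, implementing cross-validation is not straightforward; the authors propose two new cross-validation methods and compare them with existing ones. They find that cross-validating the linear predictors does well in terms of both performance and numerical stability, and they verify its advantages using simulated data and an analysis of high-dimensional survival data from lung cancer patients.

Details
Journal ref
Statistical Methods in Medical Research (2024), 33: 702-715
Comments
13 pages, 6 figures
English abstract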

Cross validation is commonly used for selecting tuning parameters in penalized regression, but its use in penalized Cox regression models has received relatively little attention in the literature. Due to its partial likelihood construction, carrying out cross validation for Cox models is not straightforward, and there are several potential approaches for implementation. Here, we propose two new cross-validation methods for Cox regression and compare them to approaches that have been proposed elsewhere. Our proposed approach of cross-validating the linear predictors seems to offer an attractive balance of performance and numerical stability. We illustrate these advantages using simulated data and an analysis of data from a high-dimensional study of survival in lung cancer patients.

1809.05497 2026-05-13 stat.ME

Feature-specific inference for penalized regression using local false discovery rates

Ryan Miller, Patrick Breheny

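AI summary: This paper studies how to carry out feature-specific inference for each variable in regularized regression (such as the lasso) in order to assess the reliability of its selection. The authors propose a method based on the local false discovery rate that can compute a local false discovery rate for each variable at different values of the regularization parameter λ, making it possible to judge the significance of an individual feature or to estimate the model's overall false discovery rate. The method applies to a variety of regularized models, such as generalized linear models and Cox models, and its validity and practical utility are verified on two case studies with high-dimensional genetic data.

Details
Journal ref
Statistics in Medicine (2023), 42: 1412-1429
Comments
22 pages, 5 figures
English abstract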

Penalized regression methods, most notably the lasso, are a popular approach to analyzing high-dimensional data. An attractive property of the lasso is that it naturally performs variable selection. An important area of concern, however, is the reliability of these variable selections. Motivated by local false discovery rate methodology from the large-scale hypothesis testing literature, we propose a method for calculating a local false discovery rate for each variable under consideration by the lasso model. These rates can be used to assess the reliability of an individual feature, or to estimate the model's overall false discovery rate. The method can be used for all values of $λ$. This is particularly useful for models with a few highly significant features but a high overall Fdr, which are a relatively common occurrence when using cross validation to select $λ$. It is also flexible enough to be applied to many varieties of penalized likelihoods including GLM and Cox models, and a variety of penalties, including MCP and SCAD. We demonstrate the validity of this approach and contrast it with other inferential methods for penalized regression as well as with local false discovery rates for univariate hypothesis tests. Finally, we show the practical utility of our method by applying it to two case studies involving high dimensional genetic data.

1704.08742 2026-05-13 stat.ML stat.CO

Hybrid safe-strong rules for efficient optimization in lasso-type problems

Yaohui Zeng, Tianbao Yang, Patrick Breheny

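AI summary: This paper addresses lasso-type problems on high-dimensional data by proposing hybrid safe-strong rules (HSSR), which combine safe screening rules with the sequential strong rule to discard irrelevant features efficiently and speed up optimization. The study designs two concrete methods, SSR-Dome and SSR-BEDPP, and extends the approach to the elastic net and group lasso problems to demonstrate its generality. Experiments show that the hybrid rules are substantially more computationally efficient than existing state-of-the-art methods.

Details
Journal ref
Computational Statistics and Data Analysis (2021), 153: 107063
Comments
31 pages, 4 figures
English abstract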

The lasso model has been widely used for model selection in data mining, machine learning, and high-dimensional statistical analysis. However, with the ultrahigh-dimensional, large-scale data sets now collected in many real-world applications, it is important to develop algorithms to solve the lasso that efficiently scale up to problems of this size. Discarding features from certain steps of the algorithm is a powerful technique for increasing efficiency and addressing the Big Data challenge. In this paper, we propose a family of hybrid safe-strong rules (HSSR) which incorporate safe screening rules into the sequential strong rule (SSR) to remove unnecessary computational burden. In particular, we present two instances of HSSR, namely SSR-Dome and SSR-BEDPP, for the standard lasso problem. We further extend SSR-BEDPP to the elastic net and group lasso problems to demonstrate the generalizability of the hybrid screening idea. Extensive numerical experiments with synthetic and real data sets are conducted for both the standard lasso and the group lasso problems. Results show that our proposed hybrid rules can substantially outperform existing state-of-the-art rules.
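
To make the screening idea concrete, here is a minimal Python sketch of the basic sequential strong rule (SSR) that the hybrid rules build on. It is not the authors' HSSR, SSR-Dome, or SSR-BEDPP code. The function name `sequential_strong_rule`, the toy data, and the objective scaling $(1/2n)\|y - X\beta\|^2 + \lambda\|\beta\|_1$ are assumptions made for illustration.

```python
import numpy as np

def sequential_strong_rule(X, residual, lam_prev, lam_next):
    """Boolean mask of features that survive the sequential strong rule.

    Assumes the lasso objective (1/(2n)) * ||y - X b||^2 + lam * ||b||_1,
    with `residual` = y - X b evaluated at the solution for lam_prev.
    """
    n = X.shape[0]
    score = np.abs(X.T @ residual) / n
    # Feature j is discarded when score_j < 2 * lam_next - lam_prev.
    return score >= 2.0 * lam_next - lam_prev

# Toy data (hypothetical): only the first feature carries signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = 2.0 * X[:, 0] + 0.1 * rng.standard_normal(50)

# At lam_max the lasso solution is all zeros, so the residual equals y.
lam_max = np.max(np.abs(X.T @ y)) / X.shape[0]
keep = sequential_strong_rule(X, y, lam_prev=lam_max, lam_next=0.9 * lam_max)
print(keep.sum(), "of", keep.size, "features kept")
```

The strong rule is a heuristic and can occasionally discard a feature that is active at the solution, so solvers usually re-check the KKT conditions on the discarded features afterwards. Safe rules never discard an active feature, which is the motivation for combining the two kinds of rule.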

1701.05936 2026-05-13 stat.CO stat.ML

The biglasso Package: A Memory- and Computation-Efficient Solver for Lasso Model Fitting with Big Data in R

Yaohui Zeng, Patrick Breheny

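AI summary: This study presents biglasso, an R package designed to fit lasso models efficiently on very large data sets. The package uses memory-mapped files to perform out-of-core computation, loading data into memory only when necessary, and thereby overcomes the memory limitations of traditional R packages. In addition, biglasso introduces more efficient feature screening rules that substantially speed up computation, and in benchmarks it is more memory- and computation-efficient than existing tools such as glmnet.

Details
Journal ref
R Journal (2021), 12: 6-19
Comments
20 pages, 6 figures
English abstract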

Penalized regression models such as the lasso have been extensively applied to analyzing high-dimensional data sets. However, due to memory limitations, existing R packages like glmnet and ncvreg are not capable of fitting lasso-type models for ultrahigh-dimensional, multi-gigabyte data sets that are increasingly seen in many areas such as genetics, genomics, biomedical imaging, and high-frequency finance. In this research, we implement an R package called biglasso that tackles this challenge. biglasso utilizes memory-mapped files to store the massive data on the disk, only reading data into memory when necessary during model fitting, and is thus able to handle out-of-core computation seamlessly. Moreover, it's equipped with newly proposed, more efficient feature screening rules, which substantially accelerate the computation. Benchmarking experiments show that our biglasso package, as compared to existing popular ones like glmnet, is much more memory- and computation-efficient. We further analyze a 31 GB real data set on a laptop with only 16 GB RAM to demonstrate the out-of-core computation capability of biglasso in analyzing massive data sets that cannot be accommodated by existing R packages.

1607.05636 2026-05-13 math.ST stat.TH

Marginal false discovery rates for penalized regression models

Patrick Breheny

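AI summary: This paper studies how to assess the reliability of selected features when penalized regression models are applied to high-dimensional data. The author proposes measuring selection reliability by analyzing the marginal false discovery rate (mFDR) and provides theoretical analysis and simulation studies showing that the approach is quite accurate when the correlation among predictors is mild. The practical value of the method and its advantages over other approaches are illustrated using cancer genome and myocardial genomics data.

Details
Journal ref
Biostatistics (2019), 20: 299-314
Comments
15 pages, 7 figures
English abstract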

Penalized regression methods are an attractive tool for high-dimensional data analysis, but their widespread adoption has been hampered by the difficulty of applying inferential tools. In particular, the question "How reliable is the selection of those features?" has proved difficult to address. In part, this difficulty arises from defining false discoveries in the classical, fully conditional sense, which is possible in low dimensions but does not scale well to high-dimensional settings. Here, we consider the analysis of marginal false discovery rates for penalized regression methods. Restricting attention to the marginal FDR permits straightforward estimation of the number of selections that would likely have occurred by chance alone, and therefore provides a useful summary of selection reliability. Theoretical analysis and simulation studies demonstrate that this approach is quite accurate when the correlation among predictors is mild, and only slightly conservative when the correlation is stronger. Finally, the practical utility of the proposed method and its considerable advantages over other approaches are illustrated using gene expression data from The Cancer Genome Atlas and GWAS data from the Myocardial Applied Genomics Network.

2605.11806 2026-05-13 math.ST stat.TH

Adaptive Kernel Ridge Regression with Linear Structure: Sharp Oracle Inequalities and Minimax Optimality

Xin Bing, Chao Wang

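AI summary: This paper studies how to handle linear and nonlinear structure in the signal more effectively in kernel ridge regression (KRR). The authors propose a modified regression procedure that explicitly adds a linear component to standard KRR, improving prediction performance without increasing computational complexity or tuning requirements. Theoretical analysis shows that the method adaptively captures both linear and nonlinear structure and attains minimax optimal prediction risk, and extensive simulations support its effectiveness.

Details
English abstract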

Kernel ridge regression (KRR) is a widely used nonparametric method due to its strong theoretical guarantees and computational convenience. However, standard KRR does not distinguish between linear and nonlinear components in the signal, instead applying a single functional regularization to the entire function. This may lead to unnecessary shrinkage of linear structure and consequently suboptimal prediction performance. In this paper, we propose a modified regression procedure that augments KRR with an explicit linear component. The proposed method has the same computational complexity as standard KRR and introduces no additional tuning parameters. Theoretically, we establish a sharp oracle inequality for the proposed estimator and show that it adaptively captures both linear and nonlinear structure, achieving minimax optimal prediction risk under general kernels. Compared with standard KRR, the proposed method improves both the bias and approximation error at the expense of only an additional parametric variance term, which is negligible in low- and moderate-dimensional settings. In high-dimensional regimes, incorporating ridge regularization for the linear component yields a procedure that performs uniformly no worse than KRR. Extensive simulation studies support the theoretical findings.

2605.11776 2026-05-13 stat.ME

Probability of Root Cause: A Counterfactual Definition and Its Identification

Zitong Lu, Zhi Geng, Wei Li, Min Xie

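AI summary: This paper gives a formal definition of a root cause and, drawing on the potential outcomes framework and ideas from mediation analysis, defines the probability of root cause (PRC), which quantifies how likely a set of variables is to be the root cause of a given outcome. Under standard assumptions, it establishes the identifiability of the PRC and derives an identification formula, providing a new theoretical tool and computational method for root cause attribution in causal inference.

Details
English abstract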

Attributing an observed outcome to its root cause is a central task in domains ranging from medical diagnosis to engineering fault diagnosis. Existing approaches either equate the root cause with a root node of the causal graph, as in causal-discovery-based root cause analysis, or target causes more broadly and thereby favour proximate ones, as with the probability of causation and posterior causal effects. We argue that this issue stems from the absence of a formal definition of a root cause, which has led to methods designed for other purposes being applied to root cause attribution by default. We address this by giving a formal, individual-level definition of a root cause within the potential outcomes framework, based on the notion of an individual cause and a counterfactual root condition motivated by mediation analysis. Building on this definition, we propose the probability of root cause (PRC), which quantifies how probable it is that a candidate variable set is the root cause of a given outcome, conditional on observed evidence. Under standard assumptions, we establish the identifiability of the PRC and derive an explicit identification formula. Two numerical examples illustrate the approach.

2605.11768 2026-05-13 stat.ME

Using NonTargeted HPV Infections in Studies with Risk Compensation

Lola Etievant

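AI summary: This study examines the confounding bias and risk compensation that can arise when nontargeted HPV infections are used to evaluate vaccine effects in observational studies. The author shows that, when unmeasured sexual behavior acts as both a confounder and a mediator, using nontargeted HPV infections can remove confounding bias and isolate the direct immunological effect of the vaccine on targeted HPV strains. However, this quantity may overstate the protection women actually experience because it excludes the effect of risk compensation, and so it may be misleading from a public health perspective.

Details
English abstract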

Studies of HPV vaccine efficacy usually record infections with vaccine targeted and nontargeted strains. Contrary to blinded randomized controlled trials, confounding bias can be a threat and risk compensation may occur in observational studies. Etievant et al. (Biometrics, 2023) proposed to use cervical infections with nontargeted HPV strains to reduce or remove confounding bias of estimates of vaccine efficacy on targeted strains. However, they assumed that vaccinated women could not change their behavior after vaccination. We consider a more plausible setting where unmeasured sexual behavior acts as both a confounder and a mediator, and investigate if the quantity estimated in practice with their method has a clear causal meaning. We demonstrate that using nontargeted HPV infections can remove both confounding bias and the portion of the vaccine effect on the targeted HPV strains that is mediated through the change of behavior. In that case, the estimated quantity has a clear causal interpretation as it represents the direct immunological effect of the vaccine. However, it could be considered misleading from a public health perspective, as in the presence of risk compensation it would suggest higher protection than what women effectively experience. An unblinded randomized controlled trial would allow estimation of the total causal effect of the vaccine, and infections with nontargeted HPV strains could then be used to isolate the indirect behavioral effect of the vaccine.

2605.11766 2026-05-13 stat.ME

Uncovering Local Heterogeneity: Local Summary Characteristics for Spatial Point Processes with Composition-Valued Marks

Clemens Baldzuhn, Matthias Eckardt

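AI summary: This paper proposes Local Indicators of Mark Association (LIMA) for analyzing spatial point processes with composition-valued marks, in order to reveal local heterogeneity that traditional global statistics conceal. Using a LIMA framework based on log-ratio transformations, it maps the constrained compositional marks into Euclidean space, enabling a point-specific decomposition of global mark characteristics and improving the detection of local mark clusters. The method is validated through simulations and economic sector data from Castile-La Mancha, Spain, demonstrating its strengths in identifying regional economic clustering and local "drainage" effects.

Details
Comments
submitted for publication
English abstract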

Traditional analysis of marked spatial point processes often relies on global summary statistics, which tend to obscure local spatial heterogeneity by averaging dependencies across the entire observation window. To overcome this limitation, this paper introduces a framework for Local Indicators of Mark Association (LIMA) specifically designed for composition-valued marks. Such marks, characterized by their non-negative components and sum-to-constant constraint, require a specialized treatment within the Aitchison geometry. By employing log-ratio transformations, we project these constrained marks into a Euclidean space, enabling the point-specific decomposition of global mark characteristics. The efficacy of the proposed clr-based LIMA functions is validated through extensive simulation studies. The results demonstrate a superior capacity to detect localized mark clusters, achieving detection accuracies consistently higher than their global counterparts. The practical utility of this framework is demonstrated using an empirical dataset of economic sector compositions in Castile-La Mancha, Spain. The analysis uncovers latent economic clustering patterns and localized \textit{drainage} effects that are invisible to global metrics, providing granular insights into regional spatial dynamics. Our findings suggest that the extended LIMA framework serves as a vital diagnostic tool for high-dimensional, non-stationary marked point patterns.
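
The abstract refers to clr-based functions. As a hedged illustration of that one ingredient, the sketch below applies the standard centred log-ratio (clr) transform to a single composition. It does not implement the LIMA functions themselves, and the function name `clr` and the example shares are hypothetical.

```python
import numpy as np

def clr(composition):
    """Centred log-ratio transform of one strictly positive composition."""
    x = np.asarray(composition, dtype=float)
    if np.any(x <= 0):
        raise ValueError("clr requires strictly positive parts")
    log_x = np.log(x)
    # Subtracting the mean log is the same as dividing by the geometric mean.
    return log_x - log_x.mean()

# Hypothetical sector shares for one location (they sum to 1).
shares = [0.5, 0.3, 0.2]
z = clr(shares)
print(z, z.sum())
```

The transformed vector sums to zero and is unchanged when the composition is rescaled, so ordinary Euclidean distances and products can be applied to it.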

2605.11684 2026-05-13 cs.LG eess.SP math.PR stat.AP

Partial Model Sharing Improves Byzantine Resilience in Federated Conformal Prediction

Ehsan Lari, Reza Arablouei, Stefan Werner

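AI summary: This paper proposes a Byzantine-resilient federated conformal prediction method based on partial model sharing, which exchanges only a subset of model parameters each round to improve security and communication efficiency. The method strengthens robustness in both the training and calibration stages: during training, partial sharing restricts the attack surface and attenuates malicious updates; during calibration, histogram-based characterization vectors are used for anomaly detection and conformal quantile estimation. Experiments show that, across a range of Byzantine attack scenarios, the method achieves coverage closer to the nominal level with substantially tighter prediction intervals, providing a more efficient and robust approach to federated uncertainty quantification.

Details
Comments
5 pages, 4 figures, Accepted for presentation at the 34th European Signal Processing Conference (EUSIPCO 2026) in Bruges, Belgium
English abstract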

We propose a Byzantine-resilient federated conformal prediction (FCP) method that leverages partial model sharing, where only a subset of model parameters is exchanged each round. Unlike existing robust FCP approaches that primarily harden the calibration stage, our method protects both the federated training and conformal calibration phases. During training, partial sharing inherently restricts the attack surface and attenuates poisoned updates while reducing communication. During calibration, clients compress their non-conformity scores into histogram-based characterization vectors, enabling the server to detect Byzantine clients via distance-based maliciousness scores and to estimate the conformal quantile using only benign contributors. Experiments across diverse Byzantine attack scenarios show that the proposed method achieves closer-to-nominal coverage with substantially tighter prediction intervals than standard FCP, establishing a robust and communication-efficient approach to federated uncertainty quantification.
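
For readers unfamiliar with the conformal quantile mentioned above, the following is a minimal sketch of the standard single-machine split-conformal quantile. It is not the paper's federated, Byzantine-resilient estimator; the function name `conformal_quantile` and the toy scores are assumptions.

```python
import math

def conformal_quantile(scores, alpha):
    """Split-conformal threshold at miscoverage level alpha.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score,
    or infinity when the calibration set is too small for that level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

# Hypothetical non-conformity scores from a calibration set of size 10.
calibration_scores = [3, 1, 4, 2, 5, 9, 7, 6, 10, 8]
q = conformal_quantile(calibration_scores, alpha=0.25)
print(q)
```

A prediction set then contains every candidate label whose non-conformity score is at most `q`.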

2605.11677 2026-05-13 math.ST stat.TH

Bayesian and Empirical Bayesian Bootstrapping

Nils Lid Hjort

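AI summary: This paper proposes a nonparametric 'Bayesian bootstrap' for obtaining estimates and confidence limits for a parameter of an unknown distribution, numerically approximating the exact posterior distribution of the parameter under a Dirichlet process prior. In large samples the method is asymptotically equivalent to the classical bootstrap, Rubin's Bayesian bootstrap, and the delta method, among others, and it is extended to semiparametric regression, censored data, and more general hazard rate models. The paper also discusses empirical Bayesian versions and shows that the classical bootstrap can be justified within a Bayesian framework.

Details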
Comments
24 pages, no figures. Statistical Research Report, Department of Mathematics, University of Oslo, May 1991; this version arXiv'd May 2026, for broader visibility. Biometrika expressed positive interest, and invited a revision, but somehow the author did not use that opportunity
English abstract

Let $X_1,\ldots,X_n$ be a random sample from an unknown probability distribution $P$ on the sample space ${\cal X}$, and let $θ=θ(P)$ be a parameter of interest. The present paper proposes a nonparametric 'Bayesian bootstrap' method of obtaining Bayes estimates and Bayesian confidence limits for $θ$. It uses a simple simulation technique to numerically approximate the exact posterior distribution of $θ$ using a (non-degenerate) Dirichlet process prior for $P$. Asymptotic arguments are given which justify the use of the Bayesian bootstrap for any smooth functional $θ(P)$. When the prior is fixed and the sample size grows, five approaches become first-order equivalent: the exact Bayesian, the Bayesian bootstrap, Rubin's degenerate-prior bootstrap, Efron's bootstrap, and the classical one using delta methods. The Bayesian bootstrap method is also extended to the semiparametric regression case. A separate section treats similar ideas for censored data and for more general hazard rate models, where a connection is made to a 'weird bootstrap' proposed by Gill. Finally, empirical Bayesian versions of the procedure are discussed, where suitable parameters of the Dirichlet process prior are inferred from data. Our results lend Bayesian support to the classic Efron bootstrap. It is the Bayesian bootstrap under a noninformative reference prior; it is a limit of natural approximations to good Bayes solutions; it is an approximation to a natural empirical Bayesian strategy; and the formally incorrect reading of a bootstrap histogram as a posterior distribution for the parameter isn't so incorrect after all.
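
As a rough illustration of the simulation idea, the sketch below implements Rubin's degenerate-prior bootstrap for the mean, one of the five approaches the abstract compares. It is not the paper's method with a non-degenerate Dirichlet process prior. The function name `rubin_bayesian_bootstrap_mean`, the number of draws, and the data are assumptions.

```python
import numpy as np

def rubin_bayesian_bootstrap_mean(x, draws=2000, seed=0):
    """Posterior draws of the mean under Rubin's Bayesian bootstrap.

    Each draw reweights the observed points with Dirichlet(1, ..., 1)
    weights and returns the corresponding weighted mean.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    weights = rng.dirichlet(np.ones(len(x)), size=draws)  # shape (draws, n)
    return weights @ x

data = [1.0, 2.0, 3.0, 4.0, 5.0]
samples = rubin_bayesian_bootstrap_mean(data)
lower, upper = np.quantile(samples, [0.025, 0.975])
print(samples.mean(), lower, upper)
```

Efron's bootstrap differs only in the weights: it resamples with multinomial counts divided by $n$ rather than drawing Dirichlet weights.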

2605.11652 2026-05-13 stat.ML cs.LG math.ST stat.TH

Posterior Contraction Rates for Sparse Kolmogorov-Arnold Networks in Anisotropic Besov Spaces

Jeunghun Oh, Kyeongwon Lee, Jaeyong Lee, Lizhen Lin

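AI summary: This paper studies posterior contraction rates for sparse Bayesian Kolmogorov-Arnold networks (KANs) over anisotropic Besov spaces, providing a statistical foundation for KANs from a Bayesian perspective. With spike-and-slab-type sparsity priors, sparse Bayesian KANs are shown to attain near-minimax posterior contraction rates that depend on the intrinsic anisotropic smoothness of the target function. By placing a hyperprior on the model-size parameter, the posterior also adapts to unknown anisotropic smoothness while retaining the corresponding near-minimax rate. In contrast to sparse MLP-based models, the KAN depth can be kept fixed, with complexity controlled through the network width, the spline grid, and parameter sparsity. An extension to compositional Besov spaces yields rates that depend on layerwise smoothness and effective dimension, effectively avoiding the curse of dimensionality.

Details
English abstract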

We study posterior contraction rates for sparse Bayesian Kolmogorov-Arnold networks (KANs) over anisotropic Besov spaces, providing a statistical foundation of KANs from a Bayesian point of view. We show that sparse Bayesian KANs equipped with spike-and-slab-type sparsity priors attain the near-minimax posterior contraction. In particular, the contraction rate depends on the intrinsic anisotropic smoothness of the underlying function. Moreover, by placing a hyperprior on a single model-size parameter, the resulting posterior adapts to unknown anisotropic smoothness and still achieves the corresponding near-minimax rate. A distinctive feature of our results, compared with those for standard sparse MLP-based models, is that the KAN depth can be kept fixed: owing to the flexibility of learnable spline edge functions, the required approximation complexity is controlled through the network width, spline-grid range and size, and parameter sparsity. Our analysis develops theoretical tools tailored to sparse spline-edge architectures, including approximation and complexity bounds for Bayesian KANs. We then extend to compositional Besov spaces and show that the contraction rates depend on layerwise smoothness and effective dimension of the underlying compositional structure, thereby effectively avoiding the curse of dimensionality. Together, the developed tools and findings advance the theoretical understanding of Bayesian neural networks and provide rigorous statistical foundations for KANs.

2605.11638 2026-05-13 stat.ML cs.LG

Learning U-Statistics with Active Inference

Xiaoning Wang, Yuyang Huo, Liuhua Peng, Changliang Zou

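AI summary: This paper studies how active inference can improve the estimation efficiency of U-statistics when labels are costly to acquire. The authors propose a U-statistic framework based on augmented inverse probability weighting that incorporates the sampling rule and machine learning predictions, characterize the sampling rule that minimizes the variance, and extend the framework to U-statistic-based empirical risk minimization. Experiments show that the method substantially improves estimation efficiency while preserving valid statistical inference.

Details
English abstract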

$U$-statistics play a central role in statistical inference. In many modern applications, however, acquiring the labels required for $U$-statistics is costly. Motivated by recent advances in active inference, we develop an active inference framework for $U$-statistics that selectively queries informative labels to improve estimation efficiency under a fixed labeling budget, while preserving valid statistical inference. Our approach is built on the augmented inverse probability weighting $U$-statistic, which is designed to incorporate the sampling rule and machine learning predictions. We characterize the optimal sampling rule that minimizes its variance and design practical sampling strategies. We further extend the framework to $U$-statistic-based empirical risk minimization. Experiments on real datasets demonstrate substantial gains in estimation efficiency over baseline methods, while maintaining target coverage.
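
For context, an order-2 $U$-statistic averages a symmetric kernel over all unordered pairs of observations. The sketch below shows only this plain, fully labeled definition, not the paper's active or inverse-probability-weighted estimator; the function name `u_statistic` and the data are assumptions. With the kernel $h(a, b) = (a - b)^2 / 2$ it reproduces the unbiased sample variance.

```python
from itertools import combinations
import statistics

def u_statistic(data, kernel):
    """Order-2 U-statistic: the kernel averaged over all unordered pairs."""
    pairs = list(combinations(data, 2))
    return sum(kernel(a, b) for a, b in pairs) / len(pairs)

def variance_kernel(a, b):
    """Symmetric kernel whose U-statistic is the unbiased sample variance."""
    return (a - b) ** 2 / 2

data = [1.0, 2.0, 3.0, 4.0]
u_value = u_statistic(data, variance_kernel)
print(u_value, statistics.variance(data))
```

Every pair requires the labels of both observations, which is why a limited labeling budget makes the choice of which labels to query matter.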

2605.11614 2026-05-13 stat.AP

Fairness Testing for Algorithmic Pricing

Fei Huang, Giles Hooker

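AI summary: This paper studies fairness testing for algorithmic pricing systems and shows that the commonly used audit based on ordinary least squares is structurally flawed: pricing algorithms are usually deterministic, so residuals reflect approximation error rather than sampling variability. The authors derive correct variance estimators and a cross-covariance formula and apply them to pricing data from 34 Illinois auto insurers. Every insurer fails the conditional demographic parity test; the standard proxy discrimination formula flags none of them, whereas the corrected formula identifies all 34 as statistically significant.

Details
English abstract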

Algorithmic systems now set prices across auto insurance, credit, and lending markets, and regulators increasingly require firms to demonstrate that these systems do not discriminate against protected groups. The standard audit regresses pricing output on a protected attribute and legitimate rating factors, then tests the resulting coefficient using ordinary least squares standard errors. We show that this approach is structurally invalid. Pricing algorithms are usually deterministic, so residuals reflect approximation error rather than sampling variability, rendering classical standard errors invalid in both direction and magnitude. We derive correct asymptotic variance estimators for OLS and GLM audit regressions and the correct cross-covariance formula for proxy discrimination testing. Applied to quoted premiums from 34 Illinois auto insurers, every insurer fails the conditional demographic parity test, with minority zip codes paying $34-$158 more per year than comparable-risk white zip codes. The standard proxy discrimination formula flags zero insurers. However, our corrected formula identifies all 34 as statistically significant, of which 16 exceed the substantive threshold. Our framework provides statistically valid audit tools for any deterministic algorithmic system subject to regression-based fairness testing.

2605.11540 2026-05-13 stat.ME

The design of selection experiments using a model-based approach

Brian R Cullis, Alison B Smith, David GD Hughes, David Butler

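AI summary: This paper proposes a model-based approach, built on linear mixed models, for constructing optimal or near-optimal designs for multi-environment selection experiments, with the aim of improving selection accuracy and genetic gain in plant breeding. The approach introduces an additional step that allocates replication to genotypes, making more efficient use of resources, and its advantages are shown for both single- and multi-environment experiments. Worked examples and simulation studies demonstrate the approach, offering a flexible and principled way to design selection experiments.

Details
English abstract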

Plant breeding programs use data obtained from multi-environment selection experiments to produce improved varieties with the ultimate aim of maintaining high levels of genetic gain. Selection accuracy can be improved with the use of advanced statistical analytical methods that use informative and parsimonious variance models for the set of genotype by environment interaction effects, include information on genetic relatedness and appropriately accommodate non-genetic sources of variation within the framework of a single step estimation and prediction algorithm. Maximal gains from using these advanced techniques are more likely to be achieved if the designs used match the aims of the selection experiment and make full use of the available resources. In this paper we present an approach for constructing designs for selection experiments which are optimal or near optimal against a robust and sensible linear mixed model. This model reflects the models used for analysis. The approach is flexible and introduces an additional step to accommodate efficient resource allocation of replication status to genotypes, which is undertaken prior to the allocation of plots to genotypes. A motivating example is used to illustrate the approach; two illustrative examples are presented, one each for single- and multiple-environment selection experiments; and several in-silico simulation studies are used to demonstrate the advantages of these approaches.

2605.11531 2026-05-13 physics.ao-ph cs.LG stat.AP

Generative climate downscaling enables high-resolution compound risk assessment by preserving multivariate dependencies

Takuro Kutsuna, Noriko N. Ishizaki, Norihiro Oyama, Hiroaki Yoshida

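AI summary: This study proposes a diffusion-based multivariate generative framework for producing high-resolution climate data to improve the accuracy of compound risk assessment. Combined with bias correction, the method recovers inter-variable correlations that degrade as resolution increases, and so captures the dependencies behind compound hazards such as drought and heat stress more accurately. Experiments show that the method substantially reduces inter-variable correlation errors while improving univariate and spatial accuracy, providing a more reliable basis for regional climate risk assessment.

Details
English abstract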

Physics-based climate projections using general circulation models are essential for assessing future risks, but their coarse resolution limits regional decision-making. Statistical downscaling can efficiently add detail, yet many methods treat variables independently, degrading inter-variable relationships that govern compound hazards such as heat stress, drought, and wildfire. Here we show that a diffusion-based multivariate generative framework, combined with bias correction, recovers degraded inter-variable correlations even under a 50$\times$ increase in linear resolution. When applied to five meteorological variables over Japan, the framework reduces inter-variable correlation errors by more than fourfold relative to existing baselines while improving both univariate and spatial accuracy, leading to more accurate detection of severe drought. These results demonstrate that multivariate generative downscaling improves the reliability of compound risk assessment under large resolution gaps.

2605.11515 2026-05-13 stat.ME

Exploiting independence constraints for efficient estimation of bounds on causal effects in the presence of unmeasured confounding

Ting-Hsuan Chang, Caleb H. Miles, Ilya Shpitser, Eric J. Tchetgen Tchetgen, Daniel Malinsky

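AI summary: This study aims to estimate upper and lower bounds on causal effects efficiently in the presence of unmeasured confounding. The authors propose an influence function projection approach that exploits the conditional independence constraints implied by a causal graph to improve the efficiency of semiparametric estimators. The approach applies across multiple sensitivity analysis frameworks and causal estimands, connecting knowledge of graphical structure with sensitivity analysis, and is illustrated through simulations and real data examples.

Details
English abstract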

Causal graphs may inform covariate adjustment for estimating causal effects and improve estimation efficiency by exploiting the graphical structure. In many applications, however, the target causal parameter may not be point-identified due to the presence of unmeasured confounding. Sensitivity analysis methods address this challenge by characterizing bounds on the causal parameter under varying assumptions about the magnitude or form of unmeasured confounding. We focus on semiparametric efficient estimation of causal effects in non-identifiable settings, assuming a known (or hypothesized) causal graph. We propose an influence function projection approach that exploits the conditional independence constraints implied by the graph to improve the efficiency of semiparametric estimators of upper and lower bounds on the average causal effect under a given sensitivity analysis model. Our approach applies across multiple sensitivity analysis frameworks and causal estimands, thereby connecting knowledge of graphical structure with the sensitivity analysis literature. We illustrate our approach through simulations and real data examples thought to be affected by unmeasured confounding, including the effect of labor training program on post-intervention earnings, and the effect of low ejection fraction on heart failure death.

2605.11511 2026-05-13 stat.ML cs.LG

Post-ADC Inference: Valid Inference After Active Data Collection

Shuichi Nishino, Tomohiro Shiraishi, Teruyuki Katsuoka, Ichiro Takeuchi

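AI summary: This paper studies the validity of statistical inference after active data collection (ADC), showing that conventional inference can fail because of the adaptive bias of the data collection process. The authors propose the post-ADC inference framework, which builds on selective inference to correct for the biases arising from both the collection process and the subsequent data-driven construction of the inferential target, yielding valid p-values and confidence intervals. The method requires assumptions only on the observation noise and applies to a broad class of ADC processes; experiments show valid inference on data collected by GP-UCB and TPE.

Details
English abstract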

The validity of statistical inference depends critically on how data are collected. When data gathered through active data collection (ADC) are reused for a post-hoc inferential task, conventional inference can fail because the sampling is adaptively biased toward regions favored by the collection strategy. This issue is especially pronounced in black-box optimization, where sequential model-based optimization (SMBO) methods such as the tree-structured Parzen estimator (TPE) and Gaussian process upper confidence bound (GP-UCB) preferentially concentrate evaluations in promising regions. We study statistical inference on actively collected data when the inferential target is constructed in a data-dependent manner after data collection. To enable valid inference in this setting, we propose post-ADC inference, a framework that accounts for the biases arising from both the active data collection process and the subsequent data-driven target construction. Our method builds on selective inference and provides valid $p$-values and confidence intervals that correct for both sources of bias. The framework applies to a broad class of ADC processes by imposing only assumptions on the observation noise, without requiring any assumptions on the underlying black-box function or the surrogate model used by the SMBO algorithm. Empirical results also show that post-ADC inference provides valid inference for data collected by GP-UCB and TPE.

2605.11478 2026-05-13 cs.AI cs.IT math.IT stat.ML

FibQuant: Universal Vector Quantization for Random-Access KV-Cache Compression

Namyoon Lee, Yongjune Kim

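AI summary This paper proposes FibQuant, a universal fixed-rate vector quantizer for random-access key-value (KV) cache compression, targeting the memory and traffic bottleneck of long-context inference. It keeps the normalize-rotate-store interface while replacing the usual scalar tables with a shared radial-angular codebook matched to the normalized source, thereby retaining the geometry created by the normalization step and improving compression efficiency. Experiments show that FibQuant reaches higher compression ratios at high attention similarity and outperforms existing scalar quantization methods on several models.

Details
Comments
15 pages
English abstract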

Long-context inference is increasingly a memory-traffic problem. The culprit is the key-value (KV) cache: it grows with context length, batch size, layers, and heads, and it is read at every decoding step. Rotation-based scalar codecs meet this systems constraint by storing a norm, applying a shared random rotation, and quantizing one coordinate at a time. They are universal and random-access, but they discard the geometry created by the normalization step. After a Haar rotation, a block of $k$ consecutive coordinates is not a product source; it is a spherical-Beta source on the unit ball. We introduce FibQuant, a universal fixed-rate vector quantizer that keeps the same normalize-rotate-store interface while replacing scalar tables by a shared radial-angular codebook matched to this canonical source. The codebook combines Beta-quantile radii, Fibonacci / Roberts-Kronecker quasi-uniform directions, and multi-restart Lloyd-Max refinement. We prove that the resulting vector code strictly improves on its scalar product specialization at matched rate, with a high-rate gain that separates into a cell-shaping factor and a density-matching factor. The same construction gives a dense rate axis, including fractional-bit and sub-one-bit operating points, without calibration or variable-length addresses. On GPT-2 small KV caches, FibQuant traces a memory-fidelity frontier from $5\times$ compression at $0.99$ attention cosine similarity to $34\times$ at $0.95$. End-to-end on TinyLlama-1.1B, it is within $0.10$ perplexity of fp16 at $4\times$ compression and has $3.6\times$ lower perplexity than scalar TurboQuant at $b = 2$ ($8\times$ compression), where scalar random-access quantization begins to fail.
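The scalar baseline that the abstract contrasts against (store a norm, apply a shared random rotation, quantize one coordinate at a time) can be sketched roughly as follows. This is not FibQuant itself and not any published codec: the uniform quantizer, the clipping range `clip`, and all function names are illustrative assumptions.

```python
import numpy as np

def haar_rotation(d, rng):
    """Draw a d x d orthogonal matrix from the Haar measure via QR."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    # Fixing the signs of diag(r) removes the sign ambiguity of the QR factors.
    return q * np.sign(np.diag(r))

def encode(x, rot, bits, clip):
    """Store the norm, rotate the unit vector, quantize each coordinate."""
    norm = np.linalg.norm(x)
    u = rot @ (x / norm)
    levels = 2 ** bits
    idx = np.floor((u + clip) / (2 * clip) * levels)
    return norm, np.clip(idx, 0, levels - 1).astype(np.int64)

def decode(norm, idx, rot, bits, clip):
    """Map each index to its cell midpoint, undo the rotation, rescale."""
    levels = 2 ** bits
    u_hat = (idx + 0.5) / levels * (2 * clip) - clip
    return norm * (rot.T @ u_hat)

def relative_error(x, rot, bits, clip):
    norm, idx = encode(x, rot, bits, clip)
    x_hat = decode(norm, idx, rot, bits, clip)
    return np.linalg.norm(x_hat - x) / np.linalg.norm(x)

rng = np.random.default_rng(0)
d = 64
rot = haar_rotation(d, rng)
x = rng.standard_normal(d)
clip = 3.0 / np.sqrt(d)  # assumed range: about three standard deviations
err_2bit = relative_error(x, rot, bits=2, clip=clip)
err_8bit = relative_error(x, rot, bits=8, clip=clip)
```

Each coordinate is coded independently here, which is the property the abstract identifies as discarding the geometry of the normalized block.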

2605.11476 2026-05-13 math.OC stat.ML

A Barrier-Metric First-Order Method for Linearly Constrained Bilevel Optimization

Tenglong Hong, Paul Grigas

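AI summary This paper studies bilevel optimization with a fixed polyhedral lower-level feasible set. Two difficulties are addressed: active-set changes in the lower-level problem can make the upper-level objective nonsmooth, and existing methods are computationally expensive. The authors smooth the lower-level problem with a logarithmic barrier to obtain a differentiable approximation of the constrained bilevel objective, and design a proxy-gradient algorithm that uses only gradients of the upper- and lower-level objectives. A local geometric analysis in the barrier metric keeps the iterates in well-behaved regions near the lower-level centers, giving convergence rates of $\widetilde{O}(K^{-2/3})$ in the deterministic setting and $\widetilde{O}(K^{-2/5})$ under stochastic noise.

Details
English abstract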

We study bilevel optimization with a fixed polyhedral lower feasible set. Such problems are challenging for two reasons: active-set changes can make the upper objective nonsmooth, and existing hypergradient methods typically require lower-Hessian inversions or equivalent linear solves, which are computationally expensive. To address these issues, we adopt a logarithmic barrier smoothing of the lower problem to obtain a differentiable approximation of the constrained bilevel objective, and develop a proxy-gradient algorithm for the resulting barrier-smoothed surrogate. The algorithm uses only gradients of the upper and lower objectives; its only second-order object is the explicit logarithmic barrier Hessian determined by the fixed polyhedral constraints. Barrier smoothing restores differentiability, but Euclidean smoothness constants are not uniformly bounded near the boundary. We therefore develop a local Dikin-geometry analysis in which the barrier-metric provides an oracle-free curvature scale near the moving lower centers. This leads to barrier-aware schedules that keep the iterates inside locally well-behaved regions. For the barrier-smoothed objective, we prove stationarity rates of $\widetilde{O}(K^{-2/3})$ in the deterministic setting and $\widetilde{O}(K^{-2/5})$ under upper-level-only bounded stochastic noise after $K$ outer iterations, together with quantitative bias control as the barrier parameter decreases.
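For a polyhedron $\{y : Ay \le b\}$, the explicit logarithmic barrier Hessian mentioned above has the standard closed form $A^\top \mathrm{diag}(1/s^2) A$ with slacks $s = b - Ay$, and the unit ball of its local norm (the Dikin ellipsoid) stays inside the feasible set. A minimal sketch of these two facts follows; the box constraints and function names are illustrative assumptions, and this is not the paper's algorithm.

```python
import numpy as np

def slacks(A, b, y):
    """Slack vector s = b - A y; must be positive for an interior point."""
    s = b - A @ y
    if np.any(s <= 0):
        raise ValueError("y must lie strictly inside {y : A y <= b}")
    return s

def barrier_gradient(A, b, y):
    # phi(y) = -sum_i log(b_i - a_i^T y), so grad phi = A^T (1 / s)
    return A.T @ (1.0 / slacks(A, b, y))

def barrier_hessian(A, b, y):
    # hess phi = A^T diag(1 / s^2) A
    s = slacks(A, b, y)
    return A.T @ (A / s[:, None] ** 2)

def local_norm(H, v):
    """Norm of v in the barrier metric defined by H."""
    return float(np.sqrt(v @ H @ v))

# Unit box 0 <= y <= 1 in two dimensions (assumed toy constraint set).
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([1.0, 1.0, 0.0, 0.0])
y = np.array([0.9, 0.5])

H = barrier_hessian(A, b, y)
step = np.array([0.05, 0.2])
inside_dikin = local_norm(H, step) < 1.0          # step lies in the Dikin ellipsoid
still_feasible = bool(np.all(A @ (y + step) < b))  # hence the new point is interior
```

Near the boundary the Hessian entries blow up (here the first coordinate, with slack 0.1), which is why a Euclidean smoothness constant is not uniform while the local norm still gives a usable step scale.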

2605.11473 2026-05-13 cs.AI cs.LG cs.RO stat.ML

TOPPO: Rethinking PPO for Multi-Task Reinforcement Learning with Critic Balancing

Yuanpeng Li, Gefei Lin, Annie Qu, Rui Miao

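AI summary This paper studies the optimization of the on-policy method PPO in multi-task reinforcement learning and identifies poor conditioning of the value-function (critic) gradients in the multi-task setting, which causes learning on some tasks to stall. The authors propose TOPPO, which adds Critic Balancing modules to improve gradient conditioning and balance learning across tasks. Experiments show that TOPPO outperforms existing SAC-family and ARS-family baselines with fewer parameters and environment steps, achieving stronger mean and tail-task performance on a multi-task benchmark, and indicating that on-policy methods, properly optimized, can rival or exceed off-policy methods.

Details
English abstract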

Soft Actor-Critic (SAC) and its variants dominate Multi-Task Reinforcement Learning (MTRL) due to their off-policy sample efficiency, while on-policy methods such as Proximal Policy Optimization (PPO) remain underexplored. We diagnose that PPO in MTRL suffers from a previously overlooked issue: critic-side gradient ill-conditioning, which may cause tail tasks to stall while easy tasks dominate the value function's updates. To address this, we propose TOPPO (Tail-Optimized PPO), a reformulation of PPO via Critic Balancing, a set of modules that improve gradient conditioning and balance learning dynamics across tasks. Unlike prior approaches that rely on modular architectures or large models, TOPPO targets the optimization bottleneck within PPO itself. Empirically, TOPPO achieves stronger mean and tail-task performance than published SAC-family and ARS-family baselines while using substantially fewer parameters and environment steps on the Meta-World+ benchmark. Notably, TOPPO matches or surpasses strong SAC baselines early in training and maintains superior performance at full budget. Ablations confirm the effectiveness of each module in TOPPO and provide insights into their interactions. Our results demonstrate that, with proper optimization, on-policy methods can rival or exceed off-policy approaches in MTRL, challenging the prevailing reliance on SAC and highlighting critic-side gradient conditioning as the central bottleneck.

2605.11415 2026-05-13 stat.ME

Causal inference with ordinal outcomes: copula-based identification, estimation and sensitivity analysis

Peiyu He, Fan Li

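AI summary This paper studies causal inference with ordinal outcomes and proposes a copula-based method that links the marginal distributions of the potential outcomes through a parametric copula, whose association parameter acts as a sensitivity parameter, so that the causal estimands can be identified and estimated. Under unconfoundedness, the authors derive the efficient influence function in the nonparametric model and construct robust estimators, providing an interpretable bridge between partial identification and point estimation. The paper also offers a comprehensive sensitivity analysis with respect to the copula specification and the unconfoundedness assumption, and develops the accompanying R package ordinalCI.

Details
Comments
39 pages, 5 figures; includes supplementary material
English abstract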

In causal inference with ordinal outcomes, several interpretable estimands are functions of the probability that the potential outcome under one treatment is larger than that under another treatment for the same unit. This probability depends on the joint distribution of both potential outcomes and is generally not identifiable. Existing work has focused on sharp bounds on this probability based on partial identification, but the bounds are often too wide to be informative. We propose a copula-based method that links the identifiable marginal distributions of the potential outcomes via a parametric copula, treating the copula association parameter as a sensitivity parameter. With a fixed copula parameter, the estimands become identified functionals of the observed data. Working under unconfoundedness, we derive the efficient influence function in the nonparametric model and construct one-step estimators that accommodate flexible nuisance estimation. The resulting procedure is rate-doubly-robust and attains the semiparametric efficiency bound under standard conditions. Varying the copula parameter yields a sensitivity curve with pointwise confidence bands that typically lie within the sharp bounds, providing an interpretable bridge between partial identification and point estimation. We further provide a comprehensive sensitivity analysis with respect to both the copula specification and the unconfoundedness assumption. We develop an associated R package, ordinalCI.
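To make the identification step concrete: once the two marginal distributions and a copula parameter are fixed, the joint distribution of the potential outcomes, and hence $P(Y(1) > Y(0))$, follows by differencing the copula on the grid of marginal CDF values. The sketch below uses the Frank copula purely as an example of a one-parameter family; the abstract does not say which copula the authors use, and the marginals and function names are made up for illustration. It does not use the `ordinalCI` package.

```python
import numpy as np

def frank_copula(u, v, theta):
    """Frank copula C(u, v); theta != 0 controls the association."""
    num = np.expm1(-theta * u) * np.expm1(-theta * v)
    return -np.log1p(num / np.expm1(-theta)) / theta

def joint_pmf(p1, p0, theta):
    """Joint pmf of (Y(1), Y(0)) implied by the marginals and the copula."""
    F1 = np.concatenate([[0.0], np.minimum(np.cumsum(p1), 1.0)])
    F0 = np.concatenate([[0.0], np.minimum(np.cumsum(p0), 1.0)])
    grid = frank_copula(F1[:, None], F0[None, :], theta)
    # Rectangle differences of the joint CDF give the cell probabilities.
    return np.diff(np.diff(grid, axis=0), axis=1)

def prob_treated_larger(p1, p0, theta):
    """P(Y(1) > Y(0)): mass strictly below the diagonal (rows index Y(1))."""
    return float(np.tril(joint_pmf(p1, p0, theta), k=-1).sum())

# Hypothetical marginal pmfs over three ordered categories.
p1 = np.array([0.2, 0.3, 0.5])
p0 = np.array([0.4, 0.4, 0.2])

# A crude sensitivity curve: the estimand as the copula parameter varies.
curve = {theta: prob_treated_larger(p1, p0, theta) for theta in (-5.0, 1e-6, 5.0)}
```

As `theta` approaches zero the Frank copula approaches independence, so the middle entry of `curve` is close to the value obtained from the product of the marginals.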

2605.11394 2026-05-13 stat.ML cs.AI cs.LG stat.AP stat.ME

Spatial Adapter: Structured Spatial Decomposition and Closed-Form Covariance for Frozen Predictors

Wen-Ting Wang, Wei-Ying Wu, Hao-Yun Huang, Xuan-Chun Wang

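AI summary This paper proposes the Spatial Adapter, a parameter-efficient module that equips any frozen first-stage predictor with a structured spatial representation of its residuals and a closed-form covariance estimate, without modifying the original model's parameters. Using a tractable mini-batch ADMM algorithm, it jointly learns a spatially regularized orthonormal basis and per-sample scores, extracting a low-rank spatial structure from the residual field that is smooth, sparse, and orthogonal. The method supports kriging-style spatial prediction at unobserved locations as well as uncertainty quantification; experiments show that it recovers residual spatial structure across several datasets with far fewer parameters than conventional approaches.

Details
Comments
Preprint. 10 pages main text, with appendices
English abstract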

We present the Spatial Adapter, a parameter-efficient post-hoc layer that equips any frozen first-stage predictor with a structured spatial representation of its residual field and an induced closed-form spatial covariance. The adapter operates as a cascade second stage on residuals, jointly learning a spatially regularized orthonormal basis and per-sample scores via a tractable mini-batch ADMM procedure, without modifying any first-stage parameter. Because the first-stage parameters are frozen, the adapter does not retrain the backbone; its role is to supply a compressed distributional summary of the residual field. Smoothness, sparsity, and orthogonality together turn a generic low-rank factorization into an identifiable spatial representation whose induced residual covariance admits a closed-form low-rank-plus-noise estimator; the effective rank is determined data-adaptively by spectral thresholding, while the nominal rank K is an optimization-side upper bound only. This covariance enables kriging-style spatial prediction at unobserved locations, with plug-in uncertainty quantification as a secondary downstream use. Across synthetic data, Weather2K for spatial-holdout prediction, and GWHD patch grids as a basis-transferability diagnostic, the adapter recovers residual spatial structure when paired with frozen first stages from linear models to deep spatiotemporal and vision backbones; the added representation uses fewer than K(N+T) parameters alongside a compact residual-trend network.

2605.11373 2026-05-13 cs.AI cs.LG stat.ML

Causal Algorithmic Recourse: Foundations and Methods

Drago Plecko, Collin Wang, Elias Bareinboim

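AI summary This paper studies algorithmic recourse, the problem of giving individuals reliable recommendations for reversing a negative decision made by an AI decision-making system. The authors propose a causal framework that models recourse as a process over pre- and post-intervention outcomes, allowing for resampling and partial stability of latent variables. The paper introduces post-recourse stability conditions, develops a copula-based algorithm for inferring the effects of recourse from observational data, and provides a distribution-free learning method for cases where the data reject the copula model, making algorithmic recourse more robust and practical.

Details
English abstract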

The trustworthiness of AI decision-making systems is increasingly important. A key feature of such systems is the ability to provide recommendations for how an individual may reverse a negative decision, a problem known as algorithmic recourse. Existing approaches treat recourse outcomes as counterfactuals of a fixed unit, ignoring that real-world recourse involves repeated decisions on the same individual under possibly different latent conditions. We develop a causal framework that models recourse as a process over pre- and post-intervention outcomes, allowing for partial stability and resampling of latent variables. We introduce post-recourse stability conditions that enable reasoning about recourse from observational data alone, and develop a copula-based algorithm for inferring the effects of recourse under these conditions. For settings where paired observations of the same individual before and after intervention are available (called recourse data), we develop methods for inferring copula parameters and performing goodness-of-fit testing. When the copula model is rejected, we provide a distribution-free algorithm for learning recourse effects directly from recourse data. We demonstrate the value of the proposed methods on real and semi-synthetic datasets.

2605.11372 2026-05-13 math.ST stat.TH

The Geometry of Spectral Fluctuations: On Near-Optimal Conditions for Universal Gaussian CLTs, with Statistical Applications

Yanqing Yin, Wang Zhou

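AI summary This paper studies linear spectral statistics of high-dimensional sample covariance matrices in a regime where the empirical spectral distribution still follows the classical sample covariance law but the fluctuation theory is nonclassical. The authors decompose the covariance of centered quadratic forms into a universal Gaussian part and a model-dependent fourth-order correction, and build an abstract framework, called GHOST, for universal Gaussian central limit theorems under structured fourth-order effects. Within this framework they prove a Gaussian central limit theorem for linear spectral statistics, with explicit mean and covariance corrections determined by a bilinear fourth-order kernel. Boundary examples show that the conditions are close to necessary, and a blockwise mixed radial model makes the correction explicit and reveals a phase transition in which the fluctuation scale may change. Finally, the method is applied to sphericity testing, yielding a feasible data-driven correction of John's test.

Details
Comments
51 pages, 3 figures
English abstract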

We study linear spectral statistics of high-dimensional sample covariance matrices in a regime where the empirical spectral distribution remains governed by the classical sample covariance law but the fluctuation theory is nonclassical. Our starting point is a decomposition of the covariance of centered quadratic forms into a universal Gaussian part and a model-dependent fourth-order correction. This leads to an abstract framework, termed GHOST, for universal Gaussian central limit theorems under structured fourth-order effects. Under this framework, we prove a Gaussian central limit theorem for linear spectral statistics, with explicit mean and covariance corrections determined by a bilinear fourth-order kernel. Boundary examples show that the conditions are close to necessary for a broad universal Gaussian closure. We then develop a blockwise mixed radial model that verifies the abstract assumptions and makes the correction explicit. The correction splits into an entrywise fourth-moment component and a blockwise energy fluctuation component. The latter may change the fluctuation scale, leading to a phase transition at the level of fluctuations. As an application, we study sphericity testing. Under the spherical null, the general correction collapses to a single scalar parameter, yielding a feasible data-driven correction of John's test.

2605.11371 2026-05-13 stat.AP

Statistical evaluation of measurement precision in linear dose-response relationships via interlaboratory studies

Jun-ichi Takeshita, Yuto Ikeuchi, Tomomichi Suzuki

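AI summary This paper proposes a framework for evaluating the statistical precision of measurement methods from interlaboratory studies, for measurements that summarize a dose-response relationship by a regression line. Using the precision metrics defined in ISO 5725, namely the repeatability and between-laboratory variances, the study develops an analysis for linear mixed-effects models in which both the baseline and the dose-response slope may differ across laboratories, and gives, for balanced designs, an exact decomposition of the total sum of squares, ANOVA estimators of the precision variances, and the associated F-tests. The method can distinguish whether between-laboratory differences stem from baseline shifts or from differences in sensitivity, offering a more complete statistical tool for assessing the measurement precision of dose-response relationships.

Details
English abstract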

This paper proposes a framework for evaluating the statistical precision of measurement methods from interlaboratory studies where the outcome is a dose-response relationship summarized by a regression line. For such measurement methods, we apply a linear mixed-effects model that allows laboratories to differ in both baseline level and dose-response slope, and we define the precision metrics specified in ISO 5725, namely the repeatability and between-laboratory variances. These are method-level precision metrics, and the latter is constructed as a design-averaged dose-specific between-laboratory variance over the dose levels and the participating laboratories. For fully balanced designs with common dose levels and equal replication, we obtain an exact decomposition of the total sum of squares, closed-form analysis of variance (ANOVA) estimators of the precision variances, and three associated $F$-tests targeting (i) the overall dose-response trend, (ii) homogeneity of intercepts, and (iii) homogeneity of slopes across laboratories. This formulation enables precision to be quantified and estimated directly and supports an evaluation of whether between-laboratory discrepancies are caused primarily by baseline shifts or by differences in sensitivity, in contrast to fixed-effect comparisons that only detect the presence of differences. Furthermore, as a case study, we apply the proposed method to data from an interlaboratory study of observations in bronchoalveolar lavage fluid from experiments involving the intratracheal administration of nanomaterials to rats.
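One generic way to write a linear mixed-effects model with laboratory-specific intercepts and slopes is sketched below, for laboratory $i$, dose level $x_j$, and replicate $k$. The notation and the normality assumptions are illustrative and are not taken from the paper; the second line simply shows why the between-laboratory variance becomes dose-specific under such a model.

```latex
y_{ijk} = (\beta_0 + a_i) + (\beta_1 + b_i)\, x_j + e_{ijk},
\qquad
(a_i, b_i)^\top \sim \mathcal{N}\!\left(0,
  \begin{pmatrix} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{pmatrix}\right),
\qquad
e_{ijk} \sim \mathcal{N}(0, \sigma_r^2)

\operatorname{Var}(a_i + b_i x_j) = \sigma_a^2 + 2 x_j \sigma_{ab} + x_j^2 \sigma_b^2
```

Under this reading, $\sigma_r^2$ plays the role of the repeatability variance, and averaging the dose-specific variance over the design's dose levels gives a single method-level between-laboratory figure. Whether the paper averages in exactly this way is an assumption here.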

2605.11362 2026-05-13 cs.LG cs.AI stat.AP stat.ML

Causal Fairness for Survival Analysis

Drago Plecko

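AI summary In the data-driven era, machine learning and artificial intelligence are widely used in high-stakes domains such as healthcare and employment, raising concerns about the fairness of these systems. Existing work on fair machine learning mostly focuses on static settings, while fairness in temporal settings such as survival analysis remains underexplored. This paper proposes a causal framework for fairness in survival analysis that decomposes survival disparities into contributions from direct, indirect, and spurious pathways, giving an interpretable account of why disparities arise and how they evolve, and applies it to analyze how racial disparities in intensive care units change over time.

Details
English abstract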

In the data-driven era, large-scale datasets are routinely collected and analyzed using machine learning (ML) and artificial intelligence (AI) to inform decisions in high-stakes domains such as healthcare, employment, and criminal justice, raising concerns about the fairness behavior of these systems. Existing works in fair ML cover tasks such as bias detection, fair prediction, and fair decision-making, but largely focus on static settings. At the same time, fairness in temporal contexts, particularly survival/time-to-event (TTE) analysis, remains relatively underexplored, with current approaches to fair survival analysis adopting statistical fairness definitions, which, even with unlimited data, cannot disentangle the causal mechanisms that generate disparities. To address this gap, we develop a causal framework for fairness in TTE analysis, enabling the decomposition of disparities in survival into contributions from direct, indirect, and spurious pathways. This provides a human-understandable explanation of why disparities arise and how they evolve over time. Our non-parametric approach proceeds in four steps: (1) formalizing the necessary assumptions about censoring and lack of confounding using a graphical model; (2) recovering the conditional survival function given covariates; (3) applying the Causal Reduction Theorem to reframe the problem in a form amenable to causal pathway decomposition; (4) estimating the effects efficiently. Finally, our approach is used to analyze the temporal evolution of racial disparities in outcome after admission to an intensive care unit (ICU).

2605.11324 2026-05-13 cs.LG stat.ML

$\varepsilon$-Good Action Identification in Fixed-Budget Monte Carlo Tree Search

Yinan Li, Tuan Nguyen, Kwang-Sung Jun

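AI summary This paper studies the identification of ε-good actions in depth-2 max-min trees under a fixed budget, an important special case of Monte Carlo Tree Search. The authors propose an algorithm that does not require ε as input yet achieves instance-dependent error bounds for every meaningful ε, with a misidentification probability that decays exponentially. They also analyze how the hardness structure of this problem differs from that of standard K-armed bandits and provide corresponding lower-bound results. To the authors' knowledge, this is the first provable fixed-budget algorithmic guarantee for max-min action identification.

Details
English abstract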

We study the fixed-budget max-min action identification problem in depth-2 max-min trees, an important special case of Monte Carlo Tree Search. A learner sequentially allocates $T$ samples to leaves and then recommends a subtree whose minimum leaf value is largest. Motivated by approximate planning, we focus on $\varepsilon$-good subtree identification, where any subtree whose min value is within $\varepsilon$ of the optimal maximin value is acceptable. Our main contribution is an $\varepsilon$-agnostic algorithm: it does not require $\varepsilon$ as input, but achieves instance-dependent error bounds for every meaningful $\varepsilon$. We show that the misidentification probability decays as $\exp(-\widetilde{\Theta}(T/H_2(\varepsilon)))$, where $H_2(\varepsilon)$ captures both cross-subtree and within-subtree gaps. When each subtree has a single leaf, the problem reduces to standard fixed-budget best-arm identification, and our analysis recovers, up to accelerating factors, known $\varepsilon$-good guarantees for halving-style methods while giving a new $\varepsilon$-good guarantee for Successive Rejects. On the lower-bound side, we provide complementary positive and negative results showing that max-min identification has a different hardness structure from standard $K$-armed bandits. To our knowledge, this is the first provable fixed-budget algorithmic guarantee for max-min action identification.
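The acceptance criterion above is easy to state in code when the true leaf means are known. The sketch below only illustrates the definition of an $\varepsilon$-good subtree; it is not the paper's sampling algorithm, which sees only noisy leaf samples, and the tree values are made up.

```python
import numpy as np

def eps_good_subtrees(leaf_means, eps):
    """Return indices of subtrees whose min leaf value is within eps of the
    maximin value, together with the per-subtree minima."""
    mins = np.array([np.min(leaves) for leaves in leaf_means])
    return np.flatnonzero(mins >= mins.max() - eps), mins

# Hypothetical depth-2 tree: each inner list holds the leaf means of one subtree.
tree = [[0.9, 0.5], [0.6, 0.55, 0.8], [0.3, 0.95]]
good, mins = eps_good_subtrees(tree, eps=0.1)
```

With one leaf per subtree the minima are just the arm means, which is the reduction to best-arm identification noted in the abstract.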

2605.11311 2026-05-13 cs.LG cs.CV stat.CO stat.ML

Couple to Control: Joint Initial Noise Design in Diffusion Models

Jing Jia, Liyue Shen, Guanyang Wang

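AI summary This paper studies the design of initial noise in diffusion models and argues that the conventional assumption of mutually independent initial noises may limit generation. The authors propose designing the dependence structure among the noises while keeping each noise marginally standard Gaussian, which improves the diversity and quality of multi-sample generation without changing the model's input distribution. Experiments show that the method improves generation diversity on several mainstream diffusion models while preserving image quality and prompt alignment, and outperforms existing optimization-based methods on some metrics.

Details
Comments
26 pages
English abstract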

Diffusion models typically generate image batches from independent Gaussian initial noises. We argue that this independence assumption is only one choice within a broader class of valid joint noise designs. Instead, one can specify a coupling of the initial noises: each noise remains marginally standard Gaussian, so the pretrained diffusion model receives the same single-sample input distribution, while the dependence across samples is chosen by design. This reframes initial-noise control from selecting or optimizing individual seeds to designing the dependence structure of a multi-sample gallery. This view gives a general framework for initial-noise design, covering several existing methods as special cases and leading naturally to new coupled-noise constructions. Coupled noise can improve generation on its own without adding sampling cost, and it is flexible enough to serve as a structured initialization for optimization-based pipelines when additional computation is available. Empirically, repulsive Gaussian coupling improves gallery diversity on SD1.5, SDXL, and SD3 while largely preserving prompt alignment and image quality. It matches or outperforms recent test-time noise-optimization baselines on several diversity metrics at the same sampling cost as independent generation. Subspace couplings also support fixed-object background generation, producing diverse, natural backgrounds compared with specialized inpainting baselines, with a tunable trade-off in foreground fidelity.
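One simple way to obtain noises that are each marginally standard Gaussian yet negatively correlated across samples is an equicorrelated coupling, sketched below. This is a generic construction chosen for illustration and is not claimed to be the paper's repulsive Gaussian coupling; the function names and the value of `rho` are assumptions.

```python
import numpy as np

def equicorrelated_mixing(n, rho):
    """Symmetric matrix A with A @ A.T = (1 - rho) * I + rho * ones((n, n))."""
    if n < 2 or not (-1.0 / (n - 1) <= rho <= 1.0):
        raise ValueError("need n >= 2 and rho in [-1/(n-1), 1]")
    a = np.sqrt(1.0 - rho)
    b = np.sqrt(max(1.0 + (n - 1) * rho, 0.0))
    # Eigenvalue b on the all-ones direction, a on its orthogonal complement.
    return a * np.eye(n) + (b - a) / n * np.ones((n, n))

def coupled_noise(n, shape, rho, rng):
    """n noise tensors: each is marginally N(0, I); across samples, every
    coordinate has pairwise correlation rho."""
    eps = rng.standard_normal((n,) + tuple(shape))
    return np.tensordot(equicorrelated_mixing(n, rho), eps, axes=1)

n, rho = 4, -0.3  # rho < 0 pushes the samples apart; the lower limit is -1/(n-1)
A = equicorrelated_mixing(n, rho)
cov = A @ A.T     # unit diagonal confirms the standard Gaussian marginals
z = coupled_noise(n, (4, 8, 8), rho, np.random.default_rng(0))
```

Because the diagonal of `cov` is exactly one, each `z[i]` has the same distribution as an independent draw, so a pretrained model would see an unchanged single-sample input distribution.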

2605.11306 2026-05-13 stat.ME

Unified Operator Framework for Functional and Multivariate Regression

Mark Carpenter, Nicholas Gaubatz

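AI summary This paper proposes a unified operator framework for scalar, multivariate, and functional regression, based on integral operators defined with respect to general measures. By choosing different input and output measures, classical regression models such as scalar-on-function, function-on-scalar, function-on-function, and multivariate multiple regression arise as special cases. The study establishes the relationship between discrete representations and the continuous operator, and shows that estimation under discrete measures reduces to standard multivariate regression with statistical properties governed by classical results. The framework clarifies the connection between functional and multivariate regression and explains why vectorized multivariate regression can compete with functional methods in some linear settings.

Details
Comments
24 pages, 5 figures
English abstract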

We develop a unified operator framework for scalar, multivariate, and functional regression based on integral operators defined with respect to general measures. Within this framework, classical regression models, including scalar-on-function, function-on-scalar, function-on-function, and multivariate multiple regression, arise as special cases corresponding to different choices of input and output measures. We establish three main results. First, we show that the standard regression taxonomy can be expressed as a single operator under varying measures. Second, we demonstrate that discrete representations correspond to exact operator evaluations under discrete measures and converge to the continuous operator as the observation grid is refined. Third, we show that estimation under the discrete-measure formulation reduces to standard multivariate regression, with statistical properties governed by classical results. A simulation study illustrates these principles, highlighting the roles of discretization, conditioning, and estimation. Overall, the proposed framework clarifies the relationship between functional and multivariate regression and provides a meaningful interpretation of discretized modeling approaches as operator estimation under different measure specifications. This perspective also explains why vectorized multivariate regression is often competitive with functional methods in linear settings: it directly estimates the discrete-measure representation of the underlying operator.
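The second result, that a discrete representation is an exact operator evaluation under a discrete measure and converges to the continuous operator as the grid is refined, can be illustrated with a toy kernel. In the sketch below the kernel `beta`, the input function `f`, and the midpoint grid with equal weights are all assumptions made for the example; the continuous operator is $(Kf)(t) = \int_0^1 \beta(s,t) f(s)\,ds$, which equals $t/4$ for this choice.

```python
import numpy as np

def apply_operator(beta, f, s, w, t):
    """(K f)(t) = sum_j w_j * beta(s_j, t) * f(s_j): the integral operator
    evaluated under the discrete measure with atoms s_j and weights w_j."""
    B = beta(s[:, None], t[None, :])  # m x q matrix of kernel evaluations
    return (w * f(s)) @ B             # a plain vector-matrix product

def beta(s, t):
    return s * t

def f(s):
    return s ** 2

t = np.linspace(0.0, 1.0, 5)
exact = t / 4.0  # integral of s * t * s**2 over s in [0, 1]

def grid_error(m):
    s = (np.arange(m) + 0.5) / m      # midpoints of m equal cells
    w = np.full(m, 1.0 / m)           # uniform weights on the atoms
    return float(np.max(np.abs(apply_operator(beta, f, s, w, t) - exact)))

errors = [grid_error(m) for m in (10, 100, 1000)]
```

The discrete evaluation is just a matrix product, which is the sense in which estimation under a discrete measure looks like ordinary multivariate regression.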

2605.11284 2026-05-13 stat.ME cs.AI cs.LG

Rethinking external validation for the target population: Capturing patient-level similarity with a generative model

Mohammad Azizmalayeri, Ameen Abu-Hanna, Saskia Houterman, Marije M. Vis, Giovanni Cinà

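AI summary This study addresses the difficulty of interpreting model performance in external validation when the target population differs from the development population. It proposes a generative-model-based framework that quantifies the similarity of each external patient to the development data and evaluates model performance within subgroups of varying similarity. By using generative models such as autoencoders, the method estimates similarity more flexibly and without sharing the original development data, improving the interpretability and practicality of external validation. Experiments show that the framework reveals differences in model performance that conventional external validation hides, providing a more principled basis for assessing model transportability.

Details
English abstract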

Background: External validation is essential for assessing the transportability of predictive models. However, its interpretation is often confounded by differences between external and development populations. This study introduces a framework to distinguish model deficiencies from case-mix effects. Method: We propose a framework that quantifies each external patient's similarity to the development data and measures performance in subgroups with varying levels of alignment to the development distribution. We use generative models, specifically autoencoders, to estimate similarity, offering a more flexible alternative to traditional linear approaches and enabling validation without sharing the original development data. The utility of the autoencoder-based similarity measure is demonstrated using synthetic data, and the framework's application is illustrated using data from the Netherlands Heart Registration (NHR) to predict mortality after transcatheter aortic valve implantation. Results: Our framework revealed substantial variation in model performance across similarity-defined subgroups, differences that remain hidden under conventional external validation yet can meaningfully alter conclusions. In several settings, conventional external validation suggested poor overall performance. However, after accounting for differences in patient characteristics, model performance in some subgroups was consistent with internal validation results. Conversely, apparently acceptable overall performance could mask clinically relevant performance deficits in specific subgroups. Conclusion: The proposed framework enhances the interpretability of external validation by linking model performance to population alignment with the development data. This provides a more principled basis for deciding whether a model is transportable and to which patients it can be safely applied.

2605.11282 2026-05-13 math.ST stat.AP stat.TH

A Data-Consistent Approach to Ensemble Filtering

Rylan Spence, Troy Butler, Clint Dawson

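AI summary This paper proposes QPCA-EnDCF, a data-consistent ensemble filtering method for state estimation of chaotic systems under partial observation. The method replaces conventional stochastic observation perturbations with a spectrally regularized update in observation space, improving filtering accuracy and probabilistic calibration. The theoretical analysis shows that stochastic ensemble Kalman filter variants incur an irreducible variance contribution from observation perturbations, which QPCA-EnDCF replaces with projector-estimation variability of the same order; numerical experiments show better spread-skill behavior and calibration.

Details
Comments
42 pages, 5 figures
English abstract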

Ensemble filtering of chaotic, partially observed systems is often performed with ensembles far smaller than the state dimension, resulting in empirical covariances that are low rank. In this setting, stochastic observation perturbations can degrade both accuracy and probabilistic calibration. We develop a data-consistent perspective on ensemble filtering and introduce the Quantity-of-Interest Principal Component Analysis Ensemble Data Consistent Filter (QPCA-EnDCF), which is a deterministic method that replaces perturbed observations with a spectrally regularized update in observation space. The method whitens forecast-observation residuals, computes an empirical eigendecomposition of the residual covariance, and restricts the correction to a rank-$κ$ subspace before mapping the increment back to state space through an empirical gain. We establish a theoretical framework that separates population and finite-ensemble objects and yields a bias-variance decomposition for the analysis mean. The analysis shows that stochastic EnKF variants incur an irreducible $\mathcal{O}(1/N)$ variance contribution from observation perturbations, whereas QPCA-EnDCF replaces this term with projector-estimation variability that is also $\mathcal{O}(1/N)$ but depends on the retained rank and the cutoff gap through eigenspace stability. Numerical experiments on the Lorenz-96 system in strongly undersampled regimes demonstrate that QPCA-EnDCF substantially improves spread-skill behavior, temporal tracking between spread and error, and rank-histogram reliability relative to sequential and four-dimensional stochastic EnKF. Under the baseline configuration, these calibration gains are accompanied by lower RMSE.

2605.11239 2026-05-13 cs.LG stat.ML

Extending Kernel Trick to Influence Functions

Zhenhuan Sun, Shahrokh Valaee

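AI summary This paper presents a dual representation of influence functions whose computational complexity scales with dataset size rather than model size, offering a more efficient alternative for influence analysis of large models. The representation applies to linearizable models and requires materializing a matrix whose size grows with the product of the model output dimension and the dataset size; it can estimate changes in parameters, model outputs, and loss when evaluating the original influence functions in parameter space is infeasible. The approach is most advantageous when the model is much larger than the dataset.

Details
English abstract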

In this paper, we present a dual representation of the influence functions, whose computational complexity scales with dataset size rather than model size. Both analytically and experimentally, we show that this representation can be an efficient alternative to the original influence functions for estimating changes in parameters, model outputs and loss due to data point removal, when model size is large relative to dataset size, or when evaluating the original influence functions in parameter space is infeasible. The dual representation, however, is limited to linearizable models, which are models whose behavior can be approximated by their linearizations throughout training, and requires materializing a matrix, whose size grows with the product of model output dimension and dataset size.
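The flavor of such a dual representation can be seen in the simplest linear case. For ridge regression, the first-order influence of removing point $i$ on the parameters is $H^{-1} \nabla \ell_i$ with the $p \times p$ Hessian $H = X^\top X + \lambda I$, and the push-through identity $H^{-1} X^\top = X^\top (X X^\top + \lambda I)^{-1}$ lets the same vector be computed from an $n \times n$ system. The sketch below checks this on random data; it illustrates only the ridge special case and is not the paper's construction for general linearizable models.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 20, 200, 1.0           # more parameters than data points
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Ridge fit for the objective 0.5 * ||X theta - y||^2 + 0.5 * lam * ||theta||^2.
H = X.T @ X + lam * np.eye(p)      # p x p Hessian in parameter space
theta = np.linalg.solve(H, X.T @ y)
resid = X @ theta - y

i = 3
grad_i = X[i] * resid[i]           # gradient of 0.5 * (x_i^T theta - y_i)^2

# Parameter-space influence: one p x p solve.
infl_primal = np.linalg.solve(H, grad_i)

# Dual form: one n x n solve with the regularized Gram matrix.
K = X @ X.T + lam * np.eye(n)
e_i = np.zeros(n)
e_i[i] = 1.0
infl_dual = X.T @ np.linalg.solve(K, resid[i] * e_i)
```

The two vectors agree up to numerical error, while the dual route never forms or inverts a matrix of size $p \times p$.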

2605.11238 2026-05-13 math.ST stat.CO stat.TH

Efficient Robust Constrained Signal Detection via Kolmogorov Width Approximations

Yikun Li, Matey Neykov

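AI summary This paper studies robust minimax detection of signals constrained to a prior set K in the Gaussian sequence model under strong ε-contamination. Because existing optimal tests require computing the Kolmogorov widths of complex constraint sets, which is computationally intractable, the authors propose a polynomial-time testing framework that applies to several classes of structured constraint sets. A semidefinite programming relaxation combined with a modified ellipsoid method equipped with an approximate subgradient oracle efficiently approximates the Kolmogorov widths, achieving a robust detection boundary that matches existing upper bounds up to a polylogarithmic factor and giving an efficient test for structured signals that requires no prior knowledge of their geometric complexity.

Details
Comments
46 pages
English abstract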

Robust statistical inference often faces a severe computational-statistical gap when dealing with complex parameter spaces. We investigate minimax signal detection in the Gaussian sequence model under strong $ε$-contamination, where the signal belongs to a general prior constraint $K$. Existing optimal tests require computing the exact Kolmogorov $k$-width of $K$, a computationally intractable task for general non-trivial sets. We bridge this gap by proposing a polynomial-time testing framework that universally applies to balanced, type-2, and exactly 2-convex constraints. By leveraging a semidefinite programming relaxation and a modified ellipsoid method equipped with an approximate subgradient oracle, we efficiently approximate the Kolmogorov widths. Remarkably, our unconditional efficient algorithm achieves a robust detection boundary that matches existing upper bounds up to a mere polylogarithmic factor. This establishes a computationally tractable testing solution for a broad class of structured signals without requiring prior knowledge of their exact geometric complexity.

2605.11226 2026-05-13 math.AT math.ST stat.ML stat.TH

A Stable Distance Persistence Homology for Dynamic Bayesian Network Clustering

Will Bales, Carmen Rovi

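AI summary This study proposes a stable persistent homology method for clustering in dynamic Bayesian networks, aiming to capture global patterns in how dependencies between variables evolve over time. By constructing a Dynamic Bayesian Graph (DBG) and applying persistent homology to produce a barcode, it records the merging and disappearance of groups of strongly dependent variables. The method is stable: small perturbations in the network's conditional probability tables cause only small changes in the barcode, yielding a robust summary of how dependency structure evolves in a dynamic Bayesian network.

Details
Comments
27 pages, 8 figures
English abstract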

Dynamic Bayesian networks (DBNs) are a widely used framework for modeling systems whose probabilistic structure evolves over time. Standard inference methods focus on local conditional distributions and can miss larger-scale patterns in how dependencies between variables organize and change over time. We introduce a topological approach to this problem. To each DBN we associate a time-varying graph, called a Dynamic Bayesian Graph (DBG), by assigning to each edge a strength that measures variation in its conditional dependence across parent configurations, and retaining edges whose strength exceeds a chosen threshold. We show that this construction fits within the dynamic graph framework of Kim and Mémoli, enabling the use of tools from topological data analysis. Applying persistent homology to a DBG produces a barcode, which records the merging and disappearance of connected groups of strongly dependent variables over time. We prove that this barcode is stable: small perturbations in the conditional probability tables of the DBN lead to small changes in the resulting barcode. This yields a principled and noise-resistant summary of how dependency structure evolves in a dynamic Bayesian network.

2605.11220 2026-05-13 stat.AP

Prediction Markets Underperform Simple Baselines For Infectious Disease Forecasting

Carson Dudley, Reiden Magdaleno

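AI summary This study evaluates prediction markets (e.g., Polymarket, Kalshi) for infectious disease forecasting and finds that they fail to outperform standard benchmarks for both influenza and measles. Although prediction markets can respond in real time to news and unstructured signals, their participants typically lack epidemiological expertise, and the markets place probability mass on impossible outcomes and suffer from low trading volume, which makes their forecasts inefficient. The results indicate that current prediction markets are neither reliable forecasters of infectious disease dynamics nor useful as complementary features for existing forecasting systems.

Details
English abstract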

Prediction markets (e.g., Polymarket, Kalshi) allow participants to bet on future events, producing real-time forecasts based on collective judgment. In domains such as elections and finance, markets have been effective at aggregating information, often rivaling or outperforming expert forecasters or polls. Whether this performance extends to infectious disease dynamics is unclear. Participants are self-selected and typically lack epidemiological expertise. However, markets can respond in real time to emerging news and unstructured signals in ways that standard forecasting pipelines cannot. Also, substantial financial stakes encourage participants to make an effort to be accurate. We evaluate Polymarket forecasts during 2025 and 2026 for two settings: weekly cumulative influenza hospitalizations in the US, which have an established expert-curated forecasting ensemble (CDC FluSight), and monthly measles cases, which do not. Across both settings, prediction markets fail to outperform standard benchmarks. For influenza, markets are competitive with low-performing individual FluSight models but are dominated by the FluSight ensemble: even when we combine market forecasts with the ensemble, the best combination puts zero weight on the markets. For measles, markets are outperformed by simple statistical baselines. We diagnose two sources of market inefficiency: placement of probability mass on impossible outcomes (e.g., decreasing values in cumulative forecasts) and low trading volume. These results suggest that current prediction markets are not reliable forecasters of infectious disease dynamics on their own or useful as complementary features for existing forecasting systems.

2605.11191 2026-05-13 stat.ML cs.LG

Adaptive Policy Learning Under Unknown Network Interference

Aidan Gleich, Eric Laber, Alexander Volfovsky

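AI summary: This paper studies adaptive policy learning under unknown network interference, where the goal is to learn the interference dynamics among units while optimizing individual-level treatment allocation to maximize a cumulative outcome. The authors propose a Thompson sampling algorithm, implemented with a Gibbs sampler, that jointly learns the interference network and adaptively optimizes treatment allocations; it also returns an estimate of the network to support downstream causal analyses. The paper provides regret guarantees, and experiments show large reductions in regret across a range of scenarios.

Details
English abstract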

Adaptive experimentation under unknown network interference requires solving two coupled problems: (i) learning the underlying dynamics of interference among units and (ii) using these dynamics to inform treatment allocation in order to maximize a cumulative outcome of interest (e.g. revenue). Existing adaptive experimentation methods either assume the interference network is fully known or bypass the network by operating on coarse cluster-level randomizations. We develop a Thompson sampling algorithm that jointly learns the interference network and adaptively optimizes individual-level treatment allocations via a Gibbs sampler. The algorithm returns both an optimized treatment policy and an estimate of the interference network; the latter supports downstream causal analyses such as estimation of direct, indirect, and total treatment effects. For additive spillover models, we show that total reward is linear in the treatment vector with coefficients given by an $n$-dimensional latent score. We prove a Bayesian regret bound of order $\sqrt{nT \cdot B \log(en/B)}$ for exact posterior sampling; empirically, our Gibbs-based approximate sampler achieves regret consistent with this rate and remains sublinear when the additive spillovers assumption is violated. For general Neighborhood Interference, where this reduction is unavailable, we analyze an explore-then-commit variant with $O(n^2 \log T)$ graph-discovery cost. An information-theoretic $Ω(n \log T)$ lower bound complements both results. Empirically, our method achieves more than an order-of-magnitude reduction in regret in head-to-head comparisons. On two real-world networks, the algorithm achieves sublinear regret and yields downstream effect estimates with small RMSE relative to the truth.
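
The claim that total reward is linear in the treatment vector can be illustrated numerically. This is a sketch under an assumed additive spillover form, $Y_i(z) = a_i + \sum_j W_{ij} z_j$, where `W[i, j]` is the (hypothetical) effect on unit `i` of treating unit `j`; the paper's exact model may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
baseline = rng.normal(size=n)  # a_i: outcome of unit i when nobody is treated
# Sparse effect matrix; the diagonal holds direct effects, the rest spillovers.
W = rng.normal(size=(n, n)) * (rng.random((n, n)) < 0.4)
z = rng.integers(0, 2, size=n)  # binary treatment vector

# Total reward computed unit by unit.
outcomes = baseline + W @ z
total_direct = outcomes.sum()

# The same total written as a linear function of z with an n-dimensional score:
# score[j] = sum_i W[i, j] is the total effect of treating unit j.
score = W.sum(axis=0)
total_linear = baseline.sum() + score @ z

print(np.isclose(total_direct, total_linear))
```

Under this assumed form, choosing a treatment vector reduces to ranking units by `score`, which is why learning the latent score is enough for allocation.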

2605.11181 2026-05-13 cs.LG cs.AI cs.NA math.NA math.OC stat.ML

Muon is Not That Special: Random or Inverted Spectra Work Just as Well

Zakhar Shumaylov, Nathaël Da Costa, Peter Zaika, Bálint Mucsányi, Alex Massucco, Yoav Gelberg, Carola-Bibiane Schönlieb, Yarin Gal, Philipp Hennig

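AI summary: This paper challenges the prevailing view that the Muon optimizer's success stems from its non-Euclidean geometric structure, arguing that precise geometry is not the key factor in optimization performance. It introduces Freon, a family of optimizers based on Schatten (quasi-)norms; on GPT-2 the best-performing Schatten parameters lie in the quasi-norm regime, which standard LMO theory cannot explain. It then introduces Kaon, which replaces singular values with random noise yet still matches Muon's performance, showing that strict geometric structure is not necessary. The authors conclude that performance is governed mainly by two local quantities, alignment and descent potential, rather than by global geometry.

Details
Comments
45 pages
English abstract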

The recent empirical success of the Muon optimizer has renewed interest in non-Euclidean optimization, typically justified by similarities with second-order methods and linear minimization oracle (LMO) theory. In this paper, we challenge this geometric narrative through three contributions, demonstrating that precise geometric structure is not the key factor affecting optimization performance. First, we introduce Freon, a family of optimizers based on Schatten (quasi-)norms, powered by a novel, provably optimal QDWH-based iterative approximation. Freon naturally interpolates between SGD and Muon, while smoothly extrapolating into the quasi-norm regime. Empirically, the best-performing Schatten parameters for GPT-2 lie strictly within the quasi-norm regime, and thus cannot be represented by any unitarily invariant LMO. Second, noting that Freon performs well across a wide range of exponents, we introduce Kaon, an absurd optimizer that replaces singular values with random noise. Despite lacking any coherent geometric structure, Kaon matches Muon's performance and retains classical convergence guarantees, proving that strict adherence to a precise geometry is practically irrelevant. Third, having shown that geometry is not the primary driver of performance, we demonstrate it is instead controlled by two local quantities: alignment and descent potential. Ultimately, each optimizer must tune its step size around these two quantities. While their dynamics are difficult to predict a priori, evaluating them within a stochastic random feature model yields a precise insight: Muon succeeds not by tracking an ideal global geometry, but by guaranteeing step-size optimality.
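
The contrast between a Muon-style and a Kaon-style update direction can be sketched with an explicit SVD. This is an illustration only: Muon is commonly described as orthogonalizing the update matrix (in practice with an iterative approximation rather than an SVD), and the noise distribution used for the random spectrum here is an assumption, not the paper's choice. The function name `spectral_update` is hypothetical.

```python
import numpy as np

def spectral_update(grad, mode="orthogonal", rng=None):
    """Keep the singular vectors of `grad` and swap in a new spectrum."""
    U, s, Vt = np.linalg.svd(grad, full_matrices=False)
    if mode == "orthogonal":
        new_s = np.ones_like(s)  # Muon-style: all singular values set to 1
    elif mode == "random":
        rng = np.random.default_rng() if rng is None else rng
        new_s = rng.uniform(0.5, 1.5, size=s.shape)  # assumed noise law
    else:
        raise ValueError(f"unknown mode: {mode}")
    return (U * new_s) @ Vt  # same as U @ diag(new_s) @ Vt

rng = np.random.default_rng(0)
G = rng.normal(size=(4, 3))
muon_like = spectral_update(G, "orthogonal")
kaon_like = spectral_update(G, "random", rng=rng)

# Alignment <G, update> = sum_i s_i * new_s_i, positive for any positive spectrum.
print(np.sum(G * muon_like), np.sum(G * kaon_like))
```

Both directions keep a positive inner product with the gradient because the singular vectors are shared, which is one way to read the abstract's emphasis on alignment over exact geometry.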

2605.11179 2026-05-13 stat.ML cs.LG

Interpretable Machine Learning for Spatial Science: A Lie-Algebraic Kernel for Rotationally Anisotropic Gaussian Processes

Kane Warrior, Dalia Chakrabarty

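AI summary: Many three-dimensional spatial fields are rotationally anisotropic, meaning their directions of variation are not aligned with the coordinate axes. This paper proposes an interpretable rotationally anisotropic Gaussian process kernel that parameterises a three-dimensional symmetric positive definite (SPD) covariance metric using three principal length-scales and an explicit SO(3) rotation, which describes the orientation and scale of the anisotropy directly. The rotation is expressed in unconstrained Euclidean coordinates through the Lie-algebra exponential map while always yielding a valid SPD metric, and the method is validated for predictive performance and interpretability on synthetic data and a real material-density dataset.

Details
English abstract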

Many three-dimensional spatial fields are anisotropic, with directions of rapid and slow variation that need not align with the coordinate axes. Standard Gaussian process kernels with Automatic Relevance Determination (ARD) capture only axis-aligned anisotropy, while generic full symmetric positive definite (SPD) metrics can represent rotated anisotropy but do not parameterise principal length-scales and directions directly. We introduce an interpretable rotationally anisotropic GP kernel that parameterises a three-dimensional SPD covariance metric using three principal length-scales and an explicit SO(3) rotation. The rotation is represented by an axis-angle vector and mapped to SO(3) via the Lie-algebra exponential map, giving unconstrained Euclidean coordinates for inference while always inducing a valid SPD metric. The construction spans the same family of three-dimensional SPD covariance metrics as a generic full-SPD parameterisation, but exposes the geometry differently: length-scales and orientation are explicit, interpretable, and directly available for prior specification and posterior summaries. We perform Bayesian inference on these quantities using Markov Chain Monte Carlo (MCMC), and characterise the resulting symmetries and weakly identified regimes. On synthetic data with rotated anisotropy, the posterior recovers the generating metric and improves prediction relative to an axis-aligned ARD baseline, while matching the predictive performance of a generic full SPD baseline. When the ground truth is axis-aligned, posterior mass concentrates near the identity rotation and predictive performance matches ARD. On a material-density dataset from a laboratory-fabricated nano-brick, the inferred metric reveals rotated anisotropy that is not captured by axis-aligned kernels.
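
The parameterisation described above can be sketched as follows. Rodrigues' formula is the closed form of the exponential map from an axis-angle vector to SO(3). The choice of a squared-exponential kernel and of the metric $M = R\,\mathrm{diag}(\ell^{-2})\,R^\top$ is an assumption made for illustration; the function names are hypothetical.

```python
import numpy as np

def rotation_from_axis_angle(omega):
    """Rodrigues' formula: matrix exponential of the skew matrix of omega."""
    omega = np.asarray(omega, dtype=float)
    theta = np.linalg.norm(omega)
    K = np.array([[0.0, -omega[2], omega[1]],
                  [omega[2], 0.0, -omega[0]],
                  [-omega[1], omega[0], 0.0]])
    if theta < 1e-12:
        return np.eye(3) + K  # first-order expansion near the identity
    return (np.eye(3) + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def anisotropic_metric(omega, lengthscales):
    """SPD metric with principal axes R and inverse squared length-scales."""
    R = rotation_from_axis_angle(omega)
    inv_sq = 1.0 / np.asarray(lengthscales, dtype=float) ** 2
    return R @ np.diag(inv_sq) @ R.T

def se_kernel(x, y, M):
    """Squared-exponential kernel under the metric M (assumed kernel family)."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-0.5 * d @ M @ d))

omega = np.array([0.0, 0.0, np.pi / 2])  # quarter turn about the z axis
R = rotation_from_axis_angle(omega)
M = anisotropic_metric(omega, lengthscales=[1.0, 2.0, 4.0])
print(np.linalg.eigvalsh(M))  # inverse squared length-scales, any omega
```

Any real 3-vector `omega` gives a valid rotation, so a sampler can move freely in Euclidean coordinates while `M` stays SPD.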

2605.11120 2026-05-13 cs.IT cs.SY eess.SP eess.SY math.IT stat.ML

Sensor Design for Accuracy-Bounded Estimation via Maximum-Entropy Likelihood Synthesis

Raktim Bhattacharya

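AI summary: This paper studies how to design sensing architectures for large-scale spatio-temporal systems that meet accuracy requirements when sensor models are uncertain or unavailable. The core approach is maximum-entropy likelihood synthesis: given an error budget, construct the measurement likelihood that satisfies the accuracy constraint while injecting the least information. The method applies to a range of discrepancy measures; for each, the authors derive the corresponding discrete particle-level optimization problem and solvers, and numerical experiments confirm that the accuracy constraints are reliably enforced across scenarios.

Details
English abstract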

Designing the sensing architecture for large-scale spatio-temporal systems is hard when accuracy requirements are specified but sensor models are uncertain or unavailable. Classical design treats sensor placement and estimation sequentially, requiring valid forward models for each sensing modality. This paper inverts the design flow: given an error budget, synthesize the measurement likelihood that enforces it while injecting minimal information beyond the dynamical prior. The likelihood is constructed by constrained optimization: among all posteriors satisfying a prescribed accuracy bound relative to a target, select the one minimizing Kullback-Leibler divergence from the prior. The solution is a maximum-entropy posterior in relative-entropy form, and the induced likelihood is the Radon-Nikodym derivative. The framework accommodates arbitrary discrepancies and is instantiated for Wasserstein distance, maximum mean discrepancy, $f$-divergences, moment constraints, and hybrid metrics. For each, we derive the discrete particle-level problem, analyze its convex or convex-relaxed structure, and present solvers with complexity scaling. A closed-form solution exists for the symmetric exponential-tilt case, and a distillation procedure converts nonparametric likelihood samples into parametric forms. A two-layer sensor design architecture embeds the synthesized likelihood in the recursive predict-update loop, connecting accuracy budgets to physical sensor placement, precision, and configuration. Numerical experiments comparing four metrics on unimodal and multimodal scenarios confirm the accuracy constraints are reliably enforced and reveal how metric choice determines the amount and spatial distribution of injected information.
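
For a single moment constraint on weighted particles, the KL-minimizing posterior is an exponential tilt of the prior, and the induced likelihood is the ratio of posterior to prior weights. The sketch below assumes a squared-error cost around a target and finds the tilt parameter by bisection; it illustrates the general construction, not the paper's solvers, and all names are hypothetical.

```python
import numpy as np

def maxent_posterior(prior_w, cost, budget, iters=200):
    """Minimize KL(q || prior) subject to sum_i q_i * cost_i <= budget.
    The solution has the form q_i proportional to prior_i * exp(-lam * cost_i)."""
    prior_w = np.asarray(prior_w, dtype=float)
    prior_w = prior_w / prior_w.sum()
    cost = np.asarray(cost, dtype=float)
    if budget <= cost.min():
        raise ValueError("budget is not attainable on these particles")

    def tilted(lam):
        logw = np.log(prior_w) - lam * cost
        q = np.exp(logw - logw.max())  # subtract max for numerical stability
        return q / q.sum()

    if prior_w @ cost <= budget:
        return prior_w, 0.0  # the prior already meets the budget
    lo, hi = 0.0, 1.0
    while tilted(hi) @ cost > budget:  # expected cost decreases in lam
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tilted(mid) @ cost > budget:
            lo = mid
        else:
            hi = mid
    return tilted(hi), hi

rng = np.random.default_rng(0)
particles = rng.normal(loc=0.0, scale=2.0, size=500)
prior_w = np.full(500, 1.0 / 500)
cost = (particles - 1.0) ** 2  # squared error around a target of 1.0
post_w, lam = maxent_posterior(prior_w, cost, budget=0.5)
likelihood = post_w / prior_w  # synthesized likelihood, up to a constant
print(lam, post_w @ cost)
```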

2605.11059 2026-05-13 stat.ML cs.LG math.PR

Uniform Scaling Limits in AdamW-Trained Transformers

William Gibson, Christoph Reisinger

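AI summary: This paper studies the large-depth limit of transformers trained with the AdamW optimizer by modelling the hidden-state dynamics as an interacting particle system coupled through the attention mechanism. Under appropriate scaling of the attention heads, the authors prove that the joint dynamics of the hidden states and backpropagated variables converge in $L^2$, uniformly over the initial condition, to the solution of a forward-backward system of ODEs at rate $\mathcal O(L^{-1}+L^{-1/3}H^{-1/2})$. The result gives theoretical support for understanding how transformers behave as depth grows, with bounds on the gap between the discrete and continuous models that are uniform over compact sets of initial conditions and whose constants do not depend on the number of tokens.

Details
English abstract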

We study the large-depth limit of transformers trained with AdamW, by modelling the hidden-state dynamics as an interacting particle system (IPS) coupled through the attention mechanism. Under appropriate scaling of the attention heads, we prove that the joint dynamics of the hidden states and backpropagated variables converge in $L^2$, uniformly over the initial condition, to the solution of a forward-backward system of ODEs at rate $\mathcal O(L^{-1}+L^{-1/3}H^{-1/2})$. Here, $L$ and $H$ denote the depth and number of heads of the transformer, respectively. The limiting system of ODEs can be identified with a McKean-Vlasov ODE (MVODE) when the attention heads do not incorporate causal masking. By using the flow maps associated with this MVODE and applying concentration of measure techniques, we obtain bounds on the difference between the discrete and continuous models that are uniform over compact sets of initial conditions. As this is achieved without resorting to a covering argument, the constants in our bounds are independent of the number of tokens. Furthermore, under a suitable adaptation to AdamW, the bounds become independent of the token embedding dimension.

2605.11044 2026-05-13 stat.ME stat.AP

Rethinking Factor Loading Thresholds: A Case for a Strict λ >= .70 Rule

M. Murat Yaslioglu, Duygu Toplu Yaslioglu

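AI summary: This paper questions the common practice of accepting standardized factor loadings as low as 0.50 in confirmatory factor analysis and argues for a stricter item-level threshold: retain only indicators with λ ≥ 0.70 (that is, λ² ≥ 0.50). The authors argue that indicators loading below 0.70 carry more error variance than explained variance, which undermines construct validity and the stability of factor solutions. Drawing on theoretical foundations, simulation evidence, and implications for structural equation modeling, the paper argues that weak loadings degrade measurement quality, factor score determinacy, and model fit, and that a minimum λ ≥ 0.70 rule improves the rigor and interpretability of latent variable models.

Details
English abstract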

This paper challenges the prevailing practice of accepting standardized factor loadings as low as .50 in confirmatory factor analysis. Drawing on the logic of Average Variance Extracted (AVE) and communality, the authors argue for a stricter item-level threshold: only indicators with loadings of λ >= .70 (implying λ² >= .50) should be retained in final measurement models. The rationale is that indicators with λ < .70 contain more error than explained variance, undermining both construct validity and the stability of factor solutions. The paper reviews theoretical foundations, simulation evidence, and implications for structural equation modeling, showing that weak loadings degrade measurement quality, factor score determinacy, and model fit. Adopting a minimum λ >= .70 rule aligns item-level standards with established construct-level criteria and enhances the rigor and interpretability of latent variable models.
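
The arithmetic behind the threshold is short enough to check directly. For a standardized indicator, the squared loading is the explained share of variance and one minus it is the error share; AVE is the mean squared loading across a construct's indicators. The sketch below uses hypothetical loadings. Note that .70 squared is .49, so the reading λ² >= .50 holds only up to rounding; the exact break-even loading is √0.5, about .7071.

```python
def indicator_variance_split(loading):
    """Return (explained, error) variance shares for a standardized loading."""
    explained = loading ** 2
    return explained, 1.0 - explained

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)

for lam in (0.50, 0.70, 0.7071, 0.80):
    explained, error = indicator_variance_split(lam)
    print(f"loading {lam:.4f}: explained {explained:.4f}, error {error:.4f}")

# Hypothetical three-indicator construct.
print(f"AVE = {average_variance_extracted([0.70, 0.80, 0.90]):.4f}")
```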

2605.10949 2026-05-13 stat.AP cs.AI cs.CV cs.LG

AlphaEarth Satellite Embeddings for Modelling Climate Sensitive Diseases Towards Global Health Resilience

Usman Nazir, I-Han Cheng, Sara Khalid

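AI summary: This study examines the potential of satellite remote-sensing data (AlphaEarth embeddings) for predicting climate-sensitive diseases, with the aim of improving global health resilience. Focusing on malaria, childhood acute respiratory infection, and stunting, it evaluates the predictive performance of 64-dimensional satellite embeddings across countries and regions. The embeddings add predictive value for malaria and respiratory infection, whereas stunting prediction is neutral at country level because of collinearity with fixed effects and needs finer-grained data. The work offers a new approach and empirical evidence for using remote sensing to support public health surveillance.

Details
Comments
Visualising Climate 2026
English abstract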

Malaria, childhood acute respiratory infection, and child undernutrition together account for over two million deaths annually in children under five, with the burden concentrated in low and middle-income countries where climate variability modulates transmission, exposure, and nutritional outcomes. Routine health surveillance in these settings remains sparse and reactive. Satellite-derived representations of the Earth's surface offer a scalable, low-cost complement to traditional covariates, yet their utility as predictors of population health outcomes is poorly characterised. We summarise findings from three studies evaluating AlphaEarth Foundations 64-dimensional satellite embeddings as predictors of population health outcomes, focusing on vulnerable populations. The studies span infectious disease (malaria, respiratory infection) and stunting. In each study, embeddings provide predictive value at sufficient spatial granularity: (i) malaria prediction across Nigeria shows consistent per-region R^2 gains; (ii) childhood acute respiratory infection prediction across 11 DHS countries increases pooled R^2 from 0.157 to 0.206 across three tree-based estimators; (iii) stunting prediction across 35 countries is neutral at country level due to collinearity with fixed effects. The stunting case is currently limited by lack of DHS cluster-level coordinates, which is the next key experiment.

2605.10395 2026-05-13 stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.IT cs.LG math.IT

Sharp feature-learning transitions and Bayes-optimal neural scaling laws in extensive-width networks

Minh-Toan Nguyen, Jean Barbier

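AI summary: This paper studies the information-theoretic limits of learning a one-hidden-layer teacher network with hierarchical features from noisy queries in the high-dimensional regime, in the context of knowledge transfer to a smaller student model. Using a heuristic leave-one-out decoupling argument, validated numerically, the authors derive sharp characterizations of the Bayes-optimal generalization error and feature overlaps, revealing that feature learnability is governed by a sequence of sharp phase transitions. They also introduce the notion of effective width, which unifies two distinct scaling regimes, and show empirically that a student trained with Adam near the effective width achieves the optimal scaling laws for the generalization error.

Details
English abstract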

We study the information-theoretic limits of learning a one-hidden-layer teacher network with hierarchical features from noisy queries, in the context of knowledge transfer to a smaller student model. We work in the high-dimensional regime where the teacher width $k$ scales linearly with the input dimension $d$, a setting that captures large-but-finite-width networks and has only recently become analytically tractable. Using a heuristic leave-one-out decoupling argument, validated numerically throughout, we derive asymptotically sharp characterizations of the Bayes-optimal generalization error and individual feature overlaps via a system of closed fixed-point equations. These equations reveal that feature learnability is governed by a sequence of sharp phase transitions: as data grows, teacher features become recoverable sequentially, each through a discontinuous jump in overlap. This sequential acquisition underlies a precise notion of effective width $k_c$, the number of learnable features at a given data budget $n$, which unifies two distinct scaling regimes: a feature-learning regime in which the Bayes-optimal generalization error $\varepsilon^{\rm BO}$ scales as $n^{1/(2β)-1}$, and a refinement regime in which it scales as $n^{-1}$, where $β>1/2$ is the exponent of the power-law feature hierarchy. Both laws collapse to the single relation $\varepsilon^{\rm BO}=Θ(k_c d/n)$. We further show empirically that a student trained with Adam near the effective width $k_c$ achieves these optimal scaling laws (up to a small algorithmic gap), and provide an information-theoretic account of the associated scaling in model size.

2605.09523 2026-05-13 cs.LG cs.CE cs.NA math.NA physics.comp-ph stat.ML

HS-FNO: History-Space Fourier Neural Operator for Non-Markovian Partial Differential Equations

Lennon J. Shikhman

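AI summary: This paper proposes HS-FNO, a History-Space Fourier Neural Operator for non-Markovian partial differential equations. The method works on the lifted state $u_t(θ,x)$, which carries the history needed to determine the system's future. HS-FNO decomposes the history-state update into a prediction of the newly exposed time slice and an exact shift of the already known portion, which reduces the learned output dimension and improves accuracy. Experiments show that HS-FNO outperforms the principal baselines on several benchmark families, with the largest error reduction in autoregressive prediction.

Details
Comments
15 pages, 4 figures, 1 table. Code at https://github.com/lennonshikhman/hs-fno/
English abstract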

Neural operators provide fast surrogate models for time-dependent partial differential equations, but their standard autoregressive use usually assumes that the instantaneous field $u(t,\cdot)$ is a complete state. This assumption fails for delay equations, distributed-memory systems, and other non-Markovian dynamics: two trajectories may agree at time $t$ and nevertheless have different futures because their histories differ. We introduce the History-Space Fourier Neural Operator (HS-FNO), a neural operator for delay and memory-driven PDEs formulated on the lifted state $u_t(θ,x)=u(t+θ,x)$, $θ\in[-τ,0]$. The key computational step is to decompose one history-state update into a learned predictor for the newly exposed future slice and an exact shift-append transport for the portion of the history window already known from the previous state. This avoids learning deterministic history coordinates, reduces the learned output dimension, and enforces the natural discrete history update. We test HS-FNO on five benchmark families covering delayed reaction-diffusion, spatial epidemiology, nonlocal neural-field dynamics, delayed waves, and distributed-memory closures. Across ten random seeds, HS-FNO attains the lowest aggregate one-step, history-space, and rollout errors among the principal baselines. The largest gain occurs in autoregressive prediction, where aggregate rollout error decreases from $0.241$, $0.188$, and $0.185$ for current-state, lag-stack, and unconstrained history-to-history operators, respectively, to $0.094$. The same model uses fewer parameters than unconstrained history prediction. These results indicate that enforcing the discrete shift structure of history-state evolution is an effective inductive bias for non-Markovian PDE surrogate modeling.
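
The shift-append update can be sketched independently of the neural operator. In the sketch below the history window is an array of time slices, oldest first, and `toy_predictor` is a hypothetical stand-in for the learned predictor (a simple delayed average); only the new slice is computed, and the rest of the window is copied exactly.

```python
import numpy as np

def shift_append(history, new_slice):
    """history has shape (m, nx), oldest slice first. Drop the oldest slice
    and append the newly predicted one; the kept slices are copied exactly."""
    return np.concatenate([history[1:], new_slice[None, :]], axis=0)

def rollout(history, predictor, steps):
    """Autoregressive rollout in history space."""
    states = [history]
    for _ in range(steps):
        new_slice = predictor(states[-1])  # only this slice is "learned"
        states.append(shift_append(states[-1], new_slice))
    return states

def toy_predictor(history):
    """Hypothetical delay rule: average of the newest and the oldest slice."""
    return 0.5 * (history[-1] + history[0])

h0 = np.arange(12, dtype=float).reshape(4, 3)  # 4 time slices, 3 grid points
traj = rollout(h0, toy_predictor, steps=2)
print(traj[1])
```

Because the overlap between consecutive windows is transported rather than predicted, the predictor's output is one slice instead of the whole window.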

2605.06873 2026-05-13 stat.ML cs.LG cs.NA math.NA

One Operator for Many Densities: Amortized Approximation of Conditioning by Neural Operators

Panos Tsimpos, Edoardo Calvello, Ayoub Belhadji, Nicholas H. Nelsen

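AI summary: This paper studies probabilistic conditioning, that is, determining the distribution of a random variable $X$ given a random variable $Y$. Traditional approaches learn the conditional distribution of a fixed joint distribution; this paper instead proposes identifying a single operator that maps any joint density to its conditional density, thereby amortizing over joint-conditional pairs. The authors prove that the conditioning operator can be approximated to arbitrary accuracy by neural operators and illustrate the framework on Gaussian mixtures, providing theoretical underpinnings for general-purpose conditioning methods.

Details
Comments
27 pages (10 main text, 14 appendix, and 3 references pages), 2 figures, 2 tables
English abstract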

Probabilistic conditioning is concerned with the identification of a distribution of a random variable $X$ given a random variable $Y$. It is a cornerstone of scientific and engineering applications where modeling uncertainty is key. This problem has traditionally been addressed in machine learning by directly learning the conditional distribution of a fixed joint distribution. This paper introduces a novel perspective: we propose to solve the conditioning problem by identifying a single operator that maps any joint density to its conditional, thus amortizing over joint-conditional pairs. We establish that the conditioning operator can be approximated to arbitrary accuracy by neural operators. Our proof relies on new results establishing continuity of the conditioning operator over suitable classes of densities. Finally, we learn the conditioning map for a class of Gaussian mixtures using neural operators, illustrating the promise of our framework. This work provides the theoretical underpinnings for general-purpose, amortized methods for probabilistic conditioning, such as foundation models for Bayesian inference.
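
The map being approximated is the exact conditioning operator, which on a grid is just a normalization along one axis. The sketch below is a discretized illustration of that target map, not of the neural operator; the correlated Gaussian joint and the function name `condition` are chosen here for the example.

```python
import numpy as np

def condition(joint, dx):
    """joint[i, j] approximates p(x_i, y_j) on a uniform grid.
    Returns cond[i, j] approximating p(x_i | y_j)."""
    marginal_y = joint.sum(axis=0) * dx  # Riemann sum over x
    return joint / marginal_y[None, :]

x = np.linspace(-5.0, 5.0, 201)
y = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, y, indexing="ij")

rho = 0.8  # correlation of a standard bivariate Gaussian (illustrative joint)
joint = np.exp(-(X**2 - 2 * rho * X * Y + Y**2) / (2 * (1 - rho**2)))
joint /= 2 * np.pi * np.sqrt(1 - rho**2)

cond = condition(joint, dx)
j = np.argmin(np.abs(y - 1.0))  # grid index closest to y = 1
cond_mean = (x * cond[:, j]).sum() * dx
print(cond_mean)  # for this Gaussian, E[X | Y = y] = rho * y
```

The same `condition` function applies unchanged to any joint density on the grid, which is the sense in which a single operator covers many densities.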

2605.05798 2026-05-13 stat.ME

Dual-Homotopy Framework for Constrained EM Algorithm

Jisoo Choi, Hee-Seok Oh

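AI summary: This paper proposes a new constrained EM algorithm for general constrained estimation problems. It is built on a dual-homotopy framework that combines deterministic annealing EM with barrier-based optimization, enabling more stable estimation under parameter constraints. The authors further introduce an adaptive constrained EM algorithm that preserves likelihood monotonicity regardless of the distributional form or constraint structure. Simulation studies and a real-data analysis show that the algorithm is more stable and accurate than existing methods, including the standard EM algorithm.

Details
English abstract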

We propose a new constrained EM algorithm that is applicable to general constrained estimation problems. The proposed method is based on a novel framework, the "dual-homotopy framework," which combines deterministic annealing EM with a barrier-based optimization, enabling stable estimation under parameter constraints. Building on this framework, we further introduce an adaptive constrained EM algorithm that preserves likelihood monotonicity, regardless of the underlying distributional form or the specific structure of the constraints. Through simulation studies and a real-data analysis, both under parameter constraints, we demonstrate that the proposed algorithm yields more stable and accurate estimates than existing methods, including the standard EM algorithm.

2605.04946 2026-05-13 cs.LG stat.ML

Training-Time Batch Normalization Reshapes Local Partition Geometry in Piecewise-Affine Networks

Xuan Qi, Yi Wei, Fanqi Yu, Furao Shen, Vittorio Murino, Cigdem Beyan

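AI summary: This paper studies the geometric effect of training-time batch normalization (BN) in piecewise-affine networks, showing how BN reshapes the local affine-region partition by recentring each neuron's reference hyperplane. Conditioned on a mini-batch, BN defines for each neuron a reference hyperplane through the batch centroid; the switching hyperplanes are parallel translates whose offsets are expressed in batch-standardized coordinates and are independent of the raw bias. Under explicit sufficient conditions this mechanism increases expected local partition refinement and transfers locally through depth, giving a function-level geometric account of BN during training.

Details
English abstract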

Batch normalization (BN) is central to modern deep networks, but its effect on the realized function during training remains less understood than its optimization benefits. We study training-time BN in continuous piecewise-affine (CPA) networks through the geometry of switching hyperplanes and the induced affine-region partition. Conditioned on a mini-batch, we show that BN defines for each neuron a reference hyperplane through the batch centroid, and that breakpoint-switching hyperplanes are parallel translates whose offsets are expressed in batch-standardized coordinates and are independent of the raw bias. This yields an exact criterion for when a switching hyperplane intersects a local $\ell_\infty$ window and motivates a local region-density functional based on exact affine-region counts. Under explicit sufficient conditions, we show that BN increases expected local partition refinement in ReLU and more general piecewise-affine networks, and that this mechanism transfers locally through depth inside parent affine regions where the upstream representation map is an affine embedding. These results provide a function-level geometric account of training-time BN as a batch-conditional recentering mechanism near the data.
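
The bias-independence and centroid statements can be checked numerically for a single neuron. The sketch assumes the usual training-time BN form $z = γ\,(a - μ)/σ + β$ with $a = w^\top x + b$ and a ReLU breakpoint at zero; the variable names are chosen here for illustration.

```python
import numpy as np

def bn_preactivation(X, w, b, gamma, beta):
    """Training-time BN of one neuron's pre-activation over a mini-batch."""
    a = X @ w + b
    mu, sigma = a.mean(), a.std()
    return gamma * (a - mu) / sigma + beta

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))  # one mini-batch of 32 inputs
w = rng.normal(size=5)
gamma, beta = 1.5, -0.3

# The raw bias cancels when the batch mean is subtracted.
z_b0 = bn_preactivation(X, w, 0.0, gamma, beta)
z_b7 = bn_preactivation(X, w, 7.0, gamma, beta)

# The batch centroid has standardized coordinate 0, so the reference
# hyperplane {x : w.x = mean(w.x_i)} passes through it.
a = X @ w
centroid_coord = (X.mean(axis=0) @ w - a.mean()) / a.std()

# The ReLU input crosses zero at standardized coordinate -beta / gamma,
# a parallel translate of the reference hyperplane.
switch_offset = -beta / gamma
print(np.allclose(z_b0, z_b7), centroid_coord, switch_offset)
```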

2605.03911 2026-05-13 stat.ME

The Multiplicative Quasi-Instrumental Variable Model

Jiewen Liu, Chan Park, David Richardson, Eric J. Tchetgen Tchetgen

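AI summary: This paper proposes the Multiplicative Quasi-Instrumental Variable (MQIV) model for causal inference with unmeasured confounding, using a quasi-instrument that may be imperfectly exogenous. The model allows the quasi-instrument to affect the outcome directly, not only through the treatment, and so relaxes the exclusion restriction of standard instrumental variables. Under a multiplicative treatment model, the authors establish nonparametric identification of the average treatment effect on the treated via a modified Wald ratio estimand that corrects the resulting bias, and they propose a class of multiply robust, semiparametric efficient estimators that accommodate treatment-effect heterogeneity and violations of the exclusion restriction.

Details
English abstract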

We introduce the Multiplicative Quasi-Instrumental Variable (MQIV) model, a framework for causal inference with unmeasured confounding that leverages an instrument that may be imperfectly exogenous. We allow the candidate quasi-instrument to have a direct effect on the outcome not mediated by the treatment, thus violating the standard IV exclusion restriction. We establish nonparametric identification of the population average treatment effect on the treated (ATT) under a treatment model that is multiplicative with respect to the quasi-IV and the hidden confounder (Hernan and Robins, 2006). Such a multiplicative treatment model may arise naturally either when treatment occurs only if two independent instrument-driven and confounder-driven causal mechanisms are present; or alternatively, when an instrument's effect on treatment uptake is inherently heterogeneous and scales with a person's latent propensity, best capturing settings in which it is challenging for a given instrument to overcome a person's inherent lack of preference for the treatment in view. Importantly, as we establish, the MQIV model is simultaneously agnostic to treatment-effect heterogeneity with respect to hidden confounders and violation of the core IV exclusion restriction condition. Identification is achieved via a modified Wald ratio estimand, which corrects the bias due to the exclusion restriction violation, and we propose a new class of estimators that are multiply robust and semiparametric efficient. Finally, we evaluate the approach in extensive simulations and an application to evaluate the causal effect of having three or more children on mothers' labor-market engagement.

2605.02453 2026-05-13 gr-qc astro-ph.HE cs.LG physics.data-an stat.ML

Testing General Relativity Through Gravitational Wave Classification: A Convolutional Neural Network Framework

Lavinia Heisenberg, Shayan Hemmatyar, Hector Villarrubia-Rojo

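AI summary: This paper presents a machine learning framework based on convolutional neural networks (CNNs) for testing general relativity (GR) with gravitational wave signals. Using the source parameters of 173 binary black hole merger events from the GWTC catalog, the authors generate simulated GR and beyond-GR (BGR) waveforms and introduce a response function formalism that quantifies how observables respond to modifications of GR. Using response functions as the CNN input improves classification sensitivity by a factor of about 33 compared with whitened waveforms, and at all deformation scales the CNN outperforms the best single-feature classifier.

Details
Comments
36 pages, 20 figures, 4 tables. Comments welcome!
English abstract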

We present a machine learning framework for testing general relativity (GR) with gravitational wave signals from binary black hole mergers. Using the source parameters of 173 BBH events from the GWTC catalog as a realistic astrophysical population, we generate simulated GR waveforms and construct beyond GR (BGR) waveforms by applying controlled phase deformations. We introduce a response function formalism that provides a systematic framework for quantifying how any observable responds to modifications of GR. We train convolutional neural networks (CNNs) on two input representations: whitened waveforms and a response function type observable derived from the waveform mismatch, which isolates the effect of phase deviations from the bulk signal. Using response functions as the CNN input improves the classification sensitivity by a factor of approximately 33 compared to whitened waveforms, demonstrating that the choice of observable representation is as important as the classifier architecture. We study the fundamental limits of this classification through Bayes optimal error analysis, averaging methods that reveal coherent patterns hidden in noise, and a comparison between CNN accuracy and a single feature classifier as a proxy for human performance. At all deformation scales, the CNN outperforms the best single feature approach. We extend the framework to physically motivated theories using the parameterized post Einsteinian (ppE) formalism and apply it to massive gravity, where the classifier detects deviations for graviton masses of order $m_g \sim 10^{-23}\;\mathrm{eV}/c^2$ with aLIGO design sensitivity.

2604.24196 2026-05-13 stat.ML cs.LG

Identifiability and Stability of Generative Drifting with Companion-Elliptic Kernel Families

HakGeun Lee, Hyonho Chun

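AI summary: This paper studies the identifiability and stability of drifting fields in generative modeling, asking whether a zero-drift equilibrium uniquely identifies the target distribution and whether approximate zero drift implies weak convergence of distributions. Because the original drifting model uses the Laplace kernel, for which standard Gaussian score-based arguments fail, the authors introduce companion-elliptic kernel families, a class that contains the Laplace kernel and consists precisely of Gaussian and Matérn kernels with smoothness parameter $ν\ge 1/2$. Within this class they prove field identifiability for arbitrary Borel probability measures on $\mathbb{R}^d$, show that field convergence alone does not guarantee weak convergence, and show that even without tightness a single scalar $C_0$-observable suffices to recover weak convergence.

Details
Comments
28 pages, 1 figure
English abstract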

This paper studies the identifiability and stability of drifting fields within the framework of Generative Modeling via Drifting. The motivating question is whether a zero-drift equilibrium identifies the target distribution, and whether an approximate zero drift implies weak distributional convergence. Since the original drifting model employs the Laplace kernel by default, we first analyze why standard Gaussian score-based arguments fail to apply. This analysis motivates the introduction of companion-elliptic kernel families, which are characterized by a companion potential satisfying an elliptic closure relation. We show that this class naturally contains the Laplace kernel and consists precisely of Gaussian and Matérn kernels with smoothness parameter $ν\ge 1/2$. Within this class, we establish field identifiability for arbitrary Borel probability measures on $\mathbb{R}^d$: if the drifting field vanishes identically, then the two measures must coincide. As for stability, we demonstrate that field convergence alone does not guarantee weak convergence, since mass may escape to infinity while remaining invisible to the field. Although tightness of the sequence directly removes this obstruction and restores weak stability, we prove that, even without tightness, every $C_0$-vague cluster point lies exactly on the defect ray $\{cp:0\le c\le1\}$. Consequently, a single scalar $C_0$-observable suffices to detect the missing mass and recover weak convergence.

2604.16684 2026-05-13 cs.LG stat.ML

DARLING: Detection Augmented Reinforcement Learning with Non-Stationary Guarantees

Argyrios Gerogiannis, Yu-Han Huang, Venugopal V. Veeravalli

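AI summary: This paper studies model-free reinforcement learning in non-stationary finite-horizon episodic Markov decision processes (MDPs) without prior knowledge of the non-stationarity. For the piecewise stationary (PS) setting, where rewards and transition dynamics change at unknown times, it proposes DARLING, a modular wrapper that applies to both tabular and linear MDPs and does not need to know the change points. In theory, DARLING improves the best known dynamic regret bounds, and in experiments on a variety of non-stationary benchmarks it outperforms existing methods.

Details
Comments
50 pages, 8 figures
English abstract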

We study model-free reinforcement learning (RL) in non-stationary finite-horizon episodic Markov decision processes (MDPs) without prior knowledge of the non-stationarity. We focus on the piecewise stationary (PS) setting, where both rewards and transition dynamics can change at unknown times. We first revisit existing state-of-the-art approaches and identify theoretical and practical limitations that change the current landscape of performance guarantees. To characterize the difficulty of the problem, we establish the first minimax lower bounds for PS-RL in tabular and linear MDPs. We then introduce Detection Augmented Reinforcement Learning (DARLING), a modular wrapper for PS-RL that applies to both tabular and linear MDPs, without knowledge of the changes. In tabular MDPs, under change-point separability and reachability conditions, DARLING improves the best known dynamic regret bounds and matches our minimax lower bound. In linear MDPs, DARLING matches the minimax lower bound when the relevant reachability parameters are known, and our analysis clarifies the structural obstacles that distinguish this setting from the tabular case. Finally, through extensive experimentation across diverse non-stationary benchmarks, we show that DARLING consistently surpasses the state-of-the-art methods.

2603.24913 2026-05-13 math.OC math.DG math.DS math.PR math.ST stat.TH

Geometry-Aware Langevin Sampling for Matrix-Valued Graph Learning

Papri Dey

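AI summary: This paper studies Bayesian inference over positive semidefinite matrix-valued parameters, with applications to graph learning and covariance estimation. To address the poor mixing of Euclidean proposals near the cone boundary, the authors propose ConeMALA, a geometry-aware Metropolis-adjusted Langevin algorithm whose proposal geometry is induced by the model's log-determinant structure. Experiments show better sampling efficiency and stability than conventional samplers, offering a practical route to uncertainty quantification in PSD-constrained graph learning.

Details
English abstract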

Bayesian inference over positive semidefinite (PSD) matrix-valued parameters arises in structured covariance estimation, graph-Laplacian precision models, and multi-output graph learning, but Euclidean proposals often mix poorly near the cone boundary. We propose ConeMALA, a geometry-aware Metropolis-adjusted Langevin algorithm whose proposal geometry is induced by the model's log-determinant structure. For a PSD-weighted graph with edge kernels $W_e\succeq 0$, block Laplacian $L(W)$, and stabilizer $R\succ 0$, the lifted precision matrix $X(W)=L(W)+R\in \mathbb S_{++}^{md}$ defines the log-determinant energy $Φ(W)=-\log\det X(W)$. We show that the Hessian of $Φ$ is the pullback of the affine-invariant SPD metric under the map $W\mapsto X(W)$, yielding explicit intrinsic Langevin proposals with Metropolis-Hastings correction using the closed-form SPD exponential-map Jacobian. We validate the metric on rank-one PSD edge perturbations for $d=5$, obtaining essentially exact agreement between analytic curvature scores and finite-difference curvatures. In intrinsic SPD posterior and matrix-valued graph Gaussian experiments, ConeMALA achieves stable multichain diagnostics and substantially higher ESS/sec than Euclidean MALA and generic RMALA, while a PDHMC-like finite-difference baseline is accurate but computationally prohibitive at larger graph sizes. These results show that pullback log-determinant geometry provides a practical route to uncertainty quantification in PSD-constrained graph learning.
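
The link between the log-determinant energy and the affine-invariant metric can be checked at the level of $X$ itself: the second directional derivative of $-\log\det X$ along a symmetric direction $H$ equals $\mathrm{tr}(X^{-1}HX^{-1}H)$. The sketch below compares that expression with a central finite difference on a random SPD matrix; it does not build the block Laplacian $L(W)$, and the test matrices are arbitrary.

```python
import numpy as np

def logdet_energy(X):
    """Phi(X) = -log det X for a symmetric positive definite X."""
    _, logdet = np.linalg.slogdet(X)
    return -logdet

def affine_invariant_norm_sq(X, H):
    """Squared norm of H at X under the affine-invariant SPD metric."""
    Xinv_H = np.linalg.solve(X, H)  # X^{-1} H without forming the inverse
    return float(np.trace(Xinv_H @ Xinv_H))

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
X = A @ A.T + 4.0 * np.eye(4)  # well-conditioned SPD matrix
B = rng.normal(size=(4, 4))
H = 0.5 * (B + B.T)  # symmetric perturbation direction

t = 1e-3
finite_diff = (logdet_energy(X + t * H) - 2.0 * logdet_energy(X)
               + logdet_energy(X - t * H)) / t**2
analytic = affine_invariant_norm_sq(X, H)
print(finite_diff, analytic)
```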

2603.23374 2026-05-13 stat.ME stat.ML

Shape-Adaptive Conditional Calibration for Conformal Prediction via Minimax Optimization

Yajie Bao, Chuchen Zhang, Zhaojun Wang, Haojie Ren, Changliang Zou

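AI summary Achieving valid conditional coverage in conformal prediction is challenging because satisfying pointwise constraints in finite samples is theoretically difficult. This paper proposes MOPI, a predictive inference framework based on minimax optimization that optimizes over a flexible class of set-valued mappings during calibration instead of calibrating only a fixed sublevel set, improving shape adaptivity while retaining a theoretical connection to minimizing the mean squared coverage error. The method comes with non-asymptotic oracle inequalities and a proof that the convergence rate of the coverage error attains the optimal order under regular conditions; it also supports valid inference conditional on sensitive attributes that are available at calibration but unobserved at test time. Experiments show that MOPI produces more efficient prediction sets than existing methods on complex, non-standard conditional distributions.

Details
English abstract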

Achieving valid conditional coverage in conformal prediction is challenging due to the theoretical difficulty of satisfying pointwise constraints in finite samples. Building upon the characterization of conditional coverage through marginal moment restrictions, we introduce Minimax Optimization Predictive Inference (MOPI), a framework that generalizes prior work by optimizing over a flexible class of set-valued mappings during the calibration phase, rather than simply calibrating a fixed sublevel set. This minimax formulation effectively circumvents the structural constraints of predefined score functions, achieving superior shape adaptivity while maintaining a principled connection to the minimization of mean squared coverage error. Theoretically, we provide non-asymptotic oracle inequalities and show that the convergence rate of the coverage error attains the optimal order under regular conditions. The MOPI also enables valid inference conditional on sensitive attributes that are available during calibration but unobserved at test time. Empirical results on complex, non-standard conditional distributions demonstrate that MOPI produces more efficient prediction sets than existing baselines.

2602.13004 2026-05-13 cs.LG stat.ML

Towards Uncertainty-Aware Federated Granger Causal Learning

Ayush Mohanty, Nazal Mohamed, Nagi Gebraeel

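AI summary This work addresses the lack of uncertainty awareness in federated Granger causal learning and proposes a method that quantifies the uncertainty of cross-client causal relationships. By analyzing how uncertainty propagates through the federated framework, the authors derive closed-form covariance recursions between clients and server and establish spectral-radius-based convergence conditions, yielding analytic expressions for the steady-state variances. Experiments show that the method separates genuine cross-client causal relationships from spurious edges and outperforms existing federated causal structure learning methods.

Details
Comments
Manuscript under review
English abstract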

Granger causality recovers directed interactions from time-series data, but in many distributed systems, the data are vertically partitioned across clients, with each client observing only the variables of its own subsystem. Federated Granger causality (FedGC) recovers cross-client interactions without sharing raw data. Existing FedGC methods, however, return deterministic point estimates with no calibrated measure of uncertainty, leaving operators without a principled basis for identifying reliable cross-client interactions. We address this limitation by characterizing how uncertainty propagates through the FedGC framework. We derive closed-form covariance recursions for the cross-covariances induced by the coupled client-server feedback loop, and establish spectral-radius-based convergence conditions yielding closed-form expressions for the steady-state variances at both the client and server. Under mild stability conditions, we prove that the steady-state uncertainty depends only on client data statistics (aleatoric) and is independent of the priors placed on the model parameters (epistemic). Building on this asymptotic characterization, we construct a post-training hypothesis testing procedure that separates genuine cross-client interactions from spurious edges. Experiments on synthetic and real-world datasets show that the predicted uncertainty propagation matches the theory across multiple operating regimes, while consistently outperforming the state-of-the-art federated causal structure learning baselines.

2602.04611 2026-05-13 stat.ML cs.LG

Targeted Synthetic Control Method

Yuxin Wang, Dennis Frauen, Emil Javurek, Konstantin Hess, Yuchen Ma, Stefan Feuerriegel

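AI summary This paper proposes the targeted synthetic control (TSC) method, a new synthetic control approach for estimating the causal effect on a single treated unit more accurately. TSC uses a two-stage estimation strategy in which a targeted update refines the initial weights to improve stability, and it ensures that the synthetic control outcome is a convex combination of observed control outcomes, which aids interpretability. The method is compatible with arbitrary machine learning models, avoids the unbounded counterfactual estimates that existing methods can produce, and achieves better estimation accuracy in synthetic and real-world experiments.

Details
English abstract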

The synthetic control method (SCM) estimates causal effects in panel data with a single treated unit by constructing a counterfactual outcome as a weighted combination of untreated control units that matches the pre-treatment trajectory. In this paper, we introduce the targeted synthetic control (TSC) method, a new two-stage estimator that directly estimates the counterfactual outcome. Specifically, our TSC method (1) yields a targeted debiasing estimator, in the sense that the targeted updating refines the initial weights to produce more stable weights; and (2) ensures that the final counterfactual estimation is a convex combination of observed control outcomes to enable direct interpretation of the synthetic control weights. TSC is flexible and can be instantiated with arbitrary machine learning models. Methodologically, TSC starts from an initial set of synthetic-control weights and refines them via a one-dimensional targeted update through the weight-tilting submodel, which calibrates the weights to reduce the bias in weight estimation arising from pre-treatment fit. Furthermore, TSC avoids key shortcomings of existing methods (e.g., the augmented SCM), which can produce unbounded counterfactual estimates. Across extensive synthetic and real-world experiments, TSC consistently improves estimation accuracy over state-of-the-art SCM baselines.

2602.04042 2026-05-13 cs.LG stat.ME stat.ML

Partition Tree: Conditional Density Estimation over General Outcome Spaces

Felipe Angelim, Alessandro Leite

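AI summary This paper proposes Partition Tree, a new tree-based framework for conditional density estimation over general outcome spaces that handles continuous and categorical variables in a unified way. The method models conditional distributions as piecewise-constant densities on data-adaptive partitions and learns the tree by directly minimizing the conditional negative log-likelihood, giving a scalable nonparametric alternative that requires no parametric assumptions. The paper also introduces Partition Forest, a bagging extension of Partition Tree obtained by averaging conditional densities, and reports improved probabilistic prediction over CART-style trees and performance competitive with state-of-the-art methods.

Details
Comments
Code available at https://github.com/felipeangelimvieira/partition_tree
English abstract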

We propose Partition Tree, a novel tree-based framework for conditional density estimation over general outcome spaces that supports both continuous and categorical variables within a unified formulation. Our approach models conditional distributions as piecewise-constant densities on data-adaptive partitions and learns trees by directly minimizing conditional negative log-likelihood. This yields a scalable, nonparametric alternative to existing probabilistic trees that does not make parametric assumptions about the target distribution. We further introduce Partition Forest, a bagging extension obtained by averaging conditional densities. Empirically, we demonstrate improved probabilistic prediction over CART-style trees and competitive performance compared to state-of-the-art probabilistic tree methods and Random Forests.
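To make "piecewise-constant densities" and "minimizing negative log-likelihood" concrete, here is a minimal sketch for a one-dimensional continuous outcome inside a single covariate cell. It is not the authors' implementation (see their repository for that); the function names and the toy data are hypothetical, and the cell density is simply count / (n * width).

```python
import math

def piecewise_constant_density(samples, edges):
    """Density per cell [edges[i], edges[i+1]): count / (n * width)."""
    n = len(samples)
    densities = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        count = sum(1 for y in samples if lo <= y < hi)
        densities.append(count / (n * (hi - lo)))
    return densities

def mean_negative_log_likelihood(samples, edges, densities):
    """Average of -log f(y) over the samples under the piecewise-constant f."""
    total = 0.0
    for y in samples:
        for lo, hi, d in zip(edges[:-1], edges[1:], densities):
            if lo <= y < hi:
                total -= math.log(d)
                break
    return total / len(samples)

# Toy outcomes concentrated near zero, on the interval [0, 3).
samples = [0.1, 0.2, 0.3, 0.4, 1.5, 2.5]
coarse_edges = [0.0, 3.0]        # one cell: uniform density
fine_edges = [0.0, 0.5, 3.0]     # a split that isolates the dense region

coarse_density = piecewise_constant_density(samples, coarse_edges)
fine_density = piecewise_constant_density(samples, fine_edges)
nll_coarse = mean_negative_log_likelihood(samples, coarse_edges, coarse_density)
nll_fine = mean_negative_log_likelihood(samples, fine_edges, fine_density)
```

On this toy data the split lowers the negative log-likelihood, which is the kind of gain a likelihood-driven splitting rule would look for.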

2602.01682 2026-05-13 cs.LG cs.DS stat.ML

Finite and Corruption-Robust Regret Bounds in Online Inverse Linear Optimization under M-Convex Action Sets

Taihei Oki, Shinsaku Sakaue

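AI summary This paper studies online inverse linear optimization, in which a learner infers a hidden objective vector from optimal actions observed over time-varying feasible sets and recommends actions consistent with that objective. It asks whether a finite regret bound polynomial in the dimension is achievable when the feasible sets are M-convex (a class that includes matroids). By combining structural properties of optimal solutions on M-convex sets with a geometric volume argument, the authors prove a regret bound of $O(d\log d)$, partially resolving the open question, and extend the result to adversarially corrupted feedback with a bound of $O((C+1)d\log d)$ that requires no prior knowledge of $C$.

Details
English abstract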

We study online inverse linear optimization, also known as contextual recommendation, where a learner sequentially infers an agent's hidden objective vector from observed optimal actions over feasible sets that change over time. The learner aims to recommend actions that perform well under the agent's true objective, and the performance is measured by the regret, defined as the cumulative gap between the agent's optimal values and those achieved by the learner's recommended actions. Prior work has established a regret bound of $O(d\log T)$, as well as a finite but exponentially large bound of $\exp(O(d\log d))$, where $d$ is the dimension of the optimization problem and $T$ is the time horizon, while a regret lower bound of $Ω(d)$ is known (Gollapudi et al. 2021; Sakaue et al. 2025). Whether a finite regret bound polynomial in $d$ is achievable or not has remained an open question. We partially resolve this by showing that when the feasible sets are M-convex -- a broad class that includes matroids -- a finite regret bound of $O(d\log d)$ is possible. We achieve this by combining a structural characterization of optimal solutions on M-convex sets with a geometric volume argument. Moreover, we extend our approach to adversarially corrupted feedback in up to $C$ rounds. We obtain a regret bound of $O((C+1)d\log d)$ without prior knowledge of $C$, by monitoring directed graphs induced by the observed feedback to detect corruptions adaptively.

2512.24768 2026-05-13 stat.ML cs.LG

Sparse Offline Reinforcement Learning with Corruption Robustness

Nam Phuong Tran, Andi Nika, Goran Radanovic, Long Tran-Thanh, Debmalya Mandal

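AI summary This paper studies robustness to strong data corruption in offline sparse reinforcement learning for high-dimensional sparse Markov decision processes. With an adversary arbitrarily perturbing a fraction of the trajectories, the authors propose sparsity-aware actor-critic methods that avoid the overly pessimistic bonuses of conventional approaches and give the first non-vacuous guarantees under single-policy concentrability coverage. The method remains robust under contamination, showing that learning a near-optimal policy is still possible in high-dimensional sparse settings.

Details
English abstract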

We investigate robustness to strong data corruption in offline sparse reinforcement learning (RL). In our setting, an adversary may arbitrarily perturb a fraction of the collected trajectories from a high-dimensional but sparse Markov decision process, and our goal is to estimate a near optimal policy. The main challenge is that, in the high-dimensional regime where the number of samples $N$ is smaller than the feature dimension $d$, exploiting sparsity is essential for obtaining non-vacuous guarantees but has not been systematically studied in offline RL. We analyse the problem under uniform coverage and sparse single-concentrability assumptions. While Least Square Value Iteration (LSVI), a standard approach for robust offline RL, performs well under uniform coverage, we show that integrating sparsity into LSVI is unnatural, and its analysis may break down due to overly pessimistic bonuses. To overcome this, we propose actor-critic methods with sparse robust estimator oracles, which avoid the use of pointwise pessimistic bonuses and provide the first non-vacuous guarantees for sparse offline RL under single-policy concentrability coverage. Moreover, we extend our results to the contaminated setting and show that our algorithm remains robust under strong contamination. Our results provide the first non-vacuous guarantees in high-dimensional sparse MDPs with single-policy concentrability coverage and corruption, showing that learning a near-optimal policy remains possible in regimes where traditional robust offline RL techniques may fail.

2511.17038 2026-05-13 cs.AI eess.IV stat.ML

DAPS++: Rethinking Diffusion Inverse Problems with Decoupled Posterior Annealing

Hao Chen, Renzheng Zhang, Scott S. Howard

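AI summary This paper proposes DAPS++, a new diffusion-based solver for inverse problems, motivated by the observation that the diffusion prior offers limited guidance in existing diffusion solvers. The method fully decouples diffusion-based initialization from likelihood-driven refinement, so that reconstruction is guided more directly by measurement consistency while remaining numerically stable. Experiments show that DAPS++ achieves high computational efficiency and robust image restoration with fewer function evaluations and optimization steps.

Details
English abstract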

From a Bayesian perspective, score-based diffusion solves inverse problems through joint inference, embedding the likelihood with the prior to guide the sampling process. However, this formulation fails to explain its practical behavior: the prior offers limited guidance, while reconstruction is largely driven by the measurement-consistency term, leading to an inference process that is effectively decoupled from the diffusion dynamics. We show that the diffusion prior in these solvers functions primarily as a warm initializer that places estimates near the data manifold, while reconstruction is driven almost entirely by measurement consistency. Based on this observation, we introduce DAPS++, which fully decouples diffusion-based initialization from likelihood-driven refinement, allowing the likelihood term to guide inference more directly while maintaining numerical stability and providing insight into why unified diffusion trajectories remain effective in practice. By requiring a smaller number of function evaluations (NFEs) and fewer measurement-optimization steps, DAPS++ achieves high computational efficiency and robust reconstruction performance across diverse image restoration tasks.

2510.20191 2026-05-13 stat.ME

Bias-Variance Tradeoff of Matching Prior to Difference-in-Differences When Parallel Trends is Violated

Mingxuan Ge, Dae Woong Ham

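AI summary This paper studies the bias-variance tradeoff of matching before applying difference-in-differences (DiD) when the parallel trends assumption is violated. Under a linear structural model with unobserved time-varying confounders, the authors analyze how matching affects bias and mean squared error (MSE): additionally matching treated and control units on pre-treatment outcomes is always beneficial, whereas matching only on observed covariates is not always preferable to unmatched DiD because of a sample size tradeoff. They therefore recommend MSE as an additional evaluation metric that weighs bias and variance together, and give theoretically grounded, practical guidance on which variables to match on.

Details
English abstract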

Quasi-experimental causal inference methods have become central in empirical operations management for guiding managerial decisions. Among these, empiricists utilize the Difference-in-Differences (DiD) estimator, which relies on the parallel trends assumption. To improve its plausibility, researchers often match treated and control units before applying DiD, with the intuition that matched groups are more likely to evolve similarly absent treatment. Existing work that analyzes this practice, however, has focused solely on bias. In this work, we not only generalize earlier bias results under weaker assumptions but also analyze properties of variance and mean squared error (MSE), a practically relevant metric for decision making. Under a linear structural model with unobserved time-varying confounders, we show that variance results contrast with established bias insights: matching on observed covariates prior to DiD is not always recommended over the classic (unmatched) DiD due to a sample size tradeoff; furthermore, matching additionally on pre-treatment outcomes is always beneficial, as such a tradeoff no longer exists once matching is performed. We therefore advocate MSE as an additional metric if applied researchers weigh bias and variance equally and further give practitioner-friendly guidelines with theoretical guarantees on when and on what variables they should match. As an illustration, we apply these guidelines to re-examine a recent empirical study that matches prior to DiD to study how the introduction of monetary incentives by a knowledge-sharing platform affects general engagement on the platform. Our results show that the authors' decision was both warranted and critical to produce a credible causal estimate.
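For readers unfamiliar with the estimator under discussion, the classic (unmatched) two-group, two-period DiD can be sketched as below. The function name and the toy numbers are hypothetical and are not taken from the paper; matching would only change which control units enter the control averages.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 DiD: change in the treated mean minus change in the control mean."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Toy outcomes: treated units rise by 5 on average, control units by 2.
estimate = did_estimate(
    treated_pre=[10.0, 12.0],
    treated_post=[15.0, 17.0],
    control_pre=[8.0, 10.0],
    control_post=[10.0, 12.0],
)
```

Under parallel trends the control change (2 here) stands in for what the treated group would have experienced without treatment, so the remaining 3 is attributed to the treatment.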

2510.18071 2026-05-13 stat.ML cs.LG stat.ME

Arbitrated Indirect Treatment Comparisons

Yixin Fang, Weili He

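AI summary This paper proposes arbitrated indirect treatment comparisons, a new class of indirect treatment comparison methods designed to address the "MAIC paradox" that arises with MAIC. The paradox refers to different sponsors reaching conflicting conclusions about treatment effectiveness from the same data, because each implicitly targets a different population. The new methods estimate treatment effects in a common overlap population, removing the inconsistency caused by differing target populations.

Details
English abstract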

Matching-adjusted indirect comparison (MAIC) has been increasingly employed in health technology assessments (HTA). By reweighting subjects from a trial with individual participant data (IPD) to match the covariate summary statistics of another trial with only aggregate data (AgD), MAIC facilitates the estimation of a treatment effect defined with respect to the AgD trial population. This manuscript introduces a new class of methods, termed arbitrated indirect treatment comparisons, designed to address the "MAIC paradox" -- a phenomenon highlighted by Jiang et al. (2025). The MAIC paradox arises when different sponsors, analyzing the same data, reach conflicting conclusions regarding which treatment is more effective. The underlying issue is that each sponsor implicitly targets a different population. To resolve this inconsistency, the proposed methods focus on estimating treatment effects in a common target population, specifically chosen to be the overlap population.

2510.16127 2026-05-13 stat.ML cs.LG

Learning density ratios in causal inference using Bregman-Riesz regression

Oliver J. Hines, Caleb H. Miles

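AI summary This paper proposes Bregman-Riesz regression, a unified framework for density ratio estimation in causal inference. It unifies three established approaches based on Bregman divergences, probabilistic classification, and Riesz regression. The paper also examines how data augmentation can be applied to causal problems, analyzes through simulations how the choice of Bregman divergence and augmentation strategy affects performance, and provides a Python package for practical use.

Details
Comments
Replication code is available from https://github.com/CI-NYC/densityratios
English abstract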

The ratio of two probability density functions is a fundamental quantity that appears in many areas of statistics and machine learning, including causal inference, reinforcement learning, covariate shift, outlier detection, independence testing, importance sampling, and diffusion modeling. Naively estimating the numerator and denominator densities separately using, e.g., kernel density estimators, can lead to unstable performance and suffer from the curse of dimensionality as the number of covariates increases. For this reason, several methods have been developed for estimating the density ratio directly based on (a) Bregman divergences or (b) recasting the density ratio as the odds in a probabilistic classification model that predicts whether an observation is sampled from the numerator or denominator distribution. Additionally, the density ratio can be viewed as the Riesz representer of a continuous linear map, making it amenable to estimation via (c) minimization of the so-called Riesz loss, which was developed to learn the Riesz representer in the Riesz regression procedure in causal inference. In this paper we show that all three of these methods can be unified in a common framework, which we call Bregman--Riesz regression. We further show how data augmentation techniques can be used to apply density ratio learning methods to causal problems, where the numerator distribution typically represents an unobserved intervention. We show through simulations how the choice of Bregman divergence and data augmentation strategy can affect the performance of the resulting density ratio learner. A Python package is provided for researchers to apply Bregman--Riesz regression in practice using gradient boosting, neural networks, and kernel methods.

2510.14285 2026-05-13 econ.EM math.ST stat.TH

Debiased Kernel Estimation of Spot Volatility in the Presence of Infinite Variation Jumps

B. Cooper Boniece, José E. Figueroa-López, Tianwei Zhou

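AI summary This paper studies debiased kernel estimation of spot volatility in the presence of jumps of infinite variation. The authors propose truncated kernel estimators and debiased variants that extend rate-optimal spot volatility estimation to a wider range of jump activity indices, and establish the corresponding central limit theorems. Compared with earlier methods, the approach attains smaller asymptotic variances by using more general kernels and an optimal bandwidth convergence rate, and it applies under more flexible model assumptions. Simulations show that it outperforms existing methods in finite samples.

Details
Comments
54 pages
English abstract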

Volatility estimation is a central problem in financial econometrics, but becomes particularly challenging when jump activity is high, a phenomenon observed empirically in highly traded financial securities. In this paper, we revisit the problem of spot volatility estimation for an Itô semimartingale with jumps of unbounded variation. We construct truncated kernel-based estimators and debiased variants that extend rate-optimal spot volatility estimation to a wider range of jump activity indices, from the previously available bound $Y<4/3$ to $Y<20/11$. Rate-suboptimal CLTs are also established for $Y>20/11$. Compared with earlier work, our approach achieves smaller asymptotic variances through the use of more general kernels and an optimal choice for the bandwidth convergence rate, and also has broader applicability under more flexible model assumptions. A comprehensive simulation study confirms that our procedures outperform competing methods in finite samples.

2510.09401 2026-05-13 stat.ME stat.CO

Uncertainty Quantification for Multi-level Models Using the Survey-Weighted Pseudo-Posterior

Matthew R. Williams, F. Hunter McGuire, Terrance D. Savitsky

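AI summary This paper studies uncertainty quantification for multi-level models fitted to complex survey samples, focusing on the bias and variance-estimation problems caused by unbalanced samples and differences in effective sample size. The authors propose modifications to an automated post-processing method that adjusts the uncertainty estimates of both local and global parameters, and validate them with a simulation study and an example from the National Survey on Drug Use and Health. The method yields more accurate interval estimates for multi-level model parameters under complex sampling designs.

Details
Comments
25 pages, 3 tables, 12 figures. arXiv admin note: text overlap with arXiv:2308.06845
English abstract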

Parameter estimation and inference from complex survey samples typically focus on global model parameters whose estimators have asymptotic properties, such as from fixed effects regression models. The central challenge is to both mitigate bias induced from potentially unbalanced samples and to incorporate adjustments for differences in effective sample size to get correct variance and interval estimates. We present a motivating example of Bayesian inference for a multi-level or mixed effects model in which estimates of both the local parameters (e.g., group-level random effects) and the global parameters need to be adjusted for the complex sampling design. We evaluate the limitations of the survey-weighted pseudo-posterior and an existing automated post-processing method to improve the uncertainty quantification. We propose modifications to the automated process and demonstrate their improvements for multi-level models via a simulation study and a motivating example from the National Survey on Drug Use and Health. Reproduction examples are available from the authors and the updated R package is available on GitHub: https://github.com/RyanHornby/csSampling

2510.04265 2026-05-13 cs.AI cs.CL math.ST stat.ML stat.TH

Don't Pass@k: A Bayesian Framework for Large Language Model Evaluation

Mohsen Hariri, Amirhossein Samandar, Michael Hinczewski, Vipin Chaudhary

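AI summary This paper proposes a Bayesian framework for evaluating large language models, aimed at the unstable and potentially misleading rankings that Pass@k produces when the number of samples is limited. The method estimates a model's underlying success probability together with credible intervals, giving more stable and statistically meaningful rankings and supporting flexible weighting of rubric categories. Experiments show faster convergence and greater rank stability than Pass@k, a clear separation between statistically meaningful differences and noise, and applicability to both binary and non-binary evaluation.

Details
Journal ref
The Fourteenth International Conference on Learning Representations (ICLR), 2026
Comments
OpenReview (ICLR 2026): https://openreview.net/forum?id=PTXi3Ef4sT
English abstract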

Pass$@k$ is widely used to report the reasoning performance of LLMs, but it often produces unstable and potentially misleading rankings, especially when the number of trials (samples) is limited and computational resources are constrained. We present a principled Bayesian evaluation framework that replaces Pass$@k$ and average accuracy over $N$ trials (avg$@N$) with posterior estimates of a model's underlying success probability and credible intervals, yielding stable rankings and a transparent decision rule for differences. Evaluation outcomes are modeled as categorical (not just 0/1) with a Dirichlet prior, giving closed-form expressions for the posterior mean and uncertainty of any weighted rubric and enabling the use of prior evidence when appropriate. Theoretically, under a uniform prior, the Bayesian posterior mean is order-equivalent to average accuracy (Pass$@1$), explaining its empirical robustness while adding principled uncertainty. Empirically, in simulations with known ground-truth success rates and on AIME'24/'25, HMMT'25, and BrUMO'25, the posterior-based procedure achieves faster convergence and greater rank stability than Pass$@k$ and recent variants, enabling reliable comparisons at far smaller sample counts. The framework clarifies when observed gaps are statistically meaningful (non-overlapping credible intervals) versus noise, and it naturally extends to graded, rubric-based evaluations. Together, these results recommend replacing Pass$@k$ for LLM evaluation and ranking with a posterior-based, compute-efficient protocol that unifies binary and non-binary evaluation while making uncertainty explicit. Source code is available at https://github.com/mohsenhariri/scorio
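The closed-form posterior mentioned above can be illustrated with the standard Dirichlet-categorical conjugate update. This is a minimal sketch, not the authors' scorio package: the function name, the rubric weights, and the toy counts are hypothetical, and the variance formula is the usual one for a linear functional of a Dirichlet vector.

```python
def weighted_rubric_posterior(counts, weights, prior):
    """Posterior mean and variance of sum_k weights[k] * p_k under a Dirichlet prior.

    counts[k]  : number of trials that landed in outcome category k
    weights[k] : rubric score assigned to category k
    prior[k]   : Dirichlet concentration for category k (all ones = uniform prior)
    """
    alpha = [c + a for c, a in zip(counts, prior)]   # posterior concentrations
    total = sum(alpha)
    probs = [a / total for a in alpha]               # posterior mean of each p_k
    mean = sum(w * p for w, p in zip(weights, probs))
    second_moment = sum(w * w * p for w, p in zip(weights, probs))
    variance = (second_moment - mean * mean) / (total + 1.0)
    return mean, variance

# Binary special case: 3 failures and 7 successes out of 10 trials, uniform prior.
mean, variance = weighted_rubric_posterior(
    counts=[3, 7], weights=[0.0, 1.0], prior=[1.0, 1.0]
)
```

In this binary case the result reduces to a Beta(8, 4) posterior, so the mean is 8/12; graded rubrics only change the `weights` and the number of categories.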

2508.21260 2026-05-13 cs.RO eess.SP math.ST stat.TH

Remarks on stochastic cloning and delayed-state filtering

Tara Mina, Lindsey Marinello, John Christian

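AI summary This paper studies estimation with delayed-state measurements that depend on prior states, as arise in aerospace navigation and robotics, focusing on stochastic cloning (SC) and a long-overlooked alternative, the delayed-state Kalman filter (DSKF). It shows that a properly derived DSKF yields exactly the same state and covariance update as SC without state augmentation, and presents two equivalent DSKF formulations that explain from complementary perspectives how correlations with the prior state can be handled within the generalized Kalman filter. The DSKF is comparable to SC in asymptotic computational and memory complexity, and one formulation can reduce arithmetic and storage costs for certain problem dimensions, clarifying the misconception that Kalman filter variants cannot handle correlated delayed-state measurements.

Details
English abstract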

Many estimation problems in aerospace navigation and robotics involve measurements that depend on prior states. A prominent example is odometry, which measures the relative change between states over time. Accurately handling these delayed-state measurements requires capturing their correlations with prior state estimates, and a widely used approach is stochastic cloning (SC), which augments the state vector to account for these correlations. This work revisits a long-established but often overlooked alternative--the delayed-state Kalman filter--and demonstrates that a properly derived filter yields exactly the same state and covariance update as SC, without requiring state augmentation. Moreover, two equivalent formulations of the delayed-state Kalman filter (DSKF) are presented, providing complementary perspectives on how the prior-state measurement correlations can be handled within the generalized Kalman filter. These formulations are shown to be comparable to SC in asymptotic computational and memory complexity, while one DSKF formulation can offer reduced arithmetic and storage costs for certain problem dimensions. Our findings clarify a common misconception that Kalman filter variants are inherently unable to handle correlated delayed-state measurements, demonstrating that an alternative formulation achieves the same results without state augmentation.
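The equivalence claimed above can be checked numerically on a small linear example. The sketch below assumes a measurement of the form z = H_k x_k + H_prev x_{k-1} + v with independent process and measurement noise, and uses the textbook delayed-state gain and innovation covariance; the variable names and random test matrices are illustrative and do not follow the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2  # state and measurement dimensions (arbitrary for the demo)

def random_spd(k):
    """Random symmetric positive-definite matrix of size k x k."""
    a = rng.standard_normal((k, k))
    return a @ a.T + k * np.eye(k)

F = rng.standard_normal((n, n))        # state transition
P_prev = random_spd(n)                 # posterior covariance at step k-1
Q, R = random_spd(n), random_spd(m)    # process and measurement noise
H_k = rng.standard_normal((m, n))      # sensitivity to the current state
H_prev = rng.standard_normal((m, n))   # sensitivity to the prior state

P_pred = F @ P_prev @ F.T + Q          # predicted covariance of x_k
C = F @ P_prev                         # cross-covariance Cov(x_k, x_{k-1})

# Stochastic cloning: augment the state with a clone of x_{k-1}, update, marginalize.
P_aug = np.block([[P_pred, C], [C.T, P_prev]])
H_aug = np.hstack([H_k, H_prev])
S_sc = H_aug @ P_aug @ H_aug.T + R
K_sc = P_aug @ H_aug.T @ np.linalg.inv(S_sc)
P_sc = (P_aug - K_sc @ S_sc @ K_sc.T)[:n, :n]

# Delayed-state form: same update expressed with n x n blocks only.
S_ds = (H_k @ P_pred @ H_k.T + H_k @ C @ H_prev.T
        + H_prev @ C.T @ H_k.T + H_prev @ P_prev @ H_prev.T + R)
K_ds = (P_pred @ H_k.T + C @ H_prev.T) @ np.linalg.inv(S_ds)
P_ds = P_pred - K_ds @ S_ds @ K_ds.T
```

The top block of the cloned gain and the marginal covariance of the current state agree with the delayed-state quantities to numerical precision, since both forms compute the same innovation covariance.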

2508.16132 2026-05-13 q-fin.PM math.ST q-fin.RM stat.TH

On a multivariate extension for Copula-based Conditional Value at Risk

Andres Mauricio Molina Barreto

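AI summary This paper studies the extension of copula-based Conditional Value at Risk (CCVaR) to several dimensions ($d \geq 2$), specifically for dependence structures described by Archimedean copulas. The author derives an almost closed-form expression for CCVaR under an Archimedean copula and examines the conditions under which the risk measure is coherent. Numerical experiments on real data estimate CCVaR and compare it with the classical Value at Risk (VaR) and Conditional Value at Risk (CVaR).

Details
English abstract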

Copula-based Conditional Value at Risk (CCVaR) is defined as an alternative version of the classical Conditional Value at Risk (CVaR) for multivariate random vectors intended to be real-valued. We aim to generalize CCVaR to several dimensions ($d \geq 2$) when the dependence structure is given by an Archimedean copula. While previous research focused on the bivariate case, leaving the multivariate version unexplored, an almost closed-form expression for CCVaR under an Archimedean copula is derived. The conditions under which this risk measure satisfies coherence are then examined. Finally, numerical experiments based on real data are conducted to estimate CCVaR, and the results are compared with classical measures of Value at Risk (VaR) and Conditional Value at Risk (CVaR).

2508.08420 2026-05-13 cs.LG stat.ML

Regret minimization in Linear Bandits with offline data via extended D-optimal exploration

Sushant Vijayan, Arun Suggala, Karthikeyan Shanmugam, Soumyabrata Pal

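AI summary This paper studies how to minimize cumulative regret online in linear bandits when offline data is available. It proposes Offline-Online Phased Elimination (OOPE), which uses an extended D-optimal design in each exploration phase to exploit the offline data and substantially reduce online regret. The online regret of the algorithm is $\tilde{O}(\sqrt{d_{\mathrm{eff}} T \log (|\mathcal{A}|T)} + d^2)$, where $d_{\mathrm{eff}}$ is the number of poorly explored directions in the offline data and reflects its quality. The paper also gives minimax regret lower bounds that depend on the quality of the offline data, and uses a Frank-Wolfe approximation to improve the $d^2$ term.

Details
Comments
Accepted to TMLR, with J2C certification, link: https://openreview.net/forum?id=4WcK8gKgCi
English abstract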

We consider the problem of online regret minimization in linear bandits with access to prior observations (offline data) from the underlying bandit model. There are numerous applications where extensive offline data is often available, such as recommendation systems and online advertising. Consequently, this problem has been studied intensively in recent literature. Our algorithm, Offline-Online Phased Elimination (OOPE), effectively incorporates the offline data to substantially reduce the online regret compared to prior work. To leverage offline information prudently, OOPE uses an extended D-optimal design within each exploration phase. OOPE achieves an online regret of $\tilde{O}(\sqrt{d_{\mathrm{eff}} T \log \left(|\mathcal{A}|T\right)}+d^2)$, where $d_{\mathrm{eff}}$ ($\leq d$) is the effective problem dimension, which measures the number of poorly explored directions in the offline data and depends on the eigen-spectrum $(λ_k)_{k \in [d]}$ of the Gram matrix of the offline data. The eigen-spectrum $(λ_k)_{k \in [d]}$ is a quantitative measure of the quality of the offline data. If the offline data is poorly explored ($d_{\mathrm{eff}} \approx d$), we recover the established regret bounds for the purely online setting, while when the offline data is abundant ($T_{\mathrm{off}} \gg T$) and well-explored ($d_{\mathrm{eff}} = o(1)$), the online regret reduces substantially. Additionally, we provide the first known minimax regret lower bounds in this setting that depend explicitly on the quality of the offline data. These lower bounds establish the optimality of our algorithm in regimes where offline data is either well-explored or poorly explored. Finally, by using a Frank-Wolfe approximation to the extended optimal design, we further improve the $O(d^{2})$ term to $O\left(\frac{d^{2}}{d_{\mathrm{eff}}} \min \{ d_{\mathrm{eff}},1\} \right)$, which can be substantial in high dimensions with moderate quality of offline data, $d_{\mathrm{eff}} = Ω(1)$.

2507.09302 2026-05-13 stat.ME

The Multiplicative Instrumental Variable Model

Jiewen Liu, Chan Park, Yonghoon Lee, Yunshu Zhang, Mengxin Yu, James M. Robins, Eric J. Tchetgen Tchetgen

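AI summary This paper proposes the Multiplicative Instrumental Variable (MIV) model, a new instrumental variable model for addressing hidden confounding bias. The model imposes a condition of no multiplicative interaction between the instrument and an unmeasured confounder in the treatment propensity score model, formalizing the idea that the instrument and hidden confounders influence treatment uptake through independent mechanisms. The authors show that MIV nonparametrically identifies the average treatment effect on the treated, propose a class of multiply robust and semiparametric efficient estimators, and illustrate the methods with simulations and a real application.

Details
English abstract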

The instrumental variable (IV) design is a common approach to address hidden confounding bias. For validity, an IV must impact the outcome only through its association with the treatment. In addition, IV identification has required a homogeneity condition such as monotonicity or no unmeasured common effect modifier between the additive effect of the treatment on the outcome, and that of the IV on the treatment. In this work, we introduce the Multiplicative Instrumental Variable Model (MIV), which encodes a condition of no multiplicative interaction between the instrument and an unmeasured confounder in the treatment propensity score model. Thus, the MIV provides a novel formalization of the core IV independence condition interpreted as independent mechanisms of action, by which the instrument and hidden confounders influence treatment uptake, respectively. As we formally establish, MIV provides nonparametric identification of the population average treatment effect on the treated (ATT) via a single-arm version of the classical Wald ratio IV estimand, for which we propose a novel class of estimators that are multiply robust and semiparametric efficient. Finally, we illustrate the methods in extended simulations and an application on the causal impact of a job training program on subsequent earnings.

2507.03622 2026-05-13 cs.LG cs.AI stat.ML

Localising Dropout Variance in Twin Networks

Cooper Doyle

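AI summary This paper studies how to locate the source of predictive uncertainty in twin-network models and proposes a layer-wise variance decomposition that splits total predictive variance into an encoder component and an outcome-head component. Toggling Monte Carlo Dropout independently in the shared encoder and the outcome heads separates the two sources. Experiments show that encoder variance dominates under distribution shift and is the primary predictor of error, while head variance becomes informative only once encoder uncertainty is controlled; the method is cheap and offers practical guidance for data collection.

Details
Comments
14 pages, 5 figures, 3 tables
English abstract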

Accurate individual treatment-effect estimation demands not only reliable point predictions but also uncertainty measures that help practitioners locate the source of model failure. We introduce a layer-wise variance decomposition for deep twin-network models: by toggling Monte Carlo Dropout independently in the shared encoder and the outcome heads, we split total predictive variance into an encoder component ($σ_{\mathrm{enc}}^2$) and a head component ($σ_{\mathrm{head}}^2$), with $σ_{\mathrm{enc}}^2 + σ_{\mathrm{head}}^2 \approx σ_{\mathrm{tot}}^2$ by the law of total variance. Across three synthetic covariate-shift regimes, the encoder component dominates under distributional shift ($ρ_{\mathrm{enc}}=0.53$) while the head component becomes informative only once encoder uncertainty is controlled. On a real-world twins cohort with induced multivariate shift, only $σ_{\mathrm{enc}}^2$ spikes on out-of-distribution samples and becomes the primary error predictor ($ρ_{\mathrm{enc}}\!\approx\!0.89$), while $σ_{\mathrm{head}}^2$ remains flat. The decomposition adds negligible cost over standard MC Dropout and provides a practical diagnostic for deciding whether to collect more diverse covariates or more outcome data.
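One way to read the appeal to the law of total variance is to condition on the encoder's dropout mask. In the sketch below, $m_{\mathrm{enc}}$ and $m_{\mathrm{head}}$ are hypothetical symbols for the random dropout masks in the encoder and the heads, and $\hat y$ is the prediction; identifying the two terms with $σ_{\mathrm{head}}^2$ and $σ_{\mathrm{enc}}^2$ is an interpretation made here, and the paper's exact estimators may differ.

```latex
\operatorname{Var}(\hat y)
= \underbrace{\mathbb{E}_{m_{\mathrm{enc}}}\!\big[\operatorname{Var}_{m_{\mathrm{head}}}(\hat y \mid m_{\mathrm{enc}})\big]}_{\text{head-side term}}
+ \underbrace{\operatorname{Var}_{m_{\mathrm{enc}}}\!\big(\mathbb{E}_{m_{\mathrm{head}}}[\hat y \mid m_{\mathrm{enc}}]\big)}_{\text{encoder-side term}}
```

The identity itself is exact; the approximation sign in the abstract would then come from estimating each term with dropout switched on in only one part of the network.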

2506.23619 2026-05-13 q-fin.ST cs.LG econ.EM stat.ML

Overparametrized models with posterior drift

Guillaume Coqueret, Martial Laguerre

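AI summary This paper studies the impact of posterior drift on out-of-sample forecasting accuracy in overparametrized machine learning models. Performance deteriorates markedly when the parameters of the data generating process change between the training and testing samples, which matters in settings prone to regime changes such as financial markets. Applied to equity premium forecasting, a market timing strategy is highly sensitive to sub-periods and to the bandwidth parameters that control model complexity: small bandwidths give very heterogeneous returns, while large bandwidths give more consistent outcomes but worse risk-adjusted returns, so large linear models should be used cautiously for stock market prediction.

Details
English abstract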

This paper investigates the impact of posterior drift on out-of-sample forecasting accuracy in overparametrized machine learning models. We document the loss in performance when the loadings of the data generating process change between the training and testing samples. This matters crucially in settings in which regime changes are likely to occur, for instance, in financial markets. Applied to equity premium forecasting, our results underline the sensitivity of a market timing strategy to sub-periods and to the bandwidth parameters that control the complexity of the model. For the average investor, we find that focusing on holding periods of 15 years can generate very heterogeneous returns, especially for small bandwidths. Large bandwidths yield much more consistent outcomes, but are far less appealing from a risk-adjusted return standpoint. All in all, our findings tend to recommend cautiousness when resorting to large linear models for stock market predictions.

2505.20761 2026-05-13 cs.LG stat.ML

Practical estimation of the optimal classification error with soft labels and calibration

Ryota Ushio, Takashi Ishida, Masashi Sugiyama

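AI summary: This paper studies how to estimate the optimal classification error rate (the Bayes error) in binary classification in a way that is both practical and theoretically sound. The authors extend an earlier soft-label-based method in two important ways. First, they analyze the bias of the hard-label-based estimator, showing that its decay rate depends on how well the two class-conditional distributions are separated and that, as the number of hard labels per instance grows, it can be significantly faster than the earlier result suggested. Second, they address estimation with corrupted soft labels, pointing out that even calibrated soft labels can yield inaccurate estimates, and propose an estimator based on isotonic calibration that remains statistically consistent under weaker assumptions. The method is instance-free, which suits privacy-constrained practical scenarios. Experiments confirm its effectiveness.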

Details
Comments
ICLR 2026 camera-ready version updated; 40 pages, 12 figures; GitHub: https://github.com/RyotaUshio/bayes-error-estimation
English abstract

While the performance of machine learning systems has experienced significant improvement in recent years, relatively little attention has been paid to the fundamental question: to what extent can we improve our models? This paper provides a means of answering this question in the setting of binary classification, which is practical and theoretically supported. We extend a previous work that utilizes soft labels for estimating the Bayes error, the optimal error rate, in two important ways. First, we theoretically investigate the properties of the bias of the hard-label-based estimator discussed in the original work. We reveal that the decay rate of the bias is adaptive to how well the two class-conditional distributions are separated, and it can decay significantly faster than the previous result suggested as the number of hard labels per instance grows. Second, we tackle a more challenging problem setting: estimation with corrupted soft labels. One might be tempted to use calibrated soft labels instead of clean ones. However, we reveal that calibration guarantee is not enough, that is, even perfectly calibrated soft labels can result in a substantially inaccurate estimate. Then, we show that isotonic calibration can provide a statistically consistent estimator under an assumption weaker than that of the previous work. Our method is instance-free, i.e., we do not assume access to any input instances. This feature allows it to be adopted in practical scenarios where the instances are not available due to privacy issues. Experiments with synthetic and real-world datasets show the validity of our methods and theory. The code is available at https://github.com/RyotaUshio/bayes-error-estimation.
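
In binary classification the Bayes error equals the expectation of min(η(X), 1 − η(X)), where η(X) is the posterior probability of the positive class. A minimal sketch of the corresponding instance-free plug-in estimate follows; the function names are hypothetical, and whether this matches the exact estimator analyzed in the paper is an assumption.

```python
def bayes_error_from_soft_labels(soft_labels):
    """Plug-in Bayes error estimate: the mean of min(p, 1 - p).

    soft_labels: class-1 probabilities in [0, 1], one per instance.
    No input instances are needed, only the labels themselves.
    """
    if not soft_labels:
        raise ValueError("need at least one soft label")
    return sum(min(p, 1.0 - p) for p in soft_labels) / len(soft_labels)


def soft_labels_from_hard_labels(hard_label_sets):
    """Approximate each soft label by the mean of several 0/1 hard labels.

    With few hard labels per instance this plug-in is biased, which is the
    kind of bias the abstract refers to.
    """
    return [sum(labels) / len(labels) for labels in hard_label_sets]


clean = [0.9, 0.2, 0.5, 1.0]
print(bayes_error_from_soft_labels(clean))

approx = soft_labels_from_hard_labels([[1, 1, 0], [0, 0, 0]])
print(bayes_error_from_soft_labels(approx))
```

The second call illustrates the hard-label route: three annotator votes per instance are averaged into an approximate soft label before the same formula is applied.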

2505.17506 2026-05-13 stat.ML cs.LG

Offline Constrained Reinforcement Learning under Partial Data Coverage

Seokmin Ko, Ambuj Tewari, Kihyuk Hong

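AI summary: This paper studies offline constrained reinforcement learning with general function approximation in discounted constrained Markov decision processes. To address the limitations of existing methods, which require full data coverage, lack oracle efficiency, or depend on the data-generating distribution, it proposes PDOCRL, a primal-dual algorithm based on a decomposed linear program that makes the policy an explicit optimization variable and so avoids dependence on the data-generating distribution. Under partial policy coverage, the method returns a near-optimal, near-feasible policy with $\widetilde{\mathcal O}(ε^{-2})$ sample complexity without relying on the data-generating distribution, and performs comparably to strong baselines in experiments.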

Details
English abstract

We study offline constrained reinforcement learning with general function approximation in discounted constrained Markov decision processes. Prior methods either require full data coverage for evaluating intermediate policies, lack oracle efficiency, or require knowledge of the data-generating distribution for policy extraction. We propose PDOCRL, an oracle-efficient primal-dual algorithm based on a decomposed linear-programming formulation that makes the policy an explicit optimization variable. This avoids policy extraction that requires knowledge of the data-generating distribution, and only uses standard policy-optimization, online linear-optimization, and linear-minimization oracles. We show that saddle-point formulations using general function approximation can have spurious saddle points even when an optimal solution is realizable, and identify a stronger realizability condition under which every restricted saddle point is optimal. Under this condition and partial coverage of an optimal policy, PDOCRL returns a near-optimal, near-feasible policy with a \(\widetilde{\mathcal O}(ε^{-2})\) sample guarantee, without access to the data-generating distribution. Empirically, PDOCRL is competitive with strong baselines on standard offline constrained RL benchmarks.

2505.13770 2026-05-13 cs.AI cs.CL cs.LG stat.ME stat.ML

Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference

Jin Du, Li Chen, Xun Xian, An Luo, Fangqiao Tian, Ganghua Wang, Charles Doss, Xiaotong Shen, Jie Ding

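AI summary: This study examines how well large language models (LLMs) cope with statistical pitfalls in causal inference, noting that current models fall clearly short on complex statistical issues such as Simpson's paradox and selection bias. It proposes CausalPitfalls, a comprehensive benchmark that systematically evaluates models' causal reasoning ability and response reliability through structured challenges across multiple difficulty levels, paired with grading rubrics. The experimental results reveal the limitations of existing LLMs in statistical causal reasoning and offer guidance for building trustworthy causal reasoning systems.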

Details
English abstract

Reliable causal inference is essential for making decisions in high-stakes areas like medicine, economics, and public policy. However, it remains unclear whether large language models (LLMs) can handle rigorous and trustworthy statistical causal inference. Current benchmarks usually involve simplified tasks. For example, these tasks might only ask LLMs to identify semantic causal relationships or draw conclusions directly from raw data. As a result, models may overlook important statistical pitfalls, such as Simpson's paradox or selection bias. This oversight limits the applicability of LLMs in the real world. To address these limitations, we propose CausalPitfalls, a comprehensive benchmark designed to rigorously evaluate the capability of LLMs in overcoming common causal inference pitfalls. Our benchmark features structured challenges across multiple difficulty levels, each paired with grading rubrics. This approach allows us to quantitatively measure both causal reasoning capabilities and the reliability of LLMs' responses. We evaluate models using two protocols: (1) direct prompting, which assesses intrinsic causal reasoning, and (2) code-assisted prompting, where models generate executable code for explicit statistical analysis. Additionally, we validate the effectiveness of this judge by comparing its scoring with assessments from human experts. Our results reveal significant limitations in current LLMs when performing statistical causal inference. The CausalPitfalls benchmark provides essential guidance and quantitative metrics to advance the development of trustworthy causal reasoning systems.
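
Simpson's paradox, one of the pitfalls named above, is easy to reproduce numerically. The sketch below uses illustrative counts patterned on the classic kidney-stone example; it is not taken from the CausalPitfalls benchmark, and the names `counts`, `stratum_rate`, and `pooled_rate` are hypothetical.

```python
from fractions import Fraction

# Illustrative (successes, patients) counts per treatment and stone size.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def stratum_rate(treatment, stratum):
    """Success rate within one stratum, as an exact fraction."""
    successes, total = counts[treatment][stratum]
    return Fraction(successes, total)


def pooled_rate(treatment):
    """Success rate after pooling all strata, ignoring the confounder."""
    successes = sum(s for s, _ in counts[treatment].values())
    total = sum(n for _, n in counts[treatment].values())
    return Fraction(successes, total)


for stratum in ("small", "large"):
    print(stratum, float(stratum_rate("A", stratum)), float(stratum_rate("B", stratum)))
print("pooled", float(pooled_rate("A")), float(pooled_rate("B")))
```

Treatment A has the higher success rate within each stratum, yet B looks better once the strata are pooled, because A is given mostly to the harder cases. A model that draws conclusions from the pooled table alone would reach the wrong answer.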

2505.11517 2026-05-13 physics.soc-ph cs.CE cs.IT math.IT stat.AP

Information-Theoretic Grid Topology Reconstruction using Low-Precision Smart Meter Data

Daniel T. Speckhard

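AI summary: This paper studies the minimum data fidelity needed to reconstruct distribution grid topology from low-precision smart meter data. Taking an information-theoretic approach, it uses the Chow-Liu algorithm to build maximum spanning trees from mutual information and focuses on how data bit-depth, significant digit truncation, time-window length, and the choice of mutual information estimator affect reconstruction accuracy. Experiments show that grid topology can be recovered effectively even with 8-bit quantized data or millivolt-level precision, but performance drops significantly when the sampling interval exceeds 20 minutes or the data cover too short a duration. The study provides a theoretical basis for future evaluations of noisy real-world data and of hybrid methods that incorporate engineering priors.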

Details
English abstract

Accurate knowledge of power grid topology is a prerequisite for effective state estimation and grid stability. While data-driven methods for topology reconstruction exist, the minimum requirements for measurement quality, specifically regarding quantization, precision, and sampling frequency, remain under-explored. This study investigates the data fidelity required to reconstruct distribution grid topologies using voltage magnitude measurements. Adopting an information-theoretic approach, we utilize the Chow-Liu algorithm to generate maximum spanning trees based on mutual information. Rather than proposing a new reconstruction algorithm, our primary contribution is a comprehensive sensitivity analysis of the measurement data itself. We systematically evaluate the impact of data bit-depth, significant digit truncation, time-window length, and different mutual information estimators on reconstruction accuracy. We validate this approach using IEEE test cases (via MATPOWER) and time-series data from GridLAB-D. Our results demonstrate that grid topology can be successfully recovered even with highly quantized 8-bit data or millivolt-level precision. However, performance degrades significantly when downsampling intervals exceed 20 minutes or when data availability is limited to short durations. These findings establish an optimistic theoretical lower bound, suggesting that costly high-precision instrumentation may not be strictly necessary for structural inference under ideal conditions. This rigorous baseline provides a foundation for future evaluations of noisy real-world smart meter data and hybrid approaches that incorporate existing engineering priors.
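
The Chow-Liu step described above (a maximum spanning tree over pairwise mutual information) can be sketched in a few lines. The example assumes a plug-in mutual information estimator on already-quantized readings and a synthetic four-node radial feeder; the data generator and all function names are hypothetical, not the paper's pipeline.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y])) for (x, y), c in joint.items()
    )


def chow_liu_edges(columns):
    """Maximum spanning tree over pairwise MI, built with Kruskal's algorithm."""
    k = len(columns)
    weighted = sorted(
        (
            (mutual_information(columns[i], columns[j]), i, j)
            for i in range(k)
            for j in range(i + 1, k)
        ),
        reverse=True,
    )
    parent = list(range(k))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not close a cycle
            parent[ri] = rj
            edges.append((i, j))
    return sorted(edges)


def noisy_copy(values, rng, keep=0.85, levels=4):
    """Copy each quantized reading with probability `keep`, else resample it."""
    return [v if rng.random() < keep else rng.randrange(levels) for v in values]


rng = random.Random(0)
node0 = [rng.randrange(4) for _ in range(4000)]  # synthetic 2-bit readings
node1 = noisy_copy(node0, rng)
node2 = noisy_copy(node1, rng)
node3 = noisy_copy(node2, rng)
print(chow_liu_edges([node0, node1, node2, node3]))
```

Because each synthetic node depends only on its upstream neighbour, adjacent pairs share more information than distant ones, and the recovered tree is the chain 0-1-2-3. Real voltage data would need the quantization and estimator choices that the paper's sensitivity analysis examines.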

2502.09162 2026-05-13 stat.ME

Infinitely divisible priors for multivariate survival functions

Florian Brück

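AI summary: This paper proposes a framework of nonparametric priors for real-valued random vectors, which can be viewed as a multivariate generalization of neutral-to-the-right priors. The approach randomizes the exponent measure of a minimum-infinitely divisible random vector by an infinitely divisible random measure and naturally handles partially exchangeable data as well as exchangeable random vectors. The study shows how to build hierarchical priors from simple building blocks, embeds many Bayesian nonparametric survival analysis models into the framework, examines the posterior predictive distribution and its properties under some regularity conditions, and provides a corresponding simulation method suited to partially exchangeable data in survival analysis.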

Details
English abstract

This article introduces a novel framework for nonparametric priors on real-valued random vectors, which can be viewed as a multivariate generalization of neutral-to-the-right priors. It is based on randomizing the exponent measure of a minimum-infinitely divisible random vector by an infinitely divisible random measure and naturally incorporates partially exchangeable data as well as exchangeable random vectors. We show how to construct hierarchical priors from simple building blocks and embed many models from Bayesian nonparametric survival analysis into our framework. The prior can concentrate on discrete or continuous distributions and other properties such as dependence, moments and moments of mean functionals are characterized. The posterior predictive distribution is derived in a general framework and is refined under some regularity conditions. In addition, a framework for the simulation from the posterior predictive distribution is provided, which is illustrated by an application to partially exchangeable data in a survival analysis context. As a byproduct, the construction of tractable infinitely divisible random measures is studied and the concept of subordination of homogeneous completely random measures by homogeneous completely random measures is extended to the subordination of homogeneous completely random measures by infinitely divisible random measures. This technique allows one to create vectors of dependent infinitely divisible random measures with tractable Laplace transforms and serves as a general tool for the construction of tractable infinitely divisible random measures.

2411.00471 2026-05-13 stat.ME cs.LG

Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models

Anupreet Porwal, Abel Rodriguez

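AI summary: This paper proposes Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models. The approach extends traditional mixtures of $g$ priors by allowing differential shrinkage for different blocks of parameters while fully accounting for the correlation among predictors, bridging the literatures on model selection and continuous shrinkage priors. It is consistent in several senses, avoids the conditional Lindley paradox, and supports posterior inference through a Markov chain Monte Carlo algorithm that needs only minimal tuning. Experiments show that, in the presence of a small number of very large effects, the method detects smaller but significant effects more effectively at the cost of only a few additional false discoveries.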

Details
English abstract

This paper introduces Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models. These priors are extensions of traditional mixtures of $g$ priors that allow for differential shrinkage for various (data-selected) blocks of parameters while fully accounting for the predictors' correlation structure, providing a bridge between the literatures on model selection and continuous shrinkage priors. We show that Dirichlet process mixtures of block $g$ priors are consistent in various senses and, in particular, that they avoid the conditional Lindley "paradox" highlighted by Som et al. (2016). Further, we develop a Markov chain Monte Carlo algorithm for posterior inference that requires only minimal ad-hoc tuning. Finally, we investigate the empirical performance of the prior in various real and simulated datasets. In the presence of a small number of very large effects, Dirichlet process mixtures of block $g$ priors lead to higher power for detecting smaller but significant effects with only a minimal increase in the number of false discoveries.

2407.11465 2026-05-13 math.ST math.PR q-fin.MF stat.ME stat.TH

Testing by Betting while Borrowing and Bargaining

Hongjian Wang, Muriel F. Pérez-Ortiz, Wouter M. Koolen, Aaditya Ramdas

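AI summary: This paper studies how betting-based hypothesis tests in game-theoretic statistics must be adjusted when borrowing is allowed. Traditional methods require that bets not exceed current wealth; this paper examines how the rejection threshold must be adjusted to retain the same significance level once borrowing is permitted. It finds that a liability-dependent threshold comes at an extra price, whereas a path-dependent threshold based on past leverage ratios incurs no extra price, providing new theoretical support for hypothesis testing with borrowing.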

Details
English abstract

Testing by betting has been a cornerstone of the game-theoretic statistics literature. One bets against the null hypothesis: the accumulated wealth $W_t$ quantifies the evidence against the null hypothesis after $t$ rounds, and the null can be rejected at level $α$ whenever $W_t \geq 1/α$. A key assumption permeating the literature is that one cannot bet more money than they currently have (the wealth must stay nonnegative). In this work, we examine the consequences of allowing the bettor to borrow money in each round (for example after going bankrupt). Specifically, we ask how the threshold of $1/α$ must be accordingly adjusted to retain the desired level $α$. Our findings are twofold. First, if the new rejection rule is $W_t \geq g(α,L_t)$ where $L_t$ is the total liability at time $t$, then we show that $g(α,0)>1/α$ if $g(α,L_t)<\infty$ for any $L_t > 0$; in words, we must pay for the possibility of borrowing, even if in fact we do not borrow. Second, and in contrast to the first, if one employs a path-dependent threshold $h(α,W_0,L_1,\dots,W_{t-1},L_t)$ that is a function of past leverage ratios, then there is in fact no extra price to pay for the possibility of borrowing.
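
The classical no-borrowing rule $W_t \geq 1/α$ can be illustrated with a fair-coin null. The sketch below assumes a fixed betting fraction on heads and a hypothetical function name `first_rejection_time`; it does not implement the borrowing thresholds $g$ or $h$ studied in the paper.

```python
def first_rejection_time(outcomes, bet_fraction=0.5, alpha=0.05):
    """Return the first round t with W_t >= 1/alpha, or None if there is none.

    Null hypothesis: the coin is fair. Each round a fraction `bet_fraction`
    of current wealth is staked on heads (x = 1) at even odds, so wealth is
    multiplied by 1 + bet_fraction * (2x - 1). Under the null this factor
    has expectation 1, making wealth a nonnegative martingale, and Ville's
    inequality bounds the chance of ever reaching 1/alpha by alpha.
    """
    if not 0.0 <= bet_fraction <= 1.0:
        raise ValueError("bet_fraction must lie in [0, 1] so wealth stays nonnegative")
    wealth = 1.0
    threshold = 1.0 / alpha
    for t, x in enumerate(outcomes, start=1):
        wealth *= 1.0 + bet_fraction * (2 * x - 1)
        if wealth >= threshold:
            return t
    return None


print(first_rejection_time([1] * 20))     # a run of heads: strong evidence
print(first_rejection_time([1, 0] * 50))  # alternating outcomes: no rejection
```

The constraint on `bet_fraction` is exactly the nonnegativity assumption the abstract relaxes: staking more than current wealth would require borrowing, and the fixed threshold $1/α$ would then no longer guarantee level $α$.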

2303.12834 2026-05-13 quant-ph cs.AI cs.LG stat.ML

The power and limitations of learning quantum dynamics incoherently

Sofiene Jerbi, Joe Gibbs, Manuel S. Rudolph, Matthias C. Caro, Patrick J. Coles, Hsin-Yuan Huang, Zoë Holmes

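AI summary: This paper studies the feasibility and limits of learning quantum dynamics in an incoherent framework, where the system and the target do not interact directly at the quantum level. By analyzing the number of measurements required to emulate established coherent learning strategies, the authors give sample complexity bounds for learning unitary processes. They prove that when arbitrary measurements are allowed, any efficiently representable unitary can be learned efficiently in the incoherent framework, whereas with shallow-depth measurements only low-entangling unitaries can be learned. The study also learns a 16-qubit unitary on an IBM quantum device and verifies the algorithm's scalability through numerical experiments.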

Details
Journal ref
Phys. Rev. Research 8, 023141 (2026)
Comments
6+9 pages, 7 figures
English abstract

Quantum process learning is emerging as an important tool to study quantum systems. While studied extensively in coherent frameworks, where the target and model system can share quantum information, less attention has been paid to whether the dynamics of quantum systems can be learned without the system and target directly interacting. Such incoherent frameworks are practically appealing since they open up methods of transpiling quantum processes between the different physical platforms without the need for technically challenging hybrid entanglement schemes. Here we provide bounds on the sample complexity of learning unitary processes incoherently by analyzing the number of measurements that are required to emulate well-established coherent learning strategies. We prove that if arbitrary measurements are allowed, then any efficiently representable unitary can be efficiently learned within the incoherent framework; however, when restricted to shallow-depth measurements only low-entangling unitaries can be learned. We demonstrate our incoherent learning algorithm for low-entangling unitaries by successfully learning a 16-qubit unitary on \texttt{ibmq\_kolkata}, and further demonstrate the scalability of our proposed algorithm through extensive numerical experiments.