arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.04829 2026-04-07 stat.ME cs.LG stat.ML

A Robust SINDy Autoencoder for Noisy Dynamical System Identification

Kairui Ding

Comments 27 pages

Abstract

Sparse identification of nonlinear dynamics (SINDy) has been widely used to discover the governing equations of a dynamical system from data. It uses sparse regression techniques to identify parsimonious models of unknown systems from a library of candidate functions. Therefore, it relies on the assumption that the dynamics are sparsely represented in the coordinate system used. To address this limitation, one seeks a coordinate transformation that provides reduced coordinates capable of reconstructing the original system. Recently, SINDy autoencoders have extended this idea by combining sparse model discovery with autoencoder architectures to learn simplified latent coordinates together with parsimonious governing equations. A central challenge in this framework is robustness to measurement error. Inspired by noise-separating neural network structures, we incorporate a noise-separation module into the SINDy autoencoder architecture, thereby improving robustness and enabling more reliable identification of noisy dynamical systems. Numerical experiments on the Lorenz system show that the proposed method recovers interpretable latent dynamics and accurately estimates the measurement noise from noisy observations.
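For readers unfamiliar with the setup, the sparse regression and the latent-coordinate extension described above can be written roughly as follows. This is the generic formulation from the SINDy literature, not the noise-separating architecture proposed in the paper. The symbols $\Theta$ (candidate library), $\Xi$ (sparse coefficients), $\varphi$ (encoder), $\psi$ (decoder) and $\lambda$ are notational assumptions that do not appear in the abstract, and the $\ell_1$ penalty stands in for whichever sparsity-promoting step is actually used.

```latex
\begin{aligned}
\dot{X} &\approx \Theta(X)\,\Xi, \qquad
\hat{\Xi} = \arg\min_{\Xi}\; \bigl\|\dot{X} - \Theta(X)\,\Xi\bigr\|_F^2 + \lambda\,\|\Xi\|_1, \\
z &= \varphi(x), \qquad \dot{z} \approx \Theta(z)\,\Xi, \qquad x \approx \psi(z).
\end{aligned}
```

The first line is the regression in the measured coordinates; the second moves the same sparse model into learned latent coordinates $z$.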

2604.04823 2026-04-07 math.PR math.ST stat.TH

Rapid convergence of tempering chains to multimodal Gibbs measures

Seungjae Son

Abstract

We study the spectral gaps of parallel and simulated tempering chains targeting multimodal Gibbs measures. In particular, we consider chains constructed from Metropolis random walks that preserve the Gibbs distributions at a sequence of harmonically spaced temperatures. We prove that their spectral gaps admit polynomial lower bounds of order $11$ and $12$ in terms of the low target temperature. The analysis applies to a broad class of potentials, beyond mixture models, without requiring explicit structural information on the energy landscape. The main idea is to decompose the state space and construct a Lyapunov function based on a suitably perturbed potential, which allows us to establish lower bounds on the local spectral gaps.

2604.04807 2026-04-07 stat.ME

Rank-Based Sparse Regression in Principal Components Space under Measurement Error

Long Feng, Xiaoyi Wang, Le Zhou

Abstract

We study high-dimensional regression in principal components space when the predictors are observed with additive measurement error and the response errors may be heavy-tailed. The starting point is the $\ell_1$-penalized principal-components estimator of Song and Zou (2026), which enjoys a blessing-of-dimensionality phenomenon under predictor contamination but is sensitive to heavy-tailed data or outliers. We replace the squared loss by a Wilcoxon-type rank loss and then apply a one-step adaptive reweighting scheme to reduce the shrinkage bias of the initial $\ell_1$ fit. The resulting procedure combines robustness to heavy-tailed response errors with the contamination geometry induced by the empirical principal-components basis. Our main theorem gives a prediction bound for the fixed-$λ$ second-stage fitted mean. Simulations show that the rank-based procedure is competitive under Gaussian noise and substantially more stable under heavy-tailed errors, especially when predictor contamination is present.

2604.04802 2026-04-07 cs.IT cs.LG eess.SP math.IT math.PR stat.ML

Partially deterministic sampling for compressed sensing with denoising guarantees

Yaniv Plan, Matthew S. Scott, Ozgur Yilmaz

Abstract

We study compressed sensing when the sampling vectors are chosen from the rows of a unitary matrix. In the literature, these sampling vectors are typically chosen randomly; the use of randomness has enabled major empirical and theoretical advances in the field. However, in practice there are often certain crucial sampling vectors, in which case practitioners will depart from the theory and sample such rows deterministically. In this work, we derive an optimized sampling scheme for Bernoulli selectors which naturally combines random and deterministic selection of rows, thus rigorously deciding which rows should be sampled deterministically. This sampling scheme provides measurable improvements in image compressed sensing for both generative and sparse priors when compared to with-replacement and without-replacement sampling schemes, as we show with theoretical results and numerical experiments. Additionally, our theoretical guarantees feature improved sample complexity bounds compared to previous works, and novel denoising guarantees in this setting.

2604.04785 2026-04-07 math.ST stat.ME stat.TH

High Dimensional Bootstrap and Asymptotic Expansion for the $k$-th Largest Coordinate

Long Feng

Abstract

We study bootstrap inference for the $k$th largest coordinate of a normalized sum of independent high-dimensional random vectors. Existing second-order theory for maxima does not directly extend to order statistics, because the event $\{T_{n,[k]}\le t\}$ is not a rectangle and its local structure is governed by exceedance counts rather than by a single boundary. We develop an approach based on factorial moments and weighted inclusion--exclusion that reduces the problem to a collection of rare-orthant probabilities and allows high-dimensional Edgeworth and Cornish--Fisher expansions to be transferred to the order-statistic setting. Under moment, variance, and weak-dependence conditions, we derive a second-order coverage expansion for wild-bootstrap critical values of the $k$th order statistic. In particular, a third-moment matching wild bootstrap achieves coverage error of order $n^{-1}$ up to logarithmic factors, and the same second-order accuracy is obtained for a prepivoted double wild bootstrap. We also show that the maximal-correlation condition can be replaced by a stationary Gaussian exponential-mixing assumption at the price of an explicit dependence remainder $r_d$, and this remainder can itself be of order $n^{-1}$ when the dimension is sufficiently large relative to the sample size. These results extend recent second-order Gaussian and bootstrap approximation theory from maxima to the $k$th order statistic in high dimension.

2604.04755 2026-04-07 stat.ME

Active Sequential Signal Detection with Asynchronous Decisions

Yiming Xing, Georgios Fellouris

Comments 13 pages, 3 figures

Abstract

This work considers the problem of detecting signals from multiple sequentially observed data streams, where only one stream can be observed at every time instant. The goal is to detect signals as quickly as possible while controlling the global probabilities of false alarm and missed detection. In this active sampling setup, it is impossible to minimize the expected detection time simultaneously for every signal, so we formulate a novel set of performance criteria that aim to minimize the expectations of the order statistics of the detection times. A novel procedure is proposed, which incorporates an exploration mechanism into a "follow-the-leader" procedure, and is shown to optimize all the criteria asymptotically as the global error probabilities go to zero. Its finite-sample performance is compared with existing and oracle procedures in simulation studies.

2604.04726 2026-04-07 stat.ML cs.LG eess.SP

A Muon-Accelerated Algorithm for Low Separation Rank Tensor Generalized Linear Models

Xiao Liang, Shuang Li

Abstract

Tensor-valued data arise naturally in multidimensional signal and imaging problems, such as biomedical imaging. When incorporated into generalized linear models (GLMs), naive vectorization can destroy their multi-way structure and lead to high-dimensional, ill-posed estimation. To address this challenge, Low Separation Rank (LSR) decompositions reduce model complexity by imposing low-rank multilinear structure on the coefficient tensor. A representative approach for estimating LSR-based tensor GLMs (LSR-TGLMs) is the Low Separation Rank Tensor Regression (LSRTR) algorithm, which adopts block coordinate descent and enforces orthogonality of the factor matrices through repeated QR-based projections. However, the repeated projection steps can be computationally demanding and slow convergence. Motivated by the need for scalable estimation and classification from such data, we propose LSRTR-M, which incorporates Muon (MomentUm Orthogonalized by Newton-Schulz) updates into the LSRTR framework. Specifically, LSRTR-M preserves the original block coordinate scheme while replacing the projection-based factor updates with Muon steps. Across synthetic linear, logistic, and Poisson LSR-TGLMs, LSRTR-M converges faster in both iteration count and wall-clock time, while achieving lower normalized estimation and prediction errors. On the Vessel MNIST 3D task, it further improves computational efficiency while maintaining competitive classification performance.
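The orthogonalization that gives Muon its name can be illustrated with the classical cubic Newton-Schulz iteration. The sketch below is a generic illustration, not the LSRTR-M update from the paper: the function names, the Frobenius-norm prescaling, the step count, the example matrix, and the choice of the cubic polynomial (rather than a tuned higher-order one) are all assumptions.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(g, steps=30):
    """Approximate the orthogonal polar factor of g (assumed full column rank).

    Uses the classical cubic iteration X <- 1.5 X - 0.5 X (X^T X).
    """
    # Prescale by the Frobenius norm so every singular value lies in (0, 1].
    fro = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        xtx = matmul(transpose(x), x)
        x_xtx = matmul(x, xtx)
        x = [[1.5 * a - 0.5 * b for a, b in zip(ra, rb)]
             for ra, rb in zip(x, x_xtx)]
    return x


# Hypothetical 3x2 matrix standing in for a momentum buffer.
G = [[2.0, 0.5], [0.0, 1.0], [1.0, -1.0]]
Q = newton_schulz_orthogonalize(G)
QtQ = matmul(transpose(Q), Q)  # should be close to the 2x2 identity
```

Each step needs only matrix products, which is the usual argument for preferring this kind of iteration over a QR-based projection on accelerator hardware.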

2604.04717 2026-04-07 cs.LG cond-mat.mtrl-sci cs.AI stat.ML

The Infinite-Dimensional Nature of Spectroscopy and Why Models Succeed, Fail, and Mislead

Umberto Michelucci, Francesca Venturini

Abstract

Machine learning (ML) models have achieved strikingly high accuracies in spectroscopic classification tasks, often without a clear proof that those models used chemically meaningful features. Existing studies have linked these results to data preprocessing choices, noise sensitivity, and model complexity, but no unifying explanation is available so far. In this work, we show that these phenomena arise naturally from the intrinsic high dimensionality of spectral data. Using a theoretical analysis grounded in the Feldman-Hajek theorem and the concentration of measure, we show that even infinitesimal distributional differences, caused by noise, normalisation, or instrumental artefacts, may become perfectly separable in high-dimensional spaces. Through a series of specific experiments on synthetic and real fluorescence spectra, we illustrate how models can achieve near-perfect accuracy even when chemical distinctions are absent, and why feature-importance maps may highlight spectrally irrelevant regions. We provide a rigorous theoretical framework, confirm the effect experimentally, and conclude with practical recommendations for building and interpreting ML models in spectroscopy.

2604.04673 2026-04-07 math.ST cs.LG stat.ML stat.TH

Minimaxity and Admissibility of Bayesian Neural Networks

Daniel Andrew Coulson, Martin T. Wells

Comments 95 pages and 6 figures

Abstract

Bayesian neural networks (BNNs) offer a natural probabilistic formulation for inference in deep learning models. Despite their popularity, their optimality has received limited attention through the lens of statistical decision theory. In this paper, we study decision rules induced by deep, fully connected feedforward ReLU BNNs in the normal location model under quadratic loss. We show that, for fixed prior scales, the induced Bayes decision rule is not minimax. We then propose a hyperprior on the effective output variance of the BNN prior that yields a superharmonic square-root marginal density, establishing that the resulting decision rule is simultaneously admissible and minimax. We further extend these results from the quadratic loss setting to the predictive density estimation problem with Kullback--Leibler loss. Finally, we validate our theoretical findings numerically through simulation.

2604.04638 2026-04-07 math.ST stat.TH

Joint Estimation in Potts Model

Somabha Mukherjee, Sumit Mukherjee, Sayar Karmakar

Comments 60 pages, 1 figure

Abstract

In this paper, we study estimation of parameters in a two-parameter Potts model with $q$ colors and coupling matrix $A_N$. We characterize concrete sufficient conditions for existence of the pseudo-likelihood estimator of the Potts model, in terms of the local magnetic fields, and give sufficient conditions for the validity of the above characterization. We then provide sufficient criteria for estimation of both parameters at the optimal rate $\sqrt{N}$. In particular, if $A_N$ is the scaled adjacency matrix of a graph $G_N$, then we show that joint estimation is possible if either $G_N$ has bounded degree or is irregular. In contrast, we give an example of a graph sequence $G_N$ which is approximately regular and dense, where no consistent estimator exists. We also show that one-parameter estimation at the optimal rate $\sqrt{N}$ holds under much milder conditions when the other parameter is known. Along the way, we develop a concentration result for mean-field Potts models using the framework of nonlinear large deviations. Compared to the Ising case, our results for the Potts case require a novel analysis across multiple colors.

2604.04588 2026-04-07 stat.ML cs.IT cs.LG math.IT math.OC math.ST stat.TH

Noisy Nonreciprocal Pairwise Comparisons: Scale Variation, Noise Calibration, and Admissible Ranking Regions

Jean-Pierre Magnot

Abstract

Pairwise comparisons are widely used in decision analysis, preference modeling, and evaluation problems. In many practical situations, the observed comparison matrix is not reciprocal. This lack of reciprocity is often treated as a defect to be corrected immediately. In this article, we adopt a different point of view: part of the nonreciprocity may reflect a genuine variation in the evaluation scale, while another part is due to random perturbations. We introduce an additive model in which the unknown underlying comparison matrix is consistent but not necessarily reciprocal. The reciprocal component carries the global ranking information, whereas the symmetric component describes possible scale variation. Around this structured matrix, we add a random perturbation and show how to estimate the noise level, assess whether the scale variation remains moderate, and assign probabilities to admissible ranking regions in the sense of strict ranking by pairwise comparisons. We also compare this approach with the brutal projection onto reciprocal matrices, which suppresses all symmetric information at once. The Gaussian perturbation model is used here not because human decisions are exactly Gaussian, but because observed judgment errors often result from the accumulation of many small effects. In such a context, the central limit principle provides a natural heuristic justification for Gaussian noise. This makes it possible to derive explicit estimators and probability assessments while keeping the model interpretable for decision problems.
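The split into a reciprocal part and a symmetric part can be sketched in a few lines. This assumes an additive (log-scale) convention in which reciprocity means a_ij = -a_ji, so that the reciprocal component is the antisymmetric part of the matrix; the abstract does not spell this out. The example matrix and the row-mean scoring rule are likewise hypothetical, and no noise estimation is attempted.

```python
def split_comparison_matrix(a):
    """Split an additive comparison matrix into reciprocal and symmetric parts."""
    n = len(a)
    reciprocal = [[(a[i][j] - a[j][i]) / 2.0 for j in range(n)] for i in range(n)]
    symmetric = [[(a[i][j] + a[j][i]) / 2.0 for j in range(n)] for i in range(n)]
    return reciprocal, symmetric


def row_mean_scores(reciprocal):
    """Assumed scoring rule: average each row of the reciprocal part."""
    return [sum(row) / len(row) for row in reciprocal]


# Hypothetical observed matrix: entry (i, j) says how much item i beats item j.
A = [
    [0.0, 1.2, 2.1],
    [-0.8, 0.0, 1.0],
    [-1.9, -1.0, 0.0],
]
R, S = split_comparison_matrix(A)
scores = row_mean_scores(R)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

Projecting straight onto reciprocal matrices amounts to keeping `R` and discarding `S`, which is the information the abstract argues should be inspected before it is thrown away.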

2604.04529 2026-04-07 stat.ME econ.EM

Dynamic Factor Stochastic Volatility-in-Mean VAR for Large Macroeconomic Panels

Daichi Hiraki, Siddhartha Chib, Yasuhiro Omori

Comments 72 pages, 27 figures, 22 tables

Abstract

We develop a dynamic factor stochastic volatility-in-mean (SVM) specification for vector autoregressions (VARs) that embeds an SVM component within a dynamic factor stochastic volatility structure. A small number of latent volatility factors capture common movements in conditional variances, while volatility enters the conditional mean of the VAR. This specification allows time-varying uncertainty to influence macroeconomic dynamics through both second moments and expected outcomes while preserving tractability in large panels. We construct an efficient Markov chain Monte Carlo algorithm for estimation in this high-dimensional, non-Gaussian setting. Using quarterly data on twenty variables from the FRED-QD database, we compare predictive performance with the benchmark stochastic volatility VAR model. The dynamic factor SVM specification delivers superior forecasts for more variables during major macroeconomic disruptions such as the 2008 global financial crisis. The results indicate that allowing volatility to enter the mean captures an important transmission channel in macroeconomic dynamics.

2604.04517 2026-04-07 stat.ME econ.EM stat.CO

Unified Mixture Sampler for State-Space Models: Application to Stochastic Conditional Duration Models

Daichi Hiraki, Yasuhiro Omori

Comments 15 pages, 2 figures, 6 tables

Abstract

We propose a unified mixture sampler (UMS) that provides a universal estimation framework for nonlinear state-space models with "exp-exp" likelihood kernels. Unlike existing methods that require deriving new mixture approximations for each specific distribution, our approach dynamically adapts the standard ten-component mixture from Omori et al. (2007) through a deterministic re-centering and rescaling algorithm. Applying this to the stochastic conditional duration (SCD) model, we demonstrate that the proposed sampler can efficiently handle unknown shape parameters - such as those in Weibull or Gamma distributions - by updating mixture components near-instantaneously during MCMC iterations. The UMS not only simplifies implementation but also ensures exact inference via a lightweight Metropolis-Hastings step. Numerical examples show that our method substantially outperforms the conventional slice sampling approach, significantly reducing autocorrelation in MCMC samples while maintaining high computational efficiency. This unified framework encompasses a wide range of applications, including logit, Poisson, and various SCD model specifications, providing a highly efficient alternative to model-specific samplers.

2604.04431 2026-04-07 stat.CO

iLBA: An R package for confidentially disseminating aggregated frequency tables

Jeehyun Hwang, Dongsun Yoon, Sungkyu Jung, Min-Jeong Park, Inkwon Yeo

Abstract

Statistical agencies frequently release frequency tables derived from microdata, but small frequency cells may lead to disclosure risks. We present \texttt{iLBA}, an open-source \textsf{R} package for confidential dissemination of aggregated frequency tables. The package implements the Information-Loss-Bounded Aggregation (iLBA) algorithm, which combines Small Cell Adjustment (SCA) at the finest level table with an aggregation procedure that introduces controlled ambiguity while bounding information loss. The software enables users to construct masked finest level tables, generate confidential aggregated tables for selected variables, and obtain masked frequencies for single-cell queries. By providing an accessible implementation of the iLBA method, the package facilitates reproducible and efficient disclosure control for tabular data derived from microdata.

2604.04410 2026-04-07 cs.LG cs.AI cs.CL stat.ML

Relative Density Ratio Optimization for Stable and Statistically Consistent Model Alignment

Hiroshi Takahashi, Tomoharu Iwata, Atsutoshi Kumagai, Sekitoshi Kanai, Masanori Yamada, Kosuke Nishida, Kazutoshi Shinoda

Comments Code is available at https://github.com/takahashihiroshi/rdro

Abstract

Aligning language models with human preferences is essential for ensuring their safety and reliability. Although most existing approaches assume specific human preference models such as the Bradley-Terry model, this assumption may fail to accurately capture true human preferences, and consequently, these methods lack statistical consistency, i.e., the guarantee that language models converge to the true human preference as the number of samples increases. In contrast, direct density ratio optimization (DDRO) achieves statistical consistency without assuming any human preference models. DDRO models the density ratio between preferred and non-preferred data distributions using the language model, and then optimizes it via density ratio estimation. However, this density ratio is unstable and often diverges, leading to training instability of DDRO. In this paper, we propose a novel alignment method that is both stable and statistically consistent. Our approach is based on the relative density ratio between the preferred data distribution and a mixture of the preferred and non-preferred data distributions. Our approach is stable since this relative density ratio is bounded above and does not diverge. Moreover, it is statistically consistent and yields significantly tighter convergence guarantees than DDRO. We experimentally show its effectiveness with Qwen 2.5 and Llama 3.
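The boundedness claim is easy to check numerically. The sketch below compares a plain density ratio p/q with a relative density ratio p / (αp + (1-α)q), which can never exceed 1/α. The mixing weight α, the one-dimensional Gaussian stand-ins for the preferred and non-preferred distributions, and all function names are assumptions made for illustration; this is not the training objective from the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def density_ratio(p, q):
    """Plain ratio p/q: grows without bound where q is tiny relative to p."""
    return p / q


def relative_density_ratio(p, q, alpha):
    """Ratio of p to the mixture alpha*p + (1-alpha)*q; at most 1/alpha."""
    return p / (alpha * p + (1.0 - alpha) * q)


ALPHA = 0.5  # hypothetical mixing weight
xs = [i / 10.0 for i in range(-50, 101)]
preferred = [gaussian_pdf(x, 2.0, 1.0) for x in xs]      # stand-in for p
non_preferred = [gaussian_pdf(x, 0.0, 1.0) for x in xs]  # stand-in for q

plain = [density_ratio(p, q) for p, q in zip(preferred, non_preferred)]
relative = [relative_density_ratio(p, q, ALPHA)
            for p, q in zip(preferred, non_preferred)]
```

On this grid the plain ratio spans many orders of magnitude, while the relative ratio stays within (0, 1/α].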

2604.04365 2026-04-07 math.ST stat.ML stat.TH

Attributed Network Alignment: Statistical Limits and Efficient Algorithm

Dong Huang, Chenyang Tian, Pengkun Yang

Comments 53 pages, 8 figures

Abstract

This paper studies the problem of recovering a hidden vertex correspondence between two correlated graphs when both edge weights and node features are observed. While most existing work on graph alignment relies primarily on edge information, many real-world applications provide informative node features in addition to graph topology. To capture this setting, we introduce the featured correlated Gaussian Wigner model, where two graphs are coupled through an unknown vertex permutation, and the node features are correlated under the same permutation. We characterize the optimal information-theoretic thresholds for exact recovery and partial recovery of the latent mapping. On the algorithmic side, we propose QPAlign, an algorithm based on a quadratic programming relaxation, and demonstrate its strong empirical performance on both synthetic and real datasets. Moreover, we also derive theoretical guarantees for the proposed procedure, supporting its reliability and providing convergence guarantees.

2604.04342 2026-04-07 cs.LG stat.ML

Generative models for decision-making under distributional shift

Xiuyuan Cheng, Yunqin Zhu, Yao Xie

Comments Under review for INFORMS TutORials in Operations Research, 2026

Abstract

Many data-driven decision problems are formulated using a nominal distribution estimated from historical data, while performance is ultimately determined by a deployment distribution that may be shifted, context-dependent, partially observed, or stress-induced. This tutorial presents modern generative models, particularly flow- and score-based methods, as mathematical tools for constructing decision-relevant distributions. From an operations research perspective, their primary value lies not in unconstrained sample synthesis but in representing and transforming distributions through transport maps, velocity fields, score fields, and guided stochastic dynamics. We present a unified framework based on pushforward maps, continuity, Fokker-Planck equations, Wasserstein geometry, and optimization in probability space. Within this framework, generative models can be used to learn nominal uncertainty, construct stressed or least-favorable distributions for robustness, and produce conditional or posterior distributions under side information and partial observation. We also highlight representative theoretical guarantees, including forward-reverse convergence for iterative flow models, first-order minimax analysis in transport-map space, and error-transfer bounds for posterior sampling with generative priors. The tutorial provides a principled introduction to using generative models for scenario generation, robust decision-making, uncertainty quantification, and related problems under distributional shift.

2604.04315 2026-04-07 stat.ME stat.CO

Mean--Variance Risk-Aware Bayesian Optimal Experimental Design for Nonlinear Models

Wanggang Shen, Xun Huan

Comments 36 pages, 31 figures

Abstract

We propose a variance-penalized formulation of Bayesian optimal experimental design for nonlinear models that augments the classical expected utility criterion with a penalty on utility variability, yielding a mean--variance objective that promotes robust experimental performance. To evaluate this objective, we develop Monte Carlo estimators for the expected utility, its second moment, and the resulting utility variance using prior sampling, thereby avoiding explicit posterior sampling. We then derive leading-order bias and variance expressions using conditional delta-method arguments. The objective is optimized using Bayesian optimization with common random samples to reduce noise. Numerical examples, including a linear-Gaussian benchmark, a nonlinear test problem, and contaminant source inversion in diffusion fields, demonstrate that the proposed approach identifies designs with substantially reduced variability while maintaining competitive expected utility.
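A mean-variance objective of this kind can be sketched from utility samples alone, using the first and second moments. The penalty weight, the function name, and the utility draws below are hypothetical, and the sketch ignores the nested prior-sampling estimators and the bias analysis that the paper develops.

```python
def mean_variance_objective(utilities, penalty):
    """Sample mean of the utilities minus penalty times their sample variance."""
    n = len(utilities)
    mean = sum(utilities) / n
    second_moment = sum(u * u for u in utilities) / n
    variance = second_moment - mean * mean  # Var[U] = E[U^2] - (E[U])^2
    return mean - penalty * variance


# Hypothetical utility draws for two candidate designs.
design_a = [1.0, 3.0, 5.0, 7.0]  # higher mean, much higher spread
design_b = [3.0, 3.5, 4.0, 3.5]  # slightly lower mean, low spread

score_a = mean_variance_objective(design_a, penalty=0.5)
score_b = mean_variance_objective(design_b, penalty=0.5)
```

With a penalty of 0.5 the second design scores higher despite its lower mean, which is the trade-off the variance penalty is meant to expose.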

2604.04302 2026-04-07 stat.ME cs.LG

CavMerge: Merging K-means Based on Local Log-Concavity

Zhili Qiao, Wangqian Ju, Peng Liu

Abstract

K-means clustering, a classic and widely used clustering technique, is known to perform poorly on data that are not linearly separable. Numerous adjustments and modifications have been proposed to address this issue, including methods that merge K-means results from a relatively large K to obtain a final cluster assignment. However, existing methods of this nature are often computationally inefficient and require hyperparameter tuning. Here we present \emph{CavMerge}, a novel K-means merging algorithm that is intuitive, free of parameter tuning, and computationally efficient. Operating under minimal local distributional assumptions, our algorithm enjoys strong consistency and rapid convergence guarantees. Empirical studies on various simulated and real datasets demonstrate that our method yields more reliable clusters in comparison to current state-of-the-art algorithms.

2604.04294 2026-04-07 stat.ME stat.CO

Simulated Annealing for Model-Robust Partial Profile Choice Designs in Healthcare Preference Studies

Yicheng Mao, Roselinde Kessels

Abstract

Discrete Choice Experiments (DCEs) investigate participants' preferences by observing their choice behavior in hypothetical scenarios and are widely used in the domain of healthcare. To reduce participants' cognitive burden, especially when dealing with a large number of attributes, researchers often employ partial profile designs. In these designs, certain attributes within each choice set are kept constant. Current literature on partial profile designs mainly focuses on main-effects models rather than interaction-effects models, with certain partial profile designs even incapable of estimating interaction effects. To address this issue, this paper introduces a Simulated Annealing (SA) algorithm to construct partial profile designs based on an interaction-effects model. During the experimental design phase, the existence and magnitude of interaction effects are often unknown. Therefore, this paper proposes a model-robust experimental design strategy. Through extensive simulation experiments and a real-life case study, we demonstrate that our SA model-robust partial profile design performs relatively well regardless of the underlying model.

2604.04278 2026-04-07 stat.ME math.ST stat.TH

Efficient estimation of relative risk, odds ratio and their logarithms for rare events

Luis Mendo

Comments 28 pages, 9 figures

Abstract

Sequential estimators are proposed for the relative risk, odds ratio, log relative risk or log odds ratio of a dichotomous attribute in two populations. The estimators take the same number of observations from each population, and guarantee that the relative mean-square error for the relative risk or odds ratio, or the mean-square error for their logarithmic versions, is less than a given target. The efficiency of the estimators, defined in terms of the Cramér-Rao bound, is high when the considered attribute is rare or moderately rare.
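As a reminder of the four target quantities, the sketch below computes them from two known probabilities. It shows only the standard definitions, not the sequential estimators proposed in the paper; the function names and the probabilities are hypothetical.

```python
import math


def relative_risk(p1, p2):
    """Ratio of the attribute's probabilities in the two populations."""
    return p1 / p2


def odds_ratio(p1, p2):
    """Ratio of the odds p/(1-p) in the two populations."""
    return (p1 / (1.0 - p1)) / (p2 / (1.0 - p2))


# Hypothetical rare-event probabilities in populations 1 and 2.
p1, p2 = 0.02, 0.01

rr = relative_risk(p1, p2)
odds = odds_ratio(p1, p2)
log_rr = math.log(rr)
log_or = math.log(odds)
```

For rare attributes the odds ratio is close to the relative risk, because 1 - p is close to 1 in both populations.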

2604.04274 2026-04-07 cs.AI cs.CE stat.AP

InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI

Can Wang, Hongyu Zhao, Yiqun Chen

Abstract

Causal inference is central to scientific discovery, yet choosing appropriate methods remains challenging because of the complexity of both statistical methodology and real-world data. Inspired by the success of artificial intelligence in accelerating scientific discovery, we introduce InferenceEvolve, an evolutionary framework that uses large language models to discover and iteratively refine causal methods. Across widely used benchmarks, InferenceEvolve yields estimators that consistently outperform established baselines: against 58 human submissions in a recent community competition, our best evolved estimator lay on the Pareto frontier across two evaluation metrics. We also developed robust proxy objectives for settings without semi-synthetic outcomes, with competitive results. Analysis of the evolutionary trajectories shows that agents progressively discover sophisticated strategies tailored to unrevealed data-generating mechanisms. These findings suggest that language-model-guided evolution can optimize structured scientific programs such as causal inference, even when outcomes are only partially observed.

2604.04272 2026-04-07 math.ST stat.TH

Theoretical Foundations of Principal Manifold Estimation with Non-Euclidean Templates

Kun Meng, Christopher Perez

Comments 111 pages

Abstract

We develop a rigorous theoretical framework for principal manifold estimation that recovers a latent low-dimensional manifold from a point cloud observed in a high-dimensional ambient space. Our framework accommodates manifolds with general, potentially non-Euclidean topology, which can be inferred using tools from topological data analysis. Using the theory of Sobolev spaces on Riemannian manifolds, we establish that the proposed principal manifolds are well defined, prove convergence of the iterative algorithm used to compute them, and show consistency of the finite-sample estimator. Furthermore, we introduce a novel method for selecting the complexity level of a fitted manifold, which addresses the shortcomings of the classical fitting-error criterion. We also provide a detailed geometric interpretation of the penalty term in our framework. In addition to the theoretical developments, we present extensive numerical experiments supporting our results. This article provides theoretical foundations for approaches that have been used in applications such as robotics. More importantly, it extends these approaches to general topological settings with potential applications across a broad range of disciplines, including neuroimaging and shape data analysis.

2604.04264 2026-04-07 stat.ML cs.IT cs.LG eess.SP math.IT stat.AP

Avoiding Non-Integrable Beliefs in Expectation Propagation

Zilu Zhao, Jichao Chen, Dirk Slock

Abstract

Expectation Propagation (EP) is a widely used iterative message-passing algorithm that decomposes a global inference problem into multiple local ones. It approximates marginal distributions as "beliefs" using intermediate functions called "messages". It has been shown that the stationary points of EP are the same as those of the corresponding constrained Bethe Free Energy (BFE) optimization problem. Therefore, EP is an iterative method for optimizing the constrained BFE. However, the iterative method may leave the feasible set of the BFE optimization problem, i.e., the beliefs become non-integrable. In most of the literature, the authors use various methods to keep all the messages integrable. In most Bayesian estimation problems, limiting the messages to be integrable shrinks the actual feasible set. Furthermore, in extreme cases where the factors are not integrable, making the message itself integrable is not enough to have integrable beliefs. In this paper, two EP frameworks are proposed to ensure that EP has integrable beliefs. Both methods allow non-integrable messages. We then investigate the signal recovery problem in the Generalized Linear Model (GLM) using our proposed methods.

2604.04228 2026-04-07 math.ST cs.DS stat.ML stat.TH

Robust Regression with Adaptive Contamination in Response: Optimal Rates and Computational Barriers

Ilias Diakonikolas, Chao Gao, Daniel M. Kane, Ankit Pensia, Dong Xie

Abstract

We study robust regression under a contamination model in which covariates are clean while the responses may be corrupted in an adaptive manner. Unlike Huber's classical contamination model, where both covariates and responses may be contaminated and consistent estimation is impossible when the contamination proportion is a non-vanishing constant, it turns out that the clean-covariate setting admits strictly improved statistical guarantees. Specifically, we show that the additional information in the clean covariates can be carefully exploited to construct an estimator that achieves a better estimation rate than that attainable under Huber contamination. In contrast to the Huber model, this improved rate implies consistency even when the contamination proportion is a constant. A matching minimax lower bound is established using Fano's inequality together with the construction of contamination processes that match $m > 2$ distributions simultaneously, extending the previous two-point lower bound argument in Huber's setting. Despite the improvement over the Huber model from an information-theoretic perspective, we provide formal evidence -- in the form of Statistical Query and Low-Degree Polynomial lower bounds -- that the problem exhibits strong information-computation gaps. Our results strongly suggest that the information-theoretic improvements cannot be achieved by polynomial-time algorithms, revealing a fundamental gap between information-theoretic and computational limits in robust regression with clean covariates.

2604.04218 2026-04-07 stat.ML cs.LG math.ST stat.TH

Sharp asymptotic theory for Q-learning with LDTZ learning rate and its generalization

Soham Bonnerjee, Zhipeng Lou, Wei Biao Wu

Journal ref
ICLR 2026, Main Conference Track, Poster
Abstract

Despite the sustained popularity of Q-learning as a practical tool for policy determination, a majority of relevant theoretical literature deals with either constant ($η_{t}\equiv η$) or polynomially decaying ($η_{t} = ηt^{-α}$) learning schedules. However, it is well known that these choices suffer from either persistent bias or prohibitively slow convergence. In contrast, the recently proposed linear decay to zero (\texttt{LD2Z}: $η_{t,n}=η(1-t/n)$) schedule has shown appreciable empirical performance, but its theoretical and statistical properties remain largely unexplored, especially in the Q-learning setting. We address this gap in the literature by first considering a general class of power-law decay to zero (\texttt{PD2Z}-$ν$: $η_{t,n}=η(1-t/n)^ν$). Proceeding step-by-step, we present a sharp non-asymptotic error bound for Q-learning with \texttt{PD2Z}-$ν$ schedule, which then is used to derive a central limit theory for a new \textit{tail} Polyak-Ruppert averaging estimator. Finally, we also provide a novel time-uniform Gaussian approximation (also known as \textit{strong invariance principle}) for the partial sum process of Q-learning iterates, which facilitates bootstrap-based inference. All our theoretical results are complemented by extensive numerical experiments. Beyond being new theoretical and statistical contributions to the Q-learning literature, our results definitively establish that \texttt{LD2Z} and in general \texttt{PD2Z}-$ν$ achieve a best-of-both-worlds property: they inherit the rapid decay from initialization (characteristic of constant step-sizes) while retaining the asymptotic convergence guarantees (characteristic of polynomially decaying schedules). This dual advantage explains the empirical success of \texttt{LD2Z} while providing practical guidelines for inference through our results.
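
The four step-size schedules named in the abstract can be written out directly from their formulas. The following is a minimal sketch, assuming a horizon `n` fixed in advance and a step index `t` with `0 <= t <= n` (and `t >= 1` for the polynomial schedule); the function names are illustrative and not taken from the paper.

```python
def constant_rate(eta, t):
    """Constant schedule: eta_t = eta for every step t."""
    return eta


def polynomial_rate(eta, alpha, t):
    """Polynomially decaying schedule: eta_t = eta * t^(-alpha), for t >= 1."""
    return eta * t ** (-alpha)


def pd2z_rate(eta, nu, t, n):
    """Power-law decay to zero (PD2Z-nu): eta_{t,n} = eta * (1 - t/n)^nu."""
    return eta * (1.0 - t / n) ** nu


def ld2z_rate(eta, t, n):
    """Linear decay to zero (LD2Z): the special case nu = 1 of PD2Z-nu."""
    return pd2z_rate(eta, 1.0, t, n)
```

Unlike the constant and polynomial schedules, both decay-to-zero schedules depend on the total number of steps `n`, so the horizon has to be known before the run starts.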

2604.04181 2026-04-07 stat.ME

Variance Reduction Methods for Dirichlet Expectations

Ayeong Lee

Abstract

Dirichlet distributions are probability measures on the unit simplex. They are often used as prior distributions in modeling categorical data, such as in topic analysis of text data. Motivated by this application, we consider Monte Carlo estimation of expectations $\mathbb{E}[\exp(nH(θ))]$, where $θ$ has a Dirichlet distribution, $H$ is a real-valued function, and $n$ is a parameter. We develop variance reduction techniques particularly designed to work well for large $n$. Our analysis is guided by the Laplace method for approximating integrals, which we extend to fit our problem setting. We develop an importance sampling method that achieves a near-optimal asymptotic relative error. We use related ideas to select a provably effective control variate. We illustrate these results through their application in topic analysis.
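
For orientation, the plain Monte Carlo baseline that such variance reduction methods aim to improve on can be sketched as follows. This is not the paper's importance sampler or control variate. It is a naive estimator of $\log \mathbb{E}[\exp(nH(θ))]$, assuming a user-supplied function `H` (hypothetical here) and drawing Dirichlet samples by normalizing independent Gamma variates.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing Gamma(alpha_i, 1) variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def naive_mc_log_expectation(H, alpha, n, num_samples, seed=0):
    """Naive Monte Carlo estimate of log E[exp(n * H(theta))].

    The average is computed in log space (log-sum-exp) so that large n
    does not overflow the exponential.
    """
    rng = random.Random(seed)
    exponents = [n * H(sample_dirichlet(alpha, rng)) for _ in range(num_samples)]
    m = max(exponents)
    mean_scaled = sum(math.exp(v - m) for v in exponents) / num_samples
    return m + math.log(mean_scaled)


# Illustrative H (an assumption, not from the paper): negative squared
# distance from the centre of the 3-dimensional simplex.
def example_H(theta):
    return -sum((x - 1.0 / 3.0) ** 2 for x in theta)


estimate = naive_mc_log_expectation(example_H, [1.0, 1.0, 1.0], n=50, num_samples=2000)
```

For large `n` the integrand concentrates near the maximizer of `H`, so few plain samples land where it matters. That is the regime in which the abstract's importance sampling and control variates are designed to help.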

2604.04156 2026-04-07 stat.AP

Two-Sample Testing for Multivariate Cross-Correlation Functions with Applications to Gut-Brain Reward Learning

Bhaskar Ray, Tùng Bùi, William Matthew Howe, Srijan Sengupta

Abstract

Cross-correlation functions (CCFs) are classical tools for studying lead-lag relationships between paired time series, but they are most often used descriptively rather than inferentially. Motivated by mouse experiments on gut-brain interactions in reward learning, we carry out a two-sample hypothesis test for formal statistical inference on collections of subject-specific CCF curves. In our application, each experimental session yields two related CCFs describing the temporal association of dopamine activity with locomotor velocity and acceleration, which leads naturally to a multivariate functional data formulation. We treat each empirical CCF as a functional observation indexed by lag and test equality of mean multivariate CCF functions across groups using integrated and maximum-type global statistics, \(F_{\mathrm{int}}\) and \(F_{\max}\), constructed from pointwise Hotelling \(T^2\) statistics. The integrated test targets broad differences across the lag domain, whereas the maximum test is sensitive to local differences. Applied to free-feeding and intragastric infusion datasets, the proposed methods detect substantial differences in dopamine-locomotion coupling across brain region and biological sex in the free-feeding experiment, with more selective effects in the infusion setting. The proposed framework provides a flexible and rigorous FDA-based approach for comparing dynamic dependence structures across experimental conditions.

2604.04155 2026-04-07 cs.LG cs.IT math.IT q-bio.QM stat.ML

The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models

Prashant C. Raju

Abstract

Foundation models for biology and physics optimize predictive accuracy, but their internal representations systematically fail to preserve the continuous geometry of the systems they model. We identify the root cause: the Geometric Alignment Tax, an intrinsic cost of forcing continuous manifolds through discrete categorical bottlenecks. Controlled ablations on synthetic dynamical systems demonstrate that replacing cross-entropy with a continuous head on an identical encoder reduces geometric distortion by up to 8.5x, while learned codebooks exhibit a non-monotonic double bind where finer quantization worsens geometry despite improving reconstruction. Under continuous objectives, three architectures differ by 1.3x; under discrete tokenization, they diverge by 3,000x. Evaluating 14 biological foundation models with rate-distortion theory and MINE, we identify three failure regimes: Local-Global Decoupling, Representational Compression, and Geometric Vacuity. A controlled experiment confirms that Evo 2's reverse-complement robustness on real DNA reflects conserved sequence composition, not learned symmetry. No model achieves simultaneously low distortion, high mutual information, and global coherence.

2604.04118 2026-04-07 math.ST stat.TH

Heavy Tailed Homogeneous Structural Causal Models

Vishal Routh, Shuyang Bai

Abstract

We consider causal discovery in structural causal models driven by heavy-tailed noise, where extremes carry important information about causal direction. We introduce the Heavy-Tailed Homogeneous Structural Causal Model (HT-HSCM), a unified framework that generalizes heavy-tailed linear and max-linear models. We demonstrate that causal tail coefficients identify the complete ancestral partial order of the underlying directed acyclic graph. We also formulate a recursive algorithm for recovering quantities associated with the model called ancestral impulse-responses from the causal tail coefficients. Our results provide a general and theoretically justified framework for causal discovery in heavy-tailed systems.

2604.00848 2026-04-07 stat.OT math.ST stat.ME stat.ML stat.TH

Debiased Estimators in High-Dimensional Regression: A Review and Replication of Javanmard and Montanari (2014)

Benjamin Smith

Abstract

High-dimensional statistical settings ($p \gg n$) pose fundamental challenges for classical inference, largely due to bias introduced by regularized estimators such as the LASSO. To address this, Javanmard and Montanari (2014) propose a debiased estimator that enables valid hypothesis testing and confidence interval construction. This report examines their debiased LASSO framework, which yields asymptotically normal estimators in high-dimensional settings. The key theoretical results underlying this approach are presented, specifically the construction of an optimized debiased estimator that restores asymptotic normality and thereby enables the computation of valid confidence intervals and $p$-values. To evaluate the claims of Javanmard and Montanari, a subset of the original simulation study and the real-data analysis is presented. The original empirical analysis is extended to the desparsified LASSO, which is referenced but not implemented in the original study. The results demonstrate that while the debiased LASSO achieves reliable coverage and controls Type I error, the LASSO projection estimator can offer improved power in idealized low-signal settings without compromising error rates. The results reveal a trade-off: the LASSO projection estimator performs well in low-signal settings, while Javanmard and Montanari's method is more robust to complex correlations, improving precision and signal detection in real data.

2604.00220 2026-04-07 stat.ME stat.AP

Two Sample Test for Eigendecompositions of Functional Data

Angel Garcia de la Garza, Britton Sauerbrei, Jeff Goldsmith

Abstract

Neuron-level firing data is believed to be governed by latent activation patterns during task completion. Analysing repeated trials of a task allows us to study these patterns, typically by averaging in-vivo neural spikes across trials. However, estimates of underlying latent activation patterns show trial-to-trial variability. Our aim is to determine whether this variation arises from observed data differences or changes in the latent activation patterns themselves. The latter would imply that current approaches overlook meaningful activation changes, necessitating adjustments in dimension reduction and downstream analysis. We propose a test that compares the eigendecompositions of two samples of functional data based on the covariance matrix of scores derived from a functional principal component analysis of the pooled data. Initially developed for independent samples, the test is later extended to paired samples, as required by our data. Simulation studies demonstrate its superior power compared to leading methods across various scenarios. In an experiment with 157 trials, we analyse all pairwise comparisons using a permutation approach to test the null hypothesis of shared latent activation patterns across trials. Our findings reveal trial-to-trial variation in latent activation patterns that cannot be attributed to sampling noise.

2603.28532 2026-04-07 cs.LG cs.AI stat.AP

Detecting low left ventricular ejection fraction from ECG using an interpretable and scalable predictor-driven framework

Ya Zhou, Tianxiang Hao, Ziyi Cai, Haojie Zhu, Kejun He, Jia Liu, Xiaohan Fan, Jing Yuan

Comments This version includes minor typographical corrections. The results and conclusions remain unchanged

Abstract

Low left ventricular ejection fraction (LEF) frequently remains undetected until progression to symptomatic heart failure, underscoring the need for scalable screening strategies. Although artificial intelligence-enabled electrocardiography (AI-ECG) has shown promise, existing approaches rely solely on end-to-end black-box models with limited interpretability or on tabular systems dependent on commercial ECG measurement algorithms with suboptimal performance. We introduced ECG-based Predictor-Driven LEF (ECGPD-LEF), a structured framework that integrates foundation model-derived diagnostic probabilities with interpretable modeling for detecting LEF from ECG. Trained on the benchmark EchoNext dataset comprising 72,475 ECG-echocardiogram pairs and evaluated in predefined independent internal (n=5,442) and external (n=16,017) cohorts, our framework achieved robust discrimination for moderate LEF (internal AUROC 88.4%, F1 64.5%; external AUROC 86.8%, F1 53.6%), consistently outperforming the official end-to-end baseline provided with the benchmark across demographic and clinical subgroups. Interpretability analyses identified high-impact predictors, including normal ECG, incomplete left bundle branch block, and subendocardial injury in anterolateral leads, driving LEF risk estimation. Notably, these predictors independently enabled zero-shot-like inference without task-specific retraining (internal AUROC 75.3-81.0%; external AUROC 71.6-78.6%), indicating that ventricular dysfunction is intrinsically encoded within structured diagnostic probability representations. This framework reconciles predictive performance with mechanistic transparency, supporting scalable enhancement through additional predictors and seamless integration with existing AI-ECG systems.

2603.27323 2026-04-07 math.ST stat.TH

Property Of The Beta Modified Weibull Distribution With Six Parameters

Didier Alain Njamen Njomen, Fidel Djongreba Ndikwa

Comments 13 pages, 4 figures, 1 table. Accepted paper in International Journal of Applied Mathematics

Abstract

The aim of this article is to determine a new six-parameter Beta Weibull distribution and its various associated functions, namely the cumulative distribution, survival, probability density and hazard functions. Next, we determine the sub-distributions of the new distribution and show that the latter generalizes those of the literature. Finally, numerical simulations show that the shapes of the density function of the new distribution cover all those in the literature, and that the shapes of hazard functions (constant, increasing, decreasing, $\bigcup$-shaped and $\bigcap$-shaped) are represented in the new distribution and encompass all existing distributions.

2603.22594 2026-04-07 stat.ME

Making Effective Statistical Inferences: From Significance Testing to the Open Science Inference Ecosystem (2016-2026)

Aswini Kumar Patra

Comments 23 pages, 1 Figure, 3 tables

Abstract

Statistical inference has undergone a profound transformation over the past decade, evolving from a significance-testing paradigm toward a comprehensive, transparency-driven framework embedded within the broader open science ecosystem. While traditional approaches such as null hypothesis significance testing (NHST) remain widely used, they have been increasingly criticised for fostering dichotomous thinking, misinterpretation, and irreproducible findings. This review synthesises developments from 2016 to 2026, integrating methodological advances (including compatibility-based interpretation of p-values, S-values, equivalence testing with smallest effect sizes of interest (SESOI), Bayesian workflow, and sequential inference using e-values) with systemic reforms such as preregistration, Registered Reports, multiverse analysis, and updated reporting standards (PRISMA 2020, CONSORT 2025). A central contribution of this article is the conceptual unification of statistical inference into two complementary domains: evidence-centric inference, which quantifies compatibility between data and models, and decision-centric inference, which guides actions under uncertainty. By embedding statistical tools within transparent and reproducible research workflows, the modern inferential paradigm moves beyond single-metric evaluation toward a multidimensional assessment of evidence and practical relevance.

2603.09033 2026-04-07 q-bio.QM math.ST stat.TH

Sequential learning theory for Markov genealogy processes

David J Pascall

Abstract

We introduce a filtration-based framework for studying when and why adding taxa improves phylodynamic inference, by constructing a natural ordering of observed tips and applying sequential Bayesian analysis to the resulting filtration. We decompose the expected variance reduction on taxa addition into learning, mismatch, and covariance components, classify estimands into learning classes based on the pathwise behaviour of the mismatch, and show that for absorbing estimands an oracle who knows the latent absorption status obtains event-wise learning guarantees unavailable to the analyst. The gap between oracle and analyst is irreducible under assumptions that are likely to hold for many real phylodynamic estimands, establishing a fundamental limit on what sequence data alone can reveal about the latent genealogy.

2602.11333 2026-04-07 econ.EM stat.ML

Cross-Fitting-Free Debiased Machine Learning with Multiway Dependence

Kaicheng Chen, Harold D. Chiang

Comments This paper supersedes the earlier manuscript "Maximal inequalities for separately exchangeable empirical processes" (arXiv:2502.11432) by Harold D. Chiang

Abstract

This paper develops an asymptotic theory for two-step debiased machine learning (DML) estimators in generalised method of moments (GMM) models with general multiway clustered dependence, without relying on cross-fitting. While cross-fitting is commonly employed, it can be statistically inefficient and computationally burdensome when first-stage learners are complex and the effective sample size is governed by the number of independent clusters. We show that valid inference can be achieved without sample splitting by combining Neyman-orthogonal moment conditions with a localisation-based empirical process approach, allowing for an arbitrary number of clustering dimensions. The resulting debiased GMM estimators are shown to be asymptotically linear and asymptotically normal under multiway clustered dependence. A central technical contribution of the paper is the derivation of novel global and local maximal inequalities for general classes of functions of sums of separately exchangeable arrays, which underpin our theoretical arguments and are of independent interest.

2602.07841 2026-04-07 econ.EM q-fin.ST stat.AP

A Nontrivial Upper Bound on the Out-of-Sample $R^2$ in Return Forecasting

Cheng Zhang

Abstract

This study establishes a nontrivial upper bound on the out-of-sample $R^2$ ($R^2_{\text{OOS}}$) in return forecasting. In particular, we define a coin-flip oracle model that, under the same directional accuracy, theoretically outperforms practical models in terms of MSE. The $R^2_{\text{OOS}}$ of the oracle model, whose analytical expression is a quadratic function of directional accuracy, can therefore serve as a tractable upper bound on the actual $R^2_{\text{OOS}}$. Empirical analyses across multiple forecasting scenarios reveal that the $R^2_{\text{OOS}}$ values of common predictive models are fundamentally bounded by this quadratic function.
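
The two quantities the abstract relates, out-of-sample $R^2$ and directional accuracy, can be computed as in the sketch below. It assumes the common convention of measuring $R^2_{\text{OOS}}$ against a benchmark forecast (zero by default here; the historical mean is another frequent choice), which the abstract does not specify. The oracle bound itself is not reproduced, since its analytical expression is not given in the abstract.

```python
def r2_oos(actual, predicted, benchmark=None):
    """Out-of-sample R^2: 1 - SSE(model) / SSE(benchmark).

    `benchmark` defaults to a zero forecast; this default is an assumption
    and not taken from the paper.
    """
    if benchmark is None:
        benchmark = [0.0] * len(actual)
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1.0 - sse_model / sse_bench


def directional_accuracy(actual, predicted):
    """Fraction of periods in which the forecast has the same sign as the return."""
    hits = sum(1 for a, p in zip(actual, predicted) if a * p > 0)
    return hits / len(actual)


# Toy returns and forecasts, for illustration only.
returns = [0.02, -0.01, 0.03]
forecasts = [0.01, 0.01, 0.01]
print(r2_oos(returns, forecasts), directional_accuracy(returns, forecasts))
```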

2512.24521 2026-04-07 stat.ME cs.HC stat.AP

Power Analysis is Essential: High-Powered Tests Suggest Minimal to No Effect of Rounded Shapes on Click-Through Rates

Ron Kohavi, Jakub Linowski, Lukas Vermeer, Fabrice Boisseranc, Joachim Furuseth, Andrew Gelman, Guido Imbens, Ravikiran Rajagopal

Comments 34 pages, 9 figures

Journal ref
Econ Journal Watch 23 (1): 139-172 (2026). https://econjwatch.org/articles/power-analysis-is-essential
Abstract

Underpowered studies (below 50% power) suffer from the winner's curse: A statistically significant positive estimate must exaggerate the true treatment effect to meet the significance threshold. A study by Dipayan Biswas, Annika Abell, and Roger Chacko published in the Journal of Consumer Research (2023) reported that in an A/B test, simply rounding the corners of square buttons increased the online click-through rate by 55% (p-value 0.037), a striking finding with potentially wide-ranging implications for a digital industry that is seeking to enhance consumer engagement. Drawing on our experience with tens of thousands of A/B tests, many involving similar user interface modifications, we found this dramatic claim implausibly large. To evaluate the claim and provide a more accurate estimate of the treatment effect, we conducted three high-powered A/B tests, each involving over two thousand times more users than the original study. All three experiments yielded effect size estimates that were approximately two orders of magnitude smaller than initially reported, with 95% confidence intervals that include zero (i.e., not statistically significant at the 0.05 level). Two additional independent replications by Evidoo found similarly small effects. These findings underscore the critical importance of power analysis and experimental design in increasing trust and reproducibility of results.

2512.10537 2026-04-07 stat.ME stat.CO

A Bayes-Motivated Quadratic-Form Test for High-Dimensional Mean Testing

Daojiang He, Suren Xu, Jing Zhou

Abstract

We propose a two-sample mean test based on the Bayes factor with non-informative priors, specifically designed for scenarios where the dimension $p$ grows with the sample size $n$ with a linear rate $p/n \to c_1 \in (0, \infty)$. We establish the asymptotic normality of the test statistic and the asymptotic power. Through extensive simulations, we demonstrate that the proposed test performs competitively against several existing methods, particularly when the marginal variances of the individual features are heterogeneous and when the sample size is small. Furthermore, our test remains robust under distribution misspecification. The proposed method not only effectively detects both sparse and non-sparse differences in mean vectors but also maintains a well-controlled type I error rate, even in small-sample scenarios. We also demonstrate the performance of our proposed test using the small round blue cell tumors (SRBCT) dataset.

2511.20985 2026-04-07 stat.ME

Two-stage Estimation for Causal Inference Involving a Semi-continuous Exposure

Xiaoya Wang, Richard J. Cook, Yeying Zhu, Tugba Akkaya-Hocagil, R. Colin Carter, Sandra W. Jacobson, Joseph L. Jacobson, Louise M. Ryan

Abstract

Methods for causal inference are well developed for binary and continuous exposures, but in many settings the exposure has a substantial mass at zero; such exposures are called semi-continuous. We propose a general causal framework for such semi-continuous exposures, together with a novel two-stage estimation strategy. We introduce a two-part propensity structure for the semi-continuous exposure, with one component for exposure status (exposed vs unexposed) and another for the exposure level among those exposed, and incorporate both into a marginal structural model that disentangles the effects of exposure status and dose. The two-stage procedure sequentially targets the causal dose-response among exposed individuals and the causal effect of exposure status at a reference dose, allowing flexibility in the choice of propensity score methods in the second stage. We establish consistency and asymptotic normality for the resulting estimators, and characterise their limiting values under misspecification of the propensity score models. Simulation studies evaluate finite sample performance and robustness, and an application to a study of prenatal alcohol exposure and child cognition demonstrates how the proposed methods can be used to address a range of scientific questions about both exposure status and exposure intensity.

2511.15453 2026-04-07 stat.ME

Testing Conditional Independence via the Spectral Generalized Covariance Measure: Beyond Euclidean Data

Ryunosuke Miyazaki, Yoshimasa Uematsu

Abstract

We propose a conditional independence (CI) test based on a new measure, the \emph{spectral generalized covariance measure} (SGCM). The SGCM is constructed by expressing the squared norm of the conditional cross-covariance operator in spectral coordinates and approximating it in finite dimensions using data-dependent bases obtained from empirical covariance operators. This avoids direct estimation of conditional mean embeddings and reduces nuisance estimation to a finite collection of scalar-valued regressions. On the theoretical side, under a doubly robust product-bias condition, we establish uniform bootstrap validity and uniform asymptotic size control, and derive nontrivial uniform power and uniform consistency over classes of projected separated alternatives. The analysis also clarifies the role of spectral truncation: stronger truncation relaxes nuisance-estimation requirements, whereas weaker truncation retains more of the projected signal. To support applications beyond Euclidean data, we develop characteristic-kernel constructions on general Polish spaces via a pullback principle and non-constant completely monotone transforms of continuous negative-type semimetrics, with closure under finite tensor products. These constructions cover examples such as distribution-valued data, curves in metric spaces, and manifold-valued observations. Simulations show near-nominal size in the main settings and competitive power across a range of challenging scenarios.

2511.09216 2026-04-07 cs.LG q-bio.QM stat.ML

Controllable protein design with particle-based Feynman-Kac steering

Erik Hartman, Jonas Wallin, Johan Malmström, Jimmy Olsson

Comments In version 2 we added an experiment on improving designability through steering towards lower delta G

Abstract

Proteins underpin most biological function, and the ability to design them with tailored structures and properties is central to advances in biotechnology. Diffusion-based generative models have emerged as powerful tools for protein design, but steering them toward proteins with specified properties remains challenging. The Feynman-Kac (FK) framework provides a principled way to guide diffusion models using user-defined rewards. In this paper, we enable FK-based steering of RFdiffusion through the development of guiding potentials that leverage ProteinMPNN and structural relaxation to guide the diffusion process towards desired properties. We show that steering can be used to consistently improve predicted interface energetics and increase binder designability by $89.5\%$. Together, these results establish that diffusion-based protein design can be effectively steered toward arbitrary, non-differentiable objectives, providing a model-independent framework for controllable protein generation.

2510.23448 2026-04-07 cs.LG stat.ML

An Information-Theoretic Analysis of OOD Generalization in Meta-Reinforcement Learning

Xingtu Liu

Abstract

In this work, we study out-of-distribution (OOD) generalization in meta-reinforcement learning from an information-theoretic perspective. We begin by establishing OOD generalization bounds for meta-supervised learning under two distinct distribution shift scenarios: standard distribution mismatch and a broad-to-narrow training setting. Building on this foundation, we formalize the generalization problem in meta-reinforcement learning and establish fine-grained generalization bounds that exploit the structure of Markov Decision Processes. Lastly, we analyze the generalization performance of a gradient-based meta-reinforcement learning algorithm.

2508.10053 2026-04-07 cs.LG stat.ML

xRFM: Accurate, scalable, and interpretable feature learning models for tabular data

Daniel Beaglehole, David Holzmüller, Adityanarayanan Radhakrishnan, Mikhail Belkin

Abstract

Inference from tabular data, collections of continuous and categorical variables organized into matrices, is a foundation for modern technology and science. Yet, in contrast to the explosive changes in the rest of AI, the best practice for these predictive tasks has been relatively unchanged and is still primarily based on variations of Gradient Boosted Decision Trees (GBDTs). Very recently, there has been renewed interest in developing state-of-the-art methods for tabular data based on recent developments in neural networks and feature learning methods. In this work, we introduce xRFM, an algorithm that combines feature learning kernel machines with a tree structure to both adapt to the local structure of the data and scale to essentially unlimited amounts of training data. We show that compared to $31$ other methods, including recently introduced tabular foundation models (TabPFNv2) and GBDTs, xRFM achieves the best performance across $100$ regression datasets and is competitive with the best methods across $200$ classification datasets, outperforming GBDTs. Additionally, xRFM provides interpretability natively through the Average Gradient Outer Product.

2507.22207 2026-04-07 cond-mat.dis-nn cs.LG physics.data-an stat.ML

Better Together: Cross and Joint Covariances Enhance Signal Detectability in Undersampled Data

Arabind Swain, Sean Alexander Ridout, Ilya Nemenman

Abstract

Many data-science applications involve detecting a shared signal between two high-dimensional variables. Using random matrix theory methods, we determine when such a signal can be detected and reconstructed from sample correlations, despite the background of correlations induced by sampling noise. We consider three different covariance matrices constructed from two high-dimensional variables: their individual self covariance, their cross covariance, and the self covariance of the concatenated (joint) variable, which incorporates the self and the cross correlation blocks. We observe the expected Baik, Ben Arous, and Péché detectability phase transition in all these covariance matrices, and we show that joint and cross covariance matrices always reconstruct the shared signal earlier than the self covariances. Whether the joint or the cross approach is better depends on the mismatch of dimensionalities between the variables. We discuss what these observations mean for choosing the right method for detecting linear correlations in data and how these findings may generalize to nonlinear statistical dependencies.

2506.21527 2026-04-07 math.ST math.PR stat.TH

Asymptotic Inference for Exchangeable Gibbs Partitions

Takuya Koriyama

Comments 40 pages, 3 figures. We have updated numerical simulations and added a rigorous proposition explaining why the uniform CI and local CI complement each other

Abstract

We study the asymptotic properties of parameter estimation and predictive inference under the exchangeable Gibbs partition, characterized by a discount parameter $α\in(0,1)$ and a triangular array $v_{n,k}$ satisfying a backward recursion. Assuming that $v_{n,k}$ admits a mixture representation over the Ewens--Pitman family $(α, θ)$, with $θ$ integrated by an unknown mixing distribution, we show that the (quasi) maximum likelihood estimator $\hatα_n$ (QMLE) for $α$ is asymptotically mixed normal. This generalizes earlier results for the Ewens--Pitman model to a more general class. We further study the predictive task of estimating the probability simplex $\mathsf{p}_n$, which governs the allocation of the $(n+1)$-th item, conditional on the current partition of $[n]$. Based on the asymptotics of the QMLE $\hatα_n$, we construct an estimator $\hat{\mathsf{p}}_n$ and derive the limit distributions of the $f$-divergence $\mathsf{D}_f(\hat{\mathsf{p}}_n||\mathsf{p}_n)$ for general convex functions $f$, including explicit results for the TV distance and KL divergence. These results lead to asymptotically valid confidence intervals for both parameter estimation and prediction.

2505.21972 2026-04-07 cs.LG cs.AI stat.ML

LLMs Judging LLMs: A Simplex Perspective

Patrick Vossler, Fan Xia, Yifan Mai, Adarsh Subbaswamy, Jean Feng

Comments Accepted at AISTATS 2026

Abstract

Given the challenge of automatically evaluating free-form outputs from large language models (LLMs), an increasingly common solution is to use LLMs themselves as the judging mechanism, without any gold-standard scores. Implicitly, this practice accounts for only sampling variability (aleatoric uncertainty) and ignores uncertainty about judge quality (epistemic uncertainty). While this is justified if judges are perfectly accurate, it is unclear when such an approach is theoretically valid and practically robust. We study these questions for the task of ranking LLM candidates from a novel geometric perspective: for $M$-level scoring systems, both LLM judges and candidates can be represented as points on an $(M-1)$-dimensional probability simplex, where geometric concepts (e.g., triangle areas) correspond to key ranking concepts. This perspective yields intuitive theoretical conditions and visual proofs for when rankings are identifiable; for instance, we provide a formal basis for the ``folk wisdom'' that LLM judges are more effective for two-level scoring ($M=2$) than multi-level scoring ($M>2$). Leveraging the simplex, we design geometric Bayesian priors that encode epistemic uncertainty about judge quality and vary the priors to conduct sensitivity analyses. Experiments on LLM benchmarks show that rankings based solely on LLM judges are robust in many but not all datasets, underscoring both their widespread success and the need for caution. Our Bayesian method achieves substantially higher coverage rates than existing procedures, highlighting the importance of modeling epistemic uncertainty.

2505.15443 2026-04-07 cs.CL stat.ML

ALIEN: Aligned Entropy Head for Improving Uncertainty Estimation of LLMs

Artem Zabolotnyi, Roman Makarov, Mile Mitrovic, Polina Proskura, Oleg Travkin, Roman Alferov, Alexey Zaytsev

Comments 16 pages, 2 figures

Abstract

Uncertainty estimation remains a key challenge when adapting pre-trained language models to downstream classification tasks, with overconfidence often observed for difficult inputs. While predictive entropy provides a strong baseline for uncertainty estimation, it considers mainly aleatoric uncertainty and has limited capacity to capture effects such as class overlap or ambiguous linguistic cues. We introduce Aligned Entropy (ALIEN), a lightweight method that refines entropy-based uncertainty by aligning it with prediction reliability. ALIEN trains a small uncertainty head initialized to produce the model's original entropy and subsequently fine-tuned with two regularization mechanisms. Experiments across seven classification datasets and two NER benchmarks, evaluated on five language models (RoBERTa, ELECTRA, LLaMA-2, Qwen2.5, and Qwen3), show that ALIEN consistently outperforms strong baselines across all considered scenarios in detecting incorrect predictions, while achieving the lowest calibration error. The proposed method introduces only a small inference overhead (on the order of milliseconds per batch on CPU) and increases the model's parameter count by just 0.002% for decoder models and 0.5% for encoder models, without requiring storage of intermediate states. It improves uncertainty estimation while preserving the original model architecture, making the approach practical for large-scale deployment with modern language models. Our results demonstrate that entropy can be effectively refined through lightweight supervised alignment, producing more reliable uncertainty estimates without modifying the backbone model. The code is available at 4.

2504.18743 2026-04-07 cs.LG math.PR stat.ML

From Set Convergence to Pointwise Convergence: Finite-Time Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes

Zaiwei Chen, Phalguni Nanda

Comments 65 pages and 6 figures

Abstract

This work presents the first finite-time analysis for the last-iterate convergence of average-reward $Q$-learning with an asynchronous implementation. A key feature of the algorithm we study is the use of adaptive stepsizes, which serve as local clocks for each state-action pair. We show that, under appropriate assumptions, the iterates generated by this $Q$-learning algorithm converge at a rate of $\tilde{\mathcal{O}}(1/k)$ (in the mean-square sense) to the optimal $Q$-function in the span seminorm. Moreover, by adding a centering step to the algorithm, we further establish pointwise mean-square convergence to the centered optimal $Q$-function, also at a rate of $\tilde{\mathcal{O}}(1/k)$. To prove these results, we show that adaptive stepsizes are necessary, as without them, the algorithm fails to converge to the correct target. In addition, adaptive stepsizes can be interpreted as a form of implicit importance sampling that counteracts the effects of asynchronous updates. Technically, the use of adaptive stepsizes makes each $Q$-learning update depend on the entire sample history, introducing strong correlations and making the algorithm a non-Markovian stochastic approximation (SA) scheme. Our approach to overcoming this challenge involves (1) a time-inhomogeneous Markovian reformulation of non-Markovian SA, and (2) a combination of almost-sure time-varying bounds, conditioning arguments, and Markov chain concentration inequalities to break the strong correlations between the adaptive stepsizes and the iterates. The tools developed in this work are likely to be broadly applicable to the analysis of general SA algorithms with adaptive stepsizes.

2504.17104 2026-04-07 stat.ME stat.AP

Target trial emulation without matching: a more efficient approach for evaluating vaccine effectiveness using observational data

Emily Wu, Elizabeth Rogawski McQuade, Mats Stensrud, Razieh Nabi, David Benkeser

Comments 24 pages, 5 figures

Abstract

Real-world vaccine effectiveness has increasingly been studied using matching-based approaches, particularly in observational cohort studies following the target trial emulation framework. Although matching is appealing in its simplicity, it suffers from important limitations in terms of clarity of the target estimand and the efficiency or precision with which it is estimated. Scientifically justified causal estimands of vaccine effectiveness may be difficult to define owing to the fact that vaccine uptake varies over calendar time when infection dynamics may also be rapidly changing. We propose a causal estimand of vaccine effectiveness that summarizes vaccine effectiveness over calendar time, similar to how vaccine efficacy is summarized in a randomized controlled trial. We describe the identification of our estimand, including its underlying assumptions, and propose simple-to-implement estimators based on two hazard regression models. We apply our proposed estimator in simulations and in a study to assess the effectiveness of the Pfizer-BioNTech COVID-19 vaccine to prevent infections with SARS-CoV-2 in children 5-11 years old. In both settings, we find that our proposed estimator yields similar scientific inferences while providing significant efficiency gains over commonly used matching-based estimators.

2504.14795 2026-04-07 eess.IV cs.CV cs.LG stat.ML

A Bayesian Approach to Segmentation with Noisy Labels via Spatially Correlated Distributions

Ryu Tadokoro, Tsukasa Takagi, Shin-ichi Maeda

Journal ref
Transactions on Machine Learning Research (TMLR), 2026
Abstract

In semantic segmentation, model accuracy depends heavily on high-quality annotations. However, in many practical scenarios, such as medical imaging and remote sensing, obtaining true annotations is not straightforward and usually requires significant human labor. Relying on human labor often introduces annotation errors, including mislabeling, omissions, and inconsistency between annotators. In the case of remote sensing, differences in procurement time can lead to misaligned ground-truth annotations. These label errors are not independently distributed, and instead usually appear in spatially connected regions where adjacent pixels are more likely to share the same errors. To address these issues, we propose an approximate Bayesian estimation based on a probabilistic model that assumes training data include label errors, incorporating the tendency for these errors to occur with spatial correlations between adjacent pixels. However, Bayesian inference for such spatially correlated discrete variables is notoriously intractable. To overcome this fundamental challenge, we introduce a novel class of probabilistic models, which we term the ELBO-Computable Correlated Discrete Distribution (ECCD). By representing the discrete dependencies through a continuous latent Gaussian field with a Kac-Murdock-Szegö (KMS) structured covariance, our framework enables scalable and efficient variational inference for problems previously considered computationally prohibitive. Through experiments on multiple segmentation tasks, we confirm that leveraging the spatial correlation of label errors significantly improves performance. Notably, in specific tasks such as lung segmentation, the proposed method achieves performance comparable to training with clean labels under moderate noise levels. Code is available at https://github.com/pfnet-research/Bayesian_SpatialCorr.

2504.14169 2026-04-07 stat.ME

Correcting nonignorable nonresponse bias in turnout estimation using callback data

Xinyu Li, Naiwen Ying, Kendrick Qijun Li, Xu Shi, Wang Miao

Abstract

Overestimation of turnout has long been an issue in election surveys, with nonresponse bias or voter overrepresentation identified as major sources of bias. However, adjusting for nonignorable nonresponse bias is highly challenging. Based on the ANES Non-Response Follow-Up study concerning the 2020 U.S. presidential election, we investigate the role of callback data, that is, records of contact attempts over the course of the survey, in adjusting for nonresponse bias in the estimation of turnout. We propose a stableness of resistance assumption to account for nonignorable missingness in the outcome, which states that the impact of the missing outcome on the response propensity is stable in the first two call attempts. Under this assumption and by integrating with covariate information from the census data, we establish identifiability and develop estimation methods for turnout. Our methods produce estimates very close to the official turnout and successfully capture the trend of declining willingness to vote as response reluctance increases. This work highlights the importance of adjusting for nonignorable nonresponse bias and demonstrates the potential of widely available callback data for political surveys.

2502.18223 2026-04-07 stat.ME

Penalizing complexity priors for Bayesian inference of circular models

Xiang Ye, Janet Van Niekerk, Håvard Rue

Comments 20 pages, 21 figures

Abstract

Advancements in computational power and methodologies have enabled research on massive datasets. However, tools for analyzing data with directional or periodic characteristics, such as wind directions and customer arrival times on a 24-hour clock, remain underdeveloped. While statisticians have proposed circular distributions for such analyses, significant challenges persist in constructing circular statistical models, particularly in the context of Bayesian methods. These challenges stem from limited theoretical development and a lack of historical studies on prior selection for circular distribution parameters. In this article, we propose a framework for selecting hyperpriors that contracts to a simpler model in circular scenarios, especially when there is insufficient information to guide prior selection. We introduce well-examined Penalized Complexity (PC) priors for the most widely used circular distributions. Comprehensive comparisons with existing hyperpriors in the literature are conducted through simulation studies and a practical case study. Finally, we discuss the contributions and implications of our work, providing a foundation for further advancements in constructing Bayesian circular statistical models.

2502.07977 2026-04-07 cs.LG math.OC stat.ML

RESIST: Resilient Decentralized Learning Using Consensus Gradient Descent

Cheng Fang, Rishabh Dixit, Waheed U. Bajwa, Mert Gurbuzbalaban

Comments preprint of a journal paper; 110 pages, 14 figures, and 1 table

Abstract

Empirical risk minimization (ERM) is a cornerstone of modern machine learning (ML), supported by advances in optimization theory that ensure efficient solutions with provable algorithmic and statistical learning rates. Privacy, memory, computation, and communication constraints necessitate data collection, processing, and storage across network-connected devices. In many applications, networks operate in decentralized settings where a central server cannot be assumed, requiring decentralized ML algorithms that are efficient and resilient. Decentralized learning, however, faces significant challenges, including an increased attack surface. This paper focuses on the man-in-the-middle (MITM) attack, wherein adversaries exploit communication vulnerabilities to inject malicious updates during training, potentially causing models to deviate from their intended ERM solutions. To address this challenge, we propose RESIST (Resilient dEcentralized learning using conSensus gradIent deScenT), an optimization algorithm designed to be robust against adversarially compromised communication links, where transmitted information may be arbitrarily altered before being received. Unlike existing adversarially robust decentralized learning methods, which often (i) guarantee convergence only to a neighborhood of the solution, (ii) lack guarantees of linear convergence for strongly convex problems, or (iii) fail to ensure statistical consistency as sample sizes grow, RESIST overcomes all three limitations. It achieves algorithmic and statistical convergence for strongly convex, Polyak-Lojasiewicz, and nonconvex ERM problems by employing a multistep consensus gradient descent framework and robust statistics-based screening methods to mitigate the impact of MITM attacks. Experimental results demonstrate the robustness and scalability of RESIST across attack strategies, screening methods, and loss functions.

2410.18918 2026-04-07 stat.ML cs.LG

MissNODAG: Differentiable Cyclic Causal Graph Learning from Incomplete Data

Muralikrishnna G. Sethuraman, Razieh Nabi, Faramarz Fekri

Comments To appear in Transactions on Machine Learning Research

Abstract

Causal discovery in real-world systems, such as biological networks, is often complicated by feedback loops and incomplete data. Standard algorithms, which assume acyclic structures or fully observed data, struggle with these challenges. To address this gap, we propose MissNODAG, a differentiable framework for learning both the underlying cyclic causal graph and the missingness mechanism from partially observed data, including data missing not at random. Our framework integrates an additive noise model with an expectation-maximization procedure, alternating between imputing missing values and optimizing the observed data likelihood, to uncover both the cyclic structures and the missingness mechanism. We establish consistency guarantees under exact maximization of the score function in the large sample setting. Finally, we demonstrate the effectiveness of MissNODAG through synthetic experiments and an application to real-world gene perturbation data.

2410.07607 2026-04-07 math.ST stat.TH

Staleness Factors and Volatility Estimation at High Frequencies

Xinbing Kong, Bin Wu, Wuyi Ye

Journal ref
Journal of the American Statistical Association, 2026
Abstract

In this paper, we propose a price staleness factor model that accounts for pervasive market friction across assets and incorporates relevant covariates. Using large-panel high-frequency data, we derive the maximum likelihood estimators of the regression coefficients, the nonstationary factors, and their loading parameters. These estimators recover the time-varying price staleness probabilities. We develop asymptotic theory in which both the dimension $d$ and the sampling frequency $n$ tend to infinity. Using a local principal component analysis (LPCA) approach, we find that the efficient price co-volatilities (systematic and idiosyncratic) are biased downward due to the presence of staleness. We provide bias-corrected estimators for both the spot and integrated systematic and idiosyncratic co-volatilities, and prove that these estimators are robust to data staleness. Interestingly, besides their dependence on the dimensionality $d$, the integrated plug-in estimates converge at a rate of $n^{-1/2}$ without requiring a correcting term, whereas the local PCA estimates converge at a slower rate of $n^{-1/4}$. This validates the aggregation efficiency achieved through nonlinear, nonstationary factor analysis via maximum likelihood estimation. Numerical experiments justify our theoretical findings. Empirically, we demonstrate that the staleness factor provides unique explanatory power for cross-sectional risk premia, and that the staleness correction reduces out-of-sample portfolio risk.

2410.07430 2026-04-07 cs.LG stat.ML

EventFlow: Forecasting Temporal Point Processes with Flow Matching

Gavin Kerrigan, Kai Nelson, Padhraic Smyth

Comments AISTATS 2026 Best Paper Award, camera ready version

Abstract

Continuous-time event sequences, in which events occur at irregular intervals, are ubiquitous across a wide range of industrial and scientific domains. The contemporary modeling paradigm is to treat such data as realizations of a temporal point process, and in machine learning it is common to model temporal point processes in an autoregressive fashion using a neural network. While autoregressive models are successful in predicting the time of a single subsequent event, their performance can degrade when forecasting longer horizons due to cascading errors and myopic predictions. We propose EventFlow, a non-autoregressive generative model for temporal point processes. The model builds on the flow matching framework in order to directly learn joint distributions over event times, side-stepping the autoregressive process. EventFlow is simple to implement and achieves a 20%-53% lower forecast error than the nearest baseline on standard TPP benchmarks while simultaneously using fewer model calls at sampling time.

2408.12739 2026-04-07 quant-ph cs.LG stat.ML

Quantum Convolutional Neural Networks are Effectively Classically Simulable

Pablo Bermejo, Paolo Braccia, Manuel S. Rudolph, Zoë Holmes, Lukasz Cincio, M. Cerezo

Comments 12 + 15 pages, 6 + 7 figures, 1 table, updated to published version

Journal ref
PRX Quantum 7, 020304 (2026)
Abstract

Quantum Convolutional Neural Networks (QCNNs) are widely regarded as a promising model for Quantum Machine Learning (QML). In this work we tie their heuristic success to two facts. First, that when randomly initialized, they can only operate on the information encoded in low-bodyness measurements of their input states. And second, that they are commonly benchmarked on "locally-easy" datasets whose states are precisely classifiable by the information encoded in this subspace of low-bodyness observables. We further show that the QCNN's action on this subspace can be efficiently classically simulated by a classical algorithm equipped with Pauli shadows on the dataset. Indeed, we present a shadow-based simulation of QCNNs on up to $1024$ qubits for phases of matter classification. Our results can then be understood as highlighting a deeper symptom of QML: models may be showing heuristic success only because they are benchmarked on simple problems, for which their action can be classically simulated. This insight points to the fact that non-trivial datasets are a truly necessary ingredient for moving forward with QML. To finish, we discuss how our results can be extrapolated to classically simulate other architectures.

2405.03083 2026-04-07 stat.ME cs.LG stat.ML

Causal K-Means Clustering

Kwangho Kim, Jisu Kim, Edward H. Kennedy

Journal ref
J. R. Stat. Soc. Ser. B, 2026
Abstract

Causal effects are often characterized with population summaries. These might provide an incomplete picture when there are heterogeneous treatment effects across subgroups. Since the subgroup structure is typically unknown, it is more challenging to identify and evaluate subgroup effects than population effects. We propose a new solution to this problem: Causal k-Means Clustering, which leverages the k-means clustering algorithm to uncover the unknown subgroup structure. Our problem differs significantly from the conventional clustering setup since the variables to be clustered are unknown counterfactual functions. We present a plug-in estimator which is simple and readily implementable using off-the-shelf algorithms, and study its rate of convergence. We also develop a new bias-corrected estimator based on nonparametric efficiency theory and double machine learning, and show that this estimator achieves fast root-n rates and asymptotic normality in large nonparametric models. Our proposed methods are especially useful for modern outcome-wide studies with multiple treatment levels. Further, our framework is extensible to clustering with generic pseudo-outcomes, such as partially observed outcomes or otherwise unknown functions. Finally, we explore finite sample properties via simulation, and illustrate the proposed methods using a study of mobile-supported self-management for chronic low back pain.

2404.07457 2026-04-07 math.ST stat.CO stat.TH

From Poisson Observations to Fitted Negative Binomial Distribution

Yingying Yang, Niloufar Dousti Mousavi, Zhou Yu, Jie Yang

Comments 54 pages, 3 figures, 15 tables

Abstract

The negative binomial distribution has been widely used as a more flexible model than the Poisson distribution for count data. However, when the true data-generating process is Poisson, it is often challenging to distinguish it from a negative binomial distribution with extreme parameter values, and existing maximum likelihood estimation procedures for the negative binomial distribution may fail or produce unstable estimates. To address this issue, we develop a new algorithm for computing the maximum likelihood estimate of negative binomial parameters, which is more efficient and more accurate than existing methods. We further extend negative binomial distributions with a new parameterization to cover Poisson distributions as a special class. We provide theoretical justifications showing that, when applied to Poisson data, the estimated parameters of the extended negative binomial distribution can consistently recover the true Poisson distribution.
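
The difficulty described in this abstract can be illustrated numerically. The sketch below is not the authors' algorithm or their new parameterization; it only uses the standard (mean, size) form of the negative binomial and the textbook fact that it approaches a Poisson distribution as the size parameter grows. The function names `nb_pmf`, `poisson_pmf`, and `max_pmf_gap` are hypothetical, chosen for this illustration.

```python
import math


def nb_pmf(y, mean, size):
    """Negative binomial pmf in the standard (mean, size) parameterization.

    Variance is mean + mean**2 / size, so a large size means little overdispersion.
    """
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mean))
        + y * math.log(mean / (size + mean))
    )
    return math.exp(log_p)


def poisson_pmf(y, mean):
    """Poisson pmf, computed on the log scale for numerical stability."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))


def max_pmf_gap(mean, size, y_max=60):
    """Largest pointwise gap between the two pmfs over y = 0, ..., y_max."""
    return max(
        abs(nb_pmf(y, mean, size) - poisson_pmf(y, mean))
        for y in range(y_max + 1)
    )


if __name__ == "__main__":
    # The gap shrinks as size grows: the two models become hard to tell apart.
    for size in (1.0, 100.0, 1e4, 1e6):
        print(size, max_pmf_gap(3.0, size))
```

When the data are truly Poisson, the likelihood is nearly flat in the size parameter at large values, which is one plausible reading of why standard fitting routines become unstable there.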

2403.11343 2026-04-07 cs.LG cs.CR math.ST stat.ME stat.ML stat.TH

Federated Transfer Learning with Differential Privacy

Mengchu Li, Ye Tian, Yang Feng, Yi Yu

Comments 101 pages, 7 figures

Abstract

Federated learning has emerged as a powerful framework for analysing distributed data, yet two challenges remain pivotal: heterogeneity across sites and privacy of local data. In this paper, we address both challenges within a federated transfer learning framework, aiming to enhance learning on a target data set by leveraging information from multiple heterogeneous source data sets while adhering to privacy constraints. We rigorously formulate the notion of federated differential privacy, which offers privacy guarantees for each data set without assuming a trusted central server. Under this privacy model, we study four statistical problems: univariate mean estimation, low-dimensional linear regression, high-dimensional linear regression, and M-estimation. By investigating the minimax rates and quantifying the cost of privacy, we show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy. Our analyses account for data heterogeneity and privacy, highlighting the fundamental costs associated with each factor and the benefits of knowledge transfer in federated learning.

2307.13475 2026-04-07 econ.EM math.ST stat.TH

Large sample properties of GMM estimators under second-order identification

Hugo Kruiniger

Comments 30 pages. In the third version of the paper, I have added results on the optimal weight matrices for ϕ_{1}-hat and ϕ_{p}-hat, respectively

Abstract

Dovonon and Hall (Journal of Econometrics, 2018) proposed a limiting distribution theory for GMM estimators for a p-dimensional globally identified parameter vector ϕ when local identification conditions fail at first-order but hold at second-order. They assumed that the first-order underidentification is due to the expected Jacobian having rank p-1 at the true value ϕ_{0}, i.e., having a rank deficiency of one. After reparametrizing the model such that the last column of the Jacobian vanishes, they showed that the GMM estimator of the vector comprising the first p-1 parameters, ϕ_{1}, converges at rate T^{-1/2} and the GMM estimator of the remaining parameter, ϕ_{p}, converges at rate T^{-1/4}. They also provided a limiting distribution of T^{1/4}(ϕ_{p}-hat-ϕ_{0,p}) subject to a (non-transparent) condition which they claimed not to be restrictive in general. However, as we show in this paper, their condition is in fact only satisfied when ϕ is overidentified and the limiting distribution of T^{1/4}(ϕ_{p}-hat-ϕ_{0,p}), which is non-standard, depends on whether ϕ is exactly identified or overidentified. In particular, the limiting distributions of the sign of T^{1/4}(ϕ_{p}-hat-ϕ_{0,p}) for the cases of exact identification and overidentification, respectively, are different and are obtained by using expansions of the GMM objective function of different orders. Unsurprisingly, we find that the limiting distribution theories of Dovonon and Hall (2018) for Indirect Inference (II) estimation under two different scenarios with second-order identification where the target function is a GMM estimator of the auxiliary parameter vector, are incomplete for similar reasons. We discuss how our results for GMM estimation can be used to complete both theories. We also derive the optimal weight matrices for ϕ_{1}-hat and ϕ_{p}-hat, respectively.

2302.08724 2026-04-07 stat.ML cs.LG stat.OT

Piecewise Deterministic Markov Processes for Bayesian Neural Networks

Ethan Goan, Dimitri Perrin, Kerrie Mengersen, Clinton Fookes

Comments typo fix, Includes correction to software and corrigendum note (fix supplementary references)

Abstract

Inference on modern Bayesian Neural Networks (BNNs) often relies on a variational inference treatment, imposing violated assumptions of independence and the form of the posterior. Traditional MCMC approaches avoid these assumptions at the cost of increased computation due to their incompatibility with subsampling of the likelihood. New Piecewise Deterministic Markov Process (PDMP) samplers permit subsampling, though they introduce model-specific inhomogeneous Poisson processes (IPPs) which are difficult to sample from. This work introduces a new generic and adaptive thinning scheme for sampling from these IPPs, and demonstrates how this approach can accelerate the application of PDMPs for inference in BNNs. Experimentation illustrates how inference with these methods is computationally feasible, can improve predictive accuracy and MCMC mixing performance, and can provide informative uncertainty measurements when compared against other approximate inference schemes.
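
For readers unfamiliar with thinning, the sketch below shows the classical (non-adaptive) thinning construction for an inhomogeneous Poisson process, which is the textbook starting point and not the adaptive scheme proposed in the paper. It assumes a known constant upper bound on the rate; the function name `sample_ipp_by_thinning` and its parameters are hypothetical.

```python
import math
import random


def sample_ipp_by_thinning(rate, rate_bound, horizon, rng):
    """Sample event times on (0, horizon] from an inhomogeneous Poisson process.

    Assumes rate(t) <= rate_bound for all t in (0, horizon]. Candidate times
    come from a homogeneous process with intensity rate_bound, and each
    candidate is kept with probability rate(t) / rate_bound.
    """
    t = 0.0
    events = []
    while True:
        # Waiting time to the next candidate of the dominating process.
        t += rng.expovariate(rate_bound)
        if t > horizon:
            return events
        current_rate = rate(t)
        if current_rate > rate_bound:
            raise ValueError("rate exceeds the assumed upper bound")
        # Accept the candidate with probability current_rate / rate_bound.
        if rng.random() * rate_bound < current_rate:
            events.append(t)


if __name__ == "__main__":
    rng = random.Random(0)
    times = sample_ipp_by_thinning(lambda t: 1.0 + math.sin(t), 2.0, 50.0, rng)
    print(len(times), times[:5])
```

A loose bound makes this wasteful, since most candidates are rejected, and a bound that is too tight is invalid. That trade-off is presumably what motivates an adaptive scheme.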

2604.04084 2026-04-07 stat.CO stat.ME

Meta-analysis with the glmmTMB R package

Coralie Williams, Maeve McGillycuddy, Mollie Brooks, Benjamin M. Bolker, Ayumi Mizuno, Yefeng Yang, Wolfgang Viechtbauer, David I. Warton, Shinichi Nakagawa

Abstract

Meta-analytical models are typically formulated as a mixed-effects model where the sampling variances of the effect sizes are treated as known. In principle, such models could be fitted with standard mixed-modelling software such as the glmmTMB R package. This general-purpose package for generalized linear mixed models (GLMMs) provides flexibility in distributions and random effect covariance structures through the Template Model Builder (TMB). However, incorporating known sampling variances in the conventional inverse-variance formulation of meta-analysis was previously not easily accomplished in glmmTMB. Here, we introduce equalto, a new covariance structure in glmmTMB that allows users to supply a known sampling error variance-covariance matrix when fitting meta-analytic models. This enables explicit modelling of heteroscedasticity and dependence among sampling errors. The new implementation provides an alternative way to fit meta-analytic models, convenient for users already familiar with glmmTMB. Using simulations, we show that the new implementation produces model estimates identical to those from the established metafor package and illustrate its applicability with published meta-analyses in medicine, evolutionary ecology, and the social sciences. Further, this novel implementation in glmmTMB supports more flexible modelling of meta-analytical data, expanding the R toolkit available for evidence synthesis.

2604.04032 2026-04-07 stat.ME stat.AP

Bootstrap-Aggregated Method-of-Moments Estimation of the Copula Correlation Parameter for Marginal Survival Inference under Dependent Censoring

Hyun-Soo Zhang, Inkyung Jung, Chung Mo Nam

Abstract

In dependently censored survival data, the usual assumption of independent censoring or an incorrect specification of the correlation between the event and censoring times can bias marginal survival inference. Likelihood-based estimation of this dependence can be numerically unstable with large variance, and practical alternatives are limited. The proposed method uses generalized method-of-moments to estimate the copula correlation parameter of a Normal, Clayton, Gumbel, or Frank copula that links exponential, Weibull, or log-normal marginal survival times. Bootstrap-aggregation of simulated annealing is employed over candidate correlation ranges to obtain stable estimates. Simulations assess accuracy and uncertainty via mean absolute error, bootstrap confidence intervals, and empirical coverage. The method is applied to a double-blind randomized clinical trial with dependent censoring from early patient dropouts, where accurate marginal survival inference is needed to estimate the effect of a treatment on patient survival.

2604.03985 2026-04-07 cs.LG eess.SP stat.ML

Autoencoder-Based Parameter Estimation for Superposed Multi-Component Damped Sinusoidal Signals

Momoka Iida, Hayato Motohashi, Hirotaka Takahashi

Comments 27 pages, 16 figures, 14 tables

Abstract

Damped sinusoidal oscillations are widely observed in many physical systems, and their analysis provides access to underlying physical properties. However, parameter estimation becomes difficult when the signal decays rapidly, multiple components are superposed, and observational noise is present. In this study, we develop an autoencoder-based method that uses the latent space to estimate the frequency, phase, decay time, and amplitude of each component in noisy multi-component damped sinusoidal signals. We investigate multi-component cases under Gaussian-distribution training and further examine the effect of the training-data distribution through comparisons between Gaussian and uniform training. The performance is evaluated through waveform reconstruction and parameter-estimation accuracy. We find that the proposed method can estimate the parameters with high accuracy even in challenging setups, such as those involving a subdominant component or nearly opposite-phase components, while remaining reasonably robust when the training distribution is less informative. This demonstrates its potential as a tool for analyzing short-duration, noisy signals.

2604.03981 2026-04-07 cs.LG stat.CO

Multirate Stein Variational Gradient Descent for Efficient Bayesian Sampling

Arash Sarshar

Abstract

Many particle-based Bayesian inference methods use a single global step size for all parts of the update. In Stein variational gradient descent (SVGD), however, each update combines two qualitatively different effects: attraction toward high-posterior regions and repulsion that preserves particle diversity. These effects can evolve at different rates, especially in high-dimensional, anisotropic, or hierarchical posteriors, so one step size can be unstable in some regions and inefficient in others. We derive a multirate version of SVGD that updates these components on different time scales. The framework yields practical algorithms, including a symmetric split method, a fixed multirate method (MR-SVGD), and an adaptive multirate method (Adapt-MR-SVGD) with local error control. We evaluate the methods in a broad and rigorous benchmark suite covering six problem families: a 50D Gaussian target, multiple 2D synthetic targets, UCI Bayesian logistic regression, multimodal Gaussian mixtures, Bayesian neural networks, and large-scale hierarchical logistic regression. Evaluation includes posterior-matching metrics, predictive performance, calibration quality, mixing, and explicit computational cost accounting. Across these six benchmark families, multirate SVGD variants improve robustness and quality-cost tradeoffs relative to vanilla SVGD. The strongest gains appear on stiff hierarchical, strongly anisotropic, and multimodal targets, where adaptive multirate SVGD is usually the strongest variant and fixed multirate SVGD provides a simpler robust alternative at lower cost.
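
The two effects named in this abstract can be made concrete with the standard SVGD update for one-dimensional particles. The sketch below computes the attraction and repulsion terms separately under an RBF kernel with a fixed bandwidth (an assumption) and lets each term take its own step size. It is only an illustration of the split, not the paper's MR-SVGD or Adapt-MR-SVGD; the names `svgd_terms`, `svgd_step`, `step_attract`, and `step_repel` are hypothetical.

```python
import math


def svgd_terms(particles, grad_log_p, bandwidth=1.0):
    """Return the (attraction, repulsion) parts of the SVGD direction.

    Uses the RBF kernel k(x, y) = exp(-(x - y)**2 / (2 * bandwidth**2)).
    """
    n = len(particles)
    attraction, repulsion = [], []
    for xi in particles:
        drive, repel = 0.0, 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            # Kernel-weighted score: pulls xi toward high-density regions.
            drive += k * grad_log_p(xj)
            # Gradient of k with respect to xj: pushes xi away from xj.
            repel += k * (xi - xj) / bandwidth ** 2
        attraction.append(drive / n)
        repulsion.append(repel / n)
    return attraction, repulsion


def svgd_step(particles, grad_log_p, step_attract, step_repel, bandwidth=1.0):
    """One update; equal step sizes give the usual single-rate SVGD step."""
    attraction, repulsion = svgd_terms(particles, grad_log_p, bandwidth)
    return [
        x + step_attract * a + step_repel * r
        for x, a, r in zip(particles, attraction, repulsion)
    ]


if __name__ == "__main__":
    # Standard normal target: the gradient of log p at x is -x.
    xs = [3.0, 3.5, 4.0, 4.5, 5.0]
    for _ in range(200):
        xs = svgd_step(xs, lambda x: -x, 0.1, 0.1)
    print(xs)
```

Keeping the two terms in separate lists makes it easy to inspect their relative magnitudes, which is the quantity a multirate scheme would need to react to.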

2604.03970 2026-04-07 stat.ME stat.AP stat.CO

Learning association from multiple intermediate events for dynamic prediction of survival: an application to cardiovascular disease prognosis

Tonghui Yu, Liming Xiang

Abstract

Cardiovascular diseases are major causes of mortality globally. They often co-occur and are interrelated, leading to partial-order relationships among their onset times. However, these onset times are subject to informative censoring due to the occurrence of death, posing significant challenges for survival prediction. In this article, we propose a novel copula-based framework that learns dependence among multiple correlated marginal components through a pseudo-likelihood for estimation. We adopt nonparametric marginals, alleviating the reliance on marginal distribution assumptions typically required in conventional copula models, and estimate the association between the onsets of intermediate cardiovascular diseases and death by solving a concordance estimating equation. Under this framework, a renewable risk assessment method is developed for dynamic survival prediction, leveraging information on disease onset times and the maximum follow-up duration. Our proposed method yields estimators with well-established properties, and its flexibility and predictive effectiveness are demonstrated through extensive simulation studies. We apply the method to data from a heart disease study, demonstrating the benefits of incorporating the associations among various cardiovascular diseases and their synergistic effects on mortality for dynamic prediction of overall survival.

2604.03969 2026-04-07 stat.ML cs.LG stat.ME

Nearly Optimal Best Arm Identification for Semiparametric Bandits

Seok-Jin Kim

Comments To appear at AISTATS 2026

Abstract

We study fixed-confidence Best Arm Identification (BAI) in semiparametric bandits, where rewards are linear in arm features plus an unknown additive baseline shift. Unlike linear-bandit BAI, this setting requires orthogonalized regression, and its instance-optimal sample complexity has remained open. For the transductive setting, we establish an attainable instance-dependent lower bound characterized by the corresponding linear-bandit complexity on shifted features. We then propose a computationally efficient phase-elimination algorithm based on a new $XY$-design for orthogonalized regression. Our analysis yields a nearly optimal high-probability sample-complexity upper bound, up to log factors and an additive $d^2$ term, and experiments on synthetic instances and the Jester dataset show clear gains over prior baselines.

2604.03952 2026-04-07 stat.AP q-bio.QM

Multidimensional physical fitness is associated with reduced dementia risk through proteomic and neuroimaging pathways: a prospective cohort study of the UK Biobank

Yiqing Sun, Runyu Lin, Jiayue Qin, Feiyue Pan, Bingjie Li, Zhigang Yao

Comments 22 pages, 6 figures

Abstract

Dementia affects over 55 million people worldwide, yet whether distinct domains of physical fitness independently protect against neurodegeneration through shared or divergent biological mechanisms remains unknown. Using the UK Biobank (n = 51,517; 12-year follow-up), we integrated epidemiological, proteomic, and neuroimaging analyses to systematically characterize the multidimensional fitness-dementia relationship. Higher handgrip strength, cardiorespiratory fitness, and pulmonary function were each independently associated with reduced dementia risk (HRs 0.50, 0.62, and 0.73, respectively, for highest vs. lowest tertiles), with stronger associations in women and younger individuals. Plasma proteomic profiling revealed domain-specific molecular signatures--neurofilament light chain predominating for muscular and cardiorespiratory fitness, and inflammatory mediators including GDF15 for pulmonary function--with 22-40 proteins per domain independently predicting dementia, converging on neuroinflammatory and neurovascular pathways. Brain MRI analyses identified hippocampal volume as a significant structural mediator (proportion mediated: 3.7-10.1%), indicating structural preservation as one of multiple mechanistic pathways. Population attributable fraction analyses estimated that suboptimal fitness may account for approximately 26% of dementia cases. These findings reveal that multidimensional physical fitness shapes dementia risk through distinct yet converging neuroinflammatory, neurovascular, and structural brain mechanisms, with implications for life-course prevention.

2604.03948 2026-04-07 q-fin.PM stat.AP

Forecasting Tangency Portfolios and Investing in the Minimum Euclidean Distance Portfolio to Maximize Out-of-Sample Sharpe Ratios

Nolan Alexander, William Scherer

Comments Code: https://github.com/nolanalexander/efficient-frontier-coefficients

Abstract

We propose a novel model to achieve superior out-of-sample Sharpe ratios. While most research in asset allocation focuses on estimating the return vector and covariance matrix, the first component of our novel model instead forecasts the future tangency portfolio, and the second component then determines the optimal investment portfolio. First, to forecast the tangency portfolio, we forecast the efficient frontier by decomposing its functional form, a square-root second-order polynomial, into three interpretable coefficients, which can then be used to calculate a forecasted tangency portfolio. These coefficients can be forecasted using vector autoregressions. Second, the model invests in the portfolio on the efficient frontier that is the minimum Euclidean distance from this forecasted tangency portfolio. A motivation for our approach is to address the limitation that the tangency portfolio only maximizes the Sharpe ratio when future returns and covariances are stationary and can be directly estimated from historical data, a condition that often does not hold out of sample. Our approach addresses this shortcoming in a novel way by forecasting the tangency portfolio, rather than estimating return and covariance. For empirical testing, we employ two sets of assets that span the market to demonstrate and validate the performance of this novel method.

2604.03946 2026-04-07 q-fin.PM stat.AP

Asset allocation using a Markov process of clustered efficient frontier coefficients states

Nolan Alexander, William Scherer, Jamey Thompson

Comments Code: https://github.com/nolanalexander/efficient-frontier-coefficients

Abstract

We propose a novel asset allocation model using a Markov process of states defined by clustered efficient frontier coefficients. While most research on Markov models of the market characterizes regimes using return and volatility, we instead propose characterizing these states using efficient frontiers, which provide more information on the interactions of underlying assets that comprise the market. Efficient frontiers can be decomposed into their functional form, a square-root second-order polynomial defined by three coefficients, to provide a dimensionality reduction of the return vector and covariance matrix. Each month, the proposed model hierarchically clusters the monthly coefficients data up to the current month to characterize the market states, then defines a Markov process on the sequence of states. To incorporate these states into portfolio optimization, for each state, we calculate the tangency portfolio using only return data in that state. We then take the expectation of these weights for each state, weighted by the probability of transitioning from the current state to each state. To empirically validate our proposed model, we employ three sets of assets that span the market, and show that our proposed model significantly outperforms benchmark portfolios.

2604.03939 2026-04-07 stat.ME cs.LG stat.ML

Fused Multinomial Logistic Regression Utilizing Summary-Level External Machine-learning Information

Chi-Shian Dai, Jun Shao

Comments 24 pages, 2 figures

Abstract

In many modern applications, a carefully designed primary study provides individual-level data for interpretable modeling, while summary-level external information is available through black-box, efficient, and nonparametric machine-learning predictions. Although summary-level external information has been studied in the data integration literature, there is limited methodology for leveraging external nonparametric machine-learning predictions to improve statistical inference in the primary study. We propose a general empirical-likelihood framework that incorporates external predictions through moment constraints. An advantage of nonparametric machine-learning prediction is that it induces a rich class of valid moment restrictions that remain robust to covariate shift under a mild overlap condition without requiring explicit density-ratio modeling. We focus on multinomial logistic regression as the primary model and address common data-quality issues in external sources, including coarsened outcomes, partially observed covariates, covariate shift, and heterogeneity in generating mechanisms known as concept shift. We establish large-sample properties of the resulting fused estimator, including consistency and asymptotic normality under regularity conditions. Moreover, we provide mild sufficient conditions under which incorporating external predictions delivers a strict efficiency gain relative to the primary-only estimator. Simulation studies and an application to the National Health and Nutrition Examination Survey on multiclass blood-pressure classification illustrate the proposed method.

2604.03898 2026-04-07 cs.AI stat.CO

LLM-Agent-based Social Simulation for Attitude Diffusion

Deepak John Reji

Abstract

This paper introduces discourse_simulator, an open-source framework that combines LLMs with agent-based modelling. It offers a new way to simulate how public attitudes toward immigration change over time in response to salient events like protests, controversies, or policy debates. Large language models (LLMs) are used to generate social media posts, interpret opinions, and model how ideas spread through social networks. Unlike traditional agent-based models that rely on fixed, rule-based opinion updates and cannot generate natural language or consider current events, this approach integrates multidimensional sociological belief structures and real-world event timelines. This framework is wrapped into an open-source Python package that integrates generative agents into a small-world network topology and a live news retrieval system. discourse_sim is purpose-built as a social science research instrument specifically for studying attitude dynamics, polarisation, and belief evolution following real-world critical events. Unlike other LLM Agent Swarm frameworks, which treat the simulations as a prediction black box, discourse_sim treats it as a theory-testing instrument, which is fundamentally a different epistemological stance for studying social science problems. The paper further demonstrates the framework by modelling the Dublin anti-immigration march on April 26, 2025, with N=100 agents over a 15-day simulation. Package link: https://pypi.org/project/discourse-sim/

2604.03863 2026-04-07 stat.ME

Estimation of treatment effect in clinical trials of continuous endpoints with retrieved dropouts

Myeongjong Kang, Sangyoon Yi

Comments 27 pages, 3 figures, 8 tables

Abstract

The estimand framework provides guidance on handling intercurrent events, such as treatment discontinuation, in the analysis of clinical trial responses. Under ICH E9(R1), the treatment policy (TP) strategy incorporates post-discontinuation data to reflect treatment effects in real-world practice. However, many existing approaches focus primarily on imputing missing endpoint values for lost-to-follow-up subjects and do not explicitly model completers, retrieved dropouts (RDs), and lost-to-follow-up subjects within a unified framework. This may obscure the relationship between modeling assumptions and the estimand of interest when RD data are present. We propose a likelihood-based model for continuous endpoints that integrates data from all subject categories, including RDs. The approach combines an analysis of covariance formulation with a probit model for treatment discontinuation, enabling explicit formulation of treatment effects for estimands defined using the hypothetical and TP strategies. Estimation is carried out via a computationally efficient maximum likelihood procedure. Numerical studies demonstrate that the proposed method achieves improved bias and variability properties compared with commonly used imputation-based approaches.

2604.03840 2026-04-07 stat.ME cs.LG

New insights into Elo algorithm for practitioners and statisticians

Leszek Szczecinski

Abstract

This work reconciles two perspectives on the Elo ranking that coexist in the literature: the practitioner's view as a heuristic feedback rule, and the statistician's view as online maximum likelihood estimation via stochastic gradient ascent. Both perspectives coincide exactly in the binary case (iff the expected score is the logistic function). However, estimation noise forces a principled decoupling between the model used for ranking and the model used for prediction: the effective scale and home-field advantage parameter must be adjusted to account for the noise. We provide both closed-form corrections and a data-driven identification procedure. For multilevel outcomes, an exact relationship exists when outcome scores are uniformly spaced, but approximations are preferred in general: they account for estimation noise and better fit the data. The decoupled approach substantially outperforms the conventional one that reuses the ranking model for prediction, and serves as a diagnostic of convergence status. Applied to six years of FIFA men's ranking, we find that the ranking had not converged for the vast majority of national teams. The paper is written in a semi-tutorial style accessible to practitioners, with all key results accompanied by closed-form expressions and numerical examples.
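
The equivalence in the binary case can be checked directly: with a base-10 logistic expected score, the classic Elo feedback rule is a stochastic gradient ascent step on the Bernoulli log-likelihood. The sketch below is a minimal illustration of that standard fact. The default values (`k=20`, `scale=400`), the home-advantage handling, and the function names are assumptions, and the paper's noise corrections and decoupled prediction model are not reproduced.

```python
import math

def expected_score(r_home, r_away, scale=400.0, home_adv=0.0):
    """Logistic (base-10) expected score of the home player or team."""
    return 1.0 / (1.0 + 10.0 ** (-(r_home + home_adv - r_away) / scale))

def elo_update(r_home, r_away, outcome, k=20.0, scale=400.0, home_adv=0.0):
    """Classic Elo feedback rule; outcome is 1.0 for a home win, 0.0 for a loss."""
    delta = k * (outcome - expected_score(r_home, r_away, scale, home_adv))
    return r_home + delta, r_away - delta

def log_likelihood_grad(r_home, r_away, outcome, scale=400.0, home_adv=0.0):
    """Derivative of the Bernoulli log-likelihood with respect to r_home."""
    e = expected_score(r_home, r_away, scale, home_adv)
    return (math.log(10.0) / scale) * (outcome - e)
```

Under these assumptions the Elo increment equals the gradient multiplied by a step size of `k * scale / ln(10)`, which is the sense in which the heuristic rule and the statistical update coincide.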

2604.03827 2026-04-07 stat.ME stat.AP

Confidence Intervals for Rate Estimation with Importance Sampling in Autonomous Vehicle Evaluation

Aiyou Chen, Ruixuan Rachel Zhou, Joseph J. Lee, Nicholas Chamandy, Henning Hohnhold

Comments 27 pages, 9 figures, Accepted by the Annals of Applied Statistics

Abstract

Accounting for both rare events and complex sampling presents challenges when quantifying uncertainty for rate estimation in autonomous vehicle performance evaluation. In this paper, we introduce a statistical formulation of this problem and develop a unified compound Poisson model framework for unbiased rate estimation through the Horvitz-Thompson estimator. Though asymptotic theory for the model is available, the inference of confidence intervals (CIs) in the presence of rare events requires new investigation. We also advocate for a new monotonicity criterion for rate CIs--summing the rates of disjoint types of events should produce not only a higher point estimate but also higher confidence bounds than for the individual rates--that facilitates interpretability in real applications. We propose a novel exponential bootstrap (EB) method for CI construction based on a fiducial argument; it satisfies the monotonicity property, while novel extensions of some existing methods do not. Comprehensive numerical studies show that EB performs well for a wide range of settings relevant to our applications. Fast implementation of EB based on saddlepoint approximation is also developed, which may be of independent interest.

2604.03810 2026-04-07 stat.ME

A test for normality based on self-similarity

Akin Anarat, Holger Schwender

Abstract

Testing for normality is a widely used procedure in statistics and data analysis, often applied prior to employing methods that rely on the assumption of normally distributed data. While several existing tests target distributional characteristics such as higher-order moments, others focus on functional aspects such as the distribution function. In this article, we propose an alternative idea by exploiting the self-similarity property of the normal distribution and introduce the Self-Similarity Test for Normality (SSTN). This procedure leverages the structural property that the distribution of a suitably centered and scaled sum of independent and identically distributed random variables with finite variance coincides with the original distribution if and only if that distribution is normal. The SSTN evaluates normality by applying a self-similarity transformation to the standardized empirical characteristic function and examining how the transformed functions change across successive applications. For the normal distribution, repeated applications preserve the functional form of the characteristic function, whereas deviations from normality manifest in systematic changes between consecutive transforms. These changes are aggregated into a test statistic, whose null distribution is obtained by Monte Carlo calibration, using a sample-size-specific calibration for small samples and an approximation of the asymptotic null distribution for larger ones. A comprehensive simulation study shows that the SSTN performs at least competitively with, and frequently better than, several well-established tests for normality.
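
The self-similarity property can be stated in terms of characteristic functions: if X1 and X2 are independent copies of a standardized variable with characteristic function phi, then (X1 + X2) / sqrt(2) has characteristic function phi(t / sqrt(2))^2, and the standard normal is a fixed point of this map. The sketch below illustrates only that property. The function names are hypothetical, and the actual SSTN statistic and its Monte Carlo calibration are not reproduced.

```python
import numpy as np

def empirical_cf(sample, t):
    """Empirical characteristic function of the standardized sample at points t."""
    x = np.asarray(sample, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)
    return np.exp(1j * np.outer(t, z)).mean(axis=1)

def self_similarity_step(cf, t):
    """Characteristic function of (X1 + X2) / sqrt(2) for i.i.d. X1, X2 with cf."""
    return cf(np.asarray(t, dtype=float) / np.sqrt(2.0)) ** 2

def normal_cf(t):
    """Characteristic function of the standard normal distribution."""
    return np.exp(-np.asarray(t, dtype=float) ** 2 / 2.0)

def uniform_cf(t):
    """Characteristic function of the uniform law on [-sqrt(3), sqrt(3)] (unit variance)."""
    u = np.sqrt(3.0) * np.asarray(t, dtype=float)
    return np.sin(u) / u
```

Applying `self_similarity_step` to `normal_cf` returns the same function, while applying it to `uniform_cf` changes the values. Changes of this kind between consecutive transforms are what the abstract describes the test as aggregating.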

2604.03772 2026-04-07 stat.ML cs.LG

Debiased Machine Learning for Conformal Prediction of Counterfactual Outcomes Under Runtime Confounding

Keith Barnatchez, Kevin P. Josey, Rachel C. Nethery, Giovanni Parmigiani

Abstract

Data-driven decision making frequently relies on predicting counterfactual outcomes. In practice, researchers commonly train counterfactual prediction models on a source dataset to inform decisions on a possibly separate target population. Conformal prediction has arisen as a popular method for producing assumption-lean prediction intervals for counterfactual outcomes that would arise under different treatment decisions in the target population of interest. However, existing methods require that every confounding factor of the treatment-outcome relationship used for training on the source data is additionally measured in the target population, risking miscoverage if important confounders are unmeasured in the target population. In this paper, we introduce a computationally efficient debiased machine learning framework that allows for valid prediction intervals when only a subset of confounders is measured in the target population, a common challenge referred to as runtime confounding. Grounded in semiparametric efficiency theory, we show the resulting prediction intervals achieve desired coverage rates with faster convergence compared to standard methods. Through numerous synthetic and semi-synthetic experiments, we demonstrate the utility of our proposed method.

2604.03722 2026-04-07 math.PR math.ST stat.TH

Statistical Inference for Fractional Diffusions

Pablo Ramses Alonso-Martin, Horatio Boedihardjo, Anastasia Papavasiliou

Comments Contribution to an edited volume on anomalous diffusions

Abstract

This is a review of statistical inference methodology for stochastic differential equations driven by fractional Brownian motion, otherwise called fractional diffusions. The first section reviews the theory needed to rigorously define them. The second section reviews existing theory of statistical inference for fractional diffusions, identifies remaining challenges and introduces a novel approach. The final section discusses results for the case where fractional diffusions result as a homogenisation limit.

2604.03721 2026-04-07 stat.ML cs.LG stat.ME

The Generalised Kernel Covariance Measure

Luca Bergen, Dino Sejdinovic, Vanessa Didelez

Comments Accepted for the 5th Conference on Causal Learning and Reasoning (CLeaR 2026)

Abstract

We consider the problem of conditional independence (CI) testing and adopt a kernel-based approach. Kernel-based CI tests embed variables in reproducing kernel Hilbert spaces, regress their embeddings on the conditioning variables, and test the resulting residuals for marginal independence. This approach yields tests that are sensitive to a broad range of conditional dependencies. Existing methods, however, rely heavily on kernel ridge regression, which is computationally expensive when properly tuned and yields poorly calibrated tests when left untuned, which limits their practical usefulness. We propose the Generalised Kernel Covariance Measure (GKCM), a regression-model-agnostic kernel-based CI test that accommodates a broad class of regression estimators. Building on the Generalised Hilbertian Covariance Measure framework (Lundborg et al., 2022), we characterise conditions under which GKCM satisfies uniform asymptotic level guarantees. In simulations, GKCM paired with tree-based regression models frequently outperforms state-of-the-art CI tests across a diverse range of data-generating processes, achieving better type I error control and competitive or superior power.

2604.03712 2026-04-07 math.PR math.ST stat.TH

Berry-Esseen Bounds for Statistics of Non-Stationary, $ϕ$-Mixing Random Variables

Brendan Williams, Yeor Hafouta

Abstract

Using a modification of Stein's method, we generalize the results of Bentkus, Götze, and Tikhomirov (1997) to obtain Berry-Esseen bounds for a broad class of statistics of sequences of $ϕ$-mixing, non-stationary random variables with polynomial mixing rates. We then consider applications of this theorem to ensure Berry-Esseen rates for various classes of non-stationary $ϕ$-mixing random variables, including rates for a general class of processes of $ϕ$-mixing random variables satisfying an aggregate third moment bound.

2604.03663 2026-04-07 econ.EM math.ST stat.TH

Robust Priors in Nonlinear Panel Models with Individual and Time Effects

Zizhong Yan, Zhengyu Zhang, Mingli Chen, Jingrong Li, Iván Fernández-Val

Abstract

We develop likelihood-based bias reduction for nonlinear panel models with additive individual and time effects. In two-way panels, integrated-likelihood corrections are attractive but challenging because the required integration is high dimensional and standard Laplace approximations may fail when the parameter dimension grows with the sample size. We propose a target-centered full-exponential Laplace--cumulant expansion that exploits the sparse higher-order derivative structure implied by additive effects, delivering a tractable approximation with a negligible remainder under large-$N,T$ asymptotics. The expansion motivates robust priors that yield bias reduction for both common parameters and fixed effects. We provide implementations for binary, ordered, and multinomial response models with two-way effects. For average partial effects, we show that the remaining first-order bias has a simple variance form and can be removed by a closed-form adjustment. Monte Carlo experiments and an empirical illustration show substantial bias reduction with accurate inference.

2604.03574 2026-04-07 stat.ME stat.AP

Spherically Embedded Time Series with Unknown Trend and Periodic Components

Jiazhen Xu, Han Lin Shang

Abstract

Spherically embedded time series are time series whose values naturally reside on, or can be equivalently mapped to, the sphere. Despite their ubiquity in diverse scientific fields, these data frequently exhibit complex non-stationarity driven by latent trend and periodic components. Traditional Euclidean time series methods fail to account for the intrinsic non-Euclidean geometry of the sphere, leaving a critical gap in rigorous methodologies for modelling and forecasting nonstationary spherically embedded time series. To address this methodological gap, we propose a unified geometric framework to analyse nonstationary spherically embedded time series. Central to our approach is a novel nonparametric spherical trend-periodicity decomposition model that uses an optimal-transport-based removal operation to sequentially extract the smooth trend and periodic components while preserving spherical topology. The resulting de-trended and de-seasonalised stationary residuals can be further modelled using a spherical autoregressive model, formalising a novel trend-periodic spherical autoregressive model. Theoretical foundations for the modelling procedure are established on the consistency under temporal dependence. Extensive simulations corroborate these theoretical guarantees and demonstrate the superior finite-sample predictive performance of the trend-periodic spherical autoregressive model. Finally, we validate the practical utility of our methodology through applications to electricity generation compositions and bike trip volume profiles, yielding significantly enhanced forecasting accuracy while providing interpretable insights into the underlying structural dynamics.

2604.03566 2026-04-07 math.OC stat.ML

Fréchet Regression on the Bures-Wasserstein Manifold

Duc Toan Nguyen, César A. Uribe

Abstract

Fréchet regression, or conditional barycenters, is a flexible framework for modeling relationships between covariates (usually Euclidean) and response variables on general metric spaces, e.g., probability distributions or positive definite matrices. However, in contrast to classical barycenter problems, computing conditional counterparts in many non-Euclidean spaces remains an open challenge, as they yield non-convex optimization problems with an affine structure. In this work, we study the existence and computation of conditional barycenters, specifically in the space of positive-definite matrices with the Bures-Wasserstein metric. We provide a sufficient condition for the existence of a minimizer of the conditional barycenter problem that characterizes the regression range of extrapolation. Moreover, we further characterize the optimization landscape, proving that under this condition, the objective is free of local maxima. Additionally, we develop a projection-free and provably correct algorithm for the approximate computation of first-order stationary points. Finally, we provide a stochastic reformulation that enables the use of off-the-shelf stochastic Riemannian optimization methods for large-scale setups. Numerical experiments validate the performance of the proposed methods on regression problems of real-world biological networks and on large-scale synthetic Diffusion Tensor Imaging problems.
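
To make the objective concrete, the sketch below computes the squared Bures-Wasserstein distance between two symmetric positive semidefinite matrices, d^2(A, B) = tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2}), and a weighted sum of such distances. The weights are taken as given inputs. In Fréchet regression they depend on the covariates and may be negative, which is the source of the non-convexity mentioned above. The function names are hypothetical, and the paper's algorithm is not reproduced.

```python
import numpy as np

def psd_sqrt(m):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    sym = (m + m.T) / 2.0                      # guard against round-off asymmetry
    w, v = np.linalg.eigh(sym)
    w = np.clip(w, 0.0, None)                  # clip tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T

def bures_wasserstein_sq(a, b):
    """Squared Bures-Wasserstein distance between PSD matrices a and b."""
    root_a = psd_sqrt(a)
    cross = psd_sqrt(root_a @ b @ root_a)
    return float(np.trace(a) + np.trace(b) - 2.0 * np.trace(cross))

def weighted_frechet_objective(s, matrices, weights):
    """Weighted sum of squared distances from s to each matrix (weights given)."""
    return sum(w * bures_wasserstein_sq(s, m) for w, m in zip(weights, matrices))
```

For commuting (for example, diagonal) matrices the distance reduces to the Euclidean distance between the square roots of the eigenvalues, which gives a quick sanity check.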

2604.03544 2026-04-07 econ.EM stat.ME

Quantifying Omitted Variable Bias in Nonlinear Instrumental Variable Estimators

Yu-Min Yen

Comments 40 pages, 8 figures

Abstract

We develop a framework for quantifying omitted variable bias (OVB) in nonlinear instrumental variable (IV) estimators, including the local average treatment effect (LATE), the LATE for the treated (LATT), and the partially linear IV model (PLIVM). Extending sensitivity analysis beyond linear settings, we derive bias decompositions, establish partial identification bounds, and construct OVB-adjusted confidence intervals. We estimate OVB bounds and conduct inference using double machine learning (DML), allowing flexible control for high-dimensional covariates. An application to the U.S. Job Training Partnership Act (JTPA) experiment shows that, at conventional significance levels, first-stage compliance estimates are robust to omitted variables, whereas intention-to-treat and treatment effects are more sensitive. Program impacts are robust and significant for females but fragile for males.

2604.03535 2026-04-07 stat.ME

Multilevel Regression Discontinuity Models with Latent Variables

Monica Morell, Youngjin Han, Muwon Kwon, Youjin Sung, Yang Liu, Ji Seung Yang

Abstract

Regression discontinuity (RD) analysis with latent variables, as introduced by Morell et al. (2025), offers a useful augmentation of conventional RD by incorporating a measurement model. This approach is particularly relevant in education research, where a noisy proxy (e.g., an observed test score) of an underlying latent construct is adopted as the running variable. This extension enables extrapolation of the average treatment effect (ATE) away from the cutoff score and assessment of heterogeneous treatment effects. However, a key limitation of the original framework is its single-level structure, which does not account for the multilevel structure commonly found in education data, such as students nested within classrooms or schools. In this study, we extend the framework to multilevel contexts. We discuss models for both the hierarchical RD design, where treatment is assigned at the cluster level, and the multisite RD design, where treatment is assigned at the individual level within clusters. In both cases, a multilevel measurement model is incorporated to describe the relationship between the latent running variable and observed indicators. Monte Carlo simulations demonstrate recovery of ATEs, including extrapolated estimates beyond the cutoff, given adequate cluster-level sample sizes. The study highlights the applicability of RD analysis with latent variables for broader use in educational research, without being restricted by the limitations of multilevel data.

2604.03502 2026-04-07 stat.ML cs.LG stat.ME

Nonparametric Regression Discontinuity Designs with Survival Outcomes

Maximilian Schuessler, Erik Sverdrup, Robert Tibshirani, Stefan Wager

Abstract

Quasi-experimental evaluations are central for generating real-world causal evidence and complementing insights from randomized trials. The regression discontinuity design (RDD) is a quasi-experimental design that can be used to estimate the causal effect of treatments that are assigned based on a running variable crossing a threshold. Such threshold-based rules are ubiquitous in healthcare, where predictive and prognostic biomarkers frequently guide treatment decisions. However, standard RD estimators rely on complete outcome data, an assumption often violated in time-to-event analyses where censoring arises from loss to follow-up. To address this issue, we propose a nonparametric approach that leverages doubly robust censoring corrections and can be paired with existing RD estimators. Our approach can handle multiple survival endpoints, long follow-up times, and covariate-dependent variation in survival and censoring. We discuss the relevance of our approach across multiple areas of applications and demonstrate its usefulness through simulations and the prostate component of the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial where our new approach offers several advantages, including higher efficiency and robustness to misspecification. We have also developed an open-source software package, $\texttt{rdsurvival}$, for the $\texttt{R}$ language.

2604.03437 2026-04-07 stat.AP cs.CY

Is it Cake or is it AI? A Systematic Review of Human Uncertainty in Distinguishing Generative Artificial Intelligence Content

Mark Louie F. Ramos

Abstract

This systematic review synthesized empirical evidence on human ability to distinguish generative artificial intelligence content from human-produced content across text, image, and voice modalities. A structured search of Scopus identified 22,541 records from 2025 to 2026, of which 1,200 were screened and 30 studies were included. Across these studies, human detection accuracy varied widely but generally clustered around chance performance. Overall, the literature shows that humans are generally unreliable detectors of gen AI content, raising broader questions about whether the ability to tell should matter for how we evaluate or trust content.

2604.03398 2026-04-07 stat.ME stat.AP stat.CO

Robust Standard Errors for Bayesian Posterior Functionals via the Infinitesimal Jackknife

Nanyu Luo, Feng Ji

Abstract

Quantitative research in the social and behavioral sciences relies heavily on nonlinear posterior functionals such as indirect effects, standardized coefficients, effect sizes, intraclass correlations, and multilevel variance-explained measures. The posterior standard deviation (PostSD) is the default uncertainty summary for these quantities, yet it presupposes a correctly specified model. When the working model is wrong, as is common with behavioral data that exhibit heavy tails and heteroskedasticity, PostSD can severely underestimate the frequentist standard error. The nonparametric bootstrap offers robustness but requires repeated MCMC refits, while the delta method demands a separate analytic gradient derivation for every new functional. The infinitesimal jackknife standard error (IJSE; Giordano & Broderick, 2023) sidesteps both limitations: it approximates the bootstrap variance through influence functions computed from a single MCMC run, applies to any posterior functional without modification, and requires no analytic derivatives. We discuss the use of the IJSE methodology at both the observation level and the cluster level and evaluate it through four simulation studies covering six functionals from mediation analysis, ANOVA, and multilevel modeling, which are commonly used in the social and behavioral sciences. Under misspecification, PostSD substantially underestimated the true standard error across all settings, whereas IJSE closely tracked the bootstrap at a fraction of the computational cost. Under correct specification, all three methods agreed, confirming that IJSE introduces no distortion when the model is right. These results show IJSE to be a practical, general-purpose tool for robust uncertainty quantification in Bayesian workflows throughout the social and behavioral sciences.

2604.03388 2026-04-07 cs.LG stat.ML

Scalable Variational Bayesian Fine-Tuning of LLMs via Orthogonalized Low-Rank Adapters

Haotian Xiang, Bingcong Li, Qin Lu

Details
English abstract

When deploying large language models (LLMs) to safety-critical applications, uncertainty quantification (UQ) is of utmost importance to self-assess the reliability of the LLM-based decisions. However, such decisions typically suffer from overconfidence, particularly after parameter-efficient fine-tuning (PEFT) for downstream domain-specific tasks with limited data. Existing methods to alleviate this issue either rely on a Laplace-approximation-based post-hoc framework, which may yield suboptimal calibration depending on the training trajectory, or on variational Bayesian training that requires multiple complete forward passes through the entire LLM backbone at inference time for Monte Carlo estimation, posing scalability challenges for deployment. To address these limitations, we build on the Bayesian last layer (BLL) model, where the LLM-based deterministic feature extractor is followed by random last layer parameters for uncertainty reasoning. Since existing low-rank adapters (LoRA) for PEFT have limited expressiveness due to rank collapse, we address this with Polar-decomposed Low-rank Adapter Representation (PoLAR), an orthogonalized parameterization paired with Riemannian optimization to enable more stable and expressive adaptation. Building on this PoLAR-BLL model, we leverage the variational (V) inference framework to put forth a scalable Bayesian fine-tuning approach which jointly seeks the PoLAR parameters and approximate posterior of the last layer parameters via alternating optimization. The resulting PoLAR-VBLL is a flexible framework that nicely integrates architecture-enhanced optimization with scalable Bayesian inference to endow LLMs with well-calibrated UQ. Our empirical results verify the effectiveness of PoLAR-VBLL in terms of generalization and uncertainty estimation on both in-distribution and out-of-distribution data for various common-sense reasoning tasks.

2604.03359 2026-04-07 physics.ao-ph stat.AP

Multidecadal Cycles Study in the Climate Indexes Series Using Wavelet Analysis in North/Northeast Brazil

Cleber Souza Corrêa, Roberto Lage Guedes, Karlmer Abel Bueno Corrêa, Felipe Gustavo Pilau

Comments 9 pages, 3 figures, published in Anuário do Instituto de Geociências (UFRJ), 42(1):66-73, 2019. DOI: 10.11137/2019_1_66_73

Details
Journal ref
Anuário do Instituto de Geociências (UFRJ), 42(1):66-73, 2019
English abstract

This study investigates the climatic index time series over the most recent 80 years, using monthly mean values from the Pacific Decadal Oscillation Index (PDO), Southern Oscillation Index (SOI), and monthly solar activity represented by sunspot numbers (MS), obtained from the National Weather Service Climate Prediction Center and the World Data Center SILSO, Royal Observatory of Belgium, Brussels. The statistical software R was used with the \texttt{WaveletComp} package to generate Morlet wavelet power spectra, and bivariate cross-wavelet analysis using the \texttt{biwavelet} package. The results show predominant cycles with variability scales of 32, 64, 128, and 256 months, corresponding approximately to 2.66, 5.33, 10.66, and 21.33 years. These frequencies are observed in the period from January 1933 to September 2016, totaling 993 months (82.75 years), characterizing decadal and multidecadal variability. These multidecadal cycles (of the order of 10.66 and 21.33 years) suggest a possible association with solar activity variability and climate variability in the ocean-atmosphere system. Rainfall data from January 1951 to September 2017 were analyzed for Belém, São Luiz, Fortaleza, Natal, and Fernando de Noronha, forming a north to northeast Brazilian transect. These series show similarity with the decadal and multidecadal cycles observed in the SOI, PDO, and sunspot series.

2604.03357 2026-04-07 physics.ao-ph stat.AP

Multidecadal Cycles of the Climatic Index: Sunspots that Affect North and Northeast of Brazil

Cleber Souza Corrêa, Roberto Lage Guedes, André Muniz Marinho da Rocha, Karlmer Abel Bueno Corrêa

Comments 10 pages, 4 figures, accepted and published in Journal of Aerospace Technology and Management, 12:e0420, 2020

Details
Journal ref
Journal of Aerospace Technology and Management, 12:e0420, 2020
English abstract

Using the 1951--2017 historical series of the Atlantic Meridional Mode (AMM) index and the monthly number of sunspots, it was possible to observe a significant association between them. The use of wavelet and cross-wavelet analysis showed the presence of multidecadal cycles pronounced in eleven years, as well as cycles of 2.66 and 5.33 years. The AMM index showed, in the Sea Surface Temperature (SST) component, the presence of a weak signal of 21.33 years. The influence and association of sunspot variability on surface temperature in Northern and Northeastern regions of Brazil were investigated. Using a non-parametric statistical correlation test, the historical series of surface temperature anomalies in five locations (Belém, São Luiz, Fortaleza, Fernando de Noronha, and Natal) were compared with the monthly solar-series anomalies. The temperature series used were the minimum monthly average, the monthly average, and maximum monthly average temperatures, with their respective anomalies in relation to the mean. However, among all the series (except for São Luiz), the analyzed minimum temperature anomalies showed a negative correlation with sunspots. As a preliminary result, the analyzed climatic indices present an apparent degree of memory associated with the variability of sunspot activity.

2604.03355 2026-04-07 stat.AP

The Long-Range Memory and the Fractal Dimension: a Case Study for Alcântara

Cleber Souza Correa, Daniel Andrade Schuch, Antonio Paulo de Queiroz, Gilberto Fisch, Felipe do Nascimento Correa, Mariane Mendes Coutinho

Comments 8 pages, 6 figures, published in Journal of Aerospace Technology and Management (2017), DOI: 10.5028/jatm.v9i4.683

Details
Journal ref
Journal of Aerospace Technology and Management, 9(4):461-468, 2017
English abstract

This study aimed to analyze the time series behavior of the Southern Oscillation Index through techniques using Fast Fourier Transform, computing the autocorrelation function, and the calculation of the Hurst coefficient. The methodology of Hurst exponent calculation uses different lags, which are computed in the time series of Southern Oscillation Index. The persistent behavior in the time series can be characterized by calculating the Hurst exponent, seeking more behavioral information, such as the existence of persistence and/or terms of long-range memory in the series. The results show a persistence of the climate in terms of long-memory Southern Oscillation Index time series, which can help to understand complex dynamic behavior in climate effects at global-scale level and specifically its influence in northeastern Brazil, in the region of the Alcântara Launch Center. The R package \texttt{tseriesChaos} was used in the analysis of the Southern Oscillation Index time series, estimating the largest Lyapunov exponent, which indicates the existence of chaotic behavior in time series. The resampling technique was used in a permutation test between the surface wind data in the São Luís airport, Maranhão State, and the Southern Oscillation Index. The permutation test results showed that the time series of monthly average wind speed in the São Luís airport is correlated with the variability of Southern Oscillation Index, statistically significant at the 5\% significance level. The results also indicate the possibility of using autoregressive models to represent average meteorological variables in behavioral analysis, as well as trends in the climate, more specifically a possible climatic influence of El Niño--Southern Oscillation on wind strength in the Alcântara Launch Center.
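
The lag-based Hurst calculation mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common estimator (the slope of the log spread of lagged differences against the log lag), which is not necessarily the authors' procedure; the function name `hurst_exponent`, the `max_lag` parameter, and the synthetic series are hypothetical and chosen only for illustration. Under this estimator, an uncorrelated random walk gives a value near 0.5, and values above 0.5 suggest persistence.

```python
import math
import random
import statistics


def hurst_exponent(series, max_lag=50):
    """Estimate H from how the spread of lagged differences grows with the lag."""
    log_lags, log_spreads = [], []
    for lag in range(2, max_lag + 1):
        # Differences between points that are `lag` steps apart.
        diffs = [series[i + lag] - series[i] for i in range(len(series) - lag)]
        log_lags.append(math.log(lag))
        log_spreads.append(math.log(statistics.pstdev(diffs)))
    # The least-squares slope of log(spread) against log(lag) is the estimate of H.
    mean_x = statistics.fmean(log_lags)
    mean_y = statistics.fmean(log_spreads)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_lags, log_spreads))
    sxx = sum((x - mean_x) ** 2 for x in log_lags)
    return sxy / sxx


# Synthetic data (an assumption for illustration, not the SOI series).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]
walk = []
total = 0.0
for step in noise:
    total += step
    walk.append(total)

print("random walk:", round(hurst_exponent(walk), 2))
print("white noise:", round(hurst_exponent(noise), 2))
```

The estimate depends on the lag range, so in practice it is worth checking how stable the slope is across several choices of `max_lag`.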

2604.03354 2026-04-07 math.OC stat.CO

Optimal Experimental Design using Eigenvalue-Based Criteria with Pyomo.DoE

Daniel J. Laky, Shammah Lilonfe, Shawn B. Martin, Katherine A. Klise, Bethany L. Nicholson, John D. Siirola, Alexander W. Dowling

Comments 82 pages, 14 figures, 11 tables; includes supplementary information

Details
English abstract

Digital twins require high-quality data to achieve predictive capability, but time and resource limitations make efficient experiment design essential. Model-based design of experiments can address this challenge, especially when coupled with equation-oriented optimization and first-principles models. Pyomo.DoE is a software package for optimal experimental design of high-fidelity, equation-oriented models; however, embedding linear algebra operations such as matrix inversion and eigenvalue computation within these optimization problems remains difficult. This work extends Pyomo.DoE with callback-based capabilities that enable rigorous computation of eigenvalue-based design metrics, including minimum eigenvalue optimality (E-optimality) and condition number optimality (ME-optimality), within equation-oriented optimization frameworks. These additions allow experimental design to focus directly on poorly informed or numerically problematic parameter directions. We also present a new experiment-creation modeling abstraction for intrusive uncertainty quantification in Pyomo that reduces user modeling effort by aligning model and software abstractions across the digital twin workflow. In addition, a brief tutorial on experimental design metrics is provided in the methodology and supplementary information. Overall, this work expands the range of practical optimal design criteria available in Pyomo.DoE and improves the workflow for building and refining high-value digital twins.

2604.03341 2026-04-07 stat.AP physics.ao-ph stat.ML

Generative Unsupervised Downscaling of Climate Models via Domain Alignment: Application to Wind Fields

Julie Keisler, Boutheina Oueslati, Anastase Charantonis, Yannig Goude, Claire Monteleoni

Details
English abstract

General Circulation Models (GCMs) are widely used for future climate projections, but their coarse spatial resolution and systematic biases limit their direct use for impact studies. This limitation is particularly critical for wind-related applications, such as wind energy, which require spatially coherent, multivariate, and physically plausible near-surface wind fields. Classical statistical downscaling and bias correction methods partly address this issue. Still, they struggle to preserve spatial structure, inter-variable consistency, and robustness under climate change, especially in high-dimensional settings. Recent advances in generative machine learning offer new opportunities for downscaling and bias correction, eliminating the need for explicitly paired low- and high-resolution datasets. However, many existing approaches remain difficult to interpret and challenging to deploy in operational climate impact studies. In this work, we apply SerpentFlow, an interpretable, generative, domain alignment framework, to the multivariate downscaling and bias correction of wind variables from GCM outputs. The method generates low-resolution/high-resolution training data pairs by separating large-scale spatial patterns from small-scale variability. Large-scale components are aligned across climate model and observational domains. Conditional fine-scale variability is then learned using a flow-matching generative model. We apply the approach to the downscaling of multiple wind variables, including average and maximal wind speed and zonal and meridional components, and compare it with widely used multivariate bias correction methods. Results show improved spatial coherence, inter-variable consistency, and robustness under future climate conditions, highlighting the potential of interpretable generative models for wind and energy applications.

2604.03284 2026-04-07 stat.CO stat.ME

FunctionalCalibration: an R package for estimation in aggregated functional data model

Alex Rodrigo dos Santos Sousa, Vitor Ribas Perrone

Details
English abstract

We consider the statistical problem of estimating constituent curves from observations of their aggregated curves, referred to as aggregated functional data, in models with additive errors. A typical model arises in chemometrics via the Beer-Lambert law. The package FunctionalCalibration provides functions to estimate individual curves from aggregated curves by using splines or wavelet basis expansion.

2604.03271 2026-04-07 stat.CO physics.data-an

GPU-Accelerated Sequential Monte Carlo for Bayesian Spectral Analysis

Tomohiro Nabika, Yui Hayashi, Masato Okada

Details
English abstract

Bayesian spectral deconvolution provides a data-driven framework for mathematical model selection and parameter estimation from spectral data. Although highly versatile, it becomes computationally expensive as the number of model parameters, data points, and candidate models increases, often rendering practical applications infeasible. We propose a GPU-accelerated approach in which a sequential Monte Carlo sampler (SMCS) is run in parallel on a GPU to perform Bayesian model selection of the number of spectral peaks and Bayesian estimation of peak-function parameters. Numerical experiments demonstrate that the GPU-parallelized SMCS achieves speedups exceeding 500x over CPU-parallelized replica exchange Monte Carlo (REMC). The method is validated on artificial data designed to emulate X-ray photoelectron spectroscopy (XPS) and X-ray diffraction (XRD) measurements, as well as on real experimental spectra. As measurement techniques such as microscopic spectroscopy and in-situ methods continue to drive rapid growth in the volume of spectral data, the proposed approach offers a practical computational foundation for advanced analysis of individual datasets.

2604.00966 2026-04-07 math.ST cs.CC stat.TH

A Framework for Computational Lower Bounds in Nontrivial Norm Approximation

Runshi Tang, Yuefeng Han, Anru R. Zhang

Details
English abstract

In this note, we propose a framework for proving computational lower bounds in norm approximation by leveraging a reverse detection--estimation gap. The starting point is a testing problem together with an estimator whose error is significantly smaller than the corresponding computational detection threshold. We show that such a gap yields a lower bound on the approximation distortion achievable by any algorithm in the underlying computational class. In this way, reverse detection--estimation gaps can be turned into a general mechanism for certifying the hardness of approximating nontrivial norms. We apply this framework to the spectral norm of order-$d$ symmetric tensors in $\mathbb{R}^{p^d}$. Using a recently established low-degree hardness result for detecting nonzero high-order cumulant tensors, together with an efficiently computable estimator whose error is below the low-degree detection threshold, we prove that any degree-$D$ low-degree algorithm with $D \le c_d(\log p)^2$ must incur distortion at least $p^{d/4-1/2}/\operatorname{polylog}(p)$ for the tensor spectral norm. Under the low-degree conjecture, the same conclusion extends to all polynomial-time algorithms. In several important settings, this lower bound matches the best known upper bounds up to polylogarithmic factors, suggesting that the exponent $d/4-1/2$ captures a genuine computational barrier. Our results provide evidence that the difficulty of approximating tensor spectral norm is not merely an artifact of existing techniques, but reflects a broader computational barrier.

2604.00672 2026-04-07 cs.CL cs.IR math.ST stat.TH

Common TF-IDF variants arise as key components in the test statistic of a penalized likelihood-ratio test for word burstiness

Zeyad Ahmed, Paul Sheridan, Michael McIsaac, Aitazaz A. Farooque

Comments 27 pages, 3 tables, 7 figures, accepted in Discover Computing 2026

Details
English abstract

TF-IDF is a classical formula that is widely used for identifying important terms within documents. We show that TF-IDF-like scores arise naturally from the test statistic of a penalized likelihood-ratio test setup capturing word burstiness (also known as word over-dispersion). In our framework, the alternative hypothesis captures word burstiness by modeling a collection of documents according to a family of beta-binomial distributions with a gamma penalty term on the precision parameter. In contrast, the null hypothesis assumes that words are binomially distributed in collection documents, a modeling approach that fails to account for word burstiness. We find that a term-weighting scheme derived from this test statistic performs comparably to TF-IDF on document classification tasks. This paper provides insights into TF-IDF from a statistical perspective and underscores the potential of hypothesis testing frameworks for advancing term-weighting scheme development.
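
For readers unfamiliar with the classical formula the abstract refers to, here is a minimal sketch of one textbook TF-IDF variant: raw term count multiplied by log(N / document frequency). This is not the penalized likelihood-ratio statistic derived in the paper; the function name `tf_idf` and the toy documents are hypothetical, and other TF-IDF variants use different smoothing or normalization.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Score each term in each tokenized document as count * log(N / document frequency)."""
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each term at least once.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append(
            {term: count * math.log(n_docs / doc_freq[term]) for term, count in counts.items()}
        )
    return scores


# Toy corpus (an assumption for illustration).
docs = [
    ["spectral", "gap", "gap"],
    ["spectral", "norm"],
    ["spectral", "tensor"],
]
scores = tf_idf(docs)
print(scores[0])
```

A term that occurs in every document, such as "spectral" here, receives a score of zero, while a term repeated within a single document, such as "gap", is weighted up.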

2603.26029 2026-04-07 math.ST cs.CC stat.TH

Detection Is Harder Than Estimation in Certain Regimes: Inference for Moment and Cumulant Tensors

Runshi Tang, Yuefeng Han, Anru R. Zhang

Details
English abstract

We study estimation and detection of high-order moment and cumulant tensors from $n$ i.i.d. observations of a $p$-dimensional random vector, with performance measured in tensor spectral norm. Under sub-Gaussianity, we show that the minimax rate for estimating the order-$d$ moment and cumulant tensors is $\sqrt{p/n}\wedge 1$. In contrast to covariance estimation, the sample moment tensor is generally not rate-optimal for $d\ge 3$, and we construct an estimator that attains the minimax rate up to logarithmic factors. On the computational side, we study testing whether the $d$-th order cumulant tensor vanishes after whitening. Using the low-degree polynomial framework, we provide evidence that detection is computationally hard when $n\ll p^{d/2}$. At the same time, we identify a regime in which an efficiently computable estimator achieves error smaller than the separation at which low-degree tests can reliably distinguish the null from the alternative. This reveals an unusual reverse detection--estimation gap: computationally efficient detection can be harder than computationally efficient estimation. The underlying reason is that the relevant loss, tensor spectral norm, is itself NP-hard to compute, creating a new form of computational--statistical gap.

2602.14303 2026-04-07 stat.ME

A Novel Three-Parameter Extended Weibull Distribution for Health Data Modelling

Isqeel Ogunsola, Nurudeen Ajadi, Gboyega Adepoju

Details
English abstract

The Weibull distribution is widely used in modelling health data. However, its lack of sufficient tail flexibility often results in poor fit for extreme events. We proposed another three-parameter extension of the Weibull distribution with additional flexibility without sacrificing tractability. We derived and studied its statistical properties, including reliability measures, quantile function, moments, stress-strength, mean waiting time, moment generating function, characteristic function, Rényi entropy, order statistics, mean residual life and mode. We adopted the inverse transform approach in random number generation, and through simulation, we evaluated the performance of the maximum likelihood estimates. The fit of the distribution was examined using a fracture dataset and compared with five similar extensions of the Weibull distribution. Our proposed novel distribution fits the data best among the competing models. It is therefore recommended as a better alternative in modelling heavy-tailed health data due to its flexibility. Robust estimation techniques would be valuable in addressing potential numerical challenges associated with the model in future studies.

2601.06597 2026-04-07 cs.LG stat.ML

Understanding and inverse design of implicit bias in stochastic learning: a geometric perspective

Nicola Aladrah, Emanuele Ballarin, Matteo Biagetti, Alessio Ansuini, Alberto d'Onofrio, Fabio Anselmi

Comments v2

Details
English abstract

A key challenge in machine learning is to explain how learning dynamics select among the many solutions that achieve identical loss values in overparameterized models, a phenomenon known as implicit bias. Controlling this bias provides a direct mechanism on learned representations, which are central to interpretability, robustness, and reasoning in modern AI systems. Yet, despite its importance, existing explanations remain largely ad hoc and lack a unifying mechanism. We develop a theoretical and constructive framework in which implicit bias emerges as a geometric correction induced by the interplay between gradient noise and continuous symmetries of the loss. We compute the induced bias across a range of architectures, predicting new behaviors and explaining known ones. The approach also enables inverse design: by engineering predictor-preserving parameterizations, it is possible to shape the bias, with sparsity and spectral sparsity emerging as canonical instances. Numerical experiments support the theory and validate the inverse-design framework in controlled settings.

2601.01422 2026-04-07 stat.CO stat.ME

Hamiltonian Monte Carlo for (Physics) Dummies

Arghya Mukherjee, Dootika Vats

Comments 40 pages, 12 figures, 1 table

Details
English abstract

Sampling-based inference has seen a surge of interest in recent years. Hamiltonian Monte Carlo (HMC) has emerged as a powerful algorithm that leverages concepts from Hamiltonian dynamics to efficiently explore complex target distributions. Variants of HMC are available in popular software packages, enabling off-the-shelf implementations that have greatly benefited the statistics and machine learning communities. At the same time, the availability of such black-box implementations has made it challenging for users to understand the inner workings of HMC, especially when they are unfamiliar with the underlying physical principles. We provide a pedagogical overview of HMC that aims to bridge the gap between its theoretical foundations and practical applicability. This review article seeks to make HMC more accessible to applied researchers by highlighting its advantages, limitations, and role in enabling scalable and exact Bayesian inference for complex models.
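
As a rough illustration of the inner workings the abstract alludes to, the sketch below shows a textbook HMC loop: resample a momentum, simulate Hamiltonian dynamics with the leapfrog integrator, then apply a Metropolis correction. It is a minimal sketch under stated assumptions, not the article's presentation: a one-dimensional standard normal target, unit mass, and fixed illustrative values for the step size and number of steps. The names `leapfrog`, `hmc`, `potential`, and `grad_potential` are hypothetical.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog scheme (unit mass assumed)."""
    p -= 0.5 * step * grad_u(q)      # initial half step for momentum
    for i in range(n_steps):
        q += step * p                # full step for position
        if i < n_steps - 1:
            p -= step * grad_u(q)    # full step for momentum, except after the last position update
    p -= 0.5 * step * grad_u(q)      # final half step for momentum
    return q, p


def hmc(u, grad_u, q0, n_samples, step=0.2, n_steps=10, seed=0):
    """Draw samples from a density proportional to exp(-u(q))."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)      # fresh momentum at every iteration
        q_new, p_new = leapfrog(q, p, grad_u, step, n_steps)
        h_old = u(q) + 0.5 * p * p
        h_new = u(q_new) + 0.5 * p_new * p_new
        # The Metropolis step corrects for the discretization error of the integrator.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


def potential(q):
    """Negative log density of a standard normal, up to an additive constant."""
    return 0.5 * q * q


def grad_potential(q):
    return q


draws = hmc(potential, grad_potential, q0=0.0, n_samples=2000)
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)
print(round(mean, 2), round(var, 2))
```

The step size and trajectory length are the main tuning parameters: larger steps lower the acceptance rate, while very short trajectories behave more like a random walk.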

2512.15056 2026-04-07 stat.AP

Routine Blood Biomarkers Reveal a Preclinical Continuum of Multiple Myeloma Risk

Bingjie Li, Jiadai Xu, Yiqing Sun, Feiyue Pan, Shing-Tung Yau, Peng Liu, Zhigang Yao

Comments 25 pages

Details
English abstract

Multiple myeloma (MM) is preceded by a long preclinical phase spanning decades, yet scalable, non-specialist tools to identify individuals at elevated risk before end-organ damage are lacking. In a prospective analysis of 299,035 cancer-free UK Biobank participants followed for a median of 12.4 years, during which 768 developed incident MM, we conducted a biomarker-wide association scan across 61 routinely measured blood analytes spanning hematological, protein metabolism, renal, and immune categories. Markers of protein dysregulation, namely elevated total protein, depressed albumin, and a low albumin-to-globulin (A/G) ratio, showed the strongest preclinical associations (hazard ratios 0.61-1.54 per SD), consistent with progressive monoclonal immunoglobulin accumulation and suppression of normal polyclonal synthesis years before diagnosis. These signals were accompanied by indicators of erythropoietic suppression, morphological red cell dysregulation, and a shift toward lower neutrophil and higher lymphocyte fractions, reflecting coordinated perturbations across hematopoietic and immune compartments. Longitudinal trajectory analyses showed that these multi-system deviations emerge more than a decade before diagnosis and intensify as clinical onset approaches. Dose-response modelling revealed pronounced nonlinear associations for protein and erythrocytic markers, with risk concentrated among individuals with extreme values. Incorporating significant biomarkers into a clinical risk model improved 10-year MM discrimination from a C-index of 0.684 to 0.744, with the high-risk decile accumulating 0.79% cumulative incidence versus 0.47% under the clinical model alone. These findings provide a practical framework for biomarker-guided MM risk stratification and targeted surveillance using routinely available clinical tests.

2512.11919 2026-04-07 stat.ME cs.AI math.ST stat.TH

A fine-grained look at causal effects in causal spaces

Junhyung Park, Yuqing Zhou

Details
English abstract

The notion of causal effect is fundamental across many scientific disciplines. Traditionally, quantitative researchers have studied causal effects at the level of variables; for example, how a certain drug dose (W) causally affects a patient's blood pressure (Y). However, in many modern data domains, the raw variables, such as pixels in an image or tokens in a language model, do not have the semantic structure needed to formulate meaningful causal questions. In this paper, we offer a more fine-grained perspective by studying causal effects at the level of events, drawing inspiration from probability theory, where core notions such as independence are first given for events and sigma-algebras, before random variables enter the picture. Within the measure-theoretic framework of causal spaces, a recently introduced axiomatisation of causality, we first introduce several binary definitions that determine whether a causal effect is present, and prove some of their properties linking causal effect to (in)dependence under an intervention measure. Further, we provide quantifying measures that capture the strength and nature of causal effects on events, and show that we can recover the common measures of treatment effect as special cases.

2511.05281 2026-04-07 stat.ME

Conditioning on posterior samples for flexible frequentist goodness-of-fit testing

Ritwik Bhaduri, Aabesh Bhattacharyya, Rina Foygel Barber, Lucas Janson

Comments added sensitivity analysis

Details
English abstract

Tests of goodness of fit are used in nearly every domain where statistics is applied. One powerful and flexible approach is to sample artificial data sets that are exchangeable with the real data under the null hypothesis (but not under the alternative), as this allows the analyst to conduct a valid test using any test statistic they desire. Such sampling is typically done by conditioning on either an exact or approximate sufficient statistic, but existing methods for doing so have significant limitations, which either preclude their use or substantially reduce their power or computational tractability for many important models. In this paper, we propose to condition on samples from a Bayesian posterior distribution, which constitute a very different type of approximate sufficient statistic than those considered in prior work. Our approach, approximately co-sufficient sampling via Bayes (aCSS-B), considerably expands the scope of this flexible type of goodness-of-fit testing. We prove the approximate validity of the resulting test, and demonstrate its utility on three common null models where no existing methods apply, as well as its outperformance on models where existing methods do apply.

2510.22068 2026-04-07 cs.LG stat.ML

Deep Gaussian Processes for Functional Maps

Matthew Lowery, Zhitong Xu, Da Long, Keyan Chen, Daniel S. Johnson, Yang Bai, Varun Shankar, Shandian Zhe

Comments 9 pages + 9 page appendix, 7 figures

Details
English abstract

Learning mappings between functional spaces, also known as function-on-function regression, is a fundamental problem in functional data analysis with broad applications, including spatiotemporal forecasting, curve prediction, and climate modeling. Existing approaches often struggle to capture complex nonlinear relationships and/or provide reliable uncertainty quantification when data are noisy, sparse, or irregularly sampled. To address these challenges, we propose Deep Gaussian Processes for Functional Maps (DGPFM). Our method constructs a sequence of GP-based linear and nonlinear transformations directly in function space, leveraging kernel integral transforms, GP conditional means, and nonlinear activations sampled from Gaussian processes. A key insight enables a simplified and flexible implementation: under fixed evaluation locations, discrete approximations of kernel integral transforms reduce to direct functional integral transforms, allowing seamless integration of diverse transform designs. To support scalable probabilistic inference, we adopt inducing points and whitening transformations within a variational learning framework. Empirical results on both real-world and synthetic benchmark datasets demonstrate the advantages of DGPFM in terms of predictive accuracy and uncertainty calibration.

2510.20052 2026-04-07 math.OC cs.LG stat.ML

Endogenous Aggregation of Multiple Data Envelopment Analysis Scores for Large Data Sets

Hashem Omrani, Raha Imanirad, Adam Diamant, Utkarsh Verma, Amol Verma, Fahad Razak

Details
English abstract

We propose an approach for dynamic efficiency evaluation across multiple organizational dimensions using data envelopment analysis (DEA). The method generates both dimension-specific and aggregate efficiency scores, incorporates desirable and undesirable outputs, and is suitable for large-scale problem settings. Two regularized DEA models are introduced: a slack-based measure (SBM) and a linearized version of a nonlinear goal programming model (GP-SBM). While SBM estimates an aggregate efficiency score and then distributes it across dimensions, GP-SBM first estimates dimension-level efficiencies and then derives an aggregate score. Both models utilize a regularization parameter to enhance discriminatory power while also directly integrating both desirable and undesirable outputs. We demonstrate the computational efficiency and validity of our approach on multiple datasets and apply it to a case study of twelve hospitals in Ontario, Canada, evaluating three theoretically grounded dimensions of organizational effectiveness over a 24-month period from January 2018 to December 2019: technical efficiency, clinical efficiency, and patient experience. Our numerical results show that SBM and GP-SBM better capture correlations among input/output variables and outperform conventional benchmarking methods that separately evaluate dimensions before aggregation.

2510.16132 2026-04-07 cs.LG math.OC stat.ML

A Minimal-Assumption Analysis of Q-Learning with Time-Varying Policies

Phalguni Nanda, Zaiwei Chen

Comments 46 pages, 4 figures

Details
English abstract

In this work, we present the first finite-time analysis of Q-learning with time-varying learning policies (i.e., on-policy sampling) for discounted Markov decision processes under minimal assumptions, requiring only the existence of a policy that induces an irreducible Markov chain over the state space. We establish a last-iterate convergence rate for $\mathbb{E}[\|Q_k - Q^*\|_\infty^2]$, implying a sample complexity of order $\mathcal{O}(1/\xi^2)$ for achieving $\mathbb{E}[\|Q_k - Q^*\|_\infty]\le \xi$. This matches the rate of off-policy Q-learning, but with worse dependence on exploration-related parameters. We also derive a finite-time rate for $\mathbb{E}[\|Q^{\pi_k} - Q^*\|_\infty^2]$, where $\pi_k$ is the learning policy at iteration $k$, highlighting the exploration-exploitation trade-off in on-policy Q-learning. While exploration is weaker than in off-policy methods, on-policy learning enjoys an exploitation advantage as the learning policy converges to an optimal one. Numerical results support our theory. Technically, rapidly time-varying learning policies induce time-inhomogeneous Markovian noise, creating significant analytical challenges under minimal exploration. To address this, we develop a Poisson-equation-based decomposition of the Markovian noise under a lazy transition matrix, separating it into a martingale-difference term and residual terms. The residuals are controlled via sensitivity analysis of the Poisson equation solution with respect to both the Q-function estimate and the learning policy. These techniques may extend to other RL algorithms with time-varying policies, such as single-timescale actor-critic methods and learning-in-games algorithms.

2510.06157 2026-04-07 stat.ME

Frequency-Domain Analysis of Time Series with Network-Structured Dependence: Application to Global Bank Connectedness

Cristian F. Jiménez-Varón, Marina I. Knight

Details
English abstract

Financial spillovers in interconnected systems, such as global banking networks, require tools that capture temporal and frequency dynamics while incorporating the underlying network topology. While current network time series models are developed in the time domain, frequency-domain approaches, which reveal how cross-nodal dependencies vary across different cycles, remain under-explored. This paper develops a spectral analysis framework that accommodates flexible forms of network dependence, including interactions mediated through intermediate nodes. This ensures that inter-nodal relationships are not restricted to direct connections, a feature crucial for capturing indirect financial spillovers. We define the network time series spectral density, alongside coherence and partial coherence, and propose both parametric and network-constrained nonparametric methods for their estimation. Simulations and theoretical results demonstrate the strong performance of the parametric approach when the data-generating process aligns with the model structure, whereas the nonparametric alternative provides robustness against model misspecification. An application to global bank connectedness shows that the proposed spectral measures capture inter-bank frequency-specific spillover effects, yielding results consistent with existing measures while additionally uncovering richer patterns of volatility transmission that are intimately connected to the network topology.

2510.05454 2026-04-07 econ.EM stat.ME

Estimating Treatment Effects Under Bounded Heterogeneity

Soonwoo Kwon, Liyang Sun

Comments 45 pages, 5 figures

Details
English abstract

Specifications that impose constant treatment effects are common but biased, while fully flexible alternatives can be imprecise or infeasible. Under a bound on treatment effect heterogeneity, we propose a generalized ridge estimator, $\texttt{regulaTE}$, that yields heterogeneity-aware confidence intervals (CIs). The ridge penalty is chosen to optimally trade off worst-case bias and variance in a Gaussian homoskedastic setting; the resulting CIs remain tight more generally and are valid even under lack of overlap. Varying the bound enables sensitivity analysis to departures from constant effects, which we illustrate in leading empirical applications of unconfoundedness and staggered adoption designs.

2510.04299 2026-04-07 stat.ME

Out-of-bag prediction balls for random forests in metric spaces

Diego Serrano, Eduardo García-Portugués

Comments 28 pages, 8 figures, 6 tables. Supplementary material: 11 pages, 4 figures, 2 tables

Details
English abstract

Statistical methods for metric spaces provide a general and versatile framework for analyzing complex data types. We introduce a novel approach for constructing confidence regions around new predictions from any bagged regression algorithm with metric-space-valued responses. This includes the recent extensions of random forests for metric responses: Fréchet random forests (Capitaine et al., 2024), random forest weighted local constant Fréchet regression (Qiu et al., 2024), and metric random forests (Bulté and Sørensen, 2024). Our prediction regions leverage out-of-bag observations generated during a single forest training, employing the entire data set for both prediction and uncertainty quantification. We establish asymptotic guarantees of out-of-bag prediction balls for four coverage types under certain regularity conditions. Moreover, we demonstrate the superior stability and smaller radius of out-of-bag balls compared to split-conformal methods through extensive numerical experiments where the response lies on the Euclidean space, sphere, hyperboloid, and space of positive definite matrices. A real data application illustrates the potential of the confidence regions for quantifying the uncertainty in the study of solar dynamics and the use of data-driven non-isotropic distances on the sphere.

2509.21940 2026-04-07 stat.ML cs.IT cs.LG math.IT math.ST stat.TH

Sequential 1-bit Mean Estimation with Near-Optimal Sample Complexity

Ivan Lau, Jonathan Scarlett

Comments AISTATS 2026

Details
English abstract

In this paper, we study the problem of distributed mean estimation with 1-bit communication constraints. We propose a mean estimator that is based on (randomized and sequentially-chosen) interval queries, whose 1-bit outcome indicates whether the given sample lies in the specified interval. Our estimator is $(ε, δ)$-PAC for all distributions with bounded mean ($-λ \le \mathbb{E}(X) \le λ$) and variance ($\mathrm{Var}(X) \le σ^2$) for some known parameters $λ$ and $σ$. We derive a sample complexity bound $\widetilde{O}\big( \frac{σ^2}{ε^2}\log\frac{1}{δ} + \log\frac{λ}{σ}\big)$, which matches the minimax lower bound for the unquantized setting up to logarithmic factors and the additional $\log\frac{λ}{σ}$ term that we show to be unavoidable. We also establish an adaptivity gap for interval-query based estimators: the best non-adaptive mean estimator is considerably worse than our adaptive mean estimator for large $\frac{λ}{σ}$. Finally, we give tightened sample complexity bounds for distributions with stronger tail decay, and present additional variants that (i) handle an unknown sampling budget, (ii) adapt to the unknown true variance given (possibly loose) upper and lower bounds on the variance, and (iii) use only two stages of adaptivity at the expense of more complicated (non-interval) queries.

2509.12981 2026-04-07 cs.LG stat.ML

Causal Discovery via Quantile Partial Effect

Yikang Chen, Xingzhe Sun, Dehui Du

Comments 29 pages, 6 figures; ICLR 2026

Details
English abstract

Quantile Partial Effect (QPE) is a statistic associated with conditional quantile regression, measuring the effect of covariates at different levels. Our theory demonstrates that when the QPE of cause on effect is assumed to lie in a finite linear span, cause and effect are identifiable from their observational distribution. This generalizes previous identifiability results based on Functional Causal Models (FCMs) with additive, heteroscedastic noise, etc. Meanwhile, since QPE resides entirely at the observational level, this parametric assumption does not require considering mechanisms, noise, or even the Markov assumption, but rather directly utilizes the asymmetry of shape characteristics in the observational distribution. By performing basis function tests on the estimated QPE, causal directions can be distinguished, which is empirically shown to be effective in experiments on a large number of bivariate causal discovery datasets. For multivariate causal discovery, leveraging the close connection between QPE and score functions, we find that Fisher Information is sufficient as a statistical measure to determine causal order when assumptions are made about the second moment of QPE. We validate the feasibility of using Fisher Information to identify causal order on multiple synthetic and real-world multivariate causal discovery datasets.

2509.02892 2026-04-07 cs.LG stat.ME

Improving Generative Methods for Causal Evaluation via Simulation-Based Inference

Pracheta Amaranath, Vinitra Muralikrishnan, Amit Sharma, David Jensen

Comments 13 pages main text, 68 pages total

Details
English abstract

Generating synthetic datasets that accurately reflect real-world observational data is critical for evaluating causal estimators, but it remains a challenging task. Existing generative methods offer a solution by producing synthetic datasets anchored in the observed data (source data) while allowing variation in key parameters such as the treatment effect and amount of confounding bias. However, it is often unclear which generative methods to use and which values of parameters to choose when generating synthetic datasets. Moreover, existing methods typically require users to provide fixed point estimates of such parameters. This denies users the ability to express uncertainty over both generative methods and parameter values and removes the potential for posterior inference, potentially leading to unreliable estimator comparisons. We introduce simulation-based inference for causal evaluation (SBICE), a framework that treats the generative method and its corresponding generative parameters as uncertain and infers their posterior distribution given a source dataset. Leveraging techniques in simulation-based inference, SBICE identifies suitable generative methods and infers distributions over their parameter configurations to produce synthetic datasets closely aligned with the source data distribution. Empirical results demonstrate that SBICE improves the reliability of estimator evaluations by generating realistic datasets whose causal estimates closely match the estimates of the source data, making it a robust and uncertainty-aware approach to selecting causal estimators.

2509.00472 2026-04-07 stat.ML cs.LG math.ST stat.TH

Partially Functional Dynamic Backdoor Diffusion-based Causal Model

Xinwen Liu, Lei Qian, Song Xi Chen, Niansheng Tang

Comments 16 pages, 2 figures

Details
English abstract

Causal inference in spatio-temporal settings is critically hindered by unmeasured confounders with complex spatio-temporal dynamics and the prevalence of multi-resolution data. While diffusion models present a promising avenue for estimating structural causal models, existing approaches are limited by assumptions of causal sufficiency or static confounding, failing to capture the region-specific, temporally dependent nature of real-world latent variables or to directly handle functional variables. We bridge this gap by introducing the Partially Functional Dynamic Backdoor Diffusion-based Causal Model (PFD-BDCM), a unified generative framework designed to simultaneously tackle causal inference with dynamic confounding and functional data. Our approach formalizes a novel structural causal model that captures spatio-temporal dependencies in latent confounders through conditional autoregressive processes, represents functional variables via basis expansion coefficients treated as standard graph nodes, and integrates valid backdoor adjustment into a diffusion-based generative process. We provide theoretical guarantees on the preservation of causal effects under basis expansion and derive error bounds for counterfactual estimates. Experiments on synthetic data and a real-world air pollution case study demonstrate that PFD-BDCM outperforms existing methods across observational, interventional, and counterfactual queries. This work provides a rigorous and practical tool for robust causal inference in complex spatio-temporal systems characterized by non-stationarity and multi-resolution data.

2508.19640 2026-04-07 math.ST stat.ME stat.TH

Optimal Cox regression under federated differential privacy: coefficients and cumulative hazards

Elly K. H. Hung, Yi Yu

Details
English abstract

We study two foundational problems in distributed survival analysis under federated differential privacy (FDP): estimation of the Cox regression coefficients and of the cumulative baseline hazard functions, allowing for heterogeneous per-server sample sizes and privacy budgets. To quantify the fundamental cost of privacy, we derive minimax lower bounds together with upper bounds that match up to poly-logarithmic factors for the regression coefficients, thereby revealing server-level phase transitions between private and non-private regimes. We also consider a relaxed differential privacy framework with partially public information. Our analysis shows that the role of public covariates depends strongly on the privacy model. For cumulative hazard estimation, we propose a private tree-based version of the Breslow estimator for nonparametric integral estimation under FDP. As a by-product, this leads to a private survival function estimator that attains a nearly minimax optimal rate. Numerical experiments, including a real-data application, support the theoretical findings. The proposed methods are implemented in an accompanying R package FDPCox.

2508.13831 2026-04-07 stat.ML cs.LG

Smooth Flow Matching for Synthesizing Functional Data

Jianbin Tan, Anru R. Zhang

Details
English abstract

Functional data, i.e., smooth random functions observed over a continuous domain, are increasingly available in areas such as biomedical research, health informatics, and epidemiology. However, effective statistical analysis for functional data is often hindered by challenges such as privacy constraints, sparse and irregular sampling, infinite-dimensionality, and non-Gaussian structures. To address these challenges, we introduce a novel framework named Smooth Flow Matching (SFM), tailored for generative modeling of functional data that enables statistical analysis without exposing sensitive real data. Under a copula framework, SFM constructs a parsimonious smooth flow to generate infinite-dimensional functional data, free of Gaussianity and low-rank assumptions. It is computationally efficient, handles irregular observations, and guarantees the smoothness of the generated functions, offering a practical and flexible solution in scenarios where existing deep generative methods are not applicable. Through extensive simulation studies, we demonstrate the advantages of SFM in terms of both synthetic data quality and computational efficiency. We then apply SFM to generate clinical trajectory data from the MIMIC-IV patient electronic health records (EHR) longitudinal database. Our analysis showcases the ability of SFM to produce high-quality surrogate data for downstream tasks, highlighting its potential to boost the utility of EHR data for clinical applications.

2506.21744 2026-04-07 cs.LG stat.AP stat.ML

Federated Item Response Models: A Gradient-driven Privacy-preserving Framework for Distributed Psychometric Estimation

Biying Zhou, Nanyu Luo, Feng Ji

Details
English abstract

Item Response Theory (IRT) models are widely used to estimate respondents' latent abilities and calibrate item difficulty. Traditional IRT estimation typically requires centralizing all raw responses, raising privacy and governance concerns. We introduce Federated Item Response Theory (FedIRT), a framework that enables distributed calibration of standard IRT models without transferring individual-level data, thereby preserving confidentiality while retaining statistical efficiency. To provide formal protection, we further develop FedIRT-DP, a user-level differentially private extension. Each site computes per-student gradients, clips them to a fixed norm, and shares only masked sums; the server adds calibrated Gaussian noise and performs MAP updates. This yields an auditable $(\varepsilon,δ)$ guarantee at the student level and a single, tunable privacy-utility trade-off via the clipping bound and noise scale. The same mechanism improves robustness to extreme response rows (e.g., all-zeros/ones). Across simulations, FedIRT matches the accuracy of centralized estimators from popular $\texttt{R}$ packages while avoiding data pooling; FedIRT-DP achieves comparable accuracy under stronger privacy and exhibits superior robustness to contamination. An empirical study on a real exam dataset demonstrates practical viability and consistent item and site-effect estimates. To facilitate adoption, we release an open-source $\texttt{R}$ package, $\texttt{FedIRT}$, implementing the two-parameter logistic (2PL) and partial credit models (PCM) with federated and differentially private training.
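The clip-then-noise step described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `noise_multiplier` parameter, and the toy gradients are hypothetical, the masking of site sums (secure aggregation) is omitted, and nothing here reflects the actual API of the `FedIRT` package.

```python
import math
import random

def clip_to_norm(grad, clip_norm):
    """Rescale a gradient so its Euclidean norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = 1.0 if norm <= clip_norm else clip_norm / norm
    return [g * scale for g in grad]

def site_clipped_sum(per_student_grads, clip_norm):
    """Site side: clip each student's gradient, then share only the sum."""
    clipped = [clip_to_norm(g, clip_norm) for g in per_student_grads]
    return [sum(col) for col in zip(*clipped)]

def server_noisy_total(site_sums, clip_norm, noise_multiplier, rng):
    """Server side: add Gaussian noise scaled to the clipping bound."""
    total = [sum(col) for col in zip(*site_sums)]
    sigma = noise_multiplier * clip_norm  # assumed calibration, for illustration
    return [t + rng.gauss(0.0, sigma) for t in total]

rng = random.Random(0)
site_a = site_clipped_sum([[3.0, 4.0], [0.3, 0.4]], clip_norm=1.0)
site_b = site_clipped_sum([[0.0, -2.0]], clip_norm=1.0)
noisy = server_noisy_total([site_a, site_b], clip_norm=1.0,
                           noise_multiplier=1.0, rng=rng)
```

Clipping bounds each student's contribution to the sum, which is what lets the noise scale be tied to `clip_norm`; it also limits the influence of extreme response rows.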

2506.07816 2026-04-07 stat.ML cs.LG math.PR

Accelerating Constrained Sampling: A Large Deviations Approach

Yingli Wang, Changwei Tu, Xiaoyu Wang, Lingjiong Zhu

Comments 59 pages, 15 figures

Details
English abstract

The problem of sampling a target probability distribution on a constrained domain arises in many applications including machine learning. For constrained sampling, various Langevin algorithms have been proposed and studied in the literature, such as projected Langevin Monte Carlo (PLMC), based on the discretization of reflected Langevin dynamics (RLD), and, more generally, skew-reflected non-reversible Langevin Monte Carlo (SRNLMC), based on the discretization of skew-reflected non-reversible Langevin dynamics (SRNLD). This work focuses on the long-time behavior of SRNLD, where a skew-symmetric matrix is added to RLD. Although acceleration for SRNLD has been studied, it is not clear how one should design the skew-symmetric matrix in the dynamics to achieve good performance in practice. We establish a large deviation principle (LDP) for the empirical measure of SRNLD when the skew-symmetric matrix is chosen such that its product with the outward unit normal vector field on the boundary is zero. By explicitly characterizing the rate functions, we show that this choice of the skew-symmetric matrix accelerates the convergence to the target distribution compared to RLD and reduces the asymptotic variance. Numerical experiments for SRNLMC based on the proposed skew-symmetric matrix show superior performance, which validates the theoretical findings from the large deviations theory.
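For orientation, here is a minimal sketch of the baseline PLMC iteration only: a Langevin step followed by projection onto the constraint set. The unit-ball domain, the Gaussian-type potential, and the step size are assumptions chosen for illustration; the skew-symmetric term that distinguishes SRNLMC is not implemented here.

```python
import math
import random

def project_to_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]

def plmc(grad_f, x0, step, n_iter, rng, radius=1.0):
    """Projected Langevin Monte Carlo for a density proportional to exp(-f) on a ball."""
    x = list(x0)
    samples = []
    for _ in range(n_iter):
        g = grad_f(x)
        # Unconstrained Langevin step: gradient drift plus Gaussian noise.
        x = [xi - step * gi + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
        # Projection keeps the iterate inside the constraint set.
        x = project_to_ball(x, radius)
        samples.append(x)
    return samples

rng = random.Random(0)
# f(x) = ||x||^2 / 2, so grad f(x) = x (hypothetical target for illustration).
samples = plmc(lambda x: list(x), [0.0, 0.0], step=0.01, n_iter=2000, rng=rng)
```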

2506.00077 2026-04-07 cs.CL cs.LG stat.ML

Gaussian mixture models as a proxy for interacting language models

Edward L. Wang, Mohammad Sharifi Kiasari, Tianyu Wang, Hayden Helm, Avanti Athreya, Carey Priebe, Vince Lyzinski

Details
English abstract

Large language models (LLMs) are powerful tools that, in a number of settings, overlap with the results of human pattern recognition and reasoning. Retrieval-augmented generation (RAG) further allows LLMs to produce tailored output depending on the contents of their RAG databases. However, LLMs depend on complex, computationally expensive algorithms. In this paper, we introduce interacting Gaussian mixture models (GMMs) as a proxy for interacting LLMs. We construct a model of interacting GMMs, complete with an analogue to RAG updating, under which GMMs can generate, exchange, and update data and parameters. We show that this interacting system of Gaussian mixture models, which can be implemented at minimal computational cost, mimics certain aspects of experimental simulations of interacting LLMs whose iterative responses depend on feedback from other LLMs. We build a Markov chain from this system of interacting GMMs; formalize and interpret the notion of polarization for such a chain; and prove lower bounds on the probability of polarization. This provides theoretical insight into the use of interacting Gaussian mixture models as a computationally efficient proxy for interacting large language models.

2505.18288 2026-04-07 stat.ML cs.LG

Operator Learning for Schrödinger Equation: Unitarity, Error Bounds, and Time Generalization

Yash Patel, Unique Subedi, Ambuj Tewari

Comments 37 pages

Details
English abstract

We consider the problem of learning the evolution operator for the time-dependent Schrödinger equation, where the Hamiltonian may vary with time. Existing neural network-based surrogates often ignore fundamental properties of the Schrödinger equation, such as linearity and unitarity, and lack theoretical guarantees on prediction error or time generalization. To address this, we introduce a linear estimator for the evolution operator that preserves a weak form of unitarity. We establish both upper bounds and lower bounds on the prediction error of the proposed estimator that hold uniformly over classes of sufficiently smooth initial wave functions. Additionally, we derive time generalization bounds that quantify how the estimator extrapolates beyond the time points seen during training. Experiments across real-world Hamiltonians -- including hydrogen atoms, ion traps for qubit design, and optical lattices -- show that our estimator achieves relative errors up to two orders of magnitude smaller than state-of-the-art methods such as the Fourier Neural Operator and DeepONet.

2505.12530 2026-04-07 cs.LG math.OC stat.ML

Enforcing Fair Predicted Scores on Intervals of Percentiles by Difference-of-Convex Constraints

Yutian He, Yankun Huang, Yao Yao, Qihang Lin

Comments 45 pages, 12 figures, 4 tables. This work is published in the proceedings of AISTATS 2026

Details
English abstract

Fairness in machine learning has become a critical concern. Existing approaches often focus on achieving full fairness across all score ranges generated by predictive models, ensuring fairness in both high- and low-percentile populations. However, this stringent requirement can compromise predictive performance and may not align with the practical fairness concerns of stakeholders. In this work, we propose a novel framework for building partially fair machine learning models that enforce fairness only within a specific percentile interval of interest while maintaining flexibility in other regions. We introduce statistical metrics to evaluate partial fairness within a given percentile interval. To achieve partial fairness, we propose an in-processing method by formulating the model training problem as constrained optimization with difference-of-convex constraints, which can be solved by an inexact difference-of-convex algorithm (IDCA). We provide the complexity analysis of IDCA for finding a nearly KKT point. Through numerical experiments on real-world datasets, we demonstrate that our framework achieves high predictive performance while enforcing partial fairness where it matters most.

2505.11211 2026-04-07 cs.LG cs.AI stat.ME stat.ML

Bayesian Hierarchical Invariant Prediction

Francisco Madaleno, Pernille Julie Viuff Sand, Francisco C. Pereira, Sergio Hernan Garrido Mejia

Details
English abstract

We propose Bayesian Hierarchical Invariant Prediction (BHIP), reframing Invariant Causal Prediction (ICP) through the lens of Hierarchical Bayes. We leverage the hierarchical structure to explicitly test invariance of causal mechanisms under heterogeneous data, resulting in improved computational scalability for a larger number of predictors compared to ICP. Moreover, given its Bayesian nature, BHIP enables the use of prior information. We evaluate BHIP on both synthetic and real-world datasets, demonstrating its potential as an alternative inference method to ICP and related methods.

2504.05297 2026-04-07 stat.ME econ.EM stat.AP stat.CO

Eigenvalue-Based Randomness Test for Residual Diagnostics in Panel Data Models

Marcell T. Kurbucz, Betsabé Pérez Garrido, Antal Jakovác

Comments 10 pages, 3 figures

Details
Journal ref
Econometrics and Statistics, 2026
English abstract

This paper introduces the Eigenvalue-Based Randomness (EBR) test, a novel approach rooted in the Tracy-Widom law from random matrix theory, and applies it to residual analysis in panel data models. Unlike traditional methods, which target specific issues like cross-sectional dependence or autocorrelation, the EBR test simultaneously examines multiple assumptions by analyzing the largest eigenvalue of a symmetrized residual matrix. Monte Carlo simulations demonstrate that the EBR test is particularly robust in detecting not only standard violations such as autocorrelation and linear cross-sectional dependence (CSD) but also more intricate non-linear and non-monotonic dependencies, making it a comprehensive and highly flexible tool for enhancing the reliability of panel data analyses.

2503.03206 2026-04-07 cs.LG cs.CV math.ST stat.ML stat.TH

An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models

Binxu Wang, Cengiz Pehlevan

Comments 96 pages, 29 figures. Published in Advances in Neural Information Processing Systems, NeurIPS 2025 (Spotlight)

Details
English abstract

We develop an analytical framework for understanding how the generated distribution evolves during diffusion model training. Leveraging a Gaussian-equivalence principle, we solve the full-batch gradient-flow dynamics of linear and convolutional denoisers and integrate the resulting probability-flow ODE, yielding analytic expressions for the generated distribution. The theory exposes a universal inverse-variance spectral law: the time for an eigen- or Fourier mode to match its target variance scales as $τ \propto λ^{-1}$, so high-variance (coarse) structure is mastered orders of magnitude sooner than low-variance (fine) detail. Extending the analysis to deep linear networks and circulant full-width convolutions shows that weight sharing merely multiplies learning rates -- accelerating but not eliminating the bias -- whereas local convolution introduces a qualitatively different bias. Experiments on Gaussian and natural-image datasets confirm that the spectral law persists in deep MLP-based U-Nets. Convolutional U-Nets, however, display rapid near-simultaneous emergence of many modes, implicating local convolution in reshaping learning dynamics. These results underscore how data covariance governs the order and speed with which diffusion models learn, and they call for deeper investigation of the unique inductive biases introduced by local convolution.
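As a worked reading of the inverse-variance law, suppose the proportionality constant $c$ is shared across modes (an assumption; the abstract does not state the constant). Then the ratio of learning times of two modes depends only on the ratio of their variances:

```latex
\tau_k = \frac{c}{\lambda_k}
\quad\Longrightarrow\quad
\frac{\tau_i}{\tau_j} = \frac{\lambda_j}{\lambda_i},
\qquad\text{e.g. } \lambda_i = 100,\ \lambda_j = 1
\ \Rightarrow\ \tau_i = \tfrac{1}{100}\,\tau_j .
```

Under this reading, a mode with 100 times the variance would reach its target variance in roughly one hundredth of the time.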

2503.03049 2026-04-07 stat.ME stat.AP

Estimating treatment effects with competing intercurrent events in randomized controlled trials

Sizhu Lu, Yanyao Yi, Yongming Qu, Huayu Karen Liu, Ting Ye, Peng Ding

Details
English abstract

The analysis of randomized controlled trials is often complicated by intercurrent events (IEs) -- events that occur after treatment initiation and affect either the interpretation or existence of outcome measurements. Examples include treatment discontinuation or the use of additional medications. In two recent clinical trials for systemic lupus erythematosus with complications of IEs, we classify the IEs into two broad categories: effect-informative (e.g., treatment discontinuation due to adverse events or lack of efficacy) and effect-uninformative (e.g., treatment discontinuation due to external factors such as pandemics or relocation). To define a clinically meaningful estimand, we adopt tailored strategies for each category of IEs. For effect-informative IEs, which are often informative about a patient's outcome, we use the composite variable strategy that assigns an outcome value indicative of treatment failure. For effect-uninformative IEs, we apply the hypothetical strategy, assuming their timing is conditionally independent of the outcome given treatment and baseline covariates, and hypothesizing a scenario in which such events do not occur. A central yet previously overlooked challenge is the presence of competing IEs, where the first IE censors all subsequent ones. Despite its ubiquity in practice, this issue has not been explicitly recognized or addressed in previous data analyses due to the lack of rigorous statistical methodology. In this paper, we propose a principled framework to formulate the estimand, establish its nonparametric identification and semiparametric estimation theory, and introduce weighting, outcome regression, and doubly robust estimators. We apply our methods to analyze the two systemic lupus erythematosus trials, demonstrating the robustness and practical utility of the proposed framework.

2502.15567 2026-04-07 cs.LG stat.ML

Model Privacy: A Unified Framework for Understanding Model Stealing Attacks and Defenses

Ganghua Wang, Yuhong Yang, Jie Ding

Comments Journal of the Royal Statistical Society Series B: Statistical Methodology, 2026

Details
English abstract

The use of machine learning (ML) has become increasingly prevalent in various domains, highlighting the importance of understanding and ensuring its safety. One pressing concern is the vulnerability of ML applications to model stealing attacks. These attacks involve adversaries attempting to recover a learned model through limited query-response interactions, such as those found in cloud-based services or on-chip artificial intelligence interfaces. While existing literature proposes various attack and defense strategies, these often lack a theoretical foundation and standardized evaluation criteria. In response, this work presents a framework called "Model Privacy", providing a foundation for comprehensively analyzing model stealing attacks and defenses. We establish a rigorous formulation for the threat model and objectives, propose methods to quantify the goodness of attack and defense strategies, and analyze the fundamental tradeoffs between utility and privacy in ML models. Our developed theory offers valuable insights into enhancing the security of ML models, especially highlighting the importance of the attack-specific structure of perturbations for effective defenses. We demonstrate the application of model privacy from the defender's perspective through various learning scenarios. Extensive experiments corroborate the insights and the effectiveness of defense mechanisms developed under the proposed framework.

2502.02020 2026-04-07 cs.LG stat.ME

Causal Bandit Over Unknown Graphs: Upper Confidence Bounds With Backdoor Adjustment

Yijia Zhao, Qing Zhou

Details
English abstract

The causal bandit problem seeks to identify, through sequential experimentation, an intervention that maximizes the expected reward in a causal system modeled by a directed acyclic graph (DAG). Existing methods typically assume that the causal graph is known or impose restrictive structural assumptions. In this paper, we study causal bandit problems when the causal graph is unknown. We first consider Gaussian DAG models without latent confounders. By combining observational and experimental data collected sequentially during the bandit process, we identify candidate backdoor adjustment sets for each intervention arm. These sets enable estimation of causal effects and construction of upper confidence bounds that integrate information from both data sources. Based on these estimates, we propose a new algorithm, termed backdoor-adjustment upper confidence bound (BA-UCB), for sequential intervention selection. We establish finite-time upper bounds on the cumulative regret of BA-UCB, showing improved rates and substantially relaxed dependency on the number of intervention arms compared to standard multi-armed bandit methods. We further extend the methodology and theoretical guarantees to settings with latent confounders, where the observed variables are modeled by an acyclic directed mixed graph. Simulation studies demonstrate that BA-UCB achieves substantially lower cumulative regret and favorable computational efficiency relative to existing approaches.
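The abstract does not spell out how BA-UCB forms its confidence bounds, so the sketch below shows only the standard UCB1 rule that the paper uses as a point of comparison (standard multi-armed bandit methods). It is not BA-UCB: there is no graph learning and no backdoor adjustment, and the function names and example numbers are hypothetical.

```python
import math

def ucb1_index(mean, pulls, t):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)

def select_arm(means, pulls, t):
    """Pull each arm once, then choose the arm with the largest index."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    return max(range(len(means)),
               key=lambda a: ucb1_index(means[a], pulls[a], t))

# An unpulled arm is tried first.
first = select_arm([0.5, 0.6, 0.0], [10, 10, 0], t=21)
# A rarely pulled arm can win despite a lower empirical mean.
second = select_arm([0.5, 0.6], [2, 50], t=52)
```

In UCB1 each arm is explored separately, which is the source of the dependence on the number of arms that the abstract says BA-UCB relaxes.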

2501.07571 2026-04-07 math.ST stat.TH

Statistical learnability of smooth boundaries via pairwise binary classification with deep ReLU networks

Hiroki Waida, Takafumi Kanamori

Details
English abstract

Nonparametric estimation of smooth boundaries has been extensively studied in the conventional setting where pairs of a single covariate and a response variable are observed. However, this traditional setting often suffers from the cost of data collection. Recent years have witnessed the steady development of learning algorithms for binary classification problems where one instead observes paired covariates and a binary variable representing the statistical relationship between the covariates. In this work, we theoretically study the learnability of ordered multiple smooth boundaries under a pairwise binary classification setting. One challenging problem is the non-identifiability of the order of the smooth subsets, which yields a gap between the generalizability and the learnability of smooth boundaries in the pairwise binary classification setting. To deal directly with the challenges due to this non-identifiability, we develop a proof method using a localization argument for the given vector-valued function class. Consequently, we prove that some ordered multiple smooth boundaries are learnable via a pairwise binary classification algorithm defined with a localized class of deep ReLU networks.

2412.13453 2026-04-07 stat.ME

Modeling extremal dependence in multivariate and spatial problems: a practical perspective

Boris Beranger, Simone A. Padoan

Details
English abstract

From environmental sciences to finance, there is a growing demand for methods that can assess the risks of extreme events beyond those observed in available data. Extrapolating extreme events beyond the range of the data is not straightforward. Risk assessments are often further complicated by the need to account for multiple variables simultaneously. Extreme value theory provides important tools for the analysis of multivariate or spatial extreme events, but these are not easily accessible to professionals without appropriate expertise. This article provides a minimal background on multivariate and spatial extremes and gives simple yet thorough instructions on how to analyse them using the R package ExtremalDep. After briefly introducing the statistical methodologies, we focus on road-testing the package's toolbox through several real-world applications.

2411.13443 2026-04-07 math.NA cs.NA math.OC stat.ML

Nonlinear Assimilation via Score-based Sequential Langevin Sampling

Zhao Ding, Chenguang Duan, Yuling Jiao, Jerry Zhijian Yang, Cheng Yuan, Pingwen Zhang

Abstract

This paper introduces score-based sequential Langevin sampling (SSLS), a novel approach to nonlinear data assimilation within a recursive Bayesian filtering framework. The proposed method decomposes the assimilation process into alternating prediction and update steps, using dynamic models for state prediction and incorporating observational data via score-based Langevin Monte Carlo during the updates. To overcome inherent challenges in highly non-log-concave posterior sampling, we integrate an annealing strategy into the update mechanism. Theoretically, we establish convergence guarantees for SSLS in total variation (TV) distance, yielding concrete insights into the algorithm's error behavior with respect to key hyperparameters. Crucially, our derived error bounds demonstrate the asymptotic stability of SSLS, guaranteeing that local posterior sampling errors do not accumulate indefinitely over time. Extensive numerical experiments across challenging scenarios, including high-dimensional systems, strong nonlinearity, and sparse observations, highlight the robust performance of the proposed method. Furthermore, SSLS effectively quantifies the uncertainty associated with state estimates, rendering it particularly valuable for reliable error calibration.

2411.02225 2026-04-07 stat.ML cs.IT cs.LG math.IT math.ST stat.TH

Sparse Max-Affine Regression

Haitham Kanj, Seonho Kim, Kiryung Lee

Abstract

This paper presents Sparse Gradient Descent (Sp-GD) as a solution for variable selection in convex piecewise linear regression, where the model is given as the maximum of $k$ affine functions $x \mapsto \max_{j \in [k]} \langle a_j^\star, x \rangle + b_j^\star$. Here, $\{ a_j^\star\}_{j=1}^k$ and $\{b_j^\star\}_{j=1}^k$ denote the ground-truth weight vectors and intercepts. A non-asymptotic local convergence analysis is provided for Sp-GD under sub-Gaussian noise when the covariate distribution satisfies the sub-Gaussianity and anti-concentration properties. When the model order and parameters are fixed, Sp-GD provides an $\epsilon$-accurate estimate given $\mathcal{O}(\max(\epsilon^{-2}\sigma_z^2,1)s\log(d/s))$ observations, where $\sigma_z^2$ denotes the noise variance. This also implies exact parameter recovery by Sp-GD from $\mathcal{O}(s\log(d/s))$ noise-free observations. The proposed initialization scheme uses sparse principal component analysis to estimate the subspace spanned by $\{ a_j^\star\}_{j=1}^k$, then applies an $r$-covering search to estimate the model parameters. A non-asymptotic analysis is presented for this initialization scheme when the covariates and noise samples follow Gaussian distributions. When the model order and parameters are fixed, this initialization scheme provides an $\epsilon$-accurate estimate given $\mathcal{O}(\epsilon^{-2}\max(\sigma_z^4,\sigma_z^2,1)s^2\log^4(d))$ observations. A new transformation named Real Maslov Dequantization (RMD) is proposed to transform sparse generalized polynomials into sparse max-affine models. The error decay rate of RMD is shown to be exponentially small in its temperature parameter. Furthermore, theoretical guarantees for Sp-GD are extended to the bounded noise model induced by RMD. Numerical Monte Carlo results corroborate theoretical findings for Sp-GD and the initialization scheme.
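As a rough illustration of the model class described in this abstract (not of Sp-GD itself), the sketch below evaluates a max-affine function whose weight vectors are sparse. The function name `max_affine` and the toy parameters are assumptions made for this example only.

```python
def max_affine(x, weights, intercepts):
    """Evaluate max_j <a_j, x> + b_j for weight vectors a_j and intercepts b_j."""
    return max(
        sum(a_i * x_i for a_i, x_i in zip(a, x)) + b
        for a, b in zip(weights, intercepts)
    )

# Hypothetical sparse parameters: only the first coordinate is active,
# so the model reduces to |x_1| regardless of the other coordinates.
weights = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
intercepts = [0.0, 0.0]

value = max_affine([-2.0, 5.0, 7.0], weights, intercepts)
```

Here the second and third covariates never influence `value`, which is the kind of structure variable selection is meant to recover.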

2410.00985 2026-04-07 stat.ME

Nonparametric tests of treatment effect homogeneity for policy-makers

Oliver Dukes, Mats J. Stensrud, Riccardo Brioschi, Aaron Hudson

Abstract

Recent work has focused on nonparametric estimation of conditional treatment effects, but inference has remained relatively unexplored. We propose a class of nonparametric tests for both quantitative and qualitative treatment effect heterogeneity. The tests can incorporate a variety of structured assumptions on the conditional average treatment effect, allow for both continuous and discrete covariates, and do not require sample splitting to obtain a tractable asymptotic null distribution. Furthermore, we show how the tests are tailored to detect alternatives where the population impact of adopting a personalized decision rule differs from using a rule that discards covariates. The proposal is thus relevant for guiding treatment policies. The utility of the proposal is borne out in simulation studies and a re-analysis of an AIDS clinical trial.

2309.10284 2026-04-07 stat.ME math.ST stat.AP stat.TH

Rank-adaptive covariance testing with applications to genomics and neuroimaging

David Veitch, Yinqiu He, Jun Young Park

Abstract

In biomedical studies, testing for differences in covariance offers scientific insights beyond mean differences, especially when differences are driven by complex joint behavior between features. However, when differences in joint behavior are weakly dispersed across many dimensions and arise from differences in low-rank structures within the data, as is often the case in genomics and neuroimaging, existing two-sample covariance testing methods may suffer from power loss. The Ky-Fan(k) norm, defined by the sum of the top k singular values, is a simple and intuitive matrix norm able to capture signals caused by differences in low-rank structures between matrices, but its statistical properties in hypothesis testing have not been studied well. In this paper, we investigate the behavior of the Ky-Fan(k) norm in two-sample covariance testing. Ultimately, we propose a novel methodology, Rank-Adaptive Covariance Testing (RACT), which is able to leverage differences in low-rank structures found in the covariance matrices of two groups in order to maximize power. RACT uses permutation for statistical inference, ensuring an exact Type I error control. We validate RACT in simulation studies and evaluate its performance when testing for differences in gene expression networks between two types of lung cancer, as well as testing for covariance heterogeneity in diffusion tensor imaging (DTI) data taken on two different scanner types.
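For readers unfamiliar with the Ky-Fan(k) norm, it is the sum of the k largest singular values of a matrix. The sketch below computes it with NumPy on a toy matrix. It illustrates the norm only, not the RACT procedure; the function name `ky_fan_norm` and the example matrix are assumptions.

```python
import numpy as np

def ky_fan_norm(matrix, k):
    """Return the sum of the k largest singular values of `matrix`."""
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    if not 1 <= k <= singular_values.size:
        raise ValueError("k must be between 1 and min(matrix.shape)")
    return float(singular_values[:k].sum())

# Toy stand-in for a difference between two covariance matrices.
diff = np.diag([3.0, -2.0, 0.5])

norm_k1 = ky_fan_norm(diff, 1)  # largest singular value (spectral norm)
norm_k2 = ky_fan_norm(diff, 2)
norm_k3 = ky_fan_norm(diff, 3)  # all singular values (nuclear norm)
```

Taking k = 1 gives the spectral norm and k = min(matrix.shape) gives the nuclear norm, so k controls how many leading directions contribute to the statistic.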

2307.09366 2026-04-07 cs.LG stat.ME stat.ML

Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

Kayhan Behdin, Wenyu Chen, Rahul Mazumder

Comments Operations Research (to appear)

Abstract

We consider the problem of learning a sparse graph underlying an undirected Gaussian graphical model, a key problem in statistical machine learning. Given $n$ samples from a multivariate Gaussian distribution with $p$ variables, the goal is to estimate the $p \times p$ inverse covariance matrix (aka precision matrix), assuming it is sparse (i.e., has a few nonzero entries). We propose GraphL0BnB, a new estimator based on an $\ell_0$-penalized version of the pseudo-likelihood function, while most earlier approaches are based on the $\ell_1$-relaxation. Our estimator can be formulated as a convex mixed integer program (MIP) which can be difficult to compute beyond $p\approx 100$ using off-the-shelf commercial solvers. To solve the MIP, we propose a custom nonlinear branch-and-bound (BnB) framework that solves node relaxations with tailored first-order methods. As a key component of our BnB framework, we propose large-scale solvers for obtaining good primal solutions that are of independent interest. We derive novel statistical guarantees (estimation and variable selection) for our estimator and discuss how our approach improves upon existing estimators. Our numerical experiments on real and synthetic datasets suggest that our BnB framework offers significant advantages over off-the-shelf commercial solvers, and our approach has favorable performance (both in terms of runtime and statistical performance) compared to the state-of-the-art approaches for learning sparse graphical models.

2306.06581 2026-04-07 stat.ML cs.DS cs.LG math.OC

Importance Sparsification for Sinkhorn Algorithm

Mengyu Li, Jun Yu, Tao Li, Cheng Meng

Comments Accepted by Journal of Machine Learning Research

Abstract

The Sinkhorn algorithm has been used pervasively to approximate the solution to optimal transport (OT) and unbalanced optimal transport (UOT) problems. However, its practical application is limited due to the high computational complexity. To alleviate the computational burden, we propose a novel importance sparsification method, called Spar-Sink, to efficiently approximate entropy-regularized OT and UOT solutions. Specifically, our method employs natural upper bounds for unknown optimal transport plans to establish effective sampling probabilities, and constructs a sparse kernel matrix to accelerate Sinkhorn iterations, reducing the computational cost of each iteration from $O(n^2)$ to $\widetilde{O}(n)$ for a sample of size $n$. Theoretically, we show the proposed estimators for the regularized OT and UOT problems are consistent under mild regularity conditions. Experiments on various synthetic data demonstrate Spar-Sink outperforms mainstream competitors in terms of both estimation error and speed. A real-world echocardiogram data analysis shows Spar-Sink can effectively estimate and visualize cardiac cycles, from which one can identify heart failure and arrhythmia. To evaluate the numerical accuracy of cardiac cycle prediction, we consider the task of predicting the end-systole time point using the end-diastole one. Results show Spar-Sink performs as well as the classical Sinkhorn algorithm while requiring significantly less computational time.
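To make the $O(n^2)$ per-iteration cost concrete, here is a minimal sketch of the classical dense Sinkhorn iteration for entropy-regularized OT. This is the baseline, not Spar-Sink: no sparsification or importance sampling is performed. The function name, the regularization value, and the toy marginals are assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.5, n_iters=200):
    """Classical Sinkhorn scaling for entropy-regularized optimal transport.

    a, b : source and target marginals (each sums to 1)
    cost : pairwise cost matrix of shape (len(a), len(b))
    reg  : entropic regularization strength
    """
    kernel = np.exp(-cost / reg)   # dense Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        # Each update is a dense matrix-vector product: O(n^2) per iteration.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    # Transport plan: diag(u) @ kernel @ diag(v)
    return u[:, None] * kernel * v[None, :]

a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
cost = np.array([[0.0, 1.0], [1.0, 0.0]])
plan = sinkhorn(a, b, cost)
```

The two matrix-vector products inside the loop are the step a sparse kernel would speed up, since their cost scales with the number of nonzero kernel entries.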

2306.04119 2026-04-07 stat.ME

Improving Survey Inference in Two-phase Designs Using Bayesian Machine Learning

Xinru Wang, Anyu Zhu, Lauren Kennedy, Abigail Greenleaf, Qixuan Chen

Abstract

The two-phase sampling design is a cost-effective strategy widely used in public health research. Analyzing the Phase II sample often involves creating subsample-specific weights. However, these weights can be highly variable, leading to unstable weighted analyses. Alternatively, the rich data collected during the first phase can be leveraged to improve survey inference for the Phase II sample. In this paper, we propose a Bayesian tree-based multiple imputation (MI) approach for estimating population means using the Phase II sample, where the parent survey was conducted using a complex survey design. The design features of the parent survey, such as strata and clusters, are incorporated into the tree-based imputation models. Through simulations, we demonstrate that the tree-based MI method outperforms traditional weighted estimators, yielding smaller bias, lower root mean squared error, and narrower 95% confidence intervals, with coverage rates closer to the nominal level. Furthermore, we show that Rubin's variance estimation method provides valid statistical inference for population mean estimation in our setting. We illustrate the application of the proposed tree-based MI method using data from a cellphone survey on COVID-19 vaccination in Uganda, which represents a subcohort sample drawn from the 2020 Uganda Population-based HIV Impact Assessment Survey.
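The abstract refers to Rubin's variance estimation for multiple imputation. As a minimal sketch of the standard combining rules (not the authors' tree-based imputation model), the code below pools m imputed-data estimates and their within-imputation variances. The function name and the toy numbers are assumptions.

```python
from statistics import mean, variance

def rubin_combine(estimates, within_variances):
    """Pool multiply-imputed estimates with Rubin's combining rules.

    estimates        : point estimate from each of the m imputed datasets
    within_variances : estimated variance of each point estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need m >= 2 estimates with matching variances")
    pooled = mean(estimates)
    within = mean(within_variances)   # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical results from m = 3 imputed datasets.
point, total_var = rubin_combine([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
```

The factor (1 + 1/m) inflates the between-imputation component to account for using a finite number of imputations.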

2006.04363 2026-04-07 cs.LG cs.AI stat.ML

Mitigating Value Hallucination in Dyna Planning via Multistep Predecessor Models

Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin Talvitie, Michael Bowling, Martha White

Comments Published in Journal of Artificial Intelligence Research (JAIR) in 2024. Updated to published version, changed title to JAIR version, added a new author that led the submission

Abstract

Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by updating the value function with simulated experience generated by an environment model. However, it is often difficult to learn accurate models of environment dynamics, and even small errors may result in failure of Dyna agents. In this paper, we highlight that one potential cause of that failure is bootstrapping off of the values of simulated states, and introduce a new Dyna algorithm to avoid this failure. We discuss a design space of Dyna algorithms, based on using successor or predecessor models -- simulating forwards or backwards -- and using one-step or multi-step updates. Three of the variants have been explored, but surprisingly the fourth variant has not: using predecessor models with multi-step updates. We present the Hallucinated Value Hypothesis (HVH): updating the values of real states towards values of simulated states can result in misleading action values which adversely affect the control policy. We discuss and evaluate all four variants of Dyna amongst which three update real states toward simulated states -- so potentially toward hallucinated values -- and our proposed approach, which does not. The experimental results provide evidence for the HVH, and suggest that using predecessor models with multi-step updates is a promising direction toward developing Dyna algorithms that are more robust to model error.