arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.01172 2026-04-02 stat.ME stat.AP

Functional Moments Regression

Mingyuan Li, Martin A. Lindquist, Edward Gunning, Ciprian Crainiceanu

Abstract

The Gaussian Process (GP) assumption is often used in functional data analysis. We propose a method to assess departures from the GP assumption, both in terms of the shape of the distribution and its potential dependence on covariates, using a sequence of functional moment regressions. Our methods are inspired by and applied to objectively measured minute-level physical activity data from the National Health and Nutrition Examination Survey (NHANES) 2011-2014 study. In this setting, we find that the GP assumption is not satisfied, quantify the associations between functional moments and covariates, and show that standard data transformations, such as the log transformation, do not resolve the discrepancy between assumptions and reality. We further show that when the effect sizes are moderate, inference on the functional fixed effects is largely unaffected by departures from the GP assumption. However, when effect sizes are small, both inference and prediction of subject-level data can be strongly affected. Extensive simulations support these findings. This pragmatic paper presents new methods for real data analysis, with implications for statistical methodology and for understanding human activity and health.

2604.01170 2026-04-02 cs.LG cs.AI cs.CL stat.AP stat.ML

Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning

Cai Zhou, Zekai Wang, Menghua Wu, Qianyu Julie Zhu, Flora C. Shi, Chenyu Wang, Ashia Wilson, Tommi Jaakkola, Stephen Bates

Comments 20 pages

Abstract

While test-time scaling has enabled large language models to solve highly difficult tasks, state-of-the-art results come at exorbitant compute costs. These inefficiencies can be attributed to the miscalibration of post-trained language models, and the lack of calibration in popular sampling techniques. Here, we present Online Reasoning Calibration (ORCA), a framework for calibrating the sampling process that draws upon conformal prediction and test-time training. Specifically, we introduce a meta-learning procedure that updates the calibration module for each input. This allows us to provide valid confidence estimates under distributional shift, e.g. in thought patterns that occur across different stages of reasoning, or in prompt distributions between model development and deployment. ORCA not only provides theoretical guarantees on conformal risks, but also empirically shows higher efficiency and generalization across different reasoning tasks. At risk level $δ=0.1$, ORCA improves Qwen2.5-32B efficiency on in-distribution tasks with savings up to 47.5% with supervised labels and 40.7% with self-consistency labels. Under zero-shot out-of-domain settings, it improves MATH-500 savings from 24.8% of the static calibration baseline to 67.0% while maintaining a low empirical error rate, and the same trend holds across model families and downstream benchmarks. Our code is publicly available at https://github.com/wzekai99/ORCA.

2604.01019 2026-04-02 physics.soc-ph cs.CY physics.data-an stat.AP

Car Dependency in Urban Accessibility

Bruno Campanelli, Francesco Marzolla, Matteo Bruno, Hygor Piaget Monteiro Melo, Vittorio Loreto

Abstract

To achieve net-zero emissions, cities must transition away from reliance on private vehicles. However, car-centric urban growth has transformed the automobile from a convenience tool into a necessity for accessing essential services, creating significant "car dependency". This study introduces a novel Car Dependency Index (CDI) that quantifies the accessibility gap between private and public transport across 18 cities in Europe and North America. Utilising high-resolution geospatial data and numerical simulations, we reveal pronounced spatial inequalities, showing that car dependency remains a primary driver of car ownership even when accounting for income. A "what-if" simulation of the planned metro expansion in Rome predicts a reduction of approximately 60,000 commuting vehicles, yet highlights that isolated interventions have localised impacts. We conclude that systemic, network-level transit expansions are essential to dismantle car-based systems and foster equitable, sustainable urban mobility. Our framework provides policymakers with an objective, scalable tool to identify viable areas for car-free zones and target infrastructure investments effectively.

2604.00987 2026-04-02 stat.ML cs.AI cs.LG

Bridging Structured Knowledge and Data: A Unified Framework with Finance Applications

Yi Cao, Zexun Chen, Lin William Cong, Heqing Shi

Abstract

We develop Structured-Knowledge-Informed Neural Networks (SKINNs), a unified estimation framework that embeds theoretical, simulated, previously learned, or cross-domain insights as differentiable constraints within flexible neural function approximation. SKINNs jointly estimate neural network parameters and economically meaningful structural parameters in a single optimization problem, enforcing theoretical consistency not only on observed data but over a broader input domain through collocation, and therefore nesting approaches such as functional GMM, Bayesian updating, transfer learning, PINNs, and surrogate modeling. SKINNs define a class of M-estimators that are consistent and asymptotically normal with root-N convergence, sandwich covariance, and recovery of pseudo-true parameters under misspecification. We establish identification of structural parameters under joint flexibility, derive generalization and target-risk bounds under distributional shift in a convex proxy, and provide a restricted-optimal characterization of the weighting parameter that governs the bias-variance tradeoff. In an illustrative financial application to option pricing, SKINNs improve out-of-sample valuation and hedging performance, particularly at longer horizons and during high-volatility regimes, while recovering economically interpretable structural parameters with improved stability relative to conventional calibration. More broadly, SKINNs provide a general econometric framework for combining model-based reasoning with high-dimensional, data-driven estimation.

2604.00961 2026-04-02 stat.ME

Bayesian Multi-Group Functional Factor Models with Parameter-Expanded Cumulative Shrinkage Priors

Xuanye Dai, Anna Gottard, Michele Guindani, Marina Vannucci

Abstract

Functional data consist of trajectories observed over a continuous domain, such as time, space, or wavelength. Here we consider curves observed on different groups of subjects and propose a Bayesian multi-group functional factor analysis framework that jointly models the data via an explicit decomposition into group-specific mean functions and latent components that capture both common and distinct latent structures across the groups. We represent these functional components as linear combinations of a common set of B-spline bases, achieving a low-rank representation of the latent factors. We further impose a parameter-expanded cumulative shrinkage process prior on the factor loadings, which induces increasing shrinkage and automatically selects the number of active shared and group-specific factors. We evaluate the model's performance through simulation studies and show that the model accurately recovers the number of underlying factors and effectively distinguishes variations in functional observations driven by shared versus group-specific complex structures under various scenarios. For real data analysis, we apply the model to EEG data on alcoholic and healthy subjects and identify shared latent factors that capture canonical characteristic components of the EEG curves, along with group-specific factors that reveal specific neural activity patterns.

2604.00956 2026-04-02 stat.ME stat.AP

Model Assisted Data Integration: An unbiased sampling strategy to use nonprobability data

Martin Hyllienmark, Gustaf Strandell

Abstract

The aim of survey statistics is to produce estimates with minimal bias and a corresponding acceptable variance given a specific budget, preferably with a minor response burden for the participants. In recent years, considerable efforts have been made to achieve this through the extended use of found or non-probability data. However, to be able to safely utilize such data, rigorous theoretical foundations are needed, where one main concern is the lack of control due to not having access to the selection mechanism for the data. Several methods have been proposed in the literature to deal with this, though often relying on assumptions that may be difficult or impossible to verify in practice. Extending the Data Integrated (DI) estimator introduced by Kim and Tam (2021), this paper introduces the Model Assisted Data Integration (MADI) sampling strategy. The proposed sampling strategy includes an estimator that has the desired properties: it is design-unbiased, has a design-unbiased variance estimator and is suitable for the intense production cycle of the statistical agency. The estimator uses nonprobability data combined with a probability sample that has a sampling design which aims to include individuals not captured by the nonprobability data. The estimator can use arbitrary machine learning models to produce unbiased estimates. A main conclusion of the paper is that the proposed sampling strategy can produce estimates with much lower variances compared to traditional survey estimators, and we use real empirical data to illustrate this point.

2604.00951 2026-04-02 stat.CO

Quantum Statistical Bootstrap

Yongkai Chen, Ping Ma, Wenxuan Zhong

Abstract

The bootstrap is a foundational tool in statistical inference, but its classical implementation relies on Monte Carlo resampling, introducing approximation error and incurring high computational cost -- especially for large datasets and complex models. We present the Quantum Bootstrap (QBOOT), a quantum algorithm that computes the ideal bootstrap estimate exactly by encoding all possible resamples in quantum superposition, evaluating the target statistic in parallel, and extracting the aggregate via quantum amplitude estimation. Under mild circuit efficiency assumptions, QBOOT achieves a near-quadratic speedup over the classical bootstrap in approximating the ideal estimator, independent of the statistic or underlying distribution. We provide a rigorous theoretical analysis of its statistical error properties -- addressing a gap in the quantum algorithms literature -- and validate our results through experiments on the IBM quantum simulator for the sample mean problem. Our findings demonstrate that QBOOT preserves the asymptotic properties of the ideal bootstrap while substantially improving computational efficiency and precision, establishing a scalable and principled framework for quantum statistical inference.
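To make the distinction between the ideal bootstrap and its Monte Carlo approximation concrete, here is a minimal classical sketch for the sample mean. It is not the QBOOT algorithm and involves no quantum computation; the function names, the toy data, and the number of resamples are assumptions chosen for illustration. For a tiny sample, all n^n resamples can be enumerated, so the ideal bootstrap standard error is computed exactly and can be compared with a Monte Carlo estimate.

```python
import itertools
import random
import statistics


def ideal_bootstrap_se(data):
    """Exact bootstrap standard error of the mean.

    Enumerates all n**n equally likely resamples, so this is only
    feasible for very small n (hypothetical helper for illustration).
    """
    n = len(data)
    means = [sum(resample) / n for resample in itertools.product(data, repeat=n)]
    return statistics.pstdev(means)


def monte_carlo_bootstrap_se(data, num_resamples, seed=0):
    """Monte Carlo approximation: draw random resamples with replacement."""
    rng = random.Random(seed)
    n = len(data)
    means = [sum(rng.choices(data, k=n)) / n for _ in range(num_resamples)]
    return statistics.pstdev(means)


data = [2.0, 4.0, 7.0, 11.0]  # toy sample, n = 4, so 4**4 = 256 resamples
exact = ideal_bootstrap_se(data)
# For the mean, the ideal bootstrap SE has a closed form:
# population standard deviation of the sample divided by sqrt(n).
closed_form = statistics.pstdev(data) / len(data) ** 0.5
approx = monte_carlo_bootstrap_se(data, num_resamples=2000)

print(f"ideal (enumerated): {exact:.4f}")
print(f"ideal (closed form): {closed_form:.4f}")
print(f"Monte Carlo (2000 resamples): {approx:.4f}")
```

The enumerated and closed-form values agree up to floating-point error, while the Monte Carlo value differs by a resampling error that shrinks only as the number of resamples grows. Enumeration scales as n^n, which is why classical practice falls back on Monte Carlo for realistic sample sizes.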

2604.00947 2026-04-02 cs.CL cond-mat.stat-mech stat.ML

Phase transition on a context-sensitive random language model with short range interactions

Yuma Toji, Jun Takahashi, Vwani Roychowdhury, Hideyuki Miyahara

Abstract

Since the random language model was proposed by E. DeGiuli [Phys. Rev. Lett. 122, 128301], language models have been investigated intensively from the viewpoint of statistical mechanics. Recently, the existence of a Berezinskii--Kosterlitz--Thouless transition was numerically demonstrated in models with long-range interactions between symbols. In statistical mechanics, it has long been known that long-range interactions can induce phase transitions. Therefore, it has remained unclear whether phase transitions observed in language models originate from genuinely linguistic properties that are absent in conventional spin models. In this study, we construct a random language model with short-range interactions and numerically investigate its statistical properties. Our model belongs to the class of context-sensitive grammars in the Chomsky hierarchy and allows explicit reference to contexts. We find that a phase transition occurs even when the model refers only to contexts whose length remains constant with respect to the sentence length. This result indicates that finite-temperature phase transitions in language models are genuinely induced by the intrinsic nature of language, rather than by long-range interactions.

2604.00942 2026-04-02 cs.LG cs.CR math.ST stat.TH

Differentially Private Manifold Denoising

Jiaqi Wu, Yiqing Sun, Zhigang Yao

Comments 59 pages

Abstract

We introduce a differentially private manifold denoising framework that allows users to exploit sensitive reference datasets to correct noisy, non-private query points without compromising privacy. The method follows an iterative procedure that (i) privately estimates local means and tangent geometry using the reference data under calibrated sensitivity, (ii) projects query points along the privately estimated subspace toward the local mean via corrective steps at each iteration, and (iii) performs rigorous privacy accounting across iterations and queries using $(\varepsilon,δ)$-differential privacy (DP). Conceptually, this framework brings differential privacy to manifold methods, retaining sufficient geometric signal for downstream tasks such as embedding, clustering, and visualization, while providing formal DP guarantees for the reference data. Practically, the procedure is modular and scalable, separating DP-protected local geometry (means and tangents) from budgeted query-point updates, with a simple scheduler allocating privacy budget across iterations and queries. Under standard assumptions on manifold regularity, sampling density, and measurement noise, we establish high-probability utility guarantees showing that corrected queries converge toward the manifold at a non-asymptotic rate governed by sample size, noise level, bandwidth, and the privacy budget. Simulations and case studies demonstrate accurate signal recovery under moderate privacy budgets, illustrating clear utility-privacy trade-offs and providing a deployable DP component for manifold-based workflows in regulated environments without reengineering privacy systems.

2604.00873 2026-04-02 astro-ph.CO math.ST stat.TH

Transport-Geometric Formulation of Peak Statistics: Curvature-Conditioned Point Processes and Response Hierarchy

Tsutomu T. Takeuchi

Comments 24 pages, no figure, submitted

Abstract

We develop a geometric formulation of peak statistics in cosmological density fields based on optimal transport and entropy. In this framework, the density field is treated as a probability measure, and its local structure is characterized by the Hessian of the log-density, which arises as the local response of an entropy functional in Wasserstein space. Peaks are thereby defined as positive-curvature stationary points, and their number density is expressed as a curvature-conditioned point process. In the linear Gaussian limit, the joint distribution of local variables closes in terms of a finite set of spectral moments, recovering the standard theory of peak statistics, known as BBKS. This clarifies that BBKS corresponds to a solvable limit of a more general structure combining probability distributions, curvature constraints, and geometric measure. The framework extends naturally beyond Gaussianity and linearity. Deviations from Gaussianity are incorporated as deformations of the joint distribution of curvature variables, while nonlinear structures are described through the curvature of the log-density. We further derive the two- and three-point peak statistics as curvature-conditioned $n$-point measures, and show that the full hierarchy of peak statistics can be organized as response functions to long-wavelength background modes. In this formulation, the conventional peak bias appears as the lowest-order response coefficient, with higher-order correlations arising as its natural extensions. This work embeds peak theory into a unified geometric framework and provides a systematic basis for incorporating nonlinearity, non-Gaussianity, and higher-order statistics, with direct relevance for observational applications.

2604.00872 2026-04-02 stat.ME stat.CO

On the approximation of the between-set correlation matrix by canonical correlation analysis

Jan Graffelman

Comments 18 pages, 3 figures

Abstract

Canonical correlation analysis is a classic well-known multivariate statistical method focusing on the relationships between two sets of variables. The visualisation of those relationships can be achieved by means of a biplot of the between-set correlation matrix. The canonical analysis provides a low-rank approximation to the between-set correlation matrix that is optimal in a generalised least squares sense. This article proposes to adjust the between-set correlation matrix using either a single scalar effect, or column and/or row effects. An alternating generalised least squares algorithm is proposed to obtain optimal adjustments and low-rank factorisations. The adjustment leads to a better approximation of the between-set correlation matrix that achieves a lower root mean squared error in comparison with the classic canonical analysis. The results of the adjusted analysis can be efficiently visualised using biplots, with a minimal change in interpretation rules that only affects the biplot origin. Biplot calibration is used to enhance the visualisation of the results of the adjusted analysis. Some examples with publicly available data sets from social science, geochemistry and medical science illustrate the proposed improvement. Software for carrying out the adjusted canonical analysis in the R environment is provided.

2604.00811 2026-04-02 stat.ML cs.LG stat.ME

Deconfounding Scores and Representation Learning for Causal Effect Estimation with Weak Overlap

Oscar Clivio, Alexander D'Amour, Alexander Franks, David Bruns-Smith, Chris Holmes, Avi Feller

Comments To appear at AISTATS 2026

Abstract

Overlap, also known as positivity, is a key condition for causal treatment effect estimation. Many popular estimators suffer from high variance and become brittle when features differ strongly across treatment groups. This is especially challenging in high dimensions: the curse of dimensionality can make overlap implausible. To address this, we propose a class of feature representations called deconfounding scores, which preserve both identification and the target of estimation; the classical propensity and prognostic scores are two special cases. We characterize the problem of finding a representation with better overlap as minimizing an overlap divergence under a deconfounding score constraint. We then derive closed-form expressions for a class of deconfounding scores under a broad family of generalized linear models with Gaussian features and show that prognostic scores are overlap-optimal within this class. We conduct extensive experiments to assess this behavior empirically.

2604.00772 2026-04-02 stat.AP stat.ME

Beyond the Beta Lorenz Curve: A New Parametric Family for Poverty and Inequality Estimation

José María Sarabia, Vanesa Jordá, Emilio Gómez-Déniz

Abstract

The estimation of inequality and poverty measures is frequently constrained by a lack of individual data. Many countries, including China, continue to report income data in the form of aggregated income shares. In this context, the Beta Lorenz curve, introduced by Kakwani (Econometrica, 48, 1980), has become a standard tool for reconstructing income distributions at both academic and institutional levels. Notably, alongside the General Quadratic (GQ) Lorenz curve, it represents the primary specification used by the World Bank to construct its official poverty estimates when microdata is unavailable. In this paper, we demonstrate that Kakwani's model fails to satisfy the formal requirements of a genuine Lorenz curve. To address this, we identify the specific constraints that ensure the theoretical validity of this model and introduce a new family of Lorenz curves derived from the corrected parametric space. Our analysis, conducted across more than 2,000 datasets, reveals that our proposed four-parameter specification provides highly accurate estimates of several poverty and inequality measures. Our results show that this model consistently outperforms the GQ Lorenz curve, which we find tends to underestimate poverty in over 80 percent of the analyzed cases.

2604.00697 2026-04-02 stat.ML cs.LG

Inverse-Free Sparse Variational Gaussian Processes

Stefano Cortinovis, Laurence Aitchison, Stefanos Eleftheriadis, Mark van der Wilk

Comments Accepted to AISTATS 2026. 20 pages, 3 figures, 2 tables

Abstract

Gaussian processes (GPs) offer appealing properties but are costly to train at scale. Sparse variational GP (SVGP) approximations reduce cost yet still rely on Cholesky decompositions of kernel matrices, ill-suited to low-precision, massively parallel hardware. While one can construct valid variational bounds that rely only on matrix multiplications (matmuls) via an auxiliary matrix parameter, optimising them with off-the-shelf first-order methods is challenging. We make the inverse-free approach practical by proposing a better-conditioned bound and deriving a matmul-only natural-gradient update for the auxiliary parameter, markedly improving stability and convergence. We further provide simple heuristics, such as step-size schedules and stopping criteria, that make the overall optimisation routine fit seamlessly into existing workflows. Across regression and classification benchmarks, we demonstrate that our method 1) serves as a drop-in replacement in SVGP-based models (e.g., deep GPs), 2) recovers similar performance to traditional methods, and 3) can be faster than baselines when well tuned.

2604.00683 2026-04-02 stat.ME math.OC stat.ML

Convergence of projected stochastic natural gradient variational inference for various step size and sample or batch size schedules

Thomas Guilmeau, Hadrien Hendrikx, Florence Forbes

Abstract

Stochastic natural gradient variational inference (NGVI) is a popular and efficient algorithm for Bayesian inference. Despite empirical success, the convergence of this method is still not fully understood. In this work, we define and study a projected stochastic NGVI when variational distributions form an exponential family. Stochasticity arises when either gradients are intractable expectations or large sums. We prove new non-asymptotic convergence results for combinations of constant or decreasing step sizes and constant or increasing sample/batch sizes. When all hyperparameters are fixed, NGVI is shown to converge geometrically to a neighborhood of the optimum, while we establish convergence to the optimum with rates of the form $\mathcal{O}\left(\frac{1}{T^ρ} \right)$, possibly with $ρ\geq 1$, for all other combinations of step size and sample/batch size schedules. These rates apply when the target posterior distribution is close in some sense to the considered exponential family. Our theoretical results extend existing NGVI and stochastic optimization results and provide more flexibility to adjust, in a principled way, step sizes and sample/batch sizes in order to meet speed, resources, or accuracy constraints.

2604.00662 2026-04-02 stat.AP

Feature Reconstruction and Monitoring of Load Test Data under Varying Environmental Conditions

Lizzie Neumann, Philipp Wittenberg, Alexander Mendler, Jan Gertheiss

Abstract

System outputs in Structural Health Monitoring (SHM), such as sensor measurements or extracted features like eigenfrequencies, are influenced not only by (potential) damage but also by environmental and operational variables (EOV). Identifying these factors and removing their effects from the data is essential before proceeding with further analysis. Most existing methods for this task focus on the expected values of system outputs, e.g., using different types of response surface modeling. However, it has been shown that confounding variables can also affect the (co-)variance of and between system outputs. This is particularly important because the covariance matrix is an essential building block in many damage detection methods in SHM. Beyond standard response surface modeling, a nonparametric kernel approach can be used to estimate a conditional covariance matrix that can change depending on the identified confounding factor. This improves our understanding of how, e.g., temperature affects the system outputs. In this work, we present a new confounder-adjusted version of feature reconstruction. It uses the conditional covariance matrix as the basis for (conditional) principal component analysis. The resulting (conditional) principal component scores are then used to reconstruct system outputs with the confounding influences removed. In particular, the new approach eliminates the confounders' effect on both the mean and the covariance. As will be shown on load test data from the Vahrendorfer Stadtweg bridge in Hamburg, Germany, the reconstructed features can then be employed for monitoring, e.g., using an appropriate control chart, resulting in fewer false alarms and a higher probability of detecting damage.

2604.00655 2026-04-02 math.ST stat.TH

Semiparametric Fisher Information in Models parametrized by a Normed Space

Telmo Pérez-Izquierdo

Comments 22 pages, 0 figures

Abstract

This paper studies semiparametric Fisher information in models parametrized by general normed spaces. The main contribution is to establish that positive semiparametric Fisher information is equivalent to the gradient of the parameter of interest lying in the range of the adjoint score operator. This result generalizes a key theorem of van der Vaart (1991) and provides a unified framework linking differentiability and information, beyond Hilbert spaces. The paper develops normed-space mean-square-differentiable models for two canonical problems: estimation of the average of a known transformation and estimation of a density at a point. In these applications, it shows that positive information holds if and only if the transformation has finite variance and if and only if the density has positive mass at the evaluation point, respectively. These findings offer a novel information-theoretic perspective on known minimax results and clarify the conditions under which root-n estimation is possible.

2604.00644 2026-04-02 stat.ME

Covariance Matrix Estimation for High-Dimensional Interval-Valued Data with Positive Definiteness

Wan Tian, Wenhao Cui, Rui Zhang, Bingyi Jing, Yang Liu, Yijie Peng

Comments 32 pages, 4 figures, 4 tables

Abstract

In the realm of high-dimensional data analysis, the estimation of covariance matrices is a fundamental task, and this holds true for interval-valued data as well. However, there is no unified definition for the covariance matrix of interval-valued data, let alone established estimation methods in high-dimensional settings. This paper presents a novel approach to estimating covariance matrices for high-dimensional interval-valued data while ensuring positive definiteness. We begin by assuming that the upper and lower bounds of interval-valued variables share the same dependency structure. Based on this assumption, we extend the classical soft-thresholding covariance matrix estimator to the interval-valued scenario, referred to as the Interval-valued Soft-Thresholding (IST) estimator. Subsequently, to ensure the positive definiteness of the estimator, we impose a positive definiteness constraint on the IST estimator. We derive an alternating direction method to solve the proposed problem and establish its convergence. Under some very mild conditions, we develop a non-asymptotic statistical theory for the proposed estimator. Simulation studies and applications to high-frequency financial data from the CSI 300 Index demonstrate the effectiveness of the proposed estimator.
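For readers unfamiliar with the classical soft-thresholding covariance estimator that the abstract takes as its starting point, the sketch below shows the standard point-valued version: each off-diagonal entry of the sample covariance matrix is shrunk toward zero by a threshold and set to zero if it is smaller than that threshold. This is not the IST estimator and does not handle interval-valued data or the positive definiteness constraint; the helper names, the toy data, and the threshold `lam` are assumptions for illustration.

```python
def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of observations (rows)."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def soft_threshold(cov, lam):
    """Shrink every off-diagonal entry toward zero by lam; keep the diagonal."""
    p = len(cov)
    out = [row[:] for row in cov]
    for i in range(p):
        for j in range(p):
            if i != j:
                shrunk = max(abs(cov[i][j]) - lam, 0.0)
                out[i][j] = shrunk if cov[i][j] >= 0 else -shrunk
    return out


# Hypothetical toy data: 5 observations of 3 variables.
rows = [
    [1.0, 2.1, 0.3],
    [2.0, 3.9, 0.1],
    [3.0, 6.2, 0.4],
    [4.0, 8.1, 0.2],
    [5.0, 9.8, 0.3],
]
lam = 0.1  # threshold level, chosen arbitrarily for illustration
cov = sample_covariance(rows)
thr = soft_threshold(cov, lam)

for row in thr:
    print([round(value, 4) for value in row])
```

Entrywise thresholding of this kind does not by itself guarantee a positive definite result, which is consistent with the abstract's decision to add an explicit positive definiteness constraint.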

2604.00593 2026-04-02 astro-ph.CO math.ST stat.TH

A Geometric Theory of Cosmological Structure via Entropic Curvature in Wasserstein Space

Tsutomu T. Takeuchi

Comments 16 pages, 1 figure, submitted

Abstract

We construct a geometric framework for cosmological large-scale structure based on optimal transport theory and Wasserstein geometry. In this framework, Ricci curvature on the probability measure space $\mathcal{P}_2(M)$ is characterized by the geodesic convexity of entropy and is formulated as the response of probability distributions to optimal transport. We introduce effective Ricci curvatures $K_{\mathrm{eff}}^{(\infty)}$ and $K_{\mathrm{eff}}^{(N)}$ associated with Kullback--Leibler-type and Rényi-type entropies, corresponding respectively to the curvature-dimension conditions CD$(K,\infty)$ and CD$(K,N)$. By localizing these curvatures to finite scales using local and reference measures, we construct curvature indicators applicable to observational data. Under a local quadratic approximation, the effective curvature reduces to the Hessian of the log-density, showing that conventional Hessian-based structure classifications arise as a limiting case of the present framework. We further show that effective curvature depends on observational scale and formulate this dependence as a scale flow, distinct from Ricci flow because it describes a change of resolution rather than a time evolution of geometry. Treating curvature as a random field then extends the statistical description of density fields: curvature statistics are given by higher-order weighted integrals of the power spectrum and by spatial derivatives of the correlation function, emphasizing geometric rather than amplitude information. This framework provides a unified connection between optimal transport geometry and cosmological structure analysis, and offers a new perspective on multiscale structure and nonlinear statistics.

2604.00553 2026-04-02 stat.ML cs.LG cs.SY eess.SY math.OC

Scenario theory for multi-criteria data-driven decision making

Simone Garatti, Lucrezia Manieri, Alessandro Falsone, Algo Carè, Marco C. Campi, Maria Prandini

Abstract

The scenario approach provides a powerful data-driven framework for designing solutions under uncertainty with rigorous probabilistic robustness guarantees. Existing theory, however, primarily addresses assessing robustness with respect to a single appropriateness criterion for the solution based on a dataset, whereas many practical applications - including multi-agent decision problems - require the simultaneous consideration of multiple criteria and the assessment of their robustness based on multiple datasets, one per criterion. This paper develops a general scenario theory for multi-criteria data-driven decision making. A central innovation lies in the collective treatment of the risks associated with violations of individual criteria, which yields substantially more accurate robustness certificates than those derived from a naive application of standard results. In turn, this approach enables a sharper quantification of the robustness level with which all criteria are simultaneously satisfied. The proposed framework applies broadly to multi-criteria data-driven decision problems, providing a principled, scalable, and theoretically grounded methodology for design under uncertainty.

2604.00544 2026-04-02 stat.ME

Estimating causal effects of continuous-time dynamic treatments with unmeasured confounders

Haiyan Zhu, Yingchun Zhou

Abstract

Modern medical research demands specialized causal inference methods evaluating complex continuous-time dynamic treatment regimens using observational data. For instance, obtaining the causal effects of intravenous administration, a continuous process involving dynamic adjustments of the treatment dose, can guide clinicians on drug use. However, the existing causal inference frameworks in longitudinal studies typically assume that time advances in discrete time steps. Therefore, this paper proposes a new methodology to estimate the causal effects of continuous-time dynamic treatments in the presence of unmeasured confounding. Unmeasured confounding is incorporated into estimating continuous-time Marginal Structural Models from a Bayesian perspective. Simulation demonstrates that compared to existing methods, the proposed approach can provide approximately unbiased estimates for target causal parameters across three degrees of confounding. The proposed method is applied to analyze the causal relationship between the intravenous oxytocin administration process and postpartum hemorrhage, leading to meaningful results that may guide clinicians in using oxytocin.

2604.00481 2026-04-02 stat.ME stat.ML

Tucker Diffusion Model for High-dimensional Tensor Generation

Jianhua Guo, Xinbing Kong, Zeyu Li, Junfan Mao

Abstract

Statistical inference on large-dimensional tensor data has been extensively studied in the literature and widely used in economics, biology, machine learning, and other fields, but how to generate a structured tensor with a target distribution is still a new problem. As profound AI generators, diffusion models have achieved remarkable success in learning complex distributions. However, their extension to generating multi-linear tensor-valued observations remains underexplored. In this work, we propose a novel Tucker diffusion model for learning high-dimensional tensor distributions. We show that the score function admits a structured decomposition under the low Tucker rank assumption, allowing it to be both accurately approximated and efficiently estimated using a carefully tailored tensor-shaped architecture named Tucker-Unet. Furthermore, the distribution of generated tensors, induced by the estimated score function, converges to the true data distribution at a rate depending on the maximum of tensor mode dimensions, thereby offering a clear theoretical advantage over the naive vectorized approach, which has a product dependence. Empirically, compared to existing approaches, the Tucker diffusion model demonstrates strong practical potential in synthetic and real-world tensor generation tasks, achieving comparable and sometimes even superior statistical performance with significantly reduced training and sampling costs.

2604.00432 2026-04-02 stat.ML cs.LG math.PR

Denoising distances beyond the volumetric barrier

Han Huang, Pakawut Jiradilok, Elchanan Mossel

Abstract

We study the problem of reconstructing the latent geometry of a $d$-dimensional Riemannian manifold from a random geometric graph. While recent works have made significant progress in manifold recovery from random geometric graphs, and more generally from noisy distances, the precision of pairwise distance estimation has been fundamentally constrained by the volumetric barrier, namely the natural sample-spacing scale $n^{-1/d}$ coming from the fact that a generic point of the manifold typically lies at a distance of order $n^{-1/d}$ from the nearest sampled point. In this paper, we introduce a novel approach, Orthogonal Ring Distance Estimation Routine (ORDER), which achieves a pointwise distance estimation precision of order $n^{-2/(d+5)}$ up to polylogarithmic factors in $n$ in polynomial time. This strictly beats the volumetric barrier for dimensions $d > 5$. As a consequence of obtaining pointwise precision better than $n^{-1/d}$, we prove that the Gromov--Wasserstein distance between the reconstructed metric measure space and the true latent manifold is of order $n^{-1/d}$. This matches the Wasserstein convergence rate of empirical measures, demonstrating that our reconstructed graph metric is asymptotically as good as having access to the full pairwise distance matrix of the sampled points. Our results are proven in a very general setting which includes general models of noisy pairwise distances, sparse random geometric graphs, and unknown connection probability functions.

2604.00426 2026-04-02 stat.ME

Testing for lack of fit in paired comparison data

Rahul Singh, Ori Davidov

Comments 32 pages, 9 Tables

Abstract

Linear stochastic transitivity is a central assumption in paired comparison models that is rarely verified in practice. Empirical violations, however, are common and can substantially affect inference and ranking. We develop a class of tests for detecting lack of fit in cardinal paired comparison models, where lack of fit is characterized by the presence of cyclical preferences among subsets of items. We propose a suite of tests adapted to different regimes governing the growth of the comparison graph. For a fixed number of items, the proposed procedures exhibit substantially improved power relative to the classical Kendall--Smith test and its cardinal analogue. We further extend the framework to high-dimensional, sparse comparison graphs near the connectivity threshold in random graph models. The theoretical analysis characterizes the behavior of the tests under both the null and alternative, with particular emphasis on limits of detectability and consistency. Simulation studies corroborate the theoretical findings, and applications to real data uncover substantial and previously unrecognized intransitivity and structural lack of fit.

2604.00424 2026-04-02 stat.OT

Distributional regression models for meta-analysis

Yefeng Yang, Shinichi Nakagawa

Comments 31 pages

Abstract

Meta-analyses are regarded as the highest level in the hierarchy of evidence, yet standard models have traditionally concentrated on estimating the mean effect size, often under restrictive assumptions about the underlying distribution, such as homogeneous variance and symmetric shapes. We introduce a distributional regression framework for meta-analysis that generalizes these conventional models by allowing all parameters of the effect size distribution, such as location, scale, and shape, to be modelled as functions of explanatory variables. This unified framework accommodates a wide range of existing models, including random-effects, multilevel, multivariate, location-scale, and outlier-robust meta-analyses, as special cases. We provide an illustrative example, using 67,393 meta-analyses from the Cochrane Database of Systematic Reviews, employing location-scale models to investigate whether smaller studies tend to report larger effect sizes (i.e., small-study effects) and exhibit greater heterogeneity. We discuss implementation strategies using existing software, considerations for model selection and pre-registration, and the need for further methodological development. By moving beyond the mean effect size, distributional regression enables researchers to explore systematic variation in distributional structure, facilitating the joint test of new hypotheses corresponding to multiple distributional parameters.

2604.00410 2026-04-02 stat.ME

Differentially Private One-Shot Federated Inference for Linear Mixed Models via Lossless Likelihood Reconstruction

Keisuke Hanada, Toshio Shimokawa, Kazushi Maruo

Comments 34 pages, 6 Figures, 4 Tables

Abstract

One-shot federated learning enables multi-site inference with minimal communication. However, sharing summary statistics can still leak sensitive individual-level information when sites have only a small number of patients. In particular, shared cross-product summaries can reveal patient-level covariate patterns under discrete covariates. Motivated by this concern, this study proposes a differentially private one-shot federated inference framework for linear mixed models with a random-intercept working covariance. The method reconstructs the pooled likelihood from site-level summary statistics and applies a Gaussian mechanism to perturb these summaries, ensuring site-level differential privacy. Cluster-robust variance estimators are developed that are computed directly from the privatized summaries. Robust variance provides valid uncertainty quantification even under covariance mis-specification. Under a multi-site asymptotic regime, the consistency and asymptotic normality of the proposed estimator are established and the leading-order statistical cost of privacy is characterized. Simulation studies show that moderate privacy noise substantially reduces reconstruction risk while maintaining competitive estimation accuracy as the number of sites increases. However, very strong privacy settings can lead to unstable standard errors when the number of sites is limited. An application using multi-site COVID-19 testing data demonstrates that meaningful privacy protection can be achieved with a modest loss of efficiency.

2604.00390 2026-04-02 stat.ME

Causal Inference for Unobservable Multivariate Outcomes, with Applications to Brain Effective Connectivity

Haiyue Song, Ani Eloyan, Youjin Lee

Abstract

Evaluating the causal effect of an intervention on multivariate outcomes is challenging when the outcomes are interdependent and derived rather than directly observed. Effective connectivity, which summarizes the directional neural communication between brain regions, is one such derived relational outcome. Estimating how external interventions affect effective connectivity introduces two layers of causal inference problems: identifying directional relationships among brain regions from high-dimensional neuroimaging time series and estimating the causal effect of the intervention on these derived relationships. Each layer introduces distinct biases. The first arises from within-outcome dependencies unrelated to the intervention; to address this, we propose a sample-splitting method for estimating meaningful, and potentially causally informative, effective connectivity measures. The second arises from confounding between the intervention and the derived outcomes; to address this, we apply inverse probability weighting methods and incorporate multiple testing when causal effects on multiple components of the outcomes are of interest. We demonstrate, through theoretical results and simulations, that the proposed methods are asymptotically valid under certain conditions with effective type-I and familywise error control. Finally, we apply the proposed methods to examine the causal effect of amyloid on effective connectivity using the resting-state fMRI data from the Alzheimer's Disease Neuroimaging Initiative database.

2604.00353 2026-04-02 stat.AP

Spatiotemporal Characterization of Overdose Mortality in Georgia, USA Using Spectral and Nonlinear Interaction Analysis, 2003-2021

Dhrubajyoti Ghosh

Abstract

Drug overdose mortality in the United States exhibits strong geographic heterogeneity and complex temporal evolution, yet most spatiotemporal studies focus on trends and risks without explicitly characterizing the underlying dynamical structure of overdose trajectories. We develop a nonlinear spectral-spatiotemporal framework to analyze county-level overdose mortality in the state of Georgia from 2003 to 2021. Annual mortality rates are decomposed into low- and high-frequency components to distinguish long-term epidemic pressure from short-term variability, and nonlinear cross-frequency interaction is quantified using bispectral intensity. Counties are grouped into spectral phenotypes using unsupervised clustering, and single-breakpoint change-point models are used to identify regime shifts and quantify post-break acceleration across phenotypes. We find that overdose dynamics across Georgia are dominated by persistent low-frequency growth with limited independent short-term volatility. Nonlinear amplification is spatially concentrated and co-occurs with strong long-term epidemic pressure. Despite synchronous statewide breakpoints around 2014, post-break growth accelerates most sharply in counties exhibiting high low-frequency power and elevated nonlinear interaction. Together, these results provide a mechanistically interpretable framework for identifying dynamical risk phenotypes and structural transitions in spatial overdose epidemics.

2604.00346 2026-04-02 q-fin.ST q-fin.TR stat.AP

Forecasting duration in high-frequency financial data using a self-exciting flexible residual point process

Kyungsub Lee

Abstract

This paper presents a method for forecasting limit order book durations using a self-exciting flexible residual point process. High-frequency events in modern exchanges exhibit heavy-tailed interarrival times, posing a significant challenge for accurate prediction. The proposed approach incorporates the empirical distributional features of interarrival times while preserving the self-exciting and decay structure. This work also examines the stochastic stability of the process, which can be interpreted as a general state-space Markov chain. Under suitable conditions, the process is irreducible, aperiodic, positive Harris recurrent, and has a stationary distribution. An empirical study demonstrates that the model achieves strong predictive performance compared with several alternative approaches when forecasting durations in ultra-high-frequency trading data.

2604.00344 2026-04-02 cs.CL stat.AP

Agent Q-Mix: Selecting the Right Action for LLM Multi-Agent Systems through Reinforcement Learning

Eric Hanchen Jiang, Levina Li, Rui Sun, Xiao Liang, Yubei Li, Yuchen Wu, Haozheng Luo, Hengli Li, Zhi Zhang, Zhaolu Kang, Kai-Wei Chang, Ying Nian Wu

Abstract

Large Language Models (LLMs) have shown remarkable performance in completing various tasks. However, solving complex problems often requires the coordination of multiple agents, raising a fundamental question: how to effectively select and interconnect these agents. In this paper, we propose Agent Q-Mix, a reinforcement learning framework that reformulates topology selection as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. Our method learns decentralized communication decisions using QMIX value factorization, where each agent selects from a set of communication actions that jointly induce a round-wise communication graph. At its core, Agent Q-Mix combines a topology-aware GNN encoder, GRU memory, and per-agent Q-heads under a Centralized Training with Decentralized Execution (CTDE) paradigm. The framework optimizes a reward function that balances task accuracy with token cost. Across seven core benchmarks in coding, reasoning, and mathematics, Agent Q-Mix achieves the highest average accuracy compared to existing methods while demonstrating superior token efficiency and robustness against agent failure. Notably, on the challenging Humanity's Last Exam (HLE) using Gemini-3.1-Flash-Lite as a backbone, Agent Q-Mix achieves 20.8% accuracy, outperforming Microsoft Agent Framework (19.2%) and LangGraph (19.2%), followed by AutoGen and Lobster by OpenClaw. These results underscore the effectiveness of learned, decentralized topology optimization in pushing the boundaries of multi-agent reasoning.

2604.00337 2026-04-02 math.ST stat.TH

E-Values, Bayes Risk, Dual Role of Markov's Inequality

Nicholas G. Polson, Daniel Zantedeschi

Abstract

Two approaches to hypothesis testing, e-value testing and Bayes risk minimisation, both invoke Markov's inequality to control error probabilities. They differ in which distribution certifies the unit-moment condition: the null for Type I error, the alternative for Type II error. The likelihood ratio is not intrinsically an e-value; it acquires that status only relative to the experiment under which its expectation is certified. This note makes the resulting role-reversal symmetry explicit, traces its asymptotic sharpening through the information-theoretic arguments of Barron and Clarke (1994), and situates the duality within the typed evidence calculus of Polson, Sokolov, and Zantedeschi (2026).
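
As a hedged illustration of the Markov-inequality step this abstract refers to, the two bounds can be written out as below. The notation ($E$ for a nonnegative statistic, $P_0$ and $P_1$ for null and alternative, $L = dP_1/dP_0$ for the likelihood ratio, thresholds $\alpha$ and $c$) is assumed here for the sketch and is not taken from the paper.

```latex
% Type I side: the unit-moment condition is certified under the null P_0.
\[
E \ge 0,\quad \mathbb{E}_{P_0}[E] \le 1
\;\Longrightarrow\;
P_0\!\left(E \ge \tfrac{1}{\alpha}\right)
\le \alpha\,\mathbb{E}_{P_0}[E] \le \alpha,
\qquad \alpha \in (0,1].
\]
% Type II side: the reciprocal likelihood ratio has moment at most one under P_1.
\[
\mathbb{E}_{P_1}\!\left[L^{-1}\right] = P_0(L > 0) \le 1
\;\Longrightarrow\;
P_1(L \le c) = P_1\!\left(L^{-1} \ge \tfrac{1}{c}\right) \le c,
\qquad c > 0.
\]
```

Read this way, the same inequality controls either error; what changes is the distribution under which the moment bound holds, which is one reading of the role reversal described above.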

2603.28595 2026-04-02 cs.LG stat.ML

Optimistic Actor-Critic with Parametric Policies for Linear Markov Decision Processes

Max Qiushi Lin, Reza Asad, Kevin Tan, Haque Ishfaq, Csaba Szepesvari, Sharan Vaswani

Comments 61 pages, 9 figures

Abstract

Although actor-critic methods have been successful in practice, their theoretical analyses have several limitations. Specifically, existing theoretical work either sidesteps the exploration problem by making strong assumptions or analyzes impractical methods with complicated algorithmic modifications. Moreover, the actor-critic methods analyzed for linear MDPs often employ natural policy gradient and construct "implicit" policies without explicit parameterization. Such policies are computationally expensive to sample from, making the environment interactions inefficient. To that end, we focus on the finite-horizon linear MDPs and propose an optimistic actor-critic framework that uses parametric log-linear policies. In particular, we introduce a tractable $\textit{logit-matching}$ regression objective for the actor. For the critic, we use approximate Thompson sampling via Langevin Monte Carlo to obtain optimistic value estimates. We prove that the resulting algorithm achieves $\widetilde{\mathcal{O}}(ε^{-4})$ and $\widetilde{\mathcal{O}}(ε^{-2})$ sample complexity in the on-policy and off-policy setting, respectively. Our results match prior theoretical work in achieving the state-of-the-art sample complexity, while our algorithm is more aligned with practice.

2603.00661 2026-04-02 math.ST stat.TH

Martingale Posterior Predictive Coherence: Hausdorff Moment Hierarchy

Nicholas G. Polson, Daniel Zantedeschi

Comments Fixed typos

Abstract

For an exchangeable Bernoulli sequence with de Finetti mixing measure Pi, the k-step predictive probability P(X_{n+1}=...=X_{n+k}=0 | F_n) equals the posterior expectation E[(1-theta)^k | F_n]. By binomial expansion, this depends on all posterior moments up to order k. We show that the first moment alone is not sufficient to uniquely identify these quantities: for k >= 2, the mapping from posterior mean to k-step predictive is set-valued. The martingale posterior framework of Fong, Holmes, and Walker (which constrains only the first conditional moment of the terminal value) does not, in general, uniquely identify multi-step predictive distributions. Under any strictly proper scoring rule, the plug-in predictive is strictly dominated by the Bayes predictive whenever the posterior is non-degenerate. A closure theorem establishes that a martingale posterior determines all k-step predictives if and only if the conditional law of the terminal value is uniquely specified. Hill's A_{(n)} rule under the Jeffreys Beta(1/2,1/2) prior is a positive example. The discrepancy is O(Var(theta | F_n)) and vanishes as the posterior concentrates. These results clarify the structural requirements for predictive completeness under exchangeability.
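
The claim that the posterior mean alone does not pin down the k-step predictive can be checked numerically. The sketch below assumes Beta posteriors purely for convenience (the abstract is stated for a general mixing measure), and the function names are hypothetical. It uses the standard Beta moment formula $E[(1-\theta)^k] = \prod_{j=0}^{k-1} (b+j)/(a+b+j)$ for $\theta \sim \mathrm{Beta}(a, b)$.

```python
from fractions import Fraction


def k_step_all_zero(a, b, k):
    """Exact E[(1 - theta)^k] for theta ~ Beta(a, b) with integer a, b."""
    prob = Fraction(1)
    for j in range(k):
        prob *= Fraction(b + j, a + b + j)
    return prob


def plug_in_all_zero(a, b, k):
    """Plug-in predictive: (1 - posterior mean)^k, ignoring higher moments."""
    return Fraction(b, a + b) ** k


# Two posteriors with the same mean 1/2 but different spread.
diffuse = k_step_all_zero(1, 1, 2)         # Beta(1, 1)
concentrated = k_step_all_zero(10, 10, 2)  # Beta(10, 10)
plug_in = plug_in_all_zero(1, 1, 2)        # identical for both posteriors

print(diffuse, concentrated, plug_in)
```

For k = 2 the gap between the Bayes predictive and the plug-in value equals the posterior variance exactly, which is consistent with the O(Var(theta | F_n)) discrepancy stated in the abstract.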

2602.10905 2026-04-02 cs.LG math.OC stat.ML

Natural Hypergradient Descent: Algorithm Design, Convergence Analysis, and Parallel Implementation

Deyi Kong, Zaiwei Chen, Shuzhong Zhang, Shancong Mou

Abstract

In this work, we propose Natural Hypergradient Descent (NHGD), a new method for solving bilevel optimization problems. To address the computational bottleneck in hypergradient estimation--namely, the need to compute or approximate the Hessian inverse--we exploit the statistical structure of the inner optimization problem and use the empirical Fisher information matrix as an asymptotically consistent surrogate for the Hessian. This design enables a parallel optimize-and-approximate framework in which the Hessian-inverse approximation is updated synchronously with the stochastic inner optimization, reusing gradient information at negligible additional cost. Our main theoretical contribution establishes high-probability error bounds and sample complexity guarantees for NHGD that match those of state-of-the-art optimize-then-approximate methods, while significantly reducing computational time overhead. Empirical evaluations on representative bilevel learning tasks further demonstrate the practical advantages of NHGD, highlighting its scalability and effectiveness in large-scale machine learning settings.

2602.06322 2026-04-02 stat.AP stat.ME

Dynamical Survival Analysis for Modeling Hazard Functions with Nonlinear Systems

Dananjani Liyanage, Mahmudul Bari Hridoy, Fahad Mostafa

Abstract

Hazard functions play a central role in survival analysis, providing insight into the underlying risk dynamics of time-to-event data, with broad applications in medicine, epidemiology, and related fields. First-order ordinary differential equation (ODE) formulations of the hazard function have been explored as extensions beyond classical parametric models. However, such approaches typically produce monotonic hazard patterns, limiting their ability to represent oscillatory behavior, nonlinear damping, or coupled growth-decay dynamics. We propose a general statistical framework for modeling and simulating hazard functions governed by higher-order ODEs, allowing the hazard to depend on its current level, its rate of change, and time. This formulation accommodates complex temporal risk behaviors arising in a range of applications. Building on this framework, we develop a class of nonlinear and oscillatory hazard models, each associated with an interpretable dynamical mechanism and an induced survival distribution. We also present a simulation procedure for solving a system of non-linear higher-order ODEs, with failure times generated via cumulative hazard inversion. Likelihood-based Bayesian inference under right censoring is also developed, and moment generating function analysis is used to characterize tail behavior. The proposed framework is evaluated through simulation studies and illustrated using real data, demonstrating its ability to capture temporal risk patterns not well represented by standard monotone models. In contrast to existing linear ODE-based hazard models, the proposed approach accommodates nonlinear and non-equilibrium dynamics, enabling the representation of temporal risk patterns that are not well captured by first-order or linear oscillator-based formulations.
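
Cumulative hazard inversion, mentioned above as the way failure times are generated, can be sketched generically: draw E ~ Exp(1) and solve H(T) = E, where H is the cumulative hazard. The code below is a minimal illustration under stated assumptions, not the authors' procedure. The hazard is passed in as a plain function rather than obtained from an ODE solver, H is accumulated with the trapezoid rule on a fixed grid, and the names `sample_failure_time` and `oscillating_hazard` are hypothetical.

```python
import math
import random


def sample_failure_time(hazard, t_max, n_steps=10_000, rng=random, target=None):
    """Solve H(T) = target for T, where H is the cumulative hazard on [0, t_max].

    If target is None it is drawn from Exp(1), which makes T a draw from the
    survival distribution exp(-H(t)). Returns None when H(t_max) < target,
    i.e. the draw is right-censored at t_max.
    """
    if target is None:
        target = rng.expovariate(1.0)
    dt = t_max / n_steps
    cum = 0.0
    h_prev = hazard(0.0)
    for i in range(1, n_steps + 1):
        t = i * dt
        h_next = hazard(t)
        step = 0.5 * (h_prev + h_next) * dt  # trapezoid increment of H
        if step > 0 and cum + step >= target:
            # Linear interpolation inside the grid cell that crosses the target.
            return t - dt + dt * (target - cum) / step
        cum += step
        h_prev = h_next
    return None


def oscillating_hazard(t):
    """Hypothetical non-monotone hazard, chosen only to stay positive."""
    return 1.0 + 0.5 * math.sin(t)


draw = sample_failure_time(oscillating_hazard, t_max=50.0)
```

With a constant hazard the inversion reduces to the exponential quantile, which gives a quick sanity check on the grid-based solver.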

2601.20589 2026-04-02 stat.ME cs.LG

Exact Graph Learning via Integer Programming

Lucas Kook, Søren Wengel Mogensen

Abstract

Learning the dependence structure among variables in complex systems is a central problem across medical, natural, and social sciences. These structures can be naturally represented by graphs, and the task of inferring such graphs from data is known as graph learning or causal discovery. Existing approaches typically rely on restrictive assumptions about the data-generating process, employ greedy oracle algorithms, or solve approximate formulations of the graph learning problem. Therefore, they are either sensitive to violations of central assumptions or fail to guarantee globally optimal solutions. We address these limitations by introducing a nonparametric graph learning framework based on conditional independence testing and integer programming. We reformulate the graph learning problem as a mixed-integer program and prove that solving this integer-programming problem provides a globally optimal solution to the original graph learning problem. Our method leverages efficient encodings of graphical separation criteria, enabling the exact recovery of larger graphs than was previously feasible. We provide an open-source R package 'glip' which supports learning (acyclic) directed (mixed) graphs and chain graphs. We demonstrate that our approach is often faster than existing exact graph learning procedures and achieves state-of-the-art performance on simulated and benchmark data across all aforementioned classes of graphs.

2601.07993 2026-04-02 math.ST stat.TH

The exact region determined by Spearman's footrule, Gini's gamma and Kendall's tau

Damjana Kokol Bukovšek, Petra Lazić, Blaž Mojškerc, Nik Stopar

Comments 24 pages, 5 figures

Journal ref Journal of Computational and Applied Mathematics, vol. 486 (2026)
Abstract

Concordance measures are used to express the degree of association between random variables. Practitioners may use several distinct concordance measures to narrow the space of possible dependence structures. Consequently, the relations between different (weak) concordance measures have been extensively studied in recent years. The goal of this paper is to study the relation between Kendall's tau, Gini's gamma and Spearman's footrule. In particular, we describe the exact region determined by these three measures, using shuffles of $M$ and ordinal sums of copulas. We also provide the formulas for five main (weak) concordance measures and Chatterjee's xi of ordinal sums of copulas.

2512.00580 2026-04-02 cs.LG stat.ML

Non-Asymptotic Convergence of Discrete Diffusion Models: Masked and Random Walk dynamics

Giovanni Conforti, Alain Durmus, Le-Tuyet-Nhi Pham, Gael Raoul

Abstract

Diffusion models for continuous state spaces based on Gaussian noising processes are now relatively well understood from both practical and theoretical perspectives. In contrast, results for diffusion models on discrete state spaces remain far less explored and pose significant challenges, particularly due to their combinatorial structure and their more recent introduction in generative modelling. In this work, we establish new and sharp convergence guarantees for three popular discrete diffusion models (DDMs). Two of these models are designed for finite state spaces and are based respectively on the random walk and the masking process. The third DDM we consider is defined on the countably infinite space $\mathbb{N}^d$ and uses a drifted random walk as its forward process. For each of these models, the backward process can be characterized by a discrete score function that can, in principle, be estimated. However, even with perfect access to these scores, simulating the exact backward process is infeasible, and one must rely on time discretization. In this work, we study Euler-type approximations and establish convergence bounds in both Kullback-Leibler divergence and total variation distance for the resulting models, under minimal assumptions on the data distribution. To the best of our knowledge, this study provides the optimal non-asymptotic convergence guarantees for these noising processes that do not rely on boundedness assumptions on the estimated score. In particular, the computational complexity of each method scales only linearly in the dimension, up to logarithmic factors.

2511.12749 2026-04-02 stat.ML cs.LG

Taxonomy-Conditioned Hierarchical Bayesian TSB Models for Heterogeneous Intermittent Demand Forecasting

Zong-Han Bai, Po-Yen Chu

Comments Preprint. 13 pages, 4 figures. Equal contribution by the two authors

Abstract

Intermittent demand forecasting poses unique challenges due to sparse observations, cold-start items, and obsolescence. Classical models such as Croston, SBA, and the Teunter--Syntetos--Babai (TSB) method provide simple heuristics but lack a principled generative foundation. We introduce TSB-HB, a hierarchical Bayesian extension of TSB. Demand occurrence is modeled with a Beta--Binomial distribution, while nonzero demand sizes follow a Log-Normal distribution. Crucially, hierarchical priors enable partial pooling across items, stabilizing estimates for sparse or cold-start series while preserving heterogeneity. This framework provides a coherent generative reinterpretation of the classical TSB structure. On the UCI Online Retail dataset, TSB-HB achieves the lowest RMSE and RMSSE among all baselines, while remaining competitive in MAE. On a 5,000-series M5 sample, it improves MAE and RMSE over classical intermittent baselines. Under the calibrated probabilistic configuration, TSB-HB yields competitive pinball loss and a favorable sharpness--calibration tradeoff among the parametric baselines reported in the main text.
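
For context, the classical TSB heuristic that TSB-HB extends can be sketched in a few lines. This is the standard point-forecast recursion, not the hierarchical Bayesian model from the abstract: the demand probability is smoothed every period, the demand size is smoothed only in periods with nonzero demand, and the forecast is their product. The function name, parameter names, and default values are illustrative assumptions.

```python
def tsb_forecast(demand, size_rate=0.1, prob_rate=0.1, p0=0.5, z0=1.0):
    """Classical TSB recursion for intermittent demand.

    Returns the one-step-ahead forecasts made before each observation,
    plus the forecast for the period after the last observation.
    """
    p, z = p0, z0  # demand probability and nonzero demand size
    forecasts = []
    for d in demand:
        forecasts.append(p * z)
        occurred = 1.0 if d > 0 else 0.0
        p += prob_rate * (occurred - p)  # updated every period
        if d > 0:
            z += size_rate * (d - z)     # updated only when demand occurs
    return forecasts, p * z


history = [0, 0, 3, 0, 0, 0, 5, 0]
in_sample, next_period = tsb_forecast(history)
```

Because the probability decays during runs of zeros, the forecast drifts toward zero for obsolete items, which is the behaviour that distinguishes TSB from Croston-type updates.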

2511.02272 2026-04-02 cs.LG cs.DS stat.ML

Beyond Spectral Clustering: Probabilistic Cuts for Differentiable Graph Partitioning

Ayoub Ghriss

Comments AISTATS 2026, https://openreview.net/forum?id=FN6QAT5Tmc

Abstract

Probabilistic relaxations of graph cuts offer a differentiable alternative to spectral clustering, enabling end-to-end and online learning without eigendecompositions, yet prior work centered on RatioCut and lacked general guarantees and principled gradients. We present a unified probabilistic framework that covers a wide class of cuts, including Normalized Cut. Our framework provides tight analytic upper bounds on expected discrete cuts via integral representations and Gauss hypergeometric functions with closed-form forward and backward. Together, these results deliver a rigorous, numerically stable foundation for scalable, differentiable graph partitioning covering a wide range of clustering and contrastive learning objectives.

2510.25770 2026-04-02 stat.ML cs.AI cs.LG

E-Scores for (In)Correctness Assessment of Generative Model Outputs

Guneet S. Dhillon, Javier González, Teodora Pandeva, Alicia Curth

Comments International Conference on Artificial Intelligence and Statistics (AISTATS), 2026

Abstract

While generative models, especially large language models (LLMs), are ubiquitous in today's world, principled mechanisms to assess their (in)correctness are limited. Using the conformal prediction framework, previous works construct sets of LLM responses where the probability of including an incorrect response, or error, is capped at a user-defined tolerance level. However, since these methods are based on p-values, they are susceptible to p-hacking, i.e., choosing the tolerance level post-hoc can invalidate the guarantees. We therefore leverage e-values to complement generative model outputs with e-scores as measures of incorrectness. In addition to achieving the guarantees as before, e-scores further provide users with the flexibility of choosing data-dependent tolerance levels while upper bounding size distortion, a post-hoc notion of error. We experimentally demonstrate their efficacy in assessing LLM outputs under different forms of correctness: mathematical factuality and property constraints satisfaction.

2510.25514 2026-04-02 stat.ML cs.LG

Convergence of off-policy TD(0) with linear function approximation for reversible Markov chains

Maik Overmars, Jasper Goseling, Richard Boucherie

Journal ref SIGMETRICS Perform. Eval. Rev. 53 (2026) 91-96
Abstract

We study the convergence of off-policy TD(0) with linear function approximation when used to approximate the expected discounted reward in a Markov chain. It is well known that the combination of off-policy learning and function approximation can lead to divergence of the algorithm. Existing results for this setting modify the algorithm, for instance by reweighing the updates using importance sampling. This establishes convergence at the expense of additional complexity. In contrast, our approach is to analyse the standard algorithm, but to restrict our attention to the class of reversible Markov chains. We demonstrate convergence under this mild reversibility condition on the structure of the chain, which in many applications can be assumed using domain knowledge. In particular, we establish a convergence guarantee under an upper bound on the discount factor in terms of the difference between the on-policy and off-policy process. This improves upon known results in the literature that state that convergence holds for a sufficiently small discount factor by establishing an explicit bound. Convergence is with probability one and achieves projected Bellman error equal to zero. To obtain these results, we adapt the stochastic approximation framework that was used by Tsitsiklis and Van Roy [1997] for the on-policy case, to the off-policy case. We illustrate our results using different types of reversible Markov chains, such as one-dimensional random walks and random walks on a weighted graph.
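
The standard, unmodified TD(0) update with linear function approximation that the abstract analyses can be sketched as follows. The function name and argument layout are assumptions made for illustration. In the off-policy setting the transitions fed to this update would come from a chain other than the one being evaluated, with no importance-sampling correction applied.

```python
def td0_update(theta, phi_s, phi_next, reward, gamma, step_size):
    """One TD(0) step for the linear value estimate V(s) = theta . phi(s)."""
    v_s = sum(t * f for t, f in zip(theta, phi_s))
    v_next = sum(t * f for t, f in zip(theta, phi_next))
    td_error = reward + gamma * v_next - v_s
    # Move theta along the feature vector of the current state.
    return [t + step_size * td_error * f for t, f in zip(theta, phi_s)]


# Two states with one-hot (tabular) features, used purely as a toy example.
theta = [0.0, 0.0]
theta = td0_update(theta, [1.0, 0.0], [0.0, 1.0], reward=1.0, gamma=0.5, step_size=0.5)
theta = td0_update(theta, [0.0, 1.0], [1.0, 0.0], reward=0.0, gamma=0.5, step_size=0.5)
```

The update itself is identical in the on-policy and off-policy cases; only the distribution of the sampled transitions differs, which is why the convergence question depends on the structure of the chain.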

2510.22688 2026-04-02 stat.ME math.ST stat.TH

Stopping Rules for Monte Carlo Methods: A Review

Jiezhong Wu, Reiichiro Kawai

Comments 36 pages, 2 figures, 8 tables

Abstract

Sequential analysis encompasses simulation theories and methods where the sample size is determined dynamically based on accumulating data. Since the conceptual inception, numerous sequential stopping rules have been introduced, and many more are currently being refined and developed. This article aims to deliver a comprehensive and up-to-date review of recent developments on sequential stopping rules, intentionally emphasizing standard iid Monte Carlo methods and lightly generalized ones, employed primarily for estimating an unknown expectation, including binomial proportions. These methodologies have long served, and likely will continue to serve, as fundamental bases for both theoretical and practical developments in stopping rules for general statistical inference, advanced Monte Carlo techniques and their modern applications. Building upon over a hundred references and empirical studies, we explore the essential aspects of these methods, such as core assumptions, numerical algorithms, convergence properties, and practical trade-offs to guide further developments, particularly at the intersection of sequential stopping rules and related areas of research.

2510.15669 2026-04-02 stat.ML cs.LG

Disentanglement of Sources in a Multi-Stream Variational Autoencoder

Veranika Boukun, Jörg Lücke

Comments 14 pages, 4 figures; expanded literature review, added Algorithm 1, and included new benchmarking results on fixed number of overlapping MNIST sources

Details
English abstract

Variational autoencoders (VAEs) are among the leading approaches to address the problem of learning disentangled representations. Typically a single VAE is used and disentangled representations are sought within its single continuous latent space. In this paper, we propose and provide a proof of concept for a novel Multi-Stream Variational Autoencoder (MS-VAE) that achieves disentanglement of sources by combining discrete and continuous latents. The discrete latents are used in an explicit source combination model that superimposes a set of sources as part of the MS-VAE decoder. We formally define the MS-VAE approach, derive its inference and learning equations, and numerically investigate its principled functionality. The MS-VAE model is very flexible and can be trained using little supervision (we use fully unsupervised learning after pretraining with some labels). In our numerical experiments, we explored the ability of the MS-VAE approach to separate both superimposed handwritten digits and sound sources. For the former task we used superimposed MNIST digits (an increasingly common benchmark). For sound separation, our experiments focused on the task of speaker diarization in a recorded conversation between two speakers. In all cases, we observe a clear separation of sources and competitive performance after training. For digit superpositions, performance is particularly competitive in complex mixtures (e.g., three and four digits). For the speaker diarization task, we observe an especially low rate of missed speakers and a more precise speaker attribution. Numerical experiments confirm the flexibility of the approach across varying amounts of supervision, and we observed high performance, e.g., when using just 10% of the labels for pretraining.

2510.09534 2026-04-02 stat.ML cs.LG

Conditional Flow Matching for Bayesian Posterior Inference

Percy S. Zhai, So Won Jeong, Veronika Ročková

Details
English abstract

We propose a generative multivariate posterior sampler via flow matching. It offers a simple training objective, and does not require access to likelihood evaluation. The method learns a dynamic, block-triangular velocity field in the joint space of data and parameters, which results in a deterministic transport map from a source distribution to the desired posterior. The inverse map, named vector rank, is accessible by reversibly integrating the velocity over time. It is advantageous to leverage the dynamic design: proper constraints on the velocity yield a monotone map, which leads to a conditional Brenier map, enabling a fast and simultaneous generation of Bayesian credible sets whose contours correspond to level sets of Monge-Kantorovich data depth. Our approach is computationally lighter than GAN-based and diffusion-based counterparts, and is capable of capturing complex posterior structures. Finally, a frequentist theoretical guarantee on the consistency of the recovered posterior distribution, and of the corresponding Bayesian credible sets, is provided.

2509.17180 2026-04-02 cs.LG econ.EM stat.ME

Regularizing Extrapolation in Causal Inference

David Arbour, Harsh Parikh, Bijan Niknam, Elizabeth Stuart, Kara Rudolph, Avi Feller

Details
English abstract

Many common estimators in machine learning and causal inference are linear smoothers, where the prediction is a weighted average of the training outcomes. Some estimators, such as ordinary least squares and kernel ridge regression, allow for arbitrarily negative weights, which improve feature imbalance but often at the cost of increased dependence on parametric modeling assumptions and higher variance. By contrast, estimators like importance weighting and random forests (sometimes implicitly) restrict weights to be non-negative, reducing dependence on parametric modeling and variance at the cost of worse imbalance. In this paper, we propose a unified framework that directly penalizes the level of extrapolation, replacing the current practice of a hard non-negativity constraint with a soft constraint and corresponding hyperparameter. We derive a worst-case extrapolation error bound and introduce a novel "bias-bias-variance" tradeoff, encompassing biases due to feature imbalance, model misspecification, and estimator variance; this tradeoff is especially pronounced in high dimensions, particularly when positivity is poor. We then develop an optimization procedure that regularizes this bound while minimizing imbalance and outline how to use this approach as a sensitivity analysis for dependence on parametric modeling assumptions. We demonstrate the effectiveness of our approach through synthetic experiments and a real-world application, involving the generalization of randomized controlled trial estimates to a target population of interest.
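
The linear-smoother view can be made concrete with a small sketch. For simple ordinary least squares with an intercept, the prediction at a point `x0` equals a weighted sum of the training outcomes with weights `1/n + (x0 - x_bar)(x_i - x_bar)/S_xx`, and those weights turn negative once `x0` lies far from the data. The data and function names below are made up for illustration, and the code shows only the implied weights, not the penalized estimator proposed in the paper.

```python
def ols_smoother_weights(x_train, x0):
    """Weights w_i such that the simple-OLS prediction at x0 is sum_i w_i * y_i."""
    n = len(x_train)
    x_bar = sum(x_train) / n
    sxx = sum((x - x_bar) ** 2 for x in x_train)
    return [1.0 / n + (x0 - x_bar) * (x - x_bar) / sxx for x in x_train]


def ols_predict(x_train, y_train, x0):
    """Ordinary least squares fit of y on (1, x), evaluated at x0."""
    n = len(x_train)
    x_bar = sum(x_train) / n
    y_bar = sum(y_train) / n
    sxx = sum((x - x_bar) ** 2 for x in x_train)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_train, y_train))
    return y_bar + (sxy / sxx) * (x0 - x_bar)


# Hypothetical toy data.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [1.0, 3.0, 2.0, 5.0, 4.0]

inside = ols_smoother_weights(x_train, x0=2.0)    # at the covariate mean
outside = ols_smoother_weights(x_train, x0=10.0)  # far outside the data

print("weights at x0=2: ", inside)
print("weights at x0=10:", outside)
```

At the covariate mean every training point receives the same positive weight, while at `x0=10` the points on the far side of the mean receive negative weight. That sign flip is the extrapolation that a hard non-negativity constraint rules out and that a soft penalty would instead trade off.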

2508.21236 2026-04-02 cs.SI cs.LG stat.AP

Population-Scale Network Embeddings Expose Educational Divides in Network Structure Related to Right-Wing Populist Voting

Malte Lüken, Javier Garcia-Bernardo, Sreeparna Deb, Flavio Hafner, Megha Khosla

Comments 29 pages, 6 figures, Supplementary Materials available at https://github.com/odissei-explainable-network/netaudit; update text introduction, results, and discussion

Details
English abstract

Administrative registry data can be used to construct population-scale networks whose ties reflect shared social contexts between persons. With machine learning, such networks can be encoded into numerical representations -- embeddings -- that automatically capture an individual's position within the network. We created embeddings for all persons in the Dutch population from a population-scale network that represents five shared contexts: neighborhood, work, family, household, and school. To assess the informativeness of these embeddings, we used them to predict right-wing populist voting. Embeddings alone predicted right-wing populist voting above chance-level but performed worse than individual characteristics. Combining the best subset of embeddings with individual characteristics only slightly improved predictions. After transforming the embeddings to make their dimensions more sparse and orthogonal, we found that one embedding dimension was strongly associated with the outcome. Mapping this dimension back to the population network revealed that differences in educational ties and attainment corresponded to distinct network structures associated with right-wing populist voting. Our study contributes methodologically by demonstrating how population-scale network embeddings can be made interpretable, and substantively by linking structural network differences in education to right-wing populist voting.

2508.06112 2026-04-02 stat.ME

Factor- and Composite-Based Structural Equation Modeling -- A New Approach to Incorporate Composites in the Traditional SEM Framework

Tamara Schamberger, Florian Schuberth, Jörg Henseler, Yves Rosseel

Details
English abstract

Structural equation modeling (SEM) is a prevalent approach for studying constructs. Traditionally, these constructs are modeled as reflectively measured latent variables - common factors that account for the variance-covariance structure of their associated indicators. Over the past two decades, there has been growing interest in an alternative way of modeling constructs: the composite, i.e., a linear combination of indicators. However, existing approaches to estimating composite models either limit researchers from fully leveraging SEM's capabilities, such as handling missing data, evaluating overall model fit, and testing group differences, or significantly increase complexity of the model specification by introducing additional variables. Against this background, this paper presents a new way of integrating both common factors and composites in the traditional SEM framework. Our presented model specification, along with its model-implied variance-covariance matrix, enables researchers to: (i) utilize well-established SEM estimators, including maximum likelihood and generalized least squares estimators, and (ii) leverage developments from the traditional SEM framework in terms of model specification, evaluation, and handling of missing data. This way of analyzing structural equation models involving common factors and composites is referred to as factor- and composite-based SEM (FC-SEM). This advancement aims to enhance the flexibility and applicability of SEM in analyzing constructs.

2507.16690 2026-04-02 stat.AP

Accommodating the Analysis Model in Multiple Imputation for the Weibull Mixture Cure Model: Performance under Penalized Likelihood

Changchang Xu, Laurent Briollais, Irene L Andrulis, Shelley B Bull

Comments 33 pages, 7 Figures and 1 table

Details
Journal ref
Statistics in Medicine, 45(6-7), e70437 (2026)
English abstract

Introduction: In analysis of time-to-event outcomes, a mixture cure (MC) model is preferred over a standard survival model when the sample includes individuals who will never experience the event of interest. Motivated by a cohort study of breast cancer patients with incomplete biomarkers, we develop multiple imputation (MI) methods assuming a Weibull proportional hazards (PH-MC) analysis model with multiple prognostic factors. However, for MI with fully conditional specification, an incorrectly specified imputation model can impair accuracy of point and interval estimates. Objectives and Methods: Our goal is to propose imputation models that are compatible with the Weibull PH-MC analysis models. We derive an exact conditional distribution (ECD) imputation model which involves the analysis model likelihood. Using simulation studies, we compare effect estimate bias and confidence interval (CI) coverage under alternative imputation models including the ECD model, an approximation that includes a cure indicator (cECD), and a comprehensive simple (CS) model. For robust parameter estimation in finite and/or sparse samples, we incorporate the Firth-type penalized likelihood (FT-PL) and combined likelihood profile (CLIP) methods into the MI. Results: Compared to complete case analysis, MI with penalization reduces estimation bias and improves coverage. Although ECD and cECD perform similarly at higher event rates, ECD generates smaller bias and higher coverage at lower rates. CS has larger bias and lower coverage than ECD and cECD, but CIs are narrower than for cECD. Conclusions: In analyses of biomarkers and composite subtypes for prognosis studies such as in breast cancer, use of compatible imputation models and penalization methods is recommended for MC modelling in samples with low event numbers and/or with covariate imbalance.

2507.14848 2026-04-02 stat.AP stat.ME

A Bayesian Approach to Estimating Effect Sizes in Educational Research

Yannis Bähni

Details
English abstract

In this paper, we demonstrate a purely Bayesian approach for estimating within-group and between-group effect sizes for learning outcomes encountered in educational research, taking naturally into account the multilevel structure of the data, as well as heterogeneous residual variances among time points and conditions. We provide a detailed implementation using the brms package in R serving as a wrapper for the probabilistic programming language Stan. We recommend that for a pooled design, one computes an effect size $d_s$ similar to Cohen's $d$, and for a paired design, one should compute two possibly different quantities $d_s$ and $d_z$ to correct for correlations in within-group designs and allow for comparability across different studies. All these effect sizes are based on ideas coming from Hedges' total effect size $δ_t$ introduced in 2007. Ultimately, these estimates allow us to study the differential effectiveness of educational interventions with respect to classes.
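
For orientation, the two quantities can be illustrated with their common classical plug-in forms: $d_s$ as a mean difference divided by the pooled standard deviation, and $d_z$ as the mean of paired differences divided by their standard deviation. The sketch below assumes those conventional definitions and uses made-up scores; it is not the multilevel Bayesian estimator from the paper, and the function names are hypothetical.

```python
import math
import statistics


def cohens_d_s(group_a, group_b):
    """Mean difference (b minus a) divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_b) - statistics.mean(group_a)) / math.sqrt(pooled_var)


def cohens_d_z(pre, post):
    """Mean of the paired differences divided by their standard deviation."""
    diffs = [b - a for a, b in zip(pre, post)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


# Hypothetical pre/post test scores for five learners.
pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [2.0, 4.0, 5.0, 4.0, 5.0]

d_s = cohens_d_s(pre, post)
d_z = cohens_d_z(pre, post)
print(f"d_s={d_s:.4f} d_z={d_z:.4f}")
```

The two numbers differ on the same data because $d_z$ depends on the correlation between the paired measurements, which is why a paired design can call for reporting both.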

2506.15436 2026-04-02 math.OC math.ST stat.TH

On the Effectiveness of Classical Regression Methods for Optimal Switching Problems

Martin Andersson, Benny Avelin, Marcus Olofsson

Details
English abstract

Simple regression methods provide robust, near-optimal solutions for optimal switching problems, including high-dimensional ones (up to 50). While the theory requires solving intractable PDE systems, the Longstaff-Schwartz algorithm with classical regression methods achieves excellent switching decisions without extensive hyperparameter tuning. Testing linear models (OLS, Ridge, LASSO), tree-based methods (random forests, gradient boosting), $k$-nearest neighbors, and feedforward neural networks on four benchmark problems, we find that several simple methods maintain stable performance across diverse problem characteristics, outperforming the neural networks we tested against. In our comparison, $k$-NN regression performs consistently well, and with minimal hyperparameter tuning. We establish concentration bounds for this regressor and show that PCA enables $k$-NN to scale to high dimensions.

2505.21770 2026-04-02 math.PR math.ST stat.ML stat.TH

Gradient-flow SDEs have unique transient population dynamics

Vincent Guan, Joseph Janssen, Nicolas Lanzetti, Antonio Terpin, Geoffrey Schiebinger, Elina Robeva

Details
English abstract

Identifying the drift and diffusion of an SDE from its population dynamics is a notoriously challenging task. Researchers in machine learning and single-cell biology have only been able to prove a partial identifiability result: for potential-driven SDEs, the gradient-flow drift can be identified from temporal marginals if the Brownian diffusivity is already known. Existing methods therefore assume that the diffusivity is known a priori, despite it being unknown in practice. We dispel the need for this assumption by providing a complete characterization of identifiability: the gradient-flow drift and Brownian diffusivity are jointly identifiable from temporal marginals if and only if the process is observed outside of equilibrium. Given this fundamental result, we propose nn-APPEX, the first Schrödinger Bridge-based inference method that can simultaneously learn the drift and diffusion of a gradient-flow SDE solely from observed marginals. Extensive experiments show that nn-APPEX's ability to adjust its diffusion estimate enables accurate inference, while previous Schrödinger Bridge methods obtain biased drift estimates due to their assumed, and likely incorrect, diffusion.

2505.21580 2026-04-02 stat.ML cs.LG math.ST stat.TH

A Pure Hypothesis Test for Inhomogeneous Random Graph Models Based on a Kernelised Stein Discrepancy

Anum Fatima, Gesine Reinert

Comments 53 pages, 21 figures

Details
English abstract

Complex data are often represented as a graph, which in turn can often be viewed as a realisation of a random graph, such as an inhomogeneous random graph model (IRG). For general fast goodness-of-fit tests in high dimensions, kernelised Stein discrepancy (KSD) tests are a powerful tool. Here, we develop a KSD-type test for IRG models that can be carried out with a single observation of the network. The test applies to a network of any size, but is particularly interesting for small networks for which asymptotic tests are not warranted. We also provide theoretical guarantees.

2505.15808 2026-04-02 cs.LG cs.AI math.PR stat.AP stat.ML

Neural Conditional Transport Maps

Carlos Rodriguez-Pardo, Leonardo Chiani, Emanuele Borgonovo, Massimo Tavoni

Comments Published in Transactions on Machine Learning Research

Details
English abstract

We present a neural framework for learning conditional optimal transport (OT) maps between probability distributions. Our approach introduces a conditioning mechanism capable of processing both categorical and continuous conditioning variables simultaneously. At the core of our method lies a hypernetwork that generates transport layer parameters based on these inputs, creating adaptive mappings that outperform simpler conditioning methods. Comprehensive ablation studies demonstrate the superior performance of our method over baseline configurations. Furthermore, we showcase an application to global sensitivity analysis, offering high performance in computing OT-based sensitivity indices. This work advances the state-of-the-art in conditional optimal transport, enabling broader application of optimal transport principles to complex, high-dimensional domains such as generative modeling and black-box model explainability.

2505.04645 2026-04-02 cs.CL cs.LG stat.CO

ChatGPT for automated grading of short answer questions in mechanical ventilation

Tejas Jade, Alex Yartsev

Details
Journal ref
Focus on Health Professional Education: A Multi-Professional Journal, 27(1), 90-104 (2026)
English abstract

Standardised tests using short answer questions (SAQs) are common in postgraduate education. Large language models (LLMs) simulate conversational language and interpret unstructured free-text responses in ways aligning with applying SAQ grading rubrics, making them attractive for automated grading. We evaluated ChatGPT 4o to grade SAQs in a postgraduate medical setting using data from 215 students (557 short-answer responses) enrolled in an online course on mechanical ventilation (2020--2024). Deidentified responses to three case-based scenarios were presented to ChatGPT with a standardised grading prompt and rubric. Outputs were analysed using mixed-effects modelling, variance component analysis, intraclass correlation coefficients (ICCs), Cohen's kappa, Kendall's W, and Bland--Altman statistics. ChatGPT awarded systematically lower marks than human graders with a mean difference (bias) of -1.34 on a 10-point scale. ICC values indicated poor individual-level agreement (ICC1 = 0.086), and Cohen's kappa (-0.0786) suggested no meaningful agreement. Variance component analysis showed minimal variability among the five ChatGPT sessions (G-value = 0.87), indicating internal consistency but divergence from the human grader. The poorest agreement was observed for evaluative and analytic items, whereas checklist and prescriptive rubric items had less disagreement. We caution against the use of LLMs in grading postgraduate coursework. Over 60% of ChatGPT-assigned grades differed from human grades by more than acceptable boundaries for high-stakes assessments.
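
Two of the agreement statistics named above are simple enough to sketch directly: Cohen's kappa for categorical labels and the Bland-Altman bias with 95% limits of agreement for numeric marks. The marks and pass/fail labels below are invented for illustration and are not the study's data; the function names are assumptions.

```python
import statistics
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if both raters labelled independently at their own rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    return (observed - expected) / (1 - expected)


def bland_altman(scores_a, scores_b, z=1.96):
    """Mean difference (b minus a) and approximate 95% limits of agreement."""
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    bias = statistics.mean(diffs)
    spread = z * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical marks out of 10 from a human grader and an automated grader.
human_marks = [7.0, 8.0, 6.0, 9.0, 5.0]
model_marks = [6.0, 6.0, 5.0, 8.0, 4.0]
bias, lower, upper = bland_altman(human_marks, model_marks)

# Hypothetical pass/fail decisions from the same two graders.
human_labels = ["pass", "pass", "fail", "pass", "fail", "fail"]
model_labels = ["pass", "fail", "fail", "pass", "fail", "pass"]
kappa = cohens_kappa(human_labels, model_labels)

print(f"bias={bias:.2f} limits=({lower:.2f}, {upper:.2f}) kappa={kappa:.3f}")
```

A negative bias means the second grader marks lower on average, and a kappa near zero means the two graders agree no more often than chance would predict.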

2504.19831 2026-04-02 stat.ME

Optimizing Real-Time Oxytocin Administration to Prevent Postpartum Hemorrhage: A Bayesian Approach to Dynamic Treatment Regimes

Haiyan Zhu, Yuqi Qiu, Yingchun Zhou

Details
English abstract

Postpartum hemorrhage (PPH) remains a leading cause of maternal morbidity and mortality worldwide. Oxytocin, though widely recognized for facilitating labor, is also the primary pharmacological intervention for PPH prevention. However, current dosing protocols lack personalization and fail to account for real-time physiological changes during labor. Moreover, standard dynamic treatment regime (DTR) methods cannot accommodate continuous monitoring and adjustment. To address this, we propose a semiparametric Bayesian method for estimating an optimal treatment regime in real-time, which allows for the existence of latent individual-level variables. Specifically, random real-time DTRs are defined through interventional parameters, optimized by minimizing posterior predictive loss. We further introduce a "physician-in-the-loop" framework to align optimal strategies with clinical expertise. In an application to Consortium on Safe Labor data, the proposed method achieved consistently lower estimated blood loss than other competing methods. The learned policy recommends earlier initiation, rapid dose escalation, and more frequent titration for parturients with higher BMI, alongside increased adjustments relative to cervical dilation and the interval since the last dose change. Simulation studies demonstrate robust performance and computational efficiency, especially when unmeasured patient factors influence outcomes and covariates. Supplementary materials provide a standardized description of the materials available for reproducing the work.

2504.09279 2026-04-02 stat.ML cs.LG math.OC math.ST stat.TH

No-Regret Generative Modeling via Parabolic Monge-Ampère PDE

Nabarun Deb, Tengyuan Liang

Comments 30 pages, 7 figures. Journal version accepted for publication in the Annals of Statistics

Details
English abstract

We introduce a novel generative modeling framework based on a discretized parabolic Monge-Ampère PDE, which emerges as a continuous limit of the Sinkhorn algorithm commonly used in optimal transport. Our method performs iterative refinement in the space of Brenier maps using a mirror gradient descent step. We establish theoretical guarantees for generative modeling through the lens of no-regret analysis, demonstrating that the iterates converge to the optimal Brenier map under a variety of step-size schedules. As a technical contribution, we derive a new Evolution Variational Inequality tailored to the parabolic Monge-Ampère PDE, connecting geometry, transportation cost, and regret. Our framework accommodates non-log-concave target distributions, constructs an optimal sampling process via the Brenier map, and integrates favorable learning techniques from generative adversarial networks and score-based diffusion models. As direct applications, we illustrate how our theory paves new pathways for generative modeling and variational inference.

2503.19091 2026-04-02 math.OC cs.CC cs.LG cs.NA math.NA stat.ML

High Probability Complexity Bounds of Trust-Region Stochastic Sequential Quadratic Programming with Heavy-Tailed Noise

Yuchen Fang, Javad Lavaei, Sen Na

Comments 66 pages, 7 figures

Details
English abstract

In this paper, we consider nonlinear optimization problems with a stochastic objective and deterministic equality constraints. We propose a Trust-Region Stochastic Sequential Quadratic Programming (TR-SSQP) method and establish its high-probability iteration complexity bounds for identifying first- and second-order $ε$-stationary points. In our algorithm, we assume that exact objective values, gradients, and Hessians are not directly accessible but can be estimated via zeroth-, first-, and second-order probabilistic oracles. Compared to existing complexity studies of SSQP methods that rely on a zeroth-order oracle with sub-exponential tail noise (i.e., light-tailed) and focus mostly on first-order stationarity, our analysis accommodates biased (also referred to as irreducible in the literature) and heavy-tailed noise in the zeroth-order oracle, and significantly extends the analysis to second-order stationarity. We show that under heavy-tailed noise conditions, our SSQP method achieves the same high-probability first-order iteration complexity bounds as in the light-tailed noise setting, while further exhibiting promising second-order iteration complexity bounds. Specifically, the method identifies a first-order $ε$-stationary point in $\mathcal{O}(ε^{-2})$ iterations and a second-order $ε$-stationary point in $\mathcal{O}(ε^{-3})$ iterations with high probability, provided that $ε$ is lower bounded by a constant determined by the bias magnitude (i.e., the irreducible noise) in the estimation. We validate our theoretical findings and evaluate the practical performance of our method on the CUTEst benchmark test set.

2503.18075 2026-04-02 stat.ME

Variational inference for hierarchical models with conditional scale and skewness corrections

Lucas Kock, Linda S. L. Tan, Prateek Bansal, David J. Nott

Details
Journal ref
Journal of Computational and Graphical Statistics 2026
English abstract

Gaussian variational approximations are widely used for summarizing posterior distributions in Bayesian models, especially in high-dimensional settings. However, a drawback of such approximations is the inability to capture skewness or more complex features of the posterior. Recent work suggests applying skewness corrections to existing Gaussian or other symmetric approximations to address this limitation. We propose to incorporate the skewness correction into the definition of an approximating variational family. We consider approximating the posterior for hierarchical models, in which there are "global" and "local" parameters. A baseline variational approximation is defined as the product of a Gaussian marginal posterior for global parameters and a Gaussian conditional posterior for local parameters given the global ones. Skewness corrections are then considered. The adjustment of the conditional posterior term for local variables is adaptive to the global parameter value. Optimization of baseline variational parameters is performed jointly with the skewness correction. Our approach allows the location, scale and skewness to be captured separately, without using additional parameters for skewness adjustments. The proposed method substantially improves accuracy for only a modest increase in computational cost compared to state-of-the-art Gaussian approximations. Good performance is demonstrated in generalized linear mixed models and multinomial logit discrete choice models.

2502.08565 2026-04-02 math.OC physics.soc-ph stat.AP

Increasing competitiveness by imbalanced groups: The example of the 48-team FIFA World Cup

László Csató, András Gyimesi

Comments 25 pages, 7 figures, 4 tables

Details
Journal ref
European Journal of Operational Research, 332(3): 868-879, 2026
English abstract

A match played in a sports tournament can be called stakeless if at least one team is indifferent to its outcome because it already has qualified or has been eliminated. Such a game threatens fairness since teams may not exert full effort without incentives. This paper suggests a novel classification for stakeless matches based on their expected outcome: they are more costly if the indifferent team is more likely to win by playing honestly. Our approach is illustrated with the 2026 FIFA World Cup, the first edition of the competition with 48 teams. We propose a novel format based on imbalanced groups, which substantially reduces the probability of stakeless matches played by the strongest teams according to Monte Carlo simulations. The new design also increases the uncertainty of match outcomes and requires fewer matches. Governing bodies in sports are encouraged to consider our innovative idea in order to enhance the competitiveness of their tournaments.

2502.01861 2026-04-02 cs.LG stat.ML

Learning Hyperparameters via a Data-Emphasized Variational Objective

Ethan Harvey, Mikhail Petrov, Michael C. Hughes

Comments arXiv admin note: text overlap with arXiv:2410.19675

Details
English abstract

When training large models on limited data, avoiding overfitting is paramount. Common grid search or smarter search methods rely on expensive separate runs for each candidate hyperparameter, while carving out a validation set that reduces available training data. In this paper, we study gradient-based learning of hyperparameters via the evidence lower bound (ELBO) objective from Bayesian variational methods. This avoids the need for any validation set. We focus on scenarios where the model is over-parameterized for flexibility and the approximate posterior is chosen to be Gaussian with isotropic covariance for tractability, even though it cannot match the true posterior. In such scenarios, we find the ELBO prioritizes posteriors that match the prior, leading to severe underfitting. Instead, we recommend a data-emphasized ELBO that upweights the likelihood but not the prior. In Bayesian transfer learning of image and text classifiers, our method reduces the 88+ hour grid search of past work to under 3 hours while delivering comparable accuracy. We further demonstrate how our approach enables efficient yet accurate approximations of Gaussian processes with learnable lengthscale kernels.
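
The contrast between the two objectives can be written out as follows. This is a sketch under assumed notation: $q$ is the approximate posterior over weights $\theta$, $\eta$ denotes the hyperparameters of the prior $p_\eta$, and $\kappa$ is a hypothetical symbol for the likelihood weight. The abstract says only that the likelihood is upweighted and the prior is not, so how $\kappa$ is chosen is left open here.

```latex
% Standard ELBO: expected log-likelihood minus KL divergence to the prior
\mathcal{L}_{\mathrm{ELBO}}(q, \eta)
  = \mathbb{E}_{q(\theta)}\!\left[\log p(y \mid \theta)\right]
  - \mathrm{KL}\!\left(q(\theta) \,\|\, p_{\eta}(\theta)\right)

% Data-emphasized variant: only the likelihood term is scaled, by \kappa \ge 1
\mathcal{L}_{\kappa}(q, \eta)
  = \kappa \, \mathbb{E}_{q(\theta)}\!\left[\log p(y \mid \theta)\right]
  - \mathrm{KL}\!\left(q(\theta) \,\|\, p_{\eta}(\theta)\right)
```

Setting $\kappa = 1$ recovers the standard ELBO. For an over-parameterized model the KL term grows with the number of parameters while the likelihood term grows with the number of observations, which is one way to read the underfitting the abstract describes.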

2502.01556 2026-04-02 cs.LG stat.ML

A Gaussian Process View on Observation Noise and Initialization in Wide Neural Networks

Sergio Calvo-Ordoñez, Jonathan Plenk, Richard Bergna, Alvaro Cartea, Jose Miguel Hernandez-Lobato, Konstantina Palla, Kamil Ciosek

Comments AISTATS 2026, Camera-ready version

Details
English abstract

Performing gradient descent in a wide neural network is equivalent to computing the posterior mean of a Gaussian Process with the Neural Tangent Kernel (NTK-GP), for a specific prior mean and with zero observation noise. However, existing formulations have two limitations: (i) the NTK-GP assumes noiseless targets, leading to misspecification on noisy data; (ii) the equivalence does not extend to arbitrary prior means, which are essential for well-specified models. To address (i), we introduce a regularizer into the training objective, showing its correspondence to incorporating observation noise in the NTK-GP. To address (ii), we propose a shifted network that enables arbitrary prior means and allows obtaining the posterior mean with gradient descent on a single network, without ensembling or kernel inversion. We validate our results with experiments across datasets and architectures, showing that this approach removes key obstacles to the practical use of NTK-GP equivalence in applied Gaussian process modeling.

2411.17109 2026-04-02 math.PR math.ST stat.TH

On the maximal correlation of some stochastic processes

Yinshan Chang, Qinwei Chen

Details
English abstract

We study the maximal correlation coefficient $R(X,Y)$ between two stochastic processes $X$ and $Y$. In the case when $(X,Y)$ is a random walk, we find $R(X,Y)$ using the Csáki-Fischer identity and the lower semicontinuity of the map $\text{Law}(X,Y) \to R(X,Y)$. When $(X,Y)$ is a two-dimensional Lévy process, we express $R(X,Y)$ in terms of the Lévy measure of the process and the covariance matrix of the diffusion part of the process. Consequently, for a two-dimensional $α$-stable random vector $(X,Y)$ with $0<α<2$, we express $R(X,Y)$ in terms of $α$ and the spectral measure $τ$ of the $α$-stable distribution. We also establish analogs and extensions of the Dembo-Kagan-Shepp-Yu inequality and the Madiman-Barron inequality.

2410.22729 2026-04-02 stat.ML cs.LG math.ST stat.TH

Identifying Drift, Diffusion, and Causal Structure from Temporal Snapshots

Vincent Guan, Joseph Janssen, Hossein Rahmani, Andrew Warren, Stephen Zhang, Elina Robeva, Geoffrey Schiebinger

Details
English abstract

Stochastic differential equations (SDEs) are a fundamental tool for modelling dynamic processes, including gene regulatory networks (GRNs), contaminant transport, financial markets, and image generation. However, learning the underlying SDE from data is a challenging task, especially if individual trajectories are not observable. Motivated by burgeoning research in single-cell datasets, we present the first comprehensive approach for jointly identifying the drift and diffusion of an SDE from its temporal marginals. Assuming linear drift and additive diffusion, we show that non-identifiability can only arise if the initial distribution possesses generalized rotational symmetries. We further prove that even if this condition holds, the drift and diffusion can almost always be recovered from the marginals. Additionally, we show that the causal graph of any SDE with additive diffusion can be recovered from the identified SDE parameters. To complement this theory, we adapt entropy-regularized optimal transport to handle anisotropic diffusion, and introduce APPEX (Alternating Projection Parameter Estimation from $X_0$), an iterative algorithm designed to estimate the drift, diffusion, and causal graph of an additive noise SDE, solely from temporal marginals. We show that APPEX iteratively decreases Kullback-Leibler divergence to the true solution, and demonstrate its effectiveness on simulated data from linear additive noise SDEs.

2409.10855 2026-04-02 stat.ME stat.ML

Calibrated Multivariate Regression with Localized PIT Mappings

Lucas Kock, G. S. Rodrigues, Scott A. Sisson, Nadja Klein, David J. Nott

Details
Journal ref
Journal of Computational and Graphical Statistics 2026
English abstract

Calibration ensures that predicted uncertainties align with observed uncertainties. While there is an extensive literature on recalibration methods for univariate probabilistic forecasts, work on calibration for multivariate forecasts is much more limited. This paper introduces a novel post-hoc recalibration approach that addresses multivariate calibration for potentially misspecified models. Our method involves constructing local mappings between vectors of marginal probability integral transform values and the space of observations, providing a flexible and model-free solution applicable to continuous, discrete, and mixed responses. We present two versions of our approach: one uses K-nearest neighbors, and the other uses normalizing flows. Each method has its own strengths in different situations. We demonstrate the effectiveness of our approach on two real data applications: recalibrating a deep neural network's currency exchange rate forecast and improving a regression model for childhood malnutrition in India for which the multivariate response has both discrete and continuous components.
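
As background for the probability integral transform (PIT) values mentioned in this abstract, the following is a minimal univariate sketch. It is not the paper's localized multivariate method. The Gaussian forecasts, the sample size, and the helper `normal_cdf` are assumptions made purely for illustration. The idea: if a forecast CDF is calibrated, the PIT values F(y) should look Uniform(0, 1).

```python
import math
import numpy as np

def normal_cdf(y, mean=0.0, sd=1.0):
    """CDF of N(mean, sd^2), evaluated elementwise (illustrative helper)."""
    z = (np.asarray(y, dtype=float) - mean) / (sd * math.sqrt(2.0))
    return 0.5 * (1.0 + np.vectorize(math.erf)(z))

rng = np.random.default_rng(0)
y = rng.normal(0.0, 2.0, size=20_000)   # observations: the true sd is 2

pit_good = normal_cdf(y, sd=2.0)   # forecast distribution matches the data
pit_bad = normal_cdf(y, sd=1.0)    # overconfident forecast (sd too small)

# A Uniform(0, 1) sample has mean 1/2 and variance 1/12.
print(pit_good.mean(), pit_good.var())
print(pit_bad.var())  # larger than 1/12: PIT mass piles up near 0 and 1
```

A recalibration method would then use the departure of the PIT values from uniformity to adjust the forecast; how the paper does this locally and for multivariate responses is not specified in the abstract.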

2408.09096 2026-04-02 stat.ME stat.CO

Dynamic linear regression models for forecasting time series with semi long memory errors

Thomas Goodwin, Matias Quiroz, Robert Kohn

Details
English abstract

Dynamic linear regression models forecast the values of a time series based on a linear combination of a set of exogenous time series while incorporating a time series process for the error term. This error process is often assumed to follow a stationary autoregressive integrated moving average (ARIMA) model, or its seasonal variants, which are unable to capture a long-range dependence structure (long memory) of the error process. We propose a novel dynamic linear regression model that incorporates the long-range dependence feature of the errors and show that the proposed error process may: (i) have a significant impact on the posterior uncertainty of the estimated regression parameters and (ii) improve the model's forecasting ability. We develop a Markov chain Monte Carlo method to fit general dynamic linear regression models based on a frequency domain approach that enables fast, asymptotically exact Bayesian inference for large datasets. We demonstrate that our approximate algorithm is faster than the traditional time domain approaches, such as the Kalman filter and the multivariate Gaussian likelihood, while producing a highly accurate approximation to the posterior. The method is illustrated in simulated examples and two energy forecasting applications, showing that it outperforms approaches that do not account for semi-long memory, as well as a state-of-the-art neural-network-based forecasting procedure.

2408.07575 2026-04-02 cs.AI math.ST stat.ME stat.TH

A General Framework on Conditions for Constraint-based Causal Learning

Kai Z. Teh, Kayvan Sadeghi, Terry Soo

Details
English abstract

Most constraint-based causal learning algorithms provably return the correct causal graph under certain correctness conditions, such as faithfulness. By representing any constraint-based causal learning algorithm using the notion of a property, we provide a general framework to obtain and study correctness conditions for these algorithms. From the framework, we provide exact correctness conditions for the PC algorithm, which are then related to the correctness conditions of some other existing causal discovery algorithms. The framework also suggests a paradigm for designing causal learning algorithms which allows for the correctness conditions of algorithms to be controlled for before designing the actual algorithm, and has the following implications. We show that the sparsest Markov representation condition is the weakest correctness condition for algorithms that output ancestral graphs or directed acyclic graphs satisfying any existing notions of minimality. We also reason that Pearl-minimality is necessary for meaningful causal learning but not sufficient to relax the faithfulness condition and, as such, has to be strengthened, such as by including background knowledge, for causal learning beyond faithfulness.

2408.04419 2026-04-02 stat.ME stat.CO

Analysing symbolic data by pseudo-marginal methods

Yu Yang, Matias Quiroz, Boris Beranger, Robert Kohn, Scott A. Sisson

Comments Accepted for publication in Statistics and Computing

Details
English abstract

Symbolic data analysis (SDA) aggregates large individual-level datasets into a small number of distributional summaries, such as random rectangles or random histograms. The inference is carried out using these summaries in place of the original dataset, resulting in computational gains at the loss of some information. In likelihood-based SDA, the likelihood function is characterised by an integral with a large exponent, which limits the method's utility as for typical models the integral is unavailable in closed form. In addition, the likelihood function is known to produce biased parameter estimates in some circumstances. Our article develops a Bayesian framework for SDA methods in these settings that resolves the issues resulting from integral intractability and biased parameter estimation using pseudo-marginal Markov chain Monte Carlo methods. We develop an exact but computationally expensive method based on path sampling and the Poisson estimator, and a much faster, but approximate, method based on a Taylor expansion. Through simulation and real-data examples we demonstrate the performance of the developed methods, showing large reductions in computation time compared to the full-data analysis, with only a small loss of information.

2406.19222 2026-04-02 econ.GN physics.soc-ph q-fin.EC stat.AP

Competitive balance in the UEFA Champions League group stage: Novel measures show no evidence of decline

László Csató, Dóra Gréta Petróczy

Comments 17 pages, 1 figure, 3 tables

Details
Journal ref
Annals of Operations Research, 352(1-2): 105-120, 2025
English abstract

Competitive balance, which refers to the level of control teams have over a sports competition, is a crucial indicator for tournament organisers. According to previous studies, competitive balance has significantly declined in the UEFA Champions League group stage over the recent decades. Our paper introduces alternative indices to investigate this issue. Two ex ante measures are based on Elo ratings, and four dynamic concentration indicators compare the final group ranking to reasonable benchmarks. Using these indices, we find no evidence of any long-run trend in the competitive balance of the UEFA Champions League group stage between the 2003/04 and 2023/24 seasons.

2406.15904 2026-04-02 cs.LG stat.ME stat.ML

Learning When the Concept Shifts: Confounding, Invariance, and Dimension Reduction

Kulunu Dharmakeerthi, YoonHaeng Hur, Tengyuan Liang

Details
English abstract

Practitioners often face the challenge of deploying prediction models in new environments with shifted distributions of covariates and responses. With observational data, such shifts are often driven by unobserved confounding, and can in fact alter the concept of which model is best. This paper studies distribution shifts in the domain adaptation problem with unobserved confounding. We postulate a linear structural causal model to account for endogeneity and unobserved confounding, and we leverage exogenous invariant covariate representations to cure concept shifts and improve target prediction. We propose a data-driven representation learning method that optimizes for a lower-dimensional linear subspace and a prediction model confined to that subspace. This method operates on a non-convex objective -- that interpolates between predictability and stability -- constrained to the Stiefel manifold, using an analog of projected gradient descent. We analyze the optimization landscape and prove that, provided sufficient regularization, nearly all local optima align with an invariant linear subspace resilient to distribution shifts. This method achieves a nearly ideal gap between target and source risk. We validate the method and theory with real-world data sets to illustrate the tradeoffs between predictability and stability.

2405.15132 2026-04-02 stat.ML cs.LG math.ST stat.CO stat.ME stat.TH

Scale-adaptive and robust intrinsic dimension estimation via optimal neighbourhood identification

Antonio Di Noia, Iuri Macocco, Aldo Glielmo, Alessandro Laio, Antonietta Mira

Details
English abstract

The Intrinsic Dimension (ID) is a key concept in unsupervised learning and feature selection, as it is a lower bound to the number of variables which are necessary to describe a system. However, in almost any real-world dataset the ID depends on the scale at which the data are analysed. Quite typically at a small scale, the ID is very large, as the data are affected by measurement errors. At large scale, the ID can also appear erroneously large, due to the curvature and the topology of the manifold containing the data. In this work, we introduce an automatic protocol to select the sweet spot, namely the correct range of scales in which the ID is meaningful and useful. This protocol is based on imposing that for distances smaller than the correct scale the density of the data is constant. In the presented framework, estimating the density requires knowing the ID, so this condition is imposed self-consistently. We illustrate the usefulness and robustness of this procedure to noise by benchmarks on artificial and real-world datasets.

2307.14012 2026-04-02 stat.ML cs.LG

MCMC-Correction of Score-Based Diffusion Models for Model Composition

Anders Sjöberg, Jakob Lindqvist, Magnus Önnheim, Mats Jirstrand, Lennart Svensson

Comments 27 pages. Published in Entropy 28(3):351 (2026). This version matches the published content

Details
Journal ref
Entropy 28(3):351, 2026
English abstract

Diffusion models can be parameterized in terms of either score or energy function. The energy parameterization is attractive as it enables sampling procedures such as Markov Chain Monte Carlo (MCMC) that incorporate a Metropolis-Hastings (MH) correction step based on energy differences between proposed samples. Such corrections can significantly improve sampling quality, particularly in the context of model composition, where pre-trained models are combined to generate samples from novel distributions. Score-based diffusion models, on the other hand, are more widely adopted and come with a rich ecosystem of pre-trained models. However, they do not, in general, define an underlying energy function, making MH-based sampling inapplicable. In this work, we address this limitation by retaining score parameterization and introducing a novel MH-like acceptance rule based on line integration of the score function. This allows the reuse of existing diffusion models while still combining the reverse process with various MCMC techniques, viewed as an instance of annealed MCMC. Through experiments on synthetic and real-world data, we show that our MH-like samplers yield relative improvements of similar magnitude to those observed with energy-based models, without requiring explicit energy parameterization.
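
The idea of replacing an energy difference with a line integral of the score can be illustrated with a small sketch. It rests on the gradient theorem: log p(x') - log p(x) equals the integral of s(x + t(x' - x)) · (x' - x) over t in [0, 1], where s is the score. Everything below is an assumption made for illustration rather than the paper's sampler: the standard-normal toy score, the straight-line path, the trapezoidal quadrature, the symmetric proposal, and the function names.

```python
import numpy as np

def score(x):
    """Toy score of a standard normal target: grad log p(x) = -x (assumption)."""
    return -x

def log_ratio_from_score(x, x_new, n_points=33):
    """Approximate log p(x_new) - log p(x) by integrating the score along
    the straight segment from x to x_new with the trapezoidal rule."""
    t = np.linspace(0.0, 1.0, n_points)
    direction = x_new - x
    integrand = np.array([score(x + ti * direction) @ direction for ti in t])
    dt = t[1] - t[0]
    return dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

def mh_like_accept(x, x_new, rng):
    """Accept/reject for a symmetric proposal, using only score evaluations."""
    return np.log(rng.uniform()) < log_ratio_from_score(x, x_new)

x = np.array([0.5, -1.0])
x_new = np.array([0.2, 0.3])
estimate = log_ratio_from_score(x, x_new)
exact = -0.5 * (x_new @ x_new - x @ x)   # known energy difference for N(0, I)
print(estimate, exact)
```

For this toy target the integrand is linear in t, so the trapezoidal rule reproduces the exact log-density ratio. With a learned score network the quadrature would only be approximate, and the score need not be a gradient field at all, which is presumably why the abstract calls the rule "MH-like".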

2303.06561 2026-04-02 cs.LG cond-mat.dis-nn math.OC stat.ML

Phase Diagram of Initial Condensation for Two-layer Neural Networks

Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu

Details
Journal ref
CSIAM Transactions on Applied Mathematics 2024
English abstract

The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research. In this paper, based on the earlier work by Luo et al., we present a phase diagram of initial condensation for two-layer neural networks. Condensation is a phenomenon wherein the weight vectors of neural networks concentrate on isolated orientations during the training process, and it is a feature of the non-linear learning process that enables neural networks to possess better generalization abilities. Our phase diagram serves to provide a comprehensive understanding of the dynamical regimes of neural networks and their dependence on the choice of hyperparameters related to initialization. Furthermore, we demonstrate in detail the underlying mechanisms by which small initialization leads to condensation at the initial training stage.

2206.00398 2026-04-02 quant-ph stat.ML

On Quantum Circuits for Discrete Graphical Models

Nico Piatkowski, Christa Zoufal

Details
Journal ref
Quantum Mach. Intell. 6, 37 (2024)
English abstract

Graphical models are useful tools for describing structured high-dimensional probability distributions. Development of efficient algorithms for generating unbiased and independent samples from graphical models remains an active research topic. Sampling from graphical models that describe the statistics of discrete variables is a particularly challenging problem, which is intractable in the presence of high dimensions. In this work, we provide the first method that allows one to provably generate unbiased and independent samples from general discrete factor models with a quantum circuit. Our method is compatible with multi-body interactions and its success probability does not depend on the number of variables. To this end, we identify a novel embedding of the graphical model into unitary operators and provide rigorous guarantees on the resulting quantum state. Moreover, we prove a unitary Hammersley-Clifford theorem -- showing that our quantum embedding factorizes over the cliques of the underlying conditional independence structure. Importantly, the quantum embedding allows for maximum likelihood learning as well as maximum a posteriori state approximation via state-of-the-art hybrid quantum-classical methods. Finally, the proposed quantum method can be implemented on current quantum processors. Experiments with quantum simulation as well as actual quantum hardware show that our method can carry out sampling and parameter learning on quantum computers.

2604.00330 2026-04-02 stat.AP

A Rank-Based Information Fusion Framework for Comparing Clustered Multivariate Socioeconomic Outcomes

Dhrubajyoti Ghosh

Comments This manuscript has been accepted for publication at the 2026 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). The final published version will be available via IEEE Xplore

Details
English abstract

We propose a multivariate, distribution-free ranking framework for comparing clustered, correlated outcomes across groups, motivated by the evaluation of state-level policy environments using county-level socioeconomic data. Using pooled U.S. county data from 2019-2023, we study multiple dimensions of economic well-being, including poverty, income inequality, housing cost burden, medical care costs, and per capita income, observed at a finer spatial resolution than the policy itself. Rather than relying on parametric regression models, we employ a rank-based aggregation algorithm derived from the Longitudinal Rank-Sum Test (LRST), which treats clusters as independent units and aggregates information across outcomes using order statistics. This approach provides a robust, interpretable omnibus comparison that accommodates within-cluster dependence and high-dimensional outcome structure without distributional assumptions. Applied to the comparison of states with and without refundable Earned Income Tax Credit (EITC) policies, the method reveals systematic differences in the joint ranking of county-level outcomes, with results remaining stable under repeated random subsampling of counties and varying cluster sizes. While the empirical analysis is descriptive rather than causal, the study highlights the broader utility of rank-based, multi-criteria aggregation methods as computational intelligence tools for analyzing complex, clustered data in policy and social systems.

2604.00316 2026-04-02 stat.ML cs.LG

Breaking Data Symmetry is Needed For Generalization in Feature Learning Kernels

Marcel Tomàs Bernal, Neil Rohit Mallinar, Mikhail Belkin

Details
English abstract

Grokking occurs when a model achieves high training accuracy but generalization to unseen test points happens long after that. This phenomenon was initially observed on a class of algebraic problems, such as learning modular arithmetic (Power et al., 2022). We study grokking on algebraic tasks in a class of feature learning kernels via the Recursive Feature Machine (RFM) algorithm (Radhakrishnan et al., 2024), which iteratively updates feature matrices through the Average Gradient Outer Product (AGOP) of an estimator in order to learn task-relevant features. Our main experimental finding is that generalization occurs only when a certain symmetry in the training set is broken. Furthermore, we empirically show that RFM generalizes by recovering the underlying invariance group action inherent in the data. We find that the learned feature matrices encode specific elements of the invariance group, explaining the dependence of generalization on symmetry.
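
To make the Average Gradient Outer Product (AGOP) concrete, here is a minimal sketch on a toy predictor. It is not the RFM algorithm itself, which applies the AGOP to a kernel estimator and iterates. The function `agop`, the single-index predictor f(x) = sin(w · x), and the Gaussian inputs are all hypothetical choices for illustration.

```python
import numpy as np

def agop(grad_fn, X):
    """Average gradient outer product: (1/n) * sum_i g(x_i) g(x_i)^T."""
    G = np.stack([grad_fn(x) for x in X])   # shape (n, d)
    return G.T @ G / len(X)

# Toy predictor f(x) = sin(w . x); its gradient is cos(w . x) * w.
w = np.array([2.0, -1.0, 0.0, 0.0])
grad_f = lambda x: np.cos(w @ x) * w

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
M = agop(grad_f, X)

eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
top = eigvecs[:, -1]
alignment = abs(top @ w) / np.linalg.norm(w)
print(alignment)  # close to 1: the AGOP picks out the relevant direction
```

In this toy case every gradient is a multiple of w, so the AGOP is rank one and its top eigenvector recovers the only direction the predictor depends on. This is the sense in which an AGOP-based feature matrix can encode task-relevant structure.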

2604.00307 2026-04-02 cs.LG physics.geo-ph stat.ML

SAGE: Subsurface AI-driven Geostatistical Extraction with proxy posterior

Huseyin Tuna Erdinc, Ipsita Bhar, Rafael Orozco, Thales Souza, Felix J. Herrmann

Comments 7 pages, 4 figures

Details
English abstract

Recent advances in generative networks have enabled new approaches to subsurface velocity model synthesis, offering a compelling alternative to traditional methods such as Full Waveform Inversion. However, these approaches predominantly rely on the availability of large-scale datasets of high-quality, geologically realistic subsurface velocity models, which are often difficult to obtain in practice. We introduce SAGE, a novel framework for statistically consistent proxy velocity generation from incomplete observations, specifically sparse well logs and migrated seismic images. During training, SAGE learns a proxy posterior over velocity models conditioned on both modalities (wells and seismic); at inference, it produces full-resolution velocity fields conditioned solely on migrated images, with well information implicitly encoded in the learned distribution. This enables the generation of geologically plausible and statistically accurate velocity realizations. We validate SAGE on both synthetic and field datasets, demonstrating its ability to capture complex subsurface variability under limited observational constraints. Furthermore, samples drawn from the learned proxy distribution can be leveraged to train downstream networks, supporting inversion workflows. Overall, SAGE provides a scalable and data-efficient pathway toward learning geological proxy posterior for seismic imaging and inversion. Repo link: https://github.com/slimgroup/SAGE.

2604.00293 2026-04-02 cs.LG stat.ML

SYNTHONY: A Stress-Aware, Intent-Conditioned Agent for Deep Tabular Generative Models Selection

Hochan Son, Xiaofeng Lin, Jason Ni, Guang Cheng

Details
English abstract

Deep generative models for tabular data (GANs, diffusion models, and LLM-based generators) exhibit highly non-uniform behavior across datasets; the best-performing synthesizer family depends strongly on distributional stressors such as long-tailed marginals, high-cardinality categorical variables, Zipfian imbalance, and small-sample regimes. This brittleness makes practical deployment challenging, especially when users must balance competing objectives of fidelity, privacy, and utility. We study intent-conditioned tabular synthesis selection: given a dataset and a user intent expressed as a preference over evaluation metrics, the goal is to select a synthesizer that minimizes regret relative to an intent-specific oracle. We propose stress profiling, a synthesis-specific meta-feature representation that quantifies dataset difficulty along four interpretable stress dimensions, and integrate it into SYNTHONY, a selection framework that matches stress profiles against a calibrated capability registry of synthesizer families. Across a benchmark of 7 datasets, 10 synthesizers, and 3 intents, we demonstrate that stress-based meta-features are highly predictive of synthesizer performance: a $k$NN selector using these features achieves strong Top-1 selection accuracy, substantially outperforming zero-shot LLM selectors and random baselines. We analyze the gap between meta-feature-based and capability-based selection, identifying the hand-crafted capability registry as the primary bottleneck and motivating learned capability representations as a direction for future work.

2604.00198 2026-04-02 math.ST stat.ME stat.TH

One-step TMLE for weighted average treatment effects

Yang Liu, Patrick Lopatto, Ivana Malenica

Details
English abstract

We consider Targeted Maximum Likelihood Estimation (TMLE) of weighted average treatment effects (WATEs), a class of causal estimands that reweight the covariate distribution using a specified function of the propensity score. This class includes the average treatment effect and average treatment effect on the treated, as well as various overlap-based targets. We provide a comprehensive analysis of the one-step TMLE along the universal least favorable path for such parameters. Under explicit regularity conditions on the weight function and initialization, we show that the targeting procedure is well-defined, reaches a solution of the estimating equation in finite time, and yields an asymptotically efficient estimator. In particular, convergence of the targeting dynamics and control of the second-order remainder are derived from these conditions rather than imposed as separate assumptions on the output of the algorithm.

2604.00186 2026-04-02 eess.SY cs.AI cs.CY cs.SY econ.GN q-fin.EC stat.AP

Agentic AI and Occupational Displacement: A Multi-Regional Task Exposure Analysis of Emerging Labor Market Disruption

Ravish Gupta, Saket Kumar

Comments 26 pages, 2 figures, 6 tables. Submitted to IMF-OECD-PIIE-World Bank Conference on Labor Markets and Structural Transformation 2026

Details
English abstract

This paper extends the Acemoglu-Restrepo task exposure framework to address the labor market effects of agentic artificial intelligence systems: autonomous AI agents capable of completing entire occupational workflows rather than discrete tasks. Unlike prior automation technologies that substitute for individual subtasks, agentic AI systems execute end-to-end workflows involving multi-step reasoning, tool invocation, and autonomous decision-making, substantially expanding occupational displacement risk beyond what existing task-level analyses capture. We introduce the Agentic Task Exposure (ATE) score, a composite measure computed algorithmically from O*NET task data using calibrated adoption parameters (not a regression estimate), incorporating AI capability scores, workflow coverage factors, and logistic adoption velocity. Applying the ATE framework across five major US technology regions (Seattle-Tacoma, San Francisco Bay Area, Austin, New York, and Boston) over a 2025-2030 horizon, we find that 93.2% of the 236 analyzed occupations across six information-intensive SOC groups (financial, legal, healthcare, healthcare support, sales, and administrative/clerical) cross the moderate-risk threshold (ATE >= 0.35) in Tier 1 regions by 2030, with credit analysts, judges, and sustainability specialists reaching ATE scores of 0.43-0.47. We simultaneously identify seventeen emerging occupational categories benefiting from reinstatement effects, concentrated in human-AI collaboration, AI governance, and domain-specific AI operations roles. Our findings carry implications for workforce transition policy, regional economic planning, and the temporal dynamics of labor market adjustment.

2604.00153 2026-04-02 q-bio.PE stat.AP

Macroscopic Signatures of Gauge-Mediated Contagion: Deriving Behavioral Shielding from Stochastic Field Theory

Jose de Jesus Bernal-Alvarado, David Delepine

Comments 14 pages, 3 figures

Details
English abstract

We present a unified theoretical model relating stochastic microscopic epidemic dynamics with macroscopic non-linear population behavior. Utilizing the Doi-Peliti formalism, we model the pathogen as a gauge mediator field coupled to susceptible and infected host populations, and introduce a Reactive Immunity Field capable of spontaneous symmetry breaking. We demonstrate that the naive epidemic vacuum is destabilized by radiative loop corrections via the Coleman-Weinberg mechanism, generating a dynamic herd immunity threshold. By extracting the classical saddle-point limit of the Effective Action, we derive the macroscopic reaction-diffusion equations governing the host population. We show that integrating out the gauge mediator inherently generates a thermodynamic Free Energy dependent on the square of the susceptible density. This non-linearity produces a macroscopic spatial "Fear Drift" proportional to the magnitude of the immunity field, and a cubic shielding penalty in the effective reproductive number ($R_{eff}$). In this work, we establish a mapping between fundamental field-theoretic mechanisms and specific terms in the macroscopic behavioral equations. We demonstrate that Debye screening is physically executed by the spatial cross-diffusion fluxes driving host evacuation. Simultaneously, vacuum polarization manifests as a non-linear cubic penalty ($-S^3 I$) in the dressed reaction rate that dynamically suppresses the effective reproductive number. As a validation of our model, we apply the formalism to high-resolution spatiotemporal COVID-19 data from Germany.

2604.00133 2026-04-02 stat.ME

Causal Vaccine Effects on Post-infection Outcomes in the Naturally Infected

Allison Codi, Elizabeth Rogawski McQuade, Razieh Nabi, Mats Stensrud, Kaeum Choi, David Benkeser

Details
English abstract

Understanding vaccine effects on post-infection outcomes is critical for evaluating the full value proposition of a vaccine. However, defining appropriate causal effects on such outcomes is challenging because infection is affected by vaccination. Existing principal stratification approaches focus on the Doomed stratum, individuals who would be infected regardless of vaccine receipt. For many relevant outcomes, however, this estimand will understate vaccine benefit by excluding individuals whose adverse post-infection outcomes are improved because vaccination prevented infection. We therefore propose causal estimands for post-infection outcomes in the Naturally Infected, individuals who would be infected in the absence of vaccine. We derive bounds under minimal assumptions and give point identification results under an exclusion restriction and/or a partial principal ignorability assumption. For point-identified settings, we develop efficient one-step estimators with robustness properties under inconsistent nuisance parameter estimation. We further show under what conditions the same identification functional can be interpreted as targeting an effect among individuals exposed to a sufficiently infectious dose of the pathogen, thereby avoiding direct reliance on cross-world parameters and fundamentally untestable causal assumptions. Simulations show that the bounds are valid but often wide, and that the point estimators perform well when their identifying assumptions hold. In a reanalysis of a rotavirus vaccine trial, marginal and Doomed-stratum analyses showed little evidence of an effect on antibiotic use, whereas analyses targeting the Naturally Infected suggested a protective effect under principal ignorability-based assumptions.

2604.00072 2026-04-02 cs.LG cs.AI stat.ML

Empirical Validation of the Classification-Verification Dichotomy for AI Safety Gates

Arsenios Scrivens

Comments 21 pages, 9 figures. Companion theory paper: doi:10.5281/zenodo.19237451

Details
English abstract

Can classifier-based safety gates maintain reliable oversight as AI systems improve over hundreds of iterations? We provide comprehensive empirical evidence that they cannot. On a self-improving neural controller (d=240), eighteen classifier configurations -- spanning MLPs, SVMs, random forests, k-NN, Bayesian classifiers, and deep networks -- all fail the dual conditions for safe self-improvement. Three safe RL baselines (CPO, Lyapunov, safety shielding) also fail. Results extend to MuJoCo benchmarks (Reacher-v4 d=496, Swimmer-v4 d=1408, HalfCheetah-v4 d=1824). At controlled distribution separations up to delta_s=2.0, all classifiers still fail -- including the NP-optimal test and MLPs with 100% training accuracy -- demonstrating structural impossibility. We then show the impossibility is specific to classification, not to safe self-improvement itself. A Lipschitz ball verifier achieves zero false accepts across dimensions d in {84, 240, 768, 2688, 5760, 9984, 17408} using provable analytical bounds (unconditional delta=0). Ball chaining enables unbounded parameter-space traversal: on MuJoCo Reacher-v4, 10 chains yield +4.31 reward improvement with delta=0; on Qwen2.5-7B-Instruct during LoRA fine-tuning, 42 chain transitions traverse 234x the single-ball radius with zero safety violations across 200 steps. A 50-prompt oracle confirms oracle-agnosticity. Compositional per-group verification enables radii up to 37x larger than full-network balls. At d<=17408, delta=0 is unconditional; at LLM scale, conditional on estimated Lipschitz constants.

2604.00064 2026-04-02 stat.ML cs.LG math.PR math.ST q-fin.CP stat.TH

Forecast collapse of transformer-based models under squared loss in financial time series

Pierre Andreoletti

Details
English abstract

We study trajectory forecasting under squared loss for time series with weak conditional structure, using highly expressive prediction models. Building on the classical characterization of squared-loss risk minimization, we emphasize regimes in which the conditional expectation of future trajectories is effectively degenerate, leading to trivial Bayes-optimal predictors (flat for prices and zero for returns in standard financial settings). In this regime, increased model expressivity does not improve predictive accuracy but instead introduces spurious trajectory fluctuations around the optimal predictor. These fluctuations arise from the reuse of noise and result in increased prediction variance without any reduction in bias. This provides a process-level explanation for the degradation of Transformer-based forecasts on financial time series. We complement these theoretical results with numerical experiments on high-frequency EUR/USD exchange rate data, analyzing the distribution of trajectory-level forecasting errors. The results show that Transformer-based models yield larger errors than a simple linear benchmark on a large majority of forecasting windows, consistent with the variance-driven mechanism identified by the theory.
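
The squared-loss argument in this abstract can be illustrated with a toy simulation. The sketch assumes i.i.d. zero-mean Gaussian returns, so the conditional mean of the next return is zero. The "noisy" predictor is a hypothetical stand-in for a model that echoes past noise, not the paper's Transformer.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 1.0, size=100_000)  # assumption: i.i.d., mean zero

target = returns[1:]
zero_forecast = np.zeros_like(target)     # Bayes-optimal here: E[r_t | past] = 0
noisy_forecast = 0.5 * returns[:-1]       # hypothetical predictor echoing past noise

mse_zero = np.mean((target - zero_forecast) ** 2)
mse_noisy = np.mean((target - noisy_forecast) ** 2)
print(mse_zero, mse_noisy)
```

Because the forecast fluctuation is independent of the target, its variance (about 0.25 here) is added to the irreducible error. This matches the abstract's claim of increased prediction variance without any reduction in bias.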

2604.00060 2026-04-02 stat.ML cs.IT cs.LG math.IT

Scaled Gradient Descent for Ill-Conditioned Low-Rank Matrix Recovery with Optimal Sampling Complexity

Zhenxuan Li, Meng Huang

Details
English abstract

The low-rank matrix recovery problem seeks to reconstruct an unknown $n_1 \times n_2$ rank-$r$ matrix from $m$ linear measurements, where $m\ll n_1n_2$. This problem has been extensively studied over the past few decades, leading to a variety of algorithms with solid theoretical guarantees. Among these, non-convex methods based on gradient descent (GD) have become particularly popular due to their computational efficiency. However, these methods typically suffer from two key limitations: a sub-optimal sample complexity of $O((n_1 + n_2)r^2)$ and an iteration complexity of $O(κ\log(1/ε))$ to achieve $ε$-accuracy, resulting in slow convergence when the target matrix is ill-conditioned. Here, $κ$ denotes the condition number of the unknown matrix. Recent studies show that a preconditioned variant of GD, known as scaled gradient descent (ScaledGD), can significantly reduce the iteration complexity to $O(\log(1/ε))$. Nonetheless, its sample complexity remains sub-optimal at $O((n_1 + n_2)r^2)$. In contrast, a delicate virtual sequence technique demonstrates that the standard GD in the positive semidefinite (PSD) setting achieves the optimal sample complexity $O((n_1 + n_2)r)$, but converges more slowly with an iteration complexity $O(κ^2 \log(1/ε))$. In this paper, through a more refined analysis, we show that ScaledGD achieves both the optimal sample complexity $O((n_1 + n_2)r)$ and the improved iteration complexity $O(\log(1/ε))$. Notably, our results extend beyond the PSD setting to the general low-rank matrix recovery problem. Numerical experiments further validate that ScaledGD accelerates convergence for ill-conditioned matrices with the optimal sampling complexity.
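
The preconditioning behind ScaledGD can be sketched on a simplified problem. The example below assumes the matrix is fully observed (plain factorization) rather than seen through $m$ linear measurements, and it starts from the SVD of a slightly perturbed matrix. The sizes, step sizes, noise level, and function names are all illustrative choices, not the paper's setup. Each factor's gradient is right-multiplied by the inverse Gram matrix of the other factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, r, kappa = 30, 20, 3, 100.0

# Ground-truth rank-r matrix with condition number kappa.
U, _ = np.linalg.qr(rng.normal(size=(n1, r)))
V, _ = np.linalg.qr(rng.normal(size=(n2, r)))
sigma = np.linspace(kappa, 1.0, r)
M = (U * sigma) @ V.T

def init_factors():
    """Balanced factors from the top-r SVD of a slightly perturbed M."""
    noise = np.random.default_rng(1).normal(size=M.shape)
    Uh, s, Vh = np.linalg.svd(M + 1e-3 * noise, full_matrices=False)
    return Uh[:, :r] * np.sqrt(s[:r]), Vh[:r].T * np.sqrt(s[:r])

def run(scaled, step, iters=60):
    """Minimize 0.5 * ||L R^T - M||_F^2 with plain or scaled gradient steps."""
    L, R = init_factors()
    for _ in range(iters):
        resid = L @ R.T - M
        grad_L, grad_R = resid @ R, resid.T @ L
        if scaled:  # precondition by the inverse Gram matrix of the other factor
            grad_L = grad_L @ np.linalg.inv(R.T @ R)
            grad_R = grad_R @ np.linalg.inv(L.T @ L)
        L, R = L - step * grad_L, R - step * grad_R
    return np.linalg.norm(L @ R.T - M) / np.linalg.norm(M)

err_scaled = run(scaled=True, step=0.5)
err_gd = run(scaled=False, step=0.5 / kappa)   # plain GD needs a kappa-dependent step
print(err_scaled, err_gd)
```

In this toy run the scaled iteration uses a step size that does not depend on the condition number, while plain gradient descent must shrink its step by roughly a factor of $κ$ to stay stable, so it makes much slower progress on the directions tied to small singular values.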

2604.00038 2026-04-02 stat.ML cs.LG

Isomorphic Functionalities between Ant Colony and Ensemble Learning: Part II-On the Strength of Weak Learnability and the Boosting Paradigm

Ernest Fokoué, Gregory Babbitt, Yuval Levental

Comments 21 pages, 5 figures, 4 tables

English abstract

In Part I of this series, we established a rigorous mathematical isomorphism between ant colony decision-making and random forest learning, demonstrating that variance reduction through decorrelation is a universal principle shared by biological and computational ensembles. Here we turn to the complementary mechanism: bias reduction through adaptive weighting. Just as boosting algorithms sequentially focus on difficult instances, ant colonies dynamically amplify successful foraging paths through pheromone-mediated recruitment. We prove that these processes are mathematically isomorphic, establishing that the fundamental theorem of weak learnability has a direct analog in colony decision-making. We develop a formal mapping between AdaBoost's adaptive reweighting and ant recruitment dynamics, show that the margin theory of boosting corresponds to the stability of quorum decisions, and demonstrate through comprehensive simulation that ant colonies implementing adaptive recruitment achieve the same bias-reduction benefits as boosting algorithms. This completes a unified theory of ensemble intelligence, revealing that both variance reduction (Part I) and bias reduction (Part II) are manifestations of the same underlying mathematical principles governing collective intelligence in biological and computational systems.

2603.21672 2026-04-02 q-fin.PM q-fin.ST q-fin.TR stat.OT

Mislearning of Factor Risk Premia under Structural Breaks: A Misspecified Bayesian Learning Framework

Yimeng Qiu

English abstract

While asset-pricing models increasingly recognize that factor risk premia are subject to structural change, existing literature typically assumes that investors correctly account for such instability. This paper studies how investors instead learn under a misspecified model that underestimates structural breaks. We propose a minimal Bayesian framework in which this misspecification generates persistent prediction errors and pricing distortions, and we introduce an empirically tractable measure of mislearning intensity $(Δ_t)$ based on predictive likelihood ratios. The empirical results yield three main findings. First, in benchmark factor systems, elevated mislearning does not forecast a deterministic short-run collapse in performance; instead, it is associated with stronger long-horizon returns and Sharpe ratios, consistent with an equilibrium premium for acute model uncertainty. Second, in a broader anomaly universe, this pricing relation does not generalize uniformly: mislearning is more strongly associated with future drawdowns, downside semivolatility, and other measures of instability, with substantial heterogeneity across anomaly families. Third, the cross-sectional relation between instability and mislearning is inherently conditional: while a monotonic link between break-proneness and average mislearning does not hold in the full cross-section, it re-emerges in low-friction (low-IVOL) environments where break-state severity is more comparable across assets.

2603.21291 2026-04-02 stat.ML cs.LG physics.comp-ph

Closed-form conditional diffusion models for data assimilation

Brianna Binder, Agnimitra Dasgupta, Assad Oberai

English abstract

We propose closed-form conditional diffusion models for data assimilation. Diffusion models use data to learn the score function (defined as the gradient of the log-probability density of a data distribution), allowing them to generate new samples from the data distribution by reversing a noise injection process. While it is common to train neural networks to approximate the score function, we leverage the analytical tractability of the score function to assimilate the states of a system with measurements. To enable the efficient evaluation of the score function, we use kernel density estimation to model the joint distribution of the states and their corresponding measurements. The proposed approach also inherits the capability of conditional diffusion models of operating in black-box settings, i.e., the proposed data assimilation approach can accommodate systems and measurement processes without their explicit knowledge. The ability to accommodate black-box systems combined with the superior capabilities of diffusion models in approximating complex, non-Gaussian probability distributions means that the proposed approach offers advantages over many widely used filtering methods. We evaluate the proposed method on nonlinear data assimilation problems based on the Lorenz-63 and Lorenz-96 systems of moderate dimensionality and nonlinear measurement models. Results show the proposed approach outperforms the widely used ensemble Kalman and particle filters when small to moderate ensemble sizes are used.
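
The reason a kernel density estimate gives a closed-form score can be seen in one dimension. The sketch below is a minimal illustration, assuming a Gaussian kernel with a fixed bandwidth; the function name `kde_score` and its arguments are hypothetical, and it does not reproduce the paper's joint state-measurement model or its conditioning step.

```python
import math


def kde_score(x, samples, bandwidth):
    """Score d/dx log p(x) of a 1-D Gaussian KDE built on `samples`.

    For p(x) proportional to sum_i exp(-(x - s_i)^2 / (2 h^2)), the score is
    sum_i w_i(x) * (s_i - x) / h^2, where w_i(x) are softmax weights.
    """
    h2 = bandwidth ** 2
    # Log kernel weights, shifted by their maximum for numerical stability.
    log_w = [-((x - s) ** 2) / (2.0 * h2) for s in samples]
    shift = max(log_w)
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return sum(wi * (s - x) for wi, s in zip(w, samples)) / (total * h2)
```

With a single sample the expression reduces to the score of one Gaussian, `(s - x) / h**2`, and with samples placed symmetrically around `x` the contributions cancel. No neural network is trained at any point, which is the property the abstract relies on.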

2603.10405 2026-04-02 stat.ME math.ST stat.TH

Surrogate-Assisted Targeted Learning for Nested Bridge Functionals under Administrative Censoring

Lin Li, Tuo Lin, Yiwen Chen, Xin M. Tu

Comments 2 figures, 1 supplement

English abstract

Delayed primary outcomes and administratively censored follow-up create a general semiparametric estimation problem: the target causal functional depends on an endpoint observed only for a shrinking subset of units at analysis time, while earlier surrogate measurements remain widely available. In such settings, inverse-probability-weighted estimators can become unstable as observation probabilities approach the positivity boundary, and complete-case model-based analyses can be highly sensitive to outcome-model specification. We develop a surrogate-assisted targeted minimum loss estimator for this nested causal functional. Identification proceeds through a surrogate-bridge representation that integrates an observed-outcome regression over the conditional surrogate distribution, thereby avoiding inverse observation weights in the target parameter itself. We show that the estimator is asymptotically linear and doubly robust (in the sense that first-order bias vanishes when either nuisance component is consistently estimated), and we characterize two structural features of the problem: under surrogate-mediated missing at random, the censoring mechanism contributes no separate tangent-space component to the efficient influence function; and for nested bridge functionals, a one-step debiased machine-learning construction leaves a second-order cross-product remainder involving the conditional surrogate law. The proposed two-stage targeting step removes this term without requiring direct estimation of that law. Simulation studies demonstrate stable finite-sample performance under substantial administrative censoring, and a design-calibrated analysis based on the Washington State EPT study illustrates the method in a realistic stepped-wedge cluster-randomized setting.

2601.07764 2026-04-02 math.ST stat.ME stat.TH

Power of masking methods for adaptive testing in a multivariate normal means problem

Abhinav Chakraborty, Junu Lee, Eugene Katsevich

English abstract

Many large-scale testing procedures learn signal structure from the data to boost power. Direct data reuse can inflate Type-I error ("double dipping"), so a common remedy is masking: withholding some information during learning and using it for testing. Sample splitting masks by withholding observations for testing, while null augmentation (e.g., knockoffs or full-conformal outlier detection) masks by appending null samples or variables and withholding their identities until testing. In many settings, little is known about how the power of masking methods compares across mechanisms, across tuning choices, or against more data-efficient non-masking alternatives. We study these questions in a stylized two-groups multivariate normal means model with an unknown signal direction learned from the data. Within this testbed, we develop a transparent, unified set of asymptotic power expressions for three parallel methods differing in masking choices: a sample splitting method, a full-conformal-style null augmentation method, and an oracle in-sample benchmark. Our main findings are: (1) the augmentation method is more powerful than the splitting method with matched tuning; (2) the power-optimal number of null samples for the augmentation method is a vanishing fraction of the number of tests, in which case its power approaches that of the in-sample benchmark; and (3) for a tractable approximation to the augmentation method, the optimal number of null samples scales as the square root of the number of tests, with empirical evidence suggesting a similar scaling for the method itself. These results characterize masking-induced power trade-offs in a tractable model and suggest qualitative lessons for other settings.

2510.08095 2026-04-02 stat.ML cs.LG

Beyond Real Data: Synthetic Data through the Lens of Regularization

Amitis Shidani, Tyler Farghly, Yang Sun, Habib Ganjgahi, George Deligiannidis

English abstract

Synthetic data can improve generalization when real data is scarce, but excessive reliance may introduce distributional mismatches that degrade performance. In this paper, we present a learning-theoretic framework to quantify the trade-off between synthetic and real data. Our approach leverages algorithmic stability to derive generalization error bounds, characterizing the optimal synthetic-to-real data ratio that minimizes expected test error as a function of the Wasserstein distance between the real and synthetic distributions. We motivate our framework in the setting of kernel ridge regression with mixed data, offering a detailed analysis that may be of independent interest. Our theory predicts the existence of an optimal ratio, leading to a U-shaped behavior of test error with respect to the proportion of synthetic data. Empirically, we validate this prediction on CIFAR-10 and a clinical brain MRI dataset. Our theory extends to the important scenario of domain adaptation, showing that carefully blending synthetic target data with limited source data can mitigate domain shift and enhance generalization. We conclude with practical guidance for applying our results to both in-domain and out-of-domain scenarios.

2509.15517 2026-04-02 cs.LG stat.AP

A Survey and Comparative Evaluation of Intrinsic Dimension Estimators under the Manifold Hypothesis

Zelong Bi, Pierre Lafaye de Micheaux

English abstract

The manifold hypothesis suggests that high-dimensional data often lie on or near a low-dimensional manifold. Estimating the dimension of this manifold is essential for leveraging its structure, yet existing work on dimension estimation is fragmented and lacks systematic evaluation. This article provides a comprehensive survey for both researchers and practitioners. We review often-overlooked theoretical foundations and present eight representative estimators. Through controlled experiments, we analyze how individual factors, such as noise, curvature, and sample size, affect performance. We also compare the estimators on diverse synthetic and real-world datasets, introducing a principled approach to dataset-specific hyperparameter tuning. Our results offer practical guidance for estimator selection and yield insights that will inform future estimator design.

2507.01375 2026-04-02 stat.ME stat.AP

Mixtures of Neural Network Experts with Application to Phytoplankton Flow Cytometry Data

Ethan Pawl, François Ribalet, Paul A. Parker, Sangwon Hyun

Comments Version 2 of this preprint was missing a Funding Statement. Additionally, page numbers in the supplementary material were incorrect. This version corrects these errors. 46 pages, 20 figures. Under revision at Environmetrics

English abstract

Flow cytometry is a valuable technique that measures the optical properties of particles at a single-cell resolution. When deployed in the ocean, flow cytometry allows oceanographers to study different types of photosynthetic microbes called phytoplankton. It is of great interest to study how phytoplankton properties change in response to environmental conditions. In our work, we develop a nonlinear mixture of experts model to estimate separate regression functions for each subpopulation utilizing random-weight neural networks. Our model allows one to flexibly estimate how cell properties and relative abundances depend on environmental covariates in each segment of a heterogeneous sample, without the computational burden of backpropagation. We show that the proposed model provides superior predictive performance in simulated examples compared to a mixture of linear experts. Also, applying our model to real data, we show that our model has (1) comparable out-of-sample prediction performance, and (2) more realistic estimates of phytoplankton behavior.

2506.07805 2026-04-02 stat.ML cs.IT math.IT

Representative, Informative, and De-Amplifying: Requirements for Robust Bayesian Active Learning under Model Misspecification

Roubing Tang, Sabina J. Sloman, Samuel Kaski

Comments Accepted at AISTATS 2026. Camera-ready version

English abstract

In many science and industry settings, a central challenge is designing experiments under time and budget constraints. Bayesian Optimal Experimental Design (BOED) is a paradigm to pick maximally informative designs that has been widely applied to such problems. During training, BOED selects inputs according to a pre-determined acquisition criterion to target informativeness. During testing, the model learned during training encounters a naturally occurring distribution of test samples. This leads to an instance of covariate shift, where the train and test samples are drawn from different distributions (the training samples are not representative of the test distribution). Prior work has shown that in the presence of model misspecification, covariate shift amplifies generalization error. Our first contribution is to provide a mathematical analysis of generalization error in the presence of model misspecification, revealing that, beyond covariate shift, generalization error is also driven by a previously unidentified phenomenon we term error (de-)amplification. We then develop a new acquisition function that mitigates the effects of model misspecification by including terms for representativeness, informativeness, and de-amplification (R-IDeA). Our experimental results demonstrate that the proposed method performs better than methods that target only informativeness, only representativeness, or both.

2505.19367 2026-04-02 stat.ML cs.LG

Adaptive Diffusion Guidance via Stochastic Optimal Control

Iskander Azangulov, Peter Potaptchik, Qinyu Li, Eddie Aamari, George Deligiannidis, Judith Rousseau

Comments AISTATS 2026

English abstract

Guidance is a cornerstone of modern diffusion models, playing a pivotal role in conditional generation and enhancing the quality of unconditional samples. However, current approaches to guidance scheduling--determining the appropriate guidance weight--are largely heuristic and lack a solid theoretical foundation. This work addresses these limitations on two fronts. First, we provide a theoretical formalization that precisely characterizes the relationship between guidance strength and classifier confidence. Second, building on this insight, we introduce a stochastic optimal control framework that casts guidance scheduling as an adaptive optimization problem. In this formulation, guidance strength is not fixed but dynamically selected based on time, the current sample, and the conditioning class, either independently or in combination. By solving the resulting control problem, we establish a principled foundation for more effective guidance in diffusion models.

2505.08578 2026-04-02 stat.ME stat.AP stat.ML

Extreme Conformal Prediction: Reliable Intervals for High-Impact Events

Olivier C. Pasche, Henry Lam, Sebastian Engelke

English abstract

Conformal prediction is a popular method to construct prediction intervals with marginal coverage guarantees from black-box machine learning models. In applications with potentially high-impact events, such as flooding or financial crises, regulators often require very high confidence for such intervals. However, if the desired level of confidence is too high relative to the amount of data used for calibration, then classical conformal methods provide infinitely wide, and thus uninformative, prediction intervals. In this paper, we propose a new method to overcome this limitation. We bridge extreme value statistics and conformal prediction to provide reliable and informative prediction intervals with high-confidence coverage, which can be constructed using any black-box extreme quantile regression method. A weighted version of our approach can account for nonstationary data. The advantages of our extreme conformal prediction method are illustrated in a simulation study and in an application to flood risk forecasting.
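
The limitation described above follows directly from how the classical split-conformal threshold is computed. The sketch below shows that standard rule only, not the extreme-value method proposed in the paper; the function name `conformal_quantile` is hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Split-conformal threshold for miscoverage level `alpha`.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score,
    or +inf when that rank exceeds the number of calibration scores n.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for the requested confidence level:
        # the resulting prediction interval is infinitely wide.
        return math.inf
    return sorted(scores)[rank - 1]
```

The threshold is finite only when $(n + 1)(1 - α) \le n$, that is, when $n \ge (1 - α)/α$. A 99.9% interval ($α = 0.001$) therefore needs at least 999 calibration points before the classical method returns anything informative.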

2406.08097 2026-04-02 cs.LG stat.AP stat.ME

Inductive Global and Local Manifold Approximation and Projection

Jungeum Kim, Xiao Wang

Comments Accepted at TMLR (2024)

English abstract

Nonlinear dimensional reduction with the manifold assumption, often called manifold learning, has proven its usefulness across a wide range of high-dimensional data analyses. The significant impact of t-SNE and UMAP has catalyzed intense research interest, seeking further innovations toward visualizing not only the local but also the global structure information of the data. Moreover, there have been consistent efforts toward generalizable dimensional reduction that handles unseen data. In this paper, we first propose GLoMAP, a novel manifold learning method for dimensional reduction and high-dimensional data visualization. GLoMAP preserves locally and globally meaningful distance estimates and displays a progression from global to local formation during the course of optimization. Furthermore, we extend GLoMAP to its inductive version, iGLoMAP, which utilizes a deep neural network to map data to its lower-dimensional representation. This allows iGLoMAP to provide lower-dimensional embeddings for unseen points without needing to re-train the algorithm. iGLoMAP is also well-suited for mini-batch learning, enabling large-scale, accelerated gradient calculations. We have successfully applied both GLoMAP and iGLoMAP in simulated and real-data settings, with experiments showing competitive performance against state-of-the-art methods.

2405.06479 2026-04-02 stat.ME stat.ML

Uncertainty Quantification With Multiple Sources

Mufang Ying, Wenge Guo, Koulik Khamaru, Ying Hung

Comments 23 pages

English abstract

Weighted conformal prediction (WCP) has been commonly used to quantify prediction uncertainty under covariate shift. However, the effectiveness of WCP relies heavily on the degree of overlap between the training and test covariate distributions. This challenge is exacerbated in multi-source settings with varying covariate distributions, where direct application of WCP can be impractical. In this paper, we address the multi-source setup by leveraging WCP under the assumption of a shared conditional distribution. We investigate two extensions of WCP: (i) a merge-based aggregation of source-specific weighted conformal prediction sets, and (ii) a data-pooling strategy that jointly reweights samples across all sources. Theoretical guarantees are provided for the proposed approaches, and experiments are conducted based on a synthetic regression task and a multi-domain image classification benchmark to validate our proposed methods.
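
The dependence of WCP on overlap can be made concrete with the standard single-source weighted quantile. The sketch below assumes the covariate likelihood-ratio weights are already given; the function name and arguments are hypothetical, and it does not implement the merge-based or data-pooling extensions studied in the paper.

```python
import math


def weighted_conformal_quantile(scores, weights, test_weight, alpha):
    """Weighted conformal threshold under covariate shift.

    `weights[i]` is the likelihood ratio at calibration point i and
    `test_weight` the ratio at the test point. The test point's normalized
    mass sits at +inf, so the threshold is infinite whenever the
    calibration points carry less than 1 - alpha of the total mass.
    """
    total = sum(weights) + test_weight
    cumulative = 0.0
    for score, weight in sorted(zip(scores, weights)):
        cumulative += weight / total
        if cumulative >= 1 - alpha:
            return score
    return math.inf
```

With equal weights this reduces to the usual split-conformal rank. When the test point lies where the source has little data, `test_weight` dominates the normalization and the threshold becomes infinite, which is the poor-overlap failure mode the abstract refers to.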

2309.00125 2026-04-02 stat.ML cs.CR cs.LG

Pure Differential Privacy for Functional Summaries with a Laplace-like Process

Haotian Lin, Matthew Reimherr

Comments Accepted by JMLR

Journal ref
Journal of Machine Learning Research, 2024
English abstract

Many existing mechanisms for achieving differential privacy (DP) on infinite-dimensional functional summaries typically involve embedding these functional summaries into finite-dimensional subspaces and applying traditional multivariate DP techniques. These mechanisms generally treat each dimension uniformly and struggle with complex, structured summaries. This work introduces a novel mechanism to achieve pure DP for functional summaries in a separable infinite-dimensional Hilbert space, named the Independent Component Laplace Process (ICLP) mechanism. This mechanism treats the summaries of interest as truly infinite-dimensional functional objects, thereby addressing several limitations of the existing mechanisms. Several statistical estimation problems are considered, and we demonstrate how one can enhance the utility of private summaries by oversmoothing the non-private counterparts. Numerical experiments on synthetic and real datasets demonstrate the effectiveness of the proposed mechanism.
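
For context, the "traditional multivariate DP techniques" that the abstract contrasts against can be illustrated by the classical finite-dimensional Laplace mechanism. The sketch below shows that baseline only, applied to a vector of coefficients; it is not the ICLP mechanism, and the function names and the choice of sampler are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """One draw from Laplace(0, scale), as a difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def laplace_mechanism(values, l1_sensitivity, epsilon, rng):
    """Release `values` with pure epsilon-DP via the Laplace mechanism.

    Every coordinate receives independent Laplace noise with the same
    scale l1_sensitivity / epsilon, regardless of how much signal that
    coordinate carries.
    """
    scale = l1_sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


# Example: privatize three (hypothetical) basis coefficients of a summary.
rng = random.Random(0)
private = laplace_mechanism([1.0, 0.5, 0.1], l1_sensitivity=1.0, epsilon=0.5, rng=rng)
```

The same noise scale is applied to every coordinate here, which is the uniform per-dimension treatment the abstract identifies as a weakness of projection-based approaches.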