arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday
2604.21596 2026-04-24 stat.ME stat.CO

Efficient Bayes Factor Sensitivity Analysis via Posterior Density Ratios

František Bartoš, Eric-Jan Wagenmakers, Maarten Marsman, Don van den Bergh

Bayes factor sensitivity analysis examines how the evidence for one hypothesis over another depends on the prior distribution. In complex models, the standard approach refits the model at each hyper-parameter value, and the total computational cost scales linearly in the grid size. We propose a method that recovers the entire sensitivity curve from a single additional model fit. The key identity decomposes the Bayes factor at any hyper-parameter value $\gamma_x$ into an "anchor" Bayes factor at a fixed reference $\gamma_0$ and a Savage--Dickey density ratio in an extended model that places a hyper-prior on $\gamma$. Once this extended model is fit, the Bayes factor at any $\gamma_x$ follows from the anchor value and a ratio of two posterior density ordinates. To approximate this ratio, we employ the importance-weighted marginal density estimator (IWMDE). Because the sensitivity parameter enters the model only through the prior distribution on the model parameters, the data likelihood cancels in the IWMDE, reducing it to a simple ratio of prior density evaluations on the MCMC draws, without any additional likelihood computation. The resulting estimator is fast, remains accurate even with small MCMC samples, and substantially outperforms kernel density estimation across the full sensitivity range. The method extends naturally to simultaneous sensitivity over multiple hyper-parameters and to Bayesian model averaging. We illustrate it on a univariate Bayesian $t$-test with exact Bayes factors for validation, a bivariate informed $t$-test, and a Bayesian model-averaged meta-analysis, obtaining accurate sensitivity curves at a fraction of the brute-force cost.
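
The anchor-plus-density-ratio identity can be checked numerically on a toy problem. The sketch below is not the authors' implementation: it assumes a normal-mean model with unit variance (H0: mu = 0, H1: mu ~ N(0, gamma^2)), a hypothetical half-normal hyper-prior on gamma, and a grid normalisation of the posterior of gamma in place of MCMC draws and the IWMDE. It also assumes that the null model does not depend on gamma, so that the Bayes factor at gamma_x equals the anchor Bayes factor times the posterior ordinate ratio times the inverse hyper-prior ratio.

```python
import math

def normal_pdf(x, var):
    """Density of N(0, var) evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def bf10(ybar, n, gamma):
    """Exact Bayes factor for H1: mu ~ N(0, gamma^2) against H0: mu = 0."""
    return normal_pdf(ybar, gamma ** 2 + 1.0 / n) / normal_pdf(ybar, 1.0 / n)

def hyper_prior(gamma):
    """Hypothetical half-normal(1) hyper-prior density on gamma >= 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * gamma * gamma)

def posterior_ordinate(ybar, n, gamma, upper=10.0, grid_size=2001):
    """Posterior density of gamma in the extended model (trapezoid-rule normalisation)."""
    def unnormalised(g):
        return hyper_prior(g) * normal_pdf(ybar, g ** 2 + 1.0 / n)
    step = upper / (grid_size - 1)
    total = 0.0
    for i in range(grid_size):
        weight = 0.5 if i in (0, grid_size - 1) else 1.0
        total += weight * unnormalised(i * step)
    return unnormalised(gamma) / (total * step)

def bf10_via_identity(ybar, n, gamma_x, gamma_0):
    """Anchor Bayes factor times posterior ordinate ratio times inverse prior ratio."""
    anchor = bf10(ybar, n, gamma_0)
    posterior_ratio = (posterior_ordinate(ybar, n, gamma_x)
                       / posterior_ordinate(ybar, n, gamma_0))
    prior_ratio = hyper_prior(gamma_0) / hyper_prior(gamma_x)
    return anchor * posterior_ratio * prior_ratio

# One anchor fit at gamma_0 = 0.5 reproduces the exact Bayes factor at gamma_x = 1.5.
exact = bf10(ybar=0.4, n=20, gamma=1.5)
recovered = bf10_via_identity(ybar=0.4, n=20, gamma_x=1.5, gamma_0=0.5)
```

In this toy case the normalising constant of the posterior cancels in the ratio, which is why only the two ordinates matter; in a realistic model those ordinates would have to be estimated from posterior draws.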

2604.21595 2026-04-24 stat.ML cs.LG

A Kernel Nonconformity Score for Multivariate Conformal Prediction

Louis Meyer, Wenkai Xu

Multivariate conformal prediction requires nonconformity scores that compress residual vectors into scalars while preserving certain implicit geometric structure of the residual distribution. We introduce a Multivariate Kernel Score (MKS) that produces prediction regions that explicitly adapt to this geometry. We show that the proposed score resembles the Gaussian process posterior variance, unifying Bayesian uncertainty quantification with frequentist-type coverage guarantees. Moreover, the MKS can be decomposed into an anisotropic Maximum Mean Discrepancy (MMD) that interpolates between kernel density estimation and covariance-weighted distance. We prove finite-sample coverage guarantees and establish convergence rates that depend on the effective rank of the kernel-based covariance operator rather than the ambient dimension, enabling dimension-free adaptation. On regression tasks, the MKS significantly reduces the volume of the prediction region compared to ellipsoidal baselines while maintaining nominal coverage, with larger gains at higher dimensions and tighter coverage levels.
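
For orientation, the generic split-conformal step that any scalar nonconformity score plugs into can be sketched as follows. This is not the MKS, whose definition is not given in the abstract. The score here is a plain Euclidean norm of the residual vector (a ball-shaped baseline), and the calibration residuals are made-up numbers.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this coverage level
    return sorted(scores)[k - 1]

def norm_score(residual):
    """Baseline nonconformity score: Euclidean norm of a residual vector."""
    return math.sqrt(sum(r * r for r in residual))

# Hypothetical bivariate calibration residuals (observed minus predicted).
calibration_residuals = [
    (0.1, -0.2), (0.5, 0.5), (-1.0, 0.3), (0.0, 0.1), (0.7, -0.7),
    (0.2, 0.2), (-0.3, 0.4), (1.2, 0.0), (0.05, 0.05),
]
scores = [norm_score(r) for r in calibration_residuals]
radius = conformal_quantile(scores, alpha=0.25)

def in_region(y, y_hat):
    """True if y lies in the prediction region centred at the point prediction y_hat."""
    return norm_score(tuple(a - b for a, b in zip(y, y_hat))) <= radius
```

Swapping `norm_score` for a score that adapts to the shape of the residual distribution changes the region's geometry while the quantile step, and hence the marginal coverage argument under exchangeability, stays the same.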

2604.21549 2026-04-24 cs.AI stat.ME

Unbiased Prevalence Estimation with Multicalibrated LLMs

Fridolin Linder, Thomas Leeper, Daniel Haimovich, Niek Tax, Lorenzo Perini, Milan Vojnovic

Estimating the prevalence of a category in a population using imperfect measurement devices (diagnostic tests, classifiers, or large language models) is fundamental to science, public health, and online trust and safety. Standard approaches correct for known device error rates but assume these rates remain stable across populations. We show this assumption fails under covariate shift and that multicalibration, which enforces calibration conditional on the input features rather than just on average, is sufficient for unbiased prevalence estimation under such shift. Standard calibration and quantification methods fail to provide this guarantee. Our work connects recent theoretical work on fairness to a longstanding measurement problem spanning nearly all academic disciplines. A simulation confirms that standard methods exhibit bias growing with shift magnitude, while a multicalibrated estimator maintains near-zero bias. While we focus the discussion mostly on LLMs, our theoretical results apply to any classification model. Two empirical applications -- estimating employment prevalence across U.S. states using the American Community Survey, and classifying political texts across four countries using an LLM -- demonstrate that multicalibration substantially reduces bias in practice, while highlighting that calibration data should cover the key feature dimensions along which target populations may differ.
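
The contrast between the two estimator families can be sketched in a few lines. The first function is the classical error-rate correction (commonly called the Rogan-Gladen estimator), which assumes fixed sensitivity and specificity. The second simply averages predicted probabilities, which is only unbiased if those probabilities are well calibrated on the target population. All numbers are illustrative and neither function is taken from the paper.

```python
def rogan_gladen(observed_rate, sensitivity, specificity):
    """Correct an observed positive rate for known, fixed device error rates."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0.0:
        raise ValueError("device must be better than chance")
    estimate = (observed_rate + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))  # clamp to a valid proportion

def probabilistic_prevalence(calibrated_probs):
    """Average of per-item probabilities, assumed calibrated on the target population."""
    return sum(calibrated_probs) / len(calibrated_probs)

# True prevalence 0.3, sensitivity 0.9, specificity 0.8:
# observed positive rate = 0.3 * 0.9 + 0.7 * 0.2 = 0.41
stable = rogan_gladen(0.41, sensitivity=0.9, specificity=0.8)

# If sensitivity drops to 0.7 in the target population but the correction
# still uses 0.9, the observed rate is 0.35 and the estimate is about 0.214,
# below the true prevalence of 0.3.
shifted = rogan_gladen(0.35, sensitivity=0.9, specificity=0.8)
```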

2604.21548 2026-04-24 econ.EM stat.ME

Nonparametric Point Identification of Treatment Effect Distributions via Rank Stickiness

Tengyuan Liang

Comments 25 pages, 2 figures

Treatment effect distributions are not identified without restrictions on the joint distribution of potential outcomes. Existing approaches either impose rank preservation -- a strong assumption -- or derive partial identification bounds that are often wide. We show that a single scalar parameter, rank stickiness, suffices for nonparametric point identification while permitting rank violations. The identified joint distribution -- the coupling that maximizes average rank correlation subject to a relative entropy constraint, which we call the Bregman-Sinkhorn copula -- is uniquely determined by the marginals and rank stickiness. Its conditional distribution is an exponential tilt of the marginal with a Bregman divergence as the exponent, yielding closed-form conditional moments and rank violation probabilities; the copula nests the comonotonic and Gaussian copulas as special cases. The empirical Bregman-Sinkhorn copula converges at the parametric $\sqrt{n}$-rate with a Gaussian process limit, despite the infinite-dimensional parameter space. We apply the framework to estimate the full treatment effect distribution, derive a variance estimator for the average treatment effect tighter than the Fréchet--Hoeffding and Neyman bounds, and extend to observational studies under unconfoundedness.

2604.21545 2026-04-24 stat.ME stat.AP

Informed Asymmetric Dirichlet Priors for Multivariate Bernoulli Mixture Models

Luisa Ferrari, Maria Franco Villoria, Garritt L. Page, Alex Laini

Comments 44 pages, 11 figures

Clustering multivariate binary data is of interest in many scientific fields, including ecology, biomedicine, and social policy. Beyond heuristic clustering algorithms, such data can be modelled using multivariate Bernoulli mixture models. Many Bayesian implementations of these models involve a trade-off between computational efficiency and full posterior inference. We instead propose a Bayesian approach that provides both. The method fixes the total number of components to a large value and employs an asymmetric Dirichlet prior on the mixture weights. The asymmetric Dirichlet hyperparameters are elicited using the popular Penalized Complexity prior framework, which provides an intuitive way for users to inform the induced distribution of the number of clusters. An efficient MCMC algorithm is then developed to fit the model. Simulations and real-world applications demonstrate that the method is competitive with existing alternatives and can outperform them in certain settings. The proposal is illustrated using an ecological dataset about presence-absence of species across multiple sites, where cluster-specific parameters are modelled on the basis of environmental conditions. Overall, the proposed method provides a computationally efficient, fully Bayesian, and interpretable framework for clustering multivariate binary data, with potential applications across diverse scientific domains.

2604.21538 2026-04-24 stat.CO

On a class of constrained particle filters for continuous-discrete state space models

Utku Erdogan, Gabriel J. Lord, Joaquin Miguez

Comments arXiv admin note: text overlap with arXiv:2512.11012

Particle filters (PFs) are recursive Monte Carlo algorithms for Bayesian tracking and prediction in state space models. This paper addresses continuous-discrete filtering problems, where the hidden state evolves as an Itô stochastic differential equation (SDE) and observations arrive at discrete times. We propose a novel class of constrained PFs that enforce compact support on the state at each observation instant, thereby limiting exploration to plausible regions of the state space. Unlike earlier approaches that truncate the likelihood, the proposed method constrains the dynamics directly, yielding improved numerical stability. Under standard regularity assumptions, we prove convergence of the constrained filter, derive uniform-in-time error estimates, and extend the analysis to account for discretisation errors arising from numerical SDE solvers. A numerical study on a stochastic Lorenz-96 system demonstrates the practical application of the methodology when the constraint is implemented via barrier functions.

2604.21498 2026-04-24 stat.ME stat.AP

Analyzing directional errors in spatial orientation using nonparametric circular regression with mixed covariates

Mario Francisco-Fernández, Andrea Meilán-Vila

Comments 33 pages, 13 figures, 3 tables

Spatial orientation is a fundamental cognitive skill that relies on sensory information to update perceived direction. Understanding how sensory conditions influence directional accuracy is important for both cognitive science and the design of assistive technologies. We analyze experimental data in which blind, low-vision, and sighted participants performed spatial updating tasks under five sensory conditions, with signed angular error as the response. To model these data, we propose a nonparametric circular regression framework that accommodates both continuous and categorical predictors via a product-kernel estimator. Bandwidth selection is crucial in this setting, yet developing practical data-driven methods remains challenging. We derive asymptotic bias and variance expressions for the estimator, though these results do not directly lead to a feasible plug-in bandwidth selector. To address this, we develop a bootstrap bandwidth selection criterion tailored to the cosine loss and compare it with cross-validation and rule-of-thumb approaches in simulation studies. Applied to the spatial updating data, the proposed framework reveals nonlinear, condition-specific patterns and quantifies uncertainty via simultaneous bootstrap confidence bands. Across the scenarios considered, the proposed bootstrap selector achieves a favorable bias-variance trade-off and yields stable inference relative to the competing methods. An implementation is available in the R package circMixedReg.
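
A minimal sketch of a kernel-weighted circular regression with one continuous and one categorical covariate is given below. It assumes a Gaussian kernel for the continuous covariate and an Aitchison-Aitken kernel for the categorical one, combined as a product, with the fitted direction taken as the angle of the weighted mean resultant vector. These kernel choices, the fixed bandwidths, and the function names are assumptions for illustration; they are not taken from the paper or from the circMixedReg package.

```python
import math

def product_kernel_weight(x, z, xi, zi, h, lam, n_levels):
    """Gaussian kernel in x times an Aitchison-Aitken kernel in the category z."""
    continuous = math.exp(-0.5 * ((x - xi) / h) ** 2)
    categorical = (1.0 - lam) if z == zi else lam / (n_levels - 1)
    return continuous * categorical

def circular_regression(x, z, data, h=0.5, lam=0.1, n_levels=5):
    """Estimate the mean direction at (x, z) from (x_i, z_i, angle_i) triples.

    Angles are in radians; the result lies in [-pi, pi].
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for xi, zi, theta in data:
        w = product_kernel_weight(x, z, xi, zi, h, lam, n_levels)
        sin_sum += w * math.sin(theta)
        cos_sum += w * math.cos(theta)
    return math.atan2(sin_sum, cos_sum)

# Two errors just either side of the +/- pi cut: their arithmetic mean is 0,
# while the circular estimate correctly points near pi.
wrapped = circular_regression(0.0, "a", [(0.0, "a", 3.1), (0.0, "a", -3.1)])
```

Averaging sines and cosines rather than raw angles is what makes the estimator respect the periodicity of signed angular errors.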

2604.21491 2026-04-24 cs.CR stat.AP stat.ME

Benchmarking the Utility of Privacy-Preserving Cox Regression Under Data-Driven Clipping Bounds: A Multi-Dataset Simulation Study

Keita Fukuyama, Yukiko Mori, Tomohiro Kuroda, Hiroaki Kikuchi

Comments 11 pages, 6 figures, 5 tables. Supplementary material (5 pages, 2 figures, 3 tables) included as ancillary file. Submission to IEEE Journal of Biomedical and Health Informatics (J-BHI)

Differential privacy (DP) is a mathematical framework that guarantees individual privacy; however, systematic evaluation of its impact on statistical utility in survival analyses remains limited. In this study, we systematically evaluated the impact of DP mechanisms (Laplace mechanism and Randomized Response) with data-driven clipping bounds on the Cox proportional hazards model, using 5 clinical datasets ($n = 168$--$6{,}524$), 15 levels of $\varepsilon$ (0.1--1000), and $B = 1{,}000$ Monte Carlo iterations. The data-driven clipping bounds used here are observed min/max and therefore do not provide formal $\varepsilon$-DP guarantees; the results represent an optimistic lower bound on utility degradation under formal DP. We compared three types of input perturbations (covariates only, all inputs, and the discrete-time model) with output perturbations (dfbeta-based sensitivity), using loss of significance rate (LSR), C-index, and coefficient bias as metrics. At standard DP levels ($\varepsilon \leq 1$), approximately 90% (90--94%) of the significant covariates lost significance, even in the largest dataset ($n = 6{,}524$), and the predictive performance approached random levels (test C-index $\approx 0.5$) under many conditions. Among the input perturbation approaches, perturbing only covariates preserved the risk-set structure and achieved the best recovery, whereas output perturbation (dfbeta-based sensitivity) maintained near-baseline performance at $\varepsilon \geq 5$. At $n \approx 3{,}000$, the significance recovered rapidly at $\varepsilon = 3$--10; however, in practice, $\varepsilon \geq 10$ (for predictive performance) to $\varepsilon \geq 30$--60 (for significance preservation) is required. In the moderate-to-high $\varepsilon$ range, false-positive rates increased for variables whose baseline $p$-values were near the significance threshold.
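
The two mechanisms named in the abstract can be sketched as follows, in their textbook form. The sketch perturbs a single clipped value with Laplace noise of scale (upper - lower) / epsilon and flips a single bit with randomized response. The function names are hypothetical and the clipping bounds are assumed to be fixed in advance; as the abstract notes, bounds taken from the observed minimum and maximum do not yield a formal epsilon-DP guarantee.

```python
import math
import random

def clip(value, lower, upper):
    """Restrict value to the interval [lower, upper]."""
    return min(upper, max(lower, value))

def laplace_mechanism(value, lower, upper, epsilon, rng):
    """Clip, then add Laplace noise calibrated to the width of the clipping interval."""
    scale = (upper - lower) / epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clip(value, lower, upper) + noise

def truth_probability(epsilon):
    """Probability of reporting the true bit under binary randomized response."""
    return 1.0 / (1.0 + math.exp(-epsilon))  # equals e^eps / (1 + e^eps), overflow-safe

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), otherwise flip it."""
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit

rng = random.Random(0)
noisy_age = laplace_mechanism(67.0, lower=18.0, upper=90.0, epsilon=1.0, rng=rng)
noisy_flag = randomized_response(1, epsilon=1.0, rng=rng)
```

At epsilon = 1 the Laplace scale equals the full width of the clipping interval, which gives some intuition for why covariate effects are hard to detect at that privacy level.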

2604.21457 2026-04-24 cs.CY cs.SI stat.AP

Context-Aware Displacement Estimation from Mobile Phone Data: A Methodological Framework

Rajius Idzalika, Muhammad Rheza Muztahid, Radityo Eko Prasojo

Comments 24 pages, 4 figures, 14 tables. Case study: Super Typhoon Nando, Philippines (2025)

Timely population displacement estimates are critical for humanitarian response during disasters, but traditional surveys and field assessments are slow. Mobile phone data enables near real-time tracking, yet existing approaches apply uniform displacement definitions regardless of individual mobility patterns, misclassifying regular commuters as displaced. We present a methodological framework addressing this through three innovations: (1) mobility profile classification distinguishing local residents from commuter types, (2) context-aware between-municipality displacement detection accounting for expected location by user type and day of week, and (3) operational uncertainty bounds derived from baseline coefficient of variation with a disaster adjustment factor, intended for humanitarian decision support rather than formal statistical inference. The framework produces three complementary metrics scaled to population with uncertainty bounds: displacement rates, origin-destination flows, and return dynamics. An Aparri case study following Super Typhoon Nando (2025, Philippines) applies the framework to vendor-provided daily locations from Globe Telecom. Context-aware detection reduced estimated between-municipality displacement by 1.6-2.7 percentage points on weekdays versus naive methods, attributable to the commuter exception but not independently validated. The method captures between-municipality displacement only. Within-municipality evacuation falls outside scope. The single-case demonstration establishes proof of concept. External validity requires application across multiple events and locations. The framework provides humanitarian actors with operational displacement information while preserving individual privacy through aggregation.

2604.21432 2026-04-24 stat.ML cs.LG

A single algorithm for both restless and rested rotting bandits

Julien Seznec, Pierre Ménard, Alessandro Lazaric, Michal Valko

Comments In AISTATS 2020

In many application domains (e.g., recommender systems, intelligent tutoring systems), the rewards associated with the actions tend to decrease over time. This decay is caused either by the actions executed in the past (e.g., a user may get bored when songs of the same genre are recommended over and over) or by an external factor (e.g., content becomes outdated). These two situations can be modeled as specific instances of the rested and restless bandit settings, where arms are rotting (i.e., their values decrease over time). These problems were thought to be significantly different, since Levine et al. (2017) showed that state-of-the-art algorithms for restless bandits perform poorly in the rested rotting setting. In this paper, we introduce a novel algorithm, Rotting Adaptive Window UCB (RAW-UCB), that achieves near-optimal regret in both rotting rested and restless bandits, without any prior knowledge of the setting (rested or restless) or the type of non-stationarity (e.g., piece-wise constant, bounded variation). This is in striking contrast with previous negative results showing that no algorithm can achieve similar results as soon as rewards are allowed to increase. We confirm our theoretical findings on a number of synthetic and dataset-based experiments.

2604.21372 2026-04-24 stat.AP

Optimal basis risk weighting in expectile-based parametric insurance

Markus Johannes Maier, Matthias Scherer

Parametric insurance contracts translate index measurements to compensation for policyholders' losses using predefined payment schemes. These need to be designed carefully to keep basis risk, i.e. the disparity between payouts and true damages, small. Previous research has motivated the use of conditional expectiles as payment schemes, whose compensation is impacted by the policyholder's potentially unknown attitude towards basis risk. To alleviate this model uncertainty and to investigate the impact of (hidden) influencing factors, we characterize existence and uniqueness of the optimal basis risk weighting in a utility-maximization framework through a set of boundary conditions. In the absence of an optimal solution, we provide comparisons to the utility of no insurance and full indemnity coverage. We establish a link between location-scale distributions and separability of conditional expectiles' derivatives, thus improving the understanding of these statistical functionals. A simulation study on parametric hurricane insurance visualizes our results, investigates the influence of premium loading and risk aversion on the optimal weighting, and comments on the challenge of (spatial) loss dependence.
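
As background on the statistical functional involved, a sample expectile at level tau can be computed by iteratively reweighted averaging: it is the value e at which tau times the total shortfall above e balances (1 - tau) times the total excess below e. The sketch below computes an unconditional sample expectile only; the conditional expectiles and the utility-maximization framework of the paper are not reproduced, and the loss figures are invented.

```python
def sample_expectile(values, tau, tol=1e-12, max_iter=1000):
    """Sample expectile at level tau in (0, 1) via iteratively reweighted means."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    e = sum(values) / len(values)  # tau = 0.5 gives exactly the mean
    for _ in range(max_iter):
        # Observations above the current value get weight tau, the rest 1 - tau.
        weights = [tau if v > e else 1.0 - tau for v in values]
        new_e = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e

# Hypothetical losses: a higher tau penalises under-compensation more
# and therefore pushes the expectile upwards.
losses = [0.0, 0.0, 10.0, 25.0, 120.0]
low = sample_expectile(losses, 0.5)
high = sample_expectile(losses, 0.9)
```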

2604.21292 2026-04-24 math.CO cs.IT math.IT stat.AP

Large values in time series and additive combinatorics

Alex Iosevich, Vishal Gupta

Comments 13 pages, 6 figures

It is well-known in industrial data science that large values of real-life time series tend to be structured and often follow concrete and visible patterns. In this paper, we use ideas from additive combinatorics and discrete Fourier analysis to give this heuristic a mathematical foundation. Our main tool is the Fourier ratio, a complexity measure previously used in compressed sensing, combined with a generalized version of Chang's lemma from additive combinatorics. Together, these yield a precise prediction: when the Fourier ratio of a time series is small, the set of its largest values can be additively generated by a very small set using only $\{-1,0,1\}$ coefficients. We test this prediction on US inflation data and Delhi climate data, both in their original form and after mean-centering. The numerical results confirm the predicted structure: a generating set of size $4$--$7$ suffices to span large spectra containing dozens of points, even when the Fourier ratio is large enough that our theoretical bounds become loose. These findings provide a rigorous explanation for why extreme values in real-world data are information-rich and structurally significant.

2604.21270 2026-04-24 stat.ML cs.LG cs.SY eess.SY math.OC

CLT-Optimal Parameter Error Bounds for Linear System Identification

Yichen Zhou, Stephen Tu

Comments 36 pages

There has been remarkable progress over the past decade in establishing finite-sample, non-asymptotic bounds on recovering unknown system parameters from observed system behavior. Surprisingly, however, we show that the current state-of-the-art bounds do not accurately capture the statistical complexity of system identification, even in the most fundamental setting of estimating a discrete-time linear dynamical system (LDS) via ordinary least-squares regression (OLS). Specifically, we utilize asymptotic normality to identify classes of problem instances for which current bounds overstate the squared parameter error, in both spectral and Frobenius norm, by a factor of the state-dimension of the system. Informed by this discrepancy, we then sharpen the OLS parameter error bounds via a novel second-order decomposition of the parameter error, where crucially the lower-order term is a matrix-valued martingale that we show correctly captures the CLT scaling. From our analysis we obtain finite-sample bounds for both (i) stable systems and (ii) the many-trajectories setting that match the instance-specific optimal rates up to constant factors in Frobenius norm, and polylogarithmic state-dimension factors in spectral norm.
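
For reference, the estimator under discussion can be written out in its simplest form. The notation below is assumed rather than taken from the paper: a single trajectory of length $T$, no control inputs, process noise $w_t$, and an invertible empirical covariance of the states.

```latex
\begin{align*}
x_{t+1} &= A x_t + w_t, \qquad t = 0, \dots, T-1, \\
\hat{A} &= \arg\min_{A'} \sum_{t=0}^{T-1} \lVert x_{t+1} - A' x_t \rVert_2^2
         = \Big( \sum_{t=0}^{T-1} x_{t+1} x_t^\top \Big)
           \Big( \sum_{t=0}^{T-1} x_t x_t^\top \Big)^{-1}, \\
\hat{A} - A &= \Big( \sum_{t=0}^{T-1} w_t x_t^\top \Big)
               \Big( \sum_{t=0}^{T-1} x_t x_t^\top \Big)^{-1}.
\end{align*}
```

The last line follows by substituting $x_{t+1} = A x_t + w_t$ into the closed form; the parameter error is a noise-state cross term scaled by the inverse empirical covariance, which is the quantity that finite-sample bounds of this kind control.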

2604.21260 2026-04-24 stat.ML cs.AI cs.LG econ.EM q-bio.QM stat.ME

Calibeating Prediction-Powered Inference

Lars van der Laan, Mark Van Der Laan

Comments Paper website: https://larsvanderlaan.github.io/ppi-aipw/

We study semisupervised mean estimation with a small labeled sample, a large unlabeled sample, and a black-box prediction model whose output may be miscalibrated. A standard approach in this setting is augmented inverse-probability weighting (AIPW) [Robins et al., 1994], which protects against prediction-model misspecification but can be inefficient when the prediction score is poorly aligned with the outcome scale. We introduce Calibrated Prediction-Powered Inference, which post-hoc calibrates the prediction score on the labeled sample before using it for semisupervised estimation. This simple step requires no retraining and can improve the original score both as a predictor of the outcome and as a regression adjustment for semisupervised inference. We study both linear and isotonic calibration. For isotonic calibration, we establish first-order optimality guarantees: isotonic post-processing can improve predictive accuracy and estimator efficiency relative to the original score and simpler post-processing rules, while no further post-processing of the fitted isotonic score yields additional first-order gains. For linear calibration, we show first-order equivalence to PPI++. We also clarify the relationship among existing estimators, showing that the original PPI estimator is a special case of AIPW and can be inefficient when the prediction model is accurate, while PPI++ is AIPW with empirical efficiency maximization [Rubin et al., 2008]. In simulations and real-data experiments, our calibrated estimators often outperform PPI and are competitive with, or outperform, AIPW and PPI++. We provide an accompanying Python package, ppi_aipw, at https://larsvanderlaan.github.io/ppi-aipw/.
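
The basic mechanics can be sketched without any library. The first function is the plain prediction-powered mean (average prediction on unlabeled data plus the average labeled residual); the second applies a linear post-hoc calibration fitted on the labeled sample before doing the same. This is an illustrative sketch with hypothetical function names, not the API of the ppi_aipw package, and it ignores variance estimation and the isotonic variant.

```python
def mean(values):
    return sum(values) / len(values)

def ppi_mean(y_lab, f_lab, f_unlab):
    """Mean prediction on unlabeled data plus the mean residual on labeled data."""
    return mean(f_unlab) + mean([y - f for y, f in zip(y_lab, f_lab)])

def fit_linear_calibration(y_lab, f_lab):
    """Least-squares intercept and slope for regressing outcomes on prediction scores."""
    f_bar, y_bar = mean(f_lab), mean(y_lab)
    sxx = sum((f - f_bar) ** 2 for f in f_lab)
    sxy = sum((f - f_bar) * (y - y_bar) for f, y in zip(f_lab, y_lab))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return y_bar - slope * f_bar, slope

def calibrated_ppi_mean(y_lab, f_lab, f_unlab):
    """Recalibrate the score on the labeled sample, then apply the PPI-style estimator."""
    intercept, slope = fit_linear_calibration(y_lab, f_lab)
    g_lab = [intercept + slope * f for f in f_lab]
    g_unlab = [intercept + slope * f for f in f_unlab]
    return ppi_mean(y_lab, g_lab, g_unlab)

# A score on the wrong scale: the outcome is exactly 1 + 2 * score on the labeled sample.
y_lab, f_lab = [1.0, 3.0, 5.0], [0.0, 1.0, 2.0]
f_unlab = [0.0, 1.0, 2.0, 3.0]
raw = ppi_mean(y_lab, f_lab, f_unlab)
calibrated = calibrated_ppi_mean(y_lab, f_lab, f_unlab)
```

In this toy case the calibrated score has zero labeled residuals, so the estimate reduces to the mean calibrated prediction on the unlabeled sample, whereas the uncalibrated version has to absorb the whole scale mismatch through the residual correction.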

2604.21235 2026-04-24 cs.LG cs.CL stat.ME

Learning Dynamic Representations and Policies from Multimodal Clinical Time-Series with Informative Missingness

Zihan Liang, Ziwen Pan, Ruoxuan Xiong

Comments Findings of ACL 2026 (30 pages)

Multimodal clinical records contain structured measurements and clinical notes recorded over time, offering rich temporal information about the evolution of patient health. Yet these observations are sparse, and whether they are recorded depends on the patient's latent condition. Observation patterns also differ across modalities, as structured measurements and clinical notes arise under distinct recording processes. While prior work has developed methods that accommodate missingness in clinical time series, how to extract and use the information carried by the observation process itself remains underexplored. We therefore propose a patient representation learning framework for multimodal clinical time series that explicitly leverages informative missingness. The framework combines (1) a multimodal encoder that captures signals from structured and textual data together with their observation patterns, (2) a Bayesian filtering module that updates a latent patient state over time from observed multimodal signals, and (3) downstream modules for offline treatment policy learning and patient outcome prediction based on the learned patient state. We evaluate the framework on ICU sepsis cohorts from MIMIC-III, MIMIC-IV, and eICU. It improves both offline treatment policy learning and adverse outcome prediction, achieving FQE 0.679 versus 0.528 for clinician behavior and AUROC 0.886 for post-72-hour mortality prediction on MIMIC-III.

2604.21203 2026-04-24 stat.ML cs.LG

Refining Covariance Matrix Estimation in Stochastic Gradient Descent Through Bias Reduction

Ziyang Wei, Wanrong Zhu, Jingyang Lyu, Wei Biao Wu

We study online inference and asymptotic covariance estimation for the stochastic gradient descent (SGD) algorithm. While classical methods (such as plug-in and batch-means estimators) are available, they either require inaccessible second-order (Hessian) information or suffer from slow convergence. To address these challenges, we propose a novel, fully online de-biased covariance estimator that eliminates the need for second-order derivatives while significantly improving estimation accuracy. Our method employs a bias-reduction technique to achieve a convergence rate of $n^{(\alpha-1)/2} \sqrt{\log n}$, outperforming existing Hessian-free alternatives.

2604.21115 2026-04-24 eess.SP stat.AP

Complex Approximate Message Passing with Non-separable Denoising

Vishnu Teja Kunde, Alessandro Mirri, Jean-Francois Chamberland, Enrico Paolini

Approximate Message Passing (AMP) is a general framework for iterative algorithms, originally developed for compressed sensing and later extended to a wide range of high-dimensional inference problems. Although recent work has advanced matrix AMP, complex AMP, and AMP for non-separable functions independently, a unified state evolution theory for complex AMP with non-separable denoisers has been lacking. This article fills that gap by establishing state evolution in the setting of complex, non-separable denoising functions. The proposed approach constructs an augmented real-valued system that lifts the problem to a higher-dimensional space, then recovers the complex domain through a many-to-one canonical transformation. Under this construction, the Onsager correction naturally involves Wirtinger derivatives, and the resulting state evolution reduces to scalar complex recursions despite the non-separable structure of the denoisers. The framework extends to the matrix-valued setting, accommodating multiple feature vectors simultaneously. This generalization enables AMP to exploit joint structural constraints, such as simultaneous group and element sparsity, in complex-valued recovery problems. The complex sparse group least absolute shrinkage and selection operator (LASSO) serves as a key instantiation, motivated by preamble detection in Orthogonal Time-Frequency Space (OTFS)-based unsourced random access. Numerical experiments confirm that state evolution accurately predicts performance and show that complex non-separable denoising can produce significant gains over separable and real-valued alternatives.

2604.21110 2026-04-24 stat.ME math.ST stat.TH

A goodness-of-fit test for the logistic propensity score model under nonignorable missing data

Manli Cheng, Yangjianchen Xu, Qinglong Tian, Pengfei Li

Comments 18 pages

Logistic regression is widely used to model the propensity score in the analysis of nonignorable missing data. However, goodness-of-fit testing for this propensity score model has received limited attention in the literature. In this paper, we propose a new goodness-of-fit testing procedure for the logistic propensity score model under nonignorable missing data. The proposed test is based on an unweighted sum-of-squared residuals constructed from the marginal missingness mechanism and accommodates the partial observability of the outcome. We establish the asymptotic distribution of the test statistic under both the null hypothesis and general alternatives, and develop a bootstrap procedure with theoretical guarantees to approximate its null distribution. We show that the resulting bootstrap test attains asymptotically correct size and is consistent, with power converging to one under model misspecification. Simulation studies and a real data application demonstrate that the proposed method performs well in finite samples.

2604.21097 2026-04-24 stat.ML cs.LG

Learning to Emulate Chaos: Adversarial Optimal Transport Regularization

Gabriel Melo, Leonardo Santiago, Peter Y. Lu

Chaos arises in many complex dynamical systems, from weather to power grids, but is difficult to accurately model using data-driven emulators, including neural operator architectures. For chaotic systems, the inherent sensitivity to initial conditions makes exact long-term forecasts theoretically infeasible, meaning that traditional squared-error losses often fail when trained on noisy data. Recent work has focused on training emulators to match the statistical properties of chaotic attractors by introducing regularization based on handcrafted local features and summary statistics, as well as learned statistics extracted from a diverse dataset of trajectories. In this work, we propose a family of adversarial optimal transport objectives that jointly learn high-quality summary statistics and a physically consistent emulator. We theoretically analyze and experimentally validate a Sinkhorn divergence formulation (2-Wasserstein) and a WGAN-style dual formulation (1-Wasserstein). Our experiments across a variety of chaotic systems, including systems with high-dimensional chaotic attractors, show that emulators trained with our approach exhibit significantly improved long-term statistical fidelity.

2604.21067 2026-04-24 stat.AP

The geometry of conflict: 3D Spatio-temporal patterns in fatalities prediction

Thomas Schincariol

Comments 68 pages, 34 figures

Understanding how conflict events spread over time and space is crucial for predicting and mitigating future violence. However, progress in this area has been limited by the lack of methods capable of capturing the intricate, dynamic patterns of conflict diffusion. Untangling these complex trends requires flexible models. This study addresses this gap by analyzing spatio-temporal conflict fatality data using an innovative approach that transforms the data into three-dimensional patterns at the Prio-Grid level. We adapt a shape-based model called ShapeFinder. By applying the Earth Mover's Distance (EMD) algorithm, we detect and classify these patterns, allowing us to compare and match patterns with high adaptive capacity in all dimensions. Using similar historical patterns, we generate predictions of conflict fatalities and compare these with forecasts from the Views ensemble model, a leading benchmark. Our findings demonstrate that recognizing and analyzing conflict diffusion patterns significantly improves predictive accuracy, outperforming the benchmark model. This research contributes to the study of conflict dynamics by introducing a novel pattern recognition framework that enhances the analysis of spatio-temporal data and offers practical applications for early warning systems.

2604.21020 2026-04-24 stat.ME

A Functional-Class Meta-Analytic Framework for Quantifying Surrogate Resilience

Emily Hsiao, Layla Parast

A surrogate marker is a biomarker or other physical measurement used to replace a primary outcome in clinical trials to evaluate a treatment effect when the primary outcome of interest is costly, invasive, or takes a long time to observe. However, replacing a primary outcome with a surrogate can lead to the "surrogate paradox," in which a treatment appears beneficial based on the surrogate but is actually harmful with respect to the primary outcome. In this paper, we propose a functional class-based method to assess resilience to the surrogate paradox in a meta-analytic setting. Our method leverages data from K completed studies in which the surrogate marker and primary outcome have been measured to make inference on a new study in which only the surrogate is measured. We do not assume direct transportability of the conditional mean function from the completed studies to the new study; instead, we consider deviations of functions from those observed in the completed studies to estimate the "resilience probability" i.e., the probability of the surrogate paradox in the new study. We investigate the performance of our proposed method through a simulation study and apply our method to data from clinical trials in schizophrenia.

2604.21009 2026-04-24 stat.ME stat.CO

Revisiting Bayesian Variable Selection via Optimization

Leo L Duan

Abstract

Variable selection in linear regression has been a central topic in statistical research for decades. Bayesian variable selection methods, which account for uncertainty in both the regression coefficients and the noise variance, have achieved broad success through the use of discrete or continuous shrinkage priors and efficient collapsed Gibbs samplers. Despite their popularity and strong empirical performance, an enigma remains: the marginal likelihood, obtained by integrating out the regression coefficients and noise variance, is not log-concave; therefore, there is no guarantee of reliably finding its global optimum. In this article, we study this problem from an optimization perspective. Taking the negative log-marginal likelihood as a loss function of the latent precision parameters, we can rewrite it as a difference of convex functions (DC), and then optimize it via a simple iterative algorithm. Under mild compact set conditions, the DC algorithm converges to the global optimum at a linear rate. The positive finding applies to type-II maximum likelihood and extends to maximum marginal posterior under suitable priors, indicating that the problem of mode finding in Bayesian variable selection is much more benign than the lack of log-concavity might suggest. Besides the theoretical insight, the proposed algorithm is easy to implement, free of tuning, and extensible to structured sparsity, and thus can serve as an efficient alternative or warm-start for traditional Markov chain Monte Carlo solutions. The method is illustrated through numerical studies and a spatial data application for quantifying the aftershock risk following the 2019 Ridgecrest earthquakes. The source code for the algorithm is publicly available at https://github.com/leoduan/dca_optimization_variable_selection.

2604.20978 2026-04-24 stat.ME

ML, PL, QL in Markov chain models

Nils Lid Hjort, Cristiano Varin

Comments 34 pages, 7 figures. This is the Statistical Research Report version (Department of Mathematics, University of Oslo, April 2005), with more examples and material than the published version in Scandinavian Journal of Statistics, 2008, vol. 35, pages 64-82

Journal ref
Scandinavian Journal of Statistics, 2008, vol. 35, pages 64-82
Abstract

In many spatial and spatial-temporal models, and more generally in models with complex dependencies, it may be too difficult to carry out full maximum likelihood (ML) analysis. Remedies include the use of pseudo-likelihood (PL) and quasi-likelihood (QL, also called composite likelihood). The present article studies the ML, PL, and QL methods for general Markov chain models, partly motivated by the desire to understand the precise behaviour of PL and QL methods in settings where this can be analysed. We present limiting normality results and compare performances in different settings. The PL and QL methods can be seen as maximum penalised likelihood methods. We find that the QL strategy is typically preferable to the PL, and that it loses very little to the ML while gaining in model robustness. It also has appeal and potential as a modelling tool. Our methods are illustrated by an analysis of DNA sequence evolution type models.
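For orientation, the ML baseline is easy to state for a first-order Markov chain with unrestricted transition probabilities: conditional on the first state, the estimate of each transition probability is the observed transition count divided by the number of departures from the originating state. The sketch below shows only this baseline; the function name `transition_mle` and the toy sequence are assumptions for illustration, and the PL and QL estimators studied in the article are not implemented here.

```python
from collections import Counter, defaultdict


def transition_mle(sequence):
    """ML estimate of first-order transition probabilities.

    Conditions on the first state, so the initial distribution is ignored.
    Returns a nested dict: probs[a][b] estimates P(next = b | current = a).
    """
    pair_counts = Counter(zip(sequence, sequence[1:]))
    departures = Counter(sequence[:-1])
    probs = defaultdict(dict)
    for (a, b), n in pair_counts.items():
        probs[a][b] = n / departures[a]
    return dict(probs)


# Toy DNA-like sequence (hypothetical data).
estimate = transition_mle("AACA")
print(estimate)
```

For the toy sequence the chain leaves `A` twice (once to `A`, once to `C`) and leaves `C` once (to `A`), so each row of the estimate sums to one.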

2604.20949 2026-04-24 cs.LG q-fin.TR stat.ME stat.ML

Early Detection of Latent Microstructure Regimes in Limit Order Books

Prakul Sunil Hiremath, Vruksha Arun Hiremath

Comments 48 pages, 7 figures. Combines theoretical guarantees (identifiability and early-detection bounds), 200-run simulation study, and preliminary real-data evaluation on BTC/USDT limit order books. Code and data available

Abstract

Limit order books can transition rapidly from stable to stressed conditions, yet standard early-warning signals such as order flow imbalance and short-term volatility are inherently reactive. We formalise this limitation via a three-regime causal data-generating process (stable $\to$ latent build-up $\to$ stress) in which a latent deterioration phase creates a prediction window prior to observable stress. Under mild assumptions on temporal drift and regime persistence, we establish identifiability of the latent build-up regime and derive guarantees for strictly positive expected lead-time and non-trivial probability of early detection. We propose a trigger-based detector combining MAX aggregation of complementary signal channels, a rising-edge condition, and adaptive thresholding. Across 200 simulations, the method achieves mean lead-time $+18.6 \pm 3.2$ timesteps with perfect precision and moderate coverage, outperforming classical change-point and microstructure baselines. A preliminary application to one week of BTC/USDT order book data shows consistent positive lead-times while baselines remain reactive. Results degrade in low signal-to-noise and short build-up regimes, consistent with theory.

2604.20907 2026-04-24 stat.ML cs.LG math.CO math.PR math.ST stat.TH

Achieving the Kesten-Stigum bound in the non-uniform hypergraph stochastic block model

Manuel Fernandez, Ludovic Stephan, Yizhe Zhu

Comments 67 pages, 1 figure

Abstract

We study the community detection problem in the non-uniform hypergraph stochastic block model (HSBM), where hyperedges of varying sizes coexist. This setting captures higher-order and multi-view interactions and raises a fundamental question: can multiple uniform hypergraph layers below the detection threshold be combined to enable weak recovery? We answer this question by establishing a Kesten--Stigum-type bound for weak recovery in a general class of non-uniform HSBMs with $r$ blocks, generated according to multiple symmetric probability tensors. In the case $r=2$, we show that weak recovery is possible whenever the sum of the signal-to-noise ratios across all uniform hypergraph layers exceeds one, thereby confirming the positive part of a conjecture in (Chodrow et al., 2023). Moreover, we provide a polynomial-time spectral algorithm that achieves this threshold via an optimally weighted non-backtracking operator. For the unweighted non-backtracking matrix, our spectral method attains a different algorithmic threshold, also conjectured in (Chodrow et al., 2023). Our approach develops a spectral theory for weighted non-backtracking operators on non-uniform hypergraphs, including a precise characterization of outlier eigenvalues and eigenvector overlaps. We introduce a novel Ihara--Bass formula tailored to weighted non-uniform hypergraphs, which yields an efficient low-dimensional representation and leads to a provable spectral reconstruction algorithm. Taken together, these results provide a principled and computationally efficient approach to clustering in non-uniform hypergraphs, and highlight the role of optimal weighting in aggregating heterogeneous higher-order interactions.

2604.20877 2026-04-24 q-fin.RM stat.AP stat.ME

When AAA Satisfies Nothing: Impossibility Theorems for Structured Credit Ratings

Marco Pollanen

Comments 22 pages, 7 tables, 1 figure. Methodological paper on reliability bounds and discrimination limits, with application to structured credit ratings

Abstract

A credit rating of AAA asserts near-certainty of repayment. This paper asks whether the pre-crisis information environment could have supported that assertion for structured products. Bayes' theorem implies that any reliability target requires a minimum level of statistical discrimination between instruments that will repay and those that will not. At structured-finance base rates, a four-nines reliability target demands discrimination on the order of 10,000 to 1. A three-nines target demands 1,000 to 1. Nothing in the published credit-prediction literature provides an affirmative basis for believing that discrimination of this magnitude was achievable with the data available at rating time. Retrospectively, the realized system fell short of the four-nines benchmark by roughly 90,000-fold. The framework accommodates the historical feasibility of corporate AAA ratings, where high base rates and rich information produce low required discrimination. Illustrative calibrations for contemporary collateralized loan obligations suggest that material tension between the precision target and the information environment persists. The central implication is that the AAA precision claim itself likely exceeded what the available information could support.
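The Bayes' theorem argument can be made concrete in odds form: posterior odds equal prior odds times the likelihood ratio, so the discrimination needed to reach a reliability target is the target odds divided by the base-rate odds. The sketch below is an illustration under stated assumptions, not the paper's calibration: the function name `required_likelihood_ratio` and the base rates 0.5 and 0.99 are hypothetical values chosen only to show the arithmetic.

```python
def required_likelihood_ratio(reliability, base_rate):
    """Likelihood ratio needed to lift a prior `base_rate` of repayment
    to a posterior `reliability`, using Bayes' theorem in odds form."""
    if not (0 < reliability < 1 and 0 < base_rate < 1):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds = reliability / (1 - reliability)
    prior_odds = base_rate / (1 - base_rate)
    return posterior_odds / prior_odds


# Hypothetical base rates, for illustration only.
for label, base_rate in [("even prior odds", 0.5), ("high base rate", 0.99)]:
    four_nines = required_likelihood_ratio(0.9999, base_rate)
    three_nines = required_likelihood_ratio(0.999, base_rate)
    print(f"{label}: four nines needs about {four_nines:,.0f} to 1, "
          f"three nines about {three_nines:,.0f} to 1")
```

With even prior odds, the required ratio is essentially the target odds themselves, which is where figures on the order of 10,000 to 1 and 1,000 to 1 come from. A higher base rate lowers the requirement, in line with the abstract's remark about corporate ratings.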

2604.19738 2026-04-24 math.PR cs.LG stat.ML

Phase Transitions in the Fluctuations of Functionals of Random Neural Networks

Simmaco Di Lillo, Leonardo Maini, Domenico Marinucci

Abstract

We establish central and non-central limit theorems for sequences of functionals of the Gaussian output of an infinitely-wide random neural network on the $d$-dimensional sphere. We show that the asymptotic behaviour of these functionals as the depth of the network increases depends crucially on the fixed points of the covariance function, resulting in three distinct limiting regimes: convergence to the same functional of a limiting Gaussian field, convergence to a Gaussian distribution, and convergence to a distribution in the $Q$th Wiener chaos. Our proofs exploit tools that are now classical (Hermite expansions, Diagram Formula, Stein-Malliavin techniques), but also ideas which have never been used in similar contexts: in particular, the asymptotic behaviour is determined by the fixed-point structure of the iterative operator associated with the covariance, whose nature and stability govern the different limiting regimes.

2604.16645 2026-04-24 stat.ME math.ST stat.TH

Strang splitting estimator for nonlinear multivariate stochastic differential equations with Pearson-type multiplicative noise

Predrag Pilipović, Adeline Samson, Susanne Ditlevsen

Comments 27 pages of main text, 14 pages of supplementary materials, 8 figures

Abstract

Multivariate Pearson diffusions are characterized by a linear drift and a diffusion matrix that is quadratic in the state variables. We derive closed-form expressions for the mean and covariance matrix of this class using matrix exponential integrals, and extend this framework to a broader class of nonlinear diffusions with Pearson-type multiplicative noise. The main contribution is a new parameter estimator for these nonlinear multiplicative models based on Strang splitting, which decomposes the stochastic system into a deterministic nonlinear ordinary differential equation and a multivariate Pearson diffusion. The estimator is constructed by composing their respective flows and applying a Gaussian transition approximation with exact moments from the Pearson component. We prove that the estimator is consistent and asymptotically efficient. We also introduce a new model within this class, the Student Kramers oscillator, and prove existence and uniqueness of the strong solution and of an invariant measure. We evaluate the estimator through simulation studies on this oscillator and on the multivariate Wright-Fisher diffusion from population genetics, where it outperforms the Euler-Maruyama, Gaussian approximation, and local linearization estimators. We conclude with an application to Greenland ice core data using the Student Kramers oscillator.

2604.04141 2026-04-24 stat.ME math.ST stat.AP stat.TH

On Data Thinning for Model Validation in Small Area Estimation

Sho Kawano, Paul A. Parker, Zehang Richard Li

Abstract

Small area estimation (SAE) produces estimates of population parameters for geographic and demographic subgroups with limited sample sizes. Such estimates are critical for informing policy decisions, ranging from poverty mapping to social program funding. Despite its widespread use, principled validation of SAE models remains challenging and general guidelines are far from well-established. Unlike conventional predictive modeling settings, validation data are rarely available in the SAE context. External validation surveys or censuses often do not exist, and access to individual-level microdata is often restricted, making standard cross-validation infeasible. In this paper, we propose a novel model validation scheme using only area-level direct survey estimates under the widely used Fay-Herriot model. Our approach is based on data thinning, which splits area-level observations into independent training and test components to enable out-of-sample validation. Our theoretical analysis reveals a fundamental tension inherent in thinning-based validation: performance metrics measured on the thinned training component target a different quantity than those based on the full data, with the gap varying by model complexity. Increasing the information allocated for training reduces this gap but inflates the variance of the estimator. We formally characterize this bias-variance tradeoff and provide practical recommendations for the thinning parameters that balance these competing considerations for model comparison. We show that data thinning with these settings provides consistent and stable performance across heterogeneous sampling designs in design-based simulations using American Community Survey microdata.
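The splitting step can be sketched for a single Gaussian direct estimate with known sampling variance, which is the setting the Fay-Herriot model assumes. The sketch uses the standard Gaussian thinning construction; the function name `thin_gaussian`, the parameter name `eps` for the training fraction, and the simulated values are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def thin_gaussian(x, variance, eps, rng=random):
    """Split x ~ N(mu, variance) into two independent Gaussian parts.

    train ~ N(eps * mu, eps * variance)
    test  ~ N((1 - eps) * mu, (1 - eps) * variance)
    The sampling variance is treated as known.
    """
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    # Conditional on x, the training part is Gaussian around eps * x.
    noise_sd = math.sqrt(eps * (1 - eps) * variance)
    train = rng.gauss(eps * x, noise_sd)
    test = x - train  # the two parts always add back up to x
    return train, test


# Simulated check (hypothetical values): the two parts should be uncorrelated.
rng = random.Random(0)
pairs = [thin_gaussian(rng.gauss(2.0, 1.0), 1.0, 0.5, rng) for _ in range(20_000)]
mean_train = sum(a for a, _ in pairs) / len(pairs)
mean_test = sum(b for _, b in pairs) / len(pairs)
cov = sum((a - mean_train) * (b - mean_test) for a, b in pairs) / len(pairs)
print(f"empirical covariance between parts: {cov:.4f}")
```

A larger `eps` gives the training part more information, leaving a noisier test part; this is the allocation that the abstract's bias-variance tradeoff concerns.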

2603.20903 2026-04-24 math.OC hep-ph stat.ML

Unfolding with a Wasserstein Loss

Katy Craig, Benjamin Faktor, Benjamin Nachman

Abstract

Data unfolding -- the removal of noise or artifacts from measurements -- is a fundamental task across the experimental sciences. Of particular interest are applications in physics, where the dominant approach is Richardson-Lucy (RL) deconvolution. The classical RL approach aims to find denoised data that, once passed through the noise model, is as close as possible to the measured data in terms of Kullback-Leibler (KL) divergence. This requires that the support of the measured data overlaps with the output of the noise model, a hypothesis typically enforced by binning, which introduces numerical error. As a counterpoint, the present work studies an alternative formulation using a Wasserstein loss. We establish sharp conditions for existence and uniqueness of optimizers, answering open questions of Li, et al., regarding necessary conditions for uniqueness in the case of transport map noise models. We then develop a provably convergent generalized Sinkhorn algorithm to compute approximate optimizers. Our algorithm requires only empirical observations of the noise model and measured data and scales with the size of the data, rather than the ambient dimension. Numerical experiments on one- and two-dimensional problems inspired by jet mass unfolding in particle physics demonstrate that the optimal transport approach offers robust, accurate performance compared to classical RL deconvolution, particularly when binning artifacts are significant.
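For reference, the classical Richardson-Lucy baseline mentioned above can be sketched for binned data in a few lines. This is the standard multiplicative iteration, not the Wasserstein method proposed in the paper; the function name `richardson_lucy`, the 2-by-2 response matrix, and the counts are hypothetical and chosen so that the answer is known.

```python
def richardson_lucy(response, measured, n_iter=200):
    """Classical Richardson-Lucy iteration for binned unfolding.

    response[i][j] is the probability that an event in truth bin j is
    observed in measured bin i; `measured` holds the observed counts.
    """
    n_meas, n_true = len(response), len(response[0])
    col_sums = [sum(response[i][j] for i in range(n_meas)) for j in range(n_true)]
    estimate = [sum(measured) / n_true] * n_true  # flat starting guess
    for _ in range(n_iter):
        # Push the current estimate through the noise model.
        folded = [sum(response[i][j] * estimate[j] for j in range(n_true))
                  for i in range(n_meas)]
        ratio = [m / f if f > 0 else 0.0 for m, f in zip(measured, folded)]
        # Multiplicative update: reweight each truth bin by the back-projected ratio.
        estimate = [
            estimate[j] * sum(response[i][j] * ratio[i] for i in range(n_meas)) / col_sums[j]
            for j in range(n_true)
        ]
    return estimate


# Hypothetical noise-free example: truth [100, 40] smeared by a 2x2 response.
response = [[0.8, 0.2], [0.2, 0.8]]
measured = [88.0, 52.0]
unfolded = richardson_lucy(response, measured)
print([round(v, 3) for v in unfolded])
```

The update divides by the folded prediction in every measured bin, which is why the measured data must overlap the support of the noise model output; binning is the usual way to guarantee that.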

2603.15055 2026-04-24 stat.ML cs.LG math.ST stat.TH

Spatio-temporal probabilistic forecast using MMAF-guided learning

Leonardo Bardi, Imma Valentina Curato, Lorenzo Proietti

Abstract

We present a theory-guided generalized Bayesian methodology for spatio-temporal raster data, which we use to train an ensemble of stochastic feed-forward neural networks with Gaussian-distributed weights. The methodology incorporates the dependence and causal structure of a spatio-temporal Ornstein-Uhlenbeck process into training and inference by enforcing constraints on the design of the data embedding and the related optimization routine. In inference mode, the networks are employed to generate causal ensemble forecasts by applying different initial conditions at different horizons. We call this workflow MMAF-guided learning. Experiments conducted on both synthetic and real data demonstrate that our forecasts remain calibrated across multiple time horizons. Moreover, we show that on such data, shallow feed-forward architectures can achieve performance comparable to, and in some cases better than, convolutional or diffusion deep learning architectures used in probabilistic forecasting tasks.

2603.03700 2026-04-24 stat.ML cs.AI cs.LG math.ST stat.TH

Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data

Saptarshi Chakraborty, Quentin Berthet, Peter L. Bartlett

Abstract

Despite the remarkable empirical success of score-based diffusion models, their statistical guarantees remain underdeveloped. Existing analyses often provide pessimistic convergence rates that do not reflect the intrinsic low-dimensional structure common in real data, such as that arising in natural images. In this work, we study the statistical convergence of score-based diffusion models for learning an unknown distribution $\mu$ from finitely many samples. Under mild regularity conditions on the forward diffusion process and the data distribution, we derive finite-sample error bounds on the learned generative distribution, measured in the Wasserstein-$p$ distance. Unlike prior results, our guarantees hold for all $p \ge 1$ and require only a finite-moment assumption on $\mu$, without compact-support, manifold, or smooth-density conditions. Specifically, given $n$ i.i.d. samples from $\mu$ with finite $q$-th moment and appropriately chosen network architectures, hyperparameters, and discretization schemes, we show that the expected Wasserstein-$p$ error between the learned distribution $\hat{\mu}$ and $\mu$ scales as $\mathbb{E}\, \mathbb{W}_p(\hat{\mu},\mu) = \widetilde{O}\!\left(n^{-1 / d^\ast_{p,q}(\mu)}\right)$, where $d^\ast_{p,q}(\mu)$ is the $(p,q)$-Wasserstein dimension of $\mu$. Our results demonstrate that diffusion models naturally adapt to the intrinsic geometry of data and mitigate the curse of dimensionality, since the convergence rate depends on $d^\ast_{p,q}(\mu)$ rather than the ambient dimension. Moreover, our theory conceptually bridges the analysis of diffusion models with that of GANs and the sharp minimax rates established in optimal transport. The proposed $(p,q)$-Wasserstein dimension also extends the notion of classical Wasserstein dimension to distributions with unbounded support, which may be of independent theoretical interest.

2602.18577 2026-04-24 stat.ME stat.CO

balnet: Pathwise Estimation of Covariate Balancing Propensity Scores

Erik Sverdrup, Trevor Hastie

Abstract

We present balnet, an R package for scalable pathwise estimation of covariate balancing propensity scores via logistic covariate balancing loss functions. Regularization paths are computed with Yang and Hastie (2024)'s generic elastic net solver, supporting convex losses with non-smooth penalties, as well as group penalties and feature-specific penalty factors. For lasso penalization, balnet computes a regularization path of balancing weights from the largest observed covariate imbalance to a user-specified fraction of this maximum. We illustrate the method with an application to spatial pixel-level balancing for constructing synthetic control weights for the average treatment effect on the treated, using satellite data on wildfires.

2602.06262 2026-04-24 stat.ME stat.AP

Latent variation in pathogen strain-specific effects under multiple-versions-of-treatment theory

Bronner P. Gonçalves

Comments 9 pages, 1 figure

Abstract

Evidence-informed policy on infections requires estimates of their effects on health. However, pathogenic variation, whereby occurrence of adverse outcomes depends on the infecting strain, might complicate the study of many infectious agents. Here, we consider the interpretation of epidemiologic studies on effects of infections on health when there is heterogeneity in strain-specific effects and information on strain composition is unavailable. We use potential outcomes and causal inference theory for analyses in the presence of multiple versions of treatment to argue that oft-reported quantities in these studies have a causal interpretation that depends on population frequencies of infecting strains. Moreover, as in other contexts where the treatment-variation-irrelevance assumption might be violated, transportability requires additional considerations beyond those needed for non-compound exposures. This discussion, which considers potential heterogeneity in strain-specific effects, will facilitate interpretation of these studies and, for the reasons mentioned above, also highlights the value of pathogen subtype data.

2511.14354 2026-04-24 math.ST stat.TH

Asymptotic Distribution of Constrained Nearly-Isotonic Graph Fused Lasso

Vladimir Pastukhov

Comments 11 pages, 1 figure

Abstract

This paper studies the asymptotic distribution of a constrained lasso-type estimator for denoising signals defined on the nodes of a graph, where the underlying structure encodes relationships between variables. We show that, under suitable assumptions on the penalization parameters, the limiting distribution of the estimator is obtained by applying the corresponding constrained procedure to the asymptotic distribution of the unrestricted estimator. Thus, the constrained estimator shares the same convergence rate as the unrestricted estimator. Without the fusion penalty, the limiting distribution is obtained by applying individual nearly isotonic estimators to the corresponding sub-vectors of the unrestricted estimator's asymptotic distribution, similarly to the limiting behavior of isotonic regression.

2510.04548 2026-04-24 cond-mat.dis-nn cs.LG stat.ML

Learning Linear Regression with Low-Rank Tasks in-Context

Kaito Takanami, Takashi Takahashi, Yoshiyuki Kabashima

Comments Accepted at AISTATS 2026

Abstract

In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. It is particularly mysterious how ICL operates in real-world applications where tasks have a common structure. In this work, we address this problem by analyzing a linear attention model trained on low-rank regression tasks. Within this setting, we precisely characterize the distribution of predictions and the generalization error in the high-dimensional limit. Moreover, we find that statistical fluctuations in finite pre-training data induce an implicit regularization. Finally, we identify a sharp phase transition of the generalization error governed by task structure. These results provide a framework for understanding how transformers learn to learn the task structure.

2509.25630 2026-04-24 stat.ML cs.LG cs.NA math.NA

When Langevin Monte Carlo Meets Randomization: New Sampling Algorithms with Non-asymptotic Error Bounds beyond Log-Concavity and Gradient Lipschitzness

Xiaojie Wang, Bin Yang

Abstract

Efficient sampling from complex and high-dimensional target distributions is a fundamental task in diverse disciplines such as scientific computing, statistics, and machine learning. In this paper, we propose a new kind of randomized splitting Langevin Monte Carlo (RSLMC) algorithm for sampling from high-dimensional distributions without log-concavity. Compared with the existing randomized Langevin Monte Carlo (RLMC), the newly proposed RSLMC algorithm requires fewer gradient evaluations and is thus computationally cheaper. Under the gradient Lipschitz condition and the log-Sobolev inequality, we prove a uniform-in-time error bound in $\mathcal{W}_2$-distance of order $O(\sqrt{d}h)$ for both the RLMC and RSLMC sampling algorithms, which matches the best one in the literature under the log-concavity condition. Moreover, when the gradient of the potential $U$ is non-globally Lipschitz with superlinear growth, new modified R(S)LMC algorithms are introduced and analyzed, with non-asymptotic error bounds established. Finally, numerical examples are reported to corroborate the theoretical findings.
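To make the setting concrete, the sketch below implements the plain unadjusted Langevin algorithm, the basic scheme that randomized variants such as RLMC and RSLMC refine. It is not the RSLMC algorithm, whose randomized splitting step is not described in the abstract; the function name `langevin_samples`, the one-dimensional standard Gaussian target, the step size, and the seed are all assumptions for illustration.

```python
import math
import random


def langevin_samples(grad_u, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm for a 1-D target proportional to exp(-U).

    Each iteration takes a gradient step on U and adds Gaussian noise with
    variance 2 * step.
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        x = x - step * grad_u(x) + math.sqrt(2 * step) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


# Standard Gaussian target: U(x) = x^2 / 2, so grad U(x) = x.
rng = random.Random(0)
draws = langevin_samples(lambda x: x, x0=0.0, step=0.1, n_steps=50_000, rng=rng)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

For this Gaussian target the chain's stationary variance works out to 1 / (1 - h/2) for step size h, slightly above the true value of 1. That discretization bias is the kind of quantity that step-size-dependent error bounds control.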

2509.03476 2026-04-24 stat.ME

Temporal dependence in exposure and hazard-based infectious disease interventions

Hiroyasu Ando, A. James O'Malley, Akihiro Nishi

Comments 15 pages, 3 figures

Abstract

In randomized controlled trials (RCTs) of infectious disease interventions, it is well recognized that unmeasured individual heterogeneity at baseline can induce selection bias over time, thereby complicating the interpretation of the estimated hazard ratio. The present study examines a simplified setting: RCTs consisting of homogeneous participants, with no individual heterogeneity at baseline. However, even in such an apparently ideal setting, selection bias can emerge over time due to temporal dependence in exposure, a realistic feature of infectious disease transmission. In this study, we mathematically characterize the mechanism underlying this bias and quantitatively evaluate its magnitude. Our results show that this bias should be recognized as an issue in both the design and interpretation of RCTs of infectious disease interventions.

2508.10612 2026-04-24 math.ST stat.TH

Approximation rates for finite mixtures of location-scale models and fast least-squares estimators

Hien Duy Nguyen, TrungTin Nguyen, Jacob Westerhout, Xin Guo

Abstract

Finite mixture models provide a flexible framework for approximating and estimating multivariate probability densities. We study mixtures formed from translated and rescaled copies of a fixed density kernel and obtain explicit results for both approximation and least-squares estimation. Our main deterministic result is a quantisation theorem showing that, after smoothing the target density at a fixed resolution, the resulting convolution can be compressed into a finite location mixture with controlled error. Combining this with the smoothing bias yields approximation rates in $\mathcal{L}_{p}$ over Sobolev classes. For estimation, we analyse least-squares $\varepsilon$-minimisers over suitably tuned mixture sieves. Under exponential decay of the Fourier transform of the kernel, a matching moment condition, and bounded Sobolev targets, the estimator attains a squared $\mathcal{L}_{2}$ risk bound whose rate matches the Sobolev minimax benchmark up to a logarithmic factor. If, in addition, the kernel is bandlimited, then the same theorem recovers the Sobolev rate $n^{-2s/\left(2s+d\right)}$. We further report a slower convergence rate under weaker VC-type assumptions. At fixed scale, the Fourier-based approach also gives a nearly parametric risk bound for the associated location-mixture class, and the same bandlimited simplification removes the logarithmic correction. In the Gaussian case, this recovers the known Gaussian location-mixture rate. We also prove matching lower bounds on Gaussian convolution submodels, including strict submodels of the Gaussian location-mixture class, and on the tensor-product odd-degree Student-$t$ location-mixture family.

2506.12721 2026-04-24 cs.AI cs.CL cs.LG stat.ML

Strategic Scaling of Test-Time Compute: A Bandit Learning Approach

Bowen Zuo, Yinglun Zhu

Comments To appear at ICLR 2026

Abstract

Scaling test-time compute has emerged as an effective strategy for improving the performance of large language models. However, existing methods typically allocate compute uniformly across all queries, overlooking variation in query difficulty. To address this inefficiency, we formulate test-time compute allocation as a novel bandit learning problem and propose adaptive algorithms that estimate query difficulty on the fly and allocate compute accordingly. Compared to uniform allocation, our algorithms allocate more compute to challenging queries while maintaining accuracy on easier ones. Among challenging queries, our algorithms further learn to prioritize solvable instances, effectively reducing excessive computing on unsolvable queries. We theoretically prove that our algorithms achieve better compute efficiency than uniform allocation and empirically validate their effectiveness on math and code benchmarks. Specifically, our algorithms achieve up to an 11.10% performance improvement (15.04% relative) on the MATH-500 dataset, up to 10.82% (14.44% relative) on the AIME25 dataset, and up to an 11.23% performance improvement (15.29% relative) on the LiveCodeBench dataset.

2506.10374 2026-04-24 cs.IT math.IT math.ST stat.TH

Optimal Non-Adaptive Group Testing with One-Sided Error Guarantees

Daniel McMorrow, Jonathan Scarlett

Journal ref
IEEE Transactions on Information Theory (Volume: 72, Issue: 5, May 2026)
Abstract

The group testing problem consists of determining a sparse subset of defective items from within a larger set of items via a series of tests, where each test outcome indicates whether at least one defective item is included in the test. We study the approximate recovery setting, where the recovery criterion of the defective set is relaxed to allow a small number of items to be misclassified. In particular, we consider one-sided approximate recovery criteria, where we allow either only false negative or only false positive misclassifications. Under false negatives only (i.e., finding a subset of defectives), we show that there exists an algorithm matching the optimal threshold of two-sided approximate recovery. Under false positives only (i.e., finding a superset of the defectives), we provide a converse bound showing that the better of two existing algorithms is optimal.
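The test model and the "false positives only" criterion can be illustrated with COMP, a standard textbook decoder that declares defective every item never seen in a negative test. In the noiseless setting its output always contains the true defective set. The sketch is an illustration only and is not claimed to be one of the algorithms analysed in the paper; the function names, the Bernoulli design with inclusion probability 0.2, and the problem sizes are hypothetical.

```python
import random


def run_tests(design, defectives):
    """Noiseless outcomes: a test is positive iff it contains a defective item."""
    return [any(item in defectives for item in pool) for pool in design]


def comp_decode(n_items, design, outcomes):
    """COMP decoding: clear every item that appears in some negative test
    and declare all remaining items defective."""
    cleared = set()
    for pool, positive in zip(design, outcomes):
        if not positive:
            cleared.update(pool)
    return set(range(n_items)) - cleared


rng = random.Random(1)
n_items, n_tests = 50, 20
defectives = {3, 17, 42}
# Non-adaptive Bernoulli design: each item joins each test independently.
design = [[i for i in range(n_items) if rng.random() < 0.2] for _ in range(n_tests)]
estimate = comp_decode(n_items, design, run_tests(design, defectives))
print(f"declared {len(estimate)} items; superset of defectives: {defectives <= estimate}")
```

A defective item can never appear in a negative test, so it is never cleared. The only possible errors are non-defective items that happen to appear solely in positive tests, which is exactly the one-sided superset guarantee.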

2506.04292 2026-04-24 cs.SI cs.LG stat.AP

GARG-AML against Smurfing: A Scalable and Interpretable Graph-Based Framework for Anti-Money Laundering

Bruno Deprez, Bart Baesens, Tim Verdonck, Wouter Verbeke

Abstract

Purpose: We introduce GARG-AML, a fast and transparent graph-based method to catch 'smurfing', a common money-laundering tactic. It assigns a single, easy-to-understand risk score to every account in both directed and undirected networks. Unlike overly complex models, it balances detection power with the speed and clarity that investigators require. Methodology: The method maps an account's immediate and secondary connections (its second-order neighbourhood) into an adjacency matrix. By measuring the density of specific blocks within this matrix, GARG-AML flags patterns that mimic smurfing behaviour. We further boost the model's performance using decision trees and gradient-boosting classifiers, testing the results against the current state of the art on both synthetic and open-source data. Findings: GARG-AML matches or beats state-of-the-art performance across all tested datasets. Crucially, it easily processes the massive transaction graphs typical of large financial institutions. By leveraging only the adjacency matrix of the second-order neighbourhood and basic network features, this work highlights the potential of fundamental network properties towards advancing fraud detection. Originality: The originality lies in the translation of human expert knowledge of smurfing directly into a simple network representation, rather than relying on uninterpretable deep learning. Because GARG-AML is built expressly for the real-world business demands of scalability and interpretability, banks can easily incorporate it in their existing AML solutions.

2501.06133 2026-04-24 stat.ME math.ST stat.TH

Testing conditional independence under isotonicity

Rohan Hore, Jake A. Soloff, Rina Foygel Barber, Richard J. Samworth

Comments 79 pages, 7 figures, 2 tables

Abstract

We propose a test of the conditional independence of random variables $X$ and $Y$ given $Z$ under the additional assumption that $X$ is stochastically nondecreasing in $Z$. The well-documented hardness of testing conditional independence means that some further restriction on the null hypothesis parameter space is required. In contrast to existing approaches based on parametric models, smoothness assumptions, or approximations to the conditional distribution of $X$ given $Z$ and/or $Y$ given $Z$, our test requires only the stochastic monotonicity assumption. Our procedure, called PairSwap-ICI, determines the significance of a statistic by randomly swapping the $X$ values within ordered pairs of $Z$ values. The matched pairs and the test statistic may depend on both $Y$ and $Z$, providing the analyst with significant flexibility in constructing a powerful test. Our test offers finite-sample Type I error control, and provably achieves high power against a large class of alternatives. We validate our theoretical findings through a series of simulations and real data experiments.

2407.13970 2026-04-24 math.ST stat.TH

Frequentist Coverage of Bayes Posteriors in Nonlinear Inverse Problems with Gaussian Priors

Youngsoo Baek, Katerina Papagiannouli

Comments 42 pages, 2 figures

Abstract

We study asymptotic frequentist coverage and approximately Gaussian properties of Bayes posterior credible sets in nonlinear inverse problems when a Gaussian prior is placed on the parameter of the PDE. The aim is to ensure valid frequentist coverage of Bayes credible intervals when estimating continuous linear functionals of the parameter. Our results show that Bayes credible intervals have conservative coverage under certain smoothness assumptions on the parameter and a compatibility condition between the likelihood and the prior, regardless of whether an efficient limit exists or a Bernstein-von Mises (BvM) theorem holds. In the latter case, our results yield a corollary with more relaxed sufficient conditions than previous works. The theory is illustrated with a PDE that arises in predicting the transport of radioactive waste from underground repositories and optimizing oil recovery from subsurface fields: an elliptic inverse problem for Darcy flow. In this case, a near-$1/\sqrt{N}$ contraction rate and conservative coverage results are obtained for linear functionals that were shown not to be estimable efficiently.

2401.16407 2026-04-24 stat.ML cs.LG eess.IV eess.SP

Is K-fold cross validation the best model selection method for Machine Learning?

Juan M Gorriz, R. Martin Clemente, F Segovia, J Ramirez, A Ortiz, J. Suckling

Comments 40 pages, 24 figures

Abstract

As a technique that can compactly represent complex patterns, machine learning has significant potential for predictive inference. K-fold cross-validation (CV) is the most common approach to ascertaining the likelihood that a machine learning outcome is generated by chance, and it frequently outperforms conventional hypothesis testing. This improvement uses measures directly obtained from machine learning classifications, such as accuracy, that do not have a parametric description. To approach a frequentist analysis within machine learning pipelines, a permutation test or simple statistics from data partitions (i.e., folds) can be added to estimate confidence intervals. Unfortunately, neither parametric nor non-parametric tests solve the inherent problems of partitioning small sample-size datasets and learning from heterogeneous data sources. The fact that machine learning strongly depends on the learning parameters and the distribution of data across folds recapitulates familiar difficulties around excess false positives and replication. A novel statistical test based on K-fold CV and the Upper Bound of the actual risk (K-fold CUBV) is proposed, where uncertain predictions of machine learning with CV are bounded by the worst case through the evaluation of concentration inequalities. Probably Approximately Correct-Bayesian upper bounds for linear classifiers in combination with K-fold CV are derived and used to estimate the actual risk. The performance with simulated and neuroimaging datasets suggests that K-fold CUBV is a robust criterion for detecting effects and validating accuracy values obtained from machine learning and classical CV schemes, while avoiding excess false positives.
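
For context, the conventional baseline mentioned in this abstract (a permutation test wrapped around K-fold CV accuracy) might look like the sketch below. It is not the proposed K-fold CUBV test. The 1-D nearest-class-mean classifier, the interleaved fold assignment, and the function names are assumptions made to keep the example self-contained.

```python
import random


def kfold_accuracy(xs, labels, k=5):
    """K-fold CV accuracy of a 1-D nearest-class-mean classifier (labels 0/1)."""
    n = len(xs)
    correct = 0
    for fold in range(k):
        test_idx = range(fold, n, k)  # interleaved folds: fold, fold + k, ...
        test_set = set(test_idx)
        train = [i for i in range(n) if i not in test_set]
        # Fit: one mean per class on the training part only.
        means = {}
        for c in (0, 1):
            vals = [xs[i] for i in train if labels[i] == c]
            means[c] = sum(vals) / len(vals) if vals else 0.0
        # Predict: assign each held-out point to the closer class mean.
        for i in test_idx:
            pred = min((0, 1), key=lambda c: abs(xs[i] - means[c]))
            correct += pred == labels[i]
    return correct / n


def permutation_pvalue(xs, labels, k=5, n_perm=199, seed=0):
    """Share of label shufflings whose CV accuracy reaches the observed one."""
    rng = random.Random(seed)
    observed = kfold_accuracy(xs, labels, k)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if kfold_accuracy(xs, shuffled, k) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)
```

Each permutation reruns the full cross-validation, so the cost grows as `n_perm * k` model fits, and the null distribution still inherits the fold-to-fold variability the abstract points to for small samples.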

2309.07176 2026-04-24 cs.LG stat.ML

Mind the Gap: Optimal and Equitable Encouragement Policies

Angela Zhou

Comments Updated with major new case study on SNAP recertification benefits

Abstract

In consequential domains, it is often impossible to compel individuals to take treatment, so that optimal policy rules are merely suggestions in the presence of human non-adherence to treatment recommendations. We study personalized decision problems in which the planner controls recommendations into treatment rather than treatment itself. Under a covariate-conditional no-direct-effect model of encouragement, policy value depends on two distinct objects: responsiveness to encouragement and treatment efficacy. This modeling distinction makes induced treatment take-up, rather than recommendation rates alone, the natural fairness target and yields tractable policy characterizations under budget and access constraints. In settings with deterministic algorithmic recommendations, the same model localizes overlap-robustness to the recommendation-response model rather than the downstream outcome model. We illustrate the methods in case studies based on data from reminders of SNAP benefits recertification, and from pretrial supervised release with electronic monitoring. While the specific remedy to inequities in algorithmic allocation is context-specific, it requires studying both take-up of decisions and their downstream outcomes.

2303.03237 2026-04-24 stat.ML cs.LG math.ST stat.CO stat.TH

Convergence Rates for Non-Log-Concave Sampling and Log-Partition Estimation

David Holzmüller, Francis Bach

Comments Published in JMLR. New in v4: Summary tables / sections. Plots can be reproduced using the code at https://github.com/dholzmueller/sampling_experiments

Journal ref
Journal of Machine Learning Research 26(249):1-72, 2025
Abstract

Sampling from Gibbs distributions and computing their log-partition function are fundamental tasks in statistics, machine learning, and statistical physics. While efficient algorithms are known for log-concave densities, the worst-case non-log-concave setting necessarily suffers from the curse of dimensionality. For many numerical problems, the curse of dimensionality can be alleviated when the target function is smooth, allowing the exponent in the rate to improve linearly with the number of available derivatives. Recently, it has been shown that similarly fast convergence rates can be achieved by efficient optimization algorithms. Since optimization can be seen as the low-temperature limit of sampling from Gibbs distributions, we pose the question of whether similarly fast convergence rates can be achieved for non-log-concave sampling. We first study the information-based complexity of the sampling and log-partition estimation problems and show that the optimal rates for sampling and log-partition computation are sometimes equal and sometimes faster than for optimization. We then analyze various polynomial-time sampling algorithms, including an extension of a recent promising optimization approach, and find that they sometimes exhibit interesting behavior but no near-optimal rates. Our results also give further insights into the relation between sampling, log-partition, and optimization problems.
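
The remark that optimization is the low-temperature limit of sampling from Gibbs distributions can be checked numerically in one dimension. For a density proportional to $\exp(-\beta f(x))$ on $[0, 1]$, the quantity $-\frac{1}{\beta} \log \int_0^1 \exp(-\beta f(x))\,dx$ approaches $\min f$ as $\beta$ grows. The sketch below uses a plain midpoint rule, which is not one of the algorithms analysed in the paper; the test function `f` and the grid size are arbitrary assumptions.

```python
import math


def log_partition(f, beta, n_grid=10_000):
    """Midpoint-rule estimate of log of the integral of exp(-beta * f) over [0, 1]."""
    vals = [-beta * f((k + 0.5) / n_grid) for k in range(n_grid)]
    m = max(vals)  # log-sum-exp shift to avoid overflow at large beta
    return m + math.log(sum(math.exp(v - m) for v in vals) / n_grid)


def f(x):
    # Hypothetical non-convex test function with two wells on [0, 1].
    return math.cos(4 * math.pi * x) + x


for beta in (1.0, 10.0, 100.0, 1000.0):
    # The scaled negative log-partition decreases towards min f as beta grows.
    print(beta, -log_partition(f, beta) / beta)
```

A grid like this needs a number of points that grows exponentially with the dimension, which is the curse of dimensionality the abstract refers to.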