arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.20492 2026-04-23 stat.ML cs.IT cs.LG math.IT

Decentralized Machine Learning with Centralized Performance Guarantees via Gibbs Algorithms

Yaiza Bermudez, Samir Perlaza, Iñaki Esnaola

Comments In Proceedings of the International Symposium on Information Theory (ISIT), 2026

Abstract

In this paper, it is shown, for the first time, that centralized performance is achievable in decentralized learning without sharing the local datasets. Specifically, when clients adopt an empirical risk minimization with relative-entropy regularization (ERM-RER) learning framework and a forward-backward communication between clients is established, it suffices to share the locally obtained Gibbs measures to achieve the same performance as that of a centralized ERM-RER with access to all the datasets. The core idea is that the Gibbs measure produced by client $k$ is used as the reference measure by client $k+1$. This effectively establishes a principled way to encode prior information through a reference measure. In particular, achieving centralized performance in the decentralized setting requires a specific scaling of the regularization factors with the local sample sizes. Overall, this result opens the door to novel decentralized learning paradigms that shift the collaboration strategy from sharing data to sharing the local inductive bias via the reference measures over the set of models.
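The chaining idea can be illustrated on a toy problem. The sketch below is not the paper's construction: it assumes a finite model set, a squared-error loss, only a forward pass between two clients, and the standard Gibbs form P(m) ∝ Q(m) exp(-risk(m)/λ). Under those assumptions, scaling each client's regularization factor as λ·n/n_k (n being the total sample size) makes the chained measure coincide with the centralized one. All names and data are hypothetical.

```python
import math

def gibbs(reference, risks, lam):
    """Gibbs measure on a finite model set: P(m) proportional to Q(m) * exp(-risk(m) / lam)."""
    weights = [q * math.exp(-r / lam) for q, r in zip(reference, risks)]
    total = sum(weights)
    return [w / total for w in weights]

def empirical_risk(model, data):
    """Mean squared error of the constant predictor `model` on `data`."""
    return sum((x - model) ** 2 for x in data) / len(data)

# Hypothetical finite model set, uniform prior, and two local datasets.
models = [0.0, 0.5, 1.0, 1.5]
prior = [0.25] * len(models)
data_1 = [0.9, 1.1, 1.4]
data_2 = [0.2, 0.8, 1.0, 1.2, 1.6]
lam = 0.5
n1, n2 = len(data_1), len(data_2)
n = n1 + n2

# Centralized learner: one Gibbs measure computed from the pooled data.
pooled = data_1 + data_2
central = gibbs(prior, [empirical_risk(m, pooled) for m in models], lam)

# Decentralized chain: client 2 uses client 1's Gibbs measure as its reference.
# Assumed scaling for this toy: lam_k = lam * n / n_k.
p1 = gibbs(prior, [empirical_risk(m, data_1) for m in models], lam * n / n1)
p2 = gibbs(p1, [empirical_risk(m, data_2) for m in models], lam * n / n2)

max_gap = max(abs(a - b) for a, b in zip(central, p2))
print(max_gap)
```

The match follows because the exponents add: n1·L1/(λn) + n2·L2/(λn) equals the pooled risk divided by λ. Only the measure p1 crosses between clients, never data_1 itself.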

2604.20446 2026-04-23 cs.LG stat.ML

The Origin of Edge of Stability

Elon Litman

Abstract

Full-batch gradient descent on neural networks drives the largest Hessian eigenvalue to the threshold $2/η$, where $η$ is the learning rate. This phenomenon, the Edge of Stability, has resisted a unified explanation: existing accounts establish self-regulation near the edge but do not explain why the trajectory is forced toward $2/η$ from arbitrary initialization. We introduce the edge coupling, a functional on consecutive iterate pairs whose coefficient is uniquely fixed by the gradient-descent update. Differencing its criticality condition yields a step recurrence with stability boundary $2/η$, and a second-order expansion yields a loss-change formula whose telescoping sum forces curvature toward $2/η$. The two formulas involve different Hessian averages, but the mean value theorem localizes each to the true Hessian at an interior point of the step segment, yielding exact forcing of the Hessian eigenvalue with no gap. Setting both gradients of the edge coupling to zero classifies fixed points and period-two orbits; near a fixed point, the problem reduces to a function of the half-amplitude alone, which determines which directions support period-two orbits and on which side of the critical learning rate they appear.
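For intuition on where the threshold $2/η$ comes from, the following minimal sketch runs gradient descent on a one-dimensional quadratic, where the Hessian is a constant curvature. It shows only the classical stability boundary, not the edge coupling or the forcing mechanism described in the abstract. The function name and the numbers are illustrative assumptions.

```python
def gd_on_quadratic(curvature, lr, steps=50, x0=1.0):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2 and return |x| at the end."""
    x = x0
    for _ in range(steps):
        x -= lr * curvature * x  # gradient of f at x is curvature * x
    return abs(x)

lr = 0.1
threshold = 2.0 / lr  # stability boundary for the curvature

# Each step multiplies x by (1 - lr * curvature); the iterates shrink
# exactly when |1 - lr * curvature| < 1, i.e. when curvature < 2 / lr.
below = gd_on_quadratic(0.95 * threshold, lr)
above = gd_on_quadratic(1.05 * threshold, lr)
print(below, above)
```

With curvature 5% below the threshold the iterate oscillates in sign but decays. With curvature 5% above it, the same oscillation grows without bound.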

2604.20445 2026-04-23 stat.AP

Assessing the Shortfall Risk of GB Electricity Grid using Shifts in Winter Weather Conditions

Aninda Bhattacharya, Chris J. Dent, Amy L. Wilson, Gabriele C. Hegerl

Comments Pre-print Submitted to Applied Energy

Abstract

Extreme weather events during peak winter periods drive resource adequacy risk in Great Britain (GB), with weather sensitivity of the supply-demand balance increasing through additional electric heating and wind generation. This work develops an approach of time-shifting weather within the peak season, through adjustment of the relevant terms in a statistical model for demand. This allows more complete consideration of the security of supply consequences of a weather series, as there will be relevant conditions where demand is suppressed due to weather occurring at a weekend or during the Christmas holiday. Results on a GB example show that consideration of this counterfactual is indeed important, and specifically that winter 2010-11 can either be the most severe in the dataset, or insignificant within the resource adequacy model, depending on the alignment of day-of-week with the weather series. Statistical interpretation of the shift model is discussed, which is straightforward for alignment of day-of-week with weather assuming that all seven alignments are equiprobable; but is more subtle for shifting weather in and out of Christmas, as there is no natural maximum on the realistic length of shift, but too large a shift may be physically unrealistic. It is likely that in all systems, assessment of a weather year's severity is incomplete without such consideration of the day-of-week effect; however, whether longer shifts of weather with respect to date need to be considered will depend on the presence of a major holiday (such as Christmas in GB) in the peak season.

2604.20422 2026-04-23 math.ST math.PR stat.TH

Likelihood-based inference for birth-death processes with composite birth mechanisms

Marko Lalovic, Nicos Georgiou, Istvan Z. Kiss

Comments 32 pages, 8 figures

Abstract

We develop likelihood-based inference for finite-state birth-death processes with composite birth rates, in which multiple distinct mechanisms contribute additively to the total birth intensity. Our main motivating example is an SIS epidemic model with pairwise and higher-order transmission. The process is observed through a single aggregate trajectory, and in the main setting of interest, birth events are unmarked. This creates a deconvolution problem in event space: the state is one-dimensional, but the mechanism underlying each birth is latent. We formulate the inference under a Doob $h$-transformed $Q$-process, which is time-homogeneous and ergodic and which provides a time-homogeneous asymptotic surrogate for the law of the original process conditioned on long survival. We derive the corresponding conditional likelihood and study both the conditional maximum likelihood estimator and a quasi-maximum likelihood estimator based on a simplified working score. Under the Doob-transform law, we prove consistency and asymptotic normality for both estimators, with asymptotic covariance determined by the inverse Fisher and inverse Godambe information matrices, respectively. We also showcase a practical one-dimensional test for the presence of a specific higher-order birth mechanism.

2604.20416 2026-04-23 stat.AP

SHARELIFE Imputations

Giuseppe De Luca, Paolo Li Donni

Comments 84 pages (including 32 pages of appendices and 22 figures)

Abstract

This report describes the SHARELIFE-MI project, which aims to generate multiple imputations for missing values in the life-course data collected in SHARELIFE Waves 3 and 7. The SHARELIFE study reconstructs individual life histories through retrospective questions covering key biographical domains such as partnerships, fertility, employment, and residence. As in the regular SHARE waves, item nonresponse represents an important source of nonsampling error, particularly for monetary variables, which require conversions across multiple currencies and long time periods. We document the preliminary data recoding and harmonization steps, as well as the design, specification, and implementation of an imputation model based on the fully conditional specification approach. Finally, we assess the internal and external validity of the resulting imputations through comparisons with the observed data, alternative nonresponse adjustments based on inverse propensity weighting, and external benchmarks from the regular SHARE waves.

2604.20414 2026-04-23 math.ST stat.ME stat.ML stat.TH

Fast and Provably Accurate Sequential Designs using Hilbert Space Gaussian Processes

Huanyan Zhu, Cheng Li

Abstract

Gaussian processes are widely used for accurate emulation of unknown surfaces in sequential design of expensive simulation experiments. Integrated mean squared error (IMSE) is an effective acquisition function for sequential designs based on Gaussian processes. However, existing approaches struggle with its implementation because the required integrals lack closed-form expressions for most kernel functions. We propose a novel and computationally efficient Hilbert space Gaussian process approximation for the IMSE acquisition function, where a truncated eigenbasis representation of the integral enables closed-form evaluation. We establish sharp global non-asymptotic bounds for both the approximation error of isotropic kernels and the resulting error in the acquisition function. In a series of numerical experiments with $γ$-stabilizing, the proposed method achieves substantially lower prediction error and reduced computation time compared to existing benchmarks. These results demonstrate that the proposed Hilbert space Gaussian process framework provides an accurate and computationally efficient approach for Gaussian process based sequential design.

2604.20409 2026-04-23 cs.LG stat.ML

Calibrating conditional risk

Andrey Vasilyev, Yikai Wang, Xiaocheng Li, Guanting Chen

Abstract

We introduce and study the problem of calibrating conditional risk, which involves estimating the expected loss of a prediction model conditional on input features. We analyze this problem in both classification and regression settings and show that it is fundamentally equivalent to a standard regression task. For classification settings, we further establish a connection between conditional risk calibration and individual/conditional probability calibration, and develop theoretical insights for the performance metric. This reveals that while conditional risk calibration is related to existing uncertainty quantification problems, it remains a distinct and standalone machine learning problem. Empirically, we validate our theoretical findings and demonstrate the practical implications of conditional risk calibration in the learning to defer (L2D) framework. Our systematic experiments provide both qualitative and quantitative assessments, offering guidance for future research in uncertainty-aware decision-making.

2604.20370 2026-04-23 cs.LG stat.ML

Cold-Start Forecasting of New Product Life-Cycles via Conditional Diffusion Models

Ruihan Zhou, Zishi Zhang, Jinhui Han, Yijie Peng, Xiaowei Zhang

Abstract

Forecasting the life-cycle trajectory of a newly launched product is important for launch planning, resource allocation, and early risk assessment. This task is especially difficult in the pre-launch and early post-launch phases, when product-specific outcome history is limited or unavailable, creating a cold-start problem. In these phases, firms must make decisions before demand patterns become reliably observable, while early signals are often sparse, noisy, and unstable. We propose the Conditional Diffusion Life-cycle Forecaster (CDLF), a conditional generative framework for forecasting new-product life-cycle trajectories under cold start. CDLF combines three sources of information: static descriptors, reference trajectories from similar products, and newly arriving observations when available. Here, static descriptors refer to structured pre-launch characteristics of the product, such as category, price tier, brand or organization identity, scale, and access conditions. This structure allows the model to condition forecasts on relevant product context and to update them adaptively over time without retraining, yielding flexible multi-modal predictive distributions under extreme data scarcity. The method satisfies consistency with a horizon-uniform distributional error bound for recursive generation. Across studies on Intel microprocessor stock keeping unit (SKU) life cycles and the platform-mediated adoption of open large language model repositories, CDLF delivers more accurate point forecasts and higher-quality probabilistic forecasts than classical diffusion models, Bayesian updating approaches, and other state-of-the-art machine-learning baselines.

2604.20341 2026-04-23 physics.geo-ph stat.AP

Extrapolation from historical data cannot reliably predict the time of a potential AMOC collapse

Andreas Morr, Maya Ben-Yami, Brian Groenke, Christof Schötz, Alessandro Cotronei, Eirik Myrvoll-Nilsen, Sebastian Bathiany, Martin Rypdal, Niklas Boers

Abstract

Ditlevsen and Ditlevsen [Nature Communications, 2023] (DD23 hereafter) propose a statistical framework to estimate the timing of a potential collapse of the Atlantic Meridional Overturning Circulation (AMOC) based on extrapolating information from observed sea-surface temperature (SST) variability. By fitting a stochastic one-dimensional fold-bifurcation model to an SST-based fingerprint of the AMOC using Maximum Likelihood Estimation (MLE), they conclude that a collapse is most likely to occur in the middle of the 21st century, with a reported 95% confidence interval covering the time span from 2037 to 2109. Given the profound implications of such a claim for both climate and society, it is essential to thoroughly test the robustness of this result, to critically assess the underlying assumptions and uncertainties, and to estimate the extent to which the reported confidence interval reflects the true limits of current knowledge. Here we examine the sensitivity of DD23's results and argue that four types of uncertainty are insufficiently explored in their analysis: (i) structural uncertainty associated with the assumed low-order bifurcation model, (ii) statistical uncertainty in their model fit, (iii) uncertainty in the representativeness of SST-based fingerprints as proxies for the high-dimensional AMOC dynamics, and (iv) uncertainty in the underlying data, arising from non-stationary observational coverage and dataset preprocessing. Using synthetic experiments and a systematic analysis of alternative fingerprints and observational products, we show that the tipping times estimated by DD23 are highly sensitive to the uncertainties listed above, and extend several millennia into the future when these uncertainties are thoroughly propagated.

2604.20322 2026-04-23 stat.ME

Zero-Inflated Logistic Regression Models with Shared Design: Identifiability, Existence of Estimates, and a Relabeling Rule

Yui Tomo, Shinto Eguchi, Daisuke Yoneoka

Abstract

The zero-inflated logistic regression model accommodates binary responses with excess zeros, which often arise from a latent mixture of susceptible and insusceptible subpopulations or asymmetric misclassification of the response. The model has two components: regression for the binary response and a latent binary indicator for the zero-inflation state. In applied settings, it is common to use the same design matrix for both components if there is no prior knowledge. However, this shared-design specification lacks guaranteed identifiability of the regression parameters, as established in prior works. This paper investigates the theoretical properties of the zero-inflated logistic regression model under the shared-design setting and computational methods for applications. First, to motivate the use of the zero-inflated model, we prove that ignoring the zero-inflation mechanism can lead to a sign flip in the pseudo-true coefficient value relative to the true value. We then establish sufficient conditions for the existence of the maximum likelihood estimate. As a main result, we establish that the model under the shared-design setting is identifiable up to exchange symmetry of the parameters for two components and that the expected log-likelihood has a unique maximizer on the resulting quotient space. The posterior bimodality is examined using a Pólya-Gamma Gibbs sampler with replica exchange. Finally, we propose a simple relabeling rule to select a single ordered parameter pair, and evaluate its performance through simulation studies and an application to self-reported diabetes data.

2604.20301 2026-04-23 stat.ML cs.LG stat.CO stat.ME

Properties and limitations of geometric tempering for gradient flow dynamics

Francesca Romana Crucinio, Sahani Pathiraja

Comments Accepted at TMLR https://openreview.net/forum?id=IP0w5LdcxC

Abstract

We consider the problem of sampling from a probability distribution $π$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise the Kullback--Leibler divergence from $π$. We consider the effect of replacing $π$ with a sequence of moving targets $(π_t)_{t\ge0}$ defined via geometric tempering on the Wasserstein and Fisher--Rao gradient flows. We show that convergence occurs exponentially in continuous time, providing novel bounds in both cases. We also consider popular time discretisations and explore their convergence properties. We show that in the Fisher--Rao case, replacing the target distribution with a geometric mixture of initial and target distribution never leads to a convergence speed-up, in either continuous or discrete time. Finally, we explore the gradient flow structure of tempered dynamics and derive novel adaptive tempering schedules.
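As a concrete picture of a geometric mixture of an initial and a target distribution, the sketch below builds the usual tempered path, proportional to initial^(1-λ) · target^λ, on a discretised grid and reports the Kullback-Leibler divergence of each intermediate distribution to the target. The grid, the two Gaussian-shaped densities, and the schedule of λ values are assumptions made for illustration. The code does not implement the gradient flows or the schedules studied in the paper.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def geometric_mixture(initial, target, lam):
    """Pointwise geometric interpolation initial**(1 - lam) * target**lam, renormalised."""
    return normalize([p0 ** (1 - lam) * p1 ** lam for p0, p1 in zip(initial, target)])

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions on the same grid."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

# Hypothetical example: a wide initial density and a narrow, shifted target on [-5, 5].
grid = [i / 10 for i in range(-50, 51)]
initial = normalize([math.exp(-0.5 * x ** 2 / 4.0) for x in grid])
target = normalize([math.exp(-0.5 * (x - 2) ** 2 / 0.25) for x in grid])

schedule = (0.0, 0.25, 0.5, 0.75, 1.0)
path = [geometric_mixture(initial, target, lam) for lam in schedule]
kls = [kl(p, target) for p in path]
print(kls)
```

The endpoints recover the initial distribution at λ = 0 and the target at λ = 1, and the divergence to the target shrinks as λ increases along this path.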

2604.20296 2026-04-23 stat.ML cs.LG

Online Survival Analysis: A Bandit Approach under Cox PH Model

Yang Xu, Wenbin Lu, Rui Song

Abstract

Survival analysis is a widely used statistical framework for modeling time-to-event data under censoring. Classical methods, such as the Cox proportional hazards (Cox PH) model, offer a semiparametric approach to estimating the effects of covariates on the hazard function. Despite its importance, survival analysis has been largely unexplored in online settings, particularly within the bandit framework, where decisions must be made sequentially to optimize treatments as new data arrive over time. In this work, we take an initial step toward integrating survival analysis into a purely online learning setting under the Cox PH model, addressing key challenges including staggered entry, delayed feedback, and right censoring. We adapt three canonical bandit algorithms to balance exploration and exploitation, with theoretical guarantees of sublinear regret bounds. Extensive simulations and semi-real experiments using SEER cancer data demonstrate that our approach enables rapid and effective learning of near-optimal treatment policies.

2604.20285 2026-04-23 stat.AP stat.OT

Time-dependent structural equation modeling of fans' football fever using activity tracking data during the 2025 DFB Cup final

Jonas Bauer, Christiane Fuchs, Tamara Schamberger

Abstract

Football fans frequently exhibit pronounced emotional and physiological reactions during high-stakes matches. However, the temporal dynamics of this football fever are rarely modeled as a latent process. Using intensive longitudinal data from Arminia Bielefeld supporters who wore smartwatches during the 2025 German Football Association (DFB) Cup final, we investigate how football fever unfolds. The devices recorded heart rate, stress level, and related indicators in short intervals, allowing us to construct a latent variable for football fever and model its dynamics. We specify a time-dependent structural equation model with latent growth components and autoregressive effects to capture both overall trends and short-term carry-over effects in fans' physiological responses. Results are aggregated across multiple imputations of missing measurements. Model fit is evaluated using adjustments for the high data dimensionality. The results show that football fever follows a V-shaped trajectory: high at kick-off, followed by a steady decline until the renewed arousal in the second half, with substantial between-fan heterogeneity in both baseline level and temporal dynamics. Our findings demonstrate that football fever can be adequately represented as a latent variable using structural equation modeling and reflected by wearable technology data. This highlights the importance of accounting for temporal dependence when studying dynamic emotional phenomena, e.g., in sports spectatorship.

2604.20276 2026-04-23 cs.LG stat.ML

Rethinking Intrinsic Dimension Estimation in Neural Representations

Rickmer Schulte, David Rügamer

Comments Accepted at the 29th International Conference on Artificial Intelligence and Statistics (AISTATS) 2026

Abstract

The analysis of neural representation has become an integral part of research aiming to better understand the inner workings of neural networks. While there are many different approaches to investigate neural representations, an important line of research has focused on doing so through the lens of intrinsic dimensions (IDs). Although this perspective has provided valuable insights and stimulated substantial follow-up research, important limitations of this approach have remained largely unaddressed. In this paper, we highlight a crucial discrepancy between theory and practice of IDs in neural representations, theoretically and empirically showing that common ID estimators are, in fact, not tracking the true underlying ID of the representation. We contrast this negative result with an investigation of the underlying factors that may drive commonly reported ID-related results on neural representation in the literature. Building on these insights, we offer a new perspective on ID estimation in neural representations.

2604.20238 2026-04-23 math.ST stat.TH

Bayesian approaches to non- and semiparametric density estimation [with a rejoinder to my discussants]

Nils Lid Hjort

Comments 29 pages, no figures. Statistical Research Report, Department of Mathematics, University of Oslo, 1995; invited discussion paper for the Fifth Valencia Meeting on Bayesian Statistics. Published version in "Bayesian Statistics" (1995, eds. J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F. Smith), Proceedings of the Fifth Valencia International Meeting (vol. 5), 223-254

Abstract

This invited paper proposes and discusses several Bayesian attempts at nonparametric and semiparametric density estimation. The main categories of these ideas are as follows: (1) Build a nonparametric prior around a given parametric model. We look at cases where the nonparametric part of the construction is a Dirichlet process or relatives thereof. (2) Express the density as an additive expansion of orthogonal basis functions, and place priors on the coefficients. Here attention is given to a certain robust Hermite expansion around the normal distribution. Multiplicative expansions are also considered. (3) Express the unknown density as locally being of a certain parametric form, then construct suitable local likelihood functions to express information content, and place local priors on the local parameters.

2604.20219 2026-04-23 cs.LG cs.NA math.NA stat.ML

Geometric Layer-wise Approximation Rates for Deep Networks

Shijun Zhang, Zuowei Shen, Yuesheng Xu

Abstract

Depth is widely viewed as a central contributor to the success of deep neural networks, whereas standard neural network approximation theory typically provides guarantees only for the final output and leaves the role of intermediate layers largely unclear. We address this gap by developing a quantitative framework in which depth admits a precise scale-dependent interpretation. Specifically, we design a single shared mixed-activation architecture of fixed width $2dN+d+2$ and any prescribed finite depth such that each intermediate readout $Φ_\ell$ is itself an approximant to the target function $f$. For $f\in L^p([0,1]^d)$ with $p\in [1,\infty)$, the approximation error of $Φ_\ell$ is controlled by $(2d+1)$ times the $L^p$ modulus of continuity at the geometric scale $N^{-\ell}$ for all $\ell$. The estimate reduces to the geometric rate $(2d+1)N^{-\ell}$ if $f$ is $1$-Lipschitz. Our network design is inspired by multigrade deep learning, where depth serves as a progressive refinement mechanism: each new correction targets residual information at a finer scale while the earlier correction terms remain part of the later readouts, yielding a nested architecture that supports adaptive refinement without redesigning the preceding network.

2604.20161 2026-04-23 cs.LG stat.ME stat.ML

SMART: A Spectral Transfer Approach to Multi-Task Learning

Boxin Zhao, Mladen Kolar, Jinchi Lv

Comments 53 pages, 4 figures, 1 table

Abstract

Multi-task learning is effective for related applications, but its performance can deteriorate when the target sample size is small. Transfer learning can borrow strength from related studies; yet, many existing methods rely on restrictive bounded-difference assumptions between the source and target models. We propose SMART, a spectral transfer method for multi-task linear regression that instead assumes spectral similarity: the target left and right singular subspaces lie within the corresponding source subspaces and are sparsely aligned with the source singular bases. Such an assumption is natural when studies share latent structures and enables transfer beyond the bounded-difference settings. SMART estimates the target coefficient matrix through structured regularization that incorporates spectral information from a source study. Importantly, it requires only a fitted source model rather than the raw source data, making it useful when data sharing is limited. Although the optimization problem is nonconvex, we develop a practical ADMM-based algorithm. We establish general, non-asymptotic error bounds and a minimax lower bound in the noiseless-source regime. Under additional regularity conditions, these results yield near-minimax Frobenius error rates up to logarithmic factors. Simulations confirm improved estimation accuracy and robustness to negative transfer, and analysis of multi-modal single-cell data demonstrates better predictive performance. The Python implementation of SMART, along with the code to reproduce all experiments in this paper, is publicly available at https://github.com/boxinz17/smart.

2604.20115 2026-04-23 cs.LG cs.AI stat.ML

On the Stability and Generalization of First-order Bilevel Minimax Optimization

Xuelin Zhang, Peipei Yuan

Abstract

Bilevel optimization and bilevel minimax optimization have recently emerged as unifying frameworks for a range of machine-learning tasks, including hyperparameter optimization and reinforcement learning. The existing literature focuses on empirical efficiency and convergence guarantees, leaving a critical theoretical gap in understanding how well these algorithms generalize. To bridge this gap, we provide the first systematic generalization analysis for first-order gradient-based bilevel minimax solvers with lower-level minimax problems. Specifically, by leveraging algorithmic stability arguments, we derive fine-grained generalization bounds for three representative algorithms, including single-timescale stochastic gradient descent-ascent, and two variants of two-timescale stochastic gradient descent-ascent. Our results reveal a precise trade-off among algorithmic stability, generalization gaps, and practical settings. Furthermore, extensive empirical evaluations corroborate our theoretical insights on realistic optimization tasks with bilevel minimax structures.

2604.20111 2026-04-23 cs.LG cs.AI stat.ML

Meta Additive Model: Interpretable Sparse Learning With Auto Weighting

Xuelin Zhang, Xinyue Liu, Lingjuan Wu, Hong Chen

Abstract

Sparse additive models have attracted much attention in high-dimensional data analysis due to their flexible representation and strong interpretability. However, most existing models are limited to single-level learning under the mean-squared error criterion, whose empirical performance can degrade significantly in the presence of complex noise, such as non-Gaussian perturbations, outliers, noisy labels, and imbalanced categories. The sample reweighting strategy is widely used to reduce the model's sensitivity to atypical data; however, it typically requires prespecifying the weighting functions and manually selecting additional hyperparameters. To address this issue, we propose a new meta additive model (MAM) based on the bilevel optimization framework, which learns data-driven weighting of individual losses by parameterizing the weighting function via an MLP trained on meta data. MAM can handle a variety of learning tasks, including variable selection, robust regression estimation, and imbalanced classification. Theoretically, MAM provides guarantees on convergence in computation, algorithmic generalization, and variable selection consistency under mild conditions. Empirically, MAM outperforms several state-of-the-art additive models on both synthetic and real-world data under various data corruptions.

2604.20072 2026-04-23 math.ST stat.ME stat.TH

Vertex misalignment and changepoint localization in network time series

Tianyi Chen, Mohammad Sharifi Kiasari, Sijing Yu, Youngser Park, Avanti Athreya, Vince Lyzinski, Carey E Priebe, Zachary Lubberts

Comments 52 pages, 11 figures, 3 tables

Abstract

Inference for time series of networks often relies on accurate vertex correspondence between network realizations at different times. In practice, however, such vertex alignments can be misspecified or unknown. We study the impact of vertex alignment on changepoint localization for dynamic networks through two illustrative models, each with a similar changepoint, with the key distinction being whether changepoint information is contained in marginal or joint distributions of the time-varying latent positions. We compare localization techniques ranging from the simple network statistic of average degree to the modern procedure of Euclidean mirrors. In one model, vertex misalignment causes little error, and in the other, it impairs localization in ways that cannot be corrected through graph matching or optimal transport, which we show are closely related in this setting. Our results demonstrate that robust network inference necessitates reckoning with the subtle interplay of marginal and joint information in the observed network time series.

2604.20069 2026-04-23 stat.AP

Bayesian inference for disease transmission models informed by viral dynamics

Dylan J. Morris, Lauren Kennedy, Andrew J. Black

Comments 35 pages, 13 figures

Abstract

Infectious disease dynamics operate across multiple biological scales, with within-host viral dynamics being a key driver of between-host transmission. However, while models that explicitly link these scales exist, none have been developed with statistical inference as a primary goal. In this paper we propose a multiscale model that jointly captures heterogeneous individual-level viral load trajectories and stochastic household transmission, and develop efficient inference methods to fit it to data. Since full joint inference is computationally difficult, we employ a cut approach that passes information from the within-host to the between-host model but not vice versa. This enables the data on viral loads to inform the transmission parameters such as the infection times and symptom onset thresholds. We evaluate the framework on simulated household outbreak data, assessing parameter recovery, computational efficiency, and the effect of viral load sampling frequency on inference quality. Parameter recovery is unbiased when the sampling frequency of the viral loads is high enough. When sampling is sparse, some bias is introduced, but incorporating external viral load data can mitigate this.

2604.20045 2026-04-23 stat.ME

A general nonparametric framework for testing hypotheses about function-valued parameters

Albert Osom, Ali Shojaie, Aaron Hudson

Abstract

We present a general nonparametric approach for testing whether a statistical parameter defined through conditional distributions is constant across the conditioning variables. Such hypotheses arise naturally in problems such as assessing treatment effect heterogeneity, conditional associational effects, and conditional mean dependence. Our framework studies function-valued parameters obtained by evaluating a smooth statistical functional on conditional probability distributions. We establish an explicit connection between our test and procedures based on studying the norm of the function-valued parameter. Unlike many existing norm-based tests, which exhibit poor asymptotic behavior under the null, the proposed test statistic admits a tractable limiting null distribution. We illustrate the applicability of the proposed test through several examples, assess its operating characteristics in simulation studies, and apply it to data from a breast cancer trial to identify predictive biomarkers for response to adjuvant chemotherapy.

2604.20016 2026-04-23 stat.ME math.ST stat.TH

Weighted Holm Procedures: Theory, Properties, and Recommendations

Beibei Li, Wenge Guo

Comments 35 pages, 5 figures, 2 tables

Details
English abstract

In many statistical applications, particularly in clinical studies, hypotheses may carry different levels of importance, motivating the use of weighted multiple testing procedures (wMTPs) to control the familywise error rate (FWER). Among these approaches, two weighted Holm procedures are commonly used: the weighted Holm procedure (WHP), which is based on ordered weighted $p$-values, and the weighted alternative Holm procedure (WAP), which relies on ordered raw $p$-values. This paper provides a systematic comparison of these two procedures, along with practical recommendations for their use. We first examine their corresponding closed testing procedures (CTPs) and show that WHP is uniformly more powerful than WAP. We further investigate their structural properties, demonstrating that WAP, while consonant, lacks monotonicity. To facilitate communication with non-statisticians, we introduce graphical representations of both procedures using a common initial graph and distinct updating strategies. In addition, we derive adjusted $p$-values and adjusted weighted $p$-values for both methods. Finally, we establish an optimality result: WHP cannot be improved by enlarging any of its critical values without violating FWER control, whereas WAP is optimal only under specific conditions. Simulation studies support these theoretical findings and highlight the superior FWER control and average power of WHP.
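
To make the step-down logic concrete, the following is a minimal Python sketch of a weighted Holm test on ordered weighted $p$-values. It assumes the textbook formulation (visit hypotheses in increasing order of $p_i/w_i$ and compare each with $α$ divided by the sum of the weights not yet rejected); the function name `weighted_holm` and the example inputs are hypothetical and are not taken from the paper.

```python
def weighted_holm(pvalues, weights, alpha=0.05):
    """Step-down weighted Holm test on ordered weighted p-values p_i / w_i.

    Assumption: textbook formulation, not code from the paper.
    Returns a list of booleans (True = hypothesis rejected).
    """
    if len(pvalues) != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")

    # Visit hypotheses from the smallest to the largest weighted p-value.
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i] / weights[i])
    remaining_weight = sum(weights)
    rejected = [False] * len(pvalues)

    for i in order:
        # Reject if p_i / w_i <= alpha / (sum of weights not yet rejected).
        if pvalues[i] / weights[i] <= alpha / remaining_weight:
            rejected[i] = True
            remaining_weight -= weights[i]
        else:
            break  # step-down: stop at the first non-rejection
    return rejected


# Hypothetical example: the first hypothesis carries half of the total weight.
print(weighted_holm([0.01, 0.04, 0.03], [0.5, 0.25, 0.25]))
```

With equal weights the rule reduces to the ordinary Holm procedure, which is a convenient sanity check for the sketch.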

2604.19996 2026-04-23 stat.ME stat.AP

Meta-analysis of networks of diagnostic tests with binary and continuous results

Efthymia Derezea, Gabriel Rogers, Nicky J Welton, Hayley E Jones

Details
English abstract

Network meta-analysis of diagnostic test accuracy (NMA-DTA) is a relatively new field that combines evidence across studies to evaluate and compare the accuracy of different tests for a given condition. However, the methods proposed to date cannot always capture complex aspects of the data. In particular, many commonly used diagnostic tests are continuous biomarkers, whose accuracy is evaluated at multiple thresholds within a study. With current NMA-DTA methods, it is feasible to include only a few thresholds per study, which discards a large amount of potentially useful data. We introduce an approach that can efficiently encompass all available data. This is a hierarchical model that incorporates multinomial likelihoods for studies reporting results across multiple thresholds and a parametric structure for the relationship between the probability of testing positive and threshold within each disease class. This approach enables us to obtain accuracy estimates of tests across the whole range of observed thresholds, while retaining all the useful properties of standard NMA-DTA methods. We explore different variations of this model based on different covariance structures, the inclusion of study-level random effects, and the addition of a further hierarchical structure on the test-level variance components. This framework is applied to data from two systematic reviews, allowing the inclusion of a larger number of tests (compared to alternative approaches) and estimation of sensitivity and specificity at different thresholds with increased precision.

2604.19977 2026-04-23 stat.ME

Constructing external comparator groups via transportability in mean or in effect measure

Lawson Ung, Guanbo Wang, Sebastien Haneuse, Sonia Hernandez-Diaz, Miguel A. Hernán, Issa J. Dahabreh

Details
English abstract

Learning about causal effects in target populations and their subsets may be facilitated by combining information from multiple sources. One major class of study designs that combine information involves appending an index study with data from an external comparator, which may facilitate head-to-head comparisons of treatments initially studied in different populations. We delineate external comparator analyses under two distinct, but related, identification strategies. The first strategy relies on exchangeability (transportability) of potential outcome means, which uses information only on the treatments that are to be compared. The second strategy relies on transportability in effect measure, requiring additional use of information on a third treatment common to the populations that have been combined. In a time-fixed setting with a point treatment and non-failure time outcome, we examine identification and estimation under a basic setup where information from an index trial is combined with a second, and external to the index trial, data source. We propose estimators for identifying observed data functionals, with a particular focus on semiparametric efficient augmented weighting estimators that incorporate models for the probability of trial participation, the probability of treatment, and conditional outcome means. We derive the asymptotic properties of these augmented weighting estimators -- including robustness to model misspecification and slower rates of convergence for some nuisance function models -- and use simulation to compare their finite sample performance to estimators based only on outcome modeling or weighting. Last, we provide a practical demonstration of the proposed methods by combining the ACCEPT and PHOENIX 1 randomized trials to evaluate the effect of various biologic agents on plaque psoriasis, a chronic inflammatory disorder.

2604.19972 2026-04-23 stat.ME

Principal Nested Cones

Yanyan Zhan, Ian L. Dryden, Yuexuan Wu

Details
English abstract

In many applications, the data lie on a type of cone, where there is a distinction between an overall scale variable and the remaining scale-free structure. For example, the joint size and shape of objects are points on a cone, where size represents scale, and shape is the scale-free structure. Dimension reduction is central in such applications, as shape data are often high-dimensional. Interactions between shape and size are widespread and of significant interest in real-world applications. However, most existing methods either lack a single notion of size or focus solely on shape, effectively removing size information. We propose Principal Nested Cones (PNC), a nonlinear dimension reduction framework that preserves both shape and size. PNC represents data through a sequence of nested hypercones and progressively projects observations onto lower-dimensional cone spaces. The resulting PNC scores provide low-dimensional representations that jointly capture size-shape variation in an interpretable manner. To enable scalable computation in ultra-high-dimensional settings, we develop a fast approximation combining PCA-based transformation with standard PNC. Simulation studies and real data applications demonstrate that PNC captures nonlinear size-shape structure, improves representation and reconstruction, and yields interpretable insights across morphometric, developmental, and molecular datasets.

2604.19841 2026-04-23 stat.AP cs.LG

Spatio-temporal modelling of electric vehicle charging demand

Kaoutar Bouaachra, Yvenn Amara-Ouali, Yannig Goude, Raphaël Lachieze-Rey

Comments 18 pages, 19 figures

Details
English abstract

Accurate forecasting of electric vehicle (EV) charging demand is critical for grid management and infrastructure planning. Yet the field continues to rely on legacy benchmarks, such as the Palo Alto (2020) dataset, that fail to reflect the scale and behavioral diversity of modern charging networks. To address this, we introduce a novel large-scale longitudinal dataset collected across Scotland (2022-2025), which we release as an open benchmark for the community. Building on this dataset, we formulate EV charging demand as a spatio-temporal latent Gaussian field and perform approximate Bayesian inference via Integrated Nested Laplace Approximation (INLA). The resulting model jointly captures spatial dependence, temporal dynamics, and covariate effects within a unified probabilistic framework. On station-level forecasting tasks, our approach achieves competitive predictive accuracy against machine learning baselines, while additionally providing principled uncertainty quantification and interpretable spatial and temporal decompositions, properties that are essential for risk-aware infrastructure planning.

2604.12783 2026-04-23 stat.ME econ.EM

A Bayes-Factor-Guided Approach to Post-Double Selection with Bootstrapped Multiple Imputation

Johannes Bleher, Claudia Tarantola

Comments 33 pages, 8 figures, 11 tables

Details
English abstract

When variable selection methods are applied to bootstrapped and multiply imputed datasets, the set of selected variables typically varies across iterations. Aggregating results via the union rule can lead to overly dense models. We propose a sequential evidence aggregation procedure that models detection outcomes across perturbation iterations as Bernoulli trials and accumulates evidence for variable relevance through a likelihood-ratio process admitting an approximate Bayes-factor interpretation. The procedure provides both a variable inclusion criterion and a stopping rule that eliminates the need to fix the number of bootstrap-imputation iterations ex ante. A Monte Carlo study across 126 scenarios and an empirical illustration demonstrate the method's performance relative to existing aggregation approaches.

2604.12694 2026-04-23 stat.CO

Adaptive Sparse Group Lasso Penalized Quantile Regression via Dual ADMM

Huayan Kou, Yuwen Gu, Yi Lian, Rui Zhang, Jun Fan

Details
English abstract

Sparse penalized quantile regression provides an effective framework for variable selection and robust estimation in high-dimensional data analysis. When explanatory variables are organized into groups, achieving sparsity both within and between groups is essential. However, existing quantile regression methods often fail to meet this dual objective. To address this gap, we introduce the adaptive sparse group lasso penalized quantile regression, which integrates adaptive lasso and adaptive group lasso penalties. We optimize the model parameters via the alternating direction method of multipliers (ADMM) applied to the dual problem, and establish global convergence. Through extensive simulation studies and real data analyses, we demonstrate (i) the efficacy of the proposed method in achieving simultaneous within- and between-group sparsity, and (ii) the computational efficiency of our algorithm relative to existing alternatives.
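
For orientation, an objective of this kind can be written generically as below. This is a sketch under stated assumptions rather than the paper's exact formulation: the symbols $\lambda_1$, $\lambda_2$, the adaptive weights $w_j$ and $v_g$, the group index $g = 1, \dots, G$, and the $1/n$ scaling are all notation introduced here for illustration.

```latex
\min_{\beta \in \mathbb{R}^p} \;
  \frac{1}{n} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i^\top \beta\right)
  + \lambda_1 \sum_{j=1}^{p} w_j \,\lvert \beta_j \rvert
  + \lambda_2 \sum_{g=1}^{G} v_g \,\lVert \beta_{(g)} \rVert_2 ,
\qquad
\rho_\tau(u) = u \left( \tau - \mathbf{1}\{u < 0\} \right).
```

The first penalty term encourages sparsity within groups and the second encourages sparsity between groups, while $\rho_\tau$ is the usual check loss at quantile level $\tau$.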

2604.02219 2026-04-23 hep-ph hep-ex physics.data-an stat.ME

Many Wrongs Make a Right: Leveraging Biased Simulations Towards Unbiased Parameter Inference

Ezequiel Alvarez, Sean Benevedes, Manuel Szewc, Jesse Thaler

Comments 29 pages, 18 figures, 1 table, code available at https://github.com/sequi76/TAMM and data products available at https://zenodo.org/records/19341120. v2: version to be submitted

Details
English abstract

In particle physics, as in many areas of science, parameter inference relies on simulations to bridge the gap between theory and experiment. Recent developments in simulation-based inference have boosted the sensitivity of analyses; however, biases induced by simulation-data mismodeling can be difficult to control within standard inference pipelines. In this work, we propose a Template-Adapted Mixture Model to confront this problem in the context of signal fraction estimation: inferring the population proportion of signal in a mixed sample of signal and background, both of which follow arbitrarily complex distributions. We harness many biased simulations to perform data-driven estimates of each process distribution in the signal region, substantially reducing the bias on the signal fraction due to the domain shift between simulation and reality. We explore different methodological choices, including model selection, feature representation, and statistical method, and apply them to a Gaussian toy example and to a semi-realistic di-Higgs measurement. We find that the presented methods successfully leverage the biased simulations to provide estimates with well-calibrated uncertainties.

2603.29316 2026-04-23 stat.AP

A Bayesian Finite Mixture Model Approach for Mixed-type Data Clustering and Variable Selection with Censored Biomarkers

Yueting Wang, Shu Wang, Jonathan G. Yabes, Chung-Chou H. Chang

Comments 55 pages (including 17-page Appendices), 8 figures (including 1 figure in Appendix B)

Details
English abstract

Clustering mixed-type data to uncover clinically meaningful subgroups within heterogeneous patient populations remains a major challenge in biomedical research. Most existing clustering methods impose restrictive assumptions such as local independence, fail to accommodate censored biomarkers, or are unable to quantify variable importance. We propose a Bayesian finite mixture model (BFMM) clustering framework that addresses these limitations. BFMM flexibly models both continuous and categorical variables, incorporates three covariance structures to capture cluster-specific dependencies among continuous features, and handles censored observations through likelihood-based imputation. To facilitate feature prioritization, BFMM uses spike-and-slab priors to estimate variable importance on a continuous 0-1 scale. Simulation studies demonstrate that BFMM outperforms existing methods in clustering accuracy, particularly given strong within-cluster correlation or censored variables, and reliably distinguishes informative features from noise under varying conditions. We applied BFMM to two real-world datasets: (1) the SENECA cohort integrating electronic health records from patients with sepsis; and (2) the EDEN randomized trial of patients with acute lung injury. In both settings, BFMM identified clinically interpretable phenotypes and revealed variable-specific contributions to subgroup differentiation. In the EDEN trial, it also uncovered evidence of treatment heterogeneity. These findings validate BFMM as an effective, interpretable, and practically useful clustering tool for complex biomedical datasets.

2602.19774 2026-04-23 stat.AP

Spatio-temporal modeling of urban extreme rainfall events at high resolution

Chloé Serre-Combe, Nicolas Meyer, Thomas Opitz, Gwladys Toulemonde

Details
English abstract

Modeling precipitation and its accumulation over time and space is essential for flood risk assessment. In this paper, we analyze rainfall data collected over several years through a micro-scale precipitation sensor network in Montpellier, France. A novel spatio-temporal stochastic model is proposed for high-resolution urban extreme rainfall that combines realistic marginal behaviour with a flexible dependence structure. Marginally, rainfall intensities are described by the Extended Generalized Pareto Distribution (EGPD), capturing both moderate and extreme events without threshold selection. Based on peaks-over-threshold theory for spatial processes, dependence during extreme episodes is modeled by an r-Pareto process with a non-separable variogram allowing for episode-specific advection, such that the displacement of rainfall cells is represented explicitly. Based on a catalog of extreme space-time episodes extracted from observations, parameters are estimated by a new composite likelihood based on joint exceedance indicators. Empirical advection velocities are derived beforehand from a radar reanalysis dataset. We show that the model accurately reproduces the spatio-temporal structure of extreme rainfall observed in the Montpellier OMSEV network and enables realistic stochastic scenario generation for flood risk assessment.

2512.12463 2026-04-23 stat.ML cs.LG math.ST stat.TH

Understanding Overparametrization in Survival Models through Interpolation

Yin Liu, Jianwen Cai, Didong Li

Details
English abstract

Classical statistical learning theory predicts a U-shaped relationship between test loss and model capacity, driven by the bias-variance trade-off. Recent advances in modern machine learning have revealed a more complex pattern, double-descent, in which test loss, after peaking near the interpolation threshold, decreases again as model capacity continues to grow. While this behavior has been extensively analyzed in regression and classification, its manifestation in survival analysis remains unexplored. This study investigates overparametrization in four representative survival models: DeepSurv, PC-Hazard, Nnet-Survival, and N-MTLR. We rigorously define interpolation and finite-norm interpolation, two key characteristics of loss-based models to understand double-descent. We then show the existence (or absence) of (finite-norm) interpolation of all four models. Our findings clarify how likelihood-based losses and model implementation jointly determine the feasibility of interpolation and show that overparametrization should not be regarded as benign for survival models. All theoretical results are supported by numerical experiments that highlight the distinct generalization behaviors of survival models.

2512.12325 2026-04-23 cs.LG math.ST stat.ML stat.TH

Eventually LIL Regret: Almost Sure $\ln\ln T$ Regret for a sub-Gaussian Mixture on Unbounded Data

Shubhada Agrawal, Aaditya Ramdas

Comments Published at ALT 2026

Details
English abstract

We prove that a classic sub-Gaussian mixture proposed by Robbins in a stochastic setting actually satisfies a path-wise (deterministic) regret bound. For every path in a natural "Ville event" $\mathcal E_α$, this regret up to time $T$ is bounded by $\ln^2(1/α)/V_T + \ln (1/α) + \ln \ln V_T$ up to universal constants, where $V_T$ is a nonnegative, nondecreasing, cumulative variance process. (The bound reduces to $\ln(1/α) + \ln \ln V_T$ if $V_T \geq \ln(1/α)$.) If the data are stochastic, one can show that $\mathcal E_α$ has probability at least $1-α$ under a wide class of distributions (e.g., sub-Gaussian, symmetric, variance-bounded). In fact, we show that on the Ville event $\mathcal E_0$ of probability one, the regret on every path in $\mathcal E_0$ is eventually bounded by $\ln \ln V_T$ (up to constants). We explain how this work helps bridge the world of adversarial online learning (which usually deals with regret bounds for bounded data) with game-theoretic statistics (which can handle unbounded data, albeit using stochastic assumptions). In short, conditional regret bounds serve as a bridge between stochastic and adversarial betting.

2511.23156 2026-04-23 stat.AP

Design loads for wave impacts -- introducing the Probabilistic Adaptive Screening (PAS) method for predicting extreme non-linear loads on maritime structures

Sanne M. van Essen, Harleigh C. Seyffert

Details
Journal ref
van Essen, S.M. and Seyffert, H.C. (2026). Design loads for wave impacts - The Probabilistic Adaptive Screening (PAS) method for extreme non-linear hydrodynamic loads and responses of maritime structures. Ocean Eng., 357p2, 125440
English abstract

Wave impact loads on maritime structures can cause casualties, damage, pollution and operational delays. Consequently, their extreme values should be accounted for in the design of these structures. However, this is challenging, as wave impact events are both rare and highly complex, requiring both high-fidelity simulations and long analysis durations to reliably quantify the associated design loads. Moreover, existing extreme value prediction methods are neither specifically developed nor adequately validated for wave impact phenomena. We therefore introduce the new Probabilistic Adaptive Screening (PAS) method for predicting extreme non-linear loads on maritime structures. The method integrates copula-based statistical dependence modelling with multi-fidelity screening and adaptive sampling. This framework enables efficient extreme value prediction by statistically mapping low-fidelity indicator variables to high-fidelity impact loads. The method allows for efficient linear potential flow indicators to be used in the low-fidelity stage, even for strongly non-linear cases. Its statistical framework is validated against four non-linear test cases, including non-linear waves, ship vertical bending moments, green water impact loads, and slamming loads. It is concluded that PAS with optimal settings accurately estimates both the short-term distributions and extreme values in these test cases, with most probable maximum (MPM) values within 2-15% of the reference brute-force Monte-Carlo Simulation (MCS) results. In addition, PAS achieves this performance very efficiently, requiring in the order of 1-3% of the high-fidelity simulation time needed for conventional MCS. These results demonstrate that PAS can reliably reproduce the statistics of both weakly and strongly non-linear extreme load problems, while significantly reducing the associated computational cost compared to MCS.

2510.13233 2026-04-23 stat.ME stat.CO

Scalable Bayesian inference for high-dimensional mixed-type multivariate spatial data

Arghya Mukherjee, Arnab Hazra, Dootika Vats

Comments 52 pages, 8 figures, 13 tables

Details
English abstract

Spatial generalized linear mixed-effects models are popularly used to analyze spatially indexed univariate responses. However, with modern technology, it is common to observe vector-valued mixed-type responses, e.g., a combination of binary, count, or continuous types, at each location. Methods for jointly modeling such mixed-type multivariate spatial responses are rare. Using multivariate Gaussian processes (GPs) in the latent layer, we present a class of Bayesian spatial methods applicable to any combination of exponential family responses. Since multivariate GP-based methods can suffer from computational bottlenecks when the number of spatial locations is high, we further employ a computationally efficient Vecchia approximation for fast posterior inference and prediction. Key theoretical properties of the proposed model, such as identifiability and the structure of the induced covariance, are established. Our approach employs a Markov chain Monte Carlo-based inference method that uses elliptical slice sampling within a blocked Metropolis-within-Gibbs sampling framework. We illustrate the efficacy of the proposed method through simulation studies and a real-data application on joint modeling of wildfire counts and burnt areas across the United States.

2510.09902 2026-04-23 math.HO stat.ML

If you can distinguish, you can express: Galois theory, Stone--Weierstrass, machine learning, and linguistics

Ben Blum-Smith, Claudia Brugman, Thomas Conners, Soledad Villar

Comments Added a section that engages with relevant recent work

Details
English abstract

This essay develops a parallel between the Fundamental Theorem of Galois Theory and the Stone--Weierstrass theorem: both can be viewed as assertions that tie the distinguishing power of a class of objects to their expressive power. We provide an elementary theorem connecting the relevant notions of "distinguishing power". We also discuss machine learning and data science contexts in which these theorems, and more generally the theme of links between distinguishing power and expressive power, appear. Finally, we discuss the same theme in the context of linguistics, where it appears as a foundational principle, and illustrate it with several examples.

2510.08465 2026-04-23 stat.ML cs.LG

Accumulated Aggregated D-Optimal Designs for Estimating Main Effects in Black-Box Models

Chih-Yu Chang, Ming-Chung Chang

Details
English abstract

Estimating how individual input variables affect the output of a black-box model is a central task in explainable machine learning. However, existing methods suffer from two key limitations: sensitivity to out-of-distribution (OOD) evaluations, which arises when query points are placed far from the data manifold, and instability under feature correlation, which can lead to unreliable effect estimates in practice. We introduce a unified view of main effect estimation as a design problem, which reveals that all existing methods differ only in their choice of evaluation locations. Building on this formulation, we propose A2D2E, an Estimator based on Accumulated Aggregated D-Optimal Designs, which replaces evaluations with a D-optimal hypercube design to minimize the variance of main effect estimation. A2D2E is model-agnostic, requires no differentiability of the predictor, and admits a closed-form estimator with complexity comparable to existing approaches. We establish that A2D2E is consistent for the same population target as accumulated local effects (ALE), and extend this result to the realistic setting where only a surrogate model is available. Through extensive simulations across multiple predictive models and dependence settings, we demonstrate that A2D2E outperforms ALE-based methods, with the largest gains under high feature correlation.

2510.03665 2026-04-23 stat.ME stat.CO

Efficient Log-Rank Updates for Random Survival Forests

Erik Sverdrup, James Yang, Michael LeBlanc

Details
English abstract

Random survival forests are widely used for estimating covariate-conditional survival functions under right-censoring. Their standard log-rank splitting criterion is typically recomputed at each candidate split. This O(M) cost per split, with M the number of distinct event times in a node, creates a bottleneck for large cohort datasets with long follow-up. We revisit approximations proposed by LeBlanc and Crowley (1995) and develop simple constant-time updates for the log-rank criterion. The method is implemented in grf for R and reduces training time on large datasets while preserving predictive accuracy.
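
As a point of reference for the per-split cost mentioned above, here is a minimal Python sketch of the standard two-sample log-rank statistic for one candidate split. It is not the constant-time update proposed in the paper and not the grf implementation; the function name `logrank_statistic` and its argument names are hypothetical. For readability each iteration rescans the node, so this sketch costs more than the O(M) that an implementation with precomputed risk-set counts would need.

```python
import math


def logrank_statistic(times, events, in_left):
    """Standardized two-sample log-rank statistic for one candidate split.

    times   : observed follow-up times
    events  : 1 if the event was observed, 0 if right-censored
    in_left : True if the observation falls in the left child
    Assumption: textbook log-rank formula, written for clarity, not speed.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:  # loop over the M distinct event times
        at_risk = sum(1 for s in times if s >= t)
        at_risk_left = sum(1 for s, l in zip(times, in_left) if l and s >= t)
        deaths = sum(1 for s, e in zip(times, events) if e and s == t)
        deaths_left = sum(
            1 for s, e, l in zip(times, events, in_left) if e and l and s == t
        )
        frac_left = at_risk_left / at_risk
        # Observed events in the left child minus their expectation under the null.
        observed_minus_expected += deaths_left - deaths * frac_left
        if at_risk > 1:
            # Hypergeometric variance contribution at this event time.
            variance += (
                deaths * frac_left * (1 - frac_left)
                * (at_risk - deaths) / (at_risk - 1)
            )
    if variance <= 0:
        return 0.0  # degenerate split: every observation in one child
    return observed_minus_expected / math.sqrt(variance)
```

A tree-growing routine would evaluate this statistic for every candidate split and keep the split with the largest absolute value, which is why the cost per evaluation matters.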

2509.19367 2026-04-23 eess.SP cs.LG stat.ML

Low-Cost Sensor Fusion Framework for Organic Substance Classification and Quality Control Using Classification Methods

Borhan Uddin Chowdhury, Damian Valles, Md Raf E Ul Shougat

Comments Copyright 2025 IEEE. This is the author's version of the work accepted for publication in FMLDS 2025. The final version will be published by IEEE and available via DOI (to be inserted when available). Accepted at FMLDS 2025, to appear in IEEE Xplore. 8 pages, 17 figures, 3 tables

Details
Journal ref
2025 IEEE International Conference on Frontiers in Machine Learning and Data Science (FMLDS), IEEE, 2025
English abstract

We present a sensor-fusion framework for rapid, non-destructive classification and quality control of organic substances, built on a standard Arduino Mega 2560 microcontroller platform equipped with three commercial environmental and gas sensors. All data used in this study were generated in-house: sensor outputs for ten distinct classes - including fresh and expired samples of apple juice, onion, garlic, and ginger, as well as cinnamon and cardamom - were systematically collected and labeled using this hardware setup, resulting in a unique, application-specific dataset. Correlation analysis was employed as part of the preprocessing pipeline for feature selection. After preprocessing and dimensionality reduction (PCA/LDA), multiple supervised learning models - including Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), each with hyperparameter tuning, as well as an Artificial Neural Network (ANN) and an ensemble voting classifier - were trained and cross-validated on the collected dataset. The best-performing models, including tuned Random Forest, ensemble, and ANN, achieved test accuracies in the 93 to 94 percent range. These results demonstrate that low-cost, multisensory platforms based on the Arduino Mega 2560, combined with advanced machine learning and correlation-driven feature engineering, enable reliable identification and quality control of organic compounds.

2509.17260 2026-04-23 q-bio.NC cs.OH stat.AP

A tutorial on electrogastrography using low-cost hardware and open-source software

Evgeniya Anisimova, Sameer N. B. Alladin, Styliani Tsamaz, Edwin S. Dalmaijer

Details
English abstract

Electrogastrography is the recording of changes in electric potential caused by the stomach's pacemaker region, typically through several cutaneous sensors placed on the abdomen. It is a worthwhile technique in medical and psychological research, but also relatively niche. Here we present a tutorial on the acquisition and analysis of the human electrogastrogram. Because dedicated equipment and software can be prohibitively expensive, we demonstrate how data can be acquired using a low-cost OpenBCI Ganglion amplifier. We also present a processing pipeline that minimises attrition, which is particularly helpful for low-cost equipment but also applicable to top-of-the-line hardware. Our approach comprises outlier rejection, frequency filtering, movement filtering, and noise reduction using independent component analysis. Where traditional approaches include a subjective step in which only one channel is manually selected for further analysis, our pipeline recomposes the electrogastrogram from all recorded channels after automatic rejection of nuisance components. The main benefits of this approach are reduced attrition, retention of data from all recorded channels, and reduced influence of researcher bias. In addition to our tutorial on the method, we offer a proof-of-principle in which our approach leads to reduced data rejection compared to established methods. We aimed to describe each step in sufficient detail to be implemented in any programming language. In addition, we made an open-source Python package freely available for ease of use.

2508.18948 2026-04-23 hep-th cond-mat.dis-nn cs.LG stat.ML

Gauge-covariant stochastic neural fields: Stability and finite-width effects

Rodrigo Carmo Terin

Comments 20 pages, 2 figures, 1 table. Accepted version for publication in Scientific Reports

Details
English abstract

We develop a gauge-covariant stochastic effective field theory for stability and finite-width effects in deep neural systems. The model uses classical commuting fields: a complex matter field, a real Abelian connection field, and a fictitious stochastic depth variable. Using the Martin--Siggia--Rose--Janssen--de~Dominicis formalism, we derive its functional representation and a two-replica linear-response construction defining the maximal Lyapunov exponent and the amplification factor for the edge of chaos. Finite-width effects appear as perturbative corrections to dressed kernels, and the marginality condition remains unchanged at the order considered for fixed kernel geometry. Numerically, finite-width multilayer perceptrons follow the mean-field instability threshold, and a linear stochastic effective sector reproduces the predicted low-frequency spectral deformation.

2508.17761 2026-04-23 cs.LG stat.ML

Evaluating the Quality of the Quantified Uncertainty for (Re)Calibration of Data-Driven Regression Models

Jelke Wibbeke, Nico Schönfisch, Sebastian Rohjans, Andreas Rauh

Details
Journal ref
International Journal of Approximate Reasoning, Volume 195, 2026, 109685, ISSN 0888-613X
English abstract

In safety-critical applications data-driven models must not only be accurate but also provide reliable uncertainty estimates. This property, commonly referred to as calibration, is essential for risk-aware decision-making. In regression a wide variety of calibration metrics and recalibration methods have emerged. However, these metrics differ significantly in their definitions, assumptions and scales, making it difficult to interpret and compare results across studies. Moreover, most recalibration methods have been evaluated using only a small subset of metrics, leaving it unclear whether improvements generalize across different notions of calibration. In this work, we systematically extract and categorize regression calibration metrics from the literature and benchmark these metrics independently of specific modelling methods or recalibration approaches. Through controlled experiments with real-world, synthetic and artificially miscalibrated data, we demonstrate that calibration metrics frequently produce conflicting results. Our analysis reveals substantial inconsistencies: many metrics disagree in their evaluation of the same recalibration result, and some even indicate contradictory conclusions. This inconsistency is particularly concerning as it potentially allows cherry-picking of metrics to create misleading impressions of success. We identify the Expected Normalized Calibration Error (ENCE) and the Coverage Width-based Criterion (CWC) as the most dependable metrics in our tests. Our findings highlight the critical role of metric selection in calibration research.
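
To illustrate what one of the named metrics measures, the following is a minimal Python sketch of the Expected Normalized Calibration Error. It assumes the binned definition commonly used in the regression-calibration literature: sort samples by predicted standard deviation, split them into bins, and average the relative gap between the root mean predicted variance and the root mean squared error per bin. The function name `ence`, the equal-count binning, and the default of two bins are illustrative choices, not details taken from the paper.

```python
import math


def ence(y_true, y_pred, sigma_pred, n_bins=2):
    """Expected Normalized Calibration Error with equal-count bins.

    Assumption: commonly used binned definition; binning details are
    illustrative and may differ from the benchmark in the paper.
    """
    if any(s <= 0 for s in sigma_pred):
        raise ValueError("predicted standard deviations must be positive")
    order = sorted(range(len(sigma_pred)), key=lambda i: sigma_pred[i])
    bin_size = len(order) // n_bins
    if bin_size == 0:
        raise ValueError("need at least one sample per bin")

    total = 0.0
    for b in range(n_bins):
        start = b * bin_size
        # The last bin absorbs any leftover samples.
        stop = len(order) if b == n_bins - 1 else start + bin_size
        idx = order[start:stop]
        # Root mean predicted variance within the bin.
        rmv = math.sqrt(sum(sigma_pred[i] ** 2 for i in idx) / len(idx))
        # Root mean squared error within the bin.
        rmse = math.sqrt(sum((y_true[i] - y_pred[i]) ** 2 for i in idx) / len(idx))
        total += abs(rmv - rmse) / rmv
    return total / n_bins
```

A value of zero means the predicted spread matches the observed error in every bin; in this sketch, predicted standard deviations that are twice the observed error give a value of 0.5.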

2508.03059 2026-04-23 stat.ME stat.CO stat.ML

Two-sample comparison through additive tree models for density ratios

Naoki Awaya, Yuliang Xu, Li Ma

Details
English abstract

The ratio of two densities provides a direct characterization of their differences. We consider the two-sample comparison problem by estimating this ratio given i.i.d. observations from two distributions. To this end, we propose additive tree models for density ratio estimation along with efficient algorithms using a new loss function, the balancing loss. The loss allows tree-based models to be trained using several algorithms originally designed for supervised learning, such as forward-stagewise optimization and gradient boosting. Moreover, the balancing loss resembles an exponential family kernel, and it can serve as a pseudo-likelihood with conjugate priors. This property enables generalized Bayesian inference on the density ratio using backfitting samplers designed for Bayesian additive regression trees (BART). Our Bayesian strategy provides uncertainty quantification for the inferred density ratio, which is critical for applications involving high-dimensional and data-limited distributions with potentially substantial uncertainty. We further show connections of the balancing loss to the exponential loss in binary classification and to the variational form of f-divergence, particularly the squared Hellinger distance. Numerical experiments demonstrate that our method achieves both accuracy and computational efficiency, while uniquely providing uncertainty quantification. Finally, we demonstrate its application to assessing the quality of generative models for microbiome compositional data.

2508.02569 2026-04-23 stat.AP

Understanding Heterogeneity in Adaptation to Intermittent Water Supply: Clustering Household Types in Amman, Jordan

Shreyas Gadge, Vítor V. Vasconcelos, André de Roos, Elisabeth H. Krueger

Details
English abstract

More than a billion people around the world experience intermittence in their water supply, posing challenges for urban households in Global South cities. An intermittent water supply (IWS) system prompts water users to adapt to service deficits, which entails coping costs. Adaptation and its impacts can vary between households within the same city, leading to intra-urban inequality. Studies on household adaptation to IWS through survey data are limited to exploring income-based heterogeneity and do not account for the multidimensional and non-linear nature of the data. There is a need for a standardized methodology for understanding household responses to IWS that acknowledges the heterogeneity of households characterized by sets of multiple underlying factors and that is applicable across different settings. Here, we develop an analysis pipeline that applies hierarchical clustering analysis (HCA) in combination with the Welch two-sample t-test on household survey data from Amman, Jordan. We identify three clusters of households distinguished by a set of characteristics including income, water social network, supply duration, relocation and water quality problems and identify their group-specific adaptive strategies such as contacting the utility or accessing an alternate water source. This study uncovers the unequal nature of IWS adaptation in Amman, giving insights into the link between household characteristics and adaptive behaviors, while proposing a standardized method to reveal relevant heterogeneity in households adapting to IWS.

2507.18279 2026-04-23 math.ST math.AP math.DS math.PR stat.TH

Data assimilation with the 2D Navier-Stokes equations: Optimal Gaussian asymptotics for the posterior measure

Dimitri Konen, Richard Nickl

Details
English abstract

A functional Bernstein-von Mises theorem is proved for posterior measures arising in a data assimilation problem with the two-dimensional Navier-Stokes equation where a Gaussian process prior is assigned to the initial condition of the system. The posterior measure, which provides the update in the space of all trajectories arising from a discrete sample of the (deterministic) dynamics, is shown to be approximated by a Gaussian random vector field arising from the solution to a linear parabolic PDE with Gaussian initial condition. The approximation holds in the strong sense of the supremum norm on the regression functions, showing that predicting future states of Navier-Stokes systems admits $\sqrt{N}$-consistent estimators even for commonly used nonparametric models. Consequences for coverage of credible bands and uncertainty quantification are discussed. A local asymptotic minimax theorem is derived that describes the lower bound for estimating the state of the nonlinear system, which is shown to be attained by the Bayesian data assimilation algorithm.

2507.16433 2026-04-23 stat.ME cs.LG

Adaptive Multi-task Learning for Multi-sector Portfolio Optimization

Qingliang Fan, Ruike Wu, Yanrong Yang

Details
English abstract

Accurate transfer of information across multiple sectors to enhance model estimation is both significant and challenging in multi-sector portfolio optimization involving a large number of assets in different classes. Within the framework of factor modeling, we propose a novel data-adaptive multi-task learning methodology that quantifies and learns the relatedness among the principal temporal subspaces (spanned by factors) across multiple sectors under study. This approach not only improves the simultaneous estimation of multiple factor models but also enhances multi-sector portfolio optimization, which heavily depends on the accurate recovery of these factor models. Additionally, a novel and easy-to-implement algorithm, termed projection-penalized principal component analysis, is developed to accomplish the multi-task learning procedure. Diverse simulation designs and a practical application to daily return data from the Russell 3000 index demonstrate the advantages of the multi-task learning methodology.

2506.20910 2026-04-23 math.OC cs.LG stat.ML

Faster Fixed-Point Methods for Multichain MDPs

Matthew Zurek, Yudong Chen

Details
Journal ref
NeurIPS 2025
English abstract

We study value-iteration (VI) algorithms for solving general (a.k.a. multichain) Markov decision processes (MDPs) under the average-reward criterion, a fundamental but theoretically challenging setting. Beyond the difficulties inherent to all average-reward problems posed by the lack of contractivity and non-uniqueness of solutions to the Bellman operator, in the multichain setting an optimal policy must solve the navigation subproblem of steering towards the best connected component, in addition to optimizing long-run performance within each component. We develop algorithms which better solve this navigational subproblem in order to achieve faster convergence for multichain MDPs, obtaining improved rates of convergence and sharper measures of complexity relative to prior work. Many key components of our results are of potential independent interest, including novel connections between average-reward and discounted problems, optimal fixed-point methods for discounted VI which extend to general Banach spaces, new sublinear convergence rates for the discounted value error, and refined suboptimality decompositions for multichain MDPs. Overall our results yield faster convergence rates for discounted and average-reward problems and expand the theoretical foundations of VI approaches.
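For orientation, the sketch below shows plain discounted value iteration on a tabular MDP. This is the textbook baseline that the abstract's discounted-to-average-reward connections refer to, not the accelerated algorithm proposed in the paper. The data layout (`P[s][a]` as a list of `(prob, next_state)` pairs, `r[s][a]` as a reward) and all names are assumptions made for this example.

```python
def discounted_value_iteration(P, r, gamma=0.9, tol=1e-10, max_iter=100_000):
    """Iterate the discounted Bellman optimality operator until the sup-norm change is below tol.

    P[s][a] is a list of (prob, next_state) pairs; r[s][a] is the immediate reward.
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(max_iter):
        # One Bellman backup: best action value in every state.
        new_v = [
            max(
                r[s][a] + gamma * sum(p * v[t] for p, t in P[s][a])
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v
```

A toy instance with a navigation flavor: state 0 can stay (reward 0 forever) or move to the absorbing state 1 (reward 1 per step). With `gamma=0.5` the values should converge to about 1 and 2, reflecting that moving to the better component is optimal.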

2506.20904 2026-04-23 cs.LG cs.IT math.IT math.OC stat.ML

Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL

Matthew Zurek, Guy Zamir, Yudong Chen

Details
Journal ref
NeurIPS 2025
English abstract

We study offline reinforcement learning in average-reward MDPs, which presents increased challenges from the perspectives of distribution shift and non-uniform coverage, and has been relatively underexamined from a theoretical perspective. While previous work obtains performance guarantees under single-policy data coverage assumptions, such guarantees utilize additional complexity measures which are uniform over all policies, such as the uniform mixing time. We develop sharp guarantees depending only on the target policy, specifically the bias span and a novel policy hitting radius, yielding the first fully single-policy sample complexity bound for average-reward offline RL. We are also the first to handle general weakly communicating MDPs, contrasting restrictive structural assumptions made in prior work. To achieve this, we introduce an algorithm based on pessimistic discounted value iteration enhanced by a novel quantile clipping technique, which enables the use of a sharper empirical-span-based penalty function. Our algorithm also does not require any prior parameter knowledge for its implementation. Remarkably, we show via hard examples that learning under our conditions requires coverage assumptions beyond the stationary distribution of the target policy, distinguishing single-policy complexity measures from previously examined cases. We also develop lower bounds nearly matching our main result.

2506.16658 2026-04-23 math.ST cs.LG stat.ML stat.TH

Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards

Wenlong Ji, Yihan Pan, Ruihao Zhu, Lihua Lei

Details
English abstract

Multi-armed bandit (MAB) is a widely adopted framework for sequential decision-making under uncertainty. Traditional bandit algorithms rely solely on online data, which tends to be scarce as it must be gathered during the online phase when the arms are actively pulled. However, in many practical settings, rich auxiliary data, such as covariates of past users, is available prior to deploying any arms. We introduce a new setting for MAB where pre-trained machine learning (ML) models are applied to convert side information and historical data into surrogate rewards. A prominent challenge of this setting is that the surrogate rewards may exhibit substantial bias, as true reward data is typically unavailable in the offline phase, forcing ML predictions to heavily rely on extrapolation. To address the issue, we propose the Machine Learning-Assisted Upper Confidence Bound (MLA-UCB) algorithm, which can be applied to any reward prediction model and any form of auxiliary data. When the predicted and true rewards are jointly Gaussian, it provably improves the cumulative regret, even in cases where the mean surrogate reward completely misaligns with the true mean rewards, and achieves the asymptotic optimality among a broad class of policies. Notably, our method requires no prior knowledge of the covariance matrix between true and surrogate rewards. We further extend the method to a batched reward MAB problem, where each arm pull yields a batch of observations and rewards may be non-Gaussian, and we derive computable confidence bounds and regret guarantees that improve upon classical UCB algorithms. Finally, extensive simulations with both Gaussian and ML-generated surrogates, together with real-world studies on language model selection and video recommendation, demonstrate consistent and often substantial regret reductions with moderate offline surrogate sample sizes and correlations.

2506.14103 2026-04-23 stat.ME q-bio.QM

A Robust Nonparametric Framework for Detecting Repeated Spatial Patterns

Rajitha Senanayake, Pratheepa Jeganathan

Comments 39 pages including an Appendix of 17 pages, 39 figures

Details
English abstract

Identifying spatially contiguous clusters and repeated spatial patterns (RSP) characterized by similar underlying distributions that are spatially apart is a key challenge in modern spatial statistics. Existing constrained clustering methods enforce spatial contiguity but are limited in their ability to identify RSP. We propose a novel nonparametric framework that addresses this limitation by combining constrained clustering with a post-clustering reassignment step based on the maximum mean discrepancy (MMD) statistic. We employ a block permutation strategy within each cluster that preserves local attribute structure when approximating the null distribution of the MMD. We also show that the MMD$^2$ statistic is asymptotically consistent under second-order stationarity and spatial mixing conditions. This two-stage approach enables the detection of clusters that are both spatially distant and similar in distribution. Through simulation studies that vary spatial dependence, cluster sizes, shapes, and multivariate dimensionality, we demonstrate the robustness of our proposed framework in detecting RSP. We further illustrate its applicability through an analysis of spatial proteomics data from patients with triple-negative breast cancer. Overall, our framework presents a methodological advancement in spatial clustering, offering a flexible and robust solution for spatial datasets that exhibit repeated patterns.
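To make the MMD$^2$ statistic concrete, here is a minimal sketch of its biased (V-statistic) estimate between two samples with a Gaussian kernel. It does not include the block permutation scheme or the reassignment step described in the abstract; the kernel choice, the bandwidth default, and the function names are assumptions made for this example.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd2_biased(X, Y, bandwidth=1.0):
    """Biased estimate of squared MMD: mean k(X,X) + mean k(Y,Y) - 2 * mean k(X,Y)."""
    def mean_kernel(A, B):
        total = sum(gaussian_kernel(a, b, bandwidth) for a in A for b in B)
        return total / (len(A) * len(B))
    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2.0 * mean_kernel(X, Y)
```

Two clusters with similar attribute distributions should give a value near zero, while clearly separated distributions give a value approaching 2 for this kernel.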

2506.02276 2026-04-23 cs.LG stat.ML

Latent Stochastic Interpolants

Saurabh Singh, Dmitry Lagun

Comments Accepted at ICLR 2026 as a conference paper

Details
English abstract

Stochastic Interpolants (SI) is a powerful framework for generative modeling, capable of flexibly transforming between two probability distributions. However, its use in jointly optimized latent variable models remains unexplored as it requires direct access to the samples from the two distributions. This work presents Latent Stochastic Interpolants (LSI) enabling joint learning in a latent space with end-to-end optimized encoder, decoder and latent SI models. We achieve this by developing a principled Evidence Lower Bound (ELBO) objective derived directly in continuous time. The joint optimization allows LSI to learn effective latent representations along with a generative process that transforms an arbitrary prior distribution into the encoder-defined aggregated posterior. LSI sidesteps the simple priors of the normal diffusion models and mitigates the computational demands of applying SI directly in high-dimensional observation spaces, while preserving the generative flexibility of the SI framework. We demonstrate the efficacy of LSI through comprehensive experiments on the standard large scale ImageNet generation benchmark.

2505.17803 2026-04-23 stat.ME

Anytime-valid simultaneous lower confidence bounds for the true discovery proportion

Friederike Preusse

Details
English abstract

We propose a method that combines the closed testing framework with the concept of safe anytime-valid inference (SAVI) to compute lower confidence bounds for the true discovery proportion in a multiple testing setting. The proposed procedure provides confidence bounds that are valid at every observation time point and that are simultaneous for all possible subsets of hypotheses. While the hypotheses are assumed to be fixed over time, the subsets of interest may vary. Anytime-valid simultaneous confidence bounds allow us to sequentially update the bounds over time and allow for optional stopping. This is a desirable property in practical applications such as neuroscience, where data acquisition is costly and time-consuming. We also present a computational shortcut which makes the application of the proposed procedure feasible when the number of hypotheses under consideration is large. We illustrate the performance of the proposed method in a simulation study and give some practical guidelines on the implementation of the proposed procedure.

2504.10530 2026-04-23 math.PR stat.CO

Efficient Rare-Event Simulation for Random Geometric Graphs via Importance Sampling

Sarat Moka, Christian Hirsch, Volker Schmidt, Dirk Kroese

Comments 29 Pages, 2 figures

Details
English abstract

Random geometric graphs defined on Euclidean subspaces, also called Gilbert graphs, are widely used to model spatially embedded networks across various domains. In such graphs, nodes are located at random in Euclidean space, and any two nodes are connected by an edge if they lie within a certain distance threshold. Accurately estimating rare-event probabilities related to key properties of these graphs, such as the number of edges and the size of the largest connected component, is important in the assessment of risk associated with catastrophic incidents, for example. However, this task is computationally challenging, especially for large networks. Importance sampling offers a viable solution by concentrating computational efforts on significant regions of the graph. This paper explores the application of an importance sampling method to estimate rare-event probabilities, highlighting its advantages in reducing variance and enhancing accuracy. Through asymptotic analysis and numerical studies, we demonstrate the effectiveness of our methodology, contributing to improved analysis of Gilbert graphs and showcasing the broader applicability of importance sampling in complex network analysis.

2503.16744 2026-04-23 stat.ME stat.AP

Modeling and forecasting subnational age distribution of death counts

Han Lin Shang, Cristian F. Jiménez-Varón

Comments 45 pages, 9 figures, 7 tables

Details
English abstract

Existing mortality forecasting methods focus on age-specific mortality rates, which lie in an unconstrained space and overlook the distributional nature of life-table death counts. Few studies have developed and compared forecasting methods that model the shape and dynamics of the age distribution of deaths, especially at the subnational level, where data quality varies greatly. This paper presents several forecasting methods to model and forecast the subnational age distribution of death counts. The age distribution of death counts has many similarities to probability density functions, which are non-negative and have a constrained integral, and thus live in a constrained nonlinear space. To address the nonlinear nature of objects, we implement a cumulative distribution function transformation that is scale-free and has additional monotonicity. Using subnational Japanese life-table death counts from the Japanese Mortality Database (2025), we evaluate the forecast accuracy of the transformation and forecasting methods. The improved forecast accuracy of life-table death counts implemented here will be of great interest to demographers in estimating regional age-specific survival probabilities and life expectancy, and to actuaries as a foundation for exploring potential applications in determining annuity prices for various ages and maturities.

2502.06151 2026-04-23 cs.LG cs.AI stat.ML

Recency Biased Causal Attention for Time-series Forecasting

Kareem Hegazy, Michael W. Mahoney, N. Benjamin Erichson

Details
Journal ref
Proceedings of the 29th International Conference on Artificial Intelligence and Statistics (AISTATS) 2026
English abstract

Recency bias is a useful inductive prior for sequential modeling: it emphasizes nearby observations and can still allow longer-range dependencies. Standard Transformer attention lacks this property, relying on all-to-all interactions that overlook the causal and often local structure of temporal data. We propose a simple mechanism to introduce recency bias by reweighting attention scores with a smooth heavy-tailed decay. This adjustment strengthens local temporal dependencies without sacrificing the flexibility to capture broader and data-specific correlations. We show that recency-biased attention consistently improves sequential modeling, aligning Transformer more closely with the read, ignore, and write operations of RNNs. Finally, we demonstrate that our approach achieves competitive and often superior performance on challenging time-series forecasting benchmarks.
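The reweighting idea can be sketched as follows: causal attention weights where each score is penalized by a heavy-tailed function of the lag before the softmax. The abstract does not give the exact decay, so the power law `(1 + lag)^(-alpha)`, the parameter `alpha`, and the function name are assumptions made for this example rather than the authors' formulation.

```python
import math

def recency_biased_causal_attention(scores, alpha=1.0):
    """Turn raw scores[i][j] into causal attention weights with a power-law recency decay.

    Each row of the result sums to 1 and is zero for future positions j > i.
    """
    weights = []
    for i, row in enumerate(scores):
        # Causal mask: position i attends only to j <= i.
        # Subtracting alpha * log(1 + lag) multiplies the softmax weight by (1 + lag)^(-alpha).
        logits = [row[j] - alpha * math.log1p(i - j) for j in range(i + 1)]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - top) for l in logits]
        z = sum(exps)
        weights.append([e / z for e in exps] + [0.0] * (len(row) - i - 1))
    return weights
```

With all raw scores equal and `alpha=1`, the last row of a 3-by-3 input is proportional to 1/3, 1/2, 1, so the most recent position receives the largest weight while older positions keep non-negligible mass.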

2412.07999 2026-04-23 math.ST math.PR stat.ML stat.TH

Fast Mixing of Data Augmentation Algorithms: Bayesian Probit, Logit, and Lasso Regression

Holden Lee, Kexin Zhang

Comments 48 pages, 8 figures; Refined theorem statements and simulations

Details
English abstract

We propose using a modified conductance-based method to study the mixing time of an important class of two-block Gibbs samplers, the data augmentation (DA) algorithm. Using this method, we prove the first non-asymptotic polynomial upper bounds on mixing times of three important DA algorithms: those for Bayesian Probit regression (Albert and Chib, 1993, ProbitDA), Bayesian Logit regression (Polson, Scott, and Windle, 2013, LogitDA), and Bayesian Lasso regression (Park and Casella, 2008, Rajaratnam et al., 2015, LassoDA). Concretely, for ProbitDA and LogitDA, we demonstrate a tight bound that explicitly depends on the design matrix and prior covariance matrix. Under the assumption that data are independently generated from either a sub-Gaussian or log-concave distribution and properly scaled, the bound implies that with $η$-warm start, parameter dimension $d$, and sample size $n$, with high probability over data, the two algorithms require $\mathcal{O}\left(n\log \left(\frac{\log η}{ε}\right)\right)$ steps to obtain samples with at most $ε$ error in TV, KL, or $χ^2$ distance. Meanwhile, we show that under minimal data assumptions, LassoDA requires $\mathcal{O}\left(d^2(d\log d +n \log n)^2 \log \left(\frac{η}{ε}\right)\right)$ steps to achieve $ε$-accuracy in TV distance. The results are generally applicable to settings with large $n$ and large $d$, including settings with highly imbalanced response data in Probit and Logit regression. We compare them with the best known guarantees of Langevin Monte Carlo and Metropolis Adjusted Langevin Algorithm. We evaluate our theoretical results using numerical examples, and discuss the mixing times of the three algorithms under feasible initialization.

2410.18880 2026-04-23 math.ST math.PR stat.TH

Can we spot a fake?

Shahar Mendelson, Grigoris Paouris, Roman Vershynin

Comments 13 pages. A few typos corrected

Details
English abstract

The problem of detecting fake data inspires the following seemingly simple mathematical question. Sample a data point $X$ from the standard normal distribution in $\mathbb{R}^n$. An adversary observes $X$ and corrupts it by adding a vector $rt$, where they can choose any vector $t$ from a fixed set $T$ of the adversary's "tricks", and where $r>0$ is a fixed radius. The adversary's choice of $t=t(X)$ may depend on the true data $X$. The adversary wants to hide the corruption by making the fake data $X+rt$ statistically indistinguishable from the real data $X$. What is the largest radius $r=r(T)$ for which the adversary can create an undetectable fake? We show that for highly symmetric sets $T$, the detectability radius $r(T)$ is approximately twice the scaled Gaussian width of $T$. The upper bound actually holds for arbitrary sets $T$ and generalizes to arbitrary, non-Gaussian distributions of real data $X$. The lower bound may fail for sets $T$ that are not highly symmetric, but we conjecture that this problem can be solved by considering the focused version of the Gaussian width of $T$, which focuses on the most important directions of $T$.

2409.18198 2026-04-23 stat.AP

Estimating soil carbon sequestration potential and approximating optimal management policies

Jacob Spertus, Eric Slessarev, Whendee Silver, Philip Stark

Comments 26 pages, 6 figures, 1 table

Details
English abstract

The impact of a management intervention on the soil organic carbon (SOC) stored in a given volume of soil is moderated by features that determine that soil's sequestration potential under that intervention. To maximize total SOC sequestration cost efficiently, interventions should be targeted to soils with the highest responses and lowest intervention costs. We present a framework for estimating SOC sequestration potentials and approximating efficient management policies. We review relevant sources of measurement uncertainty and formalize policy choice using potential outcomes. An optimal sequestration policy can be approximated by modeling SOC measurements as functions of covariates within each treatment group, using the fitted models to estimate SOC sequestration potential for each plot, and finding the policy that maximizes the average of those estimates. The modeling can use linear regression or other algorithms to learn relationships between features and SOC sequestration potential. We demonstrate this method using data from a study of compost amendments applied to California rangelands. We find that the plots exhibit treatment effects moderated by baseline SOC -- so targeting amendments to plots with lower baseline SOC would increase overall SOC sequestration rates. We evaluate these methods further in simulated field experiments. Refined policy estimates sequestered more SOC than uniform application of the single management policy estimated to have the largest average treatment effect, especially when SOC sequestration potential could be predicted from observed features. We conclude by discussing baseline SOC moderation, observational studies, inference, cost models, and broader policy uncertainties.

2408.00920 2026-04-23 cs.LG stat.ML

Towards Certified Unlearning for Deep Neural Networks

Binchi Zhang, Yushun Dong, Tianhao Wang, Jundong Li

Comments ICML 2024 (errata)

Details
English abstract

In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees. However, its application to deep neural networks (DNNs), known for their highly nonconvex nature, still poses challenges. To bridge the gap between certified unlearning and DNNs, we propose several simple techniques to extend certified unlearning methods to nonconvex objectives. To reduce the time complexity, we develop an efficient computation method by inverse Hessian approximation without compromising certification guarantees. In addition, we extend our discussion of certification to nonconvergence training and sequential unlearning, considering that real-world users can send unlearning requests at different time points. Extensive experiments on three real-world datasets demonstrate the efficacy of our method and the advantages of certified unlearning in DNNs.

2406.05262 2026-04-23 stat.AP

A Three-groups Non-local Model for Combining Heterogeneous Data Sources to Identify Genes Associated with Parkinson's Disease

Troy P. Wixson, Benjamin A. Shaby, Daisy L. Philtron, International Parkinson Disease Genomics Consortium, Leandro A. Lima, Stacia K. Wyman, Julia A. Kaye, Steven Finkbeiner

Comments 26 pages, 6 figures, 4 tables. This version includes the supplementary materials. Author version. Accepted for publication in Biometrics (04-2026)

Details
English abstract

We seek to identify genes involved in Parkinson's Disease (PD) by combining information across different experiment types. Each experiment, taken individually, may contain too little information to distinguish some important genes from incidental ones. However, when experiments are combined using the proposed statistical framework, additional power emerges. The fundamental building block of the family of statistical models that we propose is a hierarchical three-group mixture of distributions. Each gene is modeled probabilistically as belonging to either a null group that is unassociated with PD, a deleterious group, or a beneficial group. This three-group formalism has two key features. By apportioning prior probability of group assignments with a Dirichlet distribution, the resultant posterior group probabilities automatically account for the multiplicity inherent in analyzing many genes simultaneously. By building models for experimental outcomes conditionally on the group labels, any number of data modalities may be combined in a single coherent probability model, allowing information sharing across experiment types. These two features result in parsimonious inference with few false positives, while simultaneously enhancing power to detect signals. Simulations show that our three-groups approach performs at least as well as commonly-used tools for GWAS and RNA-seq, and in some cases it performs better. We apply our proposed approach to publicly-available GWAS and RNA-seq datasets, discovering novel genes that are potential therapeutic targets.

2402.13103 2026-04-23 cs.LG math.ST stat.TH

Multivariate Functional Linear Discriminant Analysis for the Classification of Short Time Series with Missing Data

Rahul Bordoloi, Clémence Réda, Orell Trautmann, Saptarshi Bej, Olaf Wolkenhauer

Details
English abstract

Functional linear discriminant analysis (FLDA) is a powerful tool that extends LDA-mediated multiclass classification and dimension reduction to univariate time-series functions. However, in the age of large multivariate and incomplete data, statistical dependencies between features must be estimated in a computationally tractable way, while also dealing with missing data. There is a need for a computationally tractable approach that considers the statistical dependencies between features and can handle missing values. We here develop a multivariate version of FLDA (MUDRA) to tackle this issue and describe an efficient expectation/conditional-maximization (ECM) algorithm to infer its parameters. We assess its predictive power on the "Articulary Word Recognition" data set and show its improvement over the state-of-the-art, especially in the case of missing data. MUDRA allows interpretable classification of data sets with large proportions of missing data, which will be particularly useful for medical or psychological data sets.

2312.17015 2026-04-23 stat.ME

Regularized Exponentially Tilted Empirical Likelihood for Bayesian Inference

Eunseop Kim, Steven N. MacEachern, Mario Peruggia

Details
English abstract

Bayesian inference with empirical likelihood faces a challenge as the posterior domain is a proper subset of the original parameter space due to the convex hull constraint. We propose a regularized exponentially tilted empirical likelihood to address this issue. Our method removes the convex hull constraint using a novel regularization technique, incorporating a continuous exponential family distribution to satisfy a Kullback-Leibler divergence criterion. The regularization arises as a limiting procedure where pseudo-data are added to the formulation of exponentially tilted empirical likelihood in a structured fashion. We show that this regularized exponentially tilted empirical likelihood retains certain desirable asymptotic properties with improved finite sample performance. Simulation and data analysis demonstrate that the proposed method provides a suitable pseudo-likelihood for Bayesian inference.

2105.09232 2026-04-23 cs.LG math.ST stat.TH

Diffusion Approximations for Thompson Sampling in the Small Gap Regime

Lin Fan, Peter W. Glynn

Details
English abstract

We study the process-level dynamics of Thompson sampling and related sampling-based bandit algorithms in the "small gap" regime, where the gaps between the arm means are of order $\sqrt{γ}$ or smaller and the time horizon is of order $1/γ$, with $γ\downarrow 0$. In this regime, as $γ\downarrow 0$, we show that the process-level dynamics of such algorithms converge weakly to the solutions to certain stochastic differential equations and stochastic ordinary differential equations. Our weak convergence theory is developed using the Continuous Mapping Theorem, which provides a direct and modular theoretical approach that can be adapted to analyze a variety of sampling-based bandit algorithms and handle weakly dependent reward processes. A central finding is an algorithmic invariance principle: in the small gap regime, the limit dynamics of a broad class of sampling-based algorithms -- including Thompson sampling with general single-parameter exponential family likelihoods, as well as non-parametric bandit algorithms based on bootstrap re-sampling -- all coincide with those of Thompson sampling with Gaussian likelihoods. Moreover, in the small gap regime, the regret performance of these algorithms is generally insensitive to model mis-specification, changing continuously with increasing degrees of mis-specification.
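The small gap regime can be simulated with a few lines. The sketch below runs Thompson sampling with Gaussian likelihoods on arms whose means differ by about $\sqrt{γ}$ over a horizon of about $1/γ$. The unit reward variance, the flat prior (posterior mean equal to the sample mean, posterior standard deviation $1/\sqrt{n}$), the initial round-robin pulls, and all names are assumptions made for this example, not details from the paper.

```python
import math
import random

def gaussian_thompson_sampling(means, horizon, seed=0):
    """Run Gaussian Thompson sampling with unit-variance rewards; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(horizon):
        if t < k:
            arm = t  # pull each arm once so every posterior is defined
        else:
            # Draw one sample per arm from N(sample mean, 1 / n) and play the largest draw.
            draws = [
                rng.gauss(sums[a] / counts[a], 1.0 / math.sqrt(counts[a]))
                for a in range(k)
            ]
            arm = max(range(k), key=lambda a: draws[a])
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

For instance, with `gamma = 0.01` one could call `gaussian_thompson_sampling([0.0, math.sqrt(gamma)], round(1 / gamma))`. In this regime the gap is too small to resolve within the horizon, so pull counts stay random rather than concentrating on one arm, which is the behavior the diffusion limits are meant to describe.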

1810.11624 2026-04-23 cs.LG stat.ML

Time series clustering based on the characterisation of segment typologies

David Guijo-Rubio, Antonio Manuel Durán-Rosal, Pedro Antonio Gutiérrez, Alicia Troncoso, César Hervás-Martínez

Comments 13 pages, 7 figures, 4 tables, 57 refs

Details
Journal ref
IEEE Transactions on Cybernetics (Volume: 51, Issue: 11, November 2021)
English abstract

Time series clustering is the process of grouping time series with respect to their similarity or characteristics. Previous approaches usually combine a specific distance measure for time series and a standard clustering method. However, these approaches do not take the similarity of the different subsequences of each time series into account, which can be used to better compare the time series objects of the dataset. In this paper, we propose a novel technique of time series clustering based on two clustering stages. In a first step, a least squares polynomial segmentation procedure is applied to each time series, which is based on a growing window technique that returns different-length segments. Then, all the segments are projected into the same dimensional space, based on the coefficients of the model that approximates the segment and a set of statistical features. After mapping, a first hierarchical clustering phase is applied to all mapped segments, returning groups of segments for each time series. These clusters are used to represent all time series in the same dimensional space, after defining another specific mapping process. In a second and final clustering stage, all the time series objects are grouped. We consider internal clustering quality to automatically adjust the main parameter of the algorithm, which is an error threshold for the segmentation. The results obtained on 84 datasets from the UCR Time Series Classification Archive have been compared against two state-of-the-art methods, showing that the performance of this methodology is very promising.

2604.20832 2026-04-23 math.OC stat.ME

Solving Minimax Problems with Bilinear Objectives with ADMM

Bob Wilson

Comments 9 pages, 1 figure (color)

Details
English abstract

We consider minimax (saddle-point) problems of the form $\max_{c \in C} \min_{β \in S} g(c; β)$, where $C$ and $S$ are compact convex sets, and $g$ is concave-convex. Applying the Alternating Direction Method of Multipliers (ADMM) requires evaluating a proximal operator that is, in general, as hard as the original problem. We show that when the outcome function $g$ is bilinear, i.e. $g(c; β) = c^\top A β$, the proximal operator reduces to a generalized projection onto the confidence region $S$. This reduction is exact -- it involves no approximation or linearization. The resulting ADMM algorithm alternates between (i) a generalized projection onto $S$ and (ii) a Euclidean projection onto $C$. We describe the derivation, state the algorithm, and discuss convergence.

2604.20788 2026-04-23 math.ST stat.TH

The E-measure

Nick W. Koning

Details
English abstract

We introduce the E-measure: a measure-like generalization of the E-value to a class of hypotheses. Unlike classical measures, E-measures are closed under infimums instead of addition. They arise from a compatibility axiom with logical implications, that there should be at least as much evidence against more specific hypotheses. We show that E-measures are the only non-dominated such objects, if the hypothesis class is closed under intersections. We propose to use the E-measure to present all the relevant evidence for a problem, where the relevance is captured by the choice of hypothesis class. We showcase this by applying the E-measure to decision making, inducing a hypothesis class from the uncertain consequences of decisions. This results in uniform E-consequence bounds on decisions, which nest high-probability loss bounds. Correcting for multiplicity, we consider 'familywise evidence' and 'false evidence rate' control, generalizing from errors and discoveries to continuous evidence. Remarkably, E-measures control these without multiplicity correction if the hypothesis class is intersection-closed. Moreover, we obtain a 'frequentist' notion of updating from E-prior to E-posterior. Abstracting the notion of a 'hypothesis', we advocate for using E-measures for any unknown quantity, leading to predictive E-measures.

2604.20761 2026-04-23 stat.ML stat.ME

Geometric Renyi Differential Privacy: Ricci Curvature Characterized by Heat Diffusion Mechanisms

Xiaotian Chang, Yangdi Jiang, Cyrus Mostajeran, Qirui Hu

Details
English abstract

In this paper, we develop a novel privacy mechanism for Riemannian manifold-valued data. Our key contribution lies in uncovering unexpected connections among geometric analysis, heat diffusion models, and differential privacy (DP). We characterize the Renyi divergence via dimension-free Harnack inequalities on Riemannian manifolds and establish Renyi differential privacy guarantees governed by Ricci curvature. For manifolds with nonnegative Ricci curvature, we propose a mechanism based on heat diffusion. In contrast, for general manifolds we introduce a Langevin-process-based approach that yields intrinsic mechanisms supporting normalization-free sampling and continuous privacy-utility trade-offs. We derive detailed utility analyses for both mechanisms. As a statistical application, we develop privacy-preserving estimation of the generalized Frechet mean, including nontrivial sensitivity analysis and phase transition characterizations. Numerical experiments further demonstrate the advantages of the proposed DP mechanisms over existing approaches.

2604.20743 2026-04-23 stat.ME stat.CO

ProfileGLMM: a R Package Extending Bayesian Profile Regression using Generalised Linear Mixed Models

Matteo Amestoy, Mark A. van de Wiel, Wessel N. van Wieringen

Details
English abstract

ProfileGLMM is an R package integrating Generalised Linear Mixed Models (GLMMs) as the outcome model for Bayesian profile regression. This statistical framework simultaneously i) explains the variation in the outcome and ii) clusters the observations based on a specified set of interdependent clustering covariates. The derived cluster memberships are then incorporated, alongside others, as explanatory variables in the regression to model the outcome. This framework efficiently handles complex, highly correlated covariate structures whose direct inclusion in a standard regression model would be statistically sub-optimal. ProfileGLMM significantly extends Bayesian profile regression's scope by resolving two key constraints of previous implementations: 1) it allows the analysis of hierarchical and longitudinal data structures through the inclusion of random effects, and 2) it enables the study of interactions between latent clusters and other observable covariates. ProfileGLMM accommodates various data types, supporting both continuous and binary outcomes and both categorical and continuous clustering covariates. Built on fast Rcpp code with minimal mandatory parameters, ProfileGLMM offers a flexible analytical tool. It significantly enhances the utility of profile regression for researchers in fields such as epidemiology, social sciences, and clinical studies dealing with complex data.

2602.19201 2026-04-23 econ.EM stat.ME

Panel Quantile Regression with Common Shocks

Harold D. Chiang, Antonio F. Galvao, Chia-Min Wei

Details
English abstract

This paper develops an asymptotic and inferential theory for fixed-effects panel quantile regression (FEQR) that delivers inference robust to pervasive common shocks. Such shocks induce cross-sectional dependence that is central in many economic and financial panels but largely ignored in existing FEQR theory, which typically assumes cross-sectional independence and requires $T \gg N$. We show that the standard FEQR estimator remains asymptotically normal under the mild condition $(\log N)^2/T \to 0$, thereby accommodating empirically relevant regimes, including those with $T \ll N$. We further show that common shocks fundamentally alter the asymptotic covariance structure, rendering conventional covariance estimators inconsistent, and we propose a simple covariance estimator that remains consistent both in the presence and absence of common shocks. The proposed procedure therefore provides valid robust inference without requiring prior knowledge of the dependence structure, substantially expanding the applicability of FEQR methods in realistic panel data settings.

2601.06674 2026-04-23 math.ST math.PR stat.TH

Reduction and classification of higher-order Markov chains

Christophe Gallesco, Caio Teodore Genovese Huss Oliveira, Daniel Yasumasa Takahashi

Comments 9 pages, 5 figures

Details
English abstract

We study the class structure of finite-alphabet Markov chains with arbitrary memory length. To capture the structural constraints induced by prohibited transitions, we introduce the skeleton of a higher-order transition kernel, defined as a reduced set of contexts encoding all essential zero-probability patterns. To each skeleton we associate a binary transition matrix. We show that the communicating class structure of this matrix completely determines the recurrent classes of the original higher-order Markov chain, along with their periods. As a consequence, simple criteria for essential irreducibility and periodicity follow directly from the skeleton, without constructing the full first-order representation on the enlarged state space. From a practical perspective, this approach can yield significant computational gains. An example illustrates how the skeleton may have substantially smaller order than the original chain.

2512.05070 2026-04-23 stat.ML cs.LG

Control Consistency Losses for Diffusion Bridges

Samuel Howard, Nikolas Nüsken, Jakiw Pidstrigach

Details
English abstract

Simulating the conditioned dynamics of diffusion processes, given their initial and terminal states, is an important but challenging problem in the sciences. The difficulty is particularly pronounced for rare events, for which the unconditioned dynamics rarely reach the terminal state. In this work, we propose a novel approach for learning diffusion bridges based on a self-consistency property of the optimal control. The resulting algorithm learns the conditioned dynamics in an iterative online manner, and exhibits strong performance in a range of empirical settings without requiring differentiation through simulated trajectories. Beyond the diffusion bridge setting, we draw connections between our self-consistency framework and recent advances in the wider stochastic optimal control literature.

2511.18201 2026-04-23 stat.ME

Spatial deformation in a Bayesian spatiotemporal model for incomplete matrix-variate responses

Rodrigo de Souza Bulhões, Marina Silva Paez, Dani Gamerman

Comments Submitted to Environmental and Ecological Statistics

Details
English abstract

In this paper, we propose a Bayesian matrix-variate spatiotemporal modeling framework for jointly analyzing multiple response variables observed at spatial locations over time. The approach relaxes the standard assumption of spatial isotropy by incorporating a deformation-based mechanism, allowing the covariance structure to capture directional effects and nonstationary spatial dependence. Temporal dynamics are modeled through dynamic linear models, enabling coherent uncertainty propagation within a state-space formulation. Missing observations are handled via a data augmentation strategy that preserves the joint structure of the multivariate responses. The proposed methodology is evaluated through simulation studies and an application to air quality data. Results indicate that accounting for spatial deformation leads to substantial gains in predictive performance in anisotropic settings, while cross-variable dependence plays a secondary role in improving overall fit. The framework is computationally tractable for moderate numbers of spatial locations and responses, and provides a flexible basis for modeling multivariate spatiotemporal processes under incomplete data.

2409.07609 2026-04-23 cs.CR cs.CV cs.LG stat.AP

Survival of the Cheapest: Cost-Aware Hardware Adaptation for Adversarial Robustness

Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth

Details
English abstract

Deploying adversarially robust machine learning systems requires continuous trade-offs between robustness, cost, and latency. We present an autonomic decision-support framework providing a quantitative foundation for adaptive hardware selection and hyper-parameter tuning in cloud-native deep learning. The framework applies accelerated failure time (AFT) models to quantify the effect of hardware choice, batch size, epochs, and validation accuracy on model survival time. This framework can be naturally integrated into an autonomic control loop (monitor--analyse--plan--execute, MAPE-K), where system metrics such as cost, robustness, and latency are continuously evaluated and used to adapt model configurations and hardware selection. Experiments across three GPU architectures confirm the framework is both sound and cost-effective: the Nvidia L4 yields a 20% increase in adversarial survival time while costing 75% less than the V100, demonstrating that expensive hardware does not necessarily improve robustness. The analysis further reveals that model inference latency is a stronger predictor of adversarial robustness than training time or hardware configuration.

2406.03302 2026-04-23 stat.ME math.ST stat.TH

Identification strategies for combining an experimental study with external data

Lawson Ung, Guanbo Wang, Sebastien Haneuse, Miguel A. Hernán, Issa J. Dahabreh

Comments This is an update of the original submission

Details
English abstract

There is increasing interest in combining information from experimental studies, including randomized and single-group trials, with information from external experimental or observational data sources. Such efforts are usually motivated by the desire to compare treatments evaluated in different studies -- for instance, by constructing external comparator groups for some index study -- or to estimate treatment effects with greater precision. Proposals to combine experimental studies with external data were made at least as early as the 1970s, but in recent years have come under increasing consideration within clinical practice and by regulatory agencies involved in drug and device evaluation, particularly with the increasing availability of trial and observational data. In this paper, we describe basic study templates that combine information from experimental studies with external data, and use the potential (counterfactual) outcomes framework to elaborate identification strategies for potential outcome means and average treatment effects. We argue that these identification strategies inherit ideas relevant to the study of causation in single-source studies and the related literature on combining information (e.g., generalizability and transportability methods), but merit consideration as a separate class of causal problems because they differ in terms of their scientific motivations, definitions of the target population, sampling, data structures, and identifiability conditions. In formalizing identification strategies for the analyses described herein, we hope to provide a conceptual foundation to support the systematic use and evaluation of such efforts.

2404.16746 2026-04-23 stat.ME math.ST stat.ML stat.TH

Estimating the Number of Components in Finite Mixture Models via Variational Approximation

Chenyang Wang, Yun Yang

Details
English abstract

This work introduces a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes, inspired by the large-sample properties of the Evidence Lower Bound (ELBO) derived from mean-field (MF) variational approximation. Specifically, we establish matching upper and lower bounds for the ELBO without assuming conjugate priors, suggesting the consistency of model selection for FMMs based on maximizing the ELBO. As a by-product of our proof, we demonstrate that the MF approximation inherits the stable behavior (benefited from model singularity) of the posterior distribution, which tends to eliminate the extra components under model misspecification where the number of mixture components is over-specified. This stable behavior also leads to the $n^{-1/2}$ convergence rate for parameter estimation, up to a logarithmic factor, under this model overspecification. Empirical experiments are conducted to validate our theoretical findings and compare with other state-of-the-art methods for selecting the number of components in FMMs.

2604.20667 2026-04-23 stat.ME

Data Integration for Estimating Subgroup-Specific Conditional Average Treatment Effects (CATEs) Using Coarsened External Information in Randomized Trials

Youqi Yang, Walter Dempsey, Bhramar Mukherjee

Comments 25 pages, 4 figures

Details
English abstract

Randomized controlled trials (RCTs) are often underpowered to detect treatment heterogeneity in subgroups defined by cross-classifications of multiple covariates, due to sparse sample sizes in some strata. External RCT data can help, but typically provide treatment effect estimates at a coarser level (e.g., by sex or race) rather than for the finer subgroups of interest (e.g., race-by-sex). We propose a novel James-Stein (JS)-type estimator that borrows strength from such coarsened external estimates to improve estimation of finer subgroup-specific conditional average treatment effects (CATEs) in an internal study, while accommodating potential incompatibility in marginal CATEs across populations. Based on asymptotic theory, we derive a practical analytic variance estimator for the JS estimator that exhibits acceptable empirical performance. Under mild conditions, we show that the proposed estimator uniformly dominates the ordinary least squares (OLS) estimator based on internal data regarding a weighted quadratic loss. Simulation studies demonstrate favorable performance compared with existing shrinkage methods, including empirical Bayes and generalized ridge estimators. We illustrate our method by estimating race-by-sex subgroup CATEs in a tirzepatide weight-loss trial (SURMOUNT-1), borrowing sex-specific and race-specific estimates from two previous semaglutide trials (STEP 1 and STEP 2). The proposed method detects a significantly larger treatment effect on percentage weight loss in the female-White subgroup than in the female-Asian subgroup, a difference not detected using internal data alone.
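As background for the shrinkage idea, here is a minimal sketch of the classical positive-part James-Stein estimator, which pulls a vector of estimates toward a fixed target. It is not the estimator proposed in the paper: the function name `james_stein_shrink`, the known common variance `sigma2`, and the use of an external `target` vector are assumptions made purely for illustration.

```python
def james_stein_shrink(estimates, target, sigma2):
    """Positive-part James-Stein shrinkage of `estimates` toward `target`.

    Assumes each estimate is unbiased with the same known variance `sigma2`
    and that there are at least three components.
    """
    p = len(estimates)
    if p < 3 or len(target) != p:
        raise ValueError("need matching vectors with at least 3 components")
    diffs = [x - t for x, t in zip(estimates, target)]
    dist2 = sum(d * d for d in diffs)
    if dist2 == 0.0:
        return list(target)  # estimates already equal the target
    # Shrinkage factor, truncated at zero (the "positive-part" rule).
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / dist2)
    return [t + factor * d for t, d in zip(target, diffs)]


# Hypothetical subgroup estimates shrunk toward an external target of zero.
print(james_stein_shrink([3.0, 4.0, 0.0], [0.0, 0.0, 0.0], sigma2=1.0))
```

The further the internal estimates sit from the target relative to their noise, the less they are shrunk. That adaptivity is what lets a shrinkage estimator tolerate an external source that is partly incompatible with the internal data.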

2604.20632 2026-04-23 astro-ph.IM stat.ME

Review: A new method for estimation and use of systematic errors in Poisson regression

M. Bonamente

Comments Accepted for Frontiers in Astronomy and Space Sciences - Astrostatistics. This is a review of https://ui.adsabs.harvard.edu/abs/2025ApJ...980..139B and https://ui.adsabs.harvard.edu/abs/2025ApJ...980..140B presented at the sys2025 workshop in Huntsville, AL (Nov 14-17, 2025)

Details
English abstract

A new method for including systematic errors in the regression with Poisson data is reviewed in this contribution, with emphasis on applications to astronomical spectra. The method consists of generalizing the usual Poisson log-likelihood, known as the Cash statistic $C_{min}$, and its associated likelihood-ratio statistic $ΔC$, to include the presence of systematic sources of uncertainty. Advantages of this new method include its modeling simplicity and its ability to assess both the level of systematics and the goodness of fit at the same time, including for a nested model component.
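For readers unfamiliar with the baseline statistic, the sketch below evaluates the standard Cash statistic, $C = 2 \sum_i \left[ m_i - d_i + d_i \ln(d_i / m_i) \right]$, for observed counts $d_i$ and model predictions $m_i$. It covers only the usual Poisson form, without systematic errors: the generalization reviewed in the paper is not reproduced here, and the function name `cash_statistic` is an assumption.

```python
import math


def cash_statistic(counts, model):
    """Standard Cash statistic for Poisson counts `counts` and model means `model`."""
    if len(counts) != len(model):
        raise ValueError("counts and model must have the same length")
    total = 0.0
    for d, m in zip(counts, model):
        if m <= 0:
            raise ValueError("model means must be positive")
        total += m - d
        if d > 0:
            # The d * ln(d / m) term is taken as 0 when d == 0.
            total += d * math.log(d / m)
    return 2.0 * total


# A constant model compared with a model that matches the data exactly.
data = [1, 2, 3]
c_const = cash_statistic(data, [2.0, 2.0, 2.0])
c_exact = cash_statistic(data, [1.0, 2.0, 3.0])
print(c_const - c_exact)  # difference in C between the two models
```

Minimizing this quantity over the model parameters gives $C_{min}$, and differences between nested fits give $ΔC$, the two statistics the abstract says are generalized.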

2604.20630 2026-04-23 stat.ME

Double Robust Weighted Regression with Missing Confounders

Md. Shaddam Hossain Bagmar, Hua Shen

Details
English abstract

Missing confounders are common in observational studies and present fundamental challenges for causal effect estimation by weakening identification and increasing sensitivity to model misspecification. Within the missing-indicator framework, existing methods rely on a single working model and achieve consistency only when that model is correctly specified, and are therefore singly robust. In this article, we develop a doubly robust missing indicator weighted ordinary least squares (MI-WOLS) estimator with partially observed confounders. The MI-WOLS estimator incorporates the treatment assignment mechanism, commonly known as the propensity score model, into the weighting structure of the outcome regression. Building on the missing-indicator framework, we define propensity score based regression weights that satisfy a covariate-balancing condition in the presence of confounder missingness. Under the missingness-strongly-ignorable treatment allocation assumption and assuming either a Conditionally Independent Treatment or Conditionally Independent Outcome structure, the MI-WOLS estimator is consistent when at least the treatment or the outcome model is correctly specified. Simulation studies support the theoretical robustness of the MI-WOLS estimator, demonstrating negligible bias, accurate sandwich-based variance estimation, and near-nominal coverage probability across a wide range of data-generating scenarios. An illustrative application to kidney function outcomes further demonstrates the interpretability and practical feasibility of the method, offering a flexible, doubly robust alternative to existing singly robust estimators.

2604.20625 2026-04-23 stat.ME stat.AP

Dynamic Prediction of the Target Survival Time in Metastatic Solid Tumor Cancer Clinical Trials

Sidi Wang, Kelley Kidwell, Bo Huang, Satrajit Roychoudhury

Details
English abstract

Overall survival (OS) is the gold standard for assessing patient benefit and cost-effectiveness of new cancer drugs. However, it is often difficult to use OS as the primary endpoint in randomized clinical trials (RCTs) for patients with metastatic cancer due to multiple reasons. In recent years, progression-free survival (PFS) has increasingly been used as the primary endpoint in metastatic cancer RCTs to accelerate development. However, regulatory authorities often seek mature OS data for approval. Therefore, it is critical to determine the target time when OS data are expected to be mature for reliable statistical inference. Motivated by an advanced renal cell carcinoma (RCC) clinical trial, we develop and investigate different prediction models leveraging information from disease progression to improve target OS prediction times. We propose a multivariate joint modeling approach considering components of progression and OS and extend three models commonly used for association to be used for OS prediction. To the best of our knowledge, this is the first comprehensive statistical study exploring the prediction of OS using different levels of information on disease progression and illustrating these models using a real, complex dataset. Our findings have significant implications for OS prediction.

2604.20614 2026-04-23 cs.LG math.DS math.OC stat.ML

Too Sharp, Too Sure: When Calibration Follows Curvature

Alessandro Morosini, Matea Gjika, Tomaso Poggio, Pierfrancesco Beneventano

Comments 33 pages, 23 figures

Details
English abstract

Modern neural networks can achieve high accuracy while remaining poorly calibrated, producing confidence estimates that do not match empirical correctness. Yet calibration is often treated as a post-hoc attribute. We take a different perspective: we study calibration as a training-time phenomenon on small vision tasks, and ask whether calibrated solutions can be obtained reliably by intervening on the training procedure. We identify a tight coupling between calibration, curvature, and margins during training of deep networks under multiple gradient-based methods. Empirically, Expected Calibration Error (ECE) closely tracks curvature-based sharpness throughout optimization. Mathematically, we show that both ECE and Gauss--Newton curvature are controlled, up to problem-specific constants, by the same margin-dependent exponential tail functional along the trajectory. Guided by this mechanism, we introduce a margin-aware training objective that explicitly targets robust-margin tails and local smoothness, yielding improved out-of-sample calibration across optimizers without sacrificing accuracy.
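The calibration metric named in the abstract can be computed as follows. This is a minimal sketch of the common equal-width-bin estimate of Expected Calibration Error, not the authors' evaluation code: the function name, the default of 10 bins, and the binning convention are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sum over bins of (bin size / n) * |accuracy - mean confidence|."""
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("need non-empty inputs of equal length")
    conf_sum = [0.0] * n_bins   # summed confidence per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    for c, y in zip(confidences, correct):
        # Confidence 1.0 is placed in the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        conf_sum[b] += c
        hit_sum[b] += 1.0 if y else 0.0
    # (size / n) * |hits / size - conf / size| simplifies to |hits - conf| / n.
    return sum(abs(h - c) for h, c in zip(hit_sum, conf_sum)) / n


# Four predictions at 90% confidence, three of them correct.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

In this example the model claims 90% confidence but is right 75% of the time, so the estimate is about 0.15. An overconfident network of the kind the abstract describes would show this sort of gap in its high-confidence bins.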

2604.20612 2026-04-23 math.ST math.PR stat.TH

E-values and sequential power-one tests for monotonicity and unimodality

Hongjian Wang, Aaditya Ramdas

Details
English abstract

We develop e-values and e-processes testing the null hypothesis that a distribution over nonnegative integers is monotone, and that a distribution over integers is unimodal given a certain mode. Our e-processes lead to tests of power one under any non-null distribution with a sequence of i.i.d. observations, and consistent set-valued mode estimators that eventually equal the true set of modes. Additionally, we characterize the set of all e-values, and therefore the set of all valid tests, with one monotone and unimodal observation, as well as the most powerful e-value for a fixed alternative. We then show that many of our results can be generalized to continuous random variables, relating them to the existing results in the shape-constrained inference literature.

2604.20611 2026-04-23 stat.AP

Bayesian Inference for Incomplete 2x2 Diagnostic Tables

Sara Antonijevic, Danielle Sitalo, Brani Vidakovic

Comments 21 pages, 10 tables. Supplementary materials and reproducible code available at https://github.com/saraantonijevic/bayesian_diagnostic_table-reconstruction

Details
English abstract

Incomplete reporting of diagnostic accuracy data remains a persistent problem in medical research. In many studies, only part of the 2x2 diagnostic table is reported, leaving denominators for diseased and non-diseased groups unknown and preventing direct calculation of sensitivity, specificity, predictive values, and related operating characteristics. To address this limitation, we develop hierarchical Bayesian models for reconstructing incomplete 2x2 diagnostic tables from such partial information. Two motivating scenarios are considered: one in which only a single test-outcome row is observed, and another in which true positives, false positives, and the total sample size are reported but the remaining cells are missing. The proposed models are illustrated on a benchmark breast MRI study with complete counts, treated as partially observed in order to assess reconstruction performance under controlled missingness. The framework yields posterior inference for the missing cell counts and associated diagnostic measures, together with uncertainty quantification in weakly identified settings.
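To see why the second scenario is only weakly identified, the sketch below enumerates every 2x2 table consistent with reported true positives, false positives, and total sample size, and lists the sensitivity and specificity each completion implies. It is a deterministic illustration, not the hierarchical Bayesian model from the paper: the function name and the example counts are hypothetical.

```python
def feasible_completions(tp, fp, n):
    """List all (fn, tn) completions of a 2x2 table given TP, FP and total size n."""
    remaining = n - tp - fp  # must be split between false and true negatives
    if min(tp, fp, remaining) < 0:
        raise ValueError("counts must be non-negative and sum to at most n")
    tables = []
    for fn in range(remaining + 1):
        tn = remaining - fn
        # A measure is undefined (None) when its denominator is zero.
        sens = tp / (tp + fn) if tp + fn > 0 else None
        spec = tn / (tn + fp) if tn + fp > 0 else None
        tables.append({"fn": fn, "tn": tn, "sensitivity": sens, "specificity": spec})
    return tables


# Hypothetical report: 8 true positives, 2 false positives, 20 subjects in total.
for row in feasible_completions(8, 2, 20):
    print(row)
```

Every row is equally compatible with the reported numbers, so the data alone cannot pin down the diseased and non-diseased denominators. Any narrowing of that range has to come from extra structure such as a prior, which is what the abstract's hierarchical Bayesian models supply.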

2604.20551 2026-04-23 stat.ML cs.LG

On Bayesian Softmax-Gated Mixture-of-Experts Models

Nicola Bariletto, Huy Nguyen, Nhat Ho, Alessandro Rinaldo

Details
English abstract

Mixture-of-experts models provide a flexible framework for learning complex probabilistic input-output relationships by combining multiple expert models through an input-dependent gating mechanism. These models have become increasingly prominent in modern machine learning, yet their theoretical properties in the Bayesian framework remain largely unexplored. In this paper, we study Bayesian mixture-of-experts models, focusing on the ubiquitous softmax-based gating mechanism. Specifically, we investigate the asymptotic behavior of the posterior distribution for three fundamental statistical tasks: density estimation, parameter estimation, and model selection. First, we establish posterior contraction rates for density estimation, both in the regimes with a fixed, known number of experts and with a random learnable number of experts. We then analyze parameter estimation and derive convergence guarantees based on tailored Voronoi-type losses, which account for the complex identifiability structure of mixture-of-experts models. Finally, we propose and analyze two complementary strategies for selecting the number of experts. Taken together, these results provide one of the first systematic theoretical analyses of Bayesian mixture-of-experts models with softmax gating, and yield several theory-grounded insights for practical model design.

2604.20517 2026-04-23 math.DS math.OC stat.CO

Bounding Transient Instability in Sensor Data Injected Nonlinear Stochastic Flight Dynamics

Surya Ratna Prakash D, Soumyendu Raha

Details
English abstract

Transient instability in nonlinear stochastic dynamical systems is a fundamental limitation in safety-critical aerospace applications, particularly during powered descent and landing where failure is driven by finite-time excursions rather than asymptotic divergence. Classical notions of mean-square or asymptotic stability are therefore insufficient for certification and design. This paper develops a logarithmic-norm-based framework for finite-time transient stability analysis of nonlinear Ito stochastic differential equations. The approach extends matrix measures to nonlinear mappings in a Lipschitz sense, enabling efficient characterization of instantaneous perturbation growth without local linearization. Using Ito calculus, bounds on the mean and variance of transient growth are derived, providing conditions for non-positive finite-time mean growth and probabilistic bounds on instability events. The analysis highlights a key distinction between mean and sample-path behavior, showing that stability in expectation does not guarantee pathwise finite-time safety, and that almost-sure transient stability cannot generally be ensured under stochastic diffusion. The framework is extended to data-constrained stochastic dynamics in navigation and estimation, revealing a trade-off between estimation consistency and transient robustness due to continuous data injection. Demonstrations with flight-like lunar lander telemetry show that similar mean trajectories can exhibit significantly different transient stability behaviour, and that mission failure correlates with accumulation of transient instability over short critical intervals. These results motivate probabilistic finite-time stability metrics for safety-critical autonomous systems.

2604.20516 2026-04-23 stat.ML cs.LG

Efficient Symbolic Computations for Identifying Causal Effects

Benjamin Hollering, Pratik Misra, Nils Sturma

Details
English abstract

Determining identifiability of causal effects from observational data under latent confounding is a central challenge in causal inference. For linear structural causal models, identifiability of causal effects is decidable through symbolic computation. However, standard approaches based on Gröbner bases become computationally infeasible beyond small settings due to their doubly exponential complexity. In this work, we study how to practically use symbolic computation for deciding rational identifiability. In particular, we present an efficient algorithm that provably finds the lowest degree identifying formulas. For a causal effect of interest, if there exists an identification formula of a prespecified maximal degree, our algorithm returns such a formula in quasi-polynomial time.

2510.04525 2026-04-23 cs.LG math.PR stat.ML

Demystifying MaskGIT Sampler and Beyond: Adaptive Order Selection in Masked Diffusion

Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji

Comments 23 pages, fixed cleveref-related issue

Details
Journal ref
Transactions on Machine Learning Research, 2026
English abstract

Masked diffusion models have shown promising performance in generating high-quality samples in a wide range of domains, but accelerating their sampling process remains relatively underexplored. To investigate efficient samplers for masked diffusion, this paper theoretically analyzes the MaskGIT sampler for image modeling, revealing its implicit temperature sampling mechanism. Through this analysis, we introduce the "moment sampler," an asymptotically equivalent but more tractable and interpretable alternative to MaskGIT, which employs a "choose-then-sample" approach by selecting unmasking positions before sampling tokens. In addition, we improve the efficiency of choose-then-sample algorithms through two key innovations: a partial caching technique for transformers that approximates longer sampling trajectories without proportional computational cost, and a hybrid approach formalizing the exploration-exploitation trade-off in adaptive unmasking. Experiments in image and text domains demonstrate our theory as well as the efficiency of our proposed methods, advancing both theoretical understanding and practical implementation of masked diffusion samplers.

2505.13106 2026-04-23 math.OC physics.soc-ph stat.AP

How to optimise tournament draws: The case of the FIFA World Cup

László Csató

Comments 32 pages, 8 figures, 6 tables

Details
Journal ref
International Transactions in Operational Research, 2026, forthcoming
English abstract

The organisers of major sports competitions use different policies with respect to constraints in the group draw. Our paper aims to rationalise these choices by analysing the trade-off between attractiveness (the number of games played by teams from the same geographic zone) and fairness (the departure of the draw mechanism from a uniform distribution). A parametric optimisation model is formulated and applied to the 2018 and 2022 FIFA World Cup draws. A flaw of the draw procedure is identified: the pre-assignment of the host to a group unnecessarily increases the distortions. All Pareto efficient sets of draw constraints are determined via simulations. The proposed framework can be used to find the optimal draw rules and justify the non-uniformity of the draw procedure for the stakeholders.

2411.18334 2026-04-23 stat.ME

Large multi-response linear regression estimation based on low-rank pre-smoothing

Xinle Tian, Alex Gibberd, Matthew Nunes, Sandipan Roy

Details
English abstract

Pre-smoothing is a technique aimed at increasing the signal-to-noise ratio in data to improve subsequent estimation and model selection in regression problems. So far, pre-smoothing has been limited to the univariate response regression setting, yet many scientific applications involve multi-response regression problems, particularly ones in which the number of responses is large. Motivated by this setting, this article proposes a technique for data pre-smoothing based on low-rank approximation. We establish theoretical results on the performance of the proposed methodology, which show that in this large-response setting, the proposed technique outperforms ordinary least squares estimation with the mean squared error criterion, whilst being computationally more efficient than alternative approaches such as reduced rank regression. We quantify our estimator's benefit empirically in a number of simulated experiments. We also demonstrate our proposed low-rank pre-smoothing technique on real data arising from the environmental and biological sciences.
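
The abstract does not spell out the estimator, so the following is only a minimal sketch of one plausible reading: the response matrix is replaced by its truncated-SVD approximation before ordinary least squares is run. The function names, the choice of truncated SVD, the fixed rank, and the simulated data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def low_rank_presmooth(Y, rank):
    """Return the best rank-`rank` approximation of Y (n x q) via truncated SVD.

    Assumption: "low-rank pre-smoothing" is read here as plain SVD truncation
    of the response matrix; the paper may define it differently.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


def ols(X, Y):
    """Least-squares coefficients B (p x q) minimising ||Y - X B||_F."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B


# Hypothetical simulated data: many responses driven by a low-rank signal.
rng = np.random.default_rng(0)
n, p, q, r = 200, 5, 50, 2
X = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))
Y = X @ B_true + rng.normal(size=(n, q))

Y_smooth = low_rank_presmooth(Y, r)   # pre-smoothed responses
B_ols = ols(X, Y)                     # OLS on the raw responses
B_pre = ols(X, Y_smooth)              # OLS on the pre-smoothed responses

# Mean squared error of each coefficient estimate against the simulated truth.
err_ols = np.mean((B_ols - B_true) ** 2)
err_pre = np.mean((B_pre - B_true) ** 2)
print(err_ols, err_pre)
```

In this sketch the rank is supplied by hand; how it should be chosen in practice is not stated in the abstract.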

2407.01621 2026-04-23 cs.LG q-bio.QM stat.ME stat.ML

Deciphering interventional dynamical causality from non-intervention complex systems

Jifan Shi, Yang Li, Juan Zhao, Siyang Leng, Rui Bao, Kazuyuki Aihara, Luonan Chen, Wei Lin

Details
English abstract

Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies. Causal studies on non-intervention systems attract much attention but remain extremely challenging, and the delay-embedding technique provides a promising approach. In this study, we propose a framework named Interventional Dynamical Causality (IntDC) in contrast to the traditional Constructive Dynamical Causality (ConDC). ConDC, including Granger causality, transfer entropy and convergence of cross-mapping, measures causality by constructing a dynamical model without considering interventions. A computational criterion, Interventional Embedding Entropy (IEE), is proposed to measure causal strengths in an interventional manner. IEE is an intervened causal information flow defined in the delay-embedding space. Further, IEE theoretically and numerically enables the deciphering of IntDC solely from observational (non-interventional) time-series data, without requiring any knowledge of dynamical models or real interventions in the considered system. In particular, IEE can be applied to rank causal effects according to their importance and to construct causal networks from data. We conducted numerical experiments to demonstrate that IEE can find causal edges accurately, eliminate effects of confounding, and quantify causal strength more robustly than traditional indices. We also applied IEE to real-world tasks, where it performed as an accurate and robust tool for causal analyses solely from observational data. The IntDC framework and IEE algorithm provide an efficient approach to the study of causality from time series in diverse non-intervention complex systems.