arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.02304 2026-04-03 stat.CO

Disentangled Deep Priors for Bayesian Inverse Problems

Arkaprabha Ganguli, Emil Constantinescu

We propose a structured prior for high-dimensional Bayesian inverse problems based on a disentangled deep generative model whose latent space is partitioned into auxiliary variables aligned with known and interpretable physical parameters and residual variables capturing remaining unknown variability. This yields a hierarchical prior in which interpretable coordinates carry domain-relevant uncertainty while the residual coordinates retain the flexibility of deep generative models. By linearizing the generator, we characterize the induced prior covariance and derive conditions under which the posterior exhibits approximate block-diagonal structure in the latent variables, clarifying when representation-level disentanglement translates into a separation of uncertainty in the inverse problem. We formulate the resulting latent-space inverse problem and solve it using MAP estimation and Markov chain Monte Carlo (MCMC) sampling. On elliptic PDE inverse problems, such as conductivity identification and source identification, the approach matches an oracle Gaussian process prior under correct specification and provides substantial improvement under prior misspecification, while recovering interpretable physical parameters and producing spatially calibrated uncertainty estimates.

2604.02286 2026-04-03 stat.ME stat.AP

Bayesian covariance regression for differential network analysis of zero-inflated microbiome data

Zichun Xu, Jing Ma

Microbial interaction networks can rewire in response to host and environmental factors, yet most existing methods for network estimation treat the covariance structure as static across samples. We propose TRECOR, a Bayesian covariance regression framework for inferring covariate-dependent microbial covariation networks from zero-inflated compositional count data. The method models microbiome counts through a latent multivariate normal distribution defined on the internal nodes of a phylogenetic tree, where both the mean and covariance of the latent variables depend on covariates. The covariance is decomposed into a sparse baseline component, representing a stable microbial covariation network, and a low-rank covariate-dependent perturbation that captures network rewiring. By exploiting the binomial factorization of the multinomial distribution under the logistic-tree-normal representation, the model achieves full conjugacy and posterior inference proceeds via an efficient Gibbs sampler. In simulations, TRECOR substantially outperforms covariance regression applied to transformed counts, demonstrating the importance of explicitly modeling the compositional sampling layer. Applied to gut microbiome data from 531 individuals across three countries, we find that age has the largest effect on microbial covariation, a pattern not revealed by mean-based analysis alone. The age-associated differential network is enriched for Enterobacteriaceae and related families, consistent with known developmental shifts in the gut microbiota, while country-associated differential networks implicate diet-related taxa.

2604.02250 2026-04-03 cs.LG stat.ML

Smoothing the Landscape: Causal Structure Learning via Diffusion Denoising Objectives

Hao Zhu, Di Zhou, Donna Slonim

Comments To appear in the Proceedings of the 5th Conference on Causal Learning and Reasoning (CLeaR 2026)

Understanding causal dependencies in observational data is critical for informing decision-making. These relationships are often modeled as Bayesian Networks (BNs) and Directed Acyclic Graphs (DAGs). Existing methods, such as NOTEARS and DAG-GNN, often face issues with scalability and stability in high-dimensional data, especially when there is a feature-sample imbalance. Here, we show that the denoising score matching objective of diffusion models could smooth the gradients for faster, more stable convergence. We also propose an adaptive k-hop acyclicity constraint that improves runtime over existing solutions that require matrix inversion. We name this framework Denoising Diffusion Causal Discovery (DDCD). Unlike generative diffusion models, DDCD utilizes the reverse denoising process to infer a parameterized causal structure rather than to generate data. We demonstrate the competitive performance of DDCDs on synthetic benchmarking data. We also show that our methods are practically useful by conducting qualitative analyses on two real-world examples. Code is available at this url: https://github.com/haozhu233/ddcd.
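As a rough illustration of what a k-hop acyclicity penalty can look like, the sketch below sums the traces of the first k powers of the elementwise-squared weight matrix. This is an assumption made for illustration and not necessarily the constraint used in DDCD; the function names are hypothetical. The sum is zero exactly when the weighted graph has no directed cycle of length at most k, and it needs neither a matrix inverse nor a matrix exponential.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def k_hop_acyclicity(weights, k):
    """Sum of trace(A^i) for i = 1..k, where A is `weights` squared elementwise.

    A has non-negative entries, so trace(A^i) > 0 exactly when some directed
    cycle of length i exists. The penalty is therefore 0 if and only if the
    graph has no directed cycle of length <= k.
    """
    n = len(weights)
    a = [[w * w for w in row] for row in weights]
    power = [row[:] for row in a]              # A^1
    penalty = 0.0
    for _ in range(k):
        penalty += sum(power[i][i] for i in range(n))
        power = matmul(power, a)               # next power of A
    return penalty


# The chain 0 -> 1 -> 2 is a DAG; adding 2 -> 0 closes a 3-cycle.
dag = [[0.0, 0.5, 0.0],
       [0.0, 0.0, -1.2],
       [0.0, 0.0, 0.0]]
cyclic = [[0.0, 0.5, 0.0],
          [0.0, 0.0, -1.2],
          [0.7, 0.0, 0.0]]

print(k_hop_acyclicity(dag, 3))         # 0.0
print(k_hop_acyclicity(cyclic, 2))      # 0.0 (a 3-cycle is invisible at k = 2)
print(k_hop_acyclicity(cyclic, 3) > 0)  # True
```

The second call shows the trade-off in choosing k: a small k is cheap but can miss longer cycles, which is presumably why an adaptive choice is attractive.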

2604.02248 2026-04-03 stat.ML cs.LG

BVFLMSP: Bayesian Vertical Federated Learning for Multimodal Survival with Privacy

Abhilash Kar, Basisth Saha, Tanmay Sen, Biswabrata Pradhan

Multimodal time-to-event prediction often requires integrating sensitive data distributed across multiple parties, making centralized model training impractical due to privacy constraints. At the same time, most existing multimodal survival models produce single deterministic predictions without indicating how confident the model is in its estimates, which can limit their reliability in real-world decision making. To address these challenges, we propose BVFLMSP, a Bayesian Vertical Federated Learning (VFL) framework for multimodal time-to-event analysis based on a Split Neural Network architecture. In BVFLMSP, each client independently models a specific data modality using a Bayesian neural network, while a central server aggregates intermediate representations to perform survival risk prediction. To enhance privacy, we integrate differential privacy mechanisms by perturbing client-side representations before transmission, providing formal privacy guarantees against information leakage during federated training. We first evaluate our Bayesian multimodal survival model against widely used single-modality survival baselines and the centralized multimodal baseline MultiSurv. Across multimodal settings, the proposed method shows consistent improvements in discrimination performance, with up to 0.02 higher C-index compared to MultiSurv. We then compare federated and centralized learning under varying privacy budgets across different modality combinations, highlighting the tradeoff between predictive performance and privacy. Experimental results show that BVFLMSP effectively integrates multimodal data, improves survival prediction over existing baselines, and remains robust under strict privacy constraints while providing uncertainty estimates.

2604.02238 2026-04-03 cs.CY cs.AI stat.AP

Generative AI Spotlights the Human Core of Data Science: Implications for Education

Nathan Taback

Generative AI (GAI) reveals an irreducible human core at the center of data science: advances in GAI should sharpen, rather than diminish, the focus on human reasoning in data science education. GAI can now execute many routine data science workflows, including cleaning, summarizing, visualizing, modeling, and drafting reports. Yet the competencies that matter most remain irreducibly human: problem formulation, measurement and design, causal identification, statistical and computational reasoning, ethics and accountability, and sensemaking. Drawing on Donoho's Greater Data Science framework, Nolan and Temple Lang's vision of computational literacy, and the McLuhan-Culkin insight that we shape our tools and thereafter our tools shape us, this paper traces the emergence of data science through three converging lineages: Tukey's intellectual vision of data analysis as a science, the commercial logic of surveillance capitalism that created industrial demand for data scientists, and the academic programs that followed. Mapping GAI's impact onto Donoho's six divisions of Greater Data Science shows that computing with data (GDS3) has been substantially automated, while data gathering, preparation, and exploration (GDS1) and science about data science (GDS6) still require essential human input. The educational implication is that data science curricula should focus on this human core while teaching students how to contribute effectively within iterative prompt-output-prompt cycles using retrieval-augmented generation, and that learning outcomes and assessments should explicitly evaluate reasoning and judgment.

2604.02227 2026-04-03 eess.SY cs.SY math.OC stat.ME

Sensitivity analysis for stopping criteria with application to organ transplantations

Xingyu Ren, Michael C. Fu, Steven I. Marcus

We consider a stopping problem and its application to the decision-making process regarding the optimal timing of organ transplantation for individual patients. At each decision period, the patient state is inspected and a decision is made whether to transplant. If the organ is transplanted, the process terminates; otherwise, the process continues until a transplant happens or the patient dies. Under suitable conditions, we show that there exists a control limit optimal policy. We propose a smoothed perturbation analysis (SPA) estimator for the gradient of the total expected discounted reward with respect to the control limit. Moreover, we show that the SPA estimator is asymptotically unbiased.

2604.02187 2026-04-03 stat.AP physics.ao-ph

Possible, Yes; Ignorant, Perhaps: A Scorecard for Possibilistic Forecasts

John R. Lawson

Comments 11 figures; 7 sections; 19 pages on PDF as-is

Probabilistic forecasts must sum to unity and cannot express "I don't know." Possibility theory relaxes this constraint: a subnormal distribution explicitly measures how much of the plausibility budget remains unassigned, an ignorance signal that probability cannot represent. This paper develops a verification framework for such forecasts, centred on a five-number scorecard that separately diagnoses whether the forecast pointed at the right outcome (depth-of-truth), how sharply (diffuseness, support margin), how confidently (ignorance), and how dominantly (conditional necessity). A possibility-to-probability conversion preserves ignorance for familiar frequency-based scoring; categorical threshold scores (POD, FAR, CSI, etc.) connect to operational practice. Together, these three complementary facets (possibilistic, probabilistic, and categorical) expose failure modes invisible to any single metric. Storm Prediction Center convective outlook categories serve as the running example throughout; a synthetic reforecast demonstrates diagnostic visualisations and scorecard interpretation. Ignorance is better expressed than repressed.
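For readers unfamiliar with the categorical threshold scores named in this abstract, the sketch below computes POD, FAR, and CSI from the counts of a 2x2 contingency table using their standard definitions in forecast verification. The function name and the counts are hypothetical, chosen only to make the arithmetic easy to follow; the paper's own scorecard quantities are not reproduced here.

```python
def categorical_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from 2x2 contingency-table counts.

    POD = hits / (hits + misses)                  probability of detection
    FAR = false_alarms / (hits + false_alarms)    false alarm ratio
    CSI = hits / (hits + misses + false_alarms)   critical success index

    Denominators of zero (no events, or no forecasts issued) are not
    handled in this sketch and would raise ZeroDivisionError.
    """
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi


# Hypothetical counts, chosen only to illustrate the arithmetic.
pod, far, csi = categorical_scores(hits=30, misses=10, false_alarms=20)
print(pod, far, csi)  # 0.75 0.4 0.5
```

None of the three scores uses the count of correct negatives, which is why they remain informative for rare events.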

2604.02179 2026-04-03 stat.ME

Irregularly and incompletely sampled random fields in the Earth sciences: Analysis and synthesis of parameterized covariance models

Olivia L. Walbert, Frederik J. Simons, Arthur P. Guillaumin, Sofia C. Olhede

We study how sampling geometry contributes to uncertainty in modeling spatial geophysical observations as sampled random fields characterized by stationary, isotropic, parametric covariance functions. We incorporate the signature of discrete spatial sampling patterns into an asymptotically unbiased spectral maximum-likelihood estimation method along with analytical uncertainty calculation. We illustrate the broad applicability of our modeling through synthetic and real data examples with sampling patterns that include irregularly bounded contiguous region(s) of interest, structured sweeps of instrumental measurements, and missing observations dispersed across the domain of a field, from which contiguous patches are generally favorable. We find through asymptotic studies that allocating samples following a growing-domain strategy rather than a densifying, infill scheme best reduces estimator bias and (co)variance, whether the field has been sampled regularly or not. As our modeling assumptions, too, shape how (well) an observed random field can be characterized, we study the effect of covariance parameters assumed a priori. We demonstrate the desirable behavior of the general Matérn class and show how to interrogate goodness-of-fit criteria to detect departures from the null hypothesis of Gaussianity, stationarity, and isotropy.

2604.02116 2026-04-03 stat.ME stat.CO

A new wavelet-based variational family with copula dependence structures

Giovanni Piccirilli, Aluísio Pinheiro

Variational inference (VI) has become a widely used approach for scalable Bayesian inference, but its performance strongly depends on the flexibility of the chosen variational family. In this work, we propose a novel variational family that combines wavelet-based representations for marginal posterior densities with copula functions to model dependence structures. The marginal distributions are constructed using coefficients from the discrete wavelet transform, providing a flexible and adaptive framework capable of capturing complex features such as asymmetry. The joint distribution is then obtained through a copula, allowing for explicit modeling of dependence among parameters, including both independence and Gaussian copula structures. We develop an efficient estimation procedure based on Monte Carlo approximations of the evidence lower bound (ELBO) and automatic differentiation, enabling scalable optimization using gradient-based methods. Through extensive simulation studies, including logistic regression, sparse linear models, and hierarchical models, we demonstrate that the proposed approach achieves posterior mean estimates comparable to Markov chain Monte Carlo (MCMC) methods, while providing improved uncertainty quantification relative to standard variational approaches. Applications to hierarchical logistic regression and Bayesian conditional transformation models further illustrate the practical advantages of the method in complex, high dimensional settings. The proposed wavelet copula variational family offers a flexible and computationally efficient alternative for Bayesian inference.

2601.19016 2026-04-03 cs.CC cs.CR math.PR math.ST stat.TH

Average-Case Reductions for $k$-XOR and Tensor PCA

Guy Bresler, Alina Harbuzova

Comments 112 pages, 6 figures

We study the computational properties of two canonical planted average-case problems, noisy planted $k$-XOR and Tensor PCA, by formally unifying them into a family of planted problems parametrized by tensor order $k$, number of entries $m$, and noise level $\delta$. We build a wide range of poly-time average-case reductions within this family, across all regimes $m \in [1, n^k]$. In the denser $m \geq n^{k/2}$ regime, our reductions preserve proximity to the computational threshold, and, as a central application, reduce conjectured-hard $k$-XOR instances with $m \approx n^{k/2}$ to conjectured-hard instances of Tensor PCA. Additionally, we give new order-reducing maps at fixed densities (e.g., $5\to 4$ for $k$-XOR with $m \approx n^{k/2}$ entries and $7\to 4$ for Tensor PCA). In the sparser $m \leq n^{k/2}$ regime, we relate instances of different orders, reducing, for example, $7$-XOR with $m = n^{3.4}$ to the classical setting of $3$-XOR with $m = \widetilde{\Theta}(n^{1.4})$. Taken together, these results establish a hardness partial order in the space of planted tensor models.

2601.13507 2026-04-03 stat.ME

Two-stage Least Squares with Clustered Data under the Local Average Treatment Effect Framework

Anqi Zhao, Peng Ding, Fan Li

To estimate the causal effect of an endogenous treatment using clustered data, the canonical two-stage least squares (2sls) estimates a linear regression of the outcome on treatment status using an instrumental variable (IV) and conducts inference with cluster-robust standard errors. When both the treatment and the IV vary within clusters, an alternative two-stage least squares with fixed effects (2sfe) additionally includes cluster indicators in the regression, thereby incorporating cluster information into point estimation as well. This paper studies the trade-off between these approaches within the local average treatment effect (LATE) framework. When clusters are homogeneous, we show that both approaches yield valid large-sample inference for the LATE, and that 2sfe is more efficient than canonical 2sls only when the variation in cluster-specific effects dominates idiosyncratic variation and the IV has sufficient within-cluster variation. When clusters are heterogeneous, we show that 2sfe identifies a weighted average of cluster-specific LATEs, whereas the canonical 2sls generally does not. We further propose a test for detecting cluster heterogeneity.
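For intuition about the canonical 2sls point estimate, recall that with a single instrument and no other covariates the 2sls coefficient reduces to the ratio cov(Z, Y) / cov(Z, D), which for a binary instrument is the Wald estimator. The sketch below computes that ratio on a toy dataset. The data and function names are hypothetical, and the sketch deliberately omits cluster-robust standard errors and the cluster indicators that distinguish 2sfe.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def iv_estimate(z, d, y):
    """Just-identified 2SLS slope of y on d with instrument z."""
    return cov(z, y) / cov(z, d)


# Toy data: the instrument z shifts treatment uptake d, and y = 1 + 2 * d.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 0, 1, 1, 1]
y = [1.0, 1.0, 1.0, 3.0, 1.0, 3.0, 3.0, 3.0]

print(iv_estimate(z, d, y))  # 2.0
```

Because the outcome here is an exact linear function of treatment, the ratio recovers the slope exactly; with noisy or endogenous data it would only do so in large samples under the usual IV assumptions.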

2601.11016 2026-04-03 stat.ML cs.AI cs.LG math.OC

Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interpretable and Tractable Approach

Fenglin Zhang, Jie Wang

In this paper, we introduce a framework for contextual distributionally robust optimization (DRO) that considers the causal and continuous structure of the underlying distribution by developing interpretable and tractable decision rules that prescribe decisions using covariates. We first introduce the causal Sinkhorn discrepancy (CSD), an entropy-regularized causal Wasserstein distance that encourages continuous transport plans while preserving the causal consistency. We then formulate a contextual DRO model with a CSD-based ambiguity set, termed Causal Sinkhorn DRO (Causal-SDRO), and derive its strong dual reformulation where the worst-case distribution is characterized as a mixture of Gibbs distributions. To solve the corresponding infinite-dimensional policy optimization, we propose the Soft Regression Forest (SRF) decision rule, which approximates optimal policies within arbitrary measurable function spaces. The SRF preserves the interpretability of classical decision trees while being fully parametric, differentiable, and Lipschitz smooth, enabling intrinsic interpretation from both global and local perspectives. To solve the Causal-SDRO with parametric decision rules, we develop an efficient stochastic compositional gradient algorithm that converges to an $\varepsilon$-stationary point at a rate of $O(\varepsilon^{-4})$, matching the convergence rate of standard stochastic gradient descent. Finally, we validate our method through numerical experiments on synthetic and real-world datasets, demonstrating its superior performance and interpretability.

2511.07605 2026-04-03 math.ST stat.ME stat.TH

Confidence Intervals for Linear Models with Arbitrary Noise Contamination

Dong Xie, Chao Gao, John Lafferty

We study confidence interval construction for linear regression under Huber's contamination model, where an unknown fraction of noise variables is arbitrarily corrupted. While robust point estimation in this setting is well understood, statistical inference remains challenging, especially because the contamination proportion is not identifiable from the data. We develop a new algorithm that constructs confidence intervals for individual regression coefficients without any prior knowledge of the contamination level. Our method is based on a Z-estimation framework using a smooth estimating function. The method directly quantifies the uncertainty of the estimating equation after a preprocessing step that decorrelates covariates associated with the nuisance parameters. We show that the resulting confidence interval has valid coverage uniformly over all contamination distributions and attains an optimal length of order $O(1/\sqrt{n(1-\varepsilon)^2})$, matching the rate achievable when the contamination proportion $\varepsilon$ is known. This result stands in sharp contrast to the adaptation cost of robust interval estimation observed in the simpler Gaussian location model.

2510.18520 2026-04-03 cs.LG stat.ME

Partial VOROS: A Cost-aware Performance Metric for Binary Classifiers with Precision and Capacity Constraints

Christopher Ratigan, Kyle Heuton, Carissa Wang, Lenore Cowen, Michael C. Hughes

Comments In Proceedings of the International Conference of Artificial Intelligence and Statistics (AISTATS), 2026

The ROC curve is widely used to assess binary classifiers. Yet for some applications, such as alert systems for monitoring hospitalized patients, conventional ROC analysis cannot meet two key deployment needs: enforcing a constraint on precision to avoid false alarm fatigue and imposing an upper bound on the number of predicted positives to represent the capacity of hospital staff. The usual area under the curve metric also does not reflect asymmetric costs for false positives and false negatives. In this paper we address all three of these issues. First, we show how the subset of classifiers that meet precision and capacity constraints occupy a feasible region in ROC space. We establish the polygon-shaped geometry of this region. We then define the partial area of lesser classifiers, a performance metric that is monotonic with cost and only accounts for the feasible region. Averaging this area over a desired distribution for cost parameters results in the partial volume over the ROC surface, or partial VOROS. In experiments predicting mortality risk from vital sign history on several datasets, we show this cost-aware metric can outperform alternatives at ranking classifiers for in-hospital alerts.
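To see why the feasible region is a polygon, write the class prevalence as pi. Precision equals pi * TPR / (pi * TPR + (1 - pi) * FPR), and the fraction of cases flagged positive equals pi * TPR + (1 - pi) * FPR. A lower bound on precision and an upper bound on the flagged fraction therefore both become linear inequalities in (FPR, TPR). The sketch below checks a single ROC point against the two constraints; the function and parameter names are assumptions introduced for illustration and are not taken from the paper.

```python
def is_feasible(fpr, tpr, prevalence, min_precision, capacity):
    """Check one ROC point against precision and capacity constraints.

    Both constraints are linear in (fpr, tpr):
      precision >= min_precision
        <=> (1 - min_precision) * prevalence * tpr
              >= min_precision * (1 - prevalence) * fpr
      fraction flagged <= capacity
        <=> prevalence * tpr + (1 - prevalence) * fpr <= capacity

    The point (0, 0) flags nobody, so its precision is undefined; this
    sketch treats it as satisfying the precision constraint.
    """
    true_pos = prevalence * tpr           # population fraction that is a true positive
    false_pos = (1 - prevalence) * fpr    # population fraction that is a false positive
    precision_ok = (1 - min_precision) * true_pos >= min_precision * false_pos
    capacity_ok = true_pos + false_pos <= capacity
    return precision_ok and capacity_ok


# Hypothetical setting: 10% prevalence, precision at least 0.5,
# and staff can follow up on at most 10% of patients.
settings = dict(prevalence=0.1, min_precision=0.5, capacity=0.1)
print(is_feasible(0.02, 0.5, **settings))  # True
print(is_feasible(0.05, 0.6, **settings))  # False (too many alerts)
print(is_feasible(0.20, 0.8, **settings))  # False (precision too low)
```

Intersecting these two half-planes with the unit square gives the polygon-shaped region described in the abstract.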

2509.07013 2026-04-03 cs.LG q-bio.PE stat.ME

Generalized Machine Learning for Fast Calibration of Agent-Based Epidemic Models

Sima Najafzadehkhoei, George Vega Yon, Derek S. Meyer, Bernardo Modenesi

Agent-based models (ABMs) are widely used to study infectious disease dynamics, but their calibration is often computationally intensive, limiting their applicability in time-sensitive public health settings. We propose DeepIMC (Deep Inverse Mapping Calibration), a machine learning-based calibration framework that directly learns the inverse mapping from epidemic time series to epidemiological parameters. DeepIMC trains a bidirectional Long Short-Term Memory (BiLSTM) neural network on synthetic epidemic trajectories generated from agent-based models such as the Susceptible-Infected-Recovered (SIR) model, enabling rapid parameter estimation without repeated simulation at inference time. We evaluate DeepIMC through an extensive simulation study comprising 5,000 heterogeneous epidemic scenarios and benchmark its performance against Approximate Bayesian Computation (ABC) using likelihood-free Markov Chain Monte Carlo. The results show that DeepIMC substantially improves parameter recovery accuracy, produces sharp and well-calibrated predictive intervals, and reduces computational time by more than an order of magnitude relative to ABC. Although structural parameter identifiability constraints limit the precise recovery of all model parameters simultaneously, the calibrated models reliably reproduce epidemic trajectories and support accurate forward prediction with their estimated parameters. DeepIMC is implemented in the open-source R package epiworldRCalibrate, facilitating practical adoption for real-time epidemic modeling and policy analysis. Overall, our findings demonstrate that DeepIMC provides a scalable, operationally effective alternative to traditional simulation-based calibration methods for agent-based epidemic models.

2509.03309 2026-04-03 stat.ME

A Measure of Predictive Sharpness for Probabilistic Models

Pekka Syrjänen

We introduce a sharpness functional for probabilistic models that quantifies sharpness as an intrinsic property of the probability distribution. The measure is derived based on a rank-based concentration principle that tracks upward transfers of probability mass along the rearranged profile of the predictive distribution. For finite outcome spaces, this yields a normalized sharpness measure with transparent mass-length representation and equivalent formulations as a Gini-type coefficient on the probability vector and a scaled 1-Wasserstein distance from the uniform distribution in rearranged space. We extend the functional to bounded continuous and multidimensional domains for predictive distributions with finite first moment, and establish normalization, symmetry, continuity, and monotonicity properties. The diagnostic application of the measure is illustrated with real and simulated data, and a relationship to the multivariate energy score is discussed.
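To give a feel for a Gini-type coefficient on a probability vector, the sketch below uses the standard Gini mean-difference form, rescaled so that the uniform distribution scores 0 and a point mass scores 1. This normalization is an assumption made for illustration; the paper's exact functional may differ, and the function name is hypothetical.

```python
def gini_sharpness(probs):
    """Normalized Gini coefficient of a finite probability vector.

    Computes sum_{i,j} |p_i - p_j| / (2 * (n - 1)), which is 0 for the
    uniform distribution and 1 when all mass sits on a single outcome.
    """
    n = len(probs)
    if n < 2:
        raise ValueError("need at least two outcomes")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    total_abs_diff = sum(abs(p - q) for p in probs for q in probs)
    return total_abs_diff / (2 * (n - 1))


print(gini_sharpness([0.25, 0.25, 0.25, 0.25]))  # 0.0
print(gini_sharpness([1.0, 0.0, 0.0, 0.0]))      # 1.0
print(gini_sharpness([0.5, 0.5, 0.0, 0.0]))      # between 0 and 1
```

The score depends only on the multiset of probabilities, not on which outcomes carry them, so it measures concentration rather than correctness.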

2507.18021 2026-04-03 math.ST cs.DS cs.LG math.FA math.PR stat.TH

Zeroth-order Logconcave Sampling

Yunbum Kook, Santosh S. Vempala

Comments v2: Fix a bug in the restart mechanism; add a lower bound on Gaussian annealing

We study the zeroth-order query complexity of sampling from a general logconcave distribution: given access to an evaluation oracle for a convex function $V:\mathbb{R}^{d}\rightarrow\mathbb{R}\cup\{\infty\}$, output a point from a distribution within $\varepsilon$-distance to the density proportional to $e^{-V}$. A long line of work provides efficient algorithms for this problem in TV distance, assuming a pointwise warm start (i.e., in $\infty$-Rényi divergence), and using annealing to generate such a warm start. Here, we address the natural and more general problem of using a $q$-Rényi divergence warm start to generate a sample that is $\varepsilon$-close in $q$-Rényi divergence. Our first main result is an algorithm with this end-to-end guarantee with state-of-the-art complexity for $q=\widetilde{\Omega}(1)$. Our second result shows how to generate a $q$-Rényi divergence warm start directly via annealing, by maintaining $q$-Rényi divergence throughout, thereby obtaining a streamlined analysis and improved complexity. Such results were previously known only under the stronger assumptions of smoothness and access to first-order oracles. We also show a lower bound for Gaussian annealing by disproving a geometric conjecture about quadratic tilts of isotropic logconcave distributions. Central to our approach, we establish hypercontractivity of the heat adjoint and translate this into improved mixing time guarantees for the Proximal Sampler. The resulting analysis of both sampling and annealing follows a simplified and natural path, directly tying convergence rates to isoperimetric constants of the target distribution.

2507.04754 2026-04-03 stat.ML cs.LG

Intervening to Learn and Compose Causally Disentangled Representations

Alex Markham, Isaac Hirsch, Jeri A. Chang, Liam Solus, Bryon Aragam

Comments 45 pages, 10 figures; accepted to the 5th conference on Causal Learning and Reasoning (CLeaR)

In designing generative models, it is commonly believed that in order to learn useful latent structure, we face a fundamental tension between expressivity and structure. In this paper we challenge this view by proposing a new approach to training arbitrarily expressive generative models that simultaneously learn causally disentangled concepts. This is accomplished by adding a simple context module to an arbitrarily complex black-box model, which learns to process concept information by implicitly inverting linear representations from the model's encoder. Inspired by the notion of intervention in a causal model, our module selectively modifies its architecture during training, allowing it to learn a compact joint model over different contexts. We show how adding this module leads to causally disentangled representations that can be composed for out-of-distribution generation on both real and simulated data. The resulting models can be trained end-to-end or fine-tuned from pre-trained models. To further validate our proposed approach, we prove a new identifiability result that extends existing work on identifying structured representations.

2506.17527 2026-04-03 math.ST math.CO math.PR stat.TH

Detection and Reconstruction of a Random Hypergraph from Noisy Graph Projection

Shuyang Gong, Zhangsong Li, Qiheng Xu

Comments 19 pages, 1 figure; Section 6 rewritten to fix a previous error

For a $d$-uniform random hypergraph on $n$ vertices in which hyperedges are included i.i.d. so that the average degree in the hypergraph is $n^{\delta+o(1)}$, the projection of such a hypergraph is a graph on the same $n$ vertices where an edge connects two vertices if and only if they belong to the same hyperedge. In this work, we study the inference problem where the observation is a noisy version of the graph projection where each edge in the projection is kept with probability $p=n^{-1+\alpha+o(1)}$ and each edge not in the projection is added with probability $q=n^{-1+\beta+o(1)}$. For all constant $d$, we establish sharp thresholds for both detection (distinguishing the noisy projection from an Erdős-Rényi random graph with edge density $q$) and reconstruction (estimating the original hypergraph). Notably, our results reveal a detection-reconstruction gap phenomenon in this problem. Our work also answers a problem raised in [BGPY25+].
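The observation model in this abstract can be sketched in a few lines: project the hypergraph onto pairs of vertices that share a hyperedge, then keep each projected edge with probability p and add each absent edge with probability q. The function names and the small example are hypothetical, and p and q are passed in directly rather than set from the scalings in n used in the paper.

```python
import random
from itertools import combinations


def project(hyperedges):
    """Graph projection: all vertex pairs that share at least one hyperedge."""
    edges = set()
    for hyperedge in hyperedges:
        for u, v in combinations(sorted(hyperedge), 2):
            edges.add((u, v))
    return edges


def noisy_projection(n, hyperedges, p, q, rng):
    """Keep each projected edge with probability p; add each non-edge with probability q."""
    projected = project(hyperedges)
    observed = set()
    for pair in combinations(range(n), 2):
        prob = p if pair in projected else q
        if rng.random() < prob:
            observed.add(pair)
    return observed


# A 3-uniform hypergraph on 5 vertices with two hyperedges sharing vertex 2.
hyperedges = [(0, 1, 2), (2, 3, 4)]
print(sorted(project(hyperedges)))
# [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]

# With p = 1 and q = 0 the observation is the exact projection.
rng = random.Random(0)
clean = noisy_projection(5, hyperedges, p=1.0, q=0.0, rng=rng)
print(clean == project(hyperedges))  # True
```

Detection asks whether such an observed graph can be told apart from pure noise with edge density q, while reconstruction asks for the hyperedges themselves.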

2412.11340 2026-04-03 stat.ME

Fast Bayesian Functional Principal Components Analysis

Joseph Sartini, Xinkai Zhou, Liz Selvin, Scott Zeger, Ciprian Crainiceanu

Comments 21 pages, 7 figures, 1 table

Functional Principal Components Analysis (FPCA) is a widely used analytic tool for dimension reduction of functional data. Traditional implementations of FPCA estimate the principal components from the data, then treat these estimates as fixed in subsequent analyses. To account for the uncertainty of PC estimates, we propose FAST, a fully Bayesian FPCA with three core components: (1) projection of eigenfunctions onto an orthonormal spline basis; (2) efficient sampling of the orthonormal spline coefficient matrix using a parameter expansion scheme based on polar decomposition; and (3) ordering eigenvalues during sampling. Extensive simulation studies show that FAST is very stable and performs better than existing methods. FAST is motivated by and applied to a study of the variability in mealtime glucose from the Dietary Approaches to Stop Hypertension for Diabetes Continuous Glucose Monitoring (DASH4D CGM) study. All relevant Stan code and simulation routines are available as supplementary material.

2411.12159 2026-04-03 stat.ML cs.LG cs.SY eess.SY stat.AP

Prognostics for Autonomous Deep-Space Habitat Health Management under Multiple Unknown Failure Modes

Benjamin Peters, Ayush Mohanty, Xiaolei Fang, Stephen K. Robinson, Nagi Gebraeel

Comments Manuscript under review

Deep-space habitats (DSHs) are safety-critical systems that must operate autonomously for long periods, often beyond the reach of ground-based maintenance or expert intervention. Monitoring system health and anticipating failures are therefore essential. Prognostics based on remaining useful life (RUL) prediction support this goal by estimating how long a subsystem can operate before failure. Critical DSH subsystems, including environmental control and life support, power generation, and thermal control, are monitored by many sensors and can degrade through multiple failure modes. These failure modes are often unknown, and informative sensors may vary across modes, making accurate RUL prediction challenging when historical failure data are unlabeled. We propose an unsupervised prognostics framework for RUL prediction that jointly identifies latent failure modes and selects informative sensors using unlabeled run-to-failure data. The framework consists of two phases: an offline phase, where system failure times are modeled using a mixture of Gaussian regressions and an Expectation-Maximization algorithm to cluster degradation trajectories and select mode-specific sensors, and an online phase for real-time diagnosis and RUL prediction using low-dimensional features and a weighted functional regression model. The approach is validated on simulated DSH telemetry data and the NASA C-MAPSS benchmark, demonstrating improved prediction accuracy and interpretability.

2405.14690 2026-04-03 q-bio.QM stat.AP

Beyond Scalar Metrics: Functional Data Analysis of Postprandial Continuous Glucose Monitoring in the AEGIS Study

Marcos Matabuena, Joe Sartini, Francisco Gude

Abstract

Postprandial glucose collected through continuous glucose monitoring (CGM) provides critical information for assessing metabolic capacity and guiding dietary recommendations. Traditional approaches summarize these data into scalar measures, such as 2-hour AUC or peak glucose, potentially overlooking temporal dynamics. We propose analyzing entire CGM trajectories using multilevel functional data analysis (FDA), which accounts for the smooth, hierarchical nature of glucose responses. Applying these methods to AEGIS study participants without diabetes, we illustrate how FDA characterizes variability in postprandial responses and links dietary/patient characteristics to glucose dynamics. We further extend the R-squared metric to hierarchical functional models to quantify explanatory power. Our results show that dietary effects vary across the 6-hour postprandial window: for example, fiber blunts responses after 90 minutes, while fats reduce early rises within 50 minutes. Moreover, metabolic responses differ between normoglycemic and prediabetic individuals. These findings demonstrate that functional approaches reveal temporal and stratified insights into postprandial glucose regulation that scalar methods cannot capture.

2405.13621 2026-04-03 stat.ME

Interval identification of natural effects in the presence of outcome-related unmeasured confounding

Marco Doretti, Elena Stanghellini

Comments 14 pages, 2 figures, 2 tables

Abstract

With reference to a binary outcome and a binary mediator, we derive identification bounds for natural effects under a reduced set of assumptions. Specifically, no assumptions about confounding are made that involve the outcome; we only assume no unobserved exposure-mediator confounding as well as a condition termed partially constant cross-world dependence (PC-CWD), which poses fewer constraints on the counterfactual probabilities than the usual cross-world independence assumption. The proposed strategy can also be used to achieve interval identification of the total effect, which is no longer point identified under the considered set of assumptions. Our derivations are based on postulating a logistic regression model for the mediator as well as for the outcome. However, in both cases the functional form governing the dependence on the explanatory variables is allowed to be arbitrary, thereby resulting in a semi-parametric approach. To account for sampling variability, we provide delta-method approximations of standard errors in order to build uncertainty intervals from identification bounds. The proposed method is applied to a dataset gathered from a Spanish prospective cohort study. The aim is to evaluate whether the effect of smoking on lung cancer risk is mediated by the onset of pulmonary emphysema.

2310.19603 2026-04-03 cs.LG cs.NA cs.NE math.NA math.PR stat.ML

Transformers Can Solve Non-Linear and Non-Markovian Filtering Problems in Continuous Time For Conditionally Gaussian Signals

Blanka Horvath, Anastasis Kratsios, Yannick Limmer, Xuwei Yang

Abstract

The use of attention-based deep learning models in stochastic filtering, e.g. transformers and deep Kalman filters, has recently come into focus; however, the potential for these models to solve stochastic filtering problems remains largely unknown. The paper provides an affirmative answer to this open problem in the theoretical foundations of machine learning by showing that a class of continuous-time transformer models, called filterformers, can approximately implement the conditional law of a broad class of non-Markovian and conditionally Gaussian signal processes given noisy continuous-time (possibly non-Gaussian) measurements. Our approximation guarantees hold uniformly over sufficiently regular compact subsets of continuous-time paths, where the worst-case 2-Wasserstein distance between the true optimal filter and our deep learning model quantifies the approximation error. Our construction relies on two new customizations of the standard attention mechanism. The first can losslessly adapt to the characteristics of a broad range of paths since we show that the attention mechanism implements bi-Lipschitz embeddings of sufficiently regular sets of paths into low-dimensional Euclidean spaces; thus, it incurs no "dimension reduction error". The second is tailored to the geometry of Gaussian measures in the 2-Wasserstein space. Our analysis relies on new stability estimates of robust optimal filters in the conditionally Gaussian setting.

2604.02074 2026-04-03 stat.AP cs.CV

Country-wide, high-resolution monitoring of forest browning with Sentinel-2

Samantha Biegel, David Brüggemann, Francesco Grossi, Michele Volpi, Konrad Schindler, Benjamin D. Stocker

Comments 9 pages, 7 figures, to be published in the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Congress)

Abstract

Natural and anthropogenic disturbances are impacting the health of forests worldwide. Monitoring forest disturbances at scale is important to inform conservation efforts. Here, we present a scalable approach for country-wide mapping of forest greenness anomalies at the 10 m resolution of Sentinel-2. Using relevant ecological and topographical context and an established representation of the vegetation cycle, we learn a predictive quantile model of the normalised difference vegetation index (NDVI) derived from Sentinel-2 data. The resulting expected seasonal cycles are used to detect NDVI anomalies across Switzerland between April 2017 and August 2025. Goodness-of-fit evaluations show that the conditional model explains 65% of the observed variations in the median seasonal cycle. The model consistently benefits from the local context information, particularly during the green-up period. The approach produces coherent spatial anomaly patterns and enables country-wide quantification of forest browning. Case studies with independent reference data from known events illustrate that the model reliably detects different types of disturbances.
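
As a reading aid for the abstract above, here is a minimal sketch of how NDVI is computed from near-infrared and red reflectance, together with a simple anomaly flag. The NDVI formula is the standard definition; the rule "flag an observation that falls below a predicted lower quantile" and the function names are illustrative assumptions, not the authors' pipeline.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return float("nan")  # undefined when both reflectances are zero
    return (nir - red) / denom


def is_browning_anomaly(observed_ndvi, predicted_lower_quantile):
    """Hypothetical rule: flag NDVI below the model's predicted lower quantile."""
    return observed_ndvi < predicted_lower_quantile


# Healthy canopy reflects strongly in NIR and weakly in red.
value = ndvi(nir=0.45, red=0.05)
print(value, is_browning_anomaly(value, predicted_lower_quantile=0.6))
```

In a quantile-model setting the threshold would vary per pixel and per day of year; a single fixed number is used here only to keep the sketch short.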

2604.02072 2026-04-03 stat.ME stat.CO

Sparse Probabilistic Richardson Extrapolation

Chris J. Oates, Richard Howey, Toni Karvonen

Abstract

Almost every numerical task can be cast as extrapolation with respect to the fidelity or tolerance parameters of a consistent numerical method. This perspective enables probabilistic uncertainty quantification and optimal experimental design functionality to be deployed, and also unlocks the potential for the convergence of numerical methods to be accelerated. Recent work established Probabilistic Richardson Extrapolation as a proof-of-concept, demonstrating how parallel multi-fidelity simulation can be used to accelerate simulation from a whole-heart model. However, the number of simulations was required to increase super-exponentially in $d$, the number of tolerance parameters appearing in the numerical method. This paper develops a refined notion of 'extrapolation dimension', drastically reducing this simulation requirement when multiple tolerance parameters feature in the numerical method. Sparsity-exploiting methodology is developed that is simultaneously simpler and more powerful compared to earlier work, and this is accompanied by sharp theoretical guarantees and substantial empirical support.
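
For readers unfamiliar with the classical technique that the abstract above builds on, the following is a minimal sketch of deterministic, one-parameter Richardson extrapolation applied to a central-difference derivative. It is not the probabilistic or sparse method of the paper; the function names and the test function are illustrative assumptions.

```python
import math


def central_difference(f, x, h):
    """Second-order accurate derivative estimate; leading error is O(h^2)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def richardson(f, x, h):
    """Combine estimates at step sizes h and h/2 to cancel the h^2 error term."""
    coarse = central_difference(f, x, h)
    fine = central_difference(f, x, h / 2.0)
    # If the error is c*h^2 + O(h^4), then (4*fine - coarse) / 3 removes c*h^2.
    return (4.0 * fine - coarse) / 3.0


if __name__ == "__main__":
    x, h = 1.0, 0.1
    exact = math.cos(x)  # derivative of sin at x
    print("central difference error:", abs(central_difference(math.sin, x, h) - exact))
    print("extrapolated error:      ", abs(richardson(math.sin, x, h) - exact))
```

Here the step size h plays the role of the single tolerance parameter; with several such parameters, the number of evaluations needed by a naive scheme grows quickly, which is the cost the abstract describes reducing.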

2604.02017 2026-04-03 stat.ML cs.LG

Demographic Parity Tails for Regression

Naht Sinh Le, Christophe Denis, Mohamed Hebiri

Abstract

Demographic parity (DP) is a widely studied fairness criterion in regression, enforcing independence between the predictions and sensitive attributes. However, constraining the entire distribution can degrade predictive accuracy and may be unnecessary for many applications, where fairness concerns are localized to specific regions of the distribution. To overcome this issue, we propose a new framework for regression under DP that focuses on the tails of the target distribution across sensitive groups. Our methodology builds on optimal transport theory. By enforcing fairness constraints only over targeted regions of the distribution, our approach enables more nuanced and context-sensitive interventions. Building on recent advances, we develop an interpretable and flexible algorithm that exploits the geometric structure of optimal transport. We provide theoretical guarantees, including risk bounds and fairness properties, and validate the method through experiments in regression settings.

2604.01978 2026-04-03 math.PR cs.LG stat.ML

Homogenized Transformers

Hugo Koubbi, Borjan Geshkovski, Philippe Rigollet

Abstract

We study a random model of deep multi-head self-attention in which the weights are resampled independently across layers and heads, as at initialization of training. Viewing depth as a time variable, the residual stream defines a discrete-time interacting particle system on the unit sphere. We prove that, under suitable joint scalings of the depth, the residual step size, and the number of heads, this dynamics admits a nontrivial homogenized limit. Depending on the scaling, the limit is either deterministic or stochastic with common noise; in the mean-field regime, the latter leads to a stochastic nonlinear Fokker--Planck equation for the conditional law of a representative token. In the Gaussian setting, the limiting drift vanishes, making the homogenized dynamics explicit enough to study representation collapse. This yields quantitative trade-offs between dimension, context length, and temperature, and identifies regimes in which clustering can be mitigated.

2604.01946 2026-04-03 cs.LG stat.ME stat.ML

PAC-Bayesian Reward-Certified Outcome Weighted Learning

Yuya Ishikawa, Shu Tamano

Abstract

Estimating optimal individualized treatment rules (ITRs) via outcome weighted learning (OWL) often relies on observed rewards that are noisy or optimistic proxies for the true latent utility. Ignoring this reward uncertainty leads to the selection of policies with inflated apparent performance, yet existing OWL frameworks lack the finite-sample guarantees required to systematically embed such uncertainty into the learning objective. To address this issue, we propose PAC-Bayesian Reward-Certified Outcome Weighted Learning (PROWL). Given a one-sided uncertainty certificate, PROWL constructs a conservative reward and a strictly policy-dependent lower bound on the true expected value. Theoretically, we prove an exact certified reduction that transforms robust policy learning into a unified, split-free cost-sensitive classification task. This formulation enables the derivation of a nonasymptotic PAC-Bayes lower bound for randomized ITRs, where we establish that the optimal posterior maximizing this bound is exactly characterized by a general Bayes update. To overcome the learning-rate selection problem inherent in generalized Bayesian inference, we introduce a fully automated, bounds-based calibration procedure, coupled with a Fisher-consistent certified hinge surrogate for efficient optimization. Our experiments demonstrate that PROWL achieves improvements in estimating robust, high-value treatment regimes under severe reward uncertainty compared to standard methods for ITR estimation.

2604.01943 2026-04-03 stat.ML cs.LG

A Novel Theoretical Analysis for Clustering Heteroscedastic Gaussian Data without Knowledge of the Number of Clusters

Dominique Pastor, Elsa Dupraz, Ismail Hbilou, Guillaume Ansel

Comments 76 pages, submitted to JMLR

Abstract

This paper addresses the problem of clustering measurement vectors that are heteroscedastic in that they can have different covariance matrices. Starting from the assumption that the measurement vectors within a given cluster are Gaussian distributed with possibly different and unknown covariance matrices around the cluster centroid, we introduce a novel cost function to estimate the centroids. The zeros of the gradient of this cost function turn out to be the fixed-points of a certain function. As such, the approach generalizes the methodology employed to derive the existing Mean-Shift algorithm. But as a main and novel theoretical result compared to Mean-Shift, this paper shows that the sole fixed-points of the identified function tend to be the cluster centroids if both the number of measurements per cluster and the distances between centroids are large enough. As a second contribution, this paper introduces the Wald kernel for clustering. This kernel is defined as the p-value of the Wald hypothesis test for testing the mean of a Gaussian. As such, the Wald kernel measures the plausibility that a measurement vector belongs to a given cluster and it scales better with the dimension of the measurement vectors than the usual Gaussian kernel. Finally, the proposed theoretical framework allows us to derive a new clustering algorithm called CENTRE-X that works by estimating the fixed-points of the identified function. Like Mean-Shift, CENTRE-X requires no prior knowledge of the number of clusters. It relies on a Wald hypothesis test to significantly reduce the number of fixed points to calculate compared to the Mean-Shift algorithm, thus resulting in a clear gain in complexity. Simulation results on synthetic and real data sets show that CENTRE-X has comparable or better performance than the standard clustering algorithms K-means and Mean-Shift, even when the covariance matrices are not perfectly known.
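
The fixed-point view mentioned in the abstract above can be illustrated with the standard Mean-Shift iteration. This is a minimal one-dimensional sketch with a Gaussian kernel, assuming a hand-picked bandwidth and toy data; it is not CENTRE-X and does not use the Wald kernel.

```python
import math


def mean_shift_step(x, data, bandwidth):
    """One fixed-point update: the kernel-weighted mean of the data around x."""
    weights = [math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data]
    return sum(w * d for w, d in zip(weights, data)) / sum(weights)


def mean_shift(x0, data, bandwidth, tol=1e-10, max_iter=1000):
    """Iterate x <- g(x) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_new = mean_shift_step(x, data, bandwidth)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Toy data (hypothetical): two well-separated groups around 0.05 and 10.05.
data = [-0.2, 0.0, 0.1, 0.3, 9.8, 10.0, 10.1, 10.3]
modes = [mean_shift(x0, data, bandwidth=1.0) for x0 in data]
print(sorted(set(round(m, 6) for m in modes)))
```

Starting one iteration from every data point is what makes plain Mean-Shift expensive; the abstract describes using a hypothesis test to cut down the number of fixed points that must be computed.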

2604.01880 2026-04-03 cs.LG cs.NE stat.ML

DDCL-INCRT: A Self-Organising Transformer with Hierarchical Prototype Structure (Theoretical Foundations)

Giansalvo Cirrincione

Comments 30 pages, 5 figures. Submitted to Neural Networks (Elsevier)

Abstract

Modern neural networks of the transformer family require the practitioner to decide, before training begins, how many attention heads to use, how deep the network should be, and how wide each component should be. These decisions are made without knowledge of the task, producing architectures that are systematically larger than necessary: empirical studies find that a substantial fraction of heads and layers can be removed after training without performance loss. This paper introduces DDCL-INCRT, an architecture that determines its own structure during training. Two complementary ideas are combined. The first, DDCL (Deep Dual Competitive Learning), replaces the feedforward block with a dictionary of learned prototype vectors representing the most informative directions in the data. The prototypes spread apart automatically, driven by the training objective, without explicit regularisation. The second, INCRT (Incremental Transformer), controls the number of heads: starting from one, it adds a new head only when the directional information uncaptured by existing heads exceeds a threshold. The main theoretical finding is that these two mechanisms reinforce each other: each new head amplifies prototype separation, which in turn raises the signal triggering the next addition. At convergence, the network self-organises into a hierarchy of heads ordered by representational granularity. This hierarchical structure is proved to be unique and minimal, the smallest architecture sufficient for the task, under the stated conditions. Formal guarantees of stability, convergence, and pruning safety are established throughout. The architecture is not something one designs. It is something one derives.

2604.01789 2026-04-03 stat.ML cs.LG

Learning in Prophet Inequalities with Noisy Observations

Jung-hun Kim, Vianney Perchet

Comments ICLR 2026

Abstract

We study the prophet inequality, a fundamental problem in online decision-making and optimal stopping, in a practical setting where rewards are observed only through noisy realizations and reward distributions are unknown. At each stage, the decision-maker receives a noisy reward whose true value follows a linear model with an unknown latent parameter, and observes a feature vector drawn from a distribution. To address this challenge, we propose algorithms that integrate learning and decision-making via lower-confidence-bound (LCB) thresholding. In the i.i.d. setting, we establish that both an Explore-then-Decide strategy and an $\varepsilon$-Greedy variant achieve the sharp competitive ratio of $1 - 1/e$, under a mild condition on the optimal value. For non-identical distributions, we show that a competitive ratio of $1/2$ can be guaranteed against a relaxed benchmark. Moreover, with limited window access to past rewards, the tight ratio of $1/2$ against the optimal benchmark is achieved.

2604.01734 2026-04-03 q-bio.QM stat.ME

A Novel Multi-view Mixture Model Framework for Longitudinal Clustering with Application to ANCA-Associated Vasculitis

Shen Jia, David Selby, Mark A Little, Tin Lok James Ng

Abstract

Effectively modeling irregularly sampled longitudinal data is essential for understanding disease progression and improving risk prediction. We propose a two-view mixture model that integrates static baseline covariates and longitudinal biomarker trajectories within a unified probabilistic clustering framework. Temporal patterns are modeled using Neural Ordinary Differential Equations. Model training uses an EM algorithm with a sparsity-inducing log-penalty for interpretable subgroup discovery. Application of the model to an Irish cohort of ANCA-associated vasculitis patients reveals subgroups with heterogeneous serum creatinine trajectories and variation in end-stage kidney disease outcomes.

2604.01689 2026-04-03 stat.ME

DeepKriging on the global Data

Hao-Yun Huang, Wen-Ting Wang, Ping-Hsun Chiang, Wei-Ying Wu

Abstract

The increasing availability of large-scale global datasets has generated a demand for scalable spatial prediction methods defined on spherical domains. Classical spatial models that rely on Euclidean distance representations are inappropriate for spherical data because planar projections distort geodesic distances and spatial neighborhood structures, while traditional kriging-based prediction methods are often computationally prohibitive for massive datasets. To address these challenges, we propose a Spherical DeepKriging framework for spatial prediction on $\mathbb{S}^2$. The proposed approach constructs a flexible prediction model by integrating thin-plate spline (TPS) basis functions defined intrinsically on the sphere. Simulation studies and real data analyses are presented to demonstrate the superior predictive performance of the proposed method.

2604.01629 2026-04-03 stat.ME

Conformalized Method for Empirical Bayes Normal Mean Inference Problem with Heteroscedastic Variance

Kwangok Seo, Johan Lim

Abstract

We study the normal mean inference problem, which involves simultaneous testing of the means of many normal distributions. This problem has been extensively studied within the empirical Bayes (EB) framework. However, the reliability of most EB methods heavily depends on two key conditions: (i) the prior distribution is correctly specified, and (ii) it can be accurately estimated. In practice, both conditions are difficult to satisfy, and it is often unclear whether they hold in a given application. To overcome these limitations, we propose a new algorithm, called COIN (COnformal Inference for Normal mean inference problem). Unlike traditional empirical Bayes approaches, COIN produces decision rules whose validity does not depend on the correct specification or accurate estimation of the prior. We theoretically prove that COIN asymptotically controls the false discovery rate at the nominal level, even in the presence of prior misspecification or estimation errors. Since the COIN algorithm requires an external training dataset to estimate the prior distribution and conformity score function, we introduce two data-splitting strategies -- sample-splitting and feature-splitting -- for the case where such external data are unavailable. We provide theoretical guarantees for the data-splitting strategies and demonstrate their effectiveness through extensive numerical studies and three real data examples.

2604.01625 2026-04-03 stat.ME

Data-adaptive gene and pathway-based tests for rare-variant associations with survival outcomes

Yu Wang, Kwang Woo Ahn, Sarah L. Kerns, William Hall, Petra Seibold, Christopher J. Talbot, Ana Vega, Barry S. Rosenstein, Nawaid Usmani, Catharine M. L. West, Liv Veldeman, Paul L. Auer, Zhongyuan Chen

Abstract

Statistical methods for testing aggregate rare-variant genetic associations are typically based on either burden or dispersion tests (or a combination of the two). These methods lack statistical power in the presence of diverse genetic architectures. Moreover, few aggregate rare-variant association methods have been developed specifically for survival data. To address these issues, we propose data-adaptive gene- and pathway-based association tests based on Schoenfeld residuals in Cox proportional hazards models for association studies between an aggregate of rare variants and survival outcomes. Our methods improve statistical power while maintaining flexibility across various genetic effect sizes and directions. We develop an efficient R package that enables fast computation and supports data simulation as well as gene- and pathway-level testing. Applying our approach to late bladder toxicity following radiotherapy for non-metastatic prostate cancer, we identify biologically relevant genes and pathways, replicate known signals, and capture additional associations. Our method provides a powerful, adaptive framework for survival-based genetic association studies of rare variants.

Keywords: aSPU, time-to-event outcomes, rare-variant associations, Cox regression, Schoenfeld residuals

2604.01606 2026-04-03 stat.ML cs.LG math.OC

Random Coordinate Descent on the Wasserstein Space of Probability Measures

Yewei Xu, Qin Li

Abstract

Optimization over the space of probability measures endowed with the Wasserstein-2 geometry is central to modern machine learning and mean-field modeling. However, traditional methods relying on full Wasserstein gradients often suffer from high computational overhead in high-dimensional or ill-conditioned settings. We propose a randomized coordinate descent framework specifically designed for the Wasserstein manifold, introducing both Random Wasserstein Coordinate Descent (RWCD) and Random Wasserstein Coordinate Proximal-Gradient (RWCP) for composite objectives. By exploiting coordinate-wise structures, our methods adapt to anisotropic objective landscapes where full-gradient approaches typically struggle. We provide a rigorous convergence analysis across various landscape geometries, establishing guarantees under non-convex, Polyak-Łojasiewicz, and geodesically convex conditions. Our theoretical results mirror the classic convergence properties found in Euclidean space, revealing a compelling symmetry between coordinate descent on vectors and on probability measures. The developed techniques are inherently adaptive to the Wasserstein geometry and offer a robust analytical template that can be extended to other optimization solvers within the space of measures. Numerical experiments on ill-conditioned energies demonstrate that our framework offers significant speedups over conventional full-gradient methods.

2604.01593 2026-04-03 stat.ME

Nonparametric regression of spatio-temporal data using infinite-dimensional covariates

Subhrajyoty Roy, Soudeep Deb, Sayar Karmakar, Rishideep Roy

Abstract

In spatio-temporal analysis, we often record data at specific time intervals but with varying spatial locations between these timepoints. We propose a conditional model to analyze such spatio-temporal data that accommodates the dependencies alongside second-order stationary explanatory variables, which may be infinite-dimensional and accommodate spatio-temporal covariates. Because of the absence of a mixing-type dependence condition in this case, which is typically required by the existing studies, we consider a weaker polynomially decaying moment contraction (PMC) condition on the covariates. In this paper, we obtain nonparametric point estimates of the mean and covariate functions of such a regression model, which we then show to be statistically consistent. We also obtain a simultaneous confidence interval of the mean function using the central limit theorem for the proposed estimator. Such simultaneous inference tools can be used to test for certain specifications of the mean function. Simulation studies and two real-data analyses are presented to corroborate the findings.

2604.01580 2026-04-03 stat.CO math.PR

Simulation and Analysis of Multifractional Stochastic Processes with R Package Rmfrac

Andriy Olenko, Nemini Samarakoon

Comments 29 pages, 10 figures

Abstract

Brownian motion and fractional Brownian motion have been widely applied in statistical modeling in finance, telecommunication, network traffic, neuroscience, physics, and other fields. More realistic models for real time series data, such as multifractional processes, generalize these classical models by allowing their regularity to vary over time. A new class of Gaussian Haar-based multifractional processes, which utilizes the Haar wavelet series representation, was recently introduced. It significantly extends the range of available models by incorporating more general classes of Hurst functions. The Rmfrac package was developed to simulate multifractional time series. The package also comprises several functions for the analysis and visualization of time series. It includes the estimation of the Hurst function and local fractal dimension, clustering realizations and computing various geometric statistics of these time series. The package also offers a Shiny application to visualize simulation and estimation results. The article presents an overview of the Rmfrac package and exemplifies its main functionalities.

2604.01568 2026-04-03 math.ST stat.ME stat.TH

Asymptotic theory and bias correction for the Wallace--Freeman estimator

Enes Makalic, Daniel F. Schmidt

Abstract

The Wallace--Freeman estimator is a classical invariant point estimator whose large-sample properties have not been fully developed in a modern asymptotic framework. We show that the estimator can be formulated as a penalised M-estimator with a specific penalty weight, yielding a unified route to its asymptotic analysis. This representation allows us to establish existence, consistency, an asymptotic linear expansion, and asymptotic normality under standard regularity conditions. We further derive the first-order difference between the Wallace--Freeman estimator and the maximum likelihood estimator, and show that this induces an explicit $O(n^{-1})$ bias correction determined by the gradient of the penalty. As a consequence, the Cox--Snell bias formula for the maximum likelihood estimator extends naturally to the Wallace--Freeman estimator by the addition of a penalty-driven correction term. As an illustration, we derive the first-order bias of the Wallace--Freeman estimator for the Weibull model and show how the penalty modifies the corresponding maximum likelihood bias. These results place the Wallace--Freeman estimator within the general theory of penalised likelihood and provide a rigorous asymptotic basis for its use in parametric inference.

2604.01546 2026-04-03 stat.ME

Spatially-informed Image Harmonization Results in Improved Scanner Effect Removal and Prediction

Alec Reinhardt, Yajie Liu, Suprateek Kundu

Comments 31 pages, 5 figures

Abstract

We propose a novel data harmonization approach known as Tensor-ComBat (TC) for structural neuroimaging data. Tensor-ComBat is a spatially aware harmonization method that aims to estimate and remove unwanted technical variation between voxel-level images from different study sites or scanners. Tensor-ComBat uses a Bayesian tensor response regression (BTRR) model to estimate spatially distributed scanner effects via a low-rank PARAFAC decomposition on the model coefficients, and subsequently removes these scanner effects via a post-hoc ComBat harmonization step. Unlike the classical ComBat method that treats the ROIs or voxels in the image as interchangeable, the Tensor-ComBat approach incorporates the information about the spatial configurations of imaging voxels when estimating the model parameters, resulting in an improved harmonization pipeline. The proposed Tensor-ComBat method is fit using a Markov chain Monte Carlo (MCMC) algorithm that produces point estimates and also quantifies the uncertainty associated with these estimates. Because Tensor-ComBat is fully Bayesian, it can perform inference on significant spatially distributed scanner effects and/or biological effects, which is not straightforward under existing methods. The method is applied to over 2100 T1w-MRI scans in ADNI-1, illustrating greater scanner effect removal, improved biological prediction, and superior reproducibility compared to state-of-the-art approaches such as ComBat.

2604.01501 2026-04-03 stat.ME stat.AP stat.ML stat.OT

Identifying and Estimating Causal Direct Effects Under Unmeasured Confounding

Philippe Boileau, Nima S. Hejazi, Ivana Malenica, Peter B. Gilbert, Sandrine Dudoit, Mark J. van der Laan

Abstract

Causal mediation analysis provides techniques for defining and estimating effects that may be endowed with mechanistic interpretations. With many scientific investigations seeking to address mechanistic questions, causal direct and indirect effects have garnered much attention. The natural direct and indirect effects, the most widely used among such causal mediation estimands, are limited in their practical utility due to stringent identification requirements. Accordingly, considerable effort has been invested in developing alternative direct and indirect effect decompositions with relaxed identification requirements. Such efforts often yield effect definitions with nuanced and challenging interpretations. By contrast, relatively limited attention has been paid to relaxing the identification assumptions of the natural direct and indirect effects. Motivated by a secondary aim of a recent non-randomized vaccine prospective cohort study (NCT05168813), we present a set of relaxed conditions under which the natural direct effect is identifiable in spite of unobserved baseline confounding of the exposure-mediator pathway; we use this result to investigate the effect mediated by putative immune correlates of protection. Relaxing the commonly used but restrictive cross-world counterfactual independence assumption, we discuss strategies for evaluating the natural direct effect in non-randomized settings that arise in the analysis of vaccine studies. We revisit prior studies of semi-parametric efficiency theory to demonstrate the construction of flexible, multiply robust estimators of the natural direct effect and discuss efficient estimation strategies that do not place restrictive modeling assumptions on nuisance functions.

2604.01500 2026-04-03 stat.ME

Copula-Based Time Series for Non-Gaussian and Non-Markovian Stationary Processes

Sven Pappert, Harry Joe

Abstract

In the copula-based approach to univariate time series modeling, the finite-dimensional temporal dependence of a stationary time series is captured by a copula. Recent studies investigate how copula-based time series models can be generalized to have long-term autoregressive effects. We study a generalization that comes from a Markov sequence of order p and a q-dependent sequence. We derive the relation of the model to Gaussian-ARMA models and to the Gaussian-GARCH(1,1) model. We investigate distributional properties of the process and discuss maximum likelihood estimation (MLE). Additionally, we analyze the copula moving aggregate process of order one, or MAG(1), as it is a basic building block. Finally, we test the model in probabilistic forecasting studies on US inflation and German wind energy production.

2604.01491 2026-04-03 stat.AP

Opponent-Adjusted Evaluation of NFL Pass Blocking and Pass Rushing Performance

Jonathan Pipping-Gamón, Maximilian Gebauer, Victoria Lee, Kenny Watts, Abraham J. Wyner

Comments 14 pages, 3 figures, 5 tables. Code available at https://github.com/WhartonSABI/nfl-elo

Abstract

Evaluating offensive linemen and pass rushers at the player level is difficult because observable outcomes are sparse, opponent-dependent, and strongly shaped by surrounding context. Using 2021 regular-season Hudl tracking data, we construct a blocker-rusher interaction dataset and estimate two ridge-regularized Bradley-Terry paired-comparison models: a binary win/loss model aligned with the 2.5-second pass block win-rate definition and a four-class severity model over loss, win, hit, and sack, with both models incorporating a double-team indicator. The final dataset contains 153,138 interactions across 33,283 pass plays in 266 games. On an ordered 80/20 holdout split (test n = 30,628), both models improve on global baselines and modestly outperform stronger matchup baselines under log-loss evaluation, corresponding to relative log-loss reductions of about 0.24% to 1.21%. Game-level bootstrap resampling indicates that these gains are most stable for the win model and for the severity model relative to the global baseline, while the severity-versus-matchup comparison remains directionally positive but less certain. External comparison to 2021 AP All-Pro selections provides additional face validation on the learned rankings, with the severity model showing the strongest alignment to expert recognition. Overall, ridge-regularized Bradley-Terry models provide an interpretable opponent-adjusted framework for evaluating NFL pass protection and pass rush at the interaction level.
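
The binary win/loss model described above can be sketched as a ridge-penalized logistic regression on strength differences. The code below is a minimal illustration, not the authors' implementation (their code is in the linked repository): the function name, the parameterization `theta[blocker] - theta[rusher] + delta * double_team`, the penalty weight `lam`, and the plain gradient-descent optimizer are all assumptions.

```python
import numpy as np


def fit_ridge_bradley_terry(blocker, rusher, double_team, win,
                            n_players, lam=1.0, lr=0.5, n_iter=2000):
    """Hypothetical sketch: P(blocker wins) = sigmoid(theta_b - theta_r
    + delta * double_team), fitted by gradient descent on the mean
    log-loss plus a ridge penalty on theta."""
    theta = np.zeros(n_players)
    delta = 0.0
    n = len(win)
    for _ in range(n_iter):
        logit = theta[blocker] - theta[rusher] + delta * double_team
        p = 1.0 / (1.0 + np.exp(-logit))
        resid = p - win  # derivative of the log-loss w.r.t. the logit
        grad_theta = np.zeros(n_players)
        np.add.at(grad_theta, blocker, resid)   # blockers enter with +1
        np.add.at(grad_theta, rusher, -resid)   # rushers enter with -1
        grad_theta = (grad_theta + lam * theta) / n
        grad_delta = np.dot(resid, double_team) / n
        theta -= lr * grad_theta
        delta -= lr * grad_delta
    return theta, delta
```

Because every interaction involves one blocker and one rusher, each strength is estimated relative to the opponents actually faced, which is what makes the resulting ranking opponent-adjusted. The ridge term shrinks players with few interactions toward zero.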

2604.01470 2026-04-03 math.ST stat.ME stat.TH

Sharp Debiasing for Smooth Functional Estimation in Banach Spaces

Woonyoung Chang, Arun Kumar Kuchibhotla

Abstract

This paper studies the estimation of smooth functionals $f(θ)$ of a mean parameter $θ = \mathbb{E}_P[W]$ for a distribution $P$ on a general Banach space. We propose a cross-fitted estimator based on a single sample splitting and establish non-asymptotic moment bounds and Berry--Esséen bounds for both $m$-smooth and infinitely smooth functionals under finite moment assumptions. Our framework is applied to precision matrix estimation and the inference of projection parameters in high-dimensional regression. In these Euclidean settings, the proposed estimators achieve asymptotic normality under the dimension regime $d \log^2(en) = o(n)$ without requiring any structural assumptions (e.g., sparsity). We discuss computational relaxations that enable polynomial-time implementation for a range of matrix functionals.

2604.01441 2026-04-03 eess.SY cs.LG cs.OS cs.SY eess.SP stat.ML

Generative Profiling for Soft Real-Time Systems and its Applications to Resource Allocation

Georgiy A. Bondar, Abigail Eisenklam, Yifan Cai, Robert Gifford, Tushar Sial, Linh Thi Xuan Phan, Abhishek Halder

Abstract

Modern real-time systems require accurate characterization of task timing behavior to ensure predictable performance, particularly on complex hardware architectures. Existing methods, such as worst-case execution time analysis, often fail to capture the fine-grained timing behaviors of a task under varying resource contexts (e.g., an allocation of cache, memory bandwidth, and CPU frequency), which is necessary to achieve efficient resource utilization. In this paper, we introduce a novel generative profiling approach that synthesizes context-dependent, fine-grained timing profiles for real-time tasks, including those for unmeasured resource allocations. Our approach leverages a nonparametric, conditional multi-marginal Schrödinger Bridge (MSB) formulation to generate accurate execution profiles for unseen resource contexts, with maximum likelihood guarantees. We demonstrate the efficiency and effectiveness of our approach through real-world benchmarks, and showcase its practical utility in a representative case study of adaptive multicore resource allocation for real-time systems.

2604.01411 2026-04-03 cs.LG cs.CL stat.ML

Test-Time Scaling Makes Overtraining Compute-Optimal

Nicholas Roberts, Sungjun Cho, Zhiqi Gao, Tzu-Heng Huang, Albert Wu, Gabriel Orlanski, Avi Trost, Kelly Buchanan, Aws Albarghouthi, Frederic Sala

Abstract

Modern LLMs scale at test time, e.g. via repeated sampling, where inference cost grows with model size and the number of samples. This creates a trade-off that pretraining scaling laws, such as Chinchilla, do not address. We present Train-to-Test ($T^2$) scaling laws that jointly optimize model size, training tokens, and number of inference samples under fixed end-to-end budgets. $T^2$ modernizes pretraining scaling laws with pass@$k$ modeling used for test-time scaling, then jointly optimizes pretraining and test-time decisions. Forecasts from $T^2$ are robust over distinct modeling approaches: measuring joint scaling effect on the task loss and modeling impact on task accuracy. Across eight downstream tasks, we find that when accounting for inference cost, optimal pretraining decisions shift radically into the overtraining regime, well outside the range of standard pretraining scaling suites. We validate our results by pretraining heavily overtrained models in the optimal region that $T^2$ scaling forecasts, confirming their substantially stronger performance compared to pretraining scaling alone. Finally, as frontier LLMs are post-trained, we show that our findings survive the post-training stage, making $T^2$ scaling meaningful in modern deployments.

2604.01399 2026-04-03 math.ST math.PR stat.TH

Conditional Independence under Infinite Measures and Poisson Point Processes

Shuyang Bai, Vishal Routh

Comments 15 pages

Abstract

We study conditional independence under infinite measures on punctured product spaces, a notion recently introduced for graphical modeling in multivariate extremes and Lévy processes. In contrast to classical probabilistic conditional independence, this concept is formulated through normalized restrictions of an infinite measure that reflects the non-product structure of the punctured space. We show that this non-standard notion admits a natural probabilistic characterization: it is equivalent to classical conditional independence between coordinate projections of a Poisson point process defined on the punctured space with the given infinite measure as its mean measure. In addition, we provide a functional characterization of the conditional independence concept at the level of the enumerated points of the Poisson point process. We further extend the framework from punctured Euclidean product spaces to a more general abstract setting, thereby broadening its scope of potential applications.

2604.01356 2026-04-03 cs.DS stat.CO

A divide and conquer strategy for multinomial particle filter resampling

Andrey A. Popov

Abstract

This work provides a new multinomial resampling procedure for particle filter resampling, focused on the case where the number of samples required is less than or equal to the size of the underlying discrete distribution. This setting is common in ensemble mixture model filters such as the Gaussian mixture filter. We show superiority of our approach with respect to two of the best-known multinomial sampling procedures, both through a computational complexity analysis and through a numerical experiment.
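
For context, the sketch below shows a standard baseline for multinomial resampling: build the cumulative distribution once, then locate each uniform draw by binary search. It is not the divide-and-conquer procedure proposed in the paper, whose details the abstract does not give; the function name and signature are assumptions.

```python
import numpy as np


def multinomial_resample(weights, m, rng):
    """Baseline sketch: draw m indices i.i.d. from the discrete
    distribution given by `weights`, via inverse-CDF lookup.
    Cost is O(n) for the cumulative sum plus O(m log n) for the search."""
    w = np.asarray(weights, dtype=float)
    cdf = np.cumsum(w / w.sum())
    cdf[-1] = 1.0  # protect against round-off in the last entry
    u = rng.random(m)  # uniforms in [0, 1)
    # side="right" returns the first index whose CDF value exceeds u,
    # so zero-weight entries are never selected.
    return np.searchsorted(cdf, u, side="right")
```

A call such as `multinomial_resample(w, m, np.random.default_rng(0))` with `m <= len(w)` corresponds to the regime the paper targets, where fewer samples are needed than there are particles.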

2604.01339 2026-04-03 cs.CV cs.AI cs.LG stat.ME stat.ML

Regularizing Attention Scores with Bootstrapping

Neo Christopher Chung, Maxim Laletin

Journal ref Artificial Intelligence and Statistics (AISTATS) 2026
Abstract

Vision transformers (ViT) rely on an attention mechanism to weigh input features, and therefore attention scores have naturally been considered as explanations for their decision-making process. However, attention scores are almost always non-zero, resulting in noisy and diffused attention maps and limiting interpretability. Can we quantify uncertainty measures of attention scores and obtain regularized attention scores? To this end, we consider attention scores of ViT in a statistical framework where independent noise would lead to insignificant yet non-zero scores. Leveraging statistical learning techniques, we introduce bootstrapping for attention scores, which generates a baseline distribution of attention scores by resampling input features. Such a bootstrap distribution is then used to estimate significances and posterior probabilities of attention scores. In natural and medical images, the proposed \emph{Attention Regularization} approach demonstrates a straightforward removal of spurious attention arising from noise, drastically improving shrinkage and sparsity. Quantitative evaluations are conducted using both simulation and real-world datasets. Our study highlights bootstrapping as a practical regularization tool when using attention scores as explanations for ViT. Code available: https://github.com/ncchung/AttentionRegularization

2604.01325 2026-04-03 cs.AI stat.ME

The Digital Twin Counterfactual Framework: A Validation Architecture for Simulated Potential Outcomes

Olav Laudy

Abstract

The fundamental problem of causal inference - that the counterfactual outcome for any individual is never observed - has shaped the entire methodology of the field. Every existing approach substitutes assumptions for missing data: ignorability, parallel trends, exclusion restrictions. None produces the counterfactual itself. This paper proposes the Digital Twin Counterfactual Framework (DTCF): rather than estimating the counterfactual statistically, we simulate it using a digital twin and subject the simulation to a hierarchical validation regime. We formalize the digital twin simulator as a stochastic mapping within the potential outcomes framework and introduce a hierarchy of twin fidelity assumptions - from marginal fidelity through joint fidelity to structural fidelity - each unlocking a progressively richer class of estimands. The central contribution is threefold. First, a five-level validation architecture converts the unfalsifiable claim that the simulator produces correct counterfactuals into falsifiable tests against observable data. Second, a formal decomposition separates causal quantities into those that are marginally validated (ATE, CATE, QTE - testable through observable-arm comparison) and those that are copula-dependent (the ITE distribution, probability of benefit/harm, variance of treatment effects - permanently reliant on the unobservable within-individual dependence structure). Third, bounding, sensitivity, and uncertainty quantification tools make the copula dependence explicit. The DTCF does not resolve the fundamental problem of causal inference. What it provides is a framework in which marginal causal claims become increasingly testable, joint causal claims become explicitly assumption-indexed, and the gap between the two is formally characterized.

2604.01321 2026-04-03 math.OC cs.NA cs.SY eess.SY math.NA stat.CO

Risk Control of Traffic Flow Through Chance Constraints and Large Deviation Approximation

Rui Xu, Shanyin Tong, Xuan Di

Abstract

Existing macroscopic traffic control methods often struggle to strictly regulate rare, safety-critical extreme events under stochastic disturbances. In this paper, we develop a rare chance-constrained optimal control framework for autonomous traffic management. To efficiently enforce these probabilistic safety specifications, we exploit a large deviation theory (LDT) based approximation method, which converts the original highly non-convex, sampling-heavy optimization problem into a tractable deterministic nonlinear programming problem. In addition, the proposed LDT-based reformulation exhibits superior computational scalability, as it maintains a constant computational burden regardless of the target violation probability level, effectively bypassing the extreme scaling bottlenecks of traditional sampling-based methods. The effectiveness of the proposed framework in achieving precise near-target probability control and superior computational efficiency over risk-averse baselines is illustrated through extensive numerical simulations across diverse traffic risk measures.

2604.01267 2026-04-03 math.ST stat.ML stat.TH

Observable Geometry of Singular Statistical Models

Sean Plummer

Abstract

Singular statistical models arise whenever different parameter values induce the same distribution, leading to non-identifiability and a breakdown of classical asymptotic theory. While existing approaches analyze these phenomena in parameter space, the resulting descriptions depend heavily on parameterization and obscure the intrinsic statistical structure of the model. In this paper, we introduce an invariant framework based on \emph{observable charts}: collections of functionals of the data distribution that distinguish probability measures. These charts define local coordinate systems directly on the model space, independent of parameterization. We formalize \emph{observable completeness} as the ability of such charts to detect identifiable directions, and introduce \emph{observable order} to quantify higher-order distinguishability along analytic perturbations. Our main result establishes that, under mild regularity conditions, observable order provides a lower bound on the rate at which Kullback-Leibler divergence vanishes along analytic paths. This connects intrinsic geometric structure in model space to statistical distinguishability and recovers classical behavior in regular models while extending naturally to singular settings. We illustrate the framework in reduced-rank regression and Gaussian mixture models, where observable coordinates reveal both identifiable structure and singular degeneracies. These results suggest that observable charts provide a unified and parameterization-invariant language for studying singular models and offer a pathway toward intrinsic formulations of invariants such as learning coefficients.

2604.01266 2026-04-03 math.ST stat.CO stat.TH

Horseshoe Priors and MDP

Nick Polson, Vadim Sokolov, Daniel Zantedeschi

Abstract

Carvalho (2010) established two foundational theorems for the horseshoe prior: tight two-sided logarithmic bounds on the marginal density near the origin (Theorem 1.1), and a super-efficient rate of convergence of the Bayes predictive density to the true sampling density in sparse situations (Theorem 2). The "Shrink Globally, Act Locally" paper \citep{polson2010shrink} formalised necessary and sufficient conditions on the prior's behaviour at the origin for sparsity adaptation as $p \to \infty$. We show that these results are not merely descriptive properties of the horseshoe -- they are the finite-sample precursors to the asymptotic moderate deviation principle (MDP) of \citet{datta2026newlook}. The log-pole singularity $\pi_H(θ) \asymp -\log|θ|$ is precisely the origin integrability boundary that selects the MDP threshold $t_{\mathrm{crit}} = \sqrt{\log(πn/2)}$; super-efficiency below the threshold and tail robustness above it together produce the ABOS Bayes risk $p_0 \log(p/p_0)/n$; and the Clarke--Barron information-theoretic asymptotics of Bayes methods provide the unifying framework in which all three results are faces of a single logarithmic budget principle.

2604.01086 2026-04-03 cs.DS cs.IT math.IT math.ST stat.TH

Asymptotically Optimal Sequential Testing with Heterogeneous LLMs

Guokai Li, Alys Liang, Mo Liu, Murray Lei, Stefanus Jasin, Fenghua Yang, Preet Baxi

Abstract

We study a Bayesian binary sequential hypothesis testing problem with multiple large language models (LLMs). Each LLM $j$ has per-query cost $c_j>0$, random waiting time with mean $μ_j>0$ and sub-Gaussian tails, and \emph{asymmetric} accuracies: the probability of returning the correct label depends on the true hypothesis $θ\in\{A,B\}$ and need not be the same under $A$ and $B$. This asymmetry induces two distinct information rates $(I_{j,A}, I_{j,B})$ per LLM, one under each hypothesis. The decision-maker chooses LLMs sequentially, observes their noisy binary answers, and stops when the posterior probability of one hypothesis exceeds $1-α$. The objective is to minimize the sum of expected query cost and expected waiting cost, $\mathbb{E}[C_π] + \mathbb{E}[g(W_π)]$, where $C_π$ is the total query cost, $W_π$ is the total waiting time and $g$ is a polynomial function (e.g., $g(x)=x^ρ$ with $ρ\ge 1$). We prove that as the error tolerance $α\to0$, the optimal policy is asymptotically equivalent to one that uses at most two LLMs. In this case, a single-LLM policy is \emph{not} generically optimal: optimality now requires exploiting a two-dimensional tradeoff between information under $A$ and information under $B$. Any admissible policy induces an expected information-allocation vector in $\mathbb{R}_+^2$, and we show that the optimal allocation lies at an extreme point of the associated convex set when $α$ is relatively small, and hence uses at most two LLMs. We construct belief-dependent policies that first mix between two LLMs when the posterior is ambiguous, and then switch to a single "specialist" LLM when the posterior is sufficiently close to one of the hypotheses. These policies match the universal lower bound up to a $(1+o(1))$ factor as $α\rightarrow 0$.
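
The posterior update and stopping rule described above can be sketched for a single LLM as follows. This is only the Bayesian bookkeeping, not the paper's two-LLM policy, and it ignores waiting times entirely; the function names, the symbols `acc_a` and `acc_b` (probability of a correct answer under $A$ and under $B$), and the `max_queries` safeguard are assumptions.

```python
import random


def update_posterior(p_a, answer, acc_a, acc_b):
    """Bayes update of P(truth = A) after one noisy binary answer."""
    # Likelihood of the observed answer under each hypothesis.
    like_a = acc_a if answer == "A" else 1.0 - acc_a
    like_b = 1.0 - acc_b if answer == "A" else acc_b
    num = p_a * like_a
    return num / (num + (1.0 - p_a) * like_b)


def run_single_llm(truth, acc_a, acc_b, cost, alpha,
                   prior_a=0.5, seed=0, max_queries=10_000):
    """Query one simulated LLM until one posterior exceeds 1 - alpha."""
    rng = random.Random(seed)
    p_a, total_cost, n = prior_a, 0.0, 0
    while max(p_a, 1.0 - p_a) <= 1.0 - alpha and n < max_queries:
        correct = rng.random() < (acc_a if truth == "A" else acc_b)
        wrong_label = "B" if truth == "A" else "A"
        answer = truth if correct else wrong_label
        p_a = update_posterior(p_a, answer, acc_a, acc_b)
        total_cost += cost
        n += 1
    decision = "A" if p_a >= 0.5 else "B"
    return decision, p_a, total_cost
```

With asymmetric accuracies, an answer of "A" and an answer of "B" move the posterior by different amounts, which is why a model that is informative under one hypothesis can be weak under the other.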

2603.28650 2026-04-03 cs.LG cs.AI stat.ML

Information-Theoretic Limits of Safety Verification for Self-Improving Systems

Arsenios Scrivens

Comments 27 pages, 6 figures. Companion empirical paper: doi:10.5281/zenodo.19237566

Abstract

Can a safety gate permit unbounded beneficial self-modification while maintaining bounded cumulative risk? We formalize this question through dual conditions -- requiring sum delta_n < infinity (bounded risk) and sum TPR_n = infinity (unbounded utility) -- and establish a theory of their (in)compatibility. Classification impossibility (Theorem 1): For power-law risk schedules delta_n = O(n^{-p}) with p > 1, any classifier-based gate under overlapping safe/unsafe distributions satisfies TPR_n <= C_alpha * delta_n^beta via Holder's inequality, forcing sum TPR_n < infinity. This impossibility is exponent-optimal (Theorem 3). A second independent proof via the NP counting method (Theorem 4) yields a 13% tighter bound without Holder's inequality. Universal finite-horizon ceiling (Theorem 5): For any summable risk schedule, the exact maximum achievable classifier utility is U*(N, B) = N * TPR_NP(B/N), growing as exp(O(sqrt(log N))) -- subpolynomial. At N = 10^6 with budget B = 1.0, a classifier extracts at most U* ~ 87 versus a verifier's ~500,000. Verification escape (Theorem 2): A Lipschitz ball verifier achieves delta = 0 with TPR > 0, escaping the impossibility. Formal Lipschitz bounds for pre-LayerNorm transformers under LoRA enable LLM-scale verification. The separation is strict. We validate on GPT-2 (d_LoRA = 147,456): conditional delta = 0 with TPR = 0.352. Comprehensive empirical validation is in the companion paper [D2].

2603.23675 2026-04-03 stat.AP math.PR

Dynamical behaviors of a stochastic SIS epidemic model with mean-reverting inhomogeneous geometric brownian motion

Lahcen Khammich, Driss Kiouach

Comments It contains significant errors that require substantial revision

Abstract

The main purpose of this paper is to study the dynamical behaviors of a stochastic SIS epidemic model driven by a mean-reverting inhomogeneous geometric Brownian motion process. First, we demonstrate the existence of a global-in-time solution and establish that it is unique and remains positive. Then we derive a sufficient condition for exponential extinction of infectious diseases, and we show that our extinction threshold in the stochastic case coincides with that of the deterministic case. Finally, we define an appropriate theoretical framework to guarantee the existence of an ergodic stationary distribution.

2603.22888 2026-04-03 math.ST stat.TH

Boundary Inference for Mixed Fractional Models under High-Frequency Observation: Critical LAN and Score Tests at $H=3/4$

Chunhao Cai, Yiwu Shang, Weilin Xiao, Cong Zhang

Abstract

We study boundary inference at $H=3/4$ for mixed fractional Brownian motion and mixed fractional Ornstein--Uhlenbeck models under high-frequency observation. This boundary is economically important because it separates the critical and supercritical regimes of mixed fractional dynamics. We make three contributions. First, we identify the exact critical first-order scaling and show that, after removing the explicit linear component in the $H$-score, the transformed $(σ,H)$ block is already non-degenerate. Second, we establish critical score central limit theorems (CLT) and derive local asymptotic normality (LAN) with fully explicit leading information constants for both models. Third, we construct boundary-calibrated one-sided score tests for detecting entry into the supercritical region $H>3/4$ and discuss feasible implementation through restricted nuisance estimation. Monte Carlo evidence shows that the feasible statistic has the correct directional power but conservative null calibration. Finally, an intraday illustration on one-minute SPY data finds no persistent evidence in favor of $H>3/4$.

2603.22573 2026-04-03 stat.ME

Multiple Jump MCMC: A Scalable Algorithm for Bayesian Inference on Binary Model Spaces

Lucas Vogels, Reza Mohammadi, Marit Schoonhoven, Sinan Yildirim, Ilker Birbil

Abstract

This article considers Bayesian model inference on binary model spaces. Binary model spaces are used by a large class of models, including graphical models, variable selection, mixture distributions, and decision trees. Traditional strategies in this field, such as reversible jump or birth-death MCMC algorithms, are still popular, despite suffering from a slow exploration of the model space. In this article, we propose an alternative: the Multiple Jump MCMC algorithm. The algorithm is simple, rejection-free, and remarkably fast. When applied to undirected Gaussian graphical models, it is 100 to 200 times faster than the state-of-the-art, solving models with $500,000$ parameters in less than a minute. We provide theorems showing how accurately our algorithm targets the posterior, and we show how to apply our framework to Gaussian graphical models, Ising models, and variable selection, but note that it applies to most Bayesian posterior inference on binary model spaces.

2603.20359 2026-04-03 stat.ML cs.LG cs.NA math.DS math.NA

Operator Learning for Smoothing and Forecasting

Edoardo Calvello, Elizabeth Carlson, Nikola Kovachki, Michael N. Manta, Andrew M. Stuart

Abstract

Machine learning has opened new frontiers in purely data-driven algorithms for data assimilation in, and for forecasting of, dynamical systems; the resulting methods are showing some promise. However, in contrast to model-driven algorithms, analysis of these data-driven methods is poorly developed. In this paper we address this issue, developing a theory to underpin data-driven methods to solve smoothing problems arising in data assimilation and forecasting problems. The theoretical framework relies on two key components: (i) establishing the existence of the mapping to be learned; (ii) the properties of the operator learning architecture used to approximate this mapping. By studying these two components in conjunction, we establish novel universal approximation theorems for purely data-driven algorithms for both smoothing and forecasting of dynamical systems. We work in the continuous time setting, hence deploying neural operator architectures. The theoretical results are illustrated with experiments studying the Lorenz '63, Lorenz '96 and Kuramoto-Sivashinsky dynamical systems.

2603.20025 2026-04-03 stat.ML cs.LG math.ST stat.TH

Graph-Informed Adversarial Modeling: Infimal Subadditivity of Interpolative Divergences

Panagiota Birmpa, Eric Joseph Hall

Comments 34 pages, 9 figures

Abstract

We study adversarial learning when the target distribution factorizes according to a known Bayesian network. For interpolative divergences, including $(f,Γ)$-divergences, we prove a new infimal subadditivity principle showing that, under suitable conditions, a global variational discrepancy is controlled by an average of family-level discrepancies aligned with the graph. In an additive regime, the surrogate is exact. This closes a theoretical gap in the literature; existing subadditivity results justify graph-informed adversarial learning for classical discrepancies, but not for interpolative divergences, where the usual factorization argument breaks down. In turn, we provide a justification for replacing a standard, graph-agnostic GAN with a monolithic discriminator by a graph-informed GAN (GiGAN) with localized family-level discriminators, without requiring the optimizer itself to factorize according to the graph. We also obtain parallel results for integral probability metrics and proximal optimal transport divergences, identify natural discriminator classes for which the theory applies, and present experiments showing improved stability and structural recovery relative to graph-agnostic baselines.

2603.11457 2026-04-03 stat.ME econ.EM math.ST stat.TH

Bayesian Modular Inference for Copula Models with Potentially Misspecified Marginals

Lucas Kock, David T. Frazier, Michael Stanley Smith, David J. Nott

Abstract

Copula models of multivariate data are popular because they allow separate specification of marginal distributions and the copula function. These components can be treated as inter-related modules in a modified Bayesian inference approach called "cutting feedback" that is robust to their misspecification. Recent work uses a two-module approach, where all $d$ marginals form a single module, to robustify inference for the marginals against copula function misspecification, or vice versa. However, marginals can exhibit differing levels of misspecification, making it attractive to assign each its own module with an individual influence parameter controlling its contribution to a joint semi-modular inference (SMI) posterior. This generalizes existing two-module SMI methods, which interpolate between cut and conventional posteriors using a single influence parameter. We develop a novel copula SMI method and select the influence parameters using Bayesian optimization. It provides an efficient continuous relaxation of the discrete optimization problem over $2^d$ cut/uncut configurations. We establish theoretical properties of the resulting semi-modular posterior and demonstrate the approach on simulated and real data. The real data application uses a skew-normal copula model of asymmetric dependence between equity volatility and bond yields, where robustifying copula estimation against marginal misspecification is strongly motivated.

2603.02491 2026-04-03 cs.LG cs.AI cs.RO q-bio.NC stat.ML

What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty

Aran Nayebi

Comments 23 pages; added PSR recovery (Theorems 3 & 4), and updated related work

Abstract

As artificial agents become increasingly capable, what internal structure is *necessary* for an agent to act competently under uncertainty? Classical results show that optimal control can be *implemented* using belief states or world models, but not that such representations are required. We prove quantitative "selection theorems" showing that strong task performance (low *average-case regret*) forces world models, belief-like memory and -- under task mixtures -- persistent variables resembling core primitives associated with emotion, along with informational modularity under block-structured tasks. Our results cover stochastic policies, partial observability, and evaluation under task distributions, without assuming optimality, determinism, or access to an explicit model. Technically, we reduce predictive modeling to binary "betting" decisions and show that regret bounds limit probability mass on suboptimal bets, enforcing the predictive distinctions needed to separate high-margin outcomes. In fully observed settings, this yields approximate recovery of the interventional transition kernel; under partial observability, it implies necessity of predictive state and belief-like memory, addressing an open question in prior world-model recovery work.

2602.16142 2026-04-03 math.ST cs.CG cs.LG stat.TH

Ratio Covers of Convex Sets and Optimal Mixture Density Estimation

Spencer Compton, Gábor Lugosi, Jaouad Mourtada, Jian Qian, Nikita Zhivotovskiy

Comments 47 pages

Abstract

We study density estimation in Kullback-Leibler divergence: given an i.i.d. sample from an unknown density $p^\star$, the goal is to construct an estimator $\widehat{p}$ such that $\mathrm{KL}(p^\star,\widehat{p})$ is small with high probability. We consider two fundamental settings involving a finite dictionary of densities: (i) model aggregation, where $p^\star$ belongs to the dictionary, and (ii) convex aggregation (mixture density estimation), where $p^\star$ is a mixture of densities from the dictionary. Crucially, we make no assumption on the base densities: their ratios may be unbounded and their supports may differ. For both problems, we identify the best possible high-probability guarantees in terms of the dictionary size, sample size, and confidence level. These optimal rates are higher than those achievable when density ratios are bounded by absolute constants; for mixture density estimation, they match existing lower bounds in the special case of discrete distributions. Our analysis of the mixture case hinges on two new covering results. First, we provide a sharp, distribution-free upper bound on the local Hellinger entropy of the class of mixtures of $M$ distributions. Second, we prove an optimal ratio covering theorem for convex sets: for every convex compact set $K \subset \mathbb{R}_+^d$, there exists a subset $A \subset K$ with at most $2^{O(d)}$ elements such that each element of $K$ is coordinate-wise dominated by an element of $A$ up to a universal constant factor. This geometric result is of independent interest; notably, it yields new cardinality estimates for $\varepsilon$-approximate Pareto sets in multi-objective optimization with convex feasible set.

2602.08083 2026-04-03 stat.AP

A Unified Server Quality Metric for Tennis

Aiwen Li, Amrita Balajee, Harry Wieand, Jonathan Pipping-Gamón

Comments 21 pages, published in Journal of Sports Analytics. Code available at https://github.com/WhartonSABI/server-quality

Abstract

Traditional tennis rating systems (e.g., Elo) summarize overall player strength but do not isolate the independent value of serving. Using point-by-point data from Wimbledon and the U.S. Open, we develop serve-specific player metrics that separate serving quality from return ability and other latent factors. For each tournament and gender, we fit logistic mixed-effects models of point outcomes using serve speed, speed variability, and placement features, with crossed server and returner random intercepts to capture unobserved player strengths. From these models we derive Server Quality Scores (SQS): partially pooled, opponent-adjusted estimates of players' serving impact. In out-of-sample evaluation, SQS aligns more strongly with serve efficiency (the probability of winning points within three shots) than weighted Elo. We further benchmark SQS against task-aligned serve-stat baselines and model ablations, quantifying the incremental value of serve features and partial pooling. Associations with overall serve win percentage are smaller and dataset-dependent, and neither SQS nor weighted Elo consistently dominates that outcome. Overall, SQS is best interpreted as a measure of serve-induced short-point advantage (serve quality plus early-point conversion), complementing holistic ratings with actionable insight for coaching, forecasting, and player evaluation.

2601.21462 2026-04-03 cs.LG stat.ML

Partial Feedback Online Learning

Shihao Shao, Cong Fang, Zhouchen Lin, Dacheng Tao

Comments 40 pages. Fixed some typos in the proof and improved readability

Details
English abstract

We study a new learning protocol, termed partial-feedback online learning, where each instance admits a set of acceptable labels, but the learner observes only one acceptable label per round. We highlight that, while classical version space is widely used for online learnability, it does not directly extend to this setting. We address this obstacle by introducing a collection version space, which maintains sets of hypotheses rather than individual hypotheses. Using this tool, we obtain a tight characterization of learnability in the set-realizable regime. In particular, we define the Partial-Feedback Littlestone dimension (PFLdim) and the Partial-Feedback Measure Shattering dimension (PMSdim), and show that they tightly characterize the minimax regret for deterministic and randomized learners, respectively. We further identify a nested inclusion condition under which deterministic and randomized learnability coincide, resolving an open question of Raman et al. (2024b). Finally, given a hypothesis space H, we show that beyond set realizability, the minimax regret can be linear even when |H|=2, highlighting a barrier beyond set realizability.

2601.18774 2026-04-03 stat.AP math.PR stat.ME

Extreme-Path Benchmarks for Sequential Probability Forecasts

Jonathan Pipping-Gamón, Abraham J. Wyner

Comments Submitted to Annals of Applied Statistics. 17 pages, 3 figures

Details
English abstract

Real-time probability forecasts for binary outcomes are routine in sports, online experimentation, medicine, and finance. Retrospective narratives, however, often hinge on pathwise extremes: for example, a forecast that becomes "90% certain" for an event that ultimately does not occur. Standard pointwise calibration tools do not quantify how frequently such extremes should arise under correct sequential calibration, where the ideal forecast sequence is a bounded martingale that ends at the realized outcome. We derive benchmark distributions for extreme-path functionals conditional on the terminal outcome, emphasizing the peak-on-loss: the largest forecast value attained along realizations that end in failure. In continuous time with continuous paths we obtain an exact closed-form benchmark; in discrete time we prove sharp finite-sample bounds together with an explicit correction decomposition that isolates terminal-step crossings and overshoots. These results yield model-agnostic null targets and one-sided tail probabilities for diagnosing sequential miscalibration from extreme-path behavior. We also develop competitive extensions tailored to win-probability feeds and illustrate the approach using ESPN win-probability series for NFL and NBA regular-season games (2018-2024), finding broad agreement with the benchmark in the NFL and systematic departures in the NBA.

2601.10531 2026-04-03 stat.ML cs.LG math.CO

Coarsening Causal DAG Models

Francisco Madaleno, Pratik Misra, Alex Markham

Comments 27 pages, 5 figures; accepted to the 5th conference on Causal Learning and Reasoning (CLeaR)

Details
English abstract

Directed acyclic graphical (DAG) models are a powerful tool for representing causal relationships among jointly distributed random variables, especially concerning data from across different experimental settings. However, it is not always practical or desirable to estimate a causal model at the granularity of given features in a particular dataset. There is a growing body of research on causal abstraction to address such problems. We contribute to this line of research by (i) providing novel graphical identifiability results for practically-relevant interventional settings, (ii) proposing an efficient, provably consistent algorithm for directly learning abstract causal graphs from interventional data with unknown intervention targets, and (iii) uncovering theoretical insights about the lattice structure of the underlying search space, with connections to the field of causal discovery more generally. As proof of concept, we apply our algorithm on synthetic and real datasets with known ground truths, including measurements from a controlled physical system with interacting light intensity and polarization.

2510.14523 2026-04-03 cs.LG math.ST stat.ML stat.TH

On the Identifiability of Tensor Ranks via Prior Predictive Matching

Eliezer da Silva, Arto Klami, Diego Mesquita, Iñigo Urteaga

Comments Accepted at AISTATS 2026

Details
English abstract

Selecting the latent dimensions (ranks) in tensor factorization is a central challenge that often relies on heuristic methods. This paper introduces a rigorous approach to determine rank identifiability in probabilistic tensor models, based on prior predictive moment matching. We transform a set of moment matching conditions into a log-linear system of equations in terms of marginal moments, prior hyperparameters, and ranks, establishing an equivalence between rank identifiability and the solvability of that system. We apply this framework to four foundational tensor models, demonstrating that the linear structure of the PARAFAC/CP model, the chain structure of the Tensor Train model, and the closed-loop structure of the Tensor Ring model yield solvable systems, making their ranks identifiable. In contrast, we prove that the symmetric topology of the Tucker model leads to an underdetermined system, rendering the ranks unidentifiable by this method. For the identifiable models, we derive explicit closed-form rank estimators based on the moments of observed data only. We empirically validate these estimators and evaluate the robustness of the proposal.

2510.09908 2026-04-03 stat.ML cs.LG

Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation

Hao Yan, Heyan Zhang, Yongyi Guo

Details
English abstract

The rise of large-scale pretrained models has made it feasible to generate predictive or synthetic features at low cost, raising the question of how to incorporate such surrogate predictions into downstream decision-making. We study this problem in the setting of online linear contextual bandits, where contexts may be complex, nonstationary, and only partially observed. In addition to bandit data, we assume access to an auxiliary dataset containing fully observed contexts, which is common in practice since such data are collected without adaptive interventions. We propose PULSE-UCB, an algorithm that leverages pretrained models trained on the auxiliary data to impute missing features during online decision-making. We establish regret guarantees that decompose into a standard bandit term plus an additional component reflecting pretrained model quality. In the i.i.d. context case with Hölder-smooth missing features, PULSE-UCB achieves near-optimal performance, supported by matching lower bounds. Our results quantify how uncertainty in predicted contexts affects decision quality and how much historical data is needed to improve downstream learning.

2510.04318 2026-04-03 stat.ML cs.LG

Adaptive Coverage Policies in Conformal Prediction

Etienne Gauthier, Francis Bach, Michael I. Jordan

Comments Code at: https://github.com/GauthierE/adaptive-coverage-policies

Details
English abstract

Traditional conformal prediction methods construct prediction sets such that the true label falls within the set with a user-specified coverage level. However, poorly chosen coverage levels can result in uninformative predictions, either producing overly conservative sets when the coverage level is too high, or empty sets when it is too low. Moreover, the fixed coverage level cannot adapt to the specific characteristics of each individual example, limiting the flexibility and efficiency of these methods. In this work, we leverage recent advances in e-values and post-hoc conformal inference, which allow the use of data-dependent coverage levels while maintaining valid statistical guarantees. We propose to optimize an adaptive coverage policy by training a neural network using a leave-one-out procedure on the calibration set, allowing the coverage level and the resulting prediction set size to vary with the difficulty of each individual example. We support our approach with theoretical coverage guarantees and demonstrate its practical benefits through a series of experiments.
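
For context, the fixed-coverage baseline described in the first sentence of this abstract can be sketched as standard split conformal prediction. This is a minimal illustration, not the adaptive policy the paper proposes: the function names, the calibration scores, and the candidate labels are all hypothetical, and the nonconformity scores are assumed to be precomputed.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Split-conformal score threshold for a fixed miscoverage level alpha."""
    n = len(cal_scores)
    # Rank of the order statistic that gives finite-sample coverage >= 1 - alpha.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial set is valid.
        return math.inf
    return sorted(cal_scores)[k - 1]

def prediction_set(label_scores, threshold):
    """Keep every candidate label whose nonconformity score is within the threshold."""
    return [label for label, score in label_scores.items() if score <= threshold]

# Hypothetical calibration scores (lower means the true label conformed better).
cal = [0.05, 0.10, 0.20, 0.30, 0.35, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
q = conformal_threshold(cal, alpha=0.25)

# Hypothetical nonconformity scores of each candidate label for one test point.
labels = prediction_set({"cat": 0.15, "dog": 0.70, "fox": 0.95}, q)
```

The single global `alpha` is the quantity the abstract proposes to make example-dependent: a smaller `alpha` raises the threshold and enlarges every set, while a larger one can leave sets empty.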

2510.00463 2026-04-03 stat.ML cs.LG eess.SP stat.ME

On the Adversarial Robustness of Learning-based Conformal Novelty Detection

Daofu Zhang, Mehrdad Pournaderi, Hanne M. Clifford, Yu Xiang, Pramod K. Varshney

Details
English abstract

This paper studies the adversarial robustness of conformal novelty detection. In particular, we focus on two powerful learning-based frameworks that come with finite-sample false discovery rate (FDR) control: one is AdaDetect (by Marandon et al., 2024) that is based on the positive-unlabeled classifier, and the other is a one-class classifier-based approach (by Bates et al., 2023). While they provide rigorous statistical guarantees under benign conditions, their behavior under adversarial perturbations remains underexplored. We first formulate an oracle attack setup, under the AdaDetect formulation, that quantifies the worst-case degradation of FDR, deriving an upper bound that characterizes the statistical cost of attacks. This idealized formulation directly motivates a practical and effective attack scheme that only requires query access to the output labels of both frameworks. Coupling these formulations with two popular and complementary black-box adversarial algorithms, we systematically evaluate the vulnerability of both frameworks on synthetic and real-world datasets. Our results show that adversarial perturbations can significantly increase the FDR while maintaining high detection power, exposing fundamental limitations of current error-controlled novelty detection methods and motivating the development of more robust alternatives.

2509.15379 2026-04-03 stat.AP

A Single Index Approach to Integrated Species Distribution Modeling for Fisheries Abundance Data

Quan Vu, Francis K. C. Hui, A. H. Welsh, Samuel Muller, Eva Cantoni, Christopher R. Haak

Details
English abstract

In fisheries ecology, species abundance data are often collected by multiple surveys, each with unique characteristics. This article is motivated by a dataset of Atlantic sea scallop abundance records along the northeast coast of the United States, collected from two bottom trawl surveys which cover a larger spatial domain but have low catch efficiency, and a dredge survey which is more efficient but more bounded in domain. Over the past decade, integrated species distribution models (ISDMs) that include common environmental effects along with correlated survey-specific spatial fields have been used to incorporate information from multiple surveys. While flexible, ISDMs can be susceptible to overfitting, which can complicate interpretability of the shared environmental effects, and potentially lead to poor predictive performance. To overcome these drawbacks, we introduce a novel single index ISDM, built from a single index (with spatial random effects) that represents a latent measure of the true species distribution, and survey-specific catch efficiency functions which map the single index to the survey-specific expected catch. In this article, these functions are constructed via logistic functions or semiparametric spline-based functions. Simulations and application to the motivating sea scallop abundance data demonstrate that the proposed single index ISDM offers more meaningful interpretations of the environmental effects and survey catch efficiency differences, while achieving predictive performance similar to or better than that of existing ISDMs.

2509.12533 2026-04-03 stat.AP stat.ME

Transporting Predictions via Double Machine Learning: Predicting Partially Unobserved Students' Outcomes

Falco J. Bargagli-Stoffi, Emma Landry, Kevin P. Josey, Kenneth De Beckker, Joana E. Maldonado, Kristof De Witte

Comments arXiv admin note: substantial text overlap with arXiv:2102.04382

Details
English abstract

Educational policymakers often lack data on student outcomes where standardized tests were not administered. Machine learning can predict unobserved outcomes in target populations using source population data. However, covariate distribution differences between populations reduce model transportability, potentially decreasing predictive accuracy and introducing bias. We propose using double machine learning for covariate-shift weighted models. First, we estimate overlap scores, i.e., the probability that an observation belongs to the source dataset given its covariates. Second, balancing weights, defined as density ratios of target-to-source membership probabilities, reweight individual observations' contributions to the loss function in target outcome prediction models. This downweights source observations less similar to the target population, allowing predictions to rely more on observations with greater overlap. Consequently, predictions become more transportable under covariate shift. We illustrate this framework using student standardized financial literacy scores (FLS) data. Using Bayesian Additive Regression Trees (BART), we predict missing FLS. We find minimal predictive performance differences between weighted and unweighted models, suggesting limited covariate shift in our setting. Nonetheless, our approach provides a principled framework for addressing covariate shift and is broadly applicable to predictive modeling in social and health sciences, where source-target population differences are common.
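
The two-step weighting described in this abstract can be illustrated with a small sketch. It assumes the overlap scores (source-membership probabilities) have already been estimated by some classifier. The function names, the clipping constant, and the toy numbers are hypothetical, and a weighted squared-error loss stands in for whatever loss the paper's BART models use.

```python
def balancing_weights(p_source, clip=1e-6):
    """Turn overlap scores P(source | x) into target-to-source odds weights."""
    weights = []
    for p in p_source:
        # Clip away from 0 and 1 so the ratio stays finite (an assumed safeguard).
        p = min(max(p, clip), 1.0 - clip)
        weights.append((1.0 - p) / p)
    return weights

def weighted_mse(y_true, y_pred, weights):
    """Squared-error loss in which each source observation counts by its weight."""
    total = sum(weights)
    return sum(w * (yt - yp) ** 2 for w, yt, yp in zip(weights, y_true, y_pred)) / total

# Hypothetical overlap scores for three source observations.
p_source = [0.5, 0.8, 0.2]
w = balancing_weights(p_source)

# The third observation looks most like the target population, so its error dominates.
loss = weighted_mse([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], w)
```

Because the loss divides by the sum of the weights, any constant factor in the density ratio (such as the ratio of sample sizes) cancels, which is why the sketch omits it.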

2509.04718 2026-04-03 stat.ME physics.data-an q-bio.QM

When correcting for regression to the mean is worse than no correction at all

José F. Fontanari, Mauro Santos

Details
Journal ref
Am. Nat. (2026)
English abstract

The ubiquitous regression to the mean (RTM) effect complicates statistical inference regarding the relationship between baseline levels of a biological variable and its subsequent change. We demonstrate that common RTM correction methods are problematic: the Berry et al. method, popularized by Kelly & Price in The American Naturalist, is unreliable for hypothesis testing or effect-size estimation, leading to systematic bias and inflated error rates. Conversely, while the Blomqvist method is theoretically unbiased, its high sampling variance limits its practical utility in small-to-moderate datasets. Using a structural linear model, we show that the most robust approach to navigating RTM is not to correct the data, but to evaluate the uncorrected crude slope against a structural null expectation derived from measurement repeatability, i.e., the proportion of total variance attributable to true individual differences. We illustrate this approach using empirical data from studies on lizard thermal physiology and bird telomere dynamics. Ultimately, we argue that any conclusion regarding a differential treatment effect is statistically unfounded without a clear understanding of the experiment's repeatability.
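
A small simulation can show why the crude slope is not zero even when nothing changes. The sketch below assumes a classical measurement-error model (each measurement is a fixed true value plus independent noise), which may be simpler than the paper's structural model. Under that assumption the regression of change on baseline has expected slope equal to repeatability minus one. All names and parameter values are hypothetical.

```python
import random

def crude_slope(baseline, change):
    """Ordinary least-squares slope of change regressed on baseline."""
    n = len(baseline)
    mean_x = sum(baseline) / n
    mean_y = sum(change) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(baseline, change))
    sxx = sum((x - mean_x) ** 2 for x in baseline)
    return sxy / sxx

def simulate_null_slope(n, var_true, var_error, seed=0):
    """Simulate two noisy measurements of an unchanged trait and return the crude slope."""
    rng = random.Random(seed)
    sd_true, sd_error = var_true ** 0.5, var_error ** 0.5
    baseline, change = [], []
    for _ in range(n):
        true_value = rng.gauss(0.0, sd_true)
        first = true_value + rng.gauss(0.0, sd_error)
        second = true_value + rng.gauss(0.0, sd_error)  # no true change at all
        baseline.append(first)
        change.append(second - first)
    return crude_slope(baseline, change)

# Repeatability: share of total variance due to true individual differences.
repeatability = 3.0 / (3.0 + 1.0)
expected_null_slope = repeatability - 1.0
slope = simulate_null_slope(50_000, var_true=3.0, var_error=1.0)
```

In this toy setting the simulated slope sits close to `expected_null_slope` rather than zero, so an observed crude slope would be compared against that value, not against zero.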

2508.20755 2026-04-03 cs.LG cs.AI stat.ML

Provable Benefits of In-Tool Learning for Large Language Models

Sam Houliston, Ambroise Odonnat, Charles Arnal, Vivien Cabannes

Details
Journal ref
ICLR 2026 MemAgents Workshop
English abstract

Tool-augmented language models, equipped with retrieval, memory, or external APIs, are reshaping AI, yet their theoretical advantages remain underexplored. In this paper, we address this question by demonstrating the benefits of in-tool learning (external retrieval) over in-weight learning (memorization) for factual recall. We show that the number of facts a model can memorize solely in its weights is fundamentally limited by its parameter count. In contrast, we prove that tool-use enables unbounded factual recall via a simple and efficient circuit construction. These results are validated in controlled experiments, where tool-using models consistently outperform memorizing ones. We further show that for pretrained large language models, teaching tool-use and general rules is more effective than finetuning facts into memory. Our work provides both a theoretical and empirical foundation, establishing why tool-augmented workflows are not just practical, but provably more scalable.

2508.14285 2026-04-03 cs.LG cs.AI stat.ML

Meta-Learning at Scale for Large Language Models via Low-Rank Amortized Bayesian Meta-Learning

Liyi Zhang, Jake Snell, Thomas L. Griffiths

Comments 17 pages, 2 figures

Details
English abstract

Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, when a problem requires incorporating information from multiple datasets, as in few-shot learning, generalization across datasets can be limited, driving up training costs. As a consequence, other approaches such as in-context learning are typically used in this setting. To address this challenge, we introduce an efficient method for adapting the weights of LLMs to multiple distributions, Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting this approach to LLMs by reframing where local and global variables are defined in LoRA and using a new hyperparameter to balance reconstruction accuracy and the fidelity of task-specific parameters to the global ones. ABMLL supports effective generalization across datasets and scales to large models such as Llama3-8B and Qwen2-7B, outperforming existing methods on the CrossFit and Unified-QA datasets in terms of both accuracy and expected calibration error. We show that meta-learning can also be combined with in-context learning, resulting in further improvements in both these datasets and legal and chemistry applications.

2507.20598 2026-04-03 stat.ME q-bio.GN stat.AP

Nullstrap-DE: A General Framework for Calibrating FDR and Preserving Power in DE Methods, with Applications to DESeq2 and edgeR

Chenxin Jiang, Changhu Wang, Jingyi Jessica Li

Details
English abstract

Differential expression (DE) analysis is a key task in RNA-seq studies, aiming to identify genes with expression differences across conditions. A central challenge is balancing false discovery rate (FDR) control with statistical power. Parametric methods such as DESeq2 and edgeR achieve high power by modeling gene-level counts using negative binomial distributions and applying empirical Bayes shrinkage. However, these methods may suffer from FDR inflation when model assumptions are mildly violated, especially in large-sample settings. In contrast, non-parametric tests like Wilcoxon offer more robust FDR control but often lack power and do not support covariate adjustment. We propose Nullstrap-DE, a general add-on framework that combines the strengths of both approaches. Designed to augment tools like DESeq2 and edgeR, Nullstrap-DE calibrates FDR while preserving power, without modifying the original method's implementation. It generates synthetic null data from a model fitted under the gene-specific null (no DE), applies the same test statistic to both observed and synthetic data, and derives a threshold that satisfies the target FDR level. We show theoretically that Nullstrap-DE asymptotically controls FDR while maintaining power consistency. Simulations confirm that it achieves reliable FDR control and high power across diverse settings, where DESeq2, edgeR, or Wilcoxon often show inflated FDR or low power. Applications to real datasets show that Nullstrap-DE enhances statistical rigor and identifies biologically meaningful genes.

2507.17190 2026-04-03 stat.ME

Model-robust standardization in stepped wedge designs

Xi Fang, Xueqi Wang, Patrick J. Heagerty, Bingkai Wang, Fan Li

Details
English abstract

Stepped-wedge cluster-randomized trials (SW-CRTs) are widely used in healthcare and implementation science, providing an ethical advantage by ensuring all clusters eventually receive the intervention. The staggered rollout of treatment introduces complexities in defining and estimating treatment effect estimands, particularly under informative sizes. Traditional model-based methods, including generalized estimating equations (GEE) and linear mixed models (LMM), produce estimates that depend on implicit weighting schemes and parametric assumptions, leading to bias for different types of estimands in the presence of informative sizes. While recent methods have attempted to provide robust estimation in SW-CRTs, they impose restrictive modeling assumptions or lack a general framework for consistently estimating multiple estimands under informative sizes. In this article, we propose a model-robust standardization framework for SW-CRTs that generalizes existing methods from parallel-arm CRTs. We define causal estimands including horizontal-individual, horizontal-cluster, vertical-individual, and vertical-cluster average treatment effects under a super population framework and introduce an augmented standardization estimator that standardizes parametric and semiparametric working models while maintaining robustness to informative cluster size under arbitrary misspecification. We evaluate the finite-sample properties of our proposed estimators through extensive simulation studies, assessing their performance under various SW-CRT designs. Finally, we illustrate the practical application of model-robust estimation through a reanalysis of two real-world SW-CRTs.

2507.11816 2026-04-03 stat.ME math.ST stat.TH

A Relativity-Based Framework for Statistical Testing Guided by the Independence of Ancillary Statistics: Methodology and Nonparametric Illustrations

Albert Vexler, Douglas Landsittel

Details
Journal ref
TEST (2026)
English abstract

This paper introduces a decision-theoretic framework for constructing and evaluating test statistics based on their relationship with ancillary statistics, that is, quantities whose distributions remain fixed under the null and alternative hypotheses. Rather than focusing solely on maximizing discriminatory power, the proposed approach emphasizes reducing dependence between a test statistic and relevant ancillary structures. We show that minimizing such dependence can yield most powerful (MP) procedures. A Basu-type independence result is established, and we demonstrate that certain MP statistics also characterize the underlying data distribution. The methodology is illustrated through modifications of classical nonparametric tests, including the Shapiro-Wilk, Anderson-Darling, and Kolmogorov-Smirnov tests, as well as a test for the center of symmetry. Simulation studies highlight the power and robustness of the proposed procedures. The framework is computationally simple and offers a principled strategy for improving statistical testing.

2506.23849 2026-04-03 stat.ME stat.AP

Developing a Synthetic Socio-Economic Index through Autoencoders: Evidence from Florence's Suburban Areas

Giulio Grossi, Emilia Rocco

Details
English abstract

The interest in summarizing complex and multidimensional phenomena often related to one or more specific sectors (social, economic, environmental, political, etc.) to make them easily understandable even to non-experts is far from waning. A widely adopted approach for this purpose is the use of composite indices, statistical measures that aggregate multiple indicators into a single comprehensive measure. In this paper, we present a novel methodology called AutoSynth, designed to condense potentially extensive datasets into a single synthetic index or a hierarchy of such indices. AutoSynth leverages an Autoencoder, a neural network technique, to represent a matrix of features in a lower-dimensional space. Although this approach is not limited to the creation of a particular composite index and can be applied broadly across various sectors, the motivation behind this work arises from a real-world need. Specifically, we aim to assess the vulnerability of the Italian city of Florence at the suburban level across three dimensions: economic, demographic, and social. To demonstrate the methodology's effectiveness, it is also applied to estimate a vulnerability index using a rich, publicly available dataset on U.S. counties and validated through a simulation study.

2506.23396 2026-04-03 stat.ML cs.LG

AICO: Feature Significance Tests for Supervised Learning

Kay Giesecke, Enguerrand Horel, Chartsiri Jirachotkulthorn

Details
English abstract

Machine learning is central to modern science, industry, and policy, yet its predictive power often comes at the cost of transparency: we rarely know which input features truly drive a model's predictions. Without such understanding, researchers cannot draw reliable conclusions, practitioners cannot ensure fairness or accountability, and policymakers cannot trust or govern model-based decisions. Existing tools for assessing feature influence are limited; most lack statistical guarantees, and many require costly retraining or surrogate modeling, making them impractical for large modern models. We introduce AICO, a broadly applicable framework that turns model interpretability into an efficient statistical exercise. AICO tests whether each feature genuinely improves predictive performance by masking its information and measuring the resulting change. The method provides exact, finite-sample feature p-values and confidence intervals for feature importance through a simple, non-asymptotic hypothesis testing procedure. It requires no retraining, surrogate modeling, or distributional assumptions, making it feasible for large-scale algorithms. In both controlled experiments and real applications, from credit scoring to mortgage-behavior prediction, AICO reliably identifies the variables that drive model behavior, providing a scalable and statistically principled path toward transparent and trustworthy machine learning.

2506.12553 2026-04-03 cs.LG cs.CR stat.ML

Beyond Laplace and Gaussian: Exploring the Generalized Gaussian Mechanism for Private Machine Learning

Roy Rinberg, Ilia Shumailov, Vikrant Singhal, Rachel Cummings, Nicolas Papernot

Details
English abstract

Differential privacy (DP) is obtained by randomizing a data analysis algorithm, which necessarily introduces a tradeoff between its utility and privacy. Many DP mechanisms are built upon one of two underlying tools: Laplace and Gaussian additive noise mechanisms. We expand the search space of algorithms by investigating the Generalized Gaussian (GG) mechanism, which samples the additive noise term $x$ with probability proportional to $e^{-\left(\frac{|x|}{\sigma}\right)^{\beta}}$ for some $\beta \geq 1$ (denoted $GG_{\beta,\sigma}(f,D)$). The Laplace and Gaussian mechanisms are special cases of GG for $\beta=1$ and $\beta=2$, respectively. We prove that the full GG family satisfies differential privacy and extend the PRV accountant to support privacy loss computation for these mechanisms. We then instantiate the GG mechanism in two canonical private learning pipelines, PATE and DP-SGD. Empirically, we explore PATE and DP-SGD with the GG mechanism across the computationally feasible values of $\beta$: $\beta \in [1,2]$ for DP-SGD and $\beta \in [1,4]$ for PATE. For both mechanisms, we find that $\beta=2$ (Gaussian) performs as well as or better than other values in their computationally tractable domains. This provides justification for the widespread adoption of the Gaussian mechanism in DP learning.
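
The noise distribution named in this abstract can be sampled with a standard gamma transformation. The sketch below is only an illustration of the additive-noise step: it does not calibrate the scale to any privacy budget and does not implement the PRV accountant, and the function names are hypothetical. It relies on the fact that if G follows a Gamma distribution with shape 1/beta and unit scale, then sigma times G to the power 1/beta, with a random sign, has density proportional to exp(-(|x|/sigma)^beta).

```python
import math
import random

def sample_gg_noise(beta, sigma, rng):
    """Draw x with density proportional to exp(-(|x| / sigma) ** beta)."""
    # Gamma(shape=1/beta, scale=1) raised to the power 1/beta gives |x| / sigma.
    g = rng.gammavariate(1.0 / beta, 1.0)
    magnitude = sigma * g ** (1.0 / beta)
    # A fair coin flip makes the noise symmetric around zero.
    return magnitude if rng.random() < 0.5 else -magnitude

def gg_mechanism(query_value, beta, sigma, rng):
    """Release a numeric query value with additive generalized Gaussian noise."""
    return query_value + sample_gg_noise(beta, sigma, rng)

def gg_variance(beta, sigma):
    """Variance of the noise: sigma^2 * Gamma(3/beta) / Gamma(1/beta)."""
    return sigma ** 2 * math.gamma(3.0 / beta) / math.gamma(1.0 / beta)

rng = random.Random(1)
noisy_count = gg_mechanism(100.0, beta=2.0, sigma=1.5, rng=rng)
```

With this parameterization, `beta=1` gives Laplace noise with scale `sigma`, and `beta=2` gives Gaussian noise with variance `sigma**2 / 2`, which matches the two special cases mentioned in the abstract.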

2504.12214 2026-04-03 stat.ME

Bayesian random-effects meta-analysis of aggregate data on clinical events

Christian Röver, Qiong Wu, Anja Loos, Tim Friede

Comments 23 pages, 8 figures

Details
Journal ref
Statistics in Medicine, 45(8-9):e70508, 2026
English abstract

To investigate intervention effects on rare events, meta-analysis techniques are commonly applied in order to assess the accumulated evidence. When it comes to adverse effects in clinical trials, these are often most adequately handled using survival methods. A common-effect model that is able to process data in commonly quoted formats in terms of hazard ratios has been proposed for this purpose. In order to accommodate potential heterogeneity between studies, we have extended the model by Holzhauer to a random-effects approach. The Bayesian model is described in detail, and applications to realistic data sets are discussed along with sensitivity analyses and Monte Carlo simulations to support the conclusions.

2503.08881 2026-04-03 stat.ME

Bayesian local clustering of functional data via semi-Markovian random partitions

Giovanni Toto, Antonio Canale

Details
English abstract

We introduce a Bayesian framework for indirect local clustering of functional data, leveraging B-spline basis expansions and a novel dependent random partition model. By exploiting the local support properties of B-splines, our approach allows partially coincident functional behaviors, achieved when shared basis coefficients span sufficiently contiguous regions. This is accomplished through a cutting-edge dependent random partition model that enforces semi-Markovian dependence across a sequence of partitions. By matching the order of the B-spline basis with the semi-Markovian dependence structure, the proposed model serves as a highly flexible prior, enabling efficient modeling of localized features in functional data. Furthermore, we extend the utility of the dependent random partition model beyond functional data, demonstrating its applicability to a broad class of problems where sequences of dependent partitions are central, and standard Markovian assumptions prove overly restrictive. Empirical illustrations, including analyses of simulated data and tide level measurements from the Venice Lagoon, showcase the effectiveness and versatility of the proposed methodology.

2502.10600 2026-04-03 stat.ML cs.LG cs.NA math.NA

Weighted quantization using MMD: From mean field to mean shift via gradient flows

Ayoub Belhadji, Daniel Sharp, Youssef Marzouk

Details
Journal ref
AISTATS 2026
English abstract

Approximating a probability distribution using a set of particles is a fundamental problem in machine learning and statistics, with applications including clustering and quantization. Formally, we seek a weighted mixture of Dirac measures that best approximates the target distribution. While much existing work relies on the Wasserstein distance to quantify approximation errors, maximum mean discrepancy (MMD) has received comparatively less attention, especially when allowing for variable particle weights. We argue that a Wasserstein-Fisher-Rao gradient flow is well-suited for designing quantizations optimal under MMD. We show that a system of interacting particles satisfying a set of ODEs discretizes this flow. We further derive a new fixed-point algorithm called mean shift interacting particles (MSIP). We show that MSIP extends the classical mean shift algorithm, widely used for identifying modes in kernel density estimators. Moreover, we show that MSIP can be interpreted as preconditioned gradient descent and that it acts as a relaxation of Lloyd's algorithm for clustering. Our unification of gradient flows, mean shift, and MMD-optimal quantization yields algorithms that are more robust than state-of-the-art methods, as demonstrated via high-dimensional and multi-modal numerical experiments.
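
For readers unfamiliar with the classical mean shift algorithm that this abstract says MSIP extends, the following one-dimensional sketch shows its fixed-point iteration with a Gaussian kernel. It is the textbook single-point version, not the interacting-particle method of the paper, and the data, bandwidth, and function names are hypothetical.

```python
import math

def mean_shift_step(x, data, bandwidth):
    """Move x to the kernel-weighted average of the data (one fixed-point update)."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data]
    return sum(w * xi for w, xi in zip(weights, data)) / sum(weights)

def mean_shift(x0, data, bandwidth, tol=1e-8, max_iter=500):
    """Iterate the update until x stops moving; the limit is a mode of the KDE."""
    x = x0
    for _ in range(max_iter):
        x_new = mean_shift_step(x, data, bandwidth)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Two well-separated clusters centered at 0 and 5.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
mode_left = mean_shift(0.7, data, bandwidth=0.5)
mode_right = mean_shift(4.0, data, bandwidth=0.5)
```

Each starting point climbs to the nearest mode of the kernel density estimate, so the two runs end near 0 and near 5 respectively.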

2412.10683 2026-04-03 stat.ME stat.ML

Adaptive Nonparametric Perturbations of Parametric Models with Generalized Bayes

Bohan Wu, Eli N. Weinstein, Sohrab Salehi, Yixin Wang, David M. Blei

Details
English abstract

Parametric Bayesian modeling offers a powerful and flexible toolbox for machine learning. Yet the model, however detailed, may still be wrong, and this can make inferences untrustworthy. In this paper we introduce a new class of semiparametric corrections for parametric Bayesian models, when the target of inference is a functional of the true data distribution. Our starting point is a fully Bayesian modeling approach, which explicitly accounts for the possibility that the parametric model is wrong. Asymptotic analysis shows that this approach is both robust to model misspecification and data efficient, achieving fast convergence when the parametric model is close to true. However, the fully Bayesian approach is limited in its practical usefulness by the challenges of conducting inference and computing a Bayes factor for a nonparametric model. We therefore propose a novel model correction based on generalized Bayes, which entirely avoids the need to compute a nonparametric Bayes factor, but preserves the robustness and efficiency of the fully Bayesian approach. We demonstrate our method by estimating causal effects of gene expression from single cell RNA sequencing data. Overall, we offer a new efficient approach to robust Bayesian inference with parametric models.

2412.09304 2026-04-03 stat.ME

Nonparametric estimation of the total treatment effect with multiple outcomes in the presence of terminal events

Jessica Gronsbell, Zachary R. McCaw, Isabelle-Emmanuella Nogues, Xiangshan Kong, Tianxi Cai, Lu Tian, LJ Wei

Abstract

As standards of care advance, patients are living longer and once-fatal diseases are becoming manageable. Clinical trials increasingly focus on reducing disease burden, which can be quantified by the timing and occurrence of multiple non-fatal clinical events. Most existing methods for the analysis of multiple event-time data require stringent modeling assumptions that can be difficult to verify empirically, leading to treatment efficacy estimates that forgo interpretability when the underlying assumptions are not met. Moreover, many methods do not appropriately account for informative terminal events, such as premature treatment discontinuation or death, which prevent the occurrence of subsequent events. To address these limitations, we derive and validate estimation and inference procedures for the area under the mean cumulative function (AUMCF), an extension of the restricted mean survival time to the multiple event-time setting. The AUMCF is clinically interpretable, properly accounts for terminal competing risks, and can be estimated nonparametrically. To enable covariate adjustment, we also develop an augmentation estimator that provides efficiency at least equaling, and often exceeding, the unadjusted estimator. The utility and interpretability of the AUMCF are illustrated with extensive simulation studies and through an analysis of multiple heart-failure-related endpoints using data from the Beta-Blocker Evaluation of Survival Trial (BEST). Our open-source R package MCC makes conducting AUMCF analyses straightforward and accessible.

2411.18942 2026-04-03 math.ST math.DG stat.TH

Robust boundary detection and density estimation using doubly stochastic scaling of the Gaussian kernel

Dhruv Kohli, Jesse He, Chester Holtz, Alexander Cloninger, Gal Mishne

Abstract

This paper addresses the problem of detecting boundary points and estimating the sampling density of a dataset derived from a compact manifold with boundary, potentially in the presence of noise. We extend recent advances in doubly stochastic scaling of the Gaussian heat kernel via Sinkhorn iterations to this setting. Our main contributions are: (a) deriving a characterization of the scaling factors for manifolds with boundary, (b) developing a boundary direction estimator aimed at identifying boundary points followed by a boundary-corrected kernel density estimator based on doubly stochastic kernel and local principal component analysis, and (c) demonstrating through simulations that the resulting estimates of the boundary points and the sampling density outperform the standard Gaussian kernel-based approach, particularly under noisy conditions.
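The doubly stochastic scaling of a Gaussian kernel mentioned above can be sketched with a symmetric Sinkhorn-type iteration. This is only the generic scaling step on a point cloud, under the assumption of a fixed bandwidth and a damped fixed-point update. It does not implement the paper's boundary direction estimator or its boundary-corrected density estimator, and the name `doubly_stochastic_scaling` is hypothetical.

```python
import numpy as np

def doubly_stochastic_scaling(points, bandwidth=1.0, n_iter=1000, tol=1e-12):
    """Scale a Gaussian kernel K so that W = diag(d) K diag(d) has unit row sums.

    Uses the damped symmetric Sinkhorn update d <- sqrt(d / (K d)), whose
    fixed point satisfies d_i * (K d)_i = 1 for every i. Illustrative sketch
    with an assumed fixed bandwidth.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise squared Euclidean distances and the Gaussian kernel matrix.
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    K = np.exp(-sq_dist / (2.0 * bandwidth**2))

    d = np.ones(len(points))
    for _ in range(n_iter):
        d_new = np.sqrt(d / (K @ d))
        converged = np.max(np.abs(d_new - d)) < tol
        d = d_new
        if converged:
            break

    # Symmetric, doubly stochastic kernel built from the scaling factors.
    W = d[:, None] * K * d[None, :]
    return d, W

rng = np.random.default_rng(0)
cloud = rng.normal(size=(20, 2))
factors, W = doubly_stochastic_scaling(cloud, bandwidth=1.0)
```

Because `K` is symmetric, a single vector of scaling factors `d` suffices, and unit row sums imply unit column sums.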

2411.08778 2026-04-03 stat.ME

Causal-DRF: Conditional Kernel Treatment Effect Estimation using Distributional Random Forest

Jeffrey Näf, Junhyung Park, Herbert Susmann

Abstract

The conditional average treatment effect (CATE) is a commonly targeted statistical parameter for measuring the effect of a treatment conditional on covariates. However, the CATE will fail to capture effects of treatments beyond differences in conditional expectations. Inspired by causal forests for CATE estimation, we develop a forest-based method to estimate the conditional kernel treatment effect (CKTE), based on the recently introduced Distributional Random Forest (DRF) algorithm. Adapting the splitting criterion of DRF, we show how one forest fit can be used to obtain a consistent and asymptotically normal estimator of the CKTE, as well as an approximation of its sampling distribution. This allows one to study the difference in distribution between the control and treatment groups and thus yields a more comprehensive understanding of the treatment effect.

2406.02402 2026-04-03 math.OC cs.GT stat.ML

Online Fair Allocation of Perishable Resources

Siddhartha Banerjee, Chamsi Hssaine, Sean R. Sinclair

Comments 57 pages, 10 figures

Abstract

We consider a practically motivated variant of the canonical online fair allocation problem: a decision-maker has a budget of perishable resources to allocate over a fixed number of rounds. Each round sees a random number of arrivals, and the decision-maker must commit to an allocation for these individuals before moving on to the next round. The goal is to construct a sequence of allocations that is envy-free and efficient. Our work makes two important contributions toward this problem: we first derive strong lower bounds on the optimal envy-efficiency trade-off, demonstrating that a decision-maker is fundamentally limited in what she can hope to achieve relative to the no-perishing setting; we then design an algorithm achieving these lower bounds which takes as input (i) a prediction of the perishing order, and (ii) a desired bound on envy. Given the remaining budget in each period, the algorithm uses forecasts of future demand and perishing to adaptively choose one of two carefully constructed guardrail quantities. We demonstrate our algorithm's strong numerical performance, and the inefficacy of state-of-the-art, perishing-agnostic algorithms, on simulations calibrated to a real-world dataset.

2405.20957 2026-04-03 stat.ME stat.AP

Causal-ICM: A Data Fusion Framework For Heterogeneous Treatment Effect Estimation With Multi-Task Gaussian Processes

Evangelos Dimitriou, Edwin Fong, Jens Magelund Tarp, Karla Diaz-Ordaz, Brieuc Lehmann

Comments Accepted at the 5th Conference on Causal Learning and Reasoning (CLeaR 2026)

Abstract

Bridging the gap between internal and external validity is crucial for heterogeneous treatment effect estimation. Randomised controlled trials (RCTs), favoured for their internal validity due to randomisation, often encounter challenges in generalising findings due to strict eligibility criteria. Observational studies, on the other hand, may provide stronger external validity through larger and more representative samples but can suffer from compromised internal validity due to unmeasured confounding. Motivated by these complementary characteristics, we propose a novel Bayesian nonparametric approach, Causal-ICM, leveraging multi-task Gaussian processes to integrate data from both RCTs and observational studies. In particular, we introduce a parameter that controls the degree of borrowing between the datasets and prevents the observational dataset from dominating the estimation. We propose a data-adaptive procedure for choosing the optimal value of the parameter. Causal-ICM outperforms other data fusion methods in point estimation across the covariate support of the observational study and provides principled uncertainty quantification for the estimated treatment effects. We demonstrate the robust performance of Causal-ICM in diverse scenarios through multiple simulation studies and a real-world study.

2206.08817 2026-04-03 stat.ME

Species Distribution Modeling with Expert Elicitation and Bayesian Calibration

Karel Kaurila, Sanna Kuningas, Antti Lappalainen, Jarno Vanhatalo

Comments Article: 20 pages, 4 figures. Supplement: 10 pages, 8 figures

Abstract

Species distribution models (SDMs) are key tools in ecology, conservation and management of natural resources. They are commonly trained on scientific survey data but, since surveys are expensive, there is a need for complementary sources of information to train them. To this end, several authors have proposed to use expert elicitation since local citizen and substance area experts can hold valuable information on species distributions. Expert knowledge has been incorporated within SDMs, for example, through informative priors. However, existing approaches pose challenges related to assessment of the reliability of the experts. Since expert knowledge is inherently subjective and prone to biases, we should optimally calibrate experts' assessments and make inference on their reliability. Moreover, demonstrated examples of improved species distribution predictions using expert elicitation compared to using only survey data are also few. In this work, we propose a novel approach to use expert knowledge on species distribution within SDMs and demonstrate that it leads to significantly better predictions. First, we propose an expert elicitation process where experts summarize their belief on a species' occurrence probability with maps. Second, we collect survey data to calibrate the expert assessments. Third, we propose a hierarchical Bayesian model that combines the two information sources and can be used to make predictions over the study area. We apply our methods to study the distribution of spring spawning pikeperch larvae in a coastal area of the Gulf of Finland. According to our results, the expert information significantly improves species distribution predictions compared to predictions conditioned on survey data only. However, experts' reliability also varies considerably, and even generally reliable experts had spatially structured biases in their assessments.

1812.05741 2026-04-03 stat.ME

Posterior Projection for Inference in Constrained Spaces

Lachlan Astfalck, Deborshee Sen, Sayan Patra, Edward Cripps, David Dunson

Abstract

Estimation of parameters that obey specific constraints is crucial in statistics and machine learning; for example, when parameters are required to satisfy boundedness, monotonicity, or linear inequalities. Traditional approaches impose these constraints via constraint-specific transformations or sampling approaches, or by truncating the posterior distribution. Such methods often result in computational challenges, limited flexibility, and a lack of generality. We propose a generalized framework for constrained Bayesian inference by projecting the unconstrained posterior distribution into the space of the parameter constraints, providing a computationally efficient and easily implementable solution for a large class of problems. We rigorously establish the theoretical foundations of the projected posterior distribution and provide asymptotic results for posterior consistency, posterior contraction, and optimal coverage properties. Our methodology is validated through both theoretical arguments and practical applications.
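The idea of projecting an unconstrained posterior into a constrained space can be illustrated with a minimal sketch for the boundedness case. It assumes a Gaussian unconstrained posterior and a Euclidean projection onto box constraints, for which the projection is coordinate-wise clipping. Both choices, and the names `project_onto_box` and `projected_posterior_draws`, are assumptions made for illustration; the paper's framework is stated to cover a larger class of constraints.

```python
import numpy as np

def project_onto_box(draws, lower, upper):
    """Euclidean projection of each draw onto the box [lower, upper]."""
    return np.clip(draws, lower, upper)

def projected_posterior_draws(post_mean, post_cov, lower, upper,
                              n_draws=1000, seed=0):
    """Sample an unconstrained Gaussian posterior, then project every draw.

    The Gaussian form of the unconstrained posterior is an assumption of
    this sketch; any sampler for the unconstrained posterior could be used.
    """
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(post_mean, post_cov, size=n_draws)
    return project_onto_box(draws, lower, upper)

# Two parameters constrained to lie in [0, 1]; the first posterior mean
# sits outside the constraint set, so many draws land on the boundary.
projected = projected_posterior_draws(
    post_mean=np.array([-0.2, 0.5]),
    post_cov=0.05 * np.eye(2),
    lower=0.0,
    upper=1.0,
)
```

Sampling and projection are separate steps here, so the same unconstrained draws can be reused with a different constraint set by swapping the projection function.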