arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.19740 2026-04-22 cs.LG cs.AI cs.CV stat.ML

Generalization at the Edge of Stability

Mario Tuci, Caner Korkmaz, Umut Şimşekli, Tolga Birdal

Comments Project page: https://circle-group.github.io/research/GATES

Abstract

Training modern neural networks often relies on large learning rates, operating at the edge of stability, where the optimization dynamics exhibit oscillatory and chaotic behavior. Empirically, this regime often yields improved generalization performance, yet the underlying mechanism remains poorly understood. In this work, we represent stochastic optimizers as random dynamical systems, which often converge to a fractal attractor set (rather than a point) with a smaller intrinsic dimension. Building on this connection and inspired by Lyapunov dimension theory, we introduce a novel notion of dimension, coined the `sharpness dimension', and prove a generalization bound based on this dimension. Our results show that generalization in the chaotic regime depends on the complete Hessian spectrum and the structure of its partial determinants, highlighting a complexity that cannot be captured by the trace or spectral norm considered in prior work. Experiments across various MLPs and transformers validate our theory while also providing new insights into the recently observed phenomenon of grokking.

2604.19712 2026-04-22 cs.LG cond-mat.dis-nn cs.IT math.IT math.PR stat.ML

Ultrametric OGP - parametric RDT \emph{symmetric} binary perceptron connection

Mihailo Stojnic

Abstract

In [97,99,100], an fl-RDT framework is introduced to characterize \emph{statistical computational gaps} (SCGs). Studying \emph{symmetric binary perceptrons} (SBPs), [100] obtained an \emph{algorithmic} threshold estimate $α_a\approx α_c^{(7)}\approx 1.6093$ at the 7th lifting level (for $κ=1$ margin), closely approaching the $1.58$ local entropy (LE) prediction [18]. In this paper, we further connect parametric RDT to overlap gap properties (OGPs), another key geometric feature of the solution space. Specifically, for any positive integer $s$, we consider $s$-level ultrametric OGPs ($ult_s$-OGPs) and rigorously upper-bound the associated constraint densities $α_{ult_s}$. To achieve this, we develop an analytical union-bounding program consisting of combinatorial and probabilistic components. By casting the combinatorial part as a convex problem and the probabilistic part as a nested integration, we conduct numerical evaluations and obtain that the tightest bounds at the first two levels, $\barα_{ult_1} \approx 1.6578$ and $\barα_{ult_2} \approx 1.6219$, closely approach the 3rd and 4th lifting level parametric RDT estimates, $α_c^{(3)} \approx 1.6576$ and $α_c^{(4)} \approx 1.6218$. We also observe excellent agreement across other key parameters, including overlap values and the relative sizes of ultrametric clusters. Based on these observations, we propose several conjectures linking $ult$-OGP and parametric RDT. Specifically, we conjecture that the algorithmic threshold satisfies $α_a=\lim_{s\rightarrow\infty} α_{ult_s} = \lim_{s\rightarrow\infty} \barα_{ult_s} = \lim_{r\rightarrow\infty} α_{c}^{(r)}$, and that $α_{ult_s} \leq α_{c}^{(s+2)}$ (with possible equality for some, maybe even all, $s$). Finally, we discuss the potential existence of a full isomorphism connecting all key parameters of $ult$-OGP and parametric RDT.

2604.19698 2026-04-22 cs.LG math.ST stat.TH

On two ways to use determinantal point processes for Monte Carlo integration

Guillaume Gautier, Rémi Bardenet, Michal Valko

Comments NeurIPS 2019

Abstract

The standard Monte Carlo estimator $\widehat{I}_N^{\mathrm{MC}}$ of $\int fdω$ relies on independent samples from $ω$ and has variance of order $1/N$. Replacing the samples with a determinantal point process (DPP), a repulsive distribution, makes the estimator consistent, with variance rates that depend on how the DPP is adapted to $f$ and $ω$. We examine two existing DPP-based estimators. The first, by Bardenet & Hardy (2020), achieves a rate of $\mathcal{O}(N^{-(1+1/d)})$ for smooth $f$ but relies on a fixed DPP. The second, by Ermakov & Zolotukhin (1960), is unbiased with a rate of order $1/N$, like Monte Carlo, but its DPP is tailored to $f$. We revisit these estimators, generalize them to continuous settings, and provide sampling algorithms.
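As a point of reference for the $1/N$ baseline mentioned in the abstract, the following minimal sketch estimates an integral with plain i.i.d. Monte Carlo and checks how the estimator's variance shrinks with $N$. The integrand `square`, the choice of $ω$ as Uniform(0, 1), and all function names are assumptions made for illustration; neither DPP-based estimator from the paper is implemented here.

```python
import random
import statistics
from typing import Callable


def mc_estimate(f: Callable[[float], float], n: int, rng: random.Random) -> float:
    """Plain Monte Carlo estimate of the integral of f over Uniform(0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n


def estimator_variance(
    f: Callable[[float], float], n: int, reps: int, rng: random.Random
) -> float:
    """Empirical variance of the estimator across independent replications."""
    estimates = [mc_estimate(f, n, rng) for _ in range(reps)]
    return statistics.variance(estimates)


def square(x: float) -> float:
    return x * x  # toy integrand (an assumption); its integral over (0, 1) is 1/3


rng = random.Random(0)  # fixed seed so the run is repeatable
var_small = estimator_variance(square, n=100, reps=400, rng=rng)
var_large = estimator_variance(square, n=1600, reps=400, rng=rng)
# With 16 times more samples, the variance should shrink by roughly a factor of 16.
print(var_small / var_large)
```

A faster rate such as $\mathcal{O}(N^{-(1+1/d)})$ would show up in the same experiment as a ratio noticeably larger than the sample-size ratio.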

2604.19694 2026-04-22 stat.ME stat.AP

A Goodness-of-Fit Test for Mixed-Effects Logistic Regression

Ariel Linden

Abstract

Mixed-effects logistic regression is widely used for binary outcomes in hierarchical data, yet formal goodness-of-fit tests remain limited to random-intercept models and do not address sparse cluster settings. We extend a grouping-based Wald test to mixed-effects logistic models with random slopes. The procedure groups observations by predicted probabilities within clusters, augments the model with pooled group indicators, and tests their joint significance using a Wald statistic. To accommodate small clusters, we introduce a data-driven rule for selecting the number of groups, G=min(10,n_min), where n_min is the smallest cluster size, ensuring feasible estimation. Simulation studies across 24 null scenarios show that the test maintains nominal Type I error in three-level random slope models, including at smaller sample sizes than previously studied. The test exhibits increasing power to detect fixed-effects misspecification: power against omitted nonlinearity rises from 0.07 to 1.00 across effect sizes, and power against omitted interactions reaches 0.87. As expected, the test has no power to detect omission of a clustering level, reflecting its focus on residual structure in predicted probabilities. In sparse balanced designs, fixing G=10 leads to complete test failure, whereas the data-driven rule performs reliably. The method is implemented in the Stata program mlm_gof.
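The data-driven rule G=min(10,n_min) from the abstract can be sketched in a few lines. The function name `choose_num_groups` and the toy cluster labels are hypothetical; this is not the interface of the Stata program mlm_gof.

```python
from collections import Counter
from typing import Hashable, Iterable


def choose_num_groups(cluster_ids: Iterable[Hashable], max_groups: int = 10) -> int:
    """Return G = min(max_groups, n_min), where n_min is the smallest cluster size."""
    sizes = Counter(cluster_ids)
    if not sizes:
        raise ValueError("cluster_ids must not be empty")
    n_min = min(sizes.values())
    return min(max_groups, n_min)


# Hypothetical data: three clusters with 4, 12 and 25 observations.
cluster_ids = ["a"] * 4 + ["b"] * 12 + ["c"] * 25
print(choose_num_groups(cluster_ids))  # smallest cluster has 4 rows, so G = 4
```

Capping G at the smallest cluster size keeps the number of within-cluster groups from exceeding the number of observations in the smallest cluster, which is the feasibility concern the abstract raises for sparse designs.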

2604.19672 2026-04-22 cs.LG stat.ML

Budgeted Online Influence Maximization

Pierre Perrault, Jennifer Healey, Zheng Wen, Michal Valko

Comments 37th International Conference on Machine Learning (ICML 2020), 28 pages

Abstract

We introduce a new budgeted framework for online influence maximization, considering the total cost of an advertising campaign instead of the common cardinality constraint on a chosen influencer set. Our approach better models the real-world setting where the cost of influencers varies and advertisers want to find the best value for their overall social advertising budget. We propose an algorithm assuming an independent cascade diffusion model and edge level semi-bandit feedback, and provide both theoretical and experimental results. Our analysis is also valid for the cardinality constraint setting and improves the state of the art regret bound in this case.

2604.19662 2026-04-22 q-bio.NC physics.bio-ph stat.AP

Modelling time-order effects in haptic perception with a Bayesian dynamical framework

Gastón Avetta, Jose Lobera, Juan José Zárate, Inés Samengo, Damián G. Hernández

Comments 21 pages, 7 figures

Abstract

Perceptual judgments of sequential stimuli are systematically biased by prior expectations and by the temporal structure of sensory input. In haptic discrimination tasks, these effects often manifest as time-order asymmetries, whereby the perceived difference between two stimuli depends on their presentation order. Here, we introduce a dynamical Bayesian model that accounts for these biases by combining noisy sensory measurements with an evolving internal representation of stimulus intensity. The model formalizes perception as an inference process in which prior expectations are updated by incoming stimuli and propagate in time between observations. We test the model on psychophysical data from vibrotactile discrimination experiments, in which participants compare pairs of sequential stimuli with varying intensities. With a small number of parameters, the model quantitatively reproduces both the direction and magnitude of time-order effects across subjects, as well as the observed inter-individual variability. The inferred parameters provide a compact description of perceptual biases in terms of prior expectations and noise characteristics. Beyond fitting the data, the model induces a transformation of stimulus space, leading to a subject-dependent geometry of perceived stimuli. In this transformed space, perceptual judgments exhibit approximate symmetries that are absent in the physical stimulus coordinates. These results suggest that temporal biases in perception can be understood as a consequence of dynamical inference, and that they impose non-trivial geometric constraints on perceptual representations.

2604.19580 2026-04-22 q-fin.ST econ.EM q-fin.PM stat.AP

Probabilistic Forecasting for Day-ahead Electricity Prices, Battery Trading Strategies and the Economic Evaluation of Predictive Accuracy

Simon Hirsch, Florian Ziel

Comments 30 pages, 15 figures, 5 pages supplementary materials

Abstract

Electricity price forecasting supports decision-making in energy markets and asset operation. Probabilistic forecasts are increasingly adopted to explicitly quantify uncertainty, typically issued as quantile predictions or ensembles of the full predictive distribution. However, how improvements in statistical forecast quality translate into economic value remains unclear. Battery storage arbitrage in day-ahead markets is a popular application-based benchmark for this purpose. We analyze quantile-based trading strategies (QBTS) and identify two critical flaws: they do not incentivize honest probabilistic forecasting and they ignore the intertemporal dependence structure of electricity prices. We therefore frame battery optimization as a stochastic program based on fully probabilistic forecasts and examine decision quality measurement for risk-neutral and risk-averse settings under different uncertainty models. Our discussion touches both sides of the coin: How reliable is the economic evaluation of forecasting models through (simplified) application studies, and how do improvements in statistical forecast quality for stochastic programs relate to decision quality and economic performance? We provide theoretical justification and empirical evidence from a case study on the German electricity market. Our results highlight the pitfalls of ranking forecasting models through battery trading strategies. We conclude with implications for evaluation practice and directions for future research in application-based forecast assessment.

2604.19560 2026-04-22 cs.LG math.OC stat.ML

Separating Geometry from Probability in the Analysis of Generalization

Maxim Raginsky, Benjamin Recht

Comments 19 pages

Abstract

The goal of machine learning is to find models that minimize prediction error on data that has not yet been seen. Its operational paradigm assumes access to a dataset $S$ and articulates a scheme for evaluating how well a given model performs on an arbitrary sample. The sample can be $S$ (in which case we speak of ``in-sample'' performance) or some entirely new $S'$ (in which case we speak of ``out-of-sample'' performance). Traditional analysis of generalization assumes that both in- and out-of-sample data are i.i.d.\ draws from an infinite population. However, these probabilistic assumptions cannot be verified even in principle. This paper presents an alternative view of generalization through the lens of sensitivity analysis of solutions of optimization problems to perturbations in the problem data. Under this framework, generalization bounds are obtained by purely deterministic means and take the form of variational principles that relate in-sample and out-of-sample evaluations through an error term that quantifies how close out-of-sample data are to in-sample data. Statistical assumptions can then be used \textit{ex post} to characterize the situations when this error term is small (either on average or with high probability).

2604.19531 2026-04-22 cs.SI math.ST stat.TH

Hypergraph Mining via Proximity Matrix

Junhao Bian, Yilin Bi, Tao Zhou

Abstract

Hypergraphs serve as an effective tool widely adopted to characterize higher-order interactions in complex systems. The most intuitive and commonly used mathematical instrument for representing a hypergraph is the incidence matrix, in which each entry is binary, indicating whether the corresponding node belongs to the corresponding hyperedge. Although the incidence matrix has become a foundational tool for hypergraph analysis and mining, we argue that its binary nature is insufficient to accurately capture the complexity of node-hyperedge relationships arising from the fact that different hyperedges can contain vastly different numbers of nodes. Accordingly, based on the resource allocation process on hypergraphs, we propose a continuous-valued matrix to quantify the proximity between nodes and hyperedges. To verify the effectiveness of the proposed proximity matrix, we investigate three important tasks in hypergraph mining: link prediction, vital nodes identification, and community detection. Experimental results on numerous real-world hypergraphs show that simply designed algorithms centered on the proximity matrix significantly outperform benchmark algorithms across these three tasks.

2604.19517 2026-04-22 stat.ME

PRADAS: PRior-Assisted DAta Splitting for False Discovery Rate Control

Yuanchuan Guo, Buyu Lin, Jun S. Liu

Comments 61 pages, 6 figures

Abstract

In the FDR-controlling literature, mirror statistics offer a flexible alternative to $p$-value based procedures. When prior information is available, however, it is unclear how to incorporate mirror statistics in a principled way, and the standard equal split used by data-splitting methods can be inefficient. In this paper, we characterize a broader class of mirror statistics for any fixed splitting scheme and establish asymptotic FDR control under mild weak-dependence conditions using a two-stage procedure inspired by \cite{li2021whiteout}. Within this class, we derive a Bayes-optimal mirror statistic. Theoretically, we demonstrate its power advantage through analyses in the Rare/Weak signal model. Building upon this Bayes-optimal mirror statistic, we propose \textsc{PRADAS} (PRior-Assisted DAta Splitting) that treats split ratio as a stopping time and recasts the data-splitting as an optional stopping over a natural filtration; the optimal stopping rule is characterized by the Snell envelope and computed efficiently via a Longstaff--Schwartz regression approximation. Both simulations and real data examples demonstrate the effectiveness of our proposed framework.
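For readers unfamiliar with mirror statistics, the sketch below shows one common construction from the data-splitting literature: coefficients estimated on two halves of the data are combined so that true signals give large positive values while nulls are roughly symmetric around zero. The specific form $M_j = \mathrm{sign}(\hat β_j^{(1)} \hat β_j^{(2)})(|\hat β_j^{(1)}| + |\hat β_j^{(2)}|)$, the function names, and the toy coefficients are assumptions for illustration; this is not the broader class, the Bayes-optimal statistic, or the optimal-stopping split of PRADAS.

```python
from typing import List, Sequence


def mirror_statistics(beta1: Sequence[float], beta2: Sequence[float]) -> List[float]:
    """M_j = sign(b1_j * b2_j) * (|b1_j| + |b2_j|) for two half-sample estimates."""
    stats = []
    for b1, b2 in zip(beta1, beta2):
        sign = (b1 * b2 > 0) - (b1 * b2 < 0)
        stats.append(sign * (abs(b1) + abs(b2)))
    return stats


def select_features(stats: Sequence[float], q: float) -> List[int]:
    """Select indices with M_j > tau, where tau is the smallest cutoff whose
    estimated false discovery proportion #{M_j < -t} / max(#{M_j > t}, 1) is <= q."""
    for t in sorted(abs(m) for m in stats if m != 0):
        false_est = sum(m < -t for m in stats)  # negatives stand in for false positives
        selected = sum(m > t for m in stats)
        if false_est / max(selected, 1) <= q:
            return [j for j, m in enumerate(stats) if m > t]
    return []


# Hypothetical half-sample estimates: features 0, 1 and 4 carry signal.
beta1 = [3.0, 2.5, 0.1, -0.2, 2.8]
beta2 = [2.9, 2.7, -0.15, 0.1, 3.1]
print(select_features(mirror_statistics(beta1, beta2), q=0.2))  # [0, 1, 4]
```

The symmetry of null statistics around zero is what lets the count of large negative values estimate the number of false positives among the large positive ones.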

2604.19493 2026-04-22 stat.ME stat.CO

A Nonparametric Goodness-of-Fit Test for High-Dimensional Generalized Gaussian Distributions via Nearest-Neighbor Graphs

Mehmet Sıddık Çadırcı, Yener Ünal

Comments 22 pages, 5 pages

Abstract

The multivariate generalised Gaussian distribution (MGGD) is commonly used to model high-dimensional vectors with non-Gaussian radial behaviour, ranging from sharp-peaked to heavy-tailed profiles. However, because many classical multivariate tests are based on covariance inversion or high-dimensional density estimation, formal goodness-of-fit assessment for MGGD models remains challenging in modern regimes where the dimension is comparable to or exceeds the sample size. We introduce an affine-invariant, fully non-parametric goodness-of-fit procedure based on the nearest neighbour (NN) graph topology and the adapted zero principle. Following robust standardisation, we construct an independent reference sample from the adapted standardised MGGD and measure, on the combined NN graph, the cross-edge count to assess how well the observed and reference point clouds exhibit the mixture behaviour anticipated by the model. Calibration performed using a refitted parametric bootstrap accounts for nuisance-parameter uncertainty, thus ensuring reliable size under a composite specification. In this paper, we establish asymptotic validity under high-dimensional scaling and demonstrate consistency with respect to fixed elliptical departures, providing a geometric interpretation based on radial concentration and shell separation. Our simulation studies across a broad spectrum of dimensions and tail shapes reveal accurate Type I error control and robust power relative to heavy- and light-tailed alternatives, thus improving upon energy-distance benchmarks and normality-oriented graphical tests in contexts where MGGD modelling is most applicable.

2604.19463 2026-04-22 astro-ph.CO stat.ME

On combining estimated and analytic covariance matrices

Alan Heavens, Lorne Whiteway, Elena Sellentin

Comments For submission to OJA

Abstract

The statistical analysis of cosmological data often assumes a Gaussian sampling distribution and relies on covariance matrices estimated from simulations. In this setting, the likelihood function of the data is not Gaussian but is instead a multivariate Student-t distribution, arising from marginalisation over an inverse-Wishart distribution for the true covariance matrix. This framework, introduced by Sellentin & Heavens (2016) and extended by Percival et al. (2022), provides a principled drop-in replacement for the Gaussian likelihood with Hartlap correction (Hartlap et al. 2007). The latter removes bias in the precision matrix; it is still widely used, despite failing to reproduce the heavy tails of the true distribution (thus yielding inaccurate probabilities, especially in the case of tensions between datasets). In practice, cosmological analyses frequently involve additional Gaussian error contributions, for example from instrumental noise, foregrounds, super-sample covariance, or emulator uncertainties. The resulting likelihood function is a convolution of the Sellentin-Heavens or Percival likelihoods with an extra Gaussian contribution, and does not have a simple expression. In this note, we derive an accurate approximation for the combined likelihood function: another multivariate Student-t distribution, which inherits the heavy tails. The parameters of the Student-t distribution are determined by matching the covariance and multivariate kurtosis to those of the true distribution. We also include a slightly more expensive but still fast sampling algorithm, based on the mixture representation of the Student-t distribution, which avoids the approximation altogether. Unlike the Student-t approximation in this paper, however, it is not a drop-in replacement for the normal Gaussian or Hartlap likelihood function. (Abridged)
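The mixture representation mentioned at the end of the abstract refers to a standard fact: a multivariate Student-t draw can be written as a Gaussian draw divided by the square root of an independent scaled chi-squared variable. The sketch below illustrates only that textbook representation, using NumPy; the function name and the toy parameters are assumptions, and the paper's sampler for the combined likelihood (with the extra Gaussian contribution) is not reproduced.

```python
import numpy as np


def sample_multivariate_t(mean, scale, dof, size, rng):
    """Draw `size` samples from a multivariate Student-t via its Gaussian scale mixture."""
    mean = np.asarray(mean, dtype=float)
    scale = np.asarray(scale, dtype=float)
    # Gaussian part: z ~ N(0, scale), one row per sample.
    z = rng.multivariate_normal(np.zeros(mean.shape[0]), scale, size=size)
    # Mixing part: w ~ chi^2_dof / dof, independent of z.
    w = rng.chisquare(dof, size=size) / dof
    # Dividing by sqrt(w) inflates some draws, which produces the heavy tails.
    return mean + z / np.sqrt(w)[:, None]


rng = np.random.default_rng(0)
mean = np.array([1.0, -2.0])
scale = np.array([[1.0, 0.3], [0.3, 2.0]])
samples = sample_multivariate_t(mean, scale, dof=10, size=200_000, rng=rng)
# For dof > 2 the covariance is dof / (dof - 2) * scale, here 1.25 * scale.
print(np.cov(samples, rowvar=False))
```

Because the covariance of the Student-t exceeds its scale matrix by the factor dof / (dof - 2), matching covariances (as the abstract describes for the approximation) fixes a relationship between the two rather than setting them equal.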

2604.19451 2026-04-22 cs.LG stat.ML

Heterogeneity-Aware Personalized Federated Learning for Industrial Predictive Analytics

Yuhan Hu, Xiaolei Fang

Abstract

Federated prognostics enable clients (e.g., companies, factories, and production lines) to collaboratively develop a failure time prediction model while keeping each client's data local and confidential. However, traditional federated models often assume homogeneity in the degradation processes across clients, an assumption that may not hold in many industrial settings. To overcome this, this paper proposes a personalized federated prognostic model designed to accommodate clients with heterogeneous degradation processes, allowing them to build tailored prognostic models. The prognostic model iteratively facilitates the underlying pairwise collaborations between clients with similar degradation patterns, which enhances the performance of personalized federated learning. To estimate parameters jointly using decentralized datasets, we develop a federated parameter estimation algorithm based on proximal gradient descent. The proposed approach addresses the limitations of existing federated prognostic models by simultaneously achieving model personalization, preserving data privacy, and providing comprehensive failure time distributions. The superiority of the proposed model is validated through extensive simulation studies and a case study using the turbofan engine degradation dataset from the NASA repository.

2604.19381 2026-04-22 math.OC math.ST stat.TH

Sharp recovery and landscape guarantees for the nonconvex matrix LASSO

Andrew D. McRae, Richard Y. Zhang

Abstract

Low-rank matrix recovery can be solved to statistical optimality by convex matrix optimization under the classical assumption of restricted isometry property (RIP). However, for large problems, the convex formulation is commonly replaced by a smooth rank-constrained factored nonconvex problem for which algorithmic theory typically only guarantees convergence to second-order critical points. In this paper, we develop a sharp and statistically optimal theory for second-order critical points of the factored nonconvex matrix LASSO (nuclear-norm--regularized least-squares estimator) under RIP with particular emphasis on the overparametrized regime where the search rank $r$ exceeds the ground-truth rank $r_*$. Our recovery error bounds reveal the precise role of nuclear norm regularization, interpolating between the classical convex rate and known rates for the unregularized nonconvex problem. Complementing this positive result, we give examples showing that, contrary to popular belief, rank overparametrization does not always improve the optimization landscape even under RIP. This negative result raises questions about the fundamental statistical recovery capability of rank-constrained nonconvex approaches in comparison to convex approaches which have worse computational scaling. All of our results generalize to arbitrary convex functions with nuclear-norm regularization under restricted strong convexity and smoothness. In particular, we give sharp conditions under which second-order critical points of the nonconvex problem either (1) approximately recover low-rank approximate minima of the convex problem or (2) exactly recover a low-rank global optimum if one exists.

2604.19378 2026-04-22 stat.ME stat.CO

Random Reward Phase-Type Distributions with Applications in Latent Severity Modeling

Simon Pauli, Andreas Futschik

Comments 25 pages, 9 figures, submitted to Statistical Papers

Abstract

This paper proposes an extension to discrete Phase-Type distributions (DPH) by introducing random rewards. These allow for modeling a system in which a visit to a certain state does not emit a deterministic reward. Instead, the rewards follow either a Bernoulli or a geometric distribution. Utilizing this increased flexibility, we further sketch a possible use case for these random rewards by introducing the Inertia-Escalation model (IEM), a process with latent severity levels characterized through two parameters: Inertia ν and escalation η. We also discuss parameter inference for such models. To validate and explore random rewards and the IEM, we conducted extensive simulations and applied the model to two datasets: historical warfare and the Telco customer churn dataset.

2604.19352 2026-04-22 math.ST stat.TH

Stochastic Intervention

Rohit Chaudhuri

Comments Stochastic Intervention, Causal Inference, High Dimensional Treatments, High Dimensional Inference

Abstract

This article discusses the application of stochastic intervention to find the optimal treatment distribution yielding a high value of expected potential outcome under the setting where the number of treatments is allowed to vary with $n$. The primary motivation is to obtain a novel summarization of the effect of various treatments, which would guide practitioners towards better decisions regarding which intervention to choose.

2604.18653 2026-04-22 stat.ME physics.soc-ph

How to quantify direct correlations between variables

Shengjun Wu, Jeffery Wu

Comments 15 pages, 11 figures, 3 tables

Abstract

Analyzing correlation between variables is often both the tool and the goal of modern science. A crucial question is whether the correlation between two variables is a direct correlation or only an indirect correlation through a confounder. We review the existing measures of direct correlation and organize them into two families, each corresponding to a systematic construction: (i) removing the direct correlation from the original joint distribution and quantifying the resulting distributional shift, and (ii) intervening on one variable via do-calculus and quantifying how the distribution of the other variable responds. For every Kullback--Leibler-based measure in either family, we propose a Jensen--Shannon-based regularized analogue. Since the square root of the Jensen--Shannon divergence is a bounded metric, the regularized measures take values in $[0,1]$ and are free of the singularity of the Kullback--Leibler divergence. We further analyze the achievable upper bound of each regularized measure under the observed marginal $p(x,z)$, which depends on the alphabet size and is in general strictly below $1$; this sets the correct scale against which observed values should be read. The properties and the differences of the proposed measures are illustrated on a decision-making toy model and on three public real datasets: Titanic survival, UCI Adult (Census Income), and the UC~Berkeley 1973 graduate admissions. Bootstrap $95\%$ confidence intervals are reported for every numerical value.
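The contrast the abstract draws between the Kullback--Leibler and Jensen--Shannon divergences can be seen on a two-point example. The sketch below assumes base-2 logarithms, under which the Jensen--Shannon divergence and its square root both lie in $[0,1]$; the function names are hypothetical and the paper's specific direct-correlation measures are not reproduced.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in bits; infinite when q is zero somewhere p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log2(pi / qi)
    return total


def js_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Square root of the Jensen-Shannon divergence (base 2), a metric in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding error


p = [1.0, 0.0]
q = [0.0, 1.0]
print(kl_divergence(p, q))  # inf: the singularity the regularized measures avoid
print(js_distance(p, q))    # 1.0: disjoint supports give the maximum value
```

The mixture `m` is positive wherever `p` or `q` is, so neither inner divergence can blow up, which is why the Jensen--Shannon version stays finite.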

2604.18181 2026-04-22 math.ST stat.TH

Spectral approximation for the separable covariance mixture model

Ben Deitmar

Comments 96 pages, 2 figures

Abstract

This paper introduces the separable covariance mixture model, which assumes a data-matrix $Y$ to be of the form $$ \sum\limits_{r=1}^R A_r X B_r $$ for one random $(d \times n)$-matrix $X$ with independent centered variance-one entries, and for two families of deterministic matrices $A_1,\dots,A_R \in \mathbb{C}^{d \times d}$ and $B_1,\dots,B_R \in \mathbb{C}^{n \times n}$. Under certain assumptions, it is shown that the resolvents $(\frac{1}{n} Y Y^* - z \operatorname{Id}_d)^{-1}$ and $(\frac{1}{n} Y^* Y - z \operatorname{Id}_n)^{-1}$ respectively approximate the deterministic matrices $$ -\frac{1}{z}\Big( \operatorname{Id}_d + \sum\limits_{r,s=1}^R δ^{(B)}_{r,s}(z) A_{r} A_{s}^* \Big)^{-1} \ \ \text{ and } \ \ -\frac{1}{z}\Big( \operatorname{Id}_n + \sum\limits_{r,s=1}^R δ^{(A)}_{r,s}(z) B_{s}^*B_{r} \Big)^{-1} \ , $$ where $δ^{(A)}, δ^{(B)} \in \mathbb{C}^{R \times R}$ are uniquely defined solutions to a certain dual system of equations. The results are non-asymptotic and do not require simultaneous diagonalizability of the families $(A_r)_{r \leq R}$ or $(B_r)_{r \leq R}$, as was required in previous works such as [Hazarika and Paul (2025)] or [Mei et al. (2023)]. An asymptotic application, which describes the limiting spectral distribution of the sample covariance matrix analogues $\frac{1}{n} Y Y^*$ or $\frac{1}{n} Y^* Y$, is included.
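To make the objects in the abstract concrete, the following NumPy sketch builds a data matrix $Y = \sum_{r=1}^R A_r X B_r$ and evaluates the resolvent $(\frac{1}{n} Y Y^* - z \operatorname{Id}_d)^{-1}$ numerically. The dimensions, the Gaussian choice for $X$, $A_r$ and $B_r$, and the function names are assumptions for illustration. The deterministic equivalents are not computed, since the dual system defining $δ^{(A)}$ and $δ^{(B)}$ is not given in the abstract.

```python
import numpy as np


def separable_mixture(A, B, X):
    """Return Y = sum_r A_r X B_r for lists of (d x d) and (n x n) matrices."""
    if len(A) != len(B):
        raise ValueError("A and B must contain the same number of matrices")
    return sum(a @ X @ b for a, b in zip(A, B))


def resolvent(Y, z):
    """Return (Y Y^* / n - z Id_d)^{-1} for a (d x n) matrix Y and complex z."""
    d, n = Y.shape
    return np.linalg.inv(Y @ Y.conj().T / n - z * np.eye(d))


rng = np.random.default_rng(0)
d, n, R = 4, 6, 2
X = rng.standard_normal((d, n))  # independent, centered, variance-one entries
A = [rng.standard_normal((d, d)) for _ in range(R)]
B = [rng.standard_normal((n, n)) for _ in range(R)]
Y = separable_mixture(A, B, X)
G = resolvent(Y, z=1.0 + 1.0j)  # z off the real axis keeps the matrix invertible
print(Y.shape, G.shape)
```

With $R=1$ the construction reduces to the familiar separable covariance model $A X B$, which gives a quick sanity check for the helper.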

2604.17094 2026-04-22 astro-ph.IM physics.data-an stat.CO

Simple approximations of some statistical functions

Zinovy Malkin

Journal ref Publications of the Pulkovo Observatory, v. 240, pp. 1-7 (2026)

Abstract

Possibilities are considered to simplify the computation of several statistical functions used to test statistical hypotheses when processing observations: the inverse normal distribution, the Student's t-distribution, and the criterion for rejecting outliers. For these three cases, simple approximation expressions are proposed for the quantiles of these statistical distributions, which are accurate enough for most practical applications.

2604.16129 2026-04-22 stat.ME

Deep Ranking with Heterogeneous Effects

Yuanhang Luo, Shuxing Fang, Ruijian Han, Yiming Xu

Abstract

Classical latent-score ranking models often fail to distinguish objects' intrinsic scores from contextual effects, which are typically nonlinear and can dominate the observed outcomes. To address this, we introduce a semiparametric ranking framework in which the log-score of each object is modeled as the sum of a utility parameter and a nonparametric covariate effect. Within this framework, we establish model identifiability under mild regularity and connectivity conditions. For estimation, we approximate the covariate effects using a neural network and estimate the parameters via maximum likelihood. Under random design assumptions, we prove that the resulting estimator exists with high probability and derive non-asymptotic error bounds that achieve minimax optimality for both the parametric and nonparametric components. Numerical experiments on both synthetic data and an ATP tennis dataset are conducted to support our findings.

2604.11812 2026-04-22 math.ST stat.ME stat.TH

Confidence envelopes for the false discoveries with heterogeneous data

Romain Périer, Gilles Blanchard, Sebastian Döhler, Guillermo Durand, Etienne Roquain

Abstract

In the context of selective inference, confidence envelopes for the false discoveries allow the user to select any subset of null hypotheses while having a statistical guarantee on the number of false discoveries in the selected set. Many constructions of such envelopes have been proposed recently, using local test families (Genovese and Wasserman, 2006; Goeman and Solari, 2011), paths (Katsevich and Ramdas, 2020) or interpolation (Blanchard et al., 2020a). All those methods have in common that they have been well-studied for the homogeneous case where all p-values under the null have a uniform distribution over [0, 1]. However, in many applications the data are heterogeneous and discrete, hence the p-values have heterogeneous, discrete distributions, and the previous constructions may incur a loss of power, in the sense that they over-estimate the number of false discoveries. In this paper, we bridge the previous constructions under the homogeneous case with new tools. We also apply these tools to propose several confidence envelopes based on tools tailored for heterogeneous data, like the Bretagnolle inequality, or a new variant of the Simes inequality. We compare these new envelopes to their homogeneous counterparts on simulated data.

2604.08681 2026-04-22 stat.ME econ.EM stat.AP

Nonparametric Identification and Estimation of Causal Effects on Latent Outcomes

Jiawei Fu, Donald P. Green

Abstract

How should researchers conduct causal inference when the outcome of interest is latent and measured imperfectly by multiple indicators? We develop a general nonparametric framework for identifying and estimating average treatment effects on latent outcomes in randomized experiments. We show that latent-outcome estimation faces two distinct noncomparability challenges. First, across studies, different measurement systems may cause estimators to target different empirical quantities even when the underlying latent treatment effect is the same. Second, within a study, different indicators may have different and possibly nonlinear relationships with the same latent outcome, making them not directly comparable. To address these challenges, we propose a design-based approach built around nonparametric bridge functions. We show that these bridge functions can be characterized and identified. Estimation relies on a debiasing procedure that permits valid inference even when the bridge functions are weakly identified. Simulations demonstrate that standard methods, such as principal components analysis and inverse covariance weighting, can generate spurious cross-study differences, whereas our approach recovers comparable latent treatment effects. Overall, the framework provides both a general strategy for causal inference with latent outcomes and practical guidance for designing measurements that support identification, comparability, and efficient estimation.

2603.21623 2026-04-22 stat.ME stat.ML

Neyman-Pearson multiclass classification under label noise via empirical likelihood

Qiong Zhang, Qinglong Tian, Pengfei Li

Abstract

In many classification problems, misclassification costs are highly asymmetric, while training labels are often corrupted due to measurement error, annotator variability, or adversarial noise. The Neyman-Pearson multiclass classification (NPMC) framework addresses such asymmetry by controlling class-specific errors, but existing methods assume that training labels are correctly observed. To our knowledge, no existing approach handles NPMC under label noise in the multiclass setting, and the only binary method requires prior knowledge of the noise mechanism. A fundamental difficulty is that, without structural assumptions, noisy-label models are non-identifiable: distinct combinations of class-conditional distributions and noise mechanisms can induce the same observed distribution, preventing recovery of the quantities required for error control. We show that the exponential tilting density ratio model restores identifiability, and leverage this structure to develop an empirical likelihood approach for NPMC with noisy labels. The proposed method jointly estimates clean-label class proportions, posterior probabilities, and the noise mechanism from noisy data, without requiring prior knowledge of the confusion matrix. An expectation-maximization algorithm enables efficient computation. The resulting estimators are root n consistent and asymptotically normal, and the induced classifiers satisfy Neyman-Pearson oracle inequalities in both binary and multiclass settings. Simulation and real-data experiments demonstrate near-oracle performance.

2603.17463 2026-04-22 stat.AP econ.EM q-fin.RM q-fin.ST

Multivariate GARCH and portfolio variance prediction: A forecast reconciliation perspective

Massimiliano Caporin, Daniele Girolimetto, Emanuele Lopetuso

Details
English abstract

We assess the advantage of combining univariate and multivariate portfolio risk forecasts with the aid of forecast reconciliation techniques. In our analyses, we assume knowledge of portfolio weights, a standard assumption in portfolio risk management applications. With an extensive simulation experiment, we show that, if the true covariance is known, forecast reconciliation improves over a standard multivariate approach, in particular when the adopted multivariate model is misspecified. However, if noisy proxies are used, correctly specified models and misspecified ones (for instance, those neglecting spillovers) turn out to be, in several cases, indistinguishable, with forecast reconciliation still providing improvements. The noise in the covariance proxy plays a crucial role in determining the improvement from both forecast reconciliation and correct model specification. An empirical analysis shows how forecast reconciliation can be adopted with real data to improve traditional GARCH-based portfolio variance forecasts.

2603.15817 2026-04-22 stat.ME math.ST stat.TH

On the Equivalence between Neyman Orthogonality and Pathwise Differentiability

Yuxi Chen, Edward H. Kennedy, Sivaraman Balakrishnan

Details
English abstract

It has been frequently observed that Neyman orthogonality, the central device underlying double/debiased machine learning (Chernozhukov et al., 2018), and pathwise differentiability, a cornerstone concept from semiparametric theory, often lead to the same debiased estimators in practice. Despite the widespread adoption of both ideas, the precise nature of this equivalence has remained elusive, with the two concepts having been developed in largely separate traditions. In this work, we revisit the semiparametric framework of van der Laan and Robins (2003) and identify an implicit regularity assumption on the relationship between target and nuisance parameters -- a local product structure -- that allows us to establish a formal equivalence between Neyman orthogonality and pathwise differentiability. We also show that the two directions of this equivalence impose fundamentally different structural requirements. Finally, we illustrate the theory through three detailed examples of estimating the average treatment effect and expected density in a nonparametric model, as well as the slope in a partially linear model. This helps clarify the relationship between these two foundational frameworks and provides a useful reference for practitioners working at their intersection.

2512.09060 2026-04-22 stat.CO stat.ML

All Emulators are Wrong, Many are Useful, and Some are More Useful Than Others: A Reproducible Comparison of Computer Model Surrogates

Kellin N. Rumsey, Graham C. Gibson, Devin Francom, Reid Morris

Details
English abstract

Accurate and efficient surrogate modeling is essential for modern computational science, and there are a staggering number of emulation methods to choose from. With new methods being developed all the time, comparing the relative strengths and weaknesses of different methods remains a challenge due to inconsistent benchmarking practices and (sometimes) limited reproducibility and transparency. In this work, we present a large-scale, fully reproducible comparison of $29$ distinct emulators across $60$ canonical test functions and $40$ real emulation datasets. To facilitate rigorous, apples-to-apples comparisons, we introduce the R package \texttt{duqling}, which streamlines reproducible simulation studies using a consistent, simple syntax, and automatic internal scaling of inputs. This framework allows researchers to compare emulators in a unified environment and makes it possible to replicate or extend previous studies with minimal effort, even across different publications. Our results provide detailed empirical insight into the strengths and weaknesses of state-of-the-art emulators and offer guidance for both method developers and practitioners selecting a surrogate for new data. We discuss best practices for emulator comparison and highlight how \texttt{duqling} can accelerate research in emulator design and application.

2511.22535 2026-04-22 stat.ME

Bayes Factor Hypothesis Testing in Meta-Analyses: Practical Advantages and Methodological Considerations

Joris Mulder, Robbie C. M. van Aert

Comments 63 pages, 10 figures

Details
Journal ref
Res. synth. methods 17 (2026) 589-623
English abstract

Bayesian hypothesis testing via Bayes factors offers a principled alternative to classical p-value methods in meta-analysis, particularly suited to its cumulative and sequential nature. Unlike commonly reported p-values for standard null hypothesis significance testing, Bayes factors allow for quantifying support both for and against the existence of an effect, facilitate ongoing evidence monitoring, and maintain coherent long-run behavior as additional studies are incorporated. Recent theoretical developments further show how Bayes factors can flexibly control Type I error rates through connections to e-value theory. Despite these advantages, their use remains limited in the meta-analytic literature. This paper provides a critical overview of their theoretical properties, methodological considerations, such as prior sensitivity, and practical advantages for evidence synthesis. Two illustrative applications are provided: one on statistical learning in individuals with language impairments, and another on seroma incidence following post-operative exercise in breast cancer patients. New tools supporting these methods are available in the open-source R package BFpack.

2511.16164 2026-04-22 cs.LG stat.AP

Achieving Skilled and Reliable Daily Probabilistic Forecasts of Wind Power at Subseasonal-to-Seasonal Timescales over France

Eloi Lindas, Yannig Goude, Philippe Ciais

Details
Journal ref
2026 Environ. Res.: Energy 3 025007
English abstract

In a growing renewable-based energy system, accurate and reliable wind power forecasts are crucial for grid stability, balancing supply and demand, and market risk management. Even though short-term weather forecasts have been widely used to provide renewable power predictions up to 3 days ahead, forecasts involving prediction horizons longer than a week still need investigation. Despite the recent progress in subseasonal-to-seasonal probabilistic weather forecasting, the use of such forecasts for wind power prediction usually involves both temporal and spatial aggregation to achieve reasonable skill. In this study, we present a forecasting pipeline, agnostic to lead time and numerical weather model, that transforms ECMWF subseasonal-to-seasonal weather forecasts into wind power forecasts for France for lead times ranging from 1 day to 46 days at daily resolution. By leveraging a post-processing step of the resulting power ensembles, we show that these forecasts improve on the climatological baseline by 15% to 5% for the Continuous Ranked Probability Score and 20% to 5% for ensemble Mean Squared Error up to 16 days in advance, before converging towards the climatological skill. This improvement in skill is obtained jointly with near-perfect calibration of the forecasts for every lead time. The results suggest that electricity market players could benefit from the extended forecast range of up to two weeks to improve their decision making on renewable supply.
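The Continuous Ranked Probability Score and the improvement over a climatological baseline can be illustrated with a minimal sketch. It assumes the standard ensemble CRPS estimator and the usual skill-score definition (one minus the ratio of scores); the function names and sample values are hypothetical and are not taken from the paper's pipeline.

```python
def crps_ensemble(members, obs):
    """Standard ensemble CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    spread_to_obs = sum(abs(x - obs) for x in members) / m
    internal_spread = sum(abs(x - y) for x in members for y in members) / (2 * m * m)
    return spread_to_obs - internal_spread

def skill_score(score, reference):
    """Relative improvement over a reference score (1 is perfect, 0 is no gain)."""
    return 1.0 - score / reference

# Hypothetical ensembles for one day: a forecast and a wider climatology.
forecast_crps = crps_ensemble([1.0, 2.0, 3.0], obs=2.0)
climatology_crps = crps_ensemble([0.0, 2.0, 4.0], obs=2.0)
improvement = skill_score(forecast_crps, climatology_crps)
```

Averaging the CRPS over many days before taking the skill score gives one number per lead time, which is how a lead-time curve of the kind described above would typically be built.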

2510.24512 2026-04-22 eess.SP physics.geo-ph stat.AP

Heuristic Quality Coefficients for Interferometric Phase Linking

Magnus Heimpel, Irena Hajnsek, Othmar Frey

Comments 32 pages, 9 figures. Replacement is the version now published in ISPRS Journal of Photogrammetry and Remote Sensing

Details
Journal ref
ISPRS Journal of Photogrammetry and Remote Sensing, vol. 237, pp. 1-21 (2026)
English abstract

In multitemporal InSAR, phase linking (PL) refers to the estimation of a single-reference interferometric phase history for distributed scatterers (DS) from the information contained in the sample coherence matrix. Because the phase information in this matrix is typically inconsistent, DS processing needs practical reliability indicators to decide whether a pixel's PL estimate is sufficiently supported by the data for subsequent deformation analysis. For maximum-likelihood estimation, uncertainty can be quantified via Fisher-information-based covariance estimates, but no analogous, generally applicable uncertainty quantification is available for the broad range of non-ML methods. We propose three heuristic quality coefficients within a unified mathematical framework that covers common PL methods: (1) a method-specific goodness-of-fit coefficient that normalizes the achieved PL objective between a method-consistent upper bound and an empirically modeled noise floor level; (2) a closure phase coefficient computed from the sample coherence matrix in advance; and (3) an ambiguity coefficient that compares the obtained PL estimate with the best alternative in its orthogonal complement in the solution space. All coefficients are normalized to the interval $[0,1]$, where 1 indicates maximum reliability and 0 matches the behavior expected under pure noise. Simulations under exponential and seasonal decorrelation models show that the goodness-of-fit coefficient tracks the normalized absolute phase error most consistently, whereas the closure phase coefficient provides an a priori indicator for pre-screening. Experiments on a TerraSAR-X stack over Visp, Switzerland, reveal plausible spatial patterns across urban and vegetated areas and show that the ambiguity coefficient provides complementary information, especially in regions with temporally varying scattering mechanisms.

2510.10866 2026-04-22 stat.ML cs.LG

Quantifying Data Similarity Using Cross Learning

Shudong Sun, Hao Helen Zhang, Joseph C Watkins

Details
English abstract

Measuring dataset similarity is fundamental in machine learning, particularly for transfer learning and domain adaptation. In the context of supervised learning, most existing approaches quantify similarity of two data sets based on their input feature distributions, neglecting label information and feature-response alignment. To address this, we propose the Cross-Learning Score (CLS), which measures dataset similarity through bidirectional generalization performance of decision rules. We establish its theoretical foundation by linking CLS to cosine similarity between decision boundaries under canonical linear models, providing a geometric interpretation. A robust ensemble-based estimator is developed that is easy to implement and bypasses high-dimensional density estimation entirely. For transfer learning applications, we introduce a "transferable zones" framework that categorizes source datasets into positive, ambiguous, and negative transfer regions. To accommodate deep learning, we extend CLS to encoder-head architectures, aligning with modern representation-based pipelines. Extensive experiments on synthetic and real-world datasets validate the effectiveness of CLS for similarity measurement and transfer assessment.

2508.04818 2026-04-22 cs.CV eess.IV stat.ML

Single-Step Reconstruction-Free Anomaly Detection and Segmentation via Diffusion Models

Mehrdad Moradi, Marco Grasso, Bianca Maria Colosimo, Kamran Paynabar

Comments 9 pages, 8 figures, 1 table. Accepted to 2025 International Conference on Machine Learning and Applications (ICMLA)

Details
Journal ref
Proc. 2025 International Conference on Machine Learning and Applications (ICMLA), Boca Raton, FL, USA, 2025, pp. 663-670
English abstract

Generative models have demonstrated significant success in anomaly detection and segmentation over the past decade. Recently, diffusion models have emerged as a powerful alternative, outperforming previous approaches such as GANs and VAEs. In typical diffusion-based anomaly detection, a model is trained on normal data, and during inference, anomalous images are perturbed to a predefined intermediate step in the forward diffusion process. The corresponding normal image is then reconstructed through iterative reverse sampling. However, reconstruction-based approaches present three major challenges: (1) the reconstruction process is computationally expensive due to multiple sampling steps, making real-time applications impractical; (2) for complex or subtle patterns, the reconstructed image may correspond to a different normal pattern rather than the original input; and (3) choosing an appropriate intermediate noise level is challenging because it is application-dependent and often assumes prior knowledge of anomalies, an assumption that does not hold in unsupervised settings. We introduce Reconstruction-free Anomaly Detection with Attention-based diffusion models in Real-time (RADAR), which overcomes the limitations of reconstruction-based anomaly detection. Unlike current SOTA methods that reconstruct the input image, RADAR directly produces anomaly maps from the diffusion model, improving both detection accuracy and computational efficiency. We evaluate RADAR on a real-world 3D-printed material dataset and the MVTec-AD dataset. Our approach surpasses state-of-the-art diffusion-based and statistical machine learning models across all key metrics, including accuracy, precision, recall, and F1 score. Specifically, RADAR improves F1 score by 7% on MVTec-AD and 13% on the 3D-printed material dataset compared to the next best model. Code available at: https://github.com/mehrdadmoradi124/RADAR

2507.12330 2026-04-22 stat.AP stat.ME

Forecasting sub-population mortality using credibility theory

Mathias Lindholm, Gabriele Pittarello

Details
English abstract

The focus of the present paper is to forecast mortality rates for small sub-populations that are parts of a larger super-population. In this setting the assumption is that it is possible to produce reliable forecasts for the super-population, but the sub-populations may be too small or lack sufficient history to produce reliable forecasts if modelled separately. This setup is aligned with the ideas that underpin credibility theory, and in the present paper the classical credibility theory approach is extended to be able to handle the situation where future mortality rates are driven by a latent stochastic process, as is the case for, e.g., Lee-Carter type models. This results in sub-population credibility predictors that are weighted averages of expected future super-population mortality rates and expected future sub-population specific mortality rates. Due to the predictor's simple structure it is possible to derive an explicit expression for the expected quadratic forecast error. Moreover, the proposed credibility modelling approach does not depend on the specific form of the super-population model, making it broadly applicable regardless of the chosen forecasting model for the super-population. The performance of the suggested sub-population credibility predictor is illustrated on simulated population data. These illustrations highlight how the credibility predictor serves as a compromise between only using a super-population model, and only using a potentially unreliable sub-population specific model.

2503.08389 2026-04-22 stat.ME

Clustered Flexible Calibration Plots For Binary Outcomes Using Random Effects Modeling

Lasai Barreñada, Bavo D. C. Campo, Laure Wynants, Ben Van Calster

Comments 44 pages, 18 figures, 4 tables

Details
Journal ref
Res. synth. methods 17 (2026) 567-588
English abstract

Evaluation of clinical prediction models across multiple clusters, whether centers or datasets, is becoming increasingly common. A comprehensive evaluation includes an assessment of the agreement between the estimated risks and the observed outcomes, also known as calibration. Calibration is of utmost importance for clinical decision making with prediction models, and it may vary between clusters. We present three approaches that take clustering into account when evaluating calibration: (1) clustered group calibration (CG-C), (2) two-stage meta-analysis calibration (2MA-C), and (3) mixed model calibration (MIX-C). These approaches produce flexible calibration plots using random effects modelling and provide confidence and prediction intervals. As a case example, we externally validate a model to estimate the risk that an ovarian tumor is malignant in multiple centers (N = 2489). We also conduct a simulation study and a synthetic data study generated from a true clustered dataset to evaluate the methods. In the simulation study, MIX-C and 2MA-C (splines) gave estimated curves closest to the true overall curve. In the synthetic data study, MIX-C produced cluster-specific curves closest to the truth. Coverage of the prediction interval across the plot was best for 2MA-C with splines. We recommend using 2MA-C with splines to estimate the overall curve and the 95% PI, and MIX-C for the cluster-specific curves, especially when sample size per cluster is limited. We provide ready-to-use code to construct summary flexible calibration curves with confidence and prediction intervals to assess heterogeneity in calibration across datasets or centers.

2411.03304 2026-04-22 stat.ME math.ST stat.TH

Bayesian Controlled FDR Variable Selection via Parameter-Expanded Latent Knockoffs

Lorenzo Focardi-Olmi, Anna Gottard, Michele Guindani, Marina Vannucci

Details
English abstract

In many research fields, researchers aim to identify significant associations between a set of explanatory variables and a response while controlling the FDR. The Knockoff filter has been recently proposed in the frequentist paradigm to introduce controlled noise in a model by cleverly constructing copies of the predictors as auxiliary variables. We develop a fully Bayesian generalization of the classical model-X knockoff filter for normally distributed covariates. In our approach, we consider a joint model for the covariates and the response, where the conditional independence structure of the covariates is captured through a Gaussian graphical model and used to define a latent knockoff layer through a parameter-expanded representation of the response model. Estimating the covariate graph informs the knockoff construction and improves inference on the covariate effects. We use a modified spike-and-slab prior on the regression coefficients, avoiding the increase of the model dimension typical of the classical knockoff filter. We also address extensions to non-Gaussian responses. Our model performs variable selection using an upper bound on the posterior probability of non-inclusion. We show that the induced latent knockoff layer defines valid Gaussian model-X knockoffs under the proposed construction and that the resulting procedure controls the Bayesian FDR at an arbitrary level, in finite samples, if the distribution of the covariates is fully known; under an estimated graphical structure, it satisfies an asymptotic FDR guarantee. We use simulated data to demonstrate that our proposal increases the stability of the selection relative to classical knockoff methods. Compared with Bayesian variable selection methods, our selection procedure achieves comparable or better performance while maintaining control over the FDR. We conclude with an application to real data.

2407.13980 2026-04-22 stat.ME cs.LG stat.ML

Byzantine-tolerant distributed learning of finite mixture models

Qiong Zhang, Yan Shuo Tan, Jiahua Chen

Details
English abstract

Traditional statistical methods need to be updated to work with modern distributed data storage paradigms. A common approach is the split-and-conquer framework, which involves learning models on local machines and averaging their parameter estimates. However, this does not work for the important problem of learning finite mixture models, because subpopulation indices on each local machine may be arbitrarily permuted (the "label switching problem"). Zhang and Chen (2022) proposed Mixture Reduction (MR) to address this issue, but MR remains vulnerable to Byzantine failure, whereby a fraction of local machines may transmit arbitrarily erroneous information. This paper introduces Distance Filtered Mixture Reduction (DFMR), a Byzantine tolerant adaptation of MR that is both computationally efficient and statistically sound. DFMR leverages the densities of local estimates to construct a robust filtering mechanism. By analysing the pairwise L2 distances between local estimates, DFMR identifies and removes severely corrupted local estimates while retaining the majority of uncorrupted ones. We provide theoretical justification for DFMR, proving its optimal convergence rate and asymptotic equivalence to the global maximum likelihood estimate under standard assumptions. Numerical experiments on simulated and real-world data validate the effectiveness of DFMR in achieving robust and accurate aggregation in the presence of Byzantine failure.

2312.06098 2026-04-22 stat.ME math.ST stat.TH

Mixture Matrix-valued Autoregressive Model

Fei Wu, Kung-Sik Chan

Details
Journal ref
Journal of Econometrics, Volume 255, 2026
English abstract

Time series of matrix-valued data are increasingly available in various areas including economics, finance, social science, among others. These data may shed light on the inter-dynamical relationships between two sets of attributes, for instance, countries and economic indices. The matrix autoregressive (MAR) model provides a parsimonious approach for analyzing such data. However, the MAR model, being a linear model with parametric constraints, cannot capture the nonlinear patterns in the data, such as regime shifts in the dynamics. We propose a mixture matrix autoregressive (MMAR) model for analyzing potential regime shifts in the dynamics between two attributes, for instance, due to recession versus expansion, or stable period versus pandemic. We propose an EM algorithm for maximum likelihood estimation. We derive some theoretical properties of the proposed method including consistency and asymptotic distribution, and illustrate its performance via simulations and real applications.

2307.01348 2026-04-22 econ.EM stat.ME

Nonparametric Estimation of Large Spot Volatility Matrices for High-Frequency Financial Data

Ruijun Bu, Degui Li, Oliver Linton, Hanchao Wang

Details
Journal ref
Econom. Theory 42 (2026) 63-100
English abstract

In this paper, we consider estimating spot/instantaneous volatility matrices of high-frequency data collected for a large number of assets. We first combine classic nonparametric kernel-based smoothing with a generalised shrinkage technique in the matrix estimation for noise-free data under a uniform sparsity assumption, a natural extension of the approximate sparsity commonly used in the literature. The uniform consistency property is derived for the proposed spot volatility matrix estimator with convergence rates comparable to the optimal minimax one. For the high-frequency data contaminated by microstructure noise, we introduce a localised pre-averaging estimation method that reduces the effective magnitude of the noise. We then use the estimation tool developed in the noise-free scenario, and derive the uniform convergence rates for the developed spot volatility matrix estimator. We further combine the kernel smoothing with the shrinkage technique to estimate the time-varying volatility matrix of the high-dimensional noise vector. In addition, we consider large spot volatility matrix estimation in time-varying factor models with observable risk factors and derive the uniform convergence property. We provide numerical studies including simulation and empirical application to examine the performance of the proposed estimation methods in finite samples.

1110.6639 2026-04-22 physics.data-an stat.CO

On computation of a common mean

Zinovy Malkin

Details
English abstract

Combining several independent measurements of the same physical quantity is one of the most important tasks in metrology. Small samples, biased input estimates, reported uncertainties that are not always adequate, and an unknown error distribution make a rigorous solution very difficult, if not impossible. For this reason, many methods to compute a common mean and its uncertainty have been proposed, each with its own advantages and shortcomings. Most of them are variants of the weighted average (WA) approach with different strategies to compute the WA and its standard deviation. The median estimate has also become increasingly popular in recent years. In this paper, these two methods, in their most widely used modifications, are compared using simulated and real data. To overcome some problems of known approaches to computing the WA uncertainty, a new combined estimate is proposed. It is shown that the proposed method can help to obtain a more robust and realistic estimate suitable for both consistent and discrepant measurements.
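As a rough illustration of the two families of estimators compared here, the sketch below computes the textbook inverse-variance weighted average with its formal standard error, alongside the plain median. This is an assumed baseline formulation, not the new combined estimate proposed in the paper; the function name and the sample measurements are hypothetical.

```python
import math
import statistics

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its formal standard error."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)

# Hypothetical measurements of one quantity with reported uncertainties.
values = [10.0, 12.0, 11.0]
sigmas = [1.0, 2.0, 1.0]
wa, wa_err = weighted_mean(values, sigmas)
med = statistics.median(values)
```

The formal error depends only on the reported uncertainties, not on the scatter of the values, which is one reason it can be unrealistic for discrepant measurements.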

2604.19290 2026-04-22 q-fin.CP q-fin.MF q-fin.PR q-fin.RM stat.ME

Orthogonal reparametrization of the Nelson-Siegel-Svensson interest rate curve model: conditioning, diagnostics, and identifiability

Robert Flassig, Emrah Gülay, Daniel Guterding

Comments 28 pages, 10 figures

Details
English abstract

The Nelson-Siegel-Svensson (NSS) interest rate curve model yields a separable nonlinear least-squares problem whose inner linear block is often ill-conditioned because the basis functions become nearly collinear. We analyze this instability via an exact orthogonal reparametrization of the design matrix. A thin QR decomposition produces orthogonal linear parameters for which, conditional on the nonlinear parameters, the Fisher information matrix is diagonal. We also derive a finite-horizon analytical orthogonalization: on $[0,T]$, the $4\times 4$ continuous Gram matrix has closed-form entries involving exponentials, logarithms, and the exponential integral $E_1$, yielding an explicit horizon-dependent orthogonal NSS basis. Together with Jacobian-rank and profile-likelihood arguments, this representation clarifies the degenerate manifold $λ_1=λ_2$, where the Svensson extension loses two degrees of freedom. Orthogonalization leaves the least-squares fit and uncertainty of the original linear parameters unchanged, but isolates the conditioning structure. When the decay parameters are estimated jointly, the full first-order covariance in orthogonal coordinates admits an explicit Schur-complement form. The approach also yields a scalar identifiability diagnostic through the QR element $R_{44}$ and separates model reduction from numerical instability. Synthetic experiments confirm that orthogonal parametrization eliminates correlations among the linear parameters and keeps their conditional uncertainty uniform. A daily U.S. Treasury study on a reduced fixed 9-tenor grid from 1981 to 2026 shows smoother orthogonal parameter series than classical NSS parameters while the moving QR basis remains nearly constant.
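The thin QR step can be sketched in a few lines. The example below builds the standard four-column NSS design matrix for fixed decay parameters and orthogonalizes it with modified Gram-Schmidt in pure Python. The tenor grid, the decay values, and the function names are assumptions made for illustration; the paper's own implementation and its analytical finite-horizon basis are not reproduced here.

```python
import math

def nss_design(tenors, lam1, lam2):
    """Rows of the NSS design matrix: level, slope, and two curvature terms."""
    rows = []
    for t in tenors:
        x1, x2 = t / lam1, t / lam2
        slope = (1 - math.exp(-x1)) / x1
        curv1 = slope - math.exp(-x1)
        curv2 = (1 - math.exp(-x2)) / x2 - math.exp(-x2)
        rows.append([1.0, slope, curv1, curv2])
    return rows

def thin_qr(a):
    """Thin QR via modified Gram-Schmidt: Q is n x k, R is k x k upper triangular."""
    n, k = len(a), len(a[0])
    cols = [[a[i][j] for i in range(n)] for j in range(k)]
    q_cols, r = [], [[0.0] * k for _ in range(k)]
    for j in range(k):
        v = cols[j][:]
        for i, q in enumerate(q_cols):
            # Project out each earlier orthonormal direction in turn.
            r[i][j] = sum(qi * vi for qi, vi in zip(q, v))
            v = [vi - r[i][j] * qi for vi, qi in zip(v, q)]
        r[j][j] = math.sqrt(sum(vi * vi for vi in v))
        q_cols.append([vi / r[j][j] for vi in v])
    q = [[q_cols[j][i] for j in range(k)] for i in range(n)]
    return q, r

tenors = [0.25, 0.5, 1, 2, 3, 5, 7, 10, 30]  # hypothetical 9-tenor grid, in years
design = nss_design(tenors, lam1=1.5, lam2=8.0)
q, r = thin_qr(design)
r44 = r[3][3]  # shrinks toward zero as lam2 approaches lam1
```

Because the columns of Q are orthonormal, least-squares coefficients in the Q basis are uncorrelated conditional on the decay parameters. The sketch divides by the diagonal of R, so it fails at exactly `lam1 == lam2`, where the fourth column duplicates the third.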

2604.19279 2026-04-22 stat.AP stat.OT

Early Prediction of Student Performance Using Bayesian Updating with Informative Priors Across Cohorts

Jakob Schwerter, Amer Krivosija, Tim Novak, Katja Ickstadt, Alexander Munteanu

Details
English abstract

Early identification of at risk students in higher education depends on predictive models that maintain accuracy across successive cohorts -- a requirement that single-cohort modeling approaches fail to meet. This study evaluates Bayesian updating with informative priors from a previous cohort to improve cross-cohort prediction robustness using digital trace data. We fit weekly Bayesian linear, logistic, and ordinal regression models with either uninformative default priors or informative priors derived from posterior distributions of a preceding cohort. Models were applied to six weekly self-regulated learning (SRL)-aligned engagement indicators from two consecutive cohorts of students in a blended first-year mathematics course (N1 = 307; N2 = 323). Outcomes were exam points, final grades, and a binary at risk indicator. The models were evaluated weekly based on accuracy, sensitivity, and RMSE. In the source cohort, performance was already substantial by week 6. In the target cohort, informative priors improved early classification: Logistic models with priors reduced misclassification by 22% and false negatives by 38% in week 3 relative to the uninformative default. Ordinal models with priors similarly showed the strongest improvements in early weeks, reducing misclassification by 42% in week 2 and reaching an accuracy of .77 by week 4. Linear models showed little benefit from prior information. These findings demonstrate that Bayesian updating is a viable method for improving early classification performance across cohorts, with gains concentrated in the early weeks of the semester when current-cohort data are scarce.

2604.19177 2026-04-22 stat.ME

Multiscale Cochran-Mantel-Haenszel Scanning for Conditional Dependency

Gyeonghun Kang, Jialiang Mao, Li Ma

Details
English abstract

We propose a nonparametric approach to testing conditional independence and estimating conditional association, generalizing the Cochran-Mantel-Haenszel (CMH) test and odds-ratio estimator to continuous sample spaces. It leverages a multiscale scanning approach to decompose the sample space into a cascade of $2\times 2 \times T$ tables. Following the CMH test, we condition on the marginal order statistics, which are "almost ancillary" regarding conditional dependency. This strategy helps overcome a key challenge faced by other methods that discretize the sample space: we achieve consistency without requiring stratum sample sizes to grow to infinity, a constraint often difficult to satisfy in practice. Our method produces easy-to-compute test statistics with a known asymptotic null distribution under the conditional sampling model, scaling almost linearly with the sample size. Our simulation results demonstrate reliable Type I error control, even with small samples and high-dimensional conditioning, and competitive power compared to state-of-the-art tests. Finally, a case study on Uber ride-share data highlights the method's unique dual capability, inherited from the CMH, to both test and identify the nature of the inferred conditional association. By providing summary statistics that capture the strength and direction of local associations, our method offers practitioners a useful tool for learning conditional dependencies.
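For reference, the classical CMH building block that the method generalizes can be written compactly. The sketch below computes the standard CMH chi-square statistic and the Mantel-Haenszel common odds ratio for a stack of $2\times 2$ tables. It does not implement the multiscale scanning procedure itself; the function name and the example strata are hypothetical.

```python
def cmh(tables, correct=False):
    """Classical CMH chi-square statistic and Mantel-Haenszel odds ratio.

    tables: iterable of 2x2 strata given as (a, b, c, d), where a and b
    form the first row and a and c form the first column.
    """
    diff = var = num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        r1, r2, c1, c2 = a + b, c + d, a + c, b + d
        diff += a - r1 * c1 / n                        # observed minus expected
        var += r1 * r2 * c1 * c2 / (n * n * (n - 1))   # hypergeometric variance
        num += a * d / n
        den += b * c / n
    adj = 0.5 if correct else 0.0                      # optional continuity correction
    stat = (abs(diff) - adj) ** 2 / var
    return stat, num / den

# Hypothetical strata; under the null, stat is approximately chi-square with 1 df.
stat, odds_ratio = cmh([(10, 5, 5, 10), (8, 4, 4, 8)])
```

The statistic tests for association and the odds ratio reports its direction and strength, which is the dual capability the abstract attributes to the CMH framework.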

2604.19175 2026-04-22 stat.CO

Digital twin-based hybrid framework for steam generator clogging prognostics

Edgar Jaber, Emmanuel Remy, Vincent Chabridon, Morgane Garo-Sail, Mathilde Mougeot, Didier Lucor, Jerome Delplace, Maxime Lointier

Details
English abstract

We present a hybrid framework to support prognostics of the clogging degradation phenomenon in tube support plates for digital twins of steam generators in pressurized water reactors. The proposed approach combines a physics-based simulation code, heterogeneous and sparse observational data, and several uncertainty quantification techniques to obtain a robust estimate of the steam generator remaining useful life associated with the clogging rate. The proposed framework is compatible with a digital twin platform to assist maintenance planning of EDF steam generators.

2604.19165 2026-04-22 stat.ML cs.LG cs.NA math.NA

Analytical Extraction of Conditional Sobol' Indices via Basis Decomposition of Polynomial Chaos Expansions

Shijie Zhong, Jiangfeng Fu

Comments 11 pages, 2 figures

Details
English abstract

In uncertainty quantification, evaluating sensitivity measures under specific conditions (i.e., conditional Sobol' indices) is essential for systems with parameterized responses, such as spatial fields or varying operating conditions. Traditional approaches often rely on point-wise modeling, which is computationally expensive and may lack consistency across the parameter space. This paper demonstrates that for a pre-trained global Polynomial Chaos Expansion (PCE) model, the analytical conditional Sobol' indices are inherently embedded within its basis functions. By leveraging the tensor-product property of PCE bases, we reformulate the global expansion into a set of analytical coefficient fields that depend on the conditioning variables. Based on the preservation of orthogonality under conditional probability measures, we derive closed-form expressions for conditional variances and Sobol' indices. This framework bypasses the need for repetitive modeling or additional sampling, transforming conditional sensitivity analysis into a purely algebraic post-processing step. Numerical benchmarks indicate that the proposed method ensures physical coherence and offers superior numerical robustness and computational efficiency compared to conventional point-wise approaches.

2604.19162 2026-04-22 cs.CL stat.AP

Mind the Unseen Mass: Unmasking LLM Hallucinations via Soft-Hybrid Alphabet Estimation

Hongxing Pan, Yingying Guo, Wenqing Kuang, Jiashi Lu

Comments 7 pages, 1 figure, 3 tables

Details
English abstract

This paper studies uncertainty quantification for large language models (LLMs) under black-box access, where only a small number of responses can be sampled for each query. In this setting, estimating the effective semantic alphabet size--that is, the number of distinct meanings expressed in the sampled responses--provides a useful proxy for downstream risk. However, frequency-based estimators tend to undercount rare semantic modes when the sample size is small, while graph-spectral quantities alone are not designed to estimate semantic occupancy accurately. To address this issue, we propose SHADE (Soft-Hybrid Alphabet Dynamic Estimator), a simple and interpretable estimator that combines Generalized Good-Turing coverage with a heat-kernel trace of the normalized Laplacian constructed from an entailment-weighted graph over sampled responses. The estimated coverage adaptively determines the fusion rule: under high coverage, SHADE uses a convex combination of the two signals, while under low coverage it applies a LogSumExp fusion to emphasize missing or weakly observed semantic modes. A finite-sample correction is then introduced to stabilize the resulting cardinality estimate before converting it into a coverage-adjusted semantic entropy score. Experiments on pooled semantic alphabet-size estimation against large-sample references and on QA incorrectness detection show that SHADE achieves the strongest improvements in the most sample-limited regime, while the performance gap narrows as the number of samples increases. These results suggest that hybrid semantic occupancy estimation is particularly beneficial when black-box uncertainty quantification must operate under tight sampling budgets.
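
The two signals that SHADE combines can be sketched in isolation. This is a minimal illustration under stated assumptions and not the authors' estimator. It uses the classical Good-Turing coverage estimate (not the generalized variant the abstract mentions), a symmetric non-negative weight matrix standing in for the entailment-weighted graph, and a hypothetical diffusion time `t`. The fusion rule and the finite-sample correction are not specified in the abstract and are left out.

```python
from collections import Counter

import numpy as np


def good_turing_coverage(labels):
    """Classical Good-Turing estimate of sample coverage: 1 - n1 / n."""
    counts = Counter(labels)
    n = sum(counts.values())
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / n


def heat_kernel_trace(weights, t):
    """Trace of exp(-t L) for the symmetric normalized Laplacian L.

    weights: symmetric, non-negative (n, n) matrix with zero diagonal.
    Isolated nodes get a zero row in L, so each counts as its own component.
    """
    W = np.asarray(weights, dtype=float)
    degree = W.sum(axis=1)
    connected = degree > 0
    inv_sqrt = np.zeros_like(degree)
    inv_sqrt[connected] = degree[connected] ** -0.5
    L = np.diag(connected.astype(float)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)
    return float(np.exp(-t * eigvals).sum())
```

For large `t` the trace approaches the number of connected components of the graph, which is why it can act as a soft count of semantic clusters. For small `t` it approaches the number of sampled responses.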

2604.19153 2026-04-22 stat.AP

And Quiet Does Not Flow the Don: Statistical Analysis of a Quarrel Between Nobel Prize Laureates

Nils Lid Hjort

Comments 8 pages, 2 figures; Statistical Research Report, Department of Mathematics, University of Oslo, and Centre of Advanced Study, the Norwegian Academy of Science and Letters, 2007; published in "Consilience", Centre of Advanced Study, Norwegian Academy of Science and Letters, 2007, pp. 134-140

Details
English abstract

The Nobel Prize in literature 1965 was awarded to Mikhail Sholokhov (1905-1984), for the epic novel Tikhij Don about Cossack life and the birth of a new Soviet society (And Quiet Flows the Don, or The Quiet Don, in different translations). Sholokhov has been compared to Tolstoy and was, at least one or two generations ago, called `the greatest of our writers' in the Soviet Union. In Russia alone his books have been published in more than a thousand editions, selling in total more than sixty million copies. He was an elected member of the USSR Supreme Soviet, the USSR Academy of Sciences, and of the CPSU Central Committee. But in the autumn of 1974 an article was published in Paris, Stremya `Tihogo Dona': Zagadki romana (`The Rapids of Quiet Don: the Enigmas of the Novel'), by the author and critic D$^*$. He claimed that Tikhij Don was not at all Sholokhov's work, but that it was rather written by Fiodor Kriukov, a more obscure author who fought against bolshevism and died in 1920. The article was given credibility and prestige by none other than Aleksandr Solzhenitsyn (a Nobel prize winner five years after Sholokhov), who wrote a preface giving full support to D$^*$'s conclusion. Scandals followed, also touching the upper echelons of Soviet society, and Sholokhov's reputation was faltering abroad (see e.g. Doris Lessing's (1997) comments; `vibrations of dislike instantly flowed between us'). Are we in fact faced with one of the most flagrant cases of plagiarism in the history of literature?

2604.19152 2026-04-22 stat.ME

Transfer Learning for Degree-Corrected Mixed Membership Network Models

Yong He, Kangxiang Qin, Haoran Tang

Details
English abstract

Statistical analysis of network data has attracted considerable attention in recent years, due to the rapid advancement of well-trained network models and the accessibility of large public network datasets. In this article, we propose a transfer learning procedure for boosting the estimation accuracy of a target network structure based on the well-known Degree-Corrected Mixed-Membership (DCMM) model. By leveraging useful information from informative source datasets, we theoretically prove that the transfer learning procedure greatly improves the estimation accuracy for the target connection probability matrix. Our theoretical analysis also reveals that the benefits of knowledge transfer in this context are attributable to the enlarged eigenvalue gap of the target connection probability matrix. Additionally, we propose a random projection step in conjunction with the conventional aggregation procedure to alleviate the heavy computational burden in practice. In the presence of potentially harmful sources, we further provide an iterative truncation algorithm for selecting useful datasets and avoiding negative transfer. Numerical results showcase the practical utility of our methods in real-world network dataset analysis, including a journal citation network dataset and an international trade network dataset.

2604.19150 2026-04-22 stat.ME math.ST stat.TH

The General Formulation of Loss-Based Priors for Parameter Spaces

Cristiano Villa

Details
English abstract

Loss-based priors assign probability mass to parameter values according to the inferential loss incurred when they are excluded from the parameter space, and provide a general solution for discrete parameters. Extending this idea to continuous settings is challenging, as the exclusion of a single point induces no loss. We propose a neighbourhood-exclusion framework in which inferential loss is defined by removing a local region around each parameter value. Under standard regularity conditions, this yields a class of prior distributions driven by the local geometry of the Kullback--Leibler divergence. In one dimension, the resulting prior coincides with Jeffreys' prior, while in higher dimensions it leads to a family of priors indexed by the geometry of the exclusion region. The proposed formulation provides a unified extension of loss-based priors and offers a geometric interpretation of objective prior construction beyond isotropic settings.

2604.19091 2026-04-22 stat.ML cs.LG

Fast estimation of Gaussian mixture components via centering and singular value thresholding

Huan Qing

Comments 28 pages, 7 figures, 1 table

Details
English abstract

Estimating the number of components is a fundamental challenge in unsupervised learning, particularly when dealing with high-dimensional data with many components or severely imbalanced component sizes. This paper addresses this challenge for classical Gaussian mixture models. The proposed estimator is simple: center the data, compute the singular values of the centered matrix, and count those above a threshold. No iterative fitting, no likelihood calculation, and no prior knowledge of the number of components are required. We prove that, under a mild separation condition on the component centers, the estimator consistently recovers the true number of components. The result holds in high-dimensional settings where the dimension can be much larger than the sample size. It also holds when the number of components grows to the smaller of the dimension and the sample size, even under severe imbalance among component sizes. Computationally, the method is extremely fast: for example, it processes ten million samples in one hundred dimensions within one minute. Extensive experimental studies confirm its accuracy in challenging settings such as high dimensionality, many components, and severe class imbalance.
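
The three steps named in the abstract (center, compute singular values, count those above a threshold) can be sketched as follows. This is only an illustration. The abstract does not give the paper's threshold or say how the count is mapped to the final estimate, so the threshold below is a hypothetical choice scaled to unit-variance noise, and the function returns the raw count.

```python
import numpy as np


def count_large_singular_values(X, threshold):
    """Center the columns of X and count singular values above threshold."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(singular_values > threshold))


# Synthetic check: three well-separated spherical Gaussian components.
rng = np.random.default_rng(0)
n_per, d = 1000, 50
centers = 10.0 * np.eye(d)[:3]
X = np.vstack([c + rng.standard_normal((n_per, d)) for c in centers])

# Hypothetical threshold: a constant multiple of sqrt(n) + sqrt(d), the
# approximate largest singular value of an n-by-d unit-variance noise matrix.
threshold = 1.5 * (np.sqrt(X.shape[0]) + np.sqrt(d))
count = count_large_singular_values(X, threshold)
```

With three generic centers, centering leaves a rank-two mean structure, so this sketch finds two singular values above the noise level. Relating that count to the number of components is left to the paper.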

2604.19066 2026-04-22 cs.LG stat.AP

Age-Dependent Heterogeneity in the Association Between Physical Activity and Mental Distress: A Causal Machine Learning Analysis of 3.2 Million U.S. Adults

Yuan Shan

Details
English abstract

Physical activity (PA) is widely recognized as protective against mental distress, yet whether this benefit varies systematically across population subgroups remains poorly understood. Using pooled data from ten consecutive annual waves of the U.S. Behavioral Risk Factor Surveillance System (2015-2024; n = 3,242,218), we investigate heterogeneity in the association between leisure-time PA and frequent mental distress (FMD, >=14 days/month) across age groups. Survey-weighted logistic regression reveals a striking age gradient: the adjusted odds ratio for PA ranges from 0.89 among young adults (18-24) to 0.50 among adults aged 55-64, with the protective association strengthening monotonically with age. Temporal analysis across all ten years shows that the young-adult PA effect has been eroding over the past decade, with the 18-24 OR reaching 1.01 (null) in both 2018 and 2024 -- paralleling the deepening youth mental health crisis. Causal Forest via Double Machine Learning independently identifies age as the dominant driver of treatment effect heterogeneity (feature importance = 0.39, 2.5x the next predictor). E-value sensitivity analysis, propensity score overlap checks, placebo tests, and imputation comparisons confirm the robustness of the findings. These results suggest that the well-documented exercise--mental health link may not generalize to the youngest adult population, whose distress appears increasingly driven by stressors that PA alone cannot mitigate.

2604.19065 2026-04-22 cs.GT cs.SY eess.SY math.OC stat.ML

Last-Iterate Guarantees for Learning in Co-coercive Games

Siddharth Chandak, Ramanan Tamizholi, Nicholas Bambos

Comments Submitted to IEEE Conference on Decision and Control (CDC) 2026

Details
English abstract

We establish finite-time last-iterate guarantees for vanilla stochastic gradient descent in co-coercive games under noisy feedback. This is a broad class of games that is more general than strongly monotone games, allows for multiple Nash equilibria, and includes examples such as quadratic games with negative semidefinite interaction matrices and potential games with smooth concave potentials. Prior work in this setting has relied on relative noise models, where the noise vanishes as iterates approach equilibrium, an assumption that is often unrealistic in practice. We work instead under a substantially more general noise model in which the second moment of the noise is allowed to scale affinely with the squared norm of the iterates, an assumption natural in learning with unbounded action spaces. Under this model, we prove a last-iterate bound of order $O(\log(t)/t^{1/3})$, the first such bound for co-coercive games under non-vanishing noise. We additionally establish almost sure convergence of the iterates to the set of Nash equilibria and derive time-average convergence guarantees.

2604.19018 2026-04-22 cs.LG cs.AI cs.SY eess.SY math.OC stat.ML

Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control

Julian Skifstad, Xinyue Annie Yang, Glen Chou

Comments Under review

Details
English abstract

Inference-time LLM alignment methods, particularly activation steering, offer an alternative to fine-tuning by directly modifying activations during generation. Existing methods, however, often rely on non-anticipative interventions that ignore how perturbations propagate through transformer layers and lack online error feedback, resulting in suboptimal, open-loop control. To address this, we show empirically that, despite the nonlinear structure of transformer blocks, layer-wise dynamics across multiple LLM architectures and scales are well-approximated by locally-linear models. Exploiting this property, we model LLM inference as a linear time-varying dynamical system and adapt the classical linear quadratic regulator to compute feedback controllers using layer-wise Jacobians, steering activations toward desired semantic setpoints in closed-loop with minimal computational overhead and no offline training. We also derive theoretical bounds on setpoint tracking error, enabling formal guarantees on steering performance. Using a novel adaptive semantic feature setpoint signal, our method yields robust, fine-grained behavior control across models, scales, and tasks, including state-of-the-art modulation of toxicity, truthfulness, refusal, and arbitrary concepts, surpassing baseline steering methods. Our code is available at: https://github.com/trustworthyrobotics/lqr-activation-steering
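
The classical regulator that the abstract adapts can be sketched as a backward Riccati recursion for a linear time-varying system. This is the textbook finite-horizon LQR and not the authors' implementation. Here `A_list` stands in for the layer-wise Jacobians, `B_list`, `Q`, `R` and `Q_final` are hypothetical design choices, and the state is assumed to be expressed as the deviation from the desired setpoint.

```python
import numpy as np


def finite_horizon_lqr(A_list, B_list, Q, R, Q_final):
    """Backward Riccati recursion for x[k+1] = A[k] x[k] + B[k] u[k].

    Returns gains K[k] such that u[k] = -K[k] @ x[k] minimises
    sum_k (x'Qx + u'Ru) + x_N' Q_final x_N.
    """
    P = np.asarray(Q_final, dtype=float)
    gains = []
    for A, B in zip(reversed(A_list), reversed(B_list)):
        S = R + B.T @ P @ B                    # control-cost curvature
        K = np.linalg.solve(S, B.T @ P @ A)    # feedback gain at this step
        P = Q + A.T @ P @ (A - B @ K)          # cost-to-go update
        gains.append(K)
    gains.reverse()
    return gains
```

Each gain depends on all later layers through `P`, which is what makes the controller anticipative. Applying `u[k] = -K[k] @ x[k]` at every layer uses the current state, so the loop is closed.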

2604.18973 2026-04-22 stat.AP cs.LG

Ground-Level Near Real-Time Modeling for PM2.5 Pollution Prediction

Zachary R. Fox, Janet O. Agbaje, Dakotah Maguire, Javier E. Santos, Jeremy Logan, Maggie Davis, Rima Habre, Jim VanDerslice, Heidi A. Hanson

Details
English abstract

Air pollution is a worldwide public health threat that can cause or exacerbate many illnesses, including respiratory disease, cardiovascular disease, and some cancers. However, epidemiological studies and public health decision-making are stymied by the inability to assess pollution exposure impacts in near real time. Developing accurate digital twins of environmental pollutants would address this by enabling timely data-driven analytics, a crucial step in modernizing health policy and decision-making. Although other models predict and analyze fine particulate matter exposure, they often rely on modeled input data sources and data streams that are not regularly updated. Another challenge stems from current models relying on predefined grids. In contrast, our deep-learning approach interpolates surface-level PM2.5 concentrations between sparsely distributed US EPA monitoring stations in a grid-free manner. By incorporating additional, readily available datasets (including topographic, meteorological, and land-use data), we improve its ability to predict pollutant concentrations with high spatial and temporal resolution. This enables model querying at any spatial location for rapid predictions without computing over the entire grid. To ensure robustness, we randomize spatial sampling during training to enable our model to perform well in both densely and sparsely monitored regions. This model is well suited for near real-time deployment because its lightweight architecture allows for fast updates in response to streaming data. Moreover, model flexibility and scalability allow it to be adapted to various geographical contexts and scales, making it a practical tool for delivering accurate and timely air quality assessments. Its capacity to rapidly evaluate multiple scenarios can be especially valuable for decision-making during public health crises.

2604.18912 2026-04-22 cs.LG stat.ME

Collaborative Contextual Bayesian Optimization

Chih-Yu Chang, Qiyuan Chen, Tianhan Gao, David Fenning, Chinedum Okwudire, Neil Dasgupta, Wei Lu, Raed Al Kontar

Details
English abstract

Discovering optimal designs through sequential data collection is essential in many real-world applications. While Bayesian Optimization (BO) has achieved remarkable success in this setting, growing attention has recently turned to context-specific optimal design, formalized as Contextual Bayesian Optimization (CBO). Unlike BO, CBO is inherently more challenging as it must approximate an entire mapping from the context space to its corresponding optimal design, requiring simultaneous exploration across contexts and exploitation within each. In many modern applications, such tasks arise across multiple potentially heterogeneous but related clients, where collaboration can significantly improve learning efficiency. We propose CCBO, Collaborative Contextual Bayesian Optimization, a unified framework enabling multiple clients to jointly perform CBO with controllable contexts, supporting both online collaboration and offline initialization from peers' historical beliefs, with an optional privacy-preserving communication mechanism. We establish sublinear regret guarantees and demonstrate, through extensive simulations and a real-world hot rolling application, that CCBO achieves substantial improvements over existing approaches even under client heterogeneity. The code to reproduce the results can be found at https://github.com/cchihyu/Collaborative-Contextual-Bayesian-Optimization

2604.18864 2026-04-22 cs.LG stat.ML

ParamBoost: Gradient Boosted Piecewise Cubic Polynomials

Nicolas Salvadé, Tim Hillel

Details
English abstract

Generalized Additive Models (GAMs) can be used to create non-linear glass-box (i.e. explicitly interpretable) models, where the predictive function is fully observable over the complete input space. However, glass-box interpretability itself does not allow for the incorporation of expert knowledge from the modeller. In this paper, we present ParamBoost, a novel GAM whose shape functions (i.e. mappings from individual input features to the output) are learnt using a Gradient Boosting algorithm that fits cubic polynomial functions at leaf nodes. ParamBoost incorporates several constraints commonly used in parametric analysis to ensure well-refined shape functions. These constraints include: (i) continuity of the shape functions and their derivatives (up to C2); (ii) monotonicity; (iii) convexity; (iv) feature interaction constraints; and (v) model specification constraints. Empirical results show that the unconstrained ParamBoost model consistently outperforms state-of-the-art GAMs across several real-world datasets. We further demonstrate that modellers can selectively impose required constraints at a modest trade-off in predictive performance, allowing the model to be fully tailored to application-specific interpretability and parametric-analysis requirements.

2604.18840 2026-04-22 stat.AP stat.CO stat.ME

Spatial Extremes at Scale: A Case Study of Surface Skin Temperature and Heat Risk in the United States

Ben Seiyon Lee, Reetam Majumder, Jordan Richards, Emma S. Simpson, Likun Zhang

Details
English abstract

Understanding and mapping extreme heat is critical for risk management and public health planning, particularly in regions with complex terrain and heterogeneous climate. We present a case study of extreme heat in the Four Corners region of the United States, using high-resolution surface skin temperature data from the North American Land Data Assimilation System to characterize spatially heterogeneous and seasonally varying extremes across complex terrain, and to assess their implications for heat-related public health risks. Spatial extremes exhibit complex dependencies across geographic regions, which require sophisticated statistical models to capture. While recent advances in spatial extreme value modeling provide flexible representations of joint tail dependencies, statistical inference remains computationally demanding, especially for datasets with a large number of locations. To address this, we propose a random scale mixture process that facilitates Bayesian inference of spatial extremes, and develop scalable inference strategies that leverage advances in spatial modeling and amortized learning. We evaluate the proposed inference methods through large-scale simulation studies, representing the first such extensive study in spatial extremes, and a high-resolution surface skin temperature application in the Four Corners region. Surface skin temperature is particularly useful as a predictor for air temperature, for studying heatwaves and related environmental phenomena, and to calculate heat indices reflecting downstream health risks at any location. Our findings provide insights into efficient, data-driven approaches for modeling spatial extremes, and serve as guidelines for practitioners in the fields of climate science, environmental risk assessment, and beyond.

2604.18823 2026-04-22 stat.AP

A Non-stationary, Amortized, Transfer Learning Approach for Modeling Italian Air Quality

Alessandro Fusta Moro, Antony Sikorski, Daniel McKenzie, Alessandro Fassò, Douglas Nychka

Details
English abstract

Air quality monitoring in Italy relies on sparse, irregular, ground-based stations that provide high-quality but incomplete measurements of pollution. Chemical transport models (CTMs) offer full spatial and temporal coverage but smooth over local variability. We develop a spatial transfer-learning framework that integrates these two data sources to produce daily, fine-grid predictions of nitrogen dioxide (NO$_2$) concentrations across Italy for 2023, with uncertainty quantification. The resulting maps provide a resource for decision making in downstream applications such as epidemiology and environmental policy. Our approach builds on the geostatistical LatticeKrig framework, which uses compactly supported basis functions and coefficients governed by a sparse precision matrix. We learn a nonstationary, anisotropic correlation structure from the gridded CTM outputs using an image-to-image neural architecture that estimates millions of spatially varying parameters in a matter of seconds. The basis-function representation enables this covariance structure to be transferred to the point-level station data and projected onto a finer prediction grid, a key extension for handling the change of support between data sources. A likelihood-based refinement step then adjusts the correlation range to recover fine-scale variability smoothed out by the gridded data. The proposed methodology results in a flexible, non-stationary, and anisotropic representation of the spatial process, better accommodating the complex geography of Italy. Performance is assessed through experiments on both gridded CTM outputs and point-level station measurements, demonstrating improvements over the stationary formulation.

2604.18774 2026-04-22 stat.CO stat.ME

A simulation study to resolve conflicting evidence on the error rates from MANOVA group tests

Joseph D Consiglio

Comments 19 pages, 9 tables, 0 figures

Details
English abstract

Popular software packages report four generalizations of the ANOVA F test when conducting a multivariate analysis of variance (MANOVA). The reported operating characteristics of these four tests vary widely depending on which research article the reader chooses. Some studies report extremely high type I error rates for a particular test even under ideal assumptions of multivariate normality and homoskedasticity; other studies report rates near the nominal level despite violations of the model assumptions. This simulation study seeks to clarify this apparent contradiction by providing a systematic evaluation of the type I error rates of the four statistics used to test for a group effect in MANOVA.
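
The abstract does not name the four tests. The sketch below assumes they are the four statistics commonly reported by statistical software for a one-way layout: Wilks' lambda, Pillai's trace, the Hotelling-Lawley trace and Roy's largest root. The function names are hypothetical. The code computes only the statistics, not their F approximations or p-values.

```python
import numpy as np


def sscp_matrices(groups):
    """Between-group (H) and within-group (E) SSCP matrices."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    grand_mean = np.vstack(groups).mean(axis=0)
    p = groups[0].shape[1]
    H = np.zeros((p, p))
    E = np.zeros((p, p))
    for g in groups:
        diff = g.mean(axis=0) - grand_mean
        H += len(g) * np.outer(diff, diff)
        centered = g - g.mean(axis=0)
        E += centered.T @ centered
    return H, E


def manova_statistics(H, E):
    """Four classical MANOVA statistics from the eigenvalues of E^{-1} H."""
    eigvals = np.linalg.eigvals(np.linalg.solve(E, H)).real
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    return {
        "wilks_lambda": float(np.prod(1.0 / (1.0 + eigvals))),
        "pillai_trace": float(np.sum(eigvals / (1.0 + eigvals))),
        "hotelling_lawley_trace": float(np.sum(eigvals)),
        # Some packages report max(eig) / (1 + max(eig)) instead.
        "roy_largest_root": float(np.max(eigvals)),
    }
```

A type I error study of the kind described would repeatedly draw groups with identical means, convert each statistic to a p-value, and record how often the null is rejected at the nominal level.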

2604.18742 2026-04-22 stat.AP stat.ME

JASPER: Joint Bayesian Analysis of Spatial Expression via Regression

Pritam Dey, Rajarshi Guhaniyogi, Yang Ni, Bani K. Mallick

Comments 40 pages; 6 figures

Details
English abstract

Spatially resolved transcriptomics is a fast-developing set of technologies that enables the measurement of localized gene expression across spatial locations in a sample. Detecting spatially varying genes is critical for analyzing such data, yet existing methods often fail to account for inter-gene correlations, leading to inflated false positive and false negative rates. Additionally, most prominent methods rely on predefined spatial covariance kernels, making them sensitive to the complexity of spatial expression patterns. Motivated by a human breast cancer dataset, we address these limitations in existing literature through JASPER (Joint Bayesian Analysis of SPatial Expression via Regression), a Bayesian framework that jointly models spatial expression patterns across multiple genes using a spatial basis function regression approach. We demonstrate the superior performance of JASPER compared to existing methods in several real-world spatial transcriptomic datasets and supporting simulation experiments. JASPER identifies genes with stronger spatial correlation and greater biological relevance, as validated by overlap comparison, enrichment analysis, and pathway analysis using independent biological databases. Our results highlight the ability of JASPER to improve the statistical and biological interpretability of spatial transcriptomics data, making it a powerful tool for uncovering spatial gene expression patterns in complex biological systems.

2604.18657 2026-04-22 stat.ME

Locally parametric nonparametric density estimation

Nils Lid Hjort, M. C. Jones

Comments 30 pages, no figures. This is the Statistical Research Report version, Department of Mathematics, University of Oslo, November 1995, published in Annals of Statistics, 1996, vol. 24, pages 1619-1647

Details
Journal ref
Annals of Statistics, 1996, vol. 24, pages 1619-1647
English abstract

This paper develops a nonparametric density estimator with parametric overtones. Suppose $f(x,θ)$ is some family of densities, indexed by a vector of parameters $θ$. We define a local kernel smoothed likelihood function which for each $x$ can be used to estimate the best local parametric approximant to the true density. This leads to a new density estimator of the form $f(x,\hatθ(x))$, thus inserting the best local parameter estimate for each new value of $x$. When the bandwidth used is large this amounts to ordinary full likelihood parametric density estimation, while for moderate and small bandwidths the method is essentially nonparametric, using only local properties of data and the model. Alternative ways more general than via the local likelihood are also described. The methods can be seen as ways of nonparametrically smoothing the parameter within a parametric class. Properties of this new semiparametric estimator are investigated. Our preferred version has approximately the same variance as the ordinary kernel method but potentially a smaller bias. The new method is seen to perform better than the traditional kernel method in a broad nonparametric vicinity of the parametric model employed, while at the same time being capable of not losing much in precision to full likelihood methods when the model is correct. Other versions of the method are equivalent to using particular higher order kernels in a semiparametric framework. The methodology we develop can be seen as the density estimation parallel to local likelihood and local weighted least squares theory in nonparametric regression.

2604.18646 2026-04-22 stat.ME

Stable Transport Meta-Analysis for Heterogeneous Cardiovascular Trials: A Nuisance-Anchor Framework with a Sign-Stability Diagnostic

Ibrahim Halil Tanboga

Details
English abstract

Random-effects meta-analysis summarizes heterogeneous trials by estimating an average effect over the observed evidence base, which may not represent the clinically relevant target population. In cardiovascular medicine, treatment effects vary systematically across era, endpoint definitions, background therapy, and case-mix, making the historical average often misaligned with current decision-making. We propose stable transport meta-analysis (AMT-MA), a nuisance-anchor estimator that models anchor-aligned variation but does not transport it to the target population. The method combines a weighted-average loss with a scale-normalized softmax regime loss, and incorporates a precision-weighted sign-stability diagnostic with a two-condition abstention rule to avoid reporting a single pooled estimate when stability is not supported. AMT-MA is not intended to minimize RMSE relative to random-effects models, but to redefine the estimand as a stable target-population effect. In a pre-specified ADEMP simulation across six scenarios, AMT-MA (rho = 0.2) showed reduced bias relative to unadjusted pooling and improved coverage in adversarial settings where classical Wald intervals fail (dominant trial: 0.85 vs 0.01; confounded anchor: 0.86 vs 0.34; anchor shift: 0.91 vs 0.60). WLS meta-regression remained competitive when correctly specified. Under sign-flip heterogeneity, the abstention rule triggered in ~84% of replications, compared with ~28-30% in stable regimes. Applications to post-myocardial infarction streptokinase trials and primary-prevention aspirin trials illustrate how AMT-MA quantifies transport uncertainty and provides a clinically interpretable alternative to averaging heterogeneous effects.

2604.18632 2026-04-22 cs.CV stat.AP

StomaD2: An All-in-One System for Intelligent Stomatal Phenotype Analysis via Diffusion-Based Restoration Detection Network

Quanling Zhao, Meng'en Qin, Yanfeng Sun, Yuan Miao, Xiaohui Yang

Details
English abstract

Stomata play a crucial role in regulating plant physiological processes and reflecting environmental responses. However, accurate and high-throughput stomatal phenotyping remains challenging, as conventional approaches rely on destructive sampling and manual annotation, restricting large-scale and field deployment. To overcome these limitations, a noninvasive restoration-detection integrated framework, termed StomaD2, is developed to achieve accurate and fast stomatal phenotyping under complex imaging conditions. The framework incorporates a diffusion-based restoration module to recover degraded images and a specialized rotated object detection network tailored to the small, dense, and cluttered characteristics of stomata. The proposed network enhances feature representation through three key innovations: a column-wise structure for global feature interaction, a context-aware resampling and reweighting mechanism to improve multi-scale consistency, and a feature reassembly module to boost discrimination against complex backgrounds. In extensive comparisons, StomaD2 demonstrated state-of-the-art performance. On public Maize and Wheat datasets, it achieved accuracies of 0.994 and 0.992, respectively, significantly outperforming existing benchmarks. When benchmarked against ten other advanced models, including Oriented Former and YOLOv12, StomaD2 achieved a top-tier F1-score/mAP of 0.989. The framework is integrated into a user-friendly, field-operable system that supports the fast extraction of eight stomatal phenotypes, such as density and conductance. Validated on more than 130 plant species, StomaD2's results highlight its strong generalizability and potential for large-scale phenotyping, plant physiology analysis, and precision agriculture applications.

2604.18609 2026-04-22 stat.AP

The Broken Shield of European Palliative Care: Evidence from Synthetic Counterfactuals on Financial Toxicity and Informal Care

Pietro Grassi, Edoardo Paperi, Chiara Seghieri, Daniele Vignoli

Details
English abstract

The transition of end-of-life care to palliative care (PC) sparks intense debate: does it provide economic relief or shift unremunerated labor costs onto families? Evaluating this is hindered by causal inference challenges and skewed healthcare costs. To overcome these limitations, we introduce a Synthetic Data Generation framework. Using pan-European SHARE data (2016-2021), we deploy Tabular Denoising Diffusion Probabilistic Models within a Two-Learner architecture to synthesize high-fidelity digital twins. By including the 2020-2021 lockdowns, we leverage the COVID-19 pandemic to isolate structural inequalities from transient market shocks. Our findings challenge the strict cost-shifting hypothesis: on average, PC acts as a "double shield", truncating out-of-pocket expenditures (financial toxicity) and informal caregiving shadow values (time poverty). However, quantile treatment models expose a "broken shield" for vulnerable households and severe tail events. Non-cancer trajectories drive massive structural penalties that escalate at the distribution's tail, mechanically compounded by physical dependency. Socio-demographics heavily modulate this exposure: lacking a spousal net inflates the burden, rigid gender dynamics trigger labor market ejection, and financial distress acts as a profound multiplier. Institutionally, high-wage Nordic regimes paradoxically impose opportunity costs, while severe penalties in underfunded Eastern systems, mediated by financial distress, drive families toward resource exhaustion. We conclude that while PC is an ethical imperative, its expansion must be decoupled from the oncological paradigm and matched with state-funded long-term care to protect against clinical decline and financial shocks.

2604.18605 2026-04-22 q-fin.GN math.PR stat.AP

Exploring Drivers of Extreme Housing Prices in Australia

Grace Burtenshaw, Ashley Burtenshaw, Meagan Carney

Details
English abstract

In recent years Australia has observed a growing, unexplained resilience of increasing house price trends. Here, we seek to understand what is driving Australia's indestructible asset using insights from market experts. We construct a differential equation model of house price to develop intuition for its historical behaviour and responsiveness to changes in mortgage rates. Using this model, we identify a point of 'decoupling' between house price and mortgage rate in the system with supply limitations found to be the main driver for this change. From there, modern extreme value techniques are implemented on real-world data to investigate how the effectiveness of mortgage rate in moderating extreme house price has changed before and after this historical decoupling. We find that without an increase in the housing supply chain, through either deregulation or reduced competition with government building, an 11% increase in mortgage rate will be needed to slow extreme housing costs.

2604.18599 2026-04-22 stat.AP q-bio.NC

Simulation Based Inference of a Simple Neural Network Structure

Pierre Charitat, Ségolen Geffray, Christophe Pouzat

Details
English abstract

Neurophysiologists are nowadays able to record from a large number of extracellular electrodes and to extract, from the raw data, the sequences of action potentials or spikes generated by many neurons. Unfortunately these "many neurons" still represent only a tiny fraction of the neuronal population that constitutes the network. Using association statistics such as the estimation of the cross-correlation functions, they are trying to infer the structure of the network formed by the recorded neurons. But this inference is compromised by the tremendous under-sampling of the neuronal population. We propose to focus instead on simple spike train statistics, like the empirical spike frequency, or the interspike interval distribution. Their sampling distributions can be estimated by simulations, and, given a few observed spike train statistics, they provide enough information to infer the structure of the underlying network. We show that, on a "toy model", our method gives significantly better results than the sub-network reconstruction method with regard to the inference of the connection probability of the original network.

2604.18598 2026-04-22 stat.AP cs.CE cs.NA math.NA

Bathymetry Reconstruction by Bayesian Inference

Lars Stietz, Sebastian Götschel, Peter Schleper, Daniel Ruprecht

Details
English abstract

Bathymetry reconstruction is an important problem in various fields, including oceanography and environmental monitoring. This paper presents a Bayesian inference approach to reconstructing bathymetries from point measurements of the water height. We test the method for parameterized and discretized bathymetries with synthetic data to evaluate its performance and limitations. Our results indicate that the Bayesian framework provides a robust approach to bathymetry reconstruction. Finally, we use the framework to reconstruct a real-world bathymetry in a wave flume from experimental measurements and compare its performance to an adjoint optimization method. The Bayesian approach improves the normalized root mean squared error (NRMSE) of the reconstruction and provides better qualitative features, while also quantifying uncertainty.

2604.10395 2026-04-22 math.PR math.ST stat.TH

A remark on the comparison of the sum and the maximum of positive random variables

Kazuki Okamura

Comments 6 pages; results extended

Details
English abstract

We disprove a conjecture stated in a recent paper by Arnold and Villasenor concerning the sum and the maximum of independent and identically distributed half-normal random variables. Our method is applicable to generalized gamma distributions.

2604.08404 2026-04-22 cs.LG stat.ML

Adversarial Label Invariant Graph Data Augmentations for Out-of-Distribution Generalization

Simon Zhang, Ryan P. DeMilt, Kun Jin, Cathy H. Xia

Comments 22 pages, 3 figures, accepted at ICML SCIS 2023

Details
English abstract

Out-of-distribution (OoD) generalization occurs when representation learning encounters a distribution shift. This occurs frequently in practice when training and testing data come from different environments. Covariate shift is a type of distribution shift that occurs only in the input data, while the concept distribution stays invariant. We propose RIA - Regularization for Invariance with Adversarial training, a new method for OoD generalization under covariate shift. Motivated by an analogy to $Q$-learning, it performs an adversarial exploration for counterfactual data environments. These new environments are induced by adversarial label invariant data augmentations that prevent a collapse to an in-distribution trained learner. It works with many existing OoD generalization methods for covariate shift that can be formulated as constrained optimization problems. We develop an alternating gradient descent-ascent algorithm to solve the problem in the context of causally generated graph data, and perform extensive experiments on OoD graph classification for various kinds of synthetic and natural distribution shifts. We demonstrate that our method can achieve high accuracy compared with OoD baselines.

2603.19569 2026-04-22 stat.ME

Heterogeneous readmission prediction with hierarchical effect decomposition and regularization

Ziren Jiang, Lingfeng Huo, Jue Hou, Mary Vaughan-Sarrazin, Maureen A. Smith, Jared D. Huling

Comments 31 pages, 5 figures, 2 tables

Details
English abstract

Accurately predicting hospital readmission risks using electronic health records (EHRs) is critical for effective patient management and healthcare resource allocation. Patient populations in health systems are highly heterogeneous across different primary diagnoses, necessitating tailored yet interpretable prediction models. We propose a hierarchical modeling framework incorporating hierarchical nested re-parameterization and structured regularization methods, which we call hierNest. Specifically, our approach leverages the inherent hierarchical structure present in primary diagnoses and groupings of these diagnoses into major diagnostic categories. Our methodology facilitates information borrowing across related patient subgroups and preserves interpretability at different hierarchical levels. Simulation studies demonstrate superior predictive accuracy of the proposed method, particularly with small subgroup sample sizes and varying degrees of hierarchical effects. We apply our methods to a large EHR dataset comprising Medicare patients.

2602.19790 2026-04-22 cs.LG stat.ML

Drift Localization using Conformal Predictions

Fabian Hinder, Valerie Vaquet, Johannes Brinkrolf, Barbara Hammer

Comments Paper is an extended version; the original was published at the 34th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN) 2026

Details
English abstract

Concept drift -- the change of the distribution over time -- poses significant challenges for learning systems and is of central interest for monitoring. Understanding drift is thus paramount, and drift localization -- determining which samples are affected by the drift -- is essential. While several approaches exist, most rely on local testing schemes, which tend to fail in high-dimensional, low-signal settings. In this work, we consider a fundamentally different approach based on conformal predictions. We discuss and show the shortcomings of common approaches and demonstrate the performance of our approach on state-of-the-art image datasets.

2512.19553 2026-04-22 stat.ME

A Statistical Framework for Understanding Causal Effects that Vary by Treatment Initiation Time in EHR-based Studies

Luke Benz, Rajarshi Mukherjee, Rui Wang, David Arterburn, Heidi Fischer, Catherine Lee, Susan M. Shortreed, Alexander W. Levis, Sebastien Haneuse

Details
English abstract

Standard practice in electronic health record (EHR)-based studies evaluating the comparative effectiveness of bariatric surgery relative to no surgery is to estimate and report a constant treatment effect across calendar time. However, real-world treatment strategies can evolve, particularly when comparators include standard of care or surgical procedures where techniques may improve, making it clinically important to ascertain whether efficacy of bariatric surgery has changed over time. Efforts to determine whether treatment efficacy itself is evolving are complicated by changing patient populations, with potential covariate shift in key effect modifiers. Through a comprehensive analysis of EHR data from Kaiser Permanente following two bariatric surgical procedures compared to standard of care, we develop a statistical framework to estimate calendar time-specific average treatment effects and describe both how and why effects vary across treatment initiation time in EHR-based studies. Our approach projects doubly robust, time-specific treatment effect estimates onto candidate marginal structural models and uses a model selection procedure to best describe how effects vary by treatment initiation time. We further introduce a novel summary metric, based on standardization analysis, to quantify the role of covariate shift in explaining observed effect changes and disentangle changes in treatment effects from changes in the patient population receiving treatment.

2512.12448 2026-04-22 cs.LG cs.NE physics.data-an stat.ML

Optimized Architectures for Kolmogorov-Arnold Networks

James Bagrow, Josh Bongard

Comments 23 pages, 4 figures, 9 tables

Details
English abstract

Efforts to improve Kolmogorov-Arnold networks (KANs) with architectural enhancements have been stymied by the complexity those enhancements bring, undermining the interpretability that makes KANs attractive in the first place. Here we study overprovisioned architectures combined with sparsification, deep supervision, and depth selection, to learn compact, interpretable KANs without sacrificing accuracy. Crucially, we focus on differentiable mechanisms under a principled minimum description length objective, jointly optimizing activations, structure, and depth end-to-end. Experiments across function approximation benchmarks, dynamical systems forecasting, and real-world prediction tasks demonstrate that sparsification alone is insufficient, but the combination with depth selection achieves competitive or superior accuracy while discovering substantially smaller models. The result is a principled path toward models that are both more expressive and more interpretable, addressing a key tension in scientific machine learning.

2510.13763 2026-04-22 stat.ML cs.LG

PriorGuide: Test-Time Prior Adaptation for Simulation-Based Inference

Yang Yang, Severi Rissanen, Paul E. Chang, Nasrulloh Loka, Daolang Huang, Arno Solin, Markus Heinonen, Luigi Acerbi

Comments Accepted at ICLR 2026. Camera-ready version. 38 pages, 8 figures

Details
English abstract

Amortized simulator-based inference offers a powerful framework for tackling Bayesian inference in computational fields such as engineering or neuroscience, increasingly leveraging modern generative methods like diffusion models to map observed data to model parameters or future predictions. These approaches yield posterior or posterior-predictive samples for new datasets without requiring further simulator calls after training on simulated parameter-data pairs. However, their applicability is often limited by the prior distribution(s) used to generate model parameters during this training phase. To overcome this constraint, we introduce PriorGuide, a technique specifically designed for diffusion-based amortized inference methods. PriorGuide leverages a novel guidance approximation that enables flexible adaptation of the trained diffusion model to new priors at test time, crucially without costly retraining. This allows users to readily incorporate updated information or expert knowledge post-training, enhancing the versatility of pre-trained inference models.

2510.09477 2026-04-22 stat.ML cs.LG

Efficient Autoregressive Inference for Transformer Probabilistic Models

Conor Hassan, Nasrulloh Loka, Cen-You Li, Daolang Huang, Paul E. Chang, Yang Yang, Francesco Silvestrin, Samuel Kaski, Luigi Acerbi

Comments Accepted at ICLR 2026. Camera-ready version. 39 pages, 20 figures

Details
English abstract

Set-based transformer models for amortized probabilistic inference and meta-learning, such as neural processes, prior-fitted networks, and tabular foundation models, excel at single-pass marginal prediction. However, many applications require joint distributions over multiple predictions. Purely autoregressive architectures generate these efficiently but sacrifice flexible set-conditioning. Obtaining joint distributions from set-based models requires re-encoding the entire context at each autoregressive step, which scales poorly. We introduce a causal autoregressive buffer that combines the strengths of both paradigms. The model encodes the context once and caches it; a lightweight causal buffer captures dependencies among generated targets, with each new prediction attending to both the cached context and all previously predicted targets added to the buffer. This enables efficient batched autoregressive sampling and joint predictive density evaluation. Training integrates set-based and autoregressive modes through masked attention at minimal overhead. Across synthetic functions, EEG time series, a Bayesian model comparison task, and tabular regression, our method closely matches the performance of full context re-encoding while delivering up to $20\times$ faster joint sampling and density evaluation, and up to $7\times$ lower memory usage.

2509.03726 2026-04-22 stat.ML cs.LG

Energy-Weighted Flow Matching: Unlocking Continuous Normalizing Flows for Efficient and Scalable Boltzmann Sampling

Niclas Dern, Lennart Redl, Sebastian Pfister, Marcel Kollovieh, David Lüdke, Stephan Günnemann

Comments 21 pages, 4 figures

Details
English abstract

Sampling from unnormalized target distributions, e.g., Boltzmann distributions $μ_{\text{target}}(x) \propto \exp(-E(x)/T)$, is fundamental to many scientific applications yet computationally challenging due to complex, high-dimensional energy landscapes. Existing approaches applying modern generative models to Boltzmann distributions either require large datasets of samples drawn from the target distribution or, when using only energy evaluations for training, cannot efficiently leverage the expressivity of advanced architectures like continuous normalizing flows that have shown promise for molecular sampling. To address these shortcomings, we introduce Energy-Weighted Flow Matching (EWFM), a novel training objective enabling continuous normalizing flows to model Boltzmann distributions using only energy function evaluations. Our objective reformulates conditional flow matching via importance sampling, allowing training with samples from arbitrary proposal distributions. Based on this objective, we develop two algorithms: iterative EWFM (iEWFM), which progressively refines proposals through iterative training, and annealed EWFM (aEWFM), which additionally incorporates temperature annealing for challenging energy landscapes. On benchmark systems, including challenging 55-particle Lennard-Jones clusters, our algorithms demonstrate sample quality competitive with established energy-only methods while requiring up to three orders of magnitude fewer energy evaluations.
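
The importance-sampling idea mentioned in this abstract can be illustrated with self-normalized weights: a sample x_i drawn from a proposal q receives a weight proportional to exp(-E(x_i)/T) / q(x_i), so the unknown normalizing constant of the target cancels. The sketch below is not the authors' EWFM objective; the one-dimensional double-well energy, the Gaussian proposal, and all function names are illustrative assumptions.

```python
import math
import random

def energy(x):
    # Hypothetical 1D double-well energy, chosen only for illustration.
    return (x * x - 1.0) ** 2

def gaussian_logpdf(x, mean, std):
    # Log-density of a normal distribution (the assumed proposal q).
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))

def self_normalized_weights(samples, log_q, temperature=1.0):
    # log w_i = -E(x_i)/T - log q(x_i); the target's normalizer cancels
    # once the weights are rescaled to sum to one.
    log_w = [-energy(x) / temperature - log_q(x) for x in samples]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]

rng = random.Random(0)
proposal_std = 2.0
samples = [rng.gauss(0.0, proposal_std) for _ in range(5000)]
weights = self_normalized_weights(
    samples, lambda x: gaussian_logpdf(x, 0.0, proposal_std)
)

# Weighted estimate of E[x^2] under the Boltzmann target, using only
# energy evaluations and proposal samples.
second_moment = sum(w * x * x for w, x in zip(weights, samples))
```

In a flow-matching setting, weights of this kind would presumably multiply a per-sample regression loss; that step is omitted here because the abstract does not spell it out.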

2508.21184 2026-04-22 cs.CL cs.AI stat.ML

BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design

Deepro Choudhury, Sinead Williamson, Adam Goliński, Ning Miao, Freddie Bickford Smith, Michael Kirchhof, Yizhe Zhang, Tom Rainforth

Comments Published at the International Conference on Learning Representations 2026

Details
English abstract

We propose a general-purpose approach for improving the ability of large language models (LLMs) to intelligently and adaptively gather information from a user or other external source using the framework of sequential Bayesian experimental design (BED). This enables LLMs to act as effective multi-turn conversational agents and interactively interface with external environments. Our approach, which we call BED-LLM (Bayesian experimental design with large language models), is based on iteratively choosing questions or queries that maximize the expected information gain (EIG) with respect to a variable of interest given the responses gathered previously. We show how this EIG can be formulated (and then estimated) in a principled way using a probabilistic model derived from the LLM's predictive distributions and provide detailed insights into key decisions in its construction and updating procedure. We find that BED-LLM achieves substantial gains in performance across a wide range of tests based on the 20 Questions game and using the LLM to actively infer user preferences, compared to purely prompting-based design generation and other adaptive design strategies.
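
To make the expected information gain (EIG) criterion concrete, here is a minimal sketch for a yes/no question over a small discrete hypothesis set: EIG is the entropy of the marginal answer distribution minus the expected entropy of the answer given each hypothesis. In BED-LLM the probabilities are derived from the LLM's predictive distributions; in this sketch they are hand-specified, and the hypotheses, questions, and function names are all hypothetical.

```python
import math

def entropy(probs):
    # Shannon entropy in bits; zero-probability outcomes contribute nothing.
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def expected_information_gain(prior, p_yes_given_h):
    # prior[h]: current belief in hypothesis h.
    # p_yes_given_h[h]: probability that the answer is "yes" if h is true.
    p_yes = sum(prior[h] * p_yes_given_h[h] for h in prior)
    marginal = entropy([p_yes, 1.0 - p_yes])
    conditional = sum(
        prior[h] * entropy([p_yes_given_h[h], 1.0 - p_yes_given_h[h]])
        for h in prior
    )
    return marginal - conditional

# Hypothetical 20-Questions-style setup with a uniform prior.
prior = {"cat": 0.25, "dog": 0.25, "eagle": 0.25, "shark": 0.25}
questions = {
    "Is it a mammal?": {"cat": 1.0, "dog": 1.0, "eagle": 0.0, "shark": 0.0},
    "Can it fly?": {"cat": 0.0, "dog": 0.0, "eagle": 1.0, "shark": 0.0},
}

scores = {q: expected_information_gain(prior, lik) for q, lik in questions.items()}
best_question = max(scores, key=scores.get)
```

With deterministic answers the conditional term vanishes, so the question that splits the prior mass most evenly scores highest.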

2507.03828 2026-04-22 cs.LG stat.ML

IMPACT: Importance-Aware Activation Space Reconstruction

Md Mokarram Chowdhury, Daniel Agyei Asante, Ernie Chang, Yang Li

Comments To appear in the Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)

Details
English abstract

Large language models (LLMs) achieve strong performance across diverse domains but remain difficult to deploy in resource-constrained environments due to their size. Low-rank compression is a common remedy, typically minimizing weight reconstruction error under the assumption that weights are low-rank. However, this assumption often does not hold in LLMs. In contrast, LLM activations exhibit a more pronounced low-rank structure, motivating approaches that minimize activation reconstruction error. This shift alone, however, is not sufficient: different activation dimensions contribute unequally to model performance, and treating them uniformly can lead to accuracy loss. We introduce IMPACT, an importance-aware activation reconstruction framework that links compression to its effect on model performance. IMPACT formulates compression as an optimization problem that integrates activation structure with gradient-based importance, deriving a closed-form solution where reconstruction bases arise from an importance-weighted activation covariance matrix. This yields low-rank compression explicitly optimized for accuracy preservation. Experiments across multiple models and tasks demonstrate that IMPACT achieves up to 55.4% greater model size reduction while maintaining accuracy comparable to or better than state-of-the-art baselines.

2507.01918 2026-04-22 q-fin.PM cs.AI math.OC physics.data-an stat.ML

End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning

Christian Bongiorno, Efstratios Manolakis, Rosario Nunzio Mantegna

Details
Journal ref
The Journal of Finance and Data Science, 12, (2026) 100179
English abstract

We develop a rotation-invariant neural network that provides the global minimum-variance portfolio by jointly learning how to lag-transform historical returns and marginal volatilities and how to regularise the eigenvalues of large equity covariance matrices. This explicit mathematical mapping offers clear interpretability of each module's role, so the model cannot be regarded as a pure black box. The architecture mirrors the analytical form of the global minimum-variance solution yet remains agnostic to dimension, so a single model can be calibrated on panels of a few hundred stocks and applied, without retraining, to one thousand US equities, a cross-sectional jump that indicates robust generalization capability. The loss function is the future short-term realized minimum variance and is optimized end-to-end on real returns. In out-of-sample tests from January 2000 to December 2024, the estimator delivers systematically lower realized volatility, smaller maximum drawdowns, and higher Sharpe ratios than the best competitors, including state-of-the-art non-linear shrinkage, and these advantages persist across both short and long evaluation horizons even though the model's training focus is short-term. Furthermore, although the model is trained end-to-end to produce an unconstrained minimum-variance portfolio, we show that its learned covariance representation can be used in general optimizers under long-only constraints with virtually no loss in its performance advantage over competing estimators. These advantages persist when the strategy is executed under a highly realistic implementation framework that models market orders at the auctions, empirical slippage, exchange fees, and financing charges for leverage, and they remain stable during episodes of acute market stress.

2507.00451 2026-04-22 cs.LG cs.AI cs.DS cs.IT math.IT stat.ML

Best Agent Identification for General Game Playing

Matthew Stephenson, Alex Newcombe, Eric Piette, Dennis Soemers

Details
English abstract

We present an efficient and generalised procedure to accurately identify the best (or near best) performing algorithm for each sub-task in a multi-problem domain. Our approach treats this as a set of best arm identification problems for multi-armed bandits, where each bandit corresponds to a specific task and each arm corresponds to a specific algorithm or agent. We propose an optimistic selection process based on a chosen confidence interval, that ranks each arm across all bandits in terms of their potential to influence our overall simple regret. We evaluate the performance of our approach on two of the most popular general game playing domains, the General Video Game AI (GVGAI) framework and the Ludii general game playing system, with the goal of selecting a high-performing agent for each game using a limited number of available trials. Compared to previous best arm identification algorithms for multi-armed bandits, our results demonstrate a substantial performance improvement in terms of average simple regret and average probability of error. This novel approach can be used to significantly improve the quality and accuracy of agent evaluation procedures for general game frameworks, as well as other multi-task domains with high algorithm runtimes.

2506.18186 2026-04-22 cs.LG stat.ML

Online Learning of Whittle Indices for Restless Bandits with Non-Stationary Transition Kernels

Md Kamran Chowdhury Shisher, Vishrant Tripathi, Mung Chiang, Christopher G. Brinton

Details
English abstract

The restless multi-armed bandit (RMAB) framework is a popular approach to solving resource allocation problems in networked systems. In this paper, we study optimal resource allocation in RMABs facing unknown and non-stationary dynamics. Solving RMABs optimally is known to be PSPACE-hard even with full knowledge of model parameters. While Whittle index policies offer asymptotic optimality with low computational cost, they require access to stationary transition kernels, an unrealistic assumption in many modern networking applications. To address this challenge, we propose a Sliding-Window Online Whittle (SW-Whittle) policy that remains computationally efficient while adapting to time-varying kernels. Through theoretical analysis, we show that our algorithm achieves sub-linear dynamic regret with respect to the number of episodes. We further address the important case where the variation budget is unknown in advance by combining a Bandit-over-Bandit framework with our sliding-window design. In our scheme, window lengths are tuned online as a function of the estimated variation, while Whittle indices are computed via an upper-confidence-bound of the estimated transition kernels and a bilinear optimization routine. Numerical experiments demonstrate that our algorithm consistently outperforms baselines, achieving the lowest cumulative regret across a range of non-stationary environments.
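
The sliding-window idea can be sketched as an empirical transition-kernel estimate computed only from the most recent observations, so that stale dynamics are forgotten. This is a simplified illustration, not the SW-Whittle policy itself: the upper-confidence-bound step, the bilinear optimization, and the online window tuning described in the abstract are omitted, and the class name, window length, and toy data are assumptions.

```python
from collections import deque

class SlidingWindowKernelEstimator:
    """Empirical P(next_state | state, action) over the last `window` transitions."""

    def __init__(self, window):
        # A bounded deque silently drops the oldest transition when full.
        self.buffer = deque(maxlen=window)

    def update(self, state, action, next_state):
        self.buffer.append((state, action, next_state))

    def estimate(self, state, action, next_state):
        visits = sum(1 for s, a, _ in self.buffer if (s, a) == (state, action))
        if visits == 0:
            return None  # no data for this state-action pair inside the window
        hits = sum(
            1 for s, a, s2 in self.buffer
            if (s, a, s2) == (state, action, next_state)
        )
        return hits / visits

# Toy non-stationary arm: action 1 in state 0 first leads to state 1,
# then the dynamics change and it leads back to state 0.
estimator = SlidingWindowKernelEstimator(window=20)
for _ in range(50):
    estimator.update(0, 1, 1)
for _ in range(50):
    estimator.update(0, 1, 0)

recent_estimate = estimator.estimate(0, 1, 1)  # reflects only the new regime
```

An estimator using the full history would report 0.5 for the same query, which matches neither regime; the window trades some variance for the ability to track the change.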

2506.02524 2026-04-22 stat.ME stat.AP

Variable Selection in Functional Linear Cox Model

Yuanzhen Yue, Stella Self, Yichao Wu, Jiajia Zhang, Rahul Ghosal

Details
English abstract

Modern biomedical studies frequently collect complex, high-dimensional physiological signals using wearables and sensors along with time-to-event outcomes, making efficient variable selection methods crucial for interpretation and improving the accuracy of survival models. We propose a novel variable selection method for a functional linear Cox model with multiple functional and scalar covariates measured at baseline. We utilize a spline-based semiparametric estimation approach for the functional coefficients and a group minimax concave type penalty (MCP), which effectively integrates smoothness and sparsity into the estimation of functional coefficients. An efficient group descent algorithm is used for optimization, and an automated procedure is provided to select optimal values of the smoothing and sparsity parameters. Through simulation studies, we demonstrate the method's ability to perform accurate variable selection and estimation. The method is applied to the 2003-06 cohort of the National Health and Nutrition Examination Survey (NHANES) data, identifying the key temporally varying distributional patterns of physical activity and demographic predictors related to all-cause mortality. Our analysis sheds light on the intricate association between daily distributional patterns of physical activity and all-cause mortality among older US adults.

2505.22811 2026-04-22 stat.ML cs.LG

Highly Efficient and Effective LLMs with Multi-Boolean Architectures

Ba-Hien Tran, Van Minh Nguyen

Comments ICLR 2026 (Main Conference)

Details
English abstract

Weight binarization has emerged as a promising strategy to reduce the complexity of large language models (LLMs). Existing approaches fall into post-training binarization, which is simple but causes severe performance loss, and training-aware methods, which depend on full-precision latent weights, adding complexity and limiting efficiency. We propose a novel framework that represents LLMs with multi-kernel Boolean parameters and, for the first time, enables direct finetuning of LLMs in the Boolean domain, eliminating the need for latent weights. This enhances representational capacity and dramatically reduces complexity during both finetuning and inference. Extensive experiments across diverse LLMs show our method outperforms recent ultra low-bit quantization and binarization techniques.

2505.09803 2026-04-22 stat.ML cs.LG

LatticeVision: Image to Image Networks for Modeling Non-Stationary Spatial Data

Antony Sikorski, Michael Ivanitskiy, Nathan Lenssen, Douglas Nychka, Daniel McKenzie

Comments This work has been accepted at the 29th International Conference on Artificial Intelligence and Statistics (AISTATS)

Details
English abstract

In many applications, we wish to fit a parametric statistical model to a small ensemble of spatially distributed random variables ('fields'). However, parameter inference using maximum likelihood estimation (MLE) is computationally prohibitive, especially for large, non-stationary fields. Thus, many recent works train neural networks to estimate parameters given spatial fields as input, sidestepping MLE completely. In this work we focus on a popular class of parametric, spatially autoregressive (SAR) models. We make a simple yet impactful observation: because the SAR parameters can be arranged on a regular grid, both inputs (spatial fields) and outputs (model parameters) can be viewed as images. Using this insight, we demonstrate that image-to-image (I2I) networks enable faster and more accurate parameter estimation for a class of non-stationary SAR models with unprecedented complexity.

2502.16156 2026-04-22 stat.ML cs.LG

A Review of Causal Decision Making

Lin Ge, Hengrui Cai, Runzhe Wan, Yang Xu, Rui Song

Details
English abstract

To make effective decisions, it is important to have a thorough understanding of the causal relationships among actions, environments, and outcomes. This review aims to surface three crucial aspects of decision-making through a causal lens: 1) the discovery of causal relationships through causal structure learning, 2) understanding the impacts of these relationships through causal effect learning, and 3) applying the knowledge gained from the first two aspects to support decision making via causal policy learning. Moreover, we identify challenges that hinder the broader utilization of causal decision-making and discuss recent advances in overcoming these challenges. Finally, we provide future research directions to address these challenges and to further enhance the implementation of causal decision-making in practice, with real-world applications illustrated based on the proposed causal decision-making. We aim to offer a comprehensive methodology and practical implementation framework by consolidating various methods in this area into a Python-based collection. URL: https://causaldm.github.io/Causal-Decision-Making.

2502.14479 2026-04-22 q-fin.RM q-fin.ST stat.AP

Modelling the term-structure of default risk under IFRS 9 within a multistate regression framework

Arno Botha, Tanja Verster, Roland Breedt

Comments 37 pages, 10013 words, 10 figures

Details
English abstract

The lifetime behaviour of loans is notoriously difficult to model, which can compromise a bank's financial reserves against future losses, if modelled poorly. Therefore, we present a data-driven comparative study amongst three techniques in modelling a series of default risk estimates over the lifetime of each loan, i.e., its term-structure. The behaviour of loans can be described using a nonstationary and time-dependent semi-Markov model, though we model its elements using a multistate regression-based approach. As such, the transition probabilities are explicitly modelled as a function of a rich set of input variables, including macroeconomic and loan-level inputs. Our modelling techniques are deliberately chosen in ascending order of complexity: 1) a Markov chain; 2) beta regression; and 3) multinomial logistic regression. Using residential mortgage data, our results show that each successive model outperforms the previous, likely as a result of greater sophistication. This finding required devising a novel suite of simple model diagnostics, which can itself be reused in assessing sampling representativeness and the performance of other modelling techniques. These contributions surely advance the current practice within banking when conducting multistate modelling. Consequently, we believe that the estimation of loss reserves will be more timeous and accurate under IFRS 9.
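
As a rough illustration of the simplest of the three techniques, a Markov chain, the sketch below propagates a loan's state distribution through a fixed transition matrix and reads off the cumulative and per-period default probabilities, i.e., a term-structure of default risk. It assumes a time-homogeneous chain with three hypothetical states (performing, delinquent, default) and made-up transition probabilities; the paper's state space, covariates, and semi-Markov features are not reproduced.

```python
def step(dist, P):
    # One period of the chain: row vector times transition matrix.
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def default_term_structure(P, start, horizon, default_state):
    # Returns cumulative and marginal (per-period) default probabilities.
    dist = list(start)
    cumulative = []
    for _ in range(horizon):
        dist = step(dist, P)
        cumulative.append(dist[default_state])
    marginal = [cumulative[0]] + [
        later - earlier for earlier, later in zip(cumulative, cumulative[1:])
    ]
    return cumulative, marginal

# Illustrative monthly transition matrix; the numbers are invented.
P = [
    [0.95, 0.04, 0.01],  # performing
    [0.30, 0.50, 0.20],  # delinquent
    [0.00, 0.00, 1.00],  # default (absorbing)
]

cumulative, marginal = default_term_structure(
    P, start=[1.0, 0.0, 0.0], horizon=12, default_state=2
)
```

Because default is absorbing, the cumulative curve never decreases, and the marginal probabilities sum to the final cumulative value. The regression-based techniques in the abstract would instead let each transition probability depend on macroeconomic and loan-level inputs.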

2502.12141 2026-04-22 econ.GN q-fin.EC stat.ME

Potato Potahto in the FAO-GAEZ Productivity Measures? Nonclassical Measurement Error with Multiple Proxies

Rafael Araujo, Vitor Possebom

Details
English abstract

The FAO-GAEZ productivity data are widely used in Economics. However, the empirical literature rarely discusses measurement error. We use two proxies to derive analytical bounds around the effect of agricultural productivity in a setting with nonclassical measurement error. These bounds rely on assumptions weaker than those imposed in empirical studies and exhaust the information contained in the first two data moments. We reevaluate three influential studies, finding wide intervals around the effects of agricultural productivity. These results call for caution, highlighting the limits of our knowledge about these effects. Our methodology has broad applications in empirical research involving mismeasured variables.

2502.10605 2026-04-22 stat.ML cs.CY cs.LG econ.EM stat.ME

Batch-Adaptive Causal Annotations

Ezinne Nwankwo, Lauri Goldkind, Angela Zhou

Details
English abstract

Estimating the causal effects of interventions is crucial to policy and decision-making, yet outcome data are often missing or subject to non-standard measurement error. While ground-truth outcomes can sometimes be obtained through costly data annotation or follow-up, budget constraints typically allow only a fraction of the dataset to be labeled. We address this challenge by optimizing which data points should be sampled for outcome information in order to improve efficiency in average treatment effect estimation with missing outcomes. We derive a closed-form solution for the optimal batch sampling probability by minimizing the asymptotic variance of a doubly robust estimator for causal inference with missing outcomes. Motivated by our street outreach partners, we extend the framework to costly annotations of unstructured data, such as text or images in healthcare and social services. Across simulated and real-world datasets, including one of outreach interventions in homelessness services, our approach achieves substantially lower mean-squared error and recovers the AIPW estimate with fewer labels than existing baselines. In practice, we show that our method can match confidence intervals obtained with 361 random samples using only 90 optimized samples - saving 75% of the labeling budget.

2502.08461 2026-04-22 math.ST stat.AP stat.ME stat.TH

On the Dirichlet-kernel Gasser-Müller estimator and its competitors for fixed design regression on the simplex

Hanen Daayeb, Christian Genest, Salah Khardani, Nicolas Klutchnikoff, Frédéric Ouimet

Comments 18 pages, 2 figures, 1 table

Details
Abstract

A Dirichlet-kernel Gasser-Müller (D-GM) estimator is introduced for fixed design regression on the simplex, extending the univariate analog due to Chen [Statist. Sinica, vol. 10(1) (2000), pp. 73-91]. Its pointwise bias and variance, asymptotic normality, and mean integrated squared error are investigated. Some simulation experiments are conducted to compare its small-sample performance with that of two recently proposed alternatives: the Dirichlet-kernel Nadaraya-Watson (D-NW) and local linear (D-LL) estimators. The simulation results reveal that the D-LL estimator is best among the D-LL, D-NW, and D-GM estimators and that the proposed D-GM estimator is worst. A real data analysis is also reported for the GEMAS dataset to analyze the relationship between soil composition and pH levels across various agricultural and grazing lands in Europe.

2501.19311 2026-04-22 stat.ME

The Case for Time in Causal DAGs

Alexander G. Reisach, Alberto Suárez, Sebastian Weichwald, Antoine Chambaz

Details
Abstract

We make the case for incorporating a notion of time into causal directed acyclic graphs (DAGs). We demonstrate that nontemporal causal DAGs are ambiguous and obstruct justification of the acyclicity assumption. Assuming that causes precede effects, causal relationships are relative to the time order, and causal DAGs require temporal qualification. We propose a formalization via composite causal variables that refer to quantities at one or multiple time points. We emphasize that the acyclicity assumption requires different justifications depending on whether the time order allows cycles. We conclude by discussing implications for the interpretation and applicability of DAGs as causal models.

2501.14974 2026-04-22 math.ST cs.CR math.PR stat.ME stat.ML stat.TH

Private Minimum Hellinger Distance Estimation via Hellinger Distance Differential Privacy

Fengnan Deng, Anand N. Vidyashankar

Details
Abstract

Objective functions based on Hellinger distance yield robust and efficient estimators of model parameters. Motivated by privacy and regulatory requirements encountered in contemporary applications, we derive in this paper private minimum Hellinger distance estimators. The estimators satisfy a new privacy constraint, namely, Hellinger differential privacy, while retaining the robustness and efficiency properties. We demonstrate that Hellinger differential privacy shares several features of standard differential privacy while allowing for sharper inference. Additionally, for computational purposes, we develop Hellinger differentially private gradient descent and Newton-Raphson algorithms. We illustrate the behavior of our estimators in finite samples using numerical experiments and verify that they retain robustness properties under gross-error contamination.
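As background for the objective the abstract builds on, here is a minimal sketch of the (non-private) minimum Hellinger distance idea for a discrete model. It does not implement the paper's privacy mechanism or its gradient and Newton-Raphson algorithms. The helper names `hellinger`, `binomial_pmf`, and `mhde_binomial`, the binomial model, and the grid search are all illustrative assumptions, and the 1/2 normalization inside the distance is one common convention that the paper may or may not share.

```python
import math
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete pmfs, normalized to lie in [0, 1]."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return math.sqrt(0.5 * float(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def binomial_pmf(m, theta):
    """pmf of Binomial(m, theta) on the support {0, ..., m}."""
    return np.array([
        math.comb(m, k) * theta**k * (1.0 - theta) ** (m - k)
        for k in range(m + 1)
    ])

def mhde_binomial(empirical_pmf, m, grid):
    """Grid-search minimizer of the Hellinger distance to the empirical pmf."""
    distances = [hellinger(empirical_pmf, binomial_pmf(m, th)) for th in grid]
    return float(grid[int(np.argmin(distances))])

# Illustrative use: recover theta from an empirical pmf over {0, ..., 4}.
grid = np.linspace(0.01, 0.99, 99)
theta_hat = mhde_binomial(binomial_pmf(4, 0.3), m=4, grid=grid)
```

A grid search is used here only to keep the sketch short; it stands in for whatever optimizer one would use in practice.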

2412.01763 2026-04-22 math.OC cs.LG stat.ML

The Data-Driven Censored Newsvendor Problem

Chamsi Hssaine, Sean R. Sinclair

Comments 85 pages, 11 tables, 11 figures

Details
Abstract

We study a censored variant of the data-driven newsvendor problem, where the decision-maker must select an ordering quantity that minimizes expected overage and underage costs based only on offline censored sales data, rather than historical demand realizations. Our goal is to understand how the degree of historical demand censoring affects the performance of any learning algorithm for this problem. To isolate this impact, we adopt a distributionally robust optimization framework, evaluating policies according to their worst-case regret over an ambiguity set of distributions. This set is defined by the largest historical order quantity (the observable boundary of the dataset), and contains all distributions matching the true demand distribution up to this boundary, while allowing them to be arbitrary afterwards. We demonstrate a spectrum of achievability under demand censoring by deriving a natural necessary and sufficient condition under which vanishing regret is an achievable goal. In regimes in which it is not, we exactly characterize the information loss due to censoring: an insurmountable lower bound on the performance of any policy, even when the decision-maker has access to infinitely many demand samples. We then leverage these sharp characterizations to propose a natural robust algorithm that adapts to the historical level of demand censoring. We derive finite-sample guarantees for this algorithm across all possible censoring regimes and show its near-optimality with matching lower bounds (up to polylogarithmic factors). We moreover demonstrate its robust performance via extensive numerical experiments on both synthetic and real-world datasets.
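To make the cost structure in the abstract concrete, the sketch below shows the classical uncensored newsvendor: the expected overage-plus-underage cost and the sample-quantile (sample average approximation) order quantity at the critical ratio underage / (underage + overage). It assumes fully observed demand samples, so it is the baseline problem rather than the censored setting or the robust algorithm the paper studies. The function names and the cost values are hypothetical.

```python
import math
import numpy as np

def newsvendor_cost(q, demand, underage, overage):
    """Average cost of ordering q against a sample of demand realizations."""
    demand = np.asarray(demand, dtype=float)
    lost_sales = np.maximum(demand - q, 0.0)   # units of unmet demand
    leftover = np.maximum(q - demand, 0.0)     # units of unsold stock
    return float(np.mean(underage * lost_sales + overage * leftover))

def saa_order_quantity(demand, underage, overage):
    """Empirical quantile of demand at the critical ratio underage/(underage+overage)."""
    d = np.sort(np.asarray(demand, dtype=float))
    ratio = underage / (underage + overage)
    # Smallest order statistic whose empirical CDF reaches the critical ratio.
    k = max(math.ceil(ratio * len(d)), 1)
    return float(d[k - 1])

# Illustrative demand sample and costs (made-up numbers).
demand = list(range(1, 11))
q_star = saa_order_quantity(demand, underage=3.0, overage=1.0)
cost_at_q_star = newsvendor_cost(q_star, demand, underage=3.0, overage=1.0)
```

With censored sales data, demand above the historical order quantity is never observed, so this empirical quantile cannot be computed directly whenever it would fall beyond the observable boundary.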

2407.21651 2026-04-22 math.PR stat.AP

On minimal predictable intensity of point processes

Haoming Wang

Comments Separated into two papers, the first entitled "On minimal predictable intensity of point processes" to appear in Houston Journal of Mathematics, the second arXiv:2509.06016

Details
Journal ref
Houston J. Math., 51(3):501-515, 2025
Abstract

An adapted, right-continuous, non-decreasing, integer-valued process with unit jumps and starting at zero has a minimal predictable intensity if and only if it is a standard Poisson process under an absolutely continuous transformation of measures.

2402.05384 2026-04-22 stat.ME

Efficient Nonparametric Inference for Mediation Analysis with Nonignorable Missing Confounders

Jiawei Shan, Wei Li, Chunrong Ai

Details
Journal ref
Journal of the American Statistical Association (2026)
Abstract

Mediation analysis is widely used for exploring treatment mechanisms; however, it faces challenges when nonignorable missing confounders are present. Efficient inference of mediation effects and the efficiency loss due to nonignorable missingness have rarely been studied in the literature because of the difficulties arising from the ill-posed inverse problem. In this paper, we propose a general shadow variable framework for identifying mediation effects, allowing shadow variables to be selected from either observed covariates or externally collected auxiliary data. We then propose a Sieve-based Iterative Outward (SIO) approach for estimation. We establish large-sample theory, particularly asymptotic normality, for the proposed estimator despite the ill-posedness of the problem. We show that our estimator is locally efficient and attains the semiparametric efficiency bound under certain conditions. Building on the efficient influence function, we explicitly quantify the efficiency loss attributable to missingness and propose a debiased machine learning approach for estimation and inference. We examine the finite-sample performance of the proposed approach using extensive simulation studies and showcase its practical applicability through an empirical analysis of CFPS data.

2310.06902 2026-04-22 math.ST stat.TH

On robustness of Spectral Rényi divergence

Tetsuya Takabatake, Keisuke Yano

Details
Abstract

This paper studies a specific class of statistical divergences for spectral densities of time series: the spectral $α$-Rényi divergences, which include the Itakura-Saito divergence as a limiting case. The aim of this paper is to highlight both information-theoretic and statistical properties of spectral $α$-Rényi divergences. We reveal the connection between the spectral $α$-Rényi divergence and the $γ$-divergence in robust statistics, and a variational representation of the spectral $α$-Rényi divergence. Inspired by these results suggesting "robustness" of spectral $α$-Rényi divergence, we show that the minimum spectral Rényi divergence estimator has a stable optimization path with respect to outliers in the frequency domain, unlike the minimum Itakura-Saito divergence estimator, and thus it delivers more stable estimates, reducing the need for intricate pre-processing.
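For reference, the Itakura-Saito divergence named in the abstract can be sketched numerically as follows. This is a discretized version that assumes both spectral densities are strictly positive and sampled on the same evenly spaced frequency grid, with the grid average standing in for the normalized integral over frequencies. The function name `itakura_saito` is hypothetical, and the spectral Rényi divergence itself is not implemented here because its exact form is defined in the paper.

```python
import numpy as np

def itakura_saito(f, g):
    """Discretized Itakura-Saito divergence between two positive spectra.

    Averages f/g - log(f/g) - 1 over a common, evenly spaced frequency grid.
    The value is zero when f == g and positive otherwise.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    ratio = f / g
    return float(np.mean(ratio - np.log(ratio) - 1.0))

# Illustrative flat spectra (made-up values); the divergence is asymmetric.
d_fg = itakura_saito([1.0, 1.0], [2.0, 2.0])
d_gf = itakura_saito([2.0, 2.0], [1.0, 1.0])
```

Because the ratio term grows linearly in f/g, a single large spike in one frequency bin of f can dominate the average, which is one way to see the outlier sensitivity the abstract contrasts against.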