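arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.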

2603.22276 2026-03-24 cs.LG stat.ML

Scaling DoRA: High-Rank Adaptation via Factored Norms and Fused Kernels

Alexandra Zelenin, Alexandra Zhuravlyova

Comments 30 pages, 15 figures, 15 tables, including appendices. Code and data at https://github.com/sockeye44/dorafactors

Details

Abstract

Weight-Decomposed Low-Rank Adaptation (DoRA) extends LoRA by decoupling weight magnitude from direction, but its forward pass requires the row-wise norm of W + sBA, a computation that every major framework we surveyed implements by materializing the dense [d_out, d_in] product BA. At d_in = 8192 and rank r = 384, a single module's norm requires about 512 MB of transient working memory in bf16, making high-rank DoRA costly and often infeasible on common single-GPU setups once hundreds of adapted modules and checkpointing are involved. We present two systems contributions. A factored norm decomposes the squared norm into base, cross, and Gram terms computable through O(d_out r + r^2) intermediates, eliminating the dense product. Fused Triton kernels collapse the four-kernel DoRA composition into a single pass, reducing memory traffic by about 4x and using a numerically stable form that avoids catastrophic cancellation in the near-unity rescaling regime where magnitude scales concentrate in practice. Across six 8-32B vision-language models (VLMs) on three NVIDIA GPUs (RTX 6000 PRO, H200, B200) at r = 384 in bf16, the fused implementation is 1.5-2.0x faster than Hugging Face PEFT's DoRA implementation for inference and 1.5-1.9x faster for gradient computation (optimizer step excluded), with up to 7 GB lower peak VRAM. Microbenchmarks on six GPUs spanning four architecture generations (L40S, A100, RTX 6000 PRO, H200, B200, B300) confirm 1.5-2.7x compose-kernel speedup. Final-logit cosine similarity exceeds 0.9999 across all model/GPU pairs, and multi-seed training curves match within 7.1 x 10^-4 mean per-step loss delta over 2000 steps.
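
As a rough illustration of the factored norm described in the abstract, the sketch below expands the squared row norm of W + sBA into base, cross, and Gram terms using NumPy. It is a minimal sketch under assumed shapes (W: [d_out, d_in], B: [d_out, r], A: [r, d_in]); the function name `factored_row_norm` is hypothetical, and the authors' fused Triton kernels and their numerically stable rescaling form are not reproduced here.

```python
import numpy as np


def factored_row_norm(W, A, B, s):
    """Row-wise L2 norm of W + s * (B @ A) without forming the dense B @ A.

    Assumed shapes (not taken from the paper's code):
    W: [d_out, d_in], A: [r, d_in], B: [d_out, r], s: scalar.
    """
    # Base term: ||W_i||^2 for each row i.
    base = np.einsum("ij,ij->i", W, W)
    # Cross term: W_i . (B_i A) = (W A^T)_i . B_i, via a [d_out, r] intermediate.
    WA = W @ A.T
    cross = np.einsum("ij,ij->i", WA, B)
    # Gram term: B_i (A A^T) B_i^T, via an [r, r] Gram matrix.
    G = A @ A.T
    gram = np.einsum("ij,ij->i", B @ G, B)
    sq = base + 2.0 * s * cross + (s * s) * gram
    # Clamp tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(sq, 0.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((64, 128))
    A = rng.standard_normal((8, 128))
    B = rng.standard_normal((64, 8))
    dense = np.linalg.norm(W + 0.5 * (B @ A), axis=1)
    print(np.max(np.abs(factored_row_norm(W, A, B, 0.5) - dense)))
```

The largest intermediates in this sketch are of shape [d_out, r] and [r, r]; the dense [d_out, d_in] product BA is never formed.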

2603.22248 2026-03-24 cs.LG cs.AI cs.IT math.IT stat.ML

Confidence-Based Decoding is Provably Efficient for Diffusion Language Models

Changxiao Cai, Gen Li

2603.22219 2026-03-24 cs.LG stat.ML

Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting

Qilin Wang

2603.22192 2026-03-24 math.ST cs.CC cs.DS stat.TH

Stable Algorithms Lower Bounds for Estimation

Xifan Yu, Ilias Zadik

Comments 82 pages, 2 figures

2603.22188 2026-03-24 stat.AP cs.CY math.PR

Generalized Sequential Monte Carlo Sampling for Redistricting Simulation

Philip O'Sullivan, Kosuke Imai, Cory McCartan

2603.22160 2026-03-24 stat.AP cs.LG

Data Curation for Machine Learning Interatomic Potentials by Determinantal Point Processes

Joanna Zou, Youssef Marzouk

Comments Original publication at https://openreview.net/forum?id=PKGP7tg65A

2603.22071 2026-03-24 stat.ME math.ST stat.TH

Detecting change regions on spheres

Di Su, Yining Chen, Tengyao Wang

2603.22050 2026-03-24 stat.ML cs.LG

MAGPI: Multifidelity-Augmented Gaussian Process Inputs for Surrogate Modeling from Scarce Data

Atticus Rex, Elizabeth Qian, David Peterson

2603.22030 2026-03-24 cs.LG stat.ML

On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors

Julius Kobialka, Emanuel Sommer, Chris Kolb, Juntae Kwon, Daniel Dold, David Rügamer

Comments Accepted at the 29th International Conference on Artificial Intelligence and Statistics (AISTATS) 2026

2603.22024 2026-03-24 stat.ME math.ST stat.TH

Cost-Aware Optimized Front-Door Experimental Design

Leopold Mareis, Mathias Drton

Comments This article will be published in the proceedings of CLeaR 2026

2603.22006 2026-03-24 astro-ph.CO astro-ph.IM cs.LG stat.ME

A plug-and-play approach with fast uncertainty quantification for weak lensing mass mapping

Hubert Leterme, Andreas Tersenov, Jalal Fadili, Jean-Luc Starck

Details

Abstract

Upcoming stage-IV surveys such as Euclid and Rubin will deliver vast amounts of high-precision data, opening new opportunities to constrain cosmological models with unprecedented accuracy. A key step in this process is the reconstruction of the dark matter distribution from noisy weak lensing shear measurements. Current deep learning-based mass mapping methods achieve high reconstruction accuracy, but either require retraining a model for each new observed sky region (limiting practicality) or rely on slow MCMC sampling. Efficient exploitation of future survey data therefore calls for a new method that is accurate, flexible, and fast at inference. In addition, uncertainty quantification with coverage guarantees is essential for reliable cosmological parameter estimation. We introduce PnPMass, a plug-and-play approach for weak lensing mass mapping. The algorithm produces point estimates by alternating between a gradient descent step with a carefully chosen data fidelity term, and a denoising step implemented with a single deep learning model trained on simulated data corrupted by Gaussian white noise. We also propose a fast, sampling-free uncertainty quantification scheme based on moment networks, with calibrated error bars obtained through conformal prediction to ensure coverage guarantees. Finally, we benchmark PnPMass against both model-driven and data-driven mass mapping techniques. PnPMass achieves performance close to that of state-of-the-art deep-learning methods while offering fast inference (converging in just a few iterations) and requiring only a single training phase, independently of the noise covariance of the observations. It therefore combines flexibility, efficiency, and reconstruction accuracy, while delivering tighter error bars than existing approaches, making it well suited for upcoming weak lensing surveys.
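
The alternation between a gradient step and a denoising step can be sketched generically as follows. This is a minimal plug-and-play forward-backward loop, assuming a plain quadratic data fidelity 0.5 * ||Ax - y||^2 and a soft-thresholding function as a stand-in denoiser. The abstract does not specify the paper's data fidelity term or its trained deep denoiser, so the names `pnp_forward_backward` and `soft_threshold` and all parameters here are hypothetical.

```python
import numpy as np


def soft_threshold(v, lam=0.05):
    """Stand-in denoiser (assumption): shrinks small coefficients to zero."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def pnp_forward_backward(y, A, denoise, step, n_iter=50):
    """Generic plug-and-play iteration: gradient step, then denoising step.

    y: observations, A: linear forward operator as a 2-D array,
    denoise: any callable mapping an estimate to a denoised estimate.
    """
    x = A.T @ y  # simple back-projection as the starting point
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)      # gradient of 0.5 * ||A x - y||^2
        x = denoise(x - step * grad)  # denoiser replaces a proximal operator
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 20)) / np.sqrt(40)
    x_true = np.zeros(20)
    x_true[[3, 7]] = [1.0, -2.0]
    y = A @ x_true + 0.01 * rng.standard_normal(40)
    print(pnp_forward_backward(y, A, soft_threshold, step=0.2, n_iter=200))
```

Swapping `soft_threshold` for a learned denoiser changes only the `denoise` argument, which is the main appeal of the plug-and-play design.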

2603.21992 2026-03-24 stat.ME

Pair-based estimators of infection and removal rates for stochastic epidemic models

Seth D. Temple, Jonathan Terhorst

2603.21967 2026-03-24 stat.ME

Unified implementation and comparison of Bayesian shrinkage methods for treatment effect estimation in subgroups

Marcel Wolbers, Miriam Pedrera Gómez, Alex Ocampo, Isaac Gravestock

Comments 26 pages (23 main, 3 supplementary), 5 figures (4 main, 1 supplementary), 8 tables (4 main, 4 supplementary)

Details

Abstract

Evaluating treatment effect heterogeneity across patient subgroups is a fundamental aspect of clinical trial analysis. Yet, these analyses have inherent limitations due to small sample sizes and the substantial number of subgroups investigated. Statisticians in regulatory agencies and pharmaceutical companies have begun considering shrinkage methods grounded in Bayesian statistical theory. These methods incorporate priors on treatment effect heterogeneity, which operationally shrink raw subgroup treatment effect estimates towards the overall treatment effect. Various shrinkage estimators and priors have been proposed, yet it remains unclear which methods perform best. This work provides a unified presentation, software implementation (in the R package bonsaiforest2), and simulation comparison of one-way and global shrinkage methods for continuous, binary, count, and time-to-event endpoints. One-way models fit a separate shrinkage model for each subgrouping variable, whereas global models fit a model including all subgroup indicators at once. Both can derive standardized subgroup-specific treatment effects. Across all simulation scenarios, shrinkage methods outperformed the standard subgroup estimator without shrinkage in terms of mean squared error. They were also more efficient in identifying a non-efficacious subgroup. Global shrinkage models tended to have smaller mean squared error and less dependence on hyperprior parameters than one-way models, but also exhibited slightly larger bias and worse frequentist coverage of associated credible intervals. For both models, hyperprior choices anchored in trial assumptions about the anticipated size of the overall treatment effect performed well. We conclude that some degree of shrinkage is preferable to none and advocate for the routine inclusion of shrunken estimates in clinical forest plots to facilitate more robust decision-making.
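
To make the idea of shrinking raw subgroup estimates towards the overall effect concrete, here is a minimal normal-normal sketch. It assumes a known between-subgroup standard deviation `tau` and a precision-weighted plug-in overall effect. It is not the one-way or global model implemented in bonsaiforest2, which places hyperpriors on the heterogeneity; the function name and inputs are hypothetical.

```python
import numpy as np


def shrink_toward_overall(estimates, std_errors, tau):
    """Shrink subgroup estimates toward a precision-weighted overall effect.

    Assumes estimate_i ~ N(theta_i, se_i^2) and theta_i ~ N(overall, tau^2)
    with tau treated as known (an illustrative simplification).
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(std_errors, dtype=float) ** 2
    precision = 1.0 / variances
    overall = np.sum(precision * estimates) / np.sum(precision)
    # Weight on the raw estimate: near 1 for precise subgroups, near 0 for noisy ones.
    weight = tau**2 / (tau**2 + variances)
    shrunk = overall + weight * (estimates - overall)
    return shrunk, overall


if __name__ == "__main__":
    shrunk, overall = shrink_toward_overall(
        estimates=[0.10, 0.50, 0.90], std_errors=[0.20, 0.20, 0.20], tau=0.20
    )
    print(overall, shrunk)
```

With `tau = 0` every subgroup collapses onto the overall effect, and as `tau` grows the estimates approach the unshrunken subgroup values.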

2603.21952 2026-03-24 stat.ME stat.CO

Parsimonious Subset Selection for Generalized Linear Models with Biomedical Applications

Anant Mathur, Benoit Liquet, Samuel Muller, Sarat Moka

2603.21918 2026-03-24 stat.ML cs.LG

Structural Concentration in Weighted Networks: A Class of Topology-Aware Indices

L. Riso, M. G. Zoia

2603.21914 2026-03-24 math.ST stat.TH

On the identifiability of Dirichlet mixture models

Hien Duy Nguyen, Mayetri Gupta

2603.21844 2026-03-24 cs.LG cs.AI stat.ME stat.ML

On the Number of Conditional Independence Tests in Constraint-based Causal Discovery

Marc Franquesa Monés, Jiaqi Zhang, Caroline Uhler

2603.21752 2026-03-24 stat.AP cs.LG

Identifiability and amortized inference limitations in Kuramoto models

Emma Hannula, Jana de Wiljes, Matthew T. Moores, Heikki Haario, Lassi Roininen

2603.21748 2026-03-24 stat.ME

Fixed Rank co-Kriging: a model for multivariate spatial prediction

Gaia Caringi, Piercesare Secchi

Comments 36 pages, 25 figures

2603.21699 2026-03-24 econ.EM stat.ML

A Job I Like or a Job I Can Get: Designing Job Recommender Systems Using Field Experiments

Guillaume Bied, Philippe Caillou, Bruno Crépon, Christophe Gaillac, Elia Pérennes, Michèle Sebag

Comments The main paper, which stops at page 49, is followed by the online appendix (31 pages)

2603.21683 2026-03-24 math.OC math.PR stat.ML

Learning operators on labelled conditional distributions with applications to mean field control of non exchangeable systems

Samy Mekkaoui, Huyên Pham, Xavier Warin

2603.21678 2026-03-24 stat.ML cs.LG

CoNBONet: Conformalized Neuroscience-inspired Bayesian Operator Network for Reliability Analysis

Shailesh Garg, Souvik Chakraborty

2603.21590 2026-03-24 math.ST cs.LG stat.TH

Feature Incremental Clustering with Generalization Bounds

Jing Zhang, Chenping Hou

2603.21554 2026-03-24 math.OC math.ST stat.TH

Sinkhorn algorithms for entropic vector quantile regression

Kengo Kato, Boyu Wang

Comments 32 pages

2603.21549 2026-03-24 stat.ME stat.CO

Bayesian inference for ordinary differential equations models with heteroscedastic measurement error

Selva Salimi, David J. Warne, Christopher Drovandi

Comments 28 pages

Details

Abstract

Ordinary differential equation (ODE) models are widely used to describe systems in many areas of science. To ensure these models provide accurate and interpretable representations of real-world dynamics, it is often necessary to infer parameters from data, which involves specifying the form of the ODE system as well as a statistical model describing the observational process. A popular and convenient choice for the error model is a Gaussian distribution with constant variance. However, the choice may not be realistic in many systems, since the variance of the observational error may vary over time or have some dependence on the system state (heteroscedastic), reflecting changes in measurement conditions, environmental fluctuations, or intrinsic system variability. Misspecification of the error model can lead to substantial inaccuracies of the posterior estimates of the ODE model parameters and predictions. More elaborate parametric error models could be specified, but this would increase computational cost because additional parameters would need to be estimated within the MCMC procedure and may still be misspecified. In this work we propose a two-step semi-parametric framework for Bayesian parameter estimation of ODE model parameters when there exists heteroscedasticity in the error process. The first step applies a heteroscedastic Gaussian process to estimate the time-dependent error, and the second step performs Bayesian inference for the ODE model parameters using the estimated time-dependent error estimated from step one in the likelihood function. Through a simulation study and two real-world applications, we demonstrate that the proposed approach yields more reliable posterior inference and predictive uncertainty compared to the standard homoscedastic models. Although our focus is on heteroscedasticity, the framework could be applied to handle more complex error processes.
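
The second step described above plugs a time-dependent error estimate into the likelihood. A minimal sketch of such a heteroscedastic Gaussian log-likelihood is shown below. It assumes that `mu` holds the ODE solution at the observation times and that `sigma` holds per-observation standard deviations from some first-step estimator. Both inputs and the function name are hypothetical, and the heteroscedastic Gaussian process fit from step one is not shown.

```python
import numpy as np


def heteroscedastic_gaussian_loglik(y, mu, sigma):
    """Gaussian log-likelihood with a separate standard deviation per observation.

    y: observations, mu: model predictions at the same times,
    sigma: per-observation standard deviations (a scalar recovers
    the usual constant-variance model).
    """
    y, mu, sigma = np.broadcast_arrays(
        np.asarray(y, dtype=float),
        np.asarray(mu, dtype=float),
        np.asarray(sigma, dtype=float),
    )
    resid = (y - mu) / sigma
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * resid**2))


if __name__ == "__main__":
    t = np.linspace(0.0, 5.0, 6)
    mu = np.exp(-0.5 * t)        # stand-in for an ODE solution (assumption)
    sigma_t = 0.05 + 0.1 * mu    # stand-in for the step-one error estimate
    y = mu + 0.01
    print(heteroscedastic_gaussian_loglik(y, mu, sigma_t))
```

Inside an MCMC sampler, `mu` would be recomputed from the ODE for each proposed parameter value while `sigma` stays fixed at its step-one estimate.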

2603.19422 2026-03-24 stat.ML cs.LG math.ST stat.TH

Pseudo-Labeling for Unsupervised Domain Adaptation with Kernel GLMs

Nathan Weill, Kaizheng Wang

Comments 55 pages, 4 figures. Python solvers and experiment scripts are available at: https://github.com/nathanweill/KRGLM

2603.17866 2026-03-24 stat.AP stat.ME

Bayesian multilevel step-and-turn models for evaluating player movement in American football

Quang Nguyen, Ronald Yurko

2603.15884 2026-03-24 stat.AP stat.ME

A Utility Score Framework for Dose Optimization Studies with Binary Efficacy-Safety Endpoints: Sample Size Determination and Bias Characterization

Xuemin Gu, Cong Xu, Lei Xu, Ying Yu

2603.14757 2026-03-24 stat.OT

The Rise of Null Hypothesis Significance Testing (NHST): Institutional Massification and the Emergence of a Procedural Epistemology

Carol Ting

Comments 29 pages, 6 figures. v2: Added missing citation (Ting & Greenland, 2024), corrected formatting issues, and minor typographical edits

2602.10273 2026-03-24 stat.ML cs.LG

Power-SMC: Low-Latency Sequence-Level Power Sampling for Training-Free LLM Reasoning

Seyedarmin Azizi, Erfan Baghaei Potraghloo, Minoo Ahmadi, Souvik Kundu, Massoud Pedram

2602.08998 2026-03-24 math.AT cs.LG math.OA stat.ML

Universal Coefficients and Mayer-Vietoris Sequence for Groupoid Homology

Luciano Melodia

Comments Master's thesis, Code available at https://codeberg.org/Jiren/MSc

2602.07098 2026-03-24 stat.CO cs.LG stat.ML

BayesFlow 2: Multi-Backend Amortized Bayesian Inference in Python

Lars Kühmichel, Jerry M. Huang, Valentin Pratz, Jonas Arruda, Hans Olischläger, Daniel Habermann, Simon Kucharsky, Lasse Elsemüller, Aayush Mishra, Niels Bracher, Svenja Jedhoff, Marvin Schmitt, Paul-Christian Bürkner, Stefan T. Radev

2601.09888 2026-03-24 econ.EM math.ST stat.TH

Learning about Treatment Effects with Prior Studies: A Bayesian Model Averaging Approach

Frederico Finan, Demian Pouzo

2511.17167 2026-03-24 math.ST cs.CR stat.ME stat.TH

Differentially private testing for relevant dependencies in high dimensions

Patrick Bastian, Holger Dette, Martin Dunsche

Comments 39 pages, 9 figures

2511.01137 2026-03-24 cs.LG math.AG math.DS stat.ML

Regularization Implies balancedness in the deep linear network

Kathryn Lindsey, Govind Menon

Comments 18 pages, 3 figures. Fixed minor errors in revision, added more context and created Discussion section

2509.19988 2026-03-24 stat.ML cs.LG q-bio.QM

BioBO: Biology-informed Bayesian Optimization for Perturbation Design

Yanke Li, Tianyu Cui, Tommaso Mansi, Mangal Prakash, Rui Liao

Comments ICLR 2026

2508.14936 2026-03-24 q-bio.QM cs.AI cs.LG stat.AP stat.ML

Can synthetic data reproduce real-world findings in epidemiology? A replication study using adversarial random forests

Jan Kapar, Kathrin Günther, Lori Ann Vallis, Klaus Berger, Nadine Binder, Hermann Brenner, Stefanie Castell, Beate Fischer, Volker Harth, Bernd Holleczek, Timm Intemann, Till Ittermann, André Karch, Thomas Keil, Lilian Krist, Berit Lange, Michael F. Leitzmann, Katharina Nimptsch, Nadia Obi, Iris Pigeot, Tobias Pischon, Tamara Schikowski, Börge Schmidt, Carsten Oliver Schmidt, Anja M. Sedlmair, Justine Tanoey, Harm Wienbergen, Andreas Wienke, Claudia Wigmann, Marvin N. Wright

Details

Abstract

Synthetic data holds substantial potential to address practical challenges in epidemiology due to restricted data access and privacy concerns. However, many current methods suffer from limited quality, high computational demands, and complexity for non-experts. Furthermore, common evaluation strategies for synthetic data often fail to directly reflect statistical utility and measure privacy risks sufficiently. Against this background, a critical underexplored question is whether synthetic data can reliably reproduce key findings from epidemiological research while preserving privacy. We propose adversarial random forests (ARF) as an efficient and convenient method for synthesizing tabular epidemiological data. To evaluate its performance, we replicated statistical analyses from six epidemiological publications covering blood pressure, anthropometry, myocardial infarction, accelerometry, loneliness, and diabetes, from the German National Cohort (NAKO Gesundheitsstudie), the Bremen STEMI Registry U45 Study, and the Guelph Family Health Study. We further assessed how dataset dimensionality and variable complexity affect the quality of synthetic data, and contextualized ARF's performance by comparison with commonly used tabular data synthesizers in terms of utility, privacy, generalisation, and runtime. Across all replicated studies, results on ARF-generated synthetic data consistently aligned with original findings. Even for datasets with relatively low sample size-to-dimensionality ratios, replication outcomes closely matched the original results across descriptive and inferential analyses. Reduced dimensionality and variable complexity further enhanced synthesis quality. ARF demonstrated favourable performance regarding utility, privacy preservation, and generalisation relative to other synthesizers and superior computational efficiency.

2506.14082 2026-03-24 physics.geo-ph stat.AP

Smooth surface reconstruction of earthquake faults from distributed moment-potency-tensor solutions

Dye SK Sato, Yuji Yagi, Ryo Okuwaki, Yukitoshi Fukahata

Comments 46 pages, 13 figures

Details

DOI: 10.1093/gji/ggag116

Abstract

Earthquake faults as observed by seismic motions primarily manifest as displacement discontinuities within elastic continua. The displacement discontinuity and the surface normal vector (n-vector) of such an idealized earthquake source are measured by the tensor of potency, which is seismic moment normalized by stiffness. This study formulates an inverse problem to reconstruct a smooth 3D fault surface from an areal density field of the potency tensor. Here, the surface is represented by an elevation field, while nodal planes of the potency density represent the surface normal (n-vector) field, reducing the problem to an n-vector-to-elevation transform. Although this transform is a one-to-one mapping in 2D, it becomes overdetermined in 3D because the n-vector has two degrees of freedom while the scalar elevation has only one, admitting no solution in general. This overdeterminacy originates from modeling the potency density, the inelastic strain with six degrees of freedom, as a displacement discontinuity of five degrees of freedom. Whereas this overdeterminacy appears as the violation of the determinant-free constraint in point potency sources, it raises a conflict with the global consistency of the n-vector field in areal potency densities. Recognizing this capacity of the potency density to describe inelastic strain incompatible with displacement discontinuity, we introduce an a priori constraint to define the fault as the smooth surface that best approximates inelastic strain as displacement discontinuity. We derive an analytical solution for this formulation and demonstrate its ability to reproduce 3D surfaces from noisy synthetic n-vectors. We integrate this formula into potency density tensor inversion and apply it to the 2013 Balochistan earthquake. The estimated 3D geometry shows better agreement with observed fault traces than previous quasi-2D methods, validating our proposal.

2505.16919 2026-03-24 stat.ME

Hilbert space methods for approximating multi-output latent variable Gaussian processes

Soham Mukherjee, Manfred Claassen, Paul-Christian Bürkner

Comments 44 pages, 34 figures

2504.10881 2026-03-24 stat.ME stat.AP stat.CO

A Nonparametric Bayesian Local-Global Model for Enhanced Adverse Event Signal Detection in Spontaneous Reporting System Data

Xin-Wei Huang, Saptarshi Chakraborty

2504.09396 2026-03-24 cs.LG cs.AI stat.ML

Adaptive Insurance Reserving with CVaR-Constrained Reinforcement Learning under Macroeconomic Regimes

Stella C. Dong

2503.04071 2026-03-24 stat.ML cs.LG

Tightening optimality gap with confidence through conformal prediction

Miao Li, Michael Klamkin, Russell Bent, Pascal Van Hentenryck

Comments none

2502.04907 2026-03-24 stat.ML cs.LG

Scalable Learning from Probability Measures with Mean Measure Quantization

Erell Gachon, Elsa Cazelles, Jérémie Bigot

2501.16933 2026-03-24 stat.ME math.ST stat.AP stat.TH

Rethinking the Win Ratio: A Causal Framework for Hierarchical Outcome Analysis

Mathieu Even, Julie Josse

2501.06404 2026-03-24 econ.EM cs.AI cs.LG stat.ML

A Hybrid Framework for Reinsurance Optimization: Integrating Generative Models and Reinforcement Learning

Stella C. Dong

2408.05819 2026-03-24 stat.ML cs.LG

Fast convergence of a Federated Expectation-Maximization Algorithm

Zhixu Tao, Rajita Chandak, Sanjeev Kulkarni

2407.11455 2026-03-24 math.ST stat.TH

ERM-Lasso classification algorithm for Multivariate Hawkes Processes paths

Charlotte Dion-Blanc, Christophe Denis, Laure Sansonnet, Romain Edmond Lacoste

2402.08412 2026-03-24 stat.ML cs.LG math.DS math.ST stat.TH

Interacting Particle Systems on Networks: joint inference of the network and the interaction kernel

Quanjun Lang, Xiong Wang, Fei Lu, Mauro Maggioni

Comments 53 pages, 17 figures

2402.01491 2026-03-24 stat.ME math.PR math.ST stat.AP stat.TH

Moving Aggregate Modified Autoregressive Copula-Based Time Series Models (MAGMAR-Copulas)

Sven Pappert

2312.02246 2026-03-24 cs.CV cs.AI cs.LG stat.ML

Conditional Variational Diffusion Models

Gabriel della Maggiora, Luis Alberto Croquevielle, Nikita Deshpande, Harry Horsley, Thomas Heinis, Artur Yakimovich

Comments Denoising Diffusion Probabilistic Models, Inverse Problems, Generative Models, Super Resolution, Phase Quantification, Variational Methods

2310.09335 2026-03-24 stat.ML cs.LG math.ST stat.TH

The surrogate Gibbs-posterior of a corrected stochastic MALA: Towards uncertainty quantification for neural networks

Sebastian Bieringer, Gregor Kasieczka, Maximilian F. Steffen, Mathias Trabs

Comments The first version of this manuscript was entitled "Statistical guarantees for stochastic Metropolis-Hastings". Some preliminary results were initially presented in the first version of arXiv:2204.12392, but have been moved to this manuscript, where they have been further developed

2304.12505 2026-03-24 math.ST stat.ML stat.TH

Generalized Bayesian Additive Regression Trees: Theory and Software

Enakshi Saha

Comments 39 pages

2207.05901 2026-03-24 stat.AP stat.CO

Virtual sensing of subsoil strain response in monopile-based offshore wind turbines via Gaussian process latent force models

Joanna Zou, Eliz-Mari Lourens, Alice Cicirello

Comments submitted to Mechanical Systems and Signal Processing

Details

DOI: 10.1016/j.ymssp.2023.110488
Journal ref: Mechanical Systems and Signal Processing (2023), vol. 200, 110488

Abstract

Virtual sensing techniques have gained traction in applications to the structural health monitoring of monopile-based offshore wind turbines, as the strain response below the mudline, which is a primary indicator of fatigue damage accumulation, is impractical to measure directly with physical instrumentation. The Gaussian process latent force model (GPLFM) is a generalized Bayesian virtual sensing technique which combines a physics-driven model of the structure with a data-driven model of latent variables of the system to extrapolate unmeasured strain states. In the GPLFM, modeling of unknown sources of excitation as a Gaussian process (GP) serves to facilitate strain estimation by providing a complete stochastic characterization of the covariance relationship between input forces and states, using properties of the GP covariance kernel as well as correlation information supplied by the mechanical model. It is shown that posterior inference of the latent inputs and states is performed by Gaussian process regression of measured accelerations, computed efficiently using Kalman filtering and Rauch-Tung-Striebel smoothing in an augmented state-space model. While the GPLFM has been previously demonstrated in numerical studies to improve upon other virtual sensing techniques in terms of accuracy, robustness, and numerical stability, this work provides one of the first cases of in-situ validation of the GPLFM. The predicted strain response by the GPLFM is compared to subsoil strain data collected from an operating offshore wind turbine in the Westermeerwind Park in the Netherlands.

2206.02088 2026-03-24 stat.ML cs.LG stat.ME

LOCO Feature Importance Inference without Data Splitting via Minipatch Ensembles

Luqin Gan, Lili Zheng, Genevera I. Allen

2111.11566 2026-03-24 stat.ME stat.AP

Combining chains of Bayesian models with Markov melding

Andrew A. Manderson, Robert J. B. Goudie

Comments 37 pages, 14 figures. Revisions to text

2110.11442 2026-03-24 math.OC cs.LG stat.ML

Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent

Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad

2004.02881 2026-03-24 stat.ML cs.CG cs.LG cs.NE

Estimate of the Neural Network Dimension using Algebraic Topology and Lie Theory

Luciano Melodia, Richard Lenz

Comments Code available at https://codeberg.org/Jiren/NTOPL

1911.02922 2026-03-24 cs.CG cs.LG math.AT stat.ML

Persistent Homology as Stopping-Criterion for Voronoi Interpolation

Luciano Melodia, Richard Lenz

Comments Code available at https://codeberg.org/Jiren/SIML

1807.04021 2026-03-24 math.ST eess.SP stat.TH

On Bayesian estimation and proximity operators

Rémi Gribonval, Mila Nikolova

Comments Compared to the published version, this document (March 2026) includes typo corrections in Proposition 5, indicated in blue

Details

Journal ref: Applied and Computational Harmonic Analysis, 2021, 50, pp. 49-72

Abstract

There are two major routes to address the ubiquitous family of inverse problems appearing in signal and image processing, such as denoising or deblurring. A first route relies on Bayesian modeling, where prior probabilities are used to embody models of both the distribution of the unknown variables and their statistical dependence with respect to the observed data. The estimation process typically relies on the minimization of an expected loss (e.g. minimum mean squared error, or MMSE). The second route has received much attention in the context of sparse regularization and compressive sensing: it consists in designing (often convex) optimization problems involving the sum of a data fidelity term and a penalty term promoting certain types of unknowns (e.g., sparsity, promoted through an ℓ1 norm). Well known relations between these two approaches have led to some widespread misconceptions. In particular, while the so-called Maximum A Posteriori (MAP) estimate with a Gaussian noise model does lead to an optimization problem with a quadratic data-fidelity term, we disprove through explicit examples the common belief that the converse would be true. It has already been shown [7, 9] that for denoising in the presence of additive Gaussian noise, for any prior probability on the unknowns, MMSE estimation can be expressed as a penalized least squares problem, with the apparent characteristics of a MAP estimation problem with Gaussian noise and a (generally) different prior on the unknowns. In other words, the variational approach is rich enough to build all possible MMSE estimators associated to additive Gaussian noise via a well chosen penalty. We generalize these results beyond Gaussian denoising and characterize noise models for which the same phenomenon occurs.
In particular, we prove that with (a variant of) Poisson noise and any prior probability on the unknowns, MMSE estimation can again be expressed as the solution of a penalized least squares optimization problem. For additive scalar denoising the phenomenon holds if and only if the noise distribution is log-concave. In particular, Laplacian denoising can (perhaps surprisingly) be expressed as the solution of a penalized least squares problem. In the multivariate case, the same phenomenon occurs when the noise model belongs to a particular subset of the exponential family. For multivariate additive denoising, the phenomenon holds if and only if the noise is white and Gaussian.

2603.21424 2026-03-24 stat.ME math.ST stat.TH

Tiny but uniform improvements of adaptive BH procedures via compound e-values

Nikolaos Ignatiadis, Ruodu Wang, Aaditya Ramdas

2603.21407 2026-03-24 econ.TH stat.AP

The Geometry of Heterogeneous Extremes: Optimal Transport and Entropic Design

I. Sebastian Buhai

2603.21393 2026-03-24 cs.LG stat.ML

A Generalised Exponentiated Gradient Approach to Enhance Fairness in Binary and Multi-class Classification Tasks

Maryam Boubekraoui, Giordano d'Aloisio, Antinisca Di Marco

2603.21375 2026-03-24 cs.LG stat.ML

Constrained Online Convex Optimization with Memory and Predictions

Mohammed Abdullah, George Iosifidis, Salah Eddine Elayoubi, Tijani Chahed

Comments accepted to AAAI 2026

2603.21370 2026-03-24 stat.ME cs.SY eess.SY

Adaptive and robust experimental design for linear dynamical models using Kalman filter

Arno Strouwen, Bart M. Nicolaï, Peter Goos

2603.21361 2026-03-24 stat.ME stat.CO

A Note on the Output of a Coordinate-Exchange Algorithm for Optimal Experimental Design

Arno Strouwen, Peter Goos

2603.21342 2026-03-24 stat.ML cs.AI cs.CL cs.LG

Generalized Discrete Diffusion from Snapshots

Oussama Zekri, Théo Uscidda, Nicolas Boullé, Anna Korba

Comments 37 pages, 6 figures, 13 tables

2603.21247 2026-03-24 stat.ML cs.LG math.DG physics.data-an

Accelerate Vector Diffusion Maps by Landmarks

Sing-Yuan Yeh, Yi-An Wu, Hau-Tieng Wu, Mao-Pei Tsui

2603.21235 2026-03-24 stat.ML cs.AI cs.CV

Domain Elastic Transform: Bayesian Function Registration for High-Dimensional Scientific Data

Osamu Hirose, Emanuele Rodola

2603.21216 2026-03-24 stat.AP

VA-Calibration: Correcting for Algorithmic Misclassification in Estimating Cause Distributions

Sandipan Pramanik, Emily B. Wilson, Henry D. Kalter, Agbessi Amouzou, Robert E. Black, Li Liu, Jamie Perin, Abhirup Datta

Comments 27 pages, 5 figures

2603.21191 2026-03-24 cs.LG math.OC stat.ML

On the Role of Batch Size in Stochastic Conditional Gradient Methods

Rustem Islamov, Roman Machacek, Aurelien Lucchi, Antonio Silveti-Falls, Eduard Gorbunov, Volkan Cevher

2603.21163 2026-03-24 stat.AP stat.ME

Simultaneous Estimation of Ballpark Effects and Team Defense Using Total Bases Residuals

Jhe-Jia Wu, Tian-Li Yan, Ting-Li Chen

2603.21144 2026-03-24 stat.ML cs.LG

Time-adaptive functional Gaussian Process regression

MD Ruiz-Medina, AE Madrid, A Torres-Signes, JM Angulo

2603.21091 2026-03-24 stat.ML cs.LG math.PR

Stochastic approximation in non-markovian environments revisited

Vivek Shripad Borkar

2603.21067 2026-03-24 stat.ME

A Bayesian Framework for Quantifying Association Between Functional and Structural Data in Neuroimaging

Sakul Mahat, Sharmistha Guha, Jessica Bernard

2603.21062 2026-03-24 stat.ML cs.LG math.ST stat.TH

Gradient Descent with Projection Finds Over-Parameterized Neural Networks for Learning Low-Degree Polynomials with Nearly Minimax Optimal Rate

Yingzhen Yang, Ping Li

2603.21042 2026-03-24 stat.ME cs.LG

Statistical Learning for Latent Embedding Alignment with Application to Brain Encoding and Decoding

Shuoxun Xu, Zhanhao Yan, Lexin Li

Comments 35 pages, 3 figures

2603.21032 2026-03-24 stat.AP stat.ME

Integrative Predictor-Dependent Learning of Network Data and Spatially Correlated Nodal Attributes for Multimodal Brain Imaging in Aging

Jose Rodriguez-Acosta, Sharmistha Guha, Jessica Bernard, Thamires Magalhaes, Kaitlin McOwen

Comments 38 pages

Abstract

This article introduces a predictor-dependent joint modeling framework for network data obtained from multiple subjects over a shared set of nodes with spatial co-ordinates and spatially correlated nodal attributes. The framework is highly flexible, allowing concurrent inference on nodes significantly associated with a predictor, spatial associations of nodal attributes and the regression relationship between a predictor and edge connecting a pair of nodes or a specific nodal attribute. Empirical results indicate a superior performance of the proposed approach due to accounting for network structure and spatial correlation in the data simultaneously. The methodology analyzes multimodal brain imaging data collected first-hand in the coauthor's Lifespan Cognitive and Motor Neuroimaging Laboratory, with a focus on integrating structural and functional information. It examines brain connectivity, represented as a connectome network across regions of interest (ROIs) derived from functional magnetic resonance imaging (fMRI), while also incorporating ROI-specific attributes obtained from structural MRI data, for each subject. Subject-specific aging-related features and spatial locations of ROIs are incorporated in the analysis. This framework facilitates robust inference on the associations between predictors and brain connectivity patterns, the spatial relationships among ROI-specific attributes, and the regression relationships involving edges or ROI-specific attributes with aging-related predictors. By integrating these diverse data sources, the approach provides a deeper understanding of the complex interplay between brain structure, function, aging-related changes, and external predictors. As a model-based Bayesian approach, it provides uncertainty quantification for all inferences, offering robust and reliable results, particularly in scenarios with limited sample size.

2603.21027 2026-03-24 cs.IT math.IT math.ST stat.TH

Dual Representation of Minimum Divergence Under Integral Constraints

Shubhanshu Shekhar, Shubhada Agrawal

Comments 45 pages [Preliminary version; feedback welcome]

2603.21004 2026-03-24 econ.EM math.ST stat.TH

Power Bounds and Efficiency Loss for Asymptotically Optimal Tests in IV Regression

Marcelo J. Moreira, Geert Ridder, Mahrad Sharifvaghefi

2603.20974 2026-03-24 math.ST stat.TH

Support of Continuous Smeary Measures on Spheres

Susovan Pal

2603.20962 2026-03-24 stat.AP stat.ME stat.ML

Integrative Learning of Dynamically Evolving Multiplex Graphs and Nodal Attributes Using Neural Network Gaussian Processes with an Application to Dynamic Terrorism Graphs

Jose Rodriguez-Acosta, Sharmistha Guha, Lekha Patel, Kurtis Shuler

Comments 59 pages

Abstract

Exploring the dynamic co-evolution of multiplex graphs and nodal attributes is a compelling question in criminal and terrorism networks. This article is motivated by the study of dynamically evolving interactions among prominent terrorist organizations, considering various organizational attributes like size, ideology, leadership, and operational capacity. Statistically principled integration of multiplex graphs with nodal attributes is significantly challenging due to the need to leverage shared information within and across layers, account for uncertainty in predicting unobserved links, and capture temporal evolution of node attributes. These difficulties increase when layers are partially observed, as in terrorism networks where connections are deliberately hidden to obscure key relationships. To address these challenges, we present a principled methodological framework to integrate the multiplex graph layers and nodal attributes. The approach employs time-varying stochastic latent factor models, leveraging shared latent factors to capture graph structure and its co-evolution with node attributes. Latent factors are modeled using Gaussian processes with an infinitely wide deep neural network-based covariance function, termed neural network Gaussian processes (NN-GP). The NN-GP framework on latent factors exploits the predictive power of Bayesian deep neural network architecture while propagating uncertainty for reliability. Simulation studies highlight superior performance of the proposed approach in achieving inferential objectives. The approach, termed the dynamic joint learner, enables predictive inference (with uncertainty) of diverse unobserved dynamic relationships among prominent terrorist organizations and their organization-specific attributes, as well as clustering behavior in terms of friend-and-foe relationships, which could be informative in counter-terrorism research.

2603.20945 2026-03-24 stat.ME

Functional Estimation of Manifold-Valued Diffusion Processes

Jacob McErlean, Hau-Tieng Wu

2603.20939 2026-03-24 cs.CL cs.AI cs.HC cs.IR stat.ML

User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction

Yuren Hao, Shuhaib Mehri, ChengXiang Zhai, Dilek Hakkani-Tür

Comments 21 pages including appendices

2603.20936 2026-03-24 econ.EM stat.ML

Two Approaches to Direct Estimation of Riesz Representers

David Bruns-Smith

Comments A short technical and historical note

2603.20929 2026-03-24 stat.ML cs.LG math.ST stat.CO stat.TH

Stability of Sequential and Parallel Coordinate Ascent Variational Inference

Debdeep Pati

Comments 20 pages, 3 figures

2603.20927 2026-03-24 stat.ML cs.LG

Active Inference for Physical AI Agents -- An Engineering Perspective

Bert de Vries

2603.20908 2026-03-24 cs.LG stat.ML

Bayesian Scattering: A Principled Baseline for Uncertainty on Image Data

Bernardo Fichera, Zarko Ivkovic, Kjell Jorner, Philipp Hennig, Viacheslav Borovitskiy

2603.20891 2026-03-24 stat.ML cs.LG eess.SP math.DS

Auto-differentiable data assimilation: Co-learning of states, dynamics, and filtering algorithms

Melissa Adrian, Daniel Sanz-Alonso, Rebecca Willett

2603.20853 2026-03-24 stat.ME stat.AP

Correcting for Missing Data When Evaluating Surrogate Markers in a Clinical Trial

Sarah C. Lotspeich, P. D. Anh. Nguyen, Layla Parast

Comments 19 pages, 4 tables, 3 figures, R package and GitHub repository with simulation code

2603.20844 2026-03-24 stat.ME

A scalable Bayesian functional factor model for high-dimensional longitudinal molecular data

Salima Jaoua, Daniel Temko, Hélène Ruffieux

2603.20819 2026-03-24 cs.LG cs.SY eess.SY stat.ML

Achieving $\widetilde{O}(1/ε)$ Sample Complexity for Bilinear Systems Identification under Bounded Noises

Hongyu Yi, Chenbei Lu, Jing Yu

2603.20783 2026-03-24 stat.ME math.ST stat.TH

Ordinal Patterns Based Testing of Spatial Independence in Irregular Spatial Structures

Giorgio Micali, David Garnés-Galindo, Mariano Matilla-García, Manuel Ruiz-Marín

2603.20780 2026-03-24 stat.ME

Bregman projection for calibration estimation

Jae Kwang Kim, Yonghyun Kwon, Yumou Qiu

2603.20761 2026-03-24 math.ST quant-ph stat.TH

Asymptotic statistical theory of irreducible quantum Markov chains

Federico Girotti, Jukka Kiukas, Mădălin Guţă

Comments 92 pages, 6 figures, comments and suggestions are more than welcome

2603.20727 2026-03-24 stat.ME stat.AP

Compositional regression using principal nested spheres

Mymuna Monem, Ian L. Dryden, Florence George, Natalia Soares Quinete

Comments 19 pages, 8 figures, 1 table

2603.20716 2026-03-24 stat.ME

Testing for cross-quantilogram change

Chia-Min Chang, Yu-Hsiang Cheng, Tzee-Ming Huang

Comments 13 pages

2603.20696 2026-03-24 stat.ML cs.LG

High-dimensional online learning via asynchronous decomposition: Non-divergent results, dynamic regularization, and beyond

Shixiang Liu, Zhifan Li, Hanming Yang, Jianxin Yin

Comments 41 pages, 1 figure

2603.20671 2026-03-24 cs.LG stat.ML

Breaking the $O(\sqrt{T})$ Cumulative Constraint Violation Barrier while Achieving $O(\sqrt{T})$ Static Regret in Constrained Online Convex Optimization

Haricharan Balasundaram, Karthick Krishna Mahendran, Rahul Vaze

2603.20665 2026-03-24 stat.ME math.ST stat.TH

Continuity of the Solution of a Non-Parametric Bayesian Statistical Calibration Procedure

Akshay Prasadan, Donald Estep, Derek Bingham

Comments 25 pages

2603.20656 2026-03-24 stat.ML cs.AI cs.LG math.OC math.ST stat.TH

Sinkhorn Based Associative Memory Retrieval Using Spherical Hellinger Kantorovich Dynamics

Aratrika Mustafi, Soumya Mukherjee

2603.20631 2026-03-24 stat.ML cs.LG

LassoFlexNet: Flexible Neural Architecture for Tabular Data

Kry Yik Chau Lui, Cheng Chi, Kishore Basu, Yanshuai Cao

Comments 49 pages

2603.20624 2026-03-24 math.ST eess.SP stat.TH

Cross-Correlation Periodograms with Decaying Noise Floor for Power Spectral Density Estimation

Mark Magsino

2603.20602 2026-03-24 stat.ML cs.AI cs.LG

Interpretable Operator Learning for Inverse Problems via Adaptive Spectral Filtering: Convergence and Discretization Invariance

Hang-Cheng Dong, Pengcheng Cheng, Shuhuan Li

Comments 16 pages, 3 figures

2603.20601 2026-03-24 cs.DB stat.ME

Global Dataset of Solar Power Plants: Multidimensional Integration and Analysis

Anibal Mantilla-Guerra, Christian Mejia-Escobar, Jorge Azorin-Lopez, Jose Garcia-Rodriguez, Byron Fernando Tarco, Karen Santamaria

Comments 21 pages

2603.20585 2026-03-24 cs.LG stat.ML

RECLAIM: Cyclic Causal Discovery Amid Measurement Noise

Muralikrishnna G. Sethuraman, Faramarz Fekri

2603.20582 2026-03-24 q-fin.MF stat.ML

Generative Diffusion Model for Risk-Neutral Derivative Pricing

Nilay Tiwari

Comments 15 pages, 2 figures. Introduces a risk-neutral correction for diffusion models via a score function shift, with applications to derivative pricing

2603.18404 2026-03-24 stat.ML cs.LG stat.ME

Multi-Domain Empirical Bayes for Linearly-Mixed Causal Representations

Bohan Wu, Julius von Kügelgen, David M. Blei

2603.15232 2026-03-24 cs.LG math.ST stat.ML stat.TH

Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty

Arthur Charpentier, Agathe Fernandes Machado

2603.15182 2026-03-24 stat.ME cs.LG

Sequential Transport for Causal Mediation Analysis

Agathe Fernandes Machado, Iryna Voitsitska, Arthur Charpentier, Ewen Gallic

2603.01162 2026-03-24 cs.LG stat.ML

Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic

Hongyi Zhou, Kai Ye, Erhan Xu, Jin Zhu, Ying Yang, Shijin Gong, Chengchun Shi

Comments 5 pages, 53 figures

2602.22271 2026-03-24 cs.LG math.PR math.ST stat.TH

Support Tokens, Stability Margins, and a New Foundation for Robust LLMs

Deepak Agarwal, Dhyey Dharmendrakumar Mavani, Suyash Gupta, Karthik Sethuraman, Tejas Dharamsi

Comments 45 pages, 9 figures

2601.22481 2026-03-24 stat.ME stat.AP stat.ML

Changepoint Detection As Model Selection: A General Framework

Michael Grantham, Xueheng Shi, Bertrand Clarke

2512.19398 2026-03-24 stat.ME stat.CO

A Reduced Basis Decomposition Approach to Efficient Data Collection in Pairwise Comparison Studies

Jiahua Jiang, Joseph Marsh, Rowland G Seymour

Comments Author Accepted Manuscript

2511.10814 2026-03-24 math.PR math.ST stat.TH

Convergence of the extended Kalman filter with small and state-dependent noise

Ibrahim Mbouandi Njiasse, Florent Ouabo Kamkumo, Ralf Wunderlich

Comments 20 pages

2511.03115 2026-03-24 physics.med-ph stat.AP

SDE-based Monte Carlo dose calculation for proton therapy validated against Geant4

Christopher B. C. Dean, Maria L. Pérez-Lara, Emma Horton, Matthew Southerby, Jere Koskela, Andreas E. Kyprianou

Comments 30 pages, 11 figures

2510.22690 2026-03-24 math.ST math.PR stat.ME stat.TH

Stopping Rules for Monte Carlo Methods of Martingale Difference Type

Jiezhong Wu, Reiichiro Kawai

Comments 30 pages, 4 figures

2510.22083 2026-03-24 stat.ME

Ridge Boosting is Both Robust and Efficient

David Bruns-Smith, Zhongming Xie, Avi Feller

2510.03798 2026-03-24 cs.LG stat.ML

Robust Batched Bandits

Yunwen Guo, Yunlun Shu, Gongyi Zhuo, Tianyu Wang

Comments 39 pages

2509.20721 2026-03-24 cs.LG math.ST stat.ML stat.TH

Scaling Laws are Redundancy Laws

Yuda Bi, Vince D Calhoun

Comments This is not a serious research at this time

2509.07322 2026-03-24 stat.ME

Cumulative Marginal Mean Model for Assessing Sequential Effects Using Digital Health Data

Xingche Guo, Zexi Cai, Yuanjia Wang, Donglin Zeng

2508.07392 2026-03-24 cs.LG math.ST stat.ML stat.TH

Tight Bounds for Schrödinger Potential Estimation in Unpaired Data Translation

Nikita Puchkin, Denis Suchkov, Alexey Naumov, Denis Belomestny

Comments The 14th International Conference on Learning Representations (ICLR 2026)

2507.23646 2026-03-24 math.ST cs.IT math.DG math.IT math.PR q-fin.MF stat.TH

Information geometry of Lévy processes and financial models

Jaehyung Choi

Comments 22 pages

2507.16749 2026-03-24 stat.ME stat.ML

Bootstrapped Control Limits for Score-Based Concept Drift Control Charts

Jiezhong Wu, Daniel W. Apley

Comments 46 pages, 3 figures

2507.14869 2026-03-24 stat.CO math.PR

Bayesian Inversion via Probabilistic Cellular Automata: an application to image denoising

Danilo Costarelli, Michele Piconi, Alessio Troiani

2506.03467 2026-03-24 cs.IT cs.CR cs.LG eess.SP math.IT stat.ME

Differentially Private Distribution Release of Gaussian Mixture Models via KL-Divergence Minimization

Hang Liu, Anna Scaglione, Sean Peisert

Comments This work has been submitted to the IEEE for possible publication

2506.01619 2026-03-24 math.ST stat.TH

A projector-rank partition theorem for exact degrees of freedom in experimental design

Nagananda K G

Comments 26 pages

Details

DOI: 10.1016/j.jspi.2026.106409
Journal ref: Journal of Statistical Planning and Inference, 2026

Abstract

In many experimental designs -- split-plots, blocked or nested layouts, fractional factorials, and studies with missing or unequal replication -- standard ANOVA procedures no longer tell us exactly how many independent pieces of information each effect truly contributes. We provide a general degrees of freedom $(\mathrm{df})$ partition theorem that resolves this ambiguity. For $N$ observations, we show that the total information in the data (i.e., $N-1$ $\mathrm{df}$) can be split exactly across experimental effects and randomization strata by projecting the data onto each stratum and counting the $\mathrm{df}$ each effect contributes there. This yields integer $\mathrm{df}$ -- not approximations -- for any mix of fixed and random effects, blocking structures, fractionation, or imbalance. This result yields closed-form $\mathrm{df}$ tables for unbalanced split-plot, row-column, lattice, and crossed-nested designs. We introduce practical diagnostics -- the $\mathrm{df}$-retention ratio $ρ$, df deficiency $δ$, and variance-inflation index $α$ -- that measure exactly how many $\mathrm{df}$ an effect retains under blocking or fractionation and the resulting loss of precision, thereby extending Box-Hunter's resolution idea to multi-stratum and incomplete designs. Classical results emerge as corollaries: Cochran's one-stratum identity; Yates's split-plot $\mathrm{df}$; resolution-$R$ identified when an effect retains no $\mathrm{df}$. Empirical studies on split-plot and nested designs, a blocked fractional-factorial design-selection experiment, and timing benchmarks show that our approach delivers calibrated error rates, recovers information to raise power by up to 60% without additional runs, and is orders of magnitude faster than bootstrap-based $\mathrm{df}$ approximations.

2505.19731 2026-03-24 stat.ML cs.LG

Proximal Point Nash Learning from Human Feedback

Daniil Tiapkin, Daniele Calandriello, Denis Belomestny, Eric Moulines, Alexey Naumov, Kashif Rasul, Michal Valko, Pierre Menard

2505.12617 2026-03-24 stat.ME stat.AP

Double machine learning to estimate the effects of multiple treatments and their interactions

Qingyan Xiang, Yubai Yuan, Dongyuan Song, Usman J. Wudil, Muktar H. Aliyu, C. William Wester, Bryan E. Shepherd

2504.16780 2026-03-24 math.ST stat.ME stat.TH

Linear Regression Using Principal Components from General Hilbert-Space-Valued Covariates

Xinyi Li, Margaret Hoch, Michael R. Kosorok

2504.03097 2026-03-24 stat.ML cs.LG math.PR math.ST stat.TH

A Computational Transition for Detecting Multivariate Shuffled Linear Regression by Low-Degree Polynomials

Zhangsong Li

Comments 27 pages; improved exposition

2502.10010 2026-03-24 stat.ME

Principal Decomposition with Nested Submanifolds

Jiaji Su, Zhigang Yao

Comments 34 pages, 12 figures, 1 table

2501.02406 2026-03-24 stat.ML cs.AI cs.CL cs.IT cs.LG math.IT

A Training-free Method for LLM Text Attribution

Tara Radvand, Mojtaba Abdolmaleki, Mohamed Mostagir, Ambuj Tewari

Abstract

Verifying the provenance of content is crucial to the functioning of many organizations, e.g., educational institutions, social media platforms, and firms. This problem is becoming increasingly challenging as text generated by Large Language Models (LLMs) becomes almost indistinguishable from human-generated content. In addition, many institutions use in-house LLMs and want to ensure that external, non-sanctioned LLMs do not produce content within their institutions. In this paper, we answer the following question: Given a piece of text, can we identify whether it was produced by a particular LLM, while ensuring a guaranteed low false positive rate? We model LLM text as a sequential stochastic process with complete dependence on history. We then design zero-shot statistical tests to (i) distinguish between text generated by two different known sets of LLMs $A$ (non-sanctioned) and $B$ (in-house), and (ii) identify whether text was generated by a known LLM or by any unknown model. We prove that the Type I and Type II errors of our test decrease exponentially with the length of the text. We also extend our theory to black-box access via sampling and characterize the required sample size to obtain essentially the same Type I and Type II error upper bounds as in the white-box setting (i.e., with access to $A$). We show the tightness of our upper bounds by providing an information-theoretic lower bound. We next present numerical experiments to validate our theoretical results and assess their robustness in settings with adversarial post-editing. Our work has a host of practical applications in which determining the origin of a text is important and can also be useful for combating misinformation and ensuring compliance with emerging AI regulations. See https://github.com/TaraRadvand74/llm-text-detection for code, data, and an online demo of the project.

2412.20013 2026-03-24 stat.ME

Kendall's tau and Spearman's rho for normal location-scale and skew-normal scale mixture copulas

Ye Lu

2412.07971 2026-03-24 cs.LG cs.DC stat.ML

Effectiveness of Distributed Gradient Descent with Local Steps for Overparameterized Models

Heng Zhu, Harsh Vardhan, Arya Mazumdar

2411.17841 2026-03-24 stat.ME stat.AP

Bayesian defective Marshall-Olkin Gompertz model: an integrated approach to identifying cure fraction

Dionisio Alves-Neto, Vera Lucia Tomazella, Adriano Suzuki, Danilo Alvares

2410.18869 2026-03-24 math.PR math.AP math.OC math.ST q-fin.MF stat.TH

On the Mean-Field limit of diffusive games through the master equation: $L^{\infty}$ estimates and extreme value behavior

Erhan Bayraktar, Nikolaos Kolliopoulos

Comments 41 pages including references

2410.11151 2026-03-24 stat.ME cs.IT math.IT stat.AP

Discovering the critical number of respondents to validate an item in a questionnaire: The Binomial Cut-level Content Validity proposal

Helder Gomes Costa, Eduardo Shimoda, José Fabiano da Serra Costa, Aldo Shimoya, Edilvando Pereira Eufrazio

Comments 17 pages, 1 figure

2410.09027 2026-03-24 stat.ME cs.LG econ.EM stat.AP

Variance reduction combining pre-experiment and in-experiment data

Zhexiao Lin, Pablo Crespo

Comments Accepted to 5th Conference on Causal Learning and Reasoning (CLeaR), 2026

2408.05106 2026-03-24 stat.ME

Restricted Spatial Regression is Reasonable Statistical Practice: Clarifications, Interpretations, and New Developments

Jonathan R. Bradley

2407.18707 2026-03-24 cs.LG stat.ML

Finite Neural Networks as Mixtures of Gaussian Processes: From Provable Error Bounds to Prior Selection

Steven Adams, Andrea Patanè, Morteza Lahijanian, Luca Laurenti

Abstract

Infinitely wide or deep neural networks (NNs) with independent and identically distributed (i.i.d.) parameters have been shown to be equivalent to Gaussian processes. Because of the favorable properties of Gaussian processes, this equivalence is commonly employed to analyze neural networks and has led to various breakthroughs over the years. However, neural networks and Gaussian processes are equivalent only in the limit; in the finite case there are currently no methods available to approximate a trained neural network with a Gaussian model with bounds on the approximation error. In this work, we present an algorithmic framework to approximate a neural network of finite width and depth, and with not necessarily i.i.d. parameters, with a mixture of Gaussian processes with error bounds on the approximation error. In particular, we consider the Wasserstein distance to quantify the closeness between probabilistic models and, by relying on tools from optimal transport and Gaussian processes, we iteratively approximate the output distribution of each layer of the neural network as a mixture of Gaussian processes. Crucially, for any NN and $ε>0$ our approach is able to return a mixture of Gaussian processes that is $ε$-close to the NN at a finite set of input points. Furthermore, we rely on the differentiability of the resulting error bound to show how our approach can be employed to tune the parameters of a NN to mimic the functional behavior of a given Gaussian process, e.g., for prior selection in the context of Bayesian inference. We empirically investigate the effectiveness of our results on both regression and classification problems with various neural network architectures. Our experiments highlight how our results can represent an important step towards understanding neural network predictions and formally quantifying their uncertainty.

2407.05543 2026-03-24 stat.ME

Functional Principal Component Analysis for Sparse Censored Data

Caitrin Murphy, Eric Laber, Rhonda Merwin, Brian Reich, Jake Koerner

2407.02419 2026-03-24 quant-ph cs.LG stat.ML

Quantum Curriculum Learning

Quoc Hoan Tran, Yasuhiro Endo, Hirotaka Oshima

Comments Updated with schematic figures of quantum circuits and transparent explanation for Curriculum Learning

2406.16849 2026-03-24 math.ST math.PR stat.TH

Computationally tractable nonparametric bootstrap of high-dimensional sample covariance matrices

Holger Dette, Angelika Rohde

2404.04709 2026-03-24 econ.GN q-fin.EC stat.AP

Two-Sided Flexibility in Platforms

Daniel Freund, Sébastien Martin, Jiayu Kamessi Zhao

2403.10889 2026-03-24 cs.LG stat.ML

List Sample Compression and Uniform Convergence

Steve Hanneke, Shay Moran, Tom Waknine

2402.15127 2026-03-24 cs.LG cs.IT math.IT stat.ML

Asymptotically and Minimax Optimal Regret Bounds for Multi-Armed Bandits with Abstention

Junwen Yang, Tianyuan Jin, Vincent Y. F. Tan

Comments 36 pages

2401.09346 2026-03-24 stat.ML cs.LG

High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization

Wanrong Zhu, Zhipeng Lou, Ziyang Wei, Wei Biao Wu

2309.06053 2026-03-24 stat.ME math.ST stat.TH

Confounder selection via iterative graph expansion

F. Richard Guo, Qingyuan Zhao

Comments 31 pages; new notation and terminology; to appear in the Annals of Statistics

2305.10413 2026-03-24 stat.ML cs.LG math.ST stat.AP stat.TH

On Consistency of Signature Using Lasso

Xin Guo, Binnan Wang, Ruixun Zhang, Chaoyi Zhao

2305.03158 2026-03-24 stat.CO stat.ME

Quantile Importance Sampling

Jyotishka Datta, Nicholas G. Polson

Comments Fixed a few typos and errors, and added a real data example

2206.10143 2026-03-24 stat.ML cs.LG math.ST stat.ME stat.TH

Noise-contrastive Online Change Point Detection

Nikita Puchkin, Artur Goldman, Konstantin Yakovlev, Valeriia Dzis, Uliana Vinogradova

Comments The preliminary version of this paper was presented at the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023, PMLR 206:5686-5713)

2203.06573 2026-03-24 stat.ME

Homogeneity and Sub-homogeneity Pursuit: Iterative Complement Clustering PCA

Daning Bi, Le Chang, Yanrong Yang

2112.00414 2026-03-24 stat.ME math.ST stat.TH

AR-sieve Bootstrap for High-dimensional Time Series

Daning Bi, Han Lin Shang, Yanrong Yang, Huanjun Zhu

2603.20538 2026-03-24 cs.LG stat.ML

Understanding Behavior Cloning with Action Quantization

Haoqun Cao, Tengyang Xie

2603.20526 2026-03-24 cs.LG cs.AI stat.ML

Does This Gradient Spark Joy?

Ian Osband

2603.20520 2026-03-24 stat.ML cs.LG

CogFormer: Learn All Your Models Once

Jerry M. Huang, Lukas Schumacher, Niek Stevenson, Stefan T. Radev

2603.20467 2026-03-24 stat.ME cs.LG math.DS

Goal-oriented learning of stochastic dynamical systems using error bounds on path-space observables

Joanna Zou, Han Cheng Lie, Youssef Marzouk

2603.20464 2026-03-24 econ.EM stat.ME stat.ML

Double Machine Learning for Static Panel Data with Instrumental Variables: New Method and Applications

Anna Baiardi, Paul S. Clarke, Andrea A. Naghi, Annalivia Polselli

2603.20394 2026-03-24 econ.EM stat.ME

When are time series predictions causal? The potential system and dynamic causal effects

Jacob Carlson, Neil Shephard

2603.20392 2026-03-24 cs.LG cs.AI stat.ML

SymCircuit: Bayesian Structure Inference for Tractable Probabilistic Circuits via Entropy-Regularized Reinforcement Learning

Y. Sungtaek Ju

Comments 17 pages

2603.20388 2026-03-24 math.ST cs.LG econ.EM stat.ML stat.TH

From Cross-Validation to SURE: Asymptotic Risk of Tuned Regularized Estimators

Karun Adusumilli, Maximilian Kasy, Ashia Wilson

2603.20365 2026-03-24 stat.ML cs.AI cs.LG

Comprehensive Description of Uncertainty in Measurement for Representation and Propagation with Scalable Precision

Ali Darijani, Jürgen Beyerer, Zahra Sadat Hajseyed Nasrollah, Luisa Hoffmann, Michael Heizmann

2603.20349 2026-03-24 stat.ME stat.AP

Prediction intervals for overdispersed multinomial data with application to historical controls

Sören Budig, Frank Schaarschmidt, Max Menssen

2603.20345 2026-03-24 q-bio.QM stat.AP

Towards Improved Short-term Hypoglycemia Prediction and Diabetes Management based on Refined Heart Rate Data

Vaibhav Gupta, Florian Grensing, Beyza Cinar, Louisa van den Boom, Maria Maleshkova

Comments 10 pages, 2 tables

2603.20343 2026-03-24 stat.CO stat.AP

A practical introduction to ODE modelling in Stan for biological systems

Sara Hamis, John Forslund, Cici Chen Gu, Jodie A. Cochrane

Comments 23 pages, 10 figures

2603.20318 2026-03-24 stat.ME

Beyond Pairwise: Nonparametric Kernel Estimators for a Generalized Weitzman Coefficient Across k Distributions

Omar Eidous, Noura Almasri

Comments 15 pages, 1 figure, 4 tables

2603.20254 2026-03-24 cs.CY cs.AI stat.OT

AI Detectors Fail Diverse Student Populations: A Mathematical Framing of Structural Detection Limits

Nathan Garland

2603.20243 2026-03-24 q-fin.PR q-fin.MF stat.AP

Two-Factor Hull-White Model Revisited: Correlation Structure for Two-Factor Interest Rate Model in CVA Calculation

Osamu Tsuchiya

2603.20241 2026-03-24 cond-mat.mtrl-sci stat.AP

Probabilistic calibration of crystal plasticity material models with synthetic global and local data

Joshua D. Pribe, Patrick E. Leser, Saikumar R. Yeratapally, George Weber

Abstract

Crystal plasticity models connect macroscopic deformation with the physics of microscale slip in polycrystalline materials. These models can be calibrated using global stress-strain curves, but the resulting parametrization is often not unique: multiple parametrizations can predict the same global behavior but different local, grain-scale behavior. Using local data for calibration can mitigate uniqueness issues, but expensive specialized experiments like high-energy X-ray diffraction microscopy (HEDM) are typically required to gather the data. The computational expense of full-field simulations also often prevents uncertainty quantification with sampling-based calibration algorithms like Markov chain Monte Carlo. This study presents a two-stage calibration procedure that combines global and local data and balances the efficiency of a surrogate model with the accuracy of full-field crystal plasticity simulations. The procedure quantifies uncertainty using Bayesian inference with an efficient, parallelized sequential Monte Carlo algorithm. Calibrations are completed using synthetic data with a microstructure representative of Inconel 718 to assess uncertainty and accuracy of the parameters relative to a known ground truth. Global data comes from the uniaxial stress-strain curve, while local data comes from grain-average stresses, reflecting typical outputs of HEDM experiments. Additional calibrations with limited and noisy local data demonstrate robustness of the procedure and identify the most important features of the data. Overall, the results demonstrate the computational efficiency of the two-stage procedure and the value of local data for reducing parameter uncertainty. In addition, joint distributions of the calibrated parameters highlight key considerations in choosing constitutive models and calibration data, including challenges resulting from correlated parameters.

2602.12683 2026-03-24 cs.LG stat.ML

Flow Matching from Viewpoint of Proximal Operators

Kenji Fukumizu, Wei Huang, Han Bao, Shuntuo Xu, Nisha Chandramoorthy

Comments 38 pages, 6 figures

2601.10878 2026-03-24 astro-ph.IM stat.AP

Optimal and Unbiased Fluxes from Up-the-Ramp Detectors under Variable Illumination

Bowen Li, Kevin A. McKinnon, Andrew K. Saydjari, Conor Sayres, Gwendolyn M. Eadie, Andrew R. Casey, Jon A. Holtzman, Timothy D. Brandt, Jose G. Fernandez-Trincado

Comments 22 pages, 20 figures

2512.09708 2026-03-24 math.ST stat.TH

A simple geometric proof for the characterisation of e-merging functions

Eugenio Clerico

Comments 4 pages

2506.20789 2026-03-24 math.PR math.ST stat.TH

Central limit theory for Peaks-over-Threshold partial sums of long memory linear time series

Ioan Scheffel, Marco Oesting, Gilles Stupfler

Comments 61 pages, 4 figures, accepted for publication in Stochastic Processes and their Applications (2026)

2505.06760 2026-03-24 stat.ME

Quantifying uncertainty and stability among highly correlated predictors: a subspace perspective

Xiaozhu Zhang, Jacob Bien, Armeen Taeb

2501.16562 2026-03-24 cs.LG stat.ME

C-HDNet: Hyperdimensional Computing for Causal Effect Estimation from Observational Data Under Network Interference

Abhishek Dalvi, Neil Ashtekar, Vasant Honavar

Comments Published at Social Network Analysis and Mining

2403.19157 2026-03-24 math.PR math-ph math.MP math.ST stat.TH

Correlation functions between singular values and eigenvalues

Matthias Allard, Mario Kieburg

Comments 42 pages, 1 figure. Updated version: Peer reviewed version

2312.03257 2026-03-24 stat.ME stat.AP

Bayesian Functional Analysis for Untargeted Metabolomics Data with Matching Uncertainty and Small Sample Sizes

Guoxuan Ma, Jian Kang, Tianwei Yu

Details

DOI: 10.1093/bib/bbae141
Journal ref: Briefings in Bioinformatics, Volume 25, Issue 3, May 2024, bbae141

Abstract

Untargeted metabolomics based on liquid chromatography-mass spectrometry technology is quickly gaining widespread application given its ability to depict the global metabolic pattern in biological samples. However, the data is noisy and plagued by the lack of clear identity of data features measured from samples. Multiple potential matchings exist between data features and known metabolites, while the truth can only be one-to-one matches. Some existing methods attempt to reduce the matching uncertainty, but are far from being able to remove the uncertainty for most features. The existence of the uncertainty causes major difficulty in downstream functional analysis. To address these issues, we develop a novel approach for Bayesian Analysis of Untargeted Metabolomics data (BAUM) to integrate previously separate tasks into a single framework, including matching uncertainty inference, metabolite selection, and functional analysis. By incorporating the knowledge graph between variables and using relatively simple assumptions, BAUM can analyze datasets with small sample sizes. By allowing different confidence levels of feature-metabolite matching, the method is applicable to datasets in which feature identities are partially known. Simulation studies demonstrate that, compared with other existing methods, BAUM achieves better accuracy in selecting important metabolites that tend to be functionally consistent and assigning confidence scores to feature-metabolite matches. We analyze a COVID-19 metabolomics dataset and a mouse brain metabolomics dataset using BAUM. Even with a very small sample size of 16 mice per group, BAUM is robust and stable. It finds pathways that conform to existing knowledge, as well as novel pathways that are biologically plausible.

2311.05649 2026-03-24 stat.AP stat.ME

Bayesian Image-on-Image Regression via Deep Kernel Learning based Gaussian Processes

Guoxuan Ma, Bangyao Zhao, Hasan Abu-Amara, Jian Kang