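arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.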

2605.01628 2026-05-05 stat.ML cs.LG math.ST stat.TH

Self-Normalized Martingales and Uniform Regret Bounds for Linear Regression

Fan Chen, Jian Qian, Alexander Rakhlin, Nikita Zhivotovskiy

Abstract

Self-normalized martingale inequalities lie at the heart of confidence ellipsoids for online least squares and, more broadly, many bandit and reinforcement-learning results. Yet existing vector and scalar results typically rely on bounded covariates and an explicit regularization matrix, producing bounds that are \emph{not scale-invariant}: although the self-normalized quantity is scale-invariant by definition, its standard upper bounds are not. We characterize when scale-invariant upper bounds on self-normalized martingales are possible. Without further assumptions, we prove that nontrivial scale-invariant bounds exist only in dimension $d=1$; moreover, in $d=1$ we obtain $O(\log T)$ scale-invariant self-normalized bounds without any assumptions on the covariates. In contrast, for $d>1$ we show that no nontrivial scale-invariant bound can hold in full generality. We then connect this dichotomy to \emph{doubly-uniform} regret in online linear regression (i.e., regret bounds that are simultaneously independent of the covariate scale and the comparator norm) and use it to resolve the open question of Gaillard, Gerchinovitz, Huard, and Stoltz, \emph{``Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss''} (ALT 2019): in $d=1$ we give an explicit algorithm with $O(\log T)$ doubly-uniform regret, whereas for $d>1$ sublinear doubly-uniform regret is impossible. Finally, under a natural \emph{smoothness} condition (bounded Radon--Nikodym derivatives of the conditional covariate laws with respect to a fixed base measure), we recover sublinear regret for $d>1$ without bounded covariates and derive a self-normalized concentration inequality free of the usual regularization penalties, yielding arguably a first natural scale-invariant bound for adaptive, non-i.i.d. vector martingales.
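The scale-invariance gap described in the abstract can be illustrated numerically in dimension $d=1$. The sketch below is not taken from the paper: it assumes the self-normalized quantity is $S_T^2/V_T$ with $S_T=\sum_t x_t\varepsilon_t$ and $V_T=\sum_t x_t^2$, and it uses the familiar regularized bound $2\sigma^2\log\big(\sqrt{(V_T+\lambda)/\lambda}/\delta\big)$ as a stand-in for the "standard upper bounds" the abstract mentions. The function names, the i.i.d. Gaussian data, and the values of `lam`, `delta`, and `sigma` are all hypothetical choices made for illustration.

```python
import math
import random


def self_normalized(xs, eps):
    """Return S_T^2 / V_T with S_T = sum x_t * eps_t and V_T = sum x_t^2 (d = 1)."""
    s = sum(x * e for x, e in zip(xs, eps))
    v = sum(x * x for x in xs)
    return s * s / v


def regularized_bound(xs, lam=1.0, delta=0.05, sigma=1.0):
    """Right-hand side of a regularized self-normalized bound in d = 1.

    Assumed form (not quoted from the paper):
    2 * sigma^2 * log( sqrt((V_T + lam) / lam) / delta ).
    """
    v = sum(x * x for x in xs)
    return 2.0 * sigma ** 2 * math.log(math.sqrt((v + lam) / lam) / delta)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]   # covariates (i.i.d. here for simplicity)
eps = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # noise
scaled = [100.0 * x for x in xs]                  # same covariates in different units

ratio = self_normalized(xs, eps)
ratio_scaled = self_normalized(scaled, eps)
bound = regularized_bound(xs)
bound_scaled = regularized_bound(scaled)

# The self-normalized quantity is unchanged by rescaling the covariates,
# while the regularized bound grows with the covariate scale.
print(math.isclose(ratio, ratio_scaled, rel_tol=1e-9), bound_scaled > bound)
```

Rescaling the covariates by 100 leaves the left-hand side untouched but inflates the bound through the ratio $V_T/\lambda$, which is the mismatch the paper sets out to characterize.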

2605.01624 2026-05-05 math.AT stat.AP stat.ML

Persistent Homology of Time Series through Complex Networks

İsmail Güzel

2605.01615 2026-05-05 stat.ME stat.AP stat.OT

Threshold Exceedance Estimation in Spatially Correlated Areal Data Using Maxima-Nominated Sampling

Mohammad Jafari Jozani

Comments 26 pages, 4 figures, 6 tables

2605.01608 2026-05-05 eess.SP stat.ME stat.ML

Why Model Selection Fails in Time Series Forecasting: An Empirical Study of Instability Across Data Regimes

Tahir Cetin Akinci, Alfredo A. Martinez-Morales

2605.01606 2026-05-05 stat.ME math.ST stat.TH

L-Estimation of Population Quantiles Using Ranked Set Sampling

Mohammad Jafari Jozani, Ehsan Zamanzade, Reza Modarre

Comments 33 pages, 5 figures, 1 table

2605.01603 2026-05-05 stat.CO

dirichletprocess: An R Package for Fitting Complex Bayesian Nonparametric Models

Gordon J. Ross, Dean Markwick, Priyanshu Tiwari

2605.01586 2026-05-05 stat.CO math.ST stat.TH

The Pearson IV distribution: Random variate generation and applications

Luc Devroye, Joe R. Hill

2605.01579 2026-05-05 stat.ME cs.LG

Minimum Specification Perturbation: Robustness as Distance-to-Falsification in Causal Inference

Hoang Dang, Luan Pham, Minh Nguyen

Comments 36 pages, 2 figures

2605.01571 2026-05-05 stat.OT stat.CO

Functional Liu Regression for Scalar-on-Functional Models in High-Dimensional Settings

Shaista Ashraf, Stephen Becker, Farrukh Javed, Ismail Shah

2605.01492 2026-05-05 stat.ML cs.IT cs.LG math.IT

Stabilizing Private LASSO under Heterogeneous Covariates via Anisotropic Objective Perturbation

Haruka Tanzawa, Ayaka Sakata

Comments 6 pages, 5 figures

2605.01484 2026-05-05 cs.LG stat.ML

Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks

Sunil Kumar Maurya, Xin Liu

Comments Accepted to ACL 2026 Main Conference

2605.01452 2026-05-05 stat.ME cs.LG

Stable Localized Conformal Prediction via Transduction

Yinjie Min, Liuhua Peng, Changliang Zou

2605.01379 2026-05-05 stat.ME

Federated generalized linear mixed models based on one-time shared summary statistics

Marie Analiz April Limpoco, Christel Faes, Niel Hens

2605.01363 2026-05-05 hep-ex cs.LG hep-ph stat.ME

Data-Driven, Geometry-Aware Optimal-Transport Calibration of Flavor Tagger

Yeonjoon Kim, Un-ki Yang

Comments 32 pages, 12 figures

2605.01335 2026-05-05 stat.ML cs.LG math.ST stat.TH

Mean Testing under Truncation beyond Gaussian

Yuhao Wang, Roberto Imbuzeiro Oliveira, Themis Gouleakis

2605.01312 2026-05-05 stat.ME

Exploring Multivariate Data Using Median Absolute Deviation Depth

Elsayed Elamir

Comments 21 pages, 7 figures

2605.01311 2026-05-05 cs.LG econ.EM stat.AP stat.ML

The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice

Jikai Jin, Vasilis Syrgkanis

2605.01262 2026-05-05 stat.AP

Factor State Space Modelling of the Ornstein-Uhlenbeck Process with Measurement Error and its Application

Shanglun Li, Toby Kenney, Hong Gu

2605.01237 2026-05-05 math.ST stat.TH

An Exact Pointwise Characterization for Total Variation Denoising in Quantile Regression

Deep Ghoshal, Sabyasachi Chatterjee

Abstract

Total variation denoising (TVD) is a classical method for denoising and curve fitting, yet an explicit pointwise description of its fitted values has only recently been established in the mean regression setting by arXiv:2410.03041v4. This raises the question of whether a similar representation holds for quantile regression. We answer this question affirmatively by deriving an exact minmax/maxmin representation for the quantile TVD estimator, providing a complete pointwise characterization of its solution set. Given that the quantile TVD estimator is generally non-unique, the existence of such a representation is perhaps surprising. We show that the set of admissible fitted values at any location forms a compact interval, whose endpoints are characterized exactly by minmax/maxmin functionals of local order statistics over nested intervals. We next develop several structural properties of the quantile TVD solution set. First, the solution set is closed under coordinatewise maximum and minimum, guaranteeing the existence of extremal elements -- upper and lower envelope solutions. Second, this reveals that quantile TVD is intrinsically non-crossing across quantile levels when a common tuning parameter is used. We prove this is driven by submodularity of the total variation penalty, and show that any penalized quantile regression estimator with a submodular penalty enjoys this property. From an estimation error perspective, our representation enables a refined pointwise analysis via a transparent local bias-variance decomposition, facilitating new pointwise risk bounds and near-optimal rates for locally Hölder smooth functions. Our results hold under heavy-tailed noise (e.g., Cauchy) and substantially extend existing guarantees beyond locally constant signals. Altogether, these results advance the theory of quantile TV regression via exact pointwise minmax/maxmin representations.
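The non-uniqueness noted in the abstract, and the claim that admissible fitted values form a compact interval, can be seen on a two-point toy problem. The sketch below is not the paper's method: it assumes the usual quantile TVD objective $\sum_i \rho_\tau(y_i-\theta_i)+\lambda\sum_i|\theta_{i+1}-\theta_i|$ with pinball loss $\rho_\tau(u)=\max(\tau u,(\tau-1)u)$, and it finds minimizers by brute-force grid search. The data `y`, the values of `tau` and `lam`, and the grid are hypothetical choices made for illustration.

```python
def pinball(u, tau):
    """Pinball (check) loss rho_tau(u) = max(tau * u, (tau - 1) * u)."""
    return max(tau * u, (tau - 1.0) * u)


def quantile_tv_objective(theta, y, tau, lam):
    """Assumed quantile TVD objective: pinball fit term plus lam times total variation."""
    fit = sum(pinball(yi - ti, tau) for yi, ti in zip(y, theta))
    tv = sum(abs(b - a) for a, b in zip(theta, theta[1:]))
    return fit + lam * tv


y = [0.0, 1.0]        # two observations
tau, lam = 0.5, 1.0   # median regression with a strong TV penalty

# Brute-force search over candidate fits (a, b) on a grid from -0.5 to 1.5.
grid = [i / 20 for i in range(-10, 31)]
values = {
    (a, b): quantile_tv_objective([a, b], y, tau, lam)
    for a in grid
    for b in grid
}
best = min(values.values())
minimizers = sorted(ab for ab, v in values.items() if abs(v - best) < 1e-12)

# Every minimizer on the grid is a constant fit (a == b), and the fitted
# value at either location ranges over the whole interval [0, 1].
print(best, minimizers[0], minimizers[-1], len(minimizers))
```

On this toy problem the minimum objective value is 0.5, attained by every constant fit with value in $[0,1]$, so the estimator is non-unique and the set of admissible fitted values at each location is the compact interval $[0,1]$.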

2605.01198 2026-05-05 stat.CO stat.ME

Modular Markov chain Monte Carlo with application to multimodal sampling

Joonha Park

2605.01172 2026-05-05 cs.LG stat.ML

A Theory of Generalization in Deep Learning

Elon Litman, Gabe Guo

2605.01157 2026-05-05 stat.ME

Coarse-to-fine spatial GLMM for scalable prediction and multiscale analysis

Daisuke Murakami, Alexis Comber, Takahiro Yoshida, Narumasa Tsutsumida, Chris Brunsdon, Tomoki Nakaya

2605.01136 2026-05-05 cs.LG cs.SI math.SP stat.ML

Spectral Graph Sparsification Preserves Representation Geometry in Graph Neural Networks

Sanjukta Krishnagopal

Comments 9 pages, 4 figures

2605.01118 2026-05-05 stat.ME

Nonparametric density estimation with a parametric start

Nils Lid Hjort, Ingrid Kristine Glad

Comments 31 pages, no figures. This is the original publication for the Hjort-Glad density estimator, Statistical Research Report, Department of Mathematics, University of Oslo, January 1994, with more material than for the published article Annals of Statistics, 1995, vol. 23, pages 882-904

2605.01114 2026-05-05 stat.ME

A formal approach to variable selection in difference-in-differences

Daniela Rodrigues, Laura A. Hatfield

2605.01110 2026-05-05 cs.LG cs.SI math.AT stat.ML

Topological Neural Tangent Kernel

Sanjukta Krishnagopal

Comments 9 pages, 4 figures

2605.01107 2026-05-05 cs.LG cond-mat.dis-nn stat.ML

Diffusion Operator Geometry of Feedforward Representations

Kanishka Reddy

2605.01089 2026-05-05 cs.LG math.PR stat.CO

Learning Discriminators for Resampling in the Ensemble Gaussian Mixture Filter through a Normalizing Flow Approach

Zain Jabbar, Andrey A. Popov

2605.01062 2026-05-05 stat.ME math.ST stat.TH

Single Change-Point Detection via Energy Distance with Application to Genomic Data

Suthakaran Ratnasingam

Comments 25 pages, 8 figures, 3 tables

2605.01052 2026-05-05 quant-ph math.ST physics.data-an physics.optics stat.TH

Entropic Reciprocity in Time-Reversed Young Interferometry

Jianming Wen

Comments This work provides an explicit definition of time reversal based on information theory

2605.01003 2026-05-05 stat.ME cs.LG eess.SP

Pi-Change: A Prior-Informed Multiple Change Point Detection Algorithm

Jonathon Jacobs, Shanshan Chen

2605.00966 2026-05-05 cs.LG cs.NE q-bio.NC stat.ML

Robust volatility updates for Hierarchical Gaussian Filtering

Christoph Mathys, Nicolas Legrand, Peter Thestrup Waade, Nace Mikus, Lilian Aline Weber

2605.00855 2026-05-05 math.OC cs.LG stat.ML

An Efficient Spatial Branch-and-Bound Algorithm for Global Optimization of Gaussian Process Posterior Mean Functions

Wei-Ting Tang, Akshay Kudva, Calvin Tsay, Joel A. Paulson

2604.22902 2026-05-05 stat.ME stat.ML

Design, Cups, and Blankets. A Free-Energy-Principle-Based Approach to Product Design

Luca M. Possati

2604.22791 2026-05-05 stat.CO cs.SI stat.OT

R Package iglm: Regression under Interference in Connected Populations

Cornelius Fritz, Michael Schweinberger

2604.14240 2026-05-05 cs.AI cs.LG stat.ML

Interpretable and Explainable Surrogate Modeling for Simulations: A State-of-the-Art Survey and Perspectives on Explainable AI for Decision-Making

Pramudita Satria Palar, Paul Saves, Muhammad Daffa Robani, Nicolas Verstaevel, Moncef Garouani, Julien Aligon, Koji Shimoyama, Joseph Morlier, Benoit Gaudou

Comments Accepted for publication in Archives of Computational Methods in Engineering, 2026

DOI: 10.1007/s11831-026-10600-z

Abstract

The simulation of complex systems increasingly relies on sophisticated but fundamentally opaque computational black-box simulators. Surrogate models play a central role in reducing the computational cost of complex systems simulations across a wide range of scientific and engineering domains. Notwithstanding, they inevitably inherit and often exacerbate this black-box nature, obscuring how input variables drive physical responses. Conversely, Explainable Artificial Intelligence (XAI) offers powerful tools to unpack these models. Yet, XAI methods struggle with engineering-specific constraints, such as highly correlated inputs, dynamical systems, and rigorous reliability requirements. Consequently, surrogate modeling and XAI have largely evolved as distinct fields of research, despite their strong complementarity. To reconnect these approaches, this state-of-the-art survey provides a structured perspective that maps existing XAI techniques onto the various stages of surrogate modeling workflows for design and exploration. To ground this synthesis, we draw upon illustrative applications across both equation-based simulations and agent-based modeling. We survey a broad spectrum of techniques, highlighting their strengths for revealing interactions and supporting human comprehension. Finally, we identify pressing open challenges, including the explainability of dynamical systems and the handling of mixed-variable systems, and propose a research agenda to make explainability a core, embedded element of simulation-driven workflows from model construction through decision-making. By transforming opaque emulators into explainable tools, this agenda empowers practitioners to move beyond accelerating simulations to extracting actionable insights from complex system behaviors.

2604.04249 2026-05-05 stat.ME

The arithmetic-harmonic inequality index: Theory, inference, and finite-sample analysis

Roberto Vila, Helton Saulo

Comments 17 pages, 5 figures

2603.11907 2026-05-05 cs.LG stat.ME

Causal Representation Learning with Optimal Compression under Complex Treatments

Wanting Liang, Haoang Chi, Zhiheng Zhang

2602.14861 2026-05-05 math.ST stat.ME stat.TH

Bias analysis of a linear order-statistic inequality index estimator: Unbiasedness under gamma populations

Roberto Vila, Helton Saulo

Comments 18 pages

2510.01020 2026-05-05 cs.LG cs.AI math.ST stat.ML stat.TH

The Good, the Bad, and the Sampled: a No-Regret Approach to Safe Online Classification

Tavor Z. Baharav, Spyros Dragazis, Aldo Pacchiano

Comments 38 pages, accepted to AISTATS 2026

2509.09723 2026-05-05 cs.CL cs.AI cs.LG stat.ME

ALIGNS: Unlocking nomological networks in psychological measurement through a large language model

Kai R. Larsen, Sen Yan, Roland M. Mueller, Lan Sang, Mikko Rönkkö, Ravi Starzl, Donald Edmondson

Comments Error in algorithm explanation

2508.12674 2026-05-05 stat.ML cs.LG cs.SI

Unfolded Laplacian Spectral Embedding: A Theoretically Grounded Approach to Dynamic Network Representation

Haruka Ezoe, Hiroki Matsumoto, Ryohei Hisano

2411.07874 2026-05-05 stat.ME math.ST stat.TH

Changepoint Detection in Complex Models: Cross-Fitting Is Needed

Chengde Qian, Guanghui Wang, Zhaojun Wang, Changliang Zou

2409.02399 2026-05-05 stat.CO math.OC

Guidance for twisted particle filter: a continuous-time perspective

Jianfeng Lu, Yuliang Wang

2307.01150 2026-05-05 stat.ME math.ST stat.TH

Reliever: Relieving the Burden of Costly Model Fits for Changepoint Detection

Chengde Qian, Guanghui Wang, Changliang Zou

2212.10406 2026-05-05 stat.ME stat.AP

GEEPERs: Principal Stratification using Principal Scores and Stacked Estimating Equations

Adam C. Sales, Kirk P. Vanacore, Erin R. Ottmar

2209.14859 2026-05-05 math.ST math.PR stat.ML stat.TH

Exact Recovery of Community Detection in dependent Gaussian Mixture Models

Zhongyang Li, Sichen Yang

2007.02392 2026-05-05 cs.LG cs.DS math.ST stat.CO stat.ML stat.TH

Efficient Parameter Estimation of Truncated Boolean Product Distributions

Dimitris Fotakis, Alkis Kalavasis, Christos Tzamos

Comments 33rd Conference on Learning Theory (COLT 2020)

1910.09876 2026-05-05 cs.LG stat.ML

Neural Network Training with Approximate Logarithmic Computations

Arnab Sanyal, Peter A. Beerel, Keith M. Chugg

1511.04803 2026-05-05 stat.ME

Additive Logistic Models as Interpretable Likelihood-Ratio Scores for AUC-Based Classification

Yuan-chin Ivan Chang

Comments 42