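arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.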

2602.17633 2026-02-20 cs.LG cs.AI stat.ML

When to Trust the Cheap Check: Weak and Strong Verification for Reasoning

Shayan Kiyani, Sima Noorani, George Pappas, Hamed Hassani

Abstract

Reasoning with LLMs increasingly unfolds inside a broader verification loop. Internally, systems use cheap checks, such as self-consistency or proxy rewards, which we call weak verification. Externally, users inspect outputs and steer the model through feedback until results are trustworthy, which we call strong verification. These signals differ sharply in cost and reliability: strong verification can establish trust but is resource-intensive, while weak verification is fast and scalable but noisy and imperfect. We formalize this tension through weak--strong verification policies, which decide when to accept or reject based on weak verification and when to defer to strong verification. We introduce metrics capturing incorrect acceptance, incorrect rejection, and strong-verification frequency. Over population, we show that optimal policies admit a two-threshold structure and that calibration and sharpness govern the value of weak verifiers. Building on this, we develop an online algorithm that provably controls acceptance and rejection errors without assumptions on the query stream, the language model, or the weak verifier.
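The two-threshold structure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the weak verifier returns a score in [0, 1] where higher means more likely correct; the function name `two_threshold_policy`, the `Decision` enum, and the threshold values 0.2 and 0.8 are hypothetical and chosen for illustration only, not taken from the paper.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    DEFER = "defer"  # escalate to strong verification


def two_threshold_policy(weak_score: float, lower: float, upper: float) -> Decision:
    """Map a weak-verifier score to a decision (hypothetical helper).

    Assumption: scores at or above `upper` are accepted, scores at or
    below `lower` are rejected, and anything in between is deferred to
    the costly strong verifier.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if weak_score >= upper:
        return Decision.ACCEPT
    if weak_score <= lower:
        return Decision.REJECT
    return Decision.DEFER


# Illustrative scores and thresholds (not from the paper).
scores = [0.05, 0.40, 0.95]
decisions = [two_threshold_policy(s, lower=0.2, upper=0.8) for s in scores]
```

Widening the gap between the two thresholds sends more queries to strong verification, which is the cost side of the trade-off the abstract describes; how the paper's online algorithm actually sets the thresholds is not specified here.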

2602.17608 2026-02-20 cs.LG cs.AI stat.ML

Towards Anytime-Valid Statistical Watermarking

Baihe Huang, Eric Xu, Kannan Ramchandran, Jiantao Jiao, Michael I. Jordan

2602.17603 2026-02-20 stat.ML stat.AP

SOLVAR: Fast covariance-based heterogeneity analysis with pose refinement for cryo-EM

Roey Yadgar, Roy R. Lederman, Yoel Shkolnisky

2602.17592 2026-02-20 stat.ME stat.AP

BMW: Bayesian Model-Assisted Adaptive Phase II Clinical Trial Design for Win Ratio Statistic

Di Zhu, Yong Zang

Comments 32 pages, 2 figures

2602.17577 2026-02-20 cs.DS cs.LG stat.ML

Simultaneous Blackwell Approachability and Applications to Multiclass Omniprediction

Lunjia Hu, Kevin Tian, Chutong Yang

2602.17565 2026-02-20 math.ST cs.LG stat.ML stat.TH

Optimal Unconstrained Self-Distillation in Ridge Regression: Strict Improvements, Precise Asymptotics, and One-Shot Tuning

Hien Dang, Pratik Patil, Alessandro Rinaldo

Comments 78 pages, 25 figures

2602.17543 2026-02-20 stat.ML cs.LG econ.EM math.ST stat.ME stat.TH

genriesz: A Python Package for Automatic Debiased Machine Learning with Generalized Riesz Regression

Masahiro Kato

2602.17414 2026-02-20 stat.CO astro-ph.IM stat.ME

Nested Sampling with Slice-within-Gibbs: Efficient Evidence Calculation for Hierarchical Bayesian Models

David Yallup

Comments 26 pages, 6 figures

2602.16265 2026-02-20 stat.ML cs.LG

On sparsity, extremal structure, and monotonicity properties of Wasserstein and Gromov-Wasserstein optimal transport plans

Titouan Vayer

2601.16174 2026-02-20 stat.ML cs.LG

Beyond Predictive Uncertainty: Reliable Representation Learning with Structural Constraints

Yiyao Yang

Comments 22 pages, 5 figures, 5 propositions

2601.14969 2026-02-20 q-bio.GN stat.ML

Robust Machine Learning for Regulatory Sequence Modeling under Biological and Technical Distribution Shifts

Yiyao Yang

Comments 20 pages, 16 figures

2512.18118 2026-02-20 stat.AP stat.ME

Distribution-Free Selection of Low-Risk Oncology Patients for Survival Beyond a Time Horizon

Matteo Sesia, Vladimir Svetnik

2511.11166 2026-02-20 stat.ME

Choosing the nominal level post-hoc with knockoffs using e-values

Lasse Fischer, Konstantinos Sechidis

2511.08192 2026-02-20 stat.ME

Geometric modelling of spatial extremes

Lydia Kakampakou, Jennifer L. Wadsworth

Comments 35 pages, 15 figures

2509.03315 2026-02-20 stat.ME

The super learner for time-to-event outcomes: A tutorial

Ruth H. Keogh, Karla Diaz-Ordaz, Nan van Geloven, Jon Michael Gran, Kamaryn T. Tanner

2507.05634 2026-02-20 math.ST stat.TH

A Note on Inferential Decisions, Errors and Path-Dependency

Kangda K. Wren

Comments 12 pages: 1 highlight, 7 main text, 3 appendix and 1 bibliography

2503.17338 2026-02-20 cs.AI cs.LG stat.ML

Capturing Individual Human Preferences with Reward Features

André Barreto, Vincent Dumoulin, Yiran Mao, Mark Rowland, Nicolas Perez-Nieves, Bobak Shahriari, Yann Dauphin, Doina Precup, Hugo Larochelle

Comments Published at NeurIPS 2025

2501.14118 2026-02-20 cs.LG stat.AP stat.ML

Selecting Critical Scenarios of DER Adoption in Distribution Grids Using Bayesian Optimization

Olivier Mulkin, Miguel Heleno, Mike Ludkovski

Comments 12 pages, 4 tables, 12 figures

2411.02137 2026-02-20 math.ST cs.LG stat.ML stat.TH

Finite-sample performance of the maximum likelihood estimator in logistic regression

Hugo Chardon, Matthieu Lerasle, Jaouad Mourtada

Comments Minor revision

2408.13220 2026-02-20 stat.CO stat.AP

A New Perspective to Fish Trajectory Imputation: A Methodology for Spatiotemporal Modeling of Acoustically Tagged Fish Data

Mahshid Ahmadian, Edward L. Boone, Grace S. Chiu

2408.10478 2026-02-20 stat.ME

Reconciliating Bayesian and frequentist approaches to robustness against outliers

Philippe Gagnon, Alain Desgagné

Abstract

Heavy-tailed models are used as a way to gain robustness against outliers in Bayesian analyses. In frequentist analyses, M-estimators are often employed. In this paper, the two approaches are tentatively reconciled by considering M-estimators as maximum likelihood estimators of heavy-tailed models. From this perspective, it is realized that a fundamental difference exists as frequentists, contrarily to Bayesians, do not require these heavy-tailed models to be proper. For instance, a popular robust estimator in linear regression, Tukey's biweight M-estimator, does not correspond to a proper heavy-tailed model. Thus, a Bayesian practitioner does not have access to the same range of tools as a frequentist practitioner. It is shown through two real-data linear regression analyses that the former may in consequence obtain significantly different estimation results than the latter, where the difference is due to a more pronounced influence by the outliers in the former case. It is highlighted that a way to give these practitioners access to the same range of tools is for the Bayesian to adopt the generalized Bayesian framework of Bissiri et al. (2016) which allows the use of improper models (Jewson and Rossell, 2022), in combination with proper prior distributions yielding proper generalized posterior distributions. A complete reconciliation of the Bayesian and frequentist approaches to robustness is then achieved. An extensive theoretical study of the generalized Bayesian counterpart of Tukey's biweight M-estimator is provided, which includes a robustness characterization result and a Bernstein--von Mises result, the latter allowing to calibrate the generalized posterior distribution for meaningful uncertainty quantification. After adopting the generalized Bayesian framework, the Bayesian practitioner obtains similar results as the frequentist practitioner in the aforementioned examples.
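To see why Tukey's biweight does not correspond to a proper model, a small sketch may help. It uses the standard textbook form of the biweight loss with the common tuning constant c = 4.685; the abstract does not state the parametrization the authors use, so this form and the helper names are assumptions. Because the loss is constant beyond c, the implied density exp(-rho(r)) has flat, strictly positive tails and cannot integrate to a finite value.

```python
import math


def tukey_biweight_rho(r: float, c: float = 4.685) -> float:
    """Standard Tukey biweight loss (assumed parametrization)."""
    if abs(r) <= c:
        return (c * c / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    # Beyond c the loss is constant, so outliers stop adding penalty.
    return c * c / 6.0


def unnormalized_density(r: float, c: float = 4.685) -> float:
    """exp(-rho(r)): the 'model' whose MLE would be the M-estimator."""
    return math.exp(-tukey_biweight_rho(r, c))


# The tail value is the same positive constant however large r gets,
# so the integral over the real line diverges (an improper model).
tail_near = unnormalized_density(10.0)
tail_far = unnormalized_density(1e6)
```

The bounded loss is what gives the estimator its robustness, and it is also what prevents normalization; the generalized Bayesian framework cited in the abstract is described there as the way to use such a loss with proper priors.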

2408.05826 2026-02-20 math.ST math.CO math.PR stat.TH

Möbius inversion and the iterated bootstrap

Florian Schäfer

2407.01566 2026-02-20 q-fin.CP cs.GT cs.LG stat.ML

A Parametric Contextual Online Learning Theory of Brokerage

François Bachoc, Tommaso Cesari, Roberto Colomboni

2602.17272 2026-02-20 stat.ME

Estimating Zero-inflated Negative Binomial GAMLSS via a Balanced Gradient Boosting Approach with an Application to Antenatal Care Data from Nigeria

Alexandra Daub, Elisabeth Bergherr

2602.17261 2026-02-20 stat.ME

Parametric or nonparametric: the FIC approach for stationary time series

Gudmund Hermansen, Nils Lid Hjort, Martin Jullum

Comments 21 pages, 6 figures; Statistical Research Report (Department of Mathematics, University of Oslo), from December 2015, but arXiv'd February 2026; a later modified and extended version might then become a journal paper

2602.17255 2026-02-20 stat.ME stat.AP

Selection and Collider Restriction Bias Due to Predictor Availability in Prognostic Models

Marc Delord

2602.17225 2026-02-20 cond-mat.mtrl-sci physics.data-an stat.ME

Wide-Surface Furnace for In Situ X-Ray Diffraction of Combinatorial Samples using a High-Throughput Approach

Giulio Cordaro, Juande Sirvent, Cristian Mocuta, Fjorelo Buzi, Thierry Martin, Federico Baiutti, Alex Morata, Albert Tarancòn, Dominique Thiaudière, Guilhem Dezanneau

2602.17211 2026-02-20 stat.ML cs.LG

MGD: Moment Guided Diffusion for Maximum Entropy Generation

Etienne Lempereur, Nathanaël Cuvelle--Magar, Florentin Coeurdoux, Stéphane Mallat, Eric Vanden-Eijnden

2602.17161 2026-02-20 stat.ME

Dynamic likelihood hazard rate estimation

Nils Lid Hjort

Comments 20 pages, no figures; Statistical Research Report from 1993 (Department of Mathematics, University of Oslo); accepted with "minor revision" by Biometrika then, but somehow I never got around to do the final polish. This report, arXiv'd now in 2026, might be modified and updated (and illustrated with real data) for later journal publication

2602.17144 2026-02-20 cs.LG stat.ML

When More Experts Hurt: Underfitting in Multi-Expert Learning to Defer

Shuqi Liu, Yuzhou Cao, Lei Feng, Bo An, Luke Ong

2602.17115 2026-02-20 stat.ML cs.LG

Semi-Supervised Learning on Graphs using Graph Neural Networks

Juntong Chen, Claire Donnat, Olga Klopp, Johannes Schmidt-Hieber

Comments 57 pages, 7 figures

2602.17103 2026-02-20 cs.LG stat.ML

Online Learning with Improving Agents: Multiclass, Budgeted Agents and Bandit Learners

Sajad Ashkezari, Shai Ben-David

2602.17087 2026-02-20 math.PR stat.CO

Diffusive Scaling Limits of Forward Event-Chain Monte Carlo: Provably Efficient Exploration with Partial Refreshment

Hirofumi Shiba, Kengo Kamatani

Comments 43 pages, 5 figures

2602.17086 2026-02-20 econ.TH cs.LG math.ST stat.TH

Dynamic Decision-Making under Model Misspecification: A Stochastic Stability Approach

Xinyu Dai, Daniel Chen, Yian Qian

2602.17079 2026-02-20 stat.AP

Environmental policy in the context of complex systems: Statistical optimization and sensitivity analysis for ABMs

Dylan Munson, Arijit Dey, Simon Mak

2602.17070 2026-02-20 stat.ME cs.AI

General sample size analysis for probabilities of causation: a delta method approach

Tianyuan Cheng, Ruirui Mao, Judea Pearl, Ang Li

2602.17052 2026-02-20 stat.ME econ.EM

Generative modeling for the bootstrap

Leon Tran, Ting Ye, Peng Ding, Fang Han

Comments 62 pages

2602.17043 2026-02-20 stat.AP

Quantifying the limits of human athletic performance: A Bayesian analysis of elite decathletes

Paul-Hieu V. Nguyen, James M. Smoliga, Benton Lindaman, Sameer K. Deshpande

2602.17034 2026-02-20 stat.AP

Using Time Series Measures to Explore Family Planning Survey Data and Model-based Estimates

Oluwayomi Akinfenwa, Niamh Cahill, Catherine Hurley

2602.17015 2026-02-20 cs.AI stat.AP

Cinder: A fast and fair matchmaking system

Saurav Pal

2602.16992 2026-02-20 stat.ME

Modeling Multivariate Missingness with Tree Graphs and Conjugate Odds

Daniel Suen, Yen-Chi Chen

Comments 82 pages, 15 figures

2602.16970 2026-02-20 stat.AP

Temperature and Respiratory Emergency Department Visits: A Mediation Analysis with Ambient Ozone Exposure

Chen Li, Thomas W. Hsiao, Stefanie Ebelt, Rebecca H. Zhang, Howard H. Chang

2602.16923 2026-02-20 stat.ML cs.LG

Poisson-MNL Bandit: Nearly Optimal Dynamic Joint Assortment and Pricing with Decision-Dependent Customer Arrivals

Junhui Cai, Ran Chen, Qitao Huang, Linda Zhao, Wu Zhu

2602.16914 2026-02-20 stat.ME cs.LG stat.ML

A statistical perspective on transformers for small longitudinal cohort data

Kiana Farhadyar, Maren Hackenberg, Kira Ahrens, Charlotte Schenk, Bianca Kollmann, Oliver Tüscher, Klaus Lieb, Michael M. Plichta, Andreas Reif, Raffael Kalisch, Martin Wolkewitz, Moritz Hess, Harald Binder

2602.16876 2026-02-20 cs.LG stat.ML

ML-driven detection and reduction of ballast information in multi-modal datasets

Yaroslav Solovko

Comments 20 pages, 27 figures, 10 tables

2602.16857 2026-02-20 physics.comp-ph physics.data-an physics.geo-ph stat.ML

Distillation and Interpretability of Ensemble Forecasts of ENSO Phase using Entropic Learning

Michael Groom, Davide Bassetti, Illia Horenko, Terence J. O'Kane

2602.16849 2026-02-20 cs.LG math.OC stat.ML

On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking

Jianliang He, Leda Wang, Siyu Chen, Zhuoran Yang

2602.16830 2026-02-20 stat.AP cs.LG

The Impact of Formations on Football Matches Using Double Machine Learning. Is it worth parking the bus?

Genís Ruiz-Menárguez, Llorenç Badiella

Comments 17 pages, 5 figures, 3 tables

2602.16789 2026-02-20 stat.ME math.ST stat.TH

First versus full or first versus last: U-statistic change-point tests under fixed and local alternatives

Herold Dehling, Daniel Vogel, Martin Wendler

Comments 56 pages: 22 pages main document, 34 pages appendices (containing proofs), one reference list at the end

2602.16784 2026-02-20 cs.LG cs.CL stat.ME

Omitted Variable Bias in Language Models Under Distribution Shift

Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael

2602.16177 2026-02-20 stat.ML cs.AI cs.LG

Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks

Binchuan Qi

Abstract

In this work, we propose a notion of practical learnability grounded in finite sample settings, and develop a conjugate learning theoretical framework based on convex conjugate duality to characterize this learnability property. Building on this foundation, we demonstrate that training deep neural networks (DNNs) with mini-batch stochastic gradient descent (SGD) achieves global optima of empirical risk by jointly controlling the extreme eigenvalues of a structure matrix and the gradient energy, and we establish a corresponding convergence theorem. We further elucidate the impact of batch size and model architecture (including depth, parameter count, sparsity, skip connections, and other characteristics) on non-convex optimization. Additionally, we derive a model-agnostic lower bound for the achievable empirical risk, theoretically demonstrating that data determines the fundamental limit of trainability. On the generalization front, we derive deterministic and probabilistic bounds on generalization error based on generalized conditional entropy measures. The former explicitly delineates the range of generalization error, while the latter characterizes the distribution of generalization error relative to the deterministic bounds under independent and identically distributed (i.i.d.) sampling conditions. Furthermore, these bounds explicitly quantify the influence of three key factors: (i) information loss induced by irreversibility in the model, (ii) the maximum attainable loss value, and (iii) the generalized conditional entropy of features with respect to labels. Moreover, they offer a unified theoretical lens for understanding the roles of regularization, irreversible transformations, and network depth in shaping the generalization behavior of deep neural networks. Extensive experiments validate all theoretical predictions, confirming the framework's correctness and consistency.

2602.15438 2026-02-20 cs.LG cs.AI stat.ML

Logit Distance Bounds Representational Similarity

Beatrix M. G. Nielsen, Emanuele Marconato, Luigi Gresele, Andrea Dittadi, Simon Buchholz

2602.12577 2026-02-20 stat.ME stat.AP

Conjugating Variational Inference for Large Mixed Multinomial Logit Models and Consumer Choice

Weiben Zhang, Ruben Loaiza-Maya, Michael Stanley Smith, Worapree Maneesoonthorn

2601.20591 2026-02-20 math.DS stat.AP

Exploring Memory Effects: Sparse Identification in Vector-Borne Diseases

Dimitri Breda, Muhammad Tanveer, Jianhong Wu, Xue Zhang

2509.22860 2026-02-20 math.OC cs.DC cs.LG stat.ML

Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity

Artavazd Maranjyan, Peter Richtárik

2506.20749 2026-02-20 econ.EM stat.ME

Analytic inference with two-way clustering

Laurent Davezies, Xavier D'Haultfœuille, Yannick Guyonvarch

Comments 69 pages, supplement starts at p.43

2505.00292 2026-02-20 math.ST eess.SP stat.ME stat.TH

Offline changepoint localization using a matrix of conformal p-values

Sanjit Dandapanthula, Aaditya Ramdas

2412.00926 2026-02-20 stat.ME

A sensitivity analysis approach to principal stratification with a continuous longitudinal intermediate outcome: Applications to a cohort stepped wedge trial

Lei Yang, Michael J. Daniels, Fan Li

2409.20250 2026-02-20 stat.ML cs.LG

Input-Label Correlation Governs a Linear-to-Nonlinear Transition in Random Features under Spiked Covariance

Samet Demir, Zafer Dogan

Comments 30 pages, 7 figures

2403.11332 2026-02-20 cs.LG cs.SI stat.ME

Graph Machine Learning based Doubly Robust Estimator for Network Causal Effects

Seyedeh Baharan Khatami, Harsh Parikh, Haowei Chen, Sudeepa Roy, Babak Salimi