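arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems science, and other fields.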

2603.05483 2026-03-06 cs.LG cs.AI stat.ML

SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis

Shahriar Noroozizadeh, Xiaobin Shen, Jeremy C. Weiss, George H. Chen

Comments The Fourteenth International Conference on Learning Representations (ICLR 2026)

Abstract

Estimating heterogeneous treatment effects (HTEs) from right-censored survival data is critical in high-stakes applications such as precision medicine and individualized policy-making. Yet, the survival analysis setting poses unique challenges for HTE estimation due to censoring, unobserved counterfactuals, and complex identification assumptions. Despite recent advances, from Causal Survival Forests to survival meta-learners and outcome imputation approaches, evaluation practices remain fragmented and inconsistent. We introduce SurvHTE-Bench, the first comprehensive benchmark for HTE estimation with censored outcomes. The benchmark spans (i) a modular suite of synthetic datasets with known ground truth, systematically varying causal assumptions and survival dynamics, (ii) semi-synthetic datasets that pair real-world covariates with simulated treatments and outcomes, and (iii) real-world datasets from a twin study (with known ground truth) and from an HIV clinical trial. Across synthetic, semi-synthetic, and real-world settings, we provide the first rigorous comparison of survival HTE methods under diverse conditions and realistic assumption violations. SurvHTE-Bench establishes a foundation for fair, reproducible, and extensible evaluation of causal survival methods. The data and code of our benchmark are available at: https://github.com/Shahriarnz14/SurvHTE-Bench .

2603.05480 2026-03-06 stat.ML cs.LG math.ST stat.TH

Thermodynamic Response Functions in Singular Bayesian Models

Sean Plummer

Abstract

Singular statistical models, including mixtures, matrix factorization, and neural networks, violate regular asymptotics due to parameter non-identifiability and degenerate Fisher geometry. Although singular learning theory characterizes marginal likelihood behavior through invariants such as the real log canonical threshold (RLCT) and singular fluctuation, these quantities remain difficult to interpret operationally. At the same time, widely used criteria such as WAIC and WBIC appear disconnected from underlying singular geometry. We show that posterior tempering induces a one-parameter deformation of the posterior distribution whose associated observables generate a hierarchy of thermodynamic response functions. A universal covariance identity links derivatives of tempered expectations to posterior fluctuations, placing WAIC, WBIC, and singular fluctuation within a unified response framework. Within this framework, classical quantities from singular learning theory acquire natural thermodynamic interpretations: RLCT governs the leading free-energy slope, singular fluctuation corresponds to curvature of the tempered free energy, and WAIC measures predictive fluctuation. We formalize an observable algebra that quotients out non-identifiable directions, allowing structurally meaningful order parameters to be constructed in singular models. Across canonical singular examples, including symmetric Gaussian mixtures, reduced-rank regression, and overparameterized neural networks, we empirically demonstrate phase-transition-like behavior under tempering. Order parameters collapse, susceptibilities peak, and complexity measures align with structural reorganization in posterior geometry. Our results suggest that thermodynamic response theory provides a natural organizing framework for interpreting complexity, predictive variability, and structural reorganization in singular Bayesian learning.

2603.05396 2026-03-06 stat.ML cs.LG

Harnessing Synthetic Data from Generative AI for Statistical Inference

Ahmad Abdel-Azim, Ruoyu Wang, Xihong Lin

Comments Submitted to Statistical Science

2603.05370 2026-03-06 cs.LG cs.AI stat.ME

Learning Causal Structure of Time Series using Best Order Score Search

Irene Gema Castillo Mansilla, Urmi Ninad

2603.05317 2026-03-06 stat.ML cs.LG

How important are the genes to explain the outcome - the asymmetric Shapley value as an honest importance metric for high-dimensional features

Mark A. van de Wiel, Jeroen Goedhart, Martin Jullum, Kjersti Aas

Comments 32 pages, incl. Supplementary Material

2603.05306 2026-03-06 math.PR math.ST stat.TH

Maximum of sparsely equicorrelated Gaussian fields and applications

Johannes Heiny, Tiefeng Jiang, Tuan Pham, Yongcheng Qi

2603.05288 2026-03-06 stat.ML cs.LG

Bayesian Supervised Causal Clustering

Luwei Wang, Nazir Lone, Sohan Seth

2602.16537 2026-03-06 math.ST cs.IT cs.LG math.IT stat.ML stat.TH

Optimal training-conditional regret for online conformal prediction

Jiadong Liang, Zhimei Ren, Yuxin Chen

2512.17805 2026-03-06 math.ST cs.NA math.NA stat.ML stat.TH

Towards Sharp Minimax Risk Bounds for Operator Learning

Ben Adcock, Gregor Maier, Rahul Parhi

2511.05840 2026-03-06 stat.ME econ.EM stat.AP

Comparative e-backtests for general risk measures

Zhanyi Jiao, Qiuqi Wang, Yimiao Zhao

2510.15664 2026-03-06 stat.ME cs.LG physics.comp-ph

Bayesian Inference for PDE-based Inverse Problems using the Optimization of a Discrete Loss

Lucas Amoudruz, Sergey Litvinov, Costas Papadimitriou, Petros Koumoutsakos

2509.24544 2026-03-06 stat.ML cs.LG math.PR

Quantitative convergence of trained single layer neural networks to Gaussian processes

Eloy Mosig, Andrea Agazzi, Dario Trevisan

Comments Submitted and accepted at NeurIPS 2025, main body of 10 pages, 3 figures, 28 pages of supplementary material. Corrected an issue in the proof of Proposition 3.7

2506.08921 2026-03-06 math.NA cs.NA math.ST stat.ML stat.TH

Enabling stratified sampling in high dimensions via nonlinear dimensionality reduction

Gianluca Geraci, Daniele E. Schiavazzi, Andrea Zanoni

2412.20298 2026-03-06 cs.LG cs.CY stat.ML

An Experimental Study on Fairness-aware Machine Learning for Credit Scoring Problems

Huyen Giang Thi Thu, Thang Viet Doan, Ha-Bang Ban, Tai Le Quy

Comments The manuscript is submitted to Springer Nature's journal

2603.05280 2026-03-06 cs.CV cs.LG stat.ML

Layer by layer, module by module: Choose both for optimal OOD probing of ViT

Ambroise Odonnat, Vasilii Feofanov, Laetitia Chapel, Romain Tavenard, Ievgen Redko

Comments Accepted at ICLR 2026 CAO Workshop

2603.05274 2026-03-06 stat.ME

Monitoring Covariance in Multichannel Profiles via Functional Graphical Models

Christian Capezza, Davide Forcina, Antonio Lepore, Biagio Palumbo

2603.05226 2026-03-06 stat.ML cs.LG

Learning Optimal Individualized Decision Rules with Conditional Demographic Parity

Wenhai Cui, Wen Su, Donglin Zeng, Xingqiu Zhao

2603.05201 2026-03-06 cs.LG stat.ML

Towards a data-scale independent regulariser for robust sparse identification of non-linear dynamics

Jay Raut, Daniel N. Wilke, Stephan Schmidt

Comments 21 pages, 9 figures, 5 tables

2603.05163 2026-03-06 math.PR math.ST stat.TH

New Berry-Esseen bounds for parameter estimation of Gaussian processes observed at high frequency

Khalifa Es-Sebaiy, Yong Chen

2603.05154 2026-03-06 eess.SP stat.AP

Revitalizing AR Process Simulation of Non-Gaussian Radar Clutter via Series-Based Analytic Continuation

Xingxing Liao, Junhao Xie

Comments 13 pages, 12 figures

2603.05149 2026-03-06 cs.LG cs.AI stat.ML

Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding

Maximilian Hahn, Alina Zajak, Dominik Heider, Adèle Helena Ribeiro

2603.05119 2026-03-06 q-fin.ST math.ST stat.TH

Asymptotic Separability of Diffusion and Jump Components in High-Frequency CIR and CKLS Models

Sourojyoti Barick

2603.05065 2026-03-06 stat.ME

Modeling cyclostationarity in time series using ASCA

Daniel Vallejo-España, Jesús García Sánchez, Manuel Villar-Argaiz, Concepción De Linares, José Camacho

Comments 27 pages and 4 figures in main text. 16 pages and 8 figures in supplementary materials

2603.04780 2026-03-06 cs.LG stat.ML

Distributional Equivalence in Linear Non-Gaussian Latent-Variable Cyclic Causal Models: Characterization and Learning

Haoyue Dai, Immanuel Albrecht, Peter Spirtes, Kun Zhang

Comments Appears at ICLR 2026 (oral)

2603.04752 2026-03-06 stat.ME math.ST stat.TH

Robust estimation via $γ$-divergence for diffusion processes

Tomoyuki Nakagawa, Yusuke Shimizu

Comments 25 pages

2603.04697 2026-03-06 stat.ME

A Multi-Fidelity Tensor Emulator for Spatiotemporal Outputs: Emulation of Arctic Sea Ice Dynamics

Tristan Contant, Yawen Guan, Ander Wilson, Adrian K. Turner, Deborah Sulsky

Comments 25 pages, 6 figures

2603.04690 2026-03-06 math.ST stat.TH

Strong consistency of the local linear estimator for a generalized regression function with dependent functional data

Danilo Hiroshi Matsuoka, Hudson da Silva Torrent

Comments Supplementary material included. Submitted to Annals of the Institute of Statistical Mathematics

2603.04688 2026-03-06 q-bio.NC cs.AI cs.LG stat.ML

Why the Brain Consolidates: Predictive Forgetting for Optimal Generalisation

Zafeirios Fountas, Adnan Oomerjee, Haitham Bou-Ammar, Jun Wang, Neil Burgess

Comments 25 pages, 6 figures

2603.04686 2026-03-06 math.ST stat.TH

The augmented van Trees inequality

Elliot H. Young

2603.04685 2026-03-06 math.ST stat.TH

Sequential Multiple Testing: A Second-Order Asymptotic Analysis

Jingyu Liu, Yanglei Song

2603.04681 2026-03-06 math.ST stat.TH

Uniform convergence of kernel averages under fixed design with heterogeneous dependent data

Danilo Hiroshi Matsuoka, Hudson da Silva Torrent

Comments Supplementary material included. Submitted to Journal of Time Series Analysis

2603.04671 2026-03-06 math.PR math.ST stat.TH

Estimating Graph Dynamics from Population Observations

Peter Braunsteins, Michel Mandjes, Florian Montalescot

2603.04635 2026-03-06 stat.ML cs.DS cs.LG

Optimal Prediction-Augmented Algorithms for Testing Independence of Distributions

Maryam Aliakbarpour, Alireza Azizi, Ria Stevens

2603.04632 2026-03-06 stat.ME stat.CO

Least trimmed squares regression with missing values and cellwise outliers

Jakob Raymaekers, Peter J. Rousseeuw

2603.04625 2026-03-06 cs.LG math.ST stat.ML stat.TH

K-Means as a Radial Basis function Network: a Variational and Gradient-based Equivalence

Felipe de Jesus Felix Arredondo, Alejandro Ucan-Puc, Carlos Astengo Noguez

Comments 21 pages, 2 figures, 1 appendix

2603.04608 2026-03-06 math.ST stat.ME stat.TH

KRAFTY: Khatri-Rao Framework for Joint Cluster Recovery

Siyi Gao, Zachary Lubberts, Marianna Pensky

Comments 47 pages, 38 figures

2603.04576 2026-03-06 stat.ME

Variable Selection for Linear Regression Imputation in Surveys

Ziming An, Mehdi Dagdoug, David Haziza

2603.04551 2026-03-06 stat.AP cs.LG

Weather-Related Crash Risk Forecasting: A Deep Learning Approach for Heterogenous Spatiotemporal Data

Abimbola Ogungbire, Srinivas Pulugurtha

Comments 20 pages, 5 figures

2603.04546 2026-03-06 cs.LG stat.ML

Oracle-efficient Hybrid Learning with Constrained Adversaries

Princewill Okoroafor, Robert Kleinberg, Michael P. Kim

2603.04544 2026-03-06 stat.ME

Proximal Learning for Trials With External Controls: A Case Study in HIV Prevention

Yilin Song, Yinxiang Wu, Raphael J. Landovitz, Susan Buchbinder, Srilatha Edupuganti, Lydia Soto-Torres, Kendrick Li, Xu Shi, Fei Gao, Deborah Donnell, Holly Janes, Ting Ye

2603.04541 2026-03-06 stat.OT

Engaging students with statistics through choice of real data context on homework

Catalina Medina, Mine Dogucu

Comments 25 pages, 3 figures, 2 tables. Submitted to The American Statistician. Supplementary materials and code available at https://github.com/CatalinaMedina/data-context-choice-manuscript

2603.04479 2026-03-06 stat.ML cs.LG math.PR math.ST stat.AP stat.TH

Bayesian Modeling of Collatz Stopping Times: A Probabilistic Machine Learning Perspective

Nicolò Bonacorsi, Matteo Bordoni

2603.04473 2026-03-06 stat.ML cs.IT cs.LG math.IT

Dictionary Based Pattern Entropy for Causal Direction Discovery

Harikrishnan N B, Shubham Bhilare, Aditi Kathpalia, Nithin Nagaraj

Comments 13 pages

2602.15581 2026-03-06 stat.OT

Confidence as Forecast: A Decision-Theoretic Interpretation of Confidence Intervals

Scott Lee

Abstract

What, if anything, should a frequentist say about a single realized confidence interval (CI) and its chance of having covered the parameter? Jerzy Neyman's original answer was to refuse any nondegenerate probability for coverage ex post and, instead, to "state that the interval covers". In this paper I argue that the usual frequentist machinery already supports a different reading. I treat the coverage event as a Bernoulli random variable, with the nominal level 1-alpha as its design-based success probability, and view "confidence" as a probability forecast for that Bernoulli outcome. Using strictly proper scoring rules, I show that 1-alpha is the unique optimal constant forecast for coverage, both before and after observing the data, and that it remains optimal post-trial in common unbounded, translation-invariant models with pivot-based CIs. When the design yields a theta-free statistic, such as the relative width of the interval in a finite-window uniform model, the conditional coverage given that statistic provides a nonconstant, design-based refinement of 1-alpha that strictly improves predictive performance. Two thought experiments, a Monty Hall-style shell game and the "lost submarine" example of Morey et al. (2016), illustrate how this perspective resolves familiar interpretational puzzles about CIs without appealing to priors or single-case subjective degrees of belief. I conclude with simple "what to do when you see an interval" guidance for applied work and some implications for teaching confidence intervals as tools for forecasting long-run coverage.

Keywords: Confidence intervals, coverage probability, proper scoring rules, probabilistic forecasting, frequentist inference

Disclaimer: The findings and conclusions in this report are those of the author and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

2601.23236 2026-03-06 cs.LG cs.AI math.OC stat.ML

YuriiFormer: A Suite of Nesterov-Accelerated Transformers

Aleksandr Zimin, Yury Polyanskiy, Philippe Rigollet

2512.12988 2026-03-06 stat.ME math.ST stat.CO stat.ML stat.TH

A Bayesian approach to learning mixtures of nonparametric components

Yilei Zhang, Yun Wei, Aritra Guha, XuanLong Nguyen

Comments 80 pages, 9 figures

2512.06945 2026-03-06 stat.ML cs.LG

Symmetric Aggregation of Conformity Scores for Efficient Uncertainty Sets

Nabil Alami, Jad Zakharia, Souhaib Ben Taieb

2510.07093 2026-03-06 cs.LG stat.ML

Non-Asymptotic Analysis of Efficiency in Conformalized Regression

Yunzhen Yao, Lie He, Michael Gastpar

Comments Published as a conference paper at ICLR 2026

2508.16523 2026-03-06 stat.ME stat.AP

Identifying Treatment Effect Heterogeneity with Bayesian Hierarchical Adjustable Random Partition in Adaptive Enrichment Trials

Xianglin Zhao, Shirin Golchi, Jean-Philippe Gouin, Kaberi Dasgupta

2508.11847 2026-03-06 stat.ML cs.LG

Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings

Jenny Y. Huang, Yunyi Shen, Dennis Wei, Tamara Broderick

2506.14020 2026-03-06 cs.LG cs.AI stat.ML

Bures-Wasserstein Flow Matching for Graph Generation

Keyue Jiang, Jiahao Cui, Xiaowen Dong, Laura Toni

2505.04007 2026-03-06 stat.ML cs.LG

Variational Formulation of Particle Flow

Yinzhuang Yi, Jorge Cortés, Nikolay Atanasov

2502.15116 2026-03-06 math.PR math.ST stat.TH

Uniform mean estimation via generic chaining

Daniel Bartl, Shahar Mendelson

2502.11682 2026-03-06 cs.LG math.OC stat.ML

Double Momentum and Error Feedback for Clipping with Fast Rates and Differential Privacy

Rustem Islamov, Samuel Horvath, Aurelien Lucchi, Peter Richtarik, Eduard Gorbunov

2502.07584 2026-03-06 stat.ML cs.LG

Generalization Bounds for Markov Algorithms through Entropy Flow Computations

Benjamin Dupuis, Maxime Haddouche, George Deligiannidis, Umut Simsekli

2502.05360 2026-03-06 cs.LG math.OC stat.ML

Curse of Dimensionality in Neural Network Optimization

Sanghoon Na, Haizhao Yang

Comments Accepted for publication in Information and Inference: A Journal of the IMA. 32 pages, 1 figure

2412.03832 2026-03-06 math.ST stat.TH

Information theoretic limits of robust sub-Gaussian mean estimation under star-shaped constraints

Akshay Prasadan, Matey Neykov

2411.13199 2026-03-06 math.ST stat.TH

Sharp Bounds for Multiple Models in Matrix Completion

Dali Liu, Haolei Weng

Comments 37 pages. Accepted by the Electronic Journal of Statistics. All comments are warmly welcomed

2411.09847 2026-03-06 cs.LG stat.ML

Towards a Fairer Non-negative Matrix Factorization

Lara Kassab, Erin George, Deanna Needell, Haowen Geng, Nika Jafar Nia, Aoxi Li

2406.05911 2026-03-06 math.ST stat.TH

Some facts about the optimality of the LSE in the Gaussian sequence model with convex constraint

Akshay Prasadan, Matey Neykov

2402.03352 2026-03-06 math.OC cs.LG stat.ML

Zeroth-Order primal-dual Alternating Projection Gradient Algorithms for Nonconvex Minimax Problems with Coupled linear Constraints

Huiling Zhang, Zi Xu, Yuhong Dai

Comments arXiv admin note: text overlap with arXiv:2212.04672

2309.00756 2026-03-06 stat.AP math.OC

Learning Risk Preferences in Markov Decision Processes: an Application to the Fourth Down Decision in the National Football League

Nathan Sandholtz, Lucas Wu, Martin Puterman, Timothy C. Y. Chan

Comments 22 pages, 12 figures

2307.10960 2026-03-06 math.ST math.PR stat.TH

Change point estimation for a stochastic heat equation

Markus Reiß, Claudia Strauch, Lukas Trottner

2209.11691 2026-03-06 econ.EM cs.LG stat.ME

Linear Multidimensional Regression with Interactive Fixed-Effects

Hugo Freeman

2012.07167 2026-03-06 math.ST stat.TH

Pseudo-likelihood-based $M$-estimation of random graphs with dependent edges and parameter vectors of increasing dimension

Jonathan R. Stewart, Michael Schweinberger

2603.04420 2026-03-06 cs.LG math.DS q-bio.NC stat.ML

Machine Learning for Complex Systems Dynamics: Detecting Bifurcations in Dynamical Systems with Deep Neural Networks

Swadesh Pal, Roderick Melnik

Comments 15 pages; 5 figures

2603.04418 2026-03-06 cs.LG cs.AI stat.ML

Decorrelating the Future: Joint Frequency Domain Learning for Spatio-temporal Forecasting

Zepu Wang, Bowen Liao, Jeff, Ban