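arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.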

2605.00806 2026-05-04 stat.ME stat.AP

High-Dimensional Multivariate VAR Estimation with Spatio-Temporal Structure

Peiliang Bai

Abstract

High-dimensional vector autoregressive (VAR) models provide a flexible framework for characterizing dynamic dependence in multivariate spatio-temporal systems, but their unrestricted estimation becomes infeasible when multiple variables are observed over many spatial locations. This paper develops a structured estimation procedure for high-dimensional multivariate VAR processes that explicitly incorporates spatial information. We decompose each block transition matrix into a cross-variable dependence coefficient and a spatial transition matrix, and constrain the spatial transition matrices through a pre-specified spatial graph. The resulting estimator is formulated as a weighted $\ell_1$-regularized least-squares problem, where the weights encode spatial proximity or topological similarity and induce stronger shrinkage on spatially implausible interactions. Since the objective function is bi-convex, we estimate the cross-variable dependence matrix and the spatial transition matrices through an alternating convex-search algorithm implemented with ADMM. Under stability and restricted-eigenvalue-type conditions for high-dimensional VAR processes, we establish convergence to a blockwise stationary point in the subgradient sense and derive high-probability estimation error bounds for both components of the model. Simulation studies demonstrate that the proposed estimator accurately recovers sparse transition structures and improves over existing two-step $\ell_1$-regularized methods in support recovery and estimation accuracy. An application to North American climate data illustrates that the method recovers interpretable variable-dependence networks and spatial interaction patterns across different climate regions.
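
The weighted $\ell_1$ shrinkage described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it only shows the standard proximal step for a weighted $\ell_1$ penalty, which is entrywise soft-thresholding with threshold `lam * weight`, the kind of update an ADMM solver typically applies. The function names, the toy matrix, the weights, and `lam` are hypothetical; larger weights stand in for spatially implausible pairs.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold`; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_l1_prox(matrix, weights, lam):
    """Entrywise proximal operator of lam * sum_ij weights[i][j] * |B[i][j]|.

    `matrix` and `weights` are equally shaped lists of lists (hypothetical
    stand-ins for a transition-matrix estimate and its spatial weights).
    """
    return [
        [soft_threshold(v, lam * w) for v, w in zip(row, w_row)]
        for row, w_row in zip(matrix, weights)
    ]


# Toy example (assumed values): the weight 4.0 marks a distant location pair.
estimate = [[0.5, 0.5], [-1.0, 0.125]]
weights = [[1.0, 4.0], [2.0, 1.0]]
shrunk = weighted_l1_prox(estimate, weights, lam=0.25)
```

In this toy case the two equal entries in the first row are treated differently: the entry with the small weight survives with reduced magnitude, while the entry with the large weight is set exactly to zero.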

2605.00786 2026-05-04 stat.ME math.PR stat.ML

Recursive Maximum Likelihood Estimation for Interacting Particle Systems using Virtual Particles

Louis Sharrock, Nikolas Kantas, Grigorios A. Pavliotis

Comments arXiv admin note: text overlap with arXiv:2602.20875

2605.00779 2026-05-04 stat.ME

Evaluating the performance of GCM trajectories using Weather Type frequencies for persistence and transitions: the Iberian Peninsula and Lamb classification

Elsa Barrio-Torres, Swen Brands, Jesús Asín, Jesús Abaurrea, Zeus Gracia-Tabuenca

Abstract

This study evaluates the performance of 36 historical CMIP6 GCM trajectories (1979-2005) in reproducing atmospheric circulation over the Iberian Peninsula in the summer months (June-September) using the Lamb Weather Type (WT) classification scheme. Using ERA5 reanalysis as the observational reference, we introduce a methodological framework, applicable to any region worldwide, to evaluate GCM performance. This approach extends traditional daily frequency analysis by evaluating both the daily frequency distribution of WTs and their 24-hour dynamic evolution (i.e., transition probabilities and persistence). Model performance is quantified using the Overlap coefficient. A filtering process is applied where only trajectories that successfully reproduce both daily and conditional distributions with a minimum Overlap threshold $t_{sim}$ across a set number of grid points are retained. The findings show that while several models can adequately reproduce daily WT frequencies (16 out of 36), some struggle to capture day-to-day atmospheric transitions. This leads to a final selection of 12 trajectories over the Iberian Peninsula. Model performance across the region is then evaluated using integrated metrics assessing daily reproduction, conditional reproduction, and transition dynamics. Overall, models from the ec earth3 family, specifically the ec earth3 aerchem trajectory, exhibit the best and most consistent performance across the region. Additionally, the results highlight a geographical performance gap: while models generally represent circulation well in the northwest, they face significant challenges in the central and southern Mediterranean regions of the Peninsula. Ultimately, this study establishes that assessing WT persistence and transitions provides a far more discriminative, objective tool for GCM selection than evaluating daily distributions alone.
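
The two quantities compared in the abstract, daily WT frequencies and day-to-day transition probabilities, can be sketched as follows. This is a minimal illustration under assumptions rather than the authors' code: the overlap between two discrete distributions is taken to be the sum of entrywise minima (the usual overlapping coefficient), and the function names and the toy label sequence are hypothetical.

```python
from collections import Counter


def frequencies(labels, categories):
    """Relative frequency of each category in a sequence of daily labels."""
    counts = Counter(labels)
    n = len(labels)
    return [counts[c] / n for c in categories]


def overlap(p, q):
    """Overlap of two discrete distributions: 1.0 if identical, 0.0 if disjoint."""
    return sum(min(a, b) for a, b in zip(p, q))


def transition_matrix(labels, categories):
    """Row-normalised counts of transitions from day t to day t + 1.

    The diagonal entries are the persistence probabilities. Rows for
    categories that never occur as a starting day are left as zeros.
    """
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    counts = [[0] * k for _ in range(k)]
    for today, tomorrow in zip(labels, labels[1:]):
        counts[index[today]][index[tomorrow]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Under this sketch, a model could be scored per grid point by applying `overlap` to the reference and simulated frequency vectors, and again to matching rows of the two transition matrices, before comparing each score with a threshold such as $t_{sim}$.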

2605.00771 2026-05-04 econ.EM math.ST stat.TH

Penalized Likelihood for Dyadic Network Formation Models with Degree Heterogeneity

Zizhong Yan, Jingrong Li, Yi Zhang

2605.00765 2026-05-04 stat.ME

Efficient Longitudinal Function-on-Function Regression

Leif Verace, Siobhan McMahon, Erjia Cui

2605.00750 2026-05-04 stat.OT math.PR math.ST nlin.AO stat.TH

Quenched Amplification and Tail Shaping in Networked Systems with Memory and Regime Switching

Mauricio Herrera-Marín

2605.00740 2026-05-04 math.OC cs.LG stat.ML

Randomized Subspace Nesterov Accelerated Gradient

Gaku Omiya, Pierre-Louis Poirion, Akiko Takeda

Comments 50 pages

2604.27772 2026-05-04 stat.ME

Single-Observation Uniformity Testing under Increasing Precision via Lacunary Harmonic

Davide Ferrari

Comments 31 pages, 3 figures, 1 table

2605.00729 2026-05-04 math.ST math.PR nlin.AO stat.OT stat.TH

Intermittency induced by long memory under stochastic regime switching

Mauricio Herrera-Marín

2605.00723 2026-05-04 stat.ML cs.LG math.PR

Decentralized Proximal Stochastic Gradient Langevin Dynamics

Mohammad Rafiqul Islam, Lingjiong Zhu

Comments 42 pages, 7 figures

2605.00709 2026-05-04 math.ST econ.EM stat.TH

Bootstrap Inference under General Two-way Clustering with Serially and Spatially Dependent Common Effects

Ulrich Hounyo, Jiahao Lin

2605.00668 2026-05-04 cs.IT math.IT stat.CO

SENECA: Small-Sample Discrete Entropy Estimation via Self-Consistent Missing Mass

Lucas H. McCabe, H. Howie Huang

2605.00654 2026-05-04 cs.LG cs.AI math.OC stat.ML

Reinforcement Learning with Markov Risk Measures and Multipattern Risk Approximation

Andrzej Ruszczynski, Tiangang Zhang

2605.00598 2026-05-04 stat.ME

Sparse $K$-spatial-median clustering for high-dimensional data

Ping Zhao, Dan Zhuang, Long Feng

2605.00581 2026-05-04 stat.ML cs.LG math.OC

Gradient Regularized Newton Boosting Trees with Global Convergence

Nikita Zozoulenko, Daniel Falkowski, Thomas Cass, Lukas Gonon

2605.00534 2026-05-04 stat.ME

Estimating Treatment and Spillover Effects with the Ego-Cluster Experimental Design

Xiao Liu, Feifang Hu, Jingfei Zhang

2605.00533 2026-05-04 math.PR math-ph math.MP math.ST stat.TH

Royen's proof of the Gaussian correlation inequality as a supersymmetric dimensional reduction

Yichao Huang

Comments 13 pages

2605.00470 2026-05-04 stat.ME stat.AP

Robust spatial scalar-on-function regression: A Fisher-consistent redescending M-estimation approach

Muge Mutis, Ufuk Beyaztas, Han Lin Shang

Comments 51 pages, 7 figures, 6 tables

2605.00467 2026-05-04 cs.LG stat.ML

Batch Normalization for Neural Networks on Complex Domains

Xuan Son Nguyen, Nistor Grozavu

2605.00455 2026-05-04 stat.ME math.ST stat.ML stat.TH

Concentration and Calibration in Predictive Bayesian Inference

David T. Frazier, Hui Wang

2605.00428 2026-05-04 stat.ME cs.PF cs.SY eess.SY

How to Do Statistical Evaluations in ECE/CS Papers: A Practical Playbook for Defensible Results

Bhaskar Krishnamachari

Comments 30 pages, 8 figures; Tutorial paper; companion student workbook and claude skill available as ancillary material

2605.00398 2026-05-04 cs.LG physics.ao-ph stat.ML

M-CaStLe: Uncovering Local Causal Structures in Multivariate Space-Time Gridded Data

J. Jake Nichol, Michael Weylandt, G. Matthew Fricke, Jhayron Perez-Carrasquilla, Melanie E. Moses

Comments 19 pages and 6 figures in the main text; 33 pages and 11 figures total

2605.00379 2026-05-04 stat.ME

Economical Experimental Design with Generalized Posteriors

Luke Hagar, James M. McGree

2605.00365 2026-05-04 cs.LG cs.CL stat.ML

Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity

Anamika Lochab, Bolian Li, Ruqi Zhang

2605.00363 2026-05-04 math.ST stat.TH

Profile Likelihood Inference for Anisotropic Hyperbolic Wrapped Normal Models on Hyperbolic Space

Kisung You

Comments 34 pages, 2 figures

2605.00360 2026-05-04 cs.LG stat.ME

Binomial flows: Denoising and flow matching for discrete ordinal data

Yair Shenfeld, Ricardo Baptista, Stefano Peluchetti

Comments 41 pages, 9 figures

2605.00332 2026-05-04 stat.ME cs.NA math.NA

Beyond Independence: on Jointly Normal Priors in Bayesian Inversion

Ruanui Nicholson, Matti Niskanen, Oliver J. Maclaren, Jari P. Kaipio

2605.00284 2026-05-04 cs.LG cs.NA math.NA stat.ML

A Dirac-Frenkel-Onsager principle: Instantaneous residual minimization with gauge momentum for nonlinear parametrizations of PDE solutions

Matteo Raviola, Benjamin Peherstorfer

2605.00250 2026-05-04 stat.ML cs.CV cs.LG

Information-geometric adaptive sampling for graph diffusion

Yuhui Lu, Wenjing Liu, Kun Zhan

Comments Accepted to ICML 2026!

2605.00229 2026-05-04 stat.ML cs.LG math.OC

A unified perspective on fine-tuning and sampling with diffusion and flow models

Carles Domingo-Enrich, Yuanqi Du, Michael S. Albergo

2605.00216 2026-05-04 stat.ME

Simplicity Above Elegance: Another Look at the Asymptotically Correct Standardization of Snijders

Sandip Sinharay

Comments 34 pages, 5 figures. This version is the corrected version of the published article. Due to the correction, the content in pages 7-12 of this document differs substantially from that in the journal version

2605.00196 2026-05-04 stat.ME math.PR math.ST q-fin.ST stat.TH

Modeling Stock Returns and Volatility Using Bivariate Gamma Generalized Laplace Law

Tomasz J. Kozubowski, Andrey Sarantsev, James A. Spiker

Comments 25 pages, 2 figures. Keywords: Financial modeling, Generalized Laplace distribution, Maximum likelihood estimation, Normal mean-variance mixture, Variance-gamma distribution

2605.00193 2026-05-04 cs.LG stat.ML

OTSS: Output-Targeted Soft Segmentation for Contextual Decision-Weight Learning

Renjun Hu, Hyun-Soo Ahn

Comments 23 pages, 2 figures

2605.00176 2026-05-04 stat.ML cs.LG

SHIFT: Robust Double Machine Learning for Average Dose-Response Functions under Heavy-Tailed Contamination

Eichi Uehara

Comments 77 pages, 43 figures, 35 tables. Code and raw CSVs: https://github.com/EichiUehara/ADRF-Robust-DML

2605.00175 2026-05-04 stat.AP

Using Linked Micromaps to Explore Complex Structures in Official Statistics

Randall Powers, Darcy Steeg Morris, John Eltinge, Wendy Martinez

2605.00171 2026-05-04 stat.ML cs.LG stat.AP

Adaptive Norm-Based Regularization for Neural Networks

Muhammad Qasim, Farrukh Javed

Comments 37 pages, 9 figures

2605.00126 2026-05-04 cs.LG eess.SP stat.ML

SPLICE: Latent Diffusion over JEPA Embeddings for Conformal Time-Series Inpainting

Arnaud Zinflou

2605.00108 2026-05-04 physics.soc-ph econ.GN q-fin.EC stat.AP

Urban Science Beyond Samples: Up-to-Date Street Network Models and Indicators for Every Urban Area in the World

Geoff Boeing

2605.00099 2026-05-04 quant-ph cs.LG stat.ML

Provable and scalable quantum Gaussian processes for quantum learning

Jonas Jäger, Paolo Braccia, Pablo Bermejo, Manuel G. Algaba, Diego García-Martín, M. Cerezo

Comments 18 + 70 pages, 5 + 14 figures, 2 tables

2605.00056 2026-05-04 cs.LG cs.AI physics.data-an physics.geo-ph stat.AP stat.ML

Smart Ensemble Learning Framework for Predicting Groundwater Heavy Metal Pollution

T. Ansah-Narh, G. Y. Afrifa, J. B. Tandoh, K. Asare, M. Addi, K. E. Yorke, D. M. A. Akpoley, K. Aidoo, S. K. Fosuhene

Comments 53 pages, 16 figures, accepted for publication in Earth Systems and Environment (2026)

2605.00007 2026-05-04 math.OC cs.AI stat.ML

Mean-Field Path-Integral Diffusion: From Samples to Interacting Agents

Michael Chertkov

Comments 31 pages, 14 figures

2604.27077 2026-05-04 cs.LG cs.AI stat.ML

Learning Rate Transfer in Normalized Transformers

Boris Shigida, Boris Hanin, Andrey Gromov

2604.24032 2026-05-04 stat.ME

On Cluster Randomized Trials with the Desirability of Outcome Ranking (DOOR) Endpoints

Wanying Shao, Toshimitsu Hamasaki, Scott Evans, Guoqing Diao

2604.04567 2026-05-04 stat.ML cs.LG

Generative Modeling under Non-Monotone MAR Missingness via Approximate Wasserstein Gradient Flows

Gitte Kremling, Jeffrey Näf, Johannes Lederer

2603.18413 2026-05-04 stat.ML cs.LG

Statistical Testing Framework for Clustering Pipelines by Selective Inference

Yugo Miyata, Tomohiro Shiraishi, Shuichi Nishino, Ichiro Takeuchi

Comments 59 pages, 11 figures

2603.08538 2026-05-04 math.ST stat.ME stat.TH

Minimax estimation for Varying Coefficient Model via Laguerre Series

Rida Benhaddou, Khalid Chokri, Jackson Pinschenat

Comments 27 pages, 6 figures

2603.02275 2026-05-04 cs.LG stat.AP stat.ML

A Comparative Study of UMAP and Other Dimensionality Reduction Methods

Guanzhe Zhang, Shanshan Ding, Zhezhen Jin

Comments 31 pages, 4 figures

2601.09011 2026-05-04 stat.ME q-bio.PE

Causal attribution by the chain rule: unifying natural selection, learning, economics, and other disciplines

Steven A. Frank

Comments New title, abstract, introduction, and other significant changes throughout

DOI: 10.1093/evlett/qrag018

Abstract

Analysis often splits change into components. For example, how much of the observed variance is caused by genes or environment? In many cases, the split is ultimately made by the logic of the chain rule, which divides the difference of a product into two terms. Each term quantifies the partial difference associated with change in one component while holding the other component constant. The chain rule is of course widely known. However, this article argues that its deep fundamental role often goes unrecognized. The article shows how simply the basic chain rule unifies Fisher's fundamental theorem of natural selection, the Price equation description of evolutionary change, the Oaxaca-Blinder decomposition of wage differences in economics, the Kitagawa decomposition of mortality differences in demography, many expressions of thermodynamics, and most strikingly back propagation, the core optimization method of modern machine learning and artificial intelligence. The success in creating good designs and finding good solutions in both natural selection and artificial intelligence depends on how the chain rule propagates causes from instances of success or failure back to the underlying genes or parameters of the system. The mathematical analysis presented here shows that, for finite differences, the product rule form of the chain rule yields a basic decomposition of change into two components of a regression equation. That regression decomposition is purely a description of change with no explicit causal meaning. However, simple additional assumptions lead naturally to the modern counterfactual analysis of causality. From that perspective, we can easily understand the causal interpretation that Fisher gave to his fundamental theorem, and we can see the same causal structure in the Oaxaca-Blinder decomposition of economics and in causal analyses across many disciplines.
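
The finite-difference product rule mentioned in the abstract can be checked numerically. The sketch below assumes a weighted mean $\bar{z} = \sum_i q_i z_i$ and uses the exact identity $\Delta \bar{z} = \sum_i (\Delta q_i) z_i + \sum_i q_i' (\Delta z_i)$, where primes denote updated values. The function and variable names are hypothetical and the numbers are invented for illustration; this is one standard way to write the split, not necessarily the article's notation.

```python
def decompose_change(q, z, q_new, z_new):
    """Split the change in sum(q_i * z_i) into two product-rule terms.

    Returns (total, weight_term, value_term), where
      weight_term = sum((q_new_i - q_i) * z_i)      # values held at old levels
      value_term  = sum(q_new_i * (z_new_i - z_i))  # weights held at new levels
    and total == weight_term + value_term up to floating-point rounding.
    """
    old_mean = sum(a * b for a, b in zip(q, z))
    new_mean = sum(a * b for a, b in zip(q_new, z_new))
    weight_term = sum((qn - qo) * zo for qo, qn, zo in zip(q, q_new, z))
    value_term = sum(qn * (zn - zo) for qn, zo, zn in zip(q_new, z, z_new))
    return new_mean - old_mean, weight_term, value_term


# Toy example (assumed values): two types with weights q and values z.
total, weight_term, value_term = decompose_change(
    q=[0.5, 0.5], z=[1.0, 3.0], q_new=[0.25, 0.75], z_new=[1.0, 4.0]
)
```

Here the mean rises from 2.0 to 3.25; the first term attributes 0.5 of that change to the shift in weights and the second attributes 0.75 to the change in values, and the two terms sum to the total.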

2510.23557 2026-05-04 stat.ML cs.LG

Minimizing Human Intervention in Online Classification

William Réveillard, Vasileios Saketos, Alexandre Proutiere, Richard Combes

Comments 53 pages, 10 figures. AISTATS 2026

2510.19206 2026-05-04 math.ST stat.ML stat.TH

Shrinkage to Infinity: Reducing Test Error by Inflating the Minimum Norm Interpolator in Linear Models

Jake Freeman

2509.20015 2026-05-04 q-fin.MF q-fin.CP stat.ME

Randomized Kolmogorov-Smirnov Analysis of Volatility Roughness

Sergio Bianchi, Daniele Angelini

Comments 23 pages

2509.17960 2026-05-04 stat.ME stat.AP

Everything all at once: On choosing an estimand for multi-component environmental exposures

Kara E. Rudolph, Shodai Inose, Nicholas Williams, Ivan Diaz, Lucia Calderon, Jacqueline M. Torres, Marianthi-Anna Kioumourtzoglou

2509.02829 2026-05-04 math.PR math.ST stat.TH

An iterated $I$-projection procedure for solving the generalized minimum information checkerboard copula problem

Ivan Kojadinovic, Tommaso Martini

Comments 35 pages, 13 figures

2508.18682 2026-05-04 math.ST stat.TH

Simple and Sharp Generalization Bounds via Lifting

Jingbo Liu

Comments 1 figure

2508.01610 2026-05-04 stat.ME

Sample size calculations for multilevel factorial longitudinal cluster randomised trials

Rhys Bowden, Rebecca Walwyn, Jessica Kasza, Andrew Copas, Fan Li, James Wason, Andrew Forbes

Abstract

Typically, trials investigate the impact of either an individual-level intervention on participant outcomes, or the impact of a cluster-level intervention on participant outcomes. Factorial designs consider two (or more) treatments for each of two (or more) different factors. In factorial trial designs, trial units (individuals or clusters) are each randomised to a level of each of the treatments; these designs allow assessment of the interactions between different interventions. Recently, there has been growing interest in the design of trials that jointly assess the impact of individual- and cluster-level interventions (i.e. multi-level interventions), requiring the development of methodology that accommodates randomisation at multiple levels. While recent work has developed sample size methodology for variants combining standard cluster randomisation and individual randomisation, that work does not apply to longitudinal cluster randomised trial designs such as the stepped wedge design or cluster randomised crossover design. Here we present dedicated sample size methodology for "split-plot factorial longitudinal cluster randomised trials" with continuous outcomes, allowing for joint assessment of individual-level and cluster-level interventions, where the impact of the cluster-level intervention can be assessed using any longitudinal cluster randomised trial design. We show how the power to detect given effects of the individual-level intervention, the cluster-level intervention, and the interaction between the two depends on standard results for individually-randomised trials and longitudinal cluster randomised trials. We apply these results to the SharES trial, which considered the effects of patient- and clinician-level interventions for patients with breast cancer on patient knowledge about the risks and benefits of treatment.

2504.15290 2026-05-04 stat.OT

Parental Imprints On Birth Weight: A Data-Driven Model For Neonatal Prediction In Low Resource Prenatal Care

Rajeshwari Mistri, Harsh Joshi, Nachiket Kapure, Parul Kumari, Manasi Mali, Seema Purohit, Neha Sharma, Mrityunjoy Panday, Chittaranjan S. Yajnik

Comments Withdrawn due to identified issues in manuscript originality and overlap in some Sections requiring substantial revision and restructuring of the text and methodology. A corrected and improved version will be submitted

2503.14459 2026-05-04 stat.ML cs.LG stat.ME

Doubly robust identification of treatment effects from multiple environments

Piersilvio De Bartolomeis, Julia Kostin, Javier Abad, Yixin Wang, Fanny Yang

Comments Accepted for presentation at the International Conference on Learning Representations (ICLR) 2025

2503.10990 2026-05-04 cs.GT cs.LG econ.TH math.ST stat.ML stat.TH

Statistical Impossibility and Possibility of Aligning LLMs with Human Preferences: From Condorcet Paradox to Nash Equilibrium

Kaizhao Liu, Qi Long, Zhekun Shi, Weijie J. Su, Jiancong Xiao

Comments Accepted for publication in the Annals of Statistics

2501.06540 2026-05-04 cs.CV math.ST stat.AP stat.ME stat.TH

Copula-enhanced Vision Transformer for high myopia diagnosis through OU UWF fundus images

Chong Zhong, Yunhao Liu, Yang Li, Xiang Fu, Jin Yang, Danjuan Yang, Meiyan Li, Jinfeng Xu, Aiyi Liu, Alan H. Welsh, Xingtao Zhou, Bo Fu, Catherine C. Liu

2409.02331 2026-05-04 stat.ME

A parameterization of anisotropic Gaussian fields with penalized complexity priors

Liam Llamazares-Elias, Jonas Latz, Finn Lindgren

Comments v2: revised version, accepted for publication in the Journal of the American Statistical Association

2404.13164 2026-05-04 stat.CO cs.CR

Least Squares Estimation For Hierarchical Data

Ryan Cumings-Menon, Pavel Zhuravlev

2312.10234 2026-05-04 stat.ME stat.ML

Flexible Nonparametric Inference for Causal Effects under the Front-Door Model

Anna Guo, David Benkeser, Razieh Nabi

2202.00814 2026-05-04 stat.ME stat.AP

Adjustment for Unmeasured Spatial Confounding in Settings of Continuous Exposure Conditional on the Binary Exposure Status: Conditional Generalized Propensity Score-Based Spatial Matching

Honghyok Kim, Michelle Bell

Comments Online supplementary materials are appended at the bottom of the main PDF. As of 2026, under revision at a method-oriented journal

2107.01742 2026-05-04 stat.ME stat.CO

Nonparametric Detection of Multiple Location-Scale Change Points via Wild Binary Segmentation

Gordon J. Ross