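arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.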

2604.22752 2026-04-27 stat.ME

From Physics to Statistics: A Simple Route to Exponential Families via Maximum Entropy

Korbinian Strimmer

Comments 17 pages, 2 tables

Details

Abstract

Exponential families form the backbone of modern statistics and machine learning, but textbooks seldom derive them from first principles in an accessible way. Although minimal sufficiency and the principle of maximum entropy, originating in physics, provide core motivation, they are often presented as technical and requiring advanced prerequisites. Here, a short, self-contained derivation of exponential families based on maximum entropy is presented that is straightforward to carry out, requires only a modest background in information entropy, and avoids technicalities like constrained optimisation. Two propositions are demonstrated in this fashion: i) exponential families with a general base maximise information entropy with respect to that base subject to fixed expectations of canonical statistics, and ii) exponential families with a uniform base maximise standard information entropy under the same constraints. Maximum entropy therefore provides a principled foundation for exponential families with minimal prerequisites, highlighting the value of teaching entropy in statistics courses.
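The first proposition in this abstract can be illustrated with one standard argument based on Gibbs' inequality, which needs no Lagrange multipliers. This is only a sketch and not necessarily the paper's own derivation; the symbols $h$ (base), $T$ (canonical statistics), $\eta$ (canonical parameters), $A$ (log-partition function) and $G$ (entropy relative to the base) are notation assumed here for illustration.

```latex
% Exponential family with base h, canonical statistics T, log-partition A
p(x \mid \eta) = h(x)\,\exp\!\big(\eta^\top T(x) - A(\eta)\big)

% Entropy of q relative to the base h
G(q, h) = -\int q(x)\,\log\frac{q(x)}{h(x)}\,dx

% For any density q with E_q[T(x)] = E_p[T(x)] = \mu:
0 \le \mathrm{KL}(q \,\|\, p)
  = \int q \log\frac{q}{h}\,dx - \int q \log\frac{p}{h}\,dx
  = -G(q, h) - \big(\eta^\top \mu - A(\eta)\big)
  = G(p, h) - G(q, h)
```

The last step uses $G(p, h) = -(\eta^\top \mu - A(\eta))$, so $G(q, h) \le G(p, h)$ for every $q$ that satisfies the same expectation constraints. Taking $h$ constant gives the uniform-base case of the second proposition, with $G$ reducing to ordinary entropy up to an additive constant.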

2604.22712 2026-04-27 math.ST stat.TH

Statistical Analysis of Markovian Generative Modeling

Eddie Aamari, Arthur Stéphanovitch

2604.22692 2026-04-27 stat.ME stat.AP

A Unified Framework for Multiple Exposure Distributed Lag Non-Linear Models for Air Pollution Epidemiology

Tianyi Pan, Hwashin Hyun Shin, Alex Stringer, Glen McGee

2604.22667 2026-04-27 math.ST math.PR stat.TH

Sharp bounds for products of dependent random variables

Christopher Blier-Wong, Jinghui Chen

Comments 29 pages, 6 figures, 1 table

2604.18820 2026-04-27 stat.ML cs.LG eess.SP math.OC stat.AP

Sparse Network Inference under Imperfect Detection and its Application to Ecological Networks

Aoran Zhang, Tianyao Wei, Maria J. Guerrero, César A. Uribe

Comments 13 pages, 4 figures

2603.18941 2026-04-27 stat.ML cs.LG

Unified Taxonomy for Multivariate Time Series Anomaly Detection using Deep Learning

Bruna Alves, Armando J. Pinho, Sónia Gouveia

2601.05245 2026-04-27 cs.LG math.ST stat.ML stat.TH

Optimal Lower Bounds for Online Multicalibration

Natalie Collina, Jiuyao Lu, Georgy Noarov, Aaron Roth

2510.19020 2026-04-27 stat.ML cs.LG

Calibrated Principal Component Regression

Yixuan Florence Wu, Yilun Zhu, Lei Cao, Naichen Shi

2510.16975 2026-04-27 stat.ME

Causal Variance Decompositions for Measuring Health Inequalities

Lin Yu, Zhihui Liu, Kathy Han, Olli Saarela

2509.22235 2026-04-27 stat.ME

Tail-robust estimation of factor-adjusted vector autoregressive models for high-dimensional time series

Dylan Dijk, Haeran Cho

2410.05858 2026-04-27 stat.ME

Detecting dependence structure: visualization and inference

Bogdan Ćmiel, Teresa Ledwina

2604.22636 2026-04-27 stat.ML cs.LG stat.AP

CLVAE: A Variational Autoencoder for Long-Term Customer Revenue Forecasting

Jeffrey Näf, Riana Valera Mbelson, Markus Meierer

2604.22633 2026-04-27 stat.ML cs.LG

Mixed Membership sub-Gaussian Models

Huan Qing

Comments 30 pages, 6 figures, 2 tables

2604.22580 2026-04-27 stat.ML cs.LG

Explanation of Dynamic Physical Field Predictions using WassersteinGrad: Application to Autoregressive Weather Forecasting

Younes Essafouri, Laure Raynaud, Luciano Drozda, Laurent Risser

2604.22548 2026-04-27 stat.AP cs.LG

Multi-output Extreme Spatial Model for Complex Aircraft Production Systems

Cheolhei Lee, Xing Wang, Xiaowei Yue, Jianguo Wu

Details

DOI: 10.1287/msom.2023.0442

Abstract

Problem definition: Data-driven models in machine learning have enabled efficient management of production systems. However, a majority of machine learning models are devoted to modeling the mean response or average pattern, which is inappropriate for studying abnormal extreme events that are often of primary interest in aircraft manufacturing. Since extreme events from heavy-tailed distributions give rise to prohibitive expenditures in system management, sophisticated extreme models are urgently needed to analyze complex extreme risks. Engineering applications of extreme models usually focus on individual extreme events, which is insufficient for complex systems with correlations. Methodology/results: We introduce an extreme spatial model for multi-output response control systems that efficiently captures the dynamics using a bilinear function on two spatial domains for control variables and measurement locations. Marginal parameter modeling and extremal dependence have been investigated. In addition, an efficient graph-assisted composite likelihood estimation and corresponding computational algorithms are developed to cope with high-dimensional outputs. The application to composite aircraft production shows that the proposed model enables comprehensive analyses with superior predictive performance on extreme events compared to canonical methods. Managerial implications: Our method shows how to use an extreme spatial model for predicting extreme events and managing extreme risks in complex production systems such as aircraft. This can help achieve better quality management and operation safety in aircraft production systems and beyond.

2604.22494 2026-04-27 stat.ML cs.LG

FedSPDnet: Geometry-Aware Federated Deep Learning with SPDnet

Thibault Pautrel, Florent Bouchard, Ammar Mian, Guillaume Ginolhac

2604.22486 2026-04-27 math.ST stat.TH

Laplace Transform driven Stein-type Goodness-of-fit Tests for Pareto Distribution

Deepesh Bhati, Sakshi Khandelwal

Comments 25 pages, 7 tables

2604.22453 2026-04-27 math.PR math.ST stat.TH

Adapted Wasserstein Barycenters of Gaussian Processes

Francesco Mattesini, Johannes Wiesel

Comments Comments very welcome!

2604.22431 2026-04-27 stat.ME

Robust Bayesian Sequential Borrowing for Multi-Population Clinical Programmes

Erik Hermansson, Lynn Dunsire, David Svensson, Thomas Jaki

2604.22391 2026-04-27 stat.ML cs.LG stat.CO stat.ME

Conformalized Super Learner

Zhanli Wu, Fabrizio Leisen, Miguel-Angel Luque-Fernandez, F. Javier Rubio

Comments R codes and data can be found at: https://github.com/ZWU-001/CSL

2604.22386 2026-04-27 stat.ML cs.LG

Pack only the essentials: Adaptive dictionary learning for kernel ridge regression

Daniele Calandriello, Alessandro Lazaric, Michal Valko

Comments In NeurIPS 2016 Workshop on Adaptive and Scalable Nonparametric Methods in Machine Learning (ASNMML)

2604.22385 2026-04-27 stat.ML cs.LG

Pliable rejection sampling

Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric-Ambrym Maillard

Comments In ICML 2016

2604.22366 2026-04-27 math.OC math.ST stat.TH

Statistical Estimation of Monge Transport Maps via Brenier Potentials

Elsa Cazelles, Edouard Pauwels, Léo Portales

2604.22364 2026-04-27 stat.AP stat.CO

Tail-Greedy Unbalanced Haar Wavelet Segmentation for Copy Number Alteration Data

Maharani Ahsani Ummi, Stuart Barber, Henry M. Wood, Arief Gusnanto

Comments 17 pages, 9 figures

2604.22355 2026-04-27 cs.LG math.OC stat.ML

SOC-ICNN: From Polyhedral to Conic Geometry for Learning Convex Surrogate Functions

Kang Liu, Jianchen Hu

Comments 28 pages and no figure

2604.22320 2026-04-27 stat.ME stat.ML

Nonparametric Estimation of Isotropic Covariance Function

Yiming Wang, Sujit K. Ghosh

Comments 39 pages, 7 figures. Published in Journal of Nonparametric Statistics (2023)

2604.22305 2026-04-27 stat.AP

Finite element model updating of building structures under seismic excitation: A parallelized latent space-based Bayesian framework

Taro Yaoyama, Sangwon Lee, Minoru Matsubara, Kenzo Kodera, Takeshi Ugata, Tatsuya Itoi

Details

Abstract

Enhancing seismic fragility and risk assessment of nuclear power plants relies on accurate prediction of reactor building responses to seismic hazards, which can be further improved through dynamic analysis of high-fidelity finite element (FE) models. However, FE models often exhibit non-negligible discrepancies from actual structures due to various sources of uncertainty, necessitating FE model updating with rigorous quantification of associated uncertainties. This paper presents a GPU-accelerated latent space-based Bayesian framework for FE model updating of building structures. In the proposed framework, high-dimensional structural response data (e.g., time histories or frequency response functions) are projected into a low-dimensional latent space using a multimodal variational autoencoder (MVAE), thereby enabling efficient and tractable likelihood evaluation without explicit modeling in the original observation space. Once trained, the surrogate enables amortized inference, allowing posterior sampling to be performed without additional simulator evaluations. We specifically employ a sequential Monte Carlo (SMC) sampler, whose population-based formulation allows parallel evaluation of the approximate likelihood on GPUs, resulting in computational efficiency and robustness against multimodal and complex posterior distributions. The proposed framework is validated through both numerical benchmarking and experimental data from a shaking table test of a reinforced concrete building structure. The results demonstrate that the method accurately estimates structural parameters with well-quantified uncertainties, while achieving fast and efficient inference through GPU-based parallelization, and enabling robust inference even in the presence of sparse observations that induce multimodal and highly complex posterior distributions.

2604.22286 2026-04-27 stat.AP

From specific-source feature-based to common-source score-based likelihood-ratio systems: ranking the stars

Peter Vergeer

2604.22216 2026-04-27 stat.ME

Optimal Stopping in Sequential Clinical Prediction

Hui-Mean Foo, Yuan-chin Ivan Chang

Comments 46, 10

2604.22123 2026-04-27 stat.AP

Modeling Physical Activity Change as Smooth Transformations: Temporal and Amplitude Patterns Associated with Physical Function in Older Women

Rong W. Zablocki, Steve Nguyen, Yacun Wang, Lindsay Dillon, Michael J. LaMonte, Phyllis A. Richey, Ramon Casanova, Marcia L. Stefanick, Sheri J. Hartman, Chongzhi Di, Charles Kooperberg, Loki Natarajan, Andrea Z. LaCroix, Jingjing Zou

Details

Abstract

Background: Minute-level accelerometer data capture rich diurnal physical activity (PA) patterns, but conventional summary metrics obscure clinically meaningful changes accumulated across a day. Building on a Riemannian framework, we integrate multivariate functional principal component analysis (MFPCA) to identify main modes of PA change in older women and examine associations with physical function (PF). Method: A subset of participants, with OPACH as baseline and two WHISH follow-ups (W1, W2), yielded three accelerometer measurements; each participant's diurnal PA at each visit was represented as a smooth curve. Change between consecutive visits (defined as periods: baseline-W1, W1-W2) was modeled as a Riemannian deformation (RD) jointly capturing changes in PA timing and magnitude. Deformations were parameterized by initial momenta and summarized using MFPCA; participant-level changes were characterized by principal component (PC) scores and deformation energy (DE), a metric of overall pattern change. Associations with PF were assessed using linear mixed models. Results: Mean deformation in both periods showed overall downward shifts in PA magnitude with temporal redistribution between 10am and 7pm. The top 15 PCs explained >= 90% of variability in both periods; PC1 represented a pattern of PA increase/decrease throughout the day, explaining 22.4% (baseline-W1) and 20.8% (W1-W2). In the complete-data sample (N=1157), an increase in PA in the mode of PC1 was positively associated with PF (p < 0.0001). The interaction between DE and period was significantly associated with PF (p = 0.003). Conclusions: Modeling longitudinal PA change as RDs and summarizing variability via MFPCA produced clinically interpretable phenotypes of diurnal PA change beyond standard metrics. The leading deformation mode was significantly associated with PF, and DE showed a stronger association with PF in the later period.

2604.22088 2026-04-27 stat.ME stat.AP

Zero-inflated modeling with smoothing on counting tensors

Elena Tuzhilina, Yaoming Zhen

2604.22051 2026-04-27 stat.ME

int3ract: Johnson-Neyman Technique and its Three-Way Extension for Frequentist and Bayesian Models in R

Robert W. Krause

2604.22015 2026-04-27 stat.ME stat.AP stat.ML

Hierarchical Probabilistic Principal Component Analysis of Longitudinal Data

Xinyu Zhang, Ameer Qaqish, D. Y. Lin, Didong Li

2604.21998 2026-04-27 math.ST stat.TH

Minimax Robust Designs for M-Estimated Models

Rui Hu, Douglas P. Wiens

2604.21994 2026-04-27 stat.ME stat.AP stat.CO stat.ML

Contrast-Space Projection for Network Meta-Analysis: An Exact and Invariant Study-Based Decomposition of Direct and Indirect Contributions

Chong Wang, Yanqi Zhang, Zhezhen Jin, Annette O'Connor

Comments 33 pages, 6 figures

2604.18143 2026-04-27 stat.ML cs.LG stat.ME

Distributional Off-Policy Evaluation with Deep Quantile Process Regression

Qi Kuang, Chao Wang, Yuling Jiao, Fan Zhou

Comments Journal of the American Statistical Association

2604.18130 2026-04-27 cs.LG cs.CE stat.AP

An 'Inverse' Experimental Framework to Estimate Market Efficiency

Thomas Asikis, Heinrich H. Nax

Comments Minor fix: added co-author middle name for clarity

Details

Abstract

Digital marketplaces processing billions of dollars annually represent critical infrastructure in sociotechnical ecosystems, yet their performance optimization lacks principled measurement frameworks that can inform algorithmic governance decisions regarding market efficiency and fairness from complex market data. By looking at orderbook data from double auction markets alone, because bids and asks do not represent true maximum willingnesses to buy and true minimum willingnesses to sell, there is little an economist can say about the market's actual performance in terms of allocative efficiency. We turn to experimental data to address this issue, 'inverting' the standard induced value approach of double auction experiments. Our aim is to predict key market features relevant to market efficiency, particularly allocative efficiency, using orderbook data only (specifically bids, asks and price realizations, but not the induced reservation values) as early as possible. Since there is no established model of strategically optimal behavior in these markets, and because orderbook data is highly unstructured, non-stationary and non-linear, we propose quantile-based normalization techniques that help us build general predictive models. We develop and train several models, including linear regressions and gradient boosting trees, leveraging quantile-based input from the underlying supply-demand model. Our models can predict allocative efficiency with reasonable accuracy from the earliest bids and asks, and these predictions improve with additional realized price data. The performance of the prediction techniques varies by target and market type. Our framework holds significant potential for application to real-world market data, offering valuable insights into market efficiency and performance, even prior to any trade realizations.

2604.17578 2026-04-27 cs.LG math.ST stat.TH

Recovery Guarantees for Continual Learning of Dependent Tasks: Memory, Data-Dependent Regularization, and Data-Dependent Weights

Liangzu Peng, Uday Kiran Reddy Tadipatri, Ziqing Xu, Eric Eaton, René Vidal

2604.07169 2026-04-27 stat.ML cs.LG cs.NA math.NA

FLUID: Flow-based Unified Inference for Dynamics

Tiangang Cui, Xiaodong Feng, Chenlong Pei, Xiaoliang Wan, Tao Zhou

Comments 43 pages

2604.04964 2026-04-27 stat.ME

Bayesian Global-Local Shrinkage with Univariate Guidance for Ultra-High-Dimensional Regression

Priyam Das

2603.28470 2026-04-27 econ.EM stat.ME

Counterfactual Density Effects and the German East-West Income Gap

Georg Keilbar, Sonja Greven

2603.10377 2026-04-27 cs.LG cs.AI stat.ME

Causal Concept Graphs in LLM Latent Space for Stepwise Reasoning

Md Muntaqim Meherab, Noor Islam S. Mohammad, Faiza Feroz

Comments We have recently encountered author conflicts related to this work and therefore respectfully request the withdrawal of this paper. We believe this step is necessary to address the situation appropriately and maintain academic integrity in the submission

2602.05639 2026-04-27 cs.LG stat.ML

Joint Embedding Variational Bayes

Amin Oji, Paul Fieguth

2602.00208 2026-04-27 cs.LG cs.AI cs.IR math.ST stat.ML stat.TH

Analyzing Shapley Additive Explanations to Understand Anomaly Detection Algorithm Behaviors and Their Complementarity

Jordan Levy, Paul Saves, Moncef Garouani, Nicolas Verstaevel, Benoit Gaudou

Comments IDA Frontier Prize and Best Paper Award - Intelligent Data Analysis (IDA) 2026, Springer Nature

Details

DOI: 10.1007/978-3-032-23833-7_10
Journal ref: In: IDA (LNCS), Springer, vol 16513 (2026)

Abstract

Unsupervised anomaly detection is a challenging problem due to the diversity of data distributions and the lack of labels. Ensemble methods are often adopted to mitigate these challenges by combining multiple detectors, which can reduce individual biases and increase robustness. Yet building an ensemble that is genuinely complementary remains challenging, since many detectors rely on similar decision cues and end up producing redundant anomaly scores. As a result, the potential of ensemble learning is often limited by the difficulty of identifying models that truly capture different types of irregularities. To address this, we propose a methodology for characterizing anomaly detectors through their decision mechanisms. Using SHapley Additive exPlanations, we quantify how each model attributes importance to input features, and we use these attribution profiles to measure similarity between detectors. We show that detectors with similar explanations tend to produce correlated anomaly scores and identify largely overlapping anomalies. Conversely, explanation divergence reliably indicates complementary detection behavior. Our results demonstrate that explanation-driven metrics offer a different criterion than raw outputs for selecting models in an ensemble. However, we also demonstrate that diversity alone is insufficient; high individual model performance remains a prerequisite for effective ensembles. By explicitly targeting explanation diversity while maintaining model quality, we are able to construct ensembles that are more diverse, more complementary, and ultimately more effective for unsupervised anomaly detection.
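The idea of comparing detectors through their attribution profiles can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attribution values are hypothetical hand-written numbers rather than real SHAP output, the profile is taken to be the normalised mean absolute attribution per feature, and cosine similarity is an assumed choice of similarity measure. The helper names `attribution_profile` and `cosine_similarity` are invented for this example.

```python
import math


def attribution_profile(attributions):
    """Mean absolute attribution per feature, normalised to sum to 1.

    `attributions` is a list of rows (one per sample), each row holding
    one attribution value per input feature.
    """
    n_samples = len(attributions)
    n_features = len(attributions[0])
    means = [
        sum(abs(row[j]) for row in attributions) / n_samples
        for j in range(n_features)
    ]
    total = sum(means)
    return [m / total for m in means] if total > 0 else means


def cosine_similarity(a, b):
    """Cosine similarity between two equally long profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


# Hypothetical attributions for three detectors (2 samples x 3 features).
detector_a = [[0.9, 0.1, 0.0], [0.8, -0.2, 0.0]]
detector_b = [[0.7, 0.2, 0.1], [-0.9, 0.1, 0.0]]
detector_c = [[0.0, 0.1, 0.9], [0.1, 0.0, -0.8]]

profile_a = attribution_profile(detector_a)
profile_b = attribution_profile(detector_b)
profile_c = attribution_profile(detector_c)

# A and B lean on the same feature; C leans on a different one.
sim_ab = cosine_similarity(profile_a, profile_b)
sim_ac = cosine_similarity(profile_a, profile_c)
print(f"similarity A-B: {sim_ab:.3f}, similarity A-C: {sim_ac:.3f}")
```

In this toy setup, a low similarity such as A versus C would flag the pair as a candidate for a complementary ensemble, subject to the abstract's caveat that each detector must also perform well on its own.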

2601.19674 2026-04-27 cs.LG cs.AI stat.AP stat.ME

Cross-Domain Offshore Wind Power Forecasting: Transfer Learning Through Meteorological Clusters

Dominic Weisser, Chloé Hashimoto-Cullen, Benjamin Guedj

Comments 15 pages, 5 figures, Climate Informatics 2026

2601.05414 2026-04-27 cs.CL cs.AI stat.ML

Large Language Models Are Bad Dice Players: LLMs Struggle to Generate Random Numbers from Statistical Distributions

Minda Zhao, Yilun Du, Mengyu Wang

Comments Accepted to ACL 2026 (Main Conference)

2512.24046 2026-04-27 stat.CO

A Bayesian approach with persistent homology prior for Robin coefficient identification in a parabolic problem

Xiaomei Yang, Jiaying Jia, Zhiliang Deng

2508.19753 2026-04-27 stat.AP

Hierarchical Bayesian model updating using Dirichlet process mixtures for structural damage localization

Taro Yaoyama, Tatsuya Itoi, Jun Iyama

Details

DOI: 10.1016/j.ymssp.2026.114020
Journal ref: Mechanical Systems and Signal Processing 248 (114020): 114020 (2025)

Abstract

Bayesian model updating provides a rigorous probabilistic framework for calibrating finite element (FE) models with quantified uncertainties, thereby enhancing damage assessment, response prediction, and performance evaluation of engineering structures. Recent advances in hierarchical Bayesian model updating (HBMU) enable robust parameter estimation under ill-posed/ill-conditioned settings and in the presence of inherent variability in structural parameters due to environmental and operational conditions. However, most HBMU approaches overlook multimodality in structural parameters that often arises when a structure experiences multiple damage states over its service life. This paper presents an HBMU framework that employs a Dirichlet process (DP) mixture prior on structural parameters (DP-HBMU). DP mixtures are nonparametric Bayesian models that perform clustering without pre-specifying the number of clusters, incorporating damage state classification into FE model updating. We formulate the DP-HBMU framework and devise a Metropolis-within-Gibbs sampler that draws samples from the posterior by embedding Metropolis updates for intractable conditionals due to the FE simulator. The applicability of DP-HBMU to damage localization is demonstrated through both numerical and experimental examples. We consider moment-resisting frame structures with beam-end fractures and apply the method to datasets spanning multiple damage states, from an intact state to a moderate or severe damage state. The clusters inferred by DP-HBMU align closely with the assumed or observed damage states. The posterior distributions of stiffness parameters agree with ground truth values or observed fractures while exhibiting substantially reduced uncertainty relative to a non-hierarchical baseline. These results demonstrate the effectiveness of the proposed method in damage localization.

2508.03310 2026-04-27 stat.ME

Robust fuzzy clustering with cellwise outliers

Giorgia Zaccaria, Lorenzo Benzakour, Luis A. García-Escudero, Francesca Greselin, Agustín Mayo-Íscar

2507.13706 2026-04-27 cs.CV math.ST stat.TH

GOSPA and T-GOSPA quasi-metrics for evaluation of multi-object tracking algorithms

Ángel F. García-Fernández, Jinhao Gu, Lennart Svensson, Yuxuan Xia, Jan Krejčí, Oliver Kost, Ondřej Straka

Comments Matlab code of GOSPA and T-GOSPA q-metrics is provided at https://github.com/Agarciafernandez/MTT. Python code of the T-GOSPA q-metric is provided at https://github.com/Agarciafernandez/T-GOSPA-metric-python

2506.11369 2026-04-27 stat.ME stat.CO

Filtration-Based Learning of Multiscale Shared Structures for Multiple Functional Predictors

Shuhao Jiao, Hernando Ombao, Ian W. McKeague

2504.02518 2026-04-27 stat.ML econ.EM q-fin.ST stat.AP stat.CO

Online Multivariate Regularized Distributional Regression for High-dimensional Probabilistic Electricity Price Forecasting

Simon Hirsch

Comments Revised Version March 2026. 40 pages incl. appendix, 14 figures, 7 tables

2501.19277 2026-04-27 stat.ML cs.LG

On Pareto Optimality for Parametric Choice Bandits

Jierui Zuo, Hanzhang Qin

2412.20204 2026-04-27 econ.EM stat.ME

Fitting Dynamically Misspecified Models: An Optimal Transportation Approach

Jean-Jacques Forneron, Zhongjun Qu

2411.03992 2026-04-27 stat.ME stat.CO

Sparse Bayesian joint modal estimation for exploratory item factor analysis

Keiichiro Hijikata, Motonori Oka, Kensuke Okada

2410.23706 2026-04-27 stat.ME

Complex trend inference for high-dimensional piecewise locally stationary time series

Lujia Bai, David Veitch, Weichi Wu, Wenyang Zhang, Zhou Zhou

2407.08750 2026-04-27 stat.ML cs.LG econ.EM stat.AP stat.CO stat.ME

Online Distributional Regression

Simon Hirsch, Jonathan Berrisch, Florian Ziel

Comments Revised version January 2026. 34 pages, 9 figures, 4 tables including appendix

2309.09872 2026-04-27 stat.ME

A Moment-assisted Approach for Improving Subsampling-based MLE with Large-scale data

Miaomiao Su, Qihua Wang, Ruoyu Wang

2103.07818 2026-04-27 stat.ME stat.AP

Quantifying uncertainty in spikes estimated from calcium imaging data

Yiqun T. Chen, Sean W. Jewell, Daniela M. Witten

Comments 52 pages, 12 Figures

2011.00373 2026-04-27 econ.EM stat.ME

Causal Inference for Spatial Treatments

Michael Pollmann