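arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.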

2603.15609 2026-03-17 stat.AP cs.CR cs.CY cs.SI physics.soc-ph

Differential Privacy for Network Connectedness Indices

Tom A. Rutter, Yuxin Liu, M. Amin Rahimian

Comments Code to replicate all of our analyses is available at: https://github.com/TomRutter42/Privacy-for-Connectedness-Indices

2603.15578 2026-03-17 stat.ME stat.ML

Low-Complexity and Consistent Graphon Estimation from Multiple Networks

Roland Boniface Sogan, Tabea Rebafka

Comments Accepted at AISTATS 2026

2603.15576 2026-03-17 cs.LG math.OC stat.ML

Unbiased and Biased Variance-Reduced Forward-Reflected-Backward Splitting Methods for Stochastic Composite Inclusions

Quoc Tran-Dinh, Nghia Nguyen-Trung

Comments 34 pages and 2 figures

2603.15568 2026-03-17 stat.ML cs.LG

Estimating Staged Event Tree Models via Hierarchical Clustering on the Simplex

Muhammad Shoaib, Eva Riccomagno, Manuele Leonelli, Gherardo Varando

2603.15564 2026-03-17 cs.LG stat.AP stat.ML

Predictive Uncertainty in Short-Term PV Forecasting under Missing Data: A Multiple Imputation Approach

Parastoo Pashmchi, Jérôme Benoit, Motonobu Kanagawa

Comments 10 pages

2603.15388 2026-03-17 cs.LG cs.AI cs.RO stat.ML

Efficient Morphology-Control Co-Design via Stackelberg Proximal Policy Optimization

Yanning Dai, Yuhui Wang, Dylan R. Ashley, Jürgen Schmidhuber

Comments presented at the Fourteenth International Conference on Learning Representations; 11 pages in main text + 3 pages of references + 23 pages of appendices, 5 figures in main text + 11 figures in appendices, 16 tables in appendices; accompanying website available at https://yanningdai.github.io/stackelberg-ppo-co-design/ ; source code available at https://github.com/YanningDai/StackelbergPPO

2603.15384 2026-03-17 stat.ML cs.LG math.AT math.ST stat.TH

Persistence Spheres: a Bi-continuous Linear Representation of Measures for Partial Optimal Transport

Matteo Pegoraro

Details

Abstract

We improve and extend persistence spheres, introduced in Pegoraro (2025). Persistence spheres map an integrable measure $μ$ on the upper half-plane, including persistence diagrams (PDs) as counting measures, to a function $S(μ)\in C(\mathbb{S}^2)$, and the map is stable with respect to the 1-Wasserstein partial transport distance $\mathrm{POT}_1$. Moreover, to the best of our knowledge, persistence spheres are the first explicit representation used in topological machine learning for which continuity of the inverse on the image is established at every compactly supported target. Recent bounded-cardinality bi-Lipschitz embedding results in partial transport spaces, despite being powerful, are not given by the kind of explicit summary map considered here. Our construction is rooted in convex geometry: for positive measures, the defining ReLU integral is the support function of the lift zonoid. Building on Pegoraro (2025), we refine the definition to better match the $\mathrm{POT}_1$ deletion mechanism, encoding partial transport via a signed diagonal augmentation. In particular, for integrable $μ$, the uniform norm between $S(0)$ and $S(μ)$ depends only on the persistence of $μ$, without any need for ad hoc re-weightings, reflecting optimal transport to the diagonal at persistence cost. This yields a parameter-free representation at the level of measures (up to numerical discretization), while accommodating future extensions where $μ$ is a smoothed measure derived from PDs (e.g., persistence intensity functions; Wu et al., 2024). Across clustering, regression, and classification tasks involving functional data, time series, graphs, meshes, and point clouds, the updated persistence spheres are competitive and often improve upon persistence images, persistence landscapes, persistence splines, and sliced Wasserstein kernel baselines.

2603.15340 2026-03-17 cs.CL stat.ML

DOS: Dependency-Oriented Sampler for Masked Diffusion Language Models

Xueyu Zhou, Yangrong Hu, Jian Huang

Comments 16 pages, 5 figures

2603.15336 2026-03-17 stat.ML cs.LG

Active Seriation: Efficient Ordering Recovery with Statistical Guarantees

James Cheshire, Yann Issartel

2603.15292 2026-03-17 stat.ML cs.AI cs.LG

Scalable Simulation-Based Model Inference with Test-Time Complexity Control

Manuel Gloeckler, J. P. Manzano-Patrón, Stamatios N. Sotiropoulos, Cornelius Schröder, Jakob H. Macke

2603.15215 2026-03-17 stat.OT

Deepest voting on rankings

Jean-Baptiste Aubin, Antoine Rolland, Ioana Gavra, Irène Gannaz, Jacques Anderson Kouassi

2603.15189 2026-03-17 stat.ML cs.LG

The Sampling Complexity of Condorcet Winner Identification in Dueling Bandits

El Mehdi Saad, Victor Thuot, Nicolas Verzelen

2603.15149 2026-03-17 stat.ME econ.GN q-fin.EC stat.AP

Measuring the depth of multidimensional poverty with ordinal data

Fernando Flores Tavares

2603.15121 2026-03-17 cs.LG stat.ML

Establishing Construct Validity in LLM Capability Benchmarks Requires Nomological Networks

Timo Freiesleben

2603.15082 2026-03-17 stat.ME math.ST stat.TH

Identifying Topological Differences in Two Populations of Random Geometric Objects

Satish Kumar, Subhra Sankar Dhar

2603.15016 2026-03-17 cs.CV stat.ML

Riemannian Motion Generation: A Unified Framework for Human Motion Representation and Generation via Riemannian Flow Matching

Fangran Miao, Jian Huang, Ting Li

Comments 18 pages, 6 figures

2603.14991 2026-03-17 math.ST math.OC stat.TH

Wasserstein Distributionally Robust Quantile Regression

Chunxu Zhang, Tiantian Mao, Ruodu Wang

2603.14961 2026-03-17 math.ST stat.TH

Performance of Efron and Tibshirani's semiparametric density estimator

Nils Lid Hjort

Comments 15 pages, no figures; Statistical Research Report, Department of Mathematics, University of Oslo, from December 1995, but arXiv'd March 2026

2603.14859 2026-03-17 stat.AP

vPET-ABC: Fast Voxelwise Approximate Bayesian Inference for Kinetic Modeling in PET

Qinlin Gu, Gaelle M. Emvalomenos, Evan D. Morris, Clara Grazian, Steven R. Meikle

Comments Q. Gu and G. M. Emvalomenos contributed equally to this work

Details

Abstract

Dynamic PET kinetic modeling increasingly demands voxelwise uncertainty quantification and robust model selection. Yet total-body PET (TB-PET) data volumes make conventional Bayesian approaches, such as per-voxel MCMC, computationally impractical, while deep models typically require retraining and careful revalidation when tracers, protocols, or kinetic models change, without necessarily improving inference speed. Vectorized voxelwise approximate Bayesian computation (vPET-ABC) is introduced as a likelihood-free, model-agnostic posterior inference framework for dynamic PET kinetic modeling at total-body scale. The method replaces explicit likelihood evaluation with forward simulations and a discrepancy test, then exploits full vectorization to transform voxelwise inference into an embarrassingly parallel workload suited to modern GPUs. In simulation, vPET-ABC produced posterior summaries with small divergence from sequential Monte Carlo baselines, and posterior mean estimates significantly more accurate than non-negative least squares (NNLS). For model selection between the linear parametric neurotransmitter model (lp-ntPET) and the multilinear reference tissue model, vPET-ABC maintained high sensitivity under high noise with moderate loss of specificity, whereas NNLS+Bayesian information criteria exhibited the opposite trade-off with near-zero sensitivity. In a human cigarette smoking dataset, vPET-ABC yielded denser probabilistic activation maps than lp-ntPET with effective number of parameters. On a 50 min total-body [18F]FDG study, vPET-ABC generated high-quality whole-volume $K_i$ parametric images within practical runtimes on a single GPU, while also preserving local spatial correlation better than NNLS. Overall, vPET-ABC delivers fast, training-free, uncertainty-aware inference that scales to TB-PET and remains portable across tracers and kinetic models.
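
The idea of replacing likelihood evaluation with forward simulations and a discrepancy check can be illustrated with a minimal rejection-ABC sketch. This is not the vPET-ABC implementation: the mono-exponential `simulate` model, the uniform prior on [0, 2], the squared-error discrepancy, and the `keep_frac` acceptance rule are all assumptions chosen for illustration, and NumPy broadcasting on a CPU stands in for GPU vectorization.

```python
import numpy as np


def simulate(theta, t):
    """Hypothetical forward model: one mono-exponential curve per parameter draw.

    theta: (n_draws,), t: (n_times,) -> array of shape (n_draws, n_times).
    """
    return np.exp(-np.outer(theta, t))


def abc_rejection(observed, t, n_draws=2000, keep_frac=0.05, seed=0):
    """Rejection ABC for all voxels at once.

    observed: (n_voxels, n_times). Returns per-voxel posterior mean and sd.
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0, size=n_draws)   # assumed prior, shared by all voxels
    sims = simulate(theta, t)                     # (n_draws, n_times)

    # Discrepancy between every voxel and every simulated curve, via broadcasting.
    diff = observed[:, None, :] - sims[None, :, :]
    dist = np.sum(diff ** 2, axis=2)              # (n_voxels, n_draws)

    # Accept the k draws with the smallest discrepancy for each voxel.
    k = max(1, int(keep_frac * n_draws))
    idx = np.argpartition(dist, k - 1, axis=1)[:, :k]
    accepted = theta[idx]                         # (n_voxels, k)
    return accepted.mean(axis=1), accepted.std(axis=1)
```

Because no voxel depends on any other, the distance matrix is computed in one array operation, which is the property that makes this kind of inference embarrassingly parallel.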

2603.14835 2026-03-17 math.ST stat.AP stat.TH

On the evaluation of time-to-event, survival time and first passage time forecasts

Robert J. Taggart, Nicholas Loveday, Simon Louis

2603.14815 2026-03-17 stat.ME

On Heterogeneity in Wasserstein Space

Kisung You

2603.11867 2026-03-17 stat.ME stat.ML

Data Fusion with Distributional Equivalence Test-then-pool

Linying Yang, Xing Liu, Robin J. Evans

2603.08682 2026-03-17 stat.ML cs.LG

Structural Causal Bottleneck Models

Simon Bing, Jonas Wahl, Jakob Runge

2603.06134 2026-03-17 stat.ME stat.AP

Clustering-Based Outcome Models for Clinical Studies: A Scoping Review

Johannes Vilsmeier, Fabian Eibensteiner, Franz König, Francois Mercier, Robin Ristl, Nigel Stallard, Marc Vandemeulebroecke, Sarah Zohar, Martin Posch

2603.05340 2026-03-17 stat.ML cs.LG math.ST stat.TH

On the Statistical Optimality of Optimal Decision Trees

Zineng Xu, Subhro Ghosh, Yan Shuo Tan

2601.11213 2026-03-17 physics.optics math-ph math.MP stat.AP

Light Propagation through Space-Time Non-Markovian Random Media

Chaoran Wang, Jinquan Qi, Shuang Liu, Chenjin Deng, Shensheng Han

Comments 3 figures

2601.03220 2026-03-17 cs.LG stat.ML

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence

Marc Finzi, Shikai Qiu, Yiding Jiang, Pavel Izmailov, J. Zico Kolter, Andrew Gordon Wilson

Comments Code available at https://github.com/shikaiqiu/epiplexity

Details

Abstract

Can we learn more from data than existed in the generating process itself? Can new and useful information be constructed from merely applying deterministic transformations to existing data? Can the learnable content in data be evaluated without considering a downstream task? On these questions, Shannon information and Kolmogorov complexity come up nearly empty-handed, in part because they assume observers with unlimited computational capacity and do not target the useful information content. In this work, we identify and exemplify three seeming paradoxes in information theory: (1) information cannot be increased by deterministic transformations; (2) information is independent of the order of data; (3) likelihood modeling is merely distribution matching. To shed light on the tension between these results and modern practice, and to quantify the value of data, we introduce epiplexity, a formalization of information capturing what computationally bounded observers can learn from data. Epiplexity captures the structural content in data while excluding time-bounded entropy, the random unpredictable content exemplified by pseudorandom number generators and chaotic dynamical systems. With these concepts, we demonstrate how information can be created with computation, how it depends on the ordering of the data, and how likelihood modeling can produce more complex programs than present in the data generating process itself. We also present practical procedures to estimate epiplexity which we show capture differences across data sources, track with downstream performance, and highlight dataset interventions that improve out-of-distribution generalization. In contrast to principles of model selection, epiplexity provides a theoretical foundation for data selection, guiding how to select, generate, or transform data for learning systems.

2511.16568 2026-03-17 math.OC math.ST stat.ML stat.TH

Failure of uniform laws of large numbers for subdifferentials and beyond

Lai Tian, Johannes O. Royset

Comments 17 pages, 2 figures; Section 2.3 now includes new discussion of SAA and subdifferential approximation

2511.02660 2026-03-17 math.ST econ.EM stat.ME stat.TH

Spectral analysis of high-dimensional spot volatility matrix with applications

Qiang Liu, Yiming Liu, Zhi Liu, Wang Zhou

2510.13496 2026-03-17 math.NA cs.NA stat.ML

Data-intrinsic approximation in metric spaces

Jürgen Dölz, Michael Multerer

2510.01112 2026-03-17 astro-ph.GA astro-ph.CO cs.LG stat.AP stat.ME

The causal structure of galactic astrophysics

Harry Desmond, Joseph Ramsey

Comments 10 pages, 4 figures; published in the Open Journal of Astrophysics

2509.24912 2026-03-17 stat.ML cs.LG

When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis

Xiang Li, Zebang Shen, Ya-Ping Hsieh, Niao He

Comments Accepted at ICLR 2026

2509.19040 2026-03-17 stat.ME

Nonparametric efficient estimation of the longitudinal front-door functional

Marie S. Breum, Helene C. W. Rytgaard, Torben Martinussen, Erin E. Gabriel

2509.08731 2026-03-17 cs.LG stat.ML

Generating solution paths of Markovian stochastic differential equations using diffusion models

Xuefeng Gao, Jiale Zha, Xun Yu Zhou

2508.06431 2026-03-17 quant-ph stat.OT

Nonparametric Learning Non-Gaussian Quantum States of Continuous Variable Systems

Liubov A. Markovich, Xiaoyu Liu, Jordi Tura

2506.19837 2026-03-17 stat.ML cs.LG

Convergence and clustering analysis for Mean Shift with radially symmetric, positive definite kernels

Susovan Pal

2504.00919 2026-03-17 math.ST stat.ML stat.TH

Nonparametric spectral density estimation using interactive mechanisms under local differential privacy

Cristina Butucea, Karolina Klockmann, Tatyana Krivobokova

Comments 56 pages, 3 figures

2501.15926 2026-03-17 math.ST stat.TH

Minimax convergence rates of a binary classification procedure for time-homogeneous SDE paths

Eddy Michel Ella Mintsa

Comments 39 pages

2501.10229 2026-03-17 stat.ML cs.LG stat.CO

Amortized Bayesian Mixture Models

Šimon Kucharský, Paul Christian Bürkner

Comments 34 pages, 17 figures

2412.12407 2026-03-17 stat.ME

Inside-out cross-covariance for spatial multivariate data

Michele Peruzzi

2412.01010 2026-03-17 stat.ML cs.LG

A Note on Estimation Error Bound and Grouping Effect of Transfer Elastic Net

Yui Tomo

2404.05006 2026-03-17 math.ST math.PR stat.TH

High-dimensional bootstrap and asymptotic expansion

Yuta Koike

Comments 66 pages, 1 figure, 2 tables. Some typos were corrected. The order of the subsections in Appendix B was rearranged

2309.05435 2026-03-17 stat.CO

Parallel Selected Inversion for Space-Time Gaussian Markov Random Fields

Abylay Zhumekenov, Elias T. Krainski, Håvard Rue

Comments Published in Statistics and Computing (2025). Expanded version with additional results, discussion, and references

2307.01111 2026-03-17 stat.CO stat.ME

A Gaussian process and linear-based framework for computing cut distributions in modular Bayesian calibration of two chained computer models

Oumar Baldé, Guillaume Damblin, Amandine Marrel, Antoine Bouloré, Loïc Giraldi

Comments 44 pages, 14 figures

2305.02881 2026-03-17 quant-ph cs.LG hep-ex stat.ML

Trainability barriers and opportunities in quantum generative modeling

Manuel S. Rudolph, Sacha Lerch, Supanut Thanasilp, Oriel Kiss, Oxana Shaya, Sofia Vallecorsa, Michele Grossi, Zoë Holmes

Comments 21+44 pages, 10+2 figures

2210.09502 2026-03-17 stat.ME

Small Area Estimation using EBLUPs under the Nested Error Regression Model

Ziyang Lyu, A. H. Welsh

Comments 35 pages, including 6 tables and 2 figures

Details

DOI: 10.5705/ss.202023.0045
Journal ref: Statistica Sinica 35 (2025), 1277-1299

Abstract

Estimating characteristics of domains (referred to as small areas) within a population from sample surveys of the population is an important problem in survey statistics. In this paper, we consider model-based small area estimation under the nested error regression model. We discuss the construction of mixed model estimators (empirical best linear unbiased predictors, EBLUPs) of small area means and the conditional linear predictors of small area means. Under the asymptotic framework of increasing numbers of small areas and increasing numbers of units in each area, we establish asymptotic linearity results and central limit theorems for these estimators which allow us to establish asymptotic equivalences between estimators, approximate their sampling distributions, obtain simple expressions for and construct simple estimators of their asymptotic mean squared errors, and justify asymptotic prediction intervals. We present model-based simulations that show that in quite small, finite samples, our mean squared error estimator performs as well as or better than the widely used Prasad and Rao (1990) estimator and is much simpler, so is easier to interpret. We also carry out a design-based simulation using real data on consumer expenditure on fresh milk products to explore the design-based properties of the mixed model estimators. We explain and interpret some surprising simulation results through analysis of the population and further design-based simulations. The simulations highlight important differences between the model- and design-based properties of mixed model estimators in small area estimation.

1908.01943 2026-03-17 q-fin.RM math.ST stat.TH

Stochastic comparisons of sample mean differences for multivariate random variables

Xuehua Yin, Dan Zhu, Chuancun Yin

Comments 14 pages

2603.14768 2026-03-17 cs.LG stat.ML

Understanding the geometry of deep learning with decision boundary volume

Matthew Burfitt, Jacek Brodzki, Pawel Dłotko

2603.14752 2026-03-17 stat.ME

Prior- and likelihood-free probabilistic inference with finite-sample calibration guarantees

Leonardo Cella, Emily C. Hector

Comments 26 pages, 6 Figures

2603.14744 2026-03-17 quant-ph cs.CC math.OC math.ST stat.TH

Towards Exponential Quantum Improvements in Solving Cardinality-Constrained Binary Optimization

Haomu Yuan, Hanqing Wu, Kuan-Cheng Chen, Bin Cheng, Crispin H. W. Barnes

Comments 19 pages

2603.14704 2026-03-17 cs.LG cs.CV stat.ML

Chain-of-Trajectories: Unlocking the Intrinsic Generative Optimality of Diffusion Models via Graph-Theoretic Planning

Ping Chen, Xiang Liu, Xingpeng Zhang, Fei Shen, Xun Gong, Zhaoxiang Liu, Zezhou Chen, Huan Hu, Kai Wang, Shiguo Lian

Comments 12 figures, 5 tables

2603.14676 2026-03-17 stat.ME stat.ML

Scalable Text-Embedding-informed Cognitive Diagnosis of Large Language Models

Jia Liu, Zhiyu Xu, Yuqi Gu

Comments 34 pages of main text, 12 pages of appendix, 7 figures

2603.14608 2026-03-17 cs.LG cs.AI math.OC stat.ML

Delightful Policy Gradient

Ian Osband

2603.14578 2026-03-17 stat.ML cs.LG math.PR

Power-Law Spectrum of the Random Feature Model

Elliot Paquette, Ke Liang Xiao, Yizhe Zhu

2603.14576 2026-03-17 quant-ph cs.LG stat.ML

IQP Born Machines under Data-dependent and Agnostic Initialization Strategies

Sacha Lerch, Joseph Bowles, Ricard Puig, Erik Armengol, Zoë Holmes, Supanut Thanasilp

Comments 16 + 35 pages, 3 + 4 figures

2603.14547 2026-03-17 math.NA cs.IT cs.NA math.IT math.ST stat.TH

Maximum Entropy Least Squares Solutions of Overdetermined Linear Systems

Felice Iavernaro, Monica Lazzo, Lorenzo Pisani

Comments 34 pages, 10 figures

2603.14543 2026-03-17 stat.ME stat.ML

Gradient Boosting for Spatial Panel Models with Random and Fixed Effects

Michael Balzer, Adhen Benlahlou

2603.14514 2026-03-17 cs.LG cs.SY eess.SY math.OC stat.ML

High-Probability Bounds for SGD under the Polyak-Lojasiewicz Condition with Markovian Noise

Avik Kar, Siddharth Chandak, Rahul Singh, Eric Moulines, Shalabh Bhatnagar, Nicholas Bambos

Comments Submitted to SIAM Journal on Optimization

2603.14481 2026-03-17 stat.ML cs.LG math.PR

Convergence of Two Time-Scale Stochastic Approximation: A Martingale Approach

Mathukumalli Vidyasagar

Comments 21 pages

2603.14423 2026-03-17 math.ST cs.IT math.IT stat.ME stat.TH

Tighter Confidence Intervals under Without Replacement Sampling via Empirical Rate Functions

Shubhanshu Shekhar, Aaditya Ramdas

Comments 39 pages, 4 figures

2603.14387 2026-03-17 stat.ME cs.LG stat.ML

Label Noise Cleaning for Supervised Classification via Bernoulli Random Sampling

Yuxin Liu, Xiong Jin, Yang Han

2603.14381 2026-03-17 stat.ME

A Bayesian Critique of Rank-Based Methods for Surrogate Marker Evaluation

Pietro Carlotti, Layla Parast

2603.14325 2026-03-17 cs.IT math.IT stat.ML

Fundamental Limits of CSI Compression in FDD Massive MIMO

Bumsu Park, Youngmok Park, Chanho Park, Namyoon Lee

Comments 14 pages, 5 figures

Details

Abstract

Channel state information (CSI) feedback in frequency-division duplex (FDD) massive multiple-input multiple-output (MIMO) systems is fundamentally limited by the high dimensionality of wideband channels. In this paper, we model the stacked wideband CSI vector as a Gaussian-mixture source with a latent geometry state that represents different propagation environments. Each component corresponds to a locally stationary regime characterized by a correlated proper complex Gaussian distribution with its own covariance matrix. This representation captures the multimodal nature of practical CSI datasets while preserving the analytical tractability of Gaussian models. Motivated by this structure, we propose Gaussian-mixture transform coding (GMTC), a practical CSI feedback architecture that combines state inference with state-adaptive TC. The mixture parameters are learned offline from channel samples and stored as a shared statistical dictionary at both the user equipment (UE) and the base station. For each CSI realization, the UE identifies the most likely geometry state, encodes the corresponding label using a lossless source code, and compresses the CSI using the Karhunen-Loeve transform matched to that state. We further characterize the fundamental limits of CSI compression under this model by deriving analytical converse and achievability bounds on the rate-distortion (RD) function. A key structural result is that the optimal bit allocation across all mixture components is governed by a single global reverse-waterfilling level. Simulations on the COST2100 dataset show that GMTC significantly improves the RD tradeoff relative to neural transform coding approaches while requiring substantially smaller model memory and lower inference complexity. These results indicate that near-optimal CSI compression can be achieved through state-adaptive TC without relying on large neural encoders.
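
For orientation, classical reverse water-filling for independent real Gaussian components can be sketched as follows. This is the textbook single-source version, not the paper's mixture bound or the GMTC scheme; the function name, the bisection search, and the example variances are assumptions made for illustration.

```python
import numpy as np


def reverse_waterfill(variances, total_distortion):
    """Textbook reverse water-filling for independent real Gaussian components.

    Finds the level such that sum(min(level, variance_i)) equals the distortion
    budget, then returns (level, per-component distortion, per-component rate in bits).
    """
    variances = np.asarray(variances, dtype=float)
    if not 0.0 < total_distortion <= variances.sum():
        raise ValueError("distortion budget must lie in (0, sum of variances]")

    # Bisection on the water level: total distortion is nondecreasing in the level.
    lo, hi = 0.0, variances.max()
    for _ in range(200):
        level = 0.5 * (lo + hi)
        if np.minimum(level, variances).sum() > total_distortion:
            hi = level
        else:
            lo = level
    level = 0.5 * (lo + hi)

    # Components below the level are not coded: distortion equals variance, rate is 0.
    distortion = np.minimum(level, variances)
    rates = 0.5 * np.log2(variances / distortion)
    return level, distortion, rates


# Example with assumed variances and an assumed distortion budget.
level, distortion, rates = reverse_waterfill([4.0, 1.0, 0.25], 1.5)
```

A single level governs every component: those with variance above it share the same distortion, and the rest receive no bits.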

2603.14233 2026-03-17 stat.ME

Conformalized Robust Principal Component Analysis

Liangliang Yuan, Lei Wang, Quan Kong, Liuhua Peng

2603.14231 2026-03-17 stat.ME

Rank-based Maxsum test for high dimensional regression coefficient

Ping Zhao, Liangliang Yuan

Comments 1 pages, 1 table, 2 figures

2603.14129 2026-03-17 stat.ME

Semiparametric copula-based quantile regression for semicontinuous outcomes with application to healthcare data

Guanjie Lyu, Mohamed Belalia, Abdulkadir Hussein

Comments 25 pages, 2 figures

2603.14103 2026-03-17 cs.MS math.ST stat.ML stat.TH

Scorio.jl: A Julia package for ranking stochastic responses

Mohsen Hariri, Michael Hinczewski, Vipin Chaudhary

2603.14092 2026-03-17 cs.LG stat.ME

Soft Mean Expected Calibration Error (SMECE): A Calibration Metric for Probabilistic Labels

Michael Leznik

2603.14070 2026-03-17 stat.ME stat.ML

Structured Credal Learning

Varun Venkatesh, Eyke Hüllermeier, Bernd Bischl, Mina Rezaei

2603.14037 2026-03-17 math.ST stat.TH

On a Strictly Decreasing Nonparametric Estimator of the Drift Function for Recurrent Diffusion Processes

Nicolas Marie

Comments 17 pages, 2 figures

2603.14034 2026-03-17 cs.SI stat.AP

A Machine Learning Framework for Constructing Heterogeneous Contact Networks: Implications for Epidemic Modelling

Luke Murray Kearney, Emma L Davis, Matt J Keeling

Comments 41 pages, 8 figures

Details

Abstract

Capturing the structured mixing within a population is key to the reliable projection of infectious disease dynamics and hence informed control. Both heterogeneity in the number of contacts and age-structured mixing have been repeatedly demonstrated as fundamental, yet are rarely combined. Networks provide a powerful and intuitive method to realise population structure, and simulate infection dynamics. However, the explicit measurement of contact networks is not scalable to larger populations. Here, using data from social contact surveys, we develop a generalisable and robust algorithm utilizing machine learning to generate a surrogate population-scale network that preserves both age-structured mixing and heterogeneity of contacts. We simulate the spread of infection across different populations, considering how the epidemic size varies over basic reproduction number ($R_0$) scenarios - mirroring the process of determining public health impact from early epidemic growth. Our approach shows that both age structure and degree heterogeneity substantially reduce the epidemic size. We also demonstrate that these simulations more accurately capture the heterogeneity in secondary cases observed for COVID-19 when transmission is scaled by contact duration, dampening the effect of highly connected "super-spreaders". By using survey data collected during 2020-2022, these network models also inform about the impacts of control and targeting of public health interventions: quantifying the non-linear reduction in transmission opportunities that occurred during lockdowns, and the ages and contact types most responsible for onward transmission. Our robust methodology therefore allows the full wealth of data commonly collected by surveys, but frequently overlooked, to be incorporated into more realistic transmission models of infectious diseases.

2603.13953 2026-03-17 math.ST math.PR stat.TH

Random discrete copulas

Damjana Kokol Bukovšek, Blaž Mojškerc, Nik Stopar

Comments 19 pages, 3 figures

2603.13935 2026-03-17 math.ST math.PR stat.ME stat.TH

A two-sample test for symmetric positive definite matrix distributions using Wishart kernel density estimators

Frédéric Ouimet

Comments 34 pages, 0 figures

2603.13930 2026-03-17 stat.ME math.ST stat.TH

Spatially Varying Coefficient Mallows Model Averaging

Zhuang Yong, Lv Jing, Tingting Li

2603.10886 2026-03-17 stat.ML cs.LG stat.ME

Kernel Tests of Equivalence

Xing Liu, Axel Gandy

Comments 29 pages; 6 figures

2603.05961 2026-03-17 stat.AP physics.data-an

A Tutorial on Bayesian Analysis of Linear Shock Compression Data

Jason Bernstein, Philip C. Myint, Beth A. Lindquist, Justin Lee Brown

Comments 29 pages, 14 figures

2603.04365 2026-03-17 math.PR cs.NA math.NA math.ST stat.TH

Comparison theorems for the extreme eigenvalues of a random symmetric matrix

Joel A. Tropp

Comments 32 pages. v2 with minor corrections

2512.24413 2026-03-17 stat.ME

Demystifying Proximal Causal Inference

Grace V. Ringlein, Trang Quynh Nguyen, Peter P. Zandi, Elizabeth A. Stuart, Harsh Parikh

Comments 33 pages, 5 figures

2512.06522 2026-03-17 stat.ME math.ST stat.ML stat.TH

Hierarchical Clustering With Confidence

Di Wu, Jacob Bien, Snigdha Panigrahi

Comments 57 Pages, 11 Figures, 2 Algorithms

2511.09677 2026-03-17 cs.LG stat.ML

Boosted GFlowNets: Improving Exploration via Sequential Learning

Pedro Dall'Antonia, Tiago da Silva, Daniel Augusto de Souza, César Lincoln C. Mattos, Diego Mesquita

Comments 11 pages, 3 figures (22 pages total including supplementary material)

2511.03596 2026-03-17 stat.AP stat.ME

Accounting for Heavy Censoring in Evaluating the Risk Stratification Abilities of Existing Models for Time to Diagnosis of Huntington Disease

Kyle F. Grosser, Abigail G. Foes, Stellen Li, Vraj Parikh, Tanya P. Garcia, Sarah C. Lotspeich

Comments 16 pages, 4 tables, 2 figures

Details

Abstract

Huntington disease (HD) is a neurodegenerative disease with progressively worsening symptoms. Accurately modeling time to HD diagnosis is essential for clinical trial design. Langbehn's model, the CAG-Age Product (CAP) model, the Prognostic Index Normed (PIN) model, and the Multivariate Risk Score (MRS) model have all been proposed for this task. However, these models may yield conflicting predictions and few studies have systematically compared their performance. Further, those that have could be misleading due to testing the models on the same data used to train them and failing to account for high rates of right censoring (80%+) in performance metrics. We discuss the theoretical foundations of these models, offering intuitive comparisons about their practical feasibility. We externally validate their risk stratification abilities using data from the ENROLL-HD study and two censoring-appropriate performance metrics, guiding model selection for HD clinical trial design. As these models were developed in HD studies that ended more than a decade ago, we compared their predictive performance using published parameters versus updated ones (re-estimated using ENROLL-HD). We show how these models can be used to estimate sample sizes for an HD clinical trial. Based on either metric and using published or updated parameters, the MRS model, which incorporates the most covariates, performed best. However, the simpler PIN model offered similarly good performance while requiring fewer variables, many of which would require patients to undergo additional tests. In illustrating an HD clinical trial design, we defined an optimal threshold based on model performance metrics to determine which patients are more likely to be diagnosed. Sample size calculations using an optimal threshold based on metrics that did not account for censoring, as in previous studies, are shown to lead to underpowered trials.

2510.26204 2026-03-17 math.ST cs.IT eess.SP math.IT stat.TH

Sequential Change Detection Under Markov Setup With Unknown Prechange And Postchange Distributions

Ashish Bhoopesh Gulaguli, Shashwat Singh, Rakesh Kumar Bansal

Comments 6 pages, theoretical paper, Pre-print

2510.10870 2026-03-17 stat.ML cs.LG math.ST stat.ME stat.TH

Transfer Learning with Distance Covariance for Random Forest: Error Bounds and an EHR Application

Chenze Li, Subhadeep Paul

2510.04647 2026-03-17 math.OC stat.ML

On decomposability and subdifferential of the tensor nuclear norm

Jiewen Guan, Bo Jiang, Zhening Li

2510.04582 2026-03-17 stat.CO math.OC math.PR

Constrained Dikin-Langevin diffusion for polyhedra

James Chok, Domenic Petzinna

2509.23711 2026-03-17 cs.LG cs.AI math.OC stat.ML

Deterministic Policy Gradient for Reinforcement Learning with Continuous Time and State

Ziheng Cheng, Xin Guo, Yufei Zhang

2509.02661 2026-03-17 cs.AI astro-ph.IM cond-mat.mtrl-sci cs.LG physics.data-an stat.ML

The Future of Artificial Intelligence and the Mathematical and Physical Sciences (AI+MPS)

Andrew Ferguson, Marisa LaFleur, Lars Ruthotto, Jesse Thaler, Yuan-Sen Ting, Pratyush Tiwary, Soledad Villar, E. Paulo Alves, Jeremy Avigad, Simon Billinge, Camille Bilodeau, Keith Brown, Emmanuel Candes, Arghya Chattopadhyay, Bingqing Cheng, Jonathan Clausen, Connor Coley, Andrew Connolly, Fred Daum, Sijia Dong, Chrisy Xiyu Du, Cora Dvorkin, Cristiano Fanelli, Eric B. Ford, Luis Manuel Frutos, Nicolás García Trillos, Cecilia Garraffo, Robert Ghrist, Rafael Gomez-Bombarelli, Gianluca Guadagni, Sreelekha Guggilam, Sergei Gukov, Juan B. Gutiérrez, Salman Habib, Johannes Hachmann, Boris Hanin, Philip Harris, Murray Holland, Elizabeth Holm, Hsin-Yuan Huang, Shih-Chieh Hsu, Nick Jackson, Olexandr Isayev, Heng Ji, Aggelos Katsaggelos, Jeremy Kepner, Yannis Kevrekidis, Michelle Kuchera, J. Nathan Kutz, Branislava Lalic, Ann Lee, Matt LeBlanc, Josiah Lim, Rebecca Lindsey, Yongmin Liu, Peter Y. Lu, Sudhir Malik, Vuk Mandic, Vidya Manian, Emeka P. Mazi, Pankaj Mehta, Peter Melchior, Brice Ménard, Jennifer Ngadiuba, Stella Offner, Elsa Olivetti, Shyue Ping Ong, Christopher Rackauckas, Philippe Rigollet, Chad Risko, Philip Romero, Grant Rotskoff, Brett Savoie, Uros Seljak, David Shih, Gary Shiu, Dima Shlyakhtenko, Eva Silverstein, Taylor Sparks, Thomas Strohmer, Christopher Stubbs, Stephen Thomas, Suriyanarayanan Vaikuntanathan, Rene Vidal, Francisco Villaescusa-Navarro, Gregory Voth, Benjamin Wandelt, Rachel Ward, Melanie Weber, Risa Wechsler, Stephen Whitelam, Olaf Wiest, Mike Williams, Zhuoran Yang, Yaroslava G. Yingling, Bin Yu, Shuwen Yue, Ann Zabludoff, Huimin Zhao, Tong Zhang

Comments Community Paper from the NSF Future of AI+MPS Workshop, Cambridge, Massachusetts, March 24-26, 2025, supported by NSF Award Number 2512945; v2: minor clarifications; v3: approximate version to appear in MLST

2507.00923 2026-03-17 stat.CO

ForLion: An R Package for Finding Optimal Experimental Designs with Mixed Factors

Siting Lin, Yifei Huang, Jie Yang

Comments 33 pages, 5 figures, 5 tables

2506.13646 2026-03-17 stat.ME

Parsimonious Compactly Supported Covariance Models in the Gauss Hypergeometric Class: Identifiability, Reparameterizations, and Asymptotic Properties

Moreno Bevilacqua, Christian Caamaño-Carrillo, Tarik Faouzi, Xavier Emery

Comments 25 pages, 8 figures

2505.23260 2026-03-17 stat.ML cs.LG

Stable Thompson Sampling: Valid Inference via Variance Inflation

Budhaditya Halder, Shubhayan Pan, Koulik Khamaru

2505.16124 2026-03-17 stat.ME math.ST stat.TH

Controlling the false discovery rate in high-dimensional linear models using model-X knockoffs and $p$-values

Jinyuan Chang, Chenlong Li, Cheng Yong Tang, Zhengtian Zhu

2504.21068 2026-03-17 math.CO math.ST stat.TH

Polyhedral Aspects of Maxoids

Tobias Boege, Kamillo Ferry, Benjamin Hollering, Francesco Nowell

Comments 29 pages, 7 figures. Submitted to the Kybernetika special edition for WUPES'25

2504.14372 2026-03-17 cs.LG cs.AI cs.CE cs.CV stat.ML

Learning Enhanced Structural Representations with Block-Based Uncertainties for Ocean Floor Mapping

Jose Marie Antonio Minoza

2503.19126 2026-03-17 math.OC stat.ML

Tractable downfall of basis pursuit in structured sparse optimization

Maya V. Marmary, Christian Grussler

2502.09806 2026-03-17 econ.EM cs.IR cs.SI stat.ME

Two-Sided Prioritized Ranking: A Coherency-Preserving Design for Marketplace Experiments

Mahyar Habibi, Zahra Khanalizadeh, Negar Ziaeian

Comments New version with revisions and updated title

2501.15338 2026-03-17 cs.GT cs.LG stat.ML

Fairness-aware Contextual Dynamic Pricing with Strategic Buyers

Pangpang Liu, Will Wei Sun

Comments The paper has been accepted by JASA

2501.13218 2026-03-17 stat.ME

Design of Bayesian Clinical Trials with Clustered Data

Luke Hagar, Shirin Golchi

2412.02945 2026-03-17 stat.ME

Detection of Multiple Influential Observations on Model Selection

Dongliang Zhang, Masoud Asgharian, Martin A. Lindquist

Comments 3 figures

2407.19236 2026-03-17 stat.CO stat.ML

Approximate learning of parsimonious Bayesian context trees

Daniyar Ghani, Nicholas A. Heard, Francesco Sanna Passino

2407.14778 2026-03-17 math.ST stat.TH

Minimax estimation of functionals in sparse vector model with correlated observations

Yuhao Wang, Pengkun Yang, Alexandre B. Tsybakov

2403.19818 2026-03-17 stat.ME math.ST stat.TH

Testing common structure in high-dimensional factor models: change-point and two-sample procedures

Marie-Christine Düker, Vladas Pipiras

2402.09698 2026-03-17 stat.ME cs.LG math.PR math.ST stat.ML stat.TH

Combining Evidence Across Filtrations

Yo Joong Choe, Aaditya Ramdas

Comments Accepted for publication in the Journal of the Royal Statistical Society: Series B (Statistical Methodology). Code is available at https://github.com/yjchoe/CombiningEvidenceAcrossFiltrations

2401.13208 2026-03-17 stat.ME

Assessing Influential Observations in Pain Prediction using fMRI Data

Dongliang Zhang, Masoud Asgharian, Martin A. Lindquist

Comments 6 figures

2312.00590 2026-03-17 econ.EM math.ST stat.TH

Inference on common trends in functional time series

Morten Ørregaard Nielsen, Won-Ki Seo, Dakyung Seong

2303.11786 2026-03-17 cs.LG stat.ME stat.ML

Skeleton Regression: A Graph-Based Approach to Estimation with Manifold Structure

Zeyu Wei, Yen-Chi Chen

2212.05545 2026-03-17 math.PR cs.IT math.IT math.ST stat.TH

Gaussian random projections of convex cones: approximate kinematic formulae and applications

Qiyang Han, Huachen Ren

2211.07092 2026-03-17 stat.ML cs.LG math.ST stat.TH

Offline Estimation of Controlled Markov Chains: Minimaxity and Sample Complexity

Imon Banerjee, Harsha Honnappa, Vinayak Rao

Comments 71 pages, 23 main

2110.10801 2026-03-17 stat.CO

Efficient Sampling for Ising and Potts Models using Auxiliary Gaussian Variables

Charles C. Margossian, Chenyang Zhong, Sumit Mukherjee

2603.13826 2026-03-17 cs.LG stat.ML

Effective Sparsity: A Unified Framework via Normalized Entropy and the Effective Number of Nonzeros

Haoyu He, Hao Wang, Jiashan Wang, Hao Zeng

2603.13806 2026-03-17 stat.ML cs.LG

An Interpretable and Stable Framework for Sparse Principal Component Analysis

Ying Hu, Hu Yang

2603.13762 2026-03-17 stat.ME

Learning the Optimal Composite Mediator: Closed-Form Solution and Inference

Zihuai He

2603.13706 2026-03-17 stat.AP

When Does Agroforestry Income Reduce Deforestation? Evidence from a Natural Experiment in Madagascar

Camille DeSisto, Ranaivo Rasolofoson, Michelle Foley, Harsh Parikh

2603.13688 2026-03-17 stat.ML cs.LG stat.AP

When Should Humans Step In? Optimal Human Dispatching in AI-Assisted Decisions

Lezhi Tan, Naomi Sagan, Lihua Lei, Jose Blanchet

2603.13677 2026-03-17 stat.AP

Hierarchical Latent Space Item Response Model for Analyzing Mental Health Vulnerability of Elementary School Students in South Korea

Soyeon Park, Seoyoung Shin, Minjeong Jeon, Hyoun Kyoung Kim, Ick Hoon Jin

2603.13674 2026-03-17 cs.LG cs.AI stat.ML

Locally Linear Continual Learning for Time Series based on VC-Theoretical Generalization Bounds

Yan V. G. Ferreira, Igor B. Lima, Pedro H. G. Mapa S., Felipe V. Campos, Antonio P. Braga

Comments 12 pages. Accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence

2603.13662 2026-03-17 stat.ME stat.AP stat.ML

Fast Uncertainty Quantification for Kernel-Based Estimators in Large-Scale Causal Inference

Matthew Kosko, Falco J. Bargagli-Stoffi, Lin Wang, Michele Santacatterina

Comments 47 pages

2603.13646 2026-03-17 stat.ME

Surrogate-Based Bayesian Inference: Uncertainty Quantification and Active Learning

Andrew Gerard Roberts, Michael C. Dietze, Jonathan H. Huggins

2603.13622 2026-03-17 stat.CO math.PR stat.ME

The Continuous Rank Probability Score of a Generalized Beta-Prime Distribution and Some Special Cases

Matthew LeDuc

Comments 9 pages, no figures. Work in progress

2603.13616 2026-03-17 cs.RO stat.AP

Beyond Binary Success: Sample-Efficient and Statistically Rigorous Robot Policy Comparison

David Snyder, Apurva Badithela, Nikolai Matni, George Pappas, Anirudha Majumdar, Masha Itkina, Haruki Nishimura

Comments 12 + 9 pages, 2 + 5 figures

2603.13614 2026-03-17 stat.ME math.ST stat.TH

Measuring Extreme Tail Association

Bikramjit Das, Xiangyu Liu

Comments 38 pages, 13 figures, includes appendix

2603.13613 2026-03-17 stat.ML cs.LG math.ST stat.TH

Robust Sequential Tracking via Bounded Information Geometry and Non-Parametric Field Actions

Carlos C. Rodriguez

Comments 10 pages, 3 figures

2603.13583 2026-03-17 stat.ME

Confidence intervals for two-stage adaptive designs with subpopulation selection

Enyu Li, Nigel Stallard, Ekkehard Glimm, Dominic Magirr, Peter K. Kimani

2603.13561 2026-03-17 stat.ME

Addressing both variable selection and misclassified responses with parametric and semiparametric methods

Hui Guo, Grace Y. Yi, Boyu Wang

2603.13559 2026-03-17 stat.ML cs.LG cs.SY eess.SP eess.SY

Robust Automatic Differentiation of Square-Root Kalman Filters via Gramian Differentials

Adrien Corenflos

Comments 4 pages, documents the mathematics of a bug fix at https://github.com/state-space-models/cuthbert

2603.13558 2026-03-17 stat.ML cs.CL cs.IT cs.LG math.IT

Holographic Invariant Storage: Design-Time Safety Contracts via Vector Symbolic Architectures

Arsenios Scrivens

Comments 25 pages, 7 figures, includes appendices with extended proofs and pilot LLM experiment

2603.13542 2026-03-17 stat.ME stat.AP

Robust Inferential Methodology for Multidimensional Diffusion Processes

Sourojyoti Barick

2603.13501 2026-03-17 stat.ML cs.LG

Standard Acquisition Is Sufficient for Asynchronous Bayesian Optimization

Ben Riegler, James Odgers, Vincent Fortuin

2603.13499 2026-03-17 math.ST stat.TH

An Empirical Bayes Perspective on Heteroskedastic Mean Estimation

Yanjun Han, Abhishek Shetty, Jacob Shkrob

2603.13478 2026-03-17 cs.NE cs.AI cs.LG q-bio.NC stat.ML

Equivalence of approximation by networks of single- and multi-spike neurons

Dominik Dold, Philipp Christian Petersen

2603.13361 2026-03-17 cs.CV cs.AI stat.ML

BrainCast: A Spatio-Temporal Forecasting Model for Whole-Brain fMRI Time Series Prediction

Yunlong Gao, Jinbo Yang, Li Xiao, Haiye Huo, Yang Ji, Hao Wang, Aiying Zhang, Yu-Ping Wang

2603.13284 2026-03-17 cs.LG stat.ML

Do Diffusion Models Dream of Electric Planes? Discrete and Continuous Simulation-Based Inference for Aircraft Design

Aurelien Ghiglino, Daniel Elenius, Anirban Roy, Ramneet Kaur, Manoj Acharya, Colin Samplawski, Brian Matejek, Susmit Jha, Juan Alonso, Adam Cobb

2603.13254 2026-03-17 cs.LG stat.CO

Introducing Feature-Based Trajectory Clustering, a clustering algorithm for longitudinal data

Marie-Pierre Sylvestre, Laurence Boulanger

2603.13241 2026-03-17 stat.ML cs.AI cs.LG

A Hybrid Tsallis-Polarization Impurity Measure for Decision Trees: Theoretical Foundations and Empirical Evaluation

Edouard Lansiaux, Idriss Jairi, Hayfa Zgaya-Biau

2603.13234 2026-03-17 cs.LG stat.ML

RFX-Fuse: Breiman and Cutler's Unified ML Engine + Native Explainable Similarity

Chris Kuchar

Comments 31 pages, 10 figures

2602.02319 2026-03-17 stat.ME

Leave-One-Out Neighborhood Smoothing for Graphons: Berry-Esseen Bounds, Confidence Intervals, and Honest Tuning

Behzad Aalipur, Rachel Kilby

2510.14075 2026-03-17 eess.SY cs.AI cs.SY stat.CO stat.ML

DiffOPF: Diffusion Solver for Optimal Power Flow

Milad Hoseinpour, Vladimir Dvorkin

Comments 8 pages, 4 figures, 2 tables

2510.12271 2026-03-17 stat.AP cs.LG

The Living Forecast: Evolving Day-Ahead Predictions into Intraday Reality

Kutay Bölat, Peter Palensky, Simon Tindemans

2509.01437 2026-03-17 stat.ME cs.LG stat.CO stat.ML

Sampling as Bandits: Evaluation-Efficient Design for Black-Box Densities

Takuo Matsubara, Andrew Duncan, Simon Cotter, Konstantinos Zygalakis

2505.20235 2026-03-17 cs.LG cs.AI stat.ML

Variational Deep Learning via Implicit Regularization

Jonathan Wenger, Beau Coker, Juraj Marusic, John P. Cunningham

2503.05861 2026-03-17 cs.LG stat.ML

Interpretable Visualizations of Data Spaces for Classification Problems

Christian Jorgensen, Arthur Y. Lin, Rhushil Vasavada, Rose K. Cersonsky

Comments 15 pages, 8 figures

2409.06680 2026-03-17 stat.ME

Sequential stratified inference for the mean

Jacob V. Spertus, Mayuri Sridhar, Philip B. Stark

Comments 22 pages, 5 figures, submitted to Annals of Applied Statistics

2307.05705 2026-03-17 math.NA cs.NA math.ST stat.ML stat.TH

Measure transfer via stochastic slicing and matching

Shiying Li, Caroline Moosmueller, Yongzhe Wang

1307.7624 2026-03-17 math.ST stat.TH

Singularity of Data Analytic Operations

Steven P. Ellis

Comments 495 pages, 11 figures