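arXivDaily daily academic digest: synchronized with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.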

2604.24749 2026-04-28 cs.LG stat.ML

The Optimal Sample Complexity of Multiclass and List Learning

Chirag Pabbaraju

Details

Abstract

While the optimal sample complexity of binary classification in terms of the VC dimension is well-established, determining the optimal sample complexity of multiclass classification has remained open. The appropriate complexity parameter for multiclass classification is the DS dimension, and despite significant efforts, a gap of $\sqrt{\text{DS}}$ has persisted between the upper and lower bounds on sample complexity. Recent work by Hanneke et al. (2026) gives a novel algebraic characterization of multiclass hypothesis classes in terms of their DS dimension. Building on this, we show that the maximum hypergraph density of any multiclass hypothesis class is upper-bounded by its DS dimension. This proves a longstanding conjecture of Daniely and Shalev-Shwartz (2014). As a consequence, we determine the optimal dependence of the sample complexity on the DS dimension for multiclass as well as list learning.

2604.24737 2026-04-28 cs.LG cs.AI cs.CC stat.ML

Learning to Think from Multiple Thinkers

Nirmit Joshi, Roey Magen, Nathan Srebro, Nikolaos Tsilivis, Gal Vardi

Comments 78 pages, 5 figures; comments are welcome

2604.24736 2026-04-28 math.ST stat.TH

Parametric Statistical Inference in the Zone of Moderate Deviation Probabilities

Mikhail Ermakov

Comments 18 pages

2604.24652 2026-04-28 stat.ME econ.EM

Benefits and Costs of Adaptive Sampling

Yu-Shiou Willy Lin, Dae Woong Ham, Iavor Bojinov

Comments 41 pages, 3 figures

2604.24632 2026-04-28 stat.CO cs.NA math.NA math.PR

Theoretical guarantees for stochastic gradient sampling methods via Gaussian convolution inequalities

Daniel Paulin, Peter A. Whalley

Comments 34 pages, 2 figures

2604.24563 2026-04-28 physics.chem-ph cs.LG stat.ML

Enhancing molecular dynamics with equivariant machine-learned densities

Mihail Bogojeski, Muhammad R. Hasyim, Leslie Vogt-Maranto, Klaus-Robert Müller, Kieron Burke, Mark E. Tuckerman

Comments 30 pages, 7 figures

2604.24555 2026-04-28 cs.LG stat.ML

Efficient learning by implicit exploration in bandit problems with side observations

Tomas Kocak, Gergely Neu, Michal Valko, Remi Munos

Comments Published at Neural Information Processing Systems (NeurIPS) 2014

2604.24545 2026-04-28 stat.ML cs.LG

Extreme bandits

Alexandra Carpentier, Michal Valko

Comments Published at Neural Information Processing Systems (NeurIPS) 2014

2604.24537 2026-04-28 cs.LG stat.ML

Stochastic simultaneous optimistic optimization

Michal Valko, Alexandra Carpentier, Rémi Munos

Comments Published in International Conference on Machine Learning (ICML 2013)

2604.24534 2026-04-28 math.ST stat.TH

Robust linear regression under latent group heterogeneity

Xifeng Li, Shuzhen Yang

2604.24533 2026-04-28 stat.ME

Hierarchical Causal Uplift Modeling in Overlapping Customer Journeys

Jorge Pellegrini

2604.24499 2026-04-28 cs.IT math.IT q-bio.PE stat.AP

Fisher Information and Dynamical Sampling I

Mattia Carrino, Stefan Hohenegger

Comments 41 pages, 17 figures

2604.24490 2026-04-28 math.ST stat.TH

Posterior Invariance of Multiplicative Contrasts under Margin Constraints in Contingency Tables

Rafael Bassi Stern, Ruobin Gong, Joseph B. Kadane, Mark J. Schervish, Teddy Seidenfeld

Comments 18 pages, 1 figure

2604.24430 2026-04-28 stat.ME

Combined shrinkage of fixed and random effects in linear mixed models using empirical Bayes

Matteo Amestoy, R. Vermeulen, Mark A. van de Wiel, Wessel N. van Wieringen

2604.24360 2026-04-28 stat.ME

A Milestone-Based Framework for Characterizing Time-Varying Treatment Effects in Immunotherapy Trials

Yi-Cheng Tai, Weijing Wang, Jedd D. Wolchok, Martin T. Wells

Comments 39 pages, 35 figures

2604.24116 2026-04-28 cs.SE stat.AP

Closing the Loop: A Software Framework for AI to Support Business Decision Making

Jeffrey Wong, Antoine Creux

2604.00763 2026-04-28 stat.ME q-bio.GN stat.AP

Non-ignorable fuzziness in granular counts: the case of RNA-seq data

Antonio Calcagnì, Arianna Consiglio, Przemyslaw Grzegorzewski, Corrado Mencar

Comments 10 pages, 1 figure, 0 tables. Note: The compressed source folder contains the Supplementary Materials

2603.17717 2026-04-28 cs.CR cs.AI stat.AP stat.ML

Machine Learning for Network Attacks Classification and Statistical Evaluation of Adversarial Learning Methodologies for Synthetic Data Generation

Iakovos-Christos Zarkadis, Christos Douligeris

Comments Accepted at IEEE ISCC 2026, Portugal

2602.01338 2026-04-28 cs.LG math.ST stat.ML stat.TH

High-accuracy sampling for diffusion models and log-concave distributions

Fan Chen, Sinho Chewi, Constantinos Daskalakis, Alexander Rakhlin

2512.10570 2026-04-28 stat.ML cs.LG

Flexible Deep Neural Networks for Partially Linear Survival Data: Estimation and Survival Inference

Asaf Ben Arie, Malka Gorfine

2511.17902 2026-04-28 cs.LG cs.AI stat.ML

Statistically-Guided Meta-Learning for Cross-Deployment Activity Recognition in Distributed Fiber-Optic Sensing

Yifan He, Haodong Zhang, Qiuheng Song, Lin Lei, Zhenxuan Zeng, Haoyang He, Hongyan Wu

2510.10372 2026-04-28 stat.ME

Sequentially Doubly Robust Estimation of Conditional Survival Probability with Time-Varying Covariates

Hongxiang Qiu, Marco Carone, Alex Luedtke, Peter B. Gilbert

Comments new theoretical, simulation, data analysis results

2508.09079 2026-04-28 econ.GN cs.DL q-fin.EC stat.OT

Exploring the Shape of Economics: A Multilayer Network Analysis of Social Communities and Intellectual Similarity Among Journals Before and After the 2008 Financial Crisis

Alberto Baccini, Lucio Barabesi, Carlo Debernardi

Comments 66 pages, 3 figures, 7 tables

2507.20088 2026-04-28 cs.LG math-ph math.MP math.OC stat.ML

Learning Latent Graph Geometry via Fixed-Point Schrödinger-Type Activation: A Theoretical Study

Dmitry Pasechnyuk-Vilensky, Martin Takáč

Comments 50 pages

2507.06542 2026-04-28 cs.LG cs.DC cs.MA stat.ML

On the Surprising Effectiveness of a Single Global Merging in Decentralized Learning

Tongtian Zhu, Tianyu Zhang, Mingze Wang, Zhanpeng Zhou, Can Wang

Comments We discover and theoretically explain why and when a single global parameter merging in decentralized learning can recover the performance of federated learning, even in highly heterogeneous and communication-constrained environments

2506.22429 2026-04-28 stat.ML cs.LG

Beyond ReLU: How Activations Affect Neural Kernels and Random Wide Networks

David Holzmüller, Max Schölpple

Comments Published at AISTATS 2026. New in v2: more discussions, plots on empirical eigenvalue decay

2506.22258 2026-04-28 math.ST math.PR stat.TH

Mixing Time Bounds for the Gibbs Sampler under Isoperimetry

Alexander Goyal, George Deligiannidis, Nikolas Kantas

2501.04187 2026-04-28 stat.AP

A cautious use of auxiliary outcomes for decision-making in randomized clinical trials

Massimiliano Russo, Steffen Ventz, Lorenzo Trippa

2410.10704 2026-04-28 math.ST stat.ME stat.TH

Estimation beyond Missing (Completely) at Random

Tianyi Ma, Kabir A. Verchand, Thomas B. Berrett, Tengyao Wang, Richard J. Samworth

Comments 78 pages, 6 figures, Journal version

2405.20642 2026-04-28 cs.LG stat.ML

Learning Under Moral Hazard with Instrumental Regression and Generalized Method of Moments

Shiliang Zuo

2402.15995 2026-04-28 cs.CC cs.LG math.ST stat.ML stat.TH

Improved Hardness Results for Learning Intersections of Halfspaces

Stefan Tiegel

Details

DOI: 10.46298/theoretics.26.8
Journal ref: TheoretiCS, Volume 5 (2026), Article 8, 1-26

Abstract

We show strong (and surprisingly simple) lower bounds for weakly learning intersections of halfspaces in the improper setting. Strikingly little is known about this problem. For instance, it is not even known if there is a polynomial-time algorithm for learning the intersection of only two halfspaces. On the other hand, lower bounds based on well-established assumptions (such as approximating worst-case lattice problems or variants of Feige's 3SAT hypothesis) are only known (or are implied by existing results) for the intersection of super-logarithmically many halfspaces [KS09,KS06,DSS16], with intersections of fewer halfspaces being ruled out only under less standard assumptions [DV21] (such as the existence of local pseudo-random generators with large stretch). We significantly narrow this gap by showing that even learning $\omega(\log \log N)$ halfspaces in dimension $N$ takes super-polynomial time under standard assumptions on worst-case lattice problems (namely that SVP and SIVP are hard to approximate within polynomial factors). Further, we give unconditional hardness results in the statistical query framework. Specifically, we show that for any $k$ (even constant), learning $k$ halfspaces in dimension $N$ requires accuracy $N^{-\Omega(k)}$, or exponentially many queries, in particular ruling out SQ algorithms with polynomial accuracy for $\omega(1)$ halfspaces. To the best of our knowledge this is the first unconditional hardness result for learning a super-constant number of halfspaces. Our lower bounds are obtained in a unified way via a novel connection we make between intersections of halfspaces and the so-called parallel pancakes distribution [DKS17,BLPR19,BRST21] that has been at the heart of many lower bound constructions in (robust) high-dimensional statistics in the past few years.
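
For readers unfamiliar with the concept class in this abstract, here is a minimal sketch of an intersection of $k$ halfspaces used as a classifier: a point is labeled positive only if it lies on the non-negative side of every one of the $k$ hyperplanes. The function name `intersection_of_halfspaces`, the weights, and the thresholds are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def intersection_of_halfspaces(X, W, b):
    """Label each row x of X as 1 iff W[i] @ x >= b[i] for every halfspace i.

    X: (m, N) points, W: (k, N) normal vectors, b: (k,) thresholds.
    """
    margins = X @ W.T - b          # shape (m, k): signed margin per halfspace
    return np.all(margins >= 0, axis=1).astype(int)

# Two halfspaces in the plane: x >= 0 and y >= 0 (the closed positive quadrant).
W = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([0.0, 0.0])
X = np.array([[1.0, 2.0], [-1.0, 2.0], [3.0, -0.5], [0.0, 0.0]])
labels = intersection_of_halfspaces(X, W, b)
```

The hardness results concern learning such a labeling rule from examples when `W` and `b` are unknown; the sketch only evaluates a known rule.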

2312.08410 2026-04-28 cs.LG math.PR stat.ML

Universal approximation property of Banach space-valued random feature models including random neural networks

Ariel Neufeld, Philipp Schmocker

Comments 52 pages, 4 figures, 4 tables

2301.07855 2026-04-28 econ.EM stat.AP

Digital Divide: Evidence from the 2020 Canadian Internet Use Survey

Joann Jasiak, Peter MacKenzie, Purevdorj Tuvaandorj

Comments 47 pages, 8 figures, 15 tables. Substantially revised analysis based on the PUMF of the Statistics Canada 2020 Canadian Internet Use Survey

1912.13213 2026-04-28 cs.LG math.OC stat.ML

A Modern Introduction to Online Learning

Francesco Orabona

Comments Major update: One new chapter (Online Learning to X); massive tightening of all the math; simplification of the betting algorithm that loses a constant fraction of money; exp-concave functions are now for extended-real-valued functions; new layout for publication; added index

2604.24172 2026-04-28 stat.ML cs.LG stat.ME

A Divergence-Based Method for Weighting and Averaging Model Predictions

Olav Benjamin Vassend

Comments Accepted at AISTATS 2026

2604.24161 2026-04-28 quant-ph cs.IT math.IT stat.CO

Quantum Prediction of Transport Dynamics in Discretized State Spaces

Felix Govaers

Comments Submitted to IEEE Transactions on Quantum Engineering on April 9, 2026

2604.24056 2026-04-28 stat.ME

Bi-Gaussian Mirrors for False Discovery Rate Control

Yujia Wu, Panxu Yuan, Binyan Jiang

2604.24017 2026-04-28 stat.ME math.ST stat.TH

Neyman Jackknife: Design-Based Variance Estimation for Causal Inference under Interference

Bryan Park, Stefan Wager

2604.24010 2026-04-28 stat.ME eess.SP

Efficient Implementations of Extended Object PMBM Filters with Blocked Gibbs Sampling

Yuxuan Xia, Ángel F. García-Fernández, Lennart Svensson

Comments Submitted to IEEE T-AES

2604.24000 2026-04-28 eess.IV cs.CV cs.MM stat.AP

Shared-kernel Wavelet Neural Networks for Poisson Image Reconstruction

Yuanhao Gong, Tan Tang, Qianyan Liu

2604.23983 2026-04-28 math.ST math.PR q-fin.RM stat.ME stat.TH

A Geometric Witness Framework for Signed Multivariate Tail-Dependence Compatibility: Asymptotic Structure and Finite-Threshold Synthesis

Janusz Milek

Comments 47 pages, 4 figures, 3 tables; includes a Python implementation appendix

Details

Abstract

We study multivariate tail-dependence compatibility for complete and partial signed tail families, treating lower-tail, upper-tail, and mixed configurations in one geometric witness representation indexed by active coordinate sets and sign patterns. For a complete signed tail family, witness generator weights w = (w_{I,sigma}) give a linear incidence parametrization and are recovered by explicit triangular inversion. Excluding the geometric scale p0, the complete case uses 3^d - 1 generator weights, matching the number of complete signed tail coefficients; for partial specifications, only selected target coefficients need be prescribed. At a fixed threshold p0 in (0, 1/2), the inversion identifies the normalized noncentral ternary cell masses of any realizing copula. Hence finite-threshold compatibility is characterized by nonnegative recovered generator weights, singleton normalization, and the residual central-mass constraint. This yields a complete Moebius-type synthesis within the witness framework. If the recovered increments are nonnegative and singleton normalization holds, then S(w) = sum(w) determines the admissible finite-scale range, and every admissible p0 gives an exact witness realization. In the canonical ray geometry, such a realization preserves the same complete signed tail family throughout 0 < p <= p0. Thus the primary object is the complete signed tail family lambda: it is realized at every admissible finite scale and can be carried along families of witness copulas with p0 decreasing to 0. Partial, noisy, or inconsistent specifications are treated through linear-feasibility and weighted-l1 recovery problems in the same parametrization. The representation separates the p0-free incidence/Moebius layer from finite-threshold realization and provides tools for realization, simulation, calibration, completion, repair, and scenario design.
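
The count of $3^d - 1$ generator weights can be checked with a short enumeration. The sketch below assumes one natural reading of the abstract's indexing: each weight $w_{I,\sigma}$ corresponds to a nonempty active coordinate set $I \subseteq \{0, \dots, d-1\}$ together with a sign $\pm 1$ for each active coordinate. The function name `signed_index_set` is hypothetical, and the paper's own conventions may differ.

```python
from itertools import combinations, product

def signed_index_set(d):
    """Enumerate pairs (I, sigma): I a nonempty subset of range(d), sigma in {-1, +1}^I."""
    pairs = []
    for size in range(1, d + 1):
        for active in combinations(range(d), size):
            for signs in product((-1, +1), repeat=size):
                pairs.append((active, signs))
    return pairs

# Number of (I, sigma) pairs for small dimensions d.
counts = {d: len(signed_index_set(d)) for d in range(1, 6)}
```

Under this reading, `counts[d]` equals `3**d - 1` for every `d`, because $\sum_{k=1}^{d} \binom{d}{k} 2^k = 3^d - 1$ by the binomial theorem.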

2604.23968 2026-04-28 cs.LG cs.AI stat.ML

DecompKAN: Decomposed Patch-KAN for Long-Term Time Series Forecasting

Naveen Mysore

Comments 15 pages, 6 figures, 8 tables. Preprint; under review

2604.23961 2026-04-28 stat.AP q-fin.MF q-fin.TR

Extended State-dependent Hawkes Process for Limit Order Books: Mathematical Foundation and the Reproduction of Volatility Signature Plots

Akitoshi Kimura

Comments 20 pages, 8 figures. This work was supported by JSPS KAKENHI Grant Number JP20K14366 and CREST, JST

2604.23930 2026-04-28 stat.ME cs.LG stat.ML

Nearly Optimal Subdata Selection

Min Yang, Wei Zheng, John Stufken, Ming-Chung Chang, Ting Tian, Xueqin Wang

2604.23928 2026-04-28 math.ST math.PR stat.TH

Wasserstein convergence rates for empirical measures of point processes

Dongzhou Huang, Tianyi Jiang, Haonan Wang

2604.23917 2026-04-28 stat.ME

MR-CCC: Bayesian Mendelian Randomization for Causal Cell-Cell Communication

Bitan Sarkar, Yang Ni

2604.23912 2026-04-28 cs.LG stat.ML

Gromov-Wasserstein Methods for Multi-View Relational Embedding and Clustering

Rafael Pereira Eufrazio, Eduardo Fernandes Montesuma, Charles Casimiro Cavalcante

Comments This manuscript is currently under review at the XLIV Simposio Brasileiro de Telecomunicacoes e Processamento de Sinais - SBrT (Brazilian Symposium on Telecommunications and Signal Processing) 2026

2604.23866 2026-04-28 stat.ME

A Review of Methods and Practices for Missing Data in Sequential Multiple Assignment Randomized Trials (SMARTs): An Ancillary Study of a Scoping Review

Nikki L. B. Freeman, Chenyao Yu, Margaret Hoch, Sydney Browder, Bradley G. Hammill, Avi Kenny, Kevin J. Anstrom, Michael R. Kosorok

Comments 19 pages, 4 tables, 1 figure

2604.23851 2026-04-28 stat.ME math.ST stat.TH

Bayesian change-plane regression

Yuki Ohnishi, Fan Li

2604.23847 2026-04-28 stat.ME

Privacy-preserving Meta-analysis through Low-Rank Basis Hunting

Wenqi Shi, Kosuke Imai, Yi Zhang

2604.23834 2026-04-28 stat.ME stat.AP

Beyond the mean: Sequence analysis methods for clustering ordinal EMA data

Tianyi Wang, Anna L. Smith, Jillian R. Silva-Jones, Wendy Berry Mendes, Lauren N. Whitehurst

Comments 22 pages, 11 figures, 7 tables

2604.23828 2026-04-28 math.PR math.ST stat.CO stat.TH

Kac's walk on rotation matrices mixes in $n^2 \log n$ steps

Natesh S. Pillai, Aaron Smith

2604.23800 2026-04-28 cs.LG stat.ML

Causal Representation Learning from General Environments under Nonparametric Mixing

Ignavier Ng, Shaoan Xie, Xinshuai Dong, Peter Spirtes, Kun Zhang

Comments Accepted to AISTATS 2025. This is a slightly revised version of the published paper

2604.23797 2026-04-28 physics.optics physics.app-ph quant-ph stat.OT

From Random Fringes to Deterministic Response: Statistical Foundations of Time-Reversed Young Interferometry

Jianming Wen

2604.23790 2026-04-28 cs.LG stat.ML

A General Representation-Based Approach to Multi-Source Domain Adaptation

Ignavier Ng, Yan Li, Zijian Li, Yujia Zheng, Guangyi Chen, Kun Zhang

Comments ICML 2025

2604.23770 2026-04-28 econ.EM stat.ML

Bootstrapping with AI/ML-generated labels

Timothy Christensen, Silvia Goncalves, Benoit Perron

2604.23755 2026-04-28 stat.AP

Sparse Reduced-rank Regression Methods for Spatially Misaligned Data with Application to Spatial Transcriptomics

Zitian Wu, Susmita Datta, Arkaprava Roy

Comments 35 pages, 4 figures, 2 tables

2604.23744 2026-04-28 stat.AP stat.OT

How temperature regimes near the equinox synchronize spring biological events

Jonathan Auerbach, Andrew Gelman, E. M. Wolkovich

2604.23681 2026-04-28 cs.LG cs.CL stat.ML

Rank, Head-Channel Non-Identifiability, and Symmetry Breaking: A Precise Analysis of Representational Collapse in Transformers

Giansalvo Cirrincione

Comments 36 pages, 8 figures, 1 table. Submitted to Artificial Intelligence (Elsevier)

Details

Abstract

A widely cited result by Dong et al. (2021) showed that Transformers built from self-attention alone, without skip connections or feed-forward layers, suffer from rapid rank collapse: all token representations converge to a single direction. The proposed remedy was the MLP. We show that this picture, while correct in the regime studied by Dong, is incomplete in ways that matter for architectural understanding. Three results are established. First, layer normalisation is precisely affine-rank-neutral: it preserves the affine rank of the token representation set exactly. The widespread claim that LN "plays no role" is imprecise; the correct statement is sharper. Second, residual connections generically obstruct rank collapse in real Transformers such as BERT-base, in a measure-theoretic sense, without contribution from the MLP. The MLP's irreplaceable function is different: generating feature directions outside the linear span of the original token embeddings, which no stack of attention layers can produce. Third, a phenomenon distinct from rank collapse is identified: head-channel non-identifiability. After multi-head attention sums per-head outputs through the output projection, individual contributions cannot be canonically attributed to a specific head; n(H-1)d_k degrees of freedom per layer remain ambiguous when recovering a single head from the mixed signal. The MLP cannot remedy this because it acts on the post-summation signal. A constructive partial remedy is proposed: a position-gated output projection (PG-OP) at parameter overhead below 1.6% of the standard output projection. The four collapse phenomena identified in the literature -- rank collapse in depth, in width, head-channel non-identifiability, and entropy collapse -- are unified under a symmetry-breaking framework, each corresponding to a distinct symmetry of the Transformer's forward pass.
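
The additive structure behind head-channel non-identifiability can be illustrated numerically. The sketch below uses random stand-ins for per-head outputs (not real attention), with illustrative sizes `n`, `H`, `d_k`, and shows two things: concatenating heads and applying the output projection is the same as summing per-head contributions, and a different split of contributions can give the same summed signal. It does not reproduce the paper's count of n(H-1)d_k degrees of freedom, and the alternative split is not claimed to be realizable by actual attention heads.

```python
import numpy as np

rng = np.random.default_rng(0)
n, H, d_k = 4, 3, 2            # tokens, heads, per-head width (illustrative sizes)
d_model = H * d_k

heads = [rng.normal(size=(n, d_k)) for _ in range(H)]   # stand-ins for per-head outputs
W_O = rng.normal(size=(H * d_k, d_model))               # output projection

# Standard form: concatenate heads, then project.
mixed = np.concatenate(heads, axis=1) @ W_O

# Equivalent form: each head passes through its own row block of W_O, then all are summed.
contribs = [heads[h] @ W_O[h * d_k:(h + 1) * d_k] for h in range(H)]
summed = sum(contribs)

# A different split with the same sum: shift an arbitrary matrix from head 0 to head 1.
delta = rng.normal(size=(n, d_model))
alt = [contribs[0] - delta, contribs[1] + delta] + contribs[2:]
alt_sum = sum(alt)
```

Here `mixed`, `summed`, and `alt_sum` agree up to floating-point error, so anything that sees only the post-summation signal (such as the MLP) cannot tell the two splits apart.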

2604.23619 2026-04-28 stat.ME math.ST stat.TH

Weak Moment Methods for Statistical Inference: with an Application to Robust Estimation

R. Labouriau

Comments 29 pages, one figure and four tables

2604.23573 2026-04-28 stat.ML cs.LG

High-dimensional Semi-supervised Classification via the Fermat Distance

Ruoxu Tan, Yiming Zang

2604.23552 2026-04-28 cs.LG cs.AI stat.ML

On the Memorization of Consistency Distillation for Diffusion Models

Bingqing Jiang, Difan Zou

Comments 34 pages

2604.20266 2026-04-28 stat.ME

Bayesian Modeling of the Stochastic Block Model for Weighted Network Data with Zero-Inflated Negative Binomial Distribution

Fumiya Iwashige

Comments 19 pages, 1 figure

2603.03004 2026-04-28 stat.ME stat.AP stat.CO

eTFCE: Exact Threshold-Free Cluster Enhancement via Fast Cluster Retrieval

Xu Chen, Wouter D. Weeda, Thomas E. Nichols, Jelle J. Goeman

Comments Revised manuscript with updated analyses and clarifications

2602.00417 2026-04-28 stat.ML cs.LG

Shuffle and Joint Differential Privacy for Generalized Linear Contextual Bandits

Sahasrajit Sarmasarkar

2512.20562 2026-04-28 stat.ML cs.LG math.OC

Shallow Neural Networks Learn Low-Degree Spherical Polynomials with Feature Learning by Learnable Channel Attention

Yingzhen Yang

Comments Accepted by Algorithmic Learning Theory (ALT) 2026

2512.07868 2026-04-28 cs.LG cs.AI stat.ML

Bayesian Optimization for Function-Valued Responses under Min-Max Criteria

Pouya Ahadi, Reza Marzban, Ali Adibi, Kamran Paynabar

Comments 25 pages, 6 figures

2510.15479 2026-04-28 cs.LG stat.ML

Adversary-Free Counterfactual Prediction via Information-Regularized Representations

Shiqin Tang, Rong Feng, Shuxin Zhuang, Youzhi Zhang, Hongzong Li

2510.13087 2026-04-28 cs.LG stat.ME stat.ML

DeepCausalMMM: A Deep Learning Framework for Marketing Mix Modeling with Causal Structure Learning

Aditya Puttaparthi Tirumala

Comments Published in the Journal of Open Source Software. Please cite the JOSS version - doi:10.21105/joss.09914. Please note that Author has no middle name. Last name is 'Puttaparthi Tirumala' (it's a two-part surname)

2510.00158 2026-04-28 math.ST cs.NA math.NA math.OC math.PR stat.TH

Exact affine conditioning beyond Gaussians: a unique characterization of the ensemble Kalman update

Frederic J. N. Jorgensen, Youssef M. Marzouk

Comments 35 pages, 4 figures

2508.05901 2026-04-28 math.ST cs.IT math.IT math.PR stat.ML stat.TH

Estimating the size of a set using cascading exclusion

Sourav Chatterjee, Persi Diaconis, Susan Holmes

Comments 52 pages, 10 figures. Minor changes in this revision. To appear in Statistical Science

2507.20058 2026-04-28 stat.ML cs.LG stat.AP

Modeling Parkinson's Disease Progression Using Longitudinal Voice Biomarkers: A Comparative Study of Statistical and Neural Mixed-Effects Models

Ran Tong, Lanruo Wang, Tong Wang, Wei Yan

Comments Published version: Computer Methods and Programs in Biomedicine Update, DOI: 10.1016/j.cmpbup.2026.100242. Version note: https://doi.org/10.5281/zenodo.19804672

Details

DOI: 10.1016/j.cmpbup.2026.100242
Journal ref: Computer Methods and Programs in Biomedicine Update, Volume 9, 2026, Article 100242

Abstract

Longitudinal voice biomarkers provide a non-invasive source of information for monitoring Parkinson's disease progression, but their statistical analysis is difficult because repeated measurements from the same subject are correlated, clinical cohorts are often small, and disease trajectories can vary substantially across individuals. This study evaluates statistical and neural mixed-effects approaches for modeling Parkinson's disease progression from telemonitoring voice data. Using the Oxford Parkinson's telemonitoring dataset (N=42), we compare Neural Mixed Effects (NME) models, Generalized Neural Network Mixed Models (GNMMs), and semi-parametric Generalized Additive Mixed Models (GAMMs) under the same longitudinal prediction setting. The results show that neural mixed-effects models provide flexible nonlinear representations but can overfit severely in this small-sample setting, whereas GAMMs achieve stronger predictive performance and retain interpretable smooth effects and subject-level structure. In particular, the GAMM-based approach attains the lowest prediction error (MSE 6.56), while the neural baselines have substantially larger errors (MSE > 90). These findings support the use of interpretable statistical mixed-effects models for small longitudinal telemonitoring studies and suggest that larger and more diverse cohorts are needed before highly flexible neural mixed-effects models can be reliably assessed in this application.

2506.09163 2026-04-28 cs.LG stat.ML

Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes

Daniel Jenson, Jhonathan Navott, Piotr Grynfelder, Mengyan Zhang, Makkunda Sharma, Elizaveta Semenova, Seth Flaxman

2506.04118 2026-04-28 cs.LG stat.ML

Guided Speculative Inference for Efficient Test-Time Alignment of LLMs

Jonathan Geuter, Youssef Mroueh, David Alvarez-Melis

Comments 41 pages, 11 figures. Published at ICLR 2026

2505.16329 2026-04-28 stat.ML cs.LG

High-Dimensional Private Linear Regression with Optimal Rates

Simone Bombari, Jialei Luo, Inbar Seroussi, Marco Mondelli

Comments Updated version of "Better Rates for Private Linear Regression in the Proportional Regime via Aggressive Clipping"

2505.08008 2026-04-28 stat.ME

Separation-based causal discovery for extremes

Junshu Jiang, Jordan Richards, Raphaël Huser, David Bolin

2505.04884 2026-04-28 stat.ME math.ST stat.TH

Model Selection for Unit-root Time Series with Many Predictors

Shuo-Chieh Huang, Ching-Kang Ing, Ruey S. Tsay

2504.18184 2026-04-28 stat.ML cs.LG math.FA math.ST stat.TH

Learning Operators by Regularized Stochastic Gradient Descent with Operator-valued Kernels

Jia-Qi Yang, Lei Shi

Comments 68 pages, 3 figures

2502.04122 2026-04-28 stat.ME

Bayesian discovery of species in multiple areas

Alessandro Colombi, Raffaele Argiento, Federico Camerlenghi, Lucia Paci

2402.11789 2026-04-28 stat.ML cs.CV cs.LG

Statistical Test for Diffusion-Based Anomaly Localization via Selective Inference

Teruyuki Katsuoka, Tomohiro Shiraishi, Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi

Comments 35 pages, 6 figures

2312.12396 2026-04-28 stat.ME

A Bayesian time-varying random partition model for large spatio-temporal datasets

Giulio Beltramin, Andrea Cremaschi, Annalisa Cadonna, Alessandra Guglielmi, Fernando Andrés Quintana

2211.09619 2026-04-28 cs.LG cs.RO cs.SY eess.SY math.OC stat.ML

Introduction to Online Control

Elad Hazan, Karan Singh

Comments Draft; comments/suggestions welcome at nonstochastic.control@gmail.com

2604.23527 2026-04-28 stat.ME physics.comp-ph physics.data-an stat.CO

Using Statistical Mechanics to Improve Real-World Bayesian Inference: A New Method Combining Tempered Posteriors and Wang-Landau Sampling

Alfred C. K. Farris

2604.23515 2026-04-28 stat.CO

ragR: Retrieval-Augmented Generation and RAG Assessment in R

Muhammad Aimal Rehman, Zhili Lu, Chi-Kuang Yeh

Comments Preprint. Code available at the GitHub repository listed in the paper

2604.23498 2026-04-28 math.ST math.OC stat.ML stat.TH

When Does Dynamic Preconditioning Preserve the Polyak-Ruppert CLT? A Stabilization Threshold

Sunyoung An, Xiaoming Huo

Comments 46 pages, 5 figures; includes supplementary material with deferred proofs and additional experiments

2604.23469 2026-04-28 stat.ME econ.EM

Estimation of MIDAS Regressions with Errors-in-the-Variables

Sukhbir Kaur, Sukhbir Singh, Kanchan Jain, Pooja Soni

2604.23463 2026-04-28 stat.ME math.ST stat.TH

A theory of ROC analysis of rule-out and rule-in diagnostics with applications to mammography data

Michelle Mastrianni, Kwok Lung Fan, Yee Lam Elim Thompson, Jessie J. J. Gommers, Ioannis Sechopoulos, Fredrik Strand, Weijie Chen, Gary Levine, Mukul Sherekar, Frank W. Samuelson

2604.23454 2026-04-28 stat.ME stat.CO stat.ML

Anchored Variational Inference for Personalized Sequential Latent-State Models

Xingche Guo

2604.23438 2026-04-28 stat.AP stat.ME

Estimating Causal Attribution of Anthropogenic Forcing on High-Temperature Extremes Using a Latent Gaussian Spatial Model

Ritik Roshan Giri, Arnab Hazra

Comments 31 pages, 6 figures, 3 tables, 1 algorithm

2604.23393 2026-04-28 stat.ME math.ST stat.TH

Asymptotic theory of rerandomization for survival analysis

Xinyuan Chen, Fan Li

2604.23381 2026-04-28 stat.AP stat.ME stat.ML

MCMC with Adaptive Principal-Component Transformation: Rotation-Invariant Universal Samplers for Bayesian Structural System Identification

Xianghao Meng, Yong Huang, James L. Beck, Kui Jiang, Hui Li

Comments Accepted by Advanced Engineering Informatics on Apr 25, 2026

2604.23375 2026-04-28 cs.CV stat.ML

Hierarchical Spatio-Channel Clustering for Efficient Model Compression in Medical Image Analysis

Sisipho Hamlomo, Marcellin Atemkeng, Habte Tadesse Likassa, Blaise Ravelo, Thierry Bouwmans, Sébastien Lalléchère, Antoine Vacavant, Ding-Geng Chen

2604.23370 2026-04-28 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML

Nonlinear Non-Gaussian Density Steering with Input and Noise Channel Mismatch: Sinkhorn with Memory for Solving the Control-affine Schrödinger Bridge Problem

Georgiy A. Bondar, Asmaa Eldesoukey, Yongxin Chen, Abhishek Halder

2604.23367 2026-04-28 math.ST stat.TH

Conway-Maxwell multivariate Bernoulli distribution

Hélène Cossette, Etienne Marceau, Alessandro Mutti, Patrizia Semeraro

2604.23357 2026-04-28 stat.ME stat.AP

Modelling spatial heterogeneity in the effects of area-level covariates on income distributions using Bayesian nonparametric methods

Ziyou Wang, Jim Griffin, Maria Kalli

2604.23308 2026-04-28 cs.LG stat.ML

CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning

Marcel Hedman, Kale-ab Abebe Tessera, Juan Claude Formanek, Anya Sims, Riccardo Zamboni, Trevor McInroe, John Torr, Elliot Fosong

2604.23229 2026-04-28 math.ST stat.TH

Solidarity of Spectral Gaps for Component-Wise Markov Chains

Youngwoo Kwon, Galin Jones, Qian Qin

Comments 55 pages

2604.23212 2026-04-28 stat.ML cs.LG math.ST stat.TH

Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

Weihao Lu, Qian Lin, Yingcun Xia, Dongming Huang

2604.23193 2026-04-28 cs.DS cs.LG cs.NA math.NA math.PR stat.ML

Well-Conditioned Oblivious Perturbations in Linear Space

Shabarish Chenakkod, Michał Dereziński, Xiaoyu Dong, Mark Rudelson

2604.23174 2026-04-28 stat.ME

Weighted Cumulative Residual Mathai-Haubold Entropy

Anija C. R, Smitha S, Sudheesh K. Kattumannil

2604.23154 2026-04-28 stat.ME

A bivariate cure copula model with zero-inflated gamma frailty: dependence in both cure fractions and survival times

Masaki Hino, Shogo Kato, Takeshi Emura

Comments 47 pages, 2 figures

2604.23127 2026-04-28 physics.geo-ph cs.LG stat.AP

A Dynamic Learning Observatory Reveals the Rapid Salinization of Satkhira, Bangladesh

Showmitra Kumar Sarkar, Sai Ravela

2604.23107 2026-04-28 stat.ML cs.LG stat.ME

MOCA: A Transformer-based Modular Causal Inference Framework with One-way Cross-attention and Cutting Feedback

Lei Wang, Debashis Ghosh

Comments 25 pages, 8 figures, 4 tables. Preprint

2604.23085 2026-04-28 stat.ME

Using Importance Sampling to Estimate $p$-values in All-Subset Meta-Analysis, with Applications to Single-Cell eQTL Mapping

Samuel Anyaso-Samuel, Thong Luong, Fei Qin, Jiyeon Choi, Kai Yu, Paul S. Albert, Jianxin Shi

Comments 18 pages, 3 figures

2604.23083 2026-04-28 stat.ML cs.LG stat.ME

Turtle shell clustering: A mixture approach to discriminative clustering with applications to flow cytometry and other data

Mackenzie R. Neal, Paul D. McNicholas, Arthur White

2604.23046 2026-04-28 cs.LG cs.IT cs.SI math.IT stat.ML

Shape of Memory: a Geometric Analysis of Machine Unlearning in Second-Order Optimizers

Kennon Stewart

Comments Full experiment data available at secondstreetlabs.io

2604.23029 2026-04-28 stat.ME

Sampling distributions for complex design variance estimators in a Fay-Herriot model

Alana McGovern, Geir-Arne Fuglstad, Jon Wakefield

2604.23022 2026-04-28 cs.IR cs.LG stat.ML

CASP: Support-Aware Offline Policy Selection for Two-Stage Recommender Systems

Nilson Chapagain

Comments 10 pages

2604.23006 2026-04-28 stat.ME

Estimation of Time-Varying Treatment Effects in a Joint Model for Longitudinal and Recurrent Event Outcomes in Mobile Health Data

Madeline R Abbott, Jeremy M G Taylor, Inbal Nahum-Shani, Lindsey N Potter, David W Wetter, Cho Y Lam, Walter Dempsey

Comments Main text is 25 pages with 6 figures and 1 table

2604.22998 2026-04-28 stat.OT

Perceptions and Utilization of GenAI Tools among Data Science Students and Faculty

Abeer M. Hasan, Sayed A. Mostafa

2604.22980 2026-04-28 stat.ME math.ST stat.TH

Testing independence in the presence of missing data: high-dimensional case

Marija Cuparić, Bojana Milošević, Jelena Radojević

2604.22967 2026-04-28 stat.ML cs.LG

Rethinking Trust Region Bayesian Optimization in High Dimensions

Wei-Ting Tang, Joel A. Paulson

2604.22965 2026-04-28 stat.ME

Agreement coefficients for continuous variables: A review

Ronny Vallejos

Comments 33 pages, 7 figures, 1 table

2604.22941 2026-04-28 math.AG math.ST stat.TH

Sobolev embedding theorem and subanalytic measures

Guillaume Valette

2604.22925 2026-04-28 stat.AP cs.SD

Come Together: Analyzing Popular Songs Through Statistical Embeddings

Matthew Esmaili Mallory, Mark Glickman, Jason Brown

2604.22907 2026-04-28 physics.med-ph stat.ME

Fingertip Micro-Motion as a Source of Respiratory Information During Sleep Using Triaxial Accelerometers

Jeanne Lin, Lily Liu, Hau-Tieng Wu

2604.22813 2026-04-28 stat.AP math.PR

Cyclic fractional Gaussian noise: time and frequency domain properties

Hubert Woszczek, Agnieszka Wylomanska

2604.22812 2026-04-28 cs.CY cs.LG stat.AP

Cross-Course Generalizability of SRL-Aligned Predictive Models Using Digital Learning Traces

Jakob Schwerter, Loreen Sabel, Judith Bose, Matthew L. Bernacki, Di Xu, Marko Schmellenkamp, Thomas Zeume, Philipp Doebler

2604.22807 2026-04-28 math.OC cs.LG cs.SY eess.SY stat.ML

Sliced Wasserstein Steering between Gaussian Measures

Kaito Ito, Anqi Dong

Comments Accepted at the European Control Conference 2026

2604.22802 2026-04-28 cs.DL cs.SI stat.AP

NIH-MPINet: A Large-Scale Feature-Rich Network Dataset for Mapping the Frontiers of Team Science

Cuiran Shi, Shuying Han, Shreya Kusumanchi, Mia Zhou, Didong Li

2604.22793 2026-04-28 stat.AP

Research Funding as a Decision Problem Under Heavy-Tailed Uncertainty

Carlos Oscar S. Sorzano, B. Pueche-Granados

2604.21087 2026-04-28 stat.AP

Model quality in football: Quantifying the quality of an Expected Threat model

Koen van Arem, Jakob Söhl, Mirjam Bruinsma, Geurt Jongbloed

2604.21066 2026-04-28 cs.CV cs.LG stat.ME

Optimizing Diffusion Priors in Image Reconstruction from a Single Observation

Frederic Wang, Katherine L. Bouman

2604.20059 2026-04-28 stat.ME

Investigating Targeting Strategies and Truncation in TMLE for the Average Treatment Effect under Practical Positivity Violations

Yichen Xu, Susan Gruber, Mark J. van der Laan

2604.19391 2026-04-28 cs.IT eess.SP math.IT stat.AP

On the Practical Performance of Noise Modulation for Ultra-Low-Power IoT: Limitations, Capacity, and Energy Trade-offs

Felipe A. P. de Figueiredo, Pedro M. R. Pereira, Evandro C. Vilas Boas, Fernando D. A. Garcia, Hadi Zayyani, Rausley A. A. de Souza

Comments 5 pages, 5 figures, conference

2604.14579 2026-04-28 stat.ME math.ST stat.TH

HASOD: A Hybrid Adaptive Screening-Optimization Design for High-Dimensional Industrial Experiments

Kumarjit Pathak

Details

Abstract

Industrial experimentation requires both factor screening to identify critical variables and response optimization to find optimal operating conditions. Traditional approaches treat these as separate phases, necessitating costly sequential experimentation and full experimental redesign between phases. This paper introduces HASOD (Hybrid Adaptive Screening-Optimization Design), a novel three-phase sequential framework that simultaneously addresses factor identification and response surface optimization within a unified adaptive structure. Phase 1 employs a modified Definitive Screening Design with an enhanced Cumulative Weighted Effect Screening Statistic (CWESS) incorporating interaction detection via ElasticNet regression. Phase 2 adaptively selects augmentation strategies -- from full factorial to Response Surface Methodology designs -- based on critical factors identified in Phase 1. Phase 3 applies Gaussian process-based global optimization with uncertainty-guided refinement near the predicted optimum. We prove that CWESS asymptotically separates active from inactive factors, providing classification consistency guarantees absent from most screening methodologies. Across six test scenarios, HASOD achieves 97.08% factor detection accuracy -- 13.75 percentage points above traditional sequential methods (83.33%) -- and significantly outperforms all eight competitor methods (p < 0.001). HASOD yields improved prediction performance (mean error: 3.61) while maintaining >=90% detection across all scenarios including interaction-heavy systems. The framework requires an average of 41.5 experimental runs -- a 43% increase over traditional approaches -- yet delivers superior detection accuracy with dramatically reduced prediction error. HASOD offers a theoretically grounded, unified framework that eliminates sequential redesign without sacrificing predictive capability.


2604.12263 2026-04-28 stat.ME econ.EM

Partial Identification of Policy-Relevant Treatment Effects with Instrumental Variables via Optimal Transport

Jiyuan Tan, Jose Blanchet, Vasilis Syrgkanis

Comments 105 pages, 5 figures

2604.10482 2026-04-28 stat.ME

The Fréchet correlation coefficient for heterogeneous random objects

Shuaida He, Yangzhou Chen, Xin Chen

2604.07011 2026-04-28 stat.ME stat.AP

Recovering manifold structure in LLM responses through a joint Euclidean mirror

Maximilian Baum, Aranyak Acharyya, Tianyi Chen, Avanti Athreya, Youngser Park, Francesco Sanna Passino, Carey E. Priebe, Zachary Lubberts

Comments 13 pages, 9 figures

2603.18514 2026-04-28 stat.ML cs.LG

On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization

Yixuan Zhang, Ruihao Zhu, Qiaomin Xie

Comments 20 pages

2603.17281 2026-04-28 stat.AP

Improving causal inference in interrupted time series analysis: the triple difference design

Ariel Linden

2603.12365 2026-04-28 cond-mat.mtrl-sci cs.LG cs.NA math.NA physics.comp-ph stat.CO

Optimal Experimental Design for Reliable Learning of History-Dependent Constitutive Laws

Kaushik Bhattacharya, Lianghao Cao, Andrew Stuart

Details

DOI: 10.1016/j.cma.2026.119022
Journal ref: Computer Methods in Applied Mechanics and Engineering, Volume 457, 2026, 119022, ISSN 0045-7825

Abstract

History-dependent constitutive models serve as macroscopic closures for the aggregated effects of micromechanics. Their parameters are typically learned from experimental data. With a limited experimental budget, eliciting the full range of responses needed to characterize the constitutive relation can be difficult. As a result, the data can be well explained by a range of parameter choices, leading to parameter estimates that are uncertain or unreliable. To address this issue, we propose a Bayesian optimal experimental design framework to quantify, interpret, and maximize the utility of experimental designs for reliable learning of history-dependent constitutive models. In this framework, the design utility is defined as the expected reduction in parametric uncertainty or the expected information gain. This enables in silico design optimization using simulated data and reduces the cost of physical experiments for reliable parameter identification. We introduce two approximations that make this framework practical for advanced material testing with expensive forward models and high-dimensional data: (i) a Gaussian approximation of the expected information gain, and (ii) a surrogate approximation of the Fisher information matrix. The former enables efficient design optimization and interpretation, while the latter extends this approach to batched design optimization by amortizing the cost of repeated utility evaluations. Our numerical studies of uniaxial tests for viscoelastic solids show that optimized specimen geometries and loading paths yield image and force data that significantly improve parameter identifiability relative to random designs, especially for parameters associated with memory effects.
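For readers unfamiliar with the utility named in this abstract, the expected information gain is conventionally defined as the expected Kullback-Leibler divergence from prior to posterior. The notation below ($d$ for the design, $\theta$ for the parameters, $y$ for the data, $G_d$, $\Sigma_{\mathrm{pr}}$, $\Sigma_{\mathrm{n}}$) is assumed for illustration and is not taken from the paper. The closed form is the textbook linear-Gaussian special case, shown only as a sketch of why a Gaussian approximation can make the utility cheap to evaluate; it is not necessarily the authors' approximation.

```latex
\[
\mathrm{EIG}(d) = \mathbb{E}_{y \mid d}\!\left[ D_{\mathrm{KL}}\!\big( p(\theta \mid y, d) \,\|\, p(\theta) \big) \right]
\]
\[
y = G_d\,\theta + \varepsilon,\quad
\theta \sim \mathcal{N}(\theta_0, \Sigma_{\mathrm{pr}}),\quad
\varepsilon \sim \mathcal{N}(0, \Sigma_{\mathrm{n}})
\;\Longrightarrow\;
\mathrm{EIG}(d) = \tfrac{1}{2}\log\det\!\left( I + \Sigma_{\mathrm{pr}}^{1/2}\, G_d^{\top} \Sigma_{\mathrm{n}}^{-1} G_d\, \Sigma_{\mathrm{pr}}^{1/2} \right)
\]
```

In this special case the matrix $G_d^{\top} \Sigma_{\mathrm{n}}^{-1} G_d$ is the Fisher information matrix, which is the quantity the abstract proposes to approximate with a surrogate.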


2602.07825 2026-04-28 stat.ME

Estimation Strategies for Causal Decomposition Analysis with Allowability Specifications

John W. Jackson, Ting-Hsuan Chang, Aster Meche, Trang Q. Nguyen

2601.18390 2026-04-28 math.PR math.ST stat.TH

Convergence in distribution of the P-P process in $L^1[0,1]$

Brendan K. Beare, Tetsuya Kaji

Comments 7 pages

2601.09925 2026-04-28 stat.ME

High Dimensional Gaussian and Bootstrap Approximations in Generalized Linear Models

Mayukh Choudhury, Debraj Das

2512.03467 2026-04-28 cs.LG stat.ME

Bayesian Event-Based Model for Disease Subtype and Stage Inference

Hongtao Hao, Joseph L. Austerweil

Comments 32 pages; machine learning for health symposium (2025); Proceedings of the 5th Machine Learning for Health Symposium in PMLR

2511.21992 2026-04-28 stat.ME stat.AP

Design-based nested instrumental variable analysis

Zhe Chen, Xinran Li, Michael O. Harhay, Bo Zhang

Details

Abstract

Two binary instrumental variables (IVs) are nested if individuals who comply under one binary IV also comply under the other. This situation often arises when the two IVs represent different intensities of encouragement or discouragement to take the treatment, with one stronger than the other. In a nested IV structure, treatment effects can be identified for two latent subgroups: always-compliers and switchers. Always-compliers are individuals who comply even under the weaker IV, while switchers are those who do not comply under the weaker IV but do under the stronger IV. We introduce a novel pair-of-pairs nested IV design, where each matched stratum consists of four units organized in two pairs. We develop design-based inference for the always-complier sample average treatment effect and switcher sample average treatment effect. In a nested IV analysis, IV assignment is randomized within each IV pair; however, whether a study unit receives the weaker or stronger IV may not be randomized. To address this complication, we then propose a novel partly biased randomization scheme and study design-based inference under this new scheme. Using extensive simulation studies, we demonstrate the validity of the proposed method even in challenging scenarios with small sample sizes and a low proportion of switchers. Applying the nested IV framework, we estimated that 52.2% (95% CI: 50.4%-53.9%) of participants enrolled at the Henry Ford Health System in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial were always-compliers, while 26.7% (95% CI: 24.5%-28.9%) were switchers. Among always-compliers, flexible sigmoidoscopy was associated with a trend toward a decreased colorectal cancer rate. No effect was detected among switchers. This offers a richer interpretation of why no increase in the intention-to-treat effect was observed after 1997, even though the compliance rate rose.
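The subgroup definitions in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the function names `classify_unit` and `subgroup_shares` are hypothetical, and the sketch assumes oracle knowledge of whether each unit would comply under each IV, which is latent in real data.

```python
from collections import Counter

def classify_unit(complies_weak: bool, complies_strong: bool) -> str:
    """Label a unit from its (hypothetically known) compliance under each IV."""
    if complies_weak and not complies_strong:
        # Nestedness: complying under the weaker IV implies complying under the stronger IV.
        raise ValueError("compliance pattern violates the nested IV structure")
    if complies_weak:
        return "always-complier"  # complies even under the weaker IV
    if complies_strong:
        return "switcher"         # complies only under the stronger IV
    return "other"                # complies under neither IV

def subgroup_shares(units):
    """Share of each label among (complies_weak, complies_strong) pairs."""
    counts = Counter(classify_unit(w, s) for w, s in units)
    return {label: n / len(units) for label, n in counts.items()}
```

For example, `subgroup_shares([(True, True), (False, True), (False, False), (True, True)])` reports half the units as always-compliers and a quarter each as switchers and others.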


2511.15161 2026-04-28 math.ST stat.ME stat.TH

Design-based finite-sample analysis for regression adjustment

Dogyoon Song

Comments 9 pages + Appendix; AISTATS 2026

2510.23976 2026-04-28 stat.AP

Forecasting Arctic Temperatures With Quantile Machine Learning

Richard Berk

Comments 30 pages, 8 figures

2510.18067 2026-04-28 stat.CO stat.ME

A Neural-Mean Vecchia Gaussian Process for Unified Argo Modeling

Nian Liu, Jian Cao

2510.08893 2026-04-28 stat.AP

Quantifying Very Extreme Precipitation and Temperature Using Huge Ensembles Generated by Machine Learning-based Climate Model Emulators

Christopher J. Paciorek, Daniel Cooley

Comments 28 pages, 11 figures, 5 appendix figures. Published online in Bulletin of the American Meteorological Society on 2026-03-30

Details

DOI: 10.1175/BAMS-D-25-0178.1

Abstract

Weather extremes produce major impacts on society and ecosystems and are likely to change in likelihood and magnitude with climate change. However, very low probability events are hard to characterize statistically using observations or even climate model output because of short records/runs. For precipitation, consideration of such events arises in quantifying Probable Maximum Precipitation (PMP), namely estimating extreme precipitation magnitudes for designing and assessing critical infrastructure. A recent National Academies report on modernizing PMP estimation proposed using very large climate model-based ensembles to estimate extreme quantiles, possibly through machine learning-based ensemble boosting. Here we assess statistical aspects of such an approach for the contiguous United States using a huge ensemble (10560 years) produced by a state-of-the-art emulator (ACE2) trained on ERA5 reanalysis. The results indicate that one can practically estimate very extreme precipitation and temperature quantiles, provided one uses appropriate statistical extreme value techniques. More specifically, the results provide evidence for (1) the use of threshold-exceedance methods with a sufficiently high threshold (necessary for precipitation) for reliable estimation, (2) the robustness of results to variation in extremes by season and storm type, and (3) the sufficiency of the ensemble for well-constrained statistical uncertainty. Our results also show that the emulator produces extremes outside the range of the ERA5 training data. While encouraging for emulators' potential use for quantifying the climatology of extremes, more investigation is needed to assess whether emulators are fit for this purpose. Our focus is on how to use huge ensembles to estimate very extreme statistics; we expect the results to be relevant for future improved emulators.
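The threshold-exceedance idea mentioned in this abstract can be sketched as follows: keep only the amounts by which observations exceed a high threshold, then fit a generalized Pareto distribution (GPD) to those excesses. The abstract does not state which estimator the authors use, so this sketch swaps in a simple method-of-moments fit, which is only valid when the shape parameter is below 1/2; practical analyses more commonly use maximum likelihood. The function names are hypothetical.

```python
import statistics

def excesses_over_threshold(values, threshold):
    """Amounts by which observations exceed the threshold."""
    return [v - threshold for v in values if v > threshold]

def gpd_moment_fit(excesses):
    """Method-of-moments estimates (shape, scale) for a GPD fitted to excesses.

    Uses mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)).
    """
    if len(excesses) < 2:
        raise ValueError("need at least two excesses")
    mean = statistics.fmean(excesses)
    var = statistics.pvariance(excesses)
    if var == 0:
        raise ValueError("excesses have zero variance")
    ratio = mean * mean / var          # equals 1 - 2 * shape under the GPD
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (1.0 + ratio)
    return shape, scale
```

Raising the threshold trades fewer excesses (more variance) for a better GPD approximation (less bias), which is the tension behind the abstract's call for a sufficiently high threshold.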


2509.09758 2026-04-28 stat.AP

A Path Signature Framework for Detecting Creative Fatigue in Digital Advertising

Charles Shaw

Comments version 3

Details

Abstract

This paper introduces a signature-based framework for detecting advertising creative fatigue using path signatures, a geometric representation from rough path theory. Creative fatigue -- the degradation of creative effectiveness under repeated exposure -- is operationally important in digital marketing because delayed detection can translate directly into avoidable opportunity cost. We reframe fatigue monitoring as a geometric change detection problem: advertising performance trajectories are embedded as paths and represented by truncated (log-)signatures, enabling detection of changes in trend, volatility, and non-linear dynamics beyond simple mean or variance shifts. We further connect statistical detection to managerial decision-making via an explicit quantification of performance loss relative to a benchmark period. Because proprietary production data cannot be released, we evaluate the proposed framework on a synthetic panel dataset designed to mimic realistic impression volumes and noisy day-to-day CTR dynamics. We define observed CTR as the realised binomial rate $CTR_t := C_t/I_t$ using daily clicks $C_t$ and impressions $I_t$. The accompanying CSV also contains a pre-computed CTR field (e.g., due to rounding or upstream derivation), but all modelling and evaluation in this paper use $C_t/I_t$. Crucially, the dataset does not include injected changepoints; we therefore define an operational ground truth for ``fatigue onset'' based on a noise-robust CTR estimate and a sustained deterioration relative to a recent-best baseline. We report lead-time (early warning) and alert-burden metrics under this operational definition, and provide a sensitivity analysis over the detector's primary tuning parameters. The methodology scales linearly in time-series length for fixed signature depth and is suitable for monitoring large creative portfolios.
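The two computational ingredients named in this abstract, the realised rate $CTR_t = C_t/I_t$ and a truncated path signature, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the author's code: the function names are hypothetical, the truncation depth is fixed at 2, and the path is treated as piecewise linear, with each segment folded in via Chen's identity.

```python
def ctr_series(clicks, impressions):
    """Realised binomial rate CTR_t = C_t / I_t for each day."""
    return [c / i for c, i in zip(clicks, impressions)]

def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path.

    `path` is a list of d-dimensional points. Returns (s1, s2), where s1[i]
    is the total increment in coordinate i and s2[i][j] is the iterated
    integral of coordinate i against coordinate j.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(path, path[1:]):
        delta = [qk - pk for pk, qk in zip(p, q)]
        # Chen's identity for appending one linear segment with increment delta.
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            s1[i] += delta[i]
    return s1, s2
```

A performance trajectory could be embedded as points `(t, ctr_t)` and passed to `signature_depth2`; the single pass over segments is consistent with the linear scaling in series length that the abstract reports for fixed depth.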


2506.14062 2026-04-28 cs.DS math.PR stat.CO

Exact and Efficient Sampling from Dynamic Discrete Distributions with Finite-Precision Weights

Lilith Orion Hafner, Adriano Meligrana

Comments Submitted to ESA 2026

2505.11771 2026-04-28 cs.LG cs.AI math.ST stat.ML stat.TH

Residual Feature Integration is Sufficient to Prevent Negative Transfer

Yichen Xu, Ryumei Nakada, Linjun Zhang, Lexin Li

Details

Journal ref: The Fourteenth International Conference on Learning Representations (ICLR 2026)

Abstract

Transfer learning has become a central paradigm in modern machine learning, yet it suffers from the long-standing problem of negative transfer, where leveraging source representations can harm rather than help performance on the target task. Although empirical remedies have been proposed, there remains little theoretical understanding of how to reliably avoid negative transfer. In this paper, we investigate a simple yet remarkably effective strategy: augmenting frozen, pretrained source-side features with a trainable target-side encoder that adapts target features to capture residual signals overlooked by models pretrained on the source data. We show this residual feature integration strategy is sufficient to provably prevent negative transfer, by establishing theoretical guarantees that it has no worse convergence rate than training from scratch under the informative class of target distributions up to logarithmic factors, and that the convergence rate can transition seamlessly from nonparametric to near-parametric when source representations are informative. To our knowledge, this is the first theoretical work that ensures protection against negative transfer. We carry out extensive numerical experiments across image, text and tabular benchmarks, and empirically verify that the method consistently safeguards performance under distribution shift, label noise, semantic perturbation, and class imbalance. We additionally demonstrate that this residual integration mechanism uniquely supports adapt-time multimodality extension, enabling a pretrained single-cell foundation model to incorporate spatial signals for lymph-node anatomical classification despite the source model being trained without them. Our study thus advances the theory of safe transfer learning, and provides a principled approach that is simple, robust, architecture-agnostic, and broadly applicable.


2505.06754 2026-04-28 stat.ME

Post-treatment problems: What can we say about the effect of a treatment among sub-groups who (would) respond in some way?

Chad Hazlett, Nina McMurry, Tanvi Shinkre

2504.04267 2026-04-28 cs.DS cs.DM cs.IT math.IT math.PR stat.CO

Efficient Rejection Sampling in the Entropy-Optimal Range

Thomas L. Draper, Feras A. Saad

2502.03669 2026-04-28 cs.LG cs.AI cs.DM math.OC stat.ML

Unrealized Expectations: Comparing AI Methods vs Classical Algorithms for Maximum Independent Set

Yikai Wu, Haoyu Zhao, Sanjeev Arora

Comments Published on TMLR 04/2026. 28 pages, 6 figures, 98 tables

2410.21922 2026-04-28 stat.CO math.ST stat.TH

Extending Sheldon M. Ross's Method for Efficient Large-Scale Variance Computation

Jiawen Li

2410.09810 2026-04-28 stat.ME

Doubly unfolded adjacency spectral embedding of dynamic multiplex graphs

Maximilian Baum, Francesco Sanna Passino, Axel Gandy

Comments 39 pages, 4 figures

2405.16730 2026-04-28 cs.LG cs.AI stat.AP

"Noisier" Noise Contrastive Estimation is (Almost) Maximum Likelihood

Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Sirui Xie, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu

Comments ICLR 2026

2309.00578 2026-04-28 cs.LG math.ST stat.TH

Consistency of Lloyd's Algorithm Under Perturbations

Dhruv Patel, Hui Shen, Shankar Bhamidi, Yufeng Liu, Vladas Pipiras

2105.04332 2026-04-28 cs.LG stat.ML

Bayesian Optimistic Optimisation with Exponentially Decaying Regret

Hung Tran-The, Sunil Gupta, Santu Rana, Svetha Venkatesh

Comments To appear at ICML 2021 (21 pages)

2009.02539 2026-04-28 stat.ML cs.IT cs.LG math.IT

Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search Spaces

Hung Tran-The, Sunil Gupta, Santu Rana, Huong Ha, Svetha Venkatesh

Comments 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

1911.11950 2026-04-28 stat.ML cs.LG

Trading Convergence Rate with Computational Budget in High Dimensional Bayesian Optimization

Hung Tran-The, Sunil Gupta, Santu Rana, Svetha Venkatesh

Comments Our accepted paper (with Supplementary Material) at AAAI 2020

Details

DOI: 10.1609/aaai.v34i03.5623

Abstract

Scaling Bayesian optimisation (BO) to high-dimensional search spaces is an active and open research problem, particularly when no assumptions are made on function structure. The main reason is that at each iteration, BO requires finding the global maximum of the acquisition function, which is itself a non-convex optimisation problem in the original search space. With growing dimensions, the computational budget for this maximisation gets increasingly short, leading to inaccurate solutions of the maximisation. This inaccuracy adversely affects both the convergence and the efficiency of BO. We propose a novel approach where the acquisition function only requires maximisation on a discrete set of low-dimensional subspaces embedded in the original high-dimensional search space. Our method is free of any low-dimensional structure assumption on the function, unlike many recent high-dimensional BO methods. Optimising the acquisition function in low-dimensional subspaces allows our method to obtain accurate solutions within a limited computational budget. We show that in spite of this convenience, our algorithm remains convergent. In particular, the cumulative regret of our algorithm only grows sub-linearly with the number of iterations. More importantly, as evident from our regret bounds, our algorithm provides a way to trade the convergence rate with the number of subspaces used in the optimisation. Finally, when the number of subspaces is "sufficiently large", our algorithm's cumulative regret is at most $\mathcal{O}^{*}(\sqrt{T\gamma_T})$ as opposed to $\mathcal{O}^{*}(\sqrt{DT\gamma_T})$ for the GP-UCB of Srinivas et al. (2012), removing a crucial factor of $\sqrt{D}$, where $D$ is the dimension of the input space.
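To make the size of the claimed saving concrete, the two regret rates quoted in this abstract differ by exactly the factor $\sqrt{D}$. The worked value $D = 100$ below is an arbitrary illustration, not a figure from the paper.

```latex
\[
\sqrt{D\,T\,\gamma_T} \;=\; \sqrt{D}\,\cdot\,\sqrt{T\,\gamma_T},
\qquad\text{e.g. } D = 100 \;\Rightarrow\; \sqrt{D} = 10 .
\]
```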
