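arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.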

2604.14108 2026-04-16 cs.LG math.DS math.OC stat.ML

Momentum Further Constrains Sharpness at the Edge of Stochastic Stability

Arseniy Andreyev, Advikar Ananthkumar, Marc Walden, Tomaso Poggio, Pierfrancesco Beneventano

Comments 40 pages, 38 figures

2604.14086 2026-04-16 stat.OT

The Epidemiology of Artificial Intelligence

Harsh Parikh, Tyler McCormick, Emily Johnson, Leo Hickey, Megan Ranney, Bhramar Mukherjee

Comments Perspective/Viewpoint of causal role of AI

2604.14075 2026-04-16 math.OC cs.LG stat.ML

Multistage Conditional Compositional Optimization

Buse Şen, Yifan Hu, Daniel Kuhn

2604.14071 2026-04-16 math.ST math.DS stat.TH

Finite-Step Bounds for Iterated Correlation Matrices

Ishrak AlhajjHassan

2604.14061 2026-04-16 cs.IT math.IT math.PR stat.ML

Two-Sided Bounds for Entropic Optimal Transport via a Rate-Distortion Integral

Jingbo Liu

Comments IEEE International Symposium on Information Theory (ISIT) 2026

2604.13980 2026-04-16 cs.LG q-bio.QM stat.ML

BOAT: Navigating the Sea of In Silico Predictors for Antibody Design via Multi-Objective Bayesian Optimization

Jackie Rao, Ferran Gonzalez Hernandez, Leon Gerard, Alexandra Gessner

Comments Proceedings of the 29th International Conference on Artificial Intelligence and Statistics (AISTATS) 2026

2604.13973 2026-04-16 stat.ME

Improving Treatment Effect Estimation in Trials through Adaptive Borrowing of External Controls

Qinwei Yang, Jingyi Li, Peng Wu, Shu Yang

2604.13944 2026-04-16 stat.ME

High-Dimensional Data Analysis for Elliptically Symmetric Distributions

Long Feng

2604.13890 2026-04-16 physics.soc-ph cs.LG econ.EM econ.TH stat.ML

Sandpile Economics: Theory, Identification, and Evidence

Diego Vallarino

2604.09832 2026-04-16 stat.CO

Adaptive Riemannian Manifold Hamiltonian Monte Carlo with Hierarchical Metric

Miika Kailas, Matti Vihola, Jonas Wallin

2603.24153 2026-04-16 math.ST math.PR stat.TH

Penalized estimation of GEV parameters for extreme quantile regression

Lucien M. Vidagbandji, Alexandre Berred, Cyrille Bertelle, Laurent Amanton

2603.02417 2026-04-16 stat.ML cs.LG math.OC

Mini-Batch Covariance, Diffusion Limits, and Oracle Complexity in Stochastic Gradient Descent: A Sampling-Design Perspective

Daniel Zantedeschi, Kumar Muthuraman

2603.00192 2026-04-16 cs.LG stat.AP stat.ML

Diagnostics for Individual-Level Prediction Instability in Machine Learning for Healthcare

Elizabeth W. Miller, Jeffrey D. Blume

Details

Abstract

In healthcare, predictive models increasingly inform patient-level decisions, yet little attention is paid to the variability in individual risk estimates and its impact on treatment decisions. For overparameterized models, now standard in machine learning, a substantial source of variability often goes undetected. Even when the data and model architecture are held fixed, randomness introduced by optimization and initialization can lead to materially different risk estimates for the same patient. This problem is largely obscured by standard evaluation practices, which rely on aggregate performance metrics (e.g., log-loss, accuracy) that are agnostic to individual-level stability. As a result, models with indistinguishable aggregate performance can nonetheless exhibit substantial procedural arbitrariness, which can undermine clinical trust. We propose an evaluation framework that quantifies individual-level prediction instability using two complementary diagnostics: empirical prediction interval width (ePIW), which captures variability in continuous risk estimates, and empirical decision flip rate (eDFR), which measures instability in threshold-based clinical decisions. We apply these diagnostics to simulated data and the GUSTO-I clinical dataset. Across observed settings, we find that for flexible machine-learning models, randomness arising solely from optimization and initialization can induce individual-level variability comparable to that produced by resampling the entire training dataset. Neural networks exhibit substantially greater instability in individual risk predictions than logistic regression models. Risk estimate instability near clinically relevant decision thresholds can alter treatment recommendations. These findings suggest that stability diagnostics should be incorporated into routine model validation for assessing clinical reliability.
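The two diagnostics named in the abstract can be illustrated with a minimal sketch. The exact definitions are not given in the abstract, so the ones below are assumptions: `epiw` is taken to be the width of a central percentile interval of one patient's risk estimates across repeated training runs, and `edfr` the share of runs whose thresholded decision disagrees with the majority decision. The function names and the default 95% level are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def epiw(risks, level=0.95):
    """Assumed ePIW: width of the central `level` interval of one patient's
    risk estimates across repeated training runs (e.g. different seeds)."""
    s = sorted(risks)
    tail = (1.0 - level) / 2.0
    return percentile(s, 1.0 - tail) - percentile(s, tail)


def edfr(risks, threshold):
    """Assumed eDFR: share of runs whose decision (risk >= threshold)
    differs from the majority decision for this patient."""
    treat = sum(r >= threshold for r in risks)
    return min(treat, len(risks) - treat) / len(risks)


if __name__ == "__main__":
    # Hypothetical risk estimates for one patient from four training runs.
    runs = [0.18, 0.22, 0.25, 0.19]
    print(epiw(runs), edfr(runs, threshold=0.20))
```

Under these assumed definitions, a patient whose estimates straddle the decision threshold can have a small interval width and still a high flip rate, which is why the two quantities are complementary.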


2601.04193 2026-04-16 cs.IT math.IT math.PR stat.ML

A discrete Benamou-Brenier formulation of Optimal Transport on graphs

Kieran Morris, Oliver Johnson

2512.24968 2026-04-16 econ.GN cs.AI cs.CY q-fin.EC stat.AP

Strategic Response of News Publishers to Generative AI

Hangcheng Zhao, Ron Berman

2510.18099 2026-04-16 stat.ME

Staying on Track: Efficient Trajectory Discovery with Adaptive Batch Sampling

Arindam Fadikar, Abby Stevens, Mickael Binois, Nicholson Collier, David O'Gara, Jonathan Ozik

2509.02154 2026-04-16 cs.LG cs.AI cs.CV stat.ML

Heavy-Tailed Class-Conditional Priors for Long-Tailed Generative Modeling

Aymene Mohammed Bouayed, Samuel Deslauriers-Gauthier, Adrian Iaccovelli, David Naccache

2505.16051 2026-04-16 stat.ML cs.LG

Flow-based Generative Modeling of Potential Outcomes and Counterfactuals

Dongze Wu, David I. Inouye, Yao Xie

Comments Accepted at 2026 IEEE International Symposium on Information Theory (ISIT 2026)

2504.21143 2026-04-16 stat.AP

Comparative Analysis of Weather-Based Indexes and the Actuaries Climate Index$^{TM}$ for Crop Yield Prediction and Weather-Derivative Pricing

Cem Yavrum, A. Sevtap Selcuk-Kestel, José Garrido

Comments 1) The application of the ACI within a weather-derivative framework is incorporated. 2) A time-trend analysis is integrated prior to crop yield prediction. 3) The iterative M-split leave-k-out cross-validation method is implemented. 4) The Discussion section is added

2503.00379 2026-04-16 cs.LG stat.ML

Improving clustering quality evaluation in noisy Gaussian mixtures

Renato Cordeiro de Amorim, Vladimir Makarenkov

2412.03596 2026-04-16 stat.ME

SMART-MC: Characterizing the Dynamics of Multiple Sclerosis Therapy Transitions Using a Covariate-Based Markov Model

Beomchang Kim, Zongqi Xia, Priyam Das

Details

DOI: 10.1080/01621459.2025.2555055

Abstract

Treatment switching is a common occurrence in the management of Multiple Sclerosis (MS), where patients transition across various disease-modifying therapies (DMTs) due to heterogeneous treatment responses, differences in disease progression, patient characteristics, and therapy-associated adverse effects. To investigate how patient-level covariates influence the likelihood of treatment transitions among DMTs, we adopt a Markovian framework, Sparse Matrix Estimation with Covariate-Based Transitions in Markov Chain Modeling (SMART-MC), in which the transition probabilities are modeled as functions of these covariates. Modeling real-world treatment transitions under this framework presents several challenges, including ensuring parameter identifiability and handling sparse transitions without overfitting. To address identifiability, we constrain each transition-specific covariate coefficient vector to have a fixed L2 norm. Furthermore, our method automatically estimates transition probabilities for sparsely observed transitions as constants and enforces zero transition probabilities for transitions that are empirically unobserved. This approach mitigates the need for additional model complexity to handle sparsity while maintaining interpretability and efficiency. To optimize the multi-modal likelihood function, we develop a scalable, parallelized global optimization routine, which is validated through benchmark comparisons and supported by key theoretical properties. Our analysis uncovers meaningful patterns in DMT transitions, revealing variations across MS patient subgroups defined by age, race, and other clinical factors.


2408.10610 2026-04-16 cs.LG math.PR stat.ME

On an $L^2$ norm for stationary ARMA processes

Anand Ganesh, Babhrubahan Bose, Anand Rajagopalan

Comments 5 pages

2408.02839 2026-04-16 stat.ML cs.LG

Mini-batch Estimation for Deep Cox Models: Statistical Foundations and Practical Guidance

Lang Zeng, Weijing Tang, Zhao Ren, Ying Ding

1909.04024 2026-04-16 stat.ME stat.CO

Estimating the Optimal Linear Combination of Biomarkers using Spherically Constrained Optimization

Priyam Das, Debsurya De, Raju Maiti, Mona Kamal, Katherine A. Hutcheson, Clifton D. Fuller, Bibhas Chakraborty, Christine B. Peterson

1904.10046 2026-04-16 stat.ME stat.AP

A distribution-free smoothed combination method of biomarkers to improve diagnostic accuracy in multi-category classification

Raju Maiti, Jialiang Li, Priyam Das, Lei Feng, Derek Hausenloy, Bibhas Chakraborty

1609.02249 2026-04-16 math.OC stat.ME

Clustering sequence data with mixture Markov chains with covariates using multiple simplex constrained optimization routine (MSiCOR)

Priyam Das, Deborshee Sen, Debsurya De, Jue Hou, Zahra S. H. Abad, Nicole Kim, Zongqi Xia, Tianxi Cai

1604.08636 2026-04-16 math.OC cs.DS stat.ME

Recursive Modified Pattern Search on High-dimensional Simplex : A Blackbox Optimization Technique

Priyam Das

Details

DOI: 10.1007/s13571-020-00236-9

Abstract

In this paper, a novel derivative-free, pattern-search-based algorithm for black-box optimization is proposed over a simplex-constrained parameter space. At each iteration, starting from the current solution, a new set of candidate solutions is found by adding a set of derived step-size vectors to that starting point. While deriving these step-size vectors, precautions and adjustments are made so that the new candidate points remain within the simplex-constrained space. Thus, no extra time is spent evaluating the (possibly expensive) objective function at infeasible points (points outside the unit simplex). When minimizing an objective function of m parameters, the objective function is evaluated at 2m new candidate points within each iteration, so up to 2m parallel threads can be used, which makes the computation even faster when optimizing expensive objective functions over a high-dimensional parameter space. Once a local minimum is discovered, a novel 're-start' strategy is applied to increase the likelihood of finding a better solution. Unlike existing pattern-search-based methods, a sparsity control parameter is introduced, which can be used to induce sparsity in the solution when the solution is expected a priori to be sparse. A comparative study of the performance of the proposed algorithm and other existing algorithms is presented for several low-, moderate-, and high-dimensional optimization problems. Up to a 338-fold improvement in computation time is achieved using the proposed algorithm over the genetic algorithm, along with a better solution. The proposed algorithm is used to estimate the simultaneous quantiles of North Atlantic hurricane velocities during 1981-2006 by maximizing a non-closed-form likelihood function with (possibly) multiple maxima.
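The idea of evaluating up to 2m feasible candidate points per iteration can be sketched as follows. This is a simplified illustration and not the paper's algorithm: the move rule (shift mass onto or away from one coordinate and spread the opposite change evenly over the others), the step-halving feasibility adjustment, and the names `candidate_points` and `simplex_pattern_search` are all assumptions. The re-start strategy and the sparsity control parameter from the abstract are omitted.

```python
def candidate_points(x, step):
    """Return up to 2m feasible neighbours of x on the unit simplex.

    Assumed move rule (not taken from the paper): change coordinate i by
    +/- s and spread the opposite change evenly over the other coordinates,
    halving s until every coordinate stays non-negative.
    """
    m = len(x)
    points = []
    for i in range(m):
        for sign in (1.0, -1.0):
            s = step
            while s > 1e-12:
                y = [xj - sign * s / (m - 1) for xj in x]
                y[i] = x[i] + sign * s
                if all(v >= 0.0 for v in y):  # sum is preserved by construction
                    points.append(y)
                    break
                s /= 2.0
    return points


def simplex_pattern_search(f, x0, step=0.5, tol=1e-6, max_iter=10_000):
    """Derivative-free descent that only evaluates f at feasible points."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        # These evaluations are independent and could run in parallel.
        scored = [(f(y), y) for y in candidate_points(x, step)]
        if scored:
            fb, best = min(scored, key=lambda pair: pair[0])
        if scored and fb < fx:
            x, fx = best, fb      # accept the best improving candidate
        else:
            step /= 2.0           # no improvement: shrink the pattern
    return x, fx
```

A possible usage is `simplex_pattern_search(lambda x: sum((a - b) ** 2 for a, b in zip(x, [0.2, 0.3, 0.5])), [1/3, 1/3, 1/3])`, which minimizes a quadratic whose minimizer lies on the simplex.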


2604.13772 2026-04-16 stat.ME

Testing Alpha in High-Dimensional Conditional Time-Varying Factor Models with Dependent Observations

Long Feng, Huifang Ma, Zhaojun Wang

2604.13748 2026-04-16 stat.ME stat.ML

Forecasting Multivariate Time Series under Predictive Heterogeneity: A Validation-Driven Clustering Framework

Ziling Ma, Ángel López Oriona, Hernando Ombao, Ying Sun

2604.13740 2026-04-16 cs.LG stat.ML

Online learning with noisy side observations

Tomáš Kocák, Gergely Neu, Michal Valko

Comments Published at International Conference on Artificial Intelligence and Statistics (AISTATS) 2016. 13 pages, 7 figures

2604.13739 2026-04-16 cs.LG stat.ML

Spectral Thompson sampling

Tomas Kocak, Michal Valko, Remi Munos, Shipra Agrawal

Comments Published at AAAI Conference on Artificial Intelligence (AAAI) 2014

2604.13738 2026-04-16 stat.ML cs.LG

Covariance-adapting algorithm for semi-bandits with application to sparse rewards

Pierre Perrault, Vianney Perchet, Michal Valko

Comments Published at Conference on Learning Theory (COLT) 2020

2604.13709 2026-04-16 stat.ME

Adaptive Sample Size Simulations with R package adsasi

Skerdi Haviari

Comments 21 pages, 7 figures

2604.13689 2026-04-16 stat.ME

Fractional lower-order covariance-based measures for cyclostationary time series with heavy-tailed distributions: application to dependence testing and model order identification

Wojciech Żuławiński, Agnieszka Wyłomańska

Comments 26 pages, 17 figures

Details

DOI: 10.1016/j.dsp.2025.105214
Journal ref: Digital Signal Processing 163, 105214, 2025

Abstract

This article introduces new methods for the analysis of cyclostationary time series with infinite variance. Traditional cyclostationary analysis, based on periodically correlated (PC) processes, relies on the autocovariance function (ACVF). However, the ACVF is not suitable for data exhibiting a heavy-tailed distribution, particularly with infinite variance. Thus, we propose a novel framework for the analysis of cyclostationary time series with heavy-tailed distributions, utilizing the fractional lower-order covariance (FLOC) as an alternative to covariance. This leads to the introduction of two new autodependence measures: the periodic fractional lower-order autocorrelation function (peFLOACF) and the periodic fractional lower-order partial autocorrelation function (peFLOPACF). These measures generalize the classical periodic autocorrelation function (peACF) and periodic partial autocorrelation function (pePACF), offering robust tools for analyzing infinite-variance processes. Two practical applications of the proposed measures are explored: a portmanteau test for testing dependence in cyclostationary series and a method for order identification in periodic autoregressive (PAR) and periodic moving average (PMA) models with infinite variance. Both applications demonstrate the potential of the new tools, with simulations validating their efficiency. The methodology is further illustrated through the analysis of real-world air pollution data, which showcases its practical utility. The results indicate that the proposed measures based on FLOC provide reliable and efficient techniques for analyzing cyclostationary processes with heavy-tailed distributions.
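As a rough illustration of the FLOC idea, the sketch below uses the definition common in the heavy-tailed signal processing literature, FLOC(X, Y) = E[X^<a> Y^<b>] with the signed power x^<a> = |x|^a sign(x); whether the paper uses exactly this form is an assumption. The exponents a, b > 0 are usually chosen with a + b below the tail index so that the moment is finite. The seasonal estimator `periodic_floc` (averaging over time points in one season of the period) is a hypothetical simplification, and no normalization into peFLOACF is attempted.

```python
import math


def signed_power(x, a):
    """Signed power |x|**a * sign(x); assumes a > 0."""
    return math.copysign(abs(x) ** a, x)


def sample_floc(xs, ys, a, b):
    """Sample analogue of E[X^<a> Y^<b>] for paired observations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    total = sum(signed_power(x, a) * signed_power(y, b) for x, y in zip(xs, ys))
    return total / len(xs)


def periodic_floc(series, period, season, lag, a, b):
    """Assumed seasonal version: average over times t with t % period == season
    of series[t]^<a> * series[t - lag]^<b>."""
    times = [t for t in range(lag, len(series)) if t % period == season]
    if not times:
        raise ValueError("no observations for this season and lag")
    xs = [series[t] for t in times]
    ys = [series[t - lag] for t in times]
    return sample_floc(xs, ys, a, b)
```

With a = b = 1 the sample FLOC reduces to the ordinary (uncentered) sample cross-moment, which is the sense in which it generalizes covariance-based measures.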


2604.13656 2026-04-16 cs.LG cs.AI math.ST stat.ML stat.TH

Ordinary Least Squares is a Special Case of Transformer

Xiaojun Tan, Yuchen Zhao

2604.13598 2026-04-16 cs.LG stat.ME

Enhancing Reinforcement Learning for Radiology Report Generation with Evidence-aware Rewards and Self-correcting Preference Learning

Qin Zhou, Guoyan Liang, Qianyi Yang, Jingyuan Chen, Sai Wu, Chang Yao, Zhe Wang

Comments 13 pages, 4 figures, ACL2026-main

2604.13563 2026-04-16 math.NA cs.NA math.ST stat.TH

Covariance-Informed Subspace: an Adaptive Gradient-Free Input Dimension Reduction Method for Bayesian Inference

Nadège Polette, Olivier Le Maître, Pierre Sochala, Alexandrine Gesret

2604.13539 2026-04-16 stat.AP

Relative plausibility versus probabilism: A level-of-analysis error in juridical proof

Stanley E. Lazic

2604.13525 2026-04-16 stat.ML cs.LG math.OC

Robust Low-Rank Tensor Completion based on M-product with Weighted Correlated Total Variation and Sparse Regularization

Biswarup Karmakar, Ratikanta Behera

Comments 32 pages

2604.13484 2026-04-16 stat.ML cs.LG

Joint Representation Learning and Clustering via Gradient-Based Manifold Optimization

Sida Liu, Yangzi Guo, Mingyuan Wang

2604.13478 2026-04-16 math.OC cs.CE econ.GN q-fin.EC stat.AP

Deepbullwhip: An Open-Source Simulation and Benchmarking for Multi-Echelon Bullwhip Analyses

Mansur M. Arief

2604.13470 2026-04-16 cs.LG stat.ML

Universality of Gaussian-Mixture Reverse Kernels in Conditional Diffusion

Nafiz Ishtiaque, Syed Arefinul Haque, Kazi Ashraful Alam, Fatima Jahara

Comments 10+19 pages

2604.13446 2026-04-16 physics.ao-ph stat.AP

Modeling the Sea-Level Change from U.S. Vehicle Emissions

Tony Wong

2604.13406 2026-04-16 stat.ME

Leveraging machine learning to estimate individualized treatment effects in cluster-randomized trials

Changjun Li, Xi Fang, Michael O. Harhay, Andrew B. Forbes, F. Perry Wilson, Guangyu Tong, Fan Li

2604.13393 2026-04-16 math.OC cs.LG stat.ML

A short proof of near-linear convergence of adaptive gradient descent under fourth-order growth and convexity

Damek Davis, Dmitriy Drusvyatskiy

2604.13352 2026-04-16 stat.AP

A Machine Learning Framework for Uncertainty-Calibrated Capability Decision under Finite Samples

Fei Jiang, Lei Yang

Comments 18 pages, 4 figures and 10 tables

2604.13341 2026-04-16 stat.ME

Newton's Algorithm as a Gradient Flow: A Geometric Framework for Recursive Mixture Estimation

Bernardo Flores

2604.13295 2026-04-16 cs.LG math.PR stat.ML

Some Theoretical Limitations of t-SNE

Rupert Li, Elchanan Mossel

Comments 19 pages, 7 figures

2604.13274 2026-04-16 math.ST cs.CR stat.TH

Sequential Change Detection for Multiple Data Streams with Differential Privacy

Lixing Zhang, Liyan Xie, Ruizhi Zhang

Comments Accepted to the 2026 IEEE International Symposium on Information Theory (ISIT 2026)

2604.13265 2026-04-16 stat.ME stat.AP

Efficient estimation of cumulative incidence curves via data fusion with surrogates: application to integrated analysis of vaccine trial and immunobridging data

Pan Zhao, Peter B. Gilbert, Oliver Dukes, Bo Zhang

2604.13264 2026-04-16 stat.ME stat.AP

Estimating effect thresholds and beyond: A flexible framework for multivariate alert detection

Lucia Ameis, Niklas Hagemann, Kathrin Möllenhoff

Comments 20 pages

Details

Abstract

Evaluating the influence of continuous covariates, like exposure time or dose, on a response variable is a pivotal objective in the assessment of a compound's effect, particularly when determining toxicity in pre-clinical research or pharmacokinetics in clinical trials. The determination of an alert, such as the ED50 value, at which a pre-specified threshold of the response variable is crossed, is an important tool for the evaluation process. In practice, response data might be available for combinations of different covariates, and the alert depending on both is of interest. In this case, it is crucial to use all available information and extrapolate between cases to ensure the optimal utilization of the data. In this paper, we introduce a parametric approach that allows alerts to be estimated in a multidimensional setting. For time-dose-response data, for instance, alert doses at a given time can be determined, even when there are no measurements available at that exact time. Likewise, it allows estimation of alert times for a given dose. More generally, the method makes it possible to characterize the complete alert relationship between covariates by leveraging all available data. This is achieved by fitting a parametric model and constructing either a confidence band for the two-dimensional curve (given, for example, a fixed time or dose) or a confidence plane for the three-dimensional model fit. The initial model fit is achieved by the flexible framework of Generalized Additive Models for Location, Scale and Shape (GAMLSS), which offers the possibility to account for a plethora of complex three-dimensional data structures. We demonstrate the validity of our approach through a simulation study and present an application to data from a study investigating the relevance of the exposure duration on cytotoxicity in primary human hepatocytes.


2604.13253 2026-04-16 cs.LG stat.ME stat.ML

Bias-Corrected Adaptive Conformal Inference for Multi-Horizon Time Series Forecasting

Ankit Lade, Sai Krishna J., Indar Kumar

Comments 14 pages, 3 figures, 2 tables. Preprint

2604.13218 2026-04-16 stat.ML cs.AI cs.LG math.ST stat.TH

Identifiability of Potentially Degenerate Gaussian Mixture Models With Piecewise Affine Mixing

Danru Xu, Sébastien Lachapelle, Sara Magliacane

Comments 49 pages, 10 figures, AISTATS 2026

2604.13188 2026-04-16 econ.EM stat.AP

Is Productivity Advantage of Cities Really Down To Mean and Variance?

Vladislav Morozov, Andrea Sy

2604.13130 2026-04-16 cs.LG stat.ML

Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates

Saumya Goyal, Rohith Rongali, Ritabrata Ray, Barnabás Póczos

2604.11165 2026-04-16 stat.ML cs.AI cs.LG math.ST stat.TH

Cost-optimal Sequential Testing via Doubly Robust Q-learning

Doudou Zhou, Yiran Zhang, Dian Jin, Yingye Zheng, Lu Tian, Tianxi Cai

2603.24654 2026-04-16 quant-ph cs.LG stat.ML

Spectral methods: crucial for machine learning, natural for quantum computers?

Vasilis Belis, Joseph Bowles, Rishabh Gupta, Evan Peters, Maria Schuld

Comments 25 pages, 8 figures

2603.20968 2026-04-16 cs.IT cs.CR math.IT math.ST stat.TH

Composition Theorems for Multiple Differential Privacy Constraints

Cemre Cadir, Salim Najib, Yanina Y. Shkel

Comments Extended version of article in 2026 IEEE International Symposium on Information Theory (ISIT 2026)

2603.13848 2026-04-16 stat.ME

A family of divergence-based correlation measures for contingency tables under bivariate normality

Wataru Urasaki

2603.13464 2026-04-16 stat.ME

Modeling Heterogeneous Mediation Effects in Survival Analysis via an Interpretable M-Learner Framework

Xingyu Li, Qing Liu, Xun Jiang, Hong Amy Xia, Brian P. Hobbs, Peng Wei

2602.09595 2026-04-16 stat.ME

Sharp Bounds for Treatment Effect Generalization under Outcome Distribution Shift

Amir Asiaee, Samhita Pal, Cole Beck, Jared D. Huling

2512.23748 2026-04-16 cs.LG math.PR stat.ML

A Review of Diffusion-based Simulation-Based Inference: Foundations and Applications in Non-Ideal Data Scenarios

Haley Rosso, Talea Mayo

2512.15232 2026-04-16 stat.AP eess.SP

A Blind Source Separation Framework to Monitor Sectoral Power Demand from Grid-Scale Load Measurements

Guillaume Koechlin, Filippo Bovera, Elena Degli Innocenti, Barbara Santini, Alessandro Venturi, Simona Vazio, Piercesare Secchi

2511.20191 2026-04-16 stat.ME stat.CO

A Generalized Additive Partial-Mastery Cognitive Diagnosis Model

Camilo Cárdenas-Hurtado, Sze Ming Lee, Yunxiao Chen, Irini Moustaki

Comments 29 pages, 4 figures. Includes online appendix

2509.21912 2026-04-16 cs.LG stat.ML

Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching

Zhengyan Wan, Yidong Ouyang, Liyan Xie, Fang Fang, Hongyuan Zha, Guang Cheng

Comments Published as a conference paper at ICLR 2026

2508.21025 2026-04-16 math.ST stat.ME stat.TH

Pivotal inference for linear predictions in stationary processes

Holger Dette, Sebastian Kühnert

Comments 33 pages, 3 figures, 2 tables

2508.05663 2026-04-16 stat.ML cs.CR cs.LG cs.SY eess.SY

Random Walk Learning and the Pac-Man Attack

Xingran Chen, Parimal Parag, Rohit Bhagat, Zonghong Liu, Salim El Rouayheb

Comments The updated manuscript represents an incomplete version of the work. A substantially updated version will be prepared before further dissemination

2507.20846 2026-04-16 astro-ph.IM eess.SP stat.AP

Precision spectral estimation at sub-Hz frequencies: closed-form posteriors and Bayesian noise projection

Lorenzo Sala, Stefano Vitale

Comments This work has been submitted for possible publication

2505.12836 2026-04-16 eess.IV cs.CV cs.LG stat.ML

The Gaussian Latent Machine: Efficient Prior and Posterior Sampling for Inverse Problems

Muhamed Kuric, Martin Zach, Andreas Habring, Michael Unser, Thomas Pock

2504.18107 2026-04-16 stat.ME

Multi-Task Learning for High-Dimensional Regression with Many Weak Instruments

Di Zhang, Xuanyu Li, Baoluo Sun

Comments 55 pages, 2 figures, 4 tables

2503.10787 2026-04-16 stat.ME stat.AP

Bayes factor functions for testing partial correlation coefficients

Saptati Datta

2502.16758 2026-04-16 math.ST math.PR stat.TH

Stabilizing the Splits through Minimax Decision Trees

Zhenyuan Zhang, Hengrui Luo

Comments 69 pages, 17 figures; a substantial expansion upon the previous version

2501.02746 2026-04-16 eess.SP math.PR math.SP math.ST stat.TH

A Large-Dimensional Analysis of ESPRIT DoA Estimation: Inconsistency and a Correction via RMT

Zhengyu Wang, Wei Yang, Xiaoyi Mai, Zenan Ling, Zhenyu Liao, Robert C. Qiu

Comments 29 pages, 10 figures, to appear on IEEE Trans. SP. Part of this work was presented at the IEEE 32nd European Signal Processing Conference (EUSIPCO 2024), Lyon, France, under the title "Inconsistency of ESPRIT DoA Estimation for Large Arrays and a Correction via RMT."

2501.02378 2026-04-16 cs.LG q-bio.NC stat.ML

A ghost mechanism: An analytical model of abrupt learning in recurrent networks

Fatih Dinc, Ege Cirakman, Bariscan Kurtkaya, Mert Yuksekgonul, Yiqi Jiang, Mark J. Schnitzer, Hidenori Tanaka

Comments to appear in Physical Review X

Details

Abstract

Abrupt learning is a common phenomenon in recurrent neural networks (RNNs) trained on working memory tasks. In such cases, the networks develop transient slow regions in state space that extend the effective timescales of computation. However, the mechanisms driving sudden performance improvements and their causal role remain unclear. To address this gap, we introduce the ghost mechanism, a process by which dynamical systems exhibit transient slowdown near the remnant of a saddle-node bifurcation. By reducing the high-dimensional dynamics near ghost points, we derive a one-dimensional canonical form that analytically captures learning as a process controlled by a single scale parameter. Using this model, we study a form of abrupt learning emerging from ghost points and identify a critical learning rate that scales as an inverse power law with the timescale of the learned computation. Beyond this rate, learning collapses through two interacting modes: (i) vanishing gradients and (ii) oscillatory gradients near minima. These features can lock the system into high-confidence but incorrect predictions when parameter updates trigger a no-learning zone, a region of parameter space where gradients vanish. We validate these predictions in low-rank RNNs, where ghost points precede abrupt transitions, and further demonstrate their generality in full-rank RNNs trained on canonical working memory tasks. Our theory offers two approaches to address these learning difficulties: increasing trainable ranks stabilizes learning trajectories, while reducing output confidence mitigates entrapment in no-learning zones. Overall, the ghost mechanism reveals how the computational demands of a task constrain the optimization landscape, demonstrating that well-known learning difficulties in RNNs partly arise from the dynamical systems they must learn to implement.
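The transient slowdown near the remnant of a saddle-node bifurcation can be illustrated with the textbook normal form dx/dt = r + x^2, which has no fixed point for r > 0 yet passes slowly through the region near x = 0. This is a generic sketch; the paper's one-dimensional canonical form and its scale parameter may differ, and the function names and the integration span are assumptions. For this normal form the passage time from -span to +span is (2/sqrt(r)) * atan(span/sqrt(r)), which approaches pi/sqrt(r) for large span.

```python
import math


def passage_time(r, span=10.0, dt=1e-3):
    """Time for dx/dt = r + x**2 to travel from -span to +span.

    Uses forward Euler, which is crude but adequate here because almost
    all of the time is spent in the slow region near x = 0.
    """
    x, t = -span, 0.0
    while x < span:
        x += dt * (r + x * x)
        t += dt
    return t


def analytic_passage_time(r, span=10.0):
    """Closed-form passage time for the same normal form."""
    root = math.sqrt(r)
    return 2.0 / root * math.atan(span / root)


if __name__ == "__main__":
    for r in (0.01, 0.0025):
        print(r, passage_time(r), analytic_passage_time(r))
```

Quartering r roughly doubles the passage time, which shows how a system arbitrarily close to the bifurcation can linger near the ghost for arbitrarily long.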


2407.13407 2026-04-16 math.OC math.ST stat.TH

Nonconvex landscapes for $\mathbf{Z}_2$ synchronization and graph clustering are benign near exact recovery thresholds

Andrew D. McRae, Pedro Abdalla, Afonso S. Bandeira, Nicolas Boumal

2405.07432 2026-04-16 stat.ML cs.LG cs.SY eess.SY

Nonparametric Sparse Online Learning of the Koopman Operator

Boya Hou, Sina Sanjari, Nathan Dahlin, Alec Koppel, Subhonmesh Bose

Comments 44 pages

2404.12828 2026-04-16 math.OC math.ST stat.TH

Low solution rank of the matrix LASSO under RIP with consequences for rank-constrained algorithms

Andrew D. McRae

2312.05593 2026-04-16 econ.EM stat.ME

Benign Overfitting in Economic Forecasting via Noise Regularization

Yuan Liao, Xinjie Ma, Andreas Neuhierl, Zhentao Shi

2305.02304 2026-04-16 stat.ML cs.LG

New Equivalences Between Interpolation and SVMs: Kernels and Structured Features

Chiraag Kaushik, Andrew D. McRae, Mark A. Davenport, Vidya Muthukumar

Comments 23 pages, 2 figures

2209.00991 2026-04-16 q-fin.RM math.ST stat.ME stat.TH

E-backtesting

Qiuqi Wang, Ruodu Wang, Johanna Ziegel

1805.00318 2026-04-16 stat.CO

Likelihood-Based Inference with Separable Correlation Matrices

Karl Oskar Ekvall

1804.09154 2026-04-16 cs.LG cs.HC stat.ML

DOOM Level Generation using Generative Adversarial Networks

Edoardo Giacomello, Pier Luca Lanzi, Daniele Loiacono