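arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.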

2602.16709 2026-02-19 cs.LG math.ST stat.ME stat.TH

Knowledge-Embedded Latent Projection for Robust Representation Learning

Weijing Tang, Ming Yuan, Zongqi Xia, Tianxi Cai

Abstract

Latent space models are widely used for analyzing high-dimensional discrete data matrices, such as patient-feature matrices in electronic health records (EHRs), by capturing complex dependence structures through low-dimensional embeddings. However, estimation becomes challenging in the imbalanced regime, where one matrix dimension is much larger than the other. In EHR applications, cohort sizes are often limited by disease prevalence or data availability, whereas the feature space remains extremely large due to the breadth of medical coding systems. Motivated by the increasing availability of external semantic embeddings, such as pre-trained embeddings of clinical concepts in EHRs, we propose a knowledge-embedded latent projection model that leverages semantic side information to regularize representation learning. Specifically, we model column embeddings as smooth functions of semantic embeddings via a mapping in a reproducing kernel Hilbert space. We develop a computationally efficient two-step estimation procedure that combines semantically guided subspace construction via kernel principal component analysis with scalable projected gradient descent. We establish estimation error bounds that characterize the trade-off between statistical error and approximation error induced by the kernel projection. Furthermore, we provide local convergence guarantees for our non-convex optimization procedure. Extensive simulation studies and a real-world EHR application demonstrate the effectiveness of the proposed method.
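The two-step idea in the abstract (a kernel PCA subspace built from semantic embeddings, followed by projected gradient descent) can be illustrated with a rough sketch. This is not the authors' algorithm: the RBF kernel, the Bernoulli (logistic) likelihood for a binary matrix, the kernel centering, the step size, and every function name below are assumptions made only for illustration.

```python
import numpy as np

def rbf_kernel(S, gamma):
    """RBF kernel matrix between rows of S (p x d semantic embeddings).
    The choice of an RBF kernel is an assumption, not taken from the paper."""
    sq = np.sum(S ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * S @ S.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pca_basis(K, m):
    """Step 1 (sketch): top-m eigenvectors of the centered kernel matrix.
    Returns a p x m matrix U with orthonormal columns."""
    p = K.shape[0]
    H = np.eye(p) - np.ones((p, p)) / p   # centering matrix
    vals, vecs = np.linalg.eigh(H @ K @ H)  # eigenvalues in ascending order
    return vecs[:, -m:]

def fit_projected(Y, U, rank, steps=200, lr=0.05, seed=0):
    """Step 2 (sketch): gradient descent on a low-rank logistic model
    logit P(Y_ij = 1) = (A B^T)_ij, projecting the column embeddings B
    onto span(U) after every update."""
    rng = np.random.default_rng(seed)
    n, p = Y.shape
    A = 0.1 * rng.standard_normal((n, rank))
    B = U @ (U.T @ (0.1 * rng.standard_normal((p, rank))))
    for _ in range(steps):
        P = 1.0 / (1.0 + np.exp(-(A @ B.T)))  # fitted probabilities
        G = P - Y                             # gradient w.r.t. the logits
        A_new = A - lr * (G @ B) / p
        B_new = B - lr * (G.T @ A) / n
        A = A_new
        B = U @ (U.T @ B_new)                 # projection onto span(U)
    return A, B
```

Given a binary patient-feature matrix `Y` of shape `(n, p)` and semantic embeddings `S` of shape `(p, d)`, one would call `U = kernel_pca_basis(rbf_kernel(S, gamma), m)` and then `fit_projected(Y, U, rank)`. The projection keeps the many column embeddings inside an `m`-dimensional subspace, which is the kind of regularization the abstract attributes to the semantic side information.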

2602.16690 2026-02-19 stat.ME cs.LG stat.ML

Synthetic-Powered Multiple Testing with FDR Control

Yonghoon Lee, Meshi Bashari, Edgar Dobriban, Yaniv Romano

2602.16634 2026-02-19 stat.ML cs.AI cs.LG physics.bio-ph physics.chem-ph

Enhanced Diffusion Sampling: Efficient Rare Event Sampling and Free Energy Calculation with Diffusion Models

Yu Xie, Ludwig Winkler, Lixin Sun, Sarah Lewis, Adam E. Foster, José Jiménez Luna, Tim Hempel, Michael Gastegger, Yaoyi Chen, Iryna Zaporozhets, Cecilia Clementi, Christopher M. Bishop, Frank Noé

2602.16616 2026-02-19 stat.AP

Design and Analysis Strategies for Pooling in High Throughput Screening: Application to the Search for a New Anti-Microbial

Byran Smucker, Benjamin Brennan, Emily Rego, Meng Wu, Zhihong Lin, Brian Ahmer, Blake Peterson

2602.16606 2026-02-19 math.ST stat.ME stat.TH

On Sharpened Convergence Rate of Generalized Sliced Inverse Regression for Nonlinear Sufficient Dimension Reduction

Chak Fung Choi, Yin Tang, Bing Li

2602.16583 2026-02-19 stat.AP

Physical Activity Trajectories Preceding Incident Major Depressive Disorder Diagnosis Using Consumer Wearable Devices in the All of Us Research Program: Case-Control Study

Yuezhou Zhang, Amos Folarin, Hugh Logan Ellis, Rongrong Zhong, Callum Stewart, Heet Sankesara, Hyunju Kim, Shaoxiong Sun, Abhishek Pratap, Richard JB Dobson

2602.16581 2026-02-19 math.NA cs.NA stat.CO

Whittle-Matérn Fields with Variable Smoothness

Hamza Ruzayqat, Wenyu Lei, David Bolin, George Turkiyyah, Omar Knio

Comments 24 pages, 5 figures, 2 tables

2602.16568 2026-02-19 math.ST cs.DS cs.LG math.OC stat.ML stat.TH

Separating Oblivious and Adaptive Models of Variable Selection

Ziyun Chen, Jerry Li, Kevin Tian, Yusong Zhu

Comments 40 pages

2602.16540 2026-02-19 stat.ME math.ST stat.TH

Generalised Linear Models Driven by Latent Processes: Asymptotic Theory and Applications

Wagner Barreto-Souza, Ngai Hang Chan

Comments Paper submitted for publication

2602.16505 2026-02-19 stat.ML cs.LG

Functional Decomposition and Shapley Interactions for Interpreting Survival Models

Sophie Hanna Langbein, Hubert Baniecki, Fabian Fumagalli, Niklas Koenen, Marvin N. Wright, Julia Herbinger

2602.16497 2026-02-19 stat.ME

Factor-Adjusted Multiple Testing for High-Dimensional Individual Mediation Effects

Chen Shi, Zhao Chen, Christina Dan Wang

2602.16476 2026-02-19 stat.ML cs.LG

Learning Preference from Observed Rankings

Yu-Chang Chen, Chen Chian Fuh, Shang En Tsai

2602.16466 2026-02-19 math.ST stat.TH

Estimation of Conformal Metrics

Jérôme Taupin

2602.16463 2026-02-19 stat.ME

Focused Relative Risk Information Criterion for Variable Selection in Linear Regression

Nils Lid Hjort

Comments 19 pages, 5 figures; technical report of July 2020 (Department of Mathematics, University of Oslo), from which a modified version will be written and submitted for journal publication

2602.16436 2026-02-19 cs.LG cs.CR stat.ML

Learning with Locally Private Examples by Inverse Weierstrass Private Stochastic Gradient Descent

Jean Dufraiche, Paul Mangold, Michaël Perrot, Marc Tommasi

Comments 30 pages, 8 figures

2602.15385 2026-02-19 stat.AP q-fin.RM stat.ML

From Chain-Ladder to Individual Claims Reserving

Ronald Richman, Mario V. Wüthrich

2602.14981 2026-02-19 stat.ME math.ST stat.CO stat.ML stat.TH

Block Empirical Likelihood Inference for Longitudinal Generalized Partially Linear Single-Index Models

Tianni Zhang, Yuyao Wang, Yu Lu, Mengfei Ran

2602.10749 2026-02-19 stat.AP

The Dataset of Daily Air Quality for the Years 2013-2023 in Italy

Alessandro Fusta Moro, Alessandro Fassò, Jacopo Rodeschini

2602.05298 2026-02-19 stat.ML cs.LG math.OC

Logarithmic-time Schedules for Scaling Language Models with Momentum

Damien Ferbach, Courtney Paquette, Gauthier Gidel, Katie Everett, Elliot Paquette

2601.21106 2026-02-19 stat.ME

Scalable Dirichlet Process Mixture Models with Unknown Concentration and Adaptive Covariance for High-Dimensional Clustering Applied to Leukemia Transcriptomics

Annesh Pal, Aguirre Mimoun, Rodolphe Thiébaut, Boris P. Hejblum

Comments 22 pages with 5 figures and 1 table

2512.09530 2026-02-19 stat.ML cs.LG

Transformers for Tabular Data: A Training Perspective of Self-Attention via Optimal Transport

Alessandro Quadrio, Antonio Candelieri

2510.16161 2026-02-19 cs.LG stat.ML

Still Competitive: Revisiting Recurrent Models for Irregular Time Series Prediction

Ankitkumar Joshi, Milos Hauskrecht

Comments Published in Transactions on Machine Learning Research, 2026

2510.08102 2026-02-19 cs.CL cs.AI cs.LG stat.ML

Lossless Vocabulary Reduction for Auto-Regressive Language Models

Daiki Chijiwa, Taku Hasegawa, Kyosuke Nishida, Shin'ya Yamaguchi, Tomoya Ohba, Tamao Sakao, Susumu Takeuchi

Comments The Fourteenth International Conference on Learning Representations (ICLR 2026)

2509.18406 2026-02-19 stat.ME

A constrained iteratively-reweighted least-squares framework for generalised linear models

Pierre Masselot, Devon Nenon, Jacopo Vanoli, Zaid Chalabi, Antonio Gasparrini

Comments Submitted for peer-reviewed publication. V3 changes: (i) Introduction and abstract have been reworked, (ii) improvement in the evaluation of degrees of freedom formulae and (iii) modification of the first application (global warming)

2508.12926 2026-02-19 math.ST math.PR stat.ML stat.TH

On the distance between mean and geometric median in high dimensions

Richard Schwank, Mathias Drton

Comments Background section added and proofs shortened

2508.02922 2026-02-19 stat.ME

A multi-stage Bayesian approach to fit spatial point process models

Rachael Ren, Mevin B. Hooten, Toryn L. J. Schafer, Nicholas M. Calzada, Benjamin Hoose, Jamie N. Womble, Scott Gende

Comments 51 pages, 24 figures

2507.04033 2026-02-19 cs.LG cs.CY math.OC stat.ML

Benchmarking Stochastic Approximation Algorithms for Fairness-Constrained Training of Deep Neural Networks

Andrii Kliachkin, Jana Lepšová, Gilles Bareilles, Jakub Mareček

2506.17773 2026-02-19 stat.ME

Selection of functional predictors and smooth coefficient estimation for scalar-on-function regression models

Hedayat Fathi, Marzia A. Cremona, Federico Severino

Comments 46 pages, 6 figures

2412.06004 2026-02-19 math.ST math.PR q-bio.PE stat.CO stat.TH

Large-sample analysis of cost functionals for inference under the coalescent

Martina Favero, Jere Koskela

Comments 34 pages, 7 figures

2409.12019 2026-02-19 math.ST stat.TH

Asymptotics for conformal inference

Ulysse Gazin

Comments 39 pages, 3 figures, 2 tables

2602.16352 2026-02-19 stat.ML cs.CY cs.LG

Machine Learning in Epidemiology

Marvin N. Wright, Lukas Burk, Pegah Golchian, Jan Kapar, Niklas Koenen, Sophie Hanna Langbein

2602.16328 2026-02-19 stat.ME

A general framework for modeling Gaussian process with qualitative and quantitative factors

Linsui Deng, C. F. Jeff Wu

2602.16310 2026-02-19 stat.ME econ.EM math.ST stat.AP stat.TH

Introducing the b-value: combining unbiased and biased estimators from a sensitivity analysis perspective

Zhexiao Lin, Peter J. Bickel, Peng Ding

Comments 53 pages

2602.16283 2026-02-19 math.ST stat.OT stat.TH

Orthogonal parametrisations of Extreme-Value distributions

Nathan Huet, Ilaria Prosdocimi

2602.16259 2026-02-19 math.ST stat.CO stat.ME stat.TH

HAL-MLE Log-Splines Density Estimation (Part I: Univariate)

Yilong Hou, Zhengpu Zhao, Yi Li, Mark van der Laan

Comments 75 pages

2602.16223 2026-02-19 math.ST math.PR stat.TH

Nonparametric estimation of linear multiplier for processes driven by a Hermite process

B. L. S. Prakasa Rao

2602.16218 2026-02-19 cs.LG cs.NA math.NA stat.ML

Bayesian Quadrature: Gaussian Processes for Integration

Maren Mahsereci, Toni Karvonen

2602.16183 2026-02-19 cs.GT cs.LG stat.ML

Multi-Agent Combinatorial-Multi-Armed-Bandit framework for the Submodular Welfare Problem under Bandit Feedback

Subham Pokhriyal, Shweta Jain, Vaneet Aggarwal

2602.16137 2026-02-19 stat.ME

Experimental Assortments for Choice Estimation and Nest Identification

Xintong Yu, Will Ma, Michael Zhao

2602.16131 2026-02-19 stat.ML cs.LG

Empirical Cumulative Distribution Function Clustering for LLM-based Agent System Analysis

Chihiro Watanabe, Jingyu Sun

2602.16120 2026-02-19 cs.LG stat.AP stat.ML

Feature-based morphological analysis of shape graph data

Murad Hossen, Demetrio Labate, Nicolas Charon

2602.16111 2026-02-19 stat.AP cs.AI

Surrogate-Based Prevalence Measurement for Large-Scale A/B Testing

Zehao Xu, Tony Paek, Kevin O'Sullivan, Attila Dobi

2602.16099 2026-02-19 stat.CO stat.ME stat.ML

Quantifying and Attributing Submodel Uncertainty in Stochastic Simulation Models and Digital Twins

Mohammadmahdi Ghasemloo, David J. Eckman, Yaxian Li

2602.16065 2026-02-19 cs.LG cs.AI math.ST stat.ML stat.TH

Can Generative Artificial Intelligence Survive Data Contamination? Theoretical Guarantees under Contaminated Recursive Training

Kevin Wang, Hongqian Niu, Didong Li

2602.16063 2026-02-19 eess.SY cs.CE cs.ET cs.LG cs.SY stat.CO

MARLEM: A Multi-Agent Reinforcement Learning Simulation Framework for Implicit Cooperation in Decentralized Local Energy Markets

Nelson Salazar-Pena, Alejandra Tabares, Andres Gonzalez-Mancera

Comments 32 pages, 7 figures, 1 table, 1 algorithm

2602.16062 2026-02-19 eess.SY cs.CE cs.LG cs.MA cs.SY stat.AP

Harnessing Implicit Cooperation: A Multi-Agent Reinforcement Learning Approach Towards Decentralized Local Energy Markets

Nelson Salazar-Pena, Alejandra Tabares, Andres Gonzalez-Mancera

Comments 42 pages, 7 figures, 10 tables

2602.16041 2026-02-19 stat.ME

Predictive Subsampling for Scalable Inference in Networks

Arpan Kumar, Minh Tang, Srijan Sengupta

2602.16040 2026-02-19 stat.ME

Covariate Adjustment for Wilcoxon Two Sample Statistic and Test

Zhilan Lou, Jun Shao, Ting Ye, Tuo Wang, Yanyao Yi, Yu Du

Comments 18 pages, 0 figures, 3 tables

2602.16031 2026-02-19 stat.ME stat.AP

Competing Risk Analysis in Cardiovascular Outcome Trials: A Simulation Comparison of Cox and Fine-Gray Models

Tuo Wang, Yu Du

Comments 18 pages, 6 figures

2602.15972 2026-02-19 cs.LG stat.ML

Fast Online Learning with Gaussian Prior-Driven Hierarchical Unimodal Thompson Sampling

Tianchi Zhao, He Liu, Hongyin Shi, Jinliang Li

2602.15955 2026-02-19 cs.LG stat.AP

Adaptive Semi-Supervised Training of P300 ERP-BCI Speller System with Minimum Calibration Effort

Shumeng Chen, Jane E. Huggins, Tianwen Ma

Comments 8 pages, 8 figures

2602.15925 2026-02-19 stat.ML cs.LG

Robust Stochastic Gradient Posterior Sampling with Lattice Based Discretisation

Zier Mensch, Lars Holdijk, Samuel Duffield, Maxwell Aifer, Patrick J. Coles, Max Welling, Miranda C. N. Cheng

2602.15920 2026-02-19 stat.ML cs.LG eess.SP

Including Node Textual Metadata in Laplacian-constrained Gaussian Graphical Models

Jianhua Wang, Killian Cressant, Pedro Braconnot Velloso, Arnaud Breloy

Comments Submitted to EUSIPCO 2026

2602.15916 2026-02-19 stat.ME stat.ML

Nonparametric Identification and Inference for Counterfactual Distributions with Confounding

Jianle Sun, Kun Zhang

Comments 35 pages for Main text, 22 pages for Appendices, 6 figures

2602.13158 2026-02-19 stat.ME

A new mixture model for spatiotemporal exceedances with flexible tail dependence

Ryan Li, Emily C. Hector, Brian J. Reich, Reetam Majumder

2602.10531 2026-02-19 stat.ML cs.LG math.ST stat.TH

From Collapse to Improvement: Statistical Perspectives on the Evolutionary Dynamics of Iterative Training on Contaminated Sources

Soham Bakshi, Sunrit Chakraborty

2601.21093 2026-02-19 stat.ML cs.LG math.OC math.PR math.ST stat.TH

High-dimensional learning dynamics of multi-pass Stochastic Gradient Descent in multi-index models

Zhou Fan, Leda Wang

2601.17973 2026-02-19 stat.ML cs.LG

Boosting methods for interval-censored data with regression and classification

Yuan Bian, Grace Y. Yi, Wenqing He

2511.03952 2026-02-19 stat.ML cs.LG

High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes

Aukosh Jagannath, Taj Jones-McCormick, Varnan Sarangian

2510.20755 2026-02-19 math.ST math.CO stat.ME stat.ML stat.TH

Incomplete U-Statistics of Equireplicate Designs: Berry-Esseen Bound and Efficient Construction

Cesare Miglioli, Jordan Awan

2509.20928 2026-02-19 stat.ML cs.LG

Conditionally Whitened Generative Models for Probabilistic Time Series Forecasting

Yanfeng Yang, Siwei Chen, Pingping Hu, Zhaotong Shen, Yingjie Zhang, Zhuoran Sun, Shuai Li, Ziqi Chen, Kenji Fukumizu

Comments Accepted by the fourteenth International Conference on Learning Representations (ICLR 2026). https://openreview.net/forum?id=GG01lCopSK

2505.24205 2026-02-19 cs.LG stat.ML

On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks

Mingze Wang, Weinan E

Comments 28 pages, NeurIPS 2025 Spotlight

2411.16370 2026-02-19 cs.CV cs.AI cs.LG eess.IV stat.ML

A Review of Bayesian Uncertainty Quantification in Deep Probabilistic Image Segmentation

M. M. A. Valiuddin, R. J. G. van Sloun, C. G. A. Viviers, P. H. N. de With, F. van der Sommen

Comments TMLR

2409.19642 2026-02-19 stat.CO cs.NA math.NA math.OC stat.ME

Solving Fredholm Integral Equations of the Second Kind via Wasserstein Gradient Flows

Francesca R. Crucinio, Adam M. Johansen

2409.04332 2026-02-19 cs.LG stat.ML

Amortized Bayesian Workflow

Chengkun Li, Aki Vehtari, Paul-Christian Bürkner, Stefan T. Radev, Luigi Acerbi, Marvin Schmitt

Comments Accepted in Transactions on Machine Learning Research

2408.09760 2026-02-19 stat.ME cs.SI econ.GN q-fin.EC stat.AP

Regional and spatial dependence of poverty factors in Thailand, and its use into Bayesian hierarchical regression analysis

Irving Gómez-Méndez, Chainarong Amornbunchornvej

Comments Code to reproduce our results is available at https://github.com/IrvingGomez/SpatialPovertyFactors

2402.19456 2026-02-19 quant-ph cs.DS math.PR math.ST stat.ML stat.TH

Statistical Estimation in the Spiked Tensor Model via the Quantum Approximate Optimization Algorithm

Leo Zhou, Joao Basso, Song Mei

Comments 51 pages, 4 figures, 1 table

2402.11020 2026-02-19 stat.ME

Proximal Causal Inference for Conditional Separable Effects

Chan Park, Mats Stensrud, Eric Tchetgen Tchetgen

2305.12288 2026-02-19 stat.AP cs.NA math.NA

A Cost-Effective Slag-based Mix Activated with Soda Ash and Hydrated Lime: A Pilot Study

Jayashree Sengupta, Nirjhar Dhang, Arghya Deb

2208.14153 2026-02-19 cs.LG stat.ML

Identifying Weight-Variant Latent Causal Models

Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi