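arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.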

2602.22203 2026-02-26 stat.ME

Local Bayesian Regression

Nils Lid Hjort

Comments 28 pages; statistical Research Report, Department of Mathematics, University of Oslo, August 1994, but arXiv'd in February 2026. A journal paper can be written up based on this report, requiring though numerical studies and good illustrations

2602.22178 2026-02-26 math.ST stat.TH

Confidence in confidence distributions!

Céline Cunen, Nils Lid Hjort, Tore Schweder

Comments 5 pages, 2 figures. Statistical Research Report, Department of Mathematics, University of Oslo, February 2020, here arXiv'd February 2026. Published in Proceedings of the Royal Society, Series A, 2020, vol. 476, at this url: royalsocietypublishing.org/rspa/article/476/2237/20190781/56889

2602.22122 2026-02-26 stat.ML cs.LG

Probing the Geometry of Diffusion Models with the String Method

Elio Moreau, Florentin Coeurdoux, Grégoire Ferre, Eric Vanden-Eijnden

2602.22083 2026-02-26 stat.ME cs.LG stat.ML

Coarsening Bias from Variable Discretization in Causal Functionals

Xiaxian Ou, Razieh Nabi

2602.22062 2026-02-26 stat.ME math.ST stat.TH

Robust Model Selection for Discovery of Latent Mechanistic Processes

Jiawei Li, Nguyen Nguyen, Meng Lai, Ioannis Ch. Paschalidis, Jonathan H. Huggins

2602.22047 2026-02-26 math.OC math.ST stat.TH

Stochastic Optimal Control with Side Information and Bayesian Learning

Johannes Milz, Alexander Shapiro, Enlu Zhou

2602.22021 2026-02-26 stat.ME

Budgeted Active Experimentation for Treatment Effect Estimation from Observational and Randomized Data

Jiacan Gao, Xinyan Su, Mingyuan Ma, Yiyan Huang, Xiao Xu, Xinrui Wan, Tianqi Gu, Enyun Yu, Jiecheng Guo, Zhiheng Zhang

2602.22003 2026-02-26 cs.LG math.OC stat.ML

Neural solver for Wasserstein Geodesics and optimal transport dynamics

Hailiang Liu, Yan-Han Chen

Comments 28 pages, 22 figures

2602.21998 2026-02-26 stat.ME

Design-based theory for causal inference from adaptive experiments

Xinran Li, Anqi Zhao

2602.21948 2026-02-26 cs.LG stat.ML

Bayesian Generative Adversarial Networks via Gaussian Approximation for Tabular Data Synthesis

Bahrul Ilmi Nasution, Mark Elliot, Richard Allmendinger

Comments 28 pages, 5 Figures, Accepted in Transactions on Data Privacy

2602.21928 2026-02-26 cs.LG stat.ML

Learning Unknown Interdependencies for Decentralized Root Cause Analysis in Nonlinear Dynamical Systems

Ayush Mohanty, Paritosh Ramanan, Nagi Gebraeel

Comments Manuscript under review

2602.21846 2026-02-26 stat.ML cs.LG math.ST stat.ME stat.TH

Scalable Kernel-Based Distances for Statistical Inference and Integration

Masha Naslidnyk

Comments PhD thesis

2602.21765 2026-02-26 cs.LG cs.AI stat.ML

Generalisation of RLHF under Reward Shift and Clipped KL Regularisation

Kenton Tang, Yuzhu Chen, Fengxiang He

2602.21764 2026-02-26 math.ST stat.TH

Estimation of the Self-similarity Index of Non-stationary Increments Self-similar Processes via Lamperti Transformations

William Wu, Qidi Peng

2602.20503 2026-02-26 stat.ME stat.AP

Error-Controlled Borrowing from External Data Using Wasserstein Ambiguity Sets

Yui Kimura, Shu Tamano

2512.25017 2026-02-26 math.NA cs.LG cs.NA q-fin.CP stat.ML

Convergence of the generalization error for deep gradient flow methods for PDEs

Chenguang Liu, Antonis Papapantoleon, Jasper Rou

Comments 29 pages

2509.25800 2026-02-26 cs.LG stat.ME

Characterization and Learning of Causal Graphs with Latent Confounders and Post-treatment Selection from Interventional Data

Gongxu Luo, Loka Li, Guangyi Chen, Haoyue Dai, Kun Zhang

2508.16110 2026-02-26 stat.ME math.PR

Estimating the growth rate of a birth and death process using data from a small sample

Carola Sophia Heinzel, Jason Schweinsberg

2504.19994 2026-02-26 stat.ME

Semi-parametric bulk and tail regression using spline-based neural networks

Reetam Majumder, Jordan Richards

2503.20852 2026-02-26 stat.OT math.PR math.ST stat.TH

Teachable normal approximations to binomial and related probabilities or confidence bounds

Lutz Mattner

Comments 13 pages. Contains now a complete proof of the proposed bounds for Clopper-Pearson bounds. Further various minor improvements

2503.01081 2026-02-26 stat.ME

A Dynamic Factor Model for Multivariate Counting Process Data

Fangyi Chen, Hok Kan Ling, Zhiliang Ying

2501.08449 2026-02-26 cs.CR cs.CY cs.DS stat.ME

A Refreshment Stirred, Not Shaken: Invariant-Preserving Deployments of Differential Privacy for the U.S. Decennial Census

James Bailie, Ruobin Gong, Xiao-Li Meng

Comments 65 pages, 2 figures

Journal ref Harvard Data Science Review (2026), Special Issue 6

DOI: 10.1162/99608f92.dab78690

Abstract

Protecting an individual's privacy when releasing their data is inherently an exercise in relativity, regardless of how privacy is qualified or quantified. This is because we can only limit the gain in information about an individual relative to what could be derived from other sources. This framing is the essence of differential privacy (DP), through which this article examines two statistical disclosure control (SDC) methods for the United States Decennial Census: the Permutation Swapping Algorithm (PSA), which resembles the 2010 Census's disclosure avoidance system (DAS), and the TopDown Algorithm (TDA), which was used in the 2020 DAS. To varying degrees, both methods leave unaltered certain statistics of the confidential data (their invariants) and hence neither can be readily reconciled with DP, at least as originally conceived. Nevertheless, we show how invariants can naturally be integrated into DP and use this to establish that the PSA satisfies pure DP subject to the invariants it necessarily induces, thereby proving that this traditional SDC method can, in fact, be understood from the perspective of DP. By a similar modification to zero-concentrated DP, we also provide a DP specification for the TDA. Finally, as a point of comparison, we consider a counterfactual scenario in which the PSA was adopted for the 2020 Census, resulting in a reduction in the nominal protection loss budget but at the cost of releasing many more invariants. This highlights the pervasive danger of comparing budgets without accounting for the other dimensions on which DP formulations vary (such as the invariants they permit). Therefore, while our results articulate the mathematical guarantees of SDC provided by the PSA, the TDA, and the 2020 DAS in general, care must be taken in translating these guarantees into actual privacy protection, just as is the case for any DP deployment.

2412.10895 2026-02-26 cs.LG stat.ML

Multi-Class and Multi-Task Strategies for Neural Directed Link Prediction

Claudio Moroni, Claudio Borile, Carolina Mattsson, Michele Starnini, André Panisson

Comments 15 pages, 2 figures

Journal ref ECML PKDD 2025

DOI: 10.1007/978-3-662-72243-5_8

Abstract

Link Prediction is a foundational task in Graph Representation Learning, supporting applications like link recommendation, knowledge graph completion and graph generation. Graph Neural Networks have shown the most promising results in this domain and are currently the de facto standard approach to learning from graph data. However, a key distinction exists between Undirected and Directed Link Prediction: the former just predicts the existence of an edge, while the latter must also account for edge directionality and bidirectionality. This translates to Directed Link Prediction (DLP) having three sub-tasks, each defined by how training, validation and test sets are structured. Most research on DLP overlooks this trichotomy, focusing solely on the "existence" sub-task, where training and test sets are random, uncorrelated samples of positive and negative directed edges. Even in the works that recognize the aforementioned trichotomy, models fail to perform well across all three sub-tasks. In this study, we experimentally demonstrate that training Neural DLP (NDLP) models only on the existence sub-task, using methods adapted from Neural Undirected Link Prediction, results in parameter configurations that fail to capture directionality and bidirectionality, even after rebalancing edge classes. To address this, we propose three strategies that handle the three tasks simultaneously. Our first strategy, the Multi-Class Framework for Neural Directed Link Prediction (MC-NDLP) maps NDLP to a Multi-Class training objective. The second and third approaches adopt a Multi-Task perspective, either with a Multi-Objective (MO-DLP) or a Scalarized (S-DLP) strategy. Our results show that these methods outperform traditional approaches across multiple datasets and models, achieving equivalent or superior performance in addressing the three DLP sub-tasks.

2407.16299 2026-02-26 stat.ME stat.ML

Sparse outlier-robust PCA for multi-source data

Patricia Puchhammer, Ines Wilms, Peter Filzmoser

2404.07849 2026-02-26 stat.ML cs.LG

Overparameterized Multiple Linear Regression as Hyper-Curve Fitting

E. Atza, N. Budko

Comments 18 pages, 8 figures, version 2 (IOP style, revised), Python code and data available at: https://github.com/the-iterator/hyper-curve-regression-yarn

2311.11216 2026-02-26 stat.ME

Reconciling Overt Bias and Hidden Bias in Sensitivity Analysis for Matched Observational Studies

Siyu Heng, Yanxin Shen, Pengyun Wang

2207.00985 2026-02-26 math.NA cs.DM cs.NA math.ST stat.AP stat.ME stat.TH

Linguistic Approach to Time Series Forecasting

Dmytro Lande, Volodymyr Yuzefovych, Yevheniia Tsybulska

Comments 8 pages, 9 figures

2602.21713 2026-02-26 stat.ME

Multi-Parameter Estimation of Prevalence (MPEP): A Bayesian modelling approach to estimate the prevalence of opioid dependence

Andreas Markoulidakis, Matthew Hickman, Nicky J Welton, Loukia Meligkotsidou, Hayley E Jones

2602.21711 2026-02-26 stat.ME math.ST stat.AP stat.CO stat.TH

Adaptive Penalized Doubly Robust Regression for Longitudinal Data

Yuyao Wang, Yu Lu, Tianni Zhang, Mengfei Ran

2602.21701 2026-02-26 cs.LG physics.data-an stat.ML

Learning Complex Physical Regimes via Coverage-oriented Uncertainty Quantification: An application to the Critical Heat Flux

Michele Cazzola, Alberto Ghione, Lucia Sargentini, Julien Nespoulous, Riccardo Finotello

Comments 34 pages, 14 figures

Abstract

A central challenge in scientific machine learning (ML) is the correct representation of physical systems governed by multi-regime behaviours. In these scenarios, standard data analysis techniques often fail to capture the nature of the data, as the system's response varies significantly across the state space due to its stochasticity and the different physical regimes. Uncertainty quantification (UQ) should thus not be viewed merely as a safety assessment, but as a support to the learning task itself, guiding the model to internalise the behaviour of the data. We address this by focusing on the Critical Heat Flux (CHF) benchmark and dataset presented by the OECD/NEA Expert Group on Reactor Systems Multi-Physics. This case study represents a test for scientific ML due to the non-linear dependence of CHF on the inputs and the existence of distinct microscopic physical regimes. These regimes exhibit diverse statistical profiles, a complexity that requires UQ techniques to internalise the data behaviour and ensure reliable predictions. In this work, we conduct a comparative analysis of UQ methodologies to determine their impact on physical representation. We contrast post-hoc methods, specifically conformal prediction, against end-to-end coverage-oriented pipelines, including (Bayesian) heteroscedastic regression and quality-driven losses. These approaches treat uncertainty not as a final metric, but as an active component of the optimisation process, modelling the prediction and its behaviour simultaneously. We show that while post-hoc methods ensure statistical calibration, coverage-oriented learning effectively reshapes the model's representation to match the complex physical regimes. The result is a model that delivers not only high predictive accuracy but also a physically consistent uncertainty estimation that adapts dynamically to the intrinsic variability of the CHF.
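
As a rough illustration of the post-hoc conformal prediction that the abstract above contrasts with coverage-oriented training, here is a minimal sketch of generic split conformal prediction for regression. It is not the authors' pipeline: the function name `split_conformal_halfwidth`, the toy residuals, and the example prediction are all assumptions made for illustration.

```python
import math


def split_conformal_halfwidth(residuals, alpha):
    """Half-width q of a split conformal interval [pred - q, pred + q].

    `residuals` are (y - prediction) values on a held-out calibration set.
    Under exchangeability the interval has marginal coverage >= 1 - alpha.
    """
    n = len(residuals)
    scores = sorted(abs(r) for r in residuals)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this coverage level.
        return math.inf
    return scores[k - 1]


# Hypothetical calibration residuals from an already-fitted model.
calibration_residuals = [0.1, -0.2, 0.3, -0.4, 0.5, 0.6, -0.7, 0.8, 0.9]
q = split_conformal_halfwidth(calibration_residuals, alpha=0.2)

# Wrap a new point prediction (hypothetical value) in the calibrated interval.
prediction = 3.0
interval = (prediction - q, prediction + q)
```

The half-width here is the same for every input, which is the limitation the abstract points at: post-hoc calibration guarantees coverage on average but does not by itself adapt the interval to different regimes.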

2602.21663 2026-02-26 stat.ME

Estimation, inference and model selection for jump regression models

Steffen Grønneberg, Gudmund Hermansen, Nils Lid Hjort

Comments 33 pages, 3 figures; Statistical Research Report, Department of Mathematics, University of Oslo, from June 2014, and arXiv'd February 2026. This paper constituted a part of the doctoral dissertations for respectively Gudmund Hermansen and Steffen Grønneberg. An extended and polished version will be written up for journal publication

2602.21579 2026-02-26 stat.ME

Asymptotically Optimal Sequential Confidence Interval for the Gini Index Under Complex Household Survey Design with Sub-Stratification

Shivam, Bhargab Chattopadhyay, Nil Kamal Hazra

2602.21572 2026-02-26 stat.ML cs.LG stat.ME

Goodness-of-Fit Tests for Latent Class Models with Ordinal Categorical Data

Huan Qing

Comments 50 pages, 4 tables, 3 figures

2602.21569 2026-02-26 math.ST cs.LG stat.ME stat.ML stat.TH

How many asymmetric communities are there in multi-layer directed networks?

Huan Qing

Comments 44 pages, 4 tables, 2 figures

2602.21509 2026-02-26 stat.ML cs.LG stat.AP

Fair Model-based Clustering

Jinwon Park, Kunwoong Kim, Jihu Lee, Yongdai Kim

Comments Accepted by AAAI 2026 (Main Track, Oral presentation)

2602.21490 2026-02-26 stat.ME

Connection Probabilities Estimation in Multi-layer Networks via Iterative Neighborhood Smoothing

Dingzi Guo, Diqing Li, Jingyi Wang, Wen-Xin Zhou

2602.21478 2026-02-26 stat.ML cs.LG math.ST stat.ME stat.TH

Efficient Inference after Directionally Stable Adaptive Experiments

Zikai Shen, Houssam Zenati, Nathan Kallus, Arthur Gretton, Koulik Khamaru, Aurélien Bibaut

Comments 34 pages

2602.21465 2026-02-26 math.ST math.PR stat.TH

Exponential Concentration Inequalities For Independent Random Vectors Under Sublinear Expectations

Nahom Seyoum

2602.21462 2026-02-26 cs.LG q-bio.GN stat.ML

Effects of Training Data Quality on Classifier Performance

Alan F. Karr, Regina Ruane

2602.21446 2026-02-26 stat.ML cs.LG

ConformalHDC: Uncertainty-Aware Hyperdimensional Computing with Application to Neural Decoding

Ziyi Liang, Hamed Poursiami, Zhishun Yang, Keiland Cooper, Akhilesh Jaiswal, Maryam Parsa, Norbert Fortin, Babak Shahbaba

2602.21436 2026-02-26 stat.ML cs.GT cs.LG

Efficient Uncoupled Learning Dynamics with $\tilde{O}\!\left(T^{-1/4}\right)$ Last-Iterate Convergence in Bilinear Saddle-Point Problems over Convex Sets under Bandit Feedback

Arnab Maiti, Claire Jie Zhang, Kevin Jamieson, Jamie Heather Morgenstern, Ioannis Panageas, Lillian J. Ratliff

Comments 19 pages, Accepted at AISTATS 2026

2602.21423 2026-02-26 math.ST stat.ME stat.TH

Causal Inference with High-Dimensional Treatments

Patrick Kramer, Edward H. Kennedy, Isaac M. Opper

2602.21410 2026-02-26 stat.ME

Identifying the potential of sample overlap in evidence synthesis of observational studies

Zhentian Zhang, Tim Friede, Tim Mathes

Comments 36 pages, 17 figures

2602.21408 2026-02-26 cs.LG stat.AP stat.CO stat.ME stat.ML

Generative Bayesian Computation as a Scalable Alternative to Gaussian Process Surrogates

Nick Polson, Vadim Sokolov

2602.21403 2026-02-26 stat.ME cs.CE eess.SP stat.CO

An index of effective number of variables for uncertainty and reliability analysis in model selection problems

Luca Martino, Eduardo Morgado, Roberto San Millán-Castillo

Journal ref Signal Processing, Volume 227, Pages 1-9, 2025. Num. 109735

2602.21390 2026-02-26 cs.LG stat.ML

Defensive Generation

Gabriele Farina, Juan Carlos Perdomo

2602.21383 2026-02-26 stat.ME

Evaluating time-varying treatment effects in hybrid SMART-MRT designs

Mengbing Li, Inbal Nahum-Shani, Walter Dempsey

2602.21370 2026-02-26 stat.AP

Evaluation of Minimal Residual Disease as a Surrogate for Progression-Free Survival in Hematology Oncology Trials: A Meta-Analytic Review

Jane She, Xiaofei Chen, Malini Iyengar, Judy Li

2602.21368 2026-02-26 cs.LG cs.AI cs.CL stat.ML

Black-Box Reliability Certification for AI Agents via Self-Consistency Sampling and Conformal Calibration

Charafeddine Mouzouni

Comments 41 pages, 11 figures, 10 tables, including appendices

2602.21359 2026-02-26 math.ST stat.ME stat.TH

Some Asymptotic Results on Multiple Testing under Weak Dependence

Swarnadeep Datta, Monitirtha Dey

2602.21357 2026-02-26 stat.ML cs.LG

Conditional neural control variates for variance reduction in Bayesian inverse problems

Ali Siahkoohi, Hyunwoo Oh

2602.21356 2026-02-26 stat.CO

Adaptive Importance Tempering: A flexible approach to improve computational efficiency of Metropolis Coupled Markov Chain Monte Carlo algorithms on binary spaces

Alexander Valencia-Sanchez, Jeffrey S. Rosenthal, Yasuhiro Watanabe, Hirotaka Tamura, Ali Sheikholeslami

Comments 25 pages, 8 figures

2602.21342 2026-02-26 cs.LG stat.ML

Archetypal Graph Generative Models: Explainable and Identifiable Communities via Anchor-Dominant Convex Hulls

Nikolaos Nakis, Chrysoula Kosma, Panagiotis Promponas, Michail Chatzianastasis, Giannis Nikolentzos

Comments Accepted to AISTATS26 (Spotlight)

2602.21314 2026-02-26 stat.ME

Discussion of "Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models"

Eli Ben-Michael, Avi Feller

Comments Invited discussion of Choi and Yuan "Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models" at JSM 2025

2602.21276 2026-02-26 cs.LG stat.ML

Neural network optimization strategies and the topography of the loss landscape

Jianneng Yu, Alexandre V. Morozov

Comments 12 pages in the main text + 5 pages in the supplement. 6 figures + 1 table in the main text, 4 figures and 1 table in the supplement

Abstract

Neural networks are trained by optimizing multi-dimensional sets of fitting parameters on non-convex loss landscapes. Low-loss regions of the landscapes correspond to the parameter sets that perform well on the training data. A key issue in machine learning is the performance of trained neural networks on previously unseen test data. Here, we investigate neural network training by stochastic gradient descent (SGD) - a non-convex global optimization algorithm which relies only on the gradient of the objective function. We contrast SGD solutions with those obtained via a non-stochastic quasi-Newton method, which utilizes curvature information to determine step direction and Golden Section Search to choose step size. We use several computational tools to investigate neural network parameters obtained by these two optimization methods, including kernel Principal Component Analysis and a novel, general-purpose algorithm for finding low-height paths between pairs of points on loss or energy landscapes, FourierPathFinder. We find that the choice of the optimizer profoundly affects the nature of the resulting solutions. SGD solutions tend to be separated by lower barriers than quasi-Newton solutions, even if both sets of solutions are regularized by early stopping to ensure adequate performance on test data. When allowed to fit extensively on the training data, quasi-Newton solutions occupy deeper minima on the loss landscapes that are not reached by SGD. These solutions are less generalizable to the test data, however. Overall, SGD explores smooth basins of attraction, while quasi-Newton optimization is capable of finding deeper, more isolated minima that are more spread out in the parameter space. Our findings help understand both the topography of the loss landscapes and the fundamental role of landscape exploration strategies in creating robust, transferable neural network models.
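
The abstract above mentions Golden Section Search as the step-size rule in the quasi-Newton method. The following is a minimal sketch of the textbook golden section search on a one-dimensional unimodal function, applied to a toy line search. It is not the authors' implementation: the function name `golden_section_search`, the quadratic `loss_along_direction`, and the bracket `[0, 5]` are assumptions chosen for illustration.

```python
import math

# Inverse golden ratio, about 0.618; each iteration shrinks the bracket by this factor.
INV_PHI = (math.sqrt(5) - 1) / 2


def golden_section_search(f, a, b, tol=1e-8):
    """Approximate the minimiser of a unimodal function f on [a, b]."""
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


def loss_along_direction(step):
    """Hypothetical loss restricted to a search direction, minimised at step = 2."""
    return (step - 2.0) ** 2 + 1.0


# Step size chosen by the line search within the assumed bracket [0, 5].
best_step = golden_section_search(loss_along_direction, 0.0, 5.0)
```

Only one new function evaluation is needed per iteration because one interior point is reused, which is the reason the method is attractive when each loss evaluation is expensive.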

2602.21272 2026-02-26 stat.ML cs.LG stat.CO

Counterdiabatic Hamiltonian Monte Carlo

Reuben Cohn-Gordon, Uroš Seljak, Dries Sels

2602.21269 2026-02-26 cs.LG cs.AI stat.ML

Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space

Wang Zixian

2602.19473 2026-02-26 stat.ME math.ST stat.ML stat.TH

The generalized underlap coefficient with an application in clustering

Zhaoxi Zhang, Vanda Inacio, Sara Wade

2511.01734 2026-02-26 stat.ML cs.AI cs.CL cs.LG

A Proof of Learning Rate Transfer under $μ$P

Soufiane Hayou

Comments 21 pages

2510.21686 2026-02-26 stat.ML cs.LG

Multimodal Datasets with Controllable Mutual Information

Raheem Karim Hashmani, Garrett W. Merz, Helen Qu, Mariel Pettee, Kyle Cranmer

Comments 16 pages, 7 figures, 2 tables. Our code is publicly available at https://github.com/RKHashmani/MmMi-Datasets. Datasets generated based on Figure 1 can be found at https://huggingface.co/datasets/RKHashmani/mmmi-dag1-2modalities-cifar10

2510.11789 2026-02-26 stat.ML cs.LG math.PR math.ST stat.TH

Minimax Rates for Learning Pairwise Interactions in Attention-Style Models

Shai Zucker, Xiong Wang, Fei Lu, Inbar Seroussi

2509.20831 2026-02-26 stat.ME stat.AP

Modi linear failure rate distribution with application to survival time data

Lazhar Benkhelifa

Journal ref Modern Journal of Statistics 2026

2508.04957 2026-02-26 stat.ME

Goodness-of-fit test for multi-layer stochastic block models

Huan Qing

Comments 52 pages, 5 tables, 3 figures

2507.14206 2026-02-26 eess.SP cs.AI cs.LG stat.ML

A Comprehensive Benchmark for Electrocardiogram Time-Series

Zhijiang Tang, Jiaxin Qi, Yuhua Zheng, Jianqiang Huang

Comments ACM MM 2025

Journal ref Proceedings of the 33rd ACM International Conference on Multimedia. 2025

2506.13630 2026-02-26 stat.AP cs.HC

The Hammock Plot: Where Categorical and Numerical Data Relax Together

Matthias Schonlau, Tiancheng Yang

Comments 21 pages, 10 figures, 1 table. Submitted to the Stata Journal

2504.19138 2026-02-26 math.ST cs.NA math.NA stat.CO stat.TH

Quasi-Monte Carlo confidence intervals using quantiles of randomized nets

Zexin Pan

2408.09418 2026-02-26 stat.ME

Grade of membership analysis for multi-layer ordinal categorical data

Huan Qing

Comments 46 pages, accepted by Statistica Sinica in 2025

2408.06323 2026-02-26 stat.ME

Infer-and-widen, or not?

Ronan Perry, Zichun Xu, Olivia McGough, Daniela Witten

2312.16307 2026-02-26 econ.EM cs.GT cs.LG stat.ME

Incentive-Aware Synthetic Control: Accurate Counterfactual Estimation via Incentivized Exploration

Daniel Ngo, Keegan Harris, Anish Agarwal, Vasilis Syrgkanis, Zhiwei Steven Wu

Comments Accepted to TMLR

2306.15908 2026-02-26 stat.ME

Generalized Bayesian Multidimensional Scaling and Model Comparison

Jiarui Zhang, Jiguo Cao, Liangliang Wang

2211.02003 2026-02-26 cs.CR cs.LG stat.ML

Private Blind Model Averaging - Distributed, Non-interactive, and Convergent

Moritz Kirschte, Sebastian Meiser, Saman Ardalan, Esfandiar Mohammadi

Comments This work has been accepted for publication at the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). The final version will be available on IEEE Xplore