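arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.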

2503.03206 2026-04-07 cs.LG cs.CV math.ST stat.ML stat.TH

An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models

Binxu Wang, Cengiz Pehlevan

Comments 96 pages, 29 figures. Published in Advances in Neural Information Processing Systems, NeurIPS 2025 (Spotlight)

Abstract

We develop an analytical framework for understanding how the generated distribution evolves during diffusion model training. Leveraging a Gaussian-equivalence principle, we solve the full-batch gradient-flow dynamics of linear and convolutional denoisers and integrate the resulting probability-flow ODE, yielding analytic expressions for the generated distribution. The theory exposes a universal inverse-variance spectral law: the time for an eigen- or Fourier mode to match its target variance scales as $\tau \propto \lambda^{-1}$, so high-variance (coarse) structure is mastered orders of magnitude sooner than low-variance (fine) detail. Extending the analysis to deep linear networks and circulant full-width convolutions shows that weight sharing merely multiplies learning rates -- accelerating but not eliminating the bias -- whereas local convolution introduces a qualitatively different bias. Experiments on Gaussian and natural-image datasets confirm the spectral law persists in deep MLP-based U-Nets. Convolutional U-Nets, however, display rapid near-simultaneous emergence of many modes, implicating local convolution in reshaping learning dynamics. These results underscore how data covariance governs the order and speed with which diffusion models learn, and they call for deeper investigation of the unique inductive biases introduced by local convolution.
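The inverse-variance scaling in the abstract above can be illustrated with a toy calculation. The sketch below is not the paper's derivation: it assumes a single fixed noise variance `sigma2`, a linear denoiser that is diagonal in the data eigenbasis, zero initialization, and the squared-error loss E||W(x0 + noise) - x0||^2. Under those assumptions, gradient flow decouples per eigenmode into dw/dt = -2((lam + sigma2) w - lam). The function names and the tolerance `tol` are hypothetical, chosen only for illustration.

```python
import math

def mode_weight(lam, sigma2, t):
    """Closed-form gradient-flow solution for one eigenmode, with w(0) = 0.

    Solves dw/dt = -2 * ((lam + sigma2) * w - lam), whose fixed point is
    w_star = lam / (lam + sigma2).
    """
    w_star = lam / (lam + sigma2)
    return w_star * (1.0 - math.exp(-2.0 * (lam + sigma2) * t))

def time_to_converge(lam, sigma2, tol=0.01):
    """Time at which the mode weight reaches (1 - tol) of its fixed point."""
    return math.log(1.0 / tol) / (2.0 * (lam + sigma2))

if __name__ == "__main__":
    sigma2 = 1e-4  # assumed small noise variance, so lam dominates the rate
    for lam in (1.0, 0.1, 0.01):
        print(f"lambda={lam:<5} convergence time={time_to_converge(lam, sigma2):.2f}")
```

In this toy setting the convergence time is proportional to 1/(lam + sigma2), which reduces to the 1/lam scaling when the mode variance is much larger than the noise variance. Modes with variance well below `sigma2` converge at a rate set by the noise instead, so the sketch only mirrors the stated law in the low-noise regime.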

2503.01622 2026-04-07 cs.CL

DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation

Eliya Habba, Ofir Arviv, Itay Itzhak, Yotam Perlitz, Elron Bandel, Leshem Choshen, Michal Shmueli-Scheuer, Gabriel Stanovsky

2502.15567 2026-04-07 cs.LG stat.ML

Model Privacy: A Unified Framework for Understanding Model Stealing Attacks and Defenses

Ganghua Wang, Yuhong Yang, Jie Ding

Comments Journal of the Royal Statistical Society Series B: Statistical Methodology, 2026

2502.08547 2026-04-07 cs.AI

Representation learning to advance multi-institutional studies with electronic health record data from US and France

Doudou Zhou, Han Tong, Linshanshan Wang, Suqi Liu, Xin Xiong, Ziming Gan, Romain Griffier, Boris Hejblum, Yun-Chung Liu, Chuan Hong, Clara-Lea Bonzel, Tianrun Cai, Kevin Pan, Yuk-Lam Ho, Lauren Costa, Vidul A. Panickan, J. Michael Gaziano, Kenneth Mandl, Vianney Jouhet, Rodolphe Thiebaut, Zongqi Xia, Kelly Cho, Katherine Liao, Tianxi Cai

2502.02020 2026-04-07 cs.LG stat.ME

Causal Bandit Over Unknown Graphs: Upper Confidence Bounds With Backdoor Adjustment

Yijia Zhao, Qing Zhou

2412.12870 2026-04-07 cs.LG

Physically Interpretable World Models via Weakly Supervised Representation Learning

Zhenjiang Mao, Mrinall Eashaan Umasudhan, Ivan Ruchkin

2412.02335 2026-04-07 cs.RO cs.LG cs.SY eess.SY

An Adaptive Grasping Force Tracking Strategy for Nonlinear and Time-Varying Object Behaviors

Ziyang Cheng, Xiangyu Tian, Ruomin Sui, Tiemin Li, Yao Jiang

Details

DOI: 10.1109/TASE.2025.3581688
Journal ref: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 17063-17076, 2025

Abstract

Accurate grasp force control is one of the key skills for ensuring successful and damage-free robotic grasping of objects. Although existing studies have investigated slip detection and grasping force planning in depth, they often overlook the adaptive tracking of the actual force to the target force when handling objects with different material properties. The optimal parameters of a force tracking controller are significantly influenced by the object's stiffness, and many adaptive force tracking algorithms rely on stiffness estimation. However, real-world objects often exhibit viscous, plastic, or other more complex nonlinear time-varying behaviors, and existing studies provide insufficient support for these materials in terms of stiffness definition and estimation. To address this, this paper introduces the concept of generalized stiffness, extending the definition of stiffness to nonlinear time-varying grasp system models, and proposes an online generalized stiffness estimator based on Long Short-Term Memory (LSTM) networks. Based on generalized stiffness, this paper proposes an adaptive parameter adjustment strategy, using a PI controller as an example, that enables dynamic force tracking for objects with varying characteristics. Experimental results demonstrate that the proposed method achieves high precision and short probing time, while showing better adaptability to non-ideal objects compared to existing methods. The method effectively solves the problem of grasp force tracking in unknown, nonlinear, and time-varying grasp systems, demonstrating the generalization capability of our neural network and enhancing the robotic grasping ability in unstructured environments.

2410.14826 2026-04-07 cs.CL cs.AI cs.HC cs.LG

SPRIG: Improving Large Language Model Performance by System Prompt Optimization

Lechen Zhang, Tolga Ergen, Lajanugen Logeswaran, Moontae Lee, David Jurgens

Comments Accepted at ICLR 2026

2410.02260 2026-04-07 cs.LG

FedScalar: Federated Learning with Scalar Communication for Bandwidth-Constrained Networks

M. Rostami, S. S. Kia

2408.09468 2026-04-07 cs.RO

Towards Safe and Robust Autonomous Vehicle Platooning: A Self-Organizing Cooperative Control Framework

Chengkai Xu, Zihao Deng, Jiaqi Liu, Aijing Kong, Yu Tang, Chao Huang, Peng Hang

2407.17491 2026-04-07 cs.CV cs.LG

Robust Adaptation of Foundation Models with Black-Box Visual Prompting

Changdae Oh, Gyeongdeok Seo, Geunyoung Jung, Zhi-Qi Cheng, Hosik Choi, Jiyoung Jung, Kyungwoo Song

Comments Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2026

2405.02068 2026-04-07 cs.CV

Advancing Pre-trained Teacher: Towards Robust Feature Discrepancy for Anomaly Detection

Canhui Tang, Sanping Zhou, Yizhe Li, Yonghao Dong, Le Wang

Comments Accepted by IEEE Transactions on Image Processing (TIP)

2311.12882 2026-04-07 cs.CL cs.AI

LLMs-Healthcare: Current Applications and Challenges of Large Language Models in various Medical Specialties

Ummara Mumtaz, Awais Ahmed, Summaya Mumtaz

Comments 26 pages and one figure

2311.00855 2026-04-07 cs.AI cs.LG cs.MA

A Multi-Agent Reinforcement Learning Framework for Public Health Decision Analysis

Dinesh Sharma, Ankit Shah, Chaitra Gopalappa

Comments Updated to the accepted version published in Healthcare Analytics (November 2025)

Details

DOI: 10.1016/j.health.2025.100436
Journal ref: Healthcare Analytics, 8 (2025) 100436

Abstract

Human immunodeficiency virus (HIV) is a major public health concern in the United States (U.S.), with about 1.2 million people living with it and about 35,000 newly infected each year. There are considerable geographical disparities in HIV burden and care access across the U.S. The 'Ending the HIV Epidemic (EHE)' initiative by the U.S. Department of Health and Human Services aims to reduce new infections by 90% by 2030, by improving coverage of diagnoses, treatment, and prevention interventions and prioritizing jurisdictions with high HIV prevalence. We develop intelligent decision-support systems to optimize resource allocation and intervention strategies. Existing decision analytic models either focus on individual cities or aggregate national data, failing to capture jurisdictional interactions critical for optimizing intervention strategies. To address this, we propose a multi-agent reinforcement learning (MARL) framework that enables jurisdiction-specific decision-making while accounting for cross-jurisdictional epidemiological interactions. Our framework functions as an intelligent resource optimization system, helping policymakers strategically allocate interventions based on dynamic, data-driven insights. Experimental results across jurisdictions in California and Florida demonstrate that MARL-driven policies outperform traditional single-agent reinforcement learning approaches by reducing new infections under fixed budget constraints. Our study highlights the importance of incorporating jurisdictional dependencies in decision-making frameworks for large-scale public initiatives. By integrating multi-agent intelligent systems, decision analytics, and reinforcement learning, this study advances expert systems for government resource planning and public health management, offering a scalable framework for broader applications in healthcare policy and epidemic management.

2310.03088 2026-04-07 cs.LG cs.SY eess.SY

Physics-Informed Neural Networks for Accelerating Power System State Estimation

Solon Falas, Markos Asprou, Charalambos Konstantinou, Maria K. Michael

2307.09366 2026-04-07 cs.LG stat.ME stat.ML

Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

Kayhan Behdin, Wenyu Chen, Rahul Mazumder

Comments Operations Research (to appear)

2305.14787 2026-04-07 cs.CV

Polarimetric Imaging for Perception

Michael Baltaxe, Tomer Pe'er, Dan Levi

2305.03784 2026-04-07 cs.LG

Neural Exploitation and Exploration of Contextual Bandits

Yikun Ban, Yuchen Yan, Arindam Banerjee, Jingrui He

Comments Journal of Machine Learning Research

2301.01741 2026-04-07 cs.LG

Graph State-Space Models and Latent Relational Inference

Daniele Zambon, Andrea Cini, Cesare Alippi

2212.11538 2026-04-07 cs.CV

SHLE: Devices Tracking and Depth Filtering for Stereo-based Height Limit Estimation

Zhaoxin Fan, Kaixing Yang, Min Zhang, Zhenbo Song, Hongyan Liu, Jun He

2103.14600 2026-04-07 cs.RO cs.FL cs.LG cs.LO

Model-Free Learning of Safe yet Effective Controllers

Alper Kamil Bozkurt, Yu Wang, Miroslav Pajic

2102.04307 2026-04-07 cs.AI cs.LG cs.LO cs.RO

Learning Optimal Strategies for Temporal Tasks in Stochastic Games

Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, Miroslav Pajic

Details

DOI: 10.1109/TAC.2024.3390848
Journal ref: IEEE Transactions on Automatic Control, vol. 69, no. 11, pp. 7387-7402, Nov. 2024

Abstract

Synthesis from linear temporal logic (LTL) specifications provides assured controllers for systems operating in stochastic and potentially adversarial environments. Automatic synthesis tools, however, require a model of the environment to construct controllers. In this work, we introduce a model-free reinforcement learning (RL) approach to derive controllers from given LTL specifications even when the environment is completely unknown. We model the problem as a stochastic game (SG) between the controller and the adversarial environment; we then learn optimal control strategies that maximize the probability of satisfying the LTL specifications against the worst-case environment behavior. We first construct a product game using the deterministic parity automaton (DPA) translated from the given LTL specification. By deriving distinct rewards and discount factors from the acceptance condition of the DPA, we reduce the maximization of the worst-case probability of satisfying the LTL specification to the maximization of a discounted reward objective in the product game; this enables the use of model-free RL algorithms to learn an optimal controller strategy. To deal with the common scalability problems when the number of sets defining the acceptance condition of the DPA (usually referred to as colors) is large, we propose a lazy color generation method where distinct rewards and discount factors are utilized only when needed, and an approximate method where the controller eventually focuses on only one color. In several case studies, we show that our approach is scalable to a wide range of LTL formulas, significantly outperforming existing methods for learning controllers from LTL specifications in SGs.

2011.01882 2026-04-07 cs.RO cs.GT

Secure Planning Against Stealthy Attacks via Model-Free Reinforcement Learning

Alper Kamil Bozkurt, Yu Wang, Miroslav Pajic

2010.01050 2026-04-07 cs.RO cs.LO

Model-Free Reinforcement Learning for Stochastic Games with Linear Temporal Logic Objectives

Alper Kamil Bozkurt, Yu Wang, Michael Zavlanos, Miroslav Pajic

2006.04363 2026-04-07 cs.LG cs.AI stat.ML

Mitigating Value Hallucination in Dyna Planning via Multistep Predecessor Models

Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin Talvitie, Michael Bowling, Martha White

Comments Published in Journal of Artificial Intelligence Research (JAIR) in 2024. Updated to published version, changed title to JAIR version, added a new author that led the submission

1909.07299 2026-04-07 cs.RO cs.AI cs.LG

Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning

Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, Miroslav Pajic

1504.06507 2026-04-07 cs.CV

Local Variation as a Statistical Hypothesis Test

Michael Baltaxe, Peter Meer, Michael Lindenbaum

2604.04920 2026-04-07 math.OC cs.LG

PINNs in PDE Constrained Optimal Control Problems: Direct vs Indirect Methods

Zhen Zhang, Shanqing Liu, Alessandro Alla, Jerome Darbon, George Em Karniadakis

Comments 8 pages, 3 figures

2604.04914 2026-04-07 cs.NI cs.AI cs.LG

Analyzing Symbolic Properties for DRL Agents in Systems and Networking

Mohammad Zangooei, Jannis Weil, Amr Rizk, Mina Tahmasbi Arashloo, Raouf Boutaba

Comments Accepted in ACM SIGMETRICS'26

Abstract

Deep reinforcement learning (DRL) has shown remarkable performance on complex control problems in systems and networking, including adaptive video streaming, wireless resource management, and congestion control. For safe deployment, however, it is critical to reason about how agents behave across the range of system states they encounter in practice. Existing verification-based methods in this domain primarily focus on point properties, defined around fixed input states, which offer limited coverage and require substantial manual effort to identify relevant input-output pairs for analysis. In this paper, we study symbolic properties, which specify expected behavior over ranges of input states, for DRL agents in systems and networking. We present a generic formulation for symbolic properties, with monotonicity and robustness as concrete examples, and show how they can be analyzed using existing DNN verification engines. Our approach encodes symbolic properties as comparisons between related executions of the same policy and decomposes them into practically tractable sub-properties. These techniques serve as practical enablers for applying existing verification tools to symbolic analysis. Using our framework, diffRL, we conduct an extensive empirical study across three DRL-based control systems: adaptive video streaming, wireless resource management, and congestion control. Through these case studies, we analyze symbolic properties over broad input ranges, examine how property satisfaction evolves during training, study the impact of model size on verifiability, and compare multiple verification backends. Our results show that symbolic properties provide substantially broader coverage than point properties and can uncover non-obvious, operationally meaningful counterexamples, while also revealing practical solver trade-offs and limitations.
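To make the idea of a monotonicity property as a comparison between two related executions of the same policy more concrete, here is a minimal sketch. It is not the diffRL framework and it does not use a DNN verification engine: it swaps in a simple grid-based falsification check, which can find counterexamples but cannot prove the property holds. The function name, the one-dimensional "throughput to bitrate" policies, and the grid are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

def find_monotonicity_violation(
    policy: Callable[[float], float],
    inputs: Sequence[float],
) -> Optional[Tuple[float, float]]:
    """Search sampled inputs for a pair (x_lo, x_hi) with x_lo < x_hi
    but policy(x_lo) > policy(x_hi). Returns None if no such pair is found.

    This only samples the input range; it is a falsification check,
    not a formal proof of monotonicity.
    """
    xs = sorted(inputs)
    best_x = xs[0]
    best_y = policy(best_x)  # largest output seen so far
    for x in xs[1:]:
        y = policy(x)
        if y < best_y:
            return (best_x, x)  # a larger input produced a smaller output
        if y > best_y:
            best_x, best_y = x, y
    return None

# Hypothetical toy policies mapping measured throughput to a chosen bitrate.
def monotone_policy(throughput: float) -> float:
    return min(4.0, 0.8 * throughput)

def broken_policy(throughput: float) -> float:
    # Drops to a more conservative rule above a threshold, breaking monotonicity.
    return 0.8 * throughput if throughput < 3.0 else 0.5 * throughput

if __name__ == "__main__":
    grid = [i * 0.5 for i in range(11)]  # 0.0, 0.5, ..., 5.0
    print(find_monotonicity_violation(monotone_policy, grid))
    print(find_monotonicity_violation(broken_policy, grid))
```

A sampled check like this covers only the grid points it visits, which is one reason a verification engine that reasons over whole input ranges, as the abstract describes, can give stronger guarantees.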

2604.04906 2026-04-07 econ.TH cs.AI cs.CY cs.SI

How AI Aggregation Affects Knowledge

Daron Acemoglu, Tianyi Lin, Asuman Ozdaglar, James Siderius

Comments 45 pages