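arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.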

2602.14351 2026-04-07 cs.LG cs.AI

WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control

Mehran Aghabozorgi, Alireza Moazeni, Yanshu Zhang, Ke Li

Comments Accepted at ICLR 2026. Website: https://mehranagh20.github.io/wimle/ Code: https://github.com/mehranagh20/wimle

Journal ref: In Proceedings of the Fourteenth International Conference on Learning Representations (ICLR 2026), 2026

Abstract

Model-based reinforcement learning promises strong sample efficiency but often underperforms in practice due to compounding model error, unimodal world models that average over multi-modal dynamics, and overconfident predictions that bias learning. We introduce WIMLE, a model-based method that extends Implicit Maximum Likelihood Estimation (IMLE) to the model-based RL framework to learn stochastic, multi-modal world models without iterative sampling and to estimate predictive uncertainty via ensembles and latent sampling. During training, WIMLE weights each synthetic transition by its predicted confidence, preserving useful model rollouts while attenuating bias from uncertain predictions and enabling stable learning. Across $40$ continuous-control tasks spanning DeepMind Control, MyoSuite, and HumanoidBench, WIMLE achieves superior sample efficiency and competitive or better asymptotic performance than strong model-free and model-based baselines. Notably, on the challenging Humanoid-run task, WIMLE improves sample efficiency by over $50$\% relative to the strongest competitor, and on HumanoidBench it solves $8$ of $14$ tasks (versus $4$ for BRO and $5$ for SimbaV2). These results highlight the value of IMLE-based multi-modality and uncertainty-aware weighting for stable model-based RL.
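The confidence weighting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the inverse-variance form of the weight, the use of ensemble disagreement alone as the uncertainty signal, and the names `confidence_weights`, `weighted_loss`, and `temperature` are all assumptions made for illustration. The paper's actual weighting rule should be taken from the linked code.

```python
from statistics import pvariance


def confidence_weights(ensemble_predictions, temperature=1.0):
    """Map ensemble disagreement to per-transition weights (hypothetical form).

    ensemble_predictions[i] holds the scalar predictions that each ensemble
    member made for synthetic transition i. Higher variance across members
    is treated as higher uncertainty and yields a smaller weight.
    """
    # Assumed weighting rule: 1 / (1 + variance / temperature).
    raw = [1.0 / (1.0 + pvariance(preds) / temperature)
           for preds in ensemble_predictions]
    # Normalise so the weights average to 1 and the loss scale is preserved.
    mean_weight = sum(raw) / len(raw)
    return [w / mean_weight for w in raw]


def weighted_loss(per_sample_losses, weights):
    """Average the per-transition losses after applying the weights."""
    total = sum(w * loss for w, loss in zip(weights, per_sample_losses))
    return total / len(weights)
```

Under these assumptions, a transition on which all ensemble members agree keeps a relatively large weight, while one with high disagreement contributes less to the update instead of being discarded outright.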

2602.09532 2026-04-07 cs.CV

RAD: Retrieval-Augmented Monocular Metric Depth Estimation for Underrepresented Classes

Michael Baltaxe, Dan Levi, Sagie Benaim

2602.08392 2026-04-07 cs.RO cs.AI cs.CV

ST-BiBench: Benchmarking Multi-Stream Multimodal Coordination in Bimanual Embodied Tasks for MLLMs

Xin Wu, Zhixuan Liang, Yue Ma, Mengkang Hu, Zhiyuan Qin, Xiu Li

Comments 42 pages, 9 figures. Project page: https://stbibench.github.io/

2602.04448 2026-04-07 cs.LG cs.AI cs.CR

RASA: Routing-Aware Safety Alignment for Mixture-of-Experts Models

Jiacheng Liang, Yuhui Wang, Tanqiu Jiang, Ting Wang

Comments 9 pages

2602.01586 2026-04-07 cs.CV

HandMCM: Multi-modal Point Cloud-based Correspondence State Space Model for 3D Hand Pose Estimation

Wencan Cheng, Gim Hee Lee

Comments AAAI accepted

2602.00681 2026-04-07 cs.SD cs.IR cs.LG

Audio-to-Image Bird Species Retrieval without Audio-Image Pairs via Text Distillation

Ilyass Moummad, Marius Miron, Lukas Rauch, David Robinson, Alexis Joly, Olivier Pietquin, Emmanuel Chemla, Matthieu Geist

2601.21439 2026-04-07 cs.AI

The Paradox of Robustness: Decoupling Rule-Based Logic from Affective Noise in High-Stakes Decision-Making

Jon Chun, Katherine Elkins

Comments 47 pages, 14 figures, 23 tables. Substantially revised from v1: added immigration domain extension (14,183 cells), adversarial narrative pilot (2,054 cells), reasoning-trace analysis, scaffolding decomposition. Total: 84,245 valid responses across 13 experiments. Under review at TMLR. Code and data will be released upon publication

2601.19376 2026-04-07 cs.RO cs.AI cs.CY cs.HC cs.LG

Teaching Machine Learning Fundamentals with LEGO Robotics

Viacheslav Sydora, Guner Dilsad Er, Michael Muehlebach

Comments 10 pages, 8 figures

2601.17196 2026-04-07 cs.LG

Accelerated Sinkhorn Algorithms for Partial Optimal Transport

Nghia Thu Truong, Qui Phu Pham, Quang Nguyen, Dung Luong, Mai Tran

2601.11609 2026-04-07 cs.LG

Auxiliary-predicted Compress Memory Model(ApCM Model): A Neural Memory Storage Model Based on Invertible Compression and Learnable Prediction

Weinuo Ou

Comments 9 pages, 7 figures

2601.10940 2026-04-07 cs.LG

HOSL: Hybrid-Order Split Learning for Memory-Constrained Edge Training

Aakriti Lnu, Zhe Li, Dandan Liang, Chao Huang, Rui Li, Haibo Yang

Comments 14 pages, 2 figures, 9 tables. Accepted at WiOpt 2026

Abstract

Split learning (SL) enables collaborative training of large language models (LLMs) between resource-constrained edge devices and compute-rich servers by partitioning model computation across the network boundary. However, existing SL systems predominantly rely on first-order (FO) optimization, which requires clients to store intermediate quantities such as activations for backpropagation. This results in substantial memory overhead, largely negating benefits of model partitioning. In contrast, zeroth-order (ZO) optimization eliminates backpropagation and significantly reduces memory usage, but often suffers from slow convergence and degraded performance. In this work, we propose HOSL, a novel Hybrid-Order Split Learning framework that addresses this fundamental trade-off between memory efficiency and optimization effectiveness by strategically integrating ZO optimization on the client side with FO optimization on the server side. By employing memory-efficient ZO gradient estimation at the client, HOSL eliminates backpropagation and activation storage, reducing client memory consumption. Meanwhile, server-side FO optimization ensures fast convergence and competitive performance. Theoretically, we show that HOSL achieves an $\mathcal{O}(\sqrt{d_c/TQ})$ rate, which depends on client-side model dimension $d_c$ rather than the full model dimension $d$, demonstrating that convergence improves as more computation is offloaded to the server. Extensive experiments on OPT models (125M and 1.3B parameters) across 6 tasks demonstrate that HOSL reduces client GPU memory by up to 3.7$\times$ compared to the FO method while achieving accuracy within 0.20%-4.23% of this baseline. Furthermore, HOSL outperforms the ZO baseline by up to 15.55%, validating the effectiveness of our hybrid strategy for memory-efficient training on edge devices.
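The client-side idea of estimating gradients without backpropagation can be sketched with a generic two-point zeroth-order estimator. This is a standard textbook construction, not HOSL's code: the function name `zo_gradient`, the parameters `eps`, `num_queries`, and `seed`, and the toy `quadratic` loss are hypothetical and chosen only for illustration.

```python
import random


def zo_gradient(loss, params, eps=1e-3, num_queries=8, seed=0):
    """Two-point zeroth-order gradient estimate.

    Only forward evaluations of `loss` are used, so no activations need
    to be stored for a backward pass. The estimate is averaged over
    `num_queries` random Gaussian directions.
    """
    rng = random.Random(seed)
    dim = len(params)
    grad = [0.0] * dim
    for _ in range(num_queries):
        # Sample a random perturbation direction z.
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = [p + eps * zi for p, zi in zip(params, z)]
        minus = [p - eps * zi for p, zi in zip(params, z)]
        # Finite-difference estimate of the directional derivative along z.
        scale = (loss(plus) - loss(minus)) / (2.0 * eps)
        for i in range(dim):
            grad[i] += scale * z[i] / num_queries
    return grad


def quadratic(x):
    """Toy loss whose true gradient is 2 * x."""
    return sum(v * v for v in x)
```

For example, `zo_gradient(quadratic, [1.0, -2.0, 0.5])` returns a noisy estimate of the true gradient `[2.0, -4.0, 1.0]`. The noise in such estimators grows with the number of perturbed parameters, which is consistent with the abstract's point that the rate depends on the client-side dimension $d_c$ rather than the full dimension $d$.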

2601.07060 2026-04-07 cs.RO

PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation

Yuanzhe Liu, Jingyuan Zhu, Yuchen Mo, Gen Li, Xu Cao, Jin Jin, Yifan Shen, Zhengyuan Li, Tianjiao Yu, Wenzhen Yuan, Fangqiang Ding, Ismini Lourentzou

Comments CVPR 2026

2601.06597 2026-04-07 cs.LG stat.ML

Understanding and inverse design of implicit bias in stochastic learning: a geometric perspective

Nicola Aladrah, Emanuele Ballarin, Matteo Biagetti, Alessio Ansuini, Alberto d'Onofrio, Fabio Anselmi

Comments v2

2601.06338 2026-04-07 cs.AI cs.CV cs.LG

Circuit Mechanisms for Spatial Relation Generation in Diffusion Transformers

Binxu Wang, Jingxuan Fan, Xu Pan

Comments 45 pages, 30 figures, accepted in CVPR 2026

2601.05249 2026-04-07 cs.CV

RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes

Yuan-Kang Lee, Kuan-Lin Chen, Chia-Che Chang, Yu-Lun Liu

Comments Project page: https://ntuneillee.github.io/research/rl-awb/

2601.03484 2026-04-07 cs.LG

From Bits to Chips: An LLM-based Hardware-Aware Quantization Agent for Streamlined Deployment of LLMs

Kaiyuan Deng, Hangyu Zheng, Minghai Qing, Kunxiong Zhu, Gen Li, Yang Xiao, Lan Emily Zhang, Linke Guo, Bo Hui, Yanzhi Wang, Geng Yuan, Gagan Agrawal, Wei Niu, Xiaolong Ma

2601.02081 2026-04-07 cs.LG

ASSS: A Differentiable Adversarial Framework for Task-Aware Data Reduction

Jiacheng Lyu, Bihua Bao, Shiyun Yan

Comments 6 pages

2601.01891 2026-04-07 cs.CV

Agentic AI in Remote Sensing: Foundations, Taxonomy, and Emerging Systems

Niloufar Alipour Talemi, Julia Boone, Fatemeh Afghah

Comments Accepted to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2026, GeoCV Workshop

2512.25073 2026-04-07 cs.CV

GaMO: Geometry-aware Multi-view Diffusion Outpainting for Sparse-View 3D Reconstruction

Yi-Chuan Huang, Hao-Jen Chien, Chin-Yang Lin, Ying-Huan Chen, Yu-Lun Liu

Comments Project page: https://yichuanh.github.io/GaMO/

2512.23850 2026-04-07 cs.AI cs.CL cs.LG

The Drill-Down and Fabricate Test (DDFT): A Protocol for Measuring Epistemic Robustness in Language Models

Rahul Baxi

Comments This version strengthens the theoretical and empirical grounding of the CI metric, including explicit analysis of structural dependencies and ranking stability under ablations (e.g., excluding Turn 4). Claims regarding scale and robustness are revised to avoid overgeneralization. The evaluation protocol, jury methodology, and limitations are expanded to clarify assumptions and boundary conditions

2512.23709 2026-04-07 cs.CV

Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

Hau-Shiang Shiu, Chin-Yang Lin, Zhixiang Wang, Chi-Wei Hsiao, Po-Fan Yu, Yu-Chih Chen, Yu-Lun Liu

Comments Project page: https://jamichss.github.io/stream-diffvsr-project-page/

2512.16294 2026-04-07 cs.CV

MARC: Multi-Label Adaptive Retrieval Contrastive Loss for Remote Sensing Images

Amna Amir, Erchan Aptoula

2512.14190 2026-04-07 cs.LG math.PR

Random-Bridges as Stochastic Transports for Generative Models

Stefano Goria, Levent A. Mengütürk, Murat C. Mengütürk, Berkan Sesen

2512.10882 2026-04-07 cs.CL

Computational emotion analysis with multimodal LLMs: Current evidence on an emerging methodological opportunity

Hauke Licht

2512.10821 2026-04-07 cs.AI cs.CV cs.HC cs.LG

Agile Deliberation: Concept Deliberation for Subjective Visual Classification

Leijie Wang, Otilia Stretcu, Wei Qiao, Thomas Denby, Krishnamurthy Viswanathan, Enming Luo, Chun-Ta Lu, Tushar Dogra, Ranjay Krishna, Ariel Fuxman

2512.09378 2026-04-07 cs.LG

Personalized Federated Distillation Assisted Vehicle Edge Caching Strategy

Xun Li, Qiong Wu, Pingyi Fan, Kezhi Wang, Wen Chen, Cui Zhang

Comments This paper has been accepted by IEEE International Conference on Radio Frequency and Antenna Technologies. The source code has been released at: https://github.com/qiongwu86/Federated-Distillation-Assisted-Vehicle-Edge-Caching-Scheme-Based-on-Lightweight-DDPM

2512.08477 2026-04-07 cs.CV cs.AI

ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Aligned Attention

Huiguo He, Pengyu Yan, Ziqi Yi, Weizhi Zhong, Zheng Liu, Yejun Tang, Huan Yang, Guanbin Li, Lianwen Jin

2512.07469 2026-04-07 cs.CV

VideoCoF: Unified Video Editing with Temporal Reasoner

Xiangpeng Yang, Ji Xie, Yiyuan Yang, Yue Ma, Yan Huang, Min Xu, Qiang Wu

Comments Accepted by CVPR 2026, Project Page: https://videocof.github.io/

2512.06356 2026-04-07 cs.LG cs.SI

Mitigating Structural Overfitting: A Distribution-Aware Rectification Framework for Missing Feature Imputation

Yifan Song, Fenglin Yu, Yihong Luo, Xingjian Tao, Siya Qiu, Kai Han, Jing Tang

Comments Accepted by SIGIR 2026

2512.03666 2026-04-07 cs.CV cs.AI

ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos

Qi'ao Xu, Tianwen Qian, Yuqian Fu, Kailing Li, Yang Jiao, Jiacheng Zhang, Xiaoling Wang, Liang He

Comments 26 pages