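arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.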

2602.05765 2026-04-08 cs.AI

RL-VLA$^3$: A Flexible and Asynchronous Reinforcement Learning Framework for VLA Training

Haoran Sun, Yongjian Guo, Zhong Guan, Shuai Di, Xiaodong Bai, Jing Long, Tianyun Zhao, Mingxi Luo, Hongke Zhao, Likang Wu, Xiaotie Deng, Xu Chu, Xi Xiao, Sheng Wen, Yicheng Gong, Junwu Xiong

English abstract

Reinforcement learning (RL) has emerged as a critical paradigm for post-training Vision-Language-Action (VLA) models, enabling embodied agents to adapt and improve through environmental interaction. However, existing RL frameworks for VLAs inherit synchronous design principles from traditional LLM training, treating entire rollouts as indivisible units and alternating strictly between data collection and policy optimization. This fundamentally mismatches the unique characteristics of VLA training, as physical simulators introduce highly variable, resource-intensive latencies. To address this, we introduce RL-VLA$^3$, a fully asynchronous distributed RL framework that enables fine-grained asynchronous interaction between simulation, inference, and training components through dynamic batching schedulers and flexible environment sharding strategies. Extensive experiments across diverse simulation backends, VLA architectures, and RL algorithms demonstrate that RL-VLA$^3$ achieves throughput improvements of up to 85.2% over synchronous baselines while maintaining identical sample efficiency, with scalability validated from 8 to 256 GPUs. To our knowledge, RL-VLA$^3$ is the first fully asynchronous RL training framework tailored specifically for the system-level challenges of VLA training.

2602.02528 2026-04-08 cs.LG cs.AI

Incident-Guided Spatiotemporal Traffic Forecasting

Lixiang Fan, Bohao Li, Tao Zou, Junchen Ye, Bowen Du

2602.01070 2026-04-08 cs.CL

What If We Allocate Test-Time Compute Adaptively?

Ahsan Bilal, Ahmed Mohsin, Muhammad Umer, Ali Subhan, Hassan Rizwan, Ayesha Mohsin, Dean Hougen

2601.22581 2026-04-08 cs.CV cs.AI cs.LG

Cross-Domain Few-Shot Learning for Hyperspectral Image Classification Based on Mixup Foundation Model

Naeem Paeedeh, Mahardhika Pratama, Ary Shiddiqi, Zehong Cao, Mukesh Prasad, Wisnu Jatmiko

2601.18546 2026-04-08 cs.LG

Information Hidden in Gradients of Regression with Target Noise

Arash Jamshidi, Katsiaryna Haitsiukevich, Kai Puolamäki

2601.16242 2026-04-08 cs.RO

Scalable Screw-Theoretic Synthesis for PDE-Based Dynamic Modeling of Multibody Flexible Manipulators

S. Yaqubi, J. Mattila

Comments Submitted to Springer for peer review. Copyright might be transferred without notice

2601.12419 2026-04-08 cs.CL

Legal Experts Disagree With Rationale Extraction Techniques for Explaining ECtHR Case Outcome Classification

Mahammad Namazov, Tomáš Koref, Ivan Habernal

Comments 9 pages + Appendix

2601.10649 2026-04-08 cs.CV

MINERVA-Cultural: A Benchmark for Cultural and Multilingual Long Video Reasoning

Darshan Singh, Arsha Nagrani, Kawshik Manikantan, Harman Singh, Dinesh Tewari, Tobias Weyand, Cordelia Schmid, Anelia Angelova, Shachi Dave

Comments Accepted to CVPR 2026

2601.09365 2026-04-08 cs.CL cs.AI

Frame of Reference: Addressing the Challenges of Common Ground Representation in Situational Dialogs

Biswesh Mohapatra, Théo Charlot, Giovanni Duca, Mayank Palan, Laurent Romary, Justine Cassell

Comments Work accepted at ACL 2026 Findings

2601.06928 2026-04-08 cs.CV

RenderFlow: Single-Step Neural Rendering via Flow Matching

Shenghao Zhang, Runtao Liu, Christopher Schroers, Yang Zhang

Comments CVPR 2026; Supplementary material included

2601.06748 2026-04-08 cs.RO

On-the-Fly VLA Adaptation via Test-Time Reinforcement Learning

Changyu Liu, Yiyang Liu, Taowen Wang, Qiao Zhuang, James Chenhao Liang, Wenhao Yang, Renjing Xu, Qifan Wang, Dongfang Liu, Cheng Han

2601.06565 2026-04-08 cs.CL

EVM-QuestBench: An Execution-Grounded Benchmark for Natural-Language Transaction Code Generation

Pei Yang, Wanyi Chen, Ke Wang, Lynn Ai, Eric Yang, Tianyu Shi

Comments 10 pages, 13 figures

2601.05930 2026-04-08 cs.CL cs.AI cs.LG cs.MA

Can We Predict Before Executing Machine Learning Agents?

Jingsheng Zheng, Jintian Zhang, Yujie Luo, Yuren Mao, Yunjun Gao, Lun Du, Huajun Chen, Ningyu Zhang

Comments ACL 2026

2601.05905 2026-04-08 cs.CL cs.AI cs.HC cs.LG cs.MA

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Haoming Xu, Ningyuan Zhao, Yunzhi Yao, Weihong Xu, Hongru Wang, Xinle Deng, Shumin Deng, Jeff Z. Pan, Huajun Chen, Ningyu Zhang

Comments ACL 2026

2601.03741 2026-04-08 cs.CV

I2E: From Image Pixels to Actionable Interactive Environments for Text-Guided Image Editing

Jinghan Yu, Junhao Xiao, Chenyu Zhu, Jiaming Li, Jia Li, HanMing Deng, Xirui Wang, Guoli Jia, Jianjun Li, Xiang Bai, Bowen Zhou, Zhiyuan Ma

2601.02978 2026-04-08 cs.CL cs.AI

Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders

Ruikang Zhang, Shuo Wang, Qi Su

2601.01955 2026-04-08 cs.CV

MotionAdapter: Video Motion Transfer via Content-Aware Attention Customization

Zhexin Zhang, Yangyang Xu, Yifeng Zhu, Long Chen, Yong Du, Shengfeng He, Jun Yu

2512.24062 2026-04-08 cs.LG

Energy-Balanced Hyperspherical Graph Representation Learning via Structural Binding and Entropic Dispersion

Rui Chen, Junjun Guo, Hongbin Wang, Yan Xiang, Yantuan Xian, Zhengtao Yu

Comments Submitted to Knowledge-Based Systems

2512.20983 2026-04-08 cs.CL cs.AI cs.LG

Automatic Replication of LLM Mistakes in Medical Conversations

Oleksii Proniakin, Diego Fajardo, Ruslan Nazarenko, Razvan Marinescu

Comments 48 pages, 3 figures, 4 tables

2512.18836 2026-04-08 cs.RO

Multimodal Classification Network Guided Trajectory Planning for Four-Wheel Independent Steering Autonomous Parking Considering Obstacle Attributes

Jingjia Teng, Yang Li, Yougang Bian, Manjiang Hu, Yingbai Hu, Guofa Li, Jianqiang Wang

Comments The manuscript in this current form requires substantial revision. For this reason, I request the withdrawal of the submission to allow for comprehensive improvement before resubmission

2512.13592 2026-04-08 cs.LG cs.CV

Image Diffusion Preview with Consistency Solver

Fu-Yun Wang, Hao Zhou, Liangzhe Yuan, Sanghyun Woo, Boqing Gong, Bohyung Han, Ming-Hsuan Yang, Han Zhang, Yukun Zhu, Ting Liu, Long Zhao

Comments Accepted by CVPR 2026

2512.07988 2026-04-08 cs.LG cs.GR cs.HC

HOLE: Homological Observation of Latent Embeddings for Neural Network Interpretability

Sudhanva Manjunath Athreya, Paul Rosen

2512.04832 2026-04-08 cs.CV cs.GR cs.LG

Tokenizing Buildings: A Transformer for Layout Synthesis

Manuel Ladron de Guevara, Jinmo Rhee, Ardavan Bidgoli, Vaidas Razgaitis, Michael Bergin

Comments 14 pages, 3 page References, 4 figures

2512.04415 2026-04-08 cs.RO

RoboBPP: Benchmarking Robotic Online Bin Packing with Physics-based Simulation

Zhoufeng Wang, Hang Zhao, Juzhan Xu, Shishun Zhang, Ruizhen Hu, Chenyang Zhu, Zecui Zeng, Weiyan Zhu, Zeyu Xiong, Haibin Yu, Kai Xu

Comments Under review at the International Journal of Robotics Research (IJRR)

2512.04351 2026-04-08 cs.LG

Distance Is All You Need: Radial Dispersion for Uncertainty Estimation in Large Language Models

Manh Nguyen, Sunil Gupta, Hung Le

2512.04246 2026-04-08 cs.AI

Toward Virtuous Reinforcement Learning: A Critique and Roadmap

Majid Ghasemi, Mark Crowley

Comments Accepted as a workshop paper at Machine Ethics: From Formal Methods to Emergent Machine Ethics workshop at the AAAI 2026 Conference

2511.18359 2026-04-08 cs.CV

TRANSPORTER: Transferring Visual Semantics from VLM Manifolds

Alexandros Stergiou

Comments Accepted at IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026, Project page: https://alexandrosstergiou.github.io/TRANSPORTER

2511.17568 2026-04-08 cs.LG cs.AI

Enhancing Robustness of Offline Reinforcement Learning Under Data Corruption via Sharpness-Aware Minimization

Le Xu, Jiayu Chen

Comments Accepted as an Oral Presentation at the AAAI 2026 Student Abstract and Poster Program (SAPP)

2511.17146 2026-04-08 cs.CV

Learning to Look Closer: A New Instance-Wise Loss for Small Cerebral Lesion Segmentation

Luc Bouteille, Alexander Jaus, Jens Kleesiek, Rainer Stiefelhagen, Lukas Heine

Comments Accepted to IEEE ISBI 2026. 5 pages, 2 figures, 2 tables

2511.14998 2026-04-08 cs.CV

FinCriticalED: A Visual Benchmark for Financial Fact-Level OCR

Yueru He, Xueqing Peng, Yupeng Cao, Yan Wang, Lingfei Qian, Haohang Li, Yi Han, Shuyao Wang, Ruoyu Xiang, Fan Zhang, Zhuohan Xie, Mingquan Lin, Prayag Tiwari, Jimin Huang, Guojun Xiong, Sophia Ananiadou

Comments Xueqing Peng: Corresponding-Author