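arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems science, and more.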

2603.27281 2026-03-31 cs.RO

HiFlow: Tokenization-Free Scale-Wise Autoregressive Policy Learning via Flow Matching

Daichi Yashima, Koki Seno, Shuhei Kurita, Yusuke Oda, Komei Sugiura

English abstract

Coarse-to-fine autoregressive modeling has recently shown strong promise for visuomotor policy learning, combining the inference efficiency of autoregressive methods with the global trajectory coherence of diffusion-based policies. However, existing approaches rely on discrete action tokenizers that map continuous action sequences to codebook indices, a design inherited from image generation where learned compression is necessary for high-dimensional pixel data. We observe that robot actions are inherently low-dimensional continuous vectors, for which such tokenization introduces unnecessary quantization error and a multi-stage training pipeline. In this work, we propose Hierarchical Flow Policy (HiFlow), a tokenization-free coarse-to-fine autoregressive policy that operates directly on raw continuous actions. HiFlow constructs multi-scale continuous action targets from each action chunk via simple temporal pooling. Specifically, it averages contiguous action windows to produce coarse summaries that are refined at finer temporal resolutions. The entire model is trained end-to-end in a single stage, eliminating the need for a separate tokenizer. Experiments on MimicGen, RoboTwin 2.0, and real-world environments demonstrate that HiFlow consistently outperforms existing methods including diffusion-based and tokenization-based autoregressive policies.
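
The temporal pooling step described in the abstract can be illustrated with a minimal sketch. It only shows how averaging contiguous action windows yields coarse-to-fine targets from one action chunk; the function names, the chunk length of 8, and the scale schedule (1, 2, 4, 8 windows) are assumptions made for illustration and are not taken from the paper. The flow-matching model that would predict these targets is not shown.

```python
from typing import List, Sequence

Action = List[float]  # one low-dimensional continuous action vector


def temporal_pool(chunk: List[Action], num_windows: int) -> List[Action]:
    """Average contiguous, equal-length windows of an action chunk.

    Hypothetical helper: returns `num_windows` averaged action vectors.
    """
    horizon = len(chunk)
    if num_windows <= 0 or horizon % num_windows != 0:
        raise ValueError("num_windows must evenly divide the chunk length")
    window = horizon // num_windows
    dim = len(chunk[0])
    pooled = []
    for start in range(0, horizon, window):
        block = chunk[start:start + window]
        # Per-dimension mean over the window.
        pooled.append([sum(a[d] for a in block) / window for d in range(dim)])
    return pooled


def multi_scale_targets(
    chunk: List[Action], scales: Sequence[int] = (1, 2, 4, 8)
) -> List[List[Action]]:
    """Build targets from coarsest (1 window) to finest (assumed schedule)."""
    return [temporal_pool(chunk, s) for s in scales]


# Toy chunk of 8 two-dimensional actions.
chunk = [[float(t), 2.0 * t] for t in range(8)]
targets = multi_scale_targets(chunk)
for level in targets:
    print(len(level), level[0])
```

In this toy setup the coarsest level is a single mean action for the whole chunk, and the finest level reproduces the raw actions, so no quantization into codebook indices is involved at any scale.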

2603.27273 2026-03-31 cs.RO cs.AI cs.SY eess.SY

Robust Global-Local Behavior Arbitration via Continuous Command Fusion Under LiDAR Errors

Mohamed Elgouhary, Amr S. El-Wakeel

2603.27270 2026-03-31 cs.AI stat.ML

Quantification of Credal Uncertainty: A Distance-Based Approach

Xabier Gonzalez-Garcia, Siu Lun Chau, Julian Rodemann, Michele Caprio, Krikamol Muandet, Humberto Bustince, Sébastien Destercke, Eyke Hüllermeier, Yusuf Sale

2603.27268 2026-03-31 cs.CV

TrackMAE: Video Representation Learning via Track Mask and Predict

Renaud Vandeghen, Fida Mohammad Thoker, Marc Van Droogenbroeck, Bernard Ghanem

Comments Accepted to CVPR 2026

2603.27264 2026-03-31 cs.CV

TrendGen: An Outfit Recommendation and Display System

Theodoros Koukopoulos, Dimos Klimenof, Ioannis Xarchakos

2603.27261 2026-03-31 cs.CV eess.IV

MD-RWKV-UNet: Scale-Aware Anatomical Encoding with Cross-Stage Fusion for Multi-Organ Segmentation

Zhuoyi Fang

2603.27251 2026-03-31 cs.CV cs.AI

Zero-shot Vision-Language Reranking for Cross-View Geolocalization

Yunus Talha Erzurumlu, John E. Anderson, William J. Shuart, Charles Toth, Alper Yilmaz

Comments 7 pages, 4 figures. Accepted to XXV ISPRS Congress

2603.27250 2026-03-31 cs.CV

IP-SAM: Prompt-Space Conditioning for Prompt-Absent Camouflaged Object Detection

Huiyao Zhang, Jin Bai, Rui Guo, JianWen Tan, HongFei Wang, Ye Li

2603.27247 2026-03-31 cs.CL cs.SE

SCOPE: Tree-based Self-Correcting Online Log Parsing via Syntactic-Semantic Collaboration

Dongyi Fan, Suqiong Zhang, Lili He, Ming Liu, Yifan Huo

Comments Accepted at the 34th International Conference on Program Comprehension (ICPC 2026)

2603.27245 2026-03-31 cs.RO

Design of an In-Pipe Robot with Contact-Angle-Guided Kinematic Decoupling for Crosstalk-Suppressed Locomotion

Min Yang, Yang Tian, Longchuang Li, Jun Ma, Shugen Ma

2603.27241 2026-03-31 cs.CV

SaSaSaSa2VA: 2nd Place of the 5th PVUW MeViS-Text Track

Dengxian Gong, Quanzhu Niu, Shihao Chen, Yuanzheng Wu, Yikang Zhou, Tao Zhang, Haobo Yuan, Lu Qi, Shunping Ji

2603.27240 2026-03-31 cs.CV cs.AI

Diagnosing and Repairing Unsafe Channels in Vision-Language Models via Causal Discovery and Dual-Modal Safety Subspace Projection

Jinhu Fu, Yihang Lou, Qingyi Si, Shudong Zhang, Yan Bai, Sen Su

Comments Accepted by CVPR 2026 main conference

2603.27238 2026-03-31 cs.CV

An Instance-Centric Panoptic Occupancy Prediction Benchmark for Autonomous Driving

Yi Feng, Junwu E, Zizhan Guo, Yu Ma, Hanli Wang, Rui Fan

Comments Accepted to CVPR 2026. Code and dataset are available at https://mias.group/CarlaOcc

2603.27237 2026-03-31 cs.SD cs.AI cs.LG eess.AS

Can pre-trained Deep Learning models predict groove ratings?

Axel Marmoret, Nicolas Farrugia, Jan Alexander Stupacher

Comments Submitted to the SMC 2026 conference. 3 figures and 2 tables

2603.27233 2026-03-31 cs.CL cs.SI

Structural Stress and Learned Helplessness in Afghanistan: A Multi-Layer Analysis of the AFSTRESS Dari Corpus

Jawid Ahmad Baktash, Mursal Dawodi, Nadira Ahmadi

Comments 16 pages, 7 figures, 3 tables. Introduces AFSTRESS, the first multi-label Dari corpus of self-reported stress narratives (737 responses). Includes computational benchmarks, social science analysis of structural stress, and psychological modeling (learned helplessness, chronic stress, emotional cascade)

2603.27228 2026-03-31 cs.CV

NimbusGS: Unified 3D Scene Reconstruction under Hybrid Weather

Yanying Li, Jinyang Li, Shengfeng He, Yangyang Xu, Junyu Dong, Yong Du

Comments Accepted by CVPR2026

2603.27226 2026-03-31 cs.CL

Rethinking Easy-to-Hard: Limits of Curriculum Learning in Post-Training for Deductive Reasoning

Maximilian Mordig, Andreas Opedal, Weiyang Liu, Bernhard Schölkopf

2603.27218 2026-03-31 cs.SD cs.AI cs.LG

Unsupervised Evaluation of Deep Audio Embeddings for Music Structure Analysis

Axel Marmoret

Comments Submitted to the SMC 2026 conference. 2 figures and 2 tables in the main document, 7 figures in Appendix

2603.27209 2026-03-31 cs.CV cs.CL cs.GR cs.LG

LightMover: Generative Light Movement with Color and Intensity Controls

Gengze Zhou, Tianyu Wang, Soo Ye Kim, Zhixin Shu, Xin Yu, Yannick Hold-Geoffroy, Sumit Chaturvedi, Qi Wu, Zhe Lin, Scott Cohen

Comments CVPR 2026. 10 pages, 5 figures, 6 tables in main paper; supplementary material included

2603.27207 2026-03-31 cs.RO

Autonomous overtaking trajectory optimization using reinforcement learning and opponent pose estimation

Matej Rene Cihlar, Luka Šiktar, Branimir Ćaran, Marko Švaco

Comments Accepted and presented at the 35th International Conference on Robotics in Alpe-Adria-Danube Region, RAAD 2026, Bratislava, Slovakia

2603.27206 2026-03-31 cs.CV

Make It Up: Fake Images, Real Gains in Generalized Few-shot Semantic Segmentation

Guohuan Xie, Xin He, Dingying Fan, Le Zhang, Ming-Ming Cheng, Yun Liu

2603.27205 2026-03-31 cs.SD

Two-Stage Acoustic Adaptation with Gated Cross-Attention Adapters for LLM-Based Multi-Talker Speech Recognition

Hao Shi, Yuan Gao, Xugang Lu, Tatsuya Kawahara

2603.27201 2026-03-31 cs.CV

Understanding and Mitigating Hallucinations in Multimodal Chain-of-Thought Models

Ji Ma, Wei Suo, Peng Wang, Yanning Zhang

Comments CVPR 2026

2603.27199 2026-03-31 cs.CV

Let Triggers Control: Frequency-Aware Dropout for Effective Token Control

Junyoung Koh, Hoyeon Moon, Dongha Kim, Seungmin Lee, Sanghyun Park, Min Song

Comments CVPR 2026 P13N: Personalization in Generative AI workshop

2603.27197 2026-03-31 cs.CV

K$α$LOS finds Consensus: A Meta-Algorithm for Evaluating Inter-Annotator Agreement in Complex Vision Tasks

David Tschirschwitz, Volker Rodehorst

Comments Accepted at CVPR 2026. Also known as KALOS

2603.27194 2026-03-31 cs.RO cs.AI

Multi-AUV Ad-hoc Networks-Based Multi-Target Tracking Based on Scene-Adaptive Embodied Intelligence

Kai Tian, Jialun Wang, Chuan Lin, Guangjie Han, Shengchao Zhu, Ying Liu, Qian Zhu

2603.27187 2026-03-31 cs.LG

Omni-Modal Dissonance Benchmark: Systematically Breaking Modality Consensus to Probe Robustness and Calibrated Abstention

Zabir Al Nazi, Shubhashis Roy Dipta, Md Rizwan Parvez

2603.27186 2026-03-31 cs.LG

Hybrid Deep Learning with Temporal Data Augmentation for Accurate Remaining Useful Life Prediction of Lithium-Ion Batteries

Yun Tian, Guili Wang, Jian Bi, Kaixin Han, Chenglu Wu, Zhiyi Lu, Chenhao Li, Liangwang Sun, Minyu Zhou, Chenchen Xu

2603.27185 2026-03-31 cs.CV

MotionRFT: Unified Reinforcement Fine-Tuning for Text-to-Motion Generation

Xiaofeng Tan, Wanjiang Weng, Hongsong Wang, Fang Zhao, Xin Geng, Liang Wang

2603.27184 2026-03-31 cs.CV

Incentivizing Temporal-Awareness in Egocentric Video Understanding Models

Zhiyang Xu, Tian Qin, Bowen Jin, Zhengfeng Lai, Meng Cao, Lifu Huang, Peng Zhang

Comments 11 pages, 4 figures