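arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.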

2603.08572 2026-03-10 cs.RO cs.AI

MetaWorld-X: Hierarchical World Modeling via VLM-Orchestrated Experts for Humanoid Loco-Manipulation

Yutong Shen, Hangxu Liu, Penghui Liu, Jiashuo Luo, Yongkang Zhang, Rex Morvley, Chen Jiang, Jianwei Zhang, Lei Zhang

Comments 8 figures, https://syt2004.github.io/metaworldX/

English abstract

Learning natural, stable, and compositionally generalizable whole-body control policies for humanoid robots performing simultaneous locomotion and manipulation (loco-manipulation) remains a fundamental challenge in robotics. Existing reinforcement learning approaches typically rely on a single monolithic policy to acquire multiple skills, which often leads to cross-skill gradient interference and motion pattern conflicts in high-degree-of-freedom systems. As a result, generated behaviors frequently exhibit unnatural movements, limited stability, and poor generalization to complex task compositions. To address these limitations, we propose MetaWorld-X, a hierarchical world model framework for humanoid control. Guided by a divide-and-conquer principle, our method decomposes complex control problems into a set of specialized expert policies (Specialized Expert Policies, SEP). Each expert is trained under human motion priors through imitation-constrained reinforcement learning, introducing biomechanically consistent inductive biases that ensure natural and physically plausible motion generation. Building upon this foundation, we further develop an Intelligent Routing Mechanism (IRM) supervised by a Vision-Language Model (VLM), enabling semantic-driven expert composition. The VLM-guided router dynamically integrates expert policies according to high-level task semantics, facilitating compositional generalization and adaptive execution in multi-stage loco-manipulation tasks.

2603.08564 2026-03-10 cs.CV

BioGait-VLM: A Tri-Modal Vision-Language-Biomechanics Framework for Interpretable Clinical Gait Assessment

Erdong Chen, Yuyang Ji, Jacob K. Greenberg, Benjamin Steel, Faraz Arkam, Abigail Lewis, Pranay Singh, Feng Liu

2603.08560 2026-03-10 cs.RO

CONTACT: CONtact-aware TACTile Learning for Robotic Disassembly

Yosuke Saka, Jyun-Chi Hu, Adeesh Desai, Zhiyuan Zhang, Bihao Zhang, Quan Khanh Luu, Md Rakibul Islam Prince, Minghui Zheng, Yu She

Comments Submitted to IROS 2026, 8 pages, 6 figures

2603.08551 2026-03-10 cs.CV cs.IR

mmGAT: Pose Estimation by Graph Attention with Mutual Features from mmWave Radar Point Cloud

Abdullah Al Masud, Shi Xintong, Mondher Bouazizi, Ohtsuki Tomoaki

Comments ©2026 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

2603.08546 2026-03-10 cs.RO cs.CV cs.LG

Interactive World Simulator for Robot Policy Training and Evaluation

Yixuan Wang, Rhythm Syed, Fangyu Wu, Mengchao Zhang, Aykut Onol, Jose Barreiros, Hooshang Nayyeri, Tony Dear, Huan Zhang, Yunzhu Li

Comments Project Page: https://yixuanwang.me/interactive_world_sim

2603.08544 2026-03-10 cs.RO cs.LG

The Neural Compass: Probabilistic Relative Feature Fields for Robotic Search

Gabriele Somaschini, Adrian Röfer, Abhinav Valada

Comments 9 pages, 7 figures, 2 tables, submitted to IROS 2026

2603.08540 2026-03-10 cs.CV cs.IR

PCFEx: Point Cloud Feature Extraction for Graph Neural Networks

Abdullah Al Masud, Shi Xintong, Mondher Bouazizi, Ohtsuki Tomoaki

Comments ©2026 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

2603.08531 2026-03-10 cs.RO

CRED: Counterfactual Reasoning and Environment Design for Active Preference Learning

Yi-Shiuan Tung, Gyanig Kumar, Wei Jiang, Bradley Hayes, Alessandro Roncone

Comments IEEE International Conference on Robotics and Automation (ICRA) 2026

2603.08526 2026-03-10 cs.LG cs.AI

Towards Effective and Efficient Graph Alignment without Supervision

Songyang Chen, Youfang Lin, Yu Liu, Shuai Zheng, Lei Zou

Comments World Wide Web Journal

2603.08523 2026-03-10 cs.CV

BuildMamba: A Visual State-Space Based Model for Multi-Task Building Segmentation and Height Estimation from Satellite Images

Sinan U. Ulu, A. Enes Doruk, I. Can Yagmur, Bahadir K. Gunturk, Oguz Hanoglu, Hasan F. Ates

2603.08521 2026-03-10 cs.CV cs.RO eess.IV

OccTrack360: 4D Panoptic Occupancy Tracking from Surround-View Fisheye Cameras

Yongzhi Lin, Kai Luo, Yuanfan Zheng, Hao Shi, Mengfei Duan, Yang Liu, Kailun Yang

Comments The benchmark and source code will be made publicly available at https://github.com/YouthZest-Lin/OccTrack360

2603.08519 2026-03-10 cs.RO

AtomVLA: Scalable Post-Training for Robotic Manipulation via Predictive Latent World Models

Xiaoquan Sun, Zetian Xu, Chen Cao, Zonghe Liu, Yihan Sun, Jingrui Pang, Ruijian Zhang, Zhen Yang, Kang Pang, Dingxin He, Mingqi Yuan, Jiayu Chen

2603.08518 2026-03-10 cs.LG stat.ML

Breaking the Bias Barrier in Concave Multi-Objective Reinforcement Learning

Swetha Ganesh, Vaneet Aggarwal

2603.08514 2026-03-10 cs.CV cs.AI

Beyond Hungarian: Match-Free Supervision for End-to-End Object Detection

Shoumeng Qiu, Xinrun Li, Yang Long

2603.08512 2026-03-10 cs.RO

Rethinking the semantic classification of indoor places by mobile robots

Oscar Martinez Mozos, Alejandra C. Hernandez, Clara Gomez, Ramon Barber

Comments Presented at the Workshop on Semantic Scene Understanding for Human Robot Interaction, in the ACM/IEEE International Conference on Human-Robot Interaction (HRI), Stockholm, Sweden, 2023

2603.08506 2026-03-10 cs.LG cs.AI

Oracle-Guided Soft Shielding for Safe Move Prediction in Chess

Prajit T Rajendran, Fabio Arnez, Huascar Espinoza, Agnes Delaborde, Chokri Mraidha

Comments Accepted for publication at the 24th International Conference on Machine Learning and Applications (ICMLA), 2025. Preprint version on arXiv

2603.08503 2026-03-10 cs.CV cs.GR cs.RO eess.IV

Spherical-GOF: Geometry-Aware Panoramic Gaussian Opacity Fields for 3D Scene Reconstruction

Zhe Yang, Guoqiang Zhao, Sheng Wu, Kai Luo, Kailun Yang

Comments The source code and dataset will be released at https://github.com/1170632760/Spherical-GOF

2603.08499 2026-03-10 cs.CV

Improving Continual Learning for Gaussian Splatting based Environments Reconstruction on Commercial Off-the-Shelf Edge Devices

Ivan Zaino, Matteo Risso, Daniele Jahier Pagliari, Miguel de Prado, Toon Van de Maele, Alessio Burrello

2603.08498 2026-03-10 cs.CV

All Vehicles Can Lie: Efficient Adversarial Defense in Fully Untrusted-Vehicle Collaborative Perception via Pseudo-Random Bayesian Inference

Yi Yu, Libing Wu, Zhuangzhuang Zhang, Jing Qiu, Lijuan Huo, Jiaqi Feng

Comments Accepted by CVPR 2026

2603.08497 2026-03-10 cs.CV

Reading $\neq$ Seeing: Diagnosing and Closing the Typography Gap in Vision-Language Models

Heng Zhou, Ao Yu, Li Kang, Yuchen Fan, Yutao Fan, Xiufeng Song, Hejia Geng, Yiran Qin

2603.08495 2026-03-10 cs.LG stat.ML

Efficient Credal Prediction through Decalibration

Paul Hofman, Timo Löhr, Maximilian Muschalik, Yusuf Sale, Eyke Hüllermeier

2603.08490 2026-03-10 cs.RO

An Open-Source Robotics Research Platform for Autonomous Laparoscopic Surgery

Ariel Rodriguez, Lorenzo Mazza, Martin Lelis, Rayan Younis, Sebastian Bodenstedt, Martin Wagner, Stefanie Speidel

Comments Submitted to IROS 2026

2603.08488 2026-03-10 cs.LG math.DS

NN-OpInf: an operator inference approach using structure-preserving composable neural networks

Eric Parish, Anthony Gruber, Patrick Blonigan, Irina Tezaur

2603.08476 2026-03-10 cs.RO

LAR-MoE: Latent-Aligned Routing for Mixture of Experts in Robotic Imitation Learning

Ariel Rodriguez, Chenpan Li, Lorenzo Mazza, Rayan Younis, Ortrun Hellig, Sebastian Bodenstedt, Martin Wagner, Stefanie Speidel

Comments Submitted to IROS 2026

2603.08475 2026-03-10 cs.RO cs.AI

R2F: Repurposing Ray Frontiers for LLM-free Object Navigation

Francesco Argenziano, John Mark Alexis Marcelo, Michele Brienza, Abdel Hakim Drid, Emanuele Musumeci, Daniele Nardi, Domenico D. Bloisi, Vincenzo Suriani

2603.08459 2026-03-10 cs.LG

Data-Driven Priors for Uncertainty-Aware Deterioration Risk Prediction with Multimodal Data

L. Julián Lechuga López, Tim G. J. Rudner, Farah E. Shamout

Comments 24 pages, 5 figures, 8 tables

2603.08457 2026-03-10 cs.RO cs.LG cs.SY eess.SP eess.SY physics.data-an

Adaptive Entropy-Driven Sensor Selection in a Camera-LiDAR Particle Filter for Single-Vessel Tracking

Andrei Starodubov, Yaqub Aris Prabowo, Andreas Hadjipieris, Ioannis Kyriakides, Roberto Galeazzi

Comments 8 pages, 5 figures, submitted to FUSION 2026 conference proceedings

2603.08455 2026-03-10 cs.AI cs.LG

The Boiling Frog Threshold: Criticality and Blindness in World Model-Based Anomaly Detection Under Gradual Drift

Zhe Hong

Comments 10 pages, 5 figures, preprint

2603.08453 2026-03-10 cs.LG cs.AI cs.CL

LycheeCluster: Efficient Long-Context Inference with Structure-Aware Chunking and Hierarchical KV Indexing

Dongfang Li, Zixuan Liu, Gang Lin, Baotian Hu, Min Zhang

Comments 17 pages, 12 figures

2603.08450 2026-03-10 cs.CL

A Dataset for Probing Translationese Preferences in English-to-Swedish Translation

Jenny Kunz, Anja Jarochenko, Marcel Bollmann

Comments To appear at LREC 2026