2605.12369
2026-06-02
cs.RO
GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization
Xiaosong Jia, Bowen Yang, Zuhao Ge, Xian Nie, Yuchen Zhou, Cunxin Fan, Yufeng Li, Yilin Chai, Chao Jing, Zijian Liang, Qingwen Bu, Haidong Cao, Chao Wu, Qifeng Li, Zhenjie Yang, Chenhe Zhang, Hongyang Li, Zuxuan Wu, Junchi Yan, Yu-Gang Jiang
Affiliations
Institute of Trustworthy Embodied AI (TEAI); Shanghai Key Laboratory of Multimodal Embodied AI; Shanghai Jiao Tong University; OpenDriveLab, The University of Hong Kong
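AI Summary
Proposes GuidedVLA, a framework that assigns human-defined auxiliary signals (such as object localization, spatial geometry, and temporal skill logic) to attention heads in the action decoder. This explicitly guides the model to attend to task-relevant factors and improves the success rate of VLA models in both in-domain and out-of-domain scenarios.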
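To make the idea of supervising an individual attention head concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function names (`head_attention`, `guidance_loss`), the cross-entropy form of the auxiliary loss, the choice of head 0, and the toy object mask are all assumptions made for illustration. It only shows one plausible way a single head's attention over visual tokens could be pulled toward a human-defined target such as an object-localization mask.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def head_attention(q, k):
    """Per-head scaled dot-product attention weights.

    q: (heads, n_query, d), k: (heads, n_tokens, d)
    returns: (heads, n_query, n_tokens), each row summing to 1.
    """
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores, axis=-1)

def guidance_loss(attn, head_idx, target_mask, eps=1e-8):
    """Hypothetical auxiliary loss: cross-entropy between one head's
    attention rows and a normalized target mask over the tokens."""
    target = target_mask / (target_mask.sum() + eps)
    head = attn[head_idx]  # (n_query, n_tokens)
    return float(-(target * np.log(head + eps)).sum(axis=-1).mean())

# Toy setup (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
heads, n_query, n_tokens, d = 4, 2, 6, 8
q = rng.normal(size=(heads, n_query, d))
k = rng.normal(size=(heads, n_tokens, d))
attn = head_attention(q, k)

# Assumed auxiliary signal: tokens 2 and 3 cover the target object.
object_mask = np.array([0, 0, 1, 1, 0, 0], dtype=float)

# Only head 0 is guided; the remaining heads are left unconstrained.
loss = guidance_loss(attn, head_idx=0, target_mask=object_mask)
```

In a training loop, a term like `loss` would be added to the action-prediction objective with some weight, so the guided head learns to concentrate on the masked tokens while the other heads stay free. Signals such as spatial geometry or temporal skill logic would need their own target definitions, which this sketch does not attempt.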