2606.12817
2026-06-12
cs.AI
New submission
Teach-and-Repeat: Accurately Extracting Operational Knowledge from Mobile Screen Demonstrations to Empower GUI Agents
Yudong Zhang (1), Lei Hu (1), Daoyang Liu (2), Jiawei Liu (1), Yangfan Luo (1), Xingyu Liu (1), Zuojian Wang (1), Zhilin Gao (1) ((1) Honor Device Co., Ltd, (2) The Chinese University of Hong Kong, Hong Kong, China)
Affiliations
Honor Device Co., Ltd; The Chinese University of Hong Kong
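AI summary
Proposes the Teach VLM model, which extracts key frames from demonstration videos to generate operational knowledge, and builds a data flywheel to address the scarcity of training data. It achieves state-of-the-art performance on benchmarks and improves the task success rate of downstream agents.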