Bionic Human-Motion Style Transfer for Physically Executable Whole-Body Control of Humanoid Robots
Tianchen Huang, Mingkuan Zhao, Yang Gao, Feiyang Yuan, Junchi Gu, Xiaohu Zhang, Dongdong Zhao, Shi Yan, Yu Wang, Wei Gao, Shiwu Zhang
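AI summary: Proposes a bionic generation-to-control framework that uses a physics-aware multi-condition latent diffusion model and a preview-based whole-body tracking policy to transfer short human style exemplars onto different motion contents, yielding executable and expressive whole-body motion for humanoid robots.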
- Comments: Project page: https://huangtc233.github.io/bionic-style-transfer/
Expressive whole-body motion is important for humanoid robots operating in human environments, where robots are expected to move stably while presenting readable and adjustable body behaviors. However, most expressive motions are still obtained from fixed demonstrations or manually designed scripts, making it difficult to reuse a demonstrated style across different motion contents. Inspired by the way human motion styles convey affective and intentional cues through gait rhythm, posture, arm swing and body sway, this paper proposes a bionic generation-to-control framework for exemplar-driven style transfer on humanoid robots. Given a short human style exemplar and a target content motion, the proposed framework generates a stylized whole-body reference that preserves the intended motion content while transferring the demonstrated style. A physics-aware multi-condition latent diffusion model is developed to fuse style, content and trajectory conditions, and classifier-free guidance is used to adjust the style intensity without retraining. To improve hardware executability, contact-consistency and temporal-smoothness regularization are imposed on decoded motions during training. The generated references are then converted into G1-compatible robot references and executed by a preview-based whole-body tracking policy trained with a cluster-and-distill strategy. Simulation and Unitree G1 experiments show that the proposed method can transfer short human style exemplars to diverse robot motion contents, reduce contact and jitter artifacts compared with animation-oriented style-transfer baselines, and achieve a 96.0% success rate over 125 reported real-robot trials. The results demonstrate the feasibility of using short human motion exemplars as reusable bionic sources for physically executable expressive humanoid motion.
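The abstract states that classifier-free guidance adjusts style intensity without retraining. A minimal sketch of how that mechanism typically works is given below, assuming the standard guidance rule applied to the style condition only. The names `guided_noise`, `toy_denoiser` and `style_scale` are hypothetical, and the toy denoiser is a stand-in for a trained network, not the paper's model; the content and trajectory conditions are omitted for brevity.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]
# Hypothetical denoiser signature: (noisy_latent, style_or_None) -> predicted noise.
# Passing None for the style stands for the "style dropped" branch.
Denoiser = Callable[[Sequence[float], Optional[Sequence[float]]], Vector]


def guided_noise(
    denoiser: Denoiser,
    latent: Sequence[float],
    style: Sequence[float],
    style_scale: float,
) -> Vector:
    """Standard classifier-free guidance, applied to the style condition.

    style_scale = 0 ignores the style, 1 reproduces the plain conditional
    prediction, and values above 1 extrapolate toward a stronger style.
    """
    eps_without_style = denoiser(latent, None)
    eps_with_style = denoiser(latent, style)
    return [
        u + style_scale * (c - u)
        for u, c in zip(eps_without_style, eps_with_style)
    ]


def toy_denoiser(
    latent: Sequence[float], style: Optional[Sequence[float]]
) -> Vector:
    """Assumed stand-in for a trained network, only to make the sketch runnable."""
    if style is None:
        return [0.5 * z for z in latent]
    return [0.5 * z + s for z, s in zip(latent, style)]


if __name__ == "__main__":
    latent = [2.0, 4.0]
    style = [1.0, -1.0]
    for scale in (0.0, 1.0, 2.0):
        print(scale, guided_noise(toy_denoiser, latent, style, scale))
```

Because the scale enters only at sampling time, the same trained denoiser can produce weaker or stronger stylization by changing one number, which is consistent with the "without retraining" claim in the abstract.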