PhyGenHOI: Physically-Aware 4D Generation of Dynamic Human-Object Interactions
Omer Benishu, Gal Fiebelman, Sagie Benaim
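AI summary: Proposes PhyGenHOI, a framework that combines a Motion Diffusion Model with the Material Point Method and uses a Windowed Attraction Loss, Contact-Driven Re-simulation, and a Masked Video-SDS objective to generate physically consistent and visually realistic dynamic 4D human-object interaction scenes.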
Details
We address the task of generating physically accurate and visually faithful 4D Human-Object Interactions (HOI). Given a static 3D human and a target object represented as 3D Gaussian Splats (3DGS), our goal is to synthesize dynamic scenes in which the human actively engages with the object through actions such as punching or kicking, in accordance with a given input text. To this end, we introduce PhyGenHOI, a novel framework that couples generative human motion with explicit physical object simulation. We model the human as a semantic agent driven by a Motion Diffusion Model (MDM) and the object as a physical agent simulated via the Material Point Method (MPM), using 3D Gaussians as a unified, differentiable representation. We supervise their interaction through three coupled mechanisms: (1) a Windowed Attraction Loss that temporally synchronizes the generated motion so that it intercepts the object; (2) a Contact-Driven Re-simulation step that triggers physically consistent momentum transfer upon impact; and (3) a Masked Video-SDS objective that injects video-based priors to enhance contact fidelity. Experiments show that PhyGenHOI generates physically consistent 4D HOI across diverse actions, humans, and objects, outperforming baselines. Project page and videos: https://omerbenishu.github.io/PhyGenHOI/
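The abstract does not give the form of the Windowed Attraction Loss, so the following is only a minimal sketch of the general idea: penalize the distance between a body part and the object, but only during a chosen time window. The function name, the mean-squared-distance form, the fixed target point, and the inclusive frame window are all illustrative assumptions, not the authors' formulation.

```python
# Illustrative sketch only: the loss form, names, and window convention
# are assumptions, not taken from the PhyGenHOI paper.

def windowed_attraction_loss(effector_traj, target, t_start, t_end):
    """Mean squared distance between an end effector and a target point,
    averaged over frames t_start..t_end (inclusive) and ignoring all others."""
    window = effector_traj[t_start:t_end + 1]
    if not window:
        raise ValueError("attraction window contains no frames")
    total = 0.0
    for pos in window:
        # Squared Euclidean distance for one frame.
        total += sum((p - q) ** 2 for p, q in zip(pos, target))
    return total / len(window)


# Hypothetical trajectory: a hand moving along x from 0.0 to 1.0 in 11 frames.
traj = [(i / 10.0, 0.0, 0.0) for i in range(11)]
target = (1.0, 0.0, 0.0)  # assumed object position at contact time

late = windowed_attraction_loss(traj, target, 8, 10)  # window near contact
full = windowed_attraction_loss(traj, target, 0, 10)  # whole sequence
```

Restricting the penalty to the late window leaves the early frames unconstrained, so in this toy setup the motion is only pulled toward the object around the intended moment of contact rather than throughout the sequence.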