Improving Text-Instance Alignment of Foreground Conditioned Out-Painting via Customized Concept Embedding
Yihao Zhao, Xuan Han, Bin He, Mingyu You
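AI Summary: To address the artifacts produced by text-driven methods in foreground conditioned outpainting, this work proposes the Customized Concept Embedding Diffusion framework, which customizes concept embeddings through an Instance-Aware Loss and a Semantic-Preserving Prompt Template, significantly reducing artifacts and improving image quality.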
To showcase products, merchants often incur substantial costs creating high-quality display images. Foreground Conditioned Outpainting (FCO) meets this demand by letting users create a desired background for a foreground instance at low cost, simply by adjusting the text prompt. However, existing text-driven FCO methods exhibit critical flaws in their outputs, most notably artifacts: regions of the synthesized background that share the same semantics as the foreground instance. Such artifacts diminish the object's prominence and degrade image quality. We attribute the issue to misalignment between the given instance and the text-derived concept embeddings. To address this, we propose the Customized Concept Embedding Diffusion (CCE-Diffusion) framework. Its core is a CCE-Module that customizes concept embeddings, bridging the gap between generic noun semantics and a specific visual instance. An Instance-Aware Loss guides the module's optimization, while a Semantic-Preserving Prompt Template prevents the customized embeddings from distorting other words in the prompt. Both qualitative and quantitative evaluations demonstrate that CCE-Diffusion significantly reduces artifacts in the outputs. As a plug-and-play component, the CCE-Module can be integrated into various FCO methods to enhance their performance.