AI-T2I: Aggregating-and-Isolating Cross-Attention to Diffusion Models for Text-to-Image Synthesis
Affiliations: Institute of Advanced Medicine and Frontier Technology; Hefei University of Technology; Key Laboratory of Knowledge Engineering With Big Data, Ministry of Education; Department of Automation, Tsinghua University
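AI summary: To address imprecise text-image alignment in the cross-attention maps of diffusion models for text-to-image synthesis, the paper proposes an aggregating-and-isolating cross-attention method (AI-T2I). An aggregation loss consolidates scattered intra-token activations, and an isolation loss separates inter-token activations, yielding precise alignment.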
Comments: Accepted by IEEE Transactions on Multimedia (2026). 13 pages, 15 figures.
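
Illustrative sketch: the summary names an aggregation loss and an isolation loss over cross-attention maps but gives no formulas. The Python below is only a toy interpretation of those two ideas, not the paper's formulation. It assumes each token's cross-attention map is a small 2D grid of non-negative values with a positive sum. The function names `aggregation_loss` and `isolation_loss`, the centroid-variance measure, and the min-overlap measure are all hypothetical choices made for illustration.

```python
from itertools import combinations


def normalize(attn):
    """Scale a 2D map (list of rows) so its entries sum to 1.
    Assumes non-negative entries with a positive total."""
    total = sum(sum(row) for row in attn)
    return [[v / total for v in row] for row in attn]


def aggregation_loss(attn):
    """Hypothetical intra-token term: spatial variance of one token's
    attention mass around its centroid. Lower means more concentrated."""
    p = normalize(attn)
    cy = sum(y * v for y, row in enumerate(p) for v in row)
    cx = sum(x * v for row in p for x, v in enumerate(row))
    return sum(
        v * ((y - cy) ** 2 + (x - cx) ** 2)
        for y, row in enumerate(p)
        for x, v in enumerate(row)
    )


def isolation_loss(maps):
    """Hypothetical inter-token term: mean pairwise overlap between the
    normalized maps of different tokens (sum of elementwise minima)."""
    ps = [normalize(m) for m in maps]
    pairs = list(combinations(ps, 2))
    if not pairs:  # fewer than two tokens: nothing to separate
        return 0.0
    overlap = sum(
        min(a, b)
        for pa, pb in pairs
        for ra, rb in zip(pa, pb)
        for a, b in zip(ra, rb)
    )
    return overlap / len(pairs)


# Toy maps: one concentrated activation, one scattered over four corners.
concentrated = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
scattered = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]
```

Under this toy reading, a scattered map scores higher on `aggregation_loss` than a concentrated one, and two tokens attending to the same region score higher on `isolation_loss` than two tokens attending to disjoint regions. In a diffusion pipeline such terms would typically be computed on differentiable tensors rather than Python lists; that part is left out here.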