2605.10730
2026-05-12
cs.CV
Qwen-Image-2.0 Technical Report
Bing Zhao, Chenfei Wu, Deqing Li, Hao Meng, Jiahao Li, Jie Zhang, Jingren Zhou, Junyang Lin, Kaiyuan Gao, Kuan Cao, Kun Yan, Liang Peng, Lihan Jiang, Niantong Li, Ningyuan Tang, Shengming Yin, Tianhe Wu, Xiao Xu, Xiaoyue Chen, Xihua Wang, Yan Shu, Yanran Zhang, Yi Wang, Yilei Chen, Ying Ba, Yixian Xu, Yujia Wu, Yuxiang Chen, Zecheng Tang, Zekai Zhang, Zhendong Wang, Zihao Liu, Zikai Zhou, An Yang, Chen Cheng, Chenxu Lv, Dayiheng Liu, Fan Zhou, Hantian Xiong, Hongzhu Shi, Hu Wei, Huihong Zhao, Ivy Liu, Jianwei Zhang, Jiawei Zhang, Kai Chen, Kang He, Levon Xue, Lin Qu, Linhan Tang, Luwen Feng, Minggang Wu, Minmin Sun, Na Ni, Rui Men, Shuai Bai, Sishou Zheng, Tao Lan, Tianqi Zhang, Tingkun Wen, Wei Wang, Weixu Qiao, Weiyi Lu, Wenmeng Zhou, Xiaodong Deng, Xiaoxiao Xu, Xinlei Fang, Xionghui Chen, Yanan Wang, Yang Fan, Yichang Zhang, Yixuan Xu, Yu Wu, Zhiyuan Ma, Zhizhi Cai
AI Summary
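This paper introduces Qwen-Image-2.0, an all-purpose image generation foundation model that unifies high-fidelity image generation and precise image editing. By combining Qwen3-VL as the condition encoder with a multimodal diffusion transformer, the model addresses challenges such as ultra-long text rendering, multilingual typography, and high-resolution photorealistic generation. Supported by large-scale training data and a customized multi-stage training pipeline, it achieves strong multimodal understanding together with flexible generation and editing capabilities. Experiments show that Qwen-Image-2.0 significantly outperforms its previous versions on both generation and editing tasks, marking an important step toward more general, reliable, and practical image generation models.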