Dissect and Prune: Enhancing Robustness in AI-Generated Image Detection
Dahye Kim, Jaehyun Choi, Hyun Seok Seong, Seongho Kim, Donghun Lee, Sungwon Yi, Jang-Ho Choi
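AI summary: To address the bias of AI-generated image detectors toward predicting the real class, the authors propose DEAR, which uses inpainted images to identify and prune interfering features, improving robustness to unseen generators and post-processing.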
- Comments: 25 pages, 9 figures, 9 tables; accepted to ICML 2026; includes appendix
While existing AI-generated image detectors report high performance, we find that this performance is largely driven by a critical prediction asymmetry: a bias toward the real class that severely limits sensitivity to generated content, especially under standard post-processing operations such as compression and resizing. We hypothesize that this asymmetry stems from the model's reliance on spurious features, distracting signals that obscure true generative artifacts. To address this, we propose DEAR (Dissect and Prune), which leverages inpainted images to identify and prune these interfering components. Specifically, we find that features strongly aligned with either inpainted or non-inpainted regions are less robust to post-processing. By measuring the alignment between channel activations and inpaint masks, DEAR removes features at both extremes, retaining only those that capture genuine generative artifacts. Experimental results demonstrate that our approach significantly enhances robustness against unseen generators and post-processing, effectively mitigating the prediction asymmetry. Our code is available at https://github.com/dahyedahye/dear.
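The pruning idea in the abstract (score each channel by its alignment with the inpaint mask, then drop both extremes) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how alignment is measured, so Pearson correlation is assumed here as a stand-in, and the names `alignment_scores`, `select_channels`, and `prune_frac` are hypothetical. Activations and the mask are treated as flattened lists of equal length.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def alignment_scores(channels, mask):
    """Score each channel by how closely its activation map follows the mask.

    ASSUMPTION: correlation is an illustrative alignment measure only.
    High positive scores track inpainted pixels (mask == 1); strongly
    negative scores track non-inpainted pixels (mask == 0).
    """
    return [pearson(activation, mask) for activation in channels]


def select_channels(scores, prune_frac=0.25):
    """Return indices of channels kept after pruning both extremes.

    `prune_frac` (hypothetical parameter) is the fraction of channels
    removed from EACH end of the score ranking.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = int(len(scores) * prune_frac)
    return sorted(order[k:len(order) - k])


# Toy example: a 4-pixel image whose first two pixels were inpainted.
mask = [1, 1, 0, 0]
channels = [
    [0.9, 0.8, 0.1, 0.0],  # fires on the inpainted region
    [0.0, 0.1, 0.9, 1.0],  # fires on the non-inpainted region
    [0.5, 0.1, 0.4, 0.2],  # roughly uncorrelated with the mask
    [0.3, 0.6, 0.5, 0.2],  # weakly correlated with the mask
]
scores = alignment_scores(channels, mask)
kept = select_channels(scores, prune_frac=0.25)
print(kept)  # channels 0 and 1 sit at the two extremes and are dropped
```

In a real detector the scores would presumably be aggregated over many inpainted images and the selected channels masked out inside the network; here the selection is reduced to picking indices so the two-sided pruning step is easy to see.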