Image Generation - arXivDaily Topic

2508.03483 2026-06-18 cs.CV cs.AI Version update 90%

When Cars Have Stereotypes: Auditing Demographic Bias in Objects from Text-to-Image Models

Dasol Choi, Jihwan Lee, Minjae Lee, Minsuk Kahng

Affiliations: AIM Intelligence; Yonsei University

Topic match (text-to-image): audits demographic bias in text-to-image models; relevant to image generation.

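AI summary: Proposes the SODA framework, which uses three metrics to systematically measure demographic bias in objects generated by text-to-image models. It finds that neutral prompts implicitly skew toward middle-aged and White demographics, and that demographic cues lead to highly skewed, stereotypical outputs.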

English abstract

While prior research on text-to-image generation has predominantly focused on biases in human depictions, demographic bias in generated objects remains relatively underexplored. We introduce SODA (Stereotyped Object Diagnostic Audit), a novel framework for systematically measuring these biases through automated attribute discovery and three standardized metrics: Base vs. Demographic Divergence (BDS), Cross-Demographic Disparity (CDS), and Visual Attribute Concentration (VAC). Applying SODA to 8,000 images across five state-of-the-art models and eight object categories (e.g., cars), we find that "neutral" prompts produce outputs most visually similar to middle-aged and White people, suggesting these groups are implicitly over-represented in model defaults. Furthermore, demographic cues trigger highly skewed stereotypical outputs: 26.6% of object-model-demographic combinations produce results where all 20 generated images share the exact same attribute value (e.g., rose gold laptops for women). Finally, prompt-level debiasing reduces inter-group disparity but paradoxically collapses within-group diversity, replacing one stereotype with another. SODA offers a practical pipeline for making these implicit associations measurable, serving as a step toward more responsible AI development.
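
The 26.6% figure in the abstract counts object-model-demographic combinations in which all 20 images share one attribute value. As a rough illustration only, the sketch below assumes concentration is measured as the share of the most common attribute value; the abstract does not give SODA's actual VAC definition, and the function names and toy data here are hypothetical.

```python
from collections import Counter


def top_value_share(values):
    """Fraction of images whose attribute equals the most common value.

    Assumption: a stand-in for an attribute-concentration score, not the
    VAC formula from the paper.
    """
    if not values:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    return counts.most_common(1)[0][1] / len(values)


def fully_concentrated_fraction(groups):
    """Share of combinations in which every image has the same attribute value."""
    if not groups:
        raise ValueError("groups must be non-empty")
    collapsed = sum(1 for values in groups.values() if top_value_share(values) == 1.0)
    return collapsed / len(groups)


# Hypothetical toy data: 20 attribute labels per (object, model, demographic) key.
groups = {
    ("laptop", "model_a", "women"): ["rose gold"] * 20,
    ("laptop", "model_a", "men"): ["black"] * 12 + ["silver"] * 8,
}

for key, values in groups.items():
    print(key, top_value_share(values))
print(fully_concentrated_fraction(groups))
```

In this toy input, one of the two combinations is fully collapsed onto a single value, which is the kind of case the abstract's percentage tallies.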

2605.14877 2026-06-18 cs.CV Version update 85%

HeatKV: Head-tuned KV-cache Compression for Visual Autoregressive Modeling

Jonathan Cederlund, Axel Berg, William Isaksson, Durmus Alp Emre Acar, Chuteng Zhou, Pontus Giselsson

Affiliations: Dept. of Automatic Control, Lund University; Arm

Topic match (text-to-image): proposes the HeatKV compression method for visual autoregressive image generation.

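AI summary: Proposes HeatKV, which adapts cache allocation per attention head based on that head's attention to previously generated scales, giving more efficient KV-cache compression that improves memory utilization while preserving image generation quality.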

Comments: 18 pages total including appendix; 6 main-paper figures, 2 appendix figures; 4 tables

English abstract

Visual Autoregressive (VAR) models have recently demonstrated impressive image generation quality while maintaining low latency. However, they suffer from severe KV-cache memory constraints, often requiring gigabytes of memory per generated image. We introduce HeatKV, a novel compression method that adapts cache allocation in each head based on its attention to previously generated scales. Using a small offline calibration set, the attention heads are ranked according to their attention scores over prior scales. Based on this ranking, we construct a static pruning schedule tailored to a given memory budget. Applied to the Infinity-2B model, HeatKV achieves $2 \times$ higher compression ratio in memory allocation for KV cache compared to existing methods, while maintaining similar or better image fidelity, prompt alignment and human perception score. Our method achieves a new state-of-the-art (SOTA) for VAR model KV-cache compression, showcasing the effectiveness of fine-grained, head-specific cache allocation. Code and calibration script available at https://github.com/arm-research/heatkv.
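
The abstract describes ranking heads by their attention to prior scales on a calibration set, then deriving a static schedule for a memory budget. The sketch below is one possible reading of that idea, not the HeatKV implementation: the greedy top-up rule, the function names, and the toy numbers are all assumptions made for illustration.

```python
def rank_heads(prior_scale_attention):
    """Return head indices sorted by mean attention to prior scales, highest first.

    prior_scale_attention[h] holds, for head h, one attention-mass value per
    calibration sample (hypothetical input format).
    """
    means = [sum(samples) / len(samples) for samples in prior_scale_attention]
    return sorted(range(len(means)), key=lambda h: means[h], reverse=True)


def static_schedule(ranking, full_size, min_size, budget):
    """Assumed greedy allocation of cache entries per head.

    Every head first receives min_size entries; heads are then topped up
    toward full_size in rank order until the budget is used up.
    """
    n = len(ranking)
    if budget < n * min_size:
        raise ValueError("budget too small for the minimum per-head allocation")
    alloc = {h: min_size for h in ranking}
    remaining = budget - n * min_size
    for h in ranking:
        extra = min(full_size - min_size, remaining)
        alloc[h] += extra
        remaining -= extra
    return alloc


# Hypothetical calibration scores for three heads, two samples each.
scores = [[0.1, 0.2], [0.7, 0.9], [0.4, 0.4]]
ranking = rank_heads(scores)
schedule = static_schedule(ranking, full_size=100, min_size=10, budget=150)
print(ranking)
print(schedule)
```

Because the schedule is computed once from offline scores, nothing in this sketch depends on the prompt at generation time, which matches the "static" wording in the abstract.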
