Beyond Prediction Accuracy: Target-Space Recovery Profiles for Evaluating Model-Brain Alignment
Ken Nakamura, Tomoya Nakai, Ryuto Yashiro, Ayumu Yamashita, Kaoru Amano
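AI summary: This paper proposes a new method for evaluating model-brain alignment. By analyzing which reproducibly predictable response dimensions of the target space are recovered, it reveals aspects of model-brain alignment that prediction accuracy alone does not capture.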
Details
- Comments: 34 pages, 12 figures, 5 tables
Artificial vision models are often evaluated against the human visual cortex by measuring how accurately their internal representations predict brain responses. However, prediction accuracy alone does not indicate which dimensions of the target brain's response space are recovered. Here, we introduce a unified framework for evaluating both model-brain and brain-brain alignment by identifying the response dimensions recovered by prediction. Using repeated fMRI measurements, we first identify target-brain response dimensions that can be reproducibly predicted across independent trial splits. We then predict target-brain responses from either another subject's brain responses or a vision model's internal representations, and quantify how strongly each of these reproducible response dimensions is recovered. Applying this framework to a subset of the Natural Scenes Dataset, in which eight subjects viewed the same natural images during fMRI, we find that the early-to-intermediate visual-cortex responses contain a low-dimensional set of reproducible dimensions. Brain-to-brain comparisons identify which of these dimensions are consistently recoverable from other subjects' brains, providing a diagnostic human reference rather than only a scalar benchmark. In some cases, pretrained and randomly initialized models achieve similar prediction accuracy while showing distinct recovery profiles across these response dimensions. These results show that prediction accuracy alone can mask model-brain mismatches. By making explicit which reproducible brain response dimensions are recovered by prediction, our framework provides a more diagnostic evaluation of alignment between artificial vision models and the human visual cortex.
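The two-step procedure described in the abstract can be illustrated with a small synthetic sketch. This is not the authors' implementation: the abstract does not specify how reproducible dimensions are estimated, so the code below assumes a simple stand-in (principal axes of the training responses, kept when their held-out projections correlate across two trial splits) and uses closed-form ridge regression as the predictor. All names (`reproducible_dims`, `recovery_profile`, `demo`), the threshold of 0.5, and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np


def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression weights mapping X to Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def column_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom


def reproducible_dims(Y_a, Y_b, train, val, n_components=10, threshold=0.5):
    """Assumed stand-in for step 1: keep principal axes of the training
    responses whose held-out projections agree across two trial splits."""
    Y_train = 0.5 * (Y_a[train] + Y_b[train])
    Y_train = Y_train - Y_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y_train, full_matrices=False)
    V = Vt[:n_components].T  # shape (n_voxels, n_components)
    r = column_corr(Y_a[val] @ V, Y_b[val] @ V)
    keep = r > threshold
    return V[:, keep], r[keep]


def recovery_profile(X, Y, V, train, test, alpha=1.0):
    """Step 2: predict Y from source X, then correlate predicted and
    measured responses along each retained dimension (column of V)."""
    mu_x, mu_y = X[train].mean(axis=0), Y[train].mean(axis=0)
    W = ridge_fit(X[train] - mu_x, Y[train] - mu_y, alpha)
    Y_hat = (X[test] - mu_x) @ W + mu_y
    return column_corr(Y_hat @ V, Y[test] @ V)


def demo(seed=0):
    """Synthetic example: the target has three latent response dimensions,
    but the source features carry only the first two."""
    rng = np.random.default_rng(seed)
    n_images, n_voxels, n_features = 1000, 50, 20
    Z = rng.standard_normal((n_images, 3))                    # latent signals
    Q, _ = np.linalg.qr(rng.standard_normal((n_voxels, 3)))   # orthonormal axes
    signal = Z @ (np.diag([6.0, 4.0, 2.5]) @ Q.T)
    Y_a = signal + rng.standard_normal((n_images, n_voxels))  # trial split A
    Y_b = signal + rng.standard_normal((n_images, n_voxels))  # trial split B
    X = (Z[:, :2] @ rng.standard_normal((2, n_features))
         + 0.1 * rng.standard_normal((n_images, n_features)))
    train = np.arange(0, 600)
    val = np.arange(600, 800)     # used only to select reproducible dimensions
    test = np.arange(800, 1000)   # used only to score recovery
    V, reproducibility = reproducible_dims(Y_a, Y_b, train, val)
    profile = recovery_profile(X, 0.5 * (Y_a + Y_b), V, train, test)
    return reproducibility, profile


reproducibility, profile = demo()
print("split-to-split reproducibility:", np.round(reproducibility, 2))
print("recovery profile from source:  ", np.round(profile, 2))
```

In this toy setup the source is built to miss one of the three simulated dimensions, so its recovery profile should be high on two dimensions and low on the third, a pattern that a single accuracy score averaged over voxels would blur. Passing another subject's responses as `X` instead of model features would give the brain-to-brain version of the same profile.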