TUX: Measuring Human-AI Tacit Understanding
Yueshen Li, Hanyi Min, Vedant Das Swain, Koustuv Saha
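AI summary: Using a spectrum-placement task and the TUX index, the paper quantifies tacit understanding between humans and LLMs and finds that personality traits affect the degree of alignment.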
As large language models (LLMs) increasingly act as collaborative partners, human-AI alignment is often evaluated through explicit task success, accuracy, or reward optimization. Yet many collaborative settings depend on tacit understanding: whether an agent can align with a human's evaluative stance or representational priors without clear objectives, communication, or feedback. To study this capacity, we develop a spectrum-placement task inspired by the social party game Wavelength, in which humans and agents independently place concepts along subjective spectra. We operationalize the Tacit Understanding Index (TUX) as a pairwise measure of similarity between human and agent judgments, and evaluate it with 241 human participants and 200 profile-conditioned LLM agents across four models. We find that nearest human-agent pairs in trait space achieve significantly higher TUX, suggesting that tacit alignment is structured by person-level characteristics rather than random similarity. Regression analyses show that TUX becomes more explainable as predictor sets become richer, with individual traits, decision-making styles, and confidence improving over aggregate trait-distance baselines. These findings suggest that tacit understanding between humans and LLMs is measurable, while revealing the limits of profile-based conditioning for capturing deeper representational alignment.
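The abstract does not give the formula behind TUX, so the following is only an illustrative sketch of what a pairwise similarity over spectrum placements could look like. It assumes placements are numbers on a shared bounded scale and uses one minus the normalized mean absolute difference as the similarity; the function names `tux_sketch` and `nearest_agent`, the default scale bounds, and the Euclidean trait distance are all hypothetical choices, not the authors' definitions.

```python
import numpy as np


def tux_sketch(human, agent, lo=0.0, hi=1.0):
    """Hypothetical pairwise similarity between two sets of placements.

    ASSUMPTION: this is not the paper's TUX formula. It returns
    1 - (mean absolute difference / scale width), so identical placements
    score 1.0 and placements at opposite ends of the scale score 0.0.
    """
    human = np.asarray(human, dtype=float)
    agent = np.asarray(agent, dtype=float)
    if human.shape != agent.shape:
        raise ValueError("human and agent must place the same concepts")
    return float(1.0 - np.mean(np.abs(human - agent)) / (hi - lo))


def nearest_agent(human_traits, agent_traits):
    """Index of the agent closest to one human in trait space.

    ASSUMPTION: Euclidean distance over numeric trait vectors; the paper
    may use a different distance or trait encoding.
    """
    human_traits = np.asarray(human_traits, dtype=float)
    agent_traits = np.asarray(agent_traits, dtype=float)
    distances = np.linalg.norm(agent_traits - human_traits, axis=1)
    return int(np.argmin(distances))


# Toy data: one human, three agents, two concepts, two traits.
human_placements = [0.2, 0.7]
agent_placements = [[0.9, 0.1], [0.25, 0.65], [0.5, 0.5]]
human_profile = [0.5, 0.5]
agent_profiles = [[0.0, 0.0], [0.6, 0.5], [1.0, 1.0]]

idx = nearest_agent(human_profile, agent_profiles)
score = tux_sketch(human_placements, agent_placements[idx])
```

Under these assumptions, comparing `score` for the nearest agent against scores for randomly chosen agents mirrors the nearest-pair comparison described in the abstract, though the actual analysis may differ in both the metric and the matching procedure.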