Toward Temporal Realism in City-Scale Crisis Response Simulation using LLM Agents
Anping Zhang, Yang Tan, Yuanbo Tang, Huaze Tang, Qiuhua Ye, Marta C. Gonzalez, Yang Li
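AI summary: To address the lack of temporal realism in LLM-based social simulation, this work draws on volunteering data from Shenzhen covering the COVID-19 pandemic and proposes data-calibrated self-excitation and crisis-activation mechanisms that produce bursty timing, bringing the agents' temporal distribution close to the real one.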
Comments: 11 pages, 7 figures
Human collective participation is rarely steady in time: it is bursty, with short episodes of intense activity separated by long quiet intervals. In crisis response and community mobilization, predicting when people act matters as much as predicting whether they act. Such settings are increasingly modeled with LLM-based social simulators, yet these simulators are validated on whether each action is individually plausible, not on whether actions are timed as in reality. Their temporal realism, the degree to which simulated activity reproduces the bursty, heavy-tailed timing of real human systems, thus remains untested. We examine this gap using a multi-year, city-scale log of offline volunteering in Shenzhen that spans the COVID-19 pandemic. Empirically, we establish that bursty timing is common at individual and tracked-group levels, that it is largely endogenous and self-exciting, and that it is amplified by the pandemic rather than produced by daily activity cycles. A standard LLM-only simulator reproduces almost none of this timing: its synchronous schedule has no self-excitation channel, so agents act on a near-regular clock. Guided by these findings, we build a simulator in which a data-calibrated self-excitation channel and a crisis-period regime decide when each agent acts and query the LLM only at those moments, leaving it to decide which task to join and whether to commit. The LLM-only baseline yields no bursty agents (median burstiness $B=-0.14$); a single data-calibrated gate is then sufficient to lift per-agent timing above the burst threshold (median $B\approx0.37$) without degrading LLM content decisions. These results indicate that temporal realism in LLM-based crisis-response simulation is best achieved by decoupling when agents act, governed by an explicit self-excitation and crisis-activation mechanism, from what they do, governed by the LLM.
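To make the contrast between a near-regular clock and a self-exciting gate concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not state which burstiness estimator it uses, so the sketch assumes the widely used coefficient $B=(\sigma-\mu)/(\sigma+\mu)$, where $\mu$ and $\sigma$ are the mean and standard deviation of an agent's inter-event times ($B=-1$ for a perfectly regular clock, $B\approx0$ for Poisson timing, $B>0$ for bursty timing). The self-excitation channel is assumed here to be a univariate Hawkes process with an exponential kernel, simulated by thinning; the function names and the parameter values (`base_rate`, `alpha`, `beta`) are illustrative and are not the paper's calibrated values.

```python
import math
import random
import statistics


def burstiness(event_times):
    """Assumed estimator: B = (sigma - mu) / (sigma + mu) over inter-event times."""
    gaps = [b - a for a, b in zip(event_times, event_times[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three events to estimate burstiness")
    mu = statistics.fmean(gaps)
    sigma = statistics.pstdev(gaps)
    return (sigma - mu) / (sigma + mu)


def hawkes_event_times(base_rate, alpha, beta, horizon, rng):
    """Simulate a Hawkes process by thinning (hypothetical gate, not the paper's).

    Intensity: lambda(t) = base_rate + sum_i alpha * exp(-beta * (t - t_i)).
    The branching ratio alpha / beta must stay below 1 for a stable process.
    """
    events = []
    t = 0.0
    excitation = 0.0  # current value of the summed exponential kernel
    while True:
        # The intensity only decays between events, so its current value
        # is a valid upper bound until the next accepted event.
        upper = base_rate + excitation
        wait = rng.expovariate(upper)
        if t + wait >= horizon:
            return events
        t += wait
        excitation *= math.exp(-beta * wait)  # decay the kernel to time t
        if rng.random() * upper <= base_rate + excitation:
            events.append(t)      # accepted: the agent acts (LLM queried here)
            excitation += alpha   # each action raises the chance of another


rng = random.Random(0)

# Synchronous schedule: one action per step, as in a regular-clock baseline.
clock_times = list(range(200))
b_clock = burstiness(clock_times)

# Memoryless timing (alpha = 0) versus self-exciting timing.
b_poisson = burstiness(hawkes_event_times(0.5, 0.0, 1.0, 5000.0, rng))
b_hawkes = burstiness(hawkes_event_times(0.1, 0.8, 1.0, 5000.0, rng))

print(f"regular clock B = {b_clock:.2f}")
print(f"Poisson       B = {b_poisson:.2f}")
print(f"self-exciting B = {b_hawkes:.2f}")
```

In this sketch the gate only produces the times at which an agent acts; a simulator following the decoupling described in the abstract would query the LLM at each accepted event to choose the task and whether to commit.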