2606.12291
2026-06-11
cs.CL
New submission
Measuring Epistemic Resilience of LLMs Under Misleading Medical Context
Hongjian Zhou, Xinyu Zou, Jinge Wu, Sean Wu, Junchi Yu, Bradley Max Segal, Tobias Erich Niebuhr, Sara Amro, Michael Petrus, Sheikh Momin, Alexandra M. Cardoso Pinto, Rachel Niesen, Laura Sophie Wegner, Dhruv Darji, Jung Moses Koo, Joshua Fieggen, Kapil Narain, Mingde Zeng, Lei Clifton, Linda Shapiro, Fenglin Liu, David A. Clifton
Affiliations
University of Oxford; University of Washington; University College London; University of Waterloo
AI Summary
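This study introduces MedMisBench, a benchmark that injects misleading context to test the epistemic resilience of large language models in medical scenarios. It finds that model accuracy drops from 71.1% to 38.0%, and that authoritative-sounding misinformation reaches an attack success rate of 69.5%.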