Auditing LLM Editorial Bias in News Media Exposure
Marco Minici, Cristian Consonni, Federico Cinus, Giuseppe Manco
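AI summary: This paper studies editorial bias in LLM-based news aggregation. Comparing three leading LLMs against Google News, it finds that the LLMs differ significantly in media diversity, ideology, and reliability.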
Comments: Under peer review
Large Language Models (LLMs) increasingly act as gateways to web content, shaping how millions of users encounter online information. Unlike traditional search engines, whose retrieval and ranking mechanisms are well studied, the selection processes of web-connected LLMs add layers of opacity to how answers are generated. By determining which news outlets users see, these systems can influence public opinion, reinforce echo chambers, and pose risks to civic discourse and public trust. This work extends two decades of research in algorithmic auditing to examine how LLMs function as news engines. We present the first audit comparing three leading agents, GPT-4o-Mini, Claude-3.7-Sonnet, and Gemini-2.0-Flash, against Google News, asking: How do LLMs differ from traditional aggregators in the diversity, ideology, and reliability of the media they expose to users? Across 24 global topics, we find that, compared to Google News, LLMs surface significantly fewer unique outlets and allocate attention more unevenly. The models also differ in what they emphasize: GPT-4o-Mini emphasizes more factual and right-leaning sources; Claude-3.7-Sonnet favors institutional and civil-society domains and slightly amplifies right-leaning exposure; and Gemini-2.0-Flash exhibits a modest left-leaning tilt without significant changes in factuality. These patterns remain robust under prompt variations and alternative reliability benchmarks. Together, our findings show that LLMs already enact agentic editorial policies, curating information in ways that diverge from conventional aggregators. Understanding and governing their emerging editorial power will be critical for ensuring transparency, pluralism, and trust in digital information ecosystems.
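The abstract does not say which metrics underlie "fewer unique outlets" and "more uneven attention". As a minimal sketch, assuming outlets are identified by URL host name and unevenness is summarized with a Gini coefficient (both are illustrative choices, not the authors' stated method), the two quantities could be computed as follows. The function names and the example URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def outlet_counts(urls):
    """Count how often each outlet (host name, minus a leading 'www.') appears."""
    hosts = []
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        hosts.append(host)
    return Counter(hosts)


def gini(counts):
    """Gini coefficient of non-negative counts: 0 means perfectly even attention,
    values closer to 1 mean attention is concentrated on few outlets."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the ascending-sorted values.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical result lists for one topic (placeholder domains, not real data).
aggregator_urls = [
    "https://www.a.example/x",
    "https://b.example/y",
    "https://c.example/z",
    "https://d.example/w",
]
llm_urls = [
    "https://a.example/1",
    "https://a.example/2",
    "https://a.example/3",
    "https://b.example/4",
]

agg = outlet_counts(aggregator_urls)
llm = outlet_counts(llm_urls)
print(len(agg), gini(agg.values()))  # 4 unique outlets, Gini 0.0
print(len(llm), gini(llm.values()))  # 2 unique outlets, Gini 0.25
```

In this toy input the hypothetical LLM list shows both effects at once: fewer distinct outlets and a higher Gini value. Note that the Gini value here is computed only over outlets that appear at least once, so it should be read alongside the unique-outlet count rather than on its own.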