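arXivDaily daily academic digest: synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.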

2606.12369 2026-06-11 cs.CY New submission

Should LLM Agents Decide in Social Simulations? Comparing Finite-State and LLM-Based Decision Policies

Alejandro Buitrago López, Javier Pastor-Galindo, José A. Ruipérez-Valiente

AI summary: The study evaluates whether LLMs used as action selectors in online social network simulations preserve an interpretable reference policy. It finds that LLMs can approximate the policy in some configurations but do not preserve it reliably, and that they are far slower than Markov chain sampling.

Abstract

Large language models (LLMs) are increasingly used as decision-making components in social simulations. This introduces a methodological risk: the simulation may deviate from the explicit behavioral policy defined by the researcher. In online social network (OSN) simulations, action choices shape system dynamics, interaction patterns, and model interpretability. This paper evaluates whether LLM action selectors preserve an interpretable reference policy in an OSN simulation. The reference is a finite state machine implemented as a first-order Markov model, with transition probabilities depending on the user type. The evaluation uses a synthetic network with 1,000 agents and 10,000 action decisions. Three open-weight LLMs are tested: LLaMA 3.1, GPT-OSS, and Mistral 24B. Each model is evaluated under three prompting strategies: base, guided, and probabilistic. Alignment is measured using Jensen-Shannon Divergence with Laplace smoothing, and execution time is reported. Results show that LLMs can approximate the reference policy in some configurations, but do not preserve it reliably. Alignment varies across models and prompts, and additional guidance can introduce systematic action biases. Even the best-aligned LLM configurations are several hundred times slower than direct Markov chain sampling. These findings indicate that LLM-based action selection is not a direct replacement for explicit decision policies: it can alter the intended behavior while increasing computational cost.
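
The comparison described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the action names, the transition table, and the stand-in "observed" counts are all hypothetical, and the smoothing constant alpha = 1 is an assumption. It samples from a first-order Markov reference policy, applies Laplace smoothing to the action counts, and computes the Jensen-Shannon divergence in bits.

```python
import math
import random
from collections import Counter

# Hypothetical action set and transition table for a single user type.
ACTIONS = ["post", "like", "share", "idle"]
TRANSITIONS = {
    "post":  {"post": 0.1, "like": 0.4, "share": 0.2, "idle": 0.3},
    "like":  {"post": 0.2, "like": 0.3, "share": 0.2, "idle": 0.3},
    "share": {"post": 0.2, "like": 0.3, "share": 0.1, "idle": 0.4},
    "idle":  {"post": 0.3, "like": 0.3, "share": 0.1, "idle": 0.3},
}


def sample_next_action(current, rng):
    """Draw the next action from the reference first-order Markov policy."""
    row = TRANSITIONS[current]
    return rng.choices(ACTIONS, weights=[row[a] for a in ACTIONS], k=1)[0]


def smoothed_distribution(counts, alpha=1.0):
    """Turn action counts into probabilities with Laplace (add-alpha) smoothing."""
    total = sum(counts.get(a, 0) for a in ACTIONS) + alpha * len(ACTIONS)
    return [(counts.get(a, 0) + alpha) / total for a in ACTIONS]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint support)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Reference behaviour: actions that follow "post" under the Markov policy.
rng = random.Random(0)
reference = Counter(sample_next_action("post", rng) for _ in range(10_000))

# Stand-in for the actions an LLM selector chose in the same state (made up).
observed = Counter({"post": 5000, "like": 3000, "share": 1000, "idle": 1000})

score = js_divergence(smoothed_distribution(reference),
                      smoothed_distribution(observed))
print(f"JSD between reference and observed choices: {score:.4f} bits")
```

A lower score means the observed action distribution stays closer to the reference policy. The smoothing step keeps actions that were never chosen from being assigned a probability of exactly zero.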


2606.12285 2026-06-11 cs.CY New submission

Why AI Slop Matters, but Not Like That

Sachita Nishal, Marijn Sax, Kimon Kieslich

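AI summary: This paper responds to the article "Why Slop Matters" with internal and external critiques, arguing that its reasoning neglects the sociotechnical context of AI slop. Drawing on ethics and social science perspectives, it stresses attention to the social function and aesthetic value of AI slop and calls for a contextualized, culturally grounded discussion.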

2606.12247 2026-06-11 cs.CY cs.CL New submission

Beyond Third-Person Audits: Situated Interaction Auditing for User-Centered LLM Bias Research

Andrés Abeliuk, Cinthia Sanchez Macias, Valentina Alarcón, Álvaro Madariaga, Claudia Lopez

AI summary: Proposes the Situated Interaction Auditing (SIA) framework, a user-centered approach to LLM bias research that analyzes how user profile signals (such as sociodemographic markers, writing style, and stated identity) systematically affect the quality, content, and tone of LLM responses.

Abstract

Research on bias in large language models (LLMs) has predominantly focused on third-person audits, which study how models represent or evaluate demographic groups as external subjects. However, this paradigm overlooks a structural blind spot because the user is absent from the audit. In practice, LLMs are used in open-ended, personal interactions, during which the model implicitly represents the user and adjusts its responses accordingly. When identical requests yield different responses depending on who is asking, bias manifests not in how the model describes others but in how it treats its interlocutor. We propose Situated Interaction Auditing (SIA), a user-centered framework for studying how user profile signals -- implicit sociodemographic markers, writing style, and stated identity -- systematically shape LLM response quality, content, and tone. We demonstrate the framework through a case study that intersects gender and socioeconomic status signals across multiple task domains and outline a research agenda for SIA as a new mission for natural language processing.


2606.11692 2026-06-11 cs.CY cs.MA cs.SI New submission

Evaluation of Alternative-Based Information Systems for Deliberative Polling using an Agentic Simulator

Rwaida Alssadi, Khulud Alawaji, Balaji Kasula, Muntaser Syed, Badria Alfurhood, Markus Zanker, Marius Silaghi

AI summary: Proposes the LLM-based Agentic Bipolar Argumentation Simulator (ABAS), which evaluates the effectiveness of recommendation mechanisms in deliberative polling through coverage and corpus diversity, and tests robustness under adversarial voting attacks.

Abstract

Deliberative polling promises to improve collective decision-making by exposing shareholders to a broad range of arguments before they vote. Yet ensuring that every voter encounters a representative sample of the reason space, the coverage problem, remains an open challenge, particularly at scale and in adversarial or strategically motivated electorates. This paper introduces a way of evaluating solutions using the LLM-based Agentic Bipolar Argumentation Simulator (ABAS), grounded in a framework which formalises a poll as a six-tuple $\langle J_{end}, J_{opp}, R_{att}, R_{enh}, V_A, V_R \rangle$ of endorsing and opposing justifications, attack and enhance relations, and shareholder- and relation-weights. ABAS simulates $N$ autonomous shareholder agents, each assigned a latent opinion according to desired distributions in $[-1, 1]$, who sequentially vote, choose or author justifications, and optionally submit argumentation-graph links. The simulator implements recommendations that rank existing justifications by their observable endorsement mass. It evaluates the mechanism's success by coverage, namely the fraction of the corpus reason-tag set represented in the $K$ recommendations presented to each shareholder, as a solution to the NP-hard Subsuming Justification Problem. Reported experiments characterise how creativity rate ($p_{own}$), recommendation size ($K$), argumentation density ($p_{links}$), and population size ($N$) affect coverage and corpus diversity. In an authenticated electorate where Sybil attacks are impossible and only the relation graph is gameable, we stress-test the scoring with coordinated strategic voting attacks: a tag-flood attack collapses coverage, while author-count relation weighting through a reversed-PageRank rule resists the flood markedly better than uniform weights.
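
The coverage metric defined in this abstract can be illustrated with a short Python sketch. The data layout below (dictionaries with `tags` and `endorsement` fields) and the helper names are hypothetical, chosen only for illustration; the paper's simulator may represent justifications differently.

```python
def top_k_by_endorsement(justifications, k):
    """Rank justifications by observable endorsement mass and keep the top K."""
    ranked = sorted(justifications, key=lambda j: j["endorsement"], reverse=True)
    return ranked[:k]


def coverage(recommended, corpus_tags):
    """Fraction of the corpus reason-tag set represented in the recommendations."""
    corpus = set(corpus_tags)
    if not corpus:
        return 0.0
    shown = set()
    for justification in recommended:
        shown.update(justification["tags"])
    return len(shown & corpus) / len(corpus)


# Hypothetical corpus: each justification carries reason tags and an endorsement mass.
corpus = [
    {"id": "j1", "tags": {"cost"}, "endorsement": 9.0},
    {"id": "j2", "tags": {"cost", "safety"}, "endorsement": 7.5},
    {"id": "j3", "tags": {"equity"}, "endorsement": 2.0},
    {"id": "j4", "tags": {"privacy"}, "endorsement": 1.0},
]
all_tags = set().union(*(j["tags"] for j in corpus))

recommended = top_k_by_endorsement(corpus, k=2)
print(coverage(recommended, all_tags))
```

In this toy corpus, ranking purely by endorsement shows only the "cost" and "safety" tags, so half of the reason space never reaches the voter. A tag-flood attack of the kind the abstract mentions would exploit exactly this concentration.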


2606.11669 2026-06-11 cs.HC cs.CY New submission

Learning by Chatting? Investigating the Impact of Generative AI on Information Seeking and Learning

Shravika Mittal, Su Lin Blodgett, Q. Vera Liao

AI summary: An 8-day field experiment comparing the effects of ChatGPT and Google Search on informal learning finds that the ChatGPT group, having offloaded information selection to the AI, experienced greater meta-cognitive load, explored less of the knowledge space, and had worse learning outcomes, especially for higher-order critical learning.

Abstract

Generative AI (GenAI) tools offer increasing opportunities for augmenting human cognitive tasks. Among these tasks, information seeking is being rapidly reshaped by GenAI tools, with potentially profound implications for learning and knowledge acquisition. To investigate these implications, we conducted a between-subjects field experiment in which participants pursued informal learning by seeking information through either ChatGPT or Google Search over a span of 8 days. Using a daily diary protocol, we gathered in-situ data on their information-seeking processes. Our findings show that participants in the ChatGPT group experienced diminished agency in their information-seeking processes, as they offloaded much of the information selection to AI, and consequently experienced greater meta-cognitive load arising from this reduced sense of control. We further highlight two sources of distortion in information access when using ChatGPT: biases in ChatGPT outputs, particularly towards providing solution-oriented artifacts over principled knowledge; and systematic shifts in users' information-seeking behaviors, whereby the conversational and socially-oriented interaction paradigm of current GenAI tools may inadvertently reduce exploration of the broader knowledge space. As a result, on average, participants in the ChatGPT group had worse learning outcomes than those using Google, especially for higher-order critical learning. Our work suggests inherent tensions between offloading information seeking to AI and meaningful learning, and provides broader implications for understanding AI's risks to human cognition.


2606.11635 2026-06-11 cs.CY cs.AI New submission

Are LLMs Bad at Moral Reasoning?

Menghang Zhu, Seth Lazar

AI summary: By having LLMs generate scoring rubrics rather than scoring their responses directly, this paper re-evaluates the MoReBench dataset and finds that LLMs' moral reasoning is stronger than previously thought.

Abstract

For highly capable AI systems to operate safely in dynamic, open-ended environments, they must be able to identify, understand, and respond to moral reasons for action, and constrain their behaviour accordingly. A growing body of research aims to evaluate this capacity -- moral competence -- in today's most capable AI systems, recently reaching broadly pessimistic conclusions. One of the most ambitious such papers collects gold-standard human-authored rubrics for evaluating moral reasoning in 1,000 cases, and benchmarks frontier AI models against those rubrics, with underwhelming results. In this paper, we argue that the MoReBench dataset can be redeployed to give a much more optimistic picture of LLMs' moral reasoning (an essential part of moral competence). We show that if, instead of scoring LLMs' responses to these cases against these rubrics, we instead give the LLMs the same task given to humans -- to generate scoring rubrics for the moral analysis of particular cases -- the rubrics they generate are both better calibrated to the human rubrics than their open-ended responses, and, where they differ, plausibly reflect nothing more than the vast dimensionality of most moral problems, as well as highlighting some human departures from the "rubric for creating rubrics". Taking these points into consideration, the MoReBench dataset suggests that LLMs are significantly more capable at moral reasoning than was previously believed.


2606.11533 2026-06-11 cs.CY cs.AI cs.ET cs.LG New submission

AI Researchers Must Help Lead Arms Control to Mitigate Military AI Risks

Ted Fujimoto, Jacob Benz

AI summary: Argues that AI researchers should take a leading role in arms control research, drawing on lessons from nuclear deterrence to drive technical innovation in verification and diplomacy and to reduce the pressing risks of military AI applications.

Comments: 9 pages, 1 figure, ICML 2026 Position Paper

Abstract

The advancement of AI capabilities compels researchers and the public to be more aware of its potential worldwide impact. A pressing near-term concern is the regulation of military AI applications. Armament manufacturers and defense contractors are increasingly investing in AI capabilities and forging partnerships with AI companies, creating a burgeoning coalition that demands military leaders, arms control diplomacy experts, and AI researchers collaborate to ensure a safer future. While AI researchers often focus on the long-term implications of superintelligent AI, this approach may not adequately address the immediate challenges posed by AI in military applications. Success requires acknowledging and mitigating the emerging risks of frontier AI models that plan to be integrated into defense applications, like military AI systems. Arms control has reduced past catastrophic risks, so lessons learned from nuclear deterrence can guide AI safety and security research towards innovations in verification and diplomacy. AI researchers, however, must assist in leading the technical research that clearly defines and alleviates instability in military settings. Given these new responsibilities and the lack of sufficiently reliable solutions, we argue that AI researchers must take a leading role in advancing arms control research to minimize risk in military AI applications.


2606.11457 2026-06-11 cs.CY New submission

Investigating Gender Bias in Touch Biometrics

Joshua Lee, Ben Khant, Rajesh Kumar

AI summary: Using the BBMAS and ANTAL datasets, this study evaluates gender bias in swipe-based authentication with XGBoost and DenseNet classifiers and finds no significant gender differences in authentication error rates, suggesting that swipe-based authentication can be fair and reliable.

Comments: 4 pages, 1 figure, 2 tables. Accepted for presentation at the Richard Tapia Conference (Tapia 2026)

Abstract

Behavioral biometrics offer a promising approach for continuous authentication, but their fairness across demographic groups remains largely unexplored. This paper investigates gender bias in swipe-based authentication using the BBMAS (117 users) and ANTAL (71 users) datasets and evaluates XGBoost and DenseNet classifiers through False Acceptance Rate (FAR) and False Rejection Rate (FRR). XGBoost achieved authentication accuracies of 92% and 94% on the BBMAS and ANTAL datasets, respectively, while statistical tests (Kolmogorov-Smirnov, Mann-Whitney, and Wasserstein permutation) found no significant gender differences in authentication error rates across almost all experimental settings. These findings suggest that swipe-based authentication can achieve high accuracy while maintaining comparable performance for male and female users, supporting its potential as a fair and reliable behavioral biometric modality.
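
The two error rates named in this abstract can be computed per demographic group with a few lines of Python. This is a generic sketch rather than the authors' pipeline: the scores, labels, group names, and the 0.5 acceptance threshold are all made-up illustration values.

```python
def far_frr(scores, genuine, threshold):
    """Return (FAR, FRR) for one set of authentication attempts.

    scores:    classifier scores, higher means "more likely the genuine user"
    genuine:   True for attempts by the enrolled user, False for impostors
    threshold: attempts with score >= threshold are accepted
    """
    false_accepts = sum(1 for s, g in zip(scores, genuine) if not g and s >= threshold)
    false_rejects = sum(1 for s, g in zip(scores, genuine) if g and s < threshold)
    impostor_attempts = sum(1 for g in genuine if not g)
    genuine_attempts = sum(1 for g in genuine if g)
    far = false_accepts / impostor_attempts if impostor_attempts else 0.0
    frr = false_rejects / genuine_attempts if genuine_attempts else 0.0
    return far, frr


# Hypothetical attempts, split by a demographic attribute.
attempts = {
    "group_a": {"scores": [0.9, 0.4, 0.8, 0.2], "genuine": [True, True, False, False]},
    "group_b": {"scores": [0.7, 0.6, 0.3, 0.1], "genuine": [True, True, False, False]},
}

rates = {
    group: far_frr(data["scores"], data["genuine"], threshold=0.5)
    for group, data in attempts.items()
}
for group, (far, frr) in rates.items():
    print(f"{group}: FAR={far:.2f} FRR={frr:.2f}")
```

A fairness analysis of the kind described would then compare the per-user FAR and FRR distributions between groups with a statistical test, for example the Kolmogorov-Smirnov or Mann-Whitney tests the abstract lists.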


2606.11456 2026-06-11 cs.CL cs.AI cs.CY New submission

AI Coding Agents in Social Science: Methodologically Diverse, Empirically Consistent, Interpretively Vulnerable

Meysam Alizadeh, Fabrizio Gilardi, Mohsen Mosleh, Enkelejda Kasneci

Affiliations: University of Oxford; University of Zurich; Technical University of Munich

AI summary: Studies the methodological diversity and interpretive vulnerability of LLM agents in scientific analysis. Across 20 independent runs, the agents match or exceed human diversity at the design layer but are susceptible to prompts at the verdict layer; the bias stems from interpretation rather than estimation.

Abstract

The deployment of LLM-based agents in scientific analysis raises opposing concerns: that agents may reduce methodological diversity, or that they may amplify the analytic flexibility through which researchers reach motivated conclusions. We argue these worries target two empirically separable layers: a design layer of methodological choices, and a verdict layer in which a decision rule maps estimates to a substantive claim. We test both by running 20 independent executions of Claude Code and Codex on a prominent immigration and social-policy question against a many-analysts human baseline. At the design layer, Codex matches human methodological diversity and Claude Code produces nearly three times as many specifications; both agents' effect estimates remain broadly aligned with the human consensus, and no agent model exactly matches any human model. A prompt-induced anti-immigration researcher prior reorganizes each agent's methodological decisions but, unlike for biased human analysts in the same data, does not shift aggregate estimates or final verdicts; nor do agents reroute along the methodological axes humans use to bias their estimates. At the verdict layer, an explicit confirmatory prompt flips Claude Code's verdicts from 10% to 90% support while leaving its coefficient distribution essentially unchanged, operating through rule omission rather than rule softening. AI agents can rival or exceed human methodological diversity at the design layer while remaining vulnerable at the verdict layer. In our setting, the locus of AI bias is not estimation but interpretation.


2606.11337 2026-06-11 cs.AI cs.CL cs.CY New submission

Can AI Agents Synthesize Scientific Conclusions?

Hayoung Jung, Pedro Viana Diniz, José Reinaldo Corrêa Roveda, Abner Fernandes da Silva, Haeun Jung, Enoch Tsai, Aleksandra Korolova, Manoel Horta Ribeiro

Affiliations: Princeton University; Universidade Federal de Minas Gerais; Stony Brook University; Hackensack Meridian School of Medicine

AI summary: Introduces the SciConBench benchmark and the SciConHarness evaluation harness, which decompose conclusions into atomic facts and compute precision and recall. The best frontier AI agent reaches a factual F1 of only 0.337 on scientific conclusion synthesis, unconstrained evaluation suffers from data leakage, and consumer-facing agents often generate incomplete or contradictory conclusions.

Comments: 79 pages, 34 figures, 17 tables. Under Submission

Abstract

Scientific AI agents increasingly retrieve evidence, reason across sources, and synthesize conclusions used in consequential decisions. Yet, their ability to do so in high-stakes domains such as health remains unclear. We introduce SciConBench, a large-scale live benchmark of 9.11K questions and expert-written conclusions from systematic reviews to evaluate open-domain scientific conclusion synthesis. The benchmark draws on an expert-validated automated evaluation pipeline that decomposes conclusions into atomic facts and measures correctness and comprehensiveness via factual precision and recall. To mitigate data leakage, we further introduce SciConHarness, a clean-room evaluation harness that equips agents with controlled web interaction to ensure valid measurement. Evaluating 8 frontier models and deep research agents, we find that factual quality remains low: under clean-room settings, the best agent achieves only a factual F1 of 0.337. Our clean-room setting consistently reduces performance relative to unconstrained evaluation, suggesting that leakage inflates estimates of models' true synthesis capabilities. Finally, we audit consumer-facing agents (e.g., Google AI Overview, OpenEvidence) and find they frequently generate incomplete and sometimes contradictory conclusions, even when the ground-truth answer is available. Overall, our results show that reliable synthesis of scientific conclusions remains an open challenge, and that clean-room evaluation is essential for assessing open-domain AI agents.
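
The fact-level scoring described in this abstract can be sketched as follows. This is an assumption-laden illustration, not the benchmark's pipeline: it presumes a judge has already marked each atomic fact in the generated conclusion as supported or not, and each atomic fact in the expert conclusion as covered or not. The flag lists below are invented.

```python
def factual_scores(generated_supported, reference_covered):
    """Compute factual precision, recall, and F1 from per-fact judgments.

    generated_supported: one bool per atomic fact in the generated conclusion,
                         True if the expert conclusion supports it
    reference_covered:   one bool per atomic fact in the expert conclusion,
                         True if the generated conclusion states it
    """
    precision = (
        sum(generated_supported) / len(generated_supported)
        if generated_supported else 0.0
    )
    recall = (
        sum(reference_covered) / len(reference_covered)
        if reference_covered else 0.0
    )
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical judgments: 2 of 4 generated facts are supported (precision 0.5),
# and 1 of 3 expert facts is covered (recall 1/3).
precision, recall, f1 = factual_scores(
    generated_supported=[True, True, False, False],
    reference_covered=[True, False, False],
)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Precision tracks correctness (how much of what the agent said is right), recall tracks comprehensiveness (how much of the expert conclusion the agent recovered), and F1 is their harmonic mean.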


2606.11218 2026-06-11 cs.CY cs.AI New submission

An Ethical eValuation Agent (EeVA): Results of a Proof-of-Concept Test on a Prototype Agentic-like Workflow to Assist Ethical Deliberations

Stephen Milford, B. Zara Malgir, Miguel Vazquez

AI summary: Proposes EeVA, an agentic-like LLM-based workflow that evaluates use cases against 10 ethical frameworks and produces structured evaluations and syntheses, promoting ethical reflection rather than definitive answers; feasibility is demonstrated on three cases.

Abstract

Ethical deliberation is often misunderstood as a search for single right or wrong answers, creating difficulties for non-ethically trained personnel who must address ethically laden challenges. We developed EeVA, an agentic-like LLM-based workflow designed to support comparative ethical reflection rather than deliver definitive ethical answers. EeVA was programmed in n8n using three interconnected workflows: starter, worker, and emitter. It evaluated uploaded use cases against 10 ethical frameworks through evaluator and synthesis prompts. Proof-of-concept testing used three published cases from urban mobility, peer-to-peer energy trading, and social-service resource allocation. Across all cases, EeVA produced consistently structured framework-specific evaluations and integrated syntheses. Outputs differentiated between frameworks, identified convergences and divergences, recommended modifications to increase alignment, and highlighted persistent ethical tensions. Syntheses were readable for non-specialists and shifted attention away from simplistic answers toward design conditions, safeguards, and areas where full cross-framework agreement was unlikely. The findings suggest that LLMs can be organised into usable workflows that preserve ethical plurality while helping bridge the communicative gap between ethicists and non-ethically trained personnel. EeVA's value lies not in replacing ethicists or resolving moral disagreement, but in scaffolding structured ethical deliberation. EeVA offers a promising proof of concept for supporting ethical reflection where access to ethics expertise is limited. Further work is needed on reproducibility, human evaluation, user testing, and efficiency before it can be considered a mature tool.


2606.11217 2026-06-11 cs.CY cs.AI cs.HC New submission

Preregistration for Experiments with AI Agents

Michelle Vaccaro

AI summary: To address methodological vulnerabilities in experiments with AI agents, proposes extending preregistration practices to this area and designs a dedicated template to improve research credibility.

Comments: Accepted at ICML 2026 as a Spotlight (Top 5%) Position Paper

Abstract

The proliferation of large language models (LLMs) and autonomous AI agents has given rise to a rapidly growing methodological paradigm: "in silico" behavioral experiments. Originally conceived as a way to use AI agents as proxies for human participants in studies of cognition, decision-making, and social dynamics, this approach has taken on new significance -- as AI agents increasingly negotiate, transact, and make consequential decisions on behalf of people and organizations, understanding their behavior has become a research priority in its own right. While these experiments with AI agents offer unprecedented advantages in terms of scalability, cost efficiency, and experimental control, they also inherit, and in some cases amplify, methodological vulnerabilities that have long plagued human subjects research. To address these issues, this paper argues that preregistration practices -- central to improving the credibility of human subjects experiments -- should now be extended to experiments with AI agents. We systematically catalog the researcher degrees of freedom that experiments with AI agents introduce -- model selection, prompt wording, settings, and outcome-contingent redesign, for example -- and show how the low cost of iteration and lack of reporting norms make these choices both easy to exploit and difficult to detect. We propose a preregistration template tailored to experiments with AI agents and call on conferences, journals, and funding agencies to make preregistration standard practice for this emerging research paradigm.


2606.11216 2026-06-11 cs.CY cs.SI New submission

Great Disappearance Acts: Generative Search and Shadow Banning

Danny Friedmann

AI summary: Examines how generative search and shadow banning undermine the open internet ecosystem, analyzes the legal and regulatory issues involved, and proposes solutions to strengthen transparency and fairness.

Abstract

The internet, once celebrated as a decentralized public sphere, is increasingly undermined by practices such as generative search and shadow banning, which divert traffic and suppress visibility. Generative search, powered by Retrieval-Augmented Generation (RAG), synthesizes content into direct answers, bypassing websites and depriving them of traffic and revenue. This threatens the sustainability of independent content creators, small enterprises, and the open web ecosystem. Shadow banning, a practice that intentionally reduces the visibility of social media posts through algorithmic moderation, exacerbates these issues by chilling free expression and limiting transparency and accountability. This article explores these opaque practices through a legal and regulatory lens. The first part examines the rise of generative search, analyzing its technological and legal implications, including copyright infringement, unfair competition, and unjust enrichment. It also evaluates potential solutions such as licensing agreements and agentic AI. The second part focuses on shadow banning: algorithmic dissuasion, de-ranking, and the reduction of traffic, with specific attention to China's Regulation on Algorithmic Recommendations (RAR) and the EU's Artificial Intelligence Act (AIA). Both frameworks offer partial solutions but fall short of ensuring fairness, transparency, and redress mechanisms. Ultimately, the shift toward centralized control by dominant platforms prioritizes profit and risk management over innovation, fairness, and diversity in online expression. To counteract these trends, regulatory interventions, algorithmic transparency, and equitable frameworks are essential. Without such measures, the internet risks losing its character as a democratized public sphere for free expression and innovation.


2606.11215 2026-06-11 cs.CY cs.AI New submission

The Environmental Cost of LLMs in AIED: Reporting and Practices

Sabrina C. Eimler, Lukas Erle, Daniel Flood, Aditi Haiman, Luca Häckert, André Helgert, Lachlan McGinness, Büsra Yapici

AI summary: To address the AIED community's lack of standardized reporting of LLM computational and environmental costs, proposes an open-source method for measuring and reporting carbon emissions on local and cloud hardware, along with a formula for the computational expense of frontier LLMs whose parameter counts are unknown.

Abstract

Large Language Model (LLM) usage in recent years has become increasingly widespread in the Artificial Intelligence in Education (AIED) community. While LLMs offer unique avenues for learners and educators, using LLMs comes with computational and environmental costs. These costs are mostly hidden due to a lack of standardised procedures to measure and report these impacts. To address this gap, we first conducted a literature review of all papers published as part of the AIED 2025 conference proceedings, determining if and how computational or environmental costs of LLMs are reported. Most projects use LLMs, but few report computational resources used and almost none discuss environmental impacts of LLMs as an ethical concern. To address this lack of standardised reporting practices, we propose an open-source method for systematically measuring and reporting the computational expense of LLMs and environmental impact of running Machine Learning (ML) AIED systems. We provide software solutions to measure the carbon footprint for both local and cloud based hardware. We also provide an easy-to-use formula to calculate the computational expense of frontier LLMs even when the exact number of parameters is not known. Overall, we hope to motivate colleagues to use our method to strive for more transparent reporting of hidden costs of using LLMs in the AIED community.
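
The abstract does not state the authors' formula or tooling, so the following is only a generic back-of-the-envelope sketch of how a carbon footprint is commonly estimated: energy drawn by the hardware, scaled by data-centre overhead (PUE, power usage effectiveness), multiplied by the carbon intensity of the electricity grid. Every number and function name below is a hypothetical placeholder, not a measured value or the paper's method.

```python
def energy_kwh(avg_power_watts, hours, pue=1.0):
    """Energy used by the hardware in kWh, scaled by data-centre overhead (PUE)."""
    return avg_power_watts * hours * pue / 1000.0


def co2e_kg(kwh, grid_intensity_kg_per_kwh):
    """Emissions in kg CO2-equivalent for a given energy use and grid intensity."""
    return kwh * grid_intensity_kg_per_kwh


# Hypothetical run: one 300 W GPU busy for 10 hours, PUE of 1.5,
# on a grid emitting 0.4 kg CO2e per kWh.
run_energy = energy_kwh(avg_power_watts=300, hours=10, pue=1.5)
run_emissions = co2e_kg(run_energy, grid_intensity_kg_per_kwh=0.4)
print(f"{run_energy:.2f} kWh, {run_emissions:.2f} kg CO2e")
```

Reporting the inputs (hardware, runtime, PUE, grid intensity) alongside the result lets readers recompute the estimate under their own assumptions.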


2606.11214 2026-06-11 cs.CY cs.AI cs.HC New submission

From Awareness to Action: Understanding and Overcoming the Research-Practice Gap in Algorithmic Fairness for Public Health

Sara Altamirano, Tijs Portegies, Sennay Ghebreab

AI summary: A mixed-methods study reveals the awareness-to-action gap for algorithmic fairness in public-health ML applications and proposes the Fairness-to-Action framework, which integrates methodological, organizational, and systemic dimensions. It finds that fairness is weakly institutionalized, translation mechanisms are externally driven, and system-level priorities favor accuracy.

Comments: Extended version of an accepted IASEAI'26 paper; includes technical appendices. 22 pages, 2 figures

Abstract

Algorithmic fairness is essential for responsible ML-driven public health research, yet its practical implementation remains limited. To investigate this awareness-action gap, we conducted a sequential mixed-methods study comprising expert interviews, an online survey, and systematic mapping. The expert interviews informed the design of the survey, which in turn revealed fragmented definitions of fairness, limited training and guidance, reliance on external sources, and rare use of formal assessment, mitigation, or monitoring. These findings were subsequently mapped onto three established research-practice gap lenses: the Knowledge-Practice Gap, the Knowledge-to-Action Cycle, and the Knowing-Doing Gap, each offering complementary perspectives. Building on this synthesis, we introduce the Fairness-to-Action framework, which integrates methodological, organizational, and systemic dimensions to identify where translation of algorithmic fairness knowledge stalls. Our analysis shows that fairness remains weakly institutionalized, translation mechanisms are externally driven, and system-level priorities continue to emphasize accuracy over fairness. These insights suggest critical leverage points for advancing safe, fair, and ethical ML-driven public health research practice.

2606.11195 2026-06-11 cs.CY cs.AI cs.HC New submission

From Consumption to Reflection: Designing Human-AI Relations for Stable Reasoning

Rikard Rosenbacke, Carl Rosenbacke, Victor Rosenbacke, Martin McKee

AI summary: Proposes Relational Reflective Intelligence (RRI), an inference-time governance layer that operationalizes reflection through auditable reasoning loops, turning human-AI interaction into a joint reasoning system in which each side compensates for the other's limitations, yielding stable reasoning.

Abstract

Large language models (LLMs) have transformed how humans access information, but not how we reason with it. Their fluency accelerates consumption while bypassing the slow, reflective processes that underpin sound judgment. This paper introduces Relational Reflective Intelligence (RRI), an inference-time governance layer that operationalizes reflection through auditable reasoning loops. RRI operates not inside the model but around it, providing a practical structure for stable, auditable reasoning between humans and LLMs. The core premise is that LLMs inherit cognitive vulnerabilities similar to those that shape human thought: reliance on intuitive shortcuts, confusion between representation and reality, and a preference for coherence over falsification. When humans and models share these tendencies, their errors compound. We refer to this as relational drift, a failure that arises from interaction rather than from the model alone. Addressing this requires a shift from modeling relations between words to structuring relations between model outputs and human reasoning. RRI provides this missing layer through three components: the Rose-Frame, which identifies likely breakdowns in reasoning; the Architect's Pen, which introduces targeted reflection steps at critical moments; and an inference-time workflow that embeds these steps without retraining the model. Together, these elements transform human-AI interaction into a joint reasoning system with explicit checkpoints, conflict surfacing, and an auditable trail of assumptions. Rather than making machines think like humans or forcing humans to reason like machines, RRI creates a structured interaction in which both compensate for each other's limitations. It reframes AI safety as a cognitive architecture problem, where reliable decisions depend on embedding reflection directly into the interaction process.

2606.11021 2026-06-11 cs.DL cs.CY Version update

Making a Name for Myself: On Academic Naming Policies and their Impact

A Pranav, Vagrant Gautam, Martin Mundt, Jordan Taylor, Arjun Subramonian, Franziska Sofia Hafner, Daniel Chechelnitsky, William Agnew, Anne Lauscher

AI summary: Uses mixed methods (surveys, interviews, and large-scale citation analysis of eight major computer science venues, 2019-2025) to study how name change policies affect citation accuracy and researchers' mental health. Visible name change policies are associated with significantly fewer citation errors, and deadnaming of transgender researchers in citations fell by 92% from 2019 to 2024.

Comments: Accepted at FAccT 2026. This version has corrected some typos

Abstract

In academic publishing, names connect scholars to their work. When scholars change their names, including for marriage, academic recognition, or gender transition, they may lose credit for past publications. However, despite significant impacts on citation accuracy and researcher well-being, no existing studies examine how naming policies in computer science serve researchers who change their names. We use a mixed-methods approach combining surveys, interviews, and large-scale citation analysis of papers from eight major computer science venues from 2019-2025. We document the multi-year advocacy effort that established the first name change policies and identify implementation barriers, including incomplete publisher updates and months-long processing delays. Researchers continue to be cited with misparsed and incorrect names despite publisher updates. When these citation errors happen, interviewees report significant mental health impacts, including stress, anxiety, and safety risks. Empirically, we find that venues with accessible and visible name change policies have significantly fewer citation errors than venues with inaccessible policies (899 vs. 996 errors per 1,000 papers). Our annotation analysis shows that deadnaming of transgender researchers in citations decreased by 92% from 2019 to 2024. Our findings demonstrate the importance of inclusive publishing policies, for which name change policy advocacy led by trans researchers has been a significant driver. We recommend that venues adopt proactive visible name change policies, support queer advocacy groups, and improve publication infrastructure to build an inclusive publishing landscape. The accompanying toolkit for checking errors in bibliographic LaTeX files is available at this https URL.

2606.04717 2026-06-11 cs.CR cs.CY Version update

Auditing CoT Answer-Hijack Patches: Source-Control Certificates with Type-I Guarantees

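Chinese title (translated): Selection-Aware Diagnosis of Chain-of-Thought Answer Hijacking

Jianwei Tai

AI summary: For chain-of-thought answer-hijacking attacks, proposes a selection-aware diagnostic that uses activation patching to locate fragile regions and checks whether recovery depends on a same-question clean source.

AI abstract (translated from Chinese)

We study a controlled numeric proxy for chain-of-thought (CoT) answer hijacking, motivated by attacks in which plausible-looking reasoning leads to a harmful final answer. CoT wrappers on GSM8K and MATH-500 flip the final answer away from the gold label. Rather than treating activation patching as clean-trace recovery, we ask where the hijacked trajectory is fragile and whether recovery depends on a same-question clean source. On Qwen2.5-7B and Llama3-8B, across few-shot, puzzle, and sycophancy hijacks on GSM8K, three few-shot/puzzle cells pass confirmatory $K{=}1$ localization after Bonferroni correction. Selection-aware 50/50 band validation preserves the held-out in-band versus out-of-band gap: +32.6, +45.1, and +17.7 points for Qwen-puzzle, Llama3-few-shot, and Llama3-puzzle respectively, whereas exact $\Lstar$ agreement is far less stable. Qwen-few-shot remains exploratory, and the sycophancy cells are temporally diffuse under short patches. A BF16 Qwen-puzzle full-band sweep preserves the band signal ($n{=}30$, spread 0.33 at $K{=}1$, peak layer 20), supporting the conclusion that the band is not merely an INT4 artifact. Fixed-hook GSM8K reruns preserve recovery in the two main puzzle cells: Qwen-puzzle recovers 47.0% at $n{=}100$ (47/100; Wilson 95% CI [37.5%, 56.7%]) and Llama3-puzzle recovers 39.0% at $n{=}100$ (39/100; [30.0%, 48.8%]). Frozen transfer to MATH-500 recovers 26.0% of eligible cases in the largest fixed-transfer run (13/50; Wilson 95% CI [15.9%, 39.6%]). Source controls change the mechanistic interpretation. Paired bootstrap gives finite-sample non-separation between clean and random sources for Qwen-few-shot (+3.0 points, 95% CI [-18.2, +27.3]) and for Llama3-puzzle at extended $n{=}60$ (clean minus random -8.3 [-21.7, +5.0]), whereas Llama3-few-shot is content-mediated (+40.0 [+16.7, +60.0]).

Abstract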

Chain-of-thought (CoT) answer-hijack templates can flip the final numeric answer of a 7B-8B language model on GSM8K or MATH-500 even when the visible reasoning trace looks fluent. Activation patching is the standard probe for locating where this hijack can be undone, and a successful clean-source patch is often read as evidence that the patched activation carries the recovered content. We show that this reading is unsound: clean-only localization profiles (peak, spread, thresholded band) underidentify the frozen-hook source contrast, and the clean-only profile is an intervention map, not a mediation certificate. We then construct an audit that turns each candidate patch into a source-control certificate with a pre-registered Type-I guarantee. The certificate runs in three stages: SELECT (clean-source band sweep with permutation calibration and held-out validation), FREEZE (lock the hook), and AUDIT (paired-bootstrap source contrasts at the frozen hook). It emits an incorrect mechanism label with probability at most alpha = alpha_sel + alpha_audit under sample-split disjointness. A matching-rate sample-complexity theorem (n_star = Theta(Delta^{-2} log(1/alpha))) bounds the audit cost. On Qwen2.5-7B and Llama3-8B, three few-shot/puzzle cells pass confirmatory K=1 localization with held-out gaps +32.6, +45.1, +17.7; fixed-hook reruns recover 47.0% (Qwen-puzzle) and 39.0% (Llama3-puzzle) at n=100; frozen MATH-500 transfer recovers 26.0%. After audit, Llama3-PZ and Qwen-PZ are identity-light with moderate magnitude (Qwen-PZ also layer-sensitive); Llama3-FS is a single-seed moderate-positive candidate (multi-seed replication queued); Qwen-FS is exploratory non-separation with a layer-sensitive flag. The method is a diagnostic auditing protocol, not an adaptive safety defense.
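
The recovery rates for this entry are binomial proportions, and its AI abstract reports them as counts (47/100, 39/100, and 13/50) with Wilson 95% intervals. The sketch below pushes those counts through the standard Wilson score formula as a sanity check. It is illustrative Python, not the authors' code; the function name `wilson_interval` and the choice z = 1.96 are assumptions made here.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the entry; the labels are only for display.
runs = [
    ("Qwen-puzzle (GSM8K)", 47, 100),
    ("Llama3-puzzle (GSM8K)", 39, 100),
    ("MATH-500 transfer", 13, 50),
]

for label, k, n in runs:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.1%}, Wilson 95% CI [{low:.1%}, {high:.1%}]")
```

With z = 1.96 the three intervals round to the bracketed values quoted for these runs, so the reported intervals are consistent with the stated counts.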

2412.01459 2026-06-11 cs.CY cs.AI cs.HC

Perception Gaps in Risk, Benefit, and Value Between Experts and Public Challenge Socially Accepted AI

Philipp Brauner, Felix Glawe, Gian Luca Liehner, Luisa Vervier, Martina Ziefle

DOI: 10.1007/s00146-026-03023-8
Journal ref: AI & Society (2026)

Abstract

Artificial Intelligence (AI) is reshaping many societal domains, raising critical questions about its risks, benefits, and the potential misalignment between public and academic perspectives. This study examines how the general public (N=1110) -- individuals who interact with or are impacted by AI technologies -- and academic AI experts (N=119) -- those elites shaping AI development -- perceive AI's capabilities and impact across 71 scenarios. These scenarios span domains such as sustainability, healthcare, job performance, societal inequality, art, and warfare. Participants evaluated these scenarios across four dimensions using the psychometric model: likelihood, perceived risk and benefit, and overall value (or sentiment). The results suggest significant differences: experts consistently anticipate higher probabilities, perceive lower risks, report greater benefits, and express more positive sentiment toward AI than non-experts. Moreover, the two groups apply different weighting schemes: experts discount risk more heavily relative to benefit than non-experts. Visual mappings of these evaluations uncover areas of convergent evaluation (e.g., AI performing medical diagnoses or criminal use) as well as tension points (e.g., decision of legal cases, political decision making), highlighting areas where communication and policy interventions may be needed. These findings underscore a critical translational challenge: if AI research and deployment are to align with societal priorities, the perception gap between developers and the public must be better understood and addressed. Our results provide an empirical foundation for value-sensitive AI governance and trust-building strategies across stakeholder groups.

2510.18289 2026-06-11 cs.CL cs.CY cs.MA Version update

Food4All: An Agentic Framework and Benchmark for Food Resource Navigation with Adaptive User Understanding

Yiyang Li, Weixiang Sun, Tianyi Ma, Kaiwen Shi, Zheyuan Zhang, Yanfang Ye

AI summary: Proposes the Food4All framework, which pairs a food-specific search tool with 300 multi-turn evaluation tasks, evaluates six large language models over 686 Indiana food resources, and diagnoses their weaknesses in handling constraints and non-ideal user interactions.

Comments: We have further refined the benchmark construction and experimental presentation to improve clarity and consistency. The revised version includes updated task design, food-resource data, and evaluation details to better align the benchmark with the intended food resource referral setting. These changes provide a more precise presentation of the experimental findings

Abstract

Food assistance referral requires conversational agents to translate underspecified, often noisy help-seeking dialogues into locally valid resource recommendations. We present Food4All, an agentic food-resource referral framework and benchmark grounded in 686 structured Indiana food resources. Food4All couples a food-specific search tool with 300 multi-turn evaluation tasks spanning single food needs, composite cases with access or document constraints, and five non-ideal user interaction traits: unreasonable demands, rambling responses, impatience, incomplete answers, and inconsistent information. We evaluate six Large Language Models (LLMs) on requirement grounding, resource retrieval, final referral correctness, and interaction efficiency. Although the strongest model achieves 96.33% referral accuracy, our diagnostics reveal persistent failures in grounding schedule, eligibility, intake, and document constraints, as well as failures to preserve valid retrieved resources in the final recommendation. Trait-level analysis further shows that different non-ideal behaviors stress different parts of the referral pipeline. Food4All provides a controlled testbed for studying tool-calling agents in constraint-sensitive food assistance referral under realistic user interaction challenges.

2604.08619 2026-06-11 cs.DL cs.CY

Doctoral Theses in France (1985-2025): A Linked Dataset of PhDs, Academic Networks, and Institutions

William Aboucaya, Dastan Jasim

DOI: 10.1016/j.dib.2026.112947
Comments: 11 pages + 6 appendix pages, 7 figures, 2 tables. See https://doi.org/10.5281/zenodo.19453191 for the dataset. See https://github.com/WilliamAboucaya/phd-theses-france for the code to reproduce the dataset and figures. Version 2: Fixed references to tables and figures. Modified unclear wordings in section 3. Updated values in the languages table after a minor bug fix. Standardized figures style

Abstract

This paper presents a comprehensive dataset of doctoral theses defended in France between 1985 and 2025, constructed from multiple national academic metadata sources. The dataset is primarily based on data from the French national thesis platform and is enriched using additional authority and bibliographic databases to improve data quality, completeness, and interoperability. The data production pipeline includes the aggregation of heterogeneous sources, the correction of inconsistent identifiers, the enrichment of person and institution records, and the construction of derived variables describing academic careers, jury participation, institutional affiliations, and thesis characteristics. Additional identifiers from major academic repositories and library catalogues are integrated to facilitate linkage with external data sources and future dataset extensions. The resulting dataset provides structured information at the thesis, individual, and institutional levels, enabling both descriptive and relational analyses. This resource is particularly suited for research on doctoral education, academic networks, supervision practices, jury composition, institutional collaboration, and the evolution of research communities over time. The paper documents the data sources, processing pipeline, feature construction, data quality issues, and limitations, with the objective of facilitating reuse of the dataset by other researchers and supporting future extensions and longitudinal analyses of the academic system.

2603.21639 2026-06-11 cs.CY cs.LG Version update

A Multi-Modal Sensor Fusion Instrument for Measuring Regional Human Mobility: The Distributed Human Data Engine (DHDE)

Amil Khanzada, Takuji Takemoto

AI summary: Proposes the Distributed Human Data Engine (DHDE), which fuses Edge-AI cameras, digital intent signals, behavioral records, and meteorological data to address sparse sensors and heterogeneous behavioral signals when measuring human mobility in peripheral regions. It validates the sparse-sensor compensation approach and identifies an "Under-Vibrancy Paradox".

Comments: 32 pages, 4 figures, 3 tables. Pre-print of a manuscript submitted for peer review (v2)

Abstract

Accurately estimating human mobility in peripheral regional economies presents a fundamental measurement challenge: physical ground-truth sensors are sparse, behavioral intent signals are heterogeneous, and environmental friction introduces systematic bias into demand inference. We introduce the Distributed Human Data Engine (DHDE), a multi-modal sensor fusion architecture that addresses this challenge by integrating physical instrumentation (Edge-AI cameras), digital intent signals (route search impression metrics), behavioral records (90,350 spending records, 97,719 standardized survey responses), and meteorological data across four geographically distributed nodes in Fukui, Japan. The primary measurement-science contribution is the design, deployment, and cross-node validation of the DHDE as a sparse-sensor compensation instrument: a heterogeneous sensor fusion architecture that anchors non-stationary digital intent signals to concurrent physical ground-truth counts, correcting for systematic bias introduced by meteorological planning friction. The instrument is implemented as an ensemble inference pipeline (Random Forest and Ordinary Least Squares with Newey-West robust inference), calibrated across 397 daily observations and validated by chronological holdout replication across four geographically distinct node types. The primary OLS specification achieved an in-sample explanatory power of R2 = 0.810 and a chronological out-of-sample predictive performance of R2 = 0.683. Results identify an Under-Vibrancy Paradox where macro-regional visitor satisfaction correlates positively with crowd density (Spearman rank correlation rs = +0.150, p = 0.002). We estimate an annual proxy gap of 865,917 intent-implied visits, corresponding to JPY 11.96 billion (USD 72.6 million) in foregone revenue.
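
The closing estimate in this abstract can be unpacked with simple arithmetic. The sketch below takes only the three figures stated there (865,917 visits, JPY 11.96 billion, USD 72.6 million) and derives the revenue per visit and the exchange rate they imply. The variable names are illustrative, and the derived per-visit and exchange-rate values are computed here as a consistency check; they are not reported in the abstract.

```python
# Figures quoted in the abstract.
annual_gap_visits = 865_917   # intent-implied visits per year
foregone_jpy = 11.96e9        # JPY 11.96 billion
foregone_usd = 72.6e6         # USD 72.6 million

# Derived quantities (assumptions of this sketch, not stated in the source).
jpy_per_visit = foregone_jpy / annual_gap_visits
usd_per_visit = foregone_usd / annual_gap_visits
implied_jpy_per_usd = foregone_jpy / foregone_usd

print(f"Implied revenue per visit: JPY {jpy_per_visit:,.0f} (USD {usd_per_visit:.2f})")
print(f"Implied exchange rate: {implied_jpy_per_usd:.1f} JPY per USD")
```

The two currency figures are therefore mutually consistent at roughly 165 JPY per USD, and the estimate corresponds to a little under JPY 14,000 per intent-implied visit.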

2502.14894 2026-06-11 cs.CV cs.AI cs.CY cs.LG Version update

FOCUS on Contamination: Hydrology-Informed Noise-Aware Learning for Geospatial PFAS Mapping

Jowaria Khan, Alexa Friedman, Sydney Evans, Rachel Klein, Runzi Wang, Katherine E. Manz, Kaley Beins, David Q. Andrews, Elizabeth Bondi-Kelly

AI summary: Proposes FOCUS, a framework that combines sparse PFAS observations with environmental priors such as hydrological connectivity and trains robustly with a noise-aware loss, outperforming baseline methods for PFAS contamination mapping.

Comments: Best Paper Award at ICLR 2026 Machine Learning for Remote Sensing Workshop

Abstract

Per- and polyfluoroalkyl substances (PFAS) are persistent environmental contaminants with significant public health impacts, yet large-scale monitoring remains severely limited due to the high cost and logistical challenges of field sampling. The lack of samples leads to difficulty simulating their spread with physical models and limited scientific understanding of PFAS transport in surface waters. Yet, rich geospatial and satellite-derived data describing land cover, hydrology, and industrial activity are widely available. We introduce FOCUS, a geospatial deep learning framework for PFAS contamination mapping that integrates sparse PFAS observations with large-scale environmental context, including priors derived from hydrological connectivity, land cover, source proximity, and sampling distance. These priors are integrated into a principled, noise-aware loss, yielding a robust training objective under sparse labels. Across extensive ablations, robustness analyses, and real-world validation, FOCUS consistently outperforms baselines including sparse segmentation, Kriging, and pollutant transport simulations, while preserving spatial coherence and scalability over large regions. Our results demonstrate how AI can support environmental science by providing screening-level risk maps that prioritize follow-up sampling and help connect potential sources to surface-water contamination patterns in the absence of complete physical models.

2601.12164 2026-06-11 cs.CY cs.CL Version update

The Language You Ask In: Language-Conditioned Ideological Divergence in LLM Analysis of Contested Political Documents

Oleg Smirnov

AI summary: Using semantically equivalent prompts in Russian and Ukrainian, the study finds that ChatGPT and Claude Opus produce systematically divergent ideological framings when analyzing the same Ukrainian civil society document, with the magnitude of the divergence varying by model.

Abstract

Large language models (LLMs) are increasingly deployed as analytical tools across multilingual contexts, yet their outputs may carry systematic biases conditioned by the language of the prompt. This study presents an experimental comparison of LLM-generated political analyses of a Ukrainian civil society document, using semantically equivalent prompts in Russian and Ukrainian administered to two frontier models from different developers, ChatGPT 5.2 and Claude Opus 4.5. Despite identical source material and parallel query structures, both models diverged along the same axis: Russian-language outputs leaned toward delegitimizing framings, characterizing civil society actors as externally funded elites constraining a democratic mandate, while Ukrainian-language outputs treated the same actors as legitimate stakeholders in democratic contestation. The magnitude of this divergence, however, was model-dependent. ChatGPT's Russian output reproduced vocabulary characteristic of Russian state discourse; Claude Opus's stayed in a mainstream critical idiom and hedged its judgments in both languages. These findings demonstrate that prompt language alone can systematically shift the ideological orientation of an unchanged model analyzing identical content. The shift is a general property of multilingual LLMs whose severity, and whose alignment with propaganda narratives, varies across systems. The implications reach AI deployment in polarized information environments, cross-lingual research, and AI governance in multilingual societies.

2512.03077 2026-06-11 cs.CY cs.AI Version update

Irresponsible AI: big tech's influence on AI research and associated impacts

Alex Hernandez-Garcia, Alexandra Volokhova, Ezekiel Williams, Dounia Shaaban Kabakibo, Mélisande Teng

AI summary: Argues that big tech's disproportionate influence on AI research drives irresponsible AI development and intensifies negative environmental and societal impacts, and calls on researchers to counter it through collective action.

Comments: Presented as a spotlight oral at the International Conference on Machine Learning 2026 (Position Paper Track). First version presented at NeurIPS 2025 Workshop on Algorithmic Collective Action

Abstract

The accelerated development, deployment and adoption of artificial intelligence systems has been fuelled by the increasing presence of big tech in the AI field. This trend has been accompanied by growing ethical concerns and intensified societal and environmental impacts. This position paper argues that irresponsible AI development is strongly driven by big tech's influence and involvement in the field. First, we examine the growing and disproportionate influence of big tech in AI research and argue that its drive for scaling and general-purpose systems is fundamentally at odds with the responsible, ethical, and sustainable development of AI. Second, we review key current environmental and societal negative impacts of AI and trace their connections to big tech's influence. Third, we discuss the underlying economic forces driving big tech's actions. Finally, as a call to action, we invite AI researchers to counter big tech's influence in irresponsible AI development through strategies that build on the responsibility of implicated actors and collective action.

2509.23982 2026-06-11 cs.CL cs.AI cs.CY cs.LG cs.NE Version update

Toward Preference-aligned Large Language Models via Residual-based Model Steering

Lucio La Cava, Andrea Tagarelli

AI summary: Proposes PaLRS, which extracts lightweight steering vectors from preference signals in the residual stream and aligns model preferences at inference time without training, yielding consistent gains on mathematical reasoning and code generation tasks with substantial time savings.

Comments: Accepted at IJCAI 2026

Abstract

Preference alignment is a critical step in making Large Language Models (LLMs) useful and aligned with (human) preferences. Existing approaches such as Reinforcement Learning from Human Feedback or Direct Preference Optimization typically require curated data and expensive optimization over billions of parameters, and eventually lead to persistent task-specific models. In this work, we introduce Preference alignment of Large Language Models via Residual Steering (PaLRS), a training-free method that exploits preference signals encoded in the residual streams of LLMs. From as few as one hundred preference pairs, PaLRS extracts lightweight, plug-and-play steering vectors that can be applied at inference time to push models toward preferred behaviors. We evaluate PaLRS on various small-to-medium-scale open-source LLMs, showing that PaLRS-aligned models achieve consistent gains on mathematical reasoning and code generation benchmarks while preserving baseline general-purpose performance. Moreover, when compared to models aligned with DPO and SimPO, they perform better with great time-savings. Our findings highlight that PaLRS offers an effective, much more efficient and flexible alternative to standard preference optimization pipelines, offering a training-free, plug-and-play mechanism for alignment with minimal data.
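
The abstract does not say how PaLRS derives its steering vectors, so the following is only a generic illustration of the idea of a steering vector applied at inference time. It assumes a difference-of-means construction over residual-stream activations, a common approach in activation steering that may differ from the authors' method. All names (`mean_vector`, `steering_vector`, `steer`) and the toy two-dimensional activations are hypothetical.

```python
from statistics import fmean


def mean_vector(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a list of equal-length activation vectors."""
    return [fmean(column) for column in zip(*vectors)]


def steering_vector(preferred: list[list[float]],
                    dispreferred: list[list[float]]) -> list[float]:
    """Difference of mean activations: preferred minus dispreferred."""
    mu_pref = mean_vector(preferred)
    mu_disp = mean_vector(dispreferred)
    return [p - d for p, d in zip(mu_pref, mu_disp)]


def steer(hidden: list[float], vector: list[float], strength: float = 1.0) -> list[float]:
    """Shift a hidden state along the steering direction at inference time."""
    return [h + strength * v for h, v in zip(hidden, vector)]


# Toy activations standing in for residual-stream states of preference pairs.
preferred_acts = [[1.0, 2.0], [3.0, 4.0]]
dispreferred_acts = [[0.0, 0.0], [2.0, 2.0]]

v = steering_vector(preferred_acts, dispreferred_acts)
steered = steer([0.5, 0.5], v, strength=2.0)
print(v, steered)
```

In a real model the vectors would be read from and added to a chosen layer's hidden states, with no weight updates involved, which is what makes this family of methods training-free.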

2412.13841 2026-06-11 cs.CY cs.AI cs.HC

Cultural Dimensions of AI Perception: Charting Expectations, Risks, Benefits, Tradeoffs, and Value in Germany and China

Philipp Brauner, Felix Glawe, Gian Luca Liehner, Luisa Vervier, Martina Ziefle

2502.05255 2026-06-11 cs.SI cs.CY physics.soc-ph Version update

Incivility in Public Health Policy Discussions Spills Over to Public Engagement with Climate Issues

Hasti Narimanzadeh, Arash Badie-Modiri, Iuliia Smirnova, Ted Hsuan Yun Chen

AI summary: Using the COVID-19 period as a case study and analyzing Twitter and Reddit data, the study finds that affective polarization around COVID-19 spilled over markedly into climate change discussions in the form of increased incivility, and that the spillover followed pre-pandemic political cleavages.

Comments: 33 pages, 5 figures

Abstract

Affective polarization and political sorting drive public antagonism around climate change and other issues at the science-policy nexus. We study cross-domain spillover of incivility in public engagements with climate change and public health on Twitter and Reddit using the COVID-19 period as a case study. We find strong evidence of the signatures of affective polarization surrounding COVID-19 spilling into the climate change domain. Across different social media systems, COVID-19 content is associated with incivility in climate discussions. These patterns of increased antagonism were responsive to pandemic events that made the link between science and public policy more salient. The observed spillover activated along pre-pandemic political cleavages, specifically anti-internationalist populist beliefs, that linked climate policy opposition to vaccine hesitancy. Our findings show how affective polarization in public engagement with science becomes entrenched across policy domains, which has implications for how the public engages with and communicates about issues such as climate change and public health.

2505.16413 2026-06-11 cs.CY

TAPAS: A Pattern-Based Approach to Assessing Government Transparency

Jos Zuijderwijk, Iris Beerepoot, Thomas Martens, Eva Knies, Tanja van der Lippe, Hajo A. Reijers

2501.16531 2026-06-11 cs.CY Version update

Guardrails versus Gatekeepers: Understanding Product Managers' Ethical Decision-Making in Generative AI

Genevieve Smith, Natalia Luka, Merrick Osborne, Brian Lattimore, Jessica Newman, Brent Mittelstadt, Brandie Nonnecke

AI summary: Drawing on interviews and a global survey, studies product managers' role in ethical decision-making around generative AI. Organizational conditions such as leadership commitment enable ethical action, while diffused responsibility constrains it; product managers recouple ethical commitments and practices through low-resource individual actions and high-resource collective actions.

Comments: To appear in the 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT '26)

Abstract

What is the role of product managers in the responsible use of generative AI (genAI) in products and everyday work -- and what enables or constrains their ability to take action? Past literature has examined the ways in which organizational policies can become decoupled from practices when incentives for responsible action are misaligned or impeded by profit motives. While the role of engineers and professional ethicists in the context of AI has been examined in detail, the role of product managers -- who are frequently portrayed as "gatekeepers" or critical decision-makers in product teams -- remains unclear. In this paper, we examine what organizational conditions promote responsible use of genAI by product managers by drawing on twenty-five interviews and a global survey of over three hundred respondents in product management-related roles. We find that uncertainty around responsible AI and a sense of diffused responsibility constrain ethical action, while leadership commitment and organizational principles enable ethical action -- making some responsible practices up to fourteen times more likely. Further, we find two sets of actions product managers take to "recouple" ethical commitments and practices. The first includes low-resource, individual actions product managers can implement without explicit organizational incentives. The second includes high-resource, collective actions that require organizational incentives. Our research suggests recoupling ethical policies and practices at the level of product teams requires institutional buy-in and higher level leadership commitment. Nevertheless, we show that individual actors are able to exhibit agency through some meaningful, low resource actions, even in the absence of organizational incentives, though this alone is insufficient to operationalize responsible AI at scale.
